add new page on memory management and made crystal a section (like news)

This commit is contained in:
Torsten Ruger
2014-06-15 20:34:45 +02:00
parent 2fdc2853c5
commit 91be87fbbc
6 changed files with 82 additions and 3 deletions

92
crystal/layers.html Normal file

@@ -0,0 +1,92 @@
---
layout: crystal
title: Crystal, a simple and minimal oo machine
---
<div class="row vspace10">
<div class="span12 center">
<h3><span>Crystal layers</span></h3>
<p>The layers map pretty much to the top-level directories.</p>
</div>
</div>
<div class="row vspace20">
<div class="span11">
<h5>Machine Code, bare metal</h5>
<p>
This is the code in the arm directory. It creates binary code according to the arm specs. It is all about shifting bits
in the right way.
<br/>
As an abstraction it is not far away from assembler. I mapped the mnemonics to function calls, and the registers
can be symbols or Values (from vm). But on the whole this is as low level as it gets.
<br/>
Different types of instructions are implemented by different classes. To make machine-dependent code possible,
those classes are derived from Vm versions.
<br/>
There is an intel directory which contains an expanded version of wilson, but it has yet to be made to fit into
the architecture. So for now crystal produces arm code.
</p>
</div>
</div>
<div class="row">
<div class="span12">
<h5>Parsing, forever descending</h5>
<p>
Parsing is relatively straightforward too. We all know ruby, so it's just a matter of getting the rules right.
<br/>
If only. Ruby is full of niceties that actually make parsing it quite difficult. But at the moment that story hasn't
even started.
<br/>
Parslet lets us use modules for parts of the parser, so those files are pretty self-explanatory. Not all is done, but
it is a good start.
<br/>
Parslet also has a separate Transformation pass, and that creates the AST. Those class names are also
easy, so you can guess what an IfExpression represents.
<br/>
</p>
</div>
</div>
<div class="row">
<div class="span12">
<h5>Virtual Machine</h5>
<p>
The Virtual machine layer (vm) is where it gets interesting, but also more fuzzy.
<br/>
Currently still quite simple, we have Classes for things we know, like program and function. Also things we need
to create the code, like Blocks and Instructions.
<br/>
The most interesting thing is maybe the idea of a Value. If you think of Variables, Values are what a variable may
be assigned, but a Value may also carry a storage place (register). Values are constant, and so to
change a value, we have to create a new Value (of possibly different basic type). Thus
all machine instructions are the transformation of values into new ones.
<br/>
Also interesting is the slightly unripe Basic Type system. We have a set of machine-word size types and do not
tag them (like MRI or BB), but keep type info separate. These types include integer (signed/unsigned), object reference
and function. Most of the oo machine will build on object references. To make that clearer: the (virtual) machine is
strongly typed (with rtti) and the dynamic ruby behaviour is implemented using that basic type system.
</p>
</div>
</div>
<div class="row">
<div class="span12">
<h5>The flux</h5>
<p>
This is just a section of things that are unclear, in flux as it were. This does not include things that are simply
not done yet; there are plenty of those too.
<ul>
<li> the whole type system, more so for values; for object types it's quite clear</li>
<li> booting. There is this dichotomy between writing code and figuring out what it should do when it executes.
That works ok until I try to think of both at once, as booting requires. </li>
<li> the oo machine abstraction. Currently non-existent; I feel like there is a whole layer missing, possibly
with its own instruction set</li>
<li> where the core ends, parfait starts and what can be external. </li>
</ul>
</p>
</div>
</div>

57
crystal/memory.md Normal file

@@ -0,0 +1,57 @@
---
layout: crystal
title: Memory layout and management
---
Memory management must be one of the main horrors of computing. That's why garbage collected languages like ruby are so great. Even simple malloc implementations tend to be quite complicated. Unnecessarily so, if one used the object oriented principle of data hiding.
### Objects and values
As has been mentioned, in a true OO system, object tagging is not really an option. Tagging is the technique of using the lowest bit of a word as a marker to tell integers from pointers, which means having to shift ints and losing a bit. MRI does this for Integers but not other value types. We accept this and work with it and just say "of course", but it's not modelled well.
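
To make the cost of tagging concrete, here is a minimal sketch of the scheme on a 32 bit word. The helper names and the `WORD_BITS` constant are made up for illustration; this is roughly what a tagging vm does internally, not code from crystal.

```ruby
# Assumption: a 32 bit machine word, as on arm.
WORD_BITS = 32

# Shift the integer left by one and set the lowest bit as the marker.
# One bit is lost, so only 31 bit signed integers fit.
def tag_integer(number)
  limit = 1 << (WORD_BITS - 2)
  raise RangeError, "#{number} does not fit in a tagged word" unless (-limit...limit).cover?(number)
  (number << 1) | 1
end

# A set lowest bit means "integer", a cleared one means "pointer".
def tagged_integer?(word)
  (word & 1) == 1
end

# Shifting right (arithmetically) drops the marker again.
def untag_integer(word)
  word >> 1
end
```

Every arithmetic operation has to untag and retag (or compensate for the marker), and the largest integer is half of what the word could hold. That is the price the separate type information avoids.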
Integers are not Objects like "normal" objects. They are Values, on par with ObjectReferences, and differ from objects in the following ways:
- equality implies identity
- constant for whole lifetime
- pass by value semantics
If integers were normal objects, the first would mean they would be singletons. The second means you can't change them, you can only change a variable to hold a different value. It also means you can't add instance variables to an integer, nor singleton methods. And the third means that if you do change the variable, a value that was passed on will not change. Also they are not garbage collected. If you notice how weird that idea is (garbage collecting integers), you can see how natural the Value idea is.
Instead of trying to make this difference go away (like MRI) I think it should be explicit and indeed be expanded to all Objects that have these properties. Words, for example (ruby calls them Symbols), are the same. A Table is a Table, and a Toble is not. Floats (all numbers) and Times are the same.
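
The three properties can already be observed in plain ruby. The snippet below is only a demonstration against MRI (not crystal code), and the variable and method names are just for illustration.

```ruby
# Equality implies identity: equal integers (and symbols) are the same value.
same_integer     = 42.equal?(40 + 2)
same_symbol      = :table.equal?(:table)
different_symbol = :table.equal?(:toble)

# Constant for the whole lifetime: no instance variables ...
ivar_error = begin
  42.instance_variable_set(:@note, "x")
  nil
rescue RuntimeError => error # FrozenError (a RuntimeError) on newer rubies
  error
end

# ... and no singleton methods either.
singleton_error = begin
  42.define_singleton_method(:shout) { "42!" }
  nil
rescue TypeError => error
  error
end

# Pass by value: the method only rebinds its own local variable.
def bump(number)
  number += 1
  number
end

count  = 5
bumped = bump(count)
```

After running this, `count` is still 5 while `bumped` is 6, and both error variables hold an exception instead of nil.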
### Object Layout
So if we're not tagging, we must pass and keep the type information around separately. For passing, it has been mentioned that a separate register is used.
For keeping track of the type data we need to make a decision about how many types we support. The register for passing gives the upper limit of 4 bits, and this fits well with the idea of cache lines. So if we use cache lines, for every 8 words, we take one for the type.
Traditionally the class of the object is stored in the object. But this forces the dynamic lookup that is a good part of the performance problem. Instead we store the Object's Layout. The Layout then stores the Class, but it is the Layout that describes the memory layout of the object (and all objects with the same layout).
This is in essence a level of indirection that gives us the space to have several Layouts for one class, and so we can evolve the class without having to change the Layout (we just create new ones for every change).
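
A rough sketch of that indirection, written in plain ruby rather than at the machine level. `Layout`, `LaidOutObject` and their methods are hypothetical names for this example; the real data would live in memory words, not in a ruby Array.

```ruby
# A Layout knows the class and which variable lives in which slot.
Layout = Struct.new(:object_class, :names) do
  # Slot index of a variable, or nil if this layout does not have it.
  def index_of(name)
    names.index(name)
  end

  # Layouts are never changed; evolving means creating a new one.
  def extended_with(name)
    Layout.new(object_class, names + [name])
  end
end

# An object points at its Layout, not at its class.
class LaidOutObject
  attr_reader :layout

  def initialize(layout)
    @layout = layout
    @data = Array.new(layout.names.size)
  end

  def get(name)
    index = @layout.index_of(name)
    index && @data[index]
  end

  # A new variable switches this object to a new Layout.
  def set(name, value)
    @layout = @layout.extended_with(name) unless @layout.index_of(name)
    @data[@layout.index_of(name)] = value
  end
end
```

Two objects created with the same Layout keep sharing it until one of them gains a variable. From then on they have different Layouts, but both Layouts still point to the same class.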
The memory layout of **every** object is type word, layout reference and "data".
That leaves the length open, and we can use the eighth 4-bit field of the type word to store it. That gives a maximum of 16 lines.
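
The packing of a type word could look something like this. It is a sketch under assumptions: the numeric type codes, the order of the fields and the choice to store the length as "lines minus one" (so that 4 bits reach 16) are all mine, not decided anywhere in the text.

```ruby
# Hypothetical 4 bit codes for the basic types.
TYPE_CODES = { integer: 0, unsigned: 1, object_reference: 2, function: 3 }.freeze

FIELD_BITS   = 4
LENGTH_FIELD = 7 # assumption: the eighth field holds the length

# Pack the types of up to seven words, plus the object length in lines,
# into one 32 bit type word.
def pack_type_word(types, lines)
  raise ArgumentError, "at most 7 typed words per line" if types.size > LENGTH_FIELD
  raise ArgumentError, "length must be 1..16 lines" unless (1..16).cover?(lines)
  word = (lines - 1) << (LENGTH_FIELD * FIELD_BITS)
  types.each_with_index do |type, index|
    word |= TYPE_CODES.fetch(type) << (index * FIELD_BITS)
  end
  word
end

# Read the type of the word at the given position back out.
def type_at(word, index)
  TYPE_CODES.key((word >> (index * FIELD_BITS)) & 0xF)
end

# Read the object length (in lines) from the eighth field.
def length_in_lines(word)
  ((word >> (LENGTH_FIELD * FIELD_BITS)) & 0xF) + 1
end
```

For example, `pack_type_word([:object_reference, :integer], 2)` describes a two line object whose first data word is a reference (say, to the Layout) and whose second is an integer.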
#### Continuations
But (I hear), ruby is dynamic, we must be able to add variables and methods to an object at any time. So the layout can't
be fixed. Ok, we can change the Layout every time, but when all empty slots have been used up, what then?
Then we use Continuations: instead of adding a new variable to the end of the object, we use a new object and store
it in the original object, thus extending the object.
Continuations are pretty normal objects and it is just up to the layout to manage the redirection.
Of course this may scatter objects a little, but in a running application this does not really happen much. Most instance variables are added quite soon after startup, just as functions are usually parsed in the beginning.
The good side of continuations is also that we can be quite tight on initial allocation, and even minimal with continuations. Continuations can be completely changed out after all.
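
A minimal sketch of the continuation idea. To keep it short, the object itself does the redirection here, whereas the text leaves that to the Layout; `SlotObject` and its methods are hypothetical names.

```ruby
# An object with a fixed number of slots that overflows into a continuation.
class SlotObject
  def initialize(capacity)
    @capacity = capacity
    @slots = Array.new(capacity) # fixed size, never grown
    @continuation = nil          # a normal object, created only on demand
  end

  def store(index, value)
    if index < @capacity
      @slots[index] = value
    else
      @continuation ||= SlotObject.new(@capacity)
      @continuation.store(index - @capacity, value)
    end
  end

  def fetch(index)
    return @slots[index] if index < @capacity
    @continuation && @continuation.fetch(index - @capacity)
  end

  # Number of objects that together make up this (extended) object.
  def chain_length
    1 + (@continuation ? @continuation.chain_length : 0)
  end
end
```

An object created with two slots stays a single object until something is stored at index 2 or above. Only then is a second object allocated and linked in.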
### Pages and Spaces
Now that we have the smallest units taken care of, we need to store them and allocate and manage larger chunks. This is much
simpler and we can use a fixed size Page of, say, 256 lines.
The highest order is a Space, which is just a list of Pages. Spaces manage Pages in a very similar way to how Pages manage Objects, i.e. as linked lists of free Objects/Pages.
A Page, like a Space, is of course a normal object. The actual memory materialises out of nowhere, but then gets
filled immediately with objects. So no empty memory is managed, just objects that can be repurposed.
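
As a sketch of how that could work: a Page that is filled with one line objects the moment it is created and hands them out from a linked free list, and a Space that adds Pages when all are full. The names `Cell`, `Page` and `Space` and the one-line object size are assumptions for this example, and the Space keeps its Pages in a plain Array for brevity instead of a linked list.

```ruby
# A free object: it only knows the next free object on its Page.
Cell = Struct.new(:next_free, :payload)

class Page
  LINES = 256 # the "say 256 lines" from the text

  # The Page is filled with objects right away; there is no raw memory.
  def initialize
    @free = nil
    LINES.times { @free = Cell.new(@free, nil) }
  end

  def full?
    @free.nil?
  end

  # Repurpose the first free object, or return nil if there is none.
  def allocate
    cell = @free
    return nil unless cell
    @free = cell.next_free
    cell.next_free = nil
    cell
  end

  # Put an object back at the head of the free list.
  def release(cell)
    cell.payload = nil
    cell.next_free = @free
    @free = cell
  end
end

class Space
  attr_reader :pages

  def initialize
    @pages = []
  end

  # Take an object from the first Page that has one, adding a Page if needed.
  def allocate
    page = @pages.find { |candidate| !candidate.full? }
    unless page
      page = Page.new
      @pages << page
    end
    page.allocate
  end
end
```

Allocating 257 objects from a fresh Space creates exactly two Pages, and a released object is the first one to be handed out again by its Page.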