2014-07-01 20:58:39 +03:00

---
layout: crystal
title: Crystal, a simple and minimal oo machine
---
<div class="row vspace10">
<div class="span12 center">
<h3><span>Crystal layers</span></h3>
<p>The layers map pretty much onto the top-level directories.</p>
</div>
</div>
<div class="row vspace20">
<div class="span11">
<h5>Machine Code, bare metal</h5>
<p>
This is the code in the arm directory. It creates binary code according to the ARM specs, which is all about shifting bits in
the right way.
<br/>
As an abstraction it is not far from assembler. I mapped the mnemonics to function calls, and the registers
can be symbols or Values (from vm). But on the whole this is as low level as it gets.
<br/>
Different types of instructions are implemented by different classes. To make machine-dependent code possible,
those classes are derived from Vm versions.
<br/>
There is an intel directory which contains an expanded version of wilson, but it has yet to be made to fit into
the architecture. So for now crystal produces only ARM code.
</p>
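<p>
To make the "mnemonics as function calls, registers as symbols" idea concrete, here is a minimal sketch. It is not the code
from the arm directory: the names REGISTERS, OPCODES, encode, mov and add are made up for illustration, and only the
immediate form of two data-processing instructions is covered. The field layout follows the classic 32-bit ARM encoding.
</p>

```ruby
# Hypothetical names throughout; this is not the project's actual API.
REGISTERS = { r0: 0, r1: 1, r2: 2, r3: 3 }.freeze
OPCODES   = { add: 0b0100, mov: 0b1101 }.freeze
ALWAYS    = 0b1110 # condition field: execute unconditionally

# Encode a data-processing instruction with an 8-bit immediate operand.
# Field layout, most significant bit first:
#   cond(4) 00 I(1) opcode(4) S(1) Rn(4) Rd(4) rotate(4) imm8(8)
# The fields are written out as a 32-character bit string and then
# converted to an integer, which puts every field at its bit position.
def encode(mnemonic, rd, rn, immediate)
  raise ArgumentError, "immediate must fit in 8 bits" unless (0..255).cover?(immediate)

  bits = format("%04b00%01b%04b%01b%04b%04b%04b%08b",
                ALWAYS, 1, OPCODES.fetch(mnemonic), 0,
                REGISTERS.fetch(rn), REGISTERS.fetch(rd), 0, immediate)
  bits.to_i(2)
end

# Mnemonics become plain function calls, registers are symbols.
def mov(rd, immediate)
  encode(:mov, rd, :r0, immediate) # Rn is unused by mov and left as zero
end

def add(rd, rn, immediate)
  encode(:add, rd, rn, immediate)
end

puts format("%08x", mov(:r0, 5))      # prints e3a00005
puts format("%08x", add(:r1, :r2, 1)) # prints e2821001
```

<p>
The two printed words are what an ARM assembler produces for "mov r0, #5" and "add r1, r2, #1". A real implementation
would also need register operands, shifts, condition codes and rotated immediates.
</p>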
</div>
</div>
<div class="row">
<div class="span12">
<h5>Parsing, forever descending</h5>
<p>
Parsing is relatively straightforward too. We all know Ruby, so it's just a matter of getting the rules right.
<br/>
If only! Ruby is full of niceties that actually make parsing it quite difficult. But at the moment that story hasn't
even started.
<br/>
Traditionally, yacc or bison or talk of LR or LL would come in here, and all but a few would zone out. But LLVM's Clang front end has
shown that recursive descent parsing is a viable alternative, even for big projects. And Parslet puts that into a nice
Ruby framework for us.
<br/>
Parslet lets us use modules for parts of the parser, so those files are pretty self-explanatory. Not everything is done, but
it is a good start.
<br/>
Parslet also has a separate Transformation pass, and that creates the AST. Those class names are also
easy, so you can guess what an IfExpression represents.
<br/>
</p>
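<p>
For readers who have not seen recursive descent before, here is a tiny hand-written example. It deliberately does not use
Parslet, only StringScanner from the standard library, and it builds the AST directly instead of in a separate
transformation pass. The grammar, the TinyParser class and the node classes other than the IfExpression name are
assumptions made for this sketch.
</p>

```ruby
require "strscan"

# Hypothetical AST node classes, named in the spirit of IfExpression.
IntegerExpression  = Struct.new(:value)
OperatorExpression = Struct.new(:operator, :left, :right)
IfExpression       = Struct.new(:condition, :if_true, :if_false)

# Recursive descent: one method per grammar rule, rules call each other.
#   expression    := if_expression | sum
#   if_expression := "if" expression "then" expression "else" expression "end"
#   sum           := integer ("+" integer)*
class TinyParser
  def initialize(source)
    @scanner = StringScanner.new(source)
  end

  def parse
    tree = expression
    @scanner.skip(/\s*/)
    raise ArgumentError, "unexpected input at #{@scanner.pos}" unless @scanner.eos?
    tree
  end

  private

  def expression
    accept(/if\b/) ? if_expression : sum
  end

  def if_expression
    condition = expression
    expect(/then\b/)
    if_true = expression
    expect(/else\b/)
    if_false = expression
    expect(/end\b/)
    IfExpression.new(condition, if_true, if_false)
  end

  def sum
    left = integer
    left = OperatorExpression.new("+", left, integer) while accept(/\+/)
    left
  end

  def integer
    IntegerExpression.new(expect(/\d+/).to_i)
  end

  # Skip whitespace, then try to consume the pattern; nil when it does not match.
  def accept(pattern)
    @scanner.skip(/\s*/)
    @scanner.scan(pattern)
  end

  def expect(pattern)
    accept(pattern) or raise ArgumentError, "expected #{pattern.inspect} at #{@scanner.pos}"
  end
end

ast = TinyParser.new("if 1 then 2 + 3 else 4 end").parse
puts ast.class            # prints IfExpression
puts ast.if_true.operator # prints +
```

<p>
With Parslet the rules would be declared rather than written as methods, but the descent through the grammar works the same way.
</p>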
</div>
</div>
<div class="row">
<div class="span12">
<h5>Virtual Machine</h5>
<p>
The virtual machine layer is where it gets interesting, but also a little fuzzy.
<br/>
After some experimentation, the virtual machine layer has become a completely self-contained layer to describe and
implement an oo machine. In other words, it has no reference to any physical machine; that is the next layer down.
<br/>
One can get headaches quite easily while thinking about implementing an oo machine in oo; it's just so difficult to
find the boundaries. To determine those, I like to talk of types (not classes) for the objects (values) in which the
vm is implemented. It is also necessary to remove ambiguity about what message sending means.
<br/>
One way to think of this (it helps to stay sane) is to think of the types of the system that are known at compile time. In the
simplest case these could be object reference and integer. The whole vm functionality can be made to work with only
those two types, and it is not specified how the type information is stored. But of course there needs to be a
way to check it at run-time.
<br/>
The vm has an instruction set that, apart from basic integer manipulation, only allows for memory access into an
object. Instead of an implicit stack, we use activation frames and store all variables explicitly.
</p>
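<p>
A rough sketch of what this could look like follows. It is an assumption-laden illustration, not the vm code: TypedValue,
VmObject, Frame and the add function are hypothetical names, and keeping the type next to the payload is just one of many
ways to store the type information, which the vm itself leaves open.
</p>

```ruby
# Hypothetical sketch; none of these names come from the project.
# Only two types exist: :integer and :reference (to an object).
TypedValue = Struct.new(:type, :payload) do
  def self.integer(number)
    new(:integer, number)
  end

  def self.reference(object)
    new(:reference, object)
  end
end

# An object is nothing but a fixed number of slots holding typed values.
class VmObject
  def initialize(size)
    @slots = Array.new(size) { TypedValue.integer(0) }
  end

  def get(index)
    @slots.fetch(index)
  end

  def set(index, value)
    @slots.fetch(index) # raises IndexError for slots outside the object
    @slots[index] = value
  end
end

# No implicit stack: every call gets a frame that stores its variables
# explicitly and links back to the frame of its caller.
class Frame
  attr_reader :caller_frame, :locals

  def initialize(caller_frame, local_count)
    @caller_frame = caller_frame
    @locals = VmObject.new(local_count)
  end
end

# Basic integer manipulation, with the run-time type check mentioned above.
def add(left, right)
  [left, right].each do |value|
    raise TypeError, "integer expected, got #{value.type}" unless value.type == :integer
  end
  TypedValue.integer(left.payload + right.payload)
end

point = VmObject.new(2)
point.set(0, TypedValue.integer(40))

main = Frame.new(nil, 1)
main.locals.set(0, TypedValue.reference(point))

callee = Frame.new(main, 1)
receiver = callee.caller_frame.locals.get(0).payload
callee.locals.set(0, add(receiver.get(0), TypedValue.integer(2)))
puts callee.locals.get(0).payload # prints 42
```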
</div>
</div>
<div class="row">
<div class="span12">
<h5>Neumann Machine</h5>
<p>
The von Neumann machine layer is a relatively close abstraction of hardware.
<br/>
It is currently still quite simple: we have classes for things we know, like program and function, and for things we need
to create the code, like Blocks and Instructions.
<br/>
The most interesting thing is maybe the idea of a Value. If you think of variables, Values are what a variable may
be assigned, but a Value may also carry a storage place (a register). Values are constant, so to
change a value we have to create a new Value (of a possibly different basic type). Thus
all machine instructions are transformations of values into new ones.
<br/>
Also interesting is the slightly unripe Basic Type system. We have a set of machine-word size types and do not
tag them (like MRI or BB), but keep type info separate. These types include integer (signed/unsigned), object reference
and function. Most of the oo machine will build on object references. To make that clearer: the (virtual) machine is
strongly typed (with rtti) and the dynamic Ruby behaviour is implemented using that basic type system.
</p>
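<p>
The Value idea can be sketched roughly like this. The classes below are stand-ins written for this page, not the real
ones: Value, Instruction and Block only borrow their names from the text, and the add method is a made-up example of
an instruction that turns two values into a new one. The type lives in the compile-time Value object, not in tag bits of the
machine word.
</p>

```ruby
# Hypothetical stand-ins; the real classes may look quite different.
# A Value knows its basic type and the register it lives in, and it is
# frozen on creation, so it can never be changed afterwards.
class Value
  attr_reader :type, :register

  def initialize(type, register)
    @type = type
    @register = register
    freeze
  end
end

Instruction = Struct.new(:name, :result, :operands)

# A Block collects instructions. Each operation records an instruction
# and hands back a brand new Value instead of modifying an operand.
class Block
  attr_reader :instructions

  def initialize
    @instructions = []
  end

  def add(left, right, register)
    raise TypeError, "add needs integers" unless [left, right].all? { |value| value.type == :integer }

    result = Value.new(:integer, register)
    @instructions.push(Instruction.new(:add, result, [left, right]))
    result
  end
end

block = Block.new
a = Value.new(:integer, :r0)
b = Value.new(:integer, :r1)
sum = block.add(a, b, :r2)
puts sum.register            # prints r2
puts block.instructions.size # prints 1
```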
</div>
</div>
<div class="row">
<div class="span12">
<h5>The flux</h5>
<p>
This is just a section of things that are unclear, in flux as it were. It does not include things that are simply not done yet; those
are plenty too.
</p>
<ul>
<li> the whole type system, mostly that of values; for object types it's quite clear</li>
<li> booting. There is this dichotomy of writing code and figuring out what it should do when it executes.
That works ok until I try to think of both at once, as booting requires. </li>
<li> the oo machine abstraction. It is currently non-existent, and I feel like there is a whole layer missing, possibly
with its own instruction set</li>
<li> where the core ends, where parfait starts and what can be external. </li>
</ul>
</div>
</div>