updates architecture to current state
parent
8d239d5c07
commit
df69847757
6
assets/README.md
Normal file
@ -0,0 +1,6 @@
|
|||||||
|
## UML and images
|
||||||
|
|
||||||
|
UML (Unified Modeling Language) diagrams are stored as originals in this folder.
|
||||||
|
They are created with [UMLet](http://www.umlet.com/) and exported as jpg.
|
||||||
|
|
||||||
|
Jpeg files are referenced from the markdown.
|
BIN
assets/layers.jpg
Normal file
Binary file not shown.
Size: 89 KiB
158
assets/layers.uxf
Normal file
@ -0,0 +1,158 @@
|
|||||||
|
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
|
||||||
|
<diagram program="umlet" version="14.2">
|
||||||
|
<zoom_level>10</zoom_level>
|
||||||
|
<element>
|
||||||
|
<id>UMLClass</id>
|
||||||
|
<coordinates>
|
||||||
|
<x>60</x>
|
||||||
|
<y>130</y>
|
||||||
|
<w>120</w>
|
||||||
|
<h>60</h>
|
||||||
|
</coordinates>
|
||||||
|
<panel_attributes><Vool>
|
||||||
|
--
|
||||||
|
Concrete Tree</panel_attributes>
|
||||||
|
<additional_attributes/>
|
||||||
|
</element>
|
||||||
|
<element>
|
||||||
|
<id>UMLClass</id>
|
||||||
|
<coordinates>
|
||||||
|
<x>60</x>
|
||||||
|
<y>70</y>
|
||||||
|
<w>120</w>
|
||||||
|
<h>60</h>
|
||||||
|
</coordinates>
|
||||||
|
<panel_attributes><Ruby>
|
||||||
|
--
|
||||||
|
Abstract Tree</panel_attributes>
|
||||||
|
<additional_attributes/>
|
||||||
|
</element>
|
||||||
|
<element>
|
||||||
|
<id>UMLClass</id>
|
||||||
|
<coordinates>
|
||||||
|
<x>60</x>
|
||||||
|
<y>190</y>
|
||||||
|
<w>120</w>
|
||||||
|
<h>60</h>
|
||||||
|
</coordinates>
|
||||||
|
<panel_attributes><Mom>
|
||||||
|
--
|
||||||
|
Concrete Tree
|
||||||
|
+ Linked List</panel_attributes>
|
||||||
|
<additional_attributes/>
|
||||||
|
</element>
|
||||||
|
<element>
|
||||||
|
<id>UMLClass</id>
|
||||||
|
<coordinates>
|
||||||
|
<x>60</x>
|
||||||
|
<y>250</y>
|
||||||
|
<w>120</w>
|
||||||
|
<h>50</h>
|
||||||
|
</coordinates>
|
||||||
|
<panel_attributes><Risc>
|
||||||
|
--
|
||||||
|
Linked List</panel_attributes>
|
||||||
|
<additional_attributes/>
|
||||||
|
</element>
|
||||||
|
<element>
|
||||||
|
<id>UMLClass</id>
|
||||||
|
<coordinates>
|
||||||
|
<x>60</x>
|
||||||
|
<y>300</y>
|
||||||
|
<w>120</w>
|
||||||
|
<h>50</h>
|
||||||
|
</coordinates>
|
||||||
|
<panel_attributes><Arm>
|
||||||
|
--
|
||||||
|
Linked List</panel_attributes>
|
||||||
|
<additional_attributes/>
|
||||||
|
</element>
|
||||||
|
<element>
|
||||||
|
<id>UMLClass</id>
|
||||||
|
<coordinates>
|
||||||
|
<x>60</x>
|
||||||
|
<y>350</y>
|
||||||
|
<w>120</w>
|
||||||
|
<h>50</h>
|
||||||
|
</coordinates>
|
||||||
|
<panel_attributes><Elf>
|
||||||
|
--
|
||||||
|
Binary / Array</panel_attributes>
|
||||||
|
<additional_attributes/>
|
||||||
|
</element>
|
||||||
|
<element>
|
||||||
|
<id>UMLClass</id>
|
||||||
|
<coordinates>
|
||||||
|
<x>220</x>
|
||||||
|
<y>70</y>
|
||||||
|
<w>670</w>
|
||||||
|
<h>60</h>
|
||||||
|
</coordinates>
|
||||||
|
<panel_attributes>Ruby is parsed into an abstract syntax tree. "Abstract" means all nodes are
|
||||||
|
represented by one class. Compiling to Vool happens by visitor pattern.</panel_attributes>
|
||||||
|
<additional_attributes/>
|
||||||
|
</element>
|
||||||
|
<element>
|
||||||
|
<id>UMLClass</id>
|
||||||
|
<coordinates>
|
||||||
|
<x>220</x>
|
||||||
|
<y>130</y>
|
||||||
|
<w>670</w>
|
||||||
|
<h>60</h>
|
||||||
|
</coordinates>
|
||||||
|
<panel_attributes>Vool (Virtual object oriented Language) is ruby without the fluff. Just simple oo.
|
||||||
|
Vool uses a concrete syntax tree, meaning one class per kind of statement.
|
||||||
|
Code to compile to Mom is in the statement classes.
|
||||||
|
</panel_attributes>
|
||||||
|
<additional_attributes/>
|
||||||
|
</element>
|
||||||
|
<element>
|
||||||
|
<id>UMLClass</id>
|
||||||
|
<coordinates>
|
||||||
|
<x>220</x>
|
||||||
|
<y>190</y>
|
||||||
|
<w>670</w>
|
||||||
|
<h>60</h>
|
||||||
|
</coordinates>
|
||||||
|
<panel_attributes>Mom (Minimal object machine) is a very simple object oriented machine. It only deals in objects,
|
||||||
|
including the calling convention. A first pass still uses a tree structure to represent control.
|
||||||
|
The final representation is a list of instructions.</panel_attributes>
|
||||||
|
<additional_attributes/>
|
||||||
|
</element>
|
||||||
|
<element>
|
||||||
|
<id>UMLClass</id>
|
||||||
|
<coordinates>
|
||||||
|
<x>220</x>
|
||||||
|
<y>250</y>
|
||||||
|
<w>670</w>
|
||||||
|
<h>50</h>
|
||||||
|
</coordinates>
|
||||||
|
<panel_attributes>The Risc (reduced instruction set computer) layer is a (sane) subset of ARM. It models only those
|
||||||
|
instructions needed to implement Mom, about 20. It deals in memory, logic and jumps.</panel_attributes>
|
||||||
|
<additional_attributes/>
|
||||||
|
</element>
|
||||||
|
<element>
|
||||||
|
<id>UMLClass</id>
|
||||||
|
<coordinates>
|
||||||
|
<x>220</x>
|
||||||
|
<y>300</y>
|
||||||
|
<w>670</w>
|
||||||
|
<h>50</h>
|
||||||
|
</coordinates>
|
||||||
|
<panel_attributes>The arm chip is the most sold/used in the world, so it is our first "target". Arm has a simple
|
||||||
|
instruction set and we only use the part to implement what is used in Risc.</panel_attributes>
|
||||||
|
<additional_attributes/>
|
||||||
|
</element>
|
||||||
|
<element>
|
||||||
|
<id>UMLClass</id>
|
||||||
|
<coordinates>
|
||||||
|
<x>220</x>
|
||||||
|
<y>350</y>
|
||||||
|
<w>670</w>
|
||||||
|
<h>50</h>
|
||||||
|
</coordinates>
|
||||||
|
<panel_attributes>The lowest level, achieved by assembling arm instructions, is binary code that a cpu can execute.
|
||||||
|
Elf is used to wrap it into a file so Linux can start it. Elf includes debug info too.</panel_attributes>
|
||||||
|
<additional_attributes/>
|
||||||
|
</element>
|
||||||
|
</diagram>
|
@ -9,74 +9,73 @@ To implement an object system to execute object oriented languages takes a large
|
|||||||
The parts or abstraction layers are detailed below.
|
The parts or abstraction layers are detailed below.
|
||||||
|
|
||||||
It is important to understand the approach first though, as it differs from the normal
|
It is important to understand the approach first though, as it differs from the normal
|
||||||
interpretation. The idea is to **compile** ruby. It may be easiest to compare to a static
|
interpretation. The idea is to **compile** ruby. The argument is often made that
|
||||||
object oriented language like c++. When c++ was created c++ code was translated into c, which
|
typed languages are faster, but I don't believe in that. I think dynamic languages
|
||||||
then gets translated into assembler, which gets translated to binary code, which is linked
|
just push more functionality into the "virtual machine" and it is in fact only the
|
||||||
and executed. Compiling to binaries is what gives these languages speed, and is the reason
|
compiling to binaries that gives static languages their speed. This is the reason
|
||||||
to compile ruby.
|
to compile ruby.
|
||||||
|
|
||||||
In a similar way to the c++ example, we need a level between ruby and assembler, as it is too
|
![Architectural layers](/assets/layers.jpg)
|
||||||
big a mental step from ruby to assembler. Of course one could try to compile to c, but
|
|
||||||
since c is not object oriented that would mean dealing with all of c's non oo heritage, like
|
|
||||||
linking model, memory model, calling convention etc.
|
|
||||||
|
|
||||||
Top down the layers are:
|
|
||||||
|
|
||||||
- **Melon** , compiling ruby code into typed layer and includes bootstrapping code
|
### Ruby
|
||||||
|
|
||||||
- **Typed intermediate layer:** Statically typed object oriented with object oriented
|
To compile and run ruby, we first need to parse ruby. While parsing ruby is quite
|
||||||
call semantics.
|
|
||||||
|
|
||||||
- **Risc register machine abstraction** provides a level of machine abstraction, but
|
|
||||||
as the name says, quite a simple one.
|
|
||||||
|
|
||||||
- **Binary and cpu specific assembler** This includes arm assembly and elf support
|
|
||||||
to produce a binary that can then read in ruby programs
|
|
||||||
|
|
||||||
### Melon
|
|
||||||
|
|
||||||
To compile and run ruby, we need to parse and compile ruby code. While parsing ruby is quite
|
|
||||||
a difficult task, it has already been implemented in pure ruby
|
a difficult task, it has already been implemented in pure ruby
|
||||||
[here](https://github.com/whitequark/parser). The output of the parser is again
|
[here](https://github.com/whitequark/parser). The output of the parser is
|
||||||
an ast, which needs to be compiled to the typed layer.
|
an ast, which holds information about the code in instances of a single *Node* class.
|
||||||
|
Nodes have a type (which you sometimes see in s-expressions) and a list of children.
|
||||||
|
|
||||||
The dynamic aspects of ruby are actually relatively easy to handle, once the whole system is
|
There are two basic problems when working with ruby ast: one is the a in ast, the other is ruby.
|
||||||
in place, because the whole system is written in ruby without external dependencies.
|
|
||||||
Since (when finished) it can compile ruby, it can do so to produce a binary. This binary can
|
|
||||||
then contain the whole of the system, and so the resulting binary will be able to produce
|
|
||||||
binary code when it runs. With small changes to the linking process (easy in ruby!) it can
|
|
||||||
then extend itself.
|
|
||||||
|
|
||||||
The type aspect is more tricky: Ruby is not typed but the typed layer is after all.
|
Since an abstract syntax tree only has one base class, one needs to employ the visitor
|
||||||
But since everything is object (yes, also integers and floats are first class citizens)
|
pattern to write a compiler. This ends up being one great class with lots of unrelated
|
||||||
we know the type on any object at any time and can check it easily.
|
functions, removing much of the benefit of OO.
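
A minimal sketch of the problem described above. It assumes a hypothetical `Node` struct (a type plus children) and a hypothetical `SketchCompiler`; neither is the parser gem's actual API. Because every node is an instance of the same class, the compiler has to dispatch on the type by hand, and all the handlers end up in one class:

```ruby
# Hypothetical stand-in for an abstract syntax tree node: one class for all
# kinds of nodes, distinguished only by a type symbol.
Node = Struct.new(:type, :children)

# With a single node class, the compiler dispatches on the type by hand.
# All the on_* methods live in this one class, however unrelated they are.
class SketchCompiler
  def process(node)
    send("on_#{node.type}", node)
  end

  def on_int(node)
    node.children.first.to_s
  end

  def on_lvar(node)
    node.children.first.to_s
  end

  def on_send(node)
    receiver, name, *args = node.children
    "#{process(receiver)}.#{name}(#{args.map { |a| process(a) }.join(', ')})"
  end
end

# s-expression style: (send (lvar :a) :+ (int 1))
tree = Node.new(:send, [Node.new(:lvar, [:a]), :+, Node.new(:int, [1])])
puts SketchCompiler.new.process(tree) # prints a.+(1)
```

Every new kind of node adds another `on_*` method to the same class, which is the growth the paragraph above complains about.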
|
||||||
Easy checks also make inline method jump tables relatively easy.
|
|
||||||
|
|
||||||
### Typed intermediate layer
|
The second, possibly bigger problem, is ruby itself: Ruby is full of programmer happiness,
|
||||||
|
three ways to do this, five to do that. To simplify that, remove the duplication and
|
||||||
|
make analysis easier, Vool was created.
|
||||||
|
|
||||||
The Typed intermediate layer is more fully described [here](/typed/typed.html)
|
### Virtual Object Oriented Language
|
||||||
|
|
||||||
In broad strokes it consists of:
|
Virtual, in this context, means that there is no syntax for this language; it is an
|
||||||
|
intermediate representation which *could* be targeted by several languages.
|
||||||
|
|
||||||
- **MethodCompiler:** compiles the ast into a sequence of Register instructions.
|
The main purpose is to simplify existing oo languages down to their core components: mostly
|
||||||
and runtime objects (classes, methods etc)
|
calling, assignment, continuations and exceptions. Typed classes for each language construct
|
||||||
- **Parfait:** Is the runtime, ie the minimal set of objects needed to
|
exist and make it easier to transform a statement into a lower level representation.
|
||||||
create a binary with the required information to be dynamic
|
|
||||||
- **Builtin:** A very small set of primitives that are impossible to express in ruby
|
|
||||||
|
|
||||||
- **Message based calling:** Calling is completely object oriented (not stack based)
|
Examples for things that exist in ruby but are broken down in Vool are *unless* , ternary operator,
|
||||||
and uses Message and Frame objects.
|
do while or for loops and other similar syntactic sugar.
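
As a rough illustration of breaking down syntactic sugar, the sketch below maps both an `if` and an `unless` node onto one statement class. The `Node` and `IfStatement` structs, the node types and the `to_vool` helper are all hypothetical names chosen for this example, not the actual Vool classes or the parser's real output:

```ruby
# Hypothetical input node (type plus children) and a hypothetical
# concrete-tree class: one class per kind of statement.
Node = Struct.new(:type, :children)
IfStatement = Struct.new(:condition, :if_true, :if_false)

# Sketch of normalising sugar: `if` and `unless` both end up as the same
# IfStatement, `unless` simply swaps the two branches.
def to_vool(node)
  cond, first, second = node.children
  case node.type
  when :if     then IfStatement.new(cond, first, second)
  when :unless then IfStatement.new(cond, second, first)
  else raise ArgumentError, "unsupported node type #{node.type}"
  end
end

# `work unless done` becomes an if with an empty true branch.
statement = to_vool(Node.new(:unless, [:done, :work, nil]))
puts statement.if_true.inspect
puts statement.if_false.inspect
```

Later stages then only need to know about `IfStatement`, not about every spelling ruby offers for a conditional.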
|
||||||
|
|
||||||
|
### Minimal Object machine
|
||||||
|
|
||||||
### Register Machine
|
We compile Vool statements into Mom instructions. Mom is a machine, which means it has
|
||||||
|
instructions. But unlike a cpu (or the risc layer below) it does not have memory, only objects.
|
||||||
|
It also has no registers, and together these two things mean that all information is stored in
|
||||||
|
objects. Also the calling convention is object based and uses Frame and Message instances to
|
||||||
|
save state.
|
||||||
|
|
||||||
|
Objects are typed, and are in fact the same objects the language operates on. Just the
|
||||||
|
functionality is expressed through instructions. Methods are in fact defined (as vool) on classes
|
||||||
|
and then compiled to Mom/Risc/Arm and the results stored in the method object.
|
||||||
|
|
||||||
|
Compilation to Mom happens in two stages:
|
||||||
|
1. The linear statements/code is translated to Mom instructions.
|
||||||
|
2. Control statements are translated to jumps and labels.
|
||||||
|
|
||||||
|
The second step leaves a linked list of machine instructions as the input for the next stage.
|
||||||
|
In the future a more elaborate system of optimisations is envisioned between these stages.
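
The two stages can be sketched roughly as follows. The `Instruction` and `IfTree` classes, the label names and the `flatten` helper are assumptions made up for this example; the real Mom classes are not shown in this document. Stage one leaves control as a tree, stage two turns it into one linked list with labels and jumps:

```ruby
# Hypothetical instruction: a node in a singly linked list.
class Instruction
  attr_reader :name
  attr_accessor :next_instruction

  def initialize(name)
    @name = name
  end

  # Append other (itself possibly a list) at the end and return self.
  def append(other)
    last = self
    last = last.next_instruction while last.next_instruction
    last.next_instruction = other
    self
  end

  # Collect the names of all instructions in list order.
  def to_a
    names = []
    current = self
    while current
      names << current.name
      current = current.next_instruction
    end
    names
  end
end

# Stage 1 result (hypothetical): branches are already instructions,
# but control is still represented as a tree.
IfTree = Struct.new(:check, :if_true, :if_false)

# Stage 2: flatten the tree into one list using labels and jumps.
def flatten(tree)
  Instruction.new("check #{tree.check}, else jump false_label")
    .append(tree.if_true)
    .append(Instruction.new("jump merge_label"))
    .append(Instruction.new("false_label"))
    .append(tree.if_false)
    .append(Instruction.new("merge_label"))
end

tree = IfTree.new("a", Instruction.new("b = 1"), Instruction.new("b = 2"))
puts flatten(tree).to_a
```

The flattened list is the kind of structure the next layer can walk instruction by instruction.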
|
||||||
|
|
||||||
|
### Risc
|
||||||
|
|
||||||
The Register machine layer is a relatively close abstraction of risc hardware, but without the
|
The Register machine layer is a relatively close abstraction of risc hardware, but without the
|
||||||
quirks.
|
quirks.
|
||||||
|
|
||||||
The register machine has registers, indexed addressing, operators, branches and everything
|
The Risc machine has registers, indexed addressing, operators, branches and everything
|
||||||
needed for the next layer. It does not try to abstract every possible machine feature
|
needed for the next layer. It does not try to abstract every possible machine feature
|
||||||
(like llvm), but rather "objectifies" the risc view to provide what is needed for the typed
|
(like llvm), but rather "objectifies" the general risc view to provide what is needed for
|
||||||
layer, the next layer up.
|
the Mom layer, the next layer up.
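
To make "registers, indexed addressing, operators, branches" concrete, here is a toy register machine. The `Machine` class and the three instruction structs are hypothetical and far smaller than the real Risc instruction set; they only illustrate the kind of operations such a layer deals in:

```ruby
# Hypothetical machine state: registers, a flat memory and a program counter.
class Machine
  attr_accessor :pc
  attr_reader :registers, :memory

  def initialize(memory)
    @registers = Hash.new(0) # unset registers read as 0
    @memory = memory
    @pc = 0
  end

  # Fetch, advance the program counter, then execute.
  def run(program)
    while pc < program.length
      instruction = program[pc]
      self.pc += 1
      instruction.execute(self)
    end
    self
  end
end

# Indexed addressing: to = memory[registers[base] + offset]
LoadIndexed = Struct.new(:to, :base, :offset) do
  def execute(machine)
    machine.registers[to] = machine.memory[machine.registers[base] + offset]
  end
end

# Operator: to = left + right
Add = Struct.new(:to, :left, :right) do
  def execute(machine)
    machine.registers[to] = machine.registers[left] + machine.registers[right]
  end
end

# Branch: continue at target when the register holds zero.
BranchIfZero = Struct.new(:register, :target) do
  def execute(machine)
    machine.pc = target if machine.registers[register].zero?
  end
end

program = [
  LoadIndexed.new(:r1, :r0, 1), # 0: r1 = memory[1]
  LoadIndexed.new(:r2, :r0, 2), # 1: r2 = memory[2]
  BranchIfZero.new(:r0, 4),     # 2: r0 is 0, so skip instruction 3
  Add.new(:r1, :r1, :r1),       # 3: skipped
  Add.new(:r3, :r1, :r2)        # 4: r3 = r1 + r2
]
machine = Machine.new([0, 20, 22]).run(program)
puts machine.registers[:r3] # prints 42
```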
|
||||||
|
|
||||||
The machine has its own (abstract) instruction set, and the mapping to arm is quite
|
The machine has its own (abstract) instruction set, and the mapping to arm is quite
|
||||||
straightforward. Since the instruction set is implemented as derived classes, additional
|
straightforward. Since the instruction set is implemented as derived classes, additional
|
||||||
@ -88,7 +87,7 @@ self contained system. Ie what an object is, a class, a method etc. This minimal
|
|||||||
parfait, and the same objects will be used at runtime and compile time.
|
parfait, and the same objects will be used at runtime and compile time.
|
||||||
|
|
||||||
Since working at this low machine level (essentially assembler) is not easy to follow for
|
Since working at this low machine level (essentially assembler) is not easy to follow for
|
||||||
everyone, an interpreter was created. Later a graphical interface, a kind of
|
everyone (me :-), an interpreter was created (by me:-). Later a graphical interface, a kind of
|
||||||
[visual debugger](https://github.com/ruby-x/rubyx-debugger) was added.
|
[visual debugger](https://github.com/ruby-x/rubyx-debugger) was added.
|
||||||
Visualizing the control flow and being able to see values updated immediately helped
|
Visualizing the control flow and being able to see values updated immediately helped
|
||||||
tremendously in creating this layer. And the interpreter helps in testing, ie keeping it
|
tremendously in creating this layer. And the interpreter helps in testing, ie keeping it
|
||||||
|