goodbye soml
@@ -1,152 +0,0 @@
---
layout: salama
title: Salama architectural layers
---

<div class="row span10">
<h4>Main Layers</h4>
<p>
Implementing an object system that executes object oriented languages takes a large system.
The parts, or abstraction layers, are detailed below.<br/>
It is important to understand the approach first though, as it differs from the normal
interpretation. The idea is to compile (eg) ruby. It may be easiest to compare to a static
object oriented language like c++. When c++ was created, c++ code was translated into c, which
was then translated into assembler, which was translated to binary code, which was linked
and executed. Compiling to binaries is what gives these languages speed, and is one reason
to compile ruby. <br/>
In a similar way to the c++ example, we need a language between ruby and assembler, as it is too
big a mental step from ruby to assembler. Of course one could try to compile to c, but
since c is not object oriented that would mean dealing with all of c's non oo heritage, like
linking model, memory model, calling convention etc. (more on this in the book) <br/>
The layers are:
<ul>
<li> <b> Binary and cpu specific assembler.</b> This includes arm assembly and elf support
to produce a binary that can then read in ruby programs</li>
<li> <b> Risc register machine abstraction </b> provides a level of machine abstraction, but
as the name says, quite a simple one.</li>
<li> <b> Soml, Salama object machine language, </b> which is like our object c. Statically
typed object oriented with object oriented call semantics. </li>
<li> <b> Salama </b> , which is the layer compiling ruby code into soml and includes
bootstrapping code</li>
</ul>
</p>
</div>

<div class="row span10">
<h5>Binary , Arm and Elf</h5>
<p>
A physical machine will run binaries containing instructions that the cpu understands. With arm
being our main target, this means we need code to produce binary, which is contained in a
separate module <a href="https://github.com/salama/salama-arm"> salama-arm </a>. <br/>
To be able to run code on a unix based operating system, binaries need to be packaged in a
way that the os understands, so minimal elf support is included in the package. <br/>
Arm is a risc architecture but, as anyone who knows it will attest, one with its own quirks.
For example, any instruction may be executed conditionally in arm. Or there is no instruction
that loads a full 32bit constant into a register. It is possible to create very dense code
using all the arm special features, but this is not implemented yet.
</p>
</div>

<div class="row span10">
<h5>Register Machine</h5>
<p>
The Register machine layer is a relatively close abstraction of risc hardware, but without the
quirks.
<br/>
The register machine has registers, indexed addressing, operators, branches and everything
needed for the next layer. It does not try to abstract every possible machine feature
(as llvm does), but rather "objectifies" the risc view to provide what is needed for soml, the
next layer up.
<br/>
The machine has its own (abstract) instruction set, and the mapping to arm is quite
straightforward. Since the instruction set is implemented as derived classes, additional
instructions may be defined and used later, as long as translation is provided for them too.
In other words the instruction set is extensible (unlike cpu instruction sets).
</p>
<p>
Basic object oriented concepts are needed already at this level, to be able to generate a whole
self contained system. Ie what an object is, a class, a method etc. This minimal runtime is called
parfait and will be coded in soml eventually. But since it is the same objects at runtime and
compile time, it will then be translated back to ruby for use at compile time. Currently there
are two versions of the code, in ruby and soml, being hand synchronized. More about parfait below.
</p>
<p>
Since working at this low machine level (essentially assembler) is not easy to follow for
everyone, an interpreter was created. Later a graphical interface, a kind of
<a href="https://github.com/salama/salama-debugger"> visual debugger </a> was added.
Visualizing the control flow and being able to see values updated immediately helped
tremendously in creating this layer. And the interpreter helps in testing, ie keeping it
working in the face of developer change.
</p>
</div>

<div class="row span10">
<h5>Soml, Salama object machine language</h5>
<p>
Soml is probably the largest single part of the system and much more information can be found
<a href="/typed/typed.html"> here </a>.
<br/>
Before soml, a more traditional virtual machine approach was taken and abandoned. The language
is easy to understand and provides a good abstraction, both in terms of object orientation,
and in terms of how this is expressed in the register model. <br/>
It is like ruby without the dynamic aspects, but typed. <br/>
In broad strokes it consists of:
<ul>
<li> <b> Parser:</b> Currently a peg parser, though a hand coded one is planned.
The result of which is an AST</li>
<li> <b> Compiler:</b> compiles the ast into a sequence of Register instructions
and runtime objects (classes, methods etc)</li>
<li> <b> Parfait: </b> Is the runtime, ie the minimal set of objects needed to
create a binary with the required information to be dynamic</li>
<li> <b> Builtin: </b> A very small set of primitives that are impossible to express
in soml (remembering that parfait will be expressed in soml eventually)</li>
</ul>
</p>
<p>
Just to summarize a few of soml's features that are maybe unusual:
<ul>
<li> <b> Message based calling:</b> Calling is completely object oriented (not stack based)
and uses Message and Frame objects.</li>
<li> <b> Return addresses:</b> A soml method call may return to several addresses, according
to type, and in case of exception</li>
<li> <b> Overloaded arguments </b> A method is defined by name, but may have several
implementations for different types of the arguments (statically matched)</li>
</ul>
</p>
</div>

<div class="row span10">
<h5>Salama</h5>
<p>
To compile and run ruby, we need to parse and compile ruby code. To compile ruby to soml a clear
mapping has to be achieved. Particularly the dynamic aspects and typing need to be addressed.
<br/>
While parsing ruby is quite a difficult task, it has already been implemented in pure ruby
<a href="https://github.com/whitequark/parser"> here </a>. The output of the parser is again
an ast, which needs to be compiled to soml. <br/>
The dynamic aspects of ruby are actually relatively easy to handle, once the whole system is
in place, because the whole system is written in ruby without external dependencies.
Since (when finished) it can compile ruby, it can do so to produce a binary. This binary can
then contain the whole of the system, and so the resulting binary will be able to produce
binary code when it runs. With small changes to the linking process (easy in ruby!) it can
then extend itself.
</p>
<p>
The type aspect is more tricky: Ruby is not typed, whereas soml is. And if everything
were objects (as we like to pretend in ruby) we could just do a lot of dynamic checking,
possibly later introduce some caching. But not everything is an object: minimally integers
are not, but maybe also floats and other values. The distinction between what is an integer
and what is an object has sprouted an elaborate type system, which is (by necessity) present in
soml (see there).
</p>
<p>
The idea (because it hasn't been implemented yet) is to have different functions for different
types. The soml layer defines the Type class and BasicTypes and also lets us return to different
places from a function (in effect a soml function call is like an if). By using this, we can
compile a single ruby method into several soml functions. Each such function is typed, ie all
arguments and variables are of known type. Knowing these types, we can call functions according
to their signatures. Also we can autogenerate error methods for unhandled types, and predict
that only a fraction of the possible combinations will actually be needed.
</p>
</div>
132
salama/layers.md
Normal file
@@ -0,0 +1,132 @@
---
layout: salama
title: Salama architectural layers
---

## Main Layers

Implementing an object system that executes object oriented languages takes a large system.
The parts, or abstraction layers, are detailed below.

It is important to understand the approach first though, as it differs from the normal
interpretation. The idea is to **compile** ruby. It may be easiest to compare to a static
object oriented language like c++. When c++ was created, c++ code was translated into c, which
was then translated into assembler, which was translated to binary code, which was linked
and executed. Compiling to binaries is what gives these languages speed, and is the reason
to compile ruby.

In a similar way to the c++ example, we need a level between ruby and assembler, as it is too
big a mental step from ruby to assembler. Of course one could try to compile to c, but
since c is not object oriented that would mean dealing with all of c's non oo heritage, like
linking model, memory model, calling convention etc.

Top down the layers are:

- **Melon**, which compiles ruby code into the typed layer and includes bootstrapping code

- **Typed intermediate layer:** Statically typed object oriented with object oriented
call semantics.

- **Risc register machine abstraction** provides a level of machine abstraction, but
as the name says, quite a simple one.

- **Binary and cpu specific assembler** This includes arm assembly and elf support
to produce a binary that can then read in ruby programs

### Melon

To compile and run ruby, we need to parse and compile ruby code. While parsing ruby is quite
a difficult task, it has already been implemented in pure ruby
[here](https://github.com/whitequark/parser). The output of the parser is again
an ast, which needs to be compiled to the typed layer.

The dynamic aspects of ruby are actually relatively easy to handle, once the whole system is
in place, because the whole system is written in ruby without external dependencies.
Since (when finished) it can compile ruby, it can do so to produce a binary. This binary can
then contain the whole of the system, and so the resulting binary will be able to produce
binary code when it runs. With small changes to the linking process (easy in ruby!) it can
then extend itself.

The type aspect is more tricky: Ruby is not typed, whereas the typed layer is. And
if everything were objects (as we like to pretend in ruby) we could just do a lot of
dynamic checking, possibly later introduce some caching. But not everything is an object:
minimally integers are not, but maybe also floats and other values.
The distinction between what is an integer and what is an object has sprouted an elaborate
type system, which is (by necessity) present in the typed layer.

### Typed intermediate layer

The Typed intermediate layer is more fully described [here](/typed/typed.html).

In broad strokes it consists of:

- **MethodCompiler:** compiles the ast into a sequence of Register instructions
and runtime objects (classes, methods etc)
- **Parfait:** Is the runtime, ie the minimal set of objects needed to
create a binary with the required information to be dynamic
- **Builtin:** A very small set of primitives that are impossible to express in ruby

The idea is to have different methods for different types, all implementing the same ruby
logic. In contrast to the usual 1-1 relationship between a ruby method and its binary
definition, the relationship here is 1-n.

The typed layer defines the Type class and BasicTypes and also lets us return to different
places from a function. By using this, we can
compile a single ruby method into several typed functions. Each such function is typed, ie all
arguments and variables are of known type. Knowing these types, we can call functions according
to their signatures. Also we can autogenerate error methods for unhandled types, and predict
that only a fraction of the possible combinations will actually be needed.

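The 1-n relationship can be pictured with a small sketch. This is not the salama implementation: `TypedMethodTable`, `define` and `call` are hypothetical names, and the lookup happens at run time here purely for brevity, whereas the text describes signatures being matched when compiling.

```ruby
# Hypothetical sketch: none of these names come from the salama code base.
# One ruby-level method name maps to several typed implementations (1-n),
# selected by the types of the arguments.
class TypedMethodTable
  def initialize
    @implementations = Hash.new { |hash, name| hash[name] = {} }
  end

  # Register one typed version of a method for a specific argument signature.
  def define(name, arg_types, &body)
    @implementations[name][arg_types] = body
  end

  # Pick the implementation whose signature matches the argument classes.
  def call(name, *args)
    signature = args.map(&:class)
    body = @implementations[name][signature]
    return unhandled(name, signature) unless body
    body.call(*args)
  end

  private

  # Stand-in for the autogenerated error method for unhandled types.
  def unhandled(name, signature)
    raise TypeError, "no typed version of #{name} for (#{signature.join(', ')})"
  end
end

table = TypedMethodTable.new
table.define(:plus, [Integer, Integer]) { |a, b| a + b }
table.define(:plus, [String, String])   { |a, b| "#{a}#{b}" }
```

Calling `table.call(:plus, 1, 2)` selects the integer version, while `table.call(:plus, 1, "b")` ends in the error path, since only the combinations that were actually defined exist.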

Just to summarize a few of the typed layer's features that are maybe unusual:

- **Message based calling:** Calling is completely object oriented (not stack based)
and uses Message and Frame objects.
- **Return addresses:** A method call may return to several addresses, according
to type, and in case of exception
- **Cross method jumps** When a type switch is detected, a method may jump into the middle
of another method.

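To make the first two points more concrete, here is a rough sketch of what calling through objects rather than a stack could look like. The text names the Message and Frame objects, but the fields, `new_message`, `return_address_for` and `call_chain` below are assumptions made for illustration only. Falling back to the exception address for an unknown type is likewise a choice of this sketch.

```ruby
# Hypothetical sketch; the real Message and Frame classes will differ.

# A Frame holds the local variables of one method activation.
class Frame
  attr_reader :locals

  def initialize
    @locals = {}
  end
end

# A Message represents one call as an object: receiver, method name,
# arguments, the calling message (in place of a stack pointer) and the
# addresses where execution may continue.
Message = Struct.new(:receiver, :name, :arguments, :frame, :sender,
                     :return_addresses, :exception_address)

# Build the message for a call made from inside +sender+.
def new_message(sender, receiver, name, arguments, return_addresses, exception_address)
  Message.new(receiver, name, arguments, Frame.new, sender,
              return_addresses, exception_address)
end

# Choose where the caller continues, based on the type of the returned value.
# Unknown types fall through to the exception address in this sketch.
def return_address_for(message, value)
  message.return_addresses.fetch(value.class, message.exception_address)
end

# Follow the chain of senders, the object oriented equivalent of a backtrace.
def call_chain(message)
  chain = []
  while message
    chain << message.name
    message = message.sender
  end
  chain
end

main  = new_message(nil, :space, :main, [], { Integer => :exit }, :abort)
inner = new_message(main, 5, :plus, [3],
                    { Integer => :after_plus_int, Float => :after_plus_float },
                    :plus_failed)
```

Because each message links to its sender, no separate call stack is needed to find the way back, and the table of return addresses is what lets one call continue in different places depending on the type of the result.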

### Register Machine

The Register machine layer is a relatively close abstraction of risc hardware, but without the
quirks.

The register machine has registers, indexed addressing, operators, branches and everything
needed for the next layer. It does not try to abstract every possible machine feature
(as llvm does), but rather "objectifies" the risc view to provide what is needed for the typed
layer, the next layer up.

The machine has its own (abstract) instruction set, and the mapping to arm is quite
straightforward. Since the instruction set is implemented as derived classes, additional
instructions may be defined and used later, as long as translation is provided for them too.
In other words the instruction set is extensible (unlike cpu instruction sets).

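A minimal sketch of this extensibility, under the assumption that instructions are plain ruby classes and that translation is looked up per class. `Register::Instruction`, `LoadConstant`, `Branch`, `Nop` and `Translator` are illustrative names, not the salama API, and the assembler strings are only stand-ins for real code generation.

```ruby
# Hypothetical sketch; class names are illustrative, not the salama API.
module Register
  # Base class of the abstract instruction set.
  class Instruction; end

  # Load a small constant into a register.
  class LoadConstant < Instruction
    attr_reader :register, :value

    def initialize(register, value)
      @register = register
      @value = value
    end
  end

  # Unconditional jump to a label.
  class Branch < Instruction
    attr_reader :label

    def initialize(label)
      @label = label
    end
  end
end

# Maps each abstract instruction class to a block producing target code.
class Translator
  def initialize
    @rules = {}
  end

  # Register the translation for one instruction class.
  def rule(instruction_class, &block)
    @rules[instruction_class] = block
  end

  # Translate one instruction, failing loudly if no rule was provided.
  def translate(instruction)
    block = @rules.fetch(instruction.class) do
      raise NotImplementedError, "no translation for #{instruction.class}"
    end
    block.call(instruction)
  end
end

arm = Translator.new
arm.rule(Register::LoadConstant) { |i| "mov #{i.register}, ##{i.value}" }
arm.rule(Register::Branch)       { |i| "b #{i.label}" }

# Extending the instruction set later: a new subclass plus its translation.
class Nop < Register::Instruction; end
arm.rule(Nop) { |_i| "mov r0, r0" }
```

An instruction defined without a matching rule raises on translation, which mirrors the condition in the text that new instructions are usable only once a translation exists for them.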

Basic object oriented concepts are needed already at this level, to be able to generate a whole
self contained system. Ie what an object is, a class, a method etc. This minimal runtime is called
parfait, and the same objects will be used at runtime and compile time.

Since working at this low machine level (essentially assembler) is not easy to follow for
everyone, an interpreter was created. Later a graphical interface, a kind of
[visual debugger](https://github.com/salama/salama-debugger) was added.
Visualizing the control flow and being able to see values updated immediately helped
tremendously in creating this layer. And the interpreter helps in testing, ie keeping it
working in the face of developer change.


### Binary , Arm and Elf

A physical machine will run binaries containing instructions that the cpu understands, in a
format the operating system understands (elf). Arm and elf subdirectories hold the code for
these layers.

Arm is a risc architecture but, as anyone who knows it will attest, one with its own quirks.
For example, any instruction may be executed conditionally in arm. Or there is no instruction
that loads a full 32bit constant into a register. It is possible to create very dense code
using all the arm special features, but this is not implemented yet.

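As an illustration of the constant loading quirk: an arm instruction is itself 32 bits wide, so a 32 bit constant cannot fit inside one. One common way around this, on ARMv6T2 and later cores, is the `movw`/`movt` instruction pair, which sets the lower and upper 16 bits separately. Whether salama uses this pair is not stated here. The helper `load_constant` below is a hypothetical sketch that only produces assembler text.

```ruby
# Sketch only: assumes a target that supports the movw/movt pair
# (ARMv6T2 and later). Returns the two assembler lines that together
# load a 32 bit constant into the given register.
def load_constant(register, value)
  raise ArgumentError, "value must fit in 32 bits" unless (0..0xFFFF_FFFF).cover?(value)
  low  = value & 0xFFFF          # lower 16 bits, set by movw (upper half is zeroed)
  high = (value >> 16) & 0xFFFF  # upper 16 bits, set by movt
  ["movw #{register}, ##{low}", "movt #{register}, ##{high}"]
end
```

For the constant `0x12345678` this yields `movw r0, #22136` followed by `movt r0, #4660`, the decimal forms of `0x5678` and `0x1234`.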

All Arm instructions are (ie derive from) Register instructions, and there is an ArmTranslator
that translates RegisterInstructions to ArmInstructions.