goodbye soml

2016-12-19 18:56:35 +02:00
parent 1175a8eb97
commit 930d006417
6 changed files with 183 additions and 226 deletions
@@ -1,152 +0,0 @@
---
-layout: salama
-title: Salama architectural layers
---
-
-
-<div class="row span10">
-    <h4>Main Layers</h4>
-    <p>
-      To implement an object system to execute object oriented languages takes a large system.
-      The parts or abstraction layers are detailed below.<br/>
-      It is important to understand the approach first though, as it differs from the normal
-      interpretation. The idea is to compile (eg) ruby. It may be easiest to compare to a static
-      object oriented language like c++. When c++ was created c++ code was translated into c, which
-      then gets translated into assembler, which gets translated to binary code, which is linked
-      and executed. Compiling to binaries is what gives these languages speed, and is one reason
-      to compile ruby.       <br/>
-      In a similar way to the c++ example, we need a language between ruby and assembler, as it is too
-      big a mental step from ruby to assembler. Of course one could try to compile to c, but
-      since c is not object oriented that would mean dealing with all of c's non oo heritage, like
-      linking model, memory model, calling convention etc. (more on this in the book) <br/>
-      The layers are:
-      <ul>
-        <li> <b> Binary and cpu specific assembler.</b>  This includes arm assembly and elf support
-              to produce a binary that can then read in ruby programs</li>
-        <li> <b> Risc register machine abstraction </b> provides a level of machine abstraction, but
-                  as the name says, quite a simple one.</li>
-        <li> <b> Soml, Salama object machine language, </b> which is like our object c. Statically
-                typed object oriented with object oriented call semantics. </li>
-        <li> <b> Salama </b> , which is the layer compiling ruby code into soml and includes
-            bootstrapping code</li>
-      </ul>
-    </p>
-</div>
-
-<div class="row span10">
-    <h5>Binary , Arm and Elf</h5>
-    <p>
-      A physical machine will run binaries containing instructions that the cpu understands. With arm
-      being our main target, this means we need code to produce binary, which is contained in a
-      separate module <a href="https://github.com/salama/salama-arm"> salama-arm </a>. <br/>
-      To be able to run code on a unix based operating system, binaries need to be packaged in a
-      way that the os understands, so minimal elf support is included in the package. <br/>
-      Arm is a risc architecture, but, as anyone who knows it will attest, one with its own quirks.
-      For example any instruction may be executed conditionally in arm. Or there is no 32bit
-      register load instruction. It is possible to create very dense code using all the arm
-      special features, but this is not implemented yet.
-    </p>
-</div>
-
-<div class="row span10">
-    <h5>Register Machine</h5>
-    <p>
-      The Register machine layer is a relatively close abstraction of risc hardware, but without the
-      quirks.
-      <br/>
-      The register machine has registers, indexed addressing, operators, branches and everything
-      needed for the next layer. It does not try to abstract every possible machine feature
-      (as llvm does), but rather "objectifies" the risc view to provide what is needed for soml, the
-      next layer up.
-      <br/>
-      The machine has its own (abstract) instruction set, and the mapping to arm is quite
-      straightforward. Since the instruction set is implemented as derived classes, additional
-      instructions may be defined and used later, as long as translation is provided for them too.
-      In other words the instruction set is extensible (unlike cpu instruction sets).
-    </p>
-    <p>
-      Basic object oriented concepts are needed already at this level, to be able to generate a whole
-      self contained system. Ie what an object is, a class, a method etc. This minimal runtime is called
-      parfait and will be coded in soml eventually. But since it is the same objects at runtime and
-      compile time, it will then be translated back to ruby for use at compile time. Currently there
-      are two versions of the code, in ruby and soml, being hand synchronized. More about parfait below.
-    </p>
-    <p>
-      Since working at this low machine level (essentially assembler) is not easy to follow for
-      everyone, an interpreter was created. Later a graphical interface, a kind of
-      <a href="https://github.com/salama/salama-debugger"> visual debugger </a> was added.
-      Visualizing the control flow and being able to see values updated immediately helped
-      tremendously in creating this layer. And the interpreter helps in testing, ie keeping it
-      working in the face of developer change.
-    </p>
-</div>
-
-<div class="row span10">
-    <h5>Soml, Salama object machine language</h5>
-    <p>
-      Soml is probably the largest single part of the system and much more information can be found
-      <a href="/typed/typed.html"> here </a>.
-      <br/>
-      Before soml, a more traditional virtual machine approach was taken and abandoned. The language
-      is easy to understand and provides a good abstraction, both in terms of object orientation,
-      and in terms of how this is expressed in the register model. <br/>
-      It is like ruby without the dynamic aspects, but typed. <br/>
-      In broad strokes it consists of:
-      <ul>
-        <li> <b> Parser:</b> Currently a peg parser, though a hand coded one is planned.
-                  The result of which is an AST</li>
-        <li> <b> Compiler:</b>  compiles the ast into a sequence of Register instructions
-                  and runtime objects (classes, methods etc)</li>
-        <li> <b> Parfait: </b> Is the runtime, ie the minimal set of objects needed to
-                  create a binary with the required information to be dynamic</li>
-        <li> <b> Builtin: </b>  A very small set of primitives that are impossible to express
-                  in soml (remembering that parfait will be expressed in soml eventually)</li>
-      </ul>
-    </p>
-    <p>
-    Just to summarize a few of soml features that are maybe unusual:
-    <ul>
-      <li> <b> Message based calling:</b> Calling is completely object oriented (not stack based)
-              and uses Message and Frame objects.</li>
-      <li> <b> Return addresses:</b>  A soml method call may return to several addresses, according
-              to type, and in case of exception</li>
-      <li> <b> Overloaded arguments </b> A method is defined by name, but may have several
-                  implementations for different types of the arguments (statically matched)</li>
-    </ul>
-  </p>
-</div>
-
-<div class="row span10">
-    <h5>Salama</h5>
-    <p>
-      To compile and run ruby, we need to parse and compile ruby code. To compile ruby to soml a clear
-      mapping has to be achieved. Particularly the dynamic aspects, and typing need to be addressed.
-      <br/>
-      While parsing ruby is quite a difficult task, it has already been implemented in pure ruby
-      <a href="https://github.com/whitequark/parser"> here </a>. The output of the parser is again
-      an ast, which needs to be compiled to soml. <br/>
-      The dynamic aspects of ruby are actually relatively easy to handle, once the whole system is
-      in place, because the whole system is written in ruby without external dependencies.
-      Since (when finished) it can compile ruby, it can do so to produce a binary. This binary can
-      then contain the whole of the system, and so the resulting binary will be able to produce
-      binary code when it runs. With small changes to the linking process (easy in ruby!) it can
-      then extend itself.
-    </p>
-    <p>
-      The type aspect is more tricky: Ruby is not typed, but soml is. And if everything
-      were objects (as we like to pretend in ruby) we could just do a lot of dynamic checking,
-      possibly later introduce some caching. But not everything is an object: at a minimum integers
-      are not, and maybe floats and other values are not either. The distinction between what is an integer
-      and what is an object has sprouted an elaborate type system, which is (by necessity) present in
-      soml (see there).
-    </p>
-    <p>
-      The idea (because it hasn't been implemented yet) is to have different functions for different
-      types. The soml layer defines the Type class and BasicTypes and also lets us return to different
-      places from a function (in effect a soml function call is like an if). By using this, we can
-      compile a single ruby method into several soml functions. Each such function is typed, ie all
-      arguments and variables are of known type. Using these types we can call functions by
-      their signatures. We can also autogenerate error methods for unhandled types, and predict
-      that only a fraction of the possible combinations will actually be needed.
-    </p>
-</div>
@@ -0,0 +1,132 @@
+---
+layout: salama
+title: Salama architectural layers
+---
+
+## Main Layers
+
+To implement an object system to execute object oriented languages takes a large system.
+The parts or abstraction layers are detailed below.
+
+It is important to understand the approach first though, as it differs from the normal
+interpretation. The idea is to **compile** ruby. It may be easiest to compare to a static
+object oriented language like c++. When c++ was created c++ code was translated into c, which
+then gets translated into assembler, which gets translated to binary code, which is linked
+and executed. Compiling to binaries is what gives these languages speed, and is the reason
+to compile ruby.
+
+In a similar way to the c++ example, we need a level between ruby and assembler, as it is too
+big a mental step from ruby to assembler. Of course one could try to compile to c, but
+since c is not object oriented that would mean dealing with all of c's non oo heritage, like
+linking model, memory model, calling convention etc.
+
+Top down the layers are:
+
+- **Melon**, which compiles ruby code into the typed layer and includes bootstrapping code
+
+- **Typed intermediate layer:** Statically typed object oriented with object oriented
+call semantics.
+
+- **Risc register machine abstraction** provides a level of machine abstraction, but
+              as the name says, quite a simple one.
+
+- **Binary and cpu specific assembler**  This includes arm assembly and elf support
+          to produce a binary that can then read in ruby programs
+
+### Melon
+
+To compile and run ruby, we need to parse and compile ruby code. While parsing ruby is quite
+a difficult task, it has already been implemented in pure ruby
+[here](https://github.com/whitequark/parser). The output of the parser is again
+an ast, which needs to be compiled to the typed layer.
+
+The dynamic aspects of ruby are actually relatively easy to handle, once the whole system is
+in place, because the whole system is written in ruby without external dependencies.
+Since (when finished) it can compile ruby, it can do so to produce a binary. This binary can
+then contain the whole of the system, and so the resulting binary will be able to produce
+binary code when it runs. With small changes to the linking process (easy in ruby!) it can
+then extend itself.
+
+The type aspect is more tricky: Ruby is not typed, but the typed layer is. And
+if everything were objects (as we like to pretend in ruby) we could just do a lot of
+dynamic checking, possibly later introduce some caching. But not everything is an object:
+at a minimum integers are not, and maybe floats and other values are not either.
+The distinction between what is an integer and what is an object has sprouted an elaborate
+type system, which is (by necessity) present in the typed layer.
+
+
+
+### Typed intermediate layer
+
+The Typed intermediate layer is more fully described [here](/typed/typed.html).
+
+In broad strokes it consists of:
+
+- **MethodCompiler:**  compiles the ast into a sequence of Register instructions
+                        and runtime objects (classes, methods etc)
+- **Parfait:** Is the runtime, ie the minimal set of objects needed to
+                  create a binary with the required information to be dynamic
+- **Builtin:**  A very small set of primitives that are impossible to express in ruby
+
+The idea is to have different methods for different types, but implementing the same ruby
+logic. In contrast to the usual 1-1 relationship between a ruby method and its binary
+definition, there is a 1-n relationship.
+
+The typed layer defines the Type class and BasicTypes and also lets us return to different
+places from a function. By using this, we can
+compile a single ruby method into several typed functions. Each such function is typed, ie all
+arguments and variables are of known type. Using these types we can call functions by
+their signatures. We can also autogenerate error methods for unhandled types, and predict
+that only a fraction of the possible combinations will actually be needed.
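As a rough illustration of the 1-n idea (a sketch only, not code from this commit): one ruby method is represented by several typed variants, keyed by the types of its arguments, and an unhandled combination ends in an error path. `TypedVariants`, `add_variant` and `call` are hypothetical names, and the lookup here happens at run time on ruby classes purely for demonstration.

```ruby
# Hypothetical sketch: one ruby-level method, several typed implementations.
class TypedVariants
  def initialize(name)
    @name = name
    @variants = {} # argument type signature => implementation
  end

  # Register an implementation for one concrete type signature.
  def add_variant(*types, &body)
    @variants[types] = body
    self
  end

  # Pick the implementation matching the argument types.
  # Unhandled signatures end in an error (raised here, generated in the text).
  def call(*args)
    signature = args.map(&:class)
    body = @variants.fetch(signature) do
      raise TypeError, "no variant of #{@name} for #{signature.inspect}"
    end
    body.call(*args)
  end
end

plus = TypedVariants.new(:plus)
plus.add_variant(Integer, Integer) { |a, b| a + b }
plus.add_variant(String, String)   { |a, b| a + b }
plus.call(1, 2)
```

Only the signatures that are actually registered exist, which mirrors the expectation that a fraction of all type combinations is needed.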
+
+
+Just to summarize a few typed layer features that may be unusual:
+
+- **Message based calling:** Calling is completely object oriented (not stack based)
+                              and uses Message and Frame objects.
+- **Return addresses:**  A method call may return to several addresses, according
+                          to type, and in case of exception
+- **Cross method jumps:** When a type switch is detected, a method may jump into the middle
+                            of another method.
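A minimal sketch of the first two points (not code from this commit): the call is carried by a message object rather than a stack frame, and the message holds several return addresses, one of which is chosen by the type of the result or by an exception. The `Message` fields and `send_message` are hypothetical, and lambdas stand in for code addresses.

```ruby
# Hypothetical sketch of message based calling with several return addresses.
Message = Struct.new(:receiver, :name, :arguments, :returns)

# Performs the call, then continues at the return address matching the outcome.
def send_message(message)
  result = message.receiver.public_send(message.name, *message.arguments)
  key = message.returns.key?(result.class) ? result.class : :default
  message.returns.fetch(key).call(result)
rescue ZeroDivisionError => error
  message.returns.fetch(:exception).call(error)
end

returns = {
  Integer    => ->(value) { "integer path: #{value}" },
  :default   => ->(value) { "object path: #{value.inspect}" },
  :exception => ->(error) { "exception path: #{error.class}" }
}

send_message(Message.new(10, :/, [2], returns))
```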
+
+
+### Register Machine
+
+The Register machine layer is a relatively close abstraction of risc hardware, but without the
+quirks.
+
+The register machine has registers, indexed addressing, operators, branches and everything
+needed for the next layer. It does not try to abstract every possible machine feature
+(as llvm does), but rather "objectifies" the risc view to provide what is needed for the typed
+layer, the next layer up.
+
+The machine has its own (abstract) instruction set, and the mapping to arm is quite
+straightforward. Since the instruction set is implemented as derived classes, additional
+instructions may be defined and used later, as long as translation is provided for them too.
+In other words the instruction set is extensible (unlike cpu instruction sets).
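The extensibility can be pictured roughly as follows (a sketch, not code from this commit): instructions are classes, and a new one becomes usable once a translation is registered for it. All names in the `Sketch` module are hypothetical, and plain strings stand in for target instructions.

```ruby
# Hypothetical sketch: abstract instructions as classes, one translation per class.
module Sketch
  class Instruction; end

  class LoadConstant < Instruction
    attr_reader :register, :value

    def initialize(register, value)
      @register = register
      @value = value
    end
  end

  class Branch < Instruction
    attr_reader :label

    def initialize(label)
      @label = label
    end
  end

  # Maps abstract instruction classes to target code.
  class Translator
    def initialize
      @rules = {}
    end

    # Register the translation for one instruction class.
    def define(instruction_class, &rule)
      @rules[instruction_class] = rule
    end

    # Instructions without a registered translation are rejected.
    def translate(instruction)
      rule = @rules.fetch(instruction.class) do
        raise NotImplementedError, "no translation for #{instruction.class}"
      end
      rule.call(instruction)
    end
  end
end

translator = Sketch::Translator.new
translator.define(Sketch::LoadConstant) { |i| "mov #{i.register}, ##{i.value}" }
translator.define(Sketch::Branch) { |i| "b #{i.label}" }
translator.translate(Sketch::LoadConstant.new(:r1, 5))
```

Adding an instruction later means adding a subclass and one `define` call, without touching the existing ones.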
+
+Basic object oriented concepts are needed already at this level, to be able to generate a whole
+self contained system. Ie what an object is, a class, a method etc. This minimal runtime is called
+parfait, and the same objects will be used at runtime and compile time.
+
+Since working at this low machine level (essentially assembler) is not easy to follow for
+everyone, an interpreter was created. Later a graphical interface, a kind of
+[visual debugger](https://github.com/salama/salama-debugger) was added.
+Visualizing the control flow and being able to see values updated immediately helped
+tremendously in creating this layer. And the interpreter helps in testing, ie keeping it
+working in the face of developer change.
+
+
+### Binary, Arm and Elf
+
+A physical machine will run binaries containing instructions that the cpu understands, in a
+format the operating system understands (elf). Arm and elf subdirectories hold the code for
+these layers.
+
+Arm is a risc architecture, but, as anyone who knows it will attest, one with its own quirks.
+For example any instruction may be executed conditionally in arm. Or there is no 32bit
+register load instruction. It is possible to create very dense code using all the arm
+special features, but this is not implemented yet.
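One reading of the missing 32bit load (an assumption, as the text does not spell it out): an arm instruction is itself 32 bits wide, so it cannot carry an arbitrary 32bit constant. Data processing instructions instead encode an 8bit value rotated right by an even number of bits. The sketch below, with the hypothetical helper `arm_immediate?`, checks whether a constant fits that encoding.

```ruby
# Hypothetical helper: can `value` be encoded as an arm data processing immediate,
# ie as an 8bit value rotated right by an even number of bits (0, 2, ... 30)?
def arm_immediate?(value)
  value &= 0xFFFFFFFF
  (0..30).step(2).any? do |rotation|
    # Rotating left undoes the encoded rotate right.
    rotated = ((value << rotation) | (value >> (32 - rotation))) & 0xFFFFFFFF
    rotated < 256
  end
end

arm_immediate?(0xFF)       # fits without rotation
arm_immediate?(0xFF000000) # 0xFF rotated right by 8
arm_immediate?(0x12345678) # does not fit a single immediate
```

Constants that do not fit have to be built from several instructions or loaded from memory.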
+
+All Arm instructions are (ie derive from) Register instructions, and there is an ArmTranslator
+that translates RegisterInstructions to ArmInstructions.