small fixes

Torsten Ruger 2016-03-07 17:37:24 +02:00
parent 48e4c8a27d
commit de5f4c85cb
2 changed files with 45 additions and 40 deletions

@@ -9,28 +9,30 @@ title: Salama object machine language
Soml is a language that is designed to be compiled into, rather than written, like
other languages. It is the base for a higher system,
designed for the needs to compile ruby. It is not an endeavour to abstract from a
lower level, like other system languages, namely off course c.<br/>
lower level, like other system languages, namely of course c.
Still it is a system language, or an object machine language, so almost as low level a
language as possible. Only assembler is really lower, and it could be argued that assembler
is not really a language, rather a data format for expressing binary code. <br/>
is not really a language, rather a data format for expressing binary code.
##### Object oriented to the core, including calling convention
Soml is completely object oriented and strongly typed. For types, the classes are used, but
the main distinction is between object (references) and integers. This is off course
essential as dereferencing integers is what we want to avoid.
Soml is completely object oriented and strongly typed. Types are modelled as classes and carry
information about instance variable names and their basic type. *Every* object stores a reference
to its type, and while types are immutable, the reference may change. The basic types every
object is made up of include at least integer and reference (pointer).
The object model, ie the basic properties of objects that the system relies on, is quite simple
and explained in the runtime section. It involves a single reference per object. <br/>
Also the object memory
model is kept quite simple in that objects are always small multiples of the cache size of the
hardware machine. We use object encapsulation to build up larger looking objects from these
basic blocks.
and explained in the runtime section. It involves a single reference per object.
Also the object memory model is kept quite simple in that objects are always small multiples
of the cache size of the hardware machine.
We use object encapsulation to build up larger looking objects from these basic blocks.
The calling convention is also object oriented, not stack based*. Message objects are used to
define the data needed for invocation. They carry arguments, a frame and return addresses.
In Soml return addresses are pre-calculated and determined by the caller, and yes, there
are several. In fact there is one return address per masic type, plus one for exception.
are several. In fact there is one return address per basic type, plus one for exception.
A method invocation may thus be made to return to an entirely different location than the
caller.
\*(A stack, as used in c, is not typed and as such a source of problems)
@@ -41,22 +43,23 @@ classes that can be accessed by writing the class name in soml source.
##### Syntax and runtime
Soml syntax is a mix between ruby and c. It is like ruby in the sense that semicolons and even
newlines are not neccessary unless they are. It still uses braces, but that will probably
be changed. <br/>
newlines are not necessary unless they are. Soml still uses braces, but that will probably
be changed.
But of course it is typed, so in argument or variable definitions the type must be specified
like in c. Types are classes, but int may be used for brevity instead of Integer. Return
types are also declared, though more for statci analysis. As mentioned any function may return
to differernt addresses according to type. The compiler automatically inserts erros for
return typesa that are not handled by the caller. <br/>
The complete syntax and their translation is discussed <a href="syntax.html"> here </a>
like in c. Type names are the class names they represent, but the "int" may be used for brevity
instead of Integer. Return types are also declared, though more for static analysis. As mentioned a
function may return to different addresses according to type. The compiler automatically inserts
errors for return types that are not handled by the caller.
The complete syntax and its translation are discussed [here](syntax.html)
As soml is the base for dynamic languages, all compile information is recorded in the runtime.
All inforamtion is off course object oriented, ie in the form off objects. This means a class
hierachy and this itself is off course part of the runtime. The runtime, Parfait, is kept
to a minnimum, currently around 15 classes, described in detail <a href="parfait.html">
here </a>. <br/>
All information is of course object oriented, ie in the form of objects. This means a class
hierarchy, and this itself is of course part of the runtime. The runtime, Parfait, is kept
to a minimum, currently around 15 classes, described in detail [here](parfait.html).
Historically Parfait has been coded in ruby, as it was first needed in the compiler.
This had the additional benefit of providing solid test cases for the functionality.
Currently the process is to recode the same functionality in soml, and by the end of that
a converter will be written. This will convert the soml code into ruby code, thus removing the
duplication.
Currently the process is to convert the code into soml, using the same compiler used to compile
ruby.

@@ -14,17 +14,17 @@ The top level declarations in a file may only be class definitions
end
end
The class hierarchy is explained in [here](./parfait.html), but you can leave out the superclass
The class hierarchy is explained in [here](parfait.html), but you can leave out the superclass
and Object will be assumed.
Methods must be typed, both arguments and return. Generally class names serve as types, but int can
Methods must be typed, both arguments and return. Generally class names serve as types, but "int" can
be used as a shortcut for Integer.
Code may not be outside method definitions, like in ruby. A compiled program starts at the builtin
method __init__, that does the inital setup, an then jumps to Object.main
method __init__, that does the initial setup, and then jumps to **Space.main**
Classes are represented by class objects and methods my Method objects, so all information is available
at runtime.
Classes are represented by class objects (instances of class Class to be precise) and methods by
Method objects, so all information is available at runtime.
#### Expressions
@@ -33,6 +33,8 @@ action. Both are compiled to Register level instructions for the current method.
expressions store their value in a register and statements store those values elsewhere, possibly
after operating on them.
The subsections below correspond roughly to the parser's rule names.
**Basic expressions** are numbers (integer or float), strings or names, either variable, argument,
field or class names. (normal details applicable). Special names include self (the current
receiver), and message (the currently executed method frame). These all resolve to a register
@@ -82,9 +84,9 @@ This lets the programmer express more precisely what is tested, and also opens a
framework for more tests than available in other languages. Specifically overflow may be tested in
soml, without dropping down to assembler.
And **if statement** is started with the keyword if_ and then contains the branch type. The branch
type may be plus, minus, zero, nonzero or overflow. The condition must be in brackets and be any
expression. If may be continued with en else, but doesn't have to be, and is ended with end
An **if statement** is started with the keyword if_ and then contains the branch type. The branch
type may be *plus, minus, zero, nonzero or overflow*. The condition must be in brackets and can be
any expression. *If* may be continued with an *else*, but doesn't have to be, and is ended with *end*
if_zero(a - 5)
....
@@ -114,14 +116,14 @@ field, must be in class (not method) scope and may not be assigned to.
...
end
A **local variable definition** declares and possibly assign to a local variable. Local variables
are store in frame objects and the are last in search order. When resolving a name, the compiler
checks argument names first, and then local variables.
A **local variable definition** declares, and possibly assigns to, a local variable. Local variables
are stored in frame objects, in fact they are instance variables of the current frame object.
When resolving a name, the compiler checks argument names first, and then local variables.
int counter = 0
Any of the expression may be assigned to the variable at the time of definition. After a variable is
defined it may be assigned to with an **assignemnt statement** any number of times. The assignment
Any of the expressions may be assigned to the variable at the time of definition. After a variable is
defined it may be assigned to with an **assignment statement** any number of times. The assignment
is like an assignment during definition, without the leading type.
counter = 0
@@ -130,7 +132,7 @@ Any of the expressions, basic, call, operator, field access, may be assigned.
### Code generation and scope
Compiling generates two results simultaneously. The more obvious code for a function, but also an
Compiling generates two results simultaneously. The more obvious is code for a function, but also an
object structure of classes etc that capture the declarations. To understand the code part better
the register abstraction should be studied, and to understand the object structure the runtime.
@@ -142,5 +144,5 @@ be consistent at the end of the statement. Since there is only only object memor
concerns all assignments, since all variables are either named or indexed members of objects.
Also local variables are just members of the frame.
This obviously does leave room for optimisation as preliminary benchmarks show. But benchmarks also
This obviously does leave room for optimisations as preliminary benchmarks show. But benchmarks also
show that it is not such a big issue and much more benefit can be achieved by inlining.