small fixes
parent 48e4c8a27d
commit de5f4c85cb
53
soml/soml.md
@@ -9,28 +9,30 @@ title: Salama object machine language
|
|||||||
Soml is a language that is designed to be compiled into, rather than written, like
|
Soml is a language that is designed to be compiled into, rather than written, like
|
||||||
other languages. It is the base for a higher system,
|
other languages. It is the base for a higher system,
|
||||||
designed for the needs to compile ruby. It is not an endeavour to abstract from a
|
designed for the needs to compile ruby. It is not an endeavour to abstract from a
|
||||||
lower level, like other system languages, namely off course c.<br/>
|
lower level, like other system languages, namely off course c.
|
||||||
|
|
||||||
Still it is a system language, or an object machine language, so almost as low level a
|
Still it is a system language, or an object machine language, so almost as low level a
|
||||||
language as possible. Only assembler is really lower, and it could be argued that assembler
|
language as possible. Only assembler is really lower, and it could be argued that assembler
|
||||||
is not really a language, rather a data format for expressing binary code. <br/>
|
is not really a language, rather a data format for expressing binary code.
|
||||||
|
|
||||||
|
|
||||||
##### Object oriented to the core, including calling convention
|
##### Object oriented to the core, including calling convention
|
||||||
|
|
||||||
Soml is completely object oriented and strongly typed. For types, the classes are used, but
|
Soml is completely object oriented and strongly typed. Types are modelled as classes and carry
|
||||||
the main distinction is between object (references) and integers. This is off course
|
information about instance variable names and their basic type. *Every* object stores a reference
|
||||||
essential as dereferencing integers is what we want to avoid.
|
to it's types, and while types are immutable, the reference may change. The basic types every
|
||||||
|
object is made up off, include at least integer and reference (pointer).
|
||||||
|
|
||||||
The object model, ie the basic properties of objects that the system relies on, is quite simple
|
The object model, ie the basic properties of objects that the system relies on, is quite simple
|
||||||
and explained in the runtime section. It involves a single reference per object. <br/>
|
and explained in the runtime section. It involves a single reference per object.
|
||||||
Also the object memory
|
Also the object memory model is kept quite simple in that objects are always small multiples
|
||||||
model is kept quite simple in that objects are always small multiples of the cache size of the
|
of the cache size of the hardware machine.
|
||||||
hardware machine. We use object encapsulation to build up larger looking objects from these
|
We use object encapsulation to build up larger looking objects from these basic blocks.
|
||||||
basic blocks.
|
|
||||||
|
|
||||||
The calling convention is also object oriented, not stack based*. Message objects used to
|
The calling convention is also object oriented, not stack based*. Message objects used to
|
||||||
define the data needed for invocation. They carry arguments, a frame and return addresses.
|
define the data needed for invocation. They carry arguments, a frame and return addresses.
|
||||||
In Soml return addresses are pre-calculated and determined by the caller, and yes, there
|
In Soml return addresses are pre-calculated and determined by the caller, and yes, there
|
||||||
are several. In fact there is one return address per masic type, plus one for exception.
|
are several. In fact there is one return address per basic type, plus one for exception.
|
||||||
A method invocation may thus be made to return to an entirely different location than the
|
A method invocation may thus be made to return to an entirely different location than the
|
||||||
caller.
|
caller.
|
||||||
\*(A stack, as used in c, is not typed and as such a source of problems)
|
\*(A stack, as used in c, is not typed and as such a source of problems)
|
||||||
@@ -41,22 +43,23 @@ classes that can be accessed by writing the class name in soml source.
|
|||||||
##### Syntax and runtime
|
##### Syntax and runtime
|
||||||
|
|
||||||
Soml syntax is a mix between ruby and c. I is like ruby in the sense that semicolons and even
|
Soml syntax is a mix between ruby and c. I is like ruby in the sense that semicolons and even
|
||||||
newlines are not neccessary unless they are. It still uses braces, but that will probably
|
newlines are not neccessary unless they are. Soml still uses braces, but that will probably
|
||||||
be changed. <br/>
|
be changed.
|
||||||
|
|
||||||
But off course it is typed, so in argument or variable definitions the type must be specified
|
But off course it is typed, so in argument or variable definitions the type must be specified
|
||||||
like in c. Types are classes, but int may be used for brevity instead of Integer. Return
|
like in c. Type names are the class names they represent, but the "int" may be used for brevity
|
||||||
types are also declared, though more for statci analysis. As mentioned any function may return
|
instead of Integer. Return types are also declared, though more for static analysis. As mentioned a
|
||||||
to differernt addresses according to type. The compiler automatically inserts erros for
|
function may return to different addresses according to type. The compiler automatically inserts
|
||||||
return typesa that are not handled by the caller. <br/>
|
errors for return types that are not handled by the caller.
|
||||||
The complete syntax and their translation is discussed <a href="syntax.html"> here </a>
|
The complete syntax and their translation is discussed [here](syntax.html)
|
||||||
|
|
||||||
As soml is the base for dynamic languages, all compile information is recorded in the runtime.
|
As soml is the base for dynamic languages, all compile information is recorded in the runtime.
|
||||||
All inforamtion is off course object oriented, ie in the form off objects. This means a class
|
All information is off course object oriented, ie in the form off objects. This means a class
|
||||||
hierachy and this itself is off course part of the runtime. The runtime, Parfait, is kept
|
hierarchy, and this itself is off course part of the runtime. The runtime, Parfait, is kept
|
||||||
to a minnimum, currently around 15 classes, described in detail <a href="parfait.html">
|
to a minimum, currently around 15 classes, described in detail [here](parfait.html).
|
||||||
here </a>. <br/>
|
|
||||||
|
|
||||||
Historically Parfait has been coded in ruby, as it was first needed in the compiler.
|
Historically Parfait has been coded in ruby, as it was first needed in the compiler.
|
||||||
This had the additional benefit of providing solid test cases for the functionality.
|
This had the additional benefit of providing solid test cases for the functionality.
|
||||||
Currently the process is to recode the same functionality in soml, and by the end of that
|
Currently the process is to convert the code into soml, using the same compiler used to compile
|
||||||
a converter will be written. This will convert the soml code into ruby code, thus removing the
|
ruby.
|
||||||
duplication.
|
|
||||||
|
@@ -14,17 +14,17 @@ The top level declarations in a file may only be class definitions
|
|||||||
end
|
end
|
||||||
end
|
end
|
||||||
|
|
||||||
The class hierarchy is explained in [here](./parfait.html), but you can leave out the superclass
|
The class hierarchy is explained in [here](parfait.html), but you can leave out the superclass
|
||||||
and Object will be assumed.
|
and Object will be assumed.
|
||||||
|
|
||||||
Methods must be typed, both arguments and return. Generally class names serve as types, but int can
|
Methods must be typed, both arguments and return. Generally class names serve as types, but "int" can
|
||||||
be used as a shortcut for Integer.
|
be used as a shortcut for Integer.
|
||||||
|
|
||||||
Code may not be outside method definitions, like in ruby. A compiled program starts at the builtin
|
Code may not be outside method definitions, like in ruby. A compiled program starts at the builtin
|
||||||
method __init__, that does the inital setup, an then jumps to Object.main
|
method __init__, that does the initial setup, an then jumps to **Space.main**
|
||||||
|
|
||||||
Classes are represented by class objects and methods my Method objects, so all information is available
|
Classes are represented by class objects (instances of class Class to be precise) and methods by
|
||||||
at runtime.
|
Method objects, so all information is available at runtime.
|
||||||
|
|
||||||
#### Expressions
|
#### Expressions
|
||||||
|
|
||||||
@@ -33,6 +33,8 @@ action. Both are compiled to Register level instructions for the current method.
|
|||||||
expressions store their value in a register and statements store those values elsewhere, possibly
|
expressions store their value in a register and statements store those values elsewhere, possibly
|
||||||
after operating on them.
|
after operating on them.
|
||||||
|
|
||||||
|
The subsections below correspond roughly to the parsers rule names.
|
||||||
|
|
||||||
**Basic expressions** are numbers (integer or float), strings or names, either variable, argument,
|
**Basic expressions** are numbers (integer or float), strings or names, either variable, argument,
|
||||||
field or class names. (normal details applicable). Special names include self (the current
|
field or class names. (normal details applicable). Special names include self (the current
|
||||||
receiver), and message (the currently executed method frame). These all resolve to a register
|
receiver), and message (the currently executed method frame). These all resolve to a register
|
||||||
@@ -82,9 +84,9 @@ This lets the programmer express more precisely what is tested, and also opens a
|
|||||||
framework for more tests than available in other languages. Specifically overflow may be tested in
|
framework for more tests than available in other languages. Specifically overflow may be tested in
|
||||||
soml, without dropping down to assembler.
|
soml, without dropping down to assembler.
|
||||||
|
|
||||||
And **if statement** is started with the keyword if_ and then contains the branch type. The branch
|
An **if statement** is started with the keyword if_ and then contains the branch type. The branch
|
||||||
type may be plus, minus, zero, nonzero or overflow. The condition must be in brackets and be any
|
type may be *plus, minus, zero, nonzero or overflow*. The condition must be in brackets and can be
|
||||||
expression. If may be continued with en else, but doesn't have to be, and is ended with end
|
any expression. *If* may be continued with en *else*, but doesn't have to be, and is ended with *end*
|
||||||
|
|
||||||
if_zero(a - 5)
|
if_zero(a - 5)
|
||||||
....
|
....
|
||||||
@@ -114,14 +116,14 @@ field, must be in class (not method) scope and may not be assigned to.
|
|||||||
...
|
...
|
||||||
end
|
end
|
||||||
|
|
||||||
A **local variable definition** declares and possibly assign to a local variable. Local variables
|
A **local variable definition** declares, and possibly assigns to, a local variable. Local variables
|
||||||
are store in frame objects and the are last in search order. When resolving a name, the compiler
|
are stored in frame objects, in fact they are instance variables of the current frame object.
|
||||||
checks argument names first, and then local variables.
|
When resolving a name, the compiler checks argument names first, and then local variables.
|
||||||
|
|
||||||
int counter = 0
|
int counter = 0
|
||||||
|
|
||||||
Any of the expression may be assigned to the variable at the time of definition. After a variable is
|
Any of the expressions may be assigned to the variable at the time of definition. After a variable is
|
||||||
defined it may be assigned to with an **assignemnt statement** any number of times. The assignment
|
defined it may be assigned to with an **assignment statement** any number of times. The assignment
|
||||||
is like an assignment during definition, without the leading type.
|
is like an assignment during definition, without the leading type.
|
||||||
|
|
||||||
counter = 0
|
counter = 0
|
||||||
@@ -130,7 +132,7 @@ Any of the expressions, basic, call, operator, field access, may be assigned.
|
|||||||
|
|
||||||
### Code generation and scope
|
### Code generation and scope
|
||||||
|
|
||||||
Compiling generates two results simultaneously. The more obvious code for a function, but also an
|
Compiling generates two results simultaneously. The more obvious is code for a function, but also an
|
||||||
object structure of classes etc that capture the declarations. To understand the code part better
|
object structure of classes etc that capture the declarations. To understand the code part better
|
||||||
the register abstraction should be studied, and to understand the object structure the runtime.
|
the register abstraction should be studied, and to understand the object structure the runtime.
|
||||||
|
|
||||||
@@ -142,5 +144,5 @@ be consistent at the end of the statement. Since there is only only object memor
|
|||||||
concerns all assignments, since all variables are either named or indexed members of objects.
|
concerns all assignments, since all variables are either named or indexed members of objects.
|
||||||
Also local variables are just members of the frame.
|
Also local variables are just members of the frame.
|
||||||
|
|
||||||
This obviously does leave room for optimisation as preliminary benchmarks show. But benchmarks also
|
This obviously does leave room for optimisations as preliminary benchmarks show. But benchmarks also
|
||||||
show that it is not such a bit issue and much more benefit can be achieved by inlining.
|
show that it is not such a bit issue and much more benefit can be achieved by inlining.
|
||||||
|