add some of the missing sections

also benchmarks
This commit is contained in:
Torsten Ruger 2015-11-23 19:51:52 +02:00
parent d855f20d52
commit 96e9194df4
4 changed files with 254 additions and 0 deletions


@@ -17,6 +17,7 @@ layout: site
<li><a href="/soml/soml.html"> Soml </a> </li>
<li><a href="/soml/syntax.html"> Syntax </a> </li>
<li><a href="/soml/parfait.html"> Parfait </a> </li>
<li><a href="/soml/benchmarks.html"> Performance </a> </li>
</ul>
</div>
</div>

58
soml/benchmarks.md Normal file

@@ -0,0 +1,58 @@
---
layout: soml
title: Simple soml performance numbers
---
These benchmarks were made to identify places for optimization. Even this early on it is clear that
performance is not outstanding, but there were still some surprises.

- loop - program runs an empty loop of the same size as hello
- hello - output hello world (to /dev/null) to measure kernel calls (not terminal speed)
- itos - convert integers from 1 to 100000 to string
- add - run integer adds by linear fibonacci of 40
- call - exercise calling by recursive fibonacci of 20

Hello, itos and add run 100_000 iterations per program invocation to remove startup overhead.
Call runs only 10000 iterations, as it is much slower, executing about 10000 calls per invocation.
The c programs were compiled with gcc on the test machine. The soml executables were produced by
ruby (on another machine).
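
The benchmark sources are not reproduced here, so the following is only a sketch of what the
add and call kernels might look like on the c side. The function names fibo_linear and
fibo_recursive are assumptions made for illustration, and the iteration loop around them is
left out.

```c
#include <stdint.h>

/* add benchmark (assumed shape): linear fibonacci, the loop body is
 * nothing but integer adds and moves. */
uint32_t fibo_linear(uint32_t n)
{
    uint32_t a = 0, b = 1;
    for (uint32_t i = 0; i < n; i++) {
        uint32_t next = a + b;
        a = b;
        b = next;
    }
    return a; /* after n steps a holds fibonacci(n) */
}

/* call benchmark (assumed shape): recursive fibonacci, dominated by
 * the cost of calling and returning. */
uint32_t fibo_recursive(uint32_t n)
{
    if (n < 2)
        return n;
    return fibo_recursive(n - 1) + fibo_recursive(n - 2);
}
```

With these definitions fibo_linear(40) and fibo_recursive(20) correspond to the add and call
workloads listed above.
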
### Results
Results were measured by a ruby script. Mean and variance were measured until the variance was low,
always under one percent.
The machine was a virtual arm run on a powerbook, performance roughly equivalent to a raspberry pi.
But results should be seen as relative, not absolute.

| language | loop | hello | itos | add | call | slower by | loop | hello | itos | add | call |
|----------|------|-------|------|-----|------|-----------|------|-------|------|-----|------|
| c | 0.0500 | 2.1365 | 0.2902 | 0.1245 | 0.8535 | | + 33 % | + 79 % | | | |
| soml | 0.0374 | 1.2071 | 0.7263 | 0.2247 | 1.3625 | | | | + 150 % | + 80 % | + 60 % |

### Discussion
Surprisingly there are areas where soml code runs faster than c. Especially in the hello example this
may not mean too much. Printf does caching and has a lot of functionality, so it may not be a straight
comparison. The loop example is surprising and needs to be examined.
The add example is slower because of the different memory model and lack of optimisation for soml.
Every result of an arithmetic operation is immediately written to memory in soml, whereas c will
keep things in registers as long as it can, which in the example is the whole time. This can
be improved upon with register code optimisation, which can cut loads after writes and writes that
are overwritten before calls or jumps are made.
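
The difference between the two memory models can be imitated in c itself. The sketch below is
an illustration only, not soml's generated code: the volatile qualifier forces every read and
write of a and b to go through memory, roughly like a model that stores each arithmetic result
immediately. Both function names are made up for this example.

```c
#include <stdint.h>

/* Register style: the compiler is free to keep a and b in registers
 * for the whole loop. */
uint32_t add_in_registers(uint32_t n)
{
    uint32_t a = 0, b = 1;
    for (uint32_t i = 0; i < n; i++) {
        uint32_t next = a + b;
        a = b;
        b = next;
    }
    return a;
}

/* Memory style: volatile makes every access to a and b a real load or
 * store, so each result is written out as soon as it is computed. */
uint32_t add_through_memory(uint32_t n)
{
    volatile uint32_t a = 0, b = 1;
    for (uint32_t i = 0; i < n; i++) {
        uint32_t next = a + b;
        a = b;
        b = next;
    }
    return a;
}
```

Both functions return the same value. Only the number of loads and stores in the loop
differs, which is the kind of overhead the register level optimisation mentioned above would
remove.
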
The call overhead was expected to be larger, as a typed model is used and runtime information (like
the method name) is made available. It is actually a small price to pay for the ability to generate
code at runtime and will of course reduce drastically with inlining.
The itos example was also to be expected as it relies both on calling and on arithmetic. Also itos
relies heavily on division by 10, which when coded in cpu specific assembler may easily be sped up
by a factor of 2-3.
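
As a rough illustration of why division by 10 leaves room for tuning, here is a c sketch that
replaces the divide with a multiply and a shift. This is a well known trick for 32 bit unsigned
values and is not taken from the soml sources. The names div10 and itos, and the buffer
convention, are assumptions for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* n / 10 without a divide instruction: multiply by ceil(2^35 / 10),
 * which is 0xCCCCCCCD, and shift right by 35. The result is exact for
 * every 32 bit unsigned input. */
uint32_t div10(uint32_t n)
{
    return (uint32_t)(((uint64_t)n * 0xCCCCCCCDu) >> 35);
}

/* Write the decimal digits of n into buf (at least 11 bytes) and
 * return the number of digits written. */
size_t itos(uint32_t n, char *buf)
{
    char tmp[10]; /* a 32 bit value has at most 10 decimal digits */
    size_t len = 0;
    do {
        uint32_t q = div10(n);
        tmp[len++] = (char)('0' + (n - q * 10)); /* remainder digit */
        n = q;
    } while (n != 0);
    for (size_t i = 0; i < len; i++) /* digits were produced in reverse */
        buf[i] = tmp[len - 1 - i];
    buf[len] = '\0';
    return len;
}
```
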
All in all the results are encouraging as no optimization efforts have been made. Of course the
most encouraging fact is that the system works and thus may be used as the basis of a dynamic
code generator, as opposed to having to interpret.

49
soml/parfait.md Normal file

@@ -0,0 +1,49 @@
---
layout: soml
title: Parfait, soml's runtime
---
#### Overview
Soml, like ruby, has open classes. This means that a class can be added to by loading another file
with the same class definition that adds fields or methods. The effect of this is that in designing
the runtime, we can concentrate on a minimal function set.
This means all the functionality the compiler needs to get the job done, mostly class and type
structure related functionality with its support.
### Value and Object
In soml, Object is not the root of the class hierarchy, but Value is. Integer, Float and Object are
derived from Value. So an integer is *not* an object, but still has a class and methods, just no
instance variables.
### Layout and Class
Each object has a layout that describes the instance variables and types of the object. It also
references the class of the object. Layout objects are constant and may not be changed over their
lifetime. When a field is added to a class, a new layout is created.
A Class describes a set of objects that respond to the same methods (methods are stored in the class).
A Layout describes a set of objects that have the same instance variables.
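
The relationship between Layout and Class can be pictured with a small c sketch. This is not
Parfait's actual implementation: the struct members field_count, field_names and field_types
and the helper layout_with_field are assumptions made for illustration, while object_class,
object_layout and name follow the field names used in the syntax examples.

```c
#include <stdlib.h>
#include <string.h>

struct Class;

/* Constant description of the instance variables of a set of objects. */
struct Layout {
    const struct Class *object_class; /* class of objects with this layout */
    size_t field_count;
    const char **field_names;         /* instance variable names */
    const char **field_types;         /* type (class name) of each variable */
};

/* Methods would be stored here too; omitted to keep the sketch short. */
struct Class {
    const char *name;
    const struct Layout *object_layout; /* layout used for new instances */
};

/* Build a NEW layout with one more field. The old layout is left untouched,
 * because existing objects may still reference it. Returns NULL on failure. */
struct Layout *layout_with_field(const struct Layout *old,
                                 const char *name, const char *type)
{
    size_t n = old->field_count + 1;
    struct Layout *fresh = malloc(sizeof *fresh);
    const char **names = malloc(n * sizeof *names);
    const char **types = malloc(n * sizeof *types);
    if (!fresh || !names || !types) {
        free(fresh);
        free(names);
        free(types);
        return NULL;
    }
    if (old->field_count > 0) {
        memcpy(names, old->field_names, old->field_count * sizeof *names);
        memcpy(types, old->field_types, old->field_count * sizeof *types);
    }
    names[n - 1] = name;
    types[n - 1] = type;
    fresh->object_class = old->object_class;
    fresh->field_count = n;
    fresh->field_names = names;
    fresh->field_types = types;
    return fresh;
}
```

Adding a field to a class would then mean pointing the class's object_layout at the result,
while objects created earlier keep the layout they were created with.
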
### Method, Message and Frame
The Method class describes a declared method. It carries a name, argument names and types and
several descriptions of the code. The parsed ast is kept for later inlining, the register model
instruction stream for optimisation and further processing and finally the cpu specific binary
represents the executable code.
When a Method is invoked, a message object (instance of Message class) is populated. Message objects
are created at compile time and form a linked list. The data in the Message holds the receiver,
return addresses, arguments and a frame. Frames are also created at compile time and just reused
at runtime.
### Space and support
The single instance of Space holds a list of all Classes, which in turn hold the methods.
The space also holds messages and will hold memory management objects like pages.
Words represent short immutable text and other word processing (buffers, text) is still tbd.
Lists are number indexed, starting at one, and dictionaries are mappings from words to objects.

146
soml/syntax.md Normal file

@@ -0,0 +1,146 @@
---
layout: soml
title: Soml Syntax
---
#### Top level Class and methods
The top level declarations in a file may only be class definitions.

    class Dictionary < Object
      int add(Object o)
        ... statements
      end
    end

The class hierarchy is explained [here](./parfait.html), but you can leave out the superclass
and Object will be assumed.
Methods must be typed, both arguments and return. Generally class names serve as types, but int can
be used as a shortcut for Integer.
Unlike in ruby, code may not appear outside method definitions. A compiled program starts at the
builtin method __init__, which does the initial setup and then jumps to Object.main.
Classes are represented by class objects and methods by Method objects, so all information is available
at runtime.
#### Expressions
Soml distinguishes between expressions and statements. Expressions have value, statements perform an
action. Both are compiled to Register level instructions for the current method. Generally speaking
expressions store their value in a register and statements store those values elsewhere, possibly
after operating on them.
**Basic expressions** are numbers (integer or float), strings or names. Names may be variable,
argument, field or class names (the usual rules apply). Special names include self (the current
receiver), and message (the currently executed method frame). These all resolve to a register
with contents.

    23
    "hi there"
    argument_name
    Object

A **field access** resolves to the field's value at the time. Fields must be defined by
field definitions, and are basically instance variables, but not hidden (see below).
The example below shows how to define local variables at the same time. Note that chaining, both for
field access and call, is not allowed.

    Layout l = self.layout
    Class c = l.object_class
    Word n = c.name

A **call expression** is a method call that resolves to the method's return value. If no receiver is
specified, self (the current receiver) is used. The receiver may be any of the basic expressions
above, so also class instances. The receiver type is known at compile time, as are all argument
types, so the class of the receiver is searched for a matching method. Many methods of the same
name may exist, but to issue a call, an exact match for the arguments must be found.

    Class c = self.get_class()
    c.get_super_class()

An **operator expression** is a binary expression, with either of the other expressions as left
and right operand, and an operator symbol between them. Operand types must be integer.
The symbols allowed are normal arithmetic and logical operations.

    a + b
    counter | 255
    mask >> shift

Operator expressions may be used in assignments and conditions, but not in calls, where the result
would have to be assigned beforehand. This is one of those cases where soml's low level approach
shines through, as soml has no auto-generated temporary variables.
#### Statements
We have seen the top level statements above. In methods the most interesting statements relate to
flow control and specifically how conditionals are expressed. This differs somewhat from other
languages, in that the condition is expressed explicitly (not implicitly like in c or ruby).
This lets the programmer express more precisely what is tested, and also opens an extensible
framework for more tests than available in other languages. Specifically overflow may be tested in
soml, without dropping down to assembler.
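
For comparison, here is a sketch of what an overflow test takes in portable c. Signed overflow
is undefined behaviour in c, so the check has to happen before the addition rather than by
inspecting a flag afterwards. The function name add_would_overflow is made up for this example.

```c
#include <limits.h>
#include <stdbool.h>

/* Returns true if a + b would not fit into an int. The comparison is
 * arranged so that it never overflows itself. */
bool add_would_overflow(int a, int b)
{
    if (b > 0)
        return a > INT_MAX - b; /* sum would exceed INT_MAX */
    return a < INT_MIN - b;     /* sum would drop below INT_MIN */
}
```

In soml the same test is meant to be expressed directly through the overflow branch type of
the conditional statements described in this section.
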
An **if statement** starts with the keyword if_ followed by the branch type. The branch
type may be plus, minus, zero, nonzero or overflow. The condition must be in brackets and may be any
expression. An if may be continued with an else, but doesn't have to be, and is ended with end.

    if_zero(a - 5)
      ....
    else
      ....
    end

A **while statement** is very much like an if, with of course the normal loop semantics, and
without the possible else.

    while_plus( counter )
      ....
    end

A **return statement** returns a value from the current function. There are no void functions.

    return 5

A **field definition** declares an instance variable on an object. It starts with the keyword
field, must be in class (not method) scope and may not be assigned to.

    class Class < Object
      field List instance_methods
      field Layout object_layout
      field Word name
      ...
    end

A **local variable definition** declares and possibly assigns to a local variable. Local variables
are stored in frame objects and are last in search order. When resolving a name, the compiler
checks argument names first, and then local variables.

    int counter = 0

Any of the expressions may be assigned to the variable at the time of definition. After a variable is
defined it may be assigned to with an **assignment statement** any number of times. The assignment
is like an assignment during definition, without the leading type.

    counter = 0

Any of the expressions, basic, call, operator, field access, may be assigned.
### Code generation and scope
Compiling generates two results simultaneously. The more obvious one is the code for a function, but
there is also an object structure of classes etc. that captures the declarations. To understand the
code part better the register abstraction should be studied, and to understand the object structure,
the runtime.
The register machine abstraction is very simple, and so is the code generation, in favour of a simple
model. Especially in the area of register assignment, there is no magic and only a few simple rules.
The main one of those concerns main memory access ordering and states that object memory must
be consistent at the end of the statement. Since there is only object memory in soml, this
concerns all assignments, since all variables are either named or indexed members of objects.
Also local variables are just members of the frame.
This obviously does leave room for optimisation as preliminary benchmarks show. But benchmarks also
show that it is not such a big issue and much more benefit can be achieved by inlining.