add the missing some sections
also benchmarks
This commit is contained in:
parent
d855f20d52
commit
96e9194df4
@ -17,6 +17,7 @@ layout: site
|
|||||||
<li><a href="/soml/soml.html"> Soml </a> </li>
|
<li><a href="/soml/soml.html"> Soml </a> </li>
|
||||||
<li><a href="/soml/syntax.html"> Syntax </a> </li>
|
<li><a href="/soml/syntax.html"> Syntax </a> </li>
|
||||||
<li><a href="/soml/parfait.html"> Parfait </a> </li>
|
<li><a href="/soml/parfait.html"> Parfait </a> </li>
|
||||||
|
<li><a href="/soml/benchmarks.html"> Performance </a> </li>
|
||||||
</ul>
|
</ul>
|
||||||
</div>
|
</div>
|
||||||
</div>
|
</div>
|
||||||
|
58
soml/benchmarks.md
Normal file
@ -0,0 +1,58 @@
|
|||||||
|
---
|
||||||
|
layout: soml
|
||||||
|
title: Simple soml performance numbers
|
||||||
|
---
|
||||||
|
|
||||||
|
These benchmarks were made to identify places for optimization. Even this early on it is clear that
performance is not outstanding, but there were still some surprises.
|
||||||
|
|
||||||
|
|
||||||
|
- loop - program does empty loop of same size as hello
|
||||||
|
- hello - output hello world (to /dev/null) to measure kernel calls (not terminal speed)
|
||||||
|
- itos - convert integers from 1 to 100000 to string
|
||||||
|
- add - run integer adds by linear fibonacci of 40
|
||||||
|
- call - exercise calling by recursive fibonacci of 20
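The add and call workloads can be sketched as follows. This is a Python illustration of the two fibonacci variants, not the benchmark source itself; the function names are assumptions.

```python
def fib_linear(n):
    """Iterative fibonacci: n integer additions and no calls (the add workload)."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def fib_recursive(n):
    """Naive recursive fibonacci: dominated by call overhead (the call workload)."""
    if n < 2:
        return n
    return fib_recursive(n - 1) + fib_recursive(n - 2)


print(fib_linear(40))     # 102334155
print(fib_recursive(20))  # 6765
```

Both functions compute the same sequence, so the difference between the two benchmarks is only in how the work is done: additions in a loop versus nested calls.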
|
||||||
|
|
||||||
|
Hello, puti and add run 100_000 iterations per program invocation to remove startup overhead.
Call only has 10000 iterations, as it is much slower, executing about 10000 calls per invocation.
|
||||||
|
|
||||||
|
Gcc was used to compile the c programs on the machine. The soml executables were produced by ruby (on another machine).
|
||||||
|
|
||||||
|
### Results
|
||||||
|
|
||||||
|
Results were measured by a ruby script. Mean and variance were measured until the variance was low,
always under one percent.
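A measuring loop of this kind might look like the sketch below. It is written in Python rather than ruby and is not the original script. It assumes that "under one percent" means the standard deviation relative to the mean; the function name, parameters and defaults are hypothetical.

```python
import statistics
import time


def measure(run, timer=time.perf_counter, min_runs=5, max_runs=200, limit=0.01):
    """Time run() repeatedly until the relative spread of the samples is below limit."""
    samples = []
    while len(samples) < max_runs:
        start = timer()
        run()
        samples.append(timer() - start)
        if len(samples) >= min_runs:
            mean = statistics.mean(samples)
            spread = statistics.stdev(samples)
            if mean > 0 and spread / mean < limit:
                break
    return statistics.mean(samples), statistics.variance(samples), len(samples)
```

Calling `measure(lambda: sum(range(10000)))` returns the mean, the variance and the number of runs taken. The `max_runs` cap is there so that a noisy machine cannot keep the loop running forever.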
|
||||||
|
|
||||||
|
The machine was a virtual arm run on a powerbook, performance roughly equivalent to a raspberry pi.
|
||||||
|
But results should be seen as relative, not absolute.
|
||||||
|
|
||||||
|
|
||||||
|
| language | loop   | hello  | itos   | add    | call   |   | loop   | hello  | itos    | add    | call   |
|----------|--------|--------|--------|--------|--------|---|--------|--------|---------|--------|--------|
| c        | 0,0500 | 2,1365 | 0,2902 | 0,1245 | 0,8535 |   | + 33 % | + 79 % |         |        |        |
| soml     | 0,0374 | 1,2071 | 0,7263 | 0,2247 | 1,3625 |   |        |        | + 150 % | + 80 % | + 60 % |
|
||||||
|
|
||||||
|
|
||||||
|
### Discussion
|
||||||
|
|
||||||
|
Surprisingly there are areas where soml code runs faster than c. Especially in the hello example this
|
||||||
|
may not mean too much. Printf does caching and has a lot of functionality, so it may not be a straight
|
||||||
|
comparison. The loop example is surprising and needs to be examined.
|
||||||
|
|
||||||
|
The add example is slower because of the different memory model and lack of optimisation for soml.
|
||||||
|
Every result of an arithmetic operation is immediately written to memory in soml, whereas c will
|
||||||
|
keep things in registers as long as it can, which in the example is the whole time. This can
|
||||||
|
be improved upon with register code optimisation, which can cut loads after writes and writes that
are overwritten before calls or jumps are made.
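The two optimisations named above can be illustrated on a toy instruction list. This is a minimal Python sketch, not soml's register machine: the tuple format, the instruction names and the `optimise` function are all assumptions made for illustration.

```python
def optimise(instructions):
    """Drop loads that follow a write of the same register, and overwritten writes."""
    result = []
    known = {}    # memory slot -> register that currently holds the slot's value
    pending = {}  # memory slot -> index in result of a store nobody has read yet
    for ins in instructions:
        op = ins[0]
        if op == "store":                       # ("store", register, slot)
            _, reg, slot = ins
            if slot in pending:
                result[pending[slot]] = None    # earlier store is overwritten
            pending[slot] = len(result)
            known[slot] = reg
            result.append(ins)
        elif op == "load":                      # ("load", register, slot)
            _, reg, slot = ins
            if known.get(slot) == reg:
                continue                        # value is already in the register
            known = {s: r for s, r in known.items() if r != reg}
            known[slot] = reg
            pending.pop(slot, None)             # memory is read, keep the store
            result.append(ins)
        elif op == "add":                       # ("add", dest, left, right)
            dest = ins[1]
            known = {s: r for s, r in known.items() if r != dest}
            result.append(ins)
        else:                                   # calls and jumps: memory must be consistent
            known.clear()
            pending.clear()
            result.append(ins)
    return [ins for ins in result if ins is not None]


program = [
    ("add", "r1", "r1", "r2"),
    ("store", "r1", "a"),
    ("load", "r1", "a"),
    ("add", "r1", "r1", "r2"),
    ("store", "r1", "a"),
    ("call", "foo"),
]
print(optimise(program))
```

In this example the load and the first store disappear, leaving two adds, one store and the call. The pass forgets everything at a call or jump, which matches the rule that writes only need to reach memory before control leaves the straight-line code.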
|
||||||
|
|
||||||
|
The call overhead was expected to be larger, as a typed model is used and runtime information (like the method
name) is made available. It is actually a small price to pay for the ability to generate code at runtime
and will of course reduce drastically with inlining.
|
||||||
|
|
||||||
|
The itos example was also to be expected as it relies both on calling and on arithmetic. Also itos
|
||||||
|
relies heavily on division by 10, which when coded in cpu specific assembler may easily be sped up
|
||||||
|
by a factor of 2-3.
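One common way to speed up division by 10 is to replace the divide with a multiply and a shift. The text above does not say which technique is meant, so the following Python sketch is an assumption. It uses the well-known constant 0xCCCCCCCD, which is valid for unsigned 32-bit values; the function names are hypothetical.

```python
def div10(n):
    """Divide an unsigned 32-bit integer by 10 using one multiply and one shift."""
    return (n * 0xCCCCCCCD) >> 35


def itos(n):
    """Convert a non-negative 32-bit integer to its decimal string."""
    digits = []
    while True:
        quotient = div10(n)
        digits.append(chr(ord("0") + n - quotient * 10))  # remainder as a character
        n = quotient
        if n == 0:
            break
    return "".join(reversed(digits))


print(itos(100000))  # 100000
```

Each digit then costs one multiply, one shift and one multiply-subtract instead of a general division, which is why hand-written assembler can gain so much here.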
|
||||||
|
|
||||||
|
All in all the results are encouraging as no optimization efforts have been made. Of course the
|
||||||
|
most encouraging fact is that the system works and thus may be used as the basis of a dynamic
|
||||||
|
code generator, as opposed to having to interpret.
|
49
soml/parfait.md
Normal file
@ -0,0 +1,49 @@
|
|||||||
|
---
|
||||||
|
layout: soml
|
||||||
|
title: Parfait, soml's runtime
|
||||||
|
---
|
||||||
|
|
||||||
|
|
||||||
|
#### Overview
|
||||||
|
|
||||||
|
Soml, like ruby, has open classes. This means that a class can be added to by loading another file
|
||||||
|
with the same class definition that adds fields or methods. The effect of this is that in designing
|
||||||
|
the runtime, we can concentrate on a minimal function set.
|
||||||
|
|
||||||
|
This means all the functionality the compiler needs to get the job done, mostly class and type
structure related functionality with its support.
|
||||||
|
|
||||||
|
### Value and Object
|
||||||
|
|
||||||
|
In soml, Object is not the root of the class hierarchy, but Value is. Integer, Float and Object are
|
||||||
|
derived from Value. So an integer is *not* an object, but still has a class and methods, just no
|
||||||
|
instance variables.
|
||||||
|
|
||||||
|
### Layout and Class
|
||||||
|
|
||||||
|
Each object has a layout that describes the instance variables and types of the object. It also
|
||||||
|
references the class of the object. Layout objects are constant and may not be changed over their
|
||||||
|
lifetime. When a field is added to a class, a new layout is created.
|
||||||
|
|
||||||
|
A Class describes a set of objects that respond to the same methods (methods are stored in the class).
|
||||||
|
A Layout describes a set of objects that have the same instance variables.
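The relationship between Class and Layout can be sketched in Python as below. This is an illustration only, not Parfait's code: the field names `object_class`, `object_layout`, `instance_methods` and `name` are taken from the soml examples in this documentation, while the `add_field` methods and the tuple representation are assumptions.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class Layout:
    """Immutable description of the instance variables of a set of objects."""
    object_class: "Class"
    names: tuple = ()
    types: tuple = ()

    def add_field(self, name, type_name):
        # Layouts never change, so adding a field returns a new layout.
        return Layout(self.object_class, self.names + (name,), self.types + (type_name,))


class Class:
    """Holds the methods and the current layout for its instances."""

    def __init__(self, name):
        self.name = name
        self.instance_methods = {}
        self.object_layout = Layout(self)

    def add_field(self, name, type_name):
        self.object_layout = self.object_layout.add_field(name, type_name)


point = Class("Point")
before = point.object_layout
point.add_field("x", "Integer")
print(before.names, point.object_layout.names)  # () ('x',)
```

Objects created before the field was added would keep referring to the old layout, which is why a layout can stay constant over its lifetime.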
|
||||||
|
|
||||||
|
### Method, Message and Frame
|
||||||
|
|
||||||
|
The Method class describes a declared method. It carries a name, argument names and types and
|
||||||
|
several descriptions of the code. The parsed ast is kept for later inlining, the register model
|
||||||
|
instruction stream for optimisation and further processing and finally the cpu specific binary
|
||||||
|
represents the executable code.
|
||||||
|
|
||||||
|
When methods are invoked, a message object (instance of the Message class) is populated. Message objects
|
||||||
|
are created at compile time and form a linked list. The data in the Message holds the receiver,
|
||||||
|
return addresses, arguments and a frame. Frames are also created at compile time and just reused
|
||||||
|
at runtime.
|
||||||
|
|
||||||
|
### Space and support
|
||||||
|
|
||||||
|
The single instance of Space holds a list of all Classes, which in turn hold the methods.
The space also holds messages and will hold memory management objects like pages.
|
||||||
|
|
||||||
|
Words represent short immutable text and other word processing (buffers, text) is still tbd.
|
||||||
|
Lists are number indexed, starting at one, and dictionaries are mappings from words to objects.
|
146
soml/syntax.md
Normal file
@ -0,0 +1,146 @@
|
|||||||
|
---
|
||||||
|
layout: soml
|
||||||
|
title: Soml Syntax
|
||||||
|
---
|
||||||
|
|
||||||
|
|
||||||
|
#### Top level Class and methods
|
||||||
|
|
||||||
|
The top level declarations in a file may only be class definitions.
|
||||||
|
|
||||||
|
    class Dictionary < Object
      int add(Object o)
        ... statements
      end
    end
|
||||||
|
|
||||||
|
The class hierarchy is explained [here](./parfait.html), but you can leave out the superclass
|
||||||
|
and Object will be assumed.
|
||||||
|
|
||||||
|
Methods must be typed, both arguments and return. Generally class names serve as types, but int can
|
||||||
|
be used as a shortcut for Integer.
|
||||||
|
|
||||||
|
Unlike in ruby, code may not be outside method definitions. A compiled program starts at the builtin
method __init__, which does the initial setup and then jumps to Object.main.
|
||||||
|
|
||||||
|
Classes are represented by class objects and methods by Method objects, so all information is available
|
||||||
|
at runtime.
|
||||||
|
|
||||||
|
#### Expressions
|
||||||
|
|
||||||
|
Soml distinguishes between expressions and statements. Expressions have value, statements perform an
|
||||||
|
action. Both are compiled to Register level instructions for the current method. Generally speaking
|
||||||
|
expressions store their value in a register and statements store those values elsewhere, possibly
|
||||||
|
after operating on them.
|
||||||
|
|
||||||
|
**Basic expressions** are numbers (integer or float), strings or names, either variable, argument,
|
||||||
|
field or class names. (normal details applicable). Special names include self (the current
|
||||||
|
receiver), and message (the currently executed method frame). These all resolve to a register
|
||||||
|
with contents.
|
||||||
|
|
||||||
|
    23
    "hi there"
    argument_name
    Object
|
||||||
|
|
||||||
|
A **field access** resolves to the field's value at the time. Fields must be defined by
field definitions, and are basically instance variables, but not hidden (see below).
The example below also shows how to define local variables at the same time. Note that chaining, both for
field access and call, is not allowed.
|
||||||
|
|
||||||
|
    Layout l = self.layout
    Class c = l.object_class
    Word n = c.name
|
||||||
|
|
||||||
|
A **call expression** is a method call that resolves to the method's return value. If no receiver is
|
||||||
|
specified, self (the current receiver) is used. The receiver may be any of the basic expressions
|
||||||
|
above, so also class instances. The receiver type is known at compile time, as are all argument
|
||||||
|
types, so the class of the receiver is searched for a matching method. Many methods of the same
|
||||||
|
name may exist, but to issue a call, an exact match for the arguments must be found.
|
||||||
|
|
||||||
|
    Class c = self.get_class()
    c.get_super_class()
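The exact-match rule for calls can be sketched as a lookup keyed by method name and argument types. This is a Python illustration under assumptions: the `MethodTable` class and its method names are hypothetical and do not describe the compiler's actual data structures.

```python
class MethodTable:
    """Methods of one class, keyed by name and exact argument types."""

    def __init__(self):
        self.methods = {}

    def define(self, name, arg_types, return_type):
        self.methods[(name, tuple(arg_types))] = return_type

    def resolve(self, name, arg_types):
        # No conversions and no subclass matching: the key must match exactly.
        key = (name, tuple(arg_types))
        if key not in self.methods:
            raise LookupError(f"no method {name}({', '.join(arg_types)})")
        return self.methods[key]


dictionary = MethodTable()
dictionary.define("add", ["Object"], "Integer")
dictionary.define("add", ["Word", "Object"], "Integer")
print(dictionary.resolve("add", ["Object"]))  # Integer
```

Because all types are known at compile time, a failed lookup is a compile error rather than something that has to be handled at runtime.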
|
||||||
|
|
||||||
|
An **operator expression** is a binary expression, with either of the other expressions as left
|
||||||
|
and right operand, and an operator symbol between them. Operand types must be integer.
|
||||||
|
The symbols allowed are normal arithmetic and logical operations.
|
||||||
|
|
||||||
|
    a + b
    counter | 255
    mask >> shift
|
||||||
|
|
||||||
|
Operator expressions may be used in assignments and conditions, but not in calls, where the result
|
||||||
|
would have to be assigned beforehand. This is one of those cases where soml's low level approach
|
||||||
|
shines through, as soml has no auto-generated temporary variables.
|
||||||
|
|
||||||
|
#### Statements
|
||||||
|
|
||||||
|
We have seen the top level statements above. In methods the most interesting statements relate to
|
||||||
|
flow control and specifically how conditionals are expressed. This differs somewhat from other
|
||||||
|
languages, in that the condition is expressed explicitly (not implicitly like in c or ruby).
|
||||||
|
This lets the programmer express more precisely what is tested, and also opens an extensible
|
||||||
|
framework for more tests than available in other languages. Specifically overflow may be tested in
|
||||||
|
soml, without dropping down to assembler.
|
||||||
|
|
||||||
|
An **if statement** is started with the keyword if_ and then contains the branch type. The branch
type may be plus, minus, zero, nonzero or overflow. The condition must be in brackets and may be any
expression. If may be continued with an else, but doesn't have to be, and is ended with end.
|
||||||
|
|
||||||
|
    if_zero(a - 5)
      ....
    else
      ....
    end
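What the five branch types test can be sketched in Python as predicates on the result of the condition expression. The exact semantics are assumptions here: a 32-bit signed word, plus meaning zero or greater (as with the ARM pl condition), and overflow meaning the result does not fit in the word. The names `BRANCH_TESTS` and `branch_taken` are hypothetical.

```python
INT_MIN, INT_MAX = -2**31, 2**31 - 1  # assumed 32-bit signed word

BRANCH_TESTS = {
    "zero": lambda value: value == 0,
    "nonzero": lambda value: value != 0,
    "plus": lambda value: value >= 0,      # assumption: zero counts as plus
    "minus": lambda value: value < 0,
    "overflow": lambda value: not (INT_MIN <= value <= INT_MAX),
}


def branch_taken(kind, value):
    """Return True if the branch of the given kind would be taken for this result."""
    return BRANCH_TESTS[kind](value)


a = 5
print(branch_taken("zero", a - 5))              # True
print(branch_taken("overflow", INT_MAX + 1))    # True
```

Keeping the tests in a table mirrors the idea that the set of branch types is extensible: a new test is one more entry.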
|
||||||
|
|
||||||
|
A **while statement** is very much like an if, with of course the normal loop semantics, and
|
||||||
|
without the possible else.
|
||||||
|
|
||||||
|
    while_plus( counter )
      ....
    end
|
||||||
|
|
||||||
|
A **return statement** returns a value from the current function. There are no void functions.
|
||||||
|
|
||||||
|
    return 5
|
||||||
|
|
||||||
|
|
||||||
|
A **field definition** declares an instance variable on an object. It starts with the keyword
|
||||||
|
field, must be in class (not method) scope and may not be assigned to.
|
||||||
|
|
||||||
|
    class Class < Object
      field List instance_methods
      field Layout object_layout
      field Word name
      ...
    end
|
||||||
|
|
||||||
|
A **local variable definition** declares and possibly assigns to a local variable. Local variables
are stored in frame objects and are last in search order. When resolving a name, the compiler
|
||||||
|
checks argument names first, and then local variables.
|
||||||
|
|
||||||
|
    int counter = 0
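The search order described above, arguments first and local variables last, can be sketched as follows. This is a Python illustration; the function name and the returned labels are hypothetical.

```python
def resolve_name(name, arguments, local_variables):
    """Return where a name is found, checking arguments before local variables."""
    if name in arguments:
        return "argument"
    if name in local_variables:
        return "local"
    raise NameError(f"unknown name {name}")


print(resolve_name("counter", ["o"], ["counter"]))  # local
print(resolve_name("o", ["o"], ["o", "counter"]))   # argument
```

The second call shows the consequence of the ordering: if an argument and a local variable share a name, the argument wins.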
|
||||||
|
|
||||||
|
Any of the expressions may be assigned to the variable at the time of definition. After a variable is
defined it may be assigned to with an **assignment statement** any number of times. The assignment
|
||||||
|
is like an assignment during definition, without the leading type.
|
||||||
|
|
||||||
|
    counter = 0
|
||||||
|
|
||||||
|
Any of the expressions, basic, call, operator, field access, may be assigned.
|
||||||
|
|
||||||
|
### Code generation and scope
|
||||||
|
|
||||||
|
Compiling generates two results simultaneously: the more obvious one is the code for a function, the other is an
object structure of classes etc. that captures the declarations. To understand the code part better
the register abstraction should be studied, and to understand the object structure, the runtime.
|
||||||
|
|
||||||
|
The register machine abstraction is very simple, and so is the code generation, in favour of a simple
|
||||||
|
model. Especially in the area of register assignment, there is no magic and only a few simple rules.
|
||||||
|
|
||||||
|
The main one of those concerns main memory access ordering and states that object memory must
|
||||||
|
be consistent at the end of the statement. Since there is only object memory in soml, this
|
||||||
|
concerns all assignments, since all variables are either named or indexed members of objects.
|
||||||
|
Also local variables are just members of the frame.
|
||||||
|
|
||||||
|
This obviously does leave room for optimisation as preliminary benchmarks show. But benchmarks also
|
||||||
|
show that it is not such a big issue and much more benefit can be achieved by inlining.
|