rename typed to soml

This commit is contained in:
Torsten Ruger
2018-04-11 19:58:57 +03:00
parent 8787a77d14
commit 4dcfddb270
10 changed files with 30 additions and 34 deletions

View File

@@ -0,0 +1,8 @@
.row
%ul.nav
%li
%a{:href => "typed.html"} Typed
%li
%a{:href => "benchmarks.html"} Performance
%li
%a{:href => "syntax.html"} Syntax (obsolete)

Binary file not shown.

View File

@@ -0,0 +1,51 @@
= render "pages/soml/menu"
%h1= title "Simple soml performance numbers"
%p
These benchmarks were made to establish places for optimizations. This early on it is clear that
performance is not outstanding, but still there were some surprises.
%ul
%li loop - program does empty loop of same size as hello
%li hello - output hello world (to dev/null) to measure kernel calls (not terminal speed)
%li itos - convert integers from 1 to 100000 to string
%li add - run integer adds by linear fibonacci of 40
%li call - exercise calling by recursive fibonacci of 20
%p
Hello and itos and add run 100_000 iterations per program invocation to remove startup overhead.
Call only has 10000 iterations, as it is much slower, executing about 10000 calls per invocation
%p Gcc used to compile c on the machine. soml executables produced by ruby (on another machine)
%h2 Results
%p
Results were measured by a ruby script. Mean and variance was measured until variance was low,
always under one percent.
%p
The machine was a virtual arm run on a powerbook, performance roughly equivalent to a raspberry pi.
But results should be seen as relative, not absolute (some were scaled)
%p
= image_tag "typed/bench.png" ,alt: "Graph"
%h2 Discussion
%p
Surprisingly there are areas where soml code runs faster than c. Especially in the hello example this
may not mean too much. Printf does caching and has a lot functionality, so it may not be a straight
comparison. The loop example is surprising and needs to be examined.
%p
The add example is slower because of the different memory model and lack of optimisation for soml.
Every result of an arithmetic operation is immediately written to memory in soml, whereas c will
keep things in registers as long as it can, which in the example is the whole time. This can
be improved upon with register code optimisation, which can cut loads after writes and writes that
that are overwritten before calls or jumps are made.
%p
The call was expected to be larger as a typed model is used and runtime information (like the method
name) made available. It is actually a small price to pay for the ability to generate code at runtime
and will off course reduce drastically with inlining.
%p
The itos example was also to be expected as it relies both on calling and on arithmetic. Also itos
relies heavily on division by 10, which when coded in cpu specific assembler may easily be sped up
by a factor of 2-3.
%p
All in all the results are encouraging as no optimization efforts have been made. Off course the
most encouraging fact is that the system works and thus may be used as the basis of a dynamic
code generator, as opposed to having to interpret.

View File

@@ -0,0 +1,190 @@
= render "pages/soml/menu"
%h1= title "Soml Syntax"
%h2 Top level Class and methods
%p The top level declarations in a file may only be class definitions
%pre
%code
:preserve
class Dictionary < Object
int add(Object o)
... statements
end
end
%p
The class hierarchy is explained in
= succeed "," do
%a{:href => "parfait.html"} here
%p
Methods must be typed, both arguments and return. Generally class names serve as types, but “int” can
be used as a shortcut for Integer.
%p
Code may not be outside method definitions, like in ruby. A compiled program starts at the builtin
method
= succeed "," do
%strong init
%strong Space.main
%p
Classes are represented by class objects (instances of class Class to be precise) and methods by
Method objects, so all information is available at runtime.
%h2 Expressions
%p
Soml distinguishes between expressions and statements. Expressions have value, statements perform an
action. Both are compiled to Register level instructions for the current method. Generally speaking
expressions store their value in a register and statements store those values elsewhere, possibly
after operating on them.
%p The subsections below correspond roughly to the parsers rule names.
%p
%strong Basic expressions
are numbers (integer or float), strings or names, either variable, argument,
field or class names. (normal details applicable). Special names include self (the current
receiver), and message (the currently executed method frame). These all resolve to a register
with contents.
%pre
%code
:preserve
23
"hi there"
argument_name
Object
%p
A
%strong field access
resolves to the fields value at the time. Fields must be defined by
field definitions, and are basically instance variables, but not hidden (see below).
The example below shows how to define local variables at the same time. Notice chaining, both for
field access and call, is not allowed.
%pre
%code
:preserve
Type l = self.type
Class c = l.object_class
Word n = c.name
%p
A
%strong Call expression
is a method call that resolves to the methods return value. If no receiver is
specified, self (the current receiver) is used. The receiver may be any of the basic expressions
above, so also class instances. The receiver type is known at compile time, as are all argument
types, so the class of the receiver is searched for a matching method. Many methods of the same
name may exist, but to issue a call, an exact match for the arguments must be found.
%pre
%code
:preserve
Class c = self.get_class()
c.get_super_class()
%p
An
%strong operator expression
is a binary expression, with either of the other expressions as left
and right operand, and an operator symbol between them. Operand types must be integer.
The symbols allowed are normal arithmetic and logical operations.
%pre
%code
:preserve
a + b
counter | 255
mask >> shift
%p
Operator expressions may be used in assignments and conditions, but not in calls, where the result
would have to be assigned beforehand. This is one of those cases where somls low level approach
shines through, as soml has no auto-generated temporary variables.
%h2 Statements
%p
We have seen the top level statements above. In methods the most interesting statements relate to
flow control and specifically how conditionals are expressed. This differs somewhat from other
languages, in that the condition is expressed explicitly (not implicitly like in c or ruby).
This lets the programmer express more precisely what is tested, and also opens an extensible
framework for more tests than available in other languages. Specifically overflow may be tested in
soml, without dropping down to assembler.
%p
An
%strong if statement
is started with the keyword if_ and then contains the branch type. The branch
type may be
= succeed "." do
%em plus, minus, zero, nonzero or overflow
%em If
may be continued with en
= succeed "," do
%em else
%em end
%pre
%code
:preserve
if_zero(a - 5)
....
else
....
end
%p
A
%strong while statement
is very much like an if, with off course the normal loop semantics, and
without the possible else.
%pre
%code
:preserve
while_plus( counter )
....
end
%p
A
%strong return statement
return a value from the current functions. There are no void functions.
%pre
%code
:preserve
return 5
%p
A
%strong field definition
is to declare an instance variable on an object. It starts with the keyword
field, must be in class (not method) scope and may not be assigned to.
%pre
%code
:preserve
class Class < Object
field List instance_methods
field Type object_type
field Word name
...
end
%p
A
%strong local variable definition
declares, and possibly assigns to, a local variable. Local variables
are stored in frame objects, in fact they are instance variables of the current frame object.
When resolving a name, the compiler checks argument names first, and then local variables.
%pre
%code
:preserve
int counter = 0
%p
Any of the expressions may be assigned to the variable at the time of definition. After a variable is
defined it may be assigned to with an
%strong assignment statement
any number of times. The assignment
is like an assignment during definition, without the leading type.
%pre
%code
:preserve
counter = 0
%p Any of the expressions, basic, call, operator, field access, may be assigned.
%h2 Code generation and scope
%p
Compiling generates two results simultaneously. The more obvious is code for a function, but also an
object structure of classes etc that capture the declarations. To understand the code part better
the register abstraction should be studied, and to understand the object structure the runtime.
%p
The register machine abstraction is very simple, and so is the code generation, in favour of a simple
model. Especially in the area of register assignment, there is no magic and only a few simple rules.
%p
The main one of those concerns main memory access ordering and states that object memory must
be consistent at the end of the statement. Since there is only only object memory in soml, this
concerns all assignments, since all variables are either named or indexed members of objects.
Also local variables are just members of the frame.
%p
This obviously does leave room for optimisations as preliminary benchmarks show. But benchmarks also
show that it is not such a bit issue and much more benefit can be achieved by inlining.

View File

@@ -0,0 +1,55 @@
= render "pages/soml/menu"
%h1= title "Typed intermediate representation"
%p
Compilers use different intermediate representations to go from the source code to a binary,
which would otherwise be too big a step.
%p
The
%strong typed
intermediate representation is a strongly typed layer, between the dynamically typed
ruby above, and the register machine below. One can think of it as a mix between c and c++,
minus the syntax aspect. While in 2015, this layer existed as a language, (see soml-parser), it
is now a tree representation only.
%h2 Object oriented to the core, including calling convention
%p
Types are modeled by the class Type and carry information about instance variable names
and their basic type.
%em Every object
stores a reference
to its type, and while
= succeed "," do
%strong types are immutable
%p
The object model, ie the basic properties of objects that the system relies on, is quite simple
and explained in the runtime section. It involves a single reference per object.
Also the object memory model is kept quite simple in that object sizes are always small multiples
of the cache size of the hardware machine.
We use object encapsulation to build up larger looking objects from these basic blocks.
%p
The calling convention is also object oriented, not stack based*. Message objects are used to
define the data needed for invocation. They carry arguments, a frame and return address.
The return address is pre-calculated and determined by the caller, so
a method invocation may thus be made to return to an entirely different location.
*(A stack, as used in c, is not typed, not object oriented, and as such a source of problems)
%p
There is no non- object based memory at all. The only global constants are instances of
classes that can be accessed by writing the class name in ruby source.
%h2 Runtime / Parfait
%p
The typed representation layer depends on the higher layer to actually determine and instantiate
types (type objects, or objects of class Type). This includes method arguments and local variables.
%p
The typed layer is mainly concerned in defining TypedMethods, for which argument or local variable
have specified type (like in c). Basic Type names are the class names they represent,
but the “int” may be used for brevity
instead of Integer.
%p
The runtime, Parfait, is kept
to a minimum, currently around 15 classes, described in detail
= succeed "." do
%a{:href => "parfait.html"} here
%p
Historically Parfait has been coded in ruby, as it was first needed in the compiler.
This had the additional benefit of providing solid test cases for the functionality.