move md to haml
50
typed/benchmarks.html.haml
Normal file
@@ -0,0 +1,50 @@
%hr/
%p
  layout: typed
  title: Simple soml performance numbers
  —
%p
  These benchmarks were made to establish places for optimizations. This early on it is clear that
  performance is not outstanding, but still there were some surprises.
%ul
  %li loop - program does empty loop of same size as hello
  %li hello - output hello world (to /dev/null) to measure kernel calls (not terminal speed)
  %li itos - convert integers from 1 to 100000 to string
  %li add - run integer adds by linear fibonacci of 40
  %li call - exercise calling by recursive fibonacci of 20
%p
  Hello, itos and add run 100_000 iterations per program invocation to remove startup overhead.
  Call only has 10000 iterations, as it is much slower, executing about 10000 calls per invocation.
%p Gcc was used to compile c on the machine. soml executables were produced by ruby (on another machine).
%h3#results Results
%p
  Results were measured by a ruby script. Mean and variance were measured until variance was low,
  always under one percent.
%p
  The machine was a virtual arm run on a powerbook, performance roughly equivalent to a raspberry pi.
  But results should be seen as relative, not absolute (some were scaled).
%p
  %img{:alt => "Graph", :src => "bench.png"}/
%h3#discussion Discussion
%p
  Surprisingly there are areas where soml code runs faster than c. Especially in the hello example this
  may not mean too much. Printf does caching and has a lot of functionality, so it may not be a straight
  comparison. The loop example is surprising and needs to be examined.
%p
  The add example is slower because of the different memory model and lack of optimisation for soml.
  Every result of an arithmetic operation is immediately written to memory in soml, whereas c will
  keep things in registers as long as it can, which in the example is the whole time. This can
  be improved upon with register code optimisation, which can cut loads after writes and writes
  that are overwritten before calls or jumps are made.
%p
  The call overhead was expected to be larger as a typed model is used and runtime information (like the method
  name) made available. It is actually a small price to pay for the ability to generate code at runtime
  and will of course reduce drastically with inlining.
%p
  The itos example was also to be expected as it relies both on calling and on arithmetic. Also itos
  relies heavily on division by 10, which when coded in cpu specific assembler may easily be sped up
  by a factor of 2-3.
%p
  All in all the results are encouraging as no optimization efforts have been made. Of course the
  most encouraging fact is that the system works and thus may be used as the basis of a dynamic
  code generator, as opposed to having to interpret.
@@ -1,54 +0,0 @@
---
layout: typed
title: Simple soml performance numbers
---

These benchmarks were made to establish places for optimizations. This early on it is clear that
performance is not outstanding, but still there were some surprises.


- loop - program does empty loop of same size as hello
- hello - output hello world (to /dev/null) to measure kernel calls (not terminal speed)
- itos - convert integers from 1 to 100000 to string
- add - run integer adds by linear fibonacci of 40
- call - exercise calling by recursive fibonacci of 20
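
To make the add and call workloads concrete, here is a minimal Ruby sketch of the two Fibonacci variants. The method names and the call counter are illustrative assumptions; the benchmark sources themselves are not part of this page.

```ruby
# Linear (iterative) Fibonacci: n integer additions, no method calls.
def fib_linear(n)
  a, b = 0, 1
  n.times { a, b = b, a + b }
  a
end

# Recursive Fibonacci: nearly all the work is calling.
# calls is a one-element array used as a mutable counter.
def fib_recursive(n, calls)
  calls[0] += 1
  return n if n < 2
  fib_recursive(n - 1, calls) + fib_recursive(n - 2, calls)
end

calls = [0]
puts fib_linear(40)            # the "add" workload
puts fib_recursive(20, calls)  # the "call" workload
puts calls[0]                  # invocations made by the recursive variant
```

With this particular base case the recursive variant makes 21891 invocations for an argument of 20, the same order of magnitude as the roughly 10000 calls mentioned in the text; the exact count depends on how the base case is written.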

Hello, itos and add run 100_000 iterations per program invocation to remove startup overhead.
Call only has 10000 iterations, as it is much slower, executing about 10000 calls per invocation.

Gcc was used to compile c on the machine. soml executables were produced by ruby (on another machine).

### Results

Results were measured by a ruby script. Mean and variance were measured until variance was low,
always under one percent.
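
The measuring script is not shown here, so the following is only a sketch of how such a loop could look. It assumes that "variance under one percent" means standard deviation divided by mean is below 0.01; the names `stats`, `stable?` and `measure` and the binary name in the usage comment are hypothetical.

```ruby
# Mean and population variance of a list of timings.
def stats(samples)
  mean = samples.sum / samples.size.to_f
  variance = samples.sum { |s| (s - mean)**2 } / samples.size
  [mean, variance]
end

# Assumption: stability is judged by standard deviation relative to the mean.
def stable?(samples, tolerance = 0.01)
  mean, variance = stats(samples)
  Math.sqrt(variance) / mean < tolerance
end

# Time the block repeatedly until the samples are stable or max_runs is reached.
# Returns the mean wall-clock time in seconds.
def measure(min_runs: 5, max_runs: 200)
  samples = []
  while samples.size < max_runs
    start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    yield
    samples << Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
    break if samples.size >= min_runs && stable?(samples)
  end
  stats(samples).first
end

# Usage (hypothetical binary name):
#   measure { system("./hello > /dev/null") }
```

Capping the number of runs keeps the loop finite on a noisy machine, at the price of occasionally reporting a mean that never met the tolerance.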

The machine was a virtual arm run on a powerbook, performance roughly equivalent to a raspberry pi.
But results should be seen as relative, not absolute (some were scaled).

![Graph](bench.png)


### Discussion

Surprisingly there are areas where soml code runs faster than c. Especially in the hello example this
may not mean too much. Printf does caching and has a lot of functionality, so it may not be a straight
comparison. The loop example is surprising and needs to be examined.

The add example is slower because of the different memory model and lack of optimisation for soml.
Every result of an arithmetic operation is immediately written to memory in soml, whereas c will
keep things in registers as long as it can, which in the example is the whole time. This can
be improved upon with register code optimisation, which can cut loads after writes and writes
that are overwritten before calls or jumps are made.
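
The two optimisations named above can be sketched on a toy instruction list. The instruction format below (`:store`, `:load`, `:add`, `:call`, `:jump` as Ruby arrays) is invented for illustration and is not RubyX's real register instruction set.

```ruby
# Toy instructions (assumed format):
#   [:store, reg, slot]  write a register to a memory slot
#   [:load,  reg, slot]  read a memory slot into a register
#   [:add,   dst, a, b]  dst = a + b, registers only
#   [:call,  name] / [:jump, label]  barriers: memory must be consistent here

# Cut loads after writes: skip a load when the register already holds the slot.
def drop_redundant_loads(code)
  held = {} # slot => register known to hold the slot's current value
  code.each_with_object([]) do |ins, out|
    case ins[0]
    when :store
      held[ins[2]] = ins[1]
    when :load
      next if held[ins[2]] == ins[1]
      held.delete_if { |_, reg| reg == ins[1] }
      held[ins[2]] = ins[1]
    when :add
      held.delete_if { |_, reg| reg == ins[1] } # destination is clobbered
    else
      held.clear # calls and jumps: forget everything
    end
    out << ins
  end
end

# Cut writes that are overwritten before any load, call or jump sees them.
def drop_dead_stores(code)
  pending = {} # slot => index in out of a store nobody has observed yet
  out = []
  code.each do |ins|
    case ins[0]
    when :store
      dead = pending[ins[2]]
      out[dead] = nil if dead
      pending[ins[2]] = out.size
    when :load
      pending.delete(ins[2])
    when :call, :jump
      pending.clear
    end
    out << ins
  end
  out.compact
end

code = [
  [:add,   :r1, :r1, :r2],
  [:store, :r1, :a],
  [:load,  :r1, :a],
  [:add,   :r1, :r1, :r2],
  [:store, :r1, :a],
  [:call,  :puts]
]
optimised = drop_dead_stores(drop_redundant_loads(code))
optimised.each { |ins| p ins }
```

In this example both the reload and the first store disappear, leaving two adds, one store and the call.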

The call overhead was expected to be larger as a typed model is used and runtime information (like the method
name) made available. It is actually a small price to pay for the ability to generate code at runtime
and will of course reduce drastically with inlining.

The itos example was also to be expected as it relies both on calling and on arithmetic. Also itos
relies heavily on division by 10, which when coded in cpu specific assembler may easily be sped up
by a factor of 2-3.
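
The text does not say how the division would be coded. One common technique, shown here purely as an assumption about what such an optimisation could look like, replaces the division by a multiplication with a precomputed reciprocal and a shift. The sketch is valid for unsigned 32-bit values; `MAGIC`, `div10` and `itos` are illustrative names.

```ruby
MAGIC = 0xCCCCCCCD # ceil(2**35 / 10)

# Division by 10 without a divide instruction, for 0 <= n < 2**32.
def div10(n)
  (n * MAGIC) >> 35
end

# Integer to string using only div10, a multiply and a subtract per digit.
def itos(n)
  return "0" if n.zero?
  digits = []
  while n > 0
    q = div10(n)
    digits << (n - q * 10) # remainder without a second division
    n = q
  end
  digits.reverse.join
end

puts itos(100000)
```

On a cpu with a fast multiply this turns each digit into a multiply, a shift and a subtract.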

All in all the results are encouraging as no optimization efforts have been made. Of course the
most encouraging fact is that the system works and thus may be used as the basis of a dynamic
code generator, as opposed to having to interpret.
89
typed/debugger.html.haml
Normal file
@@ -0,0 +1,89 @@
%hr/
%p
  layout: typed
  title: Register Level Debugger / simulator
  —
%h2#views Views
%p
  From left to right there are several views showing different data and controls.
  All of the green boxes are in fact pop-up menus and can show more information.
  %br/
  Most of these are implemented as a single class with the name reflecting what part.
  I wrote 2 base classes that handle element generation (ie there is hardly any html involved, just elements)
%p
  %img{:alt => "Debugger", :src => "https://raw.githubusercontent.com/ruby-x/rubyx-debugger/master/static/debugger.png", :width => "100%"}/
%h3#switch-view Switch view
%p
  At the top left is a little control to switch files.
  The files need to be in the repository, but at least one can have several and switch between
  them without stopping the debugger.
%p
  Parsing is the only thing that opal chokes on, so the files are parsed by a server script and the
  ast is sent to the browser.
%h3#classes-view Classes View
%p
  The first column on the left is a list of classes in the system. Like on all boxes one can hover
  over a name to look at the class and its instance variables (recursively).
%h3#source-view Source View
%p
  Next is a view of the Soml source. The Source is reconstructed from the ast as html.
  Soml (RubyX object machine language) is a statically typed language,
  maybe in spirit close to c++ (without the c). In the future RubyX will compile ruby to soml.
%p While stepping through the code, those parts of the code that are active get highlighted in blue.
%p
  Currently stepping is done only in register instructions, which means that depending on the source
  constructs it may take many steps for the cursor to move on.
%p Each step will show progress on the register level though (next view).
%h3#register-instruction-view Register Instruction view
%p
  RubyX defines a register machine level which is quite close to the arm machine, but with more
  sensible names. It has 16 registers (below) and an instruction set that is useful for Soml.
%p
  Data movement related instructions implement an indexed get and set. There are also constant load,
  integer operators and of course branches.
  Instructions print their name and used registers r0-r15.
%p The next instruction to be executed is highlighted in blue. A list of previous instructions is shown.
%p One can follow the effect of instructions in the register view below.
%h3#status-view Status View
%p
  The last view at the top right shows the status of the machine (interpreter to be precise), the
  instruction count and any stdout.
%p Current controls include stepping and three speeds of running the program.
%ul
  %li
    Next (green button) will execute exactly one instruction when clicked. Mostly useful when
    debugging the compiler, ie inspecting the generated code.
  %li
    Crawl (first blue button) will execute at a moderate speed. One can still follow the
    logic at the register level.
  %li
    Run (second blue button) runs the program at a higher speed where register instructions just
    whizz by, but one can still follow the source view. Mainly used to verify that the source executes
    as expected and also to get to a specific place in the program (in the absence of breakpoints).
  %li
    Wizz (third blue button) makes the program run so fast that its only useful function is to
    fast forward in the code (while debugging).
%h3#register-view Register view
%p
  The bottom part of the screen is taken up by the 16 registers. As we execute an object oriented
  language, we show the object contents if it is an object (not an integer) in a register.
%p
  The (virtual) machine only uses objects, and specifically a linked list of Message objects to
  make calls. The current message is always in register 0 (analogous to a stack pointer).
  All other registers are scratch for statement use.
%p
  In Soml, expressions compile to the register that holds the expression’s value. Statements may use
  all registers and may not rely on anything other than the message in register 0.
%p The Register view is now greatly improved, especially in its dynamic features:
%ul
  %li when the contents update the register obviously updates
  %li when the object that the register holds updates, the new value is shown immediately
  %li
    hovering over a variable will
    %strong expand that variable
    \.
  %li the hovering works recursively, so it is possible to drill down into objects for several levels
%p
  The last feature of inspecting objects is shown in the screenshot. This makes it possible
  to very quickly verify the program’s behaviour. As it is a pure object system, all data is in
  objects, and all objects can be inspected.
@@ -1,97 +0,0 @@
---
layout: typed
title: Register Level Debugger / simulator
---

## Views

From left to right there are several views showing different data and controls.
All of the green boxes are in fact pop-up menus and can show more information.
Most of these are implemented as a single class with the name reflecting what part.
I wrote 2 base classes that handle element generation (ie there is hardly any html involved, just elements)

![Debugger](https://raw.githubusercontent.com/ruby-x/rubyx-debugger/master/static/debugger.png){: width="100%"}


### Switch view

At the top left is a little control to switch files.
The files need to be in the repository, but at least one can have several and switch between
them without stopping the debugger.

Parsing is the only thing that opal chokes on, so the files are parsed by a server script and the
ast is sent to the browser.

### Classes View

The first column on the left is a list of classes in the system. Like on all boxes one can hover
over a name to look at the class and its instance variables (recursively).


### Source View

Next is a view of the Soml source. The Source is reconstructed from the ast as html.
Soml (RubyX object machine language) is a statically typed language,
maybe in spirit close to c++ (without the c). In the future RubyX will compile ruby to soml.

While stepping through the code, those parts of the code that are active get highlighted in blue.

Currently stepping is done only in register instructions, which means that depending on the source
constructs it may take many steps for the cursor to move on.

Each step will show progress on the register level though (next view).


### Register Instruction view

RubyX defines a register machine level which is quite close to the arm machine, but with more
sensible names. It has 16 registers (below) and an instruction set that is useful for Soml.

Data movement related instructions implement an indexed get and set. There are also constant load,
integer operators and of course branches.
Instructions print their name and used registers r0-r15.

The next instruction to be executed is highlighted in blue. A list of previous instructions is shown.

One can follow the effect of instructions in the register view below.

### Status View

The last view at the top right shows the status of the machine (interpreter to be precise), the
instruction count and any stdout.

Current controls include stepping and three speeds of running the program.

- Next (green button) will execute exactly one instruction when clicked. Mostly useful when
debugging the compiler, ie inspecting the generated code.
- Crawl (first blue button) will execute at a moderate speed. One can still follow the
logic at the register level.
- Run (second blue button) runs the program at a higher speed where register instructions just
whizz by, but one can still follow the source view. Mainly used to verify that the source executes
as expected and also to get to a specific place in the program (in the absence of breakpoints).
- Wizz (third blue button) makes the program run so fast that its only useful function is to
fast forward in the code (while debugging).

### Register view

The bottom part of the screen is taken up by the 16 registers. As we execute an object oriented
language, we show the object contents if it is an object (not an integer) in a register.

The (virtual) machine only uses objects, and specifically a linked list of Message objects to
make calls. The current message is always in register 0 (analogous to a stack pointer).
All other registers are scratch for statement use.

In Soml, expressions compile to the register that holds the expression's value. Statements may use
all registers and may not rely on anything other than the message in register 0.


The Register view is now greatly improved, especially in its dynamic features:

- when the contents update the register obviously updates
- when the object that the register holds updates, the new value is shown immediately
- hovering over a variable will **expand that variable**.
- the hovering works recursively, so it is possible to drill down into objects for several levels

The last feature of inspecting objects is shown in the screenshot. This makes it possible
to very quickly verify the program's behaviour. As it is a pure object system, all data is in
objects, and all objects can be inspected.
36
typed/parfait.html.haml
Normal file
@@ -0,0 +1,36 @@
%hr/
%p
  layout: typed
  title: Parfait, a minimal runtime
  —
%h3#type-and-class Type and Class
%p
  Each object has a type that describes the instance variables and basic types of the object.
  Types also reference the class they implement.
  Type objects are unique and constant, and may not be changed over their lifetime.
  When a field is added to a class, a new Type is created. For a given class and combination
  of instance names and basic types, only one instance ever exists describing that type (a bit
  similar to symbols).
%p
  A Class describes a set of objects that respond to the same methods (the method’s source is stored
  in the RubyMethod class).
  A Type describes a set of objects that have the same instance variables.
%h3#method-message-and-frame Method, Message and Frame
%p
  The TypedMethod class describes a callable method. It carries a name, argument and local variable
  type and several descriptions of the code.
  The typed ast is kept for debugging, the register model instruction stream for optimisation
  and further processing and finally the cpu specific binary
  represents the executable code.
%p
  When TypedMethods are invoked, a message object (instance of Message class) is populated.
  Message objects are created at compile time and form a linked list.
  The data in the Message holds the receiver, return addresses, arguments and a frame.
  Frames are also created at compile time and just reused at runtime.
%h3#space-and-support Space and support
%p
  The single instance of Space holds a list of all Types and all Classes, which in turn hold
  the methods.
  The space also holds messages and will hold memory management objects like pages.
%p Words represent short immutable text; other word processing (buffers, text) is still tbd.
%p Lists (aka Array) are number indexed, starting at one, and dictionaries (aka Hash) are mappings from words to objects.
@@ -1,41 +0,0 @@
---
layout: typed
title: Parfait, a minimal runtime
---


### Type and Class

Each object has a type that describes the instance variables and basic types of the object.
Types also reference the class they implement.
Type objects are unique and constant, and may not be changed over their lifetime.
When a field is added to a class, a new Type is created. For a given class and combination
of instance names and basic types, only one instance ever exists describing that type (a bit
similar to symbols).

A Class describes a set of objects that respond to the same methods (the method's source is stored
in the RubyMethod class).
A Type describes a set of objects that have the same instance variables.
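
The uniqueness rule can be illustrated with a small interning table. This is a sketch in plain Ruby, not Parfait's actual Type class; the `Type.for` and `add_field` names and the hash-based field list are assumptions made for the example.

```ruby
# Hypothetical stand-in for Parfait's Type, for illustration only.
class Type
  attr_reader :object_class, :fields # fields: { name => basic type }

  @types = {}

  # Return the one Type for this class and field layout, creating it on first use.
  def self.for(object_class, fields)
    key = [object_class, fields.to_a]
    @types[key] ||= new(object_class, fields)
  end

  def initialize(object_class, fields)
    @object_class = object_class
    @fields = fields.dup.freeze
    freeze # a Type never changes after creation
  end

  # Adding a field does not mutate; it looks up (or creates) another Type.
  def add_field(name, type)
    Type.for(object_class, fields.merge(name => type))
  end

  private_class_method :new
end

point   = Type.for(:Point, { x: :Integer })
point3d = point.add_field(:y, :Integer)
puts point3d.equal?(Type.for(:Point, { x: :Integer, y: :Integer }))
```

Because every layout maps to exactly one object, two objects have the same layout exactly when their types are the identical object, which is what makes the comparison to symbols apt.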

### Method, Message and Frame

The TypedMethod class describes a callable method. It carries a name, argument and local variable
type and several descriptions of the code.
The typed ast is kept for debugging, the register model instruction stream for optimisation
and further processing and finally the cpu specific binary
represents the executable code.

When TypedMethods are invoked, a message object (instance of Message class) is populated.
Message objects are created at compile time and form a linked list.
The data in the Message holds the receiver, return addresses, arguments and a frame.
Frames are also created at compile time and just reused at runtime.
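
A rough sketch of the calling scheme described above, using plain Ruby structs. The field names, the `caller` back link and the helper methods are assumptions for illustration; Parfait's real Message and Frame classes may differ.

```ruby
# Hypothetical shapes loosely following the text, not Parfait's real classes.
Frame   = Struct.new(:locals)
Message = Struct.new(:receiver, :return_address, :arguments, :frame,
                     :next_message, :caller)

# Build the whole chain up front, standing in for "created at compile time".
def build_messages(count)
  first = nil
  count.times do
    first = Message.new(nil, nil, [], Frame.new({}), first, nil)
  end
  first
end

# A call populates the next message in the chain; nothing is allocated.
def invoke(current, receiver, arguments, return_address)
  callee = current.next_message
  raise "out of messages" unless callee
  callee.receiver = receiver
  callee.arguments = arguments
  callee.return_address = return_address
  callee.caller = current
  callee
end

# A return steps back to the caller's message.
def unwind(current)
  current.caller
end

main = build_messages(3)
inner = invoke(main, :some_object, [1, 2], 42)
puts unwind(inner).equal?(main)
```

Reusing preallocated messages and frames keeps allocation out of the call path, at the cost of a fixed maximum call depth in this simplified form.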

### Space and support

The single instance of Space holds a list of all Types and all Classes, which in turn hold
the methods.
The space also holds messages and will hold memory management objects like pages.

Words represent short immutable text; other word processing (buffers, text) is still tbd.

Lists (aka Array) are number indexed, starting at one, and dictionaries (aka Hash) are mappings from words to objects.
191
typed/syntax.html.haml
Normal file
@@ -0,0 +1,191 @@
%hr/
%p
  layout: typed
  title: Soml Syntax
  —
%h4#top-level-class-and-methods Top level Class and methods
%p The top level declarations in a file may only be class definitions
%pre
  %code
    :preserve
      class Dictionary < Object
        int add(Object o)
          ... statements
        end
      end
%p
  The class hierarchy is explained in
  = succeed "," do
    %a{:href => "parfait.html"} here
%p
  Methods must be typed, both arguments and return. Generally class names serve as types, but “int” can
  be used as a shortcut for Integer.
%p
  Unlike in ruby, code may not be outside method definitions. A compiled program starts at the builtin
  method
  = succeed "," do
    %strong init
  %strong Space.main
%p
  Classes are represented by class objects (instances of class Class to be precise) and methods by
  Method objects, so all information is available at runtime.
%h4#expressions Expressions
%p
  Soml distinguishes between expressions and statements. Expressions have value, statements perform an
  action. Both are compiled to Register level instructions for the current method. Generally speaking
  expressions store their value in a register and statements store those values elsewhere, possibly
  after operating on them.
%p The subsections below correspond roughly to the parser’s rule names.
%p
  %strong Basic expressions
  are numbers (integer or float), strings or names, either variable, argument,
  field or class names. (normal details applicable). Special names include self (the current
  receiver), and message (the currently executed method frame). These all resolve to a register
  with contents.
%pre
  %code
    :preserve
      23
      "hi there"
      argument_name
      Object
%p
  A
  %strong field access
  resolves to the field’s value at the time. Fields must be defined by
  field definitions, and are basically instance variables, but not hidden (see below).
  The example below shows how to define local variables at the same time. Notice chaining, both for
  field access and call, is not allowed.
%pre
  %code
    :preserve
      Type l = self.type
      Class c = l.object_class
      Word n = c.name
%p
  A
  %strong Call expression
  is a method call that resolves to the method’s return value. If no receiver is
  specified, self (the current receiver) is used. The receiver may be any of the basic expressions
  above, so also class instances. The receiver type is known at compile time, as are all argument
  types, so the class of the receiver is searched for a matching method. Many methods of the same
  name may exist, but to issue a call, an exact match for the arguments must be found.
%pre
  %code
    :preserve
      Class c = self.get_class()
      c.get_super_class()
%p
  An
  %strong operator expression
  is a binary expression, with either of the other expressions as left
  and right operand, and an operator symbol between them. Operand types must be integer.
  The symbols allowed are normal arithmetic and logical operations.
%pre
  %code
    :preserve
      a + b
      counter | 255
      mask >> shift
%p
  Operator expressions may be used in assignments and conditions, but not in calls, where the result
  would have to be assigned beforehand. This is one of those cases where soml’s low level approach
  shines through, as soml has no auto-generated temporary variables.
%h4#statements Statements
%p
  We have seen the top level statements above. In methods the most interesting statements relate to
  flow control and specifically how conditionals are expressed. This differs somewhat from other
  languages, in that the condition is expressed explicitly (not implicitly like in c or ruby).
  This lets the programmer express more precisely what is tested, and also opens an extensible
  framework for more tests than available in other languages. Specifically overflow may be tested in
  soml, without dropping down to assembler.
%p
  An
  %strong if statement
  is started with the keyword if_ and then contains the branch type. The branch
  type may be
  = succeed "." do
    %em plus, minus, zero, nonzero or overflow
  %em If
  may be continued with an
  = succeed "," do
    %em else
  %em end
%pre
  %code
    :preserve
      if_zero(a - 5)
        ....
      else
        ....
      end
%p
  A
  %strong while statement
  is very much like an if, with of course the normal loop semantics, and
  without the possible else.
%pre
  %code
    :preserve
      while_plus( counter )
        ....
      end
%p
  A
  %strong return statement
  returns a value from the current function. There are no void functions.
%pre
  %code
    :preserve
      return 5
%p
  A
  %strong field definition
  declares an instance variable on an object. It starts with the keyword
  field, must be in class (not method) scope and may not be assigned to.
%pre
  %code
    :preserve
      class Class < Object
        field List instance_methods
        field Type object_type
        field Word name
        ...
      end
%p
  A
  %strong local variable definition
  declares, and possibly assigns to, a local variable. Local variables
  are stored in frame objects, in fact they are instance variables of the current frame object.
  When resolving a name, the compiler checks argument names first, and then local variables.
%pre
  %code
    :preserve
      int counter = 0
%p
  Any of the expressions may be assigned to the variable at the time of definition. After a variable is
  defined it may be assigned to with an
  %strong assignment statement
  any number of times. The assignment
  is like an assignment during definition, without the leading type.
%pre
  %code
    :preserve
      counter = 0
%p Any of the expressions, basic, call, operator, field access, may be assigned.
%h3#code-generation-and-scope Code generation and scope
%p
  Compiling generates two results simultaneously. The more obvious is code for a function, but also an
  object structure of classes etc that capture the declarations. To understand the code part better
  the register abstraction should be studied, and to understand the object structure the runtime.
%p
  The register machine abstraction is very simple, and so is the code generation, in favour of a simple
  model. Especially in the area of register assignment, there is no magic and only a few simple rules.
%p
  The main one of those concerns main memory access ordering and states that object memory must
  be consistent at the end of the statement. Since there is only object memory in soml, this
  concerns all assignments, since all variables are either named or indexed members of objects.
  Also local variables are just members of the frame.
%p
  This obviously does leave room for optimisations as preliminary benchmarks show. But benchmarks also
  show that it is not such a big issue and much more benefit can be achieved by inlining.
148
typed/syntax.md
@@ -1,148 +0,0 @@
---
layout: typed
title: Soml Syntax
---


#### Top level Class and methods

The top level declarations in a file may only be class definitions

    class Dictionary < Object
      int add(Object o)
        ... statements
      end
    end

The class hierarchy is explained [here](parfait.html), but you can leave out the superclass
and Object will be assumed.

Methods must be typed, both arguments and return. Generally class names serve as types, but "int" can
be used as a shortcut for Integer.

Unlike in ruby, code may not be outside method definitions. A compiled program starts at the builtin
method \_\_init\_\_, that does the initial setup, and then jumps to **Space.main**

Classes are represented by class objects (instances of class Class to be precise) and methods by
Method objects, so all information is available at runtime.

#### Expressions

Soml distinguishes between expressions and statements. Expressions have value, statements perform an
action. Both are compiled to Register level instructions for the current method. Generally speaking
expressions store their value in a register and statements store those values elsewhere, possibly
after operating on them.

The subsections below correspond roughly to the parser's rule names.

**Basic expressions** are numbers (integer or float), strings or names, either variable, argument,
field or class names. (normal details applicable). Special names include self (the current
receiver), and message (the currently executed method frame). These all resolve to a register
with contents.

    23
    "hi there"
    argument_name
    Object

A **field access** resolves to the field's value at the time. Fields must be defined by
field definitions, and are basically instance variables, but not hidden (see below).
The example below shows how to define local variables at the same time. Notice chaining, both for
field access and call, is not allowed.

    Type l = self.type
    Class c = l.object_class
    Word n = c.name

A **Call expression** is a method call that resolves to the method's return value. If no receiver is
specified, self (the current receiver) is used. The receiver may be any of the basic expressions
above, so also class instances. The receiver type is known at compile time, as are all argument
types, so the class of the receiver is searched for a matching method. Many methods of the same
name may exist, but to issue a call, an exact match for the arguments must be found.

    Class c = self.get_class()
    c.get_super_class()
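
The exact-match rule for calls can be sketched as a lookup over a method table. The table layout, the second `add` overload and the `resolve` helper are hypothetical and only illustrate the rule; they are not the compiler's real data structures.

```ruby
# Hypothetical method table: class name => list of [name, argument types, return type].
METHODS = {
  "Dictionary" => [
    ["add", ["Object"], "Integer"],
    ["add", ["Word", "Object"], "Integer"]
  ]
}

# Find the method whose name and argument types match exactly, or fail.
def resolve(receiver_type, name, arg_types)
  candidates = METHODS.fetch(receiver_type, [])
  found = candidates.find { |n, args, _| n == name && args == arg_types }
  unless found
    raise ArgumentError, "no #{receiver_type}.#{name}(#{arg_types.join(', ')})"
  end
  found
end

p resolve("Dictionary", "add", ["Object"])
p resolve("Dictionary", "add", ["Word", "Object"])
```

In this sketch a call with a single `Word` argument is rejected even though a one-argument `add(Object)` exists, since no conversion or subtype matching is attempted.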
|
||||
|
||||
An **operator expression** is a binary expression, with either of the other expressions as left
|
||||
and right operand, and an operator symbol between them. Operand types must be integer.
|
||||
The symbols allowed are normal arithmetic and logical operations.
|
||||
|
||||
a + b
|
||||
counter | 255
|
||||
mask >> shift
|
||||
|
||||
Operator expressions may be used in assignments and conditions, but not in calls, where the result
|
||||
would have to be assigned beforehand. This is one of those cases where soml's low level approach
|
||||
shines through, as soml has no auto-generated temporary variables.
|
||||
|
||||
#### Statements
|
||||
|
||||
We have seen the top level statements above. In methods the most interesting statements relate to
|
||||
flow control and specifically how conditionals are expressed. This differs somewhat from other
|
||||
languages, in that the condition is expressed explicitly (not implicitly like in c or ruby).
|
||||
This lets the programmer express more precisely what is tested, and also opens an extensible
|
||||
framework for more tests than available in other languages. Specifically overflow may be tested in
|
||||
soml, without dropping down to assembler.
|
||||
|
||||
An **if statement** is started with the keyword if_ and then contains the branch type. The branch
|
||||
type may be *plus, minus, zero, nonzero or overflow*. The condition must be in brackets and can be
|
||||
any expression. *If* may be continued with an *else*, but doesn't have to be, and is ended with *end*.
|
||||
|
||||
if_zero(a - 5)
|
||||
....
|
||||
else
|
||||
....
|
||||
end
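
How the branch types might be evaluated can be sketched in Ruby. The text names the branch types but does not define them, so everything here is an assumption: 32-bit signed integers, zero counting as plus, and overflow meaning that the exact result left the 32-bit range. The helpers `wrap32`, `subtract` and `branch_taken?` are hypothetical.

```ruby
# Assumption: 32-bit signed integers and ARM-like flag meanings; the source
# text names the branch types but does not define them precisely.
INT_MIN = -(2**31)
INT_MAX = 2**31 - 1

# Wrap an arbitrary Integer into the signed 32-bit range.
def wrap32(value)
  ((value - INT_MIN) % 2**32) + INT_MIN
end

# Compute a - b and report the wrapped result plus whether it overflowed.
def subtract(a, b)
  exact = a - b
  { result: wrap32(exact), overflow: exact < INT_MIN || exact > INT_MAX }
end

# Decide whether a branch of the given type is taken for an outcome.
def branch_taken?(type, outcome)
  value = outcome[:result]
  case type
  when :zero     then value.zero?
  when :nonzero  then !value.zero?
  when :plus     then value >= 0 # assumption: zero counts as plus
  when :minus    then value.negative?
  when :overflow then outcome[:overflow]
  else raise ArgumentError, "unknown branch type #{type}"
  end
end

a = 5
puts branch_taken?(:zero, subtract(a, 5))           # models if_zero(a - 5)
puts branch_taken?(:overflow, subtract(INT_MIN, 1)) # models if_overflow(...)
```

The point of the sketch is that the test is named at the call site, so an overflow check is written the same way as a zero check.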
|
||||
|
||||
A **while statement** is very much like an if, with of course the normal loop semantics, and
|
||||
without the possible else.
|
||||
|
||||
while_plus( counter )
|
||||
....
|
||||
end
|
||||
|
||||
A **return statement** returns a value from the current function. There are no void functions.
|
||||
|
||||
return 5
|
||||
|
||||
|
||||
A **field definition** is to declare an instance variable on an object. It starts with the keyword
|
||||
field, must be in class (not method) scope and may not be assigned to.
|
||||
|
||||
class Class < Object
|
||||
field List instance_methods
|
||||
field Type object_type
|
||||
field Word name
|
||||
...
|
||||
end
|
||||
|
||||
A **local variable definition** declares, and possibly assigns to, a local variable. Local variables
|
||||
are stored in frame objects, in fact they are instance variables of the current frame object.
|
||||
When resolving a name, the compiler checks argument names first, and then local variables.
|
||||
|
||||
int counter = 0
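
The lookup order for names can be sketched as follows. This is a minimal Ruby illustration, assuming arguments and locals are kept as ordered lists of names; `resolve_name` and the returned pairs are hypothetical and not taken from the compiler.

```ruby
# Hypothetical sketch: names and the returned pairs are illustrative only.
# Arguments are checked first, then local variables of the frame.
def resolve_name(name, arguments, locals)
  if (index = arguments.index(name))
    [:argument, index]
  elsif (index = locals.index(name))
    [:local, index]
  else
    raise NameError, "unknown name #{name}"
  end
end

arguments = [:limit]
locals = [:counter, :limit]

puts resolve_name(:counter, arguments, locals).inspect
puts resolve_name(:limit, arguments, locals).inspect
```

Under this order an argument shadows a local variable of the same name, as the second lookup shows.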
|
||||
|
||||
Any of the expressions may be assigned to the variable at the time of definition. After a variable is
|
||||
defined it may be assigned to with an **assignment statement** any number of times. The assignment
|
||||
is like an assignment during definition, without the leading type.
|
||||
|
||||
counter = 0
|
||||
|
||||
Any of the expressions, basic, call, operator, field access, may be assigned.
|
||||
|
||||
### Code generation and scope
|
||||
|
||||
Compiling generates two results simultaneously. The more obvious is code for a function, but also an
|
||||
object structure of classes etc. that captures the declarations. To understand the code part better
|
||||
the register abstraction should be studied, and to understand the object structure the runtime.
|
||||
|
||||
The register machine abstraction is very simple, and so is the code generation, in favour of a simple
|
||||
model. Especially in the area of register assignment, there is no magic and only a few simple rules.
|
||||
|
||||
The main one of those concerns main memory access ordering and states that object memory must
|
||||
be consistent at the end of the statement. Since there is only object memory in soml, this
|
||||
concerns all assignments, since all variables are either named or indexed members of objects.
|
||||
Also local variables are just members of the frame.
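
The consistency rule can be made concrete with a Ruby sketch of what code generation for two consecutive assignments might emit. The instruction names (`:load`, `:store`, `:add`), the register names and `compile_assignment` are hypothetical and do not describe the real register machine; the sketch only assumes that every operand is a frame slot and that results are stored before the statement ends.

```ruby
# Hypothetical instruction names; the real register machine may differ.
# Compile `target = left <op> right`, where all three are frame slots.
def compile_assignment(target, left, op, right)
  [
    [:load,  :r1, :frame, left],   # read operands from the frame object
    [:load,  :r2, :frame, right],
    [op,     :r1, :r1, :r2],       # compute in registers
    [:store, :r1, :frame, target]  # write back before the statement ends
  ]
end

# counter = counter + step
# total   = total + counter
code = compile_assignment(:counter, :counter, :add, :step) +
       compile_assignment(:total, :total, :add, :counter)
code.each { |instruction| puts instruction.inspect }
```

In this sketch the second statement loads `counter` from memory again although its value was in a register one instruction earlier, which is the kind of redundancy an optimiser could remove.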
|
||||
|
||||
This obviously does leave room for optimisations as preliminary benchmarks show. But benchmarks also
|
||||
show that it is not such a big issue and much more benefit can be achieved by inlining.
|
57
typed/typed.html.haml
Normal file
@ -0,0 +1,57 @@
|
||||
%hr/
|
||||
%p
|
||||
layout: typed
|
||||
title: Typed intermediate representation
|
||||
—
|
||||
%h3#intermediate-representation Intermediate representation
|
||||
%p
|
||||
Compilers use different intermediate representations to go from the source code to a binary,
|
||||
which would otherwise be too big a step.
|
||||
%p
|
||||
The
|
||||
%strong typed
|
||||
intermediate representation is a strongly typed layer, between the dynamically typed
|
||||
ruby above, and the register machine below. One can think of it as a mix between c and c++,
|
||||
minus the syntax aspect. While in 2015, this layer existed as a language, (see soml-parser), it
|
||||
is now a tree representation only.
|
||||
%h4#object-oriented-to-the-core-including-calling-convention Object oriented to the core, including calling convention
|
||||
%p
|
||||
Types are modeled by the class Type and carry information about instance variable names
|
||||
and their basic type.
|
||||
%em Every object
|
||||
stores a reference
|
||||
to its type, and while
|
||||
= succeed "," do
|
||||
%strong types are immutable
|
||||
%p
|
||||
The object model, ie the basic properties of objects that the system relies on, is quite simple
|
||||
and explained in the runtime section. It involves a single reference per object.
|
||||
Also the object memory model is kept quite simple in that object sizes are always small multiples
|
||||
of the cache size of the hardware machine.
|
||||
We use object encapsulation to build up larger looking objects from these basic blocks.
|
||||
%p
|
||||
The calling convention is also object oriented, not stack based*. Message objects are used to
|
||||
define the data needed for invocation. They carry arguments, a frame and return address.
|
||||
The return address is pre-calculated and determined by the caller, so
|
||||
a method invocation may thus be made to return to an entirely different location.
|
||||
*(A stack, as used in c, is not typed, not object oriented, and as such a source of problems)
|
||||
%p
|
||||
There is no non-object based memory at all. The only global constants are instances of
|
||||
classes that can be accessed by writing the class name in ruby source.
|
||||
%h4#runtime--parfait Runtime / Parfait
|
||||
%p
|
||||
The typed representation layer depends on the higher layer to actually determine and instantiate
|
||||
types (type objects, or objects of class Type). This includes method arguments and local variables.
|
||||
%p
|
||||
The typed layer is mainly concerned with defining TypedMethods, for which arguments and local variables
|
||||
have a specified type (like in c). Basic Type names are the class names they represent,
|
||||
but the “int” may be used for brevity
|
||||
instead of Integer.
|
||||
%p
|
||||
The runtime, Parfait, is kept
|
||||
to a minimum, currently around 15 classes, described in detail
|
||||
= succeed "." do
|
||||
%a{:href => "parfait.html"} here
|
||||
%p
|
||||
Historically Parfait has been coded in ruby, as it was first needed in the compiler.
|
||||
This had the additional benefit of providing solid test cases for the functionality.
|
@ -1,53 +0,0 @@
|
||||
---
|
||||
layout: typed
|
||||
title: Typed intermediate representation
|
||||
---
|
||||
|
||||
### Intermediate representation
|
||||
|
||||
Compilers use different intermediate representations to go from the source code to a binary,
|
||||
which would otherwise be too big a step.
|
||||
|
||||
The **typed** intermediate representation is a strongly typed layer, between the dynamically typed
|
||||
ruby above, and the register machine below. One can think of it as a mix between c and c++,
|
||||
minus the syntax aspect. While in 2015, this layer existed as a language, (see soml-parser), it
|
||||
is now a tree representation only.
|
||||
|
||||
|
||||
#### Object oriented to the core, including calling convention
|
||||
|
||||
Types are modeled by the class Type and carry information about instance variable names
|
||||
and their basic type. *Every object* stores a reference
|
||||
to its type, and while **types are immutable**, the reference may change. The basic types every
|
||||
object is made up of include at least integer and reference (pointer).
|
||||
|
||||
The object model, ie the basic properties of objects that the system relies on, is quite simple
|
||||
and explained in the runtime section. It involves a single reference per object.
|
||||
Also the object memory model is kept quite simple in that object sizes are always small multiples
|
||||
of the cache size of the hardware machine.
|
||||
We use object encapsulation to build up larger looking objects from these basic blocks.
|
||||
|
||||
The calling convention is also object oriented, not stack based*. Message objects are used to
|
||||
define the data needed for invocation. They carry arguments, a frame and return address.
|
||||
The return address is pre-calculated and determined by the caller, so
|
||||
a method invocation may thus be made to return to an entirely different location.
|
||||
\*(A stack, as used in c, is not typed, not object oriented, and as such a source of problems)
|
||||
|
||||
There is no non-object based memory at all. The only global constants are instances of
|
||||
classes that can be accessed by writing the class name in ruby source.
|
||||
|
||||
#### Runtime / Parfait
|
||||
|
||||
The typed representation layer depends on the higher layer to actually determine and instantiate
|
||||
types (type objects, or objects of class Type). This includes method arguments and local variables.
|
||||
|
||||
The typed layer is mainly concerned with defining TypedMethods, for which arguments and local variables
|
||||
have a specified type (like in c). Basic Type names are the class names they represent,
|
||||
but the "int" may be used for brevity
|
||||
instead of Integer.
|
||||
|
||||
The runtime, Parfait, is kept
|
||||
to a minimum, currently around 15 classes, described in detail [here](parfait.html).
|
||||
|
||||
Historically Parfait has been coded in ruby, as it was first needed in the compiler.
|
||||
This had the additional benefit of providing solid test cases for the functionality.
|