diff --git a/_layouts/soml.html b/_layouts/soml.html
index 36b9eb6..607688c 100644
--- a/_layouts/soml.html
+++ b/_layouts/soml.html
@@ -17,6 +17,7 @@ layout: site
  • Soml
  • Syntax
  • Parfait
+ • Performance
  •
diff --git a/soml/benchmarks.md b/soml/benchmarks.md
new file mode 100644
index 0000000..5a5c05e
--- /dev/null
+++ b/soml/benchmarks.md
@@ -0,0 +1,58 @@
+---
+layout: soml
+title: Simple soml performance numbers
+---
+
+These benchmarks were made to identify places for optimization. This early on it is clear that
+performance is not outstanding, but still there were some surprises.
+
+
+- loop - program does empty loop of same size as hello
+- hello - output hello world (to /dev/null) to measure kernel calls (not terminal speed)
+- itos - convert integers from 1 to 100000 to string
+- add - run integer adds by linear fibonacci of 40
+- call - exercise calling by recursive fibonacci of 20
+
+Hello, itos and add run 100_000 iterations per program invocation to remove startup overhead.
+Call only has 10000 iterations, as it is much slower, executing about 10000 calls per invocation.
+
+Gcc was used to compile the c programs on the machine. The soml executables were produced by ruby (on another machine).
+
+### Results
+
+Results were measured by a ruby script. Mean and variance were measured until variance was low,
+always under one percent.
+
+The machine was a virtual arm run on a powerbook, performance roughly equivalent to a raspberry pi.
+But results should be seen as relative, not absolute.
+
+
+|language | loop | hello | itos | add | call | | loop | hello | itos | add | call |
+|-------------------------------------------------------------------------------------------------------------
+|c | 0,0500 | 2,1365 | 0,2902 | 0,1245 | 0,8535 | | + 33 % | + 79 % | | | |
+|soml | 0,0374 | 1,2071 | 0,7263 | 0,2247 | 1,3625 | | | | + 150% | + 80 % | + 60 % |
+
+
+### Discussion
+
+Surprisingly there are areas where soml code runs faster than c. Especially in the hello example this
+may not mean too much. Printf does caching and has a lot of functionality, so it may not be a straight
+comparison. The loop example is surprising and needs to be examined.
+
+The add example is slower because of the different memory model and lack of optimisation for soml.
+Every result of an arithmetic operation is immediately written to memory in soml, whereas c will
+keep things in registers as long as it can, which in the example is the whole time. This can
+be improved upon with register code optimisation, which can cut loads after writes and writes that
+are overwritten before calls or jumps are made.
+
+The call result was expected to be larger, as a typed model is used and runtime information (like the method
+name) is made available. It is actually a small price to pay for the ability to generate code at runtime
+and will of course reduce drastically with inlining.
+
+The itos example was also to be expected as it relies both on calling and on arithmetic. Also itos
+relies heavily on division by 10, which when coded in cpu specific assembler may easily be sped up
+by a factor of 2-3.
+
+All in all the results are encouraging as no optimization efforts have been made. Of course the
+most encouraging fact is that the system works and thus may be used as the basis of a dynamic
+code generator, as opposed to having to interpret.
diff --git a/soml/parfait.md b/soml/parfait.md
new file mode 100644
index 0000000..23e0f4a
--- /dev/null
+++ b/soml/parfait.md
@@ -0,0 +1,49 @@
+---
+layout: soml
+title: Parfait, soml's runtime
+---
+
+
+#### Overview
+
+Soml, like ruby, has open classes. This means that a class can be added to by loading another file
+with the same class definition that adds fields or methods. The effect of this is that in designing
+the runtime, we can concentrate on a minimal function set.
+
+This means all the functionality the compiler needs to get the job done, mostly class and type
+structure related functionality with its support.
+
+### Value and Object
+
+In soml Object is not the root of the class hierarchy, but Value is. Integer, Float and Object are
+derived from Value. So an integer is *not* an object, but still has a class and methods, just no
+instance variables.
+
+### Layout and Class
+
+Each object has a layout that describes the instance variables and types of the object. It also
+references the class of the object. Layout objects are constant and may not be changed over their
+lifetime. When a field is added to a class, a new layout is created.
+
+A Class describes a set of objects that respond to the same methods (methods are stored in the class).
+A Layout describes a set of objects that have the same instance variables.
+
+### Method, Message and Frame
+
+The Method class describes a declared method. It carries a name, argument names and types and
+several descriptions of the code. The parsed ast is kept for later inlining, the register model
+instruction stream for optimisation and further processing and finally the cpu specific binary
+represents the executable code.
+
+When a Method is invoked, a message object (instance of Message class) is populated. Message objects
+are created at compile time and form a linked list. The data in the Message holds the receiver,
+return addresses, arguments and a frame. Frames are also created at compile time and just reused
+at runtime.
+
+### Space and support
+
+The single instance of Space holds a list of all Classes, which in turn hold the methods.
+Also the space holds messages and will hold memory management objects like pages.
+
+Words represent short immutable text and other word processing (buffers, text) is still tbd.
+Lists are number indexed, starting at one, and dictionaries are mappings from words to objects.
diff --git a/soml/syntax.md b/soml/syntax.md
new file mode 100644
index 0000000..1386329
--- /dev/null
+++ b/soml/syntax.md
@@ -0,0 +1,146 @@
+---
+layout: soml
+title: Soml Syntax
+---
+
+
+#### Top level Class and methods
+
+The top level declarations in a file may only be class definitions
+
+    class Dictionary < Object
+      int add(Object o)
+        ... statements
+      end
+    end
+
+The class hierarchy is explained [here](./parfait.html), but you can leave out the superclass
+and Object will be assumed.
+
+Methods must be typed, both arguments and return. Generally class names serve as types, but int can
+be used as a shortcut for Integer.
+
+Unlike in ruby, code may not be outside method definitions. A compiled program starts at the builtin
+method __init__, which does the initial setup and then jumps to Object.main.
+
+Classes are represented by class objects and methods by Method objects, so all information is available
+at runtime.
+
+#### Expressions
+
+Soml distinguishes between expressions and statements. Expressions have value, statements perform an
+action. Both are compiled to Register level instructions for the current method. Generally speaking
+expressions store their value in a register and statements store those values elsewhere, possibly
+after operating on them.
+
+**Basic expressions** are numbers (integer or float), strings or names, either variable, argument,
+field or class names (normal details applicable). Special names include self (the current
+receiver), and message (the currently executed method frame). These all resolve to a register
+with contents.
+
+    23
+    "hi there"
+    argument_name
+    Object
+
+A **field access** resolves to the field's value at the time. Fields must be defined by
+field definitions, and are basically instance variables, but not hidden (see below).
+The example below shows how to define local variables at the same time. Notice that chaining, both for
+field access and call, is not allowed.
+
+    Layout l = self.layout
+    Class c = l.object_class
+    Word n = c.name
+
+A **Call expression** is a method call that resolves to the method's return value. If no receiver is
+specified, self (the current receiver) is used. The receiver may be any of the basic expressions
+above, so also class instances. The receiver type is known at compile time, as are all argument
+types, so the class of the receiver is searched for a matching method. Many methods of the same
+name may exist, but to issue a call, an exact match for the arguments must be found.
+
+    Class c = self.get_class()
+    c.get_super_class()
+
+An **operator expression** is a binary expression, with any of the other expressions as left
+and right operands, and an operator symbol between them. Operand types must be integer.
+The symbols allowed are normal arithmetic and logical operations.
+
+    a + b
+    counter | 255
+    mask >> shift
+
+Operator expressions may be used in assignments and conditions, but not in calls, where the result
+would have to be assigned beforehand. This is one of those cases where soml's low level approach
+shines through, as soml has no auto-generated temporary variables.
+
+#### Statements
+
+We have seen the top level statements above. In methods the most interesting statements relate to
+flow control and specifically how conditionals are expressed. This differs somewhat from other
+languages, in that the condition is expressed explicitly (not implicitly like in c or ruby).
+This lets the programmer express more precisely what is tested, and also opens an extensible
+framework for more tests than available in other languages. Specifically overflow may be tested in
+soml, without dropping down to assembler.
+
+An **if statement** is started with the keyword if_ and then contains the branch type. The branch
+type may be plus, minus, zero, nonzero or overflow. The condition must be in brackets and be any
+expression. If may be continued with an else, but doesn't have to be, and is ended with end.
+
+    if_zero(a - 5)
+      ....
+    else
+      ....
+    end
+
+A **while statement** is very much like an if, with of course the normal loop semantics, and
+without the possible else.
+
+    while_plus( counter )
+      ....
+    end
+
+A **return statement** returns a value from the current function. There are no void functions.
+
+    return 5
+
+
+A **field definition** declares an instance variable on an object. It starts with the keyword
+field, must be in class (not method) scope and may not be assigned to.
+
+    class Class < Object
+      field List instance_methods
+      field Layout object_layout
+      field Word name
+      ...
+    end
+
+A **local variable definition** declares and possibly assigns to a local variable. Local variables
+are stored in frame objects and they are last in search order. When resolving a name, the compiler
+checks argument names first, and then local variables.
+
+    int counter = 0
+
+Any of the expressions may be assigned to the variable at the time of definition. After a variable is
+defined it may be assigned to with an **assignment statement** any number of times. The assignment
+is like an assignment during definition, without the leading type.
+
+    counter = 0
+
+Any of the expressions, basic, call, operator, field access, may be assigned.
+
+### Code generation and scope
+
+Compiling generates two results simultaneously. The more obvious is the code for a function, but there is also an
+object structure of classes etc. that captures the declarations. To understand the code part better
+the register abstraction should be studied, and to understand the object structure the runtime.
+
+The register machine abstraction is very simple, and so is the code generation, in favour of a simple
+model. Especially in the area of register assignment, there is no magic and only a few simple rules.
+
+The main one of those concerns main memory access ordering and states that object memory must
+be consistent at the end of the statement. Since there is only object memory in soml, this
+concerns all assignments, since all variables are either named or indexed members of objects.
+Also local variables are just members of the frame.
+
+This obviously does leave room for optimisation as preliminary benchmarks show. But benchmarks also
+show that it is not such a big issue and much more benefit can be achieved by inlining.
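
Reviewer note on the optimisation the patch mentions (cutting loads that follow writes): the idea can be pictured with a minimal Python sketch. This is not soml's or the compiler's actual code. The tuple instruction format, the register and slot names, and the function `cut_loads_after_writes` are all assumptions made up for illustration. The sketch drops a load only when the same register is already known to hold the slot's value, and forgets everything at calls, labels and branches.

```python
from typing import Dict, List, Tuple

Instr = Tuple[str, ...]  # hypothetical format, e.g. ("load", reg, slot)


def _forget(known: Dict[str, str], reg: str) -> None:
    """Forget every slot whose cached value lived in `reg` (it is overwritten)."""
    for slot in [s for s, r in known.items() if r == reg]:
        del known[slot]


def cut_loads_after_writes(code: List[Instr]) -> List[Instr]:
    """Drop loads of a slot into a register that already holds that slot's value."""
    known: Dict[str, str] = {}  # slot -> register holding its current value
    out: List[Instr] = []
    for ins in code:
        kind = ins[0]
        if kind in ("call", "label", "branch"):
            known.clear()                 # control flow: assume nothing survives
        elif kind == "store":             # ("store", reg, slot)
            _, reg, slot = ins
            known[slot] = reg
        elif kind == "load":              # ("load", reg, slot)
            _, reg, slot = ins
            if known.get(slot) == reg:
                continue                  # redundant load, skip it
            _forget(known, reg)
            known[slot] = reg
        else:                             # arithmetic, e.g. ("add", dst, a, b)
            _forget(known, ins[1])
        out.append(ins)
    return out


# Naive code for two increments of a frame variable followed by a call.
naive: List[Instr] = [
    ("load", "r1", "frame.counter"),
    ("add", "r1", "r1", "r2"),
    ("store", "r1", "frame.counter"),
    ("load", "r1", "frame.counter"),   # redundant: r1 was just stored there
    ("add", "r1", "r1", "r2"),
    ("store", "r1", "frame.counter"),
    ("call", "puti"),
    ("load", "r1", "frame.counter"),   # kept: the call is treated as a barrier
]

optimised = cut_loads_after_writes(naive)
```

Every store is kept, so object memory is still consistent at the end of each statement; only the reload between the two additions disappears.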