move posts in directory by year

2019-12-07 11:30:52 +02:00
parent 5ce2d1b625
commit 285c6531e4
38 changed files with 8 additions and 11 deletions
@@ -0,0 +1,30 @@
+%p
+  Well, it has been a good holiday, two months in Indonesia, Bali and diving Komodo. It brought
+  clarity, and so i have to start a daunting task.
+%p
+  When i learned programming at University, they were still teaching Pascal. So when I got to choose
+  c++ in my first bigger project that was a real step up. But even as i wrestled with templates, it was
+  Smalltalk that took my heart immediately when i read about it. And I read quite a bit, including the Blue Book about the implementation of it.
+%p
+  The next distinct step up was Java, in 1996, and then ruby in 2001. Then i mostly stopped coding
+  in 2004, when i moved to the countryside and started our
+  = succeed "." do
+    %a{:href => "http://villataika.fi/en/index.html"} B&amp;B
+  But then we needed web-pages, and before long a pos for our shop, so i was back on the keyboard.
+  And since it was a thing i had been wanting to do, I wrote a database.
+%p
+  Purple was my current idea of an ideal data-store. Save by reachability, automatic loading by
+  traversal and schema-free saving of any ruby object. In memory, based on Judy, it did about 2000
+  transactions per second. Alas, it didn’t have any searching.
+%p
+  So i bit the bullet and implemented an sql interface to it. After a failed attempt with rails 2
+  and after 2 major rewrites i managed to integrate what by then was called warp into Arel (rails3).
+  But while raw throughput was still about the same, when it had to go through Arel it crawled to 50
+  transactions per second, about the same as sqlite.
+%p
+  This was maybe 2011, and there was no doubt anymore. Not the database, but ruby itself was the
+  speed hog. I aborted.
+%p
+  In 2013 I bought a Raspberry Pi and of course I wanted to use it with ruby. Alas… Slow pi + slow ruby = nischt gut.
+  I gave up.
+%p So then the clarity came with the solution, build your own ruby. I started designing a bit on the beach already.
+%p Still, daunting. But maybe just possible….
@@ -0,0 +1,29 @@
+%h2#the-c-machine The c machine
+%p Software engineers have clean brains, scrubbed into full c alignment through decades. A few rebels (klingons?) remain on embedded systems, but of those most strive towards posix compliance too.
+%p In other words, since all programming ultimately boils down to c, libc makes the bridge to the kernel/machine. All ….  all but a small village in the northern (cold) parts of europe (Antskog) where …
+%p So i had a look at what we are talking about.
+%h2#the-issue The issue
+%p
+  Many, especially embedded guys, have noticed that your standard c library has become quite heavy
+  (2 Megs). That is no wonder, since it provides a defined api (posix) and a lot of functionality on a plethora of systems (os’s) and cpu’s, even for different ABI’s (application binary interfaces) and compilers/linkers.
+%p uClibc or dietlibc get the size down, especially diet quite a bit (130k). So that’s ok then. Or is it?
+%p Then i noticed that the real issue is not the size. Even my pi has 512 MB, and of course even libc gets paged.
+%p The real issue is the step into the C world. That means extern functions and call marshalling, and the question is what for.
+%p After all the c library was created to make it easier for c programs to use the kernel. And i have no intention of coding any more c.
+%h2#ruby-corestd-lib ruby core/std-lib
+%p Of course the ruby-core and std libs were designed to do for ruby what libc does for c. Unfortunately they are badly designed and suffer from the above brainwash (designed around c calls).
+%p
+  Since salama is pure ruby there is a fair amount of functionality that would be nicer to provide straight in ruby. As gems of course, for everybody to see and fix.
+  For example, even if there were to be a printf (which i dislike), it would be easy to code in ruby.
+%p What is needed is the underlying write to stdout.
+%h2#solution Solution
+%p To get salama up and running, ie to have a “ruby” executable, there are really very few kernel calls needed. File open, read and stdout write, brk.
+%p So the way this will go is to write syscalls where needed.
+%p Having tried to reverse engineer uc, diet and musl, it seems best to go straight to the source.
+%p
+  Most of that is of course for intel, but the syscall number that intel puts in eax goes into r7 on arm, and after that the args go into r0 and up, so not too bad. The definitive guide for arm is here
+  %a{:href => "http://sourceforge.net/p/strace/code/ci/master/tree/linux/arm/syscallent.h"} http://sourceforge.net/p/strace/code/ci/master/tree/linux/arm/syscallent.h
+  But it doesn’t include the arguments (only the number of them), so
+  %a{:href => "http://syscalls.kernelgrok.com/"} http://syscalls.kernelgrok.com/
+  can be used.
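+%p
+  As a rough sketch of that convention (not salama’s actual code), the register setup for a syscall can be written down in plain ruby.
+  The names SYSCALL_NUMBERS and syscall_registers are made up for illustration; the numbers are the arm EABI linux ones for the few calls mentioned above.

```ruby
# Sketch only: ARM EABI Linux syscall numbers for the calls mentioned above.
# SYSCALL_NUMBERS and syscall_registers are hypothetical names, not salama's API.
SYSCALL_NUMBERS = {
  exit:  1,
  read:  3,
  write: 4,
  open:  5,
  brk:   45
}.freeze

# Returns the register assignment needed before the software interrupt (swi 0):
# the syscall number goes into r7, the arguments into r0, r1, r2, ...
def syscall_registers(name, *args)
  number = SYSCALL_NUMBERS.fetch(name)
  registers = { r7: number }
  args.each_with_index { |arg, index| registers[:"r#{index}"] = arg }
  registers
end

STDOUT_FD = 1
message = "Hello\n"
# 0x8000 stands in for the address of the string, which an assembler would fill in.
p syscall_registers(:write, STDOUT_FD, 0x8000, message.length)
```

+%p An assembler layer would turn such a table into mov instructions followed by the software interrupt.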
+%p So there, getting more metal by the minute. But the time from writing this to a hello world was 4 hours.
@@ -0,0 +1,12 @@
+%p Part of the reason why i even thought this was possible was because i had bumped into Metasm.
+%p
+  Metasm creates native code in 100% ruby. Either from Assembler or even C (partially). And for many cpu’s too.
+  It also creates many binary formats, elf among them.
+%p
+  Still, i wanted something small that i could understand easily as it was clear it would have to be changed to fit.
+  As there was no external assembler file format planned, the whole approach of starting from parsing was inappropriate.
+%p
+  I luckily found a small library, as, that did arm only and was just a few files. After removing unneeded parts
+  like parsing, and some reformatting, i added an assembler-like dsl.
+%p This layer (arm subdirectory) said hello after about 2 weeks of work.
+%p I also got qemu to work and can thus develop without the actual pi.
@@ -0,0 +1,29 @@
+%p Parsing is difficult, the theory incomprehensible and the older tools cryptic. At least for me.
+%p
+  And then i heard recursive descent is easy and used even by llvm. Formalised as peg, parsing libraries exist, and in ruby
+  they have dsl’s and are suddenly quite understandable.
+%p
+  Of the candidates i had first very positive experiences with treetop. Upon continuing i found the code
+  generation aspect not just clumsy (after all you can define methods in ruby), but also interfering unnecessarily
+  with code control. On top of that conversion into an AST was not easy.
+%p After looking around i found Parslet, which pretty much removes all those issues. Namely
+%ul
+  %li It does not generate code, it generates methods. And has a nice dsl.
+  %li
+    It transforms to ruby basic types and has the notion of a transformation.
+    So an easy and clean way to create an AST.
+  %li One can use ruby modules to partition a larger parser
+  %li Minimal dependencies (one file).
+  %li Active use and development.
+%p
+  So i was sold, and i got up to speed quite quickly. But i also found out how fiddly such a parser is with regard
+  to ordering and whitespace.
+%p
+  I spent some time to make quite a solid test framework, testing the different rules separately and also the
+  stages separately, so things would not break accidentally when growing.
+%p
+  After about another 2 weeks i was able to parse functions, both calls and definitions, ifs and whiles and of course basic
+  types of integers and strings.
+%p
+  With the great operator support it was a breeze to create all 15-ish binary operators. Even Array and Hash constant
+  definition was very quick. All in all surprisingly painless, thanks to Kasper!
@@ -0,0 +1,24 @@
+%p Both “ends”, parsing and machine code, were relatively clear cut. Now it is into unknown territory.
+%p I had ported the Kaleidoscope llvm tutorial language to ruby-llvm last year, so there were some ideas floating.
+%p
+  The idea of basic blocks, as the smallest unit of code without branches, was pretty clear. Using those as jump
+  targets was also straightforward. But how to get from the AST to arm Instructions was not, and took some trying out.
+%p
+  In the end, or rather now, it is the AST layer that “compiles” itself into the Vm layer. The Vm layer then assembles
+  itself into Instructions.
+%p
+  General instructions are part of the Vm layer, but the code picks up derived classes and thus makes machine
+  dependent code possible. So far so ok.
+%p
+  Register allocation was (and is) another story. Argument passing and local variables do work now, but there is definitely
+  room for improvement there.
+%p
+  To get anything out of a running program i had to implement putstring (easy) and putint (difficult). Surprisingly
+  division is not easy and when pinned to 10 (divide by 10) quite strange. Still it works. While i was at it, writing
+  assembler, i found a fibonacci in 10 or so instructions.
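+%p
+  To show why divide by 10 looks strange without a divide instruction, here is the well known shift-and-add technique in ruby.
+  This is a sketch of the general trick for unsigned 32 bit values, not necessarily the routine salama uses; div10 and putint_digits are names chosen for the example.

```ruby
# Divide an unsigned 32 bit value by 10 using only shifts, adds and a compare.
# Each line maps to one or two arm instructions.
def div10(n)
  q = (n >> 1) + (n >> 2)        # q = n * 0.75
  q += q >> 4
  q += q >> 8
  q += q >> 16                   # q is now just under n * 0.8
  q >>= 3                        # q is n / 10, or one too small
  r = n - (((q << 2) + q) << 1)  # remainder estimate: n - q * 10
  r > 9 ? q + 1 : q              # correct the quotient if the remainder is too big
end

# The digits of a non-negative integer, least significant digit computed first,
# which is the loop a putint needs.
def putint_digits(n)
  digits = []
  loop do
    q = div10(n)
    digits.unshift(n - q * 10)
    n = q
    break if n.zero?
  end
  digits.join
end
```

+%p The approximation multiplies by roughly 0.8 and shifts right by 3, and the final compare fixes the cases where the estimate is one short.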
+%p
+  To summarise, function definition and calling (including recursion) works.
+  If and while structures work, as do some operators, and now it’s easy to add more.
+%p
+  So we have a Fibonacci in ruby using a while implementation that can be executed by salama and outputs the
+  correct result. After a total of 7 weeks this is much more than expected!
@@ -0,0 +1,44 @@
+%p It’s such a nice name, crystal. My first association is clarity, and that is exactly what i am trying to achieve.
+%p But i’ve been struggling a bit to achieve any clarity on the topic of system boundary: where does OO stop? I mean i can’t very well define method lookup in ruby syntax, as that involves method lookups. But tail recursion is so boring, it just never stops!
+%h4#kernel Kernel
+%p In the design phase (yes there was one!), i had planned to use lambdas. A little naive maybe, as they are of course objects. Thus calling them means a method resolution.
+%p So i’m settling for Module methods. I say settling because that of course always makes the module object available, though i don’t see any use for it. A waste of space (one register) and time (loading it), but no better ideas are forthcoming.
+%p The place for these methods, and i’ll go into which ones in a second, is the Kernel. And finally the name makes sense too. That is its original (pre 1.9) place, as a module that Object includes, ie “below” even Object.
+%p So Kernel is the place for methods that are needed to build the system, and may not be called on objects. Simple.
+%p In other words, anything that can be coded on normal objects, should. But when that stops being possible, Kernel is the place.
+%p And what are these functions? get_instance_variable or set too. Same for functions. Strangely these may in turn rely on functions that can be coded in ruby, but at the heart of the matter is an indexed operation ie object[2].
+%p This functionality, ie getting the n’th data in an object, is essential, but c makes such a good point of it having no place in a public api. So it needs to be implemented in a “private” part and used in a safe manner. More on the layers emerging below.
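+%p
+  A small ruby sketch of that split, under the assumption that every object knows an ordered list of its variable names.
+  SketchObject, internal_get and internal_set are hypothetical names; only get_instance_variable and the indexed access come from the text above.

```ruby
# Sketch: named instance variable access built on a private indexed operation.
class SketchObject
  def initialize(layout)
    @layout = layout                      # ordered instance variable names
    @slots  = Array.new(layout.length)    # the raw data, addressed by index
  end

  # Public and safe: the name is checked before any index is used.
  def get_instance_variable(name)
    internal_get(index_of(name))
  end

  def set_instance_variable(name, value)
    internal_set(index_of(name), value)
  end

  private

  def index_of(name)
    @layout.index(name) || raise(ArgumentError, "no instance variable #{name}")
  end

  # The "object[2]" style access that has no place in a public api.
  def internal_get(index)
    @slots[index]
  end

  def internal_set(index, value)
    @slots[index] = value
  end
end

point = SketchObject.new([:x, :y])
point.set_instance_variable(:y, 7)
puts point.get_instance_variable(:y)
```

+%p The raw indexing stays private, so callers can only reach slots that the layout actually names.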
+%p The Kernel is a module in salama that defines functions which return function objects. So the code is generated, instead of parsed. An essential distinction.
+%h4#system System
+%p
+  It’s an important side note on that Kernel definition above, that it is
+  %em not
+  the same as the system access functions. These are in their own Module and may (or must) use the kernel to implement their functionality. But they are not the same.
+%p Kernel is the VM’s “core” if you want.
+%p System is the access to the operating system functionality.
+%h4#layers Layers
+%p So from that Kernel idea have now emerged 3 Layers, 3 ways in which code is created.
+%h5#machine Machine
+%p The lowest layer is the Machine layer. This Layer generates Instructions or sequences thereof. So of course there is an Instruction class with derived classes, but also Block, the smallest linear sequence of Instructions.
+%p Also there is an abstract RegisterMachine that is mostly a mediator to the current implementation (ArmMachine). The machine has functions that create Instructions.
+%p Some few machine functions return Blocks, or append their instructions to blocks. This is really more a macro layer. Usually they are small, but div10 for example is a real 10 instruction beauty.
+%h5#kernel-1 Kernel
+%p The Kernel functions return function objects. Kernel functions have the same name as the function they implement, so Kernel::putstring defines a function called putstring. Function objects (Vm::Function) carry entry/exit/body code, receiver/return/argument types and a little more.
+%p The important thing is that these functions are callable from ruby code. Thus they form the glue from the next layer up, which is coded in ruby, to the machine layer. In a way the Kernel “exports” the machine functionality to salama.
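+%p
+  The shape of such a Kernel function could be sketched like this.
+  SketchVm and SketchKernel are stand-in modules, and the instruction tuples are purely illustrative; the real Vm::Function and machine layer carry more than shown here.

```ruby
# Sketch: a kernel function does not run putstring, it builds the function object.
module SketchVm
  # A reduced stand-in for Vm::Function (assumed fields, for illustration only).
  Function = Struct.new(:name, :argument_types, :return_type, :entry, :body, :exit,
                        keyword_init: true)
end

module SketchKernel
  # Same name as the function it defines, as described above.
  def self.putstring
    SketchVm::Function.new(
      name: :putstring,
      argument_types: [:string],
      return_type: :integer,
      entry: [[:push, :lr]],                 # illustrative instruction tuples
      body:  [[:mov, :r7, 4], [:swi, 0]],    # write syscall
      exit:  [[:pop, :pc]]
    )
  end
end

function = SketchKernel.putstring
puts "#{function.name} has #{function.body.length} body instructions"
```

+%p The point is that the code is generated by calling ruby, not parsed from ruby source.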
+%h5#parfait Parfait
+%p Parfait is a thin layer implementing a mini-minimal OO system. Sure, all your usual suspects of string and integers are there, but they only implement what is really really necessary. For example strings mainly have new, equals and put.
+%p Parfait is heavy on Object/Class/Metaclass functionality, object instance and method lookup. All things needed to make an OO system OO. Not so much “real” functionality here, more creating the ability for that.
+%p Stdlib would be the next layer up, implementing the whole of ruby functionality in terms of what Parfait provides.
+%p The important thing here is that Parfait is written completely in ruby. Meaning it gets parsed by salama like any other code, and then transformed into executable form and written.
+%p Any executable that salama generates will have Parfait in it. But only the final version of salama as a ruby vm, will have the whole stdlib and parser along.
+%h4#salama Salama
+%p
+  Salama uses the Kernel and Machine layers straight when creating code. Of course.
+  The closest equivalent to salama would be a compiler and so it is its job to create code (machine layer objects).
+%p But it is my intention to keep that as small as possible. And the good news is it’s all ruby :-)
+%h5#extensions Extensions
+%p I just want to mention the idea of extensions that is a logical step for a minimal system. Of course they would be gems, but the interesting thing is they (like salama) could:
+%ul
+  %li use salama’s existing kernel/machine abstraction to define new functionality that is not possible in ruby
+  %li define new machine functionality, adding kernel type api’s, to create wholly new, possibly hardware specific functionality
+%p I am thinking graphic acceleration, GPU usage, vector api’s, that kind of thing. In fact i aim to implement the whole floating point functionality as an extension (as it is clearly not essential for OO).
@@ -0,0 +1,56 @@
+%p
+  I was just reading my ruby book, wondering about functions and blocks and the like, as one does when implementing
+  a vm. Actually the topic i was struggling with was receivers, the pesky self, when i got the exception.
+%p And while they say two steps forward, one step back, this goes the other way around.
+%h3#one-step-back One step back
+%p
+  As I just learnt assembler, it is the first time i am really considering how functions are implemented, and how the stack is
+  used in that. Sure i heard about it, but the details were vague.
+%p
+  Of course a function must know where to return to. I mean the memory-address, as this can’t very
+  well be fixed at compile time. In effect this must be passed to the function. But as programmers we
+  don’t want to have to do that all the time and so it is passed implicitly.
+%h5#the-missing-link The missing link
+%p
+  The arm architecture makes this nicely explicit. There, a call is actually called branch with link.
+  This rubbed me the wrong way for a while as it struck me as an exceedingly bad name. Until i “got it”,
+  that is. The link is the link back, well that was simple. But the thing is that the “link” is
+  put into the link register.
+%p
+  This never struck me as meaningful, until now. Of course it means that “leaf” functions do not
+  need to touch it. Leaf functions are functions that do not call other functions, though they may
+  do syscalls as the kernel restores all registers. In other cpu’s the return address is pushed on
+  the stack, but in arm you have to do that yourself. Or not and save the instruction if you’re so inclined.
+%h5#the-hidden-argument The hidden argument
+%p
+  But the point here is, that this makes it very explicit. The return address is in effect just
+  another argument. It usually gets passed automatically by compiler generated code, but
+  nevertheless, it is an argument.
+%p
+  The “step back” is to make this argument explicit in the vm code. Thus making its handling,
+  ie passing or saving, explicit too. And thus having less magic going on, because you can’t
+  understand magic (you gotta believe it).
+%h3#two-steps-forward Two steps forward
+%p And so the thrust becomes clear i hope. We are talking about exceptions after all.
+%p
+  Because to those who have not read the windows calling convention on exception handling or even
+  heard of the dwarf specification thereof, i say don’t. It melts the brain.
+  You have to be so good at playing computer in your head, it’s not healthy.
+%p
+  Instead, we make things simple and explicit. An exception is after all just a different way for
+  a function to return. So we need an address for it to return to.
+%p
+  And as we have just made the normal return address an explicit argument, we just make the
+  exception return address an argument too. And presto.
+%p
+  Even just the briefest of considerations of how we generate those exception return addresses
+  (landing pads? what a strange name), leads to the conclusion that if a function does not do
+  any exception handling, it just passes the same address on, that it got itself. Thus a
+  generated exception would jump clear over such a function.
+%p
+  Since we have now got the exceptions to be normal code (alas with an exceptional name :-)) control
+  flow to and from it becomes quite normal too.
+%p
+  To summarize, each function now has a minimum of three arguments: the self, the return address and
+  the exception address.
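+%p
+  In ruby terms the idea can be imitated with lambdas standing in for the two addresses.
+  This is only an analogy under that assumption (real return addresses are not objects), and divide, average and safe_average are invented for the example.

```ruby
# Sketch: every function gets a normal return target and an exception return target.
# Lambdas stand in for the two addresses.

def divide(a, b, return_to, raise_to)
  return raise_to.call("division by zero") if b == 0
  return_to.call(a / b)
end

# No exception handling here, so the same raise_to is simply passed on.
# An exception raised in divide jumps clear over this function.
def average(sum, count, return_to, raise_to)
  divide(sum, count, return_to, raise_to)
end

# This function handles the exception: it passes its own landing pad down.
def safe_average(sum, count, return_to, raise_to)
  landing_pad = ->(_error) { return_to.call(0) }
  average(sum, count, return_to, landing_pad)
end

ok    = ->(value) { [:ok, value] }
error = ->(message) { [:error, message] }

p average(10, 2, ok, error)
p average(10, 0, ok, error)
p safe_average(10, 0, ok, error)
```

+%p Raising is then nothing more than calling (branching to) the second target instead of the first.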
+%p We have indeed taken a step forward.
@@ -0,0 +1,44 @@
+%p I am not stuck. I know i’m not. Just because there is little visible progress doesn’t mean i’m stuck. It may just feel like it though.
+%p But like little cogwheels in the clock, i can hear the background process ticking away and sometimes there is a gong.
+%p What i wasn’t stuck with, is where to draw the layer for the vm.
+%h3#layers Layers
+%p
+  Software engineers like layers. Like the onion boy. You can draw boxes, make presentations and convince your boss.
+  They help us to reason about the software.
+%p
+  In this case the model was to go from ast layer to a vm layer. Via a compile method, that could just as well have been a
+  visitor.
+%p
+  That didn’t work, too big a step and so it was from ast, to vm, to neumann. But i couldn’t decide
+  on the abstraction of the virtual machine layer. Specifically, when you have a send (and you have
+  soo many sends in ruby), do you:
+%ul
+  %li model it as a vm instruction (a bit like java)
+  %li implement it in a couple of instructions like resolve, a loop and call
+  %li go to a version that is clearly translatable to neumann, say without the value type implementation
+%p
+  Obviously the third is where we need to get to, as the next step is the neumann layer and somehow
+  we need to get there. In effect one could take those three and present them as layers, not
+  as alternatives like i have.
+%h3#passes Passes
+%p
+  And then the little cog went click, and the idea of passes resurfaced. LLVM has these passes on
+  the code tree, which is probably where it surfaced from.
+%p
+  So we can have as high a degree of abstraction as possible when going from ast to code.
+  And then have as many passes over that as we want / need.
+%p
+  Passes can be order dependent, and create more and more detail. To solve the above layer
+  conundrum, we just do a pass for each of those options.
+%p The two main benefits that come from this are:
+%p
+  1 - At each point, ie after and during each pass we can analyse the data. Imagine for example
+  that we would have picked the second layer option, that means there would never have been a
+  representation where the sends would have been explicit. Thus any analysis of them would be impossible or need reverse engineering (eg call graph analysis, or class caching).
+%p
+  2 - Passes can be gems or come from other sources. The mechanism can be relatively oblivious to
+  specific passes. And they make the transformation explicit, ie easier to understand.
+  In the example of having picked the second layer level, one would have to patch the
+  implementation of that transformation to achieve a different result. With passes it would be
+  a matter of replacing a pass, thus explicitly stating “i want a non-standard send implementation”.
+%p Actually a third benefit is that it makes testing simpler. More modular. Just test the initial ast-&gt;code and then mostly the results of passes.
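+%p
+  A minimal sketch of the pass idea in ruby, assuming code is just a list of instruction objects.
+  Send, Resolve, Call, Other, ExpandSends and Pipeline are hypothetical names for the example and not salama classes.

```ruby
# Hypothetical instruction types for the sketch.
Send    = Struct.new(:receiver, :name)   # high level: send name to receiver
Resolve = Struct.new(:receiver, :name)   # find the method for name
Call    = Struct.new(:receiver)          # call whatever Resolve found
Other   = Struct.new(:text)              # anything the pass leaves alone

# A pass takes a list of instructions and returns a new, more detailed list.
class ExpandSends
  def run(code)
    code.flat_map do |instruction|
      if instruction.is_a?(Send)
        [Resolve.new(instruction.receiver, instruction.name),
         Call.new(instruction.receiver)]
      else
        [instruction]
      end
    end
  end
end

class Pipeline
  def initialize(passes)
    @passes = passes
  end

  # Runs the passes in order; the optional block sees the code after each pass,
  # which is where analysis could hook in.
  def run(code)
    @passes.reduce(code) do |current, pass|
      result = pass.run(current)
      yield(pass, result) if block_given?
      result
    end
  end
end

code = [Other.new("load self"), Send.new(:self, :puts)]
# Analysis on the high level form is still possible before any pass has run.
sends_before = code.count { |instruction| instruction.is_a?(Send) }
expanded = Pipeline.new([ExpandSends.new]).run(code)
puts "#{sends_before} send(s) became #{expanded.length - 1} instructions"
```

+%p Swapping ExpandSends for another object with a run method is all it takes to get a different send implementation, and each pass can be tested on its own.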
@@ -0,0 +1,77 @@
+%p In a picture, or when taking a picture, the frame is very important. It sets whatever is in the picture into context.
+%p
+  So it is a bit strange that having a
+  %strong frame
+  had the same sort of effect for me in programming.
+  I made the frame explicit, as an object, with functions and data, and immediately the whole
+  message sending became a whole lot clearer.
+%p
+  You read about frames in calling conventions, or otherwise when talking about the machine stack.
+  It is the area a function uses for storing data, be it arguments, locals or temporary data.
+  Often a frame pointer will be used to establish a frames dynamic size and things like that.
+  But since it’s all so implicit and handled by code very few programmers ever see, it was
+  all a bit muddled for me.
+%p My frame has: return and exceptional return address, self, arguments, locals, temps
+%p and methods to: create a frame, get a value to or from a slot or args/locals/tmps, return or raise
+%h3#the-divide-compile-and-runtime The divide, compile and runtime
+%p
+  I saw
+  %a{:href => "http://codon.com/compilers-for-free"} Tom’s video on free compilers
+  and read the underlying
+  book on
+  %a{:href => "http://www.itu.dk/people/sestoft/pebook/jonesgomardsestoft-a4.pdf"} Partial Evaluation
+  a bit, and it helped to make the distinctions clearer. As did the Layers and Passes post.
+  And the explicit Frame.
+%p
+  The explicit frame established the vm explicitly too, or much better. All actions of the vm happen
+  in terms of the frame. Sending is creating a new one, loading it, finding the method and branching
+  there. Getting and setting variables is just indexing into the frame at the right index and so on.
+  Instance variables are a send to self, and on it goes.
+%p
+  The great distinction is at the end quite simple, it is compile-time or run-time. And the passes
+  idea helps in that i start with the simplest implementation against my vm. Then i have a data structure and can keep expanding it to “implement” more detail. Or i can analyse it to save
+  redundancies, ie optimize. But the point is in both cases i can just think about data structures
+  and what to do with them.
+%p
+  And what i can do with my data (which is of course partially instruction sequences, but that’s beside the point) really always depends on the great question: compile time vs run-time.
+  What is constant, i can do immediately. Otherwise leave for later. Simple.
+%p
+  An example, attribute accessor: a simple send. I build a frame, set the self. Now a fully dynamic
+  implementation would leave it at that. But i can check if i know the type, and if it’s not a
+  reference (ie an integer) we can raise immediately. Also a reference tags the class for when
+  that is known at compile time. If so i can determine the layout at compile time and inline the
+  get’s implementation. If not i could cache, but that’s for later.
+%p
+  As a further example on this, when one function has two calls on the same object, the layout
+  needs to be retrieved only once. Ie in the sequence getType, determine method, call, the first
+  step can be omitted for the second call as a layout is constant.
+%p
+  And as a final bonus of all this clarity, i immediately spotted the inconsistency in my own design: The frame i designed holds local variables, but the caller needs to create it. The caller can
+  not possibly know the number of local variables as that is decided by the invoked method,
+  which is only known at run-time. So we clearly need a two level thing here, one
+  that the caller creates, and one that the receiver creates.
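+%p
+  The two level split could look roughly like this in ruby.
+  The field names and the MethodInfo struct are assumptions made for the sketch; only the idea that the caller builds one part and the receiver the other comes from the text.

```ruby
# Created by the caller: it knows the receiver, the arguments and where to continue.
Message = Struct.new(:receiver, :name, :arguments, :return_address, :exception_address)

# Hypothetical description of a resolved method; only known at run-time.
MethodInfo = Struct.new(:name, :local_count, :temp_count)

# Created on the receiving side, once the method (and so the number of locals) is known.
class Frame
  attr_reader :message

  def initialize(message, method_info)
    @message = message
    @locals  = Array.new(method_info.local_count)
    @temps   = Array.new(method_info.temp_count)
  end

  def get_local(index)
    @locals.fetch(index)          # fetch raises IndexError on a bad index
  end

  def set_local(index, value)
    @locals.fetch(index)          # bounds check before writing
    @locals[index] = value
  end
end

# Caller side: no knowledge of locals needed.
message = Message.new(:some_object, :add, [1, 2], :after_call, :rescue_block)
# Callee side: the method has been looked up, so the sizes are known.
frame = Frame.new(message, MethodInfo.new(:add, 2, 1))
frame.set_local(0, message.arguments.sum)
puts frame.get_local(0)
```

+%p The caller never has to guess sizes, and everything the callee needs from the caller travels in one object.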
+%h3#messaging-and-slots Messaging and slots
+%p It is interesting to relate what emerges to concepts learned over the years:
+%p
+  There is this idea of message passing, as opposed to function calling. Everyone i know has learned
+  an imperative language as the first language and so message passing is a bit like vegetarian
+  food, all right for some. But of course there is a distinct difference in dynamic languages as
+  one does not know the actual method invoked beforehand. Also exceptions make the return trickier
+  and default values complicate even the argument passing, as the arguments then have to be augmented by the receiver.
+%p
+  One main difficulty i had with the message passing idea has always been what the message is.
+  But now i have the frame, i know exactly what it is: it is the frame, nothing more nothing less.
+  (Postscript: Later introduced the Message object which gets created by the caller, and the Frame
+  is what is created by the callee)
+%p
+  Another interesting observation is the (hopefully) golden path this design goes between smalltalk
+  and self. In smalltalk (like ruby and…) all objects have a class. But some of the smalltalk researchers went on to do
+  = succeed "," do
+    %a{:href => "http://en.wikipedia.org/wiki/Self_(programming_language)"} Self
+%p
+  Now in ruby, any object can have any variables anyway, but they incur a dynamic lookup. Types on
+  the other hand are like slots, and keeping each Type constant (while an object can change layouts)
+  makes it possible to have completely dynamic behaviour (smalltalk/ruby)
+  %strong and
+  use a slot-like (self) system with constant lookup speed. Admittedly the constancy only affects cache hits, but
+  as most systems are not dynamic most of the time, that is almost always.
@@ -0,0 +1,49 @@
+%p It has been a bit of a journey, but now we have arrived: Salama is officially named.
+%h3#salama Salama
+%p
+  Salama is a
+  = succeed "." do
+    %strong real word
+%p
+  It is a word of my
+  %strong home-country
+  Finland, a Finnish word (double plus).
+%p
+  Salama means
+  %strong lightning
+  (or flash), and that is fast (double double plus) and bright.
+%p
+  As some may have noticed in most places my nick is
+  = succeed "." do
+    %strong dancinglightning
+%p
+  Also
+  %strong my wife
+  suggested it, so it always reminds me of her.
+%h4#journey Journey
+%p I started with crystal, which i liked. It speaks of clarity. It is related to ruby. All was good.
+%p
+  But I was not the first to have this thought: The name is taken, as i found out by
+  chance. Ary Borenszweig started the
+  %a{:href => "http://crystal-lang.org/"} project
+  already two
+  years ago and they not only have a working system, but even compile themselves.
+%p
+  Alas, Ary started out with the idea of ruby on rockets (ie fast), but when the
+  dynamic aspects came (as they have for me a month ago), he went for speed, to be
+  precise for a static system, not for ruby.
+  So his crystal is now its own language with ruby-ish style, but not semantics.
+%p
+  That is why i had not found it. But when i did we talked, all was friendly, and we
+  agreed i would look for a new name.
+%p
+  And so i did and many were taken. Kide (crystal in Finnish) was a step on the way,
+  as was ruby in ruby. And many candidates were explored and discarded, like broom
+  (basic ruby object oriented machine), or som (simple object machine), even ahimsa.
+%h4#official Official
+%p But then i found it, or rather we did, as it was a suggestion from my wife: Salama.
+%p
+  After i found the name i made sure to claim it: I published first versions of gems
+  for salama and sub-modules. They don’t work of course, but at least the name is
+  taken in rubygems too. Of course the github name is too.
+%p So now i can get on with things at lightning speed :-)
@@ -0,0 +1,81 @@
+%p
+  While trying to figure out what i am coding i had to attack this storage format before i wanted to. The
+  immediate need is for code dumps, that are concise but readable. I started with yaml but that just takes
+  too many lines, so it’s too difficult to see what is going on.
+%p
+  I just finished it, it’s a sort of condensed yaml i call sof (salama object file), but i want to take the
+  moment to reflect why i did this, what the bigger picture is, where sof may go.
+%h3#program-lifecycle Program lifecycle
+%p
+  Let’s take a step back to mother smalltalk: there was the image. The image was/is the state of all the
+  objects in the system. Even threads, everything. Absolute object thinking taken to the ultimate.
+  A great idea of course, but doomed to ultimately fail because no man is an island (so no vm is either).
+%h4#development Development
+%p
+  Software development is a team sport, a social activity at its core. This is not always realised,
+  when the focus is too much on the outcome, but when you look at it, everything is done in teams.
+%p
+  The other thing not really taken into account in the standard development model is that it is a process in
+  time that really only gets juicy with a first customer released version. Then you get into branches for bugs
+  and features, versions with major and minor and before long you’re in a jungle of code.
+%h4#code-centered Code centered
+%p
+  But all that effort is concentrated on code. Ok nowadays schema evolution is part of the game, so the
+  existence of data is acknowledged, but only as an external thing. Nowhere near that smalltalk model.
+%p
+  But of course a truly object oriented program is not just code. It’s data too. Maybe currently “just”
+  configuration and enums/constants and locales, but that is exactly my point.
+%p
+  The lack of defined data/object storage is holding us back, making all our programs fruit-flies.
+  I mean they live a short time and die. A program has no way of “learning”, of accumulating data/knowledge
+  to use in a next invocation.
+%h4#optimisation-example Optimisation example
+%p
+  Let’s take optimisation as an example. So a developer runs tests (rubyprof/valgrind or something)
+  with some output and makes program changes accordingly. But there are two obvious problems.
+  Firstly the data is collected in development not production. Secondly, and more importantly, a person is
+  needed.
+%p
+  Of course a program could quite easily monitor itself, possibly over a long time, possibly only when
+  not at peak load. And surely some optimisations could be automated, a bit like the O1 .. On compiler
+  switches, more and more effort could be exerted on critical regions. Possibly all the way to
+  super-optimisation.
+%p
+  But even if we did this, and a program would improve/jit itself, the fruits of this work are only usable
+  during that run of that program. Future invocations, just like future versions of that program do not
+  benefit. And thus start again, just like in Groundhog day.
+%h3#storage Storage
+%p
+  So to make that optimisation example work, we would need a storage: Theoretically we could make the program
+  change its own executable/object files, in ruby even its source. Theoretically, as we have no
+  representation of the code to work on.
+%p
+  In salama we do have an internal representation, both at the code level (ast) and the compiled code
+  (CompiledMethod, Instructions and friends).
+%h4#storage-format Storage Format
+%p
+  Going back to the Image we can ask why it was doomed to fail: because of the binary,
+  proprietary implementation. Not because of the idea as such.
+%p
+  Binary data needs a rigorous specification and/or software to work on it. Work, what work?
+  We need to merge the data between installations, maintain versions and branches. That sounds a lot like
+  version control, because it basically is. Of course this “could” have been solved by the smalltalk
+  people, but wasn’t. I think it’s fair to say that git was the first system to solve that problem.
+%p
+  And git of course works with diff, and so for a 3-way merge to be successful we need a text format.
+  Which is why i started with yaml, and which is why also sof is text-based.
+%p The other benefit is of course human readability.
+%p
+  So now we have an object file * format in text, and we have git. What we do with it is up to us.
+  (* well, i only finished the writer. reading/parsing is “left as an exercise for the reader” :-)
+%h4#sof-as-object-file-format Sof as object file format
+%p
+  Ok, i’ll sketch it a little: Salama would use sof as its object file format, and only sof would ever be
+  stored in git. For developers to work, tools would create source and when that is edited compile it to sof.
+%p
+  A program would be a repository of sof and resource files. Some convention for load order would be helpful
+  and some “area” where programs may collect data or changes to the program. Some may of course alter the
+  sof files directly.
+%p
+  How, when and how automatically changes are merged (via git) is up to developer policy. But it is
+  easily imaginable that data in program-designated areas gets merged back into the “mainstream” automatically.
@@ -0,0 +1,71 @@
+%p The time of introspection is coming to an end and i am finally producing executables again. (hurrah)
+%h3#block-and-exception Block and exception
+%p
+  Even though neither ruby blocks nor exceptions are implemented, i have figured out how to do them, which is sort of good news.
+  I’ll see of course when the day comes, but a plan is made and it is this:
+%p No information lives on the machine stack.
+%p
+  Maybe it’s easier to understand this way: All objects live in memory primarily. Whatever gets moved onto the machine
+  stack is just a copy and, for purposes of the gc, does not need to be considered.
+%h3#objects-4-registers 4 Objects, 4 registers
+%p As far as i have determined the vm needs internal access to exactly four objects. These are:
+%ul
+  %li Message: the currently received one, ie the one that led to the current method being called
+  %li Self: this is an instance variable of the message
+  %li Frame: local and temporary variables of the method. Also part of the message.
+  %li NewMessage: where the next call is prepared
+%p And, as stated above, all these objects live in memory.
+%h3#single-set-instruction Single Set Instruction
+%p
+  Self and frame are duplicated information, because then it is easier to transfer. After some initial experimenting, i settled on a
+  single Instruction to move data around in the vm, Set. It can move instance variables from any of the objects to any
+  other of the 4 objects.
+%p
+  The implementation of Set ensures that any move to the self slot in Message gets duplicated into the Self register. Same
+  for the frame, but both happen only once per method, and both are read only afterwards, so don’t need updating later.
+%p Set, like other instructions, may use any other registers at any time. Those registers (r4 and up) are scratch.
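+%p
+  A minimal sketch of that Set behaviour in plain ruby. It is only an illustration: the Machine class, the use of
+  hashes as objects and the slot names are assumptions made up for this example, not the actual salama code.

```ruby
# Hypothetical model: the four vm objects are plain hashes held in "registers".
class Machine
  attr_reader :registers

  def initialize
    @registers = { message: {}, self: nil, frame: nil, new_message: {} }
  end

  # Move one slot (instance variable) from one of the four objects to another.
  def set(from, from_slot, to, to_slot)
    value = @registers.fetch(from).fetch(from_slot)
    @registers.fetch(to)[to_slot] = value
    # A move into the self or frame slot of Message is mirrored into
    # the register of the same name, so both copies stay in step.
    if to == :message && [:self, :frame].include?(to_slot)
      @registers[to_slot] = value
    end
    value
  end
end

machine = Machine.new
receiver = { name: "kernel" }
machine.registers[:new_message][:receiver] = receiver
machine.set(:new_message, :receiver, :message, :self)
```

+%p
+  After the last line the Self register and the self slot of the Message refer to the same object, which is the
+  duplication described above.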
+%h3#simple-call Simple call
+%p
+  This makes calling relatively simple and thus easy to understand. To make a call we must be in a method, ie Message,
+  Self and Frame have been set up.
+%p
+  The method then produces values for the call. This involves operations and the result of that is stored in a variable
+  (tmp/local/arg). When all values have been calculated a NewMessage is created and all data moved there (see Set).
+%p
+  A Call is then quite simple: because of the duplication of Self and Frame, we only need to push the Message to the
+  machine stack. Then we move the NewMessage to Message, unroll (copy) the Self into its register and assign a new
+  Frame.
+%p
+  Returning is also not overly complicated: Remember that the return value is an instance variable in the
+  Message object. So when the method is done, the value is there, not for example in a dedicated register.
+  So we need to undo the above: move the current Message to NewMessage, pop the previously pushed message from the
+  machine stack and unroll the Self and Frame copies.
+%p
+  The caller then continues and can pick up the return from its NewMessage if it is used for further calculation.
+  It’s like it did everything to build the (New)Message and immediately the return value was filled in.
+%p
+  As I said, often we need to calculate the values for the call, so we need to make calls. This happens in exactly the same
+  way, and the result is shuffled to a Frame slot (local or temporary variable).
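+%p
+  The call and return steps can be sketched in a few lines of ruby. Again this is only a model under assumptions:
+  the CallSketch class, the Message struct layout and the method names are invented for the example, and a ruby
+  array stands in for the machine stack.

```ruby
# Hypothetical model of the call / return protocol described above.
class CallSketch
  # Assumed layout: receiver (self), frame and return value are slots of the message.
  Message = Struct.new(:receiver, :frame, :return_value)

  attr_reader :message, :new_message, :self_reg, :frame_reg, :machine_stack

  def initialize(first_message)
    @machine_stack = []
    @message = first_message
    @self_reg = first_message.receiver
    @frame_reg = first_message.frame
    @new_message = nil
  end

  # The caller prepares the next message before calling.
  def prepare(receiver)
    @new_message = Message.new(receiver, nil, nil)
  end

  def call
    @machine_stack.push(@message)     # only the Message needs saving
    @message = @new_message           # NewMessage becomes Message
    @self_reg = @message.receiver     # unroll Self into its register
    @message.frame = {}               # assign a new Frame
    @frame_reg = @message.frame
  end

  def return_with(value)
    @message.return_value = value     # the return value lives in the Message
    @new_message = @message           # caller finds it in its NewMessage
    @message = @machine_stack.pop
    @self_reg = @message.receiver     # unroll the Self and Frame copies
    @frame_reg = @message.frame
  end
end

vm = CallSketch.new(CallSketch::Message.new(:kernel, {}, nil))
vm.prepare(:other)
vm.call
vm.return_with(42)
```

+%p
+  After the return the caller is back in its own Message, and the value sits in the NewMessage it prepared.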
+%h3#message-creation Message creation
+%p
+  Well, i hear, that sounds good and almost too easy. But … (there is always one, isn’t there) what about the Message and Frame
+  objects, where do you get those from?
+%p
+  And this is true: in c the Message does not exist, it’s just data in registers and the Frame is created on the stack if
+  needed.
+%p And unfortunately we can’t really make a call to get/create these objects as that would create an endless loop. Hmm.
+%p We need a very fast way to create and reuse these objects: a bit like a stack. So let’s just use a Stack :-)
+%p
+  Of course not the machine stack, but a Stack object. An array to which we append and take from.
+  It must be global of course, or rather accessible from compiling code. And fast may mean that we use assembler, or
+  if things work out well, we can use the same code as what makes builtin arrays tick.
+%p
+  Still, this is a different problem and the full solution will need a bit of time. But clearly it is solvable and does
+  not impact the above register usage convention.
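+%p
+  To make the Stack idea a bit more concrete, here is a small sketch. The MessageStack class and its method names
+  are hypothetical, and in the real vm this would have to work without making calls, which plain ruby can not show.

```ruby
# Hypothetical pool: a Stack object from which Message (or Frame) objects
# are taken for a call and to which they are given back on return.
class MessageStack
  def initialize(size, &factory)
    # Create all objects up front, so taking one never needs to allocate.
    @free = Array.new(size) { factory.call }
  end

  def take
    @free.pop || raise("pool exhausted")
  end

  def give_back(object)
    @free.push(object)
  end

  def size
    @free.size
  end
end

pool = MessageStack.new(2) { {} }
first = pool.take
pool.give_back(first)
again = pool.take
```

+%p
+  The second take hands out the very same object again, so nothing new is created after the initial setup.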
+%h3#the-fineprint The fine print
+%p
+  Just for the sake of completeness: The assumption i made at the beginning of the Simple Call section can of course not
+  always be true.
+%p
+  To boot the vm, we must create the first message by “magic” and put it and the Self (Kernel module reference) in place.
+  As it can be an empty Message for now, this is not difficult, just one of those little gotchas.
@@ -0,0 +1,100 @@
+%p The register machine abstraction has been somewhat thin, and it is time to change that.
+%h3#current-affairs Current affairs
+%p
+  When i started, i started from the assembler side, getting arm binaries working and of course learning the arm cpu
+  instruction set in assembler mnemonics.
+%p
+  Not having
+  %strong any
+  experience at this level i felt that arm was pretty sensible. Much better than i expected. And
+  so i abstracted the basic instruction classes a little and had the arm instructions implement them pretty much one
+  to one.
+%p
+  Then i tried to implement any ruby logic in that abstraction and failed. Thus was born the virtual machine
+  abstraction of having Message, Frame and Self objects. This in turn mapped nicely to registers with indexed
+  addressing.
+%h3#addressing Addressing
+%p
+  I just have to sidestep here a little about addressing: the basic problem is of course that we have no idea at
+  compile-time at what address the executable will end up.
+%p
+  The problem first emerged with calling functions. Mostly because those were the only objects i had, and so i was
+  very happy to find out about pc relative addressing, in which you jump or call relative to your current position
+  (
+  %strong> p
+  rogram
+  = succeed "ounter)." do
+    %strong c
+%p
+  Then came the first strings and the approach can be extended: instead of grabbing some memory location, ie loading
+  an address and dereferencing, we calculate the address in relation to pc and then dereference. This is great and
+  works fine.
+%p
+  But the smug smile is wiped off the face when one tries to store references. This came with the whole object
+  approach, the bootspace holding references to
+  %strong all
+  objects in the system. I even devised a plan to always store
+  relative addresses. Not relative to pc, but relative to the self that is storing. This i’m sure would have
+  worked fine too, but it does mean that the running program also has to store those relative addresses (or have
+  different address types, shudder). That was a runtime burden i was not willing to accept.
+%p
+  So there are two choices as far as i see: use elf relocation, or relocate in init code. And yet again i find myself
+  biased to the home-grown approach. Of course i see that this is partly because i don’t want to learn the innards of
+  elf as something very complicated that does a simple thing. But also because it is so simple i am hoping it isn’t
+  such a big deal. Most of the code for it, object iteration, type testing, layout decoding, will be useful and
+  necessary later anyway.
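+%p
+  Relocating in init code could look roughly like the sketch below. It assumes that references are written into the
+  binary as offsets from the start of the image, and that the init code learns the load address at startup. The Ref
+  struct, the :slots layout and relocate! are made-up names for illustration only.

```ruby
# Hypothetical reference type: an address that needs fixing up at load time.
Ref = Struct.new(:address)

# Walk all objects (object iteration), pick out the references (type testing)
# and turn image-relative offsets into absolute addresses.
def relocate!(objects, base)
  objects.each do |object|
    object[:slots].map! do |slot|
      slot.is_a?(Ref) ? Ref.new(slot.address + base) : slot
    end
  end
  objects
end

image = [
  { slots: [Ref.new(8), 42] },   # one reference and one plain integer
  { slots: [Ref.new(0)] }
]
relocate!(image, 0x8000)
```

+%p
+  Plain values are left alone, only references are shifted by the load address. After that the running program
+  works with ordinary absolute addresses and carries no extra runtime burden.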
+%h3#concise-instruction-set Concise instruction set
+%p
+  So that addressing aside was meant to further the point of a need for a good register instruction set (to write the
+  relocation in). And the code that i have been writing to implement the vm instructions clearly shows a need for
+  a better model at the register level.
+%p
+  On the other hand, the idea of Passes will make it very easy to have a completely separate register machine layer.
+  We just transform the vm to that, and then later from that to arm (or later intel). So there are three things that i
+  am looking for with the new register machine instruction set:
+%ul
+  %li easy to understand the model (ie register machine, pc, ..), free of real machine quirks
+  %li small set of instructions that is needed for our vm
+  %li better names for instructions
+%p
+  Especially the last one: all the mvn and ldr is getting to me. It’s so 50’s, as if we didn’t have the space to spell
+  out move or load. And even those are not good names, at least i am always wondering what is a move and what a load.
+  And as i explained above in the addressing, if i wanted to load an address of an object into a register with relative
+  addressing, i would actually have to do an add. But when reading an add instruction it is not an intuitive
+  conclusion that a load is meant. And since this is a fresh effort i would rather change these things now and make
+  it easier for others to learn sensible stuff than me get used to cryptics only to have everyone after me do the same.
+%p
+  So i will have instructions like RegisterMove, ConstantLoad, Branch, which will translate to mov, ldr and b in arm. I still like to keep the arm level with the traditional names, so people who actually know arm feel right at home.
+  But the extra register layer will make it easier for everyone who has not programmed assembler (and me!),
+  which i am guessing is quite a lot in the
+  %em ruby
+  community.
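+%p
+  As a rough illustration of that naming layer, the sketch below maps the three instructions to arm assembler text.
+  The struct fields, the ARM_TRANSLATION table and to_arm are assumptions for this example. The real layer would
+  produce instruction objects rather than strings.

```ruby
# Hypothetical register machine instructions with spelled-out names.
RegisterMove = Struct.new(:to, :from)
ConstantLoad = Struct.new(:to, :constant)
Branch       = Struct.new(:label)

# One translation per instruction class, here simply to assembler text.
ARM_TRANSLATION = {
  RegisterMove => ->(i) { "mov #{i.to}, #{i.from}" },
  ConstantLoad => ->(i) { "ldr #{i.to}, =#{i.constant}" },
  Branch       => ->(i) { "b #{i.label}" }
}.freeze

def to_arm(instructions)
  instructions.map { |i| ARM_TRANSLATION.fetch(i.class).call(i) }
end

arm = to_arm([
  RegisterMove.new("r1", "r0"),
  ConstantLoad.new("r0", 42),
  Branch.new("loop")
])
# arm is ["mov r1, r0", "ldr r0, =42", "b loop"]
```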
+%p
+  In implementation terms it is a relatively small step from the vm layer to the register layer. And an even smaller
+  one to the arm layer. But small steps are good, easy to take, easy to understand, no stumbling.
+%h3#extra-benefits Extra Benefits
+%p
+  As i am doing this for my own sanity, any additional benefits are really extra, for free as it were. And those extra
+  benefits clearly exist.
+%h5#clean-interface-for-cpu-specific-implementation Clean interface for cpu specific implementation
+%p
+  That really says it all. That interface was a bit messy, as the RegisterMachine was used in Vm code, but was actually
+  an Arm implementation. So no separation. Also as mentioned the instruction set was arm heavy, with the quirks
+  even arm has.
+%p
+  So in the future any specific cpu implementation can be quite self sufficient. The classes it uses don’t need to
+  derive from anything specific and need only implement the very small code interface (position/length/assemble).
+  And to hook in, all that is needed is to provide a translation from RegisterMachine instructions, which can be
+  done very nicely by providing a Pass for every instruction. So that layer of code is quite separate from the actual
+  assembler, so it should be easy to reuse existing code (like wilson or metasm).
+%h5#reusable-optimisations Reusable optimisations
+%p
+  Clearly the better separation allows for better optimisations. Concretely Passes can be written to optimize the
+  RegisterMachine’s workings. For example register use, constant extraction from loops, or folding of double
+  moves (when a value is moved from reg1 to reg2, and then from reg2 to reg3, and reg2 is never used afterwards).
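+%p
+  The double move folding could be written as a Pass along these lines. This is a sketch under assumptions: Move and
+  Use are hypothetical instruction types, the code is straight-line (no branches), and a register that is not read
+  before the end of the list is treated as dead.

```ruby
Move = Struct.new(:from, :to)   # copy one register into another
Use  = Struct.new(:reg)         # any instruction that reads a register

# Is reg read again from index start on, before being overwritten?
def read_later?(instructions, start, reg)
  instructions.drop(start).each do |ins|
    case ins
    when Move
      return true if ins.from == reg
      return false if ins.to == reg   # overwritten before being read
    when Use
      return true if ins.reg == reg
    end
  end
  false
end

# Fold reg1 -> reg2, reg2 -> reg3 into reg1 -> reg3 when reg2 is not needed.
def fold_double_moves(instructions)
  result = []
  index = 0
  while index < instructions.size
    first = instructions[index]
    second = instructions[index + 1]
    if first.is_a?(Move) && second.is_a?(Move) &&
       first.to == second.from &&
       !read_later?(instructions, index + 2, first.to)
      result << Move.new(first.from, second.to)
      index += 2
    else
      result << first
      index += 1
    end
  end
  result
end

folded = fold_double_moves([Move.new(:r1, :r2), Move.new(:r2, :r3), Use.new(:r3)])
kept   = fold_double_moves([Move.new(:r1, :r2), Move.new(:r2, :r3), Use.new(:r2)])
```

+%p
+  In the first list the two moves collapse into one, in the second they stay because r2 is still read afterwards.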
+%p
+  Such optimisations are very general and should then be reusable for specific cpu implementations. They are still
+  useful at RegisterMachine level, mind, as the code is “cleaner” there and it is easier to detect fluff. But the same
+  code may be run after a cpu translation, removing any “fluff” the translation introduced. Thus the translation
+  process may be kept simpler too, as that doesn’t need to check for possible optimisations at the same time
+  as translating. Everyone wins :-)