move posts in directory by year

2019-12-07 11:30:52 +02:00
parent 5ce2d1b625
commit 285c6531e4
38 changed files with 8 additions and 11 deletions
@@ -0,0 +1,115 @@
+%p
+  Now that i
+  %em have
+  had time to write some more code (250 commits last month), here is
+  the good news:
+%h2#sending-is-done Sending is done
+%p
+  A dynamic language like ruby really has at its heart the dynamic method resolution. Without
+  that we’d be writing C++. Not much can be done in ruby without looking up methods.
+%p
+  Yet all this time i have been running circles around this mother of a problem, because
+  (after all) it is a BIG one. It must be the one single most important reason why dynamic
+  languages are interpreted and not compiled.
+
+%h2#a-brief-recap A brief recap
+%p
+  Last year i already started on a rewrite, after hitting this exact same wall for the fourth
+  time. I put in some more layers, the way a good programmer fixes any daunting problem.
+%p
+  The
+  %a{:href => "https://github.com/ruby-x/rubyx"} Readme
+  has quite a good summary on the new layers,
+  and of course i’ll update the architecture soon. But in case you didn’t click, here is the
+  very very short summary:
+%ul
+  %li Vool is a Virtual Object Oriented Language.
+  Virtual in that it has no syntax of its own. But
+  it has semantics, and those are substantially simpler than ruby. Vool is Ruby without
+  the fluff.
+
+  %li Mom, the Minimal Object Machine layer is the first machine layer.
+  Mom has no concept of memory
+  yet, only objects. Data is transferred directly from object
+  to object with one of Mom’s main instructions, the SlotLoad.
+  %li Risc layer here abstracts the Arm in a minimal and independent way.
+  It does not model
+  any real RISC cpu instruction set, but rather implements what is needed for rubyx.
+  %li Arm and Elf:
+  There is a minimal
+  %em Arm
+  translator that transforms Risc instructions to Arm instructions.
+  Arm instructions assemble themselves into binary code. A minimal
+  %em Elf
+  implementation is
+  able to create executable binaries from the assembled code and Parfait objects.
+  %li Parfait:
+  Generating code (by descending above layers) is only half the story in an oo system.
+  The other half is classes, types, constant objects and a minimal run-time. This is
+  what Parfait is.
+%h2#compiling-and-building Compiling and building
+%p
+  After having finished all this layering work, i was back to square
+  = succeed ":" do
+    %em resolve
+%p
+  But of course when i got there i started thinking that the resolve method (in ruby)
+  would itself need resolving. And after briefly considering cheating (hardcoding type
+  information into this
+  %em one
+  method), i opted to write the code in Risc. Basically assembler.
+%p
+  And it was horrible. It worked, but it was completely unreadable. So then i wrote a dsl for
+  generating risc instructions, using a combination of method_missing, instance_eval and
+  operator overloading. The result is quite readable code, a mixture between assembler and
+  a mathematical notation, where one can just freely name registers and move data around
+  with
+  %em []
+  and
+  = succeed "." do
+    %em <<
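+%p
+  To give an idea of how method_missing, instance_eval and operator overloading can
+  combine into such a notation, here is a minimal sketch. All names (TinyBuilder, Register,
+  Slot, emit) are made up for illustration and are not the actual rubyx builder api; the
+  "instructions" are just arrays.

```ruby
# Hypothetical sketch, not the rubyx Risc builder.
class Register
  attr_reader :name

  def initialize(name, builder)
    @name = name
    @builder = builder
  end

  # reg[:slot] names a slot of the object the register points to
  def [](slot)
    Slot.new(self, slot, @builder)
  end

  # reg << source records a load into this register
  def <<(source)
    @builder.emit(:load, name, source.to_operand)
    self
  end

  def to_operand
    name
  end
end

class Slot
  def initialize(register, index, builder)
    @register = register
    @index = index
    @builder = builder
  end

  # reg[:slot] << source records a store into the slot
  def <<(source)
    @builder.emit(:store, to_operand, source.to_operand)
    self
  end

  def to_operand
    [@register.name, @index]
  end
end

class TinyBuilder
  attr_reader :instructions

  def initialize
    @instructions = []
    @registers = {}
  end

  # the block is evaluated with the builder as self
  def build(&block)
    instance_eval(&block)
    instructions
  end

  def emit(*instruction)
    @instructions << instruction
  end

  # any unknown bare name becomes a freely named register
  def method_missing(name, *args)
    return super unless args.empty?
    @registers[name] ||= Register.new(name, self)
  end

  def respond_to_missing?(_name, _include_private = false)
    true
  end
end

code = TinyBuilder.new.build do
  receiver << message[:receiver]
  message[:return_value] << receiver
end
```

+%p
+  The block reads almost like assignments, while the builder only collects a list of
+  instruction descriptions that a later step could turn into real instructions.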
+%p
+  By then resolving worked, but it was still a method. Since it was already in risc, i basically
+  inlined the code by creating a new Mom instruction and moving the code to its
+  = succeed "." do
+    %em to_risc
+%p
+  A small bug in calling the resulting method was fixed, and
+  = succeed "," do
+    %em voila
+%h2#the-proof The proof
+%p
+  Previous, static, Hello Worlds looked like this:
+  %blockquote
+    "Hello world".putstring
+%p
+  Of course we can know the type that putstring applies to and so this does not
+  involve any method resolution at runtime, only at compile time.
+%p
+  Today's step is thus:
+%blockquote
+  a = "Hello World"
+  %br
+  a.putstring
+%p
+  This does involve a run-time lookup of the
+  %em putstring
+  method. It being a method on String,
+  it is indeed found and called. (1) Hurray.
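+%p
+  As a rough illustration of what such a run-time lookup involves, here is a sketch in
+  plain ruby. SketchClass, method_table and send_dynamic are hypothetical names, not
+  Parfait classes, and the real resolve is written at the Risc level as described above.

```ruby
# Hypothetical model of classes with a method table and a superclass link.
SketchClass = Struct.new(:class_name, :method_table, :parent) do
  # walk up the inheritance chain until some class defines the method
  def resolve(method_name)
    klass = self
    while klass
      found = klass.method_table[method_name]
      return found if found
      klass = klass.parent
    end
    raise NoMethodError, "#{method_name} not found for #{class_name}"
  end
end

object_class = SketchClass.new(:Object, { itself: ->(receiver) { receiver } }, nil)
string_class = SketchClass.new(
  :String,
  { putstring: ->(receiver) { "put: #{receiver}" } },
  object_class
)

# the type of the receiver is only looked at when the call happens
def send_dynamic(receiver, receiver_class, method_name)
  receiver_class.resolve(method_name).call(receiver)
end

a = "Hello World"
direct = send_dynamic(a, string_class, :putstring)
inherited = send_dynamic(a, string_class, :itself)
```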
+%p
+  And maths works too:
+%blockquote
+  a = 150
+  %br
+  a.div10
+%p
+  Does indeed result in 15. Also most operators (+, -, <<) work. Even with the
+  %em new
+  integers. Part of the rewrite was to upgrade integers to first class objects.
+%p
+  PS(1): I know with more analysis the compiler
+  %em could
+  know that
+  %em a
+  is a String (or Integer),
+  but just now it doesn’t. Take my word for it or even better, read the code.
@@ -0,0 +1,90 @@
+%p
+  After
+  =link_to "finishing the code," , "/blog/a-dynamic-hello-world"
+  i updated all the docs too!
+
+%h2 The rewrite
+%p
+  Doing anything for the first time is not so easy. I have taught enough by now to see
+  how central
+  %em guidance,
+  the experience of another, is to the process of learning.
+  %br
+  I was thinking so much about VMs in the beginning that a lot went sideways.
+%p
+  Now it feels
+  =link_to "the abstractions", "rubyx/layers.html"
+  are coming into focus, the code is clean and relatively easy to understand.
+%p
+  During this latest wobble, maybe 500 commits in all, almost everything above the
+  Risc layer was rewritten. At the low point, i was down to just over 400 tests, but
+  now, back strong, at over 800. That's 94% coverage with a
+  =ext_link "CodeClimate A", "https://codeclimate.com/github/ruby-x/rubyx/"
+  so that's ok.
+%p
+  In the process i got much closer to the actual goal, which i'll go into in more detail.
+
+%h2 The docs
+%p
+  Now i have also cleaned up all the documentation. This does not mean that everything
+  is documented, but i hope one can get a good idea from the docs, and then just
+  read the code.
+%h3 Architecture
+%p
+  The
+  =link_to "Architecture" , "/rubyx/layers.html"
+  section gives an overview of the new layers.
+  %ul
+    %li Ruby Simplified: Vool
+    %li Mom, a simple machine with object memory
+    %li Risc, the old abstraction of a CPU
+    %li Arm and Elf, to actually generate binaries
+    %li Parfait and Builtin to get the system up
+%h3 Parfait
+%p
+  There is a separate document describing the classes needed to boot the system.
+  A little about the current state of Types and Classes.
+%p
+  But it should definitely be expanded, and there is nothing about Builtin.
+  Builtin is the way to write methods that can not be expressed in ruby. And since
+  writing them got so messy i wrote a DSL, which is only documented in
+  =ext_link "code." , "https://github.com/ruby-x/rubyx/blob/master/lib/risc/builder.rb"
+%h3 Calling
+%p
+  Since Calling is now done, i documented both the
+  =link_to "calling convention,", ""
+  and the way
+  =link_to "method resolution", ""
+  is done.
+%h3 Interpreter
+%p
+  Of course the
+  =link_to "Interpreter", "rubyx/debugger.html"
+  is still working, (since the Risc layer didn't change much) and is a large part of
+  the testing scheme.
+%p
+  And i even got the
+  =link_to "Debugger" , "/debugger"
+  working again and integrated into the new site (which is now a rails app).
+%h3 Misc
+%p
+  Finally i cleaned up the old mumbo jumbo docs and sorted them a bit into what
+  is ideas, plans and just background info, in the
+  =link_to "Misc section" , "/misc/index.html"
+
+%h2 Next Steps
+%p
+  The plan for the near future is something like this
+  %ul
+    %li
+      More complicated tests. Whole methods that do something and
+      tests containing several methods. All just testing results.
+    %li Better test framework for testing binaries.
+    %li Blocks
+    %li Baby steps towards Stdlib
+%p   As one can see, work happens when i have time and inspiration.
+%p.full_width
+  =image_tag "github-timeline-2018.jpg"
+
+  But even after 4 years, i haven't given up yet :-)
+  Though i may have given up on any time estimates.
@@ -0,0 +1,126 @@
+%p
+  It was almost going to be working binaries and over 1000 tests. But i am coming
+  more and more to the point where software is measured in number of tests, not
+  lines of code.
+
+%h2 1000+ Tests
+
+%p.full_width
+  =image_tag "1000_tests.jpg"
+  It was shortly after the last post that i first noticed that 1k was approaching.
+  A little hard to grasp that i have written all those, kind of: what do they
+  all do?
+%p
+  A good step too: just about 200 tests in 2 months of work. I noticed that the couple of
+  times i didn't have good coverage for new code immediately, i started to have
+  problems and had to write tests later. It seems it is the only way to understand
+  even my own code anymore: by making the assumptions explicit. Some of the bugs
+  where tests were missing are just the classics, +1 or -1 errors. And i feel like a real
+  newbie having to debug for 5 hours to find it was "<" not "<=" .
+
+%h2 Working binaries
+
+%p
+  But of course it feels good to finally have
+  %b working binaries.
+  This is after all the first time that i compile real ruby into real binary.
+  The compiler does of course have many limitations, but what it does, it does
+  right. Even if it was just Hello World for starters.
+  %br
+  =image_tag "hello.jpg"
+  Of course i tried the next one straight after, "2+2" and .... it worked too.
+  I don't know if that is just my bad habit or an occupational thing, this
+  being surprised when things work.
+  %br
+  But a little bit about the journey and how this works.
+
+%h3 Positioning
+%p
+  Since the ruby-x approach is oo from the start, we do not rely on the C way of creating
+  binaries. Instead, the binary is a sort of snapshot of a running system, or
+  in other words there is only heap.
+%p
+  This means we create binaries that look the same as the memory during runtime,
+  which is made up of small fixed-size objects. Currently we only have objects
+  of sizes 2 to the power of 2, 3, 4 and 5. Maybe larger later, but with oo
+  complete data hiding it is easy to extend objects transparently.
+
+%h3 Constant loading
+%p
+  Especially for Code (the only objects larger than 16 words currently),
+  this presented a challenge. Maybe even an extra challenge on top of the
+  purely static one, because of the way ARM loads constants.
+%p
+  Constant loading happens when a known object or address is loaded into a register.
+  Arm's 32-bit instructions only allow 10-bit constants to be loaded.
+  So if the constant is larger (eg the object further away) two instructions instead
+  of one are needed. But this only becomes clear when all positions of objects
+  have been determined.
+
+%h3 Event approach
+%p
+  Of course this is not new, and this is in fact the third time i have coded this,
+  finally getting it right. The problem gets hairy with the 16 words limit,
+  when the code overlaps the originally assigned length and a new object has
+  to be inserted.
+%p
+  To keep one method's code contiguous, all other methods' code has to be moved up,
+  and thus a whole lot of positions change. Of course when some object's position
+  changes, a load depending on that may go from 1 to 2 instructions and so on and
+  on.
+  And then there are of course the branches that load their targets (forward and backward
+  branches), and they need to be updated too, etc.
+%p
+  I now have position objects which fire events, and about 4 different kinds of
+  listeners reacting in different ways when different objects change. The whole
+  thing works, though as with many an event system, it is difficult to say
+  exactly how. (only easy in the small, not the whole i mean)
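+%p
+  In the small, the idea can be sketched like this. The class name, the on_change method
+  and the fixed size of 16 are assumptions for the example only, and the real listeners
+  do more than shift the next object.

```ruby
# Hypothetical sketch of a position that notifies listeners when it moves.
class SketchPosition
  attr_reader :at

  def initialize(at)
    @at = at
    @listeners = []
  end

  def on_change(&listener)
    @listeners << listener
  end

  def set(new_at)
    return if new_at == @at # nothing moved, nothing to propagate
    old_at = @at
    @at = new_at
    @listeners.each { |listener| listener.call(old_at, new_at) }
  end
end

SIZE = 16 # assumed size of every object in this example

first = SketchPosition.new(0)
second = SketchPosition.new(16)
third = SketchPosition.new(32)

# each object sits directly behind its predecessor, so it follows when that moves
first.on_change { |_old, now| second.set(now + SIZE) }
second.on_change { |_old, now| third.set(now + SIZE) }

first.set(8)
```

+%p
+  Moving the first object ripples through to the third without any code knowing
+  the whole chain, which is also why the whole is harder to follow than each part.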
+
+%h3 Object continuation
+%p
+  As i mentioned, it is quite straightforward to have larger data amounts, made up
+  of 16 word chunks, by having a linked list. This is how the BinaryCode objects, that
+  hold the binary code, do it.
+%p
+  But with the binaries there is an extra twist to this. The BinaryCode object has a
+  header (the type and next), which are not code. So the code has to jump over this
+  header at the end of every object.
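+%p
+  A small sketch of this chunking, under some assumptions: 4 byte words, chunks laid out
+  directly after each other, and the last word of every chunk (except the final one)
+  reserved for the branch. The function name and the array representation are made up;
+  the start address is the one from the dump discussed in this post.

```ruby
WORD_BYTES = 4
CHUNK_WORDS = 16
HEADER_WORDS = 2 # type and next
BODY_WORDS = CHUNK_WORDS - HEADER_WORDS - 1 # one word is kept for the jump

# Split a flat instruction list into 16 word chunks that link to each other.
def chunk_code(instructions, base_address)
  slices = instructions.each_slice(BODY_WORDS).to_a
  slices.each_with_index.map do |body, index|
    last = index == slices.length - 1
    next_chunk = base_address + (index + 1) * CHUNK_WORDS * WORD_BYTES
    words = [:type, last ? nil : next_chunk] + body
    # branch to the code of the next chunk, ie just after its header
    words << [:b, next_chunk + HEADER_WORDS * WORD_BYTES] unless last
    words
  end
end

chunks = chunk_code((1..20).to_a, 0x16260)
```

+%p
+  With these assumptions the first chunk links to 0x162a0 and branches to 0x162a8.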
+
+%p.full_width
+  =image_tag "binary_codes.jpg"
+  This is demonstrated by the object dump above. If the assembly is scary, don't
+  worry, just look at the top left, address 16260, where the BinaryCode object for
+  the main method starts. You see the first two words are separated, as i said the
+  type and next (see the 162a0 value is the address of the BinaryCode on the right).
+%p
+  Mainly i wanted to demonstrate the jump, which is the last instruction on the left
+  side. The
+  %b b
+  stands for branch and the address 162a8 is exactly the code of the next BinaryCode,
+  ie just after the header.
+%p
+  You can just make out on the bottom left, that this is in fact the code for the
+  "Hello World" , as it jumps to the (Word_Type.) putstring.
+
+%h2 Next steps
+%p
+  Hello World is of course a very small step and work will continue on making other
+  things work. On the Interpreter side, many more things, like loops, conditionals,
+  maths and dynamic dispatch already work.
+%p
+  Luckily, part of this push was to make the Interpreter a platform similar to the
+  Arm. So it too has BinaryCode and works with addresses, not objects as before.
+  In short the differences between Interpreter and Arm have shrunk, and there is
+  good reason to believe that much will work quite soon.
+%p
+  Next i will build a testing framework to test the same code on Interpreter and
+  Arm and see that both work. And specifically get all those working Interpreter
+  tests working on Arm.
+%p
+  I think then it is time for some benchmarks. It has been a while since
+  =link_to "i made some," , "/misc/soml_benchmarks.html"
+  and they were quite promising. Especially loops of the Hello World and
+  Fibonacci.
+%p
+  On the further horizon i was planning for continuations next, probably with
+  a small rework of the return mechanism (unified return sequence).
@@ -0,0 +1,184 @@
+%p
+  Of course the
+  =link_to "architecture" , "/rubyx/layers.html"
+  gives a good overview of the system as it is. But it does not explain how we got
+  there. And sometimes knowing the journey makes it easier to understand where
+  one is. So i shall try to highlight the four or five main
+
+%h2 Macbook + Ruby == Raspberry Pi
+
+%p.full_width
+  =image_tag "mac_plus.png"
+  When i bought my first 30 Euro Pi i noticed that ruby is unusable on it.
+  Looking at how slow ruby actually is, it occurred to me that ruby just about turns
+  the Pi into my first 286 laptop (running at 6MHz), which is the same as turning my
+  MacBook Pro into a Pi.
+%p
+  Of course, while working on web-apps, which can be parallelized so easily, and with
+  a company paying both developer and hardware, the std ruby argument holds.
+  But since i wanted to use my pi for demanding projects something had to be done.
+
+%h2 Judy, the importance of cpu cache
+
+%p
+  =ext_link "Judy" , "http://judy.sourceforge.net/"
+  is a really really fast digital tree, kind of hash. I actually built a memory
+  database with it that was also really really fast. When connecting it to rails i
+  ran into the above problem, the niceties of ActiveRecord (ruby) brought performance
+  of my extension (c) down by a factor of 40.
+%p
+  But anyway, the point is that Judy's speed is based on a radical optimisation for
+  cache lines (and key compression). This means all data structures are exactly a cpu
+  cache line big. As i learned, cpus do not access memory in word sizes, but instead, always
+  a cache line at a time. This basically led to ruby-x's memory model, which is
+  fixed-size objects, multiples of a cache-line.
+
+%h3 Microkernel
+%p
+  As a young engineer, i thought, as my peers, that Linux (then 0.93) was the greatest
+  thing. Only much later did i learn that it is just a copy really, and the reason
+  it got popular was not technical, but licensing (Same reason it is in Android i
+  believe). The reason it stayed popular is inertia, in other words writing device
+  drivers is hard.
+%p
+  =ext_link "Synthesis," , "https://en.wikipedia.org/wiki/Self-modifying_code#Massalin's_Synthesis_kernel"
+  =ext_link "L4,", "https://en.wikipedia.org/wiki/L4_microkernel_family"
+  and
+  =ext_link "Minix," , "http://www.minix3.org/"
+  are good proof that the superior architecture is the Microkernel. Eg L4 can run
+  another OS as an application with about 4% performance degradation. Or Minix can
+  recover from a device driver failure.
+%p
+  This, plus the fact that we have bundler, brought me to the approach that:
+  If you can leave it out, do. Much of the functionality that is in ruby (mri),
+  will never be in RubyX, but rather supplied by gems.
+
+%h3 System interrupts
+%p
+  In the beginning i was of course contemplating how much of c based systems i would use.
+  Like LLVM, which is of course a great tool, though made for c-ish applications.
+  Or libc, which again is really for c apps to access the kernel.
+%p
+  The sheer size of the functionality one inherits almost swayed me. Even though i had long
+  since determined that one of ruby's biggest flaws, its std-lib, came from modelling
+  and using libc.
+%p
+  Then i learned assembler and looked at libc implementations and learned what i
+  believe made the decision: Kernel calls are not really calls at all. They are
+  software interrupts, which basically means you fill some registers, flick the
+  switch, and the next instruction you can collect the result in a specified
+  register. This may look like a call, and of course, by using libc it is presented
+  as a call, but it is not. It is a very simple set of assembler instructions.
+%p
+  For me this meant there is very very little benefit in using c, either in its
+  libc form, or assembler/linker (i had found a ruby gem to do that easily),
+  or, maybe most importantly, the c calling convention. All of these things
+  are great for c programs, but they are just not made for dynamic languages
+  and that would have brought a whole slew of problems.
+
+%h3 Return address is a parameter
+%p
+  In C calling (probably other languages too), the return address is handled
+  implicitly, usually by the call instruction pushing the pc to the stack. But Arm has a different
+  way, an instruction called Branch With Link, that actually stores the pc in a
+  separate register called Link.
+%p
+  And this made me realise, that really, the return address is always a parameter
+  to a function. Like other parameters it uses a register. It is the C way to
+  hide this implicit parameter, much in the same way as it is the oo way to hide
+  the self parameter.
+%p
+  By this time i was already coding some rudimentary calling convention and it
+  did not take long to verify this in code. It is in fact quite easy to determine
+  the return address at compile time and pass it explicitly. (Easy if one does not
+  use a c linker, that is.)
+
+%h3 OO calling convention
+%p
+  Another thing that deterred me from C is the way it uses the stack. It is so
+  completely not oo and cryptic. It is in other words very difficult to unwind,
+  and almost impossible to implement closures.
+%p
+  Since the assembly had progressed easily, i made performance tests with an oo
+  calling convention, and determined that the price would be
+  =link_to "about 50%." , "/misc/soml_benchmarks.html"
+  Since currently the gap is more than an order of magnitude, this seemed ok,
+  given that it would make the compilation process so much easier.
+%p
+  The resulting calling convention uses normal Message objects that form a linked
+  list, rather than a stack. Since they are completely standard objects, manipulation
+  both at run and compile time is totally integrated.
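+%p
+  In plain ruby the idea looks roughly like the sketch below. SketchMessage and its fields
+  are hypothetical stand-ins, not the actual Parfait Message layout.

```ruby
# Hypothetical message: one ordinary object per call, linked to the calling message.
SketchMessage = Struct.new(:receiver, :method_name, :arguments,
                           :caller_message, :return_value)

# Unwinding is just following the links, no stack layout knowledge needed.
def call_chain(message)
  chain = []
  while message
    chain << message.method_name
    message = message.caller_message
  end
  chain
end

main = SketchMessage.new(:main_object, :main, [], nil, nil)
outer = SketchMessage.new("Hello", :print_twice, [], main, nil)
inner = SketchMessage.new("Hello", :putstring, [], outer, nil)

# returning means writing the result into the message and going back to the caller
inner.return_value = 5
back_to = inner.caller_message
```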
+%p
+  Function calling has been working for years, but recently i cracked dynamic method
+  dispatch too, which was not that hard really. Currently the work is progressing to
+  blocks, and the clear structure does help a lot.
+  And while exceptions (or bindings) are not started, i think they will come with
+  relative ease (compared to the c way), since the structures are very simple.
+
+%h2 Decisions that affect the future
+
+%h3 Metasm
+%p
+  I gave
+  =ext_link "Metasm" , "https://github.com/jjyg/metasm/"
+  several long looks. After all it has assembler and disassembler for at least
+  10 cpus, and support for several binary formats, including elf. The
+  reason not to use it was not that it is big (including much we don't need),
+  but rather that it is unmaintained and unresponsive.
+%p
+  It would be great to split all that code into several gems, a core and one
+  per cpu / binary format / assembly, disassembly. Only the core would need
+  to be integrated into rubyx, and one could just use the platform specific
+  gems. But I am not the one to do this work, was the decision.
+
+%h3 Lock free Concurrency
+%p
+  Concurrency will have to be part of the core, even if it is just to get a gc
+  working. The work that
+  =ext_link "Massalin did" , "http://valerieaurora.org/synthesis/SynthesisOS/abs.html"
+  already showed how effective lock free
+  concurrency is, but Dr Cliff Click took it into the modern (java) world by
+  publishing a
+  =ext_link "lock free hash" , "https://www.youtube.com/watch?v=HJ-719EGIts"
+  that he later ran on some crazy machine with 800 cpus.
+%p
+  I am not sure whether it will be better to port the java code, or try a
+  =ext_link "diy" , "https://preshing.com/20130605/the-worlds-simplest-lock-free-hash-table/"
+  version. And of course to even get started on this rubyx will need the
+  compare and swap primitives that underlie the lock free approach.
+  But all in due time.
+%p
+  The actual concurrency i am envisioning as two os-threads per core. One for kernel
+  interaction and one for normal operation. Kernel calls
+  would never be executed on the second, but always queued on dedicated kernel
+  threads. The non kernel threads would be used to run fibers.
+  If we insert some little check into the calling, switching could happen very often
+  and because of the linked list approach would be very very fast. And because of
+  the offloading of kernel calls would never stall (completely). This way one can
+  achieve the sort of millions of fibers erlang is known for.
+
+%h3 Housekeeping and garbage collection
+%p
+  Often, in systems that are designed to be collected, the base object has some
+  field to support this. This was deliberately left out. RubyX only has objects,
+  so the field would have to be an Object, which is too much overhead.
+  Or there would have to be a dedicated instruction to deal with a raw data word
+  which is too much overhead in another way.
+%p
+  Gc will be a completely external gem, so experimenting will be easy and
+  encouraged. Gc implementers will just have to use their own structures to keep
+  track of the state that they need. Judy style digital trees can do this by actually
+  using less memory than a field would use, but handcrafted bitfields will also be good.
+%p
+  The actual marking phase should be relatively easy, as the world is known completely.
+  There are no grey stack areas where one has to guess, as all objects are typed
+  and the type determines which slots are objects. Not even registers are grey
+  area, as we switch cooperatively; only the Message register is ever valid.
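+%p
+  A mark phase along these lines could be sketched as follows. SketchType, SketchObject and
+  the object_slots field are assumptions for illustration; the point is only that the type
+  says which slots hold objects, and that the mark state lives outside the objects.

```ruby
# Hypothetical typed objects: the type lists which slot indexes hold objects.
SketchType = Struct.new(:object_slots)
SketchObject = Struct.new(:type, :slots)

# Mark everything reachable from the root, keeping state in an external table.
def mark(root)
  marked = {}.compare_by_identity
  work = [root]
  until work.empty?
    object = work.pop
    next if marked[object]
    marked[object] = true
    object.type.object_slots.each do |index|
      child = object.slots[index]
      work << child if child
    end
  end
  marked
end

leaf_type = SketchType.new([])  # only raw data
pair_type = SketchType.new([1]) # slot 0 is raw data, slot 1 is an object

leaf = SketchObject.new(leaf_type, [42])
unreachable = SketchObject.new(leaf_type, [7])
message = SketchObject.new(pair_type, [99, leaf])

marked = mark(message)
```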
+%p
+  In fact, all this makes even moving objects relatively easy. Though there is of
+  course the effort of going through the world to find all backlinks. But if that
+  is done during a mark, it comes at relatively low cost.
+%p
+  All in all a very interesting topic, and surely someone will come up with some
+  great idea. And of course there will have to be the most rudimentary one from
+  the start, just enough to work and give someone motivation to improve it.
@@ -0,0 +1,147 @@
+%p
+  Basic enumerator style blocks were not as bad as i thought. Admittedly i thought they
+  would be close to impossible, so compared to that a few hundred commits are really
+  quite little.
+
+%h2 Different kinds of blocks
+%p
+  To start with let me lay the ground. In ruby code, i see blocks used in basically
+  two kinds of ways. I call the first one the
+  %b implicit block
+  which is what you do when using iterators/enumerable. Ruby lets you pass the
+  block as an
+  %em implicit
+  argument. This is the kind that is implemented and that i will go into detail about.
+%p
+  The other kind i shall call
+  %em explicit
+  is when you define blocks as variables, either with lambda or proc syntax.
+  As a slight complication implicit blocks may be captured and used in the same
+  way as explicit blocks, but let's forget for a moment that i said that.
+  Explicit blocks are good for a more functional style of programming and used much
+  (much?) less. Also they are the ones that will need some expansion on what we have now.
+
+%h2 Implicit Block properties
+%p
+  Since i never had to implement blocks before, it was a bit of a surprise how simple
+  it was.
+  After dynamic dispatch was done i had planned to improve the std library. But i
+  quickly ran into loops, and doing loops without blocks in ruby is just too weird.
+  So i started on blocks instead, which i must admit i thought would be very (very)
+  difficult.
+%p
+  But then i found that actually blocks are very similar to methods, just with a
+  twist:
+  As it turns out, the implicit block calling basically guarantees that the caller's caller
+  is the method where the block is defined. This means one knows all local variables and
+  method args, while compiling the block. And can thus resolve all variable access
+  at compile time, who knew!
+%p
+  Ok, just in case that slipped off too quick, i'll say it again: For the
+  %em implicit
+  blocks, all variables (local/args/instance) are statically known at compile time.
+  And since basic control structures (if/while) are obviously the same inside
+  a block and method, the whole problem of blocks reduces to variable access.
+
+%h2 Base classes
+%p
+  When we have things that are the same in oo, the big oo hammer comes out: inheritance.
+  So i made a base class for Block and Method, called Callable. And similarly
+  a base class for MethodCompiler and its new equivalent BlockCompiler, called
+  CallableCompiler.
+
+%p
+  The reason i mention this much detail is just because i was so surprised how little
+  difference there is between the derived classes. In the case of Block and Method over
+  95% of the code is in the base class, and for the compilers it's still over 80%.
+  It really is only that scope resolution.
+%p
+  The difference is that a Method resolves a variable in its own frame, whereas a Block
+  resolves it in the frame of the caller's caller, ie where it was defined. And since
+  we have a nice and simple calling convention, it is just two extra instructions per
+  variable access.
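+%p
+  Expressed in ruby, the difference might look like the sketch below. SketchFrame and the
+  two helper methods are hypothetical; in the compiler these hops are generated as
+  instructions rather than executed as ruby code.

```ruby
# Hypothetical frame: local variables plus a link to the calling frame.
SketchFrame = Struct.new(:locals, :caller_frame)

# a method finds its variables in its own frame
def method_variable(frame, name)
  frame.locals.fetch(name)
end

# an implicit block finds them two hops up, in the frame of the defining method
def block_variable(frame, name)
  frame.caller_frame.caller_frame.locals.fetch(name)
end

defining_method = SketchFrame.new({ total: 10 }, nil)   # defines and passes the block
iterator = SketchFrame.new({ index: 0 }, defining_method) # eg an each method
block = SketchFrame.new({}, iterator)                     # the block, yielded to by each

from_method = method_variable(defining_method, :total)
from_block = block_variable(block, :total)
```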
+
+%p
+  So, in the hope of proving how crazy fast it would be, i started on benchmarks.
+  But here we come to another story. RubyX does consume memory quite fast, but has
+  no allocation yet. So i could fix it by creating megabytes of shell objects at
+  compile time, or bite the bullet and implement
+  %b "new".
+  Because i'll do that, we have to wait for the numbers.
+
+%h2 Dynamic Blocks
+%p
+  Since i pushed the Procs aside up there, i just want to say that this was not without
+  consideration. I think the solution to Procs is not too difficult and the current state
+  can be expanded to handle them thus: We need to check the method of the caller's caller
+  when entering the block code. If the implicit assumption holds, the code can execute.
+  If not, we need to jump to an alternate version of the code that does the variable
+  resolution dynamically.
+%p
+  Basically that means compiling two alternate versions of the code and having the switch
+  when entering the block code. Again though, since the calling convention is simple,
+  the runtime resolution is relatively simple. And it can even be coded in ruby,
+  since we can call out to a method from the generated code.
+
+%h2 Yin and Yang of Methods and Blocks
+%p
+  Sending for methods is sort of equivalent to yielding for blocks. The two use the exact same
+  calling convention. In fact yield is almost identical to ".send", so when the time
+  comes to do that, we're almost set.
+%p
+  In methods we have the static case, where the method is known
+  at compile time. And then we have dynamic dispatch, where the method is resolved
+  at run-time and called dynamically. But in both cases variable resolution is
+  completely compile-time.
+%p
+  And then we have blocks with the "static" version, where the block that is passed
+  is known at compile-time, but only to the caller, not the callee. So the callee needs
+  to invoke (yield) dynamically, but still the variable resolution is static (compile-time).
+%p
+  And then the dynamic block version (Procs) where no resolution is necessary to
+  call the Proc (since it is given as a variable), but instead the variables
+  have to be resolved at run-time.
+%p
+  To me they are sort of reversely symmetric. I'll have to try and make a diagram
+  one day.
+
+%h2 Side note on Builder
+%p
+  Since i started with the builder and the associated dsl, i got more and more into it.
+  The dsl provides quite readable code, there is sort of assignment and a few shortcuts
+  to other risc instructions. But at the risc level one is really quite busy shuffling
+  data from here to there, so the "assignment" which covers
+  =ext_link "RegToSlot" , "https://github.com/ruby-x/rubyx/blob/master/lib/risc/instructions/reg_to_slot.rb"
+  ,
+  =ext_link "SlotToReg" , "https://github.com/ruby-x/rubyx/blob/master/lib/risc/instructions/slot_to_reg.rb"
+  and
+  =ext_link "Transfer" , "https://github.com/ruby-x/rubyx/blob/master/lib/risc/instructions/transfer.rb"
+  helps a lot.
+
+%p
+  Because of this, i have now rewritten all of the to_risc functions in Mom, which generate
+  risc instructions, using the dsl. Also the builtin code (including div10, shudder) uses
+  the dsl. It is
+  %em much
+  easier to understand, and gets rid of a fair few crutches i created on the way.
+  It's even
+  =link_to  "documented", "/rubyx/builder.html.haml"
+
+%h2 Future
+%p
+  As i said, what i really would want to do now is some benchmarking.
+  At least i got the Fibonacci of 30 to work. That's something! It took 7632 instructions.
+  That doesn't sound too bad, and is in fact twice as fast as mri (theoretically).
+  That means 1000 times fibo(30) per second on a PI.
+%p
+  Alas, we need
+  %b new
+  first, even to count to 1000. That's not too bad in itself, but it does need allocate.
+  That in itself is also not too bad, until you get to that else case,
+  where the memory has run out.
+%p
+  Then there is a mmap syscall and ... what? I guess i'll find out.
+%p
+  A note for the far future: Since we now have different compilers, and we will need
+  alternative code paths before long, inlining doesn't sound so impossible anymore
+  either. Just another compiler with different scoping rules, another type test, another
+  path.