move posts in directory by year
app/views/posts/2018/_04-09-a-dynamic-hello-world.haml
@@ -0,0 +1,115 @@
|
||||
%p
|
||||
Now that i
|
||||
%em have
|
||||
had time to write some more code (250 commits last month), here is
|
||||
the good news:
|
||||
%h2#sending-is-done Sending is done
|
||||
%p
A dynamic language like ruby really has dynamic method resolution at its heart. Without
that we’d be writing C++. Not much can be done in ruby without looking up methods.
%p
Yet all this time i have been running circles around this mother of a problem, because
(after all) it is a BIG one. It must be the single most important reason why dynamic
languages are interpreted and not compiled.
|
||||
|
||||
%h2#a-brief-recap A brief recap
|
||||
%p
Already last year i started on a rewrite, after hitting this exact same wall for the fourth
time. I put in some more layers, the way a good programmer fixes any daunting problem.
%p
The
%a{:href => "https://github.com/ruby-x/rubyx"} Readme
has quite a good summary of the new layers,
and of course i’ll update the architecture soon. But in case you didn’t click, here is the
very very short summary:
|
||||
%ul
%li Vool is a Virtual Object Oriented Language.
Virtual in that it has no syntax of its own. But
it has semantics, and those are substantially simpler than ruby. Vool is Ruby without
the fluff.

%li Mom, the Minimal Object Machine layer, is the first machine layer.
Mom has no concept of memory
yet, only objects. Data is transferred directly from object
to object with one of Mom’s main instructions, the SlotLoad.
%li The Risc layer abstracts the Arm in a minimal and independent way.
It does not model
any real RISC cpu instruction set, but rather implements what is needed for rubyx.
%li Arm and Elf:
There is a minimal
%em Arm
translator that transforms Risc instructions to Arm instructions.
Arm instructions assemble themselves into binary code. A minimal
%em Elf
implementation is
able to create executable binaries from the assembled code and Parfait objects.
%li Parfait:
Generating code (by descending the layers above) is only half the story in an oo system.
The other half is classes, types, constant objects and a minimal run-time. This is
what Parfait is.
|
||||
%h2#compiling-and-building Compiling and building
|
||||
%p
|
||||
After having finished all this layering work, i was back to square
|
||||
= succeed ":" do
|
||||
%em resolve
|
||||
%p
But of course when i got there i started thinking that the resolve method (in ruby)
would itself need method resolution. And after briefly considering cheating (hardcoding type
information into this
%em one
method), i opted to write the code in Risc. Basically assembler.
|
||||
%p
And it was horrible. It worked, but it was completely unreadable. So then i wrote a dsl for
generating risc instructions, using a combination of method_missing, instance_eval and
operator overloading. The result is quite readable code, a mixture between assembler and
a mathematical notation, where one can just freely name registers and move data around
with
%em []
and
= succeed "." do
%em <<
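%p
To give a feel for how those three techniques combine, here is a minimal sketch in plain ruby. It is not the actual rubyx builder: the classes Register, Slot and Builder, the register names and the logged strings are all made up for illustration.

```ruby
# Hypothetical mini-dsl, not the real rubyx builder.
# Unknown names become registers (method_missing), [] names a slot,
# and << records a move into a register (operator overloading).
class Register
  attr_reader :name

  def initialize(name, log)
    @name = name
    @log = log
  end

  # register[index] names a slot inside the object the register points to
  def [](index)
    Slot.new(self, index)
  end

  # register << source records a load into this register
  def <<(source)
    @log << "#{source} -> #{self}"
    self
  end

  def to_s
    name.to_s
  end
end

class Slot
  def initialize(register, index)
    @register = register
    @index = index
  end

  def to_s
    "#{@register}[#{@index}]"
  end
end

class Builder
  attr_reader :instructions

  def initialize
    @instructions = []
    @registers = {}
  end

  # Evaluate the block with the builder as self, so bare names reach method_missing
  def build(&block)
    instance_eval(&block)
    instructions
  end

  # Any unknown name without arguments is treated as a (new or existing) register
  def method_missing(name, *args)
    return super unless args.empty?
    @registers[name] ||= Register.new(name, @instructions)
  end

  def respond_to_missing?(_name, _include_private = false)
    true
  end
end

code = Builder.new.build do
  receiver << message[2]
  receiver_type << receiver[0]
end
puts code
```

The block passed to build reads almost like assignments, while each line only appends a description of a move to the instruction list.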
|
||||
%p
By then resolving worked, but it was still a method. Since it was already in risc, i basically
inlined the code by creating a new Mom instruction and moving the code to its
= succeed "." do
%em to_risc
|
||||
%p
|
||||
A small bug in calling the resulting method was fixed, and
|
||||
= succeed "," do
|
||||
%em voila
|
||||
%h2#the-proof The proof
|
||||
%p
|
||||
Previous, static, Hello Worlds looked like this:
%blockquote
"Hello world".putstring
%p
Of course we can know the type that putstring applies to and so this does not
involve any method resolution at runtime, only at compile time.
|
||||
%p
Today's step is thus:
%blockquote
a = "Hello World"
%br
a.putstring
|
||||
%p
This does involve a run-time lookup of the
%em putstring
method. It being a method on String,
it is indeed found and called. (1) Hurray.
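%p
What such a run-time lookup amounts to can be sketched in a few lines of ruby. This is an illustration only: METHOD_TABLE and resolve_and_call are hypothetical names, and the hand-written table merely stands in for the types and methods that the compiled code consults.

```ruby
# Illustration only: a hand-written table standing in for the types and
# methods that the compiled code consults at run-time.
METHOD_TABLE = {
  String  => { putstring: ->(me) { print(me); me.length } },
  Integer => { div10: ->(me) { me / 10 } }
}.freeze

# Look the method up by the receiver's class and the method name, then call it.
# Nothing about the receiver is assumed before this point.
def resolve_and_call(receiver, name)
  candidates = METHOD_TABLE.fetch(receiver.class)
  candidates.fetch(name).call(receiver)
end

a = "Hello World"
resolve_and_call(a, :putstring)
```

A name that is not in the table for the receiver's class raises a KeyError here, which is the sketch's stand-in for a method that is not found.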
|
||||
%p
|
||||
And maths works too:
|
||||
%blockquote
|
||||
a = 150
|
||||
%br
|
||||
a.div10
|
||||
%p
This does indeed result in 15. Most operators (+, -, <<) work too, even with the
%em new
integers. Part of the rewrite was to upgrade integers to first class objects.
%p
PS(1): I know with more analysis the compiler
%em could
know that
%em a
is a String (or Integer),
but just now it doesn’t. Take my word for it or even better, read the code.
|
||||
app/views/posts/2018/_04-22-four-years-and-going-strong.haml
@@ -0,0 +1,90 @@
|
||||
%p
|
||||
After
|
||||
=link_to "finishing the code," , "/blog/a-dynamic-hello-world"
|
||||
i updated all the docs too!
|
||||
|
||||
%h2 The rewrite
|
||||
%p
Doing anything for the first time is not so easy. I have taught enough by now to see
how central
%em guidance,
the experience of another, is to the process of learning.
%br
I was thinking so much about VMs in the beginning that a lot went sideways.
%p
Now it feels like
=link_to "the abstractions", "rubyx/layers.html"
are coming into focus, the code is clean and relatively easy to understand.
|
||||
%p
During this latest wobble, maybe 500 commits in all, almost everything above the
Risc layer was rewritten. At the low point, i was down to just over 400 tests, but
now i am back strong, at over 800. That's 94% coverage with a
=ext_link "CodeClimate A", "https://codeclimate.com/github/ruby-x/rubyx/"
so that's ok.
%p
In the process i got much closer to the actual goal, which i'll go into in more detail.
|
||||
|
||||
%h2 The docs
|
||||
%p
|
||||
Now i have also cleaned up all the documentation. This does not mean that everything
|
||||
is documented, but i hope one can get a good idea from the docs, and then just
|
||||
read the code.
|
||||
%h3 Architecture
|
||||
%p
The
=link_to "Architecture" , "/rubyx/layers.html"
section gives an overview of the new layers.
%ul
%li Ruby Simplified: Vool
%li Mom, a simple machine with object memory
%li Risc, the old abstraction of a CPU
%li Arm and Elf, to actually generate binaries
%li Parfait and Builtin to get the system up
|
||||
%h3 Parfait
|
||||
%p
There is a separate document describing the classes needed to boot the system.
It says a little about the current state of Types and Classes.
%p
But it should definitely be expanded, and there is nothing about Builtin.
Builtin is the way to write methods that cannot be expressed in ruby. And since
writing them got so messy i wrote a DSL, which is only documented in
=ext_link "code." , "https://github.com/ruby-x/rubyx/blob/master/lib/risc/builder.rb"
|
||||
%h3 Calling
|
||||
%p
|
||||
Since Calling is now done, i documented both the
|
||||
=link_to "calling convention,", ""
|
||||
and the way
|
||||
=link_to "method resolution", ""
|
||||
is done.
|
||||
%h3 Interpreter
|
||||
%p
Of course the
=link_to "Interpreter", "rubyx/debugger.html"
is still working (since the Risc layer didn't change much) and is a large part of
the testing scheme.
%p
And i even got the
=link_to "Debugger" , "/debugger"
working again and integrated into the new site (which is now a rails app).
%h3 Misc
%p
Finally i cleaned up the old mumbo jumbo docs and sorted them a bit into what
is ideas, plans and just background info, in the
=link_to "Misc section" , "/misc/index.html"
|
||||
|
||||
%h2 Next Steps
|
||||
%p
|
||||
The plan for the near future is something like this
|
||||
%ul
|
||||
%li
More complicated tests. Whole methods that do something and
tests containing several methods. All just testing results.
|
||||
%li Better test framework for testing binaries.
|
||||
%li Blocks
|
||||
%li Baby steps towards Stdlib
|
||||
%p As one can see, work happens when i have time and inspiration.
|
||||
%p.full_width
|
||||
=image_tag "github-timeline-2018.jpg"
|
||||
|
||||
But even after 4 years, i haven't given up yet :-)
|
||||
Though i may have given up on any time estimates.
|
||||
app/views/posts/2018/_06-22-1000-tests-and-working-binaries.haml
@@ -0,0 +1,126 @@
|
||||
%p
|
||||
It was almost going to be working binaries and over 1000 tests. But i am coming
|
||||
more and more to the point where software is measured in number of tests, not
|
||||
lines of code.
|
||||
|
||||
%h2 1000+ Tests
|
||||
|
||||
%p.full_width
|
||||
=image_tag "1000_tests.jpg"
|
||||
It was shortly after the last post that i first noticed that 1k was approaching.
|
||||
A little hard to grasp that i have written all those, kind of: what do they
|
||||
all do?
|
||||
%p
A good step too: just about 200 tests in 2 months of work. I noticed that the couple of
times i didn't have good coverage for new code immediately, i started to have
problems and had to write tests later. It seems it is the only way to understand
even my own code anymore: by making the assumptions explicit. Some of the bugs
where tests were missing are just the classics, +1 or -1 errors. And i feel like a real
newbie having to debug for 5 hours to find it was "<" not "<=" .
|
||||
|
||||
%h2 Working binaries
|
||||
|
||||
%p
But of course it feels good to finally have
%b working binaries.
This is after all the first time that i compile real ruby into real binary.
The compiler does of course have many limitations, but what it does, it does
right. Even if it was just Hello World for starters.
%br
=image_tag "hello.jpg"
Of course i tried the next one straight after, "2+2" and .... it worked too.
I don't know if that is just my bad habit or an occupational thing, this
being surprised when things work.
%br
But a little bit about the journey and how this works.
|
||||
|
||||
%h3 Positioning
|
||||
%p
Since the ruby-x approach is oo from the start, we do not rely on the C way of creating
binaries. Instead, the binary is a sort of snapshot of a running system, or
in other words there is only heap.
%p
This means we create binaries that look the same as the memory during runtime,
which is made up of small fixed sized objects. Currently we only have objects
of sizes 2 to the power of 2, 3, 4 and 5. Maybe larger later, but with oo's
complete data hiding it is easy to extend objects transparently.
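%p
As a small sketch of what fixed size classes mean in practice: any requested size is rounded up to the next available class. The unit (words), the constant SIZE_CLASSES and the helper padded_size are assumptions made for this example, not names from the rubyx code.

```ruby
# Assumed size classes, taken from the text: 2**2 up to 2**5 (unit assumed to be words)
SIZE_CLASSES = [4, 8, 16, 32].freeze

# Smallest size class that can hold the requested number of words
def padded_size(words)
  SIZE_CLASSES.find { |size| size >= words } ||
    raise(ArgumentError, "no size class for #{words} words")
end

puts padded_size(9)
```

Anything larger than the biggest class is rejected in this sketch, a real system would have to chain or extend objects instead.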
|
||||
|
||||
%h3 Constant loading
|
||||
%p
Especially for Code (the only objects larger than 16 words currently),
this presented a challenge. Maybe even an extra challenge on top of the
purely static one, because of the way ARM loads constants.
%p
Constant loading happens when a known object or address is loaded into a register.
Arm's fixed-width 32 bit instructions only allow 10 bit constants to be loaded.
So if the constant is larger (eg the object further away) two instructions instead
of one are needed. But this only becomes clear when all positions of objects
have been determined.
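%p
The circular dependency (sizes depend on positions, positions depend on sizes) can be illustrated with a naive fixed-point loop: lay everything out, grow the loads that do not fit, repeat until nothing changes. This is a generic technique shown for illustration, not how rubyx solves it. The item format, the word unit and the name layout are assumptions; the 10 bit limit is simply taken from the text.

```ruby
MAX_SINGLE_LOAD = (1 << 10) - 1 # 10 bit limit as stated in the text

# items: array of hashes, { words: n } for plain data or { load: index } for
# an instruction that loads the position of items[index].
# Returns the position (in words) of every item once all sizes are stable.
def layout(items)
  sizes = items.map { |item| item[:words] || 1 } # optimistic: every load starts at 1 word
  loop do
    positions = []
    offset = 0
    sizes.each do |size|
      positions << offset
      offset += size
    end
    changed = false
    items.each_with_index do |item, index|
      next unless item[:load]
      if sizes[index] == 1 && positions[item[:load]] > MAX_SINGLE_LOAD
        sizes[index] = 2 # sizes only ever grow, so the loop must terminate
        changed = true
      end
    end
    return positions unless changed
  end
end

p layout([{ load: 2 }, { words: 1023 }, { words: 4 }])
```

In the example the first pass places the target just out of reach, the load grows to two words, and that in turn shifts everything behind it by one.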
|
||||
|
||||
%h3 Event approach
|
||||
%p
Of course this is not new, and this is in fact the third time i have coded this,
finally getting it right. The problem gets hairy with the 16 words limit,
when the code overflows the originally assigned length and a new object has
to be inserted.
%p
To keep one method's code continuous, all other methods' code has to be moved up,
and thus a whole lot of positions change. Of course when some object's position
changes, a load depending on that may go from 1 to 2 instructions and so on and
on.
And then there are of course the branches that load their targets (forward and backward
branches), and they need to be updated etc etc.
%p
I now have position objects which fire events, and about 4 different kinds of
listeners reacting in different ways when different objects change. The whole
thing works, though as with many an event system, it is difficult to say
exactly how. (only easy in the small, not the whole i mean)
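%p
The basic shape of such an event system is small, even if the whole is not. The following is a minimal sketch under assumptions: the class Position, its methods on_change and set, and the 16 word distance are invented for the example and are not the rubyx implementation.

```ruby
# Hypothetical position object that notifies listeners when it moves
class Position
  attr_reader :at

  def initialize(at)
    @at = at
    @listeners = []
  end

  # Register a listener, called with the old and the new position
  def on_change(&listener)
    @listeners << listener
    self
  end

  # Move, and fire an event only if the position really changed
  def set(new_at)
    return if new_at == @at
    old_at = @at
    @at = new_at
    @listeners.each { |listener| listener.call(old_at, new_at) }
  end
end

first = Position.new(0)
second = Position.new(16)
# One kind of listener: the second object always follows 16 words after the first
first.on_change { |_old_at, new_at| second.set(new_at + 16) }
# Another kind: just record what happened
moves = []
second.on_change { |old_at, new_at| moves << [old_at, new_at] }

first.set(32)
p second.at
p moves
```

Moving the first position cascades to the second one, which is the small scale version of a whole lot of positions changing at once.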
|
||||
|
||||
%h3 Object continuation
|
||||
%p
|
||||
As i mentioned, it is quite straightforward to have larger data amounts, made up
|
||||
of 16 word chunks, by having a linked list. This is how the BinaryCode objects, that
|
||||
hold the binary code, do it.
|
||||
%p
But with the binaries there is an extra twist to this. The BinaryCode object has a
header (the type and next), which is not code. So the code has to jump over this
header at every end of an object.
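%p
A rough sketch of that chaining, assuming 4 byte words, chunks that sit directly behind each other in memory, and that only non-final chunks end in a branch. The function chain and its hash keys are hypothetical, and the base address in the usage line is only an example value.

```ruby
CHUNK_WORDS  = 16 # chunk size from the text
HEADER_WORDS = 2  # type and next
WORD_BYTES   = 4  # assumed word size

# Returns one hash per chunk: its address, its code words and, for all but the
# last chunk, the branch target (the first code word of the next chunk).
# Assumes the chunks are laid out contiguously starting at base_address.
def chain(instructions, base_address)
  per_chunk = CHUNK_WORDS - HEADER_WORDS - 1 # the last word is reserved for the branch
  instructions.each_slice(per_chunk).each_with_index.map do |code, index|
    address = base_address + index * CHUNK_WORDS * WORD_BYTES
    next_code = address + (CHUNK_WORDS + HEADER_WORDS) * WORD_BYTES
    last = (index + 1) * per_chunk >= instructions.length
    { address: address, code: code, branch_to: last ? nil : next_code }
  end
end

chunks = chain((1..20).to_a, 0x16260)
chunks.each do |chunk|
  target = chunk[:branch_to] ? chunk[:branch_to].to_s(16) : "none"
  puts "#{chunk[:address].to_s(16)}: #{chunk[:code].length} words, branch to #{target}"
end
```

The branch target is always the next chunk's address plus the two header words, so execution never runs into type or next.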
|
||||
|
||||
%p.full_width
|
||||
=image_tag "binary_codes.jpg"
|
||||
This is demonstrated by the object dump above. If the assembly is scary, don't
|
||||
worry, just look at the top left, address 16260, where the BinaryCode object for
|
||||
the main method starts. You see the first two words are separated, as i said the
|
||||
type and next (see the 162a0 value is the address of the BinaryCode on the right).
|
||||
%p
|
||||
Mainly i wanted to demonstrate the jump, which is the last instruction on the left
|
||||
side. The
|
||||
%b b
|
||||
stands for branch and the address 162a8 is exactly the code of the next BinaryCode,
|
||||
ie just after the header.
|
||||
%p
|
||||
You can just make out on the bottom left, that this is in fact the code for the
|
||||
"Hello World" , as it jumps to the (Word_Type.) putstring.
|
||||
|
||||
%h2 Next steps
|
||||
%p
|
||||
Hello World is of course a very small step and work will continue on making other
|
||||
things work. On the Interpreter side, many more things, like loops, conditionals,
|
||||
maths and dynamic dispatch already work.
|
||||
%p
|
||||
Luckily, part of this push was to make the Interpreter a platform similar to the
|
||||
Arm. So it too has BinaryCode and works with addresses, not objects as before.
|
||||
In short the differences between Interpreter and Arm have shrunk, and there is
|
||||
good reason to believe that much will work quite soon.
|
||||
%p
|
||||
Next i will build a testing framework to test the same code on Interpreter and
|
||||
Arm and see that both work. And specifically get all those working Interpreter
|
||||
tests working on Arm.
|
||||
%p
|
||||
I think then it is time for some benchmarks. It has been a while since
|
||||
=link_to "i made some," , "/misc/soml_benchmarks.html"
|
||||
and they were quite promising. Especially loops of the Hello World and
|
||||
Fibonacci.
|
||||
%p
|
||||
On the further horizon i was planning for continuations next, probably with
|
||||
a small rework of the return mechanism (unified return sequence).
|
||||
@@ -0,0 +1,184 @@
|
||||
%p
|
||||
Of course the
|
||||
=link_to "architecture" , "/rubyx/layers.html"
|
||||
gives a good overview of the system as it is. But it does not explain how we got
|
||||
there. And sometimes knowing the journey makes it easier to understand where
|
||||
one is. So i shall try to highlight the four or five main
|
||||
|
||||
%h2 Macbook + Ruby == Raspberry Pi
|
||||
|
||||
%p.full_width
|
||||
=image_tag "mac_plus.png"
|
||||
When i bought my first 30Euro Pi i noticed that ruby is unusable on it.
|
||||
Looking at how slow ruby actually is, it occurred to me that ruby just about turns
|
||||
the Pi into my first 286 laptop (running at 6MHz), which is the same as turning my
|
||||
MacBook Pro into a Pi.
|
||||
%p
|
||||
Of course, while working on web-apps, which can be parallelized so easily, and with
|
||||
a company paying both developer and hardware, the std ruby argument holds.
|
||||
But since i wanted to use my pi for demanding projects something had to be done.
|
||||
|
||||
%h2 Judy, the importance of cpu cache
|
||||
|
||||
%p
|
||||
=ext_link "Judy" , "http://judy.sourceforge.net/"
|
||||
is a really really fast digital tree, kind of hash. I actually built a memory
|
||||
database with it that was also really really fast. When connecting it to rails i
|
||||
ran into the above problem, the niceties of ActiveRecord (ruby) brought performance
|
||||
of my extension (c) down by a factor of 40.
|
||||
%p
But anyway, the point is that Judy's speed is based on a radical optimisation for
cache lines (and key compression). This means all data structures are exactly a cpu
cache line big. As i learned, cpus do not access memory in word sizes, but instead, always
a cache line at a time. This basically led to ruby-x's memory model, which is
fixed sized objects, multiples of a cache-line.
|
||||
|
||||
%h3 Microkernel
|
||||
%p
|
||||
As a young engineer, i thought, like my peers, that Linux (then 0.93) was the greatest
|
||||
thing. Only much later did i learn that it is just a copy really, and the reason
|
||||
it got popular was not technical, but licensing (Same reason it is in Android i
|
||||
believe). The reason it stayed popular is inertia, in other words writing device
|
||||
drivers is hard.
|
||||
%p
|
||||
=ext_link "Synthesis," , "https://en.wikipedia.org/wiki/Self-modifying_code#Massalin's_Synthesis_kernel"
|
||||
=ext_link "L4,", "https://en.wikipedia.org/wiki/L4_microkernel_family"
|
||||
and
|
||||
=ext_link "Minix," , "http://www.minix3.org/"
|
||||
are good proof that the superior architecture is the Microkernel. Eg L4 can run
|
||||
another OS as an application with about 4% performance degradation. Or Minix can
|
||||
recover from a device driver failure.
|
||||
%p
|
||||
This, plus the fact that we have bundler, brought me to the approach that:
|
||||
If you can leave it out, do. Much of the functionality that is in ruby (mri),
|
||||
will never be in RubyX, but rather supplied by gems.
|
||||
|
||||
%h3 System interrupts
|
||||
%p
In the beginning i was of course contemplating how much of c based systems i would use.
Like LLVM, which is of course a great tool, though made for c-ish applications.
Or libc, which again is really for c apps to access the kernel.
%p
The sheer size of the functionality one inherits almost swayed me. Even though i had long
since determined that one of ruby's biggest flaws, its std-lib, came from modelling
and using libc.
%p
Then i learned assembler and looked at libc implementations and learned what i
believe made the decision: Kernel calls are not really calls at all. They are
software interrupts, which basically means you fill some registers, flick the
switch, and at the next instruction you can collect the result in a specified
register. This may look like a call, and of course, by using libc it is presented
as a call, but it is not. It is a very simple set of assembler instructions.
%p
For me this meant there is very very little benefit in using c, either in its
libc form, or assembler/linker (i had found a ruby gem to do that easily),
or, maybe most importantly, the c calling convention. All of these things
are great for c programs, but they are just not made for dynamic languages
and that would have brought a whole slew of problems.
|
||||
|
||||
%h3 Return address is a parameter
|
||||
%p
|
||||
In C calling (probably other languages too), the return address is determined
|
||||
in the callee, usually by pushing the pc to the stack. But Arm has a different
|
||||
way, an instruction called Branch With Link, that actually stores the pc in a
|
||||
separate register called Link.
|
||||
%p
And this made me realise that, really, the return address is always a parameter
to a function. Like other parameters it uses a register. It is the C way to
hide this implicit parameter, much in the same way as it is the oo way to hide
the self parameter.
|
||||
%p
|
||||
By this time i was already coding some rudimentary calling convention and it
|
||||
did not take long to verify this in code. It is in fact quite easy to determine
|
||||
the return address at compile time and pass it explicitly. (Easy if one does not
|
||||
use a c linker that is)
|
||||
|
||||
%h3 OO calling convention
|
||||
%p
|
||||
Another thing that deterred me from C is the way they use the stack. It is so
|
||||
completely not oo and cryptic. It is in other words very difficult to unwind,
|
||||
and almost impossible to implement closures.
|
||||
%p
|
||||
Since the assembly had progressed easily, i made performance tests with an oo
|
||||
calling convention, and determined that the price would be
|
||||
=link_to "about 50%." , "/misc/soml_benchmarks.html"
|
||||
Since currently the gap is more than an order of magnitude, this seemed ok,
|
||||
given that it would make the compilation process so much easier.
|
||||
%p
|
||||
The resulting calling convention uses normal Message objects that form a linked
|
||||
list, rather than a stack. Since they are completely standard objects, manipulation
|
||||
both at run and compile time is totally integrated.
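%p
To make the linked list idea concrete, here is a minimal sketch in ruby. The Message struct, its field names and the three helpers are hypothetical and much simpler than anything in rubyx; the point is only that a call links a new object in front, and that the return address travels as ordinary data.

```ruby
# Hypothetical Message object; the field names are illustrative, not rubyx's
Message = Struct.new(:caller_message, :receiver, :arguments, :return_address)

# Calling means linking a new Message in front of the current one.
# The return address is passed explicitly, like any other argument.
def call_with(current, receiver, arguments, return_address)
  Message.new(current, receiver, arguments, return_address)
end

# Returning hands back the caller's Message and the address to continue at.
def return_from(message)
  [message.caller_message, message.return_address]
end

# Unwinding is just a walk along the list.
def depth(message)
  count = 0
  while message
    count += 1
    message = message.caller_message
  end
  count
end

main = Message.new(nil, :main, [], nil)
inner = call_with(main, "Hello World", [], :after_putstring)
puts depth(inner)
```

Because every Message is a plain object, the same code that inspects the list here could run at compile time or at run time.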
|
||||
%p
|
||||
Function calling has been working for years, but recently i cracked dynamic method
|
||||
dispatch too, which was not that hard really. Currently the work is progressing to
|
||||
blocks, and the clear structure does help a lot.
|
||||
And while exceptions (or bindings) are not started, i think they will come with
|
||||
relative ease (compared to the c way), since the structures are very simple.
|
||||
|
||||
%h2 Decisions that affect the future
|
||||
|
||||
%h3 Metasm
|
||||
%p
|
||||
I gave
|
||||
=ext_link "Metasm" , "https://github.com/jjyg/metasm/"
|
||||
several long looks. After all it has assembler and disassembler for at least
|
||||
10 cpus, and support for several binary formats, including elf. The
|
||||
reason not to use it was not that it is big (including much we don't need).
|
||||
But rather that it is unmaintained and unresponsive.
|
||||
%p
|
||||
It would be great to split all that code into several gems, a core and one
|
||||
per cpu / binary format / assembly, disassembly. Only the core would need
|
||||
to be integrated into rubyx, and one could just use the platform specific
|
||||
gems. But I am not the one to do this work, was the decision.
|
||||
|
||||
%h3 Lock free Concurrency
|
||||
%p
|
||||
Concurrency will have to be part of the core, even if it is just to get a gc
|
||||
working. The work that
|
||||
=ext_link "Massalin did" , "http://valerieaurora.org/synthesis/SynthesisOS/abs.html"
|
||||
already showed how effective lock free
|
||||
concurrency is, but Dr Cliff took it into the modern (java) world by
|
||||
publishing a
|
||||
=ext_link "lock free hash" , "https://www.youtube.com/watch?v=HJ-719EGIts"
|
||||
that he later ran on some crazy machine with 800 cpus.
|
||||
%p
|
||||
I am not sure whether it will be better to port the java code, or try a
|
||||
=ext_link "diy" , "https://preshing.com/20130605/the-worlds-simplest-lock-free-hash-table/"
|
||||
version. And of course to even get started on this rubyx will need the
compare and swap primitives that underlie the lock free approach.
|
||||
But all in due time.
|
||||
%p
|
||||
The actual concurrency i am envisioning as two os-threads per core. One for kernel
|
||||
interaction and one for normal operation. Kernel calls
|
||||
would never be executed on the second, but always queued on dedicated kernel
|
||||
threads. The non kernel threads would be used to run fibers.
|
||||
If we insert some little check into the calling, switching could happen very often
|
||||
and because of the linked list approach would be very very fast. And because of
|
||||
the offloading of kernel calls would never stall (completely). This way one can
|
||||
achieve the sort of millions of fibers erlang is known for.
|
||||
|
||||
%h3 House keeping and garbage collection
|
||||
%p
|
||||
Often, in systems that are designed to be collected, the base object has some
|
||||
field to support this. This was deliberately left out. RubyX only has objects,
|
||||
so the field would have to be an Object, which is too much overhead.
|
||||
Or there would have to be a dedicated instruction to deal with a raw data word
|
||||
which is too much overhead in another way.
|
||||
%p
|
||||
Gc will be a completely external gem, so experimenting will be easy and
|
||||
encouraged. Gc implementers will just have to use their own structures to keep
|
||||
track of the state that they need. Judy style digital trees can do this by actually
|
||||
using less memory than a field would use, but handcrafted bitfields will also be good.
|
||||
%p
|
||||
The actual marking phase should be relatively easy, as the world is known completely.
|
||||
There are no grey stack areas where one has to guess, as all objects are typed
|
||||
and the type determines which slots are objects. Not even registers are grey
|
||||
area, as we switch cooperatively; only the Message register is ever valid.
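%p
A marking phase over fully typed objects could look roughly like the sketch below. Everything in it is assumed for illustration: HeapObject, the TYPES table and mark are invented names, and the marks are kept in an external set precisely because the objects themselves carry no gc field.

```ruby
require "set"

# Hypothetical heap object: the type alone says which slots hold references
HeapObject = Struct.new(:type, :slots)
TYPES = {
  message: [:caller_message, :receiver], # both slots are object references
  word:    []                            # raw character data only, nothing to follow
}.freeze

# Mark everything reachable from the root. The marks live outside the objects,
# keyed by object identity, so no field is needed in the objects themselves.
def mark(root)
  marked = Set.new
  pending = [root]
  until pending.empty?
    object = pending.pop
    next if object.nil? || marked.include?(object.object_id)
    marked << object.object_id
    TYPES.fetch(object.type).each { |slot| pending << object.slots[slot] }
  end
  marked
end

hello = HeapObject.new(:word, {})
garbage = HeapObject.new(:word, {})
main = HeapObject.new(:message, { caller_message: nil, receiver: hello })
marked = mark(main)
puts marked.size
```

No guessing is involved at any point: the type table decides which slots are followed, and anything not reached from the root stays unmarked.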
|
||||
%p
In fact, all this makes even moving objects relatively easy. Though there is of
course the effort of going through the world to find all backlinks. But if that is
done during a mark, it comes at relatively low cost.
%p
All in all a very interesting topic, and surely someone will come up with some
great idea. And of course there will have to be the most rudimentary one from
the start, just enough to work and give someone motivation to improve it.
|
||||
app/views/posts/2018/_08-20-implicit-blocks-are-working.haml
@@ -0,0 +1,147 @@
|
||||
%p
Basic enumerator style blocks were not as bad as i thought. Admittedly i thought they
would be close to impossible, so compared to that a few hundred commits are really
quite little.

%h2 Different kinds of blocks
%p
To start with let me lay the ground. In ruby code, i see blocks used in basically
two kinds of ways. I call the first one the
%b implicit block
which is what you do when using iterators/enumerable. Ruby lets you pass the
block as an
%em implicit
argument. This is the kind that is implemented and that i will go into detail about.
%p
The other kind, which i shall call
%em explicit,
is when you define blocks as variables, either with lambda or proc syntax.
As a slight complication implicit blocks may be captured and used in the same
way as explicit blocks, but let's forget for a moment that i said that.
Explicit blocks are good for a more functional style of programming and used much
(much?) less. Also they are the ones that will need some expansion on what we have now.
|
||||
|
||||
%h2 Implicit Block properties
|
||||
%p
|
||||
Since i never had to implement blocks before, it was a bit of a surprise how simple
|
||||
it was.
|
||||
After dynamic dispatch was done i had planned to improve the std library. But i
|
||||
quickly ran into loops, and doing loops without blocks in ruby is just too weird.
|
||||
So i started on blocks instead, which i must admit i thought would be very (very)
|
||||
difficult.
|
||||
%p
|
||||
But then i found that actually blocks are very similar to methods, just with a
|
||||
twist:
|
||||
As it turns out, the implicit block calling basically guarantees that the caller's caller
|
||||
is the method where the block is defined. This means one knows all local variables and
|
||||
method args, while compiling the block. And can thus resolve all variable access
|
||||
at compile time, who knew!
|
||||
%p
|
||||
Ok, just in case that slipped off too quick, i'll say it again: For the
|
||||
%em implicit
|
||||
blocks, all variables (local/args/instance) are statically known at compile time.
|
||||
And since basic control structures (if/while) are obviously the same inside
|
||||
a block and method, the whole problem of blocks reduces to variable access.
|
||||
|
||||
%h2 Base classes
|
||||
%p
|
||||
When we have things that are the same in oo, the big oo hammer comes out: inheritance.
|
||||
So i made a base class for Block and Method, called Callable. And similarly
|
||||
a base class for MethodCompiler and its new equivalent BlockCompiler, called
|
||||
CallableCompiler.
|
||||
|
||||
%p
|
||||
The reason i mention this much detail is just because i was so surprised how little
|
||||
difference there is between the derived classes. In the case of Block and Method over
|
||||
95% of the code is in the base class, and for the compilers it's still over 80%.
|
||||
It really is only that scope resolution.
|
||||
%p
The difference is that a Method resolves a variable in its own frame, whereas a Block
resolves it in the frame of the caller's caller, ie where it was defined. And since
we have a nice and simple calling convention, it is just two extra instructions per
variable access.
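%p
The difference between the two kinds of access fits in a few lines. This is a sketch with made up names (Frame, method_variable, block_variable), where a frame is just a struct with a link to its caller and a hash of locals.

```ruby
# Hypothetical frame: a link to the calling frame plus the local variables
Frame = Struct.new(:caller_frame, :locals)

# A method finds its variables in its own frame
def method_variable(frame, name)
  frame.locals.fetch(name)
end

# An implicit block finds them two links up: in the caller's caller,
# which is the frame of the method that defined the block
def block_variable(frame, name)
  frame.caller_frame.caller_frame.locals.fetch(name)
end

defining = Frame.new(nil, { sum: 0 })        # the method containing the block
iterator = Frame.new(defining, { index: 1 }) # eg an each that yields
block    = Frame.new(iterator, { item: 5 })  # the block being executed

puts block_variable(block, :sum)
puts method_variable(iterator, :index)
```

The two chained caller_frame accesses are the sketch's counterpart of the two extra instructions per variable access.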
|
||||
|
||||
%p
So, in the hope of proving how crazy fast it would be, i started on benchmarks.
But here we come to another story. RubyX does consume memory quite fast, but has
no allocation yet. So i could fix it by creating megabytes of shell objects at
compile time, or bite the bullet and implement
%b "new".
Since i'll do the latter, we have to wait for the numbers.
|
||||
|
||||
%h2 Dynamic Blocks
|
||||
%p
|
||||
Since i pushed the Procs aside up there, i just want to say that this was not without
|
||||
consideration. I think the solution to Procs is not too difficult and the current state
|
||||
can be expanded to handle them thus: We need to check the method of the caller's caller
|
||||
when entering the block code. If the implicit assumption holds, the code can execute.
|
||||
If not, we need to jump to an alternate version of the code that does the variable
|
||||
resolution dynamically.
|
||||
%p
|
||||
Basically that means compiling two alternate versions of the code and having the switch
|
||||
when entering the block code. Again though, since the calling convention is simple,
|
||||
the runtime resolution is relatively simple. And it can even be coded in ruby,
|
||||
since we can call out to a method from the generated code.
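%p
The switch between the two versions might be sketched as follows. Note the assumptions: Frame and block_variable are hypothetical names, and the dynamic fallback (searching outwards along the callers for the defining method's frame) is only one guess at what a run-time resolution could do, since that part is not implemented yet.

```ruby
# Hypothetical frame: caller link, name of the executing method, local variables
Frame = Struct.new(:caller_frame, :method_name, :locals)

# Returns [path_taken, value] so the example can show which version ran
def block_variable(block_frame, defining_method, name)
  candidate = block_frame.caller_frame&.caller_frame
  # Fast path: the implicit assumption holds, the caller's caller defined the block
  if candidate&.method_name == defining_method
    return [:static, candidate.locals.fetch(name)]
  end
  # Assumed fallback: search further out for a frame of the defining method
  candidate = candidate.caller_frame until candidate.nil? || candidate.method_name == defining_method
  raise KeyError, "no frame of #{defining_method} found" if candidate.nil?
  [:dynamic, candidate.locals.fetch(name)]
end

defining = Frame.new(nil, :sum_all, { total: 7 })

# Implicit block: sum_all -> each -> block
each_frame = Frame.new(defining, :each, {})
p block_variable(Frame.new(each_frame, :block, {}), :sum_all, :total)

# Captured block called from deeper down: sum_all -> helper -> call -> block
helper = Frame.new(defining, :helper, {})
call_frame = Frame.new(helper, :call, {})
p block_variable(Frame.new(call_frame, :block, {}), :sum_all, :total)
```

Both calls find the same variable, the first through the compile-time assumption and the second through the slower search.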
|
||||
|
||||
%h2 Yin and Yang of Methods and Blocks
|
||||
%p
|
||||
Sending for methods is sort of equivalent to yielding for blocks. The two use the exact same
|
||||
calling convention. In fact yield is almost identical to ".send", so when the time
|
||||
comes to do that, we're almost set.
|
||||
%p
|
||||
In methods we have the static case, where the method is known
|
||||
at compile time. And then we have dynamic dispatch, where the method is resolved
|
||||
at run-time and called dynamically. But in both cases variable resolution is
|
||||
completely compile-time.
|
||||
%p
|
||||
And then we have blocks with the "static" version, where the block that is passed
|
||||
is known at compile-time, but only to the caller, not the callee. So the callee needs
|
||||
to invoke (yield) dynamically, but still the variable resolution is static (compile-time).
|
||||
%p
|
||||
And then the dynamic block version (Procs) where no resolution is necessary to
|
||||
call the Proc (since it is given as a variable), but instead the variables
|
||||
have to be resolved at run-time.
|
||||
%p
|
||||
To me they are sort of reversely symmetric. I'll have to try and make a diagram
|
||||
one day.
|
||||
|
||||
%h2 Side note on Builder
|
||||
%p
|
||||
Since i started with the builder and the associated dsl, i got more and more into it.
|
||||
The dsl provides quite readable code, there is sort of assignment and a few shortcuts
|
||||
to other risc instructions. But at the risc level one is really quite busy shuffling
|
||||
data from here to there, so the "assignment" which covers
|
||||
=ext_link "RegToSlot" , "https://github.com/ruby-x/rubyx/blob/master/lib/risc/instructions/reg_to_slot.rb"
|
||||
,
|
||||
=ext_link "SlotToReg" , "https://github.com/ruby-x/rubyx/blob/master/lib/risc/instructions/slot_to_reg.rb"
|
||||
and
|
||||
=ext_link "Transfer" , "https://github.com/ruby-x/rubyx/blob/master/lib/risc/instructions/transfer.rb"
|
||||
helps a lot.
|
||||
|
||||
%p
|
||||
Because of this, i have now rewritten all of the to_risc functions in Mom, which generate
risc instructions, to use the dsl. Also the builtin code (including div10, shudder) uses
|
||||
the dsl. It is
|
||||
%em much
|
||||
easier to understand, and gets rid of a fair few crutches i created on the way.
|
||||
It's even
|
||||
=link_to "documented", "/rubyx/builder.html.haml"
|
||||
|
||||
%h2 Future
|
||||
%p
|
||||
As i said, what i really would want to do now is some benchmarking.
|
||||
At least i got the Fibonacci of 30 to work. That's something! It took 7632 instructions.
|
||||
That doesn't sound too bad, and is in fact twice as fast as mri (theoretically).
|
||||
That means 1000 times fibo(30) per second on a PI.
|
||||
%p
|
||||
Alas, we need
|
||||
%b new
|
||||
first, even to count to 1000. That's not too bad in itself, but it does need allocate.
|
||||
That in itself is also not too bad, until you get to that else case,
|
||||
where the memory has run out.
|
||||
%p
|
||||
Then there is a mmap syscall and ... what? I guess i'll find out.
|
||||
%p
|
||||
A note for the far future: Since we now have different compilers, and we will need
|
||||
alternative code paths before long, inlining doesn't sound so impossible anymore
|
||||
either. Just another compiler with different scoping rules, another type test, another
|
||||
path.
|
||||