Proofread and new pic

Torsten Rüger 2019-08-21 16:30:26 +03:00
parent 90977d1a7c
commit 881afd864a
5 changed files with 70 additions and 224 deletions


@ -1,80 +0,0 @@
%hr/
%p
layout: site
author: Torsten
%p
Part of what got me started on this project was the intuition that our programming model is in some way broken, and so by
good old programmer's logic (you haven't understood it till you've programmed it) I started to walk into the fog.
%h3#fpgas FPGAs
%p
Don't ask me why they should be called Field Programmable Gate Arrays, but they have fascinated me for years,
because of course they offer the “ultimate” in programming. Do away with fixed cpu instruction sets and get the program in silicon. Yeah!
%p
But several attempts at learning the black magic have left me only a little the wiser.
Verilog and VHDL are the languages that make up 80-90% of what is used, and they are not object oriented,
or in any way user friendly. So that has been on the long
list, until i bumped into
%a{:href => "http://pshdl.org/"} pshdl
by way of Karsten's
= succeed "." do
%a{:href => "https://www.youtube.com/watch?v=Er9luiBa32k"} excellent video on it
%p
But what struck me is something he said: that in hardware programming it's all about getting your design/program to fit into
the space you have, and making the timing of the gates work.
%p
And i realized that is what is missing from our programming model: time and space. There is no time, as calls happen
sequentially / always immediately. And there is no space as we have global memory with random access, unlimited by virtual
memory. But the world we live in is governed by time and space, and that governs the way our brain works.
%h3#active-objects-vs-threads Active Objects vs threads
%p
That is of course not so new, and the actor model has been created to fix that. And while i haven't used it much,
i believe it does, especially in non techie problems. And
%a{:href => "http://celluloid.io/"} Celluloid
seems to be a great
implementation of that idea.
%p
Of course Celluloid needs native threads, so you'll need to run rubinius or jruby. Understandably. And so we have
a fix for the problem, if we use celluloid.
%p
But it is a fix, it is not part of the system. The system has sequential calls per thread and threads. Threads are evil as
i explain (rant about?)
= succeed "," do
%a{:href => "/rubyx/threads.html"} here
%h3#messaging-with-inboxes Messaging with inboxes
%p
If you read the rant (it is a little older) you'll see that it established the problem (shared global memory) but does not propose a solution as such. The solution came from a combination of the rant,
the
%a{:href => "/2014/07/17/framing.html"} previous post
and the fpga physical perspective.
%p
A physical view would be that we have a fixed number of object places on the chip (like a cache) and
as the previous post explains, sending is creating a message (yet another object) and transferring
control. Now in a physical view control is not in one place like in a cpu. Any gate can switch at
any cycle, so any object could be “active” at every cycle (without going into any detail about what that means).
%p
But it got me thinking how that would be coordinated, because one object doing two things may lead
to trouble. But one of the Synthesis ideas was
%a{:href => "http://valerieaurora.org/synthesis/SynthesisOS/ch5.html"} lock free synchronisation
by use of a test-and-swap primitive.
%p
So if every object had an inbox, in a similar way that each object has a class now, we could create
the message and put it there. And by default we would expect it to be empty, and test that and if
so put our message there. Otherwise we queue it.
%p
From a sender's perspective the process is: create a new Message, fill it with data, put it in the
receiver's inbox. From a receiver's perspective it is: check your inbox, if empty do nothing,
otherwise do what it says. Doing what it says could easily include the ruby rules for finding methods,
ie check if you yourself have a method by that name, send to super if not, etc.
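%p
The sender and receiver steps above can be sketched in plain ruby. This is only an illustration under assumptions: Message, ActiveObject, deliver and process are made-up names, and a Mutex stands in for the atomic test-and-swap primitive that a real implementation would use.

```ruby
# Hypothetical sketch, none of these names exist in the project.
Message = Struct.new(:name, :args)

class ActiveObject
  def initialize
    @inbox = nil        # expected to be empty most of the time
    @queue = []         # overflow for when the inbox is already taken
    @lock  = Mutex.new  # stand-in for an atomic test-and-swap
  end

  # Sender side: test whether the inbox is empty and if so put the
  # message there, otherwise queue it.
  def deliver(message)
    @lock.synchronize do
      if @inbox.nil?
        @inbox = message
      else
        @queue << message
      end
    end
    self
  end

  # Receiver side: check the inbox, if empty do nothing,
  # otherwise do what it says.
  def process
    message = @lock.synchronize do
      current = @inbox
      @inbox = @queue.shift # nil when nothing is queued
      current
    end
    return nil unless message
    # "do what it says" falls back on the normal ruby method lookup
    public_send(message.name, *message.args)
  end
end

# A small receiver to try it out with
class Counter < ActiveObject
  attr_reader :count

  def initialize
    super
    @count = 0
  end

  def add(number)
    @count += number
  end
end
```

%p
A Counter that is sent two add messages keeps the first in its inbox and queues the second. Each call to process then handles exactly one message, so the object never does two things at once.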
%p
In an fpga setting this would be even nicer, as all lookups could be implemented by associative memory
and thus happen in one cycle. Though there would be some manager needed to manage which objects are
on the chip and which could be hoisted off. Nothing more complicated than a virtual memory manager though.
%p
The inbox idea represents a solution to the thread problem and has the added benefit of being easy to understand and
possibly even to implement. It should also make it safe to run several kernel threads, though i prefer the idea of
only having one or two kernel threads that do exclusively system calls and the rest with green threads that use
home grown scheduling.
%p
This approach also makes one way messaging very natural though one would have to invent a syntax for
that. And futures should come easy too.


@ -1,93 +0,0 @@
%hr/
%p
layout: site
author: Torsten
%p
As noted in previous posts, differentiating between compile- and run-time is one of the more
difficult things in doing the vm. That is because the computing that needs to happen is so similar,
in other words almost all of the vm - level is available at run-time too.
%p But of course we try to do as much as possible at compile-time.
%p
One hears or reads that exactly this is a topic that causes other vms problems too.
Specifically, how one assures that what is compiled at compile-time and at run-time is
identical or at least compatible.
%h2#inlining Inlining
%p
The obvious answer seems to me to
= succeed ".In" do
%strong use the same code
%p
Let's take a simple example of accessing an instance variable. This is of course available at
run-time through the function
%em instance_variable_get
, which could go something like:
%pre
%code
:preserve
def instance_variable_get name
index = @layout.index name
return nil unless index
at_index(index)
end
%p
Let's assume the
%em builtin
at_index function and take the layout to be an array like structure.
As noted in previous posts, when this is compiled we get a Method with Blocks, and exactly one
Block will initiate the return. The previous post detailed how at that time the return value will
be in the ReturnSlot.
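%p
To make the lookup concrete, here is a runnable stand-in for the method above. It is a sketch under assumptions: LayoutObject is a made-up class, the layout is a plain array of names, and at_index is an ordinary method instead of a builtin.

```ruby
# Hypothetical stand-in, LayoutObject is not a rubyx class.
class LayoutObject
  def initialize(layout, values)
    @layout = layout # array like list of instance variable names
    @values = values # the slots, in the same order as the layout
  end

  # stands in for the builtin at_index
  def at_index(index)
    @values[index]
  end

  # same shape as the method in the text: find the index in the
  # layout, return nil if the name is unknown, else read the slot
  def instance_variable_get(name)
    index = @layout.index(name)
    return nil unless index
    at_index(index)
  end
end
```

%p
With a layout of [:name, :age] and the values ["rubyx", 5], asking for :age reads slot 1, and asking for an unknown name gives nil.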
%p
So then we get to the idea of how: We “just” need to take the blocks from the method and paste
them where the instance variable is accessed. The following code will pick the value from the ReturnSlot
as it would any other value and continue.
%p
The only glitch in this plan is that the code will assume a new message and frame. But if we just
paste it, it will use message/frame/self from the enclosing method. So that is where the work is:
translating slots from the inner, inlined function to the outer one. Possibly creating new frame
entries.
%h2#inlining-what Inlining what
%p
But let's take a step back from the mechanics and look at what it is we need to inline. The above
example seems to suggest we inline code. Code, as in text, is of course impossible to inline.
That's because we have no information about it and so the argument passing and returning can't
possibly work. Quite apart from the tricky possibility of shadow variables, ie the inlined code
assigning to variables of the outside function.
%p
Ok, so then we just take our parsed code, the abstract syntax tree. There we have all the
information we need to do the magic, at least it looks like that.
But, we may not have the ast!
%p
The idea is to be able to make the step to a language independent system. Hence the sof (salama
object file), even if it has no reader yet. The idea being that we store object files of any
language in sof and the vm would read those.
%p
To do that we need to inline at the vm instruction level. Which in turn means that we will need
to retain enough information at that level to be able to do that. What that entails in detail
is unclear at the moment, but it gives a good direction.
%h2#a-rough-plan A rough plan
%p
To recap the function calling at the instruction level. Btw it should be clear that we can
not inline method sends, as we don't know which function is being called. But of course the
actual send method may be inlined and that is in fact part of the aim.
%p
To call a function, a NewMessage is created, loaded with args and stuff, then the FunctionCall is
issued. Upon entering, a new frame may be created for local and temporary variables and at the
end the function returns. When it returns the return value will be in the Return slot and the
calling method will grab it if interested and swap the Message back to what it was before the call.
%p
From that (and at that level) it becomes clearer what needs to be done, and it starts with the
caller, of course. In the caller there needs to be a way to make the decision whether to
inline or not. For the run-time stuff we need a list for “always inline”, later a complexity
analysis, later a run-time analysis. When the decision goes to inline, the message setup will
be skipped. Instead a mapping needs to be created from the called function's argument names to
the newly created (unique) local variables.
Then, going through the instructions, references to arguments must be exchanged with references
to the new variables. A similar process needs to replace references to local variables in the
called method to local variables in the calling method. Similarly the return and self slots need
to be mapped.
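%p
The renaming step can be sketched on a toy instruction list. Everything here is an assumption made for illustration: Instruction and inline are made-up names, and real vm instructions carry more than an op, a target and a source.

```ruby
# Hypothetical sketch, not the actual vm instruction classes.
Instruction = Struct.new(:op, :target, :source)

# Returns a copy of the callee's instructions in which every argument,
# local and return slot is renamed to a unique local of the caller.
# Slots that are not in slot_names are left untouched.
def inline(callee_instructions, slot_names, prefix)
  mapping = slot_names.map { |name| [name, :"#{prefix}_#{name}"] }.to_h
  callee_instructions.map do |instruction|
    Instruction.new(instruction.op,
                    mapping.fetch(instruction.target, instruction.target),
                    mapping.fetch(instruction.source, instruction.source))
  end
end
```

%p
The caller would then append one more move from the renamed return slot to its own real return, as described in the next paragraph. The prefix has to be unique per inlined call, otherwise two inlined copies of the same method would share their locals.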
%p
After the final instruction of the called method, the reassigned return must be moved to the real
return and the calling function may continue. And while this may sound like a lot, one must remember
that the instruction set of the machine is quite small, and further refinement
(abstracting base classes for example) can be done to make the work easier.

Binary file not shown (image; before: 17 KiB, after: 20 KiB).


@ -1,6 +1,6 @@
%p
As we rubyx will eventually need to parse and compile itself, i am very happy to
report success on the first steps on that journey. Also benchmarks, better design
As rubyx will eventually need to parse and compile itself, i am very happy to
report success on the first steps towards that goal. Also better design, benchmarks,
and another conference are on the list.
%h2 Compiling parfait
@ -14,23 +14,23 @@
and so Parfait must be available at run-time, ie parsed and compiled. Since i have been
busy doing the basics, this has been on the ToDo for a long while.
%p
Now, finally, most the basics are inplace and i have started what feels like a tremendous
task. In fact i have succefully compiled
Now, finally, most of the basics are in place and i have started what feels like a tremendous
task. In fact i have successfully compiled
%em three files.
Object, DataObject and Integer, to be precise.
%p
The significance of this is actually much greater (especially since there are no tests)
yet. Parfait, as part of rubyx, is what one may call ideomatic ruby, ie real world ruby.
Off course i have to smoothen out a few bugs before compiling actually worked, but
The significance of this is actually much greater (especially since there are no tests
yet). Parfait, as part of rubyx, is what one may call idiomatic ruby, ie real world ruby.
Of course i had to smooth out a few bugs before compiling actually worked, but
surprisingly little. In other words the compiler is functional enough to
compile larger more feature ritch ruby programs, but more on that below.
compile larger, more feature rich ruby programs, but more on that below.
%h2 Design improvements
%p
The overall design has been like in the picture below for a while already.
Alas, the implementation oof this architecture was slightly lacking.
Alas, the implementation of this architecture was slightly lacking.
To be precise, when mom code was generated, it was immediately converted to risc.
In other words the layer only existed conceptually, or in transit.
In other words the layer existed only conceptually, or in transit.
%p.center.three_width
= image_tag "architecture.png" , alt: "Architectural layers"
%p
@ -39,42 +39,43 @@
as advertised. Ruby comes in from the top and binary code out at the bottom.
But more than that, every layer is a distinct step, in fact there are methods on the
topmost compiler object to create every level down from ruby. This is obviously
very handy for testing.
very handy for testing, and by itself sped up testing by 30%, as risc is not generated
before really needed.
%h2 Automated binary tests
%p
Speaking of testing, we are at over 1600 tests, which is more than 200 up from before
the design rewrite. At over 15000 assertions this is still 95%, in other words everything
apart from a few fails. And with parallel execution still fast.
the design rewrite. At over 15000 assertions this is still 95% of the code, in other
words everything apart from a few fails. And with parallel execution still fast.
%p.center.full_width
= image_tag "1600_tests.png" , alt: "Lots of test, never boring"
%p
But the main achievement a couple of weeks ago was the integration of binary testing
int the automated test flow. Specifically on
into the automated test flow. Specifically on
%em Travis.
%p
This uses a feature of Qemu that i had not known before, namely that one can get qemu
to run binaries from a different target on a machine, by simply calling it with
qemu-arm.
%p
I had done previous testing of binaries via ssh, usually to an qemu emulated pi on my
Previously i had done testing of binaries via ssh, usually to a qemu emulated pi on my
machine. This setup is vastly more complicated, as described
=ext_link "here" , "/arm/qemu.html"
and i had shied away from that. Meaning they would happen irregularily and all that.
My only consolation was that the test would run on the interpreter, but off course that
does not test the arm and elf genertion.
and i had shied away from automating that. Meaning they would happen irregularly and
all that. My only consolation was that the test would run on the interpreter, but of
course that does not test the arm and elf generation.
%p
The actual tests that i am talking about are a growing number of "mains" tests, found in
the tets/mains directory.
These are actal programs that calculate or output stuff. They are complete system
tests in the sense that we only test their output (system output).
the tests/mains directory.
These are actual programs that calculate or output stuff. They are complete system
tests in the sense that we only test their output (system output and exit code).
%p
As we usually link to "a.out" files (thus overwriting and avoiding cleanup), the actual
invocation of qemu for a binary is really simple:
%pre
%code
qemnu-arm ./a.out
qemu-arm ./a.out
but that still leaves you to generate that binary. This can be done by using the
rubyxc compiler and linking the resulting object file (see bug #13). But since i too am a
lazy programmer i have automated these steps into the rubyxc compiler, and so one
@ -83,11 +84,11 @@
%code
:preserve
./bin/rubyxc execute test/mains/source/fibo__8.rb
This will compile, link and execute this specific fibonachi test. This output
This will compile, link and execute this specific fibonacci test. The exit code
of this test will be 8, as encoded in the file name.
So this and 20 others will be tested as binaries now every time travid does its thing.
Now this test, and 20 others, will be run as binaries every time travis does its thing.
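%p
As an illustration of how a test runner could read the expected exit code from such a file name, here is a small sketch. The helper name is hypothetical, and the rule that the number follows the last double underscore is an assumption taken from the one example fibo__8.rb.

```ruby
# Hypothetical helper, not part of rubyxc.
# Assumption: the expected exit code is the number after the last
# double underscore of the file name, as in fibo__8.rb
def expected_exit_code(path)
  match = File.basename(path, ".rb").match(/__(\d+)\z/)
  match && match[1].to_i
end
```

%p
A runner could compare this number with the exit status of the executed binary. Files without such a suffix give nil, so they can be skipped or checked on output alone.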
%p
BTW, i have also created arubyxc command to execute a file via the interpreter.
BTW, i have also created a rubyxc command to execute a file via the interpreter.
This can sometimes yield better errors when things go wrong.
%pre
%code
@ -98,20 +99,23 @@
%h2 Misc other news
%h3 Microbenchmarks
%p
At the last conference in Hamburg, someone asked the fair question: So how fast is it?
It's been so long that i did tests, that i could only mumble.
It's been so long that i did
=ext_link "tests," , "/misc/soml_benchmarks.html"
that i could only mumble.
Now i finished updating the tests, but it will be a while before i can answer the
question more fully.
%p
So for starters, because the functionality of the compiler is limited, i did very small
benchmarks. Very small means 20 lines or less, loops, string output, fibonacci, both
linear and recursive. I realized too late, that that will tell most about integer
linear and recursive. I realized too late, that all that will tell about is integer
performance.
%p
Now because of the early days, i will not go into detail here. In general speed was not
as fast as i had hoped from by 4 year old benchmarks, about the same as mri. I
Now because it's early days, i will not go into detail here. In general speed was not
as fast as i had hoped for, about the same as mri. I
will have to do some work on the calling convention and probably some on integer
handling too. I think i can quite easily shave 30-50% off, and that alone should
verify the saying that all benchmarks are lies. Like the one where rubyx is doing
@ -126,7 +130,7 @@
As part of parsing Parfait, i implemented a first version of implicit returns.
Low hanging fruits, and in fact most common use cases, included constants and calls.
So when a method ends in a simple variable, constant, or a call, a return will be added.
More complex rules like returns for if's or while will ave to wait, but i found that i
More complex rules like returns for if's or while will have to wait, but i found that i
personally don't tend to use them anyway.
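%p
The rule can be sketched on a toy syntax tree. This is an assumption-laden illustration, not the compiler's code: Node, the node type names and add_implicit_return are all made up.

```ruby
# Hypothetical mini ast, the real compiler uses its own node classes.
Node = Struct.new(:type, :children)

# Simple cases only: a variable, a constant or a call as last statement
IMPLICIT_RETURN_TYPES = [:lvar, :const, :send].freeze

# Wraps the last statement of a method body in a return node if it is
# one of the simple cases. Anything else (if, while, an explicit
# return) is left as it is.
def add_implicit_return(body)
  last = body.last
  return body if last.nil? || !IMPLICIT_RETURN_TYPES.include?(last.type)
  body[0...-1] + [Node.new(:return, [last])]
end
```

%p
A body ending in a while node comes back unchanged, which matches the more complex rules still being left out.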
%p
Since class methods are basically methods (of the meta class), adding the unified
@ -135,8 +139,8 @@
%h3 Improved Block handling
%p
Block handling, at least the simple implicit kind, has worked for a while, but was in
several ways too complicated. The block was unneccessarily assigned to a local, and
compiling was handled by picking them out.
several ways too complicated. The block was unnecessarily assigned to a local, and
compiling was handled by looking for block statements, not during the normal flow.
%p
This all stemmed from a misunderstanding, or lack of understanding: Blocks, or should
i say Lambdas, are constants. Just like a string or integer. They are created once at
@ -145,7 +149,7 @@
before Statements.
%p
So now the Lambda Expression is created and just added as an argument to the send.
Compiling thee Lambda is triggered by the constant creation, ie the step down from
Compiling the Lambda is triggered by the constant creation, ie the step down from
vool to mom, and the block compiler added to method compiler automatically.
%h3 Vool coming into focus
@ -158,38 +162,39 @@
recursive calls are flattened into a list, and as such the calling does not rely on a
stack as in ruby.
%p
Secondly, Vool distinguishes between expressions and statements. Like other lower level,
but not ruby. As a rule of thumb, Statements do things, Expression are things. In other
words, only expressions have value, statements (lke if or while) do not.
Secondly, Vool distinguishes between expressions and statements. Like other lower level
languages, but not ruby. As a rule of thumb, Statements do things, Expressions are things.
In other words, only expressions have value, statements (like if or while) do not.
%h2 Plans
%h4 GrillRB conference
%p
I will speak in
=ext_link "Wrazlaw" , "https://grillrb.com/"
in about a week. The plan is to make a comparison with rails and focus on the
possibilities, rather than technical detail.
%h4 Calling
%p
The Calling can do with work and i noticed two mistakes i made. One is that creating
a new message for every call is unneccessarily complicated. Its is only in the
special case that a Proc is created that the return sequence (a mom instruciton) needs
a new message for every call is unnecessarily complicated. It is only in the
special case that a Proc is created that the return sequence (a mom instruction) needs
to keep the message alive.
%p
The other is that having arguments and local variables as separate arrays may be handy
and easy to code. But it does add an extra indirection for every access _and_ store.
Since Mom is memory based, and Mom translates to risc, that does amount to a lot of
instructions.
%h4 Integers
%p
I still want to hang on to Integers being objects, though creation is clearly costly.
In the future a full escape analysis will help of course, but for now it should be easy
enough to figure out whether an int is passed down. If not loops can be
destructively change the int. A simple special case is a the times method.
enough to figure out wether an int is passed down. If not loops can be made to
destructively change the int.
%h4 Mom instruction invocation
%p
I have this idea of being able to code more stuff higher up. To make that more
efficient i am thinking of macros or instruction invocation at the vool level.
Only inside Parfait of course. The basic idea would be to save the call/return
code, and have eg X.return_jump map to the Mom::ReturnJump Instruction. "Just" have
to figure out the passing semantics, or how that integrates intot the vools code.
code, and have the compiler map eg X.return_jump to the Mom::ReturnJump Instruction. "Just" have
to figure out the passing semantics, or how that integrates into the vool code.
%h4 Better Builtin
%p
The generation of the current builtin methods has always bothered me a bit.
@ -197,8 +202,22 @@
alternative mechanism is needed (even in c one can embed assembler).
%p
The main problem i have is that those methods don't check their arguments and as such
may cause core dumps. So they are to high level and hopefully all we really need is
may cause core dumps. So they are too high level and hopefully all we really need is
that previous idea of being able to integrate Mom code into vool. As Mom is extensible
that should take care of any possible need. And we could code the methods normally as
part of Parfait, make them safe, and just use the lower level inside them. Lets see!.
part of Parfait, make them safe, and just use the lower level inside them. Let's see!
%h4 Compiling Parfait tests
%p
Since Parfait is part of rubyx, we have of course unit tests for it. The plan is to
parse the tests too, and run them as a test for both Parfait and the compiler.
Of course this will involve writing some mini version of minitest that the compiler
can actually handle (Why do i have the feeling that the real minitest involves too much
magic).
%h4 GrillRB conference
%p
Last, but not least, i will speak in
=ext_link "Wrocław" , "https://grillrb.com/"
in about a week. The plan is to make a comparison with rails and focus on the
possibilities, rather than technical detail. See you there :-)

Binary file not shown.