Proofread and new pic
This commit is contained in: parent 90977d1a7c, commit 881afd864a
@ -1,80 +0,0 @@
---
layout: site
author: Torsten
---
%p
  Part of what got me started on this project was the intuition that our programming model is in some way broken, and so, by
  good old programmer’s logic (you haven’t understood it till you’ve programmed it), I started to walk into the fog.
%h3#fpgas FPGA’s
%p
  Don’t ask me why they should be called Field Programmable Gate Arrays, but they have fascinated me for years,
  because of course they offer the “ultimate” in programming: do away with fixed cpu instruction sets and get the program into silicon. Yeah!
%p
  But several attempts at learning the black magic have left me only a little the wiser.
  Verilog and VHDL are the languages that make up 80-90% of what is used, and they are not object oriented,
  or in any way user friendly. So that has been on the long
  list, until i bumped into
  %a{:href => "http://pshdl.org/"} pshdl
  by way of Karsten’s
  = succeed "." do
    %a{:href => "https://www.youtube.com/watch?v=Er9luiBa32k"} excellent video on it
%p
  But what struck me is something he said: that in hardware programming it’s all about getting your design/program to fit into
  the space you have, and making the timing of the gates work.
%p
  And i realized that this is what is missing from our programming model: time and space. There is no time, as calls happen
  sequentially, always immediately. And there is no space, as we have global memory with random access, made to look unlimited by virtual
  memory. But the world we live in is governed by time and space, and that governs the way our brain works.
%h3#active-objects-vs-threads Active Objects vs threads
%p
  That is of course not so new, and the actor model was created to fix it. And while i haven’t used it much,
  i believe it does, especially for non-techie problems. And
  %a{:href => "http://celluloid.io/"} Celluloid
  seems to be a great
  implementation of that idea.
%p
  Of course Celluloid needs native threads, so you’ll need to run rubinius or jruby. Understandably. And so we have
  a fix for the problem, if we use Celluloid.
%p
  But it is a fix; it is not part of the system. The system has sequential calls per thread, and threads. Threads are evil, as
  i explain (rant about?)
  = succeed "," do
    %a{:href => "/rubyx/threads.html"} here
%h3#messaging-with-inboxes Messaging with inboxes
%p
  If you read the rant (it is a little older) you’ll see that it established the problem (shared global memory) but does not propose a solution as such. The solution came from a combination of the rant,
  the
  %a{:href => "/2014/07/17/framing.html"} previous post
  and the fpga physical perspective.
%p
  A physical view would be that we have a fixed number of object places on the chip (like a cache) and,
  as the previous post explains, sending is creating a message (yet another object) and transferring
  control. Now in a physical view control is not in one place like in a cpu. Any gate can switch at
  any cycle, so any object could be “active” at every cycle (without going into any detail about what that means).
%p
  But it got me thinking how that would be coordinated, because one object doing two things at once may lead
  to trouble. But one of the Synthesis ideas was
  %a{:href => "http://valerieaurora.org/synthesis/SynthesisOS/ch5.html"} lock free synchronisation
  by use of a test-and-swap primitive.
%p
  So if every object had an inbox, in a similar way that each object has a class now, we could create
  the message and put it there. By default we would expect the inbox to be empty; we test that, and if
  so, put our message there. Otherwise we queue it.
%p
  From a sender’s perspective the process is: create a new Message, fill it with data, put it in the
  receiver’s inbox. From a receiver’s perspective it’s: check your inbox; if empty, do nothing;
  otherwise do what it says. “Do what it says” could easily include the ruby rules for finding methods,
  ie check whether you yourself have a method by that name, send to super if not, etc.
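The send/receive protocol above can be sketched in plain ruby. This is a minimal illustration of the inbox idea, not rubyx code: all the names are invented, and a Mutex stands in for the test-and-swap primitive, which plain ruby does not expose.

```ruby
# Sketch of the inbox idea (illustrative names, not an actual rubyx API).
# Every object carries one inbox slot plus an overflow queue. The sender
# tries the fast path (empty slot) first; real lock-free code would use a
# hardware compare-and-swap, so the Mutex here is only a stand-in.
class Message
  attr_reader :name, :args
  def initialize(name, *args)
    @name = name
    @args = args
  end
end

class ActiveObject
  def initialize
    @inbox = nil          # the single fast-path slot
    @queue = []           # overflow, for when the slot is taken
    @lock  = Mutex.new    # stand-in for the test-and-swap primitive
  end

  # Sender side: put the message in the inbox if empty, otherwise queue it.
  def deliver(message)
    @lock.synchronize do
      if @inbox.nil?
        @inbox = message
      else
        @queue << message
      end
    end
  end

  # Receiver side: check the inbox; if empty do nothing, otherwise act,
  # pulling the next queued message into the slot on the way out.
  def step
    message = @lock.synchronize do
      current = @inbox
      @inbox = @queue.shift
      current
    end
    return nil if message.nil?
    send(message.name, *message.args)  # ordinary ruby method lookup
  end
end
```

A receiver subclass just defines ordinary methods; `step` dispatches whatever message is waiting, which is where the normal ruby lookup rules (own method, then super) would plug in.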
%p
  In an fpga setting this would be even nicer, as all lookups could be implemented by associative memory
  and thus happen in one cycle. There would be some manager needed to decide which objects are
  on the chip and which could be hoisted off, but nothing more complicated than a virtual memory manager.
%p
  The inbox idea represents a solution to the thread problem, and has the added benefit of being easy to understand and
  possibly even to implement. It should also make it safe to run several kernel threads, though i prefer the idea of
  having only one or two kernel threads that do exclusively system calls, and doing the rest with green threads that use
  home grown scheduling.
%p
  This approach also makes one way messaging very natural, though one would have to invent a syntax for
  that. And futures should come easily too.
@ -1,93 +0,0 @@
---
layout: site
author: Torsten
---
%p
  As noted in previous posts, differentiating between compile-time and run-time is one of the more
  difficult things in doing the vm. That is because the computing that needs to happen is so similar;
  in other words, almost all of the vm level is available at run-time too.
%p But of course we try to do as much as possible at compile-time.
%p
  One hears or reads that exactly this is a topic causing other vms problems too,
  specifically how one assures that what is compiled at compile-time and at run-time are
  identical, or at least compatible.
%h2#inlining Inlining
%p
  The obvious answer seems to me to
  = succeed ".In" do
    %strong use the same code
%p
  Let’s take a simple example of accessing an instance variable. This is of course available at
  run-time through the function
  %em instance_variable_get
  , which could go something like:
%pre
  %code
    :preserve
      def instance_variable_get name
        index = @layout.index name
        return nil unless index
        at_index(index)
      end
%p
  Let’s assume the
  %em builtin
  at_index function, and take the layout to be an array-like structure.
  As noted in previous posts, when this is compiled we get a Method with Blocks, and exactly one
  Block will initiate the return. The previous post detailed how, at that time, the return value will
  be in the ReturnSlot.
%p
  So then we get to the idea of how: we “just” need to take the blocks from the method and paste
  them where the instance variable is accessed. The following code will pick the value from the ReturnSlot
  as it would any other value, and continue.
%p
  The only glitch in this plan is that the code will assume a new message and frame. But if we just
  paste it, it will use the message/frame/self of the enclosing method. So that is where the work is:
  translating slots from the inner, inlined function to the outer one, possibly creating new frame
  entries.
%h2#inlining-what Inlining what
%p
  But let’s take a step back from the mechanics and look at what it is we need to inline. The above
  example seems to suggest we inline code. Code, as in text, is of course impossible to inline.
  That’s because we have no information about it, and so the argument passing and returning can’t
  possibly work. Quite apart from the tricky possibility of shadow variables, ie the inlined code
  assigning to variables of the outside function.
%p
  Ok, so then we just take our parsed code, the abstract syntax tree. There we have all the
  information we need to do the magic, at least it looks like that.
  But we may not have the ast!
%p
  The idea is to be able to make the step to a language independent system. Hence the sof (salama
  object file), even if it has no reader yet. The idea being that we store object files of any
  language in sof, and the vm would read those.
%p
  To do that we need to inline at the vm instruction level. Which in turn means that we will need
  to retain enough information at that level to be able to do that. What that entails in detail
  is unclear at the moment, but it gives a good direction.
%h2#a-rough-plan A rough plan
%p
  To recap function calling at the instruction level: btw, it should be clear that we can
  not inline method sends, as we don’t know which function is being called. But of course the
  actual send method may be inlined, and that is in fact part of the aim.
%p
  To call a function, a NewMessage is created, loaded with args and stuff, then the FunctionCall is
  issued. Upon entering, a new frame may be created for local and temporary variables, and at the
  end the function returns. When it returns, the return value will be in the Return slot, and the
  calling method will grab it if interested and swap the Message back to what it was before the call.
%p
  From that (and at that level) it becomes clearer what needs to be done, and it starts with
  the caller, of course. In the caller there needs to be a way to make the decision whether to
  inline or not. For the run-time stuff we need a list for “always inline”; later a complexity
  analysis, later a run-time analysis. When the decision goes to inline, the message setup will
  be skipped. Instead a mapping needs to be created from the called function’s argument names to
  newly created (unique) local variables.
  Then, going through the instructions, references to arguments must be exchanged for references
  to the new variables. A similar process needs to replace references to local variables in the
  called method with local variables in the calling method. Similarly the return and self slots need
  to be mapped.
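The mapping step described above can be sketched as a toy rename pass. All names here are hypothetical, not rubyx’s actual instruction API: the point is only that every slot reference in the callee’s instructions is rewritten through one map into fresh, unique names in the caller.

```ruby
# Toy sketch of the inlining rename pass (hypothetical names, not rubyx code).
# Instructions reference slots by symbolic name; inlining builds a map from the
# callee's argument/local/return names to fresh names in the caller, then
# rewrites every reference through that map, leaving the callee untouched.
Instruction = Struct.new(:op, :slots)

def inline_rename(callee_instructions, prefix)
  # default block lazily assigns a fresh, prefixed name per callee slot
  mapping = Hash.new { |h, name| h[name] = :"#{prefix}_#{name}" }
  callee_instructions.map do |ins|
    Instruction.new(ins.op, ins.slots.map { |slot| mapping[slot] })
  end
end
```

Because the same map is used for every instruction, two references to the same callee slot always land on the same caller variable, which is exactly what makes the pasted blocks behave as the original method did.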
%p
  After the final instruction of the called method, the reassigned return must be moved to the real
  return, and the calling function may commence. And while this may sound like a lot, one must remember
  that the instruction set of the machine is quite small, and further refinement
  (abstracting base classes, for example) can be done to make the work easier.
Binary file not shown. (Before: 17 KiB, after: 20 KiB)
@ -1,6 +1,6 @@
%p
  As rubyx will eventually need to parse and compile itself, i am very happy to
  report success on the first steps towards that goal. Also better design, benchmarks,
  and another conference are on the list.

%h2 Compiling parfait
@ -14,23 +14,23 @@
  and so Parfait must be available at run-time, ie parsed and compiled. Since i have been
  busy doing the basics, this has been on the ToDo for a long while.
%p
  Now, finally, most of the basics are in place and i have started what feels like a tremendous
  task. In fact i have successfully compiled
  %em three files.
  Object, DataObject and Integer, to be precise.
%p
  The significance of this is actually much greater (especially since there are no tests
  yet). Parfait, as part of rubyx, is what one may call idiomatic ruby, ie real world ruby.
  Of course i had to smooth out a few bugs before compiling actually worked, but
  surprisingly few. In other words the compiler is functional enough to
  compile larger, more feature rich ruby programs, but more on that below.

%h2 Design improvements
%p
  The overall design has been like in the picture below for a while already.
  Alas, the implementation of this architecture was slightly lacking.
  To be precise, when mom code was generated, it was immediately converted to risc.
  In other words the layer existed only conceptually, or in transit.
%p.center.three_width
  = image_tag "architecture.png" , alt: "Architectural layers"
%p
@ -39,42 +39,43 @@
  as advertised. Ruby comes in from the top and binary code out at the bottom.
  But more than that, every layer is a distinct step; in fact there are methods on the
  topmost compiler object to create every level down from ruby. This is obviously
  very handy for testing, and by itself sped up testing by 30%, as risc is not generated
  before it is really needed.

%h2 Automated binary tests
%p
  Speaking of testing, we are at over 1600 tests, which is more than 200 up from before
  the design rewrite. At over 15000 assertions this is still 95% of the code, in other
  words everything apart from a few fails. And with parallel execution it is still fast.

%p.center.full_width
  = image_tag "1600_tests.png" , alt: "Lots of tests, never boring"
%p
  But the main achievement a couple of weeks ago was the integration of binary testing
  into the automated test flow. Specifically on
  %em Travis.
%p
  This uses a feature of Qemu that i had not known before, namely that one can get qemu
  to run binaries for a different target on a host machine, simply by calling it with
  qemu-arm.
%p
  Previously i had done testing of binaries via ssh, usually to a qemu emulated pi on my
  machine. That setup is vastly more complicated, as described
  =ext_link "here" , "/arm/qemu.html"
  and i had shied away from automating it. Meaning the tests would happen irregularly and
  all that. My only consolation was that the tests would run on the interpreter, but of
  course that does not test the arm and elf generation.
%p
  The actual tests that i am talking about are a growing number of “mains” tests, found in
  the tests/mains directory.
  These are actual programs that calculate or output stuff. They are complete system
  tests in the sense that we only test their output (system output and exit code).
%p
  As we usually link to “a.out” files (thus overwriting and avoiding cleanup), the actual
  invocation of qemu for a binary is really simple:
%pre
  %code
    qemu-arm ./a.out
but that still leaves you to generate that binary. This can be done by using the
rubyxc compiler and linking the resulting object file (see bug #13). But since i too am a
lazy programmer, i have automated these steps into the rubyxc compiler, and so one
@ -83,11 +84,11 @@
%code
  :preserve
    ./bin/rubyxc execute test/mains/source/fibo__8.rb
This will compile, link and execute this specific fibonacci test. The exit code
of this test will be 8, as encoded in the file name.
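To make the naming convention concrete, here is a guess at the shape of such a mains test; the actual `fibo__8.rb` may differ. The idea is that the program’s result doubles as the process exit code, which the harness compares against the number after the double underscore.

```ruby
# Hypothetical sketch of a mains test like fibo__8.rb (actual file may differ):
# compute fibonacci iteratively; fibo(6) == 8, matching the __8 in the name.
def fibo(n)
  a, b = 0, 1
  n.times { a, b = b, a + b }
  a
end

# In the real test the result would become the exit code that the harness
# checks, e.g.:  exit fibo(6)
```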
Now this test, and 20 others, will be run as binaries every time travis does its thing.
%p
  BTW, i have also created a rubyxc command to execute a file via the interpreter.
  This can sometimes yield better errors when things go wrong.
%pre
  %code
@ -98,20 +99,23 @@
%h2 Misc other news

%h3 Microbenchmarks
%p
  At the last conference in Hamburg, someone asked the fair question: so how fast is it?
  It’s been so long since i did
  =ext_link "tests," , "/misc/soml_benchmarks.html"
  that i could only mumble.
  Now i have finished updating the tests, but it will be a while before i can answer the
  question more fully.
%p
  So for starters, because the functionality of the compiler is limited, i did very small
  benchmarks. Very small means 20 lines or less: loops, string output, fibonacci, both
  linear and recursive. I realized too late that all that will tell about is integer
  performance.
%p
  Now, because it’s early days, i will not go into detail here. In general speed was not
  as fast as i had hoped for, about the same as mri. I
  will have to do some work on the calling convention and probably some on integer
  handling too. I think i can quite easily shave 30-50% off, and that alone should
  verify the saying that all benchmarks are lies. Like the one where rubyx is doing
@ -126,7 +130,7 @@
  As part of parsing Parfait, i implemented a first version of implicit returns.
  Low hanging fruit, and in fact the most common use cases, included constants and calls.
  So when a method ends in a simple variable, constant, or call, a return will be added.
  More complex rules, like returns for if’s or while’s, will have to wait, but i found that i
  personally don’t tend to use them anyway.
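The three easy shapes mentioned above can be shown in plain ruby, which already returns the last expression; the compiler now simply makes that return explicit for these cases (illustrative only, not rubyx source):

```ruby
# Ruby returns a method's last expression implicitly; the compiler appends an
# explicit return for exactly these three simple endings.
def ends_in_constant
  42                    # constant: a "return 42" gets appended
end

def ends_in_variable(x)
  doubled = x * 2
  doubled               # simple variable: "return doubled" gets appended
end

def ends_in_call(x)
  ends_in_variable(x)   # call: the call's value is returned
end
```

A method ending in an `if` or `while`, by contrast, would need a return added on every branch, which is the harder case deferred here.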
%p
  Since class methods are basically methods (of the meta class), adding the unified
@ -135,8 +139,8 @@
%h3 Improved Block handling
%p
  Block handling, at least the simple implicit kind, has worked for a while, but was in
  several ways too complicated. The block was unnecessarily assigned to a local, and
  compiling was handled by looking for block statements, not during the normal flow.
%p
  This all stemmed from a misunderstanding, or lack of understanding: Blocks, or should
  i say Lambdas, are constants. Just like a string or an integer. They are created once at
@ -145,7 +149,7 @@
  before Statements.
%p
  So now the Lambda Expression is created and just added as an argument to the send.
  Compiling the Lambda is triggered by the constant creation, ie the step down from
  vool to mom, and the block compiler is added to the method compiler automatically.

%h3 Vool coming into focus
@ -158,38 +162,39 @@
  recursive calls are flattened into a list, and as such the calling does not rely on a
  stack as in ruby.
%p
  Secondly, Vool distinguishes between expressions and statements, like other lower level
  languages, but not like ruby. As a rule of thumb, Statements do things, Expressions are things.
  In other words, only expressions have a value; statements (like if or while) do not.
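The distinction is easiest to see from ruby itself, where even control flow has a value; the second half of this sketch shows how the same logic has to be written when the if is a valueless statement, as it would be in Vool (plain ruby here, not Vool syntax):

```ruby
# In ruby an if is an expression, so it can sit on the right of an assignment:
label = if 3 > 2
          "bigger"
        else
          "smaller"
        end

# Where the if is a statement with no value (as in Vool, C or Java), the same
# logic must assign inside each branch instead:
result = nil
if 3 > 2
  result = "bigger"
else
  result = "smaller"
end
```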
%h2 Plans
%h4 Calling
%p
  The Calling could do with work, and i noticed two mistakes i made. One is that creating
  a new message for every call is unnecessarily complicated. It is only in the
  special case that a Proc is created that the return sequence (a mom instruction) needs
  to keep the message alive.
%p
  The other is that having arguments and local variables as separate arrays may be handy
  and easy to code. But it does add an extra indirection for every access _and_ store.
  Since Mom is memory based, and Mom translates to risc, that does amount to a lot of
  instructions.
%h4 Integers
%p
  I still want to hang on to Integers being objects, though creation is clearly costly.
  In the future a full escape analysis will help of course, but for now it should be easy
  enough to figure out whether an int is passed down. If not, loops can be made to
  destructively change the int.

%h4 Mom instruction invocation
%p
  I have this idea of being able to code more stuff higher up. To make that more
  efficient i am thinking of macros or instruction invocation at the vool level.
  Only inside Parfait of course. The basic idea would be to save the call/return
  code, and have the compiler map eg X.return_jump to the Mom::ReturnJump Instruction. “Just” have
  to figure out the passing semantics, or how that integrates into the vool code.
%h4 Better Builtin
%p
  The generation of the current builtin methods has always bothered me a bit.
@ -197,8 +202,22 @@
  alternative mechanism is needed (even in c one can embed assembler).
%p
  The main problem i have is that those methods don’t check their arguments, and as such
  may cause core dumps. So they are too high level, and hopefully all we really need is
  that previous idea of being able to integrate Mom code into vool. As Mom is extensible,
  that should take care of any possible need. And we could code the methods normally as
  part of Parfait, make them safe, and just use the lower level inside them. Let’s see!
%h4 Compiling Parfait tests
%p
  Since Parfait is part of rubyx, we of course have unit tests for it. The plan is to
  parse the tests too, and run them as a test for both Parfait and the compiler.
  Of course this will involve writing some mini version of minitest that the compiler
  can actually handle (why do i have the feeling that the real minitest involves too much
  magic?).
%h4 GrillRB conference
%p
  Last, but not least, i will speak in
  =ext_link "Wrocław" , "https://grillrb.com/"
  in about a week. The plan is to make a comparison with rails and focus on the
  possibilities, rather than technical detail. See you there :-)
BIN public/rubyx.odp (binary file not shown)