spelling and 100 indenting

Torsten Ruger 2017-04-07 14:57:12 +03:00
parent f309b12c37
commit e769849ca7
9 changed files with 173 additions and 164 deletions

View File

@ -3,32 +3,33 @@ layout: news
author: Torsten
---
Well, it has been a good holiday, two months in Indonesia, Bali and diving Komodo. It brought clarity, and so i have to start a daunting task.

When i learned programming at University, they were still teaching Pascal. So when I got to choose c++ in my first bigger project, that was a real step up. But even as i wrestled with templates, it was Smalltalk that took my heart immediately when i read about it. And I read quite a bit, including the Blue Book about its implementation.

The next distinct step up was Java, in 1996, and then ruby in 2001. Until i mostly stopped coding in 2004, when i moved to the countryside and started our [B&amp;B](http://villataika.fi/en/index.html).

But then we needed web pages, and before long a POS for our shop, so i was back on the keyboard. And since it was a thing i had been wanting to do, I wrote a database.

Purple was my current idea of an ideal data-store. Save by reachability, automatic loading by traversal, and schema-free saving of any ruby object. In memory, based on Judy, it did about 2000 transactions per second. Alas, it didn't have any searching.

So i bit the bullet and implemented an sql interface to it. After a failed attempt with rails 2, and after 2 major rewrites, i managed to integrate what by then was called warp into Arel (rails3).

But while raw throughput was still about the same, when it had to go through Arel it crawled to 50 transactions per second, about the same as sqlite.

This was maybe 2011, and there was no doubt anymore. Not the database, but ruby itself was the speed hog. I aborted.

In 2013 I bought a Raspberry Pi and of course I wanted to use it with ruby. Alas... slow pi + slow ruby = nicht gut (not good). I gave up.

So then the clarity came with the solution: build your own ruby. I started designing a bit on the beach already.

Still, daunting. But maybe just possible....

View File

@ -9,18 +9,18 @@ The c machine
Software engineers have clean brains, scrubbed into full c alignment through decades. A few rebels (klingons?) remain on embedded systems, but of those most strive towards posix compliance too.

In other words, since all programming ultimately boils down to c, libc makes the bridge to the kernel/machine. All .... all but a small village in the northern (cold) parts of Europe (Antskog) where ...

So i had a look at what we are talking about.

The issue
----------

Many, especially embedded guys, have noticed that your standard c library has become quite heavy (2 Megs). Since it provides a defined api (posix) and a lot of functionality on a plethora of systems (os's) and cpu's, even for different ABI's (application binary interfaces) and compilers/linkers, it is no wonder.

uClibc or dietlibc get the size down, especially diet, quite a bit (130k). So that's ok then. Or is it?

Then i noticed that the real issue is not the size. Even my pi has 512 MB, and of course even libc gets paged.

The real issue is the step into the C world. So, extern functions, call marshalling, and the question is: for what.
@ -31,8 +31,8 @@ ruby core/std-lib
Of course the ruby-core and std libs were designed to do for ruby what libc does for c. Unfortunately they are badly designed and suffer from the above brainwash (designed around c calls).

Since salama is pure ruby, there is a fair amount of functionality that would be nicer to provide straight in ruby. As gems of course, for everybody to see and fix.

For example, even if there were to be a printf (which i dislike), it would be easy to code in ruby. What is needed is the underlying write to stdout.
@ -41,11 +41,11 @@ Solution
To get salama up and running, ie to have a "ruby" executable, there are really very few kernel calls needed: file open, read and stdout write, brk. So the way this will go is to write syscalls where needed.

Having tried to reverse engineer uc, diet and musl, it seems best to go straight to the source. Most of that is of course for intel, but eax goes to r7 and after that the args are from r0 up, so not too bad. The definitive guide for arm is here: [http://sourceforge.net/p/strace/code/ci/master/tree/linux/arm/syscallent.h](http://sourceforge.net/p/strace/code/ci/master/tree/linux/arm/syscallent.h). But it doesn't include arguments (only the number of them), so [http://syscalls.kernelgrok.com/](http://syscalls.kernelgrok.com/) can be used.

So there, getting more metal by the minute. But the time from writing this to a hello world was 4 hours.
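
To make that register convention concrete, here is a small sketch: plain ruby with a toy collector class standing in for a real assembler (nothing to do with salama's actual dsl), spelling out a hello world as two raw syscalls, write(1, "Hello World\n", 12) followed by exit(0).

```ruby
# toy stand-in for an assembler: it only collects mnemonic strings
class ToyAsm
  def initialize
    @lines = []
  end

  def ins(op, *args)
    @lines << "#{op} #{args.join(', ')}"
  end

  def to_s
    @lines.join("\n")
  end
end

asm = ToyAsm.new
asm.ins :mov, :r7, 4          # 4 = write on arm linux
asm.ins :mov, :r0, 1          # fd 1, stdout
asm.ins :ldr, :r1, "=hello"   # address of the string data (literal pool style)
asm.ins :mov, :r2, 12         # "Hello World\n" is 12 bytes
asm.ins :swi, 0               # software interrupt, into the kernel
asm.ins :mov, :r7, 1          # 1 = exit
asm.ins :mov, :r0, 0          # return code 0
asm.ins :swi, 0
puts asm
```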

View File

@ -3,18 +3,17 @@ layout: news
author: Torsten
---
Part of the reason why i even thought this was possible was because i had bumped into Metasm. Metasm creates native code in 100% ruby. Either from Assembler or even C (partially). And for many cpu's too. It also creates many binary formats, elf among them.

Still, i wanted something small that i could understand easily, as it was clear it would have to be changed to fit. As there was no external assembler file format planned, the whole approach starting from parsing was inappropriate.

I luckily found a small library, as, that did arm only and was just a few files. After removing unneeded parts like parsing, and some reformatting, i added an assembler-like dsl. This layer (arm subdirectory) said hello after about 2 weeks of work.

I also got qemu to work and can thus develop without the actual pi.

View File

@ -2,7 +2,7 @@
layout: news
author: Torsten
---
Both "ends", parsing and machine code, were relatively clear cut. Now it is into unknown territory.

I had ported the Kaleidoscope llvm tutorial language to ruby-llvm last year, so there were some ideas floating.
@ -10,20 +10,20 @@ The idea of basic blocks, as the smallest unit of code without branches was pret
targets was also straightforward. But how to get from the AST to arm Instructions was not, and took some trying out.

In the end, or rather now, it is the AST layer that "compiles" itself into the Vm layer. The Vm layer then assembles itself into Instructions. General instructions are part of the Vm layer, but the code picks up derived classes and thus makes machine dependent code possible. So far so ok.
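
As a rough sketch of that shape, with class and method names invented for illustration (this is not salama's actual code): an ast node compiles itself into a vm object, and the vm object assembles itself into whatever machine dependent instruction classes the given machine hands back.

```ruby
module Ast
  class OperatorExpression
    def initialize(operator, left, right)
      @operator, @left, @right = operator, left, right
    end

    # the ast layer "compiles" itself into vm layer objects
    def compile(context)
      Vm::BinaryOperation.new(@operator, @left.compile(context), @right.compile(context))
    end
  end
end

module Vm
  class BinaryOperation
    def initialize(operator, left, right)
      @operator, @left, @right = operator, left, right
    end

    # the vm object assembles itself into instructions; the machine passed in
    # decides which derived, machine dependent instruction classes get created
    # (register use is hand waved here)
    def assemble(machine)
      @left.assemble(machine) +                      # leaves its result in r1, say
        @right.assemble(machine) +                   # leaves its result in r2
        [machine.binary_op(@operator, :r1, :r2)]
    end
  end
end
```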
Register allocation was (and is) another story. Argument passing and local variables do work now, but there is definitely room for improvement there.

To get anything out of a running program i had to implement putstring (easy) and putint (difficult). Surprisingly, division is not easy, and when pinned to 10 (divide by 10) quite strange. Still, it works. While i was at it, writing assembler, i found a fibonacci in 10 or so instructions.

To summarise: function definition and calling (including recursion) work. If and while structures work, and also some operators, and now it's easy to add more. So we have a Fibonacci in ruby, using a while implementation, that can be executed by salama and outputs the correct result. After a total of 7 weeks this is much more than expected!
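
For reference, the while-based fibonacci in question is essentially the plain ruby below (my reconstruction, not the exact test code from the repository):

```ruby
# iterative fibonacci using only while, assignment and integer operators,
# ie exactly the kind of ruby salama could compile at this point
def fibonacci(n)
  a = 0
  b = 1
  while n > 0
    temp = a
    a = b
    b = temp + b
    n = n - 1
  end
  a
end

puts fibonacci(10)   # => 55
```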

View File

@ -6,16 +6,16 @@ author: Torsten
Parsing is difficult, the theory incomprehensible and older tools cryptic. At least for me. And then i heard recursive descent is easy and used even by llvm. Formalised as peg parsing, libraries exist, and in ruby they have dsl's and are suddenly quite understandable.

Of the candidates, i had at first very positive experiences with treetop. Upon continuing i found the code generation aspect not just clumsy (after all you can define methods in ruby), but also to interfere unnecessarily with code control. On top of that, conversion into an AST was not easy.

After looking around i found Parslet, which pretty much removes all those issues. Namely:

- It does not generate code, it generates methods. And has a nice dsl.
- It transforms to ruby basic types and has the notion of a transformation. So an easy and clean way to create an AST.
- One can use ruby modules to partition a larger parser.
- Minimal dependencies (one file).
@ -24,12 +24,11 @@ After looking around i found Parslet, which pretty much removes all those issues
So i was sold, and i got up to speed quite quickly. But i also found out how fiddly such a parser is in regard to ordering and whitespace.

I spent some time to make quite a solid test framework, testing the different rules separately and also the stages separately, so things would not break accidentally when growing.

After about another 2 weeks i was able to parse functions, both calls and definitions, ifs and whiles, and of course basic types of integers and strings.
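
To give a flavour of the dsl, here is a toy fragment in Parslet (just integers and double quoted strings, nothing like the real grammar), with a transform turning the match into plain ruby values:

```ruby
require 'parslet'

# a toy parser: a basic value is just an integer or a double quoted string
class MiniParser < Parslet::Parser
  rule(:space?)  { match('\s').repeat }
  rule(:integer) { match('[0-9]').repeat(1).as(:integer) >> space? }
  rule(:string)  { str('"') >> match('[^"]').repeat.as(:string) >> str('"') >> space? }
  rule(:basic)   { integer | string }
  root(:basic)
end

# the transform turns the match hashes into values (or ast objects)
class MiniTransform < Parslet::Transform
  rule(integer: simple(:i)) { i.to_s.to_i }
  rule(string:  simple(:s)) { s.to_s }
end

tree = MiniParser.new.parse('42')      # => {:integer=>"42"@0}
puts MiniTransform.new.apply(tree)     # => 42
```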
With the great operator support it was a breeze to create all 15-ish binary operators. Even Array and Hash constant definition was very quick. All in all surprisingly painless, thanks to Kasper!

View File

@ -18,17 +18,17 @@ The place for these methods, and i'll go into it a little which in a second, is
So Kernel is the place for methods that are needed to build the system, and may not be called on objects. Simple. In other words, anything that can be coded on normal objects, should be. But when that stops being possible, Kernel is the place.

And what are these functions? get_instance_variable, and set too. Same for functions. Strangely, these may in turn rely on functions that can be coded in ruby, but at the heart of the matter is an indexed operation, ie object[2]. This functionality, ie getting the n'th piece of data in an object, is essential, but c makes such a good point of it having no place in a public api. So it needs to be implemented in a "private" part and used in a safe manner. More on the layers emerging below.
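
As a toy sketch of that layering (names invented, nothing to do with salama's real classes): the indexed access is the one "private" primitive, and get/set_instance_variable are built on top of it via the object's layout.

```ruby
class ToyObject
  def initialize(layout)
    @layout = layout                  # eg { :@name => 1, :@age => 2 }
    @words  = Array.new(layout.size + 1)
  end

  # the "unsafe" primitive: the n'th word of the object, ie object[n]
  def word_get(index)
    @words[index]
  end

  def word_set(index, value)
    @words[index] = value
  end

  # the safe layer, coded purely in terms of the indexed operation
  def get_instance_variable(name)
    index = @layout[name]
    index ? word_get(index) : nil
  end

  def set_instance_variable(name, value)
    index = @layout[name]
    word_set(index, value) if index
  end
end

toy = ToyObject.new(:@name => 1, :@age => 2)
toy.set_instance_variable(:@name, "salama")
puts toy.get_instance_variable(:@name)   # => salama
```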
The Kernel is a module in salama that defines functions which return function objects. So the code is generated, instead of parsed. An essential distinction.

#### System

It's an important side note on that Kernel definition above, that it is _not_ the same as system access functions. These are in their own Module and may (or must) use the kernel to implement their functionality. But not the same. Kernel is the VM's "core" if you want.
@ -36,46 +36,46 @@ System is the access to the operating system functionality.
#### Layers

So from that Kernel idea have now emerged 3 layers, 3 ways in which code is created.

##### Machine

The lowest layer is the Machine layer. This layer generates Instructions or sequences thereof. So of course there is an Instruction class with derived classes, but also Block, the smallest linear sequence of Instructions. Also there is an abstract RegisterMachine that is mostly a mediator to the current implementation (ArmMachine). The machine has functions that create Instructions.

A few machine functions return Blocks, or append their instructions to blocks. This is really more a macro layer. Usually they are small, but div10 for example is a real 10 instruction beauty.

##### Kernel

The Kernel functions return function objects. Kernel functions have the same name as the function they implement, so Kernel::putstring defines a function called putstring. Function objects (Vm::Function) carry entry/exit/body code, receiver/return/argument types and a little more. The important thing is that these functions are callable from ruby code. Thus they form the glue from the next layer up, which is coded in ruby, to the machine layer. In a way the Kernel "exports" the machine functionality to salama.
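
The pattern is roughly the following, heavily compressed and with the constructor and machine calls invented for illustration (only the name Vm::Function and the putstring example come from the post):

```ruby
module Kernel
  # Kernel::putstring returns a function object named putstring;
  # the code is generated right here instead of being parsed from ruby source
  def self.putstring(machine)
    function = Vm::Function.new(:putstring)     # carries entry/exit/body code etc
    function.body << machine.write_stdout(:r1)  # hypothetical machine call
    function.body << machine.function_return    # hypothetical machine call
    function
  end
end
```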
##### Parfait

Parfait is a thin layer implementing a mini-minimal OO system. Sure, all your usual suspects of strings and integers are there, but they only implement what is really, really necessary. For example strings mainly have new, equals and put. Parfait is heavy on Object/Class/Metaclass functionality, object instances and method lookup. All things needed to make an OO system OO. Not so much "real" functionality here, more creating the ability for that.

Stdlib would be the next layer up, implementing the whole of ruby functionality in terms of what Parfait provides.

The important thing here is that Parfait is written completely in ruby. Meaning it gets parsed by salama like any other code, and then transformed into executable form and written. Any executable that salama generates will have Parfait in it. But only the final version of salama, as a ruby vm, will have the whole stdlib and parser along.

#### Salama

Salama uses the Kernel and Machine layers straight when creating code. Of course. The closest equivalent to salama would be a compiler, and so it is its job to create code (machine layer objects). But it is my intention to keep that as small as possible. And the good news is it's all ruby :-)

##### Extensions

I just want to mention the idea of extensions, which is a logical step for a minimal system. Of course they would be gems, but the interesting thing is they (like salama) could:

- use salama's existing kernel/machine abstraction to define new functionality that is not possible in ruby
- define new machine functionality, adding kernel-type api's, to create wholly new, possibly hardware specific functionality

I am thinking graphics acceleration, GPU usage, vector api's, that kind of thing. In fact i aim to implement the whole floating point functionality as an extension (as it is clearly not essential for OO).

View File

@ -6,56 +6,62 @@ author: Torsten
I was just reading my ruby book, wondering about functions and blocks and the like, as one does when implementing a vm. Actually the topic i was struggling with was receivers, the pesky self, when i got the exception.

### One step back

Having just learnt assembler, this is the first time i am really considering how functions are implemented, and how the stack is used in that. Sure, i had heard about it, but the details were vague.

Of course a function must know where to return to. I mean the memory address, as this can't very well be fixed at compile time. In effect this must be passed to the function. But as programmers we don't want to have to do that all the time, and so it is passed implicitly.
##### The missing link

The arm architecture makes this nicely explicit. There, a call is actually called branch with link. This rubbed me the wrong way for a while, as it struck me as an exceedingly bad name. Until i "got it", that is. The link is the link back, well that was simple. But the thing is that the "link" is put into the link register.

This never struck me as meaningful, until now. Of course it means that "leaf" functions do not need to touch it. Leaf functions are functions that do not call other functions, though they may do syscalls, as the kernel restores all registers. On other cpu's the return address is pushed onto the stack, but on arm you have to do that yourself. Or not, and save the instruction if you're so inclined.
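
Spelled out as instruction sequences (plain strings here, arm syntax, just for illustration, not generated code): the caller branches with link, a non-leaf callee has to save lr around its own calls, and a leaf callee can return straight through lr without ever touching the stack for it.

```ruby
CALLER = [
  "bl   some_function   @ branch with link: address of the next instruction -> lr",
]

NON_LEAF = [
  "push {lr}            @ about to overwrite lr with our own call, so save it",
  "bl   other_function  @ this reloads lr",
  "pop  {pc}            @ restore the saved return address straight into pc",
]

LEAF = [
  "mov  pc, lr          @ no calls inside, lr still holds the return address",
]

puts CALLER, NON_LEAF, LEAF
```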
##### The hidden argument

But the point here is that this makes it very explicit. The return address is in effect just another argument. It usually gets passed automatically by compiler-generated code, but nevertheless: it is an argument.

The "step back" is to make this argument explicit in the vm code. Thus making its handling, ie passing or saving, explicit too. And thus having less magic going on, because you can't understand magic (you gotta believe it).

### Two steps forward

And so the thrust becomes clear, i hope. We are talking about exceptions after all. Because to those who have not read the windows calling convention on exception handling, or even heard of the dwarf specification thereof, i say: don't. It melts the brain. You have to be so good at playing computer in your head, it's not healthy.

Instead, we make things simple and explicit. An exception is after all just a different way for a function to return. So we need an address for it to return to. And as we have just made the normal return address an explicit argument, we just make the exception return address an argument too. And presto.

Even just the briefest of considerations of how we generate those exception return addresses (landing pads? what a strange name) leads to the conclusion that if a function does not do any exception handling, it just passes on the same address that it got itself. Thus a raised exception would jump clear over such a function.

Since we have now got exceptions to be normal code (albeit with an exceptional name :-)), control flow to and from them becomes quite normal too.

To summarize: each function now has a minimum of three arguments: the self, the return address and the exception address.
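
A toy model of that calling convention in plain ruby (invented names, and self left out for brevity): the return address and the exception address are just arguments, here stand-in lambdas, and a function that does no handling simply hands its own exception address on.

```ruby
def divide(dividend, divisor, return_to, raise_to)
  return raise_to.call("division by zero") if divisor == 0
  return_to.call(dividend / divisor)
end

def average(sum, count, return_to, raise_to)
  # no exception handling here: pass our own raise_to straight on,
  # so a raised exception jumps clear over this function
  divide(sum, count, return_to, raise_to)
end

ok   = ->(value) { puts "result #{value}" }
oops = ->(error) { puts "exception #{error}" }

average(10, 2, ok, oops)   # => result 5
average(10, 0, ok, oops)   # => exception division by zero
```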
We have indeed taken a step forward.

View File

@ -11,43 +11,45 @@ What i wasn't stuck with, is where to draw the layer for the vm.
### Layers

Software engineers like layers. Like the onion boy. You can draw boxes, make presentations and convince your boss. They help us to reason about the software.

In this case the model was to go from an ast layer to a vm layer. Via a compile method, that could just as well have been a visitor. That didn't work, too big a step, and so it was from ast, to vm, to neumann. But i couldn't decide on the abstraction of the virtual machine layer. Specifically, when you have a send (and you have so many sends in ruby), do you:

- model it as a vm instruction (a bit like java)
- implement it in a couple of instructions like resolve, a loop and call
- go to a version that is clearly translatable to neumann, say without the value type implementation

Obviously the third is where we need to get to, as the next step is the neumann layer and somehow we need to get there. In effect one could take those three and present them as layers, not as alternatives like i have.
### Passes

And then the little cog went click, and the idea of passes resurfaced. Llvm has these passes on the code tree, which is probably where it surfaced from.

So we can have as high a degree of abstraction as possible when going from ast to code. And then have as many passes over that as we want / need. Passes can be order dependent, and create more and more detail. To solve the above layer conundrum, we just do a pass for each of those options.
The two main benefits that come from this are:

1 - At each point, ie after and during each pass, we can analyse the data. Imagine for example that we had picked the second layer option: there would never have been a representation in which the sends were explicit. Thus any analysis of them would be impossible or need reverse engineering (eg call graph analysis, or class caching).

2 - Passes can be gems or come from other sources. The mechanism can be relatively oblivious to specific passes. And they make the transformation explicit, ie easier to understand. In the example of having picked the second layer level, one would have to patch the implementation of that transformation to achieve a different result. With passes it would be a matter of replacing a pass, thus explicitly stating "i want a non-standard send implementation".

Actually a third benefit is that it makes testing simpler. More modular. Just test the initial ast->code and then mostly the results of passes.
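
A minimal sketch of what such a mechanism might look like (invented names, not salama's actual pass api): an ordered list of passes, each a plain object that rewrites the code and adds detail, so extra passes, from gems say, can simply be registered.

```ruby
class Passes
  def initialize
    @passes = []
  end

  # order matters: later passes see the output of earlier ones
  def add(pass)
    @passes << pass
    self
  end

  def run(code)
    @passes.inject(code) { |current, pass| pass.run(current) }
  end
end

# eg a pass that makes sends explicit: replace a :send node by resolve/load/call
class ExplicitSendPass
  def run(code)
    code.flat_map do |node|
      node == :send ? [:resolve_method, :load_arguments, :call] : [node]
    end
  end
end

pipeline = Passes.new.add(ExplicitSendPass.new)
p pipeline.run([:load_self, :send, :return])
# => [:load_self, :resolve_method, :load_arguments, :call, :return]
```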

View File

@ -5,12 +5,14 @@ author: Torsten
In a picture, or when taking a picture, the frame is very important. It sets whatever is in the picture into context. So it is a bit strange that having a **frame** had the same sort of effect for me in programming. I made the frame explicit, as an object, with functions and data, and immediately the whole message sending became a whole lot clearer.

You read about frames in calling conventions, or otherwise when talking about the machine stack. It is the area a function uses for storing data, be it arguments, locals or temporary data. Often a frame pointer will be used to establish a frame's dynamic size and things like that. But since it's all so implicit, and handled by code very few programmers ever see, it was all a bit muddled for me.

My frame has: return and exceptional return address, self, arguments, locals, temps
@ -19,58 +21,58 @@ and methods to: create a frame, get a value to or from a slot or args/locals/tm
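
A toy version of that frame in plain ruby (slot layout and method names invented for illustration): the fixed slots up front, everything reachable by index, and args/locals/temps as named slots on top of that.

```ruby
class Frame
  def initialize(return_address, exception_address, receiver)
    @slots = [return_address, exception_address, receiver]
    @names = {}                      # argument/local/temp name -> slot index
  end

  # indexed access is the primitive everything else is built on
  def get(index)
    @slots[index]
  end

  def set(index, value)
    @slots[index] = value
  end

  # args, locals and temps are just named slots after the fixed ones
  def add_slot(name, value = nil)
    @names[name] = @slots.length
    @slots << value
  end

  def get_named(name)
    get(@names[name])
  end

  def set_named(name, value)
    set(@names[name], value)
  end
end

frame = Frame.new(:a_return_address, :an_exception_address, :some_object)
frame.add_slot(:count, 3)        # an argument
frame.add_slot(:sum)             # a local
frame.set_named(:sum, 42)
puts frame.get_named(:count)     # => 3
puts frame.get(0)                # => a_return_address
```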
### The divide, compile and runtime

I saw [Tom's video on free compilers](http://codon.com/compilers-for-free) and read the underlying book on [Partial Evaluation](http://www.itu.dk/people/sestoft/pebook/jonesgomardsestoft-a4.pdf) a bit, and it helped to make the distinctions clearer. As did the Layers and Passes post. And the explicit Frame.

The explicit frame made the vm explicit too, or at least much better defined. All actions of the vm happen in terms of the frame. Sending is creating a new one, loading it, finding the method and branching there. Getting and setting variables is just indexing into the frame at the right index, and so on. Instance variables are a send to self, and on it goes.

The great distinction is in the end quite simple: it is compile-time or run-time. And the passes idea helps, in that i start with the most simple implementation against my vm. Then i have a data structure and can keep expanding it to "implement" more detail. Or i can analyse it to remove redundancies, ie optimize. But the point is, in both cases i can just think about data structures and what to do with them.

And what i can do with my data (which is of course partially instruction sequences, but that's beside the point) really always depends on the great question: compile-time vs run-time. What is constant, i can do immediately. Otherwise leave it for later. Simple.
An example, attribute accessor: a simple send. I build a frame, set the self. Now a fully dynamic implementation would leave it at that. But i can check if i know the type; if it's not a reference (ie an integer), we can raise immediately. Also, a reference tags the class, for when that is known at compile time. If so, i can determine the layout at compile time and inline the getter's implementation. If not, i could cache, but that's for later.

As a further example on this, when one function has two calls on the same object, the layout must only be retrieved once. Ie in the sequence getType, determine method, call, the first step can be omitted for the second call, as a layout is constant.

And as a final bonus of all this clarity, i immediately spotted the inconsistency in my own design: the frame i designed holds local variables, but the caller needs to create it. The caller cannot possibly know the number of local variables, as that is decided by the invoked method, which is only known at run-time. So we clearly need a two-level thing here: one that the caller creates, and one that the receiver creates.
### Messaging and slots

It is interesting to relate what emerges to concepts learned over the years:

There is this idea of message passing, as opposed to function calling. Everyone i know has learned an imperative language as their first language, and so message passing is a bit like vegetarian food, all right for some. But of course there is a distinct difference in dynamic languages, as one does not know the actual method invoked beforehand. Also, exceptions make the return trickier, and default values even the argument passing, which then has to be augmented by the receiver.

One main difficulty i had with the message passing idea has always been what the message actually is. But now that i have the frame, i know exactly what it is: it is the frame, nothing more, nothing less.

(Postscript: later i introduced the Message object, which gets created by the caller, while the Frame is what is created by the callee.)
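
As a toy sketch of that split (invented names): the caller only knows what it is sending, so it builds the Message; only the invoked method knows its locals, so it builds its own frame on top of the message it received.

```ruby
# the caller can only know what it is sending
Message = Struct.new(:receiver, :arguments, :return_address, :exception_address)

# the callee knows its own locals, so it creates this part itself,
# keeping a reference to the message it was invoked with
class CalleeFrame
  def initialize(message, local_names)
    @message = message
    @locals  = local_names.each_with_object({}) { |name, slots| slots[name] = nil }
  end

  def get_local(name)
    @locals[name]
  end

  def set_local(name, value)
    @locals[name] = value
  end
end

message = Message.new(:some_object, [1, 2], :return_here, :raise_here)
frame   = CalleeFrame.new(message, [:sum, :counter])   # only the callee knows these
frame.set_local(:sum, 3)
puts frame.get_local(:sum)    # => 3
```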
Another interesting observation is the (hopefully) golden path this design walks between smalltalk and self. In smalltalk (like ruby and...) all objects have a class. But some of the smalltalk researchers went on to do [Self](http://en.wikipedia.org/wiki/Self_(programming_language)), which has no classes, only objects. This was supposed to make things easier and faster. Slots were a bit like instance variables, but there were no classes to rule them.

Now in ruby, any object can have any variables anyway, but they incur a dynamic lookup. Types on the other hand are like slots, and keeping each Type constant (while an object can change layouts) makes it possible to have completely dynamic behaviour (smalltalk/ruby) **and** use a slot-like (self) system with constant lookup speed. Admittedly the constancy only affects cache hits, but as most systems are not dynamic most of the time, that is almost always.