spelling and 100 indenting

This commit is contained in:
Torsten Ruger 2017-04-07 14:57:12 +03:00
parent f309b12c37
commit e769849ca7
9 changed files with 173 additions and 164 deletions


@ -3,32 +3,33 @@ layout: news
author: Torsten
---
Well, it has been a good holiday, two months in Indonesia, Bali and diving Komodo. It brought clarity, and so i have to start a daunting task.
When i learned programming at University, they were still teaching Pascal. So when I got to choose c++ in my first bigger project, that was a real step up. But even as i wrestled with templates, it was Smalltalk that took my heart immediately when i read about it. And I read quite a bit, including the Blue Book about the implementation of it.
The next distinct step up was Java, in 1996, and then ruby in 2001. Until i mostly stopped coding in 2004, when i moved to the countryside and started our [B&amp;B](http://villataika.fi/en/index.html).
But then we needed web-pages, and before long a pos for our shop, so i was back on the keyboard. And since it was a thing i had been wanting to do, I wrote a database.
Purple was my current idea of an ideal data-store: save by reachability, automatic loading by traversal, and schema-free saving of any ruby object. In memory, based on Judy, it did about 2000 transactions per second. Alas, it didn't have any searching.
So i bit the bullet and implemented an sql interface to it. After a failed attempt with rails 2, and after 2 major rewrites, i managed to integrate what by then was called warp into Arel (rails 3). But while raw throughput was still about the same, when it had to go through Arel it crawled to 50 transactions per second, about the same as sqlite.
This was maybe 2011, and there was no doubt anymore. Not the database, but ruby itself was the speed hog. I aborted.
In 2013 I bought a Raspberry Pi and of course I wanted to use it with ruby. Alas... slow pi + slow ruby = nischt gut.
I gave up.
So then the clarity came with the solution: build your own ruby. I started designing a bit on the beach already.
Still, daunting. But maybe just possible....


@ -9,18 +9,18 @@ The c machine
Software engineers have clean brains, scrubbed into full c alignment through decades. A few rebels (klingons?) remain on embedded systems, but of those most strive towards posix compliance too.
In other words, since all programming ultimately boils down to c, libc makes the bridge to the kernel/machine. All .... all but a small village in the northern (cold) parts of europe (Antskog) where ...
So i had a look at what we are talking about.
The issue
----------
Many, especially embedded guys, have noticed that your standard c library has become quite heavy (2 megs). Since it provides a defined api (posix) and large functionality on a plethora of systems (os's) and cpu's, and even for different ABI's (application binary interfaces) and compilers/linkers, it is no wonder.
ucLibc or dietLibc get the size down, especially diet quite a bit (130k). So that's ok then. Or is it?
Then i noticed that the real issue is not the size. Even my pi has 512 Mb, and of course even libc gets paged.
The real issue is the step into the C world. So, extern functions, call marshalling, and the question is: for what.
@ -31,8 +31,8 @@ ruby core/std-lib
Of course the ruby-core and std libs were designed to do for ruby what libc does for c. Unfortunately they are badly designed and suffer from the above brainwash (designed around c calls).
Since salama is pure ruby, there is a fair amount of functionality that would be nicer to provide straight in ruby. As gems of course, for everybody to see and fix.
For example, even if there were to be a printf (which i dislike), it would be easy to code in ruby.
What is needed is the underlying write to stdout.
@ -41,11 +41,11 @@ Solution
To get salama up and running, ie to have a "ruby" executable, there are really very few kernel calls needed. File open, read and stdout write, brk.
So the way this will go is to write syscalls where needed.
Having tried to reverse engineer uc, diet and musl, it seems best to go straight to the source.
Most of that is of course for intel, but eax goes to r7 and after that the args are from r0 up, so not too bad. The definitive guide for arm is here: [http://sourceforge.net/p/strace/code/ci/master/tree/linux/arm/syscallent.h](http://sourceforge.net/p/strace/code/ci/master/tree/linux/arm/syscallent.h).
But it doesn't include arguments (only their number), so [http://syscalls.kernelgrok.com/](http://syscalls.kernelgrok.com/) can be used.
So there, getting more metal by the minute. But the time from writing this to a hello world was 4 hours.
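To give a feel for how few calls are actually involved, here is a sketch of the syscall numbers mentioned above as a plain ruby table. The numbers are my reading of the ARM EABI syscall table (check them against the linked syscallent.h), and the `registers_for` helper is purely illustrative, not salama's api.

```ruby
# Sketch: the handful of Linux ARM (EABI) syscalls needed for a minimal
# "ruby" executable. On arm the syscall number goes into r7 and the
# arguments into r0, r1, r2... before the svc instruction.
SYSCALLS = {
  exit:  1,   # r0 = exit code
  read:  3,   # r0 = fd, r1 = buffer, r2 = length
  write: 4,   # r0 = fd, r1 = buffer, r2 = length
  open:  5,   # r0 = path, r1 = flags, r2 = mode
  close: 6,   # r0 = fd
  brk:   45,  # r0 = new program break (0 to query)
}

# Hypothetical helper: which value ends up in which register for a call.
def registers_for(name, *args)
  regs = { r7: SYSCALLS.fetch(name) }
  args.each_with_index { |a, i| regs[:"r#{i}"] = a }
  regs
end
```

A hello world then only needs `write(1, str, len)` followed by `exit(0)`, which matches the "4 hours to hello world" experience.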


@ -3,18 +3,17 @@ layout: news
author: Torsten
---
Part of the reason why i even thought this was possible was because i had bumped into Metasm.
Metasm creates native code in 100% ruby. Either from Assembler or even C (partially). And for many cpu's too.
It also creates many binary formats, elf among them.
Still, i wanted something small that i could understand easily as it was clear it would have to be changed to fit.
As there was no external assembler file format planned, the whole approach from parsing was inappropriate.
I luckily found a small library, as, that did arm only and was just a few files. After removing unneeded parts like parsing, and some reformatting, i added an assembler-like dsl.
This layer (arm subdirectory) said hello after about 2 weeks of work.
I also got qemu to work and can thus develop without the actual pi.


@ -2,7 +2,7 @@
layout: news
author: Torsten
---
Both "ends", parsing and machine code, were relatively clear cut. Now it is into unknown territory.
I had ported the Kaleidoscope llvm tutorial language to ruby-llvm last year, so there were some ideas floating.
@ -10,20 +10,20 @@ The idea of basic blocks, as the smallest unit of code without branches was pret
targets was also straightforward. But how to get from the AST to arm Instructions was not, and took some trying out.
In the end, or rather now, it is the AST layer that "compiles" itself into the Vm layer. The Vm layer then assembles
itself into Instructions.
General instructions are part of the Vm layer, but the code picks up derived classes and thus makes machine
dependent code possible. So far so ok.
Register allocation was (and is) another story. Argument passing and local variables do work now, but there is definitely
room for improvement there.
To get anything out of a running program i had to implement putstring (easy) and putint (difficult). Surprisingly
division is not easy, and when pinned to 10 (divide by 10), quite strange. Still it works. While i was at writing assembler i found a fibonacci in 10 or so instructions.
To summarise, function definition and calling (including recursion) works.
If and while structures work, and also some operators, and now it's easy to add more.
So we have a Fibonacci in ruby, using a while implementation, that can be executed by salama and outputs the correct result. After a total of 7 weeks this is much more than expected!
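The actual source salama ran is not shown in the post, but a while-based Fibonacci of the kind described looks roughly like this (an assumed reconstruction using only the constructs listed as working: function definition, while, and basic operators):

```ruby
# Iterative fibonacci using a while loop, as a compiler test program.
def fibonacci(n)
  a = 0
  b = 1
  while n > 0
    t = a + b   # explicit temp; no fancy parallel assignment assumed
    a = b
    b = t
    n = n - 1
  end
  a
end

puts fibonacci(10)
```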


@ -6,16 +6,16 @@ author: Torsten
Parsing is difficult, the theory incomprehensible and older tools cryptic. At least for me.
And then i heard recursive descent is easy and used by even llvm. Formalised as peg, parsing libraries exist, and in ruby they have dsl's and are suddenly quite understandable.
Of the candidates, i had first very positive experiences with treetop. Upon continuing i found the code generation aspect not just clumsy (after all you can define methods in ruby), but also to interfere unnecessarily with code control. On top of that, conversion into an AST was not easy.
After looking around i found Parslet, which pretty much removes all those issues. Namely
- It does not generate code, it generates methods. And has a nice dsl.
- It transforms to ruby basic types and has the notion of a transformation.
So an easy and clean way to create an AST
- One can use ruby modules to partition a larger parser
- Minimal dependencies (one file).
@ -24,12 +24,11 @@ After looking around i found Parslet, which pretty much removes all those issues
So i was sold, and i got up to speed quite quickly. But i also found out how fiddly such a parser is in regard to ordering and whitespace.
I spent some time making quite a solid test framework, testing the different rules separately and also the stages separately, so things would not break accidentally when growing.
After about another 2 weeks i was able to parse functions, both calls and definitions, ifs and whiles, and of course basic types of integers and strings.
With the great operator support it was a breeze to create all 15-ish binary operators. Even Array and Hash constant definition was very quick. All in all surprisingly painless, thanks to Kasper!
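The "recursive is easy" claim deserves a tiny illustration. This is not salama's Parslet-based parser, just a hand-rolled sketch of the recursive descent idea: each grammar rule is simply a method that consumes input and returns an AST node (a plain Hash here).

```ruby
# Tiny recursive descent sketch. Grammar:
#   expression := integer (('+' | '-') integer)*
class TinyParser
  def initialize(text)
    @text = text
    @pos = 0
  end

  def expression
    node = integer
    while (op = @text[@pos]) == "+" || op == "-"
      @pos += 1
      node = { op: op, left: node, right: integer }
    end
    node
  end

  def integer
    start = @pos
    @pos += 1 while @text[@pos] =~ /[0-9]/
    raise "expected integer at #{start}" if start == @pos
    { int: @text[start...@pos].to_i }
  end
end

TinyParser.new("1+2-3").expression
```

Parslet wraps the same idea in a dsl of rules and adds the transformation stage on top.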


@ -18,17 +18,17 @@ The place for these methods, and i'll go into it a little which in a second, is
So Kernel is the place for methods that are needed to build the system, and may not be called on objects. Simple.
In other words, anything that can be coded on normal objects, should. But when that stops being possible, Kernel is the place.
And what are these functions? get_instance_variable, or set too. Same for functions. Strangely these may in turn rely on functions that can be coded in ruby, but at the heart of the matter is an indexed operation, ie object[2].
This functionality, ie getting the n'th data in an object, is essential, but c makes such a good point of it having no place in a public api. So it needs to be implemented in a "private" part and used in a safe manner. More on the layers emerging below.
The Kernel is a module in salama that defines functions which return function objects. So the code is generated, instead of parsed. An essential distinction.
#### System
It's an important side note on that Kernel definition above, that it is _not_ the same as the system access functions. These are in their own Module and may (or must) use the kernel to implement their functionality. But not the same.
Kernel is the VM's "core" if you want.
@ -36,46 +36,46 @@ System is the access to the operating system functionality.
#### Layers
So from that Kernel idea have now emerged 3 Layers, 3 ways in which code is created.
##### Machine
The lowest layer is the Machine layer. This layer generates Instructions or sequences thereof. So of course there is an Instruction class with derived classes, but also Block, the smallest linear sequence of Instructions.
Also there is an abstract RegisterMachine that is mostly a mediator to the current implementation (ArmMachine). The machine has functions that create Instructions.
Some few machine functions return Blocks, or append their instructions to blocks. This is really more a macro layer. Usually they are small, but div10 for example is a real 10-instruction beauty.
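Why div10 takes 10 instructions: arm has no hardware divide, so division by a constant is usually done as multiplication by a fixed-point reciprocal plus shifts. The arm sequence itself is not reproduced here; this ruby sketch only shows the arithmetic trick such a macro typically implements (the magic constant is the standard one for unsigned 32-bit values).

```ruby
# Branch-free unsigned divide-by-ten via reciprocal multiplication.
# MAGIC = ceil(2**35 / 10); correct for all n in 0...2**32.
MAGIC = 0xCCCCCCCD

def div10(n)
  (n * MAGIC) >> 35   # on arm: umull plus shift/add instructions
end

div10(12345)  # => 1234
```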
##### Kernel
The Kernel functions return function objects. Kernel functions have the same name as the function they implement, so Kernel::putstring defines a function called putstring. Function objects (Vm::Function) carry entry/exit/body code, receiver/return/argument types and a little more.
The important thing is that these functions are callable from ruby code. Thus they form the glue from the next layer up, which is coded in ruby, to the machine layer. In a way the Kernel "exports" the machine functionality to salama.
##### Parfait
Parfait is a thin layer implementing a mini-minimal OO system. Sure, all your usual suspects of strings and integers are there, but they only implement what is really really necessary. For example, strings mainly have new, equals and put.
Parfait is heavy on Object/Class/Metaclass functionality, object instance and method lookup. All things needed to make an OO system OO. Not so much "real" functionality here, more creating the ability for that.
Stdlib would be the next layer up, implementing the whole of ruby functionality in terms of what Parfait provides.
The important thing here is that Parfait is written completely in ruby. Meaning it gets parsed by salama like any other code, and then transformed into executable form and written.
Any executable that salama generates will have Parfait in it. But only the final version of salama as a ruby vm, will have the whole stdlib and parser along.
#### Salama
Salama uses the Kernel and Machine layers straight when creating code. Of course.
The closest equivalent to salama would be a compiler, and so it is its job to create code (machine layer objects).
But it is my intention to keep that as small as possible. And the good news is it's all ruby :-)
##### Extensions
I just want to mention the idea of extensions, a logical step for a minimal system. Of course they would be gems, but the interesting thing is they (like salama) could:
- use salamas existing kernel/machine abstraction to define new functionality that is not possible in ruby
- define new machine functionality, adding kernel type api's, to create wholly new, possibly hardware specific functionality
I am thinking graphic acceleration, GPU usage, vector api's, that kind of thing. In fact i aim to implement the whole floating point functionality as an extension (as it is clearly not essential for OO).


@ -6,56 +6,62 @@ author: Torsten
I was just reading my ruby book, wondering about functions and blocks and the like, as one does when implementing a vm. Actually the topic i was struggling with was receivers, the pesky self, when i got the exception.
And while they say two steps forward, one step back, this goes the other way around.
### One step back
As I just learnt assembler, it is the first time i am really considering how functions are implemented, and how the stack is used in that. Sure, i had heard about it, but the details were vague.
Of course a function must know where to return to. I mean the memory-address, as this can't very well be fixed at compile time. In effect this must be passed to the function. But as programmers we don't want to have to do that all the time, and so it is passed implicitly.
##### The missing link
The arm architecture makes this nicely explicit. There, a call is actually called branch with link. This almost rubbed me the wrong way for a while, as it struck me as an exceedingly bad name. Until i "got it", that is. The link is the link back, well that was simple. But the thing is that the "link" is put into the link register.
This never struck me as meaningful, until now. Of course it means that "leaf" functions do not need to touch it. Leaf functions are functions that do not call other functions, though they may do syscalls, as the kernel restores all registers. In other cpu's the return address is pushed on the stack, but in arm you have to do that yourself. Or not, and save the instruction, if you're so inclined.
##### The hidden argument
But the point here is that this makes it very explicit. The return address is in effect just another argument. It usually gets passed automatically by compiler-generated code, but nevertheless: it is an argument.
The "step back" is to make this argument explicit in the vm code. Thus making its handling, ie passing or saving, explicit too. And thus having less magic going on, because you can't understand magic (you gotta believe it).
### Two steps forward
And so the thrust becomes clear i hope. We are talking about exceptions after all.
Because to those who have not read the windows calling convention on exception handling, or even heard of the dwarf specification thereof, i say: don't. It melts the brain.
You have to be so good at playing computer in your head, it's not healthy.
Instead, we make things simple and explicit. An exception is after all just a different way for a function to return. So we need an address for it to return to.
And as we have just made the normal return address an explicit argument, we just make the exception return address an argument too. And presto.
Even just the briefest of considerations of how we generate those exception return addresses (landing pads? what a strange name) leads to the conclusion that if a function does not do any exception handling, it just passes the same address on that it got itself. Thus a generated exception would jump clear over such a function.
Since we have now got the exceptions to be normal code (alas with an exceptional name :-)), control flow to and from it becomes quite normal too.
To summarize: each function now has a minimum of three arguments: the self, the return address and the exception address.
We have indeed taken a step forward.
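The scheme can be sketched in plain ruby, with lambdas standing in for the code addresses salama would actually use (an assumption for illustration; the real vm passes addresses, not closures). Note how a function with no handling just forwards its exception argument, so a raise jumps clear over it.

```ruby
# Return address and exception address as explicit arguments.
def divide(a, b, ret, exc)
  return exc.call("division by zero") if b == 0
  ret.call(a / b)
end

def average(a, b, ret, exc)
  # no exception handling here: pass exc straight through,
  # exactly the "jump clear over such a function" behaviour
  divide(a + b, 2, ret, exc)
end

result = nil
average(4, 6, ->(v) { result = v }, ->(e) { result = "error: #{e}" })
```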


@ -11,43 +11,45 @@ What i wasn't stuck with, is where to draw the layer for the vm.
### Layers
Software engineers like layers. Like the onion boy. You can draw boxes, make presentations and convince your boss.
They help us to reason about the software.
In this case the model was to go from the ast layer to a vm layer. Via a compile method, that could just as well have been a visitor.
That didn't work, too big a step, and so it was from ast, to vm, to neumann. But i couldn't decide on the abstraction of the virtual machine layer. Specifically, when you have a send (and you have soo many sends in ruby), do you:
- model it as a vm instruction (a bit like java)
- implement it in a couple instructions like resolve, a loop and call
- go to a version that is clearly translatable to neumann, say without the value type implementation
Obviously the third is where we need to get to, as the next step is the neumann layer and somehow we need to get there. In effect one could take those three and present them as layers, not as alternatives like i have.
### Passes
And then the little cog went click, and the idea of passes resurfaced. Llvm has these passes on the code tree, which is probably where it surfaced from.
So we can have as high a degree of abstraction as possible when going from ast to code. And then have as many passes over that as we want / need.
Passes can be order dependent, and create more and more detail. To solve the above layer conundrum, we just do a pass for each of those options.
The two main benefits that come from this are:
1 - At each point, ie after and during each pass, we can analyse the data. Imagine for example that we had picked the second layer option; that means there would never have been a representation where the sends were explicit. Thus any analysis of them would be impossible, or need reverse engineering (eg call graph analysis, or class caching).
2 - Passes can be gems or come from other sources. The mechanism can be relatively oblivious to specific passes. And they make the transformation explicit, ie easier to understand. In the example of having picked the second layer level, one would have to patch the implementation of that transformation to achieve a different result. With passes it would be a matter of replacing a pass, thus explicitly stating "i want a non-standard send implementation".
Actually a third benefit is that it makes testing simpler. More modular. Just test the initial ast->code and then mostly the results of passes.
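The mechanism itself is tiny: a pass is just a callable that takes the code tree and returns a rewritten tree, and the pipeline reduces over them in order. The pass names below are illustrative, not salama's actual passes.

```ruby
# Minimal pass pipeline: ordered tree-to-tree transformations.
# Replacing one pass (eg the send lowering) swaps the implementation
# without patching the compiler itself.
class Passes
  def initialize
    @passes = []
  end

  def add(name, &block)
    @passes << [name, block]
    self
  end

  def run(tree)
    @passes.reduce(tree) { |t, (_name, pass)| pass.call(t) }
  end
end

pipeline = Passes.new
pipeline.add(:explicit_sends)   { |t| t.merge(sends: :explicit) }
pipeline.add(:resolve_and_call) { |t| t.merge(sends: :lowered) }

pipeline.run({ ast: :root })
```

Testing "the results of passes" then means asserting on the tree after each stage, which is exactly the modularity claimed above.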


@ -5,12 +5,14 @@ author: Torsten
In a picture, or when taking a picture, the frame is very important. It sets whatever is in the picture into context.
So it is a bit strange that having a **frame** had the same sort of effect for me in programming. I made the frame explicit, as an object, with functions and data, and immediately the whole message sending became a whole lot clearer.
You read about frames in calling conventions, or otherwise when talking about the machine stack.
It is the area a function uses for storing data, be it arguments, locals or temporary data.
Often a frame pointer will be used to establish a frame's dynamic size and things like that.
But since it's all so implicit and handled by code, very few programmers ever see it; it was
all a bit muddled for me.
My frame has: return and exceptional return address, self, arguments, locals, temps
and methods to: create a frame, get a value to or from a slot or args/locals/tm
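A frame like the one just described can be sketched as a plain object. These are hypothetical names, not the real class — the actual implementation lives in registers and memory, not in Ruby arrays:

```ruby
# Hypothetical sketch of the frame: fixed slots (return addresses, self)
# plus indexed storage for arguments, locals and temps.
class Frame
  attr_accessor :return_address, :exception_address, :receiver

  def initialize(receiver, args = [])
    @receiver = receiver   # self of the invoked method
    @args     = args
    @locals   = []
    @temps    = []
  end

  def get_arg(index)
    @args[index]
  end

  def set_local(index, value)
    @locals[index] = value
  end

  def get_local(index)
    @locals[index]
  end

  def set_temp(index, value)
    @temps[index] = value
  end

  def get_temp(index)
    @temps[index]
  end
end

frame = Frame.new(:some_object, [1, 2])
frame.set_local(0, "hello")
frame.get_local(0)   # => "hello"
frame.get_arg(1)     # => 2
```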
### The divide, compile and runtime
I saw [Tom's video on free compilers](http://codon.com/compilers-for-free) and read the underlying
book on [Partial Evaluation](http://www.itu.dk/people/sestoft/pebook/jonesgomardsestoft-a4.pdf) a bit, and it helped to make the distinctions clearer. As did the Layers and Passes post.
And the explicit Frame.
The explicit frame established the vm explicitly too, or much better. All actions of the vm happen
in terms of the frame. Sending is creating a new one, loading it, finding the method and branching
there. Getting and setting variables is just indexing into the frame at the right index and so on.
Instance variables are a send to self, and on it goes.
The great distinction is at the end quite simple, it is compile-time or run-time. And the passes
idea helps in that i start with the most simple implementation against my vm. Then i have a data
structure and can keep expanding it to "implement" more detail. Or i can analyse it to save
redundancies, ie optimize. But the point is in both cases i can just think about data structures
and what to do with them.
And what i can do with my data (which is of course partially instruction sequences, but that's
beside the point) really always depends on the great question: compile time vs run-time.
What is constant, i can do immediately. Otherwise leave for later. Simple.
An example, attribute accessor: a simple send. I build a frame, set the self. Now a fully dynamic
implementation would leave it at that. But i can check if i know the type: if it's not a
reference (ie an integer) we can raise immediately. Also a reference tags the class for when
that is known at compile time. If so i can determine the layout at compile time and inline the
getter's implementation. If not i could cache, but that's for later.
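The compile-time decision for the accessor might be sketched like this, with made-up names (`LAYOUTS`, `compile_attribute_get`) standing in for whatever the compiler actually keeps:

```ruby
# Hypothetical compile-time knowledge: the layout of each known class.
LAYOUTS = { Point: [:x, :y] }

# Decide at compile time how an attribute read gets compiled.
def compile_attribute_get(receiver_type, attribute)
  # an integer receiver can raise already at compile time
  return [:raise, :no_methods_on_integer] if receiver_type == :Integer
  if receiver_type
    # type known: the layout is constant, inline a fixed index read
    index = LAYOUTS.fetch(receiver_type).index(attribute)
    [:slot_get, index]
  else
    # type unknown: leave the lookup for run-time (or cache it later)
    [:dynamic_send, attribute]
  end
end

compile_attribute_get(:Point, :y)   # => [:slot_get, 1]
compile_attribute_get(nil, :y)      # => [:dynamic_send, :y]
```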
As a further example on this, when one function has two calls on the same object, the layout
must only be retrieved once. ie in the sequence getType, determine method, call, the first
step can be omitted for the second call as the layout is constant.
And as a final bonus of all this clarity, i immediately spotted the inconsistency in my own design: The frame i designed holds local variables, but the caller needs to create it. The caller can
not possibly know the number of local variables as that is decided by the invoked method,
which is only known at run-time. So we clearly need a two level thing here, one
that the caller creates, and one that the receiver creates.
### Messaging and slots
It is interesting to relate what emerges to concepts learned over the years:
There is this idea of message passing, as opposed to function calling. Everyone i know has learned
an imperative language as the first language and so message passing is a bit like vegetarian
food, all right for some. But of course there is a distinct difference in dynamic languages as
one does not know the actual method invoked beforehand. Also exceptions make the return trickier,
and default values complicate even the argument passing, which then has to be augmented by the
receiver.
One main difficulty i had with the message passing idea has always been what the message is.
But now i have the frame, i know exactly what it is: it is the frame, nothing more nothing less.
(Postscript: Later introduced the Message object which gets created by the caller, and the Frame
is what is created by the callee)
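The two-level split from the postscript could look like this (a hypothetical sketch): the caller knows self, the arguments and where to return; the callee alone knows how many locals it needs.

```ruby
# Built by the caller: everything the caller can know.
class Message
  attr_reader :receiver, :args, :return_address

  def initialize(receiver, args, return_address)
    @receiver       = receiver
    @args           = args
    @return_address = return_address
  end
end

# Built by the callee: only the invoked method knows its local count.
class Frame
  attr_reader :message, :locals

  def initialize(message, local_count)
    @message = message
    @locals  = Array.new(local_count)
  end
end

message = Message.new(:some_object, [1], :caller_pc)  # caller side
frame   = Frame.new(message, 3)                       # callee side
frame.locals.size   # => 3
```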
Another interesting observation is the (hopefully) golden path this design goes between smalltalk
and self. In smalltalk (like ruby and...) all objects have a class. But some of the smalltalk
researchers went on to do [Self](http://en.wikipedia.org/wiki/Self_(programming_language)),
which has no classes, only objects. This was supposed to make things easier and faster. Slots
were a bit like instance variables, but there were no classes to rule them.
Now in ruby, any object can have any variables anyway, but they incur a dynamic lookup. Types on
the other hand are like slots, and keeping each Type constant (while an object can change layouts)
makes it possible to have completely dynamic behaviour (smalltalk/ruby) **and** use a slot-like
(self) system with constant lookup speed. Admittedly the constancy only affects cache hits, but
as most systems are not dynamic most of the time, that is almost always.
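The point about constant Types can be sketched as follows (a hypothetical `Type` class, not the real one): a Type never mutates, so a lookup through it can be cached forever; an object that changes layout simply moves to a new Type.

```ruby
# A Type is immutable: lookups through it are constant and cacheable.
class Type
  attr_reader :names

  def initialize(names)
    @names = names.freeze   # a Type never changes
  end

  def index_of(name)
    @names.index(name)      # constant for the life of this Type
  end

  # "changing" the layout produces a new Type; the old one stays valid
  def add_name(name)
    Type.new(@names + [name])
  end
end

point_type = Type.new([:x, :y])
point_type.index_of(:y)              # => 1
point3d = point_type.add_name(:z)    # the object switches to a new Type
point3d.index_of(:z)                 # => 2
point_type.index_of(:y)              # => 1, cached lookups still hold
```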