73 lines
4.5 KiB
Markdown
73 lines
4.5 KiB
Markdown
|
---
|
||
|
layout: news
|
||
|
author: Torsten
|
||
|
---
|
||
|
|
||
|
The register machine abstraction has been somewhat thin, and it is time to change that
|
||
|
|
||
|
### Current affairs
|
||
|
|
||
|
When i started, i started from the assembler side, getting arm binaries working and off course learning the arm cpu
|
||
|
instruction set in assembler memnonics.
|
||
|
|
||
|
Not having **any** experience at this level i felt that arm was pretty sensible. Much better than i expected. And
|
||
|
so i abtracted the basic instruction classes a little and had the arm instructions implement them pretty much one
|
||
|
to one.
|
||
|
|
||
|
Then i tried to implement any ruby logic in that abstraction and failed. Thus was born the virtual machine
|
||
|
abstraction of having Message, Frame and Self objects. This in turn mapped nicely to registers with indexed
|
||
|
addressing.
|
||
|
|
||
|
### Addressing
|
||
|
|
||
|
I just have to sidestep here a little about addressing: the basic problem is off course that we have no idea at
|
||
|
compile-time at what address the executable will end up.
|
||
|
|
||
|
The problem first emerged with calling functions. Mostly because that was the only objects i had, and so i was
|
||
|
very happy to find out about pc relative addressing, in which you jump or call relative to your current position
|
||
|
(**p**rogram **c**ounter). Since the relation is not changed by relocation all is well.
|
||
|
|
||
|
Then came the first strings and the aproach can be extended: instead of grabbing some memory location, ie loading
|
||
|
and address and dereferencing, we calculate the address in relation to pc and then dereference. This is great and
|
||
|
works fine.
|
||
|
|
||
|
But the smug smile is wiped off the face when one tries to store references. This came with the whole object
|
||
|
aproach, the bootspace holding references to **all** objects in the system. I even devised a plan to always store
|
||
|
relative addresses. Not relative to pc, but relative to the self that is storing. This i'm sure would have
|
||
|
worked fine too, but it does mean that the running program also has to store those relative addresses (or have
|
||
|
different address types, shudder). That was a runtime burden i was not willing to accept.
|
||
|
|
||
|
So there are two choices as far as i see: use elf relocation, or relocate in init code. And yet again i find myself
|
||
|
biased to the home-growm aproach. Off course i see that this is partly because i don't want to learn the innards of
|
||
|
elf as something very complicated that does a simple thing. But also because it is so simple i am hoping it isn't
|
||
|
such a big deal. Most of the code for it, object iteration, type testing, layout decoding, will be useful and
|
||
|
neccessary later anyway.
|
||
|
|
||
|
### Concise instruction set
|
||
|
|
||
|
So that addressing aside was meant to further the point of a need for a good register instruction set (to write the
|
||
|
relocation in). And the code that i have been writing to implement the vm instructions clearly shows a need for
|
||
|
a better model at the register model.
|
||
|
|
||
|
On the other hand, the idea of Passes will make it very easy to have a completely sepeate register machine layer.
|
||
|
We just transfor the vm to that, and then later from that to arm (or later intel). So there are three things that i
|
||
|
am looking for with the new register machine instruction set:
|
||
|
|
||
|
- easy to understand the model (ie register machine, pc, ..), free of real machine quirks
|
||
|
- small set of instructions that is needed for our vm
|
||
|
- better names for instructions
|
||
|
|
||
|
Especially the last one: all the mvn and ldr is getting to me. It's so 50's, as if we didn't have the space to spell
|
||
|
out move or load. And even those are not good names, at least i am always wondering what is a move and what a load.
|
||
|
And as i explained above in the addressing, if i wanted to load an address of an object into a register with relative
|
||
|
addressing, i would actually have to do an add. But when reading an add instruction it is not an intuative
|
||
|
conclusion that a load is meant. And since this is a fresh effort i would rather change these things now and make
|
||
|
it easier for others to learn sensible stuff than me get used to cryptics only to have everyone after me do the same.
|
||
|
|
||
|
So i will have instructions like RegisterMove, ConstantLoad, Branch, which will translate to mov, ldr and b in arm. I still like to keep the arm level with the traditional names, so people who actually know arm feel right at home.
|
||
|
But the extra register layer will make it easier for everyone who has not programmed assembler (and me!),
|
||
|
which i am guessing is quite a lot in the *ruby* community.
|
||
|
|
||
|
In implementation terms it is a relatively small step from the vm layer to the register layer. And an even smaller
|
||
|
one to the arm layer. But small steps are good, easy to take, easy to understand, no stumbling.
|