# Salama

Salama is about native code generation in and of ruby. The "in" part is done.
### Step 1 - Assembly
Produce binary that represents code.
Traditionally called assembling, but there is no need for an external file representation.
I.e. I want to create machine code from ruby code only.

Most instructions are in fact assembling correctly. That is, I have tests, and I can use objdump to verify that the assembled code disassembles to the correct assembler.

I even polished the DSL, and so (from the tests) this is a valid hello world:

    hello = "Hello World\n"
    @program.main do
      mov r7, 4     # 4 == write
      mov r0 , 1    # stdout
      add r1 , pc , hello   # address of "hello World"
      mov r2 , hello.length
      swi 0         #software interrupt, ie kernel syscall
      mov r7, 1     # 1 == exit
      swi 0
    end
    write(7 + hello.length/4 + 1 , 'hello')
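
To give an idea of what "creating machine code from ruby" involves, here is a minimal sketch of encoding two of the instructions above as 32-bit ARM words. The helper names `encode_mov_imm` and `encode_swi` are made up for this example and are not Salama's assembler classes; only 8-bit immediates without rotation are handled.

```ruby
# Hypothetical helpers for illustration; these are not Salama's assembler classes.

# Encode ARM (A32) "mov rd, #imm" with an 8-bit immediate and no rotation.
def encode_mov_imm(rd, imm)
  raise ArgumentError, "register out of range" unless (0..15).cover?(rd)
  raise ArgumentError, "immediate must fit in 8 bits" unless (0..255).cover?(imm)
  cond     = 0b1110 << 28   # condition code "always"
  imm_flag = 1 << 25        # second operand is an immediate
  opcode   = 0b1101 << 21   # data-processing opcode for MOV
  cond | imm_flag | opcode | (rd << 12) | imm
end

# Encode "swi n" (software interrupt) with a 24-bit comment field.
def encode_swi(number)
  0xEF000000 | (number & 0xFFFFFF)
end

# The exit part of the hello world: mov r7, 1 ; swi 0
words  = [encode_mov_imm(7, 1), encode_swi(0)]
binary = words.pack("V*")   # 32-bit little-endian words, ready to be written out
puts words.map { |w| format("%08x", w) }.join(" ")
```

The printed words can be compared against what objdump shows for the same instructions.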
### Step 2 - Link to system

Package the code into an executable. Run that and verify its output. But full ELF support (including externs) is eluding me for now.

Still, this has proven to be a good review point for the architecture and means no libc for now.
Full rationale is on the web pages, but it means starting an extra step.

The Hello World above can be linked and run. And will say its thing.
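
As a rough idea of what the packaging involves, the sketch below builds only the 52-byte ELF32 file header for a little-endian ARM executable. It is not Salama's linker: the function name and the entry address are assumptions for the example, `e_flags` is left at 0, and no program headers or code are emitted, so the result is not runnable on its own.

```ruby
# Sketch only: the field order follows the ELF32 header layout, but the entry
# address is a made-up example and nothing beyond the header is produced.
def elf32_arm_header(entry:, program_headers: 0)
  # Magic, then class = 32-bit, data = little-endian, version = 1, padding.
  ident = "\x7FELF".b + [1, 1, 1, 0].pack("C4") + ("\0" * 8)
  [
    ident,
    2,                                 # e_type: ET_EXEC
    40,                                # e_machine: EM_ARM
    1,                                 # e_version
    entry,                             # e_entry
    program_headers.zero? ? 0 : 52,    # e_phoff: directly after this header
    0,                                 # e_shoff: no section headers
    0,                                 # e_flags: left empty in this sketch
    52,                                # e_ehsize
    32,                                # e_phentsize
    program_headers,                   # e_phnum
    40,                                # e_shentsize
    0,                                 # e_shnum
    0                                  # e_shstrndx
  ].pack("a16vvVVVVVvvvvvv")
end

header = elf32_arm_header(entry: 0x8054)   # hypothetical entry address
puts header.bytesize
```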
### Step 3 - syscalls
Start implementing some syscalls and add the functionality we actually need from C (basic io only, really).

This is surprisingly easy, as the framework is done. As said, "Hello world" comes out and does use syscall 4.
Also the program stops by syscall exit. The full list is on the net and involves mostly grunt work.
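
The grunt work is mostly a table of numbers plus a fixed calling pattern. A minimal sketch, assuming the convention used in the hello world (syscall number in r7, arguments from r0 upwards); `SYSCALL_NUMBERS` and `syscall_sequence` are hypothetical names, and the instructions are plain arrays rather than Salama's instruction objects:

```ruby
# Hypothetical sketch, not Salama's syscall framework.
SYSCALL_NUMBERS = { exit: 1, write: 4 }.freeze   # the two numbers used in the hello world

# Returns the instruction sequence as plain arrays:
# arguments go into r0, r1, ..., the syscall number into r7, then swi 0.
def syscall_sequence(name, *args)
  number = SYSCALL_NUMBERS.fetch(name) { raise ArgumentError, "unknown syscall #{name}" }
  loads = args.each_with_index.map { |arg, index| [:mov, :"r#{index}", arg] }
  loads + [[:mov, :r7, number], [:swi, 0]]
end

# :hello stands in for the address of the string, which the real code computes.
syscall_sequence(:write, 1, :hello, 12).each { |instruction| p instruction }
```

Adding another syscall is then one more entry in the table.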
### Step 4 - Parse ruby

Parse simple code, using Parslet.

Parsing is a surprisingly fiddly process, very space and order sensitive. But Parslet is great and simple
expressions (including function definitions and calls) are starting to work.

I spent some time on the parse testing framework, so it is safe to fiddle and add. In fact it is very modular and
so it is easy to add to.
### Step 5 - Virtual: Compile the Ast
Since we now have an abstract syntax tree, it needs to be compiled to a virtual machine instruction format.
For the parsed subset that's done.

It took me a while to come up with a decent but simple machine model. I had tried to map straight to hardware
but failed. The current Virtual directory represents a machine with basic oo features.

Instead of having more layers to go from virtual to arm, I opted to have passes that go over the data structure
and modify it.

This allows optimisation after every pass as we have a data structure at every point in time.
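
The idea of passes over one data structure can be sketched as follows. The `Instruction` struct, the `:set` instruction and both pass names are invented for this example; the real Virtual classes look different. Each pass takes the instruction list and returns a modified list, so the result can be inspected or optimised between any two passes.

```ruby
# Hypothetical instruction and pass names; Salama's Virtual classes differ.
Instruction = Struct.new(:name, :args)

# Pass 1: lower the (made-up) virtual :set instruction to a machine-level :mov.
LOWER_SET = lambda do |code|
  code.flat_map do |instruction|
    next [instruction] unless instruction.name == :set
    [Instruction.new(:mov, instruction.args)]
  end
end

# Pass 2: a tiny optimisation that drops moves from a register to itself.
DROP_SELF_MOVES = lambda do |code|
  code.reject { |instruction| instruction.name == :mov && instruction.args[0] == instruction.args[1] }
end

# Run the passes in order; the list is a complete data structure after each one.
def run_passes(code, passes)
  passes.reduce(code) { |current, pass| pass.call(current) }
end

code   = [Instruction.new(:set, [:r0, :r0]), Instruction.new(:set, [:r1, 4])]
result = run_passes(code, [LOWER_SET, DROP_SELF_MOVES])
result.each { |instruction| puts "#{instruction.name} #{instruction.args.join(', ')}" }
```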
### Step 6 - Compound types
Arrays and Hashes parse. Good. But this means the actual data structures should be implemented. AWIP (a work in progress).

Implement a core library of arrays/hash/string, memory definition and access.

Also compound data needs to find its way into the executable, i.e. it needs to be assembled. This is done (though there is
very little to be done with it at runtime).
### Step 7 - Dynamic function lookup

It proved to be quite a big step to go from static function calling to oo method lookup. Also ruby is very
introspective and that means much of the compiled code needs to be accessible in the runtime (not just present,
accessible).

This has taken me the better part of three months, but is starting to come around.

So the current status is that I can

- parse a usable subset of ruby
- compile that to my vm model
- generate assembler for all higher level constructs in the vm model
- assemble and link the code and objects (strings/arrays/hashes) into an executable
- run the executable and debug :-(
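
The difference between a static call and oo method lookup can be sketched like this. All names (`VmClass`, `VmObject`, `send_message`) are hypothetical and only model the idea in plain ruby: the method is found at runtime by walking the receiver's class chain, and the method tables stay accessible as ordinary data.

```ruby
# Hypothetical model classes, prefixed Vm to avoid clashing with ruby's own.
class VmClass
  attr_reader :name, :super_class, :method_table

  def initialize(name, super_class = nil)
    @name = name
    @super_class = super_class
    @method_table = {}
  end

  def define(method_name, &body)
    @method_table[method_name] = body
  end

  # Walk up the class chain until some class has the method.
  def resolve(method_name)
    klass = self
    while klass
      found = klass.method_table[method_name]
      return found if found
      klass = klass.super_class
    end
    nil
  end
end

VmObject = Struct.new(:vm_class)

def send_message(receiver, method_name, *args)
  body = receiver.vm_class.resolve(method_name)
  raise NoMethodError, "#{method_name} not found" unless body
  body.call(receiver, *args)   # self is passed as an explicit first argument
end

base = VmClass.new(:Object)
base.define(:describe) { |me| "a #{me.vm_class.name}" }
string_class = VmClass.new(:String, base)
object = VmObject.new(string_class)
puts send_message(object, :describe)
```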
### Step x + 1
Implement ruby Blocks, and make new vm classes to deal with that. This is in fact a little open, but I have a general
notion that blocks are "just" methods with even more implicit arguments.
### Step +2
Implement Exceptions. Conceptually this is not so difficult in an oo machine as it would be in C.

I have a post about it, http://salama.github.io/2014/06/27/an-exceptional-though.html,
which boils down to the fact that we can treat the address to return to in an exception quite like a return address
from a function. I.e. just another implicit parameter (as return is really an implicit parameter, a little like self for oo).
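
A small sketch of that idea in plain ruby, with both "addresses" modelled as lambdas that are passed along explicitly. The method names and the `on_return` / `on_raise` parameters are made up for the illustration; in the vm they would be implicit.

```ruby
# Sketch: both "where to continue" addresses are modelled as lambdas.
def divide(a, b, on_return, on_raise)
  return on_raise.call("division by zero") if b.zero?
  on_return.call(a / b)
end

def average(sum, count, on_return, on_raise)
  # No handler here, so the caller's exception target is simply passed through,
  # exactly like the return target would be.
  divide(sum, count, ->(value) { on_return.call(value.round(2)) }, on_raise)
end

ok  = average(10.0, 4, ->(value) { "result #{value}" }, ->(error) { "rescued #{error}" })
bad = average(10.0, 0, ->(value) { "result #{value}" }, ->(error) { "rescued #{error}" })
puts ok
puts bad
```

Raising is then nothing more than jumping to the second target instead of the first.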
### Step +3
Implement a way to call libc and other C libraries. I am not placing a large emphasis on this personally,
but expect somebody will come along and have a library they want to use so much they can't stop themselves.
Personally I think a fresh start is what we need much more. I once counted the call chain from a simple
printf to the actual kernel invocation in some libc and it was getting to 10! I hope with dynamic (re)compiling
we can do better than that.
### Step +4
Iterate from one:

1. more cpus (e.g. intel)
2. more systems (e.g. mac)
3. more syscalls, there are after all some hundreds
4. Ruby is full of niceties that are not done, also negative tests are non-existent
5. A lot of modern cpu's functionality has to be mapped to ruby and implemented in assembler to be useful
6. Different sized machines, with different register types ?
7. on 64bit, there would be 8 bits for types and thus allow for rational, complex, and whatnot
8. Housekeeping (the superset of gc) is abundant
9. Any amount of time could be spent on a decent digital tree (see judy). Or possibly Dr. Cliff's hash.
10. Also better string/arrays would be good.
11. The minor point of threads and hopefully lock free primitives to deal with that.
12. Inlining would be good

And generally optimize and work towards that perfect world (we never seem to be able to attain).
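
For point 7 in the list above, the 8 type bits on a 64-bit machine could look roughly like the sketch below. Everything here is an assumption for illustration: the type numbering is invented, and putting the tag into the low 8 bits (leaving 56 bits of payload) is just one possible layout.

```ruby
# Hypothetical tagging scheme: low 8 bits hold the type, upper 56 bits the payload.
TYPE_BITS = 8
TYPE_MASK = (1 << TYPE_BITS) - 1
TYPES = { integer: 0, rational: 1, complex: 2 }.freeze   # made-up numbering

def tag(type, payload)
  raise ArgumentError, "payload must fit in 56 bits" unless payload.between?(0, (1 << 56) - 1)
  (payload << TYPE_BITS) | TYPES.fetch(type)
end

def untag(word)
  [TYPES.key(word & TYPE_MASK), word >> TYPE_BITS]
end

word = tag(:rational, 1234)
p untag(word)
```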
### Step 30
Celebrate New year 2030

Contributing to salama
-----------------------
Probably best to talk to me, if it's not a typo or so.
I do have a todo, for the adventurous.
Fork and create a branch before sending pulls.

Copyright
---------
Copyright (c) 2014 Torsten Ruger. See LICENSE.txt for
further details.