bit of cleaning, updated readme

Torsten Ruger
2015-10-22 17:38:49 +03:00
parent c68577c3f4
commit f658ecf425
5 changed files with 33 additions and 89 deletions

@ -1,50 +1,48 @@
Register Machine
===============
================
This is the logic that uses the compiled virtual object space to produce code and an executable binary.
The RegisterMachine is an abstract machine with registers. Think of it as an Arm machine with
normal instruction names. It is not, however, an abstraction of existing hardware, but only
of the subset that we need.
There is a mechanism for an actual machine (derived class) to generate hardware-specific instructions (as the
plain ones in this directory don't assemble to binary). Currently there is only the Arm module to actually do
that.
Our primary objective is to compile Phisol to this level, so the register machine has:
- object access instructions
- object load
- object oriented call semantics
- extended (and extensible) branching
- normal integer operators (but no sub word instructions)
The elf module is used to generate the actual binary from the final Space. Space is a virtual class representing
all objects that will be in the executable. Other than MethodSource, objects get transformed to data.
All data is in objects.
But MethodSource, which are made up of Blocks, are compiled into a stream of bytes,
which are the binary code for the function.
The register machine is aware of Parfait objects, and specifically uses Message and Frame to
express call semantics.
Virtual Objects
----------------
Calls and syscalls
------------------
There are four virtual objects that are accessible (we can access their variables):
The RegisterMachine uses only one fixed register: the Message currently being worked on.
- Self
- Message (arguments, method name, self)
- Frame (local and tmp variables)
- NewMessage ( to build the next message sent)
There is no stack; rather, messages form a linked list, and when preparing a call, the data is
pre-filled into the next message. Calling then means making the new message the current one and jumping
to the address of the method. Returning is roughly the reverse process.
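The call mechanism described here can be sketched in a few lines of Ruby. This is only an illustration under assumptions: `TinyMachine`, the `Message` struct and all of its field names are hypothetical and are not the project's actual classes. The sketch pre-allocates a chain of messages, fills the next one before a call, and walks back to the caller on return.

```ruby
# Hypothetical model of "messages as a linked list instead of a stack".
# None of these names come from the project; they exist only for illustration.
Message = Struct.new(:next_message, :caller_message, :receiver,
                     :arguments, :return_address)

class TinyMachine
  attr_reader :message, :pc

  # Pre-allocate a chain of `depth` messages linked through next_message.
  def initialize(depth)
    first = Message.new
    current = first
    (depth - 1).times do
      current.next_message = Message.new
      current = current.next_message
    end
    @message = first   # the one "fixed register": the current message
    @pc = 0            # address of the instruction being executed
  end

  # Fill the next message, make it current, then jump to the method address.
  def call(receiver, arguments, address)
    new_message = @message.next_message
    raise "out of messages" if new_message.nil?
    new_message.receiver = receiver
    new_message.arguments = arguments
    new_message.caller_message = @message
    new_message.return_address = @pc + 1
    @message = new_message
    @pc = address
  end

  # Roughly the reverse: restore the address and the caller's message.
  def return_from_call
    @pc = @message.return_address
    @message = @message.caller_message
  end
end
```

Because the chain is allocated up front in this sketch, a call only writes fields and moves one reference; a real implementation would need a policy for extending the list when it runs out.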
These are pretty much the first four registers. When the code goes from virtual to register,
we use register instructions to replace virtual ones.
Syscalls are implemented by *one* Syscall instruction. The Register machine does not specify/limit
the meaning or number of syscalls. This is implemented by the level below, eg the arm/interpreter.
Eg: A Virtual::Set can move data around inside those objects.
And since in Arm this can not be done in one instruction, we use two, one to move to an unused register
and then into the destination. And then we need some fiddling of bits to shift the type info.
Interpreter
===========
Another simple example is a Call. A simple case of a Class function call resolves the class object,
and with the method name the function to be found at compile-time.
And so this results in a Register::Call, which is an Arm instruction.
There is an interpreter that can interpret compiled register machine programs.
This is very handy for debugging (and nothing else).
A C call
---------
Even more handy is the graphical interface for the interpreter, which is in its own repository:
salama-debugger.
Ok, there are no c calls. But syscalls are very similar.
This is not at all as simple as the nice Class call described above.
Arm / Elf
=========
For a syscall in Arm (linux) you have to load registers 0-x (depending on the call), load R7 with the
syscall number and then issue the software interrupt instruction.
If you get something back, it's in R0.
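As a rough illustration of that register shuffle, the sketch below computes which value goes into which register before the software interrupt is issued. The helper `syscall_registers` and the `SYSCALL_NUMBERS` table are hypothetical names, not part of the project. The numbers assume the Arm EABI Linux syscall table (exit is 1, write is 4), with arguments in R0 to R6.

```ruby
# Hypothetical helper: maps a syscall and its arguments to Arm registers.
# Assumes Arm EABI Linux numbering; only two syscalls are listed here.
SYSCALL_NUMBERS = { exit: 1, write: 4 }.freeze

def syscall_registers(name, *args)
  number = SYSCALL_NUMBERS.fetch(name)          # KeyError for unknown syscalls
  raise ArgumentError, "too many arguments" if args.length > 7
  registers = {}
  # Arguments go into r0, r1, ... in order.
  args.each_with_index { |arg, index| registers[:"r#{index}"] = arg }
  registers[:r7] = number                       # syscall number goes into r7
  registers
end

# write(fd = 1, buffer address, length) as a register assignment
layout = syscall_registers(:write, 1, 0x1000, 12)
```

After the interrupt returns, any result would be read back from R0, which is why the surrounding code has to save and restore whatever was held there.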
There is also a (very straightforward) transformation to arm instructions.
Together with the also quite minimal elf module, arm binaries can be produced.
In short, lots of shuffling. And to make it fit with our four object architecture,
we need the Message to hold the data for the call and Sys (module) to be self.
And then the actual functions do the shuffle, saving the data and restoring it.
And setting type information according to kernel documentation (as there is no runtime info)
These binaries have no external dependencies and in fact can not even call c at the moment
(only syscalls :-)).

View File

@ -1,51 +0,0 @@
module Register
class UnusedAndAbandonedInteger < Word
# needs to be here as Word's constructor is private (to make it abstract)
def initialize reg
super
end
def less_or_equal block , right
block.cmp( self , right )
Register::BranchCondition.new :le
end
def greater_or_equal block , right
block.cmp( self , right )
Register::BranchCondition.new :ge
end
def greater_than block , right
block.cmp( self , right )
Register::BranchCondition.new :gt
end
def less_than block , right
block.cmp( self , right )
Register::BranchCondition.new :lt
end
def plus block , left , right
block.add( self , left , right )
self
end
def minus block , left , right
block.sub( self , left , right )
self
end
def left_shift block , left , right
block.mov( self , left , shift_lsr: right )
self
end
def equals block , right
block.cmp( self , right )
Register::BranchCondition.new :eq
end
def is_true? function
function.cmp( self , 0 )
Register::BranchCondition.new :ne
end
def move block , right
block.mov( self , right )
self
end
end
end

View File

@ -1,82 +0,0 @@
module Register
# Passes, or BlockPasses, could have been procs that just get each block passed.
# Instead they are proper objects in case they want to save state.
# The idea is
# - reduce noise in the main code by having this code separately (aspect/concern style)
# - abstract the iteration
# - allow not yet written code to hook in
class RemoveStubs
def run block
block.codes.dup.each_with_index do |kode , index|
next unless kode.is_a? StackInstruction
if kode.registers.empty?
block.codes.delete(kode)
puts "deleted stack instruction in #{block.name}"
end
end
end
end
# Operators eg a + b , must assign their result somewhere and as such create temporary variables.
# but if code is c = a + b , the generated instructions would be more like tmp = a + b ; c = tmp
# So if there is a move instruction just after a logic instruction where the result of the logic
# instruction is moved straight away, we can undo that mess and remove one instruction.
class LogicMoveReduction
def run block
org = block.codes.dup
org.each_with_index do |kode , index|
n = org[index+1]
next if n.nil?
next unless kode.is_a? LogicInstruction
next unless n.is_a? MoveInstruction
# specific arm instructions, don't optimize as don't know what the extra mean
# small todo. This does not catch condition_code that are not :al
next if (n.attributes.length > 3) or (kode.attributes.length > 3)
if kode.result == n.from
puts "Logic reduction #{kode} removes #{n}"
kode.result = n.to
block.codes.delete(n)
end
end
end
end
# Sometimes there are double moves ie mov a, b and mov b , c . We reduce that to move a , c
# (but don't check if that improves register allocation. Yet ?)
class MoveMoveReduction
def run block
org = block.codes.dup
org.each_with_index do |kode , index|
n = org[index+1]
next if n.nil?
next unless kode.is_a? MoveInstruction
next unless n.is_a? MoveInstruction
# specific arm instructions, don't optimize as don't know what the extra mean
# small todo. This does not catch condition_code that are not :al
next if (n.attributes.length > 3) or (kode.attributes.length > 3)
if kode.to == n.from
puts "Move reduction #{kode}: removes #{n} "
kode.to = n.to
block.codes.delete(n)
end
end
end
end
#As the name says, remove no-ops. Currently mov x , x supported
class NoopReduction
def run block
block.codes.dup.each_with_index do |kode , index|
next unless kode.is_a? MoveInstruction
# specific arm instructions, don't optimize as don't know what the extra mean
# small todo. This does not catch condition_code that are not :al
next if (kode.attributes.length > 3)
if kode.to == kode.from
block.codes.delete(kode)
puts "deleted noop move in #{block.name} #{kode}"
end
end
end
end
end

View File

@ -1,3 +0,0 @@
module Register
end

View File

@ -1,68 +0,0 @@
module Virtual
# Plock (Proc-Block) is mostly a Block but also somewhat Proc-ish: A Block that carries data.
#
# Data in a Block is useful in the same way data in objects is. Plocks being otherwise just code.
#
# But the concept is not quite straightforward: If one thinks of a Plock embedded in a normal method,
# the data in the Plock would be static data. In OO terms this comes quite close to a Proc,
# if the data is the local variables.
# Quite possibly they shall be used to implement procs, but that is not the direction now.
#
# For now we use Plocks behind the scenes as it were. In the code that you never see,
# method invocation mainly.
#
# In terms of implementation the Plock is a Block with data
# (Not too much data, mainly a couple of references).
# The block writes its instructions as normal, but a jump is inserted as the last instruction.
# The jump is to the next block, over the data that is inserted after the block code
# (and so before the next)
#
# It follows that Plocks should be linear blocks.
class Plock < Block
def initialize(name , method , next_block )
super
@data = []
@branch_code = RegisterMachine.instance.b next_block
end
def set_next next_b
super
@branch_code = RegisterMachine.instance.b next_block
end
# Data gets assembled after methods
def add_data o
return if @data.include? o
raise "must be derived from Code #{o.inspect}" unless o.is_a? Register::Code
@data << o # TODO check type , no basic values allowed (must be wrapped)
end
# Code interface follows. Note position is inherited as is from Code
# length of the Plock is the length of the block, plus the branch, plus data.
def byte_length
len = @data.inject(super) {| sum , item | sum + item.word_length}
len + @branch_code.word_length
end
# again, super + branch plus data
def link_at pos , context
super(pos , context)
@branch_code.link_at pos , context
@data.each do |code|
code.link_at(pos , context)
pos += code.word_length
end
end
# again, super + branch plus data
def assemble(io)
super
@branch_code.assemble(io)
@data.each do |obj|
obj.assemble io
end
end
end
end