renaming, making space for extra layer

Torsten Ruger
2014-06-25 02:33:44 +03:00
parent 2044a3e994
commit 9b39a3a816
28 changed files with 34 additions and 1 deletions


@ -0,0 +1,36 @@
Von Neumann Machine
===============
This is the logic that uses the generated AST to produce code, using the asm layer.
Apart from shuffling things around from one layer to the other, it keeps track of registers and
provides the stack glue: all the things a compiler would usually do.
Also all syscalls are abstracted as functions.
The Crystal Convention
----------------------
Since we're not in C, we use the registers more suitably for our job:
- return register is _not_ the same as passing registers
- we pin one more register (ala stack/fp) for type information (this is used for returns too)
- one line (8 registers) can be used by a function (caller saved?)
- rest are scratch and may not hold values during call
For Arm this works out as:
- 0 type word (for the line)
- 1-6 argument passing + workspace
- 7 return value
This means syscalls (using 7 for call number and 0 for return) must shuffle a little, but there's space to do it.
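The shuffle mentioned above can be sketched roughly as follows. This is only an illustration: the constant and method names are hypothetical, and it assumes the syscall expects its arguments from r0 upwards. It also ignores saving the type word in r0, which the syscall clobbers.

```ruby
# Register roles on ARM, as listed above. The constant and method names are
# hypothetical; only the register numbers come from the text.
CONVENTION = {
  type_word:    :r0,
  arguments:    [:r1, :r2, :r3, :r4, :r5, :r6],
  return_value: :r7
}.freeze

# Moves needed around a syscall, as [to, from] pairs.
# Assumption: the syscall expects its arguments from r0 upwards.
def syscall_shuffle(argc, call_number)
  args = CONVENTION[:arguments]
  raise ArgumentError, "at most #{args.length} arguments" if argc > args.length

  # Shift each argument one register down, lowest first, so every source
  # register is read before it is overwritten.
  before = (0...argc).map { |i| ["r#{i}".to_sym, args[i]] }
  before << [:r7, call_number]                # call number goes into r7
  after = [[CONVENTION[:return_value], :r0]]  # result comes back in r0
  { before: before, after: after }
end
```

For a three-argument call with sample call number 4, `syscall_shuffle(3, 4)` lists the moves r0 from r1, r1 from r2, r2 from r3 and the call number into r7 before the call, and r7 from r0 after it.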
Some more detail:
1 - Returning in the same register as passing makes that one register a special case, which I want to avoid. Shuffling it gets tricky and involves 2 moves, for what?
As I see it the benefits of reusing the same register are one more argument register (not needed) and easy chaining of calls, which doesn't really happen so much.
On the plus side, not using the same register makes saving and restoring registers easy (to implement and understand!).
An easy to understand policy is worth gold, as register mistakes are HARD to debug and not what I want to spend my time with just now. So that's settled.
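To illustrate why a separate return register keeps saving and restoring simple, here is a minimal sketch. The method names and the array-based instruction format are hypothetical; only r7 as the return register comes from the convention above.

```ruby
RETURN_REGISTER = :r7 # return register from the convention above

# Registers that must survive a call: everything still live afterwards,
# except the return register, which the call itself assigns.
def registers_to_save(live_after_call)
  live_after_call.uniq - [RETURN_REGISTER]
end

# Wrap a call in a push/pop pair (hypothetical [opcode, operand] format).
# Because the result lives in its own register, the pop can never
# overwrite it, so no extra moves are needed.
def wrap_call(function_name, live_after_call)
  saved = registers_to_save(live_after_call)
  code = []
  code << [:push, saved] unless saved.empty?
  code << [:call, function_name]
  code << [:pop, saved] unless saved.empty?
  code
end
```

When nothing is live after the call, the push and pop disappear entirely and only the call remains.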
2 - Tagging integers like MRI/BB is a hack which does not extend to other types, such as floats. So we don't use that and instead carry type information externally to the value. This is a burden of course, but then so is tagging.
The convention (to make it easier) is to handle data in lines (8 words) and have one of them carry the type info for the other 7. This is also the object layout and so we reuse that code on the stack.
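A line with an external type word might look like the sketch below. The encoding is an assumption made purely for illustration (4 bits of type tag per slot, packed into word 0), and the `Line` class, its type list and its methods are hypothetical names, not taken from the code.

```ruby
# One line: 8 words, word 0 carries the type info for the other 7.
# Assumed encoding (not from the source): 4 bits of type tag per slot.
class Line
  TYPES = [:none, :integer, :reference, :float].freeze
  SLOTS = 7
  BITS  = 4

  def initialize
    @words = Array.new(SLOTS + 1, 0) # word 0 is the type word
  end

  # Store a raw word in slot 1..7 and record its type in the type word.
  def set(slot, type, value)
    check(slot)
    tag = TYPES.index(type) or raise ArgumentError, "unknown type #{type}"
    shift = (slot - 1) * BITS
    mask = ((1 << BITS) - 1) << shift
    @words[0] = (@words[0] & ~mask) | (tag << shift)
    @words[slot] = value
  end

  # Look up the type of a slot from the type word alone.
  def type_of(slot)
    check(slot)
    TYPES[(@words[0] >> ((slot - 1) * BITS)) & ((1 << BITS) - 1)]
  end

  def value_of(slot)
    check(slot)
    @words[slot]
  end

  def type_word
    @words[0]
  end

  private

  def check(slot)
    raise ArgumentError, "slot must be 1..#{SLOTS}" unless (1..SLOTS).cover?(slot)
  end
end
```

The values themselves stay untagged full words; only word 0 needs to be consulted (or pinned in a register) to know what the other seven hold.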

110
lib/neumann/block.rb Normal file

@ -0,0 +1,110 @@
require_relative "values"
module Vm
# Think flowcharts: blocks are the boxes. The smallest unit of linear code
# Blocks must end in control instructions (jump/call/return).
# And the only valid argument for a jump is a Block
# Blocks form a linked list
# There are four ways for a block to get data (to work on)
# - hard coded constants (embedded in code)
# - memory move
# - values passed in (from previous blocks. ie local variables)
# See Value description on how to create code/instructions
# Codes then get assembled into bytes (after linking)
class Block < Code
def initialize(name , function , next_block )
super()
@function = function
@name = name.to_sym
@next = next_block
@branch = nil
@codes = []
# keeping track of register usage, left (assigns) or right (uses)
@assigns = []
@uses = []
end
attr_reader :name , :next , :codes , :function , :assigns , :uses
attr_accessor :branch
def reachable ret = []
add_next ret
add_branch ret
ret
end
def add_code kode
kode.assigns.each { |a| (@assigns << a) unless @assigns.include?(a) }
kode.uses.each { |use| (@uses << use) unless (@assigns.include?(use) or @uses.include?(use)) }
#puts "IN ADD #{name}#{uses}"
@codes << kode
end
def set_next next_b
@next = next_b
end
# returns if this is a block that ends in a call (and thus needs local variable handling)
def call_block?
return false unless codes.last.is_a?(CallInstruction)
return false unless codes.last.opcode == :call
codes.dup.reverse.find{ |c| c.is_a? StackInstruction }
end
# Code interface follows. Note position is inherited as is from Code
# length of the block is the length of its codes, plus any next block (ie no branch follower)
# Note, the next is in effect a linked list and as such may have many blocks behind it.
def length
cods = @codes.inject(0) {| sum , item | sum + item.length}
cods += @next.length if @next
cods
end
# to link we link the codes (instructions), plus any next in line block (non- branched)
def link_at pos , context
super(pos , context)
@codes.each do |code|
code.link_at(pos , context)
pos += code.length
end
if @next
@next.link_at pos , context
pos += @next.length
end
pos
end
# assemble the codes (instructions) and any next in line block
def assemble(io)
@codes.each do |obj|
obj.assemble io
end
@next.assemble(io) if @next
end
private
# helper for determining reachable blocks
def add_next ret
return if @next.nil?
return if ret.include? @next
ret << @next
@next.reachable ret
end
# helper for determining reachable blocks
def add_branch ret
return if @branch.nil?
return if ret.include? @branch
ret << @branch
@branch.reachable ret
end
end
end

37
lib/neumann/call_site.rb Normal file

@ -0,0 +1,37 @@
module Vm
# name and args , return
class CallSite < Value
def initialize(name , value , args , function )
@name = name
@value = value
@args = args
@function = function
raise "oh #{name} " unless value
end
attr_reader :function , :args , :name , :value
def load_args into
if value.is_a?(IntegerConstant) or value.is_a?(ObjectConstant)
function.receiver.load into , value
else
raise "meta #{name} " if value.is_a? Boot::MetaClass
function.receiver.move( into, value ) if value.register_symbol != function.receiver.register_symbol
end
raise "function call '#{args.inspect}' has #{args.length} arguments, but function has #{function.args.length}" if args.length != function.args.length
args.each_with_index do |arg , index|
if arg.is_a?(IntegerConstant) or arg.is_a?(StringConstant)
function.args[index].load into , arg
else
function.args[index].move( into, arg ) if arg.register_symbol != function.args[index].register_symbol
end
end
end
def do_call into
RegisterMachine.instance.function_call into , self
end
end
end

48
lib/neumann/code.rb Normal file

@ -0,0 +1,48 @@
module Vm
# Base class for anything that we can assemble
# Derived classes include instructions and constants(data)
# The commonality abstracted here is the length and position
# and the ability to assemble itself into the stream(io)
# All code is position independent once assembled.
# But for jumps and calls two passes are necessary.
# The first setting the position, the second assembling
class Code
def class_for clazz
RegisterMachine.instance.class_for(clazz)
end
# set the position to zero, will have to reset later
def initialize
@position = 0
end
# the position in the stream. Think of it as an address if you want. The difference is small.
# Especially since we produce _only_ position independent code
# in other words, during assembly the position _must_ be resolved into a pc relative address
# and not used as is
def position
@position
end
# The containing class (assembler/function) call this to tell the instruction/data where it is in the
# stream. During assembly the position is then used to calculate pc relative addresses.
def link_at address , context
@position = address
end
# length for this code in bytes
def length
raise "Not implemented #{inspect}"
end
# we pass the io (usually string_io) in for the code to assemble itself.
def assemble(io)
raise "Not implemented #{self.inspect}"
end
end
end

16
lib/neumann/context.rb Normal file

@ -0,0 +1,16 @@
module Vm
#currently just holding the object_space in here so we can have global access
class Context
def initialize object_space
@object_space = object_space
@locals = {}
end
attr_reader :attributes ,:object_space
attr_accessor :current_class , :locals , :function
end
end

185
lib/neumann/function.rb Normal file

@ -0,0 +1,185 @@
require_relative "block"
require_relative "passes"
module Vm
# Functions are similar to Blocks. Where Blocks can be jumped to, Functions can be called.
# Functions also have arguments and a return. These are Value subclass instances, ie specify
# type (by class type) and register by instance
# They also have local variables. Args take up the first n regs, then locals the rest. No
# direct manipulating of registers (ie specifying the number) should be done.
# Code-wise Functions are made up from a list of Blocks, in a similar way blocks are made up of codes
# Four of the blocks have a special role:
# - entry/exit: are usually system specific
# - body: the logical start of the function
# - return: the logical end, where ALL blocks must end
# Blocks can be linked in two ways:
# -linear: flow continues from one to the next as they are sequential both logically and "physically"
# use the block set_next for this.
# This is "the straight line", there must be a continuous sequence from body to return
# Linear blocks may be created from an existing block with new_block
# - branched: You create new blocks using function.new_block which gets added "after" return
# These (eg if/while) blocks may themselves have linear blocks ,but the last of these
# MUST have an unconditional branch. And remember, all roads lead to return.
class Function < Code
def initialize(name , receiver = Vm::Reference , args = [] , return_type = Vm::Reference)
super()
@name = name.to_sym
if receiver.is_a?(Value)
@receiver = receiver
raise "arg in non std register #{receiver.inspect}" unless RegisterMachine.instance.receiver_register == receiver.register_symbol
else
puts receiver.inspect
@receiver = receiver.new(RegisterMachine.instance.receiver_register)
end
@args = Array.new(args.length)
args.each_with_index do |arg , i|
shouldda = RegisterReference.new(RegisterMachine.instance.receiver_register).next_reg_use(i + 1)
if arg.is_a?(Value)
@args[i] = arg
raise "arg #{i} in non std register #{arg.register}, expecting #{shouldda}" unless shouldda == arg.register
else
@args[i] = arg.new(shouldda)
end
end
set_return return_type
@exit = RegisterMachine.instance.function_exit( Vm::Block.new("exit" , self , nil) , name )
@return = Block.new("return", self , @exit)
@body = Block.new("body", self , @return)
@insert_at = @body
@entry = RegisterMachine.instance.function_entry( Vm::Block.new("entry" , self , @body) ,name )
@locals = []
end
attr_reader :args , :entry , :exit , :body , :name , :return_type , :receiver
def insertion_point
@insert_at
end
def set_return type_or_value
@return_type = type_or_value || Vm::Reference
if @return_type.is_a?(Value)
raise "return in non std register #{@return_type.inspect}" unless RegisterMachine.instance.return_register == @return_type.register_symbol
else
@return_type = @return_type.new(RegisterMachine.instance.return_register)
end
end
def arity
@args.length
end
def new_local type = Vm::Integer
register = args.length + 3 + @locals.length # three for the receiver, return and type regs
l = type.new(register) #so start at r3
#puts "new local #{l.register_symbol}"
raise "Register overflow in function #{name}" if register >= 13 # yep, 13 is bad luck
@locals << l
l
end
# return a list of registers that are still in use after the given block
# a call_site pushes and pops these to make them available for code after a call
def locals_at l_block
used =[]
# call assigns the return register, but as it is in l_block, it is not asked.
assigned = [ RegisterReference.new(Vm::RegisterMachine.instance.return_register) ]
l_block.reachable.each do |b|
b.uses.each {|u|
(used << u) unless assigned.include?(u)
}
assigned += b.assigns
end
used.uniq
end
# return a list of the blocks that are addressable, ie entry and @blocks and all next
def blocks
ret = []
b = @entry
while b
ret << b
b = b.next
end
ret
end
# when control structures create new blocks (with new_block), control continues at some new block that
# the control structure creates.
# Example: while, needs 2 extra blocks
# 1 condition code, must be its own block as we jump back to it
# - the body, can actually be after the condition as we don't need to jump there
# 2 after while block. Condition jumps here
# After block 2, the function is linear again and the calling code does not need to know what happened
# But subsequent statements are still using the original block (self) to add code to
# So the while expression creates the extra blocks, adds them and the code and then "moves" the insertion point along
def insert_at block
@insert_at = block
self
end
# create a new linear block after the current insertion block.
# Linear means there is no branch needed from that one to the new one.
# Usually the new one just serves as jump address for a control statement
# In code generation (assembly) , the new block is written after this one, ie zero runtime cost
# This does _not_ change the insertion point, that has to be done with insert_at(block)
def new_block new_name
block_name = "#{@insert_at.name}_#{new_name}"
new_b = Block.new( block_name , self , @insert_at.next )
@insert_at.set_next new_b
return new_b
end
def add_code(kode)
raise "alarm #{kode}" if kode.is_a? Word
raise "alarm #{kode.class} #{kode}" unless kode.is_a? Code
@insert_at.add_code kode
self
end
# sugar to create instructions easily.
# any method will be passed on to the RegisterMachine and the result added to the insertion block
# With this trick we can write what looks like assembler,
# Example func.instance_eval
# mov( r1 , r2 )
# add( r1 , r2 , 4)
# end
# mov and add will be called on Machine and generate Instructions that are then added
# to the current block
# also symbols are supported and wrapped as register usages (for bare metal programming)
def method_missing(meth, *args, &block)
add_code RegisterMachine.instance.send(meth , *args)
end
# following is the Code interface
# to link we link the entry and then any blocks. The entry links the straight line
def link_at address , context
super #just sets the position
@entry.link_at address , context
end
# position of the function is the position of the entry block
def position
@entry.position
end
# length of a function is the entry block length (includes the straight line behind it)
# plus any out of line blocks that have been added
def length
@entry.length
end
# assembling assembles the entry (straight line/ no branch line) + any additional branches
def assemble io
@entry.assemble(io)
end
end
end

206
lib/neumann/instruction.rb Normal file

@ -0,0 +1,206 @@
require_relative "code"
module Vm
# Because the idea of what one instruction does, does not always map one to one to real machine
# instructions, an instruction may link to another instruction thus creating an arbitrary list
# to get the job (the original instruction) done
# Admittedly it would be simpler just to create the (abstract) instructions and let the machine
# encode them into whatever is necessary, but this approach leaves more possibility to
# optimize the actual instruction stream (not just the crystal instruction stream). Makes sense?
# We have basic classes (literally) of instructions
# - Memory
# - Stack
# - Logic
# - Math
# - Control/Compare
# - Move
# - Call
# Instruction derives from Code, for the assembly api
class Instruction < Code
def initialize options
super() # Code#initialize sets the position, which was otherwise left unset
@attributes = options
end
attr_reader :attributes
def opcode
@attributes[:opcode]
end
#abstract, only should be called from derived
def to_s
atts = @attributes.dup
atts.delete(:opcode)
atts.delete(:update_status)
atts.delete(:condition_code) if atts[:condition_code] == :al
atts.empty? ? "" : ", #{atts}"
end
# returns an array of registers (RegisterReferences) that this instruction uses.
# ie for r1 = r2 + r3
# which in assembler is add r1 , r2 , r3
# it would return [r2,r3]
# for pushes the list may be longer, whereas for a jump empty
def uses
raise "abstract called for #{self.class}"
end
# returns an array of registers (RegisterReferences) that this instruction assigns to.
# ie for r1 = r2 + r3
# which in assembler is add r1 , r2 , r3
# it would return [r1]
# for most instructions this is one, but for comparisons and jumps it is 0 , and for pops it can be as many as 16
def assigns
raise "abstract called for #{self.class}"
end
# sugar: set_xxx(value) stores value (default 1) under @attributes[:xxx] and returns self,
# so setters can be chained. Anything else falls through to the default handling.
def method_missing name , *args , &block
return super unless (args.length <= 1) or block_given?
set , attribute = name.to_s.split("set_")
if set == ""
@attributes[attribute.to_sym] = args[0] || 1
return self
end
super
end
end
class StackInstruction < Instruction
def initialize first , options = {}
@first = first
super(options)
end
# when calling we place a dummy push/pop in the stream and calculate later what registers actually need saving
def set_registers regs
@first = regs.collect{ |r| r.symbol }
end
def is_push?
opcode == :push
end
def is_pop?
!is_push?
end
def uses
is_push? ? regs : []
end
def assigns
is_pop? ? regs : []
end
def regs
@first
end
def to_s
"#{opcode} [#{@first.collect {|f| f.to_asm}.join(',') }] #{super}"
end
end
class MemoryInstruction < Instruction
def initialize result , left , right = nil , options = {}
@result = result
@left = left
@right = right
super(options)
end
def uses
ret = [@left.register ]
ret << @right.register unless @right.nil?
ret
end
def assigns
[@result.register]
end
end
class LogicInstruction < Instruction
# result = left op right
#
# Logic instructions are your basic operator implementation. But unlike the (normal) code we write
# these Instructions must have a "place" to write their results. Ie when you write 4 + 5 in ruby
# the result is sort of up in the air, but with Instructions the result must be assigned
def initialize result , left , right , options = {}
@result = result
@left = left
@right = right.is_a?(Fixnum) ? IntegerConstant.new(right) : right
super(options)
end
attr_accessor :result , :left , :right
def uses
ret = []
ret << @left.register if @left and not @left.is_a? Constant
ret << @right.register if @right and not @right.is_a?(Constant)
ret
end
def assigns
[@result.register]
end
def to_s
"#{opcode} #{result.to_asm} , #{left.to_asm} , #{right.to_asm} #{super}"
end
end
class CompareInstruction < Instruction
def initialize left , right , options = {}
@left = left
@right = right.is_a?(Fixnum) ? IntegerConstant.new(right) : right
super(options)
end
def uses
ret = [@left.register ]
ret << @right.register unless @right.is_a? Constant
ret
end
def assigns
[]
end
def to_s
"#{opcode} #{@left.to_asm} , #{@right.to_asm} #{super}"
end
end
class MoveInstruction < Instruction
def initialize to , from , options = {}
@to = to
@from = from.is_a?(Fixnum) ? IntegerConstant.new(from) : from
raise "move must have from set #{inspect}" unless from
super(options)
end
attr_accessor :to , :from
def uses
@from.is_a?(Constant) ? [] : [@from.register]
end
def assigns
[@to.register]
end
def to_s
"#{opcode} #{@to.to_asm} , #{@from.to_asm} #{super}"
end
end
class CallInstruction < Instruction
def initialize first , options = {}
@first = first
super(options)
opcode = @attributes[:opcode].to_s
if opcode.length == 3 and opcode[0] == "b"
@attributes[:condition_code] = opcode[1,2].to_sym
@attributes[:opcode] = :b
end
if opcode.length == 6 and opcode[0] == "c"
@attributes[:condition_code] = opcode[4,2].to_sym
@attributes[:opcode] = :call
end
end
def uses
if opcode == :call
@first.args.collect {|arg| arg.register }
else
[]
end
end
def assigns
if opcode == :call
[RegisterReference.new(RegisterMachine.instance.return_register)]
else
[]
end
end
def to_s
"#{opcode} #{@first.to_asm} #{super}"
end
end
end

108
lib/neumann/passes.rb Normal file

@ -0,0 +1,108 @@
module Vm
# Passes, or BlockPasses, could have been procs that just get each block passed.
# Instead they are proper objects in case they want to save state.
# The idea is
# - reduce noise in the main code by having this code separately (aspect/concern style)
# - abstract the iteration
# - allow not yet written code to hook in
class RemoveStubs
def run block
block.codes.dup.each_with_index do |kode , index|
next unless kode.is_a? StackInstruction
if kode.regs.empty? # StackInstruction exposes its registers as regs
block.codes.delete(kode)
puts "deleted stack instruction in #{block.name}"
end
end
end
end
# Operators eg a + b , must assign their result somewhere and as such create temporary variables.
# but if code is c = a + b , the generated instructions would be more like tmp = a + b ; c = tmp
# So if there is a move instruction just after a logic instruction where the result of the logic
# instruction is moved straight away, we can undo that mess and remove one instruction.
class LogicMoveReduction
def run block
org = block.codes.dup
org.each_with_index do |kode , index|
n = org[index+1]
next if n.nil?
next unless kode.is_a? LogicInstruction
next unless n.is_a? MoveInstruction
# specific arm instructions, don't optimize as don't know what the extra mean
# small todo. This does not catch condition_code that are not :al
next if (n.attributes.length > 3) or (kode.attributes.length > 3)
if kode.result == n.from
puts "Logic reduction #{kode} removes #{n}"
kode.result = n.to
block.codes.delete(n)
end
end
end
end
# Sometimes there are double moves ie mov a, b and mov b , c . We reduce that to move a , c
# (but don't check if that improves register allocation. Yet ?)
class MoveMoveReduction
def run block
org = block.codes.dup
org.each_with_index do |kode , index|
n = org[index+1]
next if n.nil?
next unless kode.is_a? MoveInstruction
next unless n.is_a? MoveInstruction
# specific arm instructions, don't optimize as don't know what the extra mean
# small todo. This does not catch condition_code that are not :al
next if (n.attributes.length > 3) or (kode.attributes.length > 3)
if kode.to == n.from
puts "Move reduction #{kode}: removes #{n} "
kode.to = n.to
block.codes.delete(n)
end
end
end
end
#As the name says, remove no-ops. Currently mov x , x supported
class NoopReduction
def run block
block.codes.dup.each_with_index do |kode , index|
next unless kode.is_a? MoveInstruction
# specific arm instructions, don't optimize as don't know what the extra mean
# small todo. This does not catch condition_code that are not :al
next if (kode.attributes.length > 3)
if kode.to == kode.from
block.codes.delete(kode)
puts "deleted noop move in #{block.name} #{kode}"
end
end
end
end
# We insert push/pops as dummies to fill them later in CallSaving
# as we can not know ahead of time which locals will be live in the code to come
# and also we don't want to "guess" later where the push/pops should be
# Here we check which registers need saving and add them
# Or sometimes just remove the push/pops, when no locals needed saving
class SaveLocals
def run block
push = block.call_block?
return unless push
return unless block.function
locals = block.function.locals_at block
pop = block.next.codes.first
if(locals.empty?)
#puts "Empty #{block.name}"
block.codes.delete(push)
block.next.codes.delete(pop)
else
#puts "PUSH #{push}"
push.set_registers(locals)
#puts "POP #{pop}"
pop.set_registers(locals)
end
end
end
end

64
lib/neumann/plock.rb Normal file

@ -0,0 +1,64 @@
module Vm
#Plock (Proc-Block) is mostly a Block but also somewhat Proc-ish: A Block that carries data.
#
# Data in a Block is useful in the same way data in objects is. Plocks being otherwise just code.
#
# But the concept is not quite straightforward: If one thinks of a Plock embedded in a normal function,
# the data in the Plock would be static data. In OO terms this comes quite close to a Proc, if the data is the local
# variables. Quite possibly they shall be used to implement procs, but that is not the direction now.
#
# For now we use Plocks behind the scenes as it were. In the code that you never see, method invocation mainly.
#
# In terms of implementation the Plock is a Block with data (Not too much data, mainly a couple of references).
# The block writes its instructions as normal, but a jump is inserted as the last instruction. The jump is to the
# next block, over the data that is inserted after the block code (and so before the next)
#
# It follows that Plocks should be linear blocks.
class Plock < Block
def initialize(name , function , next_block )
super
@data = []
@branch_code = RegisterMachine.instance.b next_block
end
def set_next next_b
super
# branch to the new next block (the parameter is next_b, next_block is not defined here)
@branch_code = RegisterMachine.instance.b next_b
end
# Data gets assembled after functions
def add_data o
return if @data.include? o
raise "must be derived from Code #{o.inspect}" unless o.is_a? Vm::Code
@data << o # TODO check type , no basic values allowed (must be wrapped)
end
# Code interface follows. Note position is inherited as is from Code
# length of the Plock is the length of the block, plus the branch, plus data.
def length
len = @data.inject(super) {| sum , item | sum + item.length}
len + @branch_code.length
end
# again, super + branch plus data
def link_at pos , context
super(pos , context)
@branch_code.link_at pos , context
@data.each do |code|
code.link_at(pos , context)
pos += code.length
end
end
# again, super + branch plus data
def assemble(io)
super
@branch_code.assemble(io)
@data.each do |obj|
obj.assemble io
end
end
end
end


@ -0,0 +1,143 @@
module Vm
# Our virtual c-machine has a number of registers of a given size and uses a stack
# So much so standard
# But our machine is oo, meaning that the register contents is typed.
# Of course current hardware does not have that (a perceived issue), but for our machine we pretend.
# So internally we have at least 8 word registers, one of which is used to keep track of types*
# and any number of scratch registers
# but externally it's all Values (see there)
# * Note that register content is typed externally. Not as in mri, where ints are tagged. Floats can't
# be tagged and lambda should be its own type, so tagging does not work
# A Machine's main responsibility in the framework is to instantiate Instructions
# Value functions are mapped to machines by concatenating the value's class name + the method name
# Example: IntegerValue.plus( value ) -> Machine.signed_plus (value )
# Also, shortcuts are created to easily instantiate Instruction objects. The "standard" set of instructions
# (arm-influenced) provides for normal operations on a register machine,
# Example: pop -> StackInstruction.new( {:opcode => :pop}.merge(options) )
# Instructions work with options, so you can pass anything in, and the only thing the functions does
# is save you typing the clazz.new. It passes the function name as the :opcode
class RegisterMachine
# hmm, not pretty but for now
@@instance = nil
attr_reader :registers
attr_reader :scratch
attr_reader :pc
attr_reader :stack
# is often a pseudo register (ie doesn't support move or other operations).
# Still, using it to express tests makes sense, not just for
# consistency in this code, but also because that is what is actually done
attr_reader :status
# conditions specify all the possibilities for branches. Branches are b + condition
# Example: beq means branch if equal.
# :al means always, so bal is an unconditional branch (but b() also works)
CONDITIONS = [ :al , :eq , :ne , :lt , :le, :ge, :gt , :cs , :mi , :hi , :cc , :pl, :ls , :vc , :vs ]
# here we create the shortcuts for the "standard" instructions, see above
# Derived machines may use own instructions and define functions for them if so desired
def initialize
[:push, :pop].each do |inst|
define_instruction_one(inst , StackInstruction)
end
[:adc, :add, :and, :bic, :eor, :orr, :rsb, :rsc, :sbc, :sub].each do |inst|
define_instruction_three(inst , LogicInstruction)
end
[:mov, :mvn].each do |inst|
define_instruction_two(inst , MoveInstruction)
end
[:cmn, :cmp, :teq, :tst].each do |inst|
define_instruction_two(inst , CompareInstruction)
end
[:strb, :str , :ldrb, :ldr].each do |inst|
define_instruction_three(inst , MemoryInstruction)
end
[:b, :call , :swi].each do |inst|
define_instruction_one(inst , CallInstruction)
end
# create all possible branch instructions, but the CallInstruction demangles the
# code, and has opcode set to :b and :condition_code set to the condition
CONDITIONS.each do |suffix|
define_instruction_one("b#{suffix}".to_sym , CallInstruction)
define_instruction_one("call#{suffix}".to_sym , CallInstruction)
end
end
def create_method(name, &block)
self.class.send(:define_method, name , &block)
end
def self.instance
@@instance
end
def self.instance= machine
@@instance = machine
end
def class_for clazz
my_module = self.class.name.split("::").first
clazz_name = clazz.name.split("::").last
# my_module is a String, so compare against the module's name, not the Vm constant
if(my_module != "Vm" )
module_class = eval("#{my_module}::#{clazz_name}") rescue nil
clazz = module_class if module_class
end
clazz
end
private
#defining the instruction (opcode, symbol) as a given class.
# the class is a Vm::Instruction derived base class and to create machine specific function
# an actual machine must create derived classes (from this base class)
# These instruction classes must follow a naming pattern and take a hash in the constructor
# Example, a mov() opcode instantiates a Vm::MoveInstruction
# for an Arm machine, a class Arm::MoveInstruction < Vm::MoveInstruction exists, and it will
# be used to define the mov on an arm machine.
# This method picks up that derived class and calls a define_instruction method that can
# be overridden in subclasses
def define_instruction_one(inst , clazz , defaults = {} )
clazz = self.class_for(clazz)
create_method(inst) do |first , options = nil|
# merge returns a new hash, so keep the result; defaults first, so passed options win
options = defaults.merge(options || {})
options[:opcode] = inst
first = Vm::Integer.new(first) if first.is_a? Symbol
clazz.new(first , options)
end
end
# same for two args (left right, from to etc)
def define_instruction_two(inst , clazz , defaults = {} )
clazz = self.class_for(clazz)
create_method(inst) do |left ,right , options = nil|
# merge returns a new hash, so keep the result; defaults first, so passed options win
options = defaults.merge(options || {})
left = Vm::Integer.new(left) if left.is_a? Symbol
right = Vm::Integer.new(right) if right.is_a? Symbol
options[:opcode] = inst
clazz.new(left , right ,options)
end
end
# same for three args (result = left right,)
def define_instruction_three(inst , clazz , defaults = {} )
clazz = self.class_for(clazz)
create_method(inst) do |result , left ,right = nil , options = nil|
# merge returns a new hash, so keep the result; defaults first, so passed options win
options = defaults.merge(options || {})
options[:opcode] = inst
result = Vm::Integer.new(result) if result.is_a? Symbol
left = Vm::Integer.new(left) if left.is_a? Symbol
right = Vm::Integer.new(right) if right.is_a? Symbol
clazz.new(result, left , right ,options)
end
end
end
end


@ -0,0 +1,33 @@
module Vm
# RegisterReference is not the name for a register, "only" for a certain use of it.
# In a way it is like a variable name, a storage location. The location is a register of course,
# but which register can be changed, and _all_ instructions sharing the RegisterReference then use that register
# In other words a simple level of indirection, or change from value to reference semantics.
class RegisterReference
attr_accessor :symbol
def initialize r
if( r.is_a? Fixnum)
r = "r#{r}".to_sym
end
raise "wrong type for register init #{r}" unless r.is_a? Symbol
raise "double r #{r}" if r == :rr1
@symbol = r
end
def == other
return false if other.nil?
return false if other.class != RegisterReference
symbol == other.symbol
end
#helper method to calculate with register symbols
def next_reg_use by = 1
int = @symbol[1,3].to_i
sym = "r#{int + by}".to_sym
RegisterReference.new( sym )
end
end
end

52
lib/neumann/values.rb Normal file

@ -0,0 +1,52 @@
require_relative "code"
require_relative "register_reference"
module Vm
# Values represent the information as it is processed. Different subclasses for different types,
# each type with different operations.
# The operations on values are what makes a machine do things. Operations are captured as
# subclasses of Instruction and saved to Blocks
# Values are a way to reason about (create/validate) instructions.
# Word Values are what fits in a register. Derived classes
# Float, Reference , Integer(s) must fit the same registers
# just a base class for data. not sure how this will be useful (may just have read too much llvm)
class Value
def class_for clazz
RegisterMachine.instance.class_for(clazz)
end
end
# Just a nice way to write branches
# Comparisons produce them, and branches take them as argument.
class BranchCondition < Value
def initialize operator
@operator = operator
end
attr_accessor :operator
#needed to check the opposite, ie not true
def not_operator
case @operator
when :le
:gt
when :gt
:le
when :lt
:ge
when :ge
:lt
when :eq
:ne
when :ne
:eq
else
raise "not implemented #{@operator}"
end
end
end
end
require_relative "values/constants"
require_relative "values/word"
require_relative "values/integer"
require_relative "values/reference"
require_relative "values/mystery"