move compiler to bosl and get first test working (adjusting syntax as i go)
85
lib/bosl/compiler/README.md
Normal file
@ -0,0 +1,85 @@
### Compiling

The AST (abstract syntax tree) is created by the [salama-reader](https://github.com/salama/salama-reader)
gem and the classes defined there.

The code in this directory compiles the AST to the virtual machine code and the Parfait object structure.

If this were an interpreter, we would just walk the tree and do what it says.
Since it is not, things are a little more difficult, especially with regard to time.

When compiling we deal with two times: compile-time and run-time.
All the headache comes from mixing those two up.*

Similarly, the result of compiling is two-fold: a static and a dynamic part.

- the static part consists of objects like the constants, but also the defined classes and their methods
- the dynamic part is the code, which is stored as streams of instructions in the MethodSource
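
As a rough illustration of this two-fold result, the sketch below compiles a string constant: the value is recorded in a constants list (the static part) and a set instruction is appended to the code stream (the dynamic part). `ToyMethodSource`, `ToySet` and `compile_string_sketch` are hypothetical stand-ins invented for this example; they are not the real `Virtual::MethodSource` or `Virtual::Set`.

```ruby
# Hypothetical stand-ins, for illustration only
ToySet = Struct.new(:from, :to)

class ToyMethodSource
  attr_reader :constants, :instructions

  def initialize
    @constants = []      # static part: objects known at compile-time
    @instructions = []   # dynamic part: code that executes at run-time
  end

  def add_code(instruction)
    @instructions << instruction
  end
end

# Compile a string constant. Nothing is executed here, we only record
# what exists (static) and what should happen later (dynamic).
def compile_string_sketch(string, source)
  value = string.to_sym
  source.constants << value
  source.add_code ToySet.new(value, :return_slot)
  :return_slot
end

source = ToyMethodSource.new
compile_string_sketch("hello", source)
p source.constants      # the static part
p source.instructions   # the dynamic part
```

Keeping the two lists apart mirrors the compile-time versus run-time split: the constants exist once compilation is done, while the instructions only take effect when the method runs.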

To make things a little simpler, we create a very high-level instruction stream at first and then
run transformation and optimization passes on the stream to improve it.

Each AST class gets a compile method that does the compilation.
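
The files in this commit define module functions named `compile_int`, `compile_call` and so on, all reached through `Compiler.compile`. The dispatcher itself is not shown here, so the following is only a guess at how such a dispatch could look. `ToyNode` and `ToyCompiler` are hypothetical names, and a plain array stands in for the method being compiled.

```ruby
# Hypothetical AST node: a type tag plus children
ToyNode = Struct.new(:type, :children)

module ToyCompiler
  # Dispatch on the node type to a compile_<type> function
  def self.compile(node, method)
    name = "compile_#{node.type}"
    raise "no compiler for #{node.type}" unless respond_to?(name)
    send(name, node, method)
  end

  def self.compile_int(node, method)
    method << [:set, node.children.first]
    :return_slot
  end

  def self.compile_expressions(node, method)
    node.children.collect { |part| compile(part, method) }
  end
end

code = []   # stands in for the method being compiled
list = ToyNode.new(:expressions, [ToyNode.new(:int, [1]), ToyNode.new(:int, [2])])
result = ToyCompiler.compile(list, code)
p code
p result
```

With this shape, supporting a new node type only means adding another `compile_<type>` function; the dispatcher does not change.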

#### MethodSource and Instructions

The first argument to the compile method is the MethodSource.
All code is encoded as a stream of Instructions in the MethodSource.
Instructions are stored as a list of Blocks, and Blocks are the smallest unit of code,
which is always linear.

Code is added to the method (using add_code), rather than working with the actual instructions.
This is so each compiling method can just do its bit and be unaware of the larger structure
that is being created.
The general structure of the instructions is a graph
(with ifs and whiles and breaks and whatnot), but we build it to have one start and *one* end (return).
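
To make the block idea more concrete, here is a minimal sketch of a source that keeps a list of blocks and a "current" block that `add_code` appends to. The method names `add_code`, `new_block` and `current` follow the calls used in the compiler files of this commit, but `ToySource` and `ToyBlock` are hypothetical simplifications and the real behaviour may differ.

```ruby
ToyBlock = Struct.new(:name, :codes)

class ToySource
  attr_reader :blocks

  def initialize
    @current = ToyBlock.new("enter", [])
    # one start and one end, everything else goes in between
    @blocks = [@current, ToyBlock.new("return", [])]
  end

  # compile functions only ever append to whatever block is current
  def add_code(instruction)
    @current.codes << instruction
  end

  # insert a new, empty block directly after the current one
  def new_block(name)
    block = ToyBlock.new(name, [])
    position = @blocks.index { |b| b.equal?(@current) }
    @blocks.insert(position + 1, block)
    block
  end

  # make the given block the target of subsequent add_code calls
  def current(block)
    @current = block
    self
  end
end

source = ToySource.new
merge = source.new_block("merge")
body  = source.new_block("body")   # lands between "enter" and "merge"
source.add_code :test_condition
source.current body
source.add_code :do_work
source.current merge
source.add_code :carry_on
p source.blocks.map(&:name)
```

Because new blocks are always inserted after the current one, a block created first ends up last, which is why the if and while compilers create their merge block before the others.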


#### Messages and frames

Since the machine is virtual, we have to define it, and since it is OO we define it in objects.

It is also important to define how instructions operate, which in a physical machine would
be by changing the contents of registers or some stack.

Our machine is not a register machine, but an object machine: it operates directly on objects and
also has no separate stack, only objects. There are a number of objects which are accessible,
and one can think of these (their addresses) as register contents.
(And one wouldn't be far off, as that is the implementation.)

The objects the machine works on are:

- Message
- Frame
- Self
- NewMessage

"Working on" means that these are the only objects which the machine accesses,
i.e. all others would have to be moved first.

When a Method needs to make a call, or send a Message, it creates a NewMessage object.
Messages contain return addresses and arguments.

Then the machine must find the method to call.
This is a function of the virtual machine and is implemented in ruby.

Then a new Method receives the Message, creates a Frame for local and temporary variables
and continues execution.
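
The three steps above (create a new message, find the method, create a frame and continue) can be sketched with plain Ruby objects. Everything here is hypothetical: `ToyMessage`, `ToyFrame` and `ToyMachine` only mimic the roles described, and the lookup is a simple hash rather than whatever the virtual machine actually does.

```ruby
# Hypothetical, simplified run-time objects (plain objects, no stack)
ToyMessage = Struct.new(:receiver, :name, :arguments, :caller, :return_value, :frame)
ToyFrame   = Struct.new(:locals)

class ToyMachine
  def initialize
    @methods = {}   # [class, name] => callable body
  end

  def define(klass, name, &body)
    @methods[[klass, name]] = body
  end

  # 1. the caller has already filled in the new message
  # 2. the machine finds the method for the receiver
  # 3. the callee gets a fresh frame for its locals and runs
  def send_message(message)
    body = @methods.fetch([message.receiver.class, message.name])
    message.frame = ToyFrame.new({})
    message.return_value = body.call(message)
  end
end

machine = ToyMachine.new
machine.define(String, :join_with) do |message|
  message.frame.locals[:joined] = message.receiver + message.arguments.first
  message.frame.locals[:joined]
end

new_message = ToyMessage.new("foo", :join_with, ["bar"], nil)
machine.send_message(new_message)
p new_message.return_value
```

Note that the message and the frame are ordinary objects that stay reachable after the call, which is the point the text makes about not having a separate stack.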

The important thing here is that Messages and Frames are normal objects.

And interestingly we can partly use ruby to find the method, so in a way it is not just a top-down
transformation. Instead the sending goes back up and then down again.

The Message object is the second parameter to the compile method, the run-time part as it were.
Why pass it, given that it only exists at run-time? To make compile-time analysis possible
(it is after all the Virtual version, not Parfait, i.e. compile-time, not run-time),
especially for those times when we can resolve the method at compile time.
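
One way to picture compile-time resolution: if the compiler already knows the receiver's type and a matching method, it can emit a direct call; otherwise it emits a send and leaves the lookup to run-time. The sketch below is an assumption about the idea, not the project's code. `ToySlot`, `KNOWN_METHODS` and `compile_send_sketch` are hypothetical names.

```ruby
# Hypothetical: what the compiler knows about a value while compiling
ToySlot = Struct.new(:type)

# Hypothetical table of methods the compiler already knows about
KNOWN_METHODS = { [:Integer, :plus] => :integer_plus }

def compile_send_sketch(receiver, name, code)
  target = KNOWN_METHODS[[receiver.type, name]]
  if target
    code << [:call, target]   # resolved now, at compile-time
  else
    code << [:send, name]     # lookup deferred to run-time
  end
  code
end

code = []
compile_send_sketch(ToySlot.new(:Integer), :plus, code)
compile_send_sketch(ToySlot.new(:Unknown), :plus, code)
p code
```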


*
As ruby is a dynamic language, it also compiles at run-time. This line of thought does not help
though, as it sort of mixes the separate times up, even though they are not actually mixed.
Even in a running ruby program the stages of compile and run are separate.
Similarly it does not help to argue that the code is static too, not dynamic,
as that leaves us with a worse working model.
80
lib/bosl/compiler/basic_expressions.rb
Normal file
@ -0,0 +1,80 @@
module Bosl
  # collection of the simple ones, int and strings and such

  module Compiler

    # Constant expressions can by definition be evaluated at compile time.
    # But that does not solve their storage, ie they need to be accessible at runtime from _somewhere_
    # So we view ConstantExpressions like functions that return the value of the constant.
    # In other words, their storage is the return slot as it would be for a method

    # The current approach moves the constant into a variable before using it
    # But in the future (in the one that holds great things) we optimize those unnecessary moves away

    # attr_reader :value
    def self.compile_int expression , method
      # a bare splat (int = *expression) would assign an Array, so take the first child
      int = expression.to_a.first
      to = Virtual::Return.new(Integer , int)
      method.source.add_code Virtual::Set.new( int , to )
      to
    end

    def self.compile_true expression , method
      to = Virtual::Return.new(Reference , true )
      method.source.add_code Virtual::Set.new( true , to )
      to
    end

    def self.compile_false expression , method
      to = Virtual::Return.new(Reference , false)
      method.source.add_code Virtual::Set.new( false , to )
      to
    end

    def self.compile_nil expression , method
      to = Virtual::Return.new(Reference , nil)
      method.source.add_code Virtual::Set.new( nil , to )
      to
    end

    def self.compile_modulename expression , method
      clazz = Parfait::Space.object_space.get_class_by_name expression.name
      # report the name that failed to resolve (clazz itself is nil at this point)
      raise "compile_modulename #{expression.name}" unless clazz
      to = Virtual::Return.new(Reference , clazz )
      method.source.add_code Virtual::Set.new( clazz , to )
      to
    end

    # attr_reader :string
    def self.compile_string expression , method
      # Clearly a TODO here to implement strings rather than reusing symbols
      value = expression.string.to_sym
      to = Virtual::Return.new(Reference , value)
      method.source.constants << value
      method.source.add_code Virtual::Set.new( value , to )
      to
    end

    #attr_reader :left, :right
    def self.compile_assignment expression , method
      unless expression.left.instance_of? Ast::NameExpression
        raise "must assign to NameExpression , not #{expression.left}"
      end
      r = Compiler.compile(expression.right , method )
      raise "oh noo, nil from where #{expression.right.inspect}" unless r
      index = method.has_arg(expression.left.name.to_sym)
      if index
        method.source.add_code Virtual::Set.new(ArgSlot.new(index , r.type , r ) , Virtual::Return.new)
      else
        index = method.ensure_local(expression.left.name.to_sym)
        method.source.add_code Virtual::Set.new(FrameSlot.new(index , r.type , r ) , Virtual::Return.new)
      end
      r
    end

    def self.compile_variable expression, method
      method.source.add_code InstanceGet.new(expression.name)
      Virtual::Return.new( Unknown )
    end
  end
end
36
lib/bosl/compiler/callsite_expression.rb
Normal file
@ -0,0 +1,36 @@
module Bosl
  module Compiler
    # operators are really function calls

    # call_site - attr_reader :name, :args , :receiver

    def self.compile_call expression , method
      name , arguments , receiver = *expression
      name = name.to_a.first

      me = Compiler.compile( receiver.to_a.first , method )

      ## need two step process, compile and save to frame
      # then move from frame to new message
      method.source.add_code Virtual::NewMessage.new
      method.source.add_code Virtual::Set.new( me , Virtual::NewSelf.new(me.type))
      method.source.add_code Virtual::Set.new( name.to_sym , Virtual::NewMessageName.new())
      compiled_args = []
      arguments.to_a.each_with_index do |arg , i|
        #compile in the running method, ie before passing control
        val = Compiler.compile( arg , method)
        # move the compiled value to its slot in the new message
        # + 1 as this is a ruby 0-start , but 0 is the last message ivar.
        # so the next free is +1
        to = Virtual::NewArgSlot.new(i + 1 ,val.type , val)
        # (doing this immediately, not after the loop, so if it's a return it won't get overwritten)
        method.source.add_code Virtual::Set.new( val , to )
        compiled_args << to
      end
      method.source.add_code Virtual::MessageSend.new(name , me , compiled_args) #and pass control
      # the effect of the method is that the NewMessage Return slot will be filled, return it
      # (this is what is moved _inside_ above loop for such expressions that are calls (or constants))
      Virtual::Return.new( method.source.return_type )
    end
  end
end
16
lib/bosl/compiler/compound_expressions.rb
Normal file
@ -0,0 +1,16 @@
module Bosl
  module Compiler

    # attr_reader :values
    def self.compile_array expression, context
      # to.do
    end
    # attr_reader :key , :value
    def self.compile_association expression, context
      # to.do
    end
    def self.compile_hash expression, context
      # to.do
    end
  end
end
10
lib/bosl/compiler/expression_list.rb
Normal file
@ -0,0 +1,10 @@
module Bosl
  module Compiler
    # list - attr_reader :expressions
    def self.compile_expressions expression , method
      expression.children.collect do |part|
        Compiler.compile( part , method )
      end
    end
  end
end
37
lib/bosl/compiler/function_expression.rb
Normal file
@ -0,0 +1,37 @@
module Bosl
  module Compiler
    # function attr_reader :name, :params, :body , :receiver
    def self.compile_function expression, method
      return_type , name , parameters, kids = *expression
      name = name.to_a.first
      args = parameters.to_a.collect do |p|
        raise "error, argument must be an identifier, not #{p}" unless p.type == :field
        p[2]
      end

      if expression[:receiver]
        # compiler will always return slot. with known value or not
        r = Compiler.compile(expression.receiver, method )
        if( r.value.is_a? Parfait::Class )
          class_name = r.value.name
        else
          raise "unimplemented case in function #{r}"
        end
      else
        r = Virtual::Self.new()
        class_name = method.for_class.name
      end
      new_method = Virtual::MethodSource.create_method(class_name, name , args )
      new_method.source.receiver = r
      new_method.for_class.add_instance_method new_method

      #frame = frame.new_frame
      kids.to_a.each do |ex|
        return_type = Compiler.compile(ex,new_method )
        raise return_type.inspect if return_type.is_a? Virtual::Instruction
      end
      new_method.source.return_type = return_type
      Virtual::Return.new(return_type)
    end
  end
end
43
lib/bosl/compiler/if_expression.rb
Normal file
@ -0,0 +1,43 @@
module Bosl
  module Compiler
    # if - attr_reader :cond, :if_true, :if_false

    def self.compile_if expression , method
      # to execute the logic as the if states it, the blocks are the other way around
      # so we can then jump over the else if true ,
      # and the else joins unconditionally after the true_block
      merge_block = method.source.new_block "if_merge" # last one, created first
      true_block = method.source.new_block "if_true" # second, linked in after current, before merge
      false_block = method.source.new_block "if_false" # directly next in order, ie if we don't jump we land here


      is = Compiler.compile(expression.cond, method )
      # TODO should/will use different branches for different conditions.
      # just a sketch : cond_val = cond_val.is_true?(method) unless cond_val.is_a? BranchCondition
      method.source.add_code IsTrueBranch.new( true_block )

      # compile the true block (as we think of it first, even though it is second in sequential order)
      method.source.current true_block
      last = is
      expression.if_true.each do |part|
        last = Compiler.compile(part,method )
        raise part.inspect if last.nil?
      end

      # compile the false block
      method.source.current false_block
      expression.if_false.each do |part|
        #puts "compiling in if false #{part}"
        last = Compiler.compile(part,method )
        raise part.inspect if last.nil?
      end
      method.source.add_code UnconditionalBranch.new( merge_block )

      #puts "compiled if: end"
      method.source.current merge_block

      #TODO should return the union of the true and false types
      last
    end
  end
end
25
lib/bosl/compiler/module_expression.rb
Normal file
@ -0,0 +1,25 @@
module Bosl
  module Compiler
    # module attr_reader :name ,:expressions
    def self.compile_module expression , context
      # TODO: clazz is never assigned in this method, so this line raises a NameError when reached
      return clazz
    end

    def self.compile_class expression , method
      clazz = Parfait::Space.object_space.get_class_by_name! expression.name
      #puts "Compiling class #{clazz.name.inspect}"
      expression_value = nil
      expression.expressions.each do |expr|
        # check if it's a function definition and add
        # if not, execute it, but that does mean we should be in salama (executable), not ruby.
        # ie throw an error for now
        raise "only functions for now #{expr.inspect}" unless expr.is_a? Ast::FunctionExpression
        #puts "compiling expression #{expression}"
        expression_value = Compiler.compile(expr,method )
        #puts "compiled expression #{expression_value.inspect}"
      end

      return expression_value
    end
  end
end
23
lib/bosl/compiler/name_expression.rb
Normal file
@ -0,0 +1,23 @@
module Bosl
  module Compiler

    # attr_reader :name
    # compiling name needs to check if it's a variable and if so resolve it
    # otherwise it's a method without args and a send is issued.
    # whichever way this goes the result is stored in the return slot (as all compiles)
    def self.compile_name expression , method
      name = expression.to_a.first
      return Virtual::Self.new( Reference.new(method.for_class)) if name == :self
      # either an argument, so it's stored in message
      ret = Virtual::Return.new
      if( index = method.has_arg(name))
        method.source.add_code Virtual::Set.new( Virtual::ArgSlot.new(index ) , ret)
      else # or a local so it is in the frame
        index = method.ensure_local( name )
        method.source.add_code Virtual::Set.new(Virtual::FrameSlot.new(index ) , ret )
      end
      return ret
    end

  end #module
end
20
lib/bosl/compiler/operator_expressions.rb
Normal file
@ -0,0 +1,20 @@
module Bosl
  module Compiler
    # operator attr_reader :operator, :left, :right
    def self.compile_operator expression, method
      call = Ast::CallSiteExpression.new(expression.operator , [expression.right] , expression.left )
      Compiler.compile(call, method)
    end

    def self.compile_assign expression, method
      puts "assign"
      puts expression.inspect
      name , value = *expression
      name = name.to_a.first
      v = self.compile(value , method )
      index = method.ensure_local( name )
      method.source.add_code Virtual::Set.new(Virtual::FrameSlot.new(index ) , v )
    end

  end
end
9
lib/bosl/compiler/return_expression.rb
Normal file
@ -0,0 +1,9 @@
module Bosl
  module Compiler

    # return attr_reader :expression
    def self.compile_return expression, method
      return Compiler.compile(expression.to_a.first , method)
    end
  end
end
29
lib/bosl/compiler/while_expression.rb
Normal file
@ -0,0 +1,29 @@
module Bosl
  module Compiler

    # while- attr_reader :condition, :body
    def self.compile_while expression, method
      # this is where the while ends and both branches meet
      merge = method.source.new_block("while merge")
      # this comes after the current and before the merge
      start = method.source.new_block("while_start" )
      method.source.current start

      cond = Compiler.compile(expression.condition, method)

      method.source.add_code IsTrueBranch.new(merge)

      last = cond
      expression.body.each do |part|
        last = Compiler.compile(part , method)
        raise part.inspect if last.nil?
      end
      # unconditionally branch to the start
      method.source.add_code UnconditionalBranch.new(start)

      # continue execution / compiling at the merge block
      method.source.current merge
      last
    end
  end
end