ruby-x.github.io/2015-06-20-closing-the-chasm.html.haml

%hr/
%p
  layout: site
  author: Torsten
  —
%p
  As noted in previous posts, differentiating between compile-time and run-time is one of the more
  difficult things in doing the vm. That is because the computing that needs to happen is so similar;
  in other words, almost everything at the vm level is available at run-time too.
%p But of course we try to do as much as possible at compile-time.
%p
  One hears or reads that exactly this topic causes problems for other vms too:
  specifically, how one ensures that what is compiled at compile-time and what is compiled at
  run-time are identical, or at least compatible.
%h2#inlining Inlining
%p
  The obvious answer seems to me to
  = succeed "." do
    %strong use the same code
%p
  Let’s take a simple example of accessing an instance variable. This is of course available at
  run-time through the function
  %em instance_variable_get
  , which could go something like:
%pre
  %code
    :preserve
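      def instance_variable_get(name)
        # look up the variable's position in the object's layout
        index = @layout.index(name)
        return nil unless index
        # fetch the value stored at that position
        at_index(index)
      end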
%p
  Let’s assume the
  %em builtin
  at_index function and take the layout to be an array-like structure.
  As noted in previous posts, when this is compiled we get a Method with Blocks, and exactly one
  Block will initiate the return. The previous post detailed how at that time the return value will
  be in the ReturnSlot.
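%p
  To make the example runnable outside the vm, here is a minimal sketch in plain Ruby. The class
  SketchObject and its values array are assumptions made up for illustration (they are not part of
  the vm), and at_index is a stand-in for the builtin.

```ruby
# Hypothetical stand-in for a vm object: names live in a layout,
# values live in the object itself at the same index.
class SketchObject
  def initialize(layout, values)
    @layout = layout   # array-like list of instance variable names
    @values = values   # values stored at the matching indexes
  end

  # Stand-in for the builtin at_index (an assumption, not the vm's code).
  def at_index(index)
    @values[index]
  end

  def instance_variable_get(name)
    index = @layout.index(name)
    return nil unless index
    at_index(index)
  end
end

point = SketchObject.new([:x, :y], [3, 4])
point.instance_variable_get(:y)   # the value stored for :y
point.instance_variable_get(:z)   # nil, because :z is not in the layout
```

%p
  The point of the sketch is only that the lookup is two small steps, an index search and an
  indexed read, which is what makes it a plausible candidate for inlining.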
%p
  So then we get to the idea of how: we “just” need to take the blocks from the method and paste
  them where the instance variable is accessed. The code that follows will pick the value from the
  ReturnSlot as it would any other value and continue.
%p
  The only glitch in this plan is that the code will assume a new message and frame. But if we just
  paste it, it will use the message/frame/self of the enclosing method. So that is where the work is:
  translating slots from the inner, inlined function to the outer one, possibly creating new frame
  entries.
%h2#inlining-what Inlining what
%p
  But let’s take a step back from the mechanics and look at what it is we need to inline. The example
  above seems to suggest we inline code. Code, as in text, is of course impossible to inline.
  That’s because we have no information about it, and so the argument passing and returning can’t
  possibly work. Quite apart from the tricky possibility of shadowed variables, i.e. the inlined code
  assigning to variables of the outside function.
%p
  Ok, so then we just take our parsed code, the abstract syntax tree. There we have all the
  information we need to do the magic, at least it looks like that.
  But, we may not have the ast!
%p
  The idea is to be able to make the step to a language-independent system. Hence the sof (salama
  object file), even though it has no reader yet. The idea is that we store object files of any
  language in sof and the vm reads those.
%p
  To do that we need to inline at the vm instruction level. Which in turn means that we will need
  to retain enough information at that level to be able to do that. What that entails in detail
  is unclear at the moment, but it gives a good direction.
%h2#a-rough-plan A rough plan
%p
  First, a recap of function calling at the instruction level. By the way, it should be clear that
  we cannot inline method sends, as we don’t know which function is being called. But of course the
  actual send method may be inlined, and that is in fact part of the aim.
%p
  To call a function, a NewMessage is created and loaded with args and stuff, then the FunctionCall
  is issued. Upon entry, a new frame may be created for local and temporary variables, and at the
  end the function returns. When it returns, the return value will be in the Return slot, and the
  calling method will grab it if interested and swap the Message back to what it was before the call.
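%p
  The sequence can be sketched in a few lines of plain Ruby. This is only an illustration under
  assumptions: the Message struct, its fields and TinyMachine are hypothetical names invented
  here, and the real vm expresses each step as separate instructions rather than one method.

```ruby
# Hypothetical miniature of the calling convention; not the real vm classes.
Message = Struct.new(:previous, :args, :return_slot, :frame)

class TinyMachine
  attr_reader :message

  def initialize
    @message = Message.new(nil, [], nil, nil)
  end

  # NewMessage, argument loading, FunctionCall and return handling in one place.
  def call(function, *args)
    new_message = Message.new(@message, args, nil, nil)  # NewMessage, args loaded
    @message = new_message                               # FunctionCall: enter the callee
    @message.frame = {}                                  # frame for locals and temporaries
    @message.return_slot = function.call(@message)       # callee leaves its value in the return slot
    value = @message.return_slot                         # caller grabs the value
    @message = new_message.previous                      # swap the Message back
    value
  end
end

machine = TinyMachine.new
add = ->(message) { message.args[0] + message.args[1] }
outer = machine.message
result = machine.call(add, 2, 3)
```

%p
  Everything between creating the new message and swapping it back is overhead that inlining
  would avoid for small functions.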
%p
  From that (and at that level) it becomes clearer what needs to be done, and it starts with the
  caller, of course. In the caller there needs to be a way to decide whether to
  inline or not. For the run-time stuff we need an “always inline” list, later a complexity
  analysis, and later still a run-time analysis. When the decision is to inline, the message setup
  is skipped. Instead a mapping needs to be created from the called function’s argument names to
  newly created (unique) local variables.
  Then, going through the instructions, references to arguments must be exchanged for references
  to the new variables. A similar process needs to replace references to local variables in the
  called method with local variables in the calling method. Similarly, the return and self slots need
  to be mapped.
%p
  After the final instruction of the called method, the reassigned return must be moved to the real
  return, and the calling function may continue. And while this may sound like a lot, one must remember
  that the instruction set of the machine is quite small, and further refinement
  (abstracting base classes, for example) can be done to make the work easier.
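%p
  As a rough illustration of the plan, the slot renaming could look something like the sketch
  below. All names here are assumptions for the sake of the example: the Move instruction, the
  [kind, name] slot pairs, the ALWAYS_INLINE list and the naming scheme for the new locals are
  invented and do not reflect the actual instruction set.

```ruby
# Hypothetical instruction that moves a value between slots.
# A slot is a [kind, name] pair, e.g. [:arg, :name], [:local, :index] or [:return, nil].
Move = Struct.new(:from, :to)

# Assumed "always inline" list; complexity or run-time analysis would come later.
ALWAYS_INLINE = [:instance_variable_get].freeze

def inline?(function_name)
  ALWAYS_INLINE.include?(function_name)
end

# Returns the callee's instructions rewritten to use slots of the caller.
def inline(function_name, callee_instructions, call_id)
  # Every argument, local, self and return slot of the callee is mapped,
  # on first use, to a new unique local variable of the caller.
  mapping = Hash.new do |map, slot|
    kind, name = slot
    map[slot] = [:local, "#{function_name}_#{call_id}_#{kind}_#{name}".to_sym]
  end
  body = callee_instructions.map do |instruction|
    Move.new(mapping[instruction.from], mapping[instruction.to])
  end
  # After the final instruction, move the reassigned return to the real return.
  body << Move.new(mapping[[:return, nil]], [:return, nil])
end

callee = [
  Move.new([:arg, :name], [:local, :index]),
  Move.new([:local, :index], [:return, nil])
]
inlined = inline(:instance_variable_get, callee, 1) if inline?(:instance_variable_get)
```

%p
  The call_id is there so that two inlined calls to the same function in one method do not share
  variables. A real implementation would also have to assign the argument values to the new
  locals before the inlined body, which this sketch leaves out.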