138 lines
5.9 KiB
Plaintext
138 lines
5.9 KiB
Plaintext
= render "pages/rubyx/menu"
|
||
|
||
%h1=title "Memory layout and management"
|
||
|
||
%p
|
||
Memory management must be one of the main horrors of computing.
|
||
That’s why garbage collected languages like ruby are so great.
|
||
Even simple malloc implementations tend to be quite complicated.
|
||
Unnecessary so, if one used object oriented principles of data hiding.
|
||
|
||
%h3 Objects
|
||
|
||
%p
|
||
As has been mentioned, in a true OO system, object tagging is not really an option.
|
||
Tagging being the technique of adding the lowest bit as marker to pointers and thus
|
||
having to shift ints, loosing a bit, and having some check before any pointer access.
|
||
%br
|
||
Mri does this for Integers but not other value types.
|
||
We accept this and work with it and just say “off course” , but it’s not modelled well.
|
||
|
||
%p
|
||
In a real OO system,
|
||
%b everything really is an object.
|
||
Strings are objects, floats, symbols, arrays, and yes,
|
||
%b Integers are normal Objects
|
||
%p
|
||
The difference with Integers is that they are
|
||
%b immutable.
|
||
As are Symbols in Ruby 2.x and Strings in Ruby 3.x and Javascript. Sensibly so, and
|
||
in general the property of immutable should be modelled explicitly.
|
||
|
||
%h3 Object Memory Layout
|
||
|
||
%p
|
||
When we say everything in an Object, what does that mean in practise.
|
||
Well, in short it means every Object has a Type, and the type is the
|
||
%b first word
|
||
in the memory layout.
|
||
%p
|
||
A Type stores the instance variable names, the methods, and refers to a class,
|
||
which in turn defines behaviour in a ruby.
|
||
%p
|
||
As a further stipulation, making our life easier, we define objects to be of fixed
|
||
size (according to type) and a multiple of a cache line long.
|
||
Objects are managed in Pages of same sized objects and the ObjectSpace, see below.
|
||
%p
|
||
|
||
%h4 Continuations
|
||
%p
|
||
But (i hear), ruby is dynamic, we must be able to add variables and methods to an object
|
||
at any time. So the type, or length, can’t be fixed. Ok, we can change the Type every
|
||
time, but when any empty slots have been used up, what then.
|
||
%p
|
||
Then we use Continuations, so instead of adding a new variable to the end of the object,
|
||
we use a new object and store it in the original object. Thus extending the object.
|
||
A linked list basically.
|
||
%p
|
||
Continuations are pretty normal objects and it is just up to the object to manage the
|
||
redirection. Off course this may splatter objects a little, but in running application
|
||
this does not really happen much. Most instance variables are added quite soon after
|
||
startup, just as functions are usually parsed in the beginning.
|
||
%p
|
||
We can avoid the added redirection of Continuations by clever code analysis and
|
||
over dimensioning. While this, and the whole concept of fixed size objects, may seem
|
||
wasteful at first sight, it is
|
||
%em much
|
||
more efficient than using a hash (as in mri, that not only stores all those names
|
||
over and over, but also has buckets, list functionality and just about uses a cache
|
||
line for a single variable)
|
||
|
||
%h3 Data
|
||
|
||
%p
|
||
So if we’re not tagging and we only have
|
||
%em Objects
|
||
where is the data. Where is that int, that char, the byte-buffer.
|
||
%p
|
||
Just to make that totally clear: The OO level has no access, no idea of data.
|
||
%p
|
||
Data does off course exist, but it is hidden, beyond the instance variables,
|
||
inaccessible to normal ruby code.
|
||
%p
|
||
The way this works, is that all access to data, or one should really say all
|
||
functionality that is needed to perform on data, is implemented in the lower
|
||
levels, mostly the Risc layer.
|
||
%p
|
||
In the Builtin module, we can define methods in purely Risc terms. The Risc
|
||
layer does have access to the memory, and can thus do things with it. Let's
|
||
look at the simple example of Integer addition. The method is defined in
|
||
=ext_link "Builtin," , "https://github.com/ruby-x/rubyx/blob/master/lib/risc/builtin/integer.rb#L82"
|
||
on the Integer type. The method requires one argument and checks that that too
|
||
is an Integer. Then it loads the data from both objects, performs the operation,
|
||
"allocates" a new Integer object and saves the machine word into it.
|
||
%p
|
||
We can also define Mom Instruction to manipulate data, but as work is done in
|
||
methods, the Builtin approach has been sufficient up to now.
|
||
%p
|
||
Again one may think this is wasteful, the simplest of Integer operation thus taking
|
||
10-20 cpu instructions instead of one. But not only are we speed-wise up against
|
||
interpretation (ie not one), but number crunching is not really what ruby is made
|
||
for. And if it ever is, there is always the possibility to optimise those Builtin methods.
|
||
|
||
%h3 Pages, Space and object allocation
|
||
%p
|
||
A
|
||
%em Page
|
||
manages a fixed size number of objects of the same size. They do not need to be of
|
||
same Type, just same memory length.
|
||
%p
|
||
The Space, manages Pages, and is ultimately responsible for "allocating" new memory.
|
||
%p
|
||
Objects are
|
||
%b not
|
||
allocated in the same way as mri, but rather recycled. Mri used C and specifically
|
||
malloc to do memory allocation and freeing.
|
||
%p
|
||
RubyX only every allocates Pages, or many pages (depending on object size), and
|
||
does so by getting it directly from the operating system (system call).
|
||
%p
|
||
Objects are only every recycled. Pages keep free lists of the objects (of the size
|
||
they manage) that are not used, and hand them out upon request. When the garbage
|
||
collector deems an object to be "freed" it is put back on the free-list of the
|
||
appropriate Page. This is done by changing the type of the object.
|
||
|
||
%h3 Status
|
||
%p
|
||
Not all of this has been implemented yet, only the
|
||
%em static
|
||
side of this. Pages and Space are still barely existent in terms of functionality
|
||
and objects are only statically allocated at the moment.
|
||
%p
|
||
But fixed size objects (and off course the type system) are done. When creating
|
||
a binary, only fixes sized objects are written. The next step will be to sort them
|
||
according to size and arrange them in Pages.
|
||
%p
|
||
Integers and their basic operations are done, and strings have basic read/write
|
||
access, but no allocation yet.
|