ruby-x.github.io/app/views/pages/rubyx/memory.html.haml

= render "pages/rubyx/menu"
%h1=title "Memory layout and management"
%p
Memory management must be one of the main horrors of computing.
That's why garbage collected languages like Ruby are so great.
Even simple malloc implementations tend to be quite complicated.
Unnecessarily so, if one used the object oriented principle of data hiding.
%h3 Objects
%p
As has been mentioned, in a true OO system, object tagging is not really an option.
Tagging is the technique of using the lowest bit of a word as a marker to tell ints from
pointers, and thus having to shift ints, losing a bit, and needing a check before any pointer access.
%br
MRI does this for Integers but not other value types.
We accept this and work with it and just say “of course”, but it's not modelled well.
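%p
To make the cost of tagging concrete, here is a minimal sketch of it in plain Ruby.
The method names are made up for illustration and this is not how RubyX works (RubyX
does not tag at all); it only shows the shift, the lost bit and the check.

```ruby
# Illustrative only: tag_integer, tagged_integer? and untag_integer are
# hypothetical names, not part of RubyX or MRI.

# Shift the value up by one bit and set the lowest bit as the marker.
def tag_integer(value)
  (value << 1) | 1
end

# Aligned pointers have a 0 in the lowest bit, so a 1 means "this word is an int".
def tagged_integer?(word)
  (word & 1) == 1
end

# Shift back down to recover the value; the marker bit is dropped.
def untag_integer(word)
  word >> 1
end
```

Every use of a word has to go through the check first, which is exactly the overhead
and the modelling problem described above.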
%p
In a real OO system,
%b everything really is an object.
Strings are objects, floats, symbols, arrays, and yes,
%b Integers are normal Objects
%p
The difference with Integers is that they are
%b immutable.
As are Symbols in Ruby and Strings in Javascript (Ruby Strings only when frozen). Sensibly so, and
in general the property of immutability should be modelled explicitly.
%h3 Object Memory Layout
%p
When we say everything is an Object, what does that mean in practice?
Well, in short it means every Object has a Type, and the type is the
%b first word
in the memory layout.
%p
A Type stores the instance variable names, the methods, and refers to a class,
which in turn defines behaviour in Ruby.
%p
As a further stipulation, making our life easier, we define objects to be of fixed
size (according to type) and a multiple of a cache line long.
Objects are managed in Pages of same-sized objects, and Pages by the Space, see below.
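%p
As a rough sketch of the layout rules above: the word size, the cache line size and
both method names below are assumptions made for illustration, not values or code
taken from RubyX.

```ruby
# Hypothetical sketch, not RubyX code: all names and sizes are assumptions.
WORD_SIZE  = 4    # bytes per machine word (assuming a 32 bit target)
CACHE_LINE = 32   # bytes per cache line (assumed)

# Bytes needed for an object with the given number of instance variables.
# One extra word is reserved at the start for the Type.
def padded_length(ivar_count)
  raw = (1 + ivar_count) * WORD_SIZE
  lines = (raw + CACHE_LINE - 1) / CACHE_LINE   # round up to whole cache lines
  lines * CACHE_LINE
end

# Word index of an instance variable, given the names that the Type stores.
def slot_index(names, name)
  1 + names.index(name)   # index 0 is the Type word
end
```

With these assumed sizes, an object with up to seven instance variables fits into one
cache line, and the eighth variable makes it two lines long.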
%p
%h4 Continuations
%p
But (I hear), Ruby is dynamic, we must be able to add variables and methods to an object
at any time. So the type, or length, can't be fixed. Ok, we can change the Type every
time, but when all empty slots have been used up, what then?
%p
Then we use Continuations, so instead of adding a new variable to the end of the object,
we use a new object and store it in the original object. Thus extending the object.
A linked list basically.
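%p
The linked list idea can be sketched in plain Ruby as follows. The class and its
methods are hypothetical and only model the behaviour; in RubyX the names would live
in the Type and not in the object itself.

```ruby
# Hypothetical sketch: SlottedObject and its methods are illustrative names only.
class SlottedObject
  def initialize(capacity)
    @names = []                    # in RubyX the Type would hold the names
    @slots = Array.new(capacity)   # fixed number of slots, never resized
    @continuation = nil            # next link of the chain, created on demand
  end

  def set(name, value)
    index = @names.index(name)
    if index
      @slots[index] = value
    elsif @names.length < @slots.length
      @names << name
      @slots[@names.length - 1] = value
    else
      # no empty slot left: redirect to a continuation of the same size
      @continuation ||= SlottedObject.new(@slots.length)
      @continuation.set(name, value)
    end
    value
  end

  def get(name)
    index = @names.index(name)
    return @slots[index] if index
    @continuation&.get(name)       # nil if the name is unknown
  end

  # number of objects in the chain, including this one
  def chain_length
    1 + (@continuation ? @continuation.chain_length : 0)
  end
end
```

An object created with two slots only grows a continuation when a third variable is
set; reads of the first two variables never pay for the redirection.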
%p
Continuations are pretty normal objects and it is just up to the object to manage the
redirection. Of course this may splatter objects a little, but in a running application
this does not really happen much. Most instance variables are added quite soon after
startup, just as functions are usually parsed in the beginning.
%p
We can avoid the added redirection of Continuations by clever code analysis and
over-dimensioning. While this, and the whole concept of fixed size objects, may seem
wasteful at first sight, it is
%em much
more efficient than using a hash (as in MRI), which not only stores all those names
over and over, but also has buckets and list functionality, and uses just about a cache
line for a single variable.
%h3 Data
%p
So if we're not tagging and we only have
%em Objects
where is the data? Where is that int, that char, the byte-buffer?
%p
Just to make that totally clear: The OO level has no access, no idea of data.
%p
Data does of course exist, but it is hidden, beyond the instance variables,
inaccessible to normal Ruby code.
%p
The way this works is that all access to data, or one should really say all
functionality that needs to be performed on data, is implemented in the lower
levels, mostly the Risc layer.
%p
In the Builtin module, we can define methods in purely Risc terms. The Risc
layer does have access to the memory, and can thus do things with it. Let's
look at the simple example of Integer addition. The method is defined in
=ext_link "Builtin," , "https://github.com/ruby-x/rubyx/blob/master/lib/risc/builtin/integer.rb#L82"
on the Integer type. The method requires one argument and checks that it too
is an Integer. Then it loads the data from both objects, performs the operation,
"allocates" a new Integer object and saves the machine word into it.
%p
We can also define SlotMachine instructions to manipulate data, but as work is done in
methods, the Builtin approach has been sufficient up to now.
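%p
The steps of the Integer addition described above can be modelled in plain Ruby. This
is not the Builtin code (that is written in Risc terms); the array-of-words model and
all names below are assumptions made for illustration.

```ruby
# Hypothetical model, not the Builtin code: an object is an array of "words",
# word 0 holds the type and word 1 holds the raw machine data.
INTEGER_TYPE = :Integer
TYPE_WORD = 0
DATA_WORD = 1

# "Allocate" an Integer object around a raw machine word.
def new_integer(raw)
  [INTEGER_TYPE, raw]
end

def integer_plus(receiver, argument)
  # check that the argument is an Integer too
  unless argument[TYPE_WORD] == INTEGER_TYPE
    raise TypeError, "argument is not an Integer"
  end
  left  = receiver[DATA_WORD]   # load the data from both objects
  right = argument[DATA_WORD]
  sum   = left + right          # the one real machine operation
  new_integer(sum)              # new object, the sum is saved into it
end
```

The single addition in the middle is the only part that touches data; everything
around it is type checking, loading and storing.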
%p
Again one may think this is wasteful, the simplest of Integer operations thus taking
10-20 cpu instructions instead of one. But not only are we, speed-wise, up against
interpretation (i.e. not one instruction either), but number crunching is not really what Ruby is made
for. And if it ever is, there is always the possibility to optimise those Builtin methods.
%h3 Pages, Space and object allocation
%p
A
%em Page
manages a fixed number of objects of the same size. They do not need to be of
the same Type, just the same memory length.
%p
The Space manages Pages, and is ultimately responsible for "allocating" new memory.
%p
Objects are
%b not
allocated in the same way as in MRI, but rather recycled. MRI uses C and specifically
malloc to do memory allocation and freeing.
%p
RubyX only ever allocates Pages, or many pages (depending on object size), and
does so by getting them directly from the operating system (system call).
%p
Objects are only ever recycled. Pages keep free lists of the objects (of the size
they manage) that are not used, and hand them out upon request. When the garbage
collector deems an object to be "freed" it is put back on the free-list of the
appropriate Page. This is done by changing the type of the object.
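%p
Since Pages are not implemented yet, the following is only a sketch of how such a free
list could look in plain Ruby. Page, Slot, the method names and the :Free marker type
are all assumptions, not RubyX code.

```ruby
# Hypothetical sketch: Page, Slot and the :Free marker are assumed names.
Slot = Struct.new(:type, :next_free)

class Page
  # A page owns a fixed number of same-sized slots, all free at the start.
  def initialize(count)
    @free = nil
    count.times { release(Slot.new(nil, nil)) }
  end

  # Hand out a recycled slot, or nil when the page is exhausted.
  def allocate(type)
    slot = @free
    return nil unless slot
    @free = slot.next_free
    slot.next_free = nil
    slot.type = type
    slot
  end

  # Called once the collector considers the object freed.
  def release(slot)
    slot.type = :Free        # freeing is expressed by changing the type
    slot.next_free = @free
    @free = slot
  end

  # Walk the free list and count the unused slots.
  def free_count
    count = 0
    slot = @free
    while slot
      count += 1
      slot = slot.next_free
    end
    count
  end
end
```

Note that no memory is ever returned or requested here; a released slot is simply the
next one to be handed out again, under a different type.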
%h3 Status
%p
Not all of this has been implemented yet, only the
%em static
side of this. Pages and Space are still barely existent in terms of functionality
and objects are only statically allocated at the moment.
%p
But fixed size objects (and of course the type system) are done. When creating
a binary, only fixed size objects are written. The next step will be to sort them
according to size and arrange them in Pages.
%p
Integers and their basic operations are done, and strings have basic read/write
access, but no allocation yet.