change of name

salama/layers.html (new file, 121 lines)
@@ -0,0 +1,121 @@
---
layout: salama
title: Salama, a simple and minimal oo machine
---


<div class="row vspace10">
<div class="span12 center">
<h3><span>Salama layers</span></h3>
<p>The layers map pretty much to the top-level directories.</p>
</div>
</div>
<div class="row vspace20">
<div class="span11">
<h5>Machine Code, bare metal</h5>
<p>
This is the code in the arm directory. It creates binary code according to the ARM specs; it is all about shifting bits the right way.
<br/>
As an abstraction it is not far from assembler. I mapped the mnemonics to function calls, and the registers can be symbols or Values (from the vm). But on the whole this is as low-level as it gets.
<br/>
Different types of instructions are implemented by different classes. To make machine-dependent code possible, those classes are derived from Vm versions.
<br/>
There is an intel directory which contains an expanded version of wilson, but it has yet to be made to fit into the architecture. So for now salama produces ARM code.
</p>
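<p>
As an illustration of "shifting bits the right way", here is a self-contained sketch (ArmSketch and its register table are made up for this page, not salama's actual classes) of a mnemonic as a plain method call that assembles one ARM data-processing word:
</p>
<pre><code>
# Hypothetical sketch: encode "add rd, rn, rm" (condition AL) as a 32-bit ARM word.
class ArmSketch
  REGISTERS = { r0: 0, r1: 1, r2: 2, r3: 3 }   # just enough registers for the example

  def add(rd, rn, rm)
    cond   = 0b1110                            # AL, always execute
    opcode = 0b0100                            # ADD, data-processing with register operand
    (cond &lt;&lt; 28) | (opcode &lt;&lt; 21) |
      (REGISTERS[rn] &lt;&lt; 16) | (REGISTERS[rd] &lt;&lt; 12) | REGISTERS[rm]
  end
end

puts format("%08x", ArmSketch.new.add(:r0, :r1, :r2))   # prints e0810002
</code></pre>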
</div>
</div>


<div class="row">
<div class="span12">
<h5>Parsing, forever descending</h5>
<p>
Parsing is relatively straightforward too. We all know ruby, so it's just a matter of getting the rules right.
<br/>
If only! Ruby is full of niceties that actually make parsing it quite difficult. But at the moment that story hasn't even started.
<br/>
Traditionally, yacc or bison or talk of LR or LL would come in here and all but a few would zone out. But llvm has proven that recursive descent parsing is a viable alternative, also for big projects. And Parslet puts that into a nice ruby framework for us.
<br/>
Parslet lets us use modules for parts of the parser, so those files are pretty self-explanatory. Not all is done, but it is a good start.
<br/>
Parslet also has a separate Transformation pass, and that creates the AST. Those class names are also easy, so you can guess what an IfExpression represents.
</p>
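<p>
To give the flavour of the parser/transform split, here is a minimal sketch (MiniParser, MiniTransform and the grammar are invented for this page, not salama's actual rules) that turns "1 + 2" into a tiny AST:
</p>
<pre><code>
require 'parslet'

class MiniParser &lt; Parslet::Parser
  rule(:space?)  { match('\s').repeat }
  rule(:integer) { match('[0-9]').repeat(1).as(:integer) >> space? }
  rule(:sum)     { integer.as(:left) >> str('+') >> space? >> integer.as(:right) }
  root(:sum)
end

IntegerExpression = Struct.new(:value)
SumExpression     = Struct.new(:left, :right)

class MiniTransform &lt; Parslet::Transform
  rule(integer: simple(:i))                 { IntegerExpression.new(i.to_s.to_i) }
  rule(left: simple(:l), right: simple(:r)) { SumExpression.new(l, r) }
end

tree = MiniParser.new.parse("1 + 2")
puts MiniTransform.new.apply(tree).inspect
</code></pre>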
</div>
</div>


<div class="row">
<div class="span12">
<h5>Virtual Machine</h5>
<p>
The virtual machine layer is where it gets interesting, but also a little fuzzy.
<br/>
After some trying around, the virtual machine layer has become a completely self-contained layer that describes and implements an oo machine. In other words it has no reference to any physical machine; that is the next layer down.
<br/>
One can get headaches quite easily while thinking about implementing an oo machine in oo, it's just so difficult to find the boundaries. To determine those, I like to talk of types (not classes) for the objects (values) in which the vm is implemented. It is also necessary to remove ambiguity about what message sending means.
<br/>
One way to think of this (it helps to keep sane) is to think of the types of the system known at compile time. In the simplest case this could be object reference and integer. The whole vm functionality can be made to work with only those two types, and it is not specified how the type information is stored. But of course there needs to be a way to check it at run-time.
<br/>
The vm has an instruction set that, apart from basic integer manipulation, only allows for memory access into an object. Instead of an implicit stack, we use activation frames and store all variables explicitly.
</p>
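<p>
To make the "no implicit stack" point concrete, here is a hypothetical sketch (Frame and IntegerPlus are invented names, not the vm's real instruction set): every variable lives in an explicit frame slot, and an instruction only moves values between slots.
</p>
<pre><code>
class Frame
  def initialize(names)
    @slots = {}
    names.each { |name| @slots[name] = nil }   # all variables are explicit slots
  end

  def get(name)
    @slots.fetch(name)
  end

  def set(name, value)
    @slots[name] = value
  end
end

# one made-up instruction: integer addition between named slots
class IntegerPlus
  def initialize(left, right, result)
    @left, @right, @result = left, right, result
  end

  def run(frame)
    frame.set(@result, frame.get(@left) + frame.get(@right))
  end
end

frame = Frame.new([:a, :b, :sum])
frame.set(:a, 19)
frame.set(:b, 23)
IntegerPlus.new(:a, :b, :sum).run(frame)
puts frame.get(:sum)   # prints 42
</code></pre>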
</div>
</div>


<div class="row">
<div class="span12">
<h5>Neumann Machine</h5>
<p>
The von Neumann machine layer is a relatively close abstraction of the hardware.
<br/>
Currently it is still quite simple: we have classes for things we know, like program and function, and also for things we need to create the code, like Blocks and Instructions.
<br/>
The most interesting thing is maybe the idea of a Value. If you think of variables, Values are what a variable may be assigned, but a Value may also carry a storage place (a register). Values are constant, so to change a value we have to create a new Value (of a possibly different basic type). Thus all machine instructions are transformations of values into new ones.
<br/>
Also interesting is the slightly unripe basic type system. We have a set of machine-word sized types and do not tag them (like mri or BB do), but keep the type info separate. These types include integer (signed/unsigned), object reference and function. Most of the oo machine will build on object references. To make that clearer: the (virtual) machine is strongly typed (with rtti) and the dynamic ruby behaviour is implemented using that basic type system.
</p>
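<p>
A hypothetical sketch of that "instructions transform values" idea (Value and load_class_reference are invented for this page, not the layer's real api):
</p>
<pre><code>
class Value
  attr_reader :type, :register

  def initialize(type, register)
    @type = type           # :integer or :reference in the two-type picture
    @register = register   # the storage place this value lives in, e.g. :r1
    freeze                 # values never change after creation
  end
end

# a made-up instruction: produce a new reference value in r2, leaving the input untouched
def load_class_reference(object_value)
  raise "need an object reference" unless object_value.type == :reference
  Value.new(:reference, :r2)
end

self_value  = Value.new(:reference, :r1)
class_value = load_class_reference(self_value)
puts class_value.register   # prints r2
</code></pre>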
</div>
</div>


<div class="row">
<div class="span12">
<h5>The flux</h5>
<p>
This is just a section of things that are unclear, in flux as it were. It does not include undone things; those are plenty too.
</p>
<ul>
<li>The whole type system, more so the values; for object types it's quite clear.</li>
<li>Booting. There is this dichotomy of writing code and figuring out what it should do when it executes. That works ok, until I try to think of both at once, as for booting.</li>
<li>The oo machine abstraction. Currently non-existent; I feel like there is a whole layer missing, possibly with its own instruction set.</li>
<li>Where the core ends, where parfait starts, and what can be external.</li>
</ul>
</div>
</div>

salama/memory.md (new file, 57 lines)
@@ -0,0 +1,57 @@
---
layout: salama
title: Memory layout and management
---

Memory management must be one of the main horrors of computing. That's why garbage-collected languages like ruby are so great. Even simple malloc implementations tend to be quite complicated. Unnecessarily so, if one uses the object-oriented principle of data hiding.

### Object and values

As has been mentioned, in a true OO system object tagging is not really an option. Tagging is the technique of using the lowest bit of a pointer as a marker, and thus having to shift ints and losing a bit. Mri does this for Integers but not for other value types. We accept this and work with it and just say "of course", but it's not modelled well.

Integers are not Objects like "normal" objects. They are Values, on par with ObjectReferences, and have the following distinctive differences:

- equality implies identity
- constant for their whole lifetime
- pass-by-value semantics

If integers were normal objects, the first would mean they would be singletons. The second means you can't change them, you can only change a variable to hold a different value. It also means you can't add instance variables to an integer, nor singleton_methods. And the third means that if you do change the variable, a passed value will not be changed. Also they are not garbage collected. If you noticed how weird that idea is (the gc), you can see how natural the Value idea is.

Instead of trying to make this difference go away (like MRI does) I think it should be explicit and indeed be expanded to all Objects that have these properties. Words for example (ruby calls them Symbols) are the same: a Table is a Table, and Toble is not. Floats (all numbers) and Times are the same.

### Object Layout

So if we're not tagging, we must pass and keep the type information around separately. For passing, it has been mentioned that a separate register is used.

For keeping track of the type data we need to decide how many types we support. The register used for passing gives an upper limit of 4 bits, and this fits well with the idea of cache lines. So if we use cache lines, then for every 8 words we take one for the type.

Traditionally the class of the object is stored in the object. But this forces the dynamic lookup that is a good part of the performance problem. Instead we store the Object's Layout. The Layout then stores the Class, but it is the Layout that describes the memory layout of the object (and of all objects with the same Layout).

This is in essence a level of indirection that gives us the space to have several Layouts for one class, so we can evolve the class without having to change the Layout (we just create new Layouts for every change).

The memory layout of **every** object is type word, layout reference and "data".

That leaves the length open, and we can use the 8th 4 bits to store it. That gives a maximum of 16 lines.
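
As a sketch of how those 4-bit type codes and the length could be packed into the type word (the code numbers and helper are made up here, not salama's actual encoding):

```ruby
# One type word per 8-word line: a 4-bit code for each of the other 7 words,
# and the 8th nibble (the one that would describe the type word itself)
# reused for the length in lines.
TYPE_CODES = { integer: 0x1, ref: 0x2, function: 0x3 }   # invented code numbers

def type_word(slot_types, length_in_lines)
  raise "a line has 7 typed slots"  unless slot_types.size == 7
  raise "length must fit in 4 bits" unless (1..16).cover?(length_in_lines)
  word = 0
  slot_types.each_with_index do |type, index|
    word |= TYPE_CODES.fetch(type) << (4 * index)   # one nibble per data word
  end
  word | ((length_in_lines - 1) << 28)              # top nibble: length in lines
end

# a one-line object: layout reference plus six data slots
puts format("%08x", type_word([:ref, :integer, :integer, :ref, :ref, :integer, :integer], 1))
```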

#### Continuations

But (I hear) ruby is dynamic, we must be able to add variables and methods to an object at any time, so the layout can't be fixed. Ok, we can change the Layout every time, but when all empty slots have been used up, what then?

Then we use Continuations: instead of adding a new variable to the end of the object, we use a new object and store it in the original object, thus extending the object.

Continuations are pretty normal objects and it is just up to the Layout to manage the redirection.
Of course this may splatter objects a little, but in a running application this does not really happen much. Most instance variables are added quite soon after startup, just as functions are usually parsed in the beginning.

The good side of continuations is also that we can be quite tight with the initial allocation, and even minimal with continuations. Continuations can be completely swapped out after all.

### Pages and Spaces

Now that we have the smallest units taken care of, we need to store them and to allocate and manage larger chunks. This is much simpler, and we can use a fixed-size Page of, say, 256 lines.

The highest order is a Space, which is just a list of Pages. Spaces manage Pages in a very similar way to how Pages manage Objects, i.e. as linked lists of free Objects/Pages.

A Page, like a Space, is of course a normal object. The actual memory materialises out of nowhere, but then gets filled immediately with objects. So no empty memory is managed, just objects that can be repurposed.
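
A minimal sketch of that free-list idea (FreeList is an invented name; both Pages and Spaces would hold something like it):

```ruby
# A free list threads unused entries together; allocation pops the head,
# releasing pushes the entry back on. The same scheme works for free
# Objects inside a Page and for free Pages inside a Space.
class FreeList
  Entry = Struct.new(:payload, :next_free)

  def initialize(payloads)
    @head = nil
    payloads.each { |payload| release(Entry.new(payload, nil)) }
  end

  def allocate
    entry = @head or raise "nothing free, grow the Space / add a Page"
    @head = entry.next_free
    entry
  end

  def release(entry)
    entry.next_free = @head
    @head = entry
  end
end

page = FreeList.new((1..4).map { |i| "slot #{i}" })
a = page.allocate
b = page.allocate
page.release(a)             # "a" becomes the next thing handed out again
puts page.allocate.payload  # prints "slot 4", the most recently released entry
```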

salama/optimisations.md (new file, 85 lines)
@@ -0,0 +1,85 @@
---
layout: salama
title: Optimisation ideas
---

I won't manage to implement all of these ideas in the beginning, so I just jot them down.

### Avoid dynamic lookup

This is of course a broad topic, which is often discussed under the heading of caching. Slightly wrongly in my view, as avoiding the lookups is really the aim, especially for variables.

#### I - Instance Variables

Ruby has dynamic instance variables, meaning you can add a new one at any time. This is as it should be.

But this can easily lead to a dictionary/hash type of implementation. As variable "lookup" is probably *the* most common thing an OO system does, that leads to (unnecessarily) bad performance.

So instead we keep variables laid out C++ style: continuous, array style, at the address of the object. Then we have to manage that in a dynamic manner. This (as I mentioned [here](memory.html)) is done by the indirection of the Layout. A Layout is a dynamic structure mapping names to indexes (actually implemented as an array too, but the api is hash-like).

When a new variable is added, we create a *new* Layout and change the Layout of the object. We can do this as the Layout determines the Class of the object, which stays the same. The memory page mentions how this works with constant-sized objects.

So, problem one fixed: instance variable access in O(1).

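A hypothetical sketch of that Layout indirection (the class and method names are made up for this page):

```ruby
# Names map to fixed slot indexes; adding a variable creates a *new* Layout
# instead of mutating the old one, so existing compiled code stays valid.
class Layout
  attr_reader :object_class, :names

  def initialize(object_class, names)
    @object_class = object_class
    @names = names.freeze        # position in this array == slot index in the object
  end

  def index_of(name)
    @names.index(name)           # resolved once at compile/cache time, then used directly
  end

  def with(name)                 # "add an instance variable"
    Layout.new(object_class, names + [name])
  end
end

point  = Layout.new(:Point, [:@x, :@y])
puts point.index_of(:@y)         # prints 1
bigger = point.with(:@z)         # the object just swaps to the new Layout
puts bigger.index_of(:@z)        # prints 2
```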

#### II - Method lookup

Of course that helps with method access. All methods are, in the end, variables on some (class) object. But as we can't very well have the same (continuous) index for a given method name on all classes, it has to be looked up. Or does it?

Well, yes it does, but maybe not more than once: we can conceivably store the result, except of course not in a dynamic structure, as that would defeat the purpose.

In fact there could be several caching strategies, possibly for different use cases, possibly determined by actual run-time measurements, but for now I just describe a simple one using data blocks, Plocks.

So at a call-site we know the name of the function we want to call and the object we want to call it on, and we have to find the actual function object, and by that the actual call address. In abstract terms we want to create a switch with 3 cases and a default.

So the code is something like: if the first cache entry hits, call the first cached address; the same for entries two and three; and if none hit, do the dynamic lookup. The Plock can store those cache entries inside the code. So then we "just" need to get the cache loaded.

Initializing the cached values happens by normal lazy initialization, i.e. we check for nil, and if nil we do the dynamic lookup and store the result.

Remember, we cache Layout against function address. Since Layouts never change, we're done. We could (as hinted above) do things with counters or round-robin replacement, but that is for later.

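A sketch of that three-entry cache plus default (CallSite and the little Layout stand-in are invented here; in salama the cache entries would live in the Plock, not in a ruby array):

```ruby
# The receiver's Layout is compared against up to three cached entries;
# only on a miss do we fall back to the dynamic lookup and fill a free entry.
Layout = Struct.new(:method_table) do
  def method_for(name)
    method_table.fetch(name)          # stand-in for the real dynamic lookup
  end
end

class CallSite
  def initialize(name)
    @name  = name
    @cache = []                       # up to three [layout, function] pairs
  end

  def call(layout, *args)
    @cache.each do |cached_layout, function|      # the "switch with 3 cases"
      return function.call(*args) if cached_layout.equal?(layout)
    end
    function = layout.method_for(@name)           # the default case
    @cache << [layout, function] if @cache.size < 3
    function.call(*args)
  end
end

greeter = Layout.new({ hello: ->(who) { "hello #{who}" } })
site    = CallSite.new(:hello)
puts site.call(greeter, "world")      # first call does the lookup and caches it
puts site.call(greeter, "again")      # second call hits the cache
```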

Alas: while Layouts are constant, darn ruby, method implementations can actually change! And while it is tempting to just create a new Layout for that too, that would mean going through all existing objects and changing their Layout, which is not good. So we need change notifications: when we cache, we must register a change listener and update the generated function, or at least nullify it.

### Inlining

Ok, this may not need too much explanation. Just work. It may be interesting to experiment with how much this saves, and how much inlining is useful. I could imagine that at some point it's the register shuffling that determines the effort, not the actual call.

Again the key is the update notifications when some of the inlined functions have changed.

And it is important to code the functions so that they have a single exit point, otherwise it gets messy. Up to now this was quite simple, but then blocks and exceptions are not done yet.

### Register negotiation

This is a little less baked, but it comes from the same idea as inlining. As calling functions involves a lot of register shuffling, we could try to avoid some of that.

More precisely, calling conventions usually name registers in which arguments are passed. And to call an "unknown", i.e. any, function, some kind of convention is necessary.

But for "cached" functions, where the function is known, it is possible to do something else. And since we have the source (ast) of the function around, we can do things previously impossible.

One such thing may be to recompile the function to accept arguments exactly where they are in the calling function. Well, now that it's written down, it does sound a lot like inlining, except without the inlining :-)

An expansion of this idea would be to have a Negotiator on every function call. Meaning that the calling function would not do any shuffling, but would instead call a Negotiator, and the Negotiator does the shuffling and the calling of the function. This only really makes sense if the register shuffling information is encoded in the Negotiator object (and does not have to be passed).

Negotiators could do some counting and do the recompiling when it seems worth it. The Negotiator would then remove itself from the chain and connect the caller and the new receiver directly. How much is in this I couldn't say, though.

salama/threads.md (new file, 79 lines)
@@ -0,0 +1,79 @@
---
layout: salama
title: Threads are broken
author: Torsten
---

Having just read about ruby's threads, I was moved to collect my thoughts on the topic. How this will influence the implementation I am not sure yet. But it is good to get it out on paper as a basis for communication.

### Processes

I find it helps to consider why we have threads. Before threads, unix had only processes and ipc, i.e. inter-process communication.

Processes were a good idea, keeping each program safe from the mistakes of others by restricting access to the process's own memory. Each process had the view of "owning" the machine, of being alone on the machine as it were. Each a small turing / von Neumann machine.

But one had to wait for io or the network, and so it was difficult, or even impossible, to get one process to use the machine to the hilt.

IPC mechanisms were and are sockets, shared memory regions and files, each with their own set of strengths, weaknesses and apis, all deemed complicated and slow. Each exchange incurs a process switch, and processes are not lightweight structures.

### Threads

And so threads were born as a lightweight mechanism for getting more things done concurrently, because while one thread is suspended in a kernel call, the others can keep going.

#### Green or fibre

The first threads, which people did without kernel support, were quickly found not to solve the problem so well, because when any one thread calls the kernel, all threads stop. Not that much won, one might think, but wrongly.

Now that green threads are coming back into fashion as fibres, they are used for lightweight concurrency and actor programming, and we find that the different viewpoint can help to express some solutions more naturally.

#### Kernel threads

The real solution, where the kernel knows about threads and does the scheduling, took a while to become standard and makes processes a fair degree more complicated. Luckily we don't code kernels and don't have to worry.

But we do have to deal with the issues that come up. The issue is of course data corruption. I don't even want to go into how to fix this, or the different ways that have been introduced, because the main thrust becomes clear in the next chapter:

### Broken model

My main point about threads is that they are one of the worst hacks, especially in a C environment. Processes had a good model of a program with its own global memory. The equivalent for threads would have been shared memory with **many** programs connected to it. A nightmare. It even breaks that old turing idea, and so it is very difficult to reason about what goes on in a multi-threaded program; the only way this is achieved is by adopting a more restrictive model.

In essence the thread memory model is broken. Ideally I would not like to implement it, or if it is implemented, at least fix it first.

But what is the fix? It is in essence what the process model was, i.e. each thread has its own memory.

### Thread memory

In OO it is possible to fix the thread model, just because we have no global memory access. In effect the memory model must be inverted: instead of almost all memory being shared by all threads and each thread having a small thread-local storage, threads must have mostly thread-specific data and only a small amount of shared resources.

A thread would thus work as a process used to: in essence it can update any data it sees without restrictions. It must exchange data with other threads through specified global objects, which take the role of what ipc used to be.

In an oo system this can be enforced by strict pass-by-value over thread borders.

The itc (inter-thread communication) objects are the only ones that need the current thread synchronization techniques. One mechanism that could cover all needs would be simple lists.

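A sketch of such an itc object under those rules (Channel is an invented name; deep-copying via Marshal stands in for whatever "strict pass-by-value" the vm would enforce):

```ruby
# The queue is the only shared, synchronised thing; everything that crosses it
# is copied, so neither thread can reach into the other's memory.
class Channel
  def initialize
    @queue = Queue.new                                   # thread-safe list
  end

  def push(object)
    @queue.push(Marshal.load(Marshal.dump(object)))      # strict pass-by-value
  end

  def pop
    @queue.pop
  end
end

channel  = Channel.new
producer = Thread.new { channel.push({ answer: 42 }) }
consumer = Thread.new { p channel.pop }                  # sees a copy, not the original
[producer, consumer].each(&:join)
```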

### Salama

The original problem of what a program does during a kernel call could be solved by a very small number of kernel threads. Any kernel call would be put on a list, and "c" threads would pick the calls up, execute them and return the results.

All other threads could be managed as green threads. Threads may not share objects, other than a small number of system-provided ones.