Rather than hardcoding r0 etc., use symbolic syscall_X registers. Change the syscalls and the interpreter to use them; later, use the platform to map each syscall_X to the register actually used (like r0 on ARM).
This seems to fit the layer much better, as we really have a very reduced instruction set.
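
A minimal sketch of the later platform mapping step, in Python for illustration only. All names here (`ARM_SYSCALL_REGISTERS`, `PLATFORMS`, `resolve_register`) are hypothetical and not taken from the codebase, and the choice that syscall_1 maps to r0 is an assumption; the note only says that some syscall_X ends up as r0 on ARM.

```python
# Hypothetical per-platform table: symbolic syscall_X name -> real register.
# The numbering (syscall_1 -> r0, ...) is an assumption made for this sketch.
ARM_SYSCALL_REGISTERS = {
    "syscall_1": "r0",
    "syscall_2": "r1",
    "syscall_3": "r2",
}

# Hypothetical registry of platforms; only ARM is filled in here.
PLATFORMS = {"arm": ARM_SYSCALL_REGISTERS}


def resolve_register(platform, name):
    """Map a symbolic syscall_X name to the platform's real register.

    Names that are not syscall_X placeholders are returned unchanged,
    so already-concrete registers pass straight through.
    """
    if not name.startswith("syscall_"):
        return name
    try:
        return PLATFORMS[platform][name]
    except KeyError:
        raise ValueError(
            f"no mapping for {name!r} on platform {platform!r}"
        ) from None


if __name__ == "__main__":
    print(resolve_register("arm", "syscall_1"))
```

With a table like this, the syscalls and the interpreter would only ever see syscall_X names, and the lookup would happen once, when code is generated for a concrete platform.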