Advanced RISC Machines
The ARM Instruction Set
The ARM Instruction Set - ARM University Program - V1.0

Processor Modes
* The ARM has six operating modes:
• User (unprivileged mode under which most tasks run)
• FIQ (entered when a high priority (fast) interrupt is raised)
• IRQ (entered when a low priority (normal) interrupt is raised)
• Supervisor (entered on reset and when a Software Interrupt instruction is
executed)
• Abort (used to handle memory access violations)
• Undef (used to handle undefined instructions)
* ARM Architecture Version 4 adds a seventh mode:
• System (privileged mode using the same registers as user mode)

The Registers
* ARM has 37 registers in total, all of which are 32 bits long.
• 1 dedicated program counter
• 1 dedicated current program status register
• 5 dedicated saved program status registers
• 30 general purpose registers
* However, these are arranged into several banks, with the accessible
bank being governed by the processor mode. Each mode can access
• a particular set of r0-r12 registers
• a particular r13 (the stack pointer) and r14 (link register)
• r15 (the program counter)
• cpsr (the current program status register)
and privileged modes (other than System) can also access
• a particular spsr (saved program status register)

Register Organisation

General registers and Program Counter

User32 / System  FIQ32     Supervisor32  Abort32   IRQ32     Undefined32
r0               r0        r0            r0        r0        r0
r1               r1        r1            r1        r1        r1
r2               r2        r2            r2        r2        r2
r3               r3        r3            r3        r3        r3
r4               r4        r4            r4        r4        r4
r5               r5        r5            r5        r5        r5
r6               r6        r6            r6        r6        r6
r7               r7        r7            r7        r7        r7
r8               r8_fiq    r8            r8        r8        r8
r9               r9_fiq    r9            r9        r9        r9
r10              r10_fiq   r10           r10       r10       r10
r11              r11_fiq   r11           r11       r11       r11
r12              r12_fiq   r12           r12       r12       r12
r13 (sp)         r13_fiq   r13_svc       r13_abt   r13_irq   r13_undef
r14 (lr)         r14_fiq   r14_svc       r14_abt   r14_irq   r14_undef
r15 (pc)         r15 (pc)  r15 (pc)      r15 (pc)  r15 (pc)  r15 (pc)

Program Status Registers

User32 / System  FIQ32     Supervisor32  Abort32   IRQ32     Undefined32
cpsr             cpsr      cpsr          cpsr      cpsr      cpsr
(none)           spsr_fiq  spsr_svc      spsr_abt  spsr_irq  spsr_undef
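
The banking rule can be checked with a small model. This is a sketch only: the mode keys and the helper names `physical_register` and `all_physical_registers` are hypothetical, chosen to match the register suffixes used in these slides.

```python
# Registers that each mode banks (replaces with a private copy).
# User32 and System share one bank, so they bank nothing.
BANKED = {
    "user": set(),
    "system": set(),
    "fiq": set(range(8, 15)),   # r8_fiq .. r14_fiq
    "svc": {13, 14},
    "abt": {13, 14},
    "irq": {13, 14},
    "undef": {13, 14},
}


def physical_register(mode, n):
    """Return the name of the physical register behind rN in the given mode."""
    if not 0 <= n <= 15:
        raise ValueError("register number must be 0..15")
    if n in BANKED[mode]:
        return f"r{n}_{mode}"
    return f"r{n}"


def all_physical_registers():
    """Collect every distinct register name, including the status registers."""
    names = {"cpsr"}
    for mode in BANKED:
        for n in range(16):
            names.add(physical_register(mode, n))
        if mode not in ("user", "system"):
            names.add(f"spsr_{mode}")   # User and System have no SPSR
    return names


print(physical_register("fiq", 8))     # r8_fiq
print(physical_register("svc", 8))     # r8
print(len(all_physical_registers()))   # 37
```

The total of 37 matches the count given earlier: 30 general purpose registers, the program counter, the cpsr and five spsrs.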

Register Example:
User to FIQ Mode

Registers in use

User Mode   FIQ Mode
r0          r0
r1          r1
r2          r2
r3          r3
r4          r4
r5          r5
r6          r6
r7          r7
r8          r8_fiq
r9          r9_fiq
r10         r10_fiq
r11         r11_fiq
r12         r12_fiq
r13 (sp)    r13_fiq
r14 (lr)    r14_fiq
r15 (pc)    r15 (pc)
cpsr        cpsr
            spsr_fiq

EXCEPTION
• Return address calculated from User mode PC value and stored in FIQ mode LR
• User mode CPSR copied to FIQ mode SPSR

Accessing Registers using ARM Instructions
* No breakdown of currently accessible registers.
• All instructions can access r0-r14 directly.
• Most instructions also allow use of the PC.
* Specific instructions to allow access to CPSR and SPSR.
* Note : When in a privileged mode, it is also possible to load / store
(banked out) user mode registers to or from memory.
• See later for details.

The Program Status Registers
(CPSR and SPSRs)

Bits 31-28: N Z C V
Bits 7-5:   I F T
Bits 4-0:   Mode

* Condition Code Flags
Copies of the ALU status flags (latched if the
instruction has the "S" bit set).
• N = Negative result from ALU flag.
• Z = Zero result from ALU flag.
• C = ALU operation Carried out
• V = ALU operation oVerflowed
* Interrupt Disable bits.
• I = 1, disables the IRQ.
• F = 1, disables the FIQ.
* T Bit (Architecture v4T only)
• T = 0, Processor in ARM state
• T = 1, Processor in Thumb state
* Mode Bits
• M[4:0] define the processor mode.
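
A status register value can be unpacked field by field. A minimal sketch: the helper name `decode_psr` is hypothetical, and the M[4:0] encodings in `MODE_NAMES` are not given in these slides; they are taken from the ARM architecture definition and should be treated as an assumption here.

```python
# M[4:0] encodings (assumption: taken from the ARM architecture, not this deck).
MODE_NAMES = {
    0b10000: "User",
    0b10001: "FIQ",
    0b10010: "IRQ",
    0b10011: "Supervisor",
    0b10111: "Abort",
    0b11011: "Undef",
    0b11111: "System",
}


def decode_psr(value):
    """Split a 32-bit CPSR/SPSR value into its flag, control and mode fields."""
    return {
        "N": (value >> 31) & 1,
        "Z": (value >> 30) & 1,
        "C": (value >> 29) & 1,
        "V": (value >> 28) & 1,
        "I": (value >> 7) & 1,
        "F": (value >> 6) & 1,
        "T": (value >> 5) & 1,
        "mode": MODE_NAMES.get(value & 0x1F, "unknown"),
    }


# Z and C set, IRQ and FIQ disabled, ARM state, Supervisor mode.
print(decode_psr(0x600000D3))
```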

Condition Flags

Flag               Logical Instruction          Arithmetic Instruction
Negative (N=‘1’)   No meaning                   Bit 31 of the result has been set.
                                                Indicates a negative number in
                                                signed operations
Zero (Z=‘1’)       Result is all zeroes         Result of operation was zero
Carry (C=‘1’)      After Shift operation        Result was greater than 32 bits
                   ‘1’ was left in carry flag
oVerflow (V=‘1’)   No meaning                   Result was greater than 31 bits.
                                                Indicates a possible corruption of
                                                the sign bit in signed numbers
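
The arithmetic column can be illustrated for a 32-bit addition. A sketch only, not the processor's implementation; the name `add_with_flags` is hypothetical.

```python
MASK32 = 0xFFFFFFFF


def add_with_flags(a, b, carry_in=0):
    """Add two 32-bit values and return (result, NZCV flags)."""
    a &= MASK32
    b &= MASK32
    full = a + b + carry_in          # unbounded Python integer
    result = full & MASK32           # what fits in the 32-bit register
    flags = {
        "N": result >> 31,                    # bit 31 of the result
        "Z": int(result == 0),                # result was zero
        "C": int(full > MASK32),              # result needed more than 32 bits
        # Signed overflow: both operands have the same sign bit and the
        # result's sign bit differs from it.
        "V": (((a ^ result) & (b ^ result)) >> 31) & 1,
    }
    return result, flags


print(add_with_flags(0x7FFFFFFF, 1))   # sign bit corrupted: N and V set
print(add_with_flags(0xFFFFFFFF, 1))   # wraps to zero: Z and C set
```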

The Program Counter (R15)
* When the processor is executing in ARM state:
• All instructions are 32 bits in length
• All instructions must be word aligned
• Therefore the PC value is stored in bits [31:2] with bits [1:0] equal to
zero (as instructions cannot be halfword or byte aligned).
* R14 is used as the subroutine link register (LR) and stores the return
address when Branch with Link operations are performed,
calculated from the PC.
* Thus to return from a linked branch
MOV r15,r14
or
MOV pc,lr

Exception Handling
and the Vector Table

* When an exception occurs, the core:
• Copies CPSR into SPSR_<mode>
• Sets appropriate CPSR bits
– If core implements ARM Architecture 4T and is
currently in Thumb state, then
ARM state is entered.
– Mode field bits
– Interrupt disable flags if appropriate.
• Maps in appropriate banked registers
• Stores the “return address” in LR_<mode>
• Sets PC to vector address
* To return, exception handler needs to:
• Restore CPSR from SPSR_<mode>
• Restore PC from LR_<mode>

Vector Table

0x00000000  Reset
0x00000004  Undefined Instruction
0x00000008  Software Interrupt
0x0000000C  Prefetch Abort
0x00000010  Data Abort
0x00000014  Reserved
0x00000018  IRQ
0x0000001C  FIQ
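
The entry and return steps can be walked through on a dictionary of register values. This is a simplified sketch under stated assumptions: the names `enter_exception` and `return_from_exception` are hypothetical, the mode encodings and the rule for which interrupt disable bits get set come from the ARM architecture rather than these slides, and the return address is passed in rather than derived, since its exact offset is not covered here. Reset is left out because it has no meaningful return state.

```python
# Vector address and mode entered for each exception (from the table above
# and the Processor Modes slide).
VECTORS = {
    "undefined": (0x04, "undef"),
    "swi": (0x08, "svc"),
    "prefetch_abort": (0x0C, "abt"),
    "data_abort": (0x10, "abt"),
    "irq": (0x18, "irq"),
    "fiq": (0x1C, "fiq"),
}

# Assumption: M[4:0] encodings taken from the ARM architecture.
MODE_BITS = {"svc": 0b10011, "undef": 0b11011, "abt": 0b10111,
             "irq": 0b10010, "fiq": 0b10001}
I_BIT, F_BIT, T_BIT = 1 << 7, 1 << 6, 1 << 5


def enter_exception(state, kind, return_address):
    """Apply the exception entry steps to a dict of register values."""
    vector, mode = VECTORS[kind]
    state[f"spsr_{mode}"] = state["cpsr"]       # copy CPSR into SPSR_<mode>
    cpsr = state["cpsr"] & ~(T_BIT | 0x1F)      # ARM state, clear mode field
    cpsr |= MODE_BITS[mode] | I_BIT             # new mode, IRQ disabled
    if kind == "fiq":
        cpsr |= F_BIT                           # FIQ also disables FIQ
    state["cpsr"] = cpsr
    state[f"lr_{mode}"] = return_address        # store the "return address"
    state["pc"] = vector                        # jump to the vector


def return_from_exception(state, mode):
    """Restore CPSR from SPSR_<mode> and PC from LR_<mode>."""
    state["cpsr"] = state[f"spsr_{mode}"]
    state["pc"] = state[f"lr_{mode}"]


state = {"cpsr": 0b10000, "pc": 0x8000}         # User mode
enter_exception(state, "irq", return_address=0x8004)
print(hex(state["pc"]), bin(state["cpsr"] & 0x1F))
return_from_exception(state, "irq")
print(hex(state["pc"]), bin(state["cpsr"] & 0x1F))
```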

The Instruction Pipeline

* The ARM uses a pipeline in order to increase the speed of the flow of
instructions to the processor.
• Allows several operations to be undertaken simultaneously, rather than
serially.

ARM
PC      FETCH    Instruction fetched from memory
PC - 4  DECODE   Decoding of registers used in instruction
PC - 8  EXECUTE  Register(s) read from Register Bank
                 Shift and ALU operation
                 Write register(s) back to Register Bank

* Rather than pointing to the instruction being executed, the
PC points to the instruction being fetched.
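
The three stages can be traced cycle by cycle for straight-line ARM code. A sketch only, ignoring branches and stalls; the function name `pipeline_trace` is hypothetical.

```python
def pipeline_trace(start, count):
    """Return (fetch, decode, execute) addresses for each cycle.

    Models `count` consecutive 4-byte instructions starting at `start`.
    A stage that holds no instruction in a given cycle is reported as None.
    """
    def address(index):
        return start + 4 * index if 0 <= index < count else None

    rows = []
    for cycle in range(count + 2):        # two extra cycles to drain the pipe
        rows.append((address(cycle), address(cycle - 1), address(cycle - 2)))
    return rows


for fetch, decode, execute in pipeline_trace(0x8000, 3):
    print(fetch, decode, execute)
```

In the third row the instruction at 0x8000 is executing while 0x8008 is being fetched, which is why an executing instruction sees the PC as its own address plus 8.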

Quiz #1 - Verbal
* What registers are used to store the program counter and link register?
* What is r13 often used to store?
* Which mode, or modes, has the fewest registers available?
How many and why?

ARM Instruction Set Format

Bit positions marked on the slide: 31, 28, 27, 16, 15, 8, 7, 0

Encoding                                                 Instruction type
Cond 0 0 I Opcode S Rn Rd Operand2                       Data processing / PSR Transfer
Cond 0 0 0 0 0 0 A S Rd Rn Rs 1 0 0 1 Rm                 Multiply
Cond 0 0 0 0 1 U A S RdHi RdLo Rs 1 0 0 1 Rm             Long Multiply (v3M / v4 only)
Cond 0 0 0 1 0 B 0 0 Rn Rd 0 0 0 0 1 0 0 1 Rm            Swap
Cond 0 1 I P U B W L Rn Rd Offset                        Load/Store Byte/Word
Cond 1 0 0 P U S W L Rn Register List                    Load/Store Multiple
Cond 0 0 0 P U 1 W L Rn Rd Offset1 1 S H 1 Offset2       Halfword transfer: Immediate offset (v4 only)
Cond 0 0 0 P U 0 W L Rn Rd 0 0 0 0 1 S H 1 Rm            Halfword transfer: Register offset (v4 only)
Cond 1 0 1 L Offset                                      Branch
Cond 0 0 0 1 0 0 1 0 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 1 Rn  Branch Exchange (v4T only)
Cond 1 1 0 P U N W L Rn CRd CPNum Offset                 Coprocessor data transfer
Cond 1 1 1 0 Op1 CRn CRd CPNum Op2 0 CRm                 Coprocessor data operation
Cond 1 1 1 0 Op1 L CRn Rd CPNum Op2 1 CRm                Coprocessor register transfer
Cond 1 1 1 1 SWI Number                                  Software interrupt

Conditional Execution
* Most instruction sets only allow branches to be executed conditionally.
* However, by reusing the condition evaluation hardware, ARM effectively
increases the number of instructions.
• All instructions contain a condition field which determines whether the
CPU will execute them.
• Non-executed instructions soak up 1 cycle.
– Still have to complete cycle so as to allow fetching and decoding of
following instructions.
* This removes the need for many branches, which stall the pipeline (3
cycles to refill).
• Allows very dense in-line code, without branches.
• The time penalty of not executing several conditional instructions is
frequently less than the overhead of the branch
or subroutine call that would otherwise be needed.

The Condition Field

Bit positions marked on the slide: 31, 28, 24, 20, 16, 12, 8, 4, 0
Cond occupies bits 31-28.

0000 = EQ - Z set (equal)
0001 = NE - Z clear (not equal)
0010 = HS / CS - C set (unsigned higher or same)
0011 = LO / CC - C clear (unsigned lower)
0100 = MI - N set (negative)
0101 = PL - N clear (positive or zero)
0110 = VS - V set (overflow)
0111 = VC - V clear (no overflow)
1000 = HI - C set and Z clear (unsigned higher)
1001 = LS - C clear or Z set (unsigned lower or same)
1010 = GE - N set and V set, or N clear and V clear (> or =)
1011 = LT - N set and V clear, or N clear and V set (<)
1100 = GT - Z clear, and either N set and V set, or N clear and V clear (>)
1101 = LE - Z set, or N set and V clear, or N clear and V set (<, or =)
1110 = AL - always
1111 = NV - reserved.
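
The condition table translates directly into a predicate on the four flags. A minimal sketch; the function name `condition_passed` is hypothetical. The signed conditions reduce to comparing N with V: GE is "N equals V" and LT is "N differs from V".

```python
def condition_passed(cond, n, z, c, v):
    """Return True if the 4-bit condition field passes for the given flags."""
    checks = {
        0b0000: z == 1,                # EQ
        0b0001: z == 0,                # NE
        0b0010: c == 1,                # HS / CS
        0b0011: c == 0,                # LO / CC
        0b0100: n == 1,                # MI
        0b0101: n == 0,                # PL
        0b0110: v == 1,                # VS
        0b0111: v == 0,                # VC
        0b1000: c == 1 and z == 0,     # HI
        0b1001: c == 0 or z == 1,      # LS
        0b1010: n == v,                # GE
        0b1011: n != v,                # LT
        0b1100: z == 0 and n == v,     # GT
        0b1101: z == 1 or n != v,      # LE
        0b1110: True,                  # AL
    }
    if cond not in checks:
        raise ValueError("condition 1111 (NV) is reserved, or value is invalid")
    return checks[cond]


# After comparing 5 with 3 (5 - 3 = 2): N=0, Z=0, C=1, V=0.
print(condition_passed(0b1100, n=0, z=0, c=1, v=0))   # GT
print(condition_passed(0b1011, n=0, z=0, c=1, v=0))   # LT
```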

Using and updating the
Condition Field
* To execute an instruction conditionally, simply postfix it with the
appropriate condition:
• For example an add instruction takes the form:
ADD r0,r1,r2    ; r0 = r1 + r2 (ADDAL)
• To execute this only if the zero flag is set:
ADDEQ r0,r1,r2  ; If zero flag set then
                ; ... r0 = r1 + r2
* By default, data processing operations do not affect the condition flags
(apart from the comparisons where this is the only effect). To cause the
condition flags to be updated, the S bit of the instruction needs to be set
by postfixing the instruction (and any condition code) with an “S”.
• For example to add two numbers and set the condition flags:
ADDS r0,r1,r2   ; r0 = r1 + r2
                ; ... and set flags

Branch instructions (1)
* Branch : B{<cond>} label
* Branch with Link : BL{<cond>} sub_routine_label

Bits 31-28: Cond (condition field)
Bits 27-25: 1 0 1
Bit 24:     L (link bit: 0 = Branch, 1 = Branch with link)
Bits 23-0:  Offset

* The offset for branch instructions is calculated by the assembler:
• By taking the difference between the branch instruction and the
target address minus 8 (to allow for the pipeline).
• This gives a 26 bit offset which is right shifted 2 bits (as the
bottom two bits are always zero as instructions are word-aligned)
and stored into the instruction encoding.
• This gives a range of ±32 Mbytes.
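
The assembler's offset calculation, and the processor's reverse of it, can be sketched as follows. The function names `encode_branch` and `branch_target` are hypothetical; the default condition is 1110 (AL).

```python
def encode_branch(branch_address, target_address, link=False, cond=0b1110):
    """Build the 32-bit encoding of B / BL from the two addresses."""
    # Distance from the PC value seen while the branch executes (address + 8).
    byte_offset = target_address - (branch_address + 8)
    if byte_offset % 4:
        raise ValueError("branch and target must both be word aligned")
    if not -(1 << 25) <= byte_offset < (1 << 25):
        raise ValueError("target is outside the 26-bit signed range")
    field = (byte_offset >> 2) & 0xFFFFFF       # drop the two zero bits
    return (cond << 28) | (0b101 << 25) | (int(link) << 24) | field


def branch_target(branch_address, instruction):
    """Recover the target: sign extend the offset, shift left 2, add to PC."""
    field = instruction & 0xFFFFFF
    if field & 0x800000:                        # negative 24-bit value
        field -= 1 << 24
    return (branch_address + 8 + (field << 2)) & 0xFFFFFFFF


print(hex(encode_branch(0x8000, 0x8010)))             # forward branch
print(hex(encode_branch(0x8000, 0x8000)))             # branch to itself
print(hex(branch_target(0x8000, 0xEAFFFFFE)))
```

A branch to itself has a byte offset of -8, which is why its offset field is all ones except the lowest bit.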

Branch instructions (2)
* When executing the instruction, the processor:
• shifts the offset left two bits, sign extends it to 32 bits, and adds it to PC.
* Execution then continues from the new PC, once the pipeline has been
refilled.
* The "Branch with link" instruction implements a subroutine call by
writing PC-4 into the LR of the current bank.
• i.e. the address of the next instruction following the branch with link
(allowing for the pipeline).
* To return from subroutine, simply need to restore the PC from the LR:
MOV pc, lr
• Again, pipeline has to refill before execution continues.
* The "Branch" instruction does not affect LR.
* Note: Architecture 4T offers a further ARM branch instruction, BX
• See Thumb Instruction Set Module for details.
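
The link and return steps amount to two assignments. A toy model, with a plain dictionary standing in for the current register bank; the function names are hypothetical.

```python
def branch_with_link(state, branch_address, target_address):
    """Model BL: save the return address in LR, then jump."""
    pc_during_execute = branch_address + 8     # PC is two instructions ahead
    state["lr"] = pc_during_execute - 4        # address of the next instruction
    state["pc"] = target_address


def return_from_subroutine(state):
    """Model MOV pc, lr."""
    state["pc"] = state["lr"]


state = {}
branch_with_link(state, branch_address=0x8000, target_address=0x9000)
print(hex(state["lr"]))    # 0x8004
return_from_subroutine(state)
print(hex(state["pc"]))    # 0x8004
```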

Data processing Instructions
* Largest family of ARM instructions, all sharing the same instruction
format.
* Contains:
• Arithmetic operations
• Comparisons (no results - just set condition codes)
• Logical operations
• Data movement between registers
* Remember, this is a load / store architecture
• These instructions only work on registers, NOT memory.
* They each perform a specific operation on one or two operands.
• First operand always a register - Rn
• Second operand sent to the ALU via barrel shifter.
* We will examine the barrel shifter shortly.

Arithmetic Operations
* Operations are:
• ADD    operand1 + operand2
• ADC    operand1 + operand2 + carry
• SUB    operand1 - operand2
• SBC    operand1 - operand2 + carry - 1
• RSB    operand2 - operand1
• RSC    operand2 - operand1 + carry - 1
* Syntax:
• <Operation>{<cond>}{S} Rd, Rn, Operand2
* Examples
• ADD r0, r1, r2
• SUBGT r3, r3, #1
• RSBLES r4, r5, #5
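
The six formulas can be evaluated with 32-bit wrap-around to see how they differ. A sketch only: the names `ARITHMETIC` and `arithmetic` are hypothetical, and flag updates and the barrel shifter are left out.

```python
MASK32 = 0xFFFFFFFF

# One entry per operation, written exactly as in the list above.
ARITHMETIC = {
    "ADD": lambda op1, op2, carry: op1 + op2,
    "ADC": lambda op1, op2, carry: op1 + op2 + carry,
    "SUB": lambda op1, op2, carry: op1 - op2,
    "SBC": lambda op1, op2, carry: op1 - op2 + carry - 1,
    "RSB": lambda op1, op2, carry: op2 - op1,
    "RSC": lambda op1, op2, carry: op2 - op1 + carry - 1,
}


def arithmetic(op, operand1, operand2, carry=0):
    """Return the 32-bit value that would be written to Rd."""
    return ARITHMETIC[op](operand1, operand2, carry) & MASK32


print(arithmetic("RSB", 3, 5))             # 5 - 3
print(hex(arithmetic("SUB", 0, 1)))        # wraps to 0xffffffff
print(arithmetic("SBC", 10, 3, carry=0))   # a clear carry subtracts one more
```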

Comparisons
* The only effect of the comparisons is to
• UPDATE THE CONDITION FLAGS. Thus no need to set S bit.
* Operations are:
• CMP    operand1 - operand2, but result not written
• CMN    operand1 + operand2, but result not written
• TST    operand1 AND operand2, but result not written
• TEQ    operand1 EOR operand2, but result not written
* Syntax:
• <Operation>{<cond>} Rn, Operand2
* Examples:
• CMP r0, r1
• TSTEQ r2, #5

Logical Operations
* Operations are:
• AND    operand1 AND operand2
• EOR    operand1 EOR operand2
• ORR    operand1 OR operand2
• BIC    operand1 AND NOT operand2 [ie bit clear]
* Syntax:
• <Operation>{<cond>}{S} Rd, Rn, Operand2
* Examples:
• AND r0, r1, r2
• BICEQ r2, r3, #7
• EORS r1,r3,r0
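
BIC is the least familiar of the four, so a small model helps show what "AND NOT" does to a 32-bit value. A sketch only; the names `LOGICAL` and `logical` are hypothetical.

```python
MASK32 = 0xFFFFFFFF

LOGICAL = {
    "AND": lambda op1, op2: op1 & op2,
    "EOR": lambda op1, op2: op1 ^ op2,
    "ORR": lambda op1, op2: op1 | op2,
    "BIC": lambda op1, op2: op1 & ~op2,   # clear every bit that is set in op2
}


def logical(op, operand1, operand2):
    """Return the 32-bit value that would be written to Rd."""
    return LOGICAL[op](operand1 & MASK32, operand2 & MASK32) & MASK32


print(hex(logical("BIC", 0xFF, 7)))        # low three bits cleared
print(bin(logical("EOR", 0b1100, 0b1010)))
```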

Data Movement
* Operations are:
• MOV    operand2
• MVN    NOT operand2
Note that these make no use of operand1.
* Syntax:
• <Operation>{<cond>}{S} Rd, Operand2
* Examples:
• MOV r0, r1
• MOVS r2, #10
• MVNEQ r1,#0

Page 24
Quiz #2
* Convert the GCD algorithm given in this flowchart into
1) “Normal” assembler, where only branches can be conditional.
2) ARM assembler, where all instructions are conditional, thus improving code density.
* The only instructions you need are CMP, B and SUB.
[Flowchart: Start. Is r0 = r1? Yes: Stop. No: is r0 > r1? Yes: r0 = r0 - r1. No: r1 = r1 - r0. Both subtractions loop back to the r0 = r1 test.]

Page 25
Quiz #2 - Sample Solution
“Normal” Assembler
gcd     cmp r0, r1         ;reached the end?
        beq stop
        blt less           ;if r0 > r1
        sub r0, r0, r1     ;subtract r1 from r0
        bal gcd
less    sub r1, r1, r0     ;subtract r0 from r1
        bal gcd
stop
ARM Conditional Assembler
gcd     cmp r0, r1         ;if r0 > r1
        subgt r0, r0, r1   ;subtract r1 from r0
        sublt r1, r1, r0   ;else subtract r0 from r1
        bne gcd            ;reached the end?
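For readers who want to trace the loop, here is a Python sketch that mirrors the conditional-execution version step by step. The function name is hypothetical, and both inputs are assumed to be positive integers (the loop would not terminate if either were zero).

```python
def gcd_by_subtraction(r0, r1):
    """Mirror of the conditional loop: CMP, SUBGT, SUBLT, BNE."""
    while r0 != r1:          # BNE gcd
        if r0 > r1:          # SUBGT r0, r0, r1
            r0 -= r1
        else:                # SUBLT r1, r1, r0
            r1 -= r0
    return r0
```

Each pass through the `while` body corresponds to one trip around the four-instruction ARM loop; the branch-only version needs seven instructions for the same work.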

Page 26
The Barrel Shifter
* The ARM doesn’t have actual shift instructions.
* Instead it has a barrel shifter which provides a mechanism to carry out
shifts as part of other instructions.
* So what operations does the barrel shifter support?

Page 27
Barrel Shifter - Left Shift
* Shifts left by the specified amount (multiplies by powers of two) e.g.
LSL #5 = multiply by 32
Logical Shift Left (LSL)
[Diagram: CF <- Destination <- 0]

Page 28
Barrel Shifter - Right Shifts
Logical Shift Right (LSR)
• Shifts right by the specified amount (divides by powers of two) e.g.
LSR #5 = divide by 32
[Diagram: ...0 -> Destination -> CF]
Arithmetic Shift Right (ASR)
• Shifts right (divides by powers of two) and preserves the sign bit, for 2's complement operations. e.g.
ASR #5 = divide by 32
[Diagram: Sign bit shifted in -> Destination -> CF]

Page 29
Barrel Shifter - Rotation
Rotate Right (ROR)
• Similar to an ASR but the bits wrap around as they leave the LSB and appear as the MSB.
e.g. ROR #5
• Note the last bit rotated is also used as the Carry Out.
[Diagram: Rotate Right; bits leaving the LSB of Destination re-enter at the MSB and are copied to CF]
Rotate Right Extended (RRX)
• This operation uses the CPSR C flag as a 33rd bit.
• Rotates right by 1 bit. Encoded as ROR #0.
[Diagram: Rotate Right through Carry; CF -> Destination -> CF]
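The five barrel shifter operations can be sketched in Python as below. This is an illustrative model only: the function names are made up here, values are assumed to be unsigned 32-bit register contents, shift amounts are assumed to be in the range 1 to 31, and each function returns a (result, carry out) pair.

```python
MASK = 0xFFFFFFFF

def lsl(value, n):
    """Logical shift left: zeros enter at the bottom."""
    return (value << n) & MASK, (value >> (32 - n)) & 1

def lsr(value, n):
    """Logical shift right: zeros enter at the top."""
    return value >> n, (value >> (n - 1)) & 1

def asr(value, n):
    """Arithmetic shift right: the sign bit is shifted in."""
    signed = value - (1 << 32) if value & 0x80000000 else value
    return (signed >> n) & MASK, (value >> (n - 1)) & 1

def ror(value, n):
    """Rotate right: bits leaving the LSB reappear at the MSB."""
    result = ((value >> n) | (value << (32 - n))) & MASK
    return result, result >> 31      # last bit rotated is also the carry out

def rrx(value, carry_in):
    """Rotate right by one bit through the carry flag (a 33-bit rotation)."""
    return (carry_in << 31) | (value >> 1), value & 1
```

For instance, `asr(0xFFFFFFE0, 5)` treats the input as -32 and returns the bit pattern for -1, whereas `lsr` on the same input would shift zeros in and give a large positive value.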

Page 30
Using the Barrel Shifter:
The Second Operand
[Diagram: Operand 1 feeds the ALU directly; Operand 2 passes through the Barrel Shifter into the ALU; the ALU produces the Result]
* Register, optionally with shift operation applied.
* Shift value can be either:
• 5 bit unsigned integer
• Specified in bottom byte of another register.
* Immediate value
• 8 bit number
• Can be rotated right through an even number of positions.
• Assembler will calculate rotate for you from constant.

Page 31
Second Operand :
Shifted Register
* The amount by which the register is to be shifted is contained in either:
• the immediate 5-bit field in the instruction
– NO OVERHEAD
– Shift is done for free - executes in single cycle.
• the bottom byte of a register (not PC)
– Then takes extra cycle to execute
– ARM doesn’t have enough read ports to read 3 registers at once.
– Then same as on other processors where shift is separate instruction.
* If no shift is specified then a default shift is applied: LSL #0
• i.e. barrel shifter has no effect on value in register.

Page 32
Second Operand :
Using a Shifted Register
* Using a multiplication instruction to multiply by a constant means first loading the constant into a register and then waiting a number of internal cycles for the instruction to complete.
* A more efficient solution can often be found by using some combination of MOVs, ADDs, SUBs and RSBs with shifts.
• Multiplications by a constant equal to a ((power of 2) ± 1) can be done in one cycle.
* Example: r0 = r1 * 5
Example: r0 = r1 + (r1 * 4)
=> ADD r0, r1, r1, LSL #2
* Example: r2 = r3 * 105
Example: r2 = r3 * 15 * 7
Example: r2 = r3 * (16 - 1) * (8 - 1)
=> RSB r2, r3, r3, LSL #4  ; r2 = r3 * 15
=> RSB r2, r2, r2, LSL #3  ; r2 = r2 * 7
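The two examples can be sanity-checked with a short Python sketch that performs the same shift-and-add or shift-and-reverse-subtract steps. The function names are illustrative, and results are truncated to 32 bits as a register would be.

```python
MASK = 0xFFFFFFFF

def times5(r1):
    # ADD r0, r1, r1, LSL #2   ; r0 = r1 + (r1 * 4)
    return (r1 + (r1 << 2)) & MASK

def times105(r3):
    r2 = ((r3 << 4) - r3) & MASK   # RSB r2, r3, r3, LSL #4 ; r2 = r3 * 15
    r2 = ((r2 << 3) - r2) & MASK   # RSB r2, r2, r2, LSL #3 ; r2 = r2 * 7
    return r2
```

RSB is what makes the "power of 2 minus 1" case a single instruction: the shifted value is operand2, so it has to be the one being subtracted from.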

Page 33
Second Operand :
Immediate Value (1)
* There is no single instruction which will load a 32 bit immediate constant into a register without performing a data load from memory.
• All ARM instructions are 32 bits long
• ARM instructions do not use the instruction stream as data.
* The data processing instruction format has 12 bits available for operand2
• If used directly this would only give a range of 4096.
* Instead it is used to store 8 bit constants, giving a range of 0 - 255.
* These 8 bits can then be rotated right through an even number of positions (ie RORs by 0, 2, 4,..30).
• This gives a much larger range of constants that can be directly loaded, though some constants will still need to be loaded from memory.

Page 34
Second Operand :
Immediate Value (2)
* This gives us:
• 0 - 255  [0 - 0xff]
• 256,260,264,..,1020  [0x100-0x3fc, step 4, 0x40-0xff ror 30]
• 1024,1040,1056,..,4080  [0x400-0xff0, step 16, 0x40-0xff ror 28]
• 4096,4160,4224,..,16320  [0x1000-0x3fc0, step 64, 0x40-0xff ror 26]
* These can be loaded using, for example:
• MOV r0, #0x40, 26  ; => MOV r0, #0x1000 (ie 4096)
* To make this easier, the assembler will convert to this form for us if simply given the required constant:
• MOV r0, #4096  ; => MOV r0, #0x1000 (ie 0x40 ror 26)
* The bitwise complements can also be formed using MVN:
• MOV r0, #0xFFFFFFFF  ; assembles to MVN r0, #0
* If the required constant cannot be generated, an error will be reported.
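The search the assembler performs can be approximated by the sketch below, which tries every even rotation and reports the first 8-bit value that reproduces the constant. The function names are hypothetical, and a real assembler may choose a different (equally valid) pair than the first one found here.

```python
MASK = 0xFFFFFFFF

def rotate_right(value, n):
    """32-bit rotate right by n positions."""
    n %= 32
    return ((value >> n) | (value << (32 - n))) & MASK

def encode_immediate(constant):
    """Return (imm8, rotation) with imm8 ROR rotation == constant, or None."""
    for rotation in range(0, 32, 2):                   # even rotations only: 0, 2, ..., 30
        imm8 = rotate_right(constant, 32 - rotation)   # undo the ROR with a rotate left
        if imm8 <= 0xFF:
            return imm8, rotation
    return None
```

A constant such as 0x101 has set bits nine positions apart, so no 8-bit window covers it and the function returns `None`; that is the case in which the assembler would report an error for a plain MOV.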

Page 35
Loading full 32 bit constants
* Although the MOV/MVN mechanism will load a large range of constants into a register, sometimes this mechanism will not generate the required constant.
* Therefore, the assembler also provides a method which will load ANY 32 bit constant:
• LDR rd,=numeric constant
* If the constant can be constructed using either a MOV or MVN then this will be the instruction actually generated.
* Otherwise, the assembler will produce an LDR instruction with a PC-relative address to read the constant from a literal pool.
• LDR r0,=0x42  ; generates MOV r0,#0x42
• LDR r0,=0x55555555  ; generates LDR r0,[pc, offset to lit pool]
* As this mechanism will always generate the best instruction for a given case, it is the recommended way of loading constants.
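The decision the assembler makes for `LDR rd,=constant` can be sketched as follows. This is a simplified model under the assumption that only the MOV, MVN and literal-pool forms exist; the function names and the returned strings are illustrative, not assembler output.

```python
MASK = 0xFFFFFFFF

def is_rotated_imm8(constant):
    """True if constant is an 8-bit value rotated right by an even amount."""
    for rotation in range(0, 32, 2):
        # rotate left by `rotation` to undo the ROR
        undone = ((constant << rotation) | (constant >> (32 - rotation))) & MASK
        if undone <= 0xFF:
            return True
    return False

def expand_ldr_pseudo(constant):
    """Name the instruction form the pseudo-instruction would be expected to pick."""
    if is_rotated_imm8(constant):
        return "MOV"
    if is_rotated_imm8(~constant & MASK):
        return "MVN"
    return "LDR from literal pool"
```

Under this model `expand_ldr_pseudo(0x42)` gives "MOV" and `expand_ldr_pseudo(0x55555555)` falls through to the literal pool, matching the two examples on the slide.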

Page 36
Multiplication Instructions
* The Basic ARM provides two multiplication instructions.
* Multiply
• MUL{<cond>}{S} Rd, Rm, Rs  ; Rd = Rm * Rs
* Multiply Accumulate - does addition for free
• MLA{<cond>}{S} Rd, Rm, Rs, Rn  ; Rd = (Rm * Rs) + Rn
* Restrictions on use:
• Rd and Rm cannot be the same register
– Can be avoided by swapping Rm and Rs around. This works because multiplication is commutative.
• Cannot use PC.
These will be picked up by the assembler if overlooked.
* Operands can be considered signed or unsigned
• Up to user to interpret correctly.

Page 37
Multiplication Implementation
* The ARM makes use of Booth’s Algorithm to perform integer multiplication.
* On non-M ARMs this operates on 2 bits of Rs at a time.
• For each pair of bits this takes 1 cycle (plus 1 cycle to start with).
• However when there are no more 1’s left in Rs, the multiplication will early-terminate.
* Example: Multiply 18 and -1 : Rd = Rm * Rs
  18 = 00000000 00000000 00000000 00010010
  -1 = 11111111 11111111 11111111 11111111
• Rm = 18, Rs = -1 : 17 cycles
• Rm = -1, Rs = 18 : 4 cycles
* Note: Compiler does not use early termination criteria to decide on which order to place operands.
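The cycle counts in the example follow directly from the rule of one start-up cycle plus one cycle per pair of Rs bits. The Python sketch below is a simplified model of that rule only (the function name is hypothetical, and it does not model any other pipeline effects).

```python
def mul_cycles_non_m(rs):
    """Cycle estimate for a non-M core: 1 start-up cycle plus 1 per pair of
    Rs bits, stopping once no 1s are left in Rs."""
    rs &= 0xFFFFFFFF          # treat Rs as a 32-bit pattern
    cycles = 1                # the cycle "to start with"
    while rs:
        rs >>= 2              # consume one pair of bits
        cycles += 1
    return cycles
```

With Rs = 18 only three pairs contain or precede a 1 bit, giving 4 cycles; with Rs = -1 all sixteen pairs must be processed, giving 17.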

Page 38
Extended Multiply Instructions
* M variants of ARM cores contain extended multiplication hardware. This provides three enhancements:
• An 8 bit Booth’s Algorithm is used
– Multiplication is carried out faster (maximum for standard instructions is now 5 cycles).
• Early termination method improved so that now completes multiplication when all remaining bit sets contain
– all zeroes (as with non-M ARMs), or
– all ones.
Thus the previous example would early terminate in 2 cycles in both cases.
• 64 bit results can now be produced from two 32bit operands
– Higher accuracy.
– Pair of registers used to store result.
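The improved early termination can be modelled in the same spirit. The sketch below assumes one start-up cycle plus one cycle per 8-bit set of Rs, and assumes at least one set is always processed (an assumption made here so that the model reproduces the 2-cycle figure quoted above). The function name is hypothetical.

```python
def mul_cycles_m(rs):
    """Cycle estimate for an M-variant core: stops once the remaining bits of
    Rs are all zeroes or all ones."""
    rs &= 0xFFFFFFFF
    cycles = 1                        # start-up cycle
    remaining_bits = 32
    while True:
        rs >>= 8                      # consume one 8-bit set
        remaining_bits -= 8
        cycles += 1
        all_ones = (1 << remaining_bits) - 1
        if rs == 0 or rs == all_ones:
            return cycles
```

Both operand orders of the 18 × -1 example now give 2 cycles, and a value with significant bits in every byte gives the 5-cycle maximum.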

Page 39
Multiply-Long and
Multiply-Accumulate Long
* Instructions are
• MULL which gives RdHi,RdLo:=Rm*Rs
• MLAL which gives RdHi,RdLo:=(Rm*Rs)+RdHi,RdLo
* However the full 64 bits of the result now matter (lower precision multiply instructions simply throw the top 32 bits away)
• Need to specify whether operands are signed or unsigned
* Therefore the syntax of the new instructions is:
• UMULL{<cond>}{S} RdLo, RdHi, Rm, Rs
• UMLAL{<cond>}{S} RdLo, RdHi, Rm, Rs
• SMULL{<cond>}{S} RdLo, RdHi, Rm, Rs
• SMLAL{<cond>}{S} RdLo, RdHi, Rm, Rs
* Not generated by the compiler.
Warning : Unpredictable on non-M ARMs.
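Why signedness only matters once the top 32 bits are kept can be seen in a small Python sketch. The function names mirror the mnemonics but are otherwise illustrative; inputs are assumed to be 32-bit register patterns and each function returns (RdLo, RdHi).

```python
MASK = 0xFFFFFFFF

def to_signed(value):
    """Interpret a 32-bit pattern as a two's complement integer."""
    return value - (1 << 32) if value & 0x80000000 else value

def umull(rm, rs):
    """Unsigned 32 x 32 -> 64 multiply."""
    product = rm * rs
    return product & MASK, product >> 32

def smull(rm, rs):
    """Signed 32 x 32 -> 64 multiply."""
    product = (to_signed(rm) * to_signed(rs)) & 0xFFFFFFFFFFFFFFFF
    return product & MASK, product >> 32
```

For Rm = 0xFFFFFFFF and Rs = 2 both forms give the same RdLo (0xFFFFFFFE), but RdHi is 1 for the unsigned product and 0xFFFFFFFF for the signed one.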

Page 40
Quiz #3
1. Specify instructions which will implement the following:
a) r0 = 16
b) r1 = r0 * 4
c) r0 = r1 / 16 ( r1 signed 2's comp.)
d) r1 = r2 * 7
2. What will the following instructions do?
a) ADDS r0, r1, r1, LSL #2
b) RSB r2, r1, #0
3. What does the following instruction sequence do?
ADD r0, r1, r1, LSL #1
SUB r0, r0, r1, LSL #4
ADD r0, r0, r1, LSL #7

Page 41
Load / Store Instructions
* The ARM is a Load / Store Architecture:
• Does not support memory to memory data processing operations.
• Must move data values into registers before using them.
* This might sound inefficient, but in practice isn’t:
• Load data values from memory into registers.
• Process data in registers using a number of data processing instructions which are not slowed down by memory access.
• Store results from registers out to memory.
* The ARM has three sets of instructions which interact with main memory. These are:
• Single register data transfer (LDR / STR).
• Block data transfer (LDM/STM).
• Single Data Swap (SWP).

Page 42
Single register data transfer
* The basic load and store instructions are:
• Load and Store Word or Byte
– LDR / STR / LDRB / STRB
* ARM Architecture Version 4 also adds support for halfwords and signed data.
• Load and Store Halfword
– LDRH / STRH
• Load Signed Byte or Halfword - load value and sign extend it to 32 bits.
– LDRSB / LDRSH
* All of these instructions can be conditionally executed by inserting the appropriate condition code after STR / LDR.
• e.g. LDREQB
* Syntax:
• <LDR|STR>{<cond>}{<size>} Rd, <address>

Page 43
Load and Store Word or Byte:
Base Register
* The memory location to be accessed is held in a base register
• STR r0, [r1]  ; Store contents of r0 to location pointed to by contents of r1.
• LDR r2, [r1]  ; Load r2 with contents of memory location pointed to by contents of r1.
[Diagram: r0 (Source Register for STR) = 0x5; r1 (Base Register) = 0x200; memory location 0x200 holds 0x5; r2 (Destination Register for LDR) = 0x5]

Page 44
Load and Store Word or Byte:
Offsets from the Base Register
* As well as accessing the actual location contained in the base register, these instructions can access a location offset from the base register pointer.
* This offset can be
• An unsigned 12bit immediate value (ie 0 - 4095 bytes).
• A register, optionally shifted by an immediate value
* This can be either added or subtracted from the base register:
• Prefix the offset value or register with ‘+’ (default) or ‘-’.
* This offset can be applied:
• before the transfer is made: Pre-indexed addressing
– optionally auto-incrementing the base register, by postfixing the instruction with an ‘!’.
• after the transfer is made: Post-indexed addressing
– causing the base register to be auto-incremented.

Page 45
Load and Store Word or Byte:
Pre-indexed Addressing
* Example: STR r0, [r1,#12]
[Diagram: r0 (Source Register for STR) = 0x5; r1 (Base Register) = 0x200; Offset 12; memory location 0x20c receives 0x5; r1 remains 0x200]
* To store to location 0x1f4 instead use: STR r0, [r1,#-12]
* To auto-increment base pointer to 0x20c use: STR r0, [r1, #12]!
* If r2 contains 3, access 0x20c by multiplying this by 4:
• STR r0, [r1, r2, LSL #2]

Page 46
Load and Store Word or Byte:
Post-indexed Addressing
* Example: STR r0, [r1], #12
[Diagram: r0 (Source Register for STR) = 0x5; Original Base Register r1 = 0x200; memory location 0x200 receives 0x5; Offset 12; Updated Base Register r1 = 0x20c]
* To auto-increment the base register to location 0x1f4 instead use:
• STR r0, [r1], #-12
* If r2 contains 3, auto-increment base register to 0x20c by multiplying this by 4:
• STR r0, [r1], r2, LSL #2
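The difference between pre-indexed, pre-indexed with write-back, and post-indexed stores can be sketched in Python as below. The `store_word` function and its dictionary-based memory and register file are purely illustrative assumptions, using the register values from the diagrams.

```python
def store_word(memory, regs, rd, rn, offset, pre_indexed, writeback=False):
    """Model STR rd, [rn, #offset]{!} (pre-indexed) or STR rd, [rn], #offset (post-indexed).

    memory maps byte addresses to words; regs maps register names to values.
    """
    base = regs[rn]
    if pre_indexed:
        address = base + offset          # offset applied before the transfer
        memory[address] = regs[rd]
        if writeback:                    # the '!' suffix
            regs[rn] = address
    else:
        memory[base] = regs[rd]          # transfer uses the unmodified base
        regs[rn] = base + offset         # base is always updated afterwards
```

Starting from r0 = 0x5 and r1 = 0x200, the pre-indexed form writes to 0x20c and leaves r1 alone unless `writeback` is set, while the post-indexed form writes to 0x200 and then moves r1 to 0x20c.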

Page 47
Load and Stores
with User Mode Privilege
* When using post-indexed addressing, there is a further form of Load/Store Word/Byte:
• <LDR|STR>{<cond>}{B}T Rd, <post_indexed_address>
* When used in a privileged mode, this does the load/store with user mode privilege.
• Normally used by an exception handler that is emulating a memory access instruction that would normally execute in user mode.

Page 48
Example Usage of
Addressing Modes
* Imagine an array, the first element of which is pointed to by the contents of r0.
* If we want to access a particular element, then we can use pre-indexed addressing:
• r1 is element we want.
• LDR r2, [r0, r1, LSL #2]
* If we want to step through every element of the array, for instance to produce sum of elements in the array, then we can use post-indexed addressing within a loop:
• r1 is address of current element (initially equal to r0).
• LDR r2, [r1], #4
Use a further register to store the address of final element, so that the loop can be correctly terminated.
[Diagram: Memory; r0 is the pointer to the start of the array; elements 0, 1, 2, 3 sit at offsets 0, 4, 8, 12]

Page 49
Offsets for Halfword and
Signed Halfword / Byte Access
* The Load and Store Halfword and Load Signed Byte or Halfword instructions can make use of pre- and post-indexed addressing in much the same way as the basic load and store instructions.
* However the actual offset formats are more constrained:
• The immediate value is limited to 8 bits (rather than 12 bits) giving an offset of 0-255 bytes.
• The register form cannot have a shift applied to it.

Page 50
Effect of endianness
* The ARM can be set up to access its data in either little or big endian format.
* Little endian:
• Least significant byte of a word is stored in bits 0-7 of an addressed word.
* Big endian:
• Least significant byte of a word is stored in bits 24-31 of an addressed word.
* This has no real relevance unless data is stored as words and then accessed in smaller sized quantities (halfwords or bytes).
• Which byte / halfword is accessed will depend on the endianness of the system involved.

Page 51
Endianness Example
r0 = 0x11223344
  Bits:            31-24  23-16  15-8  7-0
  r0:              11     22     33    44
STR r0, [r1]    ; r1 = 0x100
  Memory word at 0x100:
  Little-endian:   11     22     33    44
  Big-endian:      44     33     22    11
LDRB r2, [r1]   ; r1 = 0x100
  Little-endian:   00     00     00    44    (r2 = 0x44)
  Big-endian:      00     00     00    11    (r2 = 0x11)
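The same experiment can be reproduced with Python's built-in `int.to_bytes`, which lays a value out in increasing address order for a chosen byte order. The two helper names are illustrative only.

```python
def word_to_memory(value, byteorder):
    """Bytes of a 32-bit word in increasing address order, as STR would lay them out."""
    return value.to_bytes(4, byteorder)

def load_byte(memory, offset=0):
    """LDRB: fetch one byte and zero-extend it to 32 bits."""
    return memory[offset]

little = word_to_memory(0x11223344, "little")
big = word_to_memory(0x11223344, "big")
```

Reading back the byte at the word's own address gives 0x44 from the little-endian layout and 0x11 from the big-endian one, matching the r2 values shown above.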

Page 52
Quiz #4
* Write a segment of code that adds together elements x to x+(n-1) of an array, where the element x=0 is the first element of the array.
* Each element of the array is word sized (ie. 32 bits).
* The segment should use post-indexed addressing.
* At the start of your segment, you should assume that:
• r0 points to the start of the array.
• r1 = x
• r2 = n
[Diagram: array of elements in memory; r0 points to element 0; the n elements to be summed run from element x, x + 1, ... up to x + (n - 1)]

Page 53
Quiz #4 - Sample Solution
        ADD r0, r0, r1, LSL#2   ; Set r0 to address of element x
        ADD r2, r0, r2, LSL#2   ; Set r2 to address of element x+n
        MOV r1, #0              ; Initialise counter
loop
        LDR r3, [r0], #4        ; Access element and move to next
        ADD r1, r1, r3          ; Add contents to counter
        CMP r0, r2              ; Have we reached element x+n?
        BLT loop                ; If not - repeat for
                                ;   next element
        ; on exit sum contained in r1
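To trace the sample solution, the sketch below follows it instruction by instruction in Python. It assumes memory is a dictionary from byte addresses to 32-bit words and that n is at least 1 (as in the assembler version, the loop body always runs once). The function name is hypothetical.

```python
def sum_elements(memory, r0, r1, r2):
    """Model of the sample solution: r0 = array start, r1 = x, r2 = n."""
    r0 = r0 + (r1 << 2)        # ADD r0, r0, r1, LSL#2 : address of element x
    r2 = r0 + (r2 << 2)        # ADD r2, r0, r2, LSL#2 : address of element x+n
    r1 = 0                     # MOV r1, #0
    while True:
        r3 = memory[r0]        # LDR r3, [r0], #4 : load, then advance the pointer
        r0 += 4
        r1 = (r1 + r3) & 0xFFFFFFFF   # ADD r1, r1, r3
        if not r0 < r2:        # CMP r0, r2 / BLT loop
            return r1
```

For an array at 0x100 holding 0, 10, 20, 30, 40, 50, calling `sum_elements(memory, 0x100, 2, 3)` adds elements 2 to 4 and returns 90.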

Page 54
Block Data Transfer (1)
* The Load and Store Multiple instructions (LDM / STM) allow between 1 and 16 registers to be transferred to or from memory.
* The transferred registers can be either:
• Any subset of the current bank of registers (default).
• Any subset of the user mode bank of registers when in a privileged mode (postfix instruction with a ‘^’).
Instruction format:
  Bits 31-28  Cond           Condition field
  Bits 27-25  1 0 0
  Bit 24      P              Pre/Post indexing bit: 0 = Post; add offset after transfer, 1 = Pre; add offset before transfer
  Bit 23      U              Up/Down bit: 0 = Down; subtract offset from base, 1 = Up; add offset to base
  Bit 22      S              PSR and force user bit: 0 = don’t load PSR or force user mode, 1 = load PSR or force user mode
  Bit 21      W              Write-back bit: 0 = no write-back, 1 = write address into base
  Bit 20      L              Load/Store bit: 0 = Store to memory, 1 = Load from memory
  Bits 19-16  Rn             Base register
  Bits 15-0   Register list  Each bit corresponds to a particular register. For example:
                             • Bit 0 set causes r0 to be transferred.
                             • Bit 0 unset causes r0 not to be transferred.
                             At least one register must be transferred as the list cannot be empty.

Page 55
Block Data Transfer (2)
* Base register used to determine where memory access should occur.
• 4 different addressing modes allow increment and decrement inclusive or exclusive of the base register location.
• Base register can be optionally updated following the transfer (by appending it with an ‘!’).
• Lowest register number is always transferred to/from lowest memory location accessed.
* These instructions are very efficient for
• Saving and restoring context
– For this useful to view memory as a stack.
• Moving large blocks of data around memory
– For this useful to directly represent functionality of the instructions.
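The four addressing modes can be illustrated by computing which addresses a transfer would touch. The sketch below is a simplified model: the function name is hypothetical, registers are identified by number, each register occupies one 4-byte word, and `pre` and `up` stand for the P and U bits of the instruction.

```python
def block_transfer_addresses(base, register_list, pre, up):
    """Addresses touched by LDM/STM, paired with register numbers.

    (pre=False, up=True) increments with the base location included;
    (pre=True, up=False) decrements with the base location excluded; and so on.
    Returns (list of (register, address), base value after write-back).
    """
    count = len(register_list)
    if up:
        start = base + 4 if pre else base
        new_base = base + 4 * count
    else:
        start = base - 4 * count if pre else base - 4 * (count - 1)
        new_base = base - 4 * count
    # lowest register number always goes to the lowest address
    pairs = [(reg, start + 4 * i) for i, reg in enumerate(sorted(register_list))]
    return pairs, new_base
```

Note that the order in which registers are written in the list does not matter: `[1, 0, 2]` and `[0, 1, 2]` produce the same layout, because the lowest register number is always placed at the lowest address.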

Page 56
f
e
Th
Stacks
* A stack is an area of memory which grows as new data is “pushed” onto
the “top” of it, and shrinks as data is “popped” off the top.
* Two pointers define the current limits of the stack.
• A base pointer
– used to point to the “bottom” of the stack (the first location).
• A stack pointer
– used to point to the current “top” of the stack.
[Figure: PUSH {1,2,3} stores 1, 2 and 3 above BASE, leaving SP at 3. A following POP gives result of pop = 3 and leaves SP at 2.]

Page 57
Stack Operation
* Traditionally, a stack grows down in memory, with the last “pushed”
value at the lowest address. The ARM also supports ascending stacks,
where the stack structure grows up through memory.
* The value of the stack pointer can either:
• Point to the last occupied address (Full stack)
– and so needs pre-decrementing (i.e. before the push) on a descending stack
• Point to the next unoccupied address (Empty stack)
– and so needs post-decrementing (i.e. after the push) on a descending stack
* The stack type to be used is given by the postfix to the instruction:
• STMFD / LDMFD : Full Descending stack
• STMFA / LDMFA : Full Ascending stack
• STMED / LDMED : Empty Descending stack
• STMEA / LDMEA : Empty Ascending stack
* Note: The ARM compiler will always use a Full Descending stack.

Page 58
Stack Examples
[Figure: memory between 0x3e8 and 0x418, with 0x400 marked, showing Old SP, the new SP and where r0, r1, r3, r4 and r5 are stored after each of the following instructions:]
STMFD sp!,{r0,r1,r3-r5}
STMED sp!,{r0,r1,r3-r5}
STMFA sp!,{r0,r1,r3-r5}
STMEA sp!,{r0,r1,r3-r5}
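The four stack examples can be worked through by hand. This sketch assumes sp holds 0x400 before each instruction, as the address markings in the figure suggest; the resulting addresses follow from the full/empty and ascending/descending rules rather than from the slide itself:

```asm
        ; Each line assumes sp = 0x400 beforehand. Five registers occupy 20 (0x14) bytes.
        STMFD   sp!, {r0,r1,r3-r5}  ; r0@0x3EC r1@0x3F0 r3@0x3F4 r4@0x3F8 r5@0x3FC, sp = 0x3EC
        STMED   sp!, {r0,r1,r3-r5}  ; r0@0x3F0 r1@0x3F4 r3@0x3F8 r4@0x3FC r5@0x400, sp = 0x3EC
        STMFA   sp!, {r0,r1,r3-r5}  ; r0@0x404 r1@0x408 r3@0x40C r4@0x410 r5@0x414, sp = 0x414
        STMEA   sp!, {r0,r1,r3-r5}  ; r0@0x400 r1@0x404 r3@0x408 r4@0x40C r5@0x410, sp = 0x414
```

In every case the lowest-numbered register ends up at the lowest address. Only the position of the block relative to the old sp changes.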

Page 59
Stacks and Subroutines
* One use of stacks is to create temporary register workspace for
subroutines. Any registers that are needed can be pushed onto the stack
at the start of the subroutine and popped off again at the end so as to
restore them before returning to the caller:
        STMFD   sp!,{r0-r12, lr}    ; stack all registers
        ........                    ; and the return address
        ........
        LDMFD   sp!,{r0-r12, pc}    ; load all the registers
                                    ; and return automatically
* See the chapter on the ARM Procedure Call Standard in the SDT
Reference Manual for further details of register usage within
subroutines.
* If the pop instruction also had the ‘S’ bit set (using ‘^’) then the transfer
of the PC when in a privileged mode would also cause the SPSR to be
copied into the CPSR (see exception handling module).
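In practice a routine often stacks only the registers it actually changes. The sketch below shows that pattern together with the ‘^’ form of the return. The labels `func` and `swi_handler` and the register choices are hypothetical:

```asm
func        STMFD   sp!, {r4-r6, lr}    ; save only the registers this routine uses
            ; ... body of the routine, free to use r4-r6 ...
            LDMFD   sp!, {r4-r6, pc}    ; restore them and return to the caller

swi_handler STMFD   sp!, {r0-r3, lr}    ; hypothetical SWI handler, runs in supervisor mode
            ; ... handle the request ...
            LDMFD   sp!, {r0-r3, pc}^   ; '^' with pc in the list: return and
                                        ; copy the SPSR back into the CPSR
```

The second form only makes sense in a mode that has an SPSR. Handlers for other exceptions may first need to adjust lr, which is covered in the exception handling module.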

Page 60
Direct functionality of
Block Data Transfer
* When LDM / STM are not being used to implement stacks, it is clearer to
specify exactly what the functionality of the instruction is:
• i.e. specify whether to increment / decrement the base pointer, before or
after the memory access.
* In order to do this, LDM / STM support a further syntax in addition to
the stack one:
• STMIA / LDMIA : Increment After
• STMIB / LDMIB : Increment Before
• STMDA / LDMDA : Decrement After
• STMDB / LDMDB : Decrement Before
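The stack-style and direct-style mnemonics name the same four underlying instructions. The correspondence below is a summary sketch, with an illustrative push/pop pair (the register choices are assumptions):

```asm
        ; Stack form    Direct form         Stack form    Direct form
        ; STMFD    =    STMDB               LDMFD    =    LDMIA
        ; STMFA    =    STMIB               LDMFA    =    LDMDA
        ; STMED    =    STMDA               LDMED    =    LDMIB
        ; STMEA    =    STMIA               LDMEA    =    LDMDB

        STMFD   sp!, {r4, lr}   ; same encoding as STMDB sp!, {r4, lr}
        LDMFD   sp!, {r4, pc}   ; same encoding as LDMIA sp!, {r4, pc}
```

A store and its matching load use opposite directions, which is why a pop undoes a push: STMFD decrements before storing, and LDMFD increments after loading.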

Page 61
Example: Block Copy
• Copy a block of memory, which is an exact multiple of 12 words long,
from the location pointed to by r12 to the location pointed to by r13. r14
points to the end of the block to be copied.
        ; r12 points to the start of the source data
        ; r14 points to the end of the source data
        ; r13 points to the start of the destination data
loop    LDMIA   r12!, {r0-r11}  ; load 48 bytes
        STMIA   r13!, {r0-r11}  ; and store them
        CMP     r12, r14        ; check for the end
        BNE     loop            ; and loop until done
[Figure: memory map with r12 and r14 marking the source block and r13 marking the destination, with an arrow labelled Increasing Memory.]
• This loop transfers 48 bytes in 31 cycles
• Over 50 Mbytes/sec at 33 MHz (48 bytes x 33 MHz / 31 cycles is about 51 Mbytes/sec)

Page 62
Quiz #5
* The contents of registers r0 to r6 need to be swapped around thus:
• r0 moved into r3
• r1 moved into r4
• r2 moved into r6
• r3 moved into r5
• r4 moved into r0
• r5 moved into r1
• r6 moved into r2
* Write a segment of code that uses full descending stack operations to
carry this out, and hence requires no use of any other registers for
temporary storage.

Page 63
Quiz #5 - Sample Solution
        STMFD   sp!,{r0-r6}     ; push r0-r6
        LDMFD   sp!,{r3,r4,r6}  ; r3 = r0, r4 = r1, r6 = r2
        LDMFD   sp!,{r5}        ; r5 = r3
        LDMFD   sp!,{r0-r2}     ; r0 = r4, r1 = r5, r2 = r6
[Figure: stack contents, Old SP and SP after each of the four instructions.]

Page 64
Swap and Swap Byte
Instructions
* Atomic operation of a memory read followed by a memory write
which moves byte or word quantities between registers and
memory.
* Syntax:
• SWP{<cond>}{B} Rd, Rm, [Rn]
[Figure: (1) the memory location addressed by Rn is read into temp, (2) Rm is written to that memory location, (3) temp is written to Rd.]
* Thus to implement an actual swap of contents make Rd = Rm.
* The compiler cannot produce this instruction.
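One common use of an atomic read-then-write is a simple lock. The sketch below assumes a lock word whose address is in r0, with 0 meaning free and 1 meaning taken. That convention, the label `spin` and the register choices are all illustrative:

```asm
        ; r0 = address of the lock word (0 = free, 1 = taken; an assumed convention)
        MOV     r1, #1          ; value that marks the lock as taken
spin    SWP     r2, r1, [r0]    ; atomically: r2 = [r0], then [r0] = r1
        CMP     r2, #1          ; was the lock already taken?
        BEQ     spin            ; yes: someone else holds it, try again
        ; ... critical section: this code now holds the lock ...
        MOV     r1, #0
        STR     r1, [r0]        ; release the lock
```

Because the read and the write cannot be separated by another access, two contenders can never both read 0 from the lock word.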

Page 65
Software Interrupt (SWI)
Bits 31-28 : Cond (Condition Field)
Bits 27-24 : 1 1 1 1
Bits 23-0 : Comment field (ignored by Processor)
* In effect, a SWI is a user-defined instruction.
* It causes an exception trap to the SWI hardware vector (thus causing a
change to supervisor mode, plus the associated state saving), thus causing
the SWI exception handler to be called.
* The handler can then examine the comment field of the instruction to
decide what operation has been requested.
* By making use of the SWI mechanism, an operating system can
implement a set of privileged operations which applications running in
user mode can request.
* See Exception Handling Module for further details.
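A minimal sketch of how a handler might recover the comment field. The service number 0x12 is hypothetical, and a real handler would first save the registers it uses:

```asm
        SWI     0x12                ; user code: request hypothetical service 0x12

        ; Inside the SWI handler (supervisor mode), lr holds the address of the
        ; instruction after the SWI, so the SWI itself is at lr - 4.
        LDR     r0, [lr, #-4]       ; fetch the SWI instruction word
        BIC     r0, r0, #0xFF000000 ; clear bits 31-24, leaving the comment field
        ; r0 now holds 0x12 for the call above
```

The handler can then use the value in r0 to select the requested operation, for example as an index into a table of routines.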

Page 66
PSR Transfer Instructions
* MRS and MSR allow the contents of the CPSR/SPSR to be transferred
between the appropriate status register and a general purpose register.
• All of the status register, or just the flags, can be transferred.
* Syntax:
MRS{<cond>} Rd,<psr> ; Rd = <psr>
MSR{<cond>} <psr>,Rm ; <psr> = Rm
MSR{<cond>} <psrf>,Rm ; <psrf> = Rm
where
<psr> = CPSR, CPSR_all, SPSR or SPSR_all
<psrf> = CPSR_flg or SPSR_flg
* Also an immediate form
MSR{<cond>} <psrf>,#Immediate
• This immediate must be a 32-bit immediate, of which the 4
most significant bits are written to the flag bits.
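A few illustrative uses of the forms above, written in the syntax given on this slide. The register choices are assumptions:

```asm
        MRS     r0, CPSR                ; r0 = CPSR
        MRS     r1, SPSR                ; r1 = SPSR (only in a mode that has an SPSR)
        MSR     CPSR_flg, r0            ; copy bits 31-28 of r0 into the N, Z, C and V flags
        MSR     CPSR_flg, #0xF0000000   ; immediate form: set N, Z, C and V
        MSR     CPSR_flg, #0            ; immediate form: clear N, Z, C and V
```

The `_flg` forms touch only the flag bits, so they leave the interrupt masks and the mode unchanged.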

Page 67
Using MRS and MSR
* Currently reserved bits may be used in future, therefore:
• they must be preserved when altering the PSR
• the value they return must not be relied upon when testing other bits.
Bits 31-28 : N Z C V
Bits 27-8 : reserved
Bits 7-5 : I F T
Bits 4-0 : Mode
* Thus a read-modify-write strategy must be followed when modifying a
PSR:
• Transfer PSR to register using MRS
• Modify relevant bits
• Transfer updated value back to PSR using MSR
* Note:
• In User Mode, all bits can be read but only the flag bits can
be written to.
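The three steps can be sketched as follows, using the I bit (bit 7) as the bit being changed. The choice of r0 as scratch register is an assumption, and the code must run in a privileged mode:

```asm
        ; Disable IRQ interrupts
        MRS     r0, CPSR        ; 1. transfer the PSR to a register
        ORR     r0, r0, #0x80   ; 2. set the I bit (bit 7), leaving all other bits alone
        MSR     CPSR, r0        ; 3. transfer the updated value back

        ; Re-enable IRQ interrupts
        MRS     r0, CPSR
        BIC     r0, r0, #0x80   ; clear the I bit
        MSR     CPSR, r0
```

Because only the one bit is changed between the read and the write, the reserved bits are written back with whatever value they had.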

Page 68
Coprocessors
* The ARM architecture supports 16 coprocessors
* Each coprocessor instruction set occupies part of the ARM instruction
set.
* There are three types of coprocessor instruction
• Coprocessor data processing
• Coprocessor (to/from ARM) register transfers
• Coprocessor memory transfers (load and store to/from memory)
* Assembler macros can be used to transform custom coprocessor
mnemonics into the generic mnemonics understood by the processor.
* A coprocessor may be implemented
• in hardware
• in software (via the undefined instruction exception)
• in both (common cases in hardware, the rest in software)

Page 69
Coprocessor Data Processing
* This instruction initiates a coprocessor operation
* The operation is performed only on internal coprocessor state
• For example, a Floating point multiply, which multiplies the contents of
two registers and stores the result in a third register
* Syntax:
CDP{<cond>} <cp_num>,<opc_1>,CRd,CRn,CRm{,<opc_2>}
Bits 31-28 : Cond (Condition Code Specifier)
Bits 27-24 : 1 1 1 0
Bits 23-20 : opc_1 (Opcode)
Bits 19-16 : CRn (Source Register)
Bits 15-12 : CRd (Destination Register)
Bits 11-8 : cp_num
Bits 7-5 : opc_2 (Opcode)
Bit 4 : 0
Bits 3-0 : CRm (Source Register)
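Two illustrative CDP instructions. The coprocessor numbers and opcode values are hypothetical, since what each opcode does is defined by the coprocessor rather than by the ARM:

```asm
        CDP     p1, 10, c1, c2, c3      ; ask coprocessor 1 to perform its operation 10
                                        ; on c2 and c3, placing the result in c1
        CDPEQ   p2, 5, c1, c2, c3, 2    ; as above for coprocessor 2 with opc_2 = 2,
                                        ; executed only if the Z flag is set
```

No ARM register is read or written by either instruction. The ARM only passes the request on to the coprocessor.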

Page 70
Coprocessor Register Transfers
* These two instructions move data between ARM registers and
coprocessor registers
• MRC : Move to Register from Coprocessor
• MCR : Move to Coprocessor from Register
* An operation may also be performed on the data as it is transferred
• For example a Floating Point Convert to Integer instruction can be
implemented as a register transfer to ARM that also converts the data
from floating point format to integer format.
* Syntax
<MRC|MCR>{<cond>} <cp_num>,<opc_1>,Rd,CRn,CRm,<opc_2>
Bits 31-28 : Cond (Condition Code Specifier)
Bits 27-24 : 1 1 1 0
Bits 23-21 : opc_1 (Opcode)
Bit 20 : L (Transfer To/From Coprocessor)
Bits 19-16 : CRn (Coprocessor Source/Dest Register)
Bits 15-12 : Rd (ARM Source/Dest Register)
Bits 11-8 : cp_num
Bits 7-5 : opc_2 (Opcode)
Bit 4 : 1
Bits 3-0 : CRm (Coprocessor Source/Dest Register)
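Two illustrative register transfers. The first assumes a core that implements the system control coprocessor (CP15), where register c0 holds the processor ID. The second uses a hypothetical coprocessor 2 whose opcodes are invented for the example:

```asm
        MRC     p15, 0, r0, c0, c0, 0   ; r0 = ID register of CP15 (on cores that have one)
        MCR     p2, 1, r3, c5, c6, 0    ; send r3 to hypothetical coprocessor 2, which
                                        ; interprets opc_1 = 1, c5 and c6 as it sees fit
```

The meaning of the opcode and coprocessor register fields is again up to the coprocessor. Only Rd is interpreted by the ARM itself.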

Page 71
Coprocessor Memory
Transfers (1)
* Load from memory to coprocessor registers
* Store to memory from coprocessor registers.
Bits 31-28 : Cond (Condition Code Specifier)
Bits 27-25 : 1 1 0
Bit 24 : P (Pre/Post Increment)
Bit 23 : U (Add/Subtract Offset)
Bit 22 : N (Transfer Length)
Bit 21 : W (Base Register Writeback)
Bit 20 : L (Load/Store)
Bits 19-16 : Rn (Base Register)
Bits 15-12 : CRd (Source/Dest Register)
Bits 11-8 : cp_num
Bits 7-0 : Offset (Address Offset)

Page 72
Coprocessor Memory
Transfers (2)
* Syntax of these is similar to word transfers between ARM and memory:
<LDC|STC>{<cond>}{<L>} <cp_num>,CRd,<address>
– PC relative offset generated if possible, else causes an error.
<LDC|STC>{<cond>}{<L>} <cp_num>,CRd,<[Rn,offset]{!}>
– Pre-indexed form, with optional writeback of the base register
<LDC|STC>{<cond>}{<L>} <cp_num>,CRd,<[Rn],offset>
– Post-indexed form
where
• <L> when present causes a “long” transfer to be performed (N=1) else
causes a “short” transfer to be performed (N=0).
– Effect of this is coprocessor dependent.
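The three addressing forms might be used as follows. The coprocessor numbers, registers and the label `table` are hypothetical, and `table` is assumed to be close enough to the instruction for a PC relative offset to be generated:

```asm
        LDC     p1, c2, table           ; PC relative: load c2 of coprocessor 1 from 'table'
        LDC     p1, c2, [r0, #8]!       ; pre-indexed: transfer from r0 + 8, then r0 = r0 + 8
        STCL    p2, c3, [r1], #-16      ; post-indexed long store: transfer at r1,
                                        ; then r1 = r1 - 16
```

The offset field holds 8 bits and counts words, so byte offsets must be multiples of 4. How many words are actually transferred is decided by the coprocessor.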
Page 73
Quiz #6
* Write a short code segment that performs a mode change by modifying
the contents of the CPSR
• The mode you should change to is user mode which has the value 0x10.
• This assumes that the current mode is a privileged mode such as
supervisor mode.
• This would happen for instance when the processor is reset - reset code
would be run in supervisor mode which would then need to switch to
user mode before calling the main routine in your application.
• You will need to use MSR and MRS, plus 2 logical operations.
Bits 31-28 : N Z C V
Bits 7-5 : I F T
Bits 4-0 : Mode

Page 74
Quiz #6 - Sample Solution
* Set up useful constants:
mmask   EQU     0x1f            ; mask to clear mode bits
userm   EQU     0x10            ; user mode value
* Start off here in supervisor mode.
        MRS     r0, cpsr        ; take a copy of the CPSR
        BIC     r0,r0,#mmask    ; clear the mode bits
        ORR     r0,r0,#userm    ; select new mode
        MSR     cpsr, r0        ; write back the modified CPSR
* End up here in user mode.

Page 75
Main features of the
ARM Instruction Set
* All instructions are 32 bits long.
* Most instructions execute in a single cycle.
* Every instruction can be conditionally executed.
* A load/store architecture
• Data processing instructions act only on registers
– Three operand format
– Combined ALU and shifter for high speed bit manipulation
• Specific memory access instructions with powerful auto-indexing
addressing modes.
– 32 bit and 8 bit data types,
and also 16 bit data types on ARM Architecture v4.
– Flexible multiple register load and store instructions
* Instruction set extension via coprocessors