ARM Overview: Difference between revisions
m
→Memory Detection: fixed a typo (double word)
[unchecked revision] | [unchecked revision] |
m (More simpler) |
m (→Memory Detection: fixed a typo (double word)) |
(23 intermediate revisions by 12 users not shown) | |||
Line 1:
ARM is a family of instruction set architectures based on RISC architecture developed by a single company - ARM Holdings.
Line 11 ⟶ 9:
All of the processors you'll see in tablets or smartphones are SoCs, which act as a sort of motherboard with a processor. They contain logic for driving peripherals (Ethernet, USB, SD/MMC cards, SPI, I2C, audio), and they may contain a GPU for graphics coprocessing or an FPGA for custom logic.
Many uses mean either a few multifunctional (and therefore complex and power-consuming) devices, or many simple devices each suited to just one kind of operation. ARM processors, as RISC devices, chose simplicity over complexity, and therefore there are a great many cores with different instruction sets in use. To have just ''one assembler to rule them all'', ARM defined Unified Assembly Language which can be translated for
In the latest versions, ARM cores are divided into three main lines:
Line 17 ⟶ 15:
* Cortex-R cores, used for real-time devices
* Cortex-A cores, used for applications in multifunctional devices like smartphones, TVs or maybe computers.
Apple machines use custom ARM cores, and so do some Nvidia boards.
==Overview==
Line 70 ⟶ 69:
The CPU switches to 'Undefined' mode when it encounters an undefined instruction exception. However, the ARMv4 manual strongly implies that this exception mode was really meant for the kernel to emulate instructions for the user-mode process, and then return.
=== Note regarding the 'Spectre' exploit ===
ARM have added a new CPU instruction to their specification that mitigates cache speculation side-channel exploits. You can read more about this design patch and the recommended action to take regarding Spectre [https://developer.arm.com/-/media/Files/pdf/Cache_Speculation_Side-channels.pdf?revision=8b5a5f33-c686-4b00-8186-187dd2910355 here].
{| class="wikitable"
Line 83 ⟶ 86:
Unlike the x86, important operating registers are clearly visible through general use registers. For example, r15 is 'pc', or the 'program counter', and r13 is the 'stack pointer', or 'sp'.
Along with the general purpose registers, there is also the CPSR register, or the 'Current Program Status Register'. This register keeps track of the current operating mode, whether interrupts are enabled or not, etc. The operating system can read and write this register using the MSR/MRS instructions. ([http://www.arm.com/support/faqdev/1472.html See Here]) There are several other system registers, which can be used to [[Detecting_Raspberry_Pi_Board|detect the ARM board]], for example.
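As a rough illustration of what the CPSR holds, the following C sketch decodes the mode field and the IRQ mask bit from a CPSR value. The helper names (''read_cpsr'', ''cpsr_mode_name'', ''cpsr_irqs_enabled'', ''cpsr_examples'') are made up for this example. The bit positions assume a classic 32-bit ARM core (ARMv4 and later) running in ARM state, and the ''MRS'' wrapper uses GCC inline assembly, so it is only compiled for such targets.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* CPSR fields on classic 32-bit ARM cores (ARMv4 and later). */
#define CPSR_MODE_MASK   0x1Fu
#define CPSR_THUMB_STATE (1u << 5)  /* T bit */
#define CPSR_FIQ_MASKED  (1u << 6)  /* F bit: set = FIQs disabled */
#define CPSR_IRQ_MASKED  (1u << 7)  /* I bit: set = IRQs disabled */

#if defined(__arm__) && !defined(__thumb__)
/* Only compiled for ARM-state targets: read the CPSR with MRS. */
static inline uint32_t read_cpsr(void)
{
    uint32_t value;
    __asm__ volatile("mrs %0, cpsr" : "=r"(value));
    return value;
}
#endif

/* Hypothetical helper: human-readable name of the mode in a CPSR value. */
const char *cpsr_mode_name(uint32_t cpsr)
{
    switch (cpsr & CPSR_MODE_MASK) {
    case 0x10: return "User";
    case 0x11: return "FIQ";
    case 0x12: return "IRQ";
    case 0x13: return "Supervisor";
    case 0x17: return "Abort";
    case 0x1B: return "Undefined";
    case 0x1F: return "System";
    default:   return "Unknown";
    }
}

/* Hypothetical helper: true when the I bit is clear, i.e. IRQs can be taken. */
bool cpsr_irqs_enabled(uint32_t cpsr)
{
    return (cpsr & CPSR_IRQ_MASKED) == 0u;
}

/* Usage: 0xD3 is Supervisor mode with both IRQ and FIQ masked. */
void cpsr_examples(void)
{
    assert(cpsr_mode_name(0xD3u)[0] == 'S');
    assert(!cpsr_irqs_enabled(0xD3u));
    assert(cpsr_irqs_enabled(0x10u));   /* User mode, IRQs enabled */
}
```

The decoding is kept separate from the register read so that it can be checked on any host machine.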
{| class="wikitable"
Line 147 ⟶ 150:
|}
<small id="Note1">Note 1: Stack is 8 byte aligned at all times outside of prologue/epilogue of non-leaf function. Leaf functions can have 4 byte aligned stacks. 8 byte alignment is required for ldrd/strd to function on the stack, which mostly only comes into play when using varargs with 64-bit integers. Care must be taken in interrupts and exceptions to align the stack to 8 byte before calling C code.</small>
For details look at the [http://infocenter.arm.com/help/topic/com.arm.doc.ihi0042e/IHI0042E_aapcs.pdf EABI specs].
Line 169 ⟶ 172:
|}
Loading of large immediate values into registers can be interestingly different from
Example of machine code produced by GCC to load a register with a 32-bit value. As you can see, the immediate value is technically outside of the instruction stream. On the
value would have been encoded into the instruction. Also, take note how each instruction is constant in length. The instruction doing the work is LDR, which uses the PC register and an immediate offset to place the value 0x12345678 into the register R3.
<pre>
Line 195 ⟶ 198:
Almost all instructions support conditional execution which is directly encoded into the instruction. On a
Here are the conditional codes for example:
Line 252 ⟶ 255:
===== Memory Detection =====
Memory detection is much different if you are coming from a background in the
Therefore memory detection mechanisms may be non-existent and instead your operating system may opt for a value to be encoded into it at compile
It may, however, be possible to probe memory and recover using processor exceptions. This still may not tell you whether a region of memory is FLASH, memory-mapped I/O, RAM, or ROM, depending on how the system board was designed. I suspect some ROM could be external to the core and allow writes to fail silently, and some regions of memory may need a special unlock sequence before they can be written. Either case can turn your memory auto-detection code into a potential corner case.
Line 263 ⟶ 266:
''Note: For some reason, the ARM people use the terms 'interrupt' and 'exception' as if they were the same.''
For exceptions, ARM uses a table similar to the IVT of the real mode x86. The table consists of a number of 32-bit entries. Each entry is an instruction (ARM instructions are
Take note of that, and understand the design impact it imposes: On x86, the hardware vector table holds the addresses of handler routines. On ARM, the hardware vector table holds actual instructions. These instructions must fit into 4 bytes. This is actually not a big deal since all ARM instructions (assuming ARM mode and not Thumb, or Jazelle) are actually
Also used are various devices to ''vector'' interrupts. Two such are the Generic Interrupt Controller and the Vectored Interrupt Controller.
Line 272 ⟶ 275:
===== Heap Pointers Need To Be Aligned =====
I cannot state at this time how many processors support unaligned memory access natively, but from what I know unaligned memory access is more expensive than aligned access. Structures with fields larger than one byte are very often allocated on the heap, so a heap that hands out unaligned pointers can cost a lot of performance. Worse, it introduces a bug when running on a processor that does not handle unaligned memory access gracefully, and by that I mean one that does not at the very least raise an exception of some sort so the code can crash and show the problem.
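A minimal sketch of how a heap allocator might round an address up before handing it out. The function name ''align_up'' is hypothetical, and the 8-byte figure used in the example is an assumption that matches the stack alignment mentioned earlier in this article; pick whatever your largest access size requires.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Round addr up to the next multiple of align.
 * align must be a power of two (for example 4 or 8).
 */
uintptr_t align_up(uintptr_t addr, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}

/* Usage: a bump allocator would apply this to its "next free" address. */
void align_up_examples(void)
{
    assert(align_up(0x1001u, 8u) == 0x1008u);
    assert(align_up(0x1000u, 8u) == 0x1000u);  /* already aligned */
}
```

Aligning once in the allocator is cheaper than checking every access, since every structure placed at the returned address then starts on a suitable boundary.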
===== Missing Division Functions (__aeabi_uidivmod, __aeabi_idiv) =====
This is caused by using GCC and not linking with ''libgcc''. You ''NEED TO'' link with ''libgcc'' when using GCC. For information about why you should link and information about ''libgcc'', read [[Libgcc]] and [[GCC_Cross-Compiler]].
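To give an idea of the kind of work these helper functions do on cores without a hardware divider, here is a sketch of unsigned division by shift-and-subtract. This is '''not''' the libgcc implementation; the names ''soft_udivmod'' and ''udivmod_result'' are invented for the example, and linking with ''libgcc'' remains the right fix.

```c
#include <assert.h>
#include <stdint.h>

/* Quotient and remainder returned together (hypothetical type). */
typedef struct {
    uint32_t quot;
    uint32_t rem;
} udivmod_result;

/*
 * Unsigned 32-bit division using only shifts, compares and subtracts.
 * Processes the numerator one bit at a time, most significant bit first.
 */
udivmod_result soft_udivmod(uint32_t num, uint32_t den)
{
    udivmod_result r = { 0u, 0u };

    assert(den != 0u);  /* division by zero is the caller's problem */

    for (int bit = 31; bit >= 0; bit--) {
        /* Bring down the next bit of the numerator. */
        r.rem = (r.rem << 1) | ((num >> bit) & 1u);
        if (r.rem >= den) {
            r.rem -= den;
            r.quot |= (uint32_t)1 << bit;
        }
    }
    return r;
}

/* Usage: 100 / 7 is 14 with remainder 2. */
void soft_udivmod_example(void)
{
    udivmod_result r = soft_udivmod(100u, 7u);
    assert(r.quot == 14u && r.rem == 2u);
}
```

The loop always runs 32 iterations, which keeps the sketch simple at the cost of speed.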
Line 285 ⟶ 288:
Also, some extra information that may be useful:
http://www.linkedin.com/groups/ARM-cores-hardware-division-85447.S.242517259
A discussion about this section, and also at the end an example of
http://forum.osdev.org/viewtopic.php?f=8&t=27767
The source for
https://github.com/mirrors/gcc/blob/master/libgcc/udivmodsi4.c
https://github.com/mirrors/gcc/blob/master/libgcc/
</pre>
===== Unaligned Memory Access And Byte Order =====
{| border="1px"
Line 333 ⟶ 337:
{| class="wikitable"
! colspan="8" | Little Endian Word Size (
|-
| ED || CB || A9 || 87 || 78 || 9A || BC || DE
Line 356 ⟶ 360:
{| class="wikitable"
! colspan="8" | Little Endian Half-Word Size (
|-
| ED || CB || A9 || 87 || 78 || 9A || BC || DE
Line 380 ⟶ 384:
{| class="wikitable"
! colspan="8" | Byte Size (
|-
| ED || CB || A9 || 87 || 78 || 9A || BC || DE
Line 405 ⟶ 409:
|}
''From reading the data sheet it appears that if operating in big endian mode the half-word access would be reversed to be more natural as you would expect on the
''The word access with an offset of ''10b (0x2)'' may be defined, but I am not sure because the data sheet does not really say. However, it may employ some of the mechanisms for loading half-words. (Need someone to come through and correct this if it is wrong)''
The reason memory access has to be aligned is because unaligned access requires additional access cycles, due
You could simulate an unaligned memory access on the ARM7TDMI-S, but you would have to make separate loads from two memory locations. A compiler could probably emit code to do this automatically, but the checks whether an access is unaligned or not would slow down ''all'' memory accesses. Some code is provided in the ''ARM7TDMI-S Data Sheet'' on page ''4-35'' for such "checked" memory access, when you do not know if the address will be aligned or non-aligned.
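The two-load approach described above can be sketched in C as follows. This is not the code from the data sheet; ''load_unaligned_le'' and ''example_memory'' are names made up here, and the sketch assumes little-endian byte numbering within each word. The example data is the byte sequence ED CB A9 87 78 9A BC DE used in the tables in this section.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes ED CB A9 87 78 9A BC DE, viewed as two little-endian words. */
const uint32_t example_memory[2] = { 0x87A9CBEDu, 0xDEBC9A78u };

/*
 * Read a 32-bit little-endian value starting at any byte offset,
 * using only aligned word accesses into the words array.
 */
uint32_t load_unaligned_le(const uint32_t *words, uint32_t byte_offset)
{
    uint32_t index = byte_offset / 4u;
    uint32_t shift = (byte_offset % 4u) * 8u;
    uint32_t lo = words[index];

    if (shift == 0u)
        return lo;  /* aligned: a single access is enough */

    /* Unaligned: combine the tail of one word with the head of the next. */
    uint32_t hi = words[index + 1u];
    return (lo >> shift) | (hi << (32u - shift));
}

/* Usage: offset 1 covers bytes CB A9 87 78. */
void load_unaligned_example(void)
{
    assert(load_unaligned_le(example_memory, 0u) == 0x87A9CBEDu);
    assert(load_unaligned_le(example_memory, 1u) == 0x7887A9CBu);
}
```

Note the extra branch and second load on the unaligned path, which is the cost the paragraph above refers to.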
Line 449 ⟶ 453:
''It is wise to populate all entries of the table. If the table is filled with zeros, each vector holds the instruction "andeq r0, r0, r0", which does nothing. If the CPU jumps to an unpopulated vector it will effectively execute a NOP and fall through to the next vector, which can cause a confusing bug in your code. At least point any unused vectors to a dummy function that will notify you that an unhandled exception has occurred!''
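One way to follow that advice is to fill every vector with a branch to a catch-all handler. The sketch below builds the ARM-state encoding of ''b target'' for each slot. The names ''arm_encode_branch'' and ''fill_vector_table'' are hypothetical, the table is assumed to have the usual eight entries, and the addresses in the example are placeholders rather than values for any particular board.

```c
#include <assert.h>
#include <stdint.h>

#define ARM_VECTOR_COUNT 8u

/*
 * Encode an unconditional ARM-state branch ("b target") for an
 * instruction located at instr_addr. The offset is relative to
 * instr_addr + 8 (the PC as seen by the instruction) and is stored
 * as a signed 24-bit word count.
 */
uint32_t arm_encode_branch(uint32_t instr_addr, uint32_t target)
{
    assert((instr_addr % 4u) == 0u && (target % 4u) == 0u);

    int64_t offset = ((int64_t)target - ((int64_t)instr_addr + 8)) / 4;
    assert(offset >= -(1 << 23) && offset < (1 << 23));

    return 0xEA000000u | ((uint32_t)offset & 0x00FFFFFFu);
}

/* Point every vector at the same dummy handler. */
void fill_vector_table(uint32_t table[ARM_VECTOR_COUNT],
                       uint32_t table_base, uint32_t dummy_handler)
{
    for (uint32_t i = 0; i < ARM_VECTOR_COUNT; i++)
        table[i] = arm_encode_branch(table_base + 4u * i, dummy_handler);
}

/* Usage: a branch to itself at address 4 encodes as 0xEAFFFFFE. */
void vector_table_example(void)
{
    assert(arm_encode_branch(0x4u, 0x4u) == 0xEAFFFFFEu);
}
```

Real handlers can then overwrite individual slots, while anything left untouched still lands somewhere visible instead of sliding into the next vector.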
===== Specifying CPSR using AS (
You have to use ''cpsr'' not ''%%cpsr'' or any other form.
Line 476 ⟶ 480:
}
</pre>
===== GCC (Chars Not Signed) =====
In some cases GCC, when targeting the ARM architecture, may not handle signed values as you might expect.
Line 508 ⟶ 513:
5c: e0823003 add r3, r2, r3
</pre>
The
ARM you may be surprised to find that ''char'' is treated as ''unsigned char''. You must specify ''signed''
before ''char'', and then the compiler will generate code to correctly perform the addition.
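The effect can be demonstrated without relying on what plain ''char'' means for a given target by spelling out the signedness. The function names below are invented for this sketch. ''CHAR_MIN'' from ''limits.h'' is a standard way to check which behaviour the current compiler uses, and GCC also accepts ''-fsigned-char'' if you prefer to change the default instead of the source.

```c
#include <assert.h>
#include <limits.h>

/* Nonzero when plain char is unsigned for the current compiler/target. */
int plain_char_is_unsigned(void)
{
    return CHAR_MIN == 0;
}

/* With an explicitly signed char, the byte 0xFF means -1. */
int add_signed(signed char a, int b)
{
    return a + b;
}

/* With an unsigned char, the same byte 0xFF means 255. */
int add_unsigned(unsigned char a, int b)
{
    return a + b;
}

/* Usage: the same bit pattern gives two different sums. */
void char_sign_example(void)
{
    assert(add_signed(-1, 10) == 9);
    assert(add_unsigned(0xFF, 10) == 265);
}
```

Code that stores small negative numbers in ''char'' should therefore use ''signed char'' (or ''int8_t'') explicitly.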
Line 531 ⟶ 536:
! Brief Description
|-
| [[ARM_Beagleboard|
| Tutorial on bare-metal [OS] development on the Texas Instruments
|-
| [[ARM_Integrator-CP_Bare_Bones|Integrator Barebones]]
Line 541 ⟶ 546:
|-
| [[PL050_PS/2_Controller|PL050 PS/2 Controller]]
| Information about interfacing a
|-
| [[ARM_Integrator-CP_IRQTimerAndPIC|IRQ, Timer, And PIC]]
Line 548 ⟶ 553:
| [[ARM_Integrator-CP_ITPTMME_Main|ELK Pages (Thin ARM)]]
| The experimental learning kernel pages aim to take someone gradually through the process of building a functional kernel, possibly using (in later parts of the series) experimental designs and implementations that differ from the standard, conventional design in certain areas.
|-
| [[ARM_RaspberryPi]]
| Description and details on the commonly used Raspberry Pi boards
|-
| [[User:Pancakes/ARM_QEMU_REALVIEW-PB-A|QEMU realview-pb-a board]]
Line 554 ⟶ 562:
==Highly Useful External Resources==
*[http://infocenter.arm.com/help/index.jsp
*[http://www.arm.com/documentation/Software_Development_Tools/ More ARM Documentation]
*[http://www.coranac.com/tonc/text/asm.htm Whirlwind Tour of ARM Assembly]
*[http://re-eject.gbadev.org/files/GasARMRef.pdf
*[http://re-eject.gbadev.org/files/armref.pdf ARM Instruction Reference]
*[http://www.arm.com/miscPDFs/9658.pdf
*[http://infocenter.arm.com/help/topic/com.arm.doc.dui0159b/DUI0159B_integratorcp_1_0_ug.pdf Integrator/CP Reference Manual]
*[http://www.amazon.com/ARM-System-Developers-Guide-Architecture/dp/1558608745 Amazon: ARM System Developer's Guide: Designing and Optimizing System Software]
Line 565 ⟶ 573:
[[Category:ARM]]
[[Category:Instruction Set Architecture]]
[[de:ARM]]