User:Pancakes/Sandbox

Unaligned Memory Access And Byte Order

The accuracy of this section has not been widely verified, and it may contain errors.

Let's imagine we have a board with a stick of RAM that holds 8 bytes, depicted below:

Value	A	B	C	D	E	F	G	H
Address	0	1	2	3	4	5	6	7

To recap, a little endian machine stores the LSB (least significant byte) at the lowest address. So if we made a 32-bit (word-sized) read at address 0 of the memory above, a little endian machine would yield the integer value DCBA, while a big endian machine would yield ABCD, exactly as the bytes appear above. Which one feels natural depends on how you picture memory addresses: flowing to the right or to the left. If you picture them flowing to the left, little endian (x86/x86-64) seems more natural; if you picture them as drawn above, big endian seems more natural. Many ARM cores can operate in either mode.
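The two readings can be sketched in portable C by assembling the word from individual bytes, which gives the same result no matter what byte order the host uses. The function names read_le32 and read_be32 are made up for this illustration and do not come from any library.

```c
#include <assert.h>
#include <stdint.h>

/* Little endian: the byte at the lowest address is the least significant. */
static uint32_t read_le32(const uint8_t *p)
{
    assert(p != 0);
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Big endian: the byte at the lowest address is the most significant. */
static uint32_t read_be32(const uint8_t *p)
{
    assert(p != 0);
    return ((uint32_t)p[0] << 24)
         | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)
         |  (uint32_t)p[3];
}
```

With the bytes 0x0A, 0x0B, 0x0C, 0x0D at addresses 0 to 3, read_le32 gives 0x0D0C0B0A (the DCBA case) and read_be32 gives 0x0A0B0C0D (the ABCD case).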

The twist is that the behavior depends on the core. For example, the ARM7TDMI-S, which QEMU emulates, supports two byte ordering modes. The mode cannot be toggled in software on this core unless the board provides a way to do so, and I cannot find anywhere that QEMU does. Instead, an input pin on the core called BIGEND selects the mode: depending on whether the voltage on it is above or below the threshold, the core runs in big endian or little endian mode. The default mode, and the mode used by QEMU, is little endian. Because of this you will see an interesting effect in how the core reads half-words.

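There are two things to watch for if you are used to a more mainstream architecture like x86/x86-64:

The core is not going to handle unaligned memory accesses for you, and if an unaligned access does go through, the value is considered unpredictable.
In little endian mode, reading a word back as two half-words returns the half-words in the opposite order from how the word is written: the low half comes first.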

The core by default operates in little endian mode. The following code will demonstrate the twist when accessing memory using half-words:

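#include <stdint.h>

/* Bare-metal fragment: address 0 is assumed to be writable RAM on this board. */
volatile uint32_t *a;
volatile uint16_t *b;

a = (volatile uint32_t*)0;
b = (volatile uint16_t*)0;

a[0] = 0x12345678;                      /* one aligned word store */
printf("%04x --> %04x\n", b[0], b[1]);  /* two aligned half-word loads */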

In little endian mode this yields the half-words 0x5678 and then 0x1234, while in big endian mode you would get 0x1234 and then 0x5678. As far as I know there is no way to switch to big endian, unless you patch QEMU to produce a big endian build.
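Since the big endian case cannot be tried on the stock QEMU build, here is a small host-side model of it. It is only a sketch that treats memory as a plain byte array, and store32 and load16 are hypothetical helpers written for this page, not part of QEMU or any ARM library.

```c
#include <assert.h>
#include <stdint.h>

/* Store a 32-bit word into byte-addressed memory in the chosen byte order. */
static void store32(uint8_t *mem, uint32_t v, int big_endian)
{
    assert(mem != 0);
    for (int i = 0; i < 4; i++) {
        int shift = big_endian ? 8 * (3 - i) : 8 * i;
        mem[i] = (uint8_t)(v >> shift);
    }
}

/* Load an aligned 16-bit half-word at byte offset off (0 or 2). */
static uint16_t load16(const uint8_t *mem, unsigned off, int big_endian)
{
    assert(mem != 0 && (off & 1u) == 0);
    uint8_t first = mem[off], second = mem[off + 1];
    return big_endian ? (uint16_t)((first << 8) | second)
                      : (uint16_t)((second << 8) | first);
}
```

Storing 0x12345678 with big_endian set to 0 and loading the half-words at offsets 0 and 2 gives 0x5678 then 0x1234; with big_endian set to 1 the same sequence gives 0x1234 then 0x5678.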

Little Endian Word Size (32-bit)
Memory	ED	CB	A9	87	78	9A	BC	DE
Offset (binary)	Register (little endian)	Register (big endian)	Alignment
00	0x87A9CBED	0xEDCBA987	GOOD - ALIGNED
01	0xDEEDCBA9		BAD - UNALIGNED
10	0xEDCBBCDE		BAD - UNALIGNED
11	0xDEED9ABC		BAD - UNALIGNED

Little Endian Half-Word Size (16-bit)
Memory	ED	CB	A9	87	78	9A	BC	DE
Offset (binary)	Register (little endian)	Register (big endian)	Alignment
00	0xA987	0xEDCB	GOOD - ALIGNED
01	0xCBA9		BAD - UNALIGNED
10	0xEDCB	0xA987	GOOD - ALIGNED
11	0x87ED		BAD - UNALIGNED

Byte Size (8-bit)
Memory	ED	CB	A9	87	78	9A	BC	DE
Offset (binary)	Register (little endian)	Register (big endian)
00	0x87	0xED
01	0xA9	0xCB
10	0xCB	0xA9
11	0xED	0x87

Do not take the behavior shown above for unaligned accesses for granted. It is not supported: page 4-32 of the ARM7TDMI-S Data Sheet specifies that if bit 0 of the memory address is high on a half-word access, the result is unpredictable.
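The "bit 0 is high" rule generalizes to a simple mask test, sketched below. is_aligned is a hypothetical helper for illustration; it assumes the access size is a power of two (2 for a half-word, 4 for a word).

```c
#include <assert.h>
#include <stdint.h>

/* Return 1 if addr is a multiple of size, 0 otherwise.
   Assumption: size is a power of two, so size - 1 is a mask of the low bits. */
static int is_aligned(uintptr_t addr, unsigned size)
{
    assert(size != 0 && (size & (size - 1u)) == 0);
    return (addr & (uintptr_t)(size - 1u)) == 0;
}
```

A half-word access is aligned when is_aligned(addr, 2) holds, which is the same as bit 0 being low; a word access needs is_aligned(addr, 4), meaning bits 0 and 1 are both low.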

From reading the data sheet, it appears that in big endian mode the half-words would come back in the reverse order, which reads more naturally. Since I cannot actually test this at the moment, I am hoping I got it right.

A word access at an offset of 10b (0x2) may be predictable, but I am not sure because the data sheet does not say. It may reuse some of the mechanisms for loading half-words. (This needs someone to come through and correct it if it is wrong.)

Memory accesses have to be aligned because an unaligned access requires a second memory fetch, or at the very least takes x cycles when aligned and a different y cycles when unaligned. This follows from the way memory modules work, which you can check in the 256MBDDR data sheet linked below.

That is not the only problem. Supporting unaligned accesses also makes load and store operations more complex, which in turn increases the complexity of the processor core.
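To make the "second fetch" concrete, the sketch below counts how many aligned 32-bit words an access touches. It is a deliberately simplified model that assumes memory is fetched one aligned 4-byte word at a time; real DRAM timing is more involved. The name bus_words_touched is invented for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Number of aligned 32-bit words covered by an access of size bytes at addr.
   Assumption: memory is fetched in aligned 4-byte units. */
static unsigned bus_words_touched(uintptr_t addr, unsigned size)
{
    assert(size > 0);
    uintptr_t first = addr / 4;               /* word holding the first byte */
    uintptr_t last  = (addr + size - 1) / 4;  /* word holding the last byte */
    return (unsigned)(last - first + 1);
}
```

Under this model a word read at address 0 touches one word, while the same read at address 1 straddles two words and so needs two fetches. A half-word at address 2 still fits in one word, but a half-word at address 3 does not.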

You can perform an unaligned memory access on the ARM7TDMI-S, but you have to use extra instructions to check whether the address is aligned, and then use conditional instructions to adjust and load from two memory locations. I imagine GCC could emit code to do this, but it would make every memory access slower, so I suppose the default is to assume that all accesses fall on the appropriate boundary. The ARM7TDMI-S Data Sheet provides code on page 4-35 for accessing memory when you do not know whether the address will be aligned.
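The idea of checking the alignment and then combining two aligned loads can be sketched in C as follows. This is not the listing from the data sheet, only an illustration of the same approach; it assumes little endian byte numbering within each word, and load32_unaligned_le is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a 32-bit value starting at byte_off bytes past words[0], using only
   aligned word loads. Assumption: byte k of a word is bits 8k..8k+7
   (little endian numbering), and words[idx + 1] exists when it is needed. */
static uint32_t load32_unaligned_le(const uint32_t *words, size_t byte_off)
{
    assert(words != 0);
    size_t   idx   = byte_off / 4;                 /* word holding the first byte */
    unsigned shift = (unsigned)(byte_off % 4) * 8; /* bit offset inside that word */
    uint32_t lo    = words[idx];                   /* first aligned load */
    if (shift == 0)
        return lo;                                 /* aligned: one load is enough */
    uint32_t hi    = words[idx + 1];               /* second aligned load */
    return (lo >> shift) | (hi << (32 - shift));   /* splice the two pieces */
}
```

The extra test and the second load are exactly the overhead described above, which is why paying for them on every access would slow down code whose addresses are already aligned.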

The last revision I pulled was commit 217bfb445b54db618a30f3a39170bebd9fd9dbf2, and I did not find a big endian build target in it. However, here is a link to what is supposedly a patch:

http://lists.gnu.org/archive/html/qemu-devel/2004-12/msg00206.html

http://download.micron.com/pdf/datasheets/dram/ddr/256MBDDRx4x8x16.pdf
