Interview Question
Country: United States
I said that generally primitives have to be aligned to start on an address that is a multiple of the lesser of the size of the primitive and the machine alignment size. I'd like to explain why this is so.
Generally, machines read data only in chunks that are the machine alignment size and only at addresses that are multiples of the machine alignment size. When your alignment size is 4, any read of a lower power of 2, like a 2- byte or 1-byte datatype, results in the full 4-byte read that is then appropriately AND-masked or shifted to get just the part you want. When the address of a primitive is a multiple of the size of the datatype, any such read that involves shifting is guaranteed to need only the data from one 4-byte chunk. Let's say the readable 4-byte chunks are bytes 0-3 and bytes 4-7. If you positioned your 2-byte short int starting at byte 3 and extending into byte 4, the architecture would have to read both 4-byte locations and combine them with a lot of masking and shifting to get your short. But if you position your short at a multiple of its size, 2 (so address 0, 2, 4, or 6 here), the read can always be completed with just one read and a simple mask operation.
The architecture could allow you to do inefficient reads that require reading 2 locations and provide the wiring for that, but that would require extra logic and circuitry for something that's inefficient anyway. So most architectures require alignment and throw bus errors when you don't meet the alignment criteria.
When you have types > 4 bytes on 4-byte alignment machines, multiple locations have to be read anyway, and the large datatypes are required to be aligned to a multiple of the machine alignment size to ensure that, for instance, reading a double requires at most 2 reads and simple shifting instead of 3 reads and complex shifting (how many 4-byte blocks must be read to read a double that starts at byte 3? 0-3, 4-7, and 8-11.) Some architectures may have an 8-byte mechanism for reading these types and require alignment to 8-byte multiples for these types.
The best strategy for alignment? Align all types to addresses that are multiples of the lesser of their size or the maximum read size on a machine, which for current architectures is usually 8 bytes.
If it's an array of type T, we'll be accessing datatypes of type T, so it's whatever alignment we need to access type T. Arrays of bytes can start on any byte, shorts on any address divisible by 2, ints on every address divisble by 4, and so on. However, if the array was allocated using malloc, you'll always get something aligned to the alignment size from malloc. This is because 1) malloc has no idea what you will do with the memory it returns and wants to make sure that it's useable for anything you decide to use it for; and 2) to make the implementation of malloc a little more sane.
"However, if the array was allocated using malloc, you'll always get something aligned to the alignment size from malloc."
What I mean is that malloc always makes sure any addresses it returns are aligned to the largest alignment that could be required for anything. On most architectures, that's 8 bytes.
For example, it would be dumb if malloc returned you a pointer to address 333 when you need to use that memory for an array of ints, and so will have to start placing the ints at address 336 (since the ints must be aligned in multiples of 4). Then you'd have to have logic like this:
//get memory for 30 ints. Add in one extra int of space since
//part of the first int's space may be unuseable because it
//might not be aligned
void* mallocVal = malloc ((30+1)*sizeof(int));
int* arrayLocation = (int*) roundUpToNearestMultiple(mallocVal, 4);
//use the array
arrayLocation[0] = 3;
...
//you still need to remember the original non-aligned value to free!
free (mallocVal);
Luckily, malloc never returns space that could potentially be unaligned for your use, and you only need to do this:
//get memory for 30 ints. It's simple: ask for 30*sizeof(int) bytes
int* arrayLocation = (int*) malloc (30*sizeof(int));
//you're all set to use the array
arrayLocation[0] = 3;
...
//just use your existing array pointer to free the memory
free (mallocVal);
It should be 4 bytes, unless your system makes it a rule to align all structs to 8 bytes. char datatypes should generally be readable on any byte. chars can be read on any byte because data generally has to be aligned to the lesser of the size of the datatype in question and the word alignment size of the machine.
- eugene.yarovoi August 29, 2012