To follow on from dabcanboulet's excellent writeup on computer memory boundary alignment, I feel there are a couple of points that need clarifying:
Data type of an object may determine its alignment
This depends on the instruction set and on the compiler's underlying implementation of data types. For example, on the same machine a long may need to be long aligned, while a short only needs to be short aligned. There may also be special requirements for arrays of objects, especially the Pascal concept of packed arrays.
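As a rough illustration of per-type alignment, the following C11 sketch prints the size and required alignment of a few basic types. The helper name show_alignments and the SHOW macro are made up for this example, and the numbers printed depend entirely on the compiler and target.

```c
#include <stdalign.h>  /* alignof (C11) */
#include <stdio.h>

/* Print one line of size/alignment information for a type. */
#define SHOW(type) \
    printf("%-8s size %zu, alignment %zu\n", #type, sizeof(type), alignof(type))

/* Report the size and required alignment of a few basic types. */
void show_alignments(void)
{
    SHOW(char);
    SHOW(short);
    SHOW(long);
    SHOW(short[4]);  /* an array has the alignment of its element type */
}
```

Calling show_alignments() from main on two different platforms is a quick way to see how much these values can vary.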
Why is alignment important?
Alignment is high on the list of potential issues when it comes to programming language portability. Different machine architectures have different alignments, often related to the byte addressing scheme of the hardware.
Consider the following C structure:
struct foo {
    char blah;
    short bar;
    long quux;
};
If we have a struct foo at address 0x00054168, then on a byte aligned machine, such as the VAX, it occupies 7 bytes, with .bar at 0x00054169 and .quux at 0x0005416b (assuming a 2-byte short and a 4-byte long).
On a machine where shorts are aligned to shorts, and longs to longs, the structure is instead 8 bytes long, with .bar at 0x0005416a and .quux at 0x0005416c.
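To see which layout a given compiler actually chooses, the offsets can be printed with offsetof. This is only a sketch: print_foo_layout is a name invented here, and the figures will not necessarily match the 32-bit ones above (on a typical 64-bit Unix system, for instance, long is 8 bytes and the structure is larger).

```c
#include <stddef.h>  /* offsetof, size_t */
#include <stdio.h>

struct foo {
    char  blah;
    short bar;
    long  quux;
};

/* Print the size of struct foo and the offset of each member. */
void print_foo_layout(void)
{
    printf("sizeof(struct foo) = %zu\n", sizeof(struct foo));
    printf(".blah at offset %zu\n", offsetof(struct foo, blah));
    printf(".bar  at offset %zu\n", offsetof(struct foo, bar));
    printf(".quux at offset %zu\n", offsetof(struct foo, quux));
}
```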
Consider an RPC transport sending a struct foo between these machines, or a database storing it as a blob. Neither machine will be able to read the other's structures correctly.
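One common way around this is never to send the in-memory structure at all, but to write each member into a byte buffer in an agreed format. The sketch below assumes a hypothetical wire format of 1 + 2 + 4 bytes in big-endian order, and that the values fit in 16 and 32 bits; the names foo_encode, foo_decode, to_signed and FOO_WIRE_SIZE are invented for this example.

```c
#include <assert.h>   /* static_assert (C11) */
#include <limits.h>   /* CHAR_BIT */

static_assert(CHAR_BIT == 8, "wire format assumes 8-bit bytes");

struct foo {
    char  blah;
    short bar;
    long  quux;
};

/* Hypothetical wire format: 1 + 2 + 4 bytes, big-endian, no padding. */
enum { FOO_WIRE_SIZE = 7 };

/* Interpret the low `bits` bits of u as a two's-complement signed value. */
static long to_signed(unsigned long u, unsigned bits)
{
    unsigned long sign = 1UL << (bits - 1);
    unsigned long mask = sign | (sign - 1);
    u &= mask;
    return (u & sign) ? -(long)(mask - u) - 1 : (long)u;
}

/* Write f into out, one byte at a time, independent of host layout. */
void foo_encode(const struct foo *f, unsigned char out[FOO_WIRE_SIZE])
{
    /* Signed-to-unsigned conversion is well defined (modulo 2^N). */
    unsigned long bar  = (unsigned long)f->bar;
    unsigned long quux = (unsigned long)f->quux;

    out[0] = (unsigned char)f->blah;
    out[1] = (unsigned char)((bar >> 8) & 0xFF);
    out[2] = (unsigned char)(bar & 0xFF);
    out[3] = (unsigned char)((quux >> 24) & 0xFF);
    out[4] = (unsigned char)((quux >> 16) & 0xFF);
    out[5] = (unsigned char)((quux >> 8) & 0xFF);
    out[6] = (unsigned char)(quux & 0xFF);
}

/* Rebuild a struct foo from the wire bytes. */
void foo_decode(const unsigned char in[FOO_WIRE_SIZE], struct foo *f)
{
    f->blah = (char)to_signed(in[0], 8);
    f->bar  = (short)to_signed(((unsigned long)in[1] << 8) | in[2], 16);
    f->quux = to_signed(((unsigned long)in[3] << 24) |
                        ((unsigned long)in[4] << 16) |
                        ((unsigned long)in[5] << 8)  | in[6], 32);
}
```

Because the buffer is built byte by byte, neither the padding nor the byte order of the sending machine leaks into the data.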
Unaligned data also carries a speed penalty, as it takes more instructions and CPU cycles to access. This is because a memory access uses the full width of the bus (32 or 64 bits), and is by its very nature aligned to that size, owing to the way that memory works. If an unaligned data item spans more than one bus-sized memory location, two fetch or store operations are needed.
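In C, the safe way to read a value from an address that may not be suitably aligned is to copy it out with memcpy rather than dereference a cast pointer, which is undefined behaviour and can trap on some architectures. A minimal sketch, with load_u32 being a name chosen for this example:

```c
#include <stdint.h>  /* uint32_t */
#include <string.h>  /* memcpy */

/* Read a 32-bit value (in host byte order) from any address, aligned or not. */
uint32_t load_u32(const unsigned char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);  /* the compiler picks a suitable access sequence */
    return v;
}
```

How expensive this is depends on the target: some compilers reduce it to a single load where the hardware tolerates unaligned access, and emit several narrower accesses where it does not.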
In some cases, the machine architecture can offer a choice of different alignment schemes. For example, the DEC Alpha natively has instructions that work on aligned data types, but can also work on byte aligned (unaligned) data for compatibility with the VAX. This was very important for the porting of VMS to the Alpha platform, as many of VMS's internal data structures and control blocks contain unaligned data. The DECC compiler has a pragma to control this.
#pragma nomember_alignment
struct foo {
    char blah;
    short bar;
    long quux;
};
#pragma member_alignment
This is how you would declare a struct foo that is laid out as it would be on a VAX.
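Other compilers offer comparable controls under different names. As an illustration only, GCC, Clang and MSVC accept #pragma pack, which is not part of ISO C. The struct name foo_packed is invented here, and the static assertions make the build fail if the compiler does not produce the packed layout.

```c
#include <assert.h>  /* static_assert (C11) */
#include <stddef.h>  /* offsetof */

#pragma pack(push, 1)   /* no padding between members */
struct foo_packed {
    char  blah;
    short bar;
    long  quux;
};
#pragma pack(pop)       /* restore the previous packing */

/* Fail at compile time if the layout is not the byte aligned one. */
static_assert(offsetof(struct foo_packed, bar) == 1,
              "bar should directly follow blah");
static_assert(sizeof(struct foo_packed) ==
              sizeof(char) + sizeof(short) + sizeof(long),
              "struct should contain no padding");
```

Members of such a structure may be misaligned, so taking their addresses and passing them around as ordinary pointers is best avoided.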
The problems of alignment can be exacerbated by bad programming practices which rely on structures being implemented in particular ways.
To write portable C code, you must stick to the rules, even though C may let you get away with breaking them on a particular platform. Many other programming languages do not allow you to take any liberties in the first place.