Vlad Ioan Topan

My playground

Archive for March 2009

Translating headers: struct / record field alignment in C / Delphi

with 4 comments

If you’ve ever taken a shot at porting headers from C to Delphi, you may have noticed the different way in which the struct/record fields are aligned, but most likely you havent. And if you had the bad luck of stumbling over a struct that gets aligned differently when ported to Delphi, even though the data types were translated correctly, you’re probably interested in what happens behind the scenes and how to fix it. Well, here it goes.

Why variable addresses are aligned in memory

As explained in depth here, because of it’s internal structure, the time required by the CPU to access data from the RAM depends on the alignment of the address it requests (the “aligned” memory addresses being accessed faster). Aligned means that the address should be a multiple of the CPU’s memory word (not to be confused with the WORD data type in C/Delphi, which is always 2 bytes), which is the least amount of memory the CPU can get from RAM at one time. It’s also the size of the general purpose registers on the CPU, and it represents the “chunk” of data the CPU is most “comfortable” working with. On x86 architectures it’s 32 bits, which means 4 bytes (on 64 bit architectures it’s… well, 64 bits = 8 bytes). So, the memory addresses should be multiples of 4 on x86 systems for faster access.

Nitpicking corner: why is the WORD type 2 bytes long then, if the memory word on x86 is 4 bytes? Well, it’s not actually *all* the x86 architectures that have a 4 byte memory word, just the 386-and-higher ones (the aptly-named 32-bit architectures). But back when dinosaurs still roamed the Earth and “640K of memory were enough for anybody” (to paraphrase (?) one of the foremost computer visionaries), the computer word was just 16 bits, or 2 bytes, and that’s when the WORD data type was coined (without very much foresight, truth be told).

Struct field alignment in C

According to this paragraph on Wikipedia, struct member fields are aligned based on their size by padding them with, basically, their size in bytes. So, assuming we declare the structure:

typedef struct {
BYTE a;
WORD b;
BYTE c;
DWORD d;
} foo;

you might be tempted to assume it will take 8 bytes of memory. But, due to the afore-mentioned alignment, b will be aligned on a 2-byte boundary, and c on a 4-byte boundary, so that structure will actually be equivalent to this one:

typedef struct {
BYTE a;
BYTE padding_for_b[1]; // 1-byte padding to align .b
WORD b;
BYTE c;
BYTE padding_for_d[3]; // 3-byte padding to align .d
DWORD d;
} foo;

and take precisely 12 bytes. You can force C’s hand not to align the members using the #pragma pack(1) directive, but (a) the code will be slower and (b) you can’t do that to standard structures in the Windows API for obvious reasons.

Record field alignment in Delphi

As explained in the “align fields compiler directive” help entry, in Delphi the record field alignment is not based on the field’s data type size, but on a constant specified by the ALIGN (A) directive (which can also be set for the entire project from Project->Options->Compiler->Record field alignment.
The default is {$ALIGN 8} (short: {$A8}), which means alignment to an 8-byte boundary. However, that’s how things work only in theory. In practice, for some mysterious reason, Delphi does exactly what C does: unless the alignment is set to 1 (eg. by {$A1}), it aligns record fields based on their size.

The difference

But if Delphi behaves just like C, why do things sometimes still crash? They do because of one exception to the rule: enumerations are treated differently. In C, they are 4 bytes in size by default (this can be changed using the #pragma enum directive). In Delphi, the enumeration size is controlled by the $Z directive, and by default it’s set to 1. Therefore, in order to get the proper translation of API structures from C to Delphi, you should use the {$Z4} directive.

Advertisements

Written by vtopan

March 10, 2009 at 11:37 PM

Posted in Delphi

Tagged with , , , , ,

Converting C structs to Delphi

leave a comment »

In the process of writing a Process Explorer meets HijackThis one-size-fits-all monster app, I faced the rather tedious task of porting Windows SDK headers (some less documented than others) to Delphi. As expected, I ran into annoying gotchas; here are some of them.

Boolean type in C / Windows SDK

The C specs didn’t define a boolean type up until C99 (which introduced the bool data type), which led to a variety of “standards”. The SDK uses the BOOLEAN type, which is actually a BYTE, and should be mapped to the Boolean type in Delphi, not BOOL/LongBool (yes, I am well aware that this may be a “duh!” for some).

Other type mappings

UCHAR is unsigned char, not “unicode char” as some might expect, so it translates to BYTE, not WORD.

Struct to record mappings

See this post.

[to be continued]

Written by vtopan

March 10, 2009 at 2:49 AM

Posted in Delphi

Tagged with , , ,