Johan Bezem ist Mitglied bei
Archive for March, 2009
Why 32768 isn’t always the same as 0x8000
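Contrary to intuition, the C constants ‘32768’ and ‘0x8000’ have an
identical representation (0x8000), but possibly different types in C.
If you consider a processor with a 16-bit int type and a 32-bit long
type, 32768 has type ‘long’, whereas 0x8000 (and the octal variant
0100000) has type ‘unsigned int’. The reason is that an unsuffixed decimal
constant only ever takes a signed type (the first of int, long, ... that can
hold the value), while hexadecimal and octal constants may also take the
unsigned type of the same width.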
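Most desktop compilers use a 32-bit int, so the 16-bit case cannot be reproduced there directly. As a rough sketch, the same rule can be observed one size up, at 2147483648 versus 0x80000000. The macro TYPE_NAME and the two functions below are names made up for this illustration, and the code assumes a C11 compiler (for _Generic) with a 32-bit int:

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

/* Assumption of this sketch: int has 32 bits, so both constants exceed INT_MAX. */
static_assert(INT_MAX == 2147483647, "sketch assumes a 32-bit int");

/* Map an expression to the name of its type (C11 _Generic). */
#define TYPE_NAME(x) _Generic((x),              \
    int: "int",                                 \
    unsigned int: "unsigned int",               \
    long: "long",                               \
    unsigned long: "unsigned long",             \
    long long: "long long",                     \
    unsigned long long: "unsigned long long",   \
    default: "other")

/* Same value, two spellings. */
const char *type_of_decimal(void) { return TYPE_NAME(2147483648); }
const char *type_of_hex(void)     { return TYPE_NAME(0x80000000); }

/* Returns 1 when the two spellings end up with different types. */
int types_differ(void)
{
    return strcmp(type_of_decimal(), type_of_hex()) != 0;
}
```

Printing the two strings with printf("%s\n", ...) shows the hexadecimal constant as unsigned int, while the decimal one is long or long long, depending on the width of long on the platform.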
If you feel the need, check the C standard at
http://www.open-std.org/JTC1/SC22/WG14/www/docs/n1256.pdf, page 55 at the
bottom, and the table on page 56 (or see Harbison & Steele, 5th edition,
section 2.7.1, page 24ff).
Normally, there is no problem when using such values, since the
representations are identical.
However, consider this small example:
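    #define C_DECIMAL     32768   /* decimal: type long when int has 16 bits */
    #define C_HEXADECIMAL 0x8000  /* hexadecimal: type unsigned int when int has 16 bits */

    /* void main is what the embedded compiler used below accepts */
    void main(int argc, char *argv[])
    {
        /* volatile keeps the compiler from optimizing the stores away */
        volatile long long_dec = ((long)~C_DECIMAL);
        volatile long long_hex = ((long)~C_HEXADECIMAL);
        return;
    }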
When C_DECIMAL is considered long, the bitwise complement (~) will invert
32 bits, resulting in a representation 0xFFFF7FFF with type ‘long’; the cast
is superfluous.
When C_HEXADECIMAL is considered ‘unsigned int’, the complement will invert
only 16 bits, resulting in a representation 0x7FFF with type ‘unsigned int’;
the cast will then zero-extend to a ‘long’ value of 0x00007FFF.
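The same effect can be sketched on a compiler with a 32-bit int by moving the constants one size up. The function names below are invented for this illustration, and the code assumes a C11 compiler with a 32-bit int and a 64-bit long long; it is not the 16-bit example itself, only its analogue:

```c
#include <assert.h>
#include <limits.h>

/* Assumption of this sketch: int has 32 bits. */
static_assert(INT_MAX == 2147483647, "sketch assumes a 32-bit int");

/* Decimal constant: already a 64-bit signed type, so ~ inverts 64 bits. */
long long complement_decimal(void)
{
    return (long long)~2147483648;
}

/* Hex constant: unsigned int, so ~ inverts 32 bits and the cast zero-extends. */
long long complement_hex(void)
{
    return (long long)~0x80000000;
}

/* Possible remedy: widen first, then complement. */
long long complement_hex_widened(void)
{
    return ~(long long)0x80000000;
}
```

Here complement_decimal() and complement_hex_widened() both yield -2147483649, whereas complement_hex() yields 2147483647. Widening before the complement (or using a suitable suffix on the constant) is one way to make both spellings behave alike.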
Checking with a 16-bit integer compiler (CW7.1 ColdFire using ‘-intsize 2’):
    0x00000000                   _main:   ; main:
    0x00000000  0x4E560000       link     a6,#0
    0x00000004  0x518F           subq.l   #8,a7
    0x00000006  0x223CFFFF7FFF   move.l   #-32769,d1
    0x0000000C  0x2D41FFF8       move.l   d1,-8(a6)
    0x00000010  0x223C00007FFF   move.l   #32767,d1
    0x00000016  0x2D41FFFC       move.l   d1,-4(a6)
    0x0000001A  0x4E5E           unlk     a6
    0x0000001C  0x4E75           rts
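For those of you who do not know how to read assembler code: the values that differ are the two immediate operands, #-32769 (0xFFFF7FFF) stored into long_dec and #32767 (0x00007FFF) stored into long_hex. So the compiler confirms the difference in behavior, and this is not a compiler error.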
Lucky you if you have Lint to warn you. (Yes, I know, other tools will too, if you let them…)
Happy coding!
Use of in-line assembler using ‘asm’ variants
Just a “small” post to expand on my most recent tweet. In-line assembler is quite a difficult topic, but unavoidable in most embedded environments. And the syntactic variants are more numerous than the bugs in your code.
For that reason I give a piece of advice: For each different assembler semantic, use a different macro. For an assembler function, use ASM_FN like this:

    ASM_FN int MyAssemblerFunction(void)
    {
        ...assembler instructions...
    }

For an in-line assembler block, use ASM_BLOCK like this:

    ...
    a += 4;
    ASM_BLOCK volatile {
        ...assembler instructions...
    };

And for single-line in-line assembler instructions use ASM_LINE like this:

    ...
    a += 4;
    ASM_LINE movb ax,0b00000010;
    ...

Now, all these uses will need to expand into the non-standard keyword asm for the compiler to process everything correctly. Many compilers accept different forms of the keyword, so you may use asm, __asm and __asm__ interchangeably. If you want to, you can take these three variants instead of the ASM_(FN|BLOCK|LINE) as I suggested.
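For the compiler, the three macros could simply be defined in a shared header. The following is only a sketch: it assumes a compiler that spells the keyword asm, and the ASM_STRINGIFY helper and the asm_macros_agree function are additions for this illustration (they allow a quick check of what the macros currently expand to):

```c
#include <assert.h>
#include <string.h>

/* One macro per syntactic use of the keyword; all expand to asm for the compiler. */
#define ASM_FN    asm   /* whole function written in assembler */
#define ASM_BLOCK asm   /* braced assembler block inside a C function */
#define ASM_LINE  asm   /* single instruction terminated by a semicolon */

/* Helper (assumption of this sketch): turn the expansion of a macro into a string. */
#define ASM_STRINGIFY_(x) #x
#define ASM_STRINGIFY(x)  ASM_STRINGIFY_(x)

/* Compile-time guard: ASM_FN must not expand to nothing. */
static_assert(sizeof(ASM_STRINGIFY(ASM_FN)) > 1, "ASM_FN expands to nothing");

/* Returns 1 when all three macros expand to the same keyword. */
int asm_macros_agree(void)
{
    return strcmp(ASM_STRINGIFY(ASM_FN), ASM_STRINGIFY(ASM_BLOCK)) == 0
        && strcmp(ASM_STRINGIFY(ASM_BLOCK), ASM_STRINGIFY(ASM_LINE)) == 0;
}
```

Keeping three names for one keyword looks redundant to the compiler, but it leaves room to give each name a different meaning in another tool.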
The idea is to enable Lint to expand each of these three forms differently: The function containing only assembler instructions shall be ignored by Lint, but its prototype needs to be known. Therefore we need to enable the Lint keyword _ignore_init (the body of the function is seen as a form of “initialization”), and provide the options:

    +rw(_ignore_init)
    +dASM_FN=_ignore_init

The plus-sign in the second option prevents the definition in our code (to asm or one of its variants) from overriding the Lint-specific definition. However, for ASM_BLOCK this replacement will not work, so we need a different replacement:

    +rw(_up_to_brackets)
    +dASM_BLOCK=_up_to_brackets

And the third form again needs a different replacement, since in this case no brackets need be present at all:

    +rw(_to_semi)
    +dASM_LINE=_to_semi

With some other form of in-line assembler definition you might even need _to_eol, or one of the other gobblers. But make sure to use different macros for different syntactic usages of asm, so you have the chance to use different gobblers for all situations.
Happy Linting!