Copyright (C) 2000, 2003 Free Software Foundation, Inc.

This file is intended to contain a few notes about writing C code
within GCC so that it compiles without error on the full range of
compilers GCC needs to be able to compile on.

The problem is that many ISO-standard constructs are not accepted by
either old or buggy compilers, and we keep getting bitten by them.
This knowledge until now has been sparsely spread around, so I
thought I'd collect it in one useful place.  Please add and correct
any problems as you come across them.

I'm going to start from a base of the ISO C90 standard, since that is
probably what most people code to naturally.  Obviously using
constructs introduced after that is not a good idea.

For the complete coding style conventions used in GCC, please read
http://gcc.gnu.org/codingconventions.html


String literals
---------------

Irix6 "cc -n32" and OSF4 "cc" have problems with constant string
initializers with parens around them, e.g.

    const char string[] = ("A string");

This is unfortunate since this is what the GNU gettext macro N_
produces.  You need to find a different way to code it.

Some compilers like MSVC++ have fairly low limits on the maximum
length of a string literal; 509 is the lowest we've come across.  You
may need to break up a long printf statement into many smaller ones.
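
One way to keep every literal short is sketched below.  It is only an
illustration: the names usage_lines and print_usage and the help text
are made up for this example and do not come from GCC.  Each piece is
its own short literal and is written with a separate fputs call, so no
single literal approaches the 509-character limit.

```c
#include <stdio.h>

/* Hypothetical help text, split into short literals so that no single
   string literal comes near the 509-character limit.  */
static const char *const usage_lines[] = {
  "Usage: prog [options] file...\n",
  "  -o FILE   write output to FILE\n",
  "  -v        print version information\n",
};

/* Write each piece with its own call and return the number of pieces
   written.  */
static int
print_usage (FILE *stream)
{
  size_t i;
  int count = 0;

  for (i = 0; i < sizeof usage_lines / sizeof usage_lines[0]; i++)
    {
      fputs (usage_lines[i], stream);
      count++;
    }
  return count;
}
```

Calling print_usage (stdout) writes the pieces in order.  Since the
initializers are bare literals, this also avoids the parenthesized
initializer problem described above.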


Empty macro arguments
---------------------

ISO C (6.8.3 in the 1990 standard) specifies the following:

    If (before argument substitution) any argument consists of no
    preprocessing tokens, the behavior is undefined.

This was relaxed by ISO C99, but some older compilers emit an error,
so code like

    #define foo(x, y) x y
    foo (bar, )

needs to be coded in some other way.


free and realloc
----------------

Some implementations crash upon attempts to free or realloc the null
pointer.  Thus if mem might be null, you need to write

    if (mem)
      free (mem);


Trigraphs
---------

You weren't going to use them anyway, but some otherwise ISO C
compliant compilers do not accept trigraphs.


Suffixes on Integer Constants
-----------------------------

You should never use a 'l' suffix on integer constants ('L' is fine),
since it can easily be confused with the number '1'.


                        Common Coding Pitfalls
                        ======================

errno
-----

errno might be declared as a macro.


Implicit int
------------

In C, the 'int' keyword can often be omitted from type declarations.
For instance, you can write

    unsigned variable;

as shorthand for

    unsigned int variable;

There are several places where this can cause trouble.  First, suppose
'variable' is a long; then you might think

    (unsigned) variable

would convert it to unsigned long.  It does not.  It converts to
unsigned int.  This mostly causes problems on 64-bit platforms, where
long and int are not the same size.

Second, if you write a function definition with no return type at
all:

    operate (int a, int b)
    {
      ...
    }

that function is expected to return int, *not* void.  GCC will warn
about this.

Implicit function declarations always have return type int.  So if you
correct the above definition to

    void
    operate (int a, int b)
    ...

but operate() is called above its definition, you will get an error
about a "type mismatch with previous implicit declaration".  The cure
is to prototype all functions at the top of the file, or in an
appropriate header.

Char vs unsigned char vs int
----------------------------

In C, unqualified 'char' may be either signed or unsigned; it is the
implementation's choice.  When you are processing 7-bit ASCII, it does
not matter.
But when your program must handle arbitrary binary data,
or fully 8-bit character sets, you have a problem.  The most obvious
issue is if you have a look-up table indexed by characters.

For instance, the character '\341' in ISO Latin 1 is SMALL LETTER A
WITH ACUTE ACCENT.  In the proper locale, isalpha('\341') will be
true.  But if you read '\341' from a file and store it in a plain
char, isalpha(c) may look up character 225, or it may look up
character -31.  And the ctype table has no entry at offset -31, so
your program will crash.  (If you're lucky.)

It is wise to use unsigned char everywhere you possibly can.  This
avoids all these problems.  Unfortunately, the routines in <string.h>
take plain char arguments, so you have to remember to cast them back
and forth - or avoid the use of strxxx() functions, which is probably
a good idea anyway.

Another common mistake is to use either char or unsigned char to
receive the result of getc() or related stdio functions.  They may
return EOF, which is outside the range of values representable by
char.  If you use char, some legal character value may be confused
with EOF, such as '\377' (SMALL LETTER Y WITH UMLAUT, in Latin-1).
The correct choice is int.

A more subtle version of the same mistake might look like this:

    unsigned char pushback[NPUSHBACK];
    int pbidx;
    #define unget(c) (assert(pbidx < NPUSHBACK), pushback[pbidx++] = (c))
    #define get(c) (pbidx ? pushback[--pbidx] : getchar())
    ...
    unget(EOF);

which will mysteriously turn a pushed-back EOF into a SMALL LETTER Y
WITH UMLAUT.


Other common pitfalls
---------------------

o Expecting 'plain' char to be either sign extending or zero extending.

o Shifting an item by a negative amount or by greater than or equal to
  the number of bits in a type (expecting shifts by 32 to be sensible
  has caused quite a number of bugs at least in the early days).

o Expecting ints shifted right to be sign extended.

o Modifying the same value twice between two sequence points.

o Host vs. target floating point representation, including emitting NaNs
  and Infinities in a form that the assembler handles.

o qsort being an unstable sort function (unstable in the sense that
  multiple items that sort the same may be sorted in different orders
  by different qsort functions).

o Passing incorrect types to fprintf and friends.

o Adding a declaration for a function defined in another file to
  a .c file instead of to a .h file.
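
A few of the pitfalls above can be sidestepped with small helpers.  The
following is only a sketch written in C90; the names is_alpha_byte,
shift_left, struct item, compare_items and sort_items are invented for
this example and are not GCC functions.  It shows a ctype lookup that
is safe for plain char, a shift that stays defined for any count, and a
qsort comparison that breaks ties on the original position so that the
result does not depend on the host's qsort.

```c
#include <ctype.h>
#include <limits.h>
#include <stdlib.h>

/* Classify a byte held in a plain char.  The cast guarantees a
   non-negative argument whether plain char is signed or unsigned.  */
static int
is_alpha_byte (char c)
{
  return isalpha ((unsigned char) c) != 0;
}

/* Left shift that is defined for any count: shifting by the width of
   the type or more yields 0 instead of undefined behavior.  */
static unsigned long
shift_left (unsigned long value, unsigned int count)
{
  if (count >= sizeof (unsigned long) * CHAR_BIT)
    return 0;
  return value << count;
}

/* Hypothetical record type.  'seq' holds the original position so that
   items with equal keys keep their input order under any qsort.  */
struct item
{
  int key;
  size_t seq;
};

/* Order by key first, then by original position.  */
static int
compare_items (const void *pa, const void *pb)
{
  const struct item *a = (const struct item *) pa;
  const struct item *b = (const struct item *) pb;

  if (a->key != b->key)
    return a->key < b->key ? -1 : 1;
  if (a->seq != b->seq)
    return a->seq < b->seq ? -1 : 1;
  return 0;
}

/* Record each item's position, then sort.  */
static void
sort_items (struct item *items, size_t n)
{
  size_t i;

  for (i = 0; i < n; i++)
    items[i].seq = i;
  qsort (items, n, sizeof items[0], compare_items);
}
```

The tie-breaker costs one extra field per item; in exchange the sorted
order is the same on every host, which matters when the order affects
generated output.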