Copyright (C) 2000, 2003 Free Software Foundation, Inc.

This file is intended to contain a few notes about writing C code
within GCC so that it compiles without error on the full range of
compilers GCC needs to be able to compile on.

The problem is that many ISO-standard constructs are not accepted by
either old or buggy compilers, and we keep getting bitten by them.
Until now this knowledge has been thinly spread around, so I
thought I'd collect it in one useful place.  Please add and correct
any problems as you come across them.

I'm going to start from a base of the ISO C90 standard, since that is
probably what most people code to naturally.  Obviously using
constructs introduced after that is not a good idea.

For the complete coding style conventions used in GCC, please read
http://gcc.gnu.org/codingconventions.html


String literals
---------------

Irix6 "cc -n32" and OSF4 "cc" have problems with constant string
initializers that have parentheses around them, e.g.

const char string[] = ("A string");

This is unfortunate since this is what the GNU gettext macro N_
produces.  You need to find a different way to code it.
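
One possible way around this, sketched here on the assumption that you
control the marker macro, is to define it so that it expands to its
bare argument.  MARK_ and greeting_length are hypothetical names used
only for this illustration; they are not part of GCC or gettext.

```c
#include <string.h>

/* Hypothetical marker macro.  It expands to its bare argument, so the
   initializer below carries no parentheses.  */
#define MARK_(msgid) msgid

static const char greeting[] = MARK_ ("A string");

/* Return the length of the marked string, to show that it is an
   ordinary array initialized from a string literal.  */
size_t
greeting_length (void)
{
  return strlen (greeting);
}
```

A tool that extracts translatable strings would still need to be told
about the marker name; that part is outside the scope of this sketch.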

Some compilers like MSVC++ have fairly low limits on the maximum
length of a string literal; 509 is the lowest we've come across.  You
may need to break up a long printf statement into many smaller ones.
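
A minimal sketch of one way to stay under such limits is to keep each
literal short and emit the pieces one at a time.  The names usage_text
and print_usage, and the message text itself, are made up for this
example.

```c
#include <stdio.h>

/* Each element is a separate, short string literal, so no single
   literal comes near the 509-character limit.  The text is invented
   for this example.  */
static const char *const usage_text[] =
{
  "Usage: example [options] file...\n",
  "  -h    Display this information\n",
  "  -v    Display version information\n"
};

/* Write every piece to OUT and return the number of pieces written,
   or -1 if a write fails.  */
int
print_usage (FILE *out)
{
  int i;
  int count = (int) (sizeof usage_text / sizeof usage_text[0]);

  for (i = 0; i < count; i++)
    if (fputs (usage_text[i], out) == EOF)
      return -1;
  return count;
}
```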


Empty macro arguments
---------------------

ISO C (6.8.3 in the 1990 standard) specifies the following:

If (before argument substitution) any argument consists of no
preprocessing tokens, the behavior is undefined.

This was relaxed by ISO C99, but some older compilers emit an error,
so code like

#define foo(x, y) x y
foo (bar, )

needs to be coded in some other way.
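
One way this might be coded, shown as a sketch rather than a
recommendation, is to pass a macro that expands to nothing.  The
argument is then the single token EMPTY, so it is not empty before
argument substitution.  EMPTY, DECLARE, plain_value and fixed_value
are hypothetical names chosen for this example.

```c
/* Hypothetical helper: a macro that expands to nothing.  */
#define EMPTY

/* Hypothetical macro in the spirit of foo above: it pastes an
   optional qualifier in front of a type.  */
#define DECLARE(qualifier, type) qualifier type

/* Expands to: int plain_value = 1;  */
DECLARE (EMPTY, int) plain_value = 1;

/* Expands to: const int fixed_value = 2;  */
DECLARE (const, int) fixed_value = 2;
```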


free and realloc
----------------

Some implementations crash upon attempts to free or realloc the null
pointer.  Thus if mem might be null, you need to write

  if (mem)
    free (mem);
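
The same guard can be applied to realloc.  The sketch below assumes
that falling back to malloc is the behavior you want when the pointer
is null; checked_free and checked_realloc are hypothetical names, not
functions that GCC provides.

```c
#include <stdlib.h>

/* Release MEM, doing nothing when MEM is null.  */
void
checked_free (void *mem)
{
  if (mem)
    free (mem);
}

/* Resize MEM to SIZE bytes.  A null MEM is never handed to realloc;
   malloc is called instead.  */
void *
checked_realloc (void *mem, size_t size)
{
  return mem ? realloc (mem, size) : malloc (size);
}
```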


Trigraphs
---------

You weren't going to use them anyway, but some otherwise ISO C
compliant compilers do not accept trigraphs.


Suffixes on Integer Constants
-----------------------------

You should never use an 'l' suffix on integer constants ('L' is fine),
since it can easily be confused with the digit '1'.


			Common Coding Pitfalls
			======================

errno
-----

errno might be declared as a macro, so never declare it yourself (for
instance with 'extern int errno;'); include <errno.h> instead.
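
As an illustration, the sketch below gets errno only from <errno.h>
and never declares it by hand, so it works whether errno is a macro or
a plain variable.  parse_long is a hypothetical helper written for
this example.

```c
#include <errno.h>
#include <stdlib.h>

/* Parse TEXT as a decimal long and store it in *RESULT.  Return 1 on
   success and 0 if the value is out of range or no digits were
   found.  */
int
parse_long (const char *text, long *result)
{
  char *end;
  long value;

  errno = 0;            /* valid whether errno is a macro or not */
  value = strtol (text, &end, 10);
  if (errno == ERANGE || end == text)
    return 0;
  *result = value;
  return 1;
}
```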


Implicit int
------------

In C, the 'int' keyword can often be omitted from type declarations.
For instance, you can write

  unsigned variable;

as shorthand for

  unsigned int variable;

There are several places where this can cause trouble.  First, suppose
'variable' is a long; then you might think

  (unsigned) variable

would convert it to unsigned long.  It does not.  It converts to
unsigned int.  This mostly causes problems on 64-bit platforms, where
long and int are not the same size.
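
The difference can be seen in a small sketch.  convert_narrow,
convert_full and narrowing_loses_bits are hypothetical names; whether
any bits are actually lost depends on the sizes of int and long on the
platform in question.

```c
#include <limits.h>

/* The cast names only 'unsigned', which means unsigned int, so the
   value is narrowed to unsigned int before being widened again for
   the return.  */
unsigned long
convert_narrow (long value)
{
  return (unsigned) value;
}

/* Spelling the type out in full keeps every bit of the long.  */
unsigned long
convert_full (long value)
{
  return (unsigned long) value;
}

/* Return nonzero if the short spelling changes LONG_MAX, which is
   what happens when long is wider than int.  */
int
narrowing_loses_bits (void)
{
  return convert_narrow (LONG_MAX) != convert_full (LONG_MAX);
}
```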

Second, if you write a function definition with no return type at
all:

  operate (int a, int b)
  {
    ...
  }

that function is expected to return int, *not* void.  GCC will warn
about this.

Implicit function declarations always have return type int.  So if you
correct the above definition to

  void
  operate (int a, int b)
  ...

but operate() is called above its definition, you will get an error
about a "type mismatch with previous implicit declaration".  The cure
is to prototype all functions at the top of the file, or in an
appropriate header.
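
A short sketch of the cure, with the prototype placed ahead of the
first call.  The body of operate, and the names run_example and
last_result, are invented for this illustration.

```c
/* Prototype first, so that every call below sees the real return
   type instead of an implicit 'int'.  */
static void operate (int a, int b);

/* Invented for this example: records what operate computed.  */
static int last_result;

int
run_example (void)
{
  operate (2, 3);       /* called above the definition */
  return last_result;
}

static void
operate (int a, int b)
{
  last_result = a + b;
}
```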

Char vs unsigned char vs int
----------------------------

In C, unqualified 'char' may be either signed or unsigned; it is the
implementation's choice.  When you are processing 7-bit ASCII, it does
not matter.  But when your program must handle arbitrary binary data,
or fully 8-bit character sets, you have a problem.  The most obvious
issue is if you have a look-up table indexed by characters.

For instance, the character '\341' in ISO Latin 1 is SMALL LETTER A
WITH ACUTE ACCENT.  In the proper locale, isalpha('\341') will be
true.  But if you read '\341' from a file and store it in a plain
char, isalpha(c) may look up character 225, or it may look up
character -31.  And the ctype table has no entry at offset -31, so
your program will crash.  (If you're lucky.)

It is wise to use unsigned char everywhere you possibly can.  This
avoids all these problems.  Unfortunately, the routines in <string.h>
take plain char arguments, so you have to remember to cast them back
and forth - or avoid the use of strxxx() functions, which is probably
a good idea anyway.
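
As a sketch of the casting involved, the function below accepts a
plain char string but walks it through an unsigned char pointer, so
that isalpha never sees a negative value.  count_alpha is a
hypothetical helper, and its result for characters above 127 depends
on the current locale.

```c
#include <ctype.h>

/* Count the alphabetic characters in TEXT.  The bytes are read as
   unsigned char, so a byte such as '\341' reaches isalpha as 225
   and never as -31.  */
int
count_alpha (const char *text)
{
  const unsigned char *p = (const unsigned char *) text;
  int count = 0;

  for (; *p != '\0'; p++)
    if (isalpha (*p))
      count++;
  return count;
}
```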

Another common mistake is to use either char or unsigned char to
receive the result of getc() or related stdio functions.  They may
return EOF, which is outside the range of values representable by
char.  If you use char, some legal character value may be confused
with EOF, such as '\377' (SMALL LETTER Y WITH UMLAUT, in Latin-1).
If you use unsigned char, EOF is converted to an ordinary character
value and the comparison with EOF never succeeds.
The correct choice is int.
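
A minimal sketch of the usual reading loop with an int variable.
count_bytes is a hypothetical helper used only to make the loop
self-contained.

```c
#include <stdio.h>

/* Count the bytes remaining in STREAM.  The result of getc is kept
   in an int, so the byte '\377' and EOF remain distinct.  */
long
count_bytes (FILE *stream)
{
  int c;
  long count = 0;

  while ((c = getc (stream)) != EOF)
    count++;
  return count;
}
```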

A more subtle version of the same mistake might look like this:

  unsigned char pushback[NPUSHBACK];
  int pbidx;
  #define unget(c) (assert(pbidx < NPUSHBACK), pushback[pbidx++] = (c))
  #define get(c) (pbidx ? pushback[--pbidx] : getchar())
  ...
  unget(EOF);

which will mysteriously turn a pushed-back EOF into a SMALL LETTER Y
WITH UMLAUT.
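
One way to repair the fragment, offered as a sketch, is to give the
buffer elements type int so that EOF survives the round trip.  The
value chosen for NPUSHBACK is arbitrary, and get is written here
without a parameter so that it can be called with no argument.

```c
#include <assert.h>
#include <stdio.h>

#define NPUSHBACK 8     /* arbitrary size for this example */

/* int elements can hold EOF as well as every character value.  */
static int pushback[NPUSHBACK];
static int pbidx;

#define unget(c) (assert (pbidx < NPUSHBACK), pushback[pbidx++] = (c))
#define get() (pbidx ? pushback[--pbidx] : getchar ())
```

With this change, a call to unget (EOF) followed by get () yields EOF
again.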


Other common pitfalls
---------------------

o Expecting 'plain' char to be either sign extending or zero extending.

o Shifting an item by a negative amount or by greater than or equal to
  the number of bits in a type (expecting shifts by 32 to be sensible
  has caused quite a number of bugs at least in the early days).

o Expecting ints shifted right to be sign extended.

o Modifying the same value twice between two sequence points.

o Host vs. target floating point representation, including emitting NaNs
  and Infinities in a form that the assembler handles.

o qsort being an unstable sort function (unstable in the sense that
  multiple items that sort the same may be sorted in different orders
  by different qsort functions).

o Passing incorrect types to fprintf and friends.

o Adding a declaration for a function defined in another file to
  a .c file instead of to a .h file.
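
For the qsort item in the list above, one common remedy is to make the
comparison function break ties itself, so that the result no longer
depends on which qsort is in use.  The sketch below assumes you are
free to add a field recording the original position; struct item,
compare_items and sort_items are hypothetical names.

```c
#include <stdlib.h>

struct item
{
  int key;      /* the value to sort by */
  int seq;      /* original position, used only to break ties */
  char tag;     /* payload, here just a label */
};

/* Order by key; items with equal keys keep their original order.  */
static int
compare_items (const void *a, const void *b)
{
  const struct item *x = (const struct item *) a;
  const struct item *y = (const struct item *) b;

  if (x->key != y->key)
    return x->key < y->key ? -1 : 1;
  return (x->seq > y->seq) - (x->seq < y->seq);
}

/* Sort the N elements of ITEMS.  Because no two elements compare
   equal, every qsort implementation produces the same order.  */
void
sort_items (struct item *items, size_t n)
{
  size_t i;

  for (i = 0; i < n; i++)
    items[i].seq = (int) i;
  qsort (items, n, sizeof items[0], compare_items);
}
```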