Writing Portable Programs · Translation-Time Issues · Character-Set Issues · Representation Issues · Expression-Evaluation Issues · Library Issues · Converting to Standard C · Function-Call Issues · Preprocessing Issues · Library Issues · Quiet Changes · Newer Dialects
A portable program is one that you can move with little or no extra investment of effort to a computer that differs from the one on which you originally developed the program. Writing a program in Standard C does not guarantee that it will be portable. You must be aware of the aspects of the program that can vary among implementations. You can then write the program so that it does not depend critically on implementation-specific aspects.
This document describes what you must be aware of when writing a portable program. It also tells you what to look for when you alter programs written in older dialects of C so that they behave properly under a Standard C implementation. It briefly summarizes the features added with Amendment 1 to the C Standard. And it suggests ways to write C code that is also valid as C++ code.
Although the language definition specifies most aspects of Standard C, it intentionally leaves some aspects unspecified. The language definition also permits other aspects to vary among implementations. If the program depends on behavior that is not fully specified or that can vary among implementations, then there is a good chance that you will need to alter the program when you move it to another computer.
This section identifies issues that affect portability, such as how the translator interprets the program and how the target environment represents files. The list of issues is not complete, but it does include the common issues that you confront when you write a portable program.
An implementation of Standard C must include a document that describes any behavior that is implementation defined. You should read this document to be aware of those aspects that can vary, to be alert to behavior that can be peculiar to a particular implementation, and to take advantage of special features in programs that need not be portable.
A program can depend on peculiar properties of the translator.
The filenames acceptable to an include directive can vary considerably among implementations. If you use filenames that consist of other than six letters (of a single case), followed by a dot (.), followed by a single letter, then an implementation can find the name unacceptable. Each implementation defines the filenames that you can create.
How preprocessing uses a filename to locate a file can also vary. Each implementation defines where you must place files that you want to include with an include directive.
If you write two or more of the operators ## within a macro definition, the order in which preprocessing concatenates tokens can vary. If any order produces an invalid preprocessing token as an intermediate result, the program can misbehave when you move it.
A translator can limit the size and complexity of a program that it can translate. Such limits can also depend on the environment in which the translator executes. Thus, no translation unit you write can assuredly survive all Standard C translators. Stay within the individual minimum translation limits that the C Standard specifies, however, to ensure the highest probability of success.
The program can depend on peculiar properties of the character set.
If you write in the source files any characters not in the basic C character set, a corresponding character might not be in another character set, or the corresponding character might not be what you want. The set of characters is defined for each implementation.
Similarly, if the program makes special use of characters not in the basic C character set when it executes, you might get different behavior when you move the program.
If you write a character constant that specifies more than one character, such as 'ab', the result might change when you move the program. Each implementation defines what values it assigns such character constants.
If the program depends on a particular value for one or more character codes, it can behave differently on an implementation with a different character set. The codes associated with each character are implementation defined.
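As a minimal sketch of avoiding a dependence on particular character codes, the following maps letters to positions by searching a string rather than by subtracting codes. The function names letter_index and digit_value are hypothetical, chosen only for this illustration.

```c
#include <string.h>

/* Return 0..25 for a lowercase letter, or -1 otherwise, without assuming
   that the letters have adjacent codes (they do not in EBCDIC). */
static int letter_index(int c)
{
    static const char letters[] = "abcdefghijklmnopqrstuvwxyz";
    const char *p;

    if (c == '\0')
        return -1;          /* strchr would otherwise match the terminator */
    p = strchr(letters, c);
    return p ? (int)(p - letters) : -1;
}

/* The decimal digits are the exception: Standard C requires that
   '0' through '9' have adjacent, increasing codes. */
static int digit_value(int c)
{
    return (c >= '0' && c <= '9') ? c - '0' : -1;
}
```

Writing c - 'a' instead would give the same answers on an ASCII implementation, but not necessarily elsewhere.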
The program can depend on how an implementation represents objects. All representations are implementation defined.
If the program depends on the representation of an object type (such as its size in bits or whether type char or the plain bitfield types can represent negative values), the program can change behavior when you move it.
If you treat an arithmetic object that has more than one byte as an array of characters, you must be aware that the order of significant bytes can vary among implementations. You cannot write an integer or floating-point type object to a binary stream on one implementation, then later read those bytes into an object of the same type on a different implementation, and portably obtain the same stored value.
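One common way around byte-order differences is to define the external byte order yourself and build it with shifts, which operate on values rather than on representations. The sketch below assumes a 32-bit quantity stored most significant byte first; the names put_u32 and get_u32 are hypothetical.

```c
/* Store the low-order 32 bits of v in buf[0..3], most significant
   byte first, regardless of how the host orders bytes in memory. */
static void put_u32(unsigned char *buf, unsigned long v)
{
    buf[0] = (unsigned char)((v >> 24) & 0xFF);
    buf[1] = (unsigned char)((v >> 16) & 0xFF);
    buf[2] = (unsigned char)((v >> 8) & 0xFF);
    buf[3] = (unsigned char)(v & 0xFF);
}

/* Rebuild the value from four bytes written by put_u32. */
static unsigned long get_u32(const unsigned char *buf)
{
    return ((unsigned long)buf[0] << 24)
         | ((unsigned long)buf[1] << 16)
         | ((unsigned long)buf[2] << 8)
         | (unsigned long)buf[3];
}
```

You would then write the four bytes to the binary stream with fwrite and read them back with fread, instead of writing the unsigned long object directly.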
The method of encoding integer and floating-point values can vary widely. For signed integer types, negative values have several popular encodings. Floating-point types have numerous popular encodings. This means that, except for the minimum guaranteed range of values for each type, the range of values can vary widely.
Both signed integer and floating-point types can have values that represent an exceptional result on some implementations. Performing an arithmetic operation or a comparison on such a value can report a signal or otherwise terminate execution. Initialize all such objects before accessing them -- and avoid overflow, underflow, or zero divide -- to avoid exceptional results.
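To illustrate avoiding overflow rather than detecting it afterward, here is a sketch of an addition that tests the operands against the limits in <limits.h> before adding. The name checked_add is hypothetical.

```c
#include <limits.h>

/* Add a and b only if the sum is representable as an int.
   Returns 1 and stores the sum on success, 0 if it would overflow. */
static int checked_add(int a, int b, int *sum)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return 0;           /* the addition itself never executes */
    *sum = a + b;
    return 1;
}
```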
The alignment requirements of various object types can vary widely. The placement and size of holes in structures are implementation defined. You can portably determine the offset of a given member from the beginning of a structure, but only by using the offsetof macro, defined in <stddef.h>.
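A short sketch of the offsetof macro from <stddef.h> follows. The structure record and the function value_offset are hypothetical examples; the actual offset of value depends on the implementation's alignment rules.

```c
#include <stddef.h>

struct record {
    char tag;       /* always at offset zero */
    long value;     /* may follow a hole of implementation-defined size */
};

/* Ask the translator for the offset instead of computing it by hand. */
static size_t value_offset(void)
{
    return offsetof(struct record, value);
}
```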
Each implementation defines how bitfields pack into integer objects and whether bitfields can straddle two or more underlying objects. You can declare bitfields of 16 bits or less in all implementations.
How an implementation represents enumeration types can vary. You can be certain that all enumeration constants can be represented as type int.
The program can depend on how an implementation evaluates expressions.
The order in which the program evaluates subexpressions can vary widely, subject to the limits imposed by the sequence points within and between expressions. Therefore, the timing and order of side effects can vary between any two sequence points. A common error is to depend on a particular order for the evaluation of argument expressions on a function call. Any order is permissible.
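The argument-order pitfall can be sketched as follows. The names next_id, pair, and make_ordered_pair are hypothetical. Separate statements introduce sequence points, so the order of the two calls is fixed.

```c
static int counter = 0;

/* A function with a side effect: each call returns the next number. */
static int next_id(void)
{
    return ++counter;
}

struct pair {
    int first;
    int second;
};

/* Writing f(next_id(), next_id()) would leave the order of the two
   calls unspecified. Sequencing them as statements removes the doubt. */
static struct pair make_ordered_pair(void)
{
    struct pair p;

    p.first = next_id();    /* evaluated first on every implementation */
    p.second = next_id();   /* evaluated second */
    return p;
}
```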
Whether you can usefully type cast a pointer value to an integer value or type cast a nonzero integer value to a pointer value depends on the implementation. Each implementation defines how it converts between scalar types.
If the quotient of an integer division is negative, the sign of a nonzero remainder can be either positive or negative. The result is implementation defined. Use the div and ldiv functions for consistent behavior across implementations.
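A brief sketch of the div function from <stdlib.h>, which always truncates the quotient toward zero. The wrapper names quotient and remainder_of are hypothetical.

```c
#include <stdlib.h>

/* div returns both results in a div_t structure. The quotient is
   truncated toward zero, so the remainder takes the sign of the
   numerator on every implementation. */
static int quotient(int numer, int denom)
{
    return div(numer, denom).quot;
}

static int remainder_of(int numer, int denom)
{
    return div(numer, denom).rem;
}
```

For example, dividing -7 by 2 this way yields a quotient of -3 and a remainder of -1, whereas the operators / and % could also yield -4 and 1 under the rules described above.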
When the program right shifts a negative integer value, different implementations can define different results. To get consistent results across implementations, you can right shift only positive (or unsigned) integer values.
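If you need the effect of an arithmetic right shift on a value that can be negative, one approach is to shift only nonnegative values and adjust the result. The following is a sketch, not the only way to do it. The name shift_right_floor is hypothetical, and n is assumed to be less than the number of bits in a long.

```c
/* Compute floor(v / 2^n) while right shifting only nonnegative values. */
static long shift_right_floor(long v, unsigned n)
{
    if (v >= 0)
        return v >> n;
    /* For negative v, -(v + 1) is nonnegative and cannot overflow.
       floor(v / 2^n) equals -floor((-v - 1) / 2^n) - 1. */
    return -((-(v + 1)) >> n) - 1;
}
```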
When the program converts a long double value to another floating-point type, or a double to a float, it can round the result to either a nearby higher or a nearby lower representation of the original value. Each implementation defines how such conversions behave.
When the program accesses or stores a value in a volatile object, each implementation defines the number and nature of the accesses and stores. Three possibilities exist:
You cannot write a program that assuredly produces the same pattern of accesses across multiple implementations.
The expansion of the null pointer constant macro NULL can be any of 0, 0L, or (void *)0. The program should not depend on a particular choice.
You should not assign NULL to a pointer to a function, and you should not use NULL as an argument to a function call that has no type information for the corresponding parameter.
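The unnamed arguments of a function declared with an ellipsis have no parameter type information, so a null pointer passed there needs an explicit cast. The sketch below assumes a hypothetical function count_strings whose argument list ends with a null pointer of type char *.

```c
#include <stdarg.h>
#include <stddef.h>

/* Count the string arguments that precede a terminating null pointer.
   The caller must write the terminator as (char *)NULL, because a bare
   NULL could expand to 0 or 0L and be passed as an integer. */
static int count_strings(char *first, ...)
{
    va_list ap;
    char *s;
    int n = 0;

    va_start(ap, first);
    for (s = first; s != NULL; s = va_arg(ap, char *))
        ++n;
    va_end(ap);
    return n;
}
```

A call would look like count_strings("a", "b", (char *)NULL).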
The actual integer types corresponding to the type definitions ptrdiff_t, size_t, and wchar_t can vary. Use the type definitions.
The behavior of the Standard C library can vary.
What happens to the file-position indicator for a text stream immediately after a successful call to ungetc is not defined. Avoid mixing file-positioning operations with calls to this function.
When the function bsearch can match either of two equal elements of an array, different implementations can return different matches.
When the function qsort sorts an array containing two elements that compare equal, different implementations can leave the elements in different order.
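One way to get the same order from qsort everywhere is to make sure no two elements compare equal, for instance by recording each element's original position and using it as a tie-breaker. The sketch below assumes a hypothetical structure entry with a seq member added for that purpose.

```c
#include <stdlib.h>

struct entry {
    int key;    /* the value the caller wants sorted */
    int seq;    /* original position, used only to break ties */
};

/* Order by key, then by original position, so that no two
   distinct elements ever compare equal. */
static int compare_entries(const void *pa, const void *pb)
{
    const struct entry *a = (const struct entry *)pa;
    const struct entry *b = (const struct entry *)pb;

    if (a->key != b->key)
        return (a->key > b->key) - (a->key < b->key);
    return (a->seq > b->seq) - (a->seq < b->seq);
}

/* Number the elements, then sort them. */
static void sort_entries(struct entry *v, size_t n)
{
    size_t i;

    for (i = 0; i < n; ++i)
        v[i].seq = (int)i;
    qsort(v, n, sizeof v[0], compare_entries);
}
```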
Whether or not floating-point underflow causes the value ERANGE to be stored in errno (as the result of a range error) can vary. Each implementation defines how it handles floating-point underflow.
What library functions store values in errno varies considerably. To determine whether the function of interest reported an error, you must store the value zero in errno before you call a library function and then test the stored value before you call another library function.
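The errno protocol can be sketched with strtol, which is specified to store ERANGE on overflow. The wrapper name parse_long is hypothetical.

```c
#include <errno.h>
#include <stdlib.h>

/* Convert text to a long. Returns 1 and stores the value on success,
   0 if no digits were found or the library reported an error. */
static int parse_long(const char *text, long *out)
{
    char *end;
    long value;

    errno = 0;                      /* clear before the call */
    value = strtol(text, &end, 10);
    if (errno != 0 || end == text)  /* test before any other library call */
        return 0;
    *out = value;
    return 1;
}
```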
You can do very little with signals in a portable program. A target environment can elect not to report signals. If it does report signals, any handler you write for an asynchronous signal can only:
make a successful call to signal for that particular signal
Asynchronous signals can disrupt proper operation of the library. Avoid using signals, or tailor how you use them to each target environment.
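A handler that stays within the Standard C rules for asynchronous signals does very little: it can call signal for the signal being handled and store a value in an object of type volatile sig_atomic_t. The following sketch assumes SIGINT as the signal of interest; the names got_signal, on_signal, and install_handler are hypothetical.

```c
#include <signal.h>

/* The only kind of static object a portable handler can store into. */
static volatile sig_atomic_t got_signal = 0;

/* Re-establish the handler for this signal, record the event, return. */
static void on_signal(int sig)
{
    signal(sig, on_signal);
    got_signal = 1;
}

/* Returns 0 if the handler was installed, -1 otherwise. */
static int install_handler(void)
{
    return signal(SIGINT, on_signal) == SIG_ERR ? -1 : 0;
}
```

The rest of the program would test got_signal at convenient points and do the real work outside the handler.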
The scan functions can give special meaning to a minus (-) that is not the first or the last character of a scan set. The behavior is implementation defined. Write this character only first or last in a scan set.
If you allocate an object of zero size by calling one of the functions calloc, malloc, or realloc, the behavior is implementation defined. Avoid such calls.
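One simple way to avoid zero-size requests is to route allocations through a wrapper that never asks for fewer than one byte. This is only a sketch, and the name alloc_bytes is hypothetical.

```c
#include <stdlib.h>

/* Allocate at least one byte, so that a null return
   always means the allocation failed. */
static void *alloc_bytes(size_t n)
{
    return malloc(n ? n : 1);
}
```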
If you call the function exit with a status argument value other than zero (for successful termination), EXIT_FAILURE, or EXIT_SUCCESS, the behavior is implementation defined. Use only these values to report status.
If you have a program written in an earlier dialect of C that you want to convert to Standard C, be aware of all the portability issues described earlier in this document. You must also be aware of issues peculiar to earlier dialects of C. Standard C tries to codify existing practice wherever possible, but existing practice varied in certain areas. This section discusses the major areas to address when moving an older C program to a Standard C environment.
In earlier dialects of C, you cannot write a function prototype. Function types do not have argument information, and function calls occur in the absence of any argument information. Many implementations let you call any function with a varying number of arguments.
You can directly address many of the potential difficulties in converting a program to Standard C by writing function prototypes for all functions. Declare functions with external linkage that you use in more than one file in a separate file, and then include that file in all source files that call or define the functions.
The translator will check that function calls and function definitions are consistent with the function prototypes that you write. It will emit a diagnostic if you call a function with an incorrect number of arguments. It will emit a diagnostic if you call a function with an argument expression that is not assignment compatible with the corresponding function parameter. It will convert an argument expression that is assignment compatible but that does not have the same type as the corresponding function parameter.
Older C programs often rely on argument values of different types having the same representation on a given implementation. By providing function prototypes, you can ensure that the translator will diagnose, or quietly correct, any function calls for which the representation of an argument value is not always acceptable.
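A small sketch of the quiet correction that a prototype provides. The function names half and half_of_three are hypothetical. In a real program the prototype would live in a header that every caller includes.

```c
/* The prototype tells the translator that half takes a double. */
static double half(double x);

/* With the prototype in scope, the int argument 3 is converted to
   double before the call. In an earlier dialect, with no prototype,
   the call would pass an int where a double is expected. */
static double half_of_three(void)
{
    return half(3);
}

static double half(double x)
{
    return x / 2.0;
}
```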
For functions intended to accept a varying number of arguments,
different implementations provide different methods of accessing the
unnamed arguments. When you identify such a function, declare it with
the ellipsis notation, such as
int f(int x, ...). Within the
function, use the macros defined in
<stdarg.h> to replace the
existing method for accessing unnamed arguments.
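The following sketch shows the <stdarg.h> macros in a function declared with the ellipsis notation. The function sum_ints is hypothetical and assumes that the first argument tells how many int arguments follow.

```c
#include <stdarg.h>

/* Add up count unnamed arguments, each of type int. */
static int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;
    int i;

    va_start(ap, count);            /* count is the last named parameter */
    for (i = 0; i < count; ++i)
        total += va_arg(ap, int);   /* fetch the next unnamed argument */
    va_end(ap);
    return total;
}
```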
Perhaps the greatest variation in dialects among earlier implementations of C occurs in preprocessing. If the program defines macros that perform only simple substitutions of preprocessing tokens, then you can expect few problems. Otherwise, be wary of variations in several areas.
Some earlier dialects expand macro arguments after substitution, rather than before. This can lead to differences in how a macro expands when you write other macro invocations within its arguments.
Some earlier dialects do not rescan the replacement token sequence after substitution. Macros that expand to macro invocations work differently, depending on whether the rescan occurs.
Dialects that rescan the replacement token sequence work differently, depending on whether a macro that expands to a macro invocation can involve preprocessing tokens in the text following the macro invocation.
The handling of a macro name during an expansion of its invocation varies considerably.
Some dialects permit empty argument sequences in a macro invocation. Standard C does not always permit empty arguments.
The concatenation of tokens with the operator ## is new with Standard C. It replaces several earlier methods.
The creation of string literals with the operator # is new with Standard C. It replaces the practice in some earlier dialects of substituting macro parameter names that you write within string literals in macro definitions.
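A minimal sketch of the two Standard C preprocessing operators follows. The macro names STR and GLUE and the identifiers built from them are hypothetical.

```c
/* # turns the argument's spelling into a string literal. */
#define STR(x) #x

/* ## concatenates two preprocessing tokens into one. */
#define GLUE(a, b) a ## b

/* GLUE(item, 1) becomes the single identifier item1. */
static int GLUE(item, 1) = 10;

/* STR(item1) becomes the string literal "item1". */
static const char *item_name(void)
{
    return STR(item1);
}
```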
The Standard C library is largely a superset of existing libraries. Some conversion problems, however, can occur.
Many earlier implementations offer an additional set of input/output functions with names such as close, creat, lseek, open, read, and write. You must replace calls to these functions with calls to other functions defined in <stdio.h>.
Standard C has several minor changes in the behavior of library functions, compared with popular earlier dialects. These changes generally occur in areas where practice also varied.
Most differences between Standard C and earlier dialects of C cause a Standard C translator to emit a diagnostic when it encounters a program written in the earlier dialect of C. Some changes, unfortunately, require no diagnostic. What was a valid program in the earlier dialect is also a valid program in Standard C, but with different meaning.
While these quiet changes are few in number and generally subtle, you need to be aware of them. They occasionally give rise to unexpected behavior in a program that you convert to Standard C. The principal quiet changes are discussed below.
Trigraphs do not occur in earlier dialects of C. An older program that happens to contain a sequence of two question marks (??) can change meaning in a variety of ways.
Some earlier dialects effectively promote any declaration you write that has external linkage to file level. Standard C keeps such declarations at block level.
Earlier dialects of C let you use the digits 8 and 9 in an octal escape sequence, such as in the string literal "\08". Standard C treats this as a string literal with two characters (plus the terminating null character).
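You can confirm how Standard C reads such a literal with sizeof. In this sketch the names sample and sample_size are hypothetical.

```c
/* In Standard C, \0 ends at the digit 8 because 8 is not an octal
   digit, so the literal holds '\0', '8', and the terminating null. */
static const char sample[] = "\08";

static unsigned long sample_size(void)
{
    return (unsigned long)sizeof sample;
}
```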
Hexadecimal escape sequences, such as \xff, and the escape sequence \a are new with Standard C. In certain earlier implementations, they may have different meaning.
Some earlier dialects guarantee that identical string literals share common storage, and others guarantee that they do not. Some dialects let you alter the values stored in string literals. You cannot be certain that identical string literals overlap in Standard C, or that they do not. Do not alter the values stored in string literals in Standard C.
Some earlier dialects have different rules for promoting the types unsigned char, unsigned short, and unsigned bitfields. On most implementations, the difference is detectable only on a few expressions where a negative value becomes a large positive value of unsigned type. Add type casts to specify the types you require.
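A sketch of adding type casts so that a subtraction of small unsigned values behaves the same under either promotion rule. The function name is_less is hypothetical, and the sketch assumes that type int can represent every unsigned char value, which holds on most implementations.

```c
/* Without the casts, a dialect that promotes unsigned char to
   unsigned int would compute a large positive difference, and the
   test against zero would never succeed. */
static int is_less(unsigned char a, unsigned char b)
{
    return ((int)a - (int)b) < 0;
}
```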
Earlier dialects convert lvalue expressions of type float to double, in a value context, so all floating-point arithmetic occurs only in type double. A program that depends on this implicit increase in precision can behave differently in a Standard C environment. Add type casts if you need the extra precision.
On some earlier dialects of C, shifting an int or unsigned int value left or right by a long or unsigned long value first converts the value to be shifted to the type of the shift count. In Standard C, the type of the shift count has no such effect. Use a type cast if you need this behavior.
Some earlier dialects guarantee that the if directive performs arithmetic to the same precision as the target environment. (You can write an if directive that reveals properties of the target environment.) Standard C makes no such guarantee. Use the macros defined in <float.h> and <limits.h> to test properties of the target environment.
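As a sketch of testing the target environment through <limits.h>, the following selects an integer type with at least 32 bits. The type name int_least32 is hypothetical and is not a name defined by the library.

```c
#include <limits.h>

/* Pick int when it is wide enough. Otherwise fall back to long,
   which Standard C guarantees to hold at least 32 bits. */
#if INT_MAX >= 2147483647
typedef int int_least32;
#else
typedef long int_least32;
#endif
```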
Earlier dialects vary considerably in the grouping of values within an object initializer, when you omit some (but not all) of the braces within the initializer. Supply all braces for maximum clarity.
Earlier dialects convert the expression in any switch statement to type int. Standard C also performs comparisons within a switch statement in other integer types. A case label expression that relies on being truncated when converted to int, in an earlier dialect, can behave differently in a Standard C environment.
Some earlier preprocessing expands parameter names within string literals or character constants that you write within a macro definition. Standard C does not. Use the string literal creation operator #, along with string literal concatenation, to replace this method.
Some earlier preprocessing concatenates preprocessor tokens separated only by a comment within a macro definition. Standard C does not. Use the token concatenation operator ## to replace this method.
Making standards for programming languages is an ongoing activity. As of this writing, the C Standard has been formally amended. A standard for C++, which is closely related to C, is in the late stages of development. One aspect of portability is writing code that is compatible with these newer dialects, whether or not the code makes use of the newer features.
Most of the features added with Amendment 1 are declared or defined in three new headers -- <iso646.h>, <wchar.h>, and <wctype.h>. A few take the form of capabilities added to the functions declared in <stdio.h>. While not strictly necessary, it is best to avoid using any of the names declared or defined in these new headers.
Maintaining compatibility with C++ takes considerably more work. It can be useful, however, to write in a common dialect called typesafe C. Here is a brief summary of the added constraints:
Avoid using any C++ keywords. As of this writing, the list includes:
and and_eq asm bitand bitor bool catch class compl delete explicit false friend inline mutable namespace new not not_eq operator or or_eq private protected public template this throw true try typeid typename using virtual wchar_t xor xor_eq const_cast dynamic_cast reinterpret_cast static_cast
Write function prototypes for all functions you call.
Define each tag name also as a type, as in:
typedef struct x x;
Assume each enumeration type is a distinct type that promotes to an integer type. Type cast an integer expression that you assign to an object of enumeration type.
Write an explicit storage class for each constant object declaration at file level.
Do not write tentative declarations.
Do not apply the sizeof operator to an rvalue operand.
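Several of these constraints can be sketched in one short fragment that is intended to be valid both as C and as C++. All names here (node, color, make_node, color_from_int) are hypothetical. The cast on the result of malloc is not among the constraints listed above, but C++ requires it when converting from void *.

```c
#include <stdlib.h>

/* Define each tag name also as a type. */
typedef struct node node;
struct node {
    int value;
    node *next;
};

enum color { RED, GREEN, BLUE };
typedef enum color color;

/* Prototypes for every function that is called. */
static node *make_node(int value);
static color color_from_int(int n);

static node *make_node(int value)
{
    node *p = (node *)malloc(sizeof(node));    /* sizeof applied to a type */

    if (p != NULL) {
        p->value = value;
        p->next = NULL;
    }
    return p;
}

/* Type cast an integer expression assigned to an enumeration type. */
static color color_from_int(int n)
{
    return (color)n;
}
```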
Copyright © 1989-1996 by P.J. Plauger and Jim Brodie. All rights reserved.