Newsgroups: comp.lang.c,comp.answers,news.answers
Path: senator-bedfellow.mit.edu!bloom-beacon.mit.edu!world!news.kei.com!sol.ctr.columbia.edu!emory!europa.eng.gtefsd.com!uunet!nwnexus!oneworld!eskimo!scs
From: scs@eskimo.com (Steve Summit)
Subject: comp.lang.c Answers to Frequently Asked Questions (FAQ List)
Message-ID: <1993Nov04.0146.scs.0007@eskimo.com>
Followup-To: poster
Sender: scs@eskimo.com (Steve Summit)
Supersedes: <CE87C4.1zI@eskimo.com>
Reply-To: scs@eskimo.com
X-Archive-Name: C-faq/faq
Organization: none, at the moment
Date: Thu, 4 Nov 1993 09:46:28 GMT
X-Last-Modified: November 3, 1993
Approved: news-answers-request@MIT.Edu
Expires: Fri, 3 Dec 1993 00:00:00 GMT
Lines: 3219
Xref: senator-bedfellow.mit.edu comp.lang.c:82574 comp.answers:2520 news.answers:14301

Archive-name: C-faq/faq
Comp-lang-c-archive-name: C-FAQ-list

[Last modified November 3, 1993 by scs.]

Certain topics come up again and again on this newsgroup. They are good
questions, and the answers may not be immediately obvious, but each time
they recur, much net bandwidth and reader time are wasted on repetitive
responses, and on tedious corrections to the incorrect answers which are
inevitably posted.

This article, which is posted monthly, attempts to answer these common
questions definitively and succinctly, so that net discussion can move
on to more constructive topics without continual regression to first
principles.

No mere newsgroup article can substitute for thoughtful perusal of a
full-length tutorial or language reference manual. Anyone interested
enough in C to be following this newsgroup should also be interested
enough to read and study one or more such manuals, preferably several
times. Some C books and compiler manuals are unfortunately inadequate;
a few even perpetuate some of the myths which this article attempts to
refute. Several noteworthy books on C are listed in this article's
bibliography. Many of the questions and answers are cross-referenced to
these books, for further study by the interested and dedicated reader
(but beware of ANSI vs. ISO C Standard section numbers; see question
5.1).

If you have a question about C which is not answered in this article,
first try to answer it by checking a few of the referenced books, or by
asking knowledgeable colleagues, before posing your question to the net
at large. There are many people on the net who are happy to answer
questions, but the volume of repetitive answers posted to one question,
as well as the growing number of questions as the net attracts more
readers, can become oppressive. If you have questions or comments
prompted by this article, please reply by mail rather than following up
-- this article is meant to decrease net traffic, not increase it.

Besides listing frequently-asked questions, this article also summarizes
frequently-posted answers. Even if you know all the answers, it's worth
skimming through this list once in a while, so that when you see one of
its questions unwittingly posted, you won't have to waste time
answering.

This article is always being improved. Your input is welcomed. Send
your comments to scs@eskimo.com .

The questions answered here are divided into several categories:

     1. Null Pointers
     2. Arrays and Pointers
     3. Memory Allocation
     4. Expressions
     5. ANSI C
     6. C Preprocessor
     7. Variable-Length Argument Lists
     8. Boolean Expressions and Variables
     9. Structs, Enums, and Unions
    10. Declarations
    11. Stdio
    12. Library Subroutines
    13. Lint
    14. Style
    15. Floating Point
    16. System Dependencies
    17. Miscellaneous (Fortran to C converters, YACC grammars, etc.)

Herewith, some frequently-asked questions and their answers:


Section 1. Null Pointers

1.1: What is this infamous null pointer, anyway?

A: The language definition states that for each pointer type, there
is a special value -- the "null pointer" -- which is
distinguishable from all other pointer values and which is not
the address of any object or function. That is, the address-of
operator & will never yield a null pointer, nor will a
successful call to malloc. (malloc returns a null pointer when
it fails, and this is a typical use of null pointers: as a
"special" pointer value with some other meaning, usually "not
allocated" or "not pointing anywhere yet.")

A null pointer is conceptually different from an uninitialized
pointer. A null pointer is known not to point to any object; an
uninitialized pointer might point anywhere. See also questions
3.1, 3.11, and 17.1.

As mentioned in the definition above, there is a null pointer
for each pointer type, and the internal values of null pointers
for different types may be different. Although programmers need
not know the internal values, the compiler must always be
informed which type of null pointer is required, so it can make
the distinction if necessary (see below).

References: K&R I Sec. 5.4 pp. 97-8; K&R II Sec. 5.4 p. 102; H&S
Sec. 5.3 p. 91; ANSI Sec. 3.2.2.3 p. 38.

1.2: How do I "get" a null pointer in my programs?

A: According to the language definition, a constant 0 in a pointer
context is converted into a null pointer at compile time. That
is, in an initialization, assignment, or comparison when one
side is a variable or expression of pointer type, the compiler
can tell that a constant 0 on the other side requests a null
pointer, and generate the correctly-typed null pointer value.
Therefore, the following fragments are perfectly legal:

        char *p = 0;
        if(p != 0)

However, an argument being passed to a function is not
necessarily recognizable as a pointer context, and the compiler
may not be able to tell that an unadorned 0 "means" a null
pointer. For instance, the UNIX system call "execl" takes a
variable-length, null-pointer-terminated list of character
pointer arguments. To generate a null pointer in a function
call context, an explicit cast is typically required, to force
the 0 to be in a pointer context:

        execl("/bin/sh", "sh", "-c", "ls", (char *)0);

If the (char *) cast were omitted, the compiler would not know
to pass a null pointer, and would pass an integer 0 instead.
(Note that many UNIX manuals get this example wrong.)

When function prototypes are in scope, argument passing becomes
an "assignment context," and most casts may safely be omitted,
since the prototype tells the compiler that a pointer is
required, and of which type, enabling it to correctly convert
unadorned 0's. Function prototypes cannot provide the types for
variable arguments in variable-length argument lists, however,
so explicit casts are still required for those arguments. It is
safest always to cast null pointer function arguments, to guard
against varargs functions or those without prototypes, to allow
interim use of non-ANSI compilers, and to demonstrate that you
know what you are doing. (Incidentally, it's also a simpler
rule to remember.)

Summary:

        Unadorned 0 okay:       Explicit cast required:

        initialization          function call,
                                no prototype in scope
        assignment
                                variable argument in
        comparison              varargs function call

        function call,
        prototype in scope,
        fixed argument

References: K&R I Sec. A7.7 p. 190, Sec. A7.14 p. 192; K&R II
Sec. A7.10 p. 207, Sec. A7.17 p. 209; H&S Sec. 4.6.3 p. 72; ANSI
Sec. 3.2.2.3 .

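As a rough illustration of the varargs row of the table above, here is a
minimal sketch of a function that, like execl, takes a
null-pointer-terminated list of strings. The names total_length and
check_varargs_null are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: adds up the lengths of a null-pointer-terminated
 * list of strings, in the style of execl's argument list. */
static size_t total_length(const char *first, ...)
{
    va_list ap;
    size_t total = 0;
    const char *s;

    va_start(ap, first);
    for (s = first; s != NULL; s = va_arg(ap, char *))
        total += strlen(s);
    va_end(ap);
    return total;
}

/* The terminator is one of the variable arguments, so no prototype can
 * supply its type; the cast guarantees that a char * null pointer
 * (rather than an int 0) is passed. */
static int check_varargs_null(void)
{
    assert(total_length("sh", "-c", "ls", (char *)0) == 6);
    assert(total_length("abc", (char *)0) == 3);
    return 1;
}
```

Replacing (char *)0 with a bare 0 in these calls would pass an int where
va_arg expects a char *, which is exactly the mistake described above.
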
1.3: What is NULL and how is it #defined?

A: As a matter of style, many people prefer not to have unadorned
0's scattered throughout their programs. For this reason, the
preprocessor macro NULL is #defined (by <stdio.h> or
<stddef.h>), with value 0 (or (void *)0, about which more
later). A programmer who wishes to make explicit the
distinction between 0 the integer and 0 the null pointer can
then use NULL whenever a null pointer is required. This is a
stylistic convention only; the preprocessor turns NULL back to 0
which is then recognized by the compiler (in pointer contexts)
as before. In particular, a cast may still be necessary before
NULL (as before 0) in a function call argument. (The table
under question 1.2 above applies for NULL as well as 0.)

NULL should _only_ be used for pointers; see question 1.8.

References: K&R I Sec. 5.4 pp. 97-8; K&R II Sec. 5.4 p. 102; H&S
Sec. 13.1 p. 283; ANSI Sec. 4.1.5 p. 99, Sec. 3.2.2.3 p. 38,
Rationale Sec. 4.1.5 p. 74.

1.4: How should NULL be #defined on a machine which uses a nonzero
bit pattern as the internal representation of a null pointer?

A: Programmers should never need to know the internal
representation(s) of null pointers, because they are normally
taken care of by the compiler. If a machine uses a nonzero bit
pattern for null pointers, it is the compiler's responsibility
to generate it when the programmer requests a null pointer by
writing "0" or "NULL." Therefore, #defining NULL as 0 on a
machine for which internal null pointers are nonzero is as valid
as on any other, because the compiler must (and can) still
generate the machine's correct null pointers in response to
unadorned 0's seen in pointer contexts.

1.5: If NULL were defined as follows:

        #define NULL (char *)0

wouldn't that make function calls which pass an uncast NULL
work?

A: Not in general. The problem is that there are machines which
use different internal representations for pointers to different
types of data. The suggested #definition would make uncast NULL
arguments to functions expecting pointers to characters work
correctly, but pointer arguments to other types would still be
problematical, and legal constructions such as

        FILE *fp = NULL;

could fail.

Nevertheless, ANSI C allows the alternate

        #define NULL ((void *)0)

definition for NULL. Besides helping incorrect programs to work
(but only on machines with homogeneous pointers, thus
questionably valid assistance) this definition may catch
programs which use NULL incorrectly (e.g. when the ASCII NUL
character was really intended; see question 1.8).

References: ANSI Rationale Sec. 4.1.5 p. 74.

1.6: I use the preprocessor macro

        #define Nullptr(type) (type *)0

to help me build null pointers of the correct type.

A: This trick, though popular in some circles, does not buy much.
It is not needed in assignments and comparisons; see question
1.2. It does not even save keystrokes. Its use suggests to the
reader that the author is shaky on the subject of null pointers,
and requires the reader to check the #definition of the macro,
its invocations, and _all_ other pointer usages much more
carefully. See also question 8.1.

1.7: Is the abbreviated pointer comparison "if(p)" to test for non-
null pointers valid? What if the internal representation for
null pointers is nonzero?

A: When C requires the boolean value of an expression (in the if,
while, for, and do statements, and with the &&, ||, !, and ?:
operators), a false value is produced when the expression
compares equal to zero, and a true value otherwise. That is,
whenever one writes

        if(expr)

where "expr" is any expression at all, the compiler essentially
acts as if it had been written as

        if(expr != 0)

Substituting the trivial pointer expression "p" for "expr," we
have

        if(p)   is equivalent to        if(p != 0)

and this is a comparison context, so the compiler can tell that
the (implicit) 0 is a null pointer, and use the correct value.
There is no trickery involved here; compilers do work this way,
and generate identical code for both statements. The internal
representation of a pointer does _not_ matter.

The boolean negation operator, !, can be described as follows:

        !expr   is essentially equivalent to    expr?0:1

It is left as an exercise for the reader to show that

        if(!p)  is equivalent to        if(p == 0)

"Abbreviations" such as if(p), though perfectly legal, are
considered by some to be bad style.

See also question 8.2.

References: K&R II Sec. A7.4.7 p. 204; H&S Sec. 5.3 p. 91; ANSI
Secs. 3.3.3.3, 3.3.9, 3.3.13, 3.3.14, 3.3.15, 3.6.4.1, and
3.6.5 .

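The equivalences above can be checked with a small sketch. The three
function names and check_pointer_tests are hypothetical, chosen only for
this example; the point is that all three spellings agree for both a
null and a non-null pointer.

```c
#include <assert.h>

/* Three spellings of the same test; each returns 1 for a non-null
 * pointer and 0 for a null pointer. */
static int test_abbreviated(const char *p) { return p ? 1 : 0; }
static int test_explicit(const char *p)    { return (p != 0) ? 1 : 0; }
static int test_negated(const char *p)     { return !p ? 0 : 1; }

static int check_pointer_tests(void)
{
    const char *s = "x";
    const char *n = 0;      /* constant 0 in a pointer context */

    assert(test_abbreviated(s) == 1);
    assert(test_explicit(s) == 1);
    assert(test_negated(s) == 1);

    assert(test_abbreviated(n) == 0);
    assert(test_explicit(n) == 0);
    assert(test_negated(n) == 0);
    return 1;
}
```

Nothing in this code depends on how the machine represents a null
pointer internally; the comparisons are all made at the source level.
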
1.8: If "NULL" and "0" are equivalent, which should I use?

A: Many programmers believe that "NULL" should be used in all
pointer contexts, as a reminder that the value is to be thought
of as a pointer. Others feel that the confusion surrounding
"NULL" and "0" is only compounded by hiding "0" behind a
#definition, and prefer to use unadorned "0" instead. There is
no one right answer. C programmers must understand that "NULL"
and "0" are interchangeable and that an uncast "0" is perfectly
acceptable in initialization, assignment, and comparison
contexts. Any usage of "NULL" (as opposed to "0") should be
considered a gentle reminder that a pointer is involved;
programmers should not depend on it (either for their own
understanding or the compiler's) for distinguishing pointer 0's
from integer 0's.

NULL should _not_ be used when another kind of 0 is required,
even though it might work, because doing so sends the wrong
stylistic message. (ANSI allows the #definition of NULL to be
(void *)0, which will not work in non-pointer contexts.) In
particular, do not use NULL when the ASCII null character (NUL)
is desired. Provide your own definition

        #define NUL '\0'

if you must.

References: K&R II Sec. 5.4 p. 102.

1.9: But wouldn't it be better to use NULL (rather than 0) in case
the value of NULL changes, perhaps on a machine with nonzero
null pointers?

A: No. Although symbolic constants are often used in place of
numbers because the numbers might change, this is _not_ the
reason that NULL is used in place of 0. Once again, the
language guarantees that source-code 0's (in pointer contexts)
generate null pointers. NULL is used only as a stylistic
convention.

1.10: I'm confused. NULL is guaranteed to be 0, but the null pointer
is not?

A: When the term "null" or "NULL" is casually used, one of several
things may be meant:

    1.  The conceptual null pointer, the abstract language
        concept defined in question 1.1. It is implemented
        with...

    2.  The internal (or run-time) representation of a null
        pointer, which may or may not be all-bits-0 and which
        may be different for different pointer types. The
        actual values should be of concern only to compiler
        writers. Authors of C programs never see them, since
        they use...

    3.  The source code syntax for null pointers, which is the
        single character "0". It is often hidden behind...

    4.  The NULL macro, which is #defined to be "0" or
        "(void *)0". Finally, as red herrings, we have...

    5.  The ASCII null character (NUL), which does have all bits
        zero, but has no necessary relation to the null pointer
        except in name; and...

    6.  The "null string," which is another name for an empty
        string (""). The term "null string" can be confusing in
        C (and should perhaps be avoided), because it involves a
        null ('\0') character, but not a null pointer, which
        brings us full circle...

This article always uses the phrase "null pointer" (in lower
case) for sense 1, the character "0" for sense 3, and the
capitalized word "NULL" for sense 4.

1.11: Why is there so much confusion surrounding null pointers? Why
do these questions come up so often?

A: C programmers traditionally like to know more than they need to
about the underlying machine implementation. The fact that null
pointers are represented both in source code, and internally to
most machines, as zero invites unwarranted assumptions. The use
of a preprocessor macro (NULL) suggests that the value might
change later, or on some weird machine. The construct
"if(p == 0)" is easily misread as calling for conversion of p to
an integral type, rather than 0 to a pointer type, before the
comparison. Finally, the distinction between the several uses
of the term "null" (listed above) is often overlooked.

One good way to wade out of the confusion is to imagine that C
had a keyword (perhaps "nil", like Pascal) with which null
pointers were requested. The compiler could either turn "nil"
into the correct type of null pointer, when it could determine
the type from the source code, or complain when it could not.
Now, in fact, in C the keyword for a null pointer is not "nil"
but "0", which works almost as well, except that an uncast "0"
in a non-pointer context generates an integer zero instead of an
error message, and if that uncast 0 was supposed to be a null
pointer, the code may not work.

1.12: I'm still confused. I just can't understand all this null
pointer stuff.

A: Follow these two simple rules:

    1.  When you want to refer to a null pointer in source code,
        use "0" or "NULL".

    2.  If the usage of "0" or "NULL" is an argument in a
        function call, cast it to the pointer type expected by
        the function being called.

The rest of the discussion has to do with other people's
misunderstandings, or with the internal representation of null
pointers (which you shouldn't need to know), or with ANSI C
refinements. Understand questions 1.1, 1.2, and 1.3, and
consider 1.8 and 1.11, and you'll do fine.

1.13: Given all the confusion surrounding null pointers, wouldn't it
be easier simply to require them to be represented internally by
zeroes?

A: If for no other reason, doing so would be ill-advised because it
would unnecessarily constrain implementations which would
otherwise naturally represent null pointers by special, nonzero
bit patterns, particularly when those values would trigger
automatic hardware traps for invalid accesses.

Besides, what would this requirement really accomplish? Proper
understanding of null pointers does not require knowledge of the
internal representation, whether zero or nonzero. Assuming that
null pointers are internally zero does not make any code easier
to write (except for a certain ill-advised usage of calloc; see
question 3.11). Known-zero internal pointers would not obviate
casts in function calls, because the _size_ of the pointer might
still be different from that of an int. (If "nil" were used to
request null pointers rather than "0," as mentioned in question
1.11, the urge to assume an internal zero representation would
not even arise.)

1.14: Seriously, have any actual machines really used nonzero null
pointers, or different representations for pointers to different
types?

A: The Prime 50 series used segment 07777, offset 0 for the null
pointer, at least for PL/I. Later models used segment 0, offset
0 for null pointers in C, necessitating new instructions such as
TCNP (Test C Null Pointer), evidently as a sop to all the extant
poorly-written C code which made incorrect assumptions. Older,
word-addressed Prime machines were also notorious for requiring
larger byte pointers (char *'s) than word pointers (int *'s).

The Eclipse MV series from Data General has three
architecturally supported pointer formats (word, byte, and bit
pointers), two of which are used by C compilers: byte pointers
for char * and void *, and word pointers for everything else.

Some Honeywell-Bull mainframes use the bit pattern 06000 for
(internal) null pointers.

The CDC Cyber 180 Series has 48-bit pointers consisting of a
ring, segment, and offset. Most users (in ring 11) have null
pointers of 0xB00000000000.

The Symbolics Lisp Machine, a tagged architecture, does not even
have conventional numeric pointers; it uses the pair <NIL, 0>
(basically a nonexistent <object, offset> handle) as a C null
pointer.

Depending on the "memory model" in use, 80*86 processors (PC's)
may use 16 bit data pointers and 32 bit function pointers, or
vice versa.

1.15: What does a run-time "null pointer assignment" error mean? How
do I track it down?

A: This message, which occurs only under MS-DOS (see, therefore,
section 16) means that you've written, via a null pointer, to
location zero.

A debugger will usually let you set a data breakpoint on
location 0. Alternatively, you could write a bit of code to copy
20 or so bytes from location 0 into another buffer, and
periodically check that it hasn't changed.


Section 2. Arrays and Pointers

2.1: I had the definition char a[6] in one source file, and in
another I declared extern char *a. Why didn't it work?

A: The declaration extern char *a simply does not match the actual
definition. The type "pointer-to-type-T" is not the same as
"array-of-type-T." Use extern char a[].

References: CT&P Sec. 3.3 pp. 33-4, Sec. 4.5 pp. 64-5.

2.2: But I heard that char a[] was identical to char *a.

A: Not at all. (What you heard has to do with formal parameters to
functions; see question 2.4.) Arrays are not pointers. The
array declaration "char a[6];" requests that space for six
characters be set aside, to be known by the name "a." That is,
there is a location named "a" at which six characters can sit.
The pointer declaration "char *p;" on the other hand, requests a
place which holds a pointer. The pointer is to be known by the
name "p," and can point to any char (or contiguous array of
chars) anywhere.

As usual, a picture is worth a thousand words. The statements

        char a[] = "hello";
        char *p = "world";

would result in data structures which could be represented like
this:

           +---+---+---+---+---+---+
        a: | h | e | l | l | o |\0 |
           +---+---+---+---+---+---+

           +-----+     +---+---+---+---+---+---+
        p: |  *======> | w | o | r | l | d |\0 |
           +-----+     +---+---+---+---+---+---+

It is important to realize that a reference like x[3] generates
different code depending on whether x is an array or a pointer.
Given the declarations above, when the compiler sees the
expression a[3], it emits code to start at the location "a,"
move three past it, and fetch the character there. When it sees
the expression p[3], it emits code to start at the location "p,"
fetch the pointer value there, add three to the pointer, and
finally fetch the character pointed to. In the example above,
both a[3] and p[3] happen to be the character 'l', but the
compiler gets there differently. (See also question 17.14.)

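The difference between the two declarations shows up directly in what
sizeof reports. The following is a minimal sketch using the same two
strings as the picture above; check_array_vs_pointer is a hypothetical
name used only here.

```c
#include <assert.h>
#include <stddef.h>

static int check_array_vs_pointer(void)
{
    char a[] = "hello";         /* six chars of storage named a */
    const char *p = "world";    /* one pointer, aimed at a string */

    assert(sizeof a == 6);                  /* 'h' 'e' 'l' 'l' 'o' '\0' */
    assert(sizeof p == sizeof(char *));     /* just the pointer itself */

    /* Both subscripts yield 'l', although the code generated for
     * each is different. */
    assert(a[3] == 'l');
    assert(p[3] == 'l');

    p = a;                      /* a pointer can be re-aimed ... */
    assert(p[0] == 'h');        /* ... whereas "a = p" would not compile */
    return 1;
}
```
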
2.3: So what is meant by the "equivalence of pointers and arrays" in
C?

A: Much of the confusion surrounding pointers in C can be traced to
a misunderstanding of this statement. Saying that arrays and
pointers are "equivalent" does not mean that they are identical,
or even interchangeable.

"Equivalence" refers to the following key definition:

        An lvalue [see question 2.5] of type array-of-T
        which appears in an expression decays (with
        three exceptions) into a pointer to its first
        element; the type of the resultant pointer is
        pointer-to-T.

(The exceptions are when the array is the operand of a sizeof or
& operator, or is a literal string initializer for a character
array.)

As a consequence of this definition, there is no apparent
difference in the behavior of the "array subscripting" operator
[] as it applies to arrays and pointers. In an expression of
the form a[i], the array reference "a" decays into a pointer,
following the rule above, and is then subscripted just as would
be a pointer variable in the expression p[i] (although the
eventual memory accesses will be different, as explained in
question 2.2). In either case, the expression x[i] (where x is
an array or a pointer) is, by definition, identical to *((x)+(i)).

References: K&R I Sec. 5.3 pp. 93-6; K&R II Sec. 5.3 p. 99; H&S
Sec. 5.4.1 p. 93; ANSI Sec. 3.2.2.1, Sec. 3.3.2.1, Sec. 3.3.6 .

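The decay rule and its three exceptions can each be observed in a few
lines. This is a sketch only; check_decay and the array contents are
hypothetical choices made for the example.

```c
#include <assert.h>
#include <stddef.h>

static int check_decay(void)
{
    int a[4] = {10, 20, 30, 40};
    char s[] = "hi";        /* exception: string literal initializer */
    int *p = a;             /* ordinary case: a decays to &a[0] */

    assert(p == &a[0]);
    assert(a[2] == *(a + 2));       /* x[i] is *((x)+(i)) for an array */
    assert(p[2] == *(p + 2));       /* ... and for a pointer */
    assert(a[2] == p[2]);

    /* exception: under sizeof the array does not decay */
    assert(sizeof a == 4 * sizeof(int));
    assert(sizeof s == 3);

    /* exception: & yields a pointer to the whole array, of type
     * int (*)[4], so adding 1 steps over all four elements */
    assert((char *)(&a + 1) - (char *)&a == (ptrdiff_t)sizeof a);
    return 1;
}
```
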
2.4: Then why are array and pointer declarations interchangeable as
function formal parameters?

A: Since arrays decay immediately into pointers, an array is never
actually passed to a function. As a convenience, any parameter
declarations which "look like" arrays, e.g.

        f(a)
        char a[];

are treated by the compiler as if they were pointers, since that
is what the function will receive if an array is passed:

        f(a)
        char *a;

This conversion holds only within function formal parameter
declarations, nowhere else. If this conversion bothers you,
avoid it; many people have concluded that the confusion it
causes outweighs the small advantage of having the declaration
"look like" the call and/or the uses within the function.

References: K&R I Sec. 5.3 p. 95, Sec. A10.1 p. 205; K&R II
Sec. 5.3 p. 100, Sec. A8.6.3 p. 218, Sec. A10.1 p. 226; H&S
Sec. 5.4.3 p. 96; ANSI Sec. 3.5.4.3, Sec. 3.7.1; CT&P Sec. 3.3
pp. 33-4.

2.5: How can an array be an lvalue, if you can't assign to it?

A: The ANSI C Standard defines a "modifiable lvalue," which an
array is not.

References: ANSI Sec. 3.2.2.1 p. 37.

2.6: Why doesn't sizeof properly report the size of an array which is
a parameter to a function?

A: The sizeof operator reports the size of the pointer parameter
which the function actually receives (see question 2.4).

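A small sketch of the effect, and of the usual remedy of passing the
element count alongside the pointer. All function names here are
hypothetical. The parameter is written with pointer syntax because some
compilers warn about sizeof applied to an array-style parameter; as
question 2.4 explains, "char a[]" would declare exactly the same thing.

```c
#include <assert.h>
#include <stddef.h>

/* Reports what sizeof sees inside the called function. */
static size_t size_seen_by_callee(char *a)
{
    return sizeof a;            /* size of a pointer, not of the array */
}

/* Usual remedy: the caller passes the element count explicitly. */
static void fill(char *a, size_t n, char c)
{
    size_t i;

    for (i = 0; i < n; i++)
        a[i] = c;
}

static int check_sizeof_param(void)
{
    char buf[100];

    assert(sizeof buf == 100);              /* true array: full size */
    assert(size_seen_by_callee(buf) == sizeof(char *));

    fill(buf, sizeof buf, 'x');             /* size computed by the caller */
    assert(buf[0] == 'x' && buf[99] == 'x');
    return 1;
}
```
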
2.7: Someone explained to me that arrays were really just constant
pointers.

A: This is a bit of an oversimplification. An array name is
"constant" in that it cannot be assigned to, but an array is
_not_ a pointer, as the discussion and pictures in question 2.2
should make clear.

2.8: Practically speaking, what is the difference between arrays and
pointers?

A: Arrays automatically allocate space, but can't be relocated or
resized. Pointers must be explicitly assigned to point to
allocated space (perhaps using malloc), but can be reassigned
(i.e. pointed at different objects) at will, and have many other
uses besides serving as the base of blocks of memory.

Due to the "equivalence of arrays and pointers" (see question
2.3), arrays and pointers often seem interchangeable, and in
particular a pointer to a block of memory assigned by malloc is
frequently treated (and can be referenced using [] exactly) as
if it were a true array (see also question 2.13).

2.9: I came across some "joke" code containing the "expression"
5["abcdef"] . How can this be legal C?

A: Yes, Virginia, array subscripting is commutative in C. This
curious fact follows from the pointer definition of array
subscripting, namely that a[e] is identical to *((a)+(e)), for
_any_ expression e and primary expression a, as long as one of
them is a pointer expression and one is integral. This
unsuspected commutativity is often mentioned in C texts as if it
were something to be proud of, but it finds no useful
application outside of the Obfuscated C Contest (see question
17.9).

References: ANSI Rationale Sec. 3.3.2.1 p. 41.

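For the skeptical, the three spellings can be compared directly. This
is a sketch for illustration only (check_commutative is a hypothetical
name), not a recommendation to write code this way.

```c
#include <assert.h>

static int check_commutative(void)
{
    assert("abcdef"[5] == 'f');         /* the conventional form */
    assert(5["abcdef"] == 'f');         /* the "joke" form */
    assert(*("abcdef" + 5) == 'f');     /* what both of them mean */
    return 1;
}
```
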
2.10: My compiler complained when I passed a two-dimensional array to
a routine expecting a pointer to a pointer.

A: The rule by which arrays decay into pointers is not applied
recursively. An array of arrays (i.e. a two-dimensional array
in C) decays into a pointer to an array, not a pointer to a
pointer. Pointers to arrays can be confusing, and must be
treated carefully. (The confusion is heightened by the
existence of incorrect compilers, including some versions of pcc
and pcc-derived lint's, which improperly accept assignments of
multi-dimensional arrays to multi-level pointers.) If you are
passing a two-dimensional array to a function:

        int array[NROWS][NCOLUMNS];
        f(array);

the function's declaration should match:

        f(int a[][NCOLUMNS]) {...}
or
        f(int (*ap)[NCOLUMNS]) {...}    /* ap is a pointer to an array */

In the first declaration, the compiler performs the usual
implicit parameter rewriting of "array of array" to "pointer to
array;" in the second form the pointer declaration is explicit.
Since the called function does not allocate space for the array,
it does not need to know the overall size, so the number of
"rows," NROWS, can be omitted. The "shape" of the array is
still important, so the "column" dimension NCOLUMNS (and, for 3-
or more dimensional arrays, the intervening ones) must be
included.

If a function is already declared as accepting a pointer to a
pointer, it is probably incorrect to pass a two-dimensional
array directly to it.

References: K&R I Sec. 5.10 p. 110; K&R II Sec. 5.9 p. 113.

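Here is a minimal, complete sketch of both parameter forms. The
function names, the extra nrows parameter, and the particular values of
NROWS and NCOLUMNS are assumptions made for the example.

```c
#include <assert.h>

#define NROWS    2
#define NCOLUMNS 3

/* Parameter written as an array of arrays; the compiler rewrites it
 * to "int (*a)[NCOLUMNS]". */
static int sum_array_form(int a[][NCOLUMNS], int nrows)
{
    int i, j, sum = 0;

    for (i = 0; i < nrows; i++)
        for (j = 0; j < NCOLUMNS; j++)
            sum += a[i][j];
    return sum;
}

/* The same parameter with the pointer-to-array declared explicitly. */
static int sum_pointer_form(int (*ap)[NCOLUMNS], int nrows)
{
    int i, j, sum = 0;

    for (i = 0; i < nrows; i++)
        for (j = 0; j < NCOLUMNS; j++)
            sum += ap[i][j];
    return sum;
}

static int check_2d_passing(void)
{
    int array[NROWS][NCOLUMNS] = { {1, 2, 3}, {4, 5, 6} };

    assert(sum_array_form(array, NROWS) == 21);
    assert(sum_pointer_form(array, NROWS) == 21);
    return 1;
}
```

Passing array to a function declared to take int ** would draw a
diagnostic from a correct compiler, as the answer above explains.
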
2.11: How do I write functions which accept 2-dimensional arrays when
the "width" is not known at compile time?

A: It's not easy. One way is to pass in a pointer to the [0][0]
element, along with the two dimensions, and simulate array
subscripting "by hand:"

        f2(aryp, nrows, ncolumns)
        int *aryp;
        int nrows, ncolumns;
        { ... ary[i][j] is really aryp[i * ncolumns + j] ... }

This function could be called with the array from question 2.10
as

        f2(&array[0][0], NROWS, NCOLUMNS);

It must be noted, however, that a program which performs
multidimensional array subscripting "by hand" in this way is not
in strict conformance with the ANSI C Standard; the behavior of
accessing (&array[0][0])[x] is not defined for x >= NCOLUMNS.

See also question 2.14.

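The subscript arithmetic can be sketched as follows. element_at is a
hypothetical variant of f2 that returns a single element, and the test
data lives in a genuinely one-dimensional array so that the conformance
caveat above does not arise in the example itself.

```c
#include <assert.h>

/* Returns what would be ary[i][j], given a pointer to the first
 * element and the run-time width ncolumns. */
static int element_at(const int *aryp, int ncolumns, int i, int j)
{
    return aryp[i * ncolumns + j];
}

static int check_by_hand(void)
{
    /* One-dimensional storage holding 2 rows of 3 columns. */
    int flat[2 * 3] = {1, 2, 3, 4, 5, 6};

    assert(element_at(flat, 3, 0, 0) == 1);     /* row 0, column 0 */
    assert(element_at(flat, 3, 0, 2) == 3);     /* row 0, column 2 */
    assert(element_at(flat, 3, 1, 0) == 4);     /* row 1, column 0 */
    assert(element_at(flat, 3, 1, 2) == 6);     /* row 1, column 2 */
    return 1;
}
```
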
2.12: How do I declare a pointer to an array?

A: Usually, you don't want to. When people speak casually of a
pointer to an array, they usually mean a pointer to its first
element.

Instead of a pointer to an array, consider using a pointer to
one of the array's elements. Arrays of type T decay into
pointers to type T (see question 2.3), which is convenient;
subscripting or incrementing the resultant pointer accesses the
individual members of the array. True pointers to arrays, when
subscripted or incremented, step over entire arrays, and are
generally only useful when operating on arrays of arrays, if at
all. (See question 2.10 above.)

If you really need to declare a pointer to an entire array, use
something like "int (*ap)[N];" where N is the size of the array.
(See also question 10.4.) If the size of the array is unknown,
N can be omitted, but the resulting type, "pointer to array of
unknown size," is useless.

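The difference in step size between the two kinds of pointer can be
seen in a short sketch. The value of N, the array contents, and the
name check_pointer_to_array are assumptions made for the example.

```c
#include <assert.h>

#define N 3

static int check_pointer_to_array(void)
{
    int rows[2][N] = { {1, 2, 3}, {4, 5, 6} };
    int (*ap)[N] = rows;        /* pointer to an entire array of N ints */
    int *ip = rows[0];          /* pointer to a single int */

    assert(sizeof *ap == N * sizeof(int));
    assert((*ap)[1] == 2);

    ap++;                       /* steps over a whole row */
    assert((*ap)[0] == 4);

    ip++;                       /* steps over one element */
    assert(*ip == 2);
    return 1;
}
```
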
2.13: How can I dynamically allocate a multidimensional array?
|
|
|
|
A: It is usually best to allocate an array of pointers, and then
|
|
initialize each pointer to a dynamically-allocated "row." Here
|
|
is a two-dimensional example:
|
|
|
|
int **array1 = (int **)malloc(nrows * sizeof(int *));
for(i = 0; i < nrows; i++)
    array1[i] = (int *)malloc(ncolumns * sizeof(int));
|
|
|
|
(In "real" code, of course, malloc would be declared correctly,
|
|
and each return value checked.)
|
|
|
|
You can keep the array's contents contiguous, while making later
|
|
reallocation of individual rows difficult, with a bit of
|
|
explicit pointer arithmetic:
|
|
|
|
int **array2 = (int **)malloc(nrows * sizeof(int *));
array2[0] = (int *)malloc(nrows * ncolumns * sizeof(int));
for(i = 1; i < nrows; i++)
    array2[i] = array2[0] + i * ncolumns;
|
|
|
|
In either case, the elements of the dynamic array can be
|
|
accessed with normal-looking array subscripts: array[i][j].
|
|
|
|
If the double indirection implied by the above schemes is for
|
|
some reason unacceptable, you can simulate a two-dimensional
|
|
array with a single, dynamically-allocated one-dimensional
|
|
array:
|
|
|
|
int *array3 = (int *)malloc(nrows * ncolumns * sizeof(int));
|
|
|
|
However, you must now perform subscript calculations manually,
|
|
accessing the i,jth element with array3[i * ncolumns + j]. (A
|
|
macro can hide the explicit calculation, but invoking it then
|
|
requires parentheses and commas which don't look exactly like
|
|
multidimensional array subscripts.)
|
|
|
|
Finally, you can use pointers-to-arrays:
|
|
|
|
int (*array4)[NCOLUMNS] =
    (int (*)[NCOLUMNS])malloc(nrows * sizeof(*array4));
|
|
|
|
, but the syntax gets horrific and all but one dimension must be
|
|
known at compile time.
|
|
|
|
With all of these techniques, you may of course need to remember
|
|
to free the arrays (which may take several steps; see question
|
|
3.8) when they are no longer needed, and you cannot necessarily
|
|
intermix the dynamically-allocated arrays with conventional,
|
|
statically-allocated ones (see question 2.14 below, and also
|
|
question 2.10).
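
For what it's worth, the contiguous scheme might be packaged with
its error checks and its matching cleanup roughly as follows. The
names alloc2d and free2d are invented for this sketch, and the
multiplication nrows * ncolumns is assumed not to overflow.

```c
#include <stdlib.h>

/* Hypothetical helper: allocates nrows row pointers plus one
 * contiguous block of nrows * ncolumns ints, checking each malloc.
 * Returns NULL on failure or if either dimension is 0. */
int **alloc2d(size_t nrows, size_t ncolumns)
{
    int **rows;
    size_t i;

    if (nrows == 0 || ncolumns == 0)
        return NULL;
    rows = malloc(nrows * sizeof *rows);
    if (rows == NULL)
        return NULL;
    rows[0] = malloc(nrows * ncolumns * sizeof **rows);
    if (rows[0] == NULL) {
        free(rows);
        return NULL;
    }
    for (i = 1; i < nrows; i++)
        rows[i] = rows[0] + i * ncolumns;
    return rows;
}

/* Releases an array obtained from alloc2d: the data block first,
 * then the row pointers.  Passing NULL is harmless. */
void free2d(int **rows)
{
    if (rows != NULL) {
        free(rows[0]);
        free(rows);
    }
}
```

Elements are then reached as a[i][j], and the whole array is
released with a single call to free2d(a).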
|
|
|
|
2.14: How can I use statically- and dynamically-allocated
|
|
multidimensional arrays interchangeably when passing them to
|
|
functions?
|
|
|
|
A: There is no single perfect method. Given the array and f() as
|
|
declared in question 2.10, f2() as declared in question 2.11,
|
|
array1, array2, array3, and array4 as declared in 2.13, and a
|
|
function f3() declared as:
|
|
|
|
f3(pp, m, n)
int **pp;
int m, n;
|
|
|
|
; the following calls should work as expected:
|
|
|
|
f(array, NROWS, NCOLUMNS);
f(array4, nrows, NCOLUMNS);
f2(&array[0][0], NROWS, NCOLUMNS);
f2(*array2, nrows, ncolumns);
f2(array3, nrows, ncolumns);
f2(*array4, nrows, NCOLUMNS);
f3(array1, nrows, ncolumns);
f3(array2, nrows, ncolumns);
|
|
|
|
The following two calls would probably work, but involve
|
|
questionable casts, and work only if the dynamic ncolumns
|
|
matches the static NCOLUMNS:
|
|
|
|
f((int (*)[NCOLUMNS])(*array2), nrows, ncolumns);
f((int (*)[NCOLUMNS])array3, nrows, ncolumns);
|
|
|
|
It must again be noted that passing array to f2() is not
|
|
strictly conforming; see question 2.11.
|
|
|
|
If you can understand why all of the above calls work and are
|
|
written as they are, and if you understand why the combinations
|
|
that are not listed would not work, then you have a _very_ good
|
|
understanding of arrays and pointers (and several other areas)
|
|
in C.
|
|
|
|
2.15: Here's a neat trick: if I write
|
|
|
|
int realarray[10];
int *array = &realarray[-1];
|
|
|
|
I can treat "array" as if it were a 1-based array.
|
|
|
|
A: Although this technique is attractive (and is used in the book
|
|
Numerical Recipes in C), it does not conform to the C standards.
|
|
Pointer arithmetic is defined only as long as the pointer points
|
|
within the same allocated block of memory, or to the imaginary
|
|
"terminating" element one past it; otherwise, the behavior is
|
|
undefined, _even if the pointer is not dereferenced_. The code
|
|
above could fail if, while subtracting the offset, an illegal
|
|
address were generated (perhaps because the address tried to
|
|
"wrap around" past the beginning of some memory segment).
|
|
|
|
References: ANSI Sec. 3.3.6 p. 48, Rationale Sec. 3.2.2.3 p. 38;
|
|
K&R II Sec. 5.3 p. 100, Sec. 5.4 pp. 102-3, Sec. A7.7 pp. 205-6.
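
One conforming alternative, sketched here under the assumption
that wasting a single element is acceptable, is to declare the
array one element larger and leave element 0 unused. NITEMS and
sum_one_based are illustrative names.

```c
#define NITEMS 10

/* Sums items 1..NITEMS of a 1-based table.  The array has
 * NITEMS + 1 elements and element 0 is simply left unused, so
 * every pointer formed here stays inside the array. */
int sum_one_based(void)
{
    int table[NITEMS + 1];
    int i, total = 0;

    for (i = 1; i <= NITEMS; i++)
        table[i] = i;
    for (i = 1; i <= NITEMS; i++)
        total += table[i];
    return total;
}
```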
|
|
|
|
2.16: I passed a pointer to a function which initialized it:
|
|
|
|
...
int *ip;
f(ip);
...

void f(ip)
int *ip;
{
    static int dummy = 5;
    ip = &dummy;
}
|
|
|
|
, but the pointer in the caller was unchanged.
|
|
|
|
A: Did the function try to initialize the pointer itself, or just
|
|
what it pointed to? Remember that arguments in C are passed by
|
|
value. The called function altered only the passed copy of the
|
|
pointer. You'll want to pass the address of the pointer (the
|
|
function will end up accepting a pointer-to-a-pointer).
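
A possible repair, in ANSI syntax, looks something like this. The
names set_pointer and fetch_value are made up for the sketch.

```c
#include <stddef.h>

/* Receives the address of the caller's pointer, so the assignment
 * changes the caller's pointer and not just a local copy. */
void set_pointer(int **ipp)
{
    static int dummy = 5;

    *ipp = &dummy;
}

/* The caller passes &ip rather than ip. */
int fetch_value(void)
{
    int *ip = NULL;

    set_pointer(&ip);
    return *ip;
}
```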
|
|
|
|
2.17: I have a char * pointer that happens to point to some ints, and
|
|
I want to step it over them. Why doesn't
|
|
|
|
((int *)p)++;
|
|
|
|
work?
|
|
|
|
A: In C, a cast operator does not mean "pretend these bits have a
|
|
different type, and treat them accordingly;" it is a conversion
|
|
operator, and by definition it yields an rvalue, which cannot be
|
|
assigned to, or incremented with ++. (It is an anomaly in pcc-
|
|
derived compilers, and an extension in gcc, that expressions
|
|
such as the above are ever accepted.) Say what you mean: use
|
|
|
|
p = (char *)((int *)p + 1);
|
|
|
|
, or simply
|
|
|
|
p += sizeof(int);
|
|
|
|
References: ANSI Sec. 3.3.4, Rationale Sec. 3.3.2.4 p. 43.
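
Wrapped in a function, the first form might be used as below.
skip_one_int is an invented name, and the sketch assumes (as the
question does) that p really points at properly aligned ints.

```c
/* Steps a char pointer over one int by converting, adding 1 in
 * units of int, and converting back. */
char *skip_one_int(char *p)
{
    return (char *)((int *)p + 1);
}
```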
|
|
|
|
|
|
Section 3. Memory Allocation
|
|
|
|
3.1: Why doesn't this fragment work?
|
|
|
|
char *answer;
printf("Type something:\n");
gets(answer);
printf("You typed \"%s\"\n", answer);
|
|
|
|
A: The pointer variable "answer," which is handed to the gets
|
|
function as the location into which the response should be
|
|
stored, has not been set to point to any valid storage. That
|
|
is, we cannot say where the pointer "answer" points. (Since
|
|
local variables are not initialized, and typically contain
|
|
garbage, it is not even guaranteed that "answer" starts out as a
|
|
null pointer. See question 17.1.)
|
|
|
|
The simplest way to correct the question-asking program is to
|
|
use a local array, instead of a pointer, and let the compiler
|
|
worry about allocation:
|
|
|
|
#include <stdio.h>
#include <string.h>

char answer[100], *p;
printf("Type something:\n");
fgets(answer, sizeof(answer), stdin);
if((p = strchr(answer, '\n')) != NULL)
    *p = '\0';
printf("You typed \"%s\"\n", answer);
|
|
|
|
Note that this example also uses fgets instead of gets (always a
|
|
good idea; see question 11.5), so that the size of the array can
|
|
be specified, so that fgets will not overwrite the end of the
|
|
array if the user types an overly-long line. (Unfortunately for
|
|
this example, fgets does not automatically delete the trailing
|
|
\n, as gets would.) It would also be possible to use malloc to
|
|
allocate the answer buffer, and/or to parameterize its size
|
|
(#define ANSWERSIZE 100).
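
The malloc variant might be arranged along these lines. The
function names are invented for the sketch, the stream is passed
in as a parameter for convenience, and the caller is assumed to
free the returned buffer.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define ANSWERSIZE 100

/* Removes the trailing newline, if any, which fgets leaves behind. */
void strip_newline(char *s)
{
    char *p = strchr(s, '\n');

    if (p != NULL)
        *p = '\0';
}

/* Reads one line from fp into a malloc'ed buffer.  Returns NULL on
 * end-of-file, read error, or allocation failure; otherwise the
 * caller must free the result. */
char *read_answer(FILE *fp)
{
    char *answer = malloc(ANSWERSIZE);

    if (answer == NULL)
        return NULL;
    if (fgets(answer, ANSWERSIZE, fp) == NULL) {
        free(answer);
        return NULL;
    }
    strip_newline(answer);
    return answer;
}
```

A caller would write something like answer = read_answer(stdin),
check the result against NULL, and free(answer) when done.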
|
|
|
|
3.2: I can't get strcat to work. I tried
|
|
|
|
char *s1 = "Hello, ";
char *s2 = "world!";
char *s3 = strcat(s1, s2);
|
|
|
|
but I got strange results.
|
|
|
|
A: Again, the problem is that space for the concatenated result is
|
|
not properly allocated. C does not provide an automatically-
|
|
managed string type. C compilers only allocate memory for
|
|
objects explicitly mentioned in the source code (in the case of
|
|
"strings," this includes character arrays and string literals).
|
|
The programmer must arrange (explicitly) for sufficient space
|
|
for the results of run-time operations such as string
|
|
concatenation, typically by declaring arrays, or by calling
|
|
malloc.
|
|
|
|
strcat performs no allocation; the second string is appended to
|
|
the first one, in place. Therefore, one fix would be to declare
|
|
the first string as an array with sufficient space:
|
|
|
|
char s1[20] = "Hello, ";
|
|
|
|
Since strcat returns the value of its first argument (s1, in
|
|
this case), the s3 variable is superfluous.
|
|
|
|
References: CT&P Sec. 3.2 p. 32.
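
When the lengths are not known in advance, the malloc route might
look like the following sketch. concat is a made-up name, and the
caller is assumed to free the result.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated string holding s1 followed by s2, or
 * NULL if malloc fails.  Neither argument is modified. */
char *concat(const char *s1, const char *s2)
{
    size_t len1 = strlen(s1);
    size_t len2 = strlen(s2);
    char *result = malloc(len1 + len2 + 1);   /* +1 for the '\0' */

    if (result == NULL)
        return NULL;
    memcpy(result, s1, len1);
    memcpy(result + len1, s2, len2 + 1);      /* copies the '\0' too */
    return result;
}
```

Because the result lives in its own block, the two arguments may
safely be string literals.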
|
|
|
|
3.3: But the man page for strcat says that it takes two char *'s as
|
|
arguments. How am I supposed to know to allocate things?
|
|
|
|
A: In general, when using pointers you _always_ have to consider
|
|
memory allocation, at least to make sure that the compiler is
|
|
doing it for you. If a library routine's documentation does not
|
|
explicitly mention allocation, it is usually the caller's
|
|
problem.
|
|
|
|
The Synopsis section at the top of a UNIX-style man page can be
|
|
misleading. The code fragments presented there are closer to
|
|
the function definition used by the call's implementor than the
|
|
invocation used by the caller. In particular, many routines
|
|
which accept pointers (e.g. to structs or strings) are usually
|
|
called with the address of some object (a struct, or an array --
|
|
see questions 2.3 and 2.4). Another common example is stat().
|
|
|
|
3.4: I have a function that is supposed to return a string, but when
|
|
it returns to its caller, the returned string is garbage.
|
|
|
|
A: Make sure that the memory to which the function returns a
|
|
pointer is correctly allocated. The returned pointer should be
|
|
to a statically-allocated buffer, to a buffer passed in by the
caller, or to memory obtained with malloc, but _not_ to a local
(automatic) array. See also question 17.3.
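
The static-buffer and caller-supplied-buffer choices might be
sketched as follows. Both function names are invented, and the
32-byte buffers are assumed to be large enough for any int on the
machine at hand.

```c
#include <stdio.h>

/* Option 1: a static buffer.  The result stays valid after the
 * return, but is overwritten by the next call. */
char *int_to_string_static(int n)
{
    static char buf[32];

    sprintf(buf, "%d", n);
    return buf;
}

/* Option 2: the caller supplies the buffer (at least 32 bytes in
 * this sketch), so each result can have its own storage. */
char *int_to_string_buf(int n, char *buf)
{
    sprintf(buf, "%d", n);
    return buf;
}
```

Declaring buf in the first function without the word static would
produce exactly the garbage described in the question.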
|
|
|
|
3.5: You can't use dynamically-allocated memory after you free it,
|
|
can you?
|
|
|
|
A: No. Some early man pages for malloc stated that the contents of
|
|
freed memory were "left undisturbed;" this ill-advised guarantee
|
|
was never universal and is not required by ANSI.
|
|
|
|
Few programmers would use the contents of freed memory
|
|
deliberately, but it is easy to do so accidentally. Consider
|
|
the following (correct) code for freeing a singly-linked list:
|
|
|
|
struct list *listp, *nextp;
for(listp = base; listp != NULL; listp = nextp) {
    nextp = listp->next;
    free((char *)listp);
}
|
|
|
|
and notice what would happen if the more-obvious loop iteration
|
|
expression listp = listp->next were used, without the temporary
|
|
nextp pointer.
|
|
|
|
References: ANSI Rationale Sec. 4.10.3.2 p. 102; CT&P Sec. 7.10
|
|
p. 95.
|
|
|
|
3.6: How does free() know how many bytes to free?
|
|
|
|
A: The malloc/free package remembers the size of each block it
|
|
allocates and returns, so it is not necessary to remind it of
|
|
the size when freeing.
|
|
|
|
3.7: So can I query the malloc package to find out how big an
|
|
allocated block is?
|
|
|
|
A: Not portably.
|
|
|
|
3.8: I'm allocating structures which contain pointers to other
|
|
dynamically-allocated objects. When I free a structure, do I
|
|
have to free each subsidiary pointer first?
|
|
|
|
A: Yes. In general, you must arrange that each pointer returned
|
|
from malloc be individually passed to free, exactly once (if it
|
|
is freed at all).
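
A matched pair of allocation and cleanup routines is one way to
keep this bookkeeping in one place. The structure and function
names below are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

struct person {          /* hypothetical structure for illustration */
    char *name;          /* separately malloc'ed */
};

/* Allocates a person and a private copy of name; NULL on failure. */
struct person *person_new(const char *name)
{
    struct person *p = malloc(sizeof *p);

    if (p == NULL)
        return NULL;
    p->name = malloc(strlen(name) + 1);
    if (p->name == NULL) {
        free(p);
        return NULL;
    }
    strcpy(p->name, name);
    return p;
}

/* Frees the subsidiary object first, then the structure itself.
 * Reversing the order would read p->name from freed memory. */
void person_free(struct person *p)
{
    if (p != NULL) {
        free(p->name);
        free(p);
    }
}
```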
|
|
|
|
3.9: I have a program which mallocs but then frees a lot of memory,
|
|
but memory usage (as reported by ps) doesn't seem to go back
|
|
down.
|
|
|
|
A: Most implementations of malloc/free do not return freed memory
|
|
to the operating system (if there is one), but merely make it
|
|
available for future malloc calls.
|
|
|
|
3.10: Is it legal to pass a null pointer as the first argument to
|
|
realloc()? Why would you want to?
|
|
|
|
A: ANSI C sanctions this usage (and the related realloc(..., 0),
|
|
which frees), but several earlier implementations do not support
|
|
it, so it is not widely portable. Passing an initially-null
|
|
pointer to realloc can make it easier to write a self-starting
|
|
incremental allocation algorithm.
|
|
|
|
References: ANSI Sec. 4.10.3.4 .
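
A self-starting allocation might be sketched like this, assuming
an ANSI-conforming realloc. append_int is an invented name, and
growing by one element per call is chosen for brevity rather than
efficiency.

```c
#include <stdlib.h>

/* Appends value to a growing array.  *arrp starts out NULL and *np
 * starts out 0; the first call relies on realloc(NULL, size)
 * behaving like malloc(size).  Returns 1 on success, or 0 on
 * failure, in which case the old array is left intact. */
int append_int(int **arrp, size_t *np, int value)
{
    int *newp = realloc(*arrp, (*np + 1) * sizeof **arrp);

    if (newp == NULL)
        return 0;
    newp[*np] = value;
    *arrp = newp;
    (*np)++;
    return 1;
}
```

No special case is needed for the first element, which is the
convenience the null-pointer usage buys.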
|
|
|
|
3.11: What is the difference between calloc and malloc? Is it safe to
|
|
use calloc's zero-fill guarantee for pointer and floating-point
|
|
values? Does free work on memory allocated with calloc, or do
|
|
you need a cfree?
|
|
|
|
A: calloc(m, n) is essentially equivalent to
|
|
|
|
p = malloc(m * n);
memset(p, 0, m * n);
|
|
|
|
The zero fill is all-bits-zero, and does not therefore guarantee
|
|
useful zero values for pointers (see section 1 of this list) or
|
|
floating-point values. free can (and should) be used to free
|
|
the memory allocated by calloc.
|
|
|
|
References: ANSI Secs. 4.10.3 to 4.10.3.2 .
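
Where true zero values are wanted for pointers or floating-point
members, an explicit loop is the portable route. The structure
and function below are hypothetical.

```c
#include <stdlib.h>

struct node {            /* hypothetical structure for illustration */
    double weight;
    struct node *next;
};

/* Allocates n nodes and sets each member by assignment, so the
 * compiler supplies the correct representations of 0.0 and of the
 * null pointer.  Returns NULL if malloc fails. */
struct node *alloc_nodes(size_t n)
{
    struct node *p = malloc(n * sizeof *p);
    size_t i;

    if (p == NULL)
        return NULL;
    for (i = 0; i < n; i++) {
        p[i].weight = 0.0;   /* a true floating-point zero */
        p[i].next = NULL;    /* a true null pointer */
    }
    return p;
}
```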
|
|
|
|
3.12: What is alloca and why is its use discouraged?
|
|
|
|
A: alloca allocates memory which is automatically freed when the
|
|
function which called alloca returns. That is, memory allocated
|
|
with alloca is local to a particular function's "stack frame" or
|
|
context.
|
|
|
|
alloca cannot be written portably, and is difficult to implement
|
|
on machines without a stack. Its use is problematical (and the
|
|
obvious implementation on a stack-based machine fails) when its
|
|
return value is passed directly to another function, as in
|
|
fgets(alloca(100), 100, stdin).
|
|
|
|
For these reasons, alloca cannot be used in programs which must
|
|
be widely portable, no matter how useful it might be.
|
|
|
|
References: ANSI Rationale Sec. 4.10.3 p. 102.
|
|
|
|
|
|
Section 4. Expressions
|
|
|
|
4.1: Why doesn't this code:
|
|
|
|
a[i] = i++;
|
|
|
|
work?
|
|
|
|
A: The subexpression i++ causes a side effect -- it modifies i's
|
|
value -- which leads to undefined behavior if i is also
|
|
referenced elsewhere in the same expression.
|
|
|
|
References: ANSI Sec. 3.3 p. 39.
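
If the intent was to store the old value of i in a[i] and then
advance i (an assumption; the original line does not say), the two
steps can be written as separate statements. The function name is
made up for the sketch.

```c
/* Stores the current value of *ip in a[*ip], then increments *ip.
 * Each statement ends at a sequence point, so the order is fixed. */
void store_then_increment(int a[], int *ip)
{
    a[*ip] = *ip;
    (*ip)++;
}
```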
|
|
|
|
4.2: Under my compiler, the code
|
|
|
|
int i = 7;
printf("%d\n", i++ * i++);
|
|
|
|
prints 49. Regardless of the order of evaluation, shouldn't it
|
|
print 56?
|
|
|
|
A: Although the postincrement and postdecrement operators ++ and --
|
|
perform the operations after yielding the former value, the
|
|
implication of "after" is often misunderstood. It is _not_
|
|
guaranteed that the operation is performed immediately after
|
|
giving up the previous value and before any other part of the
|
|
expression is evaluated. It is merely guaranteed that the
|
|
update will be performed sometime before the expression is
|
|
considered "finished" (before the next "sequence point," in ANSI
|
|
C's terminology). In the example, the compiler chose to
|
|
multiply the previous value by itself and to perform both
|
|
increments afterwards.
|
|
|
|
The behavior of code which contains multiple, ambiguous side
|
|
effects has always been undefined. Don't even try to find out
|
|
how your compiler implements such things (contrary to the ill-
|
|
advised exercises in many C textbooks); as K&R wisely point out,
|
|
"if you don't know _how_ they are done on various machines, that
|
|
innocence may help to protect you."
|
|
|
|
References: K&R I Sec. 2.12 p. 50; K&R II Sec. 2.12 p. 54; ANSI
|
|
Sec. 3.3 p. 39; CT&P Sec. 3.7 p. 47; PCS Sec. 9.5 pp. 120-1.
|
|
(Ignore H&S Sec. 7.12 pp. 190-1, which is obsolete.)
|
|
|
|
4.3: But what about the &&, ||, and comma operators?
|
|
I see code like "if((c = getchar()) == EOF || c == '\n')" ...
|
|
|
|
A: There is a special exception for those operators (as well as
|
|
?: ); each of them does imply a sequence point (i.e. left-to-
|
|
right evaluation is guaranteed). Any book on C should make this
|
|
clear.
|
|
|
|
References: K&R I Sec. 2.6 p. 38, Secs. A7.11-12 pp. 190-1;
|
|
K&R II Sec. 2.6 p. 41, Secs. A7.14-15 pp. 207-8; ANSI
|
|
Secs. 3.3.13 p. 52, 3.3.14 p. 52, 3.3.15 p. 53, 3.3.17 p. 55,
|
|
CT&P Sec. 3.7 pp. 46-7.
|
|
|
|
4.4: If I'm not using the value of the expression, should I use i++
|
|
or ++i to increment a variable?
|
|
|
|
A: Since the two forms differ only in the value yielded, they are
|
|
entirely equivalent when only their side effect is needed.
|
|
|
|
4.5: Why doesn't the code
|
|
|
|
int a = 1000, b = 1000;
long int c = a * b;
|
|
|
|
work?
|
|
|
|
A: Since both operands have type int, the multiplication is
|
|
carried out using int arithmetic, and the result may overflow
|
|
and/or be truncated before being assigned to the long int left-
|
|
hand-side. Use an explicit cast to force long arithmetic:
|
|
|
|
long int c = (long int)a * b;
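
The same cast, placed inside a small function (multiply is an
illustrative name):

```c
/* The cast converts a to long before the multiplication; b is then
 * converted to long as well, so the product is computed in long. */
long multiply(int a, int b)
{
    return (long)a * b;
}
```

Note that (long)(a * b) would not help, since the multiplication
would already have been carried out in int.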
|
|
|
|
|
|
Section 5. ANSI C
|
|
|
|
5.1: What is the "ANSI C Standard?"
|
|
|
|
A: In 1983, the American National Standards Institute commissioned
|
|
a committee, X3J11, to standardize the C language. After a
|
|
long, arduous process, including several widespread public
|
|
reviews, the committee's work was finally ratified as an
|
|
American National Standard, X3.159-1989, on December 14, 1989,
|
|
and published in the spring of 1990. For the most part, ANSI C
|
|
standardizes existing practice, with a few additions from C++
|
|
(most notably function prototypes) and support for multinational
|
|
character sets (including the much-lambasted trigraph
|
|
sequences). The ANSI C standard also formalizes the C run-time
|
|
library support routines.
|
|
|
|
The published Standard includes a "Rationale," which explains
|
|
many of its decisions, and discusses a number of subtle points,
|
|
including several of those covered here. (The Rationale is "not
|
|
part of ANSI Standard X3.159-1989, but is included for
|
|
information only.")
|
|
|
|
The Standard has been adopted as an international standard,
|
|
ISO/IEC 9899:1990, although the sections are numbered
|
|
differently (briefly, ANSI sections 2 through 4 correspond
|
|
roughly to ISO sections 5 through 7), and the Rationale is
|
|
currently not included.
|
|
|
|
5.2: How can I get a copy of the Standard?
|
|
|
|
A: ANSI X3.159 has been officially superseded by ISO 9899.
|
|
Copies are available from
|
|
|
|
American National Standards Institute
|
|
11 W. 42nd St., 13th floor
|
|
New York, NY 10036 USA
|
|
(+1) 212 642 4900
|
|
|
|
or
|
|
|
|
Global Engineering Documents
|
|
2805 McGaw Avenue
|
|
Irvine, CA 92714 USA
|
|
(+1) 714 261 1455
|
|
(800) 854 7179 (U.S. & Canada)
|
|
|
|
The cost is $130.00 from ANSI or $162.50 from Global. Copies of
|
|
the original X3.159 (including the Rationale) are still
|
|
available at $205.00 from ANSI or $200.50 from Global. Note
|
|
that ANSI derives revenues to support its operations from the
|
|
sale of printed standards, so electronic copies are _not_
|
|
available.
|
|
|
|
The text of the Rationale (not the full Standard) is now
|
|
available for anonymous ftp from ftp.uu.net (see question 17.8)
|
|
in directory doc/standards/ansi/X3.159-1989 . The Rationale has
|
|
also been printed by Silicon Press, ISBN 0-929306-07-4.
|
|
|
|
5.3: Does anyone have a tool for converting old-style C programs to
|
|
ANSI C, or vice versa, or for automatically generating
|
|
prototypes?
|
|
|
|
A: Two programs, protoize and unprotoize, convert back and forth
|
|
between prototyped and "old style" function definitions and
|
|
declarations. (These programs do _not_ handle full-blown
|
|
translation between "Classic" C and ANSI C.) These programs
|
|
exist as patches to the FSF GNU C compiler, gcc. Look for the
|
|
file protoize-1.39.0.5.Z in pub/gnu at prep.ai.mit.edu
|
|
(18.71.0.38), or at several other FSF archive sites.
|
|
|
|
The unproto program (/pub/unix/unproto4.shar.Z on
|
|
ftp.win.tue.nl) is a filter which sits between the preprocessor
|
|
and the next compiler pass, converting most of ANSI C to
|
|
traditional C on-the-fly.
|
|
|
|
The GNU GhostScript package comes with a little program called
|
|
ansi2knr.
|
|
|
|
Several prototype generators exist, many as modifications to
|
|
lint. Version 3 of CPROTO was posted to comp.sources.misc in
|
|
March, 1992. See also question 17.8.
|
|
|
|
Finally, are you sure you really need to convert lots of old
|
|
code to ANSI C? The old-style function syntax is still
|
|
acceptable.
|
|
|
|
5.4: I'm trying to use the ANSI "stringizing" preprocessing operator
|
|
# to insert the value of a symbolic constant into a message, but
|
|
it keeps stringizing the macro's name rather than its value.
|
|
|
|
A: You must use something like the following two-step procedure to
|
|
force the macro to be expanded as well as stringized:
|
|
|
|
#define str(x) #x
#define xstr(x) str(x)
#define OP plus
char *opname = xstr(OP);
|
|
|
|
This sets opname to "plus" rather than "OP".
|
|
|
|
An equivalent circumlocution is necessary with the token-pasting
|
|
operator ## when the values (rather than the names) of two
|
|
macros are to be concatenated.
|
|
|
|
References: ANSI Sec. 3.8.3.2, Sec. 3.8.3.5 example p. 93.
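
The token-pasting counterpart might be sketched as follows. The
macro and identifier names here are all illustrative.

```c
#define paste(a, b)  a##b
#define xpaste(a, b) paste(a, b)   /* extra level lets arguments expand */

#define PREFIX counter             /* illustrative names */
#define SUFFIX 1

/* Expands to: int counter1 = 42;
 * Plain paste(PREFIX, SUFFIX) would yield PREFIXSUFFIX instead. */
int xpaste(PREFIX, SUFFIX) = 42;

int get_counter1(void)
{
    return counter1;
}
```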
|
|
|
|
5.5: What's the difference between "char const *p" and
|
|
"char * const p"?
|
|
|
|
A: "char const *p" is a pointer to a constant character (you can't
|
|
change the character); "char * const p" is a constant pointer to
|
|
a (variable) character (i.e. you can't change the pointer).
|
|
(Read these "inside out" to understand them. See question
|
|
10.4.)
|
|
|
|
References: ANSI Sec. 3.5.4.1 .
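
A sketch of the two declarations at work as function parameters;
the function names are made up. The operations noted in comments
as rejected are the ones a conforming compiler must diagnose.

```c
#include <ctype.h>

/* p points to characters which may not be changed through p;
 * p itself may be moved. */
int count_spaces(char const *p)
{
    int n = 0;

    for (; *p != '\0'; p++)        /* changing p is fine */
        if (*p == ' ')
            n++;                   /* "*p = 'x';" would be rejected */
    return n;
}

/* p itself may not be changed; the character it points to may. */
void capitalize(char * const p)
{
    *p = (char)toupper((unsigned char)*p);   /* "p++;" would be rejected */
}
```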
|
|
|
|
5.6: Why can't I pass a char ** to a function which expects a
|
|
const char **?
|
|
|
|
A: You can use a pointer-to-T (for any type T) where a pointer-to-
|
|
const-T is expected, but the rule (an explicit exception) which
|
|
permits slight mismatches in qualified pointer types is not
|
|
applied recursively, but only at the top level.
|
|
|
|
You must use explicit casts (e.g. (const char **) in this case)
|
|
when assigning (or passing) pointers which have qualifier
|
|
mismatches at other than the first level of indirection.
|
|
|
|
References: ANSI Sec. 3.1.2.6 p. 26, Sec. 3.3.16.1 p. 54,
|
|
Sec. 3.5.3 p. 65.
|
|
|
|
5.7: My ANSI compiler complains about a mismatch when it sees
|
|
|
|
extern int func(float);

int func(x)
float x;
{...
|
|
|
|
A: You have mixed the new-style prototype declaration
|
|
"extern int func(float);" with the old-style definition
|
|
"int func(x) float x;". Old C (and ANSI C, in the absence of
|
|
prototypes, and in variable-length argument lists) "widens"
|
|
certain arguments when they are passed to functions. floats are
|
|
promoted to double, and characters and short integers are
|
|
promoted to ints. (The values are automatically converted back
|
|
to the corresponding narrower types within the body of the
|
|
called function, if they are declared that way there.)
|
|
|
|
The problem can be fixed either by using new-style syntax
|
|
consistently in the definition:
|
|
|
|
int func(float x) { ... }
|
|
|
|
or by changing the new-style prototype declaration to match the
|
|
old-style definition:
|
|
|
|
extern int func(double);
|
|
|
|
(In this case, it would be clearest to change the old-style
|
|
definition to use double as well, as long as the address of that
|
|
parameter is not taken.)
|
|
|
|
It may also be safer to avoid "narrow" (char, short int, and
|
|
float) function arguments and return types.
|
|
|
|
References: ANSI Sec. 3.3.2.2 .
|
|
|
|
5.8: Why does the declaration
|
|
|
|
extern f(struct x {int s;} *p);
|
|
|
|
give me an obscure warning message about "struct x introduced in
|
|
prototype scope"?
|
|
|
|
A: In a quirk of C's normal block scoping rules, a struct declared
|
|
only within a prototype cannot be compatible with other structs
|
|
declared in the same source file, nor can the struct tag be used
|
|
later as you'd expect (it goes out of scope at the end of the
|
|
prototype).
|
|
|
|
To resolve the problem, declare the structure at file scope,
ahead of the prototype which uses it:

struct x {int s;};
extern f(struct x *p);

An incomplete declaration "struct x;" ahead of the prototype is
enough when the prototype merely mentions struct x, but it does
not help if the member list stays inside the prototype: a
definition there still creates a new type in prototype scope.
|
|
|
|
References: ANSI Sec. 3.1.2.1 p. 21, Sec. 3.1.2.6 p. 26,
|
|
Sec. 3.5.2.3 p. 63.
|
|
|
|
5.9: I'm getting strange syntax errors inside code which I've
|
|
#ifdeffed out.
|
|
|
|
A: Under ANSI C, the text inside a "turned off" #if, #ifdef, or
|
|
#ifndef must still consist of "valid preprocessing tokens."
|
|
This means that there must be no unterminated comments or quotes
|
|
(note particularly that an apostrophe within a contracted word
|
|
could look like the beginning of a character constant), and no
|
|
newlines inside quotes. Therefore, natural-language comments
|
|
and pseudocode should always be written between the "official"
|
|
comment delimiters /* and */. (But see also question 17.10, and
|
|
6.7.)
|
|
|
|
References: ANSI Sec. 2.1.1.2 p. 6, Sec. 3.1 p. 19 line 37.
|
|
|
|
5.10: Can I declare main as void, to shut off these annoying "main
|
|
returns no value" messages? (I'm calling exit(), so main
|
|
doesn't return.)
|
|
|
|
A: No. main must be declared as returning an int, and as taking
|
|
either zero or two arguments (of the appropriate type). If
|
|
you're calling exit() but still getting warnings, you'll have to
|
|
insert a redundant return statement (or use some kind of
|
|
"notreached" directive, if available).
|
|
|
|
References: ANSI Sec. 2.1.2.2.1 pp. 7-8.
|
|
|
|
5.11: Is exit(status) truly equivalent to returning status from main?
|
|
|
|
A: Essentially, except under a few older, nonconforming systems,
|
|
and unless data local to main might be needed during cleanup
|
|
(due perhaps to a setbuf or atexit call).
|
|
|
|
References: ANSI Sec. 2.1.2.2.3 p. 8.
|
|
|
|
5.12: Why does the ANSI Standard not guarantee more than six monocase
|
|
characters of external identifier significance?
|
|
|
|
A: The problem is older linkers which are under the control of
neither the ANSI standard nor the C compiler developers on the
|
|
systems which have them. The limitation is only that
|
|
identifiers be _significant_ in the first six characters, not
|
|
that they be restricted to six characters in length. This
|
|
limitation is annoying, but certainly not unbearable, and is
|
|
marked in the Standard as "obsolescent," i.e. a future revision
|
|
will likely relax it.
|
|
|
|
This concession to current, restrictive linkers really had to be
|
|
made, no matter how vehemently some people oppose it. (The
|
|
Rationale notes that its retention was "most painful.") If you
|
|
disagree, or have thought of a trick by which a compiler
|
|
burdened with a restrictive linker could present the C
|
|
programmer with the appearance of more significance in external
|
|
identifiers, read the excellently-worded section 3.1.2 in the
|
|
X3.159 Rationale (see question 5.1), which discusses several
|
|
such schemes and explains why they could not be mandated.
|
|
|
|
References: ANSI Sec. 3.1.2 p. 21, Sec. 3.9.1 p. 96, Rationale
|
|
Sec. 3.1.2 pp. 19-21.
|
|
|
|
5.13: What is the difference between memcpy and memmove?
|
|
|
|
A: memmove offers guaranteed behavior if the source and destination
|
|
arguments overlap. memcpy makes no such guarantee, and may
|
|
therefore be more efficiently implementable. When in doubt,
|
|
it's safer to use memmove.
|
|
|
|
References: ANSI Secs. 4.11.2.1, 4.11.2.2, Rationale
|
|
Sec. 4.11.2 .
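
A typical overlapping case, sketched with an invented function
name: opening a one-character gap at the front of a string which
is held in a buffer with room to spare.

```c
#include <string.h>

/* Shifts the string in buf (including its '\0') right by one
 * position.  Source and destination overlap, so memmove is the
 * appropriate call.  buf must have room for one more character. */
void shift_right(char *buf)
{
    memmove(buf + 1, buf, strlen(buf) + 1);
}
```

After the call, buf[0] still holds its old value and can be
overwritten with whatever is to be inserted.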
|
|
|
|
5.14: My compiler is rejecting the simplest possible test programs,
|
|
with all kinds of syntax errors.
|
|
|
|
A: Perhaps it is a pre-ANSI compiler, unable to accept function
|
|
prototypes and the like.
|
|
|
|
5.15: Why won't the Frobozz Magic C Compiler, which claims to be ANSI
|
|
compliant, accept this code? I know that the code is ANSI,
|
|
because gcc accepts it.
|
|
|
|
A: Most compilers support a few non-Standard extensions, gcc more
|
|
so than most. Are you sure that the code being rejected doesn't
|
|
rely on such an extension? It is usually a bad idea to perform
|
|
experiments with a particular compiler to determine properties
|
|
of a language; the applicable standard may permit variations, or
|
|
the compiler may be wrong.
|
|
|
|
5.16: Is char a[3] = "abc"; legal? What does it mean?
|
|
|
|
A: It is legal, though questionably useful. It declares an array
|
|
of size three, initialized with the three characters 'a', 'b',
|
|
and 'c', without the usual terminating '\0' character; the array
|
|
is therefore not a true C string and cannot be used with strcpy,
|
|
printf %s, etc.
|
|
|
|
References: ANSI Sec. 3.5.7 pp. 72-3.
|
|
|
|
5.17: What are #pragmas and what are they good for?
|
|
|
|
A: The #pragma directive provides a single, well-defined "escape
|
|
hatch" which can be used for all sorts of implementation-
|
|
specific controls and extensions: source listing control,
|
|
structure packing, warning suppression (like the old lint
|
|
/* NOTREACHED */ comments), etc.
|
|
|
|
References: ANSI Sec. 3.8.6 .
|
|
|
|
|
|
Section 6. C Preprocessor
|
|
|
|
6.1: How can I write a generic macro to swap two values?
|
|
|
|
A: There is no good answer to this question. If the values are
|
|
integers, a well-known trick using exclusive-OR could perhaps be
|
|
used, but it will not work for floating-point values or
|
|
pointers, or if the two values are the same variable (and the
|
|
"obvious" supercompressed implementation for integral types
|
|
a^=b^=a^=b is in fact illegal due to multiple side-effects,
|
|
and...). If the macro is intended to be used on values of
|
|
arbitrary type (the usual goal), it cannot use a temporary,
|
|
since it does not know what type of temporary it needs, and
|
|
standard C does not provide a typeof operator.
|
|
|
|
The best all-around solution is probably to forget about using a
|
|
macro, unless you're willing to pass in the type as a third
|
|
argument.
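
The three-argument form might look like this. SWAP, swap_tmp_,
and sort2 are illustrative names; the sketch misbehaves if an
argument is itself named swap_tmp_, and it does not work for
array types.

```c
/* Swaps two lvalues of the named type, using a temporary of that
 * type.  The do/while(0) wrapper lets it be used as one statement. */
#define SWAP(type, a, b) \
    do { type swap_tmp_ = (a); (a) = (b); (b) = swap_tmp_; } while (0)

/* Example use: put two doubles into ascending order. */
void sort2(double *lo, double *hi)
{
    if (*lo > *hi)
        SWAP(double, *lo, *hi);
}
```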
|
|
|
|
6.2: I have some old code that tries to construct identifiers with a
|
|
macro like
|
|
|
|
#define Paste(a, b) a/**/b
|
|
|
|
but it doesn't work any more.
|
|
|
|
A: That comments disappeared entirely and could therefore be used
|
|
for token pasting was an undocumented feature of some early
|
|
preprocessor implementations, notably Reiser's. ANSI affirms
|
|
(as did K&R) that comments are replaced with white space.
|
|
However, since the need for pasting tokens was demonstrated and
|
|
real, ANSI introduced a well-defined token-pasting operator, ##,
|
|
which can be used like this:
|
|
|
|
#define Paste(a, b) a##b
|
|
|
|
(See also question 5.4.)
|
|
|
|
References: ANSI Sec. 3.8.3.3 p. 91, Rationale pp. 66-7.
|
|
|
|
6.3: What's the best way to write a multi-statement cpp macro?
|
|
|
|
A: The usual goal is to write a macro that can be invoked as if it
|
|
were a single function-call statement. This means that the
|
|
"caller" will be supplying the final semicolon, so the macro
|
|
body should not. The macro body cannot be a simple brace-
|
|
delineated compound statement, because syntax errors would
|
|
result if it were invoked (apparently as a single statement, but
|
|
with a resultant extra semicolon) as the if branch of an if/else
|
|
statement with an explicit else clause.
|
|
|
|
The traditional solution is to use
|
|
|
|
#define Func() do { \
        /* declarations */ \
        stmt1; \
        stmt2; \
        /* ... */ \
        } while(0)  /* (no trailing ; ) */
|
|
|
|
When the "caller" appends a semicolon, this expansion becomes a
|
|
single statement regardless of context. (An optimizing compiler
|
|
will remove any "dead" tests or branches on the constant
|
|
condition 0, although lint may complain.)
|
|
|
|
If all of the statements in the intended macro are simple
|
|
expressions, with no declarations or loops, another technique is
|
|
to write a single, parenthesized expression using one or more
|
|
comma operators. (See the example under question 6.10 below.
|
|
This technique also allows a value to be "returned.")
|
|
|
|
References: CT&P Sec. 6.3 pp. 82-3.
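
A concrete instance of the do/while(0) idiom, placed in the very
if/else position which defeats a bare brace-delimited body. RESET
and demo are made-up names.

```c
#define RESET(x, y) do { \
        (x) = 0;         \
        (y) = 0;         \
    } while (0)          /* (no trailing ; ) */

int demo(int flag)
{
    int a = 1, b = 2;

    if (flag)
        RESET(a, b);     /* the caller's ; ends the do-while statement */
    else
        a = b = 9;
    return a + b;
}
```

Had RESET expanded to a plain { ... } block, the semicolon after
RESET(a, b) would have ended the if statement and left the else
without a partner.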
|
|
|
|
6.4: Is it acceptable for one header file to #include another?
|
|
|
|
A: There has been considerable debate surrounding this question.
|
|
Many people believe that "nested #include files" are to be
|
|
avoided: the prestigious Indian Hill Style Guide (see question
|
|
14.3) disparages them; they can make it harder to find relevant
|
|
definitions; they can lead to multiple-declaration errors if a
|
|
file is #included twice; and they make manual Makefile
|
|
maintenance very difficult. On the other hand, they make it
|
|
possible to use header files in a modular way (a header file
|
|
#includes what it needs itself, rather than requiring each
|
|
#includer to do so, a requirement that can lead to intractable
|
|
headaches); a tool like grep (or a tags file) makes it easy to
|
|
find definitions no matter where they are; a popular trick:
|
|
|
|
#ifndef HEADER_FILE_NAME
|
|
#define HEADER_FILE_NAME
|
|
...header file contents...
|
|
#endif
|
|
|
|
makes a header file "idempotent" so that it can safely be
|
|
#included multiple times; and automated Makefile maintenance
|
|
tools (which are a virtual necessity in large projects anyway)
|
|
handle dependency generation in the face of nested #include
|
|
files easily. See also section 14.
|
|
|
|
6.5: Does the sizeof operator work in preprocessor #if directives?
|
|
|
|
A: No. Preprocessing happens during an earlier pass of
|
|
compilation, before type names have been parsed. Consider using
|
|
the predefined constants in ANSI's <limits.h>, if applicable, or
|
|
a "configure" script, instead. (Better yet, try to write code
|
|
which is inherently insensitive to type sizes.)
|
|
|
|
References: ANSI Sec. 2.1.1.2 pp. 6-7, Sec. 3.8.1 p. 87
|
|
footnote 83.
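
Here is a sketch of the <limits.h> approach: instead of testing sizeof(int) in an #if, test the range of int. The typedef name at_least_32 and the helper function are assumptions made for this example.

```c
#include <limits.h>

/* at_least_32 is a hypothetical typedef name, chosen for this sketch. */
#if INT_MAX >= 2147483647
typedef int at_least_32;     /* int is wide enough on this machine */
#else
typedef long at_least_32;    /* long is guaranteed at least 32 bits */
#endif

/* Reports how many bits the chosen type occupies. */
int at_least_32_bits(void)
{
    return (int)(sizeof(at_least_32) * CHAR_BIT);
}
```

Note that sizeof is perfectly usable in ordinary expressions, as in the function above; it is only the preprocessor that cannot evaluate it.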
|
|
|
|
6.6: How can I use a preprocessor #if expression to tell if a machine
|
|
is big-endian or little-endian?
|
|
|
|
A: You probably can't. (Preprocessor arithmetic uses only long
|
|
ints, and there is no concept of addressing.) Are you sure you
|
|
need to know the machine's endianness explicitly? Usually it's
|
|
better to write code which doesn't care.
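
One way to write code which doesn't care is to assemble and take apart multi-byte values with shifts, rather than by overlaying them on memory. The following is a sketch, not the only approach; the function names are invented here, and a big-endian external format is assumed.

```c
/* Reads a 32-bit value stored most-significant byte first. */
unsigned long get_be32(const unsigned char *p)
{
    return ((unsigned long)p[0] << 24) |
           ((unsigned long)p[1] << 16) |
           ((unsigned long)p[2] << 8)  |
            (unsigned long)p[3];
}

/* Stores the low 32 bits of v most-significant byte first. */
void put_be32(unsigned char *p, unsigned long v)
{
    p[0] = (unsigned char)((v >> 24) & 0xff);
    p[1] = (unsigned char)((v >> 16) & 0xff);
    p[2] = (unsigned char)((v >> 8) & 0xff);
    p[3] = (unsigned char)(v & 0xff);
}
```

Because the shifts operate on values, not on storage, the same source works unchanged on big-endian and little-endian machines.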
|
|
|
|
6.7: I've got this tricky processing I want to do at compile time and
|
|
I can't figure out a way to get cpp to do it.
|
|
|
|
A: cpp is not intended as a general-purpose preprocessor. Rather
|
|
than forcing it to do something inappropriate, consider writing
|
|
your own little special-purpose preprocessing tool, instead.
|
|
You can easily get a utility like make(1) to run it for you
|
|
automatically.
|
|
|
|
If you are trying to preprocess something other than C, consider
|
|
using a general-purpose preprocessor (such as m4).
|
|
|
|
6.8: I inherited some code which contains far too many #ifdef's for
|
|
my taste. How can I preprocess the code to leave only one
|
|
conditional compilation set, without running it through cpp and
|
|
expanding all of the #include's and #define's as well?
|
|
|
|
A: There is a program floating around called unifdef which does
|
|
exactly this. (See question 17.8.)
|
|
|
|
6.9: How can I list all of the pre#defined identifiers?
|
|
|
|
A: There's no standard way, although it is a frequent need. The
|
|
most expedient way is probably to extract printable strings from
|
|
the compiler or preprocessor executable with something like the
|
|
UNIX strings(1) utility.
|
|
|
|
6.10: How can I write a cpp macro which takes a variable number of
|
|
arguments?
|
|
|
|
A: One popular trick is to define the macro with a single argument,
|
|
and call it with a double set of parentheses, which appear to
|
|
the preprocessor to indicate a single argument:
|
|
|
|
#define DEBUG(args) (printf("DEBUG: "), printf args)
|
|
|
|
if(n != 0) DEBUG(("n is %d\n", n));
|
|
|
|
The obvious disadvantage is that the caller must always remember
|
|
to use the extra parentheses. Another solution is to use
|
|
different macros (DEBUG1, DEBUG2, etc.) depending on the number
|
|
of arguments. (It is often better to use a bona-fide function,
|
|
which can take a variable number of arguments in a well-defined
|
|
way. See questions 7.1 and 7.2 below.)
|
|
|
|
|
|
Section 7. Variable-Length Argument Lists
|
|
|
|
7.1: How can I write a function that takes a variable number of
|
|
arguments?
|
|
|
|
A: Use the <stdarg.h> header (or, if you must, the older
|
|
<varargs.h>).
|
|
|
|
Here is a function which concatenates an arbitrary number of
|
|
strings into malloc'ed memory:
|
|
|
|
#include <stdlib.h> /* for malloc, NULL, size_t */
|
|
#include <stdarg.h> /* for va_ stuff */
|
|
#include <string.h> /* for strcat et al */
|
|
|
|
char *vstrcat(char *first, ...)
|
|
{
|
|
size_t len = 0;
|
|
char *retbuf;
|
|
va_list argp;
|
|
char *p;
|
|
|
|
if(first == NULL)
|
|
return NULL;
|
|
|
|
len = strlen(first);
|
|
|
|
va_start(argp, first);
|
|
|
|
while((p = va_arg(argp, char *)) != NULL)
|
|
len += strlen(p);
|
|
|
|
va_end(argp);
|
|
|
|
retbuf = malloc(len + 1); /* +1 for trailing \0 */
|
|
|
|
if(retbuf == NULL)
|
|
return NULL; /* error */
|
|
|
|
(void)strcpy(retbuf, first);
|
|
|
|
va_start(argp, first);
|
|
|
|
while((p = va_arg(argp, char *)) != NULL)
|
|
(void)strcat(retbuf, p);
|
|
|
|
va_end(argp);
|
|
|
|
return retbuf;
|
|
}
|
|
|
|
Usage is something like
|
|
|
|
char *str = vstrcat("Hello, ", "world!", (char *)NULL);
|
|
|
|
Note the cast on the last argument. (Also note that the caller
|
|
must free the returned, malloc'ed storage.)
|
|
|
|
Under a pre-ANSI compiler, rewrite the function definition
|
|
without a prototype ("char *vstrcat(first) char *first; {"),
|
|
include <stdio.h> rather than <stdlib.h>, add "extern
|
|
char *malloc();", and use int instead of size_t. You may also
|
|
have to delete the (void) casts, and use the older varargs
|
|
package instead of stdarg. See the next question for hints.
|
|
|
|
Remember that in variable-length argument lists, function
|
|
prototypes do not supply parameter type information; therefore,
|
|
default argument promotions apply (see question 5.7), and null
|
|
pointer arguments must be typed explicitly (see question 1.2).
|
|
|
|
References: K&R II Sec. 7.3 p. 155, Sec. B7 p. 254; H&S
|
|
Sec. 13.4 pp. 286-9; ANSI Secs. 4.8 through 4.8.1.3 .
|
|
|
|
7.2: How can I write a function that takes a format string and a
|
|
variable number of arguments, like printf, and passes them to
|
|
printf to do most of the work?
|
|
|
|
A: Use vprintf, vfprintf, or vsprintf.
|
|
|
|
Here is an "error" routine which prints an error message,
|
|
preceded by the string "error: " and terminated with a newline:
|
|
|
|
#include <stdio.h>
|
|
#include <stdarg.h>
|
|
|
|
void
|
|
error(char *fmt, ...)
|
|
{
|
|
va_list argp;
|
|
fprintf(stderr, "error: ");
|
|
va_start(argp, fmt);
|
|
vfprintf(stderr, fmt, argp);
|
|
va_end(argp);
|
|
fprintf(stderr, "\n");
|
|
}
|
|
|
|
To use the older <varargs.h> package, instead of <stdarg.h>,
|
|
change the function header to:
|
|
|
|
void error(va_alist)
|
|
va_dcl
|
|
{
|
|
char *fmt;
|
|
|
|
change the va_start line to
|
|
|
|
va_start(argp);
|
|
|
|
and add the line
|
|
|
|
fmt = va_arg(argp, char *);
|
|
|
|
between the calls to va_start and vfprintf. (Note that there is
|
|
no semicolon after va_dcl.)
|
|
|
|
References: K&R II Sec. 8.3 p. 174, Sec. B1.2 p. 245; H&S
|
|
Sec. 17.12 p. 337; ANSI Secs. 4.9.6.7, 4.9.6.8, 4.9.6.9 .
|
|
|
|
7.3: How can I discover how many arguments a function was actually
|
|
called with?
|
|
|
|
A: This information is not available to a portable program. Some
|
|
old systems provided a nonstandard nargs() function, but its use
|
|
was always questionable, since it typically returned the number
|
|
of words passed, not the number of arguments. (Floating point
|
|
values and structures are usually passed as several words.)
|
|
|
|
Any function which takes a variable number of arguments must be
|
|
able to determine from the arguments themselves how many of them
|
|
there are. printf-like functions do this by looking for
|
|
formatting specifiers (%d and the like) in the format string
|
|
(which is why these functions fail badly if the format string
|
|
does not match the argument list). Another common technique
|
|
(useful when the arguments are all of the same type) is to use a
|
|
sentinel value (often 0, -1, or an appropriately-cast null
|
|
pointer) at the end of the list (see the execl and vstrcat
|
|
examples under questions 1.2 and 7.1 above).
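
Besides a format string or a sentinel, a function can be told the number of arguments outright, by an explicit leading count. Here is a minimal sketch of that variation; sum_ints is a name invented for this example.

```c
#include <stdarg.h>

/* Adds up "count" int arguments; the caller must pass exactly that many. */
int sum_ints(int count, ...)
{
    va_list argp;
    int i, total = 0;

    va_start(argp, count);
    for(i = 0; i < count; i++)
        total += va_arg(argp, int);
    va_end(argp);

    return total;
}
```

Usage is something like sum_ints(3, 10, 20, 12). As with printf, the function fails badly if the count does not match the arguments actually passed.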
|
|
|
|
7.4: I can't get the va_arg macro to pull in an argument of type
|
|
pointer-to-function.
|
|
|
|
A: The type-rewriting games which the va_arg macro typically plays
|
|
are stymied by overly-complicated types such as pointer-to-
|
|
function. If you use a typedef for the function pointer type,
|
|
however, all will be well.
|
|
|
|
References: ANSI Sec. 4.8.1.2 p. 124.
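
Here is a sketch of the typedef approach. The names intfunc, apply_all, twice, and plus_one are all invented for this illustration; the argument list is terminated by a null function pointer, which must be cast explicitly.

```c
#include <stdarg.h>
#include <stddef.h>

typedef int (*intfunc)(int);    /* pointer to function taking and returning int */

static int twice(int x)    { return 2 * x; }
static int plus_one(int x) { return x + 1; }

/* Applies each function in the list to x, in order, and returns the result. */
int apply_all(int x, ...)
{
    va_list argp;
    intfunc f;

    va_start(argp, x);
    while((f = va_arg(argp, intfunc)) != NULL)    /* typedef name, not a raw type */
        x = (*f)(x);
    va_end(argp);

    return x;
}

/* Example call: (5 * 2) + 1 */
int demo_apply(void)
{
    return apply_all(5, twice, plus_one, (intfunc)NULL);
}
```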
|
|
|
|
7.5: How can I write a function which takes a variable number of
|
|
arguments and passes them to some other function (which takes a
|
|
variable number of arguments)?
|
|
|
|
A: In general, you cannot. You must provide a version of that
|
|
other function which accepts a va_list pointer, as does vfprintf
|
|
in the example above. If the arguments must be passed directly
|
|
as actual arguments (not indirectly through a va_list pointer)
|
|
to another function which is itself variadic (for which you do
|
|
not have the option of creating an alternate, va_list-accepting
|
|
version) no portable solution is possible. (The problem can be
|
|
solved by resorting to machine-specific assembly language.)
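
When you do control the other function, the usual arrangement is a pair: a va_list-accepting version which does the work, and a thin variadic wrapper around it. The sketch below assumes the names vwarnf and warnf (both invented here) and uses snprintf and vsnprintf, which come from a later revision of the C standard than the one this list describes.

```c
#include <stdarg.h>
#include <stdio.h>

/* va_list-accepting version: writes "warning: " plus the formatted text. */
int vwarnf(char *buf, size_t size, const char *fmt, va_list argp)
{
    int n, m;

    n = snprintf(buf, size, "warning: ");
    if(n < 0 || (size_t)n >= size)
        return -1;
    m = vsnprintf(buf + n, size - (size_t)n, fmt, argp);
    return m < 0 ? -1 : n + m;
}

/* Variadic wrapper: packages its arguments and hands them on. */
int warnf(char *buf, size_t size, const char *fmt, ...)
{
    va_list argp;
    int n;

    va_start(argp, fmt);
    n = vwarnf(buf, size, fmt, argp);
    va_end(argp);
    return n;
}
```

Any other variadic function which wants the same behavior can call vwarnf with its own va_list, which is exactly what cannot be done with warnf itself.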
|
|
|
|
7.6: How can I call a function with an argument list built up at run
|
|
time?
|
|
|
|
A: There is no guaranteed or portable way to do this. If you're
|
|
curious, ask this list's editor, who has a few wacky ideas you
|
|
could try... (See also question 16.10.)
|
|
|
|
|
|
Section 8. Boolean Expressions and Variables
|
|
|
|
8.1: What is the right type to use for boolean values in C? Why
|
|
isn't it a standard type? Should #defines or enums be used for
|
|
the true and false values?
|
|
|
|
A: C does not provide a standard boolean type, because picking one
|
|
involves a space/time tradeoff which is best decided by the
|
|
programmer. (Using an int for a boolean may be faster, while
|
|
using char may save data space.)
|
|
|
|
The choice between #defines and enums is arbitrary and not
|
|
terribly interesting (see also question 9.1). Use any of
|
|
|
|
#define TRUE 1 #define YES 1
|
|
#define FALSE 0 #define NO 0
|
|
|
|
enum bool {false, true}; enum bool {no, yes};
|
|
|
|
or use raw 1 and 0, as long as you are consistent within one
|
|
program or project. (An enum may be preferable if your debugger
|
|
expands enum values when examining variables.)
|
|
|
|
Some people prefer variants like
|
|
|
|
#define TRUE (1==1)
|
|
#define FALSE (!TRUE)
|
|
|
|
or define "helper" macros such as
|
|
|
|
#define Istrue(e) ((e) != 0)
|
|
|
|
These don't buy anything (see question 8.2 below; see also
|
|
question 1.6).
|
|
|
|
8.2: Isn't #defining TRUE to be 1 dangerous, since any nonzero value
|
|
is considered "true" in C? What if a built-in boolean or
|
|
relational operator "returns" something other than 1?
|
|
|
|
A: It is true (sic) that any nonzero value is considered true in C,
|
|
but this applies only "on input", i.e. where a boolean value is
|
|
expected. When a boolean value is generated by a built-in
|
|
operator, it is guaranteed to be 1 or 0. Therefore, the test
|
|
|
|
if((a == b) == TRUE)
|
|
|
|
will work as expected (as long as TRUE is 1), but it is
|
|
obviously silly. In general, explicit tests against TRUE and
|
|
FALSE are undesirable, because some library functions (notably
|
|
isupper, isalpha, etc.) return, on success, a nonzero value
|
|
which is _not_ necessarily 1. (Besides, if you believe that
|
|
"if((a == b) == TRUE)" is an improvement over "if(a == b)", why
|
|
stop there? Why not use "if(((a == b) == TRUE) == TRUE)"?) A
|
|
good rule of thumb is to use TRUE and FALSE (or the like) only
|
|
for assignment to a Boolean variable, or as the return value
|
|
from a Boolean function, never in a comparison.
|
|
|
|
The preprocessor macros TRUE and FALSE are used for code
|
|
readability, not because the underlying values might ever
|
|
change. (See also questions 1.7 and 1.9.)
|
|
|
|
References: K&R I Sec. 2.7 p. 41; K&R II Sec. 2.6 p. 42,
|
|
Sec. A7.4.7 p. 204, Sec. A7.9 p. 206; ANSI Secs. 3.3.3.3, 3.3.8,
|
|
3.3.9, 3.3.13, 3.3.14, 3.3.15, 3.6.4.1, 3.6.5; Achilles and the
|
|
Tortoise.
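
As an illustration of testing a Boolean-valued library function directly, here is a small sketch; count_upper is a name invented for this example.

```c
#include <ctype.h>

/* Counts the upper-case letters in s. */
int count_upper(const char *s)
{
    int n = 0;

    for(; *s != '\0'; s++) {
        if(isupper((unsigned char)*s))    /* not "isupper(...) == TRUE" */
            n++;
    }
    return n;
}
```

Had the test been written as a comparison against a TRUE defined as 1, it could silently fail on any implementation where isupper returns some other nonzero value.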
|
|
|
|
|
|
Section 9. Structs, Enums, and Unions
|
|
|
|
9.1: What is the difference between an enum and a series of
|
|
preprocessor #defines?
|
|
|
|
A: At the present time, there is little difference. Although many
|
|
people might have wished otherwise, the ANSI standard says that
|
|
enumerations may be freely intermixed with integral types,
|
|
without errors. (If such intermixing were disallowed without
|
|
explicit casts, judicious use of enums could catch certain
|
|
programming errors.)
|
|
|
|
Some advantages of enums are that the numeric values are
|
|
automatically assigned, that a debugger may be able to display
|
|
the symbolic values when enum variables are examined, and that
|
|
they obey block scope. (A compiler may also generate nonfatal
|
|
warnings when enums and ints are indiscriminately mixed, since
|
|
doing so can still be considered bad style even though it is not
|
|
strictly illegal.) A disadvantage is that the programmer has
|
|
little control over the size (or over those nonfatal warnings).
|
|
|
|
References: K&R II Sec. 2.3 p. 39, Sec. A4.2 p. 196; H&S
|
|
Sec. 5.5 p. 100; ANSI Secs. 3.1.2.5, 3.5.2, 3.5.2.2 .
|
|
|
|
9.2: I heard that structures could be assigned to variables and
|
|
passed to and from functions, but K&R I says not.
|
|
|
|
A: What K&R I said was that the restrictions on struct operations
|
|
would be lifted in a forthcoming version of the compiler, and in
|
|
fact struct assignment and passing were fully functional in
|
|
Ritchie's compiler even as K&R I was being published. Although
|
|
a few early C compilers lacked struct assignment, all modern
|
|
compilers support it, and it is part of the ANSI C standard, so
|
|
there should be no reluctance to use it.
|
|
|
|
References: K&R I Sec. 6.2 p. 121; K&R II Sec. 6.2 p. 129; H&S
|
|
Sec. 5.6.2 p. 103; ANSI Secs. 3.1.2.5, 3.2.2.1, 3.3.16 .
|
|
|
|
9.3: How do struct passing and returning work?
|
|
|
|
A: When structures are passed as arguments to functions, the entire
|
|
struct is typically pushed on the stack, using as many words as
|
|
are required. (Programmers often choose to use pointers to
|
|
structures instead, precisely to avoid this overhead.)
|
|
|
|
Structures are often returned from functions in a location
|
|
pointed to by an extra, compiler-supplied "hidden" argument to
|
|
the function. Some older compilers used a special, static
|
|
location for structure returns, although this made struct-valued
|
|
functions nonreentrant, which ANSI C disallows.
|
|
|
|
References: ANSI Sec. 2.2.3 p. 13.
|
|
|
|
9.4: The following program works correctly, but it dumps core after
|
|
it finishes. Why?
|
|
|
|
struct list
|
|
{
|
|
char *item;
|
|
struct list *next;
|
|
}
|
|
|
|
/* Here is the main program. */
|
|
|
|
main(argc, argv)
|
|
...
|
|
|
|
A: A missing semicolon causes the compiler to believe that main
|
|
returns a structure. (The connection is hard to see because of
|
|
the intervening comment.) Since struct-valued functions are
|
|
usually implemented by adding a hidden return pointer, the
|
|
generated code for main() tries to accept three arguments,
|
|
although only two are passed (in this case, by the C start-up
|
|
code). See also question 17.15.
|
|
|
|
References: CT&P Sec. 2.3 pp. 21-2.
|
|
|
|
9.5: Why can't you compare structs?
|
|
|
|
A: There is no reasonable way for a compiler to implement struct
|
|
comparison which is consistent with C's low-level flavor. A
|
|
byte-by-byte comparison could be invalidated by random bits
|
|
present in unused "holes" in the structure (such padding is used
|
|
to keep the alignment of later fields correct). A field-by-
|
|
field comparison would require unacceptable amounts of
|
|
repetitive, in-line code for large structures.
|
|
|
|
If you want to compare two structures, you must write your own
|
|
function to do so. C++ would let you arrange for the ==
|
|
operator to map to your function.
|
|
|
|
References: K&R II Sec. 6.2 p. 129; H&S Sec. 5.6.2 p. 103; ANSI
|
|
Rationale Sec. 3.3.9 p. 47.
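
A hand-written comparison function might look like the following sketch. The structure and its fields are hypothetical, invented only to show the idea of comparing field by field according to what each field means.

```c
#include <string.h>

struct person {              /* hypothetical structure for illustration */
    char name[20];
    int age;
};

/* Returns nonzero if the two structures are equal, field by field. */
int person_equal(const struct person *a, const struct person *b)
{
    return strcmp(a->name, b->name) == 0 && a->age == b->age;
}
```

Note that the name field is compared as a string, so bytes after the terminating '\0' are ignored, just as any padding is. A memcmp of the whole structure could not make either distinction.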
|
|
|
|
9.6: I came across some code that declared a structure like this:
|
|
|
|
struct name
|
|
{
|
|
int namelen;
|
|
char name[1];
|
|
};
|
|
|
|
and then did some tricky allocation to make the name array act
|
|
like it had several elements. Is this legal and/or portable?
|
|
|
|
A: This technique is popular, although Dennis Ritchie has called it
|
|
"unwarranted chumminess with the compiler." An ANSI
|
|
Interpretation Ruling has deemed it not strictly conforming. It
|
|
seems, however, to be portable to all known implementations.
|
|
(Compilers which check array bounds carefully might issue
|
|
warnings.)
|
|
|
|
References: ANSI Rationale Sec. 3.5.4.2 pp. 54-5.
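
For the curious, the "tricky allocation" usually looks something like the sketch below. The function name makename is an assumption made for this example, and the code carries the same caveat as the technique itself: it is not strictly conforming.

```c
#include <stdlib.h>
#include <string.h>

struct name {
    int namelen;
    char name[1];
};

/* Allocates a struct name with room for a copy of newname. */
struct name *makename(const char *newname)
{
    size_t len = strlen(newname);
    /* sizeof(struct name) already includes one byte of name[], for the \0 */
    struct name *ret = malloc(sizeof(struct name) + len);

    if(ret != NULL) {
        ret->namelen = (int)len;
        memcpy(ret->name, newname, len + 1);    /* writes past name[0] */
    }
    return ret;
}
```

The caller frees the whole thing with a single call to free.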
|
|
|
|
9.7: How can I determine the byte offset of a field within a
|
|
structure?
|
|
|
|
A: ANSI C defines the offsetof macro, which should be used if
|
|
available; see <stddef.h>. If you don't have it, a suggested
|
|
implementation is
|
|
|
|
#define offsetof(type, mem) ((size_t) \
|
|
((char *)&((type *) 0)->mem - (char *)((type *) 0)))
|
|
|
|
This implementation is not 100% portable; some compilers may
|
|
legitimately refuse to accept it.
|
|
|
|
See the next question for a usage hint.
|
|
|
|
References: ANSI Sec. 4.1.5, Rationale Sec. 3.5.4.2 p. 55.
|
|
|
|
9.8: How can I access structure fields by name at run time?
|
|
|
|
A: Build a table of names and offsets, using the offsetof() macro.
|
|
The offset of field b in struct a is
|
|
|
|
offsetb = offsetof(struct a, b);
|
|
|
|
If structp is a pointer to an instance of this structure, and b
|
|
is an int field with offset as computed above, b's value can be
|
|
set indirectly with
|
|
|
|
*(int *)((char *)structp + offsetb) = value;
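
Putting the pieces together, a table of names and offsets might be sketched as follows. The second field c, the table name, and the function set_int_field are assumptions made for this example, and only int fields are handled.

```c
#include <stddef.h>
#include <string.h>

struct a {       /* the struct a of the text, with assumed int fields */
    int b;
    int c;
};

static struct fieldinfo {
    const char *name;
    size_t offset;
} fields[] = {
    { "b", offsetof(struct a, b) },
    { "c", offsetof(struct a, c) },
};

/* Sets the int field called "name"; returns 0 if there is no such field. */
int set_int_field(struct a *structp, const char *name, int value)
{
    size_t i;

    for(i = 0; i < sizeof fields / sizeof fields[0]; i++) {
        if(strcmp(fields[i].name, name) == 0) {
            *(int *)((char *)structp + fields[i].offset) = value;
            return 1;
        }
    }
    return 0;
}
```

A fuller version would record each field's type in the table as well, so that fields other than ints could be set correctly.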
|
|
|
|
9.9: Why does sizeof report a larger size than I expect for a
|
|
structure type, as if there was padding at the end?
|
|
|
|
A: Structures may have this padding (as well as internal padding;
|
|
see also question 9.5), so that alignment properties will be
|
|
preserved when an array of contiguous structures is allocated.
|
|
|
|
9.10: My compiler is leaving holes in structures, which is wasting
|
|
space and preventing "binary" I/O to external data files. Can I
|
|
turn off the padding, or otherwise control the alignment of
|
|
structs?
|
|
|
|
A: Your compiler may provide an extension to give you this control
|
|
(perhaps a #pragma), but there is no standard method. See also
|
|
question 17.2.
|
|
|
|
9.11: Can I initialize unions?
|
|
|
|
A: ANSI Standard C allows an initializer for the first member of a
|
|
union. There is no standard way of initializing the other
|
|
members (nor, under a pre-ANSI compiler, is there generally any
|
|
way of initializing any of them).
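
A brief sketch of first-member initialization follows; the union and the names used are invented for this example.

```c
union number {
    int i;         /* first member: the one an initializer applies to */
    double d;
};

static union number n = { 42 };    /* initializes n.i, not n.d */

/* Returns the value stored by the initializer. */
int first_member(void)
{
    return n.i;
}
```

If it is the double you want to initialize, one workaround is to declare it first.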
|
|
|
|
|
|
Section 10. Declarations
|
|
|
|
10.1: How do you decide which integer type to use?
|
|
|
|
A: If you might need large values (above 32767 or below -32767),
|
|
use long. Otherwise, if space is very important (there are
|
|
large arrays or many structures), use short. Otherwise, use
|
|
int. If well-defined overflow characteristics are important
|
|
and/or negative values are not, use the corresponding unsigned
|
|
types. (But beware of mixing signed and unsigned in
|
|
expressions.) Similar arguments apply when deciding between
|
|
float and double.
|
|
|
|
Although char or unsigned char can be used as a "tiny" int type,
|
|
doing so is often more trouble than it's worth, due to
|
|
unpredictable sign extension and increased code size.
|
|
|
|
These rules obviously don't apply if the address of a variable
|
|
is taken and must have a particular type.
|
|
|
|
If for some reason you need to declare something with an _exact_
|
|
size (usually the only good reason for doing so is when
|
|
attempting to conform to some externally-imposed storage layout,
|
|
but see question 17.2), be sure to encapsulate the choice behind
|
|
an appropriate typedef.
|
|
|
|
10.2: What should the 64-bit type on new, 64-bit machines be?
|
|
|
|
A: Some vendors of C products for 64-bit machines support 64-bit
|
|
long ints. Others fear that too much existing code depends on
|
|
sizeof(int) == sizeof(long) == 32 bits, and introduce a new 64-
|
|
bit long long int type instead.
|
|
|
|
Programmers interested in writing portable code should therefore
|
|
insulate their 64-bit type needs behind appropriate typedefs.
|
|
Vendors who feel compelled to introduce a new long long int type
|
|
should advertise it as being "at least 64 bits" (which is truly
|
|
new; a type traditional C doesn't have), and not "exactly 64
|
|
bits."
|
|
|
|
10.3: I can't seem to define a linked list successfully. I tried
|
|
|
|
typedef struct
|
|
{
|
|
char *item;
|
|
NODEPTR next;
|
|
} *NODEPTR;
|
|
|
|
but the compiler gave me error messages. Can't a struct in C
|
|
contain a pointer to itself?
|
|
|
|
A: Structs in C can certainly contain pointers to themselves; the
|
|
discussion and example in section 6.5 of K&R make this clear.
|
|
The problem with this example is that the NODEPTR typedef is not
|
|
complete at the point where the "next" field is declared. To
|
|
fix it, first give the structure a tag ("struct node"). Then,
|
|
declare the "next" field as "struct node *next;", and/or move
|
|
the typedef declaration wholly before or wholly after the struct
|
|
declaration. One corrected version would be
|
|
|
|
struct node
|
|
{
|
|
char *item;
|
|
struct node *next;
|
|
};
|
|
|
|
typedef struct node *NODEPTR;
|
|
|
|
, and there are at least three other equivalently correct ways
|
|
of arranging it.
|
|
|
|
A similar problem, with a similar solution, can arise when
|
|
attempting to declare a pair of typedef'ed mutually referential
|
|
structures.
|
|
|
|
References: K&R I Sec. 6.5 p. 101; K&R II Sec. 6.5 p. 139; H&S
|
|
Sec. 5.6.1 p. 102; ANSI Sec. 3.5.2.3 .
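
For the mutually referential case, one arrangement is to place both typedefs wholly before both structure declarations, as in this sketch. The structures and their fields are hypothetical, chosen only for illustration.

```c
/* Typedefs first: pointers to not-yet-complete structures are fine. */
typedef struct author *AUTHORPTR;
typedef struct book *BOOKPTR;

struct author {
    char *name;
    BOOKPTR firstbook;     /* refers to struct book, declared below */
};

struct book {
    char *title;
    AUTHORPTR writer;      /* refers back to struct author */
};
```

This works because each structure has a tag, so the typedefs can mention "struct author" and "struct book" before either is complete.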
|
|
|
|
10.4: How do I declare an array of N pointers to functions returning
|
|
pointers to functions returning pointers to characters?
|
|
|
|
A: This question can be answered in at least three ways:
|
|
|
|
1. char *(*(*a[N])())();
|
|
|
|
2. Build the declaration up in stages, using typedefs:
|
|
|
|
typedef char *pc; /* pointer to char */
|
|
typedef pc fpc(); /* function returning pointer to char */
|
|
typedef fpc *pfpc; /* pointer to above */
|
|
typedef pfpc fpfpc(); /* function returning... */
|
|
typedef fpfpc *pfpfpc; /* pointer to... */
|
|
pfpfpc a[N]; /* array of... */
|
|
|
|
3. Use the cdecl program, which turns English into C and vice
|
|
versa:
|
|
|
|
cdecl> declare a as array of pointer to function returning
|
|
pointer to function returning pointer to char
|
|
char *(*(*a[])())()
|
|
|
|
cdecl can also explain complicated declarations, help with
|
|
casts, and indicate which set of parentheses the arguments
|
|
go in (for complicated function definitions, like the
|
|
above). Versions of cdecl are in volume 14 of
|
|
comp.sources.unix (see question 17.8) and K&R II.
|
|
|
|
Any good book on C should explain how to read these complicated
|
|
C declarations "inside out" to understand them ("declaration
|
|
mimics use").
|
|
|
|
References: K&R II Sec. 5.12 p. 122; H&S Sec. 5.10.1 p. 116.
|
|
|
|
10.5: I'm building a state machine with a bunch of functions, one for
|
|
each state. I want to implement state transitions by having
|
|
each function return a pointer to the next state function. I
|
|
find a limitation in C's declaration mechanism: there's no way
|
|
to declare these functions as returning a pointer to a function
|
|
returning a pointer to a function returning a pointer to a
|
|
function...
|
|
|
|
A: You can't do it directly. Either have the function return a
|
|
generic function pointer type, and apply a cast before calling
|
|
through it; or have it return a structure containing only a
|
|
pointer to a function returning that structure.
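
The structure-wrapping alternative can be sketched as follows. All of the names are invented for this example, and the convention that a null function pointer marks the final state is an assumption of the sketch.

```c
#include <stddef.h>

/* A state is a structure holding a pointer to the next state function. */
struct state {
    struct state (*func)(int *counter);
};

static struct state state_b(int *counter);

static struct state state_a(int *counter)
{
    struct state next;
    ++*counter;
    next.func = state_b;       /* transition: a -> b */
    return next;
}

static struct state state_b(int *counter)
{
    struct state next;
    ++*counter;
    next.func = NULL;          /* assumed convention: NULL means "stop" */
    return next;
}

/* Runs the machine from state_a; returns the number of states visited. */
int run_machine(void)
{
    int steps = 0;
    struct state s;

    s.func = state_a;
    while(s.func != NULL)
        s = (*s.func)(&steps);
    return steps;
}
```

No casts are needed, at the cost of wrapping and unwrapping the pointer at each transition.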
|
|
|
|
10.6: What's the best way to declare and define global variables?
|
|
|
|
A: First, though there can be many _declarations_ (and in many
|
|
translation units) of a single "global" (strictly speaking,
|
|
"external") variable (or function), there must be exactly one
|
|
_definition_. (The definition is the declaration that actually
|
|
allocates space, and provides an initialization value, if any.)
|
|
It is best to place the definition in some central (to the
|
|
program, or to the module) .c file, with an external declaration
|
|
in a header (".h") file, which is #included wherever the
|
|
declaration is needed. The .c file containing the definition
|
|
should also #include the header file containing the external
|
|
declaration, so that the compiler can check that the
|
|
declarations match.
|
|
|
|
This rule promotes a high degree of portability, and is
|
|
consistent with the requirements of the ANSI C Standard. Note
|
|
that UNIX compilers and linkers typically use a "common model"
|
|
which allows multiple (uninitialized) definitions. A few very
|
|
odd systems may require an explicit initializer to distinguish a
|
|
definition from an external declaration.
|
|
|
|
It is possible to use preprocessor tricks to arrange that the
|
|
declaration need only be typed once, in the header file, and
|
|
"turned into" a definition, during exactly one #inclusion, via a
|
|
special #define.
|
|
|
|
References: K&R I Sec. 4.5 pp. 76-7; K&R II Sec. 4.4 pp. 80-1;
|
|
ANSI Sec. 3.1.2.2 (esp. Rationale), Secs. 3.7, 3.7.2,
|
|
Sec. F.5.11 .
|
|
|
|
10.7: I finally figured out the syntax for declaring pointers to
|
|
functions, but now how do I initialize one?
|
|
|
|
A: Use something like
|
|
|
|
extern int func();
|
|
int (*fp)() = func;
|
|
|
|
When the name of a function appears in an expression but is not
|
|
being called (i.e. is not followed by a "("), it "decays" into a
|
|
pointer (i.e. it has its address implicitly taken), much as an
|
|
array name does.
|
|
|
|
An explicit extern declaration for the function is normally
|
|
needed, since implicit external function declaration does not
|
|
happen in this case (again, because the function name is not
|
|
followed by a "(").
|
|
|
|
10.8: I've seen different methods used for calling through pointers to
|
|
functions. What's the story?
|
|
|
|
A: Originally, a pointer to a function had to be "turned into" a
|
|
"real" function, with the * operator (and an extra pair of
|
|
parentheses, to keep the precedence straight), before calling:
|
|
|
|
int r, func(), (*fp)() = func;
|
|
r = (*fp)();
|
|
|
|
It can also be argued that functions are always called through
|
|
pointers, but that "real" functions decay implicitly into
|
|
pointers (in expressions, as they do in initializations) and so
|
|
cause no trouble. This reasoning, made widespread through pcc
|
|
and adopted in the ANSI standard, means that
|
|
|
|
r = fp();
|
|
|
|
is legal and works correctly, whether fp is a function or a
|
|
pointer to one. (The usage has always been unambiguous; there
|
|
is nothing you ever could have done with a function pointer
|
|
followed by an argument list except call through it.) An
|
|
explicit * is harmless, and still allowed (and recommended, if
|
|
portability to older compilers is important).
|
|
|
|
References: ANSI Sec. 3.3.2.2 p. 41, Rationale p. 41.
|
|
|
|
|
|
Section 11. Stdio
|
|
|
|
11.1: Why doesn't this code:
|
|
|
|
char c;
|
|
while((c = getchar()) != EOF)...
|
|
|
|
work?
|
|
|
|
A: For one thing, the variable to hold getchar's return value must
|
|
be an int. getchar can return all possible character values, as
|
|
well as EOF. By passing getchar's return value through a char,
|
|
either a normal character might be misinterpreted as EOF, or the
|
|
EOF might be altered and so never seen.
|
|
|
|
References: CT&P Sec. 5.1 p. 70.
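
A corrected loop might be sketched like this. The function name count_chars is invented for the example, and getc on an arbitrary stream stands in for getchar so that the sketch is not tied to stdin.

```c
#include <stdio.h>

/* Counts the characters remaining on fp. */
long count_chars(FILE *fp)
{
    int c;        /* int, not char: must hold every character AND EOF */
    long n = 0;

    while((c = getc(fp)) != EOF)
        n++;
    return n;
}
```

With c declared as a signed char, a legitimate input byte with value 255 would typically compare equal to EOF and end the loop early.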
|
|
|
|
11.2: Why doesn't the code scanf("%d", i); work?
|
|
|
|
A: You must always pass addresses (in this case, &i) to scanf.
|
|
|
|
11.3: Why doesn't this code:
|
|
|
|
double d;
|
|
scanf("%f", &d);
|
|
|
|
work?
|
|
|
|
A: With scanf, use %lf for values of type double, and %f for float.
|
|
(Note the discrepancy with printf, which uses %f for both double
|
|
and float, due to C's default argument promotion rules.)
|
|
|
|
11.4: Why won't the code
|
|
|
|
while(!feof(infp)) {
|
|
fgets(buf, MAXLINE, infp);
|
|
fputs(buf, outfp);
|
|
}
|
|
|
|
work?
|
|
|
|
A: C's I/O is not like Pascal's. EOF is only indicated _after_ an
|
|
input routine has tried to read, and has reached end-of-file.
|
|
Usually, you should just check the return value of the input
|
|
routine (fgets in this case); often, you don't need to use
|
|
feof() at all.
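
Here is one way the loop might be rewritten, as a sketch. The function name copy_lines and the value of MAXLINE are assumptions made for this example.

```c
#include <stdio.h>

#define MAXLINE 256    /* assumed value; the question does not give one */

/* Copies infp to outfp; returns the number of successful fgets calls. */
int copy_lines(FILE *infp, FILE *outfp)
{
    char buf[MAXLINE];
    int nreads = 0;

    while(fgets(buf, MAXLINE, infp) != NULL) {    /* test the read itself */
        fputs(buf, outfp);
        nreads++;
    }
    return nreads;
}
```

The feof version goes around the loop one extra time, writing the last line twice, because feof does not become true until a read has already failed.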
|
|
|
|
11.5: Why does everyone say not to use gets()?
|
|
|
|
A: It cannot be told the size of the buffer it's to read into, so
|
|
it cannot be prevented from overflowing that buffer.
|
|
|
|
11.6: Why does errno contain ENOTTY after a call to printf?
|
|
|
|
A: Many implementations of the stdio package adjust their behavior
|
|
slightly if stdout is a terminal. To make the determination,
|
|
these implementations perform an operation which fails (with
|
|
ENOTTY) if stdout is not a terminal. Although the output
|
|
operation goes on to complete successfully, errno still contains
|
|
ENOTTY.
|
|
|
|
References: CT&P Sec. 5.4 p. 73.
|
|
|
|
11.7: My program's prompts and intermediate output don't always show
|
|
up on the screen, especially when I pipe the output through
|
|
another program.
|
|
|
|
A: It is best to use an explicit fflush(stdout) whenever output
|
|
should definitely be visible. Several mechanisms attempt to
|
|
perform the fflush for you, at the "right time," but they tend
|
|
to apply only when stdout is a terminal. (See question 11.6.)
|
|
|
|
11.8: When I read from the keyboard with scanf, it seems to hang until
|
|
I type one extra line of input.
|
|
|
|
A: scanf was designed for free-format input, which is seldom what
|
|
you want when reading from the keyboard. In particular, "\n" in
|
|
a format string does _not_ mean to expect a newline, but rather
|
|
to read and discard characters as long as each is a whitespace
|
|
character.
|
|
|
|
A related problem is that unexpected non-numeric input can cause
|
|
scanf to "jam." Because of these problems, it is usually better
|
|
to use fgets to read a whole line, and then use sscanf or other
|
|
string functions to pick apart the line buffer. If you do use
|
|
sscanf, don't forget to check the return value to make sure that
|
|
the expected number of items was found.
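
The fgets-then-sscanf approach might be sketched as follows. The function names and the line length are assumptions made for this example, which expects two integers per line.

```c
#include <stdio.h>

#define LINELEN 128    /* assumed maximum line length */

/* Picks two ints out of a line; returns nonzero only if both were found. */
int parse_pair(const char *line, int *a, int *b)
{
    return sscanf(line, "%d %d", a, b) == 2;
}

/* Reads one whole line from fp and parses it; returns 0 on EOF or bad input. */
int read_pair(FILE *fp, int *a, int *b)
{
    char line[LINELEN];

    if(fgets(line, sizeof line, fp) == NULL)
        return 0;
    return parse_pair(line, a, b);
}
```

Because fgets consumes the entire line whether or not it parses, a line of garbage cannot "jam" later reads the way it can with scanf.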
|
|
|
|
11.9: I'm trying to update a file in place, by using fopen mode "r+",
|
|
then reading a certain string, and finally writing back a
|
|
modified string, but it's not working.
|
|
|
|
A: Be sure to call fseek before you write, both to seek back to the
|
|
beginning of the string you're trying to overwrite, and because
|
|
an fseek or fflush is always required between reading and
|
|
writing in the read/write "+" modes.
|
|
|
|
References: ANSI Sec. 4.9.5.3 p. 131.
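
A minimal sketch of a read-then-write update follows. The function name capitalize_first is invented for this example; fp is assumed to have been opened in one of the "+" modes.

```c
#include <ctype.h>
#include <stdio.h>

/* Replaces the first character of the file with its upper-case form. */
int capitalize_first(FILE *fp)
{
    int c;

    if(fseek(fp, 0L, SEEK_SET) != 0)
        return -1;
    c = getc(fp);                       /* read ... */
    if(c == EOF)
        return -1;
    if(fseek(fp, 0L, SEEK_SET) != 0)    /* ... seek back before writing */
        return -1;
    if(putc(toupper(c), fp) == EOF)
        return -1;
    return fflush(fp) == 0 ? 0 : -1;
}
```

The second fseek does double duty: it repositions to the character being overwritten, and it satisfies the rule about switching from reading to writing.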
|
|
|
|
11.10: How can I read one character at a time, without waiting for the
|
|
RETURN key?
|
|
|
|
A: See question 16.1.
|
|
|
|
11.11: How can I flush pending input so that a user's typeahead isn't
|
|
read at the next prompt? Will fflush(stdin) work?
|
|
|
|
A: fflush is defined only for output streams. Since its definition
|
|
of "flush" is to complete the writing of buffered characters
|
|
(not to discard them), discarding unread input would not be an
|
|
analogous meaning for fflush on input streams. There is no
|
|
standard way to discard unread characters from a stdio input
|
|
buffer, nor would such a way be sufficient; unread characters
|
|
can also accumulate in other, OS-level input buffers.
|
|
|
|
11.12: How can I redirect stdin or stdout to a file from within a
|
|
program?
|
|
|
|
A: Use freopen.
|
|
|
|
11.13: Once I've used freopen, how can I get the original stdout (or
|
|
stdin) back?
|
|
|
|
A: If you need to switch back and forth, the best all-around
|
|
solution is not to use freopen in the first place. Try using
|
|
your own explicit output (or input) stream variable, which you
|
|
can reassign at will, while leaving the original stdout (or
|
|
stdin) undisturbed.
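One possible arrangement, sketched under the assumption that all of the program's output goes through a single routine (set_output, get_output, and report are hypothetical names):

```c
#include <stdio.h>

/* Current output stream; a null pointer means "use stdout". */
static FILE *out_stream;

/* Redirect later output to fp; pass NULL to go back to stdout. */
void set_output(FILE *fp)
{
    out_stream = fp;
}

/* Return the stream that output should currently be written to. */
FILE *get_output(void)
{
    return out_stream != NULL ? out_stream : stdout;
}

/* Example of a routine that writes through the switchable stream. */
void report(const char *msg)
{
    fprintf(get_output(), "%s\n", msg);
}
```

Since stdout itself is never reopened, "getting it back" is just a matter of calling set_output with a null pointer.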
|
|
|
|
11.14: How can I recover the file name given an open file descriptor?
|
|
|
|
A: This problem is, in general, insoluble. Under UNIX, for
|
|
instance, a scan of the entire disk (perhaps requiring special
|
|
permissions) would theoretically be required, and would fail if
|
|
the file descriptor was a pipe or referred to a deleted file
|
|
(and could give a misleading answer for a file with multiple
|
|
links). It is best to remember the names of files yourself when
|
|
you open them (perhaps with a wrapper function around fopen).
|
|
|
|
|
|
Section 12. Library Subroutines
|
|
|
|
12.1: Why does strncpy not always place a '\0' termination in the
|
|
destination string?
|
|
|
|
A: strncpy was first designed to handle a now-obsolete data
|
|
structure, the fixed-length, not-necessarily-\0-terminated
|
|
"string." strncpy is admittedly a bit cumbersome to use in
|
|
other contexts, since you must often append a '\0' to the
|
|
destination string by hand.
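The hand-termination can be wrapped up once and for all. This is only a sketch (copy_bounded is a hypothetical name), and it silently truncates strings that do not fit:

```c
#include <string.h>

/*
 * Copy src into dst, which has room for dstsize characters, and
 * always leave dst '\0'-terminated (truncating src if necessary).
 * (copy_bounded is a hypothetical helper for this sketch.)
 */
char *copy_bounded(char *dst, const char *src, size_t dstsize)
{
    if (dstsize == 0)
        return dst;                     /* no room even for the '\0' */

    strncpy(dst, src, dstsize - 1);     /* may leave dst unterminated */
    dst[dstsize - 1] = '\0';            /* so terminate it by hand */

    return dst;
}
```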
|
|
|
|
12.2: I'm trying to sort an array of strings with qsort, using strcmp
|
|
as the comparison function, but it's not working.
|
|
|
|
A: By "array of strings" you probably mean "array of pointers to
|
|
char." The arguments to qsort's comparison function are
|
|
pointers to the objects being sorted, in this case, pointers to
|
|
pointers to char. (strcmp, of course, accepts simple pointers
|
|
to char.)
|
|
|
|
The comparison routine's arguments are expressed as "generic
|
|
pointers," const void * or char *. They must be converted back
|
|
to what they "really are" (char **) and dereferenced, yielding
|
|
char *'s which can be usefully compared. Write a comparison
|
|
function like this:
|
|
|
|
int pstrcmp(p1, p2) /* compare strings through pointers */
|
|
char *p1, *p2; /* const void * for ANSI C */
|
|
{
|
|
return strcmp(*(char **)p1, *(char **)p2);
|
|
}
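For an ANSI compiler, the same idea might be written as follows. This is a sketch; pstrcmp_ansi and sort_strings are names invented here to keep it distinct from the pre-ANSI version above.

```c
#include <stdlib.h>
#include <string.h>

/* Compare two strings through the pointers-to-pointers qsort hands us. */
static int pstrcmp_ansi(const void *p1, const void *p2)
{
    const char *s1 = *(const char * const *)p1;
    const char *s2 = *(const char * const *)p2;

    return strcmp(s1, s2);
}

/* Sort an array of n pointers to char into ascending strcmp order. */
void sort_strings(const char **strings, size_t n)
{
    qsort(strings, n, sizeof strings[0], pstrcmp_ansi);
}
```

Note that the element size passed to qsort is the size of a pointer, not the length of any string.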
|
|
|
|
12.3: Now I'm trying to sort an array of structures with qsort. My
|
|
comparison routine takes pointers to structures, but the
|
|
compiler complains that the function is of the wrong type for
|
|
qsort. How can I cast the function pointer to shut off the
|
|
warning?
|
|
|
|
A: The conversions must be in the comparison function, which must
|
|
be declared as accepting "generic pointers" (const void * or
|
|
char *) as discussed above.
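A sketch for ANSI compilers, using a made-up struct employee for illustration: the comparison function takes const void * parameters and converts them inside, so no cast of the function pointer is needed.

```c
#include <stdlib.h>

/* Hypothetical structure type, for illustration only. */
struct employee {
    int id;
    int age;
};

/* qsort comparison function: order employees by increasing age. */
static int cmp_by_age(const void *p1, const void *p2)
{
    const struct employee *e1 = (const struct employee *)p1;
    const struct employee *e2 = (const struct employee *)p2;

    if (e1->age < e2->age)
        return -1;
    if (e1->age > e2->age)
        return 1;
    return 0;
}

/* Sort an array of n employees by age. */
void sort_by_age(struct employee *staff, size_t n)
{
    qsort(staff, n, sizeof staff[0], cmp_by_age);
}
```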
|
|
|
|
12.4: How can I convert numbers to strings (the opposite of atoi)? Is
|
|
there an itoa function?
|
|
|
|
A: Just use sprintf. (You'll have to allocate space for the result
|
|
somewhere anyway; see questions 3.1 and 3.2. Don't worry that
|
|
sprintf may be overkill, potentially wasting run time or code
|
|
space; it works well in practice.)
|
|
|
|
References: K&R I Sec. 3.6 p. 60; K&R II Sec. 3.6 p. 64.
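A minimal sketch, in which the name int_to_string and the buffer-size macro INT_STRLEN are assumptions of this example rather than anything standard:

```c
#include <limits.h>
#include <stdio.h>

/*
 * Room for any int printed with %d: three bits never need more than
 * one decimal digit, plus slack for the sign and the trailing '\0'.
 */
#define INT_STRLEN (sizeof(int) * CHAR_BIT / 3 + 3)

/*
 * Convert n to decimal in buf, which must have room for at least
 * INT_STRLEN characters.  Returns buf.
 * (int_to_string is a hypothetical name for this sketch.)
 */
char *int_to_string(int n, char *buf)
{
    sprintf(buf, "%d", n);
    return buf;
}
```

The caller supplies the space, for instance with a declaration such as char buf[INT_STRLEN];.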
|
|
|
|
12.5: How can I get the current date or time of day in a C program?
|
|
|
|
A: Just use the time, ctime, and/or localtime functions. (These
|
|
routines have been around for years, and are in the ANSI
|
|
standard.) Here is a simple example:
|
|
|
|
#include <stdio.h>
|
|
#include <time.h>
|
|
|
|
int main(void)
|
|
{
|
|
time_t now = time((time_t *)NULL);
|
|
printf("It's %.24s.\n", ctime(&now));
|
|
return 0;
|
|
}
|
|
|
|
References: ANSI Sec. 4.12 .
|
|
|
|
12.6: I know that the library routine localtime will convert a time_t
|
|
into a broken-down struct tm, and that ctime will convert a
|
|
time_t to a printable string. How can I perform the inverse
|
|
operations of converting a struct tm or a string into a time_t?
|
|
|
|
A: ANSI C specifies a library routine, mktime, which converts a
|
|
struct tm to a time_t. Several public-domain versions of this
|
|
routine are available in case your compiler does not support it
|
|
yet.
|
|
|
|
Converting a string to a time_t is harder, because of the wide
|
|
variety of date and time formats which should be parsed.
|
|
Public-domain routines have been written for performing this
|
|
function (see, for example, the file partime.c, widely
|
|
distributed with the RCS package), but they are less likely to
|
|
become standardized.
|
|
|
|
References: K&R II Sec. B10 p. 256; H&S Sec. 20.4 p. 361; ANSI
|
|
Sec. 4.12.2.3 .
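As a sketch of the struct tm direction (make_time is a hypothetical wrapper, and the arguments are assumed to describe local time), a call to mktime might be packaged like this:

```c
#include <time.h>

/*
 * Build a time_t from a local calendar date and time.
 * Returns (time_t)-1 if the time cannot be represented.
 * (make_time is a hypothetical wrapper for this sketch.)
 */
time_t make_time(int year, int month, int day, int hour, int min, int sec)
{
    struct tm tm = {0};

    tm.tm_year = year - 1900;   /* tm_year counts from 1900 */
    tm.tm_mon = month - 1;      /* tm_mon runs from 0 to 11 */
    tm.tm_mday = day;
    tm.tm_hour = hour;
    tm.tm_min = min;
    tm.tm_sec = sec;
    tm.tm_isdst = -1;           /* let mktime decide about DST */

    return mktime(&tm);
}
```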
|
|
|
|
12.7: I need a random number generator.
|
|
|
|
A: The standard C library has one: rand(). The implementation on
|
|
your system may not be perfect, but writing a better one isn't
|
|
necessarily easy, either.
|
|
|
|
References: ANSI Sec. 4.10.2.1 p. 154; Knuth Vol. 2 Chap. 3
|
|
pp. 1-177.
|
|
|
|
12.8: Each time I run my program, I get the same sequence of numbers
|
|
back from rand().
|
|
|
|
A: You can call srand() to seed the pseudo-random number generator
|
|
with a more random initial value. Popular seed values are the
|
|
time of day, or the elapsed time before the user presses a key
|
|
(although keypress times are hard to determine portably; see
|
|
question 16.9).
|
|
|
|
References: ANSI Sec. 4.10.2.2 p. 154.
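Seeding from the time of day takes one line; the wrapper name seed_from_clock below is made up for this sketch. It should be called once, near the start of the program, not before every call to rand().

```c
#include <stdlib.h>
#include <time.h>

/* Seed rand() from the current time (hypothetical helper name). */
void seed_from_clock(void)
{
    srand((unsigned int)time((time_t *)NULL));
}
```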
|
|
|
|
12.9: I need a random true/false value, so I'm taking rand() % 2, but
|
|
it's just alternating 0, 1, 0, 1, 0...
|
|
|
|
A: Poor pseudorandom number generators (such as the ones
|
|
unfortunately supplied with some systems) are not very random in
|
|
the low-order bits. Try using the higher-order bits.
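One way to take the high-order end of the range is to divide rather than take a remainder. This sketch (random_bit is a hypothetical name) yields 0 for the lower half of rand()'s range and 1 for the upper half:

```c
#include <stdlib.h>

/*
 * Return a pseudo-random 0 or 1, decided by which half of the range
 * 0..RAND_MAX the value falls in rather than by its lowest bit.
 * (random_bit is a hypothetical name for this sketch.)
 */
int random_bit(void)
{
    return rand() / (RAND_MAX / 2 + 1);
}
```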
|
|
|
|
12.10-12.14: I'm trying to port this old program. Why do I get
|
|
"undefined external" errors for:
|
|
|
|
A: These routines are variously obsolete; you should instead:
|
|
|
|
12.10: index? A: use strchr.
|
|
12.11: rindex? A: use strrchr.
|
|
12.12: bcopy? A: use memmove, after interchanging the first and
|
|
second arguments (see also question 5.13).
|
|
12.13: bcmp? A: use memcmp.
|
|
12.14: bzero? A: use memset, with a second argument of 0.
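If the old names appear in many places, the replacements listed above can be collected as thin wrappers while the code is being converted. The my_ prefixed names in this sketch are invented, chosen so as not to collide with any BSD declarations that might still exist.

```c
#include <string.h>

/* index(s, c) becomes strchr(s, c) */
char *my_index(const char *s, int c)
{
    return strchr(s, c);
}

/* rindex(s, c) becomes strrchr(s, c) */
char *my_rindex(const char *s, int c)
{
    return strrchr(s, c);
}

/* bcopy(src, dst, n): note that memmove takes the destination first */
void my_bcopy(const void *src, void *dst, size_t n)
{
    memmove(dst, src, n);
}

/* bcmp(a, b, n): zero means equal, as with memcmp */
int my_bcmp(const void *a, const void *b, size_t n)
{
    return memcmp(a, b, n);
}

/* bzero(p, n) becomes memset with a second argument of 0 */
void my_bzero(void *p, size_t n)
{
    memset(p, 0, n);
}
```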
|
|
|
|
12.15: How can I execute a command with system() and read its output
|
|
into a program?
|
|
|
|
A: UNIX and some other systems provide a popen() routine, which
|
|
sets up a stdio stream on a pipe connected to the process
|
|
running a command, so that the output can be read (or the input
|
|
supplied).
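On a system which has popen, reading a command's output might look like the sketch below. It assumes a POSIX-style popen and pclose declared in <stdio.h>; the function name first_line_of is made up, and only the first line of output is kept.

```c
#define _POSIX_C_SOURCE 200809L /* assumption: asks a POSIX system to declare popen */
#include <stdio.h>

/*
 * Run cmd and copy the first line of its output into buf.
 * Returns 0 on success, -1 if the command could not be started
 * or produced no output.
 * (first_line_of is a hypothetical name for this sketch.)
 */
int first_line_of(const char *cmd, char *buf, int size)
{
    FILE *p = popen(cmd, "r");  /* "r": we read what the command writes */
    int ok;

    if (p == NULL)
        return -1;

    ok = (fgets(buf, size, p) != NULL);
    pclose(p);                  /* streams from popen need pclose, not fclose */

    return ok ? 0 : -1;
}
```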
|
|
|
|
12.16: How can I read a directory in a C program?
|
|
|
|
A: See if you can use the opendir() and readdir() routines, which
|
|
are available on most UNIX systems. Implementations also exist
|
|
for MS-DOS, VMS, and other systems. (MS-DOS also has FINDFIRST
|
|
and FINDNEXT routines which do essentially the same thing.)
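Where opendir and readdir are available, a directory scan is a simple loop. This sketch assumes a POSIX-style <dirent.h>; the function count_dir_entries is a made-up example which merely counts the entries (on most systems the count includes "." and "..").

```c
#include <sys/types.h>
#include <dirent.h>
#include <stddef.h>

/*
 * Count the entries in the directory named by path.
 * Returns -1 if the directory cannot be opened.
 * (count_dir_entries is a hypothetical name for this sketch.)
 */
long count_dir_entries(const char *path)
{
    DIR *d = opendir(path);
    long n = 0;

    if (d == NULL)
        return -1;

    /* readdir returns one entry per call; d_name would hold its name */
    while (readdir(d) != NULL)
        n++;

    closedir(d);
    return n;
}
```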
|
|
|
|
|
|
Section 13. Lint
|
|
|
|
13.1: I just typed in this program, and it's acting strangely. Can
|
|
you see anything wrong with it?
|
|
|
|
A: Try running lint first (perhaps with the -a, -c, -h, -p and/or
|
|
other options). Many C compilers are really only half-
|
|
compilers, electing not to diagnose numerous source code
|
|
difficulties which would not actively preclude code generation.
|
|
|
|
13.2: How can I shut off the "warning: possible pointer alignment
|
|
problem" message lint gives me for each call to malloc?
|
|
|
|
A: The problem is that traditional versions of lint do not know,
|
|
and cannot be told, that malloc "returns a pointer to space
|
|
suitably aligned for storage of any type of object." It is
|
|
possible to provide a pseudoimplementation of malloc, using a
|
|
#define inside of #ifdef lint, which effectively shuts this
|
|
warning off, but a simpleminded #definition will also suppress
|
|
meaningful messages about truly incorrect invocations. It may
|
|
be easier simply to ignore the message, perhaps in an automated
|
|
way with grep -v.
|
|
|
|
13.3: Where can I get an ANSI-compatible lint?
|
|
|
|
A: A product called FlexeLint is available (in "shrouded source
|
|
form," for compilation on 'most any system) from
|
|
|
|
Gimpel Software
|
|
3207 Hogarth Lane
|
|
Collegeville, PA 19426 USA
|
|
(+1) 215 584 4261
|
|
|
|
The System V release 4 lint is ANSI-compatible, and is available
|
|
separately (bundled with other C tools) from UNIX Support Labs
|
|
(a subsidiary of AT&T), or from System V resellers.
|
|
|
|
|
|
Section 14. Style
|
|
|
|
14.1: Here's a neat trick:
|
|
|
|
if(!strcmp(s1, s2))
|
|
|
|
Is this good style?
|
|
|
|
A: It is not particularly good style, although it is a popular
|
|
idiom. The test succeeds if the two strings are equal, but its
|
|
form suggests that it tests for inequality.
|
|
|
|
Another solution is to use a macro:
|
|
|
|
#define Streq(s1, s2) (strcmp((s1), (s2)) == 0)
|
|
|
|
Opinions on code style, like those on religion, can be debated
|
|
endlessly. Though good style is a worthy goal, and can usually
|
|
be recognized, it cannot be codified.
|
|
|
|
14.2: What's the best style for code layout in C?
|
|
|
|
A: K&R, while providing the example most often copied, also supply
|
|
a good excuse for avoiding it:
|
|
|
|
The position of braces is less important,
|
|
although people hold passionate beliefs.
|
|
We have chosen one of several popular styles.
|
|
Pick a style that suits you, then use it
|
|
consistently.
|
|
|
|
It is more important that the layout chosen be consistent (with
|
|
itself, and with nearby or common code) than that it be
|
|
"perfect." If your coding environment (i.e. local custom or
|
|
company policy) does not suggest a style, and you don't feel
|
|
like inventing your own, just copy K&R. (The tradeoffs between
|
|
various indenting and brace placement options can be
|
|
exhaustively and minutely examined, but don't warrant repetition
|
|
here. See also the Indian Hill Style Guide.)
|
|
|
|
The elusive quality of "good style" involves much more than mere
|
|
code layout details; don't spend time on formatting to the
|
|
exclusion of more substantive code quality issues.
|
|
|
|
References: K&R Sec. 1.2 p. 10.
|
|
|
|
14.3: Where can I get the "Indian Hill Style Guide" and other coding
|
|
standards?
|
|
|
|
A: Various documents are available for anonymous ftp from:
|
|
|
|
Site: File or directory:
|
|
|
|
cs.washington.edu ~ftp/pub/cstyle.tar.Z
|
|
(128.95.1.4) (the updated Indian Hill guide)
|
|
|
|
cs.toronto.edu doc/programming
|
|
|
|
giza.cis.ohio-state.edu pub/style-guide
|
|
|
|
|
|
Section 15. Floating Point
|
|
|
|
15.1: My floating-point calculations are acting strangely and giving
|
|
me different answers on different machines.
|
|
|
|
A: First, make sure that you have #included <math.h>, and correctly
|
|
declared other functions returning double.
|
|
|
|
If the problem isn't that simple, recall that most digital
|
|
computers use floating-point formats which provide a close but
|
|
by no means exact simulation of real number arithmetic.
|
|
Underflow, cumulative precision loss, and other anomalies are
|
|
often troublesome.
|
|
|
|
Don't assume that floating-point results will be exact, and
|
|
especially don't assume that floating-point values can be
|
|
compared for equality. (Don't throw haphazard "fuzz factors"
|
|
in, either.)
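When a comparison is unavoidable, one common approach (not the only one, and not a cure-all) is to compare the difference against a tolerance which is scaled by the magnitudes involved and chosen for the problem at hand. The names nearly_equal and rel_tol in this sketch are assumptions of the example:

```c
/* Absolute value, written out so that <math.h> is not needed. */
static double magnitude(double x)
{
    return x < 0 ? -x : x;
}

/*
 * Return nonzero if a and b differ by no more than rel_tol times
 * the larger of their magnitudes.
 * (nearly_equal is a hypothetical name for this sketch.)
 */
int nearly_equal(double a, double b, double rel_tol)
{
    double diff = magnitude(a - b);
    double scale = magnitude(a) > magnitude(b) ? magnitude(a) : magnitude(b);

    return diff <= rel_tol * scale;
}
```

A purely relative test like this one is of little use for values which are supposed to be near zero; the right tolerance always depends on the calculation.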
|
|
|
|
These problems are no worse for C than they are for any other
|
|
computer language. Floating-point semantics are usually defined
|
|
as "however the processor does them;" otherwise a compiler for a
|
|
machine without the "right" model would have to do prohibitively
|
|
expensive emulations.
|
|
|
|
This article cannot begin to list the pitfalls associated with,
|
|
and workarounds appropriate for, floating-point work. A good
|
|
programming text should cover the basics.
|
|
|
|
References: EoPS Sec. 6 pp. 115-8.
|
|
|
|
15.2: I'm trying to do some simple trig, and I am #including <math.h>,
|
|
but I keep getting "undefined: _sin" compilation errors.
|
|
|
|
A: Make sure you're linking against the correct math library. For
|
|
instance, under UNIX, you usually need to use the -lm option,
|
|
and at the _end_ of the command line, when compiling/linking.
|
|
|
|
15.3: Why doesn't C have an exponentiation operator?
|
|
|
|
A: Because few processors have an exponentiation instruction.
|
|
Instead, you can #include <math.h> and use the pow() function,
|
|
although explicit multiplication is often better for small
|
|
positive integral exponents.
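For integer arguments, the multiplications can be written as a small loop. This sketch (ipow is a hypothetical name) uses repeated squaring and makes no attempt to detect overflow:

```c
/*
 * Raise base to the non-negative integer power exp by repeated
 * squaring.  Overflow is not detected.
 * (ipow is a hypothetical name for this sketch.)
 */
long ipow(long base, unsigned int exp)
{
    long result = 1;

    while (exp > 0) {
        if (exp & 1)
            result *= base;     /* this bit of the exponent is set */
        exp >>= 1;
        if (exp > 0)
            base *= base;       /* square only if another round follows */
    }

    return result;
}
```

For a fixed small exponent, writing x * x or x * x * x directly is simpler still.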
|
|
|
|
References: ANSI Sec. 4.5.5.1 .
|
|
|
|
15.4: I'm having trouble with a Turbo C program which crashes and says
|
|
something like "floating point formats not linked."
|
|
|
|
A: Some compilers for small machines, including Turbo C (and
|
|
Ritchie's original PDP-11 compiler), leave out floating point
|
|
support if it looks like it will not be needed. In particular,
|
|
the non-floating-point versions of printf and scanf save space
|
|
by not including code to handle %e, %f, and %g. It happens that
|
|
Turbo C's heuristics for determining whether the program uses
|
|
floating point are insufficient, and the programmer must
|
|
sometimes insert an extra, explicit call to a floating-point
|
|
library routine to force loading of floating-point support.
|
|
|
|
|
|
Section 16. System Dependencies
|
|
|
|
16.1: How can I read a single character from the keyboard without
|
|
waiting for a newline?
|
|
|
|
A: Contrary to popular belief and many people's wishes, this is not
|
|
a C-related question. (Nor are closely-related questions
|
|
concerning the echo of keyboard input.) The delivery of
|
|
characters from a "keyboard" to a C program is a function of the
|
|
operating system in use, and has not been standardized by the C
|
|
language. Some versions of curses have a cbreak() function
|
|
which does what you want. Under UNIX, use ioctl to play with
|
|
the terminal driver modes (CBREAK or RAW under "classic"
|
|
versions; ICANON, c_cc[VMIN] and c_cc[VTIME] under System V or
|
|
Posix systems). Under MS-DOS, use getch(). Under VMS, try the
|
|
Screen Management (SMG$) routines, or curses, or issue low-level
|
|
$QIO's to ask for one character at a time. Under other
|
|
operating systems, you're on your own. Beware that some
|
|
operating systems make this sort of thing impossible, because
|
|
character collection into input lines is done by peripheral
|
|
processors not under direct control of the CPU running your
|
|
program.
|
|
|
|
Operating system specific questions are not appropriate for
|
|
comp.lang.c . Many common questions are answered in
|
|
frequently-asked questions postings in such groups as
|
|
comp.unix.questions and comp.os.msdos.programmer . Note that
|
|
the answers are often not unique even across different variants
|
|
of a system; bear in mind when answering system-specific
|
|
questions that the answer that applies to your system may not
|
|
apply to everyone else's.
|
|
|
|
References: PCS Sec. 10 pp. 128-9, Sec. 10.1 pp. 130-1.
|
|
|
|
16.2: How can I find out if there are characters available for reading
|
|
(and if so, how many)? Alternatively, how can I do a read that
|
|
will not block if there are no characters available?
|
|
|
|
A: These, too, are entirely operating-system-specific. Some
|
|
versions of curses have a nodelay() function. Depending on your
|
|
system, you may also be able to use "nonblocking I/O", or a
|
|
system call named "select", or the FIONREAD ioctl, or kbhit(),
|
|
or rdchk(), or the O_NDELAY option to open() or fcntl().
|
|
|
|
16.3: How can I clear the screen? How can I print things in inverse
|
|
video?
|
|
|
|
A: Such things depend on the terminal type (or display) you're
|
|
using. You will have to use a library such as termcap or
|
|
curses, or some system-specific routines, to perform these
|
|
functions.
|
|
|
|
16.4: How do I read the mouse?
|
|
|
|
A: Consult your system documentation, or ask on an appropriate
|
|
system-specific newsgroup (but check its FAQ list first). Mouse
|
|
handling is completely different under the X window system, MS-
|
|
DOS, Macintosh, and probably every other system.
|
|
|
|
16.5: How can my program discover the complete pathname to the
|
|
executable file from which it was invoked?
|
|
|
|
A: argv[0] may contain all or part of the pathname, or it may
|
|
contain nothing. You may be able to duplicate the command
|
|
language interpreter's search path logic to locate the
|
|
executable if the name in argv[0] is present but incomplete.
|
|
However, there is no guaranteed or portable solution.
|
|
|
|
16.6: How can a process change an environment variable in its caller?
|
|
|
|
A: In general, it cannot. Different operating systems implement
|
|
name/value functionality similar to the UNIX environment in
|
|
different ways. Whether the "environment" can be usefully
|
|
altered by a running program, and if so, how, is system-
|
|
dependent.
|
|
|
|
Under UNIX, a process can modify its own environment (some
|
|
systems provide setenv() and/or putenv() functions to do this),
|
|
and the modified environment is usually passed on to any child
|
|
processes, but it is _not_ propagated back to the parent
|
|
process.
|
|
|
|
16.7: How can I find out the size of a file, prior to reading it in?
|
|
|
|
A: If the "size of a file" is the number of characters you'll be
|
|
able to read from it in C, it is in general impossible to
|
|
determine this number in advance. Under UNIX, the stat call
|
|
will give you an exact answer, and several other systems supply
|
|
a UNIX-like stat which will give an approximate answer. You can
|
|
fseek to the end and then use ftell, but this usage is
|
|
nonportable (it gives you an accurate answer only under UNIX,
|
|
and otherwise a quasi-accurate answer only for ANSI C "binary"
|
|
files).
|
|
|
|
Are you sure you have to determine the file's size in advance?
|
|
Since the most accurate way of determining the size of a file as
|
|
a C program will see it is to open the file and read it, perhaps
|
|
you can rearrange the code to learn the size as it reads.
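One such rearrangement is to read into a buffer which grows as needed, so that the size is simply a by-product of reading. In this sketch the name read_all and the initial capacity of 1024 are arbitrary choices:

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Read everything remaining on fp into a malloc'ed buffer.
 * Stores the number of characters read in *lenp and returns the
 * buffer (to be freed by the caller), or NULL if out of memory.
 * (read_all is a hypothetical name for this sketch.)
 */
char *read_all(FILE *fp, size_t *lenp)
{
    size_t cap = 1024, len = 0, n;
    char *buf = malloc(cap), *tmp;

    if (buf == NULL)
        return NULL;

    while ((n = fread(buf + len, 1, cap - len, fp)) > 0) {
        len += n;
        if (len == cap) {               /* buffer full: double it */
            cap *= 2;
            tmp = realloc(buf, cap);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
        }
    }

    *lenp = len;
    return buf;
}
```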
|
|
|
|
16.8: How can a file be shortened in-place without completely clearing
|
|
or rewriting it?
|
|
|
|
A: BSD systems provide ftruncate(), several others supply chsize(),
|
|
and a few may provide a (possibly undocumented) fcntl option
|
|
F_FREESP. Under MS-DOS, you can sometimes use write(fd, "", 0).
|
|
However, there is no truly portable solution.
|
|
|
|
16.9: How can I implement a delay, or time a user's response, with
|
|
sub-second resolution?
|
|
|
|
A: Unfortunately, there is no portable way. V7 UNIX, and derived
|
|
systems, provided a fairly useful ftime() routine with
|
|
resolution up to a millisecond, but it has disappeared from
|
|
System V and Posix. Other routines you might look for on your
|
|
system include nap(), setitimer(), msleep(), usleep(), clock(),
|
|
and gettimeofday(). The select() and poll() calls (if
|
|
available) can be pressed into service to implement simple
|
|
delays. On MS-DOS machines, it is possible to reprogram the
|
|
system timer and timer interrupts.
|
|
|
|
16.10: How can I read in an object file and jump to routines in it?
|
|
|
|
A: You want a dynamic linker and/or loader. It is possible to
|
|
malloc some space and read in object files, but you have to know
|
|
an awful lot about object file formats, relocation, etc. Under
|
|
BSD UNIX, you could use system() and ld -A to do the linking for
|
|
you. Many (most?) versions of SunOS and System V have the -ldl
|
|
library which allows object files to be dynamically loaded.
|
|
There is also a GNU package called "dld". See also question
|
|
7.6.
|
|
|
|
|
|
Section 17. Miscellaneous
|
|
|
|
17.1: What can I safely assume about the initial values of variables
|
|
which are not explicitly initialized? If global variables start
|
|
out as "zero," is that good enough for null pointers and
|
|
floating-point zeroes?
|
|
|
|
A: Variables with "static" duration (that is, those declared
|
|
outside of functions, and those declared with the storage class
|
|
static), are guaranteed initialized to zero, as if the
|
|
programmer had typed "= 0". Therefore, such variables are
|
|
initialized to the null pointer (of the correct type; see also
|
|
Section 1) if they are pointers, and to 0.0 if they are
|
|
floating-point.
|
|
|
|
Variables with "automatic" duration (i.e. local variables
|
|
without the static storage class) start out containing garbage,
|
|
unless they are explicitly initialized. Nothing useful can be
|
|
predicted about the garbage.
|
|
|
|
Dynamically-allocated memory obtained with malloc and realloc is
|
|
also likely to contain garbage, and must be initialized by the
|
|
calling program, as appropriate. Memory obtained with calloc
|
|
contains all-bits-0, but this is not necessarily useful for
|
|
pointer or floating-point values (see question 3.11, and section
|
|
1).
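When allocated memory holds pointers or floating-point values, the portable course is to assign the initial values explicitly. A sketch, with struct record and new_records invented for the illustration:

```c
#include <stdlib.h>

/* Hypothetical structure type, for illustration only. */
struct record {
    char *name;
    double balance;
};

/*
 * Allocate n records with every member explicitly initialized,
 * rather than relying on calloc's all-bits-0.
 * Returns NULL if the allocation fails.
 */
struct record *new_records(size_t n)
{
    struct record *r = malloc(n * sizeof *r);
    size_t i;

    if (r == NULL)
        return NULL;

    for (i = 0; i < n; i++) {
        r[i].name = NULL;       /* a true null pointer */
        r[i].balance = 0.0;     /* a true floating-point zero */
    }

    return r;
}
```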
|
|
|
|
17.2: How can I write data files which can be read on other machines
|
|
with different word size, byte order, or floating point formats?
|
|
|
|
A: The best solution is to use text files (usually ASCII), written
|
|
with fprintf and read with fscanf or the like. (Similar advice
|
|
also applies to network protocols.) Be skeptical of arguments
|
|
which imply that text files are too big, or that reading and
|
|
writing them is too slow. Not only is their efficiency
|
|
frequently acceptable in practice, but the advantages of being
|
|
able to manipulate them with standard tools can be overwhelming.
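A sketch of the text-file approach for one simple record (write_sample, read_sample, and the one-record-per-line layout are assumptions of this example). The %.17g format is used on the assumption that 17 significant digits are enough to preserve a typical double.

```c
#include <stdio.h>

/* Write one record as a line of text.  Returns nonzero on success. */
int write_sample(FILE *fp, long id, double value)
{
    return fprintf(fp, "%ld %.17g\n", id, value) > 0;
}

/* Read one record written by write_sample.  Returns nonzero on success. */
int read_sample(FILE *fp, long *id, double *value)
{
    char line[128];

    if (fgets(line, sizeof line, fp) == NULL)
        return 0;

    return sscanf(line, "%ld %lf", id, value) == 2;
}
```

Nothing in the file depends on the writer's word size, byte order, or floating-point format.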
|
|
|
|
If you must use a binary format, you can improve portability,
|
|
and perhaps take advantage of prewritten I/O libraries, by
|
|
making use of standardized formats such as Sun's XDR (RFC 1014),
|
|
OSI's ASN.1, CCITT's X.409, or ISO 8825 "Basic Encoding Rules."
|
|
See also question 9.10.
|
|
|
|
17.3: How can I return several values from a function?
|
|
|
|
A: Either pass pointers to locations which the function can fill
|
|
in, or have the function return a structure containing the
|
|
desired values, or (in a pinch) consider global variables. See
|
|
also questions 2.16, 3.4, and 9.2.
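The first two techniques can be sketched with a function which produces both a quotient and a remainder (all of the names here are made up for the example, and the divisor is assumed to be nonzero):

```c
/* Structure holding both results, for the return-a-structure style. */
struct divmod {
    int quot;
    int rem;
};

/* Style 1: the caller passes pointers to locations to be filled in. */
void divide_ptr(int num, int den, int *quot, int *rem)
{
    *quot = num / den;
    *rem = num % den;
}

/* Style 2: the function returns a structure containing both values. */
struct divmod divide_struct(int num, int den)
{
    struct divmod result;

    result.quot = num / den;
    result.rem = num % den;
    return result;
}
```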
|
|
|
|
17.4: If I have a char * variable pointing to the name of a function
|
|
as a string, how can I call that function?
|
|
|
|
A: The most straightforward thing to do is maintain a
|
|
correspondence table of names and function pointers:
|
|
|
|
int function1(), function2();
|
|
|
|
struct {char *name; int (*funcptr)(); } symtab[] =
|
|
{
|
|
"function1", function1,
|
|
"function2", function2,
|
|
};
|
|
|
|
Then, just search the table for the name, and call through the
|
|
associated function pointer. See also questions 9.8 and 16.10.
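The search itself might be sketched as follows, in ANSI style. The table is repeated so that the example stands alone; function1 and function2 are given trivial bodies here, and find_function is a hypothetical name.

```c
#include <stddef.h>
#include <string.h>

/* Placeholder functions; real ones would do something useful. */
static int function1(void) { return 1; }
static int function2(void) { return 2; }

static struct {
    const char *name;
    int (*funcptr)(void);
} symtab[] = {
    { "function1", function1 },
    { "function2", function2 },
};

/*
 * Return the function pointer registered under name,
 * or a null pointer if the name is not in the table.
 */
int (*find_function(const char *name))(void)
{
    size_t i;

    for (i = 0; i < sizeof symtab / sizeof symtab[0]; i++) {
        if (strcmp(symtab[i].name, name) == 0)
            return symtab[i].funcptr;
    }

    return NULL;
}
```

A caller checks the result against NULL before calling through it.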
|
|
|
|
17.5: I seem to be missing the system header file <sgtty.h>. Can
|
|
someone send me a copy?
|
|
|
|
A: Standard headers exist in part so that definitions appropriate
|
|
to your compiler, operating system, and processor can be
|
|
supplied. You cannot just pick up a copy of someone else's
|
|
header file and expect it to work, unless that person is using
|
|
exactly the same environment. Ask your compiler vendor why the
|
|
file was not provided (or to send a replacement copy).
|
|
|
|
17.6: How can I call FORTRAN (C++, BASIC, Pascal, Ada, LISP) functions
|
|
from C? (And vice versa?)
|
|
|
|
A: The answer is entirely dependent on the machine and the specific
|
|
calling sequences of the various compilers in use, and may not
|
|
be possible at all. Read your compiler documentation very
|
|
carefully; sometimes there is a "mixed-language programming
|
|
guide," although the techniques for passing arguments and
|
|
ensuring correct run-time startup are often arcane. More
|
|
information may be found in FORT.Z by Glenn Geers, available via
|
|
anonymous ftp from suphys.physics.su.oz.au in the src directory.
|
|
|
|
cfortran.h, a C header file, simplifies C/FORTRAN interfacing on
|
|
many popular machines. It is available via anonymous ftp from
|
|
zebra.desy.de (131.169.2.244).
|
|
|
|
In C++, a "C" modifier in an external function declaration
|
|
indicates that the function is to be called using C calling
|
|
conventions.
|
|
|
|
17.7: Does anyone know of a program for converting Pascal or FORTRAN
|
|
(or LISP, Ada, awk, "Old" C, ...) to C?
|
|
|
|
A: Several public-domain programs are available:
|
|
|
|
p2c written by Dave Gillespie, and posted to
|
|
comp.sources.unix in March, 1990 (Volume 21); also
|
|
available by anonymous ftp from csvax.cs.caltech.edu,
|
|
file pub/p2c-1.20.tar.Z .
|
|
|
|
ptoc another comp.sources.unix contribution, this one written
|
|
in Pascal (comp.sources.unix, Volume 10, also patches in
|
|
Volume 13?).
|
|
|
|
f2c jointly developed by people from Bell Labs, Bellcore,
|
|
and Carnegie Mellon. To find out about f2c, send the mail
|
|
message "send index from f2c" to netlib@research.att.com
|
|
or research!netlib. (It is also available via anonymous
|
|
ftp on research.att.com, in directory dist/f2c.)
|
|
|
|
This FAQ list's maintainer also has available a list of other
|
|
commercial translation products, and some for more obscure
|
|
languages.
|
|
|
|
See also question 5.3.
|
|
|
|
17.8: Where can I get copies of all these public-domain programs?
|
|
|
|
A: If you have access to Usenet, see the regular postings in the
|
|
comp.sources.unix and comp.sources.misc newsgroups, which
|
|
describe, in some detail, the archiving policies and how to
|
|
retrieve copies. The usual approach is to use anonymous ftp
|
|
and/or uucp from a central, public-spirited site, such as uunet
|
|
(ftp.uu.net, 192.48.96.9). However, this article cannot track
|
|
or list all of the available archive sites and how to access
|
|
them. The comp.archives newsgroup contains numerous
|
|
announcements of anonymous ftp availability of various items.
|
|
The "archie" mailserver can tell you which anonymous ftp sites
|
|
have which packages; send the mail message "help" to
|
|
archie@quiche.cs.mcgill.ca for information. Finally, the
|
|
newsgroup comp.sources.wanted is generally a more appropriate
|
|
place to post queries for source availability, but check _its_
|
|
FAQ list, "How to find sources," before posting there.
|
|
|
|
17.9: When will the next International Obfuscated C Code Contest
|
|
(IOCCC) be held? How can I get a copy of the current and
|
|
previous winning entries?
|
|
|
|
A: The contest typically runs from early March through mid-May. To
|
|
obtain a current copy of the rules and guidelines, send e-mail
|
|
with the Subject: line "send rules" to:
|
|
|
|
{apple,pyramid,sun,uunet}!hoptoad!judges or judges@toad.com
|
|
(these are not the addresses for submitting entries)
|
|
|
|
Contest winners are first announced at the Summer Usenix
|
|
Conference in mid-June, and posted to the net sometime in July-
|
|
August. Winning entries from previous years (to 1984) are
|
|
archived at uunet (see question 17.8) under the directory
|
|
~/pub/ioccc.
|
|
|
|
As a last resort, previous winners may be obtained by sending
|
|
e-mail to the above address, using the Subject: "send YEAR
|
|
winners", where YEAR is a single four-digit year, a year range,
|
|
or "all".
|
|
|
|
17.10: Why don't C comments nest? Are they legal inside quoted
|
|
strings?
|
|
|
|
A: Nested comments would cause more harm than good, mostly because
|
|
of the possibility of accidentally leaving comments unclosed by
|
|
including the characters "/*" within them. For this reason, it
|
|
is usually better to "comment out" large sections of code, which
|
|
might contain comments, with #ifdef or #if 0 (but see question
|
|
5.9).
|
|
|
|
The character sequences /* and */ are not special within
|
|
double-quoted strings, and do not therefore introduce comments,
|
|
because a program (particularly one which is generating C code
|
|
as output) might want to print them.
|
|
|
|
References: ANSI Appendix E p. 198, Rationale Sec. 3.1.9 p. 33.
|
|
|
|
17.11: How can I implement sets and/or arrays of bits?
|
|
|
|
A: Use arrays of char or int, with a few macros to access the right
|
|
bit at the right index (try using 8 for CHAR_BIT if you don't
|
|
have <limits.h>):
|
|
|
|
#include <limits.h> /* for CHAR_BIT */
|
|
|
|
#define BITMASK(bit) (1 << ((bit) % CHAR_BIT))
|
|
#define BITSLOT(bit) ((bit) / CHAR_BIT)
|
|
#define BITSET(ary, bit) ((ary)[BITSLOT(bit)] |= BITMASK(bit))
|
|
#define BITTEST(ary, bit) ((ary)[BITSLOT(bit)] & BITMASK(bit))
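Here is a sketch of the macros in use. The macros are repeated so that the example stands alone; BITCLEAR and BITNSLOTS are additions made up for this illustration, as is the prime-counting function, which uses a bit array as a Sieve of Eratosthenes.

```c
#include <limits.h>     /* for CHAR_BIT */

#define BITMASK(bit) (1 << ((bit) % CHAR_BIT))
#define BITSLOT(bit) ((bit) / CHAR_BIT)
#define BITSET(ary, bit) ((ary)[BITSLOT(bit)] |= BITMASK(bit))
#define BITCLEAR(ary, bit) ((ary)[BITSLOT(bit)] &= ~BITMASK(bit))
#define BITTEST(ary, bit) ((ary)[BITSLOT(bit)] & BITMASK(bit))

/* number of array elements needed to hold nbits bits */
#define BITNSLOTS(nbits) (((nbits) + CHAR_BIT - 1) / CHAR_BIT)

#define LIMIT 100

/* Count the primes below LIMIT; a set bit marks a composite number. */
int count_small_primes(void)
{
    unsigned char composite[BITNSLOTS(LIMIT)] = {0};
    int i, j, count = 0;

    for (i = 2; i < LIMIT; i++) {
        if (BITTEST(composite, i))
            continue;               /* already crossed out */
        count++;
        for (j = 2 * i; j < LIMIT; j += i)
            BITSET(composite, j);   /* cross out the multiples of i */
    }

    return count;
}
```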
|
|
|
|
17.12: What is the most efficient way to count the number of bits which
|
|
are set in a value?
|
|
|
|
A: This and many other similar bit-twiddling problems can often be
|
|
sped up and streamlined using lookup tables (but see the next
|
|
question).
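A lookup-table version might be sketched like this, using a 16-entry table indexed by four bits at a time (count_bits and nibble_bits are names chosen for the example; a 256-entry table indexed by whole bytes works the same way):

```c
/* Number of 1 bits in each possible 4-bit value. */
static const unsigned char nibble_bits[16] = {
    0, 1, 1, 2, 1, 2, 2, 3,
    1, 2, 2, 3, 2, 3, 3, 4
};

/* Count the bits set in v, four bits per table lookup. */
int count_bits(unsigned long v)
{
    int n = 0;

    while (v != 0) {
        n += nibble_bits[v & 0xF];
        v >>= 4;
    }

    return n;
}
```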
|
|
|
|
17.13: How can I make this code more efficient?
|
|
|
|
A: Efficiency, though a favorite comp.lang.c topic, is not
|
|
important nearly as often as people tend to think it is. Most
|
|
of the code in most programs is not time-critical. When code is
|
|
not time-critical, it is far more important that it be written
|
|
clearly and portably than that it be written maximally
|
|
efficiently. (Remember that computers are very, very fast, and
|
|
that even "inefficient" code can run without apparent delay.)
|
|
|
|
It is notoriously difficult to predict what the "hot spots" in a
|
|
program will be. When efficiency is a concern, it is important
|
|
to use profiling software to determine which parts of the
|
|
program deserve attention. Often, actual computation time is
|
|
swamped by peripheral tasks such as I/O and memory allocation,
|
|
which can be sped up by using buffering and caching techniques.
|
|
|
|
For the small fraction of code that is time-critical, it is
|
|
vital to pick a good algorithm; it is less important to
|
|
"microoptimize" the coding details. Many of the "efficient
|
|
coding tricks" which are frequently suggested (e.g. substituting
|
|
shift operators for multiplication by powers of two) are
|
|
performed automatically by even simpleminded compilers.
|
|
Heavyhanded "optimization" attempts can make code so bulky that
|
|
performance is degraded.
|
|
|
|
For more discussion of efficiency tradeoffs, as well as good
|
|
advice on how to increase efficiency when it is important, see
|
|
chapter 7 of Kernighan and Plauger's The Elements of Programming
|
|
Style, and Jon Bentley's Writing Efficient Programs.
|
|
|
|
17.14: Are pointers really faster than arrays? How much do function
|
|
calls slow things down? Is ++i faster than i = i + 1?
|
|
|
|
A: Precise answers to these and many similar questions depend of
|
|
course on the processor and compiler in use. If you simply must
|
|
know, you'll have to time test programs carefully. (Often the
|
|
differences are so slight that hundreds of thousands of
|
|
iterations are required even to see them. Check the compiler's
|
|
assembly language output, if available, to see if two purported
|
|
alternatives aren't compiled identically.)
|
|
|
|
It is "usually" faster to march through large arrays with
|
|
pointers rather than array subscripts, but for some processors
|
|
the reverse is true.
|
|
|
|
Function calls, though obviously incrementally slower than in-
|
|
line code, contribute so much to modularity and code clarity
|
|
that there is rarely good reason to avoid them.
|
|
|
|
Before rearranging expressions such as i = i + 1, remember that
|
|
you are dealing with a C compiler, not a keystroke-programmable
|
|
calculator. Any decent compiler will generate identical code
|
|
for ++i, i += 1, and i = i + 1. The reasons for using ++i or
|
|
i += 1 over i = i + 1 have to do with style, not efficiency.
|
|
(See also question 4.4.)
|
|
|
|
17.15: This program crashes before it even runs! (When single-stepping
|
|
with a debugger, it dies before the first statement in main.)
|
|
|
|
A: You probably have one or more very large (kilobyte or more)
|
|
local arrays. Many systems have fixed-size stacks, and those
|
|
which perform dynamic stack allocation automatically (e.g. UNIX)
|
|
can be confused when the stack tries to grow by a huge chunk all
|
|
at once.
|
|
|
|
It is often better to declare large arrays with static duration
|
|
(unless of course you need a fresh set with each recursive
|
|
call).
|
|
|
|
(See also question 9.4.)
|
|
|
|
17.16: What do "Segmentation violation" and "Bus error" mean?
|
|
|
|
A: These generally mean that your program tried to access memory it
|
|
shouldn't have, invariably as a result of improper pointer use,
|
|
often involving malloc (see question 17.17) or perhaps scanf
|
|
(see question 11.2).
|
|
|
|
17.17: My program is crashing, apparently somewhere down inside malloc,
but I can't see anything wrong with it.

A: It is unfortunately very easy to corrupt malloc's internal data
structures, and the resulting problems can be hard to track
down. The most common source of problems is writing more to a
malloc'ed region than it was allocated to hold; a particularly
common bug is to malloc(strlen(s)) instead of strlen(s) + 1.
Other problems involve freeing pointers not obtained from
malloc, or trying to realloc a null pointer (see question 3.10).

A number of debugging packages exist to help track down malloc
problems; one popular one is Conor P. Cahill's "dbmalloc".

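The strlen(s) + 1 point can be illustrated with a short sketch
of a string-copying routine. The name copy_string is an
assumption made for this example.

```c
#include <stdlib.h>
#include <string.h>

/* Return a malloc'ed copy of s, or NULL if allocation fails.
 * The caller is responsible for calling free() on the result. */
char *copy_string(const char *s)
{
    /* WRONG:  malloc(strlen(s))  leaves no room for the
     * terminating '\0', so strcpy would write one byte past
     * the end of the allocated region. */
    char *p = malloc(strlen(s) + 1);

    if (p != NULL)
        strcpy(p, s);
    return p;
}
```
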
17.18: Does anyone have a C compiler test suite I can use?

A: Plum Hall (1 Spruce Ave., Cardiff, NJ 08232, USA) sells one.
The FSF's GNU C (gcc) distribution includes a
c-torture-test.tar.Z which checks a number of common problems
with compilers. Kahan's paranoia test, found in netlib on
research.att.com, strenuously tests a C implementation's
floating point capabilities.

17.19: Where can I get a YACC grammar for C?

A: The definitive grammar is of course the one in the ANSI
standard. Several copies are floating around; keep your eyes
open. There is one (due to Jeff Lee) on uunet (see question
17.8) in usenet/net.sources/ansi.c.grammar.Z (including a
companion lexer). Another one, by Jim Roskind, is in
pub/*grammar* at ics.uci.edu . The FSF's GNU C compiler
contains a grammar, as does the appendix to K&R II.

References: ANSI Sec. A.2 .

17.20: How do you pronounce "char"?

A: You can pronounce the C keyword "char" in at least three ways:
like the English words "char," "care," or "car;" the choice is
arbitrary.

17.21: What's a good book for learning C?

A: Mitch Wright maintains an annotated bibliography of C and UNIX
books; it is available for anonymous ftp from ftp.rahul.net in
directory pub/mitch/YABL.

This FAQ list's editor maintains a collection of previous
answers to this question, which is available upon request.

17.22: Where can I get extra copies of this list? What about back
issues?

A: For now, just pull it off the net; it is normally posted to
comp.lang.c on the first of each month, with an Expires: line
which should keep it around all month. It can also be found in
the newsgroups comp.answers and news.answers . Several sites
archive news.answers postings and other FAQ lists, including
this one: two sites are rtfm.mit.edu (directory pub/usenet), and
ftp.uu.net (directory usenet). The archie server should help
you find others. See the meta-FAQ list in news.answers for more
information; see also question 17.8.

This list is an evolving document of questions which have been
Frequent since before the Great Renaming, not just a collection
of this month's interesting questions. Older copies are
obsolete and don't contain much, except the occasional typo,
that the current list doesn't.


Bibliography

ANSI    American National Standard for Information Systems --
        Programming Language -- C, ANSI X3.159-1989 (see question 5.2).

JLB     Jon Louis Bentley, Writing Efficient Programs, Prentice-Hall,
        1982, ISBN 0-13-970244-X.

H&S     Samuel P. Harbison and Guy L. Steele, C: A Reference Manual,
        Second Edition, Prentice-Hall, 1987, ISBN 0-13-109802-0.
        (A third edition has recently been released.)

PCS     Mark R. Horton, Portable C Software, Prentice Hall, 1990,
        ISBN 0-13-868050-7.

EoPS    Brian W. Kernighan and P.J. Plauger, The Elements of Programming
        Style, Second Edition, McGraw-Hill, 1978, ISBN 0-07-034207-5.

K&R I   Brian W. Kernighan and Dennis M. Ritchie, The C Programming
        Language, Prentice Hall, 1978, ISBN 0-13-110163-3.

K&R II  Brian W. Kernighan and Dennis M. Ritchie, The C Programming
        Language, Second Edition, Prentice Hall, 1988,
        ISBN 0-13-110362-8, 0-13-110370-9.

Knuth   Donald E. Knuth, The Art of Computer Programming, (3 vols.),
        Addison-Wesley, 1981.

CT&P    Andrew Koenig, C Traps and Pitfalls, Addison-Wesley, 1989,
        ISBN 0-201-17928-8.

        P.J. Plauger, The Standard C Library, Prentice Hall, 1992,
        ISBN 0-13-131509-9.

        Harry Rabinowitz and Chaim Schaap, Portable C, Prentice-Hall,
        1990, ISBN 0-13-685967-4.

There is a more extensive bibliography in the revised Indian Hill style
guide (see question 14.3). See also question 17.21.


Acknowledgements

Thanks to Jamshid Afshar, Sudheer Apte, Randall Atkinson, Dan Bernstein,
Vincent Broman, Stan Brown, Joe Buehler, Gordon Burditt, Burkhard Burow,
D'Arcy J.M. Cain, Christopher Calabrese, Paul Carter, Raymond Chen,
Jonathan Coxhead, James Davies, Jutta Degener, Norm Diamond, Ray Dunn,
Stephen M. Dunn, Bjorn Engsig, Alexander Forst, Jeff Francis, Dave
Gillespie, Samuel Goldstein, Alasdair Grant, Ron Guilmette, Doug Gwyn,
Tony Hansen, Joe Harrington, Guy Harris, Jos Horsmeier, Blair Houghton,
Kirk Johnson, Peter Klausler, Andrew Koenig, Ajoy Krishnan T, Tom
Koenig, John Lauro, Felix Lee, Don Libes, Christopher Lott, Tim
McDaniel, John R. MacMillan, Bob Makowski, Evan Manning, Barry Margolin,
Brad Mears, Mark Moraes, Darren Morby, Landon Curt Noll, David O'Brien,
Richard A. O'Keefe, Hans Olsson, Francois Pinard, Pat Rankin, Erkki
Ruohtula, Rich Salz, Chip Salzenberg, Paul Sand, Doug Schmidt, Patricia
Shanahan, Peter da Silva, Joshua Simons, Henry Spencer, David Spuler,
Erik Talvola, Clarke Thatcher, Wayne Throop, Chris Torek, Goran
Uddeborg, Rodrigo Vanegas, Wietse Venema, Ed Vielmetti, Larry Virden,
Chris Volpe, Freek Wiedijk, Dave Wolverton, Mitch Wright, Conway Yee,
and Zhuo Zang, who have contributed, directly or indirectly, to this
article. Special thanks to Karl Heuer, and particularly to Mark Brader,
who (to borrow a line from Steve Johnson) have goaded me beyond my
inclination, and occasionally beyond my endurance, in relentless pursuit
of a better FAQ list.

Steve Summit
scs@eskimo.com

This article is Copyright 1988, 1990-1993 by Steve Summit.
It may be freely redistributed so long as the author's name, and this
notice, are retained.
The C code in this article (vstrcat(), error(), etc.) is public domain
and may be used without restriction.