GNU coding standards. Last updated 10 Feb 89

Reference standards:

Don't in any circumstances refer to Unix source code for or during
your work on GNU! (Or to any other proprietary programs.)

If you have a vague recollection of the internals of a Unix program,
this does not absolutely mean you can't write an imitation of it, but
do try to organize the imitation along different lines, because this
is likely to make the details of the Unix version irrelevant and
dissimilar to your results.

For example, Unix utilities were generally optimized to minimize
memory use; if you go for speed instead, your program will be very
different. You could keep the entire input file in core and scan it
there instead of using stdio. Use a smarter algorithm discovered more
recently than the Unix program. Eliminate use of temporary files. Do
it in one pass instead of two (we did this in the assembler).

Or, on the contrary, emphasize simplicity instead of speed. For some
applications, the speed of today's computers makes simpler algorithms
adequate.

Or go for generality. For example, Unix programs often have static
tables or fixed-size strings, which make for arbitrary limits; use
dynamic allocation instead. Make sure your program handles nulls and
other funny characters in the input files. Add a programming language
for extensibility and write part of the program in that language.

Or turn some parts of the program into independently usable libraries.
Or use a simple garbage collector instead of tracking precisely when
to free memory, or use a new GNU facility such as obstacks.


Compatibility standards:

Utility programs and libraries for GNU should be totally upward
compatible with those in Berkeley Unix, with certain exceptions.
When a feature is used only by users (not by programs or command files),
and it is done poorly in Berkeley Unix, it is good to replace
it completely with something totally different and better.
(For example, vi is replaced with Emacs.) But it is nice to
offer a compatible feature as well. (There is a free vi-clone,
so we will offer it.)

Additional useful features not in Berkeley Unix are welcome.
Additional programs with no counterpart in Unix may be useful,
but our first priority is usually to duplicate what Unix already
has.


Formatting standards:

It is important to put the open-brace that starts the body of a C function
in column zero, and avoid putting any other open-brace or open-parenthesis
or open-bracket in column zero. Several tools look for
open-braces in column zero to find the beginnings of C functions.
These tools will not work on code not formatted that way.

It is also important for function definitions to start the name
of the function in column zero. `ctags' or `etags' cannot recognize
them otherwise. Thus, the proper format is this:

static char *
concat (s1, s2)        /* Name starts in column zero here */
     char *s1, *s2;
{                      /* Open brace in column zero here */
  ...
}

Aside from this, I prefer code formatted like this:

if (x < foo (y, z))
  haha = bar[4] + 5;
else
  {
    while (z)
      {
        haha += foo (z, z);
        z--;
      }
    return ++x + bar ();
  }

I find it easier to read a program when it has spaces before the
open-parentheses and after the commas. Especially after the commas.

Please use formfeed characters to divide the program into pages at
logical places (but not within a function). It does not matter just
how long the pages are, since they do not have to fit on a printed
page.

When you split an expression into multiple lines, split it
before an operator, not after one. For example:

if (foo_this_is_long && bar > win (x, y, z)
    && remaining_condition)


Commenting Standards:

Every program should start with a comment saying briefly
what it is for. Example: "fmt -- filter for simple filling of text".

Please put a comment on each function saying what the function does,
what sorts of arguments it gets, and what the possible values of
arguments mean and are used for. It is not necessary to duplicate in
words the meaning of the C argument declarations, if a C type is being
used in its customary fashion. If there is anything nonstandard about
its use (such as an argument of type `char *' which is really the
address of the second character of a string, not the first), or any
possible values that would not work the way one would expect (such as,
that strings containing newlines are not guaranteed to work), be sure
to say so.

Please put two spaces after the end of a sentence in your comments, so
that the Emacs sentence commands will work. Also, please write
complete sentences and capitalize the first word. Avoid putting
a case-sensitive identifier at the beginning of a sentence, because
such identifiers look strange if lower case and yet are erroneous if
capitalized.

The comment on a function is much clearer if you use the argument
names to speak about the argument values. Thus, "the inode number
NODE_NUM" rather than "an inode". Our convention is to put the
argument name in all caps when speaking about its value in this way.

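As a sketch of these conventions, here is a made-up function whose
comment speaks about its arguments by name in all caps. The function
`count_char' and its arguments are invented for illustration, and it
is written with ANSI prototypes rather than old-style definitions:

```c
#include <stddef.h>

/* Return the number of times the character C occurs among the first
   LEN bytes starting at BUF.  BUF need not be null-terminated, and
   null characters within it are treated like any other character.  */

size_t
count_char (const char *buf, size_t len, char c)
{
  size_t count = 0;
  size_t i;

  for (i = 0; i < len; i++)
    if (buf[i] == c)
      count++;
  return count;
}
```

The comment does not restate the C types, but it does call out the one
nonstandard thing about BUF: it is a counted buffer, not a string.
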
There is usually no purpose in restating the name of the function in
the comment before it, because the reader can see that for himself.
There might be an exception when the comment is very long.

There should be a comment on each static variable as well, like this:

/* Nonzero means truncate lines in the display;
   zero means continue them.  */

int truncate_lines;


Syntactic Standards:

Please explicitly declare all arguments to functions.
Don't omit them just because they are ints.

Declarations of external functions and functions to appear later
in the source file should all go in one place near the beginning of
the file (somewhere before the first function definition in the file),
or else should go in a header file. Don't put extern declarations
inside functions.

Don't declare multiple variables in one declaration that spans lines.
Start a new declaration on each line, instead. For example, instead
of this:

int foo,
    bar;

write either this:

int foo, bar;

or this:

int foo;
int bar;

(If they are global variables, each should have a comment
preceding it anyway.)

When you have an if-else statement nested in another if statement,
always put braces around the if-else. Thus, never write like this:

if (foo)
  if (bar)
    win ();
  else
    lose ();

always like this:

if (foo)
  {
    if (bar)
      win ();
    else
      lose ();
  }

Don't declare both a structure tag and variables or typedefs in the
same declaration. Instead, declare the structure tag separately
and then use it to declare the variables or typedefs.

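A minimal sketch of the structure-tag rule follows. The names
`symbol', `symbol_entry' and `current_symbol' are hypothetical, chosen
only for illustration. The tag gets a declaration of its own, and the
typedef and the variable each use it afterward, rather than everything
being packed into one `struct symbol { ... } current_symbol;':

```c
/* One entry in a hypothetical symbol table.  */

struct symbol
{
  const char *name;
  int value;
};

/* A shorter name for the structure type, declared separately
   from the tag.  */

typedef struct symbol symbol_entry;

/* The entry most recently looked up; all zero until one is stored.  */

struct symbol current_symbol;
```
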
Try to avoid assignments inside if-conditions. For example, don't
write this:

if ((foo = (char *) malloc (sizeof *foo)) == 0)
  fatal ("out of memory");

Instead, write this:

foo = (char *) malloc (sizeof *foo);
if (foo == 0)
  fatal ("out of memory");


Naming Standards:

Please use underscores to separate words in a name,
so that the Emacs word commands can be useful within them.
Stick to lower case; reserve upper case for macros and enum
constants, and for name-prefixes that follow a uniform convention.

For example, use names like `ignore_space_change_flag';
don't use names like `iCantReadThis'.

Variables that indicate whether command-line options have been
specified should be named after the meaning of the option, not after
the option-letter. A comment should state both the exact meaning of
the option and its letter. For example,

/* Ignore changes in horizontal whitespace (-b).  */
int ignore_space_change_flag;


Semantic Standards:

Avoid arbitrary limits on the length or number of ANY data structure,
including filenames, lines, files, and symbols, by allocating all
data structures dynamically. In most Unix utilities, "long lines
are silently truncated". This is not acceptable in a GNU utility.

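One way to avoid a limit on line length is to grow the buffer as the
line is read. The following is only a sketch under that assumption;
`read_line' and its interface are invented for illustration and are
not an existing GNU facility:

```c
#include <stdio.h>
#include <stdlib.h>

/* Read one line from STREAM into a newly allocated buffer and return
   it, storing the number of bytes read in *LENGTH.  The trailing
   newline, if any, is included.  The buffer is grown as needed, so
   there is no limit on line length, and null characters are kept.
   Return zero at end of file or if memory is exhausted.  The caller
   should free the result.  */

char *
read_line (FILE *stream, size_t *length)
{
  size_t size = 64;
  size_t used = 0;
  char *buffer;
  int c;

  buffer = malloc (size);
  if (buffer == 0)
    return 0;

  while ((c = getc (stream)) != EOF)
    {
      /* Keep room for this byte and for a terminating null.  */
      if (used + 1 >= size)
        {
          char *new_buffer;

          new_buffer = realloc (buffer, size * 2);
          if (new_buffer == 0)
            {
              free (buffer);
              return 0;
            }
          buffer = new_buffer;
          size *= 2;
        }
      buffer[used++] = c;
      if (c == '\n')
        break;
    }

  if (used == 0)
    {
      free (buffer);
      return 0;
    }
  buffer[used] = '\0';
  *length = used;
  return buffer;
}
```

Because the length is returned separately, the caller does not need
`strlen' and so is not confused by null characters inside the line.
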
Utilities reading files should not drop null characters, or any other
nonprinting characters including those with codes above 0177, except
perhaps utilities specifically intended for interface to printers.

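Characters usually get dropped when data is handled with string
functions that stop at a null. A sketch of the alternative, using
explicit byte counts, is below; `copy_stream' is a hypothetical helper.
Its fixed-size array is only an I/O chunk and puts no limit on the
amount of data copied:

```c
#include <stdio.h>

/* Copy all remaining bytes from IN to OUT, using explicit byte counts
   so that null characters and characters above 0177 pass through
   unchanged.  Return 0 on success, or -1 if a read or write fails.  */

int
copy_stream (FILE *in, FILE *out)
{
  char buffer[4096];
  size_t count;

  for (;;)
    {
      count = fread (buffer, 1, sizeof buffer, in);
      if (count == 0)
        break;
      if (fwrite (buffer, 1, count, out) != count)
        return -1;
    }
  return ferror (in) ? -1 : 0;
}
```
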
Check every system call for an error return, unless you know you
wish to ignore errors. Include the system error text (from perror
or equivalent) in *every* error message resulting from a failing
system call, as well as the name of the file if any and the
name of the utility. Just "cannot open foo.c" or "stat failed"
is not sufficient.

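A message with all three parts might be produced as in the sketch
below. The names `program_name' and `open_for_reading' are assumptions
made for this example, and `strerror' stands in for "perror or
equivalent":

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* The name of the utility, for use in error messages.  The program
   is expected to set this at startup, typically from argv[0].  */

const char *program_name;

/* Try to open the file named FILE_NAME for reading.  On failure,
   print a message giving the utility name, FILE_NAME, and the system
   error text, and return zero.  */

FILE *
open_for_reading (const char *file_name)
{
  FILE *stream;

  stream = fopen (file_name, "r");
  if (stream == 0)
    fprintf (stderr, "%s: cannot open %s: %s\n",
             program_name, file_name, strerror (errno));
  return stream;
}
```

With `program_name' set to "fmt", a missing file would be reported
along the lines of "fmt: cannot open foo.c:" followed by the system's
own text for the error.
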
Check every call to `malloc' or `realloc' to see if it returned zero.
Check `realloc' even if you are making the block smaller; in a power-of-2
system, this can require allocating new space!

In Unix, `realloc' can destroy the storage block if it returns zero.
GNU `realloc' does not have this bug, so your utilities can assume
that `realloc' works conveniently (does not destroy the original block
when it returns zero). If you wish to run them on Unix, and wish to
avoid lossage in this case, you can use the GNU `malloc'.

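A common way to make sure no call goes unchecked is to route all
allocation through small wrappers. The sketch below assumes that
running out of memory should be fatal; the names `xmalloc', `xrealloc'
and `fatal' are illustrative, and SIZE is assumed to be nonzero:

```c
#include <stdio.h>
#include <stdlib.h>

/* Print MESSAGE on standard error and exit with failure status.  */

void
fatal (const char *message)
{
  fprintf (stderr, "%s\n", message);
  exit (EXIT_FAILURE);
}

/* Allocate SIZE bytes, or give a fatal error if memory is
   exhausted.  SIZE should be nonzero.  */

void *
xmalloc (size_t size)
{
  void *block;

  block = malloc (size);
  if (block == 0)
    fatal ("out of memory");
  return block;
}

/* Change the size of BLOCK to SIZE bytes and return its new address.
   The result is checked even when the block is being made smaller.
   SIZE should be nonzero.  */

void *
xrealloc (void *block, size_t size)
{
  void *new_block;

  new_block = realloc (block, size);
  if (new_block == 0)
    fatal ("out of memory");
  return new_block;
}
```

Note that `xrealloc' stores the result in a separate variable before
testing it, so the address of the original block is not lost if the
call returns zero.
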
Do not assume anything about the contents of a block of memory
after it has been passed to `free'.

When static storage is to be written in during program execution,
use explicit C code to initialize it. Reserve C initialized
declarations for data that will not be changed.

Try to avoid low-level interfaces to obscure Unix data structures
(such as file directories, utmp, or the layout of kernel memory),
since these are less likely to work compatibly. If you need to
find all the files in a directory, use `readdir' or some other
high-level interface. These will be supported compatibly by GNU.

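For instance, a directory can be scanned through the `opendir',
`readdir' and `closedir' interface from <dirent.h> without knowing
anything about how directories are stored. This is a sketch only, and
`count_directory_entries' is a hypothetical name:

```c
#include <dirent.h>

/* Return the number of entries in the directory named DIR_NAME,
   counting whatever entries `readdir' reports, or -1 if the
   directory cannot be opened.  */

int
count_directory_entries (const char *dir_name)
{
  DIR *dir;
  struct dirent *entry;
  int count = 0;

  dir = opendir (dir_name);
  if (dir == 0)
    return -1;

  for (;;)
    {
      entry = readdir (dir);
      if (entry == 0)
        break;
      count++;
    }
  closedir (dir);
  return count;
}
```

A real utility would use `entry->d_name' inside the loop, and would
report the system error text when `opendir' fails.
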
GNU signal handling will probably be like that in BSD, rather than
that in System V, because BSD's is more powerful and easier to use.


Portability standards:

Much of what is called "portability" in the Unix world refers to
porting to different Unix versions. This is not relevant to GNU
software, because its purpose is to run on top of one and only
one kernel, the GNU kernel, compiled with one and only one C
compiler, the GNU C compiler. The amount and kinds of variation
among GNU systems on different CPUs will be like the variation
among Berkeley 4.3 systems on different CPUs.

It is difficult to be sure exactly what facilities the GNU kernel will
provide, since it isn't finished yet. Therefore, assume you can use
anything in 4.3; just avoid using the format of semi-internal data
bases (utmp, directories, kmem) when there is a higher-level
alternative.

You can freely assume any reasonably standard facilities in the C
language, libraries or kernel, because any such facility is part
of GNU's job to support. The fact that there may exist kernels or
C compilers that lack those facilities is irrelevant as long
as the GNU kernel and C compiler support them.

It remains necessary to worry about differences among CPU types, such
as the difference in byte ordering and alignment restrictions.
However, I don't expect 16-bit machines ever to be supported by GNU,
so there is no point in spending any time to consider the possibility
that an int will be less than 32 bits.

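As an illustration of allowing for byte ordering and alignment, a
multi-byte value in a file or buffer can be assembled one byte at a
time instead of being fetched through a cast pointer. The function
name and the choice of a big-endian format below are assumptions made
for the example:

```c
/* Return the 32-bit unsigned value stored at BYTES with the most
   significant byte first.  The bytes are combined arithmetically, so
   the result does not depend on the byte order of the host, and BYTES
   need not be aligned.  */

unsigned long
read_big_endian_32 (const unsigned char *bytes)
{
  return (((unsigned long) bytes[0] << 24)
          | ((unsigned long) bytes[1] << 16)
          | ((unsigned long) bytes[2] << 8)
          | (unsigned long) bytes[3]);
}
```
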
You can assume that it is reasonable to use a megabyte of memory. Don't
strain to reduce memory usage unless it can get to that level. If
your program creates complicated data structures, just make them in
core and give a fatal error if malloc returns zero.

If a program works by lines and could be applied to arbitrary
user-supplied input files, it should keep only a line in memory, because
this is not very hard and users will want to be able to operate
on input files that are bigger than will fit in core all at once.