C (/ˈsiː/, as in the letter c) is a general-purpose, imperative computer programming language, supporting structured programming, lexical variable scope andrecursion, while a static type system prevents many unintended operations. By design, C provides constructs that map efficiently to typical machine instructions, and therefore it has found lasting use in applications that had formerly been coded in assembly language, including operating systems, as well as various application software for computers ranging from supercomputers to embedded systems.
C was originally developed by Dennis Ritchiebetween 1969 and 1973 at Bell Labs,[5] and used to re-implement the Unix operating system.[6] It has since become one of themost widely used programming languages of all time,[7][8] with C compilers from various vendors available for the majority of existingcomputer architectures and operating systems. C has been standardized by theAmerican National Standards Institute (ANSI) since 1989 (see ANSI C) and subsequently by the International Organization for Standardization (ISO).
| |||||||||
|
C99 reserved five more words:
C11 reserved seven more words:[22]
|
|
|
|
Most of the recently reserved words begin with an underscore followed by a capital letter, because identifiers of that form were previously reserved by the C standard for use only by implementations. Since existing program source code should not have been using these identifiers, it would not be affected when C implementations started supporting these extensions to the programming language. Some standard headers do define more convenient synonyms for underscored identifiers. The language previously included a reserved word called
entry, but this was seldom implemented, and has now been removed as a reserved word.[23]OperatorsEdit
Main article: Operators in C and C++
C supports a rich set of operators, which are symbols used within an expression to specify the manipulations to be performed while evaluating that expression. C has operators for:
- arithmetic:
+,-,*,/,% - assignment:
= - augmented assignment:
+=,-=,*=,/=,%=,&=,|=,^=,<<=,>>= - bitwise logic:
~,&,|,^ - bitwise shifts:
<<,>> - boolean logic:
!,&&,|| - conditional evaluation:
? : - equality testing:
==,!= - calling functions:
( ) - increment and decrement:
++,-- - member selection:
.,-> - object size:
sizeof - order relations:
<,<=,>,>= - reference and dereference:
&,*,[ ] - sequencing:
, - subexpression grouping:
( ) - type conversion:
(typename)
C uses the
= operator, reserved in mathematics to express equality, to indicate assignment, following the precedent ofFortran and PL/I, but unlike ALGOL and its derivatives. The similarity between C's operator for assignment and that for equality (==) has been criticized[by whom?] as it makes it easy to accidentally substitute one for the other. In many cases, each may be used in the context of the other without a compilation error (although some compilers produce warnings). For example, the conditional expression in if(a=b+1) is true if a is not zero after the assignment.[24] Additionally, C'soperator precedence is non-intuitive, such as == binding more tightly than & and | in expressions like x & 1 == 0, which would need to be written (x & 1) == 0 to be properly evaluated.[25]
"Hello, world" exampleEdit
The "hello, world" example, which appeared in the first edition of K&R, has become the model for an introductory program in most programming textbooks, regardless of programming language. The program prints "hello, world" to the standard output, which is usually a terminal or screen display.
The original version was:[26]
main()
{
printf("hello, world\n");
}
A standard-conforming "hello, world" program is:[a]
#include <stdio.h>
int main(void)
{
printf("hello, world\n");
}
The first line of the program contains apreprocessing directive, indicated by
#include. This causes the compiler to replace that line with the entire text of the stdio.h standard header, which contains declarations for standard input and output functions such as printf. The angle brackets surrounding stdio.h indicate that stdio.h is located using a search strategy that prefers headers in the compiler's include path to other headers having the same name; double quotes are used to include local or project-specific header files.[discuss]
The next line indicates that a function named
main is being defined. The main function serves a special purpose in C programs; the run-time environment calls the mainfunction to begin program execution. The type specifier int indicates that the value that is returned to the invoker (in this case the run-time environment) as a result of evaluating the main function, is an integer. The keyword void as a parameter list indicates that this function takes no arguments.[b]
The opening curly brace indicates the beginning of the definition of the
mainfunction.
The next line calls (diverts execution to) a function named
printf, which is supplied from a system library. In this call, the printf function is passed (provided with) a single argument, the address of the first character in the string literal "hello, world\n". The string literal is an unnamedarray with elements of type char, set up automatically by the compiler with a final 0-valued character to mark the end of the array (printf needs to know this). The \n is anescape sequence that C translates to a newlinecharacter, which on output signifies the end of the current line. The return value of the printf function is of type int, but it is silently discarded since it is not used. (A more careful program might test the return value to determine whether or not the printf function succeeded.) The semicolon; terminates the statement.
The closing curly brace indicates the end of the code for the
main function. According to the C99 specification and newer, the mainfunction, unlike any other function, will implicitly return a status of 0 upon reaching the } that terminates the function. This is interpreted by the run-time system as an exit code indicating successful execution.[27]
Data typesEdit
Main article: C variable types and declarations
The type system in C is static and weakly typed, which makes it similar to the type system of ALGOL descendants such asPascal.[28] There are built-in types for integers of various sizes, both signed and unsigned,floating-point numbers, characters, and enumerated types (
enum). C99 added aboolean datatype. There are also derived types including arrays, pointers, records(struct), and untagged unions (union).
C is often used in low-level systems programming where escapes from the type system may be necessary. The compiler attempts to ensure type correctness of most expressions, but the programmer can override the checks in various ways, either by using a type cast to explicitly convert a value from one type to another, or by using pointers or unions to reinterpret the underlying bits of a data object in some other way.
Some find C's declaration syntax unintuitive, particularly for function pointers. (Ritchie's idea was to declare identifiers in contexts resembling their use: "declaration reflects use".)[29]
C's usual arithmetic conversions allow for efficient code to be generated, but can sometimes produce unexpected results. For example, a comparison of signed and unsigned integers of equal width requires a conversion of the signed value to unsigned. This can generate unexpected results if the signed value is negative.
PointersEdit
C supports the use of pointers, a type ofreference that records the address or location of an object or function in memory. Pointers can be dereferenced to access data stored at the address pointed to, or to invoke a pointed-to function. Pointers can be manipulated using assignment or pointer arithmetic. The run-time representation of a pointer value is typically a raw memory address (perhaps augmented by an offset-within-word field), but since a pointer's type includes the type of the thing pointed to, expressions including pointers can be type-checked at compile time. Pointer arithmetic is automatically scaled by the size of the pointed-to data type. Pointers are used for many purposes in C. Text stringsare commonly manipulated using pointers into arrays of characters. Dynamic memory allocation is performed using pointers. Many data types, such as trees, are commonly implemented as dynamically allocated
struct objects linked together using pointers. Pointers to functions are useful for passing functions as arguments to higher-order functions (such as qsort or bsearch) or as callbacks to be invoked by event handlers.[27]
A null pointer value explicitly points to no valid location. Dereferencing a null pointer value is undefined, often resulting in a segmentation fault. Null pointer values are useful for indicating special cases such as no "next" pointer in the final node of a linked list, or as an error indication from functions returning pointers. In appropriate contexts in source code, such as for assigning to a pointer variable, a null pointer constant can be written as
0, with or without explicit casting to a pointer type, or as the NULL macro defined by several standard headers. In conditional contexts, null pointer values evaluate to false, while all other pointer values evaluate to true.
Void pointers (
void *) point to objects of unspecified type, and can therefore be used as "generic" data pointers. Since the size and type of the pointed-to object is not known, void pointers cannot be dereferenced, nor is pointer arithmetic on them allowed, although they can easily be (and in many contexts implicitly are) converted to and from any other object pointer type.[27]
Careless use of pointers is potentially dangerous. Because they are typically unchecked, a pointer variable can be made to point to any arbitrary location, which can cause undesirable effects. Although properly used pointers point to safe places, they can be made to point to unsafe places by using invalid pointer arithmetic; the objects they point to may be deallocated and reused (dangling pointers); they may be used without having been initialized (wild pointers); or they may be directly assigned an unsafe value using a cast, union, or through another corrupt pointer. In general, C is permissive in allowing manipulation of and conversion between pointer types, although compilers typically provide options for various levels of checking. Some other programming languages address these problems by using more restrictive reference types.
ArraysEdit
See also: C string
Array types in C are traditionally of a fixed, static size specified at compile time. (The more recent C99 standard also allows a form of variable-length arrays.) However, it is also possible to allocate a block of memory (of arbitrary size) at run-time, using the standard library's
malloc function, and treat it as an array. C's unification of arrays and pointers means that declared arrays and these dynamically allocated simulated arrays are virtually interchangeable.
Since arrays are always accessed (in effect) via pointers, array accesses are typically notchecked against the underlying array size, although some compilers may providebounds checking as an option.[30] Array bounds violations are therefore possible and rather common in carelessly written code, and can lead to various repercussions, including illegal memory accesses, corruption of data, buffer overruns, and run-time exceptions. If bounds checking is desired, it must be done manually.
C does not have a special provision for declaring multidimensional arrays, but rather relies on recursion within the type system to declare arrays of arrays, which effectively accomplishes the same thing. The index values of the resulting "multidimensional array" can be thought of as increasing in row-major order.
Multidimensional arrays are commonly used in numerical algorithms (mainly from appliedlinear algebra) to store matrices. The structure of the C array is well suited to this particular task. However, since arrays are passed merely as pointers, the bounds of the array must be known fixed values or else explicitly passed to any subroutine that requires them, and dynamically sized arrays of arrays cannot be accessed using double indexing. (A workaround for this is to allocate the array with an additional "row vector" of pointers to the columns.)
C99 introduced "variable-length arrays" which address some, but not all, of the issues with ordinary C arrays.
Array–pointer interchangeabilityEdit
The subscript notation
x[i] (where xdesignates a pointer) is a syntactic sugar for *(x+i).[31] Taking advantage of the compiler's knowledge of the pointer type, the address that x + i points to is not the base address (pointed to by x) incremented by i bytes, but rather is defined to be the base address incremented by i multiplied by the size of an element that x points to. Thus, x[i] designates the i+1th element of the array.
Furthermore, in most expression contexts (a notable exception is as operand of
sizeof), the name of an array is automatically converted to a pointer to the array's first element. This implies that an array is never copied as a whole when named as an argument to a function, but rather only the address of its first element is passed. Therefore, although function calls in C usepass-by-value semantics, arrays are in effect passed by reference.
The size of an element can be determined by applying the operator
sizeof to any dereferenced element of x, as in n = sizeof *x or n = sizeof x[0], and the number of elements in a declared array Acan be determined as sizeof A / sizeof A[0]. The latter only applies to array names: variables declared with subscripts (int A[20]). Due to the semantics of C, it is not possible to determine the entire size of arrays through pointers to arrays or those created by dynamic allocation (malloc); code such as sizeof arr / sizeof arr[0] (where arrdesignates a pointer) will not work since the compiler assumes the size of the pointer itself is being requested.[32][33] Since array name arguments to sizeof are not converted to pointers, they do not exhibit such ambiguity. However, arrays created by dynamic allocation are accessed by pointers rather than true array variables, so they suffer from the same sizeof issues as array pointers.
Thus, despite this apparent equivalence between array and pointer variables, there is still a distinction to be made between them. Even though the name of an array is, in most expression contexts, converted into a pointer (to its first element), this pointer does not itself occupy any storage; the array name is not an l-value, and its address is a constant, unlike a pointer variable. Consequently, what an array "points to" cannot be changed, and it is impossible to assign a new address to an array name. Array contents may be copied, however, by using the
memcpy function, or by accessing the individual elements.
Memory management
LibrariesEdit
The C programming language uses librariesas its primary method of extension. In C, a library is a set of functions contained within a single "archive" file. Each library typically has a header file, which contains the prototypes of the functions contained within the library that may be used by a program, and declarations of special data types and macro symbols used with these functions. In order for a program to use a library, it must include the library's header file, and the library must be linked with the program, which in many cases requires compiler flags (e.g.,
-lm, shorthand for "math library").[27]
The most common C library is the C standard library, which is specified by the ISO andANSI C standards and comes with every C implementation. (Implementations which target limited environments such asembedded systems may provide only a subset of the standard library.) This library supports stream input and output, memory allocation, mathematics, character strings, and time values. Several separate standard headers (for example,
stdio.h) specify the interfaces for these and other standard library facilities.
Another common set of C library functions are those used by applications specifically targeted for Unix and Unix-like systems, especially functions which provide an interface to the kernel. These functions are detailed in various standards such as POSIXand the Single UNIX Specification.
Since many programs have been written in C, there are a wide variety of other libraries available. Libraries are often written in C because C compilers generate efficient object code; programmers then create interfaces to the library so that the routines can be used from higher-level languages like Java, Perl, and Python.[27]
Language toolsEdit
Tools have been created to help C programmers avoid some of the problems inherent in the language, such as statements with undefined behavior or statements that are not a good practice because they are likely to result in unintended behavior or run-time errors.
Automated source code checking and auditing are beneficial in any language, and for C many such tools exist, such as Lint. A common practice is to use Lint to detect questionable code when a program is first written. Once a program passes Lint, it is then compiled using the C compiler. Also, many compilers can optionally warn about syntactically valid constructs that are likely to actually be errors. MISRA C is a proprietary set of guidelines to avoid such questionable code, developed for embedded systems.[34]
There are also compilers, libraries, and operating system level mechanisms for performing actions that are not a standard part of C, such as array bounds checking,buffer overflow detection, serialization, andautomatic garbage collection.
Tools such as Purify or Valgrind and linking with libraries containing special versions of the memory allocation functions can help uncover runtime errors in memory usage.
UsesEdit
The TIOBE index graph from 2002 to 2015, showing a comparison of the popularity of various programming languages[35]
C is widely used for "system programming", including implementing operating systemsand embedded system applications, due to a combination of desirable characteristics such as code portability and efficiency, ability to access specific hardware addresses, ability to pun types to match externally imposed data access requirements, and low run-timedemand on system resources. C can also be used for website programming using CGI as a "gateway" for information between the Web application, the server, and the browser.[36]Some reasons for choosing C over interpreted languages are its speed, stability, and near-universal availability.[37]
One consequence of C's wide availability and efficiency is that compilers, libraries and interpreters of other programming languages are often implemented in C. The primary implementations of Python, Perl 5 and PHP, for example, are all written in C.
Due to its thin layer of abstraction and low overhead, C allows efficient implementations of algorithms and data structures, which is useful for programs that perform a lot of computations. For example, the GNU Multiple Precision Arithmetic Library, the GNU Scientific Library, Mathematica and MATLABare completely or partially written in C.
C is sometimes used as an intermediate language by implementations of other languages. This approach may be used for portability or convenience; by using C as an intermediate language, it is not necessary to develop machine-specific code generators. C has some features, such as line-number preprocessor directives and optional superfluous commas at the end of initializer lists, which support compilation of generated code. However, some of C's shortcomings have prompted the development of other C-based languages specifically designed for use as intermediate languages, such as C--.
C has also been widely used to implementend-user applications, but much of that development has shifted to newer, higher-level languages.
Related languagesEdit
C has directly or indirectly influenced many later languages such as C#, D, Go, Java,JavaScript, Limbo, LPC, Perl, PHP, Python, and Unix's C shell. The most pervasive influence has been syntactical: all of the languages mentioned combine the statement and (more or less recognizably) expressionsyntax of C with type systems, data models and/or large-scale program structures that differ from those of C, sometimes radically.
Several C or near-C interpreters exist, including Ch and CINT, which can also be used for scripting.
When object-oriented languages became popular, C++ and Objective-C were two different extensions of C that provided object-oriented capabilities. Both languages were originally implemented as source-to-source compilers; source code was translated into C, and then compiled with a C compiler.
The C++ programming language was devised by Bjarne Stroustrup as an approach to providing object-oriented functionality with a C-like syntax.[38] C++ adds greater typing strength, scoping, and other tools useful in object-oriented programming, and permitsgeneric programming via templates. Nearly a superset of C, C++ now supports most of C, with a few exceptions.
Objective-C was originally a very "thin" layer on top of C, and remains a strict superset of C that permits object-oriented programming using a hybrid dynamic/static typing paradigm. Objective-C derives its syntax from both C and Smalltalk: syntax that involves preprocessing, expressions, function declarations, and function calls is inherited from C, while the syntax for object-oriented features was originally taken from Smalltalk.