In C programming, tokens are the smallest individual units of a program’s source code that have a distinct meaning to the compiler. They’re like the building blocks that the compiler uses to construct the program’s structure and logic.
Here are the primary types of tokens in C:
- Keywords:
  - Predefined reserved words with specific meanings in C syntax.
  - Examples: `int`, `float`, `char`, `if`, `else`, `while`, `for`, `return`, `void`, etc.
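
Because keywords are a fixed set, checking whether a word is reserved comes down to a table lookup. The sketch below assumes the 32 keywords of C89/C90; later standards reserve more words (such as `inline` and `restrict` in C99), which are left out here. The names `C89_KEYWORDS` and `is_keyword` are hypothetical, chosen only for this illustration.

```c
#include <string.h>

/* The 32 keywords of C89/C90. Later standards reserve additional words
   (for example inline and restrict in C99), which this sketch omits. */
static const char *const C89_KEYWORDS[] = {
    "auto", "break", "case", "char", "const", "continue", "default", "do",
    "double", "else", "enum", "extern", "float", "for", "goto", "if",
    "int", "long", "register", "return", "short", "signed", "sizeof", "static",
    "struct", "switch", "typedef", "union", "unsigned", "void", "volatile", "while"
};

/* Returns 1 if word is spelled exactly like a C89 keyword, otherwise 0.
   The comparison is case-sensitive, so "Int" is not a keyword. */
int is_keyword(const char *word)
{
    size_t count = sizeof C89_KEYWORDS / sizeof C89_KEYWORDS[0];
    for (size_t i = 0; i < count; i++) {
        if (strcmp(word, C89_KEYWORDS[i]) == 0) {
            return 1;
        }
    }
    return 0;
}
```

With this table, `is_keyword("while")` returns 1, while `is_keyword("count")` and `is_keyword("Int")` both return 0.
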
- Identifiers:
  - User-defined names for variables, functions, arrays, structures, etc.
  - Rules:
    - Must start with a letter (A-Z, a-z) or an underscore (_).
    - After the first character, can contain letters, digits, or underscores.
    - Must not be a keyword.
    - Case-sensitive (e.g., `count` and `Count` are different).
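
The spelling rules for identifiers translate almost directly into code. The following is a minimal sketch, assuming the default "C" locale so that `isalpha` accepts only A-Z and a-z; the function name `is_valid_identifier` is hypothetical. It checks spelling only: rejecting reserved words would need a separate keyword lookup.

```c
#include <ctype.h>

/* Returns 1 if name follows the spelling rules for a C identifier:
   a letter or underscore first, then letters, digits, or underscores.
   It does not check whether name is a reserved keyword. */
int is_valid_identifier(const char *name)
{
    const unsigned char *p = (const unsigned char *)name;

    /* The first character must be a letter or an underscore.
       An empty string fails here because '\0' is neither. */
    if (!(isalpha(*p) || *p == '_')) {
        return 0;
    }
    /* Every remaining character must be a letter, digit, or underscore. */
    for (p++; *p != '\0'; p++) {
        if (!(isalnum(*p) || *p == '_')) {
            return 0;
        }
    }
    return 1;
}
```

For example, `count`, `Count`, and `_tmp1` are accepted, while `2fast` (starts with a digit) and `my-var` (contains a hyphen) are rejected.
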
- Constants:
  - Fixed values that don’t change during program execution.
  - Types:
    - Numeric literals: Integers (e.g., `10`, `0x1F`) and floating-point numbers (e.g., `3.14`, `1.2e-5`). A negative value such as `-5` is two tokens: the unary operator `-` followed by the constant `5`.
    - Character literals: Single characters or escape sequences enclosed in single quotes (e.g., `'A'`, `'\n'`, `'\t'`).
    - String literals: Sequences of characters enclosed in double quotes (e.g., `"Hello, world!"`).
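
The kinds of constants can be told apart from their spelling alone. Here is a rough sketch of that idea; `literal_kind` and `classify_literal` are hypothetical names, and the function deliberately ignores validation, hexadecimal floating constants, and prefixed literals such as `L'a'`.

```c
#include <ctype.h>
#include <string.h>

enum literal_kind {
    LIT_INTEGER, LIT_FLOATING, LIT_CHARACTER, LIT_STRING, LIT_UNKNOWN
};

/* Classifies the text of a single literal by inspecting its spelling.
   Simplifications: no validation, no hexadecimal floating constants,
   and no prefixed literals such as L'a'. */
enum literal_kind classify_literal(const char *text)
{
    if (text[0] == '"')  return LIT_STRING;
    if (text[0] == '\'') return LIT_CHARACTER;

    if (isdigit((unsigned char)text[0]) ||
        (text[0] == '.' && isdigit((unsigned char)text[1]))) {
        /* Hexadecimal integers may contain the digit E, so check the
           0x prefix before looking for an exponent. */
        if (text[0] == '0' && (text[1] == 'x' || text[1] == 'X')) {
            return LIT_INTEGER;
        }
        /* A decimal point or an exponent marks a floating constant. */
        if (strpbrk(text, ".eE") != NULL) {
            return LIT_FLOATING;
        }
        return LIT_INTEGER;
    }
    /* Anything else, including "-5", is not a single literal. */
    return LIT_UNKNOWN;
}
```

Note that `classify_literal("-5")` returns `LIT_UNKNOWN`: in C the minus sign is a separate operator token applied to the constant `5`.
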
- Operators:
  - Symbols that perform operations on values.
  - Examples: `+`, `-`, `*`, `/`, `%`, `&&`, `||`, `==`, `!=`, `<`, `>`, `=`, `++`, `--`, etc.
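
Several operators are spelled with more than one character, and the compiler always forms the longest operator it can from the characters available. The small sketch below illustrates this with `a+++b`, which is read as the tokens `a`, `++`, `+`, `b`. The struct and function names are hypothetical.

```c
struct munch_result {
    int sum;      /* value of the expression a+++b */
    int a_after;  /* value of a after the expression was evaluated */
};

/* Evaluates a+++b and reports both the result and the final value of a.
   Because ++ is recognized before +, the expression means (a++) + b. */
struct munch_result longest_match_demo(int a, int b)
{
    struct munch_result r;
    r.sum = a+++b;   /* tokens: a  ++  +  b */
    r.a_after = a;   /* a was incremented by the postfix ++ */
    return r;
}
```

Calling `longest_match_demo(1, 2)` yields a sum of 3 and leaves `a` at 2. Had the expression been read as `a + (++b)`, the sum would be 4 and `a` would still be 1.
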
- Punctuators:
  - Special symbols used for grouping, separating, and terminating elements in code.
  - Examples: `()`, `{}`, `[]`, `;`, `,`, `.`, `:`, etc.
Tokenization Process:
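- During lexical analysis, an early stage of compilation, the compiler breaks the source code into a stream of tokens. Whitespace and comments separate tokens but are not tokens themselves.
- It classifies each token by type (keyword, identifier, constant, operator, or punctuator).
- The parser then uses this token stream to build a representation of the program’s structure, typically an abstract syntax tree (AST).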
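
To make the first two steps concrete, here is a simplified sketch of a tokenizer that splits a string of C source into tokens and labels each one with the five categories used in this article. It is an illustration under stated assumptions, not how any real compiler is implemented: the names `token_type`, `KEYWORDS`, and `next_token` are hypothetical, the keyword table is only a subset, comments and preprocessor directives are not handled, and any unrecognized character is treated as a one-character operator.

```c
#include <ctype.h>
#include <string.h>

enum token_type {
    TOK_KEYWORD, TOK_IDENTIFIER, TOK_CONSTANT,
    TOK_OPERATOR, TOK_PUNCTUATOR, TOK_END
};

/* Subset of the C keywords, enough for the demonstration. */
static const char *const KEYWORDS[] = {
    "int", "float", "char", "if", "else", "while", "for", "return", "void"
};

static int is_keyword_text(const char *s)
{
    for (size_t i = 0; i < sizeof KEYWORDS / sizeof KEYWORDS[0]; i++) {
        if (strcmp(s, KEYWORDS[i]) == 0) return 1;
    }
    return 0;
}

/* Reads one token starting at *src, copies its text into buf (capacity cap,
   which must be at least 1; longer tokens are truncated), advances *src
   past the token, and returns its type. */
enum token_type next_token(const char **src, char *buf, size_t cap)
{
    const char *p = *src;
    const char *start;
    size_t len;
    enum token_type type;

    while (isspace((unsigned char)*p)) p++;   /* whitespace separates tokens */
    start = p;

    if (*p == '\0') {
        type = TOK_END;
    } else if (isalpha((unsigned char)*p) || *p == '_') {
        while (isalnum((unsigned char)*p) || *p == '_') p++;
        type = TOK_IDENTIFIER;                /* may become a keyword below */
    } else if (isdigit((unsigned char)*p)) {
        /* Simplified numeric constant: digits, letters, dots, and a
           signed exponent such as e-5. */
        while (isalnum((unsigned char)*p) || *p == '.') {
            if ((*p == 'e' || *p == 'E') && (p[1] == '+' || p[1] == '-')) p++;
            p++;
        }
        type = TOK_CONSTANT;
    } else if (*p == '"' || *p == '\'') {
        char quote = *p++;
        while (*p != '\0' && *p != quote) {
            if (*p == '\\' && p[1] != '\0') p++;  /* skip escaped character */
            p++;
        }
        if (*p == quote) p++;                 /* consume the closing quote */
        type = TOK_CONSTANT;
    } else if (strchr("(){}[];,.:", *p) != NULL) {
        p++;
        type = TOK_PUNCTUATOR;
    } else {
        /* Operator: prefer a two-character operator when one matches. */
        static const char *const two_char[] = {
            "&&", "||", "==", "!=", "++", "--", "<=", ">="
        };
        int matched = 0;
        for (size_t i = 0; i < sizeof two_char / sizeof two_char[0]; i++) {
            if (strncmp(p, two_char[i], 2) == 0) { matched = 1; break; }
        }
        p += matched ? 2 : 1;
        type = TOK_OPERATOR;
    }

    len = (size_t)(p - start);
    if (len >= cap) len = cap - 1;
    memcpy(buf, start, len);
    buf[len] = '\0';

    if (type == TOK_IDENTIFIER && is_keyword_text(buf)) type = TOK_KEYWORD;
    *src = p;
    return type;
}
```

Calling `next_token` repeatedly on `"int count = 10;"` produces the keyword `int`, the identifier `count`, the operator `=`, the constant `10`, the punctuator `;`, and finally `TOK_END`. A parser would consume this sequence to build the tree described in the last step.
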
Importance of Tokens:
- Tokens form the basic vocabulary of C programming.
- The grammar of C is written in terms of tokens, so every syntax rule describes a valid sequence of them.
- Understanding tokens is essential for writing syntactically correct C code and for comprehending how the compiler interprets code.