Tokens in C - GeeksforGeeks (2024)

Last Updated : 28 Aug, 2024

A token in C can be defined as the smallest individual element of the C programming language that is meaningful to the compiler. It is the basic component of a C program.

Types of Tokens in C

The tokens of C language can be classified into six types based on the functions they are used to perform. The types of C tokens are as follows:

  1. Keywords
  2. Identifiers
  3. Constants
  4. Strings
  5. Special Symbols
  6. Operators
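As a rough illustration of the six categories, the short function below is annotated with the category of each token it contains. The function name scaled_sum and everything inside it are made up for this sketch.

```c
#include <stdio.h>

/* Hypothetical example: each comment names the token categories on its line. */
int scaled_sum(int a, int b)      /* int: keyword; scaled_sum, a, b: identifiers */
{                                 /* { } ( ) , ; : special symbols */
    const char *label = "sum";    /* const, char: keywords; "sum": string */
    int total = (a + b) * 2;      /* =, +, *: operators; 2: constant */

    printf("%s = %d\n", label, total);
    return total;                 /* return: keyword */
}
```

Calling scaled_sum(3, 4) prints "sum = 14" and returns 14.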

1. C Token – Keywords

Keywords are predefined, reserved words in a programming language. Each keyword has a specific meaning to the compiler. Because keywords are reserved names, they cannot be used as variable names: doing so would try to assign a new meaning to the keyword, which is not allowed. You cannot redefine keywords. However, you can use C preprocessor directives to specify text to be substituted for keywords before compilation. ANSI C supports 32 keywords, which are given below:

auto double int struct
break else long switch
case enum register typedef
char extern return union
const float short unsigned
continue for signed void
default goto sizeof volatile
do if static while

Note: The number of keywords depends on the version of C you are using. ANSI C (C89) has 32 keywords, C99 raised the count to 37, and C11 to 44. The latest standard, C23, adds several more, bringing the total to more than 50.

2. C Token – Identifiers

Identifiers are the names given to variables, functions, arrays, and other user-defined items. They consist of an arbitrarily long sequence of letters, digits, and underscores, with either a letter or the underscore (_) as the first character. Identifier names must differ in spelling and case from any keyword; keywords are reserved for special use and cannot be used as identifiers. Once an identifier is declared, you can use it in later program statements to refer to the associated value. A special kind of identifier, called a statement label, can be used in goto statements.

Rules for Naming Identifiers

Certain rules must be followed when naming C identifiers:

  • They must begin with a letter or underscore (_).
  • They must consist of only letters, digits, or underscores. No other special character is allowed.
  • They must not be keywords.
  • They must not contain white space.
  • They may be longer than 31 characters, but only a limited number of leading characters are guaranteed to be significant: 31 in ANSI C (C89) and 63 in C99 and later for internal identifiers.

Note: Identifiers are case-sensitive so names like variable and Variable will be treated as different.

For example,

  • main: function name.
  • a: variable name.
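The naming rules above can be sketched as a small checker. The helper name is_valid_identifier and the c89_keywords table are assumptions made for this illustration; the check uses the 32 ANSI C keywords listed earlier and ignores the length limit.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* The 32 keywords of ANSI C, as listed in this article. */
static const char *const c89_keywords[] = {
    "auto", "break", "case", "char", "const", "continue", "default", "do",
    "double", "else", "enum", "extern", "float", "for", "goto", "if",
    "int", "long", "register", "return", "short", "signed", "sizeof", "static",
    "struct", "switch", "typedef", "union", "unsigned", "void", "volatile", "while"
};

/* Hypothetical helper: returns 1 if name obeys the naming rules, else 0. */
int is_valid_identifier(const char *name)
{
    size_t i;

    assert(name != NULL);

    /* Must begin with a letter or an underscore (also rejects ""). */
    if (!(isalpha((unsigned char)name[0]) || name[0] == '_'))
        return 0;

    /* Only letters, digits, and underscores may follow (so no white space). */
    for (i = 1; name[i] != '\0'; i++)
        if (!(isalnum((unsigned char)name[i]) || name[i] == '_'))
            return 0;

    /* Must not be a keyword; the comparison is case-sensitive. */
    for (i = 0; i < sizeof c89_keywords / sizeof c89_keywords[0]; i++)
        if (strcmp(name, c89_keywords[i]) == 0)
            return 0;

    return 1;
}
```

With this sketch, "_count1" and "While" are accepted, while "2fast", "my var", and "while" are rejected. The second and last cases show case sensitivity: only the exact lowercase spelling is a keyword.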

3. C Token – Constants

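Constants are fixed values that do not change while the program runs. As tokens, they appear in source code as literal values such as 20, 3.14, or 'a'. A variable declared with the const qualifier behaves like a normal variable, except that its value cannot be modified once it is defined.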

Constants may belong to any of the data types.

Examples of Constants in C

const int c_var = 20;
const int* const ptr = &c_var;
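To make the distinction concrete, the following sketch places a few literal constants next to a const-qualified variable. The function name sum_of_constants and the chosen values are illustrative only.

```c
/* Hypothetical sketch: literal constants versus a const-qualified variable. */
int sum_of_constants(void)
{
    const int c_var = 20;   /* const-qualified variable, as in the example above */
    int dec = 20;           /* decimal integer constant */
    int oct = 024;          /* octal integer constant (leading 0), also 20 */
    int hex = 0x14;         /* hexadecimal integer constant (leading 0x), also 20 */
    char ch = 'A';          /* character constant */
    double pi = 3.14;       /* floating constant */

    /* c_var = 30;  would not compile: c_var is read-only */
    (void)ch;               /* silence unused-variable warnings */
    (void)pi;
    return c_var + dec + oct + hex;   /* 20 + 20 + 20 + 20 */
}
```

The three integer spellings denote the same value, so the function returns 80.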

4. C Token – Strings

A string is an array of characters terminated by a null character ('\0'), which marks the end of the string. String literals are always enclosed in double quotes, whereas a character constant is enclosed in single quotes in both C and C++.

Examples of String

char string[20] = {'g', 'e', 'e', 'k', 's', 'f', 'o', 'r', 'g', 'e', 'e', 'k', 's', '\0'};
char string[20] = "geeksforgeeks";
char string[] = "geeksforgeeks";
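The three declarations above are alternatives that store the same text. A minimal sketch of how they compare, using the made-up names a, b, c, and same_text so that all three can coexist in one scope:

```c
#include <string.h>

/* Hypothetical helper: returns 1 if the three forms hold the same text
 * and have the sizes noted in the comments. */
int same_text(void)
{
    char a[20] = {'g', 'e', 'e', 'k', 's', 'f', 'o', 'r',
                  'g', 'e', 'e', 'k', 's', '\0'};
    char b[20] = "geeksforgeeks";   /* 20 bytes reserved, rest zero-filled */
    char c[]   = "geeksforgeeks";   /* size deduced: 13 characters + '\0' */

    return strcmp(a, b) == 0 && strcmp(b, c) == 0 &&
           strlen(c) == 13 && sizeof c == 14 && sizeof b == 20;
}
```

strlen counts characters up to, but not including, the terminating null character, whereas sizeof reports the full size of the array.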

5. C Token – Special Symbols

The following special symbols have a fixed meaning in C and therefore cannot be used for any other purpose. Some of them are listed below:

  • Brackets []: Opening and closing brackets are used as array element references. These indicate single and multidimensional subscripts.
  • Parentheses (): These special symbols are used to indicate function calls and function parameters, and to group expressions.
  • Braces {}: Opening and closing curly braces mark the start and end of a block of code, which may contain any number of statements.
  • Comma (,): It is used to separate items in a list, such as the arguments in a function call or the variables in a declaration.
  • Colon (:): It ends a statement label or a case label in a switch, separates the two branches of the conditional operator (?:), and introduces the width of a bit-field. (Initialization lists introduced by a colon belong to C++, not C.)
  • Semicolon (;): It is known as a statement terminator. It indicates the end of one logical entity. That is why each individual statement must end with a semicolon.
  • Asterisk (*): It is used to declare a pointer variable, to dereference a pointer, and to multiply values.
  • Assignment operator (=): It is used to assign values to variables and to initialize them in declarations. It should not be confused with the equality operator (==), which compares two values.
  • Pre-processor (#): It introduces a preprocessor directive such as #include or #define. The preprocessor is a macro processor that the compiler runs automatically to transform your program before actual compilation.
  • Period (.): Used to access members of a structure or union.
  • Tilde (~): Bitwise one's complement operator.
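The sketch below uses most of these symbols in one place. The structure flags and the function sum_until_negative are invented for illustration, and the ~ result assumes the usual two's complement representation of int.

```c
/* Hypothetical structure: the colon introduces each bit-field width. */
struct flags {
    unsigned int ready : 1;
    unsigned int mode  : 3;
};

/* Hypothetical function: sums values[] until a negative entry is found. */
int sum_until_negative(const int values[], int count)  /* [] and () */
{                                                       /* { opens the block */
    struct flags f = {0, 0};     /* = initializes; , separates the items */
    const int *p = values;       /* * declares a pointer */
    int total = 0;
    int i;

    for (i = 0; i < count; i++) {
        if (values[i] < 0)
            goto done;           /* jump to the label below */
        total += *(p + i);       /* * dereferences a pointer */
    }
    f.ready = 1;                 /* . accesses a structure member */
done:                            /* : ends a statement label */
    return f.ready ? total : ~total;   /* ~ flips every bit of total */
}
```

For the array {1, 2, 3} the function returns 6. For {1, 2, -1, 5} it stops at the negative entry with a partial sum of 3 and returns ~3, which is -4.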

6. C Token – Operators

Operators are symbols that trigger an action when applied to C variables and other objects. The data items on which operators act are called operands.
Depending on the number of operands that an operator can act upon, operators can be classified as follows:

  • Unary Operators: Operators that require only a single operand are known as unary operators, for example the increment and decrement operators.
  • Binary Operators: Operators that require two operands are called binary operators. Binary operators can be further classified into:
    1. Arithmetic Operators
    2. Relational Operators
    3. Logical Operators
    4. Assignment Operators
    5. Bitwise Operators
  • Ternary Operator: The operator that requires three operands is called the ternary operator. The conditional operator (?:) is the only ternary operator in C.
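A small sketch with one operator from each class described above. The function name classify_operators and its arithmetic are arbitrary choices for illustration.

```c
/* Hypothetical function: one operator from each class in the list above. */
int classify_operators(int a, int b)
{
    int result;

    a++;                          /* unary: increment */
    result = a + b;               /* binary: arithmetic (+), assignment (=) */
    result = (a > b) && (b != 0)  /* binary: relational (>, !=), logical (&&) */
                 ? result & 0xF   /* binary: bitwise AND */
                 : -result;       /* unary: minus */
    return result;                /* ?: above is the ternary operator */
}
```

For example, classify_operators(5, 3) increments a to 6, computes 9, and keeps the low four bits, giving 9. classify_operators(1, 5) takes the other branch and returns -7.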

This article is contributed by I.HARISH KUMAR.



FAQs

How many tokens are available in C? ›

C Tokens are of 6 types, and they are classified as: Identifiers, Keywords, Constants, Operators, Special Characters and Strings.

What is tokenization in C GFG? ›

In C, tokenization is the process of breaking the string into smaller parts using delimiters (characters treated as separators) like space, commas, a specific character, or even a string. Those smaller parts are called tokens where each token is a substring of the original string separated by the delimiter.

What is tokenization in compiler design code in C? ›

In C, tokens are the smallest meaningful elements used to create a program. They include keywords, identifiers, constants, string literals, operators, punctuation marks, and special symbols.

What are tokens used for in C? ›

Tokens in C are the smallest individual units or components of a C program. These units include identifiers, keywords, constants, operators, and punctuation symbols. The C compiler reads the source code and breaks it down into these tokens, which it uses to understand the program's structure and functionality.

What are C ++ tokens? ›

A token is the smallest element of a C++ program that is meaningful to the compiler. The C++ parser recognizes these kinds of tokens: Keywords. Identifiers. Numeric, Boolean and Pointer Literals.


How to find tokens in C program? ›

Finding tokens is the job of the lexical analysis phase of the compiler. The lexical analyzer is the part of the compiler that detects the tokens of the program and sends them to the syntax analyzer. A token is the smallest entity of the code: a keyword, identifier, constant, string literal, or symbol.

How do you break a string into tokens in C? ›

In C, the strtok() function is used to split a string into a series of tokens based on a particular delimiter. A token is a substring extracted from the original string.
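A minimal sketch of the usual strtok() loop follows. The wrapper name split_tokens and its parameters are assumptions for this example; only strtok() itself comes from the standard library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: splits text in place on any character in delims and
 * stores up to max_tokens pointers in tokens. Returns the number stored.
 * strtok() modifies text by writing '\0' over delimiters, so text must be
 * a writable array, not a string literal. */
size_t split_tokens(char *text, const char *delims,
                    char *tokens[], size_t max_tokens)
{
    size_t count = 0;
    char *tok;

    assert(text != NULL && delims != NULL && tokens != NULL);

    /* The first call passes the string; later calls pass NULL to continue. */
    for (tok = strtok(text, delims);
         tok != NULL && count < max_tokens;
         tok = strtok(NULL, delims))
        tokens[count++] = tok;

    return count;
}
```

Given a writable buffer holding "red, green,blue" and the delimiter set ", ", this yields the three tokens "red", "green", and "blue". Runs of consecutive delimiters are treated as a single separator.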


What are special symbols in C tokens? ›

Special symbols include brackets [], used for array element references (single or multidimensional subscripts); parentheses (), which indicate function calls and function parameters; and braces {}, which mark the start and end of a code block.

How to tokenize input in C? ›

To fully tokenize a string, pass the string you want to tokenize as str on your first call to strtok, and pass NULL on all subsequent calls. Because strtok keeps its position in internal state, a token cannot itself be split with strtok while the outer tokenization is still in progress. For nested tokenization, use strtok_r (a POSIX function).

How many types of tokens are there in compiler design? ›

We can also say that tokens are the basic building blocks of the C language, which are combined to write a C program. Tokens are of 6 types. Keywords are predefined words known to the C compiler. Each keyword is meant to perform a specific function in a C program.

How to tokenize a text file in C? ›

C has a library function for tokenizing strings. The function char *strtok(char *str, const char *delim), declared in <string.h>, breaks the string str into a series of tokens using the delimiter characters in delim. To tokenize a text file, read it line by line into a buffer and apply strtok to each line.

How many tokens are there in a programming language? ›

There are 6 types of tokens in C: Identifiers, Keywords, Operators, Strings, Special Characters, and Constants.

How many tokens will be generated from the following C statement? ›

The number of tokens in the C statement printf("i = %d, & i = %x", i, & i); is 10: printf, (, the string literal, two commas, two occurrences of i, &, ), and ;.
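The count can be checked with a deliberately simplified token counter. This is a sketch under stated assumptions, not how a real C lexer works: the name count_tokens is made up, every symbol counts as a one-character token (so operators such as == or ++ would be miscounted), and comments and preprocessor lines are not handled.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical, simplified token counter (see the limitations above). */
size_t count_tokens(const char *src)
{
    size_t count = 0;
    const char *p;

    assert(src != NULL);
    p = src;

    while (*p != '\0') {
        unsigned char c = (unsigned char)*p;

        if (isspace(c)) {                       /* white space separates tokens */
            p++;
            continue;
        }
        if (isalpha(c) || c == '_') {           /* identifier or keyword */
            while (isalnum((unsigned char)*p) || *p == '_')
                p++;
        } else if (isdigit(c)) {                /* numeric constant (simplified) */
            while (isalnum((unsigned char)*p) || *p == '.')
                p++;
        } else if (c == '"' || c == '\'') {     /* string or character literal */
            char quote = *p++;
            while (*p != '\0' && *p != quote) {
                if (*p == '\\' && p[1] != '\0') /* skip an escaped character */
                    p++;
                p++;
            }
            if (*p == quote)                    /* consume the closing quote */
                p++;
        } else {                                /* any other single symbol */
            p++;
        }
        count++;
    }
    return count;
}
```

Passing the printf statement above to count_tokens returns 10, and "int x = 42;" gives 5.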

How many keywords are there in C? ›

ANSI C (C89) defines a set of 32 keywords, listed in the keywords section above; C99 added five more. All of these keywords have specific meanings and are used to define control structures, data types, function declarations, and other fundamental elements in a C program.

How many types of tokens are there? ›

A token is defined as the smallest individual unit present in the program. The C compiler scans the source code to generate tokens. Some references list five types of tokens instead of six by treating strings as a kind of constant: Keywords, Identifiers, Operators, Special symbols, and Constants.
