Introduction — The Linux Kernel documentation (2024)


Lecture objectives:

  • Basic operating systems terms and concepts
  • Overview of the Linux kernel

Basic operating systems terms and concepts

User vs Kernel

Kernel and user are two terms that are often used in operating systems. Their definition is pretty straightforward: the kernel is the part of the operating system that runs with higher privileges, while user (space) usually refers to applications running with low privileges.

However, these terms are heavily overloaded and might have very specific meanings in some contexts.

User mode and kernel mode are terms that may refer specifically to the processor execution mode. Code that runs in kernel mode can fully [1] control the CPU, while code that runs in user mode has certain limitations. For example, local CPU interrupts can only be disabled or enabled while running in kernel mode. If such an operation is attempted while running in user mode, an exception will be generated and the kernel will take over to handle it.

[1] Some processors may have even higher privilege levels than kernel mode, e.g. a hypervisor mode that is only accessible to code running in a hypervisor (virtual machine monitor).

User space and kernel space may refer specifically to memory protection or to the virtual address spaces associated with either the kernel or user applications.

Grossly simplifying, kernel space is the memory area that is reserved for the kernel, while user space is the memory area reserved for a particular user process. Kernel space is access protected so that user applications cannot access it directly, while user space can be directly accessed from code running in kernel mode.

Typical operating system architecture

In the typical operating system architecture (see the figure below) the operating system kernel is responsible for accessing and sharing the hardware in a secure and fair manner among multiple applications.

[Figure: typical operating system architecture]

The kernel offers a set of APIs that applications invoke, generally referred to as "System Calls". These APIs differ from regular library APIs because they are the boundary at which the execution mode switches from user mode to kernel mode.
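As a minimal sketch of this boundary, assuming a Linux system with glibc: the getpid() library wrapper and a raw syscall(SYS_getpid) both end up requesting the same kernel service. The two helper names below are made up for this illustration.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: ask the kernel for our PID without the libc wrapper.
 * syscall() passes the system call number to the kernel and performs the
 * switch from user mode to kernel mode. */
pid_t pid_via_raw_syscall(void)
{
    return (pid_t)syscall(SYS_getpid);
}

/* Hypothetical helper: the same request through the regular libc wrapper. */
pid_t pid_via_libc(void)
{
    return getpid();
}
```

Both helpers should report the same process ID; only the path taken inside user space differs.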

In order to provide application compatibility, system calls are rarely changed. Linux particularly enforces this (as opposed to in-kernel APIs, which can change as needed).

The kernel code itself can be logically separated into core kernel code and device driver code. Device driver code is responsible for accessing particular devices, while the core kernel code is generic. The core kernel can be further divided into multiple logical subsystems (e.g. file access, networking, process management, etc.)

Monolithic kernel

A monolithic kernel is one where there is no access protection between the various kernel subsystems and where public functions can be directly called between various subsystems.

However, most monolithic kernels do enforce a logical separation between subsystems, especially between the core kernel and device drivers, with relatively strict APIs (but not necessarily set in stone) that must be used to access the services offered by a subsystem or by device drivers. This, of course, depends on the particular kernel implementation and the kernel's architecture.

Micro kernel

A micro-kernel is one where large parts of the kernel are protected from each other, usually running as services in user space. Because significant parts of the kernel now run in user mode, the remaining code that runs in kernel mode is significantly smaller, hence the term micro-kernel.

In a micro-kernel architecture the kernel contains just enough code to allow message passing between different running processes. Practically, that means implementing the scheduler and an IPC mechanism in the kernel, as well as basic memory management to set up the protection between applications and services.

One of the advantages of this architecture is that the services are isolated and hence bugs in one service won't impact other services.

As such, if a service crashes we can just restart it without affecting the whole system. However, in practice this is difficult to achieve, since restarting a service may affect all applications that depend on that service (e.g. if the file server crashes, all applications with open file descriptors would encounter errors when accessing them).

This architecture imposes a modular approach on the kernel and offers memory protection between services, but at a cost in performance. What is a simple function call between two services on monolithic kernels now requires going through IPC and scheduling, which incurs a performance penalty [2].

[2] https://lwn.net/Articles/220255/

Micro-kernels vs monolithic kernels

Advocates of micro-kernels often suggest that micro-kernels are superior because of the modular design a micro-kernel enforces. However, monolithic kernels can also be modular, and there are several approaches that modern monolithic kernels use toward this goal:

  • Components can be enabled or disabled at compile time
  • Support for loadable kernel modules (at runtime)
  • Organizing the kernel in logical, independent subsystems
  • Strict interfaces but with low performance overhead: macros, inline functions, function pointers
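The last item can be illustrated with a small user-space sketch. It is not taken from the Linux sources: struct blk_ops, struct ramdisk and blk_read() are hypothetical names, chosen only to show how a table of function pointers plus an inline helper gives a strict interface at roughly the cost of one indirect call.

```c
#include <stddef.h>
#include <string.h>

#define SECTOR_SIZE 8   /* toy values, defined once through macros */
#define NUM_SECTORS 4

/* Hypothetical interface that a "subsystem" expects every driver to fill in. */
struct blk_ops {
    int (*read_sector)(void *ctx, size_t sector, unsigned char *out);
};

/* Toy RAM-backed "device". */
struct ramdisk {
    unsigned char data[NUM_SECTORS][SECTOR_SIZE];
};

/* Driver-side implementation of the interface. */
int ramdisk_read_sector(void *ctx, size_t sector, unsigned char *out)
{
    struct ramdisk *disk = ctx;

    if (sector >= NUM_SECTORS)
        return -1;                      /* out of range */
    memcpy(out, disk->data[sector], SECTOR_SIZE);
    return 0;
}

const struct blk_ops ramdisk_ops = {
    .read_sector = ramdisk_read_sector,
};

/* Subsystem-side entry point: callers never touch the driver directly. */
static inline int blk_read(const struct blk_ops *ops, void *ctx,
                           size_t sector, unsigned char *out)
{
    return ops->read_sector(ctx, sector, out);
}
```

A caller only needs the ops table and an opaque context pointer, so another driver can be substituted without changing the calling code.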

There is a class of operating systems that (used to) claim to be hybrid kernels, in between monolithic and micro-kernels (e.g. Windows, Mac OS X). However, since all of the typical monolithic services run in kernel mode in these operating systems, there is little merit in qualifying them as anything other than monolithic kernels.

Many operating systems and kernel experts have dismissed the label as meaningless, and just marketing. Linus Torvalds said of this issue:

"As to the whole 'hybrid kernel' thing - it's just marketing. It's 'oh, those microkernels had good PR, how can we try to get good PR for our working kernel? Oh, I know, let's use a cool name and try to imply that it has all the PR advantages that that other system has'."

Address space

The term address space is an overloaded term that can have different meanings in different contexts.

The physical address space refers to the way the RAM and device memories are visible on the memory bus. For example, on 32-bit Intel architecture, it is common to have the RAM mapped into the lower physical address space while the graphics card memory is mapped high in the physical address space.

The virtual address space (or sometimes just address space) refers to the way the CPU sees the memory when the virtual memory module is activated (sometimes called protected mode or paging enabled). The kernel is responsible for setting up a mapping that creates a virtual address space in which areas of this space are mapped to certain physical memory areas.

Related to the virtual address space there are two other terms that are often used: process (address) space and kernel (address) space.

The process space is (part of) the virtual address space associated with a process. It is the "memory view" of processes. It is a contiguous area that starts at zero. Where the process's address space ends depends on the implementation and architecture.

The kernel space is the "memory view" of the code that runs in kernel mode.

User and kernel sharing the virtual address space

A typical implementation for user and kernel spaces is one where the virtual address space is shared between user processes and the kernel.

In this case kernel space is located at the top of the address space, while user space is at the bottom. In order to prevent user processes from accessing kernel space, the kernel creates mappings that prevent access to the kernel space from user mode.
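The arithmetic of such a split can be sketched as follows. The boundary 0xC0000000 (user space in the lower 3 GiB, kernel space in the top 1 GiB of a 32-bit address space) is an assumption made for this example, and the function names are hypothetical. The actual protection is enforced by the page mappings, not by a check like this one.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption for this sketch: 32-bit virtual addresses with the kernel
 * occupying everything from 0xC0000000 upwards. */
#define KERNEL_SPACE_START UINT32_C(0xC0000000)

/* True if a single address lies in the user part of the address space. */
bool is_user_address(uint32_t addr)
{
    return addr < KERNEL_SPACE_START;
}

/* True if the whole range [addr, addr + len) lies in user space.
 * Written without computing addr + len so that it cannot wrap around. */
bool range_is_user(uint32_t addr, uint32_t len)
{
    return len <= KERNEL_SPACE_START && addr <= KERNEL_SPACE_START - len;
}
```

A range check of this kind is what a kernel would typically perform on a pointer received from user space before using it.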

Execution contexts

One of the most important jobs of the kernel is to service interrupts and to service them efficiently. This is so important that a special execution context is associated with it.

The kernel executes in interrupt context when it runs as a result of an interrupt. This includes the interrupt handler, but it is not limited to it; there are other special (software) constructs that run in interrupt mode.

Code running in interrupt context always runs in kernel mode, and there are certain limitations that the kernel programmer has to be aware of (e.g. not calling blocking functions or accessing user space).

Opposed to interrupt context there is process context. Code that runs in process context can do so in user mode (executing application code) or in kernel mode (executing a system call).

Multi-tasking

Multitasking is the ability of the operating system to "simultaneously" execute multiple programs. It does so by quickly switching between running processes.

Cooperative multitasking requires the programs to cooperate to achieve multitasking. A program will run and then voluntarily relinquish CPU control back to the OS, which will then schedule another program.

With preemptive multitasking the kernel will enforce strict limits for each process, so that all processes have a fair chance of running. Each process is allowed to run for a time slice (e.g. 100ms) after which, if it is still running, it is forcefully preempted and another task is scheduled.
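The time slice mechanism can be sketched as a toy round-robin simulation. This is an illustration only, not how the Linux scheduler works: the function name is hypothetical, each task is reduced to the amount of CPU time it still needs, and every expired slice is counted as a forced preemption.

```c
#include <stddef.h>

/* remaining[i] is the CPU time (in ms) task i still needs.
 * Tasks are visited in order; each runs for at most slice_ms before the
 * next one gets the CPU. Returns how many times a task was forced off
 * the CPU because its slice expired. */
unsigned simulate_round_robin(unsigned remaining[], size_t ntasks,
                              unsigned slice_ms)
{
    unsigned preemptions = 0;
    int work_left = 1;

    if (slice_ms == 0)
        return 0;       /* a zero-length slice would never make progress */

    while (work_left) {
        work_left = 0;
        for (size_t i = 0; i < ntasks; i++) {
            if (remaining[i] == 0)
                continue;                   /* task already finished */
            if (remaining[i] > slice_ms) {
                remaining[i] -= slice_ms;   /* slice expired: preempted */
                preemptions++;
                work_left = 1;
            } else {
                remaining[i] = 0;           /* finished within its slice */
            }
        }
    }
    return preemptions;
}
```

With tasks needing 250, 100 and 30 ms and a 100 ms slice, only the first task outlives its slice, and it does so twice before finishing.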

Preemptive kernel

Preemptive multitasking and preemptive kernels are different terms.

A kernel is preemptive if a process can be preempted while running in kernel mode.

However, note that non-preemptive kernels may support preemptive multitasking.

Pageable kernel memory

A kernel supports pageable kernel memory if parts of kernel memory (code, data, stack or dynamically allocated memory) can be swapped to disk.

Kernel stack

Each process has a kernel stack that is used to maintain the function call chain and the state of local variables while it is executing in kernel mode, as a result of a system call.

The kernel stack is small (4 KB - 12 KB), so the kernel developer has to avoid allocating large structures on the stack and making recursive calls that are not properly bounded.

Portability

In order to increase portability across various architectures and hardware configurations, modern kernels are organized as follows at the top level:

  • Architecture and machine specific code (C & ASM)
  • Architecture independent code (C):
    • kernel core (further split in multiple subsystems)
    • device drivers

This makes it easier to reuse code as much as possible between different architectures and machine configurations.

Asymmetric MultiProcessing (ASMP)

Asymmetric MultiProcessing (ASMP) is a way of supporting multiple processors (cores) by a kernel, where one processor is dedicated to the kernel and all other processors run user space programs.

The disadvantage of this approach is that the kernel throughput (e.g. system calls, interrupt handling, etc.) does not scale with the number of processors. Since typical processes frequently use system calls, the scalability of the approach is limited to very specific systems (e.g. scientific applications).

Symmetric MultiProcessing (SMP)

As opposed to ASMP, in SMP mode the kernel can run on any of the existing processors, just like user processes. This approach is more difficult to implement, because it creates race conditions in the kernel if two processes run kernel functions that access the same memory locations.

In order to support SMP the kernel must implement synchronization primitives (e.g. spin locks) to guarantee that only one processor at a time is executing a critical section.
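The idea behind a spin lock can be sketched in user space with C11 atomics and POSIX threads. This is a toy illustration, not the kernel's implementation: struct toy_spinlock and the other names are hypothetical, and threads stand in for processors.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define MAX_WORKERS 16

/* Toy spin lock: a single flag that is set while the lock is held. */
struct toy_spinlock {
    atomic_flag locked;
};

static struct toy_spinlock counter_lock = { ATOMIC_FLAG_INIT };
static long shared_counter;     /* protected by counter_lock */

void toy_spin_lock(struct toy_spinlock *l)
{
    /* Busy-wait until the flag was clear and we are the one who set it. */
    while (atomic_flag_test_and_set_explicit(&l->locked, memory_order_acquire))
        ;
}

void toy_spin_unlock(struct toy_spinlock *l)
{
    atomic_flag_clear_explicit(&l->locked, memory_order_release);
}

static void *worker(void *arg)
{
    long iterations = *(long *)arg;

    for (long i = 0; i < iterations; i++) {
        toy_spin_lock(&counter_lock);
        shared_counter++;               /* critical section */
        toy_spin_unlock(&counter_lock);
    }
    return NULL;
}

/* Run up to MAX_WORKERS threads that all increment the shared counter,
 * then return the final value of the counter. */
long run_workers(int nthreads, long iterations)
{
    pthread_t tids[MAX_WORKERS];
    int started = 0;

    if (nthreads > MAX_WORKERS)
        nthreads = MAX_WORKERS;
    shared_counter = 0;
    for (int i = 0; i < nthreads; i++) {
        if (pthread_create(&tids[started], NULL, worker, &iterations) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    return shared_counter;
}
```

Without the lock the increments from different threads could interleave and updates would be lost; with it, the final count equals the number of threads times the number of iterations.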

CPU Scalability

CPU scalability refers to how well the performance scales with the number of cores. There are a few things that the kernel developer should keep in mind with regard to CPU scalability:

  • Use lock free algorithms when possible
  • Use fine grained locking for high contention areas
  • Pay attention to algorithm complexity

Overview of the Linux kernel

Linux development model

The Linux kernel is one of the largest open source projects in the world, with thousands of developers contributing code and millions of lines of code changed for each release.

It is distributed under the GPLv2 license, which, simply put, requires that any modification of the kernel made in software that is shipped to customers be made available to them (the customers), although in practice most companies make the source code publicly available.

There are many companies (often competing) that contribute code to the Linux kernel, as well as people from academia and independent developers.

The current development model is based on doing releases at fixed intervals of time (usually 3 - 4 months). New features are merged into the kernel during a one or two week merge window. After the merge window, a release candidate is published on a weekly basis (rc1, rc2, etc.)

Maintainer hierarchy

In order to scale the development process, Linux uses a hierarchical maintainership model:

  • Linus Torvalds is the maintainer of the Linux kernel and merges pull requests from subsystem maintainers
  • Each subsystem has one or more maintainers that accept patches or pull requests from developers or device driver maintainers
  • Each maintainer has their own git tree, e.g.:
    • Linus Torvalds: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
    • David Miller (networking): git://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/
  • Each subsystem may maintain a -next tree where developers can submit patches for the next merge window

Since the merge window is at most two weeks long, most maintainers have a -next tree where they accept new features from developers or downstream maintainers even when the merge window is closed.

Note that bug fixes are accepted into the maintainer's tree even outside the merge window, from where they are pulled by the upstream maintainer regularly, for every release candidate.

Linux source code layout

These are the top-level folders of the Linux source code:

  • arch - contains architecture specific code; each architecture is implemented in a specific sub-folder (e.g. arm, arm64, x86)
  • block - contains the block subsystem code that deals with reading and writing data from block devices: creating block I/O requests, scheduling them (there are several I/O schedulers available), merging requests, and passing them down through the I/O stack to the block device drivers
  • certs - implements support for signature checking using certificates
  • crypto - software implementation of various cryptography algorithms as well as a framework that allows offloading such algorithms to hardware
  • Documentation - documentation for various subsystems, Linux kernel command line options, description of sysfs files and format, device tree bindings (supported device tree nodes and format)
  • drivers - drivers for various devices as well as the Linux driver model implementation (an abstraction that describes drivers, devices, buses and the way they are connected)
  • firmware - binary or hex firmware files that are used by various device drivers
  • fs - home of the Virtual Filesystem Switch (generic filesystem code) and of various filesystem drivers
  • include - header files
  • init - the generic (as opposed to architecture specific) initialization code that runs during boot
  • ipc - implementation of various Inter Process Communication system calls such as message queues, semaphores, shared memory
  • kernel - process management code (including support for kernel threads, workqueues), scheduler, tracing, time management, generic irq code, locking
  • lib - various generic functions such as sorting, checksums, compression and decompression, bitmap manipulation, etc.
  • mm - memory management code, for both physical and virtual memory, including the page, SL*B and CMA allocators, swapping, virtual memory mapping, process address space manipulation, etc.
  • net - implementation of various network stacks including IPv4 and IPv6; BSD socket implementation, routing, filtering, packet scheduling, bridging, etc.
  • samples - various driver samples
  • scripts - parts of the build system, scripts used for building modules, kconfig (the Linux kernel configurator), as well as various other scripts (e.g. checkpatch.pl, which checks whether a patch conforms to the Linux kernel coding style)
  • security - home of the Linux Security Module framework that allows extending the default (Unix) security model, as well as implementations of multiple such extensions such as SELinux, smack, apparmor, tomoyo, etc.
  • sound - home of ALSA (Advanced Linux Sound Architecture) as well as the old Linux sound framework (OSS)
  • tools - various user space tools for testing or interacting with Linux kernel subsystems
  • usr - support for embedding an initrd file in the kernel image
  • virt - home of the KVM (Kernel Virtual Machine) hypervisor

Linux kernel architecture

arch

  • Architecture specific code
  • May be further sub-divided in machine specific code
  • Interfacing with the boot loader and architecture specific initialization
  • Access to various hardware bits that are architecture or machine specific such as interrupt controller, SMP controllers, BUS controllers, exceptions and interrupt setup, virtual memory handling
  • Architecture optimized functions (e.g. memcpy, string operations, etc.)

This part of the Linux kernel contains architecture specific code and may be further sub-divided in machine specific code for certain architectures (e.g. arm).

"Linux was first developed for 32-bit x86-based PCs (386 or higher). These days it also runs on (at least) the Compaq Alpha AXP, Sun SPARC and UltraSPARC, Motorola 68000, PowerPC, PowerPC64, ARM, Hitachi SuperH, IBM S/390, MIPS, HP PA-RISC, Intel IA-64, DEC VAX, AMD x86-64 and CRIS architectures."

It implements access to various hardware bits that are architecture or machine specific such as interrupt controller, SMP controllers, BUS controllers, exceptions and interrupt setup, virtual memory handling.

It also implements architecture optimized functions (e.g. memcpy, string operations, etc.)

Device drivers

The Linux kernel uses a unified device model whose purpose is to maintain internal data structures that reflect the state and structure of the system. Such information includes what devices are present, what their status is, what bus they are attached to, to what driver they are attached, etc. This information is essential for implementing system wide power management, as well as device discovery and dynamic device removal.

Each subsystem has its own specific driver interface that is tailored to the devices it represents in order to make it easier to write correct drivers and to reduce code duplication.

Linux supports one of the most diverse sets of device driver types; some examples are: TTY, serial, SCSI, filesystem, ethernet, USB, framebuffer, input, sound, etc.

Process management

Linux implements the standard Unix process management APIs such as fork(), exec(), wait(), as well as standard POSIX threads.
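A minimal sketch of these APIs from user space, assuming a POSIX system: the hypothetical helper spawn_and_wait() forks a child that exits immediately with a given code, and the parent collects that code with waitpid(). A real program would typically call one of the exec() functions in the child instead of exiting.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that exits with `code`; return the exit status collected
 * by the parent, or -1 on error. */
int spawn_and_wait(int code)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;              /* fork failed */
    if (pid == 0)
        _exit(code);            /* child: exec() would normally go here */

    /* parent: block until the child terminates */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```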

However, Linux processes and threads are implemented quite differently than in other kernels. There are no internal structures implementing processes or threads; instead there is a struct task_struct that describes an abstract scheduling unit called a task.

A task has pointers to resources, such as address space, file descriptors, IPC ids, etc. The resource pointers for tasks that are part of the same process point to the same resources, while the resource pointers of tasks of different processes point to different resources.
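The pointer sharing described above can be sketched with a heavily simplified model. The toy_* structures and functions below are hypothetical and only mimic the idea; the real struct task_struct is far larger and uses proper reference counting.

```c
#include <stdbool.h>

/* Toy stand-ins for the resources a task points to. */
struct toy_mm    { int users; };    /* address space */
struct toy_files { int users; };    /* open file table */

/* Toy scheduling unit: it holds only pointers to its resources. */
struct toy_task {
    int tid;
    struct toy_mm *mm;
    struct toy_files *files;
};

/* First task of a new process: gets its own resources. */
void toy_init_process(struct toy_task *t, int tid,
                      struct toy_mm *mm, struct toy_files *files)
{
    t->tid = tid;
    t->mm = mm;
    t->files = files;
    mm->users++;
    files->users++;
}

/* Additional thread: points to the same resources as an existing task,
 * in the spirit of clone() with CLONE_VM | CLONE_FILES. */
void toy_init_thread(struct toy_task *t, int tid,
                     const struct toy_task *leader)
{
    t->tid = tid;
    t->mm = leader->mm;
    t->files = leader->files;
    t->mm->users++;
    t->files->users++;
}

bool toy_share_address_space(const struct toy_task *a,
                             const struct toy_task *b)
{
    return a->mm == b->mm;
}
```

In this model the only difference between a "thread" and a "process" is which pointers are shared, which is the property that the text attributes to tasks.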

This peculiarity, together with the clone() and unshare() system calls, allows for implementing new features such as namespaces.

Namespaces are used together with control groups (cgroup) to implement operating system virtualization in Linux.

cgroup is a mechanism to organize processes hierarchically and distribute system resources along the hierarchy in a controlled and configurable manner.

Memory management

Linux memory management is a complex subsystem that deals with:

  • Management of the physical memory: allocating and freeing memory
  • Management of the virtual memory: paging, swapping, demand paging, copy on write
  • User services: user address space management (e.g. mmap(), brk(), shared memory)
  • Kernel services: SL*B allocators, vmalloc
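As a small user-space illustration of the user services listed above, assuming Linux and glibc: the hypothetical helper mmap_roundtrip() asks the kernel for an anonymous private mapping, checks that it starts out zero-filled, writes to it and unmaps it again.

```c
#define _DEFAULT_SOURCE         /* for MAP_ANONYMOUS with strict -std modes */
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Map `len` bytes of anonymous memory, verify it is zero-filled and
 * writable, then unmap it. Returns 0 on success, -1 on any failure. */
int mmap_roundtrip(size_t len)
{
    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    int ok;

    if (p == MAP_FAILED)
        return -1;

    /* The kernel hands out zero-filled pages for anonymous mappings. */
    ok = (p[0] == 0 && p[len - 1] == 0);

    /* Writing touches every page of the mapping. */
    memset(p, 0xAB, len);
    ok = ok && p[0] == 0xAB && p[len - 1] == 0xAB;

    if (munmap(p, len) != 0)
        return -1;
    return ok ? 0 : -1;
}
```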

Block I/O management

The Linux Block I/O subsystem deals with reading and writing data from or to block devices: creating block I/O requests, transforming block I/O requests (e.g. for software RAID or LVM), merging and sorting the requests and scheduling them via various I/O schedulers to the block device drivers.

Virtual Filesystem Switch

The Linux Virtual Filesystem Switch implements common / generic filesystem code to reduce duplication in filesystem drivers. It introduces certain filesystem abstractions such as:

  • inode - describes the file on disk (attributes, location of data blocks on disk)
  • dentry - links an inode to a name
  • file - describes the properties of an opened file (e.g. file pointer)
  • superblock - describes the properties of a formatted filesystem (e.g. number of blocks, block size, location of root directory on disk, encryption, etc.)
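How the first three abstractions relate to each other can be sketched with a toy model. The toy_* types below are hypothetical and vastly simpler than the kernel's struct inode, struct dentry and struct file; they only show that several names can refer to one inode and that each opened file keeps its own file pointer.

```c
#include <stddef.h>

/* The file itself: attributes only, no name. */
struct toy_inode {
    unsigned long ino;
    size_t size;
    unsigned nlink;             /* how many names refer to this inode */
};

/* A name bound to an inode. */
struct toy_dentry {
    const char *name;
    struct toy_inode *inode;
};

/* One opened instance of a file, with its own position. */
struct toy_file {
    struct toy_dentry *dentry;
    size_t pos;
};

/* Bind a name to an inode (comparable to creating a hard link). */
void toy_link(struct toy_dentry *d, const char *name, struct toy_inode *inode)
{
    d->name = name;
    d->inode = inode;
    inode->nlink++;
}

struct toy_file toy_open(struct toy_dentry *d)
{
    return (struct toy_file){ .dentry = d, .pos = 0 };
}

/* Advance the file pointer by up to n bytes, stopping at end of file.
 * Returns the number of bytes that would have been read. */
size_t toy_read(struct toy_file *f, size_t n)
{
    size_t left = f->dentry->inode->size - f->pos;

    if (n > left)
        n = left;
    f->pos += n;
    return n;
}
```

Opening the same inode through two different names yields two independent file positions, while the size and link count live in the shared inode.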

The Linux VFS also implements a complex caching mechanism which includes the following:

  • the inode cache - caches the file attributes and internal file metadata
  • the dentry cache - caches the directory hierarchy of a filesystem
  • the page cache - caches file data blocks in memory

Networking stack

Linux Security Modules

  • Hooks to extend the default Linux security model
  • Used by several Linux security extensions:
    • Security-Enhanced Linux
    • AppArmor
    • Tomoyo
    • Smack