A First Look at the C Programming Language
This chapter introduces the C programming language for complete beginners.
A First Look at the C Programming Language
Before we can dive into writing FreeBSD device drivers, we need to learn the language they're written in. That language is C, short, powerful, and, admittedly, a little quirky. But don't worry, you don't need to be a programming expert to get started.
In this chapter, I'll walk you through the basics of the C programming language, assuming absolutely no prior experience. If you've never written a line of code in your life, you're in the right place. If you've done some programming in other languages like Python or JavaScript, that's fine too; C might feel a little more manual, but we'll tackle it together.
Our goal here isn't to become master C programmers in one chapter. Instead, I want to introduce you to the language gently, showing you its syntax, its building blocks, and how it works in the context of UNIX systems like FreeBSD. Along the way, I'll point out real-world examples taken directly from the FreeBSD source code to help ground the theory in actual practice.
By the time we're done, you'll be able to read and write basic C programs, understand the core syntax, and feel confident enough to take the next steps toward kernel development. But that part will come later, for now, let's focus on learning the essentials.
Reader Guidance: How to Use This Chapter
This chapter is not just a quick read; it is both a reference and a hands-on bootcamp in C programming with a FreeBSD flavour. The depth of material is significant, covering everything from the very first “Hello, World!” to pointers, memory safety, modular code, debugging, good practices, labs, and challenge exercises. How much time you'll spend here depends on how deep you go:
- Reading only: Around 12 hours to read all explanations and FreeBSD kernel examples at a beginner's pace. This assumes careful reading but not stopping for practice.
- Reading + labs: Around 18 hours if you pause to type, compile, and run each of the practical labs on your FreeBSD system, making sure you understand the results.
- Reading + labs + challenges: Around 22 hours or more, since the challenge exercises are designed to make you stop, think, debug, and sometimes revisit earlier material before moving forward.
How to Get the Most Out of This Chapter
- Take it in sections. Don't attempt the whole chapter in one sitting. Each section (variables, operators, control flow, pointers, arrays, structs, etc.) is self-contained and can be studied, practiced, and digested before moving on.
- Type the code yourself. Copy-pasting may feel faster, but it skips the muscle memory that makes coding natural. Typing every example builds fluency with both C and FreeBSD's environment.
- Explore the FreeBSD source tree. Many examples come directly from kernel code. Open the referenced files, read surrounding context, and see how the pieces fit in real-world systems.
- Treat labs as checkpoints. Each lab is a moment to pause, apply what you've learned, and verify that the concepts are solid.
- Leave the challenges for last. They are meant to consolidate all material. Attempt them only once you feel comfortable with the main text and labs.
- Set a realistic pace. If you dedicate 1-2 hours per day, expect this chapter to take a week or more to complete with labs, and longer if you also tackle all challenges. Think of it as a training programme rather than a single reading task.
This chapter is deliberately long because C is the foundation for everything else you'll do in FreeBSD device driver programming. Treat it as your toolbox. Once you master it, the material in later chapters will click into place much more naturally.
Introduction
Let's start at the beginning: what is C, and why is it important to us?
What is C?
C is a programming language created in the early 1970s by Dennis Ritchie at Bell Labs. It was designed to write operating systems, and that's still one of its biggest strengths today. In fact, most modern operating systems, including FreeBSD, Linux, and even parts of Windows and macOS, are written mainly in C.
C is fast, compact, and close to the hardware, but unlike assembly language, it's still readable and expressive. You can write efficient, powerful code with C, but it also expects you to be careful. There's no safety net: no automatic memory management, no runtime error messages, and not even built-in strings like in Python or JavaScript.
This might sound scary, but it's actually a feature. When writing drivers or working inside the kernel, you want control, and C gives you that control.
Why Should I Learn C for FreeBSD?
FreeBSD is written almost entirely in C, and that includes the kernel, device drivers, userland tools, and system libraries. If you want to write code that interacts with the operating system, whether it's a new device driver or a custom kernel module, C is your entry point. More specifically:
- The FreeBSD kernel APIs are written in C.
- All device drivers are implemented in C.
- Even debugging tools like dtrace and kgdb understand and expose C-level information.
So, to work with FreeBSD's internals, you'll need to understand how C code is written, structured, compiled, and used in the system.
What If I've Never Programmed Before?
No problem! I'm writing this chapter with you in mind. We'll take it one step at a time, starting with the simplest possible program and slowly building our way up. You'll learn about:
- Variables and data types
- Functions and flow control
- Pointers, arrays, and structures
- How to read real code from the FreeBSD kernel
And don't worry if any of those terms are unfamiliar right now, they'll all make sense soon. I'll provide plenty of examples, explain every step in plain language, and help you build your confidence as we go.
How Is This Chapter Organized?
Here's a quick preview of what's coming up:
- We'll begin by setting up your development environment on FreeBSD.
- Then, we'll walk through your first C program, the classic "Hello, World!".
- From there, we'll cover the syntax and semantics of C: variables, loops, functions, and more.
- We'll show you real examples from the FreeBSD source tree, so you can start learning how the system works under the hood.
- Finally, we'll wrap up with some good practices and a look at what's coming in the next chapter, where we begin applying C to the kernel world.
Are you ready? Let's jump into the next section and get your environment set up so you can run your first C program on FreeBSD.
Setting Up Your Environment
Before we can start writing C code, we need to set up a working development environment. The good news? If you're running FreeBSD, you already have most of what you need.
In this section, we'll:
- Verify your C compiler is installed
- Compile your first program manually
- Learn how to use Makefiles for convenience
Let's go step by step.
Installing a C Compiler on FreeBSD
FreeBSD includes the Clang compiler as part of the base system, so you typically don't need to install anything extra to start writing C code.
To confirm that Clang is installed and working, open a terminal and run:
% cc --version
You should see output like this:
FreeBSD clang version 19.1.7 (https://github.com/llvm/llvm-project.git
llvmorg-19.1.7-0-gcd708029e0b2)
Target: aarch64-unknown-freebsd14.3
Thread model: posix
InstalledDir: /usr/bin
If cc is not found, you can install the base development utilities running the following command as root:
# pkg install llvm
But for almost all standard FreeBSD setups, Clang should already be ready to use.
Let's write the classic "Hello, World!" program in C. This will verify that your compiler and terminal are working correctly.
Open a text editor like ee, vi, or nano, and create a file called hello.c:
#include <stdio.h>
int main(void) {
printf("Hello, World!\n");
return 0;
}
Let's break this down:
#include <stdio.h>tells the compiler to include the Standard I/O header file, which provides printf.int main(void)defines the main entry point of the program.printf(...)writes a message to the terminal.return 0;indicates successful execution.
Now save the file and compile it:
% cc -o hello hello.c
This tells Clang to:
- Compile
hello.c - Output the result to a file called
hello
Run it:
% ./hello
Hello, World!
Congratulations! You just compiled and ran your first C program on FreeBSD.
Behind the Scenes: The Compilation Pipeline
When you run:
% cc -o hello hello.c
a lot happens under the hood. The process goes through several stages:
- Preprocessing
- Handles
#includeand#define. - Expands macros, includes headers, and produces a pure C source file.
- Handles
- Compilation
- Translates the preprocessed C code into assembly language for your CPU architecture.
- Assembly
- Converts the assembly into machine code instructions, producing an object file (
hello.o).
- Converts the assembly into machine code instructions, producing an object file (
- Linking
- Combines object files with the standard library (e.g.,
printf()from libc). - Produces the final executable (
hello).
- Combines object files with the standard library (e.g.,
Finally, when you run:
% ./hello
the operating system loads the program into memory and starts executing it at the main() function.
Quick Mental Model: Think of it like building a house:
- Preprocessing = gathering the blueprints
- Compilation = turning blueprints into instructions
- Assembly = workers cutting raw materials
- Linking = putting all parts together into the finished house
Later in Section 4.15, we’ll dive deeper into compiling, linking, and debugging, but for now, this mental picture will help you understand what’s really happening when you build your first programs.
Using Makefiles
Typing long compile commands can get annoying as your programs grow. That's where Makefiles come in handy.
A Makefile is a plain text file named Makefile that defines how to build your program. Here's a very simple one for our Hello World example:
# Makefile for hello.c
hello: hello.c
cc -o hello hello.c
Attention: Every command line that will be executed by the shell within a Makefile rule must begin with a tab character, not spaces. If you use spaces, the make execution will fail."
To use it:
Save this in a file called Makefile (note the capital "M") Run make in the same directory:
% make
cc -o hello hello.c
This is especially helpful when your project grows to include multiple files.
Important Note: One of the most common mistakes when writing your first Makefile is forgetting to use a TAB character at the beginning of each command line in a rule. In Makefiles, every line that should be executed by the shell must start with a TAB, not spaces. If you accidentally use spaces instead, make will produce an error and fail to run. This detail often trips up beginners, so be sure to check your indentation carefully!
This error will appear as shown below:
% make
make: "/home/ebrandi/hello/Makefile" line 4: Invalid line type
make: Fatal errors encountered -- cannot continue
make: stopped in /home/ebrandi/hello
For now, we're only compiling one source file at a time. In Section 4.14, you'll see how larger programs are organized into multiple .c files with shared headers, just like in the FreeBSD kernel.
Installing the FreeBSD Source Code
As we move forward, we'll look at examples from the actual FreeBSD kernel source. To follow along, it's useful to have the FreeBSD source tree installed locally.
To store a complete local copy of the FreeBSD source code, you will need approximately 3.6 GB of free disk space. You can install it using Git by running the following command:
# git clone --branch releng/14.3 --depth 1 https://git.FreeBSD.org/src.git src /usr/src
This will give you access to all source code, which we'll reference frequently throughout this book.
Summary
You now have a working development setup on FreeBSD! It was simple, wasn't it?
Here's what you've accomplished:
- Verified the C compiler is installed
- Wrote and compiled your first C program
- Learned how to use Makefiles
- Cloned the FreeBSD source tree for future reference
These tools are all you need to start learning C, and later, to build your own kernel modules and drivers. In the next section, we'll look at what makes up a typical C program and how it's structured.
Anatomy of a C Program
Now that you've compiled your first "Hello, World!" program, let's take a closer look at what's actually going on inside that code. In this section, we'll break down the basic structure of a C program and explain what each part does, step by step.
We'll also introduce how this structure appears in the FreeBSD kernel code, so you can begin recognising familiar patterns in real-world systems programming.
The Basic Structure
Every C program follows a similar structure:
#include <stdio.h>
int main(void) {
printf("Hello, World!\n");
return 0;
}
Let's dissect this line by line.
#include Directives: Adding Libraries
#include <stdio.h>
This line is handled by the C preprocessor before the program is compiled. It tells the compiler to include the contents of a system header file.
<stdio.h>is a standard header file that provides I/O functions like printf.- Anything you include this way is pulled into your program at compile time.
In FreeBSD source code, you'll often see many #include directives at the top of a file. Here's an example from the FreeBSD kernel file sys/kern/kern_shutdown.c:
#include <sys/cdefs.h>
#include "opt_ddb.h"
#include "opt_ekcd.h"
#include "opt_kdb.h"
#include "opt_panic.h"
#include "opt_printf.h"
#include "opt_sched.h"
#include "opt_watchdog.h"
#include <sys/param.h>
#include <sys/systm.h>
#include <sys/bio.h>
#include <sys/boottrace.h>
#include <sys/buf.h>
#include <sys/conf.h>
#include <sys/compressor.h>
#include <sys/cons.h>
#include <sys/disk.h>
#include <sys/eventhandler.h>
#include <sys/filedesc.h>
#include <sys/jail.h>
#include <sys/kdb.h>
#include <sys/kernel.h>
#include <sys/kerneldump.h>
#include <sys/kthread.h>
#include <sys/ktr.h>
#include <sys/malloc.h>
#include <sys/mbuf.h>
#include <sys/mount.h>
#include <sys/priv.h>
#include <sys/proc.h>
#include <sys/reboot.h>
#include <sys/resourcevar.h>
#include <sys/rwlock.h>
#include <sys/sbuf.h>
#include <sys/sched.h>
#include <sys/smp.h>
#include <sys/stdarg.h>
#include <sys/sysctl.h>
#include <sys/sysproto.h>
#include <sys/taskqueue.h>
#include <sys/vnode.h>
#include <sys/watchdog.h>
#include <crypto/chacha20/chacha.h>
#include <crypto/rijndael/rijndael-api-fst.h>
#include <crypto/sha2/sha256.h>
#include <ddb/ddb.h>
#include <machine/cpu.h>
#include <machine/dump.h>
#include <machine/pcb.h>
#include <machine/smp.h>
#include <security/mac/mac_framework.h>
#include <vm/vm.h>
#include <vm/vm_object.h>
#include <vm/vm_page.h>
#include <vm/vm_pager.h>
#include <vm/swap_pager.h>
#include <sys/signalvar.h>
These headers define macros, constants, and function prototypes used in the kernel. For now, just remember: #include brings in definitions you want to use.
The main() Function: Where Execution Begins
int main(void) {
- This is the entry point of your program. When your program runs, it starts here.
- The
intmeans the function returns an integer to the operating system. - void means it takes no arguments.
In user programs, main() is where you write your logic. In the kernel, however, there's no main() function like this; the kernel has its own bootstrapping process. But FreeBSD kernel modules and subsystems still define entry points that act in similar ways.
For example, device drivers use functions like:
static int
mydriver_probe(device_t dev)
And they are registered with the kernel during initialisation; these behave like a main() for specific subsystems.
Statements and Function Calls
printf("Hello, World!\n");
This is a statement, a single instruction that performs some action.
printf()is a function provided by<stdio.h>that prints formatted output."Hello, World!\n"is a string literal, with\nmeaning "new line".
Important Note: In kernel code, you don't use the printf() function from the Standard C Library (libc). Instead, the FreeBSD kernel provides its own internal version of printf() tailored for kernel-space output, a distinction we'll explore in more detail later in the book.
Return Values
return 0;
}
This tells the operating system that the program completed successfully.
Returning 0usually means "no error".
You'll see a similar pattern in kernel code where functions return 0 for success and a non-zero value for failure.
Bonus Learning Point About Return Values
Let's see a practical example from sys/kern/kern_exec.c:
exec_map_first_page(struct image_params *imgp)
{
vm_object_t object;
vm_page_t m;
int error;
if (imgp->firstpage != NULL)
exec_unmap_first_page(imgp);
object = imgp->vp->v_object;
if (object == NULL)
return (EACCES);
#if VM_NRESERVLEVEL > 0
if ((object->flags & OBJ_COLORED) == 0) {
VM_OBJECT_WLOCK(object);
vm_object_color(object, 0);
VM_OBJECT_WUNLOCK(object);
}
#endif
error = vm_page_grab_valid_unlocked(&m, object, 0,
VM_ALLOC_COUNT(VM_INITIAL_PAGEIN) |
VM_ALLOC_NORMAL | VM_ALLOC_NOBUSY | VM_ALLOC_WIRED);
if (error != VM_PAGER_OK)
return (EIO);
imgp->firstpage = sf_buf_alloc(m, 0);
imgp->image_header = (char *)sf_buf_kva(imgp->firstpage);
return (0);
}
Return Values in exec_map_first_page():
-
return (EACCES);Returned when the executable file's vnode (imgp->vp) has no associated virtual memory object (v_object). Without this object, the kernel cannot map the file into memory. This is treated as a permission/access error, using the standardEACCESerror code ("Permission Denied"). -
return (EIO);Returned when the kernel fails to retrieve a valid memory page from the file viavm_page_grab_valid_unlocked(). This may happen due to an I/O failure, memory issue, or file corruption. TheEIOcode ("Input/Output Error") signals a low-level failure in reading or allocating memory for the file. -
return (0);Returned on successful completion of the function. This indicates that the kernel has successfully grabbed the first page of the executable file, mapped it into memory, and stored the address of the header inimgp->image_header. A return value of0is the standard kernel convention for indicating success.
The use of errno-style error codes like EIO and EACCES ensures consistent error handling throughout the kernel, making it easier for driver developers and kernel programmers to propagate errors reliably and interpret failure conditions in a familiar, standardised way.
The FreeBSD kernel makes extensive use of errno-style error codes to represent different failure conditions consistently. Don't worry if they seem unfamiliar at first, as we move forward, you'll naturally encounter many of them, and I'll help you understand how they work and when to use them.
For a complete list of standard error codes and their meanings, you can refer to the FreeBSD manual page:
% man 2 intro
Putting It All Together
Let's revisit our Hello World program, now with full comments:
#include <stdio.h> // Include standard I/O library
int main(void) { // Entry point of the program
printf("Hello, World!\n"); // Print a message to the terminal
return 0; // Exit with success
}
In this short example, you've already seen:
- A preprocessor directive
- A function definition
- A standard library call
- A return statement
These are the building blocks of C and you'll see them repeated everywhere, including deep inside FreeBSD's kernel source code.
A First Glimpse at Good Practices in C
Before we move on to variables, types, and control flow, it is worth pausing for a short note on style and discipline. You are just starting out, but if you learn a few habits now, you will save yourself a lot of trouble later. FreeBSD, like every mature project, follows its own coding conventions called KNF (Kernel Normal Form). We will study these in depth near the end of this chapter, but here are four essentials you should keep in mind right from the start:
1. Always Use Braces
Even if an if or for only controls a single statement, always write it with braces:
if (x > 0) {
printf(Positive\n);
}
This avoids a whole class of bugs and keeps your code safe when you later add more statements.
2. Indentation Matters
The FreeBSD kernel style guide requires tabs, not spaces, for indentation. Your editor should insert a tab character at each indent level. This is not about aesthetics: consistent indentation makes kernel code readable and reviewable.
3. Prefer Meaningful Names
Avoid calling your variables a, b, or tmp1. A name like counter, buffer_size, or error_code makes the code immediately self-explanatory. Remember: in FreeBSD, your code will one day be read by someone else, and often years after you wrote it.
4. No “Magic Numbers”
If you find yourself writing something like:
if (users > 64) { ... }
Replace 64 with a named constant:
#define MAX_USERS 64
if (users > MAX_USERS) { ... }
This makes your code easier to maintain and avoids hidden assumptions.
Why This Matters for You
Right now, these may feel like small details. But by building these habits early, you will avoid learning “sloppy” C that has to be unlearned when you reach kernel development. Think of this as a survival kit: a few essential rules that will keep your code clear, safe, and closer to what you will see in the FreeBSD source tree.
Later in this chapter, you will revisit these practices in depth, along with many more advanced conventions used by FreeBSD developers. For now, just keep these four in mind as you start coding.
Summary
In this section, you've learned:
- The structure of a C program
- How #include and main() work
- What printf() and return do
- How similar structures appear in FreeBSD's kernel code
- Early good practices to keep your code clear and safe
The more C code you read, both your own and from FreeBSD, the more these patterns will become second nature.
Variables and Data Types
In any programming language, variables are how you store and manipulate data. In C, variables are a little more "manual" than in higher-level languages, but they give you the control you need to write fast, efficient programs, and that's precisely what operating systems like FreeBSD require.
In this section, we'll explore:
- How to declare and initialise variables
- The most common data types in C
- How FreeBSD uses them in kernel code
- Some tips to avoid common beginner mistakes
Let's start with the basics.
What Is a Variable?
A variable is like a labeled box in memory where you can store a value, such as a number, a character, or even a block of text.
Here's a simple example:
int counter = 0;
This tells the compiler:
- Allocate enough memory to store an integer
- Call that memory location counter
- Put the number 0 in it to start
Declaring Variables
In C, you must declare the type of every variable before using it. This is different from languages like Python, where the type is determined automatically.
Here's how to declare different types of variables:
int age = 30; // Integer (whole number)
float temperature = 98.6; // Floating-point number
char grade = 'A'; // Single character
You can also declare multiple variables at once:
int x = 10, y = 20, z = 30;
Or leave them uninitialized (but be careful, as uninitialized variables contain garbage values!):
int count; // May contain anything!
Always initialise your variables, not just because it's good C practice, but because in kernel development, uninitialized values can lead to subtle and dangerous bugs, including kernel panics, unpredictable behaviour, and security vulnerabilities. In userland, mistakes might crash your program; in the kernel, they can compromise the stability of the entire system.
Unless you have a very specific and justified reason not to (such as performance-critical code paths where the value is immediately overwritten), make initialisation the rule, not the exception.
Common C Data Types
Here are the core types you'll use most often:
| Type | Description | Example |
|---|---|---|
int | Integer (typically 32-bit) | int count = 1; |
unsigned | Non-negative integer | unsigned size = 10; |
char | A single 8-bit character | char c = 'A'; |
float | Floating-point number (~6 decimal digits) | float pi = 3.14; |
double | Double-precision float (~15 digits) | double g = 9.81; |
void | Represents "no value" (used for functions) | void print() |
Type Qualifiers
C provides type qualifiers to give more information about how a variable should behave:
const: This variable can't be changed.volatile: The value can change unexpectedly (used with hardware!).unsigned: The variable cannot hold negative numbers.
Example:
const int max_users = 100;
volatile int status_flag;
The volatile qualifier can be important in FreeBSD kernel development, but only in very specific contexts, such as accessing hardware registers or dealing with interrupt-driven updates. It tells the compiler not to optimise accesses to a variable, which is critical when values can change outside of normal program flow.
However, volatile is not a substitute for proper synchronisation and should not be used for coordinating access between threads or CPUs. For that, the FreeBSD kernel provides dedicated primitives like mutexes and atomic operations, which offer both compiler and CPU-level guarantees.
Constant Values and #define
In C programming and especially in kernel development, it's very common to define constant values using the #define directive:
#define MAX_DEVICES 64
This line doesn't declare a variable. Instead, it's a preprocessor macro, which means the C preprocessor will replace every occurrence of MAX_DEVICES with 64 before the actual compilation begins. This replacement happens textually, and the compiler never even sees the name MAX_DEVICES.
Why Use #define for Constants?
Using #define for constant values has several advantages in kernel code:
- Improves readability: Instead of seeing magic numbers (like 64) scattered throughout the code, you see meaningful names like MAX_DEVICES.
- Makes code easier to maintain: If the maximum number of devices ever needs to change, you update it in one place, and the change is reflected wherever it's used.
- Keeps kernel code lightweight: Kernel code often avoids runtime overhead, and #define constants don't allocate memory or exist in the symbol table; they simply get replaced during preprocessing.
Real Example From FreeBSD
You will find many #define lines in sys/sys/param.h, for example:
#define MAXHOSTNAMELEN 256 /* max hostname size */
This defines the maximum number of characters allowed in a system hostname, and it's used throughout the kernel and system utilities to enforce a consistent limit. The value 256 is now standardised and can be reused wherever the hostname length is relevant.
Watch Out: There Is No Type Checking
Because #define simply performs textual substitution, it does not respect types or scoping.
For example:
#define PI 3.14
This works, but it can lead to problems in certain contexts (e.g., integer promotion, unintended precision loss). For more complex or type-sensitive constants, you may prefer using const variables or enums in userland, but in the kernel, especially in headers, #define is often chosen for efficiency and compatibility.
Best Practices for #define Constants in Kernel Development
- Use ALL CAPS for macro names to distinguish them from variables.
- Add comments to explain what the constant represents.
- Avoid defining constants that depend on runtime values.
- Prefer
#defineoverconstin header files or when targeting C89 compatibility (which is still common in kernel code).
Best Practices for Variables
Writing correct and robust kernel code starts with disciplined variable usage. The tips below will help you avoid subtle bugs, improve code readability, and align with FreeBSD kernel development conventions.
Always initialise your variables: Never assume a variable starts at zero or any default value, especially in kernel code, where behaviour must be deterministic. An uninitialized variable could hold random garbage from the stack, leading to unpredictable behaviour, memory corruption, or kernel panics. Even when the variable will be overwritten soon, it's often safer and more transparent to initialise it explicitly unless performance measurements prove otherwise.
Don't use variables before assigning a value: This is one of the most common bugs in C, and compilers won't always catch it. In the kernel, using an uninitialized variable can result in silent failures or catastrophic system crashes. Always trace your logic to ensure every variable is assigned a valid value before use, especially if it influences memory access or hardware operations.
Use const whenever the value shouldn't change:
Using const is more than good style; it helps the compiler enforce read-only constraints and catch unintended modifications. This is particularly important when:
- Passing read-only pointers into functions
- Protecting configuration structures or table entries
- Marking driver data that must not change after initialisation
In kernel code, this can even lead to compiler optimisations and make the code easier to reason about for reviewers and maintainers.
Use unsigned for values that can't be negative (like sizes or counters): Variables that represent quantities like buffer sizes, loop counters, or device counts should be declared as unsigned types (unsigned int, size_t, or uint32_t, etc.). This improves clarity and prevents logic bugs, especially when comparing with other unsigned types, which can cause unexpected behaviour if signed values are mixed in.
Prefer fixed-width types in kernel code (uint32_t, int64_t, etc.): Kernel code must behave predictably across architectures (e.g., 32-bit vs 64-bit systems). Types like int, long, or short can vary in size depending on the platform, which can lead to portability issues and alignment bugs. Instead, FreeBSD uses standard types from <sys/types.h> such as:
uint8_t,uint16_t,uint32_t,uint64_tint32_t,int64_t, etc.
These types ensure your code has a known, fixed layout and avoids surprises when compiling or running on different hardware.
Pro Tip: When in doubt, look at existing FreeBSD kernel code, especially drivers and subsystems close to what you're working on. The variable types and initialisation patterns used there are often based on years of hard-earned lessons from real-world systems.
Summary
In this section, you've learned:
- How to declare and initialise variables
- The most important data types in C
- What type qualifiers like const and volatile do
- How to spot and understand variable declarations in FreeBSD's kernel code
You now have the tools to store and work with data in C, and you've already seen how FreeBSD uses the same concepts in production-quality kernel code.
Operators and Expressions
So far, we've learned how to declare and initialise variables. Now it's time to make them do something! In this section, we'll dive into operators and expressions, the mechanisms in C that allow you to compute values, compare them, and control program logic.
We'll cover:
- Arithmetic operators
- Comparison operators
- Logical operators
- Bitwise operators (lightly)
- Assignment operators
- Real examples from FreeBSD kernel code
What Is an Expression?
In C, an expression is anything that produces a value. For example:
int a = 3 + 4;
Here, 3 + 4 is an expression that evaluates to 7. The result is then assigned to a.
Operators are what you use to build expressions.
Arithmetic Operators
These are used for basic math:
| Operator | Meaning | Example | Result |
|---|---|---|---|
+ | Addition | 5 + 2 | 7 |
- | Subtraction | 5 - 2 | 3 |
* | Multiplication | 5 * 2 | 10 |
/ | Division | 5 / 2 | 2 (integer division!) |
% | Modulus | 5 % 2 | 1 (remainder) |
Note: In C, division of two integers discards the decimal part. To get floating-point results, at least one operand must be a float or double.
Comparison Operators
These are used to compare two values and return either true (1) or false (0):
| Operator | Meaning | Example | Result |
|---|---|---|---|
== | Equal to | a == b | 1 if equal |
!= | Not equal to | a != b | 1 if not equal |
< | Less than | a < b | 1 if true |
> | Greater than | a > b | 1 if true |
<= | Less than or equal | a <= b | 1 if true |
>= | Greater than or equal | a >= b | 1 if true |
These are heavily used in if, while, and for statements to control program flow.
Logical Operators
Used to combine or invert conditions:
| Operator | Name | Description | Example | Result |
|---|---|---|---|---|
| && | Logical AND | True if both conditions are true | (a > 0) && (b < 5) | 1 if both are true |
| || | Logical OR | True if either condition is true | (a == 0) || (b > 10) | 1 if at least one is true |
| ! | Logical NOT | Reverses the truth value of the condition | !done | 1 if done is false |
These are especially useful in complex conditionals, like:
if ((a > 0) && (b < 100)) {
// both conditions must be true
}
Tip: In C, any non-zero value is considered "true," and zero is considered "false".
Assignment and Compound Assignment
The = operator assigns a value:
x = 5; // assign 5 to x
Compound assignment combines operation and assignment:
| Operator | Meaning | Example | Equivalent to |
|---|---|---|---|
+= | Add and assign | x += 3; | x = x + 3; |
-= | Subtract and assign | x -= 2; | x = x - 2; |
*= | Multiply and assign | x *= 4; | x = x * 4; |
/= | Divide and assign | x /= 2; | x = x / 2; |
%= | Modulus and assign | x %= 3; | x = x % 3; |
Bitwise Operators
In kernel development, bitwise operators are standard. Here's a light preview:
| Operator | Meaning | Example |
|---|---|---|
| & | Bitwise AND | a & b |
| | | Bitwise OR | a | b |
| ^ | Bitwise XOR | a ^ b |
| ~ | Bitwise NOT | ~a |
| << | Left shift | a << 2 |
| >> | Right shift | a >> 1 |
We'll cover these in detail later when we work with flags, registers, and hardware I/O.
Real Example from FreeBSD: sys/kern/tty_info.c
Let's look at a real example from the FreeBSD source code.
Open the file sys/kern/tty_info.cand look for the function thread_compare() starting on line 109, you will see the code below:
static int
thread_compare(struct thread *td, struct thread *td2)
{
int runa, runb;
int slpa, slpb;
fixpt_t esta, estb;
if (td == NULL)
return (1);
/*
* Fetch running stats, pctcpu usage, and interruptable flag.
*/
thread_lock(td);
runa = TD_IS_RUNNING(td) || TD_ON_RUNQ(td);
slpa = td->td_flags & TDF_SINTR;
esta = sched_pctcpu(td);
thread_unlock(td);
thread_lock(td2);
runb = TD_IS_RUNNING(td2) || TD_ON_RUNQ(td2);
estb = sched_pctcpu(td2);
slpb = td2->td_flags & TDF_SINTR;
thread_unlock(td2);
/*
* see if at least one of them is runnable
*/
switch (TESTAB(runa, runb)) {
case ONLYA:
return (0);
case ONLYB:
return (1);
case BOTH:
break;
}
/*
* favor one with highest recent cpu utilization
*/
if (estb > esta)
return (1);
if (esta > estb)
return (0);
/*
* favor one sleeping in a non-interruptible sleep
*/
switch (TESTAB(slpa, slpb)) {
case ONLYA:
return (0);
case ONLYB:
return (1);
case BOTH:
break;
}
return (td < td2);
}
We are interested in this fragment of code:
...
runa = TD_IS_RUNNING(td) || TD_ON_RUNQ(td);
...
return (td < td2);
Explanation:
TD_IS_RUNNING(td)andTD_ON_RUNQ(td)are macros that return boolean values.- The logical OR
||checks if either condition is true. - The result is assigned to `runa.
Later, this line:
return (td < td2);
Uses the less-than operator to compare two pointers (td and td2). This is valid in C; pointer comparisons are common when choosing between resources.
Another real expression in that same file can be found at line 367:
pctcpu = (sched_pctcpu(td) * 10000 + FSCALE / 2) >> FSHIFT;
This expression:
- Multiplies the CPU usage estimate by 10,000
- Adds half the scale factor for rounding
- Then performs a bitwise right shift to scale it down
- It's an optimised way to compute
(value * scale) / divisorusing bit shifts instead of division
Summary
In this section, you've learned:
- What expressions are in C
- How to use arithmetic, comparison, and logical operators
- How to assign values and use compound assignments
- How bitwise operations show up in kernel code
- How FreeBSD uses these expressions to control logic and calculations
This section builds the foundation for conditional execution and looping, which we'll explore next.
Control Flow
So far, we've learned how to declare variables and write expressions. But programs need to do more than compute values; they need to make decisions and repeat actions. This is where control flow comes in.
Control flow statements allow you to:
- Choose between different paths (
if,else,switch) - Repeat operations using loops (
for,while,do...while) - Exit loops early (
break,continue)
These are the decision-making tools of C, and they're essential for writing meaningful programs, from small utilities to operating system kernels.
Understanding the if, else, and else if
One of the most basic ways to control the flow of a C program is with the if statement. It lets your code make decisions based on whether a condition is true or false.
if (x > 0) {
printf("x is positive\n");
} else if (x < 0) {
printf("x is negative\n");
} else {
printf("x is zero\n");
}
Here's how it works step by step:
-
if (x > 0)– The program checks the first condition. If it's true, the block inside runs and the rest of the chain is skipped. -
else if (x < 0)– If the first condition was false, this second one is checked. If it's true, it's block runs and the chain ends. -
else– If none of the previous conditions are true, the code insideelseruns.
Important syntax rules:
- Each condition must be inside parentheses
( ). - Each block of code is surrounded by curly braces
{ }, even if it's only one line (this prevents common mistakes).
You can see a real example of if , if else and else usage flow control in the function ifhwioctl() that starts at line 2407 of sys/net/if.c file, the fragment that we are interested in starts at line 2537:
/* Copy only (length-1) bytes so if_description is always NUL-terminated. */
/* The length parameter counts the terminating NUL. */
if (ifr_buffer_get_length(ifr) > ifdescr_maxlen)
return (ENAMETOOLONG);
else if (ifr_buffer_get_length(ifr) == 0)
descrbuf = NULL;
else {
descrbuf = if_allocdescr(ifr_buffer_get_length(ifr), M_WAITOK);
error = copyin(ifr_buffer_get_buffer(ifr), descrbuf,
ifr_buffer_get_length(ifr) - 1);
if (error) {
if_freedescr(descrbuf);
break;
}
}
This fragment handles a request from user space to set a description for a network interface, for example, giving em0 a human-readable label like "Main uplink port". The code checks the length of the description provided and decides what to do next.
Let's walk through the flow control step by step:
- First
if– Checks whether the description is too long to fit.- If true, the function immediately stops and returns an error code (
ENAMETOOLONG). - If false, execution moves on to the next condition.
- If true, the function immediately stops and returns an error code (
else if– Runs only if the first condition was false.- If the length is exactly zero, it means the user didn't provide a description, so the code sets
descrbuftoNULL. - If false, the program moves on to the final
else.
- If the length is exactly zero, it means the user didn't provide a description, so the code sets
- Final
else– Executes when neither of the previous conditions are true.- Allocates memory for the description and copies the provided text into it.
- If copying fails, it frees the memory and exits the loop or function.
How the flow works:
- Only one of these three paths runs each time.
- The first matching condition "wins", and the rest are skipped.
- This is a classic example of using
if / else if / elseto handle mutually exclusive conditions, reject invalid input, handle the empty case, or process a valid value.
In C, if / else if / else chains provide a straightforward way to handle several possible outcomes in a single structure. The program checks each condition in order, and as soon as one is true, that block runs and the rest are skipped. This simple rule keeps your logic predictable and easy to follow. In the FreeBSD kernel, you'll see this pattern everywhere, from network stack functions to device drivers, because it ensures that only the correct code path runs for each situation, making the system's decision-making both efficient and reliable.
Understanding the switch and case
A switch statement is a decision-making structure that's useful when you need to compare one variable against multiple possible values. Instead of writing a long chain of if and else if statements, you can list each possible value as a case.
Here's a simple example:
switch (cmd) {
case 0:
printf("Zero\n");
break;
case 1:
printf("One\n");
break;
default:
printf("Unknown\n");
break;
}
- The switch checks the value of
cmd. - Each case is a possible value that
cmdmight have. - The
breakstatement tells the program to stop checking further cases once a match is found. Withoutbreak, execution will continue into the next case, a behaviour called fall-through. - The
defaultcase runs if none of the listed cases match.
You can see a real use of switch in the FreeBSD kernel inside the function thread_compare() (starting at line 109 in sys/kern/tty_info.c). The fragment we're interested in is from lines 134 to 141:
switch (TESTAB(runa, runb)) {
case ONLYA:
return (0);
case ONLYB:
return (1);
case BOTH:
break;
}
What This Code Does
This code decides which of two threads is "more interesting" for the scheduler based on whether each thread is runnable.
runaandrunbare flags that indicate if the first thread (a) and the second thread (b) are runnable.- The macro
TESTAB(a, b)combines those flags into a single value. This result can be one of three predefined constants:ONLYA- Only thread A is runnable.ONLYB- Only thread B is runnable.BOTH- Both threads are runnable.
The switch works like this:
- Case
ONLYA– If only thread A is runnable, return0. - Case
ONLYB– If only thread B is runnable, return1. - Case
BOTH– If both threads are runnable, don't return immediately; instead,breakso the rest of the function can handle this situation.
In short, switch statements provide a clean and efficient way to handle multiple possible outcomes from a single expression, avoiding the clutter of long if / else if chains. In the FreeBSD kernel, they are often used to react to different commands, flags, or states, as in our example, which decides between thread A, thread B, or both. Once you become comfortable reading switch structures, you'll start to recognise them throughout kernel code as a go-to pattern for organising decision-making logic in a clear, maintainable way.
Understanding the for Loops
A for loop in C is perfect when you know how many times you want to repeat something. It sets things up in a compact, easy-to-read style:
for (int i = 0; i < 10; i++) {
printf("%d\n", i);
}
- Start at
i = 0 - Repeat while
i < 10 - Increment
ieach time by 1 (i++)
A widespread beginner error is related to off-by-one errors (<= vs <), and forgetting the increment (which can cause an infinite loop).
You can see a real for loop inside sys/net/iflib.c, in the function netmap_fl_refill(), which begins at line 859. The fragment we care about is the inner batching loop at lines 922–949:
for (i = 0; n > 0 && i < IFLIB_MAX_RX_REFRESH; n--, i++) {
struct netmap_slot *slot = &ring->slot[nm_i];
uint64_t paddr;
void *addr = PNMB(na, slot, &paddr);
/* ... work per buffer ... */
nm_i = nm_next(nm_i, lim);
nic_i = nm_next(nic_i, lim);
}
What this loop does
- The driver is refilling receive buffers so the NIC can keep receiving packets.
- It processes buffers in batches: up to
IFLIB_MAX_RX_REFRESHeach time. icounts how many buffers we've handled in this batch.nis the total remaining buffers to refill; it decrements every iteration.- For each buffer, the code grabs its slot, figures out the physical address, readies it for DMA, then advances the ring indices (
nm_i,nic_i). - The loop stops when either the batch is full (
ihits the max) or there's nothing left to do (n == 0). The batch is then "published" to the NIC by the code right after the loop.
In essence, a for loop is the go-to choice when you have a clear limit on how many times something should run. It packages initialisation, condition checking, and iteration updates into a single, compact header, making the flow easy to follow.
In FreeBSD's kernel code, this structure is everywhere from scanning arrays to walking network ring buffers, because it keeps repetitive work both predictable and efficient. Our example from netmap_fl_refill() shows precisely how this works in practice:
the loop counts through a fixed-size batch of buffers, stopping either when the batch is full or when there's no more work left, then hands that batch off to the NIC. Once you get comfortable reading for loops like this, you'll spot them throughout the kernel and understand how they keep complex systems running smoothly.
Understanding the while Loop
In C, a while loop is a control structure that allows your program to repeat a block of code as long as a certain condition remains true.
Think of it like telling your program, "Keep doing this task while this rule is true. Stop as soon as the rule becomes false."
Lets see a example:
int i = 0;
while (i < 10) {
printf("%d\n", i);
i++;
}
Variable Initialization
int i = 0;
- We create a variable
iand set its value to0. - This will be our
counter, keeping track of how many times the loop has run.
The while Condition
while (i < 10)
- Before each repetition, C checks the condition
i < 10. - If the condition is true, the block inside the loop is executed.
- If the condition is false, the loop stops, and the program continues after the loop.
The Loop Body
{
printf("%d\n", i);
i++;
}
printf("%d\n", i); - Prints the value of i followed by a newline (\n).
i++; - Increases i by 1 after each iteration. This step is crucial; without it, i would stay 0 forever, and the loop would never end, creating an infinite loop.
Key Points to Remember
- A while loop may not run at all if the condition is false from the start.
- Always ensure something inside the loop changes the condition over time, or you risk an infinite loop.
- In FreeBSD kernel code,
whileloops are common for:- Polling hardware status registers until a device is ready.
- Waiting for a buffer to be filled.
- Implementing retry mechanisms.
You can see a real example of while loop usage in the function netmap_fl_refill() that starts at line 858 of sys/net/iflib.c file.
This time, I've decided to show you the complete source code for this FreeBSD kernel function because it offers an excellent opportunity to see several concepts from this chapter working together in a real-world context.
To make it easier to follow, I've added explanatory comments at key points so you can connect the theory to the actual implementation. Don't worry if you don't fully understand every detail right now; this is normal when first looking at kernel code.
For our discussion, pay special attention to the while loop that begins at line 915, as it's the part we will explore in depth. Look for while (n > 0) { in the code below:
/*
* netmap_fl_refill
* ----------------
* This function refills receive (RX) buffers in a netmap-enabled
* FreeBSD network driver so the NIC can continue receiving packets.
*
* It is called in two main situations:
* 1. Initialization (driver start/reset)
* 2. RX synchronization (during packet reception)
*
* The core idea: figure out how many RX slots need refilling,
* then load and map each buffer so the NIC can use it.
*/
static int
netmap_fl_refill(iflib_rxq_t rxq, struct netmap_kring *kring, bool init)
{
struct netmap_adapter *na = kring->na;
u_int const lim = kring->nkr_num_slots - 1;
struct netmap_ring *ring = kring->ring;
bus_dmamap_t *map;
struct if_rxd_update iru;
if_ctx_t ctx = rxq->ifr_ctx;
iflib_fl_t fl = &rxq->ifr_fl[0];
u_int nic_i_first, nic_i; // NIC descriptor indices
u_int nm_i; // Netmap ring index
int i, n; // i = batch counter, n = buffers to process
#if IFLIB_DEBUG_COUNTERS
int rf_count = 0;
#endif
/*
* Figure out how many buffers (n) we need to refill.
* - In init mode: refill almost the whole ring (minus those in use)
* - In normal mode: refill from hardware current to ring head
*/
if (__predict_false(init)) {
n = kring->nkr_num_slots - nm_kr_rxspace(kring);
} else {
n = kring->rhead - kring->nr_hwcur;
if (n == 0)
return (0); /* Nothing to do, ring already full. */
if (n < 0)
n += kring->nkr_num_slots; /* wrap-around adjustment */
}
// Prepare refill update structure
iru_init(&iru, rxq, 0 /* flid */);
map = fl->ifl_sds.ifsd_map;
// Starting positions
nic_i = fl->ifl_pidx; // NIC producer index
nm_i = netmap_idx_n2k(kring, nic_i); // Convert NIC index to netmap index
// Sanity checks for init mode
if (__predict_false(init)) {
MPASS(nic_i == 0);
MPASS(nm_i == kring->nr_hwtail);
} else
MPASS(nm_i == kring->nr_hwcur);
DBG_COUNTER_INC(fl_refills);
/*
* OUTER LOOP:
* Keep processing until we have refilled all 'n' needed buffers.
*/
while (n > 0) {
#if IFLIB_DEBUG_COUNTERS
if (++rf_count == 9)
DBG_COUNTER_INC(fl_refills_large);
#endif
nic_i_first = nic_i; // Save where this batch starts
/*
* INNER LOOP:
* Process up to IFLIB_MAX_RX_REFRESH buffers in one batch.
* This avoids calling hardware refill for every single buffer.
*/
for (i = 0; n > 0 && i < IFLIB_MAX_RX_REFRESH; n--, i++) {
struct netmap_slot *slot = &ring->slot[nm_i];
uint64_t paddr;
void *addr = PNMB(na, slot, &paddr); // Get buffer address and phys addr
MPASS(i < IFLIB_MAX_RX_REFRESH);
// If the buffer address is invalid, reinitialize the ring
if (addr == NETMAP_BUF_BASE(na))
return (netmap_ring_reinit(kring));
// Save the physical address and NIC index for this buffer
fl->ifl_bus_addrs[i] = paddr + nm_get_offset(kring, slot);
fl->ifl_rxd_idxs[i] = nic_i;
// Load or reload DMA mapping if necessary
if (__predict_false(init)) {
netmap_load_map(na, fl->ifl_buf_tag,
map[nic_i], addr);
} else if (slot->flags & NS_BUF_CHANGED) {
netmap_reload_map(na, fl->ifl_buf_tag,
map[nic_i], addr);
}
// Synchronize DMA so the NIC can safely read the buffer
bus_dmamap_sync(fl->ifl_buf_tag, map[nic_i],
BUS_DMASYNC_PREREAD);
// Clear "buffer changed" flag
slot->flags &= ~NS_BUF_CHANGED;
// Move to next position in both netmap and NIC rings (circular increment)
nm_i = nm_next(nm_i, lim);
nic_i = nm_next(nic_i, lim);
}
/*
* Tell the hardware to make these new buffers available.
* This happens once per batch for efficiency.
*/
iru.iru_pidx = nic_i_first;
iru.iru_count = i;
ctx->isc_rxd_refill(ctx->ifc_softc, &iru);
}
// Update software producer index
fl->ifl_pidx = nic_i;
// Ensure we refilled exactly up to the intended position
MPASS(nm_i == kring->rhead);
kring->nr_hwcur = nm_i;
// Final DMA sync for descriptors
bus_dmamap_sync(fl->ifl_ifdi->idi_tag, fl->ifl_ifdi->idi_map,
BUS_DMASYNC_PREREAD | BUS_DMASYNC_PREWRITE);
// Flush the buffers to the NIC
ctx->isc_rxd_flush(ctx->ifc_softc, rxq->ifr_id, fl->ifl_id,
nm_prev(nic_i, lim));
DBG_COUNTER_INC(rxd_flush);
return (0);
}
Understanding the while (n > 0) Loop in sys/net/iflib.c
The loop we're about to study looks like this:
while (n > 0) {
...
}
It comes from iflib (the Interface Library) in FreeBSD's network stack, in a section of code that connects netmap with network drivers.
Netmap is a high-performance packet I/O framework designed for very fast packet processing. In this context, the kernel uses the loop to refill receive buffers, ensuring the network interface card (NIC) always has space ready to store incoming packets, keeping data flowing smoothly at high speed.
Here, n is simply the number of buffers that still need to be prepared. The loop works through them in efficient batches, processing a few at a time until all are ready. This batching approach reduces overhead and is a common technique in high-performance network drivers.
What the while (n > 0) Really Does
As we've just seen, n is the count of receive buffers still waiting to be prepared. This loop's job is simple in concept:
"Work through those buffers in batches until there are none left."
Each pass of the loop prepares a group of buffers and hands them off to the NIC. If there's still work to do, the loop runs again, ensuring that by the end, all required buffers are ready for incoming packets.
What Happens Inside the while (n > 0) Loop
Each time the loop runs, it processes one batch of buffers. Here's the breakdown:
- Debug Tracking – If the driver is compiled with debugging enabled, it may update counters that track how often large batches of buffers are refilled. This is just for performance monitoring.
- Batch Setup – The driver remembers where this batch starts (
nic_i_first) so it can later tell the NIC exactly which slots were updated. - Inner Batch Processing – Inside the loop, there's another for loop that refills up to a maximum number of buffers at a time (IFLIB_MAX_RX_REFRESH). For each buffer in this batch:
- Look up the buffer's address and physical location in memory.
- Check if the buffer is valid. If not, reinitialise the receive ring.
- Store the physical address and slot index so the NIC knows where to place incoming data.
- If the buffer has changed or this is the first initialisation, update its DMA (Direct Memory Access) mapping.
- Synchronise the buffer for reading so the NIC can safely use it.
- Clear any "buffer changed" flags.
- Move to the next buffer position in the ring.
- Publishing the Batch to the NIC – Once the batch is ready, the driver calls a function to tell the NIC:
"These new buffers are ready for use."
By breaking the work into manageable batches and looping until every buffer is ready, this while loop ensures the NIC is always prepared to receive incoming data without interruption. It's a small but crucial part of keeping packet flow continuous in a high-performance networking environment.
Even if some of the lower-level details—like DMA mapping or ring indices aren't fully clear yet, the key takeaway is this:
Loops like this are the engine that quietly keeps the system running at full speed. As you progress through the book, these concepts will become second nature, and you'll start to recognise similar patterns across many parts of the FreeBSD kernel.
Understanding do...while Loops
A do...while loop is a variation of the while loop where the loop body runs at least once, and then repeats only if the condition remains true:
int i = 0;
do {
printf("%d\n", i);
i++;
} while (i < 10);
- The loop always executes the code inside at least once, even if the condition is false to begin with.
- Afterwards, it checks the condition (
i < 10) to decide whether to repeat.
In the FreeBSD kernel, you'll often see this pattern inside macros designed to behave like single statements. For example, in sys/sys/timespec.h, you'll find an example of his type of loop starting at line 44:
#define TIMESPEC_TO_TIMEVAL(tv, ts) \
do { \
(tv)->tv_sec = (ts)->tv_sec; \
(tv)->tv_usec = (ts)->tv_nsec / 1000; \
} while (0)
What This Macro Does
-
Assign Seconds: Copies
tv_secfrom the source (ts) to the target (tv). -
Convert and Assign Microseconds: Divides
ts->tv_nsecby 1000 to convert nanoseconds to microseconds and stores that intv_usec. -
do...while (0): Wraps the two statements so that when this macro is used, it behaves syntactically like a single statement, even if followed by a semicolon, preventing issues in constructs like:
if (x) TIMESPEC_TO_TIMEVAL(tv, ts);
else ...
While do...while (0) may look odd, it's a solid C idiom used to make macro expansions safe and predictable in all contexts (like inside conditional statements). It ensures that the entire macro behaves like one statement and avoids accidentally creating half-executed code. Understanding this helps you read and avoid subtle bugs in kernel code that rely heavily on macros for clarity and safety.
Understanding break and continue
When working with loops in C, sometimes you need to change the normal flow:
break– Immediately exits the loop, even if the loop condition could still be true.continue– Skips the rest of the current iteration and jumps directly to the loop's next iteration.
Here's a simple example:
for (int i = 0; i < 10; i++) {
if (i == 5)
continue; // Skip the number 5, move to the next i
if (i == 8)
break; // Stop the loop entirely when i reaches 8
printf("%d\n", i);
}
How This Works Step-by-Step
-
The loop starts with
i = 0and runswhile i < 10. -
When
i == 5, the continue statement runs:- The rest of the loop body is skipped.
- The loop moves directly to
i++and checks the condition again.
-
When
i == 8, the break statement runs:- The loop stops immediately.
- Control jumps to the first line of code after the loop.
Output of the Code
0
1
2
3
4
6
7
5 is skipped because of continue.
The loop ends at 8 because of break.
You can see a real example of break and continue usage in the function if_purgeaddrs(ifp) that starts at line 994 of sys/net/if.c file.
/*
* Remove any unicast or broadcast network addresses from an interface.
*/
void
if_purgeaddrs(struct ifnet *ifp)
{
struct ifaddr *ifa;
#ifdef INET6
/*
* Need to leave multicast addresses of proxy NDP llentries
* before in6_purgeifaddr() because the llentries are keys
* for in6_multi objects of proxy NDP entries.
* in6_purgeifaddr()s clean up llentries including proxy NDPs
* then we would lose the keys if they are called earlier.
*/
in6_purge_proxy_ndp(ifp);
#endif
while (1) {
struct epoch_tracker et;
NET_EPOCH_ENTER(et);
CK_STAILQ_FOREACH(ifa, &ifp->if_addrhead, ifa_link) {
if (ifa->ifa_addr->sa_family != AF_LINK)
break;
}
NET_EPOCH_EXIT(et);
if (ifa == NULL)
break;
#ifdef INET
/* XXX: Ugly!! ad hoc just for INET */
if (ifa->ifa_addr->sa_family == AF_INET) {
struct ifreq ifr;
bzero(&ifr, sizeof(ifr));
ifr.ifr_addr = *ifa->ifa_addr;
if (in_control(NULL, SIOCDIFADDR, (caddr_t)&ifr, ifp,
NULL) == 0)
continue;
}
#endif /* INET */
#ifdef INET6
if (ifa->ifa_addr->sa_family == AF_INET6) {
in6_purgeifaddr((struct in6_ifaddr *)ifa);
/* ifp_addrhead is already updated */
continue;
}
#endif /* INET6 */
IF_ADDR_WLOCK(ifp);
CK_STAILQ_REMOVE(&ifp->if_addrhead, ifa, ifaddr, ifa_link);
IF_ADDR_WUNLOCK(ifp);
ifa_free(ifa);
}
}
What this function does
if_purgeaddrs(ifp) removes all non-link-layer addresses from a network interface. In plain words, it walks the list of addresses attached to the interface and deletes unicast or broadcast addresses that belong to IPv4 or IPv6. Some families are handled by calling helpers who update the lists for us. Anything not handled by a helper is explicitly removed and freed.
How the loop is organised
The outer while (1) repeats until there are no more removable addresses. Each pass:
- Enters the network epoch (
NET_EPOCH_ENTER) to safely walk the interface address list. - Scans the list with
CK_STAILQ_FOREACHto find the first address after the `AF_LINK entries. Link-layer entries come first and are not purged here. - Leaves the epoch and then decides what to do with the address it found.
Where the break statements act
Break inside the list scan:
if (ifa->ifa_addr->sa_family != AF_LINK)
break;
The scan stops as soon as it reaches the first non-AF_LINK address. We only need one target per pass.
Break after the scan:
if (ifa == NULL)
break;
If the scan did not find any non-AF_LINK address, there is nothing left to purge. The outer while ends.
Where the continue statements act
IPv4 address handled by ioctl:
if (in_control(...) == 0)
continue;
For IPv4, in_control(SIOCDIFADDR) removes the address and updates the list. Since that work is done, we skip the manual removal below and continue to the next outer-loop pass to look for the next address.
IPv6 address removed by helper:
in6_purgeifaddr((struct in6_ifaddr *)ifa);
/* list already updated */
continue;
For IPv6, in6_purgeifaddr() also updates the list. There is nothing more to do in this pass, so we continue to the next one.
The fallback removal path
If the address was neither handled by the IPv4 nor the IPv6 helpers, the code takes the generic path:
IF_ADDR_WLOCK(ifp);
CK_STAILQ_REMOVE(&ifp->if_addrhead, ifa, ifaddr, ifa_link);
IF_ADDR_WUNLOCK(ifp);
ifa_free(ifa);
This explicitly removes the address from the list and frees it.
In loops, break and continue are precision tools for controlling execution flow. In the if_purgeaddrs() function from FreeBSD's sys/net/if.c, break stops the search when there are no more addresses to remove, or halts the inner scan as soon as a target address is found. continue skips the generic removal step when a specialised IPv4 or IPv6 routine has already handled the work, jumping straight to the next pass through the outer loop. This design lets the function repeatedly find one removable address at a time, remove it using the most appropriate method, and keep going until no non-link-layer addresses remain.
The key takeaway is that well-placed break and continue statements keep loops efficient and focused, avoiding wasted work and making the code's intent clear, a pattern you'll encounter often in FreeBSD's kernel for both clarity and performance.
Pro Tip: Always Use Braces {}
In C, if you omit braces after an if, only one statement is actually controlled by the if. This can easily lead to mistakes:
if (x > 0)
printf("Positive\n"); // Runs only if x > 0
printf("Always runs\n"); // Always runs! Not part of the if
This is a common source of bugs because the second printf appears to be inside the if, but it isn't.
To avoid confusion and accidental logic errors, always use braces, even for a single statement:
if (x > 0) {
printf("Positive\n");
}
This makes your intent explicit, keeps your code safe from subtle changes, and follows the style used in the FreeBSD source tree.
Also Safer for Future Changes
When you always use braces, it's much safer to modify the code later:
if (x > 0) {
printf("x is positive\n");
log_positive(x); // Adding this won't break logic!
}
Summary
In this section, you've learned:
- How to make decisions using if, else, and switch
- How to write loops using for, while, and do...while
- How to exit or skip iterations with break and continue
- How FreeBSD uses control flow to walk lists and make kernel decisions
You now have the tools to control the logic and flow of your programs, which is the core of programming itself.
Functions
In C, a function is like a dedicated workshop in a large factory; it is a self-contained area where a specific task is carried out, start to finish, without disturbing the rest of the production line. When you need that task done, you simply send the work there, and the function delivers the result.
Functions are one of the most important tools you have as a programmer because they let you:
- Break down complexity: Large programs become easier to understand when split into smaller, focused operations.
- Reuse logic: Once written, a function can be called anywhere, saving you from typing (and debugging) the same code repeatedly.
- Improve clarity: A descriptive function name turns a block of cryptic code into a clear statement of intent.
You've already seen functions at work:
main()— the starting point of every C program.printf()— a library function that handles formatted output for you.
In the FreeBSD kernel, you'll find functions everywhere, from low-level routines that copy data between memory regions to specialised ones that communicate with hardware. For example, when a network packet arrives, the kernel doesn't put all the processing logic in one giant block of code. Instead, it calls a series of functions, each responsible for a clear, isolated step in the process.
In this section, you'll learn how to create your own functions, giving you the power to write clean, modular code. This isn't just good style in FreeBSD device driver development; it's the foundation for stability, reusability, and long-term maintainability.
How a Function Call Works in Memory
When your program calls a function, something important happens behind the scenes:
+---------------------+
| Return Address | <- Where to resume execution after the function ends
+---------------------+
| Function Arguments | <- Copies of the values you pass in
+---------------------+
| Local Variables | <- Created fresh for this call
+---------------------+
| Temporary Data | <- Space the compiler needs for calculations
+---------------------+
... Stack ...
Step-by-step:
- The caller pauses - your program stops at the function call and saves the return address on the stack so it knows where to continue afterwards.
- Arguments are placed - the values you pass to the function (parameters) are stored, either in registers or on the stack, depending on the platform.
- Local variables are created — the function gets its own workspace in memory, separate from the caller's variables.
- The function runs — it executes its statements in order, possibly calling other functions along the way.
- A return value is sent back — if the function produces a result, it is placed in a register (commonly eax on x86) for the caller to pick up.
- Cleanup and resume — the function's workspace is removed from the stack, and the program continues where it left off.
Why do you need to understand that?
In kernel programming, every function call has a cost in both time and memory. Understanding this process will help you write efficient driver code and avoid subtle bugs, especially when working with low-level routines where stack space is limited.
Defining and Declaring Functions
Every function in C follows a simple recipe. To create one, you need to specify four things:
-
Return type – what kind of value the function gives back to the caller.
- Example:
intmeans the function will return an integer. - If it doesn't return anything, we use the keyword
void.
- Example:
-
Name – a unique, descriptive label for your function, so you can call it later.
- Example:
read_temperature()is much clearer thanrt().
- Example:
-
Parameters – zero or more values the function needs to do its job.
- Each parameter has its own type and name.
- If there are no parameters, use void in the list to make it explicit.
-
Body – the block of code, enclosed in {} braces, that performs the task.
- This is where you write the actual instructions.
General form:
return_type function_name(parameter_list)
{
// statements
return value; // if return_type is not void
}
Example: A function to add two numbers and return the result
int add(int a, int b)
{
int sum = a + b;
return sum;
}
Declaration vs. Definition
A lot of beginners get tripped up here, so let's make it crystal clear:
- Declaration tells the compiler that a function exists, what it's called, what parameters it takes, and what it returns, but it does not provide the code for it.
- Definition is where you actually write the body of the function, the full implementation that does the work.
Think of it like planning and building a workshop:
- Declaration: putting up a sign saying "This workshop exists, here's what it's called, and here's the kind of work it does."
- Definition: actually building the workshop, stocking it with tools, and hiring workers to do the job.
Example:
// Function declaration (prototype)
int add(int a, int b);
// Function definition
int add(int a, int b)
{
int sum = a + b;
return sum;
}
Why declarations are useful
In small single-file programs, you can just put the definition before you call the function and be done. But in larger programs, especially in FreeBSD drivers, code is often split across many files.
For example:
- The function
mydevice_probe()might be defined inmydevice.c. - Its declaration will go into a header file
mydevice.hso that other parts of the driver, or even the kernel, can call it without knowing the details of how it works.
When the compiler sees the declaration, it knows how to check that calls to mydevice_probe() use the right number and types of parameters, even before it sees the definition.
FreeBSD driver perspective
When writing a driver:
- Declarations often live in
.hheader files. - Definitions live in
.csource files. - The kernel will call your driver's functions (like
probe(),attach(),detach()) based on the declarations it sees in your driver's headers, without caring exactly how you implement them as long as the signatures match.
Understanding this difference will save you a lot of compiler errors, especially "implicit declaration of function" or "undefined reference" errors, which are among the most common mistakes beginners hit when starting with C.
How Declarations and Definitions Work Together
In small programs, you might write the function's definition before main() and be done.
But in real projects, like a FreeBSD device driver, code is split into header files (.h) for declarations and source files (.c) for definitions.
+----------------------+
| mydevice.h | <-- Header file
|----------------------|
| int my_probe(void); | // Declaration (prototype)
| int my_attach(void); | // Declaration
+----------------------+
|
v
+----------------------+
| mydevice.c | <-- Source file
|----------------------|
| #include "mydevice.h"| // Include declarations
| |
| int my_probe(void) | // Definition
| { |
| /* detect hw */ |
| return 0; |
| } |
| |
| int my_attach(void) | // Definition
| { |
| /* init hw */ |
| return 0; |
| } |
+----------------------+
How it works:
- Declaration in the header file tells the compiler: "These functions exist somewhere, here's what they look like."
- Definition in the source file provides the actual code.
- Any other
.cfile that includesmydevice.hcan now call these functions, and the compiler will check the parameters and return types. - At link time, the function calls are connected to their definitions.
In the context of FreeBSD drivers:
- You might have
mydevice.ccontaining the driver logic, andmydevice.hholding the function declarations shared across the driver. - The kernel build system will compile your
.cfiles and link them into a kernel module. - If the declarations don't match the definitions exactly, you'll get compiler errors — which is why keeping them in sync is critical.
Common mistakes with functions and how to fix them
-
Calling a function before the compiler knows it exists Symptom: "implicit declaration of function" warning or error. Fix: Add a declaration in a header file and include it, or place the definition above its first use.
-
Declaration and definition do not match Symptom: "conflicting types" or odd runtime bugs. Fix: Make the signature identical in both places. Same return type, parameter types, and qualifiers in the same order.
-
Forgetting
voidfor a function with no parameters Symptom: The compiler may think the function takes unknown arguments. Fix: Use intmy_fn(void)instead of intmy_fn(). -
Returning a value from a
voidfunction or forgetting to return a value Symptom: "void function cannot return a value" or "control reaches end of non-void function." Fix: For non-void functions, always return the right type. Forvoid, do not return a value. -
Returning pointers to local variables Symptom: Random crashes or garbage data. Fix: Do not return the address of a stack variable. Use dynamically allocated memory or pass a buffer in as a parameter.
-
Mismatched
constor pointer levels between declaration and definition Symptom: Type mismatch errors or subtle bugs. Fix: Keep qualifiers consistent. If the declaration hasconst char *, the definition must match exactly. -
Multiple definitions across files Symptom: Linker error "multiple definition of …". Fix: Only one definition per function. If a helper should be private to a file, mark it
staticin that.cfile. -
Putting function definitions in headers by accident Symptom: Multiple definition linker errors when the header is included by several
.cfiles. Fix: Headers should usually have declarations only. If you really need code in a header, make itstatic inlineand keep it small. -
Missing includes for functions you call Symptom: Implicit declarations or wrong default types. Fix: Include the correct system or project header that declares the function you are calling, for example
#include <stdio.h>forprintf. -
Kernel specific: undefined symbols when building a module Symptom: Linker error "undefined reference" while building your KMOD. Fix: Ensure the function is actually defined in your module or exported by the kernel, that the declaration matches the definition, and that the right source files are part of the module build.
-
Kernel specific: using a helper that is meant to be file local Symptom: "undefined reference" from other files or unexpected symbol visibility. Fix: Mark internal helpers as
staticto restrict visibility. Expose only what other files must call through your header. -
Choosing poor names Symptom: Hard to read code and name collisions. Fix: Use descriptive, project prefixed names, for example
mydev_read_reg, notreadreg.
Hands-on exercise: Splitting Declarations and Definitions
For this exercise, we will create 3 files.
mydevice.h - This header file declares the functions and makes them available to any .c file that includes it.
#ifndef MYDEVICE_H
#define MYDEVICE_H
// Function declarations (prototypes)
void mydevice_probe(void);
void mydevice_attach(void);
void mydevice_detach(void);
#endif // MYDEVICE_H
mydevice.c - This source file contains the actual definitions (the working code).
#include <stdio.h>
#include "mydevice.h"
// Function definitions
void mydevice_probe(void)
{
printf("[mydevice] Probing hardware... done.\n");
}
void mydevice_attach(void)
{
printf("[mydevice] Attaching device and initializing resources...\n");
}
void mydevice_detach(void)
{
printf("[mydevice] Detaching device and cleaning up.\n");
}
main.c - This is the "user" of the functions. It just includes the header and calls them.
#include "mydevice.h"
int main(void)
{
mydevice_probe();
mydevice_attach();
mydevice_detach();
return 0;
}
How to Compile and Run on FreeBSD
Open a terminal in the folder with the three files and run:
cc -Wall -o myprogram main.c mydevice.c
./myprogram
Expected output:
[mydevice] Probing hardware... done.
[mydevice] Attaching device and initialising resources...
[mydevice] Detaching device and cleaning up.
Why this matters for FreeBSD driver development
In a real FreeBSD kernel module,
mydevice.hwould hold your driver's public API (function declarations).mydevice.cwould have the full implementations of those functions.- The kernel (or other parts of the driver) would include the header to know how to call into your code, without needing to see the actual implementation details.
This exact pattern is how probe(), attach(), and detach() routines are structured in actual device drivers. Learning it now will make those later chapters feel familiar.
Understanding the relationship between declarations and definitions is a cornerstone of C programming, and it becomes even more important when you step into the world of FreeBSD device drivers. In kernel development, functions are rarely defined and used in the same file; they are spread across multiple source and header files, compiled separately, and linked together into a single module. A clear separation between what a function does (its declaration) and how it does it (its definition) keeps code organized, reusable, and easier to maintain. Master this concept now, and you'll be well-prepared for the more complex modular structures you'll encounter when we begin building real kernel drivers.
Calling Functions
Once you've defined a function, the next step is to call it, that is, to tell the program, "Hey, go run this block of code now and give me the result".
Calling a function is as simple as writing its name followed by parentheses containing any required arguments.
If the function returns a value, you can store that value in a variable, pass it to another function, or use it directly in an expression.
Example:
int result = add(3, 4);
printf("Result is %d\n", result);
Here's what happens step-by-step when this code runs:
- The program encounters
add(3, 4)and pauses its current work. - It jumps to the
add()function's definition, giving it two arguments:3and4. - Inside
add(), the parametersaandbreceive the values3and4. - The function calculates
sum = a + band then executesreturn sum;. - The returned value
7travels back to the calling point and gets stored in the variableresult. - The
printf()function then displays:
Result is 7
FreeBSD Driver Connection
When you call a function in a FreeBSD driver, you're often asking the kernel or your own driver logic to perform a very specific task, for example:
- Calling
bus_space_read_4()to read a 32-bit hardware register. - Calling your own
mydevice_init()to prepare a device for use.
The principle is exactly the same as the add() example:
The function takes parameters, does its job, and returns control to where it was called. The difference in kernel space is that the "job" might involve talking directly to hardware or managing system resources, but the calling process is identical.
Tip for Beginners
Even if a function doesn't return a value (its return type is void), calling it still triggers its entire body to run. In drivers, many important functions don't return anything but perform critical work like initializing hardware or setting up interrupts.
Function Call Flow When your program calls a function, control jumps from the current point in your code to the function's definition, runs its statements, and then comes back. Example flow for add(3, 4) inside main():
main() starts
|
v
Calls add(3, 4) -----------+
| |
v |
Inside add(): |
a = 3 |
b = 4 |
sum = a + b // sum=7 |
return sum; |
| |
+----------- Back to main()
|
v
result = 7
printf("Result is 7")
main() ends
What to notice:
- The program's "path" temporarily leaves
main()when the function is called. - The parameters in the function get copies of the values passed in.
- The return statement sends a value back to where the function was called.
- After the call, execution continues right where it left off.
FreeBSD driver analogy:
When the kernel calls your driver's attach() function, the exact same process happens. The kernel jumps into your code, you run your initialization logic, and then control returns to the kernel so it can continue loading devices. Whether in user space or kernel space, function calls follow the same flow.
Try It Yourself – Simulating a Driver Function Call
In this exercise, you'll write a small program that mimics calling a driver function to read a "hardware register" value.
We'll simulate it in user space so you can compile and run it easily on your FreeBSD system.
Step 1 — Define the function
Create a file called driver_sim.c and start with this function:
#include <stdio.h>
// Function definition
unsigned int read_register(unsigned int address)
{
// Simulated register value based on address
unsigned int value = address * 2;
printf("[driver] Reading register at address 0x%X...\n", address);
return value;
}
Step 2 — Call the function from main()
In the same file, add main() below your function:
int main(void)
{
unsigned int reg_addr = 0x10; // pretend this is a hardware register address
unsigned int data;
data = read_register(reg_addr); // Call the function
printf("[driver] Value read: 0x%X\n", data);
return 0;
}
Step 3 — Compile and run
% cc -Wall -o driver_sim driver_sim.c
./driver_sim
Expected output:
[driver] Reading register at address 0x10...
[driver] Value read: 0x20
What You Learned
- You called a function by name, passing it a parameter.
- The parameter got a copy of your value (
0x10) inside the function. - The function calculated a result and sent it back with
return. - Execution continued exactly where it left off.
In a real driver, read_register() might use the bus_space_read_4() kernel API to access a physical hardware register instead of multiplying a number. The function call flow, however, is exactly the same.
Functions with No Return Value: void
Not every function needs to send data back to the caller.
Sometimes, you just want the function to do something, print a message, initialise hardware, log a status and then finish.
In C, when a function does not return anything, you declare its return type as void.
Example:
void say_hello(void)
{
printf("Hello, World!\n");
}
Here's what's happening:
- void before the name means: "This function will not return a value".
- The
(void)in the parameter list means: "This function takes no arguments". - Inside the braces
{}, we place the statements we want to execute when the function is called.
Calling it:
say_hello();
This will print:
Hello, World!
Common beginner mistakes with void functions
-
Forgetting
voidin the parameter listvoid say_hello() // Works, but less explicit — avoid in new code void say_hello(void) // Best practice
In old C code, () without void means "this function takes an unspecified number of arguments", which can cause confusion.
-
Trying to return a value from a void function
void test(void) { return 42; // Compiler error } -
Assigning the result of a void function
int x = say_hello(); // Compiler error
Now that you've seen the most common pitfalls, let's take a step back and understand why the void keyword is important in the first place.
Why void matters
Marking a function with void clearly tells both the compiler and human readers that this function's purpose is to perform an action, not to produce a result.
If you try to use the "return value" from a void function, the compiler will stop you, which helps catch mistakes early.
FreeBSD driver perspective
In FreeBSD drivers, many important functions are void because they are all about doing work, not returning data.
For example:
mydevice_reset(void)— might reset the hardware to a known state.mydevice_led_on(void)— might turn on a status LED.mydevice_log_status(void)— might print debugging information to the kernel log.
The kernel doesn't care about a return value in these cases, it just expects your function to perform its action.
While void functions in drivers don't return values, that doesn't mean they can't communicate important information. There are still several ways to signal events or issues back to the rest of the system.
Tip for Beginners
In driver code, even though void functions don't return data, they can still report errors or events by:
- Writing to a global or shared variable.
- Logging messages with
device_printf()orprintf(). - Triggering other functions that handle error states.
Understanding void functions is important because in real-world FreeBSD driver development, not every task produces data to return; many simply perform an action that prepares the system or the hardware for something else. Whether it's initializing a device, cleaning up resources, or logging a status message, these functions still play a critical role in the overall behavior of your driver. By recognizing when a function should return a value and when it should simply do its job and return nothing, you'll write cleaner, more purposeful code that matches the way the FreeBSD kernel itself is structured.
Function Declarations (Prototypes)
In C, it's a good habit and often essential to declare a function before you use it.
A function declaration, also called a prototype, tells the compiler:
- The function's name.
- The type of value it returns (if any).
- The number, types, and order of its parameters.
This way, the compiler can check that your function calls are correct, even if the actual definition (the body of the function) appears later in the file or in a different file entirely.
Let's see a basic example
#include <stdio.h>
// Function declaration (prototype)
int add(int a, int b);
int main(void)
{
int result = add(3, 4);
printf("Result: %d\n", result);
return 0;
}
// Function definition
int add(int a, int b)
{
return a + b;
}
When the compiler reads the prototype for add() before main(), it immediately knows:
- the function's name is
add, - it takes two
intparameters, and - it will return an
int.
Later, when the compiler finds the definition, it checks that the name, parameters, and return type match the prototype exactly. If they don't, it raises an error.
Why prototypes matter
Placing the prototype before a function is called provides several benefits:
-
Prevents unnecessary warnings and errors: If you call a function before the compiler knows it exists, you'll often get an "implicit declaration of function" warning or even a compilation error.
-
Catches mistakes early: If your call passes the wrong number or types of arguments, the compiler will flag the problem immediately instead of letting it cause unpredictable behaviour at runtime.
-
Enables modular programming: Prototypes allow you to split your program into multiple source files. You can keep the function definitions in one file and the calls to them in another, with the prototypes stored in a shared header file.
By declaring your functions before you use them, either at the top of your .c file or in a .h header, you're not just keeping the compiler happy; you're building code that's easier to organise, maintain, and scale.
Now that you understand why prototypes are important, let's look at the two most common places to put them: directly in your .c file or in a shared header file.
Prototypes in header files
Although you can write prototypes directly at the top of your .c file, the more common and scalable approach is to place them in header files (.h).
This allows multiple .c files to share the same declarations without repeating them.
Example:
mathutils.h
#ifndef MATHUTILS_H
#define MATHUTILS_H
int add(int a, int b);
int subtract(int a, int b);
#endif // MATHUTILS_H
main.c
#include <stdio.h>
#include "mathutils.h"
int main(void)
{
printf("Sum: %d\n", add(3, 4));
printf("Difference: %d\n", subtract(10, 4));
return 0;
}
mathutils.c
#include "mathutils.h"
int add(int a, int b)
{
return a + b;
}
int subtract(int a, int b)
{
return a - b;
}
This pattern keeps your code organized and avoids having to manually keep multiple prototypes in sync across files.
FreeBSD driver perspective
In FreeBSD driver development, prototypes are essential because the kernel often needs to call into your driver without knowing how your functions are implemented.
For example, in your driver's header file you might declare:
int mydevice_init(void);
void mydevice_start_transmission(void);
These tell the kernel or bus subsystem that your driver has these functions available, even if the actual definitions live deep inside your .c files.
The build system compiles all the pieces together and links the calls to the correct implementations.
Try It Yourself – Moving a Function Below main()
One of the main reasons to use prototypes is so you can call a function that hasn't been defined yet in the file. Let's see this in action.
Step 1 — Start without a prototype
#include <stdio.h>
int main(void)
{
int result = add(3, 4); // This will cause a compiler warning or error
printf("Result: %d\n", result);
return 0;
}
// Function definition
int add(int a, int b)
{
return a + b;
}
Compile it:
cc -Wall -o testprog testprog.c
You'll likely get a warning such as:
testprog.c:5:18:
warning: call to undeclared function 'add'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
5 | int result = add(3, 4); // This will cause a compiler warning or error
| ^
1 warning generated.
Step 2 — Fix it with a prototype
Add the function prototype before main() like this:
#include <stdio.h>
// Function declaration (prototype)
int add(int a, int b);
int main(void)
{
int result = add(3, 4); // No warnings now
printf("Result: %d\n", result);
return 0;
}
// Function definition
int add(int a, int b)
{
return a + b;
}
Recompile, the warning is gone, and the program runs:
Result: 7
Note: Depending on the compiler you use, the warning message might look a little different from the example shown above, but the meaning will be the same.
By adding a prototype, you've just seen how the compiler can recognize a function and validate its use even before it sees the actual code. This same principle is what allows the FreeBSD kernel to call into your driver; it doesn't need the whole function body up front, only the declaration. In the next section, we'll look at how this works in a real driver, where prototypes in header files act as the kernel's "map" to your driver's capabilities.
FreeBSD Driver Connection
In the FreeBSD kernel, function prototypes are the way the system "introduces" your driver's functions to the rest of the codebase.
When the kernel wants to interact with your driver, it doesn't search for the function's code directly; it relies on the function's declaration to know the name, parameters, and return type.
For example, during device detection, the kernel might call your probe() function to check whether a specific piece of hardware is present. The actual definition of probe() could be deep inside your mydriver.c file, but the prototype lives in your driver's header file (mydriver.h). That header is included by the kernel or bus subsystem so it can compile code that calls probe() without needing to see its full implementation.
This arrangement ensures two critical things:
- Compiler validation: The compiler can confirm that any calls to your functions use the correct parameters and return type.
- Linker resolution: When building the kernel or your driver module, the linker knows exactly which compiled function body to connect to the calls.
Without correct prototypes, the kernel build could fail or, worse, compile but behave unpredictably at runtime. In kernel programming, that's not just a bug, it could mean a crash.
Example — Prototypes in a FreeBSD Driver
mydriver.h — Driver header file with prototypes:
#ifndef _MYDRIVER_H_
#define _MYDRIVER_H_
#include <sys/types.h>
#include <sys/bus.h>
// Public entry points: declared here so bus/framework code can call them
// These are the entry points the kernel will call during the device's lifecycle
// Function prototypes (declarations)
int mydriver_probe(device_t dev);
int mydriver_attach(device_t dev);
int mydriver_detach(device_t dev);
#endif /* _MYDRIVER_H_ */
Here, we declare three key entry points probe(), attach(), and detach(), but don't include their bodies.
The kernel or bus subsystem will include this header so it knows how to call these functions during device lifecycle events.
mydriver.c — Driver source file with definitions:
/*
* FreeBSD device driver lifecycle (quick map)
*
* 1) Kernel enumerates devices on a bus
* The bus framework walks hardware and creates device_t objects.
*
* 2) probe()
* The kernel asks your driver if it supports a given device.
* You inspect IDs or capabilities and return a score.
* - Return ENXIO if this driver does not match.
* - Return BUS_PROBE_DEFAULT or a better score if it matches.
*
* 3) attach()
* Called after a successful probe to bring the device online.
* Typical work:
* - Allocate resources (memory, IRQ) with bus_alloc_resource_any()
* - Map registers and set up bus_space
* - Initialize hardware to a known state
* - Set up interrupts and handlers
* Return 0 on success, or an errno if something fails.
*
* 4) Runtime
* Your driver services requests. This may include:
* - Interrupt handlers
* - I/O paths invoked by upper layers or devfs interfaces
* - Periodic tasks, callouts, or taskqueues
*
* 5) detach()
* Called when the device is being removed or the module unloads.
* Cleanup tasks:
* - Quiesce hardware, stop DMA, disable interrupts
* - Tear down handlers and timers
* - Unmap registers and release resources with bus_release_resource()
* Return 0 on success, or an errno if detach must be denied.
*
* 6) Optional lifecycle events
* - suspend() and resume() during power management
* - shutdown() during system shutdown
*
* Files to remember
* - mydriver.h declares the entry points that the kernel and bus code will call
* - mydriver.c defines those functions and contains the implementation details
*/
#include <sys/param.h>
#include <sys/kernel.h>
#include <sys/module.h>
#include <sys/bus.h>
#include <sys/systm.h> // device_printf
#include "mydriver.h"
/*
* mydriver_probe()
* Called early during device enumeration.
* Purpose: decide if this driver matches the hardware represented by dev.
* Return: BUS_PROBE_DEFAULT for a normal match, a better score for a strong match,
* or ENXIO if the device is not supported.
*/
int
mydriver_probe(device_t dev)
{
device_printf(dev, "Probing device...\n");
/*
* Here you would usually check vendor and device IDs or use bus-specific
* helper routines. If the device is not supported, return (ENXIO).
*/
return (BUS_PROBE_DEFAULT);
}
/*
* mydriver_attach()
* Called after a successful probe when the kernel is ready to attach the device.
* Purpose: allocate resources, map registers, initialise hardware, register interrupts,
* and make the device ready for use.
* Return: 0 on success, or an errno value (like ENOMEM or EIO) on failure.
*/
int
mydriver_attach(device_t dev)
{
device_printf(dev, "Attaching device and initializing resources...\n");
/*
* Typical steps you will add here:
* 1) Allocate device resources (I/O memory, IRQs) with bus_alloc_resource_any().
* 2) Map register space and set up bus_space tags and handles.
* 3) Initialise hardware registers to a known state.
* 4) Set up interrupt handlers if needed.
* 5) Create device nodes or child devices if this driver exposes them.
* On any failure, release what you allocated and return an errno.
*/
return (0);
}
/*
* mydriver_detach()
* Called when the device is being detached or the module is unloading.
* Purpose: stop the hardware, free resources, and leave the system clean.
* Return: 0 on success, or an errno value if detach must be refused.
*/
int
mydriver_detach(device_t dev)
{
device_printf(dev, "Detaching device and cleaning up...\n");
/*
* Typical steps you will add here:
* 1) Disable interrupts and stop DMA or timers.
* 2) Tear down interrupt handlers.
* 3) Unmap register space and free bus resources with bus_release_resource().
* 4) Destroy any device nodes or sysctl entries created at attach time.
*/
return (0);
}
Why this works:
- The
.hfile exposes only the function interfaces to the rest of the kernel. - The
.cfile contains the full implementations of the functions declared in the header. - The build system compiles all the source files, and the linker connects calls to the correct function bodies.
- The kernel can call these functions without knowing how they work internally; it only needs the prototypes.
Understanding how the kernel uses your driver's function prototypes is more than just a formality; it's a safeguard for correctness and stability. In kernel programming, even a slight mismatch between a declaration and a definition can lead to build failures or unpredictable runtime behaviour. That's why experienced FreeBSD developers follow a few best practices to keep their prototypes clean, consistent, and easy to maintain. Let's go over some of those tips next.
Tip for Kernel Code
When you start writing FreeBSD drivers, function prototypes aren't just a formality; they're a key part of keeping your code organised and error-free in a large, multi-file project. In the kernel, where functions are often called from deep within the system, a mismatch between a declaration and its definition can cause build failures or subtle bugs that are hard to track down.
To avoid problems and keep your headers clean:
- Always match parameter types exactly between the declaration and the definition; the return type, parameter list, and order must be identical.
- Include qualifiers like
constand*consistently so you don't accidentally change how parameters are treated between the declaration and the definition. - Group related prototypes together in header files so they're easy to find. For example, put all initialisation functions in one section, and hardware access functions in another.
Function prototypes may seem like a small detail in C, but they are the glue that holds multi-file projects and especially kernel code together. By declaring your functions before they are used, you give the compiler the information it needs to catch mistakes early, keep your code organised, and allow different parts of a program to communicate cleanly.
In FreeBSD driver development, well-structured prototypes in header files enable the kernel to interact with your driver reliably, without knowing its internal details. Mastering this habit now is non-negotiable if you want to write stable, maintainable drivers.
In the next section, we'll explore real examples from the FreeBSD source tree to see exactly how prototypes are used throughout the kernel, from core subsystems to actual device drivers. This will not only reinforce what you've learned here, but also help you recognise the patterns and conventions that experienced FreeBSD developers follow every day.
Real Example from the FreeBSD 14.3 Source Tree: device_printf()
Now that you understand how function declarations and definitions work, let's walk through a concrete example from the FreeBSD kernel. We will follow device_printf() from its prototype in a header, to its definition in the kernel source, and finally to a real driver that calls it during initialisation. This shows the full path a function takes in real code and why prototypes are critical in driver development.
1) Prototype — where it is declared
The device_printf() function is declared in the FreeBSD kernel's bus interface header sys/sys/bus.h. Any driver source that includes this header can call it safely because the compiler knows its signature in advance.
int device_printf(device_t dev, const char *, ...) __printflike(2, 3);
What each part means:
intis the return type. The function returns the number of characters printed, similar toprintf(9).device_t devis a handle to the device that owns the message, which allows the kernel to prefix the output with the device name and unit, for examplevtnet0:.const char *is the format string, the same idea used byprintf....indicates a variable argument list. You can pass values that match the format string.__printflike(2, 3)is a compiler hint used in FreeBSD. It tells the compiler that parameter 2 is the format string and that type checking for additional arguments starts at parameter 3. This enables compile time checks for format specifiers and argument types.
Because this declaration lives in a shared header, any driver that includes <sys/sys/bus.h> can call device_printf() without needing to know how it is implemented.
2) Definition — where it is implemented
Here is the actual implementation of device_printf() in sys/kern/subr_bus.c from FreeBSD 14.3. The function builds a prefix with the device name and unit, appends your formatted message, and counts how many characters are produced. I have added extra comments to help you understand how this function works.
/**
* @brief Print the name of the device followed by a colon, a space
* and the result of calling vprintf() with the value of @p fmt and
* the following arguments.
*
* @returns the number of characters printed
*/
int
device_printf(device_t dev, const char * fmt, ...)
{
char buf[128]; // Fixed buffer for sbuf to use
struct sbuf sb; // sbuf structure that manages safe string building
const char *name; // Will hold the device's base name (e.g., "igc")
va_list ap; // Handle for variable argument list
size_t retval; // Count of characters produced by the drain
retval = 0; // Initialise the output counter
sbuf_new(&sb, buf, sizeof(buf), SBUF_FIXEDLEN);
// Initialise sbuf 'sb' over 'buf' with fixed length
sbuf_set_drain(&sb, sbuf_printf_drain, &retval);
// Set a "drain" callback that counts characters
// Every time sbuf emits bytes, sbuf_printf_drain
// updates 'retval' through this pointer
name = device_get_name(dev); // Query the device base name (may be NULL)
if (name == NULL) // If we do not know the name
sbuf_cat(&sb, "unknown: "); // Prefix becomes "unknown: "
else
sbuf_printf(&sb, "%s%d: ", name, device_get_unit(dev));
// Otherwise prefix "name" + unit number, e.g., "igc0: "
va_start(ap, fmt); // Start reading the variable arguments after 'fmt'
sbuf_vprintf(&sb, fmt, ap); // Append the formatted message into the sbuf
va_end(ap); // Clean up the variable argument list
sbuf_finish(&sb); // Finalise the sbuf so its contents are complete
sbuf_delete(&sb); // Release sbuf resources associated with 'sb'
return (retval); // Return the number of characters printed
}
What to notice
- The code uses sbuf to assemble the message safely. The drain callback updates retval so the function can return the number of characters produced.
- The device prefix comes from
device_get_name()anddevice_get_unit(). If the name is not available, it falls back tounknown:. - It accepts a format string and variable arguments, handled by
va_list,va_start, andva_end, then forwards them tosbuf_vprintf().
3) Real driver use — where it is called in practice
Here is a clear example from sys/dev/virtio/virtqueue.c that calls device_printf() while initialising a virtqueue to use indirect descriptors. And like I did for step 2 above, I have added extra comments to help you understand how this function works.
static int
virtqueue_init_indirect(struct virtqueue *vq, int indirect_size)
{
device_t dev;
struct vq_desc_extra *dxp;
int i, size;
dev = vq->vq_dev; // Cache the device handle for logging and feature checks
if (VIRTIO_BUS_WITH_FEATURE(dev, VIRTIO_RING_F_INDIRECT_DESC) == 0) {
/*
* Driver asked to use indirect descriptors, but the device did
* not negotiate this feature. We do not fail the init here.
* Return 0 so the queue can still be used without this feature.
*/
if (bootverbose)
device_printf(dev, "virtqueue %d (%s) requested "
"indirect descriptors but not negotiated\n",
vq->vq_queue_index, vq->vq_name);
return (0); // Continue without indirect descriptors
}
size = indirect_size * sizeof(struct vring_desc); // Total bytes for one indirect table
vq->vq_max_indirect_size = indirect_size; // Remember maximum entries per indirect table
vq->vq_indirect_mem_size = size; // Remember bytes per indirect table
vq->vq_flags |= VIRTQUEUE_FLAG_INDIRECT; // Mark the queue as using indirect descriptors
for (i = 0; i < vq->vq_nentries; i++) { // For each descriptor in the main queue
dxp = &vq->vq_descx[i]; // Access per-descriptor extra bookkeeping
dxp->indirect = malloc(size, M_DEVBUF, M_NOWAIT);
// Allocate an indirect descriptor table for this entry
if (dxp->indirect == NULL) {
device_printf(dev, "cannot allocate indirect list\n");
// Tag the error with the device name and unit
return (ENOMEM); // Tell the caller that allocation failed
}
dxp->indirect_paddr = vtophys(dxp->indirect);
// Record the physical address for DMA use
virtqueue_init_indirect_list(vq, dxp->indirect);
// Initialise the table contents to a known state
}
return (0); // Success. The queue now supports indirect descriptors
}
What this driver code is doing
This helper prepares a virtqueue to use indirect descriptors, a VirtIO feature that allows each top level descriptor to reference a separate table of descriptors. That makes it possible to describe larger I/O requests efficiently. The function first checks whether the device actually negotiated the VIRTIO_RING_F_INDIRECT_DESC feature. If not, and if bootverbose is enabled, it uses device_printf() to log an informative message that includes the device prefix, then carries on without the feature. If the feature is present, it computes the size of the indirect descriptor table, marks the queue as indirect capable, and iterates over every descriptor in the ring. For each one it allocates an indirect table, logs an error with device_printf() if allocation fails, records the physical address for DMA, and initialises the table. This is a typical pattern in real drivers: check a feature, allocate resources, log meaningful messages tagged with the device, and handle errors cleanly.
Why this example matters
You have now seen the complete journey:
- Prototype in a shared header tells the compiler how to call the function and enables compile time checks.
- Definition in the kernel source implements the behaviour, using helpers like sbuf to assemble messages safely.
- Real usage in a driver shows how the function is called during initialisation and error paths, producing logs that are easy to trace back to a specific device.
This is the same pattern you will follow when writing your own driver helpers. Declare them in your header so the rest of the driver, and sometimes the kernel, can call them. Implement them in your .c files with small, focused logic. Call them from probe(), attach(), interrupt handlers, and teardown. Prototypes are the bridge that lets these pieces work together cleanly.
By now, you've seen how a function prototype, its implementation, and its real-world usage come together inside the FreeBSD kernel. From the declaration in a shared header, through the implementation in kernel code, to the call site inside a real driver, each step shows why prototypes are the "glue" that lets different parts of the system communicate cleanly. In driver development, they ensure the kernel can call into your code with complete confidence about the parameters and return type no guesswork, no surprises. Getting this right is a matter of both correctness and maintainability, and it's a habit you'll use in every driver you write.
Before we go further into writing complex driver logic, we need to understand one of the most fundamental concepts in C programming: variable scope. Scope determines where a variable can be accessed in your code, how long it stays alive in memory, and what parts of the program can modify it. In FreeBSD driver development, misunderstanding scope can lead to elusive bugs from uninitialised values corrupting hardware state to variables mysteriously changing between function calls. By mastering scope rules, you'll gain fine-grained control over your driver's data, ensuring that values are only visible where they should be, and that critical state is preserved or isolated as needed. In the next section, we'll break down scope into clear, practical categories and show you how to apply them effectively in kernel code.
Variable Scope in Functions
In programming, scope defines the boundaries within which a variable can be seen and used. In other words, it tells us where in the code a variable is visible and who is allowed to read or change its value.
When a variable is declared inside a function, we say it has local scope. Such a variable comes into existence when the function starts running and disappears as soon as the function finishes. No other function can see it, and even within the same function, it may be invisible if declared inside a more restricted block, such as inside a loop or an if statement.
This form of isolation is a powerful safeguard. It prevents accidental interference from other parts of the program, ensures that one function cannot inadvertently change the internal workings of another, and makes the program's behaviour more predictable. By keeping variables confined to the places they are needed, you make your code easier to reason about, maintain, and debug.
To make this idea more concrete, let's look at a short example in C. We'll create a function with a variable that lives entirely inside it. You'll see how the variable works perfectly within its own function, but becomes completely invisible the moment we step outside that function's boundaries.
#include <stdio.h>
void print_number(void) {
int x = 42; // x has local scope: only visible inside print_number()
printf("%d\n", x);
}
int main(void) {
print_number();
// printf("%d\n", x); // ERROR: 'x' is not in scope here
return 0;
}
Here, the variable x is declared inside print_number(), which means it is created when the function starts and destroyed when the function ends. If we try to use x in main(), the compiler complains because main() has no knowledge of x—it lives in a separate, private workspace. This "one workspace per function" rule is one of the foundations of reliable programming: it keeps code modular, avoids accidental changes from unrelated parts of the program, and helps you reason about the behaviour of each function independently.
Why Local Scope Is Good
Local scope brings three key benefits to your code:
- Prevents bugs — a variable inside one function cannot accidentally overwrite or be overwritten by another function's variable, even if they share the same name.
- Keeps code predictable — you always know exactly where a variable can be read or modified, making it easier to follow and reason about the program's flow.
- Improves efficiency — the compiler can often keep local variables in CPU registers, and any stack space they use is automatically freed when the function returns.
By keeping variables confined to the smallest area where they're needed, you reduce the chances of interference, make debugging easier, and help the compiler optimise performance.
Why scope matters in driver development
In FreeBSD device drivers, you'll often manipulate temporary values—buffer sizes, indices, error codes, flags that are relevant only within a specific operation (e.g., probing a device, initialising a queue, handling an interrupt). Keeping these values local prevents cross-talk between concurrent paths and avoids subtle race conditions. In kernel space, small mistakes propagate fast; tight, local scope is your first line of defence.
From Simple Scope to Real Kernel Code
You've just seen how a local variable inside a small C program lives and dies within its function. Now, let's step into a real FreeBSD driver and see exactly the same principle at work, but this time in code that interacts with actual hardware.
We'll look at part of the VirtIO subsystem, which is used for virtual devices in environments like QEMU or bhyve. This example comes from the function virtqueue_init_indirect() that is located between the lines 230 and 271 in the file sys/dev/virtio/virtqueue.c in FreeBSD 14.3 source code, which sets up "indirect descriptors" for a virtual queue. Watch how variables are declared, used, and limited to the function's own scope, just like in our earlier print_number() example.
Note: I've added some extra comments to highlight what's happening at each step.
static int
virtqueue_init_indirect(struct virtqueue *vq, int indirect_size)
{
// Local variable: holds the device reference for easy access
// Only exists inside this function
device_t dev;
// Local variable: temporary pointer to a descriptor structure
// Used during loop iterations to point to the right element
struct vq_desc_extra *dxp;
// Local variables: integer values used for temporary calculations
// 'i' will be our loop counter, 'size' will hold the calculated memory size
int i, size;
// Initialise 'dev' with the device associated with this virtqueue
// 'dev' is local, so it's only valid here — other functions cannot touch it
dev = vq->vq_dev;
// Check if the device supports the INDIRECT_DESC feature
// This is done through a bus-level feature negotiation
if (VIRTIO_BUS_WITH_FEATURE(dev, VIRTIO_RING_F_INDIRECT_DESC) == 0) {
/*
* If the driver requested indirect descriptors, but they were not
* negotiated, we print a message (only if bootverbose is on).
* Then we return 0 to indicate initialisation continues without them.
*/
if (bootverbose)
device_printf(dev, "virtqueue %d (%s) requested "
"indirect descriptors but not negotiated\n",
vq->vq_queue_index, vq->vq_name);
return (0); // At this point, all locals are destroyed
}
// Calculate the memory size needed for the indirect descriptors
size = indirect_size * sizeof(struct vring_desc);
// Store these values in the virtqueue structure for later use
vq->vq_max_indirect_size = indirect_size;
vq->vq_indirect_mem_size = size;
// Mark this virtqueue as using indirect descriptors
vq->vq_flags |= VIRTQUEUE_FLAG_INDIRECT;
// Loop through all entries in the virtqueue
for (i = 0; i < vq->vq_nentries; i++) {
// Point 'dxp' to the i-th descriptor entry in the queue
dxp = &vq->vq_descx[i];
// Allocate memory for the indirect descriptor list
dxp->indirect = malloc(size, M_DEVBUF, M_NOWAIT);
// If allocation fails, log an error and stop initialisation
if (dxp->indirect == NULL) {
device_printf(dev, "cannot allocate indirect list\n");
return (ENOMEM); // Locals are still destroyed upon return
}
// Get the physical address of the allocated memory
dxp->indirect_paddr = vtophys(dxp->indirect);
// Initialise the allocated descriptor list
virtqueue_init_indirect_list(vq, dxp->indirect);
}
// Successfully initialised indirect descriptors — locals end their life here
return (0);
}
Understanding the Scope in This Code
Even though this is production-level kernel code, the principle is the same as in the tiny example we just saw. The variables dev, dxp, i, and size are all declared inside virtqueue_init_indirect() and exist only while this function is running. Once the function returns, whether it's at the end or early via a return statement, those variables vanish, freeing their stack space for other uses.
Notice that this keeps things safe: the loop counter i can't be accidentally reused in another part of the driver, and the dxp pointer is re-initialised for each call to the function. In driver development, this is a critical local scope that ensures that temporary work variables won't collide with names or data in other parts of the kernel. The isolation you learned about in the simple print_number() example applies here in exactly the same way, just at a higher level of complexity and with real hardware resources involved.
Common Beginner Mistakes (and How to Avoid Them)
One of the quickest ways to get into trouble is to store the address of a local variable in a structure that outlives the function. Once the function returns, that memory is reclaimed and can be overwritten at any time, leading to mysterious crashes. Another issue is "over-sharing", using too many global variables for convenience, which can cause unpredictable results if multiple execution paths modify them at the same time. And finally, be careful not to shadow variables (reusing a name inside an inner block), which can lead to confusion and hard-to-spot bugs.
Wrapping Up and Moving Forward
The lesson here is simple but powerful: local scope makes your code safer, easier to test, and more maintainable. In FreeBSD device drivers, it is the right tool for per-call, temporary data. Long-lived information should be stored in properly designed per-device structures, keeping your driver organised and avoiding accidental data sharing.
Now that you understand where a variable can be used, it is time to look at how long it exists. This is called variable storage duration, and it affects whether your data lives on the stack, in static storage, or on the heap. Knowing the difference is key to writing robust, efficient drivers, and that's precisely where we are headed next.
Variable Storage Duration
So far, you've learned where a variable can be used in your program, as well as its scope. But there's another equally important property: how long the variable actually exists in memory. This is called its storage duration.
While scope is about visibility in the code, storage duration is about lifetime in memory. A variable's storage duration determines:
- When the variable is created.
- When it is destroyed.
- Where it lives (stack, static storage, heap).
Understanding storage duration is critical in FreeBSD driver development because we often handle resources that must persist across function calls (like device state) alongside temporary values that must vanish quickly (like loop counters or temporary buffers).
The Three Main Storage Durations in C
When you create a variable in C, you're not just giving it a name and a value, you're also deciding how long that value will live in memory. This "lifetime" is what we call the storage duration. Even two variables that look similar in the code can behave very differently depending on how long they stick around.
Let's break down the three main types you'll encounter, starting with the most common in day-to-day programming.
Automatic Storage Duration (stack variables)
Think of these as short-term helpers. They are born the moment a function starts running and disappear the instant the function finishes. You don't have to create or destroy them manually; C takes care of that for you.
Automatic variables:
- Are declared inside functions without the
statickeyword. - Are created when the function is called and destroyed when it returns.
- Live on the stack, a section of memory that's automatically managed by the program.
- Are perfect for quick, temporary jobs like loop counters, temporary pointers, or small scratch buffers.
Because they vanish when the function ends, you can't keep their address for later use; doing so leads to one of the most common beginner mistakes in C.
Small Example:
#include <stdio.h>
void greet_user(void) {
char name[] = "FreeBSD"; // automatic storage, stack memory
printf("Hello, %s!\n", name);
} // 'name' is destroyed here
int main(void) {
greet_user();
return 0;
}
Here, name lives only while greet_user() runs. When the function exits, the stack space is freed automatically.
Static Storage Duration (globals and static variables)
Now imagine a variable that doesn't come and go with a function call, instead, it's always there from the moment your program (or in kernel space, your driver module) loads until it ends. This is static storage.
Static variables:
- Are declared outside functions or inside functions with the
statickeyword. - Are created once when the program/module starts.
- Remain in memory until the program/module ends.
- Live in a dedicated static memory area.
- Are great for things like per-device state structures or lookup tables that are needed throughout the program's lifetime.
However, since they stick around, you must be careful in driver code shared, long-lived data can be accessed by multiple execution paths, so you may need locks or other synchronization to avoid conflicts.
Small Example:
#include <stdio.h>
static int counter = 0; // static storage, exists for the entire program
void increment(void) {
counter++;
printf("Counter = %d\n", counter);
}
int main(void) {
increment();
increment();
return 0;
}
counter keeps its value between calls to increment() because it never leaves memory until the program ends.
Dynamic Storage Duration (heap allocation)
Sometimes you don't know in advance how much memory you'll need, or you need to keep something around even after the function that created it has finished. That's where dynamic storage comes in: you request memory at runtime, and you decide when it goes away.
Dynamic variables:
- Are created explicitly at runtime with
malloc()/free()in user space, ormalloc(9)/free(9)in the FreeBSD kernel. - Exist until you explicitly free them.
- Live in the heap, a pool of memory managed by the operating system or kernel.
- Are perfect for things like buffers whose size depends on hardware parameters or user input.
The flexibility comes with responsibility: forget to free them, and you'll have a memory leak. Free them too soon, and you might crash the system by accessing invalid memory.
Small Example:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void) {
char *msg = malloc(32); // dynamic storage
if (!msg) return 1;
strcpy(msg, "Hello from dynamic memory!");
printf("%s\n", msg);
free(msg); // must free to avoid leaks
return 0;
}
Here, the program decides at runtime to allocate 32 bytes. The memory is under your control, so you must free it when done.
Bridging Theory and Practice
So far, we've looked at these storage durations in an abstract way. But concepts really sink in when you see them in the wild, inside a real FreeBSD driver or subsystem function. Kernel code often mixes these durations: a few automatic locals for temporary values, some static structures for persistent state, and carefully managed dynamic memory for resources that come and go during runtime.
To make this clearer, let's walk through a real function from the FreeBSD 14.3 source tree. By following each variable and seeing how it's declared, used, and eventually discarded or freed, you'll gain an intuitive feel for how lifetime and scope interact in real-world kernel work.
| Duration | Created | Destroyed | Memory area | Typical declarations | Good driver use cases | Common pitfalls | FreeBSD APIs to know |
|---|---|---|---|---|---|---|---|
| Automatic | On function entry | On function return | Stack | Local variables without static | Scratch values in fast paths and interrupt handlers | Returning addresses of locals. Oversized locals | N/A |
| Static | When module loads | When module unloads | Static storage | File scope variables or static inside funcs | Persistent device state. Constant tables. Tunables | Hidden shared state. Missing locks on SMP | sysctl(9) patterns for tunables |
| Dynamic | When you call allocator | When you free it | Heap | Pointers returned by allocators | Buffers sized at probe time. Lifetime spans calls | Leaks. Use after free. Double free | malloc(9), free(9), M_* types |
Real Example from FreeBSD 14.3
Before we move on, let's look at how these storage duration concepts appear in production-quality FreeBSD code. Our example comes from the network interface subsystem, specifically from the _if_delgroup_locked() function in sys/net/if.c (lines 1474 to 1512 in FreeBSD 14.3). This function removes an interface from a named interface group, updates reference counts, and frees memory when the group becomes empty.
As in our earlier, simpler examples, you'll see automatic variables created and destroyed entirely within the function, dynamic memory being released explicitly with free(9), and, elsewhere in the same file, static variables that persist for the module's entire lifetime. By walking through this function, you'll see lifetime and scope management in action not just in an isolated snippet, but in the complex, interconnected world of the FreeBSD kernel.
Note: I've added some extra comments to highlight what's happening at each step.
/*
* Helper function to remove a group out of an interface. Expects the global
* ifnet lock to be write-locked, and drops it before returning.
*/
static void
_if_delgroup_locked(struct ifnet *ifp, struct ifg_list *ifgl,
const char *groupname)
{
struct ifg_member *ifgm; // [Automatic] (stack) pointer: used only in this call
bool freeifgl; // [Automatic] (stack) flag: should we free the group?
IFNET_WLOCK_ASSERT(); // sanity: we entered with the write lock held
/* Remove the (interface,group) link from the interface's local list. */
IF_ADDR_WLOCK(ifp);
CK_STAILQ_REMOVE(&ifp->if_groups, ifgl, ifg_list, ifgl_next);
IF_ADDR_WUNLOCK(ifp);
/*
* Find and remove this interface from the group's member list.
* 'ifgm' is a LOCAL cursor; it does not escape this function
* (classic automatic storage).
*/
CK_STAILQ_FOREACH(ifgm, &ifgl->ifgl_group->ifg_members, ifgm_next) {
/* [Automatic] 'ifgm' is a local iterator only */
if (ifgm->ifgm_ifp == ifp) {
CK_STAILQ_REMOVE(&ifgl->ifgl_group->ifg_members, ifgm,
ifg_member, ifgm_next);
break;
}
}
/*
* Decrement the group's reference count. If we just removed the
* last member, mark the group for freeing after we drop locks.
*/
if (--ifgl->ifgl_group->ifg_refcnt == 0) {
CK_STAILQ_REMOVE(&V_ifg_head, ifgl->ifgl_group, ifg_group,
ifg_next);
freeifgl = true;
} else {
freeifgl = false;
}
IFNET_WUNLOCK(); // we promised to drop the global lock before return
/*
* Wait for readers in the current epoch to finish before freeing memory
* (RCU-style safety in the networking stack).
*/
NET_EPOCH_WAIT();
/* Notify listeners that the group membership changed. */
EVENTHANDLER_INVOKE(group_change_event, groupname);
if (freeifgl) {
/* Group became empty: fire detach event and free the group object. */
EVENTHANDLER_INVOKE(group_detach_event, ifgl->ifgl_group);
free(ifgl->ifgl_group, M_TEMP); // [Dynamic] (heap) storage being returned
}
/* Free the (interface,group) membership nodes allocated earlier. */
free(ifgm, M_TEMP); // [Dynamic] the 'member' record
free(ifgl, M_TEMP); // [Dynamic] the (ifnet, group) link record
}
What to notice
[Automatic]ifgmandfreeifgllive only for this call. They cannot outlive the function.[Dynamic]frees return heap objects that were allocated earlier in the driver life cycle. The lifetime crosses function boundaries and must be released on the exact success path shown here.[Static]is not used in this function. In the same file you will find persistent configuration and counters that exist from load to unload. Those are[Static].
Understanding the Storage Durations in This Function
If you follow _if_delgroup_locked() from start to finish, you can watch all three storage durations in C play their part. The variables ifgm and freeifgl are automatic, which means they are born when the function is called, live entirely on the stack, and disappear the moment the function returns. They are private to this call, so nothing outside can accidentally change them, and they cannot change anything outside either.
A little further down, the calls to free(...) deal with dynamic storage. The pointers passed to free() were created earlier in the driver's life, often with malloc() during initialisation routines like if_addgroup(). Unlike stack variables, this memory stays around until the driver deliberately lets it go. Freeing it here tells the kernel, "I'm done with this; you can reuse it for something else."
This function doesn't use static variables directly, but in the same file (if.c), you will find examples like debugging flags declared with YSCTL_INT that live for as long as the kernel module is loaded. These variables keep their values across function calls and are a reliable place to store configuration or diagnostics that need to persist.
Each choice here is intentional.
- Automatic variables keep temporary state safe inside the function.
- Dynamic memory gives flexibility at runtime, allowing the driver to adjust and then clean up when done.
- Static storage, found elsewhere in the same codebase, supports persistent, shared information.
Put together, this is a clear, real-world example of how lifetime and visibility work hand in hand in FreeBSD driver code. It is not just theory from a C textbook, it is the day-to-day reality of writing drivers that are reliable, efficient, and safe to run in the kernel.
Why Storage Duration Matters in FreeBSD Drivers
In kernel development, storage duration is not just an academic detail; it's directly tied to system stability, performance, and even security. A wrong choice here can take down the entire operating system.
In FreeBSD drivers, the right storage duration ensures that data lives exactly as long as needed, no more and no less:
- Automatic variables are ideal for short-lived, private state, such as temporary values in an interrupt handler. They vanish automatically when the function ends, avoiding long-term clutter in memory.
- Static variables can safely store hardware state or configuration that must persist across calls, but they introduce shared state that may require locking in SMP systems to avoid race conditions.
- Dynamic allocations give you flexibility when buffer sizes depend on runtime conditions like device probing results, but they must be explicitly freed to avoid leaks and freeing too soon risks accessing invalid memory.
Mistakes with storage duration can be catastrophic in the kernel. Keeping a pointer to a stack variable beyond the function's life is almost guaranteed to cause corruption. Forgetting to free dynamic memory ties up resources until a reboot. Overusing static variables can turn shared state into a performance bottleneck.
Understanding these trade-offs is not optional. In driver code, often triggered by hardware events in unpredictable contexts, correct lifetime management is a foundation for writing code that is safe, efficient, and maintainable.
Common Beginner Mistakes
When you are new to C and especially to kernel programming, it is surprisingly easy to misuse storage duration without even realising it. One classic trap with automatic variables is returning the address of a local variable from a function. At first, it might seem harmless after all, the variable was right there a moment ago, but the moment the function returns, that memory is reclaimed for other uses. Accessing it later is like reading a letter you already burned; the result is undefined behaviour, and in the kernel, that can mean an instant crash.
Static variables can cause trouble differently. Because they persist across function calls, a value left over from a previous run of the function might influence the next run in unexpected ways. This is particularly dangerous if you assume that every call starts with a "clean slate." In reality, static variables remember everything, even when you wish they wouldn't.
Dynamic memory has its own set of hazards. Forgetting to free() something you allocated means the memory will be tied up until the system is restarted, a problem known as a memory leak. In kernel space, where resources are precious, a leak can slowly degrade the system. Freeing the same pointer twice is even worse, it can corrupt kernel memory structures and bring down the whole machine.
Being aware of these patterns early on helps you avoid them when working on real driver code, where the cost of a mistake is often far greater than in user-space programming.
Wrapping Up
We have explored the three main storage durations in C: automatic, static, and dynamic. Each one has its place, and the right choice depends on how long you need the data to live and who should be able to see it. The safest general rule is to choose the smallest necessary lifetime for your variables. This limits their exposure, reduces the risk of unintended interactions, and often makes the compiler's job easier.
In FreeBSD driver development, careful management of variable lifetimes is not optional; it is a fundamental skill. Done right, it helps you write code that is predictable, efficient, and resilient under load. With these principles in mind, you are ready to explore the next piece of the puzzle: understanding how variable linkage affects visibility across files and modules.
Variable Linkage (Visibility Across Files)
So far, we've explored scope (where a name is visible inside your code) and storage duration (how long an object exists in memory). The third and final piece in this visibility puzzle is linkage, the rule that decides whether code in other source files can refer to a given name.
In C (and in FreeBSD kernel code), programs are often split into multiple .c files plus the header files they include. Each .c file and its headers form a translation unit. By default, most names you define are visible only inside the translation unit where they're declared. If you want other files to see them or, often more importantly, to hide them, linkage is the mechanism that controls that access.
The three kinds of linkage in C
Think of linkage as "who outside this file can see this name?":
- External linkage: A name is visible across translation units. Global variables and functions defined at file scope without static have external linkage. Other files can refer to them by declaring extern (for variables) or including a prototype (for functions).
- Internal linkage: A name is visible only within the current file. You get internal linkage by writing static at file scope (for variables or functions). This is how you keep helpers and the private state hidden from the rest of the kernel/program.
- No linkage: A name is visible only within its own block (e.g., variables inside a function). These are locals; they can't be named from outside their scope at all.
A tiny two-file illustration
To really see linkage in action, let's build the smallest possible program that spans two .c files. This will let us test all three cases, external, internal, and no linkage, side by side. We'll create one file (foo.c) that defines a few variables and a helper function, and another file (main.c) that tries to use them.
Below, shared_counter has external linkage (visible in both files), internal_flag has internal linkage (visible only inside foo.c), and the locals inside increment() have no linkage (visible only in that function).
foo.c
#include <stdio.h>
/*
* Global variable with external linkage:
* - Visible to other files (translation units) in the program.
* - No 'static' keyword means it has external linkage by default.
*/
int shared_counter = 0;
/*
* File-private variable with internal linkage:
* - The 'static' keyword at file scope means this name
* is only visible inside foo.c.
*/
static int internal_flag = 1;
/*
* Function with external linkage by default:
* - Can be called from other files if they declare its prototype.
*/
void increment(void) {
/*
* Local variable with no linkage:
* - Exists only during this function call.
* - Cannot be accessed from anywhere else.
*/
int step = 1;
if (internal_flag) // Only code in foo.c can see internal_flag
shared_counter += step; // Modifies the global shared_counter
printf("Counter: %d\n", shared_counter);
}
main.c
#include <stdio.h>
/*
* 'extern' tells the compiler:
* - This variable exists in another file (foo.c).
* - Do not allocate storage for it here.
*/
extern int shared_counter;
/*
* Forward declaration for the function defined in foo.c:
* - Lets us call increment() from this file.
*/
void increment(void);
int main(void) {
increment(); // Calls increment() from foo.c
shared_counter += 10; // Legal: shared_counter has external linkage
increment();
// internal_flag = 0; // ERROR: not visible here (internal linkage in foo.c)
return 0;
}
The pattern generalizes directly to kernel code: keep helpers and private state static in one .c file, expose only the minimal surface via headers (prototypes) or intentionally exported globals.
Real FreeBSD 14.3 Example: External vs. Internal vs. No Linkage
Let's ground this in the FreeBSD network stack (sys/net/if.c). We'll look at:
- a global variable with external linkage (
ifqmaxlen), - file-private toggles with internal linkage (
log_link_state_change,log_promisc_mode_change), and - a function with a local variable (no linkage) (
sysctl_ifcount()), plus how it's exposed viaSYSCTL_PROC.
1) External linkage: a tunable global
In sys/net/if.c, ifqmaxlen is a global integer that other parts of the kernel can reference. That's external linkage.
int ifqmaxlen = IFQ_MAXLEN; // external linkage: visible to other files
You'll also see it referenced from the SYSCTL tree setup:
SYSCTL_INT(_net_link, OID_AUTO, ifqmaxlen, CTLFLAG_RDTUN,
&ifqmaxlen, 0, "max send queue size");
This exposes the global through sysctl, so administrators can read/tune it at boot (depending on flags).
2) Internal linkage: file-private toggles
Right above, the file defines two static integers. Because they're static at file scope, they have internal linkage only if.c can name them:
/* Log link state change events */
static int log_link_state_change = 1;
SYSCTL_INT(_net_link, OID_AUTO, log_link_state_change, CTLFLAG_RW,
&log_link_state_change, 0,
"log interface link state change events");
/* Log promiscuous mode change events */
static int log_promisc_mode_change = 1;
SYSCTL_INT(_net_link, OID_AUTO, log_promisc_mode_change, CTLFLAG_RDTUN,
&log_promisc_mode_change, 1,
"log promiscuous mode change events");
Later in the same file, log_link_state_change is used to decide whether to print a message, but only code inside if.c can refer to that symbol by name:
if (log_link_state_change)
if_printf(ifp, "link state changed to %s\n",
(link_state == LINK_STATE_UP) ? "UP" : "DOWN");
See sys/net/if.c for the static definitions and the reference in do_link_state_change().
3) No linkage (locals) + how a private function is exported via SYSCTL
Here's the full sysctl_ifcount() function (as in FreeBSD 14.3), with line-by-line commentary. Notice how rv is a local; it has no linkage and exists only for the duration of this call.
Note: I've added some extra comments to highlight what's happening at each step.
/* sys/net/if.c */
/*
* 'static' at file scope:
* - Gives the function internal linkage (only visible in if.c).
* - Other files cannot call sysctl_ifcount() directly.
*/
static int
sysctl_ifcount(SYSCTL_HANDLER_ARGS) // SYSCTL handler signature used in the kernel
{
/*
* Local variable with no linkage:
* - Exists only during this function call.
* - Tracks the highest interface index in the current vnet.
*/
int rv = 0;
/*
* IFNET_RLOCK():
* - Acquires a read lock on the ifnet index table.
* - Ensures safe concurrent access in an SMP kernel.
*/
IFNET_RLOCK();
/*
* Loop through interface indices from 1 up to the current max (if_index).
* If an entry is in use and belongs to the current vnet,
* update rv with the highest index seen.
*/
for (int i = 1; i <= if_index; i++)
if (ifindex_table[i].ife_ifnet != NULL &&
ifindex_table[i].ife_ifnet->if_vnet == curvnet)
rv = i;
/*
* Release the read lock on the ifnet index table.
*/
IFNET_RUNLOCK();
/*
* Return rv to user space via the sysctl framework.
* - sysctl_handle_int() handles copying the value to the request buffer.
*/
return (sysctl_handle_int(oidp, &rv, 0, req));
}
The function is then registered with the sysctl tree so other kernel parts (and user space via sysctl) can invoke it without needing external linkage to the function name:
/*
* SYSCTL_PROC:
* - Creates a sysctl entry named 'ifcount' under:
* net.link.generic.system
* - Flags: integer type, vnet-aware, read-only.
* - Calls sysctl_ifcount() when queried.
* - Even though sysctl_ifcount() is static, the sysctl framework
* acts as the public interface to its result.
*/
SYSCTL_PROC(_net_link_generic_system, IFMIB_IFCOUNT, ifcount,
CTLTYPE_INT | CTLFLAG_VNET | CTLFLAG_RD, NULL, 0,
sysctl_ifcount, "I", "Maximum known interface index");
This pattern is common in the kernel: the function itself has internal linkage (static), but it's exposed through a registration mechanism (sysctl, eventhandler, devfs methods, etc.).
Why this matters for drivers
- Encapsulation with internal linkage: Use static at file scope to keep helpers and private state inside a single .c file. This reduces accidental coupling and eliminates a whole class of "who changed this?" bugs under SMP.
- Safe temporaries with no linkage: Prefer locals for per-call data so nothing outside the function can modify it. This helps ensure correctness and makes concurrency easier to reason about.
- Intentional exposure through interfaces: When you need to share information, expose it through a registration mechanism such as SYSCTL_PROC, an eventhandler, or devfs methods, rather than exporting function names directly.
In sys/net/if.c, you can see all three visibility levels in action:
- External linkage:
ifqmaxlenis a global variable accessible to other files. - Internal linkage:
log_link_state_changeandlog_promisc_mode_changeare file-private toggles. - No linkage: Local variable
rvinsidesysctl_ifcount(), exposed intentionally viaSYSCTL_PROC.
Common beginner pitfalls (and how to sidestep them)
A few patterns trip people up when they first juggle scope, storage duration, and linkage:
- Using file-private helpers from another file. If you see "undefined reference" at link time for a helper you thought was global, check for a
staticon its definition. If it's truly meant to be shared, move the prototype to a header and removestaticfrom the definition. If not, keep it private and call it indirectly via a registered interface (like sysctl or an ops table). - Accidentally exporting private state. A bare
int myflag;at file scope has external linkage. If you intended it to be file-local, writestatic int myflag;. This one keyword prevents cross-file name collisions and unintended writes. - Leaning on globals instead of passing arguments. If two unrelated call paths tweak the same global, you've invited heisenbugs. Prefer locals and function parameters, or encapsulate shared state in a per-device struct referenced through
softc. - Beginners often confuse
staticin file scope (linkage control) withstaticinside a function (storage duration control). In file scope, static hides a symbol from other files (linkage control). Inside a function, static makes a variable keep its value between calls (storage duration control).
Wrapping up
You now understand scope, storage duration, and linkage, the three pillars that define where a variable can be used, how long it exists, and who can access it. These concepts form the foundation for managing state in any C program, and they are especially critical in FreeBSD drivers, where per-call locals, file-private helpers, and global kernel state must coexist without interfering with one another.
Next, we'll see what happens when you pass those variables into a function. In C, function parameters are copies of the original values, so changes inside the function won't affect the originals unless you pass their addresses. Understanding this behaviour is key to writing driver code that updates state intentionally, avoids subtle bugs, and communicates data effectively between functions.
Parameters Are Copies
When you call a function in C, the values you pass to it are copied into the function's parameters. The function then works with those copies, not the originals. This is known as call by value, and it means that any changes made to the parameter inside the function are lost when the function returns; the caller's variables remain untouched.
This is different from some other programming languages that use "pass by reference" by default, where a function can directly modify a caller's variable without special syntax. In C, if you want a function to modify something outside its own scope, you must give it the address of that thing. That's done using pointers, which we'll explore in depth in the next section.
Understanding this behaviour is critical in FreeBSD driver development. Many driver functions perform setup work, check for conditions, or calculate values without touching the caller's variables unless they are explicitly passed a pointer. This design helps maintain isolation between different parts of the kernel, reducing the risk of unintended side effects.
A Simple Example: Modifying a Copy
To see this in action, we'll write a short program that passes an integer to a function. Inside the function, we'll try to change it. If C worked the way many beginners expect, this would update the original value. But because parameters in C are copies, the change will only affect the function's local version, leaving the original untouched.
#include <stdio.h>
void modify(int x) {
x = 42; // Only updates the function's own copy
}
int main(void) {
int original = 5;
modify(original);
printf("%d\n", original); // Still prints 5, not 42!
return 0;
}
Here, modify() changes its local version of x, but the original variable in main() stays at 5. The copy disappears as soon as modify() returns, leaving main()'s data untouched.
If you do want to change the original variable inside a function, you must pass a reference to it rather than a copy. In C, that reference takes the form of a pointer, which lets the function work directly with the original data in memory. Don't worry if pointers sound mysterious, we'll cover them thoroughly in the next section.
A Real Example from FreeBSD 14.3
This concept shows up in production kernel code all the time. Let's see a real function from 'sys/net/if.c' in FreeBSD 14.3 that removes an interface from a group (this function is located between lines 1470 and 1512). Pay special attention to the parameters at the top: ifp, ifgl, and groupname. Each is a copy of the value that the caller passed in. They're local to this function call, even though they refer to shared kernel objects.
In the listing below, I've added extra comments so you can see exactly what's happening at each step.
Notice how these parameters are local copies, even though they hold pointers to shared kernel data.
/*
* Helper function to remove a group out of an interface. Expects the global
* ifnet lock to be write-locked, and drops it before returning.
*/
static void
_if_delgroup_locked(struct ifnet *ifp, struct ifg_list *ifgl,
const char *groupname)
{
struct ifg_member *ifgm; // local (automatic) variable: lives only during this call
bool freeifgl; // local flag on the stack: also per-call
/*
* PARAMETERS ARE COPIES:
* - 'ifp' is a copy of a pointer to the interface object.
* - 'ifgl' is a copy of a pointer to a (interface,group) link record.
* - 'groupname' is a copy of a pointer to constant text.
* The pointer VALUES are copied, but they still refer to the same kernel data
* as the caller's originals. Reassigning 'ifp' or 'ifgl' here wouldn't affect
* the caller; modifying the *pointed-to* structures does persist.
*/
IFNET_WLOCK_ASSERT(); // sanity: we entered with the global ifnet write lock held
// Remove the (ifnet,group) link from the interface's list.
IF_ADDR_WLOCK(ifp);
CK_STAILQ_REMOVE(&ifp->if_groups, ifgl, ifg_list, ifgl_next);
IF_ADDR_WUNLOCK(ifp);
// Walk the group's member list and remove this interface from it.
CK_STAILQ_FOREACH(ifgm, &ifgl->ifgl_group->ifg_members, ifgm_next) {
if (ifgm->ifgm_ifp == ifp) {
CK_STAILQ_REMOVE(&ifgl->ifgl_group->ifg_members, ifgm,
ifg_member, ifgm_next);
break;
}
}
// Decrement the group's reference count; if it hits zero, mark for free.
if (--ifgl->ifgl_group->ifg_refcnt == 0) {
CK_STAILQ_REMOVE(&V_ifg_head, ifgl->ifgl_group, ifg_group,
ifg_next);
freeifgl = true;
} else {
freeifgl = false;
}
IFNET_WUNLOCK(); // drop the global ifnet lock before potentially freeing memory
// Wait for current readers to exit the epoch section before freeing (RCU-style safety).
NET_EPOCH_WAIT();
// Notify listeners that a group membership changed (uses the 'groupname' pointer).
EVENTHANDLER_INVOKE(group_change_event, groupname);
if (freeifgl) {
// If the group is now empty: announce detach and free the group object.
EVENTHANDLER_INVOKE(group_detach_event, ifgl->ifgl_group);
free(ifgl->ifgl_group, M_TEMP);
}
// Free the membership record and the (ifnet,group) link record.
free(ifgm, M_TEMP);
free(ifgl, M_TEMP);
}
In this kernel example, the parameters behave like they're passed "by reference" because they hold addresses to kernel objects. However, the pointer values themselves are still copies.
What This Shows
Here, ifp, ifgl, and groupname are copies of what the caller passed. If we reassigned ifp = NULL; inside this function, the caller's ifp would be unaffected. But because the pointer values still point to real kernel structures, changes to those structures, like removing from lists or freeing memory, are seen system-wide.
Meanwhile, ifgm and freeifgl are purely local automatic variables. They live only while this function runs and vanish immediately after it returns.
This mirrors our tiny user-space example exactly; the only difference is that here, the parameters are pointers into complex, shared kernel data.
Why This Matters in FreeBSD Driver Development
In driver code, understanding that parameters are copies helps you avoid dangerous assumptions:
- If you change the parameter variable itself (like reassigning a pointer), the caller won't see that change.
- If you change the object the pointer refers to, the caller and possibly the rest of the kernel will see the change, so you must be sure it's safe.
- Passing large structures by value creates full copies on the stack; passing pointers shares the same data.
This distinction is vital for writing predictable, race-free kernel code.
Common Beginner Mistakes
When working with parameters in C, especially in FreeBSD kernel code, beginners often get caught in subtle traps that stem from not fully grasping the "copy" rule.
Let's look at some of the most common:
- Passing a structure by value instead of a pointer: You expect changes to update the original, but they only update your local copy. Example: passing a struct ifreq by value and wondering why the interface isn't reconfigured.
- Forgetting that a pointer grants write access:
Passing
struct mydev *gives the callee full ability to change the device state. Without proper locking, this can corrupt kernel data. - Confusing a pointer copy with copying data:
Reassigning the pointer parameter (
ptr = NULL;) doesn't affect the caller's pointer. Modifying the pointed-to object (ptr->field = 42;) does affect the caller. - Copying large structures by value in kernel space This wastes CPU time and risks overflowing the limited kernel stack.
- Failing to document modification intent: If your function will modify its input, make it evident in the function name, comments, and parameter type.
Rule of Thumb: Pass by value to keep data safe. Pass a pointer only when you intend to modify the data and make that intent explicit.
Wrapping Up
You've now seen that parameters in C work by value: every function receives its own private copy of what you pass, even if that value is an address pointing to shared data. This model gives you both safety and responsibility: safety, because variables themselves are isolated between caller and callee; responsibility, because the data being pointed to may still be shared and mutable.
Next, we'll shift focus from individual variables to collections of data that C programmers (and FreeBSD drivers) use constantly: arrays and strings.
Arrays and Strings in C
In the previous section, you learned that function parameters are passed by value. That lesson sets the stage for working with arrays and strings, two of the most common structures in C. Arrays give you a way to handle collections of elements in contiguous memory. In contrast, strings are simply arrays of characters with a special terminator.
Both are central to FreeBSD driver development: arrays become buffers that move data in and out of hardware, and strings carry device names, configuration options, and environment variables.
In this section, we will build from the basics, highlight common pitfalls, and then connect the concepts to real FreeBSD kernel code, concluding with hands-on labs.
Declaring and Using Arrays
An array in C is a fixed-size collection of elements, all of the same type, stored in contiguous memory. Once defined, its size cannot change.
int numbers[5]; // Declares an array of 5 integers
You can initialize an array at the time of declaration:
int primes[3] = {2, 3, 5}; // Initialize with values
Each element is accessed by its index, starting at zero:
primes[0] = 7; // Change the first element
int second = primes[1]; // Read the second element (3)
In memory, arrays are laid out sequentially. If numbers starts at address 1000 and each integer takes 4 bytes, then numbers[0] is at 1000, numbers[1] at 1004, numbers[2] at 1008, and so on. This detail becomes very important when we study pointers.
Strings in C
Unlike some languages where strings are a distinct type, in C, a string is simply an array of characters terminated with a special '\0' character known as the null terminator.
char name[6] = {'E', 'd', 's', 'o', 'n', '\0'};
A more convenient form lets the compiler insert the null terminator for you:
char name[] = "Edson"; // Stored as E d s o n \0
Strings can be accessed and modified character by character:
name[0] = 'A'; // Now the string reads "Adson"
If the terminating '\0' is missing, functions that expect a string will continue reading memory until they hit a zero byte somewhere else. This often results in garbage output, memory corruption, or kernel crashes.
Common String Functions (<string.h>)
The C standard library provides helper functions for strings. Although you cannot use the full standard library within the FreeBSD kernel, many equivalents are available. It is important to know the standard ones first:
#include <string.h>
char src[] = "FreeBSD";
char dest[20];
strcpy(dest, src); // Copy src into dest
int len = strlen(dest); // Get string length
int cmp = strcmp(src, dest); // Compare two strings
Frequently used functions include:
strcpy()- copy one string into another (unsafe, no bounds checking).strncpy()- safer variant, lets you specify maximum characters.strlen()- count characters before the null terminator.strcmp()- compare two strings lexicographically.
Warning: many standard functions like strcpy() are unsafe because they do not check buffer sizes. In kernel development, this can corrupt memory and cause the system to crash. Safer variants such as strncpy() or kernel-provided helpers should always be preferred.
Why This Matters in FreeBSD Drivers
Arrays and strings are not just a basic C feature; they're at the heart of how FreeBSD drivers manage data. Nearly every driver you write or study relies on them in one form or another:
- Buffers that temporarily hold data moving between hardware and the kernel, such as keystrokes, network packets, or bytes written to disk.
- Device names like
/dev/ttyu0or/dev/randomare presented to user space by the kernel. - Configuration tunables (sysctl) that depend on arrays and strings to store parameter names and values.
- Lookup tables are fixed-size arrays that hold supported hardware IDs, feature flags, or hardware-to-human-readable name mappings.
Because arrays and strings interact closely with hardware interfaces, mistakes here have consequences far beyond a user-space crash. While a runaway write in user-space might only crash that process, the same bug in kernel-space can overwrite critical memory, cause a kernel panic, corrupt data, or even open a security hole.
A real-world example makes this point clear. In CVE-2024-45288, the FreeBSD libnv library (used in both the kernel and userland) mishandled arrays of strings: it assumed strings were null-terminated without verifying their termination. A maliciously crafted nvlist could cause memory beyond the allocated buffer to be read or written, leading to kernel panic or even privilege escalation. The fix required explicit checks, safer memory allocation, and overflow protection.
Here's a simplified before/after look at the bug and its correction:
/*
* CVE-2024-45288 Analysis: Missing Null-Termination in libnv String Arrays
*
* VULNERABILITY: A missing null-termination character in the last element
* of an nvlist array string can lead to writing outside the allocated buffer.
*/
// BEFORE (Vulnerable Code):
static char **
nvpair_unpack_string_array(bool isbe __unused, nvpair_t *nvp,
const char *data, size_t *leftp)
{
char **value, *tmp, **valuep;
size_t ii, size, len;
tmp = (char *)(uintptr_t)data;
size = nvp->nvp_datasize;
for (ii = 0; ii < nvp->nvp_nitems; ii++) {
len = strnlen(tmp, size - 1) + 1;
size -= len;
// BUG: No check if tmp[len-1] is actually '\0'!
tmp += len;
}
// BUG: nv_malloc does not zero-initialize
value = nv_malloc(sizeof(*value) * nvp->nvp_nitems);
if (value == NULL)
return (NULL);
// ...
}
// AFTER (Fixed Code):
static char **
nvpair_unpack_string_array(bool isbe __unused, nvpair_t *nvp,
const char *data, size_t *leftp)
{
char **value, *tmp, **valuep;
size_t ii, size, len;
tmp = (char *)(uintptr_t)data;
size = nvp->nvp_datasize;
for (ii = 0; ii < nvp->nvp_nitems; ii++) {
len = strnlen(tmp, size - 1) + 1;
size -= len;
// FIX: Explicitly check null-termination
if (tmp[len - 1] != '\0') {
ERRNO_SET(EINVAL);
return (NULL);
}
tmp += len;
}
// FIX: Use nv_calloc to zero-initialize
value = nv_calloc(nvp->nvp_nitems, sizeof(*value));
if (value == NULL)
return (NULL);
// ...
}
Visualising the Missing '\0' Problem
CVE-2024-45288
Legend:
[..] = allocated bytes for one string element in the nvlist array
\0 = null terminator
XX = unrelated memory beyond the allocated buffer (must not be touched)
------------------------------------------------------------------------------
BEFORE (vulnerable): last string not null-terminated
------------------------------------------------------------------------------
nvlist data region (simplified):
+---- element[0] ----+ +---- element[1] ----+ +---- element[2] ----+
| 'F' 'r' 'e' 'e' \0 | | 'B' 'S' 'D' \0 | | 'b' 'u' 'g' '!' |XX|XX|XX|...
+--------------------+ +--------------------+ +--------------------+--+--+--+
^
|
strnlen(tmp, size-1) walks here,
never sees '\0', keeps going...
size -= len is computed as if '\0'
existed, later code writes past end
Effect:
- Readers assume a proper C-string. They continue until a random zero byte in XX.
- Writers may copy len bytes including overflow into XX.
- Result can be buffer overflow, memory corruption, or kernel panic.
------------------------------------------------------------------------------
AFTER (fixed): explicit check for null-termination + safer allocation
------------------------------------------------------------------------------
+---- element[0] ----+ +---- element[1] ----+ +---- element[2] ----+
| 'F' 'r' 'e' 'e' \0 | | 'B' 'S' 'D' \0 | | 'b' 'u' 'g' '!' \0|XX|XX|XX|...
+--------------------+ +--------------------+ +--------------------+--+--+--+
^
|
check: tmp[len-1] == '\0' ? OK : EINVAL
Changes:
- The loop validates the final byte of each element is '\0'.
- If not, it fails early with EINVAL. No overflow occurs.
- Allocation uses nv_calloc(nitems, sizeof(*value)), memory is zeroed.
Tip for kernel developers:
Always check termination when parsing external or untrusted data.
Do not rely on strnlen alone. Validate tmp[len-1] == '\0' before use.
Root Cause Analysis:
- Missing null-termination check
strnlen()was used to find string length.- The code assumed strings ended with
'\0'. - No verification that
tmp[len-1] == '\0'.
- Uninitialized memory
nv_malloc()does not clear memory.- Changed to
nv_calloc()to avoid leaking old memory contents.
- Integer overflow
- In related header checks, nvlh_size could overflow when added to sizeof(nvlhdrp)
- Added explicit overflow checks.
Impact:
- Buffer overflow in kernel or userland.
- Privilege escalation is possible.
- System panic and memory corruption.
Mini Lab: The Danger of a Missing '\0'
To illustrate the subtlety of this class of bug, try the following small program in user space.
#include <stdio.h>
#include <string.h>
int main() {
// Fill stack with non-zero garbage
char garbage[100];
memset(garbage, 'Z', sizeof(garbage));
// Deliberately forget the null terminator
char broken[5] = {'B', 'S', 'D', '!', 'X'};
// Print as if it were a string
printf("Broken string: %s\n", broken);
// Now with proper termination
char fixed[6] = {'B', 'S', 'D', '!', 'X', '\0'};
printf("Fixed string: %s\n", fixed);
return 0;
}
What to do:
- Compile and run.
- The first print may show random garbage after
"BSD!X", becauseprintf("%s")keeps reading memory until it stumbles on a zero byte. - The second print works as expected.
Lesson: This is the same mistake that caused CVE-2024-45288 in FreeBSD. In user space, you get garbage or a crash. In kernel space, you risk a panic or privilege escalation. Always remember: no '\0', no string.
Note: This example shows how a tiny omission, forgetting to check for a '\0', can become a serious vulnerability. That's why professional FreeBSD driver developers are disciplined when handling arrays and strings: they always track buffer sizes, always validate string termination, and always use safe allocation and copy functions. The security and stability of the system depend on it.
Real Example from FreeBSD 14.3 Source Code
FreeBSD stores its kernel environment as an array of C strings, each of the form "name=value". This is a perfect real-world example of arrays and strings in action.
The array itself is declared in sys/kern/kern_environment.c:
// sys/kern/kern_environment.c, line 83
char **kenvp; // Array of pointers to strings like "name=value"
Each kenvp[i] points to a null-terminated string. For example:
kenvp[0] → "kern.ostype=FreeBSD"
kenvp[1] → "hw.model=Intel(R) Core(TM) i7"
...
To look up a variable by name, FreeBSD uses the helper _getenv_dynamic_locked():
// sys/kern/kern_environment.c, lines 495-511
static char *
_getenv_dynamic_locked(const char *name, int *idx)
{
char *cp; // Pointer to the current "name=value" string
int len, i;
len = strlen(name); // Get the length of the variable name
// Walk through each string in kenvp[]
for (cp = kenvp[0], i = 0; cp != NULL; cp = kenvp[++i]) {
// Compare prefix: does "cp" start with "name"?
if ((strncmp(cp, name, len) == 0) &&
(cp[len] == '=')) { // Ensure it's exactly "name="
if (idx != NULL)
*idx = i; // Optionally return the index
// Return pointer to the value part (after '=')
return (cp + len + 1);
}
}
// Not found
return (NULL);
}
Step by step explanation:
- The function receives a variable name, such as
"kern.ostype". - It measures its length.
- It loops through the array
kenvp[]. Each entry is a string like"name=value". - It compares the prefix of each entry with the requested name.
- If it matches and is followed by
'=', it returns a pointer just past the '=', so the caller gets only the value.- For
"kern.ostype=FreeBSD", the return value points to"FreeBSD".
- For
- If no entry matches, it returns
NULL.
The public interface kern_getenv() wraps this logic with safe copying and locking:
// sys/kern/kern_environment.c, lines 561-582
char *
kern_getenv(const char *name)
{
char *cp, *ret;
int len;
if (dynamic_kenv) {
// Compute maximum safe size for a "name=value" string
len = KENV_MNAMELEN + 1 + kenv_mvallen + 1;
// Allocate a buffer (zeroed) for the result
ret = uma_zalloc(kenv_zone, M_WAITOK | M_ZERO);
mtx_lock(&kenv_lock);
cp = _getenv_dynamic(name, NULL); // Look up variable
if (cp != NULL)
strlcpy(ret, cp, len); // Safe copy into buffer
mtx_unlock(&kenv_lock);
// If not found, free the buffer and return NULL
if (cp == NULL) {
uma_zfree(kenv_zone, ret);
ret = NULL;
}
} else {
// Early boot path: static environment
ret = _getenv_static(name);
}
return (ret);
}
What to notice:
kenvpis an array of strings used as a lookup table._getenv_dynamic_locked()walks the array, usesstrncmp()and pointer arithmetic to isolate the value.kern_getenv()wraps this in a safe API: it locks access, copies the value withstrlcpy(), and ensures memory ownership is clear (the caller must laterfreeenv()the result).
This real kernel code ties together almost everything we have discussed so far: arrays of strings, null-terminated strings, standard string functions, and pointer arithmetic.
Common Beginner Pitfalls
Arrays and strings in C look simple, but they hide many traps for beginners. Small mistakes that in user space would only crash your program can, in kernel space, bring down the entire operating system. Here are the most common issues:
- Off-by-one errors
The most classic mistake is writing outside the valid range of an array. If you declare
int items[5];, the valid indices are0through4. Writing toitems[5]is already one past the end, and you are corrupting memory. Avoid it: always think in terms of "zero to size minus one," and double-check loop bounds carefully. - Forgetting the null terminator
A string in C must end with
'\0'. If you forget it, functions likeprintf("%s", ...)will keep reading memory until they randomly find a zero byte, often printing garbage or causing a crash. Avoid it: let the compiler add the terminator by writingchar name[] = "FreeBSD";instead of manually filling character arrays. - Using unsafe functions
Functions like
strcpy()andstrcat()perform no bounds checking. If the destination buffer is too small, they will happily overwrite memory past its end. In kernel code, this can cause panics or even security vulnerabilities. Avoid it: use safer alternatives such asstrlcpy()orstrlcat(), which require you to pass the size of the buffer. - Assuming arrays know their own length In higher-level languages, arrays often "know" how big they are. In C, an array is just a pointer to a block of memory; its size is not stored anywhere. Avoid it: keep track of the size explicitly, usually in a separate variable, and pass it along with the array whenever you share it between functions.
- Mixing up arrays and pointers
Arrays and pointers are closely related in C, but not identical. For example, you cannot reassign an array the way you reassign a pointer, and
sizeof(array)is not the same assizeof(pointer). Confusing the two leads to subtle bugs. Avoid it: remember: arrays "decay" into pointers when passed to functions, but at the declaration level they are distinct.
In user programs, these mistakes usually stop at a segmentation fault. In kernel drivers, they can overwrite scheduler data, corrupt I/O buffers, or break synchronization structures, leading to crashes or exploitable vulnerabilities. This is why FreeBSD developers are disciplined when working with arrays and strings: every buffer has a known size, every string has a checked terminator, and safe functions are preferred by default.
Hands-On Lab 1: Arrays in Practice
In this first lab you will practice the mechanics of arrays: declaring, initializing, looping over them, and modifying elements.
#include <stdio.h>
int main() {
// Declare and initialize an array of 5 integers
int values[5] = {10, 20, 30, 40, 50};
printf("Initial array contents:\n");
for (int i = 0; i < 5; i++) {
printf("values[%d] = %d\n", i, values[i]);
}
// Modify one element
values[2] = 99;
printf("\nAfter modification:\n");
for (int i = 0; i < 5; i++) {
printf("values[%d] = %d\n", i, values[i]);
}
return 0;
}
What to Try Next
- Change the array size to 10 but only initialize the first 3 elements. Print all 10 and notice that uninitialized ones default to zero (in this case, because the array was initialized with braces).
- Move the
values[2] = 99;line into the loop and try modifying every element. This is the same pattern drivers use when filling buffers with new data from hardware. - (Optional curiosity) Try printing
values[5]. This is one step past the last valid element. On your system you might see garbage or nothing unusual, but in the kernel it could overwrite sensitive memory and crash the OS. Treat it as forbidden.
Hands-On Lab 2: Strings and the Null Terminator
This lab focuses on strings. You will see what happens when you forget the terminating '\0', and then you'll practice comparing strings in a way that mirrors how FreeBSD drivers search configuration options.
Incorrect version (missing '\0'):
#include <stdio.h>
int main() {
char word[5] = {'H', 'e', 'l', 'l', 'o'};
printf("Broken string: %s\n", word);
return 0;
}
Correct version:
#include <stdio.h>
int main() {
char word[6] = {'H', 'e', 'l', 'l', 'o', '\0'};
printf("Fixed string: %s\n", word);
return 0;
}
What to Try Next
- Replace
"Hello"with a longer word but keep the array size the same. See what happens when the word does not fit. - Declare
char msg[] = "FreeBSD";without specifying a size and print it. Notice how the compiler automatically adds the null terminator for you.
Kernel-Flavoured Bonus Challenge
In the kernel, environment variables are stored as strings of the form "name=value". Drivers often need to compare names to find the right variable. Let's simulate that:
#include <stdio.h>
#include <string.h>
int main() {
// Simulated environment variables (like entries in kenvp[])
char *env[] = {
"kern.ostype=FreeBSD",
"hw.model=Intel(R) Core(TM)",
"kern.version=14.3-RELEASE",
NULL
};
// Target to search
const char *name = "kern.ostype";
int len = strlen(name);
for (int i = 0; env[i] != NULL; i++) {
// Compare prefix
if (strncmp(env[i], name, len) == 0 && env[i][len] == '=') {
printf("Found %s, value = %s\n", name, env[i] + len + 1);
break;
}
}
return 0;
}
Run it, and you will see:
Found kern.ostype, value = FreeBSD
This is almost exactly what _getenv_dynamic_locked() does inside the FreeBSD kernel: it compares names and, if they match, returns a pointer to the value after the '='.
Wrapping Up
In this section, you explored arrays and strings from both the C language perspective and the FreeBSD kernel perspective. You saw how arrays give you fixed-size storage, how strings depend on the null terminator, and how these simple constructs underpin core driver mechanisms such as device names, sysctl parameters, and kernel environment variables.
You also discovered how subtle mistakes, like writing past an array boundary or forgetting a terminator, can escalate into severe bugs or vulnerabilities, as illustrated by real FreeBSD CVEs.
Recap Quiz - Arrays and Strings
Instructions: answer without running code first, then verify on your system. Keep answers short and specific.
- In C, what makes a character array a "string"? Explain what happens if that element is missing.
- Given
int a[5];, list the valid indices and say what is undefined behavior for indexing. - Why is
strcpy(dest, src)risky in kernel code, and what should you prefer instead? Briefly explain why. - Look at this snippet and say exactly what the return value points to if it matches:
int len = strlen(name);
if (strncmp(cp, name, len) == 0 && cp[len] == '=')
return (cp + len + 1);
- In
sys/kern/kern_environment.c, what is the type and role ofkenvp, and how does_getenv_dynamic_locked()use it at a high level?
Challenge Exercises
If you feel confident, try these challenges. They are designed to push your skills a bit further and prepare you for real driver work.
- Array Rotation: Write a program that rotates the contents of an integer array by one position. For example,
{1, 2, 3, 4}becomes{2, 3, 4, 1}. - String Trimmer: Write a function that removes the newline character (
'\n') from the end of a string if present. Test it with input fromfgets(). - Environment Lookup Simulation: Extend the kernel-flavoured lab from this section. Add a function
char *lookup(char *env[], const char *name)that takes an array of"name=value"strings and returns the value part. Handle the case where the name is not found by returningNULL. - Buffer Size Check: Write a function that safely copies one string into another buffer and explicitly reports an error if the destination is too small. Compare your implementation with
strlcpy().
Looking Ahead
In the next section, we will connect arrays and strings to the deeper concept of pointers and memory. You will learn how arrays decay into pointers, how memory addresses are manipulated, and how the FreeBSD kernel allocates and frees memory safely. This is where you begin to see how data structures and memory management form the backbone of every device driver.
Pointers and Memory
Welcome to one of the most mysterious and magical topics in your C journey: pointers.
By now, you've probably heard things like:
- "Pointers are hard."
- "C is powerful because of pointers."
- "You can shoot yourself in the foot with pointers."
Those statements aren't wrong, but don't worry. I'm going to walk you through it carefully, step by step. Our goal is to demystify pointers, not memorize obscure syntax. And because we're learning with FreeBSD in mind, I'll also point out where and how pointers are used in the real kernel source (without overwhelming you).
When you understand pointers, you'll unlock the true potential of C, especially when it comes to writing system-level code and interacting with the operating system at a low level.
What Is a Pointer?
So far, we've worked with variables like int, char, and float. These are familiar and friendly; you declare them, assign them values, and print them. Easy, right?
Now we're going to talk about something that doesn't store a value directly, but rather stores the location of a value.
This magical concept is called a pointer, and it's one of the most powerful tools in C, especially when you're writing low-level code like device drivers in FreeBSD.
Analogy: Memory as a Row of Lockers
Imagine computer memory as a long row of lockers, each with its own number:
[1000] = 42
[1004] = 99
[1008] = ???
Each locker is a memory address, and the value inside is your data.
When you create a variable in C:
int score = 42;
You're saying:
"Please give me a locker big enough to store an
int, and put42in it."
But what if you want to know where that variable is stored?
That's where pointers come in.
A First Pointer Program
Here's a gentle example to show what a pointer is, and what it can do:
#include <stdio.h>
int main(void) {
int score = 42; // A normal integer variable
int *ptr; // Declare a pointer to an integer
ptr = &score; // Set ptr to the address of score using the & operator
printf("score = %d\n", score); // Prints: 42
printf("ptr points to address %p\n", (void *)ptr); // Prints the memory address of score
printf("value at that address = %d\n", *ptr); // Dereference the pointer to get the value: 42
return 0;
}
Line-by-Line Breakdown
| Line | Explanation |
|---|---|
int score = 42; | Declares a regular int and sets it to 42. |
int *ptr; | Declares a pointer named ptr that can store the address of an int. |
ptr = &score; | The & operator gets the memory address of score. That address is now stored in ptr. |
*ptr | The * operator (called dereference) means: "go to the address stored in ptr and get the value there." |
Pointers in the Kernel
Let's look at a real pointer declaration in FreeBSD.
Remember our good friend, the tty_info() function from sys/kern/tty_info.c?
Inside it, at line 288, you'll find this line:
struct proc *p, *ppick;
Here, p and ppick are pointers to a struct proc, which represents a process.
What this line means:
pandppickdon't store processes; they point to process structures in memory.- In FreeBSD, almost all kernel structures are accessed via pointers because data is passed around and shared across kernel subsystems.
Later in the same function, at line 333, we see:
/*
* Pick the most interesting process and copy some of its
* state for printing later. This operation could rely on stale
* data as we can't hold the proc slock or thread locks over the
* whole list. However, we're guaranteed not to reference an exited
* thread or proc since we hold the tty locked.
*/
p = NULL;
LIST_FOREACH(ppick, &tp->t_pgrp->pg_members, p_pglist)
if (proc_compare(p, ppick))
p = ppick;
Here:
LIST_FOREACH()is walking over a linked list of processes.ppickis pointing to each process in the group.- The function
proc_compare()helps pick the "most interesting" process. - And
pgets assigned to point to that process.
Don't worry if the kernel example feels a little dense right now. The key takeaway is simple:
in FreeBSD, pointers are everywhere because kernel structures are almost always shared and referenced instead of copied.
A Simple Analogy
Think of pointers as labels with GPS coordinates. Instead of carrying the treasure, they tell you where to dig.
- A regular variable holds the value.
- A pointer holds the address of the value.
This is extremely useful in systems programming, where we often pass references to data rather than the data itself.
Quick Check: Test Your Understanding
Can you tell what this code will print?
int num = 25;
int *p = #
printf("num = %d\n", num);
printf("*p = %d\n", *p);
Answer:
num = 25
*p = 25
Because both num and *p refer to the same location in memory.
Summary
- A pointer is a variable that stores a memory address.
- Use
&to get the address of a variable. - Use
*to access the value at a pointer's address. - Pointers are heavily used in FreeBSD (and all OS kernels) because they allow efficient access to shared and dynamic data.
Mini Hands-On Lab — Your First Pointers
Goal
Build confidence with the three core pointer moves: taking an address with &, storing it in a pointer, and reading or writing through that pointer with *.
Starter code
#include <stdio.h>
int main(void) {
int value = 10;
int *p = NULL; // Good habit: initialize pointers
/* 1. Point p to value using the address of operator */
/* 2. Print value, the address stored in p, and the value via *p */
/* 3. Change value through the pointer, then print value again */
/* 4. Declare another int named other = 99 and re point p to it.
Print *p and other to confirm they match. */
return 0;
}
Tasks
- Set
pto the address ofvalue. - Print:
value = ...p points to address ...(useprintf("p points to address %p\n", (void*)p);)*p = ...
- Write through the pointer with
*p = 20;and printvalueagain. - Create
int other = 99;, then setp = &other;and print*pandother.
Expected output example (address will differ):
value = 10
p points to address 0x...
*p = 10
value after write through p = 20
other = 99
*p after re pointing = 99
Stretch exercise
-
Add
int *q = p;and then set*q = 123;. Print both*pandother. What happened? -
Write a helper function:
void set_twice(int *x) { *x = *x * 2; }Call it with
set_twice(&value);and observe the result.
Common Beginner Pitfalls with Pointers
If pointers feel slippery, you're not alone. Most C beginners run into the same traps over and over, and even experienced developers occasionally fall into them.
- Using an uninitialized pointer
Declaring
int *p;without setting it to something valid leavesppointing to "somewhere" in memory. → Always initialize pointers (toNULLor a valid address). - Confusing the pointer with the data
Beginners often mix up
p(the address) with*p(the value at that address). Writing to the wrong one can silently corrupt memory. → Ask yourself: am I working with the pointer or the pointee? - Losing track of ownership If a pointer refers to memory that was freed or that belongs to a different part of the program, using it again is a serious bug (a "dangling pointer"). → We'll learn strategies to manage memory safely later.
- Forgetting about types
A pointer to
intis not the same as a pointer tochar. Mixing types can cause subtle errors because the compiler uses the type to decide how many bytes to step through in memory. → Always match pointer types carefully. - Assuming all addresses are valid Just because a pointer contains a number doesn't mean that address is safe to use. The kernel is full of memory that your code must not touch without permission. → Never invent or guess addresses; only use valid ones from the kernel or OS.
These mistakes are not small annoyances; in kernel development, they can bring down the entire system. The good news is that by understanding how pointers work and building safe habits, you'll learn to avoid them.
Why Pointers Matter in Driver Development
So why spend so much time learning pointers? Because pointers are the language of the kernel!
- In user programs, you often work with copies of data. In the kernel, copying is too expensive, so we pass around pointers instead.
- Device drivers constantly need to share state between different parts of the system (processes, threads, hardware buffers). Pointers are how we make that possible.
- Pointers let us build flexible structures like linked lists, queues, and tables, which are everywhere in FreeBSD's source code.
- Most importantly, hardware itself is accessed through memory addresses. If you want to talk to a device, you'll often be handed a pointer to its registers or buffers.
Understanding pointers is not just about writing clever C code. It's about speaking the kernel's native tongue. Without them, you cannot build safe, efficient, or even functional device drivers.
Wrapping Up
We've just taken our first careful step into the world of pointers: variables that don't hold data directly but instead hold the location of data. This shift in perspective is what makes C so powerful and, at the same time, so dangerous if misunderstood.
Pointers let us share information between parts of a program without copying it around, which is essential in an operating system kernel where efficiency and precision matter. But this power also comes with responsibility: mixing up addresses, dereferencing invalid pointers, or forgetting what memory a pointer refers to can easily crash your program or, in the case of a driver, the entire operating system.
That's why understanding pointers is not just an academic exercise, but a survival skill for FreeBSD driver development.
In the next section, we'll move from the big picture into the nuts and bolts: how to correctly declare pointers, how to use them in practice, and how to start building safe habits from the very beginning.
Declaring and Using Pointers
Now that you know what a pointer is, a variable that stores a memory address instead of a direct value, it is time to learn how to declare and use them in your programs. This is where the idea of a pointer stops being abstract and becomes something you can experiment with in code.
We will move carefully, step by step, with small and fully commented examples. You will see how to declare a pointer, how to assign it the address of another variable, how to dereference it to reach the stored value, and how to modify data indirectly. Along the way, we will look at real FreeBSD kernel code that relies on pointers every day.
Declaring a Pointer
In C, you declare a pointer using the star symbol *. The general pattern looks like this:
int *ptr;
This line means:
"I am declaring a variable called ptr, and it will hold the address of an integer."
The * here does not mean that the name of the variable is *ptr. It is part of the type declaration, telling the compiler that ptr is not a plain integer but a pointer to an integer.
Let's see a complete program that you can type and run:
#include <stdio.h>
int main(void) {
int value = 10; // Declare a regular integer
int *ptr; // Declare a pointer to an integer
ptr = &value; // Store the address of 'value' in 'ptr'
printf("The value is: %d\n", value);
printf("Address of value is: %p\n", (void *)&value);
printf("Pointer ptr holds: %p\n", (void *)ptr);
printf("Value through ptr: %d\n", *ptr);
return 0;
}
Run this program and compare the output. You will see that ptr contains the same address as &value, and when you use *ptr, you get back the integer stored there.
Think of it like this: writing down a friend's street address is &value. Saving that address in your contacts list is ptr. Actually going to that house to say hello (or grab a snack) is *ptr.
The Importance of Initialisation
One of the most important rules about pointers is: never use them before you assign them a valid address. An uninitialised pointer holds garbage data, which means it may point to a random memory location. If you try to dereference it, the program will almost certainly crash.
Here is an unsafe example:
int *dangerous_ptr;
printf("%d\n", *dangerous_ptr); // Undefined behaviour!
Since dangerous_ptr was never assigned a valid address, the program will attempt to read some unpredictable area of memory. In user programs this usually causes a crash. In kernel code it can be far worse, leading to corruption of critical data structures and even security vulnerabilities. This is why being disciplined about initialisation is so important when programming for FreeBSD.
A Real Example from FreeBSD
If you open the file sys/kern/tty_info.c, you will find the following declaration inside the tty_info() function, at line 289:
struct thread *td, *tdpick;
Both td and tdpick are pointers to a structure called thread. FreeBSD uses these pointers to walk through all threads that belong to a process. Later in the code, at line 341, you will see these pointers being utilised:
FOREACH_THREAD_IN_PROC(p, tdpick)
if (thread_compare(td, tdpick))
td = tdpick;
The kernel is looping through each thread in process p. It compares them using the helper function thread_compare(), and if one is a better match, it updates the pointer td to refer to that thread.
Notice that td itself is just a label. What changes is the address it holds, which in turn tells the kernel which thread to focus on. This pattern is extremely common in the kernel: pointers are declared at the top of a function, then updated step by step as the function works its way through structures.
Modifying a Value Through a Pointer
Another classic use of pointers is indirect modification. Let's walk through a simple program:
#include <stdio.h>
int main(void) {
int age = 30; // A regular variable
int *p = &age; // Pointer to that variable
printf("Before: age = %d\n", age);
*p = 35; // Change the value through the pointer
printf("After: age = %d\n", age);
return 0;
}
When we run this code, it prints 30 before and 35 after. We never assigned directly to age, but by dereferencing p, we reached into its memory location and changed the value stored there.
This technique is used everywhere in system programming. Functions that need to return more than one value, or that must directly alter data structures inside the kernel, rely on pointers. Without them, it would be impossible to write efficient drivers or manage complex objects like processes, devices, and memory buffers.
Mini Hands-On Lab — Pointer Chains
Goal Learn how multiple pointers can point to the same variable, and how a pointer-to-pointer works in practice. This prepares you for real kernel patterns where functions receive not just data but pointers to pointers that need updating.
Starter Code
#include <stdio.h>
int main(void) {
int value = 42;
int *p = &value; // p points to value
int **pp = &p; // pp points to p (pointer to pointer)
/* 1. Print value directly, through p, and through pp */
/* 2. Change value to 100 using *p */
/* 3. Change value to 200 using **pp */
/* 4. Make p point to a new variable other = 77, via pp */
/* 5. Print value, other, *p, and **pp to observe the changes */
return 0;
}
Tasks
- Print the same integer in three different ways:
- Directly (
value) - Indirectly with
*p - Double indirection with
**pp
- Directly (
- Assign
100tovalueusing*p, then printvalue. - Assign
200tovalueusing**pp, then printvalue. - Declare a second variable
int other = 77;Use the double pointer (*pp = &other;) to makeppoint tootherinstead. - Print
other,*p, and**pp. Confirm that all three match.
Expected Output (addresses will differ)
value = 42
*p = 42
**pp = 42
value after *p write = 100
value after **pp write = 200
other = 77
*p now points to other = 77
**pp also sees 77
Stretch Exercise
Write a function that takes a pointer to a pointer:
void redirect(int **pp, int *new_target) {
*pp = new_target;
}
- Call it to redirect
pfromvaluetoother. - Print
*pafterwards to see the result.
This is a common idiom in kernel code, where functions receive a pointer to a pointer so they can safely update what the caller's pointer points to.
Common Beginner Pitfalls at Declaration and Use
When you start declaring and using pointers, a new set of mistakes can creep in. They are different from the conceptual traps we already covered, and each has a simple way to avoid it.
One mistake is misplacing the star * in a declaration. Writing int* a, b; looks like you declared two pointers, but in reality only a is a pointer, and b is a plain integer. To avoid this confusion, always write the star next to each variable name: int *a, *b;. This makes it explicit that both are pointers.
Another trap is assigning a pointer without matching the type. For instance, storing the address of a char in an int * may compile with warnings but is unsafe. Always ensure the pointer type matches the type of the variable you are pointing to. If you need to work with different types, use casts carefully and deliberately, not by accident.
A common error is dereferencing too early. Beginners sometimes write *p before assigning p a valid address, which leads to undefined behaviour. Get into the habit of initialising pointers at declaration. Use NULL if you do not yet have a valid address, and only dereference after you have assigned a real target.
Another pitfall is overthinking addresses. Printing or comparing the raw numeric value of pointers is rarely meaningful. What matters is the relationship between a pointer and its pointee. Focus on using %p in printf to display addresses for debugging, but remember that addresses themselves are not portable values you can calculate casually.
Finally, declaring multiple pointers in one line without care often causes subtle errors. A line like int *a, b, c; gives you one pointer and two integers, not three pointers. To avoid mistakes, keep pointer declarations simple and clear, and never assume all variables in a list share the same pointer type.
By adopting these habits early, clear declarations, matching types, safe initialisation, and careful dereferencing, you will build a strong foundation for working with pointers in larger programs and in FreeBSD kernel code.
Why This Matters in Driver Development
Declaring and using pointers correctly is more than a matter of style. In FreeBSD drivers, you will often see entire groups of pointers declared together, each one tied to a subsystem of the kernel or a hardware resource. If you misdeclare one of these pointers, you might end up mixing integers and addresses, leading to very subtle bugs.
Consider linked lists in the kernel. A device driver might declare several pointers to structures like struct mbuf (network buffers) or struct cdev (device entries). These pointers are chained together to form queues, and every declaration must be precise. One missing star or one mismatched type can mean that a list traversal ends in a kernel panic.
Another reason declaration matters is efficiency. Kernel code often creates pointers that refer to existing objects instead of making copies. Declaring pointers correctly means you can traverse large structures, like the list of threads in a process, without duplicating data or wasting memory.
The lesson is clear: understanding how to declare and use pointers properly gives you the vocabulary to describe kernel objects, navigate them, and connect them together safely.
Wrapping Up
At this point, you have moved beyond simply knowing what a pointer is. You can now declare a pointer, initialise it, print its address, dereference it, and even use it to modify a variable indirectly. You have seen how FreeBSD uses these techniques in real code, such as iterating through process threads or updating kernel structures. You also practised with pointer chains and saw how a pointer-to-pointer lets you redirect another pointer, a pattern that will appear often in kernel APIs.
What makes this knowledge powerful is that it transforms your ability to work with functions. Passing pointers into functions allows those functions to update the caller's data, to redirect a pointer to a new target, or to return multiple results at once. In kernel development, this is not a rare pattern but the norm. Drivers almost always interact with the kernel and hardware by passing pointers into functions and receiving updated pointers back.
In the next section, we will explore Pointers and Functions, and you will see how this combination becomes the standard way to write flexible, efficient, and safe code inside FreeBSD.
Pointers and Functions
Back in Section 4.7, we learned that function parameters are always passed by value. This means that when you call a function, it usually receives only a copy of the variable you provide. As a result, the function can't change the original value that lives in the caller.
Now that we've introduced pointers, we have a new possibility. By passing a pointer into a function, you give it direct access to the caller's memory. This is the standard way in C and especially in FreeBSD kernel code to let functions modify data outside their own scope or return multiple results.
Let's walk through the difference step by step.
First Try: Passing by Value (Doesn't Work)
#include <stdio.h>
void set_to_zero(int n) {
n = 0; // Only changes the copy
}
int main(void) {
int x = 10;
set_to_zero(x);
printf("x is now: %d\n", x); // Still prints 10!
return 0;
}
Here, the function set_to_zero() receives a copy of x. That copy is modified, but the real x in main() never changes.
Second Try: Passing by Pointer (Works)
#include <stdio.h>
void set_to_zero(int *n) {
*n = 0; // Follow the pointer and change the real variable
}
int main(void) {
int x = 10;
set_to_zero(&x); // Give the function the address of x
printf("x is now: %d\n", x); // Prints 0!
return 0;
}
This time the caller sends the address of x using &x. Inside the function, *n lets us reach into that memory location and actually change the variable in main().
This pattern is simple but incredibly powerful. It transforms functions from isolated workshops into tools that can work directly on the caller's data.
Why This Matters in the Kernel
In kernel code, this technique is not just helpful, it's essential. Kernel functions often need to report back multiple pieces of information. Since C does not allow multiple return values, the usual approach is to pass pointers to variables or structures that the function can fill in.
Here's an example you can find in the FreeBSD source file sys/kern/tty_info.c, at line 382:
rufetchcalc(p, &ru, &utime, &stime);
The function rufetchcalc() fills in statistics about CPU usage for a process. It can't simply "return" all three results, so it accepts pointers to variables where it will write the data.
Let's simplify this with a small simulation:
#include <stdio.h>
// A simplified kernel-style function
void get_times(int *user, int *system) {
*user = 12; // Fake "user time"
*system = 8; // Fake "system time"
}
int main(void) {
int utime, stime;
get_times(&utime, &stime);
printf("User time: %d\n", utime);
printf("System time: %d\n", stime);
return 0;
}
Here, get_times() updates both utime and stime in one call. That's precisely how kernel code returns complex results without extra overhead.
Common Beginner Pitfalls
Pointers with functions are a frequent stumbling block for beginners. Watch out for these mistakes:
- Forgetting the
&in the call: If you writeset_to_zero(x)instead ofset_to_zero(&x), you'll pass the value instead of the address, and nothing will change. - Assigning the pointer, not the value: Inside the function, writing
n = 0;only overwrites the pointer itself. You must use*n = 0;to change the caller's variable. - Overstepping responsibilities: A function should not free or reallocate memory that belongs to the caller unless it is explicitly designed to do so. Otherwise, you risk creating dangling pointers.
The safest habit is always to be clear about what the pointer represents, and to think carefully before modifying anything that belongs to the caller.
Hands-On Lab: Write Your Own Setter
Here's a small challenge to test your understanding. Complete the function so that it doubles the value of the variable given:
#include <stdio.h>
void double_value(int *n) {
// TODO: Write code that makes *n twice as large
}
int main(void) {
int x = 5;
double_value(&x);
printf("x is now: %d\n", x); // Should print 10
return 0;
}
If your function works, you've just written your first function that modifies a caller's variable using a pointer, precisely the kind of operation you'll use constantly in device drivers.
Wrapping Up: Pointers Unlock New Possibilities
By themselves, functions only work with copies. With pointers, functions gain the ability to modify the caller's variables and to pass back multiple results efficiently. This pattern shows up everywhere in FreeBSD, from memory management to process scheduling, and it is one of the most essential techniques you can master as a future driver developer.
Now that we've seen how pointers connect with functions, let's take the next step. Pointers also form a natural partnership with another key feature of C: arrays. In the next section, we'll explore Pointers and Arrays: A Powerful Duo, and you'll discover how these two concepts work together to make memory access both flexible and efficient.
Pointers and Arrays: A Powerful Duo
In C, arrays and pointers are like close friends. They are not the same thing, but they are deeply connected, and in practice, they often work together. This connection shows up constantly in kernel code, where performance and direct memory access matter. If you can understand how arrays and pointers interact, you will unlock a powerful toolset for navigating buffers, strings, and hardware data.
What's the Connection?
There is a straightforward rule that explains most of the relationship:
In most expressions, the name of an array acts like a pointer to its first element.
That means if you declare:
int numbers[3] = {10, 20, 30};
Then the name numbers can be treated as if it were the same as &numbers[0]. So the line:
int *ptr = numbers;
Is equivalent to:
int *ptr = &numbers[0];
The pointer ptr now points directly to the first element of the array.
A Simple Example
#include <stdio.h>
int main(void) {
int values[3] = {100, 200, 300};
int *p = values; // 'values' behaves like &values[0]
printf("First value: %d\n", *p); // prints 100
printf("Second value: %d\n", *(p + 1)); // prints 200
printf("Third value: %d\n", *(p + 2)); // prints 300
return 0;
}
Here, the pointer p starts at the first element. Adding 1 to p moves it forward by one integer, and so on. This is called pointer arithmetic, and we will study its rules more carefully in the next section. For now, the key idea is that arrays and pointers share the same memory layout, which makes moving through an array with a pointer both natural and efficient.
Using Arrays and Pointers in FreeBSD
The FreeBSD kernel makes heavy use of this connection. A good example is found inside sys/kern/tty_info.c in the tty_info() function, at line 384:
strlcpy(comm, p->p_comm, sizeof comm);
Here, p->p_comm is a character array that belongs to a process structure. The variable comm is another array declared locally, at line 295:
char comm[MAXCOMLEN + 1];
The function strlcpy() copies the string from one array into another. Under the hood, it uses pointer arithmetic to walk through each character until the copy is done. You do not need to see those details to use it, but it is important to know that arrays and pointers make this possible. This is why so many kernel functions operate on "char *" even though you often start with a character array.
Arrays and Pointers: The Differences That Matter
Since arrays and pointers behave in similar ways, it is tempting to think they are the same thing. But they are not, and understanding the differences will help you avoid many subtle bugs.
When you declare an array, the compiler reserves a fixed block of memory large enough to hold all its elements. The array's name represents this memory location, and that association cannot be changed. For example, if you declare int a[5];, the compiler allocates space for five integers, and a will always refer to that same block of memory. You cannot later reassign a to point somewhere else.
A pointer, by contrast, is a variable that stores an address. It does not allocate storage for multiple items by itself. Instead, it can point to any valid memory location you choose. For instance, int *p; creates a pointer that may later hold the address of the first element of an array, the address of a single variable, or memory that has been allocated dynamically. You can also reassign the pointer freely, making it a much more flexible tool.
Another key distinction is that the compiler knows the size of an array, but it does not track the size of the memory a pointer refers to. This means the array's boundaries are known at compile time, while a pointer only knows where it starts, not how far it extends. That responsibility falls on you as the programmer.
These rules can be summarised in plain language. An array is a fixed block of storage, like a house built on a specific lot of land. A pointer is like a set of keys: you can use them to access that house, but tomorrow you might use the same keys to open another house entirely. Both are useful, but they serve different purposes, and the distinction becomes crucial in kernel code where memory management and safety cannot be left to chance.
Common Beginner Pitfall: Off-by-One Errors
Because arrays and pointers are so closely related, the same mistake can happen in two different guises: stepping one element too far.
With arrays, the error looks like this:
int items[3] = {1, 2, 3};
printf("%d\n", items[3]); // Invalid! Out of bounds
Here, the valid indices are 0, 1, and 2. Using 3 goes past the end.
With pointers, the same error can happen more subtly:
int items[3] = {1, 2, 3};
int *p = items;
printf("%d\n", *(p + 3)); // Also invalid, same as items[3]
In both cases, you are asking for memory beyond the array's boundary. The compiler will not stop you, and the program may even seem to run correctly sometimes, which makes the bug even more dangerous.
In user programs, this usually means corrupted data or a crash. In kernel code, it can mean memory corruption, a kernel panic, or even a security hole. That is why experienced FreeBSD developers are extremely careful when writing loops that walk through arrays or buffers with pointers. The loop condition is just as important as the loop body.
Safety Habit
The best way to avoid off-by-one errors is to make your loop boundaries explicit and double-check them. If an array has n elements, valid indices always run from 0 to n - 1. When using a pointer, think in terms of "how many elements have I advanced?" rather than "how many bytes."
For example:
for (int i = 0; i < 3; i++) {
printf("%d\n", items[i]); // Safe, i goes 0..2
}
for (int i = 0; i < 3; i++) {
printf("%d\n", *(p + i)); // Safe, same rule
}
By making the upper limit part of your loop condition, you ensure you never walk past the end of the array. This habit will save you from many subtle bugs, especially when you move from small exercises into real kernel code.
Why FreeBSD Style Embraces This Duo
Many FreeBSD subsystems manage buffers as arrays while navigating them with pointers. This combination lets the kernel avoid unnecessary copying, keep operations efficient, and interact directly with hardware. Whether you are looking at character device buffers, network packet rings, or process command names, you will see this pattern again and again.
By mastering the array–pointer relationship, you will be able to read and write kernel code more confidently, recognizing when the code is simply walking through memory one element at a time.
Hands-On Lab: Walking an Array with a Pointer
Try writing a small program that prints all the elements of an array using a pointer instead of array indexing.
#include <stdio.h>
int main(void) {
int numbers[5] = {10, 20, 30, 40, 50};
int *p = numbers;
for (int i = 0; i < 5; i++) {
printf("numbers[%d] = %d\n", i, *(p + i));
}
return 0;
}
Now, modify the loop so that instead of writing *(p + i), you increment the pointer directly:
for (int i = 0; i < 5; i++) {
printf("numbers[%d] = %d\n", i, *p);
p++;
}
Notice that the result is the same. This is the power of combining pointers and arrays. Try experimenting by starting the pointer at &numbers[2] and see what gets printed.
Wrapping Up: Arrays and Pointers Work Best Together
You have now seen how arrays and pointers fit together. Arrays provide the structure, while pointers provide the flexibility to navigate memory efficiently. In the FreeBSD kernel, this combination is everywhere, from device buffers to string manipulation. Always remember the two golden rules: array[i] is equivalent to *(array + i), and you must never step outside the bounds of the array.
In the next section, we will explore Pointer Arithmetic more deeply. You will learn how incrementing a pointer works under the hood, why it follows the size of the type, and what boundaries you must respect to avoid stepping into dangerous memory.
Pointer Arithmetic and Boundaries
Now that you know how pointers can point to individual variables and even to arrays, we are ready to take the next step: learning how to move those pointers around. This ability is called pointer arithmetic.
The name might sound intimidating, but the idea is simple. Imagine a row of boxes placed neatly side by side. A pointer is like your finger pointing to one of those boxes. Pointer arithmetic is nothing more than moving your finger forward or backward to reach another box.
What Is Pointer Arithmetic?
When you add or subtract an integer from a pointer, C advances the pointer by elements, not by raw bytes. The step size depends on the type the pointer refers to:
- If
pis anint *, thenp + 1moves bysizeof(int)bytes. - If
qis achar *, thenq + 1moves bysizeof(char)bytes (which is always 1). - If
ris adouble *, thenr + 1moves bysizeof(double)bytes.
This behaviour is what makes pointer arithmetic natural for walking arrays, because arrays live in contiguous memory. Each "+1" lands you precisely on the next element, not in the middle of it.
Let’s see a complete program that demonstrates this. I have added comments to allow you to understand what's happening at each step.
Save it as pointer_arithmetic_demo.c:
/*
* Pointer Arithmetic Demo (commented)
*
* This program shows that when you add 1 to a pointer, it moves by
* the size of the element type (in elements, not raw bytes).
* It prints addresses and values for int, char, and double arrays,
* and then shows how to compute the distance between two pointers
* using ptrdiff_t. All accesses stay within bounds.
*/
#include <stdio.h> // printf
#include <stddef.h> // ptrdiff_t, size_t
int main(void) {
/*
* Three arrays of different element types.
* Arrays live in contiguous memory, which is why pointer arithmetic
* can step through them safely when we stay within bounds.
*/
int arr[] = {10, 20, 30, 40};
char text[] = {'A', 'B', 'C', 'D'};
double nums[] = {1.5, 2.5, 3.5};
/*
* Pointers to the first element of each array.
* In most expressions, an array name "decays" to a pointer to its first element.
* So arr has type "array of int", but here it becomes "int *" automatically.
*/
int *p = arr;
char *q = text;
double *r = nums;
/*
* Compute the number of elements in each array.
* sizeof(array) gives the total bytes in the array object.
* sizeof(array[0]) gives the bytes in one element.
* Dividing the two gives the element count.
*/
size_t n_ints = sizeof(arr) / sizeof(arr[0]);
size_t n_chars = sizeof(text) / sizeof(text[0]);
size_t n_doubles = sizeof(nums) / sizeof(nums[0]);
printf("Pointer Arithmetic Demo\n\n");
/*
* INT DEMO
* p + i moves i elements forward, which is i * sizeof(int) bytes.
* We print both the address and the value to see how it steps.
* %zu is the correct format specifier for size_t.
* %p prints a pointer address; cast to (void *) for correct printf typing.
*/
printf("== int demo ==\n");
printf("sizeof(int) = %zu bytes\n", sizeof(int));
for (size_t i = 0; i < n_ints; i++) {
printf("p + %zu -> address %p, value %d\n",
i, (void*)(p + i), *(p + i)); // *(p + i) reads the i-th element
}
printf("\n");
/*
* CHAR DEMO
* For char, sizeof(char) is 1 by definition.
* q + i advances one byte per step.
*/
printf("== char demo ==\n");
printf("sizeof(char) = %zu byte\n", sizeof(char));
for (size_t i = 0; i < n_chars; i++) {
printf("q + %zu -> address %p, value '%c'\n",
i, (void*)(q + i), *(q + i));
}
printf("\n");
/*
* DOUBLE DEMO
* For double, sizeof(double) is typically 8 on modern systems.
* r + i advances by 8 bytes per step on those systems.
*/
printf("== double demo ==\n");
printf("sizeof(double) = %zu bytes\n", sizeof(double));
for (size_t i = 0; i < n_doubles; i++) {
printf("r + %zu -> address %p, value %.1f\n",
i, (void*)(r + i), *(r + i));
}
printf("\n");
/*
* Pointer difference
* Subtracting two pointers that point into the same array
* yields a value of type ptrdiff_t that represents the distance
* in elements, not bytes.
*
* Here, a points to arr[0] and b points to arr[3].
* The difference b - a is 3 elements.
*/
int *a = &arr[0];
int *b = &arr[3];
ptrdiff_t diff = b - a; // distance in ELEMENTS
/*
* %td is the correct format specifier for ptrdiff_t.
* We also verify that advancing a by diff elements lands exactly on b.
*/
printf("Pointer difference (b - a) = %td elements\n", diff);
printf("Check: a + diff == b ? %s\n", (a + diff == b) ? "yes" : "no");
/*
* Program ends successfully.
*/
return 0;
}
Compile and run it on FreeBSD:
% cc -Wall -Wextra -o pointer_arithmetic_demo pointer_arithmetic_demo.c
% ./pointer_arithmetic_demo
You will see how each type moves by its element size. The int * steps by 4 bytes (on most modern FreeBSD systems), the char * steps by 1, and the double * steps by 8. Notice how the addresses jump accordingly, while the values are fetched correctly with *(pointer + i).
The final check with b - a shows that pointer subtraction is also measured in elements, not bytes. If a points to the start of the array and b points three elements ahead, then b - a gives 3.
This program demonstrates the essential rule of pointer arithmetic: C moves pointers in units of the pointed-to type. That is why it works so well with arrays, but also why you must be careful, a wrong step can quickly take you beyond the valid memory region.
Walking Through Arrays with Pointers
This property makes pointers especially useful when working with arrays. Since arrays are laid out in contiguous blocks of memory, a pointer can naturally step through them one element at a time.
#include <stdio.h>
int main() {
int numbers[] = {1, 2, 3, 4, 5};
int *ptr = numbers; // Start at the first element
for (int i = 0; i < 5; i++) {
printf("Element %d: %d\n", i, *(ptr + i));
// *(ptr + i) is equivalent to numbers[i]
}
return 0;
}
Here, the expression *(ptr + i) asks C to move forward i positions from the starting pointer, then fetch the value at that location. The result is identical to numbers[i]. In fact, C allows both notations interchangeably. Whether you write numbers[i] or *(numbers + i), you are doing the same thing.
Staying Within Boundaries
Pointer arithmetic is powerful, but it comes with a serious responsibility: you must never move beyond the memory that belongs to your array. If you do, you step into undefined behaviour. That can mean a crash, corrupted memory, or silent errors that appear much later.
#include <stdio.h>
int main() {
int data[] = {42, 77, 99};
int *ptr = data;
// Wrong! This goes past the last element.
printf("Invalid access: %d\n", *(ptr + 3));
return 0;
}
The array data has three elements, valid at indices 0, 1, and 2. But ptr + 3 points to a place immediately after the last element. C does not stop you, but the result is unpredictable.
The safe way is to always respect the number of elements in your array. Instead of hardcoding the size, you can calculate it:
for (int i = 0; i < sizeof(data) / sizeof(data[0]); i++) {
printf("%d\n", *(data + i));
}
This expression divides the total size of the array by the size of a single element, giving the correct element count regardless of the array's length.
A Glimpse into the FreeBSD Kernel
Pointer arithmetic shows up frequently in the FreeBSD kernel. Sometimes it is used directly with arrays, but more often it appears while navigating through structures linked together in memory. Let us look at a small excerpt adapted from sys/kern/tty_info.c (near line 333):
struct proc *p, *ppick;
p = NULL;
LIST_FOREACH(ppick, &tp->t_pgrp->pg_members, p_pglist) {
if (proc_compare(p, ppick))
p = ppick;
}
This loop is not using ptr + 1 as in our array examples, but it is doing the same conceptual job: moving through memory by following pointer links. Instead of stepping through integers in a row, it walks through a chain of process structures connected in a list. The lesson is the same: pointers let you move from one element to the next, but you must always be careful to stay within the intended bounds of the structure.
Common Beginner Pitfalls
- Forgetting that pointers move by type size: If you assume
p + 1moves one byte, you will misunderstand the result. It always moves by the size of the type the pointer refers to. - Going past array limits: Accessing
arr[5]when the array has only 5 elements (valid indices 0-4) is a classic off-by-one mistake. - Mixing arrays and pointers too casually: Although
arr[i]and*(arr + i)are equivalent, an array name itself is not a modifiable pointer. Beginners sometimes try to reassign an array name as if it were a variable, which is not allowed.
To avoid these traps, always calculate sizes carefully, use sizeof when possible, and keep in mind that arrays and pointers are close relatives but not identical twins.
Tip: Arrays Decay into Pointers
When you pass an array to a function, C actually gives the function only a pointer to the first element. The function has no way to know how many elements exist. For safety, always pass the array size along with the pointer. This is why so many C library and FreeBSD kernel functions include both a buffer pointer and a length parameter.
Mini Hands-On Lab: Walking Safely with Pointer Arithmetic
In this lab, you will experiment with pointer arithmetic on arrays, see how to walk through memory step by step, and learn how to detect and prevent stepping outside array boundaries.
Create a file called lab_pointer_bounds.c with the following code:
#include <stdio.h>
int main(void) {
int values[] = {5, 10, 15, 20};
int *ptr = values; // Start at the first element
int length = sizeof(values) / sizeof(values[0]);
printf("Array has %d elements\n", length);
// Walk forward through the array
for (int i = 0; i < length; i++) {
printf("Forward step %d: %d\n", i, *(ptr + i));
}
// Walk backwards using pointer arithmetic
for (int i = length - 1; i >= 0; i--) {
printf("Backward step %d: %d\n", i, *(ptr + i));
}
// Demonstrate boundary checking
int index = 4; // Out-of-bounds index
if (index >= 0 && index < length) {
printf("Safe access: %d\n", *(ptr + index));
} else {
printf("Index %d is out of bounds, refusing to access\n", index);
}
return 0;
}
Step 1: Compile and run
% cc -o lab_pointer_bounds lab_pointer_bounds.c
% ./lab_pointer_bounds
You should see output that walks forward and backward through the array. Notice how both directions are handled using the same pointer, just with different arithmetic.
Step 2: Try breaking the rule
Now, change the boundary check:
printf("Unsafe access: %d\n", *(ptr + 4));
Compile and run again. On your system, it might print a random number, or it might crash. This is undefined behaviour in action. The program stepped outside the array’s safe memory and read garbage. On FreeBSD in user space this might only crash your program, but in kernel space the same mistake could crash the whole system.
Step 3: Think like a kernel developer
Kernel code often passes around buffers and pointers, but the kernel does not automatically check array limits for you. Safe coding habits like the check we used earlier, are critical:
if (index >= 0 && index < length) { ... }
Always validate that you are within valid bounds before dereferencing a pointer.
Key Takeaways from This Lab
- Pointer arithmetic lets you move forward and backwards through arrays.
- Arrays do not carry their length with them; you must track it yourself.
- Accessing memory beyond array boundaries is undefined behaviour.
- In FreeBSD kernel code, these mistakes can lead to panics or vulnerabilities, so always include boundary checks.
Challenge Questions
- Forward and backwards in one pass: Write a function
void walk_both(const int *base, size_t n)that prints pairs(base[i], base[n-1-i])using only pointer arithmetic, no array indexing. Stop when the pointers meet or cross. - Bounds checked accessor: Implement
int get_at(const int *base, size_t n, size_t i, int *out)that returns 0 on success and a non-zero error code ifiis out of range. Use only pointer arithmetic to read the value. - Find first match: Write
int *find_first(int *base, size_t n, int target)that returns a pointer to the first occurrence orNULLif not found. Walk using a moving pointer frombasetobase + n. - Reverse in place: Create
void reverse_in_place(int *base, size_t n)that swaps elements from the ends toward the middle using two pointers. Do not use indexing. - Safe slice print: Write
void print_slice(const int *base, size_t n, size_t start, size_t count)that prints at mostcountelements beginning atstart, but never steps beyondn. - Off by one detector: Introduce an off-by-one bug in a loop, then add a runtime check that detects it. Fix the loop and confirm the check remains silent.
- Stride walk: Treat the array as records of
strideints. Writevoid walk_stride(const int *base, size_t n, size_t stride)that visits only the first element of each record. - Pointer difference: Given two pointers
aandbinto the same array, compute the element distance usingptrdiff_t. Verify thata + distance == b. - Guarded dereference: Implement
int try_deref(const int *p, const int *begin, const int *end, int *out)that only dereferences ifplies within[begin, end). - Refactor into functions: Rewrite the lab so that walking, printing, and boundary checking are separate functions that all use pointer arithmetic.
Wrapping Up
Pointer arithmetic gives you a new way to move through memory. It allows you to scan arrays efficiently, traverse structures, and interact with hardware buffers. But with this power comes the danger of stepping outside the safe zone. In user programs, that might crash only your program. In kernel code, a mistake could crash the entire system or open a security hole.
As you continue your journey, keep the mental image of walking along a row of boxes. Step carefully, never wander off the edge, and always count how many boxes you really have. With this habit, you will be ready for the next topic: using pointers to access not just plain values, but entire structures, the building blocks of more complex data in FreeBSD.
Pointers to Structs
In C programming, especially when writing device drivers or working inside the FreeBSD kernel, you will constantly encounter structs. A struct is a way of grouping several related variables together under one name. These variables, called fields, can be of different types, and together they represent a more complex entity.
Because structs are often large and frequently shared between parts of the kernel, we usually interact with them through pointers. Understanding how to work with pointers to structs is, therefore, a crucial skill for reading kernel code and writing drivers of your own.
Let’s build this step by step.
What Is a Struct?
A struct groups multiple variables into a single unit. For example:
struct Point {
int x;
int y;
};
This defines a new type called struct Point, which has two integer fields: x and y.
We can create a variable of this type and assign values to its fields:
struct Point p1;
p1.x = 10;
p1.y = 20;
At this point, p1 is a concrete object in memory, holding two integers side by side.
Introducing Pointers to Structs
Just like we can have a pointer to an integer or a pointer to a character, we can also have a pointer to a struct:
struct Point *ptr;
If we already have a variable of type struct Point, we can store its address in the pointer:
ptr = &p1;
Now ptr holds the address of p1. To reach the fields of the struct through this pointer, C provides two notations.
Accessing Struct Fields Through Pointers
Suppose we have:
struct Point p1 = {10, 20};
struct Point *ptr = &p1;
There are two ways to access the fields via the pointer:
// Method 1: Explicit dereference
printf("x = %d\n", (*ptr).x);
// Method 2: Arrow operator
printf("y = %d\n", ptr->y);
Both are correct, but the arrow operator (->) is much cleaner and is the style you will see everywhere in FreeBSD kernel code.
So:
ptr->x
Means the same as:
(*ptr).x
But it is easier to read and write.
Why Pointers Are Preferred
Passing an entire struct around by value can be expensive, since C would need to copy every field each time. Instead, the kernel almost always passes around pointers to structs. This way, only an address (a small integer) is passed, and different parts of the system can all work with the same underlying object.
This is particularly important in device drivers, where structs often represent significant and complex entities such as devices, processes, or threads.
A Real FreeBSD Example
Let’s look at a real snippet from the FreeBSD source tree.
In sys/kern/tty_info.c, the tty_info() function works with a struct proc, which represents a process. This code appears around lines 331-350 in the FreeBSD 14.3 source:
struct proc *p, *ppick;
p = NULL;
// Walk through all members of the foreground process group
LIST_FOREACH(ppick, &tp->t_pgrp->pg_members, p_pglist)
// Use proc_compare() to decide if this candidate is "better"
if (proc_compare(p, ppick))
p = ppick;
// Later in the function, access the chosen process
pid = p->p_pid; // get the process ID
strlcpy(comm, p->p_comm, sizeof comm); // copy the process name into a local buffer
Here’s what happens, step by step:
pandppickare pointers tostruct proc. The code initialisesptoNULL, then iterates over the foreground process group withLIST_FOREACH.- On each step, it calls
proc_compare()to decide whether the current candidateppickis "better" than the one already chosen; if so, it updatesp. Later, it reads fields from the selected process viap->p_pidandp->p_comm.
This is a typical kernel pattern: select a struct instance via pointer traversal, then access its fields through ->.
Note: proc_compare() encapsulates the selection logic: it prefers runnable processes, then the one with higher recent CPU usage, de-prioritises zombies, and breaks ties by choosing the higher PID.
A Quick Bridge: what LIST_FOREACH does
Beginners often see LIST_FOREACH(...) and wonder what magic is happening. There’s no magic, it’s a macro that walks a singly-linked list. Its common BSD style is:
LIST_FOREACH(item, &head->list_field, link_field) {
/* use item->field ... */
}
itemis the loop variable that points to each element.&head->list_fieldis the list head you’re iterating.link_fieldis the name of the link that chains elements together (the "next" pointer field in each node).
In our snippet, ppick is the loop variable, &tp->t_pgrp->pg_members is the list of processes in the foreground process group, and p_pglist is the link field inside each struct proc. Each iteration points ppick at the next process, allowing the code to compare and ultimately select the one stored in p.
This small macro hides the pointer-chasing details so your code reads like: "for each process in this process group, consider it as a candidate."
A Minimal User-Space Example
Here’s a simple program you can compile and run yourself to get hands-on practice:
#include <stdio.h>
#include <string.h>
// Define a simple struct
struct Device {
int id;
char name[20];
};
int main() {
struct Device dev1 = {42, "tty0"};
struct Device *dev_ptr = &dev1;
// Access fields through the pointer
printf("Device ID: %d\n", dev_ptr->id);
printf("Device Name: %s\n", dev_ptr->name);
// Change values through the pointer
dev_ptr->id = 43;
strcpy(dev_ptr->name, "ttyS1");
// Show updated values
printf("Updated Device ID: %d\n", dev_ptr->id);
printf("Updated Device Name: %s\n", dev_ptr->name);
return 0;
}
This mirrors what happens in the kernel. We define a struct, create an instance of it, then use a pointer to access and modify its fields.
Common Beginner Pitfalls with Struct Pointers
Working with pointers to structs is straightforward once you get used to it, but beginners often fall into the same traps. Let’s look at the most frequent mistakes and how to avoid them.
1. Forgetting to Initialise the Pointer A pointer that hasn’t been given a value points to "somewhere" in memory, which usually means garbage. Accessing it causes undefined behaviour, often leading to crashes.
struct Device *d; // Uninitialised, points to who-knows-where
d->id = 42; // Undefined behaviour
How to avoid it: Always initialise your pointer, either to NULL or to the address of a real struct.
struct Device dev1;
struct Device *d = &dev1; // Safe: d points to dev1
2. Confusing . and ->
Remember: use the dot (.) to access a field when you have a real struct variable, and use the arrow (->) when you have a pointer to a struct. Mixing them up is a common beginner error.
struct Point p1 = {1, 2};
struct Point *ptr = &p1;
printf("%d\n", p1.x); // Dot for variables
printf("%d\n", ptr->x); // Arrow for pointers
printf("%d\n", ptr.x); // Error: ptr is not a struct
How to avoid it: Ask yourself: "Am I working with a pointer or the struct itself?" That tells you which operator to use.
3. Dereferencing NULL Pointers
If a pointer is set to NULL (which is common as an initial or error state), dereferencing it will immediately crash your program.
struct Device *d = NULL;
printf("%d\n", d->id); // Crash: d is NULL
How to avoid it: Always check pointers before dereferencing:
if (d != NULL) {
printf("%d\n", d->id);
}
In kernel code, this check is especially important. Dereferencing a NULL pointer inside the kernel can bring the whole system down.
4. Using a Pointer After the Struct Has Gone Out of Scope Pointers don’t "own" the struct they point to. If the struct disappears (for example, because it was a local variable in a function that has returned), the pointer becomes invalid, a dangling pointer.
struct Device *make_device(void) {
struct Device d = {99, "tty0"};
return &d; // Dangerous: d disappears after function returns
}
How to avoid it: Never return the address of a local variable. Allocate dynamically (with malloc in user programs or malloc(9) in the kernel) if you need a struct to outlive the function that creates it.
5. Assuming a Struct Is Small Enough to Copy Around In user programs, you can sometimes get away with passing structs by value. But in kernel code, structs often represent large, complex objects, sometimes with embedded lists or pointers of their own. Copying them accidentally can cause subtle and serious bugs.
struct Device dev1 = {1, "tty0"};
struct Device dev2 = dev1; // Copies all fields, not a shared reference
How to avoid it: Pass pointers to structs instead of copying them, unless you are certain that a shallow copy is intended.
Main Takeaway:
The most common mistakes with struct pointers come from forgetting what a pointer really is: just an address. Always initialise pointers, distinguish carefully between . and ->, check for NULL, and be mindful of scope and lifetime. In kernel development, a single misstep with a struct pointer can destabilise the entire system, so adopting safe habits early will serve you well.
Mini Hands-On Lab: Struct Pointer Pitfalls (No malloc yet)
This lab reproduces common mistakes with pointers to structs and then shows safe, beginner-friendly alternatives using only stack variables and function "out-parameters".
Create a file lab_struct_pointer_pitfalls_nomalloc.c:
/*
* lab_struct_pointer_pitfalls_nomalloc.c
*
* Classic pitfalls with pointers to structs, rewritten to avoid malloc.
* Build and run each case and read the console output as you go.
*
* NOTE: This file intentionally contains ONE warning for make_device_bad()
* to demonstrate what the compiler catches. This is expected!
*/
#include <stdio.h>
#include <string.h>
struct Device {
int id;
char name[16];
};
/* -----------------------------------------------------------
* Case 1: Uninitialised pointer (UB) vs. correctly initialised
* ----------------------------------------------------------- */
static void case_uninitialised_pointer(void) {
printf("\n[Case 1] Uninitialised pointer vs initialised\n");
/* Wrong: d has no valid target. Dereferencing is undefined. */
/* struct Device *d; */
/* d->id = 42; // Do NOT do this */
/* Right: point to a real object or keep NULL until assigned. */
struct Device dev = { 1, "tty0" };
struct Device *ok = &dev;
ok->id = 2;
strcpy(ok->name, "ttyS0");
printf(" ok->id=%d ok->name=%s\n", ok->id, ok->name);
struct Device *maybe = NULL;
if (maybe == NULL) {
printf(" maybe is NULL, avoiding dereference\n");
}
}
/* -----------------------------------------------------------
* Case 2: Dot vs arrow confusion
* ----------------------------------------------------------- */
static void case_dot_vs_arrow(void) {
printf("\n[Case 2] Dot vs arrow\n");
struct Device dev = { 7, "console0" };
struct Device *p = &dev;
/* Correct usage */
printf(" dev.id=%d dev.name=%s\n", dev.id, dev.name); /* variable: use . */
printf(" p->id=%d p->name=%s\n", p->id, p->name); /* pointer: use -> */
/* Uncomment to observe the compiler error for teaching purposes */
/* printf("%d\n", p.id); */ /* p is a pointer, not a struct */
}
/* -----------------------------------------------------------
* Case 3: NULL pointer dereference vs guarded access
* ----------------------------------------------------------- */
static void case_null_deref(void) {
printf("\n[Case 3] NULL dereference guard\n");
struct Device *p = NULL;
/* Wrong: would crash */
/* printf("%d\n", p->id); */
/* Right: guard before dereferencing if pointer may be NULL */
if (p != NULL) {
printf(" id=%d\n", p->id);
} else {
printf(" p is NULL, skipping access\n");
}
}
/* -----------------------------------------------------------
* Case 4: Dangling pointer (returning address of a local)
* and two safe alternatives WITHOUT malloc
* ----------------------------------------------------------- */
/* Dangerous factory: returns address of a local variable (dangling) */
static struct Device *make_device_bad(void) {
struct Device d = { 99, "bad-local" };
return &d; /* The address becomes invalid when the function returns */
}
/* Safe alternative A: initialiser that writes into caller-provided struct */
static void init_device(struct Device *out, int id, const char *name) {
if (out == NULL) return;
out->id = id;
strncpy(out->name, name, sizeof(out->name) - 1);
out->name[sizeof(out->name) - 1] = '\0';
}
/* Safe alternative B: return-by-value (fine for small, plain structs) */
static struct Device make_device_value(int id, const char *name) {
struct Device d;
d.id = id;
strncpy(d.name, name, sizeof(d.name) - 1);
d.name[sizeof(d.name) - 1] = '\0';
return d; /* Returned by value: the caller receives its own copy */
}
static void case_dangling_and_safe_alternatives(void) {
printf("\n[Case 4] Dangling vs safe init (no malloc)\n");
/* Wrong: pointer becomes dangling immediately */
struct Device *bad = make_device_bad();
(void)bad; /* Do not dereference; it is invalid. */
printf(" bad points to invalid memory; we will not dereference it\n");
/* Safe A: caller owns storage; callee fills it via pointer */
struct Device owned_a;
init_device(&owned_a, 123, "owned-A");
printf(" owned_a.id=%d owned_a.name=%s\n", owned_a.id, owned_a.name);
/* Safe B: small plain struct returned by value */
struct Device owned_b = make_device_value(124, "owned-B");
printf(" owned_b.id=%d owned_b.name=%s\n", owned_b.id, owned_b.name);
}
/* -----------------------------------------------------------
* Case 5: Accidental struct copy vs pointer sharing
* ----------------------------------------------------------- */
static void case_copy_vs_share(void) {
printf("\n[Case 5] Copy vs share\n");
struct Device a = { 1, "tty0" };
/* Accidental copy: b is a separate struct */
struct Device b = a;
strcpy(b.name, "tty1");
printf(" after copy+edit: a.name=%s, b.name=%s\n", a.name, b.name);
/* Intentional sharing via pointer */
struct Device *pa = &a;
strcpy(pa->name, "tty2");
printf(" after shared edit: a.name=%s (via pa)\n", a.name);
}
/* Bonus: safe initialisation pattern via out-parameter */
static void case_safe_init_pattern(void) {
printf("\n[Bonus] Safe initialisation via pointer\n");
struct Device dev;
init_device(&dev, 55, "control0");
printf(" dev.id=%d dev.name=%s\n", dev.id, dev.name);
}
int main(void) {
case_uninitialised_pointer();
case_dot_vs_arrow();
case_null_deref();
case_dangling_and_safe_alternatives();
case_copy_vs_share();
case_safe_init_pattern();
return 0;
}
Build and run:
% cc -Wall -Wextra -o lab_struct_pointer_pitfalls_nomalloc lab_struct_pointer_pitfalls_nomalloc.c
% ./lab_struct_pointer_pitfalls_nomalloc
What to notice:
-
Uninitialised versus initialised You never dereference an uninitialised pointer. Point it to a real object or keep it as
NULLuntil you have one. -
Dot versus arrow Use
.with a struct variable,->with a pointer. If you uncomment thep.idline, the compiler will flag it. -
NULL dereference Always guard against
NULLwhen there is any chance the pointer might be unset. -
Dangling pointer without malloc Returning the address of a local variable is unsafe because the local goes out of scope. Two safe options that require no heap:
-
Let the caller provide the storage and pass a pointer to be filled in.
-
Return a small, plain struct by value when a copy is intended.
-
-
Copy vs share Copying a struct makes a separate object; editing one does not alter the other. Using a pointer means both names refer to the same object.
Why this matters for driver code
Kernel code passes pointers to structs everywhere. The habits you just practiced are foundational: initialise pointers, choose the correct operator, guard against NULL, avoid dangling pointers by respecting scope, and be deliberate about copying versus sharing. These patterns keep kernel code safe and predictable, long before you ever need dynamic allocation.
Challenge Questions: Pointers to Structs
Try these exercises to make sure you really understand how struct pointers work. Write small C programs for each one and experiment with the output.
-
Dot vs Arrow Write a program that creates a
struct Pointand a pointer to it. Print the fields twice: once using the dot operator (.) and once using the arrow operator (->). Explain why one works with the variable and the other with the pointer. -
Struct Initialiser Function Write a function
void init_point(struct Point *p, int x, int y)that fills in a struct given a pointer. Call it frommainwith a local variable and print the result. -
Returning by Value vs Returning a Pointer Write two functions:
struct Point make_point_value(int x, int y)that returns a struct by value.struct Point *make_point_pointer(int x, int y)that (wrongly) returns a pointer to a local struct. What happens if you use the second function? Why is it dangerous?
-
Safe NULL Handling Modify the program so that a pointer may be set to
NULL. Write a functionprint_point(const struct Point *p)that safely prints"(null)"ifpisNULLinstead of crashing. -
Copy vs Share Create two structs: one by copying another (
struct Point b = a;) and one by sharing via pointer (struct Point *pb = &a;). Change the values in each and print both. What differences do you observe? -
Mini Linked List Define a simple struct:
struct Node { int value; struct Node *next; };Manually create three nodes and chain them together (
n1 -> n2 -> n3). Use a pointer to walk through the list and print the values. This mimics whatLIST_FOREACHdoes in the kernel.
Wrapping Up
Pointers to structs are one of the most important idioms in kernel programming. They allow you to work with complex objects efficiently, without copying large blocks of memory, and they provide the foundation for navigating linked lists and device tables.
You’ve now seen how:
- Structs group related fields into a single object.
- Pointers to structs let you access and modify those fields efficiently.
- The arrow operator (
->) is the preferred way to reach struct fields through a pointer. - Real kernel code relies heavily on struct pointers to represent processes, threads, and devices.
With the challenge questions, you can now test yourself and confirm you really understand how struct pointers behave in C.
The natural next step is to see what happens when we combine these ideas with arrays. Arrays of pointers and pointer arrays are a powerful duo that appear everywhere in kernel code, from device tables to argument lists.
Let’s continue and learn about the next topic: Arrays of Pointers and Pointer Arrays.
Arrays of Pointers and Pointer Arrays
This is one of those topics that many C beginners find tricky: the difference between an array of pointers and a pointer to an array. The declarations look confusingly similar, but they describe very different things. The difference is not just academic. Arrays of pointers are everywhere in FreeBSD code, while pointers to arrays are less common but still important to understand because they appear in contexts where hardware requires contiguous memory blocks.
We will first look at arrays of pointers, then at pointers to arrays, and then connect them to real FreeBSD examples and driver development.
Array of Pointers
An array of pointers is simply an array where each element is itself a pointer. Instead of holding values directly, the array holds addresses pointing to values stored elsewhere.
Example: Array of Strings
.
#include <stdio.h>
int main() {
// Array of 3 pointers to const char (i.e., strings)
const char *messages[3] = {
"Welcome",
"to",
"FreeBSD"
};
for (int i = 0; i < 3; i++) {
printf("Message %d: %s\n", i, messages[i]);
}
return 0;
}
Here, messages is an array of three pointers. Each element, such as messages[0], holds the address of a string literal. When passed to printf, it prints the string.
This is exactly the same structure as the argv parameter to main(): it is just an array of pointers to characters.
Real FreeBSD Example: Locale Names
Looking into FreeBSD 14.3 source code, more specifically in bin/sh/var.c, around line 130:
static const char *const locale_names[7] = {
"LANG", "LC_ALL", "LC_COLLATE", "LC_CTYPE",
"LC_MESSAGES", "LC_MONETARY", "LC_NUMERIC",
};
This is an array of constant character pointers. Each element points to a string literal with the name of a locale category. The shell uses this array to check or set environment variables consistently. This is a compact, idiomatic way to store a table of names.
Real FreeBSD Example: SNMP Transport Names
Looking into FreeBSD 14.3 source code, more specifically, in contrib/bsnmp/lib/snmpclient.c, around line 1887:
static const char *const trans_list[] = {
"udp", "tcp", "local", NULL
};
This is another array of string pointers, terminated by NULL. The SNMP client library uses this list to recognise valid transport names. The use of NULL as a terminator is a very common C idiom.
Note: Arrays of pointers are common in FreeBSD because they allow flexible, dynamic lookup tables without copying large amounts of data. Instead of storing the strings inline, the array stores pointers to them.
Pointer to an Array
A pointer to an array is very different from an array of pointers. Instead of pointing to scattered objects, it points to one single, contiguous array block. The syntax can look intimidating, but the underlying idea is simple: the pointer represents the entire array as a unit.
Example: Pointer to an Array of Integers
.
#include <stdio.h>
int main() {
int numbers[5] = {1, 2, 3, 4, 5};
// Pointer to an array of 5 integers
int (*p)[5] = &numbers;
printf("Third number: %d\n", (*p)[2]);
return 0;
}
Breaking it down:
int (*p)[5];declarespas a pointer to an array of 5 integers.p = &numbers;makesppoint to the wholenumbersarray.(*p)[2]first dereferences the pointer (giving us the array) and then indexes into it.
Another Example with Structures
.
#include <stdio.h>
#define SIZE 4
struct Point {
int x, y;
};
int main(void) {
struct Point pts[SIZE] = {
{0, 0}, {1, 2}, {2, 4}, {3, 6}
};
struct Point (*parray)[SIZE] = &pts;
printf("Third element: x=%d, y=%d\n",
(*parray)[2].x, (*parray)[2].y);
// Modify via the pointer
(*parray)[2].x = 42;
(*parray)[2].y = 84;
printf("After modification: x=%d, y=%d\n",
pts[2].x, pts[2].y);
return 0;
}
Here, parray points to the entire array of four struct Point. Accessing through it is equivalent to accessing the original array directly, but it emphasises that the pointer represents the array as a single unit.
You have now seen two different examples of pointers to arrays. Both show how a single pointer can name a whole, contiguous region of memory. The natural question is how often this appears in real FreeBSD code.
Why You Rarely See This in FreeBSD
Literal T (*p)[N] declarations are uncommon in the kernel source. FreeBSD developers usually represent fixed-size blocks in one of two ways:
- Wrap the array inside a
struct, which keeps size and type information together and leaves room for metadata. - Pass a base pointer along with an explicit length, especially for buffers and I/O regions.
This style makes the code more transparent, easier to maintain, and integrates better with kernel subsystems. The random subsystem is a good example, where structures carry fixed arrays that are treated as single units by the code paths that process them. For background on how drivers and subsystems feed entropy into the kernel, see the random_harvest(9) manual page.
Real FreeBSD Example: Struct with a Fixed-Size Array
FreeBSD 14.3 source, sys/dev/random/random_harvestq.h, lines 33-44:
#define HARVESTSIZE 2 /* Max length in words of each harvested entropy unit */
/* These are used to queue harvested packets of entropy. The entropy
* buffer size is pretty arbitrary.
*/
struct harvest_event {
uint32_t he_somecounter; /* fast counter for clock jitter */
uint32_t he_entropy[HARVESTSIZE];/* some harvested entropy */
uint8_t he_size; /* harvested entropy byte count */
uint8_t he_destination; /* destination pool of this entropy */
uint8_t he_source; /* origin of the entropy */
};
This is not a raw T (*p)[N] declaration, yet it captures the same idea in a form that is clearer and more practical for the kernel. The struct groups a fixed array he_entropy[HARVESTSIZE] with related fields. Code then passes a pointer to struct harvest_event, treating the entire block as one object. In random_harvestq.c you can see how an instance is filled and processed, including copying into he_entropy and setting the size and metadata fields, which reinforces that the array is handled as part of a single unit.
Even though raw pointers to arrays are rare in the tree, understanding them helps you recognise why kernel code tends to wrap arrays in structures or pair a base pointer with an explicit length. Conceptually, it is the same pattern of referring to a contiguous block as a whole.
Mini Hands-On Lab
Let’s test your understanding. For each declaration, decide whether it is an array of pointers or a pointer to an array. Then explain how you would access the third element.
const char *names[] = { "a", "b", "c", NULL };int (*ring)[64];struct foo *ops[8];char (*line)[80];
Check yourself:
- Array of pointers. Third element with
names[2]. - Pointer to an array of 64 ints. Use
(*ring)[2]. - Array of pointers to
struct foo. Useops[2]. - Pointer to an array of 80 chars. Use
(*line)[2].
Challenge Questions
- Why is
argvinmain(int argc, char *argv[])considered an array of pointers rather than a pointer to an array? - In kernel code, why do developers prefer to use a
structwrapping a fixed-size array instead of a raw pointer-to-array declaration? - How does using
NULLas a terminator in arrays of pointers simplify iteration? - Imagine a driver that manages a ring of DMA descriptors. Would you expect this to be represented as an array of pointers or a pointer to an array? Why?
- What could go wrong if you mistakenly treated a pointer to an array as if it were an array of pointers?
Why This Matters for Kernel and Driver Development
In FreeBSD device drivers, arrays of pointers appear constantly. They are used for option lists, function pointer tables, protocol name arrays, and sysctl handlers. This idiom saves space and allows code to iterate flexibly through lists without knowing their exact size in advance.
Pointers to arrays, while rarer, are conceptually important because they match the way hardware often works. A NIC, for example, may expect a contiguous ring buffer of descriptors. In practice FreeBSD developers usually hide the raw pointer-to-array inside a struct that describes the ring, but the underlying idea is identical: the driver is passing around "a single block of fixed-size elements."
Understanding both patterns is part of thinking like a systems programmer. It ensures you will not confuse two declarations that look similar but behave differently, which prevents subtle and painful bugs.
Wrapping Up
By now you can clearly see the difference between an array of pointers and a pointer to an array. You have also seen why this distinction matters when reading or writing real code. Arrays of pointers give flexibility by letting each element point to different objects, while a pointer to an array treats a whole block of memory as a single unit.
With this foundation in place, we are ready to take the next step: moving from fixed arrays to dynamically allocated memory. In the following section on Dynamic Memory Allocation, you will learn how to use functions such as malloc, calloc, realloc, and free to create arrays at runtime. We will connect this to pointers by showing how to allocate each element separately, how to request one contiguous block when you need a pointer to an array, and how to clean up properly if something goes wrong. This transition from static to dynamic memory is essential for real systems programming and will prepare you for the way memory is managed inside the FreeBSD kernel.
Dynamic Memory Allocation
So far, most of the memory we used in examples was fixed-size: arrays with a known length or structures allocated on the stack. But when writing system-level code like FreeBSD device drivers, you often don't know in advance how much memory you'll need. Maybe a device reports the number of buffers only after probing, or the amount of data depends on user input. That's when dynamic memory allocation comes into play.
Dynamic allocation allows your code to ask the system for memory while it's running, and to give it back when it's no longer needed. This flexibility is essential for drivers, where hardware and workload conditions can change at runtime.
User Space vs Kernel Space
In user space, you've probably seen functions like:
malloc(size)- allocates a block of memory.calloc(count, size)- allocates and zeroes a block.free(ptr)- releases previously allocated memory.
Example:
#include <stdio.h>
#include <stdlib.h>
int main(void) {
int *nums;
int count = 5;
nums = malloc(count * sizeof(int));
if (nums == NULL) {
printf("Memory allocation failed!\n");
return 1;
}
for (int i = 0; i < count; i++) {
nums[i] = i * 10;
printf("nums[%d] = %d\n", i, nums[i]);
}
free(nums); // Always release what you allocated
return 0;
}
In user space, memory comes from the heap, managed by the C runtime and the operating system.
But inside the FreeBSD kernel, we cannot use <stdlib.h>'s malloc() or free(). The kernel has its own allocator, designed with stricter rules and better tracking. The kernel API is documented in malloc(9).
Visualising Memory in User Space
+-----------------------------------------------------------+
| STACK (grows down) |
| - Local variables |
| - Function call frames |
| - Fixed-size arrays |
+-----------------------------------------------------------+
| ^ |
| | |
| HEAP (grows up) |
| - malloc()/calloc()/free() |
| - Dynamic structures and arrays |
+-----------------------------------------------------------+
| DATA SEGMENT |
| - Globals |
| - static variables |
| - .data (initialized) / .bss (zero-initialized) |
+-----------------------------------------------------------+
| CODE / TEXT SEGMENT |
| - Program instructions |
+-----------------------------------------------------------+
Note: Stack and heap grow toward each other at runtime. Fixed-size data lives on the stack or in the data segment, while dynamic allocations come from the heap.
Kernel-Space Allocation with malloc(9)
To allocate memory in the FreeBSD kernel you use:
#include <sys/malloc.h>
void *malloc(size_t size, struct malloc_type *type, int flags);
void free(void *addr, struct malloc_type *type);
Example from the kernel:
char *buf = malloc(1024, M_TEMP, M_WAITOK | M_ZERO);
/* ... use buf ... */
free(buf, M_TEMP);
Breaking it down:
1024→ the number of bytes.M_TEMP→ memory type tag (explained below).M_WAITOK→ wait if memory is temporarily unavailable.M_ZERO→ ensure the block is zeroed.free(buf, M_TEMP)→ release the memory.
Kernel malloc(9) Workflow
┌──────────────────────────┐
│ Driver code │
│ │
│ ptr = malloc(size, │
│ TYPE, │
│ FLAGS); │
└─────────────┬────────────┘
│ request
v
┌──────────────────────────┐
│ Kernel allocator │
│ - Typed pools (TYPE) │
│ - Honors FLAGS: │
│ M_WAITOK / NOWAIT │
│ M_ZERO │
│ - Accounting & tracing │
└─────────────┬────────────┘
│ returns pointer
v
┌──────────────────────────┐
│ Driver uses buffer │
│ - Fill/IO/queues/etc. │
│ - Lifetime under driver │
│ responsibility │
└─────────────┬────────────┘
│ later
v
┌──────────────────────────┐
│ free(ptr, TYPE); │
│ - Returns memory to │
│ kernel pool │
│ - TYPE must match │
└──────────────────────────┘
Note: In the kernel, every allocation is typed and controlled by flags. Pair every malloc(9) with a free(9) on all code paths, including errors.
Memory Types and Flags
One unique aspect of FreeBSD's kernel allocator is the type system: every allocation must be tagged. This makes debugging and leak tracking easier.
Some common types:
M_TEMP- temporary allocations.M_DEVBUF- buffers for device drivers.M_TTY- terminal subsystem memory.
Common flags:
M_WAITOK- sleep until memory is available.M_NOWAIT- return immediately if memory cannot be allocated.M_ZERO- zero out memory before returning.
This explicit style encourages safe, predictable memory usage in critical kernel code.
Hands-On Lab 1: Allocating and Freeing a Buffer
In this exercise, we'll create a simple kernel module that allocates memory when loaded and frees it when unloaded.
my_malloc_module.c
#include <sys/param.h>
#include <sys/module.h>
#include <sys/kernel.h>
#include <sys/malloc.h>
#include <sys/systm.h>
static char *buffer;
#define BUFFER_SIZE 128
MALLOC_DEFINE(M_MYBUF, "my_malloc_buffer", "Buffer for malloc module");
static int
load_handler(module_t mod, int event, void *arg)
{
switch (event) {
case MOD_LOAD:
buffer = malloc(BUFFER_SIZE, M_MYBUF, M_WAITOK | M_ZERO);
if (buffer == NULL)
return (ENOMEM);
snprintf(buffer, BUFFER_SIZE, "Hello from kernel space!\n");
printf("my_malloc_module: %s", buffer);
return (0);
case MOD_UNLOAD:
if (buffer != NULL) {
free(buffer, M_MYBUF);
printf("my_malloc_module: Memory freed\n");
}
return (0);
default:
return (EOPNOTSUPP);
}
}
static moduledata_t my_malloc_mod = {
"my_malloc_module", load_handler, NULL
};
DECLARE_MODULE(my_malloc_module, my_malloc_mod, SI_SUB_DRIVERS, SI_ORDER_MIDDLE);
What to do:
- Build with
make. - Load with
kldload ./my_malloc_module.ko. - Check
dmesgfor the message. - Unload with
kldunload my_malloc_module.
You'll see how memory is reserved and later freed.
When you load and unload the module, you will see the allocation and freeing in action.
Hands-On Lab 2: Allocating an Array of Structures
Now let's extend the idea by creating an array of structs dynamically.
struct my_entry {
int id;
char name[32];
};
MALLOC_DEFINE(M_MYSTRUCT, "my_struct_array", "Array of my_entry");
static struct my_entry *entries;
#define ENTRY_COUNT 5
On load:
- Allocate memory for five entries.
- Initialize each one with an ID and name.
- Print them out.
On unload:
- Free the memory.
This exercise mirrors what real drivers do when they keep track of device states, DMA buffers, or I/O queues.
Real FreeBSD Example: Building the corefile path in coredump_vnode.c
Let's look at a real example from FreeBSD's source code: sys/kern/coredump_vnode.c. This function builds the path for a process core dump. It allocates a temporary buffer, uses it to assemble the path string, and later frees it. I've added additional comments to the example code below to make it easier for you to understand what happens at each step:
/*
* corefile_open(...) builds the final core file path into a temporary
* kernel buffer named `name`. The buffer must be large enough to hold
* a path, so MAXPATHLEN is used. The buffer is freed by the caller
* (see the coredump_vnode() snippet further below).
*/
static int
corefile_open(const char *comm, uid_t uid, pid_t pid, struct thread *td,
int compress, int signum, struct vnode **vpp, char **namep)
{
struct sbuf sb;
const char *format;
char *hostname, *name;
int indexpos, indexlen, ncores;
hostname = NULL;
format = corefilename;
/*
* Allocate a zeroed path buffer from the kernel allocator.
* - size: MAXPATHLEN bytes (fits a full path)
* - M_TEMP: a temporary allocation type tag (helps tracking/debug)
* - M_WAITOK: ok to sleep if memory is briefly unavailable
* - M_ZERO: return the buffer zeroed (no stale data)
*/
name = malloc(MAXPATHLEN, M_TEMP, M_WAITOK | M_ZERO); /* <-- allocate */
indexlen = 0;
indexpos = -1;
ncores = num_cores;
/*
* Initialize an sbuf that writes directly into `name`.
* SBUF_FIXEDLEN means: do not auto-grow, error if too long.
*/
(void)sbuf_new(&sb, name, MAXPATHLEN, SBUF_FIXEDLEN);
/*
* The format string (kern.corefile) may include tokens like %N, %P, %U.
* Iterate, expand tokens, and append to `sb`. If %H (hostname) appears,
* allocate a second small buffer for it, then free it immediately after.
*/
/* ... formatting loop omitted for brevity ... */
/* hostname was conditionally allocated above; free it now if used */
free(hostname, M_TEMP); /* <-- free small temp */
/*
* If compression is requested, append a suffix like ".gz" or ".zst".
* If the sbuf overflowed, clean up and return ENOMEM.
*/
if (sbuf_error(&sb) != 0) {
sbuf_delete(&sb); /* dispose sbuf wrapper (no malloc here) */
free(name, M_TEMP); /* <-- free on error path */
return (ENOMEM);
}
sbuf_finish(&sb);
sbuf_delete(&sb);
/*
* On success, return `name` to the caller via namep.
* Ownership of `name` transfers to the caller, who must free it.
*/
*namep = name;
return (0);
}
What to notice here:
- The buffer is typed with
M_TEMPto help kernel memory accounting and leak detection. This is a FreeBSD kernel convention that you'll reuse for your own drivers by defining your ownMALLOC_DEFINEtag. M_WAITOKis chosen because this path can safely sleep; the kernel allocator will wait rather than fail spuriously. If you are in a context where sleeping is unsafe, you must useM_NOWAITand handle allocation failure immediately.- Error paths free what they allocate before returning. This is the habit to ingrain early: every
malloc(9)must have a clear and reliablefree(9)in all paths.
Now let's see where the caller cleans up:
/*
* coredump_vnode(...) calls corefile_open() to obtain both the vnode (vp)
* and the dynamically built `name` path. After it finishes writing the core
* or handling errors, it must free `name`. This snippet shows both another
* temporary allocation (for cwd when needed) and the final free(name).
*/
static int
coredump_vnode(struct thread *td, off_t limit)
{
struct vnode *vp;
char *name; /* corefile path returned by corefile_open() */
char *fullpath, *freepath = NULL;
size_t fullpathsize;
struct sbuf *sb;
int error, error1;
/* Build name + open/create target vnode */
error = corefile_open(p->p_comm, cred->cr_uid, p->p_pid, td,
compress_user_cores, sig, &vp, &name);
if (error != 0)
return (error);
/* ... write/extend coredump, lock ranges, set attributes, etc ... */
/*
* When emitting a devctl notification, if the core path is relative,
* allocate a small temporary buffer to fetch the current working dir,
* then free it once we've appended it to the sbuf.
*/
if (name[0] != '/') {
fullpathsize = MAXPATHLEN;
freepath = malloc(fullpathsize, M_TEMP, M_WAITOK); /* <-- allocate temp */
if (vn_getcwd(freepath, &fullpath, &fullpathsize) != 0) {
free(freepath, M_TEMP); /* <-- free on error */
/* ... fall through to cleanup below ... */
}
/* use fullpath ... */
free(freepath, M_TEMP); /* <-- free on success */
/* ... continue building notification ... */
}
out:
/*
* Close the vnode we opened and then free the dynamically built `name`.
* This pairs with the malloc(MAXPATHLEN, ...) in corefile_open().
*/
error1 = vn_close(vp, FWRITE, cred, td);
if (error == 0)
error = error1;
free(name, M_TEMP); /* <-- final free */
return (error);
}
This example highlights several important practices that apply directly to device driver development. Temporary kernel strings and small work buffers are often created with malloc(9) and must always be released, whether the code succeeds or fails, as shown in the careful cleanup logic of coredump_vnode(). T
he choice between M_WAITOK and M_NOWAIT also depends on context: in code paths where the kernel can safely sleep, M_WAITOK ensures the allocation will eventually succeed, while in contexts such as interrupt handlers, where sleeping is forbidden, M_NOWAIT must be used and a NULL pointer handled immediately. Finally, keeping allocations local and freeing them as soon as their last use is complete reduces the risk of memory leaks and use-after-free errors.
The handling of the short-lived freepath buffer is a clear demonstration of this principle in practice.
Why This Matters in FreeBSD Device Drivers
In real-world drivers, memory needs are rarely predictable. A network card might advertise the number of receive descriptors only after you probe it. A storage controller could require buffers sized according to device-specific registers. Some devices maintain tables that grow or shrink depending on the workload, such as pending I/O requests or active sessions. All of these cases require dynamic memory allocation.
Static arrays cannot cover such situations because they are fixed at compile time, wasting memory if oversized or failing outright if undersized. With malloc(9) and free(9), a driver can adapt to the actual hardware and workload, allocating exactly what is needed and returning memory once it is no longer in use.
However, this flexibility comes with responsibility. Unlike in user space, memory management errors in the kernel can destabilise the entire system. A missed free() becomes a memory leak that weakens long-term stability. An invalid pointer access after freeing can crash the kernel instantly. Overruns and underruns can silently corrupt memory structures used by other subsystems, sometimes turning into security vulnerabilities.
This is why learning to allocate, use, and release memory correctly is one of the foundational skills for FreeBSD driver developers. Getting this right ensures that your driver not only works under normal conditions but also behaves safely under stress, making the system reliable as a whole.
Real Driver Scenarios
Here are some practical cases where dynamic memory allocation is essential in FreeBSD device drivers:
- Network drivers: Allocate rings of packet descriptors whose size depends on the NIC's capabilities.
- USB drivers: Create transfer buffers sized to the maximum packet length reported by the device.
- PCI storage controllers: Build command tables that expand with the number of active requests.
- Character devices: Manage per-open data structures that exist only while a user process holds the device open.
These examples show that dynamic allocation is not just an academic exercise: it is a daily requirement for making real drivers interact safely and efficiently with hardware.
Common Beginner Pitfalls
Dynamic allocation in kernel code introduces some traps that are easy to overlook:
1. Leaking memory on error paths It's not enough to free memory in the "happy path." If an error occurs after you've allocated but before the function exits, forgetting to free will leak memory inside the kernel. Tip: Always trace every exit path and make sure each allocated block is either used or freed. Using a single cleanup label at the end of your function is a common pattern in FreeBSD.
2. Freeing with the wrong type
Every malloc(9) call is tagged with a type. Freeing with a mismatched type may confuse the kernel's memory accounting and debugging tools.
Tip: Define a custom tag for your driver with MALLOC_DEFINE() and always free with that same tag.
3. Assuming allocation always succeeds
In user space, malloc() often succeeds unless the system is badly constrained. In the kernel, especially with M_NOWAIT, allocation can legitimately fail.
Tip: Always check for NULL and handle the failure gracefully.
4. Choosing the wrong allocation flag
Using M_WAITOK in contexts that cannot sleep (like interrupt handlers) can deadlock the kernel. Using M_NOWAIT when sleeping is safe may force needless failure handling.
Tip: Understand the context of your allocation and pick the correct flag.
Challenge Questions
- In a driver's
detach()routine, what can happen if you forget to free dynamically allocated buffers? - Why is it important that the type passed to
free(9)matches the one used inmalloc(9)? - Imagine you allocate memory with
M_NOWAITduring an interrupt. The call returnsNULL. What should your driver do next? - Why is checking every error path after a successful allocation just as important as freeing on the success path?
- If you use
M_WAITOKinside an interrupt filter, what dangerous condition might arise?
Wrapping Up
You have now seen how C's dynamic memory allocation works in user space and how FreeBSD extends this idea with its own malloc(9) and free(9) for the kernel. You learned why allocations must always be paired with cleanups, how memory types and flags guide safe allocation, and how real FreeBSD code uses these patterns every day.
Dynamic allocation gives your driver the flexibility to adapt to hardware and workload demands, but it also introduces new responsibilities. Handling every error path, choosing the right flags, and keeping allocations short-lived are the habits that separate safe kernel code from fragile code.
In the next section, we will build directly on this foundation by looking at memory safety in kernel code. There, you will learn techniques to protect against leaks, overflows, and use-after-free errors, making your driver not only functional but also reliable and secure.
Memory Safety in Kernel Code
When writing kernel code, especially device drivers, we are working in a privileged and unforgiving environment. There is no safety net. In user-space programming, a crash usually terminates only your process. In kernel-space, a single bug can panic or reboot the entire operating system. That is why memory safety is not optional. It is the foundation of stable and secure FreeBSD driver development.
You must constantly remember that the kernel is persistent and long-running. A memory leak will accumulate for as long as the system remains up. A buffer overflow can silently overwrite unrelated data structures and later trigger a mysterious crash. Using an uninitialized pointer can panic the system instantly.
This section introduces the most common mistakes, shows you how to avoid them, and gives you practice through real experiments, both in user space and inside a small kernel module.
What Can Go Wrong?
Most kernel bugs caused by beginners can be traced to unsafe memory handling. Let's list the most frequent and dangerous ones:
- Using uninitialized pointers: a pointer that is not set to a valid address contains garbage. Dereferencing it usually causes a panic.
- Accessing freed memory (use-after-free): once memory is released, it must never be touched again. Doing so corrupts memory and destabilises the kernel.
- Memory leaks: failing to call
free()aftermalloc()means the memory remains reserved forever, slowly consuming kernel resources. - Buffer overflows: writing beyond the end of a buffer overwrites unrelated memory. This can corrupt kernel state or introduce security vulnerabilities.
- Off-by-one array errors: accessing one index past the end of an array is enough to destroy adjacent kernel data.
Unlike user space, where tools like valgrind can sometimes save you, in kernel programming these errors can lead to instant crashes or subtle corruption that is very difficult to debug.
Best Practices for Safer Kernel Code
FreeBSD provides mechanisms and conventions to help developers write robust code. Follow these guidelines:
-
Always initialise pointers. If you do not yet have a valid memory address, set the pointer to
NULL. This makes accidental dereferences easier to detect.struct my_entry *ptr = NULL; -
Check the result of
malloc(). Memory allocation may fail. Never assume success.ptr = malloc(sizeof(*ptr), M_MYTAG, M_NOWAIT); if (ptr == NULL) { // Handle gracefully, avoid panic } -
Free what you allocate. Every
malloc()must have a matchingfree(). In kernel space, leaks accumulate until reboot.free(ptr, M_MYTAG); -
Avoid buffer overflows. Use safer functions such as
strlcpy()orsnprintf(), which take the buffer size as an argument.strlcpy(buffer, "FreeBSD", sizeof(buffer)); -
Use
M_ZEROto avoid garbage values. This flag ensures allocated memory starts clean.ptr = malloc(sizeof(*ptr), M_MYTAG, M_WAITOK | M_ZERO); -
Use proper allocation flags.
M_WAITOKis used when allocation can safely sleep until memory becomes available.M_NOWAITmust be used in interrupt handlers or any context where sleeping is forbidden.
A Real Example from FreeBSD 14.3
In the FreeBSD source tree, memory is often managed through pre-allocated buffers rather than frequent dynamic allocations. Here is a snippet from sys/kern/tty_info.c, it's located at line 303:
(void)sbuf_new(&sb, tp->t_prbuf, tp->t_prbufsz, SBUF_FIXEDLEN);
sbuf_set_drain(&sb, sbuf_tty_drain, tp);
What happens here?
sbuf_new()creates a string buffer (sb) using an already allocated memory region (tp->t_prbuf).- The size is fixed (
tp->t_prbufsz) and protected by theSBUF_FIXEDLENflag, ensuring no writes beyond the limit. sbuf_set_drain()then specifies a controlled function (sbuf_tty_drain) to handle buffer output.
This pattern demonstrates a safe kernel strategy: memory is allocated once during subsystem initialisation and carefully reused, rather than repeatedly allocated and freed. It reduces fragmentation, avoids allocation failures at runtime, and keeps memory usage predictable.
Dangerous Code to Avoid
The following snippet is wrong because it uses a pointer that was never given a valid address:
struct my_entry *ptr; // Declared, but not initialised. 'ptr' contains garbage.
ptr->id = 5; // Crash risk: dereferencing an uninitialised pointer
ptr does not point anywhere valid. When you try to access ptr->id, the kernel will likely panic because you are touching memory you do not own. In user space, this would usually be a segmentation fault. In kernel space, it can crash the whole system.
The Correct Pattern
Below is a safe version that allocates memory, checks that the allocation worked, uses the memory, and then releases it. The comments explain each step and why it matters in kernel code.
#include <sys/param.h>
#include <sys/malloc.h>
/*
* Give your allocations a custom tag. This helps the kernel track who owns
* the memory and makes debugging leaks much easier.
*/
MALLOC_DECLARE(M_MYTAG); // Declare the tag (usually in a header)
MALLOC_DEFINE(M_MYTAG, "mydriver", "My driver allocations"); // Define the tag
struct my_entry {
int id;
};
void example(void)
{
struct my_entry *ptr = NULL; // Start with NULL to avoid using garbage
/*
* Allocate enough space for ONE struct my_entry.
* We use sizeof(*ptr) so if the type of 'ptr' changes,
* the size stays correct automatically.
*
* Flags:
* - M_WAITOK: allocation is allowed to sleep until memory is available.
* Use this only in contexts where sleeping is safe
* (for example, during driver attach or module load).
*
* - M_ZERO: zero-fill the memory so all fields start in a known state.
* This prevents accidental use of uninitialised data.
*/
ptr = malloc(sizeof(*ptr), M_MYTAG, M_WAITOK | M_ZERO);
if (ptr == NULL) {
/*
* Always check for failure, even with M_WAITOK.
* If allocation fails, handle it gracefully: log, unwind, or return.
*/
printf("mydriver: allocation failed\n");
return;
}
/*
* At this point 'ptr' is valid and zero-initialised.
* It is now safe to access its fields.
*/
ptr->id = 5;
/*
* ... use 'ptr' for whatever work is needed ...
*/
/*
* When you are done, free the memory with the SAME tag you used to allocate.
* Pairing malloc/free is essential in kernel code to avoid leaks
* that accumulate for the entire uptime of the machine.
*/
free(ptr, M_MYTAG);
ptr = NULL; // Optional but helpful to prevent accidental reuse
}
Why this pattern matters
- Initialise pointers: starting with
NULLmakes accidental use obvious during reviews and easier to catch in tests. - Size safely:
sizeof(*ptr)follows the pointer's type automatically, reducing the chance of wrong sizes when refactoring. - Pick the right flags:
- Use
M_WAITOKwhen the code can sleep, such as during attach, open, or module load paths. - Use
M_NOWAITin interrupt handlers or other non-sleepable contexts, and handleNULLimmediately.
- Use
- Zero on allocate:
M_ZEROprevents hidden state from previous allocations, which avoids surprising behaviour. - Always free: every
malloc()must be paired withfree()using the same tag. This is non-negotiable in kernel code. - Set to NULL after free: it reduces the risk of use-after-free bugs if the pointer is referenced later by mistake.
If your context do not allow sleep
Sometimes you are in a context where sleeping is forbidden, such as an interrupt handler. In that case use M_NOWAIT, check immediately for failure, and defer work if needed:
ptr = malloc(sizeof(*ptr), M_MYTAG, M_NOWAIT | M_ZERO);
if (ptr == NULL) {
/* Defer the work or drop it safely; do NOT block here. */
return;
}
Keeping these habits from the beginning will save you from many of the most painful kernel crashes and midnight debugging sessions.
Hands-On Lab 1: Crashing with an Uninitialized Pointer
This exercise demonstrates why you must never use a pointer before giving it a valid address. We will first write a broken program that uses an uninitialised pointer, then fix it with malloc().
Broken Version: lab1_crash.c
#include <stdio.h>
struct data {
int value;
};
int main(void) {
struct data *ptr; // Declared but not initialised
// At this point, 'ptr' points to some random location in memory.
// Trying to use it will cause undefined behaviour.
ptr->value = 42; // Segmentation fault very likely here
printf("Value: %d\n", ptr->value);
return 0;
}
What to Expect
- This program compiles without warnings.
- When you run it, it will almost certainly crash with a segmentation fault.
- The crash happens because
ptrdoes not point to valid memory, yet we try to write toptr->value. - In the kernel, this same mistake would likely panic the entire system.
What is wrong here?
STACK (main)
+---------------------------+
| ptr : ??? (uninitialised) | -> not a real address you own
+---------------------------+
HEAP
+---------------------------+
| no allocation yet |
+---------------------------+
Action performed:
write 42 into ptr->value
Result:
You are dereferencing a garbage address. Crash.
Fixed Version: lab1_fixed.c
#include <stdio.h>
#include <stdlib.h> // For malloc() and free()
struct data {
int value;
};
int main(void) {
struct data *ptr;
// Allocate memory for ONE struct data on the heap
ptr = malloc(sizeof(struct data));
if (ptr == NULL) {
// Always check in case malloc fails
printf("Allocation failed!\n");
return 1;
}
// Now 'ptr' points to valid memory, safe to use
ptr->value = 42;
printf("Value: %d\n", ptr->value);
// Always free what you allocate
free(ptr);
return 0;
}
What Changed
- We used
malloc()to allocate enough space for onestruct data. - We checked that the result was not
NULL. - We safely wrote into the struct's field.
- We freed the memory before exiting, preventing a leak.
Why this works
STACK (main)
+---------------------------+
| ptr : 0xHHEE... | -> valid heap address returned by malloc
+---------------------------+
HEAP
+---------------------------+
| struct data block |
| value = 42 |
+---------------------------+
Actions:
1) ptr = malloc(sizeof(struct data)) -> ptr now points to a valid block
2) ptr->value = 42 -> write inside your own block
3) free(ptr) -> return memory to the system
Result:
No crash. No leak.
Hands-On Lab 2: Memory Leak and the Forgotten Free
This exercise shows what happens when you forget to release allocated memory. In user space, leaks disappear when your program exits. In the kernel, leaks accumulate for the system's entire uptime, which is why this habit must be fixed early.
Leaky Version: lab2_leak.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void) {
// Allocate 128 bytes of memory
char *buffer = malloc(128);
if (buffer == NULL) return 1;
// Copy a string into the allocated buffer
strcpy(buffer, "FreeBSD device drivers are awesome!");
printf("%s\n", buffer);
// Memory was allocated but never freed
// This is a memory leak
return 0;
}
What to Expect
- The program prints the string normally.
- You may not notice the problem right away because the OS reclaims process memory when the program exits.
- In the kernel this would be serious. The memory would remain allocated across operations and only a reboot clears it.
Leak vs program exit
Before exit:
STACK (main)
+---------------------------+
| buffer : 0xABCD... | -> heap address
+---------------------------+
HEAP
+--------------------------------------------------+
| 128-byte block |
| "FreeBSD device drivers are awesome!\0 ..." |
+--------------------------------------------------+
Action:
Program returns without free(buffer)
Consequence in user space:
OS reclaims process memory at exit, so you do not notice.
Consequence in kernel space:
The block remains allocated across operations and accumulates.
Fixed Version: lab2_fixed.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void) {
char *buffer = malloc(128);
if (buffer == NULL) return 1;
strcpy(buffer, "FreeBSD device drivers are awesome!");
printf("%s\n", buffer);
// Free the memory once we are done
free(buffer);
return 0;
}
What Changed
- We added
free(buffer);. - This single line ensures that all memory is returned to the system. Make this a habit.
Proper lifecycle
1) Allocation
HEAP: [ 128-byte block ] <- buffer points here
2) Use
Write string into the block, then print it
3) Free
free(buffer)
HEAP: [ block returned to allocator ]
buffer (optional) -> set to NULL to avoid accidental reuse
Detecting Memory Leaks with AddressSanitizer
On FreeBSD, when compiling user-space programs with Clang, you can detect leaks automatically using AddressSanitizer:
% cc -fsanitize=address -g -o lab2_leak lab2_leak.c
% ./lab2_leak
You will see a report indicating memory was allocated and never freed. Although AddressSanitizer does not apply to kernel code, the lesson is identical. Always release what you allocate.
Mini-Lab 3: Memory Allocation in a Kernel Module
Now let's try a FreeBSD kernel experiment. Create memlab.c:
#include <sys/param.h>
#include <sys/module.h>
#include <sys/kernel.h>
#include <sys/malloc.h>
MALLOC_DEFINE(M_MEMLAB, "memlab", "Memory Lab Example");
static void *buffer = NULL;
static int
memlab_load(struct module *m, int event, void *arg)
{
int error = 0;
switch (event) {
case MOD_LOAD:
printf("memlab: Loading module\n");
buffer = malloc(128, M_MEMLAB, M_WAITOK | M_ZERO);
if (buffer == NULL) {
printf("memlab: malloc failed!\n");
error = ENOMEM;
} else {
printf("memlab: allocated 128 bytes\n");
}
break;
case MOD_UNLOAD:
printf("memlab: Unloading module\n");
if (buffer != NULL) {
free(buffer, M_MEMLAB);
printf("memlab: memory freed\n");
}
break;
default:
error = EOPNOTSUPP;
break;
}
return error;
}
static moduledata_t memlab_mod = {
"memlab", memlab_load, NULL
};
DECLARE_MODULE(memlab, memlab_mod, SI_SUB_DRIVERS, SI_ORDER_MIDDLE);
Compile and load:
% cc -O2 -pipe -nostdinc -I/usr/src/sys -D_KERNEL -DKLD_MODULE \
-fno-common -o memlab.o -c memlab.c
% cc -shared -nostdlib -o memlab.ko memlab.o
% sudo kldload ./memlab.ko
Unload with:
% sudo kldunload memlab
Detecting Leaks
Comment out the free() line, recompile, and load/unload several times. Now inspect memory:
% vmstat -m | grep memlab
You will see lines like:
memlab 128 4 4 0 0 1
indicating four allocations of 128 bytes are still "in use" because they were never freed. With the fix in place, the line disappears after unload.
Optional: DTrace View
To see allocations live:
% sudo dtrace -n 'fbt::malloc:entry { trace(arg1); }'
When you load the module, you will see the 128 bytes being allocated.
Challenge: Prove the Leak
It is one thing to read that kernel leaks accumulate, but another to see it with your own eyes. This short experiment will let you prove it on your own system.
-
Open the
memlab.cmodule you created earlier and comment out thefree(buffer, M_MEMLAB);line in the unload function. This means the module will allocate memory on load, but never release it on unload. -
Rebuild the module and then load and unload it four times in a row:
% sudo kldload ./memlab.ko % sudo kldunload memlab % sudo kldload ./memlab.ko % sudo kldunload memlab % sudo kldload ./memlab.ko % sudo kldunload memlab % sudo kldload ./memlab.ko % sudo kldunload memlab -
Now inspect the kernel's memory allocation table with:
% vmstat -m | grep memlabYou should see output similar to:
memlab 128 4 4 0 0 1This means four allocations of 128 bytes were made, and none were freed. Each time you loaded the module, the kernel allocated more memory that was never released.
-
Finally, restore the
free()line, recompile, and repeat the load/unload cycle. This time, when you runvmstat -m | grep memlab, the line should disappear after unload, confirming that memory is released properly.
This simple test demonstrates a critical fact: in user space, leaks usually vanish when your process exits. In kernel space, leaks survive across module reloads and continue to accumulate. In production systems, such mistakes are not just messy; they are fatal. Over time, leaks can exhaust all available kernel memory and cause the system to crash.
Common Beginner Pitfalls
Memory safety is one of the hardest lessons for new C programmers, and in kernel space the consequences are much harsher. Let's highlight a few traps that beginners often fall into, and how you can avoid them:
- Forgetting to free memory.
Every
malloc()must have a matchingfree()in the proper cleanup path. If you allocate during module load, remember to free during module unload. This habit prevents leaks that otherwise accumulate for the entire uptime of the system. - Using freed memory.
Accessing a pointer after
free()is called is a classic bug known as use-after-free. The pointer may still contain the old address, tricking you into thinking it is valid. A safe habit is to set the pointer toNULLimmediately after freeing it. That way, any accidental use will be obvious. - Choosing the wrong allocation flag.
FreeBSD provides different allocation behaviours for different contexts. If you call
malloc()withM_WAITOK, the kernel may put the thread to sleep until memory becomes available, which is fine during module load or attach, but catastrophic inside an interrupt handler. Conversely,M_NOWAITnever sleeps and fails immediately if memory is not available. Learning to pick the correct flag is an essential skill. - Skipping malloc tags.
Always use
MALLOC_DEFINE()to give your driver a custom memory tag. These tags appear invmstat -mand make debugging leaks much easier. Without them, your allocations may be lumped into generic categories, making it difficult to trace where memory is coming from.
By keeping these pitfalls in mind and practising the good habits shown earlier, you will dramatically reduce the risk of introducing memory bugs into your drivers. These lessons might feel repetitive now, but in real kernel development they are the difference between a stable driver and one that crashes production systems.
Golden Rules for Kernel Memory
1. Every malloc() must have a matching free().
2. Never use a pointer before initialising it (or after freeing it).
3. Use the correct allocation flag (M_WAITOK or M_NOWAIT) for the context.
Keep these three rules in mind whenever you write kernel code. They may look simple, but following them consistently is what separates a stable FreeBSD driver from a crash-prone one.
Pointers Recap: The Lifecycle of Memory in C
Step 1: Declare a pointer
-------------------------
struct data *ptr;
STACK
+-------------------+
| ptr : ??? | -> uninitialised (dangerous!)
+-------------------+
Step 2: Allocate memory
-----------------------
ptr = malloc(sizeof(struct data));
STACK HEAP
+-------------------+ +----------------------+
| ptr : 0xABCD... | -----> | struct data block |
+-------------------+ | value = ??? |
+----------------------+
Step 3: Use the memory
----------------------
ptr->value = 42;
HEAP
+----------------------+
| struct data block |
| value = 42 |
+----------------------+
Step 4: Free the memory
-----------------------
free(ptr);
ptr = NULL;
STACK
+-------------------+
| ptr : NULL | -> safe, prevents reuse
+-------------------+
HEAP
+----------------------+
| (block released) |
+----------------------+
Golden Reminder:
- Never use an uninitialised pointer.
- Always check your allocations.
- Free what you allocate, and set pointers to
NULLafter freeing.
Pointers Recap Quiz
Test yourself with these quick questions before moving on. Answers are at the end of this chapter.
- What happens if you declare a pointer but never initialise it and then dereference it?
- Why should you always check the return value of
malloc()? - What is the purpose of the
M_ZEROflag when allocating memory in the FreeBSD kernel? - After calling
free(ptr, M_TAG);, why is it a good habit to setptr = NULL;? - In which contexts must you use
M_NOWAITinstead ofM_WAITOKwhen allocating memory in kernel code?
Wrapping Up
With this section, we have reached the end of our journey through pointers in C. Along the way you learned what pointers are, how they relate to arrays and structures, and why they are so powerful but also so dangerous. We concluded with one of the most important lessons: memory safety.
Every pointer must be treated with care. Every allocation must be checked. Every buffer must have a known size. In user space, mistakes usually crash only your program. In kernel space, the very same mistakes can corrupt memory or bring down the entire operating system.
By following FreeBSD's allocation patterns, checking results, freeing memory diligently, and using debugging tools like vmstat -m and DTrace, you will be on the path to writing drivers that are both stable and reliable.
In the next section, we'll cover structures and typedefs in C. Structures allow you to group related data, making your code more organized and expressive. Typedefs will enable you to assign meaningful names to complex types, improving readability. Together, they form the basis of almost every real kernel subsystem and are a natural step after mastering pointers.
Structures and typedef in C
So far, you have worked with single variables, arrays, and pointers. These are helpful tools, but real programs and especially operating system kernels need a way to organize related pieces of information together. Imagine trying to keep track of a device's name, its state, and its configuration using separate variables scattered everywhere. The result would be messy and hard to maintain.
C solves this problem with structures. A structure allows you to group related data under one roof, giving it a clear and logical shape. And once you have that structure, the typedef keyword can make your code shorter and easier to read by giving that structure a simpler name.
You will quickly discover that FreeBSD code is built on this foundation. Almost every kernel subsystem is defined as a set of structures. Once you understand them, the FreeBSD source tree starts looking far less intimidating.
What is a struct?
Up until now, you have stored data in single variables (int, char, etc.) or in arrays of the same type. But in real programming, we often need to keep different pieces of information together. For example, if we want to represent a point in two-dimensional space, we need both an x coordinate and a y coordinate. Keeping them in separate variables quickly becomes messy.
This is where a structure (or struct) comes in. A struct is a user-defined type that allows you to group several variables under a single name. Each variable inside the struct is referred to as a member or a field.
A useful way to imagine a struct is to think of it as a folder. A folder can hold different types of documents, a text file, an image, and a spreadsheet, but they are all kept together because they belong to the same project. A struct works the same way in C: it keeps related data together so you can handle it as one logical unit.
Here's a diagram of how a struct Point looks conceptually:
struct Point (like a folder)
├── x (an integer)
└── y (an integer)
Now let's see how this looks in C code:
#include <stdio.h>
// Define a structure to represent a point in 2D space
struct Point {
int x; // The X coordinate
int y; // The Y coordinate
};
int main() {
// Declare a variable of type struct Point
struct Point p1;
// Assign values to its fields
p1.x = 10; // Set the X coordinate
p1.y = 20; // Set the Y coordinate
// Print the values
printf("Point p1 is at (%d, %d)\n", p1.x, p1.y);
return 0;
}
Let's break down what happens here:
- We define a new blueprint called
struct Pointthat describes what every "Point" should contain: two integers, one calledxand one calledy. - In
main(), we create a real variablep1that follows this blueprint. - We fill in its fields by assigning values to
p1.xandp1.y. - We print the values, treating
p1as one object with two related pieces of data.
This way, instead of juggling separate int x and int y variables, we now have a single object p1 that neatly contains both.
In FreeBSD drivers, you will often find the same pattern, just with more complex fields: a device struct might keep its ID, its name, a pointer to its memory buffer, and its current state all in one place.
Accessing Structure Members
Once you define a structure, you need a way to reach its individual fields. C gives you two operators for this:
- The dot operator (
.) is used when you have a structure variable. - The arrow operator (
->) is used when you have a pointer to a structure.
A good way to remember this is: dot for direct, arrow for indirect.
Let's see both in action:
#include <stdio.h>
// Define a structure to represent a device
struct Device {
char name[20];
int id;
};
int main() {
// Create a structure variable
struct Device dev = {"uart0", 1};
// Create a pointer to that structure
struct Device *ptr = &dev;
// Access using the dot operator (direct access)
printf("Device name: %s\n", dev.name);
// Access using the arrow operator (indirect access via pointer)
printf("Device ID: %d\n", ptr->id);
return 0;
}
Here's what is happening step by step:
- We create a structure variable
devthat represents a device with a name and an ID. - We then create a pointer
ptrthat points todev. - When we write
dev.name, we are directly accessing thenamefield of the struct variable. - When we write
ptr->id, we are saying "follow the pointerptrto the struct it points to, then access theidfield."
The arrow operator is essentially shorthand for writing:
(*ptr).id
But because this looks clumsy, C provides ptr->id instead.
This distinction is fundamental in kernel programming. In FreeBSD, most kernel APIs will not give you a structure directly; they give you a pointer to a structure. For example, when working with processes, you will often get a struct proc *p, not a plain struct proc. That means you will spend most of your time using the arrow operator when writing drivers.
Making Code Cleaner with typedef
When we define a struct, we usually have to write the word struct every time we declare a variable. For small programs that is fine, but in large projects like FreeBSD this quickly becomes repetitive and makes the code harder to read.
The typedef keyword allows us to create a shorter alias for a type. It does not invent a new type; it simply gives an existing type a new name. This alias typically makes the code easier to read and clarifies the programmer's intent.
Here's an example that makes our struct Point easier to work with:
#include <stdio.h>
// Define a structure and immediately create a typedef for it
typedef struct {
int x; // X coordinate
int y; // Y coordinate
} Point; // "Point" is now an alias for this struct
int main() {
// We can now declare a Point without writing "struct Point"
Point p = {1, 2};
printf("Point is at (%d, %d)\n", p.x, p.y);
return 0;
}
Without typedef, we would have to write:
struct Point p;
With typedef, we can simply write:
Point p;
This may look like a slight difference, but in real FreeBSD code, where structs can be very large and complex, using typedef keeps the source much cleaner.
Typedefs in FreeBSD
FreeBSD uses typedef extensively to:
-
Give clearer names to primitive types For example, in
/usr/include/sys/typesyou will find lines like the lines below. Be aware that these lines presented below are spread across the file:typedef __pid_t pid_t; /* process id */ typedef __uid_t uid_t; /* user id */ typedef __gid_t gid_t; /* group id */ typedef __dev_t dev_t; /* device number or struct cdev */ typedef __off_t off_t; /* file offset */ typedef __size_t size_t; /* object size */Instead of scattering "int" or "long" everywhere in kernel code, FreeBSD developers use these typedefs. The code then speaks the language of the operating system:
pid_tfor process IDs,uid_tfor user IDs,size_tfor sizes, and so on. -
Hide system-dependent details On some architectures,
pid_tmight be a 32-bit integer, while on others, it might be a 64-bit integer. The code that usespid_tdoesn't care; the typedef handles that detail. -
Simplify complex declarations FreeBSD also uses typedef to simplify long or pointer-heavy declarations. For example:
typedef struct _device *device_t; typedef struct vm_page *vm_page_t;Instead of always writing
struct _device *throughout the kernel, developers can simply writedevice_t. This makes the code shorter and easier to read.
Why this matters in driver development
When you start writing drivers, you'll constantly encounter types like device_t, vm_page_t, or bus_space_tag_t. They are all typedefs hiding more complex structures or pointers. Understanding that typedef is just an alias helps you avoid confusion and allows you to read FreeBSD code more fluently.
Real Example from FreeBSD 14.3: Creating a USB Character Device
Now that you've seen how structures and typedef work in isolation, let's step into real FreeBSD kernel code. This example comes from the USB subsystem, specifically the function usb_make_dev in sys/dev/usb/usb_device.c (between lines 2119-2157). This function is responsible for creating a character device node under /dev for a USB endpoint, allowing user programs to interact with it.
As you read the code, pay attention to two things:
- How FreeBSD uses structures (
struct usb_fs_privdataandstruct make_dev_args) to bundle together all the information needed. - How it relies on typedefs like
uid_tandgid_tto give clearer meaning to simple integers.
Here's the relevant code, with additional comments to help you connect it to the concepts we just studied:
File: sys/dev/usb/usb_device.c
[...]
struct usb_fs_privdata *
usb_make_dev(struct usb_device *udev, const char *devname, int ep,
int fi, int rwmode, uid_t uid, gid_t gid, int mode)
{
struct usb_fs_privdata* pd;
struct make_dev_args args;
char buffer[32];
/* Allocate and initialise a private data structure for this device */
pd = malloc(sizeof(struct usb_fs_privdata), M_USBDEV,
M_WAITOK | M_ZERO);
/* Fill in identifying fields */
pd->bus_index = device_get_unit(udev->bus->bdev);
pd->dev_index = udev->device_index;
pd->ep_addr = ep;
pd->fifo_index = fi;
pd->mode = rwmode;
/* Build the device name string if none was provided */
if (devname == NULL) {
devname = buffer;
snprintf(buffer, sizeof(buffer), USB_DEVICE_DIR "/%u.%u.%u",
pd->bus_index, pd->dev_index, pd->ep_addr);
}
/* Initialise and populate the make_dev_args structure */
make_dev_args_init(&args);
args.mda_devsw = &usb_devsw; // Which device switch to use
args.mda_uid = uid; // Owner user ID (typedef uid_t)
args.mda_gid = gid; // Owner group ID (typedef gid_t)
args.mda_mode = mode; // Permission bits
args.mda_si_drv1 = pd; // Attach our private data
/* Create the device node */
if (make_dev_s(&args, &pd->cdev, "%s", devname) != 0) {
DPRINTFN(0, "Failed to create device %s\n", devname);
free(pd, M_USBDEV);
return (NULL);
}
return (pd);
}
[...]
Why this example matters
This short function highlights several key lessons about structs and typedefs in real kernel work:
- Grouping related data: Instead of passing a dozen parameters around, FreeBSD collects them into
struct make_dev_args. This makes it easier to extend the API later and keeps function calls readable. - Dot vs arrow: Notice
args.mda_uid = uid;uses the dot operator, becauseargsis a direct struct variable. Butpd->dev_index = udev->device_index;uses the arrow, becausepdandudevare pointers. This is exactly the pattern you'll use constantly in your own drivers. - Initialisation: The call to
make_dev_args_init(&args)ensuresargsstarts from a clean state, avoiding the uninitialised-field bug we discussed earlier. - Typedefs for clarity: Instead of raw integers, the function signature uses
uid_t,gid_t, andmode. These typedefs tell you exactly what kind of value is expected: a user ID, a group ID, and a permission mode. This is both safer and more self-documenting.
Takeaways
This example demonstrates how the building blocks you're learning (structures, pointers, and typedefs) are not academic curiosities, but the everyday language of FreeBSD drivers. Whenever a driver needs to configure a device, pass state between layers, or expose a new node in /dev, you'll find structs collecting the data and typedefs making it readable.
As you continue, remember this pattern:
- Bundle related fields into a struct.
- Use pointers with
->when passing around instances. - Use typedefs where they make the code more expressive.
Hands-On Lab 1: Building a TTY Device Struct
In this exercise, you will define a structure to represent a simple TTY (teletypewriter) device. FreeBSD uses similar structures internally to keep track of terminals like ttyv0 (the first virtual console).
#include <stdio.h>
#include <string.h>
// Step 1: Define the structure
struct tty_device {
char name[16]; // Device name, e.g., "ttyv0"
int minor_number; // Device minor number
int enabled; // 1 = enabled, 0 = disabled
};
int main() {
// Step 2: Declare a variable of the struct
struct tty_device dev;
// Step 3: Assign values to the fields
strcpy(dev.name, "ttyv0"); // Copy string into 'name'
dev.minor_number = 0; // Minor number = 0 for ttyv0
dev.enabled = 1; // Enabled (1 = yes, 0 = no)
// Step 4: Print the values
printf("Device: %s\n", dev.name);
printf("Minor number: %d\n", dev.minor_number);
printf("Enabled: %s\n", dev.enabled ? "Yes" : "No");
return 0;
}
Compile and run it:
% cc -o tty_device tty_device.c
% ./tty_device
Expected output:
Device: ttyv0
Minor number: 0
Enabled: Yes
What you learned: A struct groups related data into one logical unit. This mirrors how FreeBSD kernel code groups device metadata in struct tty.
Hands-On Lab 2: Using typedef for Cleaner Code
Now let's simplify the same program using typedef, which allows us to create a shorter alias for the struct. This makes code easier to read and avoids writing struct every time.
#include <stdio.h>
#include <string.h>
// Step 1: Define the struct and typedef it to TTYDevice
typedef struct {
char name[16]; // Device name
int minor_number; // Minor number
int enabled; // Enabled flag (1 = enabled, 0 = disabled)
} TTYDevice;
int main() {
// Step 2: Declare a variable of the new type
TTYDevice dev;
// Step 3: Fill in its fields
strcpy(dev.name, "ttyu0"); // Another device example
dev.minor_number = 1; // Minor = 1 for ttyu0
dev.enabled = 0; // Disabled
// Step 4: Print the values
printf("Device: %s\n", dev.name);
printf("Minor number: %d\n", dev.minor_number);
printf("Enabled: %s\n", dev.enabled ? "Yes" : "No");
return 0;
}
Compile and run:
% cc -o tty_typedef tty_typedef.c
% ./tty_typedef
Expected output:
Device: ttyu0
Minor number: 1
Enabled: No
Challenge Yourself
To make the struct more realistic (closer to FreeBSD's real struct tty), try these variations:
-
Add a baud rate:
int baud_rate;Set it to
9600or115200and print it. -
Add a driver name:
char driver[32];Assign
"console"or"uart"to it. -
Simulate multiple devices: Create an array of structs and initialise two or three devices, then print them all in a loop.
Extra Hands-On Lab: Managing Multiple TTY Devices
So far, you have created a single TTY device at a time. But in real drivers, you often manage many devices at once. FreeBSD keeps tables of devices, updating them as hardware is discovered, configured, or removed.
In this lab, you will build a small program that manages an array of TTY devices, searches for them by name, toggles their state, and even sorts them. This mirrors what real drivers do when working with multiple terminals, USB endpoints, or network interfaces.
#include <stdio.h>
#include <string.h>
#include <stdlib.h> // for qsort
// Define a TTY device structure with typedef
typedef struct {
char name[16]; // e.g., "ttyv0", "ttyu0"
int minor_number; // minor device number
int enabled; // 1 = enabled, 0 = disabled
int baud_rate; // communication speed, e.g., 9600 or 115200
char driver[32]; // driver name, e.g., "console", "uart"
} TTYDevice;
// Print a single device (note: const pointer = read-only access)
void print_device(const TTYDevice *d) {
printf("Device %-6s minor=%d enabled=%s baud=%d driver=%s\n",
d->name, d->minor_number, d->enabled ? "Yes" : "No",
d->baud_rate, d->driver);
}
// Print all devices in the array
void print_all(const TTYDevice *arr, size_t n) {
for (size_t i = 0; i < n; i++)
print_device(&arr[i]);
}
// Find a device by name. Returns a pointer or NULL if not found.
TTYDevice *find_by_name(TTYDevice *arr, size_t n, const char *name) {
for (size_t i = 0; i < n; i++) {
if (strcmp(arr[i].name, name) == 0)
return &arr[i];
}
return NULL;
}
// Comparator for qsort: sort by minor number ascending
int cmp_by_minor(const void *a, const void *b) {
const TTYDevice *A = (const TTYDevice *)a;
const TTYDevice *B = (const TTYDevice *)b;
return (A->minor_number - B->minor_number);
}
int main(void) {
// 1) Create an array of devices using initialisers
TTYDevice table[] = {
{ "ttyv0", 0, 1, 115200, "console" },
{ "ttyu0", 1, 0, 9600, "uart" },
{ "ttyu1", 2, 1, 9600, "uart" },
};
size_t N = sizeof(table) / sizeof(table[0]);
printf("Initial device table:\n");
print_all(table, N);
printf("\n");
// 2) Disable ttyv0 and enable ttyu0 using find_by_name()
TTYDevice *p = find_by_name(table, N, "ttyv0");
if (p != NULL) p->enabled = 0;
p = find_by_name(table, N, "ttyu0");
if (p != NULL) p->enabled = 1;
printf("After toggling enabled flags:\n");
print_all(table, N);
printf("\n");
// 3) Sort the devices by minor number (already sorted, but shows the pattern)
qsort(table, N, sizeof(table[0]), cmp_by_minor);
printf("After sorting by minor number:\n");
print_all(table, N);
printf("\n");
// 4) Update all devices: set baud_rate to 115200
for (size_t i = 0; i < N; i++)
table[i].baud_rate = 115200;
printf("After setting all baud rates to 115200:\n");
print_all(table, N);
return 0;
}
Compile and run:
% cc -o tty_table tty_table.c
% ./tty_table
Expected output:
Initial device table:
Device ttyv0 minor=0 enabled=Yes baud=115200 driver=console
Device ttyu0 minor=1 enabled=No baud=9600 driver=uart
Device ttyu1 minor=2 enabled=Yes baud=9600 driver=uart
After toggling enabled flags:
Device ttyv0 minor=0 enabled=No baud=115200 driver=console
Device ttyu0 minor=1 enabled=Yes baud=9600 driver=uart
Device ttyu1 minor=2 enabled=Yes baud=9600 driver=uart
After sorting by minor number:
Device ttyv0 minor=0 enabled=No baud=115200 driver=console
Device ttyu0 minor=1 enabled=Yes baud=9600 driver=uart
Device ttyu1 minor=2 enabled=Yes baud=9600 driver=uart
After setting all baud rates to 115200:
Device ttyv0 minor=0 enabled=No baud=115200 driver=console
Device ttyu0 minor=1 enabled=Yes baud=115200 driver=uart
Device ttyu1 minor=2 enabled=Yes baud=115200 driver=uart
Challenge Yourself!
Try extending this program:
- Filter enabled devices only
Write a function that prints only devices where
enabled == 1. - Find by minor number
Add a helper
find_by_minor(TTYDevice *arr, size_t n, int minor). - Sort by name
Replace the comparator to sort alphabetically by
name. - Safer strings
Replace
strcpywithsnprintfto avoid buffer overflows. - New field
Add a boolean field
is_consoleand print[CONSOLE]next to those devices.
What did you learn in these labs?
The three labs you just completed began with the basics of defining a single struct, progressed to simplifying code with typedef, and concluded with managing an array of devices that could easily be part of a real driver. Along the way, you learned how C structures can group related fields into one logical unit, how typedef can make code more readable, and how arrays and pointers extend these ideas to represent and manage multiple devices at once.
You also practiced searching for a device by name and updating its fields, which reinforced the difference between dot and arrow access. You saw how pointers to structs (p->enabled) are the natural way to work with objects that are passed around in kernel code. You explored how to sort devices using qsort() and a comparator function, a pattern you will find in kernel subsystems that need to keep lists ordered. Finally, you learned how to apply changes to all devices in one sweep with a simple loop, a technique that reflects how drivers often update state across a whole set of devices when an event occurs.
By completing these labs, you have moved closer to the way FreeBSD drivers are really written: maintaining tables of devices, searching them efficiently, updating their state as hardware events happen, and relying on well-defined structs and typedefs to keep the code both safe and understandable.
Common Beginner Pitfalls with Structs and typedefs
Working with structs in C is straightforward once you get the hang of it, but beginners often trip over the same mistakes. In kernel programming, these mistakes are more than academic; they can lead to corrupted state, unexpected panics, or unreadable code.
1. Forgetting to initialise fields Declaring a struct does not automatically clear its memory. The fields contain whatever random values were left behind.
struct tty_device dev; // uninitialised fields contain garbage
printf("%d\n", dev.minor_number); // undefined value
Safe practice: Always initialise structs. You can:
-
Use designated initialisers:
struct tty_device dev = {.minor_number = 0, .enabled = 1}; -
Or clear everything:
struct tty_device dev = {0}; // all fields zeroed -
Or, in kernel code, call an init helper (like
make_dev_args_init(&args)in FreeBSD).
2. Mixing up dot and arrow
The dot (.) is used with struct variables. The arrow (->) is used with pointers to structs. Beginners often use the wrong one.
struct tty_device dev;
struct tty_device *p = &dev;
dev.minor_number = 1; // correct (variable)
p->minor_number = 2; // correct (pointer)
p.minor_number = 3; // wrong since p is a pointer
Safe practice: remember dot (.) for direct, arrow (->) for indirect.
3. Assuming copying structs = copying everything safely In C, assigning one struct to another copies all its fields by value. This can be dangerous if the struct contains pointers, because both structs will then point to the same memory.
struct buffer {
char *data;
int len;
};
struct buffer a, b;
a.data = malloc(100);
a.len = 100;
b = a; // shallow copy: b.data points to the same memory as a.data
free(a.data); // now b.data is a dangling pointer
Safe practice: If your struct holds pointers, be explicit about ownership. In kernel code, be clear about who allocates and frees memory. Sometimes you need to write a custom "copy" function.
4. Misunderstanding padding and alignment The compiler may insert padding between struct fields to meet alignment requirements. This can surprise beginners who assume the struct's size is just the sum of its fields.
struct example {
char a;
int b;
};
printf("%zu\n", sizeof(struct example)); // Often 8, not 5
Safe practice: Don't assume layout. If you need tight packing for hardware structures, use fixed-size types (uint32_t) and consult sys/param.h. In FreeBSD, bus structures often rely on precise layouts.
5. Overusing typedef
typedef makes code shorter, but using it everywhere can hide meaning.
typedef struct {
int x, y, z;
} Foo; // "Foo" tells us nothing
Safe practice: Reserve typedef for:
- Very common or standardised types (
pid_t,uid_t,device_t). - Hiding architecture details (
uintptr_t,vm_offset_t). - Complex or pointer-heavy types (
typedef struct _device *device_t;).
Avoid typedefs for one-off structs where struct makes intent clearer.
Recap Quiz: Structs and typedef Pitfalls
Before moving on, take a moment to test yourself. The following quick quiz revisits the most common pitfalls we discussed with structs and typedefs.
These questions are not about memorizing syntax, but about checking whether you understand the reasoning behind safe practices. If you can answer them confidently, you are well prepared to recognise and avoid these mistakes when you start writing real FreeBSD driver code.
Let's try?
-
What happens if you declare a struct without initialising it?
- a) It is filled with zeroes.
- b) It contains leftover garbage values.
- c) The compiler raises an error.
-
You have this code:
struct device dev; struct device *p = &dev;Which are correct ways to set a field?
- a)
dev.id = 1; - b)
p->id = 1; - c)
p.id = 1;
- a)
-
Why is assigning one struct to another (
b = a;) dangerous if the struct contains pointers? -
Why might
sizeof(struct example)be larger than the sum of its fields? -
When is it appropriate to use
typedefwith a struct, and when should you avoid it?
Wrapping Up
You have now learned how to define and use structures, how to make code cleaner with typedef, and how to recognize real patterns inside the FreeBSD kernel. Structures are the backbone of kernel programming. Almost every subsystem is represented as a struct, and device drivers rely on them to maintain state, configuration, and runtime data.
In the next section we will take this one step further by looking at header files and modular code in C. This will show you how structs and typedefs are shared across multiple source files, which is precisely how large projects like FreeBSD stay organized and maintainable.
Header Files and Modular Code
So far, all of our programs have lived in a single .c file. That's fine for small examples, but real software quickly outgrows this model. The FreeBSD kernel itself is made up of thousands of small files, each with a clear responsibility. To manage this complexity, C provides a way to build modular code: splitting definitions into .c files and declarations into .h files.
We already saw in Section 4.3 that #include pulls in header files, and in Section 4.7 that declarations tell the compiler what a function looks like. Now we'll go one step further and combine these ideas into a method that scales: header files for modular programs.
What Is a Header File?
When programs grow beyond a single file, you need a way for different .c files to "agree" on what functions, structures, or constants exist. That's the job of a header file, which normally has the extension .h.
Think of a header as a contract: it doesn't do the work itself, but it describes what is available for other files to use. A .c file can then include this contract with #include so that the compiler knows what to expect.
Typical things you will find in a header are:
- Function prototypes (so other files know how to call them)
- Struct and enum definitions (so data types are shared consistently)
- Macros and constants defined with
#define externvariable declarations (so a variable can be shared without creating multiple copies)- Occasionally, small inline helper functions
A header file is never compiled independently. Instead, it is included by one or more .c files, which then provide the actual implementations.
Header Guards: Why They Exist and How They Work
One of the first problems you encounter in modular programs is accidental duplication.
Imagine this scenario:
main.cincludesmathutils.h.mathutils.calso includesmathutils.h.- When the compiler combines everything, it may attempt to include the same header file twice.
If the header defines the same functions or structures more than once, the compiler will throw errors like "redefinition of struct ...".
To prevent this, C programmers wrap every header in a guard, which is a set of preprocessor commands that tell the compiler:
- "If this header hasn't been included yet, include it now."
- "If it has already been included, skip it."
Here's what it looks like in practice:
#ifndef MATHUTILS_H
#define MATHUTILS_H
int add(int a, int b);
int subtract(int a, int b);
#endif
Step by step:
#ifndef MATHUTILS_HReads as "if not defined." The preprocessor checks whetherMATHUTILS_His already defined.#define MATHUTILS_HIf it wasn't defined, we define it now. This marks the header as "processed."- Contents of the header Prototypes, constants, or struct definitions should be placed here.
#endifEnds the guard.
If another file includes mathutils.h again later:
- The preprocessor checks
#ifndef MATHUTILS_H. - But now
MATHUTILS_His already defined. - Result: the compiler skips the entire file.
This ensures the header's contents are only included once, even if you #include it from multiple places.
Why the Name Matters
The symbol used in the guard (MATHUTILS_H in our example) is arbitrary, but there are conventions:
- Written in all caps
- Usually based on the file name
- Sometimes includes path information for uniqueness (e.g.,
_SYS_PROC_H_for/usr/src/sys/sys/proc.h)
A Real FreeBSD Example
If you open the file /usr/src/sys/sys/proc.h, you'll see a very similar guard:
#ifndef _SYS_PROC_H_
#define _SYS_PROC_H_
/* contents of the header ... */
#endif /* !_SYS_PROC_H_ */
This is the same pattern, just with a different naming style. It ensures that the definition of struct proc (and everything else in that header) is only read once, no matter how many other files include it.
Hands-On Lab 1: Seeing Header Guards in Action
Let's prove why header guards matter by creating two small programs. The first will fail without guards, and the second will succeed once we add them.
Step 1: Create a Header Without a Guard
File: badheader.h
// badheader.h
int add(int a, int b);
File: badmain.c
#include <stdio.h>
#include "badheader.h"
#include "badheader.h" // included twice on purpose!
int add(int a, int b) {
return a + b;
}
int main(void) {
printf("Result: %d\n", add(2, 3));
return 0;
}
Now compile it:
% cc badmain.c -o badprog
You should see an error similar to:
badmain.c:3:5: error: redefinition of 'add'
This occurred because the header was included twice, causing the compiler to see the same function declaration twice.
Step 2: Fix It with a Header Guard
Now edit badheader.h to add a guard:
#ifndef BADHEADER_H
#define BADHEADER_H
int add(int a, int b);
#endif
Recompile:
% cc badmain.c -o goodprog
% ./goodprog
Result: 5
With the guard in place, the compiler ignored the duplicate include, and the program compiled successfully.
What You Learned
- Including the same header twice without guards causes errors.
- A header guard prevents those errors by ensuring the file's contents are only processed once.
- This is why every professional C project, including the FreeBSD kernel, consistently uses header guards.
Why Use Modular Code?
As programs grow, keeping everything in one .c file quickly becomes messy. A single file full of unrelated functions is hard to read, hard to maintain, and almost impossible to scale. Modular code solves this problem by splitting the program into logical parts, with each file handling a clear responsibility.
Think of it like building with LEGO: each block is small and specialised, but they connect cleanly to form something larger. In C, the "connecting pieces" are header files. They allow one part of the program to know about the functions and data structures defined in another part without copying the same declarations everywhere.
This separation has important advantages:
- Organisation - Each
.cfile focuses on a single task, making the code easier to navigate. - Reusability - Headers allow you to use the same function declarations in many different files without rewriting them.
- Maintainability - If a function's signature changes, you update its header once and every file that includes it stays consistent.
- Scalability - Adding new functionality is easier when each piece of code has its own place.
- Kernel compatibility - This is exactly how FreeBSD is built: the kernel, drivers, and subsystems all rely on thousands of small
.cand.hfiles that fit together. Without modularity, a project the size of FreeBSD would be unmanageable.
A Simple Multi-File Example
Let's write a small program split into three files.
mathutils.h
#ifndef MATHUTILS_H
#define MATHUTILS_H
int add(int a, int b);
int subtract(int a, int b);
#endif
mathutils.c
#include "mathutils.h"
int add(int a, int b) {
return a + b;
}
int subtract(int a, int b) {
return a - b;
}
main.c
#include <stdio.h>
#include "mathutils.h"
int main(void) {
int x = 10, y = 4;
printf("Add: %d\n", add(x, y));
printf("Subtract: %d\n", subtract(x, y));
return 0;
}
Compile them together:
% cc main.c mathutils.c -o program
% ./program
Add: 14
Subtract: 6
Notice how the header allowed main.c to call add() and subtract() without knowing their implementation details.
How FreeBSD Uses Headers (Example from FreeBSD 14.3)
Kernel headers in FreeBSD do more than just hold declarations. They are carefully structured: first they protect against double inclusion, then they pull in only the dependencies they need, then they give you guidance on how to read the data types they define, and finally they publish those shared types. A good example of this is the header sys/sys/proc.h, which defines the struct proc used to represent processes.
Let's walk through three important parts of this header.
1) Header guard and focused includes
File: /usr/src/sys/sys/proc.h - Lines 37-49
#ifndef _SYS_PROC_H_ // Guard: process this header only once
#define _SYS_PROC_H_
#include <sys/callout.h> /* For struct callout. */
#include <sys/event.h> /* For struct klist. */
#ifdef _KERNEL
#include <sys/_eventhandler.h>
#endif
#include <sys/_exterr.h>
#include <sys/condvar.h>
#ifndef _KERNEL
#include <sys/filedesc.h>
#endif
What to notice
- The guard
_SYS_PROC_H_ensures the file is processed only once, no matter how many times it is included. - The includes are minimal and purposeful. For example,
<sys/filedesc.h>is pulled in only when compiling for userland (#ifndef _KERNEL). This keeps the kernel build lean and avoids unnecessary dependencies.
This shows the discipline you should adopt in your own drivers: guard every header, and only include what you truly need.
2) Preparing the reader: documentation + forward declarations
Lines 129-206 (excerpt)
/*-
* Description of a process.
*
* This structure contains the information needed to manage a thread of
* control, known in UN*X as a process; it has references to substructures
* containing descriptions of things that the process uses, but may share
* with related processes. ...
*
* Below is a key of locks used to protect each member of struct proc. ...
* a - only touched by curproc or parent during fork/wait
* b - created at fork, never changes
* c - locked by proc mtx
* d - locked by allproc_lock lock
* e - locked by proctree_lock lock
* ...
*/
struct cpuset;
struct filecaps;
struct filemon;
/* many more forward declarations ... */
struct syscall_args {
u_int code;
u_int original_code;
struct sysent *callp;
register_t args[8];
};
What to notice
- Before introducing
struct proc, the header provides a long explanatory comment. This is not just filler; it tells you what a process is and introduces the locking key (letters likea,b,c,d,e). These letters are used as shorthand later, next to each field of the process structure, to show which lock protects it. - Then the file defines a number of forward declarations (e.g.,
struct cpuset;). These act as placeholders, saying "this type exists, but you don't need its full definition here." This keeps the header lightweight and avoids pulling in lots of other headers unnecessarily.
Think of this section as a map legend: it gives you the symbols (locking keys) and placeholders you'll need before you dive into the main data structure. And in the next item, you'll see exactly how those locking keys are applied to real fields inside struct proc.
3) The shared data structure definition (struct proc)
Lines 652-779 (start and a few fields shown)
/*
* Process structure.
*/
struct proc {
LIST_ENTRY(proc) p_list; /* (d) List of all processes. */
TAILQ_HEAD(, thread) p_threads; /* (c) all threads. */
struct mtx p_slock; /* process spin lock */
struct ucred *p_ucred; /* (c) Process owner's identity. */
struct filedesc *p_fd; /* (b) Open files. */
...
pid_t p_pid; /* (b) Process identifier. */
...
struct mtx p_mtx; /* (n) Lock for this struct. */
...
struct vmspace *p_vmspace; /* (b) Address space. */
...
char p_comm[MAXCOMLEN + 1]; /* (x) Process name. */
...
struct pgrp *p_pgrp; /* (c + e) Process group linkage. */
...
};
What to notice
- This is the central definition: the
struct proc. - Every field is annotated with one or more locking keys from the earlier comment. For example,
p_listis marked(d), meaning it is protected by theallproc_lock, whilep_threadsis marked(c), meaning it requires theproc mtx. - The structure connects to other subsystems through pointers (
ucred,vmspace,pgrp). Because the header only needs to declare the pointer types, the earlier forward declarations were enough; no need to include all of those subsystem headers here.
This shows the modularity pattern in action: a single header publishes a large, shared structure in a way that is safe, documented, and efficient to include across the kernel.
How to verify on your machine
- Guard and includes: lines 37-49
- Comment with locking key and forward declarations: lines 129-206
- Start of
struct proc: lines 652-779
You can confirm these with:
% nl -ba /usr/src/sys/sys/proc.h | sed -n '37,49p'
% nl -ba /usr/src/sys/sys/proc.h | sed -n '129,206p'
% nl -ba /usr/src/sys/sys/proc.h | sed -n '652,679p'
Note: nl -ba prints line numbers for every line (even blanks), so the ranges above will match exactly on your system.
Why this matters
When you write FreeBSD drivers, you'll follow the same structure:
- Put shared types and prototypes in headers.
- Protect every header with a guard.
- Use forward declarations to avoid dragging in unnecessary dependencies.
- Document your headers so future developers know how to use them.
The modular, disciplined structure you see here is exactly what makes a large system like FreeBSD maintainable.
Common Beginner Pitfalls with Headers
Headers make modular code possible, but they also introduce a few traps that beginners often fall into. Here are the main ones to watch for:
1. Defining variables in headers
int counter = 0; // Wrong inside a header
Every .c file that includes this header will try to create its own counter, leading to multiple definition errors at link time.
Fix: In the header, declare it with extern int counter;. Then, in exactly one .c file, write int counter = 0;.
2. Forgetting header guards
Without guards, including the same header twice will cause "redefinition" errors. This is especially easy to do in large projects where one header includes another.
Fix: Always wrap the file in a guard or use #pragma once if your compiler supports it.
3. Declaring but never defining
A prototype in a header tells the compiler that a function exists, but if you never write it in a .c file, the linker will fail with an "undefined reference" error.
Fix: For every declaration you publish in a header, make sure there's a matching definition somewhere in your code.
4. Circular includes
If a.h includes b.h and b.h includes a.h, the compiler will chase its tail until it errors out. This can happen by accident in large modular projects.
Fix: Break the cycle. Usually you can move one of the includes into the corresponding .c file, or replace it with a forward declaration if you only need a pointer type.
Hands-On Lab 2: Triggering Pitfalls on Purpose
Let's reproduce two of these mistakes so you can see what they look like.
Part A: Defining a variable in a header
Create badvar.h:
// badvar.h
int counter = 0; // Wrong: definition in header
Create file1.c:
#include "badvar.h"
int inc1(void) { return ++counter; }
Create file2.c:
#include "badvar.h"
int inc2(void) { return ++counter; }
And finally main.c:
#include <stdio.h>
int inc1(void);
int inc2(void);
int main(void) {
printf("%d\n", inc1());
printf("%d\n", inc2());
return 0;
}
Now compile:
% cc main.c file1.c file2.c -o badprog
You'll see an error like:
multiple definition of `counter'
Both file1.c and file2.c created their own counter from the header, and the linker refused to merge them.
Fix: Change badvar.h to:
extern int counter; // Only a declaration
And in one .c file (say, file1.c):
int counter = 0; // The single definition
Now the program compiles and runs correctly.
Part B: Declaring but not defining a function
Create mylib.h:
int greet(void); // Declaration only
Create main.c:
#include <stdio.h>
#include "mylib.h"
int main(void) {
printf("Message: %d\n", greet());
return 0;
}
Compile:
% cc main.c -o badprog
The compiler accepts it, but the linker will fail:
undefined reference to `greet'
The declaration promised that greet() exists, but no .c file ever defined it.
Fix: Add a mylib.c with the real function:
#include "mylib.h"
int greet(void) {
return 42;
}
Recompile with:
% cc main.c mylib.c -o goodprog
% ./goodprog
Message: 42
What You Learned
- Defining variables directly in headers creates multiple copies across
.cfiles; useexterninstead. - A declaration without a definition will compile but fail later at the linking stage.
- Header guards and forward declarations prevent common inclusion issues.
- These small mistakes are easy to make, but once you've seen the error messages, you'll recognize them instantly in your own projects.
This is the same discipline followed in FreeBSD: headers only declare things, while the actual work lives in .c files.
Recap Quiz: Header Pitfalls
-
Why does putting
int counter = 0;inside a header cause multiple definition errors?- a) Because each
.cfile gets its own copy - b) Because the compiler does not allow global variables
- c) Because headers cannot contain integers
- a) Because each
-
If a function is declared in a header but never defined in a
.cfile, which stage of the build process will report the error?- a) Preprocessor
- b) Compiler
- c) Linker
-
What is the purpose of a forward declaration (e.g.,
struct device;) in a header?- a) To save memory at runtime
- b) To tell the compiler that the type exists without pulling in its full definition
- c) To define the entire structure immediately
-
What problem do header guards prevent?
- a) Declaring but not defining a function
- b) Multiple inclusion of the same header
- c) Circular dependencies between headers
Hands-On Lab 3: Your First Modular Program
Up to now, you've written all your programs in a single .c file. That works for small exercises, but real-world projects, especially the FreeBSD kernel, rely on many files working together. Let's build a tiny modular program that mimics this structure.
You'll create three files:
Step 1: The Header (greetings.h)
This file declares what the other .c files need to know.
#ifndef GREETINGS_H
#define GREETINGS_H
void say_hello(void);
void say_goodbye(void);
#endif
Notice the header guard. Without it, including the same header more than once would cause errors.
Step 2: The Implementation (greetings.c)
This file provides the actual definitions.
#include <stdio.h>
#include "greetings.h"
void say_hello(void) {
printf("Hello from greetings.c!\n");
}
void say_goodbye(void) {
printf("Goodbye from greetings.c!\n");
}
Step 3: The Main Program (main.c)
This file uses the functions.
#include "greetings.h"
int main(void) {
say_hello();
say_goodbye();
return 0;
}
Step 4: Compile and Run
Compile all three files together:
% cc main.c greetings.c -o greetings
% ./greetings
Expected output:
Hello from greetings.c!
Goodbye from greetings.c!
Understanding the Output
main.cdoesn't know how the functions are implemented; it only knows their prototypes fromgreetings.h.greetings.ccontains the real code, which the linker combines withmain.c.- This is exactly how modular programming works: headers declare,
.cfiles define.
This lab is different from the pitfalls lab because here you're building a working modular program from scratch, not deliberately breaking things.
Hands-On Lab 4: Exploring FreeBSD Headers
Now that you've built your own modular program, let's explore how FreeBSD does the same thing at scale.
Step 1: Locate the Process Header
Go to the system headers:
% cd /usr/src/sys/sys
% grep -n "struct proc" proc.h
This shows you where the definition of struct proc begins.
Step 2: Examine Members of struct proc
Scroll through the file and write down at least five fields you find, for example:
p_pid- the process identifierp_comm- the process namep_ucred- the user credentialsp_vmspace- the address spacep_threads- the list of threads
Each of these fields connects the process to another subsystem in the kernel.
Step 3: Find Where They Are Used
Search for one field (for example, p_pid) across the source tree:
% cd /usr/src/sys
% grep -r p_pid .
You'll see dozens of files that use this field. That's the power of modularity: a single definition in proc.h is shared by the entire kernel.
What You Learned
- The definition of
struct proclives in one header file. - Any
.cfile that needs process information just includes<sys/proc.h>. - This avoids duplication and guarantees consistency: every
.cfile sees the same structure.
This exercise helps you start reading kernel headers the way kernel developers do by following types from their declaration to their usage across the system.
Why This Matters for Driver Development
At first glance, writing headers might seem like bookkeeping, but for FreeBSD driver developers, headers are the glue that holds everything together.
When you write a driver, you will:
- Declare your driver's interface in a header, so other parts of the kernel know how to call your functions (for example, probe, attach, and detach routines).
- Include kernel subsystem headers such as
<sys/bus.h>,<sys/conf.h>, or<sys/proc.h>, so you can use core types likedevice_t,struct cdev, andbus_space_tag_t. - Reuse subsystem APIs without rewriting them; you include the right header and suddenly your driver has access to the correct functions and structures.
- Keep your driver maintainable. Clean headers make it easy for other developers to understand what your driver exposes.
Headers are also one of the first things reviewers check when you submit code to FreeBSD. If your driver puts definitions in the wrong place, forgets guards, or pulls in unnecessary dependencies, it will be flagged immediately. Clean, minimal headers show that you understand how to integrate properly into the kernel.
In short, headers are not optional details; they are the foundation of collaboration inside the FreeBSD kernel.
Wrapping Up
Header files are the backbone of modular C programming. They allow multiple files to work together safely, prevent duplication, and give structure to large systems like FreeBSD.
By now, you've:
- Written your own modular program with headers
- Explored how FreeBSD uses headers to define
struct proc - Learned common pitfalls to avoid
- Understood why headers are critical for writing drivers
In the next section, we'll see how the compiler and linker combine all these separate .c and .h files into a single program and how debugging tools help when things go wrong. This is where your modular code truly comes to life.
Compiling, Linking, and Debugging C Programs
Up to this point you have written short C programs and even split them into more than one file. Now it is time to understand what really happens between writing your code and running it on FreeBSD. This is the step where your text file becomes a living program. It is also the step where you first learn to read the compiler's messages and start using a debugger when things go wrong. These skills are the difference between "just typing code" and truly programming with confidence.
The compilation pipeline, in plain words
Think of the compiler as a factory line. You give it raw material (your .c source file) and it sends the material through several machines. By the end, you have a finished product: an executable program you can run.
Here is the journey in simple terms:
- Preprocessing - This is like preparing the ingredients before cooking. The compiler looks at
#includeand#definelines and replaces them with the actual content or values. At this stage, your program is expanded into the final "recipe" that will be cooked. - Compilation - Now the recipe is turned into instructions your CPU can understand at a more abstract level. Your C code is translated into assembly language, which is close to the language of the hardware but still readable to humans.
- Assembling - This step turns the assembly instructions into pure machine code, stored in an object file with the extension
.o. These files are not yet a full program; they are like separate puzzle pieces. - Linking - Finally, the puzzle pieces are put together. The linker combines all your object files, and also pulls in any libraries you need (such as the standard C library for
printf). The result is a single executable file you can run.
On FreeBSD the command cc (which uses Clang by default) takes care of this whole process for you. You usually do not see each step, but knowing that they exist makes it much easier to understand error messages. For example, a syntax error happens during compilation, while a message like "undefined reference" comes from the linking stage.
Reading compiler messages with care
By now, you have compiled many small programs. Instead of repeating those steps, let us pause to look more closely at what the compiler tells you during compilation. The messages from cc are not just obstacles; they are hints, sometimes even teaching moments.
Take this small example:
#include <stdio.h>
int main(void) {
prinft("Oops!\n"); // Typo on purpose
return 0;
}
If you compile with:
% cc -Wall -o hello hello.c
you will see a message similar to:
hello.c: In function 'main':
hello.c:4:5: error: implicit declaration of function 'prinft'
The compiler is saying: "I do not know what prinft is, maybe you meant printf?"
The -Wall flag is important because it enables the standard set of warnings. Even when your program does compile, warnings can alert you to suspicious code that may later cause a bug.
From this point forward, make it a habit:
- Always compile with warnings enabled.
- Always read the first error or warning carefully. Often fixing that one will also fix the rest.
This practice may look simple, but it is the same discipline you will need when building large drivers in the FreeBSD kernel, where build logs can be long and intimidating.
Multi-file programs and linker errors
You already know how to split code into multiple .c and .h files. What matters now is understanding why we do this and what kinds of errors appear when something goes wrong.
When you compile each file separately, the compiler produces object files (.o). These are like puzzle pieces: each piece has some functions and data, but cannot run alone. The linker is the one that joins all the pieces into a complete picture.
For example, suppose you have this setup:
main.c
#include <stdio.h>
#include "utils.h"
int main(void) {
greet();
return 0;
}
utils.c
#include <stdio.h>
#include "utils.h"
void greet(void) {
printf("Hello from utils!\n");
}
utils.h
#ifndef UTILS_H
#define UTILS_H
void greet(void);
#endif
Build step by step:
% cc -Wall -c main.c # produces main.o
% cc -Wall -c utils.c # produces utils.o
% cc -o app main.o utils.o
Now imagine you accidentally change the function name in utils.c:
void greet_user(void) { // oops: name does not match
printf("Hello!\n");
}
Recompile and link, and you will see something like:
undefined reference to `greet'
This is a linker error, not a compiler error. The compiler was satisfied that each file made sense on its own. The linker, however, could not find the missing piece: the real machine code for greet.
By noticing whether an error comes from the compiler or the linker, you can immediately narrow your search:
- Compiler errors = the syntax or types inside one file are wrong.
- Linker errors = the files do not agree with each other, or a function is missing.
This difference is small but crucial. FreeBSD drivers are often spread across multiple files, so being able to distinguish between compiler and linker messages will save you hours of frustration.
Why make matters in real projects
Compiling a single file by hand with cc is manageable. Even two or three files are fine; you can still type out the commands without losing track. But as soon as your program grows beyond that, the manual approach starts to collapse under its own weight.
Imagine a project with ten .c files, each including two or three different headers. You fix one small typo in a header and suddenly every file that includes that header needs to be recompiled. If you forget even one, the linker may quietly stitch together a program that half-knows about your change and half-doesn't. These out-of-sync builds are some of the most confusing bugs for beginners, because the program "compiles" yet produces unpredictable results.
This is exactly the problem that make was created to solve. Think of make as the guardian of consistency.
A Makefile describes:
- what files your program consists of,
- how each one is compiled,
- and how they fit together into the final executable.
Once that description exists, make takes care of the tedious part. If you change a single source file, make recompiles only that file and then relinks the program. If you only edit comments, nothing gets rebuilt at all. This saves you both time and mistakes.
In FreeBSD, make is not an optional tool, it is the backbone of the build system. When you run make buildworld, you are rebuilding the entire operating system with the same rules you are learning here. When you compile a kernel module, the bsd.kmod.mk file you include in your Makefile is just a highly polished version of the simple Makefiles you are writing now.
The real lesson is this: by learning make now, you are not just avoiding repetitive typing. You are practicing the exact workflow you will need when you reach device drivers. Those drivers will almost always be split across multiple files, include kernel headers, and need to be rebuilt every time you adjust an interface. Without make, you would spend more time typing commands than actually writing code.
Common beginner pitfalls with make:
- Forgetting to list a header file as a dependency, which means changes in the header do not trigger a rebuild.
- Mixing spaces and tabs in a Makefile (a classic frustration). Only tabs are allowed at the start of a command line.
- Hardcoding flags instead of using variables like
CFLAGS, which makes your Makefile less flexible.
By practicing with small Makefiles now, you will be prepared for the larger, system-wide ones that power FreeBSD itself.
Hands-On Lab 1: When make misses a change
Objective: See what happens when a Makefile is missing a dependency, and why accurate dependency tracking is essential in real projects.
Step 1 - Set up a tiny project.
File: main.c
#include <stdio.h>
#include "greet.h"
int main(void) {
greet();
return 0;
}
greet.c
#include <stdio.h>
#include "greet.h"
void greet(void) {
printf("Hello!\n");
}
File: greet.h
#ifndef GREET_H
#define GREET_H
void greet(void);
#endif
File: Makefile
CC=cc
CFLAGS=-Wall -g
app: main.o greet.o
$(CC) $(CFLAGS) -o app main.o greet.o
main.o: main.c
$(CC) $(CFLAGS) -c main.c
greet.o: greet.c
$(CC) $(CFLAGS) -c greet.c
clean:
rm -f *.o app
Notice that the dependencies for main.o and greet.o do not list the header file greet.h.
Step 2 - Build and run.
% make
% ./app
Output:
Hello!
Everything works.
Step 3 - Change the header.
Edit greet.h:
#ifndef GREET_H
#define GREET_H
void greet(void);
void greet_twice(void); // New function added
#endif
Do not change main.c or greet.c. Now run:
% make
make responds with:
make: 'app' is up to date.
But this is wrong! The header changed, so main.o and greet.o should have been rebuilt.
Step 4 - Fix the Makefile.
Update the dependencies:
main.o: main.c greet.h
$(CC) $(CFLAGS) -c main.c
greet.o: greet.c greet.h
$(CC) $(CFLAGS) -c greet.c
Now run:
% make
Both object files are rebuilt as they should be.
What You Learned
This is a small example of a very real problem. Without correct dependencies, make may skip rebuilding files, leaving you with inconsistent results that are extremely hard to debug. In driver development this becomes critical: kernel headers change frequently, and if your Makefile doesn't track them, your module may compile but crash the system.
This lab teaches the real "aha!" moment: make is not just convenience, it is your safeguard against subtle, inconsistent builds.
Recap Quiz: Why make Matters
- If you change a header file but your Makefile does not list it as a dependency, what happens when you run
make?- a) All source files are automatically rebuilt.
- b) Only the files you changed are rebuilt.
- c)
makemay skip rebuilding, leaving your program out of sync.
- Why is it risky to have an out-of-sync build in FreeBSD driver development?
- a) The program may still run but silently ignore your new code.
- b) The kernel module may load but behave unpredictably, potentially crashing the system.
- c) Both of the above.
- In a Makefile, why do we use variables like
CFLAGSinstead of hardcoding flags in every rule?- a) It keeps the file shorter but less flexible.
- b) It makes it easier to change compiler options in one place.
- c) It forces you to type more at the command line.
- What is the main advantage of
makecompared to runningccby hand?- a)
makecompiles faster. - b)
makeensures only the necessary files are rebuilt, keeping everything consistent. - c)
makeautomatically fixes logic errors in your code.
- a)
Debugging with GDB: more than just "fixing bugs"
When your program crashes or behaves strangely, it is tempting to sprinkle printf statements everywhere and hope the output tells you the story. That can work for very small programs, but it quickly becomes messy and unreliable. What if the bug happens only once in fifty runs? What if printing the variable actually changes the timing and hides the problem?
This is where GDB, the GNU Debugger, becomes your best ally. A debugger does more than help you "fix bugs." It teaches you how your program really runs, step by step, and helps you build an accurate mental model of execution. With GDB, you are no longer guessing from the outside, you are looking inside the program while it runs.
Here are some of the things you can do with GDB:
- Set breakpoints at functions or even at specific lines, so the program pauses exactly where you want to observe it.
- Inspect variables and memory at that instant, seeing both values and addresses.
- Step through code line by line, watching how control flow really moves.
- Check the call stack, which tells you not only where you are, but how you got there.
- Watch variables change over time, a powerful way to confirm whether your logic matches your expectations.
For example, imagine your calculator program suddenly prints the wrong result for multiplication. With GDB, you can stop inside the multiply function, look at the input values, then step through the lines. You may discover that the function is accidentally adding instead of multiplying. The compiler could not warn you; syntactically, the code was fine. But GDB shows you the truth.
In user space, this is already helpful. In kernel space, it becomes essential. Drivers are often asynchronous, event-driven, and far harder to debug with printf. Later in this book you will learn to use kgdb, the kernel version of GDB, to step through drivers, inspect kernel memory, and analyse crashes. By learning the GDB workflow now, you are building reflexes that will carry directly into your FreeBSD driver work.
Common beginner pitfalls with GDB:
- Forgetting to compile with
-g, which leaves the debugger with no source-level information. - Using optimisations (
-O2,-O3) while debugging. The compiler may rearrange or inline code, making the debugger's view confusing. Use-O0for clarity. - Expecting GDB to fix logic errors automatically. A debugger does not correct code; it helps you see what the code is really doing.
- Quitting too early. Often, the first bug you notice is only a symptom. Use the stack trace to follow the chain of calls back to the real source.
Why this matters for driver development:
Kernel code has little tolerance for trial-and-error debugging. A careless mistake can crash the entire system. GDB trains you to be methodical: set breakpoints, inspect state, confirm assumptions, and step carefully. By the time you reach kernel debugging with kgdb, this disciplined approach will feel natural, and it will be the difference between hours of frustration and a clear path to the fix.
Hands-On Lab 2: Fixing a Logic Bug with GDB
Objective: Learn how to use GDB to step through a program and find a logic error that compiles fine but produces the wrong result.
Step 1 - Write a buggy program.
Create a file called math.c:
#include <stdio.h>
int multiply(int a, int b) {
return a + b; // BUG: this adds instead of multiplies
}
int main(void) {
int result = multiply(3, 4);
printf("Result = %d\n", result);
return 0;
}
At a glance, this looks fine; it compiles without warnings. But the logic is wrong.
Step 2 - Compile with debugging support.
Run:
% cc -Wall -g -O0 -o math math.c
-Wallturns on useful warnings.-gincludes extra information so GDB can show you the original source code.-O0disables optimisations, which keeps the code structure simple for debugging.
Step 3 - Run the program normally.
% ./math
Output:
Result = 7
Clearly, 3 * 4 is not 7, we have a hidden logic bug.
Step 4 - Start GDB.
% gdb ./math
This opens your program inside the GNU Debugger. You'll see a (gdb) prompt.
Step 5 - Set a breakpoint.
Tell GDB to pause when it enters the multiply function:
(gdb) break multiply
Step 6 - Run the program under GDB.
(gdb) run
The program starts, but pauses as soon as multiply is called.
Step 7 - Step through the function.
Type:
(gdb) step
This moves into the first line of the function.
Step 8 - Inspect the variables.
(gdb) print a
(gdb) print b
GDB shows you the values of a and b: 3 and 4.
Now type:
(gdb) next
This executes the buggy line return a + b;.
Step 9 - Spot the problem.
Look back at the code: multiply is returning a + b instead of a * b. The compiler could not warn you, because both are valid C. But GDB let you see the bug in action.
Step 10 - Fix and recompile.
Correct the function:
int multiply(int a, int b) {
return a * b; // fixed
}
Recompile:
% cc -Wall -g -O0 -o math math.c
% ./math
Output:
Result = 12
Bug fixed!
Why this matters:
This exercise shows you that not all bugs cause crashes. Some are logic errors that only a debugger (and your careful attention) can reveal. In kernel programming, this mindset is vital, you cannot rely on warnings or printf alone.
Hands-On Lab 3: Chasing a Segmentation Fault with GDB
Objective: Experience how GDB helps you investigate a crash by stopping at the exact point where the program failed.
Step 1 - Write a buggy program.
crash.c
#include <stdio.h>
int main(void) {
int *p = NULL; // a pointer with no valid target
printf("About to crash...\n");
*p = 42; // attempt to write to address 0
printf("This line will never be printed.\n");
return 0;
}
Compile it with debug info:
% cc -Wall -g -O0 -o crash crash.c
Step 2 - Run without GDB.
% ./crash
Output:
About to crash...
Segmentation fault (core dumped)
You know the program crashed, but not where or why. This is the frustration most beginners face.
Step 3 - Investigate with GDB.
% gdb ./crash
Inside GDB:
(gdb) run
The program will crash again, but this time GDB will stop at the exact line that caused the fault. You should see something like:
Program received signal SIGSEGV, Segmentation fault.
0x0000000000401136 in main () at crash.c:7
7 *p = 42; // attempt to write to address 0
Now ask GDB to show the backtrace:
(gdb) backtrace
This confirms that the crash happened in main at line 7, with no deeper calls.
Step 4 - Reflect and fix.
The debugger shows you the real cause: you dereferenced a NULL pointer. This is one of the most common mistakes in C and one of the most dangerous in kernel code. To fix it, you would need to make sure p points to valid memory before using it.
Note: Without GDB you would only see "Segmentation fault," leaving you to guess. With GDB, you immediately see the exact line and cause. In kernel programming, where a single NULL pointer can crash the whole operating system, this skill is essential.
Recap Quiz: Debugging Segmentation Faults with GDB
- When you run a buggy program outside of GDB and it crashes with a segmentation fault, what information do you usually get?
- a) The exact line of code that failed.
- b) Only that a segmentation fault occurred, without details.
- c) A list of variables that caused the crash.
- In the lab, why did the crash occur?
- a) The pointer
pwas set toNULLand then dereferenced. - b) The compiler miscompiled the program.
- c) The operating system refused to allow printing to the screen.
- a) The pointer
- How does running the program inside GDB improve your ability to diagnose the crash?
- a) GDB automatically fixes the bug.
- b) GDB pauses execution at the exact line of failure and shows the call stack.
- c) GDB re-runs the program until it succeeds.
- Why is mastering this debugging habit especially important for FreeBSD driver development?
- a) Because kernel bugs are harmless.
- b) Because a single invalid pointer can crash the entire operating system.
- c) Because drivers never use pointers.
A peek into real FreeBSD build rules
FreeBSD keeps the build system simple at the surface and powerful underneath. Your module's Makefile can be only a few lines because the heavy lifting is centralised in shared rules.
The include chain you are stepping into
When you write:
KMOD= hello
SRCS= hello.c
.include <bsd.kmod.mk>
bsd.kmod.mk looks very small, but it is the gateway into the kernel's build system. On a FreeBSD 14.3 system it does three important things:
- Optionally includes
local.kmod.mkso you can override defaults without editing system files. - Includes
bsd.sysdir.mk, which resolves the kernel source directory in the variableSYSDIR. - Includes
${SYSDIR}/conf/kmod.mk, where the real kernel module rules live.
This indirection is deliberate. Your tiny Makefile defines what you want to build. The shared mk files decide how to build it, using the same flags, headers, and link commands that the rest of the kernel uses. This ensures your module is always treated like a first-class citizen of the system.
Setting up your own module directory
To follow along, create a small working directory in your home folder, for example ~/hello_kmod, and place inside it the following two files:
File: hello.c
#include <sys/param.h>
#include <sys/module.h>
#include <sys/kernel.h>
#include <sys/systm.h>
static int
hello_modevent(module_t mod __unused, int event, void *arg __unused)
{
switch (event) {
case MOD_LOAD:
printf("Hello, kernel world!\n");
break;
case MOD_UNLOAD:
printf("Goodbye, kernel world!\n");
break;
default:
return (EOPNOTSUPP);
}
return (0);
}
static moduledata_t hello_mod = {
"hello",
hello_modevent,
NULL
};
DECLARE_MODULE(hello, hello_mod, SI_SUB_DRIVERS, SI_ORDER_MIDDLE);
File Makefile:
KMOD= hello
SRCS= hello.c
.include <bsd.kmod.mk>
Whenever we refer to "your module directory", this is the folder we mean: the one containing hello.c and Makefile.
A guided tour of the build system
Now run the following commands from inside that directory:
-
See where the kernel sources are:
% make -V SYSDIRThis shows the kernel source directory being used. If it does not match your running kernel, fix this before continuing.
-
Reveal the actual build commands:
% make clean % make -nThe
-noption prints the compile and link commands without running them. You will see the fullccinvocations with all the flags and include paths thatkmod.mkadded for you. -
Inspect important variables:
% make -V CFLAGS % make -V KMODDIR % make -V SRCSThese tell you which flags are being passed, where your
.kowill be installed, and which source files are included. -
Try a local override: Create a file called
local.kmod.mkin the same directory:DEBUG_FLAGS+= -gRebuild and notice that
-gnow appears in the compile lines. This demonstrates how you can customise the build without touching system files. -
Test dependency tracking: Add a second source file
helper.c, include it inSRCS, and build. Then modify onlyhelper.cand runmakeagain. Notice that only that file recompiles before the linker runs. The same mechanism scales to the hundreds of files in the FreeBSD kernel.
What you will find in ${SYSDIR}/conf/kmod.mk
When you include <bsd.kmod.mk> in your tiny Makefile, you are not telling the compiler how to build anything yourself. Instead, you are delegating the "how" to a much larger set of rules that live in the FreeBSD source tree under:
${SYSDIR}/conf/kmod.mk
This file is the blueprint for building kernel modules. It encodes decades of careful adjustments, ensuring that every module, whether written by you or a long-time committer, is built in exactly the same way.
If you open it, do not try to read every line. Instead, look for patterns that match the concepts you have just learned:
- Compilation rules: you will see how each
*.cinSRCSis transformed into an object file, and howCFLAGSare extended with kernel-specific defines and include paths. This is why you did not have to manually add-I/usr/src/sysor worry about the proper warning levels, the system did it for you. - Linking rules: deeper down, you will find the recipe that takes all those object files and links them into a single
.kokernel object. This is the equivalent of the finalcc -o app ...you ran for user programs, but tuned for the kernel environment. - Dependency tracking: the file also generates
.dependinformation so thatmakeknows exactly which files to rebuild when headers or sources change. This is the mechanism that protects you from "phantom bugs" caused by stale object files.
You do not need to memorise these rules, and you certainly do not need to re-implement them. But skimming through them once is valuable: it shows you that your three-line Makefile really expands into dozens of careful steps handled on your behalf.
Tip: A fun experiment is to run make -n in your module directory. This prints all the real commands that kmod.mk generates. Compare those with what you see in the file; it will help you connect theory (the rules) with practice (the commands).
By doing this, you begin to see the FreeBSD philosophy in action: developers declare what they want (KMOD= hello, SRCS= hello.c), and the build system decides how it is done. This ensures consistency across the whole kernel, whether the module is written by a beginner following this book or by a maintainer working on a complex driver.
Philosophy in practice
The FreeBSD build system is not just a technical detail; it reflects a philosophy that runs through the whole project.
- Small, declarative Makefiles. Your module's
Makefileonly says what you want to build: its name and its source files. You do not need to describe every compiler flag or linker step. Beginners can focus on the logic of the driver itself, not the plumbing. - Centralised rules. The real "how" is handled once, in the shared makefiles under
/usr/share/mkand${SYSDIR}/conf. By putting the rules in one place, FreeBSD ensures that every driver, yours included, is built with the same standards. If the toolchain changes, or a flag needs to be added, it is fixed centrally and you benefit automatically. - Consistency above all. The same approach builds the entire system: the kernel, userland programs, libraries, and your small module. This means there is no "special case" for beginners. The tiny "hello" module you wrote in this chapter is built in the same way as complex network drivers in the tree. That consistency gives confidence: if your module builds cleanly, you know it has passed through the same process as the rest of the system.
Imagine the alternative: every developer inventing their own build rules, each with slightly different flags or paths. Some modules would compile one way, others another. Debugging would be harder, maintenance nearly impossible. FreeBSD avoids this chaos by giving everyone a shared foundation.
By leaning on this infrastructure, you are not cutting corners; you are standing on decades of accumulated best practices. This is why FreeBSD code often looks "cleaner" than its equivalents elsewhere: developers do not waste time reinventing build logic. They spend their energy on correctness, clarity, and performance.
Common Beginner Pitfalls with FreeBSD Build Rules
Even with FreeBSD's clean build system, beginners often trip over the same issues. Let's look at them more closely.
1. Forgetting to install the kernel sources.
FreeBSD does not ship the full source tree by default. If you try to build a module without /usr/src installed, you will get confusing errors like "sys/module.h: No such file or directory." Always make sure you have the matching source tree for your kernel (for example, releng/14.3).
2. Misusing relative paths in Makefiles.
Beginners sometimes write SRCS= ../otherdir/helper.c or try to pull files from odd locations. While it may compile, it breaks consistency and can confuse make clean or dependency tracking. Keep your module self-contained in its own directory, or use proper include paths via CFLAGS+= -I... if you must share headers.
3. Hardcoding output locations.
Some newcomers try to force the .ko file into /boot/modules by hand, or set absolute paths in the Makefile. This breaks the standard make install workflow and can overwrite system files. Always let the build system decide the correct KMODDIR. Use make install to put your module in the right place.
4. Mixing userland headers with kernel headers.
It's tempting to #include <stdio.h> or other libc headers in your kernel code, but kernel modules cannot use the C library. If you need printing, use printf() from <sys/systm.h>, not <stdio.h>. Mixing headers compiles sometimes, but fails at link or load time.
5. Not cleaning before switching branches or kernel versions.
If you update your kernel or switch source trees but reuse the same object files, you may get baffling build errors. Always run make clean when switching environments, to avoid stale objects from polluting your build.
With this knowledge, you can now see why FreeBSD's module Makefiles are so short yet so powerful. The system does the heavy lifting, and your job is simply to avoid working against it. In the next chapters, when you start writing real drivers, this consistency will save you time and frustration, letting you focus on device logic rather than boilerplate.
Reflection Quiz: FreeBSD Build Rules
- Why can a FreeBSD kernel module
Makefilebe as short as three lines?- a) Because kernel modules don't need compiler flags.
- b) Because
bsd.kmod.mkpulls in all the rules and flags from the kernel build system. - c) Because the kernel automatically guesses how to compile your files.
- What does the variable
SYSDIRrepresent when you runmake -V SYSDIRinside your module directory?- a) The system log directory.
- b) The path to the kernel source tree used for building modules.
- c) The folder where
.kofiles are installed.
- Why is it dangerous to edit files under
/usr/share/mkor${SYSDIR}directly?- a) Because they are read-only on all FreeBSD systems.
- b) Because your changes will be lost on the next update and may break consistency.
- c) Because the compiler will refuse to use modified rules.
- If you want to add debugging flags (like
-g) without touching system files, what is the recommended way?- a) Add them directly into
/usr/share/mk/bsd.kmod.mk. - b) Set them in a local
local.kmod.mkfile. - c) Recompile the kernel with the
-gflag built in.
- a) Add them directly into
- What is the core philosophy behind FreeBSD's build system?
- a) Every developer writes their own Makefiles in full detail.
- b) Boilerplate is avoided by centralising rules, so all components are built consistently.
- c) Drivers are compiled manually with
ccand then hand-linked into the kernel.
Common beginner pitfalls
Working with FreeBSD's build system is straightforward once you understand the rules. But newcomers often fall into a few traps that can waste hours of debugging time. Let's look at the most common ones, with an eye on how they show up in real FreeBSD development.
Ignoring compiler warnings.
It is easy to shrug off warnings when the program still compiles. In user space, this is already risky; in kernel space, it can be catastrophic. A missing prototype, an implicit type conversion, or a discarded return value can all compile but still lead to unpredictable behaviour at runtime. FreeBSD's own build system is configured to treat many warnings as fatal, precisely because the project culture assumes "a warning today is a crash tomorrow." During your learning, adopt the same discipline: use -Wall -Werror so that warnings stop you early.
Mixing compiler and linker errors. Many beginners blur the line between these two. Compiler errors mean the C code is not valid in isolation; linker errors mean the different parts of the program do not agree. In kernel module development this distinction is crucial. For example, you might declare a driver method in a header but forget to provide its implementation. The compiler will happily generate an object file; the linker will later complain with "undefined reference." Understanding that difference quickly points you to whether the problem is in one file or across files.
Incomplete rebuilds leading to "phantom" bugs.
When your program is more than one file, it is tempting to recompile just the file you edited. But headers ripple changes into many files, and if you do not rebuild all of them, you can end up with an inconsistent module. The bug might vanish when you do a clean build, which makes it especially frustrating. The FreeBSD kernel build system avoids this with precise dependency tracking, but if you are experimenting in your own directory, remember to use make clean; make when in doubt. Phantom bugs are almost always build artefacts.
Mismatched kernel and headers.
This is the most FreeBSD-specific trap. Kernel modules must be built against the exact headers of the kernel they are loaded into. If you compile a .ko against FreeBSD 14.2 headers and try to load it into a 14.3 kernel, it may load with missing symbols, fail with cryptic errors, or even cause a crash. This mismatch is subtle because the compiler does not know your running kernel version, it will happily compile. Only at kldload time will the system refuse or, worse, misbehave. Always ensure make -V SYSDIR points to the same source tree as your running kernel. This is why, earlier, we insisted you check out the releng/14.3 sources when working through this book.
An example of mismatched headers in action
Suppose you have a FreeBSD 14.3 system running a kernel built from the 14.3 release branch. You then check out the main branch of the source tree (which may already contain 15.0 development changes) and try to build your hello kernel module against it. The code may compile without errors:
% cd ~/hello_kmod
% make clean
% make
This produces hello.ko as usual. But when you try to load it:
% sudo kldload ./hello.ko
you may see an error like:
linker_load_file: /boot/modules/hello.ko - unsupported file layout
kldload: can't load ./hello.ko: Exec format error
or, in more subtle cases:
linker_load_file: symbol xyz not found
This happens because the module was compiled against headers that declare kernel structures or functions differently from the kernel you are actually running. The compiler cannot detect this mismatch, because both sets of headers are "valid" C, but they no longer agree with your kernel's ABI.
The fix is simple but essential: always build modules against the matching source tree. For FreeBSD 14.3, clone the release branch:
% sudo git clone --branch releng/14.3 --depth 1 https://git.FreeBSD.org/src.git /usr/src
Then confirm that your module build points to it:
% make -V SYSDIR
This should print /usr/src/sys. If it does not, adjust your environment or symlink so that the build system uses the correct headers.
Hands-On Lab: Experiencing Common Pitfalls
Objective: See what happens when you fall into some of the classic traps we just discussed and learn how to avoid them.
Step 1 - Ignoring a warning.
Create warn.c:
#include <stdio.h>
int main(void) {
int x;
printf("Value: %d\n", x); // using uninitialized variable
return 0;
}
Compile with warnings enabled:
% cc -Wall -o warn warn.c
You'll see:
warning: variable 'x' is uninitialized when used here
Run it anyway:
% ./warn
Output might be Value: 32767 or some random number, because x contains garbage.
Lesson: Warnings often point to real bugs. Treat them seriously.
Step 2 - A linker error from missing function.
Create main.c:
#include <stdio.h>
void greet(void); // declared but not defined
int main(void) {
greet();
return 0;
}
Compile:
% cc -o test main.c
The compiler is happy, but the linker fails:
undefined reference to 'greet'
Lesson: The compiler only checks syntax. The linker enforces that declared symbols must exist somewhere.
Step 3 - Header mismatch (phantom bug demo).
-
Create
util.h:int add(int a, int b); -
Create
util.c:#include "util.h" int add(int a, int b) { return a + b; } -
Create
main.c:#include <stdio.h> #include "util.h" int main(void) { printf("%d\n", add(2, 3)); return 0; }
Compile:
% cc -Wall -c util.c
% cc -Wall -c main.c
% cc -o demo main.o util.o
% ./demo
Output: 5
Now change util.h to:
int add(int a, int b, int c); // prototype changed
And also update main.c to match:
#include <stdio.h>
#include "util.h"
int main(void) {
printf("%d\n", add(2, 3, 4)); // now passing 3 arguments
return 0;
}
But don't touch util.c (it still only accepts two parameters). Now recompile only the changed files:
% cc -Wall -c main.c # recompiles fine - header matches the call
% cc -o demo main.o util.o # links without error!
% ./demo
Result: Now ./demo may run incorrectly, print garbage, crash, or appear to work by accident, because main.c thinks add() takes three parameters, but util.o was compiled with a function that only takes two.
Lesson: Partial rebuilds can leave object files out of sync with headers, creating phantom bugs. The linker doesn't verify that function signatures match between compilation units, only that the symbol exists. This is why we use make with proper dependency tracking.
Recap Quiz: Common Pitfalls
- Why should you never ignore compiler warnings in kernel development?
- a) They are harmless.
- b) They often signal real bugs that can crash the system.
- c) They only matter in user space.
- If you declare a function in a header but forget to provide its definition, which stage will fail?
- a) Preprocessor.
- b) Compiler.
- c) Linker.
- What is a "phantom bug" and how does it usually appear?
- a) A bug caused by cosmic rays.
- b) A bug that disappears after a full rebuild because old object files were out of sync with changed headers.
- c) A bug in the debugger itself.
- Why is it important to build kernel modules with the correct version of the FreeBSD source tree?
- a) To get the newest features.
- b) Because mismatched headers can make modules fail to load or behave unpredictably.
- c) It does not matter; headers are always backward compatible.
Why this matters for FreeBSD driver development
Writing drivers is not like writing toy programs from a tutorial. Real drivers are made up of many source files, depend on a maze of kernel headers, and must be rebuilt every time the kernel changes. If you tried to manage this manually with raw cc commands, you would drown in complexity. That is why FreeBSD's build system, and your understanding of it, is so important.
When you learn to read compiler and linker messages carefully, you are not just fixing mistakes, you are building the habit of interpreting what the system is telling you. This habit will pay off when your driver spans multiple files, each depending on kernel interfaces that may change from one release to the next.
Debugging is another area where beginners often carry the wrong reflex. Sprinkling printf everywhere works in user space for very small programs, but in kernel space it can distort timing, slow down interrupts, or even mask the bug you are trying to chase. FreeBSD gives you better tools: symbolic debuggers, kernel core dumps, and kgdb. The way you practiced with gdb in this chapter is training for the same workflow you will later use on real drivers controlled, step-by-step, with the ability to look inside the system rather than guess from the outside.
Takeaway: Every habit you built here, compiling with warnings, using make to manage dependencies, reading errors with care, and debugging with intent, is not just academic. These are the working muscles of a FreeBSD driver developer.
Wrapping up
In this section, you moved beyond the one-line cc hello.c and began thinking like a system builder. You saw that a pipeline of stages produces programs, that compiler and linker errors tell different stories, that make keeps complex projects consistent, and that debuggers let you see code as it really runs.
With these foundations in place, you are ready to explore the C Preprocessor, the very first stage of the pipeline. This is where #include, #define, and conditional compilation reshape your source before the compiler ever sees it. Understanding this stage will explain why almost every FreeBSD kernel header begins with a dense block of preprocessor directives, and it will prepare you to read and write them with confidence.
The C Preprocessor: Directives Before Compilation
Before your C code is compiled, it makes a stop at an earlier stage called preprocessing. You can think of this as the pre-flight checklist for your program: it prepares the source, pulls in external declarations, expands macros, removes comments, and decides which parts of the code will even reach the compiler. In practice, this means the compiler never works directly on the file you wrote. Instead, it sees a transformed version that has already been “cleaned up” and adapted by the preprocessor.
This may sound like a small detail, but it has a huge impact. The preprocessor explains why a simple driver file in FreeBSD often starts with a long series of #include and #define lines before any actual function appears. It is also the reason the same kernel source code can build successfully on different CPU architectures, with or without debugging support, and with optional subsystems like networking or USB enabled.
For a beginner, the preprocessor is your first taste of how C code adapts itself to different situations without needing to maintain dozens of separate files. For a kernel developer, it is a daily tool that makes large projects like FreeBSD manageable, modular, and portable.
What the Preprocessor Does
The C preprocessor is best thought of as a text substitution engine. It does not understand C syntax, data types, or even whether what it produces makes logical sense. Its only job is to transform the source text according to the directives it finds, all before the compiler ever sees the file.
Here are its main responsibilities:
- Including headers so you can reuse declarations, macros, and constants from other files.
- Defining constants and macros that the compiler will see as if they were literally typed into the source code.
- Conditional compilation that allows sections of code to be included or excluded depending on configuration.
- Preventing duplicate inclusion of the same header, which avoids errors and keeps compilation efficient.
Because the preprocessor works purely at the text level, it is both powerful and risky. Used wisely, it makes your code portable, configurable, and easier to maintain. Used carelessly, it can lead to cryptic error messages and behaviour that is difficult to debug.
A Quick Example
Here is a tiny program that shows how the preprocessor can change behaviour without touching the logic of main():
#include <stdio.h>
#define DEBUG /* Try commenting this line out */
int main(void) {
printf("Program started.\n");
#ifdef DEBUG
printf("[DEBUG] Extra information here.\n");
#endif
printf("Program finished.\n");
return 0;
}
Compile and run it normally, then recompile after commenting out the #define DEBUG.
What to Observe:
- The
main()function is identical in both cases. - The only difference is whether the preprocessor includes or skips the
printf("[DEBUG] ...");line. - This demonstrates the central idea: the preprocessor decides what code the compiler even gets to see.
In FreeBSD drivers, this exact pattern is everywhere. Developers often wrap diagnostic output in preprocessor checks. For example, macros like DPRINTF() or device_printf() are enabled only when a debug flag is set. This means the same driver source code can produce a “quiet” production build or a “verbose” debugging build without changing the actual driver logic.
Real Example from FreeBSD 14.3 Source
In sys/dev/usb/usb_debug.h you will find the canonical USB debug pattern used across USB drivers:
/* sys/dev/usb/usb_debug.h */
#ifndef _USB_DEBUG_H_
#define _USB_DEBUG_H_
/* Declare global USB debug variable. */
extern int usb_debug;
/* Check if USB debugging is enabled. */
#ifdef USB_DEBUG_VAR
#ifdef USB_DEBUG
#define DPRINTFN(n, fmt, ...) do { \
if ((USB_DEBUG_VAR) >= (n)) { \
printf("%s: " fmt, __FUNCTION__ , ##__VA_ARGS__); \
} \
} while (0)
#define DPRINTF(...) DPRINTFN(1, __VA_ARGS__)
#define __usbdebug_used
#else
#define DPRINTF(...) do { } while (0)
#define DPRINTFN(...) do { } while (0)
#define __usbdebug_used __unused
#endif
#endif
/* ... later in the same header ... */
#ifdef USB_DEBUG
extern unsigned usb_port_reset_delay;
/* more externs... */
#else
#define usb_port_reset_delay USB_PORT_RESET_DELAY
/* more compile-time constants... */
#endif
#endif /* _USB_DEBUG_H_ */
How this works, in plain language:
- Drivers that want USB debug output define a macro
USB_DEBUG_VARto name their debug-level variable, most oftenusb_debug. - If you also compile with
USB_DEBUGdefined,DPRINTF(...)andDPRINTFN(level, ...)expand toprintfcalls that prefix messages with the current function name. IfUSB_DEBUGis not defined, both macros expand to do-nothing statements and vanish from the final binary. - The second block shows another common pattern: when
USB_DEBUGis defined you get tunable variables exported asextern unsignedso you can tweak timings while testing. When it is not defined, the same names become compile-time constants likeUSB_PORT_RESET_DELAY.
What to Observe:
- Everything is controlled by preprocessor switches, not by runtime
ifstatements. WhenUSB_DEBUGis off, the debug code is not even compiled. USB_DEBUG_VARgives each driver a simple knob: raise it to see more messages, lower it to stay quiet.- This mirrors the “Quick Example” you compiled earlier, but in a real subsystem that many drivers rely on.
Try It Yourself: A driver source file that wants debug logging would add:
#define USB_DEBUG_VAR usb_debug
#include <dev/usb/usb_debug.h>
If you build that driver with -DUSB_DEBUG, the DPRINTF() calls inside it will print messages. If you build without -DUSB_DEBUG, the same calls vanish from the compiled binary. This is the preprocessor in action, shaping how the same code behaves depending on build options.
Common Preprocessor Directives
Now that you know what the preprocessor does, let us look at the most common directives you will encounter. Each one begins with the # symbol and is processed before the compiler sees your code.
1. #include - Bringing in Header Files
The most visible directive is #include. It literally copies the contents of another file into your source before compilation.
#include <stdio.h> /* System header */
#include "myheader.h" /* Local header */
Angle brackets (< >) tell the preprocessor to look in the system’s include directories, while quotes (" ") tell it first to search the current directory.
In FreeBSD, every kernel source file begins with #include lines. For example, most device drivers start with:
#include <sys/param.h>
#include <sys/bus.h>
#include <sys/kernel.h>
<sys/param.h>brings in constants such as system version and limits.<sys/bus.h>defines the bus infrastructure that almost all drivers depend on.<sys/kernel.h>provides kernel-wide symbols and macros.
Without #include, you would need to manually copy declarations into every driver file, which would quickly become unmanageable.
2. #define - Creating Macros and Constants
#define is used to create symbolic names or small code fragments that are replaced before compilation.
#define BUFFER_SIZE 1024
#define MAX(a, b) ((a) > (b) ? (a) : (b))
Macros are not variables. They are direct text replacements. This makes them fast, but also prone to subtle bugs if parentheses are forgotten.
In FreeBSD, if you look at sys/sys/ttydefaults.h, you will find:
#define TTYDEF_IFLAG (BRKINT | ICRNL | IXON | IMAXBEL | ISTRIP)
#define CTRL(x) ((x) & 0x1f)
Here CTRL(x) maps a character into its control-key equivalent. It is used throughout the terminal subsystem.
In device drivers, macros are often used for register offsets or bit masks:
#define REG_STATUS 0x04
#define STATUS_READY 0x01
This makes driver code much more readable than writing raw numbers everywhere.
3. #undef - Removing a Definition
Sometimes a macro is defined differently depending on build conditions. #undef removes a previous definition so it can be replaced.
#undef BUFFER_SIZE
#define BUFFER_SIZE 2048
This is less common in FreeBSD drivers, but it appears in portability code that must adapt to different compilers or architectures.
4. #ifdef, #ifndef, #else, #endif - Conditional Compilation
These directives decide whether a block of code should be compiled depending on whether a macro is defined.
#ifdef DEBUG
printf("Debug mode is on\n");
#endif
#ifdefchecks if a macro exists.#ifndefchecks if it does not exist.
In FreeBSD, every header file in the kernel tree uses an include guard to prevent multiple inclusion. For example, at the top of sys/sys/param.h:
#ifndef _SYS_PARAM_H_
#define _SYS_PARAM_H_
/* declarations here */
#endif /* _SYS_PARAM_H_ */
Without this guard, including the same header twice would lead to duplicate definitions.
5. #if, #elif, #else, #endif - Numerical Conditions
These allow conditional compilation based on constant expressions.
#define VERSION 14
#if VERSION >= 14
printf("This is FreeBSD 14 or later\n");
#else
printf("Older version\n");
#endif
In FreeBSD kernel builds, numerical conditions are widely used to check for optional features. For instance, many files contain:
#if defined(INVARIANTS)
/* Add extra runtime checks */
#endif
The INVARIANTS flag enables additional sanity checks that help developers catch bugs during testing but are removed in production builds.
6. #error and #warning - Forcing Messages at Build Time
Sometimes you want to stop the build if a condition is not met.
#ifndef DRIVER_SUPPORTED
#error "This driver is not supported on your system."
#endif
This ensures that the driver will not even compile in an unsupported configuration.
In FreeBSD, build-time errors like this are common in sys/conf and device headers to prevent mismatched configurations. They provide an early warning before you even attempt to load a broken module.
7. Include Guards vs. #pragma once
Most FreeBSD headers use the #ifndef / #define / #endif pattern. Some modern compilers support #pragma once as a simpler alternative; however, FreeBSD maintains the portable style to ensure headers work on every compiler supported by the project.
Think of them as the switches and dials that configure how the compiler views your program. Without them, FreeBSD’s single source tree could never build across dozens of architectures and feature sets.
Hands-On Lab 1: Protecting Headers with Include Guards
In large projects like FreeBSD, header files are included by many different source files. Without protection, including the same header twice can cause compilation errors. Including guards solves this problem.
Step 1. Create a header file called myheader.h:
/* myheader.h */
#ifndef MYHEADER_H
#define MYHEADER_H
#define GREETING "Hello from the header file!\n"
#endif /* MYHEADER_H */
Step 2. Create a source file called main.c:
#include <stdio.h>
#include "myheader.h"
#include "myheader.h" /* included twice on purpose */
int main(void) {
printf(GREETING);
return 0;
}
Step 3. Compile and run:
% cc -o main main.c
% ./main
What to Observe:
- The program compiles and runs successfully even though the header was included twice.
- Without the
#ifndef / #define / #endifpattern inmyheader.h, you would get duplicate definition errors. - This is exactly how FreeBSD headers like
<sys/param.h>protect themselves.
Hands-On Lab 2: Enforcing Build Requirements with #error
In kernel development, it is often better to fail early than to build a driver that will not work. The preprocessor allows you to enforce conditions at compile time.
Step 1. Create a file named require_version.c:
#include <stdio.h>
#define KERNEL_VERSION 14
int main(void) {
#if KERNEL_VERSION < 14
#error "This code requires FreeBSD 14 or later."
#endif
printf("Building against FreeBSD version %d\n", KERNEL_VERSION);
return 0;
}
Step 2. Compile it normally:
% cc -o require_version require_version.c
You should see a successful build.
Step 3. Change the line #define KERNEL_VERSION 14 to 13 and try compiling again.
What to Observe:
- With
KERNEL_VERSION 14, the build succeeds. - With
KERNEL_VERSION 13, the build fails immediately and prints the error message. - This is the same pattern used in FreeBSD’s build system to stop unsupported configurations before they reach runtime.
Common Beginner Pitfalls with the Preprocessor
The preprocessor is simple and powerful, but it can also create subtle bugs if used carelessly. These are the issues beginners most often face when writing C and especially when moving into kernel code.
Forgetting that macros are plain text. A macro does not behave like a function or a variable. It is copied into your code as text. This easily goes wrong if you skip parentheses:
#define SQUARE(x) x*x /* buggy */
int a = SQUARE(1+2); /* expands to 1+2*1+2 -> 5, not 9 */
#define SQUARE_OK(x) ((x)*(x)) /* safe */
Multi-statement macros that misbehave.
A macro with more than one statement can break if/else logic unless wrapped properly:
#define LOG_BAD(msg) printf("%s\n", msg); counter++;
#define LOG_OK(msg) do { \
printf("%s\n", msg); \
counter++; \
} while (0)
Always prefer the do { ... } while (0) pattern so the macro behaves like a single statement.
Side effects in macro arguments. Arguments may be evaluated more than once, which can cause surprising results:
#define MAX(a,b) ((a) > (b) ? (a) : (b))
int i = 0;
int m = MAX(i++, 10); /* i++ may execute even when not needed */
When side effects matter, a real function or an inline function is safer.
Overusing macros where const, enum, or inline would be clearer.
Prefer const int size = 4096;, enum { BUFSZ = 4096 };, or static inline functions when type safety or single evaluation is important. Use macros for bit masks, compile-time switches, or code that truly must be erased at build time.
Weak or missing include guards. Use a unique, file-based guard for every header:
#ifndef _DEV_FOO_BAR_H_
#define _DEV_FOO_BAR_H_
/* ... */
#endif
Avoid short or generic guard names that could clash across the tree.
Mixing userland and kernel headers.
In kernel modules, do not include userland headers like <stdio.h>. Use kernel equivalents and follow kernel include order, for example:
#include <sys/param.h>
#include <sys/systm.h>
#include <sys/bus.h>
/* ... */
Using quotes vs angle brackets incorrectly.
Use #include "local.h" for headers in your own directory and #include <sys/param.h> for system headers. In the kernel, most includes are system headers.
Sprinkling #ifdef everywhere.
Scattered conditionals make code hard to read. Prefer centralising options in a small header (opt_*.h or a driver-local config.h). Keep long conditional blocks to a minimum and document what each flag controls.
Redefining build options manually.
Kernel options are often generated into opt_*.h headers during the build. Do not hand-define them in source files. Let the build system provide those macros, and include the correct option header when needed.
Undocumented -D flags.
If a module depends on a -DDEBUG or -DUSB_DEBUG flag, document it in the driver’s README or the top of the source file. Future you will thank present you.
Why This Matters for FreeBSD Driver Development
The preprocessor is one of the main reasons a single FreeBSD source tree can target many CPUs, chipsets, and build profiles.
Portability across architectures. Conditionals select the right code paths for amd64, arm64, riscv, and others, without maintaining separate files.
Feature toggles without runtime cost.
Flags like INVARIANTS, WITNESS, or subsystem-specific *_DEBUG macros enable deep checks and logs during development. In release builds, those blocks disappear so there is no performance penalty.
Readable hardware access.
#define constants give meaningful names to register offsets and bit masks. This is essential for driver clarity and review.
Configuration headers as single sources of truth.
A small header can control compile-time options for a driver. This keeps #ifdefs out of the main logic and makes the build behaviour explicit.
Fail early, fail clearly.
#error helps stop unsupported builds before you reach a kernel panic. A precise build-time message is better than a mysterious runtime failure.
Recap Quiz: The C Preprocessor
- What does the C preprocessor do with your source file before the compiler sees it?
- a) Optimises the code for performance
- b) Transforms the text according to directives like
#includeand#define - c) Converts C into assembly
- Why do almost all FreeBSD header files begin with
#ifndef ... #define ... #endif?- a) To make the file easier to read
- b) To ensure the file is only included once, avoiding duplicate definitions
- c) To reserve space in memory for the header
- Which of these definitions of a macro is safer?
- a)
#define SQUARE(x) x*x - b)
#define SQUARE(x) ((x)*(x))
- a)
- If you see a driver that uses
#ifdef USB_DEBUG, what does that mean?- a) USB debugging code will only be compiled if the macro is defined
- b) The driver requires USB hardware to be attached
- c) The driver is always compiled in debug mode
- What happens if you compile this program?
#include <stdio.h>
#define VERSION 13
int main(void) {
#if VERSION < 14
#error "This driver requires FreeBSD 14 or later."
#endif
printf("Driver loaded.\n");
return 0;
}
- a) It prints “Driver loaded.”
- b) It fails to compile and shows the error message
- c) It compiles but prints nothing
- In FreeBSD drivers, why is it better to use macros like
REG_STATUSorSTATUS_READYinstead of writing the raw numbers directly?- a) It makes the compiler run faster
- b) It improves readability and maintainability of the code
- c) It reduces memory usage
Wrapping Up
You have seen how the preprocessor sets the stage before the compiler arrives. It pulls in declarations, expands macros, and decides which code exists in a given build. Used with care, it makes your drivers portable, testable, and tidy. Used without discipline, it can hide bugs and make code difficult to reason about.
In the next section, Good Practices for C Programming, we will shift from mechanisms to habits. You will learn how to choose between macros and inline, how to name and document flags, how to structure includes in kernel code, and how to keep your driver readable for future contributors. We will also connect these practices with FreeBSD’s coding style so your code feels at home in the tree.
Good Practices for C Programming
By now you know the basic building blocks of the C language and how they fit together. But writing code that simply compiles and runs is not the same as writing code that others can read, maintain, and trust inside an operating system kernel. In FreeBSD, style and clarity are not cosmetic details; they are part of what keeps the system stable and maintainable over decades.
Consistent code makes bugs easier to spot, reviews faster to complete, and long-term maintenance less painful. It also helps your own future self when you return to a file months later and need to understand what you were thinking.
In this section, we will go beyond syntax and into habits. These habits may look small in isolation, choosing clear names, checking return values, keeping functions short, but together they shape code that fits naturally into the FreeBSD source tree. If you adopt them early, every driver you write will not only work, but also feel like part of FreeBSD itself.
Let us now explore these practices step by step, beginning with how to make your code readable through indentation, naming, and comments.
Readability First: Indentation, Names, and Comments
Programming is not just about instructing the computer; it is also about communicating with people. The next person who reads your code might be a FreeBSD maintainer reviewing your driver, or it might be you six months later trying to fix a bug you barely remember. Clear, consistent formatting and naming make that task easier and prevent misunderstandings that lead to mistakes.
Indentation
Indentation is like punctuation in writing: it does not change the meaning, but it makes the text easier to read and understand. In the FreeBSD kernel, code is formatted according to KNF (Kernel Normal Form). KNF specifies details such as where braces go, how many spaces follow a keyword, and how to line up code blocks.
FreeBSD uses tabs for indentation and spaces only for alignment. This keeps code consistent across different editors and makes diffs smaller when changes are committed.
Good indentation example (KNF style):
static int
my_device_probe(device_t dev)
{
struct mydev_softc *sc;
sc = device_get_softc(dev);
if (sc == NULL)
return (ENXIO);
return (0);
}
Notice how:
- The function name is on its own line, aligned under the return type.
- The opening brace
{is placed on a new line. - Each nested block is indented with a tab.
The result is a structure you can “see” at a glance.
Naming
Names should describe purpose. Computers do not care what you call things, but humans do. In kernel code, self-explanatory names make reviews smoother and reduce errors when different developers touch the same driver years apart.
- Functions: Use verbs and, in drivers, often a prefix for the driver or subsystem. Example:
uart_attach(),mydev_init(). - Variables: Use descriptive names, except for very short-lived counters in loops (
i,jare fine there). Avoid meaningless letters likeforxfor important values.
Unclear:
int f(int x) { return x * 2; }
Clearer:
int
double_value(int input)
{
return (input * 2);
}
The second example requires no guesswork. The reader instantly knows what the function does.
Comments
A well-placed comment explains why code exists, not just what it does. If the code is already obvious, repeating it in a comment wastes space. Use comments to describe assumptions, tricky conditions, or design choices.
Example:
/*
* The softc (software context) is allocated and zeroed by the bus.
* If it is not available here, something went wrong in probe.
*/
sc = device_get_softc(dev);
if (sc == NULL)
return (ENXIO);
The comment explains the reasoning, not the mechanics. Without it, a new reader might not know why failing here is the correct thing to do.
Why This Matters for Driver Development
Drivers live in the kernel tree for years. Many people will read and edit them. Reviewers have limited time, and confusing names, inconsistent indentation, or misleading comments slow them down and increase the risk of errors slipping through.
Readable code:
- Makes bugs easier to spot during review.
- Reduces merge conflicts by following the same indentation and naming conventions as everyone else.
- Helps future developers (including you) understand intent when debugging a crash at 3 a.m.
In FreeBSD, clarity and consistency are part of correctness.
Mini Lab: Make It Readable
Goal Take a small, messy function and turn it into code that a FreeBSD maintainer would enjoy reading. You will practice KNF indentation with tabs, clear naming, and comments that explain intent.
Starting file: readable_lab.c
#include <stdio.h>
int f(int x,int y){int r=0; for(int i=0;i<=y;i++){r=r+x;} if(r>100)printf(big\n);else printf(%d\n,r);return r;}
What to do
- Reformat the function to KNF style. Use tabs for indentation and keep braces and line breaks consistent with the example patterns you have seen.
- Rename the function and variables so a new reader understands the purpose without reading comments first.
- Add a short comment that explains why the loop starts where it starts and what the threshold check represents.
- Keep the logic identical. This lab is about readability, not algorithm changes.
Compile and run
% cc -Wall -Wextra -o readable_lab readable_lab.c
% ./readable_lab
You should see output when you later add a main() to test it. For now, focus on the function shape.
A clean result might look like this
#include <stdio.h>
/*
* Compute a repeated addition of value 'addend' exactly 'count + 1' times.
* We use <= in the loop to match the original behaviour.
* If the result crosses a simple threshold, print a note; otherwise print the value.
*/
int
accumulate_and_report(int addend, int count)
{
int total;
int i;
total = 0;
for (i = 0; i <= count; i++)
total = total + addend;
if (total > 100)
printf(big\n);
else
printf(%d\n, total);
return (total);
}
Checklist for yourself
- Function name states intent, not a single letter.
- Parameters and locals have descriptive names.
- Tabs used for indentation, spaces only for alignment.
- Braces and line breaks follow the KNF pattern shown earlier.
- Comment explains why, not what each line does.
Stretch test (optional, 3 minutes)
Add a tiny main() that calls your function twice with different inputs. Confirm the output looks sensible.
int
main(void)
{
accumulate_and_report(7, 5);
accumulate_and_report(20, 3);
return (0);
}
Compile and run:
% cc -Wall -Wextra -o readable_lab readable_lab.c
% ./readable_lab
What you learned Readable code is a sequence of small, consistent choices: names that tell a story, indentation that shows structure, and comments that capture intent. These habits make reviews faster and bugs easier to spot.
With readability in place, you are ready to apply the same discipline to error handling and small helper functions in the following subsection.
Small, Focused Functions and Early Returns
Short functions are easier to read, reason about, and test. In kernel code they also reduce the chance of subtle errors hiding inside long control paths.
Keep a single responsibility per function. If a function starts to branch into several tasks, split it into helpers. Use early returns for error cases so the happy path stays visible.
Before:
static int
mydev_configure(device_t dev, int flags)
{
struct mydev_softc *sc;
int err;
sc = device_get_softc(dev);
if (sc == NULL) {
device_printf(dev, no softc\n);
return (ENXIO);
}
/* allocate resources, set up interrupts, register sysctl, etc */
err = mydev_alloc_resources(dev);
if (err != 0) {
device_printf(dev, alloc failed: %d\n, err);
return (err);
}
if ((flags & 0x1) != 0) {
err = mydev_optional_setup(dev);
if (err != 0) {
mydev_release_resources(dev);
return (err);
}
}
err = mydev_register_sysctl(dev);
if (err != 0) {
mydev_teardown_optional(dev);
mydev_release_resources(dev);
return (err);
}
/* lots more steps here... */
return (0);
}
After: the same logic becomes a sequence of small, named steps.
static int mydev_setup_core(device_t dev);
static int mydev_setup_optional(device_t dev, int flags);
static int mydev_setup_sysctl(device_t dev);
static int
mydev_configure(device_t dev, int flags)
{
int err;
err = mydev_setup_core(dev);
if (err != 0)
return (err);
err = mydev_setup_optional(dev, flags);
if (err != 0)
goto fail_core;
err = mydev_setup_sysctl(dev);
if (err != 0)
goto fail_optional;
return (0);
fail_optional:
mydev_teardown_optional(dev);
fail_core:
mydev_release_resources(dev);
return (err);
}
You can now read the top of the function like a story. The cleanup is centralised and easy to audit.
Why this matters for drivers
Attach and detach paths can grow over time. Splitting them into helpers keeps diffs small and reduces the likelihood of merge conflicts. Centralized cleanup reduces leaks and makes failure paths predictable.
Check Return Values and Propagate Errors
In user programs, you can sometimes ignore a failed call, print a warning, and continue. At worst, your program crashes or produces wrong output. In the kernel, ignoring an error almost always causes bigger problems later. An unchecked failure might look harmless in one place but can turn into a panic or data corruption in a completely different subsystem hours later.
The rule in FreeBSD driver code is simple: check every return value, handle the error, and propagate it up the call chain.
The pattern
- Call the routine.
- If it fails, log briefly with the device as context. Do not spam the logs; one concise line is enough.
- Clean up any resources you acquired.
- Return the error code to the caller.
Example: Interrupt setup
int err;
err = bus_setup_intr(dev, sc->irq_res, INTR_TYPE_TTY,
NULL, mydev_intr, sc, &sc->irq_cookie);
if (err != 0) {
device_printf(dev, interrupt setup failed: %d\n, err);
goto fail_irq;
}
This makes the failure explicit, prints context (dev), and leaves cleanup to the fail_irq: label.
What not to do
- Ignore the result entirely:
bus_setup_intr(dev, sc->irq_res, INTR_TYPE_TTY,
NULL, mydev_intr, sc, &sc->irq_cookie); /* return value dropped */
This may work during testing, but will cause hard-to-trace bugs in production.
- Clever chaining:
if ((err = bus_setup_intr(dev, sc->irq_res, INTR_TYPE_TTY,
NULL, mydev_intr, sc, &sc->irq_cookie)) &&
(err = some_other_setup(dev)) &&
(err = yet_another(dev))) {
/* ... */
}
This hides which step failed and makes the logs confusing.
Practical tips for drivers
- Keep error checks local. Use the result right after the call, rather than collecting them and checking later.
- Use the smallest scope. Declare variables near their first use. Don’t reuse one
errvariable for unrelated operations in long functions. - Be consistent. Always return an error code (
errnovalue) that matches FreeBSD conventions (for example,ENXIOwhen no hardware is found). - Log once. Do not log repeatedly at every level of the call stack; let the highest-level function print the context.
Why this matters for FreeBSD drivers
Kernel drivers interact with hardware and subsystems that the rest of the operating system depends on. A single unchecked error can:
- Leave resources half-initialised.
- Cause crashes in unrelated code later.
- Waste hours in debugging, because the first cause of the failure was silently ignored.
By always checking and propagating errors, you make your driver predictable, debuggable, and trusted by the rest of the system.
Mini Lab: Check, Log, Clean Up, Propagate
Goal Take a “works on my machine” setup path that ignores return codes and turn it into FreeBSD-quality code that:
- checks every call,
- logs once with device context,
- unwinds resources in reverse order,
- returns an appropriate error code.
Starting file: error_handling_lab.c
#include <stdio.h>
#include <stdlib.h>
/* Simulated errno-style values (keep small and distinct) */
#define E_OK 0
#define E_RES -1
#define E_IRQ -2
#define E_SYS -3
/* Pretend “device context” */
struct softc {
int res_ready;
int irq_ready;
int sysctl_ready;
const char *name;
};
/* Simulated setup steps; each may succeed (E_OK) or fail (E_*) */
static int
alloc_resources(struct softc *sc, int should_fail)
{
(void)sc;
return should_fail ? E_RES : E_OK;
}
static int
setup_irq(struct softc *sc, int should_fail)
{
(void)sc;
return should_fail ? E_IRQ : E_OK;
}
static int
register_sysctl(struct softc *sc, int should_fail)
{
(void)sc;
return should_fail ? E_SYS : E_OK;
}
/* Teardown in reverse order */
static void
teardown_sysctl(struct softc *sc)
{
(void)sc;
}
static void
teardown_irq(struct softc *sc)
{
(void)sc;
}
static void
release_resources(struct softc *sc)
{
(void)sc;
}
/* TODO: Make this robust:
* - Check return values
* - Log briefly with device context (name + error code)
* - Unwind in reverse order on failure
* - Propagate the error to caller
*/
static int
mydev_attach(struct softc *sc, int fail_res, int fail_irq, int fail_sys)
{
int err = E_OK;
/* Resource allocation (ignored result) */
err = alloc_resources(sc, fail_res);
sc->res_ready = 1; /* assume success */
/* IRQ setup (ignored result) */
err = setup_irq(sc, fail_irq);
sc->irq_ready = 1; /* assume success */
/* Sysctl registration (ignored result) */
err = register_sysctl(sc, fail_sys);
sc->sysctl_ready = 1; /* assume success */
/* Pretend success no matter what */
return E_OK;
}
/* Simple logger to keep messages consistent */
static void
dev_log(const struct softc *sc, const char *msg, int err)
{
printf([%s] %s (err=%d)\n, sc->name ? sc->name : noname, msg, err);
}
int
main(int argc, char **argv)
{
struct softc sc = {0};
int fail_res = 0, fail_irq = 0, fail_sys = 0;
int rc;
sc.name = mydev0;
/* Usage: ./a.out [fail_res] [fail_irq] [fail_sys] (0 or 1) */
if (argc >= 2) fail_res = atoi(argv[1]);
if (argc >= 3) fail_irq = atoi(argv[2]);
if (argc >= 4) fail_sys = atoi(argv[3]);
rc = mydev_attach(&sc, fail_res, fail_irq, fail_sys);
printf(attach() returned %d\n, rc);
return rc == E_OK ? 0 : 1;
}
Tasks
- Check each call immediately.
After each setup step, if it fails:
- log exactly one concise line with
dev_log(), - jump to a single cleanup section,
- return the specific error code.
- log exactly one concise line with
- Unwind in reverse order.
If
sysctlfails, tear down nothing or only what was set after it; Ifirqfails, undoirqif needed and release resources; Ifresourcesfail, just return. - Keep scope small.
Declare
errnear its first use or use separateerr_*variables in small scopes. Avoid reusing one variable across unrelated checks if it hurts clarity. - Log once.
Only the failing site should log. The caller (
main) should not print an error message, beyond the final return code it already prints. - Propagate, don’t mask.
Return
E_RES,E_IRQ, orE_SYSas appropriate. Do not always returnE_OK.
How to test
# Build with warnings:
% cc -Wall -Wextra -o error_lab error_handling_lab.c
# All OK:
% ./error_lab
attach() returned 0
# Force resource failure:
% ./error_lab 1
[mydev0] resource allocation failed (err=-1)
attach() returned 0 # <-- This should become -1 after your fixes
# Force IRQ failure:
% ./error_lab 0 1
[mydev0] irq setup failed (err=-2)
attach() returned 0 # <-- Should become -2
# Force sysctl failure:
% ./error_lab 0 0 1
[mydev0] sysctl registration failed (err=-3)
attach() returned 0 # <-- Should become -3
Your goal is to make each failure path:
- Log once,
- Unwind correctly,
- Return the right code (and thus exit non-zero).
Your fixed version should log once per failure, unwind correctly, and return a non-zero code.
Reference solution
#include <stdio.h>
#include <stdlib.h>
/* Simulated errno-style values */
#define E_OK 0
#define E_RES -1
#define E_IRQ -2
#define E_SYS -3
struct softc {
int res_ready;
int irq_ready;
int sysctl_ready;
const char *name;
};
static int
alloc_resources(struct softc *sc, int should_fail)
{
(void)sc;
return (should_fail ? E_RES : E_OK);
}
static int
setup_irq(struct softc *sc, int should_fail)
{
(void)sc;
return (should_fail ? E_IRQ : E_OK);
}
static int
register_sysctl(struct softc *sc, int should_fail)
{
(void)sc;
return (should_fail ? E_SYS : E_OK);
}
static void
teardown_sysctl(struct softc *sc)
{
(void)sc;
}
static void
teardown_irq(struct softc *sc)
{
(void)sc;
}
static void
release_resources(struct softc *sc)
{
(void)sc;
}
static void
dev_log(const struct softc *sc, const char *msg, int err)
{
printf([%s] %s (err=%d)\n, sc->name ? sc->name : noname, msg, err);
}
/*
* Clean attach path:
* - check, log, unwind, propagate
* - set readiness flags only after confirmed success
*/
static int
mydev_attach(struct softc *sc, int fail_res, int fail_irq, int fail_sys)
{
int err;
err = alloc_resources(sc, fail_res);
if (err != E_OK) {
dev_log(sc, resource allocation failed, err);
return (err);
}
sc->res_ready = 1;
err = setup_irq(sc, fail_irq);
if (err != E_OK) {
dev_log(sc, irq setup failed, err);
goto fail_irq;
}
sc->irq_ready = 1;
err = register_sysctl(sc, fail_sys);
if (err != E_OK) {
dev_log(sc, sysctl registration failed, err);
goto fail_sysctl;
}
sc->sysctl_ready = 1;
return (E_OK);
fail_sysctl:
if (sc->irq_ready) {
teardown_irq(sc);
sc->irq_ready = 0;
}
fail_irq:
if (sc->res_ready) {
release_resources(sc);
sc->res_ready = 0;
}
return (err);
}
int
main(int argc, char **argv)
{
struct softc sc = {0};
int fail_res = 0, fail_irq = 0, fail_sys = 0;
int rc;
sc.name = mydev0;
if (argc >= 2)
fail_res = atoi(argv[1]);
if (argc >= 3)
fail_irq = atoi(argv[2]);
if (argc >= 4)
fail_sys = atoi(argv[3]);
rc = mydev_attach(&sc, fail_res, fail_irq, fail_sys);
printf(attach() returned %d\n, rc);
return (rc == E_OK ? 0 : 1);
}
Checklist
- Every call is checked immediately.
- Exactly one concise log per failure, with device context.
- Cleanup runs in reverse order and clears readiness flags.
- The returned code is specific to the failure site.
- No clever chaining. The flow is readable at a glance.
Stretch goals
- Add an idempotent
detach()that succeeds from any partial state. - Replace the custom error codes with real
errnovalues and messages. - Add a small loop in
mainthat runs all failure permutations and asserts the returned code.
const Signals Intent
In C, the keyword const marks data as read-only. This is not just a technicality: it is a way of signalling intent. When you declare something const, you are telling both the compiler and other developers: “this value should not change.”
The compiler enforces that promise. If you try to modify a const value, the code will fail to compile. This protects you against accidental changes and makes your code more self-documenting.
Example: Read-only data
static const char driver_name[] = mydev;
static int
mydev_print_name(device_t dev)
{
device_printf(dev, %s\n, driver_name);
return (0);
}
Here driver_name is a string that never changes. Declaring it as const makes that explicit and prevents bugs where someone might accidentally try to overwrite it.
Example: Input-only parameters
When a function receives data that it only needs to read, declare the parameter const.
Bad (reader has to wonder if buffer will be modified):
int
checksum(unsigned char *buffer, size_t len);
Better (clear to both reader and compiler):
int
checksum(const unsigned char *buffer, size_t len);
Now the caller knows they can safely pass persistent data, like a string literal or kernel structure, without worrying that the function might alter it.
Why this matters in drivers
In kernel code, data often belongs to other subsystems, the bus, or user space. Accidentally modifying such data can cause subtle corruption that is hard to trace. Using const:
- Documents whose values are read-only.
- Prevents accidental writes at compile time.
- Makes interfaces clearer for future maintainers.
- Signal safety: You are promising not to alter the caller’s data.
Pro Tip
Do not overuse const in local variables that are obviously not reassigned. The real value comes from:
- Function parameters that should be read-only.
- Global or static data that never changes.
Used this way, const becomes a tool for clarity and safety, not just a keyword.
Mini Lab: Making Interfaces Safer with const
Goal
Identify where const should be used in function parameters and global data, and then fix the code so the compiler protects you against accidental writes.
Starting file: const_lab.c
#include <stdio.h>
#include <string.h>
/* A driver name that should never change */
char driver_name[] = mydev;
/* Computes a checksum of a buffer */
int
checksum(unsigned char *buffer, size_t len)
{
int sum = 0;
for (size_t i = 0; i < len; i++)
sum += buffer[i];
/* Oops: accidental modification */
buffer[0] = 0;
return (sum);
}
int
main(void)
{
unsigned char data[] = {1, 2, 3, 4, 5};
printf(Driver: %s\n, driver_name);
printf(Checksum: %d\n, checksum(data, sizeof(data)));
return (0);
}
Tasks
- Global constant: Make
driver_namea read-only global. - Input-only parameter: The
checksum()function should not modify its input buffer. Addconstto its parameter. - Remove accidental modification: Delete or comment out the line
buffer[0] = 0;. - Recompile with warnings enabled:
% cc -Wall -Wextra -o const_lab const_lab.c
% ./const_lab
Fixed version (one possible solution):
#include <stdio.h>
#include <string.h>
/* A driver name that will never change */
static const char driver_name[] = mydev;
/* Computes a checksum of a buffer without modifying it */
int
checksum(const unsigned char *buffer, size_t len)
{
int sum = 0;
for (size_t i = 0; i < len; i++)
sum += buffer[i];
return (sum);
}
int
main(void)
{
unsigned char data[] = {1, 2, 3, 4, 5};
printf(Driver: %s\n, driver_name);
printf(Checksum: %d\n, checksum(data, sizeof(data)));
return (0);
}
What you learned
- Declaring global strings, such as
driver_name, asconstensures they cannot be overwritten. - Adding
constto input-only function parameters (likebuffer) documents intent and prevents accidental modification. - The compiler now enforces safety, if you try to write to
bufferagain, the build will fail.
This lab builds a muscle memory habit: whenever you write a function, ask yourself “Does this parameter need to be modified?” If the answer is no, mark it const.
Why this matters for driver development
When you pass buffers, tables, or strings around in kernel code, const-correctness helps reviewers and maintainers know what can and cannot be modified. This reduces subtle bugs and prevents accidental corruption of read-only kernel data (like device IDs or configuration strings).
Common Beginner Pitfalls
C gives you a lot of freedom, and with that freedom comes room for mistakes. In user programs, these mistakes might just crash your own process. In kernel code, the same mistakes can bring down the whole system. That is why FreeBSD developers learn early to recognise and avoid these classic traps.
These pitfalls are not tied to any one feature of C, but they appear so often in real kernel code that they deserve a place in your mental checklist.
Uninitialised Values
Never assume a variable “starts out” with a useful value. If you use it before assigning one, the contents are unpredictable. In kernel space, that can mean reading garbage memory or corrupting the state. Always initialize variables explicitly, even locals.
int count = 0; /* safe */
Off-by-One Errors
Loops that run one step too far are a constant source of bugs. The simplest way to avoid them is to use a loop condition based on < count rather than <= last_index.
for (int i = 0; i < size; i++) {
/* safe */
}
Using <= here would step past the end of the array and corrupt memory.
Assignment in Conditions
The line if (x = 2) compiles, but it assigns instead of comparing. This mistake is so common that reviewers actively scan for it. Always use == for comparison, and keep your conditions simple.
if (x == 2) {
/* correct */
}
Unsigned vs Signed
Mixing signed and unsigned types can produce surprising results. A negative signed value compared with an unsigned is promoted to a large positive number. Be deliberate about your types, and cast explicitly when needed.
int i = -1;
unsigned int u = 1;
if (i < u) /* false: i promoted to large unsigned */
printf(unexpected!\n);
Macros with Side Effects
Macros are dangerous when they evaluate arguments more than once. If the argument has a side effect, such as i++, it may run multiple times. Instead of complex macros, prefer static inline functions, which are safer and more transparent.
/* Dangerous macro */
#define DOUBLE(x) ((x) + (x))
/* Safe replacement */
static inline int
double_int(int x)
{
return (x + x);
}
Why These Matter for FreeBSD Drivers
These mistakes are easy to make and difficult to debug. They often produce symptoms far away from the actual bug. By training yourself to spot them early, you save yourself hours of chasing crashes and reviewers’ time pointing out obvious issues.
In the next lab, you will practice finding and fixing such problems in real code.
Mini Lab: Spot the Pitfalls
Goal This exercise will help you practise recognising beginner mistakes that can sneak into kernel code. You will read a short program, identify pitfalls, and then fix them.
Starting file: pitfalls_lab.c
#include <stdio.h>
#include <string.h>
#define DOUBLE(x) (x + x)
int
main(void)
{
int count;
unsigned int limit = 5;
char name[4];
int i;
int result = 0;
/* Uninitialised variable used */
if (count > 0)
printf(count is positive\n);
/* Buffer overflow and missing terminator */
strcpy(name, FreeBSD);
/* Off-by-one loop */
for (i = 0; i <= limit; i++)
result = result + i;
/* Assignment in condition */
if (result = 10)
printf(result is 10\n);
/* Macro with side effects */
printf(double of i++ is %d\n, DOUBLE(i++));
return (0);
}
Tasks
- Read the program carefully. Five pitfalls are hiding inside it. Write them down before making changes.
- Correct each issue so the program is safe and predictable:
- Initialize variables properly.
- Fix the buffer size and string handling.
- Correct the loop bounds.
- Replace assignment with comparison.
- Replace the macro with a safer alternative.
- Recompile and run the corrected version.
Hints for beginners
- Does
counthave a value before theifstatement? - How many characters does
FreeBSDactually need in memory? - Will the loop run the correct number of times?
- What happens in
if (result = 10)? - How many times does
i++get executed insideDOUBLE(i++)?
Fixed solution: pitfalls_lab_fixed.c
#include <stdio.h>
#include <string.h>
/*
* Prefer a static inline function to avoid macro side effects.
* The argument is evaluated exactly once.
*/
static inline int
double_int(int x)
{
return (x + x);
}
int
main(void)
{
int count; /* will be initialised before use */
unsigned int limit; /* keep types consistent in the loop */
char name[8]; /* FreeBSD (7) + '\0' (1) = 8 */
int i;
int result;
/* Initialise variables explicitly. */
count = 0;
limit = 5;
result = 0;
/* Safe: count has a defined value before use. */
if (count > 0)
printf(count is positive\n);
/* Safe copy: buffer is large enough for FreeBSD + terminator. */
strcpy(name, FreeBSD);
/*
* Off-by-one fix: iterate while i < limit.
* Also avoid signed/unsigned surprises by comparing with the same type.
*/
for (i = 0; (unsigned int)i < limit; i++)
result = result + i;
/* Comparison, not assignment. */
if (result == 10)
printf(result is 10\n);
else
printf(result is %d\n, result);
/*
* Avoid side effects in arguments. Evaluate once, then pass the value.
* If you really need to use i++, prefer doing it on a separate line.
*/
{
int doubled = double_int(i);
printf(double of i is %d (i was %d)\n, doubled, i);
i++; /* advance i explicitly if needed later */
}
return (0);
}
What you fixed and why
countis initialised before use to remove undefined behaviour.nameis now large enough to holdFreeBSDplus the terminating'\0'.- The loop condition uses
< limitinstead of<= limit, avoiding an off-by-one error. - The condition uses
==for comparison instead of=for assignment. - The macro was replaced by a
static inlinefunction, preventing double evaluation of arguments likei++.
Why this matters for driver development Every one of these issues has real consequences in kernel space:
- An uninitialised variable can lead to reading random memory.
- A buffer overflow can corrupt kernel data and crash the system.
- An off-by-one bug can break data structures or leak information.
- Assignment in a condition can silently misdirect control flow.
- A macro with side effects can run hardware operations more times than expected.
Spotting and correcting these mistakes early will save you hours of debugging and prevent dangerous bugs from ever reaching your drivers.
Keep Whitespace and Braces Consistent
You saw earlier that whitespace mistakes can cause subtle bugs and messy diffs. Here is how FreeBSD’s KNF rules solve that in a consistent way.
Whitespace may look unimportant to a beginner, but in a large project like FreeBSD it makes the difference between clean, easy-to-review code and confusing, error-prone patches. The FreeBSD kernel follows KNF (Kernel Normal Form), which sets expectations for indentation, spacing, and brace placement.
Indentation
- Use tabs for indentation. This keeps code consistent across editors and makes diffs smaller.
- Use spaces only for alignment, such as lining up parameters or comments.
Bad (mixed spaces and tabs, hard to read in diffs):
if(error!=0){printf(failed\n);}
Good (KNF indentation and spacing):
if (error != 0) {
printf(failed\n);
}
Braces
- For function definitions and multi-line statements, put the opening brace on its own line.
- For single-line statements, braces are optional, but if you use them, stay consistent with the file’s style.
Bad (brace style inconsistent, all crammed on one line):
int main(){printf(hello\n);}
Good (KNF style, brace on its own line):
int
main(void)
{
printf(hello\n);
}
Trailing Whitespace
Trailing spaces at the end of a line do not change behaviour, but they pollute version control history. A file with trailing whitespace will show unnecessary changes in diffs, making real logic changes harder to review. Many editors can be configured to automatically strip trailing spaces, turn that on early.
Why This Matters for FreeBSD Drivers
Consistent whitespace and brace style may feel like cosmetic details, but in a codebase maintained for decades they are part of correctness. They:
- Make diffs smaller and cleaner.
- Help reviewers focus on logic instead of formatting.
- Keep your code indistinguishable from existing kernel code.
FreeBSD code should look like it was written by one hand, even though it has thousands of contributors. Following KNF style is how you make your code feel native to the tree.
Clarity Beats Cleverness
It is tempting to show off with one-liners or tricky expressions. Resist that temptation. Clever code is harder to review and even harder to debug at 3 a.m.
/* Hard to read */
x = y++ + ++y;
/* Easier to reason about */
int before = y;
y++;
int after = y;
x = before + after;
Kernel code is expected to be boring. That is not a weakness; it is a strength. When dozens of people will read, maintain, and extend your code, simplicity is the cleverest thing you can do.
FreeBSD Style, Linters, and Editor Helpers
You are not expected to memorise every KNF (Kernel Normal Form) rule before writing your first driver. Instead, FreeBSD provides tools and helpers that guide you into the right style.
- Read the style guide. The rules are documented in
style(9). It covers indentation, braces, whitespace, comments, and more. - Run the style checker. Use
tools/build/checkstyle9.plfrom the source tree to catch violations before you submit code. - Use editor support. FreeBSD ships Vim and Emacs helpers (
tools/tools/editing/freebsd.vimandtools/tools/editing/freebsd.el) that enforce KNF indentation as you type.
Tip: Always review your patch with git diff (or diff) before committing. Smaller, clean diffs are easier for reviewers to read and quicker to accept.
Consistent style is not merely decoration; it is an integral part of correctness. If reviewers have to fight with your indentation or naming, they cannot focus on the actual driver logic.
Hands-On Lab 1: Make It KNF-Clean
Goal Take a tiny but messy file and make it KNF-clean. You will practise indentation with tabs, brace placement, line breaks, and a couple of small logic fixes that beginners often miss.
Starting file: knf_demo.c
#include <stdio.h>
int main(){int x=0;for(int i=0;i<=10;i++){x=x+i;} if (x= 56){printf(sum is 56\n);}else{printf(sum is %d\n,x);} }
Steps
-
Run the style checker from your source tree:
% cd /usr/src % tools/build/checkstyle9.pl /path/to/knf_demo.cLeave the terminal open so you can re-run it after fixes.
-
Re-shape the code
- Put the return type on its own line.
- Put the function name and arguments on the next line.
- Place braces on their own lines.
- Use tabs for indentation; use spaces only for alignment.
- Keep lines to a sensible width.
-
Fix the two classic bugs
- Change
<= 10to< 10in the loop. - Change
if (x = 56)toif (x == 56).
- Change
-
Re-run the checker until the warnings make sense and your file looks clean.
-
Build and run
% cc -Wall -Wextra -o knf_demo knf_demo.c % ./knf_demo
What good looks like Readable at a glance, with no style noise hiding the logic. Your diff should be mostly whitespace and line-break changes plus the two tiny logic fixes.
Why this matters for drivers Clean structure helps reviewers ignore formatting and focus on correctness. That means faster reviews and fewer style round-trips.
Real-Code Tour: KNF in the FreeBSD 14.3 Tree
Reading real code builds instincts faster than any checklist. Instead of pasting long listings here, you will open a few core files in your own tree and look for style patterns that appear again and again.
Before you start
- Ensure your 14.3 source is at
/usr/src. - Use an editor configured for tabs at width 8 and “show invisibles” enabled.
Open one of these widely used files
sys/kern/subr_bus.csys/dev/uart/uart_core.c
What to look for
- Function signatures split across lines Return type on its own line, name and arguments on the next line, opening brace on a line by itself.
- Tabs for indentation, spaces for alignment Indent blocks with tabs. Use spaces only to line up continued arguments or comments.
- Paragraph comments Multi-line comments that explain decisions or constraints rather than narrating each line.
- Early returns on error Short checks that fail fast and keep the “happy path” easy to read.
- Small helpers Long tasks broken into short static functions with focused names.
Your notes
- Write down three examples of KNF you can copy into your own code.
- Write down one habit you want to adopt immediately in your next lab.
Tip: keep the style(9) man page open while you browse. When in doubt, compare what you see with the rule.
Why this works You learn style from the tree itself. When you later submit code, reviewers will recognize these patterns and review your logic rather than your whitespace.
Hands-On Lab 2: Style Hunt in the Kernel Tree
Goal Train your eye to spot KNF style and the small structural habits that make FreeBSD kernel code easy to read and maintain. You will practice reading real driver code and then apply what you see to your own exercises.
Steps
-
Locate real attach/probe functions These are the standard entry points where drivers claim devices. They are short enough to study yet contain common patterns. For example, in the UART driver:
% egrep -nH 'static int .*probe|static int .*attach' /usr/src/sys/dev/uart/*.cPick one of the files listed.
-
Study the code carefully Open the function in your editor. As you read, ask yourself:
- Indentation: Do you see tabs for indentation and spaces only for alignment?
- Braces: Are braces placed on their own lines?
- Early returns: Where does the function bail out on errors?
- Helpers: Is long logic broken into small static helpers so
attach()stays focused? - Comments: Do comments explain why decisions are made, not what each line does?
Take notes on at least three patterns you want to copy.
-
Apply what you saw Pick one of your earlier C exercises (a lab function from this chapter works fine). Rewrite that function to match the style you observed:
- Re-indent using KNF.
- Add early returns for error cases.
- Factor out any long logic into a small helper.
- Add a concise comment for intent, not narration.
Example: Refactoring with Kernel Style
Suppose you wrote this earlier in the chapter:
int process_values(int *arr, int n) { int i, total = 0; if (arr == NULL) { printf(bad array\n); return -1; } for (i = 0; i <= n; i++) { /* off-by-one */ total += arr[i]; } if (total > 1000) { printf(too big!\n); } else { printf(ok\n); } return total; }This works, but it mixes concerns, hides the happy path, and uses style inconsistent with KNF.
After refactoring with KNF style and + kernel habits:
static int validate_array(const int *arr, int n) { if (arr == NULL || n <= 0) return (EINVAL); return (0); } static int sum_array(const int *arr, int n) { int total, i; total = 0; for (i = 0; i < n; i++) /* clear bound */ total += arr[i]; return (total); } int process_values(int *arr, int n) { int err, total; err = validate_array(arr, n); if (err != 0) { printf(invalid input: %d\n, err); return (err); } total = sum_array(arr, n); if (total > 1000) printf(too big!\n); else printf(ok\n); return (total); }What changed and why:
- KNF formatting: return type on its own line, braces on separate lines, tabs for indentation.
- Early return: invalid input is rejected immediately, keeping the happy path clear.
- Small helpers:
validate_array()andsum_array()split responsibilities, keepingprocess_values()short and readable. - Clear loop bound:
i < nprevents the off-by-one. - Consistent return values:
EINVALinstead of a magic-1.
How this mirrors kernel code
When you look at real probe/attach functions in the FreeBSD tree, you’ll notice the same patterns:
- Early exits on failure.
- Helpers to keep top-level functions focused.
- KNF formatting everywhere.
- Clear error codes instead of magic values.
-
Optional check If you want to validate your formatting, run the style checker:
% /usr/src/tools/build/checkstyle9.pl /path/to/your_file.cFix warnings until the file looks clean.
What you should learn
- Style communicates intent. Indentation, naming, and comments all carry meaning for readers.
- Error-first returns keep the “happy path” easy to see.
- Helpers make attach paths short, testable, and reviewable.
- Consistency with the tree keeps diffs small and reviews focused on behaviour, not formatting.
Hands-On Lab 3: Refactor and Harden an Attach Path
Goal Take a monolithic attach function and refactor it into small helpers with clear error handling and tidy cleanup. You will practice early returns, consistent logging, and a single exit path for error unwinding.
Starting file: attach_refactor.c
#include <stdio.h>
/* This is user-space scaffolding to practise structure and flow only. */
struct softc {
int res_ok;
int irq_ok;
int sysctl_ok;
};
static int
my_attach(struct softc *sc, int enable_extra)
{
int err = 0;
/* allocate resources */
sc->res_ok = 1;
/* setup irq */
if (sc->res_ok) {
if (enable_extra) {
/* optional path can fail */
if (0) { /* pretend failure sometimes */
printf(optional setup failed\n);
return -2;
}
}
sc->irq_ok = 1;
} else {
printf(resource alloc failed\n);
return -1;
}
/* register sysctl */
if (sc->irq_ok) {
sc->sysctl_ok = 1;
} else {
printf(irq setup failed\n);
return -3;
}
/* more steps might appear here later */
return err;
}
int
main(void)
{
struct softc sc = {0};
int r;
r = my_attach(&sc, 1);
printf(attach returned %d\n, r);
return 0;
}
Tasks
- Split
my_attach()into three helpers:setup_core(),setup_optional(), andsetup_sysctl(). - Add a single cleanup section that tears down in the reverse order of setup.
- Replace
printfwith a tiny helperlog_dev(const char *msg, int err)so your log lines are consistent. - Keep behaviour the same. This lab is about structure and error paths.
Stretch goals
- Return distinct error codes per failure site and print them in one place.
- Add a boolean flag to simulate failure in each helper and confirm your cleanup runs in the right order.
What good looks like
- Three helpers that each do one thing.
- Early returns from helpers.
- One cleanup path in the caller that is easy to read top to bottom.
- Short log messages that include the error code.
Why this matters
Attach and detach code must be easy to audit. In real drivers you will add features over time. A tidy structure lets you extend behaviour without turning the function into a maze.
Hands-On Lab 4: Putting It All Together
You’ve practised style, naming, indentation, error handling, pitfalls, and refactoring in isolation. Now it’s time to bring them all together. In this lab you will be given a deliberately messy driver-like scaffold. It compiles, but it contains poor style, bad naming, unchecked errors, misleading comments, and several pitfalls. Your task is to clean it up into FreeBSD-quality code.
Starting File: driver_skeleton.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define DOUB(X) X+X
/* this function tries to init device, but its messy */
int initDev(int D, char *nm){
int i=0;char buf[5];int ret;
if(D==0){printf(bad dev\n);return -1;}
strcpy(buf,nm);
for(i=0;i<=D;i++){ret=DOUB(i);}
/* check if ret bigger then 10 */
if(ret=10){printf(ok\n);}
else{printf(fail\n);}
return ret;}
/* main func */
int main(int argc,char **argv){int dev;char*name;
dev=atoi(argv[1]);name=argv[2];
int r=initDev(dev,name);
printf(r=%d\n,r);}
Tasks
- Indentation and formatting
- Reshape the file to KNF style: return type on its own line, braces on separate lines, tabs for indentation.
- Naming
- Rename
initDev()and variables likeD,nm,retto descriptive names that fit kernel conventions.
- Rename
- Comments
- Remove misleading or useless comments. Add new comments explaining why certain checks exist.
- Pitfalls to fix
- Uninitialised variables (
ret). - Buffer overflow risk (
buf[5]withstrcpy). - Off-by-one loop (
<=). - Assignment in condition (
if (ret = 10)). - Macro with side effects (
DOUB).
- Uninitialised variables (
- Error handling
- Add proper checks for missing arguments in
main(). - Use consistent error messages.
- Add proper checks for missing arguments in
- Intent
- Use
constwhere appropriate.
- Use
- Structure
- Split the device initialisation logic into smaller helper functions.
- Keep each function short and focused.
Stretch Goal
- Add a cleanup path that frees memory or resets state if something fails.
- Replace
printf()with a mockdevice_log()that prepends the device name to all messages.
What Good Looks Like
Your final file should:
- Compile cleanly with
-Wall -Wextra. - Be KNF-formatted, with descriptive names.
- Use safe string handling (
strlcpyor checkedsnprintf). - Use
==for comparisons. - Replace the
DOUBmacro with astatic inlinefunction. - Handle errors gracefully and consistently.
- Use
constfor read-only strings. - Split
initDev()into smaller helpers likecheck_device(),copy_name(),compute_value().
Why This Matters for FreeBSD Drivers
Drivers live a long time, and many hands will touch them. A few disciplined habits make that sustainable:
- They reduce review friction, so feedback is about behaviour, not braces.
- They lower subtle bugs, because structure makes errors stand out.
- They keep diffs small, which helps audits and backports.
- They make the tree consistent, so newcomers can learn by reading.
The FreeBSD style rules and tools exist to help you reach this standard from day one. Adopt them now, and every chapter that follows will be easier for you and for your reviewers.
Wrapping Up
In this section you saw that writing C code for FreeBSD is not just about getting something to compile. It is about habits that make your code clear, predictable, and trustworthy inside the kernel. You learned how consistent indentation, meaningful names, and purposeful comments make your code easier to read. You saw how keeping functions short, returning early on errors, and using const express your intent clearly. You learned why small details such as whitespace, brace placement, and trailing spaces matter in a large collaborative project.
We also looked at common pitfalls that beginners fall into uninitialized variables, off-by-one loops, accidental assignments, and dangerous macros and saw how to avoid them. And we reinforced the lesson that boring, straightforward code is the safest and most respected kind of code in FreeBSD. Tools like style(9) and the checkstyle9.pl script, along with studying real code in the tree, give you the practical support you need to write code that blends seamlessly with the rest of the system.
The next step is to consolidate these skills. In the following section you will work through a final set of hands-on labs that combine all the practices you have learned in this chapter. After that, a short recap quiz will help you check whether you remember the key ideas and can apply them on your own.
By finishing these exercises, you will be ready to move beyond C basics and start applying your knowledge directly to driver development in FreeBSD.
Style Hunt Worksheet: From Observation to Practice
To make your learning stick, use the following worksheet as a guided logbook while studying real FreeBSD drivers. It helps you capture patterns you see, record before/after refactors of your own code, and build a portfolio of examples you can return to later.
This worksheet helps you:
- Capture three concrete style patterns you copied from the kernel.
- Save a before-and-after diff of one of your own functions that you refactored.
- Build a reference you can return to later, when you start writing real drivers.
By writing down what you observed and how you applied it, you turn reading into practice and practice into habits.
Style Hunt Worksheet
File under review: ______________________________
Function name: _____________________________ (probe / attach)
Kernel version/source path: ______________________________
Date: __________________
1) First impression (30-60 seconds)
[ ] Function feels short and readable
[ ] Happy path is easy to follow
[ ] Error cases exit early
- Notes:
__________________________________________________________________
2) KNF quick checks (format & whitespace)
- Indentation uses tabs (spaces only for alignment): [ ] Yes / [ ] No
Evidence (line nums):
_____________________________ - Function signature split across lines (return type on its own line): [ ] Yes / [ ] No
Evidence:
_______________________________________ - Braces on their own lines for function/multi-line blocks: [ ] Yes / [ ] No
Evidence:
_______________________________________ - Line width reasonable (no ultra-long lines): [ ] Yes / [ ] No
- Trailing whitespace avoided: [ ] Yes / [ ] No
Copy this pattern: _____________________________________________________________
3) Naming & comments (clarity over cleverness)
- Names express intent (e.g.,
uart_attach,alloc_irq,setup_sysctl): [ ] Yes / [ ] No Examples:____________________ - Comments explain why (decisions/assumptions), not what: [ ] Yes / [ ] No
Example lines:
__________
One naming idea to copy: _________________________________________
One comment pattern to copy: _______________________________________
4) Error handling (check → log once → unwind → return)
- Each call checked immediately: [ ] Yes / [ ] No (examples:
__________) - Short, contextual log at failure site (not spammed up the stack): [ ] Yes / [ ] No
- Cleanup/unwind in reverse order of setup: [ ] Yes / [ ] No
- Returns specific errno (e.g.,
ENXIO,EINVAL), not magic values: [ ] Yes / [ ] No
One clean failure path I liked (lines): __________
What made it clear: _____________________________________________
5) Small helpers & structure
- Top-level function delegates to small static helpers: [ ] Yes / [ ] No
Helper names:
_____________________________________________ - Early returns keep the happy path straight: [ ] Yes / [ ] No
- Shared teardown consolidated (labels or helpers): [ ] Yes / [ ] No
Helper I’ll emulate: _____________________________________________
6) Intent & const-correctness
- Read-only data/params marked
const: [ ] Yes / [ ] No (examples:__________) - No “clever” macros with side effects (prefers
static inline): [ ] Yes / [ ] No
One const usage I’ll adopt: _________________________________________
7) Three patterns I will copy (be specific)
__________________________________________________________From lines:____________________________________________________________________From lines:____________________________________________________________________From lines:__________
8) Red flags I noticed (optional)
[ ] Mixed tabs/spaces
[ ] Deep nesting
[ ] Long function (> ~60 lines)
[ ] Reused err hides failure site
[ ] Vague names / comments narrate code
Notes: __________________________________________________________
9) Apply it: micro-refactor plan for my code
Target function/file: ___________________________________________
-
Split into helpers:
___________________________________________ -
Add early returns at:
___________________________________________ -
Improve naming/comments:
________________________________________ -
Make params
constwhere:_______________________________________ -
Centralise cleanup (reverse order):
_______________________________ -
Run checker:
/usr/src/tools/build/checkstyle9.pl /path/to/my_file.c
Before/After diff saved as: __________________________
10) One-line takeaway (for your portfolio)
“______________________________________________________________”
How to use this worksheet
- Print or duplicate it for each driver you study.
- Capture evidence (line numbers/snippets) so you can revisit patterns later.
- Immediately transform one of your own functions using the items in Section 9.
- Keep the before/after diff with this worksheet, it’s proof of learning and a handy reference when you start writing real drivers.
Final Practice Labs
You have now met the essential C tools you will use over and over again when writing FreeBSD device drivers: operators and expressions, the preprocessor, arrays and strings, function pointers and typedefs, and pointer arithmetic.
Before we move on to kernel-specific topics in later chapters, it is time to practise these ideas in small, realistic exercises. The five labs below are designed to reinforce safe habits and mental models you will rely on in kernel code. Work through them in order.
Each lab includes clear steps, comments, and brief reflections to help you check your understanding and solidify the learning.
Lab 1 (Easy): Bit Flags and Enums - “Device State Inspector”
Goal Use bitwise operators to manage a device-style state. Practise set, clear, toggle, and test, then print a readable summary.
Files
flags.h, flags.c, main.c
flags.h
#ifndef DEV_FLAGS_H
#define DEV_FLAGS_H
#include <stdio.h>
#include <stdint.h>
/*
* Bit flags emulate how drivers track small on/off facts:
* enabled, open, error, TX busy, RX ready. Each flag is a
* single bit inside an unsigned integer.
*/
enum dev_flags {
DF_ENABLED = 1u << 0, // 00001
DF_OPEN = 1u << 1, // 00010
DF_ERROR = 1u << 2, // 00100
DF_TX_BUSY = 1u << 3, // 01000
DF_RX_READY = 1u << 4 // 10000
};
/* Small helpers keep bit logic readable and portable. */
static inline void set_flag(uint32_t *st, uint32_t f) { *st |= f; }
static inline void clear_flag(uint32_t *st, uint32_t f) { *st &= ~f; }
static inline void toggle_flag(uint32_t *st, uint32_t f){ *st ^= f; }
static inline int test_flag(uint32_t st, uint32_t f) { return (st & f) != 0; }
/* Summary printer lives in flags.c to keep main.c tidy. */
void print_state(uint32_t st);
#endif
flags.c
#include flags.h
/*
* Print both the hex value and a friendly list of which flags are set.
* This mirrors how driver debug output often looks in practice.
*/
void
print_state(uint32_t st)
{
printf(State: 0x%08x [, st);
int first = 1;
if (test_flag(st, DF_ENABLED)) { printf(%sENABLED, first?:, ); first=0; }
if (test_flag(st, DF_OPEN)) { printf(%sOPEN, first?:, ); first=0; }
if (test_flag(st, DF_ERROR)) { printf(%sERROR, first?:, ); first=0; }
if (test_flag(st, DF_TX_BUSY)) { printf(%sTX_BUSY, first?:, ); first=0; }
if (test_flag(st, DF_RX_READY)) { printf(%sRX_READY, first?:, ); first=0; }
if (first) printf(none); // No flags set
printf(]\n);
}
main.c
#include <string.h>
#include <stdlib.h>
#include flags.h
/* Map small words to flags to keep the CLI simple. */
static uint32_t flag_from_name(const char *s) {
if (!strcmp(s, enable)) return DF_ENABLED;
if (!strcmp(s, open)) return DF_OPEN;
if (!strcmp(s, error)) return DF_ERROR;
if (!strcmp(s, txbusy)) return DF_TX_BUSY;
if (!strcmp(s, rxready)) return DF_RX_READY;
return 0;
}
/*
* Usage example:
* ./devflags set enable set rxready toggle open
* Try different sequences and confirm your mental model of bits changing.
*/
int
main(int argc, char **argv)
{
uint32_t st = 0; // All flags cleared initially
for (int i = 1; i + 1 < argc; i += 2) {
const char *op = argv[i], *name = argv[i+1];
uint32_t f = flag_from_name(name);
if (f == 0) { printf(Unknown flag '%s'\n, name); return 64; }
if (!strcmp(op, set)) set_flag(&st, f);
else if (!strcmp(op, clear)) clear_flag(&st, f);
else if (!strcmp(op, toggle))toggle_flag(&st, f);
else { printf(Unknown op '%s'\n, op); return 64; }
}
print_state(st);
return 0;
}
Build and run
% cc -Wall -Wextra -o devflags main.c flags.c
% ./devflags set enable set rxready toggle open
Interpret your results You should see a hex state plus a human list. If you “toggle open” from zero, OPEN appears; toggle again and it disappears. Change the order of operations to predict the final state first, then run to confirm.
What you practised
- Bit masks as compact state
- Using
|,&,^,~correctly - Writing small helpers to make bit logic readable
Lab 2 (Easy → Medium): Preprocessor Hygiene - “Feature-Gated Logging”
Goal
Create a minimal logging API whose verbosity is controlled at compile time with a DEBUG macro, and demonstrate safe header usage.
Files
log.h, log.c, demo.c
log.h
#ifndef EB_LOG_H
#define EB_LOG_H
#include <stdio.h>
#include <stdarg.h>
/*
* When compiled with -DDEBUG we enable extra logs.
* This is common in kernel and driver code to keep fast paths quiet.
*/
#ifdef DEBUG
#define LOG_DEBUG(fmt, ...) eb_log(DEBUG, __FILE__, __LINE__, fmt, ##__VA_ARGS__)
#else
#define LOG_DEBUG(fmt, ...) (void)0 // Compiles away
#endif
#define LOG_INFO(fmt, ...) eb_log(INFO, __FILE__, __LINE__, fmt, ##__VA_ARGS__)
#define LOG_ERR(fmt, ...) eb_log(ERROR, __FILE__, __LINE__, fmt, ##__VA_ARGS__)
/* Single declaration for the underlying function. */
void eb_log(const char *lvl, const char *file, int line, const char *fmt, ...);
/*
* Example compile-time assumption check:
* The code should work on 32 or 64 bit. Fail early otherwise.
*/
_Static_assert(sizeof(void*) == 8 || sizeof(void*) == 4, Unsupported pointer size);
#endif
log.c
#include log.h
/* Simple printf-style logger that prefixes level, file, and line. */
void
eb_log(const char *lvl, const char *file, int line, const char *fmt, ...)
{
va_list ap;
fprintf(stderr, [%s] %s:%d: , lvl, file, line);
va_start(ap, fmt);
vfprintf(stderr, fmt, ap);
va_end(ap);
fputc('\n', stderr);
}
demo.c
#include log.h
int main(void) {
LOG_INFO(Starting);
LOG_DEBUG(Internal detail: x=%d, 42);
LOG_ERR(Something went wrong: %s, timeout);
return 0;
}
Build and run
# Quiet build (no DEBUG)
% cc -Wall -Wextra -o demo demo.c log.c
% ./demo
# Verbose build (with DEBUG)
% cc -Wall -Wextra -DDEBUG -o demo_dbg demo.c log.c
% ./demo_dbg
Interpret your results
Compare outputs. In the non-DEBUG build, LOG_DEBUG lines vanish entirely. This is a zero-cost switch controlled by compilation, not runtime flags.
What you practised
- Header guards and a tiny public API
- Conditional compilation with
#ifdef _Static_assertfor build-time safety
Lab 3 (Medium): Safe Strings and Arrays - “Bounded Device Names”
Goal Build device-style names safely into a caller-supplied buffer. Return explicit error codes rather than crashing or truncating silently.
Files
devname.h, devname.c, test_devname.c
devname.h
#ifndef DEVNAME_H
#define DEVNAME_H
#include <stddef.h>
/* Tiny errno-like set for clarity when reading results. */
enum dn_err { DN_OK = 0, DN_EINVAL = 22, DN_ERANGE = 34 };
/*
* Caller supplies dst and its size. We combine prefix+unit into dst.
* This mirrors common kernel patterns: pointer plus explicit length.
*/
int build_devname(char *dst, size_t dstsz, const char *prefix, int unit);
/* Quick validator: letters then at least one digit (e.g., ttyu0). */
int is_valid_devname(const char *s);
#endif
devname.c
#include <stdio.h>
#include <string.h>
#include <ctype.h>
#include devname.h
/* Safe build: check pointers and capacity; use snprintf to avoid overflow. */
int
build_devname(char *dst, size_t dstsz, const char *prefix, int unit)
{
if (!dst || !prefix || unit < 0) return DN_EINVAL;
int n = snprintf(dst, dstsz, %s%d, prefix, unit);
/* snprintf returns the number of chars it *wanted* to write.
* If that does not fit in dst, we report DN_ERANGE.
*/
return (n >= 0 && (size_t)n < dstsz) ? DN_OK : DN_ERANGE;
}
/* Very small validator for practice. Adjust rules if you want stricter checks. */
int
is_valid_devname(const char *s)
{
if (!s || !isalpha((unsigned char)s[0])) return 0;
int saw_digit = 0;
for (const char *p = s; *p; p++) {
if (isdigit((unsigned char)*p)) saw_digit = 1;
else if (!isalpha((unsigned char)*p)) return 0;
}
return saw_digit;
}
test_devname.c
#include <stdio.h>
#include devname.h
/*
* Table-driven tests: we try several cases and print both code and result.
* Practise reading return codes and not assuming success.
*/
struct case_ { const char *pref; int unit; size_t cap; };
int main(void) {
char buf[8]; // Small to force you to think about capacity
struct case_ cases[] = {
{ttyu, 0, sizeof buf}, // fits
{ttyv, 12, sizeof buf},// just fits or close
{, 1, sizeof buf}, // invalid prefix
};
for (unsigned i = 0; i < sizeof(cases)/sizeof(cases[0]); i++) {
int rc = build_devname(buf, cases[i].cap, cases[i].pref, cases[i].unit);
printf(case %u -> rc=%d, buf='%s'\n,
i, rc, (rc==DN_OK)?buf:<invalid>);
}
printf(valid? 'ttyu0'=%d, 'u0tty'=%d\n,
is_valid_devname(ttyu0), is_valid_devname(u0tty));
return 0;
}
Build and run
% cc -Wall -Wextra -o test_devname devname.c test_devname.c
% ./test_devname
Interpret your results
Notice which cases return DN_OK, which return DN_EINVAL, and which return DN_ERANGE. Confirm that the buffer never overflows and that invalid inputs are rejected. Try shrinking buf to char buf[6]; to force DN_ERANGE and observe the outcome.
What you practised
- Arrays decay to pointers, so you must always pass a size
- Using return codes rather than assuming success
- Designing tiny, testable APIs that are hard to misuse
Lab 4 (Medium → Hard): Function Pointer Dispatch - “Mini devsw”
Goal Model a device operations table with function pointers and switch between two implementations at runtime.
Files
ops.h, ops_console.c, ops_uart.c, main_ops.c, optional Makefile
ops.h
#ifndef OPS_H
#define OPS_H
#include <stdio.h>
/* typedefs keep pointer-to-function types readable. */
typedef int (*ops_init_t)(void);
typedef void (*ops_start_t)(void);
typedef void (*ops_stop_t)(void);
typedef const char* (*ops_status_t)(void);
/* A tiny vtable of operations for a device. */
typedef struct {
const char *name;
ops_init_t init;
ops_start_t start;
ops_stop_t stop;
ops_status_t status;
} dev_ops_t;
/* Two concrete implementations declared here, defined in .c files. */
extern const dev_ops_t console_ops;
extern const dev_ops_t uart_ops;
#endif
ops_console.c
#include ops.h
/* Console variant: pretend to manage a simple console device. */
static int inited;
static const char *state = stopped;
static int c_init(void){ inited = 1; return 0; }
static void c_start(void){ if (inited) state = console-running; }
static void c_stop(void){ state = stopped; }
static const char* c_status(void){ return state; }
const dev_ops_t console_ops = {
.name = console,
.init = c_init, .start = c_start, .stop = c_stop, .status = c_status
};
ops_uart.c
#include ops.h
/* UART variant: separate state to show independence between backends. */
static int ready;
static const char *state = idle;
static int u_init(void){ ready = 1; return 0; }
static void u_start(void){ if (ready) state = uart-txrx; }
static void u_stop(void){ state = idle; }
static const char* u_status(void){ return state; }
const dev_ops_t uart_ops = {
.name = uart,
.init = u_init, .start = u_start, .stop = u_stop, .status = u_status
};
main_ops.c
#include <string.h>
#include ops.h
/* Choose one implementation based on argv[1]. */
static const dev_ops_t *pick(const char *name) {
if (!name) return NULL;
if (!strcmp(name, console)) return &console_ops;
if (!strcmp(name, uart)) return &uart_ops;
return NULL;
}
int main(int argc, char **argv) {
if (argc < 2) {
fprintf(stderr, usage: %s {console|uart}\n, argv[0]);
return 64;
}
const dev_ops_t *ops = pick(argv[1]);
if (!ops) { fprintf(stderr, unknown ops\n); return 64; }
/* The call sites do not care which backend we picked. */
if (ops->init() != 0) { fprintf(stderr, init failed\n); return 1; }
ops->start();
printf([%s] status: %s\n, ops->name, ops->status());
ops->stop();
printf([%s] status: %s\n, ops->name, ops->status());
return 0;
}
Build and run
% cc -Wall -Wextra -o mini_ops main_ops.c ops_console.c ops_uart.c
% ./mini_ops console
% ./mini_ops uart
Interpret your results
You should see different status strings for the two backends, with identical call sites in main_ops.c. That is the benefit of function pointer tables: the caller sees one interface and many interchangeable implementations.
What you practised
- Declaring and using function pointer types with
typedef - Grouping related operations in a struct
- Selecting implementations at runtime without
ifchains everywhere
Lab 5 (Hard): Fixed-Size Circular Buffer - “Single-Producer Single-Consumer Queue”
Goal Implement a small ring buffer for integers with wrap-around indexing and clear rules for empty/full conditions. No concurrency yet, just correct arithmetic and invariants.
Files
cbuf.h, cbuf.c, test_cbuf.c
cbuf.h
#ifndef CBUF_H
#define CBUF_H
#include <stddef.h>
#include <stdint.h>
/* Small error codes mirroring common patterns. */
enum cb_err { CB_OK=0, CB_EFULL=28, CB_EEMPTY=35 };
/*
* head points to the next write position.
* tail points to the next read position.
* We leave one slot empty so that head==tail means empty, and
* (head+1)%cap==tail means full. This keeps the logic simple.
*/
typedef struct {
size_t cap; // number of slots in 'data'
size_t head; // next write index
size_t tail; // next read index
int *data; // storage owned by the caller
} cbuf_t;
int cb_init(cbuf_t *cb, int *storage, size_t n);
int cb_push(cbuf_t *cb, int v);
int cb_pop(cbuf_t *cb, int *out);
int cb_is_empty(const cbuf_t *cb);
int cb_is_full(const cbuf_t *cb);
#endif
cbuf.c
#include cbuf.h
/* Prepare a buffer wrapper around caller-provided storage. */
int
cb_init(cbuf_t *cb, int *storage, size_t n)
{
if (!cb || !storage || n == 0) return 22; // EINVAL
cb->cap = n;
cb->head = cb->tail = 0;
cb->data = storage;
return CB_OK;
}
/* Empty when indices match. */
int cb_is_empty(const cbuf_t *cb){ return cb->head == cb->tail; }
/* Full when advancing head would collide with tail. */
int cb_is_full (const cbuf_t *cb){ return ((cb->head + 1) % cb->cap) == cb->tail; }
/* Write the value, then advance head with wrap-around. */
int
cb_push(cbuf_t *cb, int v)
{
if (cb_is_full(cb)) return CB_EFULL;
cb->data[cb->head] = v;
cb->head = (cb->head + 1) % cb->cap;
return CB_OK;
}
/* Read the value, then advance tail with wrap-around. */
int
cb_pop(cbuf_t *cb, int *out)
{
if (cb_is_empty(cb)) return CB_EEMPTY;
*out = cb->data[cb->tail];
cb->tail = (cb->tail + 1) % cb->cap;
return CB_OK;
}
test_cbuf.c
#include <stdio.h>
#include cbuf.h
/*
* We use a 4-slot store. With the leave one empty rule,
* the buffer holds at most 3 values at a time.
*/
int
main(void)
{
int store[4];
cbuf_t cb;
int v;
cb_init(&cb, store, 4);
/* Fill three slots; the fourth remains empty to signal full. */
for (int i=1;i<=3;i++) {
int rc = cb_push(&cb, i);
printf(push %d -> rc=%d\n, i, rc);
}
printf(full? %d (1 means yes)\n, cb_is_full(&cb));
/* Drain to empty. */
while (cb_pop(&cb, &v) == 0)
printf(pop -> %d\n, v);
/* Wrap-around scenario: push, push, pop, then push twice more. */
cb_push(&cb, 7); cb_push(&cb, 8);
cb_pop(&cb, &v); printf(pop -> %d\n, v); // frees one slot
cb_push(&cb, 9); cb_push(&cb,10); // should wrap indices
while (cb_pop(&cb, &v) == 0)
printf(pop -> %d\n, v);
return 0;
}
Build and run
% cc -Wall -Wextra -o test_cbuf cbuf.c test_cbuf.c
% ./test_cbuf
Interpret your results Confirm that it accepts three pushes, then reports full. After popping all items, it reports empty. In the wrap-around phase, check that values come out in the same order you pushed them. Print the indices during testing if you want to see head and tail moving.
What you practised
- Pointer arithmetic through indices with wrap-around
- Designing empty and full rules that avoid ambiguity
- Small, error-returning APIs ready for future concurrency
Wrapping Up
In this final set of labs you revisited, in practice, all the major tools that C gives you and that FreeBSD drivers rely on every day. You saw how bitwise operators transform into compact state machines, how the preprocessor can make or break build-time choices, how to handle arrays and strings safely with explicit lengths, how function pointer tables provide flexible device interfaces, and how pointer arithmetic underpins circular buffers and queues.
Each of these small programs mirrors a real kernel pattern. The logging system echoes how conditional tracing works in drivers. The bounded device name exercise resembles how drivers allocate and validate names for nodes under /dev. The ops table is a miniature of the cdevsw or bus_method structures you will soon work with. And the ring buffer is a simple model of the queues that drive network, USB, and storage devices.
By now, you should not only understand these C concepts in the abstract but also have typed them out, compiled them, and watched them work. That muscle memory will be critical when you are deep in kernel space and bugs are subtle.
Final Knowledge Check
You have now walked through a vast landscape: from your very first printf() all the way to FreeBSD-flavoured details like error codes, Makefiles, pointers, and kernel coding standards. Before we move on, it is time to pause and assess how much of this knowledge has been absorbed.
The following 60 questions are designed to make you reflect on what you learned. They are not here to intimidate you, quite the opposite: they are a mirror for you to see how much ground you have already covered. If you can answer most of them, even approximately, it means you have built a strong foundation for device driver programming.
Treat this as a self-assessment: keep a notebook open, write down your answers, and don’t be afraid to go back to the text or labs if you feel uncertain. Remember: mastery is built by revisiting, practising, and questioning.
Questions
- What makes C particularly suitable for operating system development compared to high-level languages?
- Why is it helpful that C compiles down to predictable machine instructions on FreeBSD?
- What role does the
main()function play in a C program, and how is this different inside the kernel? - Why does every C source file in FreeBSD begin with
#includelines? - What is the difference between declaring a variable and initialising it?
- Why might leaving a variable uninitialised be more dangerous in kernel code than in user programs?
- What happens if you assign a
charinto anintvariable? - Give one example of a kernel structure where you expect to see a
char[]rather than achar *. - What does the
%operator do, and why is it common in buffer management? - Why should kernel code avoid “clever” chained assignments such as
a = b = c = 0;? - How does operator precedence affect the expression
x + y << 2? - Why is
==not the same as=, and what bug appears if you confuse them? - What danger exists when omitting braces
{}inifstatements? - Why are
switchstatements often preferred over manyif/else ifchains in FreeBSD? - In loops, what is the functional difference between
breakandcontinue? - How does a
forloop make buffer iteration safer thanwhile(1)with manual updates? - Why should you mark parameters as
constwhere possible in driver helper functions? - What is an inline function, and why might it be better than a macro?
- How does returning error codes like
ENXIOhelp standardise driver behaviour? - What does it mean that function parameters are passed “by value” in C?
- Why does passing a pointer to a struct give the illusion of “by reference”?
- How does this behaviour influence the design of
probe()orattach()functions? - What could happen if you modify a pointer argument inside a function without informing the caller?
- Why must C strings be null-terminated, and what happens if they are not?
- In the kernel, why are fixed-length arrays sometimes safer than dynamically sized ones?
- How is buffer overflow in a kernel array different from buffer overflow in a user program?
- What mistake do beginners make when copying strings into fixed buffers?
- What does the
*operator do in the expression*p = 10;? - How does pointer arithmetic take into account the type of the pointer?
- Why should you be careful when comparing two unrelated pointers?
- What is the difference between a dangling pointer and a NULL pointer?
- Why does FreeBSD require allocation flags like
M_WAITOKorM_NOWAIT? - Why are structs heavily used in kernel drivers instead of sets of separate variables?
- Why can copying a struct containing pointers be dangerous?
- Why might you want to use
typedefwith a struct, and when should you avoid it? - How do structs in C mirror the design of kernel subsystems like
procorifnet? - What problem do header guards solve in multi-file projects?
- Why do FreeBSD headers often contain forward declarations (
struct foo;)? - Why is it bad practice to put function definitions directly into header files?
- How do modular
.cand.hfiles make teamwork easier in kernel development? - At which compilation stage would a typo in a function name be detected?
- What does the linker do with object files?
- Why is compiling with
-Wallrecommended for beginners? - How does using
kgdborlldbdiffer from debugging in userland? - How can
#definebe used to enable or disable debugging messages? - What is the risk of using macros with side effects like
#define DOUB(X) X+X? - Why do kernel headers rely on
#ifdef _KERNELsections? - What problem do
#pragma onceor header guards prevent? - How does conditional compilation let FreeBSD support many CPU architectures?
- Why is
volatilesometimes necessary when dealing with hardware registers? - Why is it important to always initialise local variables, even if you plan to overwrite them soon?
- How does using fixed-width integer types (
uint32_t) improve portability across architectures? - Why should you prefer descriptive variable names over single letters in kernel code?
- What coding pattern in loops helps prevent “off-by-one” errors?
- Why should every
malloc()in kernel space have a correspondingfree()in the unload path? - How can consistent indentation and braces improve long-term maintainability of drivers?
- Why is it recommended to avoid “magic numbers” and instead use named constants or macros?
- How can reading similar drivers in FreeBSD source guide you toward better style?
- Why is it important to check the return value of every system or library call in kernel code?
- How does following KNF (Kernel Normal Form) style make your contributions easier to review and accept?
Wrapping Up
You’ve reached the end of Chapter 4, and that is no small achievement. In this single chapter, you have gone from zero C knowledge to handling the same building blocks used daily in the FreeBSD kernel. Along the way, you wrote real programs, explored actual kernel code, and faced questions that sharpened both your memory and your reasoning.
The key takeaway is this: **C is not just a language; it is the medium through which the kernel communicates. ** Every pointer you follow, every struct you design, and every header you include brings you closer to understanding how FreeBSD itself works inside.
As we close, remember that learning C is not about memorizing syntax. It is about cultivating precision, discipline, and a sense of curiosity. Those same qualities are what make great driver developers.
In the following chapters, you will apply these foundations in practice, structuring a FreeBSD driver, opening device files, and gradually working toward hardware interaction.
Keep these lessons close, revisit them often, and, above all, stay curious.