The Mind
Back to all posts

A Step-by-Step Guide to Creating Your Own ARM Operating System

This extensive guide will help you build your ARM Operating System from scratch

July 20, 2025 66 mins read
Table of Contents

1. Introduction

Operating systems are complex pieces of software that manage hardware resources and provide services for applications. Building one from scratch is an excellent way to understand how computers work at a fundamental level. In this comprehensive guide, I’ll walk you through building MeringueOS, a simple but educational operating system for the ARM AArch64 architecture.

There are three ways you can go about building MereingueOS from this guide:

  1. Pull/Clone the repo and analyze the full source code and components yourself. You can even run it locally and potentially extend its capabilities.
  2. Ask an AI Agent to use this resource and ask it to build it piece by piece and ask it questions as you go. (I have included troubleshooting prompts that should allow a tool like Codex, GeminiCLI, OpenCode or even Claude Code to get this done autonomously).
  3. You can do it yourself and grind it out. I’ve tried to make this as clear as possible, so feel free to consult your favorite LLM or google about what feels unclear or any knowledge gaps you may have. I will also share any relevant resources/readings you can read as you go.

There’s alternate ways to learn so there’s no shame in whatever works for you. This topic, after all, is not child’s play.

The Basics: Operating System Architecture

An Operating System is the software layer that sits between your applications and the computer’s hardware. When you run a program, it doesn’t directly control the CPU, memory, or devices - instead, it asks the OS to do these things on its behalf. The OS manages processes (running programs), allocates memory, handles files and networking, controls input/output devices, and provides the user interface you interact with. The breakdown of the architecture, looks like this:

For the OS we’ll be building, it’s called MeringueOS and it runs on the QEMU virt platform, which simulates ARM hardware. By the end of this guide, you’ll have a functional kernel with memory management, exception handling, a basic shell, and standard library functions. More importantly, you’ll understand how these components work and interact with each other.

Prerequisites

To follow along, you should have:

  • Basic knowledge of C programming
  • Some familiarity with assembly language (helpful but not required)
  • Understanding of computer architecture fundamentals
  • Linux environment (preferred) or macOS/Windows with appropriate tools

While OS development is complex, I’ll break down concepts into manageable parts and provide the necessary background information as we go.

Why Build an OS from Scratch?

Building an operating system teaches you:

  • How hardware and software interact at a deeper level
  • Memory management principles
  • Concurrency and synchronization
  • Resource allocation and scheduling
  • System design and architecture

Plus, there’s the satisfaction of creating something fundamental from nothing and watching it come to life. In a previous life, I did a lot of vulnerability research on ARM based devices and this involved a lot of Reverse Engineering. Even though I don’t do this a lot any more, deep down, I always wanted to build an ARM based OS from scratch to further my knowledge - that’s why this whole project exists.

Brief Introduction to ARM AArch64

ARM AArch64 is the 64-bit execution state of the ARMv8 architecture. Some key points:

  • 31 general-purpose 64-bit registers (X0-X30)
  • Dedicated stack pointer (SP)
  • Exception levels (EL0-EL3) for privilege separation
  • Vector-based exception handling
  • Memory Management Unit (MMU) for virtual memory

MeringueOS runs at EL1, the kernel privilege level, equivalent to kernel mode in x86 systems.

Under the Hood: ARM’s Exception Levels

┌───────────────────────────────────────────────────┐
│ EL3: Secure Monitor │ Highest Privilege
│ (Secure world management, trusted firmware) │
├───────────────────────────────────────────────────┤
│ EL2: Hypervisor │
│ (Virtual machine management) │
├───────────────────────────────────────────────────┤
│ EL1: OS Kernel │ ← MeringueOS runs here
│ (Privileged operations, hardware access) │
├───────────────────────────────────────────────────┤
│ EL0: Applications │ Lowest Privilege
│ (Unprivileged code, restricted hardware access)│
└───────────────────────────────────────────────────┘

2. Setting Up the Development Environment

Before coding, let’s set up a development environment.

Required Tools

You’ll need:

  • An ARM AArch64 cross-compiler (aarch64-linux-gnu-gcc)
  • QEMU for ARM emulation (qemu-system-aarch64)
  • Make for build automation
  • Git (optional, for version control)

On Debian/Ubuntu, install these tools with:

Terminal window
sudo apt update
sudo apt install gcc-aarch64-linux-gnu qemu-system-arm make git

On macOS with Homebrew:

Terminal window
brew install aarch64-elf-gcc qemu make

Troubleshooting Environment Setup

Common issues and their solutions:

  1. Missing cross-compiler errors:
  • Verify installation with aarch64-linux-gnu-gcc --version
  • Check PATH environment variable
  • On some systems, the package name may be different (e.g., gcc-aarch64-linux-gnu)
  1. QEMU command not found:
  • Install with package manager or from source
  • Verify installation with qemu-system-aarch64 --version

Project Structure

Let’s create our project directory structure:

MeringueOS/
├── build/ # Compiled object files and binaries
├── src/
│ ├── boot/ # Boot code and kernel entry
│ ├── exceptions/ # Exception handling
│ ├── include/ # Header files
│ ├── lib/ # Standard library implementations
│ ├── memory/ # Memory management
│ ├── shell/ # Shell interface
│ └── ui/ # Text User Interface (TUI)
└── test/ # Test files

Understanding the Makefile

The Makefile automates the build process. Here’s MeringueOS’s Makefile:

# Arm-OS Makefile
# Target architecture
ARCH = aarch64
# Cross compiler settings
# Note: On macOS or some systems, you may need to use aarch64-elf- instead
CROSS_COMPILE = aarch64-linux-gnu-
CC = $(CROSS_COMPILE)gcc
AS = $(CROSS_COMPILE)as
LD = $(CROSS_COMPILE)ld
OBJCOPY = $(CROSS_COMPILE)objcopy
OBJDUMP = $(CROSS_COMPILE)objdump
# Compiler flags
CFLAGS = -Wall -Wextra -ffreestanding -nostdlib -nostartfiles -mcpu=cortex-a72 -I./src/include
ASFLAGS = -mcpu=cortex-a72
LDFLAGS = -nostdlib
# Source directories
SRC_DIR = src
BOOT_DIR = $(SRC_DIR)/boot
MEMORY_DIR = $(SRC_DIR)/memory
EXCEPTIONS_DIR = $(SRC_DIR)/exceptions
UI_DIR = $(SRC_DIR)/ui
LIB_DIR = $(SRC_DIR)/lib
SHELL_DIR = $(SRC_DIR)/shell
INCLUDE_DIR = $(SRC_DIR)/include
# Build directories
BUILD_DIR = build
OBJ_DIR = $(BUILD_DIR)/obj
# Source files
ASM_SRCS = $(wildcard $(BOOT_DIR)/*.S) $(wildcard $(EXCEPTIONS_DIR)/*.S)
C_SRCS = $(wildcard $(BOOT_DIR)/*.c) \
$(wildcard $(MEMORY_DIR)/*.c) \
$(wildcard $(EXCEPTIONS_DIR)/*.c) \
$(wildcard $(UI_DIR)/*.c) \
$(wildcard $(LIB_DIR)/*.c) \
$(wildcard $(SHELL_DIR)/*.c)
# Object files
ASM_OBJS = $(patsubst $(SRC_DIR)/%.S, $(OBJ_DIR)/%.o, $(ASM_SRCS))
C_OBJS = $(patsubst $(SRC_DIR)/%.c, $(OBJ_DIR)/%.o, $(C_SRCS))
OBJS = $(ASM_OBJS) $(C_OBJS)
# Output files
KERNEL = $(BUILD_DIR)/kernel8.elf
KERNEL_IMG = $(BUILD_DIR)/kernel8.img
# Targets
.PHONY: all clean qemu debug
all: $(KERNEL_IMG)
$(KERNEL_IMG): $(KERNEL)
$(OBJCOPY) -O binary $< $@
$(KERNEL): $(OBJS) src/linker.ld | $(BUILD_DIR)
$(LD) $(LDFLAGS) -T src/linker.ld -o $@ $(OBJS)
$(OBJ_DIR)/%.o: $(SRC_DIR)/%.c | $(OBJ_DIR)
@mkdir -p $(dir $@)
$(CC) $(CFLAGS) -c $< -o $@
$(OBJ_DIR)/%.o: $(SRC_DIR)/%.S | $(OBJ_DIR)
@mkdir -p $(dir $@)
$(AS) $(ASFLAGS) -c $< -o $@
$(BUILD_DIR) $(OBJ_DIR):
mkdir -p $@
mkdir -p $(OBJ_DIR)/boot
mkdir -p $(OBJ_DIR)/memory
mkdir -p $(OBJ_DIR)/exceptions
mkdir -p $(OBJ_DIR)/ui
mkdir -p $(OBJ_DIR)/lib
mkdir -p $(OBJ_DIR)/shell
qemu: $(KERNEL_IMG)
qemu-system-aarch64 -M virt -cpu cortex-a72 -m 128M -nographic -kernel $(KERNEL_IMG)
debug: $(KERNEL_IMG)
qemu-system-aarch64 -M virt -cpu cortex-a72 -m 128M -nographic -kernel $(KERNEL_IMG) -S -s
clean:
rm -rf $(BUILD_DIR)

Design Decision: Makefile Architecture

This Makefile:

  • Defines compiler and tool variables
  • Sets compiler flags for freestanding development (-ffreestanding, -nostdlib)
  • Creates build directories
  • Compiles C and assembly files
  • Links the kernel
  • Generates a binary image
  • Provides targets for cleaning, running, and debugging

Alternative Approach: A single-step build would be simpler but slower for large projects as it would recompile everything each time.

Tradeoff: We chose the more modular approach for its incremental build capabilities, which saves time during development.

Testing and Debugging

The Makefile includes two targets for testing:

  • make qemu: Runs the OS in QEMU
  • make debug: Starts QEMU in debug mode and connects GDB

When debugging, you can use:

  • Breakpoints: break function_name or break file.c:line
  • Step execution: step (into functions) or next (over functions)
  • Examine memory: x/10x $address (show 10 words in hex)
  • Show registers: info registers

Platform-Specific Setup and Common Issues

Different operating systems and development environments can present unique challenges when building MeringueOS. This section addresses the most common platform-specific issues and their solutions.

macOS Development Setup

macOS users typically use the Homebrew package manager and have slightly different toolchain naming:

Terminal window
# Install ARM cross-compilation toolchain via Homebrew
brew install aarch64-elf-gcc qemu make
# Verify installation
aarch64-elf-gcc --version
qemu-system-aarch64 --version

Makefile Configuration for macOS: Update the cross-compiler setting in your Makefile:

# Cross compiler settings for macOS
CROSS_COMPILE = aarch64-elf-

Common macOS Issues:

  1. Toolchain naming differences:
  • Problem: make fails with “command not found: aarch64-linux-gnu-gcc”
  • Solution: Change CROSS_COMPILE = aarch64-linux-gnu- to CROSS_COMPILE = aarch64-elf-
  • Root cause: macOS Homebrew uses different package naming conventions
  1. Assembly file formatting:
  • Problem: Assembler warnings about “end of file not at end of a line”
  • Solution: Ensure assembly files (.S) end with a newline character
  1. Missing timeout command:
  • Problem: timeout command not found when testing with QEMU
  • Solution: Use gtimeout (via brew install coreutils) or manual process management
  • Alternative: Use backgrounding: (qemu-system-aarch64 ... &); sleep 5; pkill qemu-system-aarch64

Linux Development Setup

Linux development typically uses distribution package managers:

Terminal window
# Ubuntu/Debian
sudo apt update
sudo apt install gcc-aarch64-linux-gnu qemu-system-arm make git
# Fedora/RHEL
sudo dnf install gcc-aarch64-linux-gnu qemu-system-aarch64 make git
# Arch Linux
sudo pacman -S aarch64-linux-gnu-gcc qemu-arch-extra make git

Makefile Configuration for Linux:

# Cross compiler settings for Linux
CROSS_COMPILE = aarch64-linux-gnu-

Common Linux Issues:

  1. Package naming variations:
  • Problem: Package not found errors
  • Ubuntu/Debian: Use gcc-aarch64-linux-gnu
  • Some distros: May be aarch64-linux-gnu-gcc or cross-gcc-aarch64
  • Solution: Check your distribution’s package repository
  1. QEMU package differences:
  • Problem: qemu-system-aarch64 command not found
  • Solution: Install qemu-system-arm (includes AArch64) or qemu-arch-extra
  • Verification: which qemu-system-aarch64
  1. Permission issues with QEMU:
  • Problem: QEMU fails to start with permission errors
  • Solution: Add user to appropriate groups: sudo usermod -a -G kvm $USER
  • Note: May require logout/login to take effect

Build System Troubleshooting

These issues can occur on any platform:

Missing Function Declarations:

// Add to src/include/lib/string.h
char* strtok(char *str, const char *delim);
size_t strspn(const char *s, const char *accept);
char* strpbrk(const char *s, const char *reject);
// Add to src/include/lib/stdio.h
char kgetc(void);
char kgetc_blocking(void);

Linker Script Dependencies: Ensure your Makefile properly depends on the linker script:

$(KERNEL): $(OBJS) src/linker.ld | $(BUILD_DIR)
$(LD) $(LDFLAGS) -T src/linker.ld -o $@ $(OBJS)

Compiler Warning Fixes: To suppress unused parameter warnings in shell commands:

void cmd_help(int argc, char **argv) {
(void)argc; (void)argv; // Suppress unused warnings
kprintf("Available commands:\n");
// ... rest of function
}

Testing and Verification

QEMU Testing Tips:

Terminal window
# Basic test (manual termination with Ctrl+C)
make qemu
# Background testing with automatic termination
(make qemu &); sleep 10; pkill qemu-system-aarch64
# Debug mode (connects GDB on port 1234)
make debug

Expected Boot Output: A successful boot should show:

Terminal window
MeringueOS starting...
Kernel loaded at physical address: 0x0
Memory Sections: [memory layout information]
PMM: Initialization complete. Total: [X] KB, Free: [Y] KB
KHeap: Initialized.
TUI: Initialized (basic mode)
Starting shell...
Welcome to MeringueOS Shell!
meringue> [awaiting input]

Common Boot Issues:

  1. No output from QEMU:
  • Check linker script entry point matches _start symbol
  • Verify UART initialization in early boot
  • Try adding debug prints with direct register access
  1. Memory management failures:
  • Verify bitmap initialization and frame allocation
  • Check that linker symbols are properly aligned
  • Ensure PMM bitmap size calculation is correct
  1. Shell not responding:
  • Verify UART getc functions are implemented
  • Check that character echoing works
  • Ensure shell command parsing handles edge cases

Integration Best Practices

When implementing from this guide:

  1. Start Simple: Begin with a minimal implementation and add features incrementally
  2. Test Early: Verify each major component before proceeding to the next
  3. Use Debug Output: Liberal use of kprintf helps track initialization progress
  4. Handle Edge Cases: Add validation for NULL pointers and boundary conditions
  5. Follow Conventions: Maintain consistent coding style throughout the codebase

Remember: The goal is learning, not perfection. Focus on understanding each component’s role in the larger system rather than optimizing every detail. (Once you understand the basics, you can extend functionality from there, don’t skip the fundamentals)

Now that our environment is set up, let’s dive into OS development principles.

3. Fundamentals of OS Development

Before writing code, let’s understand key OS concepts.

What is a Kernel?

The kernel is the core of an operating system, responsible for:

  • Process management: Creating, scheduling, and terminating processes
  • Memory management: Allocating and tracking memory usage
  • Device management: Interacting with hardware devices
  • System calls: Providing services to user applications

MeringueOS implements a monolithic kernel where all OS services run in kernel space (privileged mode).

An OS Under The Hood

The operating system’s job is to coordinate all the hardware components in your computer to work together effectively. Just as multiple programs might want or need to use the CPU, memory, and disk simultaneously, the OS must schedule and allocate these resources fairly/effectively. It ensures the CPU switches between programs efficiently, manages which program gets which portion of memory, queues up disk requests in an efficient order, and routes network packets to the right applications. Without this coordination, programs would conflict with each other, corrupting data or even potentially crashing the system. The kernel is key when it comes to some of these functions we’ve mentioned here.

Kernel Architecture: The Big Picture

┌───────────────────────────────────────────────────────────────┐
│ MeringueOS Kernel │
│ │
│ ┌─────────────┐ ┌─────────────┐ ┌──────────────────────┐ │
│ │ Boot │ │ Memory │ │ Exception Handling │ │
│ │ Sequence │──▶ Management │──▶ (Vector Table) │ │
│ └─────────────┘ └─────────────┘ └──────────────────────┘ │
│ │ ▲ │ │
│ │ │ │ │
│ ▼ │ ▼ │
│ ┌─────────────┐ ┌─────────────┐ ┌──────────────────────┐ │
│ │ Kernel │ │ Standard │ │ UART/Console I/O │ │
│ │ Heap │◀─┤ Library │◀─┤ │ │
│ └─────────────┘ └─────────────┘ └──────────────────────┘ │
│ │ ▲ │ │
│ │ │ │ │
│ └────────────────┐│┌─────────────────┘ │
│ │││ │
│ ▼▼▼ │
│ ┌─────────────┐ │
│ │ Shell │ │
│ │ Interface │ │
│ └─────────────┘ │
└───────────────────────────────────────────────────────────────┘

Design Decision: Monolithic vs. Microkernel

MeringueOS uses a monolithic kernel design where all OS services run in privileged mode.

Alternative: Microkernel architecture would move most services to user space, communicating via message passing.

Tradeoffs:

  • Monolithic: Has better performance, simpler implementation, but less modular
  • Microkernel: Better isolation, fault tolerance, modular design, but potential performance overhead from context switching

I chose monolithic for educational simplicity and performance, but a microkernel would be more appropriate for a system prioritizing security and reliability or even something that can used for production use cases (potentially).

Execution Privilege Levels in ARM

ARM AArch64 has four Exception Levels (ELs), providing security isolation:

+---------------------------------------+
| EL3: Secure monitor | Highest privilege
+---------------------------------------+
| EL2: Hypervisor |
+---------------------------------------+
| EL1: OS kernel | <- MeringueOS runs here
+---------------------------------------+
| EL0: Applications | Lowest privilege
+---------------------------------------+

MeringueOS runs at EL1 . The OS can access all memory and hardware, while future user applications would run at EL0 with limited access.

Memory Management Concepts

Memory management involves:

  • Physical memory management: Tracking and allocating RAM
  • Virtual memory: Providing each process with its own address space
  • Memory protection: Preventing processes from accessing each other’s memory

MeringueOS implements physical memory management with:

  • A frame allocator for **4KB blocks **of physical memory
  • A kernel heap allocator for dynamic memory allocation

Memory Management

Memory management gives each program its own private view of memory, even though they’re all sharing the same physical RAM. Physical memory consists of actual hardware addresses with fixed locations, while virtual memory provides fake addresses that the OS translates to real ones behind the scenes. This translation happens through page tables - when a program accesses memory address 0x1000, the OS might actually store that data at physical address 0x40521000. This isolation prevents programs from reading or corrupting each other’s data.

Exception Handling Basics

Exceptions are events that require special handling, including:

  • Synchronous exceptions: Instruction-related (e.g., system calls, errors)
  • Asynchronous exceptions: External events (e.g., interrupts)

ARM AArch64 uses a vector table with fixed offsets for different exception types. When an exception occurs, the CPU:

  1. Saves some state (PC, PSTATE)
  2. Jumps to the appropriate vector entry
  3. Executes the handler code

MeringueOS implements a full vector table and handlers for all exception types.

How Exception Handling works

Exception handling is the CPU’s way of dealing with unexpected events and system calls. When something unusual happens - like dividing by zero, accessing invalid memory, or a program requesting OS services - the CPU immediately stops what it’s doing and jumps to a predetermined handler function. The exception vector table acts** like a dispatch system**, containing addresses for different types of exceptions: synchronous ones (errors, system calls) and asynchronous ones (hardware interrupts). Each exception type gets routed to its specific handler, which saves the current program state, handles the issue, then restores execution where it left off.

The Boot Sequence

The boot process for MeringueOS follows these steps:

┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│ CPU Reset │ │ Boot.S │ │ Clear BSS │
│ EL2 or EL3 │────▶│ Setup Stack │────▶│ Section │
│ PC=Reset │ │ Check Core │ │ │
└─────────────┘ └─────────────┘ └─────────────┘
│ │
│ ▼
┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│ Initialize │ │ Set up │ │ Transition │
│ Subsystems │◀─── │ Exceptions │◀─── │ to EL1 │
│ Start Shell │ │ Vector Table│ │ │
└─────────────┘ └─────────────┘ └─────────────┘

Now let’s look at the actual implementation.

4. The Boot Process

The boot process is the first step in bringing our OS to life.

Additional context about the Boot Process

When a computer is powered on, the CPU starts executing from a fixed address with most features disabled - no memory management, no interrupts, just raw instruction execution. The bootloader’s job is to gradually bring the system to life(make it usable/functional): first setting up a stack so functions can be called, then initializing RAM by clearing uninitialized sections and copying data segments to their runtime locations, configuring exception handlers so the CPU knows where to jump when errors occur, and finally enabling hardware features like the MMU and interrupt controller. Only after this foundational setup can the kernel initialize higher-level services like device drivers, file systems, and eventually start the first user process.

Assembly Entry Point

The entry point is in boot.S. Let’s examine it:

/* AArch64 boot code for QEMU virt machine */
.section ".text.boot"
.global _start
_start:
// Check processor ID is 0 (primary core)
mrs x0, mpidr_el1
and x0, x0, #0xFF
cbz x0, primary_core
// Secondary cores loop forever
1: wfe
b 1b
primary_core:
// Set stack pointer to _stack_top
ldr x0, =_stack_top
mov sp, x0
// Copy .rodata section from load address to execution address
ldr x0, =_rodata_start // Destination address
ldr x1, =_rodata_load // Source address
ldr x2, =_rodata_end
sub x2, x2, x0 // Size to copy
bl copy_data_section
// Copy .data section from load address to execution address
ldr x0, =_data_start // Destination address
ldr x1, =_data_load // Source address
ldr x2, =_data_end
sub x2, x2, x0 // Size to copy
bl copy_data_section
// Clear BSS
ldr x0, =_bss_start
ldr x1, =_bss_end
sub x1, x1, x0
cbz x1, skip_bss_clear
clear_bss_loop:
str xzr, [x0], #8
sub x1, x1, #8
cbnz x1, clear_bss_loop
skip_bss_clear:
// Set up exception vector table
ldr x0, =_exception_vector_table
msr vbar_el1, x0
// Set up EL1 (kernel mode)
mrs x0, CurrentEL
lsr x0, x0, #2
cmp x0, #1
beq already_in_el1
// If we're in EL2, configure EL1 and drop to it
/* Disable EL1 timer traps */
mov x0, #0x3 // CNTHCTL_EL2.EL1PCTEN | CNTHCTL_EL2.EL1PCEN
msr cnthctl_el2, x0
msr cntvoff_el2, xzr
/* Set EL1 execution state to AArch64 */
mov x0, #(1 << 31) // AArch64
orr x0, x0, #(1 << 1) // SWIO hardwired
msr hcr_el2, x0
/* Configure EL1 */
// Set up SPSR_EL2 for the transition to EL1
mov x0, #0x3c5 // DAIF + EL1h (SPSel = 1)
msr spsr_el2, x0
/* Set EL1 entry point and switch */
ldr x0, =already_in_el1
msr elr_el2, x0
eret
already_in_el1:
// Enable floating point
mov x0, #0x00300000 // FPEN bits
msr cpacr_el1, x0
isb
// Jump to C code, passing boot info pointer
mov x0, #0 // For now, pass null pointer as boot info
bl kernel_main // Call our C entry point
// If kernel_main returns, halt the CPU
1: wfe
b 1b
//------------------------------------------------------------------
// Helper function to copy data sections
// x0 = destination address
// x1 = source address
// x2 = size in bytes (must be multiple of 8)
//------------------------------------------------------------------
copy_data_section:
// Return immediately if source and destination are the same
cmp x0, x1
beq copy_done
// Return if size is zero
cbz x2, copy_done
// Add debugging output
// Preserve registers used by the output routine
stp x0, x1, [sp, #-16]!
stp x2, x30, [sp, #-16]!
// Call C debug function to print details about the copy
mov x3, x2 // Size
bl boot_debug_copy // Call C function to print debug info
// Restore registers
ldp x2, x30, [sp], #16
ldp x0, x1, [sp], #16
// Perform the copy, byte by byte for safety
copy_byte_loop:
ldrb w4, [x1], #1 // Load a byte
strb w4, [x0], #1 // Store a byte
sub x2, x2, #1 // Decrement size
cbnz x2, copy_byte_loop
copy_done:
ret

This assembly code:

  1. Checks if we’re on the primary CPU core
  2. Sets up the stack pointer
  3. Copies initialized data sections
  4. Clears the BSS section (uninitialized global variables)
  5. Sets up the exception vector table
  6. Switches from EL2 to EL1 if needed
  7. Enables floating-point and SIMD instructions
  8. Jumps to kernel_main in C

Under the Hood: CPU State at Boot

When an ARM processor powers up:

  1. It starts in the highest exception level (EL3 or EL2 depending on implementation)
  2. Most features are disabled (caches, MMU, etc.)
  3. The program counter is set to a predetermined value
  4. Only a single core is active in multi-core systems

Setting Up the Stack

The stack is essential for C function calls. We set it to the address of _stack_top, which is defined in our linker script. The stack grows downward from that address, providing space for function calls and local variables.

State Transitions During Boot

┌───────────────────┐ ┌───────────────────┐ ┌───────────────────┐
│ CPU Reset State │ │ Initial Setup │ │ Stack & BSS Setup │
│ PC = Reset Vector │─────▶│ Check Core ID │─────▶│ Stack initialized │
│ EL = 2 or 3 │ │Disable secondaries│ │ BSS section zeroed│
└───────────────────┘ └───────────────────┘ └────────┬──────────┘
┌───────────────────┐ ┌───────────────────┐ ┌───────────────────┐
│ C Environment │ │ Transition to EL1 │ │ Exception Setup │
│Jump to kernel_main│◀──── │ Configure SPSRs │◀──── │Set up vector table│
│ Start OS services │ │ Set return addr │ │ Enable features │
└───────────────────┘ └───────────────────┘ └───────────────────┘

Debugging the Boot Process

Common boot issues and how to diagnose them:

  1. No output from QEMU:
  • Check linker script entry point matches _start symbol
  • Verify reset vector location (0x40100000 for our setup)
  • Add early debug prints before UART initialization using direct register access
  1. Crash during initialization:
  • Add debug prints between major steps
  • Check that BSS/data section handling matches linker script
  • Verify exception vector alignment (must be 2KB aligned)
  1. Issues transitioning to EL1:
  • Verify SPSR_EL2 configuration is correct (check the ARM documentation)
  • Check that return address (ELR_EL2) is properly set
  • Ensure stack pointer is valid before transition

Transition from Assembly to C

Once the basic hardware initialization is done, we jump to kernel_main() to continue in C. This is defined in kernel.c:

#include <kernel.h>
#include <lib/stdio.h>
// External functions we'll implement later
extern void frame_alloc_init(const KERNEL_BOOT_PARAMS *params);
extern void kheap_init(void);
extern int tui_init(void);
extern void shell_loop(void);
// Debug function called from boot code before UART is initialized
// We need a direct hardware access version
static void early_debug_print(const char *str) {
// UART direct access (PL011)
volatile uint32_t *uart_dr = (volatile uint32_t*)0x09000000;
volatile uint32_t *uart_fr = (volatile uint32_t*)0x09000018;
while (*str) {
// Wait for FIFO to have space
while ((*uart_fr) & (1 << 5)); // TXFF bit
// Send character
*uart_dr = (uint32_t)*str++;
}
}
// Debug function for section copying, called from assembly
void boot_debug_copy(void *dest, void *src, size_t size) {
// Print directly using early UART access
early_debug_print("[BOOT] Copying section: ");
// Convert size to string manually (very simple)
char size_str[16];
int i = 0;
int tmp = (int)size;
// Handle special case of zero
if (tmp == 0) {
size_str[0] = '0';
size_str[1] = '\0';
} else {
// Convert integer to string (backwards)
while (tmp > 0) {
size_str[i++] = '0' + (tmp % 10);
tmp /= 10;
}
size_str[i] = '\0';
// Reverse the string
for (int j = 0; j < i/2; j++) {
char c = size_str[j];
size_str[j] = size_str[i-j-1];
size_str[i-j-1] = c;
}
}
early_debug_print(size_str);
early_debug_print(" bytes\n");
}
// Kernel entry point
void kernel_main(KERNEL_BOOT_PARAMS *params) {
// Early initialization - placeholder for UART setup
// We'll assume kprintf writes to a UART for now
kprintf("MeringueOS starting...\n");
kprintf("Kernel loaded at physical address: 0x%llx\n",
params ? params->kernel_phys_start : 0);
// Debug section information
kprintf("Memory Sections:\n");
kprintf(" .text: %p to %p\n", &_kernel_start, &_text_end);
kprintf(" .rodata: %p to %p (load: %p)\n", &_rodata_start, &_rodata_end, &_rodata_load);
kprintf(" .data: %p to %p (load: %p)\n", &_data_start, &_data_end, &_data_load);
kprintf(" .bss: %p to %p\n", &_bss_start, &_bss_end);
// Initialize memory management subsystem
kprintf("Initializing Physical Memory Manager...\n");
frame_alloc_init(params);
kprintf("Initializing Kernel Heap Allocator...\n");
kheap_init();
// Initialize Text User Interface
kprintf("Initializing TUI subsystem...\n");
if (tui_init() != 0) {
kprintf("Failed to initialize TUI subsystem!\n");
// Continue without TUI for now
}
// Enter the shell
kprintf("Starting shell...\n");
shell_loop();
// If shell returns, halt
kprintf("Kernel halting.\n");
while(1) {
// This is equivalent to a halt
asm volatile("wfi");
}
}

This C code initializes each subsystem in turn:

  1. Prints debug information about memory sections
  2. Initializes the physical memory manager
  3. Sets up the kernel heap
  4. Initializes the text UI
  5. Finally, starts the shell, which takes over

Memory Layout

The memory layout is defined in the linker script linker.ld:

/* Linker script for Arm-OS */
/* QEMU virt machine memory layout */
ENTRY(_start)
SECTIONS
{
/* Kernel starts at 0x40100000 for QEMU virt machine */
. = 0x40100000;
_kernel_start = .;
.text : ALIGN(8)
{
*(.text.boot) /* Boot code first */
*(.text) /* All other code */
*(.text.*) /* Including subsections */
}
_text_end = .;
.rodata : ALIGN(8)
{
_rodata_start = .;
*(.rodata)
*(.rodata.*)
_rodata_end = .;
}
.data : ALIGN(8)
{
_data_start = .;
*(.data)
*(.data.*)
_data_end = .;
}
/* Add symbols for load addresses (where sections are actually loaded by QEMU) */
_rodata_load = LOADADDR(.rodata);
_data_load = LOADADDR(.data);
.bss : ALIGN(8)
{
_bss_start = .;
*(.bss)
*(.bss.*)
*(COMMON)
. = ALIGN(8);
_bss_end = .;
}
.eh_frame :
{
*(.eh_frame)
}
. = ALIGN(8);
/* Reserve space for the stack */
. = ALIGN(16);
_stack_bottom = .;
. += 0x10000; /* 64 KB for stack */
. = ALIGN(16);
_stack_top = .;
/* Reserve space for PMM bitmap */
. = ALIGN(4096);
_pmm_bitmap_start = .;
. += 0x20000; /* 128 KB for PMM bitmap (can manage 4GB of RAM) */
_pmm_bitmap_end = .;
_kernel_end = .;
}

This script:

  1. Sets the entry point to _start
  2. Places the kernel at physical address 0x40100000
  3. Organizes sections: text (code), rodata (constants), data, and bss
  4. Reserves stack space
  5. Provides symbols for section boundaries

Design Decision: Static vs. Dynamic Section Loading

My implementation uses static, fixed addresses defined in the linker script.

Alternative: A more dynamic approach could parse an ELF header to locate sections.

Tradeoffs:

  • Static: Simpler implementation, fixed memory layout, easier to debug
  • Dynamic: More flexible, supports loading arbitrary binaries, but more complex

I chose static loading for educational clarity and deterministic behavior(easy to debug), though production OSes typically use dynamic approaches for flexibility.

The memory layout looks like:

+-------------------+ 0x40100000
| .text (code) |
| |
+-------------------+
| .rodata |
| (constants) |
+-------------------+
| .data |
| (initialized vars)|
+-------------------+
| .bss |
| (zeroed at boot) |
+-------------------+
| |
| Available memory |
| |
+-------------------+
| PMM bitmap |
| (memory tracking) |
+-------------------+
| Stack (64KB) |
| (grows downward) |
_stack_top → +-------------------+

Now that our kernel is booting, let’s implement memory management.

5. Memory Management

Memory management is crucial for any OS. MeringueOS implements two levels:

  1. Physical frame allocation
  2. Kernel heap for dynamic memory

Physical Memory Management

The physical memory manager (PMM) tracks 4KB blocks of RAM called frames. It uses a bitmap where each bit represents a frame.

Let’s look at the implementation in frame_alloc.h:

#ifndef FRAME_ALLOC_H
#define FRAME_ALLOC_H
#include <stddef.h>
#include <stdint.h>
#include <stdbool.h>
#include "kernel.h"
// Define page size (typically 4KB for AArch64)
#define PAGE_SIZE 4096
#define PAGE_SHIFT 12
// Define the RAM region for QEMU virt machine
#define PMM_RAM_BASE 0x40000000
// Note: Linker symbols are now included from kernel.h
// Initialize the physical memory manager
void frame_alloc_init(const KERNEL_BOOT_PARAMS *params);
// Allocate a physical frame, returns NULL if no free frames
void* alloc_frame(void);
// Free a previously allocated physical frame
void free_frame(void *frame);
// Get information about memory
uint64_t pmm_get_total_memory(void);
uint64_t pmm_get_free_memory(void);
uint64_t pmm_get_highest_usable_address(void);
#endif // FRAME_ALLOC_H

And the implementation in frame_alloc.c:

#include <stddef.h>
#include <stdint.h>
#include <stdbool.h>
#include "memory/frame_alloc.h"
#include "lib/string.h"
#include "lib/stdio.h"
// For QEMU virt, RAM often starts at 0x40000000 and can be e.g., 128MB or more.
// Let's assume a max manageable physical address space, e.g., 1GB beyond RAM start.
// Adjust this based on actual QEMU configuration or dynamic detection.
#define PMM_MANAGEABLE_SIZE (1024ULL * 1024ULL * 1024ULL) // Manage up to 1GB
#define PMM_MAX_ADDRESS (PMM_RAM_BASE + PMM_MANAGEABLE_SIZE)
#define PMM_TOTAL_FRAMES (PMM_MANAGEABLE_SIZE / PAGE_SIZE)
// Bitmap for tracking frame usage. Each bit represents one frame.
// Size = Total Frames / 8 bits per byte
static uint8_t *frame_bitmap = &_pmm_bitmap_start;
static uint64_t total_memory = 0;
static uint64_t free_memory = 0;
static uint64_t highest_usable_address = 0;
// Helper function to set a bit in the bitmap
static void set_bit(size_t bit) {
frame_bitmap[bit / 8] |= (1 << (bit % 8));
}
// Helper function to clear a bit in the bitmap
static void clear_bit(size_t bit) {
frame_bitmap[bit / 8] &= ~(1 << (bit % 8));
}
// Helper function to test a bit in the bitmap
static bool test_bit(size_t bit) {
return (frame_bitmap[bit / 8] & (1 << (bit % 8))) != 0;
}
// Mark a range of frames as used
static void mark_range_used(uint64_t base_addr, uint64_t size) {
uint64_t start_frame = (base_addr >= PMM_RAM_BASE) ?
(base_addr - PMM_RAM_BASE) / PAGE_SIZE :
(UINT64_MAX);
uint64_t end_addr = base_addr + size;
uint64_t end_frame = (end_addr > PMM_RAM_BASE) ?
(end_addr - 1 - PMM_RAM_BASE) / PAGE_SIZE :
0;
if (start_frame >= PMM_TOTAL_FRAMES) return; // Start address out of managed range
if (end_frame >= PMM_TOTAL_FRAMES) end_frame = PMM_TOTAL_FRAMES - 1; // Cap end address
kprintf("PMM: Marking used 0x%llx - 0x%llx (Frames %llu - %llu)\n",
base_addr, base_addr + size, start_frame, end_frame);
for (uint64_t i = start_frame; i <= end_frame; ++i) {
if (!test_bit(i)) {
set_bit(i);
// Assuming these were initially counted as free, decrement count
if (total_memory >= PAGE_SIZE) total_memory -= PAGE_SIZE;
if (free_memory >= PAGE_SIZE) free_memory -= PAGE_SIZE;
}
}
}
// Mark a range of frames as free
static void mark_range_free(uint64_t base_addr, uint64_t size) {
uint64_t start_frame = (base_addr >= PMM_RAM_BASE) ?
(base_addr - PMM_RAM_BASE) / PAGE_SIZE :
(UINT64_MAX);
uint64_t end_addr = base_addr + size;
uint64_t end_frame = (end_addr > PMM_RAM_BASE) ?
(end_addr - 1 - PMM_RAM_BASE) / PAGE_SIZE :
0;
if (start_frame >= PMM_TOTAL_FRAMES) return; // Start address out of managed range
if (end_frame >= PMM_TOTAL_FRAMES) end_frame = PMM_TOTAL_FRAMES - 1; // Cap end address
kprintf("PMM: Marking free 0x%llx - 0x%llx (Frames %llu - %llu)\n",
base_addr, base_addr + size, start_frame, end_frame);
for (uint64_t i = start_frame; i <= end_frame; ++i) {
if (test_bit(i)) { // Only count if it wasn't already free
total_memory += PAGE_SIZE;
free_memory += PAGE_SIZE;
if ((PMM_RAM_BASE + (i + 1) * PAGE_SIZE) > highest_usable_address) {
highest_usable_address = PMM_RAM_BASE + (i + 1) * PAGE_SIZE;
}
}
clear_bit(i); // Mark as free regardless
}
}
void frame_alloc_init(const KERNEL_BOOT_PARAMS *params) {
kprintf("PMM: Initializing Physical Memory Manager...\n");
// Calculate the bitmap size
size_t bitmap_size = (&_pmm_bitmap_end - &_pmm_bitmap_start);
kprintf("PMM: Bitmap size: %zu bytes, located at %p\n",
bitmap_size, frame_bitmap);
// Initially, mark all manageable frames as used
memset(frame_bitmap, 0xFF, bitmap_size);
total_memory = 0;
free_memory = 0;
highest_usable_address = PMM_RAM_BASE;
if (params) {
kprintf("PMM: Kernel Physical Range: 0x%llx - 0x%llx\n",
params->kernel_phys_start, params->kernel_phys_end);
// For now, we'll simplify and just reserve the kernel address range
// In a full implementation, we would parse the UEFI memory map from params
// Mark all memory as free initially (simplified approach)
mark_range_free(PMM_RAM_BASE, PMM_MANAGEABLE_SIZE);
// Then mark kernel memory as used
mark_range_used(params->kernel_phys_start,
params->kernel_phys_end - params->kernel_phys_start);
} else {
// No boot parameters, use linker-provided kernel boundaries
uint64_t kernel_start = (uint64_t)&_kernel_start;
uint64_t kernel_end = (uint64_t)&_kernel_end;
kprintf("PMM: Kernel boundaries from linker: 0x%llx - 0x%llx\n",
kernel_start, kernel_end);
// Mark all memory as free initially
mark_range_free(PMM_RAM_BASE, PMM_MANAGEABLE_SIZE);
// Then mark kernel memory as used
mark_range_used(kernel_start, kernel_end - kernel_start);
}
// Mark the bitmap itself as used (it lies within kernel memory, but just to be explicit)
mark_range_used((uint64_t)frame_bitmap, bitmap_size);
kprintf("PMM: Initialization complete. Total: %llu KB, Free: %llu KB\n",
total_memory / 1024, free_memory / 1024);
}
void* alloc_frame(void) {
// Simple linear search for first free frame
for (size_t i = 0; i < PMM_TOTAL_FRAMES; i++) {
if (!test_bit(i)) {
set_bit(i);
free_memory -= PAGE_SIZE;
// Calculate the physical address
void *frame_addr = (void*)(PMM_RAM_BASE + i * PAGE_SIZE);
// Zero the frame for security/predictability
memset(frame_addr, 0, PAGE_SIZE);
return frame_addr;
}
}
kprintf("PMM: ERROR - Out of physical frames!\n");
return NULL; // No free frame found
}
void free_frame(void *frame) {
if (!frame) return;
uint64_t addr = (uint64_t)frame;
// Basic validation
if (addr < PMM_RAM_BASE || addr >= PMM_MAX_ADDRESS) {
kprintf("PMM: Attempt to free invalid frame at %p\n", frame);
return;
}
// Check alignment
if (addr % PAGE_SIZE != 0) {
kprintf("PMM: Attempt to free unaligned address %p\n", frame);
return;
}
// Calculate the bit index
size_t frame_idx = (addr - PMM_RAM_BASE) / PAGE_SIZE;
if (frame_idx >= PMM_TOTAL_FRAMES) {
kprintf("PMM: Frame index %zu out of range\n", frame_idx);
return;
}
// Check if frame is currently marked as used
if (!test_bit(frame_idx)) {
kprintf("PMM: Warning - double free detected for frame %p\n", frame);
return;
}
// Mark as free
clear_bit(frame_idx);
free_memory += PAGE_SIZE;
}
uint64_t pmm_get_total_memory(void) {
return total_memory;
}
uint64_t pmm_get_free_memory(void) {
return free_memory;
}
uint64_t pmm_get_highest_usable_address(void) {
return highest_usable_address;
}

The PMM:

  1. Uses a bitmap for frame tracking (1 bit per 4KB frame)
  2. Reserves frames for hardware and the kernel
  3. Provides functions to allocate and free frames
  4. Includes validation to catch errors like double-frees

Under the Hood: Physical Memory Allocation

┌───────────────────────────────────────────────────────────────────────┐
│ Physical Memory (1GB Starting at 0x40000000) │
│ │
│ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ │
│ │ Frame 0 │ │ Frame 1 │ │ Frame 2 │ ... │Frame N-1│ │
│ │ (4 KB) │ │ (4 KB) │ │ (4 KB) │ │ (4 KB) │ │
│ └─────────┘ └─────────┘ └─────────┘ └─────────┘ │
└───────────────────────────────────────────────────────────────────────┘
┌───────────────────────────────────────────────────────────────────────┐
│ Frame Bitmap (1 bit per frame) │
│ │
│ ┌───┬───┬───┬───┬───┬───┬───┬───┬───┐ │
│ │ 1 │ 1 │ 0 │ 0 │ 1 │ 0 │ 1 │ 1 │...│ (1=used, 0=free) │
│ └───┴───┴───┴───┴───┴───┴───┴───┴───┘ │
│ │
└───────────────────────────────────────────────────────────────────────┘

When alloc_frame() is called:

  1. The bitmap is scanned for a free frame (bit=0)
  2. The bit is set to 1 (marking it as used)
  3. The corresponding physical address is calculated and returned
  4. If no free frame is found, NULL is returned

Troubleshooting Memory Issues

Common memory management bugs and how to diagnose them:

  1. Out of memory errors:
  • Check bitmap initialization - ensure it’s properly zeroed
  • Verify that free_frame() is correctly clearing bits
  • Add accounting code to track memory usage patterns
  1. Memory corruption:
  • Ensure frames are aligned properly (check address % PAGE_SIZE == 0)
  • Verify bitmap is correctly representing frame states
  • Add debug logging for allocation/free operations
  1. Double free bugs:
  • Use the validation in free_frame() to catch and log double-frees
  • Add optional “poisoning” to freed frames for easier debugging

Kernel Heap Implementation

The kernel heap provides dynamic memory allocation. Let’s look at the implementation in kheap.h:

#ifndef KHEAP_H
#define KHEAP_H
#include <stddef.h>
#include <stdint.h>
// Initialize the kernel heap
void kheap_init(void);
// Allocate a block of memory
void *kmalloc(size_t size);
// Free a previously allocated block
void kfree(void *ptr);
#endif // KHEAP_H

And the implementation in kheap.c:

#include <stddef.h>
#include <stdint.h>
#include <stdbool.h>
#include "memory/kheap.h"
#include "memory/frame_alloc.h"
#include "lib/string.h"
#include "lib/stdio.h"
// Header for memory blocks (both allocated and free)
typedef struct heap_block {
size_t size; // Size of the data area *excluding* this header
bool is_free; // True if block is free, false if allocated
struct heap_block *next; // Pointer to the next block in the heap (physical order)
struct heap_block *prev; // Pointer to the previous block in the heap (physical order)
struct heap_block *next_free; // Pointer to the next free block in the free list
struct heap_block *prev_free; // Pointer to the previous free block in the free list
} heap_block_t;
#define HEAP_HEADER_SIZE sizeof(heap_block_t)
#define HEAP_MIN_BLOCK_SIZE (HEAP_HEADER_SIZE * 2) // Minimum size to allow splitting
// Head of the free list (doubly linked)
static heap_block_t *free_list_head = NULL;
static heap_block_t *heap_start = NULL;
static heap_block_t *heap_end = NULL;
// --- Free List Management ---
static void add_to_free_list(heap_block_t *block) {
block->is_free = true;
block->next_free = free_list_head;
block->prev_free = NULL;
if (free_list_head) {
free_list_head->prev_free = block;
}
free_list_head = block;
}
static void remove_from_free_list(heap_block_t *block) {
if (block->prev_free) {
block->prev_free->next_free = block->next_free;
} else {
free_list_head = block->next_free; // It was the head
}
if (block->next_free) {
block->next_free->prev_free = block->prev_free;
}
block->is_free = false; // Mark as not free after removal
block->next_free = NULL;
block->prev_free = NULL;
}
// --- Heap Expansion ---
static bool expand_heap(size_t min_expand_size) {
// Request at least a page, or more if needed
size_t pages_needed = (min_expand_size + HEAP_HEADER_SIZE + PAGE_SIZE - 1) / PAGE_SIZE;
if (pages_needed == 0) pages_needed = 1;
kprintf("KHeap: Expanding heap by %zu pages\n", pages_needed);
heap_block_t *new_block = NULL;
for (size_t i = 0; i < pages_needed; ++i) {
void *frame = alloc_frame();
if (!frame) {
kprintf("KHeap Error: Failed to allocate frame during expansion!\n");
// If we allocated some frames but not all, we should ideally free them
// or add what we got. For simplicity here, we fail.
return false;
}
heap_block_t *current_block = (heap_block_t *)frame;
current_block->size = PAGE_SIZE - HEAP_HEADER_SIZE;
current_block->is_free = true;
current_block->next = NULL; // Will be linked below or by coalesce
if (!new_block) {
new_block = current_block; // Keep track of the first new block
}
// Link the new block into the main heap structure
if (heap_end) {
current_block->prev = heap_end;
heap_end->next = current_block;
} else {
// This is the very first block in the heap
heap_start = current_block;
current_block->prev = NULL;
}
heap_end = current_block;
}
if (!new_block) return false; // Should not happen if alloc_frame succeeded
// Coalesce the first new block with the previous block if it was free
if (new_block->prev && new_block->prev->is_free) {
heap_block_t *prev_block = new_block->prev;
remove_from_free_list(prev_block); // Remove old free block
prev_block->size += new_block->size + HEAP_HEADER_SIZE;
prev_block->next = new_block->next;
if (new_block->next) {
new_block->next->prev = prev_block;
}
if (heap_end == new_block) {
heap_end = prev_block;
}
new_block = prev_block; // The merged block is now the 'new' block to add
}
// Add the (potentially merged) new block to the free list
add_to_free_list(new_block);
return true;
}
// --- Coalescing ---
static heap_block_t* coalesce(heap_block_t *block) {
if (!block || !block->is_free) return block;
heap_block_t *current = block;
// Coalesce with next block if it's free
if (current->next && current->next->is_free) {
heap_block_t *next_block = current->next;
kprintf("KHeap: Coalescing forward %p (%zu) with %p (%zu)\n",
current, current->size, next_block, next_block->size);
remove_from_free_list(next_block); // Remove next block from free list
current->size += next_block->size + HEAP_HEADER_SIZE;
current->next = next_block->next;
if (current->next) {
current->next->prev = current;
}
if (heap_end == next_block) {
heap_end = current;
}
// next_block is now merged into current, clear it for safety
memset(next_block, 0, HEAP_HEADER_SIZE);
}
// Coalesce with previous block if it's free
if (current->prev && current->prev->is_free) {
heap_block_t *prev_block = current->prev;
kprintf("KHeap: Coalescing backward %p (%zu) with %p (%zu)\n",
prev_block, prev_block->size, current, current->size);
remove_from_free_list(prev_block); // Previous block is already in free list, remove it
prev_block->size += current->size + HEAP_HEADER_SIZE;
prev_block->next = current->next;
if (prev_block->next) {
prev_block->next->prev = prev_block;
}
if (heap_end == current) {
heap_end = prev_block;
}
// current is now merged into prev_block, clear it for safety
memset(current, 0, HEAP_HEADER_SIZE);
current = prev_block; // The result of the coalesce is the previous block
}
return current; // Return the potentially larger coalesced block
}
// --- Public API ---
void kheap_init() {
free_list_head = NULL;
heap_start = NULL;
heap_end = NULL;
// Pre-allocate some initial pages
expand_heap(PAGE_SIZE * 4); // Pre-allocate 16KB
kprintf("KHeap: Initialized.\n");
}
void* kmalloc(size_t size) {
if (size == 0) {
return NULL;
}
// Ensure minimum allocation size and alignment (e.g., align to 8 or 16 bytes)
// For simplicity, let's align to sizeof(void*)
size_t alignment = sizeof(void*);
size = (size + alignment - 1) & ~(alignment - 1);
// Add space for the header
size_t total_size_needed = size + HEAP_HEADER_SIZE;
// First-fit search
heap_block_t *current_free = free_list_head;
heap_block_t *best_fit = NULL;
while (current_free) {
if (current_free->size >= size) { // Found a block large enough
best_fit = current_free;
break; // First fit
}
current_free = current_free->next_free;
}
// If no block found, try to expand the heap
if (!best_fit) {
if (!expand_heap(total_size_needed)) {
kprintf("KHeap Error: Failed to expand heap for allocation of size %zu\n", size);
return NULL; // Expansion failed
}
// Retry finding a block (the new block should be suitable)
current_free = free_list_head;
while (current_free) {
if (current_free->size >= size) {
best_fit = current_free;
break;
}
current_free = current_free->next_free;
}
if (!best_fit) {
kprintf("KHeap Error: Still no suitable block after expansion!\n");
return NULL; // Should not happen if expansion succeeded
}
}
// We found a suitable block (best_fit)
remove_from_free_list(best_fit); // Remove it from the free list
// Check if we can split the block
if (best_fit->size >= size + HEAP_MIN_BLOCK_SIZE) {
// Split the block
size_t remaining_size = best_fit->size - size - HEAP_HEADER_SIZE;
heap_block_t *new_free_block = (heap_block_t *)((uint8_t *)best_fit + HEAP_HEADER_SIZE + size);
new_free_block->size = remaining_size;
new_free_block->is_free = true; // Will be added to free list
new_free_block->next = best_fit->next;
new_free_block->prev = best_fit;
if (new_free_block->next) {
new_free_block->next->prev = new_free_block;
} else {
heap_end = new_free_block; // It's the new end of the heap
}
best_fit->size = size; // Adjust size of the allocated block
best_fit->next = new_free_block;
// Add the new smaller free block to the free list
add_to_free_list(new_free_block);
kprintf("KHeap: Split block %p. Allocated %zu, remaining %zu at %p\n",
best_fit, best_fit->size, new_free_block->size, new_free_block);
} else {
// Cannot split, allocate the whole block
kprintf("KHeap: Allocated whole block %p (%zu) for size %zu\n",
best_fit, best_fit->size, size);
}
best_fit->is_free = false;
// Return pointer to the data area (after the header)
void *data_ptr = (void *)((uint8_t *)best_fit + HEAP_HEADER_SIZE);
// Optionally zero the allocated memory
memset(data_ptr, 0, best_fit->size);
// kprintf("KHeap: kmalloc(%zu) -> %p\n", size, data_ptr);
return data_ptr;
}
void kfree(void *ptr) {
if (!ptr) {
return;
}
// Get the header from the pointer
heap_block_t *block = (heap_block_t *)((uint8_t *)ptr - HEAP_HEADER_SIZE);
// Basic validation
if (block->is_free) {
kprintf("KHeap Warning: Double free detected for pointer %p\n", ptr);
return;
}
// More robust validation would involve checking magic numbers in the header
// or ensuring the block pointer is within the known heap range [heap_start, heap_end].
kprintf("KHeap: kfree(%p) - block %p, size %zu\n", ptr, block, block->size);
block->is_free = true;
// Attempt to coalesce with neighbors
heap_block_t *coalesced_block = coalesce(block);
// Add the (potentially coalesced) block back to the free list
// Check if it wasn't already added by coalesce (if coalesce returned the same block)
if (coalesced_block == block) {
add_to_free_list(coalesced_block);
kprintf("KHeap: Added block %p (%zu) to free list\n",
coalesced_block, coalesced_block->size);
} else {
// Coalesce already handled adding the merged block (prev_block)
add_to_free_list(coalesced_block); // Ensure the final coalesced block is on the list
kprintf("KHeap: Added coalesced block %p (%zu) to free list\n",
coalesced_block, coalesced_block->size);
}
}

The kernel heap:

  1. Uses a linked list of memory blocks with headers
  2. Implements first-fit allocation with splitting
  3. Coalesces adjacent free blocks during kfree() to reduce fragmentation
  4. Can expand dynamically by allocating more physical frames

How the Kernel Heap works

The kernel heap is a region of memory that can be dynamically allocated and freed as needed. When code calls kmalloc(256), the heap allocator searches through a list of free memory blocks to find one at least 256 bytes large. If it finds a 1024-byte free block, it splits it into a 256-byte allocated block and a 768-byte free block, adding the remainder back to the free list. When memory is freed with kfree(), the allocator marks that block as available and checks if adjacent blocks are also free - if so, it merges them into a larger block to prevent fragmentation. This constant splitting and merging ensures memory can be efficiently reused throughout the kernel’s lifetime.

Allocation Strategy Deep Dive

Our implementation uses a first-fit approach with splitting to reduce fragmentation:

Before allocation of 100 bytes:
┌──────────────┬──────────────────────────────────┬───────────────────┐
│ Header │ Free Block (256 bytes) │ Header │
│ size=256 │ │ Next block... │
│ is_free=1 │ │ │
└──────────────┴──────────────────────────────────┴───────────────────┘
After allocation:
┌──────────────┬────────────────┬──────────────┬──────────────────────┐
│ Header │ Used Block │ Header │ Free Block │
│ size=100 │ (100 bytes) │ size=140 │ (140 bytes) │
│ is_free=0 │ │ is_free=1 │ │
└──────────────┴────────────────┴──────────────┴──────────────────────┘
New header created during split

Design Decision: Memory Allocation Algorithm

I implemented a first-fit algorithm with splitting and coalescing.

Alternatives considered:

  • Best-fit: Find the smallest block that fits the request
  • Worst-fit: Find the largest block to minimize fragmentation
  • Buddy system: Power-of-2 sizes with efficient coalescing
  • Slab allocator: Pre-allocated objects of common sizes

Tradeoffs:

  • First-fit: Good general performance, simple implementation
  • Best-fit: Reduces wasted space but slower search
  • Worst-fit: Can reduce fragmentation but wastes memory
  • Buddy: Excellent for power-of-2 allocations, but internal fragmentation
  • Slab: Excellent for fixed-size allocations, poor for variable sizes

I chose first-fit for its balance of simplicity and reasonable performance characteristics. Production kernels often use combinations of these approaches.

Troubleshooting Memory Issues

Common memory management bugs and how to diagnose them:

  1. Double free:
  • Symptom: Corruption in free list
  • Detection: Add validation in kfree() to check if block is already marked as free
  • Fix: Track allocations during debugging or use memory poisoning
  1. Use after free:
  • Symptom: Random crashes or data corruption
  • Detection: Fill freed memory with pattern (0xDE) for debug builds
  • Fix: Implement pointer nulling after free, heap validation routines
  1. Buffer overflow:
  • Symptom: Corruption of adjacent blocks’ headers
  • Detection: Add canary values after allocations
  • Fix: Add bounds checking, validate heap integrity periodically
  1. Memory leaks:
  • Symptom: Gradually running out of memory
  • Detection: Implement allocation tracking
  • Fix: Add debug-mode tracking of allocation sites

6. Exception Handling

Exception handling is essential for responding to hardware events and system calls.

Mental Model: Exception Handling as Emergency Response

Exception handling is like emergency response in a city. When an alarm sounds (exception occurs), specially trained teams (exception handlers) respond to specific types of emergencies, following prescribed protocols (vector table).

Exception Vector Tables

ARM AArch64 uses a vector table with 16 entries, 4 categories with 4 possible origins:

┌─────────────────────────────┬─────────────────────────────┐
│ Current EL, using SP0 │ Current EL, using current SP│
├─────────────────────────────┼─────────────────────────────┤
│ Synchronous │ Synchronous │
├─────────────────────────────┼─────────────────────────────┤
│ IRQ/vIRQ │ IRQ/vIRQ │
├─────────────────────────────┼─────────────────────────────┤
│ FIQ/vFIQ │ FIQ/vFIQ │
├─────────────────────────────┼─────────────────────────────┤
│ SError/vSError │ SError/vSError │
├─────────────────────────────┼─────────────────────────────┤
│ Lower EL, using AArch64 │ Lower EL, using AArch32 │
├─────────────────────────────┼─────────────────────────────┤
│ Synchronous │ Synchronous │
├─────────────────────────────┼─────────────────────────────┤
│ IRQ/vIRQ │ IRQ/vIRQ │
├─────────────────────────────┼─────────────────────────────┤
│ FIQ/vFIQ │ FIQ/vFIQ │
├─────────────────────────────┼─────────────────────────────┤
│ SError/vSError │ SError/vSError │
└─────────────────────────────┴─────────────────────────────┘

Let’s look at our implementation in exceptions_asm.S:

.section ".text.exceptions"
.align 11 // Align to 2KB (2^11)
.globl _exception_vector_table
_exception_vector_table:
// Handlers for exceptions from Current EL using SP0
.align 7 // Align each entry to 128 bytes (2^7)
b sync_handler_sp0 // Synchronous EL1t
.align 7
b irq_handler_sp0 // IRQ EL1t
.align 7
b fiq_handler_sp0 // FIQ EL1t
.align 7
b serror_handler_sp0 // SError EL1t
// Handlers for exceptions from Current EL using SPx (SP_EL1)
.align 7
b sync_handler_spx // Synchronous EL1h
.align 7
b irq_handler_spx // IRQ EL1h
.align 7
b fiq_handler_spx // FIQ EL1h
.align 7
b serror_handler_spx // SError EL1h
// Handlers for exceptions from Lower EL (EL0) using AArch64
.align 7
b sync_handler_el0_64 // Synchronous EL0 (64-bit)
.align 7
b irq_handler_el0_64 // IRQ EL0 (64-bit)
.align 7
b fiq_handler_el0_64 // FIQ EL0 (64-bit)
.align 7
b serror_handler_el0_64 // SError EL0 (64-bit)
// Handlers for exceptions from Lower EL (EL0) using AArch32 (Placeholder)
.align 7
b sync_handler_el0_32 // Synchronous EL0 (32-bit)
.align 7
b irq_handler_el0_32 // IRQ EL0 (32-bit)
.align 7
b fiq_handler_el0_32 // FIQ EL0 (32-bit)
.align 7
b serror_handler_el0_32 // SError EL0 (32-bit)
// Common entry point macro for saving context
// Assumes exception taken to EL1 using SP_EL1 (current SP)
.macro save_context
// Allocate space on the stack for GPRs (x0-x30), SPSR_EL1, ELR_EL1, SP_EL0
// 31 GPRs + 3 system regs = 34 registers * 8 bytes/reg = 272 bytes
// Align stack pointer to 16 bytes before pushing
sub sp, sp, #288 // Allocate space (272 + padding for alignment)
mov x0, sp // Copy SP to temporary register
and x0, x0, #-16 // Align the register to 16-byte boundary
mov sp, x0 // Update SP with aligned value
// Store GPRs x0-x30 (31 registers)
stp x0, x1, [sp, #16 * 0]
stp x2, x3, [sp, #16 * 1]
stp x4, x5, [sp, #16 * 2]
stp x6, x7, [sp, #16 * 3]
stp x8, x9, [sp, #16 * 4]
stp x10, x11, [sp, #16 * 5]
stp x12, x13, [sp, #16 * 6]
stp x14, x15, [sp, #16 * 7]
stp x16, x17, [sp, #16 * 8]
stp x18, x19, [sp, #16 * 9]
stp x20, x21, [sp, #16 * 10]
stp x22, x23, [sp, #16 * 11]
stp x24, x25, [sp, #16 * 12]
stp x26, x27, [sp, #16 * 13]
stp x28, x29, [sp, #16 * 14]
str x30, [sp, #16 * 15] // Store LR (x30)
// Store relevant system registers
mrs x0, spsr_el1
mrs x1, elr_el1
mrs x2, sp_el0
stp x0, x1, [sp, #16 * 15 + 8] // Store SPSR_EL1, ELR_EL1
str x2, [sp, #16 * 16 + 8] // Store SP_EL0
// Pass pointer to saved registers (current SP) to C handler in x0
mov x0, sp
.endm
// Common exit point macro for restoring context
.macro restore_context
// x0 might contain return value from C handler, but we don't use it here.
// Restore system registers first
ldp x0, x1, [sp, #16 * 15 + 8] // Load SPSR_EL1, ELR_EL1
ldr x2, [sp, #16 * 16 + 8] // Load SP_EL0
msr spsr_el1, x0
msr elr_el1, x1
msr sp_el0, x2
// Restore GPRs x0-x30
ldp x0, x1, [sp, #16 * 0]
ldp x2, x3, [sp, #16 * 1]
ldp x4, x5, [sp, #16 * 2]
ldp x6, x7, [sp, #16 * 3]
ldp x8, x9, [sp, #16 * 4]
ldp x10, x11, [sp, #16 * 5]
ldp x12, x13, [sp, #16 * 6]
ldp x14, x15, [sp, #16 * 7]
ldp x16, x17, [sp, #16 * 8]
ldp x18, x19, [sp, #16 * 9]
ldp x20, x21, [sp, #16 * 10]
ldp x22, x23, [sp, #16 * 11]
ldp x24, x25, [sp, #16 * 12]
ldp x26, x27, [sp, #16 * 13]
ldp x28, x29, [sp, #16 * 14]
ldr x30, [sp, #16 * 15] // Restore LR (x30)
// Deallocate stack space used for saving context
add sp, sp, #288 // Must match allocation size
eret // Return from exception
.endm
// --- Specific Handlers ---
// These handlers assume the exception was taken to EL1 using SP_EL1
sync_handler_common:
save_context
bl handle_sync_exception // Call C handler
restore_context
irq_handler_common:
save_context
bl handle_irq // Call C handler
restore_context
fiq_handler_common:
save_context
bl handle_fiq // Call C handler
restore_context
serror_handler_common:
save_context
bl handle_serror // Call C handler
restore_context
// --- Vector Table Entry Points ---
// Route different vector entries to appropriate common handlers
// For simplicity, this example routes most EL1/EL0_64 entries to common handlers.
// A real OS might need more differentiation based on SP0/SPx.
sync_handler_sp0:
sync_handler_spx:
sync_handler_el0_64:
b sync_handler_common
irq_handler_sp0:
irq_handler_spx:
irq_handler_el0_64:
b irq_handler_common
fiq_handler_sp0:
fiq_handler_spx:
fiq_handler_el0_64:
b fiq_handler_common
serror_handler_sp0:
serror_handler_spx:
serror_handler_el0_64:
b serror_handler_common
// Placeholder handlers for AArch32 (just loop indefinitely)
sync_handler_el0_32:
irq_handler_el0_32:
fiq_handler_el0_32:
serror_handler_el0_32:
wfi
b .
// --- External C function declarations ---
.globl handle_sync_exception
.globl handle_irq
.globl handle_fiq
.globl handle_serror

This code:

  1. Creates a properly aligned vector table (2KB alignment, 128-byte entries)
  2. Provides handlers for all 16 exception types
  3. Uses macros to save and restore all registers (context)
  4. Calls C functions for actual exception handling

Under the Hood: The Exception Mechanism

When an exception occurs:

  1. CPU saves critical state (PC, PSTATE) in dedicated registers
  2. CPU switches to appropriate exception level if needed
  3. PC is set to the address in the vector table corresponding to the exception type
  4. Exception handler code executes
  5. When complete, the handler executes eret to return to the interrupted code
Normal Execution Exception Occurs Handler Execution
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│ Instruction 1 │ │ Save state: │ │ Exception │
│ Instruction 2 │ │ - ELR_ELx = PC │ │ handler code │
│ Instruction 3 │───Exception──▶│ - SPSR_ELx = PSR│──Jump to─▶│ Analyze cause │
│ Instruction 4 │ │ - Switch SP │ handler │ Handle exception│
│ ... │ │ │ │ Restore state │
└─────────────────┘ └─────────────────┘ └─────────────────┘
▲ │
│ │
└────────────────────────────Return (eret)───────────────────────┘

Design Decision: Context Saving Strategy

My implementation saves all registers for simplicity.

Alternative: Partial register saving based on the AArch64 procedure call standard.

Tradeoffs:

  • Full saving: Simpler, consistent, but more overhead for simple exceptions
  • Partial saving: More efficient but requires understanding which registers must be preserved

I chose full context saving for educational clarity and robustness, though production kernels often optimize this.

Saving and Restoring Context

When an exception occurs, we need to preserve the CPU state. Our save_context macro:

  1. Allocates stack space for saving registers
  2. Saves all 30 general-purpose registers
  3. Saves ELR_EL1 (return address), SPSR_EL1 (saved program status), and SP_EL0

This allows the exception handler to use the registers without disrupting the interrupted code.

Handling Different Types of Exceptions

The C handler in exceptions.c processes exceptions based on their type:

#include <stdint.h>
#include <stdbool.h>
#include "exceptions/exceptions.h"
#include "lib/stdio.h"
// Helper function to read ESR_EL1
static inline uint64_t read_esr_el1(void) {
uint64_t val;
asm volatile("mrs %0, esr_el1" : "=r" (val));
return val;
}
// Helper function to read ELR_EL1
static inline uint64_t read_elr_el1(void) {
uint64_t val;
asm volatile("mrs %0, elr_el1" : "=r" (val));
return val;
}
// Helper function to read FAR_EL1
static inline uint64_t read_far_el1(void) {
uint64_t val;
asm volatile("mrs %0, far_el1" : "=r" (val));
return val;
}
// Simple panic function
void panic(const char *message) {
kprintf("\nKERNEL PANIC: %s\n", message);
kprintf("System halted.\n");
// Disable interrupts here
asm volatile("msr daifset, #0xf");
while (1) {
asm volatile("wfi"); // Wait for interrupt (effectively halt)
}
}
// Function to print register context
void print_registers(const saved_registers_t *context) {
kprintf("Saved Registers:\n");
for (int i = 0; i < 31; i += 2) {
kprintf(" x%-2d: %016llx x%-2d: %016llx\n",
i, context->regs[i], i + 1, (i + 1 < 31)? context->regs[i + 1] : 0);
}
kprintf(" SPSR_EL1: %016llx\n", context->spsr_el1);
kprintf(" ELR_EL1: %016llx\n", context->elr_el1);
kprintf(" SP_EL0: %016llx\n", context->sp_el0);
}
// --- Exception Handlers ---
// Called by assembly wrapper for synchronous exceptions
void handle_sync_exception(saved_registers_t *context) {
uint64_t esr = read_esr_el1();
uint64_t elr = context->elr_el1; // Use saved ELR
uint64_t far = read_far_el1(); // Fault Address Register
uint32_t ec = (esr >> 26) & 0x3F; // Extract Exception Class (bits 31:26)
uint32_t iss = esr & 0x1FFFFFF; // Extract Instruction Specific Syndrome (bits 24:0)
kprintf("\n--- Synchronous Exception Taken ---\n");
kprintf(" ESR_EL1: %016llx (EC: 0x%x, ISS: 0x%x)\n", esr, ec, iss);
kprintf(" ELR_EL1: %016llx (Return Address)\n", elr);
const char* ec_str = "Unknown";
bool far_valid = false;
switch (ec) {
case 0b000000: ec_str = "Unknown reason"; break;
case 0b000001: ec_str = "Trapped WFI or WFE"; break;
//... other EC values for MCR/MRC, MCRR/MRRC, LDC/STC etc. (AArch32 related)
case 0b001110: ec_str = "Illegal Execution State"; break;
case 0b010001: ec_str = "SVC instruction execution in AArch32 state"; break;
case 0b010101: ec_str = "SVC instruction execution in AArch64 state"; break;
case 0b011000: ec_str = "Trapped MSR, MRS or System instruction execution in AArch64 state"; break;
case 0b011001: ec_str = "Access to SVE functionality trapped"; break; // Added in ARMv8.2
case 0b100000: ec_str = "Instruction Abort from a lower Exception level (AArch32)"; far_valid = true; break;
case 0b100001: ec_str = "Instruction Abort from a lower Exception level (AArch64)"; far_valid = true; break;
case 0b100010: ec_str = "PC alignment fault exception"; break;
case 0b100100: ec_str = "Data Abort from a lower Exception level (AArch32)"; far_valid = true; break;
case 0b100101: ec_str = "Data Abort from a lower Exception level (AArch64)"; far_valid = true; break;
case 0b100110: ec_str = "SP alignment fault exception"; break;
case 0b101000: ec_str = "Trapped floating-point exception (AArch32)"; break;
case 0b101100: ec_str = "Trapped floating-point exception (AArch64)"; break;
case 0b110000: ec_str = "SError interrupt"; break;
case 0b110001: ec_str = "Breakpoint exception from a lower Exception level (AArch32)"; break;
case 0b110010: ec_str = "Breakpoint exception from a lower Exception level (AArch64)"; break;
case 0b110100: ec_str = "Step exception from a lower Exception level (AArch32)"; break;
case 0b110101: ec_str = "Step exception from a lower Exception level (AArch64)"; break;
case 0b111000: ec_str = "Watchpoint exception from a lower Exception level (AArch32)"; break;
case 0b111001: ec_str = "Watchpoint exception from a lower Exception level (AArch64)"; break;
case 0b111100: ec_str = "BRK instruction execution in AArch64 state"; break;
// Exceptions from current EL
case 0b100011: ec_str = "Instruction Abort from current EL"; far_valid = true; break;
case 0b100111: ec_str = "Data Abort from current EL"; far_valid = true; break;
default: ec_str = "Unhandled Exception Class"; break;
}
kprintf(" Type: %s\n", ec_str);
if (far_valid) {
kprintf(" FAR_EL1: %016llx (Faulting Virtual Address)\n", far);
}
print_registers(context);
kprintf("-------------------------------------\n");
// Handle specific exceptions or panic
if (ec == 0b111100) { // BRK instruction
kprintf("BRK instruction encountered. Continuing execution.\n");
// Advance ELR_EL1 past the BRK instruction (assuming BRK is 4 bytes)
context->elr_el1 += 4;
// Return normally via restore_context -> eret
} else if (ec == 0b010101) { // SVC instruction
uint16_t svc_imm = iss & 0xFFFF; // Extract immediate value from ISS
kprintf("SVC instruction encountered (Imm: 0x%x). Implement SVC handler.\n", svc_imm);
// Handle the system call based on svc_imm and registers x0-x7 in context
// For now, just advance ELR and return.
context->elr_el1 += 4;
}
else {
// For most other synchronous exceptions, panic.
panic("Unhandled synchronous exception");
}
}
// Called by assembly wrapper for IRQ exceptions
void handle_irq(saved_registers_t *context) {
kprintf("\n--- IRQ Received ---\n");
// TODO: Interact with the Generic Interrupt Controller (GIC)
// 1. Read Interrupt Acknowledge Register (IAR) from CPU interface (GICC_IAR)
// to get the interrupt ID and acknowledge the interrupt.
// 2. Dispatch to the appropriate driver/handler based on the ID.
// 3. Write to End Of Interrupt Register (EOIR) (GICC_EOIR) to signal completion.
kprintf(" (No GIC driver implemented yet)\n");
print_registers(context);
kprintf("--------------------\n");
// For now, just return via restore_context -> eret
}
// Placeholder for FIQ
void handle_fiq(saved_registers_t *context) {
kprintf("\n--- FIQ Received ---\n");
print_registers(context);
panic("FIQ handling not implemented");
}
// Placeholder for SError
void handle_serror(saved_registers_t *context) {
uint64_t esr = read_esr_el1();
kprintf("\n--- SError Received ---\n");
kprintf(" ESR_EL1: %016llx\n", esr);
print_registers(context);
panic("SError handling not implemented");
}

This code:

  1. Reads exception information from system registers
  2. Decodes the exception type using the Exception Class (EC) field from ESR_EL1
  3. Handles specific types like SVC (system calls), BRK instructions, and data aborts
  4. Provides detailed diagnostics by printing register contents

Debugging Exception Handlers

Tips for debugging exception-related issues:

  1. System appears to hang:
  • Check that the vector table is properly registered using vbar_el1
  • Verify that exception handlers eventually return (check for missing eret)
  • Use simple, direct UART output before the register-saving code
  1. Corrupted state after exception:
  • Validate register save/restore sequence matches in pairs
  • Check stack alignment (must be 16-byte aligned for AArch64)
  • Verify SP adjustment matches between save and restore
  1. Recursive exceptions:
  • Implement guard against reentrant exceptions using a simple counter
  • Add debug output showing exception nesting level
  • Check for exceptions during context saving/restoring

Implementing System Calls

System calls allow user programs to request services from the kernel. They’re implemented using the SVC instruction, which generates a synchronous exception.

The full flow is:

  1. User code executes svc #N with a number indicating the desired function
  2. CPU takes a synchronous exception to EL1
  3. Our exception handler decodes the SVC number
  4. The appropriate system call handler is invoked
  5. Results are returned to the user program

Let’s implement I/O functions to interact with our OS.

7. UART and I/O

The Universal Asynchronous Receiver/Transmitter (UART) provides serial communication for console I/O.

How UART Serial Communication Works

UART (Universal Asynchronous Receiver/Transmitter) is one of the simplest ways computers communicate - it sends data one bit at a time over a single wire. To send a byte like ‘A’ (0x41), the UART hardware breaks it into individual bits and transmits them at a predetermined rate (like 115200 bits per second). Before each byte, it sends a “start bit” to signal incoming data, then the 8 data bits, and finally a “stop bit” to mark the end. The receiving UART samples the line at the same rate, reconstructing bytes from the bit stream. This simplicity makes UART perfect for console output and debugging, though it can be limited by its slow speed and lack of error correction beyond basic parity checks.

The PL011 UART Controller

The QEMU virt platform uses ARM’s PL011 UART at address 0x09000000.

Under the Hood: UART Registers

The PL011 UART is a memory-mapped device with these key registers:

┌──────────────────────────────────────────────────────────────┐
│ PL011 UART Registers │
├─────────────┬────────────────────┬───────────────────────────┤
│ Offset │ Register │ Purpose │
├─────────────┼────────────────────┼───────────────────────────┤
│ 0x000 │ UART_DR │ Data Register (read/write) │
│ 0x018 │ UART_FR │ Flag Register │
│ 0x024 │ UART_IBRD │ Integer Baud Rate Divisor │
│ 0x028 │ UART_FBRD │ Fractional Baud Rate │
│ 0x02C │ UART_LCRH │ Line Control Register │
│ 0x030 │ UART_CR │ Control Register │
└─────────────┴────────────────────┴───────────────────────────┘

When writing to the UART:

  1. Check UART_FR to see if the transmit FIFO is full
  2. Write the character to UART_DR when there’s space
  3. The UART hardware handles sending the bit sequence

When reading from the UART:

  1. Check UART_FR to see if the receive FIFO has data
  2. Read the character from UART_DR if data is available

The implementation is in uart.c:

#include "lib/uart.h"
#include <stdint.h>
// QEMU virt PL011 UART registers
#define UART_BASE 0x09000000
#define UART_DR ((volatile uint32_t*)(UART_BASE + 0x00))
#define UART_FR ((volatile uint32_t*)(UART_BASE + 0x18))
#define UART_IBRD ((volatile uint32_t*)(UART_BASE + 0x24))
#define UART_FBRD ((volatile uint32_t*)(UART_BASE + 0x28))
#define UART_LCRH ((volatile uint32_t*)(UART_BASE + 0x2C))
#define UART_CR ((volatile uint32_t*)(UART_BASE + 0x30))
#define UART_IMSC ((volatile uint32_t*)(UART_BASE + 0x38))
// Flag register bits
#define UART_FR_RXFE 0x10 // Receive FIFO empty
#define UART_FR_TXFF 0x20 // Transmit FIFO full
// Line control register bits
#define UART_LCRH_FEN 0x10 // Enable FIFOs
#define UART_LCRH_WLEN_8 0x60 // 8 bits word length
// Control register bits
#define UART_CR_UARTEN 0x01 // UART enable
#define UART_CR_TXE 0x100 // Transmit enable
#define UART_CR_RXE 0x200 // Receive enable
void uart_init(void) {
// Disable UART while configuring
*UART_CR = 0;
// Configure baud rate: 115200 baud
// Assuming 48MHz clock, divisor = 48000000/(16*115200) = 26.041666...
// Integer part = 26
// Fractional part = 0.041666... * 64 = 2.66... ≈ 3
*UART_IBRD = 26;
*UART_FBRD = 3;
// Configure line control: 8 bits, no parity, 1 stop bit, FIFOs enabled
*UART_LCRH = UART_LCRH_WLEN_8 | UART_LCRH_FEN;
// Mask all interrupts initially
*UART_IMSC = 0;
// Enable UART, transmit and receive
*UART_CR = UART_CR_UARTEN | UART_CR_TXE | UART_CR_RXE;
}
void uart_putc(char c) {
// Wait for FIFO to have space
while (*UART_FR & UART_FR_TXFF);
// Send character
*UART_DR = c;
// If it's a newline, also send a carriage return
if (c == '\n') {
uart_putc('\r');
}
}
char uart_getc(void) {
// If receive FIFO is empty, return 0
if (*UART_FR & UART_FR_RXFE) {
return 0;
}
// Read and return character
return *UART_DR;
}
int uart_is_data_available(void) {
// Check if receive FIFO is not empty
return !(*UART_FR & UART_FR_RXFE);
}
### Design Decision: Polling vs. Interrupt-Driven I/O
Our implementation uses polling for simplicity.
**Alternative**: Interrupt-driven I/O would use IRQ handlers to process UART events.
**Tradeoffs**:
- Polling: Simple implementation, but wastes CPU cycles when waiting
- Interrupt-driven: More efficient CPU usage, but more complex to implement
> I chose polling for educational clarity, though a production OS would typically use interrupts for better performance.
### Implementing Standard I/O Functions
Now let's implement a basic printf-like function in `stdio.c`:
```c
#include <stdarg.h>
#include <stddef.h>
#include <stdint.h>
#include <stdbool.h>
#include "lib/stdio.h"
#include "lib/uart.h"
#include "lib/string.h"
// Buffer sizes
#define PRINTF_BUFFER_SIZE 1024
#define MAX_INT_DIGITS 21 // For 64-bit integer
// Basic kprintf implementation
int kprintf(const char *format, ...) {
char buffer[PRINTF_BUFFER_SIZE];
char *buf_ptr = buffer;
va_list args;
va_start(args, format);
while (*format != '\0' && (size_t)(buf_ptr - buffer) < PRINTF_BUFFER_SIZE - 1) {
if (*format == '%') {
format++;
// Check for length modifiers
bool is_long = false;
bool is_longlong = false;
if (*format == 'l') {
is_long = true;
format++;
if (*format == 'l') {
is_longlong = true;
is_long = false;
format++;
}
}
// Process format specifier
switch (*format) {
case 'c':
*buf_ptr++ = (char)va_arg(args, int);
break;
case 's': {
const char *str = va_arg(args, const char*);
if (str == NULL) str = "(null)";
size_t len = strlen(str);
if (len > PRINTF_BUFFER_SIZE - (buf_ptr - buffer) - 1)
len = PRINTF_BUFFER_SIZE - (buf_ptr - buffer) - 1;
memcpy(buf_ptr, str, len);
buf_ptr += len;
break;
}
case 'd':
case 'i': {
int64_t value;
if (is_longlong)
value = va_arg(args, int64_t);
else if (is_long)
value = va_arg(args, long);
else
value = va_arg(args, int);
// Convert to string using helper function
char num_buf[MAX_INT_DIGITS];
char *num_ptr = num_buf + MAX_INT_DIGITS - 1;
*num_ptr = '\0';
bool is_negative = value < 0;
uint64_t abs_value = is_negative ? -value : value;
do {
*--num_ptr = '0' + (abs_value % 10);
abs_value /= 10;
} while (abs_value > 0);
if (is_negative) {
*--num_ptr = '-';
}
size_t len = strlen(num_ptr);
if (len > PRINTF_BUFFER_SIZE - (buf_ptr - buffer) - 1)
len = PRINTF_BUFFER_SIZE - (buf_ptr - buffer) - 1;
memcpy(buf_ptr, num_ptr, len);
buf_ptr += len;
break;
}
case 'u': {
uint64_t value;
if (is_longlong)
value = va_arg(args, uint64_t);
else if (is_long)
value = va_arg(args, unsigned long);
else
value = va_arg(args, unsigned int);
// Convert to string
char num_buf[MAX_INT_DIGITS];
char *num_ptr = num_buf + MAX_INT_DIGITS - 1;
*num_ptr = '\0';
do {
*--num_ptr = '0' + (value % 10);
value /= 10;
} while (value > 0);
size_t len = strlen(num_ptr);
if (len > PRINTF_BUFFER_SIZE - (buf_ptr - buffer) - 1)
len = PRINTF_BUFFER_SIZE - (buf_ptr - buffer) - 1;
memcpy(buf_ptr, num_ptr, len);
buf_ptr += len;
break;
}
case 'x':
case 'X': {
uint64_t value;
if (is_longlong)
value = va_arg(args, uint64_t);
else if (is_long)
value = va_arg(args, unsigned long);
else
value = va_arg(args, unsigned int);
// Convert to hex string
char num_buf[MAX_INT_DIGITS];
char *num_ptr = num_buf + MAX_INT_DIGITS - 1;
*num_ptr = '\0';
const char *hex_chars = (*format == 'x') ? "0123456789abcdef" : "0123456789ABCDEF";
do {
*--num_ptr = hex_chars[value & 0xF];
value >>= 4;
} while (value > 0);
size_t len = strlen(num_ptr);
if (len > PRINTF_BUFFER_SIZE - (buf_ptr - buffer) - 1)
len = PRINTF_BUFFER_SIZE - (buf_ptr - buffer) - 1;
memcpy(buf_ptr, num_ptr, len);
buf_ptr += len;
break;
}
case 'p': {
void *ptr = va_arg(args, void*);
// Print "0x" prefix
if (buf_ptr + 2 < buffer + PRINTF_BUFFER_SIZE - 1) {
*buf_ptr++ = '0';
*buf_ptr++ = 'x';
}
// Convert to hex string
uint64_t value = (uint64_t)ptr;
char num_buf[MAX_INT_DIGITS];
char *num_ptr = num_buf + MAX_INT_DIGITS - 1;
*num_ptr = '\0';
// Always print full pointer width (16 hex digits for 64-bit)
int digit_count = 0;
do {
*--num_ptr = "0123456789abcdef"[value & 0xF];
value >>= 4;
digit_count++;
} while (value > 0 || digit_count < 16);
size_t len = strlen(num_ptr);
if (len > PRINTF_BUFFER_SIZE - (buf_ptr - buffer) - 1)
len = PRINTF_BUFFER_SIZE - (buf_ptr - buffer) - 1;
memcpy(buf_ptr, num_ptr, len);
buf_ptr += len;
break;
}
case '%':
*buf_ptr++ = '%';
break;
default:
// Unsupported format specifier, just copy it
*buf_ptr++ = '%';
*buf_ptr++ = *format;
break;
}
} else {
*buf_ptr++ = *format;
}
format++;
}
// Null-terminate the buffer
*buf_ptr = '\0';
// Output the buffer through UART
char *output_ptr = buffer;
while (*output_ptr) {
uart_putc(*output_ptr++);
}
va_end(args);
return buf_ptr - buffer;
}
// Get a character (non-blocking)
char kgetc(void) {
return uart_getc();
}
// Simple blocking character read
char kgetc_blocking(void) {
char c;
while ((c = uart_getc()) == 0) {
// Wait for character
asm volatile("yield");
}
// Convert CR (Enter key) to LF for processing
if (c == '\r') {
return '\n';
}
return c;
}

Implementing printf: Format String Parsing

Our kprintf implementation parses format strings character by character:

┌─────────────────────────────────────────────────────────────┐
│ Format String: "Value: %d, Hex: 0x%x" │
└───────────┬─────────────────────────────────────────────────┘
┌─────────────────────────────────────────────┐
│ Parser State Machine │
│ │
│ ┌───────┐ '%' ┌───────┐ │
│ ─────▶│ Normal│─────────────▶ │ Format│ │
│ │ Text │ │ Spec │ │
│ └───────┘ format └───┬───┘ │
│ ▲ parsed │ │
│ └────────────────────────┘ │
└─────────────────────────────────────────────┘
┌─────────────────────────────────────────────┐
│ Output: "Value: 42, Hex: 0x2A" │
└─────────────────────────────────────────────┘

This code:

  1. Implements kprintf for formatted console output
  2. Supports basic format specifiers: %c, %s, %d, %x, %p
  3. Uses varargs for handling variable argument lists
  4. Includes length modifiers for different integer sizes

Troubleshooting I/O Issues

Common UART/printf issues and how to diagnose them:

  1. No output from UART:
  • Check UART initialization (baud rate, format settings)
  • Verify UART registers are mapped to correct physical addresses
  • Try direct register writes to UART_DR as a test
  1. Garbled UART output:
  • Check baud rate settings match expected clock frequency
  • Verify line settings (parity, stop bits, word length)
  • Check for buffer overflows in printf implementation
  1. Printf format handling issues:
  • Add bounds checking to prevent buffer overflows
  • Check for missing null termination in generated strings
  • Verify format specifiers are fully supported

With I/O functions in place, let’s build a shell interface for user interaction.

8. Building a Basic Shell

A shell provides a command-line interface for interacting with the OS. Let’s implement one in shell.c:

How Command-Line Shells Work

A shell operates in a continuous Read-Evaluate-Print-Loop (REPL) cycle. First, it waits for and reads user input, typically one line at a time, echoing each character as you type and handling special keys like backspace. Once you press Enter, it parses the command line into arguments by splitting on spaces - the first token becomes the command name, and the rest become parameters. The shell then looks up this command in its command table and executes the corresponding function, passing the parsed arguments. After the command completes and prints its output, the shell loops back to wait for the next command. This cycle continues indefinitely, processing commands one at a time until the shell is terminated.

┌───────────────────────────────────────────────────────────────┐
│ Shell REPL Cycle │
│ │
│ ┌──────────┐ │
│ │ Read │ │
│ │ Input │ │
│ └────┬─────┘ │
│ │ │
│ ┌───────────┐ │ ┌────────────┐ ┌────────────────────┐ │
│ │ Wait for │◀─┘ │ Parse into │ │ Execute Command │ │
│ │ Command │ │ Arguments │───▶│ Corresponding to │ │
│ └───────────┘ └────────────┘ │ Command Table │ │
│ ▲ └─────────┬──────────┘ │
│ │ │ │
│ └────────────────────────────────────────┘ │
│ │
└───────────────────────────────────────────────────────────────┘

Here’s the implementation in shell.c:

#include <stdint.h>
#include <stddef.h>
#include <stdbool.h>
#include "kernel.h"
#include "shell/shell.h"
#include "lib/stdio.h"
#include "lib/string.h"
#include "lib/stdlib_stubs.h"
#include "memory/frame_alloc.h"
#include "memory/kheap.h"
#define MAX_CMD_LEN 128
#define MAX_ARGS 10
// --- Address Validation ---
// Needs access to PMM's knowledge of valid RAM regions.
// This is a simplified check against the highest known usable address.
bool is_address_valid(uint64_t addr, size_t len) {
uint64_t highest_ram = pmm_get_highest_usable_address();
// Basic check: ensure start and end are within the known usable range
// and don't wrap around. Assumes PMM_RAM_BASE is the lowest valid RAM addr.
if (addr < PMM_RAM_BASE || addr >= highest_ram) {
return false;
}
if (len > 0 && (addr + len - 1) >= highest_ram) {
// Check for overflow before checking end boundary
if (addr + len < addr) return false; // Overflow occurred
return false; // End address is out of bounds
}
return true;
}
// --- Shell Commands ---
void cmd_memdump(int argc, char **argv) {
if (argc < 2) {
kprintf("Usage: memdump <address> [length]\n");
return;
}
char *endptr;
uint64_t addr = simple_strtoull(argv[1], &endptr, 0);
if (*endptr != '\0') {
kprintf("Error: Invalid address format '%s'\n", argv[1]);
return;
}
size_t length = 256; // Default length
if (argc > 2) {
length = (size_t)simple_strtoull(argv[2], &endptr, 0);
if (*endptr != '\0') {
kprintf("Error: Invalid length format '%s'\n", argv[2]);
return;
}
}
if (length == 0) return;
// Validate the entire range
if (!is_address_valid(addr, length)) {
kprintf("Error: Address range 0x%llx - 0x%llx is not within valid RAM.\n",
addr, addr + length -1);
return;
}
kprintf("Memory dump from 0x%llx (length %zu):\n", addr, length);
volatile uint8_t *ptr = (volatile uint8_t *)addr;
for (size_t i = 0; i < length; i += 16) {
kprintf("%016llx: ", addr + i);
// Print hex bytes
for (size_t j = 0; j < 16; ++j) {
if (i + j < length) {
kprintf("%02x ", ptr[i + j]);
} else {
kprintf(" ");
}
if (j == 7) kprintf(" "); // Add extra space in the middle
}
kprintf(" |");
// Print ASCII chars
for (size_t j = 0; j < 16; ++j) {
if (i + j < length) {
char c = ptr[i + j];
kprintf("%c", (c >= 32 && c <= 126)? c : '.');
} else {
kprintf(" ");
}
}
kprintf("|\n");
}
}
void cmd_peek(int argc, char **argv) {
if (argc < 2) {
kprintf("Usage: peek <address> [size: b/h/w/d (default: d)]\n");
return;
}
char *endptr;
uint64_t addr = simple_strtoull(argv[1], &endptr, 0);
if (*endptr != '\0') {
kprintf("Error: Invalid address format '%s'\n", argv[1]);
return;
}
char size_char = 'd'; // Default to double word (64-bit)
size_t size_bytes = 8;
if (argc > 2) {
size_char = argv[2][0];
if (strlen(argv[2]) != 1) {
kprintf("Error: Invalid size format '%s'\n", argv[2]);
return;
}
}
switch (size_char) {
case 'b': size_bytes = 1; break; // Byte
case 'h': size_bytes = 2; break; // Half-word (16-bit)
case 'w': size_bytes = 4; break; // Word (32-bit)
case 'd': size_bytes = 8; break; // Double-word (64-bit)
default:
kprintf("Error: Invalid size '%c'. Use b, h, w, or d.\n", size_char);
return;
}
// Validate address for the specified size
if (!is_address_valid(addr, size_bytes)) {
kprintf("Error: Address 0x%llx is not within valid RAM for size %zu.\n", addr, size_bytes);
return;
}
// Ensure address alignment for larger types
if ((size_bytes > 1) && (addr % size_bytes != 0)) {
kprintf("Warning: Address 0x%llx is not aligned for size %zu.\n", addr, size_bytes);
// Proceeding might cause alignment fault depending on CPU config / EL
}
uint64_t value = 0;
volatile void *ptr = (volatile void *)addr;
kprintf("Peek at 0x%llx (size %zu): ", addr, size_bytes);
switch (size_bytes) {
case 1: value = *(volatile uint8_t*)ptr; kprintf("0x%02llx\n", value); break;
case 2: value = *(volatile uint16_t*)ptr; kprintf("0x%04llx\n", value); break;
case 4: value = *(volatile uint32_t*)ptr; kprintf("0x%08llx\n", value); break;
case 8: value = *(volatile uint64_t*)ptr; kprintf("0x%016llx\n", value); break;
}
}
void cmd_poke(int argc, char **argv) {
if (argc < 3) {
kprintf("Usage: poke <address> <value> [size: b/h/w/d (default: d)]\n");
return;
}
char *endptr;
uint64_t addr = simple_strtoull(argv[1], &endptr, 0);
if (*endptr != '\0') {
kprintf("Error: Invalid address format '%s'\n", argv[1]);
return;
}
uint64_t value = simple_strtoull(argv[2], &endptr, 0);
if (*endptr != '\0') {
kprintf("Error: Invalid value format '%s'\n", argv[2]);
return;
}
char size_char = 'd'; // Default to double word (64-bit)
size_t size_bytes = 8;
if (argc > 3) {
size_char = argv[3][0];
if (strlen(argv[3]) != 1) {
kprintf("Error: Invalid size format '%s'\n", argv[3]);
return;
}
}
switch (size_char) {
case 'b': size_bytes = 1; break; // Byte
case 'h': size_bytes = 2; break; // Half-word (16-bit)
case 'w': size_bytes = 4; break; // Word (32-bit)
case 'd': size_bytes = 8; break; // Double-word (64-bit)
default:
kprintf("Error: Invalid size '%c'. Use b, h, w, or d.\n", size_char);
return;
}
// Validate address for the specified size
if (!is_address_valid(addr, size_bytes)) {
kprintf("Error: Address 0x%llx is not within valid RAM for size %zu.\n", addr, size_bytes);
return;
}
// Ensure address alignment for larger types
if ((size_bytes > 1) && (addr % size_bytes != 0)) {
kprintf("Warning: Address 0x%llx is not aligned for size %zu.\n", addr, size_bytes);
// Proceeding might cause alignment fault depending on CPU config / EL
}
volatile void *ptr = (volatile void *)addr;
kprintf("Poke at 0x%llx (size %zu) with value 0x%llx\n", addr, size_bytes, value);
switch (size_bytes) {
case 1: *(volatile uint8_t*)ptr = (uint8_t)value; break;
case 2: *(volatile uint16_t*)ptr = (uint16_t)value; break;
case 4: *(volatile uint32_t*)ptr = (uint32_t)value; break;
case 8: *(volatile uint64_t*)ptr = (uint64_t)value; break;
}
}
void cmd_alloc(int argc, char **argv) {
if (argc < 2) {
kprintf("Usage: alloc <size>\n");
return;
}
char *endptr;
size_t size = (size_t)simple_strtoull(argv[1], &endptr, 0);
if (*endptr != '\0' || size == 0) {
kprintf("Error: Invalid size '%s'\n", argv[1]);
return;
}
void *ptr = kmalloc(size);
if (ptr) {
kprintf("Allocated %zu bytes at 0x%p\n", size, ptr);
} else {
kprintf("Allocation failed!\n");
}
}
void cmd_free(int argc, char **argv) {
if (argc < 2) {
kprintf("Usage: free <address>\n");
return;
}
char *endptr;
uint64_t addr = simple_strtoull(argv[1], &endptr, 0);
if (*endptr != '\0') {
kprintf("Error: Invalid address format '%s'\n", argv[1]);
return;
}
void *ptr = (void *)addr;
kprintf("Freeing memory at 0x%p\n", ptr);
kfree(ptr);
}
void cmd_help(int argc, char **argv) {
kprintf("Available commands:\n");
kprintf(" help - Display this help message\n");
kprintf(" memdump <addr> [len] - Dump memory contents (default len=256)\n");
kprintf(" peek <addr> [sz] - Read value from memory (sz=b/h/w/d, default=d)\n");
kprintf(" poke <addr> <val> [sz] - Write value to memory (sz=b/h/w/d, default=d)\n");
kprintf(" alloc <size> - Allocate memory of given size\n");
kprintf(" free <addr> - Free previously allocated memory\n");
kprintf(" pmm_info - Display Physical Memory Manager info\n");
}
void cmd_pmm_info(int argc, char **argv) {
kprintf("Physical Memory Manager Info:\n");
kprintf(" Total Usable Memory: %llu KB\n", pmm_get_total_memory() / 1024);
kprintf(" Free Memory: %llu KB\n", pmm_get_free_memory() / 1024);
kprintf(" Highest Usable Addr: 0x%llx\n", pmm_get_highest_usable_address());
}
// --- Shell Main Loop ---
// Table of command handlers
typedef struct {
const char *name;
void (*func)(int argc, char **argv);
} command_t;
// Static command array with initialized function pointers
static command_t commands[8];
// Runtime initialization of the command table to work around section initialization issues
static void init_command_table(void) {
static char help_cmd[] = "help";
static char memdump_cmd[] = "memdump";
static char peek_cmd[] = "peek";
static char poke_cmd[] = "poke";
static char alloc_cmd[] = "alloc";
static char free_cmd[] = "free";
static char pmm_info_cmd[] = "pmm_info";
commands[0].name = help_cmd;
commands[0].func = cmd_help;
commands[1].name = memdump_cmd;
commands[1].func = cmd_memdump;
commands[2].name = peek_cmd;
commands[2].func = cmd_peek;
commands[3].name = poke_cmd;
commands[3].func = cmd_poke;
commands[4].name = alloc_cmd;
commands[4].func = cmd_alloc;
commands[5].name = free_cmd;
commands[5].func = cmd_free;
commands[6].name = pmm_info_cmd;
commands[6].func = cmd_pmm_info;
// Sentinel
commands[7].name = NULL;
commands[7].func = NULL;
kprintf("Command table initialized:\n");
for (int i = 0; i < 7; i++) {
kprintf(" [%d] name at %p: '%s', len=%d\n",
i, commands[i].name, commands[i].name,
strlen(commands[i].name));
}
}
void shell_loop() {
char cmd_buffer[MAX_CMD_LEN];
char *argv[MAX_ARGS];
int argc;
kprintf("\nMeringueOS Shell\n");
kprintf("Type 'help' for available commands.\n");
// Initialize command table at runtime
init_command_table();
// Debug command table
kprintf("Command table at %p:\n", commands);
kprintf("Debug: .rodata section address range: %p to %p\n", &_rodata_start, &_rodata_end);
for (int i = 0; commands[i].name != NULL; i++) {
kprintf(" [%d] name at %p: '%s', func at %p\n",
i, commands[i].name, commands[i].name, commands[i].func);
}
while (1) {
kprintf("> ");
memset(cmd_buffer, 0, MAX_CMD_LEN);
int i = 0;
char c;
// Read command line
while (i < MAX_CMD_LEN - 1) {
c = kgetc_blocking(); // Use a blocking read
if (c == '\r' || c == '\n') {
kprintf("\n"); // Echo newline
break;
} else if (c == '\b' || c == 127) { // Handle backspace
if (i > 0) {
i--;
kprintf("\b \b"); // Erase character on screen
}
} else if (c >= 32 && c <= 126) { // Printable ASCII
cmd_buffer[i++] = c;
kprintf("%c", c); // Echo character
}
// Ignore other characters
}
cmd_buffer[i] = '\0'; // Null-terminate
if (i == 0) {
continue; // Empty command
}
// Parse command and arguments using strtok
argc = 0;
char *token = strtok(cmd_buffer, " ");
while (token != NULL && argc < MAX_ARGS) {
argv[argc++] = token;
token = strtok(NULL, " ");
}
if (argc == 0) {
continue; // Only whitespace
}
// Find and execute command
bool found = false;
// Debug info
kprintf("Command entered: '%s'\n", argv[0]);
for (int i = 0; commands[i].name != NULL; i++) {
kprintf("Comparing with command: '%s'\n", commands[i].name);
if (strcmp(argv[0], commands[i].name) == 0) {
kprintf("Match found! Executing...\n");
commands[i].func(argc, argv);
found = true;
break;
}
}
if (!found) {
kprintf("Unknown command: %s\n", argv[0]);
}
}
}

Command Parsing Algorithm

The shell parses command lines using this algorithm:

  1. Skip leading whitespace
  2. Mark the start of an argument
  3. Find the next whitespace or end of string
  4. Replace the whitespace with a null terminator
  5. Add the argument to the argv array
  6. Repeat until the end of the string
Input: " memdump 0x40100000 64 "
Step 1: "memdump 0x40100000 64 "
Step 2: argv[0] = "memdump 0x40100000 64 "
Step 3: Find next whitespace after "memdump"
Step 4: "memdump\0 0x40100000 64 "
Step 5: argv[0] = "memdump"
Repeat for remaining arguments...

Design Decision: Command Table Structure

I implement a command table with function pointers for extensibility.

Alternative: A giant switch statement or if-else chain.

Tradeoffs:

  • Command table: More modular, easier to extend, slightly more complex
  • Switch statement: Simpler, potentially more efficient, but harder to maintain

I chose the command table approach for its modularity and to demonstrate function pointer usage in C.

Implementing Memdump Command

The memdump command demonstrates several important OS concepts:

Command: memdump 0x40100000 64
┌───────────────────────────────────────────────────────────────┐
│ 0x40100000: 60 02 00 d4 00 00 00 00 e1 03 00 aa e2 03 01 aa │
│ 0x40100010: e3 03 02 aa e4 03 03 aa e5 03 04 aa e6 03 05 aa │
│ 0x40100020: e7 03 06 aa e8 03 07 aa e9 03 08 aa ea 03 09 aa │
│ 0x40100030: eb 03 0a aa ec 03 0b aa ed 03 0c aa ee 03 0d aa │
└───────────────────────────────────────────────────────────────┘

To implement this safely:

  1. Command parsing validates the address range is accessible
  2. Memory is accessed through volatile pointers to prevent optimization
  3. Both hex and ASCII representations are shown for clarity

Debugging Shell Issues

Common shell issues and how to diagnose them:

  1. Command parsing fails:
  • Add debug output showing tokenization results
  • Check for buffer overflows during parsing
  • Verify edge cases (empty input, excessive whitespace)
  1. Commands not found:
  • Check string comparison logic
  • Verify command table is properly initialized
  • Debug with explicit comparisons of each character
  1. Memory-related commands crash:
  • Implement robust address validation before access
  • Add range checking to prevent accessing unmapped regions
  • Use volatility qualifiers for register and device access

9. Standard Library Implementation

A standard library provides essential functions for memory and string operations.

Writing a LibC from Scratch

MeringueOS implements its own C standard library functions from scratch, providing the fundamental building blocks for string and memory operations. The memset and memcpy functions handle raw memory manipulation byte-by-byte - while production systems optimize with word-sized operations, our implementation prioritizes clarity and correctness. String functions like strlen count characters until hitting a null terminator, while strcmp compares strings character-by-character, carefully casting to unsigned to handle high-bit characters correctly. The strtok function tokenizes strings by maintaining static state between calls (making it non-reentrant), using helper functions strspn and strpbrk to skip delimiters and find token boundaries. These implementations form the foundation that the rest of the kernel relies on - from parsing shell commands to managing memory blocks.

String Library

Let’s look at string.c:

#include <stddef.h>
#include <stdint.h>
#include <stdbool.h>
#include "lib/string.h"
void* memset(void *s, int c, size_t n) {
unsigned char *p = (unsigned char *)s;
unsigned char uc = (unsigned char)c;
while (n-- > 0) {
*p++ = uc;
}
return s;
}
void* memcpy(void *dest, const void *src, size_t n) {
unsigned char *d = (unsigned char *)dest;
const unsigned char *s = (const unsigned char *)src;
while (n-- > 0) {
*d++ = *s++;
}
return dest;
}
size_t strlen(const char *s) {
size_t len = 0;
while (*s++) {
len++;
}
return len;
}
int strcmp(const char *s1, const char *s2) {
while (*s1 && (*s1 == *s2)) {
s1++;
s2++;
}
return *(const unsigned char*)s1 - *(const unsigned char*)s2;
}
int strncmp(const char *s1, const char *s2, size_t n) {
while (n > 0 && *s1 && (*s1 == *s2)) {
s1++;
s2++;
n--;
}
if (n == 0) {
return 0;
}
return *(const unsigned char*)s1 - *(const unsigned char*)s2;
}
// Note: strtok is not re-entrant due to the static pointer!
static char *strtok_last;
char* strtok(char *str, const char *delim) {
char *token;
if (str == NULL) {
str = strtok_last;
}
if (str == NULL) {
return NULL; // No more tokens
}
// Skip leading delimiters
str += strspn(str, delim);
if (*str == '\0') {
strtok_last = NULL;
return NULL;
}
// Find the end of the token
token = str;
str = strpbrk(token, delim);
if (str == NULL) {
// This token is the last one
strtok_last = NULL;
} else {
// Terminate the token and save the position for the next call
*str = '\0';
strtok_last = str + 1;
}
return token;
}
// Helper for strtok: length of initial segment consisting of accept characters
size_t strspn(const char *s, const char *accept) {
size_t count = 0;
const char *a;
bool found;
while (*s) {
found = false;
for (a = accept; *a; a++) {
if (*s == *a) {
found = true;
break;
}
}
if (!found) {
break;
}
count++;
s++;
}
return count;
}
// Helper for strtok: find first occurrence of reject character
char* strpbrk(const char *s, const char *reject) {
const char *r;
while (*s) {
for (r = reject; *r; r++) {
if (*s == *r) {
return (char *)s;
}
}
s++;
}
return NULL;
}

These functions provide basic string operations:

  • memset: Set memory to a value
  • memcpy: Copy memory regions
  • strlen: Get string length
  • strcmp/strncmp: Compare strings
  • strtok: Tokenize strings
  • strspn: Get length of substring with accepted characters
  • strpbrk: Find character from set in string

Design Decision: String.h Implementation

When implementing string functions, I chose for safety and simplicity.

Alternative: Highly optimized SIMD or word-by-word implementations.

Tradeoffs:

  • Simple byte-by-byte: Easier to understand, safer, but slower
  • Optimized word-by-word: Much faster but more complex and architecture-specific

I chose the simpler approach for educational purposes, but production kernels would use highly optimized implementations.

Memory Functions: The Foundation

Memory functions like memcpy and memset are some of the most frequently called functions in an OS:

┌───────────────────────────────────────────────────────────────┐
│ memcpy(dest, src, n) │
│ │
│ Source Memory Destination Memory │
│ ┌───┬───┬───┬───┬───┬───┐ ┌───┬───┬───┬───┬───┬───┐ │
│ │ A │ B │ C │ D │ E │ F │ Copy │ ? │ ? │ ? │ ? │ ? │ ? │ │
│ └───┴───┴───┴───┴───┴───┘ ───▶ └───┴───┴───┴───┴───┴───┘ │
│ │
│ After memcpy: │
│ ┌───┬───┬───┬───┬───┬───┐ ┌───┬───┬───┬───┬───┬───┐ │
│ │ A │ B │ C │ D │ E │ F │ │ A │ B │ C │ D │ E │ F │ │
│ └───┴───┴───┴───┴───┴───┘ └───┴───┴───┴───┴───┴───┘ │
└───────────────────────────────────────────────────────────────┘

Under the Hood: Optimizing Memory Operations

In production kernels, memory operations use several optimization techniques:

  1. Alignment handling:
  • Check source and destination alignment
  • Use aligned word-sized operations when possible
  • Fall back to byte operations for misaligned portions
  1. SIMD instructions:
  • Use vector operations (like NEON on ARM) for bulk transfers
  • Process 16 or 32 bytes per instruction
  1. Cache considerations:
  • Use prefetch instructions for large transfers
  • Consider cache line size for optimal performance

Stdlib Stubs

For completeness, we provide stubs for standard library functions in stdlib_stubs.c:

#include <lib/stdlib_stubs.h>
#include <lib/stdio.h>
// Simple implementation supporting base 10 and base 16 (0x prefix)
unsigned long long simple_strtoull(const char *nptr, char **endptr, int base) {
unsigned long long result = 0;
bool negative = false; // Ignored for unsigned, but good practice
const char *orig_nptr = nptr;
// Skip leading whitespace
while (*nptr == ' ' || *nptr == '\t' || *nptr == '\n' ||
*nptr == '\r' || *nptr == '\f' || *nptr == '\v') {
nptr++;
}
// Handle optional sign
if (*nptr == '+') {
nptr++;
} else if (*nptr == '-') {
negative = true; // Keep track even if ignored
nptr++;
}
// Determine base
if ((base == 0 || base == 16) && *nptr == '0' && (nptr[1] == 'x' || nptr[1] == 'X')) {
base = 16;
nptr += 2;
} else if (base == 0) {
base = 10;
}
if (base < 2 || base > 16) { // Simplified: only support up to base 16
if (endptr) *endptr = (char *)orig_nptr;
return 0; // Invalid base
}
unsigned long long cutoff = UINT64_MAX / base;
unsigned long long cutlim = UINT64_MAX % base;
while (true) {
int digit;
char c = *nptr;
if (c >= '0' && c <= '9') {
digit = c - '0';
} else if (c >= 'a' && c <= 'z') {
digit = c - 'a' + 10;
} else if (c >= 'A' && c <= 'Z') {
digit = c - 'A' + 10;
} else {
break; // Invalid character
}
if (digit >= base) {
break; // Invalid digit for base
}
// Basic overflow check
if (result > cutoff || (result == cutoff && (unsigned long long)digit > cutlim)) {
result = UINT64_MAX; // Indicate overflow
// Consume remaining valid digits for endptr
while (true) {
c = *(++nptr);
if (c >= '0' && c <= '9') digit = c - '0';
else if (c >= 'a' && c <= 'z') digit = c - 'a' + 10;
else if (c >= 'A' && c <= 'Z') digit = c - 'A' + 10;
else break;
if (digit >= base) break;
}
break; // Exit main loop after overflow
}
result = result * base + digit;
nptr++;
}
if (endptr) {
*endptr = (char *)nptr;
}
// Apply sign if needed (though result is unsigned)
// if (negative && result != UINT64_MAX) {
// // Standard strtoull doesn't negate, but strtoll would
// }
return result;
}
// Simple wrapper for unsigned long
unsigned long simple_strtoul(const char *nptr, char **endptr, int base) {
unsigned long long val = simple_strtoull(nptr, endptr, base);
// Check for overflow against ULONG_MAX if needed, though simple_strtoull already checks against UINT64_MAX
if (val > ULONG_MAX) {
return ULONG_MAX;
}
return (unsigned long)val;
}
// Basic abort function - enters infinite loop
void abort(void) {
kprintf("ABORT CALLED!\n");
// Disable interrupts?
while(1) {}
}

These stubs provide:

  • simple_strtoull: Convert string to unsigned long long
  • simple_strtoul: Convert string to unsigned long
  • abort: Terminate the program with an error

Troubleshooting Standard Library Issues

Common standard library bugs and how to diagnose them:

  1. String function crashes:
  • Add NULL pointer checks in all functions
  • Verify bounds checking for fixed-size buffers
  • Check for proper null termination
  1. Memory corruption with memcpy:
  • Verify source and destination don’t overlap, or use memmove
  • Check that size parameter is correct
  • Verify alignment requirements are met
  1. Integer conversion issues:
  • Add explicit overflow checking in conversion functions
  • Use appropriate integer types for the expected range
  • Implement bounds checking for all conversions

10. Advanced Topics

Now that we have the core OS working, let’s look at some advanced topics.

Text-Based UI

MeringueOS includes a simple text-based UI layer in tui.c:

#include <stddef.h>
#include <stdint.h>
#include <stdbool.h>
#include "ui/tui.h"
#include "lib/stdio.h"
// This is a placeholder implementation for the TUI
// In the full version, this would integrate with a framebuffer
int tui_init(void) {
kprintf("TUI: Initializing...\n");
kprintf("TUI: Using fallback console mode (no framebuffer)\n");
// Return success for now - we'll just use UART output
return 0;
}
void tui_write(const char *data, size_t len) {
// For now, just forward to UART output via kprintf
for (size_t i = 0; i < len; i++) {
kprintf("%c", data[i]);
}
}

The TUI module:

  1. Wraps UART functions for text output
  2. Could be extended to use a framebuffer in the future
  3. Provides a clean abstraction for console output

ANSI Terminal Control

While our current UI is UART-based, it can be extended with ANSI escape sequences:

┌───────────────────────────────────────────────────────────────┐
│ ANSI Terminal Control │
│ │
│ Code Effect │
├───────────────┬───────────────────────────────────────────────┤
│ \033[2J │ Clear the entire screen │
│ \033[H │ Move cursor to top-left corner │
│ \033[<n>A │ Move cursor up n lines │
│ \033[<n>B │ Move cursor down n lines │
│ \033[<n>C │ Move cursor forward n characters │
│ \033[<n>D │ Move cursor backward n characters │
│ \033[<n>;<m>H │ Move cursor to position (n,m) │
│ \033[0m │ Reset all attributes │
│ \033[1m │ Bold/bright │
│ \033[31m │ Red text │
│ \033[42m │ Green background │
└───────────────┴───────────────────────────────────────────────┘

Extending the System

MeringueOS can be extended in several ways:

Process Management

To implement processes, you’d need:

  1. Process Control Block (PCB): Data structure tracking process state
typedef struct {
uint64_t pid; // Process ID
process_state_t state; // Running, Ready, Blocked, etc.
saved_registers_t context; // Saved CPU context
void *stack_pointer; // Process stack
void *page_table; // Virtual memory mappings
struct pcb *next; // For scheduler list
} pcb_t;
  1. Context Switching: Mechanism to save/restore process state
┌─────────────┐ ┌────────────────┐ ┌─────────────┐
│ Process A │ │ Scheduler │ │ Process B │
│ Running │────▶│ 1. Save A state│────▶│ Restore B │
│ │ │ 2. Pick next │ │ state │
└─────────────┘ └────────────────┘ └─────────────┘
  1. Scheduler: Algorithm to determine which process runs next
  • Round-robin: Give each process equal time slices
  • Priority-based: Higher priority processes run first
  • Multi-level feedback queue: Adaptive scheduling based on behavior

Virtual Memory System

A virtual memory system would require:

  1. Page Tables: Data structures mapping virtual to physical addresses
┌───────────────────┐ ┌───────────────────────────────┐
│ Virtual Address │────▶│ Page Table │
│ 0xFFFF000000000000│ │ VA: 0xFFFF000000000000 │
└───────────────────┘ │ PA: 0x40200000 │
│ Permissions: Read/Write/Exec │
└───────────────────────────────┘
  1. MMU Configuration: Setting up translation control registers
// Enable MMU with the page tables we've created
void enable_mmu(uint64_t ttbr0_base) {
// Set Translation Table Base Register
asm volatile("msr ttbr0_el1, %0" : : "r" (ttbr0_base));
// Configure MMU control registers
uint64_t sctlr;
asm volatile("mrs %0, sctlr_el1" : "=r" (sctlr));
sctlr |= (1 << 0); // Enable MMU
asm volatile("msr sctlr_el1, %0" : : "r" (sctlr));
}
  1. TLB Management: Invalidating translations after page table changes
void invalidate_tlb_entry(uint64_t va) {
asm volatile("tlbi vaae1, %0" : : "r" (va));
asm volatile("dsb sy");
asm volatile("isb");
}

11. Conclusion

In this guide, we’ve built MeringueOS from scratch, implementing:

  • A boot process for AArch64
  • Memory management with physical frames and a kernel heap
  • Exception handling with a complete vector table
  • UART-based I/O
  • A command-line shell
  • Standard library functions
  • A text-based UI

You now have a working operating system for the ARM architecture and an understanding of how its components interact. This is just the beginning - there are many ways to extend and improve MeringueOS.

Building an OS from scratch teaches valuable lessons about:

  • Hardware-software interactions
  • Memory management principles
  • System architecture and design
  • Low-level optimization
  • Debugging complex systems

The most important takeaway is the depth of understanding you gain by implementing these fundamental components yourself.

Happy system programming!