Embedded Systems
C for Embedded Systems
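The Curiosity Mars rover's flight software is written largely in C and runs under the VxWorks RTOS on a RAD750 processor (5 W power budget). Every sensor register, every motor command, every telemetry bit is controlled through memory-mapped I/O. A missing volatile or a wrong bit mask in a $2.5 billion vehicle on Mars cannot be fixed by hand: the only repair path is a software update sent over a radio link across interplanetary distance.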
- **STM32 HAL Library**: the entire HAL uses the same pattern - volatile structs at fixed addresses, CMSIS types, BSRR for atomic GPIO
- **FreeRTOS portmacro.h**: critical sections via __disable_irq/__enable_irq or the BASEPRI register - inline asm for Cortex-M
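- **Arduino**: analogRead() and digitalWrite() are wrappers around memory-mapped registers of ATmega/SAMD. Direct register access is often an order of magnitude faster than these wrappers, because it skips the pin-to-port lookup and the safety checks they perform on every call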
Historical context
In 1978, Brian Kernighan and Dennis Ritchie published 'The C Programming Language' (K&R). The volatile keyword did not exist yet. It was added in the ANSI C89/C90 standard, largely for hardware access and asynchronously modified variables: programmers needed a way to tell the compiler 'do not optimize this access away'. Before volatile, developers used hacks: global variables, asm inserts, or compiler flags to disable all optimization. MISRA C (1998, Motor Industry Software Reliability Association) codified mandatory use of volatile for hardware registers in safety-critical systems. The Toyota unintended acceleration investigation (2010) made this lesson concrete for the broader industry.
volatile: disabling compiler optimization
Toyota, 2010. NHTSA investigated reports of sudden unintended acceleration. Later expert review of the electronic throttle control firmware reportedly found, among other defects, shared state variables that were not declared volatile. Without that qualifier a compiler may cache the value in a register instead of re-reading it from memory on each iteration. In embedded C, without volatile the compiler is free to assume a variable can only change from within program code. Peripheral registers change through hardware - the compiler has no way to know.
**volatile vs const:** the combination `volatile const uint32_t *REG` is a read-only register whose value may still change from hardware. Typical for status registers. **volatile does not guarantee atomicity** - for 64-bit variables on 32-bit MCUs, additional measures are needed.
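A minimal sketch of both patterns described above, written so that it also compiles on a desktop host. The names `data_ready`, `example_isr`, `poll_data_ready`, `status_busy`, and `STATUS_BUSY` are hypothetical and chosen only for illustration; on real hardware the ISR would be installed in the vector table and the status pointer would refer to a peripheral address.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit position of a BUSY flag in a status register. */
#define STATUS_BUSY (1u << 0)

/* Written by the interrupt handler, read by the main loop.
   volatile forces a real load on every access. */
static volatile bool data_ready = false;

/* Hypothetical ISR body: only records that the event happened. */
void example_isr(void)
{
    data_ready = true;
}

/* Polls the flag for at most max_spins iterations.
   Returns true as soon as the flag is observed set. */
bool poll_data_ready(uint32_t max_spins)
{
    while (max_spins-- > 0u) {
        if (data_ready) {
            return true;
        }
    }
    return false;
}

/* const volatile: this code may not write through the pointer,
   but the hardware may change the value between two reads. */
bool status_busy(const volatile uint32_t *status_reg)
{
    return (*status_reg & STATUS_BUSY) != 0u;
}
```

If `data_ready` were a plain `bool`, an optimizing compiler would be allowed to load it once before the loop in `poll_data_ready` and never look at memory again.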
Variable `bool flag = false` is written by an ISR (`flag = true`) and read in the main loop. Without volatile, what can go wrong?
Bitfields and bit operations: register control
Peripheral registers pack multiple fields into one 32-bit word: bit 0 is enable, bits 1-3 are mode, bits 4-7 are interrupt enables. C bitfields let code work with these as a struct; the compiler generates the required AND/OR/shift operations. The alternative - explicit bit mask operations - is more portable but more verbose.
**Bitfield portability problem:** the C standard does not guarantee the order of fields in memory - different compilers may lay out bits differently. For hardware registers, explicit bit mask operations with defined shifts are more reliable than bitfield structs. CMSIS (Arm Cortex) uses macros for this reason.
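The explicit-mask approach can be sketched as follows, assuming the illustrative register layout described earlier (bit 0 enable, bits 1-3 mode, bits 4-7 interrupt enables). The `CTRL_*` macro names and the helper functions are hypothetical; they imitate the `_Pos`/`_Msk` naming style used in CMSIS device headers but do not come from any real header.

```c
#include <stdint.h>

/* Hypothetical field definitions for an example control register. */
#define CTRL_ENABLE_Pos 0u
#define CTRL_ENABLE_Msk (0x1u << CTRL_ENABLE_Pos)
#define CTRL_MODE_Pos   1u
#define CTRL_MODE_Msk   (0x7u << CTRL_MODE_Pos)
#define CTRL_IRQEN_Pos  4u
#define CTRL_IRQEN_Msk  (0xFu << CTRL_IRQEN_Pos)

/* Returns reg with one field replaced; all other bits are preserved.
   Bits of value that do not fit in the field are discarded. */
static inline uint32_t field_insert(uint32_t reg, uint32_t mask,
                                    uint32_t pos, uint32_t value)
{
    return (reg & ~mask) | ((value << pos) & mask);
}

/* Returns the field, shifted down to bit 0. */
static inline uint32_t field_extract(uint32_t reg, uint32_t mask,
                                     uint32_t pos)
{
    return (reg & mask) >> pos;
}

/* Read-modify-write of the mode field: exactly one volatile read
   and one volatile write of the register. */
void ctrl_set_mode(volatile uint32_t *ctrl, uint32_t mode)
{
    uint32_t tmp = *ctrl;
    tmp = field_insert(tmp, CTRL_MODE_Msk, CTRL_MODE_Pos, mode);
    *ctrl = tmp;
}
```

Working on a local copy keeps the number of hardware accesses explicit. Note that the sequence is still a read-modify-write, so it is not atomic with respect to interrupts.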
Bits 4-7 of a register need to be set to 0b1010 without touching any other bits. Which operation is correct?
Inline Assembly: when C is not enough
Arm Cortex-M: the WFI (Wait For Interrupt) instruction halts the core until the next interrupt, which can cut current draw from tens of milliamps to well under a milliamp; the exact figures depend on the MCU and the sleep mode configured. C has no equivalent - inline assembly is required. The same applies to memory barriers, atomic operations, and reading processor-specific registers like the cycle counter.
**CMSIS intrinsics:** for Cortex-M, prefer __WFI(), __DSB(), __ISB() from cmsis_compiler.h - portable wrappers over inline asm that work with GCC, IAR, and Keil. Direct inline asm is needed only for non-standard instructions or fine-grained optimization.
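A small sketch of what such wrappers look like when written by hand, assuming a GCC or Clang toolchain (the `__asm__ volatile` syntax is a compiler extension, not standard C). The function and variable names are hypothetical. The WFI instruction is emitted only when the compiler's predefined macros indicate a Cortex-M target, so the same file also builds on a desktop host, where the function degrades to a plain compiler barrier.

```c
#include <stdint.h>

/* Compiler-only barrier: emits no instruction, but the "memory"
   clobber forbids the compiler from moving or caching memory
   accesses across this point. It is not a hardware barrier. */
static inline void compiler_barrier(void)
{
    __asm__ volatile ("" ::: "memory");
}

/* Sleep until the next interrupt on Cortex-M; no-op elsewhere. */
static inline void wait_for_interrupt(void)
{
#if defined(__ARM_ARCH_6M__) || defined(__ARM_ARCH_7M__) || defined(__ARM_ARCH_7EM__)
    __asm__ volatile ("wfi" ::: "memory");
#else
    compiler_barrier(); /* host build: nothing to sleep on */
#endif
}

static uint32_t payload;                /* ordinary data */
static volatile uint32_t payload_valid; /* flag checked by a consumer */

/* Writes the data first, then raises the flag. The barrier keeps
   the compiler from reordering the two stores. */
void publish(uint32_t value)
{
    payload = value;
    compiler_barrier();
    payload_valid = 1u;
}
```

In production Cortex-M code the CMSIS intrinsics remain the better choice. Ordering against other bus masters such as DMA needs a real hardware barrier like `__DSB()`, which this sketch does not provide.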
In inline assembly, `"memory"` in the clobber list means what?
Memory-Mapped I/O: peripherals as memory
Arm Cortex-M: all peripherals (GPIO, UART, SPI, timers, ADC) are accessible through the processor's address space. Instead of separate IN/OUT instructions (as in x86), ordinary load/store operations on specific addresses control hardware. STM32F4: GPIOA registers start at 0x40020000. The write `*(volatile uint32_t *)0x40020018 = 0x01` targets the BSRR register at offset 0x18 and sets pin PA0. The volatile in the cast matters: without it the compiler may drop or reorder the store.
**BSRR (Bit Set/Reset Register):** atomic GPIO control without read-modify-write. A write to BSRR is a single store that changes only the selected pins - there is no window where an interrupt can corrupt the state. This is critical in a multi-task RTOS environment. Changing one pin through ODR takes a read-modify-write sequence, which must be protected by a critical section.
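The register-struct pattern can be sketched like this. The field order follows the STM32F4 GPIO register map (BSRR at offset 0x18), but the type and function names (`gpio_regs_t`, `gpio_set_pin`, `gpio_clear_pin`) are hypothetical stand-ins for what a vendor header provides. The helpers take the port as a parameter so they can be exercised against an ordinary struct in RAM on a host; in that case BSRR simply holds the last value written, whereas real hardware acts on the write.

```c
#include <stddef.h>
#include <stdint.h>

/* GPIO port registers in STM32F4 order; every field is volatile. */
typedef struct {
    volatile uint32_t MODER;   /* 0x00 mode */
    volatile uint32_t OTYPER;  /* 0x04 output type */
    volatile uint32_t OSPEEDR; /* 0x08 output speed */
    volatile uint32_t PUPDR;   /* 0x0C pull-up/pull-down */
    volatile uint32_t IDR;     /* 0x10 input data */
    volatile uint32_t ODR;     /* 0x14 output data */
    volatile uint32_t BSRR;    /* 0x18 bit set/reset */
    volatile uint32_t LCKR;    /* 0x1C configuration lock */
    volatile uint32_t AFR[2];  /* 0x20 alternate function */
} gpio_regs_t;

/* Catch layout mistakes at compile time. */
_Static_assert(offsetof(gpio_regs_t, BSRR) == 0x18, "BSRR offset");

/* Only meaningful on the target; never dereference on a host. */
#define GPIOA_BASE 0x40020000u
#define GPIOA ((gpio_regs_t *)(uintptr_t)GPIOA_BASE)

/* Bits 0-15 of BSRR set the matching pin. One store, no read. */
static inline void gpio_set_pin(gpio_regs_t *port, unsigned pin)
{
    port->BSRR = 1u << pin;
}

/* Bits 16-31 of BSRR reset the matching pin. */
static inline void gpio_clear_pin(gpio_regs_t *port, unsigned pin)
{
    port->BSRR = 1u << (pin + 16u);
}
```

On the target, a call such as `gpio_set_pin(GPIOA, 0u)` performs the same store to 0x40020018 as the raw pointer write, with the offset checked by the compiler rather than typed by hand.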
**Common misconception:** volatile is sufficient for safe access to variables shared between ISR and main code.
In fact, volatile only stops the compiler from caching or eliding accesses; it does not guarantee atomicity. For multi-byte variables (int64 on a 32-bit MCU) or read-modify-write operations, atomic instructions or interrupt disabling are needed.
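On Cortex-M4, an aligned uint32_t write is a single STR instruction and is atomic. A uint64_t write takes two word stores (or one multi-word store that is not guaranteed to be atomic), so an interrupt that arrives in between sees an inconsistent intermediate state. Protect multi-word data with __disable_irq()/__enable_irq(). LDREX/STREX cover read-modify-write sequences on single values of up to 32 bits.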
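A sketch of the interrupt-masking approach for a 64-bit counter. The hooks `irq_save_disable` and `irq_restore` are hypothetical; here they only toggle a host-side stand-in variable so the example can run on a desktop. On a Cortex-M target with CMSIS they would typically be built from `__get_PRIMASK()`, `__disable_irq()`, and `__set_PRIMASK()`, which is an assumption about the port rather than part of this code.

```c
#include <stdint.h>

/* Host stand-in for the PRIMASK register: 1 = interrupts masked. */
static uint32_t sim_primask;

/* Masks interrupts and returns the previous state, so nested
   critical sections restore correctly. */
static inline uint32_t irq_save_disable(void)
{
    uint32_t old = sim_primask;
    sim_primask = 1u;
    return old;
}

/* Restores the interrupt state saved by irq_save_disable(). */
static inline void irq_restore(uint32_t old)
{
    sim_primask = old;
}

/* 64-bit counter shared between a timer ISR and the main code. */
static volatile uint64_t tick_count;

/* Hypothetical timer ISR body. */
void tick_isr(void)
{
    tick_count = tick_count + 1u;
}

/* Takes a consistent snapshot. On a 32-bit MCU the copy needs two
   word loads, so the ISR must not run between them. */
uint64_t tick_read(void)
{
    uint32_t state = irq_save_disable();
    uint64_t snapshot = tick_count;
    irq_restore(state);
    return snapshot;
}
```

Saving and restoring the previous mask, rather than unconditionally re-enabling interrupts, keeps `tick_read` safe to call from code that is already inside a critical section.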
Why are memory-mapped register structs declared with volatile fields rather than ordinary pointers?
Key ideas
- **volatile**: tells the compiler not to cache, merge, or remove accesses to the object - every read and write in the source is performed. Mandatory for hardware registers and ISR-shared variables
- **Bitfields**: convenient for register access, but bit ordering is not guaranteed by the C standard. In critical code, use explicit masks and shifts
- **Inline assembly**: for WFI, memory barriers, LDREX/STREX atomic instructions; on Cortex-M often replaced by CMSIS intrinsics
- **Memory-Mapped I/O**: all Cortex-M peripherals are volatile structs at fixed addresses. BSRR enables atomic GPIO without read-modify-write
- **volatile does not guarantee atomicity**: for multi-byte variables and RMW operations, disable interrupts or use atomic instructions
Questions for reflection
- In FreeRTOS, task A and task B (different priorities) both access one volatile uint32_t variable. Task A reads, task B writes. Is volatile sufficient for correct operation - or is something else needed?
- After setting a clock-enable bit in AHB1ENR, STM32 HAL code immediately reads AHB1ENR back so that the peripheral clock is active before the first peripheral register access. Why does the compiler need to be prevented from optimizing this read away?
- Why does BSRR provide atomic set/clear while ODR does not? What is the hardware mechanism that makes BSRR atomic?
Related lessons
- emb-03 - Basic C and memory management: the foundation that embedded-specific constructs build on
- emb-05 - RTOS requires volatile and atomic operations for correct inter-task communication
- rts-04 - RTOS architecture uses the same memory protection and IPC mechanisms introduced here
- emb-06 - Peripheral buses (SPI, I2C, UART) are controlled via memory-mapped registers
- devops-04 - Docker isolates processes at the OS level, loosely analogous to how an MPU isolates tasks in embedded systems
- os-01-intro