Embedded Systems
Introduction to Embedded Systems
STM32F4: ARM Cortex-M4 @ 168 MHz, a few dollars a chip. Raspberry Pi 5: ARM Cortex-A76 @ 2.4 GHz, 80 dollars. A Tesla ships with 70+ MCUs, and none of them run Linux. Why? When the brake pedal sends a signal to the ABS controller, the controller has 5 ms to respond. A general-purpose OS, with its non-deterministic scheduling, page faults, and runtime pauses, can't make that promise. An RTOS such as FreeRTOS running directly on the MCU can. This is the world where every byte of RAM is accounted for and every millisecond is predictable.
- **Tesla FSD**: custom HW4 silicon + 70+ MCUs per car; motor and brake controllers are STM32-class, not Linux; 5 ms guaranteed response time
- **Medical (IEC 62304, recognized by the FDA)**: pacemakers, insulin pumps - firmware with formal verification, certification, 10-year support lifetime; a bug can cost a life
- **Industrial IoT**: a 2-dollar STM32 controls a 50-million-dollar turbine; cost-to-reliability ratio is the main reason MCUs dominate industrial automation
- **ESP32 in smart home**: Wi-Fi + BLE + FreeRTOS for 3 dollars; Espressif has shipped more than a billion chips in total
Microcontroller: a computer on a chip
An STM32F103 costs 2 dollars. A Raspberry Pi 5 costs 80. The first sells in billions. The second in millions. Why? Because controlling a traffic light, a temperature sensor, or a door lock does not require a quad-core ARM running Linux. It requires a **microcontroller** (MCU): CPU + Flash + RAM + peripherals on one chip - power on, execute immediately.
**Microcontroller (MCU, Microcontroller Unit)** - an integrated circuit containing: **CPU** (processor, 8–400 MHz), **Flash** (non-volatile memory for the program, 16 KB – 2 MB), **SRAM** (RAM, 2 KB – 512 KB), **peripherals** (GPIO, UART, SPI, I2C, ADC, timers). Everything on one die - no external memory chips needed.
| Parameter | Microcontroller (MCU) | Processor (CPU) |
|---|---|---|
| Purpose | Device control | General-purpose computing |
| Clock speed | 8–400 MHz | 1–5 GHz |
| RAM | 2 KB – 512 KB | 8–128 GB |
| Cost | $0.10 – $15 | $50 – $1000 |
| Power consumption | μW – mW | 15 – 300 W |
| Peripherals | Built-in (GPIO, ADC, UART) | External (via chipset) |
| OS | None / RTOS | Linux / Windows |
**Scale**: 30+ billion MCUs produced in 2025 - more than 3 per person. Every car: 50-150 MCUs. Tesla Model 3: 70+. STM32 (STMicroelectronics), ESP32 (Espressif), AVR/ATmega (Microchip), NXP LPC and i.MX RT, TI MSP430 - just a few of the hundreds of MCU families on the market. An ESP32 with Wi-Fi and FreeRTOS costs 3 dollars.
Intel 8048 - Intel's first MCU
In 1976, Intel released the 8048 - its first microcontroller, and one of the first mass-produced chips to combine CPU, RAM, ROM, and I/O ports on a single die (Texas Instruments' TMS1000 had reached the market in 1974). It was used in the IBM PC keyboard. Its successor, the 8051 (1980), became so popular that its architecture is still used in billions of devices today.
What fundamentally distinguishes a microcontroller from an ordinary processor (CPU)?
Firmware: flashing the microcontroller
On a PC, a program runs from a file - there is an OS, a filesystem, a loader. On an MCU, none of that exists. The program is burned directly into Flash. It is called **firmware**. Power on - and on an STM32, for example, the Cortex-M core reads the initial stack pointer and the reset vector from the vector table at the start of Flash (0x08000000), then jumps straight to the reset handler. No OS to boot, no loader, no waiting. Pacemakers run on this principle.
**Firmware** - a program written to the non-volatile memory (Flash) of a microcontroller. Development workflow: write code in C/C++ on a regular PC → **cross-compile** for the target architecture (ARM, RISC-V) → get a binary file (.bin/.hex) → **flash** via a programmer (ST-Link, J-Link) or USB.
**Cross-compilation** is the key point. PC: x86_64. MCU: ARM Cortex-M. The PC's native gcc generates x86-64 machine code - wrong architecture. The solution is **arm-none-eabi-gcc**: it runs on x86 but generates ARM machine code. The same concept drives the Android NDK for Android apps and the iOS toolchain for iPhone - it applies whenever the target architecture differs from the host.
| Stage | Tool | Input → Output |
|---|---|---|
| Write code | VS Code + extensions | .c/.h files |
| Cross-compile | arm-none-eabi-gcc | .c → .o (object files) |
| Link | arm-none-eabi-ld | .o + .ld → .elf |
| Convert | arm-none-eabi-objcopy | .elf → .bin / .hex |
| Flash | ST-Link / J-Link / USB DFU | .bin → MCU Flash |
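To make "the target differs from the host" concrete, here is a minimal sketch that reports which architecture a file was compiled for. It relies on the architecture macros that GCC and Clang predefine for each target; the function name `target_arch` is made up for this illustration.

```c
/* Returns the architecture this file was compiled FOR, not the one the
 * compiler itself runs on. The macros are predefined by GCC and Clang. */
const char *target_arch(void)
{
#if defined(__arm__) || defined(__thumb__)
    return "ARM (32-bit)";   /* what arm-none-eabi-gcc produces */
#elif defined(__aarch64__)
    return "ARM (64-bit)";
#elif defined(__x86_64__) || defined(_M_X64)
    return "x86-64";         /* what the PC's native gcc produces */
#elif defined(__riscv)
    return "RISC-V";
#else
    return "unknown";
#endif
}
```

Built with the host compiler on a typical PC, the function returns "x86-64". Built from the same source with something like `arm-none-eabi-gcc -mcpu=cortex-m4 -mthumb -c`, it returns "ARM (32-bit)", and the resulting object file cannot run on the PC at all.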
Why does an MCU require cross-compilation instead of regular compilation?
Bare-metal: programming without an OS
On a PC, between code and hardware there are layers: OS, drivers, libc, scheduler. On an MCU, none of that exists. **Bare-metal**: code runs directly on the processor with no intermediaries. main() is called from Reset_Handler - the first code the CPU executes after power-on. Stack initialization, BSS section clearing, global variable setup - all of that is the developer's responsibility. Everything.
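The memory setup that happens before main() can be sketched as two small loops. This is an illustration under assumptions, not any vendor's actual startup file: the helper names are invented here, and the linker symbols in the trailing comment follow a common GCC convention that varies between toolchains.

```c
#include <stdint.h>

/* Copy initialized globals from their image in Flash into RAM (.data). */
void copy_data_section(const uint32_t *flash_src, uint32_t *ram_start,
                       const uint32_t *ram_end)
{
    uint32_t *dst = ram_start;
    while (dst < ram_end) {
        *dst++ = *flash_src++;
    }
}

/* Zero-fill uninitialized globals (.bss), as C requires for static storage. */
void zero_bss_section(uint32_t *bss_start, const uint32_t *bss_end)
{
    uint32_t *dst = bss_start;
    while (dst < bss_end) {
        *dst++ = 0u;
    }
}

/*
 * On a real Cortex-M target the reset handler would call these with section
 * boundaries supplied by the linker script (symbol names are an assumption):
 *
 *   extern uint32_t _sidata, _sdata, _edata, _sbss, _ebss;
 *   void Reset_Handler(void) {
 *       copy_data_section(&_sidata, &_sdata, &_edata);
 *       zero_bss_section(&_sbss, &_ebss);
 *       main();
 *       for (;;) { }   // main() is not expected to return
 *   }
 */
```

Skipping either step produces bugs that are hard to trace: globals with initializers start out holding garbage, or "zero-initialized" counters start at whatever happened to be in RAM at power-on.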
**Bare-metal programming** - writing firmware without an operating system. Basic structure:
1. peripheral initialization
2. **super-loop while(1)** - an infinite loop processing tasks sequentially
3. **interrupts** - hardware signals that switch the CPU to an event handler within microseconds
**volatile** - one of the most important words in embedded. Variables shared between an interrupt handler and the main code MUST be volatile. Without it, GCC or Clang may cache the value in a register and never re-read it from memory - an optimization that is correct when nothing else can change the variable, and catastrophic when an interrupt can. Note that volatile only forces the re-read; it does not make multi-step updates atomic.
| Aspect | Super-loop (polling) | Interrupts |
|---|---|---|
| Principle | Constantly check state | React to an event |
| Latency | Depends on loop length | Microseconds |
| Power | CPU always active | CPU can sleep between events |
| Complexity | Simple | Priorities, nesting, race conditions |
| Example | while(1) { if(button)... } | void EXTI0_IRQHandler(void) |
**Bare-metal rule:** the interrupt handler must be as SHORT as possible. Set a flag - return. Long processing goes in the super-loop. Otherwise interrupts of equal or lower priority are delayed for as long as the handler runs, and events can be missed.
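The flag hand-off between an interrupt handler and the super-loop can be sketched as follows. This is a minimal sketch: `poll_button_event` is a hypothetical helper, and a real handler would additionally clear the peripheral's interrupt-pending bit, which is hardware-specific and omitted here.

```c
#include <stdbool.h>

/* Shared between the interrupt handler and the main loop, so it must be
 * volatile: the compiler has to re-read it from memory on every access. */
static volatile bool button_pressed = false;

/* Interrupt handler: record the event and return immediately. */
void EXTI0_IRQHandler(void)
{
    button_pressed = true;
}

/* One pass of the super-loop: consume the flag and report whether
 * there was an event to handle. */
bool poll_button_event(void)
{
    if (!button_pressed) {
        return false;
    }
    button_pressed = false;   /* acknowledge before the slow work starts */
    /* ... long processing belongs here, outside the interrupt ... */
    return true;
}
```

In firmware, main() would call `poll_button_event()` inside `while (1)`. Without `volatile`, an optimizing compiler is allowed to read `button_pressed` once, keep it in a register, and turn the loop into one that never notices the button.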
Why is button_pressed declared as volatile?
RTOS: when the super-loop isn't enough
The super-loop works for simple devices. But an insulin pump must simultaneously: measure glucose every 5 minutes, drive the injection motor, accept Bluetooth commands, and flash an alarm LED at critical readings. Tasks conflict in time. A **scheduler** is needed - which is why RTOS exists, and why medical devices typically run on an RTOS such as FreeRTOS rather than on Linux.
**RTOS (Real-Time Operating System)** - a minimal operating system kernel for MCUs, built around predictable timing. Not to be confused with Linux! An RTOS kernel takes 4–20 KB of Flash and provides: **tasks** (lightweight threads), **priorities**, a preemptive **scheduler**, and **synchronization mechanisms** (semaphores, mutexes, queues). Popular ones: **FreeRTOS** (most widespread), Zephyr, ThreadX.
| Criterion | Bare-metal (super-loop) | RTOS |
|---|---|---|
| Tasks | One sequential | Multiple "parallel" |
| Priorities | Manual (order in loop) | Automatic (scheduler) |
| Response time | Depends on loop length | Guaranteed (deterministic) |
| RAM usage | Minimal | +4–20 KB kernel + stack per task |
| Debug complexity | Simple | Race conditions, deadlocks |
| When to use | Simple devices (LED, sensor) | Multiple concurrent tasks |
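The "automatic" priorities in the table come down to one rule: among the tasks that are ready, run the one with the highest priority. The sketch below models only that selection step. The `task_t` record and `pick_next_task` are hypothetical simplifications, not FreeRTOS data structures, and time-slicing between equal-priority tasks is ignored.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified task record for illustration (not a real RTOS control block). */
typedef struct {
    const char *name;
    unsigned priority;   /* higher number = higher priority, as in FreeRTOS */
    bool ready;
} task_t;

/* Return the highest-priority ready task, or NULL if no task is ready. */
const task_t *pick_next_task(const task_t *tasks, size_t count)
{
    const task_t *best = NULL;
    for (size_t i = 0; i < count; ++i) {
        if (!tasks[i].ready) {
            continue;    /* blocked or suspended tasks are never chosen */
        }
        if (best == NULL || tasks[i].priority > best->priority) {
            best = &tasks[i];
        }
    }
    return best;
}
```

A preemptive kernel re-evaluates this choice whenever a task becomes ready, so a priority-3 task that wakes up takes the CPU from a running priority-1 task right away instead of waiting for the loop to come around.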
**Real-time ≠ fast.** Real-time means a **guaranteed deadline**, not raw speed. An STM32 @ 168 MHz is slower than a Raspberry Pi @ 2.4 GHz. But correctly designed STM32 firmware can guarantee that the ABS task completes in no more than 5 ms, without exception. Stock Linux cannot guarantee that - scheduling latency, page faults, the I/O scheduler, and garbage-collected runtimes in user space can all add unbounded delays. That is exactly why Tesla uses Linux for infotainment and MCUs for motor and brake control.
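A deadline can be restated as a cycle budget: at 168 MHz, 5 ms is 168,000,000 × 0.005 = 840,000 clock cycles. The sketch below does that arithmetic; the function names are invented for this example, and the check deliberately ignores preemption by interrupts and higher-priority tasks, which a real timing analysis must include.

```c
#include <stdint.h>

/* Number of CPU clock cycles available before a deadline expires. */
uint64_t cycles_before_deadline(uint32_t cpu_hz, uint32_t deadline_us)
{
    return ((uint64_t)cpu_hz * deadline_us) / 1000000u;
}

/* Nonzero if a task's worst-case cycle count fits inside the deadline. */
int meets_deadline(uint64_t worst_case_cycles, uint32_t cpu_hz,
                   uint32_t deadline_us)
{
    return worst_case_cycles <= cycles_before_deadline(cpu_hz, deadline_us);
}
```

The point of the exercise is that the worst case is what counts: a task that averages 100,000 cycles but occasionally needs 900,000 does not meet a 5 ms deadline on this chip, however fast it usually is.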
Myth: Arduino = embedded systems. Knowing Arduino means knowing embedded
Arduino is a learning platform that hides complexity behind abstractions (digitalWrite, analogRead). Professional embedded development works with STM32, ESP32, and NXP parts directly: registers, HAL, CMSIS, linker scripts, JTAG/SWD debuggers, power analysis, and certification (medical, aviation, automotive).
Arduino's digitalWrite() takes ~50 cycles, versus one or two for a direct GPIO register write. In production, what matters is: deterministic execution time, power consumption in microwatts, hardware safety, 10-year chip support from the manufacturer. For a pacemaker, regulators such as the FDA expect documented, traceable verification of the firmware - the Arduino IDE is not designed for that.
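What a "direct GPIO register write" looks like can be sketched with plain bit operations. The helper name `gpio_write_pin` is hypothetical, and the register is passed in as a pointer so the same code also runs against an ordinary variable on a PC.

```c
#include <stdint.h>

/* Set or clear one pin in a GPIO output data register.
 * `odr` points at the register; `pin` is the bit position (0-15). */
void gpio_write_pin(volatile uint32_t *odr, unsigned pin, int level)
{
    if (level) {
        *odr |= (1u << pin);     /* set the bit, leave the others untouched */
    } else {
        *odr &= ~(1u << pin);    /* clear the bit */
    }
}
```

On hardware the pointer would be the peripheral's memory-mapped address taken from the chip's reference manual, for example `(volatile uint32_t *)0x40020014u` for the GPIOA output register on an STM32F4 (an address to verify against the manual, not to copy blindly). The function compiles to a handful of instructions, whereas digitalWrite() first looks up the port and bit for the pin number and checks for timer conflicts.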
In FreeRTOS, a task with priority 3 and a task with priority 1 are both ready to run. Which one will execute?
Key Takeaways
- **MCU** - a self-contained computer on a chip (CPU + Flash + RAM + peripherals) for $0.10-15; 30 billion units per year; 70+ per Tesla
- **Firmware** - a program in Flash; cross-compiled with arm-none-eabi-gcc on x86 to ARM binary; no runtime environment whatsoever
- **Bare-metal** - super-loop while(1) + interrupts; volatile protects shared variables from the optimizer; interrupt handlers must be as short as possible
- **RTOS (FreeRTOS)** - when multitasking with guaranteed response time is needed; general-purpose Linux is ruled out wherever unbounded scheduling latency or a runtime pause could miss a hard deadline
Related topics
Introduction to embedded systems is the foundation for studying MCU architecture and interfaces:
- Microcontroller architecture — Deep dive into ARM Cortex-M, RISC-V, registers, and memory map
- GPIO, UART, SPI, I2C — Practical guide to connecting sensors and devices to an MCU
- Binary system — Foundation for working with registers and bit masks in MCU
Questions for reflection
- A pacemaker runs for years without rebooting and must comply with IEC 62304 to gain FDA clearance. What firmware requirements does that impose regarding error handling, updates, and fault tolerance?
- Tesla uses Linux for infotainment and MCU with FreeRTOS for ABS and steering. Why can't everything run on Linux with the PREEMPT_RT patch?
- A 3-dollar ESP32 with FreeRTOS controls a 50,000-dollar industrial valve. What architectural decisions make this safe - watchdog, fail-safe, redundancy?
Related lessons
- arch-01-binary — Processor architecture is the foundation of embedded: registers, interrupts, memory-mapped I/O, ISA
- os-01-intro — RTOS vs general-purpose OS is the same contrast as bare-metal vs Linux in embedded
- rts-01 — Real-time systems are built on top of embedded: deterministic scheduling, interrupts, watchdog
- net-01-intro — IoT devices communicate over networks: MQTT, CoAP, TCP/IP on embedded requires networking knowledge
- arch-04-cpu