Computer Architecture
Binary System: The Language of Computers
Цели урока
- Understand why computers use binary
- Convert numbers between systems: bin ↔ dec ↔ hex
- Know the units: bit, byte, word
- Understand signed vs unsigned numbers
- Avoid overflow bugs
Предварительные знания
- Basic arithmetic
Apple M3 Ultra (2024): 192 GB of memory, 800 GB/s bandwidth, 80 CPU cores, over 100 billion transistors. Intel 4004 (1971): 2,300 transistors, 108 kHz. Fifty-three years, a hundred-billion-transistor gap. The underlying logic has not changed by a single bit: 0 and 1, high and low voltage. Why binary specifically? Not because "two states" is the only option - ternary computers existed (Soviet Setun, 1958). But with a 50% threshold, the noise margin is maximized and misreads at the electrical level are minimized. Every modern chip, every instruction set, every OS is built on that one engineering decision from the 1940s.
- **Memory dump debugging:** every address in gdb, lldb, Instruments is in hex. `0x7ffee4b1c2f0` is a 64-bit stack address. Without understanding hex, debugging native code is guesswork
- **Bit flags in system APIs:** Unix permissions (chmod 755 = 111 101 101), HTTP method masks, OpenGL state flags - all are bitmasks. `permissions & 0x04` checks one flag in a single CPU instruction
- **Integer overflow in security:** CVE-2021-3156 (sudo heap overflow) and Heartbleed both exploit integer overflow or underflow. `unsigned int len = 0; len - 1 = 4294967295` is a classic attack vector
Why Binary?
Every transistor in a processor does one thing: distinguish whether voltage is present or not. Two states. The M3 Ultra has over 100 billion of them. How does 4K video, machine learning, and an operating system emerge from that?
Think of a light switch: **ON** (1) or **OFF** (0). One switch - two states. Eight switches - the picture changes dramatically:
| Switches | Combinations | Can Encode |
|---|---|---|
| 1 | 2 | Yes/No |
| 2 | 4 | 4 variants (00, 01, 10, 11) |
| 4 | 16 | Digits 0-9 + letters |
| 8 | 256 | All ASCII characters |
| 32 | 4 billion | Any IPv4 address |
| 64 | 18 quintillion | More than atoms on Earth |
**Formula:** n switches = 2ⁿ combinations. Exponential growth!
**Why not ternary?** The USSR ran a ternary computer called Setun (1958) - three states (-1, 0, +1) are theoretically denser. But a transistor distinguishes two voltage levels more reliably than three: at a 50% threshold, the noise margin is maximized. Physics chose binary; engineers agreed.
How many distinct values can be encoded with 10 bits?
Positional Number Systems
The number **234** in decimal is not just symbols on paper. Each position carries a weight:
Any number system works the same way - only the **base** changes. In hex, `FF` is not letters; it is the number 255:
| System | Base | Digits | Usage |
|---|---|---|---|
| Binary | 2 | 0, 1 | Inside the computer |
| Octal | 8 | 0-7 | Unix permissions (chmod 755) |
| Decimal | 10 | 0-9 | Everyday life |
| Hexadecimal | 16 | 0-9, A-F | Colors (#FF0000), memory |
**Hex color code:** #FF0000 = red. FF = 255 (maximum red), 00 = 0 (no green), 00 = 0 (no blue).
In which system is the number 777 written (Unix file permissions)?
Binary -> Decimal
Each bit is a power of two. The rightmost bit (position 0) weighs 1; the leftmost in a byte (position 7) weighs 128:
**Tip:** Memorize powers of two: 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024. Programmers recognize them instantly!
What is 11111111₂ in decimal?
Decimal -> Binary
**Division by 2 method:** divide by 2, record the remainders. Read bottom to top - each remainder becomes a bit, from least significant to most significant.
**Powers of two method:** greedier and more visual. Take the largest power of two that does not exceed the number, mark a 1, subtract - and repeat:
**Pattern:** 42 = 101010 - alternating! 42 = 32 + 8 + 2 = 2⁵ + 2³ + 2¹. All odd powers.
100₁₀ in binary is:
Hexadecimal System
Binary numbers are long: a 32-bit memory address is 32 characters of zeros and ones. **Hex** fixes this: every 4 bits map to one hex digit, and `0xDEADBEEF` is far more readable than its 32-bit binary equivalent.
| Dec | Bin | Hex | Dec | Bin | Hex |
|---|---|---|---|---|---|
| 0 | 0000 | 0 | 8 | 1000 | 8 |
| 1 | 0001 | 1 | 9 | 1001 | 9 |
| 2 | 0010 | 2 | 10 | 1010 | A |
| 3 | 0011 | 3 | 11 | 1011 | B |
| 4 | 0100 | 4 | 12 | 1100 | C |
| 5 | 0101 | 5 | 13 | 1101 | D |
| 6 | 0110 | 6 | 14 | 1110 | E |
| 7 | 0111 | 7 | 15 | 1111 | F |
Colors on the web
CSS colors are RGB in hex
#FF5733 FF = 255 (red 100%) 57 = 87 (green 34%) 33 = 51 (blue 20%) Result: orange-red color
What color is the hex code #FFFFFF?
Bit, Byte, Word
**Bit** (binary digit) - the minimum unit: 0 or 1. One transistor in a charged or uncharged state.
**Byte** - 8 bits, chosen historically as the minimum to store one ASCII character. Today the byte is the de-facto atomic unit of memory addressing on all processors.
| Unit | Size | Range | Example |
|---|---|---|---|
| 1 bit | 1 | 0-1 | Yes/No |
| 1 nibble | 4 bits | 0-15 | 1 hex digit |
| 1 byte | 8 bits | 0-255 | ASCII character |
| 1 word | 16-64 bits | Depends on CPU | Memory address |
**Word** - the size of a processor register. On a 64-bit CPU: 64 bits = 8 bytes. That is why pointers take 8 bytes, and the theoretical maximum RAM is 2^64 bytes. 32-bit CPUs were limited to 4 GB of addressable memory - exactly 2^32 bytes.
**Confusion:** Is 1 KB = 1024 bytes or 1000 bytes? Officially: KB = 1000, KiB = 1024. But most programs still use KB = 1024.
How many bits are in 2 bytes?
Signed and Unsigned Numbers
Only zeros and ones - how are negative numbers represented? Three historical approaches, one winner:
**1. Sign-magnitude (obsolete):** the most significant bit is the sign (0 = +, 1 = -), the rest encode the magnitude.
**2. Two's complement:** the winner. Invert all bits, then add 1. The elegance: ordinary addition works without any special-case logic - overflow simply wraps around.
| Type (C) | Size | Range |
|---|---|---|
| unsigned char | 8 bits | 0 to 255 |
| signed char | 8 bits | -128 to +127 |
| unsigned int | 32 bits | 0 to 4,294,967,295 |
| signed int | 32 bits | -2,147,483,648 to +2,147,483,647 |
**Bug trap:** Signed overflow. int max = 2147483647; max + 1 = -2147483648 (not an error, but a wraparound!)
Computers understand negative numbers the same way humans do - just with a sign attached
Computers do not store a sign separately. In two's complement, the most significant bit carries a negative weight (-128 for an 8-bit number), and all signed arithmetic falls out automatically
-1 in 8 bits is `11111111`. This is not 'minus' plus '1111111'. It is: -128 + 64 + 32 + 16 + 8 + 4 + 2 + 1 = -128 + 127 = -1. The trick: adding 5 + (-5) uses ordinary binary addition - overflow is discarded and the result is 0.
What happens when the 8-bit signed number 127 is incremented by 1?
Binary System
- **Binary is not arbitrary:** at a 50% threshold, noise margin is maximized. Ternary Setun (USSR, 1958) is theoretically denser, but transistors distinguish two voltage levels more reliably than three
- **Exponential power:** n bits = 2ⁿ combinations. 64 bits = 18 quintillion states - more than atoms on Earth. That is why 64-bit addressing makes RAM limits irrelevant in practice
- **Hex is just compressed binary:** 4 bits = 1 hex digit. Memory address `0x7FFFFFFF` and a 32-bit mask are the same thing, just far more readable
- **Two's complement is elegant by design:** -1 in 8 bits is `11111111`, and it works because overflow is discarded during addition - no special-case circuitry needed
- **Overflow is behavior, not an error:** signed overflow in C is undefined behavior; unsigned is guaranteed wraparound. The distinction is critical for security and correctness
Related Topics
Binary is the foundation for all hardware.
- Logic Gates — How to implement 0/1 in hardware
- ALU — Arithmetic with binary numbers
- Bitwise Operations — AND, OR, XOR, shifts
Связанные уроки
- arch-02-logic-gates — Logic gates are the next level: how 0s and 1s build computational logic
- ar-01-natural — The binary system is a special case of positional arithmetic
- ds-01-arrays — Understanding binary memory addressing explains O(1) access in arrays
- prog-01-intro — Programming runs on top of binary representation - two views of the same machine
- la-04-matrix-ops