>
section 2 of 1312 min read

2. Number Systems

To do useful work we need to represent not just abstract bits but actual numbers. Words, images, audio, network packets, executable instructions: every form of data in a computer is ultimately a sequence of bits that some convention interprets as something meaningful.

2.1 Decimal: the place-value idea you already know

You learned this in primary school. Each digit is one of {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}, and the value of a multi-digit number is

D=dn10n+dn110n1++d110+d0.D = d_n \cdot 10^n + d_{n-1} \cdot 10^{n-1} + \cdots + d_1 \cdot 10 + d_0.

So 1234 means 11000+2100+310+4=12341 \cdot 1000 + 2 \cdot 100 + 3 \cdot 10 + 4 = 1234. Each position carries ten times the place value of the position to its right. The 10 in there is the base or radix of the system. Pull that 10 out, replace it with any other positive integer, and you have a different number system.

2.2 Binary: base 2

Same idea but base 2. Each digit is 0 or 1, and each position is twice the place value of the position to its right:

B=bn2n+bn12n1++b12+b0.B = b_n \cdot 2^n + b_{n-1} \cdot 2^{n-1} + \cdots + b_1 \cdot 2 + b_0.

So 1101 in binary means 18+14+02+11=131 \cdot 8 + 1 \cdot 4 + 0 \cdot 2 + 1 \cdot 1 = 13. The leftmost bit is the most significant bit (MSB), the rightmost is the least significant bit (LSB). This vocabulary will follow you into every datasheet you read.

A few useful powers of 2 to memorize:

plaintext
 2^0 = 1          2^8  = 256        2^16 = 65,536
 2^1 = 2          2^9  = 512        2^20 = 1,048,576 (~1 M)
 2^2 = 4          2^10 = 1,024      2^30 = ~1 G
 2^3 = 8          2^11 = 2,048      2^32 = ~4.3 G
 2^4 = 16         2^12 = 4,096      2^40 = ~1 T

The pattern 2101032^{10} \approx 10^3 is enormously useful for back-of-envelope estimates. A 32-bit address space spans about 4 gigabytes; that is exactly 2322^{32}, and the reason "4 GB" was the natural memory ceiling on every 32-bit OS until the 64-bit transition.

2.3 Octal and hexadecimal: for tired humans

Binary is the machine's native language but humans hate reading long strings of 0s and 1s. Eight bytes of data is 64 bits long, easy to misread, easier still to miscopy. So we group binary digits.

Octal (base 8) groups bits in threes from the right. Each group is one octal digit (0–7).

110 101 010 in binary becomes 652 in octal.

Octal was popular on early computers (the PDP-8 used 12-bit words, dividing nicely into four octal digits) and survives in Unix file permissions (chmod 755 is octal: rwxr-xr-x). But it has largely lost out to hex.

Hexadecimal (base 16) groups bits in fours. Each group is one hex digit, 0 through F (where A = 10, B = 11, C = 12, D = 13, E = 14, F = 15).

1101 0110 in binary becomes D6 in hex.

Hex is the human-readable form for binary today. Every byte fits in exactly two hex digits, every nibble (half-byte, 4 bits) in one. You will see hex everywhere: memory addresses (0x7fff_a000), MAC addresses (AA:BB:CC:DD:EE:FF), CSS color codes (#FF8800), MD5 and SHA hashes, USB device IDs, anything bit-pattern-y. The 0x prefix that C inherited and Python adopted is the universal "this is hex" tag.

Why not just decimal everywhere? Because decimal does not align with binary. A decimal digit needs roughly 3.32 bits, which is awkward. Eight bits in decimal is "0 to 255," and unless you have memorized the table you cannot tell at a glance which bits are set. Eight bits in hex is "00 to FF," and each hex digit cleanly maps to four bits you can recite. Try it: 0xA5. The A is 1010, the 5 is 0101. So 0xA5 is 10100101. No arithmetic needed.

2.4 Conversions

These four conversions are the most basic skill of digital design. Practice until they feel automatic.

Decimal → binary: divide by 2 repeatedly, write down the remainders, then read them bottom-up.

plaintext
 13 / 2 = 6  remainder 1   ← LSB
  6 / 2 = 3  remainder 0
  3 / 2 = 1  remainder 1
  1 / 2 = 0  remainder 1   ← MSB
 → 1101

The intuition: the remainder at each step tells you whether the current "amount" is odd or even, which is exactly the LSB of what is left. After dividing by 2, you are now asking the same question of the next bit up, and so on.

Binary → decimal: multiply each bit by its place value and sum.

1101 becomes 8+4+0+1=138 + 4 + 0 + 1 = 13.

Binary → hex: group bits in 4s starting from the right (pad the leftmost group with zeros if needed). Each group of 4 is one hex digit.

1101 0110 is D6. 1 1010 1111 would be padded as 0001 1010 1111, which is 1AF.

Hex → binary: each hex digit becomes 4 bits.

A5 is 1010 0101. 0xDEADBEEF is 1101 1110 1010 1101 1011 1110 1110 1111. (And no, it does not encode a dead beef. The pattern was just memorable to early systems programmers, who used it to mark uninitialized memory in ways that were obvious in hex dumps.)

For octal, the same procedure applies with groups of 3 instead of 4. 110101010 becomes 652 in octal.

Conversions between non-binary bases (decimal to hex, octal to hex) are usually done by going through binary as a stepping stone. There is no need to memorize direct decimal-to-hex tables.

2.5 Signed numbers: three schemes and why two died

How do we represent negative numbers in binary? Three schemes have been used historically. Only one survived.

Sign-magnitude. Reserve the leftmost bit for the sign (0 = positive, 1 = negative). The remaining bits are the magnitude.

  • +5=00000101+5 = 0\,0000101
  • 5=10000101-5 = 1\,0000101

Drawback: there are two zeros, +0 (00000000) and −0 (10000000). Worse, addition is awkward. You cannot just add the bit patterns; you have to first compare signs, then either add magnitudes (same sign) or subtract them (different signs), then sort out the result's sign. The hardware to do this is much more complicated than plain bitwise addition.

1's complement. Negate by flipping every bit.

  • +5=00000101+5 = 0000\,0101
  • 5=11111010-5 = 1111\,1010

Still has the two-zeros problem (+0 = 00000000, −0 = 11111111). Addition is slightly easier than sign-magnitude but requires an "end-around carry" trick: any carry-out of the MSB is added back to the LSB. Used in some early machines (the CDC 6600, the PDP-1) but obsolete today.

2's complement. Flip every bit, then add 1.

  • +5=00000101+5 = 0000\,0101
  • 5-5: flip to 111110101111\,1010, add 1, get 111110111111\,1011.

There is exactly one zero (00000000), the same circuit handles both addition and subtraction, and bitwise addition just works. Verify: 5+(5)=00000101+11111011=1000000005 + (-5) = 0000\,0101 + 1111\,1011 = 1\,0000\,0000. The carry-out off the top is discarded, and we get 00000000, the zero we wanted. 2's complement is what every modern CPU uses.

For an n-bit 2's-complement number, the range is from 2n1-2^{n-1} to +2n11+2^{n-1} - 1:

  • 8-bit: −128 to +127.
  • 16-bit: −32,768 to +32,767.
  • 32-bit: roughly ±2.1 billion.
  • 64-bit: roughly ±9.2 quintillion.

Notice the asymmetry: there is one more negative number than positive number. That extra negative slot used to be the second zero in 1's complement, and 2's complement ate it instead. Concretely, 8-bit −128 is 10000000, and its 2's complement... is itself. (Flip to 01111111, add 1, carry through, get 10000000 again.) So -(-128) overflows in 8-bit 2's complement, a tiny but real source of bugs.

Clock-arithmetic analogy. Think of an 8-bit register as the face of a clock with 256 positions (0 through 255 going around once). On this clock, 255 + 1 wraps to 0. Now relabel the second half: 128 = −128, 129 = −127, …, 255 = −1. Subtraction becomes "going backward on the clock," or, equivalently, "going forward by the complement amount." That is exactly what 2's complement does. Once you see the wrap-around as natural, subtraction stops feeling magical.

The security implications of these conventions are real. Many integer-overflow vulnerabilities exploit the fact that, in 2's complement, adding a large positive number can wrap to negative, and a "length must be positive" check might pass while the value used downstream is huge as an unsigned offset. If you have heard of CVE-laden bugs from "integer signedness confusion," this is what they mean.

2.6 Binary-coded decimal (BCD)

Sometimes, instead of pure binary, we encode each decimal digit independently as 4 bits. So 23 in BCD is 0010 0011, "two" and "three" placed side by side. This is wasteful (BCD uses 4 bits to encode 10 values, when 4 bits could encode 16) but easy for humans and easy for displays.

Used in:

  • Digital clocks, where the seven-segment digits map directly from BCD.
  • Pocket calculators, where exact decimal arithmetic matters and base conversion would introduce rounding artifacts.
  • Old financial systems in COBOL, where regulators required cents-exact arithmetic.
  • Seven-segment-display drivers like the famous 74LS47, still made today, which converts BCD to seven-segment outputs.

BCD addition needs a tiny adjustment circuit: if a 4-bit sum exceeds 9, you add 6 to skip past the unused codes 1010–1111 and produce a proper carry into the next decimal position. Old TTL parts like the 74LS83 (4-bit adder) plus 74LS86 (XOR) used to combine into BCD adders for calculators.

2.7 Other 4-bit codes: Excess-3 and Gray

Excess-3 adds 3 to BCD. So 0 is 0011, 1 is 0100, …, 9 is 1100. Property: complementing every bit of an Excess-3 digit gives you the 9's complement of the decimal value, which used to make subtraction circuits simpler. Mostly a historical curiosity now.

Gray code is more interesting. Adjacent values differ in exactly one bit. So 0 = 000, 1 = 001, 2 = 011, 3 = 010, 4 = 110, 5 = 111, 6 = 101, 7 = 100. Notice how each step flips one bit: there is no place where you cross from 011 to 100 in one go (which would require three bits to change simultaneously).

Why does this matter? Because real circuits don't change all bits simultaneously. In ordinary binary, going from 7 (0111) to 8 (1000) flips four bits. If those four bits don't transition at exactly the same instant (and they never do), an observer reading mid-transition might briefly see 1111 or 0000 or any other intermediate. In a Gray code, since only one bit flips at a time, the worst you can read mid-transition is the current value or the next one, never something wild.

Gray code shows up in:

  • Rotary encoders, where slight optical-sensor misalignment between bits would otherwise cause two transitions to appear unsynchronized. With Gray, no glitches.
  • K-maps, which we will meet in Section 4. The Gray ordering of rows and columns is what makes adjacent cells differ in exactly one variable.
  • Asynchronous clock-domain crossings, where a counter incrementing in one clock domain is sampled by another. If the counter is Gray-coded, the sampler always reads either the old or the new value, never garbage. This is so common that high-speed FIFO designs almost universally use Gray-coded read/write pointers.

2.8 Error detection and correction

Bits get flipped. By cosmic rays. By aging chips. By manufacturing defects. By radiation in space. By rowhammer attacks. We can detect (and sometimes correct) these flips by adding redundancy.

Parity bit. Append one bit chosen so the total number of 1s is either even (even parity) or odd (odd parity). Single-bit flips change the count by 1, breaking parity, and are detected. A two-bit flip slips through. Older RAM modules used parity, and serial protocols (UART, RS-232) often have an optional parity bit per byte.

Hamming code. Place multiple parity bits at power-of-2 positions in the codeword (positions 1, 2, 4, 8, …). Each parity bit covers a specific subset of the data bits in a clever interlocking pattern. With enough parities, any single bit flip (anywhere in the data or parities) leaves a unique "syndrome" that points at exactly which bit flipped, letting you correct it. The (7, 4) Hamming code carries 4 data bits in 7 total bits, a 75% overhead, and corrects any single-bit error.

ECC RAM uses Hamming-derivative codes. Server memory typically encodes 64 data bits with 8 parity bits (a (72, 64) code) for SECDED: single-error correct, double-error detect. Spacecraft memories use even stronger codes because cosmic rays at high altitude flip bits a thousand times more often than at sea level.

Reed-Solomon, BCH, LDPC are more powerful codes that correct multiple errors at once. CDs use Reed-Solomon to recover from scratches; Wi-Fi, 5G, and deep-space probes use LDPC to push close to the Shannon limit. We will see these in detail in the Digital Communications chapter.

Hardware-security tie-in: rowhammer. Repeatedly hammering a row of DRAM with reads can flip bits in adjacent rows by capacitive disturbance. ECC makes rowhammer harder but not impossible. Multi-bit flips in the same word can occasionally slip past single-error correction. Recent attacks (TRRespass, half-double rowhammer) target ECC and chipkill DRAM specifically. Servers handling cryptographic keys are increasingly deploying stronger codes (chipkill, ChipSparing, on-die ECC) and refresh-rate randomization to counter this.