Databases
System Design: Payment System
Wise processes $12 billion in international transfers per month across 70+ banking partners. A single rounding error in monetary storage, a duplicate charge from a retry without idempotency, or a missing transaction in reconciliation can cause regulatory action or user harm. Payment systems are built on a small set of non-negotiable patterns: DECIMAL types, idempotency keys, double-entry bookkeeping, and immutable audit trails.
- **Stripe**: idempotency keys are mandatory for all Payment Intents API calls. Their documentation explicitly describes retry-safe payment flows. Stripe processes hundreds of billions of dollars annually with this pattern.
- **Brex**: event sourcing on Kafka for corporate card balances. Account balance = sum of all ledger events. No stored balance field that could drift. Complete audit trail from the first transaction.
Double-Entry Bookkeeping and DECIMAL Types
Financial systems store monetary amounts as DECIMAL (or NUMERIC), never FLOAT. Floating-point representation cannot exactly represent most decimal fractions: 0.1 + 0.2 = 0.30000000000000004 in IEEE 754. A single float rounding error in a banking ledger can compound into significant discrepancies across millions of transactions.
Never store monetary amounts as FLOAT or DOUBLE. Use DECIMAL(19,4) or store amounts as integer cents (100 = $1.00). A single rounding error compounded across millions of transactions becomes a regulatory and accounting problem. Stripe, Brex, and all payment processors use DECIMAL or integer cents exclusively.
Why must monetary amounts be stored as DECIMAL instead of FLOAT?
Idempotency Keys
A payment API call can fail with a timeout, network error, or server crash - leaving the caller uncertain whether the payment was processed. Retrying without protection causes duplicate charges. Idempotency keys allow safe retries: the client generates a unique key per payment attempt; the server records which keys have been processed and returns the cached result for duplicate requests.
Stripe requires idempotency keys for all Payment Intents API calls that create or confirm payments. Their documentation states: 'If a request fails, your code retries using the same idempotency key. Stripe returns the exact same response as the original successful request.' Stripe expires idempotency keys after 24 hours.
A client sends a $50 transfer request and receives a timeout (no response). What is the correct next action?
Reconciliation
Reconciliation compares two independent records of the same financial events to detect discrepancies. Internal system records are compared against external bank statements, payment processor reports, or partner system records. Discrepancies require investigation: network failures, bugs, race conditions, or fraud can cause records to diverge.
Wise (TransferWise) processes $12 billion in international transfers per month across multiple payment networks. Their reconciliation system runs every 15 minutes comparing internal ledger records against statements from 70+ banking partners. Discrepancies above $0.01 trigger automated investigation workflows.
Reconciliation shows a transaction in the bank statement that is not in the internal system. What could cause this?
Immutable Audit Trail
Financial regulations (PCI DSS, SOX, PSD2) require an immutable audit trail: a record of every change to financial data, who made it, and when. The audit trail must be append-only - no updates or deletes. Event sourcing stores the current state as a sequence of immutable events rather than a mutable record.
Brex uses event sourcing on Kafka for corporate card balances. The balance of each card is the sum of all ledger events (credits, debits, reversals) - never a stored field that could drift from the actual transaction history. This design provides a complete, tamper-evident audit trail and makes balance reconstruction possible at any historical point.
Why must an audit trail table not allow UPDATE or DELETE operations?
Key Ideas
- **DECIMAL not FLOAT**: IEEE 754 cannot exactly represent decimal fractions. 0.1 + 0.2 = 0.30000000000000004. Use DECIMAL(19,4) or integer cents for all monetary values.
- **Double-entry bookkeeping**: every transaction creates two ledger entries (debit + credit) that sum to zero. The invariant makes accounting errors detectable.
- **Idempotency keys**: client generates a unique key per payment attempt. Server caches the result by key. Safe retries on timeout or network failure prevent duplicate charges.
- **Reconciliation**: compare internal records against external statements every N minutes. Any discrepancy triggers investigation. Common cause: partial failures where external processor succeeded but internal system did not record the result.
- **Immutable audit trail**: append-only INSERT. No UPDATE or DELETE. Enforced via PostgreSQL rules or Kafka event sourcing. Required for PCI DSS, SOX, and forensic accuracy.
Related Topics
Payment systems rely on core database transaction guarantees:
- ACID Transactions — Double-entry bookkeeping requires atomic transactions: both ledger entries must commit or neither does. ACID guarantees prevent partial writes that would corrupt account balances.
- Replication — Payment systems use synchronous replication to guarantee zero data loss. A committed payment must exist on multiple nodes before the API returns success.
- Database Monitoring — Reconciliation failures and discrepancy rates are critical monitoring metrics for payment systems. Automated alerting on discrepancies above threshold triggers immediate investigation.
Вопросы для размышления
- A payment system processes $1 million per day. The daily reconciliation shows a $0.05 total discrepancy across 100,000 transactions. Is this acceptable? What could cause it?
- Brex computes account balance as SUM of all ledger events. At 1 billion events per account, this aggregation takes seconds. How would you optimize balance queries without losing the event sourcing guarantees?
- A regulatory audit requires showing the exact state of every account at any point in the past 7 years. Design the schema and query strategy for this requirement.