Computer Networks
RPC: gRPC, Thrift
Google processes on the order of 10 billion RPCs per second inside its infrastructure. gRPC grew out of an internal Google system called Stubby. When JSON/REST isn't fast enough, RPC saves the day.
- **Google** - internal infrastructure built on Stubby, the internal predecessor that gRPC grew out of
- **Netflix** - gRPC between microservices, REST for APIs
- **Uber** - Thrift for legacy, gRPC for new services
Prerequisites
Remote Procedure Call
**RPC (Remote Procedure Call)** is a paradigm in which calling a function on a remote server looks like a local call. You write `userService.getUser(123)`, and under the hood the arguments are serialized, sent over the network (for example as an HTTP request), and the response is deserialized.
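The round trip described above can be sketched in a few lines. This is a minimal illustration, not how any real framework is implemented: the stub class, the `server_handle` function, and the JSON message layout are all hypothetical, and the "network" is just a direct function call.

```python
import json

# Hypothetical server-side implementation and method registry.
def get_user(user_id):
    return {"id": user_id, "name": f"user-{user_id}"}

HANDLERS = {"getUser": get_user}

def server_handle(request_bytes):
    """Server side: deserialize the request, run the method, serialize the result."""
    request = json.loads(request_bytes.decode("utf-8"))
    result = HANDLERS[request["method"]](*request["args"])
    return json.dumps({"result": result}).encode("utf-8")

class UserServiceStub:
    """Client stub: looks like a local object, but every call goes through a transport."""

    def __init__(self, transport):
        # In a real system the transport would send bytes over a socket.
        self._transport = transport

    def getUser(self, user_id):
        request_bytes = json.dumps({"method": "getUser", "args": [user_id]}).encode("utf-8")
        response_bytes = self._transport(request_bytes)
        return json.loads(response_bytes.decode("utf-8"))["result"]

# The call site reads like a local call.
userService = UserServiceStub(transport=server_handle)
user = userService.getUser(123)
print(user)
```

In a real RPC framework the stub is produced by a code generator rather than written by hand, and the transport carries the bytes to another process or machine.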
Early RPC systems appeared in the 1980s (Sun RPC), followed by CORBA in the early 1990s. Modern implementations include gRPC (Google), Thrift (Facebook), and Twirp (Twitch). A common alternative is a REST API, where you work with HTTP explicitly.
**Leaky abstraction:** RPC hides the network, but not its problems. Latency, partial failures, and retries all remain. A local call takes nanoseconds; an RPC call takes milliseconds to seconds.
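Because failures leak through, callers usually wrap remote calls in some retry policy. The sketch below shows one common pattern, exponential backoff with jitter. The names `TransientNetworkError`, `call_with_retries`, and `flaky_get_user` are made up for this example, and the failure is simulated.

```python
import random
import time

class TransientNetworkError(Exception):
    """Hypothetical error type standing in for a timeout or dropped connection."""

def call_with_retries(rpc, max_attempts=3, base_delay=0.01):
    """Call rpc(); on a transient failure, retry with exponential backoff and jitter.

    Only safe for idempotent operations: a request that timed out
    may still have been executed on the server.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return rpc()
        except TransientNetworkError:
            if attempt == max_attempts:
                raise  # give up and surface the failure to the caller
            delay = base_delay * (2 ** (attempt - 1))
            time.sleep(delay + random.uniform(0, delay))

# Simulated remote call that fails twice before succeeding.
attempts = []

def flaky_get_user():
    attempts.append(1)
    if len(attempts) < 3:
        raise TransientNetworkError("simulated timeout")
    return {"id": 123}

result = call_with_retries(flaky_get_user)
print(result, "after", len(attempts), "attempts")
```

None of this is needed for a local function call, which is exactly why the abstraction is called leaky.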
**Components of an RPC system:**
- **IDL (Interface Definition Language)** - API description (Protobuf, Thrift IDL)
- **Code generator** - generates client/server stubs from the IDL
- **Serialization** - binary (Protobuf) or text (JSON)
- **Transport** - HTTP/2, TCP, Unix sockets
What problem does RPC NOT automatically solve?
gRPC
**gRPC** is a modern RPC framework from Google. It uses HTTP/2 for transport and Protocol Buffers for serialization. It supports streaming, bidirectional communication, and deadline propagation.
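Deadline propagation means that every hop in a call chain works against the same time budget instead of starting its own timeout. gRPC carries the remaining time in the `grpc-timeout` request header. The sketch below imitates the idea in plain Python and does not use the gRPC library. The `Deadline` class and the two service functions are hypothetical.

```python
import time

class DeadlineExceeded(Exception):
    """Raised when no time budget is left for the request."""

class Deadline:
    """An absolute point in time shared by every hop of one request."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self._clock = clock
        self._expires_at = clock() + timeout_seconds

    def remaining(self):
        return self._expires_at - self._clock()

    def check(self):
        if self.remaining() <= 0:
            raise DeadlineExceeded("no time budget left")

def inventory_service(deadline):
    deadline.check()  # refuse to start work the caller no longer waits for
    return "in stock"

def order_service(deadline, work_seconds=0.02):
    deadline.check()
    time.sleep(work_seconds)  # simulated local work
    # Pass the same deadline downstream instead of starting a fresh timeout.
    return inventory_service(deadline)

ok = order_service(Deadline(timeout_seconds=1.0))

try:
    order_service(Deadline(timeout_seconds=0.005))
    late = "completed"
except DeadlineExceeded:
    late = "deadline exceeded"

print(ok, "/", late)
```

With a generous budget the chain completes. With a budget shorter than the upstream work, the downstream service fails fast instead of doing work nobody is waiting for.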
**gRPC-Web** is a variant of gRPC for browsers. Browser APIs don't give JavaScript access to the HTTP/2 framing and trailers that gRPC relies on, so gRPC-Web clients talk to a proxy (such as Envoy) that translates to regular gRPC.
What transport protocol does gRPC use?
Protocol Buffers
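**Protocol Buffers (Protobuf)** is a binary serialization format from Google. It is often cited as 3-10x more compact than JSON and 20-100x faster to parse, though the actual gain depends on the data and the implementation. The schema is defined in `.proto` files, and code is generated from it for many languages.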
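**Wire format:** Each field is encoded as a varint key, `(field_number << 3) | wire_type`, followed by the value. The wire type tells a parser how long the value is, so it can skip fields whose numbers it doesn't recognize. This is the key to backward compatibility.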
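The key layout can be reproduced by hand. The following sketch covers only two wire types (varint and length-delimited) and is not a replacement for the official Protobuf library. The function names are chosen for this example.

```python
VARINT, LEN = 0, 2  # two of the Protobuf wire types

def encode_varint(value):
    """Encode a non-negative integer as a base-128 varint (7 bits per byte)."""
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low7)
            return bytes(out)

def decode_varint(buf, pos):
    """Return (value, next_position) for the varint starting at buf[pos]."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7

def encode_varint_field(field_number, value):
    return encode_varint(field_number << 3 | VARINT) + encode_varint(value)

def encode_bytes_field(field_number, payload):
    return encode_varint(field_number << 3 | LEN) + encode_varint(len(payload)) + payload

def decode_known_fields(buf, known):
    """Keep fields whose numbers are in `known`; skip everything else."""
    fields, pos = {}, 0
    while pos < len(buf):
        key, pos = decode_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if wire_type == VARINT:
            value, pos = decode_varint(buf, pos)
        elif wire_type == LEN:
            length, pos = decode_varint(buf, pos)
            value, pos = buf[pos:pos + length], pos + length
        else:
            raise ValueError(f"wire type {wire_type} not handled in this sketch")
        if field_number in known:
            fields[field_number] = value
    return fields

# A newer writer adds field 2; an older reader only knows field 1.
message = encode_varint_field(1, 150) + encode_bytes_field(2, b"new")
old_reader = decode_known_fields(message, known={1})
new_reader = decode_known_fields(message, known={1, 2})
print(encode_varint_field(1, 150).hex(), old_reader, new_reader)
```

The old reader still parses the message correctly because the wire type of field 2 tells it how many bytes to skip, even though it has no idea what the field means.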
Why is it safe to add new fields to a Protobuf schema?
Apache Thrift
**Apache Thrift** is an RPC framework created at Facebook and now maintained by Apache. It is similar to gRPC but older, and it makes different design decisions: the transport (TCP, HTTP, Unix socket) and the serialization protocol (Binary, Compact, JSON) are chosen independently.
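The separation between transports and serialization protocols can be illustrated with a toy layering. The sketch below does not use the Thrift library, and none of these classes correspond to Thrift's real classes or wire formats. They are hypothetical stand-ins that only show how the two layers combine freely.

```python
import json
import struct

class JsonProtocol:
    """Text protocol: a call becomes a JSON document."""
    def encode(self, method, args):
        return json.dumps({"method": method, "args": args}).encode("utf-8")

    def decode(self, data):
        msg = json.loads(data.decode("utf-8"))
        return msg["method"], msg["args"]

class ToyBinaryProtocol:
    """Toy binary layout (not Thrift's real Binary protocol): name length, name, arg count, int32 args."""
    def encode(self, method, args):
        name = method.encode("utf-8")
        return struct.pack(f">H{len(name)}sH{len(args)}i", len(name), name, len(args), *args)

    def decode(self, data):
        (name_len,) = struct.unpack_from(">H", data, 0)
        name = data[2:2 + name_len].decode("utf-8")
        (argc,) = struct.unpack_from(">H", data, 2 + name_len)
        args = list(struct.unpack_from(f">{argc}i", data, 4 + name_len))
        return name, args

class InMemoryTransport:
    """Stands in for a TCP socket or HTTP connection."""
    def __init__(self):
        self.frames = []

    def send(self, data):
        self.frames.append(data)

    def receive(self):
        return self.frames.pop(0)

class LengthPrefixedTransport:
    """Wraps another transport and adds a 4-byte length prefix to each message."""
    def __init__(self, inner):
        self.inner = inner

    def send(self, data):
        self.inner.send(struct.pack(">I", len(data)) + data)

    def receive(self):
        frame = self.inner.receive()
        (length,) = struct.unpack_from(">I", frame, 0)
        return frame[4:4 + length]

def round_trip(protocol, transport, method, args):
    transport.send(protocol.encode(method, args))
    return protocol.decode(transport.receive())

# Any protocol works with any transport.
results = [
    round_trip(protocol, transport, "add", [2, 3])
    for protocol in (JsonProtocol(), ToyBinaryProtocol())
    for transport in (InMemoryTransport(), LengthPrefixedTransport(InMemoryTransport()))
]
print(results)
```

All four combinations carry the same call, which is the practical benefit of keeping the two layers independent.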
**Thrift in Big Data:** Parquet (columnar storage) uses Thrift to serialize its file metadata, and many Hadoop-ecosystem tools use Thrift for IPC. If you work in data engineering, you may well encounter Thrift more often than gRPC.
**Other RPC systems:**
- **Twirp** (Twitch) - gRPC-like, but over HTTP/1.1 with JSON or Protobuf
- **Connect** (Buf) - gRPC-compatible with HTTP/1.1 support
- **Cap'n Proto** - zero-copy serialization, faster than Protobuf
- **FlatBuffers** (Google) - zero-copy for games/embedded
What is the main difference between Thrift and gRPC?
RPC vs REST
**RPC** is action-oriented: "execute an operation". **REST** is resource-oriented: "manage resources via CRUD". Both approaches are valid; the choice depends on the use case.
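The difference shows up in how a server dispatches requests. In the sketch below, the same order can be modified through an action-style endpoint or a resource-style endpoint. The method name `OrderService.CancelOrder`, the `/orders/<id>` path, and the in-memory `orders` store are all invented for this example, and no real HTTP server is involved.

```python
orders = {42: {"id": 42, "status": "open"}}

# RPC style: the endpoint names an action.
def cancel_order(order_id):
    orders[order_id]["status"] = "cancelled"
    return orders[order_id]

RPC_METHODS = {"OrderService.CancelOrder": cancel_order}

def handle_rpc(method, **params):
    return RPC_METHODS[method](**params)

# REST style: the endpoint names a resource; the HTTP verb says what to do with it.
def handle_rest(verb, path, body=None):
    resource, raw_id = path.strip("/").split("/")
    if resource != "orders":
        raise KeyError(f"unknown resource: {resource}")
    order = orders[int(raw_id)]
    if verb == "GET":
        return order
    if verb == "PATCH":
        order.update(body or {})
        return order
    raise ValueError(f"unsupported verb: {verb}")

# Snapshot each result, since the handlers return the stored dict itself.
after_rpc = dict(handle_rpc("OrderService.CancelOrder", order_id=42))
after_patch = dict(handle_rest("PATCH", "/orders/42", {"status": "open"}))
after_get = dict(handle_rest("GET", "/orders/42"))
print(after_rpc, after_patch, after_get)
```

The RPC handler grows by adding new named operations. The REST handler keeps a fixed set of verbs and grows by adding resources, which is part of why it maps more naturally onto HTTP caching and tooling.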
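**GraphQL** is a third option. The client requests exactly the fields it needs, which addresses REST's over-fetching and the extra round trips needed to assemble related data. Without batching, however, resolvers can run into the N+1 query problem on the server, and GraphQL adds server-side complexity in general.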
**Hybrid approach:** REST/GraphQL for external clients (browsers, mobile), gRPC for internal services. An API gateway translates between the protocols.
**Myth:** gRPC is always better than REST for microservices.
**Reality:** gRPC is optimal for internal services; REST is better for public APIs and browser clients.
**Why:** gRPC requires HTTP/2 (which not all proxies support), its binary format is harder to debug, and browsers don't support it natively. REST is simpler to integrate, document, and cache. The choice depends on context.
When is gRPC preferable to REST for microservices?
Key Takeaways
- **RPC** hides network calls behind a local function interface (leaky abstraction)
- **gRPC** - HTTP/2 + Protobuf, streaming, deadline propagation
- **Protobuf** - binary serialization, 3-10x more compact than JSON, backward compatible
- **Thrift** - alternative with flexible transports and serializations
- **REST vs RPC** - resource-oriented vs action-oriented; often a hybrid approach is used
Related Topics
RPC is one way to communicate in distributed systems:
- Networking in Distributed Systems - partial failures and latency affect RPC
- Message Queues - an asynchronous alternative to synchronous RPC
Questions for Reflection
- Which operations in your API are better expressed as RPC, and which as REST resources?
- How would you handle backward compatibility when adding a required field to Protobuf?
- Which gRPC streaming mode is right for real-time notifications?