Web Development
GraphQL
When Facebook rewrote its mobile app in 2012, the REST API was killing performance: building one screen took 5-10 requests, and on weak 3G that meant seconds. Lee Byron and Nick Schrock flipped the model: let the client describe exactly the fields it needs in one request, and the server returns exactly those - one round-trip. By 2024 GraphQL serves Shopify (5+ billion queries per day), GitHub, and Airbnb. Not a silver bullet - a different data model with its own tradeoffs.
- **Shopify Storefront API**: 5+ billion GraphQL queries per day; every Shopify store uses GraphQL for a flexible storefront that REST cannot cover.
- **GitHub GraphQL v4**: running in parallel with REST v3 for 8 years; used by CLI tools, integrations, GitHub Mobile - wherever query flexibility matters.
- **Apollo Federation at Netflix**: 50+ microservices joined behind a federated gateway - one GraphQL endpoint instead of an API gateway over REST.
GraphQL Schema
In 2012 Facebook was rewriting its mobile app. The REST API returned a news feed: posts, authors, comments, likes. To render one screen the app made 5-10 requests and stitched the data together. Over weak 3G this was slow, and every extra round-trip also cost battery. Lee Byron, Nick Schrock, and their team proposed a different model: the client describes exactly which fields it needs, and the server returns exactly those. The approach is schema-first - both client and server know the types. GraphQL was open-sourced in 2015. By 2024 it was in production at Shopify (5+ billion queries per day), GitHub, Airbnb, and Twitter. It is not a silver bullet, but a different data model for APIs.
Schema Definition Language (SDL): the type keyword defines an object type; Query, Mutation, and Subscription are the root types. Built-in scalars are Int, Float, String, Boolean, and ID; servers can add custom scalars (DateTime, JSON). Modifiers: ! marks non-null, [] marks a list, and [Type!]! is a non-null list of non-null items. Interfaces and unions support polymorphism. Directives carry metadata: @deprecated is built in, while @auth is a typical custom directive. The schema is the contract between client and server.
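A minimal sketch of the SDL elements listed above, held in a TypeScript string the way it would be passed to a schema builder. The blog domain (User, Post, SearchResult) and every field name are assumptions made up for illustration; only the scalars, modifiers, and the @deprecated directive come from GraphQL itself.

```typescript
// Hypothetical blog schema; all type and field names are illustrative.
const typeDefs: string = `
  scalar DateTime            # custom scalar, the server must implement it

  interface Node {           # interface: fields shared by several types
    id: ID!
  }

  type User implements Node {
    id: ID!
    name: String!
    nickname: String         # nullable, because there is no "!"
    posts: [Post!]!          # non-null list of non-null posts
  }

  type Post implements Node {
    id: ID!
    title: String!
    createdAt: DateTime!
    author: User!
    summary: String @deprecated(reason: "Use title instead")
  }

  union SearchResult = User | Post   # union: one of several object types

  type Query {
    post(id: ID!): Post
    search(text: String!): [SearchResult!]!
  }

  type Mutation {
    createPost(title: String!): Post!
  }

  type Subscription {
    postCreated: Post!
  }
`;

// How the nullability modifiers map onto client-side TypeScript types.
interface PostShape {
  id: string;
  title: string;
  createdAt: string;
  author: UserShape;
}
interface UserShape {
  id: string;
  name: string;            // String!  -> string
  nickname: string | null; // String   -> string | null
  posts: PostShape[];      // [Post!]! -> PostShape[]
}

const exampleUser: UserShape = { id: "u1", name: "Ada", nickname: null, posts: [] };
```

The TypeScript interfaces are written by hand here; in a schema-first workflow they would normally be generated from the SDL.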
Cursor pagination via connections (the Relay convention) is the recommended way to paginate in GraphQL. With offset+limit, rows inserted or deleted between requests shift the window, so items get skipped or repeated. Cursor-based pagination is robust to such changes: 'give me the next N after this cursor'.
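The 'next N after this cursor' idea can be sketched over an in-memory list. This is an illustration under assumptions, not the Relay library: the Item shape, the paginate helper, and the "id:" cursor format are hypothetical, and the list is assumed to be sorted by ascending id.

```typescript
// Relay-style connection over an in-memory list sorted by ascending id.
interface Item { id: number; title: string; }
interface Edge<T> { cursor: string; node: T; }
interface Connection<T> {
  edges: Edge<T>[];
  pageInfo: { endCursor: string | null; hasNextPage: boolean };
}

// Cursors are opaque to clients; this sketch simply wraps the id.
const encodeCursor = (id: number): string => `id:${id}`;
const decodeCursor = (cursor: string): number => Number(cursor.slice(3));

function paginate(items: Item[], first: number, after?: string): Connection<Item> {
  const afterId = after === undefined ? -Infinity : decodeCursor(after);
  // SQL equivalent: WHERE id > :afterId ORDER BY id LIMIT :first + 1
  const candidates = items.filter((item) => item.id > afterId).slice(0, first + 1);
  const page = candidates.slice(0, first);
  const edges = page.map((node) => ({ cursor: encodeCursor(node.id), node }));
  return {
    edges,
    pageInfo: {
      endCursor: edges.length > 0 ? edges[edges.length - 1].cursor : null,
      // One extra row was requested to learn whether another page exists.
      hasNextPage: candidates.length > first,
    },
  };
}
```

Because the second request says "after id 2" rather than "skip 2 rows", a row inserted at the front of the list between requests does not cause item 2 to appear again on the next page.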
What is the main advantage of GraphQL over a REST API for a mobile client?
Resolvers
The schema is the what; resolvers are the how. For every field in the schema there is a function that returns its value. The GraphQL engine walks the query tree and calls the corresponding resolver for each field. A resolver receives four arguments: parent (the object returned by the parent field's resolver), args (the field's arguments), context (per-request state such as auth and data sources), and info (metadata about the current field and the schema). If Post.author has no explicit resolver, the default resolver is used: it reads the author property off the parent object. With an explicit resolver, that function is called instead. This is the extension point for lazy loading: the post object may not include author, but the resolver fetches it on demand.
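A simplified sketch of the default-versus-explicit resolver lookup described above. It is not the API of any GraphQL runtime: the "Type.field" resolver map, the resolveField helper, and the Context and Info shapes are assumptions chosen to keep the example small.

```typescript
// The four resolver arguments, with deliberately simplified shapes.
interface Context { usersById: Map<number, { id: number; name: string }>; }
interface Info { fieldName: string; }
type Resolver = (
  parent: any,
  args: Record<string, unknown>,
  context: Context,
  info: Info,
) => unknown;

// Default behavior: read the property named after the field off the parent.
const defaultResolver: Resolver = (parent, _args, _context, info) =>
  parent[info.fieldName];

// Explicit resolvers keyed by "Type.field". Only Post.author needs one,
// because the stored post carries authorId rather than an author object.
const resolvers: Record<string, Resolver> = {
  "Post.author": (post, _args, context) =>
    context.usersById.get(post.authorId) ?? null,
};

function resolveField(
  typeName: string,
  fieldName: string,
  parent: unknown,
  context: Context,
): unknown {
  const resolver = resolvers[`${typeName}.${fieldName}`] ?? defaultResolver;
  return resolver(parent, {}, context, { fieldName });
}

// Example data: the post row has authorId but no author property.
const context: Context = { usersById: new Map([[7, { id: 7, name: "Ada" }]]) };
const post = { id: 1, title: "Hello", authorId: 7 };

const title = resolveField("Post", "title", post, context);   // default resolver
const author = resolveField("Post", "author", post, context); // explicit resolver
```

Without the explicit Post.author entry, the default resolver would read post.author, find nothing, and return undefined, which surfaces as null (or as an error if the field is declared non-null).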
Resolver chain: the call tree mirrors the query structure. query { user(id: 1) { posts { author { name } } } } has 4 resolver levels: Query.user -> User.posts -> Post.author -> User.name. Each resolver returns a value or a Promise. Apollo Server, Mercurius, and GraphQL Yoga are popular runtimes. The resolver context is created per request and holds DB connections, the current user, and DataLoader instances. Authentication and authorization usually live in directives or in context checks inside resolvers.
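To make the resolver chain concrete, here is a toy executor that walks a selection tree and records each resolver call. It is a sketch, not how Apollo Server, Mercurius, or Yoga are implemented: the Selection encoding, the schema map, and the in-memory users and posts arrays are all hypothetical, and field arguments are omitted.

```typescript
// A selection set as a nested object: field -> sub-selection, or null for a leaf.
type Selection = { [field: string]: Selection | null };
type FieldDef = { returns: string; resolve: (parent: any) => unknown };
type Schema = { [typeName: string]: { [field: string]: FieldDef } };

const users = [{ id: 1, name: "Ada" }];
const posts = [{ id: 10, authorId: 1 }];
const calls: string[] = []; // order in which resolvers run

const schema: Schema = {
  Query: { user: { returns: "User", resolve: () => users[0] } },
  User: {
    name: { returns: "String", resolve: (u) => u.name },
    posts: { returns: "Post", resolve: (u) => posts.filter((p) => p.authorId === u.id) },
  },
  Post: {
    author: { returns: "User", resolve: (p) => users.find((u) => u.id === p.authorId) },
  },
};

async function execute(typeName: string, parent: unknown, selection: Selection): Promise<any> {
  const result: Record<string, unknown> = {};
  for (const [field, sub] of Object.entries(selection)) {
    const def = schema[typeName][field];
    calls.push(`${typeName}.${field}`);
    const value = await def.resolve(parent); // a value or a Promise
    if (sub === null) {
      result[field] = value; // leaf field
    } else if (Array.isArray(value)) {
      // List field: resolve the sub-selection for every element.
      result[field] = await Promise.all(value.map((v) => execute(def.returns, v, sub)));
    } else {
      result[field] = await execute(def.returns, value, sub);
    }
  }
  return result;
}

// Mirrors: query { user { posts { author { name } } } }
const selection: Selection = { user: { posts: { author: { name: null } } } };
```

Calling execute("Query", {}, selection) visits the resolvers in the order Query.user, User.posts, Post.author, User.name, which is the chain described above.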
Query depth and complexity attacks: an attacker can request { user { posts { author { posts { author { posts { ... }}}}}}} - nesting of arbitrary depth, where each extra level can multiply the number of rows the database has to fetch. Defenses: a depth limit (typically 7-10, for example via graphql-depth-limit), query complexity analysis (graphql-cost-analysis), and persisted queries in production.
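A rough sketch of a depth check, assuming it is enough to count brace nesting in the raw query text. Real depth limiters work on the parsed query AST and resolve fragments, and may count depth differently; queryDepth and assertDepth are hypothetical helper names.

```typescript
// Returns the maximum selection-set nesting depth of a query string.
// Simplification: counts braces and skips ordinary string literals.
// Fragments, block strings, and input-object arguments are not handled.
function queryDepth(query: string): number {
  let depth = 0;
  let max = 0;
  let inString = false;
  for (let i = 0; i < query.length; i++) {
    const ch = query[i];
    if (inString) {
      if (ch === "\\") i++;                  // skip the escaped character
      else if (ch === '"') inString = false; // closing quote
    } else if (ch === '"') {
      inString = true;
    } else if (ch === "{") {
      depth++;
      max = Math.max(max, depth);
    } else if (ch === "}") {
      depth--;
    }
  }
  return max;
}

// Rejects a query before execution if it nests deeper than the limit.
function assertDepth(query: string, limit: number): void {
  const depth = queryDepth(query);
  if (depth > limit) {
    throw new Error(`Query depth ${depth} exceeds limit ${limit}`);
  }
}
```

The check runs before any resolver is called, so a rejected query costs the server only a single pass over the query text.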
What happens if no resolver is defined for the Post type's author field, but the DB row has an authorId column?
Subscriptions
Query is a one-shot request. Mutation is a one-shot change. Subscription is a long-lived event stream: the client subscribes to server-pushed updates. WebSocket with the graphql-ws protocol is the de facto standard transport. Each subscription is a persistent connection with server-side filters. Apollo Server + Redis PubSub is a typical architecture: a mutation publishes to a Redis channel, and the server filters events for each subscriber by its arguments. Real-world uses: chat, live notifications, collaborative editing (as in Figma), real-time dashboards. Subscriptions are not for everything - polling is often simpler and sufficient. They make sense when sub-second latency and event-driven UX are required.
A subscription resolver returns an AsyncIterator instead of a single value. PubSub is a thin abstraction over Redis, Kafka, or RabbitMQ. A subscription filter is a function that decides which events reach a specific subscriber. Authorization is performed at connection time (via connectionParams) and optionally on each event. Server-sent events (SSE) are an alternative to WebSocket: simpler, but one-way (server -> client). For large query results, some servers also support incremental delivery (the @defer and @stream directives) over multipart HTTP responses.
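The AsyncIterator-plus-filter mechanism can be sketched with an in-memory PubSub. This is an illustration only, not the API of graphql-subscriptions or any Redis adapter: the PubSub class, its subscribe and listenerCount methods, and the Message shape are hypothetical.

```typescript
interface Message { roomId: string; text: string; } // hypothetical event shape

class PubSub<T> {
  private listeners = new Set<(event: T) => void>();

  publish(event: T): void {
    this.listeners.forEach((listener) => listener(event));
  }

  listenerCount(): number {
    return this.listeners.size;
  }

  // Registers a filtered listener immediately and returns an async iterator
  // that yields every accepted event in publish order.
  subscribe(filter: (event: T) => boolean): AsyncGenerator<T, void, undefined> {
    const queue: T[] = [];
    let wake: (() => void) | null = null;
    const listener = (event: T): void => {
      if (!filter(event)) return; // per-subscriber filter
      queue.push(event);
      if (wake) wake();           // resume a waiting consumer
    };
    const listeners = this.listeners;
    listeners.add(listener);

    async function* stream(): AsyncGenerator<T, void, undefined> {
      try {
        while (true) {
          while (queue.length > 0) yield queue.shift() as T;
          // Park until the listener pushes the next accepted event.
          await new Promise<void>((resolve) => { wake = resolve; });
          wake = null;
        }
      } finally {
        // Runs when a started stream is closed via return().
        listeners.delete(listener);
      }
    }
    return stream();
  }
}

// What a subscription resolver for a chat room might hand back.
const pubsub = new PubSub<Message>();
const roomA = pubsub.subscribe((message) => message.roomId === "a");
```

In a real deployment, publish would be driven by a Redis or Kafka consumer rather than called in-process, so that every server instance sees every event.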
Persistent subscription connections grow linearly with the number of active subscribers: 100k online users in a chat means 100k WebSocket connections. At roughly 5-10 KB of memory per connection, that is about 0.5-1 GB for connection state alone, plus CPU for keepalives. Horizontal scaling requires sticky sessions or fanout through Redis PubSub. Cloudflare Workers with Durable Objects are one runtime suited to scalable subscriptions.
How does a GraphQL subscription differ from REST long-polling for real-time updates?
DataLoader and N+1
The most common GraphQL pitfall is the N+1 query. query { posts { author { name } } } runs 1 query for posts (returning 100 rows), then 100 separate queries, one for each author. One client query turns into 101 SQL queries. In REST the cost is visible: the developer writes the JOIN for the endpoint by hand. In GraphQL the resolver is called per field, and the apparent cleanliness hides the issue. DataLoader (open-sourced by Facebook in 2015) solves it: it wraps a load(id) function, batching and de-duplicating requests within one tick of the event loop. The resolver calls loader.load(authorId) 100 times - DataLoader accumulates the IDs and runs ONE query: SELECT * FROM users WHERE id IN (1,2,...,100). Per-request caching prevents repeated fetches of the same object.
DataLoader API: new DataLoader(batchLoadFn), where batchLoadFn takes an array of keys and returns a Promise of an array of values with the same length and in the same order as the keys. loader.load(key) returns a Promise for a single value. loader.loadMany(keys) is the variant for several keys. Cache scope is per-request, not shared across requests. For complex fetchers, dataloader-mongoose and mongoose-dataloader-cache provide ready integrations. An alternative is join-monster: it reads the GraphQL info argument and generates a SQL JOIN, but it is less flexible.
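The batching and de-duplication behavior can be sketched in a few lines. TinyLoader below is a hypothetical stand-in written for this example, not the dataloader package: it flushes on a microtask, so it only batches load calls issued synchronously in the same tick, and it omits loadMany, cache clearing, and per-key error handling.

```typescript
type BatchFn<K, V> = (keys: K[]) => Promise<V[]>;
type Pending<K, V> = { key: K; resolve: (value: V) => void; reject: (error: unknown) => void };

class TinyLoader<K, V> {
  private batchFn: BatchFn<K, V>;
  private cache = new Map<K, Promise<V>>();
  private pending: Pending<K, V>[] = [];

  constructor(batchFn: BatchFn<K, V>) {
    this.batchFn = batchFn;
  }

  load(key: K): Promise<V> {
    const cached = this.cache.get(key);
    if (cached) return cached; // de-duplicate repeated keys
    const promise = new Promise<V>((resolve, reject) => {
      if (this.pending.length === 0) {
        // First key of a new batch: schedule a single flush.
        Promise.resolve().then(() => this.flush());
      }
      this.pending.push({ key, resolve, reject });
    });
    this.cache.set(key, promise);
    return promise;
  }

  private async flush(): Promise<void> {
    const batch = this.pending;
    this.pending = [];
    try {
      const values = await this.batchFn(batch.map((p) => p.key));
      // Relies on the batch function returning values in key order.
      batch.forEach((p, i) => p.resolve(values[i]));
    } catch (error) {
      batch.forEach((p) => p.reject(error));
    }
  }
}

// Example: the batch function stands in for SELECT ... WHERE id IN (...).
const userNames = new Map([[1, "Ada"], [2, "Grace"], [3, "Linus"]]);
let batchCalls = 0;
let lastBatch: number[] = [];
const userLoader = new TinyLoader<number, string>(async (ids) => {
  batchCalls++;
  lastBatch = ids;
  return ids.map((id) => userNames.get(id) ?? "unknown");
});
```

If 100 Post.author resolvers call userLoader.load(post.authorId) in the same tick and the posts share three authors, the batch function runs once with three keys instead of 100 times.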
Per-request DataLoader cache scope matters. A global cache would let different users see each other's data and leak stale data across requests. Per-request: the cache lives for the duration of one GraphQL request and is then garbage collected.
Myth: GraphQL replaces REST - all new APIs should use GraphQL
GraphQL and REST solve different problems. GraphQL excels at: flexible clients with diverse needs, aggregation across services, mobile apps with varied screens. REST excels at: file uploads, simple CRUD, HTTP caching, public APIs (easier to document and understand). Many companies use both: GraphQL for the BFF (backend-for-frontend), REST for service-to-service calls.
GraphQL adds complexity: schema, resolvers, the N+1 problem, security (query depth/complexity), client setup (Apollo/Relay). For simple CRUD APIs this overhead does not pay off. GitHub has run GraphQL v4 alongside REST v3 for 8 years - both are needed.
DataLoader resolves N+1. But what happens when a query is { posts { author { posts { author } } } }?
Key Ideas
- **Schema-first contract**: client and server agree on types, IDs, and required (non-null) fields. SDL is the source of truth and the input for generating client and server code.
- **Resolvers** are functions per field; the call tree mirrors the query structure. The default resolver covers fields already present on the parent object; relations that must be fetched need explicit resolvers.
- **Subscriptions over WebSocket** deliver server-pushed real-time updates. Not for everything - polling is often simpler; subscriptions pay off when sub-second latency is required.
- **DataLoader** solves N+1: batching and dedup within one event-loop tick. Without it, GraphQL easily generates 100+ SQL queries per request.
Related Topics
GraphQL overlaps with REST, databases, and microservice architecture:
- REST API Design — GraphQL and REST are not alternatives but tools for different jobs. Many shops run both: GraphQL for BFF, REST for public APIs and file uploads
- Microservices — GraphQL Federation merges schemas from different services into a single supergraph - an alternative to an API gateway for microservice backends
Questions for Reflection
- GraphQL gives the client flexibility but loses HTTP caching (everything is POST to /graphql). Persisted queries and automatic persisted queries (APQ) help, but add complexity. In which scenarios does REST remain the right choice?
- Schema-first vs code-first: the schema is described in SDL separately, or generated from code. Which approach suits a 20-person team building a public API?
- Federation lets multiple GraphQL services share one schema. But it introduces coupling and complicates debugging. When is federation justified, and when is a single monolithic GraphQL service better?
Related Lessons
- web-12 — REST API is the predecessor to GraphQL
- web-14 — After GraphQL - authentication and authorization
- db-05-sql-basics — GraphQL queries and SQL SELECT: similar declarative style
- aie-16-tool-calling — GraphQL resolver and tool calling - declarative data access specification
- ds-09-trees-intro — GraphQL Schema is a type tree; traversal gives data
- net-21-http-basics