Backend Transport

GraphQL: Flexible Data Queries

In 2015, Facebook open-sourced GraphQL after 3 years of internal use. By then it was already serving the entire mobile app with 1+ billion users. The problem it solved is familiar to anyone who has built APIs: a mobile screen needs data from 5 different REST endpoints - that is 5 round-trips, 5 JSON responses full of fields nobody asked for, and a hard dependency on whatever the server decided to return.

  • **GitHub API v4** - built entirely on GraphQL and offered alongside REST v3, aimed at complex queries
  • **Shopify Storefront API** - GraphQL allows customizing storefronts without adding new endpoints
  • **Twitter API** - GraphQL for internal services where different clients need different data shapes

Schema and the Type System

In 2012, Facebook was rewriting its mobile app from HTML5 to native and hit a wall: hundreds of REST endpoints, each returning a fixed structure. Different mobile screens needed different data - one screen pulled 3 fields from 5 different endpoints. The result: extra HTTP round-trips and over-fetching on every screen. The solution they built: **GraphQL**, where the client describes exactly what it needs.

The foundation of GraphQL is the **Schema Definition Language (SDL)**. The schema describes all available types and operations - it is the API contract. Unlike REST, where the response shape is decided by the server, in GraphQL the response shape is decided by the client's query.

**! in types:** `String!` - non-nullable (will never be null). `[Post!]!` - a non-nullable array where each element is also non-nullable. `[Post]` - can be null, and elements can be null. The client knows the guarantees upfront.
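The nullability rules above can be sketched as a small checker. This is a simplified model for illustration, not how any GraphQL library represents types: the class names `Named`, `ListOf`, `NonNull` and the function `allows` are made up here, and only null-ness and list shape are checked.

```python
from dataclasses import dataclass
from typing import Any, Union


@dataclass(frozen=True)
class Named:
    """A named type such as String or Post."""
    name: str


@dataclass(frozen=True)
class ListOf:
    """A list wrapper, written [inner] in SDL."""
    inner: "TypeRef"


@dataclass(frozen=True)
class NonNull:
    """A non-null wrapper, written inner! in SDL."""
    inner: "TypeRef"


TypeRef = Union[Named, ListOf, NonNull]


def allows(type_ref: TypeRef, value: Any) -> bool:
    """Return True if `value` satisfies the nullability rules of `type_ref`.

    The named type itself (String vs. Post) is deliberately not validated.
    """
    if isinstance(type_ref, NonNull):
        return value is not None and allows(type_ref.inner, value)
    if value is None:
        return True  # any type without "!" accepts null
    if isinstance(type_ref, ListOf):
        return isinstance(value, list) and all(
            allows(type_ref.inner, item) for item in value
        )
    return True  # a non-null value of a named type


post = {"id": "p_1"}
string_required = NonNull(Named("String"))              # String!
posts_strict = NonNull(ListOf(NonNull(Named("Post"))))  # [Post!]!
posts_loose = ListOf(Named("Post"))                     # [Post]

print(allows(string_required, None))       # False
print(allows(posts_strict, []))            # True: an empty list is not null
print(allows(posts_strict, [post, None]))  # False: elements must be non-null
print(allows(posts_loose, None))           # True
print(allows(posts_loose, [post, None]))   # True
```

Note that `!` says nothing about emptiness: `[Post!]!` still accepts an empty list.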

A field is declared as `tags: [String!]`. Which values are valid?

Queries and Mutations

GraphQL has three operation types. **Query** - reads data, analogous to GET. **Mutation** - changes data, analogous to POST/PUT/DELETE. **Subscription** - subscribes to real-time events. The defining feature: the client specifies exactly which fields it needs and the server returns precisely that.
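To make "the server returns precisely that" concrete, here is a minimal sketch of applying a selection set to stored data. The record `USER`, the nested-dict representation of a selection, and the `select` helper are all assumptions for illustration; a real server parses the query text and calls resolvers instead.

```python
from typing import Any, Dict, Optional

# Hypothetical stored record: the User type has more fields than the client asks for.
USER = {
    "id": "u_1",
    "name": "Ada",
    "email": "ada@example.com",
    "bio": "Writes about compilers",
    "posts": [
        {"id": "p_1", "title": "Hello", "body": "long text"},
        {"id": "p_2", "title": "Again", "body": "more text"},
    ],
}

# A selection set as nested dicts: a leaf field maps to None,
# a field with sub-selections maps to another selection set.
Selection = Dict[str, Optional[dict]]


def select(source: Any, selection: Selection) -> Any:
    """Copy only the requested fields from `source`, recursing into sub-selections."""
    if source is None:
        return None
    if isinstance(source, list):
        return [select(item, selection) for item in source]
    result = {}
    for field, sub_selection in selection.items():
        value = source[field]
        result[field] = value if sub_selection is None else select(value, sub_selection)
    return result


# Rough equivalent of the query: { user { id name posts { title } } }
response = select(USER, {"id": None, "name": None, "posts": {"title": None}})
print(response)
# {'id': 'u_1', 'name': 'Ada', 'posts': [{'title': 'Hello'}, {'title': 'Again'}]}
```

Fields that were not requested (`email`, `bio`, `body`) never appear in the response, regardless of how many the type defines.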

**Batching mutations:** multiple top-level mutation fields in one request execute **sequentially**, in the order they are written (unlike query fields, which the server may execute in parallel). This guarantees ordering for stateful operations.
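The difference in execution order can be sketched with `asyncio`. The functions `resolve`, `run_query`, and `run_mutation` are illustrative stand-ins for an executor, not the API of any GraphQL library; the delays simulate work of different durations.

```python
import asyncio


async def resolve(name, delay, log):
    """Simulated field resolver that records when it starts and finishes."""
    log.append(f"start {name}")
    await asyncio.sleep(delay)
    log.append(f"end {name}")
    return name


async def run_query(fields, log):
    # Query fields have no side effects, so an executor may run them concurrently.
    return await asyncio.gather(*(resolve(n, d, log) for n, d in fields))


async def run_mutation(fields, log):
    # Top-level mutation fields run one at a time, in document order.
    results = []
    for name, delay in fields:
        results.append(await resolve(name, delay, log))
    return results


fields = [("first", 0.05), ("second", 0.01)]
query_log, mutation_log = [], []
asyncio.run(run_query(fields, query_log))
asyncio.run(run_mutation(fields, mutation_log))

print(query_log)     # ['start first', 'start second', 'end second', 'end first']
print(mutation_log)  # ['start first', 'end first', 'start second', 'end second']
```

In the mutation log the second field does not start until the first has finished, even though it would complete sooner on its own.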

A client requests fields `id, name` from a User type that has 10 more fields. What does GraphQL return?

The N+1 Problem and DataLoader

GraphQL introduces a deceptive performance trap. Fetching 100 posts with their authors looks harmless, but here is what happens internally: 1 query for the post list, then 100 queries - one per author. That is 101 database queries instead of 2. This is the **N+1 problem** - invisible during development (small data) and destructive in production.
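The arithmetic is easy to reproduce with a counter. The sketch below uses an in-memory `FakeDB` (a hypothetical stand-in for a real database) and compares a per-post author lookup with a single batched lookup.

```python
# In-memory stand-ins for database tables (hypothetical data).
USERS = {f"u_{i}": {"id": f"u_{i}", "name": f"User {i}"} for i in range(10)}
POSTS = [{"id": f"p_{i}", "author_id": f"u_{i % 10}"} for i in range(100)]


class FakeDB:
    """Counts every call as one database query."""

    def __init__(self):
        self.query_count = 0

    def all_posts(self):
        self.query_count += 1
        return list(POSTS)

    def user_by_id(self, user_id):
        self.query_count += 1
        return USERS[user_id]

    def users_by_ids(self, user_ids):
        self.query_count += 1
        return {user_id: USERS[user_id] for user_id in user_ids}


def posts_with_authors_naive(db):
    # One query for the list, then one query per post for its author.
    posts = db.all_posts()
    return [(p["id"], db.user_by_id(p["author_id"])["name"]) for p in posts]


def posts_with_authors_batched(db):
    # One query for the list, one query for all distinct authors.
    posts = db.all_posts()
    authors = db.users_by_ids({p["author_id"] for p in posts})
    return [(p["id"], authors[p["author_id"]]["name"]) for p in posts]


naive_db, batched_db = FakeDB(), FakeDB()
naive_result = posts_with_authors_naive(naive_db)
batched_result = posts_with_authors_batched(batched_db)

print(naive_db.query_count)    # 101
print(batched_db.query_count)  # 2
```

Both functions return the same data; only the number of round-trips to the database differs.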

**DataLoader and caching:** DataLoader caches results within a single request by default. Calling `userLoader.load('u_5')` a second time returns the cached result without a new DB query. After a mutation changes a record, clear its cached entry explicitly: `loader.clear('u_5')`.
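The batching and caching behaviour can be imitated in a few lines. This is a hand-rolled Python sketch, not the DataLoader library (which is JavaScript): the class `MiniLoader` and the `batch_users` function are hypothetical, and only the `load` and `clear` method names mirror the calls mentioned above. Here the batch is dispatched on the next event loop iteration, after every `load()` call already scheduled has queued its key.

```python
import asyncio


class MiniLoader:
    """Collects keys requested in the current loop iteration into one batch."""

    def __init__(self, batch_fn):
        self._batch_fn = batch_fn  # async: list of keys -> list of values, same order
        self._cache = {}           # key -> Future; lives as long as the loader
        self._queue = []           # (key, Future) pairs waiting for the next batch

    def load(self, key):
        if key in self._cache:
            return self._cache[key]  # repeated key: no new DB work
        loop = asyncio.get_running_loop()
        future = loop.create_future()
        self._cache[key] = future
        if not self._queue:
            # First key of a new batch: the task starts on the next loop iteration.
            loop.create_task(self._dispatch())
        self._queue.append((key, future))
        return future

    def clear(self, key):
        self._cache.pop(key, None)

    async def _dispatch(self):
        pending, self._queue = self._queue, []
        try:
            values = await self._batch_fn([key for key, _ in pending])
        except Exception as exc:
            for _, future in pending:
                future.set_exception(exc)
            return
        for (_, future), value in zip(pending, values):
            future.set_result(value)


batches = []  # records the keys of every batch that reached the "database"


async def batch_users(keys):
    batches.append(list(keys))
    return [{"id": key} for key in keys]


async def main():
    loader = MiniLoader(batch_users)
    first, other, repeat = await asyncio.gather(
        loader.load("u_5"), loader.load("u_7"), loader.load("u_5")
    )
    cached = await loader.load("u_5")  # served from cache, no new batch
    loader.clear("u_5")
    fresh = await loader.load("u_5")   # cache entry gone, so a new batch fires
    return first, repeat, cached, fresh


first, repeat, cached, fresh = asyncio.run(main())
print(batches)  # [['u_5', 'u_7'], ['u_5']]
```

Because the cache lives on the loader instance, creating one loader per incoming request is what keeps cached data from leaking between requests.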

DataLoader accumulates IDs and sends a batch query. When exactly does it fire the batch?

Subscriptions: Real-Time Data

Query and Mutation follow request-response. **Subscription** is a long-lived connection: the client subscribes to an event, and the server pushes data every time it occurs. Technically, subscriptions usually run over WebSocket, while the query language and response format remain GraphQL.
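A subscription can be pictured as an async stream fed by a publish/subscribe channel. The sketch below is an in-process model under stated assumptions: `InMemoryPubSub`, the topic names, and the `message_added` generator are invented for illustration, and no WebSocket transport is involved.

```python
import asyncio


class InMemoryPubSub:
    """Per-process pub/sub: only subscribers in the same process see events."""

    def __init__(self):
        self._queues = {}  # topic -> list of subscriber queues

    def subscribe(self, topic):
        queue = asyncio.Queue()
        self._queues.setdefault(topic, []).append(queue)
        return queue

    def publish(self, topic, payload):
        for queue in self._queues.get(topic, []):
            queue.put_nowait(payload)


async def message_added(queue, limit):
    """Subscription stream: yields one GraphQL-shaped result per event."""
    for _ in range(limit):
        payload = await queue.get()
        yield {"data": {"messageAdded": payload}}


async def main():
    instance_a = InMemoryPubSub()  # server instance holding the client's connection
    instance_b = InMemoryPubSub()  # a second server instance behind the balancer

    queue_a = instance_a.subscribe("chat:1")
    queue_b = instance_b.subscribe("chat:1")

    instance_a.publish("chat:1", {"text": "hi"})
    instance_a.publish("chat:1", {"text": "bye"})
    instance_a.publish("chat:2", {"text": "other room"})  # different topic

    received = [event async for event in message_added(queue_a, limit=2)]
    return received, queue_b.empty()


events, instance_b_saw_nothing = asyncio.run(main())
print(events)
# [{'data': {'messageAdded': {'text': 'hi'}}}, {'data': {'messageAdded': {'text': 'bye'}}}]
print(instance_b_saw_nothing)  # True
```

The subscriber on `instance_b` receives nothing because the events were published in another process's memory, which is the gap a shared broker closes when several server instances run side by side.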

**When Subscriptions are not needed:** polling every 5-10 seconds is simpler and more reliable for data that changes infrequently. Subscriptions are for genuine real-time scenarios - chat, collaborative editing, live metrics. Every WebSocket holds resources - 10,000 subscribers means 10,000 open connections.

GraphQL Subscription in production with multiple server instances. Why is an external PubSub (Redis) needed?

Apollo Federation: Distributed Schema

When the number of microservices grows, a single GraphQL schema becomes a monolith: all teams edit one file, deploys block each other. **Apollo Federation** takes a different approach: each service owns its slice of the schema, and the Federation Gateway merges them into a unified graph at runtime.
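The idea of schema slices merged by a gateway can be modelled in miniature. This is a toy sketch, not Apollo's composition or query planning, both of which are considerably more involved: the `Subgraph` and `Gateway` classes, the field ownership map, and the sample data are all assumptions made for illustration.

```python
class Subgraph:
    """A service that owns some fields of the graph."""

    def __init__(self, name, schema, resolvers):
        self.name = name
        self.schema = schema        # type name -> set of field names owned here
        self.resolvers = resolvers  # (type name, field) -> function(entity_id)
        self.requests = 0

    def fetch(self, type_name, entity_id, fields):
        """One simulated HTTP request resolving several fields of one entity."""
        self.requests += 1
        return {f: self.resolvers[(type_name, f)](entity_id) for f in fields}


class Gateway:
    """Merges subgraph slices and routes each field to its owner."""

    def __init__(self, subgraphs):
        self.by_name = {sg.name: sg for sg in subgraphs}
        self.owner = {}  # (type name, field) -> owning subgraph name
        for sg in subgraphs:
            for type_name, fields in sg.schema.items():
                for field in fields:
                    self.owner[(type_name, field)] = sg.name

    def unified_schema(self):
        merged = {}
        for type_name, field in self.owner:
            merged.setdefault(type_name, set()).add(field)
        return merged

    def resolve(self, type_name, entity_id, fields):
        # Group requested fields by owner, then send one request per subgraph.
        plan = {}
        for field in fields:
            plan.setdefault(self.owner[(type_name, field)], []).append(field)
        result = {}
        for subgraph_name, owned_fields in plan.items():
            subgraph = self.by_name[subgraph_name]
            result.update(subgraph.fetch(type_name, entity_id, owned_fields))
        return result


users = Subgraph(
    "users",
    {"User": {"id", "name"}},
    {("User", "id"): lambda uid: uid, ("User", "name"): lambda uid: "Ada"},
)
posts = Subgraph(
    "posts",
    {"User": {"posts"}},
    {("User", "posts"): lambda uid: ["p_1", "p_2"]},
)

gateway = Gateway([users, posts])
merged = gateway.unified_schema()
result = gateway.resolve("User", "u_1", ["name", "posts"])

print(sorted(merged["User"]))          # ['id', 'name', 'posts']
print(result)                          # {'name': 'Ada', 'posts': ['p_1', 'p_2']}
print(users.requests, posts.requests)  # 1 1
```

The client sends one query, yet the gateway contacts each owning service once, so the team behind `posts` can extend `User` without touching the `users` service.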

**Rover CLI:** the tool for working with Federation. `rover subgraph publish` registers the updated schema in the Apollo Schema Registry - the Gateway picks up the change without restarting. Schemas from all services are versioned centrally.

**Myth:** GraphQL is always faster than REST because the client gets only the fields it needs.

**Reality:** GraphQL solves over-fetching, but without DataLoader it creates the N+1 problem. REST with well-designed endpoints can outperform a naive GraphQL implementation.

GraphQL's flexibility requires a more sophisticated server architecture. A simple REST endpoint with an optimized SQL query will beat a naive GraphQL resolver.

In Federation, a client requests a User with their Posts. How many HTTP requests does the Gateway make to the subgraph services?

GraphQL: key ideas

  • SDL schema - single API contract; the client defines the response shape, not the server
  • Query (read), Mutation (write), Subscription (real-time) - three operation types
  • N+1 problem - a resolver for each list item fires a separate DB query
  • DataLoader - batches and deduplicates all loads requested within a single event loop tick
  • Federation - each microservice owns part of the schema, the Gateway merges them

Related topics

GraphQL lives within the ecosystem of synchronous protocols and complements REST where flexibility is needed.

  • REST — The classic approach GraphQL is compared against
  • WebSocket — The transport layer for GraphQL Subscriptions
  • Message Queues — Alternative to Subscriptions for event-driven architecture

Questions for reflection

  • In what cases does GraphQL create more problems than it solves compared to REST?
  • How does DataLoader affect read consistency - can two .load() calls in the same batch receive inconsistent data?
  • How does Federation change the API development process in a team spanning 5+ services?

Related lessons

  • net-22-http-headers