AI Engineering

How AI Coding Assistants Work: Copilot, Cursor, Claude Code Under the Hood

Lesson objectives

  • Understand Copilot's architecture: FIM, context gathering, ghost text pipeline
  • Break down Cursor's approach: codebase indexing, semantic search, Apply Model
  • Study Claude Code's agentic loop: tool use, self-correction, terminal integration
  • Master components for building a code assistant: LSP + embeddings + LLM
  • Evaluate the evolution of AI pair programming and the changing role of developers

GitHub Copilot launched in 2022 as "autocomplete on steroids". In 2024, Cursor started writing entire functions, performing refactoring, and explaining code. The difference in approach: Copilot completes, Cursor acts as an agent. The market noticed: Cursor reached USD 400M ARR in two years. 92% of developers already use AI coding assistants (GitHub Survey), but most treat them as black boxes. Understanding the architecture from the inside is the difference between "AI sometimes helps" and "AI accelerates work 5-10x".

  • GitHub Copilot - 1.8M paid subscribers, RLHF on developer feedback, Codex FIM-model specifically fine-tuned on GitHub repositories
  • Cursor - USD 400M ARR in two years, VS Code fork with deep AI integration, vector index of codebase + Apply Model for multi-file edits
  • Claude Code - agentic assistant in the terminal, agentic loop with self-correction via tsc/npm test, autonomously solves Junior-Mid level tasks
  • Amazon CodeWhisperer, Tabnine, Codeium - alternative code assistants; DeepSeek Coder and StarCoder - open-source FIM-models

From IntelliSense to agents

  • **2001**: Microsoft IntelliSense in Visual Studio - the first smart autocomplete
  • **2018**: TabNine - first ML-based completion, a small local model
  • **2021**: GitHub Copilot in beta, Codex (GPT-3 fine-tuned on code) - the turning point
  • **2022**: Copilot GA, 1M users in 2 months
  • **2023**: Cursor - VS Code fork with codebase indexing, multi-file editing; Apply Model concept emerges
  • **2024**: Claude Code, Devin - agentic loop; agents begin solving junior-level developer tasks autonomously
  • **2025**: 92% of developers use AI assistants, USD 400M ARR at Cursor in two years

Prerequisites

  • RAG: Retrieval-Augmented Generation from Theory to Implementation
  • AI Agents: ReAct, Planning, Memory, Observe-Think-Act Loop

GitHub Copilot: context gathering, FIM, ghost text

GitHub Copilot launched in 2022 as "autocomplete on steroids" - and the market was skeptical. Smart code completion existed before: IntelliSense, TabNine. But Copilot bet on **Fill-in-the-Middle (FIM)** - a prompt format where the model sees code both before and after the cursor, and generates what belongs in between. Not just "next line". Prediction with intent understanding.

The entire pipeline fits in 200-500 ms: gather context from open tabs, build a FIM prompt, send it to the model (Codex, then GPT-4), rank the suggestions, show ghost text. A 50-100 ms debounce after the last keystroke keeps every keypress from triggering its own API request. Codex was specifically fine-tuned on GitHub repositories with RLHF on developer feedback - that's what made suggestions relevant rather than random.

  • **Debounce** - waits 50-100ms after the last keystroke to avoid spamming the API
  • **Context gathering** - collects prefix/suffix from the current file + snippets from open tabs
  • **Prompt building** - forms a FIM prompt with context (up to ~8K tokens)
  • **Model call** - requests the model with low temperature (0.0-0.2) for deterministic suggestions
  • **Ranking** - multiple suggestions are ranked by probability and length
  • **Ghost text** - the best suggestion is displayed as gray text. Tab to accept
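The debounce step at the top of this list can be sketched as a pure function over keystroke timestamps. This is a minimal illustration, not Copilot's actual code: `debounced_requests` and the 75 ms default are assumptions chosen from the 50-100 ms range mentioned above, and the input is assumed to be sorted in ascending order.

```python
from typing import List


def debounced_requests(keystrokes_ms: List[int], wait_ms: int = 75) -> List[int]:
    """Return the times (ms) at which a completion request would fire.

    A request fires `wait_ms` after a keystroke only if no newer keystroke
    arrives before the wait elapses. `keystrokes_ms` must be sorted ascending.
    """
    fire_times = []
    for i, t in enumerate(keystrokes_ms):
        is_last = i == len(keystrokes_ms) - 1
        # A later keystroke inside the wait window cancels this pending request.
        if is_last or keystrokes_ms[i + 1] - t >= wait_ms:
            fire_times.append(t + wait_ms)
    return fire_times


# A fast burst of three keys, a pause, then two more keys.
print(debounced_requests([0, 30, 60, 400, 420]))  # → [135, 495]
```

Five keystrokes produce only two model calls, one per typing pause.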

**FIM models are specifically trained** on the prefix-suffix-middle format. A regular chat model can't efficiently "fill in the gap" in code. Codex, StarCoder, DeepSeek Coder - all support FIM.
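To make the prefix-suffix-middle format concrete, here is a minimal sketch of prompt assembly. The sentinel strings follow the style used by StarCoder; other FIM models use different token strings, so treat them as an assumption and check the model card. `build_fim_prompt` and its character-based budget are hypothetical simplifications (a real pipeline would count tokens).

```python
# Sentinel tokens in the StarCoder style (assumption: varies by model).
FIM_PREFIX = "<fim_prefix>"
FIM_SUFFIX = "<fim_suffix>"
FIM_MIDDLE = "<fim_middle>"


def build_fim_prompt(source: str, cursor: int, max_chars: int = 2000) -> str:
    """Split `source` at `cursor` and wrap both halves in FIM sentinels.

    Each side gets half of the character budget, and the text closest
    to the cursor is kept because it is usually the most relevant.
    """
    half = max_chars // 2
    prefix = source[:cursor][-half:] if half > 0 else ""
    suffix = source[cursor:][:half]
    # The model is expected to generate the "middle" after FIM_MIDDLE.
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}"


source = "def add(a, b):\n    \n\nprint(add(1, 2))\n"
cursor = source.index("    ") + 4  # cursor sits on the empty function body
prompt = build_fim_prompt(source, cursor)
print(prompt)
```

Because the call to `add(1, 2)` appears in the suffix, the model can see how the function is used before it generates the body.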

What is Fill-in-the-Middle (FIM) in the context of code completion?

Cursor: codebase indexing, multi-file edits, apply model

In 2024, Cursor started doing what Copilot simply couldn't: writing entire functions, performing refactoring, explaining unfamiliar code. The difference in approach is architectural - Copilot **completes**, Cursor **acts as an agent**. The market noticed: USD 400M ARR in two years, a VS Code fork used at Google, Stripe, Shopify.

The key innovation is **codebase indexing**: when a project opens, Cursor splits all files into chunks (~50-100 lines), embeds each via text-embedding-3-small (1536 dim), stores locally. When a developer types "add email validation to UserService" - Cursor does semantic search over this index and finds the right files from thousands, even if they're not open. RAG over the codebase, implemented directly inside the IDE.
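The chunk-embed-search flow described above can be sketched in a few lines. This is not Cursor's implementation: to stay runnable offline, `embed` is a bag-of-words stand-in for a real embedding model, so it matches shared words rather than meaning. All function names and the sample files are hypothetical.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Tuple

Chunk = Tuple[str, int, str]  # (path, first line number, text)


def chunk_lines(path: str, text: str, size: int = 50) -> List[Chunk]:
    """Split a file into fixed-size line windows."""
    lines = text.splitlines()
    return [(path, start + 1, "\n".join(lines[start:start + size]))
            for start in range(0, len(lines), size)]


def embed(text: str) -> Counter:
    """Stand-in embedding: word counts instead of a dense model vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def search(index: List[Tuple[Chunk, Counter]], query: str, top_k: int = 1) -> List[Chunk]:
    """Return the `top_k` chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:top_k]]


files: Dict[str, str] = {
    "user_service.py": "class UserService:\n    def create_user(self, email):\n"
                       "        return {'email': email}\n",
    "invoice.py": "def render_invoice(order):\n    return f'Invoice for {order}'\n",
}
index = [(chunk, embed(chunk[2]))
         for path, text in files.items()
         for chunk in chunk_lines(path, text)]

best = search(index, "add email validation to UserService")[0]
print(best[0], best[1])  # → user_service.py 1
```

Swapping `embed` for a call to a real embedding model and the list for a vector store gives the production shape; the chunking and ranking logic stay the same.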

**Apply Model** is a separate Cursor innovation. The main LLM (Claude or GPT-4) generates a diff-like description of changes, and a small fine-tuned model applies those changes to the actual file. This solves an exact problem: large models struggle with precise line numbers and don't want to regenerate entire files - but they're excellent at describing *what* and *where* to change.
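The division of labor can be illustrated with a deterministic stand-in. Cursor's Apply Model is a fine-tuned model, which the sketch below does not reproduce; it only shows the contract: the large model emits a search/replace description, and a separate step lands it in the file. `EditBlock` and `apply_edit` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class EditBlock:
    """What the main LLM describes: the text to find and its replacement."""
    search: str
    replace: str


def apply_edit(source: str, edit: EditBlock) -> str:
    """Apply one edit, refusing ambiguous or missing matches.

    Matching on content instead of line numbers avoids the off-by-N
    errors that large models make when asked to cite exact lines.
    """
    count = source.count(edit.search)
    if count != 1:
        raise ValueError(f"expected exactly one match, found {count}")
    return source.replace(edit.search, edit.replace, 1)


original = "def greet(name):\n    return 'Hello ' + name\n"
edit = EditBlock(
    search="    return 'Hello ' + name\n",
    replace="    return f'Hello, {name}!'\n",
)
updated = apply_edit(original, edit)
print(updated)
```

A learned apply step exists because real model output is rarely this clean: the described snippet may be abbreviated or slightly out of sync with the file, which exact matching cannot handle.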

| Aspect | Copilot | Cursor |
|---|---|---|
| Context | Open tabs (~8K tokens) | Entire project via embeddings (~100K+ tokens) |
| Primary task | Inline completion (next line) | Multi-file editing (refactoring) |
| Model | Codex/GPT-4 (FIM) | Claude/GPT-4 + Apply Model |
| Indexing | None (runtime context) | Vector index of entire codebase |
| UX | Ghost text (Tab) | Chat + inline diff + multi-file apply |

Why does Cursor index the codebase into embeddings?

Claude Code: agentic loop, tool use, terminal integration

Claude Code takes a fundamentally different approach. Not an IDE plugin - an **autonomous agent in the terminal**. It receives a task, explores the project on its own (Read, Grep, Glob), edits files itself (Edit, Write), runs commands itself (Bash). Agentic loop: think → act → observe → think again. Not an assistant - a junior developer who doesn't get tired.

Cursor needs a vector index because it operates inside an IDE with a fixed context window. Claude Code doesn't need pre-indexing: the agent explores the project dynamically - Glob → Read → Grep → understand the structure. Like an experienced developer opening an unfamiliar repository for the first time. The critical piece: **self-correction**. After every change, the agent runs `tsc --noEmit` or `npm test`, sees errors, and fixes them in the next iteration. Without this loop, coding agents would be useless.
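The self-correction loop can be sketched with two stand-ins, both assumptions made to keep the example runnable: Python's built-in `compile()` plays the role of `tsc --noEmit`, and `scripted_model` is a hypothetical stub that replaces the LLM (its first attempt is deliberately broken).

```python
from typing import Callable, Optional, Tuple


def check(code: str) -> Optional[str]:
    """Stand-in for `tsc --noEmit`: return an error message, or None if clean."""
    try:
        compile(code, "<agent>", "exec")
        return None
    except SyntaxError as err:
        return f"line {err.lineno}: {err.msg}"


def agent_loop(
    task: str,
    propose: Callable[[str, Optional[str]], str],
    max_steps: int = 5,
) -> Tuple[str, int]:
    """Think/act, observe, and retry until the check passes."""
    feedback: Optional[str] = None
    for step in range(1, max_steps + 1):
        code = propose(task, feedback)  # think + act: produce a change
        feedback = check(code)          # observe: run the checker
        if feedback is None:
            return code, step
    raise RuntimeError(f"no passing solution after {max_steps} steps")


def scripted_model(task: str, feedback: Optional[str]) -> str:
    """Hypothetical LLM stub: fixes its code once it sees an error."""
    if feedback is None:
        return "def double(x)\n    return x * 2\n"  # missing colon
    return "def double(x):\n    return x * 2\n"


code, steps = agent_loop("write double(x)", scripted_model)
print(steps)  # → 2
```

The `max_steps` cap matters in practice: without it, an agent that keeps producing failing code would loop indefinitely.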

| Claude Code Tool | Purpose | Developer's analog |
|---|---|---|
| Read | Read a file | Open a file in IDE |
| Edit | Precise string replacement in a file | Ctrl+H - find and replace |
| Write | Create a new file | Ctrl+N - new file |
| Glob | Search files by pattern | Ctrl+P - find file |
| Grep | Search text in files | Ctrl+Shift+F - search in project |
| Bash | Execute terminal commands | Terminal - npm test, git commit |
**The agentic approach scales to complex tasks.** Copilot completes a line. Cursor edits a file. Claude Code can execute a task spanning 20 steps: read 10 files, create 3 new ones, edit 5 existing ones, run tests and fix errors - all from a single prompt.

What is the fundamental difference between the agentic approach of Claude Code and the completion approach of Copilot?

Building a code assistant: LSP + embeddings + LLM

Understanding the architecture of Copilot, Cursor, and Claude Code enables building custom code assistants for specific tasks: autocomplete for an internal framework, code reviewer, test generator. Core components: **LSP** for code understanding, **embeddings** for retrieval, **LLM** for generation.

LSP (Language Server Protocol) is not just "syntax highlighting". It's full code semantics in real time: types, interfaces, imports, compilation errors - before the file is even saved. Embeddings find *similar* code by meaning. LSP knows *precise* types in the current scope. Together - context from which the LLM generates correct TypeScript, not plausible-looking TypeScript.
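Talking to a language server is plain JSON-RPC 2.0 with a `Content-Length` header in front of each message. The sketch below frames a `textDocument/hover` request, which asks the server for type information at a cursor position. The file URI and the position are made-up example values, and `frame_lsp_message` is a hypothetical helper name.

```python
import json


def frame_lsp_message(payload: dict) -> bytes:
    """Serialize a JSON-RPC payload with the LSP Content-Length header."""
    body = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    # The length counts bytes of the body, not characters.
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body


hover_request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "textDocument/hover",
    "params": {
        "textDocument": {"uri": "file:///project/src/user.ts"},  # example path
        "position": {"line": 11, "character": 8},  # zero-based line and column
    },
}

message = frame_lsp_message(hover_request)
print(message.decode("utf-8"))
```

A complete client would first send `initialize` and open the document with `textDocument/didOpen`; those steps are omitted here for brevity.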

  • **LSP (Language Server Protocol)** - types, symbols, go-to-definition, diagnostics. Free code semantics
  • **Embedding Index** - vector DB for semantic search across the codebase. ChromaDB, Qdrant, or even in-memory
  • **LLM API** - OpenAI, Anthropic, or a local model (DeepSeek Coder, CodeLlama)
  • **Context Engine** - gathers data from LSP + Index, forms prompts respecting the token budget
  • **IDE Extension** - VS Code extension API for ghost text, inline completions, chat panel

**Token budget is the main constraint.** There's always more context than fits in the context window. Prioritization is critical: current file > imports > LSP symbols > retrieved chunks. Every extra fragment displaces something useful.
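The priority order above maps directly onto a greedy packing routine. This is a minimal sketch under stated assumptions: `estimate_tokens` uses a rough four-characters-per-token heuristic instead of a real tokenizer, and `Fragment` and `pack_context` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Fragment:
    source: str    # where the text came from
    text: str
    priority: int  # lower number = more important


def estimate_tokens(text: str) -> int:
    """Rough heuristic (about 4 characters per token), not a real tokenizer."""
    return max(1, len(text) // 4)


def pack_context(fragments: List[Fragment], budget_tokens: int) -> List[Fragment]:
    """Greedily add fragments in priority order until the budget is spent."""
    chosen: List[Fragment] = []
    used = 0
    for frag in sorted(fragments, key=lambda f: f.priority):
        cost = estimate_tokens(frag.text)
        if used + cost <= budget_tokens:  # skip anything that would overflow
            chosen.append(frag)
            used += cost
    return chosen


fragments = [
    Fragment("retrieved_chunk", "x" * 400, priority=3),  # ~100 tokens
    Fragment("current_file", "x" * 800, priority=0),     # ~200 tokens
    Fragment("lsp_symbols", "x" * 200, priority=2),      # ~50 tokens
    Fragment("imports", "x" * 120, priority=1),          # ~30 tokens
]

selected = [f.source for f in pack_context(fragments, budget_tokens=300)]
print(selected)  # → ['current_file', 'imports', 'lsp_symbols']
```

With a 300-token budget the retrieved chunk is the fragment that gets dropped, which matches the stated order: current file > imports > LSP symbols > retrieved chunks.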

Why does a code assistant use LSP (Language Server Protocol) in addition to embeddings?

The evolution of AI pair programming: what changes for developers

In 3 years (2022-2025), AI coding assistants went from autocomplete to autonomous agents. Gen 1 - Copilot, completes a line. Gen 2 - Cursor, edits files from a description. Gen 3 - Claude Code, Devin - autonomously explore the project, write tests, fix errors. Each generation shifts the developer's role: from "writing every line" to "describing intent and verifying the result".

| Generation | Product | Capabilities | Developer's role |
|---|---|---|---|
| Gen 1 (2022) | Copilot | Inline completion, single line/function | Write code, AI completes it |
| Gen 2 (2023) | Cursor, Cody | Multi-file edit, codebase-aware chat | Describe the task, AI edits files |
| Gen 3 (2024-25) | Claude Code, Devin | Agentic: exploration, planning, self-correction | Set the direction, AI solves the task |
| Gen 4 (2025+) | ? | Full cycle: from issue to deployed PR | Review, architect, prioritize |

**What becomes more important:** code reading and review skills (verifying AI output), architectural thinking (AI is good at tactics, poor at strategy), prompt engineering (precise task descriptions), system design (breaking down into components for AI).

  • **Code review** - reviewing AI-generated code becomes the primary task. Spotting edge cases, security issues, performance problems is essential
  • **Architecture** - AI writes code from descriptions excellently, but doesn't make architectural decisions. System design remains the developer's domain
  • **Prompt engineering for code** - the ability to precisely formulate a task yields a 10x difference in AI output quality
  • **Debugging** - AI accelerates writing, but complex bugs still require deep understanding of the system
  • **Domain knowledge** - AI knows general patterns, but doesn't know the business logic of a specific product

**AI doesn't replace a 10x developer, AI makes a regular developer closer to 10x.** But only if the developer understands what AI generates and can evaluate the quality of the result. "Vibe coding" without understanding is a path to technical debt.

Which skill becomes MOST critical as AI coding assistants grow?

Myth: AI coding assistant = replacement for developers

Reality: an AI assistant is a productivity multiplier. About 55% of suggestions are accepted only after edits, and architectural decisions still belong to humans

GitHub research 2024: developers accept ~30% of suggestions unchanged, accept ~55% with edits, and reject ~15%. AI generates syntactically correct code but doesn't know the business logic context, doesn't see security edge cases, doesn't make architectural decisions. "Vibe coding" without code review is a direct path to technical debt. The right analogy: a smart junior on every Tab press - who still needs to be reviewed.

Key takeaways

  • Copilot: FIM (Fill-in-the-Middle) + context from open tabs + RLHF on developer feedback → inline ghost text in 200-500ms
  • Cursor: codebase indexing into embeddings (text-embedding-3-small, 1536 dim) + semantic retrieval + Apply Model for multi-file edits
  • Claude Code: agentic loop (think → act → observe → correct) with tools: Read, Edit, Grep, Bash; self-correction via tsc/tests
  • Custom assistant: LSP (precise types, symbols) + Embedding Index (semantic retrieval) + LLM + Context Engine
  • The developer's role evolves from 'writing code' to 'setting direction, reviewing, architecting'
  • ~55% of suggestions are accepted only after edits - AI multiplies productivity, doesn't replace understanding

Questions for reflection

  • Why does Cursor use a separate Apply Model instead of having the main LLM apply changes directly? What specific problem does this architecture solve?
  • What's the difference between Cursor's codebase indexing and Claude Code's dynamic exploration? When does each approach work better?
  • If ~55% of suggestions are accepted only after edits - does that mean AI assistants are ineffective, or is this normal for a productivity multiplier?

What's Next

AI coding assistants are semi-autonomous. The next step is fully autonomous agents (Devin, SWE-Agent) that solve tasks from issue to PR without developer involvement.

  • Autonomous Agents — Devin, SWE-Agent, OpenHands - AI that writes code from task to pull request autonomously
  • MCP (Model Context Protocol) — The standard protocol through which AI assistants connect to tools

Related lessons

  • aie-12-rag-fundamentals — Assistants retrieve code context via RAG
  • aie-17-agent-fundamentals — Coding assistants are tool-using agents
  • aie-45-mcp-protocol — Assistants connect tools through MCP
  • ml-52-search-ranking — Rank relevant code snippets for the context window
  • aie-47-autonomous-agents — Assistants evolve toward autonomous coding
  • ml-01