Abstract Algebra

Tensor Products

How does one combine two objects of different natures, say a vector from physics and a vector from chemistry, into a single object that preserves both contributions? The tensor product is the mathematical answer. From entangled qubits to batched matrix multiplication on GPUs, the same construction appears everywhere.

Neural networks: the core linear operations (convolutions, self-attention in transformers) are tensor contractions; GPUs are optimized precisely for these
Quantum mechanics: entangled states are tensors that cannot be written as a single simple tensor; tensor networks are a foundation of modern quantum simulations

Prerequisites

Rings: Addition and Multiplication

Definition via Universal Property

PyTorch's `torch.einsum` can process a tensor of shape 1024×1024×512, about 537 million numbers, in a single call. Multi-head attention in GPT-style transformers contracts rank-4 tensors (four indices: batch, head, position, feature) on every forward pass. The mathematics behind einsum is the tensor product of modules.
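As a minimal sketch of the attention contraction mentioned above, the following uses NumPy's `np.einsum`, which accepts the same subscript strings as `torch.einsum`. The sizes and the names `Q`, `K`, and `scores` are illustrative assumptions, not taken from any particular model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical small sizes: batch, heads, sequence length, head dimension.
batch, heads, seq, dim = 2, 3, 5, 4
Q = rng.standard_normal((batch, heads, seq, dim))  # queries (assumed layout)
K = rng.standard_normal((batch, heads, seq, dim))  # keys (assumed layout)

# Contract the shared feature index d; the indices b, h, q, k stay free.
scores = np.einsum('bhqd,bhkd->bhqk', Q, K)

# The same contraction written as a batched matrix product Q · Kᵀ.
scores_matmul = Q @ K.transpose(0, 1, 3, 2)

print(scores.shape)  # (2, 3, 5, 5)
```

The subscript string makes explicit which index is summed away and which indices survive, which is exactly the bookkeeping the tensor product formalizes.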

**Common mistake:** assuming every element of M ⊗ N is a simple tensor m ⊗ n. In reality, most elements are sums of several simple tensors. Decomposing an element into a minimal sum of simple tensors is nontrivial (the tensor rank). For matrices this is equivalent to the matrix rank.
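A small numerical check of this point, as a sketch: elements of ℝ² ⊗ ℝ² are stored here as 2×2 coefficient matrices (an assumed identification via the standard basis), so a simple tensor is an outer product and the tensor rank can be read off with `np.linalg.matrix_rank`.

```python
import numpy as np

# A simple tensor m ⊗ n, stored as the outer-product matrix m nᵀ.
m = np.array([1.0, 2.0])
n = np.array([3.0, -1.0])
simple = np.outer(m, n)

# e1 ⊗ e1 + e2 ⊗ e2: a sum of two simple tensors.
e1, e2 = np.eye(2)
not_simple = np.outer(e1, e1) + np.outer(e2, e2)

print(np.linalg.matrix_rank(simple))      # 1: a single simple tensor
print(np.linalg.matrix_rank(not_simple))  # 2: cannot be written as one m ⊗ n
```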

What is the dimension of the vector space ℝ^m ⊗ ℝ^n?

Bilinear Maps and Universality

The essence of the tensor product: it **linearizes bilinearity**. Every bilinear map φ: V × W → U (linear in each argument separately) factors uniquely through the tensor product: there is exactly one linear map f: V ⊗ W → U with φ(v, w) = f(v ⊗ w), so the following diagram commutes:

```
V × W --⊗--> V ⊗ W
    \           |
     φ          f
      \         |
       -------> U
```

Bilinear(V × W, U) ≅ Hom(V ⊗ W, U) - that is what the universal property says.
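The correspondence between φ and f can be illustrated numerically. In this sketch, V = ℝ³, W = ℝ⁴, U = ℝ, and the bilinear map is φ(v, w) = vᵀBw for a hypothetical matrix `B`; the simple tensor v ⊗ w is represented by `np.kron(v, w)`. The names `phi` and `f` are chosen for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
B = rng.standard_normal((3, 4))  # hypothetical coefficients of the bilinear form

def phi(v, w):
    """Bilinear map V × W → ℝ, linear in each argument separately."""
    return v @ B @ w

def f(t):
    """Linear map V ⊗ W → ℝ on the 12-dimensional space of flattened tensors."""
    return B.ravel() @ t

v = rng.standard_normal(3)
w = rng.standard_normal(4)

# The diagram commutes: φ(v, w) = f(v ⊗ w).
print(np.isclose(phi(v, w), f(np.kron(v, w))))  # True

# f is genuinely linear, so it is also defined on sums of simple tensors.
v2 = rng.standard_normal(3)
w2 = rng.standard_normal(4)
t = np.kron(v, w) + 2.0 * np.kron(v2, w2)
print(np.isclose(f(t), phi(v, w) + 2.0 * phi(v2, w2)))  # True
```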

**In quantum mechanics**, the tensor product is the fundamental operation for composite systems. The state space of two qubits is ℂ² ⊗ ℂ² ≅ ℂ⁴; two independently prepared qubits |ψ₁⟩ ∈ ℂ² and |ψ₂⟩ ∈ ℂ² are jointly in the product state |ψ₁⟩ ⊗ |ψ₂⟩. States that do NOT decompose as a single simple tensor are called **entangled**; they are a key resource for quantum computing.
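One way to test whether a two-qubit state is a simple tensor is to reshape its four amplitudes into a 2×2 matrix and compute the rank: rank 1 means a product state, rank 2 means entangled. The sketch below assumes the standard computational basis; the helper name `schmidt_rank` is introduced here for illustration.

```python
import numpy as np

ket0 = np.array([1, 0], dtype=complex)
ket1 = np.array([0, 1], dtype=complex)

def schmidt_rank(state):
    """Rank of the 2x2 coefficient matrix of a two-qubit state vector."""
    return np.linalg.matrix_rank(state.reshape(2, 2))

# Product state |+⟩ ⊗ |0⟩, built with the Kronecker product.
plus = (ket0 + ket1) / np.sqrt(2)
product = np.kron(plus, ket0)

# Bell state (|00⟩ + |11⟩)/√2: a sum of two simple tensors.
bell = (np.kron(ket0, ket0) + np.kron(ket1, ket1)) / np.sqrt(2)

print(schmidt_rank(product))  # 1: not entangled
print(schmidt_rank(bell))     # 2: entangled
```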

Matrix A has size 3×2 and matrix B has size 4×5. What is the size of the Kronecker product A ⊗ B?

Tensors in ML and Physics

**A tensor in machine learning** is a multi-dimensional array of numerical data. A scalar is a rank-0 tensor, a vector is rank 1, a matrix is rank 2. A rank-3 tensor is a 'cube' (e.g., a color image H×W×C). Here 'rank' means the number of indices, which is not the same as the tensor rank (the minimal number of simple tensors in a decomposition). The linear operations in neural networks are essentially tensor contractions (index summation).

**A tensor in physics** is a geometric object whose components transform in a specific way under a change of coordinates. The stress tensor σᵢⱼ maps a normal direction to a force vector; it lives in V* ⊗ V.
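The statement that the stress tensor maps a normal direction to a force vector is a single contraction over one index. A minimal sketch, with made-up stress components chosen only for illustration:

```python
import numpy as np

# Hypothetical symmetric stress tensor (arbitrary units).
sigma = np.array([[10.0, 2.0, 0.0],
                  [ 2.0, 5.0, 0.0],
                  [ 0.0, 0.0, 1.0]])

# Unit normal of the surface element, here pointing along the x axis.
normal = np.array([1.0, 0.0, 0.0])

# Contract the second index of sigma with the normal: t_i = Σ_j σ_ij n_j.
traction = np.einsum('ij,j->i', sigma, normal)

print(traction)  # components 10, 2, 0: the first column of sigma
```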

**Einstein summation convention:** When an index appears twice in a term (once up, once down), summation over it is implied: Aᵢⱼ Bʲₖ = Σⱼ Aᵢⱼ Bʲₖ = Cᵢₖ. In NumPy this is `np.einsum('ij,jk->ik', A, B)`, and TensorFlow's `tf.einsum` takes the same subscript string. The notation lets one write complex tensor operations concisely and makes mistakes in index ordering easier to spot.
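To make the convention concrete, the sketch below computes the same product three ways: with `np.einsum`, with the `@` operator, and with an explicit loop over the repeated index j. The arrays are small arbitrary examples.

```python
import numpy as np

A = np.arange(6).reshape(2, 3)    # indices i, j
B = np.arange(12).reshape(3, 4)   # indices j, k

# The repeated index j is summed; i and k survive in the output.
C = np.einsum('ij,jk->ik', A, B)

# Explicit version of the implied sum: C[i, k] = Σ_j A[i, j] * B[j, k].
C_loop = np.zeros((2, 4), dtype=A.dtype)
for i in range(2):
    for k in range(4):
        for j in range(3):
            C_loop[i, k] += A[i, j] * B[j, k]

print(np.array_equal(C, A @ B))   # True
print(np.array_equal(C, C_loop))  # True

# Reordering the output indices gives the transpose with no extra code.
C_t = np.einsum('ij,jk->ki', A, B)
print(np.array_equal(C_t, (A @ B).T))  # True
```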

**Misconception:** A tensor is just a 'multi-dimensional array', the same as a matrix but bigger

Mathematically, a tensor is an element of a tensor product of vector spaces (or their duals). The key property is the transformation law for components under a change of basis. An array in computer memory is merely a representation of a tensor in a fixed basis.
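A short sketch of what "representation in a fixed basis" means for a linear operator (an element of V* ⊗ V): changing the basis changes the array via A′ = P⁻¹AP, while basis-independent quantities stay the same. The matrices `A` and `P` are arbitrary illustrative choices.

```python
import numpy as np

# Components of an operator in the old basis.
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])

# Change-of-basis matrix: columns are the new basis vectors in old coordinates.
P = np.array([[1.0, 1.0],
              [0.0, 1.0]])

# Transformation law for a tensor in V* ⊗ V.
A_new = np.linalg.inv(P) @ A @ P

print(np.allclose(A, A_new))                      # False: the arrays differ
print(np.isclose(np.trace(A), np.trace(A_new)))   # True: the trace is invariant

# Applying the operator gives the same vector, whichever basis is used.
v_old = np.array([1.0, 2.0])
v_new = np.linalg.solve(P, v_old)        # same vector, new coordinates
print(np.allclose(P @ (A_new @ v_new), A @ v_old))  # True
```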

In neural networks, image data has shape (batch, height, width, channels). What is the rank of this tensor?

Key Ideas

M ⊗ N - module with universal property: bilinear maps from M×N ↔ linear maps from M⊗N
dim(V ⊗ W) = dim(V) · dim(W); basis is {eᵢ ⊗ fⱼ}
Not every element of M⊗N is a simple tensor m⊗n
Kronecker product of matrices = matrix representation of tensor product of operators
Tensor rank of an element = minimum number of simple tensors in its decomposition
Hom(V, W) ≅ V* ⊗ W for finite-dimensional V and W: matrices are rank-2 tensors
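Two of the points above can be checked numerically. This sketch represents eᵢ ⊗ fⱼ by `np.kron` of standard basis vectors and verifies that the Kronecker product of matrices acts on simple tensors as (A ⊗ B)(v ⊗ w) = Av ⊗ Bw; all sizes and random matrices are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 3, 4

# The vectors e_i ⊗ f_j, one per row: m*n of them, linearly independent.
basis = np.array([np.kron(e, f) for e in np.eye(m) for f in np.eye(n)])
print(basis.shape)                   # (12, 12)
print(np.linalg.matrix_rank(basis))  # 12 = m * n

# Operators A: R^3 → R^2 and B: R^4 → R^5 (arbitrary example shapes).
A = rng.standard_normal((2, m))
B = rng.standard_normal((5, n))
v = rng.standard_normal(m)
w = rng.standard_normal(n)

# The Kronecker product is the matrix of the tensor product of operators.
lhs = np.kron(A, B) @ np.kron(v, w)
rhs = np.kron(A @ v, B @ w)
print(np.allclose(lhs, rhs))  # True
```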

Further Directions

Tensor products are the building blocks for exterior algebras (∧^k V) and symmetric algebras (S^k V). Homological algebra studies the derived functors of the tensor product (Tor), which measure how far ⊗ is from being exact.

Exterior Algebra — ∧^k V = quotient of tensor product by antisymmetry relations
Homological Algebra — Tor_n(M, N) - derived functor of tensor product; measures non-flatness

Questions for Reflection

Prove that ℝ ⊗_ℝ V ≅ V for any ℝ-vector space V. What is the unit object for the tensor product?
In quantum computing, a system of n qubits is described by (ℂ²)^{⊗n}, which has dimension 2ⁿ. Why does this exponential growth suggest that quantum computers can outperform classical ones on certain problems?
How does the tensor rank of a matrix relate to its ordinary (linear) rank? Do they always coincide?

Related Lessons

la-01-vectors-intro