Automata and Cognition

Theater of Consciousness: Global Workspace Theory

Lesson objectives

Understand the Global Workspace architecture: specialists, workspace, broadcast
Explain the coalition competition mechanism for workspace access
Describe the ignition phenomenon and its neuroscientific evidence
Compare GWT with other consciousness theories (IIT, HOT, PP)
Identify architectural GWT analogs in neural networks

Prerequisites

Multi-agent systems (lesson 10-multi-agent)
Attention and memory (lesson 07-attention-memory)
Self-models and introspection (lesson 09-self-models)

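The brain takes in roughly 11 million bits per second through its sensory channels, but conscious processing handles only about 40-50 bits per second. Global Workspace Theory offers an explanation for this gap of more than 220,000x and suggests how to build AI architectures with "conscious" control.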

**Inattentional blindness (1999)** - half of observers miss a gorilla while counting basketball passes: GWT explains this as suppression of competing coalitions
**Consciousness Prior (Bengio, NeurIPS 2019)** - proposed neural architecture with bottleneck as workspace analog
**Neural correlates of consciousness (Dehaene, 2001)** - ignition experimentally confirmed via EEG during stimulus awareness
**Transformer attention** - partial broadcast analog: one Query vector receives information from all Key-Value pairs
**General anesthesia** - on the GWT account, disrupts global broadcast while largely preserving local processing in individual modules

From Baars's theater to neural correlates

Bernard Baars proposed GWT in 1988, inspired by a theatrical metaphor: specialized "actors" compete for the "stage" - the global workspace. In 2001, Stanislas Dehaene and Jean-Pierre Changeux found neural correlates: EEG ignition corresponds to the moment of awareness. In 2019, Yoshua Bengio proposed the Consciousness Prior - an architectural GWT analog for neural networks with an explicit information bottleneck.

The Consciousness Bottleneck

**The human brain processes roughly 11 million bits per second through sensory channels - yet consciousness handles only 40-50 bits per second.** That is a ratio of roughly 220,000:1 at 50 bits per second, or 275,000:1 at 40. Bernard Baars formalized Global Workspace Theory in 1988: consciousness is not a separate module but an architectural mechanism - a single "stage" where only one of thousands of parallel processes performs at a time.
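The ratio follows directly from the two figures quoted above. As a quick check (the constant and function names here are illustrative, not from any library):

```python
# Figures quoted in the text; names are illustrative.
SENSORY_BITS_PER_SEC = 11_000_000
CONSCIOUS_BITS_PER_SEC = (40, 50)  # lower and upper estimates


def compression_ratio(sensory: int, conscious: int) -> float:
    """Sensory bits arriving per consciously handled bit."""
    return sensory / conscious


ratios = [compression_ratio(SENSORY_BITS_PER_SEC, c) for c in CONSCIOUS_BITS_PER_SEC]
print(ratios)  # [275000.0, 220000.0]
```

The commonly cited 220,000:1 corresponds to the upper estimate of 50 bits per second.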

**Global Workspace Theory (GWT)** - Baars (1988): specialized modules (specialists) compete for access to a single global workspace. The winning content is broadcast to all other modules simultaneously - this is the moment of awareness.

Architecture: three components

Component	Role	Neural network analog
Specialists	Parallel local processing (vision, audio, language, memory...)	Modality-specific encoders
Workspace	Single bottleneck - only 1 content at a time	Narrow latent layer
Broadcast	Transmitting winning content to all modules	Skip-connections / cross-attention
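The three components in the table can be sketched as one processing cycle. This is a minimal toy model, not an implementation from Baars or Dehaene: the `Specialist` and `Workspace` classes, the `propose`/`receive`/`cycle` methods, and the activation numbers are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Specialist:
    """A local processor that proposes content and hears every broadcast."""
    name: str
    inbox: List[str] = field(default_factory=list)

    def propose(self, activations: Dict[str, float]) -> Tuple[str, float]:
        # Hypothetical rule: activation is looked up by specialist name.
        return f"{self.name}-content", activations.get(self.name, 0.0)

    def receive(self, content: str) -> None:
        self.inbox.append(content)


class Workspace:
    """Holds exactly one content at a time (the bottleneck)."""

    def __init__(self, specialists: List[Specialist]) -> None:
        self.specialists = specialists
        self.content: Optional[str] = None

    def cycle(self, activations: Dict[str, float]) -> str:
        # Specialists work in parallel; only the strongest proposal gets in.
        proposals = [s.propose(activations) for s in self.specialists]
        self.content, _ = max(proposals, key=lambda p: p[1])
        # Broadcast: every specialist receives the same winning content.
        for s in self.specialists:
            s.receive(self.content)
        return self.content


specialists = [Specialist("vision"), Specialist("audio"), Specialist("language")]
workspace = Workspace(specialists)
winner = workspace.cycle({"vision": 0.9, "audio": 0.4, "language": 0.2})
print(winner)  # vision-content
print([s.inbox for s in specialists])
```

After one cycle every inbox holds the same single item, which is the point of the bottleneck plus broadcast design: many producers, one shared content.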

Misconception: consciousness processes a lot of information in parallel

Reality: consciousness is a serial bottleneck that handles about 3-4 elements per second

Unconscious processing is genuinely parallel. But the workspace is strictly serial. The feeling of rich experience is an illusion of rapid switching, not true workspace parallelism.

Why is the Global Workspace called a "bottleneck"?

Coalition Competition and Ignition

**Stanislas Dehaene (2001) discovered the neural correlate of awareness: two distinct EEG signatures emerge for the same stimulus.** When the stimulus is not consciously perceived, local activation peaks around 100 ms and fades. When it is perceived, a sharp global activation surge sweeps the entire cortex around 300 ms. This surge is called ignition, and it marks the transition from unconscious to conscious processing.

The competition mechanism

Specialists with similar content form coalitions. A coalition amplifies its signal: its strength is the combined strength of all its members. Between coalitions, winner-take-all competition decides which content enters the workspace. The winner's activation is amplified; losers are suppressed through lateral inhibition.
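The competition described above can be sketched in a few lines. The function `compete`, the `gain` and `inhibition` parameters, and the example strengths are hypothetical; they only illustrate summing coalition members, picking a single winner, and suppressing the rest.

```python
from collections import defaultdict


def compete(proposals, gain=1.5, inhibition=0.5):
    """proposals: list of (specialist, content, strength) tuples.

    Returns the winning content and each coalition's activation afterwards.
    """
    coalitions = defaultdict(float)
    for _specialist, content, strength in proposals:
        coalitions[content] += strength  # coalition strength = sum of members

    winner = max(coalitions, key=coalitions.get)  # winner-take-all
    after = {}
    for content, total in coalitions.items():
        if content == winner:
            after[content] = total * gain  # the winner is amplified
        else:
            # Lateral inhibition, scaled by the winner's strength.
            after[content] = max(0.0, total - inhibition * coalitions[winner])
    return winner, after


proposals = [
    ("vision", "ball", 0.6),
    ("motor", "ball", 0.5),
    ("attention", "ball", 0.4),
    ("vision", "gorilla", 0.9),  # strongest single signal, but no allies
]
winner, after = compete(proposals)
print(winner)  # ball
print(after)
```

Note that the strongest individual signal ("gorilla") loses to a coalition of three weaker but mutually supporting signals.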

Scenario	Activation	Result
Threshold not reached	Local, ~100 ms, fades out	Unconscious processing
Threshold reached	Global surge ~300 ms across entire cortex	Ignition - awareness
Competing coalitions	Mutual suppression	Only one wins

Yoshua Bengio at NeurIPS 2019 proposed the "Consciousness Prior" for neural networks: a compact bottleneck layer (e.g., 1024 -> 64 dimensions) forces the model to select only the most essential information. Loss = reconstruction + sparsity. This is an architectural analog of ignition: enforced compression implements "conscious selection" of information.
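To make "Loss = reconstruction + sparsity" concrete, here is a rough sketch of how such a loss could be computed for a 1024 -> 64 bottleneck. It is an assumption-laden illustration, not Bengio's formulation: mean squared error for reconstruction, an L1 penalty on the bottleneck code for sparsity, random untrained weights, and the `sparsity_weight` value are all choices made here.

```python
import random


def linear(weights, x):
    """Multiply a (rows x cols) weight matrix by a vector of length cols."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def bottleneck_loss(x, enc, dec, sparsity_weight=0.01):
    z = linear(enc, x)       # compress: input_dim -> bottleneck_dim
    x_hat = linear(dec, z)   # reconstruct: bottleneck_dim -> input_dim
    reconstruction = sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)
    sparsity = sum(abs(v) for v in z) / len(z)  # assumed L1 penalty on the code
    return reconstruction + sparsity_weight * sparsity, z


rng = random.Random(0)
input_dim, bottleneck_dim = 1024, 64
# Random, untrained weights: this only demonstrates the loss computation.
enc = [[rng.gauss(0, 0.03) for _ in range(input_dim)] for _ in range(bottleneck_dim)]
dec = [[rng.gauss(0, 0.03) for _ in range(bottleneck_dim)] for _ in range(input_dim)]
x = [rng.gauss(0, 1) for _ in range(input_dim)]

loss, z = bottleneck_loss(x, enc, dec)
print(len(z))  # 64
```

Training would adjust `enc` and `dec` to lower this loss; the narrow, sparse code `z` is what plays the workspace role in the analogy.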

Misconception: ignition just means a stronger stimulus

Reality: ignition is a nonlinear phase transition driven by positive feedback and lateral inhibition

A weak stimulus can trigger ignition if its coalition is strong enough. A strong stimulus may fail to ignite if it competes against an even stronger coalition. What matters is the competition outcome, not input intensity.
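A toy dynamical model shows why the outcome depends on competition rather than raw input. This sketch uses a single self-exciting unit with a sigmoid nonlinearity; the equation, the parameter values (`gain`, `threshold`, `dt`), and the `support` and `rival` terms are illustrative assumptions, not a model fitted to EEG data.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def simulate(stimulus, support=0.0, rival=0.0, pulse_steps=20, total_steps=100,
             dt=0.1, gain=10.0, threshold=0.5):
    """Return the final activation after a brief stimulus pulse."""
    a = 0.0
    for step in range(total_steps):
        # Coalition support adds to the input; rival inhibition subtracts from it.
        net_input = (stimulus + support - rival) if step < pulse_steps else 0.0
        # Positive feedback: the unit's own activation feeds back into its drive.
        drive = gain * (a + net_input - threshold)
        a += dt * (-a + sigmoid(drive))
    return a


weak_alone = simulate(stimulus=0.1)
weak_with_coalition = simulate(stimulus=0.1, support=0.5)
strong_alone = simulate(stimulus=0.6)
strong_vs_rival = simulate(stimulus=0.6, rival=0.5)

for label, value in [("weak alone", weak_alone),
                     ("weak + coalition", weak_with_coalition),
                     ("strong alone", strong_alone),
                     ("strong vs rival", strong_vs_rival)]:
    print(f"{label}: {'ignited' if value > 0.5 else 'faded'}")
```

In this sketch activity either fades after the pulse ends or sustains itself near its maximum with no input at all. The weak stimulus ignites when a coalition backs it, and the strong one fades when a rival inhibits it.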

What does Dehaene's EEG data show happens during ignition?

Broadcast and the Functions of Consciousness

**When a coalition wins and enters the workspace, broadcast fires: the content becomes simultaneously available to all specialists.** This is why conscious information "binds" different modalities: the visual image of a red apple, the word "apple", the smell, the emotion - all these integrate only through the global workspace broadcast.

Broadcast function	Description	Example
RECRUIT	Find appropriate specialists for the task	Spotted a problem - recruits memory + reasoning
ENCODE	Write to episodic memory	Consciously experienced moments are better remembered
LEARN	Update weights from experience	Deliberate practice beats automatic repetition
COORDINATE	Synchronize distributed processes	Vision + motor when catching a ball
VERBALIZE	Convert to language for reporting	Only conscious content can be described in words

Why is the color red consciously perceived?

1. Visual system processes wavelength - locally, unconsciously.
2. Coalition (vision + attention) wins the competition.
3. Ignition: global cortical activation.
4. Broadcast: "red" is available to all - language, memory, emotion.
5. Language system can say "I see red".
6. Memory records the episode.

Without broadcast: processing happens, but is not consciously experienced - classic "inattentional blindness".

**Inattentional blindness:** In the classic experiment by Simons and Chabris (1999), roughly half of the participants, who were busy counting passes, failed to notice a person in a gorilla suit walking through a basketball game. The visual system processed the gorilla, but the attention coalition for basketball suppressed ignition for the gorilla content.

Recognizing a friend's face in a crowd - is this a conscious or unconscious process according to GWT?

GWT vs Other Theories and AI Architectures

**Yoshua Bengio, NeurIPS 2019: "Consciousness Prior" - how to embed a GWT analog into a neural network.** Key idea: a conscious representation must be compact (low dimensionality), global (broadcast), and semantically meaningful. Architecturally, this means a narrow bottleneck layer with an information constraint - the structural analog of a workspace.

Theory	Core idea	Mechanism
Global Workspace (Baars)	Broadcast + competition	Bottleneck + attention + winner-take-all
Higher Order Thought (Rosenthal)	A thought about a thought makes it conscious	Meta-representation
IIT (Tononi)	Consciousness = integrated information Phi	Phi - measure of causal integration
Predictive Processing (Clark/Friston)	Brain is a prediction machine	Prediction error + active inference

Properties of conscious computation

Property	Description	AI implication
Seriality	~3-4 items per second	Transformer attention - partial analog
Capacity limit	~4 items simultaneously	LLM working memory - context window
Global access	All modules see the same content	Cross-attention between modalities
Reportability	Conscious content can be described verbally	Chain-of-thought reasoning in LLMs

Misconception: LLMs implement Global Workspace through transformer attention

Reality: attention is only a partial analog, with no explicit bottleneck, no winner-take-all, and no true broadcast

GWT requires (1) competition with an explicit threshold, (2) a single winner, and (3) one content propagated to all modules. Transformer attention softly weights everything, which is closer to averaging than to selection. A closer architectural GWT analog is a shared bottleneck trained with an information bottleneck loss.
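The difference between soft weighting and selection is easy to see side by side. Below, `soft_attention` is a simplified single-query softmax read-out, and `workspace_select` is a hypothetical GWT-style rule with an explicit threshold; the scores, values, and threshold are made up for illustration.

```python
import math


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def soft_attention(scores, values):
    """Transformer-style read-out: a weighted average of all values."""
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]


def workspace_select(scores, values, threshold=1.0):
    """GWT-style read-out: one winner above threshold, or nothing at all."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return values[best] if scores[best] >= threshold else None


scores = [2.0, 1.5, 0.1]
values = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

blended = soft_attention(scores, values)
selected = workspace_select(scores, values)
nothing = workspace_select([0.5, 0.4, 0.1], values)

print(blended)   # a mixture of all three values
print(selected)  # [1.0, 0.0]
print(nothing)   # None
```

Soft attention always returns a blend, even when no candidate stands out. The workspace rule returns exactly one content or, below threshold, none.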

What neural network component corresponds to broadcast in GWT?

Questions for reflection

What is in your workspace right now? Which processes run backstage without awareness - and why might a bottleneck be an evolutionary advantage rather than a limitation?

Related lessons

ml-01-intro