An immersive audiovisual installation simulating humanity's transformative impact and the progressive erosion of the natural world.
- 📖 Overview
- ✨ The Experience
- 📁 Project Structure
- 🚀 Installation & Setup
- 🛠️ Technology Stack
- 📽️ Visual System
- 🔊 Audio System
- 💡 Lighting System
- 🌍 Credits & Attributions
- 📸 Visual Preview
- 📄 License & Usage Terms
Anthropocene serves as a bridge between scientific observation and cultural philosophy, manifesting as an interactive audiovisual environment that reconstructs how human presence reshapes the Earth's biosphere. Envisioned as an artistic installation whose generative environment is driven by real-time sensors and AI, the project confronts visitors directly with the ways their presence alters the natural world.
The user experience begins in a state of pure nature, immersing participants in pristine visuals and sounds, but as camera-based sensors track their movement, the system responds with a progressive metamorphosis: the landscape decays into urban forms and industrial noise, directly reflecting the audience's reshaping influence. This dynamic evolution is designed to foster emotional resonance and intuitive responsibility, inviting viewers to move beyond judgment and engage in deep reflection on their relationship with our world.
Genesis
- Visuals: Participants enter an enclosed space immersed in a pristine natural landscape.
- Audio: Natural soundscapes: birdsong, wind, flowing water, and so on.
- State: Untouched wilderness.

Colonisation
- Trigger: Sensors detect human presence via computer vision (camera detection).
- Visuals: The equilibrium is broken. Vegetation stiffens, and organic branches mutate into geometric shapes and artificial lights.
- Audio: Metallic rhythms and distant engines are superimposed over the wind.

Saturation
- Visuals: The digital city takes complete control. Glitches, frantic traffic, and visual pollution saturate the screens.
- Audio: The soundscape collapses into auditory chaos, reflecting acoustic stress.
When the room empties, the structures disintegrate little by little, allowing the forest to slowly regenerate to its initial state.
While the project can be scaled down for smaller demonstrations, this layout represents the optimal deployment designed for a dedicated exhibition space with sufficient resources.
- Central Console: The "brain" of the operation (Laptop/Workstation), managing the OSC communication pipeline between the sensing, audio, and visual subsystems.
- Immersive Visual: A large-scale generative screen (or projection) dominates the scene, driven by TouchDesigner and Stream Diffusion.
- Spatial Audio Field: Two active speakers (stereo) flank the screen. Controlled by SuperCollider, they create a stereo field that physically envelops the audience.
- Interactive Zone: A centralized area where the audience is tracked. A standalone camera, pointed towards the crowd, feeds real-time data to the Python/YOLO controller to trigger state changes based on crowd density.
- Atmospheric Lighting: A grid of DMX-controlled LED spots on the ceiling (or on the ground) reacts to the system's "entropy level".
- Projector: For immersive visual output.
- Camera: For presence detection (webcam or USB camera).
- Speakers: For spatial audio experience.
- Lighting: 2x DMX-controlled LED fixtures connected via USB-DMX interface (or Art-Net).
- Computing Power:
  - Internet connection: required for remote Stream Diffusion inference, OR
  - High-end GPU (e.g., NVIDIA RTX 3080 or higher): runs the model locally without an internet connection.
Video Bridge:
- macOS: Syphon (Required for Python-to-TD video routing).
- Assets & Media:
  - Download the image dataset and the audio samples.
  - Place the `Layers` folder in the same directory as the SuperCollider script.
  - Ensure the `Images` (Dataset) folder is placed in the project root for TouchDesigner access.
- Audio: Boot the SuperCollider server and load the `GranularReceiver.scd` file to start the audio engine.
- Lighting Setup:
  - Open the QLC+ file provided.
  - Verify the Art-Net/DMX input is receiving signals from TouchDesigner to control the two fixtures.
- Sensing: Run the Python controller to start the computer vision system (a sketch of the core loop follows this list):

  ```bash
  pip install ultralytics python-osc
  # For macOS users, check that Syphon is installed
  python detection_controller.py
  ```

- Visuals: Open `AnthropoceneCPAC.toe` in TouchDesigner. Ensure the OSC in/out ports match the Python configuration.
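The reference implementation of this loop lives in `detection_controller.py`. Purely as orientation, a minimal sketch of the sensing stage might look like the following; the OSC port, `MAX_PEOPLE` saturation threshold, and smoothing factor are illustrative assumptions rather than the project's actual configuration.

```python
# Minimal sketch of the sensing loop: count people with YOLO, derive a smoothed
# "entropy level" in [0, 1], and broadcast it over OSC.
# The port, MAX_PEOPLE, and SMOOTHING values are assumptions for illustration.
import cv2
from ultralytics import YOLO
from pythonosc.udp_client import SimpleUDPClient

MODEL_PATH = "yolov8n-seg.pt"            # segmentation model from the dependency list
OSC_HOST, OSC_PORT = "127.0.0.1", 7000   # assumed port; must match the receivers
MAX_PEOPLE = 8                           # crowd size that saturates the entropy level
SMOOTHING = 0.9                          # exponential smoothing against jittery states

model = YOLO(MODEL_PATH)
client = SimpleUDPClient(OSC_HOST, OSC_PORT)
cap = cv2.VideoCapture(0)                # standalone camera pointed at the crowd

entropy = 0.0
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    # Detect only the "person" class (COCO id 0) and count the detections.
    result = model(frame, classes=[0], verbose=False)[0]
    people = len(result.boxes)
    target = min(people / MAX_PEOPLE, 1.0)
    # Smooth the raw crowd density into a slowly evolving entropy level.
    entropy = SMOOTHING * entropy + (1.0 - SMOOTHING) * target
    client.send_message("/entropy_level", float(entropy))

cap.release()
```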
The system relies on a distributed architecture to handle real-time generative media.
- TouchDesigner: Handles dynamic environmental rendering, fluid state transitions, and responsive visual morphing.
- Stream Diffusion (Remote): Utilized for high-fidelity generative landscapes. Due to high computational costs, models are run on remote servers.
- DayDream API: A lightweight visual framework used for post-processing effects and ambient textures that bridge the gap between generative AI and real-time rendering.
- SuperCollider: Generates adaptive soundscapes using procedural sound design and granular synthesis.
- Reaper: Acts as the primary Digital Audio Workstation for music composition.
- YOLO (Ultralytics): Camera-based presence detection and multi-participant tracking.
- Python: Runs the communication pipeline that normalizes tracking data and controls system parameters in real time.
- OSC: The low-latency network protocol used to synchronize data between Python, SuperCollider, and TouchDesigner (see the sketch after this list).
- DMX/Art-Net: Controls the physical lighting environment.
- QLC+: For DMX lighting management and bridge to TouchDesigner.
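Because OSC ties the three subsystems together, the controller can fan the same `/entropy_level` value out to SuperCollider and TouchDesigner on separate ports. A hedged sketch of that pattern with `python-osc` is shown below; 57120 is SuperCollider's default language port, while the TouchDesigner port is an assumption that must match the OSC In operator in the `.toe` file.

```python
# Illustrative fan-out of one entropy value to the audio and visual engines.
# 57120 is sclang's default OSC port; the TouchDesigner port is an assumption
# and must match the OSC In operator configured in AnthropoceneCPAC.toe.
from pythonosc.udp_client import SimpleUDPClient

SC_CLIENT = SimpleUDPClient("127.0.0.1", 57120)   # SuperCollider (GranularReceiver.scd)
TD_CLIENT = SimpleUDPClient("127.0.0.1", 10000)   # TouchDesigner (assumed port)

def broadcast_entropy(entropy: float) -> None:
    """Send the same normalized entropy level to every engine that consumes it."""
    value = max(0.0, min(1.0, entropy))            # keep the value in [0, 1]
    for client in (SC_CLIENT, TD_CLIENT):
        client.send_message("/entropy_level", value)

broadcast_entropy(0.42)
```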
```
dependencies:
  # Python
  python: 3.x
  pythonosc
  numpy
  opencv
  PyObjC (Metal)
  Syphon        [only macOS]
  # YOLO ultralytics models
  yolov8n-seg.pt
  # TouchDesigner
  DayDream API
```

The visual core of Anthropocene is a real-time generative pipeline built in TouchDesigner. It functions as a centralized hub that interprets sensor data and translates it into a visual metamorphosis. The system does not merely play back video; it integrates new frames in real time based on the audience's live behavior.
The system operates as a continuous, feedback-driven loop:
- Logic & State Control: The network listens for the `/entropy_level` message via OSC to determine the installation's current phase (Genesis, Colonisation, or Saturation). These logic gates drive the parameters of the generative model, ensuring the visuals remain synchronized with the audio and lighting atmosphere.
- Live Input & Motion: To ground the AI generation in physical reality, the system ingests a live camera feed of the audience via Syphon. This real-time visual input is blended with pre-rendered video assets to provide StreamDiffusion with a sense of movement, spatial dimension, and organic fluctuation.
- Generative Morphing: The combined video signal is processed by the StreamDiffusionTD component, which acts as the neural rendering engine. Through dynamic Prompt Interpolation, the AI re-imagines the scene in real time, shifting from keywords like "pristine forest, 4k, organic" to "cyberpunk city, destruction, glitch" and effectively allowing the audience's physical movements to visually corrupt the digital landscape (see the sketch below).
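The interpolation itself is performed inside the StreamDiffusionTD component; the Python sketch below only restates the mapping logic for readability. The phase thresholds and the weighted-prompt syntax are assumptions, not the actual TouchDesigner network.

```python
# Illustrative mapping from entropy to a blended prompt and a phase name.
# The two prompts come from the description above; the thresholds and the
# weighted-prompt syntax are assumptions made for this example.
NATURE_PROMPT = "pristine forest, 4k, organic"
CITY_PROMPT = "cyberpunk city, destruction, glitch"

def interpolate_prompts(entropy: float) -> tuple[str, str]:
    """Return a blended prompt string and the current installation phase."""
    e = max(0.0, min(1.0, entropy))
    if e < 0.33:
        phase = "Genesis"
    elif e < 0.66:
        phase = "Colonisation"
    else:
        phase = "Saturation"
    # Weight each prompt by how far the scene has drifted away from nature.
    blended = f"({NATURE_PROMPT}):{1.0 - e:.2f} ({CITY_PROMPT}):{e:.2f}"
    return blended, phase

print(interpolate_prompts(0.8))   # mostly "cyberpunk city", phase "Saturation"
```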
The auditory experience of Anthropocene is driven by a custom-built generative engine developed in SuperCollider. Unlike traditional loop-based playback, the system employs Real-Time Granular Synthesis to sonically represent the erosion of the natural world.
The audio engine listens for OSC messages (`/entropy_level`) from the central control unit and dynamically manipulates the soundscape through a custom synth architecture (`\texturePlayer`).
The system manages 6 distinct audio layers, morphing them based on the crowd's activity level. As the "Entropy Level" rises, the engine applies the following transformations (a parameter-mapping sketch follows below):
- Granular Erosion: The signal is split between a "clean" path and a "granular" path. As human presence increases, the system fragments the audio into microscopic grains (`GrainBuf`), creating a cloudy, textured, and disintegrated sound.
- Time Dilation: High entropy levels trigger a time-stretching algorithm that slows down the playback speed (down to 50%) without altering the window size, creating a heavy, dragging atmosphere.
- Harmonic Dissonance: A `pitchJitter` parameter is introduced at peak levels, randomly detuning the grains to generate acoustic discomfort and instability.
- Spatial Wash: The signal is fed into a stereo reverb (`FreeVerb2`) whose room size and wet mix expand proportionally with the chaos, drowning the clarity of nature in a wash of industrial noise.
- Input: OSC Data (`/entropy_level`) → Logic: Layer blending & Parameter Mapping → Synthesis: Granular Texture → FX: Reverb & Limiting → Output: Stereo Field.
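The synthesis itself runs in the SuperCollider `\texturePlayer` synth; the sketch below restates the entropy-to-parameter mapping in Python purely for readability. Apart from the 50% minimum playback rate mentioned above, the ranges and curve shapes are assumptions.

```python
# Illustrative restatement of the entropy -> granular-parameter mapping performed
# by the SuperCollider \texturePlayer synth. Only the 50% minimum playback rate
# comes from the description above; the other ranges are assumptions.
def granular_params(entropy: float) -> dict:
    e = max(0.0, min(1.0, entropy))
    return {
        "grain_mix":    e,                          # clean path vs. granular path
        "rate":         1.0 - 0.5 * e,              # time dilation, down to 50%
        "pitch_jitter": max(0.0, (e - 0.7) / 0.3),  # detuning only at peak levels
        "reverb_room":  0.2 + 0.8 * e,              # FreeVerb2 room size grows with chaos
        "reverb_wet":   0.1 + 0.7 * e,              # wet mix drowns the clean signal
    }

print(granular_params(0.9))
```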
The lighting system acts as a physical extension of the digital environment. Controlled via DMX/Art-Net through the integration of TouchDesigner and QLC+, a pair of LED fixtures dynamically alters the room's atmosphere based on the system's "Entropy Level".
The lighting follows a 6-stage progression, shifting from organic, natural tones to harsh, industrial states.
The installation begins with a pristine atmosphere dominated by natural hues representing the untouched biosphere. As human presence is detected, the environment progressively decays: the palette shifts away from these organic tones, eventually collapsing into aggressive and cold, desaturated industrial hues.
- Bridge: TouchDesigner maps entropy to DMX values sent via Art-Net/OSC to QLC+.
- Smoothing: A Filter/Lag operator in TD ensures fluid color crossfades between states (see the sketch below).
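As a rough illustration of the bridge, the sketch below blends between a "natural" and an "industrial" RGB triple and applies lag-style smoothing before the values would be handed to QLC+ as DMX channels. The two reference colours, the channel layout, and the lag coefficient are assumptions.

```python
# Illustrative entropy -> DMX colour mapping with lag-style smoothing, mirroring
# what the TouchDesigner network does before QLC+ drives the two fixtures.
# The reference colours, channel layout, and LAG factor are assumptions.
NATURAL_RGB = (40, 200, 90)       # assumed organic tone for low entropy
INDUSTRIAL_RGB = (200, 40, 30)    # assumed harsh tone for high entropy
LAG = 0.85                        # per-frame smoothing, like TD's Filter/Lag operator

_smoothed = list(NATURAL_RGB)

def dmx_values(entropy: float) -> list[int]:
    """Return smoothed 8-bit DMX channel values for one RGB fixture."""
    e = max(0.0, min(1.0, entropy))
    target = [n + (i - n) * e for n, i in zip(NATURAL_RGB, INDUSTRIAL_RGB)]
    for ch in range(3):
        _smoothed[ch] = LAG * _smoothed[ch] + (1.0 - LAG) * target[ch]
    return [int(round(v)) for v in _smoothed]

print(dmx_values(0.75))
```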
This project was conceived and developed as part of the Creative Programming and Computing course (A.Y. 2025/2026) at Politecnico di Milano.
Core Frameworks & Libraries
- Generative Visuals: Stream Diffusion pipeline & TouchDesigner.
- Audio Engine: SuperCollider (Real-time synthesis) & Reaper (Composition).
- Sensing & Logic: Ultralytics YOLO for computer vision & `python-osc` for networking.
Acknowledgements
- Special thanks to the open-source community for the DayDream visual framework.
- Models hosted and accelerated via HuggingFace.
| Cyberpunk city | While testing our experience... | Lighting setup |
|---|---|---|
| ![]() | ![]() | ![]() |
ANTHROPOCENE © 2026 All Rights Reserved.
No part of this project may be reproduced or used for commercial purposes without explicit permission from the authors.










