Spatial Sound and Light Stream

personal project
Builder
metaverse, performance, sound
JavaScript, Three.js, TouchDesigner

Spatial Sound and Light Stream is a performance that combines multi-channel sound and 3D visual elements in a live stream. The sound is linked to balls of light that move around the space in real time, positioned by the performer and modulated by the sound itself, creating an immersive visual and binaural listening experience. I performed it in a virtual environment called Yorb as part of the ITP residents show.

The idea for this performance came from a 4DSound event I attended at a church in Amsterdam in October 2019, where forty speakers were arranged in a circle and sound and light moved around the room in sync, driven by analog modular synthesizers. When the COVID-19 pandemic made a physical performance space impossible, I began exploring how to create a similar experience in a virtual one.

Live streams typically arrive with a 10-30 second delay, so control data sent over a separate channel would drift out of sync with the picture and sound. To keep everything aligned, I embedded the values I controlled, along with values extracted from the musical instruments, into pixels hidden in the stream itself. These values are unpacked on arrival and used to change elements of the space in real time, so the modifications stay in sync with the audio and video in the stream.
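A minimal sketch of the unpacking side in the browser, assuming the data pixels sit in a known strip of the frame (the strip position and number of values are placeholders, not the actual stream layout): draw the current video frame to an offscreen canvas, read the strip with getImageData, and normalize each grayscale level back to a control value.

```js
// Sketch of the unpacking step: read the hidden "data pixels" out of the
// current video frame. Strip position and value count are placeholders.
const canvas = document.createElement('canvas');
const ctx = canvas.getContext('2d');

function readControlValues(videoEl) {
  canvas.width = videoEl.videoWidth;
  canvas.height = videoEl.videoHeight;
  ctx.drawImage(videoEl, 0, 0);

  // Assume an 8-pixel grayscale strip in the top-left corner,
  // one pixel per control value.
  const strip = ctx.getImageData(0, 0, 8, 1).data; // RGBA bytes
  const values = [];
  for (let i = 0; i < strip.length; i += 4) {
    values.push(strip[i] / 255); // red channel, normalized to 0..1
  }
  return values; // e.g. [leftX, leftY, rightX, rightY, pulse, ...]
}
```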

In the performance, multichannel audio is streamed into a virtual 3D space, and each channel of audio is attached to a light. I control the position of the light through encoded values, and the characteristics of the light, including size, brightness, and color, are continuously modified by properties of the sound. When viewed with headphones, the audio moves in sync with the light around the viewer's head, creating the illusion of being in a real space with moving sound and light.
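As a rough illustration of that coupling, here is a Three.js sketch (not the show code) in which a point light and a positional audio source share one parent object, so driving the parent's position from decoded values moves the sound and the light together, while the light's brightness, size, and color follow properties of the sound. The camera, scene, and value mappings here are assumptions.

```js
import * as THREE from 'three';

// Illustrative only: one moving "sound light" — a point light and a
// positional audio source share a parent, so they travel together.
// `camera` and `scene` are assumed to exist; the value mappings are made up.
const listener = new THREE.AudioListener();
camera.add(listener);

const soundLight = new THREE.Group();
const light = new THREE.PointLight(0xffffff, 1, 10);
const audio = new THREE.PositionalAudio(listener);
soundLight.add(light, audio);
scene.add(soundLight);

// Called each frame with values decoded from the stream (each 0..1).
function updateSoundLight({ x, y, level, hue }) {
  soundLight.position.set((x - 0.5) * 8, (y - 0.5) * 8, 0);
  light.intensity = 0.5 + level * 4;   // brightness follows the sound level
  light.distance = 5 + level * 10;     // apparent size of the glow
  light.color.setHSL(hue, 1.0, 0.5);   // color modulated by a sound property
}
```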

This work unlocks new possibilities for browser-based multi-channel audio and 3D visual live streaming.

A bit more detail on how it works

Sound synthesis + MIDI signal generation with Bitwig Studio + Softube Modular

Audio is synthesized live in Bitwig Studio using Softube Modular reproductions of analog modular synthesizers. The X and Y positions of the left and right channels are generated as seen on the bottom right and transmitted via MIDI to TouchDesigner. Additional characteristics of the synthesized sounds, such as pulse, timbre, and reverb amount, are also sent via MIDI to TouchDesigner.
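For illustration, those control signals can be thought of as a small set of MIDI CC messages, as in the sketch below; the actual controller numbers used in the performance are not documented here.

```js
// Hypothetical CC map for the control signals leaving Bitwig; the actual
// controller numbers used in the performance may differ.
const CC_MAP = {
  1: 'leftX',
  2: 'leftY',
  3: 'rightX',
  4: 'rightY',
  5: 'pulse',
  6: 'timbre',
  7: 'reverbAmount',
};

// A MIDI CC message is three bytes: status (0xB0 + channel), controller, value.
function parseCC([status, controller, value]) {
  if ((status & 0xf0) !== 0xb0) return null; // not a control change
  const name = CC_MAP[controller];
  return name ? { name, value: value / 127 } : null; // normalized 0..1
}
```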

MIDI parsing + pixel value video generation in TouchDesigner

In TouchDesigner, the MIDI signals from Bitwig are converted into grayscale pixel values. These pixels are packed into the stream to be parsed later by the front-end metaverse renderer.
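Conceptually, the packing is just a change of range: a 7-bit CC value (0-127) is stored as an 8-bit grayscale level (0-255), and later normalized back to 0..1 in the renderer. A sketch of that round trip, written in JavaScript for readability (the real conversion happens inside TouchDesigner):

```js
// Conceptual mapping only; the actual conversion runs in TouchDesigner.
const ccToGray = (cc) => Math.round((cc / 127) * 255);
const grayToValue = (gray) => gray / 255;

const gray = ccToGray(64);        // ≈ 128, written into a data pixel
const value = grayToValue(gray);  // ≈ 0.5, read back in the renderer
```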

Audio + light modulation values + camera feed streaming with OBS Studio + Mux

The different camera angles, the screen capture of the audio synthesis, and the pixel values extracted from the MIDI are packed together into one video feed in OBS Studio and streamed out to Mux along with the live synthesized audio. These different screens can be unpacked later in the 3D space, and the pixel values can be parsed to modulate the lights attached to the sound.
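On the receiving end, the combined stream can be pulled into a video element with hls.js, which the renderer then reads pixels and audio from. A sketch, with a placeholder playback ID:

```js
import Hls from 'hls.js';

// Sketch of pulling the combined stream into a <video> element that the
// renderer reads pixels and audio from. The playback ID is a placeholder.
const video = document.createElement('video');
video.crossOrigin = 'anonymous';

const src = 'https://stream.mux.com/PLAYBACK_ID.m3u8';
if (Hls.isSupported()) {
  const hls = new Hls();
  hls.loadSource(src);
  hls.attachMedia(video);
} else if (video.canPlayType('application/vnd.apple.mpegurl')) {
  video.src = src; // Safari plays HLS natively
}
video.play(); // browsers usually require a user gesture before playback
```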

Audio channel splitting + sound & light positioning + light characteristics + additional screens projected into the Three.js metaverse space

In the virtual space, the left and right audio channels are extracted from the HLS stream, and each channel is attached to a light whose position and characteristics are controlled by the pixel values extracted from the stream. Each light moves around the viewer's head with its channel of audio attached to it, creating the illusion of being in a real space with moving sound and light. Additional screens from the stream, such as the instrument view, are extracted and placed in the space.
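A sketch of how that channel splitting and attachment might look with the Web Audio API and Three.js, assuming `video` is the playing HLS element, `listener` is the THREE.AudioListener on the camera, and `scene` already exists (colors, sizes, and screen regions are illustrative):

```js
import * as THREE from 'three';

// Sketch: split the stream's stereo audio so each channel gets its own light.
const ctx = listener.context;
const source = ctx.createMediaElementSource(video);
const splitter = ctx.createChannelSplitter(2);
source.connect(splitter);

function makeChannelLight(channelIndex, color) {
  const gain = ctx.createGain();
  splitter.connect(gain, channelIndex);  // isolate one channel of the stream

  const group = new THREE.Group();
  const light = new THREE.PointLight(color, 1, 10);
  const audio = new THREE.PositionalAudio(listener);
  audio.setNodeSource(gain);             // spatialize just this channel
  group.add(light, audio);
  scene.add(group);
  return { group, light };
}

const left = makeChannelLight(0, 0xff8800);
const right = makeChannelLight(1, 0x0088ff);
// Each frame, left.group.position and right.group.position are set from the
// decoded pixel values so each sound and its light move together.

// An additional screen, such as the instrument view, can show a sub-region
// of the same video by offsetting the texture UVs (region values made up).
const tex = new THREE.VideoTexture(video);
tex.repeat.set(0.5, 0.5);
tex.offset.set(0.5, 0.5); // e.g. the top-right quadrant of the packed feed
const screen = new THREE.Mesh(
  new THREE.PlaneGeometry(4, 2.25),
  new THREE.MeshBasicMaterial({ map: tex })
);
scene.add(screen);
```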

View a live demo here