Audio Reactive LED Strips Are Diabolically Hard

245 points - last Tuesday at 1:55 PM

Comments

doctorhandshake last Wednesday at 2:30 PM

I like this writeup but I feel like the title doesn't really tell you what it's about ... to me it's about creativity within constraints.

The author finds, as many do, that naive or first-approximation approaches fail within certain constraints and that more complex methods are necessary to achieve simplicity. He finds, as I have, that perceptual and spectral domains are a better space to work in for things that are perceptual and spectral than the raw data.

What I don't see him get to (might be the next blog post, IDK) is constraints in the use of color: everything is in 'rainbow town', as we say, and it's there that things get chewy.

I'm personally not a fan of emissive green LED light in social spaces. I think it looks terrible and makes people look terrible. Just a personal thing, but putting it into practice with these sorts of systems is challenging as it results in spectral discontinuities and immediately requires the use of more sophisticated color systems.

I'm also about maximum restraint in these systems - if they have flashy tricks, I feel they should do them very very rarely and instead have durational and/or stochastic behavior that keeps a lot in reserve and rewards closer inspection.

I put all this stuff into practice in a permanent audio-reactive LED installation at a food hall/ nightclub in Boulder: https://hardwork.party/rosetta-hall-2019/

thot_experiment yesterday at 5:32 AM

I wish my bike hadn't gotten stolen, I had audioreactive LEDs on there and I'd take a video for y'all. Maybe it's time to resurrect the project.

I don't think FFTs are particularly good for music visualization if you're trying to be expressive because there isn't a particularly meaningful mapping from an FFT to the subjective experience of music and it adds significant latency. I ended up using a few stacked bandpass filters as well as mixing the raw PCM into the light strip for texture. Compressor with a slow attack and even slower release for auto-leveling (the release has to be like 20 seconds to make sure you don't up the gain a bunch during a breakdown in the music). I ran all the realtime stuff on one core of an ESP32 and a bluetooth stack and all the UI stuff on the other. I was getting about 200FPS on a strip of about 120 SK9822s w/ a custom HDR driver giving me about 11.5 bits of color per channel.
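A minimal sketch of the auto-leveling idea described above: an envelope follower that rises with a slow attack and falls with a much slower release, used to normalise the signal level. The class name, the 1-second attack, the update rate and the floor value are assumptions for illustration; only the roughly 20-second release comes from the comment.

```python
import math

def smoothing_coeff(time_constant_s, rate_hz):
    """One-pole smoothing coefficient for a time constant at a given update rate."""
    return math.exp(-1.0 / (time_constant_s * rate_hz))

class AutoLevel:
    """Envelope follower with separate attack and release, used as slow auto-gain."""

    def __init__(self, rate_hz, attack_s=1.0, release_s=20.0, floor=1e-4):
        self.attack = smoothing_coeff(attack_s, rate_hz)    # assumed attack time
        self.release = smoothing_coeff(release_s, rate_hz)  # ~20 s, per the comment
        self.floor = floor          # avoids dividing by ~0 during silence
        self.envelope = floor

    def process(self, level):
        """Take a non-negative level (e.g. block RMS); return it scaled to 0..1."""
        # Rise at the attack rate, fall at the (much slower) release rate.
        coeff = self.attack if level > self.envelope else self.release
        self.envelope = coeff * self.envelope + (1.0 - coeff) * level
        self.envelope = max(self.envelope, self.floor)
        return min(level / self.envelope, 1.0)
```

With a long release, a few seconds of quiet barely moves the envelope, so the gain does not jump up during a breakdown and then clip when the music comes back.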

I really miss my bike, watch your shit on caltrain.

menno-dot-ai last Wednesday at 2:16 PM

Woow, this was my first hardware project right around the time it released! I remember stapling a bunch of LED strips around our common room and creating a case for the pi + power supply by drilling a bunch of ventilation + cable holes in a wooden box.

And of course, by the time I got it to work perfectly I never looked at it again. As is tradition.

WarmWash last Wednesday at 3:13 PM

The real killer is that humans don't hear frequencies, they hear instruments, which are a stack of frequencies that roughly sometimes correlate with a frequency range.

I wonder if transformer tech is close to achieving real-time audio decoding, where you can split a track into its component instruments and drive the light show from that. Think those fancy Christmas-time front-yard light shows as opposed to random colors kind of blinking with what maybe is a beat.

cnlohr yesterday at 7:07 AM

I'm a little surprised that colorchord didn't get a mention, since several folks use it for chromatic sound analysis to color conversion. ColorChord .NET (the desktop music visualizer) as well as AudioLink in VRChat both use the ColorChord core for the chromatic mapping. There was a really good interview with Macyler from CC.net on youtube a few years back.

There's so many massively better solutions than FFTs when it comes to the way people perceive sound.

technimad yesterday at 11:52 AM

I own an emotiscope[1]. Crafted by a solo developer that went on to do other things. This device seems to have implemented the hard part pretty well. Different algorithms with minimal delay.

1. https://github.com/Lixie-Labs/Emotiscope

RickHull yesterday at 5:27 AM

This reminds me of the various visualizations in WinAmp, and there was no shortage of creativity there! Geiss (sp?) anyone? It really whips the llama's ass!

gonzalohm yesterday at 11:33 AM

I loved this project. I started a similar project when COVID started, thinking that it would be easy to implement and that I would be able to use it for the after-COVID parties. I was really wrong: by the time COVID finished, my project was barely working (using FFT) and it really didn't look like an audio-reactive LED. I remember cloning your repo and thinking wow.

We used it for a bunch of parties, but the one I'm most proud of was an outdoor rave with ~20 people for which I got to organize everything. I had to program the Raspberry Pi to start your code at startup because it wasn't going to be connected to a laptop. I hooked the Raspberry Pi and LEDs to a generator and everything went crazy when the lights went on.

It was an amazing party

iamjackg last Wednesday at 2:09 PM

Scott's work is amazing.

Another related project that builds on a similar foundation: https://github.com/ledfx/ledfx

aappleby yesterday at 5:58 AM

I'm late to the thread, but I was able to solve this on a microcontroller ~13 years ago.

https://youtu.be/yItm-9xl0as?si=9I4DLA3qETnQ1N2G

rustyhancock last Wednesday at 11:50 AM

More than 20 years ago I made a small LED display that used a series of LM567 (frequency detection ICs) and LM3914 (bar chart drivers) to make a simple histogram for music.

It was fiddly, and probably too inaccurate for a modern audience but I can't claim it was diabolically hard. Tuning was a faff but we were more willing to sit and tweak resistor and capacitor values then.

JKCalhoun last Wednesday at 1:20 PM

I made a decent audio visualizer using the MSGEQ7 [1]. It buckets a count for seven audio frequency ranges—an Arduino would poll on every loop. It looks like the MSGEQ7 is not a standard part any longer unfortunately.

(And it looks like the 7 frequencies are not distributed linearly—perhaps closer to the mel scale.)
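The spacing can be checked with a few lines of arithmetic. This sketch assumes the seven band centres are 63 Hz, 160 Hz, 400 Hz, 1 kHz, 2.5 kHz, 6.25 kHz and 16 kHz; those values are an assumption here and should be verified against the datasheet in [1]. The helper name is hypothetical.

```python
import math

# Assumed MSGEQ7 band centre frequencies in Hz (verify against the datasheet).
CENTERS_HZ = [63, 160, 400, 1000, 2500, 6250, 16000]

def band_spacing(centers):
    """Return (ratio, octaves) between each pair of neighbouring band centres."""
    return [(hi / lo, math.log2(hi / lo)) for lo, hi in zip(centers, centers[1:])]

for ratio, octaves in band_spacing(CENTERS_HZ):
    print(f"x{ratio:.2f}  ({octaves:.2f} octaves)")
```

Under that assumption each band sits about 2.5 times above the previous one, so the spacing is roughly geometric (even steps in log-frequency) rather than linear.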

I tried using one of the FFT libraries on the Arduino directly but had no luck. The MSGEQ7 chip is nice.

[1] https://cdn.sparkfun.com/assets/d/4/6/0/c/MSGEQ7.pdf

aleksiy123 last Wednesday at 5:00 PM

Fun, I actually did a similar project during my time at UVic 10 years ago, but it was a hoodie.

https://youtu.be/-LMZxSWGLSQ

I remember thinking really hard on what to do with color. Except, like you say, mine is pretty much a naive FFT.

https://github.com/aleksiy325/PiSpectrumHoodie?tab=readme-ov...

Thanks for reminding me.

copypaper last Wednesday at 4:08 PM

This is awesome! I did a similar project in college for one of my classes and ran into the same exact walls as you.

- The more filters I added the worse it got. A simple EMA with smoothing gave the best results. Although, your pipeline looks way better than what I came up with!

- I ended up using the Teensy 4.0, which let me do real-time FFT and post-processing in less than 10ms (I want to say it was ~1ms but I can't recall; it's been a while). If anyone goes down this path I'd heavily recommend checking out the Teensy. It removes the need for a Raspberry Pi or computer. Plus, Paul is an absolute genius and his work is beyond amazing [1].

- I started out with non-addressable LEDs also. I attempted to switch to WS2812s as well, but couldn't find a decent algorithm to make it look good. Yours came out really well! Kudos.

- Putting the LEDs inside of an LED strip diffuser channel made the biggest difference. I spent so long trying to smooth it out and get it to look good when a simple diffuser was all I needed (I love the paper diffuser you made).
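For reference, the simple EMA from the first bullet can be sketched like this. The function name and the default `alpha` are assumptions; each frame is taken to be a list of per-band magnitudes.

```python
def ema_smooth(frames, alpha=0.3):
    """Exponential moving average over successive frames of per-band values.

    alpha close to 1 follows the input quickly; alpha close to 0 smooths heavily.
    """
    smoothed = []
    state = None
    for frame in frames:
        if state is None:
            state = list(frame)  # seed with the first frame
        else:
            state = [alpha * x + (1.0 - alpha) * s for x, s in zip(frame, state)]
        smoothed.append(list(state))
    return smoothed
```

A single coefficient per band is the whole filter, which is part of why it is hard to make worse by tuning.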

RE: What's Still Missing: I came to a similar conclusion as well. Manually programmed animation sequences are unparalleled. I worked as a stagehand in college and saw what went into their shows. It was insane. I think the only way to have that same WOW factor is via pre-processing. I worked on this before AI was feasible, but if I were to take another stab at it I would attempt to do it with something like TinyML. I don't think real time is possible with this approach. Although, maybe you could buffer the audio with a slight delay? I know what I'll be doing this weekend... lol.

Again, great work. To those who also go down this rabbit hole: good luck.

[1]: https://www.pjrc.com/

londons_explore last Wednesday at 1:02 PM

The mel spectrum is the first part of a speech recognition pipeline...

But perhaps you'd get better results if more of a ML speech/audio recognition pipeline were included?

Eg. the pipeline could separate out drum beats from piano notes, and present them differently in the visualization?

An autoencoder network trained to minimize perceptual reconstruction loss would probably have the most 'interesting' information at the bottleneck, so that's the layer I'd feed into my LED strip.
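For readers unfamiliar with the mel scale mentioned at the top of this comment, here is a small sketch of one common Hz-to-mel formula (the 2595 * log10(1 + f/700) variant; other variants exist) and of band edges spaced evenly in mel. The function names are hypothetical.

```python
import math

def hz_to_mel(freq_hz):
    """Convert Hz to mel using the common 2595 * log10(1 + f / 700) formula."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_band_edges(f_min_hz, f_max_hz, num_bands):
    """Band edges in Hz, evenly spaced on the mel scale (num_bands + 1 edges)."""
    lo, hi = hz_to_mel(f_min_hz), hz_to_mel(f_max_hz)
    step = (hi - lo) / num_bands
    return [mel_to_hz(lo + i * step) for i in range(num_bands + 1)]
```

The resulting bands are narrow at low frequencies and wide at high ones, which is closer to how pitch differences are perceived than equal-width FFT bins.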

panki27 last Wednesday at 1:07 PM

Had a similar setup based on an Arduino, 3 hardware filters (highs/mids/lows) for audio and a serial connection. Serial was used to read the MIDI clock from DJ software.

This allowed the device to count the beats, and since most modern EDM is in 4/4, that means you can trigger effects every time something "changes" in the music after synching once.
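A rough sketch of the beat counting, assuming standard MIDI real-time messages: the timing clock byte `0xF8` arrives 24 times per quarter note and `0xFA` marks a start. The class name is hypothetical and the serial reading itself is left out; 4/4 is assumed, as in the comment.

```python
MIDI_CLOCK = 0xF8   # timing clock, 24 pulses per quarter note
MIDI_START = 0xFA   # start of playback: reset the position

PULSES_PER_BEAT = 24
BEATS_PER_BAR = 4   # assumes 4/4

class BeatCounter:
    """Turns a stream of MIDI real-time bytes into (bar, beat) events."""

    def __init__(self):
        self.pulses = 0

    def feed(self, status_byte):
        """Consume one byte; return (bar, beat) on a beat boundary, else None."""
        if status_byte == MIDI_START:
            self.pulses = 0
            return None
        if status_byte != MIDI_CLOCK:
            return None
        on_beat = self.pulses % PULSES_PER_BEAT == 0
        beat_index = self.pulses // PULSES_PER_BEAT
        self.pulses += 1
        if not on_beat:
            return None
        return beat_index // BEATS_PER_BAR, beat_index % BEATS_PER_BAR
```

An effect that should fire on every bar would trigger whenever the returned beat is 0; one that fires on phrase changes could check for every 8th or 16th bar instead.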

milleramp last Wednesday at 3:09 PM

This guy has been making music controlled LED items, boxes and wrist bands. https://www.kickstarter.com/projects/markusloeffler/lumiband...

nixpulvis yesterday at 12:53 AM

I had a lot of fun making this a while back: https://nixpulvis.com/projects/freqk

mdrzn last Wednesday at 12:14 PM

I've always been very interested in audio-reactive LED strips or LED bulbs. I've been using a Windows app to control my LIFX lights for years, but lately it hasn't been maintained and it won't connect to my lights anymore.

I tried recreating the app (and I can connect via BT to the lights) but writing the audio-reactive code was the hardest part (and I still haven't managed to figure out a good rule of thumb or something). I mainly use it when listening to EDM or club music, so it's always a classic 4/4 110-130bpm signature, yet it's hard to have the lights react on beat.

wolvoleo last Wednesday at 2:58 PM

Thanks for this! Exactly the thing I'm struggling with now. Making decent visualisation for music based on ESP32-S3.

nsedlet last Wednesday at 5:21 PM

I also attempted to do real-time audio visualizations with LED strips. What was unsatisfying is that the net effect always seemed to be: the thing would light up with heavy beats and general volume. But otherwise the visual didn't FEEL like the music. This is the same issue I always had with the Winamp visualizations back in the day.

To solve this I tried pre-processing the audio, which only works with recordings obviously. I extract the beats and the chords (using Chordify). I made a basic animation and pulsed the lights to the beat, and mapped the chords to different color palettes.

Some friends and I rushed to put it together as a Burning Man art project and it wasn't perfect, but by the time we launched it felt a lot closer to what I'd imagined. Here's a grainy video of it working at Burning Man: https://www.youtube.com/watch?v=sXVZhv_Xi0I

It works pretty well with most songs that you pick. Just saying there's another way to go somewhere between (1) fully reactive to live audio, and (2) hand designed animations.

I don't think there's an easy bridge to make it work with live audio though unfortunately.

serf last Wednesday at 5:28 PM

the hard part is dousing a room in pulsing bright colorful LEDs tastefully.

I haven't seen that done yet. I think it's one of those Dryland myths.

MomsAVoxell last Wednesday at 6:28 PM

> I think the future of audio visualization on LED strips will involve a mixture of experts tuned for different genres, likely using neural networks.

I think it’s more likely going to come from a direct integration with existing synthesis methods, but .. I’m kind of biased when it comes to audio and light synthesizers, having made a few of each…

We have addressed this expert tuning issue with the MagicShifter, which is a product not quite competing with the OP’s work, but very much aligned with it[1]:

https://magicshifter.net/

.. which is a very fun little light synthesizer capable of POV rendering, in-air text effects, light sequencer programming, MIDI, and so on .. plus, has a 6dof sensor enabling some degree of magnetometers, accelerometers, touch-sensing and so on .. so you can use it for a lot of great things. We have a mode “BEAT” that you can place on a speaker and get reactive LED strips of a form (quite functional) pretty much micro-mechanically, as in: through the case and thus the sensor, not an ADAC, not processing audio - but the levers in between the sensor and the audio source. So - not quite the same, but functionally equivalent in the long run (plus the magicshifter is battery powered and pocketable, and you can paint your own POV images and so on, but .. whatever..)

The thing is, the limits: yes, there are limits - but like all instruments you need to tune to/from/with those limits. It’s not so much that achieving perfect audio reactive LEDs is diabolically hard, but rather making aesthetically/functionally relevant decisions about when to accept those limits requires a bit of gumption.

Humans can be very forgiving with LED/light-based interfaces, if you stack things right. The aesthetics of the thing can go a long way towards providing a great user experience .. and in fact, is important to giving it.

[1] - (okay, you can power a few meters of LED strips with a single MagicShifter, so maybe it is ‘competition’, but whatever..)

p0w3n3d last Wednesday at 11:49 AM

IANAE, but I would go for an electric circuit, not software that steers the LEDs. I think that nowadays, with LLM support, it can be easier and better to optimise it for the sake of latency.

8cvor6j844qw_d6 last Wednesday at 12:51 PM

Are these available commercially for consumers?

blobbers last Wednesday at 7:05 PM

Am I the only one who was surprised the obvious answer is to map frequencies to notes and basically turn your LED strip into a piano visualization? Then just norm to strip size?

There’s plenty of visual experiments of pianists doing this “rock band” “guitar hero” style visualization of notes.
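The mapping suggested above could be sketched as follows, assuming 12-tone equal temperament with A4 = 440 Hz (MIDI note 69) and the 88-key piano range A0..C8 (MIDI notes 21..108). The function names are hypothetical, and detecting which frequencies are actually present is a separate problem not shown here.

```python
import math

def freq_to_midi(freq_hz):
    """Nearest MIDI note number for a frequency, with A4 = 440 Hz = note 69."""
    return round(69 + 12 * math.log2(freq_hz / 440.0))

def note_to_led(note, num_leds, low_note=21, high_note=108):
    """Scale a note in the piano range (A0..C8) onto an LED index 0..num_leds-1."""
    note = min(max(note, low_note), high_note)  # clamp out-of-range notes
    position = (note - low_note) / (high_note - low_note)
    return round(position * (num_leds - 1))
```

For example, `note_to_led(freq_to_midi(440.0), 144)` places A4 a little past the middle of a 144-pixel strip.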

ogurechny last Wednesday at 9:51 PM

The moral of the story seems to be missing.

This kind of visualisation is an arbitrary artistic choice, not just a function of inputs. A can of spray paint is a tool that needs to be mastered, and it's different from, say, oil paints and brush. LED strip is just another tool. You need to figure out first which movements, pulses, patterns it can produce, and what “looks good”. Those would be the strokes.

The same happens on the other side. Choosing how to interpret sound is also an artistic choice. Everyone does the audio spectrum because everyone has seen the audio spectrum, and considers it a “natural” projection to some one-dimensional form. It only seems “natural” because of all of the graphs you've seen in textbooks. The need to use log scale or smoothing when real audio is not a pure set of harmonics is how “nature” has to smuggle itself back into the abstract reasoning. Beats work for a reason: what we call “modern music” is defined by its constant use of rhythm. When you have a different kind of sound, you need to process it differently.

So the goal is to match something you hear in the audio with something nice that the LED strip does. Which is also an arbitrary artistic choice, and can only be judged as a whole. There is no rule that tone has to match specific position or specific colour. Also, people rarely look at LED strips on their own. Just like film crews, you need to take ambient environment into account, and sometimes increase contrast with light, sometimes blend everything together. Some kind of compressor/expander for dynamic range is probably needed for different environments.

Often the thing that reflects the light is more important. I'd even say that the best way to increase the complexity of that low resolution source is to combine it with some complex object instead of using just the straight line. A Christmas tree should come to mind as an example.

It is wrong to think that the goal of such projects is to figure out a perfect simple process that turns one array of values into another. Their goal is to make people feel something.

askl last Wednesday at 11:41 AM

Interesting. I'm currently in the process of building something with an audio-reactive LED strip but hadn't come across this project yet. The WLED [1] ESP32 firmware seems to be able to do something similar or potentially more though.

[1] https://kno.wled.ge/

Edit: Oh wait, that project needs a PC or Raspberry Pi for audio processing. WLED does everything on the ESP32.

burnt-resistor last Wednesday at 11:54 PM

That's one part of it but there are numerous, cheap, COTS audio to RGBA drivers with zillions of music-following modes that work well enough.

The hardest part IMO is distributing it to a lot of LEDs over a large area. It usually involves finding the single serial maximum power length, cutting a bit before that, and SPI multiplexer(s) like SP901E, and strategically distributing right-sized power supplies. SPI amplifiers are also sometimes needed on long runs that skip across areas.
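A back-of-the-envelope sketch of the run planning described above. The maximum run length is taken as an input because it is found empirically for a given strip and supply; the 60 mA per pixel worst-case figure, the 90% margin and the function name are assumptions for illustration, not values from the comment.

```python
import math

AMPS_PER_LED = 0.060  # assumption: ~60 mA per RGB pixel at full white

def plan_power_runs(num_leds, max_leds_per_run, margin_percent=90):
    """Split a strip into runs, each cut a bit short of the measured maximum."""
    usable = max(1, max_leds_per_run * margin_percent // 100)
    num_runs = math.ceil(num_leds / usable)
    runs = []
    remaining = num_leds
    for _ in range(num_runs):
        length = min(usable, remaining)
        runs.append({
            "leds": length,
            "worst_case_amps": round(length * AMPS_PER_LED, 2),
        })
        remaining -= length
    return runs
```

Each run then gets its own right-sized supply or injection point; the worst-case current is only an upper bound, since real content rarely drives every pixel at full white.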

IshKebab last Wednesday at 3:40 PM

It's not that hard. I did a real-time version of the Beatroot algorithm decades ago that worked pretty well for being such a simple algorithm.

m3kw9 last Wednesday at 2:20 PM

How is it hard? Do an A-to-D, add a filter, do the compute, then do a D-to-A.