FFT in Bonzomatic
Bonzomatic can take the sound from a microphone or a loopback device (desktop sound, etc.), process it with an FFT, and make the result available to the shader via the texture texFFT.
Characteristics
The library Bonzomatic uses to fetch audio frames is miniaudio. The initial configuration is defined at :
samplingRate = 44100, which is hardcoded
2 channels enabled, left and right
The library then exposes an interface that expects a callback function, which is called once per audio frame. This callback is defined at . It is important to note that frameCount is the frame size of the current sample, which has been observed to be samplingRate / 100 = 441, with interleaved data for the left and right channels. On each audio frame, the callback shifts the current sampleBuf and appends the latest received audio frame to it.
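The shift-and-append behaviour of the callback can be sketched as follows. This is a minimal illustration, not Bonzomatic's code: the function name `push_frames`, the tiny buffer length, and the choice to downmix left and right by averaging are all assumptions made for the example.

```python
def push_frames(sample_buf, interleaved, frame_count):
    """Shift sample_buf left and append frame_count new mono samples, in place."""
    # Interleaved input looks like [L0, R0, L1, R1, ...].
    # Averaging the two channels into mono is an assumption of this sketch.
    mono = [
        (interleaved[2 * i] + interleaved[2 * i + 1]) / 2.0
        for i in range(frame_count)
    ]
    # If more samples arrive than the buffer can hold, keep only the newest.
    mono = mono[-len(sample_buf):]
    # Drop the oldest samples and append the new ones at the end.
    sample_buf[:] = sample_buf[len(mono):] + mono
    return sample_buf


# Hypothetical 6-sample buffer; a real buffer would be much longer.
buf = [0.0] * 6
push_frames(buf, [1.0, 3.0, 2.0, 4.0], 2)   # two stereo frames
push_frames(buf, [10.0, 20.0], 1)           # one stereo frame
```

With frameCount = 441, each call would receive 882 interleaved floats and move 441 new samples into the buffer, so the buffer always holds the most recent audio.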
The actual computation is done at . The process that converts the FFT result into a usable texture is:
Get the amplitude of the signal.
Amplify the signal with a configurable variable.
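The two steps above can be sketched as follows. This is an illustration only: it uses a naive DFT instead of a real FFT library to stay self-contained, and the names `dft`, `fft_to_texture_values` and `amplification` are hypothetical, not Bonzomatic identifiers.

```python
import cmath
import math


def dft(samples):
    """Naive O(n^2) discrete Fourier transform, returning the first n/2 bins."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n // 2)
    ]


def fft_to_texture_values(spectrum, amplification=1.0):
    """Amplitude of each complex bin, scaled by a configurable gain."""
    # Step 1: amplitude is the magnitude of the complex value.
    # Step 2: multiply by the gain. No other normalisation is applied here.
    return [abs(c) * amplification for c in spectrum]


# A cosine that completes exactly two cycles over 16 samples.
signal = [math.cos(2 * math.pi * 2 * t / 16) for t in range(16)]
values = fft_to_texture_values(dft(signal), amplification=2.0)
peak_bin = max(range(len(values)), key=values.__getitem__)
```

The energy of the test tone ends up in a single bin, and the gain only rescales the amplitudes without moving the peak.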
The array is then injected as a texture at .
Knowing these details:
FFT bandwidth: 44100 / 2 = 22050, which means the FFT spans 0 Hz to about 22 kHz.
Bin width: 22050 / 1024 = 21.53, so each bin is about 21.53 Hz wide.
Consequences
Besides the fact that the FFT amplitude is used "raw", without any pre-processing for "normalisation", you should also take the FFT bandwidth into account:
Human hearing ranges from about 20 Hz to 20 kHz, which means the first bin and roughly the last 100 bins (about 2 kHz of bandwidth) technically carry inaudible sound.
Warning: the following video contains high-frequency notes.
https://scrapbox.io/files/66feb93845de66001d7a1591.mp4
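The bin arithmetic behind these numbers can be written out as a small sketch. The helper names are hypothetical, and treating bin i as covering the range from i × bin width to (i + 1) × bin width is a simplification assumed for this example.

```python
SAMPLING_RATE = 44100  # Hz, the hardcoded sampling rate
BIN_COUNT = 1024       # number of bins used in the bin-width computation

BANDWIDTH = SAMPLING_RATE / 2      # 22050 Hz
BIN_WIDTH = BANDWIDTH / BIN_COUNT  # about 21.53 Hz


def hz_to_bin(freq_hz):
    """Index of the bin containing freq_hz, clamped to the valid range."""
    return min(BIN_COUNT - 1, max(0, int(freq_hz / BIN_WIDTH)))


def bin_to_hz(index):
    """Lower edge, in Hz, of the given bin."""
    return index * BIN_WIDTH


# Bins that lie entirely above the 20 kHz limit of human hearing.
first_inaudible_bin = hz_to_bin(20000) + 1
inaudible_bins = BIN_COUNT - first_inaudible_bin
```

Under these assumptions 95 bins sit entirely above 20 kHz, which matches the "around 100 bins" estimate.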
You also need to take into account what kind of sound is being played:
If the sound comes from a "raw" source, such as sound generated directly on the computer or a lossless format, then the whole bandwidth will technically contain data.
If the sound comes from a lossy format, such as MP3 or a stream from any VOD / streaming platform, it is highly probable that part of the high frequencies of the FFT will carry no information, as most of these formats simply strip those frequencies. For example, MP3 with some encoder settings will mostly cut frequencies above 16-19 kHz.
The following examples apply some amplitude rectification to make the result more visual.
FLAC file (roughly everything below 20 kHz is represented)
https://scrapbox.io/files/66febce97c7b8e001d7ef110.mp4
MP3 from a website (notice that the drop occurs well before the 20 kHz seen in the example above)
https://scrapbox.io/files/66febd3bb8a4de001c34529d.mp4
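One way to check how much of the spectrum a given source actually fills is to look for the highest bin that still carries energy. The sketch below works on a synthetic array rather than real audio; the function name `estimate_cutoff_hz` and the threshold value are hypothetical choices for illustration.

```python
BIN_WIDTH = 22050 / 1024  # Hz per bin


def estimate_cutoff_hz(fft_values, threshold=1e-4):
    """Upper edge (Hz) of the highest bin above threshold, or 0.0 if none is."""
    for index in range(len(fft_values) - 1, -1, -1):
        if fft_values[index] > threshold:
            return (index + 1) * BIN_WIDTH
    return 0.0


# Synthetic spectrum: energy in the first 743 bins, silence above them,
# which imitates a lossy source cut at roughly 16 kHz.
spectrum = [0.01] * 743 + [0.0] * (1024 - 743)
cutoff = estimate_cutoff_hz(spectrum)
```

On real audio the estimate will fluctuate from frame to frame, so it would make sense to take the maximum over a few seconds of sound.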
FFT Smoothed
The texture texFFTSmoothed, defined at , takes the FFT values but mixes the current and previous FFT values using fFFTSlightSmoothingFactor. The result is similar FFT data that is more robust to sudden changes and amplitude spikes.
FFT Integrated
This generates a texture in which each texel increases at a pace set by the amplitude of its associated frequency bin. You can think of it as an fGlobalTime that, instead of advancing at a constant rate like a normal clock, advances according to the amplitude of the selected frequency, which can create a sense of rhythm when used well. See FFT based motion for a concrete example. Note also the check defined at , which makes sure that no bin in fftDataIntegrated ever goes above maxIntegralValue, a hardcoded value of 1024 defined at .
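The smoothing and integration steps can be sketched together as a per-frame update. This is an illustration under assumptions, not Bonzomatic's code: the parameter `smoothing_factor` stands in for the smoothing factor named above, feeding the integral from the smoothed value rather than the raw one is an assumption, and so is wrapping around by subtracting the cap when it is exceeded.

```python
MAX_INTEGRAL_VALUE = 1024.0  # the hardcoded cap mentioned in the text


def update(fft, smoothed, integrated, smoothing_factor=0.5):
    """Advance the smoothed and integrated arrays by one frame, in place."""
    for i, value in enumerate(fft):
        # Mix the previous smoothed value with the new FFT value.
        smoothed[i] = smoothed[i] * smoothing_factor + (1.0 - smoothing_factor) * value
        # Accumulate: bins with more amplitude advance faster.
        integrated[i] += smoothed[i]
        # Keep the value bounded. Wrapping around is an assumption of this sketch.
        if integrated[i] > MAX_INTEGRAL_VALUE:
            integrated[i] -= MAX_INTEGRAL_VALUE


# Two bins: the first one is constantly loud, the second one is silent.
fft = [1.0, 0.0]
smoothed = [0.0, 0.0]
integrated = [0.0, 0.0]
for _ in range(3):
    update(fft, smoothed, integrated)
```

After three frames the loud bin has advanced while the silent bin has not moved, which is the "clock driven by amplitude" behaviour described above. The smoothed value also approaches the input gradually instead of jumping to it.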