You could pre-process the sound once at game start and store the result in a struct. Given the sound's sample rate and the desired display rate (e.g. 60 fps for the visualizer), you would average each block of sampleCountPerSecond/60 samples into a single value; for 44.1 kHz audio that is 735 samples per displayed frame. Optionally, also store the min and max of each block, or split each block into separate frequency bands.
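A minimal sketch of that pre-processing step, assuming the sound is already decoded into a flat sequence of mono float samples. The names `FrameSummary` and `preprocess` are made up for illustration, and averaging the absolute value (rather than the raw signed samples, which tend to cancel out to roughly zero) is my assumption about what is useful to draw:

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class FrameSummary:
    """Summary of the samples that fall into one displayed frame (hypothetical struct)."""
    mean_abs: float  # average of |sample|; an assumption, see the note above
    minimum: float
    maximum: float


def preprocess(samples: Sequence[float], sample_rate: int, fps: int = 60) -> List[FrameSummary]:
    """Reduce raw samples to one summary per visualizer frame.

    Assumes mono samples. The block size is sample_rate // fps, so rates that
    are not a multiple of fps will drift slightly over long sounds.
    """
    block = max(1, sample_rate // fps)
    frames: List[FrameSummary] = []
    for start in range(0, len(samples), block):
        chunk = samples[start:start + block]  # the last chunk may be shorter
        frames.append(FrameSummary(
            mean_abs=sum(abs(s) for s in chunk) / len(chunk),
            minimum=min(chunk),
            maximum=max(chunk),
        ))
    return frames
```

At runtime the visualizer would then only need a lookup such as `frames[int(playback_seconds * fps)]` (clamped to the list length) instead of touching the raw samples every frame. Splitting into frequency bands would need an FFT per block, which this sketch leaves out.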

