Hows Ecology Squad: linux high resolution audio

So, you can use all 24 bits and 96kHz of your soundcard, if you want:

vi ~/.config/pipewire/pipewire.conf.d/99-sample-rate.conf

And specify:

context.properties = {
    default.clock.rate          = 96000
    default.clock.allowed-rates = [ 44100 48000 88200 96000 ]
    default.clock.quantum       = 1024
    default.clock.min-quantum   = 32
    default.clock.max-quantum   = 8192
}

Then:

pactl info

Should say:

Default Sample Specification: float32le 2ch 96000Hz

And yay, it sounds great, just look at all this music:

There must be some upscaling going on somewhere...

I've seen EasyEffects using 60% CPU, but restarting the sound services calmed it right down. On Ubuntu 24:

systemctl --user restart pipewire pipewire-pulse

It seems like 16bit samples could become 24bit just by adding zeroes at the little end, and 48kHz can simply repeat each sample... What do we do when it's 44100 Hz? Hmmm

Claude AI

Let me help clarify how audio sample rate and bit depth conversion actually works:

Bit Depth Conversion (16-bit to 24/32-bit):

For integer PCM it really is adding zeros at the least-significant end: a left shift by 8 bits, which is the same as multiplying by 256
That multiplication is the scaling that keeps every value at the same relative amplitude
16-bit values (-32,768 to +32,767) land inside the 24-bit range (-8,388,608 to +8,388,607), topping out at +8,388,352
This preserves the original signal's dynamic range, and the extra low bits give later processing room to work in
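A minimal sketch of that scaling, assuming plain signed-integer PCM. The function names are made up for illustration. The float version is included because `pactl info` reports float32le above, which suggests the samples end up as 32-bit floats somewhere in the pipeline; how PipeWire does that internally is not shown here.

```python
def pcm16_to_pcm24(sample: int) -> int:
    """Widen a signed 16-bit sample to 24 bits by scaling it by 256."""
    if not -32768 <= sample <= 32767:
        raise ValueError("not a signed 16-bit value")
    return sample * 256  # equivalent to appending eight zero bits


def pcm16_to_float(sample: int) -> float:
    """Map a signed 16-bit sample onto the [-1.0, 1.0) float range."""
    if not -32768 <= sample <= 32767:
        raise ValueError("not a signed 16-bit value")
    return sample / 32768.0


for s in (-32768, -1, 0, 1, 32767):
    print(s, pcm16_to_pcm24(s), pcm16_to_float(s))
```

Either way, every sample keeps the same position relative to full scale, so nothing about the sound changes. The wider format only matters for whatever processing happens afterwards.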

Sample Rate Conversion (44.1kHz to 96kHz):

It doesn't simply repeat samples - that would create artifacts
Instead, it uses interpolation algorithms to calculate new samples between the existing ones
Common methods include:
- Linear interpolation (basic, can cause artifacts)
- Sinc interpolation (more accurate but CPU intensive)
- Polynomial interpolation (good balance of quality and performance)

The tricky part with 44.1kHz to 96kHz conversion is that it's not a whole-number multiple (unlike 48kHz to 96kHz, which is exactly 2). The ratio is 96000/44100 = 320/147 ≈ 2.18, so the resampler has to produce 320 output samples for every 147 input samples, and almost none of them line up in time with an original sample. This requires more complex mathematics to maintain audio quality and prevent artifacts.
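To make the ratio concrete, here is a small sketch that reduces it to a fraction and then resamples with naive linear interpolation. This is the "basic" method from the list above, chosen only because it is short. It is not how PipeWire resamples, and the function names are hypothetical.

```python
from fractions import Fraction


def resample_ratio(src_hz: int, dst_hz: int) -> Fraction:
    """Output samples per input sample, as an exact reduced fraction."""
    return Fraction(dst_hz, src_hz)


def resample_linear(samples, src_hz: int, dst_hz: int):
    """Naive linear-interpolation resampler (illustration only)."""
    if len(samples) < 2:
        return list(samples)
    ratio = resample_ratio(src_hz, dst_hz)
    n_out = int((len(samples) - 1) * ratio) + 1
    out = []
    for i in range(n_out):
        pos = Fraction(i) / ratio        # exact position in input samples
        k = int(pos)                     # input sample at or before pos
        frac = float(pos - k)            # how far towards the next sample
        nxt = samples[min(k + 1, len(samples) - 1)]
        out.append(samples[k] * (1.0 - frac) + nxt * frac)
    return out


print(resample_ratio(48000, 96000))   # a whole number: the easy case
print(resample_ratio(44100, 96000))   # 320/147: the awkward case
print(resample_linear([0.0, 1.0], 48000, 96000))
```

Using `Fraction` for the position keeps the arithmetic exact, so the output positions never drift against the input over a long stream. A real resampler replaces the two-point blend with a filter that looks at many neighbouring samples.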

What's interesting though is that 44.1kHz was chosen as the CD standard because it's just over twice the upper limit of human hearing (roughly 20kHz), so the Nyquist frequency covers all audible frequencies. 96kHz is exactly twice 48kHz and a little over twice 44.1kHz, which moves the Nyquist frequency up to 48kHz and leaves plenty of room above the audible band for audio processing and ultrasonic frequencies.

Let me clarify these audio fundamentals:

Sample Rate determines the highest frequency that can be captured/reproduced (Nyquist frequency = sample rate/2). So:

44.1kHz can theoretically capture up to 22.05kHz
96kHz can theoretically capture up to 48kHz

Ultrasonic frequencies (>20kHz):

Are often filtered out, yes, because:
- Most people can't hear them
- They can cause intermodulation distortion
- They can waste amplifier power
Arguments for keeping them:
- Some argue they affect the audible frequencies' timing/phase relationships
- Could affect perception of transients (very fast changes in sound)
- May preserve more natural decay characteristics of instruments
- Some claim they have subtle psychoacoustic effects

Bit depth is actually about dynamic range and noise floor, not wave location precision:

Each bit represents 6dB of dynamic range
16-bit = ~96dB range
24-bit = ~144dB range
Allows for quieter sounds to be captured without noise
Gives more headroom for audio processing
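The figures above come from the ratio between the largest value and the smallest step, expressed in decibels. A quick check, assuming the usual 20·log10 amplitude formula (the function name is just for illustration):

```python
import math


def dynamic_range_db(bits: int) -> float:
    """Theoretical dynamic range of linear PCM with the given bit depth."""
    return 20.0 * math.log10(2 ** bits)


for bits in (1, 16, 24):
    print(f"{bits:2d} bits: {dynamic_range_db(bits):6.2f} dB")
```

One bit works out to about 6.02dB, which is where the rounded "6dB per bit" rule comes from.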

Sound-space is logarithmic:

We perceive volume logarithmically, not linearly:

Each doubling of sound pressure (6dB) feels like a similar step in loudness
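Each extra bit doubles the ratio between the loudest value and the smallest step, and a doubling is about 6dB, which is where the 6dB-per-bit figure comes from
Going from 1 to 2 bits adds the same ~6dB of range as going from 15 to 16 bits

So the bits aren't about "resolution" in quite the visual sense:

They do divide the amplitude into finer steps, but what you hear from that is a lower noise floor, not a sharper picture
They give us more "floors" in the dynamic range "building"
Each additional bit doubles the number of possible amplitude levels
In linear PCM these levels are evenly spaced; the logarithmic part is that each doubling adds the same ~6dB of range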
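One way to see what extra bits buy is to quantise a test tone and measure the error. This sketch assumes linear PCM, where the steps are evenly spaced in amplitude, and uses a half-scale 997Hz sine as an arbitrary test signal. The names and parameters are illustrative only.

```python
import math


def quantize(x: float, bits: int) -> float:
    """Round x (expected in [-1.0, 1.0)) to the nearest evenly spaced PCM step."""
    scale = 2 ** (bits - 1)                    # e.g. 32768 for 16-bit
    code = round(x * scale)                    # nearest integer code
    code = max(-scale, min(scale - 1, code))   # clamp to the signed range
    return code / scale


def snr_db(bits: int, n: int = 48000, freq: float = 997.0,
           rate: float = 48000.0, amplitude: float = 0.5) -> float:
    """Signal-to-quantisation-noise ratio of a sine wave, in dB."""
    signal_power = 0.0
    noise_power = 0.0
    for i in range(n):
        x = amplitude * math.sin(2.0 * math.pi * freq * i / rate)
        err = quantize(x, bits) - x
        signal_power += x * x
        noise_power += err * err
    return 10.0 * math.log10(signal_power / noise_power)


for bits in (8, 12, 16):
    print(f"{bits:2d} bits: {snr_db(bits):.1f} dB")
```

The measured ratio should climb by roughly 6dB for each added bit: the tone stays the same and the quantisation noise underneath it drops.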

Sample rate, on the other hand, IS more like visual resolution:

Higher sample rates let us capture higher frequencies
It's about how finely we can slice time
This directly affects the highest frequency we can represent

It's a common misconception about bit depth because we tend to think in visual/linear terms, but our ears work differently from our eyes! Audio processing has to account for this logarithmic nature of human hearing.

Conclusion

It doesn't use much CPU to do this, might as well switch it on.

Especially for capturing high quality audio you intend to work on, you want all that sound-space to avoid squishing your delicious waves before you can get your compressor etc on it.

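If you want to burn CPU and/or decrease latency, the PipeWire option for that is the quantum: the number of samples processed per cycle. A smaller quantum means lower latency, but more frequent wakeups and so more CPU.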
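As a rough guide, one quantum of buffering costs quantum / sample rate seconds. The sketch below applies that to the values from the config above. It covers a single processing cycle only, and real end-to-end latency also depends on the device and the applications involved.

```python
def quantum_latency_ms(quantum: int, rate_hz: int) -> float:
    """Duration of one processing cycle, in milliseconds."""
    return 1000.0 * quantum / rate_hz


for quantum in (32, 1024, 8192):
    print(f"quantum {quantum:5d} @ 96000 Hz: "
          f"{quantum_latency_ms(quantum, 96000):6.2f} ms")
```

So the default of 1024 at 96kHz is a little under 11ms per cycle, and the min and max quantum settings bound how far that can move.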

Have fun!

Sunday, 5 January 2025
