Muzik Faktry: Processing music files on Linux

intro

Apparently i'm an oddball because my music files usually contain only the artist, title and genre tags (and i only use 3 or 4 of the latter), along with some ReplayGain information. I'm not interested in embedding lyrics, images or whatever else people are sticking in their music these days. I like to aim for minimal, compact, error-free files that meet my medium-high quality standard and that's why we're here. If you're an audiophile geek with multiple ex-wives who spends weeks deciding which DAC will provide the truest sound, this might not be for you, but if you just like to listen to good sounding music while you prune your hedges, read on.

Typically i acquire specific songs from various artists rather than entire albums, mainly because i've yet to run across an album for which i like every song. The problem with this disjointed approach is that it results in consistency problems regarding tags, headers, volume levels, file names, etc. Most maddening is when some aspiring government employee thinks that transcoding a 64 kbit/s MP3 to FLAC is somehow a good idea. With the exception of that last one, most issues are generally easy to fix using a few software tools.

Enter Muzik Faktry, a Bash shell script which is essentially a wrapper that handles various 3rd party tools and does so without relying on stuff i don't like relying on, such as Java, Wine, Mono or Electron.

Muzik Faktry is a menu driven script that runs in a terminal, but don't let that scare you half to death if you're not a terminal freak... get it? There are plenty of prompts to guide you along and it presents a unified interface for running several software tools which are available for Linux-based operating systems (because they suck less than Windows operating systems).

Muzik Faktry was formerly named MP3 Factory because i had originally been focused largely on producing high quality MP3s until i learned more about what a scrambled mess the MP3 format is. As a result i shifted my focus to the lossless formats, which are far superior and much easier to work with. All of the tasks that Muzik Faktry performs are thus lossless, meaning there is no reduction in audio quality.

Muzik Faktry is intended primarily for transcoding (converting) uncompressed tracks or albums to the FLAC format and/or performing various operations on them before adding them to your collection, including comprehensive integrity checking. So comprehensive in fact, that when i first ran my MP3 music collection through it before i moved to lossless, the bloody thing flagged every single file as "JUNK"!

The script may not be a complete solution for processing your music since it offers only rudimentary tagging functions, nor is it a complete replacement for comprehensive audio analysis tools, however it has served my needs quite well without having to employ any additional tools.

Muzik Faktry was developed and tested on Manjaro Linux or, as i affectionately call it, Arch for dummies, however it should run on any flavor of Linux that includes a Bash compatible shell and for which the dependent packages are available.

The majority of documentation for Muzik Faktry is located on the code repository over at Codeberg so i won't bother repeating it here, however i do want to expand on the spectral analysis of music files a bit.

Spectral Analysis

Spectral analysis of music files, while somewhat time consuming, can be an important determining factor regarding the quality of the audio. I'm using mostly MP3s in the following examples, but this information can be applied to lossless formats as well, which are all that Muzik Faktry is designed to work with.

One of the things to look for in these images is the frequency cutoff point, meaning the highest frequency the audio attains. Using the MP3 format as an example, the highest frequency that can be encoded is limited by the bit rate, however there is more to consider. Pictures are worth stuff they say, so let's give that a shot...

Here's what the frequency spectrum looks like for a random FLAC music file that indicated a bitrate of 872 kbit/s. The keen observer you are, you'll notice that the frequency cutoff is close to 21000 Hz, or 21 kHz, which is slightly higher than the highest frequency which the average young human is capable of hearing (~20 kHz). The file size here is a whopping 26,924,696 bytes.

Frequency spectrum for FLAC sample 872kbps
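If you'd like to produce an image like this for your own files, here is a minimal sketch using SoX. It assumes the sox package is installed (with MP3 support if you feed it MP3s) and it is not necessarily how Muzik Faktry generates its spectrograms. The function names are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: render a spectrogram PNG next to an audio file using SoX.
# Assumption: `sox` is installed; function names are hypothetical.

# Derive the PNG name from the audio file name (song.flac -> song.png).
spectrogram_name() {
    local audio="$1"
    printf '%s.png\n' "${audio%.*}"
}

# Render the spectrogram. `-n` tells SoX to produce no audio output,
# `-t` sets the title printed on the image and `-o` sets the output file.
make_spectrogram() {
    local audio="$1"
    local png
    png="$(spectrogram_name "$audio")"
    if ! command -v sox >/dev/null 2>&1; then
        echo "sox not found, please install it first" >&2
        return 1
    fi
    sox "$audio" -n spectrogram -t "$(basename "$audio")" -o "$png"
}
```

Calling make_spectrogram "My Song.flac" should leave a "My Song.png" beside the original file.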

Next i transcoded the 872 kbit/s FLAC file to a 320 kbit/s CBR MP3 (not something i would normally do). The file size dropped to 9,882,876 bytes, less than half of what it was. You'll notice the frequency cutoff point has dropped to around 20000 Hz, or 20 kHz, so while we certainly lost the higher frequencies, we didn't lose anything that anybody without extremely good hearing and who is listening in a very quiet environment with a very decent sound system is likely to notice. We also saved a hell of a lot of disk space which may be worthwhile if you need to cram as many songs as possible onto that microscopic SD card that you struggle to plug in to your fondleslab.

Frequency spectrum for MP3 sample 320kbps

Next i converted the 872 kbit/s FLAC to a 128 kbit/s MP3. This time the quality loss is significant to the point where many people with decent hearing in a quiet listening environment and a decent sound system would probably notice. If you're listening to music while operating a jackhammer however, then a 128 kbit/s MP3 might be just dandy. Here the frequency cutoff is around 16000 Hz, or 16 kHz, and the file size is 3,953,289 bytes, again less than half of what it was.

Frequency spectrum for MP3 sample 128kbps

Now it's time to 'do as stoopid does'; i took the 128 kbit/s MP3 and upsampled it to a 320 kbit/s MP3 because "more quality", however the only thing i actually accomplished was to fatten the file size, which has now more than doubled to 9,882,876 bytes, the same size as the 320 kbit/s MP3 that was converted from FLAC earlier. The quality is, at best, no better than the 128 kbit/s file and the frequency cutoff point hasn't changed. The lesson here is that you cannot add quality to an audio file that doesn't have it in the first place!

Frequency spectrum for MP3 sample 320kbps
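Why did both 320 kbit/s files land on exactly the same size? With a constant bit rate the size depends only on the bit rate and the length of the song, not on what the audio actually contains. The sketch below shows the arithmetic. The function names are made up, and the 247 second duration used in the usage note is an assumption worked backwards from the file sizes above, not something measured.

```shell
#!/usr/bin/env bash
# Sketch only: rough size of a constant bit rate (CBR) MP3, ignoring tags and headers.
# bytes = kbit/s * 1000 bits / 8 bits per byte * seconds
cbr_size_bytes() {
    local kbps="$1" seconds="$2"
    echo $(( kbps * 1000 / 8 * seconds ))
}

# New file size as a whole-number percentage of the old size (rounded down).
size_percent() {
    local new_bytes="$1" old_bytes="$2"
    echo $(( new_bytes * 100 / old_bytes ))
}
```

For a song of roughly 247 seconds, cbr_size_bytes 320 247 gives 9880000 and cbr_size_bytes 128 247 gives 3952000, both within a few kilobytes of the sizes listed above. Likewise size_percent 9882876 26924696 comes out at 36, so the 320 kbit/s MP3 is a bit over a third the size of the FLAC.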

So now we know that files with a high bit rate and a low frequency cutoff point are upsampled junk, right? Well, no. Not necessarily.

When encoding an MP3, audio data is sacrificed in order to reduce the file size. A good encoder (LAME) does this by discarding sound that the ear would probably never hear anyway, so everything above a given frequency, as well as everything below a given frequency, is discarded right off the bat, but there's a lot more to it than that. The bit rate is a primary factor regarding sound quality and frequency cut-off with the LAME encoder: as the bit rate is decreased, so too is the maximum frequency that the audio is able to attain. We know that the average young human with really good hearing can hear frequencies up to about 20 kHz, but as we age that threshold drops. Below is the approximate maximum frequency that can be attained at a given bit rate for a LAME encoded MP3. Keep this in mind when viewing spectrogram images of your audio files:

64  kbit/s = ~11 kHz frequency cut-off
128 kbit/s = ~16 kHz frequency cut-off
192 kbit/s = ~19 kHz frequency cut-off
320 kbit/s = ~20 kHz frequency cut-off
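The table above can be turned into a quick sanity check, sketched below. The function names and the 2 kHz tolerance are assumptions picked for illustration, and the result is only a hint that a file deserves a closer look, not a verdict on its quality.

```shell
#!/usr/bin/env bash
# Sketch only: approximate LAME cut-off in kHz for the bit rates listed above.
expected_cutoff_khz() {
    case "$1" in
        64)  echo 11 ;;
        128) echo 16 ;;
        192) echo 19 ;;
        320) echo 20 ;;
        *)   echo "unknown"; return 1 ;;
    esac
}

# Compare the cut-off you read off the spectrogram (whole kHz) with the table.
# Assumption: more than 2 kHz below the expected value is worth a second look.
check_cutoff() {
    local kbps="$1" observed_khz="$2" expected
    expected="$(expected_cutoff_khz "$kbps")" || {
        echo "no reference value for ${kbps} kbit/s"
        return 2
    }
    if [ "$observed_khz" -lt $(( expected - 2 )) ]; then
        echo "suspicious"
    else
        echo "plausible"
    fi
}
```

For example, check_cutoff 320 16 reports "suspicious", which matches the re-encoded 128 kbit/s file from earlier, while check_cutoff 128 16 reports "plausible".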

But again, the frequency cut-off point is not in itself a determining factor of audio quality. While the cut-off points above are useful for Rock and Pop type genres with a variety of instruments and vocals, they may be meaningless for songs which are not composed with so much diversity. For example, a 'perfect' quality audio recording of a flute, a vocal, or a piano, may have a frequency cut-off point that is well below 20 kHz.

Another thing to look for in the spectrogram is signs of clipping. Clipping is the result of the gain (volume) having been raised too high and this can lead to really nasty distortion when listening to the song. This is not uncommon given the loudness war we've been subjected to, as well as poor encoding practices. Here's a frequency spectrum of Acoustic Alchemy - Clear Air For Miles.mp3 (320 kbit/s CBR 44.1 kHz). There are no significant signs of clipping:

Frequency spectrum: Acoustic Alchemy - Clear Air For Miles MP3

Using MP3Gain i then increased the gain by 10 dB and this was the result:

Frequency spectrum: Acoustic Alchemy – Clear Air For Miles MP3 (clipping)

What you'll notice is that a lot of the green and blue stuff now reaches the upper frequency cutoff point and this is indicative of clipping. Also see the Clipping or distortion section of the article Understanding Spectrograms.
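The arithmetic behind this is simple: digital audio tops out at 0 dBFS, so if the loudest peak in a file plus the gain you apply ends up above zero, something has to give. Here is a minimal sketch of that check. The function name and the example peak levels are hypothetical, since the actual peak level of the track above was not recorded.

```shell
#!/usr/bin/env bash
# Sketch only: predict whether a gain change pushes the loudest peak past full scale (0 dBFS).
# peak_db: current peak level in dBFS (zero or negative), gain_db: gain to apply in dB.
# awk does the math because Bash arithmetic is integer-only.
clips_after_gain() {
    local peak_db="$1" gain_db="$2"
    awk -v p="$peak_db" -v g="$gain_db" 'BEGIN {
        r = p + g
        if (r > 0) printf "clips by %.1f dB\n", r
        else       printf "headroom %.1f dB\n", 0 - r
    }'
}
```

A file that peaks at -1.5 dBFS and gets 10 dB of gain would be reported by clips_after_gain -1.5 10 as "clips by 8.5 dB". The peak level itself can be read from an analysis tool, for example the "Pk lev dB" line printed by sox with its stats effect.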

Another bit of information worth considering is an answer to a question posted on Stack Exchange, "How to tell if a high-res flac file has been upsampled from a CD-quality file?" Following is an excerpt:

Here are a few things to look for in a spectral analysis

  1. Each format has it's own rolloff: CD drops like a rock at 20 kHz and MP3 drops steeply at 16 kHz. If you have a 96 kHz file with these sharp drop offs, it's likely been up-sampled.
  2. Inspect the content above 20 kHz. If there is random "noise like" features in there, it's probably genuine. If it has very little content and/or the content looks like a low-pass filtered mirror image of the content below 20 kHz, it's been up-sampled.
  3. You can look at correlation at high frequencies. For a genuine recording this will mostly be uncorrelated. If there is significant correlation, it's a potential sign of "joint stereo coding" which could hint at lossy compression.
  4. Look at the recording date: If it's been made before 1990 it's almost guaranteed to be up-sampled. There never was a digital studio master and the best they can do is to sample a tape master.
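The first point in the excerpt can be sketched as a crude check. A file can only hold frequencies up to half its sample rate, so a 96 kHz file has room up to 48 kHz, while a CD (44.1 kHz) stops at 22.05 kHz. If a high sample rate file goes quiet at or below the CD limit, it may have started life as a CD rip. The function names and the threshold are assumptions for illustration and this covers only that one point of the list.

```shell
#!/usr/bin/env bash
# Sketch only: the highest frequency a given sample rate can represent (half the rate).
nyquist_hz() {
    echo $(( $1 / 2 ))
}

# Flag a high sample rate file whose content stops at or below the CD limit.
# sample_rate_hz: the file's sample rate, cutoff_hz: cut-off read off the spectrogram.
looks_upsampled() {
    local sample_rate_hz="$1" cutoff_hz="$2"
    local cd_limit_hz
    cd_limit_hz="$(nyquist_hz 44100)"   # 22050 Hz
    if [ "$sample_rate_hz" -gt 44100 ] && [ "$cutoff_hz" -le "$cd_limit_hz" ]; then
        echo "possibly up-sampled"
    else
        echo "no sign of up-sampling"
    fi
}
```

With these assumptions, looks_upsampled 96000 21000 prints "possibly up-sampled" and looks_upsampled 96000 40000 prints "no sign of up-sampling". Keep in mind that quiet acoustic recordings may legitimately have little content up there.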

So how do you use all this information? Well, this is where it gets tricky because a file that was encoded using a high bit rate, yet doesn't approach 20 kHz, is not necessarily junk. That said, given the wide variety of sounds in most Rock, Pop and some other genres of music we might listen to, such songs will often approach or exceed 20 kHz as long as a lossless or high quality lossy encoding method was used.

Over time you'll develop a sense of what the frequency spectrum should look like given the bit rate and the different sounds present in the recording. Ultimately what matters however is whether you're happy with what you hear, not the fancy colors on a graph.

See also: Spectral Analysis (archive) and this post on Reddit, How to determine the true quality of an audio file.

bug reports

You can drop a comment below without having to create an account, or open an issue on the code repository.

suggested software

Muzik Faktry does not use the following programs, however i readily recommend them.

Kwave Sound Editor (pkg. name: kwave): Kwave for the KDE desktop is a nice and simple sound editor that i used while developing Muzik Faktry. If you need more power, try Audacity or Ardour.

Sonic Visualiser (pkg. name: sonic-visualiser): An excellent tool to analyze audio in different ways.

resources