De-borking MP3s on Linux

Tux with headphones

DISCLAIMER: This is a (heavy) work in progress so if you follow this guide you should do so using test files. Furthermore, i am not proficient in writing bash scripts, so until more polish is applied, you should also be competent enough to review the code. Lastly, if anything explodes, it’s your fault, though having said that i am developing and using these scripts on my daily driver box.

TODO:
* fix line feeds in code boxes – all line feeds are removed when pasting code
* add processing.txt file
* create a zip archive of all the little things we use here

Yes, “de-borking” is a word you grammar Nazi! It’s right here in the “official” dictionary :)

Apparently i’m an oddball because my MP3s contain only the artist, title and genre tags (and i only use 3 genres), along with the other required stuff and absolutely nothing more… well, except for the actual music. I’m not interested in embedding cover art, lyrics, bagel recipes or whatever else people are sticking in their music these days. I like to aim for minimal, compact, error-free files and that’s why we’re all gathered here today in this particularly lovely place.

In an ideal world of unlimited availability and smartphones with unlimited storage, i’d never, ever mess with MP3s, preferring instead one of the lossless formats such as FLAC, however some songs are difficult or impossible to find in a lossless format and storage space is still at a premium on our fondle slabs, so there’s that.

Typically i pirate download specific songs from various artists rather than entire albums, mainly because i never bumped into an album for which i liked every song. The problem with this is that files are often improperly tagged, have errors, are padded with too much silence, and the volume isn’t consistent from song to song. The worst is when some dummy thinks that upsampling a 64 kbps MP3 to 320 kbps is somehow a good idea. With the exception of the latter, all of these problems, and more, are generally fixable using a few software tools.

Following is my personal convoluted method for batch processing MP3s using software that’s available for Linux without any goofy dependencies (Wine, Mono, Java, etc.). Even if your tastes differ you may find that you can hack my preferences to fit your own (and if you come up with better ideas, let me know ’cause i don’t claim to be the sharpest diode in the technical drawer).

All of the editing performed here is non-destructive, meaning nothing is re-encoded and so there’s no loss of sound quality, other than for files that i convert from FLAC, OGG, etc., to MP3.

Having used Windows for entirely too long (‘too long’ being more than 10 minutes), i had curated several decent editing and error correction tools over the years which served my innermost desires quite well. It took some time to curate a similar tool chain for use with Linux but i’m quite tickled with the result. All of what follows was tested on Manjaro Linux (that’s Arch for us dummies) but is surely adaptable to other flavors of Linux.

Process overview

  1. convert non-MP3s to MP3s
  2. begin repairs and start weeding out the junk
  3. normalize volume (gain)
  4. analyze frequency spectrum
  5. trim silence from beginning and end
  6. tagging, file renaming
  7. more error checking and repair, cleaning, compacting
  8. final integrity check

Software

With the possible exception of MP3 Diags, the remainder of the software i use can be replaced with your preferred tools. Check your package manager to see what’s available in your repo.

  • FFmpeg (pkg. name: ffmpeg): A very comprehensive console utility for performing all kinds of actions on audio and video files. Here we will use it to convert non-MP3s to MP3s and do a frequency spectrum analysis to check their quality. Possible alternatives: soundKonverter (GUI), several others.
  • MP3 Diags (pkg. name: mp3diags): A comprehensive MP3 diagnostic, repair, tagging and file naming utility with a GUI. You probably want the “unstable” version which is the newest. Possible alternatives: There’s no other tool of decent quality that i’m aware of that can replace MP3 Diags.
  • MP3Gain (pkg. name: mp3gain): Console utility to normalize the volume of MP3s. Possible alternatives: Replay Gain.
  • MP3Info (pkg. name: mp3info): Console utility that provides information about MP3s.
  • Mp3Splt (pkg. name: mp3splt): Console utility for splitting MP3s that will be used to remove silence at the beginning and end of songs. Possible alternatives: Silence removal feature/plugin for your music player. Other than that there are no utilities that i’m aware of that can trim silence without re-encoding.
  • MP3val (pkg. name: mp3val): Console utility that quickly and quietly corrects many errors in MP3s.
  • puddletag (pkg. name: puddletag): Used to edit MP3 metadata and more, puddletag is a powerful and excellent drop-in replacement for the Windows program, Mp3tag. Possible alternatives: Quod Libet (GUI), Picard (GUI), EasyTAG (GUI), Kid3 (GUI), others.

If it’s OK with you, i’m going to use the package names from now on.
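Before going any further it may be worth confirming that the console tools are actually installed. The following is a minimal sketch and not part of the original tool set: missing_tools is a made-up helper name, and it assumes each command is named the same as its package, which may not hold on every distro (the MP3 Diags binary in particular may be named differently).

```shell
#!/bin/bash
# Print the name of every command given as an argument that is not on the PATH.
# missing_tools is a hypothetical helper; the command names passed to it are
# assumed to match the package names listed above.
missing_tools() {
    local t
    for t in "$@"; do
        command -v "$t" >/dev/null 2>&1 || echo "$t"
    done
}
```

Calling it as `missing_tools ffmpeg mp3gain mp3info mp3splt mp3val` prints nothing when all five are found, and otherwise lists the ones to install.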

Folder Structure

To keep things organized i create the following folder structure, since the order in which we run the tools is important. This is especially useful if you stop processing your music because you’ve had too many beers and want to resume the next morning afternoon. What i do is move the files to the next sequential folder as soon as the process has been completed in the current folder.

/00-incoming
/01-backup
/02-replace
/10-mp3me
/11-mp3fix
/12-mp3vol
/13-mp3spect
/14-mp3trim
/15-puddletag
/16-mp3diags
/17-mp3chk
/98-listen test
/99-distribute
/tools
/processing.txt

Notes regarding the folders:

  • the “incoming” folder is where you’ll dump all your new music files
  • the “replace” folder is for all files that didn’t survive the beating we subjected them to
  • the “extra” folder is just for storing extra copies of songs, such as original FLAC files that have been converted to MP3
  • the purpose of the remainder of the folders will become apparent shortly
  • the “processing.txt” file contains an abbreviated form of this wall of text

Notes regarding the folder structure:

  • 10-mp3me: ffmpeg is used here to convert non-MP3 files so they can be processed by the rest of the tools
  • 11-mp3fix: mp3val is run as early as possible to make initial repairs, correct tags used by mp3splt, and to determine which files warrant further processing
  • 16-mp3diags: mp3diags is run after puddletag to correct any remaining problems, remove unnecessary metadata and remove any extra space in ID3v2 tags that may have been left by puddletag
  • 17-mp3chk: mp3info is run as late as possible to do a final integrity check and because it depends on all unnecessary metadata having been removed by mp3diags
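The folder layout listed above can be created in one go rather than by hand. This is only a sketch: make_tree is a hypothetical helper name, the base directory is whatever you pass in, and it creates exactly the entries shown in the list and nothing else.

```shell
#!/bin/bash
# Create the working folders from the list above under a base directory.
# make_tree is a made-up helper name, not one of the article's scripts.
make_tree() {
    local base="${1:-.}"
    local d
    for d in 00-incoming 01-backup 02-replace 10-mp3me 11-mp3fix 12-mp3vol \
             13-mp3spect 14-mp3trim 15-puddletag 16-mp3diags 17-mp3chk \
             "98-listen test" 99-distribute tools; do
        mkdir -p "$base/$d" || return 1
    done
    # processing.txt is a plain file, so only create it if it is missing
    [[ -e "$base/processing.txt" ]] || : > "$base/processing.txt"
}
```

For example, `make_tree "$HOME/Music/working"` would build the tree under a "working" folder (that path is just an illustration). Running it a second time is harmless since existing folders and files are left alone.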

Tool Settings: mp3diags

Ignored notes tab: I add the ‘ea’ note to ignored notes since i don’t embed cover art in my MP3s.

Custom transformation lists tab: The only list i use is the first one. I remove all the items in the default list and add the following. Note that the order is important. If you add something in the wrong order and need to move it, click and drag on the number labels to the left of each item. For a description of these options see the MP3 Diags – Transformations – Details page.

  1. Discard invalid ID3V2 data
  2. Remove unsupported streams
  3. Remove unknown streams
  4. Remove inner non-audio
  5. Remove broken streams
  6. Remove truncated audio streams
  7. Remove null streams
  8. Remove non-basic ID3V2 frames
  9. Remove all ID3V1 streams
  10. Remove multiple ID3 streams
  11. Remove all APE streams
  12. Remove all Xing streams
  13. Remove all LAME streams
  14. Convert non-ASCII ID3V2 text frames to Unicode assuming codepage ISO-8859-1
  15. Restore flipped bit in audio
  16. Rebuild VBR data
  17. Remove extra space from ID3V2

Tool Settings: puddletag

You can use MP3 Diags for tagging and renaming files, however i find it clunky and limiting compared to puddletag. You can adjust settings here as necessary, but these are the more important ones that i use.

Tags -> ID3 Options:

  • Remove ID3v1 Tag
  • Write ID3v2.3 Tag

Patterns:

  • %artist% - %title%

Main window:

If you right click on the column headers where the songs are listed you can select what headers are displayed. I choose the following, several of which help me to determine whether i want to keep or replace a song (some of these you’ll have to add manually):

Filename | Artist | Title | Genre | Layer | Channels | Mode | Frequency | Bitrate | Length | Size

All other settings are optional and according to your preference.

STEP 1: mp3me.sh

If you have any non-MP3s, place them in the “mp3me” directory to be converted using ffmpeg.

Create the following script in the tools directory and make it executable. Don’t forget to edit the variables.

The -q:a 0 parameter will produce the highest quality VBR output that the libmp3lame encoder is capable of which averages about 245 kbps, the typical minimum being 220 kbps and the maximum being 260 kbps. If for some odd reason you want a constant bitrate, change -q:a 0 to -b:a 320k to produce a 320 kbps CBR file, but this is not recommended by the ffmpeg developers nor the folks at Hydrogenaud.io, both of which are widely considered authorities.
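The two quality choices described above can be kept in a bash array so that each flag and its value stay separate words without relying on word splitting. This is a sketch under assumptions: quality_args and qArgs are hypothetical names, and in.flac / out.mp3 in the comment are placeholders.

```shell
#!/bin/bash
# Fill the global array qArgs with the ffmpeg quality flags for a given mode.
# quality_args and qArgs are made-up names used for illustration only.
quality_args() {
    case "$1" in
        vbr) qArgs=(-q:a 0) ;;      # highest quality VBR (suggested)
        cbr) qArgs=(-b:a 320k) ;;   # 320 kbps CBR
        *)   echo "unknown mode: $1" >&2; return 1 ;;
    esac
}

# Example use (not run here):
#   quality_args vbr && ffmpeg -i in.flac -c:a libmp3lame "${qArgs[@]}" out.mp3
```

Switching between VBR and CBR then means changing one word rather than editing the flag string.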

mp3me.sh
#!/bin/bash
# ==============================================================================
#
# TODO
# ! figure out how to stop encoder and exit script when it's running - i think
# we need to capture Ctrl + C and send Q and Ctrl + C (Q stops the encoder) -
# although i don't know why the need to do so since we're passing 1 file at a
# time to ffmpeg
#
# name      : mp3me.sh
# version   : 1.0
# date      : 31-JAN-2020
# author    : 12bytes.org
# website   : https://12bytes.org/articles/tech/de-borking-mp3s-on-linux
# license   : do what the hell you want with it
# comment   : tested on Manjaro Linux / KDE
#
# This script uses the Linux package 'ffmpeg' to convert music files to MP3.
# Processed files are placed in the "output" subfolder of the folder containing
# the originals (the folder will be created if it does not exist).
#
# Run the script:
# $ ./mp3me.sh
#
# Save script output to a file. The file will be created, or overwritten if it
# exists.
# $ ./mp3me.sh | tee log.txt
#
# Append output to existing file. The file will be created if it does not exist.
# $ ./mp3me.sh | tee -a log.txt
#
# To forcefully stop the script press Ctrl + C
#
# ==============================================================================
#
# USER VARIABLES
# ------------------------------------------------------------------------------
# Edit these variables to suit your needs.

# working directory (no trailing slash)
# example:
# dir="/home//Music/working"
dir="../10-mp3me"
#
# file extension to process (case in-sensitive)
fExt="flac"
#
# log level: "quiet", "panic", "fatal", "error", "warning", "info", "verbose",
# "debug", "trace"
logLevel="info"
#
# quality: "q:a 0" = highest quality VBR (suggested), "b:a 320k" = highest
# quality CBR
quality="q:a 0"
#quality="b:a 320k"

# ------------------------------------------------------------------------------
# STOP EDITING

# disable case sensitivity
shopt -s nocaseglob

mkdir -p "$dir/output"

# see if we have any files to play with
iFiles=`ls "$dir" | grep -i '\.'"$fExt"'$' | wc -l`

if (( $iFiles < 1 )); then
    echo
    echo -e "\e[1;91mThere are no '$fExt' files to process in '$dir'.\e[0m"
    echo "Goodbye!"
    exit 1
else
    echo
    echo "Files to process: $iFiles"
    echo
    echo "Settings: directory  : $dir"
    echo "          files ext. : $fExt"
    echo "          log level  : $logLevel"
    echo "          quality    : $quality"
    echo
    read -p "Shall we begin? (y/n)" -n 1 -r
    echo
    if [[ $REPLY =~ ^[Nn]$ ]]; then
        echo "Goodbye!"
        exit 1
    fi
fi

iCount=0
iStartTime=$SECONDS

# loop over the extension set in fExt (quoted so paths with spaces work)
for files in "$dir"/*."$fExt"; do
    (( iCount += 1 ))
    fname="$(basename "$files")"
    echo
    echo -e "\e[1mProcessing: $files ...\e[0m"
    ffmpeg -hide_banner -loglevel "$logLevel" -y -i "$files" -vsync 0 -c:a libmp3lame -$quality "$dir/output/${fname%.*}.mp3"
    echo "Finished!"
done

iEndTime=$SECONDS
iElpTime=$(( $iEndTime - $iStartTime ))
echo "---------"
echo -e "\e[1;32mAll done!\e[0m"
echo "Total files  : $iCount"
echo "Time elapsed : $iElpTime seconds"

If you’re wondering why -vsync 0 is used for an audio file, it’s to avoid error messages when ffmpeg copies images from the FLACs to the MP3s which it processes as video.
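On a side note, mp3me.sh counts its input files by piping ls through grep and wc, which can miscount if a file name happens to contain a line feed. A glob-based count is one possible alternative. This is a sketch only: count_ext is a hypothetical helper name, and it runs in a subshell so the shopt settings don’t leak into the rest of a script.

```shell
#!/bin/bash
# Print how many entries in a directory end in the given extension, ignoring
# case. count_ext is a made-up helper name. The body is wrapped in ( ) so it
# runs in a subshell and the shopt changes stay local to it.
count_ext() (
    shopt -s nullglob nocaseglob
    found=("$1"/*."$2")
    echo "${#found[@]}"
)
```

With that in place the count line could become something like `iFiles=$(count_ext "$dir" "$fExt")`.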

To run the script, cd to the “tools” directory in a terminal and run:

$ ./mp3me.sh

When i run the script, ffmpeg produces the following on occasion:

[NULL @ 0x5630fa82a8c0] sample/frame number mismatch in adjacent frames

I’m not sure what that means, but it seems it can be ignored.

The script will create the /10-mp3me/output directory and dump the MP3s in it. You can then move them to the next directory and delete the original FLACs or move them to the “extra” folder if you want to keep them, just in case.

STEP 2: mp3fix

This script uses the Linux package mp3val to batch process MP3 files and check for and repair any problems it can.

Create the following script in the tools directory and make it executable. Don’t forget to edit the variables.

mp3fix.sh
#!/bin/bash
# ==============================================================================
#
# name      : mp3fix.sh
# version   : 1.0
# date      : 1-FEB-2020
# author    : 12bytes.org
# website   : https://12bytes.org/articles/tech/de-borking-mp3s-on-linux
# license   : do what the hell you want with it
# comment   : tested on Manjaro Linux / KDE
#
# This script uses the Linux package 'mp3val' to batch process MP3 files and
# check for and repair any problems it can.
#
# Run the script:
# $ ./mp3fix.sh
#
# Save script output to a file. The file will be created, or overwritten if it
# exists.
# $ ./mp3fix.sh | tee log.txt
#
# Append output to existing file. The file will be created if it does not exist.
# $ ./mp3fix.sh | tee -a log.txt
#
# To forcefully stop the script press Ctrl + C
#
# ==============================================================================
#
# USER VARIABLES
# ------------------------------------------------------------------------------
# Edit these variables to suit your needs.

# working directory (no trailing slash) - example:
# dir="/home//Music/working/11-mp3fix"
dir="../11-mp3fix"

# ------------------------------------------------------------------------------
# STOP EDITING

# disable case sensitivity
shopt -s nocaseglob

pause=0

# see if we have any files to play with
iFiles=`ls "$dir" | grep -i '\.mp3$' | wc -l`
if (( $iFiles < 1 )); then
    echo
    echo -e "\e[1;91mThere are no MP3 files to process in '$dir'.\e[0m"
    echo "Goodbye!"
    exit 1
else
    echo
    echo "Files to process: $iFiles"
    echo
    echo "Settings: directory : $dir"
    echo
    read -p "Shall we begin? (y/n)" -n 1 -r
    echo
    if [[ $REPLY =~ ^[Nn]$ ]]; then
        echo "Goodbye!"
        exit 1
    fi
fi

if (( $iFiles > 1 )); then
    echo
    read -p "Pause after processing each file? (y/n)" -n 1 -r
    echo
    if [[ $REPLY =~ ^[Yy]$ ]]; then
        pause=1
    fi
fi

iCount=0
iStartTime=$SECONDS

for files in "$dir"/*.mp3; do
    (( iCount += 1 ))
    fname="$(basename "$files")"
    echo
    echo -e "\e[1mProcessing: $files ...\e[0m"
    mp3val -f "$files"

    if (( $iCount < $iFiles )) && (( $pause == 1 )); then
        echo
        read -p "Press 'y' to continue or any other key to quit." -n 1 -r
        echo
        if [[ ! $REPLY =~ ^[Yy]$ ]]; then
            echo "-------------"
            echo -e "\e[93mUser aborted.\e[0m"
            echo "Total files processed: $iCount"
            exit 1
        fi
    elif (( $iCount == $iFiles )); then
        iEndTime=$SECONDS
        iElpTime=$(( $iEndTime - $iStartTime ))
        echo "---------"
        echo -e "\e[1;32mAll done!\e[0m"
        echo "Total files  : $iCount"
        echo "Time elapsed : $iElpTime seconds"
        echo
    fi
done

To run the script, cd to the “tools” directory in a terminal and run:

$ ./mp3fix.sh

STEP 3: mp3vol

This script uses the Linux package mp3gain to batch process MP3 files and adjust their volume (gain). You can opt to be asked if you want to keep each file after it’s processed and if you choose to discard it, it will be moved to a “junk” subfolder of the folder containing the MP3s.

You can run mp3gain from mp3diags, however it needs to be run before the mp3spect.sh script, plus we’ll run it with a little more flexibility than is available from mp3diags.

Create the following script in the tools directory and make it executable. Don’t forget to edit the variables.

mp3vol.sh
#!/bin/bash
#
# ==============================================================================
#
# name          : mp3vol.sh
# version       : 1.0
# date          : 1-FEB-2020
# dependencies  : mp3gain
# author        : 12bytes.org
# website       : https://12bytes.org/articles/tech/de-borking-mp3s-on-linux
# license       : do what the hell you want with it
# comment       : tested on Manjaro Linux / KDE
#
# This script uses the Linux package 'mp3gain' to batch process MP3 files and
# adjust their volume (gain). You can opt to be asked if you want to keep each
# file after it's processed and if you choose to discard it, it will be moved
# to a "junk" subfolder of the folder containing the MP3s.
#
# Run the script:
# $ ./mp3vol.sh
#
# Save script output to a log file in the same folder as the script. The file
# will be created, or overwritten if it exists.
# $ ./mp3vol.sh | tee mp3vol_log.txt
#
# Append output to existing file in the same folder as the script. The file will
# be created if it does not exist.
# $ ./mp3vol.sh | tee -a mp3vol_log.txt
#
# To forcefully stop the script press Ctrl + C
#
# ==============================================================================

# USER VARIABLES
# ------------------------------------------------------------------------------
# Edit these variables to suit your needs.

# folder containing the MP3s (no trailing slash). by default the location of the
# script is expected to be in a folder along side a '12-mp3vol' folder which
# contains the MP3s. example:
# sDir="/home//Music/working/12-mp3vol"
sDir="../12-mp3vol"    # def="../12-mp3vol"

# options to pass to mp3gain. explanation of the default options:
# -e - skip Album analysis, even if multiple files listed
# -r - apply Track gain automatically (all files set to equal loudness)
# -s r - force re-calculation (do not read tag info)
# -s i - use ID3v2 tag for MP3 gain info
sOpts="-e -r -s r -s i"    # def="-e -r -s r -s i"

# ------------------------------------------------------------------------------
# STOP EDITING

iCount=0
iPause=1

# disable case sensitivity
shopt -s nocaseglob

# set our working dir
cd "$sDir"
if [[ $? != 0 ]]; then
    echo
    echo -e "\e[1;91mFailed to change directory to '$sDir'.\e[0m"
    echo "Please recheck the path."
    echo "Exiting..."
    echo
    exit
fi

# see if we have any files to play with
iFiles=`ls | grep -i '\.mp3$' | wc -l`

if (( $iFiles < 1 )); then
    echo
    echo -e "\e[1;91mThere are no MP3 files to process in '$sDir'.\e[0m"
    echo "Goodbye!"
    echo
    exit
else
    echo
    echo "Files to process: $iFiles"
    echo
    echo "Settings"
    echo "--------"
    echo "directory       : $sDir"
    echo "mp3gain options : $sOpts"
    echo
    read -p "Ask if you want to keep each file? (Y/n)" -n 1 -r
    echo
    if [[ $REPLY =~ ^[Nn]$ ]]; then iPause=0; fi
fi

# make junk dir
mkdir -p "junk"
if [[ $? != 0 ]]; then
    echo
    echo -e "\e[1;91mFailed to create directory '$sDir/junk'.\e[0m"
    echo "Exiting..."
    echo
    exit
fi

# main loop
iStartTime=$SECONDS

for files in *.mp3; do
    (( iCount += 1 ))
    echo
    echo -e "\e[1mProcessing: $files ...\e[0m"
    mp3gain $sOpts "$files"

    if (( $iPause == 1 )); then
        echo
        read -p "Keep this file? (Y/n)" -n 1 -r
        echo
        if [[ $REPLY =~ ^[Nn]$ ]]; then
            mv "$files" "junk/"
        fi
    fi
done

echo "Total files  : $iCount"
iElpTime=$(( SECONDS - iStartTime ))
echo "Time elapsed : $iElpTime sec."
echo

exit

To run the script, cd to the “tools” directory in a terminal and run:

$ ./mp3vol.sh

STEP 4: mp3spect

Next we’ll do a frequency spectrum analysis of the MP3s to get a half-baked idea of their quality. If you download various songs as singles or from different albums as i do, this step is probably more important than if you download only good quality albums or rip your own music.

Analyzing the frequency spectrum is especially useful to determine whether some bone head took a crappy quality music file and re-encoded it at a higher bitrate which does nothing other than waste disk space. These songs you’ll want to replace right after you hunt down the punk responsible and flood his Facebook page with derogatory memes regarding his mother.

Create the following script in the “tools” directory and make it executable. Don’t forget to edit the variables.

mp3spect.sh
#!/bin/bash
# ==============================================================================
#
# name      : mp3spect.sh
# version   : 1.0
# date      : 31-JAN-2020
# author    : 12bytes.org
# website   : https://12bytes.org/articles/tech/de-borking-mp3s-on-linux
# license   : do what the hell you want with it
# comment   : tested on Manjaro Linux / KDE
#
# This script uses the Linux package 'ffmpeg' to create frequency spectrum PNG
# images of MP3s. Processed files are placed in the "output" subfolder of the
# folder containing the MP3s (the folder will be created if it does not exist).
#
# Run the script:
# $ ./mp3spect.sh
#
# Save script output to a file. The file will be created, or overwritten if it
# exists.
# $ ./mp3spect.sh | tee log.txt
#
# Append output to existing file. The file will be created if it does not exist.
# $ ./mp3spect.sh | tee -a log.txt
#
# To forcefully stop the script press Ctrl + C
#
# ==============================================================================
#
# USER VARIABLES
# ------------------------------------------------------------------------------
# Edit these variables to suit your needs.

# working directory (no trailing slash) - example:
# dir="/home//Music/working/13-mp3spect"
dir="../13-mp3spect"
#
# log level: "quiet", "panic", "fatal", "error", "warning", "info", "verbose",
# "debug", "trace"
logLevel="info"
#
# images size. the ratio should not be changed, so if for example you double the
# width then the height should also be doubled ("WxH")
size="512x256"

# ------------------------------------------------------------------------------
# STOP EDITING

# TODO
# check disk space?

# disable case sensitivity
shopt -s nocaseglob

mkdir -p "$dir/output"

# see if we have any files to play with
iFiles=`ls "$dir" | grep -i '\.mp3$' | wc -l`
if (( $iFiles < 1 )); then
    echo
    echo -e "\e[1;91mThere are no MP3 files to process in '$dir'.\e[0m"
    echo "Goodbye!"
    exit 1
else
    echo
    echo "Files to process: $iFiles"
    echo
    echo "Settings: directory  : $dir"
    echo "          log level  : $logLevel"
    echo "          image size : $size"
    echo
    read -p "Shall we begin? (y/n)" -n 1 -r
    echo
    if [[ $REPLY =~ ^[Nn]$ ]]; then
        echo "Goodbye!"
        exit 1
    fi
fi

iCount=0
iStartTime=$SECONDS

for files in "$dir"/*.mp3; do
    (( iCount += 1 ))
    fname="$(basename "$files")"
    echo
    echo -e "\e[1mProcessing: $files ...\e[0m"
    ffmpeg -hide_banner -loglevel "$logLevel" -y -i "$files" -lavfi showspectrumpic=s="$size":color=rainbow:gain=1 "$dir/output/${fname%.*}.png"
    echo "Finished!"
done

iEndTime=$SECONDS
iElpTime=$(( $iEndTime - $iStartTime ))
echo "---------"
echo -e "\e[1;32mAll done!\e[0m"
echo "Total files  : $iCount"
echo "Time elapsed : $iElpTime seconds"

To run the script, cd to the “tools” directory in a terminal and run:

$ ./mp3spect.sh

The script will create PNG images of the frequency spectrum of each MP3 and dump them in the /13-mp3spect/output directory. One of the things you want to look at in each image is the frequency cutoff point, meaning the highest frequency the audio reaches. Another issue to look for is clipping. Pictures are worth words so let’s do that.

Here’s what the frequency spectrum looks like for a random FLAC music file that indicated a bitrate of 872 kbps. The keen observer you are, you’ll notice that the audio frequency tops out close to 21000 Hz, or 21 kHz, which is slightly higher than the maximum frequency the average human is capable of hearing (20 kHz). The file size here is 26,924,696 bytes. OK, the numbers on the scales are really tiny, but trust me on this one. Have i ever lied to you, even once?

Frequency spectrum for FLAC sample 872kbps

Next i converted the 872 kbps FLAC file to a 320 kbps MP3. The file size dropped to 9,882,876 bytes, less than half of what it was, and we didn’t lose much sound quality. You’ll notice the frequency now tops out around 20000 Hz, or 20 kHz, so we lost something there, but not much, and we saved a heck of a lot of disk space which just might be important if you’re copying your music collection to your fondle slabs or other devices with limited storage.

Frequency spectrum for MP3 sample 320kbps

Next i converted the 872 kbps FLAC to a 128 kbps MP3. This time the quality loss is significant to the point where most people would easily notice it when played using a decent sound system. Here the frequency tops out around 16000 Hz, or 16 kHz, and the file size is 3,953,289 bytes, again less than half of what it was.

Frequency spectrum for MP3 sample 128kbps

Now i ‘did as stoopid does’; i took the 128 kbps MP3 and converted it to a 320 kbps MP3 because “more quality”, however the only thing i really accomplished was to fatten the file size which is now 9,882,876 bytes, exactly the same size as the 320 kbps MP3 that was converted from FLAC earlier. As with the 128 kbps MP3, the frequency again tops out around 16000 Hz, or 16 kHz, however the file size has more than doubled and there is zero improvement in quality.

Frequency spectrum for MP3 sample 320kbps
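To put numbers on the size comparisons above, here is the arithmetic using the byte counts quoted in the text. It’s a small sketch: pct is a hypothetical helper name and it uses bash integer division, so results are rounded down to a whole percent.

```shell
#!/bin/bash
# Print the new size as a whole-number percentage of the old size (integer
# division, so the result is rounded down). pct is a made-up helper name.
pct() {
    local new="$1" old="$2"
    echo $(( new * 100 / old ))
}

pct 9882876 26924696   # 320 kbps MP3 vs. the FLAC             -> 36
pct 3953289 9882876    # 128 kbps MP3 vs. the 320 kbps MP3     -> 40
pct 9882876 3953289    # re-encoded 320 kbps vs. 128 kbps MP3  -> 249
```

So the 320 kbps MP3 is roughly a third the size of the FLAC, and re-encoding the 128 kbps file at 320 kbps makes it about two and a half times bigger for no gain in quality.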

The other thing to look for is MP3s where the gain (volume) has been raised so high that clipping occurs which can produce horrible audible distortion. Here is a frequency spectrum image of a song, Acoustic Alchemy – Clear Air For Miles.mp3 (320 kbps CBR 44.1 kHz):

Frequency spectrum: Acoustic Alchemy - Clear Air For Miles MP3

Using mp3gain i then upped the gain by 10 dB and here’s the result:

Frequency spectrum: Acoustic Alchemy – Clear Air For Miles MP3 (clipping)

What you’ll notice is that there’s a lot of green and darker colored blue stuff that reaches the upper frequency cutoff point and abruptly stops and this can result in distortion of the sound, however unless the spectrum looks really, really bad, don’t worry too much because what matters most is what you hear, not the fancy colors on an image.

So how do you use this information? Well, this is where it gets a little tricky because a song that doesn’t approach the upper frequency of what we humans can hear is not necessarily an indication of lousy quality. Think of a super high quality lossless recording of a synthesizer playing only low notes with no other instruments or vocals. Although the quality of the recording is superb, the frequency wouldn’t approach 20 kHz, however given the wide variety of sounds in most Rock, Pop and other genres of music, such songs will often approach 20 kHz or slightly higher even after they’ve been converted to MP3.

Over time you’ll develop a sense of what the frequency spectrum should look like given the bitrate and how the song sounds. Since you’re going through the trouble of reading this wall of text, and i went through the trouble of writing it, my guess is that we two are fairly particular about our music and therefore desire either MP3s encoded at the highest quality VBR or 320 kbps CBR in which case you may want to replace those that have higher pitch sounds but don’t approach at least 20000 Hz, or 20 kHz.

After the mp3spect.sh script generates the images, i use whatever image viewer i happen to have to view large thumbnails of all the PNGs in the “output” folder.

Done? Good. Move the files to the next folder.
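Moving files along to the next folder is something you’ll do after every step, so it can be scripted. The following is a sketch only: move_mp3s is a hypothetical helper name, and it moves just the MP3s (any letter case), leaving other files such as logs behind.

```shell
#!/bin/bash
# Move every MP3 (any letter case) from one folder to another, creating the
# destination if needed. move_mp3s is a made-up helper name. The body runs in
# a subshell ( ) so the shopt changes do not affect the calling script.
move_mp3s() (
    shopt -s nullglob nocaseglob
    src="$1"
    dst="$2"
    files=("$src"/*.mp3)
    # nothing to move is not an error
    (( ${#files[@]} > 0 )) || exit 0
    mkdir -p "$dst" && mv -- "${files[@]}" "$dst/"
)
```

From the “tools” directory, and going by the folder list near the top of this page, this step would be `move_mp3s ../13-mp3spect ../14-mp3trim`.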

STEP 5: mp3trim

This script will use mp3splt to trim the silence from the beginning and end of the files. Personally i prefer to have about 1 second of silence on either end, but i haven’t yet found any (easy) way to do this with a Linux program without re-encoding. Before you run the script however, listen to the songs to see if they need to be trimmed because mp3splt will sometimes remove a tiny slice of audible sound from a song even if it doesn’t need to be trimmed.

Sometimes when using mp3splt you might see the error, “split points are equal”. I don’t know what this means or why it occurs (good luck trying to find out), but when it does you might want to dump the file that produced the error in your “replace” folder, move the already processed files to the next folder, then run mp3splt again on the remaining files.
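The “dump the offending file in the replace folder” routine could be automated along these lines. This is a sketch under a big assumption: it relies on the tool reporting failure through a non-zero exit status, and whether mp3splt does that for the “split points are equal” case is something i have not verified. run_or_replace is a hypothetical helper name.

```shell
#!/bin/bash
# Usage: run_or_replace <replace-dir> <file> <command> [args...]
# Runs <command> [args...] with <file> appended as the last argument. If the
# command exits non-zero, the file is moved to <replace-dir> and 1 is returned.
# run_or_replace is a made-up helper name; it assumes the command signals
# failure through its exit status.
run_or_replace() {
    local replaceDir="$1" file="$2"
    shift 2
    if "$@" "$file"; then
        return 0
    fi
    mkdir -p "$replaceDir" && mv -- "$file" "$replaceDir/"
    return 1
}
```

Inside a loop over the MP3s it might be called as `run_or_replace ../02-replace "$files" mp3splt -q -f -r -p th=-50 -d output -o @d`, using the same mp3splt options as the script that follows.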

Create the following script in the “tools” directory and make it executable. Don’t forget to edit the variables.

mp3trim.sh
#!/bin/bash
# ==============================================================================
#
# name      : mp3trim.sh
# version   : 1.0
# date      : 1-FEB-2020
# author    : 12bytes.org
# website   : https://12bytes.org/articles/tech/de-borking-mp3s-on-linux
# license   : do what the hell you want with it
# comment   : tested on Manjaro Linux / KDE
#
# This script uses the Linux package 'mp3splt' to batch process MP3 files and
# remove any silence from the beginning and end of the track. Processed files
# are placed in the "output" subfolder of the folder containing the originals
# (the folder will be created if it does not exist).
#
# Run the script:
# $ ./mp3trim.sh
#
# Save script output to a file. The file will be created, or overwritten if it
# exists.
# $ ./mp3trim.sh | tee log.txt
#
# Append output to existing file. The file will be created if it does not exist.
# $ ./mp3trim.sh | tee -a log.txt
#
# To forcefully stop the script press Ctrl + C
#
# ==============================================================================
#
# USER VARIABLES
# ------------------------------------------------------------------------------
# Edit these variables to suit your needs.

# working directory (no trailing slash) - example:
# sDir="/home//Music/working/14-mp3trim"
sDir="../14-mp3trim"

# The silence threshold in decibels expected for silence detection.
iTh=-50    # def=-48 (db) (integer)

# ------------------------------------------------------------------------------
# STOP EDITING

iCount=0
iKBSaved=0
iStartTime=$SECONDS

# disable case sensitivity
shopt -s nocaseglob

# set our working dir
cd "$sDir"
if [[ $? != 0 ]]; then
    echo
    echo -e "\e[1;91mFailed to change directory to '$sDir'.\e[0m"
    echo "Please recheck the path."
    echo "Exiting..."
    echo
    exit
fi

# make output dir
mkdir -p "output"
if [[ $? != 0 ]]; then
    echo
    echo -e "\e[1;91mFailed to create directory '$sDir/output'.\e[0m"
    echo "Exiting..."
    echo
    exit
fi

# see if we have any files to play with
iFiles=`ls | grep -i '\.mp3$' | wc -l`

if (( $iFiles < 1 )); then
    echo
    echo -e "\e[1;91mThere are no MP3 files to process in '$sDir'.\e[0m"
    echo "Goodbye!"
    echo
    exit
else
    echo
    echo "Files to process: $iFiles"
    echo
    echo "Settings"
    echo "--------"
    echo "directory         : $sDir"
    echo "silence threshold : $iTh db"
    echo
    read -p "Shall we begin? (Y/n)" -n 1 -r
    if [[ $REPLY =~ ^[Nn]$ ]]; then
        echo
        echo "Goodbye!"
        echo
        exit
    fi
fi

# pause after each file? ('pause' is the variable tested in the main loop)
pause=0
if (( $iFiles > 1 )); then
    echo
    read -p "Pause after processing each file? (y/N)" -n 1 -r
    echo
    if [[ $REPLY =~ ^[Yy]$ ]]; then pause=1; fi
fi

for files in *.mp3; do
    (( iCount += 1 ))
    #sFileName="$( basename "$files" )"
    iFileSize="$( stat -c %s "$files" )"
    #mp3splt -q -f -r -p th=$iTh -d "output" "$files"
    mp3splt -q -f -r -p th=$iTh -d "output" -o @d "$files"
    # mp3splt adds an extra .mp3 extension, so...
    mv "output/$files.mp3" "output/$files"
    iNewSize="$( stat -c %s "output/$files" )"
    echo "Original size = $iFileSize bytes"
    echo "New size      = $iNewSize bytes"
    iDiff="$( bc <<< "$iFileSize - $iNewSize" )"
    echo "Difference    = $iDiff bytes"
    (( iKBSaved += iDiff ))


    if (( $iCount < $iFiles )) && (( ${iPause:-1} == 1 )); then
        echo
        read -p "Press 'y' to continue or any other key to quit." -n 1 -r
        echo
        if [[ ! $REPLY =~ ^[Yy]$ ]]; then
            echo "-------------"
            echo -e "\e[93mUser aborted.\e[0m"
            echo "Total files processed: $iCount"
            exit 1
        fi
    elif (( $iCount == $iFiles )); then
        iEndTime=$SECONDS
        iElpTime=$(( $iEndTime - $iStartTime ))
        echo "---------"
        echo -e "\e[1;32mAll done!\e[0m"
        echo "Total files       : $iCount"
        echo "Total space saved : $iKBSaved bytes"
        echo "Time elapsed      : $iElpTime seconds"
        echo
    fi
done

To run the script, cd to the “tools” directory in a terminal and run:

$ ./mp3trim.sh

After the script finishes, the original files can be deleted if you’re happy with the results and you can move the trimmed MP3s to the next directory.
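
Before deleting the originals it may be worth confirming that each one has a non-empty counterpart in the "output" folder. The following is only a sketch of one way to do that. The function name check_trimmed is made up for illustration and is not part of mp3trim.sh, and it assumes the layout described above (originals in one folder, trimmed files in its "output" subfolder).

```bash
# check_trimmed DIR
# Hypothetical helper (not part of mp3trim.sh): for every MP3 in DIR, report
# whether a non-empty file of the same name exists in DIR/output.
# Returns 1 if any counterpart is missing or empty, otherwise 0.
check_trimmed() {
    local sChkDir sFile sName iMissing
    sChkDir="$1"
    iMissing=0
    for sFile in "$sChkDir"/*.[mM][pP]3; do
        [ -e "$sFile" ] || continue    # the glob matched nothing
        sName="${sFile##*/}"           # strip the leading path
        if [ -s "$sChkDir/output/$sName" ]; then
            echo "OK      : $sName"
        else
            echo "MISSING : $sName"
            iMissing=1
        fi
    done
    return "$iMissing"
}

# example: check_trimmed "../12-mp3trim"
```

If anything is reported as MISSING, keep the originals until you know why.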

STEP 6: puddletag

If you configured puddletag the way i suggested it will display the Layer, Channels, Mode, Frequency, Bitrate, Length and Size columns. By clicking a column header you can sort the files by that column and check that the layer, channels, frequency and bitrate look OK. In the Layer column you don’t want to see anything other than MPEG-1 Layer III. The Channels column should be all “2”. The Mode column should contain some sort of stereo identifier, whether it be “Stereo”, “Joint-Stereo” or some other kind of stereo. The Frequency column should contain a minimum of 32 kHz and a maximum of 48 kHz.

Next i remove and rebuild the metatags. Often i download specific music tracks from an album where the artist name is missing from the file name. In this case it is most important to set a proper file name before you remove the tags. The first thing i do is check whether the artist and title tags need any work, and if they do i perform actions or functions on them, or edit them manually if i have to. Once the artist and title tags look good i select all the songs in the list and click the ‘Tag->File’ icon on the toolbar to rename all the files with the pattern %artist% - %title%. After that i delete all the tags, both ID3 and APE, then refresh the view (F5), then click the ‘File->Tag’ button to rewrite the artist and title tags according to the file name. Lastly i add the genre tags which in my case is one of “hard”, “soft” or “ambient”. I used to have “medium” in there too but that led to more intense decision making than my little noggin was capable of.
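
To illustrate what rebuilding tags from a file name with the pattern %artist% - %title% amounts to, here is a rough shell sketch. It is not how puddletag does it internally. The function name split_name is invented for this example, and it assumes that the first " - " in the file name separates the artist from the title.

```bash
# split_name FILE
# Hypothetical illustration only: prints the artist and title implied by a
# file name of the form "Artist - Title.mp3".
split_name() {
    local sBase sArtist sTitle
    sBase="${1##*/}"           # strip any leading path
    sBase="${sBase%.*}"        # strip the extension
    case "$sBase" in
        *" - "*) ;;
        *) echo "no ' - ' separator in: $sBase" >&2; return 1 ;;
    esac
    sArtist="${sBase%% - *}"   # text before the first " - "
    sTitle="${sBase#* - }"     # text after the first " - "
    echo "artist=$sArtist"
    echo "title=$sTitle"
}

split_name "Some Artist - Some Title.mp3"
```

A title that itself contains " - " stays intact, since only the first separator is used to split.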

STEP 7: mp3diags

My motto regarding mp3diags is “beat the hell out of the files and replace any that bust” which i suspect is a substantially different ethic than that held by the smart fella who invented it. If, however, you’re on board with my radical non-conformist logic, then hold your breath and click the icon to apply “custom transformation list #1” and see if anything explodes. If there are any issues, correct whatever problems you can and move the files that have unfixable errors to your “replace” folder.

STEP 8: mp3chk

Here we’ll perform a bunch of tests on each MP3 using another bash script. Among the things the script will be looking at are the bitrate, file size, duration, frames, MPEG layer and version, channels, and frequency. Ultimately the script will use the bitrate and the length of the file to determine whether the size of the file corresponds to what it estimates it should be. If any of the tests fail, it will move those files to a “junk” directory within the directory containing the MP3s. These files you can then move to your “replace” directory… or you could try invoking the Mothman to see if he can help, but i haven’t had much luck. Then again, i was too damned scared to actually try.

The bitrate of an MP3 file is directly related to its size. In the table below we find that a 320 kbps MP3 will consume 40 KB of disk space per second, or 2.4 MB per minute, so if you have a song that’s 3 minutes in length, the file size should be 7200 KB (40 KB per second x 180 seconds) or 7.2 MB (2.4 MB per minute x 3 min.).

Bitrate     File size / sec.   File size / min.
192 kbps    24 KB              1.44 MB
256 kbps    32 KB              1.92 MB
320 kbps    40 KB              2.40 MB
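
The arithmetic behind the table can be sketched as a tiny shell function. The name est_size_kb is made up for this example, and it assumes a constant bitrate, no tags, and the table's convention that 1 KB is 1000 bytes.

```bash
# est_size_kb BITRATE_KBPS SECONDS
# Hypothetical helper: the bitrate in kilobits per second divided by 8 gives
# kilobytes per second, which is multiplied by the duration in seconds.
est_size_kb() {
    echo $(( $1 * $2 / 8 ))
}

est_size_kb 320 180    # a 3 minute song at 320 kbps, prints 7200
est_size_kb 192 60     # 1 minute at 192 kbps, prints 1440
```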

Other variables affect the actual file size, however, one of them being the metadata embedded in the MP3. In my case i only write one ID3v2 tag containing the artist name, song title and genre, so the error is minimal, which is what the mp3chk.sh script expects by default. If you start embedding images and lyrics, the file will obviously be fatter.
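
A size check that allows some slack for metadata might look roughly like the sketch below. It is a simplified illustration and not the code from mp3chk.sh. The function name and argument order are invented, and the margins of 50 KB and 15 KB in the examples mirror that script's defaults.

```bash
# size_verdict ACTUAL_KB EST_KB BIG_MARGIN_KB SMALL_MARGIN_KB
# Hypothetical helper: prints "ok", "too large" or "too small" depending on
# whether the actual size falls inside the window around the estimate.
size_verdict() {
    if [ "$1" -gt $(( $2 + $3 )) ]; then
        echo "too large"
    elif [ "$1" -lt $(( $2 - $4 )) ]; then
        echo "too small"
    else
        echo "ok"
    fi
}

size_verdict 7230 7200 50 15    # a few small tags: ok
size_verdict 7900 7200 50 15    # e.g. embedded cover art: too large
size_verdict 7100 7200 50 15    # e.g. missing audio: too small
```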

If you sourced your MP3 collection indiscriminately (like i do) then it is quite possible that this script will find potential problems with many or all of the songs in your collection when the default settings are used. If however you ripped your collection yourself or acquired it from quality sources, and only a minimum of metadata (artist, title, genre …) is embedded, then the script should find few if any problems with your MP3s with the default settings.

In the “tools” directory, create the following bash script and make it executable.

mp3chk.sh
#!/bin/bash
#
# ==============================================================================
#
# name          : mp3chk.sh
# version       : 1.0
# date          : 1-FEB-2020
# dependencies  : mp3info (check your repo)
# author        : 12bytes.org
# website       : https://12bytes.org/articles/tech/de-borking-mp3s-on-linux
# license       : do what the hell you want with it
# comment       : tested on Manjaro Linux / KDE
#
# This script uses the Linux package 'mp3info' to batch process MP3 files and
# check for potential problems in which case the file will be moved to a "junk"
# subfolder of the folder containing the MP3s.
#
# If you sourced your MP3 collection indiscriminately then it is quite possible
# that this script will find potential problems with many or all of the songs in
# your collection when the default settings are used. If however you ripped your
# collection yourself or acquired it from quality sources, and only a minimum of
# metadata (artist, title, genre ...) is embedded, then the script should find
# few if any problems with your MP3s with the default settings.
#
# Run the script:
# $ ./mp3chk.sh
#
# Save script output to a log file in the same folder as the script. The file
# will be created, or overwritten if it exists.
# $ ./mp3chk.sh | tee mp3chk_log.txt
#
# Append output to existing file in the same folder as the script. The file will
# be created if it does not exist.
# $ ./mp3chk.sh | tee -a mp3chk_log.txt
#
# To forcefully stop the script press Ctrl + C
#
# ==============================================================================

# USER VARIABLES
# ------------------------------------------------------------------------------
# Edit these variables to suit your needs.

# toggle test mode on or off. when "on" the script will operate in exactly the
# same way except that no files will be moved to the "junk" folder. this is
# useful to see what files will be classified as junk given your settings.
sTestMode="off"    # def="on" ("on"/"off")

# folder containing the MP3s (no trailing slash). by default the location of the
# script is expected to be in a folder along side a '15-mp3chk' folder which
# contains the MP3s. example:
# sDir="/home//Music/working/15-mp3chk"
sDir="../15-mp3chk"    # def="../15-mp3chk"

# minimum acceptable average bitrate.
iMinBr=220    # def=220 (kb/s) (integer)

# minimum acceptable sample rate.
iMinFreq=44    # def=44 (KHz) (integer)

# minimum acceptable song length. this is perhaps useful for discarding
# truncated songs when all songs are expected to exceed a minimum length.
iMinLen=120    # def=120 (seconds) (integer)

# an MP3 may be larger than the predicted size due to the inclusion of extra
# metadata such as comment tags, cover art and lyrics. if you only write a few
# small tags such as artist, title and genre, then the default value should be
# OK, but you may need to increase it if you write cover art, etc.
iFileBgOffset=50    # def=50 (KB) (integer)

# sometimes the size of an MP3 is smaller than the predicted size in which case
# an allowance is applied. this could be the result of a damaged file or an
# incorrectly reported bitrate in which case it's best to keep the value small.
iFileSmOffset=15    # def=15 (KB) (integer)

# ------------------------------------------------------------------------------
# STOP EDITING

iCount=0
iGoodEggs=0
iBadEggs=0
iPause=1

# disable case sensitivity
shopt -s nocaseglob

# set our working dir
cd "$sDir"
if [[ $? != 0 ]]; then
    echo
    echo -e "\e[1;91mFailed to change directory to '$sDir'.\e[0m"
    echo "Please recheck the path."
    echo "Exiting..."
    echo
    exit
fi

# see if we have any files to play with
iFiles=`ls | grep -i '\.mp3$' | wc -l`

if (( $iFiles < 1 )); then
    echo
    echo -e "\e[1;91mThere are no MP3 files to process in '$sDir'.\e[0m"
    echo "Goodbye!"
    echo
    exit
else
    echo
    echo "Files to process: $iFiles"
    echo
    echo "Settings"
    echo "--------"
    echo -e "\e[93mtest mode         : $sTestMode\e[0m"
    echo "directory         : $sDir"
    echo "min. bitrate      : $iMinBr kb/s"
    echo "min. sample rate  : $iMinFreq KHz"
    echo "min. length       : $iMinLen sec."
    echo "big file margin   : $iFileBgOffset KB"
    echo "small file margin : $iFileSmOffset KB"
    echo
    # pause after each file?
    if (( $iFiles > 1 )); then
        read -p "Pause after processing each file? (y/N)" -n 1 -r
        echo
        if [[ ! $REPLY =~ ^[Yy]$ ]]; then iPause=0; fi
    fi
fi

# make junk dir
if [[ $sTestMode == "off" ]]; then
    mkdir -p "junk"
    if [[ $? != 0 ]]; then
        echo
        echo -e "\e[1;91mFailed to create directory '$sDir/junk'.\e[0m"
        echo "Exiting..."
        echo
        exit
    fi
fi

# main loop
iStartTime=$SECONDS

for files in *.mp3; do
    (( iCount += 1 ))
    iErr=0
    echo
    echo -e "\e[1mProcessing: $files ...\e[0m"
    nBitrate="$( mp3info -r a -p %r "$files" )"
    iFileSz="$( mp3info -p %k "$files" )"
    #iFileSz="$( stat -c %s "$files" )"
    #iFileSz="$( bc <<< "$iFileSz / 1024" )"
    iSec="$( mp3info -p %S "$files" )"
    iGoodFr="$( mp3info -p %u "$files" )"
    iBadFr="$( mp3info -p %b "$files" )"
    iFrames="$(( $iGoodFr + $iBadFr ))"
    sLayer="$( mp3info -p %L "$files" )"
    nVersion="$( mp3info -p %v "$files" )"
    sStereo="$( mp3info -p %o "$files" )"
    nFreq="$( mp3info -p %q "$files" )"

    echo "MPEG version ...... = $nVersion"
    echo "MPEG layer ........ = $sLayer"
    echo "stereo ............ = $sStereo"
    echo "sample rate ....... = $nFreq KHz"
    echo "average bitrate ... = $nBitrate kb/s"
    echo "frame count ....... = $iFrames"
    echo "length ............ = $iSec sec."
    # $nBitrate is in kilobits per second, so (bitrate * seconds) / 8 gives the
    # size in KB (1000 bytes), which we convert to KiB (1024 bytes) by
    # multiplying by 1000/1024 = 0.9765625 - https://www.gbmb.org/kb-to-kib
    # even with this the predicted file size often doesn't closely match actual
    # for unknown reasons, even with all non-audio streams stripped and user
    # offset prefs set to 0
    nEstFileSz="$( bc <<< "(($nBitrate * $iSec) / 8) * 0.9765625" )"
    echo "file size, actual . = $iFileSz KB"
    echo "file size, est. ... = $nEstFileSz KB"
    nCalc="$( bc <<< "$iFileSz - $nEstFileSz" )"
    echo "file size, diff. .. = $nCalc KB"

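    # numeric comparison: inside [[ ]] the < operator compares strings, so a
    # 95 second song would not be seen as shorter than 120
    if (( iSec < iMinLen )); then
        iErr=1
        echo -e "\e[1;91mBAD EGG: Song length too short: $iSec sec.\e[0m"
    elif (( $( bc <<< "$nBitrate < $iMinBr" ) )); then
        iErr=1
        echo -e "\e[1;91mBAD EGG: Low Bitrate: $nBitrate\e[0m"
    elif (( $iBadFr > 0 )); then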
        iErr=1
        echo -e "\e[1;91mBAD EGG: Bad frames: $iBadFr\e[0m"
    elif (( $( bc <<< "$nVersion != 1" ) )); then
        iErr=1
        echo -e "\e[1;91mBAD EGG: Incorrect MPEG version: $nVersion\e[0m"
    elif [[ $sLayer != "III" ]]; then
        iErr=1
        echo -e "\e[1;91mBAD EGG: File is not Layer III: $sLayer\e[0m"
    elif [[ $sStereo != *"stereo"* ]]; then
        iErr=1
        echo -e "\e[1;91mBAD EGG: File is not stereo: $sStereo\e[0m"
    elif (( $( bc <<< "$nFreq < $iMinFreq" ) )); then
        iErr=1
        echo -e "\e[1;91mBAD EGG: Low sampling rate: $nFreq KHz\e[0m"
    elif (( $( bc <<< "$iFileSz > ($nEstFileSz + $iFileBgOffset)" ) )); then
        iErr=1
        echo -e "\e[1;91mBAD EGG: File size too large.\e[0m"
        echo -e "\e[93mOne possibility is that the reported bitrate is not the actual bitrate.\e[0m"
        echo -e "\e[93mAnother possibility is that the file contains more metadata than expected, such as cover art, etc.\e[0m"
    elif (( $( bc <<< "$iFileSz < ($nEstFileSz - $iFileSmOffset)" ) )); then
        iErr=1
        echo -e "\e[1;91mBAD EGG: File size too small.\e[0m"
        echo -e "\e[93mOne possibility is that the reported bitrate is not the actual bitrate.\e[0m"
        echo -e "\e[93mOther possibilities include a damaged file, missing audio, etc.\e[0m"
    fi

    if (( $iErr == 1 )); then
        (( iBadEggs += 1 ))
        if [[ $sTestMode == "off" ]]; then
            mv "$files" "junk"
            echo -e "\e[93mThe file was moved to the 'junk' folder.\e[0m"
        fi
    else
        (( iGoodEggs += 1 ))
        echo -e "\e[1;32mGOOD EGG!\e[0m"
    fi

    if (( $iCount < $iFiles )) && (( $iPause == 1 )); then
        echo
        read -p "Press 'y' to continue or any other key to quit." -n 1 -r
        echo
        if [[ ! $REPLY =~ ^[Yy]$ ]]; then
            echo "-------------"
            echo -e "\e[93mUser aborted.\e[0m"
            break
        fi
    elif (( $iCount == $iFiles )); then
        echo "---------"
        echo -e "\e[1;32mAll done!\e[0m"
        break
    fi
done

echo "Good Eggs    : $iGoodEggs"
echo "Bad Eggs     : $iBadEggs"
echo "Total files  : $iCount"
iElpTime=$(( SECONDS - iStartTime ))
echo "Time elapsed : $iElpTime sec."
echo

exit

To run the script, cd to the “tools” directory in a terminal and run:

$ ./mp3chk.sh

STEP 10: Listen test

Grab a beer (or 12), put on headphones and listen.

STEP 11: Organize

Organize the tracks into folders and/or playlists. You can use puddletag to create playlists, your music player, etc., or you can simply use a text editor (an m3u file is nothing more than a text file with a list of file paths, though you should give it an m3u8 extension if the file is Unicode).
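
Since a playlist is just a list of paths, one can be produced with a few lines of shell. This is a minimal sketch under some assumptions: the function name make_playlist and the output name playlist.m3u8 are made up, and the playlist is written into the same folder as the MP3s so that bare file names are valid relative paths.

```bash
# make_playlist DIR
# Hypothetical helper: writes the file names of all MP3s in DIR, one per
# line, to DIR/playlist.m3u8.
make_playlist() {
    local sListDir sFile
    sListDir="$1"
    : > "$sListDir/playlist.m3u8"    # create or empty the playlist
    for sFile in "$sListDir"/*.[mM][pP]3; do
        [ -e "$sFile" ] || continue  # the glob matched nothing
        printf '%s\n' "${sFile##*/}" >> "$sListDir/playlist.m3u8"
    done
}

# example: make_playlist "$HOME/Music/hard"
```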

If you’re copying your music collection to other devices you may need to edit the paths in your playlists to make them relative.
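
Making the paths relative can also be done in the shell rather than by hand. The sketch below strips a common leading folder from every line of a playlist. The function name relativize and the paths in the example are hypothetical, and it assumes every absolute path in the playlist begins with the same base folder.

```bash
# relativize PLAYLIST BASE_DIR
# Hypothetical helper: prints PLAYLIST with a leading "BASE_DIR/" removed
# from each line. Lines that do not start with it pass through unchanged.
relativize() {
    local sLine
    while IFS= read -r sLine || [ -n "$sLine" ]; do
        printf '%s\n' "${sLine#"$2"/}"
    done < "$1"
}

# example: relativize all.m3u8 "/home/user/Music" > all_relative.m3u8
```

The output goes to a new file so the original playlist stays untouched until you have checked the result.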

STEP 12: Distribution

Copy your freshly minted MP3s to your devices and test to make sure they play OK there too and that the metadata is displayed. If you have any beer left, estimate its value in cryptocurrency and send it to me.

Further resources
