Content update: MP3 Factory: 'De-borking' MP3s on Linux

Version 0.11a of the MP3 Factory script is now available. The link is in the MP3 Factory: 'De-borking' MP3s on Linux page.

This version fixes a bug that caused the processing of multiple files to fail if they had different file extensions. Also the temporary copy of the file that is being processed is now stored in RAM (/run/user/$UID/mp3f_tmp) which greatly reduces storage reads and writes and speeds up processing a bit.

Various other smaller changes were made as well.

Introducing MP3 Factory for 'de-borking' MP3s on Linux!

I had written a bunch of Bash scripts that i was using to process downloaded MP3 files with the objective of correcting problems and dumping all the junk i didn't want to save in my music files, such as cover art. I have now combined all these scripts, and much more, into what i'm calling MP3 Factory and i very much like how things are panning out.

The script is in the alpha stage of development, but it is usable. It is also my first shell script, so not only am i learning a lot about some of the technical details of MP3s, but i'm also learning the Bash syntax. You can check out my work on the MP3 Factory: De-borking MP3s on Linux page.

How to stop Mp3splt from adding "_trimmed" to file names

Mp3splt is an old but pretty nice console utility for splitting MP3 files without re-encoding them (it can apparently handle a few other types as well). I'm working on a project that makes use of it to batch-trim extra silence from the beginning and end of MP3s and one of the long-time gripes i've had with Mp3splt is that it adds "_trimmed" to the output file names by default. Not being the sharpest knife in the drawer, it took me a long bloody time to FINALLY figure out how to eliminate that. I'm disclosing my solution here with the hope that someone else benefits since i never found a solution anywhere else.

If you man mp3splt, the clues are all there, primarily with the -o and @f options, but what complicated matters for me was that i wanted to run Mp3splt in a directory above the MP3s, which would be dumped first in /input and then in /output, and i wanted to do it without using cd in my script.

Here's what worked for me:

-o '../output/@f'

If the ../ is eliminated, then Mp3splt creates the /output directory in the /input directory which is just plain strange i think. Anyway, the @f will cause Mp3splt to retain the original file name.

The full command:

mp3splt -r -f -p th=-50,min=0.2 -o '../output/@f' 'input/song.mp3'

One of the other gripes i had with Mp3splt is that, in some cases, it will keep trimming "silence" that isn't silence from an MP3 every time you run it on the same file, thus the file size will keep getting smaller and smaller and you lose a little more audio each time. The min=0.2 seems to prevent that. When Mp3splt is used in trim mode (-r), the value of min= is the amount of silence in seconds that must be detected before trimming occurs. It is also the amount of silence that will be retained at the beginning and end of the MP3. This value can of course be changed to whatever you want, but obviously Mp3splt cannot add silence.
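
To apply the same command to a whole folder, a loop along the following lines might do. This is only a sketch: the function name trim_all and the DRY_RUN switch are made up for illustration, and it assumes it is run from the directory that contains the input and output folders.

```bash
# Sketch only: run the mp3splt command shown above on every MP3 in input/.
# 'trim_all' and DRY_RUN are hypothetical names used for illustration.
# Set DRY_RUN=1 to print the commands instead of running them.
trim_all() {
    for f in input/*.mp3; do
        [ -e "$f" ] || continue   # the glob matched nothing
        if [ "${DRY_RUN:-0}" = 1 ]; then
            printf '%s\n' "mp3splt -r -f -p th=-50,min=0.2 -o ../output/@f $f"
        else
            mp3splt -r -f -p th=-50,min=0.2 -o '../output/@f' "$f"
        fi
    done
}
```

Calling DRY_RUN=1 trim_all first shows what would be run without touching any files.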

New guide: De-borking MP3s on Linux

I've been wanting to publish a guide for repairing and editing MP3s for a music collection on Linux for a long time. De-borking MP3s on Linux is nothing professional, but it's a starting point if you, like me, download a lot of music and want to repair errors in the files, normalize the volume and properly tag them. It's a work in progress and i'd appreciate any feedback.

MP3 Factory: 'De-borking' MP3s on Linux

NOTE: This script is under heavy development and i don't necessarily suggest using it at this time. Either check back later or, better yet, subscribe to be notified when this page is updated.


intro

Yes, "de-borking" is a word you grammar Nazi! It's right here in the "official" dictionary :)

Apparently i'm an oddball because my MP3s contain only the artist, title and genre tags (and i only use 3 or 4 of the latter), along with the other required stuff and absolutely nothing else... well, except for the actual music. I'm not interested in embedding cover art, lyrics, soupie recipes or whatever else people are sticking in their music these days. I like to aim for minimal, compact, error-free files that meet my (medium-high) quality standard and that's what this is all about.

In an ideal world of unlimited availability and smartphones with unlimited storage, i'd never mess with MP3s, preferring instead one of the lossless formats such as FLAC, however some songs are difficult or impossible to find in such a format and storage space is still at a premium on our fondle slabs, so there's that.

Typically i acquire specific songs from various artists rather than entire albums, primarily because, with very few exceptions, i've never run across an album for which i liked every song. The problem with this disjointed approach is that downloading songs willy-nilly often results in files that are not consistent, which is to say they're often improperly tagged, have errors, are padded with too much or too little silence and the volume is all over the place. The worst is when some aspiring government employee thinks that upsampling a 64 kbit/s MP3 to 320 kbit/s is somehow a good idea. With the exception of the latter, such problems, and more, are generally easy to fix using a few software tools.

Enter MP3 Factory, a Bash shell script that i'm quite tickled to share.

I developed MP3 Factory to meet my own selfish needs as well as potentially address the perceived lack of a unified, Linux compatible program that can perform all or most of the tasks i require without depending on things i don't wish to depend on, like Java, Wine, or Mono.

highlights

  • Batch processing
  • Convert to MP3
  • Convert tags to file names and file names to tags
  • Repairing and rebuilding damaged files
  • Integrity checking, including CRC testing
  • Generate frequency spectrographs for graphic analysis
  • Trim excess silence interactively
  • Normalize volume (gain) with ReplayGain 2.0 data
  • Output clean, error-free, compact MP3 files
  • Process files other than MP3
  • Play a video of a music file
  • Code is ShellCheck and Shellharden compliant
  • Developed for Linux and the Bash shell

dependencies

Following are the primary dependencies of MP3F. Those which are not installed with the operating system may be available in your software repository (with the exception of mp3sum, all are available in the Arch or Arch User Repositories). MP3 Factory will let you know if any dependencies are not available as you use it.

FFmpeg (pkg. name: ffmpeg): A powerful and comprehensive set of console utilities for performing all sorts of actions on audio and video files. MP3 Factory also depends on FFplay and FFprobe which are likely installed with FFmpeg in mainstream Linux distributions. MP3 Factory will let you know about any missing dependencies.

id3ted (pkg. name: id3ted): Console utility for tagging MP3 files.

loudgain (pkg. name: loudgain): Console utility used to write ReplayGain 2.0 information to many different audio file formats.

MP3Info (pkg. name: mp3info): Console utility that provides information about MP3s.

Mp3splt (pkg. name: mp3splt): Console utility for splitting MP3s. It is used by MP3F to remove a portion of the silence at the beginning and end of the files.

mp3sum (pkg. name: mp3sum): Console utility used to calculate the CRC of the audio stream and verify that it matches the checksum in the info header of an MP3. mp3sum is a Python program that can be installed like so:

$ cd /tmp
$ git clone https://github.com/okdana/mp3sum
$ cd mp3sum
$ sudo pip3 install -r requirements.txt
$ sudo make install

MP3val (pkg. name: mp3val): Console utility capable of finding and fixing various problems in MP3 files, including some rather important ones such as frame, stream and header errors.
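
A dependency check of the kind mentioned above can be sketched with the shell's command -v builtin. This is not the code MP3 Factory actually uses; the function name missing_deps is made up, and the command names are assumed to match the package names listed above.

```bash
# Hypothetical sketch of a dependency check; not MP3 Factory's own code.
# Prints the name of every given command that cannot be found in PATH.
missing_deps() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

# Command names are assumed to be the same as the package names above.
missing="$(missing_deps ffmpeg ffplay ffprobe id3ted loudgain mp3info mp3splt mp3sum mp3val)"
if [ -n "$missing" ]; then
    printf 'Missing dependencies:\n%s\n' "$missing"
fi
```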

download

Both this guide and the MP3 Factory script are under heavy development, so revisit often for updates.

DISCLAIMER: MP3 Factory is in the early stage of development and is considered 'alpha' software, meaning there be bugs here, it may not be feature complete, and some stuff may simply fail to work. If you choose to use it you should be competent enough to review the code and smart enough to backup your music files before processing them. If anything explodes, a mirror will reveal who's to blame :)

Download MP3 Factory alpha

Having said that, i am developing and using it on my daily driver box. Then again, i've been known to do stupid things that result in catastrophes, like that time i formatted the wrong partition without making a fresh backup (i never even thought to ask the NSA to send me their copy).

Lastly, because MP3 Factory is in alpha stage and under heavy development, don't look for change logs. If you want to see the changes in the code you can use a diff tool.

overview

MP3 Factory (MP3F if you like) is a Bash shell script that runs in a terminal, but don't let that scare you if you're not a terminal freak... get it? MP3F presents a unified interface to run a few software tools available for Linux. You need not be a rocket scientist to use it since the script makes working on your music files fast and easy while still offering a plethora of configuration options.

MP3 Factory is GPL licensed and was developed on Manjaro Linux (Arch for dummies!) but it should run on any other flavor of Linux that has a Bash compatible shell. Using MP3 Factory is as easy as typing ./mp3factory.sh in your terminal, feeding it some music files and following the prompts.

Following are the functions offered by MP3F, all of which are discussed in detail below.

1) Tag To File Name     7) Rebuild            13) Add Log Entry
2) Convert To MP3       8) File Name To Tag   14) View Log
3) Repair               9) Normalize Gain     15) Documentation
4) Integrity Check 1   10) Write A Tag        16) Edit Options
5) View Spectrograph   11) Integrity Check 2  17) Quit
6) Trim Silence        12) Play

With the exception of re-encoding non-MP3s to MP3s, all of the operations performed by MP3 Factory and its dependencies are non-destructive, meaning no re-encoding is necessary and therefore there is no degradation of quality.

Music files can be batch processed and automatically retained or discarded according to parameters defined by you, or the operation can be interrupted after each file is processed to allow you to override the batch behavior. All important operations are logged.

MP3 Factory will create the following directories and files within the 'MP3 Factory' folder:

/backup          (copies of source files converted to MP3)
/input           (files to be processed are placed here)
/junk            (discarded files, if they are retained)
/logs            (holds current and old log files)
/output          (processed files are moved here)
/spectro         (spectrograph images of music files)
/.temp           (files undergoing processing)
junk.txt         (file names of files that are discarded)
options.conf     (configuration options)

audio functions

Following are all of the audio related functions offered by MP3F.

Tag To File Name: This operation uses FFprobe to read user specified tags from the metadata of the file and change the file name according to the tags. This operation is intended for those times when you download album tracks that might have file names like '01 - Sara Smile' and you want to change it to 'Daryl Hall & John Oates - Sara Smile'. The default settings can be changed.

This operation accepts the following file types: aac, ape, flac, m4a, mp3, ogg, wav
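
The idea behind Tag To File Name can be sketched as follows. The function names are made up, and the sketch assumes the tags are stored at the container level under the lowercase names artist and title, which is how FFprobe usually reports them for MP3s; some other formats keep their tags on the audio stream instead.

```bash
# Sketch only; the function names are hypothetical, not MP3 Factory's own.

# Read one tag ($2) from the container-level metadata of a file ($1).
read_tag() {
    ffprobe -v error -show_entries "format_tags=$2" \
        -of default=noprint_wrappers=1:nokey=1 "$1"
}

# Build 'artist - title.ext', replacing '/' since it cannot be in a file name.
build_name() {
    printf '%s - %s.%s\n' "$1" "$2" "$3" | tr '/' '_'
}

# Rename a file according to its artist and title tags.
rename_from_tags() {
    artist="$(read_tag "$1" artist)"
    title="$(read_tag "$1" title)"
    [ -n "$artist" ] && [ -n "$title" ] || return 1   # skip untagged files
    mv -n -- "$1" "$(dirname -- "$1")/$(build_name "$artist" "$title" "${1##*.}")"
}
```

For example, rename_from_tags 'input/01 - Sara Smile.mp3' would rename the file only if both tags are present, and mv -n refuses to overwrite an existing file.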

Convert To MP3: This operation uses FFmpeg to re-encode non-MP3s to MP3s. By default MP3s are encoded using a high quality variable bit rate (VBR) as opposed to a constant bit rate (CBR). The default settings can be changed.

This operation accepts the following file types: aac, ape, flac, m4a, ogg, wav

Repair: This operation uses MP3val to check for some rather crucial problems and repair them when possible.

This operation accepts the following file types: mp3

Integrity Check 1: This operation uses FFmpeg, FFprobe, MP3Info and mp3sum to gather information about the files to help you make a determination as to whether they should be discarded or retained. Tests are performed to ensure that the minimum bit rate, sample rate, duration, etc., meet your specifications. Checks for decoding errors and CRC checksums are also performed. The default settings can be changed.

This operation accepts the following file types: mp3

Trim Silence: This operation uses Mp3splt to remove extra silence from the beginning and end of an MP3. By default some silence will be retained as long as it exists in the first place. The default settings can be changed.

This operation accepts the following file types: mp3, ogg, flac

Rebuild: This operation uses FFmpeg to essentially rebuild the file without re-encoding it. By default the audio stream is copied to a new file which is given a fresh LAME/Xing header. Most other metadata, such as cover art, is dropped by default. Note that the rebuilt file may be slightly larger in size than the source file, particularly if the source file did not contain any metadata. The default settings can be changed.

This operation accepts the following file types: aac, ape, flac, m4a, mp3, ogg, wav
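
A rebuild without re-encoding can be sketched with FFmpeg's stream copy mode. The options below are an assumption about how such a rebuild could be done, not the ones MP3 Factory ships with, and the FFMPEG variable exists only so the command can be swapped out for testing.

```bash
# Sketch only: copy the audio stream into a fresh file without re-encoding.
#   -map 0:a          keep audio streams only (embedded cover art is a video stream)
#   -c:a copy         copy the audio as-is, no re-encoding
#   -map_metadata -1  do not carry over the source's global tags
# 'rebuild' and the FFMPEG override are hypothetical names.
rebuild() {
    # $1 = source file, $2 = destination file
    "${FFMPEG:-ffmpeg}" -nostdin -v error -i "$1" \
        -map 0:a -c:a copy -map_metadata -1 "$2"
}
```

Usage would be something like rebuild 'input/song.mp3' 'output/song.mp3'. FFmpeg's MP3 muxer writes a Xing header by default, which is where the fresh header comes from.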

File Name To Tag: This operation uses FFmpeg to convert the file name to tags. Here it is expected that the file name, typically <artist> - <title>, have at least one hyphen in it with a single space on either side. Everything before the first hyphen will be written to one user specified tag and everything after it to another. The default settings can be changed.

This operation accepts the following file types: mp3
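
The split at the first hyphen can be done with plain parameter expansion. The following is a sketch with made-up names (split_name, artist, title), not the script's actual code.

```bash
# Sketch only: split 'artist - title.mp3' at the first ' - '.
# Sets the hypothetical variables 'artist' and 'title';
# returns 1 if the name contains no ' - ' separator.
split_name() {
    name="${1##*/}"      # strip any directory part
    name="${name%.*}"    # strip the extension
    case "$name" in
        *' - '*) ;;
        *) return 1 ;;
    esac
    artist="${name%% - *}"   # everything before the first ' - '
    title="${name#* - }"     # everything after the first ' - '
}
```

With a name such as 'A - B - C.mp3' this yields 'A' for the artist and 'B - C' for the title, matching the rule that only the first hyphen counts. The two values could then be handed to FFmpeg's -metadata option.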

Normalize Gain: This operation uses loudgain to write ReplayGain 2.0 information to the file so that all your music files have a similar volume. Unlike MP3Gain, loudgain does not touch the actual audio and does not write a deprecated APE tag. The caveat here is that your music player must support ReplayGain, which probably every decent program does. The default settings can be changed.

This operation accepts the following file types: aac, ape, flac, m4a, mp3, ogg, wav

View Spectrograph: This operation uses FFmpeg to create a frequency spectrograph image (PNG) of the music file or FFplay to create a video spectrograph with audio. The purpose of this is to help determine whether the audio file has been upscaled, as would be the case if someone took a 128 kbit/s MP3 and re-encoded it to 320 kbit/s, or whether there exists an unacceptable amount of clipping. If pausing is disabled, only images will be created in the 'spectro' folder and they will not be opened for preview. The image files will have the same name as the music file. If pausing is not disabled, you can either preview a static image or play the video which will display a scrolling frequency spectrograph.

This operation accepts the following file types: aac, ape, flac, m4a, mp3, ogg, wav
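
A static spectrograph image can be produced with FFmpeg's showspectrumpic filter. The sketch below is an assumption about how that might look; the image size, the function name make_spectro and the FFMPEG override are all illustrative.

```bash
# Sketch only: write a spectrogram PNG named after the music file into spectro/.
# The 1024x512 size is an arbitrary choice; 'make_spectro' is a hypothetical name.
make_spectro() {
    # $1 = audio file
    base="${1##*/}"                       # file name without the directory
    "${FFMPEG:-ffmpeg}" -nostdin -v error -y -i "$1" \
        -lavfi showspectrumpic=s=1024x512 "spectro/${base%.*}.png"
}
```

Running make_spectro 'input/My Song.mp3' would write 'spectro/My Song.png', overwriting an older image of the same name because of -y.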

Write A Tag: Allows you to write metadata to the file. The default comment tag includes 'MP3 Factory' and the version number so that you can re-process the file if you like, should something change in a future version of the script that interests you. If you disable pausing and process the files in batch mode, only the default comment tag is written.

This operation accepts the following file types: mp3

Integrity Check 2: The second integrity check runs a subset of the tests performed by the first integrity check. The purpose is to ensure that the MP3s still meet your standards after undergoing any previous operations. With the default settings it is expected that the file will have been rebuilt by this time and all unnecessary metadata removed, including cover art and lyrics, however you can change the settings to accommodate your needs.

This operation accepts the following file types: mp3

Play: This operation uses either FFplay or the system default handler to play a music file.

This operation accepts the following file types: Any supported by FFplay or the system default handler.

other functions

Add Log Entry: Allows you to add an entry to the log file.

View Log: Open the log file for reading. All important processes are logged while MP3F operates on your files.

Cleanup: After prompting, all files in the 'junk' and 'spectro' directories can be deleted. This function can be invoked from the main menu or automatically upon exit.

Help: Displays a minimal amount of helpful information.

Documentation: Links to this wall of text.

Edit Options: Opens the options.conf file for editing. MP3F will need to be restarted if any changes are made.

usage

setup

Upon extracting the ZIP archive, mp3factory.sh will be contained in a folder named 'MP3 Factory'. I know you never expected such stunning creativity, but there it is! You can do what you want regarding the folder name, but do keep the script in a dedicated folder, perhaps somewhere in your 'home' directory.

The first thing you'll want to do is edit the configuration options contained in the mp3factory.conf file. Because the content of this file is imported using source, it is vital to not mess up the syntax else all hell could break loose. To edit the options you'll need a decent text/code editor, perhaps one that offers syntax highlighting. If you're running the KDE desktop then Kate is a nice choice. Just don't use something like LibreOffice Writer or whatever. If you need help with program parameters, such as for FFmpeg, try man <program name> in your terminal, or search for the on-line manual, or leave a comment here (you need not be logged in and you can subscribe to be notified of replies). For help with the more complex options, search this page (Ctrl + F) for the option name.

Be careful when editing the options! Again, it is vital to not mess up the syntax which is in the form of option=value. In particular, do not:

  • edit anything other than the option values
  • accidentally delete a single or double quote
  • mix quotes
  • add a space before or after the equals character
  • delete a hash (#) character

There are four types of option values: string, array, integer and number. They are easily identified by the first character of the option name which is one of 's', 'a', 'i' or 'n'. This is what the syntax generally looks like:

sString='string'
aArray=( 'element 1' 'element 2' '...' )
iInteger=1
nNumber=1.2

If you want to make certain you didn't screw something up, install ShellCheck and run shellcheck -x mp3factory.sh in your terminal. It should not output anything if all is copacetic.
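
One mistake from the list above, a space next to the equals character, is easy to catch with grep before the script ever sources the file. The following is only a sketch; check_conf is a made-up name and it does not replace a proper ShellCheck run.

```bash
# Sketch only: flag assignments that have a space before or after '='.
# Prints the offending lines with line numbers and returns 1 if any are found.
# 'check_conf' is a hypothetical helper, not part of MP3 Factory.
check_conf() {
    if grep -nE '^[[:space:]]*[A-Za-z_][A-Za-z0-9_]*([[:space:]]+=|=[[:space:]])' "$1"; then
        return 1
    fi
    return 0
}
```

For example, check_conf mp3factory.conf would print a line such as iInteger = 1 but stay silent for iInteger=1. Comment lines and the text inside quoted values are not examined.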

first run

Make the mp3factory.sh script executable, which you may be able to do from your graphical file manager, or you can run chmod +x mp3factory.sh in your terminal. After that, run the script by opening a terminal, navigating to the 'MP3 Factory' folder, and running ./mp3factory.sh. The script will run some checks to make sure its dependencies are met and then create some folders, after which it's ready to use. Don't rename the folders else they'll just be recreated. If you have any problems, leave a comment.

processing files

Copy whatever files you want to process to the 'input' folder and from the main menu of MP3 Factory select an operation to perform. You did notice i used the word "copy" and not "move and potentially f'up all your precious music files", right? The point being that you should work on copies of your source files rather than the originals even though MP3F creates a temporary copy of the file to work on.

The order in which some of the processes are run is rather crucial and this is why the menu items are ordered as they are, in addition to what i perceive as being logical. For instance, if you want to convert tags to file names, you obviously don't want to rebuild the file beforehand, thereby garbage-canning all the tags (by default). With few exceptions, it is best to run the operations you need in the order they appear in the menu.

The file names of music files that are discarded are saved to a text file so you can remember to replace them at some point. Discarded files are either moved to the 'junk' folder if you opted to keep them, or deleted if you want to save disk space.

Following is how i process my music files:

  1. Convert the artist and title metadata (tags) to file names (<artist> - <title>.mp3). I like to get this out of the way early so that it's easier to identify the file names with the songs, especially for those that may be discarded later where only the file name may remain in the junk.txt file.
  2. Convert anything that isn't an MP3 to MP3.
  3. Run the repair operation to potentially correct any major issues.
  4. Create a spectrograph image of the files to determine if they attain the desired cutoff frequency (whether they may have been upscaled), or whether there's excessive clipping. Some clipping is permissible as this will be handled by RG (Normalize Gain), but if there's too much then i trash the file. Be sure to read the 'view spectrograph' section below because there's more to know about this!
  5. Trim any excessive silence from the beginning and end of the files leaving 1 second of silence, assuming that much exists in the first place. Be sure to read the 'trim silence' section of this guide below!
  6. Rebuild the files in order to remove most metadata, some of which may be damaged, unrecognized, deprecated or simply unwanted.
  7. Run an integrity check operation on the files to see if they meet my standards (bit rate, sample rate, etc.).
  8. Re-write the artist and title metadata as derived from the file names and, optionally, add the genre tag.
  9. Adjust the gain (volume) of the files by adding the ReplayGain 2.0 tags so that the volume across all files is consistent when played using a media player that supports RG.
  10. Batch write comment tags to add MP3 Factory and its version number so i know which version was used to process the files. This could come in handy later if changes are made to the MP3F script and i wish to reprocess the files.

If your music collection consists of singles you downloaded from all over the place without paying much attention to quality, then don't be a bit surprised if a huge chunk of it is flagged as 'bad eggs' if you run an integrity check with the default settings.

Upon processing your files, the 'good eggs' will be moved to the 'output' folder. If you want to run another operation on the files, you could manually move them from the 'output' to the 'input' folder, or you could let MP3F do this for you by simply initiating the next operation in which case all files that are compatible with the operation will be automatically moved to the 'input' folder if there are none already in the folder. If you enable the automatic file rotation option, then each time you initiate an operation all compatible files will be moved from the 'output' folder to the 'input' folder without prompting and regardless of whether any already exist in the 'input' folder.
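
The rotation itself amounts to moving files with a compatible extension from one folder to the other. A sketch, with the hypothetical function name rotate_files and extensions matched case-sensitively:

```bash
# Sketch only: move files with a compatible extension from output/ to input/.
# Usage: rotate_files BASE_DIR EXT [EXT ...]
# 'rotate_files' is a hypothetical name, not MP3 Factory's own.
rotate_files() {
    base="$1"
    shift
    for ext in "$@"; do
        for f in "$base"/output/*."$ext"; do
            [ -e "$f" ] || continue        # the glob matched nothing
            mv -n -- "$f" "$base/input/"   # -n: never overwrite an existing file
        done
    done
}
```

For an operation that only accepts MP3s, rotate_files . mp3 would leave everything else in the output folder untouched.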

As you begin an operation, MP3F will ask a few questions such as whether you wish to accept the settings for the operation and whether you want to pause after each file is processed. If pausing is enabled you will be offered a choice to keep or discard the processed file or, in certain cases, to retain the original file after the operation has completed. For more about batch processing, see the 'important notes' section below.

FFplay, a part of the FFmpeg package, is used in various operations to preview the music files. FFplay does not offer any controls in its interface but you can control it with hotkeys. See the 3.6 While playing section of the FFplay documentation. The basic controls you may find handy are:

  • space bar : play/pause
  • left/right (arrows) : seek backwards/forwards 10 sec.
  • down/up (arrows) : seek backwards/forwards 1 min.
  • escape key : quit

important notes for various operations

batch processing

If you choose to disable pausing after each song is processed then MP3 Factory will batch process all files in the 'input' directory that are compatible with the chosen operation. In this mode of operation the files will be automatically discarded (moved to the 'junk' directory or deleted) or moved to the 'output' directory based upon the options you have chosen, as well as other internal parameters. If you enable automatic file rotation then all files which are compatible with the selected operation will be moved to the 'input' directory regardless of whether compatible files already exist in the directory.

If you're just getting to know MP3F then i suggest you do not invoke batch processing until you become familiar with it.

trim silence

Mp3splt can trim more than just silence if the silence detection threshold (the value of th in the Mp3splt options) is not properly set, therefore i strongly suggest you do not batch trim all your music files. It is always best to preview the files first to determine if they actually need trimming. When pausing is enabled you will have the opportunity to preview a few seconds of the beginning and end of each file both before and after trimming and if you're not satisfied with the result you can adjust the silence threshold value on the fly and quickly process the file again. If the file needs trimming but not enough was removed, raise the dB threshold by 1 or 2 (e.g. from -48 to -46) and repeat the process. Conversely, if too much was removed, lower the dB level a bit and try again.

view spectrograph

MP3 Factory uses FFmpeg to create PNG images of the frequency spectrum of your music files. If pausing is enabled you will be shown each image as it is generated in order to make a determination as to whether to keep the file. One of the things to look for in these images is the frequency cutoff point, meaning the highest frequency the audio attains. The highest frequency that can be encoded in an MP3 is limited by the bit rate used to encode it, however there is more to consider. Another thing to look for is clipping. Pictures are worth stuff they say, so let's do it...

Here's what the frequency spectrum looks like for a random FLAC music file that indicated a bitrate of 872 kbit/s. The keen observer you are, you'll notice that the frequency cutoff is close to 21000 Hz, or 21 kHz, which is slightly higher than the highest frequency the average young human is capable of hearing (20 kHz). The file size here is 26,924,696 bytes.

Frequency spectrum for FLAC sample 872kbps

Next i converted the 872 kbit/s FLAC file to a 320 kbit/s CBR MP3. The file size dropped to 9,882,876 bytes, less than half of what it was, and we lost very little quality. You'll notice the frequency cutoff point has dropped to around 20000 Hz, or 20 kHz, so we lost a little something, but nothing anybody without very keen hearing and decent headphones plugged into a decent sound system, and listening in a quiet environment, is likely to notice. We also saved a hell of a lot of disk space which is crucial if you want to cram as many songs as possible onto that microscopic SD card that you struggle to plug in to your fondle slab.

Frequency spectrum for MP3 sample 320kbps

Next i converted the 872 kbit/s FLAC to a 128 kbit/s MP3. This time the quality loss is significant to the point where most people with decent hearing in a quiet listening environment with a decent sound system would notice. If you're listening to music while operating a jackhammer however, then not so much. Note that the cutoff frequency has dropped as a result of the lower bit rate, however a low frequency cutoff point is not necessarily an indication of poor audio quality. Here the frequency cutoff is around 16000 Hz, or 16 kHz, and the file size is 3,953,289 bytes, again less than half of what it was.

Frequency spectrum for MP3 sample 128kbps

Now i 'did as stoopid does'; i took the 128 kbit/s MP3 and upscaled it to 320 kbit/s because "more quality", however the only thing i actually accomplished was to fatten the file size which has now more than doubled to 9,882,876 bytes, exactly the same size as the 320 kbit/s MP3 that was converted from FLAC earlier. As with the 128 kbit/s MP3, the frequency again tops out around 16000 Hz. In summary, we more than doubled the size of the 128 kbit/s file yet gained absolutely nothing in sound quality. So this means that every file with a high bit rate and a low frequency cutoff point is garbage, right? Well, no it doesn't. Read on.

Frequency spectrum for MP3 sample 320kbps
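
The identical file sizes are no coincidence: with a constant bit rate the size is dictated by bit rate and duration alone, whatever the audio contains. A rough check with shell arithmetic, ignoring the small overhead of tags and headers (cbr_seconds is a made-up helper name):

```bash
# Approximate duration in whole seconds of a CBR file:
#   seconds = bytes * 8 / (kbit/s * 1000)
# Tag and header overhead is ignored, so treat the result as an estimate.
cbr_seconds() {
    echo $(( $1 * 8 / ($2 * 1000) ))
}

cbr_seconds 9882876 320   # either of the 320 kbit/s MP3s
cbr_seconds 3953289 128   # the 128 kbit/s MP3
```

Both lines print 247, so all three files hold the same roughly four minute song. The upscaled file simply spends 320 kbit/s on audio that only ever contained 128 kbit/s worth of information.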

The other thing to look for is signs of clipping. Clipping is the result of the gain (volume) having been raised too high and which can sometimes be heard as distortion. This is fairly common given the loudness war we've been subjected to as of late. Here is a frequency spectrum of Acoustic Alchemy - Clear Air For Miles.mp3 (320 kbit/s CBR 44.1 kHz). There is no sign of significant clipping here:

Frequency spectrum: Acoustic Alchemy - Clear Air For Miles MP3

Using MP3Gain i then increased the gain by 10 dB and this was the result:

Frequency spectrum: Acoustic Alchemy – Clear Air For Miles MP3 (clipping)

What you'll notice is that there's a lot of green and darker colored blue stuff that reaches the upper frequency cutoff point and this is indicative of clipping. The file is now garbage as far as i'm concerned.

So how do you use this information? Well, this is where it gets tricky because an MP3 that was encoded using a high bit rate, yet doesn't approach 20 kHz or so, is not indicative of poor quality. Think of a super high quality lossless recording of a synthesizer playing only low frequency notes with no other instruments or vocals. Although the quality of the recording is superb, the frequency wouldn't even begin to approach 20 kHz. Given the wide variety of sounds in most Rock, Pop and some other genres of music we listen to today however, these songs will often approach or slightly exceed 20 kHz even after being converted to MP3 as long as a high quality VBR or CBR encoding was performed.

Over time you'll develop a sense of what the frequency spectrum should look like given the bit rate and the different kinds and amplitude of sounds present in the song. Ultimately what matters most however is what you hear, not the fancy colors on a graph.

integrity check

The Integrity Check operation is fairly comprehensive and includes the following:

  • file information provided by FFprobe
  • file information provided by MP3Info
  • duration error check (audio frame size is used to estimate duration)
  • length of audio check (whether the song is longer or shorter than user allowed)
  • bitrate check (minimum average bitrate as set by user)
  • good frame count (total number of good audio frames)
  • bad frame count (total number of damaged audio frames)
  • MPEG version check (check for MPEG version 1)
  • MPEG layer check (check for MPEG layer III)
  • stereo check (checks whether audio is stereo)
  • sample rate check (check for minimum sample rate)
  • file size overhead check (check size of non-audio data)
  • decoding error check (check for decoding errors)

integrity check, CRC verification

The LAME encoder v3.90 or newer writes CRC checksums to the LAME/Xing headers of the encoded MP3s. While not bulletproof, this checksum can be used to verify the integrity of the audio stream specifically (metadata alterations such as artist, title, etc., do not affect the CRC verification). While LAME can also write a CRC checksum to each audio frame which would provide a more accurate diagnosis as to whether the audio has been altered, this is highly discouraged since the space used for these checksums is the same as that used for the actual audio, thus leading to a possible reduction in audio quality. MP3 Factory uses mp3sum to calculate the checksum of the audio stream and compare it with that in the info header of the MP3. Enabling this check is a great way to weed out a lot of damaged, truncated, or otherwise compromised files if you're particular about your music collection, however be aware that MP3s processed by Mp3splt (used to trim silence by MP3F) will fail the CRC check, hence why this check is performed during the first integrity check and not the second which takes place after the MP3 may have been processed by Mp3splt.

notes regarding various options

For options that apply to external programs, such as FFmpeg, Mp3splt, etc., you can get help right from your terminal by entering either man <program> to view the manual, or <program> --help or <program> -h to view a brief overview including the parameters it accepts. Note that '<program>' must be the exact, case-sensitive name of the program's command. For example, to get a brief overview of Mp3splt and the parameters it takes, enter mp3splt -h in your terminal. You can also redirect the output to a file for easier reading and searching. For example, to save the manual (man) page for FFmpeg to ffmpeg.txt: man ffmpeg > ffmpeg.txt

sFilesAutoRotate : Whether to auto-rotate files from the 'output' directory to the 'input' directory.

If this option is set to 'on' then all files, if any, which are compatible with the selected operation will be moved from the 'output' to the 'input' directory without prompting upon initiating an operation. If it is set to 'off', you'll be asked whether to move compatible files from the 'output' to the 'input' directory only if there are no compatible files in the 'input' directory.
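
The behaviour described above can be sketched roughly as follows. This is not the code from MP3 Factory. The function names, the argument order and the use of a single file extension to decide what counts as "compatible" are all assumptions made for illustration.

```bash
# Succeed if the directory holds at least one file with the extension.
has_files() {
    local f
    for f in "$1"/*."$2"; do
        [ -e "$f" ] && return 0
    done
    return 1
}

# Move every file with the given extension from out_dir to in_dir.
rotate_files() {
    local out_dir=$1 in_dir=$2 ext=$3 f
    for f in "$out_dir"/*."$ext"; do
        [ -e "$f" ] || continue          # the glob matched nothing
        mv -- "$f" "$in_dir"/
    done
}

# mode 'on'  : rotate without prompting
# mode 'off' : ask, but only when in_dir has no compatible files
auto_rotate() {
    local mode=$1 out_dir=$2 in_dir=$3 ext=$4 answer
    has_files "$out_dir" "$ext" || return 0   # nothing to rotate
    if [ "$mode" = on ]; then
        rotate_files "$out_dir" "$in_dir" "$ext"
    elif ! has_files "$in_dir" "$ext"; then
        read -r -p "Move files from 'output' to 'input'? [y/N] " answer
        if [ "$answer" = y ]; then
            rotate_files "$out_dir" "$in_dir" "$ext"
        fi
    fi
}
```

Calling auto_rotate on output input mp3 would then move every .mp3 file without asking.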

aEncFfmpegOpt : FFmpeg encoder options.

By default MP3 Factory, using FFmpeg, will encode files using a variable bit rate (VBR). The -qscale:a 0 parameter handed to FFmpeg produces the highest quality VBR output that the libmp3lame encoder is capable of, which averages around 245 kbit/s, the typical minimum being 220 kbit/s and the maximum 260 kbit/s. If for some reason you prefer a constant bit rate (CBR), you can change -qscale:a 0 to -b:a 320k where '320' is your desired bit rate. However, neither the FFmpeg developers nor the folks at Hydrogenaud.io recommend this, and both are widely regarded as authorities on the subject. The simple reason is that it's pointless to encode data you likely cannot hear and, besides, if you were interested in the best quality you wouldn't be messing with MP3s. More on the bit rate encoding parameter can be learned here.
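
To make the two choices concrete, here is a small sketch of how the option array might look for each. Only -qscale:a 0 and -b:a 320k come from the text above. The -codec:a libmp3lame part and the encode_cmd helper are assumptions added for illustration, and the helper only prints the command rather than running FFmpeg.

```bash
# VBR, highest quality libmp3lame can produce (the described default).
aEncFfmpegOpt=( -codec:a libmp3lame -qscale:a 0 )

# CBR alternative (not recommended): swap -qscale:a 0 for -b:a <rate>k.
# aEncFfmpegOpt=( -codec:a libmp3lame -b:a 320k )

# Print the FFmpeg command that would encode one file (display only,
# file names containing spaces are not quoted in the output).
encode_cmd() {
    local in_file=$1 out_file=$2
    echo "ffmpeg -i $in_file ${aEncFfmpegOpt[*]} $out_file"
}
```

To actually encode, the array would be expanded with "${aEncFfmpegOpt[@]}" in a real ffmpeg call so each option stays a separate argument.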

aIntgCkFfprobeInfoOpt : What information is displayed by FFprobe when performing an integrity check.

See the FFprobe Documentation for help.

aIntegCkFfmpegOpt : Options used by FFmpeg when checking for decoding errors.

See the FFmpeg Documentation for help.

aRebuildFfmpegOpt : Options for FFmpeg when rebuilding a file.

By default most metadata is ignored when FFmpeg muxes (rebuilds) the file. The audio stream is copied to a new file and an Xing/LAME frame is written, however other metadata such as artist, title or cover art is not transferred to the new file. If you want to copy the metadata to the new file, see the FFmpeg Documentation. If you want to copy all the metadata, remove the -map 0:a and -map_metadata -1 parameters. I chose the default options because i prefer to scrap all metadata in order to produce a clean file and then write only the artist, title and genre tags using the file name as the template for the artist and title tags.
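
The difference between the two approaches might look something like this. The -map 0:a and -map_metadata -1 parameters are the ones named above. The -c copy part, the 'clean' and 'keep' mode names and the rebuild_cmd helper are assumptions for the sake of the example, and the helper only prints the command.

```bash
# Print the FFmpeg command for rebuilding a file.
# mode 'clean' (default): audio stream only, all metadata dropped
# mode 'keep'           : -map 0:a and -map_metadata -1 removed
rebuild_cmd() {
    local in_file=$1 out_file=$2 mode=${3:-clean}
    local -a opts
    if [ "$mode" = keep ]; then
        opts=( -c copy )
    else
        opts=( -map 0:a -map_metadata -1 -c copy )
    fi
    echo "ffmpeg -i $in_file ${opts[*]} $out_file"
}
```

In both cases the audio is stream-copied, so nothing is re-encoded.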

aTagToFileFfprobeOpt : Options used by FFprobe when converting tags to file names.

Typically the file name pattern for a single music file is <artist> - <title>. This option allows you to choose which two tags you want to use to rename your files. While you may use tags other than the default artist and title, keep in mind that they are the only two tags which are virtually guaranteed to be present. See the MP3 section of the FFmpeg Metadata Wiki for supported tag names.
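
Here is a minimal sketch of the idea, assuming FFprobe is asked for the artist and title tags and prints them as TAG:<name>=<value> lines. The function names are hypothetical, and a real script would also need to deal with characters that are not allowed in file names.

```bash
# Ask FFprobe for the two tags. Lines come back in the form
# "TAG:artist=Some Artist", in whatever order the file stores them.
read_tags() {
    ffprobe -v error -show_entries format_tags=artist,title \
        -of default=noprint_wrappers=1 "$1"
}

# Build "<artist> - <title>.mp3" from FFprobe-style lines on stdin.
# Fails if either tag is missing.
tags_to_name() {
    local line artist='' title=''
    while IFS= read -r line; do
        case $line in
            TAG:artist=*) artist=${line#TAG:artist=} ;;
            TAG:title=*)  title=${line#TAG:title=} ;;
        esac
    done
    [ -n "$artist" ] && [ -n "$title" ] || return 1
    echo "$artist - $title.mp3"
}

# Usage (requires ffprobe):
#   new_name=$(read_tags "song.mp3" | tags_to_name) && mv -- "song.mp3" "$new_name"
```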

aFileToTagId3tedOpt : Options used by id3ted when writing tags derived from file names.

This option controls which parts of a file name are written to which tags when writing tags derived from file names. Typically the file name pattern for a single music file is <artist> - <title>. In order to write the artist and title tags, id3ted needs to know which part of the file name is the artist and which part is the title. The separator in the file name is always the first hyphen which must be in the form of space-hyphen-space. In our case we consider everything before the first hyphen to be the artist and everything after it to be the song title. Since id3ted sees the -a parameter as artist, and -t as title, our value for the aFileToTagId3tedOpt option is simply ( '-a' '-t' ). While you can change which tags are written, keep in mind that the artist and title tags should always be written and only 2 tags are supported at this time.
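
The splitting rule can be sketched with plain Bash parameter expansion. The split_name function and the artist and title variable names are hypothetical, and the id3ted call is shown only as a comment.

```bash
aFileToTagId3tedOpt=( '-a' '-t' )

# Split "<artist> - <title>.<ext>" at the first space-hyphen-space.
# Sets the variables 'artist' and 'title'; fails if no separator exists.
split_name() {
    local base=${1##*/}        # strip any leading directory
    base=${base%.*}            # strip the file extension
    case $base in
        *" - "*) ;;            # separator present, carry on
        *) return 1 ;;
    esac
    artist=${base%% - *}       # everything before the first " - "
    title=${base#* - }         # everything after the first " - "
}

# Usage (requires id3ted):
#   split_name "$file" &&
#       id3ted "${aFileToTagId3tedOpt[0]}" "$artist" \
#              "${aFileToTagId3tedOpt[1]}" "$title" "$file"
```

Note that a hyphen without surrounding spaces, as in "AC-DC", is not treated as a separator, and any further " - " after the first one stays in the title.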

aSpecFfmpegPicOpt : Options used by FFmpeg for creating a still image spectrograph.

There's a lot you can do with this option. See the showspectrum section of the FFmpeg Filters Documentation. Although the default options i chose might not produce the sexiest images, i chose them because i think they accentuate the spectrograph to a degree that makes it rather easy to determine whether to keep the file, so just be aware of that should you change them.

aLoudgainOpt : Options used by loudgain.

By default loudgain will perform track gain rather than album gain, avoid clipping, and write ID3 v2.4 tags.

aMp3spltOpt (string) : Options used by Mp3splt when trimming silence from the beginning and end of the files.

I have a strong hunch that quite a few people are not using Mp3splt correctly. I suspect this because i was misusing it myself until recently, which is to say that i failed to use the -p th=n and min=n parameters properly. Although this may surprise you, apparently one of the significant problems faced by Mp3splt (or FFmpeg using silencedetect) is detecting silence. This is why the silence threshold setting is so important. By default Mp3splt uses -48 dB. In an ideal world apparently -50 dB is "perfect silence", so if you thought it was 0 dB, which seems more logical, and which might make things much easier, you'd be wrong.

OK, so if perfect silence is -50 dB, then why does this parameter even exist? Why not hard-code -50 into the software? Well, the problem is that this threshold moves up and down depending on how the audio was encoded (and perhaps other factors), so what may be perfect silence at -50 dB in one file may be the middle of a drum solo in another. As it says in the manual, the default -48 dB used by Mp3splt is a compromise arrived at through testing, and it seems to work reasonably well in many cases, but not all. However, if we're interested in processing our music quickly, we probably don't want to spend a lot of time tweaking the silence threshold for each and every song, and so we just stick with the -48 dB level and hope for the best.

The problem with the silence threshold not being set accurately per-file is that it is very possible to trim a file with Mp3splt multiple times with exactly the same settings and each time the file size may shrink. This tells you that Mp3splt is removing more than just silence and i've experienced this in my own tests. Now imagine somebody rips a CD and runs it through Mp3splt with the silence threshold at -48 when the actual silence threshold is -50. Well, there goes a slice of audible audio. Then they share the song with a friend who runs it through Mp3splt again because they didn't know their buddy already did and perhaps a little more of the song is cropped. Then they upload it and you download it and you run ... Get the picture? I've noticed that a lot of songs i download have no silence padding at all, very possibly due to the scenario just discussed or simply because people don't know how to properly use whatever software they're using.

So silence detection is one problem, but there's more...

The next problem is the possibility that people aren't making use of the min=n parameter. By default i've set this to 1 in MP3 Factory and what this does is instruct Mp3splt, in combination with the th=-48 parameter, to look for 1 second of audio at -48 dB which joins with audio exceeding -48 dB and, if it exists, trim everything beyond that 1 second mark all the way to the beginning or end of the track. Put another way, Mp3splt will keep 1 second of "silence" padding at the beginning and end of the song as long as at least that much silence already exists (Mp3splt cannot add silence unfortunately) and assuming the silence threshold is set correctly.
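
Put together, the two parameters are handed to Mp3splt as one comma-separated value for -p, alongside -r which tells it to trim using silence detection. The sketch below only builds the arguments. The mp3splt_params helper and the mp3splt_args array are hypothetical names, not the ones used in MP3 Factory.

```bash
# Build the value for -p from a threshold in dB and a minimum
# silence length in seconds.
mp3splt_params() {
    local threshold=$1 min_silence=$2
    echo "th=${threshold},min=${min_silence}"
}

# -r : trim using silence detection
# -p : parameters for the silence detection
mp3splt_args=( -r -p "$(mp3splt_params -48 1)" )

# Usage (requires mp3splt):
#   mp3splt "${mp3splt_args[@]}" song.mp3
```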

So that was a rather long-winded explanation for one little option, but i thought it was important to expand on it because it's so annoying and such poor practice to chop off an audible slice of the music because the tools were not used correctly, not that i'm any sort of an expert mind you, but i'm learning.

iMp3spltErr (integer) : Set the maximum amount of data in KB that Mp3splt may trim before a warning is triggered.

If you set the silence detection threshold ( th ) value too high for Mp3splt (too close to 0), such as -30 for example, it's quite likely that it will trim a lot more than just silence. This option is a crude safety valve that will trigger a warning if Mp3splt trims more than this value in KB from the file.
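
A check like this can be as simple as comparing the file size before and after trimming. The following is a sketch only. The function names are hypothetical and MP3 Factory may measure the difference in another way.

```bash
# Size of a file in bytes (avoids the GNU-only 'stat -c %s').
file_bytes() { wc -c < "$1" | tr -d ' '; }

# Print how many KB were trimmed. Fails if that exceeds limit_kb,
# which is the point at which a warning would be shown.
check_trim() {
    local before=$1 after=$2 limit_kb=$3
    local trimmed_kb=$(( ( $(file_bytes "$before") - $(file_bytes "$after") ) / 1024 ))
    echo "$trimmed_kb"
    (( trimmed_kb <= limit_kb ))
}

# Usage:
#   check_trim original.mp3 trimmed.mp3 "$iMp3spltErr" >/dev/null ||
#       echo "Warning: Mp3splt trimmed more than ${iMp3spltErr} KB"
```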

aPrevFfplayOpt (array) : Set the options used by FFplay when previewing a file.

The -showmode value can be set to video, waves or rdft. The waves mode is perhaps the most sensible for previewing the audio after Mp3splt has trimmed it since this mode makes it easier to see if Mp3splt retained enough silence or trimmed too much.

suggested software

MP3 Factory does not depend on the following programs, however i readily recommend them.

Kwave Sound Editor (pkg. name: kwave): Installed with the KDE desktop, Kwave is a nice and simple sound editor that i used while developing MP3 Factory. If you need more power, try Audacity or Ardour.

MP3 Diags (pkg. name: mp3diags): A very comprehensive MP3 diagnostic and repair utility with an odd and less than intuitive interface, and a bit of a steep learning curve. You probably want the "unstable" version if you can get it to run. You could certainly use MP3 Diags in place of MP3 Factory, but you'll need to do some serious reading, configuring and experimentation. I used MP3 Diags while developing MP3F for testing purposes and to verify it was producing clean MP3s (it is).

puddletag (pkg. name: puddletag): Used to edit MP3 metadata and more, puddletag is a powerful and excellent drop-in replacement for the well-received Windows program, Mp3tag.

ShellCheck (pkg. name: shellcheck): If you're just getting into shell scripting, as i am, you really need to check this out. ShellCheck is a fairly polished script analysis tool that checks for errors, makes suggestions and links the errors it finds to its on-line database of solutions.

resources