Tuesday, April 15, 2008

Digital Audio Part I

Over the next few days I am going to talk about digital audio, beginning with audio formats. The majority of people use MP3 files to store their music. An mp3 is basically a compressed version of the audio found on a CD, which is uncompressed PCM audio and is usually ripped to a WAVE file (.wav format). The wave file is lossless, meaning it holds all the information (albeit within the constraints of the CD standard, which I shall explain later), whereas the mp3 uses lossy compression algorithms to make the file smaller. The resulting quality and size of an mp3 file are determined by its bitrate. You can choose from a range of bitrates up to 320kbps, the most popular ones being 128, 160 and 192kbps.
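As a rough illustration of how bitrate drives file size, here is a small Python sketch. The function name and the 4 minute song length are my own assumptions for the example, and it ignores tags and container overhead, so treat the numbers as ballpark figures.

```python
def mp3_size_mb(bitrate_kbps, duration_seconds):
    """Approximate size in megabytes of a constant-bitrate file.

    bitrate_kbps is in kilobits per second (1 kbps = 1000 bits/s);
    the result ignores tags and header overhead.
    """
    total_bits = bitrate_kbps * 1000 * duration_seconds
    total_bytes = total_bits / 8
    return total_bytes / 1_000_000


# A hypothetical 4 minute (240 second) song at the popular bitrates.
for kbps in (128, 160, 192, 320):
    print(f"{kbps} kbps: about {mp3_size_mb(kbps, 240):.2f} MB")
```

Size grows in direct proportion to bitrate, so a 320kbps file is two and a half times the size of the 128kbps version of the same song.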

The lower the bitrate, the lower the quality of the file and the smaller it is going to be. Due to bandwidth limitations, the internet adopted the lower bitrates as the choice for streaming content. However, there is little use for these bitrates outside of that. Compression uses several techniques to reduce the size of a file. One of them is that the encoder looks for frequencies above and below certain thresholds, chosen based on the bitrate, and removes them from the mp3 file. For example, a 128kbps file may remove all sounds above 16kHz, while a 192kbps file might do slightly better by removing only frequencies above 18kHz. What does this mean to the end user? It's a trade-off: human hearing extends to around 20kHz, and possibly a little higher for some people.

The mp3 file is going to miss out on the details of the higher frequencies. Take a cymbal for example (the high frequency hits you hear in a drum kit). Its sound is spread across the higher end of the spectrum, rising and then decaying like an inverted V, so that it looks like a mountain. Say that particular sample is spread between 11kHz and 18kHz. If you were to use 128kbps compression, the right edge of the mountain above 16kHz is going to get cut off. It's gone; you cannot retrieve that information again, and you are going to miss out on the full detail of that cymbal hit. Increase the bitrate of the mp3, however, and you are going to keep some more of that information. The same is true at the lower frequencies as well.
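To put a rough number on the cymbal example, the sketch below treats the "mountain" as an idealised triangle between 11kHz and 18kHz, peaking in the middle, and measures how much of its area sits above a cutoff. The triangular shape, the function names and the use of area as a stand-in for "detail" are all simplifying assumptions of mine; a real cymbal spectrum and a real encoder are far messier.

```python
def triangle_height(f_khz, low=11.0, high=18.0):
    """Height of an idealised inverted-V spectrum peaking midway between low and high."""
    peak = (low + high) / 2
    if f_khz <= low or f_khz >= high:
        return 0.0
    if f_khz <= peak:
        return (f_khz - low) / (peak - low)
    return (high - f_khz) / (high - peak)


def fraction_lost(cutoff_khz, low=11.0, high=18.0, steps=100_000):
    """Share of the triangle's area above cutoff_khz, by midpoint-rule integration."""
    width = (high - low) / steps
    total = 0.0
    lost = 0.0
    for i in range(steps):
        f = low + (i + 0.5) * width
        h = triangle_height(f, low, high)
        total += h
        if f > cutoff_khz:
            lost += h
    return lost / total


# Compare the 16kHz cutoff with the gentler 18kHz one.
print(f"cut at 16 kHz: {fraction_lost(16.0):.1%} of the area removed")
print(f"cut at 18 kHz: {fraction_lost(18.0):.1%} of the area removed")
```

Under these assumptions, a 16kHz cutoff removes about 16% of the triangle (exactly 8/49), while an 18kHz cutoff leaves it untouched.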

But this isn't the only problem with mp3 files. There are also compression artifacts and other losses of detail. Compression artifacts occur when the algorithm finds the sound too complicated to capture in full at a given bitrate, so you end up hearing things that weren't there in the original song. This is especially audible in lower bitrate files. Check out the Wikipedia page for a sample and you will know what I am talking about. Then there is the loss of information when instruments or effects play simultaneously and one is significantly louder than the other. The louder effect or instrument gets priority, and the subdued one is all but gone in the resulting file. There have been some improvements to mp3 over the years, however; one of them is the LAME encoder.

This is a slightly smarter mp3 encoder which, at higher bitrates, retains most of the information. If you were to use LAME and encode a file at close to 200kbps, it would be tough to spot differences between it and the original wav file unless you have high end equipment that reveals very subtle nuances in music. For the average listener, this is going to serve well. However, there is no use converting one mp3 to another: the information is already lost in the first mp3 file and cannot be recovered, and further loss will occur when you compress it to mp3 again. If you want to use the LAME encoder, you have to get hold of the original CD and rip it again. The resulting mp3 files are just like any others and will play on everything; only the technique used to compress the file is different. Just google LAME and you can download the encoder.

The other useful development is variable bitrate (VBR) mp3 files. Here you set a range of bitrates to use, and depending on the complexity of the musical passage, the bitrate is increased or reduced to store more or less information. These are better than regular mp3s for complex music with varying volume levels and several instruments playing. But personally, I stick to LAME at 195kbps or above.
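The idea behind variable bitrate can be sketched in a few lines of Python. This is only a toy model: the "complexity" score between 0 and 1, the function names and the default 128 to 256kbps range are assumptions made up for the illustration, and real encoders such as LAME decide per-frame bitrates with a far more involved psychoacoustic analysis. The list of bitrates is the standard set allowed for MPEG-1 Layer III files.

```python
# Standard MPEG-1 Layer III bitrates in kbps.
MP3_BITRATES = [32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320]


def pick_bitrate(complexity, low=128, high=256):
    """Map a hypothetical complexity score in [0, 1] to an allowed bitrate in [low, high]."""
    allowed = [b for b in MP3_BITRATES if low <= b <= high]
    complexity = min(max(complexity, 0.0), 1.0)  # clamp out-of-range scores
    index = round(complexity * (len(allowed) - 1))
    return allowed[index]


def average_bitrate(complexities, low=128, high=256):
    """Average bitrate over equal-length passages with the given complexity scores."""
    rates = [pick_bitrate(c, low, high) for c in complexities]
    return sum(rates) / len(rates)


# A quiet intro, a busy chorus, and a moderate verse (made-up scores).
passages = [0.0, 1.0, 0.5]
print([pick_bitrate(c) for c in passages])
print(average_bitrate(passages))
```

Simple passages get fewer bits and busy ones get more, so the file averages out somewhere inside the chosen range rather than spending the same amount on every second of music.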

The other way to enjoy music the way it was originally intended is to use lossless formats such as FLAC, ALAC (Apple Lossless) and APE. These were all developed by different organizations, and each requires its own codec (encoder/decoder) to create and play the files. The advantages of these formats are: 1. They are smaller than the original wav file. 2. They support tagging.
Wave files have only very limited tagging, which few players read, so in practice they are identified by file name alone. If you are maintaining a digital music library, you will not see any artist info, album info and so on for wav files, just file names. So unless you want 30-40 character file names holding all that info, you are better off using one of the lossless compression formats. There isn't any specific quality advantage of one over the other.
The only advantage ALAC has is that it's an Apple format and their players natively support it. To play FLAC files on those players you have to mess with the firmware, and only some other devices natively support FLAC.

However, these formats have not caught on in the commercial segment, and legal download stores still sell mp3 files (bandwidth and storage space being the primary reasons). While that might work when you just want one song off an album, if you are ever planning on getting the whole album, get the CD and rip it yourself, or make sure you get either lossless music or at least a high bitrate of 192kbps or above. There is some false advertising around mp3 stating that 160kbps is CD quality and 192kbps is better than CD quality. That is just plain wrong. For a direct comparison, the CD audio bitrate is 1411.2kbps. A 160kbps mp3 discards information from the original file, approximates some of the details and creates a smaller version of the file.
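The 1411.2kbps figure follows directly from the CD audio parameters: 44,100 samples per second, 16 bits per sample, 2 channels. A short Python sketch of the arithmetic, with function names of my own choosing:

```python
def pcm_bitrate_kbps(sample_rate_hz, bits_per_sample, channels):
    """Bitrate of uncompressed PCM audio in kilobits per second."""
    return sample_rate_hz * bits_per_sample * channels / 1000


# CD audio: 44.1 kHz sample rate, 16-bit samples, stereo.
cd_kbps = pcm_bitrate_kbps(44_100, 16, 2)
print(f"CD audio: {cd_kbps} kbps")

# How many times smaller each mp3 bitrate is than the CD stream.
for mp3_kbps in (160, 192):
    print(f"{mp3_kbps} kbps mp3 is about {cd_kbps / mp3_kbps:.2f}x smaller")
```

A 160kbps mp3 therefore carries less than an eighth of the data of the CD stream, which is why calling it "CD quality" cannot be literally true.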

These terms were coined to make things easy for internet streaming. There is a noticeable quality drop when you step down from 160kbps to 128kbps, and again from 128kbps to 96kbps. Hence, to help the average user understand the difference in quality, streaming services used familiar terms like CD quality, FM quality and tape quality for the three bitrates. While they are useful as a reference for how quality improves, the so-called CD quality 160kbps is nowhere close to the actual CD itself. However, it loses relatively little audible information, and especially on lower end audio equipment the differences are barely distinguishable. It is important to note that the quality of a file can only be judged relative to the equipment you use to listen to it. But that's a whole other topic. Next I shall talk about a different type of compression: the compression performed by audio engineers while mastering the tracks on an album.

1 comment:

Radioactive Android said...

Nice informative post [:)]