Wednesday, April 23, 2008

Digital Audio Part II - EQ


I will continue the digital audio series with a post on EQ. EQ, short for equalization, is the alteration of particular frequencies of the audio signal in order to balance the overall sound or change it to personal preference. It is a widely misused and misunderstood term, hence this post. Most audio players have an EQ option nowadays. Some of them simply have presets, where the manufacturer has stored a value for each frequency and given the overall curve a title. The curve, as it is called, is the shape you see once each frequency has been set to a particular dB level, increasing or decreasing the gain at that frequency. Simply put, you are increasing or decreasing the volume of that particular frequency of sound.

The human ear can hear from 20 Hz to 20 kHz; some people can hear a bit further, but the average person stays within these limits. The whole frequency spectrum can be divided into three basic sections: bass, mid and treble. When you want to be more particular about isolating different aspects, you can categorize them as low bass, mid bass, upper bass/lower mids, mids, upper mids, treble and upper treble. The equalizer is basically split into several bars or sliders, each lying in one of these sections. The number of bars or sliders varies from player to player. Simple EQs have just 5, 7 or 9 bars. When you want more control, you can use a player that has more bars/sliders, such as Foobar. Professionals use dedicated EQ boxes per channel (left and right) and have 20-30 bars on each channel, or even more.

The image shows the equalizer in foobar. Notice how the frequency associated with each bar is written below it. This is the norm, and most equalizers state the frequency below each slider. As you can see, the frequencies extend all the way from 55 Hz to 20 kHz. The reason you usually don't see very low or very high frequencies (below 100 Hz or above 19 kHz) on most commercial EQs is that, while most instruments have some information at those frequencies, it is at a very low level and usually not audible to the average listener. Also, the kind of equipment you use determines how well such high or low frequencies are reproduced. Most people will find 100 Hz boomy enough for bass and 18 kHz piercing enough for treble. It is also important to note that compressed music formats such as mp3 discard most of the information at the extremes of the spectrum, the very high frequencies in particular. Hence, trying to increase the gain at 20 kHz for a 128 kbps mp3 file will not make any difference, since the file itself doesn't have any information at that frequency.

The best way to find out what each frequency does is to mess around with the gain on each slider. Starting from the lowest, push the gain all the way up, then bring it all the way down, and you will hear which aspect of the song or musical piece it enhances or weakens. All sliders start at 0 gain, in the middle. Push one up and you raise that frequency in decibel increments; pull it down and you lower it.
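
To put a number on what a slider step means, here is a small Python sketch. It assumes the standard definition of the decibel for signal amplitude (multiplier = 10^(dB/20)); the function name is made up for illustration and does not come from any particular player.

```python
def db_to_amplitude_ratio(gain_db: float) -> float:
    """Convert a slider gain in decibels to a linear amplitude multiplier."""
    # Standard amplitude relation: every 20 dB is a factor of 10.
    return 10 ** (gain_db / 20)


# Print the multiplier for a few typical slider positions.
for gain_db in (-6, -3, 0, 3, 6):
    print(f"{gain_db:+d} dB -> x{db_to_amplitude_ratio(gain_db):.2f}")
```

So a slider at +6 dB roughly doubles the amplitude of that band, and one at -6 dB roughly halves it.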

Popular manufacturers decided this was all too complicated for the casual user, so they got rid of the customizable options and just put presets on portable players such as the iPod or Zune. These are usually titled rock, pop, jazz, vocal, bass boost and so on.

Rock EQ basically has a slight bias toward bass. This is because most rock music is played with distorted guitars, which are heavy on lower mids and mid bass.

Pop EQ enhances the mids, brings out the treble a bit more as well, and keeps the bass intact. This is because pop music is usually very vocal-centric and uses synthesized sounds, which usually have a slight emphasis on treble.

Jazz EQ cuts down the treble and pushes the bass forward.
Vocal EQ, as the name suggests, pushes the mids and upper mids forward.
Bass boost, simply enough, pumps up the lower end.

My first piece of advice: if you have a customizable EQ, stay away from presets and build your own curve. The golden rule of EQ is to reduce the neighbouring frequencies instead of boosting the one you want to enhance. Say you want more bass. Instead of pushing the first few sliders up, pull the remaining sliders down. You lose a bit of volume, but the net effect is enhanced bass. The reason is that when you add gain to a frequency, the player amplifies the signal in software, and this degrades the sound. The best way to experience this directly is to push one of the mid frequencies all the way up. You will clearly hear that frequency distort and sound messy.
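
The "cut instead of boost" rule can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the slider values are invented, and `to_cut_only` is a hypothetical helper, not a feature of any player. It shifts the whole curve down so the highest slider lands on 0 dB, which keeps the shape of the curve but never asks the software to add gain.

```python
def to_cut_only(gains_db):
    """Shift an EQ curve down so that no slider sits above 0 dB.

    The relative shape of the curve is unchanged; only the overall level drops.
    Curves that are already cut-only are returned as they are.
    """
    # Highest boost in the curve (treated as 0 if nothing is boosted).
    peak = max(max(gains_db, default=0), 0)
    return [g - peak for g in gains_db]


# Hypothetical 9-band "more bass" setting done the boosting way.
bass_boost = [6, 6, 4, 0, 0, 0, 0, 0, 0]
print(to_cut_only(bass_boost))  # [0, 0, -2, -6, -6, -6, -6, -6, -6]
```

You then make up for the lost loudness with the volume knob on the amplifier or speakers rather than in the EQ.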

This is the same reason you should never use bass boost. Bass boost simply pumps up the gain of the lower frequencies, in effect adding distortion. If you listen closely while turning the volume up with bass boost on, the music will distort severely. Even though the rock and pop EQs do the same thing, they do not add enough gain to distort easily. So you are better off using the rock or pop EQ if all you have are presets.

As for how to set a customizable EQ, start by cutting down the mids. The V or U curve is the most commonly used curve, primarily because of the equipment the average person owns. Commercial speakers and stereo systems do not extend properly to both ends of the frequency spectrum, nor do they output a balanced level across the whole spectrum. Upper mids and treble are the easiest to reproduce, since a tweeter needs the least power to drive. In most cases, average speakers and headphones simply output the mids and upper mids clearly, and since this is where most instruments and vocals play a major role, the listener doesn't feel anything is lacking except bass. So they turn on bass boost, and in effect they just have a loud speaker or headphone that distorts the sound. Instead, if you cut down the mids that the speaker or headphone seems most comfortable with, you get a more balanced sound overall.

The best way to find out where to cut is to bring each frequency down by a few notches in turn and note which bar makes the biggest difference for the same -3 dB cut. Keep that as your center, the bottom of the V, and set the V curve around it. Say the speakers I use show the most difference at 1.2 kHz; I set that at what I feel is a comfortable level, say around -4 dB. The frequencies immediately on either side are then set at -3 dB, the next ones at -2 dB, and so on.
Note that this isn't a rule or a foolproof practice. It's just a start, and once you start messing around with the EQ yourself, you will find a curve that suits your equipment, your taste in music and your preferences in sound. But EQ can play a very significant role in how good music sounds to your ears, so take the time to set the curve to your liking and you will definitely enjoy music more.
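
The -4, -3, -2 stepping described above can be written out as a short Python sketch. The band list below is an assumption for illustration (an 18-band layout running from 55 Hz to 20 kHz), and `v_curve` with its parameters is a hypothetical helper, so treat the numbers as a starting point rather than a recipe.

```python
def v_curve(num_bands, center_index, center_db=-4.0, step_db=1.0):
    """Build a V-shaped cut: deepest at center_index, rising by step_db per band.

    Gains never go above 0 dB, so bands far from the center are left flat.
    """
    gains = []
    for i in range(num_bands):
        distance = abs(i - center_index)  # how many sliders away from the center
        gains.append(min(0.0, center_db + step_db * distance))
    return gains


# Assumed band centers in Hz, for illustration only.
bands_hz = [55, 77, 110, 156, 220, 311, 440, 622, 880,
            1200, 1800, 2500, 3500, 5000, 7000, 10000, 14000, 20000]

center = bands_hz.index(1200)  # the band that made the biggest difference
for freq, gain in zip(bands_hz, v_curve(len(bands_hz), center)):
    print(f"{freq:>6} Hz: {gain:+.0f} dB")
```

With these assumed bands, 1.2 kHz sits at -4 dB, 880 Hz and 1.8 kHz at -3 dB, and everything four or more sliders away stays at 0 dB.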

Tuesday, April 15, 2008

Digital Audio Part I

Over the next few days I am going to talk about digital audio, beginning with audio formats. The majority of people use MP3 files to store their music. An MP3 is basically a compressed version of the audio found on a CD, which is typically ripped to a WAVE file (.wav format). The wave file is a lossless format, meaning it holds all the information (albeit within the constraints of the CD standard, which I shall explain later), whereas the mp3 uses compression algorithms to make the file smaller. The resulting quality and size of an mp3 file are determined by its bitrate. You can choose almost any bitrate up to 320 kbps, the most popular ones being 128, 160 and 192 kbps.

The lower the bitrate, the lower the quality of the file and the smaller it is going to be. Due to bandwidth limitations, the internet adopted the lower bitrates for streaming content. However, there is little use for these bitrates outside of that. Compression uses several techniques to reduce the size of a file. One of them is to look for frequencies beyond a certain threshold, based on the bitrate, and remove them from the mp3 file, mostly at the top of the spectrum. For example, a 128 kbps file may remove all sounds above 16 kHz, while a 192 kbps file might do slightly better by removing only frequencies above 18 kHz. What does this mean to the end user? It's a trade-off: human hearing goes all the way to 20 kHz, and even 22 kHz for some people.
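
To see how bitrate translates into file size, here is a rough Python sketch. It assumes a constant bitrate and ignores tags and container overhead, and the four-minute song length is just an example; `mp3_size_mb` is a made-up helper name.

```python
def mp3_size_mb(bitrate_kbps, duration_seconds):
    """Approximate size in megabytes of a constant-bitrate file."""
    total_bits = bitrate_kbps * 1000 * duration_seconds
    return total_bits / 8 / 1_000_000  # bits -> bytes -> megabytes


song_seconds = 4 * 60  # an example four-minute song
for kbps in (96, 128, 160, 192, 320):
    print(f"{kbps} kbps: about {mp3_size_mb(kbps, song_seconds):.2f} MB")
```

Under these assumptions a four-minute song comes to roughly 3.84 MB at 128 kbps and 5.76 MB at 192 kbps, which is why the lower bitrates were so attractive for streaming.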

The mp3 file is going to miss out on the details of the higher frequencies. Take a cymbal, for example (the high frequency hits you hear from a drum kit). Its sound spreads across the higher end of the spectrum and rises and decays as an inverted V, so it looks like a mountain. Say that particular sample is spread between 11 kHz and 18 kHz. If you were to use 128 kbps compression, the right edge of the mountain after 16 kHz is going to get cut off. It's gone; you cannot retrieve that information again, and you are going to miss out on the full detail of that cymbal hit. Increase the bitrate of the mp3, however, and you are going to keep some more information. The same can happen at the very lowest frequencies, though to a much smaller degree.

But this isn't the only problem with mp3 files. There are also compression artifacts and other losses of detail. Compression artifacts occur when the algorithm finds the sound too complicated to fill in all the details at a given bitrate, so you end up hearing stuff that wasn't there in the original song. This is especially audible on lower bitrate files. Check out the Wikipedia page for a sample and you will know what I am talking about. Then there is the loss of information when several instruments or effects play simultaneously and one is significantly louder than the other. The louder effect or instrument gets priority, and the subdued one is all but gone in the resulting file. There have been some improvements to mp3 over the years, however; one of them is the LAME encoder.

This is a slightly smarter mp3 encoder which, at higher bitrates, retains most of the information. If you were to use LAME and encode a file at close to 200 kbps, it would be tough to spot differences between it and the original wav file unless you have high end equipment that reveals very subtle nuances in music. For the average listener, this is going to serve well. However, there is no use converting one mp3 to another: the information is already lost in the first mp3 file and cannot be recovered. Further compression and loss will occur when you convert it to mp3 again. If you want to use the LAME encoder, you have to get hold of the original CD and rip it again. These mp3 files are just like the others and will play on everything; only the technique used to compress them is different. Just google LAME and you can download the encoder.

The other useful development is variable bitrate mp3 files. Here you set a range of bitrates to use, and depending on the complexity of the musical passage, the bitrate is increased or reduced to store more or less information. These are better than regular mp3s for complex music with varying volume levels and several instruments playing. But personally, I stick to LAME at 195 kbps or above.

The other way to enjoy music the way it was originally intended is to use lossless formats such as FLAC, Apple Lossless (ALAC) and APE. These are all developed by different organizations, and all of them require their own codec (encoder/decoder) to create and play the files. The advantages of these formats are: 1. They are smaller than the original wav file. 2. They support tagging.
Wave files have little to no usable tag support, and most players show only the file name. So if you are maintaining a digital music library, you will not see any artist info, album info and so on for wav files, just file names. So unless you want to have 30-40 character file names for wav files with all the info, you are better off using one of the lossless compression methods. There isn't any specific advantage of one over the other.
The only advantage Apple Lossless has is that it is an Apple format and their players support it natively. To play FLAC files on most portable players you have to mess with the firmware, and only some devices support FLAC natively.

However, these formats have not picked up in the commercial segment, and legal downloads still sell mp3 files (bandwidth and storage space being the primary reasons). While that might work when you just want one song off an album, if you are ever planning on getting the whole album, get the CD and rip it yourself, or make sure you get either lossless music or at least a high bitrate above 192 kbps. There is some false advertising around mp3 stating that 160 kbps is CD quality and 192 kbps is better than CD quality. That is just plain wrong. For a direct comparison, the CD audio bitrate is 1411.2 kbps. A 160 kbps encode takes information away from the original file, approximates some of the details and creates a smaller version of the file.
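
For anyone wondering where 1411.2 kbps comes from, here is the arithmetic as a short Python sketch. The CD parameters used (44.1 kHz sample rate, 16 bits per sample, 2 channels) are the standard audio CD values, which are not spelled out in this post; the variable names are just for illustration.

```python
# Standard audio CD parameters (assumed here, not stated in the post).
SAMPLE_RATE_HZ = 44_100
BITS_PER_SAMPLE = 16
CHANNELS = 2

# Bits per second of uncompressed CD audio, expressed in kbps.
cd_kbps = SAMPLE_RATE_HZ * BITS_PER_SAMPLE * CHANNELS / 1000
print(f"CD audio: {cd_kbps} kbps")

# How many times smaller each mp3 bitrate is than the CD stream.
for mp3_kbps in (128, 160, 192, 320):
    print(f"{mp3_kbps} kbps mp3 is about {cd_kbps / mp3_kbps:.1f}x smaller than CD audio")
```

By this arithmetic, a so-called CD quality 160 kbps file carries less than an eighth of the data in the CD stream.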

These labels were coined to make internet streaming easier to explain. There is a noticeable quality drop when you step down from 160 kbps to 128 kbps, and from 128 kbps to 96 kbps. Hence, to help the average user understand the difference in quality, providers used familiar terms like CD quality, FM quality and tape quality to compare the three bitrates. While that is useful as a reference for how quality improves, the so-called CD quality 160 kbps is nowhere close to the actual CD itself. However, it loses relatively little that is audible, and especially on lower end audio equipment the differences are barely distinguishable. It is important to note that the quality of the file you listen to can only be judged based on what you use to listen to it. But that's a whole other topic. Next I shall talk about a different type of compression: the compression performed by audio engineers while mastering the tracks on an album.