Mid-side (MS) decomposition is an essential aspect of many audio effects. Several stereo image processing techniques incorporate MS as part of the effect. Additionally, many audio engineers use MS dynamic range compression and MS equalization.
In order to understand how the audio is processed in each case, it can be helpful to look at how MS decomposition works.
Conceptually, the purpose of MS decomposition is to separate a stereo (left/right) signal into a different, two-channel signal composed of “mid” and “sides” information.
This is accomplished by taking advantage of how stereo audio is created from a mono signal. When a mono signal is played back over stereo speakers, the signal is sent to both speakers, and panning changes the relative amplitude of the signal in each speaker.
When an audio engineer pans a signal to the left, they turn down the amplitude of the signal going to the right speaker while turning up the amplitude of the signal going to the left speaker. The opposite is true when panning to the right.
When a signal is panned to the center, the amplitude of the signal is identical in both speakers.
Creating the Sides Channel
To create the “sides” component of MS processing, the right channel can be subtracted from the left channel (sides = left – right). This order is a convention. Using the opposite order (sides = right – left) would produce the same signal with inverted polarity.
Therefore, signals that have identical amplitude in the “left” and “right” channels (i.e. those panned to the center) will be removed. In other words, the combined amplitude of center signals will be reduced to zero in the “sides.”
More completely, the “sides” component is created by also scaling the amplitude by a half: sides = 0.5 * (left – right). This reduction in amplitude will be necessary at the end when the “mid” and “sides” are recombined to recover new “left” and “right” signals.
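As a minimal sketch of the scaled difference described above, the following Python snippet computes the “sides” channel for two short test signals. The function name `sides_channel` and the sample values are assumptions made for illustration, and plain lists stand in for real audio buffers.

```python
def sides_channel(left, right):
    """Return 0.5 * (left - right) for each pair of samples."""
    return [0.5 * (l - r) for l, r in zip(left, right)]


# A center-panned signal: identical samples in both channels.
center_left = [0.5, -0.25, 0.125]
center_right = [0.5, -0.25, 0.125]

# A signal panned fully left: the right channel is silent.
hard_left = [0.5, -0.25, 0.125]
hard_right = [0.0, 0.0, 0.0]

print(sides_channel(center_left, center_right))  # [0.0, 0.0, 0.0]
print(sides_channel(hard_left, hard_right))      # [0.25, -0.125, 0.0625]
```

The center-panned signal cancels completely, while the hard-left signal survives at half its amplitude.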
Creating the Mid Channel
If the “sides” channel is created by taking the difference between “left” and “right,” the “mid” channel is created by taking the sum of “left” and “right,” or mid = 0.5*(left + right).
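The two formulas can be combined into a single encoding step. The sketch below assumes equal-length lists of samples, and the name `ms_encode` is a hypothetical choice for this example, not an established API.

```python
def ms_encode(left, right):
    """Encode left/right samples into (mid, sides).

    mid   = 0.5 * (left + right)
    sides = 0.5 * (left - right)
    """
    mid = [0.5 * (l + r) for l, r in zip(left, right)]
    sides = [0.5 * (l - r) for l, r in zip(left, right)]
    return mid, sides


left = [1.0, 0.5, -0.5]
right = [1.0, -0.5, 0.0]

mid, sides = ms_encode(left, right)
print(mid)    # [1.0, 0.0, -0.25]
print(sides)  # [0.0, 0.5, -0.25]
```

The first sample is identical in both channels, so it appears only in the mid. The second sample has opposite polarity in each channel, so it appears only in the sides.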
One thing that can be confusing is that the “mid” signal actually contains signals that are panned completely left or completely right. These components are not ‘canceled out’. Rather, signals panned left or right will be relatively lower in amplitude than signals panned to the center.
This result occurs because panning functions are typically intended to maintain constant power across the stereo field (not constant amplitude).
To accomplish constant power, signals panned to the center have a higher summed amplitude. When a mono signal is panned to center, it is played through both speakers. The amplitude of the signal through each individual speaker is scaled to 0.707 times the original amplitude (the square root of 0.5).
When the left and right channels are added together, the amplitude of a center signal is 1.414 times the original mono amplitude (about 3 dB higher). A signal panned completely to one side is present in only one channel, so its sum is just the original amplitude.
Therefore, the “mid” channel in MS decomposition can have a higher amplitude for center information than side information.
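To make the amplitude difference concrete, here is a short sketch that pans a unit-amplitude sample and then computes the mid and sides values. It assumes a sine/cosine pan law, which is one common way to achieve constant power. The text above only specifies the 0.707 center gain, so the rest of the curve and the name `constant_power_pan` are illustrative assumptions.

```python
import math


def constant_power_pan(sample, position):
    """Pan a mono sample with an assumed sine/cosine law.

    position: 0.0 = hard left, 0.5 = center, 1.0 = hard right.
    Returns (left, right) such that left**2 + right**2 == sample**2
    up to floating-point rounding.
    """
    angle = position * math.pi / 2
    return sample * math.cos(angle), sample * math.sin(angle)


for name, position in [("hard left", 0.0), ("center", 0.5), ("hard right", 1.0)]:
    left, right = constant_power_pan(1.0, position)
    mid = 0.5 * (left + right)
    sides = 0.5 * (left - right)
    print(f"{name:>10}: L={left:.3f} R={right:.3f} mid={mid:.3f} sides={sides:.3f}")
```

Under this assumed pan law, the center position gives a mid value near 0.707, while the hard-panned positions give 0.5. That is a ratio of 1.414, or about 3 dB.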
After the “mid” and “sides” channels have been decomposed (or encoded), each can be processed independently. This might mean adding equalization just to the “sides” or compression just to the “mid.”
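As a rough illustration of independent processing, the sketch below turns down only the “sides” channel and leaves the “mid” untouched. A plain gain stands in for the equalizer or compressor mentioned above. The function name `apply_gain_db`, the sample values, and the -6 dB setting are all assumptions chosen for this example.

```python
def apply_gain_db(channel, gain_db):
    """Scale every sample by a gain given in decibels."""
    gain = 10 ** (gain_db / 20)
    return [gain * x for x in channel]


# Example mid and sides samples (illustrative values only).
mid = [0.5, 0.25, -0.5]
sides = [0.1, -0.2, 0.05]

processed_mid = mid                           # mid passes through unchanged
processed_sides = apply_gain_db(sides, -6.0)  # sides reduced by about half

print(processed_mid)
print(processed_sides)
```

Reducing the sides relative to the mid narrows the stereo image, and raising them widens it.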
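Finally, new “left” and “right” channels must be recovered (or decoded) to play the signals over stereo loudspeakers.
The left channel can be recovered by adding the two channels together (left = mid + sides). The right channel can be recovered by subtracting the sides from the mid (right = mid – sides). Because both channels were scaled by 0.5 during encoding, these operations return the original left and right signals exactly when the mid and sides are left unprocessed.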
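The full round trip can be checked with a small sketch. The names `ms_encode` and `ms_decode` are hypothetical, and the sample values were picked so that the floating-point arithmetic is exact.

```python
def ms_encode(left, right):
    """Encode left/right samples into (mid, sides), each scaled by 0.5."""
    mid = [0.5 * (l + r) for l, r in zip(left, right)]
    sides = [0.5 * (l - r) for l, r in zip(left, right)]
    return mid, sides


def ms_decode(mid, sides):
    """Decode (mid, sides) back into (left, right)."""
    left = [m + s for m, s in zip(mid, sides)]
    right = [m - s for m, s in zip(mid, sides)]
    return left, right


left = [1.0, 0.5, -0.5]
right = [1.0, -0.5, 0.0]

mid, sides = ms_encode(left, right)
new_left, new_right = ms_decode(mid, sides)

print(new_left == left, new_right == right)  # True True
```

With arbitrary audio samples, the decoded values may differ from the originals by a tiny rounding error, so a tolerance-based comparison is safer than strict equality.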