Tidal

 


VFX Article, Chapter 3

3. Program 1, Harmonizer

The Harmonizer shifts the pitch of an audio signal, such as music or speech, up or down. The most widely known use of this technique is the novelty musical group and TV show, the "Chipmunks". The "Chipmunks", recorded in the early 1960s, used tape recorders to produce the up-pitch effect. Today, special effects applications, including this one, use Digital Signal Processing (DSP) technology to simplify the equipment and to expand the range of effects that one device can perform. With DSP, the same hardware can perform many different effects. This contrasts with analog equipment, which is usually much less flexible.

The principal algorithms performed by the DSP hardware for the Harmonizer are the Fast Fourier Transform (FFT) and the Inverse FFT (IFFT). These algorithms convert the audio signal from the time domain to the frequency domain and then back again. Figure 2 shows the original audio signal and the data at each stage of the process as its spectrum is shifted. Figure 2A plots the audio input versus time. Figure 2B shows the frequency spectrum of the audio signal from Figure 2A. Figure 2C shows the original spectrum on top and the up-shifted spectrum on the bottom. Finally, Figure 2D shows the original audio signal on top and the processed audio signal on the bottom. The bottom signal contains higher frequency components and would sound as if the person speaking had just breathed in from a helium-filled balloon.
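The FFT, shift, IFFT sequence described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the VFX firmware: the function names (`fft`, `ifft`, `shift_spectrum`, `harmonize_block`, `peak_bin`) are hypothetical, and it assumes the shift moves every positive-frequency bin up or down by a whole number of bins and rebuilds the mirrored half of the spectrum so that the output stays real.

```python
import cmath
import math


def fft(x, inverse=False):
    """Recursive radix-2 FFT; len(x) must be a power of two (e.g. 128)."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def ifft(spectrum):
    """Inverse FFT: conjugate-sign transform scaled by 1/N."""
    n = len(spectrum)
    return [v / n for v in fft(spectrum, inverse=True)]


def shift_spectrum(spectrum, bins):
    """Move positive-frequency bins by `bins` (assumed whole-bin shift)."""
    n = len(spectrum)
    half = n // 2
    shifted = [0j] * n
    shifted[0] = spectrum[0]  # leave the DC term where it is
    for k in range(1, half):
        dst = k + bins
        if 1 <= dst < half:  # bins shifted out of range are dropped
            shifted[dst] = spectrum[k]
            shifted[n - dst] = spectrum[k].conjugate()  # mirror half
    return shifted


def harmonize_block(block, bins):
    """FFT -> shift -> IFFT on one buffer of real samples."""
    spectrum = fft([complex(s) for s in block])
    return [v.real for v in ifft(shift_spectrum(spectrum, bins))]


def peak_bin(block):
    """Index of the strongest positive-frequency bin in a block."""
    spectrum = fft([complex(s) for s in block])
    mags = [abs(spectrum[k]) for k in range(1, len(block) // 2)]
    return 1 + mags.index(max(mags))


# A test tone that sits exactly on bin 10 of a 128-sample buffer.
tone = [math.sin(2 * math.pi * 10 * i / 128) for i in range(128)]
shifted_tone = harmonize_block(tone, 4)  # tone now sits on bin 14
```

A whole-bin shift keeps the example short. A practical harmonizer would also need to handle tones that fall between bins and the discontinuities at buffer boundaries, which this sketch does not attempt.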

The timing of the algorithm is described by Figure 3. The input signal from the microphone is sampled at 6.5 kHz. At that rate, Buffer #1 fills with 128 samples in 19.7 milliseconds. (This buffer length determines the pitch resolution, since the resolution in the frequency domain is the inverse of the buffer duration, or about 50.8 Hz.) The next 128 samples are then stored in Buffer #2, and the two buffers continue to be filled alternately. (Figure 4 indicates a double-throw switch to suggest the toggling from one buffer to the other.) While Buffer #2 is being filled, the VFX processor performs the harmonizing effect on Buffer #1: a 128-point FFT, then the shift, then the IFFT. The entire FFT/shift/IFFT sequence takes approximately 6 milliseconds, so all processing is finished well before the next buffer is full. This allows "real time" processing with a minimal two-buffer delay of about 39.4 msec between the time the MIC input arrives at the VFX processor and the time the output leaves through the speaker.
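The timing figures above follow directly from the sample rate and buffer size, and the short Python sketch below reproduces the arithmetic along with the alternating two-buffer fill. The constant and function names are hypothetical, chosen for illustration; only the 6.5 kHz rate, the 128-sample buffers, and the roughly 6 ms processing time come from the text.

```python
SAMPLE_RATE_HZ = 6500.0   # microphone sampling rate from the text
BUFFER_SAMPLES = 128      # samples per buffer from the text
PROCESS_MS = 6.0          # approximate FFT/shift/IFFT time from the text


def buffer_fill_ms(rate_hz=SAMPLE_RATE_HZ, n=BUFFER_SAMPLES):
    """Time to fill one buffer, in milliseconds."""
    return 1000.0 * n / rate_hz


def bin_resolution_hz(rate_hz=SAMPLE_RATE_HZ, n=BUFFER_SAMPLES):
    """Frequency spacing of FFT bins: the inverse of the buffer duration."""
    return rate_hz / n


def two_buffer_delay_ms(rate_hz=SAMPLE_RATE_HZ, n=BUFFER_SAMPLES):
    """Input-to-output latency when output lags input by two buffers."""
    return 2.0 * buffer_fill_ms(rate_hz, n)


def ping_pong(samples, n=BUFFER_SAMPLES):
    """Yield (buffer_index, block), filling buffers 0 and 1 alternately."""
    buffers = [[], []]
    active = 0
    for s in samples:
        buffers[active].append(s)
        if len(buffers[active]) == n:
            yield active, buffers[active]  # full block is ready to process
            active ^= 1                    # toggle to the other buffer
            buffers[active] = []           # start it empty


print(f"fill time  : {buffer_fill_ms():.2f} ms")
print(f"resolution : {bin_resolution_hz():.2f} Hz")
print(f"delay      : {two_buffer_delay_ms():.2f} ms")
print(f"headroom   : {buffer_fill_ms() - PROCESS_MS:.2f} ms per buffer")
```

The headroom line shows why the scheme works in real time: processing one buffer takes well under the time needed to fill the other.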

