EE2.LabA: L2 Digital Audio Synthesis and Effects
Teams: 2 members per group
Software: Pure Data (pd), v. 0.39 or later
Timetable: Two weeks allocated. If you have not reached the end of Section 3 by the end of the 1st session, then continue from Section 4 in the 2nd session.
Aims of the Experiment
- To understand and implement several principles of digital audio synthesis and effects: sinusoidal additive synthesis, vibrato, FM synthesis, ADSR/ADSHR envelopes, noise shaping, stereo-delay effects, and single-tap and multiple-tap delay lines.
- To construct the DSP component of a note synthesizer based mainly upon FM synthesis, with an attached delay-effects unit.
- To provide exposure to patcher languages such as Max/MSP, jMax, and pd used for audio/video processing, and specifically, to familiarise oneself with the graphical programming environment Pure Data (pd).
Required reading
- "Real Sound Synthesis for Interactive Applications", Perry R. Cook, A K Peters, Ltd.: Natick, MA, 2002.
- ADSR Envelopes - sect 2.1
- Sinusoidal additive synthesis - sect 4.4
- Wave shaping synthesis and FM synthesis - sect 10.3
- "DAFX - Digital Audio Effects", Udo Zölzer (Ed.), John Wiley & Sons, 2002.
1. Overview
This laboratory aims to implement some well-known digital audio synthesis methods and effects algorithms, including additive and FM synthesis, stereo-widening using delays, and single-tap and multiple-tap delay lines. Patches (programs) will be built for
each of these synthesis/effects units in the graphical programming environment Pure Data (pd), using simple objects such as oscillators, delay lines and DACs. Pd allows real-time control of processing parameters, and the experiment should also provide an opportunity
to explore creative uses of digital synthesis/effects. NB! A large number of related audio example patches exist in the pd Browser; these can be used to gain familiarity with the pd environment. Help for any pd object can be obtained by cmd-clicking the object and selecting Help. A useful glossary and summary of objects can be found online.
2. Preparation
Fig 1: Square wave
The preparation work covers the material for both weeks of the lab and will be assessed at the beginning of the first session.
- Visit the pd community site, http://puredata.info, and read its introductory guide for an overview of Pure Data. Further tutorial overviews are available on the same site.
- Derive an analytical expression for the Fourier series (i.e. frequencies and amplitudes of harmonics) of the square wave in fig. 1 with period T (start from the general expression for a Fourier Series, and find the values of the coefficients).
- Sketch the power spectrum for a sinusoid oscillating at f Hz, where f< f_s/2, labelling f and f_s. For an N point FFT, write down an expression for the width of each FFT bin, and comment on the implications of N on the resolution/performance of the transform.
- By expanding the expression for the amplitude modulated (AM) signal below, simplify it into a sum of pure sinusoidal components (single sine or cosine terms only). What are their frequencies and amplitudes (for a_AM non-zero)? You will need to use the trigonometric identity for cosAcosB.
y[t] = A (1+m(t)) cos(2π f_c t) = A (1+a_AM cos (2π f_AM t)) cos(2π f_c t)
- In frequency modulation (FM) synthesis:
y[t] = a_c cos( 2π ∫ [f_c + a_m cos(2π f_m t)] dt )
sinusoidal components are generated at frequencies, |f_c + n*f_m|, where
n is any integer, f_c is the carrier frequency, and f_m is the modulator frequency. A harmonic spectrum (i.e. where all components are integer multiples of a fundamental frequency component) clearly results when the ratio
f_c:f_m = 1:1. At what other ratios of f_c:f_m will a harmonic spectrum result?
- Sketch an ADSHR envelope, labelling each section and describing its influence on a note.
3. Experiments
3.1 Sinusoidal Additive synthesis
- Using the Fourier series of the square wave determined from your preparation, construct an approximation to a square wave signal generator in pd, by additive synthesis using the first 10 harmonics of the square wave.
- Verify the design using the object tabwrite~ to graph the output waveform. Illustrate the square wave approximation when using only the first 1, 3 and 10 sinusoidal components, respectively. Sketch these in your lab book.
- Include a GUI box (e.g. slider) for the fundamental frequency which varies across the audible frequency range (20 Hz to 20 kHz).
[Hint: You may need to reset the phase of the oscillators using the cold inlet of the osc~ object.]
- Incorporate the square wave synthesiser into a pure data subpatch, with fundamental frequency as an input, and the resulting audio as an output.
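The additive construction above can be checked numerically. The sketch below (plain Python, not pd — the function names and sample values here are illustrative, not part of the lab) sums the first few odd harmonics of a unit square wave with amplitudes 4/(πk), which is what each osc~ in the patch contributes:

```python
import math

def square_partial_sum(t, f0, n_harmonics):
    """Additive-synthesis approximation of a unit square wave with
    fundamental f0: sum the first n odd harmonics, each scaled 4/(pi*k)."""
    total = 0.0
    k = 1
    for _ in range(n_harmonics):
        total += (4 / (math.pi * k)) * math.sin(2 * math.pi * k * f0 * t)
        k += 2  # a square wave contains odd harmonics only
    return total
```

Evaluating at the waveform peak (t = T/4) with 1, 3 and 10 harmonics shows the sum converging towards 1, with the Gibbs ripple you should also see on the tabwrite~ graph.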
3.2 Amplitude/frequency modulation and vibrato
Fig 2: Frequency vibrato
Vibrato is a slight undulation in pitch and/or variation in intensity of sound, at a rate of around 4 to 10 Hz for most acoustic instruments. In acoustic instruments, it is brought about by rapid movements affecting the excitation mechanism (e.g. vibrating
the position of a finger on a fretboard, or for some wind instruments, varying the air pressure within the lungs using the diaphragm). In acoustic instruments, vibrato is generally a combination of amplitude and frequency modulation. In frequency vibrato (see
fig. 2), the frequency of the tone can be heard to sweep up and down repetitively.
- Open a new patch, and create the square wave synthesiser as an object within it. Incorporate a GUI box that allows variable control (vibrato rate and frequency sweep) of the amount of frequency vibrato applied to the square wave.
- Set the fundamental frequency of the square wave to 400 Hz, and the frequency sweep to around a tenth of this. By increasing the vibrato rate/modulation frequency from zero, at what frequency do you begin to perceive a 'roughness' effect?
Vibrato can also be induced by amplitude modulation. This is often heard in wind instruments (e.g. flute) and voice (although it is not always a sign of good playing technique!)
- Similarly, incorporate a GUI box for controlling the amount of amplitude modulation (AM) to be applied to the output of the square wave generator. AM of a simple sinusoid is defined as:
y[t] = A (1+m[t]) cos(2π f_c t) = A (1+a_AM cos (2π f_AM t)) cos(2π f_c t)
where m[t] is the modulator, a_AM is the modulation index, and f_AM is the modulator frequency.
- Plot the waveform of the AM signal with the settings: f_AM = 10 Hz, f_c = 100 Hz, and illustrate your observations.
- Plot the power spectrum of the AM signal, and see whether your observations agree with the theoretical result you predicted in your preparation. You should see additional frequency components equally-distant above and below each harmonic of the square wave.
Make a rough estimate of the frequency difference between f_c and the side components. What does this correspond to?
[Hint: You will need to calculate the real FFT of the signal using rfft~, then square the result. The object
block~ can be used to set the FFT size (4096 samples is advisable at a sampling rate of 44.1 kHz).]
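As a cross-check on the measurement, the sideband positions can be reproduced numerically. The sketch below (Python rather than pd; the sample rate, window length and naive DFT are illustrative assumptions, standing in for rfft~ and block~) builds the AM signal for a plain sinusoidal carrier and picks out the strongest spectral bins:

```python
import cmath
import math

fs, N = 500, 500                     # assumed sample rate and window length (1 s)
f_c, f_AM, a_AM = 100.0, 10.0, 0.5   # illustrative carrier/modulator settings
x = [(1 + a_AM * math.cos(2 * math.pi * f_AM * n / fs))
     * math.cos(2 * math.pi * f_c * n / fs) for n in range(N)]

def dft_power(x, k):
    """Power in DFT bin k (naive DFT; bin width = fs/N = 1 Hz here)."""
    X = sum(x[n] * cmath.exp(-2j * math.pi * k * n / len(x))
            for n in range(len(x)))
    return abs(X) ** 2

# The three strongest bins are the carrier and the two AM sidebands:
peaks = sorted(range(N // 2), key=lambda k: dft_power(x, k), reverse=True)[:3]
```

The three peaks land at f_c and f_c ± f_AM (here 90, 100 and 110 Hz), with each sideband a_AM/2 times the carrier amplitude — the same pattern you should see around every harmonic of the square wave.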
3.3 FM synthesis
The basic concept of frequency modulation (FM) synthesis is much the same as frequency vibrato. The center frequency of the carrier,
f_c, is modulated at a frequency, f_m, and by an amount, a_m, to yield the FM signal:
y[t] = a_c cos( 2π ∫ [f_c + a_m cos(2π f_m t)] dt )
However, whereas vibrato is characterised by a modulator frequency of around 10 Hz, at much higher rates the modulation becomes too rapid to discern, and the effect becomes timbral instead. This was discovered by John Chowning at Stanford University around 1967, was later patented, and became the trademark sound of the hugely successful Yamaha DX7 synthesizer in the 1980s. In FM synthesis, the carrier and modulator frequencies are usually set to be equal, or at least of a similar order.
We define the modulation index as:
B = a_m / f_m
- Create a patch for FM synthesis with equal carrier and modulator frequencies. Illustrate your observations of the amplitude spectrum of the resulting signal at each of the following values of the modulation index: 0, 0.5, 1.
One should notice that at non-zero values of the modulation index, harmonic spectra are produced with a fundamental frequency,
f_c. In general, FM introduces frequency components at frequencies: |f_c + n*f_m|, where n is any integer value (at negative values of f_c + n*f_m the component is effectively an inverted positive frequency component). Hence, when f_c = f_m, harmonics exist at f_c, 2 f_c, 3 f_c, ...
The amplitudes of the harmonics are dependent on
a_m, and can be derived in a non-trivial manner by expanding the Fourier series of the FM signal, involving Bessel functions. As a general rule though, by keeping
f_m constant whilst increasing the modulation index, the amplitude of the sidebands increases, but the frequencies of the sidebands do not change. As a rule of thumb, Carson's rule states that nearly all (~98%) of the power of a frequency-modulated signal lies within a bandwidth B_T = 2(f_Δ + f_m), where f_Δ is the peak deviation of the instantaneous frequency f(t) from the centre carrier frequency, f_c, and f_m is the highest modulating frequency.
- Harmonic spectra (i.e. where frequency components occur at integer multiples of a certain fundamental frequency) occur when
f_c:f_m = 1:1. At what other integer ratios would harmonic spectra occur (preparation)?
- Design a patch for FM synthesis which allows independent control of carrier frequency and modulation index, where the ratio
f_c:f_m can be selected between different harmonic ratios determined above. Verify that the resulting amplitude spectrum of the signal is harmonic.
- Bell-like and other sounds can be obtained using f_c:f_m ratios that do not produce harmonic spectra. Explore some of these possibilities by allowing independent control of carrier and modulator frequencies.
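The harmonic-spectrum claim can be verified numerically. The sketch below (Python, not pd; the sample rate and the 1:1 ratio are illustrative assumptions) evaluates the integrated form of the FM equation — the phase works out to 2π f_c t + B sin(2π f_m t) with B = a_m / f_m — and inspects individual DFT bins:

```python
import cmath
import math

fs, N = 500, 500                  # assumed sample rate / window length (1 s)
f_c = f_m = 50.0                  # 1:1 carrier:modulator ratio -> harmonic
B = 1.0                           # modulation index B = a_m / f_m

# Integrating f_c + a_m*cos(2*pi*f_m*t) gives f_c*t + (a_m/f_m)*sin(...)/(2*pi),
# so the FM signal is cos(2*pi*f_c*t + B*sin(2*pi*f_m*t)):
y = [math.cos(2 * math.pi * f_c * n / fs
              + B * math.sin(2 * math.pi * f_m * n / fs)) for n in range(N)]

def bin_mag(x, k):
    """Magnitude of DFT bin k (bin width = fs/N = 1 Hz here)."""
    return abs(sum(x[n] * cmath.exp(-2j * math.pi * k * n / len(x))
                   for n in range(len(x))))
```

With f_c = f_m = 50 Hz, energy appears only at multiples of 50 Hz (the Bessel-function amplitudes J_n(B)); a bin between harmonics, such as 75 Hz, is essentially empty.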
4. Effects
4.1 Single-tap delays
Fig 3: Single-tap feedforward delay line
A wide range of effects based upon delays can be implemented in pd. A delay line is basically a circular buffer in memory, which is sequentially filled at regular sampling intervals with sample values. A write pointer points to the location in memory where
the current sample is to be stored. When the end of the buffer is reached, the write pointer returns to the start of the buffer and begins to overwrite old samples. Samples can be read from the buffer using one or more
taps or read pointers at specified lags behind the write pointer (i.e. a tap will read samples that are n samples old if it points to a location n memory locations behind the write pointer). In pd,
delwrite~ can be used to allocate memory for a delay line, and delread~ creates a tap from the delay line (more than one tap can be created using multiple instances of
delread~).
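The circular-buffer mechanics described above can be made concrete in a few lines. This is a minimal Python sketch of what delwrite~/delread~ do internally (the class and method names are invented for illustration; pd's actual implementation differs):

```python
class DelayLine:
    """Minimal circular-buffer delay line: one write pointer,
    taps read a fixed number of samples behind it."""

    def __init__(self, size):
        self.buf = [0.0] * size
        self.w = 0                                 # write pointer

    def write(self, sample):
        self.buf[self.w] = sample
        self.w = (self.w + 1) % len(self.buf)      # wrap back to the start

    def tap(self, delay):
        """Read the sample written `delay` steps ago (delay=0 -> newest)."""
        return self.buf[(self.w - 1 - delay) % len(self.buf)]
```

Writing the samples 0..9 into a 16-slot line, tap(0) returns 9 (the newest sample) and tap(3) returns 6 — i.e. each tap reads n samples behind the write pointer, exactly as described above.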
Fig. 4: Amplitude response of a comb-filter
We are going to create some simple digital audio effects using single-tap finite impulse response (FIR) filters (see fig. 3). The difference equation and transfer function for a single-tap FIR comb filter are:
y[n] = x[n] + a x[n-M]
H[z] = 1 + a z^{-M}
where M is the delay length in samples, and a is the amplitude of the delayed signal relative to
x[n]. For a particular value of M, with the addition of the signal to a delayed version of itself, some frequency components will constructively interfere and others will destructively interfere. We end up with a frequency response similar
to that shown in fig. 4 for both a = -1 and a = 1.
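The interference pattern follows directly from the transfer function H[z] = 1 + a z^{-M}. A small Python sketch (illustrative only; the sample rate fs = 8000 Hz and M = 40 samples, i.e. a 5 ms delay, are assumed values) evaluates the magnitude response on the unit circle:

```python
import cmath
import math

def comb_mag(f, a, M, fs):
    """|H(e^{jw})| = |1 + a*e^{-jwM}| for the single-tap FIR comb filter."""
    w = 2 * math.pi * f / fs
    return abs(1 + a * cmath.exp(-1j * w * M))
```

With a 5 ms delay (M = 40 at fs = 8000 Hz), a = +1 gives peaks of gain 2 at multiples of 200 Hz and nulls halfway between them; a = -1 swaps the peaks and nulls, matching the two curves in fig. 4.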
For each of the following experiments in sections 4.1 and 4.2, note down
- The block diagram for the effect
- A description of how the effect modifies the sine tone
- The effect of varying each delay parameter (eg. delay centre, modulation amount, modulation speed, number of taps)
Build the following effects:
- Construct a single-tap FIR delay line. Use a single oscillator with an adjustable frequency, f, as an input, and measure the amplitude responses of the delay line for
0 < f < 1 kHz, with a delay time of 5 ms and a = ±1. Do these graphs agree with your predictions?
[Hint: The object env~ can be used to measure the RMS signal amplitude in dB over a window length that is a power of 2 samples long. Try using
env~ 8192.]
Now explore the effects obtained by adjusting the following delay parameters:
- If the delay is set to between 10 and 25 ms, a doubling or slapback effect is obtained, i.e. it sounds as if two identical instruments are playing in unison. The strength of the slapback is controlled by a.
- If the delay is increased to 50 ms or more, an echo is heard. It is common in music production to match the echo time to the tempo of the song, so that echoes always fall on the beat. Calculate an appropriate echo time to match a tempo of 100 BPM (beats-per-minute).
- Now create a flanger by using an oscillator to modulate the delay time between around 0 and 10 ms, at a frequency of around 1 Hz.
[Hint: A 4-point interpolating delay tap is implemented in the object
vd~, which allows fractional delays to be obtained, thereby creating a smoother effect than if taps at only integer sample values were allowed.]
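The idea behind a fractional tap can be sketched in Python. vd~ uses 4-point polynomial interpolation; the function below (names invented for illustration) shows the simpler 2-point linear version of the same idea, reading between the two buffer samples that straddle the requested delay:

```python
def frac_tap(buf, w, delay):
    """Read a fractional `delay` (in samples) from circular buffer `buf`,
    where `w` is the write pointer (next slot to be written).
    Linear interpolation between the two neighbouring samples."""
    i = int(delay)                      # integer part of the delay
    frac = delay - i                    # fractional part, 0 <= frac < 1
    n = len(buf)
    s0 = buf[(w - 1 - i) % n]           # sample i steps ago
    s1 = buf[(w - 2 - i) % n]           # sample i+1 steps ago
    return (1 - frac) * s0 + frac * s1
```

For a ramp signal 0..7 stored in an 8-slot buffer (write pointer wrapped to 0), a delay of 1.5 samples returns 5.5 — halfway between the samples 1 and 2 steps old, which is what lets the modulated delay time glide smoothly instead of stepping.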
Fig 5: Single-tap feedback (IIR) delay line
A simple delay effect can be used to enhance stereo placement of instruments, and to make the mix sound wider.
- Include an adjustable delay between stereo DAC outputs for the output of the note synthesizer in section 3. Listen to the stereo widening when the delay is around 20 ms.
4.2 Multiple-tap delays
Fig 6: Multi-tap feedforward delay line
Multiple tap delays (fig. 6) allow more flexibility in the design of the effects unit, and can be used to add a rhythmic quality to the instrument. The principle is essentially the same as in single-tap delays.
- Use at least three taps at various delay times of at least 50 ms to create several echoes.
[Note: An alternative way to create multiple echoes would be to feed a single tap back into the original signal (fig. 5), i.e.:
y[n] = x[n] + a y[n-M]
giving the transfer function:
H[z] = 1/(1-a z^{-M})
If you wish to do it this way, beware: what is the maximum gain of the infinite impulse response (IIR) filter when |a| = 1?]
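The behaviour of this feedback comb is easy to see from its impulse response. The Python sketch below (illustrative values; not a pd patch) implements the difference equation directly — each pass around the loop returns the echo scaled by a, which is also why the |a| = 1 case in the question above deserves caution:

```python
def feedback_comb(x, a, M):
    """y[n] = x[n] + a*y[n-M]: a single delayed tap fed back on itself."""
    y = []
    for n, xn in enumerate(x):
        yn = xn + (a * y[n - M] if n >= M else 0.0)
        y.append(yn)
    return y

# Impulse response: an echo returns every M samples, scaled by a each pass.
h = feedback_comb([1.0] + [0.0] * 15, 0.5, 4)
```

With a = 0.5 and M = 4, the impulse response is 1, 0.5, 0.25, 0.125, ... at multiples of the delay: a geometric decay a^k that only dies away when |a| < 1.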
- Implement a chorus effect using several taps (at least 3). The delay of each tap should be centred around 0-20 ms, and each delay should be modulated independently.
Package the delay, chorus and flanger effects into a subpatch, with inputs to control the number of delay taps, chorus on/off, flanger on/off, and effects rate/depth.
5. Note Synthesiser
5.1 ADSR / ADSHR Envelopes
Many temporal characteristics of synthesized instrument notes can be summarised using ADSR (attack-decay-sustain-release) or ADSHR (attack-decay-sustain-hold-release) envelopes. These include changes in the volume of a note over its duration (e.g. piano notes
have very sharp attacks followed by long release times, whereas in the trumpet, for example, the release is more rapid), but also other effects such as fundamental frequency, filter frequencies, etc. ADSHR is an acronym for:
- Attack - time between key being pressed and envelope reaching its maximum.
- Decay - time between end of attack and envelope reaching a constant sustain level.
- Sustain - the level of the envelope during the sustain, relative to the envelope maximum.
- Hold - the duration for which the note is held at the sustain level.
- Release - the time between the key being lifted and the envelope fading to zero.
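The five stages can be expressed as a piecewise-linear function of time. The sketch below (Python, not pd; the function name and exact linear shape are illustrative assumptions — adshr~.pd and wavetable envelopes behave similarly but need not be linear) evaluates an ADSHR envelope at time t:

```python
def adshr(t, attack, decay, sustain, hold, release):
    """Piecewise-linear ADSHR envelope value at time t (seconds).
    `sustain` is a level in 0..1; the other four arguments are durations."""
    if t < 0:
        return 0.0
    if t < attack:                          # A: ramp 0 -> 1
        return t / attack
    t -= attack
    if t < decay:                           # D: ramp 1 -> sustain level
        return 1.0 + (sustain - 1.0) * (t / decay)
    t -= decay
    if t < hold:                            # S/H: held at the sustain level
        return sustain
    t -= hold
    if t < release:                         # R: ramp sustain -> 0
        return sustain * (1.0 - t / release)
    return 0.0                              # note has ended
```

Driving the note amplitude with one such envelope and the FM modulation index with another (sharper) one reproduces the bright-attack behaviour discussed below.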
In section 3.3, you may have noticed that 'brighter' spectra (i.e. larger bandwidth or spectral centroid) result at larger values of the modulation index. In many instruments, brighter spectra also tend to occur at the onsets of notes, accompanied by a sharp increase in dynamics.
Design a note synthesizer that permits the following functionality, making notes about the effect of varying each parameter:
- It must be possible to parameterise an ADSHR envelope separately for the volume of the note and the frequency modulation index. You may find this ADSHR patch useful:
adshr~.pd. [As an even better alternative, wavetables can be used in pd that would allow the user to draw the temporal trajectory of each of the envelopes by hand]
- Incorporate the AM/FM patch from section 3.3 (containing the square wave synthesiser) into your ADSHR patch. Create a note synthesiser containing independent controls for: fundamental frequency, peak modulation index, and all envelope parameters, that
can be triggered using a bang GUI box. Ensure that the carrier to modulator frequency ratio is set to produce a harmonic spectrum.
- Incorporate the effects subpatch on the output of the note synthesiser and experiment with adding delay, chorus and flanger effects on the output.
If time permits: To account for some of the impulsive or transient characteristics of the note attack, a white-noise component can also be incorporated.
- Add a white noise component to the output with a separate ADSHR controlling the noise amplitude (noise~ can be used to generate white noise). If you have time, you may even wish to use an ADSHR envelope to control the filter cutoff frequency for
the noise signal, or some other noise filter parameter.
© 2004-13, written by Mark Every, maintained by Philip Jackson, last updated by Phil Coleman on 12 Nov 2013.