from scipy.io import wavfile
from IPython.display import Audio
import numpy
from scipy.fft import fft
from scipy.signal import welch
import matplotlib.pyplot as plt
from IPython.display import HTML
plt.rcParams["figure.figsize"] = (20,3)
%matplotlib inline
samplerate, data = wavfile.read('./Fifty.wav')
samplerate2, data2 = wavfile.read('./170_nonlinear_fixed.wav')
samplerate3, data3 = wavfile.read('./100.wav')
samplerate_nonlinear, data_nonlinear = wavfile.read('./170_yes_nonlinear.wav')
samplerate_linear, data_linear = wavfile.read('./170_no_nonlinear.wav')
HTML('''<script>
code_show=true;
function code_toggle() {
if (code_show){
$('div.input').hide();
} else {
$('div.input').show();
}
code_show = !code_show
}
$( document ).ready(code_toggle);
</script>
<form action="javascript:code_toggle()"><input type="submit" value="Click here to toggle on/off the raw code."></form>''')
In this lab, you are challenged to use the hardware resources on the FPGA as wisely as you can to accelerate a simulation of the 2D wave equation on a square mesh. The amplitude of the center node in the mesh will be communicated to the DAC at 48kHz to produce drum-like sounds. Your simulation will include a nonlinear effect related to the instantaneous tension in the mesh that produces the pitch-glide heard in real drums.
You are challenged to simulate the largest drum that you can. The group that uses the DSP blocks and memory blocks on the FPGA most exhaustively (as few as possible left over) and efficiently (as many as possible being used in parallel) will be able to produce the largest drum. Last year, the winning group's drum contained over 88,000 nodes!
You will be asked to compare the sizes of the largest drums that you can simulate using only the HPS vs. the FPGA. This lab will give you a sense of the FPGA's power for hardware acceleration.
A wave equation relates time derivatives to spatial derivatives. The continuous form of the 2D wave equation is given below, where $c$ is the speed of sound and $u$ is the out-of-plane amplitude of the node.
\begin{align} \frac{1}{c^2} \frac{\partial^2{u}}{\partial{t}^2} = \frac{\partial^2{u}}{\partial{x^2}} + \frac{\partial^2{u}}{\partial{y^2}} - \eta \frac{\partial{u}}{\partial{t}} \end{align}In order to implement this equation on the FPGA, we will use the discretized version shown below. For a derivation of the below equation from the above equation, see this webpage. The amplitude of a particular node, $u_{i,j}$ at the next timestep, $n+1$, is a function of the node's amplitude at the current and previous timesteps ($n$ and $n-1$) and the amplitudes of the four adjacent nodes at the current timestep.
\begin{align} u_{i,j}^{n+1} = \left[1 + \frac{\eta \Delta t}{2}\right]^{-1}\left\{ \rho_{eff}\left[ u_{i+1,j}^n + u_{i-1,j}^n + u_{i,j-1}^n + u_{i,j+1}^n - 4u_{i,j}^n\right] + 2u_{i,j}^n - \left[1 - \frac{\eta \Delta t}{2}\right]u_{i,j}^{n-1}\right\} \end{align}(Take a close look at that first term. If $\frac{\eta \Delta t}{2}$ is small, is there an opportunity for simplification?) All of the other terms in the above expression are constants, except for $\rho_{eff}$. The value for this term is nonlinearly related to the amplitude of the center node according to the equation given below. This nonlinear coupling generates an audible "pitch-glide" just like that heard in real drums. In order for the drum to be stable, this value must be less than 0.5.
\begin{align} \rho_{eff} = \text{min}\left(0.49, \rho_0 +[u_{center} \cdot G_{tension}]^2\right) \end{align}Reasonable values for the other parameters might be
\begin{align} \rho_0 &\approx 0.25\\ G_{tension} &\approx 2^{-3} \rightarrow 2^{-5}\\ \frac{\eta \Delta t}{2}&\approx 2^{-9} \rightarrow 2^{-10} \end{align}
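To expand on the parenthetical hint above: because $\frac{\eta \Delta t}{2} \approx 2^{-9}$ is small, the leading reciprocal can be replaced by its first-order expansion, turning a division into a multiply (or a shift-and-subtract in fixed point). A sketch of one possible simplification:
\begin{align} \left[1 + \frac{\eta \Delta t}{2}\right]^{-1} = 1 - \frac{\eta \Delta t}{2} + \left(\frac{\eta \Delta t}{2}\right)^2 - \cdots \approx 1 - \frac{\eta \Delta t}{2} \end{align}
The neglected terms are of order $2^{-18}$, around the least-significant bit of a 1.17 fixed-point sample, and multiplying by $1 - 2^{-9}$ reduces to $x - (x \gg 9)$. Note also that $\rho$ plays the role of $\left(\frac{c \Delta t}{\Delta x}\right)^2$ in the standard discretization, which is where the 2D stability bound $\rho < 0.5$ (the CFL condition) comes from.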
- Hunter's slides
- Implementation of finite difference schemes for the wave equation on FPGA
- Parallel implementation of finite difference schemes for the plate equation on a FPGA-based multi-processor array
- Time domain numerical simulation for transient waves on reconfigurable coprocessor platform
- Design methodology for real-time FPGA-based sound synthesis
- Wolfson WM8731 audio CODEC datasheet
- The Qsys AV config module abstracts away much of the codec detail
- The codec fast data transfer hardware is hidden behind Altera IP called the University Audio Core for Qsys (configuration interface).
- An example using the audio core is near the bottom of the Avalon Bus master page.
- Matlab program from Bruce gives a sequential version of the algorithm and plots the Fourier modes of the drum.
- Clicking "show code" at the top of this webpage will reveal Python code by Hunter based on Bruce's Matlab code (for the more Pythonically inclined person). Like the Matlab, it gives a sequential version of the algorithm and plots the Fourier modes of the drum. Plots given below. It compares the simulated and theoretical drum modes, as given in Physical Modeling with the 2-D Digital Waveguide Mesh.
- Bruce's HPS code which implements the above equation in C for realtime synthesis on the HPS
- If you play with the boundary conditions, damping, wave speed, drum size, and initial conditions, the sound will be drum-like, chime-like, gong-like, or bell-like.
# Initialize drum
n = 100
u = numpy.zeros((n,n))
u1 = numpy.zeros((n,n))
u2 = numpy.zeros((n,n))
uHit = numpy.zeros((n,n))
# Set rho, sample rate, and simulation time
rho = 0.25
Fs = 48000
time = numpy.arange(0, 1., (1./Fs))
# Array in which to store audio samples
aud = numpy.zeros(len(time))
# Compute middle nodes
x_mid = n//2
y_mid = n//2
# Initialize drum amplitudes to a pyramid
for i in range(1, n-1):
    for j in range(1, n-1):
        uHit[i,j] = max(0, 30-(abs(x_mid - i) + abs(y_mid - j)))/30.
# Zero boundary conditions
uHit[0,:] = 0
uHit[n-1,:] = 0
uHit[:,0] = 0
uHit[:,n-1] = 0
# Indexing variable
tindex = 0
# Loop through time array
for t in time:
    u = numpy.zeros((n,n))
    # Hit the drum at t=0
    if (t==0):
        u1 = u1 + uHit
    # Compute updated drum amplitudes
    u[1:n-2,1:n-2] = (1./(1+.0001)) * (rho * (u1[2:n-1,1:n-2]+
                                              u1[0:n-3,1:n-2]+
                                              u1[1:n-2,2:n-1]+
                                              u1[1:n-2,0:n-3]-
                                              4*u1[1:n-2,1:n-2])
                                       + 2*u1[1:n-2,1:n-2]
                                       - (1-.0001) * (u2[1:n-2,1:n-2]))
    # Amplitude of center node goes into audio array
    aud[tindex] = u[n//2,n//2]
    # Update/store drum state
    u2 = u1
    u1 = u
    # Iterate
    tindex = tindex + 1
# Plot the center node amplitudes
plt.plot(time, aud)
plt.xlabel('time (sec)')
plt.ylabel('amplitude')
plt.title('Center-Node Amplitude for 100x100 drum', fontsize=18)
plt.show()
# Plot the power spectrum
plt.rcParams["figure.figsize"] = (20,3)
Pxxt, freqst = plt.psd(aud,
NFFT=len(aud),
Fs=48000,
window=numpy.hamming(len(aud)),
noverlap=50, label='100x100')
# Plot the theoretical mode locations
fundamental = 173
yvals = numpy.linspace(-60, 70)
modes = [1.58, 2.0, 2.24, 2.55, 2.92, 3.0, 3.16, 3.54]
plt.plot(numpy.ones(len(yvals))*fundamental,
yvals,
'red',
alpha=0.5, label='theoretical mode locations')
for i in modes:
    plt.plot(numpy.ones(len(yvals))*fundamental*i,
             yvals,
             'red',
             alpha=0.5)
# Configure plots
plt.xlim([0,1000])
plt.xticks(size=18)
plt.legend(loc='best')
plt.show()
# Output audio
print("Audio - Python synthesized, 100x100 drum")
Audio(aud, rate=48000)
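As a quick cross-check on the power spectrum above, Welch's method (welch is imported at the top of this page but otherwise unused) should find its strongest peak near the 173 Hz fundamental marked above. A minimal sketch:
# Cross-check the spectrum with Welch's method
freqs_w, Pxx_w = welch(aud, fs=Fs, nperseg=8192)
print("strongest peak: %0.1f Hz" % freqs_w[numpy.argmax(Pxx_w)])
plt.semilogy(freqs_w, Pxx_w)
plt.xlim([0, 1000])
plt.xlabel('frequency (Hz)')
plt.ylabel('power spectral density')
plt.show()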
# Initialize drum
n = 100
u = numpy.zeros((n,n))
u1 = numpy.zeros((n,n))
u2 = numpy.zeros((n,n))
uHit = numpy.zeros((n,n))
# Set rho, sample rate, and simulation time
rho = 0.25
rho_eff = 0.25
Fs = 48000
time = numpy.arange(0, 1, (1./Fs))
# Array in which to store audio samples
aud = numpy.zeros(len(time))
# Compute middle nodes
x_mid = n//2
y_mid = n//2
# Initialize drum amplitudes to a pyramid
for i in range(1, n-1):
    for j in range(1, n-1):
        uHit[i,j] = max(0, 30-(abs(x_mid - i) + abs(y_mid - j)))/30.
# Zero boundary conditions
uHit[0,:] = 0
uHit[n-1,:] = 0
uHit[:,0] = 0
uHit[:,n-1] = 0
# Indexing variable
tindex = 0
# Loop through time array
for t in time:
    u = numpy.zeros((n,n))
    # Hit the drum at t=0
    if (t==0):
        u1 = u1 + uHit
    # Compute updated drum amplitudes
    u[1:n-2,1:n-2] = (1./(1+.0001)) * (rho_eff * (u1[2:n-1,1:n-2]+
                                                  u1[0:n-3,1:n-2]+
                                                  u1[1:n-2,2:n-1]+
                                                  u1[1:n-2,0:n-3]-
                                                  4*u1[1:n-2,1:n-2])
                                       + 2*u1[1:n-2,1:n-2]
                                       - (1-.0001) * (u2[1:n-2,1:n-2]))
    # Amplitude of center node goes into audio array
    aud[tindex] = u[n//2,n//2]
    # Update/store drum state
    u2 = u1
    u1 = u
    # Update rho based on center-node amplitude (nonlinear tension)
    rho_eff = min(0.49, rho + ((1./16)*u[n//2,n//2])**2.)
    # Iterate
    tindex = tindex + 1
# Output audio
print("Audio - Python synthesized, nonlinear (exaggerated) rho, 100x100 drum")
Audio(aud, rate=48000)
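The pitch-glide is easiest to see in a spectrogram: the mode frequencies start high while the center-node amplitude (and therefore $\rho_{eff}$) is large, then glide downward as the oscillation decays. A minimal sketch using matplotlib's specgram:
# Spectrogram of the nonlinear drum; modes glide down as rho_eff decays
plt.specgram(aud, NFFT=2048, Fs=Fs, noverlap=1024)
plt.ylim([0, 1000])
plt.xlabel('time (sec)')
plt.ylabel('frequency (Hz)')
plt.title('Pitch-glide in the nonlinear 100x100 drum')
plt.show()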
Sound from a 170x50 drum
# Average the stereo channels to mono (in float, to avoid int16 overflow)
drum = data[110000:160000].astype(float).mean(axis=1)
drum = drum - numpy.mean(drum)
Audio(drum, rate=samplerate)
Sound from a 170x100 drum
drum3 = data3[120000:180000].astype(float).mean(axis=1)
drum3 = drum3 - numpy.mean(drum3)
Audio(drum3, rate=samplerate3)
Sound from a linear drum (170x170)
drum_linear = data_linear[100000:160000].astype(float).mean(axis=1)
drum_linear = drum_linear - numpy.mean(drum_linear)
Audio(drum_linear, rate=samplerate_linear)
Sound from a nonlinear drum (170x170)
drum_nonlinear = data_nonlinear[105000:160000].astype(float).mean(axis=1)
drum_nonlinear = drum_nonlinear - numpy.mean(drum_nonlinear)
Audio(drum_nonlinear, rate=samplerate_nonlinear)
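To put a number on a recorded drum's fundamental (rather than reading it off the spectra below), the strongest FFT bin below 1 kHz is usually enough, though for some strikes a higher mode can dominate, so sanity-check against the plots. A minimal sketch:
# Estimate the fundamental from the strongest spectral bin below 1 kHz
def estimate_fundamental(x, fs):
    spectrum = numpy.abs(numpy.fft.rfft(x * numpy.hamming(len(x))))
    freqs = numpy.fft.rfftfreq(len(x), 1./fs)
    band = freqs < 1000.
    return freqs[band][numpy.argmax(spectrum[band])]
print("170x170 nonlinear drum: ~%0.0f Hz" % estimate_fundamental(drum_nonlinear, samplerate_nonlinear))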
- Get the linear 2D PDE drum synthesis HPS code running (available on the DSP page, and also linked here)
- The above HPS code will help you write your Verilog.
- Modify the code for the nonlinear rho effect
- Scale up the HPS solution until it is no longer realtime
- Simulate a single node (with its four neighbors held at zero); the result should be simple harmonic motion
- Remember that rho<0.5. Start with rho=1/16.
- Note that when rho is close to the limit of 0.5, the solution has a very high natural frequency, so the sine wave will be on the border of aliasing and thus look like a square wave.
- Assuming 1.17 fixed point, start with initial conditions $u^n = u^{n-1}$ and an amplitude of about 1/8 full scale
- Use synthesizable Verilog running in ModelSim to simulate a single column of ~30 nodes, with zero boundary conditions at either end and zero amplitudes for the missing left/right neighbors. You may find these slides helpful.
- Store the initial amplitudes for the $n$ and $n-1$ timesteps in M10k memory blocks.
- Create a state machine that reads node amplitudes from M10k memory, computes their updated amplitudes, and writes those updated amplitudes back to M10k memory. The state machine will start at the bottom of the column and move to the top (a plain-Python reference model of this sweep is sketched just after this list).
- Remember that node updates must be synchronous. Otherwise your drum will be unconditionally unstable.
- Initialize the node amplitudes in a triangle shape (highest in the middle, linearly falling to zero at the ends). View the node amplitudes in ModelSim. You should see the node amplitudes oscillate back and forth, with their amplitudes decaying over time.
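As referenced above, a plain-Python reference model (not the required Verilog) can be a useful check before simulation: it performs the same serial bottom-to-top sweep as the state machine, treats the missing left/right neighbors as zero, and keeps the update synchronous by reading only old amplitudes while writing new ones to a separate array. With all neighbors zero, the same update applied to one node reduces to damped simple harmonic motion, which is the single-node checkpoint above. A sketch with the parameter values suggested earlier ($\rho = 1/16$, $\frac{\eta \Delta t}{2} = 2^{-9}$, peak amplitude about 1/8 full scale):
# Reference model of one 30-node column: zero boundaries at either end,
# missing left/right neighbors treated as zero
rho = 1./16
damp = 2.**-9
n = 30
u1 = numpy.zeros(n)   # amplitudes at timestep n
u2 = numpy.zeros(n)   # amplitudes at timestep n-1
# Triangle initial condition with u^n = u^(n-1), peak ~1/8 full scale
for i in range(1, n-1):
    u1[i] = (15. - abs(15 - i))/120.
u2 = u1.copy()
center = []
for step in range(2000):
    u = numpy.zeros(n)
    # Bottom-to-top sweep, reading only u1/u2 and writing to u,
    # so the update stays synchronous
    for i in range(1, n-1):
        u[i] = (1./(1.+damp))*(rho*(u1[i+1] + u1[i-1] - 4.*u1[i])
                               + 2.*u1[i] - (1.-damp)*u2[i])
    u2, u1 = u1, u
    center.append(u1[n//2])
plt.plot(center)
plt.xlabel('timestep')
plt.ylabel('center-node amplitude')
plt.title('Single-column reference model: decaying oscillation')
plt.show()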
Using a generate statement, instantiate and connect 30 node columns. This will create a 30x30 drum for which all columns are updated in parallel, and each row is updated in series.
Give all nodes reasonable initial conditions. Remember that you cannot just give the center node nonzero amplitude (spatial aliasing will occur). Your drum must be initialized with amplitudes that relatively smoothly transition from high in the center to zero at the boundaries. A pyramid works just fine.
Output the amplitude of the center node in ModelSim. It will look something like that shown below.
Export these data and play the sound that they generate with Matlab or Python. It should sound vaguely like a cowbell or a woodblock.
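One way to do the export-and-play step: write the center-node samples to a text file from your testbench, then scale and play them in Python. The filename and one-value-per-line format below are assumptions, so adjust them to match your testbench output:
# Play center-node samples exported from ModelSim (assumed format:
# one value per line in the hypothetical file 'center_node.txt')
samples = numpy.loadtxt('center_node.txt')
# If the export is raw 1.17 fixed-point integers, rescale first:
# samples = samples / 2.**17
samples = samples / numpy.max(numpy.abs(samples))   # normalize to +/-1
wavfile.write('modelsim_drum.wav', 48000, (samples*32767).astype(numpy.int16))
Audio(samples, rate=48000)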
Add the nonlinear rho calculation to your ModelSim simulation.
Add synchronization signals to your ModelSim simulation.
Add computation time to your ModelSim simulation.
Download the Audio output bus_master example from the Quartus 18.1 page.
- This example performs direct digital synthesis of a sine wave, with frequency controlled by the switches
- Get it running and listen to it.
- Integrate a drum of any size with this example (a column is fine, or a small square drum)
- Use hardcoded parameters
- Use a 30x30 drum
- Compile time will be approximately 9 minutes
- Should sound like a cowbell or woodblock
- Look at the compilation report to confirm that your multiplier usage is what you expect it to be
- Compile and test, compile time will be about 9 minutes
- Your grid should have no more than 4:1 aspect ratio
- Grid expansion via symmetry does not count for size
- Compile time will be >20 minutes
- Through your user interface (either PIO or switches) increase the number of rows in the drum until you can no longer meet timing requirements.
- Listen to the drum with headphones that have nice bass, or through a subwoofer
- Look for opportunities to optimize (pipeline your drum simulator? add a faster PLL to achieve more rows?)
- Demonstrate the largest drum that you can simulate in realtime (no more than 4:1 aspect ratio)
- Demonstrate at least three different drum sounds
- At least one sound must include audible nonlinear tension modulation effects
- Prove that the computation finishes in less than 20.8 $\mu s$ for your drums (see the budget sketch just after this list)
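On the 20.8 $\mu s$ bound referenced above: it is one audio sample period, $1/48000 \approx 20.83\,\mu s$, and dividing by your state-machine clock period gives the cycle budget per sample. The sketch below assumes a 50 MHz clock and a hypothetical 4 cycles per row update, so substitute your own numbers:
# Cycle budget per audio sample (assumed 50 MHz state-machine clock)
clk = 50e6                # Hz -- substitute your actual PLL output
budget = clk/48000.       # ~1041 cycles per sample at 50 MHz
cycles_per_row = 4        # hypothetical read/compute/write/overhead count
print("cycle budget per sample: %d" % int(budget))
print("max rows at %d cycles/row: %d" % (cycles_per_row, int(budget//cycles_per_row)))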
Your lab report should address the following questions:
- Do you see any cache-size effects on HPS? What is the effect of -O0 vs. -O3?
- Are there any scaling surprises in the FPGA?
drum2 = data2[130000:170000].astype(float).mean(axis=1)
drum2 = drum2 - numpy.mean(drum2)
fig, axs = plt.subplots(3, sharex=True, sharey=True, figsize=(20,10))
Pxx3, freqs3, line = axs[1].psd(drum3,
NFFT=len(drum3),
Fs=samplerate3,
window=numpy.hamming(len(drum3)),
noverlap=50, label='170x100', return_line=True)
axs[1].legend(loc='best')
axs[1].set(xlabel='')
Pxx, freqs, line = axs[0].psd(drum,
NFFT=len(drum),
Fs=samplerate,
window=numpy.hamming(len(drum)),
noverlap=50, label='170x50', return_line=True)
axs[0].legend(loc='best')
axs[0].set(xlabel='')
Pxx2, freqs2, line2 = axs[2].psd(drum2,
NFFT=len(drum2),
Fs=samplerate2,
window=numpy.hamming(len(drum2)),
noverlap=50, label='170x170', return_line=True)
fundamental = 100
yvals = numpy.linspace(-60, 70)
modes = [1.58, 2.0, 2.24, 2.55, 2.92, 3.0]
axs[2].plot(numpy.ones(len(yvals))*fundamental,
yvals,
'red',
alpha=0.5, label='theoretical mode locations')
for i in modes:
    axs[2].plot(numpy.ones(len(yvals))*fundamental*i,
                yvals,
                'red',
                alpha=0.5)
axs[2].legend(loc='best')
plt.ylim([-45, 65])
plt.xlim([0, 1000])
fig.suptitle('Power spectra for growing drum', fontsize=18)
plt.xticks(size=18)
plt.show()