Module 1: Speech Processing and Coding
Laboratory Project
Instructions: Perform all the tasks in this laboratory project. Each student is to write an individual
report containing all your results, observations and comments. Submit your report and Matlab code
as a single compressed (zipped) file through Stream by the due date. The submitted file should have
the paper number, your surname and your student ID as the filename. Your Matlab code should be
functioning and easy to read, with appropriate comments. We should be able to run your code and
obtain the results in your report without having to make changes to any source code. If there are
problems, we will ask you to come and present your work orally.
This project consists of two parts. The first part should be completed during the laboratory hours where
some supervision is provided. The second part is to be completed on your own using the knowledge
gained from the lectures.
Plagiarising results, any part of your report, or Matlab code will be severely penalised.
Deadline: Friday 31 March.
Late submissions will be penalised by a deduction of 10% of the marks per day, up to a maximum of 5 days,
after which the submission will not be marked. Requests for extension must be made before the due date.
They will only be granted under exceptional circumstances.
Marking Scheme:
This project is worth 20% of the total marks of this paper. Note that a minimum of 40% averaged over
all three laboratory projects must be achieved in order to pass this paper. The distribution of marks is
as follows:
Exercises in Part 1                                               50%
Part 2: Analysis (Matlab code, documentation and discussion)      25%
Part 2: Synthesis (Matlab code, documentation and discussion)     25%
Total                                                            100%
School of Engineering and Advanced Technology Massey University
281.755 Lab Project 1 2
Part 1
Speech data files and some Matlab functions are provided for this part of the project. Download them
from the Stream course site and save them in an appropriate folder. You will need either headphones
or external speakers on your computer.
1. Load the file “project1.mat” into Matlab. You should see four sets of data loaded. These are
segments of speech sampled at 8 kHz. Listen to the segments named “male_short” and
“female_short” using Matlab commands sound and soundsc. Do they sound different using these
two commands? Why?
2. Plot “female_short”, making sure that you use the correct time scale on the x-axis. Mark the voiced
and unvoiced regions on these plots. Note that you can use the zoom in the figure window to study
details of the signals.
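Hint: the time axis follows from the sampling rate, since sample n occurs at n/fs seconds. A minimal sketch, assuming a synthetic tone `x` as a stand-in for the speech segment (the variable names are illustrative and not part of the provided files):

```matlab
% Stand-in signal (hypothetical): 0.5 s of a 200 Hz tone.
% Replace x with female_short once project1.mat has been loaded.
fs = 8000;                          % sampling rate stated in the handout (Hz)
x  = sin(2*pi*200*(0:fs/2-1)'/fs);

t  = (0:length(x)-1)'/fs;           % time of each sample in seconds

plot(t, x);
xlabel('Time (s)');
ylabel('Amplitude');
title('Speech segment against time');
```

Plotting against `t` rather than the sample index puts the x-axis in seconds.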
3. Make use of the Matlab command fft to compute the energy spectrum of a frame of voiced speech.
This frame of speech should have a duration of 15-20 ms.
(a) Plot the signal as well as the energy spectrum (squared magnitude of the DFT coefficients).
Your plot should display the energy in decibels on the y-axis and the actual frequencies on the
x-axis.
(b) Multiply this frame of speech by a Hanning window (hanning) before computing the energy
spectrum. Plot the energy spectrum again. Compare with your previous plot and comment on
any differences observed.
(c) What is the fundamental frequency (pitch) of this frame of voiced speech?
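Hint: one possible way to set up the energy spectrum in decibels against frequency in Hz is sketched below. It assumes a synthetic three-harmonic frame as a stand-in for real voiced speech, a 1024-point FFT, and a Hann window written out from its formula (hanning(N) could be used instead); none of these choices is prescribed by this project.

```matlab
fs = 8000;                          % sampling rate stated in the handout (Hz)
N  = 160;                           % 20 ms frame at 8 kHz
n  = (0:N-1)';
% Stand-in "voiced" frame (hypothetical): harmonics of an assumed 125 Hz pitch
x  = sin(2*pi*125*n/fs) + 0.5*sin(2*pi*250*n/fs) + 0.25*sin(2*pi*375*n/fs);

w    = 0.5*(1 - cos(2*pi*(1:N)'/(N+1)));   % Hann window from its formula
nfft = 1024;                        % zero-padding for a smoother curve (assumed)
X    = fft(x, nfft);                % rectangular (no) window
Xw   = fft(x .* w, nfft);           % Hann-windowed frame

k  = (0:nfft/2)';                   % non-negative frequency bins
f  = k*fs/nfft;                     % bin index -> frequency in Hz
E  = 10*log10(abs(X(k+1)).^2  + eps);   % energy in dB
Ew = 10*log10(abs(Xw(k+1)).^2 + eps);

plot(f, E, f, Ew);
xlabel('Frequency (Hz)');
ylabel('Energy (dB)');
legend('Rectangular window', 'Hann window');
```

Only the bins from 0 to fs/2 are plotted because the spectrum of a real signal is symmetric about fs/2.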
4. Plot the signal and the energy spectrum of an unvoiced frame of the speech signal. How is it
different from those of the voiced frame?
5. Compute the LPC coefficients for the voiced frame using a 15th-order predictor.
(a) Filter the original speech data to obtain the residual signal. Measure the energy of the residue
and compare it to the energy of the original speech signal.
(b) Plot the residue signal. How would you describe this residue signal?
(c) Plot the linear predictive spectral envelope. Identify the formant frequencies of this vowel.
(d) Repeat the above using a 10th-order and a 5th-order predictor. How do the results compare with
those of the 15th-order predictor?
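Hint: the steps of exercise 5 can be sketched with the autocorrelation method written out in base Matlab (the toolbox function lpc(x, p) could be used instead). The frame below is a hypothetical stand-in: a 100 Hz pulse train passed through an assumed two-resonance all-pole filter. The order p, the FFT size and all variable names are illustrative only.

```matlab
fs = 8000;  N = 160;  p = 10;        % p is the predictor order (assumed)
% Stand-in voiced frame (hypothetical): pulse train through an all-pole filter
poles  = 0.95*exp(1j*2*pi*[500 1500]/fs);
a_true = real(poly([poles conj(poles)]));
e      = zeros(N, 1);  e(1:80:end) = 1;    % one pulse every 10 ms
x      = filter(1, a_true, e);
w      = 0.5*(1 - cos(2*pi*(1:N)'/(N+1))); % Hann window from its formula
xw     = x .* w;

% Autocorrelation r(0..p) of the windowed frame
r = zeros(p+1, 1);
for k = 0:p
    r(k+1) = sum(xw(1:N-k) .* xw(1+k:N));
end

% Solve the normal equations R*alpha = r(1..p) for the predictor coefficients
alpha = toeplitz(r(1:p)) \ r(2:p+1);
a     = [1; -alpha];                 % prediction-error filter A(z)

res = filter(a, 1, xw);              % residual: the frame filtered by A(z)
Ex  = sum(xw.^2);                    % energy of the frame
Er  = sum(res.^2);                   % energy of the residual

% LP spectral envelope, proportional to 1/|A(e^{jw})|^2, scaled here by Er
nfft = 512;
A    = fft(a, nfft);
f    = (0:nfft/2)'*fs/nfft;
env  = 10*log10(Er ./ abs(A(1:nfft/2+1)).^2);
plot(f, env);
xlabel('Frequency (Hz)');
ylabel('LP envelope (dB)');
```

Comparing `Er` with `Ex` gives the energy comparison asked for in (a), and changing `p` repeats the analysis for another predictor order.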
6. Repeat the LP analysis on the unvoiced frames using a 15th-order predictor and plot the LP spectral
envelope. Can you observe any formants?
7. Plot the narrowband and wideband spectrograms of “male_long” and “female_long” using the
spectrogram command in MATLAB. Use 320 samples per frame with a 240 sample overlap.
Align the speech with the spectrogram. (Hint: Use a dictionary to get the phonemes for the word
spoken.)
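Hint: the window length determines whether a spectrogram is narrowband or wideband. A long window resolves individual harmonics, while a short window resolves individual pitch periods. The sketch below assumes a synthetic frequency sweep as a stand-in for the speech, a 512-point FFT, and a 40-sample window for the wideband case; only the 320/240 setting comes from this handout. It needs the Signal Processing Toolbox for spectrogram.

```matlab
fs = 8000;
t  = (0:fs-1)'/fs;                        % 1 s stand-in signal (hypothetical)
x  = sin(2*pi*(200*t + 400*t.^2));        % tone sweeping from 200 Hz to 1000 Hz

% Long window (320 samples = 40 ms): fine frequency resolution, narrowband
[Sn, Fn, Tn] = spectrogram(x, 320, 240, 512, fs);
% Short window (assumed 40 samples = 5 ms): fine time resolution, wideband
[Sw, Fw, Tw] = spectrogram(x, 40, 30, 512, fs);

subplot(2, 1, 1);
imagesc(Tn, Fn, 20*log10(abs(Sn) + eps));  axis xy;
xlabel('Time (s)');  ylabel('Frequency (Hz)');  title('Narrowband');

subplot(2, 1, 2);
imagesc(Tw, Fw, 20*log10(abs(Sw) + eps));  axis xy;
xlabel('Time (s)');  ylabel('Frequency (Hz)');  title('Wideband');
```

Because both axes are in physical units, the waveform can be plotted on the same time scale underneath to align it with the spectrogram.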
Part 2: Linear Predictive Vocoder
A vocoder encodes speech based on the parameters obtained from speech analysis and synthesizes
speech using these parameters. Bit rates can be as low as 2 kbits/s, depending on how the parameters
are quantized. This low bit rate is achieved at the expense of the quality of speech produced. In this
project, we shall use linear predictive analysis and synthesis without coefficient quantization.
Write a set of Matlab functions that will perform linear predictive analysis on a given speech signal.
The speech signal should be at least 3 seconds long, sampled at 8 kHz or higher. It can be a recording
of your own voice or one obtained by other means. The signal is to be divided into overlapping frames.
You may choose a suitable length and overlap for the frames. For each frame, the analysis
should produce the following parameters:
1. Voiced/unvoiced detection
2. Linear prediction coefficients (your choice of analysis order)
3. An estimate of the pitch frequency for voiced speech
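As an illustration of the first and third parameters, the sketch below classifies one frame using its zero-crossing rate and energy, then estimates the pitch from the peak of the autocorrelation. This is only one of several possible approaches. The stand-in frame, the thresholds and the 50-400 Hz search range are assumptions that would need tuning on real speech.

```matlab
fs = 8000;  N = 240;                      % assumed 30 ms frame
n  = (0:N-1)';
% Stand-in voiced frame (hypothetical) with a 100 Hz pitch
x  = sin(2*pi*100*n/fs) + 0.5*sin(2*pi*200*n/fs);

% Voiced/unvoiced decision from zero-crossing rate and energy
zcr      = mean(abs(diff(sign(x))) > 0);  % fraction of sample pairs changing sign
energy   = mean(x.^2);
isVoiced = (zcr < 0.25) && (energy > 1e-4);   % hypothetical thresholds

% Pitch estimate: strongest autocorrelation peak for lags in the 50-400 Hz range
lagMin = floor(fs/400);
lagMax = ceil(fs/50);
r = zeros(lagMax+1, 1);
for k = 0:lagMax
    r(k+1) = sum(x(1:N-k) .* x(1+k:N));
end
[~, i] = max(r(lagMin+1:lagMax+1));
lag = lagMin + i - 1;                     % lag of the peak in samples
f0  = fs/lag;                             % pitch estimate in Hz
```

Unvoiced speech tends to have a high zero-crossing rate and low energy, which is why the two measures are combined here.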
This speech signal is to be re-synthesized from these parameters. For each frame,
1. switch between pulse-train excitation and noise excitation based on the voiced/unvoiced
decision,
2. make the pitch of the synthesized voice roughly match that of the original speech, and
3. take care of the amplitude discontinuities in the signal between frames.
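The three synthesis requirements can be sketched as follows, assuming hypothetical per-frame parameters (`voiced`, `f0`, `gain` and a fixed all-pole filter `a`) in place of real analysis output. Windowed overlap-add is used here to smooth the frame boundaries, which is only one of several possible ways to handle the discontinuities.

```matlab
fs  = 8000;
N   = 240;  hop = N/2;                    % assumed 30 ms frames, 50 % overlap
nFr = 20;
% Hypothetical per-frame parameters, as an analysis stage might produce them
voiced = [true(1, 12) false(1, 8)];
f0     = 120*ones(1, nFr);                % pitch in Hz (used for voiced frames)
gain   = 0.1*ones(1, nFr);                % target RMS of each frame
poles  = 0.95*exp(1j*2*pi*[500 1500]/fs);
a      = real(poly([poles conj(poles)])); % same LP filter for every frame here

w = 0.5*(1 - cos(2*pi*(0:N-1)'/N));       % periodic Hann: copies at 50 % overlap sum to 1
y = zeros((nFr-1)*hop + N, 1);

for m = 1:nFr
    start = (m-1)*hop;                    % number of samples before this frame
    if voiced(m)
        T = round(fs/f0(m));              % pitch period in samples
        e = zeros(N, 1);
        % Pulses on an absolute time grid (stays aligned only while T is constant)
        e(mod(start + (1:N)', T) == 0) = 1;
    else
        e = randn(N, 1);                  % white-noise excitation
    end
    s = filter(1, a, e);                  % all-pole synthesis filter 1/A(z)
    s = s * gain(m) / sqrt(mean(s.^2));   % match the target frame RMS
    idx = start + (1:N)';
    y(idx) = y(idx) + w .* s;             % overlap-add smooths the frame joins
end

% audiowrite('resynth.wav', y, fs);      % hypothetical output file name
```

In a full vocoder, `a`, `f0`, `gain` and `voiced` would change from frame to frame, and the filter state would be carried across frames rather than reset.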
Your submission for this part should include:
(1) A report – In your report, detail the techniques used and discuss the factors affecting the quality
of the synthesized speech. Comment on the quality of the resynthesized speech produced.
Suggest any changes that can be made to improve its quality.
(2) Well-documented Matlab code.
(3) The resynthesized speech as an audio file.