22. Advanced speech analysis tools I: SFS
On
the previous page, we learned about two new ways to display the information
in a sound signal, the waterfall spectrogram and the spectrum. So far we have
been using WASP to analyze speech signals. WASP is extremely convenient and
functional for making waveforms, wide and narrow band spectrograms, and pitch
tracks. But for spectra and numerous other kinds of speech analysis and related
operations, we will need software with more functions.
Fortunately, there are some excellent options
available for free download over the Internet. This and the next page will
introduce two of these: SFS and Praat. A
list of other selected speech analysis programs will be given at the end of
the next page. Each program has its own strengths and weaknesses. You can
choose which one or ones are best for each task you wish to perform based
on your own needs and preferences.
The more advanced you are in any subject, the
more independent you must become in your learning. You will no longer have
a teacher constantly telling you how to proceed, and nodding at you when you're
'right' or correcting you when you're 'wrong' each step of the way. You will
no longer have ready-made frameworks where you simply have to learn what is
printed in the textbook or mentioned in class, and then 'fill in the blanks'
of an exercise and pass an exam. You will need to choose your own books, begin
formulating your own questions, and design a personal course of study. You
will need to build up your own communications network of other scholars with
interests similar to yours. You will have to decide, after reviewing several
options, which tools you want to use, and how you want to use them. Professors,
both at your home institution and elsewhere (reachable through e-mail lists
like Phonetics and over the Internet), will still be available for questions and
discussion, but you will have to take on much more responsibility yourself
for your own continuing education and development.
Choosing advanced speech analysis software,
learning how to use it, and then thinking up meaningful tasks to do with it
is one place to start. This may involve expanding into previously unfamiliar
areas like computer programming, or 'script writing'. And the software may
always have many more functions than you know how to use, or even understand
completely. But that should not stop you from starting simple and doing some
of the things you can do with it. The repertoire of things you can
do will grow over time, the more you work with the software.
The first software package we will look at is
SFS, or Speech Filing System, a DOS-based program running under a Windows
shell, created by Mark Huckvale of University College London. The first immediate
advantage of this option is that it is highly compatible with Huckvale's WASP,
which you are already familiar with.
Here is a link to the download page for SFS:
http://www.phon.ucl.ac.uk/resource/sfs/
Choose the most recent version and click on
'download'. It is probably easiest to download the compressed files to your
desktop, then install them on your system disk under 'Program Files'. The
installation program will help you create an 'SFS' directory if you don't
already have one from when you installed WASP. Remember to put a 'shortcut'
icon on your desktop if it isn't done automatically. If you have a separate
'data' partition of your hard disk, you may want to save files you create
under SFS there instead of on your system disk. That way, you won't crowd
up your system partition with extra files, and if your system crashes and
you have to reinstall, your data files will still be safe.
After installing and launching SFS, click on
'Help' for a short tutorial to help you get started. In it, you will learn
how to generate and replay sinewaves. For fun, you can generate touch-tone
dialing tones by entering a phone number under 'Tools' > 'Generate' >
'DTMF tones' (DTMF stands for 'Dual Tone Multi-Frequency'; each touch-tone sound is formed
from two sine waves at specific frequencies). Try generating tone sequences
under 'Tools' > 'Generate' > 'Generate tone sequences'. Click on the
default values to hear what you get; then you can start changing the variables,
e.g. choose '5' under 'Number of tones' instead of '1', and you will hear
a series of five tones rising in pitch.
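If you are curious how such tones are built, here is a minimal sketch in Python (standard library only) of the idea described above: each key is the sum of two sine waves. The frequency table is the standard touch-tone keypad layout; the function names, the tone and gap durations, and the 8000 Hz sampling rate are assumptions chosen for illustration, not what SFS does internally.

```python
import math
import struct
import wave

# Standard DTMF keypad: each key maps to (low-group Hz, high-group Hz).
DTMF_FREQS = {
    "1": (697, 1209), "2": (697, 1336), "3": (697, 1477),
    "4": (770, 1209), "5": (770, 1336), "6": (770, 1477),
    "7": (852, 1209), "8": (852, 1336), "9": (852, 1477),
    "*": (941, 1209), "0": (941, 1336), "#": (941, 1477),
}


def dtmf_tone(key, duration=0.2, rate=8000):
    """Return samples in [-1, 1] for one key: the sum of two sine waves."""
    low, high = DTMF_FREQS[key]
    n = round(duration * rate)
    return [0.5 * math.sin(2 * math.pi * low * t / rate)
            + 0.5 * math.sin(2 * math.pi * high * t / rate)
            for t in range(n)]


def dial(number, gap=0.05, rate=8000):
    """Concatenate the tones for each key, separated by short silences."""
    samples = []
    for key in number:
        samples.extend(dtmf_tone(key, rate=rate))
        samples.extend([0.0] * round(gap * rate))
    return samples


def write_wav(path, samples, rate=8000):
    """Write the samples as a mono 16-bit PCM WAV file."""
    frames = b"".join(struct.pack("<h", int(s * 32767)) for s in samples)
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(rate)
        w.writeframes(frames)
```

For example, `write_wav("dial.wav", dial("5551234"))` would save a seven-key sequence (the file name and number are made up) that you can then display as a spectrogram: each key should show up as two horizontal bands.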
Next, follow the instructions on how to record
a signal. This should be easy, since the procedure is the same as for WASP.
You can display a waveform, and/or a wide or narrow band spectrogram like
you did with WASP by clicking on 'Tools' > '1. Speech' > 'Display' >
'Wide Spectrogram' (or whichever you prefer). You can also just click on the
'Display checked items' icon on the toolbar, then click again on the appropriate
icons of the toolbar in the new window that is opened. There is also a 'Display
all items' icon; if you click on this you can see displays of all the items
in your list.
Making sure the box of the file you've just
made is still checked, try clicking on 'Tools' > '1. Speech' > 'Display'
> 'Cross Section'. Display either a waveform or spectrogram or both. You
will see two other displays at the bottom of the screen. These are spectra
(plural of 'spectrum'). You can mark any section of your spectrogram by clicking
the left, then the right mouse button. You will immediately see a spectrum
of this section in the right-hand box. The spectrum shown in the 'filter response'
box has been processed by a method called 'Linear Predictive Coding' or LPC.
Follow the link for a technical explanation of what this is. You can learn
more about LPC in books like Peter Ladefoged's Elements
of Acoustic Phonetics. Basically, LPC is a method for estimating the
resonances or formants from speech signal data by smoothing and connecting
spectral peaks. It enables you to see and identify the formant peaks more
clearly. You can then remove the formant data by a process called inverse
filtering to reveal properties of the original sound source, i.e. the
'buzz' of the vocal folds or 'residue', something that is useful in digitizing
speech data. You may ignore this for now.
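For readers who do want a peek under the hood, the following is a rough sketch of one common formulation of LPC, the autocorrelation method solved with the Levinson-Durbin recursion, together with inverse filtering. It is written in plain Python for readability; whether SFS computes its 'filter response' in exactly this way is an assumption. The `resonator` helper is a hypothetical stand-in for a single formant, used only to have a signal with a known resonance.

```python
import cmath
import math


def autocorrelation(x, max_lag):
    """r[k] = sum over n of x[n] * x[n + k], for k = 0..max_lag."""
    return [sum(x[n] * x[n + k] for n in range(len(x) - k))
            for k in range(max_lag + 1)]


def lpc(x, order):
    """Levinson-Durbin recursion. Returns (a, err) with a[0] == 1, where
    A(z) = sum of a[k] * z^-k is the prediction-error (inverse) filter."""
    r = autocorrelation(x, order)
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= (1.0 - k * k)                # remaining prediction error
    return a, err


def envelope_db(a, freq, rate):
    """Level in dB of the all-pole filter 1/A(z) at one frequency (Hz)."""
    w = 2 * math.pi * freq / rate
    A = sum(ak * cmath.exp(-1j * w * k) for k, ak in enumerate(a))
    return -20 * math.log10(abs(A))


def residual(x, a):
    """Inverse filtering: run x through A(z) to strip out the resonances."""
    return [sum(a[k] * x[n - k] for k in range(len(a)) if n - k >= 0)
            for n in range(len(x))]


def resonator(freq, radius, rate, n):
    """Impulse response of a two-pole resonance (hypothetical test signal)."""
    c1 = 2 * radius * math.cos(2 * math.pi * freq / rate)
    c2 = -radius * radius
    y = []
    for i in range(n):
        s = 1.0 if i == 0 else 0.0
        if i >= 1:
            s += c1 * y[i - 1]
        if i >= 2:
            s += c2 * y[i - 2]
        y.append(s)
    return y


signal = resonator(500.0, 0.95, 8000, 400)   # one 'formant' at 500 Hz
coeffs, err = lpc(signal, 2)                 # order 2 fits one resonance
source = residual(signal, coeffs)            # roughly the original impulse
```

Evaluating `envelope_db(coeffs, f, 8000)` over a range of frequencies `f` traces the smooth curve with a peak near the resonance, and `source` is what remains once that resonance has been removed. Real speech needs a higher order (several resonances) and a windowed stretch of signal.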
Next you can try filtering your speech
signal by following the instructions in part 4. of the Help Introduction file.
Try high-pass (only signals above a certain frequency are preserved;
the rest are blocked out), low-pass, band-pass (only frequencies above
one frequency and below another, higher frequency are allowed through),
and band-stop (frequencies within a certain range are blocked) filtering.
You could, for example, use this function to filter out the fundamental frequency
of a complex wave to hear only the overtones.
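The four filter types can be illustrated with a short sketch. The version below uses the simplest possible approach, an idealized 'brick-wall' filter that transforms the signal to the frequency domain, zeroes the unwanted components, and transforms back. This is an assumption made for clarity; practical tools, presumably including SFS, use smoother filters, and the function names here are invented for the example. The plain-Python transform is slow and only suitable for short signals.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform (fine for short signals only)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(X):
    """Inverse transform, keeping the real part of each sample."""
    n = len(X)
    return [(sum(X[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]


def brickwall_filter(x, rate, keep):
    """Zero every frequency component for which keep(freq_hz) is False."""
    n = len(x)
    X = dft(x)
    for k in range(n):
        # Bins above n/2 mirror the negative frequencies of a real signal.
        freq = min(k, n - k) * rate / n
        if not keep(freq):
            X[k] = 0
    return idft(X)


def high_pass(x, rate, cutoff):
    return brickwall_filter(x, rate, lambda f: f >= cutoff)


def low_pass(x, rate, cutoff):
    return brickwall_filter(x, rate, lambda f: f <= cutoff)


def band_pass(x, rate, lo, hi):
    return brickwall_filter(x, rate, lambda f: lo <= f <= hi)


def band_stop(x, rate, lo, hi):
    return brickwall_filter(x, rate, lambda f: not (lo <= f <= hi))


# A complex wave: 100 Hz fundamental plus overtones at 200 and 300 Hz.
RATE = 1600
N = 160


def partial(freq, amp):
    return [amp * math.sin(2 * math.pi * freq * t / RATE) for t in range(N)]


fundamental = partial(100, 1.0)
overtones = [a + b for a, b in zip(partial(200, 0.5), partial(300, 0.25))]
complex_wave = [f + o for f, o in zip(fundamental, overtones)]

# High-pass at 150 Hz removes the fundamental and leaves the overtones.
only_overtones = high_pass(complex_wave, RATE, 150)
```

In this example the frequencies were chosen to fall exactly on the analysis grid, so the filtered signal matches the overtones almost perfectly; with real speech the separation is never that clean.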
Under 'Tools' > 'Speech' > 'Process' >
'Speed change' you can change the rate or speed of a sound signal;
you can listen to yourself speaking much faster or slower.
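The crudest way to change speed is to resample the signal, reading through it faster or slower than it was recorded, as in the sketch below. Note the assumption: this simple method also raises or lowers the pitch, like playing a tape at the wrong speed. SFS may well use a more sophisticated method that behaves differently, so treat this only as an illustration of the basic idea; the function name is invented.

```python
def change_speed(samples, factor):
    """Resample by linear interpolation.

    factor > 1 speeds the signal up (fewer output samples);
    factor < 1 slows it down (more output samples).
    Played back at the original sampling rate, the pitch shifts too.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    last = len(samples) - 1
    n_out = int(len(samples) / factor)
    out = []
    for i in range(n_out):
        pos = i * factor                 # position in the original signal
        j = min(int(pos), last)
        frac = pos - j
        nxt = samples[min(j + 1, last)]  # hold the last sample at the end
        out.append((1 - frac) * samples[j] + frac * nxt)
    return out
```

Calling `change_speed(samples, 2)` halves the duration, and `change_speed(samples, 0.5)` doubles it.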
SFS comes with a user's manual that documents
all of SFS's functions; but you will probably find many of these difficult
to understand and use at this point; and some require additional software.
Don't be overambitious; just learn a function or two at a time and think of
ways you could use each function in a practical way. You will come looking
for other features as you learn more 'tricks of the trade'.
Before you commit to SFS, however, you should
learn about the other options first. Go on to the next page for an introduction
to Praat, which in Dutch means 'talk'.
Next: Advanced
speech analysis tools II: Praat