22. Advanced speech analysis tools I: SFS


     On the previous page, we learned about two new ways to display the information in a sound signal, the waterfall spectrogram and the spectrum. So far we have been using WASP to analyze speech signals. WASP is extremely convenient and functional for making waveforms, wide and narrow band spectrograms, and pitch tracks. But for spectra and numerous other kinds of speech analysis and related operations, we will need software with more functions.

     Fortunately, there are some excellent options available for free download over the Internet. This and the next page will introduce two of these: SFS and Praat. A list of other selected speech analysis programs will be given at the end of the next page. Each program has its own strengths and weaknesses. You can choose which one or ones are best for each task you wish to perform based on your own needs and preferences.

     The more advanced you are in any subject, the more independent you must become in your learning. You will no longer have a teacher constantly telling you how to proceed, and nodding at you when you're 'right' or correcting you when you're 'wrong' each step of the way. You will no longer have ready-made frameworks where you simply have to learn what is printed in the textbook or mentioned in class, and then 'fill in the blanks' of an exercise and pass an exam. You will need to choose your own books, begin formulating your own questions, and design a personal course of study. You will need to build up your own communications network of other scholars with interests similar to yours. You will have to decide, after reviewing several options, which tools you want to use, and how you want to use them. Professors, both at your home institution and those you can consult over the Internet and e-mail lists like Phonetics, will still be available for questions and discussion, but you will have to take on much more responsibility yourself for your own continuing education and development.

      Choosing advanced speech analysis software, learning how to use it, and then thinking up meaningful tasks to do with it is one place to start. This may involve expanding into previously unfamiliar areas like computer programming, or 'script writing'. And the software may always have many more functions than you know how to use, or even understand completely. But that should not stop you from starting simple and doing some of the things you can do with it. The repertoire of things you can do will grow over time, the more you work with the software.

     The first software package we will look at is SFS, or Speech Filing System, a DOS-based program running under a Windows shell, created by Mark Huckvale of University College London. The first immediate advantage of this option is that it is highly compatible with Huckvale's WASP, which you are already familiar with.

     Here is a link to the download page for SFS:

http://www.phon.ucl.ac.uk/resource/sfs/

     Choose the most recent version and click on 'download'. It is probably easiest to download the compressed files to your desktop, then install them on your system disk under 'Program Files'. The installation program will help you create an 'SFS' directory if you don't already have one from when you installed WASP. Remember to put a 'shortcut' icon on your desktop if it isn't done automatically. If you have a separate 'data' partition of your hard disk, you may want to save files you create under SFS there instead of on your system disk. That way, you won't crowd up your system partition with extra files, and if your system crashes and you have to reinstall, your data files will still be safe.

     After installing and launching SFS, click on 'Help' for a short tutorial to help you get started. In it, you will learn how to generate and replay sine waves. For fun, you can generate touch-tone dialing tones by entering a phone number under 'Tools' > 'Generate' > 'DTMF tones' (DTMF stands for 'Dual Tone Multi-Frequency'; touch-tone sounds are formed from two sine waves at specific frequencies). Try generating tone sequences under 'Tools' > 'Generate' > 'Generate tone sequences'. Click on the default values to hear what you get; then you can start changing the variables, e.g. choose '5' under 'Number of tones' instead of '1', and you will hear a series of five tones rising in pitch.
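     If you are curious how a touch-tone signal is put together, here is a minimal Python sketch of the idea: each key is the sum of two sine waves, one 'row' frequency and one 'column' frequency from the standard DTMF keypad. This is only an illustration, not the way SFS itself generates the tones. The function names, the 8000 Hz sample rate, and the tone and gap durations are all assumptions chosen for the example.

```python
import math
import struct
import wave

SAMPLE_RATE = 8000  # samples per second (an assumed value for this sketch)

# Standard DTMF keypad: each key pairs one row frequency with one column frequency (Hz).
ROW_FREQS = [697, 770, 852, 941]
COL_FREQS = [1209, 1336, 1477]
KEYPAD = ["123", "456", "789", "*0#"]


def key_frequencies(key):
    """Return the (low, high) frequency pair for one keypad character."""
    for row, keys in enumerate(KEYPAD):
        col = keys.find(key)
        if col != -1:
            return ROW_FREQS[row], COL_FREQS[col]
    raise ValueError(f"not a keypad character: {key!r}")


def dtmf_tone(key, duration=0.2):
    """Sum two equal-amplitude sine waves for one key; samples stay within [-1, 1]."""
    low, high = key_frequencies(key)
    n = int(SAMPLE_RATE * duration)
    return [
        0.5 * math.sin(2 * math.pi * low * t / SAMPLE_RATE)
        + 0.5 * math.sin(2 * math.pi * high * t / SAMPLE_RATE)
        for t in range(n)
    ]


def dial(number, tone=0.2, gap=0.05):
    """Concatenate the tones for each key, separated by short silences."""
    silence = [0.0] * int(SAMPLE_RATE * gap)
    samples = []
    for key in number:
        samples.extend(dtmf_tone(key, tone))
        samples.extend(silence)
    return samples


def write_wav(path, samples):
    """Save samples as a 16-bit mono WAV file so the result can be played back."""
    with wave.open(path, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)
        f.setframerate(SAMPLE_RATE)
        f.writeframes(b"".join(struct.pack("<h", int(s * 32767)) for s in samples))
```

     Calling `write_wav("dial.wav", dial("5551234"))` should give you a file you can open in WASP or SFS and compare, on a spectrogram, with the tones SFS generates: each key should show up as two horizontal bands.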

     Next, follow the instructions on how to record a signal. This should be easy, since the procedure is the same as for WASP. You can display a waveform, and/or a wide or narrow band spectrogram as you did with WASP by clicking on 'Tools' > '1. Speech' > 'Display' > 'Wide Spectrogram' (or whichever you prefer). You can also just click on the 'Display checked items' icon on the toolbar, then click again on the appropriate icons of the toolbar in the new window that is opened. There is also a 'Display all items' icon; if you click on this you can see displays of all the items in your list.

      Making sure the box of the file you've just made is still checked, try clicking on 'Tools' > '1. Speech' > 'Display' > 'Cross Section'. Display either a waveform or spectrogram or both. You will see two other displays at the bottom of the screen. These are spectra (plural of 'spectrum'). You can mark any section of your spectrogram by clicking the left, then the right mouse button. You will immediately see a spectrum of this section in the right-hand box. The 'filter response' box shows the result of a process called 'Linear Predictive Coding' or LPC. Follow the link for a technical explanation of what this is. You can learn more about LPC in books like Peter Ladefoged's Elements of Acoustic Phonetics. Basically, LPC is a method for estimating the resonances or formants from speech signal data by fitting a smooth curve, or envelope, to the peaks of the spectrum. It enables you to see and identify the formant peaks more clearly. You can then remove the formant data by a process called inverse filtering to reveal properties of the original sound source, i.e. the 'buzz' of the vocal folds or 'residue', something that is useful in digitizing speech data. You may ignore this for now.
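     For readers who want to see what LPC does in concrete terms, the following Python sketch uses one common textbook formulation, the autocorrelation method solved with the Levinson-Durbin recursion. It is not taken from SFS and may differ from what SFS does internally. To keep it short it leaves out steps that real analyses usually include, such as pre-emphasis and windowing of the frame. All function names are hypothetical, and `damped_resonance` is only a synthetic stand-in for a speech frame with a single 'formant'.

```python
import cmath
import math


def autocorrelation(frame, max_lag):
    """r[k] = sum of frame[n] * frame[n + k], for lags 0..max_lag."""
    n = len(frame)
    return [
        sum(frame[i] * frame[i + k] for i in range(n - k))
        for k in range(max_lag + 1)
    ]


def lpc(frame, order):
    """Levinson-Durbin recursion on the frame's autocorrelation.

    Returns (a, error): a = [1, a1, ..., ap] are the coefficients of the
    inverse filter A(z) = 1 + a1*z^-1 + ... + ap*z^-p, and error is the
    energy left in the residual after prediction.
    """
    r = autocorrelation(frame, order)
    a = [1.0] + [0.0] * order
    error = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / error  # reflection coefficient for this step
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        error *= 1.0 - k * k
    return a, error


def envelope_db(a, freq_hz, sample_rate):
    """Level of the smooth envelope 1/|A| at one frequency, in dB (gain omitted)."""
    w = 2 * math.pi * freq_hz / sample_rate
    response = sum(c * cmath.exp(-1j * w * k) for k, c in enumerate(a))
    return -20 * math.log10(abs(response))


def damped_resonance(freq_hz, radius, n, sample_rate):
    """Synthetic test signal: impulse response of a single two-pole resonator."""
    theta = 2 * math.pi * freq_hz / sample_rate
    b1, b2 = 2 * radius * math.cos(theta), -radius * radius
    y = []
    for i in range(n):
        x = 1.0 if i == 0 else 0.0
        y1 = y[-1] if i >= 1 else 0.0
        y2 = y[-2] if i >= 2 else 0.0
        y.append(x + b1 * y1 + b2 * y2)
    return y
```

     If you run `lpc` on `damped_resonance(500, 0.95, 400, 8000)` with `order=2` and then evaluate `envelope_db` across a range of frequencies, the envelope should peak close to 500 Hz, which is the 'formant' of this artificial signal. Real speech needs a higher order, since each formant takes roughly two coefficients.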

     Next you can try filtering your speech signal by following the instructions in part 4 of the Help Introduction file. Try high-pass (only signals above a certain frequency are preserved; the rest are blocked out), low-pass (only signals below a certain frequency are preserved), band-pass (only frequencies above one frequency and below another, higher frequency are allowed through), and band-stop (frequencies within a certain range are blocked) filtering. You could, for example, use this function to filter out the fundamental frequency of a complex wave to hear only the overtones.
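     The four filter types can be sketched in a few lines of Python. The example below uses a windowed-sinc FIR design, which is one standard textbook technique and not necessarily the one SFS uses. The function names, the Hamming window, and the default of 101 taps are assumptions made for the illustration.

```python
import math


def lowpass_kernel(cutoff_hz, sample_rate, taps=101):
    """Windowed-sinc low-pass kernel (Hamming window), scaled to unit gain at 0 Hz."""
    fc = cutoff_hz / sample_rate  # cutoff as a fraction of the sample rate
    mid = (taps - 1) / 2
    kernel = []
    for i in range(taps):
        x = i - mid
        sinc = 2 * fc if x == 0 else math.sin(2 * math.pi * fc * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * i / (taps - 1))
        kernel.append(sinc * window)
    total = sum(kernel)
    return [k / total for k in kernel]


def highpass_kernel(cutoff_hz, sample_rate, taps=101):
    """Spectral inversion: a unit impulse minus the low-pass kernel (taps must be odd)."""
    kernel = [-k for k in lowpass_kernel(cutoff_hz, sample_rate, taps)]
    kernel[(taps - 1) // 2] += 1.0
    return kernel


def bandstop_kernel(low_hz, high_hz, sample_rate, taps=101):
    """Block the range low_hz..high_hz: keep what is below it plus what is above it."""
    lp = lowpass_kernel(low_hz, sample_rate, taps)
    hp = highpass_kernel(high_hz, sample_rate, taps)
    return [a + b for a, b in zip(lp, hp)]


def bandpass_kernel(low_hz, high_hz, sample_rate, taps=101):
    """Keep only the range low_hz..high_hz: the spectral inverse of the band-stop."""
    kernel = [-k for k in bandstop_kernel(low_hz, high_hz, sample_rate, taps)]
    kernel[(taps - 1) // 2] += 1.0
    return kernel


def apply_filter(signal, kernel):
    """Centred convolution; output has the same length as the input."""
    half = len(kernel) // 2
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, c in enumerate(kernel):
            idx = n + half - k
            if 0 <= idx < len(signal):  # samples beyond the ends count as zero
                acc += c * signal[idx]
        out.append(acc)
    return out


def rms(samples):
    """Root-mean-square level, a simple measure of how much signal got through."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))
```

     To imitate the experiment suggested above, you could build a complex wave from several sine waves, pass it through `apply_filter` with a `highpass_kernel` whose cutoff lies between the fundamental and the first overtone, and compare the `rms` level before and after. With only 101 taps the cutoff is not perfectly sharp, so leave a few hundred Hz between the components you want to keep and those you want to remove.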

     Under 'Tools' > 'Speech' > 'Process' > 'Speed change' you can change the rate or speed of a sound signal, so that you can listen to yourself speaking much faster or slower.
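     The simplest way to change the speed of a digital signal is to resample it, as in the Python sketch below. Note the assumptions: this naive method also shifts the pitch, just as a tape played too fast sounds higher, and it is not presented as the method behind the SFS 'Speed change' function, which may well work differently. The name `change_speed` is hypothetical.

```python
def change_speed(samples, factor):
    """Resample by linear interpolation.

    factor > 1 gives a shorter (faster) signal, factor < 1 a longer (slower) one.
    Played back at the original sample rate, the pitch changes by the same factor.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    n_out = int(len(samples) / factor)
    out = []
    for i in range(n_out):
        pos = i * factor                 # position in the original signal
        left = int(pos)                  # sample just before that position
        right = min(left + 1, len(samples) - 1)
        frac = pos - left                # how far we are between the two samples
        out.append((1 - frac) * samples[left] + frac * samples[right])
    return out
```

     For example, `change_speed(samples, 2)` keeps roughly every second sample and so halves the duration, while `change_speed(samples, 0.5)` doubles it by filling in values between the original samples.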

     SFS comes with a user's manual that documents all of its functions, but you will probably find many of these difficult to understand and use at this point, and some require additional software. Don't be overambitious; just learn a function or two at a time and think of ways you could use each function in a practical way. You will come looking for other features as you learn more 'tricks of the trade'.

     Before you commit to SFS, however, you should learn about the other options. Go on to the next page for an introduction to Praat, which in Dutch means 'talk'.


Next:
Advanced speech analysis tools II: Praat
