18b. Seeing with your ears: Peter Meijer's vOICe II
(with applet for drawing soundscapes)

     This page is a bit longer than some of the other pages, and will require careful concentration – so get ready!

     On the preceding page, you were asked to recall how a spectrograph analyzes the sounds of human speech (or any other sound, but for now we'll concentrate on human speech) into their component frequencies, and how it shows the relative loudness of each component at each moment, "painted" out over time.

     You should remember from your Introduction to Linguistics class that some words in human speech are formed on the basis of sound symbolism. Onomatopoeic words like whoosh! and meow imitate sounds in the environment; words like sleek, slip, slippery, slide and slimy may suggest a smooth surface and slipping around because of their initial [sl-] cluster (such clusters are called "phonaesthemes"); and reduplicative words like English so-so and Mandarin ma3ma3hu1hu1 'so-so', along with rhyming compounds like English humdrum and Mandarin 彷徨 pang2huang2 and 徘徊 pai2huai2 'to pace back and forth or up and down, waver', suggest repetition leading to monotony. But for the most part, there is no direct connection between the words we use in human speech and the objects they refer to. Pferd, caballo, cheval, loshad', ma3, be2 and uma all mean 'horse', but they are all very different from each other in sound, and none sounds inherently any more like a horse than the others. As the great Swiss structuralist linguist Ferdinand de Saussure put it: "Le lien unissant le signifiant au signifié est arbitraire" ('the bond between the signifier and the signified is arbitrary'). (de Saussure, Ferdinand. Course in General Linguistics. New York: The Philosophical Library, 1959: 67.)

     But now imagine that you have a snapshot of the scene in front of you. What would houses, cars and people sound like if we could run a picture of them through a spectrograph (converting any spectrograms back into corresponding sounds)? Or conversely, what kinds of sounds would visually appear as houses, cars and people if we recorded these sounds on a spectrograph? Remember that marks that are higher in the spectrogram have a higher frequency, and darker marks are louder than lighter marks.

     We are now going to make two fundamental changes in what the spectrogram display represents. First, instead of darker marks representing louder sounds, they will now represent softer sounds, so the lighter or whiter the display, the louder the sound. And second, while the horizontal axis still represents time, it now also spans the entire width of one snapshot, that is, a single scene from left to right. With added stereo panning, the sound will start at your left and seem to travel to your right as the scanned image is output from left to right. What would a building or tree sound like, played back on an "inverted spectrograph" like this?
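
     If you like to see such a mapping spelled out, the two changes above can be sketched in a few lines of Python. This is only a rough illustration under stated assumptions, not the software discussed on this page: the function name image_to_soundscape, the 500-5000 Hz range, the evenly spaced frequencies, the one-second scan, the sample rate and the simple linear left-to-right pan are all choices made up for this example.

```python
import math

def image_to_soundscape(image, duration=1.0, sample_rate=16000,
                        f_low=500.0, f_high=5000.0):
    """Turn a grayscale image into a list of (left, right) stereo samples.

    `image` is a list of rows, top row first; each value is a brightness
    from 0.0 (black, silent) to 1.0 (white, loudest).
    All default parameter values are illustrative assumptions.
    """
    n_rows = len(image)
    n_cols = len(image[0])

    # Higher rows in the picture get higher frequencies (evenly spaced here).
    if n_rows > 1:
        step = (f_high - f_low) / (n_rows - 1)
        freqs = [f_high - r * step for r in range(n_rows)]
    else:
        freqs = [f_low]

    n_samples = round(duration * sample_rate)
    samples = []
    for i in range(n_samples):
        t = i / sample_rate
        # The scan moves through the image columns from left to right.
        col = min(i * n_cols // n_samples, n_cols - 1)
        # Brightness sets loudness: each row adds a sine tone scaled by it.
        value = sum(image[r][col] * math.sin(2 * math.pi * freqs[r] * t)
                    for r in range(n_rows)) / n_rows
        # Stereo panning: the sound starts at the left and ends at the right.
        pan = i / (n_samples - 1) if n_samples > 1 else 0.5
        samples.append((value * (1.0 - pan), value * pan))
    return samples

# A bright diagonal line rising from lower left to upper right
# should sound like a tone that rises in pitch while moving to the right.
diagonal = [
    [0.0, 0.0, 1.0],
    [0.0, 1.0, 0.0],
    [1.0, 0.0, 0.0],
]
soundscape = image_to_soundscape(diagonal)
```

     The resulting samples are plain numbers between -1 and 1; to actually hear them you would still have to scale them and write them to a sound file, for example with Python's standard wave module.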

     Well, there is exactly such a setup in existence. Can you imagine it as having any practical use?

     Peter Meijer, a Dutch engineer and inventor, has designed a system that makes "inverted spectrograms" of the world in order to translate visual images into sound. He calls this system The vOICe (read the capitalized letters aloud individually to get "Oh, I see!"), or "Seeing with Sound". He believes that this is one way to enable blind people to "see", without the aid of a personal visual interpreter to describe everything in the environment. In theory, a blind person using The vOICe can experience the world directly, without the assistance of others.

     There are of course drawbacks. Can you think of some? Why not try the system yourself first and see how it works? Go to the link below and play around with the "soundscapes" for a while. Look at the image, then click on it and listen to the sound it produces after being processed through The vOICe software. After practicing for a while, are you able to identify what is in some of the pictures just by sound, with your eyes closed?


     How did you do? How difficult did you find this?
Peter Meijer's vOICe was featured in two listening exercises used in other NTU classes; click here and here to view them. The first includes links to videos on the Dobelle implant and The vOICe. The second is an interview by the Canadian Broadcasting Corporation with Pat Fletcher, a regular vOICe user.

     Some NTU students who tried The vOICe said they found the sounds it produced "irritating" to listen to. Here is Pat's reply to that:


     Here are some other user comments on The vOICe:


     For some relaxation, click on the Bach sample, in which The vOICe functions like a player-piano playing a roll of music.

     To convince yourself that The vOICe really makes sounds based on visual input, which can then be displayed as "spectrograms" that reconstitute the visual image, go now to the following page, which is a special "phonetics version" of Peter's Java applet of The vOICe, designed by linguist George Dillon of the University of Washington in Seattle. (I first found Peter Meijer's site through George Dillon's phonetics resources page. It contains lots of excellent content and links and is well worth a careful browse, and bookmarking for future use.)

     Remember that with The vOICe, brighter marks rather than darker ones are louder. So we need to feed a negative image of a spectrogram into The vOICe for it to produce sounds like human speech – note how the sample spectrograms look on the Web page. Most of the other parameters, other than the light-dark reversal, are similar to those of a regular spectrograph. And as with a regular spectrograph, you can adjust the frequency range if you like, but it is preset to what Meijer feels is optimal for its intended purpose.     
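
     The light-dark reversal itself is a very simple operation. As a minimal sketch, assuming a grayscale picture stored as rows of brightness values from 0.0 (black) to 1.0 (white), a negative can be made like this; the function name invert_image is made up for this example and is not part of any of the software mentioned here.

```python
def invert_image(image):
    """Return the negative of a grayscale image.

    `image` is a list of rows of brightness values from 0.0 (black)
    to 1.0 (white); dark marks become bright ones and vice versa.
    """
    return [[1.0 - value for value in row] for row in image]

# A dark formant band (0.0) on a white background (1.0), as in an ordinary
# spectrogram, becomes a bright band on a black background.
ordinary = [
    [1.0, 1.0, 1.0],
    [0.0, 0.0, 0.0],
    [1.0, 1.0, 1.0],
]
negative = invert_image(ordinary)
```

     In the negative, only the middle row is bright, so only the frequency assigned to that row would be heard.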


     After you have played around with the spectrograms for a while, think about how these resources have expanded and deepened your ideas about some of the possible connections between sight and sound.

     We'll return to formants now for a while, with links to some very exotic music samples.

Next: Vowels and Formants III: Formants for fun and profit (with samples of exotic music)
