The Mechanical Talking Head: An Early Speech Synthesiser

The Euphonia, a mechanical speech synthesiser. Public domain photo, sourced from Atlas Obscura.

In a dimly-lit room in the back of London’s Egyptian Hall, a few curious people had gathered.  Each had paid a shilling for the privilege of seeing the object standing in the centre of the room.  A grotesque device it was — the mask of a woman’s face, framed in the fashionable ringlets of the day, mounted on a frame which was attached to a piano-like instrument.  Behind the keys sat a doleful-looking man whose clothes, though fine, had seen better days.  The man placed his fingers on the keyboard and moved his foot on the pedal.  The automaton’s mechanical lips parted, and a spectral voice issued thence: “Good morning, ladies and gentlemen.”

It was 1846, and Joseph Faber, a German inventor, was exhibiting the talking machine which he had been developing for the last two decades.  Named the Euphonia, the machine was purportedly able to produce every sound in any European language.  Nobody seems to know much about Faber.  He was born around 1800 in Freiburg in present-day Germany, studied in Vienna, and, after a bout of illness, worked on mechanical projects as a way to rouse himself from hypochondria.1

In this post I’ll describe how Faber’s Euphonia would have worked, and the role she had to play in the history of humanity’s quest for artificial speech.  Along the way, I’ll also be talking about the workings of the human voice itself.

The Source of Our Voice

Let’s start with an introduction to how our voice works. Our journey begins in the lungs, our source of air. When we speak, we relax our diaphragm muscles, compressing the lungs and pushing air up through the throat and out of the mouth. But it’s no good simply having the air flow out: that would hardly generate any sound at all, rather like breathing.

To make sounds, we have to interrupt the airflow. Enter the vocal cords: folds of ligament stretched across the throat. When we speak, they open and close a hundred or more times per second, allowing the air to escape in pulses. Try resting a finger lightly on the front of your throat while humming, and you can feel your vocal cords vibrating! It is this repeating pattern of air pulses that gives our voice a pitch. By controlling the frequency at which the vocal cords open and close, we can make our voice sound higher or lower. Here’s an animation of the vocal cords in action:

 

How do we control the vibration of our vocal cords?

In ordinary speech the vocal cords typically vibrate at somewhere between about 100 and 250 Hz, and they can get up above 1000 Hz in a soprano’s singing voice. That means that they open and close a hundred or more times per second. How do the muscles in the throat do this? Do they have to keep contracting and relaxing every time?

It turns out that the vocal cords work in a much more efficient way, using the flow of air to their advantage.  The cords are maintained at a constant tension by muscles in the throat.  When they are closed, air (pushed out from the lungs) builds up underneath the cords.  When the pressure has built up enough, the air pushes the vocal cords open, and escapes.  The rush of air between the cords causes a decrease in pressure between them (the Bernoulli effect, for physics and engineering buffs).  Together with the continuous tension in the vocal cords, this causes the cords to return to a closed position.  The cycle then starts again.2

We control the frequency of the open-close cycle (and thus the pitch of our voice) by controlling the tension of the vocal cords — a higher tension gives a higher pitch.  The mass of the cords also plays a role — since men tend to have thicker vocal cords than women and children, their cords tend to vibrate at lower frequencies, leading to a lower-pitched voice.
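To get a feel for how tension and mass pull the pitch in opposite directions, here is a minimal Python sketch that treats a vocal cord as an ideal stretched string. That is a big simplification and purely my assumption (real vocal cords are layered tissue driven by airflow), and the function name and the numbers are made up for illustration rather than taken from anatomical measurements.

```python
import math

def string_frequency(tension_n, length_m, mass_per_length_kg_m):
    """Fundamental frequency (Hz) of an ideal stretched string.

    Assumption: the cord behaves like a uniform string fixed at both
    ends, so f = sqrt(T / mu) / (2 * L). Real vocal cords are messier.
    """
    return math.sqrt(tension_n / mass_per_length_kg_m) / (2 * length_m)

# Illustrative values only (not anatomical measurements).
base = string_frequency(tension_n=0.1, length_m=0.015, mass_per_length_kg_m=0.005)
tighter = string_frequency(tension_n=0.4, length_m=0.015, mass_per_length_kg_m=0.005)
heavier = string_frequency(tension_n=0.1, length_m=0.015, mass_per_length_kg_m=0.02)

print(round(base), round(tighter), round(heavier))
```

In this toy model, quadrupling the tension doubles the frequency, while quadrupling the mass per unit length halves it, which matches the direction of the effects described above.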

The Euphonia’s vocal cords were not made of muscle, but rather of an ivory reed, a flat strip of material set in a thin opening. Her lungs consisted of a bellows connected to the reed. When the operator pushed down on the bellows with his foot, air was forced past the reed, causing it to vibrate and produce a buzzing sound similar to that of our vocal cords vibrating. The reed’s pitch could be adjusted with a screw (though I expect it couldn’t be adjusted while talking, which would have made for a monotone, flat-sounding voice).

A preserved baker’s bellows at Deutsches Werkzeugmuseum (German Tools Museum) at Remscheid, by Frank Vincentz, CC BY-SA.

Here’s roughly what vibrating vocal cords would sound like, if we were to hear them by themselves:

 

That’s not terribly interesting, is it?  Nor does it sound very much like a human voice. There is another component of our speech-producing anatomy that we still need to take into account, and this is, of course, our mouth.
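If you’d like to produce a bare buzz of this kind yourself, here is a rough Python sketch. It models the vocal cords (or the Euphonia’s reed) as nothing more than one click of air per open-close cycle, which is my own simplification; the function names, the 100 Hz pitch and the 8000 Hz sample rate are arbitrary choices for illustration.

```python
import struct
import wave

def pulse_train(f0_hz, sample_rate_hz, duration_s):
    """Return samples that are 1.0 once per cycle and 0.0 otherwise."""
    n_samples = int(sample_rate_hz * duration_s)
    period = round(sample_rate_hz / f0_hz)  # samples per open-close cycle
    return [1.0 if n % period == 0 else 0.0 for n in range(n_samples)]

def save_wav(path, samples, sample_rate_hz):
    """Write mono 16-bit samples (floats in [-1, 1]) to a WAV file."""
    frames = b"".join(struct.pack("<h", int(s * 32767)) for s in samples)
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)
        wav_file.setframerate(sample_rate_hz)
        wav_file.writeframes(frames)

samples = pulse_train(f0_hz=100, sample_rate_hz=8000, duration_s=0.5)
pulses = int(sum(samples))
print(pulses)  # 50 pulses in half a second at 100 Hz

# save_wav("buzz.wav", samples, 8000)  # uncomment to listen to the buzz
```

Raising `f0_hz` packs the pulses closer together, which is all that a higher pitch means at this stage.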

Making Speech Sounds

The mouth forms part of a larger system called the vocal tract. This consists of the oral and nasal cavities, from the vocal cords to the lips and nostrils. We can modify the sound produced by our vocal cords by moving parts of the vocal tract, such as our tongue, lips and jaw. Have a look at this video, which shows the different positions of the vocal tract as various vowels are pronounced:

As we saw in the video, changing the position of the tongue, jaw and lips changes the shape of the vocal tract.  These different shapes modify the sound from the vocal cords in different ways, forming different vowels (I’ll talk about this more in the next post).

Make your own vocal tract

You can observe for yourself how the vocal tract modifies the sound coming from the vocal cords, by making a model of it!  This site gives some detailed instructions.  It models the vocal tract using tubes divided into differently-sized chambers, much as the sizes of different parts of our vocal tract change depending on how we position our tongue and jaw.  You can hear the results at this exhibit at the Exploratorium in San Francisco, which demonstrates the same concept with a more elaborate model.
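To see why the size of a tube matters, here is a small Python sketch of the simplest possible model: a single uniform tube, closed at the vocal-cord end and open at the lips. Physics textbooks give its resonances as odd multiples of a quarter wavelength. The 17 cm length and the 340 m/s speed of sound are round figures I am assuming for illustration, and a real vocal tract is of course not a uniform tube.

```python
def tube_resonances(length_m, speed_of_sound_m_s=340.0, count=3):
    """Resonant frequencies (Hz) of a uniform tube closed at one end.

    f_n = (2n - 1) * c / (4 * L), for n = 1, 2, 3, ...
    """
    return [(2 * n - 1) * speed_of_sound_m_s / (4 * length_m)
            for n in range(1, count + 1)]

# Assumed round numbers: a 17 cm tube and sound travelling at 340 m/s.
print([round(f) for f in tube_resonances(0.17)])  # [500, 1500, 2500]
```

A shorter tube resonates at higher frequencies, and dividing the tube into chambers of different sizes shifts the resonances individually, which is the effect the tube models demonstrate.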

The Euphonia had a mouth, too, set behind her lips of India rubber.  She had a tongue, palate, cheeks and a jaw1.  All of these parts were connected via strings and levers to a keyboard, which allowed her to move her mouth to pronounce different words.

But that wasn’t quite enough.  She also had a bunch of baffles and shutters for the air to pass through, as well as whistles and whoopee-cushions.  Her operator could even insert a tube into her nose to change her accent3!  Why all this extra stuff?  I expect it was because her mouth parts couldn’t be controlled finely enough to make certain sounds.

Consonants, for example, are rather more complicated to make than vowels.  When we humans produce them, some part of the vocal tract constricts to interrupt the air flow.  Some consonants, like “s”, are made by forcing air through a small constriction (between the tongue and the back of the upper teeth in this case).  I think these might have been what the Euphonia’s whistles were for.  Other consonants, like “k” and “p”, are made by stopping the airflow for a fraction of a second, and then releasing a burst of air.  It is unclear how the Euphonia managed these, but earlier talking machines had a separate mechanism with auxiliary bellows to produce these bursts of air.  The Euphonia would also have needed nostrils and a nasal cavity, for sounds like “n”. 

As a further complication, the Euphonia also had to be able to pronounce both voiced and unvoiced consonants.  In unvoiced consonants the air passes between open vocal cords, while in voiced consonants the vocal cords vibrate4.  Try putting your finger on the front of your throat while you make an “s” and a “z” sound — feel the difference?

All of the Euphonia’s moving parts were connected to her keyboard, which had seventeen keys. One key controlled whether the air bypassed the reed that served as her vocal cords, while each of the others produced an elementary sound; strung together, these sounds formed words and sentences5. It must have taken a lot of practice to learn how to play the Euphonia properly!
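One way to picture this arrangement is as a tiny data structure: a set of sound keys plus one extra switch for voicing. The Python sketch below is purely hypothetical. The key names, the `KeyPress` class and the `play` function are my own inventions for illustration, since I don’t know which sound was assigned to which key.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class KeyPress:
    sound: str     # which elementary-sound key is held
    voiced: bool   # False means the bypass key is held too

# Hypothetical sound keys; the real assignments are not known to me.
SOUND_KEYS = ("a", "e", "i", "o", "u", "m", "s")

def play(presses):
    """Turn a sequence of key presses into a readable event list."""
    events = []
    for press in presses:
        if press.sound not in SOUND_KEYS:
            raise ValueError(f"no key for sound {press.sound!r}")
        source = "reed buzz" if press.voiced else "air only"
        events.append(f"{press.sound} ({source})")
    return events

word = [KeyPress("s", voiced=False), KeyPress("o", voiced=True)]
print(play(word))  # ['s (air only)', 'o (reed buzz)']
```

The point of the sketch is only that voicing is an independent switch: the same mouth shape can be driven either by the buzzing reed or by plain air.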

Creeping Out the Public

Faber had high hopes for the Euphonia’s London exhibition.  He had previously shown his talking machine in Vienna, Bavaria and the United States, but had not received much attention, and had even destroyed earlier versions of the device in fits of discouragement.  His invention was not without its admirers among scientists, though, and Faber hoped for a better response this time.

No model or blueprint of the Euphonia seems to have survived, and since she predated the first recording devices, we can’t know for sure what she sounded like. We can get an idea, though, by looking at an earlier, more primitive speaking machine, developed by Wolfgang von Kempelen in the second half of the 18th century. This device worked on similar principles to the Euphonia, using a bellows, reed, consonant whistles and a horn-shaped mouth, and was part of Faber’s inspiration for the Euphonia. Von Kempelen left detailed notes describing the workings of his machine6, which allowed researchers at Saarland University to build a modern prototype of it:

If you found that box-child’s “Mama” cry terrifying, imagine hearing a similar voice emanating from the staring mask of the Euphonia.  A more advanced machine she may have been, but it was all still too creepy for Faber’s London audience.  It didn’t help that they could feel her “breath” coming from her mouth as she spoke!  John Hollingshead, who attended the exhibition, produced the following rather cutting report:

“The Professor was not too clean, and his hair and beard sadly wanted the attention of a barber. I had no doubt that he slept in the same room as the figure—his scientific Frankenstein monster—and I felt the secret influence of an idea that the two were destined to live and die together. The Professor, with a slight German accent, put his wonderful toy in motion… The keyboard, touched by the Professor, produced words which, slowly and deliberately in a hoarse sepulchral voice came from the mouth of the figure, as if from the depths of a tomb…

As a crowning display, the head sang a sepulchral version of “God save the Queen”, which suggested inevitably, God save the inventor. This extraordinary effect was achieved by the Professor working two keyboards—one for the words, and one for the music. Never probably, before or since, has the National Anthem been so sung. Sadder and wiser I, and the few visitors, crept slowly from the place, leaving the Professor with his one and only treasure—his child of infinite labour and unmeasurable sorrow.”7

There were, however, some critics who were impressed: the well-regarded scientist Joseph Henry suggested that the Euphonia could be connected to the telegraph system to read out telegrams. Others joked about how the machine could relieve pastors of their sermon-reading duties, or speak for members of Parliament who had strong Scottish accents8. Even so, the Euphonia never caught on with the public, and according to some reports, a disappointed Faber eventually committed suicide.

The Euphonia’s Legacy

Had Faber lived longer, he might have seen the impact that his invention eventually made. Among the audience at his London exhibition had been Melville Bell, who would later inspire and encourage his son Alexander Graham Bell in his studies of the human voice1. As a child, the younger Bell actually built a talking head out of wood and rubber, as he describes in this amusing anecdote. Bell eventually turned from mechanical devices to electrical ones, and invented the telephone that so changed our world.

Faber’s machine was forgotten, but humanity’s experiments with speech synthesis were far from over.  While Bell’s telephone used electricity to transmit speech, someone else developed an electrical machine that could generate its own speech, eventually leading us to the computer voices that are so ubiquitous today.  In a future post, I’ll talk about the Voder — the first electronic talking machine.

Until then, I’ll leave you with this modern version of a talking rubber head:

1. Joseph Faber’s Marvelous Talking Machine, Euphonia. Racing Nellie Bly. http://racingnelliebly.com/weirdscience/joseph-fabers-marvelous-talking-machine-euphonia/.
2. The larynx and the glottal cycle. Center for Spoken Language Understanding. http://www.cslu.ogi.edu/tutordemos/SpectrogramReading/cse551html/cse551/node24.html.
3. Text-To-Speech in 1846 Involved a Talking Robotic Head With Ringlets. Atlas Obscura. http://www.atlasobscura.com/articles/texttospeech-in-1846-involved-a-talking-robotic-head-with-ringlets. Published March 9, 2016.
4. Rossing TD. The Science of Sound. Addison-Wesley; 1990.
5. Joseph Faber’s Amazing Talking Machine of 1845. Impact Lab. http://www.impactlab.net/2008/03/15/joseph-fabers-amazing-talking-machine-of-1845/.
6. Wolfgang von Kempelen’s Speaking Machine and its Successors. Hartmut Traunmüller, Institutionen för Lingvistik. http://www2.ling.su.se/staff/hartmut/kemplne.htm.
7. Joseph Faber’s Euphonia. History of Computers. http://history-computer.com/Dreamers/Faber.html.
8. Not Mr. Edison’s Talking Machine. Antique Phonograph News. http://www.capsnews.org/apn2013-3.htm.
