Speak & Spell

Back in 1978, Texas Instruments, innovators in integrated circuit design and manufacture, released the now-legendary Speak & Spell, a teaching aid for children to assist with spelling and reading. This might seem a very unlikely 'instrument' to feature here, but as Kraftwerk and other artists have used the Speak & Spell's crude speech synthesis in their music, it (sort of) qualifies as a 'vintage instrument'.

The idea with the Speak & Spell was that it would 'speak' a word at random and the child would then be required to spell it using the 'keyboard'. If the child got it right, he/she would be told so; otherwise, "ERROR". For the time, it was actually quite a sophisticated little mini-computer with a large vocabulary of stored words, letters and numbers that were replayed in truly glorious lo-fi 8-bit through a rather crappy on-board speaker.

In other words, sound quality was pretty awful. Fortunately, the young, evolving human brain is sophisticated enough to extrapolate rather than imitate blindly, otherwise there would be a whole generation of people all talking like Professor Stephen Hawking!

Also, the thing rankled over here in the UK as the speech patterns and pronunciations were all very distinctly 'American' ("zee" not "zed"; "skedule" instead of the correct "shedule" and more.... ;-)

Describing how it worked is best left to Texas Instruments, so I quote the following from their website:

DALLAS (June 11, 1978) - A new speech synthesis monolithic integrated circuit has been developed by Texas Instruments Incorporated. It marks the first time the human vocal tract has been electronically duplicated on a single chip of silicon. Measuring 44,000 square mils, the chip is fabricated using TI's low-cost metal gate P-channel MOS process, the same used for TI calculator MOS ICs.

The speech synthesis MOS/LSI integrated circuit along with two 128K dynamic ROMs each with the capacity to store over 100 seconds of speech, and a special version of the TMS 1000 microcomputer, all TI developed, serve as the main electronics for the new talking learning aid, SPEAK & SPELL™, for seven year olds and up. The new TI consumer product was introduced at the Summer Consumer Electronics Shows in Chicago, June 11-14.
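An aside of my own, not part of TI's text: the ROM figures above give a rough feel for how tightly the speech was compressed. The sketch below assumes that "128K" means 128 x 1024 bits and treats the quoted 100 seconds as a round figure. Both are my assumptions, and the variable names are purely illustrative.

```python
# Assumption: "128K" is read as 128 * 1024 bits per ROM.
ROM_BITS = 128 * 1024
# Assumption: "over 100 seconds" is treated as exactly 100 seconds.
SPEECH_SECONDS = 100

# Average amount of stored data consumed per second of speech.
bits_per_second = ROM_BITS / SPEECH_SECONDS
print(f"about {bits_per_second:.0f} bits of stored data per second of speech")
```

As each ROM is said to hold "over" 100 seconds, the true average would be somewhat lower than this figure.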

Speech encoding is achieved through pitch excited Linear Predictive Coding (LPC). As the name implies, LPC is based on a linear equation to formulate a mathematical model of the human vocal tract and an ability to predict a speech sample based on previous ones.

Linear Predictive Coding is a technique of analyzing and synthesizing human speech by determining from original speech a description of a time varying digital filter modeling the vocal tract. This filter is then excited by either periodic or random inputs. An on-chip 8-bit digital-to-analog (D/A) converter transforms digital information processed through the filter into synthetic speech.

Codes for twelve synthesis parameters (10 filter coefficients, pitch and energy) serve as inputs to the synthesizer chip. These codes are stored in a ROM and, once decoded by on-chip circuitry, represent the time varying description of the LPC synthesis model.

Inputs to the digital filter take two forms: (1) periodic and (2) random. The periodic inputs are used to reproduce voiced sounds which have a definite pitch such as vowel sounds or voiced fricatives such as Z, B or D. A random input models unvoiced sounds such as S, F, T and SH.

The speech synthesis chip has two separate logic blocks which generate the voiced and unvoiced excitation. Output of the digital filter drives a D to A converter which in turn drives a speaker.
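Again an aside from me rather than TI: the two kinds of excitation described above can be roughed out in a few lines. This is only a minimal sketch. The function name `excitation`, the plain unit impulses and the +/-1 noise are my own simplifications and not a description of the circuitry inside the chip.

```python
import random

def excitation(voiced, pitch_period, n_samples, seed=0):
    """Return n_samples of excitation: an impulse train if voiced, else noise."""
    rng = random.Random(seed)
    if voiced:
        # Periodic input: one unit impulse every pitch_period samples.
        return [1.0 if i % pitch_period == 0 else 0.0 for i in range(n_samples)]
    # Random input: +/-1 values stand in for a noise source (a simplification).
    return [rng.choice((-1.0, 1.0)) for _ in range(n_samples)]

# At 8,000 samples per second, a period of 80 samples gives a 100 Hz pitch.
voiced = excitation(True, pitch_period=80, n_samples=200)
unvoiced = excitation(False, pitch_period=0, n_samples=200)
```

The voiced list holds an impulse at samples 0, 80 and 160 and zeros elsewhere, while the unvoiced list is just a run of random +1 and -1 values.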

Key to TI's high quality LPC speech synthesizer is an advanced design 10-stage lattice filter which has an integrated array multiplier, an adder coupled to the multiplier output and various delay circuits coupled to the adder output.
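Another aside of mine, not TI's: a lattice filter of this general kind can be sketched in software as below. This is the standard textbook all-pole lattice form, assumed here for illustration. The function name `lattice_synthesize` and the coefficient list `k` are hypothetical, and nothing here claims to match the chip's actual arithmetic or coefficient coding.

```python
def lattice_synthesize(excitation, k):
    """All-pole lattice filter; k holds one reflection coefficient per stage."""
    order = len(k)
    b = [0.0] * (order + 1)  # backward signals kept from the previous sample
    out = []
    for e in excitation:
        f = e
        # Work from the last stage down to the first.
        for m in range(order, 0, -1):
            f = f - k[m - 1] * b[m - 1]     # first multiply of this stage
            b[m] = b[m - 1] + k[m - 1] * f  # second multiply of this stage
        b[0] = f
        out.append(f)
    return out

# A one-stage filter fed a single impulse rings with a decaying, alternating tail.
ringing = lattice_synthesize([1.0, 0.0, 0.0, 0.0], [0.5])
# Ten stages with all coefficients at zero pass the input through unchanged.
passthrough = lattice_synthesize([1.0, 2.0, 3.0], [0.0] * 10)
```

Each stage costs two multiplications per sample, so a 10-stage filter in this form needs twenty per output sample. In this textbook form the filter stays stable as long as every coefficient has a magnitude below 1.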

With this increased computational sequencing capability and a fast continuous data transfer rate, the multiplier can accept two inputs every five microseconds. Twenty multiply and accumulate operations are needed to generate each speech sample, and the circuit can generate up to 10,000 speech samples per second.
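One more aside from me: the sample-rate figure follows directly from the numbers quoted. The short check below assumes that the twenty operations run one after another at one multiply every five microseconds, which is my reading of the text and not something TI spells out.

```python
MULTIPLY_TIME_US = 5        # one multiply every five microseconds (from the text)
MULTIPLIES_PER_SAMPLE = 20  # twenty multiply-accumulates per sample (from the text)

# Assumption: the multiplies are strictly sequential, with no other overhead.
time_per_sample_us = MULTIPLY_TIME_US * MULTIPLIES_PER_SAMPLE
max_samples_per_second = 1_000_000 // time_per_sample_us
print(time_per_sample_us, max_samples_per_second)  # prints: 100 10000
```

That leaves some headroom over the 8 kHz rate actually used in the Speak & Spell.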

The chip is operated at an eight kilohertz rate for the Speak & Spell. This 10th order Linear Predictive Coding (LPC-10) speech synthesizer IC accurately reproduces human speech from stored or transmitted digital data.

So... there you have it! Simple eh?!

I am extremely grateful to Luke Smiles from "motion laboratories", who has put together a meticulous collection of all the letters, numbers, words and phrases from his own Speak & Spell. Using these creatively, you can construct all manner of phrases, sentences and lyrics to use in your music. Luke has captured the lo-fi quality of the instrument perfectly and this has to be the definitive collection of Speak & Spell samples. The whole collection is rather large and so has been split up into its respective components for you to use in smaller chunks as you wish.

I have to say that I have recently been turning down offers of sample donations from such 'toys', as that area is now pretty much covered at Hollow Sun with more cheep-n-cheezy Casios than you can shake a stick at, but Luke's offer was too interesting to pass up. If you want truly authentic, lo-fi 'computer speak' samples, this HAS to be the collection to use.

Many thanks, Luke, and congratulations on such a superb representation of this '70s phenomenon.