DALLAS (June
11, 1978) - A new speech synthesis monolithic integrated circuit
has been developed by Texas Instruments Incorporated. It marks
the first time the human vocal tract has been electronically
duplicated on a single chip of silicon. Measuring 44,000 square
mils, the chip is fabricated using TI's low-cost metal-gate
P-channel MOS process, the same process used for TI calculator MOS
ICs.
The speech synthesis MOS/LSI
integrated circuit, two 128K dynamic ROMs that each have
the capacity to store over 100 seconds of speech, and a special
version of the TMS 1000 microcomputer, all developed by TI, serve
as the main electronics for the new talking learning aid,
SPEAK & SPELL™, for children seven years old and up. The
new TI consumer product was introduced at the Summer Consumer
Electronics Show in Chicago, June 11-14.
Speech encoding is achieved
through pitch-excited Linear Predictive Coding (LPC). As the
name implies, LPC uses a linear equation to model the human
vocal tract mathematically and to predict each speech sample
from the samples that precede it.
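The prediction idea can be illustrated with a short Python sketch. It is not TI's implementation: the function name `predict_sample`, the second-order predictor and the coefficient values are all hypothetical, chosen only to show a sample being estimated as a weighted sum of the samples before it.

```python
def predict_sample(history, coeffs):
    """Predict the next sample as a weighted sum of the most recent samples.

    history: past samples, oldest first.
    coeffs:  predictor weights; coeffs[0] applies to the most recent sample.
    """
    order = len(coeffs)
    recent = history[-order:][::-1]  # most recent sample first
    return sum(a * s for a, s in zip(coeffs, recent))


# Hypothetical second-order example: a signal built so that it obeys
# s[n] = 1.5*s[n-1] - 0.75*s[n-2] exactly.
coeffs = [1.5, -0.75]
signal = [1.0, 1.5]
for _ in range(6):
    signal.append(predict_sample(signal, coeffs))

# Predict the last sample from the ones before it and measure the error.
prediction = predict_sample(signal[:-1], coeffs)
error = signal[-1] - prediction  # zero here, because the signal follows the model
```

Real speech does not follow such a model exactly, so the prediction error is small rather than zero; the analysis step picks the coefficients that keep it small.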
Linear Predictive Coding
is a technique for analyzing and synthesizing human speech.
Analysis of the original speech yields a description of a
time-varying digital filter that models the vocal tract. This filter
is then excited by either periodic or random inputs. An on-chip
8-bit digital-to-analog (D/A) converter transforms the digital
output of the filter into synthetic speech.
Codes for twelve synthesis
parameters (10 filter coefficients, pitch and energy) serve
as inputs to the synthesizer chip. These codes are stored
in a ROM and, once decoded by on-chip circuitry, represent
the time-varying description of the LPC synthesis model.
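One way to picture the twelve parameters is as a small record that is filled in by decoding stored codes. The sketch below is an assumption-laden illustration: the name `LpcFrame`, the table-lookup decoding, the code ordering (energy, pitch, then coefficients) and every table value are hypothetical, since the release does not describe the actual code format.

```python
from dataclasses import dataclass
from typing import List, Sequence

NUM_COEFFS = 10  # ten filter coefficients per frame, as stated above


@dataclass
class LpcFrame:
    energy: float
    pitch: float
    coefficients: List[float]


def decode_frame(codes: Sequence[int],
                 energy_table: Sequence[float],
                 pitch_table: Sequence[float],
                 coeff_tables: Sequence[Sequence[float]]) -> LpcFrame:
    """Turn twelve stored codes into twelve synthesis parameters by table lookup."""
    if len(codes) != NUM_COEFFS + 2:
        raise ValueError("expected 12 codes: energy, pitch and 10 coefficients")
    energy_code, pitch_code, *coeff_codes = codes
    return LpcFrame(
        energy=energy_table[energy_code],
        pitch=pitch_table[pitch_code],
        coefficients=[table[c] for table, c in zip(coeff_tables, coeff_codes)],
    )


# Made-up tables with four entries each, purely for illustration.
energy_table = [0.0, 0.25, 0.5, 1.0]
pitch_table = [0.0, 40.0, 60.0, 80.0]
coeff_tables = [[-0.5, -0.25, 0.25, 0.5] for _ in range(NUM_COEFFS)]

frame = decode_frame([3, 1] + [2] * NUM_COEFFS,
                     energy_table, pitch_table, coeff_tables)
```

Storing short codes rather than full-precision values is what lets a small ROM hold a useful amount of speech; a sequence of such frames gives the time-varying description.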
Inputs to the digital filter
take two forms: (1) periodic and (2) random. The periodic
inputs are used to reproduce voiced sounds, which have a definite
pitch, such as vowels or voiced consonants such as Z,
B or D. A random input models unvoiced sounds such as S, F,
T and SH.
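The two input forms can be sketched as follows. This is a simplification under stated assumptions, not the chip's circuitry: the voiced input is modeled as one unit impulse per pitch period and the unvoiced input as a random sequence of +1 and -1, and the function names and the 80-sample period are hypothetical.

```python
import random


def periodic_excitation(num_samples, pitch_period):
    """Voiced input: one unit impulse every pitch_period samples."""
    return [1.0 if n % pitch_period == 0 else 0.0 for n in range(num_samples)]


def random_excitation(num_samples, seed=0):
    """Unvoiced input: a random sequence of +1.0 and -1.0 values."""
    rng = random.Random(seed)  # seeded so the sketch is repeatable
    return [rng.choice((-1.0, 1.0)) for _ in range(num_samples)]


voiced = periodic_excitation(200, pitch_period=80)  # impulses at n = 0, 80, 160
unvoiced = random_excitation(200)
```

Changing `pitch_period` raises or lowers the perceived pitch of a voiced sound, while the random input has no pitch at all.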
The speech synthesis chip
has two separate logic blocks that generate the voiced and
unvoiced excitation. The output of the digital filter drives the
D/A converter, which in turn drives a speaker.
Key to TI's high-quality
LPC speech synthesizer is an advanced 10-stage lattice
filter design that has an integrated array multiplier, an adder
coupled to the multiplier output and various delay circuits
coupled to the adder output.
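A lattice filter of this kind can be sketched in software as a chain of identical stages, each built from multiplies, adds and a one-sample delay. The code below is a textbook all-pole lattice in one common sign convention, offered as an assumption rather than as TI's circuit; the function name `lattice_synthesize` and the coefficient values are hypothetical.

```python
def lattice_synthesize(excitation, k):
    """Run an excitation signal through an all-pole lattice filter.

    excitation: input samples.
    k: one coefficient per stage; len(k) is the filter order.
    """
    order = len(k)
    b = [0.0] * (order + 1)  # delayed values carried over from the previous sample
    output = []
    for e in excitation:
        f = e
        # Work from the last stage down to the first.
        for m in range(order, 0, -1):
            f = f - k[m - 1] * b[m - 1]     # first multiply of the stage
            b[m] = b[m - 1] + k[m - 1] * f  # second multiply of the stage
        b[0] = f          # the output feeds the first delay
        output.append(f)
    return output


# One-stage check: an impulse produces a decaying, alternating response.
one_stage = lattice_synthesize([1.0, 0.0, 0.0, 0.0], [0.5])

# Ten stages with all coefficients at zero pass the input through unchanged.
passthrough = lattice_synthesize([1.0, 2.0, 3.0], [0.0] * 10)
```

In this formulation each stage uses two multiplies, so a ten-stage filter needs twenty per output sample. A lattice filter of this form remains stable as long as every coefficient has a magnitude below one.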
With this increased computational
sequencing capability and a fast continuous data transfer
rate, the multiplier can accept two inputs every five microseconds.
Twenty multiply-and-accumulate operations are needed to generate
each speech sample, and the circuit can generate up to 10,000
speech samples per second.
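The 10,000 figure follows from the numbers quoted above, assuming one multiply-and-accumulate operation is started every five microseconds. The constant names below are hypothetical.

```python
MULTIPLY_TIME_US = 5   # multiplier accepts two inputs every five microseconds
OPS_PER_SAMPLE = 20    # multiply-and-accumulate operations per speech sample

time_per_sample_us = MULTIPLY_TIME_US * OPS_PER_SAMPLE    # 100 microseconds
max_samples_per_second = 1_000_000 // time_per_sample_us  # 10,000 samples per second

# At an eight-kilohertz rate each sample period lasts 125 microseconds,
# which leaves 25 microseconds to spare beyond the 100 needed.
sample_period_us = 1_000_000 // 8_000
headroom_us = sample_period_us - time_per_sample_us
```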
In the Speak &
Spell, the chip is operated at an eight-kilohertz sample rate. This
10th-order Linear Predictive Coding (LPC-10) speech
synthesizer IC accurately reproduces human speech from stored
or transmitted
digital data.