US3697699A - Digital speech signal synthesizer - Google Patents

Digital speech signal synthesizer

Publication number
US3697699A
Authority
US
United States
Prior art keywords
signal
information bits
frequency
digital
spectrum
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime
Application number
US870012A
Other languages
English (en)
Inventor
Norman P Gluth
Richard A Houghton
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ltv Electrosystems Inc
Original Assignee
Ltv Electrosystems Inc

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis

Definitions

  • ABSTRACT Disclosed is an electrical-signal synthesizer for converting digitally coded information associated with at least one electrical signal, whose frequency, amplitude or phase may vary, to analog signals whose frequency, amplitude or phase varies in substantially the same manner as that of the at least one electrical signal.
  • the synthesizer is operative to convert a digital signal representative of a first analog signal, such as a voice signal, having varying parameters, such as frequency or amplitude, into an analog output signal which varies in substantially the same manner as the first signal, and where the digital signal is composed of consecutive frames of words, and one word of each frame is representative of a fundamental frequency associated with the first signal at an instant of time, and successive words in the respective frame are representative of the energy associated with at least one of a plurality of successive bands or spectrum segments of the first signal to be reproduced, at the given instant in time, each of the successive bands bearing a predetermined frequency relationship and wherein the synthesis of the output signal is accomplished by generating from the word representative of the fundamental frequency in each respective frame, a stream of digital words representative of the frequency and each of its harmonics at each instant of time and producing therefrom a second stream of digital words which is indicative of the frequency components of the original sound and modulating the second stream with amplitude data corresponding to discrete periods of time and adding the respective digital signals so produced for
  • This invention relates to a synthesizer for receiving digitally coded input information and converting such information into analog signals and, more particularly, to a substantially all-digital synthesizer for receiving digitally coded input information relating to speech and synthesizing therefrom a speech signal.
  • Speech data are carried largely by the varying shape of the power density spectrum rather than by the sound-pressure versus time characteristic, as many erroneously believe.
  • the description of the speech is formed by an analysis of the power spectrum of a first signal by a series of band-pass filters that divide the audio spectrum into a series of adjacent bands. The energy in each band is measured at the output of each filter, and the energy measurement gives a rough, but continuous, description of the power at discrete portions of the incoming speech.
  • the analyzer provides data relating to the fundamental frequency or pitch information.
  • speech is composed of "voiced and unvoiced" sound.
  • the voiced sounds include the vowels and the voiced consonants and are produced by vibrating the vocal cords with air in the lungs.
  • Voiced sounds are composed primarily of harmonics of the frequency at which the larynx vibrates.
  • the fundamental frequencies of the voiced sound lie primarily in a range from about 70 to 350 Hz.
  • the unvoiced sounds are the consonants formed by the lips, teeth, and/or tongue. They have no definite harmonic pattern, but consist essentially of frequencies randomly distributed throughout the audio spectrum and varying in amplitude in accordance with the sound being reproduced.
  • the description of the speech includes the pitch frequency, amplitude information relating to bands of the voice-frequency spectrum, an indication that unvoiced sounds are present, and amplitude data relating to the unvoiced sounds.
  • Voice signal synthesizers utilizing filters are subject to at least two major objections. Since band-pass filters with infinitely short cutoff are not technically feasible, energy from one channel often appears in the next adjacent channel output, thereby producing a substantial amount of distortion. Additionally, a filter cannot have an infinitely short response time, and accordingly, energy is stored in each respective filter such that oscillations are set up in the filter circuit, again producing distortion of the voice-signal produced. Also, the use of a plurality of filters results in a construction that is too large and too heavy for applications where size and weight are critical factors, as in a space vehicle. Filters also require large amounts of power input with respect to the power of the output signal produced, since substantial losses are normally associated with filters. Still further, the error associated with the use of filters prevents the repeatability, when required, of a particular signal with a requisite degree of accuracy.
  • Channel analyzers of the type described do not possess the requisite degree of flexibility required for present day application. It may be desirable in certain situations to shift the phase of a single harmonic or to modulate a harmonic with a second signal or to completely eliminate a particular harmonic in a given situation, thereby to improve or change the quality of the signal which is to be synthesized.
  • an atmosphere which includes a high percentage of helium. The propagation of sound in helium is distorted with respect to propagation of the same sound in air, thus producing an unnaturalness in the sound in the vehicle. If this distortion could be compensated for by a synthesizer which is capable of altering the pitch of the sound produced to compensate for the distorted propagation, it would be possible to thereby return to the sound of naturalness which has been lost.
  • An object of this invention is to provide an improved electrical-signal synthesizer.
  • Another object of this invention is to provide an improved synthesizer for receiving digitally coded input information and converting such information into analog signals which vary in accordance with a first signal from which the input information is coded.
  • Still another object is to provide an electrical-signal synthesizer operative in response to a digitally coded input signal representative of an original electrical signal having at least one varying parameter, to produce an analog output signal having at least one parameter that varies in accordance with the at least one varying parameter of the original signal.
  • Yet another object is to provide an electrical-signal synthesizer which is improved through the use of substantially all digital techniques to accomplish the synthesis.
  • a further object is to provide an electrical-signal synthesizer which reproduces the analog signal with a higher degree of accuracy than that of other synthesizers.
  • a still further object is to provide an electrical-signal synthesizer which is smaller in size and has reduced weight with respect to other synthesizers.
  • Another object is to provide a new and improved synthesizer for receiving digitally coded information relating to the frequency, amplitude or phase of original signals and converting the coded information into analog signals of substantially the same frequency, amplitude or phase as the original signals.
  • Still another object is to provide a synthesizer for converting consecutive frames of digital words, where the consecutive frames contain frequency and amplitude information relating to a frequency and amplitude varying original signal at consecutive, predetermined, instants of time, into an analog signal having the same frequency and amplitude as the original signal at the respective instant of time.
  • An important object of the invention is to provide a new and improved synthesizer for receiving digitally coded information of fundamental parameters of speech and converting the digitally coded information into analog signals.
  • Another object is to provide a synthesizer for converting digitally coded information relating to the fundamental parameters of speech, which information consists of consecutive frames of digital words, which frames include frequency and amplitude information relating to the speech at consecutive, predetermined, instants of time, with one word of each frame containing information relating to the fundamental frequency of the speech at one instant of time, and the other words of each frame containing amplitude information relating to predetermined frequency bands or spectrum segments, each of the bands having a predetermined relationship to at least one fundamental frequency at the one instant of time.
  • Yet another object is to provide a substantially all-digital, voice-signal synthesizer which operates in real time without the use of extensive memory apparatus.
  • Still another object is to provide a synthesizer for converting digitally coded voice information into analog signals, wherein the synthesizer provides improved-quality, voice reproduction through the use of digital apparatus.
  • Still another object is to provide a synthesizer for receiving serially presented, digitally coded information which is indicative of frequency, amplitude or phase of original signals at predetermined instants of time and converting such digitally coded information into at least one digital signal, in parallel form, indicative of any combination of frequency, amplitude or phase relations of the original signals at consecutive instants of time.
  • FIG. 1 is a block diagram of a signal synthesizer embodying the invention
  • FIG. 2 is a diagrammatic illustration of a digitally coded, serial-input signal coupled to the synthesizer of FIG. 1;
  • FIG. 3 is a graph illustrating the computation of the frequency components of the synthesized signal
  • FIG. 4 is a graph illustrating a technique used to obtain sine information in the synthesized signal
  • FIG. 5 is a graph illustrating the basic method of computation
  • FIG. 6 is a simplified schematic drawing of the serial-to-parallel converter of FIG. 1;
  • FIG. 7 is a simplified schematic drawing of the amplitude buffer register of FIG. 1;
  • FIG. 8 is a simplified schematic drawing of the 12-bit adder-accumulator combination of FIG. 1;
  • FIG. 9 is a simplified schematic drawing of the magnitude comparator, the envelope update control, and the table of channel bandwidths of FIG. 1;
  • FIG. 10 is a simplified schematic of the K-index and synchronization control of FIG. 1;
  • FIG. 11 is a simplified schematic of the table of amplitude modulated trig functions of FIG. 1;
  • FIG. 12 and 12a is a diagrammatic illustration of the general timing of various components of the synthesizer of FIG. 1;
  • FIG. 13 is a simplified schematic of the noise generator of FIG. 1.
  • the illustrated embodiment of the invention is a synthesizer 10 used to convert digitally coded information relating to a first analog signal into analog signals which may in turn be used to reproduce the first signal.
  • Voice analyzers for translating speech into digital code or signals are well known.
  • a digital signal produced by one of these analyzers may comprise, as illustrated in FIG. 2, consecutive frames F, such as 190, of digital words containing information relating to the fundamental parameters of speech at consecutive, predetermined, spaced instants of time.
  • digital signals are transmitted at the rate of 2,400 bits per second.
  • each frame contains information relating to whether the speech at a particular instant of time is voiced or unvoiced, a definition of the fundamental frequency of the speech at the given instant to which the frame is related if the sound is voiced sound, and the amplitude of the energy level of a predetermined, consecutive series of bands or spectrum segments spaced within the [...] the first being a six-bit word, 92, coded to identify the fundamental frequency of the voiced sound or to indicate that there is an absence of voiced sound at an instant of time.
  • serially presented, following the first word are consecutive three-bit words, such as the three-bit words 92-97, each being coded to indicate the amplitude of the energy associated with a respective predetermined.
  • the 17th word 96 similarly, provides the amplitude information for the 16th band, but as opposed to the other words in the series, it does so with two bits; the last bit of the frame being a synchronization bit,
  • the first three-bit word 93 indicates the amplitude energy of the speech in the band between 200 Hz and 332 Hz and so on, with the last word 96 indicating the amplitude of the energy in the spectrum segment between 3,331 Hz and 3,820 Hz.
  • the consecutive bands of the frame related to a respective word each increase in width with respect to frequency in a predetermined, selected manner, for example, the expansion may be on a logarithmic scale.
  • the synchronization bit 97 serves to maintain proper synchronization of the timing relationships between the operation of the various circuits of the voice synthesizer 10.
  • the synthesizer of this invention is a special purpose computing device. It receives the input information at a rate of 2,400 bps, and the bit stream consists of serially arranged 54-bit frames of the type previously described.
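  • The frame layout described above (a six-bit pitch word, fifteen three-bit amplitude words, a two-bit sixteenth amplitude word, and one synchronization bit, for 54 bits in all) can be sketched in software as follows. This is only an illustration, not part of the disclosed hardware; the function name `parse_frame` and the choice of most-significant-bit-first ordering are assumptions.

```python
FRAME_BITS = 54        # frame length stated in the text
PITCH_BITS = 6         # first word: pitch frequency / voicing code
FULL_WORDS = 15        # three-bit amplitude words for bands 1-15
LAST_WORD_BITS = 2     # band 16 carries only two amplitude bits


def parse_frame(bits):
    """Split one 54-bit frame (list of 0/1, first-received bit first)."""
    if len(bits) != FRAME_BITS:
        raise ValueError("expected a 54-bit frame")

    def to_int(chunk):
        # Assumption: the most significant bit of each word arrives first.
        value = 0
        for b in chunk:
            value = (value << 1) | b
        return value

    pos = 0
    pitch_code = to_int(bits[pos:pos + PITCH_BITS])
    pos += PITCH_BITS

    amplitudes = []
    for _ in range(FULL_WORDS):
        amplitudes.append(to_int(bits[pos:pos + 3]))
        pos += 3
    amplitudes.append(to_int(bits[pos:pos + LAST_WORD_BITS]))
    pos += LAST_WORD_BITS

    sync_bit = bits[pos]   # the last bit of the frame
    return pitch_code, amplitudes, sync_bit
```

  • The word widths add up as 6 + 15 × 3 + 2 + 1 = 54, which matches the frame length given in the text.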
  • X is the summation of a sequence of computations relating to the amplitude and frequency of the analog signal to be constructed, where the summation computation is performed for K specific time instants.
  • the term f is the pitch or fundamental frequency in Hz for which the computation is performed, and the term H represents the harmonic number (i.e., 1, 2, 3, ..., N) for the harmonics associated with the pitch frequency.
  • T is an incremental unit of time associated with the computation of the amplitude of one point for one particular harmonic
  • L represents the greatest product of H·f that is less than 3,820 Hz
  • C represents a scaling factor relating to the number of computations to be performed during the cycle of the basic pitch period
  • K represents a time index related to the number of computations to be performed with respect to a particular cycle of the pitch frequency.
  • K, T and are fully explained in the following portions of the disclosure.
  • the upper limit of the band of frequencies considered has been selected, in the embodiment disclosed, as 3,820 Hz. The use of this upper limit, as opposed to 4,000 Hz, facilitates computation and does not substantially affect the intelligibility or quality of the output produced. In its expanded form, equation (1) above can be written as follows for successive periods of time:
  • a pitch or fundamental frequency f is illustrated with each of its harmonics (2f, 3f, 4f, ..., nf) included in a speech-band of frequencies having a top frequency of 3,820 Hz.
  • an output curve 91 is shown which theoretically represents the summation of the pitch frequency with each of its harmonics falling within the prescribed speech-band.
  • the lowest pitch frequency which is dealt with is 74 Hz, since this corresponds approximately with the lower-end of the band of fundamental or pitch frequencies. It has been decided arbitrarily to compute 256 points during any one complete cycle of a 74 Hz signal and 256 points for each harmonic thereof where the points computed for the harmonics are equally spaced over a time span equal to the period of the fundamental; thus a representative scale, where the pitch frequency is equal to 74 Hz, is set forth in FIG. 5, and as will be fully explained in the material that follows, the pitch frequency on which a computation is based will change, but the computing rate of K = 256 will remain constant.
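  • A minimal software sketch of the summation described here: for each of 256 time points in one pitch period, the fundamental and every harmonic below 3,820 Hz are added together, each weighted by an amplitude. The exact form of equation (1) is not reproduced in this text, so the sine form, the function names, and the `amplitude_for` callback are assumptions made for illustration only.

```python
import math

POINTS_PER_CYCLE = 256   # points computed per pitch period (from the text)
BAND_LIMIT_HZ = 3820     # upper limit of the speech band (from the text)


def synthesize_point(k, pitch_hz, amplitude_for):
    """Sum all harmonics of pitch_hz below the band limit at time index k.

    amplitude_for is a hypothetical callback mapping a harmonic frequency
    in Hz to the amplitude of the band that contains it.
    """
    total = 0.0
    h = 1
    while h * pitch_hz < BAND_LIMIT_HZ:
        # Assumed form: harmonic h advances h times faster than the fundamental.
        phase = 2.0 * math.pi * h * k / POINTS_PER_CYCLE
        total += amplitude_for(h * pitch_hz) * math.sin(phase)
        h += 1
    return total


def one_cycle(pitch_hz, amplitude_for):
    """Return the 256 output points for one period of the pitch frequency."""
    return [synthesize_point(k, pitch_hz, amplitude_for)
            for k in range(POINTS_PER_CYCLE)]
```

  • For example, `one_cycle(74, lambda f: 1.0)` yields 256 points in which every harmonic has unit amplitude; a real decoder would instead look up the amplitude word of the band containing each harmonic.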
  • the upper limit of the voice-band for this embodiment is set at 3,820 Hz; with a fundamental frequency of 74 Hz there will be (3,820/74) or 51 harmonics lying in the voice-band.
  • the total time of the computation is 52.7 μsec; thus the value of a point for each harmonic or the fundamental is computed in approximately (52.7/51) or 1.03 μsec.
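  • The arithmetic behind these figures can be checked with a short sketch. The function names are illustrative; the constants (74 Hz, 256 points, 3,820 Hz) come from the text.

```python
POINTS_PER_CYCLE = 256
BAND_LIMIT_HZ = 3820
LOWEST_PITCH_HZ = 74


def harmonics_in_band(pitch_hz, limit_hz=BAND_LIMIT_HZ):
    """Number of multiples of pitch_hz (fundamental included) within the band."""
    return limit_hz // pitch_hz


def point_interval_us(pitch_hz=LOWEST_PITCH_HZ, points=POINTS_PER_CYCLE):
    """Time available per output point, in microseconds."""
    return 1e6 / (pitch_hz * points)


count = harmonics_in_band(LOWEST_PITCH_HZ)   # whole harmonics of 74 Hz in the band
interval = point_interval_us()               # microseconds per output point
per_harmonic = interval / count              # microseconds per harmonic per point
```

  • The same helper applied to the highest pitch, `harmonics_in_band(310)`, gives the count of harmonics between 310 Hz and 3,820 Hz.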
  • equation (3) provides a value for the amplitude of the output signal at the second increment of time. In equations (2), (3), and (4)
  • the portions of the respective equations labeled a, a ,....a, represent the first harmonic component of the computation
  • b, b, and b represent the second harmonic component
  • m, m, and m represent the he computation.
  • the pitch frequencies utilized in this device fall in the range from 74 to 310 Hz.
  • the band of pitch frequencies is normally considered to be approximately 74-330 Hz, but ed slightly without seriously efhe sound produced (when voice of the general equation (1 can be summarily written as follows;
  • the digitally coded input information is applied to an input terminal M which is coupled to an input control-unit 13.
  • the input controlunit 13 operates to synchronize the input information (Spectrum component segment (8.) adds to zero for the value K 256. since the time t, is taken as t +1.)
  • the total computation time for computing the components X X,, ,..,.X would be only 317 milliseconds, since there are only 12 computations, i.e., 12 possible harmonics of the pitch frequency that lie between 310 Hz and 3,820 Hz ((3,820/310) = 12).
  • the to the input control I3 is control, over line 17 to the me of the serial input-data is pulse from lead 19. on a hitin the converter such that the data received by the converter over line [7 is synchronized "to the operation of the input means. Additionally, a
  • the signal corresponding to the synchronous bit associated with each frame of input data is coupled through a lead 16 to the amplitude buffer-register 22, to the pitch-frequency buffer-register 26, and to the K-index and synchronization control unit 40.
  • the signal appearing on line 16 is essentially a pulse-train with a repetition rate of 44.44 bps or one pulse every 54 counts of the 2,400 bps clock.
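  • The repetition rate follows directly from the bit rate and the frame length, as this small check shows (variable names are illustrative):

```python
BIT_RATE_BPS = 2400   # input bit rate from the text
FRAME_BITS = 54       # bits per frame from the text

# One synchronization pulse occurs per frame.
frames_per_second = BIT_RATE_BPS / FRAME_BITS
frame_period_ms = 1000.0 * FRAME_BITS / BIT_RATE_BPS
```

  • The result is about 44.44 pulses per second, or one pulse every 22.5 milliseconds.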
  • serial-to-parallel converter 18 is of a type well-known to the art relating to digital computer technology.
  • the converter 18 utilizes flip-flops which are also well-known, and those skilled in the art will recognize that a flip-flop has first and second input connections, first and second output connections, generally labeled Q and Q̄ (not Q), a clock input which operates in response to a pulse applied thereto to set the data at the input to the output, and a reset connection which operates in response to a pulse applied thereto to clear the output of the flip-flop.
  • Flip-flops are too well known in the art to require more than the general description provided. Additionally, a one or high referred to herein implies the presence of a DC voltage of a given magnitude, and a low or zero refers to the absence of a voltage. In the embodiment illustrated, a voltage of volts DC is used as a one.
  • a first output connection of each flip-flop corresponds to the first input connection of the same flip-flop, and similarly, a second output connection corresponds to the second input connection, such that when a timing pulse is applied to a clock input of the flip-flop, the output changes state to a condition corresponding to the condition at the input at the time the timing pulse was applied.
  • serial data on line 17 is coupled to the input of converter 18 wherein the serial signal is divided into two parallel paths 17, 17a, one path 17a including an inverter 21, and each of the parallel paths is applied directly to respective input connections of a first flip-flop 23 of a group of 54 parallel-connected flip-flops 23.
  • the first flip-flop 23 includes a first input connection to which line 17 is coupled and a second input connection to which line 17a is coupled; however, the second coupling is through the inverter 21, such that if one bit of the input serial data is a one, the one is applied directly to the first input connection, and a zero is applied to the second input connection. Conversely, if a zero is applied to the first input connection, a one is applied to the second input connection.
  • the output connections of the first flip-flop 23 are coupled to the input connections of the second flip-flop 23 and so on through the remaining flip-flops of the group.
  • the lead 19 is coupled to the clock input of each flip-flop 23 in the string,
  • each set-reset input is pulsed simultaneously at a rate of 2,400 pulses-per-second by the clock signal on line 19
  • the 2,400 bps input data on line 17 is stepped serially through the flip-flops 23, and at the end of each 54 consecutive steps, the serial bit clocked into the first output-connection of each flip-flop corresponds to a respective bit of the 54-bit frame of input data, as shown in FIG. 2.
  • a lead, such as 24, is coupled to the first output-connection of each respective flip-flop 23 to provide the desired parallel-data output from the converter 18.
  • the first bit of data to enter the converter 18 with respect to time is the first bit of the six-bit word related to the pitch frequency of the frame.
  • the last word entering the converter 18 is a three-bit word representative of the energy level of the harmonics located in the 16th segment of the voice band, and this three-bit word includes a synchronization bit which is actually the last bit in the frame. For this reason, the amplitude of the energy level associated with this last word is treated in the synthesizer as having only two significant bits.
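  • The behaviour of the serial-to-parallel converter can be modelled as a simple shift register. This is a software sketch only; the class and function names are hypothetical, and the hardware detail of paired true/inverted inputs is omitted.

```python
class SerialToParallel:
    """Software model of a chain of clocked flip-flops (a shift register)."""

    def __init__(self, stages=54):
        self.stages = [0] * stages

    def clock(self, bit):
        # On each clock pulse every stored bit moves one stage along the
        # chain and the new serial bit enters the first stage.
        self.stages = [bit] + self.stages[:-1]

    def parallel_out(self):
        # One output lead per flip-flop, read simultaneously.
        return list(self.stages)


def load_frame(bits):
    """Clock a whole frame through a register with one stage per bit."""
    register = SerialToParallel(len(bits))
    for b in bits:
        register.clock(b)
    return register.parallel_out()
```

  • After a full frame has been clocked in, the first-received bit sits in the last stage, so the parallel output is the input sequence in reverse stage order.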
  • the 48 bits of amplitude information are coupled to the amplitude buffer-register 22 through parallel leads 24 and are coupled interiorly of the register to inputs of flip-flops 27.
  • a signal on line 16 corresponds to the occurrence of the synchronization bit in each word frame, such as 190 of FIG. 2, and this signal is applied to the clock input of each respective flip-flop 27, simultaneously.
  • each flip-flop 27 is set in accordance with the data on its input and is set at a time when a complete frame of data is available in parallel form from the serial-to-parallel converter 18; therefore, the parallel data is stored in buffer-register 22 for a time-period corresponding to the frequency of occurrence of the synchronization pulse on line 16, or 22.5 milliseconds.
  • a separate lead, such as 25, is coupled to the noninverted output (Q) connection of each flip-flop 27 and is coupled to the envelope register 30 (FIG. 1) for use therein at a subsequent time.
  • the buffer-register 22 operates to store the amplitude data while a new frame of serial input data is being converted to parallel form by converter 18 and while the last frame of data stored in the register 22 is being processed by the other circuitry of the synthesizer 10.
  • the six bits of pitch frequency information of each frame are coupled through parallel leads, such as 24, to a pitch frequency buffer-register 26 (FIG. 1).
  • Except for the number of flip-flops used therein, the register 26 (FIG. 1) is substantially identical in operation and construction to the register 22.
  • the output of register 26 includes six parallel lines 31 which are coupled to a frequency data conversion unit 28.
  • the frequency data conversion unit 28 of FIG. 1 operates to convert the digitally coded input information to binary format and operates to change the frequency format arrangement of the digitally coded input information into a binary format usable in the digital equipment of the synthesizer.
  • channel analyzers available at present code the pitch-frequency data substantially in accordance with the following code:
  • the frequency data conversion unit 28 operates to expand the coded data to a nine-bit word, as opposed to the six-bit word of the coded input data. It will be apparent to those skilled in the art that a six-bit word of standard binary arithmetic would not add to 310 Hz. For instance, a standard six-bit binary computing word can be expanded as follows:
  • place 3: 000100
  • a 1 in place 3 and in place 2 (000110) would be the number 6 to the base 10; and so on until there is a 1 in each of the places 1-6 (111111), whereupon the number represented is 63, which is also the maximum number that can be represented with one six-bit word; thus, if the binary number is expanded to nine bits (in lieu of 6), the binary number for 310 is easily formed (100110110).
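  • The word-width argument can be verified with a few lines of code. The helper names are illustrative and not taken from the patent.

```python
def max_value(width):
    """Largest unsigned number representable in `width` binary places."""
    return (1 << width) - 1


def as_binary(value, width):
    """Render value as a fixed-width binary word, or fail if it does not fit."""
    if value > max_value(width):
        raise ValueError("%d does not fit in %d bits" % (value, width))
    return format(value, "0{}b".format(width))
```

  • A six-bit word tops out at 63, so the highest pitch frequency of 310 Hz cannot be written in six places, whereas nine places are enough.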
  • the frequency storage unit 29 is a storage register including flip-flops and is similar in construction and operation to the amplitude buffer-register 22 and the pitch frequency buffer-register 26, except that in the flip-flop and the noninverted output of each flip-flop is coupled through a respective lead, such as 34, to a 12-bit adder 35.
  • the frequency storage unit 29 operates to store the data input thereto from the conversion unit 28 until such time as the pitch frequency reaches the end the K-index and synchronization control unit 40, as will be shown in the following description, to cause the data stored in the storage unit to be transferred to the adder 35.
  • the timing of the gating signal is set to prevent the the computing portion of the synthesizer 10 at an inopportune moment.
  • the 12-bit adder 35 and the accumulator 36 as a spectrum component generator operate to produce binary words corresponding to certain, successive harmonics of the pitch frequency of a respective frame
  • the timing and output control unit I2 is the master timing unit for the computing portion of the synthesizer 10.
  • the unit includes a crystal-controlled oscillator and a series of flip-flops which serve as frequency dividers in a manner which is well known to those skilled in the art, such that ten clock-signals are available from the control unit 12 for timing the various circuits of the synthesizer 10. In the embodiment disclosed:
  • clock 0 is a 7.76 MHz pulse-train
  • clock 1 is 3.88 MHz
  • clock 2 is 1.94 MHz
  • clock 4 is 0.97 MHz
  • clock 8 is 0.485 MHz
  • five additional timing signals each out of phase with a respective one of the above clocks 0-8, are also available
  • combinations of the above disclosed timing signals are used to generate still other timing signals. For instance, a 1.03 μsec clock is generated by the combination of clock 2 and clock 4. In FIG. 1, the timing pulses are coupled to the various units by certain ones of ten separate leads, such as 42.
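  • A flip-flop divider chain of this kind can be sketched numerically, assuming each stage halves the frequency of the stage before it. The function name is illustrative.

```python
def divider_chain(base_mhz, stages):
    """Frequencies (MHz) along a chain of divide-by-two flip-flop stages."""
    return [base_mhz / (2 ** i) for i in range(stages)]


# Starting from the 7.76 MHz oscillator, the first four outputs.
clocks = divider_chain(7.76, 4)

# Period of the 0.97 MHz signal, in microseconds.
period_us = 1.0 / clocks[3]
```

  • Under that assumption the fourth output is 0.97 MHz, whose period is about 1.03 μsec, the same interval quoted for the computation of one harmonic point.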
  • Each bit of the nine bit word is transferred from the section, tions (12) than there are input data bits (9) to allow room for binary expansion of the number
  • the input data bits are coupled to the adder inputs corresponding to the nine least significant bits.
  • Each of the adder sections 39 is of a type which is well-known, and a Fairchild integrated circuit chip model 9304, manufactured by Fairchild Semiconductor of Mountain View, California, is a typical device useful in this embodiment.
  • the Fairchild device incorporates two of the respective adder sections, such as 39, in one chip
  • Each adder section 39 includes three inputs, identified as IN 1, IN 2, and carry input (C_in), and two outputs, identified as carry output (C_out) and sum.
  • the accumulator 36 includes 12 flip-flops 53, and the sum output of each adder section 39 is coupled through a lead 37 to both the inverting and noninverting input connections of a respective flip-flop 53.
  • the noninverting output (Q) of each flip-flop 53 is coupled through a lead 38 to the second input connection (IN 2) of a respective adder section 39 and through a second lead, such as 44, to the magnitude comparator 50 (FIG. 1).
  • the carry output (C_out) of each respective adder section 39 is coupled directly to the carry input (C_in) of the next adjacent adder section, and the C_out of the last adder section 39 is left open.
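  • The chaining of adder sections can be illustrated with a ripple-carry model in which each section takes two data bits and a carry input and produces a sum and a carry output. This is a generic full-adder sketch, not a model of any particular integrated circuit; the function names are hypothetical.

```python
def full_adder(a, b, carry_in):
    """One adder section: two data inputs and a carry input."""
    total = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return total, carry_out


def ripple_add(x, y, width=12):
    """Add two unsigned words with `width` chained adder sections."""
    carry = 0
    result = 0
    for i in range(width):
        a = (x >> i) & 1
        b = (y >> i) & 1
        s, carry = full_adder(a, b, carry)
        result |= s << i
    # The carry out of the last section is discarded (left open).
    return result
```

  • Because the final carry is not used, a sum that exceeds twelve bits wraps around, for example `ripple_add(4095, 1)` returns 0.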
  • the clock input of each respective flip-flop 53 is coupled through the lead 42 to the timing and control unit 12 (FIG. 1), and the reset input of each respective flip-flop is coupled through a lead 43 to the K-index and synchronization control unit 40 (FIG. 1).
  • the pulse on lead 43 is used to reset the accumulator 36 when processing of a particular frame of data is completed.
  • the binary number appearing on leads 34 will be added to itself, such that the binary word at the output leads 44 will increase in even multiples, and therefore will represent successive harmonics of the pitch frequency, i.e., 2f, 3f, 4f, etc.
  • the pitch frequency is 74 Hz
  • the binary word on lead 34 is 00001001010
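  • Repeatedly adding the pitch word into an accumulator produces the words for f, 2f, 3f and so on. The sketch below models this; stopping at the 3,820 Hz band limit is an assumption made for illustration, and the function name is hypothetical.

```python
ACC_BITS = 12          # accumulator width from the text
BAND_LIMIT_HZ = 3820   # upper limit of the speech band from the text


def harmonic_words(pitch_word):
    """Successive accumulator contents when pitch_word is added repeatedly."""
    mask = (1 << ACC_BITS) - 1
    acc = 0
    words = []
    # Assumption: accumulation stops before the band limit is reached.
    while acc + pitch_word < BAND_LIMIT_HZ:
        acc = (acc + pitch_word) & mask   # adder sum clocked into the accumulator
        words.append(acc)
    return words
```

  • For a 74 Hz pitch the sequence starts 74, 148, 222 and ends at 3,774, the largest multiple below the band limit.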
  • the table of channel bandwidths operates to produce on its output leads, such as 46, a seven-bit binary word representative of a respective one of the frequency markers set forth above.
  • the output switches to a seven-bit word representative of the next higher marker frequency until the bandwidth 17 marker is reached, at which time the table recycles, and starts over.
  • the table 70 steps through each of the 17 markers one time during one two-hundred-and-fifty-sixth of a cycle of the pitch frequency appearing at the output of the frequency storage unit 29.
  • the output of the accumulator 36 is coupled to the magnitude comparator 50 by seven parallel leads, such as 44, and the output of the table of channel bandwidths 70 is coupled to the comparator by seven leads such as 46.
  • the comparator 50 operates to compare the words coupled thereto from the table 70 and the accumulator 36, and if the value of the binary word presented by the accumulator is equal to or greater than the value of the binary word presented by the table 70, then the output of the comparator, at lead 47, changes state; for instance, the output may change from zero volts to a substantially constant DC voltage of a few tenths of a volt.
  • Comparators suitable for use in this circuit are available from several sources, and in particular, a pair of National Semiconductor Corporation four-bit comparators, model DM7200/DM8200, coupled in parallel, are suitable for use in this embodiment.
  • the comparator 50, the envelope update control 60 and the table of channel bandwidths 70 cooperate to produce the result set forth above.
  • the smallest frequency represented by the output of the table 70 is 200 Hz; thus there is always a word on line 46 equal to or greater than 200 Hz.
  • the output of the comparator 50 is coupled by a lead 47 to the envelope update 60 and specifically to a NAND-gate 55 located therein
  • the NAND-gate 55 has three input connections and operates in response to the presence of three positive signals, one on each respective input, to produce a negative swing or low at its output.
  • the clock pulses from lead 42 are coupled to the input of gate 55 and are normally high, but periodically swing low for the purpose set forth below.
  • a negative swing at the output of gate 55 is inverted by an inverter 56 and coupled to the input of a digital counter 57 which responds to the application of a positive-going signal at its input to increase the number represented by its output by one.
  • the output of the digital counter 57 is a four-bit word which has 16 specific combinations of binary digits (0000 through 1111), representing the numbers from 1-16; thus, the counter output provides an address for the first 16 successive marker frequencies set forth above.
  • the output of the counter is 0000, and when the word on line 44 represents a value equal to or larger than 200 Hz, then the comparator 50 output changes state and the output of the digital counter 57 changes to 0001.
  • This address (0001) is coupled by lines 45 to the table of channel bandwidths 70 and is coupled therein to each of 17 detect-only gates 58.
  • a digital counter of the type described herein is a model 8828i] four-bit binary counter/storage element manufactured by Signetics Corporation, Sunnyvale, California.
  • Each detect-only gate 58, except the 17th, recognizes only one of the 16 possible combinations of the output of the counter 57.
  • the read-only or detect-only memories 58 are of a type which are well-known and a typical integrated circuit chip for use as the read-only gate of this invention is a model MM-422 manufactured by National Semiconductor Company of Santa Clara, California.
  • the output of each read-only memory 58 is coupled to a respective bank 59 of parallel-connected diodes 61.
  • each respective diode bank 59 is connected in parallel with the respective outputs of other diode banks, and all the bank outputs are coupled through leads 46 to the input of the comparator 50.
  • the next succeeding detect-only gate 58 is addressed and activated, and the corresponding diode bank 59 produces a binary word representing the next marker frequency.
  • the output of the accumulator 36 increases, and the harmonic value thus produced is compared to the new frequency marker until a comparison is again achieved, in which case the entire process is repeated such that the next marker frequency is brought up for comparison.
  • a pair of timing signals from the timing and output control 12 are applied to respective inputs of the NAND-gate 55, and, as previously stated, the gate responds to high voltages (1s) on each of the gate leads, in this case 3, to cause the envelope update 60 to operate.
  • the timing signals are arranged to force the gate 55 to operate when a harmonic of the pitch frequency does not fall within the specific band. For instance, consider the pitch frequency of 180 Hz and its second harmonic of 360 Hz. Examining the list of marker frequencies set forth above, it is clear that a harmonic of the pitch frequency does not fall within the band defined by markers 200 Hz and 332 Hz. When the processing of this pitch frequency begins,
  • the signal from the table 70 represents 200 Hz and the signal from the accumulator represents 180 Hz; thus a compare signal is not generated on line 47.
  • the signal at the output of the accumulator goes high, i.e., changes state from zero to a positive voltage, thus indicating that a comparison has been made.
  • the envelope update 60 operates to cause the table to produce a new output signal, which in our example is now 332 Hz, but notice that the signal from the accumulator 36 is still larger than the signal from the table.
  • the counter 57 of the envelope update 60 cannot be made to step, since the input to the comparator 50 does not call for a change at its output on line 47.
  • the timing pulses applied to NAND-gate 55 are arranged to cure the problem relating to the lack of harmonics falling in a band, between markers.
  • the three inputs to the NAND-gate 55 must be high, each representing ones, to cause the output of the gate to switch low, thereby enabling the circuitry to cause the counter 57 to switch.
  • timing pulses are arranged on at least one of the lines 42 coupled to the input of the gate 55 such that at least once every cycle of the comparator 50, the voltage on the at least one lead drops to zero for a short period, and if the compare signal has not been generated by a normal compare, i.e., the presence of a harmonic in the band, then as the voltage on the at least one lead returns to a high, a false compare is generated, and the counter 57 steps, thus calling up a new bandwidth marker, for example marker 3, which is 442 Hz, and the compare circuitry is then enabled to operate in its normal manner.
  • Since the digital counter 57 has only a four-bit output with 16 possible word combinations, the address of the 17th marker must be created in some other manner. This input is provided by producing a high voltage, representing a one, from the input line 47. All the outputs from the digital counter 57 are now ones (1111), and they are applied to the bandwidth 17 gate which produces a zero out.
  • the output of the bandwidth 17 gate is then inverted and applied to a NAND-gate 63, which is similar to gate 55, such that as the 16th marker frequency is reached, a 1 is applied to one of the three inputs to the NAND-gate 63,
  • a compare signal again appears on line 47, indicating that the harmonic signal on lead 44 equals or exceeds 3,330 Hz
  • a signal representing a one is applied to the second input to NAND-gate 63.
  • the third signal representing a one is applied to the NAND-gate 63 by a clock pulse from the timing and output control unit 12 and is timed to assure that the bandwidth 16 address and bandwidth 16 compare process is complete.
  • the gate switches to a zero output which in turn sets a flip-flop 64 to produce an output one to the respective diode bank 59, which produces the proper bandwidth 17 comparison signal out of the bank, in the manner previously described.
  • a pulse is applied to the reset terminal of flip-flop 64 by the K-index and synchronization control 40, thus causing the table 70 to recycle.
  • the table of channel bandwidths 70 cycles through each of the sixteen bands for each one-two hundred and fifty-sixth part of one cycle of the basic pitch frequency. In other words, data relating to the pitch frequency and each of its harmonics is generated during each one-two hundred and fifty-sixth part of a cycle of the pitch frequency, thus the digital operation relates to the scheme of compilation set forth above with respect to the general equation (1) and its expansion in equations (2), (3) and (4).
  • the output signal on lead 65 of the envelope update control 60 is coupled to the A register 66 and to the envelope update register 30, and by properly signaling these units, harmonic amplitude information in the register 30 is related to at least one sine function of the equations (2), (3), and (4).
  • the output of the comparator 50 is coupled through a lead 47 to the K-index and synchronization control unit 40.
  • the K-index and synchronization control unit 40 operates to produce a bandwidth 17 marker, which indicates the end of the cycle of the 16 bands for one-two hundred and fifty-sixth of a cycle of the pitch frequency and provides a means for synchronizing the operation of the K-counter 80, the K-H accumulator 75, the reset signals on line 43, the output accumulator 85, and the digital-to-analog (D/A) converter 86.
  • the K-index and synchronization control 40 includes a first NAND-gate 68 which has 4 input connections. Two of the input connections are coupled to leads 42 from the timing and output control unit 12 while the third lead is coupled to the output of the comparator 50 through a lead 47, and the fourth lead is coupled to the table of channel bandwidths 70 at the noninverting output of flip-flop 64 (FIG. 9), i.e., the bandwidth 17 address output.
  • the gate 68 operates in response to high-voltages (ones) on each of its respective inputs to produce a low output (zero). When the bandwidth 17 address is generated in the table of channel bandwidths and the flip-flop 64 (FIG.
  • a reset-disable circuit including NAND-gate 87 is coupled between the output of gate 68 and the reset input of flip-flop 67.
  • the gate 87 has four inputs, and a first of the inputs is coupled to the output of gate 68. The remaining three inputs to gate 87 are coupled by line 42 to the timing and output control 12.
  • the noninverted output of flip-flop 67 is coupled to NAND-gate 69, as previously stated, and the output of gate 69 is coupled through an inverter 73 to lead 41.
  • Gate 69 has a second input connection which is coupled to the noninverted output of a flip-flop 71, and the output of gate 69 is coupled to the reset input of flip-flop 71.
  • the frame synchronization signal is coupled from the input means (FIG. 1) through lead 16 and through an inverter 74 to a first input connection of NAND-gate 72, and a pitch synchronization signal from the K-counter (FIG. 1) is coupled through lead 49 to a second input of gate 72.
  • the frame synchronization signal is normally low but is inverted by inverter 74, thus, a positive signal is applied to one input of gate 72 at all times, except when the frame sync signal is present on line 16.
  • the pitch sync signal on line 49 is generated in the K-counter and corresponds to the start of a full-cycle of the pitch frequency (K = 0).
  • This low is coupled to the set gate of flip-flop 71 and is there inverted to cause the flip-flop to set.
  • a high is produced on lead 88 which is coupled to the input of a NAND-gate 69.
  • the gate 69 has highs on each of its two inputs, as when there is a bandwidth 17 marker, and when the K-counter 80 (FIG. 1) steps to any position other than K = 0, the output described, the frame sync pulse (on line 16) causes the amplitude information data to shift from the amplitude buffer-register 22 to the envelope register 30.
  • a change of frame information is called for, as by a pulse on line 41, at the precise moment that the amplitude information is being transferred from the register 22 to the envelope register 30 and before the transfer lines 25 have settled, erroneous data may be recorded in the envelope register, thereby disrupting the operation of further computations by introducing error.
  • the gate 72 is disabled, as described, to prevent these errors.
  • the bandwidth 17 marker, on line 52, is coupled to the K-counter 80, the output digital-to-analog converter 86, the scaling multiplier 84, and the accumulator 85.
  • the output of gate 69 goes low, it is inverted by an inverter 73 to which it is coupled and a positive or high is produced on lead 41 out of the inverter. Additionally, when the output of 69 goes low, the low is coupled through a lead 89 to the reset connection of flip-flop 71 where the signal is inverted to reset the flip-flop, thus removing a high from the input of gate 69. It is now apparent that the pulse produced on line 41 has a duration which corresponds to the response time of the reset circuit of flip-flop 71.
  • the disable circuitry associated with flip-flop 71 prevents the occurrence of a K pulse from the K-counter 80 at the same time that a frame synchronization pulse occurs.
  • a pulse on line 41 is coupled to the envelope register 30 to cause the register 30 to load from the register 22.
  • Line 41 is coupled also to the frequency storage unit 29, and the synchronization pulse thereon causes data to transfer from the frequency storage unit 29 to the adder 35. Note that these operations occur only when K = 0, since, as previously described, a K = 0 pulse is required from the K-counter 80 to enable the generation, in the control unit 40, of the pulse on line 41.
  • the envelope register 30, FIG. 1, accepts and stores the amplitude data from the amplitude buffer-register 22 upon the receipt of a pulse over line 41 from the K-index and synchronization control unit 40 and is properly a part of the input means 15. Since the pulse on line 41 corresponds to the bandwidth 17 marker, it represents the end of a cycle through the 16 segments of the voice band for the one-two hundred fifty-sixth increment of one cycle of the pitch frequency that is present, thus a new frame of amplitude data is called up and stored in the envelope register 30 for use with the next full-cycle of the pitch frequency.
  • the circuitry in the envelope register 30 is similar to that of register 22 in that it includes a group of flip-flops which are pulsed by the signal on line 4
  • the output of the envelope register 30 is coupled back to its input to provide a path for recirculation of the amplitude words of each frame.
  • the update signal on line 65 from the envelope update control 60 is coupled to the A register 66, also, and when an envelope update pulse is generated on line 65, the amplitude data on leads 76 at the output of the envelope register 30 is transferred and stored in the A n register 66 and is also circulated through the recirculation path, previously described, and stored again in the envelope register 30, to become the last word in order of time instead of the first. In this manner, the envelope register 30 shifts through each successive word of amplitude data in response to a shift signal from the envelope update 60.
  • the A register 66 stores the three-bit words of data on line 76 in response to a pulse on line 65. Clock pulses on line 42 are applied to the register 66 to provide a time-window when storage can occur, thus preventing the erroneous storage of data in response to transients. Similar techniques and circuitry for providing the time-window have already been described.
  • the output of the A register 66 is coupled over leads 81 to the table of amplitude modulated trig functions 90.
  • the K-counter 80 is a counter similar to the counter 57 (FIG. 9) previously described, and is a commercially available unit.
  • the basic difference between the counter 80 and the counter 57 is the counting range or magnitude of the output word which is produced.
  • the K-counter 80 has an eight-bit output and can therefore count to a higher level than the counter 57 which has only a four-bit output.
  • Each bandwidth 17 marker pulse on line 52 strobes the K-counter 80 causing it to step.
  • the binary output of K-counter 80 is arranged such that it is all zeros (lows) when K = 0.
  • Eight parallel-connected logic-gates are coupled respectively to respective ones of the output bit-positions of the K-counter 80, and each gate operates to invert the signal applied to its input, whether it is high or low.
  • the gates can be arranged such that when the output of each gate is a high, and only in this case, a high output is produced. This case occurs only when the K-counter recycles in response to a bandwidth 17 marker, such that its output is all zeros. This high output is applied to lead 49 to signal K = 0 to the respective units previously described.
  • the adder 77 and K-H accumulator are similar in construction and operation to the adder-accumulator combination 35, 36.
  • the output of the K-counter 80 is coupled to the input of adder 77, and the adder 77 is coupled to the accumulator 75 in substantially the same manner that adder 35 is coupled to accumulator 36.
  • the K-H accumulator 75 is clocked by a pulse from the timing and output control 12, the binary number at the input to the adder 77 adds to itself.
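
The interplay of the accumulator 36, the comparator 50, the counter 57 and the forced ("false") compare described above can be illustrated with a short behavioral sketch. This is not the gate-level circuit of the disclosure, only a simplified model under stated assumptions: the function name `step_markers`, the event tuple layout and the `MARKERS_HZ` list are hypothetical, and only the three marker frequencies quoted in the text (200 Hz, 332 Hz and 442 Hz) are used, since the full marker list is not reproduced in this section.

```python
from typing import List, Tuple

# Assumption: only the first three marker frequencies are quoted in the text;
# the rest of the table of channel bandwidths is deliberately left out.
MARKERS_HZ = [200, 332, 442]


def step_markers(pitch_hz: int, markers_hz: List[int]) -> List[Tuple[int, int, bool]]:
    """Hypothetical behavioral model of accumulator 36, comparator 50, counter 57.

    Returns one (marker_hz, harmonic_hz, forced) tuple per counter step.
    `forced` is True when the comparator output was already high, so the
    step had to come from the timing pulse rather than a normal compare.
    """
    if pitch_hz <= 0:
        raise ValueError("pitch_hz must be positive")

    events = []
    harmonic = pitch_hz   # accumulator starts at the pitch frequency (1f)
    index = 0             # counter output: address of the current marker
    was_high = False      # comparator output on the previous pass

    while index < len(markers_hz):
        marker = markers_hz[index]
        is_high = harmonic >= marker  # comparator: harmonic >= marker
        if is_high:
            # A normal compare is a low-to-high change; if the output was
            # already high, no change occurs and the step is forced.
            events.append((marker, harmonic, was_high))
            index += 1
        else:
            # No compare yet: the pitch value is added to itself (2f, 3f, ...).
            harmonic += pitch_hz
        was_high = is_high
    return events


if __name__ == "__main__":
    print(step_markers(180, MARKERS_HZ))
```

For the 180 Hz example in the text, the sketch reports a normal compare at the 200 Hz marker (harmonic 360 Hz), a forced step past the 332 Hz marker because no harmonic lies between 200 Hz and 332 Hz, and a normal compare at the 442 Hz marker once the accumulator reaches 540 Hz.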

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
  • Electrophonic Musical Instruments (AREA)
US870012A 1969-10-22 1969-10-22 Digital speech signal synthesizer Expired - Lifetime US3697699A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US87001269A 1969-10-22 1969-10-22

Publications (1)

Publication Number Publication Date
US3697699A true US3697699A (en) 1972-10-10

Family

ID=25354615

Family Applications (1)

Application Number Title Priority Date Filing Date
US870012A Expired - Lifetime US3697699A (en) 1969-10-22 1969-10-22 Digital speech signal synthesizer

Country Status (8)

Country Link
US (1) US3697699A (de)
JP (1) JPS521603B1 (de)
CA (1) CA976655A (de)
DE (1) DE2051589C3 (de)
FR (1) FR2088984A5 (de)
GB (1) GB1310036A (de)
IL (1) IL35513A (de)
SE (1) SE367080B (de)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US3865982A (en) * 1973-05-15 1975-02-11 Belton Electronics Corp Digital audiometry apparatus and method
US3974334A (en) * 1972-12-22 1976-08-10 Electronic Music Studios (London) Limited Waveform processing
US4076958A (en) * 1976-09-13 1978-02-28 E-Systems, Inc. Signal synthesizer spectrum contour scaler
US5054072A (en) * 1987-04-02 1991-10-01 Massachusetts Institute Of Technology Coding of acoustic waveforms
US20110046957A1 (en) * 2009-08-24 2011-02-24 NovaSpeech, LLC System and method for speech synthesis using frequency splicing
US20160189725A1 (en) * 2014-12-25 2016-06-30 Yamaha Corporation Voice Processing Method and Apparatus, and Recording Medium Therefor

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA1114954A (en) * 1978-07-17 1981-12-22 Arthur J. Tardif Digital sound synthesizer

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US3974334A (en) * 1972-12-22 1976-08-10 Electronic Music Studios (London) Limited Waveform processing
US3865982A (en) * 1973-05-15 1975-02-11 Belton Electronics Corp Digital audiometry apparatus and method
US4076958A (en) * 1976-09-13 1978-02-28 E-Systems, Inc. Signal synthesizer spectrum contour scaler
US5054072A (en) * 1987-04-02 1991-10-01 Massachusetts Institute Of Technology Coding of acoustic waveforms
US20110046957A1 (en) * 2009-08-24 2011-02-24 NovaSpeech, LLC System and method for speech synthesis using frequency splicing
US20160189725A1 (en) * 2014-12-25 2016-06-30 Yamaha Corporation Voice Processing Method and Apparatus, and Recording Medium Therefor
US9865276B2 (en) * 2014-12-25 2018-01-09 Yamaha Corporation Voice processing method and apparatus, and recording medium therefor

Also Published As

Publication number Publication date
DE2051589B2 (de) 1980-04-03
IL35513A0 (en) 1970-12-24
GB1310036A (en) 1973-03-14
DE2051589A1 (de) 1971-06-16
SE367080B (de) 1974-05-13
IL35513A (en) 1974-01-14
DE2051589C3 (de) 1980-11-27
JPS521603B1 (de) 1977-01-17
CA976655A (en) 1975-10-21
FR2088984A5 (de) 1972-01-07

Similar Documents

Publication Publication Date Title
US4076958A (en) Signal synthesizer spectrum contour scaler
CA1157564A (en) Sound synthesizer
US3654450A (en) Digital signal generator synthesizer
US3795864A (en) Methods and apparatus for generating walsh functions
US3848115A (en) Vibration control system
US3995116A (en) Emphasis controlled speech synthesizer
US3831015A (en) System for generating a multiplicity of frequencies from a single reference frequency
JPS6131658B2 (de)
US3566035A (en) Real time cepstrum analyzer
US4119005A (en) System for generating tone source waveshapes
US3697699A (en) Digital speech signal synthesizer
US3403227A (en) Adaptive digital vocoder
CA1172366A (en) Methods and apparatus for encoding and constructing signals
US3703609A (en) Noise signal generator for a digital speech synthesizer
GB1578543A (en) Autocorrelation function generating circuit
US4680479A (en) Method of and apparatus for providing pulse trains whose frequency is variable in small increments and whose period, at each frequency, is substantially constant from pulse to pulse
US3069507A (en) Autocorrelation vocoder
US4064363A (en) Vocoder systems providing wave form analysis and synthesis using fourier transform representative signals
GB1510646A (en) Synthesizer of multifrequency code signals for a keyboard type telephone station
US4245541A (en) Apparatus for reducing noise in digital to analog conversion
US3697892A (en) Digital frequency-shift modulator using a read-only-memory
US3471644A (en) Voice vocoding and transmitting system
US3689844A (en) Digital filter receiver for frequency-shift data signals
US4075424A (en) Speech synthesizing apparatus
Campanella A survey of speech bandwidth compression techniques