FI91457C - A method of storing speech in a memory means and reproducing a stored speech and apparatus for its use - Google Patents

Info

Publication number
FI91457C
Authority
FI
Finland
Prior art keywords
speech
memory
stored
speech signal
encoder
Prior art date
Application number
FI911167A
Other languages
Finnish (fi)
Swedish (sv)
Other versions
FI91457B (en
FI911167A0 (en
FI911167A (en
Inventor
Timo Kolehmainen
Original Assignee
Nokia Mobile Phones Ltd
Priority date
Filing date
Publication date
Application filed by Nokia Mobile Phones Ltd
Priority to FI911167A (patent FI91457C)
Publication of FI911167A0
Priority to GB9204879A (patent GB2254986B)
Publication of FI911167A
Application granted
Publication of FI91457B
Publication of FI91457C

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M1/00 Substation equipment, e.g. for use by subscribers
    • H04M1/64 Automatic arrangements for answering calls; Automatic arrangements for recording messages for absent subscribers; Arrangements for recording conversations
    • H04M1/65 Recording arrangements for recording a message from the calling party
    • H04M1/656 Recording arrangements for recording a message from the calling party for recording conversations
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/78 Detection of presence or absence of voice signals
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M1/00 Substation equipment, e.g. for use by subscribers
    • H04M1/64 Automatic arrangements for answering calls; Automatic arrangements for recording messages for absent subscribers; Arrangements for recording conversations
    • H04M1/65 Recording arrangements for recording a message from the calling party
    • H04M1/6505 Recording arrangements for recording a message from the calling party storing speech in digital form

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Mobile Radio Communication Systems (AREA)

Description

A method of storing speech in a memory means and reproducing the stored speech, and an apparatus using the method

The present invention relates to a method according to the preamble of claims 1 and 2, in which digitized and encoded speech is stored in memory in compressed form, and to a method for reproducing the speech stored in memory in its original form. The invention also relates to an apparatus used in the method.

In digital mobile telephone systems, every telephone has a speech codec that encodes the speech to be transmitted and decodes the received speech. Speech coding is a subject of active research, and the trend is to develop methods that achieve a low transmission rate while speech quality remains good. Digital signal processing has made it possible to develop sophisticated speech codecs that exploit the dependencies between the samples of sampled speech: short-term prediction and long-term prediction. The coding algorithms used in short-term prediction are called Linear Predictive Coding (LPC), and they examine the correlation between consecutive samples, whereas the algorithms used in Long Term Prediction (LTP) examine the long-term correlation between consecutive fundamental-frequency (pitch) segments.

A speech codec based on these methods has been standardized for the GSM system in Recommendation 06.10. The encoder is called the Regular Pulse Excitation - Long Term Prediction (RPE-LTP) encoder; sampling takes place at 8 kHz, and coding produces frames 20 ms long containing 260 bits. Another significant encoder has been specified for the dual-mode (analog/digital) system to be introduced in the USA. That encoder uses a variation of the code-excited linear prediction algorithm, also known as stochastic coding: the VSELP (Vector-Sum Excited Linear Predictive Coding) algorithm, which uses a so-called codebook that speeds up computation.
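The frame figures quoted for the GSM encoder fix its bit rate. The short Python calculation below is only a worked check of that arithmetic; the constant and variable names are illustrative and do not come from the patent.

```python
# Worked check of the GSM 06.10 figures quoted above.
# All names here are illustrative; only the three input numbers come from the text.
SAMPLE_RATE_HZ = 8000   # sampling frequency
FRAME_MS = 20           # duration of one coded frame
FRAME_BITS = 260        # bits in one coded frame

samples_per_frame = SAMPLE_RATE_HZ * FRAME_MS // 1000   # samples covered by one frame
frames_per_second = 1000 // FRAME_MS                    # coded frames per second
bit_rate_bps = FRAME_BITS * frames_per_second           # resulting bit rate

print(samples_per_frame, frames_per_second, bit_rate_bps)
```

The result, 160 samples per frame and 13 kbit/s, agrees with the values given later in the description.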

In the mobile telephone systems mentioned, a voice activity detector (VAD) connected to the encoder is also used. Its task is to determine whether the signal is speech colored by background noise or background noise alone. The idea is that if the VAD classifies the signal as background noise, the telephone's transmitter is switched off, and it is switched on again when the signal is interpreted as speech. Transmission is thus discontinuous. The VAD makes use of parameters calculated by the encoder of the speech codec. Speech coding and the associated voice activity detection are described, for example, in US Patent 4,630,262.
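To make the role of the detector concrete, the sketch below classifies 20 ms frames with a plain energy threshold and derives the transmitter state from the result. This is a deliberately simplified stand-in: the VAD referred to in the text works on parameters computed by the speech encoder, and neither the threshold value nor the function names below come from the patent.

```python
# Simplified, hypothetical stand-in for a voice activity detector.
# A real codec VAD uses encoder parameters and adaptive thresholds;
# here a fixed mean-square energy threshold is assumed for illustration.

def frame_energy(frame):
    """Mean-square energy of one frame of PCM samples."""
    return sum(s * s for s in frame) / len(frame)

def is_speech(frame, threshold=1.0e4):
    """True if the frame is classified as speech (assumed threshold)."""
    return frame_energy(frame) > threshold

def transmitter_states(frames, threshold=1.0e4):
    """Discontinuous transmission: transmitter on only for speech frames."""
    return [is_speech(f, threshold) for f in frames]

quiet = [0] * 160              # one silent 160-sample frame
loud = [1000, -1000] * 80      # one high-energy 160-sample frame
print(transmitter_states([loud, quiet, loud]))
```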

The object of the invention is to provide a method by which a function can be built that stores in memory only the speech, without the pauses belonging to the original speech, and correspondingly reproduces the stored speech with the pauses of the original speech added back. The function would be very well suited to digital radiotelephones as an answering machine into which the user and a caller to the number can dictate a message. According to the invention, this task is solved by making use of the speech coding described above, which is known per se, voice activity detection (VAD), and storage of a speech signal in digital form, which is likewise known per se. The method according to the invention is characterized in that the voice activity detector monitors the digitized speech signal coming from a first information source and/or the encoded speech signal coming from a second information source; when it interprets the speech signal as active speech, the frames produced by the encoder are stored in a read/write memory, and when it interprets the signal as not being active speech, a start mark of the inactive period and information on the length of that period in frames are stored in the read/write memory.

According to the method, the speech signal is stored in memory so that only the speech is stored, not the pauses in it. The A/D converter and the speech codec convert the speech into a digital form that can be stored in memory as such, and the voice activity detector observes the speech signal and determines when a speech period begins and when it ends.

When the detector finds that a speech period has ended, a start mark for the inactive period is issued. The duration of the pause is measured with a counter, and the counted duration is also written to memory. When active speech begins again, it is written to memory until the next pause occurs. The stored digitized speech is retrieved by reading the bit stream from memory, decoding it with the codec's decoder, and converting it to an analog speech signal with a D/A converter. If a previously stored start mark is recognized in the bit stream, a pause is held whose duration is the one written to memory during recording.
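The recording side of this scheme can be sketched in a few lines of Python. The sketch assumes that coded frames arrive together with a per-frame speech flag from the detector; the marker value `PAUSE_MARK`, the list used as memory, and the function name are hypothetical choices for illustration, not the bit patterns or circuits of the patent.

```python
# Minimal sketch of the recording scheme, under stated assumptions:
# frames are opaque coded frames, speech_flags are the VAD decisions,
# and PAUSE_MARK is a hypothetical stand-in for the stored start mark.
PAUSE_MARK = "PAUSE"

def record(frames, speech_flags):
    """Store speech frames; replace each pause by a mark and a frame count."""
    memory = []
    pause_len = 0                      # counter for the current pause, in frames
    for frame, speech in zip(frames, speech_flags):
        if speech:
            if pause_len:              # speech resumes: store the counted length
                memory.append(pause_len)
                pause_len = 0
            memory.append(frame)
        else:
            if pause_len == 0:         # pause begins: store the start mark once
                memory.append(PAUSE_MARK)
            pause_len += 1             # one count per non-speech frame
    if pause_len:                      # assumption: flush a trailing pause at the end
        memory.append(pause_len)
    return memory

print(record([b"f0", b"f1", b"f2", b"f3", b"f4"],
             [True, False, False, False, True]))
```

Whatever its length, a pause occupies two memory entries here, the mark and the count, which is the property the method relies on.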

The invention is described in more detail with reference to the accompanying Figure 1, which schematically shows an answering-machine application in a digital PLMN telephone, and Figure 2, which shows the contents of a memory in which speech has been stored.

Figure 1 shows only those functional parts of the telephone that are essential to the invention. The speech signal passes from the microphone to the A/D converter 1, where it is sampled at regular intervals. The sampling frequency depends on the codec encoder used, but for both the GSM RPE-LTP encoder and the USDM VSELP encoder it is 8 kHz. The signal arriving at the encoder 2 is 13-bit PCM. The samples are segmented into frames of 160 samples, so that one frame lasts 20 ms. The coding operations are performed on the frames, and the result is encoded speech, in GSM at a rate of 13 kbit/s. In the American USDM system the rate is lower, 7.95 kbit/s. The digital signal from the A/D converter 1 is monitored by the transmit-side voice activity detector 4 (VAD). The VAD 4 may be separate from the encoder 2 or, more usually, integrated with it, making use of the parameters the encoder calculates. Using its algorithms, the VAD determines when there is a pause in the speech (the signal is background noise) and when the signal is speech.

When a frame from the speech codec's encoder 2 has been determined to contain speech, it is stored in a static read/write memory 3. This may be, for example, a 256 kbit SRAM, which can store 30-90 s of speech. The address of the memory location to be written is given for each frame by an address counter 6, which is incremented each time a frame ends. Successive frames are stored until the VAD 4 finds that there is a pause in the speech and reports this to the silent-period encoder 5. The latter generates a specific bit pattern that serves as the pause start mark, and the pattern is stored in the SRAM 3. After this, frames from the encoder 2 are not stored in the memory 3, so the counter 6 is not incremented. At the same moment a counter in the silent-period encoder starts; it is first reset to zero and is then incremented every time the encoder 2 delivers a frame that the VAD has classified as containing no speech.

The frame duration in codecs is usually 20 ms. Suppose the VAD 4 interprets, say, N consecutive frames as containing no speech, after which it signals that a speech period is beginning. The counter then stops at the value N, and the value of N, encoded in a suitable form by the silent-period encoder, is stored in the memory 3 in the location next to the start mark. Before this, the address counter 6 has been incremented so that it directs the value corresponding to N to the correct memory location. Frames containing speech from the encoder 2 are then stored in the memory 3 again, and this continues for as long as the VAD 4 interprets the bit stream from the converter 1 as containing speech. When a new pause occurs, the silent-period encoder again writes a start mark to the memory 3, the frame counter starts, and the storing of frames from the encoder 2 is interrupted. Operation then continues as described above. Recording continues for as long as there is speech or space in the RAM 3. The contents of the memory then resemble Figure 2.
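The relation between the 256 kbit memory and the quoted 30-90 s of speech can be checked with a short calculation. The sketch below assumes that 256 kbit means 256 x 1024 bits, that the whole memory is available for frames, and that the few locations used for pause marks and counts can be neglected; the function name is illustrative.

```python
# Rough capacity estimate under stated assumptions (not figures from the patent
# beyond the memory size, the bit rates and the 20 ms frame duration).
MEMORY_BITS = 256 * 1024   # assumption: 1 kbit = 1024 bits

def seconds_of_speech(memory_bits, bit_rate_bps, frame_ms=20):
    """Seconds of active speech that fit in memory as whole coded frames."""
    bits_per_frame = bit_rate_bps * frame_ms // 1000
    whole_frames = memory_bits // bits_per_frame
    return whole_frames * frame_ms / 1000

gsm_seconds = seconds_of_speech(MEMORY_BITS, 13000)    # GSM RPE-LTP, 13 kbit/s
usdm_seconds = seconds_of_speech(MEMORY_BITS, 7950)    # USDM VSELP, 7.95 kbit/s
print(gsm_seconds, usdm_seconds)
```

Under these assumptions the memory holds about 20 s of active speech at the GSM rate and about 33 s at the USDM rate, so a recording lasting up to 90 s is reachable only when pauses make up a large share of the message, roughly three quarters of it at the GSM rate.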

Between the frames containing speech there are pauses whose start and duration have been stored. The information about a pause therefore requires only two memory locations.

According to the method, messages left by an outside caller, the A-subscriber, can also be stored. Operation is similar to that described above. On the receive side, however, a different type of voice activity detector, VAD 7, is needed than the transmit-side VAD 4 described above, because samples of the original digitized speech are not available. The received speech is stored in the memory 3 and, when the VAD 7 determines that a pause is beginning, a start mark is stored in memory and the number of frames containing no speech is counted and stored in memory, as described above.

If the receive-side voice activity detector 7 is of a kind that does not need samples of the original speech and can use the parameters of the received speech signal to detect pauses, the same VAD can be used on both the transmit and the receive side.

The encoded speech stored in the memory 3 is decoded with the decoder 9 of the speech codec. The speech frames stored in the memory 3 are read out in order through the silent-period start and end mark detector 8 to the decoder 9, which decodes them. The speech samples obtained by decoding are passed to the D/A converter 11, which converts them to an analog speech signal. If the bit pattern of a pause start mark is found in the bit stream read from memory, the RAM address counter is incremented and the contents of the next memory location are read as well; this gives the pause duration, that is, the encoded number N indicating how many frames the pause lasts. Reading from the memory 3 is suspended. The start and end mark detector 8 sends a signal to the muting circuit 10, if one is used, so that no bits at all reach the D/A converter, whose output is then low-level noise or pure silence. The counter of the detector 8 counts 20 ms intervals until it reaches the number indicating the pause duration, e.g. N or M in Figure 2. The RAM address counter 6 is then restarted, the muting circuit 10 is opened, and the reading of frames containing encoded speech from the memory 3 continues until the next pause. The RAM address counter 6 runs only when new information is needed from the memory 3. It steps only when stored frames are being read, the step interval being the frame duration, and when a start mark and a pause-duration mark are written to or read from the memory 3. During the pause itself the counter does not run.
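The playback steps described above can be sketched as a loop over the stored memory contents. The sketch assumes a memory layout in which a hypothetical marker value `PAUSE_MARK` is followed by the pause length in frames; the decoder is passed in as a function, and one `silence` entry is emitted per 20 ms pause frame. None of these names come from the patent.

```python
# Minimal sketch of playback, under stated assumptions about the memory layout:
# [frame, frame, PAUSE_MARK, N, frame, ...] where N is the pause length in frames.
PAUSE_MARK = "PAUSE"   # hypothetical stand-in for the stored start mark

def play(memory, decode=lambda frame: frame, silence=None):
    """Return the output sequence, one entry per 20 ms frame period."""
    output = []
    address = 0                              # plays the role of the address counter
    while address < len(memory):
        word = memory[address]
        if word == PAUSE_MARK:
            n_frames = memory[address + 1]   # next location holds the length N
            output.extend([silence] * n_frames)   # muted for N frame periods
            address += 2                     # counter steps past mark and length only
        else:
            output.append(decode(word))      # ordinary speech frame
            address += 1
    return output

print(play([b"f0", PAUSE_MARK, 3, b"f4"]))
```

The address advances by two locations for a pause of any length, mirroring the statement that the counter does not run during the pause itself. The sketch does not handle a mark stored without a following length.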

In the manner according to the invention, analog speech can be stored in digital form so that the pauses in the speech are not stored at all. This saves a significant amount of memory capacity. The method makes use of low-bit-rate speech codecs and a voice activity detector. Both are included in a modern digital PLMN telephone, as is a digital signal processor, which is indispensable for implementing the method presented. Digital radiotelephones always have a read/write memory associated with the central processing unit, and this RAM can well be used for storing speech according to the method. A natural application of the method is therefore an answering machine attached to a digital mobile telephone. Setting the telephone to answering-machine mode can be done in many ways obvious to a person skilled in the art, and naturally depends on the telephone system. The muting circuit 10 is not indispensable, because without it the audio signal coming from the converter 11 during a pause is noise, which may be more pleasant to the ear than complete silence. The operation of the device naturally requires various control and other lines between the telephone's processor, the blocks of the device and the keypad. These are not essential to the invention and can be implemented in many ways within the skill of a person skilled in the art.

Claims (11)

1. Forfarande for lagring av en talsignal i minnet, då en analog talsignal forst digitaliseras med en A/D-omvandlare och dårefter kodas i en kodare (2) till en bitstrom, som 15 består av ramar med konstant tidslångd innehållande informa-tionsbitter, kånnetecknat av att talaktivitetsdetektorn undersåker den digitaliserade talsignalen från en forstå informationskålla och/eller den kodade talsignalen från en andra informationskålla, och då den tolkar talsignalen som 20 aktivt tal, lagras de av kodaren alstrade ramarna i ett lås/skrivminne (3), och då den tolkar signalen som icke-aktivt tal, lagras i lås/skrivminnet (3) ett starttecken fdr en icke-aktiv period och data om långden av denna period som ramar. 25A method of storing a speech signal in memory, when an analog speech signal is first digitized with an A / D converter and then encoded in an encoder (2) into a bit stream, which consists of frames of constant time length containing information bits, characterized in that the speech activity detector examines the digitized speech signal from a understood information source and / or the coded speech signal from a second information source, and when it interprets the speech signal as active speech, the frames generated by the encoder are stored in a lock / write memory (3). it interprets the signal as non-active speech, stores in the lock / write memory (3) a start character for a non-active period and data on the length of that period as frames. 25 2. F6rfarande enligt patentkrav 1, kånnetecknat av att starttecknet for pausen lagras i kodad form.Method according to claim 1, characterized in that the start character for the pause is stored in coded form. 3. 
3. A method according to claim 1, characterized in that the length of the pause is counted by a counter, which is incremented at intervals equal to the length of the frame generated by the encoder, and when active speech resumes, the number N from the counter is stored in coded form in a memory location adjacent to the one containing the start marker.
4. A method according to claim 1, characterized in that when the start marker of the pause and the number N expressing its length are stored, the address counter (6) of the memory (3) is also incremented, after which the incrementing is interrupted for a time corresponding to N lengths of the frame generated by the encoder.
5. A method according to any one of claims 1-4, characterized in that the first speech activity detector (4) examines a speech signal from a first information source that has been digitized with an A/D converter (1) before it is encoded with an encoder (2).
6. A method according to any one of claims 1-4, characterized in that a second speech activity detector (7) examines a coded bit stream from a second information source, which is passed directly to the memory means (3).
7. A method according to any one of claims 1-4, characterized in that the digitized speech signal from the first information source and the coded speech signal from the second information source are examined by the same speech activity detector.
8. A method for reproducing, as an analog speech signal, speech stored in digital form according to any one of the preceding claims, characterized in that the frames stored in the memory (3) are read and passed one after another in storage order to the decoder (9), from which the decoded bit stream is converted by a D/A converter (11) into an analog speech signal, and, when the start marker of a pause is read, the signal is prevented from reaching the D/A converter (11), the number N coded in the following memory location is read, and the reading is interrupted for a time corresponding to a number of frame lengths equal to the decoded number N.
9. A method according to claim 8, characterized in that when the start marker of the pause and the number N expressing its length are read from the memory (3), the address counter (6) of the memory (3) is also incremented, after which the incrementing of the counter is interrupted for a time corresponding to a number of frame lengths equal to the decoded number N.
10. A telephone answering machine in a digital radio telephone for carrying out the method according to any one of the preceding claims, characterized in that electrical circuits which carry out the method according to any one of claims 1-7 are used for storing a voice message, and electrical circuits which carry out the method according to any one of claims 8-10 are used for reading the voice message.
11. A telephone answering machine according to claim 10, characterized in that the electrical circuits used are a speech codec included in the telephone, a speech activity detector and a read/write memory connected to the central processing unit.
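The storage and playback steps recited in the claims can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `store_speech`, `play_speech`, `PAUSE_START` and `is_speech` are hypothetical, the speech activity detector is reduced to a caller-supplied predicate, the codec is omitted, and the memory is modelled as a plain list. Writing N when the recording ends in the middle of a pause is also an assumption; the claims only describe storing N when active speech resumes.

```python
from typing import Callable, Iterable, List

# Hypothetical sentinel standing in for the pause start marker;
# the claims do not say how the marker is encoded in memory.
PAUSE_START = object()


def store_speech(frames: Iterable[bytes],
                 is_speech: Callable[[bytes], bool]) -> List[object]:
    """Store coded frames, replacing each pause with a marker and a count N."""
    memory: List[object] = []
    pause_frames = 0  # counter incremented once per frame length of silence
    for frame in frames:
        if is_speech(frame):
            if pause_frames:
                # Active speech resumes: store N next to the start marker.
                memory.append(pause_frames)
                pause_frames = 0
            memory.append(frame)
        else:
            if pause_frames == 0:
                # Pause begins: store the start marker instead of the frame.
                memory.append(PAUSE_START)
            pause_frames += 1
    if pause_frames:
        # Assumption: a pause still open at the end of recording is closed here.
        memory.append(pause_frames)
    return memory


def play_speech(memory: List[object], silence: bytes) -> List[bytes]:
    """Read frames in storage order; expand each pause to N silent frames."""
    output: List[bytes] = []
    address = 0  # plays the role of the memory address counter
    while address < len(memory):
        entry = memory[address]
        if entry is PAUSE_START:
            n = memory[address + 1]  # N is stored in the following location
            # A real device would suspend reading for N frame durations;
            # this sketch emits N placeholder silent frames instead.
            output.extend([silence] * n)
            address += 2
        else:
            output.append(entry)
            address += 1
    return output
```

In this sketch a pause of any length occupies two memory entries (the marker and N), which is the memory saving the claims aim at; on playback the pause is restored to its original duration in units of the frame length.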

Priority Applications (2)

Application Number Priority Date Filing Date Title
FI911167A FI91457C (en) 1991-03-08 1991-03-08 A method of storing speech in a memory means and reproducing a stored speech and apparatus for its use
GB9204879A GB2254986B (en) 1991-03-08 1992-03-05 Device for storing and reproducing speech

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
FI911167 1991-03-08
FI911167A FI91457C (en) 1991-03-08 1991-03-08 A method of storing speech in a memory means and reproducing a stored speech and apparatus for its use

Publications (4)

Publication Number Publication Date
FI911167A0 FI911167A0 (en) 1991-03-08
FI911167A FI911167A (en) 1992-09-09
FI91457B FI91457B (en) 1994-03-15
FI91457C FI91457C (en) 1994-06-27

Family

ID=8532086

Family Applications (1)

Application Number Title Priority Date Filing Date
FI911167A FI91457C (en) 1991-03-08 1991-03-08 A method of storing speech in a memory means and reproducing a stored speech and apparatus for its use

Country Status (2)

Country Link
FI (1) FI91457C (en)
GB (1) GB2254986B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2272346A (en) * 1992-10-28 1994-05-11 Batalunt Limited Telephone answering system
JPH08292798A (en) * 1995-04-24 1996-11-05 Nec Corp Voice reproducing control method
US5768349A (en) * 1996-03-15 1998-06-16 Casio Phonemate, Inc. Voice mail telephone answering device
FI115867B (en) * 1996-12-20 2005-07-29 Nokia Corp A method and system for recording a call on a storage medium
GB2370206A (en) * 2000-12-15 2002-06-19 Ericsson Telefon Ab L M Storing a speech signal, e.g. in a mobile telephone

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4802221A (en) * 1986-07-21 1989-01-31 Ncr Corporation Digital system and method for compressing speech signals for storage and transmission

Also Published As

Publication number Publication date
GB2254986A (en) 1992-10-21
GB9204879D0 (en) 1992-04-22
GB2254986B (en) 1995-05-10
FI91457B (en) 1994-03-15
FI911167A0 (en) 1991-03-08
FI911167A (en) 1992-09-09

Similar Documents

Publication Publication Date Title
CA2424202C (en) Method and system for speech frame error concealment in speech decoding
US8340973B2 (en) Data embedding device and data extraction device
EP0680033A2 (en) Speech-rate modification for linear-prediction based analysis-by-synthesis speech coders
CN1319044C (en) Method and apparatus for decoding a coded digital audio signal which is arranged in frames containing headers
US20030152152A1 (en) Audio enhancement communication techniques
FI115867B (en) A method and system for recording a call on a storage medium
AU6403298A (en) Speech coding
US20030236674A1 (en) Methods and systems for compression of stored audio
CN110838894A (en) Voice processing method, device, computer readable storage medium and computer equipment
JP4464484B2 (en) Noise signal encoding apparatus and speech signal encoding apparatus
KR100852613B1 (en) Editing of audio signals
FI91457C (en) A method of storing speech in a memory means and reproducing a stored speech and apparatus for its use
JPH08194493A (en) Low-bit-rate speech encoder and decoder
JP2856185B2 (en) Audio coding / decoding system
CA2080862C (en) Recognizer for recognizing voice messages in pulse code modulated format
JP2010092059A (en) Speech synthesizer based on variable rate speech coding
JP3784583B2 (en) Audio storage device
JP2861889B2 (en) Voice packet transmission system
EP1121686B1 (en) Speech parameter compression
US6134519A (en) Voice encoder for generating natural background noise
JPH07131767A (en) Moving picture coding decoding system
JP2005316499A (en) Voice-coder
JPH0774650A (en) Audio encoder and decoder for encoding signal thereof
US20050101301A1 (en) Apparatus and method for storing/reproducing voice in a wireless terminal
KR0152341B1 (en) Output break removing apparatus and method of multimedia

Legal Events

Date Code Title Description
BB Publication of examined application