CN101436403A - Method and system for recognizing tone - Google Patents


Info

Publication number
CN101436403A
Authority
CN
China
Prior art keywords
tone
voice signal
phoneme
model
described voice
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CNA2007101775074A
Other languages
Chinese (zh)
Other versions
CN101436403B (en)
Inventor
许军
张化云
潘春雷
陈炜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chuang'exin (Beijing) Technology Co.,Ltd.
Original Assignee
CHUANGXINWEILAI TECHNOLOGY Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by CHUANGXINWEILAI TECHNOLOGY Co Ltd
Priority to CN2007101775074A
Publication of CN101436403A
Application granted
Publication of CN101436403B
Legal status: Active
Anticipated expiration

Landscapes

  • Machine Translation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to a method for recognizing tone, which comprises the following steps: receiving a voice signal; performing spectral analysis on the voice signal, and generating a voice sequence carrying time alignment information according to a reference text; extracting a tone phoneme from the received voice signal according to the voice sequence; and determining the tone of the voice signal according to the tone phoneme. The invention also relates to a tone recognition system. The method uses the voice sequence carrying the time alignment information to accurately extract the tone phoneme and determine the tone of an input voice signal, thus the false recognition of the tone in the tone recognition is reduced effectively, the accurate recognition of the tone in a tone language is realized, and the reliability of the tone recognition is improved.

Description

Tone recognition method and system
Technical field
The present invention relates to the field of speech recognition, and in particular to a tone recognition method and system.
Background art
Nearly 70% of the world's languages are tone languages (Tone Language or Tonal Language), for example Chinese, the languages of Southeast Asia, Japanese, Swedish and Norwegian. In these languages the syllable is the smallest unit of pronunciation, and each syllable consists of consonants, vowels and a tone. The phoneme is the smallest unit of speech and is obtained by analyzing the syllable. Tone refers to the rise and fall in pitch of a syllable during articulation; that is, tone conveys the meaning of words through pitch, so syllables with the same consonants and vowels have different meanings when their tones differ.
For example, Chinese, a typical tone language, has four tones (five if the neutral tone is counted): the high level tone (first tone), the rising tone (second tone), the falling-rising tone (third tone) and the falling tone (fourth tone). Syllables formed from the same initial (consonant) and final (vowel) have entirely different meanings and correspond to different Chinese characters when their tones differ; tone therefore plays an important role in forming words and distinguishing meanings in Standard Chinese. Furthermore, in Standard Chinese the tone appears only on the final, so the final is also called the "tone phoneme" and the initial is called the "non-tone phoneme".
Therefore, a learning system for a tone language also needs to recognize tone information so that pronunciation can be judged and scored automatically. Tone can be represented by the fundamental frequency, that is, by the pattern of the fundamental frequency over time; Fig. 1 is a schematic diagram of the fundamental-frequency representation of the four tones of Chinese. The traditional tone recognition method uses the fundamental-frequency curve: the fundamental frequency F0 of each speech frame is extracted, and the tone is discriminated from the F0 trajectory of each tone. Tone is highly complex, and every tone has many variants; Figs. 2A to 2D show samples of real speech for the first to fourth tones of Chinese. This makes tone recognition challenging. In a tone-language learning system in particular, where the speaker's tone must be discriminated automatically, the reliability of tone recognition becomes especially important.
Tone can also be represented by the time-domain pitch. The time-domain pitch period is the reciprocal of the fundamental frequency F0. The pitch period is the smallest repeating unit of a periodic time-domain signal, so a single pitch period fully describes the periodic signal, and tone information can therefore be obtained by pitch detection. However, because of the complexity of the speech signal itself, and in particular the difficulty of distinguishing unvoiced from voiced speech, the pitch period is frequently misdetected, which leads to tone recognition errors. Since the tone of Standard Chinese appears only on the voiced segment of the tone phoneme, an error in the voiced/unvoiced decision causes tone recognition to fail. In the prior art, the voiced/unvoiced decision is generally based on the characteristics of the two classes: a quasi-periodic voiced signal has relatively high energy, whereas an aperiodic unvoiced signal has relatively low energy. But because existing speech processing techniques cannot make this decision reliably, a pitch is sometimes detected in non-tone phoneme segments as well, and the tone is consequently misrecognized.
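The reciprocal relation between pitch period and F0 can be illustrated with a minimal autocorrelation-based pitch detector. This is only a sketch of one common prior-art style of pitch detection, not a method taken from the patent; the function name `estimate_f0` and the 60 to 400 Hz search range are assumptions chosen for the example.

```python
import math

def estimate_f0(frame, sample_rate, f0_min=60.0, f0_max=400.0):
    """Estimate F0 of one frame from the strongest autocorrelation peak.

    The search range (f0_min, f0_max) is an illustrative assumption.
    Returns None when no lag has positive autocorrelation.
    """
    lag_min = int(sample_rate / f0_max)  # shortest pitch period considered
    lag_max = int(sample_rate / f0_min)  # longest pitch period considered
    best_lag, best_score = None, 0.0
    for lag in range(lag_min, min(lag_max, len(frame) - 1) + 1):
        score = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    if best_lag is None:
        return None
    # F0 is the reciprocal of the pitch period (best_lag / sample_rate seconds).
    return sample_rate / best_lag

sample_rate = 8000
# Synthetic voiced frame: a pure 200 Hz tone, i.e. a pitch period of 40 samples.
frame = [math.sin(2 * math.pi * 200.0 * n / sample_rate) for n in range(400)]
f0 = estimate_f0(frame, sample_rate)
```

A detector of this kind returns a value for any frame with some periodicity, which is why a pitch found in a non-tone phoneme segment can mislead the tone decision, as described above.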
Summary of the invention
The object of the present invention is to provide a tone recognition method and system that reduce the tone misrecognition that occurs in prior-art tone recognition, recognize the tones of a tone language accurately, and improve the reliability of tone recognition.
To achieve the above object, some embodiments of the present invention provide a tone recognition method comprising the following steps:
Receiving a voice signal;
Performing spectral analysis on the voice signal, and generating a voice sequence carrying time alignment information according to a reference text;
Extracting a tone phoneme from the received voice signal according to the voice sequence;
Determining the tone of the voice signal according to the tone phoneme.
To achieve the above object, other embodiments of the present invention provide a tone recognition system comprising:
A grammar database, configured to store a reference text;
A speech recognition module, configured to receive a voice signal, perform spectral analysis on the voice signal, and generate a voice sequence carrying time alignment information according to the reference text;
A tone recognition module, configured to receive the voice signal and extract a tone phoneme from the voice signal according to the voice sequence;
A tone classification module, configured to determine the tone of the voice signal according to the tone phoneme.
Based on the above technical solution, embodiments of the present invention use the voice sequence carrying time alignment information to extract the tone phoneme accurately and determine the tone of the input voice signal. This effectively reduces tone misrecognition, enables accurate recognition of the tones of a tone language, and improves the reliability of tone recognition.
Description of drawings
Fig. 1 is a schematic diagram of the fundamental-frequency representation of the four tones of Chinese;
Fig. 2A is a sample diagram of real speech for the first tone of Chinese;
Fig. 2B is a sample diagram of real speech for the second tone of Chinese;
Fig. 2C is a sample diagram of real speech for the third tone of Chinese;
Fig. 2D is a sample diagram of real speech for the fourth tone of Chinese;
Fig. 3 is a flowchart of the first embodiment of the tone recognition method of the present invention;
Fig. 4 is a flowchart of the second embodiment of the tone recognition method of the present invention;
Fig. 5 is a structural diagram of the first embodiment of the tone recognition system of the present invention;
Fig. 6 is a structural diagram of the second embodiment of the tone recognition system of the present invention.
Detailed description of the embodiments
Specific embodiments of the present invention are described in further detail below with reference to the accompanying drawings.
In a learning system for a tone language, recognizing a speaker's speech involves recognizing not only the syllable structure but also the syllable tone. Fig. 3 is a flowchart of the first embodiment of the tone recognition method of the present invention. This embodiment comprises the following steps:
Step 101: receive a voice signal;
Step 102: perform spectral analysis on the voice signal, and generate a voice sequence carrying time alignment information according to a reference text;
Step 103: extract a tone phoneme from the received voice signal according to the voice sequence;
Step 104: determine the tone of the voice signal according to the tone phoneme.
In this embodiment, the voice sequence carrying time alignment information is used to extract the tone phoneme of the input voice signal accurately and thereby determine the tone of the input voice signal. This reduces tone misrecognition, enables accurate recognition of the tones of a tone language, and thus improves the reliability of tone recognition.
Fig. 4 is a flowchart of the second embodiment of the tone recognition method of the present invention. This embodiment comprises the following steps:
Step 201: receive a voice signal.
The audio signal of the input tone-language syllables is received;
Step 202: perform spectral analysis on the voice signal and extract speech feature parameters.
The feature parameters are extracted frame by frame. Because a speech signal is stationary over short intervals, it can be divided into frames for processing; each frame is about 10 to 30 ms long, and one speech feature is extracted from each frame. Frames can be cut contiguously, but to reflect the correlation between adjacent frames and to keep the transition from frame to frame smooth and continuous, overlapping segmentation is generally used: the tail of each frame overlaps the head of the next, and the frame shift is usually 1/2 of the frame length.
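The overlapping segmentation described above can be sketched as follows. The function name `split_into_frames` and the 20 ms default are assumptions for illustration; the text only fixes the 10 to 30 ms range and the usual half-frame shift.

```python
def split_into_frames(signal, sample_rate, frame_ms=20, shift_ratio=0.5):
    """Cut a signal into overlapping frames.

    frame_ms is the frame length in milliseconds; shift_ratio is the frame
    shift as a fraction of the frame length (0.5 gives 50% overlap).
    Trailing samples that do not fill a whole frame are dropped.
    """
    frame_len = int(sample_rate * frame_ms / 1000)
    frame_shift = max(1, int(frame_len * shift_ratio))
    frames = []
    start = 0
    while start + frame_len <= len(signal):
        frames.append(signal[start:start + frame_len])
        start += frame_shift
    return frames

# 100 ms of dummy samples at 8 kHz: 160-sample frames with an 80-sample shift.
signal = list(range(800))
frames = split_into_frames(signal, 8000)
```

With these settings the second half of each frame is identical to the first half of the next one, which is the overlap between frame tail and frame head that the text describes.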
The choice of speech feature parameters must balance the limits on storage against the required recognition performance. For example, Mel-frequency cepstral coefficients (MFCC) can be used. To reduce the truncation effect at the frame boundaries, that is, to lower the slope at both ends of the frame so that the ends taper smoothly to 0 without abrupt changes, each speech frame is multiplied by a window function. Because a speech signal varies quickly and unstably in the time domain, it is usually transformed into the frequency domain for observation, where its spectrum changes slowly over time. The windowed frame is passed through a fast Fourier transform (FFT) to obtain the spectral parameters of each frame. The spectral parameters of each frame are then passed through a Mel-frequency filter bank consisting of N triangular band-pass filters (N is generally 20 to 30), and the logarithm of the output of each band is taken, giving the log energy E_k of each output, k = 1, 2, ..., N. Finally, a cosine transform is applied to these N parameters to obtain the Mel-scale cepstrum parameters of order L.
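The chain of windowing, spectrum, Mel filter bank, logarithm and cosine transform can be sketched in plain Python as below. Several details are assumptions not stated in the text: a Hamming window, the common 2595·log10(1 + f/700) Mel formula, a naive DFT standing in for the FFT to keep the example dependency-free, a small floor added before the logarithm, and the defaults N = 20 and L = 12.

```python
import math

def hz_to_mel(hz):
    return 2595.0 * math.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def power_spectrum(frame):
    """Apply a Hamming window, then return the power of DFT bins 0..n/2."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2.0 * math.pi * k * i / n) for i, x in enumerate(windowed))
        im = sum(x * math.sin(2.0 * math.pi * k * i / n) for i, x in enumerate(windowed))
        spectrum.append(re * re + im * im)
    return spectrum

def log_mel_energies(frame, sample_rate, num_filters=20):
    """Log energies E_k of N triangular filters spaced evenly on the Mel scale."""
    spectrum = power_spectrum(frame)
    n = len(frame)
    top = hz_to_mel(sample_rate / 2.0)
    edges = [mel_to_hz(top * i / (num_filters + 1)) for i in range(num_filters + 2)]
    energies = []
    for k in range(1, num_filters + 1):
        left, center, right = edges[k - 1], edges[k], edges[k + 1]
        total = 0.0
        for b, power in enumerate(spectrum):
            f = b * sample_rate / n  # center frequency of DFT bin b
            weight = min((f - left) / (center - left), (right - f) / (right - center))
            if weight > 0.0:
                total += weight * power
        energies.append(math.log(total + 1e-10))  # floor avoids log(0)
    return energies

def mfcc(frame, sample_rate, num_filters=20, num_ceps=12):
    """Cosine transform of the log energies, keeping orders 1..num_ceps."""
    e = log_mel_energies(frame, sample_rate, num_filters)
    return [sum(e[k] * math.cos(math.pi * m * (k + 0.5) / num_filters)
                for k in range(num_filters))
            for m in range(1, num_ceps + 1)]

sample_rate = 8000
frame = [math.sin(2.0 * math.pi * 300.0 * i / sample_rate) for i in range(160)]
energies = log_mel_energies(frame, sample_rate)
coeffs = mfcc(frame, sample_rate)
```

For the 300 Hz test tone, the largest log energy falls in one of the lowest filters, as expected for a filter bank whose bands are narrow at low frequencies and widen toward the top of the spectrum.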
The speech feature parameters can also be a 39-dimensional feature vector comprising 13 MFCCs, 13 first-order difference MFCCs and 13 second-order difference MFCCs;
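One way to assemble such a 39-dimensional vector is to append first-order and second-order differences to each 13-dimensional static vector. The sketch below uses a simple two-frame central difference with the edge frames repeated; the patent does not specify the difference formula, so this choice and the function names are assumptions.

```python
def delta(features):
    """First-order difference per frame: (next - previous) / 2.

    The first and last frames are repeated at the edges so that every
    frame receives a difference vector of the same dimension.
    """
    n = len(features)
    out = []
    for t in range(n):
        prev = features[max(t - 1, 0)]
        nxt = features[min(t + 1, n - 1)]
        out.append([(b - a) / 2.0 for a, b in zip(prev, nxt)])
    return out

def add_deltas(static):
    """Concatenate static, first-order and second-order difference features."""
    d1 = delta(static)
    d2 = delta(d1)
    return [s + a + b for s, a, b in zip(static, d1, d2)]

# Dummy 13-dimensional static features for 5 frames: value = t * (i + 1).
static = [[float(t * (i + 1)) for i in range(13)] for t in range(5)]
full = add_deltas(static)
```

Because the dummy features grow linearly with the frame index, the first-order differences in the middle frames are constant and the second-order differences there are zero.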
Step 203: search the speech model according to the reference text and match the voice sequence to the speech feature parameters; the voice sequence carries time alignment information.
The speech model can be a hidden Markov model (HMM). An HMM is a discrete-time finite-state automaton; "hidden" means that the internal states of the Markov model are not visible from outside, and only the output value at each time step can be observed. In a speech recognition system, the output value is usually the acoustic feature (speech feature) computed from each frame. Describing a speech signal with an HMM requires two assumptions: first, that a transition of the internal state depends only on the previous state; second, that the output value depends only on the current state (or the current state transition). These two assumptions greatly reduce the complexity of the model. The algorithms for scoring, decoding and training an HMM are the forward algorithm, the Viterbi algorithm and the forward-backward algorithm, respectively.
In speech recognition, recognition units are usually modeled with a unidirectional left-to-right HMM topology with self-loops and skips. A phoneme is an HMM with three to five states, a word is the HMM formed by concatenating the HMMs of its phonemes, and the complete model for continuous speech recognition is an HMM that combines words and silence.
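The left-to-right topology with self-loops and skips can be made concrete as a transition matrix. In this sketch the probabilities 0.6, 0.3 and 0.1 are arbitrary placeholders (a real model would learn them with the forward-backward algorithm), and the final state keeps only its self-loop because no exit state is modeled here.

```python
def left_to_right_transitions(num_states, p_stay=0.6, p_next=0.3, p_skip=0.1):
    """Build a transition matrix that allows only stay, next and skip moves.

    Row i holds the probabilities of moving from state i to each state.
    Moves that would leave the model are dropped and the row is renormalized.
    """
    matrix = [[0.0] * num_states for _ in range(num_states)]
    for i in range(num_states):
        row = {i: p_stay}                 # self-loop
        if i + 1 < num_states:
            row[i + 1] = p_next           # advance to the next state
        if i + 2 < num_states:
            row[i + 2] = p_skip           # skip over one state
        total = sum(row.values())
        for j, p in row.items():
            matrix[i][j] = p / total
    return matrix

# A three-state phoneme model, the smallest size mentioned in the text.
A = left_to_right_transitions(3)
```

Every entry below the diagonal is zero, which is what makes the model unidirectional: once a state has been left, it cannot be revisited.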
To let the model describe speech more accurately, context-dependent modeling of coarticulation can be considered when building the HMM. Coarticulation means that a sound changes under the influence of its neighboring sounds. In terms of the speech production mechanism, the characteristics of the human vocal organs can only change gradually when moving from one sound to the next, so the spectrum of the later sound differs from its spectrum in other contexts. A model that considers only the influence of the preceding sound is called a diphone (biphone) model; one that considers the influence of both the preceding and the following sound is called a triphone model.
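As a small illustration of context-dependent units, the sketch below expands a phoneme sequence into triphone labels. The `left-center+right` label format and the `sil` boundary symbol are illustrative choices, not something the patent specifies.

```python
def to_triphones(phones, boundary="sil"):
    """Label each phoneme with its left and right neighbors.

    The utterance boundaries are padded with a silence symbol so that the
    first and last phonemes also receive a full context.
    """
    labels = []
    for i, phone in enumerate(phones):
        left = phones[i - 1] if i > 0 else boundary
        right = phones[i + 1] if i < len(phones) - 1 else boundary
        labels.append(f"{left}-{phone}+{right}")
    return labels

# "ni hao" split into initials and finals.
triphones = to_triphones(["n", "i", "h", "ao"])
```

Each distinct label would get its own HMM, so the same final "i" is modeled differently after "n" than after another initial.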
The search operation finds a sequence of speech models that describes the input voice signal, thereby producing the decoded word sequence (the voice sequence). In practice, a high weight is often applied to the language model, and a long-word penalty score is set empirically.
The Viterbi algorithm is based on dynamic programming. For each state at each time point it computes the posterior probability of the decoded state sequence given the observation sequence, keeps the path with the highest probability, and records the corresponding state information at each node so that the decoded word sequence can finally be obtained by backtracking. Without losing the optimal solution, the Viterbi algorithm simultaneously solves the nonlinear time alignment between the HMM state sequence and the acoustic observation sequence, word boundary detection and word recognition in continuous speech recognition, which has made it the basic search strategy for speech recognition.
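When the reference text is known, the Viterbi search reduces to aligning a fixed unit sequence to the frames. The sketch below is a deliberately simplified version under stated assumptions: each unit of the reference sequence is a single state, transitions carry no cost, only "stay" and "advance" moves are allowed, and the per-frame log-likelihoods are supplied directly as `loglik[t][s]` instead of being computed from an acoustic model. The function names and the 10 ms frame shift are hypothetical.

```python
def forced_align(loglik):
    """Viterbi alignment of frames to a fixed left-to-right unit sequence.

    loglik[t][s] is the log-likelihood of frame t under the s-th unit of the
    reference sequence. The path must start in the first unit and end in the
    last. Returns the unit index chosen for every frame.
    """
    num_frames, num_units = len(loglik), len(loglik[0])
    neg_inf = float("-inf")
    score = [[neg_inf] * num_units for _ in range(num_frames)]
    back = [[0] * num_units for _ in range(num_frames)]
    score[0][0] = loglik[0][0]
    for t in range(1, num_frames):
        for s in range(num_units):
            stay = score[t - 1][s]
            move = score[t - 1][s - 1] if s > 0 else neg_inf
            if move > stay:
                best, back[t][s] = move, s - 1
            else:
                best, back[t][s] = stay, s
            score[t][s] = best + loglik[t][s]
    # Backtrack from the last unit at the last frame.
    path = [num_units - 1]
    for t in range(num_frames - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path

def to_segments(path, labels, frame_shift_ms=10):
    """Turn a per-frame path into (label, start_ms, end_ms) segments."""
    segments = []
    start = 0
    for t in range(1, len(path) + 1):
        if t == len(path) or path[t] != path[start]:
            segments.append((labels[path[start]], start * frame_shift_ms, t * frame_shift_ms))
            start = t
    return segments

# Toy syllable "ma": two frames look like the initial, four like the final.
loglik = [[-1, -5], [-1, -5], [-5, -1], [-5, -1], [-5, -1], [-5, -1]]
path = forced_align(loglik)
segments = to_segments(path, ["m", "a"])
```

The resulting segments are an example of time alignment information: they state when the initial and the final begin and end.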
This step reliably provides the voice sequence carrying time alignment information, so the start and end times of the non-tone phonemes (initials) and tone phonemes (finals) in the input voice signal are known;
Step 204: extract the tone phoneme from the received voice signal according to the voice sequence.
Using the voice sequence and alignment times provided in the previous step, the parts that are not tone phonemes are cut away. For Chinese, this means cutting away the parts that are not finals;
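Cutting away the non-tone parts amounts to slicing the waveform with the aligned time stamps. In this sketch the segment format `(label, start_ms, end_ms)` and the set of tone-bearing labels are assumptions made for the example.

```python
def extract_tone_phonemes(signal, sample_rate, segments, tone_bearing):
    """Keep only the samples that belong to tone phonemes.

    segments is a list of (label, start_ms, end_ms) tuples; tone_bearing is
    the set of labels that carry tone (the finals, for Chinese).
    Returns a list of (label, samples) pairs in time order.
    """
    pieces = []
    for label, start_ms, end_ms in segments:
        if label in tone_bearing:
            first = int(start_ms * sample_rate / 1000)
            last = int(end_ms * sample_rate / 1000)
            pieces.append((label, signal[first:last]))
    return pieces

# 60 ms of dummy samples at 8 kHz; the initial "m" ends at 20 ms.
signal = list(range(480))
segments = [("m", 0, 20), ("a", 20, 60)]
finals = extract_tone_phonemes(signal, 8000, segments, {"a"})
```

Only the retained samples are passed on to pitch analysis, so any spurious pitch detected inside the initial can no longer influence the tone decision.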
Step 205: match the tone of the voice signal in the tone model according to the tone phoneme.
Alternatively, step 205 can be:
Use the support vector machine algorithm to find a set of suitable hyperplanes that classify the tone phoneme by tone.
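The patent does not spell out the form of the tone model, so the following is only a stand-in: a nearest-template classifier over the F0 contour of the extracted tone phoneme. The contour is converted to semitones, resampled to five points and mean-normalized, then compared with four hand-written templates whose values are hypothetical and loosely follow the usual level, rising, dipping and falling shapes. A trained tone model or a support vector machine would replace the distance rule in `classify_tone`.

```python
import math

# Hypothetical mean-normalized templates, in semitones, for tones 1 to 4.
TEMPLATES = {
    1: [0.0, 0.0, 0.0, 0.0, 0.0],      # high level
    2: [-3.0, -1.5, 0.0, 1.5, 3.0],    # rising
    3: [1.0, -2.0, -3.0, 0.0, 4.0],    # falling-rising
    4: [4.0, 2.0, 0.0, -2.0, -4.0],    # falling
}

def normalize_contour(f0_values, num_points=5):
    """Convert F0 (Hz) to semitones, resample linearly, subtract the mean."""
    semitones = [12.0 * math.log2(f) for f in f0_values]
    resampled = []
    for i in range(num_points):
        pos = i * (len(semitones) - 1) / (num_points - 1)
        lo = int(pos)
        hi = min(lo + 1, len(semitones) - 1)
        frac = pos - lo
        resampled.append(semitones[lo] * (1.0 - frac) + semitones[hi] * frac)
    mean = sum(resampled) / num_points
    return [v - mean for v in resampled]

def classify_tone(f0_values):
    """Return the tone whose template is closest in squared distance."""
    contour = normalize_contour(f0_values)
    def distance(tone):
        return sum((c - t) ** 2 for c, t in zip(contour, TEMPLATES[tone]))
    return min(TEMPLATES, key=distance)

rising = classify_tone([180.0, 195.0, 210.0, 225.0, 240.0])
level = classify_tone([200.0] * 5)
falling = classify_tone([260.0, 235.0, 210.0, 185.0, 160.0])
```

Subtracting the mean removes the speaker's overall pitch level, so the decision depends on the shape of the contour. A real system would also need to separate a high level tone from a low one, which this sketch does not attempt.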
In this embodiment, the dynamic-programming Viterbi algorithm searches the HMM for the voice sequence that matches the feature parameters of the input voice signal, and the voice sequence carrying time alignment information is used to extract the tone phoneme of the input voice signal accurately. The tone of the input voice signal is then determined either by the tone model or by a set of suitable hyperplanes found with the support vector machine algorithm. This reduces tone misrecognition, enables accurate recognition of the tones of a tone language, and thus improves the reliability of tone recognition.
Fig. 5 is a structural diagram of the first embodiment of the tone recognition system of the present invention. This embodiment comprises: a grammar database 10, configured to store the reference text; a speech recognition module 20, configured to receive the voice signal, perform spectral analysis on it, and generate the voice sequence carrying time alignment information according to the reference text; a tone recognition module 30, configured to receive the voice signal and extract the tone phoneme from it according to the voice sequence; and a tone classification module 40, configured to determine the tone of the voice signal according to the tone phoneme.
Because this embodiment addresses language learning, the text to be read aloud can be entered into the grammar database 10 in advance as the reference text. The speech recognition module 20 provides the voice sequence and time alignment information; the tone recognition module 30 uses them to extract the tone phoneme accurately from the voice signal; and the tone classification module 40 determines the tone of the voice signal. This reduces tone misrecognition and enables accurate recognition of the tones of a tone language.
Fig. 6 is a structural diagram of the second embodiment of the tone recognition system of the present invention. Compared with the previous embodiment, the speech recognition module 20 in this embodiment comprises: a feature extraction unit 21, configured to receive the voice signal, perform spectral analysis on it and extract the speech feature parameters; a speech model unit 22, configured to store the speech model; and a speech search unit 23, configured to match the voice sequence in the speech model according to the speech feature parameters and the reference text, the voice sequence carrying time alignment information.
In this embodiment, the speech feature parameters extracted by the feature extraction unit 21 can be Mel-frequency cepstral coefficients, or Mel-frequency cepstral coefficients together with first-order and second-order Mel-frequency cepstral coefficients. The speech model stored in the speech model unit 22 is a hidden Markov model.
Compared with the previous embodiment, the tone classification module 40 in this embodiment comprises: a tone model unit 41, configured to store the tone model; and a tone classification unit 42, configured to match the tone of the voice signal in the tone model according to the tone phoneme.
In this embodiment, the tone model stored in the tone model unit 41 can use tone features such as the envelope of the fundamental frequency F0 trajectory and the envelope of the log-energy sequence.
In this embodiment, the speech recognition module 20 provides the voice sequence and time alignment information; the tone recognition module 30 uses them to extract the tone phoneme accurately from the voice signal; and the tone classification module 40 determines the tone of the voice signal. This reduces tone misrecognition, improves the reliability of tone recognition, and enables accurate recognition of the tones of a tone language.
A person of ordinary skill in the art will understand that all or part of the steps of the above method embodiments can be carried out by hardware under the control of program instructions. The program can be stored in a computer-readable storage medium and, when executed, performs the steps of the above method embodiments. The storage medium includes any medium that can store program code, such as ROM, RAM, a magnetic disk or an optical disc.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solution of the present invention, not to limit it. Although the present invention has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions described in those embodiments can still be modified, or some of their technical features replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (13)

1. A tone recognition method, characterized by comprising the following steps:
Receiving a voice signal;
Performing spectral analysis on the voice signal, and generating a voice sequence carrying time alignment information according to a reference text;
Extracting a tone phoneme from the received voice signal according to the voice sequence;
Determining the tone of the voice signal according to the tone phoneme.
2. The tone recognition method according to claim 1, characterized in that performing spectral analysis on the voice signal specifically comprises:
Extracting speech feature parameters from the voice signal.
3. The tone recognition method according to claim 2, characterized in that extracting speech feature parameters from the voice signal specifically comprises: extracting Mel-frequency cepstral coefficients from the voice signal.
4. The tone recognition method according to claim 2, characterized in that extracting speech feature parameters from the voice signal specifically comprises: extracting Mel-frequency cepstral coefficients, first-order Mel-frequency cepstral coefficients and second-order Mel-frequency cepstral coefficients from the voice signal.
5. The tone recognition method according to claim 1, characterized in that generating the voice sequence carrying time alignment information according to the reference text specifically comprises:
Matching the voice sequence in a speech model according to the reference text, the voice sequence carrying time alignment information.
6. The tone recognition method according to claim 5, characterized in that matching the voice sequence in the speech model according to the reference text specifically comprises: matching the voice sequence in a hidden Markov model according to the reference text.
7. The tone recognition method according to claim 1, characterized in that determining the tone of the voice signal according to the tone phoneme specifically comprises:
Matching the tone of the voice signal in a tone model according to the tone phoneme.
8. A tone recognition system, characterized by comprising:
A grammar database, configured to store a reference text;
A speech recognition module, configured to receive a voice signal, perform spectral analysis on the voice signal, and generate a voice sequence carrying time alignment information according to the reference text;
A tone recognition module, configured to receive the voice signal and extract a tone phoneme from the voice signal according to the voice sequence;
A tone classification module, configured to determine the tone of the voice signal according to the tone phoneme.
9. The tone recognition system according to claim 8, characterized in that the speech recognition module comprises:
A feature extraction unit, configured to receive the voice signal, perform spectral analysis on the voice signal and extract speech feature parameters;
A speech model unit, configured to store a speech model;
A speech search unit, configured to match the voice sequence in the speech model according to the speech feature parameters and the reference text, the voice sequence carrying time alignment information.
10. The tone recognition system according to claim 9, characterized in that the speech feature parameters are Mel-frequency cepstral coefficients.
11. The tone recognition system according to claim 9, characterized in that the speech feature parameters are Mel-frequency cepstral coefficients, first-order Mel-frequency cepstral coefficients and second-order Mel-frequency cepstral coefficients.
12. The tone recognition system according to claim 9, characterized in that the speech model is a hidden Markov model.
13. The tone recognition system according to claim 8, 9, 10, 11 or 12, characterized in that the tone classification module comprises:
A tone model unit, configured to store a tone model;
A tone classification unit, configured to match the tone of the voice signal in the tone model according to the tone phoneme.
CN2007101775074A 2007-11-16 2007-11-16 Method and system for recognizing tone Active CN101436403B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2007101775074A CN101436403B (en) 2007-11-16 2007-11-16 Method and system for recognizing tone

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2007101775074A CN101436403B (en) 2007-11-16 2007-11-16 Method and system for recognizing tone

Publications (2)

Publication Number Publication Date
CN101436403A true CN101436403A (en) 2009-05-20
CN101436403B CN101436403B (en) 2011-10-12

Family

ID=40710811

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2007101775074A Active CN101436403B (en) 2007-11-16 2007-11-16 Method and system for recognizing tone

Country Status (1)

Country Link
CN (1) CN101436403B (en)

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104157288A (en) * 2013-05-13 2014-11-19 通用汽车环球科技运作有限责任公司 Speech recognition with a plurality of microphones
CN104766607A (en) * 2015-03-05 2015-07-08 广州视源电子科技股份有限公司 Television program recommendation method and system
CN106205603A (en) * 2016-08-29 2016-12-07 北京语言大学 A kind of tone appraisal procedure
CN106796786A (en) * 2014-09-30 2017-05-31 三菱电机株式会社 Speech recognition system
CN106971703A (en) * 2017-03-17 2017-07-21 西北师范大学 A kind of song synthetic method and device based on HMM
CN107910005A (en) * 2017-11-16 2018-04-13 海信集团有限公司 The target service localization method and device of interaction text
CN108648760A (en) * 2018-04-17 2018-10-12 四川长虹电器股份有限公司 Real-time sound-groove identification System and method for
CN109102796A (en) * 2018-08-31 2018-12-28 北京未来媒体科技股份有限公司 A kind of phoneme synthesizing method and device
CN111128130A (en) * 2019-12-31 2020-05-08 秒针信息技术有限公司 Voice data processing method and device and electronic device
CN111276156A (en) * 2020-01-20 2020-06-12 深圳市数字星河科技有限公司 Real-time voice stream monitoring method
CN111599347A (en) * 2020-05-27 2020-08-28 广州科慧健远医疗科技有限公司 Standardized sampling method for extracting pathological voice MFCC (Mel frequency cepstrum coefficient) features for artificial intelligence analysis
CN112074903A (en) * 2017-12-29 2020-12-11 流畅人工智能公司 System and method for tone recognition in spoken language
CN112397091A (en) * 2019-08-16 2021-02-23 庞帝教育公司 Chinese speech comprehensive scoring and diagnosing system and method

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1499484A (en) * 2002-11-06 2004-05-26 北京天朗语音科技有限公司 Recognition system of Chinese continuous speech

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104157288B (en) * 2013-05-13 2017-09-15 通用汽车环球科技运作有限责任公司 With the speech recognition of multiple microphones
CN104157288A (en) * 2013-05-13 2014-11-19 通用汽车环球科技运作有限责任公司 Speech recognition with a plurality of microphones
CN106796786A (en) * 2014-09-30 2017-05-31 三菱电机株式会社 Speech recognition system
CN104766607A (en) * 2015-03-05 2015-07-08 广州视源电子科技股份有限公司 Television program recommendation method and system
CN106205603A (en) * 2016-08-29 2016-12-07 北京语言大学 A kind of tone appraisal procedure
CN106205603B (en) * 2016-08-29 2019-06-07 北京语言大学 A kind of tone appraisal procedure
CN106971703A (en) * 2017-03-17 2017-07-21 西北师范大学 A kind of song synthetic method and device based on HMM
CN107910005A (en) * 2017-11-16 2018-04-13 海信集团有限公司 The target service localization method and device of interaction text
CN107910005B (en) * 2017-11-16 2021-06-01 海信集团有限公司 Target service positioning method and device for interactive text
CN112074903A (en) * 2017-12-29 2020-12-11 流畅人工智能公司 System and method for tone recognition in spoken language
CN108648760A (en) * 2018-04-17 2018-10-12 四川长虹电器股份有限公司 Real-time sound-groove identification System and method for
CN108648760B (en) * 2018-04-17 2020-04-28 四川长虹电器股份有限公司 Real-time voiceprint identification system and method
CN109102796A (en) * 2018-08-31 2018-12-28 北京未来媒体科技股份有限公司 A kind of phoneme synthesizing method and device
CN112397091A (en) * 2019-08-16 2021-02-23 庞帝教育公司 Chinese speech comprehensive scoring and diagnosing system and method
CN111128130A (en) * 2019-12-31 2020-05-08 秒针信息技术有限公司 Voice data processing method and device and electronic device
CN111276156A (en) * 2020-01-20 2020-06-12 深圳市数字星河科技有限公司 Real-time voice stream monitoring method
CN111599347A (en) * 2020-05-27 2020-08-28 广州科慧健远医疗科技有限公司 Standardized sampling method for extracting pathological voice MFCC (Mel frequency cepstrum coefficient) features for artificial intelligence analysis
CN111599347B (en) * 2020-05-27 2024-04-16 广州科慧健远医疗科技有限公司 Standardized sampling method for extracting pathological voice MFCC (functional peripheral component interconnect) characteristics for artificial intelligent analysis

Also Published As

Publication number Publication date
CN101436403B (en) 2011-10-12


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20211103

Address after: 100089 4th Floor 403, No. 15 Wanquanzhuang Road, Haidian District, Beijing

Patentee after: CREATIVE KNOWLEDGE (BEIJING) EDUCATION TECHNOLOGY Co.,Ltd.

Address before: 100089 Beijing city Haidian District wanquanzhuang Road No. 15

Patentee before: Innovation (China) Technology Co.,Ltd.

CP03 Change of name, title or address

Address after: 100089 4th Floor 403, No. 15 Wanquanzhuang Road, Haidian District, Beijing

Patentee after: Chuang'exin (Beijing) Technology Co.,Ltd.

Address before: 100089 4th Floor 403, No. 15 Wanquanzhuang Road, Haidian District, Beijing

Patentee before: CREATIVE KNOWLEDGE (BEIJING) EDUCATION TECHNOLOGY Co.,Ltd.