CN109887499A - An automatic speech sentence segmentation algorithm based on a recurrent neural network - Google Patents

Info

Publication number
CN109887499A
Authority
CN
China
Prior art keywords
voice
information
shot
text
long term
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910289742.3A
Other languages
Chinese (zh)
Inventor
张亚飞
张卫山
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China University of Petroleum East China
Original Assignee
China University of Petroleum East China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2019-04-11
Filing date
2019-04-11
Publication date
2019-06-14
Application filed by China University of Petroleum East China
Priority to CN201910289742.3A
Publication of CN109887499A

Landscapes

  • Machine Translation (AREA)

Abstract

The invention proposes an automatic speech sentence segmentation algorithm based on a recurrent neural network. Using pattern mining and analysis with a long short-term memory (LSTM) network, it combines speech information with text information to segment speech into sentences automatically. The algorithm has a training stage and an operation stage. In the training stage, a data set of audio files and their corresponding text files is collected; with the help of speech recognition, the periods in each text file are converted into sentence-break labels, and the parameters of the LSTM network are optimized by training on them. In the operation stage, an audio file is simply fed to the LSTM network, which outputs the corresponding sentence-break points, and a separate program then cuts the audio at those points. This achieves automatic sentence segmentation of speech.
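The operation stage described above ends with a program that cuts the audio at the predicted break points. The abstract does not say how that program works, so the following is only a minimal sketch under stated assumptions: the network is assumed to emit one break probability per fixed-length frame, and the names `cut_at_breaks`, `frame_len`, and `threshold` are hypothetical.

```python
from typing import List, Sequence


def cut_at_breaks(samples: Sequence[float], break_probs: Sequence[float],
                  frame_len: int, threshold: float = 0.5) -> List[List[float]]:
    """Split `samples` after every frame whose break probability reaches `threshold`.

    Assumption (not from the patent): break_probs[i] refers to the frame
    covering samples[i * frame_len : (i + 1) * frame_len].
    """
    segments: List[List[float]] = []
    start = 0
    for frame_idx, prob in enumerate(break_probs):
        if prob >= threshold:
            end = min((frame_idx + 1) * frame_len, len(samples))
            if end > start:
                segments.append(list(samples[start:end]))
                start = end
    if start < len(samples):
        # Keep whatever audio follows the last predicted break.
        segments.append(list(samples[start:]))
    return segments
```

With ten samples, two samples per frame, and breaks predicted at frames 1 and 3, this yields three segments of 4, 4, and 2 samples.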

Description

An automatic speech sentence segmentation algorithm based on a recurrent neural network
Technical field
The present invention relates to the fields of the Internet and deep learning, and in particular to an automatic speech sentence segmentation algorithm based on a recurrent neural network.
Background art
The automatic speech sentence segmentation algorithm based on a recurrent neural network uses pattern mining and analysis with a long short-term memory (LSTM) network, combining speech information with text information to segment speech into sentences automatically. The techniques closest to the present invention are:
(1) A method based on pause duration and minimum energy: people usually leave pauses between sentences when they speak, so this method locates sentence breaks by applying a minimum pause duration and a minimum energy threshold. It is simple to operate, but because several parameters must be configured manually, an optimal setting is hard to find.
(2) A method based on the Gaussian mixture model-hidden Markov model (GMM-HMM): this method combines a Gaussian mixture model with a hidden Markov model, models the acoustics through probabilistic inference, and outputs a transcription result, so it can serve as a sentence segmentation method. Its advantage is that the model is small and easy to port to embedded platforms; its disadvantage is that it cannot make full use of contextual information.
(3) A method based on the deep neural network-hidden Markov model (DNN-HMM): this method combines a deep neural network with a hidden Markov model, learns latent representations with the deep neural network, and applies the probabilistic inference of the Markov model to output a transcription result used for speech sentence segmentation. Although a deep neural network can learn deep nonlinear feature transformations, it cannot use historical information to assist the current segmentation task.
Of these, the method based on pause duration and minimum energy requires several manually configured parameters and is inefficient. The GMM-HMM and DNN-HMM methods both use only the information currently being processed and cannot use context or historical information, so they cannot segment effectively where a break must be judged from context. The algorithm of the present invention, being based on a recurrent neural network, makes effective use of historical and contextual information: historical information is persisted, which gives better representations when sentence segmentation is performed.
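To make prior-art method (1) concrete, here is a minimal sketch of pause detection from frame energy. It is an illustration, not the implementation of any cited system; `frame_len`, `energy_threshold`, and `min_pause_frames` are hypothetical names for the manually configured parameters that the text criticizes.

```python
from typing import List, Sequence, Tuple


def frame_energies(samples: Sequence[float], frame_len: int) -> List[float]:
    """Mean squared amplitude of each full, non-overlapping frame."""
    return [sum(x * x for x in samples[i:i + frame_len]) / frame_len
            for i in range(0, len(samples) - frame_len + 1, frame_len)]


def find_pauses(samples: Sequence[float], frame_len: int,
                energy_threshold: float,
                min_pause_frames: int) -> List[Tuple[int, int]]:
    """Return (start_frame, end_frame) pairs, end exclusive, for every run of
    low-energy frames that lasts at least `min_pause_frames` frames."""
    energies = frame_energies(samples, frame_len)
    pauses: List[Tuple[int, int]] = []
    run_start = None
    for idx, energy in enumerate(energies):
        if energy < energy_threshold:
            if run_start is None:
                run_start = idx  # a quiet run begins here
        else:
            if run_start is not None and idx - run_start >= min_pause_frames:
                pauses.append((run_start, idx))
            run_start = None
    # A quiet run may extend to the end of the signal.
    if run_start is not None and len(energies) - run_start >= min_pause_frames:
        pauses.append((run_start, len(energies)))
    return pauses
```

All three parameters interact: a shorter frame or a lower threshold changes which runs qualify as pauses, which is why tuning them by hand is laborious.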
Summary of the invention
To address the shortcomings and deficiencies of the prior art, the invention proposes an automatic speech sentence segmentation algorithm based on a recurrent neural network, which uses pattern mining and analysis with a long short-term memory (LSTM) network and combines speech information with text information to segment speech into sentences automatically.
The technical solution of the present invention is as follows:
An automatic speech sentence segmentation algorithm based on a recurrent neural network, characterized in that it comprises a long short-term memory network, a speech recognition module, a text label conversion module, and a loss function evaluation module, and comprises the following steps:
Step (1): in the long short-term memory module, the LSTM network's ability to handle dependencies between earlier and later elements of sequence data is used to map the sequence into a high-dimensional space, converting it into a stack of speech embedding vectors that serves as its representation.
Step (2): in the speech recognition module, the speech embedding vectors are converted toward text, with a bidirectional LSTM performing the mapping between embedding vectors. The bidirectional LSTM helps make full use of dependencies in both directions; because the processed information from the previous input is fed in as part of the next input, useful information can be remembered over long spans.
Step (3): in the text label conversion module, because the algorithm is intended mainly for automatic sentence segmentation of speech, the information of interest is the sentence-break information in the text, that is, the periods. The text label conversion module therefore extracts this sentence-break information and uses it as the label information for supervised training.
Step (4): in the loss function evaluation module, a similarity calculation is performed between the label information and the label information of the corresponding transcription result output by the speech conversion module, yielding a loss value that is then used to update the parameters of the whole network.
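Step (3) can be illustrated with a small sketch that turns the periods in a transcript into per-token break labels. The patent does not specify tokenization or label format, so whitespace tokenization, the binary labels, and the function name `text_to_break_labels` are all assumptions made for illustration.

```python
from typing import List, Tuple


def text_to_break_labels(text: str,
                         terminators: str = ".。") -> Tuple[List[str], List[int]]:
    """Return (tokens, labels), where labels[i] is 1 if a period follows tokens[i].

    Assumption: tokens are whitespace-separated, which suits English text;
    abbreviations such as "U.S." would be mislabeled by this simple rule.
    """
    tokens: List[str] = []
    labels: List[int] = []
    for raw in text.split():
        word = raw.rstrip(terminators)
        if not word:
            # A free-standing period marks a break after the previous token.
            if labels:
                labels[-1] = 1
            continue
        tokens.append(word)
        labels.append(1 if word != raw else 0)
    return tokens, labels
```

For the transcript "I am here. You are there." this produces the tokens I, am, here, You, are, there with labels 0, 0, 1, 0, 0, 1, so the periods survive only as supervision targets.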
Beneficial effects of the present invention:
(1) The long short-term memory network establishes the dependencies between earlier and later parts of the audio sequence and effectively learns contextual information for feature mapping.
(2) The speech recognition module transcribes the audio according to certain phonetic rules, and the sentence breaks in the transcription result provide an effective assessment of the whole model's segmentation performance.
(3) Embedding the automatic sentence segmentation module into other platforms can improve the quality of those platforms and effectively raise the learning efficiency of English learners, who can practice with English material of their own choosing.
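The dependency modeling in effect (1) rests on the standard LSTM recurrence, in which a cell state carries information from earlier inputs forward. The sketch below uses a scalar LSTM cell in plain Python purely to show that recurrence and how a bidirectional pass pairs left and right context; the weight layout and the names `lstm_step` and `run_bidirectional` are assumptions, and the patent gives no network dimensions or parameters.

```python
import math
from typing import Dict, List, Sequence, Tuple

Weights = Dict[str, Tuple[float, float, float]]  # gate -> (w_input, w_recurrent, bias)


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x: float, h_prev: float, c_prev: float,
              w: Weights) -> Tuple[float, float]:
    """One step of a scalar LSTM cell; returns the new (h, c)."""
    def pre(gate: str) -> float:
        wx, wh, b = w[gate]
        return wx * x + wh * h_prev + b

    i = sigmoid(pre("i"))    # input gate
    f = sigmoid(pre("f"))    # forget gate
    o = sigmoid(pre("o"))    # output gate
    g = math.tanh(pre("g"))  # candidate cell value
    c = f * c_prev + i * g   # cell state carries history forward
    h = o * math.tanh(c)
    return h, c


def run_bidirectional(xs: Sequence[float], w_fwd: Weights,
                      w_bwd: Weights) -> List[Tuple[float, float]]:
    """Pair each position's forward and backward hidden states."""
    def run(seq: Sequence[float], w: Weights) -> List[float]:
        h, c, out = 0.0, 0.0, []
        for x in seq:
            h, c = lstm_step(x, h, c, w)
            out.append(h)
        return out

    fwd = run(xs, w_fwd)
    bwd = run(list(reversed(xs)), w_bwd)[::-1]  # realign to original order
    return list(zip(fwd, bwd))
```

Each output pair summarizes everything before a position and everything after it, which is the property the text relies on when a break must be judged from context.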
Brief description of the drawings
To explain the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are evidently only some embodiments of the present invention; those of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is the overall flowchart of the automatic speech sentence segmentation algorithm based on a recurrent neural network according to the present invention.
Specific embodiment
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in those embodiments. The described embodiments are evidently only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art on the basis of these embodiments without creative effort fall within the protection scope of the present invention.
As shown in Fig. 1, the automatic speech sentence segmentation algorithm based on a recurrent neural network of the present invention is characterized in that it comprises a long short-term memory network, a speech recognition module, a text label conversion module, and a loss function evaluation module.
The detailed flow of the automatic speech sentence segmentation algorithm based on a recurrent neural network is described below with reference to Fig. 1:
Step (1): in the long short-term memory module, the LSTM network's ability to handle dependencies between earlier and later elements of sequence data is used to map the sequence into a high-dimensional space, converting it into a stack of speech embedding vectors that serves as its representation.
Step (2): in the speech recognition module, the speech embedding vectors are converted toward text, with a bidirectional LSTM performing the mapping between embedding vectors. The bidirectional LSTM helps make full use of dependencies in both directions; because the processed information from the previous input is fed in as part of the next input, useful information can be remembered over long spans.
Step (3): in the text label conversion module, because the algorithm is intended mainly for automatic sentence segmentation of speech, the information of interest is the sentence-break information in the text, that is, the periods. The text label conversion module therefore extracts this sentence-break information and uses it as the label information for supervised training.
Step (4): in the loss function evaluation module, a similarity calculation is performed between the label information and the label information of the corresponding transcription result output by the speech conversion module, yielding a loss value that is then used to update the parameters of the whole network.
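Step (4) does not name the similarity measure used to obtain the loss value. One common choice for binary break labels is binary cross-entropy, shown here as an assumption rather than as the patent's actual loss; `binary_cross_entropy` and `eps` are illustrative names.

```python
import math
from typing import Sequence


def binary_cross_entropy(predicted: Sequence[float], target: Sequence[int],
                         eps: float = 1e-7) -> float:
    """Mean binary cross-entropy between break probabilities and 0/1 labels."""
    if len(predicted) != len(target):
        raise ValueError("predicted and target must have the same length")
    if not predicted:
        raise ValueError("sequences must not be empty")
    total = 0.0
    for p, y in zip(predicted, target):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(predicted)
```

The loss falls as the predicted break probabilities approach the labels extracted in Step (3), and its gradient is what a training loop would use to update the network parameters.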
In the automatic speech sentence segmentation algorithm based on a recurrent neural network of the present invention, the long short-term memory network establishes the dependencies between earlier and later parts of the audio sequence and effectively learns contextual information for feature mapping. The speech recognition module transcribes the audio according to certain phonetic rules, and the sentence breaks in the transcription result provide an effective assessment of the whole model's segmentation performance. Embedding the automatic sentence segmentation module into other platforms can improve the quality of those platforms and effectively raise the learning efficiency of English learners, who can practice with English material of their own choosing.
The above are merely preferred embodiments of the present invention and are not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its protection scope.

Claims (1)

  1. An automatic speech sentence segmentation algorithm based on a recurrent neural network, characterized in that it comprises a long short-term memory network, a speech recognition module, a text label conversion module, and a loss function evaluation module, and comprises the following steps:
    Step (1): in the long short-term memory module, the LSTM network's ability to handle dependencies between earlier and later elements of sequence data is used to map the sequence into a high-dimensional space, converting it into a stack of speech embedding vectors that serves as its representation.
    Step (2): in the speech recognition module, the speech embedding vectors are converted toward text, with a bidirectional LSTM performing the mapping between embedding vectors. The bidirectional LSTM helps make full use of dependencies in both directions; because the processed information from the previous input is fed in as part of the next input, useful information can be remembered over long spans.
    Step (3): in the text label conversion module, because the algorithm is intended mainly for automatic sentence segmentation of speech, the information of interest is the sentence-break information in the text, that is, the periods. The text label conversion module therefore extracts this sentence-break information and uses it as the label information for supervised training.
    Step (4): in the loss function evaluation module, a similarity calculation is performed between the label information and the label information of the corresponding transcription result output by the speech conversion module, yielding a loss value that is then used to update the parameters of the whole network.
CN201910289742.3A 2019-04-11 2019-04-11 An automatic speech sentence segmentation algorithm based on a recurrent neural network Pending CN109887499A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910289742.3A CN109887499A (en) 2019-04-11 2019-04-11 An automatic speech sentence segmentation algorithm based on a recurrent neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910289742.3A CN109887499A (en) 2019-04-11 2019-04-11 An automatic speech sentence segmentation algorithm based on a recurrent neural network

Publications (1)

Publication Number Publication Date
CN109887499A (en) 2019-06-14

Family

ID=66937006

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910289742.3A Pending CN109887499A (en) 2019-04-11 2019-04-11 An automatic speech sentence segmentation algorithm based on a recurrent neural network

Country Status (1)

Country Link
CN (1) CN109887499A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111090981A (en) * 2019-12-06 2020-05-01 中国人民解放军战略支援部队信息工程大学 Method and system for building Chinese text automatic sentence-breaking and punctuation generation model based on bidirectional long-time and short-time memory network
CN112183084A (en) * 2020-09-07 2021-01-05 北京达佳互联信息技术有限公司 Audio and video data processing method, device and equipment
CN113674764A (en) * 2021-08-20 2021-11-19 广东外语外贸大学 Interpretation evaluation method, system and equipment based on bidirectional cyclic neural network

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111090981A (en) * 2019-12-06 2020-05-01 中国人民解放军战略支援部队信息工程大学 Method and system for building Chinese text automatic sentence-breaking and punctuation generation model based on bidirectional long-time and short-time memory network
CN111090981B (en) * 2019-12-06 2022-04-15 中国人民解放军战略支援部队信息工程大学 Method and system for building Chinese text automatic sentence-breaking and punctuation generation model based on bidirectional long-time and short-time memory network
CN112183084A (en) * 2020-09-07 2021-01-05 北京达佳互联信息技术有限公司 Audio and video data processing method, device and equipment
CN112183084B (en) * 2020-09-07 2024-03-15 北京达佳互联信息技术有限公司 Audio and video data processing method, device and equipment
CN113674764A (en) * 2021-08-20 2021-11-19 广东外语外贸大学 Interpretation evaluation method, system and equipment based on bidirectional cyclic neural network

Similar Documents

Publication Publication Date Title
CN107945805B (en) A kind of across language voice identification method for transformation of intelligence
CN111159368B (en) Reply generation method of personalized dialogue
CN112101045B (en) Multi-mode semantic integrity recognition method and device and electronic equipment
CN111274362B (en) Dialogue generation method based on transformer architecture
CN111640418B (en) Prosodic phrase identification method and device and electronic equipment
CN105427869A (en) Session emotion autoanalysis method based on depth learning
CN110321418A (en) A kind of field based on deep learning, intention assessment and slot fill method
CN109887499A (en) An automatic speech sentence segmentation algorithm based on a recurrent neural network
CN112466316A (en) Zero-sample voice conversion system based on generation countermeasure network
CN116150338A (en) Intelligent customer service method and system based on multi-round dialogue
CN110532387A (en) A kind of depression aided detection method based on open question and answer text
CN110459208A (en) A kind of sequence of knowledge based migration is to sequential speech identification model training method
CN112489618A (en) Neural text-to-speech synthesis using multi-level contextual features
CN114490991A (en) Dialog structure perception dialog method and system based on fine-grained local information enhancement
CN111653270B (en) Voice processing method and device, computer readable storage medium and electronic equipment
CN111951781A (en) Chinese prosody boundary prediction method based on graph-to-sequence
CN112101044A (en) Intention identification method and device and electronic equipment
CN115394287A (en) Mixed language voice recognition method, device, system and storage medium
CN112349294A (en) Voice processing method and device, computer readable medium and electronic equipment
CN111090726A (en) NLP-based electric power industry character customer service interaction method
CN114003700A (en) Method and system for processing session information, electronic device and storage medium
CN117789771A (en) Cross-language end-to-end emotion voice synthesis method and system
CN117437461A (en) Image description generation method oriented to open world
CN108962281B (en) Language expression evaluation and auxiliary method and device
CN115376547B (en) Pronunciation evaluation method, pronunciation evaluation device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20190614