CN104751846B - The method and device of speech-to-text conversion - Google Patents
The method and device of speech-to-text conversion
- Publication number
- CN104751846B (application number CN201510126575.2A)
- Authority
- CN
- China
- Prior art keywords
- text
- mark
- label
- text mark
- audio file
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Landscapes
- Document Processing Apparatus (AREA)
Abstract
The invention discloses a method of speech-to-text conversion. The method comprises: obtaining an audio file; converting the speech contained in the audio file into text in the order of the audio file's time axis to generate first text information; converting the recording dotting marks in the audio file into text marks; and inserting the text marks into the corresponding positions in the first text information to generate second text information. The invention also discloses a device for speech-to-text conversion. With the technical solution of the invention, the converted text is marked, which makes it easier for users to view, edit, and otherwise work with the text.
Description
Technical field
The present invention relates to the field of communication technology, and in particular to a method and device for speech-to-text conversion.
Background technique
With the rapid development of the information age, the information input and output functions of electronic devices are becoming increasingly important. People can record audio with a mobile phone or a voice recorder (or another device with a recording function), which makes it easy to capture information. While recording, they can also use a dotting function to mark points in the recording. For example, when attending a lecture, a user can record while listening and mark important content as it is recorded, and a recording file is produced at the end; when the user later plays the recording file back to review the lecture, playback can start directly from a mark, without listening to the whole recording. Likewise, during a meeting, a user can record while the discussion is under way and mark important content as it is recorded; when later reviewing the meeting through the recording file, the user can start listening directly from a mark, without listening to the whole recording.
Speech recognition technology is in increasingly wide use, and the prior art can already convert a voice file into a text file for display. However, when converting a marked voice file into a text file, the prior art does not recognize the dotting marks and simply converts the voice file into text. This makes the text file inconvenient to read and edit: a user who wants to see the content that was marked in the voice file (the key points of the recording) cannot find it quickly and has to read slowly from the beginning of the text.
The above content is provided only to aid understanding of the technical solution of the present invention and does not constitute an admission that it is prior art.
Summary of the invention
The main object of the present invention is to provide a method and device for speech-to-text conversion that mark the converted text, making it easier for users to view, edit, and otherwise work with the text.
To achieve the above object, the present invention provides a method of speech-to-text conversion, the method comprising:
Obtaining an audio file;
Converting the speech contained in the audio file into text in the order of the audio file's time axis to generate first text information;
Converting the recording dotting marks in the audio file into text marks;
Inserting the text marks into the corresponding positions in the first text information to generate second text information.
Preferably, the step of converting the recording dotting marks in the audio file into text marks comprises:
Obtaining the recording dotting marks in the audio file;
Looking up, in a preset mapping table of recording dotting marks to text marks, the text mark corresponding to each obtained recording dotting mark.
Preferably, after the step of inserting the text marks into the first text information to generate the second text information, the method further comprises:
Highlighting the text content between two identical and adjacent text marks in the second text information to generate third text information.
Preferably, the step of highlighting the text content between two identical and adjacent text marks in the second text information to generate the third text information comprises:
Reading the second text information sequentially;
If a text mark is currently read, judging whether the currently read text mark is identical to the previously read text mark;
If the currently read text mark is identical to the previously read text mark, highlighting the text content between the currently read text mark and the previously read text mark to generate the third text information.
Preferably, the step of highlighting the text content between the currently read text mark and the previously read text mark to generate the third text information comprises:
Looking up, in a preset mapping table of text marks to highlighting modes, the highlighting mode corresponding to the currently read text mark;
Highlighting the text content between the currently read text mark and the previously read text mark in the highlighting mode found, to generate the third text information.
In addition, to achieve the above object, the present invention also provides a device for speech-to-text conversion, comprising:
An acquisition module, configured to obtain an audio file;
A first generation module, configured to convert the speech contained in the audio file into text in the order of the audio file's time axis to generate first text information;
A first conversion module, configured to convert the recording dotting marks in the audio file into text marks;
A second generation module, configured to insert the text marks into the corresponding positions in the first text information to generate second text information.
Preferably, the first conversion module comprises:
A first acquisition unit, configured to obtain the recording dotting marks in the audio file;
A first lookup unit, configured to look up, in a preset mapping table of recording dotting marks to text marks, the text mark corresponding to each obtained recording dotting mark.
Preferably, the device further comprises:
A third generation module, configured to highlight the text content between two identical and adjacent text marks in the second text information to generate third text information.
Preferably, the third generation module comprises:
A reading unit, configured to read the second text information sequentially;
A judging unit, configured to judge, when the reading unit currently reads a text mark, whether the currently read text mark is identical to the previously read text mark;
A highlighting unit, configured to highlight, when the currently read text mark is identical to the previously read text mark, the text content between the currently read text mark and the previously read text mark to generate the third text information.
Preferably, the highlighting unit comprises:
A second lookup unit, configured to look up, when the currently read text mark is identical to the previously read text mark, the highlighting mode corresponding to the currently read text mark in a preset mapping table of text marks to highlighting modes;
A highlighting subunit, configured to highlight the text content between the currently read text mark and the previously read text mark in the highlighting mode found by the second lookup unit, to generate the third text information.
The present invention obtains an audio file; converts the speech contained in the audio file into text in the order of the audio file's time axis to generate first text information; converts the recording dotting marks in the audio file into text marks; and inserts the text marks into the corresponding positions in the first text information to generate second text information. Because the recording dotting marks in the audio file are converted into text marks and inserted at the corresponding positions in the first text information when the audio file is converted into text, users can more easily view, edit, and otherwise work with the converted text.
Brief description of the drawings
Fig. 1 is a flow diagram of the first embodiment of the speech-to-text conversion method of the present invention;
Fig. 2 is a detailed flow diagram of step S30 in Fig. 1;
Fig. 3 is a flow diagram of the second embodiment of the speech-to-text conversion method of the present invention;
Fig. 4 is a detailed flow diagram of step S50 in Fig. 3;
Fig. 5 is a detailed flow diagram of step S53 in Fig. 4;
Fig. 6 is a functional block diagram of the first embodiment of the speech-to-text conversion device of the present invention;
Fig. 7 is a functional block diagram of the second embodiment of the speech-to-text conversion device of the present invention;
Fig. 8 is a detailed structural diagram of the third generation module in Fig. 7.
The realization of the object, the functional features, and the advantages of the present invention are further described below with reference to the embodiments and the accompanying drawings.
Specific embodiment
It should be understood that the specific embodiments described here are merely illustrative of the present invention and are not intended to limit it.
Referring to Fig. 1, Fig. 1 is a flow diagram of the first embodiment of the speech-to-text conversion method of the present invention.
The present invention provides a method of speech-to-text conversion, comprising:
S10, obtain an audio file.
In step S10, the audio file can be obtained in a wired or wireless manner; for example, a lecture audio file can be downloaded from the Internet. The audio file contains recording dotting marks.
S20, convert the speech contained in the audio file into text in the order of the audio file's time axis to generate first text information.
In step S20, speech is converted into text by a speech-to-text (Speech To Text, STT) function or algorithm: speech is extracted successively in the order of the audio file's time axis, each extracted piece of speech is converted into text, and the pieces of text produced by the conversion are combined into the first text information.
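As an illustration only, the time-ordered conversion of step S20 might be sketched as follows. The patent does not name an STT engine or a data layout, so the chunk representation `(start_time, audio_bytes)` and the injected `recognize` callable are assumptions made for this sketch; any real recognizer would be substituted for `recognize`.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical layout: (start time in seconds, raw audio for that chunk).
Chunk = Tuple[float, bytes]


def build_first_text(
    chunks: Iterable[Chunk],
    recognize: Callable[[bytes], str],
) -> List[Tuple[float, str]]:
    """Convert chunks to text in time-axis order.

    `recognize` stands in for whatever STT function is available; it is
    not an API described by the patent.
    """
    ordered = sorted(chunks, key=lambda chunk: chunk[0])
    # Keep each piece's start time so that marks can be placed later.
    return [(start, recognize(audio)) for start, audio in ordered]


def join_first_text(pieces: List[Tuple[float, str]]) -> str:
    """Combine the converted pieces into one first text string."""
    return " ".join(text for _, text in pieces)
```

Keeping the start time next to each converted piece is a design choice of the sketch: it is what later allows a text mark to be placed at the position matching its dotting mark.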
S30, convert the recording dotting marks in the audio file into text marks.
In step S30, the recording dotting marks in the audio file are converted into text marks. Text marks can take many styles, such as icons of various colors or shapes.
S40, insert the text marks into the corresponding positions in the first text information to generate second text information.
In step S40, each text mark is inserted into the first text information at the position corresponding to where its recording dotting mark lies in the audio file, generating the second text information, so that the second text information contains both the text converted from speech and the text marks converted from the recording dotting marks.
In this embodiment of the present invention, during speech-to-text conversion, the speech contained in the audio file is converted into text to generate the first text information, the recording dotting marks in the audio file are converted into text marks, and the converted text marks are then inserted into the corresponding positions in the first text information to generate the second text information. The resulting second text information contains both the text converted from speech and the text marks converted from the recording dotting marks. Users can conveniently view, edit, and otherwise work with the second text information; for example, by looking for the text marks, a user can see at a glance where recording dotting marks were made, without reading through the second text information from the beginning.
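The insertion in step S40 can be sketched as a merge of two time-ordered sequences. This is a minimal sketch under stated assumptions, not the patented implementation: the `Segment` and `TextMark` types, the use of seconds as the time unit, and the rule that a mark is placed before the first segment starting at or after the mark's time are all hypothetical choices made here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    start: float  # seconds from the beginning of the audio file (assumed unit)
    text: str     # text converted from the speech in this segment


@dataclass
class TextMark:
    time: float   # position of the originating dotting mark in the audio
    symbol: str   # text mark already converted from the dotting mark


def insert_marks(segments: List[Segment], marks: List[TextMark]) -> str:
    """Merge text marks into the first text to form the second text."""
    pending = sorted(marks, key=lambda m: m.time)
    out: List[str] = []
    i = 0
    for seg in sorted(segments, key=lambda s: s.start):
        # Emit every mark whose time is at or before this segment's start.
        while i < len(pending) and pending[i].time <= seg.start:
            out.append(pending[i].symbol)
            i += 1
        out.append(seg.text)
    # Marks made after the last segment started go at the end.
    out.extend(m.symbol for m in pending[i:])
    return " ".join(out)
```

With segment-level timestamps, a mark made in the middle of a segment lands before the next segment; finer placement would need word-level timing, which the source does not discuss.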
Further, as shown in Fig. 2, step S30 comprises:
S31, obtain the recording dotting marks in the audio file.
S32, look up, in a preset mapping table of recording dotting marks to text marks, the text mark corresponding to each obtained recording dotting mark.
The mapping table of recording dotting marks to text marks can be preset according to actual needs, as shown in Table 1.
Table 1:
Recording dotting mark | Text mark |
Dotting mark A | Five-pointed star |
Dotting mark B | Red circle |
Dotting mark C | Green triangle |
…… | …… |
If the recording dotting mark obtained in step S31 is dotting mark A, then in step S32 the preset mapping table of recording dotting marks to text marks is consulted, and the text mark corresponding to dotting mark A is found to be a five-pointed star.
The mapping table of recording dotting marks to text marks can also be updated at any time according to actual needs, so that it better matches the user's habits.
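The lookup in step S32 amounts to a dictionary lookup. The sketch below uses the three entries of Table 1; the function names, the string keys, and the choice to raise an error for an unmapped dotting mark are assumptions, since the source does not say how a missing entry is handled.

```python
from typing import Dict

# Entries taken from Table 1; keys and value spellings are illustrative.
DEFAULT_MARK_TABLE: Dict[str, str] = {
    "A": "five-pointed star",
    "B": "red circle",
    "C": "green triangle",
}


def lookup_text_mark(dot_mark: str, table: Dict[str, str] = DEFAULT_MARK_TABLE) -> str:
    """Return the text mark mapped to a recording dotting mark."""
    try:
        return table[dot_mark]
    except KeyError:
        # Assumption: an unmapped dotting mark is treated as an error.
        raise ValueError(f"no text mark configured for dotting mark {dot_mark!r}") from None


def update_mark_table(table: Dict[str, str], dot_mark: str, text_mark: str) -> Dict[str, str]:
    """Return a copy of the table with one entry added or changed."""
    updated = dict(table)
    updated[dot_mark] = text_mark
    return updated
```

Returning a copy from `update_mark_table` keeps the preset table intact while letting a user-specific table diverge from it.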
Referring to Fig. 3, Fig. 3 is a flow diagram of the second embodiment of the speech-to-text conversion method of the present invention.
Based on the first embodiment of the speech-to-text conversion method above, after step S40 the method further comprises:
S50, highlight the text content between two identical and adjacent text marks in the second text information to generate third text information.
In step S50, highlighting the text content between two identical and adjacent text marks edits the second text information automatically: the text corresponding to the speech between two identical and adjacent recording dotting marks made in the audio file is highlighted automatically. The highlighting mode can be, for example, bold font or red font. When viewing or editing the third text information, the user can see the highlighted text content at a glance, which improves efficiency.
Further, as shown in Fig. 4, step S50 comprises:
S51, read the second text information sequentially.
S52, if a text mark is currently read, judge whether the currently read text mark is identical to the previously read text mark; if so, execute step S53.
In step S52, if a text mark is currently read, the currently read text mark can be added to a list of read text marks, and the previously read text mark is found in that list. It is then judged whether the currently read text mark is identical to the previously read text mark: if so, step S53 is executed; if not, reading of the second text information continues from the position of the currently read text mark.
S53, highlight the text content between the currently read text mark and the previously read text mark to generate third text information.
In step S53, highlighting the text content between the currently read text mark and the previously read text mark edits the second text information automatically: the text corresponding to the speech between two identical and adjacent recording dotting marks made in the audio file is highlighted automatically. The highlighting mode can be, for example, bold or red. When viewing or editing the third text information, the user can see the highlighted text content at a glance, which improves efficiency.
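Steps S51 to S53 can be sketched as a single sequential pass. The token representation below, with `("mark", name)` and `("text", content)` pairs, is a hypothetical encoding of the second text information chosen for the sketch. Following the description of step S52 literally, every mark read becomes the "previously read" mark for the next comparison, so three identical marks in a row would highlight both spans between them.

```python
from typing import List, Optional, Tuple

Token = Tuple[str, str]  # ("mark", name) or ("text", content); assumed encoding


def highlight_spans(tokens: List[Token]) -> List[Tuple[str, bool]]:
    """Return (text, highlighted) pairs for the text tokens, in reading order."""
    result: List[Tuple[str, bool]] = []
    last_mark: Optional[str] = None  # the previously read text mark
    since_last: List[int] = []       # indices of text read since that mark
    for kind, value in tokens:       # S51: read sequentially
        if kind == "text":
            since_last.append(len(result))
            result.append((value, False))
            continue
        # S52: a mark was read; compare it with the previously read mark.
        if value == last_mark:
            # S53: highlight everything between the two identical marks.
            for idx in since_last:
                result[idx] = (result[idx][0], True)
        last_mark = value
        since_last = []
    return result
```

Only the most recent mark is kept here rather than a full list of read marks, because the comparison in step S52 needs nothing older than the previous one.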
Further, as shown in Fig. 5, step S53 comprises:
S531, look up, in a preset mapping table of text marks to highlighting modes, the highlighting mode corresponding to the currently read text mark.
The mapping table of text marks to highlighting modes can be preset according to actual needs, as shown in Table 2.
Table 2:
Text mark | Highlighting mode |
Five-pointed star | Bold font |
Red circle | Red font |
Green triangle | Green font |
…… | …… |
If the currently read text mark is a red circle, the preset mapping table of text marks to highlighting modes is consulted, and the highlighting mode corresponding to the red circle is found to be red font.
The mapping table of text marks to highlighting modes can also be updated at any time according to actual needs, so that it better matches the user's habits.
S532, highlight the text content between the currently read text mark and the previously read text mark in the highlighting mode found, to generate third text information.
In step S532, the text content between the currently read text mark and the previously read text mark is highlighted in the highlighting mode found in step S531; the second text information is thereby edited automatically and the third text information is generated. When viewing or editing the third text information, the user can see the highlighted text content at a glance, which improves efficiency.
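Steps S531 and S532 might look like the following sketch. The mode table mirrors Table 2, but rendering the highlight as HTML is purely an assumption made for illustration; the source only names the modes (bold font, red font, green font) and says nothing about the output format.

```python
import html
from typing import Callable, Dict

# Entries mirror Table 2; the string spellings are illustrative.
HIGHLIGHT_MODES: Dict[str, str] = {
    "five-pointed star": "bold font",
    "red circle": "red font",
    "green triangle": "green font",
}

# Assumed rendering of each mode as HTML markup.
_RENDERERS: Dict[str, Callable[[str], str]] = {
    "bold font": lambda s: f"<b>{s}</b>",
    "red font": lambda s: f'<span style="color:red">{s}</span>',
    "green font": lambda s: f'<span style="color:green">{s}</span>',
}


def highlight(text: str, text_mark: str, modes: Dict[str, str] = HIGHLIGHT_MODES) -> str:
    """Look up the mode for a text mark (S531) and apply it to text (S532)."""
    mode = modes[text_mark]                       # S531: table lookup
    return _RENDERERS[mode](html.escape(text))    # S532: apply the mode
```

Separating the mark-to-mode table from the mode-to-renderer table means the user-editable mapping can change without touching the rendering code.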
Referring to Fig. 6, Fig. 6 is a functional block diagram of the first embodiment of the speech-to-text conversion device of the present invention. The device comprises:
An acquisition module 10, configured to obtain an audio file;
A first generation module 20, configured to convert the speech contained in the audio file into text in the order of the audio file's time axis to generate first text information;
A first conversion module 30, configured to convert the recording dotting marks in the audio file into text marks;
A second generation module 40, configured to insert the text marks into the corresponding positions in the first text information to generate second text information.
The acquisition module 10 can obtain the audio file in a wired or wireless manner; for example, a lecture audio file can be downloaded from the Internet. The audio file contains recording dotting marks.
The first generation module 20 converts speech into text by a speech-to-text (Speech To Text, STT) function or algorithm: speech is extracted successively in the order of the audio file's time axis, each extracted piece of speech is converted into text, and the pieces of text produced by the conversion are combined into the first text information.
The first conversion module 30 converts the recording dotting marks in the audio file into text marks. Text marks can take many styles, such as icons of various colors or shapes.
The second generation module 40 inserts each text mark into the first text information at the position corresponding to where its recording dotting mark lies in the audio file, generating the second text information, so that the second text information contains both the text converted from speech and the text marks converted from the recording dotting marks.
In this embodiment of the present invention, during speech-to-text conversion, the first generation module 20 converts the speech contained in the audio file into text to generate the first text information, the first conversion module 30 converts the recording dotting marks in the audio file into text marks, and the second generation module 40 then inserts the converted text marks into the corresponding positions in the first text information to generate the second text information. The resulting second text information contains both the text converted from speech and the text marks converted from the recording dotting marks. Users can conveniently view, edit, and otherwise work with the second text information; for example, by looking for the text marks, a user can see at a glance where recording dotting marks were made, without reading through the second text information from the beginning.
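The module structure of Fig. 6 could be sketched as a small composition of four callables, one per module. The class name, the callable signatures, and the `(time, text)` pair layout are assumptions for illustration; the source describes the modules functionally and does not prescribe an interface.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Timed = Tuple[float, str]  # (time in seconds, text or text mark); assumed layout


@dataclass
class SpeechToTextDevice:
    acquire: Callable[[str], bytes]                 # acquisition module 10
    generate_first: Callable[[bytes], List[Timed]]  # first generation module 20
    convert_marks: Callable[[bytes], List[Timed]]   # first conversion module 30
    generate_second: Callable[[List[Timed], List[Timed]], str]  # second generation module 40

    def run(self, source: str) -> str:
        """Chain the four modules and return the second text information."""
        audio = self.acquire(source)
        first_text = self.generate_first(audio)
        text_marks = self.convert_marks(audio)
        return self.generate_second(first_text, text_marks)
```

Injecting each module as a callable keeps the sketch independent of any particular download mechanism or STT engine, and makes each module replaceable in isolation.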
Further, the first conversion module 30 comprises: a first acquisition unit 31, configured to obtain the recording dotting marks in the audio file; and a first lookup unit 32, configured to look up, in a preset mapping table of recording dotting marks to text marks, the text mark corresponding to each obtained recording dotting mark.
The mapping table of recording dotting marks to text marks can be preset according to actual needs, as shown in Table 1 above.
If the recording dotting mark obtained by the first acquisition unit 31 is dotting mark A, the first lookup unit 32 consults the preset mapping table of recording dotting marks to text marks and finds that the text mark corresponding to dotting mark A is a five-pointed star.
The mapping table of recording dotting marks to text marks can also be updated at any time according to actual needs, so that it better matches the user's habits.
Referring to Fig. 7, Fig. 7 is a functional block diagram of the second embodiment of the speech-to-text conversion device of the present invention.
Based on the first embodiment of the speech-to-text conversion device of the present invention above, the device further comprises:
A third generation module 50, configured to highlight the text content between two identical and adjacent text marks in the second text information to generate third text information.
The third generation module 50 highlights the text content between two identical and adjacent text marks, which edits the second text information automatically: the text corresponding to the speech between two identical and adjacent recording dotting marks made in the audio file is highlighted automatically. The highlighting mode can be, for example, bold font or red font. When viewing or editing the third text information, the user can see the highlighted text content at a glance, which improves efficiency.
Further, as shown in Fig. 8, the third generation module 50 comprises:
A reading unit 51, configured to read the second text information sequentially;
A judging unit 52, configured to judge, when the reading unit 51 currently reads a text mark, whether the currently read text mark is identical to the previously read text mark;
A highlighting unit 53, configured to highlight, when the currently read text mark is identical to the previously read text mark, the text content between the currently read text mark and the previously read text mark to generate the third text information.
If the reading unit 51 currently reads a text mark, the judging unit 52 can add the currently read text mark to a list of read text marks, find the previously read text mark in that list, and then judge whether the currently read text mark is identical to the previously read text mark. If they are not identical, the reading unit 51 continues reading the second text information from the position of the currently read text mark.
When the currently read text mark is identical to the previously read text mark, the highlighting unit 53 highlights the text content between the currently read text mark and the previously read text mark, which edits the second text information automatically: the text corresponding to the speech between two identical and adjacent recording dotting marks made in the audio file is highlighted automatically. The highlighting mode can be, for example, bold or red. When viewing or editing the third text information, the user can see the highlighted text content at a glance, which improves efficiency.
Further, the highlighting unit 53 comprises:
A second lookup unit, configured to look up, when the currently read text mark is identical to the previously read text mark, the highlighting mode corresponding to the currently read text mark in a preset mapping table of text marks to highlighting modes;
A highlighting subunit, configured to highlight the text content between the currently read text mark and the previously read text mark in the highlighting mode found by the second lookup unit, to generate the third text information.
The mapping table of text marks to highlighting modes can be preset according to actual needs, as shown in Table 2 above.
If the currently read text mark is a red circle, the second lookup unit consults the preset mapping table of text marks to highlighting modes and finds that the highlighting mode corresponding to the red circle is red font.
The mapping table of text marks to highlighting modes can also be updated at any time according to actual needs, so that it better matches the user's habits.
The highlighting subunit highlights the text content between the currently read text mark and the previously read text mark in the highlighting mode found by the second lookup unit; the second text information is thereby edited automatically and the third text information is generated. When viewing or editing the third text information, the user can see the highlighted text content at a glance, which improves efficiency.
The above are only preferred embodiments of the present invention and do not limit its patent scope. Any equivalent structure or equivalent process transformation made using the contents of the description and drawings of the present invention, whether applied directly or indirectly in other related technical fields, is likewise included within the scope of patent protection of the present invention.
Claims (10)
1. A method of speech-to-text conversion, characterized in that the method comprises:
Obtaining an audio file;
Converting the speech contained in the audio file into text in the order of the audio file's time axis to generate first text information;
Converting the recording dotting marks in the audio file into text marks;
Inserting each text mark into the corresponding position in the first text information according to the position in the audio file of the recording dotting mark corresponding to that text mark, to generate second text information, wherein the second text information contains both the text converted from speech and the text marks converted from the recording dotting marks.
2. The method of speech-to-text conversion as claimed in claim 1, characterized in that the step of converting the recording dotting marks in the audio file into text marks comprises:
Obtaining the recording dotting marks in the audio file;
Looking up, in a preset mapping table of recording dotting marks to text marks, the text mark corresponding to each obtained recording dotting mark.
3. The method of speech-to-text conversion as claimed in claim 2, characterized in that, after the step of inserting the text marks into the first text information to generate the second text information, the method further comprises:
Highlighting the text content between two identical and adjacent text marks in the second text information to generate third text information.
4. The method of speech-to-text conversion as claimed in claim 3, characterized in that the step of highlighting the text content between two identical and adjacent text marks in the second text information to generate the third text information comprises:
Reading the second text information sequentially;
If a text mark is currently read, judging whether the currently read text mark is identical to the previously read text mark;
If the currently read text mark is identical to the previously read text mark, highlighting the text content between the currently read text mark and the previously read text mark to generate the third text information.
5. The method of speech-to-text conversion as claimed in claim 4, characterized in that the step of highlighting the text content between the currently read text mark and the previously read text mark to generate the third text information comprises:
Looking up, in a preset mapping table of text marks to highlighting modes, the highlighting mode corresponding to the currently read text mark;
Highlighting the text content between the currently read text mark and the previously read text mark in the highlighting mode found, to generate the third text information.
6. a kind of device of speech-to-text conversion characterized by comprising
Module is obtained, for obtaining audio file;
First generation module, for being turned the voice contained in the audio file according to the time shaft sequence of the audio file
Text is changed to generate the first text information;
First conversion module is converted to text mark for getting the recording in the audio file ready label;
Second generation module, for getting label ready in the position of audio file, by institute according to the corresponding recording of the text mark
The corresponding position that text mark is inserted into first text information is stated, to generate the second text information, wherein described second
Text information had not only included the text being converted by voice, but also included the text mark getting label ready by recording and being converted into.
7. the device of speech-to-text as claimed in claim 6 conversion, which is characterized in that first conversion module includes:
First acquisition unit gets label ready for obtaining the recording in the audio file;
First searching unit searches the record of the acquisition for getting label and text mark mapping table ready according to preset recording
Sound gets the corresponding text mark of label ready.
8. the device of speech-to-text conversion as claimed in claim 7, which is characterized in that the device further include:
Third generation module, for by the text between identical and adjacent two text mark in second text information
Content is highlighted, to generate third text information.
9. the device of speech-to-text as claimed in claim 8 conversion, which is characterized in that the third generation module includes:
Reading unit, for sequentially reading second text information;
Judging unit, for when the reading unit currently reads text mark, judging the text mark currently read
Whether note is identical as the text mark that the last time reads;
Unit is highlighted, for inciting somebody to action when the text mark currently read is identical as the upper text mark once read
Word content between the text mark currently read and the last text mark read is highlighted, to generate
Third text information.
10. The speech-to-text conversion device as claimed in claim 9, characterized in that the highlighting unit comprises:
a second searching unit, configured to look up, when the text mark currently read is identical to the text mark read the previous time, the highlighting mode corresponding to the text mark currently read in a preset mapping table of text marks and highlighting modes;
a highlighting subunit, configured to highlight the text content between the text mark currently read and the text mark read the previous time according to the highlighting mode found by the second searching unit, so as to generate third text information.
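The module chain recited in claims 6 to 10 can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `Segment`, `DotMark`, `DOT_TO_TEXT_MARK`, `MARK_TO_HIGHLIGHT`, `insert_text_marks` and `highlight_between_marks` are hypothetical, as are the concrete mark strings and highlighting styles. The sketch further assumes that the first text information is available as time-stamped segments and that a dotting mark is placed before the first segment that starts at or after the mark's time.

```python
from dataclasses import dataclass

# Hypothetical preset mapping tables (claims 7 and 10); values are illustrative.
DOT_TO_TEXT_MARK = {1: "[KEY]", 2: "[TODO]"}
MARK_TO_HIGHLIGHT = {"[KEY]": "**{}**", "[TODO]": "__{}__"}


@dataclass
class Segment:
    start: float  # position on the audio time axis, in seconds
    text: str     # text converted from speech


@dataclass
class DotMark:
    time: float   # position of the recording dotting mark, in seconds
    kind: int     # key into DOT_TO_TEXT_MARK


def insert_text_marks(segments, dots):
    """Second generation module: merge text and text marks by audio position."""
    events = [(s.start, 1, s.text) for s in segments]
    # First conversion module: look up the text mark for each dotting mark.
    events += [(d.time, 0, DOT_TO_TEXT_MARK[d.kind]) for d in dots]
    # At equal times, a mark (order 0) sorts before text (order 1).
    events.sort(key=lambda e: (e[0], e[1]))
    return [e[2] for e in events]


def highlight_between_marks(tokens):
    """Third generation module: highlight text between identical adjacent marks."""
    out = []
    last_mark = None      # text mark read the previous time
    last_mark_idx = None  # its index in `out`
    for tok in tokens:    # reading unit: sequential read
        if tok in MARK_TO_HIGHLIGHT:
            if tok == last_mark:  # judging unit
                style = MARK_TO_HIGHLIGHT[tok]  # second searching unit
                start = last_mark_idx + 1
                # Highlighting subunit: restyle everything since the last mark.
                out[start:] = [style.format(t) for t in out[start:]]
            # Following the claim literally, the mark just read becomes the
            # "previous" mark for the next comparison.
            last_mark, last_mark_idx = tok, len(out)
        out.append(tok)
    return out
```

For example, with segments "Hello" at 0.0 s, "important" at 2.0 s and "bye" at 4.0 s, and two dotting marks of kind 1 at 1.5 s and 3.0 s, `insert_text_marks` yields `["Hello", "[KEY]", "important", "[KEY]", "bye"]`, and `highlight_between_marks` then wraps only `important` in the style mapped to `[KEY]`. Representing the text as a token list rather than a single string is a design choice of this sketch; it keeps the marks and the converted text distinguishable without parsing.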
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510126575.2A CN104751846B (en) | 2015-03-20 | 2015-03-20 | The method and device of speech-to-text conversion |
Publications (2)
Publication Number | Publication Date |
---|---|
CN104751846A (en) | 2015-07-01 |
CN104751846B (en) | 2019-03-01 |
Family
ID=53591408
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510126575.2A Active CN104751846B (en) | 2015-03-20 | 2015-03-20 | The method and device of speech-to-text conversion |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104751846B (en) |
Families Citing this family (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105653729B (en) * | 2016-01-28 | 2019-10-08 | 努比亚技术有限公司 | A kind of device and method of recording file index |
CN106067302B (en) * | 2016-05-27 | 2019-06-25 | 努比亚技术有限公司 | Denoising device and method |
CN106341204B (en) * | 2016-09-29 | 2019-02-22 | 北京小米移动软件有限公司 | Audio-frequency processing method and device |
CN106571137A (en) * | 2016-10-28 | 2017-04-19 | 努比亚技术有限公司 | Terminal voice dotting control device and method |
CN107181849A (en) * | 2017-04-19 | 2017-09-19 | 北京小米移动软件有限公司 | The way of recording and device |
CN106911832B (en) * | 2017-04-28 | 2020-06-02 | 四川音创伟业科技有限公司 | Voice recording method and device |
CN109243469B (en) * | 2017-12-13 | 2021-12-10 | 中国航空工业集团公司北京航空精密机械研究所 | Digital detection information acquisition system |
CN108647190B (en) * | 2018-04-25 | 2022-04-29 | 北京华夏电通科技股份有限公司 | Method, device and system for inserting voice recognition text into script document |
CN109545187A (en) * | 2018-11-21 | 2019-03-29 | 维沃移动通信有限公司 | A kind of display control method and terminal |
CN114999464A (en) * | 2022-05-25 | 2022-09-02 | 高创(苏州)电子有限公司 | Voice data processing method and device |
CN115237316A (en) * | 2022-06-06 | 2022-10-25 | 华为技术有限公司 | Audio track marking method and electronic equipment |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1469370A (en) * | 2002-06-26 | 2004-01-21 | 日本胜利株式会社 | Text data recording method and apparatus |
CN1652205A (en) * | 2004-01-14 | 2005-08-10 | 索尼株式会社 | Audio signal processing apparatus and audio signal processing method |
CN1822189A (en) * | 2006-03-02 | 2006-08-23 | 无敌科技(西安)有限公司 | Content identifying method for digital recorded file |
CN101253549A (en) * | 2005-08-26 | 2008-08-27 | 皇家飞利浦电子股份有限公司 | System and method for synchronizing sound and manually transcribed text |
CN103247289A (en) * | 2012-02-01 | 2013-08-14 | 鸿富锦精密工业(深圳)有限公司 | Recording system, recording method, sound inputting device, voice recording device and voice recording method |
CN103399865A (en) * | 2013-07-05 | 2013-11-20 | 华为技术有限公司 | Method and device for multi-media file generation |
CN103400592A (en) * | 2013-07-30 | 2013-11-20 | 北京小米科技有限责任公司 | Recording method, playing method, device, terminal and system |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6353809B2 (en) * | 1997-06-06 | 2002-03-05 | Olympus Optical, Ltd. | Speech recognition with text generation from portions of voice data preselected by manual-input commands |
EP2816549B1 (en) * | 2013-06-17 | 2016-08-03 | Yamaha Corporation | User bookmarks by touching the display of a music score while recording ambient audio |
Also Published As
Publication number | Publication date |
---|---|
CN104751846A (en) | 2015-07-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN104751846B (en) | The method and device of speech-to-text conversion | |
US20200294487A1 (en) | Hands-free annotations of audio text | |
US11133025B2 (en) | Method and system for speech emotion recognition | |
CN110751943A (en) | Voice emotion recognition method and device and related equipment | |
CN104240703B (en) | Voice information processing method and device | |
CN109254669B (en) | Expression picture input method and device, electronic equipment and system | |
CN109545184B (en) | Recitation detection method based on voice calibration and electronic equipment | |
CN109410664A (en) | Pronunciation correction method and electronic equipment | |
US20120196260A1 (en) | Electronic Comic (E-Comic) Metadata Processing | |
CN104867494B (en) | The name sorting technique and system of a kind of recording file | |
CN104252872B (en) | Lyric generating method and intelligent terminal | |
KR102076793B1 (en) | Method for providing electric document using voice, apparatus and method for writing electric document using voice | |
CN111292751A (en) | Semantic analysis method and device, voice interaction method and device, and electronic equipment | |
CN107240394A (en) | A kind of dynamic self-adapting speech analysis techniques for man-machine SET method and system | |
CN110111778A (en) | A kind of method of speech processing, device, storage medium and electronic equipment | |
CN106484134A (en) | The method and device of the phonetic entry punctuation mark based on Android system | |
CN105956014A (en) | Music playing method based on deep learning | |
CN112053692A (en) | Speech recognition processing method, device and storage medium | |
KR20140123369A (en) | Question answering system using speech recognition and its application method thereof | |
KR100593589B1 (en) | Multilingual Interpretation / Learning System Using Speech Recognition | |
CN110767233A (en) | Voice conversion system and method | |
KR20190143116A (en) | Talk auto-recording apparatus method | |
CN110047473B (en) | Man-machine cooperative interaction method and system | |
CN107331396A (en) | Export the method and device of numeral | |
CN106911832A (en) | A kind of method and device of voice record |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |