CN107689225A - A method for automatically generating meeting minutes - Google Patents
- Publication number
- CN107689225A
- Authority
- CN
- China
- Prior art keywords
- speaker
- vocal print
- print feature
- sound bite
- label
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L17/00—Speaker identification or verification techniques
- G10L17/02—Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction
- G10L17/22—Interactive procedures; Man-machine interfaces
Landscapes
- Engineering & Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Computational Linguistics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Telephonic Communication Services (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The present invention provides a method for automatically generating meeting minutes, comprising the steps of: 1) recording the audio of the meeting; 2) extracting, by voice activity detection (VAD), a number of speech segments that contain only spoken content; 3) converting the speech segments to text by speech recognition; 4) extracting speaker voiceprint features from the speech segments, clustering the speech segments by speaker according to the voiceprint features, and assigning speaker labels; 5) storing the speech segments, the corresponding text, and the speaker labels; 6) replacing the speaker labels with the actual speakers to obtain the final meeting minutes. The invention performs speaker clustering automatically, without collecting the voice features or voiceprint information of the attendees in advance.
Description
Technical field
The present invention relates to the technical field of information processing, and in particular to a method for automatically generating meeting minutes.
Background
Meeting minutes are generally produced by a dedicated note-taker who records the meeting with equipment such as a voice recorder or video camera and then compiles the record by hand. As technology has developed, people have begun to study how to obtain meeting minutes automatically and quickly, including who spoke and what was said.
Patent application CN201510530579 discloses a method for taking meeting minutes in which speech software converts the voice signal into the corresponding text; erroneous text is then identified, edited, and displayed. Patent application CN201410839533 discloses a meeting minutes device and a method for automatically generating meeting minutes: based on the voice signal received during the meeting and a user voice feature table stored in memory, the device identifies the user corresponding to the voice signal and then converts the voice signal to text to obtain the minutes.
In the prior art described above, speaker identification requires collecting and storing the voice feature information of the relevant attendees in advance; the recorded speech is then compared against the stored user voice feature information to identify who is speaking. In practice, however, the set of attendees is often not fixed, and a meeting may even include staff from partner companies, so the attendees' voice feature information often cannot be collected in advance. A temporary change in attendance (for example, added attendees) also breaks the prior-art workflow. Moreover, storing attendees' voice feature information itself carries a risk of leakage (if the voice feature information of an important person leaks, it may be exploited by criminals).
Summary of the invention
In view of this, the present invention provides a method for automatically generating meeting minutes that can group speech by speaker without collecting and storing the attendees' voice features in advance. Once the speech has been grouped by speaker, the person compiling the minutes can easily see what each speaker said during the meeting and organize it.
To achieve the above object, the technical solution of the present invention is a method for automatically generating meeting minutes, comprising the following steps:
Step S1: Record the audio of the meeting;
Step S2: Extract, by VAD, a number of speech segments that contain only spoken content;
Step S3: Convert the speech segments to text by speech recognition;
Step S4: Extract speaker voiceprint features from the speech segments, classify the speech segments by speaker according to the voiceprint features, and assign speaker labels;
Step S5: Store the speech segments, the corresponding text, and the speaker labels;
Step S6: Replace the speaker labels with the actual speakers to obtain the final meeting minutes.
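Step S2 names VAD but does not say which detection technique is used. As a rough illustration only, the sketch below assumes a simple frame-energy threshold: frames whose mean squared amplitude exceeds a threshold count as speech, and consecutive speech frames are merged into segments. The function name `detect_speech_segments`, the frame length, the threshold, and the minimum segment length are all assumptions for this example and do not come from the patent.

```python
from typing import List, Tuple


def detect_speech_segments(
    samples: List[float],
    frame_len: int = 160,            # assumed frame size in samples
    energy_threshold: float = 0.01,  # assumed mean-square energy threshold
    min_frames: int = 3,             # assumed minimum segment length in frames
) -> List[Tuple[int, int]]:
    """Return (start, end) sample ranges of runs of high-energy frames."""
    segments: List[Tuple[int, int]] = []
    start = None  # index of the first frame of the current speech run
    n_frames = len(samples) // frame_len
    for i in range(n_frames):
        frame = samples[i * frame_len:(i + 1) * frame_len]
        energy = sum(x * x for x in frame) / frame_len
        if energy >= energy_threshold:
            if start is None:
                start = i
        elif start is not None:
            # The run ended; keep it only if it is long enough.
            if i - start >= min_frames:
                segments.append((start * frame_len, i * frame_len))
            start = None
    # Close a run that reaches the end of the recording.
    if start is not None and n_frames - start >= min_frames:
        segments.append((start * frame_len, n_frames * frame_len))
    return segments


# Silence, a long burst, silence, then a burst too short to keep.
audio = [0.0] * 800 + [0.5] * 800 + [0.0] * 800 + [0.5] * 320
speech = detect_speech_segments(audio)
```

A production system would more likely use a trained VAD model; the energy threshold here only shows how a recording can be cut into speech-only segments before recognition and voiceprint extraction.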
Further, step S4 is specifically as follows:
In the order of the speech segments, extract the voiceprint feature of each speech segment using i-vector and GMM-UBM speech techniques;
Temporarily store the first extracted voiceprint feature, and assign the text converted from its speech segment the speaker label speakerX, where X is a counter and X=1;
Extract the remaining voiceprint features in turn and match each one against the temporarily stored voiceprint features one by one. If a match succeeds, assign the text converted from the corresponding speech segment the same speaker label speakerX; if no match succeeds, increment the speaker label counter by 1 for the extracted voiceprint feature and temporarily store that feature.
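The labeling procedure of step S4 can be sketched as follows. This is a minimal illustration, not the patented implementation: the voiceprint features are plain lists of floats standing in for i-vectors, and a match is assumed to mean a cosine similarity at or above a threshold. The patent does not specify the matching criterion, so `cosine_similarity`, `assign_speaker_labels`, and the threshold value 0.8 are assumptions.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def assign_speaker_labels(
    voiceprints: Sequence[Sequence[float]],
    threshold: float = 0.8,  # assumed match threshold, not from the patent
) -> List[str]:
    """Label each voiceprint speakerX in segment order.

    A voiceprint that matches a temporarily stored one reuses that label;
    otherwise the counter is incremented and the voiceprint is stored.
    """
    stored: List[Sequence[float]] = []  # stored[i] belongs to speaker(i + 1)
    labels: List[str] = []
    for voiceprint in voiceprints:
        matched = None
        # Match against the stored voiceprints one by one; first match wins.
        for index, reference in enumerate(stored):
            if cosine_similarity(voiceprint, reference) >= threshold:
                matched = index + 1
                break
        if matched is None:
            stored.append(voiceprint)
            matched = len(stored)  # counter incremented by 1
        labels.append(f"speaker{matched}")
    return labels


# Toy vectors: three distinct voices, then one close to the first.
labels = assign_speaker_labels(
    [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [0.9, 0.1, 0.0]]
)
```

Taking the first stored voiceprint that passes the threshold follows the "one by one" wording; choosing the best-scoring match instead would be an equally plausible reading.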
Further, the method also includes step S7: Delete the temporarily stored voiceprint feature data.
Compared with the prior art, the present invention has the following beneficial effects: the voice features and voiceprint information of attendees need not be collected in advance, so a temporary change in attendance does not affect the workflow of the invention; and although speakers' voiceprint feature information is extracted during processing, it is deleted soon after processing ends, which protects the security of that information.
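One way to make the deletion in step S7 hard to forget is to tie the temporary store to a scope, so the voiceprints are cleared when processing ends, even if an error occurs. The sketch below assumes the voiceprints are held in memory as lists of floats; `temporary_voiceprint_store` is a hypothetical helper name, and the patent does not describe how deletion is carried out.

```python
from contextlib import contextmanager
from typing import Iterator, List


@contextmanager
def temporary_voiceprint_store() -> Iterator[List[List[float]]]:
    """Yield an in-memory voiceprint store that is wiped on exit."""
    store: List[List[float]] = []
    try:
        yield store
    finally:
        # Best-effort wipe: overwrite each vector, then drop all references.
        # Python gives no guarantee that other copies in memory are erased.
        for voiceprint in store:
            for i in range(len(voiceprint)):
                voiceprint[i] = 0.0
        store.clear()


first_voiceprint = [0.12, 0.34, 0.56]  # made-up values for illustration
with temporary_voiceprint_store() as store:
    store.append(first_voiceprint)
    size_during_processing = len(store)
```

After the `with` block, the store is empty and the vector it held has been overwritten with zeros.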
Brief description of the drawings
Fig. 1 is a flow chart of the method for automatically generating meeting minutes of the present invention;
Fig. 2 is a schematic diagram of the processing in one embodiment of the present invention.
Detailed description
The present invention is further described below with reference to the accompanying drawings and an embodiment.
As shown in Fig. 1, a method for automatically generating meeting minutes comprises the following steps:
Step S1: Record the audio of the meeting;
Step S2: Extract, by VAD, a number of speech segments that contain only spoken content;
Step S3: Convert the speech segments to text by speech recognition;
Step S4: Extract speaker voiceprint features from the speech segments, classify the speech segments by speaker according to the voiceprint features, and assign speaker labels;
Specifically:
In the order of the speech segments, extract the voiceprint feature of each speech segment using i-vector and GMM-UBM speech techniques;
Temporarily store the first extracted voiceprint feature, and assign the text converted from its speech segment the speaker label speakerX, where X is a counter and X=1;
As shown in Fig. 2, in this embodiment the text converted from the speech segment of the first extracted voiceprint feature is "Welcome everyone to today's meeting", and it is given the speaker label speaker1;
Extract the remaining voiceprint features in turn and match each one against the temporarily stored voiceprint features. If a match succeeds, assign the text converted from the corresponding speech segment the same speaker label speakerX; if no match succeeds, increment the speaker label counter by 1 for the extracted voiceprint feature and temporarily store that feature;
In this embodiment, the second extracted voiceprint feature fails to match the temporarily stored voiceprint feature of speaker1, so the speaker label counter is incremented by 1 and the text converted from the corresponding speech segment is labeled speaker2. The third extracted voiceprint feature is then matched against the temporarily stored voiceprint features of speaker1 and speaker2; the match fails, so the counter is incremented by 1 and the text converted from the third speech segment is labeled speaker3. The fourth extracted voiceprint feature is matched one by one against the stored features of speaker1, speaker2, and speaker3 and matches speaker1, so the text converted from the fourth speech segment is given the same label, speaker1. All remaining speech segments are processed in the same way;
Step S5: Store the speech segments, the corresponding text, and the speaker labels;
Step S6: Replace the speaker labels with the actual speakers to obtain the final meeting minutes;
Step S7: Delete the temporarily stored voiceprint feature data.
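For the Fig. 2 example, the records stored in step S5 might look like the sketch below. The `MinutesEntry` class, its field names, the file names, and the JSON format are assumptions made for illustration; the source says only that each speech segment, its text, and its speaker label are stored. The text of segments 2 to 4 is not given in the source, so placeholders are used.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class MinutesEntry:
    segment_path: str   # hypothetical location of the saved speech segment
    text: str           # text produced by speech recognition (step S3)
    speaker_label: str  # label assigned in step S4


# Labels follow the embodiment: speaker1, speaker2, speaker3, speaker1.
entries: List[MinutesEntry] = [
    MinutesEntry("segment_001.wav", "Welcome everyone to today's meeting", "speaker1"),
    MinutesEntry("segment_002.wav", "<text of segment 2>", "speaker2"),
    MinutesEntry("segment_003.wav", "<text of segment 3>", "speaker3"),
    MinutesEntry("segment_004.wav", "<text of segment 4>", "speaker1"),
]

# Serialize the records so they can be written to a file or database.
serialized = json.dumps([asdict(entry) for entry in entries], ensure_ascii=False, indent=2)
```

Keeping the segment reference next to the text lets the person compiling the minutes replay a segment when deciding which real speaker a label belongs to.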
The present invention does not need to collect the voice features or voiceprint information of attendees in advance. It extracts and compares voiceprint information locally and classifies speakers automatically; in the end, the classification labels only need to be replaced with the actual speakers to complete the meeting minutes.
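The final replacement of labels with actual speakers (step S6) amounts to a lookup. The sketch below assumes the person compiling the minutes supplies a mapping from labels to names; the function name `replace_speaker_labels`, the output format, and the names in the example are hypothetical.

```python
from typing import Dict, List, Tuple


def replace_speaker_labels(
    entries: List[Tuple[str, str]],  # (speaker_label, text) pairs in meeting order
    names: Dict[str, str],           # label -> actual speaker, supplied by a person
) -> str:
    """Render the minutes, replacing each label with the mapped name."""
    lines = []
    for label, text in entries:
        # Keep the original label when no name has been supplied for it.
        speaker = names.get(label, label)
        lines.append(f"{speaker}: {text}")
    return "\n".join(lines)


minutes = replace_speaker_labels(
    [
        ("speaker1", "Welcome everyone to today's meeting"),
        ("speaker2", "<text of segment 2>"),
    ],
    {"speaker1": "Chair"},  # hypothetical name; speaker2 is left unmapped
)
```

Falling back to the label for unmapped speakers keeps the minutes readable when only some attendees have been identified.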
The foregoing is only a preferred embodiment of the present invention. All equivalent changes and modifications made within the scope of the claims of the present invention shall fall within the scope of the present invention.
Claims (3)
- 1. A method for automatically generating meeting minutes, characterized by comprising the following steps:
  Step S1: Record the audio of the meeting;
  Step S2: Extract, by VAD, a number of speech segments that contain only spoken content;
  Step S3: Convert the speech segments to text by speech recognition;
  Step S4: Extract speaker voiceprint features from the speech segments, classify the speech segments by speaker according to the voiceprint features, and assign speaker labels;
  Step S5: Store the speech segments, the corresponding text, and the speaker labels;
  Step S6: Replace the speaker labels with the actual speakers to obtain the final meeting minutes.
- 2. The method for automatically generating meeting minutes according to claim 1, characterized in that step S4 is specifically as follows:
  In the order of the speech segments, extract the voiceprint feature of each speech segment using i-vector and GMM-UBM speech techniques;
  Temporarily store the first extracted voiceprint feature, and assign the text converted from its speech segment the speaker label speakerX, where X is a counter and X=1;
  Extract the remaining voiceprint features in turn and match each one against the temporarily stored voiceprint features one by one. If a match succeeds, assign the text converted from the corresponding speech segment the same speaker label speakerX; if no match succeeds, increment the speaker label counter by 1 for the extracted voiceprint feature and temporarily store that feature.
- 3. The method for automatically generating meeting minutes according to claim 2, characterized by further comprising step S7: Delete the temporarily stored voiceprint feature data.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710907548.8A CN107689225B (en) | 2017-09-29 | 2017-09-29 | A method of automatically generating minutes |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710907548.8A CN107689225B (en) | 2017-09-29 | 2017-09-29 | A method of automatically generating minutes |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107689225A (en) | 2018-02-13 |
CN107689225B (en) | 2019-11-19 |
Family
ID=61153780
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710907548.8A Active CN107689225B (en) | 2017-09-29 | 2017-09-29 | A method of automatically generating minutes |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107689225B (en) |
Cited By (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108447502A (en) * | 2018-03-09 | 2018-08-24 | 福州米鱼信息科技有限公司 | A kind of memo method and terminal based on voice messaging |
CN108510992A (en) * | 2018-03-22 | 2018-09-07 | 北京云知声信息技术有限公司 | The method of voice wake-up device |
CN109767757A (en) * | 2019-01-16 | 2019-05-17 | 平安科技(深圳)有限公司 | A kind of minutes generation method and device |
CN110335612A (en) * | 2019-07-11 | 2019-10-15 | 招商局金融科技有限公司 | Minutes generation method, device and storage medium based on speech recognition |
CN110428853A (en) * | 2019-08-30 | 2019-11-08 | 北京太极华保科技股份有限公司 | Voice activity detection method, Voice activity detection device and electronic equipment |
WO2020073633A1 (en) * | 2018-10-12 | 2020-04-16 | 深圳海翼智新科技有限公司 | Conference loudspeaker box, conference recording method, device and system, and computer storage medium |
CN111145903A (en) * | 2019-12-18 | 2020-05-12 | 东北大学 | Method and device for acquiring vertigo inquiry text, electronic equipment and inquiry system |
CN111180025A (en) * | 2019-12-18 | 2020-05-19 | 东北大学 | Method and device for representing medical record text vector and inquiry system |
WO2020147256A1 (en) * | 2019-01-16 | 2020-07-23 | 平安科技(深圳)有限公司 | Conference content distinguishing method and apparatus, and computer device and storage medium |
CN112017632A (en) * | 2020-09-02 | 2020-12-01 | 浪潮云信息技术股份公司 | Automatic conference record generation method |
CN113113022A (en) * | 2021-04-15 | 2021-07-13 | 吉林大学 | Method for automatically identifying identity based on voiceprint information of speaker |
CN113299279A (en) * | 2021-05-18 | 2021-08-24 | 上海明略人工智能(集团)有限公司 | Method, apparatus, electronic device and readable storage medium for associating voice data and retrieving voice data |
CN113536257A (en) * | 2021-07-27 | 2021-10-22 | 南京邮电大学盐城大数据研究院有限公司 | Multi-party conference admission method and system based on block chain |
CN115394304A (en) * | 2021-03-30 | 2022-11-25 | 北京百度网讯科技有限公司 | Voiceprint determination method, apparatus, system, device and storage medium |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2008032825A (en) * | 2006-07-26 | 2008-02-14 | Fujitsu Fsas Inc | Speaker display system, speaker display method and speaker display program |
CN102985965A (en) * | 2010-05-24 | 2013-03-20 | 微软公司 | Voice print identification |
CN103165131A (en) * | 2011-12-17 | 2013-06-19 | 富泰华工业(深圳)有限公司 | Voice processing system and voice processing method |
WO2015024413A1 (en) * | 2013-08-22 | 2015-02-26 | 中兴通讯股份有限公司 | Conference summary extraction method and device |
CN106657865A (en) * | 2016-12-16 | 2017-05-10 | 联想(北京)有限公司 | Method and device for generating conference summary and video conference system |
CN106782545A (en) * | 2016-12-16 | 2017-05-31 | 广州视源电子科技股份有限公司 | A kind of system and method that audio, video data is changed into writing record |
Cited By (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108447502A (en) * | 2018-03-09 | 2018-08-24 | 福州米鱼信息科技有限公司 | A kind of memo method and terminal based on voice messaging |
CN108447502B (en) * | 2018-03-09 | 2020-09-22 | 福州米鱼信息科技有限公司 | Memorandum method and terminal based on voice information |
CN108510992A (en) * | 2018-03-22 | 2018-09-07 | 北京云知声信息技术有限公司 | The method of voice wake-up device |
WO2020073633A1 (en) * | 2018-10-12 | 2020-04-16 | 深圳海翼智新科技有限公司 | Conference loudspeaker box, conference recording method, device and system, and computer storage medium |
WO2020147256A1 (en) * | 2019-01-16 | 2020-07-23 | 平安科技(深圳)有限公司 | Conference content distinguishing method and apparatus, and computer device and storage medium |
CN109767757A (en) * | 2019-01-16 | 2019-05-17 | 平安科技(深圳)有限公司 | A kind of minutes generation method and device |
CN110335612A (en) * | 2019-07-11 | 2019-10-15 | 招商局金融科技有限公司 | Minutes generation method, device and storage medium based on speech recognition |
CN110428853A (en) * | 2019-08-30 | 2019-11-08 | 北京太极华保科技股份有限公司 | Voice activity detection method, Voice activity detection device and electronic equipment |
CN111180025A (en) * | 2019-12-18 | 2020-05-19 | 东北大学 | Method and device for representing medical record text vector and inquiry system |
CN111145903A (en) * | 2019-12-18 | 2020-05-12 | 东北大学 | Method and device for acquiring vertigo inquiry text, electronic equipment and inquiry system |
CN112017632A (en) * | 2020-09-02 | 2020-12-01 | 浪潮云信息技术股份公司 | Automatic conference record generation method |
CN115394304A (en) * | 2021-03-30 | 2022-11-25 | 北京百度网讯科技有限公司 | Voiceprint determination method, apparatus, system, device and storage medium |
CN113113022A (en) * | 2021-04-15 | 2021-07-13 | 吉林大学 | Method for automatically identifying identity based on voiceprint information of speaker |
CN113299279A (en) * | 2021-05-18 | 2021-08-24 | 上海明略人工智能(集团)有限公司 | Method, apparatus, electronic device and readable storage medium for associating voice data and retrieving voice data |
CN113536257A (en) * | 2021-07-27 | 2021-10-22 | 南京邮电大学盐城大数据研究院有限公司 | Multi-party conference admission method and system based on block chain |
Also Published As
Publication number | Publication date |
---|---|
CN107689225B (en) | 2019-11-19 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107689225B (en) | A method of automatically generating minutes | |
CN106657865B (en) | Conference summary generation method and device and video conference system | |
CN106782545B (en) | A kind of system and method that audio, video data is converted to writing record | |
CN107274916B (en) | Method and device for operating audio/video file based on voiceprint information | |
WO2020211354A1 (en) | Speaker identity recognition method and device based on speech content, and storage medium | |
US8676586B2 (en) | Method and apparatus for interaction or discourse analytics | |
JP4466564B2 (en) | Document creation / viewing device, document creation / viewing robot, and document creation / viewing program | |
US8219404B2 (en) | Method and apparatus for recognizing a speaker in lawful interception systems | |
CN112037791B (en) | Conference summary transcription method, apparatus and storage medium | |
CN105975569A (en) | Voice processing method and terminal | |
CN107154257A (en) | Customer service quality evaluating method and system based on customer voice emotion | |
US20130158992A1 (en) | Speech processing system and method | |
TWI590240B (en) | Meeting minutes device and method thereof for automatically creating meeting minutes | |
CN105957514A (en) | Portable deaf-mute communication equipment | |
JP2010060850A (en) | Minute preparation support device, minute preparation support method, program for supporting minute preparation and minute preparation support system | |
CA2652970A1 (en) | System and method for sorting objects using ocr and speech recognition techniques | |
CN106982344A (en) | video information processing method and device | |
CN111223487B (en) | Information processing method and electronic equipment | |
CN206672635U (en) | A kind of voice interaction device based on book service robot | |
JP2017167726A (en) | Conversation analyzer, method and computer program | |
CN113744742A (en) | Role identification method, device and system in conversation scene | |
CN113327619B (en) | Conference recording method and system based on cloud-edge collaborative architecture | |
CN111010484A (en) | Automatic quality inspection method for call recording | |
CN112468753B (en) | Method and device for acquiring and checking record data based on audio and video recognition technology | |
CN112562644A (en) | Customer service quality inspection method, system, equipment and medium based on human voice separation |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||