CN106356067A - Recording method, device and terminal - Google Patents

Recording method, device and terminal

Info

Publication number
CN106356067A
CN106356067A (application CN201610729168.5A)
Authority
CN
China
Prior art keywords
sound
voice data
sector
sound source
voice
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610729168.5A
Other languages
Chinese (zh)
Inventor
潘志刚
于铎
谢莹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
LeTV Holding Beijing Co Ltd
LeTV Mobile Intelligent Information Technology Beijing Co Ltd
Original Assignee
LeTV Holding Beijing Co Ltd
LeTV Mobile Intelligent Information Technology Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by LeTV Holding Beijing Co Ltd and LeTV Mobile Intelligent Information Technology Beijing Co Ltd
Priority to CN201610729168.5A
Publication of CN106356067A
Legal status: Pending


Classifications

    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 - Speaker identification or verification techniques
    • G10L21/00 - Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 - Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0272 - Voice signal separating
    • G10L21/028 - Voice signal separating using properties of sound source

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Signal Processing (AREA)
  • Circuit For Audible Band Transducer (AREA)

Abstract

An embodiment of the invention provides a recording method, device, and terminal. The method comprises: receiving a plurality of audio data sent by at least two sound sources; determining the sound-source direction and/or position of each of the at least two sound sources based on the received audio data; determining at least two target sectors corresponding to the at least two sound sources and assigning a sector identifier to each target sector; and generating at least one audio file that contains the correspondence between the audio data and the sector identifiers. The method can identify the sector to which each piece of audio data belongs, set a sector identifier for each piece of audio data collected by the sound-collection devices, and then generate at least one audio file containing the correspondence between the audio data and the sector identifiers, so that the audio data corresponding to a given sector identifier can be retrieved easily. This simplifies the workflow for retrieving sound content, saves time, and improves efficiency.

Description

Recording method, device and terminal
Technical field
The present invention relates to the field of audio processing, and more particularly to a recording method, device, and terminal.
Background art
Recording is the process of converting sound into an electrical signal via a microphone and amplifier and storing it on a medium using various materials and techniques. Currently, the recording file obtained after recording contains the audio data of every sound object the microphone picked up during recording. For example, in a conference, the session recording captures the speech signals of all speakers participating in the meeting, as well as noise produced by participants' body movements and so on.
The inventors found, in the course of realizing the embodiments of the present invention, that because the recording file records the speech signals of multiple speakers received by the microphone in different time periods, and the voice of each speaker is difficult to distinguish by ear, a user who wants to retrieve the speech content of a specific speaker from the recording file may need to play the file back repeatedly, which wastes time and energy and yields low efficiency.
Summary of the invention
To overcome the problems in the related art, the present invention provides a recording method, device, and terminal.
According to a first aspect of embodiments of the present invention, a recording method is provided, comprising:
receiving a plurality of audio data sent by at least two sound sources;
determining the sound-source direction and/or position of each of the at least two sound sources according to the received plurality of audio data;
determining, according to the determined sound-source direction and/or position of each of the at least two sound sources, at least two target sectors in one-to-one correspondence with the at least two sound sources, and allocating a sector identifier to each of the determined target sectors;
generating at least one audio file containing the correspondence between the audio data and the sector identifiers.
Optionally, the at least two target sectors do not overlap one another, and each target sector covers only the sound-source direction and/or position of its corresponding sound source.
Optionally, the method further comprises:
obtaining audio data that share a common sector identifier;
extracting voiceprint features from the audio data;
judging, according to the voiceprint features, whether the audio data in the target sector originate from the same sound source;
when the audio data in the target sector do not originate from the same sound source, setting different sound-source identifiers for the audio data originating from different sound sources within the target sector.
Optionally, generating the at least one audio file containing the correspondence between the audio data and the sector identifiers comprises:
generating a first audio file, wherein the plurality of audio data in the first audio file are sorted by collection time, and each piece of audio data is tagged with its corresponding sector identifier.
Optionally, generating the at least one audio file containing the correspondence between the audio data and the sector identifiers further comprises:
generating at least two second audio files, wherein each second audio file stores the audio data that share one sector identifier.
Optionally, receiving the plurality of audio data sent by the at least two sound sources comprises:
obtaining acoustic information of the audio data collected by each sound-collection device;
determining, according to the acoustic information, the sound-collection device nearest to the sound-source position as the main sound-collection device, and determining the sound-collection devices other than the main sound-collection device as auxiliary sound-collection devices;
determining the main audio data collected by the main sound-collection device, and determining the auxiliary audio data collected by the auxiliary sound-collection devices;
inverting the phase of the auxiliary audio data and superposing it with the main audio data to obtain sound-source data;
determining the sound-source data to be the audio data of the sound source collected by the sound-collection devices.
According to a second aspect of embodiments of the present invention, a recording device is provided, applied to a terminal comprising a plurality of sound-collection devices, the device comprising:
a receiving module, configured to receive a plurality of audio data sent by at least two sound sources;
a first determining module, configured to determine the sound-source direction and/or position of each of the at least two sound sources according to the received plurality of audio data;
a second determining module, configured to determine, according to the determined sound-source direction and/or position of each of the at least two sound sources, at least two target sectors in one-to-one correspondence with the at least two sound sources, and to allocate a sector identifier to each of the determined target sectors;
a generating module, configured to generate at least one audio file containing the correspondence between the audio data and the sector identifiers.
Optionally, in the second determining module, the at least two target sectors do not overlap one another, and each target sector covers only the sound-source direction and/or position of its corresponding sound source.
Optionally, the device further comprises:
an obtaining module, configured to obtain audio data that share a common sector identifier;
an extracting module, configured to extract voiceprint features from the audio data;
a judging module, configured to judge, according to the voiceprint features, whether the audio data in the target sector originate from the same sound source;
a setting module, configured to, when the audio data in the target sector do not originate from the same sound source, set different sound-source identifiers for the audio data originating from different sound sources within the target sector.
Optionally, the generating module is configured to:
generate a first audio file, wherein the plurality of audio data in the first audio file are sorted by collection time, and each piece of audio data is tagged with its corresponding sector identifier.
Optionally, the generating module is further configured to:
generate at least two second audio files, wherein each second audio file stores the audio data that share one sector identifier.
Optionally, the distance between any two of the plurality of sound-collection devices is greater than a preset distance, and the receiving module comprises:
an obtaining submodule, configured to obtain acoustic information of the audio data collected by each sound-collection device;
a determining submodule, configured to determine, according to the acoustic information, the sound-collection device nearest to the sound-source position as the main sound-collection device, and to determine the sound-collection devices other than the main sound-collection device as auxiliary sound-collection devices;
a first determining submodule, configured to determine the main audio data collected by the main sound-collection device, and to determine the auxiliary audio data collected by the auxiliary sound-collection devices;
a superposing submodule, configured to invert the phase of the auxiliary audio data and superpose it with the main audio data to obtain sound-source data;
a third determining submodule, configured to determine the sound-source data to be the audio data of the sound source collected by the sound-collection devices.
According to a third aspect of embodiments of the present invention, a terminal is provided, the terminal comprising:
a processor; and
a memory for storing processor-executable instructions;
wherein the processor is configured to:
receive a plurality of audio data sent by at least two sound sources;
determine the sound-source direction and/or position of each of the at least two sound sources according to the received plurality of audio data;
determine, according to the determined sound-source direction and/or position of each of the at least two sound sources, at least two target sectors in one-to-one correspondence with the at least two sound sources, and allocate a sector identifier to each of the determined target sectors; and
generate at least one audio file containing the correspondence between the audio data and the sector identifiers.
According to a fourth aspect of embodiments of the present invention, a computer storage medium is further provided, wherein the computer storage medium stores a program which, when executed, implements some or all of the steps of any implementation of the recording method provided in the first aspect of the present invention.
The technical solutions provided by embodiments of the invention can include the following beneficial effects:
The present invention first receives a plurality of audio data sent by at least two sound sources and determines, according to the received plurality of audio data, the sound-source direction and/or position of each of the at least two sound sources; it then determines at least two target sectors in one-to-one correspondence with the at least two sound sources and allocates a sector identifier to each of the determined target sectors; and it finally generates at least one audio file containing the correspondence between the audio data and the sector identifiers.
With the method provided by the embodiments of the present invention, the sector to which each piece of audio data belongs can be identified, a sector identifier can be set for each of the plurality of audio data collected by the sound-collection devices, and at least one audio file containing the correspondence between the audio data and the sector identifiers can then be generated, so that the audio data corresponding to a given sector identifier can be retrieved easily. This simplifies the workflow for retrieving sound content, saves time, and improves efficiency.
It should be understood that the above general description and the following detailed description are merely exemplary and explanatory, and do not limit the present invention.
Brief description of the drawings
The accompanying drawings herein are incorporated into and constitute a part of this specification; they illustrate embodiments consistent with the present invention and serve, together with the specification, to explain the principles of the present invention.
Fig. 1 is a flow chart of a recording method according to an exemplary embodiment;
Fig. 2 is another flow chart of a recording method according to an exemplary embodiment;
Fig. 3 is a flow chart of step s101 in Fig. 1;
Fig. 4 is a structural diagram of a recording device according to an exemplary embodiment;
Fig. 5 is another structural diagram of a recording device according to an exemplary embodiment;
Fig. 6 is a block diagram of a terminal according to an exemplary embodiment.
Detailed description
Exemplary embodiments will now be described in detail, examples of which are illustrated in the accompanying drawings. Where the following description refers to the drawings, the same numbers in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all embodiments consistent with the present invention; on the contrary, they are merely examples of devices and methods consistent with some aspects of the present invention as detailed in the appended claims.
Because a recording file records the speech signals of multiple speakers received by the microphone in different time periods, and the voice of each speaker is difficult to identify by ear, a user who wants to retrieve the speech content of a specific speaker from the recording file may need to play the file back repeatedly, which wastes time and energy and yields low efficiency. For this reason, as shown in Fig. 1, one embodiment of the present invention provides a recording method applied to a terminal comprising a plurality of sound-collection devices. The number of sound-collection devices can be 3, 4, 5, and so on, and the distance between any two of the plurality of sound-collection devices can be greater than a preset distance. The preset distance here can be greater than or equal to 30 millimeters, for example 30 millimeters, 35 millimeters, or 40 millimeters, and can be determined according to the actual size of the terminal. The method includes the following steps.
In step s101, a plurality of audio data sent by at least two sound sources are received.
In embodiments of the present invention, the audio data can refer to all audio data collected by a sound-collection device in its working state. The audio data here can be acoustic signals emitted by multiple sound sources, such as the speech signals of people talking, the acoustic signals of objects colliding as a result of body movements, and indoor environmental noise; each sound-collection device can collect the audio data within its effective pickup range.
In this step, after a sound-collection device collects audio data, it can send the collected audio data to the processor in the terminal, and the processor receives the audio data collected by the plurality of sound-collection devices.
In step s102, the sound-source direction and/or position of each of the at least two sound sources is determined according to the received plurality of audio data.
In this step, with the terminal as the center, the sound emitted from any point within the effective pickup range of the sound-collection devices reaches each sound-collection device with a different time delay, loudness, and phase, so the sound-source direction and/or position of each sound source can be determined according to the multiple received audio data.
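The delay-based localization described above can be sketched as follows. This is a minimal far-field sketch assuming one microphone pair, discrete-time cross-correlation, and a nominal speed of sound; the function names and parameters are illustrative, not from the patent:

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, nominal value at room temperature

def estimate_delay(main, aux, sample_rate):
    """Delay (seconds) of `aux` relative to `main`, found by
    brute-force cross-correlation over all integer sample lags."""
    n = len(main)
    best_lag, best_score = 0, float("-inf")
    for lag in range(-(n - 1), n):
        score = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                score += main[i] * aux[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag / sample_rate

def direction_from_delay(delay_s, mic_spacing_m):
    """Far-field angle of arrival in degrees for one microphone pair:
    0 deg = broadside, +/-90 deg = along the microphone axis."""
    ratio = SPEED_OF_SOUND * delay_s / mic_spacing_m
    ratio = max(-1.0, min(1.0, ratio))  # clamp against measurement noise
    return math.degrees(math.asin(ratio))
```

With microphones spaced at least 30 mm apart, as the embodiment suggests, the maximum inter-microphone delay at 48 kHz spans several samples, which is what makes the lag measurable at all.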
In step s103, at least two target sectors in one-to-one correspondence with the at least two sound sources are determined according to the determined sound-source direction and/or position of each of the at least two sound sources, and a sector identifier is allocated to each of the determined target sectors.
In embodiments of the present invention, the effective pickup range of the sound-collection devices can be abstracted as a 2D plane, and the 2D plane can be evenly divided in advance into several preset sound-identification sectors; for example, the 2D plane can be evenly divided into 4 preset sound-identification sectors, 6 preset sound-identification sectors, 8 preset sound-identification sectors, and so on.
In this step, the preset sound-identification sector to which each piece of audio data belongs can be determined according to the sound-source direction and/or position, and the preset sound-identification sector that covers the sound-source direction and/or position of the audio data is determined as a target sector. The at least two target sectors do not overlap one another, and each target sector covers only the sound-source direction and/or position of its corresponding sound source; a sector identifier, such as a, b, or c, can be allocated to each target sector.
For example, when the audio-collection devices simultaneously collect three pieces of audio data (audio data 1, audio data 2, and audio data 3), the sound-source positions of audio data 1, audio data 2, and audio data 3 can first be determined. Taking as an example an effective pickup range divided into 4 preset sound-identification sectors centered on the terminal (with corresponding sector identifiers a, b, c, and d): suppose the sound-source position of audio data 1 lies in the preset sound-identification sector corresponding to a, while audio data 2 and audio data 3 lie in the preset sound-identification sector corresponding to c. Then the preset sound-identification sectors corresponding to a and c can be determined as target sectors; the sector identifier corresponding to audio data 1 is thus a, and the sector identifiers corresponding to audio data 2 and audio data 3 are both c.
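The sector assignment in this example can be sketched as a toy mapping, assuming the pickup plane is divided evenly into `num_sectors` sectors starting at bearing 0 degrees and labeled a, b, c, and so on; the function name and sector layout are assumptions:

```python
import string

def sector_id(bearing_deg, num_sectors=4):
    """Map a sound-source bearing (degrees around the terminal) to the
    letter identifier of the preset sound-identification sector it falls in."""
    width = 360.0 / num_sectors
    index = int((bearing_deg % 360.0) // width)
    return string.ascii_lowercase[index]
```

With four sectors, bearings in [0, 90) map to a, [90, 180) to b, and so on, so two sources in the same quadrant (like audio data 2 and 3 above) receive the same identifier.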
In step s104, at least one audio file containing the correspondence between the audio data and the sector identifiers is generated.
In this step, one audio file can be generated in which the plurality of audio data are sorted by collection time and each piece of audio data is labeled with its corresponding sector identifier; and/or at least two audio files can be generated, each second audio file containing at least one piece of audio data sharing a common sector identifier.
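The time-ordered layout of the first audio file can be sketched as follows; the record shape `(capture_time, sector_id, samples)` is an assumption for illustration, not a format the patent specifies:

```python
def build_first_file(tagged_clips):
    """First-file layout: (capture_time, sector_id, samples) records
    sorted by collection time, each keeping its sector identifier."""
    return sorted(tagged_clips, key=lambda rec: rec[0])
```

The sort preserves the correspondence between each clip and its sector identifier, which is what later makes lookup by identifier possible.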
The present invention first receives a plurality of audio data sent by at least two sound sources and determines, according to the received plurality of audio data, the sound-source direction and/or position of each of the at least two sound sources; it then determines at least two target sectors in one-to-one correspondence with the at least two sound sources and allocates a sector identifier to each of the determined target sectors; and it finally generates at least one audio file containing the correspondence between the audio data and the sector identifiers.
With the method provided by this embodiment of the present invention, the sector to which each piece of audio data belongs can be identified, a sector identifier can be set for each of the plurality of audio data collected by the sound-collection devices, and at least one audio file containing the correspondence between the audio data and the sector identifiers can then be generated, so that the audio data corresponding to a given sector identifier can be retrieved easily. This simplifies the workflow for retrieving sound content, saves time, and improves efficiency.
In practical applications, one preset sound-identification sector may contain two or more sound sources, or multiple speakers may be in the same direction; in that case, the audio data of each sound source within the same sound-identification sector is still difficult to distinguish by ear. For this reason, as shown in Fig. 2, in another embodiment of the present invention the sources can be further distinguished by means of voiceprints, and the method further includes the following steps.
In step s201, audio data that share a common sector identifier are obtained.
In this step, the corresponding audio data can be looked up by the sector identifier of each target sector; for example, audio data 1 can be found according to sector identifier "a", and audio data 2 and audio data 3 can be found according to sector identifier "c".
In step s202, voiceprint features are extracted from the audio data.
In this step, the voiceprint features in the audio data can be extracted using techniques such as voiceprint recognition.
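A production system would typically use dedicated voiceprint features such as MFCCs; as a self-contained stand-in, the sketch below fingerprints a frame with the magnitudes of its first few DFT bins and compares fingerprints by cosine similarity. The choice of feature and all names here are assumptions, not the patent's method:

```python
import cmath
import math

def crude_voiceprint(frame, bins=8):
    """Toy spectral fingerprint: magnitudes of the first DFT bins,
    L2-normalized. A stand-in for real voiceprint features."""
    n = len(frame)
    mags = []
    for k in range(1, bins + 1):
        coeff = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))
        mags.append(abs(coeff))
    norm = math.sqrt(sum(m * m for m in mags)) or 1.0
    return [m / norm for m in mags]

def similarity(vp1, vp2):
    """Cosine similarity between two fingerprints (1.0 = identical spectrum)."""
    return sum(a * b for a, b in zip(vp1, vp2))
```

Because only magnitudes are kept, the fingerprint is insensitive to phase, so two frames of the same tone at different phases compare as identical while different tones compare as dissimilar.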
In step s203, whether the audio data in the target sector originate from the same sound source is judged according to the voiceprint features.
In this step, because the voiceprints of different sound sources differ, it can be determined according to the voiceprint features whether the audio data in the target sector originate from the same sound source; when the voiceprints of the audio data in the target sector differ, it can be determined that the audio data in the target sector do not originate from the same sound source.
When the audio data in the target sector do not originate from the same sound source, in step s204, different sound-source identifiers are set for the audio data originating from different sound sources within the target sector.
In this step, a sound-source identifier, for example (1), (2), or (3), can be set for each piece of audio data in the target sector. Suppose the sector identifier of the target sector is c, and a given piece of audio data was emitted by sound source (1) within the sound-identification region corresponding to c; the sound-source identifier of that audio data could then be set to c(1), and so on.
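Composite identifiers of the form c(1) can be produced, for instance, by opening a new speaker number whenever a clip's voiceprint matches none seen so far in that sector. The similarity threshold and all names below are assumptions for illustration:

```python
def assign_source_ids(sector, voiceprints, threshold=0.9):
    """Give each clip in a sector an id like 'c(1)'; a new number is
    opened whenever the clip's voiceprint matches no known speaker."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    speakers, ids = [], []
    for vp in voiceprints:
        for i, ref in enumerate(speakers):
            if dot(vp, ref) >= threshold:  # matches a known speaker
                ids.append(f"{sector}({i + 1})")
                break
        else:  # no match: register a new speaker in this sector
            speakers.append(vp)
            ids.append(f"{sector}({len(speakers)})")
    return ids
```

This keeps the sector identifier as the outer key, so retrieval by sector still works, while the parenthesized number separates co-located speakers.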
The present invention first obtains audio data that share a common sector identifier, then extracts the voiceprint features from the audio data, and judges according to the voiceprint features whether the audio data in the target sector originate from the same sound source; when the audio data in the target sector do not originate from the same sound source, a sound-source identifier can be set for the audio data of each sound source in the target sector.
With the method provided by this embodiment of the present invention, when one preset sound-identification sector contains two or more sound sources, or multiple speakers are in the same direction, the audio data of the multiple sound sources within the same sound-identification sector can be distinguished by means of voiceprint recognition, and different sound-source identifiers are set for the audio data originating from different sound sources. In this way the audio data corresponding to a given sector identifier can be retrieved easily, simplifying the workflow for retrieving sound content, saving time, and improving efficiency.
In another embodiment of the present invention, step s104 includes:
generating a first audio file, wherein the plurality of audio data in the first audio file are sorted by collection time, and each piece of audio data is tagged with its corresponding sector identifier.
In this step, one first audio file containing the plurality of audio data can be generated; in the first audio file, each piece of audio data carries a sector-identifier label, which facilitates subsequent queries by the user.
In another embodiment of the present invention, step s104 also includes:
generating at least two second audio files, wherein each second audio file stores the audio data that share one sector identifier.
In this step, one audio file can be generated for each sector identifier; for example, audio data 2 and audio data 3, which share the sector identifier "c", can be placed in one audio file, while audio data 1, which has the sector identifier "a", is placed in another audio file, and so on.
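The per-sector grouping for the second audio files can be sketched as a simple bucketing step; the record shape `(sector_id, clip)` is an assumption for illustration:

```python
from collections import defaultdict

def split_by_sector(tagged_clips):
    """Group (sector_id, clip) records into one bucket per sector
    identifier, preserving collection order inside each bucket."""
    files = defaultdict(list)
    for sector, clip in tagged_clips:
        files[sector].append(clip)
    return dict(files)
```

Each bucket then corresponds to one second audio file, so retrieving all speech from one direction is a single lookup.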
In practical applications, the audio data collected by the sound-collection devices can contain a great deal of ambient-sound data, for example environmental noise; however, because the sound emitted by any one sound source reaches each sound-collection device with a different time delay, loudness, and/or phase, high-quality audio data for the different sound sources can still be obtained. As shown in Fig. 3, in yet another embodiment of the present invention, step s101 includes the following steps.
In step s301, the acoustic information of the audio data collected by each sound-collection device is obtained.
In embodiments of the present invention, the acoustic information can refer to the time delay, loudness, and/or phase of the audio data.
In this step, acoustic information such as the time delay, loudness, and/or phase of the audio data received by each sound-collection device can be extracted.
In step s302, the sound-collection device nearest to the sound-source position is determined according to the acoustic information as the main sound-collection device, and the sound-collection devices other than the main sound-collection device are determined as auxiliary sound-collection devices.
In this step, the sound-collection device nearest to the sound-source position can be determined by comparing loudness and time delay; that nearest device is determined as the main sound-collection device, and the other sound-collection devices in the terminal are determined as auxiliary sound-collection devices.
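Selecting the main microphone by comparing loudness across channels can be sketched as below. RMS is used here as the loudness proxy, which is an assumption; the embodiment also mentions time delay as a criterion, which this sketch omits:

```python
def pick_main_mic(channels):
    """Select the capture channel with the highest RMS loudness as the
    main microphone; all remaining channels become auxiliary."""
    def rms(samples):
        return (sum(s * s for s in samples) / len(samples)) ** 0.5

    main_idx = max(range(len(channels)), key=lambda i: rms(channels[i]))
    aux_idx = [i for i in range(len(channels)) if i != main_idx]
    return main_idx, aux_idx
```

The nearest microphone generally receives the loudest copy of the source, which is why loudness ranking approximates the distance criterion.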
In step s303, the main audio data collected by the main sound-collection device are determined, and the auxiliary audio data collected by the auxiliary sound-collection devices are determined.
In embodiments of the present invention, the main audio data and the auxiliary audio data both contain sound-source data and ambient-sound data. The acoustic energy of the auxiliary audio data can be treated as ambient sound (noise or sound from non-principal sources), while the acoustic energy of the main audio data can be treated as principal-source sound plus ambient sound.
In step s304, the phase of the auxiliary audio data is inverted and superposed with the main audio data to obtain the sound-source data.
In embodiments of the present invention, because ambient sound is concentrated at low frequencies while the main audio data carries characteristic energy in the mid-to-high frequencies, this can serve as the basis for distinguishing source data from ambient sound. And because the ambient-sound energy is essentially the same for all sound-collection devices, the auxiliary audio data can be phase-inverted (assuming the phase of the auxiliary audio data is 0 degrees, the inverted phase is 180 degrees) and added to the acoustic energy of the main audio data so that the ambient components cancel; in this way the sounds of other noise sources are filtered out and only the sound-source data emitted by the sound source is obtained.
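A minimal sketch of this inversion-and-superposition step, assuming the ambient component appears with equal energy and sample alignment in both channels (a strong idealization; a real device would need gain and delay matching first):

```python
def cancel_ambient(main, aux):
    """Invert the auxiliary channel and add it to the main channel.
    Ambient sound common to both channels cancels, leaving
    (approximately) the near-source signal."""
    return [m - a for m, a in zip(main, aux)]
```

Under the idealization, if main = source + ambient and aux = ambient, the subtraction recovers the source exactly; in practice the result only attenuates, rather than removes, the ambient component.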
After this step, correction methods such as filtering, steady-state de-noising, and non-steady-state energy compensation can be applied so that the energy of the sound-source data is fully compensated and the noise and ambient sound are sufficiently attenuated, improving the signal-to-noise ratio of the recording.
In step s305, the sound-source data is determined to be the audio data of the sound source collected by the sound-collection devices.
In this step, the obtained sound-source data can be determined as the audio data collected by the sound-collection devices.
As shown in figure 4, in another embodiment of the present invention, providing a kind of recording device, it is applied to comprise multiple sound The terminal of collecting device, comprising: receiver module 41, the first determining module 42, the second determining module 43 and generation module 44.
The receiver module 41 is configured to receive multiple pieces of voice data sent by at least two sound sources.
The first determining module 42 is configured to determine the sound source direction and/or position of each of the at least two sound sources according to the received multiple pieces of voice data.
The second determining module 43 is configured to determine, according to the determined sound source direction and/or position of each of the at least two sound sources, at least two target sectors in one-to-one correspondence with the at least two sound sources, and to assign a sector mark to each of the determined at least two target sectors.
The generation module 44 is configured to generate at least one audio file containing the correspondence between the voice data and the sector marks.
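The sector assignment performed by the second determining module can be sketched as below. This is an illustrative sketch, not the patent's algorithm: the 30-degree default width and the `sector-N` mark format are hypothetical, and sectors are shrunk toward their neighbours so that, per the embodiment, no two target sectors overlap.

```python
def assign_sectors(source_angles_deg, width_deg=30.0):
    """Assign each estimated source direction a non-overlapping sector and mark.

    Each sector is centred on a source direction; its half-width is clamped
    to half the angular gap to the nearest neighbour so sectors never overlap.
    Returns a dict mapping a sector mark to a (start, end) angle pair.
    """
    angles = sorted(set(round(a % 360.0, 3) for a in source_angles_deg))
    sectors = {}
    for i, centre in enumerate(angles):
        half = width_deg / 2.0
        prev_gap = (centre - angles[i - 1]) % 360.0 if len(angles) > 1 else 360.0
        next_gap = (angles[(i + 1) % len(angles)] - centre) % 360.0 if len(angles) > 1 else 360.0
        half = min(half, prev_gap / 2.0, next_gap / 2.0)  # shrink to avoid overlap
        sectors[f"sector-{i}"] = (centre - half, centre + half)
    return sectors
```

For two speakers at 0 and 90 degrees this yields two disjoint sectors, each covering only its own source direction.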
In another embodiment of the present invention, the second determining module is further configured such that the at least two target sectors do not overlap each other and each target sector covers only the sound source direction and/or position of its corresponding sound source.
As shown in Fig. 5, in another embodiment of the present invention, the device further includes: an acquisition module 51, an extraction module 52, a judging module 53, and a setup module 54.
The acquisition module 51 is configured to obtain the voice data having a common sector mark.
The extraction module 52 is configured to extract the voiceprint features in the voice data.
The judging module 53 is configured to judge, according to the voiceprint features, whether the voice data in the target sector originates from the same sound source.
The setup module 54 is configured to, when the voice data in the target sector does not originate from the same sound source, set different sound source marks for the voice data in the target sector that originates from different sound sources.
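The extract-and-judge steps above can be sketched as follows. This is only an illustration: a real voiceprint system would use features such as MFCCs and a trained speaker model, whereas this sketch substitutes a crude per-band log-energy envelope, and the 0.95 cosine-similarity threshold is a hypothetical value.

```python
import numpy as np

def spectral_envelope(audio, n_bands=20):
    """Crude stand-in for a voiceprint feature: mean log power per frequency band."""
    power = np.abs(np.fft.rfft(audio)) ** 2
    bands = np.array_split(power, n_bands)
    return np.log1p(np.array([b.mean() for b in bands]))

def same_source(audio_a, audio_b, threshold=0.95):
    """Judge whether two clips sharing a sector mark come from one sound source."""
    fa, fb = spectral_envelope(audio_a), spectral_envelope(audio_b)
    cos = float(np.dot(fa, fb) / (np.linalg.norm(fa) * np.linalg.norm(fb) + 1e-12))
    return cos >= threshold   # similar envelopes -> treat as the same source
```

When `same_source` returns False for clips in one target sector, the setup module would assign them distinct sound source marks.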
In another embodiment of the present invention, the generation module is configured to:
generate a first audio file, wherein the multiple pieces of voice data in the first audio file are sorted in order of acquisition time, and each piece of voice data is given its corresponding sector mark.
In another embodiment of the present invention, the generation module is further configured to:
generate at least two second audio files, wherein each second audio file stores the voice data having one common sector mark.
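The two file layouts the generation module produces can be modelled as plain data structures. This is an illustrative sketch, not the patent's file format: the `Clip` record and its field names are hypothetical, and the audio payload is left as an opaque placeholder.

```python
from dataclasses import dataclass

@dataclass
class Clip:
    acquired_at: float   # acquisition timestamp
    sector_mark: str     # e.g. "sector-0"
    samples: bytes       # encoded audio payload (placeholder)

def first_audio_file(clips):
    """Single file: all clips in acquisition order, each keeping its sector mark."""
    return sorted(clips, key=lambda c: c.acquired_at)

def second_audio_files(clips):
    """One file per sector mark, each holding only that sector's clips."""
    files = {}
    for clip in sorted(clips, key=lambda c: c.acquired_at):
        files.setdefault(clip.sector_mark, []).append(clip)
    return files
```

The first layout preserves the conversation's timeline; the second groups everything one speaker's sector produced, which is convenient for per-speaker playback.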
In another embodiment of the present invention, the distance between any two pieces of sound collection equipment among the multiple pieces of sound collection equipment is greater than a preset distance, and the receiver module includes: an acquisition submodule, a determination submodule, a first determination submodule, a superposition submodule, and a third determination submodule.
The acquisition submodule is configured to obtain the acoustic information of the voice data collected by each piece of sound collection equipment.
The determination submodule is configured to determine, according to the acoustic information, the sound collection equipment nearest to the sound source position as the master sound collection equipment, and to determine the sound collection equipment other than the master sound collection equipment as auxiliary sound collection equipment.
The first determination submodule is configured to determine the main audio data collected by the master sound collection equipment and the auxiliary voice data collected by the auxiliary sound collection equipment.
The superposition submodule is configured to superpose the main audio data with the phase-inverted auxiliary voice data to obtain the sound source data.
The third determination submodule is configured to determine the sound source data as the voice data of the sound source collected by the sound collection equipment.
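The determination submodule's master/auxiliary split can be sketched as below. The embodiment picks the device nearest the sound source; lacking geometry in this illustration, received signal energy is used as a stand-in for proximity, which is an assumption of this sketch rather than the patent's stated criterion.

```python
import numpy as np

def split_master_auxiliary(channels):
    """Pick the master collection device's channel; the rest are auxiliary.

    Uses received energy as a proxy for distance to the source: the channel
    with the most energy is taken to come from the nearest device.
    """
    energies = [float(np.sum(np.asarray(ch, dtype=float) ** 2)) for ch in channels]
    master_idx = int(np.argmax(energies))
    auxiliary = [ch for i, ch in enumerate(channels) if i != master_idx]
    return channels[master_idx], auxiliary
```

The master channel then supplies the main audio data and each auxiliary channel is phase-inverted and superposed on it, as described above.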
Fig. 6 is a block diagram of a device according to an exemplary embodiment. Referring to Fig. 6, the device includes:
a processor 21; and
a memory 22 for storing instructions executable by the processor 21;
wherein the processor 21 is configured to:
receive multiple pieces of voice data sent by at least two sound sources;
determine the sound source direction and/or position of each of the at least two sound sources according to the received multiple pieces of voice data;
determine, according to the determined sound source direction and/or position of each of the at least two sound sources, at least two target sectors in one-to-one correspondence with the at least two sound sources, and assign a sector mark to each of the determined at least two target sectors;
generate at least one audio file containing the correspondence between the voice data and the sector marks.
An embodiment of the present invention further provides a computer storage medium, wherein the computer storage medium may store a program which, when executed, implements some or all of the steps of each implementation of the recording method provided in the embodiments shown in Figs. 1-3.
Those skilled in the art, after considering the description and practicing the invention disclosed herein, will readily conceive of other embodiments of the present invention. The present application is intended to cover any modifications, uses, or adaptations of the present invention that follow its general principles and include common knowledge or customary techniques in the art not disclosed herein. The description and embodiments are to be regarded as exemplary only, with the true scope and spirit of the invention being indicated by the appended claims.
It should be understood that the invention is not limited to the precise constructions described above and illustrated in the accompanying drawings, and that various modifications and changes may be made without departing from its scope. The scope of the present invention is limited only by the appended claims.

Claims (13)

1. A recording method, characterized by comprising:
receiving multiple pieces of voice data sent by at least two sound sources;
determining the sound source direction and/or position of each of the at least two sound sources according to the received multiple pieces of voice data;
determining, according to the determined sound source direction and/or position of each of the at least two sound sources, at least two target sectors in one-to-one correspondence with the at least two sound sources, and assigning a sector mark to each of the determined at least two target sectors;
generating at least one audio file containing the correspondence between the voice data and the sector marks.
2. The method according to claim 1, characterized in that the at least two target sectors do not overlap each other, and each target sector covers only the sound source direction and/or position of its corresponding sound source.
3. The method according to claim 1, characterized in that the method further comprises:
obtaining the voice data having a common sector mark;
extracting the voiceprint features in the voice data;
judging, according to the voiceprint features, whether the voice data in the target sector originates from the same sound source;
when the voice data in the target sector does not originate from the same sound source, setting different sound source marks for the voice data in the target sector that originates from different sound sources.
4. The method according to claim 1, characterized in that generating at least one audio file containing the correspondence between the voice data and the sector marks comprises:
generating a first audio file, wherein the multiple pieces of voice data in the first audio file are sorted in order of acquisition time, and each of the multiple pieces of voice data is given its corresponding sector mark.
5. The method according to claim 1, characterized in that generating at least one audio file containing the correspondence between the voice data and the sector marks further comprises:
generating at least two second audio files, wherein each second audio file stores the voice data having one common sector mark.
6. The method according to claim 1, characterized in that receiving multiple pieces of voice data sent by at least two sound sources comprises:
obtaining the acoustic information of the voice data collected by each piece of sound collection equipment;
determining, according to the acoustic information, the sound collection equipment nearest to the sound source position as the master sound collection equipment, and determining the sound collection equipment other than the master sound collection equipment as auxiliary sound collection equipment;
determining the main audio data collected by the master sound collection equipment, and determining the auxiliary voice data collected by the auxiliary sound collection equipment;
superposing the main audio data with the phase-inverted auxiliary voice data to obtain sound source data;
determining the sound source data as the voice data of the sound source collected by the sound collection equipment.
7. A recording device, characterized in that it is applied to a terminal comprising multiple pieces of sound collection equipment and comprises:
a receiver module, configured to receive multiple pieces of voice data sent by at least two sound sources;
a first determining module, configured to determine the sound source direction and/or position of each of the at least two sound sources according to the received multiple pieces of voice data;
a second determining module, configured to determine, according to the determined sound source direction and/or position of each of the at least two sound sources, at least two target sectors in one-to-one correspondence with the at least two sound sources, and to assign a sector mark to each of the determined at least two target sectors;
a generation module, configured to generate at least one audio file containing the correspondence between the voice data and the sector marks.
8. The device according to claim 7, characterized in that the second determining module is further configured such that the at least two target sectors do not overlap each other, and each target sector covers only the sound source direction and/or position of its corresponding sound source.
9. The device according to claim 7, characterized in that the device further comprises:
an acquisition module, configured to obtain the voice data having a common sector mark;
an extraction module, configured to extract the voiceprint features in the voice data;
a judging module, configured to judge, according to the voiceprint features, whether the voice data in the target sector originates from the same sound source;
a setup module, configured to, when the voice data in the target sector does not originate from the same sound source, set different sound source marks for the voice data in the target sector that originates from different sound sources.
10. The device according to claim 7, characterized in that the generation module is configured to:
generate a first audio file, wherein the multiple pieces of voice data in the first audio file are sorted in order of acquisition time, and each of the multiple pieces of voice data is given its corresponding sector mark.
11. The device according to claim 7, characterized in that the generation module is further configured to:
generate at least two second audio files, wherein each second audio file stores the voice data having one common sector mark.
12. The device according to claim 7, characterized in that the receiver module comprises:
an acquisition submodule, configured to obtain the acoustic information of the voice data collected by each piece of sound collection equipment;
a determination submodule, configured to determine, according to the acoustic information, the sound collection equipment nearest to the sound source position as the master sound collection equipment, and to determine the sound collection equipment other than the master sound collection equipment as auxiliary sound collection equipment;
a first determination submodule, configured to determine the main audio data collected by the master sound collection equipment, and to determine the auxiliary voice data collected by the auxiliary sound collection equipment;
a superposition submodule, configured to superpose the main audio data with the phase-inverted auxiliary voice data to obtain sound source data;
a third determination submodule, configured to determine the sound source data as the voice data of the sound source collected by the sound collection equipment.
13. A terminal, characterized in that the terminal comprises:
a processor; and
a memory for storing instructions executable by the processor;
wherein the processor is configured to:
receive multiple pieces of voice data sent by at least two sound sources;
determine the sound source direction and/or position of each of the at least two sound sources according to the received multiple pieces of voice data;
determine, according to the determined sound source direction and/or position of each of the at least two sound sources, at least two target sectors in one-to-one correspondence with the at least two sound sources, and assign a sector mark to each of the determined at least two target sectors;
generate at least one audio file containing the correspondence between the voice data and the sector marks.
CN201610729168.5A 2016-08-25 2016-08-25 Recording method, device and terminal Pending CN106356067A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610729168.5A CN106356067A (en) 2016-08-25 2016-08-25 Recording method, device and terminal

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610729168.5A CN106356067A (en) 2016-08-25 2016-08-25 Recording method, device and terminal

Publications (1)

Publication Number Publication Date
CN106356067A true CN106356067A (en) 2017-01-25

Family

ID=57854270

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610729168.5A Pending CN106356067A (en) 2016-08-25 2016-08-25 Recording method, device and terminal

Country Status (1)

Country Link
CN (1) CN106356067A (en)

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108564961A (en) * 2017-11-29 2018-09-21 华北计算技术研究所(中国电子科技集团公司第十五研究所) A kind of voice de-noising method of mobile communication equipment
CN109817225A (en) * 2019-01-25 2019-05-28 广州富港万嘉智能科技有限公司 A kind of location-based meeting automatic record method, electronic equipment and storage medium
CN109887508A (en) * 2019-01-25 2019-06-14 广州富港万嘉智能科技有限公司 A kind of meeting automatic record method, electronic equipment and storage medium based on vocal print
CN109934731A (en) * 2019-01-25 2019-06-25 广州富港万嘉智能科技有限公司 A kind of method of ordering based on image recognition, electronic equipment and storage medium
CN109979447A (en) * 2019-01-25 2019-07-05 广州富港万嘉智能科技有限公司 The location-based control method of ordering of one kind, electronic equipment and storage medium
CN110033773A (en) * 2018-12-13 2019-07-19 蔚来汽车有限公司 For the audio recognition method of vehicle, device, system, equipment and vehicle
CN110223684A (en) * 2019-05-16 2019-09-10 华为技术有限公司 A kind of voice awakening method and equipment
CN110349584A (en) * 2019-07-31 2019-10-18 北京声智科技有限公司 A kind of audio data transmission method, device and speech recognition system
CN110459239A (en) * 2019-03-19 2019-11-15 深圳壹秘科技有限公司 Role analysis method, apparatus and computer readable storage medium based on voice data
CN110809879A (en) * 2017-06-28 2020-02-18 株式会社OPTiM Computer system, Web conference audio support method, and program
CN112151041A (en) * 2019-06-26 2020-12-29 北京小米移动软件有限公司 Recording method, device and equipment based on recorder program and storage medium
CN113539269A (en) * 2021-07-20 2021-10-22 上海明略人工智能(集团)有限公司 Audio information processing method, system and computer readable storage medium
CN115811574A (en) * 2023-02-03 2023-03-17 合肥炬芯智能科技有限公司 Sound signal processing method and device, main equipment and split type conference system

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2000352996A (en) * 1999-03-26 2000-12-19 Canon Inc Information processing device
CN1610294A (en) * 2003-10-24 2005-04-27 阿鲁策株式会社 Vocal print authentication system and vocal print authentication program
CN1652205A (en) * 2004-01-14 2005-08-10 索尼株式会社 Audio signal processing apparatus and audio signal processing method
KR20100098104A (en) * 2009-02-27 2010-09-06 고려대학교 산학협력단 Method and apparatus for space-time voice activity detection using audio and video information
CN102254559A (en) * 2010-05-20 2011-11-23 盛乐信息技术(上海)有限公司 Identity authentication system and method based on vocal print
CN105070304A (en) * 2015-08-11 2015-11-18 小米科技有限责任公司 Method, device and electronic equipment for realizing recording of object audio
CN105679356A (en) * 2014-11-17 2016-06-15 中兴通讯股份有限公司 Recording method, device and terminal
CN105895102A (en) * 2015-11-15 2016-08-24 乐视移动智能信息技术(北京)有限公司 Recording editing method and recording device

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2000352996A (en) * 1999-03-26 2000-12-19 Canon Inc Information processing device
CN1610294A (en) * 2003-10-24 2005-04-27 阿鲁策株式会社 Vocal print authentication system and vocal print authentication program
CN1652205A (en) * 2004-01-14 2005-08-10 索尼株式会社 Audio signal processing apparatus and audio signal processing method
KR20100098104A (en) * 2009-02-27 2010-09-06 고려대학교 산학협력단 Method and apparatus for space-time voice activity detection using audio and video information
CN102254559A (en) * 2010-05-20 2011-11-23 盛乐信息技术(上海)有限公司 Identity authentication system and method based on vocal print
CN105679356A (en) * 2014-11-17 2016-06-15 中兴通讯股份有限公司 Recording method, device and terminal
CN105070304A (en) * 2015-08-11 2015-11-18 小米科技有限责任公司 Method, device and electronic equipment for realizing recording of object audio
CN105895102A (en) * 2015-11-15 2016-08-24 乐视移动智能信息技术(北京)有限公司 Recording editing method and recording device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Harvey Richard Schiffman: "Sensation and Perception", 31 October 2014 *

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110809879A (en) * 2017-06-28 2020-02-18 株式会社OPTiM Computer system, Web conference audio support method, and program
CN108564961A (en) * 2017-11-29 2018-09-21 华北计算技术研究所(中国电子科技集团公司第十五研究所) A kind of voice de-noising method of mobile communication equipment
CN110033773A (en) * 2018-12-13 2019-07-19 蔚来汽车有限公司 For the audio recognition method of vehicle, device, system, equipment and vehicle
CN110033773B (en) * 2018-12-13 2021-09-14 蔚来(安徽)控股有限公司 Voice recognition method, device, system and equipment for vehicle and vehicle
CN109817225A (en) * 2019-01-25 2019-05-28 广州富港万嘉智能科技有限公司 A kind of location-based meeting automatic record method, electronic equipment and storage medium
CN109887508A (en) * 2019-01-25 2019-06-14 广州富港万嘉智能科技有限公司 A kind of meeting automatic record method, electronic equipment and storage medium based on vocal print
CN109934731A (en) * 2019-01-25 2019-06-25 广州富港万嘉智能科技有限公司 A kind of method of ordering based on image recognition, electronic equipment and storage medium
CN109979447A (en) * 2019-01-25 2019-07-05 广州富港万嘉智能科技有限公司 The location-based control method of ordering of one kind, electronic equipment and storage medium
CN110459239A (en) * 2019-03-19 2019-11-15 深圳壹秘科技有限公司 Role analysis method, apparatus and computer readable storage medium based on voice data
CN110223684A (en) * 2019-05-16 2019-09-10 华为技术有限公司 A kind of voice awakening method and equipment
CN112151041A (en) * 2019-06-26 2020-12-29 北京小米移动软件有限公司 Recording method, device and equipment based on recorder program and storage medium
CN112151041B (en) * 2019-06-26 2024-03-29 北京小米移动软件有限公司 Recording method, device, equipment and storage medium based on recorder program
CN110349584A (en) * 2019-07-31 2019-10-18 北京声智科技有限公司 A kind of audio data transmission method, device and speech recognition system
CN113539269A (en) * 2021-07-20 2021-10-22 上海明略人工智能(集团)有限公司 Audio information processing method, system and computer readable storage medium
CN115811574A (en) * 2023-02-03 2023-03-17 合肥炬芯智能科技有限公司 Sound signal processing method and device, main equipment and split type conference system
CN115811574B (en) * 2023-02-03 2023-06-16 合肥炬芯智能科技有限公司 Sound signal processing method and device, main equipment and split conference system

Similar Documents

Publication Publication Date Title
CN106356067A (en) Recording method, device and terminal
CN108766418B (en) Voice endpoint recognition method, device and equipment
CN107591152B (en) Voice control method, device and equipment based on earphone
US11869481B2 (en) Speech signal recognition method and device
CN109935226A (en) A kind of far field speech recognition enhancing system and method based on deep neural network
CN112053691B (en) Conference assisting method and device, electronic equipment and storage medium
CN111883168B (en) Voice processing method and device
CN109767757A (en) A kind of minutes generation method and device
CN104269172A (en) Voice control method and system based on video positioning
CN109524013B (en) Voice processing method, device, medium and intelligent equipment
CN103635962A (en) Voice recognition system, recognition dictionary logging system, and audio model identifier series generation device
Kürby et al. Bag-of-Features Acoustic Event Detection for Sensor Networks.
CN109560941A (en) Minutes method, apparatus, intelligent terminal and storage medium
CN109410956A (en) A kind of object identifying method of audio data, device, equipment and storage medium
WO2022087251A1 (en) Multi channel voice activity detection
CN110491409B (en) Method and device for separating mixed voice signal, storage medium and electronic device
KR101976937B1 (en) Apparatus for automatic conference notetaking using mems microphone array
CN110737422B (en) Sound signal acquisition method and device
CN114762039A (en) Conference data processing method and related equipment
CN109215688B (en) Same-scene audio processing method, device, computer readable storage medium and system
Nakadai et al. Footstep detection and classification using distributed microphones
CN107452408B (en) Audio playing method and device
CN113012700A (en) Voice signal processing method, device, system and computer readable storage medium
CN111988705B (en) Audio processing method, device, terminal and storage medium
CN114461842A (en) Method, device, equipment and storage medium for generating discouraging call

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20170125
