
CN104123115A - Audio information processing method and electronic device

Info

Publication number: CN104123115A
Application number: CN201410364822.8A
Authority: CN
Inventors: 高扬
Original assignee: Lenovo Beijing Ltd
Current assignee: Lenovo Beijing Ltd
Priority date: 2014-07-28
Filing date: 2014-07-28
Publication date: 2014-10-29
Anticipated expiration: 2034-07-28
Also published as: CN104123115B

Abstract

The invention discloses an audio information processing method for solving the technical problem in the prior art that the display effect of an electronic device is poor. The method includes: in the process of outputting a voice file, parsing out M segments of audio information having a first voiceprint feature in the voice file; comparing the M segments of audio information with N audio samples, determining the first audio sample whose voiceprint feature is the same as the first voiceprint feature among the N audio samples, and determining first user identification information corresponding to the M segments of audio information according to the correspondence between audio samples and user identification information; and outputting the voice file, wherein, when audio information having the first voiceprint feature is played, the electronic device is controlled to display the first user identification information with a first display effect. The invention further discloses an electronic device for implementing the method.

Description

Audio information processing method and electronic device

Technical field

The present invention relates to the field of computer technology, and in particular to an audio information processing method and an electronic device.

Background art

With the rapid development of science and technology and increasingly fierce market competition, the performance and appearance of electronic devices have improved greatly. Notebook computers, being small, light, easy to carry, and well suited to entertainment, are favored by more and more people and have become an indispensable part of study and daily life. Users can also do more and more with electronic devices; for example, a user can communicate or make recordings with a mobile phone or tablet computer that has a voice function.

At present, most electronic devices have a recording function and can meet recording needs in a variety of scenarios, such as meetings and classrooms. However, because recording scenes are often complex, after a user obtains a recording with an electronic device, it is not easy during playback to tell which speaker a given piece of voice content belongs to. This is especially so when speakers have similar voices or when the listener is unfamiliar with the speakers. For example, a user records a meeting with an electronic device and plays the recording back later for review. If several people were talking at the same time, the playback may be noisy, and the listener cannot quickly tell which participant is speaking. The listener has to concentrate on identifying the speaker while listening, and may need to replay the recording repeatedly in order to associate the recorded content with the right speaker. This increases the load on the electronic device and degrades the user experience.

In summary, the prior art has the technical problem that the recording effect of electronic devices is poor.

Summary of the invention

Embodiments of the present invention provide an audio information processing method and an electronic device, for solving the technical problem that the recording effect of electronic devices is poor.

An audio information processing method, applied to an electronic device, wherein N audio samples are stored in the electronic device, each of the N audio samples corresponds to user identification information, the user identification information includes information that can be used to characterize the audio object corresponding to audio information, and N is a positive integer, the method comprising:

In the process of outputting a voice file, parsing out M segments of audio information having a first voiceprint feature in the voice file, M being a positive integer;

Comparing the M segments of audio information with the N audio samples, and determining whether a voiceprint feature identical to the first voiceprint feature exists among the N voiceprint features corresponding to the N audio samples;

If so, determining the first audio sample, which corresponds to the voiceprint feature among the N audio samples that is identical to the first voiceprint feature, and determining, according to the correspondence between audio samples and user identification information, first user identification information corresponding to the M segments of audio information;

Outputting the voice file, wherein, when audio information having the first voiceprint feature is played, the electronic device is controlled to display the first user identification information with a first display effect.

Optionally, the method further comprises:

When it is detected that an audio information segment in the voice file has both a second voiceprint feature and a third voiceprint feature, separating from the audio information segment, according to the second voiceprint feature and the third voiceprint feature, second audio information having the second voiceprint feature and third audio information having the third voiceprint feature;

Comparing the second audio information and the third audio information with the N audio samples respectively, to determine a second audio sample corresponding to the second voiceprint feature and a third audio sample corresponding to the third voiceprint feature; and determining, according to the correspondence between audio samples and user identification information, second user identification information corresponding to the second voiceprint feature and third user identification information corresponding to the third voiceprint feature;

Controlling the electronic device to display the second user identification information and the third user identification information simultaneously in the process of playing the audio information segment.

Optionally, controlling the electronic device to display the second user identification information and the third user identification information simultaneously in the process of playing the audio information segment further comprises:

Detecting a second audio intensity corresponding to the audio information having the second voiceprint feature, and a third audio intensity corresponding to the audio information having the third voiceprint feature;

Comparing the second audio intensity with the third audio intensity, determining the audio information with the greater audio intensity to be primary audio information, and determining the audio information with the lesser audio intensity to be secondary audio information;

According to the correspondence between audio intensity and display effect, controlling the electronic device to display the user identification information corresponding to the primary audio information with the first display effect, and to display the user identification information corresponding to the secondary audio information with a second display effect.
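The intensity comparison described above can be illustrated with a minimal Python sketch. The source does not say how audio intensity is measured; root-mean-square amplitude is used here purely as an assumption, and the function names, the effect labels, and the tie-breaking rule are hypothetical.

```python
import math


def rms(samples):
    """Root-mean-square amplitude, used here as an assumed intensity measure."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def assign_display_effects(second_audio, third_audio):
    """Map the louder stream to the first display effect, the quieter to the second.

    Ties are resolved in favour of the second audio information; the source
    does not specify tie handling, so this is an arbitrary choice.
    """
    if rms(second_audio) >= rms(third_audio):
        return {"second": "first_display_effect", "third": "second_display_effect"}
    return {"second": "second_display_effect", "third": "first_display_effect"}


# Hypothetical separated streams: one loud speaker, one quiet speaker.
loud = [0.8, -0.7, 0.9, -0.8]
quiet = [0.1, -0.2, 0.1, -0.1]
effects = assign_display_effects(loud, quiet)
```

In this sketch the first display effect might correspond to, say, a larger or highlighted avatar, but the concrete rendering is left open by the source.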

Optionally, comparing the M segments of audio information with the N audio samples, and determining whether a voiceprint feature identical to the first voiceprint feature exists among the N voiceprint features corresponding to the N audio samples, further comprises:

If no voiceprint feature identical to the first voiceprint feature exists among the N voiceprint features corresponding to the N audio samples, judging whether the M segments of audio information are key audio information, wherein the key audio information is audio information related to a contact stored in the electronic device;

If the M segments of audio information are the key audio information, establishing user identification information corresponding to the M segments of audio information according to the contact; or

If the M segments of audio information are not the key audio information, setting first specific identification information as the user identification information corresponding to the M segments of audio information, wherein the first specific identification information is any one or a combination of specific image information, specific text information, and specific voice information in the electronic device.

Optionally, if the M segments of audio information are the key audio information, then while or after the user identification information corresponding to the M segments of audio information is established according to the contact, the method further comprises:

Obtaining a first audio clip from the M segments of audio information;

Storing the first audio clip as an (N+1)th audio sample, wherein the (N+1)th audio sample and the M segments of audio information correspond to the same user identification information.
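The handling of unmatched audio described above can be sketched as follows. This is only an illustration under stated assumptions: the contact record is a plain dictionary, the stored "audio sample" is represented by a feature vector, and the names `resolve_unmatched` and `PLACEHOLDER_INFO` are hypothetical and do not come from the source.

```python
# Assumed stand-in for the "first specific identification information".
PLACEHOLDER_INFO = {"name": "Unknown", "avatar": "unknown.png"}


def resolve_unmatched(clip_feature, contact, samples, infos):
    """Handle audio whose voiceprint matched none of the N stored samples.

    contact: the contact record the audio was judged to belong to (key audio
             information), or None if it is not key audio information.
    samples, infos: parallel lists holding the N samples and their
             user identification information; extended in place.
    """
    if contact is not None:
        # Key audio information: build identification info from the contact ...
        info = {"name": contact["name"], "avatar": contact.get("avatar")}
        # ... and keep a clip as the (N+1)th sample for future matching.
        samples.append(clip_feature)
        infos.append(info)
        return info
    # Not key audio information: fall back to the specific identification info.
    return PLACEHOLDER_INFO


samples = [[1.0, 0.0]]
infos = [{"name": "Speaker A", "avatar": "a.png"}]
new_info = resolve_unmatched([0.0, 1.0], {"name": "Speaker B"}, samples, infos)
unknown_info = resolve_unmatched([0.5, 0.5], None, samples, infos)
```

Storing the new sample alongside the same identification object means that later recordings of the same speaker can be matched directly in the comparison step.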

An electronic device, wherein N audio samples are stored in the electronic device, each of the N audio samples corresponds to user identification information, the user identification information includes information that can be used to characterize the audio object corresponding to audio information, and N is a positive integer, the electronic device comprising:

A parsing module, configured to, in the process of outputting a voice file, parse out M segments of audio information having a first voiceprint feature in the voice file, M being a positive integer;

A comparing module, configured to compare the M segments of audio information with the N audio samples, and determine whether a voiceprint feature identical to the first voiceprint feature exists among the N voiceprint features corresponding to the N audio samples;

A first determination module, configured to, if so, determine the first audio sample, which corresponds to the voiceprint feature among the N audio samples that is identical to the first voiceprint feature, and determine, according to the correspondence between audio samples and user identification information, first user identification information corresponding to the M segments of audio information;

An output module, configured to output the voice file, wherein, when audio information having the first voiceprint feature is played, the electronic device is controlled to display the first user identification information with a first display effect.

Optionally, the electronic device further comprises:

A separation module, configured to, when an audio information segment in the voice file has both a second voiceprint feature and a third voiceprint feature, separate from the audio information segment, according to the second voiceprint feature and the third voiceprint feature, second audio information having the second voiceprint feature and third audio information having the third voiceprint feature;

A second determination module, configured to compare the second audio information and the third audio information with the N audio samples respectively, to determine a second audio sample corresponding to the second voiceprint feature and a third audio sample corresponding to the third voiceprint feature; and to determine, according to the correspondence between audio samples and user identification information, second user identification information corresponding to the second voiceprint feature and third user identification information corresponding to the third voiceprint feature;

A control module, configured to control the electronic device to display the second user identification information and the third user identification information simultaneously in the process of playing the audio information segment.

Optionally, the electronic device further comprises:

A detection module, configured to detect a second audio intensity corresponding to the audio information having the second voiceprint feature, and a third audio intensity corresponding to the audio information having the third voiceprint feature;

A comparison module, configured to compare the second audio intensity with the third audio intensity, determine the audio information with the greater audio intensity to be primary audio information, and determine the audio information with the lesser audio intensity to be secondary audio information;

A first processing module, configured to, according to the correspondence between audio intensity and display effect, control the electronic device to display the user identification information corresponding to the primary audio information with the first display effect, and to display the user identification information corresponding to the secondary audio information with a second display effect.

Optionally, the electronic device further comprises:

A judging module, configured to, if no voiceprint feature identical to the first voiceprint feature exists among the N voiceprint features corresponding to the N audio samples, judge whether the M segments of audio information are key audio information, wherein the key audio information is audio information related to a contact stored in the electronic device;

A second processing module, configured to, if the M segments of audio information are the key audio information, establish user identification information corresponding to the M segments of audio information according to the contact; or, if the M segments of audio information are not the key audio information, set first specific identification information as the user identification information corresponding to the M segments of audio information, wherein the first specific identification information is any one or a combination of specific image information, specific text information, and specific voice information in the electronic device.

Optionally, the electronic device further comprises:

An acquisition module, configured to obtain a first audio clip from the M segments of audio information;

A storage module, configured to store the first audio clip as an (N+1)th audio sample, wherein the (N+1)th audio sample and the M segments of audio information correspond to the same user identification information.

In the embodiments of the present invention, each of the N audio samples stored in the electronic device has corresponding user identification information, and each piece of user identification information includes information that can be used to characterize the audio object corresponding to audio information. Therefore, when the voice file is output, the M segments of audio information having the first voiceprint feature can be obtained by parsing, and the M segments of audio information can be compared with the N audio samples according to voiceprint features to determine the first audio sample, whose voiceprint feature is identical to the first voiceprint feature. The first user identification information corresponding to the first audio sample is thereby determined, so that whenever audio information having the first voiceprint feature is played, that is, whenever any of the M segments of audio information is played, the first user identification information can be displayed. Consequently, even if the recording being played contains multiple speakers, each speaker's voiceprint feature is different, so once the segments of audio information sharing a voiceprint feature have been identified in the recording and the corresponding user identification information has been determined by comparison, that user identification information can be displayed while the audio information is played. The user can thus quickly learn which audio object corresponds to the part of the voice file currently being played, without spending extra time telling speakers apart. This enhances the recording effect of the electronic device and improves the user experience.

Brief description of the drawings

Fig. 1 is the main flowchart of the audio information processing method in an embodiment of the present invention;

Fig. 2 is a schematic diagram of displaying the first user identification information in an embodiment of the present invention;

Fig. 3 is a schematic diagram of displaying the second user identification information and the third user identification information in an embodiment of the present invention;

Fig. 4 is the main block diagram of the electronic device in an embodiment of the present invention.

Detailed description

An embodiment of the present invention discloses an audio information processing method, applied to an electronic device, wherein N audio samples are stored in the electronic device, each of the N audio samples corresponds to user identification information, the user identification information includes information that can be used to characterize the audio object corresponding to audio information, and N is a positive integer. The method comprises: in the process of outputting a voice file, parsing out M segments of audio information having a first voiceprint feature in the voice file, M being a positive integer; comparing the M segments of audio information with the N audio samples, and determining whether a voiceprint feature identical to the first voiceprint feature exists among the N voiceprint features corresponding to the N audio samples; if so, determining the first audio sample, which corresponds to the voiceprint feature among the N audio samples that is identical to the first voiceprint feature, and determining, according to the correspondence between audio samples and user identification information, first user identification information corresponding to the M segments of audio information; and outputting the voice file, wherein, when audio information having the first voiceprint feature is played, the electronic device is controlled to display the first user identification information with a first display effect.

Referring to Fig. 1, an embodiment of the present invention discloses an audio information processing method, applied to an electronic device having a display unit, wherein N audio samples are stored in the electronic device, each of the N audio samples corresponds to user identification information, the user identification information includes information that can be used to characterize the audio object corresponding to audio information, and N is a positive integer. The method may comprise the following steps:

Step 11: in the process of outputting a voice file, parse out M segments of audio information having a first voiceprint feature in the voice file, M being a positive integer.

In the embodiments of the present invention, the voice file may be a recording file made on a particular occasion, for example a recording of a meeting or of a class. Typically, the voice file may be a recording file stored locally, for example a file recorded by the device itself or by another device and then stored locally, or the voice file may be a recording file obtained from another electronic device or from the cloud.

Optionally, in the embodiments of the present invention, the first voiceprint feature may refer to a voiceprint feature of the voice file that is determined by voiceprint recognition in the process of outputting the voice file.

Generally, a voiceprint is the sound wave spectrum, carrying speech information, that is displayed by an electroacoustic instrument, and the voiceprint spectra of any two people differ. Therefore, voiceprint recognition can determine the voiceprint feature corresponding to each piece of audio information in the voice file, and audio information having the same voiceprint feature can be identified. When the voice file records the speech of multiple speakers, the voice file corresponds to multiple voiceprint features.

Optionally, voiceprint recognition can determine the M segments of audio information in the voice file that have the first voiceprint feature. The M segments of audio information can therefore be regarded as speech from the same speaker, and they may be located at different positions in the voice file. For example, when this speaker and several other speakers are in the same scene and this speaker speaks several times, the corresponding M segments of audio information are recorded into the voice file in the order in which they were spoken. When the voice file is played, the recorded speech of all speakers is played in recording order, so the M segments of audio information may be interspersed at multiple positions in the voice file.
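The idea that the M segments of one speaker are interspersed throughout the voice file can be illustrated with a small Python sketch. It assumes that some voiceprint recognizer (not shown, and not specified by the source) has already labelled each segment; the `Segment` type and the labels such as "vp1" are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    start: float        # segment start time in seconds
    end: float          # segment end time in seconds
    voiceprint_id: str  # hypothetical label from a voiceprint recognizer


def group_by_voiceprint(segments):
    """Collect the segments sharing a voiceprint label, in playback order."""
    groups = defaultdict(list)
    for seg in sorted(segments, key=lambda s: s.start):
        groups[seg.voiceprint_id].append(seg)
    return dict(groups)


# A made-up recording in which the first speaker ("vp1") speaks three times.
recording = [
    Segment(0.0, 4.0, "vp1"),
    Segment(4.0, 9.5, "vp2"),
    Segment(9.5, 12.0, "vp1"),
    Segment(12.0, 15.0, "vp3"),
    Segment(15.0, 18.0, "vp1"),
]
first_speaker_segments = group_by_voiceprint(recording)["vp1"]
# In this example M is len(first_speaker_segments).
```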

Step 12: compare the M segments of audio information with the N audio samples, and determine whether a voiceprint feature identical to the first voiceprint feature exists among the N voiceprint features corresponding to the N audio samples.

In the embodiments of the present invention, because each person's voiceprint feature is different, once the M segments of audio information have been determined, the comparison of the M segments of audio information with the N audio samples can be performed using voiceprint recognition technology. If a voiceprint feature identical to the first voiceprint feature is detected, a voiceprint feature identical to the first voiceprint feature exists among the N voiceprint features, that is, there is an audio sample matching the M segments of audio information. Otherwise, there is no audio sample corresponding to the M segments of audio information, and the audio object corresponding to the M segments of audio information cannot be determined from the N audio samples currently stored.
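The comparison in Step 12 can be sketched in Python as follows. The source speaks of an "identical" voiceprint feature and does not specify how features are represented or compared. The sketch below substitutes a common practical stand-in: features as numeric vectors, compared by cosine similarity against a threshold. The vector representation, the threshold value of 0.9, and the function names are all assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_matching_sample(first_feature, sample_features, threshold=0.9):
    """Return the index of the best-matching sample, or None if no sample
    reaches the (assumed) similarity threshold."""
    best_index, best_score = None, threshold
    for index, feature in enumerate(sample_features):
        score = cosine_similarity(first_feature, feature)
        if score >= best_score:
            best_index, best_score = index, score
    return best_index


# Hypothetical voiceprint features of N = 3 stored audio samples.
sample_features = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.6, 0.8, 0.0]]
matched = find_matching_sample([0.58, 0.81, 0.05], sample_features)
unmatched = find_matching_sample([0.0, 0.0, 1.0], sample_features)
```

A return value of None corresponds to the case in which no stored audio sample matches the M segments of audio information.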

In the embodiments of the present invention, the N audio samples may be set up in advance from one or more recording files. For example, audio information corresponding to a relevant contact may be extracted from a previously recorded or stored recording file and used as an audio sample, or an audio clip may be recorded for a contact and used as the audio sample corresponding to that contact. Each of the N audio samples is audio information taken from a voice segment; for example, multiple pieces of audio information are obtained from the voice segments of a voice file.

Optionally, in the embodiments of the present invention, each of the N audio samples corresponds to user identification information, and the user identification information may include information that can be used to characterize the audio object corresponding to audio information. For example, the user identification information may include a contact's avatar, name, nature of work, and similar information.

Step 13: if so, determine the first audio sample, which corresponds to the voiceprint feature among the N audio samples that is identical to the first voiceprint feature, and determine, according to the correspondence between audio samples and user identification information, the first user identification information corresponding to the M segments of audio information.

In the embodiments of the present invention, because the audio information of different speakers has different voiceprint features, once the first audio sample whose voiceprint feature is identical to the first voiceprint feature has been determined by voiceprint recognition technology, the first user identification information corresponding to the first audio sample can be further determined, and thus the audio object corresponding to the M segments of audio information can be determined.

Optionally, in the embodiments of the present invention, the correspondence between audio samples and user identification information may be preset by the user. For example, when setting up the N audio samples, the user may set information related to each audio sample as the user identification information corresponding to that audio sample, such as one or a combination of the avatar, name, and similar information of the audio object corresponding to that audio sample.

For example, suppose the user's mobile phone stores a first audio sample of speaker A, whose speaking voice corresponds to voiceprint feature 1, and the first user identification information corresponding to the first audio sample includes speaker A's avatar and name. When the user plays a recording file on the phone that contains speaker A's voice, if the voiceprint features recognized in the recording file include one identical to voiceprint feature 1, the audio information in the recording file that has voiceprint feature 1 can be regarded as audio information of speaker A, and all of it can be associated with the first user identification information.
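How the associated user identification information might be looked up during playback can be sketched as follows. This is a minimal illustration only: the timeline of `(start, end, info)` tuples, the function name `info_at`, and the example values are hypothetical, and segments are assumed not to overlap.

```python
from bisect import bisect_right


def info_at(timeline, position):
    """Return the user identification info for the segment playing at
    `position` (seconds), or None if no labelled segment covers it.

    timeline: list of (start, end, info) tuples sorted by start time.
    """
    starts = [start for start, _end, _info in timeline]
    i = bisect_right(starts, position) - 1
    if i >= 0:
        _start, end, info = timeline[i]
        if position < end:
            return info
    return None


# Hypothetical first user identification information for speaker A.
speaker_a = {"name": "Speaker A", "avatar": "speaker_a.png"}

# Segments of the recording that carry voiceprint feature 1.
timeline = sorted(
    [(9.5, 12.0, speaker_a), (0.0, 4.0, speaker_a)],
    key=lambda item: item[0],
)
shown_at_10s = info_at(timeline, 10.0)
shown_at_5s = info_at(timeline, 5.0)
```

A player could call such a lookup whenever the playback position changes and update the displayed avatar and name accordingly.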

In practice, comparing the M segments of audio information with the N audio samples and determining whether a voiceprint feature identical to the first voiceprint feature exists among the N voiceprint features corresponding to the N audio samples may further comprise: if no voiceprint feature identical to the first voiceprint feature exists among the N voiceprint features corresponding to the N audio samples, judging whether the M segments of audio information are key audio information, wherein the key audio information is audio information related to a contact stored in the electronic device; if the M segments of audio information are the key audio information, establishing user identification information corresponding to the M segments of audio information according to the contact; or, if the M segments of audio information are not the key audio information, setting first specific identification information as the user identification information corresponding to the M segments of audio information, wherein the first specific identification information is any one or a combination of specific image information, specific text information, and specific voice information in the electronic device.

There are two ways to judge whether the M segments of audio information are the key audio information.

The first way: judgment by the user. The judgment may be made with reference to the contacts stored in the electronic device; if no corresponding audio segment was stored when a contact was stored, the judgment may be made by the user. For example, while the voice file is playing, if it is determined that the audio information being played was not successfully matched, the user can rely on their own familiarity with the contacts' voices to decide whether this audio information is the voice of a contact. If so, the audio information can be determined to be key audio information; otherwise, no further setting needs to be made for this audio information. Judgment by the user thus offers considerable freedom of choice, which improves the user experience and also makes the recording effect of the electronic device more flexible.

The second way: judgment by the electronic device. If audio information corresponding to a contact is stored together with the stored contact, the electronic device can judge whether the M segments of audio information are key audio information by voiceprint recognition and matching. For example, if the user stored a voice clip for a contact when creating the contact's information or afterwards, then when the first voiceprint feature fails to match the N voiceprint features, the first voiceprint feature can be matched against the voiceprint feature of the contact's voice clip, to determine whether the first voiceprint feature is related to the contact and thus whether the M segments of audio information are key audio information.

In the embodiment of the present invention, if the judgment result shows that the M segments of audio information are the key audio information, user identification information corresponding to the M segments of audio information can be established from the contact object. Typically, a stored contact object includes information such as the contact's name, avatar, and employer. If the contact object corresponding to the M segments of audio information is determined to be contact object 1, its avatar information and name information can be set as the content of the user identification information corresponding to the M segments of audio information.

In addition, when user identification information corresponding to the M segments of audio information is established from a contact object for which no avatar has been set, an image related to that contact object can be obtained locally or from the cloud and used instead, so that the contact can be told apart quickly by the user identification information. For example, when an image stored in the mobile phone and related to the determined contact object is used to set the avatar in the user identification information, the portion of the image containing the face can be cropped and set as the avatar of the contact object, making it easier to recognize.

Alternatively, if the judgment result shows that the M segments of audio information are not the key audio information, first specific identification information can be set as the user identification information corresponding to the M segments of audio information. The first specific identification information is any one or a combination of specific image information, specific text information, and specific voice information in the electronic device.

The specific image may be an image that is the device default or is pre-designated by the user and that is set for audio information whose voiceprint matching failed, and corresponding text such as "Unrecognized" or "Unknown" may be set for the image. Alternatively, the specific image may itself carry an easily recognized mark, so that no accompanying text is needed. For example, it may be displayed as the avatar of an unknown person, so that the user can see at a glance that the audio information being played is unrelated to any contact.
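The two outcomes described above (identification built from a contact object, or specific identification information for unmatched audio) can be sketched as follows. The `Contact` and `UserIdentification` types, their field names, and the default image file name are hypothetical. Fetching a replacement image locally or from the cloud is not shown; a default image stands in for it here.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Contact:
    name: str
    avatar: Optional[str] = None    # path to the avatar image, if one was set
    employer: Optional[str] = None

@dataclass
class UserIdentification:
    label: str
    image: str

# Hypothetical "specific identification information" for unmatched audio.
DEFAULT_LABEL = "Unknown"
DEFAULT_IMAGE = "unknown_avatar.png"

def build_identification(contact: Optional[Contact]) -> UserIdentification:
    """Build identification from a contact, or fall back to the specific one."""
    if contact is None:
        # The audio is not key audio information: use the specific identification.
        return UserIdentification(DEFAULT_LABEL, DEFAULT_IMAGE)
    # A contact without an avatar gets the default image in this sketch.
    return UserIdentification(contact.name, contact.avatar or DEFAULT_IMAGE)

known = build_identification(Contact("Contact 1", avatar="contact1.png"))
no_avatar = build_identification(Contact("Contact 2"))
unknown = build_identification(None)
```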

Optionally, in the embodiment of the present invention, if the M segments of audio information are the key audio information, then when or after the user identification information corresponding to the M segments of audio information is established from the contact object, the method may further include: obtaining a first audio fragment from the M segments of audio information; and storing the first audio fragment as the (N+1)th audio sample, where the (N+1)th audio sample and the M segments of audio information correspond to the same user identification information. When the M segments of audio information are determined to be the key audio information, any audio fragment can be cut from them as the first audio fragment and stored as the (N+1)th audio sample. This steadily increases the number of audio samples, so that more voiceprint features are available for comparison during voiceprint matching and the user identification information for as many of the different voiceprint features in the voice file as possible can be identified. The corresponding audio objects can thereby be determined, which improves the accuracy with which the electronic device analyzes recording files.
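A minimal sketch of growing the sample store is given below. It assumes that audio is held as lists of frames and that the sample store is a list of (fragment, user identification) pairs; the function name and the choice of taking the first few frames of the first non-empty segment are illustrative only, since the text allows any fragment to be cut.

```python
def add_sample_from_segments(samples, segments, user_id, clip_len=3):
    """Cut one fragment from the M segments and append it as a new sample.

    samples:  list of (fragment, user_id) pairs, i.e. the N stored samples
    segments: the M segments of audio information, each a list of frames
    Returns the new number of stored samples.
    """
    for segment in segments:
        if segment:
            fragment = segment[:clip_len]          # any fragment would do
            samples.append((fragment, user_id))    # becomes sample N+1
            break
    return len(samples)

samples = [([1], "Contact 1"), ([2], "Contact 2")]          # N = 2
count = add_sample_from_segments(samples, [[], [5, 6, 7, 8]], "Contact 1")
```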

Step 14: Output the voice file, wherein, when audio information having the first voiceprint feature is played, the electronic device is controlled to display the first user identification information with a first display effect.

In the embodiment of the present invention, after the audio information having the same voiceprint feature in the voice file is determined, the user identification information corresponding to that audio information can be determined. Thus, when the voice file is played, if voiceprint recognition determines that the audio information currently being played has a corresponding audio sample among the N audio samples, the same user identification information can be displayed for all audio information having that voiceprint feature, for example the avatar information and name information of the audio object corresponding to the audio information.
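The lookup performed during playback can be sketched as below. The sketch assumes the voice file has already been analyzed into time-stamped segments tagged with a voiceprint key, and that a mapping from voiceprint key to user identification information exists; both structures and the function name are hypothetical.

```python
def identification_at(segments, id_by_voiceprint, position):
    """Return the user identification information to display at a playback position.

    segments:         list of (start, end, voiceprint_key), times in seconds
    id_by_voiceprint: maps a voiceprint key to its user identification information
    Every segment sharing a voiceprint key yields the same identification.
    """
    shown = []
    for start, end, key in segments:
        if start <= position < end and key in id_by_voiceprint:
            ident = id_by_voiceprint[key]
            if ident not in shown:
                shown.append(ident)
    return shown

segments = [(0.0, 4.0, "vp1"), (4.0, 6.0, "vp2"), (6.0, 9.0, "vp1")]
id_by_voiceprint = {"vp1": "Contact 1", "vp2": "Contact 2"}
early = identification_at(segments, id_by_voiceprint, 1.0)
late = identification_at(segments, id_by_voiceprint, 7.5)
```

Because the lookup returns a list, the same function also covers positions where more than one speaker is active.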

Refer to Fig. 2. Numeral 20 denotes the electronic device, here a mobile phone. Numeral 21 denotes the display unit of the electronic device, on which the voice file is being played; the audio currently played is any one of the M segments of audio information. Numeral 22 denotes the user identification information, here the users' avatar information. The user identification information labeled 1 is the first user identification information, and the remaining items are the user identification information corresponding to the other voiceprint features contained in the voice file.

In the embodiment of the present invention, the audio information processing method may further include: when it is detected that a second voiceprint feature and a third voiceprint feature are simultaneously present in an audio information section of the voice file, separating from the audio information section, according to the characteristic parameters of the second voiceprint feature and the third voiceprint feature, second audio information having the second voiceprint feature and third audio information having the third voiceprint feature; comparing the second audio information and the third audio information with the N audio samples respectively to determine a second audio sample corresponding to the second voiceprint feature and a third audio sample corresponding to the third voiceprint feature; determining, according to the correspondence between audio samples and user identification information, second user identification information corresponding to the second voiceprint feature and third user identification information corresponding to the third voiceprint feature; and controlling the electronic device to display the second user identification information and the third user identification information simultaneously while the audio information section is played.

Here, the audio information section may refer to a portion of the voice file that contains several pieces of audio information at the same time. For example, a given unit of time in the played voice file may contain the speech of several speakers simultaneously, and several voiceprint features can be determined from the audio information of each person. The second voiceprint feature and the third voiceprint feature may be the voiceprint features of the audio information of different speakers.

After it is determined that the second voiceprint feature and the third voiceprint feature are simultaneously present in an audio information section of the voice file, extraction can be performed on the audio information section according to the characteristic parameters of the second voiceprint feature and the third voiceprint feature, thereby isolating the second audio information having the second voiceprint feature and the third audio information having the third voiceprint feature. The characteristic parameter may be the frequency values of the formants in the voiceprint spectrum. In general, the formant frequency values and their trends are the most stable characteristic parameters in a voiceprint spectrum and are highly specific; characteristic parameters such as duration, intensity, and waveform are less stable but can also be used for reference.
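As a heavily simplified illustration of using formant frequencies as the characteristic parameter, the sketch below assigns short frames to whichever reference formant pattern is nearer. It assumes that formant frequencies have already been estimated per frame and that each frame is dominated by one speaker; separating fully overlapped speech requires techniques beyond this sketch. The function name and data layout are hypothetical.

```python
def split_by_formants(frames, ref_second, ref_third):
    """Assign each frame to the nearer of two reference formant patterns.

    frames:     list of (frame_id, formants), formants being a tuple of Hz values
    ref_second: reference formant tuple for the second voiceprint feature
    ref_third:  reference formant tuple for the third voiceprint feature
    Returns (frame ids for the second speaker, frame ids for the third speaker).
    """
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    second, third = [], []
    for frame_id, formants in frames:
        if squared_distance(formants, ref_second) <= squared_distance(formants, ref_third):
            second.append(frame_id)
        else:
            third.append(frame_id)
    return second, third

frames = [(0, (700, 1200)), (1, (300, 2300)), (2, (690, 1250))]
second_ids, third_ids = split_by_formants(frames, (700, 1220), (310, 2250))
```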

Optionally, in the embodiment of the present invention, after the second user identification information corresponding to the second voiceprint feature and the third user identification information corresponding to the third voiceprint feature are determined, the second user identification information and the third user identification information can be displayed simultaneously while the audio information section is played, so that the listener can see the avatars of the several people who are speaking at that moment. For example, if the voice file contains audio information section 1, in which speaker A and speaker B speak simultaneously, then when playback reaches this section, avatar a corresponding to speaker A and avatar b corresponding to speaker B are displayed together, indicating that the section being played contains the voices of the audio objects corresponding to these two avatars.

Refer to Fig. 3. Numeral 30 denotes the electronic device, here a mobile phone. Numeral 31 denotes the display unit of the electronic device, on which the audio information section is being played; the section contains both the second audio information corresponding to the second voiceprint feature and the third audio information corresponding to the third voiceprint feature. Numerals 1 and 2 denote the second user identification information and the third user identification information respectively. Both are shown enlarged relative to the other user identification information, indicating that the audio information corresponding to them is currently being played.

Optionally, in the embodiment of the present invention, controlling the electronic device to display the second user identification information and the third user identification information simultaneously while the audio information section is played may further include: detecting a second sound intensity corresponding to the audio information having the second voiceprint feature and a third sound intensity corresponding to the audio information having the third voiceprint feature; comparing the second sound intensity with the third sound intensity, determining the audio information with the greater sound intensity to be the primary audio information, and determining the audio information with the lesser sound intensity to be the secondary audio information; and, according to the correspondence between sound intensity and display effect, controlling the electronic device to display the user identification information corresponding to the primary audio information with the first display effect and to display the user identification information corresponding to the secondary audio information with a second display effect.

Because the second user identification information and the third user identification information are displayed simultaneously while the audio information section is played, the display effect of each item of user identification information can be determined from the sound intensity of its audio information, which makes it easier to tell which audio information corresponds to which user identification information.

For example, the display effect for the audio information with the greater sound intensity may be that its user identification information bounces at a high frequency, and the display effect for the audio information with the lesser sound intensity may be that its user identification information bounces at a low frequency. By observing the bouncing frequency, the listener can associate each item of user identification information with the loudness of a speaker's voice. Thus, when an audio information section in which several people speak at once is played, the listener can tell which user identification information corresponds to which voice from the loudness of the voice and the bouncing frequency, which avoids the difficulty of distinguishing speakers when a recording contains several simultaneous voices.
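The intensity comparison can be sketched as follows, assuming the separated second and third audio information are available as lists of amplitude samples and that sound intensity is approximated by the root-mean-square amplitude. The effect names and the choice of RMS are assumptions made for illustration.

```python
import math

def rms(samples):
    """Root-mean-square amplitude, used here as a stand-in for sound intensity."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def assign_display_effects(second_audio, third_audio):
    """Give the louder (primary) audio the first display effect and the
    quieter (secondary) audio the second display effect."""
    first_effect, second_effect = "bounce_fast", "bounce_slow"  # hypothetical names
    if rms(second_audio) >= rms(third_audio):
        return {"second": first_effect, "third": second_effect}
    return {"second": second_effect, "third": first_effect}

effects = assign_display_effects([0.5, -0.5, 0.4], [0.1, -0.1, 0.05])
```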

Refer to Fig. 4. Based on the same inventive concept, the embodiment of the present invention further provides an electronic device in which N audio samples are stored. Each of the N audio samples corresponds to user identification information, the user identification information includes information that can be used to characterize the audio object corresponding to audio information, and N is a positive integer. The electronic device may include a parsing module 401, a comparison module 402, a first determination module 403, and an output module 404.

The parsing module 401 may be configured to parse out, in the process of outputting a voice file, the M segments of audio information in the voice file that have the first voiceprint feature, where M is a positive integer.

The comparison module 402 may be configured to compare the M segments of audio information with the N audio samples and determine whether a voiceprint feature identical to the first voiceprint feature exists among the N voiceprint features corresponding to the N audio samples.

The first determination module 403 may be configured to, if such a voiceprint feature exists, determine the first audio sample among the N audio samples that corresponds to the voiceprint feature identical to the first voiceprint feature, and determine, according to the correspondence between audio samples and user identification information, the first user identification information corresponding to the M segments of audio information.

The output module 404 may be configured to output the voice file, wherein, when audio information having the first voiceprint feature is played, the electronic device is controlled to display the first user identification information with the first display effect.
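The division of work among modules 401 to 404 can be illustrated with a small sketch. It assumes voiceprint features are represented by hashable keys compared for exact equality, which stands in for real voiceprint matching, and that a voice file is a list of time-stamped segments; the class and method names are hypothetical.

```python
class ElectronicDevice:
    def __init__(self, samples):
        # samples: maps a voiceprint key to its user identification information
        self.samples = samples

    def parse(self, voice_file):
        """Parsing module 401: group the segments of a voice file by voiceprint."""
        groups = {}
        for start, end, voiceprint in voice_file:
            groups.setdefault(voiceprint, []).append((start, end))
        return groups

    def compare(self, voiceprint):
        """Comparison module 402: does a stored sample share this voiceprint?"""
        return voiceprint in self.samples

    def determine(self, voiceprint):
        """First determination module 403: map the matched sample to its identification."""
        return self.samples[voiceprint] if self.compare(voiceprint) else None

    def output(self, voice_file):
        """Output module 404: display events in playback order."""
        return [(start, end, self.determine(voiceprint))
                for start, end, voiceprint in sorted(voice_file)]

device = ElectronicDevice({"vp1": "Contact 1"})
voice_file = [(5, 9, "vp2"), (0, 5, "vp1"), (9, 12, "vp1")]
groups = device.parse(voice_file)
events = device.output(voice_file)
```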

Optionally, in the embodiment of the present invention, the electronic device further includes:

Specifically, the computer program instructions corresponding to the information processing method in the embodiments of the present application may be stored on a storage medium such as an optical disc, a hard disk, or a USB flash drive. When the computer program instructions in the storage medium that correspond to the audio information processing method are read or executed by an electronic device, the following steps are included:

Optionally, the storage medium further stores other computer instructions, which are used to perform the following steps: when it is detected that a second voiceprint feature and a third voiceprint feature are simultaneously present in an audio information section of the voice file, separating from the audio information section, according to the second voiceprint feature and the third voiceprint feature, second audio information having the second voiceprint feature and third audio information having the third voiceprint feature;

Optionally, when the computer instructions stored in the storage medium that correspond to the step of controlling the electronic device to display the second user identification information and the third user identification information simultaneously while the audio information section is played are executed, the following steps are further included:

Optionally, when the computer instructions stored in the storage medium that correspond to the step of comparing the M segments of audio information with the N audio samples and determining whether a voiceprint feature identical to the first voiceprint feature exists among the N voiceprint features corresponding to the N audio samples are executed, the following steps are further included:

Optionally, the storage medium further stores other computer instructions, which are executed when or after the computer instructions corresponding to the following step are executed: if the M segments of audio information are the key audio information, establishing the user identification information corresponding to the M segments of audio information from the contact object. When executed, these other computer instructions include the following steps:

Obviously, those skilled in the art can make various changes and modifications to the present invention without departing from its spirit and scope. Thus, if these changes and modifications fall within the scope of the claims of the present invention and their technical equivalents, the present invention is also intended to include them.

Claims

1. An audio information processing method, applied to an electronic device, wherein N audio samples are stored in the electronic device, each of the N audio samples corresponds to user identification information, the user identification information includes information that can be used to characterize the audio object corresponding to audio information, and N is a positive integer, the method comprising:

2. The method of claim 1, characterized in that the method further comprises:

3. The method of claim 2, characterized in that controlling the electronic device to display the second user identification information and the third user identification information simultaneously while the audio information section is played further comprises:

4. The method of any one of claims 1-3, characterized in that comparing the M segments of audio information with the N audio samples and determining whether a voiceprint feature identical to the first voiceprint feature exists among the N voiceprint features corresponding to the N audio samples further comprises:

5. The method of claim 4, characterized in that, if the M segments of audio information are the key audio information, then when or after the user identification information corresponding to the M segments of audio information is established from the contact object, the method further comprises:

6. An electronic device, wherein N audio samples are stored in the electronic device, each of the N audio samples corresponds to user identification information, the user identification information includes information that can be used to characterize the audio object corresponding to audio information, and N is a positive integer, the electronic device comprising:

7. The electronic device of claim 6, characterized in that the electronic device further comprises:

8. The electronic device of claim 7, characterized in that the electronic device further comprises:

9. The electronic device of any one of claims 6-8, characterized in that the electronic device further comprises:

10. The electronic device of claim 9, characterized in that the electronic device further comprises: