CN109671438A

CN109671438A - Device and method for providing auxiliary services using voice

Info

Publication number: CN109671438A
Application number: CN201910082510.0A
Authority: CN
Inventors: Name withheld from publication
Original assignee: Wuhan Entra Information Technology Co Ltd
Current assignee: Wuhan Entra Information Technology Co Ltd
Priority date: 2019-01-28
Filing date: 2019-01-28
Publication date: 2019-04-23

Abstract

The present invention relates to a device and method for providing auxiliary services using voice. The device includes: a voice information acquisition module, which acquires audio content within its reception range; an identity recognition module, which, in response to the voiceprints of multiple people appearing in the acquired audio content, obtains the voiceprint information of the multiple people, determines the identity type of each person based on that voiceprint information and a pre-established scene voiceprint data set, and combines the identity types of each person into an identity type combination of the multiple people; a speech analysis module, which performs speech recognition on the acquired audio content to obtain speech content, and performs semantic recognition on the speech content to obtain key information; and a service content providing module, which provides and displays service content based on the permission level corresponding to the identity type combination of the multiple people, the key information, and a preset user preference. The technical solution proposed by the embodiments of the present invention does not require the user to interact specifically with an intelligent terminal.

Description

Device and method for providing auxiliary services using voice

Technical field

The invention belongs to the technical field of human-computer interaction, and specifically relates to a device and method for providing auxiliary services using voice.

Background art

With the development of face recognition and speech recognition technology, their application scenarios continue to expand. In current human-computer interaction scenarios, the prevailing pattern is a one-to-one dialogue between a user and an intelligent robot. The intelligent robot first uses face recognition to verify that the user matches the identity card information the user provides. After verification succeeds, the user issues a voice instruction stating a need; the intelligent robot recognizes the voice information through speech recognition and presents the product the user needs by means of visualization and voice. The user then confirms by voice whether the product meets the need, which completes the interaction. In other words, the intelligent robot recognizes the voice instruction issued by the customer and acts on that instruction. The applicant has found that this mode of interaction requires the user to interact specifically with an intelligent terminal.

Summary of the invention

To solve the above technical problem that the current mode of interaction requires the user to interact specifically with an intelligent terminal, the embodiments of the present invention propose a device and method for providing auxiliary services using voice.

In a first aspect of the present invention, a device for providing auxiliary services using voice is provided. The device includes a voice information acquisition module, an identity recognition module, a speech analysis module, and a service content providing module, wherein:

the voice information acquisition module acquires audio content within its reception range;

the identity recognition module, in response to the voiceprints of multiple people appearing in the audio content acquired by the voice information acquisition module, performs voiceprint recognition on that audio content to obtain the voiceprint information of the multiple people; and, based on the obtained voiceprint information of the multiple people and a pre-established scene voiceprint data set, determines the identity type of each person and combines the identity types of each person into an identity type combination of the multiple people; the scene voiceprint data set characterizes the association between a person's voiceprint information and identity type;

the speech analysis module performs speech recognition on the audio content acquired by the voice information acquisition module to obtain speech content, and performs semantic recognition on the obtained speech content to obtain key information; and

the service content providing module provides and displays service content based on the permission level corresponding to the identity type combination of the multiple people, the key information, and a preset user preference.

In certain embodiments, the service content providing module determines, from the identity type combination of the multiple people determined by the identity recognition module, the permission level that the identity type combination possesses; provides, according to the key information obtained by the speech analysis module, candidate service content that satisfies the permission level; and determines service content from the candidate service content according to the preset user preference and provides the determined service content for display.

In certain embodiments, the device further includes an external control interface, which receives a voice instruction or hardware instruction issued by a user and, based on that instruction, changes the provided service content or changes the displayed service content.

In certain embodiments, the voice information acquisition module continuously acquires audio content within its reception range, or starts or stops acquiring audio content within its reception range according to a user's instruction.

In certain embodiments, the service content providing module includes an electronic device with input and output interaction functions. The provided service content is output through the electronic device, which can also receive information that the user inputs and operate on the output service content according to that input.

In certain embodiments, if the obtained voiceprint information of a person has been saved in the scene voiceprint data set, the identity type saved in the scene voiceprint data set in association with that voiceprint information is determined to be the identity type of the person; if the obtained voiceprint information of a person has not been saved in the scene voiceprint data set, the identity type of the person is determined to be stranger.

In certain embodiments, the provided service content includes a function that is permitted to execute only after the identity type has been verified again, and the function is executed in response to the identity type verification succeeding;

and/or

the provided service content is assigned a security classification, and when the security classification requires an identity check of the relevant person, only the voiceprint information of that person is confirmed.

In a second aspect of the present invention, a method for providing auxiliary services using voice is provided. The method includes:

in response to the voiceprints of multiple people appearing in acquired audio content, performing voiceprint recognition on the acquired audio content to obtain the voiceprint information of the multiple people;

based on the obtained voiceprint information of the multiple people and a pre-established scene voiceprint data set, determining the identity type of each person, and combining the identity types of each person into an identity type combination of the multiple people, where the scene voiceprint data set characterizes the association between a person's voiceprint information and identity type;

performing speech recognition on the acquired audio content, and performing semantic recognition on the recognized speech content to obtain key information; and

providing and displaying service content based on the permission level corresponding to the identity type combination of the multiple people, the key information, and a preset user preference.

In certain embodiments, providing and displaying service content based on the permission level corresponding to the identity type combination of the multiple people, the key information, and the preset user preference includes: determining, from the identity type combination of the multiple people, the permission level that the identity type combination possesses; providing, according to the obtained key information, candidate service content that satisfies the permission level; and determining service content from the candidate service content according to the preset user preference, and providing the determined service content for display.

In certain embodiments, the method further includes: receiving a voice instruction or hardware instruction issued by a user, and changing the provided service content or changing the displayed service content.

Beneficial effects of the present invention: the device and method for providing auxiliary services using voice proposed by the embodiments of the present invention are directed at application scenarios in which multiple people converse. Voiceprint recognition is used to form the identity type combination of the multiple people, key information is obtained from the conversation, and service content is provided according to the permission level corresponding to the identity type combination, the obtained key information, and a preset user preference. The user does not need to interact specifically with an intelligent terminal; no video content needs to be captured and no face recognition needs to be performed; acquiring audio information alone is enough to realize the auxiliary service function. Moreover, the technical solution does not need to provide service through question-and-answer exchanges with the user, so it does not disturb the normal course of the conversation. In addition, the technical solution provides an external control interface for the user, so that auxiliary services can be provided to the user quickly.

Brief description of the drawings

Fig. 1 is a structural schematic diagram of the device for providing auxiliary services using voice proposed by an embodiment of the present invention;

Fig. 2 is a flowchart of one embodiment of the method for providing auxiliary services using voice proposed by an embodiment of the present invention;

Fig. 3 is a structural schematic diagram of the device for providing auxiliary services using voice, in one application scenario, proposed by an embodiment of the present invention.

Detailed description of the embodiments

To make the objectives, technical solutions, and advantages of the present invention clearer, the present invention is described in further detail below in conjunction with specific embodiments and with reference to the accompanying drawings. As those skilled in the art will know, however, the invention is not limited to the drawings and the following embodiments.

An embodiment of the present invention proposes a device for providing auxiliary services using voice which, as shown in Fig. 1, includes a voice information acquisition module, an identity recognition module, a speech analysis module, and a service content providing module.

The voice information acquisition module acquires audio content within its reception range.

In one embodiment, the voice information acquisition module includes a microphone. As those skilled in the art will know, the voice information acquisition module can also use other equipment that acquires audio content; for example, it may include a sound pickup.

The identity recognition module, in response to the voiceprints of multiple people appearing in the audio content acquired by the voice information acquisition module, performs voiceprint recognition on that audio content to obtain the voiceprint information of the multiple people; and, based on the obtained voiceprint information of the multiple people and a pre-established scene voiceprint data set, determines the identity type of each person and combines the identity types of each person into an identity type combination of the multiple people.

In one embodiment, the identity recognition module includes a voiceprint recognition unit and an identity type determination unit. The voiceprint recognition unit, in response to the voiceprints of multiple people appearing in the audio content acquired by the voice information acquisition module, performs voiceprint recognition on that audio content to obtain the voiceprint information of the multiple people. The identity type determination unit, based on the obtained voiceprint information of the multiple people and the pre-established scene voiceprint data set, determines the identity type of each of the multiple people and combines the identity types of each person into the identity type combination of the multiple people.

In this embodiment, the scene voiceprint data set is used to characterize the association between a person's voiceprint information and identity type. Before the identity type belonging to a recognized voiceprint is determined, the pre-established scene voiceprint data set already stores people's voiceprint information, and each person has been assigned an identity type. It will be appreciated that the voiceprint information stored in the pre-established scene voiceprint data set can be obtained from audio content captured on site, for example audio content acquired by the voice information acquisition module or by other voice information acquisition modules (such as a sound pickup); it can also be obtained from uploaded audio data. In one embodiment, audio content for a period of time can be selected from already acquired audio content, or audio content for a period of time can be acquired in real time by triggering start and stop operations; a person's voiceprint information is obtained by recognizing the audio content of that period. In one embodiment, video content for a period of time can be selected from already acquired video content, or acquired in real time by triggering start and stop operations; the voiceprint information of a person in the video content of that period is recognized, and the person is assigned an identity type; the person's voiceprint information and identity type are then stored in association in the scene voiceprint data set.

In this embodiment, the identity type indicates a person's work, post, and/or status. In one embodiment, the identity types include identity types divided by post, for example president, general manager, line manager, ordinary employee, intern, and temporary worker. In another embodiment, the identity types include identity types divided by work assignment, for example project A leader, project A development engineer, project A test engineer, and project A marketing staff.

In one embodiment, if the obtained voiceprint information of a person has been saved in the scene voiceprint data set, the identity type saved in the scene voiceprint data set in association with that voiceprint information is determined to be the identity type of the person; if the obtained voiceprint information of a person has not been saved in the scene voiceprint data set, the identity type of the person is determined to be stranger.
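The lookup described above can be sketched as follows. This is only an illustration under stated assumptions: the text does not say how voiceprints are represented or compared, so the sketch assumes each voiceprint is a numeric feature vector matched by cosine similarity against a hypothetical threshold of 0.8. The names `identity_type_of` and `identity_type_combination` are likewise made up for the example.

```python
import math

STRANGER = "stranger"


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def identity_type_of(voiceprint, dataset, threshold=0.8):
    """Identity type of the best-matching saved voiceprint, or STRANGER if none matches.

    `dataset` is a list of (saved_voiceprint, identity_type) pairs; the
    threshold is an assumed tuning value, not something given in the text.
    """
    best_type, best_score = STRANGER, threshold
    for saved, identity_type in dataset:
        score = cosine_similarity(voiceprint, saved)
        if score >= best_score:
            best_type, best_score = identity_type, score
    return best_type


def identity_type_combination(voiceprints, dataset):
    """Combine the identity type of each speaker into one order-independent tuple."""
    return tuple(sorted(identity_type_of(v, dataset) for v in voiceprints))


dataset = [
    ([1.0, 0.0, 0.0], "general manager"),
    ([0.0, 1.0, 0.0], "ordinary employee"),
]
speakers = [[0.9, 0.1, 0.0], [0.1, 0.95, 0.0], [0.0, 0.0, 1.0]]
combination = identity_type_combination(speakers, dataset)
print(combination)
```

The third speaker matches no saved voiceprint, so the combination contains one stranger alongside the two known identity types.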

In one embodiment, the scene voiceprint data set includes a first scene voiceprint sub-data set, for voiceprints whose identity type has been determined, and a second scene voiceprint sub-data set, for voiceprints whose identity type is still to be supplemented. The voiceprint information of a person who has been assigned an identity type is stored, in association with that identity type, in the first scene voiceprint sub-data set. The voiceprint information of a stranger is stored in the second scene voiceprint sub-data set, because the stranger's identity type still needs to be supplemented. The pre-established scene voiceprint data set belongs to the first scene voiceprint sub-data set.
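One possible way to hold the two sub-data sets is sketched below. It assumes, purely for illustration, that every recognized voiceprint has already been reduced to a stable identifier such as `"vp-002"`; the class and method names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SceneVoiceprintDataset:
    # First sub-data set: voiceprint ID -> identity type already assigned.
    labeled: dict = field(default_factory=dict)
    # Second sub-data set: voiceprint IDs of strangers awaiting an identity type.
    pending: set = field(default_factory=set)

    def observe(self, voiceprint_id):
        """Return the identity type; remember unknown voiceprints as pending strangers."""
        if voiceprint_id in self.labeled:
            return self.labeled[voiceprint_id]
        self.pending.add(voiceprint_id)
        return "stranger"

    def assign(self, voiceprint_id, identity_type):
        """Supplement an identity type, moving the voiceprint into the first sub-data set."""
        self.pending.discard(voiceprint_id)
        self.labeled[voiceprint_id] = identity_type


ds = SceneVoiceprintDataset(labeled={"vp-001": "line manager"})
first = ds.observe("vp-002")    # unknown, so stored as a pending stranger
ds.assign("vp-002", "intern")   # identity type supplemented later
second = ds.observe("vp-002")
print(first, second)
```

Keeping strangers in a separate set means a later labelling step can find them without scanning the labeled entries.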

The speech analysis module performs speech recognition on the audio content acquired by the voice information acquisition module, and performs semantic recognition on the speech content identified by the voice recognition unit to obtain key information. In one embodiment, the speech analysis module includes a voice recognition unit and a semantic recognition unit: the voice recognition unit performs speech recognition on the audio content acquired by the voice information acquisition module, and the semantic recognition unit performs semantic recognition on the speech content identified by the voice recognition unit to obtain the key information.
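As a rough illustration of the second stage only, the sketch below assumes speech recognition has already produced a text transcript and stands in for semantic recognition with simple matching against a keyword vocabulary. That substitution is an assumption made for the example; the text does not specify the semantic recognition technique. The function name, the vocabulary, and the placeholder "Province X" are all hypothetical.

```python
from collections import Counter


def extract_key_information(transcript, vocabulary):
    """Return vocabulary terms found in the transcript.

    Terms are ordered by how often they occur (most frequent first), with
    ties broken by where they first appear in the transcript.
    """
    text = transcript.lower()
    counts = Counter()
    first_seen = {}
    for term in vocabulary:
        needle = term.lower()
        occurrences = text.count(needle)
        if occurrences:
            counts[term] = occurrences
            first_seen[term] = text.find(needle)
    return sorted(counts, key=lambda t: (-counts[t], first_seen[t]))


transcript = (
    "Our wireless communication business grew this year. "
    "We plan to visit Province X, and the wireless communication team will join."
)
vocabulary = ["Province X", "wireless communication", "finance"]
key_information = extract_key_information(transcript, vocabulary)
print(key_information)
```

Ordering by frequency and first appearance produces a ranked list that a later stage can use when deciding which content to show first.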

The user may be one of the multiple people, or may be someone other than the multiple people, for example a representative of one of the multiple people, whom that person has entrusted or appointed to appear in the scene of the multiple people on his or her behalf.

In one embodiment, the service content providing module determines, from the identity type combination of the multiple people determined by the identity recognition module, the permission level that the identity type combination possesses, and provides service content for display according to the key information obtained by the speech analysis module, the permission level, and the preset user preference. In one embodiment, the service content providing module can provide, according to the key information obtained by the speech analysis module, candidate service content that satisfies the permission level, then determine service content from the candidate service content according to the preset user preference and provide the determined service content for display. In one embodiment, when multiple service contents are provided, they are displayed in sequence according to the order of the corresponding key information or according to the preset user preference. It will be appreciated that the order of the key information can depend on how often and when each piece of key information appears. The preset user preference can depend on the personalized settings of one person among the multiple people; that person may be the most important person among the multiple people, or the person acting as host.
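The selection described above (screen by permission level and key information, then apply the user preference, then order the results) might look like the sketch below. The data model is an assumption made for illustration: numeric permission levels, a `category` field used to express preference, and the catalog entries are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceContent:
    title: str
    keyword: str         # key information this item relates to
    required_level: int  # minimum permission level needed to show it
    category: str        # used here to express the preset user preference


def select_service_content(catalog, key_information, permission_level, preferred_categories):
    """Return the content to display, ordered by the rank of its key information."""
    rank = {keyword: i for i, keyword in enumerate(key_information)}
    # Candidate content: matches some key information and satisfies the permission level.
    candidates = [
        c for c in catalog
        if c.keyword in rank and c.required_level <= permission_level
    ]
    # Determined content: the candidates that fit the preset user preference.
    chosen = [c for c in candidates if c.category in preferred_categories]
    return sorted(chosen, key=lambda c: rank[c.keyword])


catalog = [
    ServiceContent("Company introduction slides", "enterprise name", 1, "slides"),
    ServiceContent("Wireless business annual summary", "wireless communication", 1, "report"),
    ServiceContent("Recent financial data", "enterprise name", 3, "report"),
    ServiceContent("Train schedule query", "Province X", 1, "travel"),
]
shown = select_service_content(
    catalog,
    key_information=["wireless communication", "enterprise name"],
    permission_level=1,
    preferred_categories={"slides", "report"},
)
titles = [c.title for c in shown]
print(titles)
```

Here the financial data is screened out by the permission level, and the train query is skipped because its keyword has not come up in the conversation.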

In one embodiment, the device further includes an external control interface, which receives a voice instruction or hardware instruction issued by a user. The external control interface can receive a voice instruction issued by the user through the voice information acquisition module or through other voice information acquisition modules, and can receive a hardware instruction issued by the user through a virtual key or physical key provided by the device. A voice instruction can be triggered by a specific term; the specific term can be, for example, the name of some physical object, or a coined word, as long as the voice instruction is as unlikely as possible to be confused with the language of the conversation. A hardware instruction can be triggered by a specific key; the specific key can be, for example, a floating key that the device places on the display screen, or a key that the device already has, which serves as the external control interface in the scene where service content is displayed. In certain application scenarios, for example, while the provided service content is being displayed, the displayed content can be switched in response to a voice instruction or key instruction from the user; service content can also be provided and displayed anew in response to a keyword that the user inputs by voice instruction or key instruction.
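A minimal sketch of trigger-term handling follows, assuming the utterance is already available as text. The coined trigger word "zorblat", the command phrases, and the command names are hypothetical examples rather than anything specified in the text.

```python
def parse_voice_instruction(utterance, trigger_term, commands):
    """Return the command name if the utterance is the trigger term followed by a known phrase.

    Ordinary conversation that does not start with the trigger term yields None.
    """
    # Lower-case the words and drop simple trailing or leading punctuation.
    words = [w.strip(",.!?") for w in utterance.lower().split()]
    trigger = trigger_term.lower().split()
    if words[:len(trigger)] != trigger:
        return None
    phrase = " ".join(words[len(trigger):])
    return commands.get(phrase)


commands = {
    "next": "SHOW_NEXT",
    "previous": "SHOW_PREVIOUS",
    "stop": "STOP_DISPLAY",
}
print(parse_voice_instruction("Zorblat, next.", "zorblat", commands))
print(parse_voice_instruction("Let us look at the next slide", "zorblat", commands))
```

Requiring a coined word at the start of the utterance is one way to keep instructions from being confused with normal speech, at the cost of users having to learn the word.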

In one embodiment, the service content providing module may include a display screen, through which the provided service content is displayed. In another embodiment, the service content providing module includes an electronic device with input and output interaction functions; the provided service content is output through the electronic device, which can also receive information that the user inputs (such as a voice instruction or hardware instruction) and operate on the output service content according to that input. The input information can, for example, indicate a page operation on the output service content such as zooming in, zooming out, sliding, or rotating; it can indicate an activation operation such as clicking on a link provided in the output service content; or it can indicate a reply to a selection or confirmation query provided in the output service content.

In one embodiment, the scene voiceprint data set also stores contact tool accounts associated with people's voiceprint information. The contact tools may include at least mobile phone, WeChat, or QQ. It will be appreciated that the contact tools can also include other means of exchanging information. Further, in one embodiment, the provided service content is sent to the contact tool account and the user is prompted to view it, so that the service content can be displayed individually on the user's device. In another embodiment, the provided service content is sent to the contact tool account in the form of topics, and when there are multiple topics, the topics are shown in order (for example according to the order of the key information or according to the preset user preference) for the user to review.

In one embodiment, the service content can be obtained from an internal server of the organization, or from an external server over a network. Further, in one embodiment, the service content providing module may include an internal server that stores a library of service content available for selection. The service content providing module can automatically update the service content in the internal server and mark the updated service content for the user to verify, or can update the service content in the internal server according to a user's instruction.

In one embodiment, the identity recognition module, the speech analysis module, and the service content providing module can be integrated in a service terminal.

In certain application scenarios, the voice information acquisition module can continuously acquire audio content within its reception range, or can start or stop acquiring audio content within its reception range according to a user's instruction.

In addition, in one embodiment, considering that service content is provided mainly through a mechanism of triggering by key information and screening by permission level, a keyword interface, a permission level interface, a candidate service content display interface, and the like can be provided separately, so that the candidate service content is enriched by subsequent continual updates of the database and the auxiliary service function of the embodiments of the present invention is continually strengthened.

It will be understood that the provided service content can be presented by loop playback, or by automatic playback according to a prior setting or an external instruction, and the presented service content can be continually updated according to new key information or a new identity type combination. In one embodiment, related service content can be intercut into the service content already provided, according to new key information or a new identity type combination.
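Loop playback with intercutting can be sketched with a rotating queue. The `Presentation` class and the item names are hypothetical, and the sketch assumes that intercut content should be shown next and then stay in the loop, which the text leaves open.

```python
from collections import deque


class Presentation:
    def __init__(self, items):
        self.queue = deque(items)

    def next_item(self):
        """Return the item to show now and move it to the back (loop playback)."""
        item = self.queue[0]
        self.queue.rotate(-1)
        return item

    def intercut(self, item):
        """Insert related content so that it is the next item shown."""
        self.queue.appendleft(item)


p = Presentation(["company introduction", "product overview"])
shown = [p.next_item()]                   # loop starts normally
p.intercut("branch office in Province X")  # new key information arrived
shown += [p.next_item() for _ in range(3)]
print(shown)
```

After the intercut item has been shown once, it joins the loop behind the existing items.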

It should be understood that the displayed service content should carry playback permission for the people currently present.

In addition, if the displayed service content includes a function that only a specific identity type may execute, the identity type can be verified once more before the function is executed. For example, if a ticket booking function is invoked through the displayed service content, voiceprint recognition is performed again before the ticket booking service is executed, and the relevant person is prompted to confirm his or her identity type; when necessary, the relevant person can be required to provide name information.
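The re-verification step can be sketched as a guard around the function call. In this illustration `verify_identity_type` stands in for a fresh voiceprint recognition pass and is supplied by the caller; it, the allowed identity types, and the booking function are hypothetical.

```python
def execute_with_reverification(function, allowed_identity_types, verify_identity_type):
    """Run `function` only if a fresh identity check returns an allowed identity type.

    Returns (executed, result); result is None when execution was refused.
    """
    identity_type = verify_identity_type()  # e.g. a new voiceprint recognition pass
    if identity_type not in allowed_identity_types:
        return (False, None)
    return (True, function())


def book_ticket():
    return "ticket booked"


allowed = {"general manager", "line manager"}
granted = execute_with_reverification(book_ticket, allowed, lambda: "general manager")
refused = execute_with_reverification(book_ticket, allowed, lambda: "stranger")
print(granted, refused)
```

Performing the check immediately before the call, rather than relying on the earlier identity type combination, guards against the speakers having changed in the meantime.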

Furthermore, in one embodiment, a security classification can also be assigned to the displayed service content. When the security classification requires an identity check of the relevant person, only that person's voiceprint information is confirmed again, and the other people are not checked. For example, when corporate financial data involving the enterprise's trade secrets is to be displayed, a prompt can be given that the relevant person must pass an identity check, and the corresponding service content is provided in response to the voiceprint information of the specific person being captured by a designated voice information acquisition module. In this embodiment, it will be understood that multiple voice information acquisition modules can be installed at different locations; in a meeting room, for example, voice information acquisition modules can be installed at the entrance and exit, in the corners, and so on, and the voiceprint information recognized in the audio content captured by these modules can all be gathered into the same scene voiceprint data set. When an identity check is performed, the voice information acquisition module used for the check is one at a fixed position. For example, the module used for the identity check can be placed at a specific position on the conference table in the meeting room; it can even be placed at another location outside the meeting room; or the check can be completed through a mobile device of the relevant person (such as that person's mobile phone, laptop, desktop computer, or tablet).
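The identity check against a designated acquisition module can be sketched as below. The module identifiers, voiceprint identifiers, and the `VoiceSample` type are hypothetical and assume that each captured sample records which module produced it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VoiceSample:
    module_id: str      # which voice information acquisition module captured the sample
    voiceprint_id: str  # assumed stable ID of the recognized voiceprint


def identity_check_passed(sample, designated_module_id, authorized_voiceprint_ids):
    """Accept only a sample from the designated module whose voiceprint is authorized."""
    return (
        sample.module_id == designated_module_id
        and sample.voiceprint_id in authorized_voiceprint_ids
    )


authorized = {"vp-cfo"}
at_table = VoiceSample(module_id="table-mic", voiceprint_id="vp-cfo")
at_door = VoiceSample(module_id="door-mic", voiceprint_id="vp-cfo")
print(identity_check_passed(at_table, "table-mic", authorized))
print(identity_check_passed(at_door, "table-mic", authorized))
```

The same authorized voiceprint is rejected when it arrives through a module other than the designated one, which matches the fixed-position requirement described above.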

An embodiment of the present invention also provides a method for providing auxiliary services using voice, as shown in Fig. 2.

In one embodiment, the step of performing voiceprint recognition on the acquired audio content, in response to the voiceprints of multiple people appearing in the acquired audio content, to obtain the voiceprint information of the multiple people can be implemented by the voice information acquisition module; the step of determining the identity type of each person, based on the obtained voiceprint information of the multiple people and the pre-established scene voiceprint data set, can be implemented by the identity recognition module; the step of performing speech recognition on the acquired audio content and performing semantic recognition on the speech content identified by the voice recognition unit to obtain key information can be implemented by the speech analysis module; and the step of providing and displaying service content based on the permission level corresponding to the identity type combination of the multiple people, the key information, and the preset user preference can be implemented by the service content providing module.

The audio content can be acquired by the voice information acquisition module. In one embodiment, the voice information acquisition module includes a microphone. As those skilled in the art will know, the voice information acquisition module can also use other equipment that acquires audio content; for example, it may include a sound pickup.

Providing and displaying service content based on the permission level corresponding to the identity type combination of the multiple people, the key information, and the preset user preference includes:

determining, from the identity type combination of the multiple people, the permission level that the identity type combination possesses;

providing, according to the obtained key information, candidate service content that satisfies the permission level; and

determining service content from the candidate service content according to the preset user preference, and providing the determined service content for display.

In one embodiment, when multiple service contents are provided, they are displayed in sequence according to the order of the corresponding key information or according to the preset user preference. It will be appreciated that the order of the key information can depend on how often and when each piece of key information appears. The preset user preference can depend on the personalized settings of one person among the multiple people; that person may be the most important person among the multiple people, or the person acting as host.

In one embodiment, the method further includes: receiving a voice instruction or hardware instruction issued by a user, and changing the provided service content or changing the displayed service content.

Where the method for providing auxiliary services using voice in this embodiment corresponds to the device for providing auxiliary services using voice, it is to be understood according to the foregoing description of the device, and the details are not repeated here.

The technical solution proposed by the embodiments of the present invention is illustrated below for the device that provides auxiliary services using voice, with reference to a specific application scenario.

In this embodiment, the service content providing module is a service terminal installed in a public meeting place of a certain enterprise. The service terminal includes at least one touch display screen for interacting with users.

When the service terminal finds, from the identity type combination, that the current identity type combination is employee A of the enterprise (acting as host), employee B of the enterprise, and stranger C, it determines that the permission level of the combination is general permission. Then, according to the obtained key information, for example when the obtained keywords include the enterprise name, the name of the wireless communication business that is one of the enterprise's several main businesses, and the name of a certain province, the service content providing module retrieves the materials stored in the database that correspond to these keywords as first-type candidate service content, for example the enterprise's external introduction and publicity PPT, a company profile, and a summary of the enterprise's wireless communication business for the current year. At the same time, the service content providing module retrieves content stored in the database such as query entries for train services and flights from the enterprise's location to that province as second-type candidate service content. Then, according to the preset user preference of the host, employee A, some or all of the first-type and/or second-type candidate service content is selected as the determined service content, which is provided for display.

From the perspective of enterprise employee A (the host), the scenario is in fact one in which employee A, together with employee B, introduces the general situation of the enterprise to client C and promotes the enterprise's business. While employee A and employee B chat openly in front of the service terminal, the main interface of the service terminal may, for example, play some enterprise display information in a loop. At the same time, as the conversation among employee A, employee B, and client C continues, clickable service content options appear naturally in the operation interface of the service terminal. For example, when employee A mentions the enterprise's name and latest developments, the enterprise's external publicity PPT shown automatically on the touch display screen can be used to reinforce the presentation. When the name of a certain province comes up in the exchange with client C, the introduction of the enterprise's branch in that province, also shown automatically on the touch display screen, offers a natural way to move on to the next topic. And once it is judged that there is an actual need to visit that province to do business, the train and flight query entries provided automatically by the service terminal can be used to look up a definite time, or even to book tickets on the spot.

It can be understood that if the voiceprint of a higher-level person, such as the president of the enterprise, appears in the collected audio content, the permission level corresponding to the current identity type combination may be the highest level. It can also be understood that when the current identity type combination includes the enterprise's CFO (chief financial officer), the multiple alternative service contents provided may also include material related to the enterprise's recent financial situation as one of the alternative service contents.
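The selection logic of this scenario can be sketched as follows. This is only a minimal illustration under stated assumptions: the level constants, the `CATALOG` contents, the rule that a stranger keeps the combination at the general level, and all function names are hypothetical and are not specified by the embodiment.

```python
# Hypothetical permission levels; the text names only "general" and "highest".
GENERAL, HIGHEST = 1, 2

# Hypothetical catalog: keyword -> list of (title, minimum permission level).
CATALOG = {
    "enterprise name": [("External publicity PPT", GENERAL),
                        ("Recent financial data", HIGHEST)],
    "province name": [("Train and flight query entry", GENERAL)],
}

def permission_level(identity_types):
    """Map an identity type combination to a permission level."""
    # Assumption: a stranger in the group keeps the combination at the
    # general level; otherwise the president raises it to the highest level.
    if "stranger" in identity_types:
        return GENERAL
    return HIGHEST if "president" in identity_types else GENERAL

def candidates(keywords, level):
    """Collect catalog entries that match a keyword and fit the level."""
    found = []
    for kw in keywords:
        for title, min_level in CATALOG.get(kw, []):
            if min_level <= level and title not in found:
                found.append(title)
    return found

def select(candidate_titles, preferred):
    """Keep the host's preferred items; fall back to all candidates."""
    chosen = [t for t in candidate_titles if t in preferred]
    return chosen or candidate_titles
```

For the combination of employee A, employee B, and stranger C, `permission_level` returns the general level, so `candidates` leaves out the financial material, and `select` then narrows the remaining list using the host's preset preference.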

In one application scenario, the device for providing ancillary services using voice proposed by the embodiments of the present invention is a terminal machine which, as shown in Figure 3, includes: a housing, a first voice information acquisition device, a second voice information acquisition device, a voiceprint recognition device, an identity verification device, and a display device.

The first voice information acquisition device is installed inside the housing, close to the front of the housing, and collects first audio content within its receiving range. The housing has multiple first through holes at the position corresponding to the first voice information acquisition device. The receiving range of the first voice information acquisition device is located around the height of an adult's mouth, so that the audio content of a multi-person conversation can be collected as completely as possible. There may be one or more first voice information acquisition devices, and their number may depend on the size of the space in which the terminal machine is located.

The second voice information acquisition device is arranged inside the housing, close to the back of the housing, and the housing has multiple second through holes at the position corresponding to the second voice information acquisition device. The second voice information acquisition device collects second audio content within its receiving range. In one embodiment, a sound receiving portion is arranged around the second through holes; it extends from the second through holes toward the second voice information acquisition device to form a tapered structure. There is one second voice information acquisition device.

In one embodiment, the first voice information acquisition device and the second voice information acquisition device each include a microphone. As those skilled in the art will know, the two devices may also use other equipment for collecting audio content; for example, each may include a sound pickup.

The voiceprint recognition device is arranged inside the housing and is connected to the first voice information acquisition device and the second voice information acquisition device. In response to the first voice information acquisition device collecting first audio content in which the voiceprints of multiple people appear, it performs voiceprint recognition on that first audio content and obtains the first voiceprint information of the multiple people. In response to the second voice information acquisition device collecting second audio content, it performs voiceprint recognition on that second audio content and obtains the second voiceprint information in it. In this embodiment, voiceprint recognition technology belongs to the prior art and is therefore not described in detail.

The identity verification device is arranged inside the housing, is connected to the voiceprint recognition device, and performs identity verification according to the second voiceprint information. In this embodiment, the technology of identity verification based on voiceprint information belongs to the prior art and is therefore not described in detail.

The display device is arranged on the front of the housing and is connected to the voiceprint recognition device and the identity verification device. It displays service content corresponding to the first voiceprint information and, in response to the second voiceprint information passing identity verification, displays service content corresponding to both the first voiceprint information and the second voiceprint information.

In one embodiment, the terminal machine further includes an external control interface connected to the display device, which receives voice instructions or hardware instructions issued by the user. Voice instructions may be received through the voice information acquisition devices described above or through other voice information acquisition devices; hardware instructions may be received through virtual keys or physical keys provided by the terminal machine. A voice instruction may be triggered by a specific term, which may be, for example, the name of a physical object or an invented word, as long as the voice instruction is as unlikely as possible to be confused with language used in the conversation. A hardware instruction may be triggered by a specific key, which may be, for example, a floating key that the terminal machine places on the displayed picture, or a key the terminal machine already has that serves as the external control interface while service content is displayed. In certain application scenarios, for example while service content is displayed, the displayed content may be switched in response to the user's voice instruction or key instruction, or service content may be provided again for display in response to keywords the user enters by voice instruction or key instruction.
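Triggering a voice instruction by a specific term can be illustrated with a short sketch. The trigger word `zorbit` and the function name `parse_command` are hypothetical; the embodiment requires only that the term be unlikely to occur in ordinary conversation.

```python
# Hypothetical trigger term, chosen as an invented word so that it is
# unlikely to be confused with language used in the conversation.
TRIGGER = "zorbit"

def parse_command(utterance):
    """Return the command text after the trigger term, or None for ordinary speech."""
    words = utterance.lower().split()
    if TRIGGER not in words:
        return None  # ordinary conversation; no instruction issued
    idx = words.index(TRIGGER)
    return " ".join(words[idx + 1:])
```

Speech that does not contain the trigger term is treated as ordinary conversation, so it is never interpreted as an instruction to the terminal.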

In one embodiment, the display device may include a display screen, through which the service content is displayed. In another embodiment, the display device includes an electronic device with input and output interaction functions; the service content is displayed through the electronic device, which can also receive information that the user inputs to it (such as voice instructions or hardware instructions) and operate on the displayed service content according to that input. The input information may, for example, instruct page operations on the displayed service content such as zooming in, zooming out, sliding, or rotating; it may instruct activation operations such as clicking a link provided in the displayed service content; or it may instruct a reply to a selection or confirmation query provided in the displayed service content. In certain application scenarios, the display screen is a touch screen.

In one embodiment, the service content may be obtained from an internal server included in the device, or obtained from an external server over a network. Further, in one embodiment, the terminal machine may include an internal server that stores a database of service content for display; in the internal server, the service content is stored in association with the first voiceprint information, or with the first voiceprint information and the second voiceprint information. In another embodiment, the display device further includes a transceiver unit. The transceiver unit sends the first voiceprint information to the external server, or, when the second voiceprint information passes identity verification, sends the first voiceprint information and the second voiceprint information to the external server, and obtains the corresponding service content from the external server for display on the display screen; in the external server, the service content is stored in association with the first voiceprint information, or with the first voiceprint information and the second voiceprint information. In another embodiment, the first voice information acquisition device is connected to the display device (not shown in the figure). The display device obtains the first audio content from the first voice information acquisition device and sends it to the external server, and the external server provides service content according to the information in the first audio content and the first voiceprint information (or the first voiceprint information and the second voiceprint information).
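One way to picture the associated storage described above is a lookup keyed by the group's first voiceprint information and, optionally, a verified second voiceprint. The sketch below is an assumption for illustration only: the identifiers `vp_A` and `vp_B`, the `STORE` contents, and the key layout are hypothetical, and the embodiment does not prescribe a storage format.

```python
from typing import Optional

# Hypothetical store: (set of first voiceprint IDs, verified second
# voiceprint ID or None) -> service content associated with that key.
STORE = {
    (frozenset({"vp_A", "vp_B"}), None): ["Company introduction"],
    (frozenset({"vp_A", "vp_B"}), "vp_A"): ["Company introduction", "Ticket booking"],
}

def lookup(first_ids, verified_second: Optional[str] = None):
    """Return the content stored for this voiceprint key, or an empty list."""
    # The second voiceprint is part of the key only after it has passed
    # identity verification; otherwise the caller passes None.
    key = (frozenset(first_ids), verified_second)
    return STORE.get(key, [])
```

Using a `frozenset` for the first voiceprints makes the lookup independent of the order in which speakers were recognized.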

In certain application scenarios, the first voice information acquisition device and the second voice information acquisition device may continuously collect the audio content within their receiving ranges, or may start or stop collecting it according to an instruction from the user or from the display device.

It should be understood that the displayed service content should carry playback rights that allow it to be played to the persons currently present.

In certain application scenarios, a function that can be executed only after identity verification appears in the service content. Before the function is executed, identity verification is performed through the second voice information acquisition device, the voiceprint recognition device, and the identity verification device. First, the display device notifies the relevant person to go to the position of the second voice information acquisition device and sends an instruction to that device. The second voice information acquisition device receives the instruction and collects second audio content. The voiceprint recognition device performs voiceprint recognition on the second audio content and obtains the second voiceprint information in it. The identity verification device performs identity verification on the second voiceprint information and, after verification passes, sends the second voiceprint information to the display device.
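The gating of a protected function on identity verification can be sketched as below. The embodiment treats voiceprint-based verification as prior art and does not specify a method; comparing feature vectors by cosine similarity against enrolled templates with a threshold of 0.8 is an assumption made purely for illustration, as are all names in the code.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def verify(second_voiceprint, enrolled, threshold=0.8):
    """Return the matching enrolled person, or None if verification fails."""
    # Assumption: verification passes when the similarity to an enrolled
    # template reaches the threshold.
    for person, template in enrolled.items():
        if cosine(second_voiceprint, template) >= threshold:
            return person
    return None

def run_protected(function, second_voiceprint, enrolled):
    """Execute the protected function only after verification succeeds."""
    person = verify(second_voiceprint, enrolled)
    if person is None:
        return "verification failed"
    return function(person)
```

A function such as on-the-spot ticket booking would be passed in as `function`, so it runs only for a person whose second voiceprint has been verified.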

The terminal machine proposed by the embodiments of the present invention can collect normal multi-person conversation content through the first voice information acquisition device, without requiring the user to interact with an intelligent terminal specifically. It can also perform identity checks through the dedicated second voice information acquisition device without affecting the normal acquisition of the multi-person voiceprint information. Because the identity check uses the same voiceprint recognition approach as the human-computer interaction, no additional hardware architecture such as face recognition needs to be introduced, so the overall architecture is simple and practical.

The embodiments of the present invention also propose a computer-readable storage medium that stores a computer program for executing the foregoing method.

The embodiments of the present invention also propose a computer device that includes a processor and the above computer-readable storage medium operatively connected to the processor; the processor runs the computer program stored in the computer-readable medium.

Those skilled in the art will understand that the logic and/or steps represented in the flowcharts or otherwise described herein may, for example, be regarded as an ordered list of executable instructions for implementing logic functions, and may be embodied in any computer-readable medium for use by, or in connection with, an instruction execution system, apparatus, or device (such as a computer-based system, a system including a processor, or another system that can fetch instructions from an instruction execution system, apparatus, or device and execute them). For the purposes of this specification, a "computer-readable medium" may be any apparatus that can contain, store, communicate, propagate, or transmit a program for use by, or in connection with, an instruction execution system, apparatus, or device.

More specific examples (a non-exhaustive list) of computer-readable media include the following: an electrical connection (electronic device) with one or more wires, a portable computer diskette (magnetic device), random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CD-ROM). In addition, the computer-readable medium may even be paper or another suitable medium on which the program can be printed, because the program can be obtained electronically, for example by optically scanning the paper or other medium and then editing, interpreting, or otherwise processing it in a suitable way if necessary, and then stored in a computer memory.

It should be understood that each part of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, multiple steps or methods may be implemented by software or firmware that is stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, they may be implemented by any one or a combination of the following techniques well known in the art: a discrete logic circuit having logic gates for implementing logic functions on data signals, an application-specific integrated circuit with suitable combinational logic gates, a programmable gate array (PGA), a field-programmable gate array (FPGA), and so on.

In the description of this specification, a description referring to the terms "one embodiment", "some embodiments", "example", "specific example", or "some examples" means that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic uses of these terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.

The embodiments of the present invention have been described above. However, the present invention is not limited to these embodiments. Any modification, equivalent substitution, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims

1. A device for providing ancillary services using voice, characterized by comprising: a voice information acquisition module, an identity recognition module, a speech analysis module, and a service content providing module; wherein,

the identity recognition module, in response to the voiceprints of multiple people appearing in the audio content collected by the voice information acquisition module, performs voiceprint recognition on the audio content collected by the voice information acquisition module and obtains the voiceprint information of the multiple people; and, based on the obtained voiceprint information of the multiple people and a pre-formed scene voiceprint data set, determines the identity type to which each person belongs and assembles the identity types to which the persons belong into an identity type combination of the multiple people; the scene voiceprint data set characterizes the association between a person's voiceprint information and identity type;

the speech analysis module performs speech recognition on the audio content collected by the voice information acquisition module to obtain voice content, and performs semantic recognition on the obtained voice content to obtain key information; and

the service content providing module provides service content for display based on the permission level corresponding to the identity type combination of the multiple people, the key information, and a preset user preference.

2. The device according to claim 1, characterized in that the service content providing module determines, according to the identity type combination of the multiple people determined by the identity recognition module, the permission level that the identity type combination has; provides, according to the key information obtained by the speech analysis module, alternative service content that meets the permission level; and determines service content from the alternative service content according to the preset user preference and provides the determined service content for display.

3. The device according to claim 1, characterized in that the device further comprises an external control interface, the external control interface being used to receive a voice instruction or hardware instruction issued by a user and, based on the voice instruction or hardware instruction, to change the provided service content or change the displayed service content.

4. The device according to claim 1 or 3, characterized in that the voice information acquisition module continuously collects the audio content within its receiving range, or starts or stops collecting the audio content within its receiving range according to an instruction of the user.

5. The device according to claim 1, characterized in that the service content providing module comprises an electronic device with input and output interaction functions, outputs the provided service content through the electronic device, and can receive information that a user inputs to the electronic device and operate on the output service content according to the input information.

6. The device according to claim 1, characterized in that if the obtained voiceprint information of a person is already saved in the scene voiceprint data set, the identity type associated in the scene voiceprint data set with that person's voiceprint information is determined to be the identity type to which the person belongs; if the obtained voiceprint information of a person is not saved in the scene voiceprint data set, the identity type to which the person belongs is determined to be stranger.

7. The device according to claim 1, characterized in that the provided service content includes a function that is permitted to execute only after the identity type is verified again, and the function is executed in response to the identity type passing verification;

and/or

the provided service content is assigned a security level, and when the security level requires an identity check of the relevant personnel, only the voiceprint information of the relevant personnel is confirmed.

8. A method for providing ancillary services using voice, characterized by comprising:

in response to the voiceprints of multiple people appearing in collected audio content, performing voiceprint recognition on the collected audio content and obtaining the voiceprint information of the multiple people;

based on the obtained voiceprint information of the multiple people and a pre-formed scene voiceprint data set, determining the identity type to which each person belongs, and assembling the identity types to which the persons belong into an identity type combination of the multiple people; the scene voiceprint data set characterizes the association between a person's voiceprint information and identity type;

performing speech recognition on the collected audio content, and performing semantic recognition on the recognized voice content to obtain key information; and

providing service content for display based on the permission level corresponding to the identity type combination of the multiple people, the key information, and a preset user preference.

9. The method according to claim 8, characterized in that providing service content for display based on the permission level corresponding to the identity type combination of the multiple people, the key information, and the preset user preference comprises:

determining service content from alternative service content according to the preset user preference, and providing the determined service content for display.

10. The method according to claim 8, characterized in that the method further comprises:

receiving a voice instruction or hardware instruction issued by a user, and changing the provided service content or changing the displayed service content.