CN111199032A - Identity authentication method and device - Google Patents

Identity authentication method and device

Info

Publication number
CN111199032A
Authority
CN
China
Prior art keywords
user
identity authentication
identified
identity
intelligent terminal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911408254.6A
Other languages
Chinese (zh)
Inventor
杨长盛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Priority to CN201911408254.6A priority Critical patent/CN111199032A/en
Publication of CN111199032A publication Critical patent/CN111199032A/en
Priority to PCT/CN2020/128954 priority patent/WO2021135685A1/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/30Authentication, i.e. establishing the identity or authorisation of security principals
    • G06F21/31User authentication
    • G06F21/32User authentication using biometric data, e.g. fingerprints, iris scans or voiceprints

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Collating Specific Patterns (AREA)

Abstract

The application provides an identity authentication method and device, applied in the field of Artificial Intelligence (AI) and in particular applicable to intelligent terminal devices such as intelligent robots for user identity authentication. The method includes: acquiring information on multi-modal biometric features of a user to be identified, where the multi-modal biometric features include at least two biometric features of the user; performing identity authentication on the user in parallel according to the at least two biometric features; and determining the identity authentication result of the user according to the recognition results obtained from the parallel authentication, where the identity authentication result is obtained based on the confidence that the multi-modal biometric features match preset biometric features. The technical scheme of the application can effectively avoid the interference factors present when identity authentication relies on a single biometric feature, thereby improving the robustness and accuracy of the identity authentication method.

Description

Identity authentication method and device
Technical Field
The present application relates to the field of artificial intelligence, and more particularly, to a method and apparatus for identity authentication.
Background
Artificial Intelligence (AI) refers to theories, methods, techniques and application systems that use digital computers, or machines controlled by digital computers, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge and use that knowledge to obtain the best results. In other words, artificial intelligence is a branch of computer science that attempts to understand the essence of intelligence and to produce new intelligent machines that can react in a manner similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that the machines have the capabilities of perception, reasoning and decision making. Research in the field of artificial intelligence includes robotics, natural language processing, computer vision, decision and reasoning, human-computer interaction, recommendation and search, AI basic theory, and the like.
With the development of science and technology, intelligent terminals (e.g., intelligent robots) have gradually entered ordinary households, and people increasingly use intelligent terminal devices for learning, communication, housework, entertainment and companionship. At present, technologies that perform fused identity authentication based on biometric features generally authenticate multiple biometric features sequentially: recognition of the second biometric feature is started only if the confidence of the first biometric recognition falls within a preset confidence interval, and when the confidence of the first biometric recognition is smaller than the minimum value of that interval, authentication fails outright. In other words, if an acquired biometric feature cannot be recognized owing to the intelligent terminal or the user's own circumstances (for example, poor ambient light, a blurred photo, or the terminal failing to capture a frontal face image; or, for voice, loud background noise, the user not speaking, the user speaking too quietly, or the user's voice changing because of a cold), the accuracy with which the intelligent terminal device identifies the user is low, and the user's identity may not be recognized at all.
Therefore, how to avoid the interference factors that arise when identity authentication relies on a single biometric feature has become an urgent problem to be solved.
Disclosure of Invention
The application provides an identity authentication method and device, which can effectively avoid the interference factors that arise when identity authentication relies on a single biometric feature, thereby improving the robustness and accuracy of the identity authentication method.
In a first aspect, a method for identity authentication is provided, including: acquiring information on multi-modal biometric features of a user to be identified, where the multi-modal biometric features include at least two biometric features of the user; performing identity authentication on the user in parallel according to the at least two biometric features; and determining the identity authentication result of the user according to the recognition results obtained from the parallel authentication, where the identity authentication result is obtained based on the confidence that the multi-modal biometric features match preset biometric features.
The identity authentication method may be executed by an intelligent terminal, where the intelligent terminal may be an intelligent robot, an intelligent camera product, an intelligent home control center product, or other intelligent terminal equipment.
For example, the intelligent terminal may acquire information on multi-modal biometric features of the user to be identified, where the multi-modal biometric features include at least two biometric features of the user; the intelligent terminal performs identity authentication on the user in parallel according to the at least two biometric features; and the intelligent terminal fuses the recognition results obtained from the parallel authentication to obtain the identity authentication result of the user, where the identity authentication result is used to indicate the confidence that the user to be identified matches a preset user.
Alternatively, the multi-modal biometric features may include the face image, voice information, iris information, fingerprint information, vein information, and other information related to the biometric characteristics of the user to be identified.
Optionally, the preset biometric features may be enrolled as follows: the owner (a user of the intelligent terminal) sets, through an application on a mobile terminal, the identity information and permissions of the owner and of other users (for example, family members), and enters the corresponding biometric features, such as face information and voiceprint information, used for user identity authentication; the identity information may include, but is not limited to, the user's age, gender, preferences, family membership, and the like.
In the embodiments of the application, identity authentication can be performed in parallel on the acquired multi-modal biometric features of the user to be identified, and the recognition results corresponding to the different biometric features are then fused to obtain the identity authentication result. Authenticating the multi-modal biometric features in parallel avoids the problem in serial authentication that an interference factor affecting a single biometric feature ultimately makes the authentication of the user inaccurate. That is, in the embodiments of the application, the user can be authenticated in parallel through multiple biometric modalities, which avoids the interference factors of single-modality authentication and improves the robustness and accuracy of the identity authentication method.
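The parallel-then-fuse flow described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the matcher functions, the weighted-average fusion rule, and all names are illustrative assumptions; the patent only requires that per-modality recognitions run in parallel and that their confidences are fused into one result.

```python
from concurrent.futures import ThreadPoolExecutor

# Placeholder single-modality matchers; each returns a confidence in [0, 1].
# Real systems would run face/voiceprint models here.
def recognize_face(sample, enrolled):
    return 0.9 if sample == enrolled else 0.2

def recognize_voice(sample, enrolled):
    return 0.8 if sample == enrolled else 0.1

MATCHERS = {"face": recognize_face, "voice": recognize_voice}

def authenticate(samples, enrolled, weights, threshold):
    """Run each single-modality matcher in parallel, then fuse the
    per-modality confidences (weighted average) into one decision."""
    with ThreadPoolExecutor() as pool:
        futures = {m: pool.submit(MATCHERS[m], samples[m], enrolled[m])
                   for m in samples}
        scores = {m: f.result() for m, f in futures.items()}
    fused = (sum(weights[m] * scores[m] for m in scores)
             / sum(weights[m] for m in scores))
    return fused >= threshold, fused
```

Because both matchers always run, a blurred photo only lowers one term of the fused score instead of aborting the whole authentication, which is the robustness gain the embodiment claims over serial authentication.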
In a possible implementation, the user to be identified may be authenticated in parallel according to each of the multi-modal biometric features, yielding at least two recognition results; the biometric features and the recognition results may correspond one to one, i.e., each biometric feature of the user corresponds to one recognition result.
In another possible implementation, identity authentication is performed on the user in parallel according to only some of the multi-modal biometric features. For example, according to the priority of each of the multiple biometric features, a subset of high-priority features is selected, and identity authentication is then performed on the user in parallel according to that subset.
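The priority-based selection in this implementation could look like the sketch below. The convention that a lower number means a higher priority, and the function and dictionary names, are assumptions for illustration; the patent text does not specify the selection rule.

```python
def select_by_priority(captured, priority, k):
    """Return the k highest-priority modalities among those actually
    captured, where lower priority values mean higher priority
    (assumed convention)."""
    return sorted(captured, key=lambda m: priority[m])[:k]
```

The selected subset would then be fed to the parallel authentication step in place of the full multi-modal set.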
It should be understood that performing identity authentication on the user in parallel may mean inputting the different biometric features of the acquired multi-modal set into different recognition systems at the same time; alternatively, it may mean inputting the different biometric features into different recognition systems one after another, so as to obtain the different recognition results corresponding to the different biometric features.
With reference to the first aspect, in certain implementations of the first aspect, performing identity authentication on the user to be identified in parallel according to the at least two biometric features includes: performing the parallel identity authentication when an instruction from the user to wake up the intelligent terminal is detected, or when an instruction from the user directing the intelligent terminal to execute a preset service is detected.
The identity authentication method provided in the embodiments of the application may be a continuous identity authentication method: after the user has woken up the intelligent terminal, whenever the user issues a further instruction to the terminal, the terminal can authenticate the user again based on that instruction, for example at the moment the instruction is issued. In this way the permission to execute each issued instruction is controlled, avoiding the security risk to user privacy data that would arise if any user who comes into contact with the intelligent terminal could use it.
With reference to the first aspect, in certain implementations of the first aspect, performing identity authentication on the user to be identified in parallel according to the at least two biometric features includes: performing the parallel identity authentication when an instruction from the user to wake up the intelligent terminal is detected and the detected time interval is greater than a preconfigured time interval, where the time interval refers to the interval between the time of the last identity authentication and the current time.
In the embodiments of the application, the provided identity authentication method may be continuous: when a user wakes up the intelligent terminal, the user who issued the wake-up instruction can be authenticated; and while the terminal is awake, i.e., in a non-locked state, the user can be authenticated again when an instruction is issued or when the preconfigured time interval elapses. This effectively prevents a situation in which a user wakes up the intelligent terminal (for example, an intelligent robot) and then leaves, after which anyone who comes into contact with the terminal can use it, causing security problems such as leakage of the user's private data stored on the terminal.
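The interval check behind this continuous authentication can be sketched as follows. This is a minimal illustration assuming a monotonic clock and a single stored timestamp; the class and method names are hypothetical.

```python
import time

class ContinuousAuthenticator:
    """Decide whether a fresh parallel authentication is needed, based on
    the time elapsed since the last successful authentication."""

    def __init__(self, interval_s):
        self.interval_s = interval_s
        self.last_auth = None  # monotonic timestamp of the last authentication

    def needs_reauth(self, now=None):
        # Re-authenticate if never authenticated, or if the preconfigured
        # interval has elapsed since the last authentication.
        now = time.monotonic() if now is None else now
        return self.last_auth is None or (now - self.last_auth) > self.interval_s

    def record_auth(self, now=None):
        self.last_auth = time.monotonic() if now is None else now
```

A terminal would call `needs_reauth()` both when an instruction is issued and on a timer while awake, and call `record_auth()` after each successful parallel authentication.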
With reference to the first aspect, in certain implementations of the first aspect, the method further includes: detecting image information of the user to be identified in a user tracking mode within the preconfigured time interval, where the user tracking mode includes at least one of user face tracking, user skeleton recognition and pedestrian re-identification.
In the embodiments of the application, after the user to be identified has passed identity authentication, that is, after the user has been matched with a preset user, continuous identity authentication can be carried out dynamically and non-intrusively through the preconfigured time interval and the user tracking mode, avoiding the slow instruction execution and poor user experience that would result from requiring identity authentication before every instruction.
With reference to the first aspect, in certain implementations of the first aspect, the user to be identified is determined to pass identity authentication when the confidence that the multi-modal biometric features match the preset biometric features is greater than or equal to a preset threshold, where the preset threshold is preconfigured according to the service or instruction concerned.
Optionally, the confidence threshold for broadcasting the time or the weather may be set to U = 0; the confidence threshold for storytelling may be set to U = 0.5, at which level personalized service can be provided; the confidence threshold for schedule reminders may be set to U = 0.6; when a chat with the intelligent robot involves private information, the confidence threshold may be set to U = 0.7; and for transferring money, the confidence threshold may be set to U = 0.8. By comparing the confidence of the user's identity authentication with these different thresholds, it can be determined whether the user has permission for a given service or instruction.
In the embodiments of the application, different preset thresholds can be configured for different service instructions, that is, according to the privacy-protection or security risk level of each preset service instruction; this both improves the user experience and safeguards the user's privacy.
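A per-service permission check using the example thresholds above could be sketched as follows; the English service labels are illustrative stand-ins for the services named in the example.

```python
# Per-service confidence thresholds, taken from the example values above.
SERVICE_THRESHOLDS = {
    "broadcast_time_or_weather": 0.0,
    "tell_story": 0.5,
    "schedule_reminder": 0.6,
    "private_chat": 0.7,
    "transfer_money": 0.8,
}

def has_permission(service, confidence):
    """True if the fused authentication confidence meets the threshold
    configured for the requested service."""
    return confidence >= SERVICE_THRESHOLDS[service]
```

With this layout, a fused confidence of 0.55 suffices for storytelling but not for a money transfer, matching the risk-graded behavior described above.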
With reference to the first aspect, in certain implementations of the first aspect, the method further includes: providing a personalized service for the user to be identified when the user passes identity authentication, where the personalized service is obtained according to the behavior attributes of the user.
It should be understood that the behavior attribute of the user may refer to the user's hobbies or the user's behavior habits, etc.
In the embodiments of the application, after the user to be identified has passed identity authentication, personalized services can be provided to different users according to their identity information; that is, when different users issue the same instruction, different services can be provided based on each user's attribute information, for example the user's behavior, thereby improving the user experience.
The intelligent terminal (e.g., intelligent robot) may provide the user with a planned staged learning task based on the identity of the user to be identified.
Alternatively, differential personalized user services may be automatically provided according to the age, gender, region, etc. of the user to be identified.
Optionally, based on the identity of the user to be identified, and according to the behavior preference of the user configured in advance through the APP in the mobile terminal, a corresponding service may be provided.
Optionally, the behavior and preference analysis of the user may be actively performed based on the identity of the user to be identified, so as to provide personalized services according to the preference of the user.
Alternatively, the latest interests and hobbies of the user can be known and different services can be provided according to the daily conversation between the user and the intelligent robot based on the identity of the user to be identified.
With reference to the first aspect, in certain implementations of the first aspect, the at least two biometrics features include a face image of the user to be recognized and voice information of the user to be recognized.
In a second aspect, an apparatus for identity authentication is provided, including: an acquisition unit, configured to acquire information on multi-modal biometric features of a user to be identified, where the multi-modal biometric features include at least two biometric features of the user; and a processing unit, configured to perform identity authentication on the user in parallel according to the at least two biometric features, and to determine the identity authentication result of the user according to the recognition results obtained from the parallel authentication, where the identity authentication result is obtained based on the confidence that the multi-modal biometric features match preset biometric features.
The identity authentication apparatus may be implemented in an intelligent terminal, where the intelligent terminal may be an intelligent robot, an intelligent camera product, an intelligent home control center product, or other intelligent terminal equipment; alternatively, the apparatus may be a chip configured in the intelligent terminal.
For example, an intelligent terminal is provided, including: an acquisition unit, configured to acquire information on multi-modal biometric features of a user to be identified, where the multi-modal biometric features include at least two biometric features of the user; and a processing unit, configured to perform identity authentication on the user in parallel according to the at least two biometric features, and to determine the identity authentication result of the user according to the recognition results obtained from the parallel authentication, where the identity authentication result is obtained based on the confidence that the multi-modal biometric features match preset biometric features.
Alternatively, the multi-modal biometric features may include the face image, voice information, iris information, vein information or fingerprint information of the user to be identified, i.e., information related to the user's biometric characteristics.
Optionally, the preset biometric features may be enrolled as follows: the owner (a user of the intelligent terminal) sets, through an application on a mobile terminal, the identity information and permissions of the owner and of other users (for example, family members), and enters the corresponding biometric features, such as face information and voiceprint information, used for user identity authentication; the identity information may include, but is not limited to, the user's age, gender, preferences, family membership, and the like.
In the embodiments of the application, identity authentication can be performed in parallel on the acquired multi-modal biometric features of the user to be identified, and the recognition results corresponding to the different biometric features are then fused to obtain the identity authentication result. Authenticating the multi-modal biometric features in parallel avoids the problem in serial authentication that an interference factor affecting a single biometric feature ultimately makes the authentication of the user inaccurate. That is, in the embodiments of the application, the user can be authenticated in parallel through multiple biometric modalities, which avoids the interference factors of single-modality authentication and improves the robustness and accuracy of the identity authentication apparatus.
In a possible implementation, the user to be identified may be authenticated in parallel according to each of the multi-modal biometric features, yielding at least two recognition results; the biometric features and the recognition results may correspond one to one, i.e., each biometric feature of the user corresponds to one recognition result.
In another possible implementation, identity authentication is performed on the user in parallel according to only some of the multi-modal biometric features. For example, according to the priority of each of the multiple biometric features, a subset of high-priority features is selected, and identity authentication is then performed on the user in parallel according to that subset.
It should be understood that performing identity authentication on the user in parallel may mean inputting the different biometric features of the acquired multi-modal set into different recognition systems at the same time; alternatively, it may mean inputting the different biometric features into different recognition systems one after another, so as to obtain the different recognition results corresponding to the different biometric features.
With reference to the second aspect, in some implementations of the second aspect, the acquisition unit is specifically configured to: perform identity authentication on the user to be identified in parallel according to the at least two biometric features when an instruction from the user to wake up the intelligent terminal is detected, or when an instruction from the user directing the intelligent terminal to execute a preset service is detected.
The identity authentication apparatus provided in the embodiments of the application can perform continuous identity authentication: after the user has woken up the intelligent terminal, whenever the user issues a further instruction to the terminal, the user can be authenticated again based on that instruction, for example at the moment the instruction is issued. In this way the permission to execute each issued instruction is controlled, avoiding the security risk to user privacy data that would arise if any user who comes into contact with the intelligent terminal could use it.
With reference to the second aspect, in some implementations of the second aspect, the acquisition unit is specifically configured to: perform identity authentication on the user to be identified in parallel according to the at least two biometric features when an instruction from the user to wake up the intelligent terminal is detected and the detected time interval is greater than a preconfigured time interval, where the time interval refers to the interval between the time of the last identity authentication and the current time.
In the embodiments of the application, the provided identity authentication apparatus can perform continuous identity authentication: when a user wakes up the intelligent terminal, the user who issued the wake-up instruction can be authenticated; and while the terminal is awake, i.e., in a non-locked state, the user can be authenticated again when an instruction is issued or when the preconfigured time interval elapses. This effectively prevents a situation in which a user wakes up the intelligent terminal (for example, an intelligent robot) and then leaves, after which anyone who comes into contact with the terminal can use it, causing security problems such as leakage of the user's private data stored on the terminal.
With reference to the second aspect, in certain implementations of the second aspect, the processing unit is further configured to: detect image information of the user to be identified in a user tracking mode within the preconfigured time interval, where the user tracking mode includes at least one of user face tracking, user skeleton recognition and pedestrian re-identification.
In the embodiments of the application, after the user to be identified has passed identity authentication, that is, after the user has been matched with a preset user, the identity authentication apparatus can carry out continuous identity authentication dynamically and non-intrusively through the preconfigured time interval and the user tracking mode, avoiding the slow instruction execution and poor user experience that would result from requiring identity authentication before every instruction.
With reference to the second aspect, in some implementations of the second aspect, the processing unit is configured to determine that the user to be identified passes identity authentication when the confidence that the multi-modal biometric features match the preset biometric features is greater than or equal to a preset threshold, where the preset threshold is preconfigured according to the service or instruction concerned.
Optionally, the confidence threshold for broadcasting the time or the weather may be set to U = 0; the confidence threshold for storytelling may be set to U = 0.5, at which level personalized service can be provided; the confidence threshold for schedule reminders may be set to U = 0.6; when a chat with the intelligent robot involves private information, the confidence threshold may be set to U = 0.7; and for transferring money, the confidence threshold may be set to U = 0.8. By comparing the confidence of the user's identity authentication with these different thresholds, it can be determined whether the user has permission for a given service or instruction.
In the embodiments of the application, different preset thresholds can be configured for different service instructions, that is, according to the privacy-protection or security risk level of each preset service instruction; this both improves the user experience and safeguards the user's privacy.
With reference to the second aspect, in certain implementations of the second aspect, the processing unit is further configured to: provide a personalized service for the user to be identified when the user passes identity authentication, where the personalized service is obtained according to the behavior attributes of the user.
It should be understood that the behavior attribute of the user may refer to the user's hobbies or the user's behavior habits, etc.
In the embodiments of the application, after the user to be identified has passed identity authentication, personalized services can be provided to different users according to their identity information; that is, when different users issue the same instruction, different services can be provided based on each user's attribute information, for example the user's behavior, thereby improving the user experience.
The intelligent terminal (e.g., intelligent robot) may provide the user with a planned staged learning task based on the identity of the user to be identified.
Alternatively, differential personalized user services may be automatically provided according to the age, gender, region, etc. of the user to be identified.
Optionally, based on the identity of the user to be identified, and according to the behavior preference of the user configured in advance through the APP in the mobile terminal, a corresponding service may be provided.
Optionally, the behavior and preference analysis of the user may be actively performed based on the identity of the user to be identified, so as to provide personalized services according to the preference of the user.
Alternatively, based on the identity of the user to be identified, the user's latest interests and hobbies can be learned from the daily conversations between the user and the intelligent robot, and different services can be provided accordingly.
With reference to the second aspect, in some implementations of the second aspect, the at least two biometric features include a face image of the user to be recognized and voice information of the user to be recognized.
In a third aspect, an apparatus for identity authentication is provided, including: a memory for storing a program; and a processor for executing the program stored in the memory, where, when the program stored in the memory is executed, the processor is configured to: acquire information of multi-modal biometric features of a user to be identified, where the multi-modal biometric features include at least two biometric features of the user to be identified; perform identity authentication on the user to be identified in parallel according to the at least two biometric features; and determine an identity authentication result of the user to be identified according to the recognition results obtained by performing identity authentication on the user to be identified in parallel, where the identity authentication result is obtained based on the confidence with which the multi-modal biometric features match preset biometric features.
In a possible implementation manner, the processor included in the apparatus is further configured to execute the identity authentication method in the first aspect and any implementation manner of the first aspect.
Alternatively, the memory may be located inside the processor, for example, may be a cache memory (cache) in the processor. The memory may also be located external to the processor and thus independent of the processor.
It will be appreciated that the extensions, definitions, explanations and descriptions of the relevant content in the first aspect above also apply to the same content in the third aspect.
In a fourth aspect, an intelligent terminal is provided, including: a memory for storing a program; and a processor for executing the program stored in the memory, where, when the program stored in the memory is executed, the processor is configured to: acquire information of multi-modal biometric features of a user to be identified, where the multi-modal biometric features include at least two biometric features of the user to be identified; perform identity authentication on the user to be identified in parallel according to the at least two biometric features; and determine an identity authentication result of the user to be identified according to the recognition results obtained by performing identity authentication on the user to be identified in parallel, where the identity authentication result is obtained based on the confidence with which the multi-modal biometric features match preset biometric features.
The intelligent terminal can be an intelligent robot, an intelligent camera product, an intelligent home control center product or other intelligent terminal equipment.
In a possible implementation manner, the processor included in the intelligent terminal is further configured to execute the identity authentication method in the first aspect and any one of the possible implementation manners of the first aspect.
Alternatively, the memory may be located inside the processor, for example, may be a cache memory (cache) in the processor. The memory may also be located external to the processor and thus independent of the processor.
It is to be understood that the extensions, definitions, explanations and descriptions of the relevant content in the first aspect above also apply to the same content in the fourth aspect.
In a fifth aspect, there is provided a computer program product comprising: computer program code for causing a computer to perform the method of identity authentication in any one of the implementations of the first aspect and the first aspect when the computer program code runs on a computer.
It should be noted that, all or part of the computer program code may be stored in the first storage medium, where the first storage medium may be packaged together with the processor or may be packaged separately from the processor, and this is not particularly limited in this embodiment of the present application.
In a sixth aspect, a computer-readable medium is provided, which stores program code that, when run on a computer, causes the computer to perform the method of identity authentication in the first aspect and any one of the implementations of the first aspect.
In a seventh aspect, a chip is provided, where the chip includes a processor and a data interface, and the processor reads an instruction stored in a memory through the data interface, and executes the method for identity authentication in any one of the implementations of the first aspect and the first aspect.
Optionally, as an implementation, the chip may further include a memory in which instructions are stored, and the processor is configured to execute the instructions stored in the memory; when the instructions are executed, the processor is configured to perform the method of identity authentication in the first aspect and any one of the implementations of the first aspect.
Drawings
FIG. 1 is a schematic diagram of a system architecture provided by an embodiment of the present application;
FIG. 2 is a schematic diagram of a multi-modal identity authentication apparatus according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a voiceprint recognition system provided by an embodiment of the present application;
fig. 4 is a schematic flow chart of a method of identity authentication provided by an embodiment of the present application;
FIG. 5 is a schematic flow chart diagram of a multi-modal converged identity authentication method provided by an embodiment of the application;
FIG. 6 is a schematic flowchart of an instruction-based multi-modal fusion identity authentication method provided by an embodiment of the present application;
FIG. 7 is a schematic flowchart of a time-window-based multi-modal fusion identity authentication method provided by an embodiment of the present application;
fig. 8 is a schematic block diagram of an apparatus for identity authentication provided by an embodiment of the present application;
fig. 9 is a schematic hardware structure diagram of an identity authentication apparatus according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Fig. 1 is a schematic diagram of a system architecture provided in an embodiment of the present application. As shown in fig. 1, the system 100 may include a mobile terminal 110, a smart robot 120, a user 130, and a cloud 140.
Exemplarily, the mobile terminal 110 may refer to a mobile phone terminal, a tablet computer, a notebook computer, etc., wherein an Application (APP) controlling the smart robot 120 may be installed in the mobile terminal 110.
Illustratively, the intelligent robot 120 may have certain artificial intelligence (AI) capabilities, such as speech synthesis (text-to-speech, TTS), automatic speech recognition (ASR), natural language processing (NLP), emotion computing, voiceprint recognition, face recognition, sound source localization, face tracking, skeleton recognition, and the like, and mainly provides functions such as daily communication, learning, entertainment, games, and safety companionship for the person under guardianship.
For example, the smart robot 120 may be configured with hardware such as a processor, a microphone, a camera, a Liquid Crystal Display (LCD), a speaker, a sensor, and various communication interfaces.
Illustratively, the cloud 140 may include an AI cloud, a service cloud, and the like, and the AI cloud and the service cloud may be combined or separate; the AI cloud may be used to run AI modules, for example, to provide the capabilities with which the intelligent robot 120 interacts with the user 130, such as ASR, NLP, and TTS; the service cloud may be used to provide cloud-side service capabilities for the APP running on the intelligent robot 120 and for the control APP on the mobile phone side.
The entities in the system 100 shown in fig. 1, when combined, have the following functions: one or more of an auditory function, a visual function, a voiceprint recognition function, a face recognition function, a sound source positioning function, a face tracking function, a skeleton recognition function and a multi-modal consistency determination function.
The auditory function may include a microphone, a speech recognition system, and a language understanding system; the microphone is generally deployed on the smart robot 120, and the AI capabilities (e.g., the speech recognition system and the language understanding system) may be deployed in the cloud 140 (e.g., the AI cloud) as well as on the smart robot 120.
For example, visual functions may include cameras, image recognition systems, image understanding systems; wherein, the camera is generally deployed on the intelligent robot 120; AI capabilities (e.g., image recognition system, image understanding system) may be deployed in cloud 140 (e.g., AI cloud) and may also be deployed on smart robot 120.
For example, voiceprint recognition (VPR) is a kind of biometric recognition; specifically, it refers to a technology for identifying a speaker according to the characteristics of the sound waves of the speaking user. Such identity recognition can be independent of accent and language, and can be used for both speaker identification and speaker verification. The main tasks of voiceprint recognition may include speech signal processing, voiceprint feature extraction, voiceprint modeling, voiceprint comparison, decision making, and the like, and the voiceprint recognition function may be deployed in the cloud 140 (e.g., the AI cloud) or in the intelligent robot 120.
For example, the face recognition function may be a technique for verifying identity from a person's face image based on face image analysis; specifically, it can provide various functions, including face detection and analysis, facial feature positioning, face tracking, face comparison, face verification, face living body detection, and the like.
For example, a sound source localization function may be used to localize the direction and distance of the sound source.
For example, the multi-modal consistency determination function can be implemented by the face tracking system in cooperation with the sound source localization system. For example, the biometric recognition results can be compared against the biometric features corresponding to all preset users, ensuring that, in a multi-user environment, the voiceprint recognition and the face recognition correspond to the same user, and preventing user A's voiceprint and user B's face from being fused for authentication.
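As a minimal sketch of such a consistency check, the sound-source direction can be compared with the tracked face direction; the azimuth-angle representation and the tolerance below are assumptions, not details given in the text:

```python
def same_user(sound_azimuth_deg: float, face_azimuth_deg: float,
              tolerance_deg: float = 15.0) -> bool:
    """Treat a voice and a face as belonging to the same user only if the
    sound-source direction and the tracked face direction agree within a
    tolerance. Angles in degrees; the tolerance value is illustrative."""
    diff = abs(sound_azimuth_deg - face_azimuth_deg) % 360.0
    # Take the shorter way around the circle so 355 deg and 5 deg match.
    return min(diff, 360.0 - diff) <= tolerance_deg
```

Only when this check passes would the voiceprint and face confidences be handed to the fusion determination system.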
The entities are connected through a network, and the functions of the entities are matched with each other, so that the functions of the intelligent robot 120 for continuously performing multi-mode converged identity authentication on the user 130 and providing personalized services are jointly completed.
It should be understood that, the above is exemplified by the intelligent robot 120, and the intelligent robot 120 may also be other intelligent terminals, for example, an intelligent camera product, or an intelligent home control center product, etc., for example, an intelligent home device Nest Hub Max.
Fig. 2 is a schematic diagram of an apparatus 200 for multi-modal identity authentication provided in an embodiment of the present application. The device 200 may be deployed on the intelligent robot 120 shown in fig. 1, or may also be deployed on other intelligent terminals with identity authentication requirements.
As shown in fig. 2, the apparatus 200 may include a camera 210, a microphone 220, a face recognition system 230, a voiceprint recognition system 240, and a fusion determination system 250.
The camera 210 may have the basic functions of capturing video and capturing or transmitting still images; an image is captured through the lens, then processed by the photosensitive component circuit and the control component inside the camera, and converted into a digital signal that a computer can recognize.
Illustratively, the microphone 220 may refer to an energy conversion device that converts a sound signal into an electrical signal.
Illustratively, the face recognition system 230 is used for identity confirmation of the user according to a facial picture of the user; the face recognition system 230 may mainly include five components, which are face image acquisition and detection, face image preprocessing, face image feature extraction, matching and recognition, and may further include living body detection of the face image.
The face image acquisition in face image acquisition and detection may mean that a face image of a user is acquired through a camera lens, and face detection may serve as preprocessing for face recognition. For example, any given image can be examined to determine whether it contains a face; if a face is included, the position of the face, facial attributes, and the like can be returned. A face image can contain quite rich pattern features, such as histogram features, color features, template features, structural features, Haar features, and the like; face detection picks out the useful information among these and uses such features to implement detection. The face detection method may adopt a feature-based AdaBoost learning algorithm; AdaBoost is a classification method that combines weak classification methods to form a new, strong classification method.
In one implementation, when the AdaBoost algorithm is used in the face detection process, rectangular features (weak classifiers) that best represent the face can be selected, the weak classifiers are combined into a strong classifier by weighted voting, and several strong classifiers obtained through training are then connected in series to form a cascade classifier, which effectively improves the detection speed of the classifier.
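The weighted-voting and cascade structure can be sketched as follows; the weak classifiers, weights, and inputs are all illustrative stand-ins for the trained rectangular-feature classifiers, and the zero decision threshold is a simplification:

```python
def strong_classify(x, weak_classifiers):
    """Combine weak classifiers h_t (each returning +1 for face, -1 for
    non-face) into a strong classifier by weighted voting with weights
    alpha_t. The zero threshold is an illustrative simplification."""
    score = sum(alpha * h(x) for alpha, h in weak_classifiers)
    return 1 if score >= 0 else -1

def cascade_classify(x, stages):
    """Cascade of strong classifiers connected in series: a window is
    accepted only if every stage accepts it, so cheap early stages
    reject most non-face windows quickly."""
    return all(strong_classify(x, stage) == 1 for stage in stages)
```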
For example, face image preprocessing refers to a process of processing an image based on a face detection result and finally serving for feature extraction. For example, the original image acquired by the system is often not directly usable due to the limitation of various conditions and random interference, and needs to be subjected to image preprocessing such as gray scale correction and noise filtering in the early stage of image processing.
For example, for a face image, the preprocessing process may include light compensation, gray-scale transformation, histogram equalization, normalization, geometric correction, filtering, sharpening, and the like.
Features that may be used by the face recognition system 230 are generally classified into visual features, pixel statistical features, face image transform coefficient features, face image algebraic features, and the like; the face feature extraction is performed aiming at certain features of the face, is also called face characterization, and is a process for performing feature modeling on the face.
Illustratively, methods of face feature extraction can be divided into two broad categories: one is a knowledge-based characterization method; the other is a characterization method based on algebraic features or statistical learning.
The knowledge-based characterization method mainly obtains feature data which is helpful for face classification according to shape description of face organs and distance characteristics between the face organs, wherein feature components of the feature data generally comprise Euclidean distances, curvatures, angles and the like among feature points; the human face is composed of parts such as eyes, a nose, a mouth, a chin and the like, geometric description of the parts and the structural relationship among the parts can be used as important features for recognizing the human face, and the features are called as geometric features; the knowledge-based face characterization mainly comprises a geometric feature-based method and a template matching method.
The face image matching and recognition refers to a process of searching and matching the extracted feature data of a face image against the feature templates stored in a database: a preset threshold is set, and when the similarity exceeds the preset threshold, the matching result is output. Face recognition compares the face features to be recognized with the obtained face feature template, and judges the identity information of the face according to the degree of similarity. The process of face recognition can be divided into two categories: one is confirmation, a process of one-to-one image comparison; the other is identification, a process of one-to-many image matching and comparison.
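A minimal sketch of the threshold-based matching above; representing face features as vectors, measuring similarity with cosine similarity, and the 0.8 threshold are all assumptions, since the text does not fix a feature representation or metric:

```python
import math

def cosine_similarity(a, b):
    # Cosine of the angle between two feature vectors, in [-1, 1].
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def verify(features, template, threshold=0.8):
    """1:1 confirmation: pass if similarity exceeds the preset threshold."""
    return cosine_similarity(features, template) >= threshold

def identify(features, templates, threshold=0.8):
    """1:N identification: return the best-matching enrolled user whose
    similarity exceeds the threshold, or None for no match."""
    best_user, best_sim = None, threshold
    for user, template in templates.items():
        sim = cosine_similarity(features, template)
        if sim >= best_sim:
            best_user, best_sim = user, sim
    return best_user
```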
The face living body detection can comprise static living body detection and dynamic living body detection, wherein the face static living body detection is used for carrying out face living body detection on the uploaded static photos; and the dynamic human face living body detection is realized by detecting the lip language of the user or actions such as shaking head and blinking.
It should be understood that an intelligent terminal (e.g., an intelligent robot) is generally used in a user's home, where photos of family members may be hung on the wall; face living body detection can prevent such static photos from being mistakenly authenticated as a real user.
It should be further understood that the above-mentioned face recognition system is an example, the face recognition system is used for recognizing the identity of the user by collecting an image of the user, and the face recognition system may be in any form, which is not limited in this application.
The voiceprint recognition system 240 may be a system that identifies a speaker according to the characteristics of the speaker's sound waves; the main tasks of voiceprint recognition include speech signal processing, voiceprint feature extraction, voiceprint modeling, voiceprint comparison, decision making, and the like. The fusion determination system 250 can be used to perform a fusion determination on the confidences obtained by face recognition and voiceprint recognition, to obtain the final identity determination result for the user.
Exemplarily, fig. 3 is a schematic diagram of a voiceprint recognition system provided in an embodiment of the present application.
As shown in fig. 3, the voiceprint recognition system is mainly composed of two parts, a modeling process and an authentication process. The modeling process refers to the process from the target speaker's voice to the generation of a target voiceprint model; the authentication process matches the test voice against the target voiceprint model, scores the match, makes a decision, and finally gives an authentication result. Voiceprint recognition mainly involves two key technologies: one is feature extraction, and the other is pattern matching (pattern recognition).
The task of feature extraction is to extract and select acoustic or linguistic features of the speaker's voiceprint that are strongly separable and highly stable. Unlike speech recognition, whose features must be "generic" across speakers, the features used for voiceprint (speaker) recognition must be "personalized" to the speaker.
Generally, most voiceprint recognition systems employ acoustic features related to the anatomy of a human pronunciation mechanism, including: spectrum, cepstrum, formants, pitch, reflection coefficients, nasal sounds, breath sounds with deep breath, and hoarse sounds, etc.; from the aspect of modeling by using a mathematical method, the currently available features of the voiceprint automatic recognition model include: acoustic features (e.g., cepstrum), lexical features (e.g., speaker-dependent words n-gram, phoneme n-gram), prosodic features (e.g., pitch and energy "pose" described with n-gram), language, dialect and accent information, and channel information, among others.
Illustratively, the recognition methods that can be employed by the voiceprint recognition system may include the following:
First: a hidden Markov model (HMM) method, which generally uses a single-state HMM or a Gaussian mixture model (GMM).
Second: vector quantization (VQ) clustering methods (e.g., the LBG algorithm).
Third: neural network methods, such as multi-layer perceptrons and radial basis functions, which can be explicitly trained to distinguish a speaker from the background speakers.
Fourth: the nearest neighbor method, which keeps all the feature vectors during training, finds the K nearest training vectors for each vector during recognition, and performs recognition accordingly.
Fifth: polynomial classifier methods.
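The nearest neighbor method described above can be sketched as follows; the speaker labels, feature vectors, Euclidean distance, and K = 3 are all illustrative assumptions:

```python
import math
from collections import Counter

def knn_identify(test_vec, enrolled, k=3):
    """Keep all enrolled feature vectors; for a test vector, find the K
    nearest training vectors and vote on the speaker label.
    `enrolled` maps speaker label -> list of training feature vectors."""
    dists = sorted(
        (math.dist(test_vec, vec), label)
        for label, vecs in enrolled.items()
        for vec in vecs
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]
```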
It should also be understood that the above-mentioned voiceprint recognition system is an example, the voiceprint recognition system is used for recognizing the identity of the user by acquiring the voice information of the user, and the voiceprint recognition system may be in any form, which is not limited in this application.
At present, technologies for fusion identity authentication based on biometric features generally perform sequential authentication on multiple biometric features: only if the confidence of recognizing the first biometric feature falls within a preset confidence interval is recognition of the second biometric feature started, and when the confidence of the first biometric feature is smaller than the minimum value of the confidence interval, authentication fails directly. In other words, if an acquired biometric feature cannot be recognized due to the intelligent terminal or the user's own factors (for example, the ambient light is poor so the intelligent terminal captures a blurred image, or the terminal does not capture a frontal face image of the user; or, during voice detection, the background noise is high, the user does not speak, the user speaks softly, or the user's voice has changed because of a cold), the accuracy with which the intelligent terminal recognizes the user's identity is low, and the identity may not be recognizable at all.
In an embodiment of the present application, information of multi-modal biometric features of a user to be identified may be obtained, recognition results are obtained by performing authentication in parallel according to a plurality of the biometric features, and an identity authentication result of the user to be identified is obtained by fusing the recognition results, where the identity authentication result can indicate the confidence of the match between the user to be identified and a preset user. The identity authentication method provided in this embodiment can authenticate the user to be identified through multi-modal biometric features in parallel, thereby avoiding the single-modality interference factors that may exist in a serial identity authentication method and improving the robustness and accuracy of the identity authentication method.
The identity authentication method provided in the embodiment of the present application is described in detail below with reference to fig. 4 to 7.
Fig. 4 is a schematic flowchart of a method for identity authentication provided in an embodiment of the present application. The identity authentication method 300 may be performed by an apparatus capable of identity authentication, for example, an intelligent terminal device, such as the intelligent robot 120 shown in fig. 1. The method 300 includes steps 310 to 330, and the steps 310 to 330 are described in detail below.
Step 310: acquiring information of multi-modal biometric features of a user to be identified.
The multi-modal biometric features may include at least two biometric features of the user to be identified.
For example, the multi-modal biometric feature may include information related to the biometric feature of the user to be recognized, such as a face image of the user to be recognized, voice information, iris information, vein information, or fingerprint information.
Step 320: performing identity authentication on the user to be identified in parallel according to the at least two biometric features.
For example, the identity of the user to be identified is authenticated in parallel based on each of the multi-modal biometrics.
For example, the identity of the user to be identified is authenticated according to partial biological characteristics in the multi-modal biological characteristics in parallel. For example, according to the priority of each biological feature in the multiple biological features, a part of biological features with high priority is selected, and then identity authentication is performed on the user to be identified in parallel according to the part of biological features.
It should be understood that performing identity authentication on the user to be recognized in parallel may mean inputting different ones of the obtained multi-modal biometric features into different recognition systems at the same time to authenticate the user; alternatively, it may mean inputting the different biometric features into the different recognition systems one after another, so as to obtain the different recognition results corresponding to the different biometric features.
In one example, the multi-modal biometric features may include a facial image of the user to be recognized, and the user to be recognized may be authenticated by the facial recognition system 230 shown in fig. 2; alternatively, the multi-modal biometric feature may include voice information of the user to be recognized, and the user to be recognized may be authenticated by the voiceprint recognition system 240 shown in fig. 2.
The face recognition system and the voiceprint recognition system may perform identity authentication on the user to be recognized in parallel; that is, the face image and the voice information of the user to be recognized may be input simultaneously to the face recognition system 230 and the voiceprint recognition system 240, respectively, to perform the identity authentication processes in parallel; alternatively, the face image and the voice information may be input to the face recognition system 230 and the voiceprint recognition system 240 one after another to perform the identity authentication processes.
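A minimal sketch of submitting both modalities to their recognition systems at the same time, using a thread pool; the two confidence-returning functions are illustrative stubs standing in for the face recognition system 230 and the voiceprint recognition system 240:

```python
from concurrent.futures import ThreadPoolExecutor

def face_confidence(face_image):
    # Hypothetical stub for the face recognition system's confidence output.
    return 0.9

def voiceprint_confidence(voice_info):
    # Hypothetical stub for the voiceprint recognition system's confidence output.
    return 0.7

def authenticate_in_parallel(face_image, voice_info):
    """Submit both biometric features to their recognition systems at the
    same time and collect the two per-modality confidence scores."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        face_fut = pool.submit(face_confidence, face_image)
        voice_fut = pool.submit(voiceprint_confidence, voice_info)
        return face_fut.result(), voice_fut.result()
```

The two scores would then be handed to the fusion determination system.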
Step 330: determining an identity authentication result of the user to be recognized according to the recognition results obtained by performing identity authentication on the user to be recognized in parallel, where the identity authentication result is obtained based on the confidence with which the multi-modal biometric features match preset biometric features.
The preset biometric features may be configured as follows: an owner (a user of the intelligent terminal) sets the identity information and authority of the owner and of other users (for example, family members) through the APP in the mobile terminal shown in fig. 1, and enrolls the corresponding biometric features, such as face information and voiceprint information, for identity authentication of the users; the identity information may include, but is not limited to, the user's age, gender, preferences, family membership, and the like.
For example, the identification results obtained by performing identity authentication on the users to be identified in parallel may be fused to obtain the identity authentication result of the users to be identified, where the identity authentication result of the users to be identified may refer to that the users to be identified pass identity authentication or that the users to be identified do not pass identity authentication.
For example, as shown in fig. 2, the recognition result of the face image of the user to be recognized obtained by the face recognition system 230 and the recognition result of the voice information of the user to be recognized obtained by the voiceprint recognition system 240 are input to the fusion determination system 250 for fusion, and the identity authentication result of the user to be recognized is finally obtained.
For example, in an embodiment of the present application, the multi-modal biometric features include at least two biometric features; at least two recognition results corresponding to the at least two biometric features can therefore be obtained by performing identity authentication on them in parallel, and fusing the recognition results obtained by performing identity authentication on the user to be identified in parallel may mean fusing the obtained at least two recognition results to obtain the identity authentication result of the user to be identified. Here, fusing the at least two recognition results may mean performing a weighted summation.
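The weighted-summation fusion can be sketched as follows; the weight values and the decision threshold are illustrative, since the text does not specify them:

```python
def fuse_confidences(confidences, weights, threshold=0.6):
    """Fuse per-modality recognition confidences by weighted summation and
    compare the fused confidence with a preset threshold. Weights are
    assumed to sum to 1; weight and threshold values are illustrative."""
    assert abs(sum(weights) - 1.0) < 1e-9
    fused = sum(c * w for c, w in zip(confidences, weights))
    return fused, fused >= threshold
```

For example, face confidence 0.9 and voiceprint confidence 0.5 with weights 0.6 and 0.4 fuse to 0.74, which would pass a 0.6 threshold.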
In the embodiment of the application, identity authentication can be performed in parallel on the obtained multi-modal biometric features of the user to be identified, and the recognition results corresponding to the different biometric features are then fused to obtain the identity authentication result of the user to be identified. Performing identity authentication on the multi-modal biometric features in parallel avoids the problem that, in a serial authentication process, an interference factor in a single biometric feature ultimately makes the identity authentication of the user inaccurate; that is, in the embodiment of the application, the user to be identified can be authenticated in parallel through multi-modal biometric features, so that single-modality interference factors are avoided and the robustness and accuracy of the identity authentication method are improved.
For example, the multi-modal biometric features may include M biometric features of the user to be identified, and the identity authentication process may be executed in parallel on the M biometric features at the same time, obtaining M corresponding recognition results; these M recognition results are fused to obtain the identity authentication result of the user to be identified. If identity authentication fails for every one of the M biometric features, it can be determined that the identity authentication of the user to be identified fails, i.e., that the user to be identified does not match a preset user; the intelligent terminal can then treat the user to be identified as a stranger and enter a visitor mode.
For example, the failure of the identity authentication of the user to be recognized may mean that the obtained information of the multi-modal biometric features cannot be used for identity authentication; it may also mean that the identity authentication result obtained through the multi-modal biometric features, i.e., the confidence that the user to be recognized matches the preset user, is smaller than the preset threshold.
Further, in an embodiment of the present application, the provided identity authentication method may be a persistent one. An intelligent terminal such as an intelligent robot, an intelligent camera, or an intelligent home control center differs from a mobile phone terminal: a mobile phone only needs to perform identity authentication when being started or unlocked, and can assume by default that the same user continues to use it after being woken up. When a user uses an intelligent terminal (such as an intelligent robot), however, the user may leave at any time; if identity authentication is performed only once, any user who reaches the intelligent robot while it is in a non-locked state can add, delete, change, and query the user privacy data in it, causing privacy disclosure and uncontrolled use permissions. Therefore, the intelligent robot needs to perform continuous identity authentication and authorization on the user.
Optionally, in a possible implementation manner, when an instruction from the user to be recognized to wake up the intelligent terminal is detected, the identity of the user to be recognized may be authenticated in parallel according to at least two biometric features included in the multi-modal biometric features; and when an instruction from the user to be identified for the intelligent terminal to execute a preset service is detected, identity authentication may be performed again in parallel on the user to be identified according to the at least two biometric features.
In the embodiment of the application, when the user to be identified wakes up the intelligent terminal and then issues the instruction to the intelligent terminal again, the user to be identified can be authenticated again through the instruction; for example, the intelligent terminal may perform identity authentication again on the user to be identified based on the instruction at the time when the user to be identified issues the instruction; through continuous identity authentication, the execution authority of the user for sending the instruction is controlled, and the safety problem of user privacy data caused by the fact that any user contacting the intelligent terminal can use the intelligent terminal is avoided.
Optionally, in a possible implementation manner, when an instruction from the user to be recognized to wake up the intelligent terminal is detected, the identity of the user to be recognized may be authenticated in parallel according to at least two biometric features included in the multi-modal biometric features; and when the detected time interval is larger than a preconfigured time interval, identity authentication may be performed again in parallel on the user to be identified according to the at least two biometric features, where the time interval may be the interval between the time of the last identity authentication and the current time.
In the embodiment of the application, when a user wakes up the intelligent terminal, identity authentication can be performed on the user to be identified who indicates the wake-up; while the intelligent terminal is in the wake-up (non-locked) state, identity authentication can be performed again, triggered by an instruction issued by the user to be identified or by the elapse of a preset time interval. Such continuous identity authentication effectively avoids the situation where, after a user wakes up the intelligent terminal (for example, an intelligent robot) and then leaves, any user who reaches the intelligent terminal can use it, which would cause security problems such as the leakage of the user's privacy data in the intelligent terminal.
Illustratively, within the preconfigured time interval, the image information of the user to be identified may be detected by a user tracking manner, wherein the user tracking manner includes at least one of user face tracking, user skeleton recognition and pedestrian re-recognition.
In the embodiment of the application, after the user to be identified passes the identity authentication, namely the user to be identified is matched with the preset user, the user can be dynamically and non-intrusively subjected to continuous identity authentication and authentication through the preset time interval and the user tracking mode, so that the problem that the execution speed of the instruction is low and the user experience is poor due to the fact that the identity authentication is required before the instruction is executed each time can be solved.
Optionally, in a possible implementation manner, when the confidence that the multi-modal biometric features match the preset biometric features is greater than or equal to a preset threshold, it may be indicated that the user to be identified passes identity authentication, that is, that the result of the identity authentication of the user to be identified is a pass; the preset threshold may be a threshold preconfigured for different services or instructions.
For example, the confidence threshold for broadcasting the time or the weather may be set to U = 0; the confidence threshold for storytelling may be set to U = 0.5, i.e., personalized service can be provided; the confidence threshold for schedule reminding may be set to U = 0.6; when a chat with the intelligent robot involves privacy information, the confidence threshold may be set to U = 0.7; and for transferring money, the confidence threshold may be set to U = 0.8. By comparing the identity authentication confidence of the user to be identified with these different confidence thresholds, it can be determined whether the user to be identified has the authority for a certain service or instruction.
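A hedged sketch of such per-service thresholds follows; the threshold values mirror the examples above, while the service names and the lookup function are illustrative assumptions:

```python
# Preconfigured confidence thresholds U per service (values taken from
# the examples in the text; service names are assumptions).
SERVICE_THRESHOLDS = {
    "broadcast_time": 0.0,
    "broadcast_weather": 0.0,
    "storytelling": 0.5,
    "schedule_reminder": 0.6,
    "private_chat": 0.7,
    "transfer_money": 0.8,
}


def is_authorized(service, identity_confidence):
    """True if the fused identity confidence meets the preconfigured
    threshold U for the requested service; unknown services default to
    the strictest threshold."""
    u = SERVICE_THRESHOLDS.get(service, 1.0)
    return identity_confidence >= u
```

For instance, a user authenticated with confidence 0.77 could be told a story or reminded of a schedule, but could not transfer money.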
In the embodiment of the application, different preset thresholds can be configured for different business instructions, that is, different preset thresholds can be configured according to the privacy protection risk level of the preset business instructions or the security risk level, so that the user experience can be increased, and the privacy security of the user can be guaranteed.
Optionally, in a possible implementation manner, in a case that the user to be identified passes the identity authentication, a personalized service may be provided to the user to be identified, where the personalized service may be obtained according to a behavior attribute of the user to be identified.
It should be understood that the behavior attribute of the user may refer to the user's hobbies or the user's behavior habits, etc.
Illustratively, after the intelligent terminal (e.g., the intelligent robot) completes the identity authentication and authorization of the user through the multi-mode converged identity authentication method, personalized services based on the user attributes can be provided according to different user identities.
In the embodiment of the application, once the intelligent terminal (for example, an intelligent robot) has authenticated and authorized the user, personalized services can be provided for different users according to the user's identity information; that is, when different users issue the same instruction, different services can be provided based on the attribute information of the different users, for example, their behavior, so that the user experience can be improved.
In one example, a smart terminal (e.g., a smart robot) may provide a user with a planned staged learning task based on the identity of the user to be identified.
In one example, a smart terminal (e.g., a smart robot) may automatically provide differentiated and personalized user services according to the age, gender, region, etc. of a user to be identified.
In one example, a smart terminal (e.g., a smart robot) may provide a corresponding service based on an identity of a user to be identified and according to a behavior preference of the user configured in advance through an APP in a mobile terminal.
In one example, a smart terminal (e.g., a smart robot) may actively perform behavior preference analysis of a user based on the identity of the user to be recognized, thereby providing personalized services according to the preference of the user.
In one example, from the user's daily conversations with the smart robot, a smart terminal (e.g., a smart robot) may learn the latest interests and tastes of the user to be identified and provide different services based on the user's identity.
In the embodiment of the application, the intelligent terminal can provide different services for different users according to the identities of the different users, such as gender, age, region, preferences and the like, so that the user experience is improved.
Fig. 5 is a schematic flowchart of an identity authentication method with multi-modal fusion provided in an embodiment of the present application. The identity authentication method 400 shown in fig. 5 includes steps 410 to 430, and the steps 410 to 430 are described in detail below.
And step 410, inputting identity information and authority of the user.
For example, an owner (a user of an intelligent terminal) may set identity information and permissions of the owner and other users (e.g., family members) through an APP in the mobile terminal as shown in fig. 1, and input corresponding face information and voiceprint information for identity authentication of the user; the identity information may include, but is not limited to, the user's age, gender, preferences, family membership, etc.
Further, the owner can also set the access authority of a visitor mode (i.e., for strangers); a visitor may be set to be unable to use the intelligent terminal, or to use only functions that do not involve user privacy.
The function that does not involve the user privacy may be a function that asks for time, asks for weather, and the like that do not involve the user's private data.
In the embodiment of the application, entering the face image and voiceprint file of each user of the intelligent terminal establishes the files used for face recognition and voiceprint recognition during subsequent user identity authentication.
Illustratively, an owner can use a camera to collect pictures of a user's face to form a face image file, generate a faceprint (Face print) code, and store it.
For example, in the system architecture shown in fig. 1, an owner may collect and store a facial image of a user through the mobile terminal 110, or may upload the collected facial image to the cloud 140.
Illustratively, the owner may collect the user's voice by using a microphone to form a voice file and store the voice file as a voiceprint code.
For example, in the system architecture shown in fig. 1, an owner may collect and store voice information of a user through the mobile terminal 110, or may upload the collected voice information to the cloud 140.
Step 420, authentication and authorization of the user.
When a user attempts to wake up an intelligent terminal (e.g., an intelligent robot) through a wake-up word, the user may be authenticated through one or more of the face recognition system 230, the voiceprint recognition system 240, and the fusion determination system 250 shown in fig. 2. After the identity of the user passes authentication, the user can be further authorized, i.e., it is judged whether the user has the authority to indicate a certain service or instruction. If authorization determines that the user has the authority to indicate the service or instruction, the intelligent terminal can execute it; if authorization determines that the user does not have the authority, the intelligent terminal may decline to execute the service or instruction indicated by the user.
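The two-stage authenticate-then-authorize flow above might be sketched as follows; the user names, permission sets, and return values are illustrative assumptions, not the patent's implementation:

```python
# Per-user permission sets entered by the owner (illustrative assumptions).
USER_PERMISSIONS = {
    "dad": {"broadcast_weather", "storytelling", "transfer_money"},
    "xiaoming": {"broadcast_weather", "storytelling"},
}


def handle_command(authenticated_user, service):
    """Step 420 as a two-stage check: identity first, then authority.

    authenticated_user -- matched preset user, or None if authentication failed
    """
    if authenticated_user is None:
        return "visitor_mode"            # identity authentication failed
    if service not in USER_PERMISSIONS.get(authenticated_user, set()):
        return "voice_prompt_denied"     # authenticated but not authorized
    return "execute"                     # authenticated and authorized
```

Separating the two stages lets a terminal recognize a family member yet still refuse commands outside that member's permissions.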
Optionally, the intelligent terminal may give the user a corresponding voice prompt when authorization of the user fails.
For example, in the embodiment of the present application, the identity authentication and authentication of the user may be performed by one or more of a face recognition system, a voiceprint recognition system, and a fusion determination system by using a fusion algorithm.
Assuming that the confidence obtained by the face recognition system is V (0 ≤ V ≤ 1) and the confidence obtained by the voiceprint recognition system is S (0 ≤ S ≤ 1), the method for authenticating the user's identity with the fusion algorithm is as follows:
Step one: when no human face can be detected around the intelligent terminal (for example, an intelligent robot), the face is turned away from the camera (the back of the head is captured), the angle between the face and the camera is too large (for example, more than 45 degrees), the captured picture is blurred, or the captured image is too dark, the face recognition system cannot acquire a usable face image for recognition; that is, the confidence of the face recognition system cannot be obtained, or the confidence is 0, or the confidence V is less than a threshold λ1 (assuming the face recognition threshold is λ1). In this case the user identity fusion judgment can rely on the voiceprint recognition system to obtain the identity authentication result of the user.
Step two: when the intelligent terminal (for example, the intelligent robot) cannot detect voice, or detects that the volume of the voice information is very low, or that the background noise is very loud, the voiceprint recognition system cannot acquire usable voice information; that is, the confidence of the voiceprint recognition system cannot be obtained, or the confidence is 0, or the confidence S is less than a threshold λ2 (assuming the voiceprint threshold is λ2). In this case the user identity fusion judgment can rely on the face recognition system to obtain the identity authentication result of the user.
Step three: when the confidence V output by the face recognition system for the face image of user A satisfies V < λ1, and the confidence S output by the voiceprint recognition system for the voice information of user A satisfies S < λ2, the identity authentication of user A fails.
Step four: when clearer voice and a clearer frontal face image are detected, the fusion judgment system can adopt the following algorithm to perform identity fusion judgment:
Assume the confidence output by the face recognition system for the face image of user A is V (λ1 ≤ V ≤ 1), and the confidence output by the voiceprint recognition system for the voice information of user A is S (λ2 ≤ S ≤ 1); the integrated identity confidence M is then calculated by the following formula (based on Dempster-Shafer evidence theory):

M = VS / (VS + (1 − V)(1 − S))
For example, if the face confidence of user A is V = 0.7 and the voiceprint confidence is S = 0.6, the formula above gives an integrated identity confidence of 0.42 / 0.54 ≈ 0.78 for user A.
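Steps one through four can be sketched together as follows; this is a hedged illustration that assumes λ1 = λ2 = 0.5, with `None` standing for a modality whose confidence could not be obtained:

```python
LAMBDA_FACE = 0.5   # λ1, assumed face recognition threshold
LAMBDA_VOICE = 0.5  # λ2, assumed voiceprint threshold


def fuse_identity(v, s):
    """Return the integrated identity confidence M, a single-modality
    confidence when the other modality is unusable, or None when
    authentication fails (steps one to four above)."""
    face_ok = v is not None and v >= LAMBDA_FACE
    voice_ok = s is not None and s >= LAMBDA_VOICE
    if face_ok and voice_ok:       # step four: DS-style fusion
        return (v * s) / (v * s + (1 - v) * (1 - s))
    if voice_ok:                   # step one: rely on voiceprint only
        return s
    if face_ok:                    # step two: rely on face only
        return v
    return None                    # step three: authentication fails


m = fuse_identity(0.7, 0.6)  # the worked example above, ≈ 0.78
```

Note that the fused M = 0.78 exceeds both input confidences, reflecting how two independent agreeing modalities reinforce each other under the evidence-combination formula.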
It should be understood that the confidence V and the confidence S may refer to recognition results obtained through the face features and the voice features of the user a, respectively; the identity comprehensive confidence M can refer to an identity authentication result of the user A; in the embodiment of the present application, the identity authentication result of the user a may also be obtained by fusing the recognition results corresponding to a plurality of biological features of the user a, which is not limited in this application.
In a possible implementation manner, if the intelligent terminal is in a multi-user scene, in order to ensure that the voiceprint recognition and the face recognition correspond to the same user, a comprehensive judgment can be made based on the speaking mouth shapes of the multiple users in the picture taken by the intelligent terminal, the sound-source localization position, the face recognition result of each of the multiple users, and the comparison of the voiceprint against the preset users. The user who uttered the voice information is thereby determined, avoiding identity recognition failures caused by fusing the voiceprint of user A with the face of user B during identity authentication.
The confidence value of the user (face confidence, voiceprint confidence, or integrated identity confidence) can be obtained by the above method, and a confidence threshold U can be set in advance. If the confidence obtained by the algorithm is greater than or equal to the confidence threshold U, the user can be considered to have passed identity authentication, and authorization can proceed; if the obtained confidence is smaller than the preset confidence threshold, the user is a stranger, and the intelligent robot can enter the visitor mode.
Further, the confidence threshold may be configurable for different services or instructions, that is, the value of the confidence threshold U may take different values according to the security and privacy degree of user data related to different services or instructions.
For example, the confidence threshold for broadcasting the time or the weather may be set to U = 0; the confidence threshold for storytelling may be set to U = 0.5, i.e., personalized service can be provided; the confidence threshold for schedule reminding may be set to U = 0.6; when a chat with the intelligent robot involves privacy information, the confidence threshold may be set to U = 0.7; and for transferring money, the confidence threshold may be set to U = 0.8.
Step 430, providing personalized service according to the user identity.
Illustratively, after the intelligent terminal (e.g., the intelligent robot) completes the identity authentication and authorization of the user through the multi-mode converged identity authentication method, personalized services based on the user attributes can be provided according to different user identities.
In the embodiment of the application, after the intelligent terminal (for example, an intelligent robot) has authenticated and authorized the user, personalized services can be provided for different users according to the user's identity information; that is, when different users issue the same instruction, different services can be provided based on the attribute information of the different users, for example, their behavior, so that the user experience can be improved.
In one example, a smart terminal (e.g., a smart robot) may provide a user with a planned staged learning task based on the user's identity.
For example, Xiao Ming, a child in the family, learns 5 new English words through the intelligent terminal (e.g., an intelligent robot) every day and reviews the English words learned in the previous days; each day, after the intelligent terminal (e.g., an intelligent robot) completes identity authentication of Xiao Ming, it can provide him with the learning-task service, and it does not start the learning task when it recognizes someone else.
In one example, a smart terminal (e.g., a smart robot) may automatically provide differentiated personalized user services according to the user's age, gender, region, etc.
For example, the user says to the intelligent robot by voice, "play some music"; if the intelligent robot identifies through identity authentication that the speaker is a child, children's music can be played; if it identifies that the speaker is an adult, music that adults like to listen to can be played.
In one example, a smart terminal (e.g., a smart robot) may provide a corresponding service based on the identity of a user and according to the behavior preference of the user, which is configured in advance through an APP in a mobile terminal.
For example, the user says to the intelligent robot by voice, "play my favorite music"; if the intelligent robot recognizes through identity authentication that the user is a child, and it has been preconfigured through the mobile phone APP that the child's favorite music is "a piglet is full", the preconfigured music may be played.
For example, it can be preconfigured in the APP in the mobile terminal that Dad likes football and Xiao Ming likes dinosaurs. When the intelligent robot actively chats with the user, if identity authentication identifies the user as Xiao Ming's dad, it can chat about football topics; if identity authentication identifies the user as Xiao Ming, it can chat about dinosaur topics.
In one example, a smart terminal (e.g., a smart robot) may actively perform a behavioral preference analysis of a user based on the identity of the user, thereby providing personalized services according to the user's preferences.
For example, after the user listens to a song each time, the intelligent robot can actively ask whether the user liked the song; after gathering statistics for a period of time, the intelligent robot may find that user A likes listening to folk songs and user B likes listening to pop songs. When a user says "play some music" to the intelligent robot, songs of the corresponding type can be played according to the user's identity: folk songs if user A is identified, pop songs if user B is identified.
In one example, from the user's daily conversations with the smart robot, a smart terminal (e.g., a smart robot) may learn the user's latest interests and tastes and provide different services based on the user's identity.
For example, user a frequently likes to chat about topics related to electronic technology recently chatting with the intelligent robot, and some electronic technology news or things can be actively spoken when the intelligent robot chats with user a.
Furthermore, the robot differs from a user's mobile phone terminal: a mobile phone only needs to perform identity authentication when being started or unlocked, and can assume by default that the same user continues to use it after being woken up. When a user uses an intelligent terminal (such as an intelligent robot), the user may leave at any time; if identity authentication is performed only once, any user who reaches the intelligent robot while it is in a non-locked state can add, delete, change, and query the user privacy data in it, causing privacy disclosure and uncontrolled use permissions. Therefore, the intelligent robot needs to perform continuous identity authentication and authorization on the user.
The following describes in detail a method for performing persistent multimodal fusion identity authentication by a smart terminal, and it should be noted that the persistent multimodal fusion identity authentication by the smart terminal may be based on the following two different cases.
In the first case, a smart terminal (e.g., a smart robot) can perform persistent multi-modal converged identity authentication based on a user's instruction.
For example, the intelligent terminal may determine whether to authenticate the identity of the user when detecting that the user indicates a new command through voice or gesture. For example, as shown in the flowchart of fig. 6, the method 500 of persistent multi-modal converged identity authentication shown in fig. 6 includes steps 510 to 540, and the steps 510 to 540 are described in detail below.
Step 510, a wake-up word is detected.
It should be understood that, when the intelligent terminal has not been used for a period of time, it may be in a dormant state, and the user can then wake it up through a wake-up word; the wake-up word refers to voice information for waking up an intelligent terminal (e.g., an intelligent robot), and different wake-up words may be set for different intelligent terminals.
For example, the smart terminal (e.g., a smart robot) wakes up when it detects a wake-up word such as "open sesame".
And step 520, authenticating the identity of the user through the awakening word.
Illustratively, the smart terminal may perform voiceprint recognition on the wake-up voice message including the wake-up word through the voiceprint recognition system.
Furthermore, while performing voiceprint recognition of the user, the intelligent terminal can start face recognition toward the direction of the sound source; that is, the intelligent terminal can detect and recognize the user's face within its visual range.
For example, the fusion algorithm shown in step 420 above may be used to perform fusion recognition of the voiceprint and the face of the user, so as to determine the identity of the user; for specific steps, reference may be made to the fusion algorithm in step 420, which is not described herein again.
Exemplarily, if identity recognition determines that the user who uttered the wake-up word matches the preset user identity information, the next step is executed; if identity recognition determines that the user who uttered the wake-up word is a stranger, the visitor mode is entered.
For example, the preset user identity information may refer to that an owner (e.g., a user of an intelligent terminal) sets identity information and permissions of the owner and other users (e.g., family members) through an APP in a mobile terminal as shown in fig. 1, and inputs corresponding face information and voiceprint information for identity authentication of the user; the identity information may include, but is not limited to, the user's age, gender, preferences, family membership, etc.
And step 530, authenticating the identity of the user again based on the user instruction.
For example, the smart terminal may authenticate the user again based on the new command at the time when the user issues the new command.
For example, during a service period t1 (e.g., 5 minutes) after authentication passes, when the user issues a new command to the intelligent terminal through voice information, motion information (e.g., gestures), or other indication information, the intelligent terminal may determine whether authentication of the user's identity is required.
For example, the intelligent terminal may determine whether to authenticate the user identity according to a confidence threshold corresponding to a new command for the user.
For example, the owner may preconfigure confidence thresholds of user identities for different services or instructions: the confidence threshold for broadcasting the time or the weather may be set to U = 0; the confidence threshold for storytelling may be set to U = 0.5, i.e., personalized service can be provided; the confidence threshold for schedule reminding may be set to U = 0.6; when a chat with the intelligent robot involves privacy information, the confidence threshold may be set to U = 0.7; and for transferring money, the confidence threshold may be set to U = 0.8.
In an example, if the confidence threshold corresponding to a new command issued by the user to the smart terminal through the voice message, the action message (e.g., a gesture), or other indication information is 0, for example, the smart terminal is instructed to broadcast weather, then authentication of the user identity may not be required at this time.
In one example, if the confidence threshold corresponding to a new command issued by the user to the intelligent terminal through voice information, action information (e.g., gesture) or other indication information is greater than 0, for example, when the intelligent terminal is instructed to perform operations such as story telling, schedule reminding, chatting, transferring remittance, etc., the identity of the user needs to be authenticated again at this time.
Further, if the identity authentication result confirms that the user is a preset user and the user has the authority to execute the command (i.e., both authentication and authorization pass), the intelligent terminal executes the command issued by the user. If identity authentication fails, the user's identity has failed to match a preset user; alternatively, identity authentication may succeed while the user lacks the authority to execute the instruction. In either case the issued command fails to execute, and the intelligent robot may give the user a corresponding voice prompt.
It should be understood that when a user uses an intelligent terminal (e.g., an intelligent robot), the user may leave its surroundings to attend to other matters after waking it up; another user could then reach the intelligent terminal and issue a new command, and without continuous identity authentication there is a risk that this other user acquires the user privacy data on the terminal and tampers with or deletes it. Therefore, when the intelligent terminal detects a newly issued command, it needs to judge, according to the service or instruction indicated by the new command, whether to authenticate the user's identity again; this realizes continuous identity authentication, reduces the risk of privacy data leakage, and improves the security of using the intelligent terminal.
Optionally, in a possible implementation manner, within a time interval of t2 seconds (for example, 30 seconds) after the user passes identity authentication, the intelligent terminal may continuously track the user through a user tracking manner (for example, face tracking). If the user remains in the face-tracking view and the preset threshold of the command newly issued by the user is less than or equal to the confidence value of the last successful authentication (for example, the threshold u2 of the new command is less than or equal to the confidence u1 of the previous pass), the user may not need to be authenticated again.
It should be noted that the value of the time interval t2 is configurable. For example, when t2 is 0, each security- or privacy-related instruction or personalized-service instruction issued by the user (that is, an instruction whose preconfigured confidence threshold is greater than 0) requires user identity authentication. If the intelligent robot loses face tracking of the user because it is running a service function (for example, a game or dancing), it can authenticate the user again after receiving a new instruction from the user.
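The t2-interval rule above can be expressed as a small decision function. A hedged sketch under stated assumptions: the function name, parameters, and the use of a monotonic clock are illustrative choices, not an interface the patent prescribes.

```python
import time

def needs_reauthentication(last_auth_time, last_confidence,
                           command_threshold, tracking_intact,
                           t2=30.0, now=None):
    """Decide whether a newly issued command requires re-authentication.

    Re-authentication is skipped only when the user stayed in the tracking
    view within t2 seconds of the last pass and the new command's preset
    threshold (u2) does not exceed the last pass confidence (u1).
    """
    now = time.monotonic() if now is None else now
    # t2 == 0 means every command with a threshold > 0 must re-authenticate.
    if t2 == 0 and command_threshold > 0:
        return True
    within_interval = (now - last_auth_time) <= t2
    if within_interval and tracking_intact and command_threshold <= last_confidence:
        return False
    return True
```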
Illustratively, in one possible scenario, if no command is detected around the intelligent terminal (e.g., an intelligent robot) for a long time, exceeding the service period t1, the intelligent terminal can detect surrounding face information and sound information; if no face or sound information is detected, or the detected face or sound information does not belong to a preset user, the intelligent terminal can enter a locking mode or a visitor mode.
Step 540, providing personalized services according to the user identity.
In the embodiment of the application, different personalized services can be provided for the user according to different user identities. For specific steps, refer to step 430 shown in fig. 5; details are not described herein again.
In the second case, the intelligent terminal (e.g., intelligent robot) can perform multi-modal identity authentication based on time window (i.e., time interval) and user tracking.
For example, after the intelligent robot passes the identity authentication of the user, it may continuously perform identity authentication again in a user tracking manner (e.g., through a face tracking system); for example, by starting a timer, an M-second time window for user identity authentication may be opened again after every N seconds (e.g., 180 seconds). For example, as shown in the flowchart of fig. 7, the method 600 of continuous multi-modal fusion identity authentication includes steps 610 to 640, which are described in detail below in connection with steps 610 to 640, respectively.
It should be understood that the second method for continuously authenticating the user identity is more suitable for a scenario in which the user identity authentication speed is slow or the user tracking is not easily lost, wherein the user tracking may include at least one of user face tracking, user skeleton recognition and pedestrian re-recognition.
Step 610, a wake-up word is detected.
It should be understood that when the intelligent terminal has not been used by the user for a period of time, it can enter a dormant state; at this moment, the user can wake it up through a wake-up word. The wake-up word refers to voice information for waking up an intelligent terminal (e.g., an intelligent robot), and different wake-up words may be set for different intelligent terminals.
For example, the smart terminal (e.g., smart robot) detects a wake word, such as "sesame open door" to wake up.
Step 620, authenticating the identity of the user through the wake-up word.
Illustratively, the smart terminal may perform voiceprint recognition on the wake-up voice message including the wake-up word through the voiceprint recognition system.
Furthermore, the intelligent terminal can start face recognition toward the direction of the sound source while performing user voiceprint recognition; that is, the intelligent terminal can detect and recognize the user's face within its visual range.
For example, the fusion algorithm shown in step 420 above may be used to perform fusion recognition of the voiceprint and the face of the user, so as to determine the identity of the user; for specific steps, reference may be made to the fusion algorithm in step 420, which is not described herein again.
Exemplarily, if it is determined through identity recognition that the user indicating the wake-up word matches the preset user identity information, the next step is executed; if the user indicating the wake-up word is determined through identity recognition to be a stranger, the visitor mode is entered.
For example, the preset user identity information may be configured by an owner (e.g., the user of the intelligent terminal) through an APP in a mobile terminal as shown in fig. 1: the owner sets the identity information and permissions of the owner and other users (e.g., family members), and enters the corresponding face information and voiceprint information for user identity authentication. The identity information may include, but is not limited to, the user's age, gender, preferences, family membership, and the like.
Step 630, performing identity authentication again on the user based on the time window and the user tracking manner.
For example, if the intelligent robot passes the authentication of the user in step 620, it may continuously track the user through a user tracking manner (e.g., face tracking, user skeleton recognition, or pedestrian re-recognition); by starting a timer, the intelligent robot may open an M-second time window for user identity authentication after every N seconds (e.g., 180 seconds), that is, it may execute the user identity authentication procedure again every N seconds.
For example, if the time interval N is 180 seconds, the intelligent robot may continuously track the user's face within that 180-second interval after the user passes identity authentication. If the intelligent robot can continuously track the user within the interval and the user issues multiple instructions or requests multiple services during it, the intelligent robot needs to authenticate the user only once within the interval. If face tracking of the user is lost within the interval, the intelligent robot needs to authenticate the user's identity again.
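The N-second window behavior described above (one authentication covers all commands inside the interval, while a lost track forces re-authentication) can be modelled as a small state machine. The class and method names are invented for illustration; this is a sketch, not the patent's implementation.

```python
class WindowedAuthenticator:
    """Timer-based re-authentication: open an authentication window every
    N seconds, as long as user tracking has not been lost in between."""

    def __init__(self, period_n=180.0):
        self.period_n = period_n       # N seconds between authentication windows
        self.window_start = None       # time of the last successful authentication
        self.tracking_lost = False

    def on_authenticated(self, now):
        """Record a successful authentication; one pass covers the interval."""
        self.window_start = now
        self.tracking_lost = False

    def on_tracking_lost(self):
        """Face tracking lost (e.g., during a game or dance routine)."""
        self.tracking_lost = True

    def requires_auth(self, now):
        """True when a new command must trigger re-authentication."""
        if self.window_start is None or self.tracking_lost:
            return True
        return (now - self.window_start) >= self.period_n
```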
For example, within the M-second time window, the smart robot may simultaneously initiate voiceprint recognition and/or face recognition functions to authenticate the identity of the user.
In one example, the intelligent robot may identify the user's voice information by a voiceprint recognition system to determine the identity of the user.
In one example, the intelligent robot may identify the face image information of the user through a face recognition system to determine the identity of the user.
In one example, the intelligent robot may recognize the user's voice information and face image information through a voiceprint recognition system and a face recognition system respectively, and transmit the voiceprint recognition result and the face recognition result to a fusion determination system for fusion determination, thereby determining the identity of the user.
Exemplarily, the fusion judgment system fuses the voiceprint recognition result and the face recognition result to obtain a final fusion recognition result. If the fusion recognition result confirms that the user is a preset user, the personalized service with the corresponding authority continues to be provided for the user; if the fusion recognition result determines that the user does not match a preset user, the intelligent robot can enter a locking state or a visitor mode.
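The fusion judgment described above can be sketched as a combination of the voiceprint and face confidences. The patent does not disclose the fusion formula; the weighted average, the weights, and the decision threshold used here are assumptions for illustration only.

```python
def fuse_scores(voice_conf, face_conf, w_voice=0.4, w_face=0.6):
    """Weighted average of per-modality confidences (weights are assumed)."""
    return w_voice * voice_conf + w_face * face_conf

def fusion_decision(voice_conf, face_conf, threshold=0.8):
    """Accept when the fused confidence reaches the threshold; otherwise the
    terminal falls back to a locking or visitor mode."""
    fused = fuse_scores(voice_conf, face_conf)
    return "preset_user" if fused >= threshold else "lock_or_visitor_mode"
```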
Optionally, in a possible implementation manner, the intelligent robot may be unable to continuously track the user when it is running a service such as a game or dancing, or when the user moves quickly, so that face tracking of the user is lost; in this case, the intelligent robot may re-authenticate the user's identity after receiving a new instruction indicated by the user.
Optionally, in a possible implementation manner, if the instruction or service instructed by the user relates to an operation of high-risk sensitive information or instruction (for example, money transfer, remittance, and the like), the intelligent robot may directly perform identity authentication and authorization on the user without distinguishing a time window.
Optionally, in a possible implementation manner, if no command is detected around the intelligent terminal (e.g., an intelligent robot) for a long time, exceeding the service period t1, the intelligent terminal can detect surrounding face information and sound information; if no face or sound information is detected, or the detected face or sound information does not belong to a preset user, the intelligent terminal can enter a locking mode or a visitor mode.
Step 640, providing personalized service according to the user identity.
In the embodiment of the application, different personalized services can be provided for the user according to different user identities. For specific steps, refer to step 430 shown in fig. 5; details are not described herein again.
It should be noted that, in the second case, authentication and authorization of the user are based on a time window and a user tracking manner, where the user tracking manner may include at least one of user face tracking, user skeleton recognition, and pedestrian re-recognition. That is, the intelligent terminal may open a time window for user identity authentication every period of time, and the authentication and authorization may run on processes or threads different from those of the intelligent robot's service functions. This avoids authentication interfering with service execution, and also avoids the slow authentication speed and instruction execution delay that arise when authentication must be performed after an instruction is issued but before it is executed.
In the embodiment of the application, identity authentication can be performed in parallel on the acquired multi-modal biological features of the user to be identified, and the recognition results corresponding to the different biological features are fused to obtain the identity authentication result of the user to be identified. Performing identity authentication in parallel on the multi-modal biological features avoids the problem that interference factors affecting a single biological feature in a serial authentication process make the final identity authentication inaccurate; in other words, the embodiment of the application avoids the interference factors of single-modal identity authentication, thereby improving the robustness and accuracy of the identity authentication method.
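The parallel multi-modal authentication summarized above can be sketched with a thread pool: each modality's recognizer runs concurrently and the per-modality confidences are then fused. The recognizer callables here are stand-ins for real voiceprint and face recognition systems; the interface is an assumption for the example.

```python
from concurrent.futures import ThreadPoolExecutor

def authenticate_parallel(sample, recognizers, fuse):
    """Run each modality's recognizer concurrently on the captured sample
    and fuse the resulting confidences into a single authentication score."""
    with ThreadPoolExecutor(max_workers=len(recognizers)) as pool:
        futures = [pool.submit(recognize, sample) for recognize in recognizers]
        scores = [f.result() for f in futures]   # preserves recognizer order
    return fuse(scores)
```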
It is to be understood that the above description is intended to assist those skilled in the art in understanding the embodiments of the present application and is not intended to limit the embodiments of the present application to the particular values or particular scenarios illustrated. It will be apparent to those skilled in the art from the foregoing description that various equivalent modifications or changes may be made, and such modifications or changes are intended to fall within the scope of the embodiments of the present application.
The method for identity authentication according to the embodiment of the present application is described in detail above with reference to fig. 1 to 7, and the apparatus embodiments of the present application are described in detail below with reference to fig. 8 and 9. It should be understood that the identity authentication apparatus in the embodiment of the present application may perform the identity authentication methods of the embodiments of the present application; for the specific working processes of the following products, reference may be made to the corresponding processes in the foregoing method embodiments.
Fig. 8 is a schematic block diagram of an apparatus for identity authentication provided in an embodiment of the present application. It is understood that the apparatus 700 may perform the method of identity authentication illustrated in fig. 2 to 7. The apparatus 700 comprises: an acquisition unit 710 and a processing unit 720.
The obtaining unit 710 is configured to obtain information of multi-modal biometric features of a user to be identified, where the multi-modal biometric features include at least two biometric features of the user to be identified; the processing unit 720 is configured to perform identity authentication on the user to be identified in parallel according to the at least two biometrics; and determining the identity authentication result of the user to be recognized according to the recognition result obtained by parallelly performing identity authentication on the user to be recognized, wherein the identity authentication result is obtained based on the confidence coefficient of matching the multi-mode biological features with the preset biological features.
Optionally, as an embodiment, the obtaining unit 710 is specifically configured to:
in a case in which an instruction indicated by the user to be identified for waking up the intelligent terminal is detected, and in a case in which an instruction indicated by the user to be identified instructing the intelligent terminal to execute a preset service is detected, perform identity authentication on the user to be identified in parallel according to the at least two biological characteristics.
Optionally, as an embodiment, the obtaining unit 710 is specifically configured to:
in a case in which an instruction indicated by the user to be identified for waking up the intelligent terminal is detected, and in a case in which the detected time interval is greater than a preconfigured time interval, perform identity authentication on the user to be identified in parallel according to the at least two biological characteristics, where the time interval refers to the interval between the time at which the last identity authentication was executed and the current time.
Optionally, as an embodiment, the processing unit 720 is further configured to:
and detecting the image information of the user to be identified in a user tracking mode within the preconfigured time interval, wherein the user tracking mode comprises at least one of user face tracking, user skeleton identification and pedestrian re-identification.
Optionally, as an embodiment, when the confidence of matching between the multi-modal biological features and the preset biological features is greater than or equal to a preset threshold, it indicates that the user to be identified passes identity authentication, where the preset threshold is pre-configured according to different services or instructions.
Optionally, as an embodiment, the processing unit 720 is further configured to:
and under the condition that the user to be identified passes identity authentication, providing personalized service for the user to be identified, wherein the personalized service is obtained according to the behavior attribute of the user to be identified.
Optionally, as an embodiment, the at least two biometrics features include a face image of the user to be recognized and voice information of the user to be recognized.
It should be noted that the apparatus 700 is embodied in the form of a functional unit. The term "unit" herein may be implemented in software and/or hardware, and is not particularly limited thereto.
For example, a "unit" may be a software program, a hardware circuit, or a combination of both that implement the above-described functions. The hardware circuitry may include an Application Specific Integrated Circuit (ASIC), an electronic circuit, a processor (e.g., a shared processor, a dedicated processor, or a group of processors) and memory that execute one or more software or firmware programs, a combinational logic circuit, and/or other suitable components that support the described functionality.
Accordingly, the units of the respective examples described in the embodiments of the present application can be realized in electronic hardware, or a combination of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
Fig. 9 is a schematic hardware structure diagram of an identity authentication apparatus according to an embodiment of the present application.
As shown in fig. 9, the apparatus 800 (the apparatus 800 may specifically be a computer device) includes a memory 801, a processor 802, a communication interface 803, and a bus 804. The memory 801, the processor 802, and the communication interface 803 are communicatively connected to each other via a bus 804.
The memory 801 may be a Read Only Memory (ROM), a static memory device, a dynamic memory device, or a Random Access Memory (RAM). The memory 801 may store a program, and when the program stored in the memory 801 is executed by the processor 802, the processor 802 is configured to perform the steps of the method for identity authentication of the embodiment of the present application, for example, the steps shown in fig. 2 to 7.
It should be understood that the apparatus for identity authentication shown in the embodiment of the present application may be an intelligent terminal, for example, an intelligent robot, an intelligent camera product, an intelligent home control center product, and the like, or may also be a chip configured in the intelligent terminal.
The processor 802 may be a general Central Processing Unit (CPU), a microprocessor, an Application Specific Integrated Circuit (ASIC), or one or more integrated circuits, and is configured to execute related programs to implement the method for authenticating an identity according to the embodiment of the present application.
The processor 802 may also be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the method of identity authentication of the present application may be implemented by integrated logic circuits of hardware or instructions in the form of software in the processor 802.
The processor 802 may also be a general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, or discrete hardware components. The various methods, steps, and logic blocks disclosed in the embodiments of the present application may be implemented or performed. A general purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of the method disclosed in connection with the embodiments of the present application may be directly implemented by a hardware decoding processor, or implemented by a combination of hardware and software modules in the decoding processor. The software module may be located in a storage medium well known in the art, such as a RAM, a flash memory, a ROM, a PROM, an EPROM, or a register. The storage medium is located in the memory 801, and the processor 802 reads the information in the memory 801 and, in combination with its hardware, completes the functions required to be performed by the units included in the identity authentication apparatus shown in fig. 8, or performs the identity authentication method shown in fig. 2 to 7 of the method embodiments of the present application.
The communication interface 803 illustratively enables communication between the apparatus 800 and other devices or communication networks using transceiver means, such as, but not limited to, transceivers.
Illustratively, the bus 804 may include a pathway to transfer information between various components of the apparatus 800 (e.g., the memory 801, the processor 802, the communication interface 803).
It should be noted that although the apparatus 800 described above shows only memories, processors, and communication interfaces, in a particular implementation, those skilled in the art will appreciate that the apparatus 800 may also include other components necessary to achieve proper operation. Also, those skilled in the art will appreciate that the apparatus 800 described above may also include hardware components for performing other additional functions, according to particular needs. Furthermore, those skilled in the art will appreciate that the apparatus 800 described above may also include only those components necessary to implement the embodiments of the present application, and need not include all of the components shown in FIG. 9.
It will also be appreciated that in embodiments of the present application, the memory may comprise both read-only memory and random access memory, and may provide instructions and data to the processor. A portion of the processor may also include non-volatile random access memory. For example, the processor may also store information of the device type.
It should be understood that the term "and/or" herein is merely one type of association relationship that describes an associated object, meaning that three relationships may exist, e.g., a and/or B may mean: a exists alone, A and B exist simultaneously, and B exists alone. In addition, the character "/" herein generally indicates that the former and latter related objects are in an "or" relationship.
It should be understood that, in the various embodiments of the present application, the sequence numbers of the above-mentioned processes do not mean the execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present application.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional modules in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application, or the portions thereof that substantially contribute to the prior art, may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash disk, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present application, and shall be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (18)

1. A method of identity authentication, comprising:
acquiring information of multi-modal biological features of a user to be identified, wherein the multi-modal biological features comprise at least two biological features of the user to be identified;
performing identity authentication on the user to be identified in parallel according to the at least two biological characteristics;
and determining the identity authentication result of the user to be recognized according to the recognition result obtained by parallelly performing identity authentication on the user to be recognized, wherein the identity authentication result is obtained based on the confidence coefficient of matching the multi-mode biological characteristics with preset biological characteristics.
2. The method of claim 1, wherein the authenticating the user to be identified in parallel according to the at least two biometrics comprises:
in a case in which an instruction indicated by the user to be identified for waking up the intelligent terminal is detected, and in a case in which an instruction indicated by the user to be identified instructing the intelligent terminal to execute a preset service is detected, performing identity authentication on the user to be identified in parallel according to the at least two biological characteristics.
3. The method of claim 1, wherein the authenticating the user to be identified in parallel according to the at least two biometrics comprises:
in a case in which an instruction indicated by the user to be identified for waking up the intelligent terminal is detected, and in a case in which the detected time interval is greater than a preconfigured time interval, performing identity authentication on the user to be identified in parallel according to the at least two biological characteristics, wherein the time interval refers to the interval between the time at which the last identity authentication was executed and the current time.
4. The method of claim 3, further comprising:
and detecting the image information of the user to be identified in a user tracking mode within the preconfigured time interval, wherein the user tracking mode comprises at least one of user face tracking, user skeleton identification and pedestrian re-identification.
5. The method according to any one of claims 1 to 4, wherein, when the confidence of matching between the multi-modal biometric features and the preset biometric features is greater than or equal to a preset threshold, it indicates that the user to be identified passes identity authentication, wherein the preset threshold is a threshold pre-configured according to different services or instructions.
6. The method of claim 5, further comprising:
and under the condition that the user to be identified passes identity authentication, providing personalized service for the user to be identified, wherein the personalized service is obtained according to the behavior attribute of the user to be identified.
7. The method of any one of claims 1 to 6, wherein the at least two biometrics comprise face images of the user to be recognized and speech information of the user to be recognized.
8. An apparatus for identity authentication, comprising:
an acquisition unit, configured to acquire information of multi-modal biological features of a user to be recognized, wherein the multi-modal biological features comprise at least two biological features of the user to be recognized;
the processing unit is used for carrying out identity authentication on the user to be identified in parallel according to the at least two biological characteristics; and determining the identity authentication result of the user to be recognized according to the recognition result obtained by parallelly performing identity authentication on the user to be recognized, wherein the identity authentication result is obtained based on the confidence coefficient of matching the multi-mode biological features with the preset biological features.
9. The apparatus of claim 8, wherein the obtaining unit is specifically configured to:
in a case in which an instruction indicated by the user to be identified for waking up the intelligent terminal is detected, and in a case in which an instruction indicated by the user to be identified instructing the intelligent terminal to execute a preset service is detected, perform identity authentication on the user to be identified in parallel according to the at least two biological characteristics.
10. The apparatus of claim 8, wherein the obtaining unit is specifically configured to:
in a case in which an instruction indicated by the user to be identified for waking up the intelligent terminal is detected, and in a case in which the detected time interval is greater than a preconfigured time interval, perform identity authentication on the user to be identified in parallel according to the at least two biological characteristics, wherein the time interval refers to the interval between the time at which the last identity authentication was executed and the current time.
11. The apparatus of claim 10, wherein the processing unit is further configured to:
detect image information of the user to be identified within the preconfigured time interval in a user tracking mode, wherein the user tracking mode comprises at least one of user face tracking, user skeleton recognition, and pedestrian re-identification.
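For illustration only (not part of the claims), the time-interval gate of claims 10 and 11 amounts to skipping a fresh authentication while the last successful one is still recent, with user tracking covering the gap. The interval value and injectable clock below are assumptions for the sketch.

```python
import time

class AuthGate:
    """Re-authenticate only when the preconfigured interval has elapsed
    since the last successful authentication; in between, the terminal may
    rely on user tracking (face tracking, skeleton recognition, pedestrian
    re-identification) to confirm it is still the same person."""

    def __init__(self, interval_s=300.0, clock=time.monotonic):
        self.interval_s = interval_s
        self.clock = clock      # injectable for testing
        self.last_auth = None   # time of the last successful authentication

    def needs_reauth(self):
        # No prior authentication, or the configured interval has elapsed.
        if self.last_auth is None:
            return True
        return self.clock() - self.last_auth > self.interval_s

    def record_success(self):
        self.last_auth = self.clock()
```

Injecting a fake clock makes the gating behavior easy to verify: a wake-up instruction 5 s after a successful authentication skips re-authentication, while one after 11 s (with a 10 s interval) triggers it.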
12. The apparatus according to any one of claims 8 to 11, wherein the apparatus is configured to indicate that the user to be identified passes identity authentication when the confidence with which the multi-modal biometric features match the preset biometric features is greater than or equal to a preset threshold, wherein the preset threshold is preconfigured according to different services or instructions.
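For illustration only (not part of the claims), the per-service threshold of claim 12 can be realized as a lookup from the requested service or instruction to a preconfigured confidence floor. The service names, threshold values, and fallback value are purely illustrative assumptions.

```python
# Hypothetical preconfigured thresholds: sensitive operations demand a
# higher fused matching confidence than casual ones.
THRESHOLDS = {"payment": 0.95, "unlock": 0.80, "weather_query": 0.50}
DEFAULT_THRESHOLD = 0.90  # conservative fallback for unlisted services

def passes(service, confidence):
    """Authentication passes when the fused matching confidence is greater
    than or equal to the threshold preconfigured for the service."""
    return confidence >= THRESHOLDS.get(service, DEFAULT_THRESHOLD)
```

With these example values, a 0.85 confidence would unlock the terminal but would not authorize a payment.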
13. The apparatus of claim 12, wherein the processing unit is further configured to:
provide a personalized service for the user to be identified when the user to be identified passes identity authentication, wherein the personalized service is obtained according to behavior attributes of the user to be identified.
14. The apparatus according to any one of claims 8 to 13, wherein the at least two biometric features comprise a face image of the user to be identified and voice information of the user to be identified.
15. An apparatus for identity authentication, comprising at least one processor and a memory, wherein the at least one processor is coupled to the memory and configured to read and execute instructions in the memory to perform the method of any one of claims 1 to 7.
16. The apparatus of claim 15, wherein the apparatus is a smart terminal.
17. A chip, comprising a processor and a data interface, wherein the processor reads instructions stored in a memory through the data interface to perform the method of any one of claims 1 to 7.
18. A computer-readable medium storing program code which, when run on a computer, causes the computer to perform the method of any one of claims 1 to 7.
CN201911408254.6A 2019-12-31 2019-12-31 Identity authentication method and device Pending CN111199032A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201911408254.6A CN111199032A (en) 2019-12-31 2019-12-31 Identity authentication method and device
PCT/CN2020/128954 WO2021135685A1 (en) 2019-12-31 2020-11-16 Identity authentication method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911408254.6A CN111199032A (en) 2019-12-31 2019-12-31 Identity authentication method and device

Publications (1)

Publication Number Publication Date
CN111199032A true CN111199032A (en) 2020-05-26

Family

ID=70746223

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911408254.6A Pending CN111199032A (en) 2019-12-31 2019-12-31 Identity authentication method and device

Country Status (2)

Country Link
CN (1) CN111199032A (en)
WO (1) WO2021135685A1 (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112019346A (en) * 2020-08-17 2020-12-01 北京双洲科技有限公司 Method, device and system for authenticating and processing mobile terminal user identity
CN112184237A (en) * 2020-09-15 2021-01-05 ***股份有限公司 Data processing method and device and computer readable storage medium
CN112232443A (en) * 2020-11-20 2021-01-15 中国联合网络通信集团有限公司 Identity authentication method, device, equipment and storage medium
CN112528822A (en) * 2020-12-04 2021-03-19 湖北工业大学 Old and weak people path finding and guiding device and method based on face recognition technology
CN112634559A (en) * 2020-11-25 2021-04-09 珠海格力电器股份有限公司 Intelligent household emergency alarm system and method based on pedestrian re-identification
CN112685704A (en) * 2021-02-26 2021-04-20 吴伟运 Verification method and terminal for combining multiple kinds of protection into one interface based on electronic equipment
WO2021135685A1 (en) * 2019-12-31 2021-07-08 华为技术有限公司 Identity authentication method and device
CN115424622A (en) * 2022-11-04 2022-12-02 之江实验室 Man-machine voice intelligent interaction method and device
CN116861496A (en) * 2023-09-04 2023-10-10 合肥工业大学 Intelligent medical information safety display method and system

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107408171A (en) * 2015-03-17 2017-11-28 微软技术许可有限责任公司 Personal information and the access to function are selectively provided based on biometric user certification in screen locking
CN107980131A (en) * 2017-08-21 2018-05-01 深圳市汇顶科技股份有限公司 Identity identifying method, device and electronic equipment based on multi-biological characteristic sensor
CN109858366A (en) * 2018-12-28 2019-06-07 龙马智芯(珠海横琴)科技有限公司 Identity identifying method and device
CN110472485A (en) * 2019-07-03 2019-11-19 华为技术有限公司 The method and apparatus for identifying identity

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10698995B2 (en) * 2014-08-28 2020-06-30 Facetec, Inc. Method to verify identity using a previously collected biometric image/data
CN104361276B (en) * 2014-11-18 2017-07-18 新开普电子股份有限公司 A kind of multi-modal biological characteristic identity identifying method and system
CN111199032A (en) * 2019-12-31 2020-05-26 华为技术有限公司 Identity authentication method and device



Also Published As

Publication number Publication date
WO2021135685A1 (en) 2021-07-08

Similar Documents

Publication Publication Date Title
WO2021135685A1 (en) Identity authentication method and device
US20210327431A1 (en) &#39;liveness&#39; detection system
EP3525205B1 (en) Electronic device and method of performing function of electronic device
US20200333875A1 (en) Method and apparatus for interrupt detection
US11238142B2 (en) Enrollment with an automated assistant
Chibelushi et al. A review of speech-based bimodal recognition
Aleksic et al. Audio-visual biometrics
US8416998B2 (en) Information processing device, information processing method, and program
US20210350346A1 (en) System and method for using passive multifactor authentication to provide access to secure services
JP7126613B2 (en) Systems and methods for domain adaptation in neural networks using domain classifiers
WO2016172872A1 (en) Method and device for verifying real human face, and computer program product
KR20010039771A (en) Methods and apparatus for audio-visual speaker recognition and utterance verification
WO2021008538A1 (en) Voice interaction method and related device
US11455998B1 (en) Sensitive data control
US11776550B2 (en) Device operation based on dynamic classifier
Toygar et al. Person identification using multimodal biometrics under different challenges
WO2023231211A1 (en) Voice recognition method and apparatus, electronic device, storage medium, and product
CN109102812B (en) Voiceprint recognition method and system and electronic equipment
Park et al. Multimodal priority verification of face and speech using momentum back-propagation neural network
CN109102810B (en) Voiceprint recognition method and device
Park et al. Multi-modal human verification using face and speech
CN112233674A (en) Multimode interaction method and system
Sahoo et al. Bimodal biometric person authentication using speech and face under degraded condition
Primorac et al. Audio-visual biometric recognition via joint sparse representations
WO2020087534A1 (en) Generating response in conversation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination