WO2022028425A1 - Object recognition method and apparatus, electronic device and storage medium - Google Patents

Object recognition method and apparatus, electronic device and storage medium Download PDF

Info

Publication number
WO2022028425A1
WO2022028425A1 · PCT/CN2021/110358 · CN2021110358W
Authority
WO
WIPO (PCT)
Prior art keywords
information
target
living
feature information
feature
Prior art date
Application number
PCT/CN2021/110358
Other languages
French (fr)
Chinese (zh)
Inventor
邱尚锋
黄颖
张文伟
Original Assignee
广州虎牙科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 广州虎牙科技有限公司
Publication of WO2022028425A1 publication Critical patent/WO2022028425A1/en

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16: Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172: Classification, e.g. identification
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/22: Matching criteria, e.g. proximity measures
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/40: Spoof detection, e.g. liveness detection
    • G06V40/45: Detection of the body part being alive

Definitions

  • the present application relates to the technical field of face recognition, and in particular, to an object recognition method and device, an electronic device and a storage medium.
  • the purpose of the present application is to provide an object recognition method and apparatus, an electronic device and a storage medium.
  • An object recognition method, comprising:
  • obtaining multiple frames of target images obtained by photographing a target object, wherein each frame of the target image includes facial information of the target object;
  • determining similarity information between the multiple frames of target images based on the facial information; and
  • determining whether the target object belongs to a living object based on the similarity information.
  • the embodiment of the present application also provides an object recognition device, including:
  • a target image obtaining module configured to obtain multiple frames of target images obtained by photographing a target object, wherein each frame of the target image includes facial information of the target object;
  • a similarity information determination module configured to determine similarity information between the multiple frames of target images based on the face information
  • a living object determination module configured to determine whether the target object belongs to a living object based on the similarity information.
  • a processor connected with the memory, configured to execute the computer program stored in the memory so as to implement the above-mentioned object recognition method.
  • an embodiment of the present application further provides a computer-readable storage medium on which a computer program is stored, and when the computer program is executed, the above-mentioned object identification method is implemented.
  • FIG. 1 is a structural block diagram of an electronic device provided by an embodiment of the present application.
  • FIG. 2 is a schematic flowchart of an object recognition method provided by an embodiment of the present application.
  • FIG. 3 is a schematic diagram of the effect of obtaining multiple frames of target images according to an embodiment of the present application.
  • FIG. 4 is a schematic flowchart of sub-steps included in step S120 in FIG. 2 .
  • FIG. 5 is a schematic flowchart of sub-steps included in step S130 in FIG. 2 .
  • FIG. 6 is a schematic flowchart of other sub-steps included in step S130 in FIG. 2 .
  • FIG. 8 is a schematic flowchart of other sub-steps included in step S133 in FIG. 6 .
  • FIG. 9 is a schematic flowchart of other steps of the object recognition method provided by the embodiment of the present application.
  • FIG. 10 is a schematic block diagram of functional modules of an object recognition apparatus provided by an embodiment of the present application.
  • Icons 10-electronic device; 12-memory; 14-processor; 100-object recognition device; 110-target image acquisition module; 120-similarity information determination module; 130-living object determination module.
  • an embodiment of the present application provides an electronic device 10 .
  • the electronic device 10 may include a memory 12 , a processor 14 and an object recognition apparatus 100 .
  • the processor 14 may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), a system on a chip (SoC), etc.; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • an embodiment of the present application further provides an object recognition method, which can be applied to the above-mentioned electronic device 10 .
  • the method steps defined by the flow related to the object recognition method may be implemented by the electronic device 10 .
  • the specific flow shown in FIG. 2 will be described in detail below.
  • the object recognition method provided by this embodiment of the present application mainly includes the following steps S110 to S130:
  • Step S110 obtaining multiple frames of target images obtained by photographing the target object.
  • the electronic device 10 may first obtain multiple frames of target images.
  • When the target object performs a specified operation such as logging in to an account, querying specified information, or making a transfer transaction, it is determined that the target object needs to be identified, and the above step S110 is performed.
  • Step S120 determining similarity information between multiple frames of target images based on the face information.
  • the electronic device 10 may determine similarity information between the multiple frames of the target images based on the face information in the multiple frames of the target images.
  • the electronic device 10 may determine whether the target object belongs to a living object based on the similarity information.
  • In this way, whether the target object belongs to a living object can be identified with high accuracy without requiring the target object to make a specified action and without requiring the device that photographs the target object to have a depth image sensor, so that the accuracy of face recognition can be effectively guaranteed, the experience of the recognized object can be improved (no specific actions are required), and equipment cost can be reduced.
  • In step S110, the specific manner of obtaining the multi-frame target images is not limited, and can be selected according to actual application requirements.
  • For example, the electronic device 10 is a terminal device such as a mobile phone or a computer; the electronic device 10 can photograph the target object with a carried image acquisition device (such as a camera) to obtain multiple frames of target images carrying the facial information of the target object.
  • For another example, the electronic device 10 is a server; it can photograph the target object through an image acquisition device (such as a camera) on a connected terminal device to obtain multiple frames of target images carrying the facial information of the target object.
  • That is, when it is necessary to perform face recognition on the target object, the electronic device 10, which is connected to a terminal device, can control the image acquisition device to be turned on to photograph the target object. The terminal device can then obtain the multiple frames of target images captured by the image acquisition device and send them to the electronic device 10, so that the electronic device 10 obtains the multiple frames of target images.
  • the obtained multi-frame target images may be all target images obtained by shooting the target object.
  • Alternatively, a preset number of frames may be sampled from them, and the sampling interval may gradually increase, so that the time difference between two adjacent target images is smaller for frames that are earlier in time.
  • For example, the multi-frame target images obtained by shooting may be, in order of time, the first frame target image, the second frame target image, the third frame target image, the fourth frame target image, the fifth frame target image, the sixth frame target image, the seventh frame target image, the eighth frame target image, the ninth frame target image, and the tenth frame target image.
  • the obtained multi-frame target images may be, in order of time, the first frame target image, the third frame target image (one frame apart), the sixth frame target image (two frames apart) and the tenth frame target image (three frames apart).
  • The total duration covered by the obtained multi-frame target images may be less than a preset duration, which may be, for example, 1 s or 0.5 s. It should be noted that this is only an exemplary description; the preset duration can be set according to the actual situation and is not limited here.
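As an illustrative (non-limiting) sketch, the widening-interval sampling described above might look like the following; the function name and parameter values are hypothetical, not taken from the publication:

```python
def sample_frames(frames, fps=30.0, max_duration_s=1.0):
    """Pick frames with a gradually widening gap (2 apart, then 3, then 4, ...),
    stopping once the sampled span would reach max_duration_s."""
    sampled, idx, gap = [], 0, 2
    while idx < len(frames) and idx / fps < max_duration_s:
        sampled.append(frames[idx])
        idx += gap   # jump ahead by the current gap
        gap += 1     # widen the gap for the next pick
    return sampled

# On a 10-frame clip this picks indices 0, 2, 5, 9, i.e. the first, third,
# sixth and tenth frames of the example in the description.
print(sample_frames(list(range(10))))
```

This mirrors the example sequence (one frame apart, then two, then three) while keeping the sampled span under the preset duration.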
  • The facial information in the multiple frames of target images can also be compared based on some image processing algorithms, for example, extracting the facial contour in each target image based on a contour extraction algorithm and then comparing the extracted facial contours to determine the similarity information between the multiple frames of target images.
  • Step S121 perform feature extraction processing on the face information in each frame of the target image based on the pre-trained face recognition model to obtain target feature information of each frame of the target image.
  • Step S122 obtaining similarity information between multiple frames of target images based on the target feature information.
  • After the target feature information of the multi-frame target images is obtained based on step S121, the similarity information between the multi-frame target images may be obtained based on the target feature information.
  • Since the face recognition model (a neural network model with a face recognition function) has a high information processing capability, the extracted target feature information is richer, so that the similarity information determined based on the target feature information has higher accuracy, which in turn improves the recognition accuracy for living objects.
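A minimal sketch of steps S121 to S122: the per-frame face embeddings are assumed to come from the face recognition model (not shown here), and the similarity information is taken as the mean cosine similarity over all frame pairs. The helper names are hypothetical:

```python
import numpy as np

def cosine_similarity(a, b):
    # Cosine of the angle between two face embeddings
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def pairwise_similarity(embeddings):
    """Mean cosine similarity over all pairs of per-frame face embeddings
    (the embeddings themselves would be the target feature information
    extracted by the face recognition model in step S121)."""
    sims = [cosine_similarity(embeddings[i], embeddings[j])
            for i in range(len(embeddings))
            for j in range(i + 1, len(embeddings))]
    return sum(sims) / len(sims)
```

Identical embeddings yield a similarity of 1.0; differing ones pull the mean down, which is what the liveness decision in step S130 inspects.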
  • The multiple frames of target images for which the similarity information is determined may be all of the target images in the multi-frame target images obtained based on step S110, or may be part of them.
  • In step S130, the specific manner of determining whether the target object belongs to a living object is likewise not limited, and can be selected according to actual application requirements.
  • the similarity information is less than a preset similarity, it may be determined that the target object belongs to a non-living object, that is, it may be a photo or a three-dimensional model. If the similarity information is greater than the preset similarity, it can be determined that the target object belongs to a living object, that is, it may be a real person or the like.
  • the similarity information is 100%, that is, identical, it can be determined that the target object belongs to a non-living object, that is, it may be a photo or a three-dimensional model. If the similarity information is not 100%, it may be determined that the target object belongs to a living object.
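One hedged way to combine the two example criteria above (below a preset similarity suggests a non-living object; exactly identical frames also suggest a non-living object) is a band check; the threshold values here are purely illustrative:

```python
def is_living_by_similarity(similarity, lower=0.80, identical=0.999):
    """Living if consecutive frames are highly similar (same face) yet not
    effectively identical: a replayed photo or model tends to yield
    near-identical frames, while a live face shows subtle changes.
    Thresholds are example values, not from the publication."""
    return lower < similarity < identical
```

A value inside the band (e.g. 0.9) is treated as living; values that are too low or exactly 1.0 are treated as non-living.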
  • The inventor of the present application found that when a first image is obtained by photographing the target object, and a second image is then formed by photographing a photo or three-dimensional model made from the first image, there are still some differences between the first image and the second image.
  • the above-mentioned specific manner of determining the confidence information may have different choices.
  • For example, step S130 may include steps S131 and S132, the specific content of which is described below.
  • Step S131 compare at least one target image in the multi-frame target images with the pre-obtained multi-frame images to obtain confidence information that the target object belongs to a living object.
  • After the multi-frame target images are obtained based on step S110, at least one target image among them may be compared with the pre-obtained multi-frame images.
  • the multi-frame images may include face information of multiple different objects, and the objects include living objects and non-living objects. In this way, by comparing the target image with the multi-frame images, confidence information that the target object belongs to a living object can be obtained.
  • For example, the target image is an image captured of the front of the target object during recognition, the first image is an image of the front of a living object (such as a person with the same identity as the target object) captured in advance, and the second image is an image formed by photographing a photo made from the first image.
  • After the confidence information is obtained based on step S131, whether the target object belongs to a living object may be determined based on the confidence information in combination with the similarity information obtained in step S120.
  • Step S133 comparing at least one of the target feature information with a plurality of pre-formed comparative feature information through the face recognition model to obtain confidence information that the target object belongs to a living object.
  • the plurality of comparative feature information may be obtained based on multiple frames of images including face information of a plurality of different objects, and the objects include living objects and non-living objects. In this way, confidence information that the target object belongs to a living object can be obtained.
  • After the confidence information is obtained based on step S133, whether the target object belongs to a living object may be determined based on the confidence information in combination with the similarity information obtained in step S120.
  • The specific manner of obtaining the confidence information through step S133 is not limited, and can be selected according to actual application requirements.
  • For example, the obtained at least one piece of target feature information may be directly compared with all of the pre-formed comparative feature information to determine the confidence information that the target object belongs to a living object.
  • All of the comparative feature information may be formed based on images obtained by photographing a plurality of different objects, where the different objects include objects with different identities, and objects with the same identity include living objects and non-living objects.
  • For example, the first image is obtained by photographing a living object A, and the second image is obtained by photographing a photo B formed from the first image.
  • The photo B is actually a non-living object but has the same identity category as the living object A, that is, they belong to the same person.
  • step S133 may include step S133a, step S133b and step S133c, the specific contents are as follows.
  • Step S133a performing identity category recognition processing on at least one of the target feature information through the face recognition model to obtain identity category information of the at least one target feature information.
  • The obtained at least one piece of target feature information can be subjected to identity category recognition processing by the face recognition model, so as to determine the identity category information corresponding to the target feature information.
  • Step S133b in a plurality of feature spaces included in the face recognition model, determine a target feature space of the at least one target feature information based on the identity category information.
  • the target feature space of the target feature information may be determined from a plurality of feature spaces included in the face recognition model based on the identity category information.
  • Different feature spaces contain comparative feature information of different objects, and each piece of comparative feature information has first label information identifying the identity category of the corresponding object and second label information identifying whether the corresponding object is a living body.
  • the comparative feature information of the same feature space may have the same first label information, and may have different second label information.
  • Step S133c comparing the at least one target feature information with the contrast feature information in the target feature space corresponding to the target feature information through the face recognition model to obtain confidence information that the target object belongs to a living object.
  • In this way, the target feature information can be compared, through the face recognition model, with the comparative feature information in the target feature space; then, according to the comparison result and the second label information of that comparative feature information, the confidence information that the target object belongs to a living object is determined.
  • Because the comparison is restricted to the feature space matching the identity of the target feature information (that is, of the target object), a more refined comparison can be performed when judging whether the object is living, so that the comparison result is more accurate.
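The steps S133a to S133c above can be sketched as follows. The data layout (a dictionary of per-identity feature spaces whose entries carry a liveness flag mirroring the second label information) and all names are assumptions for illustration only:

```python
import numpy as np

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def liveness_confidence(target_feat, feature_spaces):
    """feature_spaces maps an identity to its list of (embedding, is_live)
    comparative entries.
    S133a/S133b: pick the feature space whose entries best match the target.
    S133c: score liveness only within that space, weighting the live labels
    by their similarity to the target feature."""
    best_id = max(
        feature_spaces,
        key=lambda ident: max(cosine(target_feat, e) for e, _ in feature_spaces[ident]))
    entries = feature_spaces[best_id]
    weights = [max(cosine(target_feat, e), 0.0) for e, _ in entries]
    live_weight = sum(w for w, (_, is_live) in zip(weights, entries) if is_live)
    total_weight = sum(weights) or 1.0
    return live_weight / total_weight
```

A target embedding that matches only the live entries of its identity's feature space scores near 1.0; one that matches only the photo-derived (non-living) entries scores near 0.0.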
  • In step S133, depending on the specific number of pieces of target feature information used for the comparison, the specific manner of obtaining the confidence information may also differ, and may be selected according to actual application requirements.
  • For example, in order to reduce the data processing load of the electronic device 10, improve recognition efficiency and lower the performance requirements for the electronic device 10, after the target feature information of the multi-frame target images is obtained based on step S121, the target feature information of one frame of target image may be selected, and the corresponding confidence information may then be obtained based on that target feature information.
  • step S133 may include step S133d and step S133e, and the specific content is as follows.
  • Step S133d: for the target feature information corresponding to each frame of target image, compare the target feature information with a plurality of pre-formed comparative feature information through the face recognition model, and determine the confidence that the target object in that frame of target image belongs to a living object, thereby obtaining multiple confidence levels.
  • That is, for each frame of the obtained target images, the face recognition model can be used to compare the target feature information corresponding to that frame with the plurality of pre-formed comparative feature information, so as to determine the confidence that the target object in that frame belongs to a living object. In this way, multiple confidence levels can be obtained.
  • Step S133e based on the plurality of confidence levels, obtain confidence level information that the target object belongs to a living object.
  • the confidence level information that the target object belongs to a living object may be obtained based on the plurality of confidence levels.
  • a minimum confidence level may be determined among multiple confidence levels as confidence level information that the target object belongs to a living object.
  • a maximum confidence level may be determined among multiple confidence levels as confidence level information that the target object belongs to a living object.
  • an average value may be calculated based on a plurality of confidence levels, and used as confidence level information that the target object belongs to a living object.
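The three aggregation options just listed (minimum, maximum, or average of the per-frame confidences) can be sketched as a single helper; the function name and `mode` parameter are illustrative:

```python
def aggregate_confidences(confidences, mode="min"):
    """Reduce per-frame liveness confidences to a single value.
    'min' is the most conservative choice, 'max' the most permissive,
    and 'mean' averages over all frames."""
    if mode == "min":
        return min(confidences)
    if mode == "max":
        return max(confidences)
    if mode == "mean":
        return sum(confidences) / len(confidences)
    raise ValueError(f"unknown mode: {mode}")
```

Which reduction to use is an application choice; a conservative system would take the minimum so that a single suspicious frame lowers the overall confidence.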
  • In step S132 and step S134, the specific manner of determining whether the target object belongs to a living object based on the confidence information and the similarity information is not limited, and may also be selected according to actual application requirements.
  • For example, the larger of the two pieces of information may be selected as the judgment basis. If the confidence information and the similarity information are in different value ranges, they are first normalized, and the larger piece of information is then used to determine whether the target object belongs to a living object, for example by comparing it with preset information: if the larger information is greater than the preset information, the target object is determined to be a living object.
  • a smaller piece of information may be selected as a judgment basis to determine whether the target object belongs to a living object, for example, by comparing the smaller piece of information with preset information, If the smaller information is greater than the preset information, it is determined that the target object belongs to a living object.
  • For another example, different weighting coefficients may be configured for the confidence information and the similarity information respectively; the weighted sum of the two is then calculated, and whether the target object belongs to a living object is determined based on the weighted sum value.
  • The weight coefficient corresponding to the similarity information may be greater than the weight coefficient corresponding to the confidence information, so that the basis for determining whether the object is living is more focused on the similarity between the target images, that is, on the subtle changes in the face information.
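The weighted-sum fusion just described might look like the following sketch; the weights and threshold are illustrative values (the similarity weight is the larger one, per the emphasis above), not figures from the publication:

```python
def is_living(confidence, similarity, w_conf=0.4, w_sim=0.6, threshold=0.7):
    """Weighted fusion of the liveness confidence (step S133) and the
    inter-frame similarity information (step S120). Both inputs are assumed
    to already be normalized to [0, 1]."""
    score = w_conf * confidence + w_sim * similarity
    return score >= threshold
```

For instance, a high confidence paired with high similarity clears the threshold, while two low values do not.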
  • the object recognition method may further include a step of model training, which may specifically include steps S140 , S150 and S160 , the details of which are as follows.
  • Step S140 Perform feature extraction processing on multiple sample images through a feature extraction layer in a preset neural network model to obtain multiple sample feature information.
  • feature extraction processing can be performed on the multiple sample images based on the feature extraction layer in the preset neural network model, so that multiple sample feature information can be obtained.
  • Through the loss determination layer in the neural network model, the first loss value of each piece of sample feature information can be determined based on the first label information pre-configured for each sample image, and the second loss value of each piece of sample feature information can be determined based on the second label information pre-configured for each sample image. In this way, a plurality of first loss values and a plurality of second loss values can be obtained.
  • Step S160 based on the first loss value and the second loss value, train the neural network model to obtain the face recognition model.
  • The neural network model may be trained based on the first loss value and the second loss value; for example, the parameters of the neural network model are updated through a back-propagation algorithm until the loss value converges, at which point training ends.
  • The neural network model at the end of training can output the expected effect and can be used as the face recognition model; that is, the face recognition model obtained from the final training has a good face recognition effect.
  • the neural network model may be a residual model, such as a deep residual network model (DRN, deep residual network).
  • In step S150, the specific composition of the loss determination layer is not limited, and can also be selected according to actual application requirements.
  • For example, the loss determination layer may include an image classification network (such as fully connected layers, FC), which is used to perform feature classification processing on each piece of sample feature information to obtain multiple feature vectors.
  • In step S160, the specific manner of training based on the first loss value and the second loss value is not limited, and can also be selected according to actual application requirements.
  • For example, the sum of the first loss values and the second loss values may be calculated first and used as the total loss value; the neural network model is then trained based on the total loss value until the total loss value converges, at which point training ends.
  • For another example, a weighted sum of the first loss values and the second loss values may be calculated first and used as the total loss value; the neural network model is then trained based on the total loss value until the total loss value converges, at which point training ends.
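The two total-loss variants above (plain sum and weighted sum of the identity losses and the liveness losses) reduce to one scalar that the training loop minimizes; a hedged sketch, with hypothetical names:

```python
import numpy as np

def total_loss(first_losses, second_losses, w1=1.0, w2=1.0):
    """Combine the per-sample identity losses (first label information) and
    liveness losses (second label information) into the single scalar that
    back-propagation minimizes; w1 == w2 == 1 gives the plain-sum variant."""
    return w1 * float(np.sum(first_losses)) + w2 * float(np.sum(second_losses))
```

Giving the two weights different values lets training emphasize identity recognition or liveness discrimination as needed.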
  • During training, the neural network model can be updated through a back-propagation (BP) algorithm, a supervised learning algorithm, thereby obtaining the face recognition model.
  • an embodiment of the present application further provides an object recognition apparatus 100 , which can be applied to the aforementioned electronic device 10 .
  • the object recognition apparatus 100 may include a target image obtaining module 110 , a similarity information determination module 120 and a living object determination module 130 .
  • the target image obtaining module 110 is configured to obtain multiple frames of target images obtained by photographing a target object, wherein each frame of the target image includes face information of the target object.
  • the target image obtaining module 110 may be configured to execute the step S110 shown in FIG. 2 .
  • For the relevant content of the target image obtaining module 110, reference may be made to the foregoing description of step S110.
  • the similarity information determining module 120 is configured to determine similarity information between multiple frames of the target images based on the face information.
  • the similarity information determination module 120 may be configured to execute step S120 shown in FIG. 2 , and for related content of the similarity information determination module 120 , reference may be made to the foregoing description of step S120 .
  • the living object determination module 130 is configured to determine whether the target object belongs to a living object based on the similarity information.
  • the living object determination module 130 may be configured to execute step S130 shown in FIG. 2 .
  • For the relevant content of the living object determination module 130, reference may be made to the foregoing description of step S130.
  • the similarity information determination module is specifically configured to:
  • Similarity information between multiple frames of the target images is obtained based on the target feature information.
  • the living object determination module is specifically configured to:
  • Comparing at least one piece of the target feature information with a plurality of pre-formed comparative feature information through the face recognition model to obtain confidence information that the target object belongs to a living object, wherein the plurality of comparative feature information is obtained based on multiple frames of images including facial information of a plurality of different objects, and the objects include living objects and non-living objects;
  • Whether the target object belongs to a living object is determined based on the confidence information and the similarity information.
  • the at least one target feature information is compared with the contrast feature information in the target feature space corresponding to the target feature information by the face recognition model, so as to obtain confidence information that the target object belongs to a living object.
  • the living object determination module is specifically configured to:
  • For the target feature information corresponding to each frame of target image, the face recognition model compares the target feature information with a plurality of pre-formed comparative feature information to determine the confidence that the target object in that frame belongs to a living object, thereby obtaining multiple confidences;
  • Confidence information that the target object belongs to a living object is obtained based on the plurality of confidences.
  • a model training module configured to:
  • The first loss value of each piece of sample feature information is determined based on the first label information pre-configured for each sample image, and the second loss value of each piece of sample feature information is determined based on the second label information pre-configured for each sample image, wherein the first label information is used to identify the identity category of the object in the corresponding sample image, and the second label information is used to identify whether the object in the corresponding sample image is living;
  • the neural network model is trained to obtain the face recognition model.
  • the living object determination module is specifically configured to:
  • Whether the target object belongs to a living object is determined based on the confidence information and the similarity information.
  • In this application, "multiple" or "multiple frames", etc., refer to two or more; for example, multi-frame target images refer to two or more target images.
  • In summary, the object recognition method and apparatus, electronic device and storage medium provided by the present application determine whether the target object belongs to a living object by obtaining multiple frames of target images obtained by photographing the target object and then using the similarity information between the face information in those multiple frames of target images.
  • Each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented in dedicated hardware-based systems that perform the specified functions or actions, or in a combination of dedicated hardware and computer instructions.
  • each functional module in the embodiments of the present application may be integrated together to form an independent part, or each module may exist independently, or two or more modules may be integrated to form an independent part.
  • if the functions are implemented in the form of software function modules and sold or used as independent products, they may be stored in a computer-readable storage medium.
  • in essence, the technical solution of the present application, or the part thereof that contributes to the prior art, may be embodied in the form of a software product.
  • the computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, an electronic device, a network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of the present application.
  • the aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or other media that can store program code.
  • the terms "comprising", "including", or any other variation thereof are intended to cover non-exclusive inclusion, such that a process, method, article or apparatus comprising a series of elements includes not only those elements but also other elements not expressly listed or inherent to such a process, method, article or apparatus. Without further limitation, an element qualified by the phrase "comprising a" does not preclude the presence of additional identical elements in the process, method, article or apparatus that includes that element.
  • the technical solution provided by the present application can achieve high-accuracy identification of whether the target object belongs to a living object without requiring the target object to perform a specified action and without requiring the device photographing the target object to have a depth image sensor. The accuracy of face recognition can thus be effectively guaranteed, and since the recognized object need not make specified actions, the experience of the recognized object is also improved, giving the solution wider applicability and practical value.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

An object recognition method and apparatus, an electronic device (10) and a storage medium (12), relating to the technical field of face recognition. The object recognition method comprises: firstly, acquiring multiple frames of target images obtained by photographing a target object (S110), wherein each frame of target image comprises face information of the target object; secondly, determining similarity information among the multiple frames of target images on the basis of the face information (S120); and next, on the basis of the similarity information, determining whether the target object is a living object (S130). On the basis of the object recognition method, the problem of low accuracy of recognition results in existing face recognition technology can be alleviated.

Description

Object Recognition Method and Apparatus, Electronic Device and Storage Medium

Cross-Reference to Related Applications

This application claims priority to Chinese Patent Application No. 2020107800554, filed with the China Patent Office on August 5, 2020 and entitled "Object Recognition Method and Apparatus, Electronic Device and Storage Medium", the entire contents of which are incorporated herein by reference.

Technical Field

The present application relates to the technical field of face recognition, and in particular, to an object recognition method and apparatus, an electronic device and a storage medium.
Background

As face recognition technology is applied ever more widely, the demand for accuracy of face recognition results is also increasing. To improve the accuracy of face recognition results, the recognized object is generally required to perform specified actions such as blinking or shaking the head, or the recognition device must be equipped with a depth image sensor to collect depth information of the recognized object's face.

Summary

In view of this, the purpose of the present application is to provide an object recognition method and apparatus, an electronic device and a storage medium.

To achieve the above purpose, the embodiments of the present application adopt the following technical solutions:

An object recognition method, comprising:

obtaining multiple frames of target images obtained by photographing a target object, wherein each frame of the target image includes face information of the target object;

determining similarity information between the multiple frames of target images based on the face information; and

determining whether the target object belongs to a living object based on the similarity information.
An embodiment of the present application further provides an object recognition apparatus, comprising:

a target image obtaining module, configured to obtain multiple frames of target images obtained by photographing a target object, wherein each frame of the target image includes face information of the target object;

a similarity information determination module, configured to determine similarity information between the multiple frames of target images based on the face information; and

a living object determination module, configured to determine whether the target object belongs to a living object based on the similarity information.

On the above basis, an embodiment of the present application further provides an electronic device, comprising:

a memory for storing a computer program; and

a processor connected with the memory, configured to execute the computer program stored in the memory, so as to implement the above object recognition method.

On the above basis, an embodiment of the present application further provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed, the above object recognition method is implemented.

In order to make the above objects, features and advantages of the present application more obvious and easier to understand, preferred embodiments are described in detail below in conjunction with the accompanying drawings.
Description of Drawings

In order to explain the technical solutions of the present application more clearly, the accompanying drawings required for the description are briefly introduced below. It should be understood that the following drawings show only some implementations of the present application and therefore should not be regarded as limiting the scope; for those of ordinary skill in the art, other related drawings can be obtained from these drawings without creative effort.

FIG. 1 is a structural block diagram of an electronic device provided by an embodiment of the present application.

FIG. 2 is a schematic flowchart of an object recognition method provided by an embodiment of the present application.

FIG. 3 is a schematic diagram of the effect of obtaining multiple frames of target images according to an embodiment of the present application.

FIG. 4 is a schematic flowchart of the sub-steps included in step S120 in FIG. 2.

FIG. 5 is a schematic flowchart of the sub-steps included in step S130 in FIG. 2.

FIG. 6 is a schematic flowchart of other sub-steps included in step S130 in FIG. 2.

FIG. 7 is a schematic flowchart of the sub-steps included in step S133 in FIG. 6.

FIG. 8 is a schematic flowchart of other sub-steps included in step S133 in FIG. 6.

FIG. 9 is a schematic flowchart of other steps of the object recognition method provided by an embodiment of the present application.

FIG. 10 is a schematic block diagram of the functional modules of an object recognition apparatus provided by an embodiment of the present application.

Reference numerals: 10-electronic device; 12-memory; 14-processor; 100-object recognition apparatus; 110-target image obtaining module; 120-similarity information determination module; 130-living object determination module.
Detailed Description

In order to make the purposes, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, rather than all, of the embodiments of the present application. The components of the embodiments of the present application, as generally described and illustrated in the drawings herein, may be arranged and designed in a variety of different configurations.

Therefore, the following detailed description of the embodiments of the present application provided in the accompanying drawings is not intended to limit the scope of the application as claimed, but merely represents selected embodiments of the application. Based on the embodiments in the present application, all other embodiments obtained by those of ordinary skill in the art without creative work fall within the protection scope of the present application.
As shown in FIG. 1, an embodiment of the present application provides an electronic device 10. The electronic device 10 may include a memory 12, a processor 14 and an object recognition apparatus 100.

In detail, the memory 12 and the processor 14 are electrically connected, directly or indirectly, to realize data transmission or interaction. For example, they may be electrically connected through one or more communication buses or signal lines. The memory 12 may store at least one software function module that may exist in the form of software or firmware, such as the object recognition apparatus 100. The processor 14 may be configured to execute the executable computer program stored in the memory 12, such as the object recognition apparatus 100, so as to implement the object recognition method provided by the embodiments of the present application (described later) and thereby determine whether a target object belongs to a living object.

Optionally, the memory 12 may be, but is not limited to, a Random Access Memory (RAM), a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), etc.

The processor 14 may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), a System on Chip (SoC), etc.; it may also be a Digital Signal Processor (DSP), an Application-Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.

It can be understood that the electronic device 10 may be a terminal device with data processing capability (such as a mobile phone or a computer) or a server.

Moreover, the structure shown in FIG. 1 is only illustrative; the electronic device 10 may include more or fewer components than those shown in FIG. 1, or have a configuration different from that shown in FIG. 1. For example, it may also include a communication unit for information interaction with other devices; when the electronic device 10 is a server, the communication unit can be used to communicate with a photographing device of the target object, for example to obtain target images or to feed back recognition results.
With reference to FIG. 2, an embodiment of the present application further provides an object recognition method, which can be applied to the above electronic device 10. The method steps defined by the flow of the object recognition method may be implemented by the electronic device 10. The specific flow shown in FIG. 2 is described in detail below. As shown in FIG. 2, the object recognition method provided by this embodiment mainly includes the following steps S110 to S130:

Step S110: obtain multiple frames of target images obtained by photographing a target object.

In this embodiment, when the target object needs to be recognized to determine whether it belongs to a living object, the electronic device 10 may first obtain multiple frames of target images. In practical applications, when the target object performs a specified operation such as logging in to an account, querying specified information, or making a transfer transaction, it can be determined that the target object needs to be recognized, and the above step S110 is performed.

The multiple frames of target images may be obtained by photographing the target object, and each frame of the target image may include face information of the target object.

Step S120: determine similarity information between the multiple frames of target images based on the face information.

After obtaining the multiple frames of target images in step S110, the electronic device 10 may determine similarity information between the multiple frames of target images based on the face information in those frames.

Step S130: determine whether the target object belongs to a living object based on the similarity information.

After obtaining the similarity information between the multiple frames of target images in step S120, the electronic device 10 may determine, based on the similarity information, whether the target object belongs to a living object.
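As a schematic illustration, steps S110 to S130 can be strung together as follows. The feature extractor here is a stand-in for the trained face recognition model of step S121, and the "all frames effectively identical ⇒ non-living" rule is just one simple embodiment of step S130; none of these choices are prescribed by the text.

```python
import numpy as np

def extract_feature(frame):
    # Stand-in for the face recognition model's feature extraction (step S121):
    # flatten the frame and L2-normalize it.
    v = frame.ravel().astype(float)
    return v / (np.linalg.norm(v) + 1e-12)

def pairwise_similarity(frames):
    # Step S120: similarity information between the multiple frames of target
    # images, here as dot products of normalized features for every frame pair.
    feats = [extract_feature(f) for f in frames]
    return [float(feats[i] @ feats[j])
            for i in range(len(feats)) for j in range(i + 1, len(feats))]

def is_living(frames, identical_eps=1e-6):
    # Step S130 (one embodiment): frames that are all effectively identical
    # suggest a photo or 3D model; any detectable variation suggests a living
    # object exhibiting micro-expression changes.
    return any(s < 1.0 - identical_eps for s in pairwise_similarity(frames))
```

For example, two bitwise-identical frames yield `is_living(...) == False`, while frames with even small pixel differences yield `True`.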
Based on the above method, even without requiring the target object to perform a specified action and without requiring the photographing device to have a depth image sensor, whether the target object belongs to a living object can be recognized with high accuracy. This alleviates the low accuracy of recognition results in existing face recognition technology in applications where the recognized object cannot be required to perform specified actions (e.g., for a better user experience) or where the photographing device is not equipped with a depth image sensor (e.g., for cost reasons). The accuracy of face recognition can thus be effectively guaranteed, the experience of the recognized object is improved (no specified action is required), and the cost of the device can be reduced.

The inventors found through research that a living object (such as a real person), even without deliberately making specified facial actions, exhibits subtle facial changes over time, i.e., micro-expression changes, so that at least subtle differences exist between the multiple frames of captured target images; whereas for a non-living object, such as a photograph or a three-dimensional model, such changes cannot occur.

It is based on this finding that, after long-term research, the inventors of the present application proposed the technical solution of determining whether a target object belongs to a living object based on similarity information between images, namely the object recognition method provided by the embodiments of the present application.
Regarding step S110, it should be noted that the specific manner of obtaining the multiple frames of target images is not limited and can be selected according to actual application requirements.

Optionally, the electronic device 10 is a terminal device such as a mobile phone or a computer, and can photograph the target object with an onboard image acquisition device (such as a camera) to obtain multiple frames of target images containing the face information of the target object.

That is, when face recognition needs to be performed on the target object, the electronic device 10 can control the image acquisition device to turn on to photograph the target object; after capturing multiple frames of target images, the image acquisition device can send them to the electronic device 10, so that the electronic device 10 obtains the multiple frames of target images.

Optionally, the electronic device 10 is a server, and can obtain multiple frames of target images containing the face information of the target object via an image acquisition device (such as a camera) on a connected terminal device.

That is, when face recognition needs to be performed on the target object, the terminal device connected to the electronic device 10 can control the image acquisition device to turn on to photograph the target object. The terminal device then obtains the multiple frames of target images captured by the image acquisition device and sends them to the electronic device 10, so that the electronic device 10 obtains the multiple frames of target images.

It should be further noted for step S110 that the obtained multiple frames of target images may be all the target images obtained by photographing the target object, or only some of them (such as only the face images of the target object).

Optionally, if the requirement for recognition accuracy is particularly high, or the total number of captured target images is not particularly large, the obtained multiple frames of target images may be all the target images obtained by photographing the target object.

Optionally, in order to reduce the data processing load of the electronic device 10, the obtained multiple frames of target images may be only some of all the target images obtained by photographing the target object.

Based on different application requirements, the manner of selecting some target images from all the captured target images is not limited and can be chosen according to actual needs.

The inventors found that a living object is generally more likely to exhibit larger expression changes when it first starts being photographed. Thus, while the data processing load of the electronic device 10 is reduced, larger expression changes are easier to recognize, so that misrecognition can be sufficiently avoided. Based on this, the first N frames may be selected from all captured target images; optionally, the last N frames may be selected; or the middle N frames may be selected.

Optionally, the inventors of the present application also found that between target images of adjacent frames, the shooting interval is extremely short, so the difference in expression (face information) is extremely small and difficult to recognize effectively. Therefore, considering both the data processing load and the recognition accuracy, one frame of target image may be selected from all the captured target images at every preset number of frames, thereby obtaining the multiple frames of target images.

In combination with the aforementioned finding (namely, that a living object is generally more likely to exhibit larger expression changes when it first starts being photographed), the above preset number of frames may optionally increase progressively.

That is, among the obtained multiple frames of target images, the earlier two adjacent frames are in time, the smaller the time difference between them may be.

Optionally, with reference to FIG. 3, suppose the captured target images, in chronological order, are the first frame through the tenth frame. The obtained multiple frames of target images, in chronological order, may then be the first frame, the third frame (one frame apart), the sixth frame (two frames apart) and the tenth frame (three frames apart).

It should also be further noted for step S110 that, in order to further improve the accuracy of recognizing whether the target object belongs to a living object, the total frame length of the obtained multiple frames of target images may be less than a preset duration, which may be, for example, 1 s or 0.5 s. This is only an exemplary description; the preset duration can be set according to the actual situation and is not limited here.

In this way, by limiting the total frame length, misrecognition caused by forging a living object by switching photos or playing a video can be sufficiently avoided.
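The progressive-gap sampling illustrated with FIG. 3 (frames 1, 3, 6 and 10 of a ten-frame capture) and the total-duration cap can be sketched as small helpers. The frame rate and the 1 s cap below are example values only, not fixed by the text.

```python
def sample_frame_indices(total_frames, first_gap=1):
    """Pick 0-based frame indices with progressively larger gaps.

    With first_gap=1 this yields 0, 2, 5, 9, ... (i.e. frames 1, 3, 6, 10),
    matching the FIG. 3 example: each skip is one frame longer than the last.
    """
    indices, i, gap = [], 0, first_gap
    while i < total_frames:
        indices.append(i)
        i += gap + 1
        gap += 1
    return indices

def within_duration(indices, fps, max_seconds=1.0):
    """Check the sampled span stays under the preset duration (e.g. 1 s)."""
    return (indices[-1] - indices[0]) / fps < max_seconds
```

For a ten-frame capture at 30 fps, the selected span covers 0.3 s, well under the example 1 s cap.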
Regarding step S120, it should be noted that the specific manner of determining the similarity information between the multiple frames of target images is not limited and can be selected according to actual application requirements.

Optionally, the face information in the multiple frames of target images can be compared based on some image processing algorithms; for example, facial contours can be extracted from the target images based on a contour extraction algorithm and then compared to determine the similarity information between the multiple frames of target images.

Optionally, in order to improve the accuracy of the determined similarity information, the determination may be performed based on a neural network. On this basis, with reference to FIG. 4, step S120 may include step S121 and step S122, described below.

Step S121: perform feature extraction processing on the face information in each frame of the target image based on a pre-trained face recognition model, to obtain target feature information of each frame of the target image.

After the multiple frames of target images are obtained in step S110, they may be input into the pre-trained face recognition model, and feature extraction may be performed on the face information in each frame based on the model, thereby obtaining the target feature information of the multiple frames of target images.

Step S122: obtain the similarity information between the multiple frames of target images based on the target feature information.

After the target feature information of the multiple frames of target images is obtained in step S121, the similarity information between the frames may be obtained based on that target feature information.

In this way, since the face recognition model (a neural network model with a face recognition function) has high information processing capability, the extracted target feature information is richer, so that the similarity information determined from the target feature information has higher accuracy, thereby improving the recognition accuracy for living objects.

It should be further noted for step S120 that the multiple frames of target images used for determining the similarity information may be all of the target images obtained in step S110, or only some of them.
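Step S122 does not prescribe a particular similarity measure for the target feature information; one common assumed choice is cosine similarity between the per-frame feature vectors, sketched below with hypothetical 4-D embeddings.

```python
import numpy as np

def cosine_similarity(feat_a, feat_b):
    """Similarity between two frames' target feature vectors, in [-1, 1]."""
    return float(np.dot(feat_a, feat_b) /
                 (np.linalg.norm(feat_a) * np.linalg.norm(feat_b) + 1e-12))

# Hypothetical embeddings for two frames of the same face: nearly identical,
# differing only by micro-expression-scale perturbations.
f1 = np.array([0.20, 0.90, 0.10, 0.40])
f2 = np.array([0.22, 0.88, 0.12, 0.41])
sim = cosine_similarity(f1, f2)
```

Frames of the same living face should score close to (but, per the micro-expression observation, not exactly) 1.0.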
Regarding step S130, it should be noted that the specific manner of determining whether the target object belongs to a living object is likewise not limited and can be selected according to actual application requirements.

Optionally, whether the target object belongs to a living object may be determined based on the similarity information alone.

Optionally, if the similarity information is less than a preset similarity, it may be determined that the target object is a non-living object, i.e., possibly a photograph or a three-dimensional model; if the similarity information is greater than the preset similarity, it may be determined that the target object is a living object, i.e., possibly a real person.

Optionally, if the similarity information is 100%, i.e., the frames are completely identical, it may be determined that the target object is a non-living object, i.e., possibly a photograph or a three-dimensional model; if the similarity information is not 100%, it may be determined that the target object is a living object.
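The two alternative similarity-only decision rules above can be written directly. The preset threshold value of 0.98 is illustrative; the text does not specify one.

```python
def is_living_by_preset(similarity, preset=0.98):
    # First embodiment: similarity below the preset value -> non-living;
    # above it -> living. (Assumed example threshold of 0.98.)
    return similarity > preset

def is_living_by_exact_match(similarity):
    # Second embodiment: similarity of exactly 100% (completely identical
    # frames) -> non-living; anything else -> living.
    return similarity != 1.0
```

Note that the two embodiments are alternatives, applied to different notions of "similarity information", and are not meant to be combined as-is.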
可选地,为了进一步提高是否属于活体对象的识别准确度,在所述相似度信息的基础上,还可以结合其它信息进行综合判断。其中,经过本申请的发明人的研究发现,拍摄目标对象得到第一图像,然后,再拍摄基于该第一图像形成的照片或三维模型形成第二图像,对于第一图像和第二图像之间,实际上还是会存在一些区别。Optionally, in order to further improve the recognition accuracy of whether it belongs to a living object, on the basis of the similarity information, a comprehensive judgment may also be made in combination with other information. Among them, the inventor of the present application found that the first image is obtained by photographing the target object, and then the photo or three-dimensional model formed based on the first image is photographed to form the second image. For the difference between the first image and the second image , there are still some differences.
基于此，可选地，在所述相似度信息的基础上，结合基于所述目标图像与预设的一些图像进行比较得到的属于活体对象的置信度信息，从而确定目标对象是否属于活体对象。Based on this, optionally, confidence information that the target belongs to a living object, obtained by comparing the target image with certain preset images, may be combined with the similarity information to determine whether the target object belongs to a living object.
其中,基于不同的需求,上述确定所述置信度信息的具体方式可以有不同的选择。Wherein, based on different requirements, the above-mentioned specific manner of determining the confidence information may have different choices.
可选地，为了使得置信度信息的确定不依赖于复杂的神经网络，以降低对所述电子设备10的数据处理性能的需求，结合图5，步骤S130可以包括步骤S131和步骤S132，具体内容如下所述。Optionally, in order that the determination of the confidence information does not rely on a complex neural network, so as to reduce the demand on the data processing performance of the electronic device 10, in conjunction with FIG. 5, step S130 may include step S131 and step S132, the details of which are described below.
步骤S131,将所述多帧目标图像中的至少一帧目标图像,与预先获得的多帧图像进行比较,得到所述目标对象属于活体对象的置信度信息。Step S131 , compare at least one target image in the multi-frame target images with the pre-obtained multi-frame images to obtain confidence information that the target object belongs to a living object.
在基于步骤S110获得多帧目标图像之后,可以将该多帧目标图像中的至少一帧目标图像,与预先获得的多帧图像进行比较。After the multi-frame target image is obtained based on step S110, at least one target image in the multi-frame target image may be compared with the pre-obtained multi-frame image.
其中,所述多帧图像可以包括多个不同对象的脸部信息,且该对象包括活体对象和非活体对象。如此,通过将所述目标图像与所述多帧图像进行对比,可以得到所述目标对象属于活体对象的置信度信息。Wherein, the multi-frame images may include face information of multiple different objects, and the objects include living objects and non-living objects. In this way, by comparing the target image with the multi-frame images, confidence information that the target object belongs to a living object can be obtained.
例如，目标图像为在识别时针对目标对象的正面拍摄得到的图像，第一图像为预先针对活体对象(如与目标对象表征相同的人)的正面拍摄得到的图像，第二图像为拍摄基于第一图像形成的照片得到的图像。如此，若目标图像与第一图像之间的相似度高于与第二图像之间的相似度，可以得到一个较高的置信度，即目标对象更有可能属于活体对象；若目标图像与第一图像之间的相似度小于与第二图像之间的相似度，可以得到一个较小的置信度，即目标对象更有可能属于非活体对象。For example, the target image is an image captured of the front of the target object at recognition time, the first image is an image captured in advance of the front of a living object (such as a person having the same identity as the target object), and the second image is an image obtained by photographing a photo formed from the first image. In this way, if the similarity between the target image and the first image is higher than that between the target image and the second image, a higher confidence can be obtained, that is, the target object is more likely to be a living object; if the similarity between the target image and the first image is lower than that between the target image and the second image, a lower confidence can be obtained, that is, the target object is more likely to be a non-living object.
步骤S132,基于所述置信度信息和相似度信息确定所述目标对象是否属于活体对象。Step S132, determining whether the target object belongs to a living object based on the confidence information and the similarity information.
在基于步骤S131获得所述置信度信息之后,可以基于该置信度信息,并结合步骤S120获得的相似度信息,确定所述目标对象是否属于活体对象。After the confidence level information is obtained based on step S131, it may be determined whether the target object belongs to a living object based on the confidence level information and in combination with the similarity level information obtained in step S120.
可选地，为了提高确定的置信度信息的准确度，从而保证是否属于活体对象的识别准确度也较高，结合图6，步骤S130也可以包括步骤S133和步骤S134，具体内容如下所述。Optionally, in order to improve the accuracy of the determined confidence information, thereby also ensuring a high recognition accuracy of whether the target belongs to a living object, in conjunction with FIG. 6, step S130 may also include step S133 and step S134, the details of which are described below.
步骤S133,通过所述人脸识别模型将至少一个所述目标特征信息与预先形成的多个对比特征信息进行比较,得到所述目标对象属于活体对象的置信度信息。Step S133, comparing at least one of the target feature information with a plurality of pre-formed comparative feature information through the face recognition model to obtain confidence information that the target object belongs to a living object.
在基于预先训练得到的人脸识别模型对多帧所述目标图像进行特征提取处理之后，如在基于步骤S121得到所述目标特征信息之后，可以通过所述人脸识别模型进一步对该目标特征信息(至少为一个)进行处理，即将该目标特征信息与预先形成的多个对比特征信息进行比较。After feature extraction processing is performed on the multiple frames of target images based on the pre-trained face recognition model, for example, after the target feature information is obtained based on step S121, the face recognition model may further process the target feature information (at least one piece thereof), that is, compare the target feature information with a plurality of pre-formed comparative feature information.
其中,所述多个对比特征信息可以基于多帧包括多个不同对象的脸部信息的图像得到,且该对象包括活体对象和非活体对象。如此,可以得到所述目标对象属于活体对象的置信度信息。Wherein, the plurality of comparative feature information may be obtained based on multiple frames of images including face information of a plurality of different objects, and the objects include living objects and non-living objects. In this way, confidence information that the target object belongs to a living object can be obtained.
步骤S134,基于所述置信度信息和所述相似度信息确定所述目标对象是否属于活体对象。Step S134: Determine whether the target object belongs to a living object based on the confidence information and the similarity information.
在基于步骤S133得到所述置信度信息之后,可以基于该置信度信息,并结合步骤S120得到的相似度信息,确定所述目标对象是否属于活体对象。After the confidence information is obtained based on step S133, it may be determined whether the target object belongs to a living object based on the confidence information and in combination with the similarity information obtained in step S120.
通过步骤S133得到所述置信度信息的具体方式不受限制,可以根据实际应用需求进行选择。The specific manner of obtaining the confidence information through step S133 is not limited, and can be selected according to actual application requirements.
可选地,可以直接将得到的至少一个目标特征信息与预先形成的全部特征信息进行对比,以确定所述目标对象属于活体对象的置信度信息。Optionally, the obtained at least one target feature information may be directly compared with all pre-formed feature information to determine confidence information that the target object belongs to a living object.
其中，所述全部特征信息可以是基于拍摄多个不同对象得到的图像形成，该不同对象包括身份类别不同的对象，且身份类别相同的对象包括活体对象和非活体对象，如针对活体对象A进行拍摄得到第一图像，针对第一图像形成的照片B进行拍摄得到第二图像，此时，该照片B实际上就是非活体对象，且与活体对象A具有相同的身份类别，如属于同一个人。Wherein, all of the feature information may be formed based on images obtained by photographing a plurality of different objects, the different objects including objects of different identity categories, and the objects of the same identity category including living objects and non-living objects. For example, a first image is obtained by photographing living object A, and a second image is obtained by photographing photo B formed from the first image; in this case, photo B is actually a non-living object and has the same identity category as living object A, such as belonging to the same person.
可选地,为了在将信息进行对比时,提高对比分析的精度,结合图7,步骤S133可以包括步骤S133a、步骤S133b和步骤S133c,具体内容如下所述。Optionally, in order to improve the accuracy of comparative analysis when comparing the information, with reference to FIG. 7 , step S133 may include step S133a, step S133b and step S133c, the specific contents are as follows.
步骤S133a,通过所述人脸识别模型对至少一个所述目标特征信息进行身份类别识别处理,得到该至少一个目标特征信息的身份类别信息。Step S133a, performing identity category recognition processing on at least one of the target feature information through the face recognition model to obtain identity category information of the at least one target feature information.
在基于所述人脸识别模型对多帧所述目标图像进行特征提取处理之后，如在基于步骤S121得到所述目标特征信息之后，可以通过该人脸识别模型对得到的至少一个目标特征信息进行身份类别识别处理，从而确定该目标特征信息对应的身份类别信息。After feature extraction processing is performed on the multiple frames of target images based on the face recognition model, for example, after the target feature information is obtained based on step S121, the face recognition model may perform identity category recognition processing on the obtained at least one piece of target feature information, so as to determine the identity category information corresponding to that target feature information.
也即，可以基于所述人脸识别模型先确定目标特征信息是属于哪一个人，即确定所述目标对象是属于哪一个人。That is, based on the face recognition model, it is possible to first determine which person the target feature information belongs to, that is, to determine which person the target object belongs to.
步骤S133b,在所述人脸识别模型包括的多个特征空间中,基于所述身份类别信息确定所述至少一个目标特征信息的目标特征空间。Step S133b, in a plurality of feature spaces included in the face recognition model, determine a target feature space of the at least one target feature information based on the identity category information.
在基于步骤S133a确定目标特征信息的身份类别信息之后,可以基于该身份类别信息,在所述人脸识别模型包括的多个特征空间中确定该目标特征信息的目标特征空间。After determining the identity category information of the target feature information based on step S133a, the target feature space of the target feature information may be determined from a plurality of feature spaces included in the face recognition model based on the identity category information.
其中,不同的特征空间具有不同对象的对比特征信息,且每一个所述对比特征信息具有标识对应对象的身份类别的第一标签信息和是否为活体的第二标签信息。Wherein, different feature spaces have comparative feature information of different objects, and each of the comparative feature information has first label information identifying the identity category of the corresponding object and second label information whether it is a living body.
也即,同一个特征空间的对比特征信息可以具有相同的第一标签信息,且可以具有不同的第二标签信息。That is, the comparative feature information of the same feature space may have the same first label information, and may have different second label information.
步骤S133c,通过所述人脸识别模型将所述至少一个目标特征信息与该目标特征信息对应的目标特征空间中的对比特征信息进行比较,得到所述目标对象属于活体对象的置信度信息。Step S133c, comparing the at least one target feature information with the contrast feature information in the target feature space corresponding to the target feature information through the face recognition model to obtain confidence information that the target object belongs to a living object.
在基于步骤S133b确定目标特征空间之后，可以通过所述人脸识别模型将该目标特征信息与目标特征空间中的对比特征信息进行比较，然后，根据比较结果和对比特征信息的第二标签信息，确定所述目标对象属于活体对象的置信度信息。After the target feature space is determined based on step S133b, the target feature information may be compared with the contrast feature information in the target feature space through the face recognition model; then, according to the comparison result and the second label information of the contrast feature information, the confidence information that the target object belongs to a living object is determined.
也即，在上述可选示例中，通过先确定目标特征信息(即目标对象)的身份类别，再确定是否属于活体对象，如此，使得在进行活体对象判断时可以进行更为精细化的比较处理，使得比较结果可以更为准确。That is, in the above optional example, the identity category of the target feature information (that is, of the target object) is determined first, and whether it belongs to a living object is determined afterwards; in this way, a more refined comparison process can be performed when judging whether the object is living, so that the comparison result can be more accurate.
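Steps S133a to S133c can be illustrated with a minimal sketch. The per-identity feature spaces, the cosine metric, and the nearest-neighbor rule for turning the comparison into a confidence are all hypothetical assumptions for illustration; the disclosure does not fix these details.

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Each feature space holds contrast features for one identity category
# (first label); each entry also carries whether it came from a living
# capture (second label). Identities and vectors here are toy data.
feature_spaces = {
    "person_A": [
        (np.array([1.0, 0.0]), True),   # live enrollment of person_A
        (np.array([0.0, 1.0]), False),  # photo of person_A
    ],
}

def live_confidence(target_feature: np.ndarray, identity: str) -> float:
    """Compare the target feature only within its identity's feature space;
    the nearest contrast feature's liveness label shapes the confidence."""
    space = feature_spaces[identity]
    sims = [(cosine(target_feature, f), is_live) for f, is_live in space]
    best_sim, best_live = max(sims)  # nearest contrast feature wins
    return best_sim if best_live else 1.0 - best_sim
```

A feature close to the live enrollment yields a confidence near 1; one close to the stored photo yields a confidence near 0.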
并且，在步骤S133中，基于对比的目标特征信息的具体数量不同，得到置信度信息的具体方式也可以不同，可以根据实际应用需求进行选择。Moreover, in step S133, depending on the specific number of pieces of target feature information used for comparison, the specific manner of obtaining the confidence information may also differ, and may be selected according to actual application requirements.
可选地，为了降低所述电子设备10的数据处理量，以提高识别的效率和降低对电子设备10的性能需求，在基于步骤S121得到多帧目标图像的目标特征信息之后，可以选择出一帧目标图像的目标特征信息，然后，基于该目标特征信息得到对应的置信度信息。Optionally, in order to reduce the data processing load of the electronic device 10, so as to improve recognition efficiency and reduce the performance requirements on the electronic device 10, after the target feature information of the multiple frames of target images is obtained based on step S121, the target feature information of one frame of target image may be selected, and the corresponding confidence information may then be obtained based on that target feature information.
可选地,为了提高识别的准确度,结合图8,步骤S133可以包括步骤S133d和步骤S133e,具体内容如下所述。Optionally, in order to improve the accuracy of identification, with reference to FIG. 8 , step S133 may include step S133d and step S133e, and the specific content is as follows.
步骤S133d，针对每一帧目标图像对应的目标特征信息，通过所述人脸识别模型将该目标特征信息与预先形成的多个对比特征信息进行比较，确定在该帧目标图像中目标对象属于活体对象的置信度，得到多个置信度。Step S133d: for the target feature information corresponding to each frame of target image, compare the target feature information with a plurality of pre-formed comparative feature information through the face recognition model, and determine the confidence that the target object in that frame of target image belongs to a living object, thereby obtaining multiple confidence levels.
针对获得的每一帧所述目标图像，可以通过所述人脸识别模型将该帧目标图像对应的目标特征信息与预先形成的多个对比特征信息进行比较，从而确定在该帧目标图像中所述目标对象属于活体对象的置信度(具体实现方式可以参照前文对步骤S133a、步骤S133b和步骤S133c以及相关内容的解释说明)，如此，可以得到多个置信度。For each obtained frame of the target image, the face recognition model may compare the target feature information corresponding to that frame of target image with a plurality of pre-formed comparative feature information, so as to determine the confidence that the target object in that frame of target image belongs to a living object (for a specific implementation, refer to the foregoing explanations of step S133a, step S133b and step S133c and the related content). In this way, multiple confidence levels can be obtained.
步骤S133e,基于所述多个置信度得到所述目标对象属于活体对象的置信度信息。Step S133e, based on the plurality of confidence levels, obtain confidence level information that the target object belongs to a living object.
在基于步骤S133d得到多个置信度之后,可以基于该多个置信度,得到所述目标对象属于活体对象的置信度信息。After a plurality of confidence levels are obtained based on step S133d, the confidence level information that the target object belongs to a living object may be obtained based on the plurality of confidence levels.
可选地,可以在多个置信度中确定一个最小的置信度,作为所述目标对象属于活体对象的置信度信息。Optionally, a minimum confidence level may be determined among multiple confidence levels as confidence level information that the target object belongs to a living object.
可选地,可以在多个置信度中确定一个最大的置信度,作为所述目标对象属于活体对象的置信度信息。Optionally, a maximum confidence level may be determined among multiple confidence levels as confidence level information that the target object belongs to a living object.
可选地,可以基于多个置信度计算得到一个平均值,作为所述目标对象属于活体对象的置信度信息。Optionally, an average value may be calculated based on a plurality of confidence levels, and used as confidence level information that the target object belongs to a living object.
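The three optional reductions above (minimum, maximum, or average over the per-frame confidences of step S133e) can be sketched as follows; the function name and the `mode` parameter are illustrative assumptions.

```python
# Sketch of the three optional reductions over per-frame confidences.
def aggregate_confidences(confidences, mode="mean"):
    if mode == "min":   # most conservative: one suspicious frame dominates
        return min(confidences)
    if mode == "max":   # most permissive: one convincing frame dominates
        return max(confidences)
    return sum(confidences) / len(confidences)  # average over all frames
```

The minimum is the strictest choice for spoof detection, while the average smooths out per-frame noise.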
在步骤S132和步骤S134中,基于所述置信度信息和所述相似度信息确定目标对象是否属于活体对象的具体方式不受限制,也可以根据实际应用需求进行选择。In step S132 and step S134, the specific manner of determining whether the target object belongs to a living object based on the confidence information and the similarity information is not limited, and may also be selected according to actual application requirements.
可选地，可以在所述置信度信息和所述相似度信息中，选择一个较大信息作为判断依据，其中，若置信度信息和相似度信息为不同取值范围的信息时，可以先进行归一化处理之后，再进行选择较大信息，以确定目标对象是否属于活体对象，如将该较大信息与预设信息进行比较，若该较大信息大于该预设信息，则确定该目标对象属于活体对象。Optionally, the larger of the confidence information and the similarity information may be selected as the judgment basis; where the confidence information and the similarity information have different value ranges, normalization may be performed first and the larger value selected afterwards, so as to determine whether the target object belongs to a living object. For example, the larger value is compared with preset information, and if the larger value is greater than the preset information, the target object is determined to be a living object.
可选地，可以在所述置信度信息和所述相似度信息中，选择一个较小信息作为判断依据，以确定目标对象是否属于活体对象，如将该较小信息与预设信息进行比较，若该较小信息大于该预设信息，则确定该目标对象属于活体对象。Optionally, the smaller of the confidence information and the similarity information may be selected as the judgment basis to determine whether the target object belongs to a living object; for example, the smaller value is compared with preset information, and if the smaller value is greater than the preset information, the target object is determined to be a living object.
可选地，为了进一步提高确定目标对象是否属于活体对象的准确度，可以分别为所述置信度信息和所述相似度信息配置不同的权重系数，然后，计算该置信度信息和该相似度信息的加权求和值，再基于该加权求和值确定该目标对象是否属于活体对象。Optionally, in order to further improve the accuracy of determining whether the target object belongs to a living object, different weight coefficients may be configured for the confidence information and the similarity information respectively; then, a weighted sum of the confidence information and the similarity information is calculated, and whether the target object belongs to a living object is determined based on the weighted sum value.
可选地，所述相似度信息对应的权重系数可以大于所述置信度信息对应的权重系数，如此，使得确定是否属于活体对象的依据更侧重于目标图像之间的相似度，即脸部信息的细微变化。Optionally, the weight coefficient corresponding to the similarity information may be greater than the weight coefficient corresponding to the confidence information, so that the basis for determining whether the target belongs to a living object places more emphasis on the similarity between the target images, that is, on the subtle changes in the face information.
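A minimal sketch of the weighted-sum combination used in steps S132/S134, assuming both scores are normalized to [0, 1]; the specific weights and decision threshold are illustrative assumptions, with the similarity weight chosen larger as suggested above.

```python
# Hypothetical weighted fusion of confidence and similarity information.
def fuse_and_decide(confidence, similarity,
                    w_sim=0.6, w_conf=0.4, preset=0.5):
    # w_sim > w_conf emphasizes frame-to-frame similarity (micro-expressions)
    score = w_sim * similarity + w_conf * confidence  # weighted sum
    return score > preset                             # True -> living object
```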
进一步地，在上述可选示例中，可以基于人脸识别模型进行特征提取处理，以得到目标特征信息，因此，为了使得基于该人脸识别模型可以提取到更为丰富的目标特征信息，使得该目标特征信息包括更多的细节信息，可选地，结合图9，所述对象识别方法还可以包括模型训练的步骤，具体可以包括步骤S140、步骤S150和步骤S160，具体内容如下所述。Further, in the above optional examples, feature extraction processing may be performed based on the face recognition model to obtain the target feature information. Therefore, in order that richer target feature information, including more detailed information, can be extracted based on the face recognition model, optionally, with reference to FIG. 9, the object recognition method may further include a model training step, which may specifically include step S140, step S150 and step S160, the details of which are described below.
步骤S140,通过预设的神经网络模型中的特征提取层对多个样本图像进行特征提取处理,得到多个样本特征信息。Step S140: Perform feature extraction processing on multiple sample images through a feature extraction layer in a preset neural network model to obtain multiple sample feature information.
在获得多个样本图像之后，可以基于预设的神经网络模型中的特征提取层，对该多个样本图像进行特征提取处理，如此，可以得到多个样本特征信息。After obtaining multiple sample images, feature extraction processing can be performed on the multiple sample images based on the feature extraction layer in the preset neural network model; in this way, multiple pieces of sample feature information can be obtained.
步骤S150，通过所述神经网络模型中的损失确定层，分别基于预先为每一个样本图像配置的第一标签信息确定每一个样本特征信息的第一损失值、基于预先为每一个样本图像配置的第二标签信息确定每一个样本特征信息的第二损失值。Step S150: through the loss determination layer in the neural network model, determine a first loss value of each piece of sample feature information based on first label information pre-configured for each sample image, and determine a second loss value of each piece of sample feature information based on second label information pre-configured for each sample image.
在基于步骤S140得到多个样本特征信息之后，针对每一个样本特征信息，可以通过所述神经网络模型中的损失确定层，基于预先为每一个样本图像配置的第一标签信息确定该样本特征信息的第一损失值，并基于预先为每一个样本图像配置的第二标签信息确定该样本特征信息的第二损失值。如此，可以得到多个第一损失值和多个第二损失值。After the multiple pieces of sample feature information are obtained based on step S140, for each piece of sample feature information, the loss determination layer in the neural network model may determine the first loss value of that sample feature information based on the first label information pre-configured for each sample image, and determine the second loss value of that sample feature information based on the second label information pre-configured for each sample image. In this way, multiple first loss values and multiple second loss values can be obtained.
其中,所述第一标签信息可以用于标识对应的样本图像中的对象的身份类别,如通过第一标签信息可标识出该对象具体是哪一个人。所述第二标签信息可以用于标识对应的样本图像中的对象是否为活体对象,如通过第二标签信息可以标识出该对象是否为真人。The first label information may be used to identify the identity category of the object in the corresponding sample image, for example, the first label information may identify which person the object is. The second label information can be used to identify whether the object in the corresponding sample image is a living object, for example, whether the object is a real person can be identified through the second label information.
步骤S160,基于所述第一损失值和所述第二损失值,对所述神经网络模型进行训练,得到所述人脸识别模型。Step S160, based on the first loss value and the second loss value, train the neural network model to obtain the face recognition model.
在基于步骤S150获得所述第一损失值和所述第二损失值之后，可以基于该第一损失值和该第二损失值对所述神经网络模型进行训练，诸如，通过反向传播算法对该神经网络模型的参数进行更新，直至损失值收敛后结束训练，结束训练时的神经网络模型能够输出符合预期的效果，可将其作为所述人脸识别模型，也即，最后训练所得的人脸识别模型具有较好的人脸识别效果。After the first loss value and the second loss value are obtained based on step S150, the neural network model may be trained based on the first loss value and the second loss value, for example, by updating the parameters of the neural network model through a back-propagation algorithm until the loss value converges, at which point training ends. The neural network model at the end of training can output results meeting expectations and may be used as the face recognition model; that is, the finally trained face recognition model has a good face recognition effect.
对于步骤S140需要说明的是,所述神经网络模型的具体架构不受限制,可以根据实际应用需求进行选择。It should be noted for step S140 that the specific architecture of the neural network model is not limited, and can be selected according to actual application requirements.
可选地,所述神经网络模型可以是一种残差模型,如深度残差网络模型(DRN,deep residual network)。Optionally, the neural network model may be a residual model, such as a deep residual network model (DRN, deep residual network).
并且,所述特征提取层的具体构成也不受限制。可选地,该特征提取层可以是一种编码器。Moreover, the specific structure of the feature extraction layer is also not limited. Optionally, the feature extraction layer may be an encoder.
对于步骤S150需要说明的是,所述损失确定层的具体构成也不受限制,也可以根据实际应用需求选择。It should be noted for step S150 that the specific composition of the loss determination layer is not limited, and can also be selected according to actual application requirements.
可选地，所述损失确定层可以包括图像分类网络(如全连接层，fully connected layers: FC)，用于对每一个样本特征信息进行特征分类处理，得到多个特征向量。如此，通过将所述特征向量和基于所述第一标签信息形成的第一标签向量进行计算，可以得到所述第一损失值；并且，通过将所述特征向量和基于所述第二标签信息形成的第二标签向量进行计算，可以得到所述第二损失值。Optionally, the loss determination layer may include an image classification network (such as fully connected layers, FC) for performing feature classification processing on each piece of sample feature information to obtain multiple feature vectors. In this way, the first loss value can be obtained by computing on the feature vector and a first label vector formed based on the first label information; and the second loss value can be obtained by computing on the feature vector and a second label vector formed based on the second label information.
对于步骤S160需要说明的是，基于第一损失值和第二损失值进行训练的具体方式不受限制，也可以根据实际应用需求进行选择。It should be noted for step S160 that the specific manner of training based on the first loss value and the second loss value is not limited, and can also be selected according to actual application requirements.
可选地，可以先计算第一损失值和第二损失值的和值，并将该和值作为损失总值，然后，再基于该损失总值对所述神经网络模型进行训练，直至该损失总值收敛时结束训练。Optionally, the sum of the first loss value and the second loss value may be calculated first and used as the total loss value; then, the neural network model is trained based on the total loss value until the total loss value converges, at which point training ends.
可选地，可以先计算第一损失值和第二损失值的加权和值，并将该加权和值作为损失总值，然后，再基于该损失总值对所述神经网络模型进行训练，直至该损失总值收敛时结束训练。Optionally, a weighted sum of the first loss value and the second loss value may be calculated first and used as the total loss value; then, the neural network model is trained based on the total loss value until the total loss value converges, at which point training ends.
其中,在进行训练时,具体的方式也不受限制。Wherein, during training, the specific manner is also not limited.
可选地，可以基于计算得到的损失总值，通过反向传播算法(Backpropagation algorithm，BP算法，是一种监督学习算法)对所述神经网络模型进行更新处理，从而得到所述人脸识别模型。Optionally, based on the calculated total loss value, the neural network model may be updated through a back-propagation algorithm (BP algorithm, a supervised learning algorithm), thereby obtaining the face recognition model.
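The dual-loss training of steps S140 to S160 can be sketched with a toy numerical example. The tiny linear "feature extraction layer", the two fully connected heads, the cross-entropy losses and the unit loss weights are all assumptions for illustration; a real implementation would also run back-propagation parameter updates, which are omitted here.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def cross_entropy(logits, label_vec):
    # loss between predicted class probabilities and a one-hot label vector
    return float(-np.sum(label_vec * np.log(softmax(logits) + 1e-12)))

rng = np.random.default_rng(0)
W_feat = rng.normal(size=(4, 8))   # toy "feature extraction layer"
W_id = rng.normal(size=(8, 3))     # FC head over 3 identity classes
W_live = rng.normal(size=(8, 2))   # FC head over {non-living, living}

x = rng.normal(size=4)             # one toy sample image (flattened)
feat = np.tanh(x @ W_feat)         # sample feature information

id_label = np.array([0.0, 1.0, 0.0])  # first label: identity category
live_label = np.array([0.0, 1.0])     # second label: is a living object

loss1 = cross_entropy(feat @ W_id, id_label)      # first loss value
loss2 = cross_entropy(feat @ W_live, live_label)  # second loss value
total = 1.0 * loss1 + 1.0 * loss2  # weighted sum; unit weights are assumed
```

Training would minimize `total` over many labeled samples until it converges.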
对应于前述对象识别方法,结合图10,本申请实施例还提供一种对象识别装置100,可应用于上述的电子设备10。其中,所述对象识别装置100可以包括目标图像获得模块110、相似度信息确定模块120和活体对象确定模块130。Corresponding to the aforementioned object recognition method, with reference to FIG. 10 , an embodiment of the present application further provides an object recognition apparatus 100 , which can be applied to the aforementioned electronic device 10 . The object recognition apparatus 100 may include a target image obtaining module 110 , a similarity information determination module 120 and a living object determination module 130 .
所述目标图像获得模块110,配置成获得拍摄目标对象得到的多帧目标图像,其中,每一帧所述目标图像包括所述目标对象的脸部信息。在本实施例中,所述目标图像获得模块110可配置成执行图2所示的步骤S110,关于所述目标图像获得模块110的相关内容可以参照前文对步骤S110的描述。The target image obtaining module 110 is configured to obtain multiple frames of target images obtained by photographing a target object, wherein each frame of the target image includes face information of the target object. In this embodiment, the target image obtaining module 110 may be configured to execute the step S110 shown in FIG. 2 . For the relevant content of the target image obtaining module 110 , reference may be made to the foregoing description of the step S110 .
所述相似度信息确定模块120,配置成基于所述脸部信息确定多帧所述目标图像之间的相似度信息。在本实施例中,所述相似度信息确定模块120可配置成执行图2所示的步骤S120,关于所述相似度信息确定模块120的相关内容可以参照前文对步骤S120的描述。The similarity information determining module 120 is configured to determine similarity information between multiple frames of the target images based on the face information. In this embodiment, the similarity information determination module 120 may be configured to execute step S120 shown in FIG. 2 , and for related content of the similarity information determination module 120 , reference may be made to the foregoing description of step S120 .
所述活体对象确定模块130,配置成基于所述相似度信息确定所述目标对象是否属于活体对象。在本实施例中,所述活体对象确定模块130可配置成执行图2所示的步骤S130,关于所述活体对象确定模块130的相关内容可以参照前文对步骤S130的描述。The living object determination module 130 is configured to determine whether the target object belongs to a living object based on the similarity information. In this embodiment, the living object determination module 130 may be configured to execute step S130 shown in FIG. 2 . For the relevant content of the living object determination module 130 , reference may be made to the foregoing description of step S130 .
可选地,所述相似度信息确定模块具体配置成:Optionally, the similarity information determination module is specifically configured to:
基于预先训练得到的人脸识别模型对每一帧所述目标图像中的脸部信息进行特征提取处理,得到每一帧所述目标图像的目标特征信息;Perform feature extraction processing on the face information in each frame of the target image based on the pre-trained face recognition model to obtain target feature information of each frame of the target image;
基于所述目标特征信息得到多帧所述目标图像之间的相似度信息。Similarity information between multiple frames of the target images is obtained based on the target feature information.
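As one possible reading of the similarity-information module above, per-frame target features could be compared pairwise. The mean pairwise cosine similarity below is an assumed metric; the disclosure does not fix a particular similarity measure.

```python
import numpy as np

def frame_similarity(features):
    """Mean pairwise cosine similarity across per-frame feature vectors
    (a hypothetical realization of the similarity information)."""
    sims = []
    for i in range(len(features)):
        for j in range(i + 1, len(features)):
            a = np.asarray(features[i], dtype=float)
            b = np.asarray(features[j], dtype=float)
            sims.append(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    return float(np.mean(sims))
```

Identical frames (as from a photographed photo) score 1.0; a live face's micro-expression changes keep the score below that.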
可选地,所述活体对象确定模块具体配置成:Optionally, the living object determination module is specifically configured to:
通过所述人脸识别模型将至少一个所述目标特征信息与预先形成的多个对比特征信息进行比较,得到所述目标对象属于活体对象的置信度信息,其中,所述多个对比特征信息基于多帧包括多个不同对象的脸部信息的图像得到,且该对象包括活体对象和非活体对象;Comparing at least one of the target feature information with a plurality of pre-formed comparative feature information through the face recognition model to obtain confidence information that the target object belongs to a living object, wherein the plurality of comparative feature information is based on A plurality of frames of images including facial information of a plurality of different objects are obtained, and the objects include living objects and non-living objects;
基于所述置信度信息和相似度信息确定目标对象是否属于活体对象。Whether the target object belongs to a living object is determined based on the confidence information and the similarity information.
可选地,所述活体对象确定模块具体配置成:Optionally, the living object determination module is specifically configured to:
通过所述人脸识别模型对至少一个所述目标特征信息进行身份类别识别处理,得到该至少一个目标特征信息的身份类别信息;Perform identity category recognition processing on at least one of the target feature information by using the face recognition model to obtain identity category information of the at least one target feature information;
在所述人脸识别模型包括的多个特征空间中,基于所述身份类别信息确定所述至少一个目标特征信息的目标特征空间,其中,不同的特征空间具有不同对象的对比特征信息,且每一个所述对比特征信息具有标识对应对象的身份类别的第一标签信息和是否为活体的第二标签信息;In a plurality of feature spaces included in the face recognition model, a target feature space of the at least one target feature information is determined based on the identity category information, wherein different feature spaces have comparative feature information of different objects, and each One of the comparative feature information has first label information identifying the identity category of the corresponding object and second label information whether it is a living body;
通过所述人脸识别模型将所述至少一个目标特征信息与该目标特征信息对应的目标特征空间中的对比特征信息进行比较,得到所述目标对象属于活体对象的置信度信息。The at least one target feature information is compared with the contrast feature information in the target feature space corresponding to the target feature information by the face recognition model, so as to obtain confidence information that the target object belongs to a living object.
可选地,所述活体对象确定模块具体配置成:Optionally, the living object determination module is specifically configured to:
针对每一帧所述目标图像对应的目标特征信息，通过所述人脸识别模型将该目标特征信息与预先形成的多个对比特征信息进行比较，确定在该帧目标图像中所述目标对象属于活体对象的置信度，得到多个置信度；For the target feature information corresponding to each frame of the target image, compare the target feature information with a plurality of pre-formed comparative feature information through the face recognition model, and determine the confidence that the target object in that frame of target image belongs to a living object, obtaining multiple confidence levels;
基于所述多个置信度得到所述目标对象属于活体对象的置信度信息。Confidence information that the target object belongs to a living object is obtained based on the plurality of confidences.
可选地,还包括模型训练模块,配置成:Optionally, it also includes a model training module, configured to:
通过预设的神经网络模型中的特征提取层对多个样本图像进行特征提取处理,得到多个样本特征信息;Perform feature extraction processing on multiple sample images through the feature extraction layer in the preset neural network model to obtain multiple sample feature information;
通过所述神经网络模型中的损失确定层，分别基于预先为每一个样本图像配置的第一标签信息确定每一个样本特征信息的第一损失值、基于预先为每一个样本图像配置的第二标签信息确定每一个样本特征信息的第二损失值，其中，该第一标签信息用于标识对应的样本图像中的对象的身份类别，该第二标签信息用于标识对应的样本图像中的对象是否为活体；Through the loss determination layer in the neural network model, determine a first loss value of each piece of sample feature information based on first label information pre-configured for each sample image, and determine a second loss value of each piece of sample feature information based on second label information pre-configured for each sample image, wherein the first label information is used to identify the identity category of the object in the corresponding sample image, and the second label information is used to identify whether the object in the corresponding sample image is a living body;
基于所述第一损失值和所述第二损失值,对所述神经网络模型进行训练,得到所述人脸识别模型。Based on the first loss value and the second loss value, the neural network model is trained to obtain the face recognition model.
可选地,所述活体对象确定模块具体配置成:Optionally, the living object determination module is specifically configured to:
将所述多帧目标图像中的至少一帧目标图像，与预先获得的多帧图像进行比较，得到所述目标对象属于活体对象的置信度信息，其中，该多帧图像包括多个不同对象的脸部信息，且该对象包括活体对象和非活体对象；Compare at least one frame of target image among the multiple frames of target images with pre-obtained multiple frames of images to obtain confidence information that the target object belongs to a living object, wherein the multiple frames of images include face information of a plurality of different objects, and the objects include living objects and non-living objects;
基于所述置信度信息和相似度信息确定目标对象是否属于活体对象。Whether the target object belongs to a living object is determined based on the confidence information and the similarity information.
在本申请实施例中，对应于上述的对象识别方法，还提供了一种计算机可读存储介质，该计算机可读存储介质中存储有计算机程序，该计算机程序运行时执行上述对象识别方法的各个步骤。In the embodiments of the present application, corresponding to the above object recognition method, a computer-readable storage medium is further provided, in which a computer program is stored; when run, the computer program executes each step of the above object recognition method.
其中,前述计算机程序运行时执行的各步骤,在此不再一一赘述,可参考前文对所述对象识别方法的解释说明。Wherein, the steps performed when the aforementioned computer program is run will not be repeated here, and reference may be made to the foregoing explanation of the object recognition method.
It should be understood that, in the foregoing description, "a plurality of" or "multiple frames" means two or more; for example, multiple frames of target images means two or more target images.
In summary, the inventors of the present application found through research that even when a living subject does not deliberately perform a specified facial action, the face still exhibits subtle changes over time, that is, micro-expression changes, so that there are at least subtle differences between the captured frames of target images; for a non-living object, such as a photograph or a three-dimensional model, no such changes can occur. Based on this, the object recognition method and apparatus, electronic device, and storage medium provided by the embodiments of the present application obtain multiple frames of target images captured of a target object and then determine whether the target object is a living object based on similarity information between the facial information in those frames. In this way, it is possible to recognize with high accuracy whether the target object is a living object without requiring the target object to perform a specified action and without requiring the capturing device to have a depth image sensor. This addresses the low recognition accuracy of existing face recognition techniques in applications where the subject cannot be asked to perform a specified action (for example, to preserve the user experience) or where the capturing device is not equipped with a depth image sensor (for example, to reduce device cost). The accuracy of face recognition is thus effectively guaranteed; because the subject need not perform a specified action, the user experience is also improved; and because no depth image sensor is needed, device cost is reduced, giving the solution a wider range of application and greater practical value.
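The core idea summarized above — consecutive frames of a live face are highly similar but never identical because of micro-expression drift, while a static photograph yields near-identical frames — can be sketched as follows. This is an illustrative toy, not the patented implementation: the embedding values, the cosine metric, and the `low`/`high` thresholds are all assumptions introduced here for demonstration.

```python
# Illustrative sketch: decide liveness from how much per-frame face
# embeddings vary across frames. All names and thresholds are
# hypothetical assumptions, not taken from the patent.
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def is_live(frame_embeddings, low=0.80, high=0.999):
    # Similarity between each pair of consecutive frames.
    sims = [cosine_similarity(frame_embeddings[i], frame_embeddings[i + 1])
            for i in range(len(frame_embeddings) - 1)]
    mean_sim = sum(sims) / len(sims)
    # A live face drifts slightly: frames are very similar but not
    # identical. Near-identical frames suggest a static photo; low
    # similarity suggests the frames do not show the same face at all.
    return low <= mean_sim < high

photo = [[1.0, 0.0, 0.0]] * 3  # identical frames, as from a photograph
live = [[1.0, 0.1, 0.0], [1.0, 0.2, 0.05], [0.95, 0.15, 0.1]]
print(is_live(photo), is_live(live))
```

In practice the embeddings would come from the pre-trained face recognition model described earlier, and the thresholds would be tuned on labeled live/spoof data rather than fixed by hand.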
In the embodiments provided by the present application, it should be understood that the disclosed apparatus and method may also be implemented in other manners. The apparatus and method embodiments described above are merely illustrative. For example, the flowcharts and block diagrams in the accompanying drawings show the possible architectures, functions, and operations of the apparatus, methods, and computer program products according to various embodiments of the present application. In this regard, each block in a flowchart or block diagram may represent a module, program segment, or portion of code that contains one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur out of the order noted in the figures. For example, two consecutive blocks may in fact be executed substantially in parallel, or sometimes in the reverse order, depending on the functionality involved. It should further be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks therein, can be implemented by a dedicated hardware-based system that performs the specified functions or actions, or by a combination of dedicated hardware and computer instructions.
In addition, the functional modules in the embodiments of the present application may be integrated together to form an independent part, each module may exist separately, or two or more modules may be integrated to form an independent part.
If the functions are implemented in the form of software function modules and sold or used as an independent product, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, an electronic device, a network device, or the like) to execute all or part of the steps of the methods described in the various embodiments of the present application. The aforementioned storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc. It should be noted that, herein, the terms "comprising", "including", or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element qualified by the phrase "comprising a ..." does not preclude the presence of additional identical elements in the process, method, article, or device that includes the element.
The above descriptions are merely preferred embodiments of the present application and are not intended to limit it; for those skilled in the art, the present application may have various modifications and variations. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the present application shall fall within its scope of protection.
Industrial Applicability
The technical solution provided by the present application can recognize with high accuracy whether a target object is a living object without requiring the target object to perform a specified action and without requiring the capturing device to have a depth image sensor, so that the accuracy of face recognition is effectively guaranteed. Because the subject need not perform a specified action, the user experience is also improved; and because no depth image sensor is needed, device cost is reduced, giving the solution a wider range of application and greater practical value.

Claims (16)

  1. An object recognition method, comprising:
    obtaining a plurality of frames of target images captured of a target object, wherein each frame of the target images includes facial information of the target object;
    determining similarity information between the plurality of frames of target images based on the facial information; and
    determining whether the target object is a living object based on the similarity information.
  2. The object recognition method according to claim 1, wherein the step of determining similarity information between the plurality of frames of target images based on the facial information comprises:
    performing feature extraction on the facial information in each frame of the target images based on a pre-trained face recognition model to obtain target feature information of each frame of the target images; and
    obtaining the similarity information between the plurality of frames of target images based on the target feature information.
  3. The object recognition method according to claim 2, wherein the step of determining whether the target object is a living object based on the similarity information comprises:
    comparing, by the face recognition model, at least one piece of the target feature information with a plurality of pre-formed contrast feature information to obtain confidence information that the target object is a living object, wherein the plurality of contrast feature information is obtained based on a plurality of frames of images including facial information of a plurality of different objects, the objects including living objects and non-living objects; and
    determining whether the target object is a living object based on the confidence information and the similarity information.
  4. The object recognition method according to claim 3, wherein the step of comparing, by the face recognition model, the at least one piece of target feature information with the plurality of pre-formed contrast feature information to obtain the confidence information that the target object is a living object comprises:
    performing identity category recognition on the at least one piece of target feature information by the face recognition model to obtain identity category information of the at least one piece of target feature information;
    determining, among a plurality of feature spaces included in the face recognition model, a target feature space of the at least one piece of target feature information based on the identity category information, wherein different feature spaces hold contrast feature information of different objects, and each piece of contrast feature information carries first label information identifying the identity category of the corresponding object and second label information indicating whether the object is a living body; and
    comparing, by the face recognition model, the at least one piece of target feature information with the contrast feature information in the target feature space corresponding to that target feature information, to obtain the confidence information that the target object is a living object.
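Purely as an illustration of the mechanism claim 4 describes, and not part of the claims themselves: routing a query feature to an identity-specific feature space and deriving a liveness confidence from the stored, labeled contrast features might look like the sketch below. The data layout, the Euclidean distance metric, and the similarity-weighted vote are all assumptions made here for demonstration; the claim does not specify them.

```python
# Hypothetical sketch of claim 4: the identity class selects a target
# feature space, and the liveness confidence is a similarity-weighted
# vote over that space's labeled contrast features.
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def liveness_confidence(query, feature_spaces, identity):
    # Select the target feature space for the recognized identity.
    space = feature_spaces[identity]
    # Weight each contrast feature by inverse distance to the query
    # (the epsilon avoids division by zero for an exact match).
    weights = [1.0 / (1e-6 + euclidean(query, vec)) for vec, _ in space]
    live_mass = sum(w for w, (_, live) in zip(weights, space) if live)
    return live_mass / sum(weights)

# Toy feature space for one identity: (contrast feature, is_live label).
spaces = {
    "alice": [([0.9, 0.1], True), ([0.1, 0.9], False)],
}
conf = liveness_confidence([0.85, 0.15], spaces, "alice")
print(round(conf, 2))
```

Because the query lies close to the live-labeled contrast feature, the confidence comes out high; a query near the non-live feature would score low.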
  5. The object recognition method according to claim 3, wherein the step of comparing, by the face recognition model, the at least one piece of target feature information with the plurality of pre-formed contrast feature information to obtain the confidence information that the target object is a living object comprises:
    for the target feature information corresponding to each frame of the target images, comparing that target feature information with the plurality of pre-formed contrast feature information by the face recognition model to determine a confidence that the target object in that frame is a living object, thereby obtaining a plurality of confidences; and
    obtaining the confidence information that the target object is a living object based on the plurality of confidences.
  6. The object recognition method according to any one of claims 2-5, further comprising the step of training to obtain the face recognition model, the step comprising:
    performing feature extraction on a plurality of sample images through a feature extraction layer of a preset neural network model to obtain a plurality of pieces of sample feature information;
    determining, through a loss determination layer of the neural network model and for each piece of sample feature information, a first loss value based on first label information preconfigured for the corresponding sample image, and a second loss value based on second label information preconfigured for that sample image, wherein the first label information identifies the identity category of the object in the corresponding sample image, and the second label information identifies whether that object is a living body; and
    training the neural network model based on the first loss value and the second loss value to obtain the face recognition model.
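The training step of claim 6 combines an identity-classification loss (driven by the first label) with a liveness loss (driven by the second label) computed on shared sample features. A minimal numeric sketch of such a dual-loss computation follows; the concrete loss forms (softmax cross-entropy for identity, binary cross-entropy for liveness) and the weighting factor `w` are illustrative assumptions, since the claim does not specify them.

```python
# Illustrative dual-loss computation: one shared feature is scored by
# two heads, yielding a first (identity) and second (liveness) loss
# that are summed for joint training. Loss forms are assumptions.
import math

def softmax_cross_entropy(logits, target_idx):
    # Numerically stable log-sum-exp.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_z - logits[target_idx]

def binary_cross_entropy(logit, is_live):
    p = 1.0 / (1.0 + math.exp(-logit))
    return -math.log(p if is_live else 1.0 - p)

def combined_loss(identity_logits, identity_label, liveness_logit,
                  live_label, w=1.0):
    first = softmax_cross_entropy(identity_logits, identity_label)  # first loss value
    second = binary_cross_entropy(liveness_logit, live_label)       # second loss value
    return first + w * second  # jointly minimized during training

loss = combined_loss([2.0, 0.1, -1.0], 0, 1.5, True)
print(round(loss, 4))
```

In a full training loop this scalar would be backpropagated through both heads and the shared feature extraction layer, so the learned features carry both identity and liveness information.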
  7. The object recognition method according to claim 1 or 2, wherein the step of determining whether the target object is a living object based on the similarity information comprises:
    comparing at least one frame of the plurality of frames of target images with a plurality of pre-obtained frames of images to obtain confidence information that the target object is a living object, wherein the pre-obtained frames include facial information of a plurality of different objects, the objects including living objects and non-living objects; and
    determining whether the target object is a living object based on the confidence information and the similarity information.
  8. An object recognition apparatus, comprising:
    a target image obtaining module, configured to obtain a plurality of frames of target images captured of a target object, wherein each frame of the target images includes facial information of the target object;
    a similarity information determination module, configured to determine similarity information between the plurality of frames of target images based on the facial information; and
    a living object determination module, configured to determine whether the target object is a living object based on the similarity information.
  9. The object recognition apparatus according to claim 8, wherein the similarity information determination module is specifically configured to:
    perform feature extraction on the facial information in each frame of the target images based on a pre-trained face recognition model to obtain target feature information of each frame of the target images; and
    obtain the similarity information between the plurality of frames of target images based on the target feature information.
  10. The object recognition apparatus according to claim 9, wherein the living object determination module is specifically configured to:
    compare, by the face recognition model, at least one piece of the target feature information with a plurality of pre-formed contrast feature information to obtain confidence information that the target object is a living object, wherein the plurality of contrast feature information is obtained based on a plurality of frames of images including facial information of a plurality of different objects, the objects including living objects and non-living objects; and
    determine whether the target object is a living object based on the confidence information and the similarity information.
  11. The object recognition apparatus according to claim 10, wherein the living object determination module is specifically configured to:
    perform identity category recognition on the at least one piece of target feature information by the face recognition model to obtain identity category information of the at least one piece of target feature information;
    determine, among a plurality of feature spaces included in the face recognition model, a target feature space of the at least one piece of target feature information based on the identity category information, wherein different feature spaces hold contrast feature information of different objects, and each piece of contrast feature information carries first label information identifying the identity category of the corresponding object and second label information indicating whether the object is a living body; and
    compare, by the face recognition model, the at least one piece of target feature information with the contrast feature information in the target feature space corresponding to that target feature information, to obtain the confidence information that the target object is a living object.
  12. The object recognition apparatus according to claim 10, wherein the living object determination module is specifically configured to:
    for the target feature information corresponding to each frame of the target images, compare that target feature information with the plurality of pre-formed contrast feature information by the face recognition model to determine a confidence that the target object in that frame is a living object, thereby obtaining a plurality of confidences; and
    obtain the confidence information that the target object is a living object based on the plurality of confidences.
  13. The object recognition apparatus according to any one of claims 9-12, further comprising a model training module configured to:
    perform feature extraction on a plurality of sample images through a feature extraction layer of a preset neural network model to obtain a plurality of pieces of sample feature information;
    determine, through a loss determination layer of the neural network model and for each piece of sample feature information, a first loss value based on first label information preconfigured for the corresponding sample image, and a second loss value based on second label information preconfigured for that sample image, wherein the first label information identifies the identity category of the object in the corresponding sample image, and the second label information identifies whether that object is a living body; and
    train the neural network model based on the first loss value and the second loss value to obtain the face recognition model.
  14. The object recognition apparatus according to claim 8 or 9, wherein the living object determination module is specifically configured to:
    compare at least one frame of the plurality of frames of target images with a plurality of pre-obtained frames of images to obtain confidence information that the target object is a living object, wherein the pre-obtained frames include facial information of a plurality of different objects, the objects including living objects and non-living objects; and
    determine whether the target object is a living object based on the confidence information and the similarity information.
  15. An electronic device, comprising:
    a memory for storing a computer program; and
    a processor connected to the memory and configured to execute the computer program stored in the memory, to implement the object recognition method according to any one of claims 1-7.
  16. A computer-readable storage medium having a computer program stored thereon, wherein, when the computer program is executed, the object recognition method according to any one of claims 1-7 is implemented.
PCT/CN2021/110358 2020-08-05 2021-08-03 Object recognition method and apparatus, electronic device and storage medium WO2022028425A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010780055.4 2020-08-05
CN202010780055.4A CN112084858A (en) 2020-08-05 2020-08-05 Object recognition method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
WO2022028425A1 true WO2022028425A1 (en) 2022-02-10

Family

ID=73735301

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/110358 WO2022028425A1 (en) 2020-08-05 2021-08-03 Object recognition method and apparatus, electronic device and storage medium

Country Status (2)

Country Link
CN (1) CN112084858A (en)
WO (1) WO2022028425A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112084858A (en) * 2020-08-05 2020-12-15 广州虎牙科技有限公司 Object recognition method and device, electronic equipment and storage medium
CN113822152A (en) * 2021-08-09 2021-12-21 中标慧安信息技术股份有限公司 Method for monitoring clothing condition of commercial tenant of food in market

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107066983A (en) * 2017-04-20 2017-08-18 腾讯科技(上海)有限公司 A kind of auth method and device
CN107818313A (en) * 2017-11-20 2018-03-20 腾讯科技(深圳)有限公司 Vivo identification method, device, storage medium and computer equipment
US20190171886A1 (en) * 2017-12-06 2019-06-06 International Business Machines Corporation Object recognition in video
CN109871834A (en) * 2019-03-20 2019-06-11 北京字节跳动网络技术有限公司 Information processing method and device
CN110188715A (en) * 2019-06-03 2019-08-30 广州二元科技有限公司 A kind of video human face biopsy method of multi frame detection ballot
CN110991432A (en) * 2020-03-03 2020-04-10 支付宝(杭州)信息技术有限公司 Living body detection method, living body detection device, electronic equipment and living body detection system
CN112084858A (en) * 2020-08-05 2020-12-15 广州虎牙科技有限公司 Object recognition method and device, electronic equipment and storage medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104361326A (en) * 2014-11-18 2015-02-18 新开普电子股份有限公司 Method for distinguishing living human face
CN109508694B (en) * 2018-12-10 2020-10-27 上海众源网络有限公司 Face recognition method and recognition device
CN109670452A (en) * 2018-12-20 2019-04-23 北京旷视科技有限公司 Method for detecting human face, device, electronic equipment and Face datection model
CN110991231B (en) * 2019-10-28 2022-06-14 支付宝(杭州)信息技术有限公司 Living body detection method and device, server and face recognition equipment

Also Published As

Publication number Publication date
CN112084858A (en) 2020-12-15

Similar Documents

Publication Publication Date Title
WO2020125623A1 (en) Method and device for live body detection, storage medium, and electronic device
CN110490076B (en) Living body detection method, living body detection device, computer equipment and storage medium
WO2019233266A1 (en) Image processing method, computer readable storage medium and electronic device
WO2017088432A1 (en) Image recognition method and device
CN111626371B (en) Image classification method, device, equipment and readable storage medium
WO2022028425A1 (en) Object recognition method and apparatus, electronic device and storage medium
CN109145717B (en) Face recognition method for online learning
US8938092B2 (en) Image processing system, image capture apparatus, image processing apparatus, control method therefor, and program
CN109271958B (en) Face age identification method and device
CN110866466B (en) Face recognition method, device, storage medium and server
CN111209845A (en) Face recognition method and device, computer equipment and storage medium
CN110765860A (en) Tumble determination method, tumble determination device, computer apparatus, and storage medium
CN109063626B (en) Dynamic face recognition method and device
Ahamed et al. HOG-CNN based real time face recognition
US11126827B2 (en) Method and system for image identification
CN111582027B (en) Identity authentication method, identity authentication device, computer equipment and storage medium
CN112906545A (en) Real-time action recognition method and system for multi-person scene
JP2012190159A (en) Information processing device, information processing method, and program
WO2013075295A1 (en) Clothing identification method and system for low-resolution video
WO2023123923A1 (en) Human body weight identification method, human body weight identification device, computer device, and medium
Xie et al. Inducing predictive uncertainty estimation for face recognition
CN114187463A (en) Electronic archive generation method and device, terminal equipment and storage medium
Bresan et al. Facespoof buster: a presentation attack detector based on intrinsic image properties and deep learning
Hegde et al. Facial Expression Classifier Using Better Technique: FisherFace Algorithm
CN111507289A (en) Video matching method, computer device and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21853569

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21853569

Country of ref document: EP

Kind code of ref document: A1