WO2020187160A1 - Face recognition method and system based on a cascaded deep convolutional neural network - Google Patents

Face recognition method and system based on a cascaded deep convolutional neural network

Info

Publication number
WO2020187160A1
WO2020187160A1 (PCT/CN2020/079281)
Authority
WO
WIPO (PCT)
Prior art keywords
network
face
face recognition
output
convolutional neural
Prior art date
Application number
PCT/CN2020/079281
Other languages
English (en)
French (fr)
Inventor
翟新刚
张楠赓
Original Assignee
北京嘉楠捷思信息技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京嘉楠捷思信息技术有限公司
Publication of WO2020187160A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 Feature extraction; Face representation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 Detection; Localisation; Normalisation

Definitions

  • The present invention relates to the field of artificial intelligence technology, and in particular to a face recognition method and system based on a cascaded deep convolutional neural network.
  • Face recognition technology is a biometric identification technology based on human facial feature information.
  • The face recognition process mainly collects a video stream with a camera, automatically detects and tracks faces in the images, and then performs recognition on the detected faces.
  • Face recognition systems have been widely applied in many fields, such as community access control, company attendance, and judicial and criminal investigation.
  • In deep learning, face recognition is a combination of multiple tasks, but existing technical solutions often focus on realizing a single task and ignore the relationships between the tasks.
  • the main purpose of the present invention is to provide a face recognition method and system based on a cascaded deep convolutional neural network to solve at least one of the above-mentioned problems.
  • In one aspect, a face recognition method based on a cascaded deep convolutional neural network is provided, including: extracting facial features using the cascaded deep convolutional neural network; and performing face recognition according to the extracted facial features.
  • In some embodiments, extracting facial features using the cascaded deep convolutional neural network includes: sending face image data to a first network to predict face bounding-box regression; sending the output of the first network to a second network to predict the locations of facial key points; and sending the output of the second network to a third network to extract facial features.
  • In some embodiments, sending the output of the first network to the second network to predict the locations of facial key points includes: performing box cropping and resizing on the output of the first network before sending it to the second network, and then using the second network to predict the key-point locations.
  • In some embodiments, sending the output of the second network to the third network to extract facial features includes: performing similarity transformation, mapping, and resizing on the output of the second network before sending it to the third network, and then using the third network to extract the features.
  • In some embodiments, the first network is a Face Detection Network (FDNet), the second network is a Key-point Detection Network (KDNet), and the third network is a Feature Extraction Network (FENet).
  • In some embodiments, before extracting facial features using the cascaded deep convolutional neural network, the method further includes: collecting face image data.
  • In another aspect, a face recognition system based on a cascaded deep convolutional neural network is provided, including: a feature extraction module for extracting facial features using the cascaded deep convolutional neural network; and a face recognition module, connected to the feature extraction module, for performing face recognition according to the extracted facial features.
  • In some embodiments, the feature extraction module includes: a first network for receiving face image data and predicting face bounding-box regression; a box-cropping unit for receiving the output of the first network and performing box cropping and resizing; a second network for receiving the output of the box-cropping unit and predicting the locations of facial key points; a similarity transformation unit for receiving the output of the second network and performing similarity transformation, mapping, and resizing; and a third network for receiving the output of the similarity transformation unit and extracting facial features.
  • In some embodiments, the first network is a Face Detection Network (FDNet), the second network is a Key-point Detection Network (KDNet), and the third network is a Feature Extraction Network (FENet).
  • In some embodiments, the system further includes a collection module for collecting face image data.
  • The present invention uses a cascaded deep convolutional neural network for feature extraction and performs face recognition based on the extracted features. Each level of the cascade only needs to run once per person, so control is simple, the computational load is small, and acceleration is easy; moreover, extracting facial features through deep learning makes it easy to handle face recognition tasks at various security levels.
  • When performing face recognition based on the cascaded deep convolutional neural network, the present invention applies a similarity transformation, which further reduces the background effect caused by bounding boxes of different sizes and reduces the demands placed on the network.
  • The present invention maps face recognition onto multiple independent deep learning models according to the different tasks, which makes each model easy to replace, avoids wasting computing power, and makes it easy to determine intuitively which part of the network needs upgrading.
  • Fig. 1 is a schematic flowchart of a face recognition method.
  • Fig. 2 is a schematic diagram of box cropping in the face recognition method shown in Fig. 1.
  • Fig. 3 is a flowchart of the face recognition method of the present invention.
  • Fig. 4 is another flowchart of the face recognition method of the present invention.
  • Fig. 5 is a flowchart of extracting facial features according to the present invention.
  • Fig. 6 is a flowchart of predicting the locations of facial key points according to the present invention.
  • Fig. 7 is another flowchart of extracting facial features according to the present invention.
  • Fig. 8 is a schematic structural diagram of the face recognition system of the present invention.
  • Fig. 9 is another schematic structural diagram of the face recognition system of the present invention.
  • Fig. 10 is a schematic structural diagram of the feature extraction module of the present invention.
  • Fig. 11 is yet another flowchart of the face recognition method according to an embodiment of the present invention.
  • Face recognition usually includes face detection, facial feature extraction, and classification of the extracted facial features to complete recognition.
  • Face detection determines whether one or more faces exist in a given picture and returns the position and extent of each face. Face detection algorithms were traditionally divided into four categories: knowledge-based, feature-based, template-matching-based, and appearance-based methods. With the adoption of the Deformable Part Model (DPM) algorithm and deep convolutional neural networks (CNN), face detection algorithms can be broadly divided into two classes: (1) methods based on rigid templates, represented by boosting + features and by CNNs; and (2) methods based on part models.
  • Facial feature extraction is the process of obtaining facial feature information from the region where the face is located, on the basis of face detection. Facial feature extraction methods include the eigenface method (Eigenface) and Principal Component Analysis (PCA); deep-learning feature extraction trains with softmax as the cost function and takes the activations of some layer of the neural network as the features.
  • Classification means grouping by type, level, or nature; the extracted features are classified to complete face recognition. Classification methods mainly include decision trees, Bayesian methods, and artificial neural networks.
  • In overview, the process of the face recognition method of the present invention is as follows: a new picture undergoes pyramid scale transformation, and the transformed pictures are input into a network that produces a large number of face classification scores and face rectangle (also called box, border, bounding box, or window) regression vectors; face rectangles with low scores (for example, below a threshold M1) are discarded, and non-maximum suppression is applied to the remaining rectangles to obtain the final prediction; the predicted result is then input into another network, face rectangles with low scores (for example, below a threshold M2) are likewise discarded, the non-maximum suppression algorithm filters out heavily overlapping rectangles, the facial key points are displayed, and feature extraction and face recognition are performed.
  • The face recognition method is first introduced taking FaceNet as an example. As shown in Figs. 1 and 2, the method includes the following steps: extracting facial features using FaceNet, and performing face recognition based on the extracted features.
  • FaceNet extracts facial features in two steps: using a Multi-task Cascaded Convolutional Network (MTCNN) to predict the bounding box of the face, then cropping the bounding box from the original picture with an added margin and resizing it to a fixed size for the feature extraction network.
  • Predicting the face bounding box with MTCNN, as shown in Fig. 1, includes the following sub-steps:
  • the input image is scaled to various sizes, i.e., the original picture is resized at different scales to build an image pyramid; each pyramid level is fed into a shallow CNN proposal network (PNet), and bounding-box regression and non-maximum suppression (NMS) quickly generate candidate windows;
  • each bounding box that survives the first stage is cropped out and resized to a fixed size, and a more complex CNN refine network (RNet) refines the candidate windows, again applying bounding-box regression and NMS to discard a large number of overlapping windows;
  • for each bounding box that survives the second stage, a more powerful CNN output network (ONet) decides which candidate windows to keep and simultaneously outputs the locations of five facial key points.
  • The above method of predicting bounding boxes with MTCNN requires running PNet and RNet repeatedly, so control is relatively complicated and the computational load is large.
  • Moreover, a fixed-length margin is added to the MTCNN-predicted bounding box before it is sent to the feature extraction network. Since face bounding boxes in a picture come in various sizes, adding a fixed margin to faces of different sizes means the amount of background information carried along differs greatly, which weakens the generalization ability of the feature extraction network.
  • On this basis, the present invention also provides a method for realizing sensorless face recognition based on a cascaded deep convolutional neural network, including the following steps:
  • S1: extract facial features using the cascaded deep convolutional neural network; and
  • S2: perform face recognition according to the extracted facial features.
  • The present invention uses the cascaded deep convolutional neural network for feature extraction and performs face recognition based on the extracted features; each level of the cascade only needs to run once per person, so control is simple, the computational load is small, and acceleration is easy.
  • Further, the face recognition method may also include: S0, collecting face image data.
  • Extracting facial features using the cascaded deep convolutional neural network includes: S11, sending the face image data to the first network to predict face bounding-box regression; S12, sending the output of the first network to the second network to predict the locations of facial key points; and S13, sending the output of the second network to the third network to extract facial features.
  • That is, the cascaded deep convolutional neural network may include three networks forming a three-level cascade, where the first network is a Face Detection Network (FDNet), the second network is a Key-point Detection Network (KDNet), and the third network is a Feature Extraction Network (FENet).
  • The present invention decomposes the face recognition task and fully considers the relationships among the sub-tasks. Specifically, face recognition is decomposed into the three-level task of face detection + facial key-point detection + facial feature extraction, and the deep convolutional neural networks corresponding to these three tasks are FDNet, KDNet, and FENet.
  • Face recognition is thus independently mapped onto multiple deep learning models according to the different tasks, which makes each model easy to replace, avoids wasting computing power, and makes it easy to determine intuitively which part of the network needs upgrading.
  • Sending the output of the first network to the second network to predict the locations of facial key points includes: S121, performing box cropping and resizing on the output of the first network before sending it to the second network; and S122, using the second network to predict the key-point locations.
  • Sending the output of the second network to the third network to extract facial features includes: S131, performing similarity transformation, mapping, and resizing on the output of the second network before sending it to the third network; and S132, using the third network to extract the features.
  • By applying similarity transformations, the present invention further reduces the background effect caused by bounding boxes of different sizes, reduces the demands on FDNet, and improves the accuracy of feature extraction.
  • The present invention also provides a face recognition system based on a cascaded deep convolutional neural network, including: a feature extraction module 11 for extracting facial features using the cascaded deep convolutional neural network; and a face recognition module 12, connected to the feature extraction module 11, for performing face recognition according to the extracted features.
  • The system may further include a collection module 10 for collecting face image data; the feature extraction module 11 is connected to the collection module 10, receives the face image data sent by the collection module 10, and extracts facial features using the cascaded deep convolutional neural network.
  • The feature extraction module includes: a first network 110 for receiving the face image data and predicting face bounding-box regression; a box-cropping unit 111 for receiving the output of the first network 110 and performing box cropping and resizing; a second network 112 for receiving the output of the box-cropping unit 111 and predicting the locations of facial key points; a similarity transformation unit 113 for receiving the output of the second network 112 and performing similarity transformation, mapping, and resizing; and a third network 114 for receiving the output of the similarity transformation unit 113 and extracting facial features.
  • Here, as above, the first network is a Face Detection Network (FDNet), the second network is a Key-point Detection Network (KDNet), and the third network is a Feature Extraction Network (FENet).
  • In one embodiment, the face recognition method based on the cascaded deep convolutional neural network specifically includes:
  • Step 1: FDNet follows the design idea of YOLO, using MobileNet as the backbone, directly regressing the bounding box of the face and simultaneously predicting a confidence; in other words, FDNet's task is to find the positions of all faces in the image and crop out all face images in turn as the input of KDNet.
  • FDNet includes but is not limited to MobileNet-YOLO. The faces in the dataset are labeled, and the ground truth of the i-th face box is set as follows: the position is (x_i^gt, y_i^gt, w_i^gt, h_i^gt), where (x_i^gt, y_i^gt) are the coordinates of the top-left corner of the box and w_i^gt, h_i^gt are its width and height; the class p_i^gt is set to 0 (face detection has only one class, the face); and the confidence C_i^gt is set to 1, a lower confidence meaning a lower probability that the detected object is a face.
  • The corresponding YOLO prediction is (x_i, y_i, w_i, h_i, C_i, p_i).
  • During training, the loss function keeps decreasing and converges to a stable state; the MobileNet-YOLO network in the stable state can then be used to predict face boxes.
  • Step 2: based on the output of FDNet, the bounding box is cropped out, resized to a fixed size, and sent to KDNet (Keypoints Detection Net) to directly predict the positions of five facial key points; that is, KDNet's task is to find the facial key-point positions.
  • This embodiment takes five points as an example (left eye, right eye, nose, left mouth corner, right mouth corner); traversing all the face crops produced by FDNet yields the key points of every face.
  • KDNet may be chosen from a variety of networks; it only needs to predict the facial key-point positions.
  • Taking the left eye, right eye, nose, left mouth corner, and right mouth corner of the face as an example, the key points in the dataset are labeled as follows: the face data are preprocessed by FDNet from Step 1 above, the face region is cropped out using the face box, and the positions of the five points are computed as proportions of that face region, giving five pairs of position coordinates (x, y) with x ∈ [0,1] and y ∈ [0,1].
  • A sigmoid is applied at the end of KDNet so that the output falls within [0,1]; the sigmoid function is S(x) = 1 / (1 + e^(-x)).
  • Step 3: based on KDNet's output of five facial key points, a five-point similarity transformation is applied to the whole frame, mapping the detected points to five points at fixed golden positions; the mapped face image is resized to a fixed size and sent to FENet (Feature Extraction Net) to extract facial features. That is, a similarity transformation serves as a bridge between KDNet and FENet: the key-point positions obtained by KDNet and the golden key-point positions determine a similarity transformation matrix, and the whole frame is then transformed according to the corresponding key points to obtain the aligned face image.
  • FENet's task is to abstract the face image information into a feature-vector representation with the following property: all faces of the same subject map to similar feature vectors, while feature vectors obtained from faces of different subjects differ substantially.
  • This embodiment uses a five-point similarity transformation, though the method is of course not limited to five points. First, the golden positions of the left eye, right eye, nose, left mouth corner, and right mouth corner are set; taking a 112×112 transformed image as an example, the golden positions may be chosen as le_g = [38.2946, 51.6963], re_g = [73.5318, 51.5014], nose_g = [56.0252, 71.7366], l_mouth_g = [41.5493, 92.3655], and r_mouth_g = [70.7299, 92.2041], giving the golden key points Landmark_Golden = [le_g, re_g, nose_g, l_mouth_g, r_mouth_g].
  • Given the detected key points Landmark_Get = [le, re, nose, l_mouth, r_mouth], the similarity transformation from Landmark_Get to Landmark_Golden can be computed as shown in the Python snippet in the detailed embodiment below.
  • FENet includes but is not limited to MobileFaceNets, and AM-Softmax loss may be chosen as the loss function.
  • The face recognition method based on the cascaded deep convolutional neural network described in this embodiment uses a Cascaded Deep CNN (CDCNN) to extract facial features and perform face recognition.
  • Each level of the CDCNN only needs to run once per person, so control is simple, computation is light, and acceleration is easy; moreover, compared with the aforementioned bounding-box-plus-margin approach, the five-point similarity transformation further reduces the background effect caused by bounding boxes of different sizes and reduces the demands on FDNet (as long as the five facial key points are accurate, the face detection box does not have to be generated by an MTCNN network).
  • The face recognition method and system based on the cascaded deep convolutional neural network of the present invention may also include other parts which, being unrelated to the innovation of the present invention, are not repeated here.
  • The modules, units, or components in the embodiments can be combined into one module, unit, or component, and can furthermore be divided into multiple sub-modules, sub-units, or sub-components. Except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract, and drawings) and all processes or units of any method or device so disclosed may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract, and drawings) may be replaced by an alternative feature serving the same, an equivalent, or a similar purpose.
  • The various component embodiments of the present invention may be implemented in hardware, in software modules running on one or more processors, or in a combination thereof.
  • A microprocessor or a digital signal processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components of the relevant device according to embodiments of the present invention.
  • The present invention may also be implemented as a device or apparatus program (for example, a computer program or a computer program product) for executing part or all of the methods described herein.
  • Such a program implementing the present invention may be stored on a computer-readable medium or may take the form of one or more signals; such signals may be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.
  • Ordinal terms such as "first", "second", and "third" used in the specification and claims modify the corresponding elements but do not themselves imply or represent any ordinal ranking of those elements, nor the order between one element and another or the order in a manufacturing method; they are used only to clearly distinguish one element with a certain name from another element with the same name.


Abstract

The present invention provides a face recognition method and system based on a cascaded deep convolutional neural network. The method includes: extracting facial features using the cascaded deep convolutional neural network; and performing face recognition according to the extracted facial features. The face recognition method and system of the present invention are simple to control, computationally light, and easy to accelerate.

Description

Face recognition method and system based on a cascaded deep convolutional neural network

Technical Field

The present invention relates to the field of artificial intelligence technology, and in particular to a face recognition method and system based on a cascaded deep convolutional neural network.
Background

Face recognition technology is a biometric identification technology that identifies people based on their facial feature information. The face recognition process mainly collects a video stream with a camera, automatically detects and tracks faces in the images, and then performs recognition on the detected faces. With the rapid development of face recognition technology, face recognition systems have been widely applied in many fields, such as community access control, company attendance, and judicial and criminal investigation. At present, however, traditional machine learning algorithms extract hand-crafted features in the feature extraction stage, such as Local Binary Pattern (LBP) features, Histogram of Oriented Gradients (HOG) features, and Haar features. Because these hand-crafted features embed the designer's prior knowledge, they achieve high accuracy only for faces against certain specific backgrounds, are hard to apply to face recognition under complex conditions, and are therefore inadequate for the wide variety of face recognition tasks.

In addition, face recognition is a combination of multiple tasks in deep learning, but existing technical solutions often focus on realizing a single task and ignore the relationships between the tasks.
Summary of the Invention

(1) Technical problem to be solved

In view of the above problems, the main purpose of the present invention is to provide a face recognition method and system based on a cascaded deep convolutional neural network, so as to solve at least one of the above problems.
(2) Technical solution

According to one aspect of the present invention, a face recognition method based on a cascaded deep convolutional neural network is provided, including:

extracting facial features using the cascaded deep convolutional neural network; and

performing face recognition according to the extracted facial features.

In some embodiments, extracting facial features using the cascaded deep convolutional neural network includes:

sending face image data to a first network to predict face bounding-box regression;

sending the output of the first network to a second network to predict the locations of facial key points; and

sending the output of the second network to a third network to extract facial features.
In some embodiments, sending the output of the first network to the second network to predict the locations of facial key points includes:

performing box cropping and resizing on the output of the first network before sending it to the second network; and

using the second network to predict the locations of facial key points.

In some embodiments, sending the output of the second network to the third network to extract facial features includes:

performing similarity transformation, mapping, and resizing on the output of the second network before sending it to the third network; and

using the third network to extract facial features.

In some embodiments, the first network is a Face Detection Network (FDNet), the second network is a Key-point Detection Network (KDNet), and the third network is a Feature Extraction Network (FENet).

In some embodiments, before extracting facial features using the cascaded deep convolutional neural network, the method further includes: collecting face image data.
According to another aspect of the present invention, a face recognition system based on a cascaded deep convolutional neural network is provided, including:

a feature extraction module for extracting facial features using the cascaded deep convolutional neural network; and

a face recognition module, connected to the feature extraction module, for performing face recognition according to the extracted facial features.

In some embodiments, the feature extraction module includes:

a first network for receiving face image data and predicting face bounding-box regression;

a box-cropping unit for receiving the output of the first network and performing box cropping and resizing;

a second network for receiving the output of the box-cropping unit and predicting the locations of facial key points;

a similarity transformation unit for receiving the output of the second network and performing similarity transformation, mapping, and resizing; and

a third network for receiving the output of the similarity transformation unit and extracting facial features.

In some embodiments, the first network is a Face Detection Network (FDNet), the second network is a Key-point Detection Network (KDNet), and the third network is a Feature Extraction Network (FENet).

In some embodiments, the system further includes a collection module for collecting face image data.
(3) Beneficial effects

It can be seen from the above technical solutions that the face recognition method and system based on a cascaded deep convolutional neural network of the present invention have at least one of the following beneficial effects:

(1) The present invention uses a cascaded deep convolutional neural network for feature extraction and performs face recognition based on the extracted features. Each level of the cascade only needs to run once per person, so control is simple, the computational load is small, and acceleration is easy; moreover, extracting facial features through deep learning makes it easy to handle face recognition tasks at various security levels.

(2) When performing face recognition based on the cascaded deep convolutional neural network, the present invention applies a similarity transformation, which further reduces the background effect caused by bounding boxes of different sizes and reduces the demands placed on the network.

(3) The present invention maps face recognition onto multiple independent deep learning models according to the different tasks, which makes each model easy to replace, avoids wasting computing power, and makes it easy to determine intuitively which part of the network needs upgrading.
Brief Description of the Drawings

The accompanying drawings, which form a part of the present invention, are provided for further understanding of the present invention; the exemplary embodiments of the present invention and their descriptions serve to explain the present invention and do not unduly limit it. In the drawings:

Fig. 1 is a schematic flowchart of a face recognition method.

Fig. 2 is a schematic diagram of box cropping in the face recognition method shown in Fig. 1.

Fig. 3 is a flowchart of the face recognition method of the present invention.

Fig. 4 is another flowchart of the face recognition method of the present invention.

Fig. 5 is a flowchart of extracting facial features according to the present invention.

Fig. 6 is a flowchart of predicting the locations of facial key points according to the present invention.

Fig. 7 is another flowchart of extracting facial features according to the present invention.

Fig. 8 is a schematic structural diagram of the face recognition system of the present invention.

Fig. 9 is another schematic structural diagram of the face recognition system of the present invention.

Fig. 10 is a schematic structural diagram of the feature extraction module of the present invention.

Fig. 11 is yet another flowchart of the face recognition method according to an embodiment of the present invention.
Detailed Description

The face recognition process is first briefly introduced here to aid understanding of the technical solution of the present invention.

Face recognition usually includes face detection, facial feature extraction, and classification of the extracted facial features, thereby completing face recognition.
1. Face detection

Face detection means, given an arbitrary picture, finding whether one or more faces exist in it and returning the position and extent of each face in the picture. Face detection algorithms were traditionally divided into four categories: knowledge-based, feature-based, template-matching-based, and appearance-based methods. With the adoption of the Deformable Part Model (DPM) algorithm and deep convolutional neural networks (CNN), face detection algorithms can be broadly divided into two classes: (1) methods based on rigid templates, represented by boosting + features and by CNNs; and (2) methods based on part models.
2. Facial feature extraction

Facial feature extraction is the process of obtaining facial feature information from the region where the face is located, on the basis of face detection. Facial feature extraction methods include the eigenface method (Eigenface) and Principal Component Analysis (PCA). Deep-learning feature extraction uses softmax as the cost function and takes the activations of some layer of the neural network as the features.
3. Classification

Classification means grouping by type, level, or nature; the extracted features are classified, thereby completing face recognition. Classification methods mainly include decision trees, Bayesian methods, and artificial neural networks.
The face recognition method of the present invention is introduced below. In overview, its process is as follows: a new picture undergoes pyramid scale transformation, and the transformed pictures are input into a network that produces a large number of face classification scores and face rectangle (also called box, border, bounding box, or window) regression vectors; face rectangles with low scores (for example, below a threshold M1) are discarded, and non-maximum suppression is applied to the remaining rectangles to obtain the final prediction; the predicted result is then input into another network, face rectangles with low scores (for example, below a threshold M2) are likewise discarded, a non-maximum suppression algorithm filters out heavily overlapping face rectangles, the facial key-point positions are displayed, and feature extraction and face recognition are performed.
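The following is a minimal Python sketch of the non-maximum suppression step referred to above (an illustration only, not the patent's implementation; the IoU threshold is an assumed value):

import numpy as np

def nms(boxes, scores, iou_thresh=0.5):
    # Greedy non-maximum suppression.
    # boxes: (N, 4) array of [x1, y1, x2, y2]; scores: (N,) confidences.
    # Returns the indices of the boxes that are kept.
    x1, y1, x2, y2 = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    areas = (x2 - x1) * (y2 - y1)
    order = scores.argsort()[::-1]  # highest score first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        # Intersection of the chosen box with all remaining boxes
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])
        inter = np.maximum(0.0, xx2 - xx1) * np.maximum(0.0, yy2 - yy1)
        iou = inter / (areas[i] + areas[order[1:]] - inter)
        # Drop boxes that overlap the chosen box too much
        order = order[1:][iou < iou_thresh]
    return keep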
The face recognition method is described here taking FaceNet as an example. As shown in Figs. 1 and 2, the method includes the following steps:

extracting facial features using FaceNet; and

performing face recognition based on the extracted facial features.

Specifically, FaceNet extracts facial features in two steps:

using a Multi-task Cascaded Convolutional Network (MTCNN) to predict the bounding box of the face; and

cropping the bounding box from the original picture with an added margin, then resizing it to a fixed size and feeding it into the feature extraction network.
Predicting the face bounding box with MTCNN, as shown in Fig. 1, includes the following sub-steps:

the input image is scaled to various sizes, i.e., the original picture is resized at different scales to build an image pyramid; each pyramid level is fed into a shallow CNN proposal network (PNet), and bounding-box regression and non-maximum suppression (NMS) quickly generate candidate windows;

each bounding box that survives the first stage is cropped out and resized to a fixed size, and a more complex CNN refine network (RNet) refines the candidate windows, again applying bounding-box regression and NMS to discard a large number of overlapping windows;

for each bounding box that survives the second stage, a more powerful CNN output network (ONet) decides which candidate windows to keep and simultaneously outputs the locations of five facial key points.
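To make the image-pyramid sub-step above concrete, here is a minimal Python sketch of building a pyramid by repeated resizing (an illustration only; the scale factor and minimum size are assumed values, not taken from the patent):

import cv2

def build_image_pyramid(img, scale_factor=0.709, min_size=12):
    # Repeatedly downscale the image until it falls below the network's
    # minimum input size, collecting (scale, image) pairs along the way.
    pyramid = []
    h, w = img.shape[:2]
    scale = 1.0
    while min(h * scale, w * scale) >= min_size:
        resized = cv2.resize(img, (int(w * scale), int(h * scale)))
        pyramid.append((scale, resized))
        scale *= scale_factor  # shrink for the next level
    return pyramid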
As shown in Fig. 2, a margin is added to the bounding box, the region is cropped out, resized to a fixed size, and fed into the facial feature extraction network. In Fig. 2, the white box is the bounding box, the gray segment has length Margin/2, and the black box is the finally cropped face, which is resized to a fixed size and fed into the feature extraction network.

It can be seen that the above method of predicting bounding boxes with MTCNN requires running PNet and RNet repeatedly, so control is relatively complicated and the computational load is large. Moreover, a fixed-length margin is added to the MTCNN-predicted bounding box before it is sent to the feature extraction network. Since face bounding boxes in a picture come in various sizes, adding a fixed margin to faces of different sizes means the background information carried along differs greatly, which weakens the generalization ability of the feature extraction network.
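A quick back-of-the-envelope check illustrates the problem with a fixed margin (the box and margin sizes below are assumptions chosen for illustration): the fraction of the crop occupied by background grows sharply as the face box shrinks.

def background_fraction(box_size, margin):
    # Fraction of the cropped square that is background when a fixed
    # margin is added around a square face box of the given size.
    crop = box_size + margin
    return 1.0 - (box_size ** 2) / (crop ** 2)

print(background_fraction(40, 32))   # small face: about 0.69, mostly background
print(background_fraction(200, 32))  # large face: about 0.26, mostly face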
On this basis, the present invention further provides a method for realizing sensorless face recognition based on a cascaded deep convolutional neural network. As shown in Fig. 3, the face recognition method of the present invention includes the following steps:

S1: extract facial features using the cascaded deep convolutional neural network; and

S2: perform face recognition according to the extracted facial features.

The present invention uses the cascaded deep convolutional neural network for feature extraction and performs face recognition based on the extracted features; each level of the cascade only needs to run once per person, so control is simple, the computational load is small, and acceleration is easy.
Further, as shown in Fig. 4, before extracting facial features using the cascaded deep convolutional neural network, the face recognition method may also include: S0, collecting face image data.

Specifically, as shown in Fig. 5, extracting facial features using the cascaded deep convolutional neural network includes:

S11: sending the face image data to the first network to predict face bounding-box regression;

S12: sending the output of the first network to the second network to predict the locations of facial key points; and

S13: sending the output of the second network to the third network to extract facial features.
That is to say, the cascaded deep convolutional neural network may include three networks forming a three-level cascade, where the first network is a Face Detection Network (FDNet), the second network is a Key-point Detection Network (KDNet), and the third network is a Feature Extraction Network (FENet).

Compared with existing face recognition methods that focus on a single task, the present invention decomposes the face recognition task and fully considers the relationships among the sub-tasks. Specifically, face recognition is decomposed into the three-level task of face detection + facial key-point detection + facial feature extraction, and the corresponding deep convolutional neural networks are FDNet, KDNet, and FENet. A high-level sketch of this composition is given below.

In this way, face recognition is independently mapped onto multiple deep learning models according to the different tasks, which makes each model easy to replace, avoids wasting computing power, and makes it easy to determine intuitively which part of the network needs upgrading.
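As an illustration of this three-task decomposition, the stages compose as in the following Python sketch (the model objects fdnet, kdnet, and fenet and the helpers crop_and_resize and similarity_align are hypothetical stand-ins, not the patent's implementation; a crop_and_resize sketch is given under Step 2 below):

def recognize_faces(frame, fdnet, kdnet, fenet):
    # Cascade: FDNet finds face boxes, KDNet locates five key points on
    # each cropped face, FENet embeds the aligned face. Each stage runs
    # exactly once per detected face.
    features = []
    for box in fdnet.detect(frame):                    # task 1: face detection
        face = crop_and_resize(frame, box)             # box crop + resize
        landmarks = kdnet.predict(face)                # task 2: key points
        aligned = similarity_align(frame, landmarks)   # warp to golden points
        features.append(fenet.extract(aligned))        # task 3: features
    return features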
More specifically, as shown in Fig. 6, sending the output of the first network to the second network to predict the locations of facial key points includes:

S121: performing box cropping and resizing on the output of the first network before sending it to the second network; and

S122: using the second network to predict the locations of facial key points.

As shown in Fig. 7, sending the output of the second network to the third network to extract facial features includes:

S131: performing similarity transformation, mapping, and resizing on the output of the second network before sending it to the third network; and

S132: using the third network to extract facial features.

When performing face recognition based on the cascaded deep convolutional neural network, the present invention applies a similarity transformation, which further reduces the background effect caused by bounding boxes of different sizes, reduces the demands on FDNet, and improves the accuracy of feature extraction.
In addition, the present invention also provides a face recognition system based on a cascaded deep convolutional neural network. As shown in Fig. 8, the system includes:

a feature extraction module 11 for extracting facial features using the cascaded deep convolutional neural network; and

a face recognition module 12, connected to the feature extraction module 11, for performing face recognition according to the extracted facial features.

Further, as shown in Fig. 9, the face recognition system may also include a collection module 10 for collecting face image data. Correspondingly, the feature extraction module 11 is connected to the collection module 10, receives the face image data sent by the collection module 10, and extracts facial features using the cascaded deep convolutional neural network.
Specifically, as shown in Fig. 10, the feature extraction module includes:

a first network 110 for receiving the face image data and predicting face bounding-box regression;

a box-cropping unit 111 for receiving the output of the first network 110 and performing box cropping and resizing;

a second network 112 for receiving the output of the box-cropping unit 111 and predicting the locations of facial key points;

a similarity transformation unit 113 for receiving the output of the second network 112 and performing similarity transformation, mapping, and resizing; and

a third network 114 for receiving the output of the similarity transformation unit 113 and extracting facial features.

Here, the first network is a Face Detection Network (FDNet), the second network is a Key-point Detection Network (KDNet), and the third network is a Feature Extraction Network (FENet).
To make the purpose, technical solution, and advantages of the present invention clearer, the face recognition method based on a cascaded deep convolutional neural network of the present invention is further described in detail below with reference to a specific embodiment and the accompanying drawings.

As shown in Fig. 11, in one embodiment, the face recognition method based on the cascaded deep convolutional neural network specifically includes:
Step 1: FDNet follows the design idea of YOLO, using MobileNet as the backbone, directly regressing the bounding box of the face and simultaneously predicting a confidence; that is to say, FDNet's task is to find the positions of all faces in the image and crop out all face images in turn as the input of KDNet.

Of course, FDNet includes but is not limited to MobileNet-YOLO. The faces in the dataset are labeled, and the ground truth of the face box is set as follows:
the position is set to (x_i^gt, y_i^gt, w_i^gt, h_i^gt), where (x_i^gt, y_i^gt) are the coordinates of the top-left corner of the i-th face box and w_i^gt, h_i^gt are the width and height of the face box;

since face detection has only one class, namely the face, the face class p_i^gt is set to 0;

the confidence C_i^gt is set to 1; the lower the confidence, the lower the probability that the detected object is a face;

the corresponding YOLO prediction is (x_i, y_i, w_i, h_i, C_i, p_i).
During training, the loss function keeps decreasing and converges to a stable state; the MobileNet-YOLO network in the stable state can then be used to predict face boxes.
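As an illustration of how such a detector's predictions might be consumed downstream (a sketch under an assumed output layout matching the (x_i, y_i, w_i, h_i, C_i, p_i) tuple above, not the patent's code), low-confidence boxes are dropped before non-maximum suppression:

def decode_detections(predictions, conf_thresh=0.6):
    # predictions: iterable of (x, y, w, h, confidence, class) tuples.
    # Keeps only boxes whose confidence exceeds the (assumed) threshold,
    # converting them to corner format for a later NMS step.
    boxes = []
    for x, y, w, h, conf, cls in predictions:
        if conf >= conf_thresh:
            boxes.append((x, y, x + w, y + h, conf))
    return boxes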
Step 2: based on the output of FDNet, the bounding box is cropped out, resized to a fixed size, and sent to KDNet (Keypoints Detection Net) to directly predict the positions of five facial key points; that is to say, KDNet's task is to find the facial key-point positions. As shown in Fig. 11, this embodiment takes five points as an example (left eye, right eye, nose, left mouth corner, right mouth corner); traversing all the face crops produced by FDNet yields the key-point positions of every face.
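A minimal sketch of the crop-and-resize hand-off between FDNet and KDNet follows (the box format and the fixed input size are assumptions; this also fills in the crop_and_resize helper assumed in the earlier pipeline sketch):

import cv2

def crop_and_resize(frame, box, size=48):
    # Crop a detected face box out of the full frame and resize it to the
    # fixed input size assumed for KDNet. box = (xmin, ymin, width, height).
    xmin, ymin, bb_width, bb_height = box
    face = frame[ymin:ymin + bb_height, xmin:xmin + bb_width]
    return cv2.resize(face, (size, size))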
KDNet may be chosen from a variety of networks; it only needs to predict the facial key-point positions. Taking the left eye, right eye, nose, left mouth corner, and right mouth corner of the face as an example, the key points in the dataset are labeled as follows:

the face data are preprocessed by FDNet from Step 1 above, the face region is cropped out using the face box, and the positions of the left eye, right eye, nose, left mouth corner, and right mouth corner are computed as proportions of that face region, giving five pairs of position coordinates (x, y) with x ∈ [0,1] and y ∈ [0,1].
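A sketch of this labeling step (coordinate conventions assumed, not the patent's code): absolute landmark coordinates are converted into proportions of the face box, which is exactly the inverse of the coordinate-recovery formulas given further below.

def normalize_landmarks(landmarks, box):
    # Convert absolute (x, y) landmark coordinates into proportional
    # coordinates in [0, 1] relative to the face box.
    # box = (xmin, ymin, width, height).
    xmin, ymin, bb_width, bb_height = box
    return [((x - xmin) / bb_width, (y - ymin) / bb_height)
            for x, y in landmarks]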
When designing KDNet, a sigmoid is applied at the end so that the output finally falls within [0,1]; a mean-squared-error (MSE) loss function is used, and training proceeds until a stable, converged state is reached.

The sigmoid function is as follows:

S(x) = 1 / (1 + e^(-x))
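A minimal PyTorch sketch of this output head and loss (an illustration of the described design, with an assumed backbone feature width, not the patent's network):

import torch
import torch.nn as nn

# 10 outputs = five (x, y) pairs; the sigmoid keeps them in [0, 1]
head = nn.Sequential(nn.Linear(256, 10), nn.Sigmoid())  # 256: assumed width
criterion = nn.MSELoss()

features = torch.randn(8, 256)  # a dummy batch of backbone features
target = torch.rand(8, 10)      # proportional landmark labels in [0, 1]
loss = criterion(head(features), target)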
Step 3: based on KDNet's output of the five facial key points, a five-point similarity transformation is applied to the whole frame, mapping the detected points to five points at fixed golden positions; the mapped face image is resized to a fixed size and sent to FENet (Feature Extraction Net) to extract facial features. That is to say, a similarity transformation is needed as a bridge between KDNet and FENet: the key-point position information obtained by KDNet and the golden key-point position information determine a similarity transformation matrix, after which the whole frame is transformed according to the corresponding key points to obtain the aligned face image. FENet's task is to abstract the face image information into a feature-vector representation with the following property: all faces of the same subject map to similar feature vectors, while the feature vectors obtained from faces of different subjects differ substantially.
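This feature-vector property is what makes recognition by comparison possible. As an illustration (not part of the patent, with an assumed decision threshold), two such embeddings can be compared by cosine similarity:

import numpy as np

def cosine_similarity(a, b):
    # Cosine similarity between two feature vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def same_person(feat_a, feat_b, threshold=0.5):
    # Faces of the same subject should map to similar vectors, so a
    # similarity above the (assumed) threshold is treated as a match.
    return cosine_similarity(feat_a, feat_b) >= threshold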
This embodiment uses a five-point similarity transformation, though the method is of course not limited to five points. Specifically, the golden positions of the left eye, right eye, nose, left mouth corner, and right mouth corner are set first; taking a 112×112 transformed image as an example, the golden positions may be chosen as follows:

left eye: le_g = [38.2946, 51.6963]

right eye: re_g = [73.5318, 51.5014]

nose: nose_g = [56.0252, 71.7366]

left mouth corner: l_mouth_g = [41.5493, 92.3655]

right mouth corner: r_mouth_g = [70.7299, 92.2041]

golden key points: Landmark_Golden = [le_g, re_g, nose_g, l_mouth_g, r_mouth_g]
Suppose output5 is the facial key-point prediction of KDNet, and xmin, ymin, bb_width, bb_height are the face bounding-box information; then the absolute coordinates of the predicted key points in the original image are:

le = [output5[0]*bb_width + xmin, output5[1]*bb_height + ymin]

re = [output5[2]*bb_width + xmin, output5[3]*bb_height + ymin]

nose = [output5[4]*bb_width + xmin, output5[5]*bb_height + ymin]

l_mouth = [output5[6]*bb_width + xmin, output5[7]*bb_height + ymin]

r_mouth = [output5[8]*bb_width + xmin, output5[9]*bb_height + ymin]

Landmark_Get = [le, re, nose, l_mouth, r_mouth]
Taking Python as an example, the similarity transformation can be performed using Landmark_Get and Landmark_Golden as follows:

import numpy as np
import cv2
from skimage import transform as trans

# Estimate the similarity transform that maps the detected landmarks
# onto the golden landmark positions
tform = trans.SimilarityTransform()
tform.estimate(np.array(Landmark_Get), np.array(Landmark_Golden))
M = tform.params[0:2, :]  # top two rows: the 2x3 affine matrix cv2 expects

# Warp the whole frame so the five key points land on the golden positions,
# producing the 112x112 aligned face image
affine_output = cv2.warpAffine(img, M, (112, 112), borderValue=0.0)
The FENet includes but is not limited to MobileFaceNets, and AM-Softmax loss may be chosen as the loss function.

Using the above loss function reduces the probability assigned to the corresponding label term and enlarges the loss, and is therefore more helpful for clustering samples of the same class.
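A minimal PyTorch sketch of AM-Softmax (the standard formulation, applying s*(cos θ − m) to the target logit; the scale s and margin m values here are assumptions, not taken from the patent):

import torch
import torch.nn.functional as F

def am_softmax_loss(features, weight, labels, s=30.0, m=0.35):
    # features: (B, D) embeddings; weight: (C, D) class weight matrix;
    # labels: (B,) class indices. Normalizing both sides turns the logits
    # into cosines; the margin m is subtracted from the target cosine only,
    # which lowers the target probability and enlarges the loss as described.
    cos = F.normalize(features) @ F.normalize(weight).t()  # (B, C) cosines
    onehot = F.one_hot(labels, cos.size(1)).float()
    logits = s * (cos - m * onehot)  # margin only on the target class
    return F.cross_entropy(logits, labels)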
The face recognition method based on the cascaded deep convolutional neural network described in this embodiment uses a three-network Cascaded Deep CNN (CDCNN) to extract facial features and perform face recognition. Each level of the CDCNN only needs to run once per person, so control is simple, computation is light, and acceleration is easy; moreover, compared with the bounding-box-plus-margin approach described earlier, the five-point similarity transformation further reduces the background effect caused by bounding boxes of different sizes and reduces the demands on FDNet (as long as the five facial key points are accurate, the face detection box does not have to be generated by an MTCNN network).
The face recognition method and system based on a cascaded deep convolutional neural network of the present invention have thus been described in detail with reference to the accompanying drawings. Based on the above description, those skilled in the art should have a clear understanding of the present invention.

It should be noted that implementations not shown or described in the drawings or the text of the specification are in forms known to those of ordinary skill in the art and are not described in detail. In addition, the above definitions of the elements are not limited to the specific structures, shapes, or manners mentioned in the embodiments, which those of ordinary skill in the art may simply modify or replace.

Of course, according to actual needs, the face recognition method and system of the present invention may also include other parts which, being unrelated to the innovation of the present invention, are not repeated here.
Similarly, it should be understood that, in order to streamline the present disclosure and aid understanding of one or more of the various inventive aspects, the features of the present invention are sometimes grouped together in a single embodiment, figure, or description thereof in the above description of exemplary embodiments. However, the disclosed method should not be interpreted as reflecting an intention that the claimed invention requires more features than those expressly recited in each claim. Rather, as the following claims reflect, the inventive aspects lie in fewer than all features of a single foregoing embodiment. Therefore, the claims following the detailed description are hereby expressly incorporated into the detailed description, with each claim standing on its own as a separate embodiment of the present invention.

Those skilled in the art will understand that the modules in the device of an embodiment may be adaptively changed and arranged in one or more devices different from that embodiment. The modules, units, or components of an embodiment may be combined into one module, unit, or component, and may furthermore be divided into multiple sub-modules, sub-units, or sub-components. Except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract, and drawings) and all processes or units of any method or device so disclosed may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract, and drawings) may be replaced by an alternative feature serving the same, an equivalent, or a similar purpose.

The various component embodiments of the present invention may be implemented in hardware, in software modules running on one or more processors, or in a combination thereof. Those skilled in the art should understand that a microprocessor or a digital signal processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components of the relevant device according to embodiments of the present invention. The present invention may also be implemented as a device or apparatus program (for example, a computer program or a computer program product) for executing part or all of the methods described herein. Such a program implementing the present invention may be stored on a computer-readable medium or may take the form of one or more signals; such signals may be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.

Furthermore, ordinal terms such as "first", "second", and "third" used in the specification and claims modify the corresponding elements but do not themselves imply or represent any ordinal ranking of those elements, nor the order between one element and another or the order in a manufacturing method; they are used only to clearly distinguish one element with a certain name from another element with the same name.

In addition, in the drawings and the description, similar or identical parts use the same figure numbers. Technical features of the various embodiments illustrated in the specification may be freely combined to form new solutions provided there is no conflict; each claim may also stand alone as an embodiment, or the technical features of the claims may be combined as new embodiments. In the drawings, the shape or thickness of an embodiment may be enlarged and marked in a simplified or convenient manner. Elements or implementations not shown or described in the drawings are in forms known to those of ordinary skill in the art. Moreover, although examples of parameters with specific values may be provided herein, the parameters need not exactly equal the corresponding values; they may approximate them within acceptable error tolerances or design constraints.
Unless there is a technical obstacle or contradiction, the various embodiments of the present invention described above may be freely combined to form further embodiments, all of which fall within the protection scope of the present invention.

Although the present invention is described with reference to the accompanying drawings, the embodiments disclosed in the drawings are intended to exemplify preferred embodiments of the present invention and should not be construed as limiting it. The dimensional proportions in the drawings are merely schematic and should not be construed as limiting the present invention.

Although some embodiments of the general concept of the present invention have been shown and described, those of ordinary skill in the art will understand that changes may be made to these embodiments without departing from the principle and spirit of the general inventive concept; the scope of the present invention is defined by the claims and their equivalents.

The above are only preferred embodiments of the present invention and are not intended to limit it; any modification, equivalent replacement, improvement, etc. made within the spirit and principles of the present invention shall be included within its protection scope.

Claims (13)

  1. A face recognition method based on a cascaded deep convolutional neural network, characterized by including:
    extracting facial features using the cascaded deep convolutional neural network; and
    performing face recognition according to the extracted facial features.
  2. The face recognition method according to claim 1, characterized in that extracting facial features using the cascaded deep convolutional neural network includes:
    sending face image data to a first network to predict face bounding-box regression;
    sending the output of the first network to a second network to predict the locations of facial key points; and
    sending the output of the second network to a third network to extract facial features.
  3. The face recognition method according to claim 2, characterized in that sending the output of the first network to the second network to predict the locations of facial key points includes:
    performing box cropping and resizing on the output of the first network before sending it to the second network; and
    using the second network to predict the locations of facial key points.
  4. The face recognition method according to claim 2, characterized in that sending the output of the second network to the third network to extract facial features includes:
    performing similarity transformation, mapping, and resizing on the output of the second network before sending it to the third network; and
    using the third network to extract facial features.
  5. The face recognition method according to claim 2, characterized in that the first network is a Face Detection Network (FDNet), the second network is a Key-point Detection Network (KDNet), and the third network is a Feature Extraction Network (FENet).
  6. The face recognition method according to claim 5, characterized in that sending the face image data to the first network to predict face bounding-box regression includes: the face detection network, with MobileNet as the backbone, directly regressing the bounding box of the face while simultaneously predicting a confidence.
  7. The face recognition method according to claim 6, characterized in that sending the output of the first network to the second network to predict the locations of facial key points includes: based on the output of the face detection network, cropping out the bounding box, resizing it to a fixed size, and sending it to the key-point detection network to directly predict the positions of five facial key points.
  8. The face recognition method according to claim 7, characterized in that sending the output of the second network to the third network to extract facial features includes: based on the five facial key points output by the key-point detection network, applying a five-point similarity transformation to the whole frame, mapping to five points at fixed golden positions, resizing the mapped face image to a fixed size, and sending it to the feature extraction network to extract facial features.
  9. The face recognition method according to claim 1, characterized in that, before extracting facial features using the cascaded deep convolutional neural network, the method further includes: collecting face image data.
  10. A face recognition system based on a cascaded deep convolutional neural network, characterized by including:
    a feature extraction module for extracting facial features using the cascaded deep convolutional neural network; and
    a face recognition module, connected to the feature extraction module, for performing face recognition according to the extracted facial features.
  11. The face recognition system according to claim 10, characterized in that the feature extraction module includes:
    a first network for receiving face image data and predicting face bounding-box regression;
    a box-cropping unit for receiving the output of the first network and performing box cropping and resizing;
    a second network for receiving the output of the box-cropping unit and predicting the locations of facial key points;
    a similarity transformation unit for receiving the output of the second network and performing similarity transformation, mapping, and resizing; and
    a third network for receiving the output of the similarity transformation unit and extracting facial features.
  12. The face recognition system according to claim 11, characterized in that the first network is a Face Detection Network (FDNet), the second network is a Key-point Detection Network (KDNet), and the third network is a Feature Extraction Network (FENet).
  13. The face recognition system according to claim 10, characterized by further including a collection module for collecting face image data.
PCT/CN2020/079281 2019-03-15 2020-03-13 Face recognition method and system based on a cascaded deep convolutional neural network WO2020187160A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910201162.4 2019-03-15
CN201910201162.4A CN111695392B (zh) 2019-03-15 Face recognition method and system based on a cascaded deep convolutional neural network

Publications (1)

Publication Number Publication Date
WO2020187160A1 true WO2020187160A1 (zh) 2020-09-24

Family

ID=72475529

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/079281 WO2020187160A1 (zh) 2019-03-15 2020-03-13 Face recognition method and system based on a cascaded deep convolutional neural network

Country Status (2)

Country Link
CN (1) CN111695392B (zh)
WO (1) WO2020187160A1 (zh)


Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112395393B (zh) * 2020-11-27 2022-09-30 华东师范大学 Distant-supervision relation extraction method based on multi-task and multi-instance learning
CN113160171B (zh) * 2021-04-20 2023-09-05 中日友好医院(中日友好临床医学研究所) Elastography image processing method and device
CN116309710B (zh) * 2023-02-27 2024-07-09 荣耀终端有限公司 Target tracking method and electronic device


Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2723118B2 (ja) * 1992-08-31 1998-03-09 インターナショナル・ビジネス・マシーンズ・コーポレイション Neural network and optical character recognition apparatus for use in recognizing two-dimensional objects
US6751354B2 (en) * 1999-03-11 2004-06-15 Fuji Xerox Co., Ltd Methods and apparatuses for video segmentation, classification, and retrieval using image class statistical models
CN103824054B (zh) * 2014-02-17 2018-08-07 北京旷视科技有限公司 Face attribute recognition method based on cascaded deep neural networks
CN104463172B (zh) * 2014-12-09 2017-12-22 重庆中科云丛科技有限公司 Facial feature extraction method based on a face key-point shape-driven depth model
CN107832700A (zh) * 2017-11-03 2018-03-23 全悉科技(北京)有限公司 Face recognition method and system
CN108304788B (zh) * 2018-01-18 2022-06-14 陕西炬云信息科技有限公司 Face recognition method based on deep neural networks
CN108564049A (zh) * 2018-04-22 2018-09-21 北京工业大学 Fast face detection and recognition method based on deep learning
CN109448707A (zh) * 2018-12-18 2019-03-08 北京嘉楠捷思信息技术有限公司 Speech recognition method, apparatus, device, and medium
CN109447053A (zh) * 2019-01-09 2019-03-08 江苏星云网格信息技术有限公司 Face recognition method based on a dual-constraint attention neural network model

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105868689A (zh) * 2016-02-16 2016-08-17 杭州景联文科技有限公司 Face occlusion detection method based on cascaded convolutional neural networks
CN106339680A (zh) * 2016-08-25 2017-01-18 北京小米移动软件有限公司 Facial key-point localization method and device
CN106485215A (zh) * 2016-09-29 2017-03-08 西交利物浦大学 Face occlusion detection method based on deep convolutional neural networks
CN106951867A (zh) * 2017-03-22 2017-07-14 成都擎天树科技有限公司 Face recognition method, apparatus, system, and device based on convolutional neural networks
CN107967456A (zh) * 2017-11-27 2018-04-27 电子科技大学 Multi-network cascaded face recognition method based on facial key points
CN108875833A (zh) * 2018-06-22 2018-11-23 北京智能管家科技有限公司 Neural network training method, face recognition method, and device

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113516146A (zh) * 2020-12-21 2021-10-19 腾讯科技(深圳)有限公司 Data classification method, computer, and readable storage medium
CN112818772A (zh) * 2021-01-19 2021-05-18 网易(杭州)网络有限公司 Facial parameter recognition method and device, electronic equipment, and storage medium
CN112749687A (zh) * 2021-01-31 2021-05-04 云知声智能科技股份有限公司 Multi-task training method and device for picture quality and silent liveness detection
CN113362110A (zh) * 2021-06-03 2021-09-07 中国电信股份有限公司 Marketing information push method and device, electronic equipment, and readable medium
CN114723756A (zh) * 2022-06-09 2022-07-08 北京理工大学 Low-resolution time-series remote sensing target detection method and device based on a dual-supervision network
CN114723756B (zh) 2022-06-09 2022-08-12 北京理工大学 Low-resolution time-series remote sensing target detection method and device based on a dual-supervision network

Also Published As

Publication number Publication date
CN111695392B (zh) 2023-09-15
CN111695392A (zh) 2020-09-22

Similar Documents

Publication Publication Date Title
WO2020187160A1 (zh) Face recognition method and system based on a cascaded deep convolutional neural network
Gou et al. Vehicle license plate recognition based on extremal regions and restricted Boltzmann machines
CN111401257B (zh) 一种基于余弦损失在非约束条件下的人脸识别方法
WO2019232866A1 (zh) 人眼模型训练方法、人眼识别方法、装置、设备及介质
WO2020182121A1 (zh) 表情识别方法及相关装置
WO2019232862A1 (zh) 嘴巴模型训练方法、嘴巴识别方法、装置、设备及介质
Ban et al. Face detection based on skin color likelihood
WO2019114036A1 (zh) 人脸检测方法及装置、计算机装置和计算机可读存储介质
Li et al. Robust visual tracking based on convolutional features with illumination and occlusion handing
WO2021139324A1 (zh) 图像识别方法、装置、计算机可读存储介质及电子设备
US10445602B2 (en) Apparatus and method for recognizing traffic signs
CN105512638B (zh) 一种基于融合特征的人脸检测与对齐方法
WO2020015752A1 (zh) 一种对象属性识别方法、装置、计算设备及***
CN110163111A (zh) 基于人脸识别的叫号方法、装置、电子设备及存储介质
CN112232184B (zh) 一种基于深度学习和空间转换网络的多角度人脸识别方法
WO2021238586A1 (zh) 一种训练方法、装置、设备以及计算机可读存储介质
WO2021203718A1 (zh) 人脸识别方法及***
Du High-precision portrait classification based on mtcnn and its application on similarity judgement
Yang et al. A Face Detection Method Based on Skin Color Model and Improved AdaBoost Algorithm.
Wang et al. Accurate playground localisation based on multi-feature extraction and cascade classifier in optical remote sensing images
Bhatia et al. Face detection using fuzzy logic and skin color segmentation in images
Obaida et al. Comparative of Viola-Jones and YOLO v3 for Face Detection in Real time
Chaki et al. Fragmented handwritten digit recognition using grading scheme and fuzzy rules
Sarmah et al. Facial identification expression-based attendance monitoring and emotion detection—A deep CNN approach
Huang et al. Eye landmarks detection via two-level cascaded CNNs with multi-task learning

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20773064

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS (EPO FORM 1205A DATED 02.02.2022)

122 Ep: pct application non-entry in european phase

Ref document number: 20773064

Country of ref document: EP

Kind code of ref document: A1