CN110688872A - Lip-based person identification method, device, program, medium, and electronic apparatus - Google Patents

Lip-based person identification method, device, program, medium, and electronic apparatus

Info

Publication number
CN110688872A
CN110688872A (application CN201810724695.6A)
Authority
CN
China
Prior art keywords
image
lip
recognized
face detection
person
Prior art date
Legal status
Pending
Application number
CN201810724695.6A
Other languages
Chinese (zh)
Inventor
刘纯平
季怡
邢腾飞
董虎胜
林欣
邬晓钧
Current Assignee
BEIJING D-EAR TECHNOLOGIES Co Ltd
Original Assignee
BEIJING D-EAR TECHNOLOGIES Co Ltd
Priority date
Filing date
Publication date
Application filed by BEIJING D-EAR TECHNOLOGIES Co Ltd filed Critical BEIJING D-EAR TECHNOLOGIES Co Ltd
Priority to CN201810724695.6A priority Critical patent/CN110688872A/en
Publication of CN110688872A publication Critical patent/CN110688872A/en
Pending legal-status Critical Current


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172Classification, e.g. identification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161Detection; Localisation; Normalisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168Feature extraction; Face representation
    • G06V40/171Local features and components; Facial parts ; Occluding parts, e.g. glasses; Geometrical relationships

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

Embodiments of the present invention provide a lip-based person identification method and apparatus, a computer program product, a computer-readable storage medium, and an electronic device. The person identification method comprises the following steps: performing face detection on an image to be recognized, and extracting a lip region image from the image to be recognized according to the face detection result; performing feature extraction on the lip region image to obtain lip features; and performing person identification on the image to be recognized according to the lip features. With the technical solution of the embodiments of the present invention, the strong discriminative power of lip features for the human mouth allows the lip region of a person to be recognized accurately, which improves the accuracy of person identification.

Description

Lip-based person identification method, device, program, medium, and electronic apparatus
Technical Field
Embodiments of the present invention relate to the field of image processing, and in particular, to a method and an apparatus for identifying a person based on lips, a computer program product, a computer-readable storage medium, and an electronic device.
Background
Person identification based on biometric features (such as fingerprints, DNA, faces, and voice) offers both security and convenience, and has broad application prospects in security-check systems at borders, airports, and similar scenes. Person identification based on the lip region of the human body is a novel biometric identification method: lip-region features can be extracted and recognized from images captured by an ordinary camera, and the method can be combined with features such as the face and voice to build a multi-modal biometric recognition system that further improves recognition accuracy.
However, person identification based on the lip region suffers from problems such as few features and small sample sizes, which keep recognition accuracy low; in addition, captured images may have complex backgrounds, over-strong or over-dark illumination, occluded lip regions, or deviated shooting angles, all of which seriously degrade recognition accuracy.
Disclosure of Invention
Embodiments of the present invention provide a lip-based person identification technique.
According to a first aspect of embodiments of the present invention, there is provided a lip-based person identification method, including: performing face detection on an image to be recognized, and extracting a lip region image from the image to be recognized according to the face detection result; performing feature extraction on the lip region image to obtain lip features; and performing person identification on the image to be recognized according to the lip features.
Optionally, the performing face detection on the image to be recognized and extracting a lip region image from the image to be recognized according to the face detection result includes: performing face detection on the image to be recognized to obtain a face detection result including face feature points; and extracting a lip region image including lip feature points from the image to be recognized according to the face detection result.
Optionally, the performing face detection on the image to be recognized includes: performing deformation processing on the image to be recognized; and performing face detection on the deformed image to be recognized.
Optionally, the performing feature extraction on the lip region image includes: performing feature extraction on the lip region image through a deep neural network system.
Optionally, the deep neural network system is obtained by fine-tuning a neural network system for face recognition on a lip region data set.
Optionally, the performing person identification on the image to be recognized according to the lip features includes: obtaining a distance between the lip features and sample lip features of a sample image; and performing person identification on the image to be recognized according to the distance.
Optionally, the obtaining a distance between the lip features and sample lip features of a sample image includes: inputting the lip features and the sample lip features of the sample image into a discrimination subspace; and obtaining the distance between the lip features and the sample lip features in the discrimination subspace.
Optionally, the inputting the lip features and the sample lip features of the sample image into a discrimination subspace includes: projecting the lip features and the sample lip features of the sample image into the discrimination subspace using a preset projection matrix.
According to a second aspect of embodiments of the present invention, there is provided a lip-based person identification apparatus, including: a detection module, configured to perform face detection on an image to be recognized and extract a lip region image from the image to be recognized according to the face detection result; an extraction module, configured to perform feature extraction on the lip region image to obtain lip features; and an identification module, configured to perform person identification on the image to be recognized according to the lip features.
Optionally, the detection module includes: a detection unit, configured to perform face detection on the image to be recognized to obtain a face detection result including face feature points; and an extraction unit, configured to extract a lip region including lip feature points from the image to be recognized according to the face detection result.
Optionally, the detection unit is configured to: perform deformation processing on the image to be recognized; and perform face detection on the deformed image to be recognized.
Optionally, the extraction module is configured to perform feature extraction on the lip region image through a deep neural network system.
Optionally, the deep neural network system is obtained by fine-tuning a neural network system for face recognition on a lip region data set.
Optionally, the identification module includes: an acquisition unit, configured to acquire a distance between the lip features and sample lip features of a sample image; and an identification unit, configured to perform person identification on the image to be recognized according to the distance.
Optionally, the acquisition unit includes: an input subunit, configured to input the lip features and the sample lip features of the sample image into a discrimination subspace; and an acquisition subunit, configured to acquire the distance between the lip features and the sample lip features in the discrimination subspace.
Optionally, the input subunit is configured to project the lip features and the sample lip features of the sample image into the discrimination subspace using a preset projection matrix.
According to a third aspect of embodiments of the present invention, there is provided a computer program product including computer program instructions which, when executed by a processor, implement the steps corresponding to any of the person identification methods provided by the embodiments of the present invention.
According to a fourth aspect of embodiments of the present invention, there is provided a computer-readable storage medium storing computer program instructions which, when executed by a processor, implement the steps corresponding to any of the person identification methods provided by the embodiments of the present invention.
According to a fifth aspect of embodiments of the present invention, there is provided an electronic device including a processor and a memory, the memory storing at least one executable instruction that causes the processor to perform the steps corresponding to any of the person identification methods provided by the embodiments of the present invention.
According to the lip-based person identification scheme of the embodiments of the present invention, face detection is performed on an image to be recognized, a lip region image is extracted from the image to be recognized according to the face detection result, feature extraction is performed on the lip region image, and person identification is performed on the image to be recognized according to the obtained lip features. The strong discriminative power of lip features for the human mouth thus allows the lip region of a person to be recognized accurately, which improves the accuracy of person identification.
Drawings
FIG. 1 is a flow diagram of a lip-based person identification method according to some embodiments of the invention;
FIG. 2 is a schematic illustration of face detection results provided according to some embodiments of the invention;
FIG. 3 is a schematic block diagram of a neural network training system provided in accordance with some embodiments of the present invention;
FIG. 4 is a block diagram of a lip-based person identification device, according to some embodiments of the invention;
FIG. 5 is a schematic structural diagram of an electronic device according to some embodiments of the invention.
Detailed Description
The following detailed description of embodiments of the invention is provided in conjunction with the accompanying drawings (like numerals indicate like elements throughout the several views) and examples. The following examples are intended to illustrate the invention but are not intended to limit the scope of the invention.
It should be understood that the technical solutions disclosed in the embodiments of the present disclosure may be mainly applied to the field of image processing/computer vision, but may also be applied to other fields, and the embodiments of the present disclosure do not limit this.
FIG. 1 is a flow diagram of a method of person identification according to some embodiments of the invention.
Referring to fig. 1, in step S110, face detection is performed on an image to be recognized, and a lip region image is extracted from the image to be recognized according to a face detection result.
In the embodiments of the present invention, the image to be recognized may be an image captured by any device, provided that it contains a human face. For example, the image to be recognized is an image of a user's head captured by an image acquisition device in a security system at an airport, a bank, or the like.
Optionally, when face detection is performed on the image to be recognized, the image is input into a neural network system for face detection, and the neural network system performs processing such as key-point detection and feature extraction on the image to obtain a face detection result including face feature points. It should be noted that the embodiments of the present invention do not limit the specific face detection method: any applicable method, including but not limited to face detection using a neural network system, may be used to perform face detection on the image to be recognized.
Accordingly, when the lip region image is extracted, a lip region image containing the lip feature points is extracted from the image to be recognized according to the face detection result. For example, the feature points in the face detection result are analyzed to obtain the lip feature points corresponding to the lip region, the image to be recognized is segmented according to the information of these lip feature points, and the sub-image covering the region where the lip feature points are located, that is, the lip region image, is cut out from the image to be recognized. Here, the face feature points are a plurality of feature points corresponding to a face, the lip feature points are a plurality of feature points corresponding to the lip region, and the set of lip feature points is a subset of the set of face feature points.
In an alternative embodiment, fig. 2 shows a face detection result with 68 facial feature points, including feature points for the tip of the chin, the outer contour of each eye, the inner contour of each eyebrow, and so on. For example, the left eye corresponds to feature points 36-41, the right eye to feature points 42-47, and the lips to feature points 48-64. The extracted lip region image includes at least the region where the lip feature points 48-64 are located.
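For illustration only, the following Python sketch crops a lip region using the open-source dlib 68-point landmark model; the library choice and the landmark index range (48-67 in dlib's convention, versus the 48-64 cited above) are assumptions, since the description does not name a specific detector.
    # Sketch: detect a face with dlib and crop the lip region from the
    # detected landmarks. Model file and index range are assumptions.
    import cv2
    import dlib
    import numpy as np

    detector = dlib.get_frontal_face_detector()
    predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

    def extract_lip_region(image_bgr, margin=10):
        gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
        faces = detector(gray)
        if not faces:
            return None  # no face detected in the image to be recognized
        shape = predictor(gray, faces[0])
        pts = np.array([(shape.part(i).x, shape.part(i).y)
                        for i in range(48, 68)], dtype=np.int32)
        x, y, w, h = cv2.boundingRect(pts)  # tight box around lip landmarks
        y0, x0 = max(y - margin, 0), max(x - margin, 0)
        return image_bgr[y0:y + h + margin, x0:x + w + margin]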
Optionally, in practical applications, the image to be recognized is subjected to deformation processing before this step, so that face detection and lip-region extraction are performed on the deformed image. Here, the deformation processing brings the facial organs of the face in the image into a frontal position, which facilitates face detection and improves its accuracy.
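One possible form of this deformation step is a similarity transform that levels the eyes, sketched below; the description does not specify the warp, so this is only an assumed example.
    # Sketch: rotate the image so the line through the eye centers is
    # horizontal, a common normalization before face detection/cropping.
    import cv2
    import numpy as np

    def align_by_eyes(image_bgr, left_eye_pts, right_eye_pts):
        # left_eye_pts / right_eye_pts: arrays of eye landmarks,
        # e.g. points 36-41 and 42-47 from the face detection result.
        left = np.mean(left_eye_pts, axis=0)
        right = np.mean(right_eye_pts, axis=0)
        angle = np.degrees(np.arctan2(right[1] - left[1], right[0] - left[0]))
        center = (float((left[0] + right[0]) / 2),
                  float((left[1] + right[1]) / 2))
        M = cv2.getRotationMatrix2D(center, angle, 1.0)
        h, w = image_bgr.shape[:2]
        return cv2.warpAffine(image_bgr, M, (w, h))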
In step S120, feature extraction is performed on the lip region image, and lip features are obtained.
Here, the feature extraction may operate on the detail information of the lip region image; the obtained lip features include the aforementioned feature points 48 to 64 and, beyond them, lower-level and higher-level features that accurately represent both the detail information and the overall semantic information of the lip region image.
In an optional implementation, the lip region image is input into a deep neural network system, and feature extraction is performed on it through the deep neural network to obtain deep lip features of the lip region. Feature extraction based on a deep neural network system captures both coarse-grained and fine-grained information well, and can accurately perform processing such as key-point detection and feature extraction. Moreover, compared with the hand-crafted features extracted manually from lip region images in the prior art, the extracted deep lip features can contain more low-level and high-level features, so that processing based on deep lip features avoids the problem of having too few features.
Here, the deep neural network system is obtained by fine-tuning a neural network system for face recognition on a lip region data set. The training sample images in the lip region data set contain lip regions and are annotated with classification labels. During training, the face recognition neural network system performs feature extraction, classification, and other processing on the training sample images to obtain classification results; difference data (such as a deviation value or a loss value) between the classification results and the annotated classification labels is calculated, and the network parameters of the neural network system are adjusted according to this difference data to obtain the deep neural network for lip feature extraction.
For example, fine-tuning is performed on the VGG-Face neural network system shown in fig. 3 based on a pre-annotated lip region data set to obtain a VGG-Lip deep neural network system, which performs deep feature extraction on the lip region image to obtain deep features, for example from the fc7 layer.
The deep neural network system may be any suitable neural network system that can implement key-point detection and feature extraction, including but not limited to convolutional neural networks, reinforcement-learning neural networks, the generator networks of generative adversarial networks, and so forth. The specific configuration of the deep neural network system, such as the number of convolutional layers, the convolution kernel size, and the number of channels, may be set by those skilled in the art according to actual requirements, and is not limited by the embodiments of the present invention.
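As a rough sketch of the fine-tuning described above, the following PyTorch code adapts a VGG-style network to a labeled lip-region data set and reads out a deep feature from the penultimate fully connected layer; torchvision's VGG-16 stands in for VGG-Face here, and all hyperparameters are illustrative.
    # Sketch: fine-tune a VGG-style classifier on a labeled lip-region data
    # set, then use the penultimate FC layer (the fc7 analogue) as the deep
    # lip feature. VGG-16 is an assumed stand-in for VGG-Face.
    import torch
    import torch.nn as nn
    from torchvision import models

    num_classes = 100  # illustrative: number of identities in the data set
    net = models.vgg16(weights="IMAGENET1K_V1")
    net.classifier[6] = nn.Linear(4096, num_classes)  # replace final layer

    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.SGD(net.parameters(), lr=1e-4, momentum=0.9)

    def train_step(images, labels):
        # Loss between classification results and annotated labels; the
        # network parameters are adjusted according to this difference data.
        optimizer.zero_grad()
        loss = criterion(net(images), labels)
        loss.backward()
        optimizer.step()
        return loss.item()

    def fc7_features(images):
        # Forward pass up to the second 4096-d linear layer + ReLU
        # (classifier[:5] in torchvision's VGG-16 layout).
        with torch.no_grad():
            x = net.features(images)
            x = net.avgpool(x)
            x = torch.flatten(x, 1)
            return net.classifier[:5](x)  # 4096-d deep lip feature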
In step S130, the image to be recognized is subjected to person recognition based on the lip features.
For example, when person recognition is performed on the deep lip features extracted by a deep neural network, the lip detail features are reinforced; since lip features have strong discriminative power for the human mouth, the lip region of a person can be recognized accurately, which further improves the accuracy of person identification.
Optionally, a distance between the lip feature and the sample lip feature of the sample image is obtained, and the image to be recognized is subjected to person recognition according to the distance.
In an alternative embodiment, when the distance between the lip feature and the sample lip feature of the sample image is obtained, the lip feature and the sample lip feature of the sample image are input into the discrimination subspace, and the distance between the lip feature and the sample lip feature in the discrimination subspace is obtained.
For example, the lip features and the sample lip features are embedded into the discrimination subspace through a pre-computed projection matrix, and the Euclidean distance between the lip features and the sample lip features in the discrimination subspace is obtained. Here, the projection matrix may be computed as follows:
1. Represent the lip features of the N training sample images as a matrix X = (x_1, x_2, ..., x_N) ∈ R^{d×N}, where each column x_i ∈ R^d is the lip feature of the i-th training sample image. Define a kernel function k(x_i, x_j) = <Φ(x_i), Φ(x_j)>, where <·,·> denotes the inner product, and compute the N×N kernel matrix K with K_{ij} = k(x_i, x_j). Then compute the total scatter matrix S_t = K(I − M)K, where I is the N×N identity matrix and M is the N×N matrix whose every element equals 1/N.
2. Different images of the same person belong to the same class, so the lip features of the N images are in fact divided into several classes. Build the intra-class affinity matrix W_w, with (W_w)_{ij} = 1 if x_j is among the kw images closest to x_i within the same class and 0 otherwise (kw = 1 if the class contains only 2 images), and the inter-class affinity matrix W_b, with (W_b)_{ij} = 1 if x_j is among the kb closest images that do not belong to the same class as x_i and 0 otherwise. Regarding the value of kw: since the number of same-class images is generally much smaller than the number of other-class images, a larger kw is better; but a class may contain few or many images, so it is recommended to set kw to the minimum class size minus 1 (excluding the image itself). Regarding the value of kb: experience shows that performance first rises and then falls as kb grows, and the best value varies between applications; in practice kb can be set to 20 to balance performance against processing speed. Compute the Laplacian matrix of the intrinsic graph of the discrimination subspace, L_w = D_w − W_w, where D_w is the diagonal matrix with (D_w)_{ii} = Σ_j (W_w)_{ij}, and the Laplacian matrix of the penalty graph, L_b = D_b − W_b, where D_b is the diagonal matrix with (D_b)_{ii} = Σ_j (W_b)_{ij}; then compute the intra-class scatter matrix S_w = K L_w K and the inter-class scatter matrix S_b = K L_b K.
3. Eigendecompose S_t as S_t = P Λ P^T, where Λ is a diagonal matrix whose N−1 non-zero eigenvalues are arranged in descending order and P consists of the corresponding eigenvectors; then calculate the whitening transform U = P Λ^{−1/2}.
4. Compute S_w' = U^T S_w U; eigendecompose S_w' and form a matrix Q from the eigenvectors corresponding to its zero eigenvalues.
5. Compute S_b' = Q^T U^T S_b U Q; eigendecompose S_b', arrange the non-zero eigenvalues in descending order, and form a matrix V from the corresponding eigenvectors.
6. Compute the projection matrix as U Q V; its first m′ columns constitute w_Φ.
Based on the computed projection matrix w_Φ, any lip feature x can be projected into the discrimination subspace; its projection is y = w_Φ^T K_x, where K_x = (k(x_1, x), ..., k(x_N, x))^T.
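The following NumPy sketch implements steps 1-6 under the reconstruction above; the RBF kernel choice, the tolerance used to decide which eigenvalues count as zero, and the parameter defaults are assumptions not fixed by the description.
    # Sketch: build the discrimination-subspace projection from training lip
    # features, following steps 1-6 above. Kernel and tolerances assumed.
    import numpy as np
    from scipy.spatial.distance import cdist

    def projection_matrix(X, labels, kw=1, kb=20, gamma=1e-3, tol=1e-8):
        # X: (d, N) lip features as columns; labels: (N,) class identities.
        N = X.shape[1]
        D2 = cdist(X.T, X.T, "sqeuclidean")
        K = np.exp(-gamma * D2)                    # step 1: kernel matrix
        M = np.full((N, N), 1.0 / N)
        St = K @ (np.eye(N) - M) @ K               # total scatter matrix

        Ww = np.zeros((N, N))                      # step 2: affinity graphs
        Wb = np.zeros((N, N))
        for i in range(N):
            same = np.where(labels == labels[i])[0]
            same = same[same != i]
            diff = np.where(labels != labels[i])[0]
            for j in same[np.argsort(D2[i, same])][:kw]:
                Ww[i, j] = Ww[j, i] = 1            # kw nearest same-class
            for j in diff[np.argsort(D2[i, diff])][:kb]:
                Wb[i, j] = Wb[j, i] = 1            # kb nearest other-class
        Lw = np.diag(Ww.sum(axis=1)) - Ww          # intrinsic-graph Laplacian
        Lb = np.diag(Wb.sum(axis=1)) - Wb          # penalty-graph Laplacian
        Sw, Sb = K @ Lw @ K, K @ Lb @ K

        lam, P = np.linalg.eigh(St)                # step 3: whiten St
        keep = lam > tol * lam.max()
        U = P[:, keep] / np.sqrt(lam[keep])

        lw, Qall = np.linalg.eigh(U.T @ Sw @ U)    # step 4: null space of Sw
        Q = Qall[:, lw < tol * max(lw.max(), 1.0)]

        lb, Vall = np.linalg.eigh(Q.T @ U.T @ Sb @ U @ Q)  # step 5
        order = np.argsort(lb)[::-1]
        V = Vall[:, order[lb[order] > tol]]        # non-zero, descending

        return U @ Q @ V                           # step 6: w_Phi

    def project(W, X_train, x, gamma=1e-3):
        # Project a new lip feature x into the discrimination subspace.
        Kx = np.exp(-gamma * np.sum((X_train.T - x) ** 2, axis=1))
        return W.T @ Kx
Keeping only the directions in which the intra-class scatter vanishes while the inter-class scatter is maximized is what pulls same-class features together and pushes different-class features apart in the subspace.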
It should be understood that the above calculation process is only one optional way of computing the projection matrix; any suitable method capable of computing a projection matrix that projects features into a discrimination subspace can be applied in the embodiments of the present invention.
The lip features and the sample lip features are projected into the discrimination subspace using the projection matrix, and the Euclidean distance between their projections is calculated. In the discrimination subspace, the distance between features of the same class is reduced and the distance between features of different classes is enlarged, so the Euclidean distance between the projections of the lip features and the sample lip features is discriminative, and person identification based on this distance can be performed accurately.
When person identification is performed according to the distance between the lip features and sample lip features, the lips in the image to be recognized may be in many different states, so the obtained distances span a wide range. Therefore, the distances between the lip features and the sample lip features of multiple sample images can be obtained, the minimum distance selected among them, and the corresponding sample image determined; that is, the image to be recognized is matched to that sample image, and the person in the determined sample image is taken as the person in the image to be recognized, thereby completing person identification.
In addition, to further improve the accuracy of person identification, lip features can be acquired from several images to be recognized of the same person; for each of these lip features, the distances to the sample lip features of the persons in the sample database are computed and the minimum distance is taken. Each minimum distance corresponds to one candidate person, and the candidate person that accounts for the largest number of minimum distances is determined as the target person, that is, the person in the images to be recognized, which improves the accuracy of person identification.
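A short sketch of this two-stage matching, with illustrative names and shapes:
    # Sketch: identify a person from several probe images of the same person
    # by taking, for each probe, the gallery identity at minimum distance in
    # the discrimination subspace, then voting over the candidates.
    from collections import Counter
    import numpy as np

    def identify(probe_feats, gallery_feats, gallery_ids):
        # probe_feats: (P, m) projected probe lip features
        # gallery_feats: (G, m) projected sample lip features, ids: (G,)
        candidates = []
        for p in probe_feats:
            d = np.linalg.norm(gallery_feats - p, axis=1)  # Euclidean
            candidates.append(gallery_ids[int(np.argmin(d))])
        return Counter(candidates).most_common(1)[0][0]    # majority vote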
In some optional embodiments of the present invention, person identification based on lip features can be combined with person identification based on other biometric features (such as facial features or voice features) to form a multi-modal biometric person identification system. When the lip-based result indicates that identification has failed, identification based on another biometric feature can then be performed, which ensures the reliability of the system; the lip-based result can also be compared with results based on other features to further improve the accuracy of person identification.
For example, a multi-modal biometric person identification system comprises a lip recognition subsystem and a voiceprint recognition subsystem. The lip recognition subsystem performs person identification on the image to be recognized using the lip-based method above and obtains an identification result; if that result indicates failure, the voiceprint recognition subsystem acquires voice data of the person corresponding to the image and processes it (feature extraction, voiceprint recognition, and so on) with a pre-trained voiceprint recognition model to obtain a voiceprint-based identification result. Here, whether lip-based identification succeeded can be decided from the minimum of the distances between the lip features of the image to be recognized and the sample lip features of the sample images: if the minimum distance is smaller than a preset distance threshold, identification is deemed successful; otherwise, identification is deemed failed.
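One possible shape of this fallback logic is sketched below; the threshold value and the voiceprint_identify call are hypothetical stand-ins for the voiceprint subsystem.
    # Sketch: accept the lip-based match only when the minimum subspace
    # distance beats a preset threshold; otherwise fall back to voiceprint
    # recognition. voiceprint_identify is a hypothetical subsystem call.
    import numpy as np

    DIST_THRESHOLD = 0.5  # illustrative; tuned per deployment

    def recognize(probe_feat, gallery_feats, gallery_ids, voice_data=None):
        d = np.linalg.norm(gallery_feats - probe_feat, axis=1)
        i = int(np.argmin(d))
        if d[i] < DIST_THRESHOLD:
            return gallery_ids[i]  # lip-based identification succeeded
        if voice_data is not None:
            return voiceprint_identify(voice_data)  # hypothetical fallback
        return None  # identification failed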
According to the person identification method of the embodiments of the present invention, face detection is performed on the image to be recognized, the lip region image is extracted according to the face detection result, features are extracted from the lip region image, and person identification is performed according to the obtained lip features; the strong discriminative power of lip features for the human mouth thus allows accurate recognition of the lip region and improves identification accuracy. Furthermore, deep lip features are extracted by a deep neural network system and projected into a discrimination subspace, where features of the same class lie close together and features of different classes lie far apart; person identification is therefore effective and robust to illumination, shooting angle, and the like, which improves its accuracy.
The person identification method provided by the embodiments of the present invention may be executed by any appropriate device with the corresponding image or data processing capability, including but not limited to a terminal device such as a computer, and a computer program or processor integrated on the terminal device.
Based on the same technical concept, fig. 4 is a block diagram of a lip-based person identification apparatus according to some embodiments of the present invention, which can be used to execute the person identification method flow described in the above embodiments.
Referring to fig. 4, a person identification apparatus according to some alternative embodiments of the present invention includes: a detection module 410, configured to perform face detection on an image to be recognized and extract a lip region image from the image to be recognized according to the face detection result; an extraction module 420, configured to perform feature extraction on the lip region image to obtain lip features; and an identification module 430, configured to perform person identification on the image to be recognized according to the lip features.
Optionally, the detection module 410 includes: a detection unit 4101, configured to perform face detection on the image to be recognized to obtain a face detection result including face feature points; and an extraction unit 4102, configured to extract a lip region including lip feature points from the image to be recognized according to the face detection result.
Optionally, the detection unit 4101 is configured to: perform deformation processing on the image to be recognized; and perform face detection on the deformed image to be recognized.
Optionally, the extraction module 420 is configured to perform feature extraction on the lip region image through a deep neural network system.
Optionally, the deep neural network system is obtained by fine-tuning a neural network system for face recognition on a lip region data set.
Optionally, the identification module 430 includes: an acquisition unit 4301, configured to acquire a distance between the lip features and sample lip features of a sample image; and an identification unit 4302, configured to perform person identification on the image to be recognized according to the distance.
Optionally, the acquisition unit 4301 includes: an input subunit, configured to input the lip features and the sample lip features of the sample image into a discrimination subspace; and an acquisition subunit (not shown in the figure), configured to acquire the distance between the lip features and the sample lip features in the discrimination subspace.
Optionally, the input subunit is configured to project the lip features and the sample lip features of the sample image into the discrimination subspace using a preset projection matrix.
The person identification device of the embodiment of the invention is used for realizing the corresponding person identification method in the foregoing method embodiments, and has the beneficial effects of the corresponding method embodiments, which are not described herein again.
Some embodiments of the invention also provide a computer program comprising computer program instructions for implementing the steps of any of the person identification methods provided by the embodiments of the invention when the program instructions are executed by a processor.
Some embodiments of the present invention also provide a computer-readable storage medium on which computer program instructions are stored, the program instructions, when executed by a processor, implementing the steps of any of the person identification methods provided by the embodiments of the present invention.
Some embodiments of the present invention also provide an electronic device, which may be, for example, a mobile terminal, a personal computer (PC), a tablet, or a server. Referring now to fig. 5, there is shown a schematic structural diagram of an electronic device 500 suitable for use as a terminal device or server for implementing embodiments of the present invention. As shown in fig. 5, the electronic device 500 includes one or more processors and communication elements, for example: one or more central processing units (CPUs) 501 and/or one or more graphics processors (GPUs) 513, which can perform various appropriate actions and processes according to executable instructions stored in a read-only memory (ROM) 502 or loaded from a storage section 508 into a random access memory (RAM) 503. The communication elements include a communication component 512 and/or a communication interface 509. The communication component 512 may include, but is not limited to, a network card, which may include, but is not limited to, an IB (InfiniBand) network card; the communication interface 509 includes an interface such as a LAN network interface card or a modem, and performs communication processing via a network such as the Internet.
The processor may communicate with the read-only memory 502 and/or the random access memory 503 to execute the executable instructions, connect with the communication component 512 through the communication bus 504, and communicate with other target devices through the communication component 512, thereby completing the operations corresponding to any person identification method provided by the embodiments of the present invention, for example: performing face detection on an image to be recognized, and extracting a lip region image from the image to be recognized according to the face detection result; performing feature extraction on the lip region image to obtain lip features; and performing person identification on the image to be recognized according to the lip features.
In addition, the RAM 503 can also store various programs and data necessary for the operation of the apparatus. The CPU 501 or GPU 513, the ROM 502, and the RAM 503 are connected to each other through the communication bus 504. When the RAM 503 is present, the ROM 502 is an optional module: the RAM 503 stores the executable instructions, or the executable instructions are written into the ROM 502 at run time, and these instructions cause the processor to perform the operations corresponding to the person identification method described above. An input/output (I/O) interface 505 is also connected to the communication bus 504. The communication component 512 may be integrated, or may be configured with multiple sub-modules (e.g., multiple IB cards) linked over the communication bus.
The following components are connected to the I/O interface 505: an input section 506 including a keyboard, a mouse, and the like; an output section 507 including a display such as a cathode ray tube (CRT) or a liquid crystal display (LCD), and a speaker; a storage section 508 including a hard disk and the like; and the communication interface 509 comprising a network interface card such as a LAN card or a modem. The drive 510 is also connected to the I/O interface 505 as necessary. A removable medium 511, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 510 as necessary, so that a computer program read out from it can be installed into the storage section 508 as needed.
It should be noted that the architecture shown in fig. 5 is only an optional implementation manner, and in a specific practical process, the number and types of the components in fig. 5 may be selected, deleted, added or replaced according to actual needs; in different functional component settings, separate settings or integrated settings may also be used, for example, the GPU and the CPU may be separately set or the GPU may be integrated on the CPU, the communication element may be separately set, or the GPU and the CPU may be integrated, and so on. These alternative embodiments are all within the scope of the present invention.
In particular, according to an embodiment of the present invention, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present invention include a computer program product comprising a computer program tangibly embodied on a machine-readable medium, the computer program comprising program code for performing the method illustrated in the flowchart, the program code may include instructions corresponding to performing steps of a person recognition method provided by embodiments of the present invention, for example, performing face detection on an image to be recognized, extracting a lip region image from the image to be recognized according to a face detection result; performing feature extraction on the lip region image to obtain lip features; and performing person identification on the image to be identified according to the lip characteristics. In such an embodiment, the computer program may be downloaded and installed from a network via the communication element, and/or installed from the removable medium 511. Which when executed by a processor performs the above-described functions defined in the method of an embodiment of the invention.
It should be noted that, according to the implementation requirement, each component/step described in the embodiment of the present invention may be divided into more components/steps, and two or more components/steps or partial operations of the components/steps may also be combined into a new component/step to achieve the purpose of the embodiment of the present invention.
The above-described method according to embodiments of the present invention may be implemented in hardware or firmware, or as software or computer code that can be stored in a recording medium such as a CD-ROM, a RAM, a floppy disk, a hard disk, or a magneto-optical disk, or as computer code originally stored in a remote recording medium or a non-transitory machine-readable medium, downloaded through a network, and stored in a local recording medium, so that the method described herein can be processed by software stored on a recording medium using a general-purpose computer, a dedicated processor, or programmable or dedicated hardware such as an ASIC or FPGA. It will be appreciated that the computer, processor, microprocessor controller, or programmable hardware includes memory components (e.g., RAM, ROM, flash memory, etc.) that can store or receive software or computer code that, when accessed and executed by the computer, processor, or hardware, implements the processing methods described herein. Further, when a general-purpose computer accesses code for implementing the processes shown herein, execution of the code transforms the general-purpose computer into a special-purpose computer for performing those processes.
Those of ordinary skill in the art will appreciate that the various illustrative elements and method steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present embodiments.
The above embodiments are only for illustrating the embodiments of the present invention and not for limiting the embodiments of the present invention, and those skilled in the art can make various changes and modifications without departing from the spirit and scope of the embodiments of the present invention, so that all equivalent technical solutions also belong to the scope of the embodiments of the present invention, and the scope of patent protection of the embodiments of the present invention should be defined by the claims.

Claims (10)

1. A lip-based person identification method, comprising:
performing face detection on an image to be recognized, and extracting a lip region image from the image to be recognized according to the face detection result;
performing feature extraction on the lip region image to obtain lip features;
and performing person identification on the image to be recognized according to the lip features.
2. The method according to claim 1, wherein the performing face detection on the image to be recognized and extracting the lip region image from the image to be recognized according to the face detection result comprises:
carrying out face detection on an image to be recognized to obtain a face detection result comprising face characteristic points;
and extracting a lip region image including lip feature points from the image to be recognized according to the face detection result.
3. The method according to claim 1, wherein the performing face detection on the image to be recognized comprises:
performing deformation processing on the image to be recognized;
and performing face detection on the deformed image to be recognized.
4. The method of claim 1, wherein the feature extracting the lip region image comprises:
and performing feature extraction on the lip region image through a deep neural network system.
5. The method according to claim 4, wherein the deep neural network system is obtained by fine-tuning a neural network system for face recognition on a lip region data set.
6. The method according to any one of claims 1 to 5, wherein the performing person identification on the image to be recognized according to the lip features comprises:
obtaining a distance between the lip features and sample lip features of a sample image;
and performing person identification on the image to be recognized according to the distance.
7. A lip-based person identification apparatus, comprising:
a detection module, configured to perform face detection on an image to be recognized and extract a lip region image from the image to be recognized according to the face detection result;
an extraction module, configured to perform feature extraction on the lip region image to obtain lip features;
and an identification module, configured to perform person identification on the image to be recognized according to the lip features.
8. A computer program product, comprising computer program instructions, wherein the program instructions, when executed by a processor, implement the steps corresponding to the person identification method of any one of claims 1 to 6.
9. A computer-readable storage medium storing computer program instructions, wherein the program instructions, when executed by a processor, implement the steps corresponding to the person identification method of any one of claims 1 to 6.
10. An electronic device, comprising: a processor and a memory, the memory being configured to store at least one executable instruction, the executable instruction causing the processor to perform the steps corresponding to the person identification method according to any one of claims 1 to 6.
CN201810724695.6A 2018-07-04 2018-07-04 Lip-based person identification method, device, program, medium, and electronic apparatus Pending CN110688872A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810724695.6A CN110688872A (en) 2018-07-04 2018-07-04 Lip-based person identification method, device, program, medium, and electronic apparatus

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810724695.6A CN110688872A (en) 2018-07-04 2018-07-04 Lip-based person identification method, device, program, medium, and electronic apparatus

Publications (1)

Publication Number Publication Date
CN110688872A true CN110688872A (en) 2020-01-14

Family

ID=69106455

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810724695.6A Pending CN110688872A (en) 2018-07-04 2018-07-04 Lip-based person identification method, device, program, medium, and electronic apparatus

Country Status (1)

Country Link
CN (1) CN110688872A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111783670A (en) * 2020-07-02 2020-10-16 哈尔滨理工大学 Face recognition method based on neural network and face composition
CN112672021A (en) * 2020-12-25 2021-04-16 维沃移动通信有限公司 Language identification method and device and electronic equipment
CN113094682A (en) * 2021-04-12 2021-07-09 中国工商银行股份有限公司 Anti-fraud identity recognition method and device

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101030244A (en) * 2006-03-03 2007-09-05 中国科学院自动化研究所 Automatic identity discriminating method based on human-body physiological image sequencing estimating characteristic
CN102360421A (en) * 2011-10-19 2012-02-22 苏州大学 Face identification method and system based on video streaming
US20140050392A1 (en) * 2012-08-15 2014-02-20 Samsung Electronics Co., Ltd. Method and apparatus for detecting and tracking lips
CN104200146A (en) * 2014-08-29 2014-12-10 华侨大学 Identity verifying method with video human face and digital lip movement password combined
CN106295501A (en) * 2016-07-22 2017-01-04 中国科学院自动化研究所 The degree of depth based on lip movement study personal identification method
CN106611114A (en) * 2015-10-21 2017-05-03 中兴通讯股份有限公司 Equipment using authority determination method and device
CN108171138A (en) * 2017-12-22 2018-06-15 银河水滴科技(北京)有限公司 A kind of biological information acquisition methods and device

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101030244A (en) * 2006-03-03 2007-09-05 中国科学院自动化研究所 Automatic identity discriminating method based on human-body physiological image sequencing estimating characteristic
CN102360421A (en) * 2011-10-19 2012-02-22 苏州大学 Face identification method and system based on video streaming
US20140050392A1 (en) * 2012-08-15 2014-02-20 Samsung Electronics Co., Ltd. Method and apparatus for detecting and tracking lips
CN104200146A (en) * 2014-08-29 2014-12-10 华侨大学 Identity verifying method with video human face and digital lip movement password combined
CN106611114A (en) * 2015-10-21 2017-05-03 中兴通讯股份有限公司 Equipment using authority determination method and device
CN106295501A (en) * 2016-07-22 2017-01-04 中国科学院自动化研究所 The degree of depth based on lip movement study personal identification method
CN108171138A (en) * 2017-12-22 2018-06-15 银河水滴科技(北京)有限公司 A kind of biological information acquisition methods and device

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111783670A (en) * 2020-07-02 2020-10-16 哈尔滨理工大学 Face recognition method based on neural network and face composition
CN112672021A (en) * 2020-12-25 2021-04-16 维沃移动通信有限公司 Language identification method and device and electronic equipment
CN112672021B (en) * 2020-12-25 2022-05-17 维沃移动通信有限公司 Language identification method and device and electronic equipment
CN113094682A (en) * 2021-04-12 2021-07-09 中国工商银行股份有限公司 Anti-fraud identity recognition method and device

Similar Documents

Publication Publication Date Title
KR101998112B1 (en) Method for recognizing partial obscured face by specifying partial area based on facial feature point, recording medium and apparatus for performing the method
CN104123543B (en) A kind of eye movement recognition methods based on recognition of face
Pundlik et al. Non-ideal iris segmentation using graph cuts
CN106557723B (en) Face identity authentication system with interactive living body detection and method thereof
US11244035B2 (en) Apparatus and methods for biometric verification
EP2704052A1 (en) Transaction verification system
KR101997479B1 (en) Detecting method and apparatus of biometrics region for user authentication
Das et al. A new efficient and adaptive sclera recognition system
Barpanda et al. Iris feature extraction through wavelet mel-frequency cepstrum coefficients
CN111626371A (en) Image classification method, device and equipment and readable storage medium
CN110688872A (en) Lip-based person identification method, device, program, medium, and electronic apparatus
Kawulok Energy-based blob analysis for improving precision of skin segmentation
Das et al. Fuzzy logic based selera recognition
Borghi et al. Face Verification from Depth using Privileged Information.
Hafner et al. Deep iris feature extraction
Raja et al. Smartphone based robust iris recognition in visible spectrum using clustered k-means features
KR102063745B1 (en) Apparatus and method for user identifying based on face profiling
CN106709442B (en) Face recognition method
Pathak et al. Multimodal eye biometric system based on contour based E-CNN and multi algorithmic feature extraction using SVBF matching
Sarode et al. Review of iris recognition: an evolving biometrics identification technology
Singh et al. Speaker identification using optimal lip biometrics
Habeeb Comparison between physiological and behavioral characteristics of biometric system
US11195009B1 (en) Infrared-based spoof detection
Amjed et al. Noncircular iris segmentation based on weighted adaptive hough transform using smartphone database
Prasanth et al. Fusion of iris and periocular biometrics authentication using CNN

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20200114