CN112766238B - Age prediction method and device - Google Patents

Age prediction method and device

Info

Publication number
CN112766238B
CN112766238B (application number CN202110278664.4A)
Authority
CN
China
Prior art keywords
face
dynamic change
network
age
acquired
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110278664.4A
Other languages
Chinese (zh)
Other versions
CN112766238A (en)
Inventor
陈晨
冯子钜
叶润源
毛永雄
董帅
邹昆
李悦乔
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhongshan Xidao Technology Co ltd
University of Electronic Science and Technology of China Zhongshan Institute
Original Assignee
Zhongshan Xidao Technology Co ltd
University of Electronic Science and Technology of China Zhongshan Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhongshan Xidao Technology Co ltd, University of Electronic Science and Technology of China Zhongshan Institute filed Critical Zhongshan Xidao Technology Co ltd
Priority to CN202110278664.4A priority Critical patent/CN112766238B/en
Publication of CN112766238A publication Critical patent/CN112766238A/en
Application granted granted Critical
Publication of CN112766238B publication Critical patent/CN112766238B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168Feature extraction; Face representation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/2155Generating training patterns; Bootstrap methods, e.g. bagging or boosting characterised by the incorporation of unlabelled data, e.g. multiple instance learning [MIL], semi-supervised techniques using expectation-maximisation [EM] or naïve labelling
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/178Human faces, e.g. facial parts, sketches or expressions estimating age from face image; using age information for improving recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The application provides an age prediction method and device, relating to the field of image recognition. According to the age prediction method, the age prediction result corresponding to the face images is output according to the face features, the face dynamic change features and a pre-trained age prediction model; both the face features of each face image acquired at a plurality of moments and the face dynamic change features across those images are taken into account. The predicted age information is therefore more accurate.

Description

Age prediction method and device
Technical Field
The application relates to the field of image recognition, in particular to an age prediction method and an age prediction device.
Background
Generally, in the field of image recognition, the age of the user corresponding to a face image can be predicted by extracting image features from the face image and then analyzing the extracted features.
At present, the specific way of determining the age of the user corresponding to a face image through image recognition is as follows: four partial images (the left eye, the right eye, the nose and the mouth) are obtained from the face image; multi-scale local features are extracted from each of the four partial images; and the multi-scale local features of the four partial images are concatenated to obtain a face fusion feature. Finally, the age of the user corresponding to the face image is predicted according to the face fusion feature. However, the accuracy of the age determined in this manner is low.
Disclosure of Invention
The embodiments of the application aim to provide an age prediction method and an age prediction device to solve the problem that the age of a user determined from a face image has low accuracy.
In a first aspect, an embodiment of the present application provides an age prediction method, including:
acquiring face images of the same user acquired at a plurality of moments;
extracting face characteristics of each face image;
extracting face dynamic change features according to the face features of the face image acquired at the previous moment and the face features of the face image acquired at the later moment in every two adjacent moments, wherein the previous moment is earlier than the later moment;
and outputting an age prediction result corresponding to the face image according to the face characteristics, the face dynamic change characteristics and a pre-trained age prediction model, wherein the age prediction model is trained by taking face characteristics, face dynamic change characteristics and real age information of a historical face image sample acquired at a plurality of historical moments as inputs of the network to be trained.
In a second aspect, an embodiment of the present application further provides an age prediction model training method, where the method includes:
determining an age prediction result through the network to be trained based on the face features and the face dynamic change features of historical face image samples acquired at a plurality of historical moments;
determining a loss function according to the age prediction result and the real age information corresponding to the historical face image sample;
determining whether the loss function is smaller than a preset threshold value;
if the loss function is not smaller than the preset threshold, updating the historical face image sample, updating the network parameters of the network to be trained based on the loss function, and returning to the step of determining the age prediction result through the network to be trained based on the face features and the face dynamic change features of the historical face image samples acquired at a plurality of historical moments;
if the loss function is smaller than the preset threshold, establishing an age prediction model based on the current network parameters of the network to be trained.
In a third aspect, an embodiment of the present application further provides an age prediction apparatus, including:
the information acquisition unit is used for acquiring face images of the same user acquired at a plurality of moments;
the feature extraction unit is used for extracting the face features of each face image;
the feature extraction unit is further configured to extract the face dynamic change features according to the face features of the face image acquired at the previous moment and the face features of the face image acquired at the later moment in every two adjacent moments of the plurality of moments, where the previous moment is earlier than the later moment;
the age prediction unit is used for outputting an age prediction result corresponding to the face image according to the face characteristics, the face dynamic change characteristics and a pre-trained age prediction model, wherein the age prediction model is trained by taking face characteristics, face dynamic change characteristics and real age information of a historical face image sample acquired at a plurality of historical moments as input of the network to be trained.
In a fourth aspect, an embodiment of the present application further provides an age prediction model training apparatus, where the apparatus includes:
the information determining unit is used for determining an age prediction result through the network to be trained based on the face features and the face dynamic change features of historical face image samples acquired at a plurality of historical moments;
the information determining unit is further used for determining a loss function according to the age prediction result and the real age information corresponding to the historical face image sample;
the information determining unit is further configured to determine whether the loss function is smaller than a preset threshold;
the information updating unit is used for, if the loss function is not smaller than the preset threshold, updating the historical face image sample, updating the network parameters of the network to be trained based on the loss function, and returning to the step of determining the age prediction result through the network to be trained based on the face features and the face dynamic change features of the historical face image samples acquired at a plurality of historical moments;
and the model building unit is used for establishing an age prediction model based on the current network parameters of the network to be trained if the loss function is smaller than the preset threshold.
In a fifth aspect, an embodiment of the present application provides an electronic device comprising a processor and a memory storing computer readable instructions which, when executed by the processor, perform the steps of the method as provided in the first aspect above.
In a sixth aspect, an embodiment of the present application provides a readable storage medium having stored thereon a computer program which when executed by a processor performs the steps of the method as provided in the first aspect above.
Compared with the prior art, the application has the following beneficial effects: according to the age prediction method, the age prediction result corresponding to the face images is output according to the face features, the face dynamic change features and a pre-trained age prediction model; both the face features of each face image acquired at a plurality of moments and the face dynamic change features across those images are taken into account. The predicted age information is therefore more accurate.
Additional features and advantages of the application will be set forth in the description which follows, and in part will be apparent from the description, or may be learned by practice of the embodiments of the application. The objectives and other advantages of the application will be realized and attained by the structure particularly pointed out in the written description and claims thereof as well as the appended drawings.
Drawings
For a clearer description of the technical solutions of the embodiments of the present application, the drawings needed in the embodiments are briefly introduced below. It should be understood that the following drawings illustrate only some embodiments of the present application and should not be considered as limiting its scope; a person skilled in the art can derive other related drawings from these drawings without inventive effort.
FIG. 1 is a flowchart illustrating a method for age prediction according to an embodiment of the present application;
FIG. 2 is a second flowchart of an age prediction method according to an embodiment of the present application;
fig. 3 is an interaction schematic diagram of a server and a terminal device provided in an embodiment of the present application;
FIG. 4 is a flowchart of an age prediction model training method according to an embodiment of the present application;
fig. 5 is a schematic structural diagram of a cascaded multi-layer graph convolutional neural network according to an embodiment of the present application;
fig. 6 is a schematic structural diagram of a graph convolutional neural network according to an embodiment of the present application;
fig. 7 is a schematic diagram of a functional unit of an age prediction device according to an embodiment of the present application;
FIG. 8 is a schematic diagram of a functional unit of an age prediction model training device according to an embodiment of the present application;
fig. 9 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application.
Technical term interpretation:
a Long Short-Term Memory (LSTM) network is a time-loop neural network, and is specifically designed to solve the Long-Term dependency problem of the common RNN (loop neural network). Due to the unique design structure, LSTM is suitable for processing and predicting very long-spaced and delayed important events in a time series. The LSTM network is ingenious in that the weight coefficient between the connection is designed by adding the input gate, the forget gate and the output gate, so that the LSTM network can accumulate long-term connection between nodes with longer distance, and long-term memory of data is realized.
A Bi-Long Short-Term Memory (Bi-LSTM) network, the Bi-Long Short-Term Memory neural network being formed by combining a forward LSTM with a backward LSTM. For the output of the t moment, the forward LSTM layer has information of the t moment and the previous moment in the input sequence, and the backward LSTM layer has information of the t moment and the next moment in the input sequence, so that the relevance between the context information can be determined.
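A minimal runnable illustration of this Bi-LSTM behaviour, assuming PyTorch (all sizes arbitrary), follows; each output step concatenates the forward direction (moment t and earlier) with the backward direction (moment t and later):

```python
import torch
import torch.nn as nn

bilstm = nn.LSTM(input_size=8, hidden_size=16, bidirectional=True, batch_first=True)
seq = torch.randn(1, 5, 8)    # one sequence: 5 moments, 8 features per moment
out, _ = bilstm(seq)
print(out.shape)               # torch.Size([1, 5, 32]): 16 forward + 16 backward per moment
```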
As shown in fig. 1, the age prediction method provided by the present application may include the following pipeline. The face images of the same user acquired at a plurality of moments are preprocessed, and the preprocessed face images are passed through graph convolutional layers to obtain the face features of each face image. A long short-term memory network then derives the face dynamic change features from the extracted face features. A bidirectional long short-term memory network takes the fusion feature of the face features and the face dynamic change features and outputs a second target vector, which indicates the relationship between the fusion feature and age. The fully connected layer then maps the second target vector into a preset age interval; finally, the softmax function determines the probability of each age in the preset age interval, and the age with the highest probability is selected as the output age prediction result. The age prediction method provided by the application is described in detail below with reference to fig. 2 to 6.
Referring to fig. 2, the present application provides an age prediction method, which can be applied to a server 100. As shown in fig. 3, the server 100 may be communicatively connected to the terminal device 200 for data exchange.
The method comprises the following steps:
s21: face images of the same user acquired at a plurality of moments are acquired.
The user may trigger the terminal device 200 to collect face images at a plurality of moments on the display interface of an application of the terminal device 200 with a photographing function, for example by using a three-shot or five-shot burst mode in a camera application. Alternatively, the user may trigger the collection through an application with a video shooting function, for example by recording a video from which face images at multiple moments are taken. The plurality of moments may be consecutive; for example, with images captured every 100 ms, the moments may be 100 ms, 200 ms, 300 ms, 400 ms and 500 ms. The moments may also be non-consecutive; for example, with images captured every 100 ms, the moments may be 100 ms, 300 ms and 500 ms. Neither case is limited here. The server 100 may receive the face images collected at the plurality of moments from the terminal device 200.
Optionally, the face images acquired at the plurality of moments are preprocessed. The preprocessing includes at least: graying, normalization, extraction of the face region from the face image, and the like. The normalization may proceed as follows: normalize parameters such as the size and brightness of the face images acquired at the plurality of moments, so as to facilitate face recognition and the determination of face change features.
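As a concrete illustration of S21 and this optional preprocessing, the following sketch (assuming OpenCV and NumPy; the 100 ms interval and the 224x224 target size are illustrative choices, not values fixed by the application) samples frames from a recorded video, then applies graying and normalization:

```python
import cv2
import numpy as np

def sample_and_preprocess(video_path, interval_ms=100, n_frames=5, size=(224, 224)):
    cap = cv2.VideoCapture(video_path)
    frames = []
    for i in range(n_frames):
        cap.set(cv2.CAP_PROP_POS_MSEC, (i + 1) * interval_ms)  # seek to 100 ms, 200 ms, ...
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)  # graying
        gray = cv2.resize(gray, size)                    # size normalization
        frames.append(gray.astype(np.float32) / 255.0)   # brightness normalization
    cap.release()
    return np.stack(frames) if frames else None          # (n_frames, 224, 224)
```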
S22: and extracting the face characteristics of each face image.
In the embodiment of the application, the face feature may be an undirected graph feature. The undirected graph feature represents the position relationship of each pixel in the face image relative to the other pixels. S22 may be performed as follows:
Step A: convert each face image into an undirected graph matrix.
The undirected graph matrix represents the position relationship of each pixel in the face image relative to the other pixels. Specifically, each pixel of the face image may be taken as a vertex of the undirected graph; each vertex is connected to every pixel whose Euclidean distance from it is smaller than a preset distance threshold, and an adjacency matrix is generated for each pixel; the adjacency matrices of all pixels are then assembled into the edge set E. If the face image contains w×h pixels, the converted undirected graph matrix is G = {V, E}, where V is the set of w×h vertex pixels and E is the edge set.
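A minimal sketch of step A follows, assuming NumPy: every pixel becomes a vertex of G = {V, E}, and two vertices are connected when the Euclidean distance between their pixel coordinates is below the preset threshold. The threshold of 1.5, which links each interior pixel to its eight neighbours, is an illustrative choice:

```python
import numpy as np

def image_to_undirected_graph(img, dist_threshold=1.5):
    h, w = img.shape
    ys, xs = np.mgrid[0:h, 0:w]
    coords = np.stack([ys.ravel(), xs.ravel()], axis=1).astype(np.float32)
    # Pairwise Euclidean distances between pixel positions. This dense form is
    # only practical for small images; real implementations would use sparsity.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(-1))
    adj = ((dist > 0) & (dist < dist_threshold)).astype(np.float32)  # edge set E
    features = img.reshape(-1, 1).astype(np.float32)  # one intensity per vertex in V
    return features, adj

features, adj = image_to_undirected_graph(np.random.rand(8, 8))  # 64 vertices, 8-neighbour edges
```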
Step B: input the undirected graph matrix into a cascaded multi-layer graph convolutional neural network to obtain the undirected graph features.
Fig. 5 shows the structure of a cascaded multi-layer graph convolutional neural network (the network in fig. 5 contains five graph convolutional layers). The parameters of each graph convolutional layer are listed in Table 1:
Layer                                  Parameters
Graph convolutional neural network 1   R=9, Q=32, ReLU, C=2
Graph convolutional neural network 2   R=9, Q=32, ReLU, C=2
Graph convolutional neural network 3   R=6, Q=64, ReLU, C=1
Graph convolutional neural network 4   R=6, Q=64, ReLU, C=1
Graph convolutional neural network 5   R=4, Q=128, ReLU, C=1
TABLE 1
In Table 1, R is the size of the filter, Q is the number of undirected graphs output to the next layer, ReLU denotes the activation function, and C denotes the number of coarsening steps.
As shown in fig. 6, each layer of the graph convolutional neural network includes a filtering layer, an activation layer and a coarsening layer. Specifically, the processing of the undirected graph matrix by each layer is as follows. In the filtering layer, the undirected graph matrix is filtered to obtain a vertex vector matrix formed by all vertex pixels of the undirected graph matrix. (The filtering may proceed as follows: first, the undirected graph matrix in the spatial domain is transformed into the frequency domain; then, the filtering operation is performed on the frequency-domain undirected graph matrix to obtain the vertex vector matrix formed by all vertex pixels; finally, this vertex vector matrix is transformed back into the spatial domain for processing by the activation function.) In the activation layer, nonlinear processing is applied to the vertex vector matrix according to the activation function (the ReLU function), and in the coarsening layer the nonlinearly processed vertex vector matrix is coarsened.
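A hedged, much-simplified sketch of one such layer follows, assuming PyTorch. The frequency-domain (spectral) filtering described above is approximated here by a first-order neighbour aggregation, and coarsening by merging pairs of vertices; both are simplifications for illustration, not the exact construction used by the application:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GraphConvLayer(nn.Module):
    """One simplified layer: filtering -> activation (ReLU) -> coarsening."""
    def __init__(self, in_dim, out_dim, coarsen_times=1):
        super().__init__()
        self.weight = nn.Linear(in_dim, out_dim)   # learnable filter weights
        self.coarsen_times = coarsen_times

    def forward(self, x, adj):                     # x: (N, in_dim), adj: (N, N)
        deg = adj.sum(-1, keepdim=True).clamp(min=1.0)
        x = torch.relu(self.weight((adj @ x) / deg))    # filter + activation
        for _ in range(self.coarsen_times):              # coarsen: merge vertex pairs
            if x.size(0) % 2:                             # pad to an even vertex count
                x = torch.cat([x, x[-1:]], dim=0)
                adj = F.pad(adj, (0, 1, 0, 1))
            n = x.size(0)
            x = x.view(n // 2, 2, -1).mean(1)
            adj = adj.view(n // 2, 2, n).sum(1).view(n // 2, n // 2, 2).sum(-1)
        return x, adj

# Mirroring layer 1 of Table 1 in spirit (Q=32 output maps, C=2 coarsenings);
# the filter size R is not modelled by this dense simplification.
layer = GraphConvLayer(1, 32, coarsen_times=2)
x, adj = layer(torch.randn(64, 1), torch.ones(64, 64))   # 64 vertices -> 16 after C=2
```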
In addition, in the embodiment of the application, the face features in the face image may also be extracted by a feature extraction algorithm based on geometric features, on a neural network, on elastic graph matching, or on a support vector machine, which is not limited here. The face features may include a left-eye feature, a right-eye feature, a nose feature, a mouth feature, or a combination of at least two of these features.
S23: and extracting the dynamic change characteristics of the human face according to the human face characteristics of the human face images acquired at the previous moment and the human face characteristics of the human face images acquired at the later moment in every two adjacent moments.
Here, the previous moment is earlier than the later moment. For example, if the face images are acquired at 100 ms, 200 ms and 300 ms, the face dynamic change features between every two adjacent moments include: the dynamic change feature of the 200 ms face image relative to the 100 ms face image, and the dynamic change feature of the 300 ms face image relative to the 200 ms face image. The face dynamic change features may include position change features of pixels, brightness change features of pixels, color change features of pixels, and the like, which are not limited here.
Specifically, the face features of the face image acquired at the previous moment and the face features of the face image acquired at the later moment in every two adjacent moments are taken as the input of a pre-trained dynamic change feature generation model, which determines the face dynamic change features. The dynamic change feature generation model is trained by taking the face features of face images historically acquired at the previous and the later moment of every two adjacent moments as training samples, taking the face dynamic change features corresponding to the training samples as target results, and inputting them into a long short-term memory network for training.
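A hedged sketch of such a dynamic change feature generation model follows, assuming PyTorch; the class name and all dimensions are illustrative, not taken from the application:

```python
import torch
import torch.nn as nn

class DynamicChangeModel(nn.Module):
    """LSTM over (previous-frame, next-frame) face-feature pairs."""
    def __init__(self, feat_dim=128, change_dim=64):
        super().__init__()
        self.lstm = nn.LSTM(2 * feat_dim, change_dim, batch_first=True)

    def forward(self, feats):                 # feats: (T, feat_dim), one row per moment
        pairs = torch.cat([feats[:-1], feats[1:]], dim=-1)  # (T-1, 2*feat_dim) adjacent pairs
        out, _ = self.lstm(pairs.unsqueeze(0))
        return out.squeeze(0)                 # (T-1, change_dim): one change feature per pair

deltas = DynamicChangeModel()(torch.randn(5, 128))  # 5 moments -> 4 dynamic change features
```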
S24: and outputting an age prediction result corresponding to the face image according to the face characteristics, the face dynamic change characteristics and the pre-trained age prediction model.
The age prediction model is trained by taking face features, face dynamic change features and real age information of a historical face image sample acquired at a plurality of historical moments as inputs of a network to be trained.
After the age prediction result corresponding to the face images is obtained, it may be returned to the terminal device 200, and the terminal device 200 may display it on the display interface of the application with the photographing function, so that the user can read the age prediction result there.
According to the above age prediction method, the age prediction result corresponding to the face images is output according to the face features, the face dynamic change features and a pre-trained age prediction model; both the face features of each face image acquired at a plurality of moments and the face dynamic change features across those images are taken into account. The accuracy of the predicted age information is therefore higher.
The training process of the age prediction model in S24 is described below, and as shown in fig. 4, the training process may include:
s41: and determining an age prediction result based on the face characteristics and the face dynamic change characteristics of the historical face image samples acquired by the network to be trained according to a plurality of historical moments.
The age prediction model may include a two-way long short-term memory network, a fully connected layer and a softmax function. The two-way long short-term memory network outputs a second target vector from the fusion feature of the face features and the face dynamic change features (a feature matrix obtained by concatenating the two); the second target vector indicates the relationship between the fusion feature and age. The fully connected layer maps the second target vector into a preset age interval. The softmax function determines the probability of each age in the preset age interval, and the age with the highest probability is selected as the output age prediction result.
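The following sketch, assuming PyTorch, illustrates this prediction head. The age interval 0 to 100, all layer sizes, and the choice of the last Bi-LSTM output as the second target vector are assumptions for illustration; the application does not fix them:

```python
import torch
import torch.nn as nn

class AgeHead(nn.Module):
    def __init__(self, fused_dim=192, hidden=64, max_age=100):
        super().__init__()
        self.bilstm = nn.LSTM(fused_dim, hidden, bidirectional=True, batch_first=True)
        self.fc = nn.Linear(2 * hidden, max_age + 1)   # map to the preset age interval

    def forward(self, fused):                  # fused: (T, fused_dim) fusion features
        out, _ = self.bilstm(fused.unsqueeze(0))
        target = out[:, -1]                     # "second target vector" (assumed: last step)
        probs = torch.softmax(self.fc(target), dim=-1)
        return probs.argmax(-1), probs          # age with the highest probability, plus probs

pred, probs = AgeHead()(torch.randn(4, 192))   # 4 fused time steps -> one predicted age
```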
S42: and determining a loss function according to the age prediction result and the real age information corresponding to the historical face image sample.
S43: Determining whether the loss function is smaller than a preset threshold value; if so, S44 is performed; otherwise, S45 is performed.
S44: and establishing an age prediction model based on the current network parameters of the network to be trained.
S45: updating the historical face image sample and updating the network parameters of the network to be trained based on the loss function, and returning to S41.
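Putting S41 to S45 together, a hedged sketch of the training loop follows, assuming PyTorch, a model shaped like the sketches above (returning the predicted age and the per-age probabilities), and a negative log-likelihood loss; the application fixes neither the loss form nor the threshold value:

```python
import torch
import torch.nn as nn

def train(model, sample_stream, threshold=0.05, lr=1e-3):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    nll = nn.NLLLoss()
    for frames, true_age in sample_stream:            # S45: each pass uses an updated sample
        _, probs = model(frames)                       # S41: predict from features + dynamics
        loss = nll(torch.log(probs + 1e-9), true_age)  # S42: loss vs. the real age label
        if loss.item() < threshold:                    # S43/S44: stop; keep current parameters
            break
        opt.zero_grad()
        loss.backward()                                # S45: update the network parameters
        opt.step()
    return model                                       # the established age prediction model
```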
In some alternative embodiments, S23 may include:
step 1: and extracting undirected graph characteristics of each face image.
It can be understood that the manner of extracting the undirected graph features of a face image in step 1 is the same as that described in step S22 above, and is not repeated here.
Step 2: take the undirected graph features respectively corresponding to the face images acquired at the plurality of moments as the input of a long short-term memory network, so as to obtain the face dynamic change features of the face image at the previous moment relative to the face image at the later moment.
Referring to fig. 7, an embodiment of the present application further provides an age prediction apparatus, which can be applied to the server 100. As shown in fig. 3, the server 100 may be communicatively connected to the terminal device 200 for data exchange. The specific implementation and beneficial effects of the age prediction apparatus are the same as those of the foregoing method embodiments; see the description above. The apparatus comprises an information acquisition unit 801, a feature extraction unit 802 and an age prediction unit 803, wherein,
an information acquisition unit 801 is configured to acquire face images of the same user acquired at a plurality of times.
The apparatus may further include: a preprocessing unit for preprocessing the face images acquired at the plurality of moments, where the preprocessing includes at least graying and normalization.
The feature extraction unit 802 is configured to extract a face feature of each face image.
The feature extraction unit 802 is further configured to extract, based on the face images acquired at the plurality of moments, the face dynamic change feature of the face image at the previous moment in any two adjacent moments relative to the face image at the later moment, where the previous moment is earlier than the later moment.
Specifically, the feature extraction unit 802 may include: the first feature extraction module is used for extracting undirected graph features of each face image, wherein the undirected graph features are used for representing the position relation of each pixel point in the face image relative to other pixel points.
The first feature extraction module is specifically used for processing each face image into an undirected graph matrix; inputting the undirected graph matrix into a cascaded multi-layer graph convolutional neural network to obtain undirected graph characteristics;
the processing process of each layer of graph convolution neural network on the undirected graph matrix is as follows: filtering the undirected graph matrix to obtain a vertex vector matrix formed by all vertex pixel points in the undirected graph matrix; nonlinear processing is carried out on the vertex vector matrix according to the activation function; coarsening the vertex vector matrix after nonlinear processing to obtain undirected graph characteristics.
And the second feature extraction module is used for taking the undirected graph features respectively corresponding to the face images acquired at the plurality of moments as the input of a long short-term memory network, so as to obtain the face dynamic change features of the face image at the previous moment relative to the face image at the later moment.
The age prediction unit 803 is configured to output an age prediction result corresponding to the face image according to the face feature, the face dynamic change feature and a pre-trained age prediction model, where the age prediction model is trained based on the face feature, the face dynamic change feature and the real age information of the historical face image sample acquired at a plurality of historical moments as input of the network to be trained.
Referring to fig. 8, the embodiment of the application further provides an age prediction model training apparatus, which includes an information determining unit 801, an information updating unit 802, and a model building unit 803.
The information determining unit 801 is configured to determine an age prediction result through the network to be trained based on the face features and the face dynamic change features of historical face image samples acquired at a plurality of historical moments.
The information determining unit 801 is further configured to determine a loss function according to the age prediction result and real age information corresponding to the historical face image sample.
The information determining unit 801 is further configured to determine whether the loss function is smaller than a preset threshold.
The information updating unit 802 is configured to, if the loss function is not smaller than the preset threshold, update the historical face image sample, update the network parameters of the network to be trained based on the loss function, and return to the step of determining the age prediction result through the network to be trained based on the face features and the face dynamic change features of the historical face image samples acquired at a plurality of historical moments.
And the model building unit 803 is configured to establish the age prediction model based on the current network parameters of the network to be trained if the loss function is smaller than the preset threshold.
The drawbacks of the prior art solutions described above were all identified by the inventors through practice and careful study; therefore, both the discovery of the above problems and the solutions proposed below by the embodiments of the present application should be regarded as contributions of the inventors to the present application.
Referring to fig. 9, fig. 9 is a schematic structural diagram of an electronic device for executing an age prediction model training method and an age prediction method according to an embodiment of the present application, where the electronic device may include: at least one processor 110, such as a CPU, at least one communication interface 120, at least one memory 130, and at least one communication bus 140. Wherein the communication bus 140 is used to enable direct connection communication of these components. The communication interface 120 of the device in the embodiment of the present application is used for performing signaling or data communication with other node devices. The memory 130 may be a high-speed RAM memory or a nonvolatile memory (non-volatile memory), such as at least one disk memory. Memory 130 may also optionally be at least one storage device located remotely from the aforementioned processor. The memory 130 has stored therein computer readable instructions which, when executed by the processor 110, perform the method process described above in fig. 1.
It will be appreciated that the configuration shown in fig. 9 is merely illustrative, and that the electronic device may also include more or fewer components than shown in fig. 9, or have a different configuration than shown in fig. 9. The components shown in fig. 9 may be implemented in hardware, software, or a combination thereof.
The apparatus may be a module, a program segment or code on an electronic device. It should be understood that the apparatus corresponds to the method embodiment of fig. 1 described above and can perform the steps involved in that embodiment; for its specific functions, refer to the foregoing description, which is not repeated here to avoid redundancy.
It should be noted that, for convenience and brevity of description, the specific working procedure of the system and apparatus described above can be found in the corresponding procedure in the foregoing method embodiment, and is not repeated here.
An embodiment of the application provides a readable storage medium having stored thereon a computer program which, when executed by a processor, performs a method procedure performed by an electronic device in the method embodiment shown in fig. 1.
The present embodiment discloses a computer program product comprising a computer program stored on a non-transitory computer readable storage medium, the computer program comprising program instructions which, when executed by a computer, can perform the methods provided by the above method embodiments, for example: acquiring face images acquired at a plurality of moments; extracting the face features of each face image; based on the face images acquired at the plurality of moments, extracting the face dynamic change feature of the face image at the previous moment in any two adjacent moments relative to the face image at the later moment, where the previous moment is earlier than the later moment; and outputting an age prediction result corresponding to the face images according to the face features, the face dynamic change features and a pre-trained age prediction model, wherein the age prediction model is trained by taking the face features, face dynamic change features and real age information of historical face image samples acquired at a plurality of historical moments as the input of a network to be trained.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other manners. The above-described apparatus embodiments are merely illustrative, for example, the division of the units is merely a logical function division, and there may be other manners of division in actual implementation, and for example, multiple units or components may be combined or integrated into another system, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be through some communication interface, device or unit indirect coupling or communication connection, which may be in electrical, mechanical or other form.
Further, the units described as separate units may or may not be physically separate, and units displayed as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
Furthermore, functional modules in various embodiments of the present application may be integrated together to form a single portion, or each module may exist alone, or two or more modules may be integrated to form a single portion.
In this document, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions.
The above description is only an example of the present application and is not intended to limit the scope of the present application, and various modifications and variations will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the protection scope of the present application.

Claims (9)

1. A method of age prediction, the method comprising:
acquiring face images of the same user acquired at a plurality of moments;
extracting face characteristics of each face image;
extracting dynamic change characteristics of the face according to the face characteristics of the face image acquired at the previous moment and the face characteristics of the face image acquired at the later moment in every two adjacent moments in the plurality of moments;
according to the face characteristics, the extracted face dynamic change characteristics and the pre-trained age prediction model which are respectively corresponding to the face images acquired at a plurality of moments, outputting an age prediction result corresponding to the face images, wherein the age prediction model is trained by taking face characteristics, face dynamic change characteristics and real age information of historical face image samples acquired at a plurality of historical moments as inputs of a network to be trained;
the extracting of the face dynamic change features according to the face features of the face image acquired at the previous moment and the face features of the face image acquired at the later moment in every two adjacent moments comprises: taking the face features of the face image acquired at the previous moment and the face features of the face image acquired at the later moment in every two adjacent moments as the input of a pre-trained dynamic change feature generation model, and determining the face dynamic change features, wherein the dynamic change feature generation model is trained by taking the face features of face images historically acquired at the previous and the later moment of every two adjacent moments as training samples, taking the face dynamic change features corresponding to the training samples as target results, and inputting them into a long short-term memory network for training.
2. The method of claim 1, wherein the extracting face features of each of the face images comprises:
converting each face image into an undirected graph matrix, wherein the undirected graph matrix is used for representing the position relation of each pixel point in the face image relative to other pixel points;
and extracting the face features according to the undirected graph matrix.
3. The method of claim 2, wherein the extracting face features from the undirected graph matrix comprises:
inputting the undirected graph matrix into a cascaded multi-layer graph convolutional neural network to obtain undirected graph characteristics;
the processing of the undirected graph matrix by each layer of the graph convolutional neural network is as follows:
filtering the input first target vector to obtain a vertex vector matrix formed by all vertex pixel points in the undirected graph matrix;
performing nonlinear processing on the vertex vector matrix according to an activation function;
coarsening the vertex vector matrix after nonlinear processing;
in the cascaded multi-layer graph convolutional neural network, a first target vector of the input of a first layer of graph convolutional neural network is the undirected graph matrix, the output of a last layer of graph convolutional neural network is undirected graph characteristics, and the output of a previous layer of graph convolutional neural network is the input of an adjacent next layer of graph convolutional neural network.
4. The method of claim 1, wherein the age prediction model comprises a two-way long-short term memory network, a fully connected layer and a softmax function, wherein the two-way long-short term memory network is configured to output a second target vector according to a fusion feature of the face features and the face dynamic change features, and the second target vector is configured to indicate the relationship between the fusion feature and age; the fully connected layer is configured to map the second target vector into a preset age interval; and the softmax function is configured to determine the probability of each age in the preset age interval for the second target vector, the age with the highest probability being selected as the output age prediction result.
5. A method of training an age prediction model, the method comprising:
determining an age prediction result through the network to be trained based on the face features and the face dynamic change features of historical face image samples acquired at a plurality of historical moments, wherein the face dynamic change features are determined by taking the face features of the face image acquired at the previous moment and the face features of the face image acquired at the later moment in every two adjacent moments as the input of a pre-trained dynamic change feature generation model, and the dynamic change feature generation model is trained by taking the face features of face images historically acquired at the previous and the later moment of every two adjacent moments as training samples, taking the face dynamic change features corresponding to the training samples as target results, and inputting them into a long short-term memory network for training;
determining a loss function according to the age prediction result and the real age information corresponding to the historical face image sample;
determining whether the loss function is smaller than a preset threshold value;
if the loss function is not smaller than the preset threshold, updating the historical face image sample, updating the network parameters of the network to be trained based on the loss function, and returning to the step of determining the age prediction result through the network to be trained based on the face features and the face dynamic change features of the historical face image samples acquired at a plurality of historical moments;
if the loss function is smaller than the preset threshold, establishing an age prediction model based on the current network parameters of the network to be trained.
6. An age prediction device, the device comprising:
the information acquisition unit is used for acquiring face images of the same user acquired at a plurality of moments;
the feature extraction unit is used for extracting the face features of each face image;
the feature extraction unit is further used for extracting the dynamic change feature of the face according to the face feature of the face image acquired at the previous moment and the face feature of the face image acquired at the later moment in every two adjacent moments in the plurality of moments;
the age prediction unit is used for outputting an age prediction result corresponding to the face images according to the face features, the face dynamic change features and a pre-trained age prediction model, wherein the age prediction model is trained by taking the face features, the face dynamic change features and real age information of historical face image samples acquired at a plurality of historical moments as the input of a network to be trained;
the feature extraction unit is further configured to determine the face dynamic change features by taking the face features of the face image acquired at the previous moment and the face features of the face image acquired at the later moment in every two adjacent moments as the input of a pre-trained dynamic change feature generation model, wherein the dynamic change feature generation model is trained by taking the face features of face images historically acquired at the previous and the later moment of every two adjacent moments as training samples, taking the face dynamic change features corresponding to the training samples as target results, and inputting them into a long short-term memory network for training.
7. An age prediction model training device, the device comprising:
the information determining unit is used for determining an age prediction result through the network to be trained based on the face features and the face dynamic change features of historical face image samples acquired at a plurality of historical moments; the face dynamic change features are determined by taking the face features of the face image acquired at the previous moment and the face features of the face image acquired at the later moment in every two adjacent moments as the input of a pre-trained dynamic change feature generation model; and the dynamic change feature generation model is trained by taking the face features of face images historically acquired at the previous and the later moment of every two adjacent moments as training samples, taking the face dynamic change features corresponding to the training samples as target results, and inputting them into a long short-term memory network for training;
the information determining unit is further used for determining a loss function according to the age prediction result and the real age information corresponding to the historical face image sample;
the information determining unit is further configured to determine whether the loss function is smaller than a preset threshold;
the information updating unit is used for, if the loss function is not smaller than the preset threshold, updating the historical face image sample, updating the network parameters of the network to be trained based on the loss function, and returning to the step of determining the age prediction result through the network to be trained based on the face features and the face dynamic change features of the historical face image samples acquired at a plurality of historical moments;
and the model building unit is used for establishing an age prediction model based on the current network parameters of the network to be trained if the loss function is smaller than the preset threshold.
8. An electronic device comprising a processor and a memory storing computer readable instructions that, when executed by the processor, perform the method of any of claims 1-5.
9. A readable storage medium having stored thereon a computer program which, when executed by a processor, performs the method of any of claims 1-5.
CN202110278664.4A 2021-03-15 2021-03-15 Age prediction method and device Active CN112766238B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110278664.4A CN112766238B (en) 2021-03-15 2021-03-15 Age prediction method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110278664.4A CN112766238B (en) 2021-03-15 2021-03-15 Age prediction method and device

Publications (2)

Publication Number Publication Date
CN112766238A (en)  2021-05-07
CN112766238B (en)  2023-09-26

Family

ID=75691291

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110278664.4A Active CN112766238B (en) 2021-03-15 2021-03-15 Age prediction method and device

Country Status (1)

Country Link
CN (1) CN112766238B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109190449A (en) * 2018-07-09 2019-01-11 北京达佳互联信息技术有限公司 Age recognition methods, device, electronic equipment and storage medium
CN109271958A (en) * 2018-09-30 2019-01-25 厦门市巨龙信息科技有限公司 The recognition methods of face age and device
CN110321863A (en) * 2019-07-09 2019-10-11 北京字节跳动网络技术有限公司 Age recognition methods and device, storage medium
CN111881737A (en) * 2020-06-18 2020-11-03 深圳数联天下智能科技有限公司 Training method and device of age prediction model, and age prediction method and device
CN111967382A (en) * 2020-08-14 2020-11-20 北京金山云网络技术有限公司 Age estimation method, and training method and device of age estimation model
WO2020232855A1 (en) * 2019-05-21 2020-11-26 平安科技(深圳)有限公司 Method and apparatus for adjusting screen display on the basis of subtle expression
WO2020238321A1 (en) * 2019-05-27 2020-12-03 北京字节跳动网络技术有限公司 Method and device for age identification
CN112418195A (en) * 2021-01-22 2021-02-26 电子科技大学中山学院 Face key point detection method and device, electronic equipment and storage medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TW201737159A (en) * 2016-04-01 2017-10-16 鴻海精密工業股份有限公司 Method for judgeing age based on face

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109190449A (en) * 2018-07-09 2019-01-11 北京达佳互联信息技术有限公司 Age recognition methods, device, electronic equipment and storage medium
CN109271958A (en) * 2018-09-30 2019-01-25 厦门市巨龙信息科技有限公司 The recognition methods of face age and device
WO2020232855A1 (en) * 2019-05-21 2020-11-26 平安科技(深圳)有限公司 Method and apparatus for adjusting screen display on the basis of subtle expression
WO2020238321A1 (en) * 2019-05-27 2020-12-03 北京字节跳动网络技术有限公司 Method and device for age identification
CN110321863A (en) * 2019-07-09 2019-10-11 北京字节跳动网络技术有限公司 Age recognition methods and device, storage medium
CN111881737A (en) * 2020-06-18 2020-11-03 深圳数联天下智能科技有限公司 Training method and device of age prediction model, and age prediction method and device
CN111967382A (en) * 2020-08-14 2020-11-20 北京金山云网络技术有限公司 Age estimation method, and training method and device of age estimation model
CN112418195A (en) * 2021-01-22 2021-02-26 电子科技大学中山学院 Face key point detection method and device, electronic equipment and storage medium

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
Combining Facial Dynamics With Appearance for Age Estimation; Albert Ali Salah et al.; IEEE Transactions on Image Processing; Vol. 24, No. 6; pp. 1928-1943 *
Deeply-learned Hybrid Representations for Facial Age Estimation; Zichang Tan et al.; Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence (IJCAI-19); pp. 3548-3554 *
Study of Age Estimation Using Fixed Rank Representation (FRR); Rohini G. Bhaisare et al.; International Journal of Computer Science Trends and Technology (IJCST); Vol. 7, No. 4; pp. 93-96 *
Face age prediction based on deep learning; Zhang Liangliang et al.; Computer Engineering; Vol. 47, No. 5; pp. 1-7 *
Research on video-based face age estimation algorithms; Ji Zhipeng; China Master's Theses Full-text Database: Information Science and Technology; No. 12, 2019; pp. I138-355 *

Also Published As

Publication number Publication date
CN112766238A (en) 2021-05-07

Similar Documents

Publication Publication Date Title
CN110378264B (en) Target tracking method and device
US10896522B2 (en) Method and apparatus for compressing image
CN108898579B (en) Image definition recognition method and device and storage medium
CN113196289B (en) Human body action recognition method, human body action recognition system and equipment
CN109325915B (en) Super-resolution reconstruction method for low-resolution monitoring video
CN111160202B (en) Identity verification method, device, equipment and storage medium based on AR equipment
CN111310705A (en) Image recognition method and device, computer equipment and storage medium
CN110059666B (en) Attention detection method and device
CN111079764A (en) Low-illumination license plate image recognition method and device based on deep learning
CN108241855B (en) Image generation method and device
CN113221771A (en) Living body face recognition method, living body face recognition device, living body face recognition equipment, storage medium and program product
CN110738103A (en) Living body detection method, living body detection device, computer equipment and storage medium
CN110942067A (en) Text recognition method and device, computer equipment and storage medium
CN112561879A (en) Ambiguity evaluation model training method, image ambiguity evaluation method and device
CN111353429A (en) Interest degree method and system based on eyeball turning
CN113393385B (en) Multi-scale fusion-based unsupervised rain removing method, system, device and medium
CN114120454A (en) Training method and device of living body detection model, electronic equipment and storage medium
CN111652242B (en) Image processing method, device, electronic equipment and storage medium
CN112766238B (en) Age prediction method and device
She et al. Facial image inpainting algorithm based on attention mechanism and dual discriminators
US20230063201A1 (en) Image processing device and super-resolution processing method
CN115115552B (en) Image correction model training method, image correction device and computer equipment
CN114612979B (en) Living body detection method and device, electronic equipment and storage medium
CN116129417A (en) Digital instrument reading detection method based on low-quality image
CN113705459A (en) Face snapshot method and device, electronic equipment and storage medium

Legal Events

Code  Description
PB01  Publication
SE01  Entry into force of request for substantive examination
GR01  Patent grant