WO2021017316A1 - Information recognition method, apparatus and computer device based on residual network - Google Patents

Information recognition method, apparatus and computer device based on residual network

Info

Publication number
WO2021017316A1
WO2021017316A1 PCT/CN2019/118803 CN2019118803W WO2021017316A1 WO 2021017316 A1 WO2021017316 A1 WO 2021017316A1 CN 2019118803 W CN2019118803 W CN 2019118803W WO 2021017316 A1 WO2021017316 A1 WO 2021017316A1
Authority
WO
WIPO (PCT)
Prior art keywords
pedestrian
image
data
recognition
preset
Prior art date
Application number
PCT/CN2019/118803
Other languages
English (en)
French (fr)
Inventor
张国辉
赵鹏
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2021017316A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands

Definitions

  • This application relates to the computer field, and in particular to an information recognition method, device, computer equipment and storage medium based on a residual network.
  • Pedestrian re-identification uses computer vision to determine whether a specific pedestrian appears in an image or video sequence. It can be applied to pedestrian images captured by surveillance, compensates for the inaccuracy of identifying pedestrians by eye, and is widely applicable in fields such as intelligent video surveillance. Whether a pedestrian re-identification model produces accurate results depends on the neural network it uses: an ordinary neural network trains poorly when it has too many layers and is not up to the task of accurate recognition. A residual network is a deep convolutional network that addresses the degradation of training as layers are added, and can therefore improve the recognition accuracy of a pedestrian re-identification model.
  • The main purpose of this application is to provide an information recognition method, device, computer equipment and storage medium based on a residual network, aiming to improve the recognition accuracy of pedestrian re-identification.
  • This application proposes an information recognition method based on a residual network, which includes the following steps:
  • The main data, the global sub-data, and the local sub-data are input into a fully connected layer preset in the pedestrian re-recognition model for calculation, so as to obtain the pedestrian re-identification result output by the fully connected layer.
  • This application provides an information recognition device based on a residual network, including:
  • an instruction acquisition unit, configured to acquire a pedestrian re-identification instruction, wherein the pedestrian re-identification instruction carries an image of a designated pedestrian to be identified;
  • a feature image acquisition unit, configured to input the image of the designated pedestrian into a preset, trained pedestrian re-recognition model based on the residual network for calculation, thereby obtaining the feature image output by the fourth residual block in the residual network, wherein the pedestrian re-recognition model is trained on sample data consisting of pedestrian images and the recognition results associated with them;
  • a data acquisition unit, configured to input the feature image into the fifth residual block in the residual network for calculation, thereby obtaining the main data output by the fifth residual block; to input the feature image, in parallel, into the global recognition sub-network preset in the pedestrian re-recognition model for calculation, thereby obtaining the global sub-data output by the global recognition sub-network; and to input the feature image, in parallel, into the local recognition sub-network preset in the pedestrian re-recognition model for calculation, thereby obtaining the local sub-data output by the local recognition sub-network;
  • a pedestrian re-identification result acquisition unit, configured to input the main data, the global sub-data, and the local sub-data into the fully connected layer preset in the pedestrian re-recognition model for calculation, thereby obtaining the pedestrian re-identification result output by the fully connected layer.
  • the present application provides a computer device including a memory and a processor, the memory stores a computer program, and the processor implements the steps of any one of the above methods when the computer program is executed.
  • the present application provides a non-volatile computer-readable storage medium on which a computer program is stored, and when the computer program is executed by a processor, the steps of any one of the above methods are implemented.
  • The residual network-based information recognition method, device, computer equipment, and storage medium of the present application: obtain a pedestrian re-identification instruction, wherein the instruction carries an image of a designated pedestrian to be identified; input the image of the designated pedestrian into a preset, trained pedestrian re-recognition model based on the residual network for calculation, so as to obtain the feature image output by the fourth residual block in the residual network; obtain the main data output by the fifth residual block; obtain the global sub-data output by the global recognition sub-network; obtain the local sub-data output by the local recognition sub-network; and input the main data, the global sub-data, and the local sub-data into the fully connected layer preset in the pedestrian re-recognition model for calculation, so as to obtain the pedestrian re-identification result output by the fully connected layer.
  • In addition to the main data, this application also takes into account the feature image output by the fourth residual block in the residual network (whose detailed data is retained by the global sub-data and the local sub-data), thereby minimizing the loss of image detail and improving the accuracy of pedestrian re-identification.
  • FIG. 1 is a schematic flowchart of an information identification method based on a residual network according to an embodiment of this application;
  • FIG. 2 is a schematic block diagram of the structure of an information recognition device based on a residual network according to an embodiment of the application;
  • FIG. 3 is a schematic block diagram of the structure of a computer device according to an embodiment of the application.
  • an embodiment of the present application provides an information recognition method based on a residual network, including the following steps:
  • an instruction for pedestrian re-identification is obtained, wherein the instruction for pedestrian re-identification carries an image of a designated pedestrian to be identified.
  • the image of the designated pedestrian can be obtained in any manner, for example, the image of the designated pedestrian pre-stored in the database, or the image of the designated pedestrian taken in real time, or a frame of the designated pedestrian image intercepted in the surveillance video.
  • the information identification in this application can also be referred to as pedestrian re-identification.
  • In step S2, the image of the designated pedestrian is input into a preset, trained pedestrian re-recognition model based on the residual network for calculation, thereby obtaining the feature image output by the fourth residual block in the residual network, wherein the pedestrian re-recognition model is trained on sample data consisting of human body images and the recognition results associated with them.
  • The pedestrian re-recognition model of this application is based on a residual network, for example resnet50, resnet101, or resnet152; the resnet50 model is preferred in this application.
  • the residual network includes first to fifth residual blocks, and each residual block includes at least one convolutional layer, and can output corresponding feature images.
  • This application also sets up a global recognition sub-network and a local recognition sub-network in the pedestrian re-recognition model. In parallel with the fifth residual block, they receive the feature image output by the fourth residual block.
  • The global recognition sub-network and the local recognition sub-network can selectively preserve the global and local features of the feature image output by the fourth residual block, which avoids losing useful data while also avoiding the addition of excessive interfering data.
  • The output layer of the pedestrian re-recognition model can be any layer. This application preferably uses a fully connected layer, which maps its input into a fixed-length feature vector; the recognition result is then obtained from that feature vector.
  • In step S3, the feature image is input into the fifth residual block in the residual network for calculation, thereby obtaining the main data output by the fifth residual block; in parallel, the feature image is input into the global recognition sub-network preset in the pedestrian re-recognition model for calculation, thereby obtaining the global sub-data output by the global recognition sub-network; and, also in parallel, the feature image is input into the local recognition sub-network preset in the pedestrian re-recognition model for calculation, thereby obtaining the local sub-data output by the local recognition sub-network.
  • the process in which the fifth residual block calculates the feature image is a process including convolution (it may also include processes such as pooling and activation).
  • The calculation performed on the feature image by the global recognition sub-network preset in the pedestrian re-recognition model is the extraction of the global features of the feature image (features of the entire image), for example its global color and its global contour.
  • The calculation performed on the feature image by the local recognition sub-network preset in the pedestrian re-recognition model is feature extraction on a local area of the feature image (for example, the head area selected from the entire image), such as extracting the local color and the local contour of the feature image.
  • the global recognition sub-network and the local recognition sub-network may adopt any neural network structure, for example, a structure based on a convolutional neural network.
  • In step S4, the main data, the global sub-data, and the local sub-data are input into the fully connected layer preset in the pedestrian re-recognition model for calculation, thereby obtaining the pedestrian re-identification result output by the fully connected layer.
  • A residual network in the traditional technology feeds only the main data output by the fifth residual block into the fully connected layer to obtain the final recognition result, and its recognition accuracy leaves room for improvement.
  • This application additionally inputs the global sub-data output by the global recognition sub-network and the local sub-data output by the local recognition sub-network, together with the main data output by the fifth residual block, into the fully connected layer of the pedestrian re-recognition model for calculation. The detailed data in the feature image output by the fourth residual block, which the traditional technology ignores, is thus also used, making the recognition result more accurate.
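  • The data flow of steps S3 and S4 can be sketched as follows. This is only an illustrative sketch: the function names, the toy stand-in layers, and the use of simple concatenation to combine the three outputs are assumptions, not details stated in the application.

```python
from typing import Callable, List, Sequence

Vector = List[float]
Layer = Callable[[Sequence[float]], Vector]


def forward(feature_image: Sequence[float],
            fifth_block: Layer,
            global_subnet: Layer,
            local_subnet: Layer,
            fully_connected: Callable[[Vector], Vector]) -> Vector:
    """Feed the fourth-block feature image to three branches, then fuse them."""
    main_data = fifth_block(feature_image)      # S3: main branch
    global_sub = global_subnet(feature_image)   # S3: parallel global branch
    local_sub = local_subnet(feature_image)     # S3: parallel local branch
    # Assumption: the three outputs are concatenated before the FC layer.
    fused = main_data + global_sub + local_sub
    return fully_connected(fused)               # S4


# Toy stand-ins (hypothetical), used only to show the wiring.
def toy_fifth_block(x: Sequence[float]) -> Vector:
    return [sum(x)]


def toy_global(x: Sequence[float]) -> Vector:
    return [max(x), min(x)]


def toy_local(x: Sequence[float]) -> Vector:
    return [x[0]]


def toy_fc(v: Vector) -> Vector:
    return [sum(v), -sum(v)]


scores = forward([1.0, 2.0, 3.0], toy_fifth_block, toy_global, toy_local, toy_fc)
```

  • All three branches read the same feature image, so none of them depends on another's output; only the fully connected layer sees the combined data.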
  • In one embodiment, the image of the designated pedestrian includes a facial area, and before step S2 of inputting the image of the designated pedestrian into a preset, trained pedestrian re-recognition model based on the residual network for calculation to obtain the feature image output by the fourth residual block in the residual network, wherein the pedestrian re-recognition model is trained on sample data consisting of pedestrian images and the recognition results associated with them, the method includes:
  • S111 Divide the image of the designated pedestrian into multiple regions, compare the image data of each region with preset eye image data to obtain the difference between the image data of each region and the eye image data, and record the regions whose difference does not exceed the preset value as the eye area;
  • the image of the designated pedestrian is pre-identified.
  • The eye image data is standard image data that can be used to identify eye features (for example, data of human eye image regions collected in advance), and the mouth image data is standard image data that can be used to identify mouth features (for example, data of human mouth image regions collected in advance). The image data is, for example, pixel values (such as the three primary color components).
  • The image data can be compared using any conventional method, such as pixel comparison, which is not described here.
  • If the eye area is larger than a single divided region, multiple contiguous regions whose difference does not exceed the preset value are together regarded as the eye area; similarly, multiple contiguous regions whose difference does not exceed the preset value are together regarded as the mouth area. Since the facial features of a human face are distributed according to certain geometric proportions, the approximate facial contour can be obtained once the eye area and the mouth area are determined.
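  • The region comparison can be illustrated with a minimal sketch. It assumes that each region and the preset template are flat lists of grayscale pixel values of equal size, and that the "difference" is the mean absolute pixel difference; the application leaves the comparison method open, so these choices and all names below are hypothetical.

```python
from typing import List, Sequence


def mean_abs_difference(region: Sequence[int], template: Sequence[int]) -> float:
    """Average per-pixel absolute difference between two equal-sized regions."""
    if len(region) != len(template) or not region:
        raise ValueError("region and template must be non-empty and equal in size")
    return sum(abs(a - b) for a, b in zip(region, template)) / len(region)


def mark_matching_regions(regions: Sequence[Sequence[int]],
                          template: Sequence[int],
                          preset_value: float) -> List[int]:
    """Return indices of regions whose difference does not exceed preset_value."""
    return [index for index, region in enumerate(regions)
            if mean_abs_difference(region, template) <= preset_value]


# Three toy 2x2 regions (flattened) and a toy eye template.
regions = [[10, 10, 10, 10], [200, 40, 40, 200], [198, 42, 41, 205]]
eye_template = [200, 40, 40, 200]
eye_regions = mark_matching_regions(regions, eye_template, preset_value=5.0)
```

  • Here regions 1 and 2 are adjacent and both fall under the preset value, so together they would form the eye area; the same function applied with a mouth template would mark the mouth area.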
  • A standard facial image is retrieved and scaled proportionally (reduced or enlarged) so that its eye area coincides with the eye area in the image of the designated pedestrian and its mouth area coincides with the mouth area in the image of the designated pedestrian. The region of the designated pedestrian's image that overlaps the scaled standard facial image is recorded as the facial region, and the image within the facial region is taken as the facial image.
  • A preset image similarity calculation method is then used to calculate the similarity value between the facial image and a pre-stored target facial image, and to determine whether the similarity value is greater than a preset similarity threshold. If the similarity value is not greater than the threshold, the facial image differs from the target facial image used for comparison, so further recognition is required; a pedestrian re-recognition model calculation instruction is generated accordingly, which instructs that the image of the designated pedestrian be input into the preset, trained pedestrian re-recognition model based on the residual network for calculation.
  • The preset image similarity calculation method, for example, compares pixels one by one to count the identical pixels, and takes the number of identical pixels divided by the total number of pixels compared as the similarity value. Pedestrians with obvious characteristics (for example, a particularly large face or other distinctive facial features) can therefore be identified directly by the preset image similarity calculation method without calling the pedestrian re-recognition model, which improves recognition efficiency.
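  • The pixel-by-pixel similarity check and the resulting decision can be sketched as below. The sketch assumes both images have already been scaled to the same size and flattened into lists of pixel values; the function names and the sample threshold are illustrative only.

```python
from typing import Sequence


def pixel_similarity(image_a: Sequence[int], image_b: Sequence[int]) -> float:
    """Fraction of positions at which the two images have identical pixels."""
    if len(image_a) != len(image_b) or not image_a:
        raise ValueError("images must be non-empty and equal in size")
    identical = sum(1 for a, b in zip(image_a, image_b) if a == b)
    return identical / len(image_a)


def needs_reid_model(face: Sequence[int], target: Sequence[int],
                     threshold: float) -> bool:
    """True when similarity is not greater than the threshold, i.e. the
    pedestrian re-recognition model must be called for further recognition."""
    return pixel_similarity(face, target) <= threshold


face_image = [1, 2, 3, 4]
target_image = [1, 2, 0, 4]
similarity = pixel_similarity(face_image, target_image)
call_model = needs_reid_model(face_image, target_image, threshold=0.9)
```

  • With three of four pixels identical, the similarity is 0.75, which does not exceed the assumed threshold of 0.9, so the model calculation instruction would be generated.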
  • In one embodiment, before step S2 of inputting the image of the designated pedestrian into a preset, trained pedestrian re-recognition model based on the residual network for calculation to obtain the feature image output by the fourth residual block in the residual network, wherein the pedestrian re-recognition model is trained on sample data consisting of pedestrian images and the recognition results associated with them, the method includes:
  • S121 Obtain a specified amount of sample data, and divide the sample data into a training set and a test set; wherein the sample data includes a pedestrian image and a recognition result associated with the pedestrian image;
  • In this embodiment, the pedestrian re-recognition model, which is based on a residual network, is trained.
  • The residual network can be resnet50, resnet101, or resnet152; the resnet50 model is preferred in this application.
  • Stochastic gradient descent randomly samples part of the training data in place of the entire training set at each iteration. If the sample size is large (for example, hundreds of thousands), iterating over only tens of thousands or even thousands of samples may be enough to reach the optimal solution, which speeds up training. The training process can also use backpropagation (BP) to update the parameters of each layer. Backpropagation is based on gradient descent.
  • The input-output relationship of a BP network is essentially a mapping: a BP network with n inputs and m outputs implements a continuous mapping from n-dimensional Euclidean space to a bounded region of m-dimensional Euclidean space. This mapping is highly non-linear, which facilitates updating the parameters of each layer of the network model.
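  • The idea of replacing the full training set with a random sample at each iteration can be shown on a deliberately tiny problem. The sketch below fits a single slope w in y = w * x with mini-batch stochastic gradient descent on a squared-error loss; the one-parameter model, the learning rate, and the batch size are assumptions chosen for illustration and stand in for the per-layer parameters of the actual network.

```python
import random
from typing import Sequence


def sgd_fit_slope(xs: Sequence[float], ys: Sequence[float],
                  lr: float = 0.005, batch_size: int = 8,
                  steps: int = 200, seed: int = 0) -> float:
    """Fit y = w * x by mini-batch SGD on the mean squared error."""
    rng = random.Random(seed)
    indices = list(range(len(xs)))
    w = 0.0
    for _ in range(steps):
        # Use a small random batch instead of the whole training set.
        batch = rng.sample(indices, batch_size)
        # Gradient of mean((w*x - y)^2) with respect to w over the batch.
        grad = sum(2.0 * (w * xs[i] - ys[i]) * xs[i] for i in batch) / batch_size
        w -= lr * grad
    return w


xs = [i / 10 for i in range(1, 101)]   # 100 training inputs
ys = [3.0 * x for x in xs]             # targets generated with slope 3
w = sgd_fit_slope(xs, ys)
```

  • Each step touches only 8 of the 100 samples, yet the estimate still converges to the true slope, which is the speed-up the passage describes.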
  • The result training model is thereby obtained. The sample data of the test set is then used to verify the result training model; if the verification passes, the result training model is recorded as the pedestrian re-recognition model based on the residual network.
  • The initial pedestrian re-recognition model based on the residual network includes not only the residual network but also a global recognition sub-network and a local recognition sub-network placed after the fourth residual block, in parallel with the fifth residual block. The global recognition sub-network extracts the global features (features of the entire image) of the feature image output by the fourth residual block, and the local recognition sub-network extracts the features of a local area of that feature image (for example, the head area selected from the entire image).
  • A trained pedestrian re-recognition model is thereby obtained.
  • Because the pedestrian re-recognition model has gone through training and verification, it can be relied on for the pedestrian re-identification task: its parameters have been optimized, which improves its recognition accuracy in actual pedestrian re-identification.
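  • The split-then-verify procedure can be sketched as follows. The 80/20 split ratio and the accuracy-based pass criterion are assumptions (the application does not state how verification is judged), and the model is represented by any callable that maps an image to a recognition result.

```python
import random
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[int, int]  # (image placeholder, associated recognition result)


def split_samples(samples: Sequence[Sample], train_fraction: float = 0.8,
                  seed: int = 0) -> Tuple[List[Sample], List[Sample]]:
    """Shuffle the sample data and divide it into a training and a test set."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def passes_verification(model: Callable[[int], int], test_set: Sequence[Sample],
                        min_accuracy: float = 0.9) -> bool:
    """Assumed criterion: accuracy on the test set reaches min_accuracy."""
    if not test_set:
        raise ValueError("test set must not be empty")
    correct = sum(1 for image, label in test_set if model(image) == label)
    return correct / len(test_set) >= min_accuracy


# Toy data: the "image" is an integer and its label is the integer's parity.
samples = [(i, i % 2) for i in range(10)]
train_set, test_set = split_samples(samples)
accepted = passes_verification(lambda image: image % 2, test_set)
```

  • Only a model that passes this check would be recorded as the pedestrian re-recognition model; a failing model would go back to training.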
  • In one embodiment, before step S2 of inputting the image of the designated pedestrian into a preset, trained pedestrian re-recognition model based on the residual network for calculation to obtain the feature image output by the fourth residual block in the residual network, wherein the pedestrian re-recognition model is trained on sample data consisting of pedestrian images and the recognition results associated with them, the method includes:
  • Transfer learning is used to quickly obtain a pedestrian re-recognition model based on the residual network. If a trained residual network model is already available, its per-layer weight parameters can be used directly as the initial weight parameters of each layer of the residual network in the initial pedestrian re-recognition model, so the training step can be omitted.
  • The initial pedestrian re-recognition model is then verified with the sample data of the test set, where the sample data includes pedestrian images and the recognition results associated with them. If the verification passes, the initial pedestrian re-recognition model is recorded as the pedestrian re-recognition model based on the residual network, which ensures that the final model is correct and usable.
  • Because the weight parameters of each layer are obtained by transfer learning and only verification is performed on that basis, the large amount of time otherwise needed for training is saved, which shortens the time needed to obtain the pedestrian re-recognition model.
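  • The weight initialization step can be sketched with plain dictionaries standing in for models. The layout of the model dictionary and the layer names are hypothetical; the point is only that weights of the trained residual network are copied into the matching residual layers of the initial model, while the added sub-networks are left untouched.

```python
import copy
from typing import Dict, List

Weights = Dict[str, List[float]]


def init_from_pretrained(initial_model: dict, pretrained: Weights) -> dict:
    """Return a copy of initial_model whose residual layers take their
    initial weights from an already trained residual network."""
    model = copy.deepcopy(initial_model)
    for layer_name, weights in pretrained.items():
        # Only layers that exist in the residual part are initialized.
        if layer_name in model["residual"]:
            model["residual"][layer_name] = list(weights)
    return model


initial_model = {
    "residual": {"block1": [0.0, 0.0], "block5": [0.0]},
    "global_subnet": [0.0],
    "local_subnet": [0.0],
}
pretrained_weights = {"block1": [0.5, -0.2], "block5": [1.5], "fc": [9.0]}
model = init_from_pretrained(initial_model, pretrained_weights)
```

  • The resulting model would then go straight to verification on the test set rather than through a full training run.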
  • the step S3 of inputting the characteristic image into a preset global recognition sub-network in the pedestrian re-recognition model to obtain the global sub-data output by the global recognition sub-network includes:
  • the global sub-data output by the global recognition sub-network is obtained.
  • This application extracts global sub-data from the feature image output by the fourth residual block, where the values of the global sub-data are not within the preset value range, so that data with large differences is preserved and interference from useless data is avoided.
  • The designated data is data that can reflect the characteristics of a pedestrian, for example human body contour, human skin color, or clothing color. Since body contours differ between people, and skin color and clothing color are also likely to differ, these are extracted as the designated data. If the value of the designated data is not within the preset value range, the designated data is usable.
  • The global recognition sub-network collects multiple items of designated data and outputs, as global sub-data, those whose values are not within the preset value range.
  • the number of designated data can be set to 2-10, preferably 6-8.
  • the global recognition sub-network may include a neural network with any number of layers, for example, a neural network with 6-8 layers. In this way, the detailed features in the feature image are retained in the form of global sub-data, which is beneficial to the subsequent re-identification of pedestrians, thereby improving the accuracy of recognition.
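  • The selection rule for global sub-data can be sketched as a simple filter. The sketch assumes each item of designated data has already been reduced to a single number and that one preset range applies to all items; both are simplifications, and the names and values are illustrative.

```python
from typing import Dict, Tuple


def select_sub_data(designated: Dict[str, float],
                    preset_range: Tuple[float, float]) -> Dict[str, float]:
    """Keep only designated data whose value lies outside the preset range."""
    low, high = preset_range
    return {name: value for name, value in designated.items()
            if not (low <= value <= high)}


designated_data = {"body_contour": 0.92, "skin_color": 0.48, "clothing_color": 0.07}
global_sub_data = select_sub_data(designated_data, preset_range=(0.3, 0.7))
```

  • Values inside the preset range are treated as unremarkable and dropped, so only the strongly distinguishing items reach the fully connected layer.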
  • Step S3 of inputting the feature image into the local recognition sub-network preset in the pedestrian re-recognition model to obtain the local sub-data output by the local recognition sub-network includes:
  • S312 Extract designated data from each of the blocks, and determine whether the value of the designated data is within a preset value range, wherein the designated data includes at least a local contour, a local skin color, or a local clothing color;
  • The local sub-data output by the local recognition sub-network is thereby obtained.
  • As the network processes the input layer by layer, detailed features of the input image are progressively lost, local image data in particular.
  • This application therefore uses the local recognition sub-network to divide the feature image into multiple blocks with a preset block division method and to extract designated data from each block. If the value of the designated data is not within the preset value range, the designated data is taken as local sub-data and output. In this way, valuable local sub-data is preserved and serves as one of the bases for subsequent recognition.
  • The local recognition sub-network collects multiple items of designated data and outputs, as local sub-data, those whose values are not within the preset value range.
  • the number of designated data can be set to 2-10, preferably 6-8.
  • the local recognition sub-network may include a neural network with any number of layers, for example, a neural network with 8-10 layers.
  • The block division method is, for example, to identify a characteristic shape in the feature image and take the area centered on that shape as a single block (for example, if the contour of a head is recognized, the area around the head contour is taken as the head block). The detailed features in the feature image are thus retained in the form of local sub-data, which benefits the subsequent pedestrian re-identification and improves recognition accuracy.
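  • The per-block extraction can be sketched as follows. For simplicity the sketch divides the feature image into horizontal strips of fixed height rather than detecting characteristic shapes as the application describes, and it uses the mean value of a block as its designated data; both choices, like the names, are assumptions.

```python
from typing import List, Sequence, Tuple

FeatureMap = Sequence[Sequence[float]]


def split_into_blocks(feature_map: FeatureMap, block_height: int) -> List[List[float]]:
    """Divide the feature map into horizontal strips and flatten each strip."""
    blocks = []
    for start in range(0, len(feature_map), block_height):
        rows = feature_map[start:start + block_height]
        blocks.append([value for row in rows for value in row])
    return blocks


def local_sub_data(feature_map: FeatureMap, block_height: int,
                   preset_range: Tuple[float, float]) -> List[Tuple[int, float]]:
    """Return (block index, designated value) for blocks whose designated
    value (here: the block mean) lies outside the preset range."""
    low, high = preset_range
    selected = []
    for index, block in enumerate(split_into_blocks(feature_map, block_height)):
        value = sum(block) / len(block)
        if not (low <= value <= high):
            selected.append((index, value))
    return selected


feature_map = [[0.9, 0.8], [0.7, 0.9], [0.5, 0.5], [0.4, 0.6]]
selected_blocks = local_sub_data(feature_map, block_height=2, preset_range=(0.3, 0.7))
```

  • In this toy input only the upper block stands out from the preset range, so only its value is kept as local sub-data.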
  • Step S4 of inputting the main data, the global sub-data, and the local sub-data into the fully connected layer preset in the pedestrian re-recognition model for calculation to obtain the pedestrian re-identification result output by the fully connected layer includes:
  • the models based on the residual network in the traditional technology all input the data of the fifth residual block into the fully connected layer, and then the fully connected layer maps the data into feature vectors.
  • This application considers together the main data output by the fifth residual block, the global sub-data output by the global recognition sub-network, and the local sub-data output by the local recognition sub-network, and uses the fully connected layer to map them into a fixed-length feature vector, thereby improving recognition accuracy.
  • the preset mapping method is similar to the mapping method of the fully connected layer in the traditional technology, and will not be repeated here.
  • Each component of the feature vector output by the fully connected layer represents a corresponding recognition result. The recognition result corresponding to the component with the largest value is the most probable one, so it is taken as the final output recognition result.
  • This application uses not only the main data but also the global sub-data and the local sub-data that the traditional technology ignores. The feature vector obtained by the mapping is therefore more accurate, and the accuracy of the final recognition result is improved.
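  • The mapping and the choice of the final result can be sketched with a plain fully connected layer followed by an argmax. The weights, biases, and labels below are made-up values for illustration; in practice they would come from the trained model.

```python
from typing import List, Sequence


def fully_connected(inputs: Sequence[float],
                    weights: Sequence[Sequence[float]],
                    biases: Sequence[float]) -> List[float]:
    """Map the combined input to a fixed-length vector: one score per result."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def pick_result(scores: Sequence[float], labels: Sequence[str]) -> str:
    """Return the recognition result whose component has the largest value."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return labels[best]


fused_input = [1.0, 2.0, 0.5]              # main, global and local data combined
weights = [[0.2, 0.1, 0.0], [0.5, 0.4, 1.0]]
biases = [0.0, -0.5]
labels = ["pedestrian_A", "pedestrian_B"]

scores = fully_connected(fused_input, weights, biases)
result = pick_result(scores, labels)
```

  • The output length equals the number of candidate results regardless of how many input values the three branches supply, which is what makes the feature vector fixed-length.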
  • The information recognition method based on the residual network of the present application obtains a pedestrian re-identification instruction, wherein the instruction carries an image of a designated pedestrian to be identified; inputs the image of the designated pedestrian into a preset, trained pedestrian re-recognition model based on the residual network for calculation, so as to obtain the feature image output by the fourth residual block in the residual network; obtains the main data output by the fifth residual block; obtains the global sub-data output by the global recognition sub-network; obtains the local sub-data output by the local recognition sub-network; and inputs the main data, the global sub-data, and the local sub-data into the fully connected layer preset in the pedestrian re-recognition model for calculation, so as to obtain the pedestrian re-identification result output by the fully connected layer. The accuracy of pedestrian re-identification is thereby improved.
  • an embodiment of the present application provides an information recognition device based on a residual network, including:
  • the instruction acquisition unit 10 is configured to acquire a pedestrian re-identification instruction, where the pedestrian re-identification instruction carries an image of a designated pedestrian to be identified;
  • the feature image acquisition unit 20 is configured to input the image of the designated pedestrian into a preset, trained pedestrian re-recognition model based on the residual network for calculation, thereby obtaining the feature image output by the fourth residual block in the residual network;
  • the data acquisition unit 30 is configured to input the feature image into the fifth residual block in the residual network for calculation, thereby obtaining the main data output by the fifth residual block; to input the feature image, in parallel, into the global recognition sub-network preset in the pedestrian re-recognition model for calculation, thereby obtaining the global sub-data output by the global recognition sub-network; and to input the feature image, in parallel, into the local recognition sub-network preset in the pedestrian re-recognition model for calculation, thereby obtaining the local sub-data output by the local recognition sub-network;
  • the pedestrian re-identification result acquisition unit 40 is configured to input the main data, the global sub-data, and the local sub-data into the fully connected layer preset in the pedestrian re-recognition model for calculation, thereby obtaining the pedestrian re-identification result output by the fully connected layer.
  • In one embodiment, the image of the designated pedestrian includes a facial area, and the device includes:
  • the eye area marking unit is used to divide the image of the designated pedestrian into multiple regions, compare the image data of each region with preset eye image data to obtain the difference between the image data of each region and the eye image data, and record the regions whose difference does not exceed the preset value as the eye area;
  • the mouth area marking unit is used to compare the image data of each region with preset mouth image data to obtain the difference between the image data of each region and the mouth image data, and record the regions whose difference does not exceed the preset value as the mouth area;
  • the facial image acquisition unit is used to retrieve a standard facial image and scale it proportionally (reduce or enlarge it) so that its eye area coincides with the eye area in the image of the designated pedestrian and, at the same time, its mouth area coincides with the mouth area in the image of the designated pedestrian, then record the region of the designated pedestrian's image that overlaps the scaled standard facial image as the facial region, and take the image within the facial region as the facial image;
  • the similarity value calculation unit is configured to use a preset image similarity calculation method to calculate the similarity value between the facial image and a pre-stored target facial image, and determine whether the similarity value is greater than a preset similarity threshold;
  • the calculation instruction generating unit is configured to generate a pedestrian re-recognition model calculation instruction if the similarity value is not greater than the preset similarity threshold, wherein the pedestrian re-recognition model calculation instruction instructs that the image of the designated pedestrian be input into the preset, trained pedestrian re-recognition model based on the residual network for calculation.
  • the device includes:
  • the sample data acquisition unit is used to acquire a specified amount of sample data, and divide the sample data into a training set and a test set; wherein the sample data includes a pedestrian image and a recognition result associated with the pedestrian image;
  • the training unit is used to input the sample data of the training set into the initial pedestrian re-recognition model based on the residual network for training; wherein, the stochastic gradient descent method is used in the training process to obtain the result training model;
  • a verification unit for verifying the result training model by using the sample data of the test set
  • the model marking unit is configured to record the result training model as the pedestrian re-recognition model based on the residual network if the verification is passed.
  • the device includes:
  • the weight parameter acquisition unit is used to acquire the weight parameters of each layer of a residual network model that has already been trained;
  • the initialization unit is used to set the acquired weight parameters of each layer as the initial weight parameters of the corresponding layers of the residual network in the initial pedestrian re-identification model;
  • the model verification unit is configured to verify the initial pedestrian re-identification model using sample data of the test set, where the sample data includes a pedestrian image and a recognition result associated with the pedestrian image;
  • the pedestrian re-identification model marking unit is configured to record the initial pedestrian re-identification model as the pedestrian re-identification model based on the residual network if the verification is passed.
  • the data acquisition unit 30 includes:
  • the designated data extraction subunit is configured to extract designated data from the feature image through the global recognition sub-network and determine whether the value of the designated data is within a preset value range, wherein the designated data includes at least human body contour, human skin color or clothing color;
  • the global sub-data output sub-unit is configured to, if the value of the designated data is not within the preset value range, use the designated data as the global sub-data and output the global sub-data.
  • the data acquisition unit 30 includes:
  • a block division subunit configured to divide the characteristic image into a plurality of blocks by using a preset block division method through the local recognition sub-network;
  • the data extraction subunit is used to extract designated data from each of the blocks and determine whether the value of the designated data is within a preset value range, wherein the designated data includes at least local contour, local skin color, or local clothing color;
  • the local sub-data output subunit is configured to, if the value of the designated data is not within the preset value range, use the designated data as the local sub-data and output the local sub-data.
  • the pedestrian re-identification result obtaining unit 40 includes:
  • a mapping subunit configured to use a preset mapping method to map the main data, the global sub-data, and the local sub-data into a fixed-length feature vector through the fully connected layer;
  • the recognition result output subunit is configured to output the recognition result corresponding to the component vector with the largest value in the feature vector according to the preset correspondence between the component vector and the recognition result.
  • the information recognition device based on the residual network of the present application acquires a pedestrian re-identification instruction, wherein the instruction carries an image of a designated pedestrian to be identified; inputs the image of the designated pedestrian into a preset, trained pedestrian re-identification model based on the residual network for calculation, so as to obtain the feature image output by the fourth residual block in the residual network; obtains the main data output by the fifth residual block; obtains the global sub-data output by the global recognition sub-network; obtains the local sub-data output by the local recognition sub-network; and inputs the main data, the global sub-data and the local sub-data into the fully connected layer preset in the pedestrian re-identification model for calculation, so as to obtain the pedestrian re-identification result output by the fully connected layer. This improves the accuracy of pedestrian re-identification.
  • an embodiment of the present application also provides a computer device.
  • the computer device may be a server, and its internal structure may be as shown in the figure.
  • the computer device includes a processor, a memory, a network interface and a database connected through a system bus. The processor of the computer device provides calculation and control capabilities.
  • the memory of the computer device includes a non-volatile storage medium and an internal memory.
  • the non-volatile storage medium stores an operating system, a computer program, and a database.
  • the memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage medium.
  • the database of the computer equipment is used to store the data used in the information recognition method based on the residual network.
  • the network interface of the computer device is used to communicate with an external terminal through a network connection.
  • the computer program is executed by the processor to realize a method of information identification based on the residual network.
  • the above-mentioned processor executes the above-mentioned residual network-based information recognition method, wherein the steps of the method correspond one-to-one to the steps of the residual network-based information recognition method of the foregoing embodiments, and are not repeated here.
  • the computer device of the present application acquires a pedestrian re-identification instruction, wherein the instruction carries an image of a designated pedestrian to be identified; inputs the image of the designated pedestrian into a preset, trained pedestrian re-identification model based on the residual network for calculation, so as to obtain the feature image output by the fourth residual block in the residual network; obtains the main data output by the fifth residual block; obtains the global sub-data output by the global recognition sub-network; obtains the local sub-data output by the local recognition sub-network; and inputs the main data, the global sub-data and the local sub-data into the fully connected layer preset in the pedestrian re-identification model for calculation, so as to obtain the pedestrian re-identification result output by the fully connected layer. This improves the accuracy of pedestrian re-identification.
  • An embodiment of the present application also provides a computer-readable storage medium on which a computer program is stored.
  • when the computer program is executed by a processor, a residual network-based information recognition method is implemented, wherein the steps of the method correspond one-to-one to the steps of the residual network-based information recognition method of the foregoing embodiments, and are not repeated here.
  • the computer-readable storage medium of the present application acquires a pedestrian re-identification instruction, wherein the instruction carries an image of a designated pedestrian to be identified; inputs the image of the designated pedestrian into a preset, trained pedestrian re-identification model based on the residual network for calculation, so as to obtain the feature image output by the fourth residual block in the residual network; obtains the main data output by the fifth residual block; obtains the global sub-data output by the global recognition sub-network; obtains the local sub-data output by the local recognition sub-network; and inputs the main data, the global sub-data and the local sub-data into the fully connected layer preset in the pedestrian re-identification model for calculation, so as to obtain the pedestrian re-identification result output by the fully connected layer. This improves the accuracy of pedestrian re-identification.
  • the computer-readable storage medium is, for example, a non-volatile computer-readable storage medium or a volatile computer-readable storage medium.
  • Non-volatile memory may include read only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
  • Volatile memory may include random access memory (RAM) or external cache memory.
  • RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

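An information recognition method, apparatus, computer device, and storage medium based on a residual network. The method includes: acquiring a pedestrian re-identification instruction, where the instruction carries an image of a designated pedestrian to be identified; inputting the image of the designated pedestrian into a preset, trained pedestrian re-identification model based on a residual network for calculation, so as to obtain the feature image output by the fourth residual block of the residual network; obtaining the main data output by the fifth residual block; obtaining the global sub-data output by the global recognition sub-network; obtaining the local sub-data output by the local recognition sub-network; and inputting the main data, the global sub-data, and the local sub-data into a fully connected layer preset in the pedestrian re-identification model for calculation, so as to obtain the pedestrian re-identification result output by the fully connected layer. The method improves the accuracy of pedestrian re-identification.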

Description

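Information recognition method, apparatus, and computer device based on a residual network
This application claims priority to Chinese patent application No. 201910696302.X, filed with the Chinese Patent Office on July 30, 2019 and entitled "Pedestrian re-identification method, apparatus, and computer device based on a residual network", the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the computer field, and in particular to an information recognition method, apparatus, computer device, and storage medium based on a residual network.
Background
Pedestrian re-identification is a technique that uses computer vision to determine whether a specific pedestrian is present in an image or video sequence. It can be used to recognize pedestrian images obtained from surveillance, compensates for the inaccuracy of identifying pedestrian images with the naked eye, and is widely applicable in fields such as intelligent video surveillance. Whether a pedestrian re-identification model produces accurate results depends on the neural network model it uses; an ordinary neural network trains poorly when it has too many layers and so cannot deliver accurate recognition. A residual network is a deep convolutional network that solves the degradation of training caused by adding layers, and can therefore improve the recognition accuracy of a pedestrian re-identification model. However, conventional techniques that use a residual network rely only on the data output by its last layer and ignore the low-level features captured in the feature maps output by other layers of the backbone. As the input passes through layer after layer, the fine details of the input image are progressively lost and the extracted features become more abstract, which leads to errors in feature matching. The recognition accuracy of conventional pedestrian re-identification models therefore leaves room for improvement.
Technical Problem
The main purpose of this application is to provide an information recognition method, apparatus, computer device, and storage medium based on a residual network, with the aim of improving the accuracy of pedestrian re-identification.
Technical Solution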
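To achieve the above purpose, this application proposes an information recognition method based on a residual network, including the following steps:
acquiring a pedestrian re-identification instruction, where the pedestrian re-identification instruction carries an image of a designated pedestrian to be identified;
inputting the image of the designated pedestrian into a preset, trained pedestrian re-identification model based on a residual network for calculation, so as to obtain the feature image output by the fourth residual block of the residual network, where the pedestrian re-identification model is trained on sample data consisting of pedestrian images and the recognition results associated with those pedestrian images;
inputting the feature image into the fifth residual block of the residual network for calculation, so as to obtain the main data output by the fifth residual block; in parallel, inputting the feature image into a global recognition sub-network preset in the pedestrian re-identification model for calculation, so as to obtain the global sub-data output by the global recognition sub-network; and, in parallel, inputting the feature image into a local recognition sub-network preset in the pedestrian re-identification model for calculation, so as to obtain the local sub-data output by the local recognition sub-network;
inputting the main data, the global sub-data, and the local sub-data into a fully connected layer preset in the pedestrian re-identification model for calculation, so as to obtain the pedestrian re-identification result output by the fully connected layer.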
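This application provides an information recognition apparatus based on a residual network, including:
an instruction acquisition unit, configured to acquire a pedestrian re-identification instruction, where the pedestrian re-identification instruction carries an image of a designated pedestrian to be identified;
a feature image acquisition unit, configured to input the image of the designated pedestrian into a preset, trained pedestrian re-identification model based on a residual network for calculation, so as to obtain the feature image output by the fourth residual block of the residual network, where the pedestrian re-identification model is trained on sample data consisting of pedestrian images and the recognition results associated with those pedestrian images;
a data acquisition unit, configured to input the feature image into the fifth residual block of the residual network for calculation, so as to obtain the main data output by the fifth residual block; in parallel, to input the feature image into a global recognition sub-network preset in the pedestrian re-identification model for calculation, so as to obtain the global sub-data output by the global recognition sub-network; and, in parallel, to input the feature image into a local recognition sub-network preset in the pedestrian re-identification model for calculation, so as to obtain the local sub-data output by the local recognition sub-network;
a pedestrian re-identification result acquisition unit, configured to input the main data, the global sub-data, and the local sub-data into a fully connected layer preset in the pedestrian re-identification model for calculation, so as to obtain the pedestrian re-identification result output by the fully connected layer.
This application provides a computer device including a memory and a processor, where the memory stores a computer program and the processor implements the steps of any of the above methods when executing the computer program.
This application provides a non-volatile computer-readable storage medium on which a computer program is stored, where the computer program implements the steps of any of the above methods when executed by a processor.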
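Beneficial Effects
The information recognition method, apparatus, computer device, and storage medium based on a residual network of this application acquire a pedestrian re-identification instruction, where the instruction carries an image of a designated pedestrian to be identified; input the image of the designated pedestrian into a preset, trained pedestrian re-identification model based on a residual network for calculation, so as to obtain the feature image output by the fourth residual block of the residual network; obtain the main data output by the fifth residual block; obtain the global sub-data output by the global recognition sub-network; obtain the local sub-data output by the local recognition sub-network; and input the main data, the global sub-data, and the local sub-data into a fully connected layer preset in the pedestrian re-identification model for calculation, so as to obtain the pedestrian re-identification result output by the fully connected layer. Compared with conventional techniques, which use only the data output by the last layer of the residual network, this application additionally takes into account the feature image output by the fourth residual block (detail data is retained through the global sub-data and the local sub-data), which minimizes the loss of image detail and improves the accuracy of pedestrian re-identification.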
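Brief Description of the Drawings
Figure 1 is a schematic flowchart of an information recognition method based on a residual network according to an embodiment of this application;
Figure 2 is a schematic structural block diagram of an information recognition apparatus based on a residual network according to an embodiment of this application;
Figure 3 is a schematic structural block diagram of a computer device according to an embodiment of this application.
The realization of the purpose, the functional characteristics, and the advantages of this application are further described below in connection with the embodiments and with reference to the drawings.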
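Best Mode for Carrying Out the Invention
To make the purpose, technical solutions, and advantages of this application clearer, the application is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only explain this application and do not limit it.
Referring to Figure 1, an embodiment of this application provides an information recognition method based on a residual network, including the following steps:
S1. acquiring a pedestrian re-identification instruction, where the pedestrian re-identification instruction carries an image of a designated pedestrian to be identified;
S2. inputting the image of the designated pedestrian into a preset, trained pedestrian re-identification model based on a residual network for calculation, so as to obtain the feature image output by the fourth residual block of the residual network, where the pedestrian re-identification model is trained on sample data consisting of pedestrian images and the recognition results associated with those pedestrian images;
S3. inputting the feature image into the fifth residual block of the residual network for calculation, so as to obtain the main data output by the fifth residual block; in parallel, inputting the feature image into a global recognition sub-network preset in the pedestrian re-identification model for calculation, so as to obtain the global sub-data output by the global recognition sub-network; and, in parallel, inputting the feature image into a local recognition sub-network preset in the pedestrian re-identification model for calculation, so as to obtain the local sub-data output by the local recognition sub-network;
S4. inputting the main data, the global sub-data, and the local sub-data into a fully connected layer preset in the pedestrian re-identification model for calculation, so as to obtain the pedestrian re-identification result output by the fully connected layer.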
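As described in step S1 above, a pedestrian re-identification instruction is acquired, where the instruction carries an image of a designated pedestrian to be identified. The image of the designated pedestrian can be obtained in any way: for example, it may be an image pre-stored in a database, an image captured in real time, or a frame extracted from surveillance video. In this application, the information recognition may also be referred to as pedestrian re-identification.
As described in step S2 above, the image of the designated pedestrian is input into a preset, trained pedestrian re-identification model based on a residual network for calculation, so as to obtain the feature image output by the fourth residual block of the residual network, where the model is trained on sample data consisting of pedestrian images and the recognition results associated with those images. The pedestrian re-identification model of this application is based on a residual network, for example resnet50, resnet101, or resnet152; resnet50 is preferred. The residual network comprises a first through a fifth residual block, each of which contains at least one convolutional layer and can output a corresponding feature image. To address the technical problem that the fine details of the input image are lost as the network processes it layer by layer, this application also provides a global recognition sub-network and a local recognition sub-network in the model, which receive the feature image output by the fourth residual block in parallel with the fifth residual block. The global and local recognition sub-networks selectively preserve the global and local features of the feature image output by the fourth residual block, which avoids losing useful data while also avoiding the introduction of excessive interfering data. The output layer of the model may be any layer; a fully connected layer is preferred, so that its input is mapped into a fixed-length feature vector from which the recognition result is obtained.
As described in step S3 above, the feature image is input into the fifth residual block of the residual network for calculation, so as to obtain the main data output by the fifth residual block; in parallel, the feature image is input into the global recognition sub-network preset in the model, so as to obtain the global sub-data; and, in parallel, the feature image is input into the local recognition sub-network preset in the model, so as to obtain the local sub-data. The calculation performed by the fifth residual block on the feature image is a process that includes convolution (and may also include pooling, activation, and similar operations). The calculation performed by the global recognition sub-network is the extraction of the global features of the feature image (features of the whole image), for example its global color and global contour. The calculation performed by the local recognition sub-network is the extraction of the features of a local region of the feature image (for example, a head region selected from the whole image), for example its local color and local contour. The global recognition sub-network and the local recognition sub-network may be built from any neural network, for example a convolutional neural network.
As described in step S4 above, the main data, the global sub-data, and the local sub-data are input into the fully connected layer preset in the pedestrian re-identification model for calculation, so as to obtain the pedestrian re-identification result output by the fully connected layer. A conventional residual network passes only the main data output by the fifth residual block to the fully connected layer to obtain the final recognition result, and its recognition accuracy leaves room for improvement. This application inputs the main data output by the fifth residual block together with the global sub-data output by the global recognition sub-network and the local sub-data output by the local recognition sub-network into the fully connected layer; that is, it also uses the detail data in the feature image output by the fourth residual block, which conventional techniques ignore, so that the recognition result is more accurate.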
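The data flow of steps S3 and S4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the application's implementation: the function `reid_forward`, the representation of the feature image as a flat list, the toy lambdas standing in for the three branches, and the use of simple concatenation to combine the branch outputs are all hypothetical choices made for the sketch.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def reid_forward(
    feature_image: Sequence[float],
    block5: Callable[[Sequence[float]], Vector],
    global_branch: Callable[[Sequence[float]], Vector],
    local_branch: Callable[[Sequence[float]], Vector],
    fully_connected: Callable[[Vector], Vector],
) -> Vector:
    """Feed the block-4 feature image to three branches, then fuse them."""
    main_data = block5(feature_image)          # S3: fifth residual block
    global_sub = global_branch(feature_image)  # S3: global recognition sub-network
    local_sub = local_branch(feature_image)    # S3: local recognition sub-network
    # Assumption: the three outputs are combined by concatenation.
    fused = main_data + global_sub + local_sub
    return fully_connected(fused)              # S4: fully connected layer


# Toy stand-ins for the real sub-networks (hypothetical).
feature_image = [0.2, 0.8, 0.5, 0.1]
scores = reid_forward(
    feature_image,
    block5=lambda f: [sum(f)],
    global_branch=lambda f: [max(f), min(f)],
    local_branch=lambda f: [f[0], f[-1]],
    fully_connected=lambda v: [sum(v), float(len(v))],
)
```

The point of the sketch is that all three branches read the same fourth-block output, so the detail it carries reaches the fully connected layer even if the fifth block abstracts it away.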
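In one embodiment, the image of the designated pedestrian includes a face region, and before step S2 of inputting the image of the designated pedestrian into the preset, trained pedestrian re-identification model based on a residual network for calculation, so as to obtain the feature image output by the fourth residual block of the residual network, the method includes:
S111. dividing the image of the designated pedestrian into multiple areas, comparing the image data of each area with preset eye image data to obtain the difference between the image data of each area and the eye image data, and recording each area whose difference does not exceed a preset value as the eye area;
S112. comparing the image data of each area with preset mouth image data to obtain the difference between the image data of each area and the mouth image data, and recording each area whose difference does not exceed a preset value as the mouth area;
S113. retrieving a standard face image and proportionally shrinking or enlarging it so that the eye area in the standard face image coincides with the eye area in the image of the designated pedestrian and, at the same time, the mouth area in the standard face image coincides with the mouth area in the image of the designated pedestrian; then recording the area of the image of the designated pedestrian that overlaps the scaled standard face image as the face region, and taking the image within the face region as the face image;
S114. using a preset image similarity calculation method to calculate the similarity value between the face image and a pre-stored target face image, and determining whether the similarity value is greater than a preset similarity threshold;
S115. if the similarity value is not greater than the preset similarity threshold, generating a pedestrian re-identification model calculation instruction, where the instruction indicates that the image of the designated pedestrian is to be input into the preset, trained pedestrian re-identification model based on a residual network for calculation.
The above implements pre-recognition of the image of the designated pedestrian. The eye image data is standard image data that can be used to identify eye features (for example, data from a previously collected image region of a human eye), and the mouth image data is standard image data that can be used to identify mouth features (for example, data from a previously collected image region of a human mouth); the image data is, for example, image pixels (such as their three primary color values). Any conventional comparison method may be used to compare the image data, for example pixel-by-pixel comparison, which is not detailed here. Further, if the eye area is larger than a single divided area, the multiple contiguous areas whose differences do not exceed the preset value together form the eye area; likewise, the multiple contiguous areas whose differences do not exceed the preset value together form the mouth area. Because facial features are distributed according to certain geometric proportions, the approximate outline of the face is known once the eye area and the mouth area have been determined. Accordingly, the standard face image is retrieved and proportionally scaled so that its eye area and mouth area coincide with those in the image of the designated pedestrian; the area of the image of the designated pedestrian that overlaps the scaled standard face image is recorded as the face region, and the image within it is taken as the face image. The preset image similarity calculation method is then used to calculate the similarity value between the face image and the pre-stored target face image, and it is determined whether the similarity value is greater than the preset similarity threshold. If it is not, the image of the designated pedestrian differs from the target face image used for comparison and further recognition is needed, so the pedestrian re-identification model calculation instruction is generated. The preset image similarity calculation method is, for example, to compare pixels one by one to determine the number of identical pixels, and to take the quotient of the number of identical pixels divided by the total number of pixels as the similarity value. In this way, a pedestrian with distinctive features (for example, an unusually large face or a distinctive mole on the face) can be recognized directly by the preset image similarity calculation method without invoking the pedestrian re-identification model, which improves recognition efficiency.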
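The pixel-counting similarity measure and the gating decision of steps S114 and S115 can be illustrated as follows. This is a minimal sketch assuming that both face images have already been cropped and scaled to the same size and flattened into lists of RGB tuples; the function names `pixel_similarity` and `needs_reid_model` and the sample pixel values are hypothetical.

```python
from typing import Sequence, Tuple

Pixel = Tuple[int, int, int]


def pixel_similarity(a: Sequence[Pixel], b: Sequence[Pixel]) -> float:
    """Fraction of positions at which the two images have identical pixels."""
    if len(a) != len(b) or not a:
        raise ValueError("images must be non-empty and of equal size")
    same = sum(1 for pa, pb in zip(a, b) if pa == pb)
    return same / len(a)


def needs_reid_model(face: Sequence[Pixel], target: Sequence[Pixel],
                     threshold: float) -> bool:
    """S115: invoke the model only if similarity is NOT greater than the threshold."""
    return pixel_similarity(face, target) <= threshold


target = [(10, 10, 10), (20, 20, 20), (30, 30, 30), (40, 40, 40)]
face = [(10, 10, 10), (20, 20, 20), (30, 30, 30), (0, 0, 0)]
similarity = pixel_similarity(face, target)            # 3 of 4 pixels match
run_model = needs_reid_model(face, target, threshold=0.9)
```

Exact pixel equality is very strict for real camera images; the text only gives it as one example of a similarity method, and a tolerance-based comparison would be a natural variation.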
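In one embodiment, before step S2 of inputting the image of the designated pedestrian into the preset, trained pedestrian re-identification model based on a residual network for calculation, so as to obtain the feature image output by the fourth residual block of the residual network, the method includes:
S121. acquiring a specified amount of sample data and dividing the sample data into a training set and a test set, where the sample data includes pedestrian images and the recognition results associated with the pedestrian images;
S122. inputting the sample data of the training set into an initial pedestrian re-identification model based on a residual network for training, where stochastic gradient descent is used during training, to obtain a result training model;
S123. verifying the result training model with the sample data of the test set;
S124. if verification passes, recording the result training model as the pedestrian re-identification model based on the residual network.
The above implements training of the pedestrian re-identification model. The model in this embodiment is based on a residual network, which may be resnet50, resnet101, or resnet152; resnet50 is preferred. Stochastic gradient descent randomly samples part of the training data in place of the whole training set; when the sample size is very large (for example, hundreds of thousands), the iteration may already reach the optimum after using only tens of thousands or even a few thousand samples, which speeds up training. Further, the training process may also use backpropagation to update the parameters of each layer. Backpropagation (BP) builds on gradient descent. The input-output relationship of a BP network is essentially a mapping: a BP network with n inputs and m outputs performs a continuous mapping from n-dimensional Euclidean space to a bounded region in m-dimensional Euclidean space; this mapping is highly nonlinear, which facilitates updating the parameters of each layer of the network model. The result training model is thereby obtained. The sample data of the test set is then used to verify the result training model; if verification passes, the result training model is recorded as the pedestrian re-identification model based on the residual network. Further, the initial pedestrian re-identification model based on the residual network includes not only the residual network but also, after the fourth residual block and in parallel with the fifth residual block, a global recognition sub-network and a local recognition sub-network, which respectively extract the global features (features of the whole image) of the feature image output by the fourth residual block and the features of a local region (for example, a head region selected from the whole image) of that feature image. A trained pedestrian re-identification model is thereby obtained. Because the model has gone through training and verification, it can be relied on to perform the pedestrian re-identification task with optimized parameters, which improves its recognition accuracy in actual pedestrian re-identification.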
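The split, train, and verify sequence of steps S121 to S124 can be shown on a deliberately tiny problem. The sketch below is an assumption-laden stand-in: instead of a residual network it fits a single weight `w` in the model y = w * x, and the names `split_samples` and `sgd_fit`, the learning rate, the batch size, and the pass criterion are all hypothetical. What it does share with the text is the idea of updating parameters from randomly sampled mini-batches rather than from the whole training set.

```python
import random
from typing import List, Tuple

Sample = Tuple[float, float]


def split_samples(samples: List[Sample], train_ratio: float = 0.8,
                  seed: int = 0) -> Tuple[List[Sample], List[Sample]]:
    """S121: shuffle the samples and split them into training and test sets."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


def sgd_fit(train: List[Sample], lr: float = 0.001, steps: int = 200,
            batch_size: int = 4, seed: int = 0) -> float:
    """S122: stochastic gradient descent on the squared error of y = w * x."""
    rng = random.Random(seed)
    w = 0.0
    for _ in range(steps):
        batch = rng.sample(train, min(batch_size, len(train)))
        # Gradient of the mean squared error over the random mini-batch.
        grad = sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)
        w -= lr * grad
    return w


samples = [(float(x), 2.0 * x) for x in range(1, 21)]  # true weight is 2.0
train_set, test_set = split_samples(samples)
w = sgd_fit(train_set)
# S123/S124: verify on held-out data and accept the model only if it passes.
test_error = max(abs(w * x - y) for x, y in test_set)
verified = test_error < 1e-3
```

In a real setting the verification criterion would be a recognition accuracy on the test set rather than a regression error.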
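In one embodiment, before step S2 of inputting the image of the designated pedestrian into the preset, trained pedestrian re-identification model based on a residual network for calculation, so as to obtain the feature image output by the fourth residual block of the residual network, the method includes:
S131. acquiring the weight parameters of each layer of a residual network model that has already been trained;
S132. setting the acquired weight parameters of each layer as the initial weight parameters of the corresponding layers of the residual network in the initial pedestrian re-identification model;
S133. verifying the initial pedestrian re-identification model with the sample data of a test set, where the sample data includes pedestrian images and the recognition results associated with the pedestrian images;
S134. if verification passes, recording the initial pedestrian re-identification model as the pedestrian re-identification model based on the residual network.
The above uses transfer learning to obtain the pedestrian re-identification model based on the residual network quickly. If an already trained residual network model is available, the training step can be skipped: the initial weight parameters of each layer of the residual network in the initial pedestrian re-identification model are obtained directly from it. To guard against the initial model being unsuitable, the sample data of the test set, which includes pedestrian images and the recognition results associated with them, is also used to verify it; if verification passes, the initial model is recorded as the pedestrian re-identification model based on the residual network. This ensures that the model finally obtained is correct and usable. By using transfer learning to obtain the per-layer weight parameters and then verifying on that basis, this application avoids the large amount of time that training would require and thus shortens the time needed to obtain the pedestrian re-identification model.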
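The weight transfer of steps S131 and S132 amounts to copying parameters layer by layer from a trained network into the matching layers of the new model. The following sketch assumes, purely for illustration, that weights are stored as a dictionary from layer name to a flat list of numbers; the function `init_from_pretrained`, the layer names, and the rule of skipping layers whose sizes differ are hypothetical.

```python
from typing import Dict, List

Weights = Dict[str, List[float]]


def init_from_pretrained(model: Weights, pretrained: Weights) -> List[str]:
    """Copy pretrained weights into layers with the same name and size.

    Returns the names of the layers that were initialized; layers that exist
    only in the new model (such as the added sub-networks) are left untouched.
    """
    copied = []
    for name, values in pretrained.items():
        if name in model and len(model[name]) == len(values):
            model[name] = list(values)  # copy so the two models stay independent
            copied.append(name)
    return copied


pretrained = {"block1": [0.1, 0.2], "block5": [0.9], "fc": [1.0, 2.0, 3.0]}
model = {"block1": [0.0, 0.0], "block5": [0.0],
         "global_branch": [0.0], "fc": [0.0, 0.0]}
copied = init_from_pretrained(model, pretrained)
```

Here the `fc` layer is skipped because its size differs, and `global_branch` keeps its initial values because the pretrained network has no such layer, which is why the text still requires verification on a test set (S133) before the model is accepted.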
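In one embodiment, the part of step S3 that inputs the feature image into the global recognition sub-network preset in the pedestrian re-identification model for calculation, so as to obtain the global sub-data output by the global recognition sub-network, includes:
S301. extracting designated data from the feature image through the global recognition sub-network, and determining whether the value of the designated data is within a preset value range, where the designated data includes at least body contour, skin color, or clothing color;
S302. if the value of the designated data is not within the preset value range, taking the designated data as global sub-data and outputting the global sub-data.
The above obtains the global sub-data output by the global recognition sub-network. To prevent the loss of image detail, this application extracts global sub-data from the feature image output by the fourth residual block, where the values of the global sub-data are not within the preset value range, so that data with large differences is retained and interference from useless data is avoided. The designated data is data that can reflect pedestrian characteristics, for example body contour, skin color, or clothing color. Because body contours are not all alike, and skin color or clothing color is also likely to differ, these are extracted as designated data. If the value of the designated data is not within the preset value range, the designated data is usable; for example, when the person to be identified has a skin color that differs markedly from that of most people in the scene, the color value of the skin color data falls outside the preset value range and can be output as valid data. Further, the global recognition sub-network selects multiple items of designated data to collect, and outputs as global sub-data those whose values are not within the preset value range. The number of items of designated data may be set to 2 to 10, preferably 6 to 8. Further, the global recognition sub-network may comprise a neural network with any number of layers, for example 6 to 8 layers. The detail features of the feature image are thus retained in the form of global sub-data, which assists subsequent pedestrian re-identification and improves recognition accuracy.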
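The selection rule of steps S301 and S302 keeps only the designated data whose value falls outside a preset range, on the reasoning that unusual values are the discriminative ones. A minimal sketch follows, assuming each item of designated data has been reduced to a single number; the function `select_sub_data`, the item names, and the ranges are hypothetical.

```python
from typing import Dict, Tuple


def select_sub_data(candidates: Dict[str, float],
                    preset_ranges: Dict[str, Tuple[float, float]]) -> Dict[str, float]:
    """Keep only designated data whose value is NOT within its preset range."""
    selected = {}
    for name, value in candidates.items():
        low, high = preset_ranges[name]
        if not (low <= value <= high):
            selected[name] = value
    return selected


candidates = {"skin_color": 0.91, "clothing_color": 0.40, "body_contour": 0.15}
preset_ranges = {
    "skin_color": (0.3, 0.7),
    "clothing_color": (0.2, 0.6),
    "body_contour": (0.2, 0.8),
}
global_sub_data = select_sub_data(candidates, preset_ranges)
```

In this example the clothing color value lies inside its range and is dropped, while the other two are kept as global sub-data.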
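In one embodiment, the part of step S3 that inputs the feature image into the local recognition sub-network preset in the pedestrian re-identification model for calculation, so as to obtain the local sub-data output by the local recognition sub-network, includes:
S311. dividing the feature image into multiple blocks through the local recognition sub-network, using a preset block division method;
S312. extracting designated data from each of the blocks, and determining whether the value of the designated data is within a preset value range, where the designated data includes at least local contour, local skin color, or local clothing color;
S313. if the value of the designated data is not within the preset value range, taking the designated data as local sub-data and outputting the local sub-data.
The above obtains the local sub-data output by the local recognition sub-network. As the network processes the input layer by layer, the fine details of the input image are lost, and local image data in particular. To retain the useful local data, this application divides the feature image into multiple blocks through the local recognition sub-network using a preset block division method, extracts designated data from each block, and, if the value of the designated data is not within the preset value range, takes the designated data as local sub-data and outputs it. Valuable local sub-data is thus preserved and serves as one basis for subsequent recognition. Further, the local recognition sub-network selects multiple items of designated data to collect, and outputs as local sub-data those whose values are not within the preset value range. The number of items of designated data may be set to 2 to 10, preferably 6 to 8. Further, the local recognition sub-network may comprise a neural network with any number of layers, for example 8 to 10 layers. Further, the block division method is, for example, to identify a characteristic shape in the feature image and to take the region centered on that shape as a single block (for example, if a head contour is identified, the head contour is taken as the head block). The detail features of the feature image are thus retained in the form of local sub-data, which assists subsequent pedestrian re-identification and improves recognition accuracy.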
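Steps S311 to S313 can be sketched as block division followed by the same out-of-range test applied per block. Note the substitution: the text's example divides blocks around recognized shapes such as a head contour, whereas this sketch uses equal horizontal stripes because they need no shape detector. The functions `split_into_stripes` and `local_sub_data`, the use of the block mean as the designated data, and the sample values are all assumptions made for illustration.

```python
from typing import List, Tuple

Grid = List[List[float]]


def split_into_stripes(feature_map: Grid, n_blocks: int) -> List[Grid]:
    """S311 (simplified): cut the feature map into equal horizontal stripes."""
    rows = len(feature_map)
    if n_blocks <= 0 or rows % n_blocks != 0:
        raise ValueError("row count must be a positive multiple of n_blocks")
    step = rows // n_blocks
    return [feature_map[i:i + step] for i in range(0, rows, step)]


def local_sub_data(feature_map: Grid, n_blocks: int,
                   preset_range: Tuple[float, float]) -> List[Tuple[int, float]]:
    """S312/S313: keep (block index, mean) for blocks whose mean is out of range."""
    low, high = preset_range
    kept = []
    for index, block in enumerate(split_into_stripes(feature_map, n_blocks)):
        values = [v for row in block for v in row]
        mean = sum(values) / len(values)
        if not (low <= mean <= high):
            kept.append((index, mean))
    return kept


feature_map = [
    [0.9, 0.9], [0.9, 0.9],   # top stripe, e.g. head
    [0.5, 0.5], [0.5, 0.5],   # middle stripe, e.g. torso
    [0.1, 0.1], [0.1, 0.1],   # bottom stripe, e.g. legs
]
local = local_sub_data(feature_map, n_blocks=3, preset_range=(0.3, 0.7))
kept_blocks = [index for index, _ in local]
```

Only the top and bottom stripes are kept here, because the middle stripe's mean lies inside the preset range.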
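In one embodiment, step S4 of inputting the main data, the global sub-data, and the local sub-data into the fully connected layer preset in the pedestrian re-identification model for calculation, so as to obtain the pedestrian re-identification result output by the fully connected layer, includes:
S401. using a preset mapping method to map the main data, the global sub-data, and the local sub-data into one fixed-length feature vector through the fully connected layer;
S402. according to a preset correspondence between vector components and recognition results, outputting the recognition result corresponding to the component with the largest value in the feature vector.
The above makes combined use of the main data, the global sub-data, and the local sub-data to obtain the pedestrian re-identification result output by the fully connected layer. Conventional models based on a residual network all feed the data of the fifth residual block into the fully connected layer, which then maps the data into a feature vector. This application, by contrast, jointly considers the main data output by the fifth residual block, the global sub-data output by the global recognition sub-network, and the local sub-data output by the local recognition sub-network, and uses the fully connected layer to map them into one fixed-length feature vector, which improves recognition accuracy. The preset mapping method is similar to the mapping method of a fully connected layer in conventional techniques and is not described again here. Each component of the feature vector output by the fully connected layer represents a corresponding recognition result, and the recognition result corresponding to the component with the largest value is the most probable one, so it is taken as the final output. Compared with conventional techniques, this application uses not only the main data but also the global sub-data and local sub-data that conventional techniques ignore, so the mapped feature vector is more accurate and the accuracy of the final recognition result is improved.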
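Steps S401 and S402 can be illustrated with a plain fully connected layer followed by an argmax lookup. This is a sketch under assumptions: the three inputs are concatenated before the layer, the layer is an ordinary weighted sum plus bias, and the names `fully_connected` and `recognize`, the weights, and the labels are hypothetical.

```python
from typing import List, Sequence


def fully_connected(inputs: Sequence[float],
                    weights: Sequence[Sequence[float]],
                    biases: Sequence[float]) -> List[float]:
    """S401: map the input to a fixed-length vector, one component per output row."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def recognize(main_data: Sequence[float], global_sub: Sequence[float],
              local_sub: Sequence[float], weights: Sequence[Sequence[float]],
              biases: Sequence[float], labels: Sequence[str]) -> str:
    """S402: return the label of the largest component of the feature vector."""
    fused = list(main_data) + list(global_sub) + list(local_sub)
    scores = fully_connected(fused, weights, biases)
    best = max(range(len(scores)), key=scores.__getitem__)
    return labels[best]


labels = ["person_A", "person_B"]
weights = [[1.0, 0.0, 0.0],   # person_A looks only at the main data
           [0.0, 1.0, 1.0]]   # person_B looks at global and local sub-data
biases = [0.0, 0.0]
result = recognize([0.4], [0.3], [0.2], weights, biases, labels)
```

With these toy weights the scores are 0.4 for `person_A` and 0.5 for `person_B`, so the sub-data tips the decision, which is the effect the text attributes to using the global and local sub-data alongside the main data.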
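The information recognition method based on a residual network of this application acquires a pedestrian re-identification instruction, where the instruction carries an image of a designated pedestrian to be identified; inputs the image of the designated pedestrian into a preset, trained pedestrian re-identification model based on a residual network for calculation, so as to obtain the feature image output by the fourth residual block of the residual network; obtains the main data output by the fifth residual block; obtains the global sub-data output by the global recognition sub-network; obtains the local sub-data output by the local recognition sub-network; and inputs the main data, the global sub-data, and the local sub-data into the fully connected layer preset in the pedestrian re-identification model for calculation, so as to obtain the pedestrian re-identification result output by the fully connected layer. This improves the accuracy of pedestrian re-identification.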
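Referring to Figure 2, an embodiment of this application provides an information recognition apparatus based on a residual network, including:
an instruction acquisition unit 10, configured to acquire a pedestrian re-identification instruction, where the pedestrian re-identification instruction carries an image of a designated pedestrian to be identified;
a feature image acquisition unit 20, configured to input the image of the designated pedestrian into a preset, trained pedestrian re-identification model based on a residual network for calculation, so as to obtain the feature image output by the fourth residual block of the residual network, where the pedestrian re-identification model is trained on sample data consisting of pedestrian images and the recognition results associated with those pedestrian images;
a data acquisition unit 30, configured to input the feature image into the fifth residual block of the residual network for calculation, so as to obtain the main data output by the fifth residual block; in parallel, to input the feature image into a global recognition sub-network preset in the pedestrian re-identification model for calculation, so as to obtain the global sub-data output by the global recognition sub-network; and, in parallel, to input the feature image into a local recognition sub-network preset in the pedestrian re-identification model for calculation, so as to obtain the local sub-data output by the local recognition sub-network;
a pedestrian re-identification result acquisition unit 40, configured to input the main data, the global sub-data, and the local sub-data into a fully connected layer preset in the pedestrian re-identification model for calculation, so as to obtain the pedestrian re-identification result output by the fully connected layer.
The operations performed by the above units correspond one-to-one to the steps of the information recognition method based on a residual network of the foregoing embodiments and are not repeated here.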
在一个实施方式中,所述指定行人的图像包括面部区域,所述装置,包括:
眼睛区域标记单元,用于将所述指定行人的图像划分为多个区域,将每个区域的图像数据与预设的眼睛图像数据进行对比,得到每个区域图像数据与眼睛图像数据的差值,将差值不超过预设数值的区域记为眼睛区域;
嘴巴区域标记单元,用于将每个区域的图像数据与预设的嘴巴图像数据进行比较,得到每个区域图像数据与嘴巴图像数据的差值,将差值不超过预设数值的区域记为嘴巴区域;
面部图像获取单元,用于调用标准面部图像,并通过等比例缩小或者放大操作,使所述标准面部图像中的眼睛区域与所述指定行人的图像中的眼睛区域重合,同时使所述标准面部图像中的嘴巴区域与所述指定行人的图像中的嘴巴区域重合,再将所述指定行人的图像中与经过所述等比例缩小或者放大操作后的标准面部图像重叠的区域记为面部区域,并将所述面部区域范围内的图像作为面部图像;
相似度值计算单元,用于采用预设的图像相似度计算方法,计算所述面部图像与预存的目标面部图像的相似度值,并判断所述相似度值是否大于预设的相似度阈值;
计算指令生成单元,用于若所述相似度值不大于预设的相似度阈值,则生成行人再识别模型计算指令,其中所述行人再识别模型计算指令用于指示将所述指定行人的图像输入预设的训练好的基于残差网络的行人再识别模型中计算。
其中上述单元分别用于执行的操作与前述实施方式的基于残差网络的信息识别方法的步骤一一对应,在此不再赘述。
在一个实施方式中,所述装置,包括:
样本数据获取单元,用于获取指定量的样本数据,并将样本数据分成训练集和测试集;其中,所述样本数据包括行人图像,以及与行人图像关联的识别结果;
训练单元,用于将训练集的样本数据输入到基于残差网络的初始行人再识别模型中进行训练;其中,训练的过程中采用随机梯度下降法,得到结果训练模型;
验证单元,用于利用所述测试集的样本数据验证所述结果训练模型;
模型标记单元,用于如果验证通过,则将所述结果训练模型记为所述基于残差网络的行人再识别模型。
其中上述单元分别用于执行的操作与前述实施方式的基于残差网络的信息识别方法的步骤一一对应,在此不再赘述。
在一个实施方式中,所述装置,包括:
权重参数获取单元,用于获取已经训练完成的残差网络模型的各层权重参数;
初始化单元,用于将所述各层权重参数初始化为初始行人再识别模型中的残差网络中的各层初始权重参数;
模型验证单元,用于利用测试集的样本数据验证所述初始行人再识别模型,其中,所述样本数据包括行人图像,以及与行人图像关联的识别结果;
行人再识别模型标记单元,用于若验证通过,则将所述初始行人再识别模型记为所述基于残差网络的行人再识别模型。
其中上述单元分别用于执行的操作与前述实施方式的基于残差网络的信息识别方法的步骤一一对应,在此不再赘述。
在一个实施方式中,所述数据获取单元30,包括:
指定数据提取子单元,用于通过所述全局识别子网络在所述特征图像中提取指定数据,并判断所述指定数据的数值是否在预设的数值范围之内,其中所述指定数据至少包括人体轮廓、人体肤色或者衣着颜色;
全局子数据输出子单元,用于若所述指定数据的数值不在预设的数值范围之内,则将所述指定数据作为全局子数据,并输出所述全局子数据。
其中上述子单元分别用于执行的操作与前述实施方式的基于残差网络的信息识别方法的步骤一一对应,在此不再赘述。
在一个实施方式中,所述数据获取单元30,包括:
区块划分子单元,用于通过所述局部识别子网络,采用预设的区块划分方法将所述特征图像划分为多个区块;
数据提取子单元,用于在各个所述区块中分别提取指定数据,并判断所述指定数据的数值是否在预设的数值范围之内,其中所述指定数据至少包括局部轮廓、局部肤色、或者局部衣着颜色;
局部子数据输出子单元,用于若所述指定数据的数值不在预设的数值范围之内,则将所述指定数据作为局部子数据,并输出所述局部子数据。
其中上述子单元分别用于执行的操作与前述实施方式的基于残差网络的信息识别方法的步骤一一对应,在此不再赘述。
在一个实施方式中,所述行人再识别结果获取单元40,包括:
映射子单元,用于采用预设的映射方法,通过所述全连接层将所述主数据、所述全局子数据和所述局部子数据映射为一个固定长度的特征向量;
识别结果输出子单元,用于根据预设的分向量与识别结果对应关系,输出所述特征向量中数值最大的分向量对应的识别结果。
其中上述子单元分别用于执行的操作与前述实施方式的基于残差网络的信息识别方法的步骤一一对应,在此不再赘述。
本申请的基于残差网络的信息识别装置,获取行人再识别的指令,其中所述行人再识别的指令携带有待识别的指定行人的图像;将所述指定行人的图像输入预设的训练好的基于残差网络的行人再识别模型中计算,从而获取所述残差网络中的第四个残差块输出的特征图像;获得所述第五个残差块输出的主数据;获得所述全局识别子网络输出的全局子数据;获得所述局部识别子网络输出的局部子数据;将所述主数据、所述全局子数据和所述局部子数据输入所述行人再识别模型中预设的全连接层中进行计算,从而获得所述全连接层输出的行人再识别结果。从而提高了行人再识别的准确性。
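Referring to Figure 3, an embodiment of this application also provides a computer device, which may be a server and whose internal structure may be as shown in the figure. The computer device includes a processor, a memory, a network interface, and a database connected through a system bus. The processor of the computer device provides calculation and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The database of the computer device stores the data used by the information recognition method based on a residual network. The network interface of the computer device communicates with external terminals through a network connection. The computer program, when executed by the processor, implements an information recognition method based on a residual network.
The processor executes the above information recognition method based on a residual network, where the steps of the method correspond one-to-one to the steps of the information recognition method based on a residual network of the foregoing embodiments and are not repeated here.
Those skilled in the art will understand that the structure shown in the figure is only a block diagram of the part of the structure related to the solution of this application and does not limit the computer device to which the solution is applied.
The computer device of this application acquires a pedestrian re-identification instruction, where the instruction carries an image of a designated pedestrian to be identified; inputs the image of the designated pedestrian into a preset, trained pedestrian re-identification model based on a residual network for calculation, so as to obtain the feature image output by the fourth residual block of the residual network; obtains the main data output by the fifth residual block; obtains the global sub-data output by the global recognition sub-network; obtains the local sub-data output by the local recognition sub-network; and inputs the main data, the global sub-data, and the local sub-data into the fully connected layer preset in the pedestrian re-identification model for calculation, so as to obtain the pedestrian re-identification result output by the fully connected layer. This improves the accuracy of pedestrian re-identification.
An embodiment of this application also provides a computer-readable storage medium on which a computer program is stored. When the computer program is executed by a processor, an information recognition method based on a residual network is implemented, where the steps of the method correspond one-to-one to the steps of the information recognition method based on a residual network of the foregoing embodiments and are not repeated here.
The computer-readable storage medium of this application acquires a pedestrian re-identification instruction, where the instruction carries an image of a designated pedestrian to be identified; inputs the image of the designated pedestrian into a preset, trained pedestrian re-identification model based on a residual network for calculation, so as to obtain the feature image output by the fourth residual block of the residual network; obtains the main data output by the fifth residual block; obtains the global sub-data output by the global recognition sub-network; obtains the local sub-data output by the local recognition sub-network; and inputs the main data, the global sub-data, and the local sub-data into the fully connected layer preset in the pedestrian re-identification model for calculation, so as to obtain the pedestrian re-identification result output by the fully connected layer. This improves the accuracy of pedestrian re-identification. The computer-readable storage medium is, for example, a non-volatile computer-readable storage medium or a volatile computer-readable storage medium.
Those of ordinary skill in the art will understand that all or part of the processes of the above method embodiments can be implemented by a computer program instructing the relevant hardware. The computer program may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the above method embodiments. Any reference to memory, storage, a database, or another medium provided in this application and used in the embodiments may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
It should be noted that, in this document, the terms "comprise", "include", or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, apparatus, article, or method that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, apparatus, article, or method. Without further limitation, an element defined by the phrase "including a ..." does not exclude the existence of additional identical elements in the process, apparatus, article, or method that includes that element.
The above are only preferred embodiments of this application and do not thereby limit its patent scope. Any equivalent structural or process transformation made using the contents of the specification and drawings of this application, or any direct or indirect application in other related technical fields, is likewise included in the scope of patent protection of this application.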

Claims (20)

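  1. An information recognition method based on a residual network, characterized by comprising:
    acquiring a pedestrian re-identification instruction, where the pedestrian re-identification instruction carries an image of a designated pedestrian to be identified;
    inputting the image of the designated pedestrian into a preset, trained pedestrian re-identification model based on a residual network for calculation, so as to obtain the feature image output by the fourth residual block of the residual network, where the pedestrian re-identification model is trained on sample data consisting of pedestrian images and the recognition results associated with those pedestrian images;
    inputting the feature image into the fifth residual block of the residual network for calculation, so as to obtain the main data output by the fifth residual block; in parallel, inputting the feature image into a global recognition sub-network preset in the pedestrian re-identification model for calculation, so as to obtain the global sub-data output by the global recognition sub-network; and, in parallel, inputting the feature image into a local recognition sub-network preset in the pedestrian re-identification model for calculation, so as to obtain the local sub-data output by the local recognition sub-network;
    inputting the main data, the global sub-data, and the local sub-data into a fully connected layer preset in the pedestrian re-identification model for calculation, so as to obtain the pedestrian re-identification result output by the fully connected layer.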
  2. 根据权利要求1所述的基于残差网络的信息识别方法,其特征在于,所述指定行人的图像包括面部区域,所述将所述指定行人的图像输入预设的训练好的基于残差网络的行人再识别模型中计算,从而获取所述残差网络中的第四个残差块输出的特征图像,其中,所述行人再识别模型基于行人图像,以及与行人图像关联的识别结果的样本数据训练而成的步骤之前,包括:
    将所述指定行人的图像划分为多个区域,将每个区域的图像数据与预设的眼睛图像数据进行对比,得到每个区域图像数据与眼睛图像数据的差值,将差值不超过预设数值的区域记为眼睛区域;
    将每个区域的图像数据与预设的嘴巴图像数据进行比较,得到每个区域图像数据与嘴巴图像数据的差值,将差值不超过预设数值的区域记为嘴巴区域;
    调用标准面部图像,并通过等比例缩小或者放大操作,使所述标准面部图像中的眼睛区域与所述指定行人的图像中的眼睛区域重合,同时使所述标准面部图像中的嘴巴区域与所述指定行人的图像中的嘴巴区域重合,再将所述指定行人的图像中与经过所述等比例缩小或者放大操作后的标准面部图像重叠的区域记为面部区域,并将所述面部区域范围内的图像作为面部图像;
    采用预设的图像相似度计算方法,计算所述面部图像与预存的目标面部图像的相似度值,并判断所述相似度值是否大于预设的相似度阈值;
    若所述相似度值不大于预设的相似度阈值,则生成行人再识别模型计算指令,其中所述行人再识别模型计算指令用于指示将所述指定行人的图像输入预设的训练好的基于残差网络的行人再识别模型中计算。
  3. 根据权利要求1所述的基于残差网络的信息识别方法,其特征在于,所述将所述指定行人的图像输入预设的训练好的基于残差网络的行人再识别模型中计算,从而获取所述残差网络中的第四个残差块输出的特征图像,其中,所述行人再识别模型基于行人图像,以及与行人图像关联的识别结果的样本数据训练而成的步骤之前,包括:
    获取指定量的样本数据,并将样本数据分成训练集和测试集;其中,所述样本数据包括行人图像,以及与行人图像关联的识别结果;
    将训练集的样本数据输入到基于残差网络的初始行人再识别模型中进行训练;其中,训练的过程中采用随机梯度下降法,得到结果训练模型;
    利用所述测试集的样本数据验证所述结果训练模型;
    如果验证通过,则将所述结果训练模型记为所述基于残差网络的行人再识别模型。
  4. 根据权利要求1所述的基于残差网络的信息识别方法,其特征在于,所述将所述指定行人的图像输入预设的训练好的基于残差网络的行人再识别模型中计算,从而获取所述残差网络中的第四个残差块输出的特征图像,其中,所述行人再识别模型基于行人图像,以及与行人图像关联的识别结果的样本数据训练而成的步骤之前,包括:
    获取已经训练完成的残差网络模型的各层权重参数;
    将所述各层权重参数初始化为初始行人再识别模型中的残差网络中的各层初始权重参数;
    利用测试集的样本数据验证所述初始行人再识别模型,其中,所述样本数据包括行人图像,以及与行人图像关联的识别结果;
    若验证通过,则将所述初始行人再识别模型记为所述基于残差网络的行人再识别模型。
  5. 根据权利要求1所述的基于残差网络的信息识别方法,其特征在于,所述将所述特征图像输入所述行人再识别模型中预设的全局识别子网络中计算,从而获得所述全局识别子网络输出的全局子数据的步骤,包括:
    通过所述全局识别子网络在所述特征图像中提取指定数据,并判断所述指定数据的数值是否在预设的数值范围之内,其中所述指定数据至少包括人体轮廓、人体肤色或者衣着颜色;
    若所述指定数据的数值不在预设的数值范围之内,则将所述指定数据作为全局子数据,并输出所述全局子数据。
  6. 根据权利要求1所述的基于残差网络的信息识别方法,其特征在于,所述将所述特征图像输入所述行人再识别模型中的预设的局部识别子网络中计算,从而获得所述局部识别子网络输出的局部子数据的步骤,包括:
    通过所述局部识别子网络,采用预设的区块划分方法将所述特征图像划分为多个区块;
    在各个所述区块中分别提取指定数据,并判断所述指定数据的数值是否在预设的数值范围之内,其中所述指定数据至少包括局部轮廓、局部肤色、或者局部衣着颜色;
    若所述指定数据的数值不在预设的数值范围之内,则将所述指定数据作为局部子数据,并输出所述局部子数据。
  7. 根据权利要求1所述的基于残差网络的信息识别方法,其特征在于,所述将所述主数据、所述全局子数据和所述局部子数据输入所述行人再识别模型中预设的全连接层中进行计算,从而获得所述全连接层输出的行人再识别结果的步骤,包括:
    采用预设的映射方法,通过所述全连接层将所述主数据、所述全局子数据和所述局部子数据映射为一个固定长度的特征向量;
    根据预设的分向量与识别结果对应关系,输出所述特征向量中数值最大的分向量对应的识别结果。
  8. 一种基于残差网络的信息识别装置,其特征在于,包括:
    指令获取单元,用于获取行人再识别的指令,其中所述行人再识别的指令携带有待识别的指定行人的图像;
    特征图像获取单元,用于将所述指定行人的图像输入预设的训练好的基于残差网络的行人再识别模型中计算,从而获取所述残差网络中的第四个残差块输出的特征图像,其中,所述行人再识别模型基于行人图像,以及与行人图像关联的识别结果的样本数据训练而成;
    数据获取单元,用于将所述特征图像输入所述残差网络中的第五个残差块中计算,从而获得所述第五个残差块输出的主数据;以及并行地将所述特征图像输入所述行人再识别模型中预设的全局识别子网络中计算,从而获得所述全局识别子网络输出的全局子数据;以及并行地将所述特征图像输入所述行人再识别模型中的预设的局部识别子网络中计算,从而获得所述局部识别子网络输出的局部子数据;
    行人再识别结果获取单元,用于将所述主数据、所述全局子数据和所述局部子数据输入所述行人再识别模型中预设的全连接层中进行计算,从而获得所述全连接层输出的行人再识别结果。
  9. 根据权利要求8所述的基于残差网络的信息识别装置,其特征在于,所述指定行人的图像包括面部区域,所述装置,包括:
    眼睛区域标记单元,用于将所述指定行人的图像划分为多个区域,将每个区域的图像数据与预设的眼睛图像数据进行对比,得到每个区域图像数据与眼睛图像数据的差值,将差值不超过预设数值的区域记为眼睛区域;
    嘴巴区域标记单元,用于将每个区域的图像数据与预设的嘴巴图像数据进行比较,得到每个区域图像数据与嘴巴图像数据的差值,将差值不超过预设数值的区域记为嘴巴区域;
    面部图像获取单元,用于调用标准面部图像,并通过等比例缩小或者放大操作,使所述标准面部图像中的眼睛区域与所述指定行人的图像中的眼睛区域重合,同时使所述标准面部图像中的嘴巴区域与所述指定行人的图像中的嘴巴区域重合,再将所述指定行人的图像中与经过所述等比例缩小或者放大操作后的标准面部图像重叠的区域记为面部区域,并将所述面部区域范围内的图像作为面部图像;
    相似度值计算单元,用于采用预设的图像相似度计算方法,计算所述面部图像与预存的目标面部图像的相似度值,并判断所述相似度值是否大于预设的相似度阈值;
    计算指令生成单元,用于若所述相似度值不大于预设的相似度阈值,则生成行人再识别模型计算指令,其中所述行人再识别模型计算指令用于指示将所述指定行人的图像输入预设的训练好的基于残差网络的行人再识别模型中计算。
  10. 根据权利要求8所述的基于残差网络的信息识别装置,其特征在于,所述装置,包括:
    样本数据获取单元,用于获取指定量的样本数据,并将样本数据分成训练集和测试集;其中,所述样本数据包括行人图像,以及与行人图像关联的识别结果;
    训练单元,用于将训练集的样本数据输入到基于残差网络的初始行人再识别模型中进行训练;其中,训练的过程中采用随机梯度下降法,得到结果训练模型;
    验证单元,用于利用所述测试集的样本数据验证所述结果训练模型;
    模型标记单元,用于如果验证通过,则将所述结果训练模型记为所述基于残差网络的行人再识别模型。
  11. 根据权利要求8所述的基于残差网络的信息识别装置,其特征在于,所述装置,包括:
    权重参数获取单元,用于获取已经训练完成的残差网络模型的各层权重参数;
    初始化单元,用于将所述各层权重参数初始化为初始行人再识别模型中的残差网络中的各层初始权重参数;
    模型验证单元,用于利用测试集的样本数据验证所述初始行人再识别模型,其中,所述样本数据包括行人图像,以及与行人图像关联的识别结果;
    行人再识别模型标记单元,用于若验证通过,则将所述初始行人再识别模型记为所述基于残差网络的行人再识别模型。
  12. 根据权利要求8所述的基于残差网络的信息识别装置,其特征在于,所述数据获取单元,包括:
    指定数据提取子单元,用于通过所述全局识别子网络在所述特征图像中提取指定数据,并判断所述指定数据的数值是否在预设的数值范围之内,其中所述指定数据至少包括人体轮廓、人体肤色或者衣着颜色;
    全局子数据输出子单元,用于若所述指定数据的数值不在预设的数值范围之内,则将所述指定数据作为全局子数据,并输出所述全局子数据。
  13. 根据权利要求8所述的基于残差网络的信息识别装置,其特征在于,所述数据获取单元,包括:
    区块划分子单元,用于通过所述局部识别子网络,采用预设的区块划分方法将所述特征图像划分为多个区块;
    数据提取子单元,用于在各个所述区块中分别提取指定数据,并判断所述指定数据的数值是否在预设的数值范围之内,其中所述指定数据至少包括局部轮廓、局部肤色、或者局部衣着颜色;
    局部子数据输出子单元,用于若所述指定数据的数值不在预设的数值范围之内,则将所述指定数据作为局部子数据,并输出所述局部子数据。
  14. 根据权利要求8所述的基于残差网络的信息识别装置,其特征在于,所述行人再识别结果获取单元,包括:
    映射子单元,用于采用预设的映射方法,通过所述全连接层将所述主数据、所述全局子数据和所述局部子数据映射为一个固定长度的特征向量;
    识别结果输出子单元,用于根据预设的分向量与识别结果对应关系,输出所述特征向量中数值最大的分向量对应的识别结果。
  15. 一种计算机设备,包括存储器和处理器,所述存储器存储有计算机程序,其特征在于,所述处理器执行所述计算机程序时实现基于残差网络的信息识别方法,所述基于残差网络的信息识别方法,包括:
    获取行人再识别的指令,其中所述行人再识别的指令携带有待识别的指定行人的图像;
    将所述指定行人的图像输入预设的训练好的基于残差网络的行人再识别模型中计算,从而获取所述残差网络中的第四个残差块输出的特征图像,其中,所述行人再识别模型基于行人图像,以及与行人图像关联的识别结果的样本数据训练而成;
    将所述特征图像输入所述残差网络中的第五个残差块中计算,从而获得所述第五个残差块输出的主数据;以及并行地将所述特征图像输入所述行人再识别模型中预设的全局识别子网络中计算,从而获得所述全局识别子网络输出的全局子数据;以及并行地将所述特征图像输入所述行人再识别模型中的预设的局部识别子网络中计算,从而获得所述局部识别子网络输出的局部子数据;
    将所述主数据、所述全局子数据和所述局部子数据输入所述行人再识别模型中预设的全连接层中进行计算,从而获得所述全连接层输出的行人再识别结果。
  16. 根据权利要求15所述的计算机设备,其特征在于,所述指定行人的图像包括面部区域,所述将所述指定行人的图像输入预设的训练好的基于残差网络的行人再识别模型中计算,从而获取所述残差网络中的第四个残差块输出的特征图像,其中,所述行人再识别模型基于行人图像,以及与行人图像关联的识别结果的样本数据训练而成的步骤之前,包括:
    将所述指定行人的图像划分为多个区域,将每个区域的图像数据与预设的眼睛图像数据进行对比,得到每个区域图像数据与眼睛图像数据的差值,将差值不超过预设数值的区域记为眼睛区域;
    将每个区域的图像数据与预设的嘴巴图像数据进行比较,得到每个区域图像数据与嘴巴图像数据的差值,将差值不超过预设数值的区域记为嘴巴区域;
    调用标准面部图像,并通过等比例缩小或者放大操作,使所述标准面部图像中的眼睛区域与所述指定行人的图像中的眼睛区域重合,同时使所述标准面部图像中的嘴巴区域与所述指定行人的图像中的嘴巴区域重合,再将所述指定行人的图像中与经过所述等比例缩小或者放大操作后的标准面部图像重叠的区域记为面部区域,并将所述面部区域范围内的图像作为面部图像;
    采用预设的图像相似度计算方法,计算所述面部图像与预存的目标面部图像的相似度值,并判断所述相似度值是否大于预设的相似度阈值;
    若所述相似度值不大于预设的相似度阈值,则生成行人再识别模型计算指令,其中所述行人再识别模型计算指令用于指示将所述指定行人的图像输入预设的训练好的基于残差网络的行人再识别模型中计算。
  17. 根据权利要求15所述的计算机设备,其特征在于,所述将所述指定行人的图像输入预设的训练好的基于残差网络的行人再识别模型中计算,从而获取所述残差网络中的第四个残差块输出的特征图像,其中,所述行人再识别模型基于行人图像,以及与行人图像关联的识别结果的样本数据训练而成的步骤之前,包括:
    获取指定量的样本数据,并将样本数据分成训练集和测试集;其中,所述样本数据包括行人图像,以及与行人图像关联的识别结果;
    将训练集的样本数据输入到基于残差网络的初始行人再识别模型中进行训练;其中,训练的过程中采用随机梯度下降法,得到结果训练模型;
    利用所述测试集的样本数据验证所述结果训练模型;
    如果验证通过,则将所述结果训练模型记为所述基于残差网络的行人再识别模型。
  18. 一种计算机可读存储介质,其上存储有计算机程序,其特征在于,所述计算机程序被处理器执行时实现基于残差网络的信息识别方法,所述基于残差网络的信息识别方法,包括:
    获取行人再识别的指令,其中所述行人再识别的指令携带有待识别的指定行人的图像;
    将所述指定行人的图像输入预设的训练好的基于残差网络的行人再识别模型中计算,从而获取所述残差网络中的第四个残差块输出的特征图像,其中,所述行人再识别模型基于行人图像,以及与行人图像关联的识别结果的样本数据训练而成;
    将所述特征图像输入所述残差网络中的第五个残差块中计算,从而获得所述第五个残差块输出的主数据;以及并行地将所述特征图像输入所述行人再识别模型中预设的全局识别子网络中计算,从而获得所述全局识别子网络输出的全局子数据;以及并行地将所述特征图像输入所述行人再识别模型中的预设的局部识别子网络中计算,从而获得所述局部识别子网络输出的局部子数据;
    将所述主数据、所述全局子数据和所述局部子数据输入所述行人再识别模型中预设的全连接层中进行计算,从而获得所述全连接层输出的行人再识别结果。
  19. 根据权利要求18所述的计算机可读存储介质,其特征在于,所述指定行人的图像包括面部区域,所述将所述指定行人的图像输入预设的训练好的基于残差网络的行人再识别模型中计算,从而获取所述残差网络中的第四个残差块输出的特征图像,其中,所述行人再识别模型基于行人图像,以及与行人图像关联的识别结果的样本数据训练而成的步骤之前,包括:
    将所述指定行人的图像划分为多个区域,将每个区域的图像数据与预设的眼睛图像数据进行对比,得到每个区域图像数据与眼睛图像数据的差值,将差值不超过预设数值的区域记为眼睛区域;
    将每个区域的图像数据与预设的嘴巴图像数据进行比较,得到每个区域图像数据与嘴巴图像数据的差值,将差值不超过预设数值的区域记为嘴巴区域;
    调用标准面部图像,并通过等比例缩小或者放大操作,使所述标准面部图像中的眼睛区域与所述指定行人的图像中的眼睛区域重合,同时使所述标准面部图像中的嘴巴区域与所述指定行人的图像中的嘴巴区域重合,再将所述指定行人的图像中与经过所述等比例缩小或者放大操作后的标准面部图像重叠的区域记为面部区域,并将所述面部区域范围内的图像作为面部图像;
    采用预设的图像相似度计算方法,计算所述面部图像与预存的目标面部图像的相似度值,并判断所述相似度值是否大于预设的相似度阈值;
    若所述相似度值不大于预设的相似度阈值,则生成行人再识别模型计算指令,其中所述行人再识别模型计算指令用于指示将所述指定行人的图像输入预设的训练好的基于残差网络的行人再识别模型中计算。
  20. 根据权利要求18所述的计算机可读存储介质,其特征在于,所述将所述指定行人的图像输入预设的训练好的基于残差网络的行人再识别模型中计算,从而获取所述残差网络中的第四个残差块输出的特征图像,其中,所述行人再识别模型基于行人图像,以及与行人图像关联的识别结果的样本数据训练而成的步骤之前,包括:
    获取指定量的样本数据,并将样本数据分成训练集和测试集;其中,所述样本数据包括行人图像,以及与行人图像关联的识别结果;
    将训练集的样本数据输入到基于残差网络的初始行人再识别模型中进行训练;其中,训练的过程中采用随机梯度下降法,得到结果训练模型;
    利用所述测试集的样本数据验证所述结果训练模型;
    如果验证通过,则将所述结果训练模型记为所述基于残差网络的行人再识别模型。
PCT/CN2019/118803 2019-07-30 2019-11-15 Information recognition method and apparatus based on residual network, and computer device WO2021017316A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910696302.X 2019-07-30
CN201910696302.XA CN110543823B (zh) 2019-07-30 2019-07-30 Pedestrian re-identification method and apparatus based on residual network, and computer device

Publications (1)

Publication Number Publication Date
WO2021017316A1 (zh) 2021-02-04

Family

ID=68710475

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/118803 WO2021017316A1 (zh) 2019-07-30 2019-11-15 Information recognition method and apparatus based on residual network, and computer device

Country Status (2)

Country Link
CN (1) CN110543823B (zh)
WO (1) WO2021017316A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
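CN113962303A (zh) * 2021-10-21 2022-01-21 哈尔滨工程大学 Underwater test environment inversion method and system based on density fusion
CN114937150A (zh) * 2022-05-20 2022-08-23 电子科技大学 Unmanned aerial vehicle target recognition method based on deep threshold residual network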

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
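CN111553213B (zh) * 2020-04-17 2022-09-20 大连理工大学 Real-time distributed identity-aware pedestrian attribute recognition method in mobile edge cloud
CN115631509B (zh) * 2022-10-24 2023-05-26 智慧眼科技股份有限公司 Pedestrian re-identification method, apparatus, computer device and storage medium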

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
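WO2018196396A1 (zh) * 2017-04-24 2018-11-01 清华大学 Pedestrian re-identification method based on consistency-constrained feature learning
CN108960114A (zh) * 2018-06-27 2018-12-07 腾讯科技(深圳)有限公司 Human body recognition method and apparatus, computer-readable storage medium and electronic device
CN108960127A (zh) * 2018-06-29 2018-12-07 厦门大学 Occluded pedestrian re-identification method based on adaptive deep metric learning
CN108960140A (zh) * 2018-07-04 2018-12-07 国家新闻出版广电总局广播科学研究院 Pedestrian re-identification method based on multi-region feature extraction and fusion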

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107633204B (zh) * 2017-08-17 2019-01-29 平安科技(深圳)有限公司 Face occlusion detection method, apparatus and storage medium

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
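CN113962303A (zh) * 2021-10-21 2022-01-21 哈尔滨工程大学 Underwater test environment inversion method and system based on density fusion
CN113962303B (zh) * 2021-10-21 2024-04-26 哈尔滨工程大学 Underwater test environment inversion method and system based on density fusion
CN114937150A (zh) * 2022-05-20 2022-08-23 电子科技大学 Unmanned aerial vehicle target recognition method based on deep threshold residual network
CN114937150B (zh) * 2022-05-20 2023-04-07 电子科技大学 Unmanned aerial vehicle target recognition method based on deep threshold residual network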

Also Published As

Publication number Publication date
CN110543823A (zh) 2019-12-06
CN110543823B (zh) 2024-03-19

Similar Documents

Publication Publication Date Title
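CN110599451B (zh) Medical image lesion detection and localization method, apparatus, device and storage medium
WO2021017316A1 (zh) Information recognition method and apparatus based on residual network, and computer device
WO2019227616A1 (zh) Animal identity recognition method and apparatus, computer device and storage medium
CN112232117A (zh) Face recognition method, apparatus and storage medium
CN111340008B (zh) Adversarial patch generation, detection model training, and adversarial patch defense methods and systems
CN110941986A (zh) Training method and apparatus for liveness detection model, computer device and storage medium
CN106022317A (zh) Face recognition method and apparatus
CN105678253B (zh) Semi-supervised face age estimation apparatus and semi-supervised face age estimation method
CN110321870B (zh) LSTM-based palm vein recognition method
CN111310624A (zh) Occlusion recognition method, apparatus, computer device and storage medium
CN112446302A (zh) Human pose detection method, system, electronic device and storage medium
CN111832581B (zh) Lung feature recognition method, apparatus, computer device and storage medium
CN110807491A (zh) License plate image sharpness model training method, sharpness detection method and apparatus
WO2021047190A1 (zh) Alarm method and apparatus based on residual network, computer device and storage medium
WO2021000832A1 (zh) Face matching method and apparatus, computer device and storage medium
CN112884782B (zh) Biological object segmentation method, apparatus, computer device and storage medium
CN116311549A (zh) Living object recognition method, device and computer-readable storage medium
CN112818821B (zh) Method and apparatus for detecting face acquisition source based on visible and infrared light
CN108875505A (zh) Neural-network-based pedestrian re-identification method and apparatus
CN115984930A (zh) Micro-expression recognition method and apparatus, and training method for micro-expression recognition model
CN111144285A (zh) Body fatness level recognition method, apparatus, device and medium
CN110321871B (zh) LSTM-based palm vein recognition system and method
CN113705685A (zh) Disease feature recognition model training and disease feature recognition methods, apparatus and device
CN110163151B (zh) Face model training method and apparatus, computer device and storage medium
CN111582155A (zh) Liveness detection method and apparatus, computer device and storage medium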

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 19939250

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 EP: PCT application non-entry in European phase

Ref document number: 19939250

Country of ref document: EP

Kind code of ref document: A1