WO2021082078A1 - Pedestrian re-identification method, device, computer equipment and readable storage medium - Google Patents

Pedestrian re-identification method, device, computer equipment and readable storage medium

Info

Publication number
WO2021082078A1
WO2021082078A1 (PCT/CN2019/118020)
Authority
WO
WIPO (PCT)
Prior art keywords
neural network
extraction module
image
recognition
identification
Prior art date
Application number
PCT/CN2019/118020
Other languages
English (en)
French (fr)
Inventor
任逍航
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司 filed Critical 平安科技(深圳)有限公司
Publication of WO2021082078A1 publication Critical patent/WO2021082078A1/zh

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/103Static body considered as a whole, e.g. static pedestrian or occupant recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/52Surveillance or monitoring of activities, e.g. for recognising suspicious objects

Definitions

  • This application relates to the field of image analysis, in particular to pedestrian re-identification methods, devices, computer equipment and readable storage media.
  • Pedestrian re-identification in video surveillance is widely used in various fields of video analysis, and plays a central role in many application scenarios such as smart security, smart education, and smart media.
  • Pedestrian re-identification can quickly find a target pedestrian across multiple cameras, thereby effectively improving customer experience, enhancing social safety and stability, and reducing the labor cost and analysis time of video analysis.
  • The current pedestrian re-identification method mainly extracts all features of the pedestrian's overall appearance and analyzes and recognizes all of those features to find the target pedestrian.
  • Because the existing pedestrian re-identification technology needs to extract all of the pedestrian's features during recognition, its accuracy is limited.
  • this application provides a pedestrian re-identification method, which includes the following steps:
  • The weight group of the first neural network is orthogonal to the weight group of the second neural network.
  • this application also provides a pedestrian re-identification device, including:
  • the receiving unit is configured to receive at least one image to be recognized
  • a recognition unit configured to recognize the image to be recognized through the first neural network and the second neural network of the target recognition model, and obtain a recognition result
  • The weight group of the first neural network is orthogonal to the weight group of the second neural network.
  • The present application also provides a computer device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor; when the processor executes the computer program, the following steps of the pedestrian re-identification method are implemented:
  • The weight group of the first neural network is orthogonal to the weight group of the second neural network.
  • The present application also provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the following steps of the pedestrian re-identification method are implemented:
  • The weight group of the first neural network is orthogonal to the weight group of the second neural network.
  • The pedestrian re-identification method, device, computer equipment, and readable storage medium extract the global features of the object to be identified through the first neural network of the target recognition model, and extract the unique features of that object through the second neural network of the target recognition model.
  • The global features and the unique features are combined to identify the object to be identified, which improves the accuracy of recognition and achieves the purpose of quickly and accurately locking onto the target object through the target recognition model.
  • FIG. 1 is a method flowchart of an embodiment of the pedestrian re-identification method described in this application;
  • FIG. 2 is a flowchart of an embodiment of the method for training the recognition model to obtain the target recognition model in this application;
  • FIG. 3 is a flowchart of an embodiment of a method for training the third neural network;
  • FIG. 4 is a flowchart of an embodiment of a method for training the fourth neural network;
  • FIG. 5 is a schematic diagram of the principle of pedestrian re-identification;
  • FIG. 6 is a flowchart of a method for recognizing an image to be recognized;
  • FIG. 7 is a block diagram of an embodiment of the pedestrian re-identification device described in this application.
  • FIG. 8 is a schematic diagram of the hardware architecture of an embodiment of the computer device described in this application.
  • the pedestrian re-identification method, device, computer equipment, and readable storage medium provided in this application are suitable for insurance, security and other business fields, and provide a pedestrian re-identification method that can quickly and accurately identify target objects for security systems and monitoring systems.
  • This application extracts the global features of the object to be recognized through the first neural network of the target recognition model, and extracts the unique features of the object to be recognized (features distinct from the global features) through the second neural network of the target recognition model; the global features and the unique features are combined to recognize the object, thereby improving recognition accuracy and achieving the purpose of quickly and accurately locking onto the target object through the target recognition model.
  • a pedestrian re-identification method of this embodiment includes the following steps:
  • The pedestrian re-identification method can be applied to monitoring, safety, or security systems, and the image to be identified can be received through a collection device (for example, a surveillance camera, a camera, or a mobile terminal with a camera function).
  • The weight group of the first neural network is orthogonal to the weight group of the second neural network, which ensures that the global features extracted by the first neural network are uncorrelated with the unique features extracted by the second neural network.
  • Taking the external characteristics of pedestrians as an example, global features characterize the head, torso, and limbs of a pedestrian and are features that every pedestrian has. Compared with global features, unique features can be rich and diverse; they characterize features of a pedestrian that differ from the global features, such as a distinctive hairstyle, a bright hair color, a special clothing style, decorations, or patterns. Unique features can be used to locate the target object faster and more accurately.
  • In this step, the global features of the object to be recognized are extracted through the first neural network, the unique features of the object to be recognized (features distinct from the global features) are extracted through the second neural network of the target recognition model, and the global features and the unique features are combined to recognize the object, thereby improving the accuracy of recognition.
  • Before step S2 (recognizing the image to be recognized through the first neural network and the second neural network of the target recognition model and obtaining the recognition result), the method may further include (refer to FIG. 2):
  • A1. Obtain at least one target image and at least one sample image
  • each sample image only represents one sample object, and the shape of the target object in the target image is different from the shape of the sample object in all sample images.
  • A2. Train the third neural network and the fourth neural network of the recognition model with the target image and at least one sample image, with the weight group of the third neural network orthogonal to the weight group of the fourth neural network, to obtain the target recognition model.
  • The orthogonality between the weight group of the third neural network and the weight group of the fourth neural network ensures that the global features extracted by the third neural network are uncorrelated with the unique features extracted by the fourth neural network.
  • the global feature of the sample object in the sample image is extracted through the third neural network
  • the unique feature of the sample object in the sample image is extracted through the fourth neural network
  • The global features and the unique features are combined to recognize the object to be recognized, thereby improving the accuracy of recognition.
  • The third neural network includes a first extraction module, a first classification module, and a first classifier.
  • The fourth neural network includes a second extraction module, a third extraction module, a second classification model, and a second classifier.
  • the first extraction module includes at least one convolutional layer; the first extraction module may also include at least one down-sampling layer.
  • The first classification module may use a fully connected layer; the first classifier may use a Softmax classifier; the second extraction module may include at least one convolutional layer and may also include at least one down-sampling layer; the third extraction module may include at least one convolutional layer and may also include at least one down-sampling layer; the second classification model may use a fully connected layer; the second classifier may use a Softmax classifier.
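  • As an illustration only, the following PyTorch-style sketch shows one possible arrangement of the modules described above; the channel counts, kernel sizes, pooling, and number of identity classes are assumptions, not values taken from this application.

```python
import torch
import torch.nn as nn

class GlobalBranch(nn.Module):
    """Third/first neural network: extraction module Cg + classification module Mg."""
    def __init__(self, in_channels=3, feat_channels=64, num_classes=1000):
        super().__init__()
        self.cg = nn.Conv2d(in_channels, feat_channels, kernel_size=3, padding=1)  # first extraction module
        self.pool = nn.AdaptiveAvgPool2d(1)                                        # optional down-sampling
        self.mg = nn.Linear(feat_channels, num_classes)                            # first classification module

    def forward(self, x):
        f = self.cg(x)                                 # first-type (global) feature data
        logits = self.mg(self.pool(f).flatten(1))      # first identification data
        return f, logits

class UniqueBranch(nn.Module):
    """Fourth/second neural network: Cs + Cp (position mask) + Ms."""
    def __init__(self, in_channels=3, feat_channels=64, num_classes=1000):
        super().__init__()
        self.cs = nn.Conv2d(in_channels, feat_channels, kernel_size=3, padding=1)  # second extraction module
        self.cp = nn.Conv2d(feat_channels, 1, kernel_size=1)                       # third extraction module -> mask Y
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.ms = nn.Linear(feat_channels, num_classes)                            # second classification model

    def forward(self, x):
        f = self.cs(x)                                 # second-type (unique) feature data
        y = torch.sigmoid(self.cp(f))                  # position mask Y, one value per spatial location
        masked = f * y                                 # multiply mask with unique features
        logits = self.ms(self.pool(masked).flatten(1)) # second identification data
        return f, y, logits
```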
  • In step A2, the step of training the third neural network of the recognition model with the target image and at least one sample image includes (refer to FIG. 3):
  • A201 Extract the first-type feature data of the at least one sample image through the first extraction module
  • the first extraction module includes at least one convolutional layer; the first extraction module may also include at least one down-sampling layer.
  • the first-type feature data (global feature) of each sample image is extracted through the convolutional layer.
  • The following description takes the first extraction module including one convolutional layer as an example. The convolution window weights of the convolutional layer Cg of the first extraction module are denoted Wg = {Wg1, Wg2, …, Wgi, …, Wgn}, where i = 1, 2, …, n and n is the number of output channels of Cg; Wgi = {wgi1, wgi2, …, wgij, …, wgim}, where j = 1, 2, …, m and m is the number of input channels of Cg.
  • A202 Identify the first type of feature data through the first classification module to obtain first identification data
  • the first type of feature data is identified by the first classification module, the category corresponding to each element in the first type of feature data is obtained, and the first identification data is generated according to the obtained category information.
  • the first classification module may adopt a fully connected layer.
  • A203 Train the gradient of at least one sample image through the first classifier according to the first identification data, and update the parameter values of the first extraction module and the first classification module according to the gradient.
  • the parameter values in the first extraction module and the first classification module are adjusted according to the gradient, so as to achieve the purpose of training the first neural network.
  • the first classifier may adopt a Softmax classifier.
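  • A minimal sketch of one training step for the third neural network is shown below, assuming a cross-entropy objective behind the Softmax classifier and a standard optimizer such as torch.optim.SGD; the data loading and label format are placeholders.

```python
import torch.nn.functional as F

def train_global_step(model, optimizer, images, labels):
    """One training step for the global (third) network: Cg + Mg + Softmax classifier."""
    model.train()
    _, logits = model(images)                 # first identification data
    loss = F.cross_entropy(logits, labels)    # Softmax classifier with negative log-likelihood
    optimizer.zero_grad()
    loss.backward()                           # gradient over the sample images
    optimizer.step()                          # update Cg and Mg parameter values
    return loss.item()
```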
  • In step A2, the step of training the fourth neural network of the recognition model with the target image and at least one sample image includes (refer to FIG. 4):
  • A211. Extract the second-type feature data of the at least one sample image through the second extraction module, where the weight group in the second extraction module is orthogonal to the weight group in the first extraction module;
  • The second-type feature data (unique features) of each sample image is extracted by the second extraction module.
  • the second extraction module may include at least one convolutional layer; the second extraction module may also include at least one down-sampling layer.
  • The following description takes the second extraction module including one convolutional layer Cs as an example.
  • The convolutional layer Cs is used to extract the second-type feature data from the sample image for subsequent analysis. The convolution window weights of Cs are denoted Ws = {Ws1, Ws2, …, Wsn}, where n is the number of output channels of Cs; Wsn = {wsn1, wsn2, …, wsnm}, where m is the number of input channels of Cs.
  • the dimension of the convolutional layer Cs and the dimension of the convolutional layer Cg need to be consistent.
  • Since unique features appear in only a small number of pedestrians, while global features are possessed by all pedestrians, the second-type feature data and the first-type feature data can be considered completely uncorrelated.
  • Therefore, the weight groups of the two completely uncorrelated convolutional layers are orthogonal; that is, for each input channel, the sum of the products of its corresponding weights is zero.
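  • The application only states this constraint and that a loss function Lc keeps the Cs weights distinct from the Cg weights; the exact form of Lc is not reproduced here, so the penalty below, which drives the per-input-channel inner products of the Cg and Cs weights toward zero, is only an assumed surrogate.

```python
import torch

def orthogonality_loss(wg: torch.Tensor, ws: torch.Tensor) -> torch.Tensor:
    """Assumed surrogate for Lc: push the per-input-channel products of the Cg and
    Cs convolution weights toward zero.

    wg, ws: convolution weights of shape (out_channels, in_channels, kH, kW); the
    two layers must have matching dimensions, as required in the text.
    """
    wg_flat = wg.flatten(2)                              # (n, m, kH*kW)
    ws_flat = ws.flatten(2)
    # Sum of wg * ws over output channels and kernel positions, per input channel j.
    per_channel = (wg_flat * ws_flat).sum(dim=(0, 2))    # shape (m,)
    return (per_channel ** 2).sum()
```

  • In training, such a term would typically be added to the classification loss with a weighting coefficient, for example loss = ce_loss + lam * orthogonality_loss(cg.weight, cs.weight); the weighting is likewise an assumption.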
  • A212 Process the second feature data by the third extraction module to obtain location information of the second feature data
  • the third extraction module may include at least one convolutional layer; the third extraction module may also include at least one down-sampling layer.
  • the third extraction module includes a convolutional layer Cp as an example for the following description:
  • the convolutional layer Cp analyzes the second feature data output by the convolutional layer Cs to obtain a unique feature location mask Y (ie, location information), which provides a location reference for subsequent unique feature extraction.
  • Y is an h × w matrix, where h and w are each 1/s of the length and width of the original image and s is the downsampling coefficient of the first neural network.
  • Because a person's unique features are rare and appear in only a small fraction of pedestrian images, a loss function Lp based on analysis of the whole batch of data is used to ensure this rarity.
  • In this loss, Q(Y) represents the degree of relevance of the position information, max(Y) represents the largest element value in the matrix Y, B represents the number of samples in a batch of data, and N is a preset constant. To keep the rarity loss stable, B is generally greater than 64 and N is generally 25.
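  • The exact form of the rarity loss Lp is not reproduced in this text; the sketch below only computes the quantities it is described as depending on, namely the per-image indicator Q(Y) and the batch size B, with the threshold used for Q(Y) taken as an assumed parameter.

```python
import torch

def position_mask_stats(y: torch.Tensor, threshold: float = 0.5):
    """y: position masks for a batch, shape (B, 1, h, w), where h and w are the
    original image height/width divided by the downsampling coefficient s.

    Returns the per-image Q(Y) indicators (1 if max(Y) >= threshold, else 0) and
    the batch size B, the quantities the rarity loss Lp is described as using."""
    b = y.shape[0]                                   # B: number of samples in the batch
    max_per_image = y.flatten(1).max(dim=1).values   # max(Y) for each image
    q = (max_per_image >= threshold).float()         # Q(Y) as described in step A214
    return q, b
```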
  • A213. Use the second classification model for identification according to the location information and the second type of feature data, and obtain second identification data;
  • The location information and the second-type feature data are multiplied and input into the second classification model for identification, the category corresponding to each element of the second-type feature data is obtained, and the location-related second identification data is generated according to the obtained category information.
  • the second classification model can adopt a fully connected layer.
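  • A minimal sketch of step A213 follows, assuming the mask is broadcast over the feature channels and that the masked features are pooled to a vector before the fully connected layer Ms; the pooling choice is an assumption.

```python
import torch
import torch.nn as nn

def classify_unique(features: torch.Tensor, mask: torch.Tensor, ms: nn.Linear) -> torch.Tensor:
    """Step A213: multiply the location information (mask Y) with the second-type
    feature data, then feed the result to the fully connected layer Ms.

    features: (B, C, h, w) unique-feature maps from Cs
    mask:     (B, 1, h, w) position mask Y from Cp
    ms:       fully connected second classification model
    """
    weighted = features * mask              # location-weighted unique features
    pooled = weighted.mean(dim=(2, 3))      # assumed global pooling to a (B, C) vector
    return ms(pooled)                       # second identification data (logits)
```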
  • A214. Train the gradient of at least one sample image through the second classifier based on the correlation of the location information and the second identification data, and update the parameter values of the second extraction module, the third extraction module, and the second classification model according to the gradient.
  • the parameter values in the second extraction module, the third extraction module, and the second classification model are adjusted according to the gradient, so as to achieve the purpose of training the second neural network.
  • the second classifier may adopt a Softmax classifier.
  • Further, in step A214, the step of training the gradient of at least one sample image through the second classifier according to the correlation of the position information and the second identification data, and updating the parameter values of the second extraction module, the third extraction module, and the second classification model according to the gradient, includes:
  • The correlation of the position information is obtained according to the relationship between the element with the largest value in the position information and a preset threshold: when that element is greater than or equal to the preset threshold, the correlation is 1; when it is less than the preset threshold, the correlation is 0;
  • When the correlation is 1, the gradient of at least one sample image is trained through the second classifier according to the second identification data, and the parameter values of the second extraction module, the third extraction module, and the second classification model are updated according to the gradient.
  • In other words, when Q(Y) = 1, the second classifier is used for learning; when Q(Y) = 0, no learning is performed.
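  • A sketch of this gated update follows, assuming a cross-entropy loss and per-sample gating by Q(Y); the threshold and optimizer are assumptions, not values from this application.

```python
import torch.nn.functional as F

def train_unique_step(branch, optimizer, images, labels, threshold=0.5):
    """One training step for the unique (fourth) network: Cs, Cp, and Ms are updated
    only from samples whose position mask is correlated (Q(Y) = 1)."""
    branch.train()
    _, mask, logits = branch(images)               # unique features, mask Y, second identification data
    q = (mask.flatten(1).max(dim=1).values >= threshold).float()   # Q(Y) per sample
    if q.sum().item() == 0:
        return 0.0                                 # correlation 0 for every sample: no parameter update
    per_sample = F.cross_entropy(logits, labels, reduction="none")
    loss = (per_sample * q).sum() / q.sum()        # only Q(Y) = 1 samples drive the gradient
    optimizer.zero_grad()
    loss.backward()                                # gradients flow to Cs, Cp, and Ms
    optimizer.step()
    return loss.item()
```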
  • the first extraction module of the third neural network is the convolutional layer Cg, and the first classification module is the fully connected layer Mg;
  • the second extraction module is the convolutional layer Cs, the third extraction module is the convolutional layer Cp, and the second classification module is the fully connected layer Ms.
  • The sample image is input into the convolutional layer Cg of the third neural network and the convolutional layer Cs of the fourth neural network. The first-type feature data of the sample image is extracted through the convolutional layer Cg and identified by the fully connected layer Mg to obtain the recognition result of the global features; the first classifier is used for training to obtain a gradient, and the parameter values of the first extraction module and the first classification module are updated based on the gradient. The second-type feature data of the sample image is extracted through the convolutional layer Cs, and the weight groups of the convolutional layer Cg and the convolutional layer Cs are kept orthogonal according to the loss function Lc; the second feature data is analyzed through the convolutional layer Cp to obtain the position mask Y of the second feature data; the position mask Y is multiplied with the second feature data and input into the fully connected layer Ms for identification to obtain the recognition result of the unique features; and the loss function Lp is computed based on the position mask Y to ensure the stability of the rarity of the second-type feature data.
  • When the correlation is 1, the gradient of at least one sample image is trained through the second classifier according to the second identification data, and the parameter values of the second extraction module, the third extraction module, and the second classification model are updated according to the gradient; when the correlation is 0, the parameter values are not updated.
  • The above process is the process of training the model.
  • It should be noted that when the trained target recognition model is applied to image recognition, if the correlation is 1, the recognition result of the unique features and the recognition result of the global features are merged and output as the final recognition result; if the correlation is 0, the recognition result of the global features is output as the final recognition result.
  • the first neural network includes a first extraction module and a first classification module
  • the second neural network includes: a second extraction module, a third extraction module, and a second classification model
  • Step S2 The step of recognizing the image to be recognized through the first neural network and the second neural network of the target recognition model, and obtaining the recognition result includes (refer to FIG. 6):
  • the first extraction module includes at least one convolutional layer; the first extraction module may also include at least one down-sampling layer.
  • the first-type feature data (global feature) of each sample image is extracted through the convolutional layer.
  • The following description takes the first extraction module including one convolutional layer as an example.
  • the first type of feature data is identified by the first classification module, the category corresponding to each element in the first type of feature data is obtained, and the first identification data is generated according to the obtained category information.
  • the first classification module may adopt a fully connected layer.
  • the second type of feature data (specific features) of each sample image is extracted by the second extraction module.
  • the second extraction module may include at least one convolutional layer; the second extraction module may also include at least one down-sampling layer.
  • The following description takes the second extraction module including one convolutional layer Cs as an example:
  • the convolutional layer Cs is used to extract the second type of feature data from the sample image for subsequent analysis.
  • the dimension of the convolutional layer Cs and the dimension of the convolutional layer Cg need to be consistent.
  • Since unique features appear in only a small number of pedestrians, while global features are possessed by all pedestrians, the second-type feature data and the first-type feature data can be considered completely uncorrelated.
  • Therefore, the weight groups of the two completely uncorrelated convolutional layers are orthogonal; that is, for each input channel, the sum of the products of its corresponding weights is zero.
  • the third extraction module may include at least one convolutional layer; the third extraction module may also include at least one down-sampling layer.
  • the third extraction module includes a convolutional layer Cp as an example for the following description:
  • the convolutional layer Cp analyzes the second feature data output by the convolutional layer Cs to obtain a unique feature location mask Y (ie, location information), which provides a location reference for subsequent unique feature extraction.
  • Y is an h × w matrix, where h and w are each 1/s of the length and width of the original image and s is the downsampling coefficient of the first neural network.
  • Q(Y) represents the degree of relevance of the position information, and max(Y) represents the largest element value in the matrix Y.
  • The location information and the second-type feature data are multiplied and input into the second classification model for identification, the category corresponding to each element of the second-type feature data is obtained, and the location-related second identification data is generated according to the obtained category information.
  • the second classification model can adopt a fully connected layer.
  • step S26 of obtaining the recognition result of the image to be recognized according to the correlation degree of the location information, the second identification data and the first identification data includes:
  • The correlation of the position information is obtained according to the relationship between the element with the largest value in the position information and a preset threshold: when that element is greater than or equal to the preset threshold, the correlation is 1; when it is less than the preset threshold, the correlation is 0;
  • When the correlation is 1, the second identification data and the first identification data are combined to generate third identification data, the recognition probability is calculated according to the third identification data, and the recognition result of the image to be recognized is obtained according to the recognition probability.
  • Specifically, a correlation of 1 means that the global features and the unique features are uncorrelated; during image recognition, the dimension of the first identification data (global features) and the dimension of the second identification data (unique features) can be merged to generate dimension-merged third identification data.
  • The recognition probability is calculated from the third identification data, and based on that probability it is determined whether the object in the image to be recognized matches the object in the target image, generating a matching result (the recognition result).
  • When the correlation is 0, the recognition probability is calculated according to the first identification data, and the recognition result of the image to be recognized is obtained according to the recognition probability.
  • Specifically, a correlation of 0 means that the global features are correlated with the unique features, so recognition can be based on the global features alone.
  • Therefore, the recognition probability is calculated from the first identification data, and based on that probability it is determined whether the object in the image to be recognized matches the object in the target image, generating a matching result (the recognition result).
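  • A sketch of this decision rule of step S26 for a single image follows, assuming the merge of the first and second identification data is a simple concatenation and that the probability over the merged data is obtained by averaging the two logit vectors; both choices are assumptions, since the exact computation is not specified here.

```python
import torch
import torch.nn.functional as F

def fuse_and_decide(first_id: torch.Tensor, second_id: torch.Tensor,
                    mask: torch.Tensor, threshold: float = 0.5) -> torch.Tensor:
    """Step S26 decision rule for one image to be recognized.

    first_id:  first identification data from the global branch, shape (num_classes,)
    second_id: second identification data from the unique branch, shape (num_classes,)
    mask:      position mask Y for this image, shape (1, h, w)
    Returns recognition probabilities over the identities."""
    correlation = 1 if mask.max().item() >= threshold else 0
    if correlation == 1:
        # Dimension-merged third identification data (global + unique).
        third_id = torch.cat([first_id, second_id], dim=0)
        # Assumed scoring over the merged data: average the two halves back into one
        # logit vector, then take the softmax.
        probs = F.softmax(third_id.view(2, -1).mean(dim=0), dim=0)
    else:
        probs = F.softmax(first_id, dim=0)   # correlation 0: recognize from global features only
    return probs
```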
  • the first neural network in the target recognition model is used to extract the global features of the object to be recognized
  • The second neural network in the target recognition model is used to extract the unique features (features distinct from the global features) of the object to be recognized; combining the global features and the unique features to recognize the object improves the accuracy of recognition and achieves the purpose of quickly and accurately locking onto the target object through the target recognition model.
  • The pedestrian re-identification method can extract, at the same time as the global features of a pedestrian, the unique features possessed by those pedestrians that have them.
  • the target recognition model includes two neural networks (the first neural network and the second neural network).
  • The parameters of the convolutional layers are kept orthogonal so that the unique features are separated from the global features, and loss functions are used to satisfy the requirements of distinctiveness and rarity of the unique features.
  • When the target pedestrian does not have unique features, only the global features need to be compared, and the most similar pedestrian is found as the recognition result.
  • When the target pedestrian has unique features, the extracted unique features can be used for more precise feature comparison; because the base library to be compared is smaller, the comparison is faster and the result is more accurate.
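  • A sketch of this comparison follows, assuming feature vectors are compared by cosine similarity and that gallery entries carrying unique features are pre-marked; the data layout and similarity measure are assumptions, not part of this application.

```python
import torch
import torch.nn.functional as F

def rank_gallery(query_feat: torch.Tensor, gallery_feats: torch.Tensor,
                 query_has_unique: bool, gallery_has_unique: torch.Tensor):
    """Return gallery indices ranked by similarity to the query.

    query_feat:         (D,) feature vector of the pedestrian to be recognized
    gallery_feats:      (G, D) feature vectors of the gallery pedestrians
    query_has_unique:   whether a unique feature was found for the query (Q(Y) = 1)
    gallery_has_unique: (G,) bool tensor marking gallery entries with unique features
    """
    candidates = torch.arange(gallery_feats.shape[0])
    if query_has_unique:
        candidates = candidates[gallery_has_unique]      # smaller base library to compare against
    sims = F.cosine_similarity(query_feat.unsqueeze(0), gallery_feats[candidates], dim=1)
    order = sims.argsort(descending=True)
    return candidates[order]                             # most similar pedestrian first
```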
  • a pedestrian re-identification device 1 of this embodiment includes: a receiving unit 11 and an identification unit 12, wherein:
  • the receiving unit 11 is configured to receive at least one image to be recognized
  • The pedestrian re-identification device 1 can be applied to a monitoring, safety, or security system, and the receiving unit 11 can use a collection device (for example, a surveillance camera, a camera, or a mobile terminal with a camera function) to receive the image to be identified.
  • the recognition unit 12 is configured to recognize the image to be recognized through the first neural network and the second neural network of the target recognition model, and obtain a recognition result;
  • The weight group of the first neural network is orthogonal to the weight group of the second neural network, which ensures that the global features extracted by the first neural network are uncorrelated with the unique features extracted by the second neural network.
  • In this embodiment, the recognition unit 12 extracts the first-type feature data of the at least one image to be recognized through the first extraction module; identifies the first-type feature data through the first classification module to obtain first identification data; extracts the second-type feature data of the at least one image to be recognized through the second extraction module, where the weight group in the second extraction module is orthogonal to the weight group in the first extraction module; processes the second feature data through the third extraction module to obtain the location information of the second feature data; performs identification with the second classification model according to the location information and the second-type feature data to obtain second identification data; and obtains the recognition result of the image to be recognized according to the correlation of the location information, the second identification data, and the first identification data.
  • the process for the recognition unit 12 to obtain the recognition result of the image to be recognized according to the correlation degree of the position information, the second recognition data, and the first recognition data is:
  • The correlation of the position information is obtained according to the relationship between the element with the largest value in the position information and a preset threshold: when that element is greater than or equal to the preset threshold, the correlation is 1; when it is less than the preset threshold, the correlation is 0;
  • When the correlation is 1, the second identification data and the first identification data are combined to generate third identification data, the recognition probability is calculated according to the third identification data, and the recognition result of the image to be recognized is obtained according to the recognition probability.
  • Specifically, a correlation of 1 means that the global features and the unique features are uncorrelated; during image recognition, the dimension of the first identification data (global features) and the dimension of the second identification data (unique features) can be merged to generate dimension-merged third identification data.
  • The recognition probability is calculated from the third identification data, and based on that probability it is determined whether the object in the image to be recognized matches the object in the target image, generating a matching result (the recognition result).
  • When the correlation is 0, the recognition probability is calculated according to the first identification data, and the recognition result of the image to be recognized is obtained according to the recognition probability.
  • Specifically, a correlation of 0 means that the global features are correlated with the unique features, so recognition can be based on the global features alone.
  • Therefore, the recognition probability is calculated from the first identification data, and based on that probability it is determined whether the object in the image to be recognized matches the object in the target image, generating a matching result (the recognition result).
  • the first neural network in the target recognition model is used to extract the global features of the object to be recognized
  • The second neural network in the target recognition model is used to extract the unique features (features distinct from the global features) of the object to be recognized; combining the global features and the unique features to recognize the object improves the accuracy of recognition and achieves the purpose of quickly and accurately locking onto the target object through the target recognition model.
  • The pedestrian re-identification device 1 can extract, at the same time as the global features of a pedestrian, the unique features possessed by those pedestrians that have them.
  • the target recognition model includes two neural networks (the first neural network and the second neural network).
  • The parameters of the convolutional layers are kept orthogonal so that the unique features are separated from the global features, and loss functions are used to satisfy the requirements of distinctiveness and rarity of the unique features.
  • When the target pedestrian does not have unique features, only the global features need to be compared, and the most similar pedestrian is found as the recognition result.
  • When the target pedestrian has unique features, the extracted unique features can be used for more precise feature comparison; because the base library to be compared is smaller, the comparison is faster and the result is more accurate.
  • To achieve the above objective, the present application also provides a computer device 2; the computer device 2 may comprise a plurality of computer devices 2, and the components of the pedestrian re-identification device 1 of the second embodiment can be distributed across different computer devices 2.
  • The computer device 2 can be a smart phone, a tablet computer, a laptop, a desktop computer, a rack server, a blade server, a tower server, or a cabinet server (including an independent server, or a server cluster composed of multiple servers) that executes the program.
  • the computer device 2 of this embodiment at least includes but is not limited to: a memory 21, a processor 23, a network interface 22, and a pedestrian re-identification device 1 (refer to FIG. 8) that can be communicably connected to each other through a system bus.
  • FIG. 8 only shows the computer device 2 with components, but it should be understood that it is not required to implement all the components shown, and more or fewer components may be implemented instead.
  • the memory 21 includes at least one type of computer-readable storage medium, and the readable storage medium includes flash memory, hard disk, multimedia card, card-type memory (for example, SD or DX memory, etc.), random access Memory (RAM), Static Random Access Memory (SRAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), Programmable Read Only Memory (PROM), Magnetic Memory, Magnetic Disk, Optical Disk, etc.
  • the memory 21 may be an internal storage unit of the computer device 2, for example, a hard disk or a memory of the computer device 2.
  • In other embodiments, the memory 21 may also be an external storage device of the computer device 2, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card equipped on the computer device 2.
  • the memory 21 may also include both the internal storage unit of the computer device 2 and its external storage device.
  • the memory 21 is generally used to store the operating system and various application software installed in the computer device 2, such as the program code of the pedestrian re-identification method in the first embodiment.
  • the memory 21 can also be used to temporarily store various types of data that have been output or will be output.
  • the processor 23 may be a central processing unit (CPU), a controller, a microcontroller, a microprocessor, or other data processing chips.
  • the processor 23 is generally used to control the overall operation of the computer device 2, for example, to perform data interaction or communication-related control and processing with the computer device 2.
  • the processor 23 is used to run the program code or processing data stored in the memory 21, for example, to run the pedestrian re-identification device 1 and the like.
  • the network interface 22 may include a wireless network interface or a wired network interface, and the network interface 22 is generally used to establish a communication connection between the computer device 2 and other computer devices 2.
  • the network interface 22 is used to connect the computer device 2 to an external terminal through a network, and to establish a data transmission channel and a communication connection between the computer device 2 and the external terminal.
  • the network may be Intranet, Internet, Global System of Mobile Communication (GSM), Wideband Code Division Multiple Access (WCDMA), 4G network, 5G Network, Bluetooth (Bluetooth), Wi-Fi and other wireless or wired networks.
  • Fig. 8 only shows the computer device 2 with components 21-23, but it should be understood that it is not required to implement all the components shown, and more or fewer components may be implemented instead.
  • the pedestrian re-identification device 1 stored in the memory 21 may also be divided into one or more program modules.
  • The one or more program modules are stored in the memory 21 and executed by one or more processors (the processor 23 in this embodiment) to complete this application.
  • To achieve the above objective, the present application also provides a computer-readable storage medium, which includes multiple storage media, such as flash memory, hard disk, multimedia card, card-type memory (for example, SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disks, optical disks, servers, app stores, and the like, on which a computer program is stored; when the program is executed by the processor 23, the corresponding functions are implemented.
  • the computer-readable storage medium in this embodiment is used to store the pedestrian re-identification device 1, and when executed by the processor 23, the pedestrian re-identification method in the first embodiment is implemented.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Biology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Human Computer Interaction (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Image Analysis (AREA)

Abstract

A pedestrian re-identification method, device, computer equipment, and readable storage medium, belonging to the field of image analysis. The pedestrian re-identification method extracts the global features of an object to be recognized through a first neural network of a target recognition model, extracts the unique features of the object to be recognized (features distinct from the global features) through a second neural network of the target recognition model, and combines the global features and the unique features to recognize the object, thereby improving recognition accuracy and achieving the purpose of quickly and accurately locking onto the target object through the target recognition model.

Description

行人重识别方法、装置、计算机设备及可读存储介质
本申请声明享有2019年10月30日递交的申请号为CN 2019110461952、名称为“行人重识别方法、装置、计算机设备及可读存储介质”的中国专利申请的优先权,该中国专利申请的整体内容以参考的方式结合在本申请中。
技术领域
本申请涉及图像分析领域,尤其涉及行人重识别方法、装置、计算机设备及可读存储介质。
背景技术
视频监控中的行人重识别广泛应用于视频分析的各个领域,在智慧安防、智慧教育、智慧媒体等诸多应用场景中都有着核心作用。采用行人重识别可在多个摄像头中快速找到目标行人,从而有效提升客户体验,增强社会平安稳定,缩减视频分析中的人工成本和分析时间。
目前的行人重识别方法主要是根据行人的整体表征提取整体的全部特征,根据该全部特征进行分析识别,从而找到目标行人,但是现有的人重识别技术需提取行人的全部特征在进行识别时存在精准度地问题。
申请内容
针对现有行人重识别方法识别的精准度低的问题,现提供一种旨在可结合特有特征及全局特征对行人进行识别提升识别精准度的行人重识别方法、装置、计算机设备及可读存储介质。
为实现上述目的,本申请提供一种行人重识别方法,包括下述步骤:
接收至少一张待识别图像;
通过目标识别模型的第一神经网络和第二神经网络对所述待识别图像进行识别,获取识别结果;
所述第一神经网络的权重组与所述第二神经网络的权重组正交。
为实现上述目的,本申请还提供一种行人重识别装置,包括:
接收单元,用于接收至少一张待识别图像;
识别单元,用于通过目标识别模型的第一神经网络和第二神经网络对所述待识别图像进行识别,获取识别结果;
所述第一神经网络的权重组与所述第二神经网络的权重组正交。
为实现上述目的,本申请还提供一种计算机设备,所述计算机设备,包括存储器、处理器以及存储在存储器上并可在处理器上运行的计算机程序,所述处理器执行所述计算机程序时实现行人重识别方法的以下步骤:
接收至少一张待识别图像;
通过目标识别模型的第一神经网络和第二神经网络对所述待识别图像进行识别,获取识别结果;
所述第一神经网络的权重组与所述第二神经网络的权重组正交。
为实现上述目的,本申请还提供一种计算机可读存储介质,其上存储有计算机程序,所述计算机程序被处理器执行时实现行人重识别方法的以下步骤:
接收至少一张待识别图像;
通过目标识别模型的第一神经网络和第二神经网络对所述待识别图像进行识别,获取识别结果;
所述第一神经网络的权重组与所述第二神经网络的权重组正交。
上述技术方案的有益效果:
本技术方案中,行人重识别方法、装置、计算机设备及可读存储介质通过目标识别模型中的第一神经网络提取待识别对象的全局特征,通过目标识别模型中的第二神经网络提取待识别对象的特有特征(区别于全局特征的特征),结合全局特征和特有特征对待识别对象进行识别,进而提升识别的准确性,实现通过目标识别模型快速、准确的锁定目标对象的目的。
附图说明
图1为本申请所述的行人重识别方法的一种实施例的方法流程图;
图2为本申请中对对识别模型进行训练获取目标识别模型的一种实施例的方法流程图;
图3为训练第三神经网络的一种实施例的方法流程图;
图4为训练第四神经网络的一种实施例的方法流程图;
图5为行人重识别的原理图;
图6为对待识别图像进行识别的方法图;
图7为本申请所述的行人重识别装置的一种实施例的模块图;
图8为本申请所述的计算机设备一实施例的硬件架构示意图。
具体实施方式
为了使本申请的目的、技术方案及优点更加清楚明白,以下结合附图及实施例,对本申请进行进一步详细说明。应当理解,此处所描述的具体实施例仅 用以解释本申请,并不用于限定本申请。基于本申请中的实施例,本领域普通技术人员在没有做出创造性劳动前提下所获得的所有其他实施例,都属于本申请保护的范围。
本申请提供的行人重识别方法、装置、计算机设备及可读存储介质,适用于保险、安保等业务领域,为安保***及监控***提供一种可快速准确识别目标对象的行人重识别方法。本申请通过目标识别模型中的第一神经网络提取待识别对象的全局特征,通过目标识别模型中的第二神经网络提取待识别对象的特有特征(区别于全局特征的特征),结合全局特征和特有特征对待识别对象进行识别,进而提升识别的准确性,实现通过目标识别模型快速、准确的锁定目标对象的目的。
实施例一
请参阅图1,本实施例的一种行人重识别方法,包括下述步骤:
S1.接收至少一张待识别图像;
于本实施例中,行人重识别方法可应用于监控、安防或安保***中,可通过采集设备(例如:摄像头、相机或带有摄像功能的移动终端等)接收待识别图像。
S2.通过目标识别模型的第一神经网络和第二神经网络对所述待识别图像进行识别,获取识别结果;
需要说明的是:所述第一神经网络的权重组与所述第二神经网络的权重组正交。从而保证第一神经网络提取的全局特征与第二神经网络提取的特有特征不相关。
以行人的外在特征为例,全局特征用于表征行人的头部、躯干和四肢等,是每个人行人都具有的特征;相对于全局特征而言特有特征可以丰富多样,特有特征用于表征行人的特有特征该特征与全局特征不同,例如:特有的发型、鲜艳的发色、特殊的衣服款式、装饰或图案等。利用特有特征可以更快、更准确的定位目标对象。
在本步骤中,通过第一神经网络提取待识别对象的全局特征,通过目标识别模型中的第二神经网络提取待识别对象的特有特征(区别于全局特征的特征),结合全局特征和特有特征对待识别对象进行识别,进而提升识别的准确性。
在步骤S2中,通过目标识别模型的第一神经网络和第二神经网络对所述待识别图像进行识别,获取识别结果的步骤,之前还可包括(参考图2):
A1.获取至少一张目标图像及至少一张样本图像;
在本步骤中,每一张样本图像只表征一个样本对象,目标图像中的目标对象的形态与所有的样本图像中的样本对象的形态均不同。
A2.采用所述目标图像及至少一张样本图像对识别模型的第三神经网络和第四神经网络进行训练,所述第三神经网络的权重组与所述第四神经网络的权重组正交,获取目标识别模型。
需要说明的是,通过第三神经网络的权重组与第四神经网络的权重组正交,可保证第三神经网络提取的全局特征与第四神经网络提取的特有特征不相关。
在本步骤中,通过第三神经网络提取样本图像中样本对象的全局特征,通过第四神经网络提取样本图像中样本对象的特有特征(区别于全局特征的特征),结合全局特征和特有特征对待识别对象进行识别,进而提升识别的准确性。
需要说明的是:所述第三神经网络包括第一提取模块、第一分类模块和第一分类器;所述第四神经网络包括:第二提取模块、第三提取模块、第二分类模型和第二分类器;
其中,第一提取模块包括至少一个卷积层;第一提取模块还可包括至少一个下采样层。第一分类模块可采用全连接层;第一分类器可采用Softmax分类器;第二提取模块包括至少一个卷积层;第二提取模块还可包括至少一个下采样层;第三提取模块包括至少一个卷积层;第三提取模块还可包括至少一个下采样层;第二分类模型可采用全连接层;第二分类器可采用Softmax分类器。
在步骤A2中,采用所述目标图像及至少一张样本图像对识别模型的第三神经网络进行训练的步骤(参考图3),包括:
A201.通过所述第一提取模块提取所述至少一张样本图像的第一类特征数据;
其中,第一提取模块包括至少一个卷积层;第一提取模块还可包括至少一个下采样层。
在本实施例中,通过卷积层提取每一张样本图像的第一类特征数据(全局特征)。以第一提取模块包括一个卷积层为例进行说明。
第一提取模块的卷积层Cg的卷积窗权重记为Wg={Wg1,Wg2,…,Wgi,…,Wgn}。
其中,i=1,2,…,n;n为Cg的输出通道数,Wgi={wgi1,wgi2,…,wgij,…,wgim},j=1,2,…,m;m为Cg的输入通道数。
A202.通过所述第一分类模块对所述第一类特征数据进行识别,获取第一识别数据;
在本实施例中,通过第一分类模块对第一类特征数据进行识别,获取第一类特征数据中每一元素对应的类别,根据获取的类别信息生成第一识别数据。
具体地,第一分类模块可采用全连接层。
A203.根据所述第一识别数据通过第一分类器训练至少一张样本图像的梯度,根据所述梯度更新所述第一提取模块和所述第一分类模块的参数值。
在本实施例中,根据梯度调整第一提取模块和第一分类模块中的参数值,从而实现对第一神经网络训练的目的。
具体地,第一分类器可采用Softmax分类器。
在步骤A2中,采用所述目标图像及至少一张样本图像对识别模型的第四神经网络进行训练的步骤,包括(参考图4):
A211.通过所述第二提取模块提取所述至少一张样本图像的第二类特征数据,所述第二提取模块中的权重组与所述第一提取模块中的权重组正交;
在本实施例中,通过第二提取模块提取每一张样本图像的第二类特征数据(特有特征)。进一步地,第二提取模块可包括至少一个卷积层;第二提取模块还可包括至少一个下采样层。
作为举例而非限定,以第二提取模块包括一个卷积层Cs为例进行如下说明:
卷积层Cs用于从样本图像中提取第二类特征数据以便于后续分析。卷积层Cs的卷积窗权重记为Ws={Ws1,Ws2,…,Wsn},n为Cs的输出通道数,Wsn={wsn1,wsn2,…wsnm},m为Cs的输入通道数;
需要说明的是,卷积层Cs的维度与卷积层Cg的维度需保持一致。
由于特有特征是在少数行人中才会出现的,而全局特征是所有行人都具有的,所以可以认为第二类特征数据和第一类特征数据是完全不相关的。
因此,两个完全不相关的卷积层对应的权重组是正交的,即对于每一个输入通道,它对应的权重组乘积之和为0。算式表达如下:
Figure PCTCN2019118020-appb-000001
为了使特有特征的卷积层Cs具有独特性,需满足以下损失函数Lc公式:
Figure PCTCN2019118020-appb-000002
A212.通过所述第三提取模块对所述第二特征数据进行处理获取所述第二特征数据的位置信息;
进一步地,第三提取模块可包括至少一个卷积层;第三提取模块还可包括至少一个下采样层。
作为举例而非限定,以第三提取模块包括一个卷积层Cp为例进行如下说 明:
卷积层Cp对卷积层Cs输出的第二特征数据进行分析,获得特有特征的位置掩模Y(即:位置信息),为后续的特有特征提取提供了位置参照。Y是一个h×w的矩阵,h和w各自为原图长宽的1/s,s为第一神经网络的下采样系数。
人物的特有特征在行人图像中具有罕见性,仅在很小一部分行人图像中才会存在。因此,提供了基于批数据整体分析的损失函数Lp,以确保其罕见性:
Figure PCTCN2019118020-appb-000003
Figure PCTCN2019118020-appb-000004
其中,Q(Y)表示位置信息的相关度;max(Y)表示在矩阵Y中元素最大的值;B表示批数据的数量个数,N为预设常数。优选的,为了确保罕见度损失的稳定性,B一般大于64,N一般取25。
A213.根据所述位置信息和所述第二类特征数据采用所述第二分类模型进行识别,获取第二识别数据;
在本实施例中,将位置信息和第二类特征数据相乘后输入至第二分类模型进行识别,获取第二类特征数据中每一元素对应的类别,根据获取的类别信息生成与位置相关的第二识别数据。
具体地,第二分类模型可采用全连接层。
A214.依据所述位置信息的相关度及所述第二识别数据通过第二分类器训练至少一张样本图像的梯度,根据所述梯度更新所述第二提取模块、所述第三提取模块和所述第二分类模型的参数值。
在本实施例中,根据梯度调整第二提取模块、第三提取模块和第二分类模型中的参数值,从而实现对第二神经网络训练的目的。
具体地,第二分类器可采用Softmax分类器。
进一步地,在步骤A214中,依据所述位置信息的相关度及所述第二识别数据通过第二分类器训练至少一张样本图像的梯度,根据所述梯度更新所述第二提取模块、所述第三提取模块和所述第二分类模型的参数值的步骤,包括:
根据所述位置信息中值最大的元素与预设阈值的关系获得所述位置信息的相关度,当所述元素大于或等于所述预设阈值时,所述相关度为1;当所述元素小于所述预设阈值时,所述相关度为0;
当相关度为1时,根据所述第二识别数据通过第二分类器训练至少一张样 本图像的梯度,根据所述梯度更新所述第二提取模块、所述第三提取模块和所述第二分类模型的参数值。
在本实施例中,当Q(Y)=1时,采用第二分类器进行学习,当Q(Y)=0时,不进行学习。
作为举例而非限定,行人重识别方法的原理可参考图5所示,以第三神经网络的第一提取模块为卷积层Cg,第一分类模块为全连接层Mg;第四神经网络的第二提取模块为卷积层Cs,第三提取模块为卷积层Cp,第二分类模块为全连接层Ms为例进行如下说明:
将样本图像输入第三神经网络的卷积层Cg和第四神经网络的卷积层Cs,通过卷积层Cg提取样本图像的第一类特征数据,采用全连接层Mg对第一类特征数据进行识别,获取全局特征的识别结果;采用第一分类器进行训练获取梯度,基于梯度更新第一提取模块和第一分类模块的参数值。通过卷积层Cs提取样本图像的第二类特征数据,根据损失函数Lc使卷积层Cg与卷积层Cs的权重组保持正交;通过卷积层Cp对第二特征数据进行分析,获取第二特征数据的位置掩模Y;将位置掩模Y与第二特征数据相乘后输入全连接层Ms进行识别获取特有特征的识别结果;基于位置掩模Y通过损失函数Lp进行计算保证第二特征数据的罕见度稳定性,当相关度为1时,根据第二识别数据通过第二分类器训练至少一张样本图像的梯度,根据梯度更新第二提取模块、第三提取模块和第二分类模型的参数值;当相关度为0时,不更新参数值。上述过程为训练模型的过程。
需要说明的是:应用训练完成的目标识别模型进行图像识别时,当相关度为1时,将特有特征的识别结果和全局特征的识别结果进行合并,作为最终的识别结果输出;当相关度为0时,将全局特征的识别结果作为最终的识别结果输出。
在步骤S2中所述第一神经网络包括第一提取模块和第一分类模块;所述第二神经网络包括:第二提取模块、第三提取模块和第二分类模型;
步骤S2通过目标识别模型的第一神经网络和第二神经网络对所述待识别图像进行识别,获取识别结果的步骤,包括(参考图6):
S21.通过所述第一提取模块提取所述至少一张待识别图像的第一类特征数据;
其中,第一提取模块包括至少一个卷积层;第一提取模块还可包括至少一个下采样层。
在本实施例中,通过卷积层提取每一张样本图像的第一类特征数据(全局特征)。以第一提取模块包括一个卷积层为例进行说明。
第一提取模块的卷积层Cg的卷积窗权重记为Wg={Wg1,Wg2,…,Wgi,…,Wgn}。
其中,i=1,2,…,n;n为Cg的输出通道数,Wgi={wgi1,wgi2,…,wgij,…,wgim},j=1,2,…,m;m为Cg的输入通道数。
S22.通过所述第一分类模块对所述第一类特征数据进行识别,获取第一识别数据;
在本实施例中,通过第一分类模块对第一类特征数据进行识别,获取第一类特征数据中每一元素对应的类别,根据获取的类别信息生成第一识别数据。
具体地,第一分类模块可采用全连接层。
S23.通过所述第二提取模块提取所述至少一张待识别图像的第二类特征数据,所述第二提取模块中的权重组与所述第一提取模块中的权重组正交;
在本实施例中,通过第二提取模块提取每一张样本图像的第二类特征数据(特有特征)。进一步地,第二提取模块可包括至少一个卷积层;第二提取模块还可包括至少一个下采样层。
作为举例而非限定,以第二提取模块包括一个卷积层Cs为例进行如下说明:
卷积层Cs用于从样本图像中提取第二类特征数据以便于后续分析。卷积层Cs的卷积窗权重记为Ws={Ws1,Ws2,…,Wsn},n为Cs的输出通道数,Wsn={wsn1,wsn2,…wsnm},m为Cs的输入通道数;
需要说明的是,卷积层Cs的维度与卷积层Cg的维度需保持一致。
由于特有特征是在少数行人中才会出现的,而全局特征是所有行人都具有的,所以可以认为第二类特征数据和第一类特征数据是完全不相关的。
因此,两个完全不相关的卷积层对应的权重组是正交的,即对于每一个输入通道,它对应的权重组乘积之和为0。算式表达如下:
Figure PCTCN2019118020-appb-000005
S24.通过所述第三提取模块对所述第二特征数据进行处理获取所述第二特征数据的位置信息;
进一步地,第三提取模块可包括至少一个卷积层;第三提取模块还可包括至少一个下采样层。
作为举例而非限定,以第三提取模块包括一个卷积层Cp为例进行如下说明:
卷积层Cp对卷积层Cs输出的第二特征数据进行分析,获得特有特征的位置掩模Y(即:位置信息),为后续的特有特征提取提供了位置参照。Y是一个h×w的矩阵,h和w各自为原图长宽的1/s,s为第一神经网络的下采样 系数。
Figure PCTCN2019118020-appb-000006
其中,Q(Y)表示位置信息的相关度;max(Y)表示在矩阵Y中元素最大的值。
S25.根据所述位置信息和所述第二类特征数据采用所述第二分类模型进行识别,获取第二识别数据;
在本实施例中,将位置信息和第二类特征数据相乘后输入至第二分类模型进行识别,获取第二类特征数据中每一元素对应的类别,根据获取的类别信息生成与位置相关的第二识别数据。
具体地,第二分类模型可采用全连接层。
S26.依据所述位置信息的相关度、所述第二识别数据和所述第一识别数据获取所述待识别图像的识别结果。
进一步地,步骤S26依据所述位置信息的相关度、所述第二识别数据和所述第一识别数据获取所述待识别图像的识别结果的步骤,包括:
根据所述位置信息中值最大的元素与预设阈值的关系获得所述位置信息的相关度,当所述元素大于或等于所述预设阈值时,所述相关度为1;当所述元素小于所述预设阈值时,所述相关度为0;
当相关度为1时,将所述第二识别数据和所述第一识别数据进行合并,生成第三识别数据,根据所述第三识别数据计算识别概率,根据所述识别概率获取所述待识别图像的识别结果;
具体地,当相关度为1时,表示全局特征与特有特征不相关,在进行图像识别时,可将全局特征的第一识别数据的维度与特有特征的第二识别数据维度合并,生成经维度合并后的第三识别数据,对该第三识别数据进行计算获取识别概率,根据该识别概率判断待识别图像中的对象与目标图像中的对象是否匹配,生成匹配结果(识别结果)。
当相关度为0时,根据所述第一识别数据计算识别概率,根据所述识别概率获取所述待识别图像的识别结果。
具体地,当相关度为0时,表示全局特征与特有特征相关,在进行图像识别时,可基于全局特征进行识别,因而,对第一识别数据计算获取识别概率,根据该识别概率判断待识别图像中的对象与目标图像中的对象是否匹配,生成匹配结果(识别结果)。
在本实施例中,通过目标识别模型中的第一神经网络提取待识别对象的全局特征,通过目标识别模型中的第二神经网络提取待识别对象的特有特征(区 别于全局特征的特征),结合全局特征和特有特征对待识别对象进行识别,进而提升识别的准确性,实现通过目标识别模型快速、准确的锁定目标对象的目的。
行人重识别方法可以在提取行人全局特征的同时,提取特有特征的行人所具有的特有特征。目标识别模型包括了两个神经网络(第一神经网络和第二神经网络),利用卷积层的参数正交,使特有特征和全局特征分离开来,并采用损失函数,满足特有特征的独特性和罕见性的需求。当目标行人不具有特有特征时,只需比较全局特征,并找到最相似行人作为识别结果。而当目标行人具有特有特征时,可以用提取的特有特征进行更为精确的特征比较。因为比较的底库更少,所以比较速度更快,结果也会更为精确。
实施例二
请参阅图7,本实施例的一种行人重识别装置1,包括:接收单元11和识别单元12,其中:
接收单元11,用于接收至少一张待识别图像;
于本实施例中,行人重识别装置1可应用于监控、安防或安保***中,接收单元11可利用采集设备(例如:摄像头、相机或带有摄像功能的移动终端等)接收待识别图像。
识别单元12,用于通过目标识别模型的第一神经网络和第二神经网络对所述待识别图像进行识别,获取识别结果;
需要说明的是:所述第一神经网络的权重组与所述第二神经网络的权重组正交。从而保证第一神经网络提取的全局特征与第二神经网络提取的特有特征不相关。
所述第一神经网络的权重组与所述第二神经网络的权重组正交。
在本实施例中,识别单元12通过所述第一提取模块提取所述至少一张待识别图像的第一类特征数据;通过所述第一分类模块对所述第一类特征数据进行识别,获取第一识别数据;通过所述第二提取模块提取所述至少一张待识别图像的第二类特征数据,所述第二提取模块中的权重组与所述第一提取模块中的权重组正交;通过所述第三提取模块对所述第二特征数据进行处理获取所述第二特征数据的位置信息;根据所述位置信息和所述第二类特征数据采用所述第二分类模型进行识别,获取第二识别数据;依据所述位置信息的相关度、所述第二识别数据和所述第一识别数据获取所述待识别图像的识别结果。
进一步地,识别单元12依据所述位置信息的相关度、所述第二识别数据和所述第一识别数据获取所述待识别图像的识别结果的过程为:
根据所述位置信息中值最大的元素与预设阈值的关系获得所述位置信息的相关度,当所述元素大于或等于所述预设阈值时,所述相关度为1;当所述元素小于所述预设阈值时,所述相关度为0;
当相关度为1时,将所述第二识别数据和所述第一识别数据进行合并,生成第三识别数据,根据所述第三识别数据计算识别概率,根据所述识别概率获取所述待识别图像的识别结果;
具体地,当相关度为1时,表示全局特征与特有特征不相关,在进行图像识别时,可将全局特征的第一识别数据的维度与特有特征的第二识别数据维度合并,生成经维度合并后的第三识别数据,对该第三识别数据进行计算获取识别概率,根据该识别概率判断待识别图像中的对象与目标图像中的对象是否匹配,生成匹配结果(识别结果)。
当相关度为0时,根据所述第一识别数据计算识别概率,根据所述识别概率获取所述待识别图像的识别结果。
具体地,当相关度为0时,表示全局特征与特有特征相关,在进行图像识别时,可基于全局特征进行识别,因而,对第一识别数据计算获取识别概率,根据该识别概率判断待识别图像中的对象与目标图像中的对象是否匹配,生成匹配结果(识别结果)。
在本实施例中,通过目标识别模型中的第一神经网络提取待识别对象的全局特征,通过目标识别模型中的第二神经网络提取待识别对象的特有特征(区别于全局特征的特征),结合全局特征和特有特征对待识别对象进行识别,进而提升识别的准确性,实现通过目标识别模型快速、准确的锁定目标对象的目的。
行人重识别装置1可以在提取行人全局特征的同时,提取特有特征的行人所具有的特有特征。目标识别模型包括了两个神经网络(第一神经网络和第二神经网络),利用卷积层的参数正交,使特有特征和全局特征分离开来,并采用损失函数,满足特有特征的独特性和罕见性的需求。当目标行人不具有特有特征时,只需比较全局特征,并找到最相似行人作为识别结果。而当目标行人具有特有特征时,可以用提取的特有特征进行更为精确的特征比较。因为比较的底库更少,所以比较速度更快,结果也会更为精确。
实施例三
为实现上述目的,本申请还提供一种计算机设备2,该计算机设备2包括多个计算机设备2,实施例二的行人重识别装置1的组成部分可分散于不同的计算机设备2中,计算机设备2可以是执行程序的智能手机、平板电脑、笔记 本电脑、台式计算机、机架式服务器、刀片式服务器、塔式服务器或机柜式服务器(包括独立的服务器,或者多个服务器所组成的服务器集群)等。本实施例的计算机设备2至少包括但不限于:可通过***总线相互通信连接的存储器21、处理器23、网络接口22以及行人重识别装置1(参考图8)。需要指出的是,图8仅示出了具有组件-的计算机设备2,但是应理解的是,并不要求实施所有示出的组件,可以替代的实施更多或者更少的组件。
本实施例中,所述存储器21至少包括一种类型的计算机可读存储介质,所述可读存储介质包括闪存、硬盘、多媒体卡、卡型存储器(例如,SD或DX存储器等)、随机访问存储器(RAM)、静态随机访问存储器(SRAM)、只读存储器(ROM)、电可擦除可编程只读存储器(EEPROM)、可编程只读存储器(PROM)、磁性存储器、磁盘、光盘等。在一些实施例中,存储器21可以是计算机设备2的内部存储单元,例如该计算机设备2的硬盘或内存。在另一些实施例中,存储器21也可以是计算机设备2的外部存储设备,例如该计算机设备2上配备的插接式硬盘,智能存储卡(Smart Media Card,SMC),安全数字(Secure Digital,SD)卡,闪存卡(Flash Card)等。当然,所述存储器21还可以既包括计算机设备2的内部存储单元也包括其外部存储设备。本实施例中,存储器21通常用于存储安装于计算机设备2的操作***和各类应用软件,例如实施例一的行人重识别方法的程序代码等。此外,存储器21还可以用于暂时地存储已经输出或者将要输出的各类数据。
所述处理器23在一些实施例中可以是中央处理器(Central Processing Unit,CPU)、控制器、微控制器、微处理器、或其他数据处理芯片。该处理器23通常用于控制计算机设备2的总体操作例如执行与所述计算机设备2进行数据交互或者通信相关的控制和处理等。本实施例中,所述处理器23用于运行所述存储器21中存储的程序代码或者处理数据,例如运行所述的行人重识别装置1等。
所述网络接口22可包括无线网络接口或有线网络接口,该网络接口22通常用于在所述计算机设备2与其他计算机设备2之间建立通信连接。例如,所述网络接口22用于通过网络将所述计算机设备2与外部终端相连,在所述计算机设备2与外部终端之间的建立数据传输通道和通信连接等。所述网络可以是企业内部网(Intranet)、互联网(Internet)、全球移动通讯***(Global System of Mobile communication,GSM)、宽带码分多址(Wideband Code Division Multiple Access,WCDMA)、4G网络、5G网络、蓝牙(Bluetooth)、Wi-Fi等无线或有线网络。
需要指出的是,图8仅示出了具有部件21-23的计算机设备2,但是应理 解的是,并不要求实施所有示出的部件,可以替代的实施更多或者更少的部件。
在本实施例中,存储于存储器21中的所述行人重识别装置1还可以被分割为一个或者多个程序模块,所述一个或者多个程序模块被存储于存储器21中,并由一个或多个处理器(本实施例为处理器23)所执行,以完成本申请。
实施例四:
为实现上述目的,本申请还提供一种计算机可读存储介质,其包括多个存储介质,如闪存、硬盘、多媒体卡、卡型存储器(例如,SD或DX存储器等)、随机访问存储器(RAM)、静态随机访问存储器(SRAM)、只读存储器(ROM)、电可擦除可编程只读存储器(EEPROM)、可编程只读存储器(PROM)、磁性存储器、磁盘、光盘、服务器、App应用商城等等,其上存储有计算机程序,程序被处理器23执行时实现相应功能。本实施例的计算机可读存储介质用于存储行人重识别装置1,被处理器23执行时实现实施例一的行人重识别方法。
上述本申请实施例序号仅仅为了描述,不代表实施例的优劣。
通过以上的实施方式的描述,本领域的技术人员可以清楚地了解到上述实施例方法可借助软件加必需的通用硬件平台的方式来实现,当然也可以通过硬件,但很多情况下前者是更佳的实施方式。
以上仅为本申请的优选实施例,并非因此限制本申请的专利范围,凡是利用本申请说明书及附图内容所作的等效结构或等效流程变换,或直接或间接运用在其他相关的技术领域,均同理包括在本申请的专利保护范围内。

Claims (20)

  1. 一种行人重识别方法,其特征在于,包括下述步骤:
    接收至少一张待识别图像;
    通过目标识别模型的第一神经网络和第二神经网络对所述待识别图像进行识别,获取识别结果;
    所述第一神经网络的权重组与所述第二神经网络的权重组正交。
  2. 根据权利要求1所述的行人重识别方法,其特征在于,通过目标识别模型的第一神经网络和第二神经网络对所述待识别图像进行识别,获取识别结果的步骤,之前包括:
    获取至少一张目标图像及至少一张样本图像;
    采用所述目标图像及至少一张样本图像对识别模型的第三神经网络和第四神经网络进行训练,所述第三神经网络的权重组与所述第四神经网络的权重组正交,获取目标识别模型。
  3. 根据权利要求2所述的行人重识别方法,其特征在于,所述第三神经网络包括第一提取模块、第一分类模块和第一分类器;
    采用所述目标图像及至少一张样本图像对识别模型的第三神经网络进行训练的步骤,包括:
    通过所述第一提取模块提取所述至少一张样本图像的第一类特征数据;
    通过所述第一分类模块对所述第一类特征数据进行识别,获取第一识别数据;
    根据所述第一识别数据通过第一分类器训练至少一张样本图像的梯度,根据所述梯度更新所述第一提取模块和所述第一分类模块的参数值。
  4. 根据权利要求3所述的行人重识别方法,其特征在于,所述第四神经网络包括:第二提取模块、第三提取模块、第二分类模型和第二分类器;
    采用所述目标图像及至少一张样本图像对识别模型的第四神经网络进行训练的步骤,包括:
    通过所述第二提取模块提取所述至少一张样本图像的第二类特征数据,所述第二提取模块中的权重组与所述第一提取模块中的权重组正交;
    通过所述第三提取模块对所述第二特征数据进行处理获取所述第二特征数据的位置信息;
    根据所述位置信息和所述第二类特征数据采用所述第二分类模型进行识别,获取第二识别数据;
    依据所述位置信息的相关度及所述第二识别数据通过第二分类器训练至少一张样本图像的梯度,根据所述梯度更新所述第二提取模块、所述第三提取模块和所述第二分类模型的参数值。
  5. 根据权利要求4所述的行人重识别方法,其特征在于,依据所述位置信息的相关度及所述第二识别数据通过第二分类器训练至少一张样本图像的梯度,根据所述梯度更新所述第二提取模块、所述第三提取模块和所述第二分类模型的参数值的步骤,包括:
    根据所述位置信息中值最大的元素与预设阈值的关系获得所述位置信息的相关度,当所述元素大于或等于所述预设阈值时,所述相关度为1;当所述元素小于所述预设阈值时,所述相关度为0;
    当相关度为1时,根据所述第二识别数据通过第二分类器训练至少一张样本图像的梯度,根据所述梯度更新所述第二提取模块、所述第三提取模块和所述第二分类模型的参数值。
  6. 根据权利要求4所述的行人重识别方法,其特征在于,所述第一神经网络包括第一提取模块和第一分类模块;所述第二神经网络包括:第二提取模块、第三提取模块和第二分类模型;
    通过目标识别模型的第一神经网络和第二神经网络对所述待识别图像进行识别,获取识别结果的步骤,包括:
    通过所述第一提取模块提取所述至少一张待识别图像的第一类特征数据;
    通过所述第一分类模块对所述第一类特征数据进行识别,获取第一识别数据;
    通过所述第二提取模块提取所述至少一张待识别图像的第二类特征数据,所述第二提取模块中的权重组与所述第一提取模块中的权重组正交;
    通过所述第三提取模块对所述第二特征数据进行处理获取所述第二特征数据的位置信息;
    根据所述位置信息和所述第二类特征数据采用所述第二分类模型进行识别,获取第二识别数据;
    依据所述位置信息的相关度、所述第二识别数据和所述第一识别数据获取所述待识别图像的识别结果。
  7. 根据权利要求6所述的行人重识别方法,其特征在于,依据所述位置信息的相关度、所述第二识别数据和所述第一识别数据获取所述待识别图像的识别结果的步骤,包括:
    根据所述位置信息中值最大的元素与预设阈值的关系获得所述位置信 息的相关度,当所述元素大于或等于所述预设阈值时,所述相关度为1;当所述元素小于所述预设阈值时,所述相关度为0;
    当相关度为1时,将所述第二识别数据和所述第一识别数据进行合并,生成第三识别数据,根据所述第三识别数据计算识别概率,根据所述识别概率获取所述待识别图像的识别结果;
    当相关度为0时,根据所述第一识别数据计算识别概率,根据所述识别概率获取所述待识别图像的识别结果。
  8. 一种行人重识别装置,其特征在于,包括:
    接收单元,用于接收至少一张待识别图像;
    识别单元,用于通过目标识别模型的第一神经网络和第二神经网络对所述待识别图像进行识别,获取识别结果;
    所述第一神经网络的权重组与所述第二神经网络的权重组正交。
  9. 根据权利要求8所述的行人重识别装置,其特征在于,还包括:
    训练单元,用于获取至少一张目标图像及至少一张样本图像,采用所述目标图像及至少一张样本图像对识别模型的第三神经网络和第四神经网络进行训练,所述第三神经网络的权重组与所述第四神经网络的权重组正交,获取目标识别模型。
  10. 根据权利要求9所述的行人重识别装置,其特征在于,所述第三神经网络包括第一提取模块、第一分类模块和第一分类器;
    所述第一提取模块,用于提取所述至少一张样本图像的第一类特征数据;
    所述第一分类模块,用于对所述第一类特征数据进行识别,获取第一识别数据;
    所述第一分类器,用于根据所述第一识别数据训练至少一张样本图像的梯度,根据所述梯度更新所述第一提取模块和所述第一分类模块的参数值。
  11. 根据权利要求10所述的行人重识别装置,其特征在于,所述第四神经网络包括:第二提取模块、第三提取模块、第二分类模型和第二分类器;
    所述第二提取模块,用于提取所述至少一张样本图像的第二类特征数据,所述第二提取模块中的权重组与所述第一提取模块中的权重组正交;
    所述第三提取模块,用于对所述第二特征数据进行处理获取所述第二特征数据的位置信息;
    所述第二分类模型,用于根据所述位置信息和所述第二类特征数据进 行识别,获取第二识别数据;
    所述第二分类器,用于依据所述位置信息的相关度及所述第二识别数据训练至少一张样本图像的梯度,根据所述梯度更新所述第二提取模块、所述第三提取模块和所述第二分类模型的参数值。
  12. 根据权利要求11所述的行人重识别装置,其特征在于,所述第二分类器用于根据所述位置信息中值最大的元素与预设阈值的关系获得所述位置信息的相关度,当所述元素大于或等于所述预设阈值时,所述相关度为1;当所述元素小于所述预设阈值时,所述相关度为0;当相关度为1时,根据所述第二识别数据通过第二分类器训练至少一张样本图像的梯度,根据所述梯度更新所述第二提取模块、所述第三提取模块和所述第二分类模型的参数值。
  13. 根据权利要求11所述的行人重识别装置,其特征在于,所述第一神经网络包括第一提取模块和第一分类模块;所述第二神经网络包括:第二提取模块、第三提取模块和第二分类模型。
  14. 一种计算机设备,所述计算机设备,包括存储器、处理器以及存储在存储器上并可在处理器上运行的计算机程序,其特征在于:所述处理器执行所述计算机程序时实现行人重识别方法的以下步骤:
    接收至少一张待识别图像;
    通过目标识别模型的第一神经网络和第二神经网络对所述待识别图像进行识别,获取识别结果;
    所述第一神经网络的权重组与所述第二神经网络的权重组正交。
  15. 根据权利要求14所述的计算机设备,其特征在于,通过目标识别模型的第一神经网络和第二神经网络对所述待识别图像进行识别,获取识别结果的步骤,之前包括:
    获取至少一张目标图像及至少一张样本图像;
    采用所述目标图像及至少一张样本图像对识别模型的第三神经网络和第四神经网络进行训练,所述第三神经网络的权重组与所述第四神经网络的权重组正交,获取目标识别模型。
  16. 根据权利要求15所述的计算机设备,其特征在于,所述第三神经网络包括第一提取模块、第一分类模块和第一分类器;
    采用所述目标图像及至少一张样本图像对识别模型的第三神经网络进行训练的步骤,包括:
    通过所述第一提取模块提取所述至少一张样本图像的第一类特征数据;
    通过所述第一分类模块对所述第一类特征数据进行识别,获取第一识别数据;
    根据所述第一识别数据通过第一分类器训练至少一张样本图像的梯度,根据所述梯度更新所述第一提取模块和所述第一分类模块的参数值。
  17. 根据权利要求16所述的计算机设备,其特征在于,所述第四神经网络包括:第二提取模块、第三提取模块、第二分类模型和第二分类器;
    采用所述目标图像及至少一张样本图像对识别模型的第四神经网络进行训练的步骤,包括:
    通过所述第二提取模块提取所述至少一张样本图像的第二类特征数据,所述第二提取模块中的权重组与所述第一提取模块中的权重组正交;
    通过所述第三提取模块对所述第二特征数据进行处理获取所述第二特征数据的位置信息;
    根据所述位置信息和所述第二类特征数据采用所述第二分类模型进行识别,获取第二识别数据;
    依据所述位置信息的相关度及所述第二识别数据通过第二分类器训练至少一张样本图像的梯度,根据所述梯度更新所述第二提取模块、所述第三提取模块和所述第二分类模型的参数值。
  18. 一种计算机可读存储介质,其上存储有计算机程序,其特征在于:所述计算机程序被处理器执行时实现行人重识别方法的以下步骤:
    接收至少一张待识别图像;
    通过目标识别模型的第一神经网络和第二神经网络对所述待识别图像进行识别,获取识别结果;
    所述第一神经网络的权重组与所述第二神经网络的权重组正交。
  19. 根据权利要求18所述的计算机可读存储介质,其特征在于,通过目标识别模型的第一神经网络和第二神经网络对所述待识别图像进行识别,获取识别结果的步骤,之前包括:
    获取至少一张目标图像及至少一张样本图像;
    采用所述目标图像及至少一张样本图像对识别模型的第三神经网络和第四神经网络进行训练,所述第三神经网络的权重组与所述第四神经网络的权重组正交,获取目标识别模型。
  20. 根据权利要求19所述的计算机可读存储介质,所述第三神经网络包括第一提取模块、第一分类模块和第一分类器;
    采用所述目标图像及至少一张样本图像对识别模型的第三神经网络进行训练的步骤,包括:
    通过所述第一提取模块提取所述至少一张样本图像的第一类特征数据;
    通过所述第一分类模块对所述第一类特征数据进行识别,获取第一识别数据;
    根据所述第一识别数据通过第一分类器训练至少一张样本图像的梯度,根据所述梯度更新所述第一提取模块和所述第一分类模块的参数值。
PCT/CN2019/118020 2019-10-30 2019-11-13 行人重识别方法、装置、计算机设备及可读存储介质 WO2021082078A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201911046195.2A CN110874574B (zh) 2019-10-30 2019-10-30 行人重识别方法、装置、计算机设备及可读存储介质
CN201911046195.2 2019-10-30

Publications (1)

Publication Number Publication Date
WO2021082078A1 true WO2021082078A1 (zh) 2021-05-06

Family

ID=69717988

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/118020 WO2021082078A1 (zh) 2019-10-30 2019-11-13 行人重识别方法、装置、计算机设备及可读存储介质

Country Status (2)

Country Link
CN (1) CN110874574B (zh)
WO (1) WO2021082078A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113139496A (zh) * 2021-05-08 2021-07-20 青岛根尖智能科技有限公司 一种基于时序多尺度融合的行人重识别方法及***
CN113591547A (zh) * 2021-06-15 2021-11-02 阿里云计算有限公司 图像处理方法、装置、电子设备及计算机可读存储介质

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111626156B (zh) * 2020-05-14 2023-05-09 电子科技大学 一种基于行人掩模和多尺度判别的行人生成方法
CN112446311A (zh) * 2020-11-19 2021-03-05 杭州趣链科技有限公司 对象重识别方法、电子设备、存储介质及装置
CN112528822B (zh) * 2020-12-04 2021-10-08 湖北工业大学 一种基于人脸识别技术的老弱人群寻路导识装置及方法
CN113627352A (zh) * 2021-08-12 2021-11-09 塔里木大学 一种行人重识别方法和***

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160171346A1 (en) * 2014-12-15 2016-06-16 Samsung Electronics Co., Ltd. Image recognition method and apparatus, image verification method and apparatus, learning method and apparatus to recognize image, and learning method and apparatus to verify image
CN107832672A (zh) * 2017-10-12 2018-03-23 北京航空航天大学 一种利用姿态信息设计多损失函数的行人重识别方法
CN108171319A (zh) * 2017-12-05 2018-06-15 南京信息工程大学 网络连接自适应深度卷积模型的构建方法
CN109614853A (zh) * 2018-10-30 2019-04-12 国家新闻出版广电总局广播科学研究院 一种基于身体结构划分的双线性行人再识别网络构建方法
CN109784186A (zh) * 2018-12-18 2019-05-21 深圳云天励飞技术有限公司 一种行人重识别方法、装置、电子设备及计算机可读存储介质

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108875486A (zh) * 2017-09-28 2018-11-23 北京旷视科技有限公司 目标对象识别方法、装置、***和计算机可读介质
CN109740413B (zh) * 2018-11-14 2023-07-28 平安科技(深圳)有限公司 行人重识别方法、装置、计算机设备及计算机存储介质
CN109784182A (zh) * 2018-12-17 2019-05-21 北京飞搜科技有限公司 行人重识别方法和装置

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160171346A1 (en) * 2014-12-15 2016-06-16 Samsung Electronics Co., Ltd. Image recognition method and apparatus, image verification method and apparatus, learning method and apparatus to recognize image, and learning method and apparatus to verify image
CN107832672A (zh) * 2017-10-12 2018-03-23 北京航空航天大学 一种利用姿态信息设计多损失函数的行人重识别方法
CN108171319A (zh) * 2017-12-05 2018-06-15 南京信息工程大学 网络连接自适应深度卷积模型的构建方法
CN109614853A (zh) * 2018-10-30 2019-04-12 国家新闻出版广电总局广播科学研究院 一种基于身体结构划分的双线性行人再识别网络构建方法
CN109784186A (zh) * 2018-12-18 2019-05-21 深圳云天励飞技术有限公司 一种行人重识别方法、装置、电子设备及计算机可读存储介质

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113139496A (zh) * 2021-05-08 2021-07-20 青岛根尖智能科技有限公司 一种基于时序多尺度融合的行人重识别方法及***
CN113591547A (zh) * 2021-06-15 2021-11-02 阿里云计算有限公司 图像处理方法、装置、电子设备及计算机可读存储介质

Also Published As

Publication number Publication date
CN110874574B (zh) 2024-05-07
CN110874574A (zh) 2020-03-10

Similar Documents

Publication Publication Date Title
WO2021082078A1 (zh) 行人重识别方法、装置、计算机设备及可读存储介质
US11908238B2 (en) Methods and systems for facial point-of-recognition (POR) provisioning
WO2019109526A1 (zh) 人脸图像的年龄识别方法、装置及存储介质
US10733421B2 (en) Method for processing video, electronic device and storage medium
WO2020098250A1 (zh) 字符识别方法、服务器及计算机可读存储介质
WO2019120115A1 (zh) 人脸识别的方法、装置及计算机装置
CN110532884B (zh) 行人重识别方法、装置及计算机可读存储介质
CN109284733B (zh) 一种基于yolo和多任务卷积神经网络的导购消极行为监控方法
CN110362677B (zh) 文本数据类别的识别方法及装置、存储介质、计算机设备
CN110348362B (zh) 标签生成、视频处理方法、装置、电子设备及存储介质
WO2021051547A1 (zh) 暴力行为检测方法及***
CN111310705A (zh) 图像识别方法、装置、计算机设备及存储介质
US10650234B2 (en) Eyeball movement capturing method and device, and storage medium
CN110163061B (zh) 用于提取视频指纹的方法、装置、设备和计算机可读介质
CN108108711B (zh) 人脸布控方法、电子设备及存储介质
WO2023142551A1 (zh) 模型训练及图像识别方法和装置、设备、存储介质和计算机程序产品
CN113111880B (zh) 证件图像校正方法、装置、电子设备及存储介质
Ma et al. Random projection-based partial feature extraction for robust face recognition
CN111476310A (zh) 一种图像分类方法、装置及设备
An et al. Person re-identification via hypergraph-based matching
WO2019242156A1 (zh) 终端中的应用控制方法和装置及计算机可读存储介质
Hermosilla et al. An enhanced representation of thermal faces for improving local appearance-based face recognition
Kadambari et al. Automation of attendance system using facial recognition
Boka et al. Person recognition for access logging
CN113723310B (zh) 基于神经网络的图像识别方法及相关装置

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19950425

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 16/09/2022)

122 Ep: pct application non-entry in european phase

Ref document number: 19950425

Country of ref document: EP

Kind code of ref document: A1