CN111898448A - Pedestrian attribute identification method and system based on deep learning - Google Patents


Info

Publication number
CN111898448A
Authority
CN
China
Prior art keywords
image
color cast
scene
color
pixel
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010614419.1A
Other languages
Chinese (zh)
Other versions
CN111898448B (en)
Inventor
贾川民
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Peking University
Original Assignee
Peking University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Peking University
Priority to CN202010614419.1A
Publication of CN111898448A
Application granted
Publication of CN111898448B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00: Scenes; Scene-specific elements
    • G06V20/50: Context or environment of the image
    • G06V20/52: Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T: CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00: Road transport of goods or passengers
    • Y02T10/10: Internal combustion engine [ICE] based vehicles
    • Y02T10/40: Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)

Abstract

The application provides a pedestrian attribute identification method and system based on deep learning. The method comprises the following steps: performing scene recognition on an image to be recognized, whose content includes pedestrians, and extracting a first scene image from it according to the recognition result; inputting the first scene image into a pre-trained scene color cast reduction neural network model, which outputs a corresponding second scene image free of color cast; calculating color cast information between the first scene image and the second scene image; performing color cast reduction on the image to be recognized according to that color cast information, to obtain the image to be recognized after color cast reduction; and identifying pedestrian attributes from the color-cast-reduced image. The technical scheme of the application addresses the problem that pedestrian attributes cannot be identified, or are identified wrongly, because of color cast in surveillance video frames captured under neon lighting at night.

Description

Pedestrian attribute identification method and system based on deep learning
Technical Field
The application relates to the technical field of image processing, in particular to a pedestrian attribute identification method and system based on deep learning.
Background
As a social security safeguard, video surveillance systems are widely deployed in daily life; cameras and other video acquisition equipment can be seen everywhere in public places such as banks, shopping malls, supermarkets, hotels, street corners, intersections and toll stations. Their installation has greatly improved public safety, monitoring and recording unlawful behavior in real time and providing public security organs with a large amount of reliable clues for solving cases.
With the development of technology and rising living standards, there is a demand for automatic pedestrian attribute identification in video surveillance, to further identify, analyze and track pedestrians; the pedestrian attributes include skin color, clothing color, hair color and the like. Because pedestrian attribute identification operates on captured surveillance video, footage shot under sunlight or bright artificial light shows only slight color cast and can be identified accurately with current pedestrian attribute identification technology. Surveillance video shot under neon lighting at night, however, often has severe color cast; and with the wide use of colored lighting equipment such as multicolored neon lamps, light strips and light boxes, the cast frequently shifts between blue, red and other hues, and its degree also keeps changing. This makes quantitative, automatic color cast correction of the surveillance video difficult, which in turn prevents accurate automatic identification of pedestrian attributes such as skin color, clothing color and hair color.
Disclosure of Invention
The application aims to provide a pedestrian attribute identification method and system based on deep learning.
The application provides a pedestrian attribute identification method based on deep learning in a first aspect, which comprises the following steps:
carrying out scene recognition on an image to be recognized, and extracting a first scene image in the image to be recognized according to a recognition result, wherein the image content of the image to be recognized comprises pedestrians;
inputting the first scene image into a pre-trained scene color cast reduction neural network model, and outputting a non-color cast second scene image corresponding to the first scene image by using the scene color cast reduction neural network model;
calculating color cast information between the first scene image and the second scene image;
performing color cast reduction on the image to be identified according to the color cast information to obtain the image to be identified after color cast reduction;
and identifying the attribute of the pedestrian according to the image to be identified after the color cast reduction.
In some embodiments of the first aspect of the present application, before inputting the first scene image into a pre-trained scene color cast reduction neural network model and outputting a non-color cast second scene image corresponding to the first scene image by using the scene color cast reduction neural network model, the method further includes:
acquiring a plurality of groups of image sample groups, wherein each group of image sample group comprises a first image sample in a color cast state and a second image sample in a non-color cast state, and the first image sample and the second image sample contain the same content;
respectively carrying out scene recognition on all the first image samples and all the second image samples, extracting a first scene sample in each first image sample and a second scene sample in each second image sample according to recognition results, and obtaining a plurality of groups of scene sample groups corresponding to the plurality of groups of image sample groups on the basis of the first scene samples and the second scene samples;
and training the scene color cast reduction neural network model on the plurality of groups of scene sample groups, using the first scene samples as input and the second scene samples as expected output, to obtain the trained scene color cast reduction neural network model.
In some embodiments of the first aspect of the present application, the second image sample is obtained by a user manually performing color cast reduction processing on the first image sample.
In some embodiments of the first aspect of the present application, the scene color cast reduction neural network model is implemented using a generative adversarial network (GAN), a deep convolutional GAN (DCGAN), a coupled GAN (CoGAN), or a self-attention GAN (SAGAN).
In some embodiments of the first aspect of the present application, the image to be identified is taken from a surveillance video stream taken by a surveillance camera;
the scene recognition is carried out on the image to be recognized, and a first scene image in the image to be recognized is extracted according to a recognition result, and the method comprises the following steps:
detecting the pedestrians in the image to be recognized by adopting an interframe difference method or an optical flow field method, and determining the region where the pedestrians are located;
and extracting other areas except the area where the pedestrian is located in the image to be recognized as a first scene image.
In some embodiments of the first aspect of the present application, the performing scene recognition on the image to be recognized, and extracting a first scene image in the image to be recognized according to a recognition result includes:
detecting a salient region in the image to be identified by adopting a saliency detection algorithm, wherein the salient region is a region where a pedestrian is located;
extracting non-salient regions except the salient region in the image to be identified as a first scene image.
In some embodiments of the first aspect of the present application, the calculating color cast information between the first scene image and the second scene image comprises:
determining a plurality of color channels;
for each color channel, determining channel color cast information of the first scene image and the second scene image in the color channel according to the value of each pixel in the first scene image in the color channel and the value of each pixel in the second scene image in the color channel;
and determining color cast information between the first scene image and the second scene image according to the channel color cast information corresponding to each color channel.
In some embodiments of the first aspect of the present application, the color cast information comprises pixel color cast information for individual pixels in the first scene image;
the color cast restoration of the image to be recognized according to the color cast information to obtain the image to be recognized after color cast restoration, including:
predicting pixel color cast information of each pixel in the region where the pedestrian is located in the image to be identified according to the pixel color cast information of each pixel in the first scene image;
determining pixel color cast information of each pixel in the image to be identified according to the pixel color cast information of each pixel in the first scene image and the pixel color cast information of each pixel in the area where the pedestrian is located;
and performing color cast reduction on the image to be recognized according to the pixel color cast information of each pixel in the image to be recognized to obtain the image to be recognized after color cast reduction.
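The claims above leave open how the pixel color cast of the pedestrian region is predicted from the scene pixels. One minimal sketch, assuming the cast around the pedestrian is roughly uniform, is to fill the pedestrian region with the mean per-pixel cast of the scene pixels; the function name and this fill strategy are illustrative assumptions, not the claimed method.

```python
import numpy as np

def fill_pedestrian_cast(pixel_cast, pedestrian_mask):
    """Estimate per-pixel cast inside the pedestrian region.

    Assumption (not from the patent): every pedestrian pixel gets the
    mean cast of the surrounding scene pixels, yielding a full-image
    per-pixel color cast map.
    """
    full = pixel_cast.copy()
    scene_mean = pixel_cast[~pedestrian_mask].mean(axis=0)  # (3,) per channel
    full[pedestrian_mask] = scene_mean
    return full

# toy cast map: scene pixels carry a uniform red cast of 4
cast = np.zeros((5, 5, 3))
cast[..., 0] = 4.0
mask = np.zeros((5, 5), dtype=bool)
mask[1:3, 1:3] = True        # pedestrian region, cast unknown there
cast[mask] = 0.0             # zero out the unknown region
filled = fill_pedestrian_cast(cast, mask)
```

A smoother scheme (e.g. distance-weighted interpolation from the region border) would fit the same interface.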
In some embodiments of the first aspect of the present application, the performing pedestrian attribute recognition according to the image to be recognized after color cast reduction includes:
identifying at least one of the following attributes of the pedestrians contained in the image to be recognized after color cast reduction: skin tone, hair color, and clothing color.
A second aspect of the present application provides a pedestrian attribute identification system based on deep learning, including:
the first scene image extraction module is used for carrying out scene identification on an image to be identified and extracting a first scene image in the image to be identified according to an identification result, wherein the image content of the image to be identified comprises pedestrians;
the second scene image output module is used for inputting the first scene image into a pre-trained scene color cast reduction neural network model and outputting a non-color cast second scene image corresponding to the first scene image by using the scene color cast reduction neural network model;
a color cast information calculation module for calculating color cast information between the first scene image and the second scene image;
the color cast reduction module is used for performing color cast reduction on the image to be identified according to the color cast information to obtain the image to be identified after color cast reduction;
and the pedestrian attribute identification module is used for identifying the pedestrian attribute according to the image to be identified after the color cast reduction.
In some embodiments of the second aspect of the present application, the system further comprises:
the image sample group acquisition module is used for acquiring a plurality of groups of image sample groups, wherein each group of image sample groups comprises a first image sample in a color cast state and a second image sample in a non-color cast state, and the first image sample and the second image sample contain the same content;
a scene sample extraction module, configured to perform scene identification on all the first image samples and the second image samples, extract a first scene sample in each first image sample and a second scene sample in each second image sample according to an identification result, and obtain multiple groups of scene sample groups corresponding to the multiple groups of image sample groups based on the first scene sample and the second scene sample;
and the model training module is used for training the scene color cast reduction neural network model on the plurality of groups of scene sample groups, using the first scene samples as input and the second scene samples as expected output, to obtain the trained scene color cast reduction neural network model.
In some embodiments of the second aspect of the present application, the second image sample is obtained by a user manually performing a color cast reduction process on the first image sample.
In some embodiments of the second aspect of the present application, the scene color cast reduction neural network model is implemented using a generative adversarial network (GAN), a deep convolutional GAN (DCGAN), a coupled GAN (CoGAN), or a self-attention GAN (SAGAN).
In some embodiments of the second aspect of the present application, the image to be identified is taken from a surveillance video stream taken by a surveillance camera;
the first scene image extraction module comprises:
the pedestrian area identification unit is used for detecting pedestrians in the image to be identified by adopting an interframe difference method or an optical flow field method and determining the area where the pedestrians are located;
and the first scene image extraction unit is used for extracting other areas except the area where the pedestrian is located in the image to be identified as first scene images.
In some embodiments of the second aspect of the present application, the first scene image extraction module comprises:
the salient region detection unit is used for detecting a salient region in the image to be recognized by adopting a saliency detection algorithm, and the salient region is a region where a pedestrian is located;
and the non-significant region extracting unit is used for extracting non-significant regions except the significant region in the image to be identified as a first scene image.
In some embodiments of the second aspect of the present application, the color cast information calculation module includes:
a color channel determination unit for determining a plurality of color channels;
a channel color cast information calculation unit, configured to determine, for each color channel, channel color cast information of the first scene image and the second scene image in the color channel according to a value of each pixel in the first scene image in the color channel and a value of each pixel in the second scene image in the color channel;
a color cast information determining unit, configured to determine color cast information between the first scene image and the second scene image according to the channel color cast information corresponding to each color channel.
In some embodiments of the second aspect of the present application, the color cast information comprises pixel color cast information for individual pixels in the first scene image;
the color cast reduction module comprises:
the pedestrian region color cast prediction unit is used for predicting pixel color cast information of each pixel in a region where a pedestrian is located in the image to be recognized according to the pixel color cast information of each pixel in the first scene image;
the all-color-cast information determining unit is used for determining the pixel color cast information of each pixel in the image to be identified according to the pixel color cast information of each pixel in the first scene image and the pixel color cast information of each pixel in the area where the pedestrian is located;
and the color cast reduction unit is used for performing color cast reduction on the image to be recognized according to the pixel color cast information of each pixel in the image to be recognized to obtain the image to be recognized after color cast reduction.
In some embodiments of the second aspect of the present application, the pedestrian attribute identification module comprises:
the pedestrian attribute identification unit is used for identifying at least one attribute of the following pedestrians contained in the image to be identified after the color cast reduction: skin tone, hair color, and clothing color.
Compared with the prior art, the pedestrian attribute identification method based on deep learning provided by the application performs scene recognition on an image to be recognized, whose content includes pedestrians, and extracts a first scene image from it according to the recognition result. The first scene image is then input into a pre-trained scene color cast reduction neural network model, which outputs a corresponding second scene image free of color cast. Color cast information between the first scene image and the second scene image is then calculated, color cast reduction is performed on the image to be recognized according to that information to obtain the image to be recognized after color cast reduction, and pedestrian attribute recognition is carried out on the color-cast-reduced image.
Considering that the pedestrians in the image to be recognized change frequently, the method trains a scene color cast reduction neural network model in advance, performs scene recognition on the image to be recognized, extracts a first scene image, and uses the model to produce a color-cast-reduced second scene image, from which the color cast information of the first scene image can be calculated. Because the color cast of the first scene image reflects, at least to some extent, the color cast of the whole image to be recognized, that information can be used to perform color cast reduction on the entire image, after which pedestrian attributes can be identified accurately. On this basis, the technical scheme solves the problem that pedestrian attributes cannot be identified, or are identified wrongly, because of color cast in surveillance video frames captured under neon lighting at night.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the application. Also, like reference numerals are used to refer to like parts throughout the drawings. In the drawings:
fig. 1 illustrates a flow chart of a deep learning based pedestrian attribute identification method provided by some embodiments of the present application;
FIG. 2 illustrates a schematic diagram of an image to be recognized provided by some embodiments of the present application;
FIG. 3 illustrates a schematic diagram of a first scene image provided by some embodiments of the present application;
fig. 4 illustrates a schematic diagram of a deep learning based pedestrian attribute identification system provided by some embodiments of the present application.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
It is to be noted that, unless otherwise specified, technical or scientific terms used herein shall have the ordinary meaning as understood by those skilled in the art to which this application belongs.
In addition, the terms "first" and "second", etc. are used to distinguish different objects, rather than to describe a particular order. Furthermore, the terms "include" and "have," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements listed, but may alternatively include other steps or elements not listed, or inherent to such process, method, article, or apparatus.
The embodiment of the application provides a pedestrian attribute identification method and system based on deep learning, and the following description is given by combining the embodiment and the accompanying drawings for illustration.
Referring to fig. 1, which shows a flowchart of a pedestrian attribute identification method based on deep learning according to some embodiments of the present application, as shown in fig. 1, the pedestrian attribute identification method based on deep learning may include the following steps:
step S101: carrying out scene recognition on an image to be recognized, and extracting a first scene image in the image to be recognized according to a recognition result, wherein the image content of the image to be recognized comprises pedestrians;
step S102: inputting the first scene image into a pre-trained scene color cast reduction neural network model, and outputting a non-color cast second scene image corresponding to the first scene image by using the scene color cast reduction neural network model;
step S103: calculating color cast information between the first scene image and the second scene image;
step S104: performing color cast reduction on the image to be identified according to the color cast information to obtain the image to be identified after color cast reduction;
step S105: and identifying the attribute of the pedestrian according to the image to be identified after the color cast reduction.
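The five steps above can be sketched end to end. Everything below is illustrative, not from the patent: the function names are invented, and the trained neural network of step S102 is replaced by a simple callable stand-in so the flow of data between the steps is visible.

```python
import numpy as np

def extract_scene(image, pedestrian_mask):
    """S101: mask out the pedestrian region, keeping only scene pixels."""
    scene = image.astype(np.float64).copy()
    scene[pedestrian_mask] = np.nan   # pedestrian pixels marked as missing
    return scene

def restore_scene(scene_image, model):
    """S102: run the pre-trained color cast reduction model (a stub here)."""
    return model(scene_image)

def color_cast_info(first_scene, second_scene):
    """S103: per-channel mean difference over valid (scene) pixels."""
    return np.nanmean(first_scene - second_scene, axis=(0, 1))

def reduce_color_cast(image, cast):
    """S104: subtract the estimated cast from every pixel and clip."""
    return np.clip(image.astype(np.float64) - cast, 0, 255).astype(np.uint8)

# S105 (attribute recognition) would consume `corrected` below.
rng = np.random.default_rng(0)
img = rng.integers(0, 200, size=(8, 8, 3)).astype(np.uint8)
mask = np.zeros((8, 8), dtype=bool)
mask[2:5, 2:5] = True                 # pretend a pedestrian occupies this region
first = extract_scene(img, mask)
second = restore_scene(first, model=lambda s: s - 10.0)  # stand-in model
cast = color_cast_info(first, second)
corrected = reduce_color_cast(img, cast)
```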
Compared with the prior art, the pedestrian attribute identification method based on deep learning provided by the embodiment of the application performs scene recognition on an image to be recognized, whose content includes pedestrians, and extracts a first scene image from it according to the recognition result. The first scene image is then input into a pre-trained scene color cast reduction neural network model, which outputs a corresponding second scene image free of color cast. Color cast information between the first scene image and the second scene image is then calculated, color cast reduction is performed on the image to be recognized according to that information to obtain the image to be recognized after color cast reduction, and pedestrian attribute recognition is carried out on the color-cast-reduced image.
Considering that the pedestrians in the image to be recognized change frequently, the method trains a scene color cast reduction neural network model in advance, performs scene recognition on the image to be recognized, extracts a first scene image, and uses the model to produce a color-cast-reduced second scene image, from which the color cast information of the first scene image can be calculated. Because the color cast of the first scene image reflects, at least to some extent, the color cast of the whole image to be recognized, that information can be used to perform color cast reduction on the entire image, after which pedestrian attributes can be identified accurately. On this basis, the technical scheme solves the problem that pedestrian attributes cannot be identified, or are identified wrongly, because of color cast in surveillance video frames captured under neon lighting at night.
In some modified embodiments of the embodiment of the present application, the image to be identified is taken from a surveillance video stream shot by a surveillance camera, and may be any frame in the surveillance video stream;
the scene recognition is carried out on the image to be recognized, and a first scene image in the image to be recognized is extracted according to a recognition result, and the method comprises the following steps:
detecting the pedestrians in the image to be recognized by adopting an interframe difference method or an optical flow field method, and determining the region where the pedestrians are located;
and extracting other areas except the area where the pedestrian is located in the image to be recognized as a first scene image.
Referring to fig. 2 and fig. 3, fig. 2 shows a schematic diagram of an image to be recognized provided by some embodiments of the present application, and fig. 3 shows a schematic diagram of a first scene image provided by some embodiments of the present application. Based on fig. 2, the pedestrian in the image to be recognized may be detected using the inter-frame difference method or the optical flow field method and the region where the pedestrian is located determined; the remaining regions of the image to be recognized are then extracted as the first scene image.
The inter-frame difference method and the optical flow field method work particularly well on images whose background (i.e., the scene) is static and unchanging while the foreground (i.e., the pedestrian) changes dynamically, so the region where the pedestrian is located can be identified more accurately.
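As a rough illustration of the inter-frame difference idea (the optical flow field method is not sketched), the grayscale change between consecutive frames can be thresholded to separate the moving pedestrian from the static scene. The threshold of 25 and the function names are illustrative choices, not values from the patent.

```python
import numpy as np

def frame_difference_mask(prev_frame, curr_frame, threshold=25.0):
    """Mark pixels whose grayscale change between frames exceeds a threshold.

    Static scene pixels change little between frames; a moving pedestrian
    changes a lot. The threshold is an illustrative assumption.
    """
    prev_gray = prev_frame.mean(axis=2)
    curr_gray = curr_frame.mean(axis=2)
    return np.abs(curr_gray - prev_gray) > threshold

def first_scene_image(curr_frame, motion_mask):
    """Zero out the moving (pedestrian) region; the rest is the scene image."""
    scene = curr_frame.copy()
    scene[motion_mask] = 0
    return scene

# toy "video": a 2x2 block appears between two otherwise black frames
prev = np.zeros((6, 6, 3), dtype=np.float64)
curr = prev.copy()
curr[1:3, 1:3] = 200.0               # the moving "pedestrian"
mask = frame_difference_mask(prev, curr)
scene = first_scene_image(curr, mask)
```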
In some modified embodiments of the embodiment of the present application, the performing scene recognition on the image to be recognized, and extracting a first scene image in the image to be recognized according to a recognition result includes:
detecting a salient region in the image to be identified by adopting a saliency detection algorithm, wherein the salient region is a region where a pedestrian is located;
extracting non-salient regions except the salient region in the image to be identified as a first scene image.
The saliency detection algorithm has the advantages of high efficiency and low system load, so adopting this embodiment of the application can effectively improve operating efficiency and reduce the load on the system.
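A saliency detection algorithm in the spirit described here could be, for example, the spectral residual method; the simplified sketch below (a crude 3x3 mean filter, no post-smoothing) is an assumption about one possible choice, not an algorithm the patent mandates.

```python
import numpy as np

def spectral_residual_saliency(gray):
    """Spectral-residual-style saliency sketch.

    Returns a non-negative map; larger values indicate more salient
    pixels. The 3x3 averaging and missing Gaussian post-smoothing are
    simplifications of the published method.
    """
    f = np.fft.fft2(gray)
    log_amp = np.log(np.abs(f) + 1e-8)
    phase = np.angle(f)
    # 3x3 local average of the log-amplitude via edge padding and shifts
    h, w = log_amp.shape
    padded = np.pad(log_amp, 1, mode="edge")
    avg = sum(padded[i:i + h, j:j + w]
              for i in range(3) for j in range(3)) / 9.0
    residual = log_amp - avg
    return np.abs(np.fft.ifft2(np.exp(residual + 1j * phase))) ** 2

gray = np.full((32, 32), 50.0)
gray[10:14, 10:14] = 220.0   # a visually distinct "pedestrian" patch
sal = spectral_residual_saliency(gray)
```

Thresholding `sal` would yield the salient (pedestrian) region; its complement is the first scene image.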
In some modifications of the embodiments of the present application, before step S102, the method may further include:
acquiring a plurality of groups of image sample groups, wherein each group of image sample group comprises a first image sample in a color cast state and a second image sample in a non-color cast state, and the first image sample and the second image sample contain the same content;
respectively carrying out scene recognition on all the first image samples and all the second image samples, extracting a first scene sample in each first image sample and a second scene sample in each second image sample according to recognition results, and obtaining a plurality of groups of scene sample groups corresponding to the plurality of groups of image sample groups on the basis of the first scene samples and the second scene samples;
and training the scene color cast reduction neural network model on the plurality of groups of scene sample groups, using the first scene samples as input and the second scene samples as expected output, to obtain the trained scene color cast reduction neural network model.
For example, the user may perform color cast correction on a first scene sample exhibiting color cast by using image editing software ("color cast correction" here means the same as "color cast reduction"), thereby obtaining a second scene sample without color cast, and then train the scene color cast reduction neural network model using the first scene sample as input and the second scene sample as expected output, so that the model acquires the capability of automatically correcting the color cast of scene images.
On the basis of any implementation manner of the present application, in some specific implementation manners, the scene color cast reduction neural network model is implemented using a generative adversarial network (GAN), a deep convolutional GAN (DCGAN), a coupled GAN (CoGAN), or a self-attention GAN (SAGAN); these networks have the capability of generating pictures from pictures, so they can achieve the purpose of the embodiments of the present application.
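The GAN-family generator here maps a color-cast scene image to a non-color-cast one. As a deliberately simple, testable stand-in for that generator (not the patent's network), the sketch below fits a per-channel gain and offset by least squares on (color-cast, corrected) scene pairs; it learns the same input-to-output mapping under the added assumption of a global linear cast.

```python
import numpy as np

def fit_color_restorer(cast_scenes, clean_scenes):
    """Fit per-channel mapping cast -> clean by least squares.

    Stand-in for the GAN generator: same training pairs (first scene
    sample in, second scene sample out), but a linear model per channel.
    """
    params = []
    for c in range(3):
        x = np.concatenate([s[..., c].ravel() for s in cast_scenes])
        y = np.concatenate([s[..., c].ravel() for s in clean_scenes])
        A = np.stack([x, np.ones_like(x)], axis=1)
        gain, offset = np.linalg.lstsq(A, y, rcond=None)[0]
        params.append((gain, offset))
    return params

def apply_color_restorer(image, params):
    """Apply the fitted per-channel gain/offset and clip to [0, 255]."""
    out = np.empty_like(image, dtype=np.float64)
    for c, (gain, offset) in enumerate(params):
        out[..., c] = gain * image[..., c] + offset
    return np.clip(out, 0, 255)

rng = np.random.default_rng(1)
clean = rng.uniform(0, 255, size=(4, 16, 16, 3))   # second scene samples
cast = clean * 0.8 + 20.0                          # synthetic global cast
params = fit_color_restorer(list(cast), list(clean))
restored = apply_color_restorer(cast[0], params)
```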
In some variations of embodiments of the present application, the calculating color cast information between the first scene image and the second scene image includes:
determining a plurality of color channels;
for each color channel, determining channel color cast information of the first scene image and the second scene image in the color channel according to the value of each pixel in the first scene image in the color channel and the value of each pixel in the second scene image in the color channel;
and determining color cast information between the first scene image and the second scene image according to the channel color cast information corresponding to each color channel.
The color channels may be the RGB three-color channels, or the L, a, and b channels of the LAB color space, and the like, which is not limited in the embodiments of the present application. In a specific implementation, taking the a channel as an example, the sum of the values of all pixels of the first scene image in that channel may be calculated and denoted as a1, and the sum of the values of all pixels of the second scene image in that channel may be calculated and denoted as a2. Their difference da = a1 - a2 is the channel color cast information of the first scene image and the second scene image in the a channel. Alternatively, da may be divided by the number of pixels p of the first scene image to obtain the average color cast information dav = da / p in the a channel; dav may likewise serve as the channel color cast information, which can also achieve the purpose of the embodiments of the present application.
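The channel computation described above (da = a1 - a2, dav = da / p) might be sketched as follows, assuming the scene images are given as H x W x C arrays of equal size; the function name is hypothetical:

```python
import numpy as np

def channel_color_cast(first_scene, second_scene, channel):
    # Sum of the channel values over each scene image (a1 and a2 in the text).
    a1 = float(first_scene[..., channel].sum())
    a2 = float(second_scene[..., channel].sum())
    da = a1 - a2                                    # total channel color cast
    p = first_scene.shape[0] * first_scene.shape[1]
    dav = da / p                                    # average color cast per pixel
    return da, dav
```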
It should be noted that the color cast information may describe the overall color cast condition of the first scene image, and its value may be a single value for the entire first scene image in one color channel. During subsequent color cast correction, the correction is applied globally according to that value. For example, when the channel color cast information of the a channel is dav = 3, which indicates a red cast, the a-channel value of every pixel of the entire first scene image may be reduced by 3 to implement the correction. Color cast correction of the b channel can be performed in the same manner, and details are not repeated here.
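A global correction along the lines of the example above could look like the following sketch; the function name is hypothetical, and the clamping range is an assumption (e.g. 8-bit storage of LAB channels commonly uses 0-255):

```python
import numpy as np

def correct_channel_globally(image, channel, dav, lo=0.0, hi=255.0):
    # Subtract the average color cast dav from one channel of the whole
    # image, clamping the result to the assumed valid channel range.
    corrected = image.astype(np.float64)
    corrected[..., channel] -= dav
    return np.clip(corrected, lo, hi)
```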
In some modifications of the embodiments of the present application, the color cast information may also comprise pixel color cast information of each pixel in the first scene image, obtained, for example, by comparing each pixel of the first scene image with the corresponding pixel of the second scene image one by one. Accordingly, performing color cast reduction on the image to be recognized according to the color cast information to obtain the color-cast-reduced image to be recognized may comprise:
predicting pixel color cast information of each pixel in the region where the pedestrian is located in the image to be identified according to the pixel color cast information of each pixel in the first scene image;
determining pixel color cast information of each pixel in the image to be identified according to the pixel color cast information of each pixel in the first scene image and the pixel color cast information of each pixel in the area where the pedestrian is located;
and performing color cast reduction on the image to be recognized according to the pixel color cast information of each pixel in the image to be recognized to obtain the image to be recognized after color cast reduction.
Through this embodiment, the pixel color cast information of the region where the pedestrian is located can be predicted from the pixel color cast information of the first scene image. Specifically, a trend prediction method may be adopted: for example, for each row of the image to be recognized, the missing pixel color cast information in the pedestrian region may be determined by linear regression fitting on the pixel color cast information of the known pixels in that row, thereby determining the pixel color cast information of each pixel in the pedestrian region. In the subsequent color cast correction, each pixel is corrected separately according to its current color and its pixel color cast information, and the color-cast-reduced image to be recognized is finally obtained.
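The row-wise trend prediction described above might be sketched as follows; the function name and the use of a first-degree `numpy.polyfit` are illustrative assumptions:

```python
import numpy as np

def fill_row_color_cast(row_cast, pedestrian_mask):
    # row_cast: 1-D per-pixel color cast values for one image row; entries
    # under pedestrian_mask are unknown and are to be predicted from the
    # known scene pixels in the same row.
    x = np.arange(row_cast.size, dtype=np.float64)
    known = ~pedestrian_mask
    # Fit cast ~= slope * x + intercept on the known scene pixels.
    slope, intercept = np.polyfit(x[known], row_cast[known], deg=1)
    filled = row_cast.astype(np.float64)
    filled[pedestrian_mask] = slope * x[pedestrian_mask] + intercept
    return filled
```

Each color channel's per-pixel cast map would be filled row by row in this manner before the per-pixel correction.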
In some modifications of the embodiments of the present application, the performing of the pedestrian attribute recognition according to the image to be recognized after the color cast reduction includes:
identifying at least one of the following attributes of the pedestrian contained in the color-cast-reduced image to be recognized: skin tone, hair color, and clothing color.
This step can be implemented directly by, or by adapting, any pedestrian attribute identification method disclosed in the prior art; the specific implementation of this step is not limited in the present application.
Considering that skin tone, hair color, and clothing color are easily unrecognizable or misrecognized under illumination such as neon lamps, these pedestrian attributes can be recognized more accurately after the automatic color cast correction of this embodiment, which solves the problem that pedestrian attributes cannot be recognized, or are misrecognized, due to the color cast of surveillance video pictures under neon lighting at night.
Based on the same inventive concept, the embodiments of the present application further provide a pedestrian attribute identification system based on deep learning. The system can implement the above pedestrian attribute identification method based on deep learning, and can be realized in software, in hardware, or in a combination of software and hardware. For example, the system may comprise integrated or separate functional modules or units to perform the corresponding steps of the above method. Referring to fig. 4, a schematic diagram of a deep learning based pedestrian attribute identification system according to some embodiments of the present application is shown. Since the system embodiments are substantially similar to the method embodiments, they are described relatively simply, and reference may be made to the corresponding descriptions of the method embodiments for relevant points. The system embodiments described below are merely illustrative.
As shown in fig. 4, the deep learning based pedestrian attribute recognition system 10 may include:
the first scene image extraction module 101 is configured to perform scene recognition on an image to be recognized, and extract a first scene image in the image to be recognized according to a recognition result, where image content of the image to be recognized includes a pedestrian;
a second scene image output module 102, configured to input the first scene image into a pre-trained scene color cast reduction neural network model, and output a non-color cast second scene image corresponding to the first scene image by using the scene color cast reduction neural network model;
a color cast information calculation module 103, configured to calculate color cast information between the first scene image and the second scene image;
the color cast reduction module 104 is configured to perform color cast reduction on the image to be identified according to the color cast information to obtain an image to be identified after color cast reduction;
and the pedestrian attribute identification module 105 is used for identifying the pedestrian attributes according to the image to be identified after the color cast reduction.
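For illustration only, the way the five modules cooperate could be sketched as follows; the dictionary keys and the callables are hypothetical stand-ins for modules 101-105, not an implementation the application prescribes:

```python
def recognize_pedestrian_attributes(image, modules):
    # modules: dict of hypothetical callables standing in for the five
    # system modules; any concrete implementations may be plugged in.
    first_scene = modules["extract_first_scene"](image)        # module 101
    second_scene = modules["restore_scene"](first_scene)       # module 102
    cast_info = modules["compute_color_cast"](first_scene,
                                              second_scene)    # module 103
    restored = modules["reduce_color_cast"](image, cast_info)  # module 104
    return modules["identify_attributes"](restored)            # module 105
```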
In some embodiments of the second aspect of the present application, the system 10, further comprises:
the image sample group acquisition module is used for acquiring a plurality of groups of image sample groups, wherein each group of image sample groups comprises a first image sample in a color cast state and a second image sample in a non-color cast state, and the first image sample and the second image sample contain the same content;
a scene sample extraction module, configured to perform scene identification on all the first image samples and the second image samples, extract a first scene sample in each first image sample and a second scene sample in each second image sample according to an identification result, and obtain multiple groups of scene sample groups corresponding to the multiple groups of image sample groups based on the first scene sample and the second scene sample;
and the model training module is used for training the scene color cast reduction neural network model by using the first scene samples in the plurality of groups of scene sample groups as input and the corresponding second scene samples as expected output, so as to obtain the trained scene color cast reduction neural network model.
In some embodiments of the second aspect of the present application, the second image sample is obtained by a user manually performing a color cast reduction process on the first image sample.
In some embodiments of the second aspect of the present application, the scene color cast reduction neural network model is implemented using a generative adversarial network (GAN), a deep convolutional GAN (DCGAN), a coupled GAN (CoGAN), or a self-attention GAN (SAGAN).
In some embodiments of the second aspect of the present application, the image to be identified is taken from a surveillance video stream taken by a surveillance camera;
the first scene image extraction module 101 includes:
the pedestrian area identification unit is used for detecting pedestrians in the image to be identified by adopting an interframe difference method or an optical flow field method and determining the area where the pedestrians are located;
and the first scene image extraction unit is used for extracting other areas except the area where the pedestrian is located in the image to be identified as first scene images.
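A minimal inter-frame difference sketch for the pedestrian area identification unit might look like the following; the threshold value is an assumption, and real systems typically add denoising and morphological cleanup:

```python
import numpy as np

def frame_difference_mask(prev_frame, cur_frame, threshold=25):
    # prev_frame / cur_frame: grayscale H x W frames from the surveillance
    # stream. Pixels whose absolute change exceeds the threshold are
    # treated as moving (pedestrian candidates); the remaining pixels
    # form the first scene image candidate.
    diff = np.abs(cur_frame.astype(np.int32) - prev_frame.astype(np.int32))
    return diff > threshold
```

Inverting the returned mask yields the scene region extracted by the first scene image extraction unit.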
In some embodiments of the second aspect of the present application, the first scene image extraction module 101 includes:
the salient region detection unit is used for detecting a salient region in the image to be recognized by adopting a saliency detection algorithm, and the salient region is a region where a pedestrian is located;
and the non-significant region extracting unit is used for extracting non-significant regions except the significant region in the image to be identified as a first scene image.
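As a hedged illustration only, a very rough saliency map (color distance from the global mean color, not the specific saliency detection algorithm the embodiment may use) could be computed as:

```python
import numpy as np

def simple_saliency(image):
    # Each pixel's saliency is its color distance from the global mean
    # color, so a pedestrian that stands out from the scene scores high.
    img = image.astype(np.float64)
    mean_color = img.reshape(-1, img.shape[-1]).mean(axis=0)
    return np.linalg.norm(img - mean_color, axis=-1)
```

Thresholding this map would separate the salient (pedestrian) region from the non-salient scene region.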
In some embodiments of the second aspect of the present application, the color cast information calculating module 103 includes:
a color channel determination unit for determining a plurality of color channels;
a channel color cast information calculation unit, configured to determine, for each color channel, channel color cast information of the first scene image and the second scene image in the color channel according to a value of each pixel in the first scene image in the color channel and a value of each pixel in the second scene image in the color channel;
a color cast information determining unit, configured to determine color cast information between the first scene image and the second scene image according to the channel color cast information corresponding to each color channel.
In some embodiments of the second aspect of the present application, the color cast information comprises pixel color cast information for individual pixels in the first scene image;
the color cast reduction module 104 includes:
the pedestrian region color cast prediction unit is used for predicting pixel color cast information of each pixel in a region where a pedestrian is located in the image to be recognized according to the pixel color cast information of each pixel in the first scene image;
the all-color-cast information determining unit is used for determining the pixel color cast information of each pixel in the image to be identified according to the pixel color cast information of each pixel in the first scene image and the pixel color cast information of each pixel in the area where the pedestrian is located;
and the color cast reduction unit is used for performing color cast reduction on the image to be recognized according to the pixel color cast information of each pixel in the image to be recognized to obtain the image to be recognized after color cast reduction.
In some embodiments of the second aspect of the present application, the pedestrian attribute identification module 105 comprises:
the pedestrian attribute identification unit is used for identifying at least one attribute of the following pedestrians contained in the image to be identified after the color cast reduction: skin tone, hair color, and clothing color.
The deep learning-based pedestrian attribute identification system 10 provided by the embodiment of the present application has the same beneficial effects as the deep learning-based pedestrian attribute identification method provided by the foregoing embodiment of the present application based on the same inventive concept.
It should be noted that the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one logical division, and there may be other divisions when actually implemented, and for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of devices or units through some communication interfaces, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application or portions thereof that substantially contribute to the prior art may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some or all of their technical features may be equivalently replaced; such modifications and substitutions do not depart from the spirit and scope of the present disclosure and should be construed as being covered by the claims and the specification of the present application.

Claims (10)

1. A pedestrian attribute identification method based on deep learning is characterized by comprising the following steps:
carrying out scene recognition on an image to be recognized, and extracting a first scene image in the image to be recognized according to a recognition result, wherein the image content of the image to be recognized comprises pedestrians;
inputting the first scene image into a pre-trained scene color cast reduction neural network model, and outputting a non-color cast second scene image corresponding to the first scene image by using the scene color cast reduction neural network model;
calculating color cast information between the first scene image and the second scene image;
performing color cast reduction on the image to be identified according to the color cast information to obtain the image to be identified after color cast reduction;
and identifying the attribute of the pedestrian according to the image to be identified after the color cast reduction.
2. The method according to claim 1, characterized in that the image to be recognized is taken from a surveillance video stream taken by a surveillance camera;
the scene recognition is carried out on the image to be recognized, and a first scene image in the image to be recognized is extracted according to a recognition result, and the method comprises the following steps:
detecting the pedestrians in the image to be recognized by adopting an interframe difference method or an optical flow field method, and determining the region where the pedestrians are located;
and extracting other areas except the area where the pedestrian is located in the image to be recognized as a first scene image.
3. The method according to claim 1, wherein the performing scene recognition on the image to be recognized and extracting a first scene image in the image to be recognized according to a recognition result comprises:
detecting a salient region in the image to be identified by adopting a saliency detection algorithm, wherein the salient region is a region where a pedestrian is located;
extracting non-salient regions except the salient region in the image to be identified as a first scene image.
4. The method of claim 1, wherein the calculating color cast information between the first scene image and the second scene image comprises:
determining a plurality of color channels;
for each color channel, determining channel color cast information of the first scene image and the second scene image in the color channel according to the value of each pixel in the first scene image in the color channel and the value of each pixel in the second scene image in the color channel;
and determining color cast information between the first scene image and the second scene image according to the channel color cast information corresponding to each color channel.
5. The method of claim 1, wherein the color cast information comprises pixel color cast information for each pixel in the first scene image;
the color cast restoration of the image to be recognized according to the color cast information to obtain the image to be recognized after color cast restoration, including:
predicting pixel color cast information of each pixel in the region where the pedestrian is located in the image to be identified according to the pixel color cast information of each pixel in the first scene image;
determining pixel color cast information of each pixel in the image to be identified according to the pixel color cast information of each pixel in the first scene image and the pixel color cast information of each pixel in the area where the pedestrian is located;
and performing color cast reduction on the image to be recognized according to the pixel color cast information of each pixel in the image to be recognized to obtain the image to be recognized after color cast reduction.
6. A pedestrian attribute recognition system based on deep learning, comprising:
the first scene image extraction module is used for carrying out scene identification on an image to be identified and extracting a first scene image in the image to be identified according to an identification result, wherein the image content of the image to be identified comprises pedestrians;
the second scene image output module is used for inputting the first scene image into a pre-trained scene color cast reduction neural network model and outputting a non-color cast second scene image corresponding to the first scene image by using the scene color cast reduction neural network model;
a color cast information calculation module for calculating color cast information between the first scene image and the second scene image;
the color cast reduction module is used for performing color cast reduction on the image to be identified according to the color cast information to obtain the image to be identified after color cast reduction;
and the pedestrian attribute identification module is used for identifying the pedestrian attribute according to the image to be identified after the color cast reduction.
7. The system of claim 6, wherein the image to be identified is taken from a surveillance video stream taken by a surveillance camera;
the first scene image extraction module comprises:
the pedestrian area identification unit is used for detecting pedestrians in the image to be identified by adopting an interframe difference method or an optical flow field method and determining the area where the pedestrians are located;
and the first scene image extraction unit is used for extracting other areas except the area where the pedestrian is located in the image to be identified as first scene images.
8. The system of claim 6, wherein the first scene image extraction module comprises:
the salient region detection unit is used for detecting a salient region in the image to be recognized by adopting a saliency detection algorithm, and the salient region is a region where a pedestrian is located;
and the non-significant region extracting unit is used for extracting non-significant regions except the significant region in the image to be identified as a first scene image.
9. The system of claim 6, wherein the color cast information calculation module comprises:
a color channel determination unit for determining a plurality of color channels;
a channel color cast information calculation unit, configured to determine, for each color channel, channel color cast information of the first scene image and the second scene image in the color channel according to a value of each pixel in the first scene image in the color channel and a value of each pixel in the second scene image in the color channel;
a color cast information determining unit, configured to determine color cast information between the first scene image and the second scene image according to the channel color cast information corresponding to each color channel.
10. The system of claim 6, wherein the color cast information comprises pixel color cast information for individual pixels in the first scene image;
the color cast reduction module comprises:
the pedestrian region color cast prediction unit is used for predicting pixel color cast information of each pixel in a region where a pedestrian is located in the image to be recognized according to the pixel color cast information of each pixel in the first scene image;
the all-color-cast information determining unit is used for determining the pixel color cast information of each pixel in the image to be identified according to the pixel color cast information of each pixel in the first scene image and the pixel color cast information of each pixel in the area where the pedestrian is located;
and the color cast reduction unit is used for performing color cast reduction on the image to be recognized according to the pixel color cast information of each pixel in the image to be recognized to obtain the image to be recognized after color cast reduction.
CN202010614419.1A 2020-06-30 2020-06-30 Pedestrian attribute identification method and system based on deep learning Active CN111898448B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010614419.1A CN111898448B (en) 2020-06-30 2020-06-30 Pedestrian attribute identification method and system based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010614419.1A CN111898448B (en) 2020-06-30 2020-06-30 Pedestrian attribute identification method and system based on deep learning

Publications (2)

Publication Number Publication Date
CN111898448A true CN111898448A (en) 2020-11-06
CN111898448B CN111898448B (en) 2023-10-24

Family

ID=73206513

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010614419.1A Active CN111898448B (en) 2020-06-30 2020-06-30 Pedestrian attribute identification method and system based on deep learning

Country Status (1)

Country Link
CN (1) CN111898448B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113569683A (en) * 2021-07-20 2021-10-29 上海明略人工智能(集团)有限公司 Scene classification method, system, device and medium combining salient region detection
CN115599954A (en) * 2022-12-12 2023-01-13 广东工业大学(Cn) Video question-answering method based on scene graph reasoning

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20090062440A (en) * 2007-12-13 2009-06-17 한국전자통신연구원 Multi-view matching method and device using foreground/background separation
CN102169585A (en) * 2011-03-31 2011-08-31 汉王科技股份有限公司 Method and device for detecting image color cast
CN104766049A (en) * 2015-03-17 2015-07-08 苏州科达科技股份有限公司 Method and system for recognizing object colors
CN108364270A (en) * 2018-05-22 2018-08-03 北京理工大学 Colour cast color of image restoring method and device
CN109785248A (en) * 2018-12-19 2019-05-21 新绎健康科技有限公司 One kind is for the corrected method and system of color of image
CN110276731A (en) * 2019-06-17 2019-09-24 艾瑞迈迪科技石家庄有限公司 Endoscopic image color restoring method and device
CN110298893A (en) * 2018-05-14 2019-10-01 桂林远望智能通信科技有限公司 A kind of pedestrian wears the generation method and device of color identification model clothes
CN111292251A (en) * 2019-03-14 2020-06-16 展讯通信(上海)有限公司 Image color cast correction method, device and computer storage medium

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113569683A (en) * 2021-07-20 2021-10-29 上海明略人工智能(集团)有限公司 Scene classification method, system, device and medium combining salient region detection
CN113569683B (en) * 2021-07-20 2024-04-02 上海明略人工智能(集团)有限公司 Scene classification method, system, equipment and medium combined with salient region detection
CN115599954A (en) * 2022-12-12 2023-01-13 广东工业大学(Cn) Video question-answering method based on scene graph reasoning

Also Published As

Publication number Publication date
CN111898448B (en) 2023-10-24

Similar Documents

Publication Publication Date Title
Jung Efficient background subtraction and shadow removal for monochromatic video sequences
CN101084527B (en) A method and system for processing video data
CN100393106C (en) Method and apparatus for detecting and/or tracking image or color area of image sequence
Karaman et al. Comparison of static background segmentation methods
CN108280426B (en) Dark light source expression identification method and device based on transfer learning
KR101778605B1 (en) Method And Apparatus For Recognizing Vehicle License Plate
AU2006252252A1 (en) Image processing method and apparatus
CN109815936B (en) Target object analysis method and device, computer equipment and storage medium
CN102567727A (en) Method and device for replacing background target
CN111898448B (en) Pedestrian attribute identification method and system based on deep learning
CN106651797B (en) Method and device for determining effective area of signal lamp
CN103093203A (en) Human body re-recognition method and human body re-recognition system
KR102142567B1 (en) Image composition apparatus using virtual chroma-key background, method and computer program
WO2022213540A1 (en) Object detecting, attribute identifying and tracking method and system
KR100903816B1 (en) System and human face detection system and method in an image using fuzzy color information and multi-neural network
CN110866473B (en) Target object tracking detection method and device, storage medium and electronic device
CN110830788A (en) Method and device for detecting black screen image
CN111160340B (en) Moving object detection method and device, storage medium and terminal equipment
CN111898449B (en) Pedestrian attribute identification method and system based on monitoring video
CN111008601A (en) Fighting detection method based on video
Low et al. Frame Based Object Detection--An Application for Traffic Monitoring
CN111402189B (en) Video image color cast detection device and method
CN113591591A (en) Artificial intelligence field behavior recognition system
Sari et al. Detection of Moving Vehicle using Adaptive Threshold Algorithm in Varied Lighting
CN111179317A (en) Interactive teaching system and method

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant