CN112200006A - Human body attribute detection and identification method under community monitoring scene - Google Patents

Human body attribute detection and identification method under community monitoring scene

Publication number
CN112200006A
Authority
CN
China
Prior art keywords
image
human body
body attribute
steps
monitoring scene
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010966064.2A
Other languages
Chinese (zh)
Inventor
徐亮
张卫山
孙浩云
尹广楹
张大千
管洪清
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qingdao Sui Zhi Information Technologies Co ltd
Original Assignee
Qingdao Sui Zhi Information Technologies Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qingdao Sui Zhi Information Technologies Co ltd
Priority to CN202010966064.2A
Publication of CN112200006A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/56 Extraction of image or video features relating to colour
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/94 Hardware or software architectures specially adapted for image or video understanding
    • G06V10/955 Hardware or software architectures specially adapted for image or video understanding using specific electronic processors

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to the technical fields of video processing, artificial intelligence and deep learning, and in particular discloses a method for detecting and identifying human body attributes in a community monitoring scene. By combining a color identification mechanism aimed at the characteristics of target residents, the invention effectively improves the accuracy of target detection and identification.

Description

Human body attribute detection and identification method under community monitoring scene
Technical Field
The invention relates to the technical field of video processing, artificial intelligence and deep learning, in particular to a human body attribute detection and identification method in a community monitoring scene.
Background
In recent years, as data volumes and computing power have grown, and especially with the large-scale use of GPUs, deep learning has gradually established a dominant position in computer vision. In many fields, including image classification, image segmentation, image recognition and speech recognition, deep neural networks currently achieve the best results. Among deep neural network architectures, the convolutional neural network is particularly prominent: it is trained on large-scale data, and it uses weight sharing, pooling and dropout to reduce the amount of computation and improve the generalization ability of the model.
Semantic segmentation is one of the core, fundamental tasks in computer vision, with important applications in autonomous driving, medical imaging, geographic remote sensing, robot navigation and other fields; its goal is to accurately classify every pixel of an input image. In recent years, deep learning has shown excellent performance on this dense labeling problem. However, recent semantic segmentation methods based on convolutional neural networks focus mainly on how to better fuse the features of a single input image, and pay little attention to making the segmentation result finer by enhancing the features of the image. The fully convolutional network (FCN) is a representative application of deep learning to image segmentation. It accepts input images of any size, without requiring all training and test images to have the same size, and it is more efficient because it avoids the repeated storage and convolution computation caused by processing pixel blocks.
Human body attributes are among the important targets in a community monitoring scene, and detecting them accurately is decisive for subsequent target identification and assisted search. They are hard to detect accurately because clothing varies widely, comes in many kinds and is highly uncertain, and they are hard to recognize in a real environment under the influence of factors such as weather, occlusion, viewing angle and posture. Human body attributes serve as auxiliary information for identifying community residents, and locating a resident's movement path is of great practical significance. For example, when a stranger enters the community, property management can identify, locate and track that person by matching the described human body attributes against the people in the surveillance video, thereby safeguarding the safety and stability of the community.
In view of this, a method that improves the accuracy of human body attribute detection and identification in a community monitoring scene is needed to solve the above problems.
Disclosure of Invention
In order to solve the above technical problems, an object of the present invention is to provide a method for detecting and identifying human body attributes in a community environment. A new image enhancement network is proposed: image enhancement improves the visual quality and sharpness of the original picture and helps the segmentation network obtain a better segmentation result. The fully convolutional network (FCN) is then used to perform semantic segmentation and divide the image into semantic features, and a color identification mechanism is combined with it to detect and identify the attributes of the person's clothing. The method greatly improves the accuracy and efficiency of detecting and identifying the human body attributes of community residents.
In order to solve the technical problems, the technical scheme of the invention is as follows:
A human body attribute detection and identification method under a community monitoring scene comprises the following steps:
step 1: acquiring a video stream in a community monitoring area, and decoding and separating the video stream into image data;
step 2: training an image enhancement network Img-EN model to obtain optimal parameters;
step 3: using the trained Img-EN model to perform enhancement processing on the image data based on a histogram equalization method;
step 4: inputting the enhanced image into a semantic segmentation network (FCN) for training to obtain optimal parameters;
step 5: enhancing the image to be detected by using the Img-EN model with the optimal parameters, and performing feature segmentation on the processed image by using the FCN model with the optimal parameters;
step 6: carrying out human body attribute detection and identification on the segmented semantic features in combination with a color identification mechanism;
step 7: adopting a GPU scheduling strategy to perform GPU scheduling;
step 8: sending the final result to the terminal.
Preferably, acquiring the video stream in the monitored area in step 1 includes: installing high-definition cameras or video acquisition devices at various places in the community, selecting the area to be monitored, obtaining all video streams in that area, and decoding and separating the video streams to obtain image data.
Preferably, in step 2, the invention uses histogram equalization to implement the image enhancement network Img-EN. Equalizing the histogram of the image adjusts local image contrast, which allows an overexposed or underexposed image to show more detail. The Img-EN network consists of a single convolution layer whose kernel parameters are generated dynamically from the input image. This structure models the histogram equalization algorithm and produces a series of enhanced images that enrich the available semantic features.
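As a rough illustration of the histogram equalization that Img-EN is said to model, the following sketch applies plain global histogram equalization to a small grayscale image. It is not the learned Img-EN network: the function name `equalize_histogram`, the list-of-rows image format and the 256-level assumption are all hypothetical choices made for this example.

```python
def equalize_histogram(image, levels=256):
    """Globally histogram-equalize a grayscale image.

    `image` is a list of rows of integers in [0, levels - 1]
    (an assumed format for this sketch, not taken from the patent).
    """
    pixels = [p for row in image for p in row]
    total = len(pixels)
    if total == 0:
        return [row[:] for row in image]

    # Count how often each gray level occurs.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    # Cumulative distribution of gray levels.
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)

    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:
        # Constant image: there is no contrast to stretch.
        return [row[:] for row in image]

    # Map each gray level so the output CDF is approximately linear.
    scale = (levels - 1) / (total - cdf_min)
    lut = [max(0, round((c - cdf_min) * scale)) for c in cdf]
    return [[lut[p] for p in row] for row in image]


# A low-contrast patch (values 50..54) is stretched to the full range.
dark_patch = [[50, 50], [52, 54]]
enhanced = equalize_histogram(dark_patch)
```

The darkest input level maps to 0 and the brightest to 255, which is the contrast stretch that lets under- or overexposed regions show more detail.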
Preferably, in step 4, the fully convolutional network FCN used in the invention stands for Fully Convolutional Networks. The FCN serves as the semantic segmentation model: it receives an input image and uses deconvolution to upsample the feature map of the last convolution layer. The finest variant, FCN-8s (8x upsampling), is selected so that the feature map is restored to the size of the input image; a prediction can thus be made for every pixel while the spatial information of the original input image is preserved. Finally, pixel-by-pixel classification is performed on the upsampled feature map. Training process: training first uses default parameters; based on intermediate training results, the initial weights, learning rate and number of iterations are adjusted repeatedly until the image enhancement network achieves the preset enhancement effect with the preset efficiency.
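The upsample-then-classify idea can be sketched without a deep learning framework. The example below is an assumption-laden stand-in, not the FCN itself: it replaces the learned deconvolution with nearest-neighbour upsampling (equivalent to a stride-8 transposed convolution with a fixed 8x8 all-ones kernel) and then assigns each pixel the class with the highest score. The names `upsample_nearest` and `classify_pixels` and the score-map format are hypothetical.

```python
def upsample_nearest(score_map, factor=8):
    """Repeat every cell of a 2-D score map `factor` times in both directions."""
    out = []
    for row in score_map:
        wide = [value for value in row for _ in range(factor)]
        out.extend(wide[:] for _ in range(factor))
    return out


def classify_pixels(class_scores, factor=8):
    """Upsample coarse per-class score maps and label every pixel.

    `class_scores[c]` is the coarse score map for class `c`; all maps
    share one shape. Returns a label map at `factor` times the resolution.
    """
    upsampled = [upsample_nearest(m, factor) for m in class_scores]
    height, width = len(upsampled[0]), len(upsampled[0][0])
    labels = []
    for y in range(height):
        labels.append([
            # Pixel-by-pixel classification: pick the best-scoring class.
            max(range(len(upsampled)), key=lambda c: upsampled[c][y][x])
            for x in range(width)
        ])
    return labels


# Two classes on a 1x2 coarse map: class 0 wins on the left, class 1 on the right.
coarse = [[[0.9, 0.1]], [[0.1, 0.9]]]
label_map = classify_pixels(coarse)  # 8 rows, 16 columns
```

A trained FCN-8s would learn its upsampling weights and fuse skip connections from shallower layers; the fixed kernel here only shows how a coarse map becomes a per-pixel prediction at input resolution.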
Preferably, in step 6, the invention uses a TCS230 sensor with color filters to implement the color identification mechanism. The mechanism performs signal processing on the input picture to identify the color component values of the image to be detected, and thereby identifies the color of the clothing. Combining the semantic features segmented in step 5 with the color identification mechanism improves the accuracy of human body attribute identification.
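The text does not say how the color component values are turned into a clothing color, so the following is only one plausible software reading: average the RGB components over the pixels of a segmented region and pick the nearest entry in a small reference palette. The palette, the mask format and the names `mean_color` and `name_color` are assumptions for this sketch and do not model the TCS230 hardware.

```python
# Hypothetical reference palette; a real system would calibrate its own.
REFERENCE_COLORS = {
    "black": (0, 0, 0),
    "white": (255, 255, 255),
    "gray": (128, 128, 128),
    "red": (255, 0, 0),
    "green": (0, 255, 0),
    "blue": (0, 0, 255),
    "yellow": (255, 255, 0),
}


def mean_color(image, mask):
    """Average the RGB components of the pixels where `mask` is truthy.

    `image` is a list of rows of (r, g, b) tuples and `mask` has the
    same shape. Returns None when the mask selects no pixels.
    """
    sums = [0, 0, 0]
    count = 0
    for image_row, mask_row in zip(image, mask):
        for pixel, inside in zip(image_row, mask_row):
            if inside:
                for channel in range(3):
                    sums[channel] += pixel[channel]
                count += 1
    if count == 0:
        return None
    return tuple(total / count for total in sums)


def name_color(rgb):
    """Return the palette name closest to `rgb` in squared RGB distance."""
    return min(
        REFERENCE_COLORS,
        key=lambda name: sum(
            (a - b) ** 2 for a, b in zip(rgb, REFERENCE_COLORS[name])
        ),
    )


image = [[(250, 10, 5), (240, 20, 15)], [(0, 0, 255), (0, 0, 250)]]
jacket_mask = [[1, 1], [0, 0]]  # e.g. a "jacket" segment from the segmentation step
jacket_color = name_color(mean_color(image, jacket_mask))
```

Restricting the average to a segmentation mask is what ties the color estimate to one garment rather than to the whole frame.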
Preferably, the step of sending the final result to the platform terminal includes: calculating the coordinates of the pedestrian area, acquiring the current timestamp, and sending them to the platform terminal.
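One way to picture the final step is as a small message containing the pedestrian bounding box and a timestamp. The sketch below assumes the pedestrian area is available as a binary mask and that the terminal accepts JSON; the field names, the function names and the transport are hypothetical, since the patent does not specify a message format.

```python
import json
import time


def bounding_box(mask):
    """Return (x_min, y_min, x_max, y_max) of the truthy cells, or None."""
    rows = [y for y, row in enumerate(mask) if any(row)]
    cols = [x for row in mask for x, value in enumerate(row) if value]
    if not rows:
        return None
    return (min(cols), min(rows), max(cols), max(rows))


def build_result_message(mask, attributes, now=None):
    """Serialize the pedestrian area, attributes and timestamp as JSON.

    `now` lets callers inject a fixed time; by default the current
    Unix timestamp is used.
    """
    timestamp = time.time() if now is None else now
    return json.dumps({
        "box": bounding_box(mask),
        "timestamp": timestamp,
        "attributes": attributes,
    })


pedestrian_mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
]
message = build_result_message(pedestrian_mask, {"upper_color": "red"})
```

The resulting string could then be handed to whatever channel connects to the platform terminal.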
By adopting the technical scheme, the human body attribute detection and identification method under the community monitoring scene has the following beneficial effects:
(1) Image enhancement and semantic segmentation are combined: enhancing the features of the image with the image enhancement network Img-EN makes the segmentation result finer.
(2) The invention designs a color identification mechanism using a color filter, which further identifies the color of the person's clothing and improves the accuracy of human body attribute detection.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. Obviously, the drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a general flowchart of a human body attribute detection and identification method in a community environment according to the present invention;
FIG. 2 is a structural diagram of an image enhancement network Img-EN implemented by histogram equalization in the present invention;
FIG. 3 is a diagram of a color recognition mechanism implemented using a color sensor according to the present invention;
FIG. 4 is a diagram of a GPU resource scheduling strategy in a GPU processor cluster according to the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The invention provides a method for detecting and identifying human body attributes in a community monitoring environment, which can identify in real time the attributes of pedestrians on community roads, including the types and colors of clothes, backpacks and the like. An image enhancement technique produces an image better suited to network segmentation, and network segmentation then yields a series of semantic features, achieving the purpose of detection and identification.
As shown in FIG. 1, the method for detecting and identifying human body attributes in a community environment of the invention comprises the following basic steps: step 1: acquiring a video stream in a community monitoring area, and decoding and separating the video stream into image data; step 2: training an image enhancement network Img-EN model to obtain optimal parameters; step 3: using the trained Img-EN model to perform enhancement processing on the image data based on a histogram equalization method; step 4: inputting the enhanced image into a semantic segmentation network (FCN) for training to obtain optimal parameters; step 5: enhancing the image to be detected by using the Img-EN model with the optimal parameters, and performing feature segmentation on the processed image by using the FCN model with the optimal parameters; step 6: carrying out human body attribute detection and identification on the segmented semantic features in combination with a color identification mechanism; step 7: adopting a GPU scheduling strategy to perform GPU scheduling; step 8: sending the final result to the terminal.
The following describes in detail a method for detecting and identifying human body attributes in a community environment:
As shown in FIG. 1, high-definition cameras or video acquisition devices are installed in the community, the area to be monitored is selected, and all video streams in that area are acquired. The video stream from the monitoring equipment is decoded and separated into image data. The trained Img-EN network then enhances the target image to obtain a rich set of usable semantic features, and the trained FCN network performs semantic segmentation on the enhanced image to obtain a series of semantic features related to human body attributes; these features are combined with the color identification mechanism to detect and identify the target. GPU usage in the GPU processor cluster is monitored in real time, and a suitable scheduling strategy is used to schedule the GPUs in real time. Finally, the coordinates of the pedestrian area are calculated, the current timestamp is acquired, and they are sent to the platform terminal.
The method combines image enhancement and semantic segmentation. An image enhancement network Img-EN and a semantic segmentation network are trained. The trained Img-EN model enhances the original image based on histogram equalization, highlighting relevant features and weakening irrelevant background, which improves the segmentation ability of the semantic segmentation network. The enhanced image, which is easier to segment, then undergoes standard semantic segmentation to obtain a series of semantic features, achieving the goal of detecting and identifying human body attributes. Combining the clothing features of target residents with the color identification mechanism effectively improves the accuracy of target detection and identification.
It can be appreciated that in step 2, the invention implements the image enhancement network Img-EN using histogram equalization. Equalizing the histogram of the image adjusts local image contrast, which allows an overexposed or underexposed image to show more detail. The network structure is shown in FIG. 2: the Img-EN network consists of a single convolution layer whose kernel parameters are generated dynamically from the input image. This structure models the histogram equalization algorithm and produces a series of enhanced images that enrich the available semantic features.
In step 4, the fully convolutional network FCN used in the invention stands for Fully Convolutional Networks. The FCN serves as the semantic segmentation model: it receives an input image and uses deconvolution to upsample the feature map of the last convolution layer. The finest variant, FCN-8s (8x upsampling), is selected so that the feature map is restored to the size of the input image; a prediction can thus be made for every pixel while the spatial information of the original input image is preserved. Finally, pixel-by-pixel classification is performed on the upsampled feature map. Training process: training first uses default parameters; based on intermediate training results, the initial weights, learning rate and number of iterations are adjusted repeatedly until the image enhancement network achieves the preset enhancement effect with the preset efficiency.
As shown in FIG. 4, the GPU resource scheduling layer monitors current GPU resource usage in real time according to the scheduling strategy. Before the GPU processor cluster assigns a task, it checks whether the current GPU's load is too high; if so, it consults the GPU usage list and the GPU computing capability list and selects a different GPU to receive the task.
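The scheduling rule described here can be sketched as a small selection function. The patent gives no numeric criteria, so the 0.8 load threshold, the "spare capacity" score (capability times idle fraction) and the name `select_gpu` are all assumptions for illustration; the two dictionaries stand in for the GPU usage list and the GPU computing capability list.

```python
def select_gpu(current, usage, capability, threshold=0.8):
    """Pick the GPU that should receive the next task.

    `usage` maps GPU id -> utilization in [0, 1]; `capability` maps
    GPU id -> relative computing capability. Both are hypothetical
    stand-ins for the lists consulted by the scheduling layer.
    """
    # Keep the current GPU if its load is not too high.
    if usage[current] <= threshold:
        return current

    # Otherwise consider only GPUs that are below the threshold.
    candidates = [gpu for gpu in usage if usage[gpu] <= threshold]
    if not candidates:
        # Every GPU is busy: fall back to the least loaded one.
        return min(usage, key=usage.get)

    # Prefer the GPU with the most spare computing capacity.
    return max(candidates, key=lambda gpu: capability[gpu] * (1.0 - usage[gpu]))


usage = {"gpu0": 0.95, "gpu1": 0.30, "gpu2": 0.50}
capability = {"gpu0": 1.0, "gpu1": 1.0, "gpu2": 2.0}
target = select_gpu("gpu0", usage, capability)
```

In this example `gpu0` is overloaded, and `gpu2` is chosen over `gpu1` because its spare capacity (2.0 x 0.5) exceeds that of `gpu1` (1.0 x 0.7).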
The human body attribute detection and identification method under the community monitoring scene combines image enhancement with semantic segmentation, makes the segmentation result finer by enhancing the features of the image, and improves the speed and precision of human body attribute detection and identification by adding a color identification mechanism. Multiple cameras provide comprehensive coverage of the community environment, which supplies the basic conditions for human body attribute detection. The invention promotes the further development of the intelligent community.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims (6)

1. A human body attribute detection and identification method under a community monitoring scene, characterized by comprising the following steps:
step 1: acquiring a video stream in a community monitoring area, and decoding and separating the video stream into image data;
step 2: training an image enhancement network Img-EN model to obtain optimal parameters;
step 3: using the trained Img-EN model to perform enhancement processing on the image data based on a histogram equalization method;
step 4: inputting the enhanced image into a semantic segmentation network (FCN) for training to obtain optimal parameters;
step 5: enhancing the image to be detected by using the Img-EN model with the optimal parameters, and performing feature segmentation on the processed image by using the FCN model with the optimal parameters;
step 6: carrying out human body attribute detection and identification on the segmented semantic features in combination with a color identification mechanism;
step 7: adopting a GPU scheduling strategy to perform GPU scheduling;
step 8: sending the final result to the terminal.
2. The method for detecting and identifying human body attributes under the community monitoring scene according to claim 1, wherein in step 1, high-definition cameras or video acquisition devices are installed at various places in the community, an area to be monitored is selected, all video streams in the area are obtained, and the video streams are decoded and separated into image data.
3. The method for detecting and identifying human body attributes under the community monitoring scene according to claim 1, wherein in step 2, the contrast of the local image is adjusted by equalizing the histogram of the image, the histogram equalization enabling an overexposed or underexposed image to display more detail; the Img-EN network consists of a single convolution layer whose convolution kernel parameters are dynamically generated from the input image; and the histogram equalization algorithm produces a series of enhanced images to enrich the available semantic features.
4. The method for detecting and identifying human body attributes under the community monitoring scene according to claim 1, wherein in step 4, the FCN serves as the semantic segmentation model, receives an input image, and uses deconvolution to upsample the feature map of the last convolution layer; the finest variant, FCN-8s, is selected for the upsampling so that the feature map is restored to the size of the input image, thereby generating a prediction for each pixel while preserving the spatial information of the original input image; finally, pixel-by-pixel classification is performed on the upsampled feature map; training process: training first uses default parameters, and based on intermediate training results, the initial weights, learning rate and number of iterations are adjusted repeatedly until the image enhancement network achieves a preset enhancement effect with preset efficiency.
5. The method for detecting and identifying human body attributes under the community monitoring scene according to claim 1, wherein in step 6, a color identification mechanism is implemented using a TCS230 sensor with color filters; the mechanism performs signal processing on an input picture to identify the color component values of the image to be detected, so as to identify the color of the clothing; and the semantic features segmented in step 5 are combined with the color identification mechanism to improve the accuracy of human body attribute identification.
6. The method for detecting and identifying human body attributes under the community monitoring scene according to claim 1, wherein step 8 further comprises calculating the coordinates of the pedestrian area, acquiring the current timestamp, and sending them to the platform terminal.
CN202010966064.2A 2020-09-15 2020-09-15 Human body attribute detection and identification method under community monitoring scene Pending CN112200006A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010966064.2A CN112200006A (en) 2020-09-15 2020-09-15 Human body attribute detection and identification method under community monitoring scene

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010966064.2A CN112200006A (en) 2020-09-15 2020-09-15 Human body attribute detection and identification method under community monitoring scene

Publications (1)

Publication Number Publication Date
CN112200006A true CN112200006A (en) 2021-01-08

Family

ID=74014932

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010966064.2A Pending CN112200006A (en) 2020-09-15 2020-09-15 Human body attribute detection and identification method under community monitoring scene

Country Status (1)

Country Link
CN (1) CN112200006A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113763296A (en) * 2021-04-28 2021-12-07 腾讯云计算(北京)有限责任公司 Image processing method, apparatus and medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1337206A (en) * 2000-07-07 2002-02-27 株式会社北计工业 Color-identifying apparatus
CN110175595A (en) * 2019-05-31 2019-08-27 北京金山云网络技术有限公司 Human body attribute recognition approach, identification model training method and device
CN110991281A (en) * 2019-11-21 2020-04-10 电子科技大学 Dynamic face recognition method
CN111369563A (en) * 2020-02-21 2020-07-03 华南理工大学 Semantic segmentation method based on pyramid void convolutional network

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
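GAILI YUE, LEI LU: "Face Recognition Based on Histogram Equalization and Convolution Neural Network", 《2018 10TH INTERNATIONAL CONFERENCE ON INTELLIGENT HUMAN-MACHINE SYSTEMS AND CYBERNETICS》 *
何俊杰 et al.: "Circuit defect recognition method based on convolutional neural networks", 《福建电脑》 *
刘增辉: "Research progress in color sensor technology", 《传感器技术》 *
杜森森 et al.: "Implementation scheme of a device for a tracking robot to recognize moving human bodies", 《电视技术》 *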

Similar Documents

Publication Publication Date Title
CN110176027B (en) Video target tracking method, device, equipment and storage medium
WO2020216008A1 (en) Image processing method, apparatus and device, and storage medium
CN111178183B (en) Face detection method and related device
Zhang et al. Moving vehicles detection based on adaptive motion histogram
CN110929593B (en) Real-time significance pedestrian detection method based on detail discrimination
CN103020992B (en) A kind of video image conspicuousness detection method based on motion color-associations
CN103020985B (en) A kind of video image conspicuousness detection method based on field-quantity analysis
Jia et al. A two-step approach to see-through bad weather for surveillance video quality enhancement
CN111582095B (en) Light-weight rapid detection method for abnormal behaviors of pedestrians
CN109117838B (en) Target detection method and device applied to unmanned ship sensing system
CN109886159B (en) Face detection method under non-limited condition
CN109766828A (en) A kind of vehicle target dividing method, device and communication equipment
KR20140095333A (en) Method and apparratus of tracing object on image
CN110555420A (en) fusion model network and method based on pedestrian regional feature extraction and re-identification
CN112750147A (en) Pedestrian multi-target tracking method and device, intelligent terminal and storage medium
CN114627269A (en) Virtual reality security protection monitoring platform based on degree of depth learning target detection
Shi Object detection models and research directions
CN115861380A (en) End-to-end unmanned aerial vehicle visual target tracking method and device in foggy low-light scene
Sun et al. IRDCLNet: Instance segmentation of ship images based on interference reduction and dynamic contour learning in foggy scenes
CN116129291A (en) Unmanned aerial vehicle animal husbandry-oriented image target recognition method and device
Ghahremannezhad et al. Automatic road detection in traffic videos
CN111767854A (en) SLAM loop detection method combined with scene text semantic information
CN116258940A (en) Small target detection method for multi-scale features and self-adaptive weights
WO2022095818A1 (en) Methods and systems for crowd motion summarization via tracklet based human localization
Dahirou et al. Motion Detection and Object Detection: Yolo (You Only Look Once)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210108