CN112149476B - Target detection method, device, equipment and storage medium - Google Patents


Info

Publication number
CN112149476B
CN112149476B (application CN201910578134.4A)
Authority
CN
China
Prior art keywords
image
detected
target
environment
detection
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910578134.4A
Other languages
Chinese (zh)
Other versions
CN112149476A (en)
Inventor
郁昌存
王德鑫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jingdong Technology Information Technology Co Ltd
Original Assignee
Jingdong Technology Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jingdong Technology Information Technology Co Ltd
Priority to CN201910578134.4A
Publication of CN112149476A
Application granted
Publication of CN112149476B
Legal status: Active
Anticipated expiration

Classifications

    • G06V20/10 — Terrestrial scenes (G06V20/00 Scenes; scene-specific elements)
    • G06T5/20 — Image enhancement or restoration using local operators
    • G06T7/11 — Region-based segmentation (G06T7/10 Segmentation; edge detection)
    • G06T7/136 — Segmentation; edge detection involving thresholding
    • G06V10/141 — Control of illumination (G06V10/14 Optical characteristics of the acquisition device or illumination arrangements)
    • G06V10/25 — Determination of region of interest [ROI] or a volume of interest [VOI] (G06V10/20 Image preprocessing)
    • G06T2207/10016 — Video; image sequence (image acquisition modality)
    • G06T2207/20024 — Filtering details (special algorithmic details)
    • G06V2201/07 — Target detection
    • G06V2201/08 — Detecting or categorising vehicles

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)
  • Studio Devices (AREA)

Abstract

The embodiments of the invention disclose a target detection method, apparatus, device, and storage medium. The method comprises the following steps: determining, from an image to be detected, a current environment index value of the shooting environment corresponding to that image; determining the target shooting environment category of the image according to the current environment index value and a detection environment index threshold; determining, from the mapping relation between shooting environment categories and preset detection models, the target preset detection model corresponding to the target shooting environment category; and performing target detection on the image based on the target preset detection model to obtain each target in the image. Through this technical scheme, targets are detected more accurately and the real-time performance of target detection is improved.

Description

Target detection method, device, equipment and storage medium
Technical Field
Embodiments of the present invention relate to image processing technologies, and in particular, to a method, an apparatus, a device, and a storage medium for detecting an object.
Background
With the development of technology, many automatic target detection services based on images or video have emerged, such as image-based person detection in traffic analysis services and automatic vehicle detection from captured images or video in intelligent traffic systems; these in turn enable subsequent tasks such as vehicle tracking, license plate recognition, and road traffic statistics.
Taking automatic vehicle detection as an example, existing methods are mainly aimed at vehicle detection under strong illumination. Under weak illumination (for example at night or in overcast weather), however, the appearance of a vehicle in the image changes greatly: contours blur, texture details vanish, or headlights produce strong glare, so the accuracy of existing vehicle detection algorithms is low.
To address this, two approaches are currently adopted: one augments the vehicle training data set with samples covering various influencing factors, in the hope of improving the generalization ability of the detection algorithm; the other increases the complexity of the detection algorithm, in the hope that it can adapt to various lighting conditions.
In implementing the present invention, the inventors found at least the following problems in the prior art: (1) enlarging the training data set can hardly guarantee high detection accuracy under all illumination conditions; (2) increasing algorithm complexity defeats the goal of real-time detection.
Disclosure of Invention
The embodiments of the invention provide a target detection method, apparatus, device, and storage medium, to achieve more accurate target detection and improve its real-time performance.
In a first aspect, an embodiment of the present invention provides a target detection method, including:
determining, according to an image to be detected, a current environment index value of the shooting environment corresponding to the image to be detected;
determining, according to the current environment index value and a detection environment index threshold, a target shooting environment category corresponding to the image to be detected;
determining, according to the target shooting environment category and the mapping relation between shooting environment categories and preset detection models, a target preset detection model corresponding to the target shooting environment category; and
performing target detection on the image to be detected based on the target preset detection model to obtain each target in the image to be detected.
In a second aspect, an embodiment of the present invention further provides an object detection apparatus, including:
a current environment index value determining module, configured to determine, according to an image to be detected, a current environment index value of the shooting environment corresponding to the image to be detected;
a target shooting environment category determining module, configured to determine, according to the current environment index value and a detection environment index threshold, a target shooting environment category corresponding to the image to be detected;
a target preset detection model determining module, configured to determine, according to the target shooting environment category and the mapping relation between shooting environment categories and preset detection models, a target preset detection model corresponding to the target shooting environment category; and
a target detection module, configured to perform target detection on the image to be detected based on the target preset detection model to obtain each target in the image to be detected.
In a third aspect, an embodiment of the present invention further provides an electronic device, including:
one or more processors;
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the target detection method provided by any embodiment of the present invention.
In a fourth aspect, embodiments of the present invention further provide a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the object detection method provided by any of the embodiments of the present invention.
According to the embodiments of the invention, the current environment index value of the shooting environment corresponding to the image to be detected is determined, and the target shooting environment category of the image is determined from the current environment index value and the detection environment index threshold. A target preset detection model corresponding to that category is then determined from the mapping relation between shooting environment categories and preset detection models, and target detection is performed on the image based on that model to obtain each target in the image. A suitable preset detection model is thus scheduled adaptively according to the current environment index value of the image to be detected, so that highly accurate detection results are obtained across different shooting environments. This solves both the low detection accuracy across shooting environments caused by poor algorithm generalization ability and the detection delay caused by high algorithm complexity, improving the accuracy of target detection while reducing algorithm complexity so that targets are detected faster.
Drawings
FIG. 1 is a flow chart of a target detection method according to a first embodiment of the invention;
FIG. 2 is a flow chart of a target detection method in a second embodiment of the invention;
fig. 3 is an image to be detected corresponding to a strong light shooting environment and a weak light shooting environment in the second embodiment of the present invention;
FIG. 4 is a flow chart of a target detection method in a third embodiment of the invention;
Fig. 5 is a schematic diagram of a vehicle detection result of an image to be detected corresponding to a strong light shooting environment and a weak light shooting environment in the third embodiment of the present invention;
fig. 6 is a schematic structural diagram of an object detection device in a fourth embodiment of the present invention;
fig. 7 is a schematic structural diagram of an electronic device in a fifth embodiment of the present invention.
Detailed Description
The invention is described in further detail below with reference to the drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting thereof. It should be further noted that, for convenience of description, only some, but not all of the structures related to the present invention are shown in the drawings.
Example 1
The object detection method provided in the present embodiment may be applied to detection of an object, such as a vehicle or a pedestrian, from images photographed under different photographing environments. The method may be performed by an object detection device, which may be implemented in software and/or hardware, which may be integrated in an electronic device with image processing functionality, such as a mobile phone, a tablet, a desktop computer, a server, etc. Referring to fig. 1, the method of this embodiment specifically includes the following steps:
S110, determining a current environment index value of a shooting environment corresponding to the image to be detected according to the image to be detected.
The image to be detected refers to an image to be subjected to target detection, and the image to be detected can be an image which is independently shot or a frame of image in a video. The imaging environment refers to an environment at the time of imaging an image to be detected, and may be, for example, the intensity of illumination or the degree of fogging at the time of imaging. Different illumination intensities can cause different brightness of the photographed image, and different fogging degrees can cause different blurring degrees of the photographed image. These are environmental factors that affect the quality of the image and thus the accuracy of target detection. The environmental index value refers to a value of an index that characterizes the shooting environment. The index for representing the shooting environment can be the brightness of an image or edge information in the image. The current environmental index value refers to an environmental index value when an image to be detected is photographed.
In the related art, target detection does not distinguish the shooting environment of the image to be detected: the same detection model is applied regardless, so detection accuracy varies across images captured in different shooting environments. To address this, the embodiment of the invention first distinguishes the shooting environment of the image to be detected and then applies a target detection model matched to that environment. Therefore, before detecting targets in the image, the shooting environment corresponding to the image must be determined. In a specific implementation, brightness information or edge information can be extracted from the image to be detected, and the extracted result used as the current environment index value of the corresponding shooting environment.
Illustratively, S110 includes: and determining the brightness information of the image to be detected according to each brightness value corresponding to the brightness channel of the image to be detected, and taking the brightness information as the current environment index value of the shooting environment corresponding to the image to be detected.
The image to be detected is converted from its original color space to a color space containing a luminance channel, such as the Lab or HSV color space, yielding a luminance channel image for the image to be detected. The mean or variance of the gray values in the luminance channel image is then computed to obtain the brightness information, which can serve as the current environment index value of the shooting environment corresponding to the image to be detected. Reflecting the shooting environment through image brightness has the advantage of speeding up determination of the current environment index value.
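As an illustrative sketch of this step (NumPy only; `brightness_index` is a hypothetical helper name, and the HSV V channel, which for RGB data is simply the per-pixel maximum of the three color components, is used as the luminance channel):

```python
import numpy as np

def brightness_index(image_rgb: np.ndarray) -> float:
    """Mean of the HSV V (value) channel, used as the environment index.

    For RGB data the V channel is the per-pixel maximum of the three
    color components, so no color-space library is needed.
    """
    v_channel = image_rgb.max(axis=2)   # H x W luminance channel
    return float(v_channel.mean())      # mean gray value as the index

# A bright and a dark synthetic frame give clearly separated index values.
bright = np.full((4, 4, 3), 220, dtype=np.uint8)
dark = np.full((4, 4, 3), 30, dtype=np.uint8)
assert brightness_index(bright) > brightness_index(dark)
```

The variance of the luminance channel could be added alongside the mean where a single statistic proves too coarse.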
S120, determining a target shooting environment category corresponding to the image to be detected according to the current environment index value and the detection environment index threshold value.
The detection environment index threshold is the demarcation value used to distinguish the environment indices of different shooting environments during target detection. It can be set from human experience, or determined automatically from a set of previously computed environment index values by an automatic threshold segmentation algorithm. If determined automatically, a single threshold may be fixed once and reused throughout subsequent detection, which guarantees its accuracy to some extent and speeds up detection. Alternatively, the threshold may be updated in stages: for example, when detecting targets frame by frame in captured video, a separate threshold is determined automatically for each video segment and used within that segment. This keeps the threshold applicable to a greater extent, which in turn improves the accuracy of the preset detection model scheduled later and thus the detection accuracy. The specific way the threshold is determined can be chosen according to the detection speed, detection accuracy, and other requirements of the particular service.
The photographing environment category refers to a category to which the photographing environment belongs, for example, when the photographing environment is illumination intensity, the photographing environment category may be a strong illumination category, a weak illumination category, or the like. The target shooting environment category refers to a shooting environment category to which a shooting environment corresponding to an image to be detected belongs.
Shooting environments are divided into different shooting environment categories by the detection environment index threshold. The current environment index value is then compared with the threshold to determine which numerical range delimited by the threshold it falls into, and the shooting environment category corresponding to that range is taken as the target shooting environment category of the image to be detected.
S130, determining a target preset detection model corresponding to the target shooting environment category according to the target shooting environment category and the mapping relation between the shooting environment category and the preset detection model.
The preset detection model is a preset target detection model, and may be a model already trained in the related art or a model retrained on a training sample set. The target detection model here may be, for example, the region-based convolutional neural network detector Faster R-CNN, the end-to-end multi-box detector SSD (Single Shot MultiBox Detector), or the end-to-end single-network detector YOLO (You Only Look Once).
To improve detection accuracy on images obtained in various shooting environments, the embodiment of the invention prepares in advance a preset detection model suited to each shooting environment (shooting environment category) and establishes a one-to-one correspondence between each category and its model. Understandably, for an image captured in shooting environment A, detection with the preset model corresponding to A yields more accurate results than detection with a preset model corresponding to a different shooting environment.
In a specific implementation, the target shooting environment category is used as the lookup key, and the preset detection model corresponding to it is retrieved from the mapping relation between shooting environment categories and preset detection models as the target preset detection model.
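A sketch of this lookup (the model names, category labels, and placeholder return values are all hypothetical; a real system would register trained detection models here):

```python
# Hypothetical detectors standing in for models trained per environment.
def detect_strong_light(image):
    return ["strong-light detections"]   # placeholder result

def detect_weak_light(image):
    return ["weak-light detections"]     # placeholder result

# One-to-one mapping from shooting-environment category to preset model.
MODEL_BY_CATEGORY = {
    "strong_light": detect_strong_light,
    "weak_light": detect_weak_light,
}

def select_model(index_value: float, threshold: float):
    """S120 + S130: classify the environment, then look up its model."""
    category = "strong_light" if index_value >= threshold else "weak_light"
    return MODEL_BY_CATEGORY[category]

assert select_model(32.0, threshold=25.0) is detect_strong_light
assert select_model(18.0, threshold=25.0) is detect_weak_light
```

Keeping the mapping as a plain dictionary makes adding further categories (for example, a foggy-weather model) a one-line change.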
And S140, performing target detection on the image to be detected based on a target preset detection model to obtain each target in the image to be detected.
And inputting the image to be detected into a target preset detection model, and detecting each target in the image to be detected through target detection processing of the model.
According to the technical scheme of this embodiment, the current environment index value of the shooting environment corresponding to the image to be detected is determined, and the target shooting environment category of the image is determined from that value and the detection environment index threshold. The target preset detection model corresponding to that category is then determined from the mapping relation between shooting environment categories and preset detection models, and target detection is performed on the image based on that model to obtain each target in the image. A suitable preset detection model is thus scheduled adaptively according to the current environment index value, so that highly accurate detection results are obtained across different shooting environments. This solves both the low detection accuracy across shooting environments caused by poor algorithm generalization ability and the detection delay caused by high algorithm complexity, improving detection accuracy while reducing algorithm complexity so that targets are detected faster.
Example two
The present embodiment further optimizes "determining the current environmental index value of the shooting environment corresponding to the image to be detected according to the image to be detected" based on the first embodiment. On the basis, a step of determining a detection environment index threshold corresponding to the image to be detected is further added. Wherein the explanation of the same or corresponding terms as those of the above embodiments is not repeated herein. Referring to fig. 2, the target detection method provided in this embodiment includes:
s210, filtering the image to be detected based on the guidable filter according to the image to be detected and the set guide image to obtain a filtered image.
The set guide image is a preset image used as the guidance map in filtering, and its shooting scene is consistent with that of the image to be detected. Illustratively, the set guide image is at least one frame of the captured video corresponding to the image to be detected; this further improves the consistency of the shooting scene between the set guide image and the image to be detected. A guidable filter is a filter that performs filtering using a guidance map, for example a guided filter or a joint bilateral filter. It denoises while retaining edge information; its inputs are the image to be filtered and the guide image, and its output is a filtered image whose overall content is consistent with the input image but whose texture information follows the guide image.
Since this embodiment performs target detection according to shooting environment category, a guidance map and a guidable filter are introduced when judging the target shooting environment category of the image to be detected, so that the shooting environment is the only varying factor in the comparison. In a specific implementation, the image to be detected and the set guide image are fed into the guidable filter, and the filtered result of the image to be detected, i.e., the filtered image, is obtained.
The guidable filter in the above process may be a guided filter, whose algorithmic principle is as follows:
the output and input of the guided filter are assumed to satisfy the following linear relationship within a two-dimensional window:

$$q_i = a_k I_i + b_k,\ \forall i \in w_k; \qquad p_i = q_i + n_i \tag{1}$$

where q is the pixel value of the output image, I is the pixel value of the set guide image, p is the pixel value of the input image, n is the noise, i and k are pixel indices, and $a_k$ and $b_k$ are the coefficients of the linear function when the window center is at k (i.e., window $w_k$).

Taking the gradient on both sides of the first formula in equation (1) gives $\nabla q = a_k \nabla I$. This shows that when the set guide image I has a gradient, the output image q has a similar gradient, i.e., edge consistency between the filter output and the guide image is guaranteed.

Solving for the linear coefficients $a_k$ and $b_k$ in equation (1), i.e., minimizing the difference between the fitted output value q and the true value p, is converted into an optimization problem: minimizing the following cost function within the window:

$$E(a_k, b_k) = \sum_{i \in w_k} \left( (a_k I_i + b_k - p_i)^2 + \epsilon\, a_k^2 \right) \tag{2}$$

where $\epsilon$ is a regularization parameter that prevents $a_k$ from becoming too large and also improves numerical stability.
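A minimal self-contained sketch of the guided filter described above (NumPy only, grayscale images; the window radius `r` and regularizer `eps` correspond to $w_k$ and $\epsilon$; this is an illustrative implementation, not the patent's own code — production use would typically call an existing library routine):

```python
import numpy as np

def _box_mean(a: np.ndarray, r: int) -> np.ndarray:
    """Mean over a (2r+1)x(2r+1) window, edges padded by replication."""
    k = 2 * r + 1
    p = np.pad(a, r, mode="edge").astype(np.float64)
    c = p.cumsum(0).cumsum(1)
    c = np.pad(c, ((1, 0), (1, 0)))      # prepend a zero row and column
    s = c[k:, k:] - c[:-k, k:] - c[k:, :-k] + c[:-k, :-k]
    return s / (k * k)

def guided_filter(p: np.ndarray, I: np.ndarray, r: int = 2,
                  eps: float = 1e-3) -> np.ndarray:
    """q = a*I + b with a, b fitted per window (He et al.'s guided filter)."""
    mean_I, mean_p = _box_mean(I, r), _box_mean(p, r)
    var_I = _box_mean(I * I, r) - mean_I * mean_I
    cov_Ip = _box_mean(I * p, r) - mean_I * mean_p
    a = cov_Ip / (var_I + eps)           # linear coefficients of Eq. (1)
    b = mean_p - a * mean_I
    # Average a and b over all windows covering each pixel, then apply.
    return _box_mean(a, r) * I + _box_mean(b, r)

rng = np.random.default_rng(0)
img = rng.random((16, 16))
out = guided_filter(img, img, r=2, eps=0.1)   # self-guided smoothing
assert out.shape == img.shape
```

Using the image as its own guide, as above, yields edge-preserving smoothing; in this embodiment the guide would instead be the set guide image.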
S220, determining a peak signal-to-noise ratio corresponding to the image to be detected according to the image to be detected and the filtered image, and taking the peak signal-to-noise ratio as a current environment index value of a shooting environment corresponding to the image to be detected.
Peak signal-to-noise ratio (PSNR) is an objective method of evaluating image quality. After guided filtering, the quality of the image differs from that of the original; to measure the degree of change before and after filtering, i.e., between the image to be detected and the filtered image, the PSNR value is used as the metric. PSNR is calculated as:

$$\mathrm{PSNR} = 10 \log_{10} \frac{(2^n - 1)^2}{M}$$

where M is the mean square error between the image to be detected X and the filtered image Y:

$$M = \frac{1}{H W} \sum_{i=1}^{H} \sum_{j=1}^{W} \left( X(i,j) - Y(i,j) \right)^2$$

where H and W are the height and width of the image, and n is the number of bits per pixel, typically n = 8, i.e., 256 gray levels. PSNR is expressed in dB; the larger the value, the smaller the distortion.
Taking illumination intensity as the shooting environment, the peak signal-to-noise ratio can serve as the environment index, reflecting changes in the shooting environment more accurately than a brightness index. As described above, there is an edge difference between the images before and after guided filtering. Referring to fig. 3, an image captured under weak illumination has small gray-value differences between adjacent pixels and little edge information, so it changes little after guided filtering; an image captured under strong illumination has large gray-value differences between adjacent pixels and more edge information, so it changes substantially after guided filtering. Whether the image to be detected was captured in a strong or weak illumination environment can therefore be judged from the degree of change between the images before and after filtering. As the PSNR formula shows, the PSNR value represents the difference between two images, so the PSNR between the image to be detected and its guided-filtered result can serve as the current environment index value of the corresponding shooting environment.
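The PSNR computation described above can be sketched as follows (NumPy; `psnr` is a hypothetical helper name):

```python
import numpy as np

def psnr(x: np.ndarray, y: np.ndarray, n_bits: int = 8) -> float:
    """Peak signal-to-noise ratio between two same-sized images, in dB."""
    mse = np.mean((x.astype(np.float64) - y.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")          # identical images: no distortion
    peak = 2 ** n_bits - 1           # 255 for 8-bit images
    return 10.0 * np.log10(peak ** 2 / mse)

x = np.zeros((4, 4), dtype=np.uint8)
y = np.full((4, 4), 255, dtype=np.uint8)
assert psnr(x, x) == float("inf")    # no change before/after filtering
assert round(psnr(x, y), 2) == 0.0   # maximum possible 8-bit distortion
```

In this scheme, a high PSNR between the image to be detected and its filtered version would indicate a small change (weak-illumination behavior), and a low PSNR a large change (strong-illumination behavior).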
S230, determining an associated environment index value of the shooting environment corresponding to each associated image according to the set number of associated images corresponding to the image to be detected.
The set number is a preset count of images, which can be set manually or determined automatically from the number of frames in the captured video corresponding to the image to be detected. An associated image is an image associated with the image to be detected, for example a frame of the captured video corresponding to it. The associated environment index value is the environment index value corresponding to an associated image.
To make the detection environment index threshold better suited to judging the shooting environment category at the time the image to be detected was captured, this embodiment updates the threshold in real time from the associated images of the image to be detected. In a specific implementation, a set number of associated images is acquired, and an associated environment index value is determined for each. The associated environment index value may be the brightness information of the associated image or its peak signal-to-noise ratio; to stay consistent with the current environment index value of the image to be detected, this embodiment uses the peak signal-to-noise ratio. The peak signal-to-noise ratio of each associated image can be determined as described in S210 to S220.
S240, determining a detection environment index threshold corresponding to the image to be detected based on a preset threshold segmentation algorithm according to each associated environment index value.
The preset threshold segmentation algorithm is a threshold segmentation algorithm configured in advance, for example the maximum inter-class variance method (OTSU), an adaptive threshold segmentation method, a maximum entropy threshold segmentation method, or an iterative threshold segmentation method. In this embodiment, the maximum inter-class variance method is taken as an example. OTSU is an algorithm for obtaining an optimal threshold; after classification using the threshold obtained by OTSU, the inter-class variance between the two classes is the largest.
In order to reduce the influence of human factors in the threshold determination process and improve the accuracy of the threshold, the method of automatically determining the threshold is adopted in the embodiment. In specific implementation, all the associated environment index values can be combined into an associated environment index value two-dimensional matrix, and then the associated environment index value two-dimensional matrix is processed by adopting a maximum inter-class variance method, so that the automatically determined segmentation environment index value can be obtained and used as a detection environment index threshold corresponding to the image to be detected.
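The maximum inter-class variance principle applied to the associated environment index values can be sketched as below. This is an illustrative brute-force Otsu over a one-dimensional list of index values (the patent arranges the values into a two-dimensional matrix before processing, but the selection principle is the same); all names are hypothetical.

```python
import numpy as np

def otsu_threshold(values):
    """Brute-force Otsu: pick the split that maximizes between-class variance.

    `values` plays the role of the associated environment index values (e.g.
    the PSNR of each associated image); the returned threshold serves as the
    detection environment index threshold.
    """
    v = np.sort(np.asarray(values, dtype=np.float64))
    best_t, best_var = v[0], -1.0
    for i in range(1, len(v)):
        t = (v[i - 1] + v[i]) / 2.0        # candidate split between two samples
        lo, hi = v[:i], v[i:]
        w0, w1 = len(lo) / len(v), len(hi) / len(v)
        var_between = w0 * w1 * (lo.mean() - hi.mean()) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

# Two clusters of index values: the threshold lands between the clusters.
t = otsu_threshold([20.1, 21.3, 22.0, 33.5, 34.2, 35.0])
```

As new associated index values accumulate, rerunning this selection yields an updated threshold, which matches the staged updating described below.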
All the associated environment index values include, in addition to the associated environment index values of the associated images corresponding to the image to be detected, historical associated environment index values. In other words, in the process of determining the detection environment index threshold, the amount of associated environment index value data continuously increases, so the determined detection environment index threshold becomes more accurate, an abrupt change in an individual value does not affect the accuracy of the threshold, and the anti-interference capability is strong.
S250, determining a target shooting environment category corresponding to the image to be detected according to the current environment index value and the detection environment index threshold value.
S260, determining a target preset detection model corresponding to the target shooting environment category according to the target shooting environment category and the mapping relation between the shooting environment category and the preset detection model.
And S270, performing target detection on the image to be detected based on a target preset detection model to obtain each target in the image to be detected.
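Steps S250 to S270 can be sketched as a small scheduling routine. All names here are hypothetical, and the comparison direction (index at or below the threshold meaning strong illumination) is an assumption consistent with the PSNR discussion above, where stronger illumination produces a larger change after filtering and hence a lower PSNR.

```python
def detect_targets(image, current_index, threshold, models):
    """S250-S270: judge the shooting-environment category by comparing the
    current environment index value with the detection environment index
    threshold, look up the preset model mapped to that category, and run
    target detection with it."""
    category = "strong_illumination" if current_index <= threshold else "weak_illumination"
    model = models[category]  # mapping: shooting environment category -> preset detection model
    return model(image)

# Stand-ins for two trained preset detection models.
models = {
    "strong_illumination": lambda img: ["vehicle_a"],
    "weak_illumination": lambda img: ["vehicle_b"],
}
result = detect_targets(object(), current_index=25.0, threshold=28.0, models=models)
```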
According to the technical scheme of this embodiment, the image to be detected is filtered based on the guidable filter according to the image to be detected and the set guide image to obtain a filtered image, and the peak signal-to-noise ratio corresponding to the image to be detected is determined according to the image to be detected and the filtered image, to serve as the current environment index value of the shooting environment corresponding to the image to be detected. Determining the current environment index value from the peak signal-to-noise ratio improves the accuracy with which the shooting environment corresponding to the image to be detected is characterized, which in turn improves the accuracy of the subsequently scheduled preset detection model and thus the target detection precision. Further, an associated environment index value of the shooting environment corresponding to each associated image is determined according to a set number of associated images corresponding to the image to be detected, and the detection environment index threshold corresponding to the image to be detected is determined from each associated environment index value based on a preset threshold segmentation algorithm. This realizes staged updating of the detection environment index threshold, ensures to a greater extent that the threshold remains applicable to the image to be detected, and further improves the accuracy of the subsequently scheduled preset detection model and the target detection precision.
Example III
In this embodiment, a training step for the initial detection model is added on the basis of the second embodiment. Explanations of terms that are the same as or correspond to those of the above embodiments are not repeated here.
In this embodiment, the target is a vehicle, the shooting environment is illumination intensity, and the shooting environment categories are a strong illumination category and a weak illumination category.
The automatic detection of vehicles in an intelligent traffic system is the basis for various traffic services, and traffic videos shot in real time on roads are greatly affected by the shooting environment. A fast vehicle detection algorithm that adapts to different shooting environments is therefore desirable. The factor in the shooting environment that most affects video quality is the illumination intensity. When the shooting environment is illumination intensity, the shooting environment category is the shooting light intensity category, specifically divided into the strong illumination category and the weak illumination category. In this scene, the current environment index value, the associated environment index value and the detection environment index threshold are respectively the current light intensity index value, the associated light intensity index value and the detection light intensity index threshold. The target shooting environment category is the target shooting light intensity category.
Referring to fig. 4, the target detection method provided in this embodiment includes:
S310, filtering the image to be detected based on the guidable filter according to the image to be detected and the set guide image to obtain a filtered image.
S320, determining a peak signal-to-noise ratio corresponding to the image to be detected according to the image to be detected and the filtered image, and taking the peak signal-to-noise ratio as a current light intensity index value of the illumination intensity corresponding to the image to be detected.
S330, determining an associated light intensity index value of the illumination intensity corresponding to each associated image according to the set number of associated images corresponding to the image to be detected.
S340, determining a detection light intensity index threshold corresponding to the image to be detected based on a preset threshold segmentation algorithm according to each associated light intensity index value.
S350, determining the type of the target shooting light intensity corresponding to the image to be detected according to the current light intensity index value and the detection light intensity index threshold value.
S360, determining a target preset detection model corresponding to the target shooting light intensity category according to the target shooting light intensity category and the mapping relation between the shooting light intensity category and the preset detection model.
The preset detection models in the embodiment are obtained by retraining the models by using the training sample set, so that each preset detection model is more matched with the corresponding shooting environment, and the target detection precision of the model can be further improved.
Illustratively, each preset detection model is pre-trained by:
A. And determining a training environment index threshold corresponding to the training sample set according to each sample image in the training sample set of target detection.
Before training the model, images obtained by shooting targets in various shooting environments are firstly obtained to serve as a training sample set, wherein the training sample set can be a self-defined image set or a data set provided by each open source platform, such as a public vehicle detection data set. And then, determining a training environment index value corresponding to each sample image in the training sample set, and determining an environment index threshold value (namely a training environment index threshold value) corresponding to the training sample set by utilizing all the training environment index values. The training environment index threshold in this embodiment is the training light intensity index threshold. Specific implementation of this procedure can be seen from the description of determining the detection environment index threshold in the second embodiment.
B. and dividing the training sample set into training sample subsets corresponding to all shooting environments according to the training environment index value and the training environment index threshold value corresponding to each sample image.
C. and aiming at each shooting environment, performing model training on an initial detection model corresponding to the shooting environment by utilizing a training sample subset corresponding to the shooting environment to obtain a preset detection model corresponding to the shooting environment.
The initial detection model refers to an untrained target detection model, and model parameters of the initial detection model are initial model parameter values. The initial detection model corresponding to each shooting environment can be the same or different.
Each training sample subset is used to train the corresponding initial detection model, obtaining the preset detection model corresponding to that subset. Even if all the initial detection models are the same, the preset detection models finally obtained differ because the training data differ. For example, the preset detection model trained with the strong illumination training sample subset performs better at target detection on images captured in a strong illumination shooting environment, while the preset detection model trained with the weak illumination training sample subset performs better on images captured in a weak illumination shooting environment.
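Step B above (partitioning the training sample set by the training environment index threshold) can be sketched as follows. Names and the comparison direction are illustrative, mirroring the strong/weak illumination convention used earlier.

```python
def split_by_threshold(samples, index_values, threshold):
    """Step B: partition the training sample set into per-environment subsets
    using each sample's training environment index value and the training
    environment index threshold."""
    strong, weak = [], []
    for sample, value in zip(samples, index_values):
        (strong if value <= threshold else weak).append(sample)
    return {"strong_illumination": strong, "weak_illumination": weak}

# Each subset would then be used to train its own initial detection model (step C).
subsets = split_by_threshold(["img1", "img2", "img3"], [20.0, 30.0, 25.0], threshold=26.0)
```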
Illustratively, the initial detection model is the third version of the end-to-end single-network object detection model, YOLO V3. YOLO V3 is so far the object detection network with the best balance between running speed and detection accuracy. The shortcomings of earlier YOLO versions (fast but poor at detecting small objects, etc.) are remedied by fusing a number of advanced methods. YOLO divides an input image into S×S grid cells; if the center of an object's ground truth falls into a cell, that cell is responsible for detecting the object, and each cell predicts B bounding boxes with their confidence scores, and C class probabilities. The bbox information (x, y, w, h) is the offset of the object's center relative to the grid cell position together with the width and height, all normalized. The confidence reflects both whether an object is contained and the accuracy of its predicted position, defined as Pr(Object) × IOU^truth_pred, where Pr(Object) ∈ {0, 1}.
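The IOU^truth_pred term in the confidence above is the intersection over union between a predicted box and its ground-truth box. A minimal sketch for axis-aligned boxes in (x1, y1, x2, y2) form:

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

overlap = iou((0, 0, 2, 2), (1, 1, 3, 3))  # intersection 1, union 7
```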
YOLO V3 predicts using a multi-scale fusion approach. It adopts FPN-like upsampling (upsample) and fusion (three scales are fused in the end; the sizes of the other two scales are 16×16 and 32×32 respectively), and performs detection on feature maps of multiple scales, so the detection effect on small targets is markedly improved. Although the 3 bounding boxes per grid cell in YOLO V3 appear fewer than the 5 per grid cell in YOLO V2, the total number of bounding boxes is much greater than before because YOLO V3 employs feature fusion at multiple scales.
And S370, carrying out vehicle detection on the image to be detected based on the target preset detection model, and obtaining each vehicle in the image to be detected.
Vehicle detection is performed on the two images to be detected shown in fig. 3, and the corresponding vehicle detection results are obtained, as shown in fig. 5. As can be seen from fig. 5, the target detection method of the embodiment of the invention, which combines the model scheduling algorithm with the YOLO V3 model, obtains detection results with high vehicle recognition accuracy in both strong and weak illumination shooting environments, thereby improving vehicle recognition accuracy, reducing the computational complexity of the vehicle detection process and achieving real-time detection.
According to the technical scheme of the embodiment, a training environment index threshold corresponding to a training sample set is determined according to each sample image in the training sample set detected by the target; dividing a training sample set into training sample subsets corresponding to shooting environments according to training environment index values and training environment index thresholds corresponding to each sample image; and aiming at each shooting environment, performing model training on an initial detection model corresponding to the shooting environment by utilizing a training sample subset corresponding to the shooting environment to obtain a preset detection model corresponding to the shooting environment. Training of preset detection models corresponding to different shooting environment categories is achieved, so that each preset detection model is matched with the corresponding shooting environment, and the target detection accuracy of the model can be further improved.
Example IV
The present embodiment provides an object detection apparatus, referring to fig. 6, which specifically includes:
The current environment index value determining module 810 is configured to determine a current environment index value of a shooting environment corresponding to the image to be detected according to the image to be detected;
The target shooting environment category determining module 820 is configured to determine a target shooting environment category corresponding to the image to be detected according to the current environment index value and the detection environment index threshold value;
The target preset detection model determining module 830 is configured to determine a target preset detection model corresponding to the target shooting environment category according to the target shooting environment category and a mapping relationship between the shooting environment category and the preset detection model;
the target detection module 840 is configured to perform target detection on the image to be detected based on a target preset detection model, so as to obtain each target in the image to be detected.
Optionally, the current environmental indicator value determining module 810 is specifically configured to:
And determining the brightness information of the image to be detected according to each brightness value corresponding to the brightness channel of the image to be detected, and taking the brightness information as the current environment index value of the shooting environment corresponding to the image to be detected.
Optionally, the current environmental indicator value determining module 810 is specifically configured to:
According to the image to be detected and the set guide image, filtering the image to be detected based on a guidable filter to obtain a filtered image;
And determining the peak signal-to-noise ratio corresponding to the image to be detected as the current environment index value of the shooting environment corresponding to the image to be detected according to the image to be detected and the filtering image.
Optionally, on the basis of the above apparatus, the apparatus further includes a detection environment index threshold determining module, configured to:
before determining a target shooting environment category corresponding to an image to be detected according to the current environment index value and the detection environment index threshold value, determining an associated environment index value of a shooting environment corresponding to each associated image according to a set number of associated images corresponding to the image to be detected;
and determining a detection environment index threshold corresponding to the image to be detected based on a preset threshold segmentation algorithm according to each associated environment index value.
Optionally, on the basis of the device, the device further comprises a model training module, configured to train each preset detection model in advance by the following manner:
Determining a training environment index threshold corresponding to the training sample set according to each sample image in the training sample set of target detection;
Dividing a training sample set into training sample subsets corresponding to shooting environments according to training environment index values and training environment index thresholds corresponding to each sample image;
And aiming at each shooting environment, performing model training on an initial detection model corresponding to the shooting environment by utilizing a training sample subset corresponding to the shooting environment to obtain a preset detection model corresponding to the shooting environment.
Further, the initial detection model is a third version of an end-to-end single network target detection model.
Optionally, the target is a vehicle, the shooting environment is illumination intensity, and the shooting environment categories are a strong illumination category and a weak illumination category.
According to the target detection device provided by the fourth embodiment of the invention, a suitable target preset detection model is adaptively scheduled according to the current environment index value of the image to be detected, so that detection results with high target detection precision can be obtained in different shooting environments. This solves the problems of low target detection precision across shooting environments and of detection delay caused by low algorithm generalization capability and high algorithm complexity, improving target detection accuracy and reducing algorithm complexity so that targets are detected more rapidly.
The object detection device provided by the embodiment of the invention can execute the object detection method provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects of the execution method.
It should be noted that, in the above embodiment of the object detection apparatus, each unit and module included are only divided according to the functional logic, but not limited to the above division, so long as the corresponding functions can be implemented; in addition, the specific names of the functional units are also only for distinguishing from each other, and are not used to limit the protection scope of the present invention.
Example five
Referring to fig. 7, the present embodiment provides an electronic device, which includes: one or more processors 920; the storage device 910 is configured to store one or more programs, where the one or more programs are executed by the one or more processors 920, so that the one or more processors 920 implement the target detection method provided by the embodiment of the present invention, and the method includes:
determining a current environment index value of a shooting environment corresponding to the image to be detected according to the image to be detected;
Determining a target shooting environment category corresponding to the image to be detected according to the current environment index value and the detection environment index threshold value;
determining a target preset detection model corresponding to the target shooting environment category according to the target shooting environment category and the mapping relation between the shooting environment category and the preset detection model;
and carrying out target detection on the image to be detected based on a target preset detection model to obtain each target in the image to be detected.
Of course, those skilled in the art will appreciate that the processor 920 may also implement the technical solution of the target detection method provided in any embodiment of the present invention.
The electronic device shown in fig. 7 is only an example and should not be construed as limiting the functionality and scope of use of the embodiments of the present invention. As shown in fig. 7, the electronic device includes a processor 920, a storage device 910, an input device 930, and an output device 940; the number of processors 920 in the electronic device may be one or more, one processor 920 being illustrated in fig. 7; the processor 920, the storage device 910, the input device 930, and the output device 940 in the electronic device may be connected by a bus or other means, for example, by a bus 950 in fig. 7.
The storage device 910 is used as a computer readable storage medium, and may be used to store a software program, a computer executable program, and a module, such as program instructions/modules corresponding to the target detection method in the embodiment of the present invention (for example, a current environmental index value determining module, a target shooting environment category determining module, a target preset detection model determining module, and a target detection module in the target detection device).
The storage device 910 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, at least one application program required for a function; the storage data area may store data created according to the use of the terminal, etc. In addition, the storage 910 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. In some examples, the storage 910 may further include memory remotely located relative to the processor 920, which may be connected to the electronic device via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The input device 930 may be used to receive input numeric or character information and to generate key signal inputs related to user settings and function control of the electronic device. The output device 940 may include a display device such as a display screen.
Example six
The present embodiment provides a storage medium containing computer executable instructions which, when executed by a computer processor, are configured to perform a method of object detection, the method comprising:
determining a current environment index value of a shooting environment corresponding to the image to be detected according to the image to be detected;
Determining a target shooting environment category corresponding to the image to be detected according to the current environment index value and the detection environment index threshold value;
determining a target preset detection model corresponding to the target shooting environment category according to the target shooting environment category and the mapping relation between the shooting environment category and the preset detection model;
and carrying out target detection on the image to be detected based on a target preset detection model to obtain each target in the image to be detected.
Of course, the storage medium containing the computer executable instructions provided in the embodiments of the present invention is not limited to the above method operations, and may also perform the related operations in the object detection method provided in any embodiment of the present invention.
From the above description of embodiments, it will be clear to a person skilled in the art that the present invention may be implemented by means of software and necessary general purpose hardware, but of course also by means of hardware, although in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a computer readable storage medium, such as a floppy disk, a Read-Only Memory (ROM), a random access Memory (Random Access Memory, RAM), a FLASH Memory (FLASH), a hard disk, or an optical disk of a computer, etc., and include several instructions for causing an electronic device (which may be a personal computer, a server, or a network device, etc.) to execute the object detection method provided by the embodiments of the present invention.
Note that the above is only a preferred embodiment of the present invention and the technical principle applied. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, while the invention has been described in connection with the above embodiments, the invention is not limited to the embodiments, but may be embodied in many other equivalent forms without departing from the spirit or scope of the invention, which is set forth in the following claims.

Claims (9)

1. A method of detecting an object, comprising:
Determining a current environment index value of a shooting environment corresponding to an image to be detected according to the image to be detected; the determining the current environment index value of the shooting environment corresponding to the image to be detected according to the image to be detected comprises the following steps: according to the image to be detected and the set guide image, filtering the image to be detected based on a guidable filter to obtain a filtered image; determining a peak signal-to-noise ratio corresponding to the image to be detected as a current environment index value of a shooting environment corresponding to the image to be detected according to the image to be detected and the filtering image;
Determining a target shooting environment category corresponding to the image to be detected according to the current environment index value and the detection environment index threshold value;
determining a target preset detection model corresponding to the target shooting environment category according to the target shooting environment category and the mapping relation between the shooting environment category and the preset detection model;
And carrying out target detection on the image to be detected based on the target preset detection model to obtain each target in the image to be detected.
2. The method according to claim 1, wherein determining the current environment index value of the shooting environment corresponding to the image to be detected according to the image to be detected comprises:
And determining the brightness information of the image to be detected according to each brightness value corresponding to the brightness channel of the image to be detected, and taking the brightness information as the current environment index value of the shooting environment corresponding to the image to be detected.
3. The method according to claim 1, further comprising, before the determining the target capturing environment category corresponding to the image to be detected according to the current environment index value and the detection environment index threshold value:
determining an associated environment index value of a shooting environment corresponding to each associated image according to a set number of associated images corresponding to the image to be detected;
And determining the detection environment index threshold corresponding to the image to be detected based on a preset threshold segmentation algorithm according to each associated environment index value.
4. The method of claim 1, wherein each of the predetermined detection models is pre-trained by:
determining a training environment index threshold corresponding to a training sample set according to each sample image in the training sample set of target detection;
dividing the training sample set into training sample subsets corresponding to shooting environments according to training environment index values and training environment index thresholds corresponding to each sample image;
And aiming at each shooting environment, performing model training on an initial detection model corresponding to the shooting environment by utilizing a training sample subset corresponding to the shooting environment to obtain a preset detection model corresponding to the shooting environment.
5. The method of claim 4, wherein the initial detection model is a third version of an end-to-end single network target detection model.
6. The method of any one of claims 1-5, wherein the target is a vehicle, the capture environment is an illumination intensity, and the capture environment categories are a high illumination category and a low illumination category.
7. An object detection apparatus, comprising:
The current environment index value determining module is used for determining a current environment index value of a shooting environment corresponding to the image to be detected according to the image to be detected; the current environment index value determining module is specifically configured to: according to the image to be detected and the set guide image, filtering the image to be detected based on a guidable filter to obtain a filtered image; determining a peak signal-to-noise ratio corresponding to the image to be detected as a current environment index value of a shooting environment corresponding to the image to be detected according to the image to be detected and the filtering image;
The target shooting environment category determining module is used for determining a target shooting environment category corresponding to the image to be detected according to the current environment index value and the detection environment index threshold value;
The target preset detection model determining module is used for determining a target preset detection model corresponding to the target shooting environment category according to the target shooting environment category and the mapping relation between the shooting environment category and the preset detection model;
And the target detection module is used for carrying out target detection on the image to be detected based on the target preset detection model to obtain each target in the image to be detected.
8. An electronic device, the electronic device comprising:
one or more processors;
storage means for storing one or more programs,
When executed by the one or more processors, causes the one or more processors to implement the target detection method of any of claims 1-6.
9. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the object detection method according to any one of claims 1-6.
CN201910578134.4A 2019-06-28 2019-06-28 Target detection method, device, equipment and storage medium Active CN112149476B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910578134.4A CN112149476B (en) 2019-06-28 2019-06-28 Target detection method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN112149476A CN112149476A (en) 2020-12-29
CN112149476B true CN112149476B (en) 2024-06-21

Family

ID=73891107

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910578134.4A Active CN112149476B (en) 2019-06-28 2019-06-28 Target detection method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112149476B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112926476B (en) * 2021-03-08 2024-06-18 京东鲲鹏(江苏)科技有限公司 Vehicle identification method, device and storage medium
CN113077422B (en) * 2021-03-22 2023-08-15 浙江大华技术股份有限公司 Foggy image detection method, model training method and device
CN113933294B (en) * 2021-11-08 2023-07-18 中国联合网络通信集团有限公司 Concentration detection method and device
CN114241430A (en) * 2021-12-22 2022-03-25 杭州海康威视***技术有限公司 Event detection method, device and system, electronic equipment and storage medium
CN117649367B (en) * 2024-01-30 2024-04-30 广州敏行数字科技有限公司 Image orientation correction method and system

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108810413A (en) * 2018-06-15 2018-11-13 Oppo广东移动通信有限公司 Image processing method and device, electronic equipment, computer readable storage medium
CN109858381A (en) * 2019-01-04 2019-06-07 深圳壹账通智能科技有限公司 Liveness detection method, device, computer equipment and storage medium

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102902951A (en) * 2012-06-29 2013-01-30 陕西省交通规划设计研究院 System and method for vehicle target location and event detection on basis of high-definition video monitoring images
CN103927734B (en) * 2013-01-11 2016-12-28 华中科技大学 No-reference quality evaluation method for blurred images
JP5820843B2 (en) * 2013-05-29 2015-11-24 富士重工業株式会社 Ambient environment judgment device
US9460522B2 (en) * 2014-10-29 2016-10-04 Behavioral Recognition Systems, Inc. Incremental update for background model thresholds
CN105791814A (en) * 2016-03-09 2016-07-20 中国科学院自动化研究所 Image-processing-technology-based monitoring video quality detection method and apparatus
CN106205488B (en) * 2016-09-21 2019-01-15 深圳市华星光电技术有限公司 Extend the method and display equipment in display of organic electroluminescence service life
CN109871730A (en) * 2017-12-05 2019-06-11 杭州海康威视数字技术股份有限公司 Target recognition method, device and monitoring device
CN109325418A (en) * 2018-08-23 2019-02-12 华南理工大学 Pedestrian recognition method in road traffic environments based on improved YOLOv3

Similar Documents

Publication Publication Date Title
CN112149476B (en) Target detection method, device, equipment and storage medium
CN110276767B (en) Image processing method and device, electronic equipment and computer readable storage medium
CN110378945B (en) Depth map processing method and device and electronic equipment
CN107153817B (en) Pedestrian re-identification data labeling method and device
CN111104943B (en) Color image region-of-interest extraction method based on decision-level fusion
US11700457B2 (en) Flicker mitigation via image signal processing
CN110580428A (en) image processing method, image processing device, computer-readable storage medium and electronic equipment
CN109413411B (en) Black screen identification method and device of monitoring line and server
CN110569782A (en) Target detection method based on deep learning
CN111738064A (en) Haze concentration identification method for haze image
CN107563299B (en) Pedestrian detection method using RecNN to fuse context information
CN111695373B (en) Zebra stripes positioning method, system, medium and equipment
WO2022116104A1 (en) Image processing method and apparatus, and device and storage medium
CN108765406A (en) Snow mountain detection method based on infrared remote sensing images
CN112949578B (en) Vehicle lamp state identification method, device, equipment and storage medium
CN110969164A (en) Low-illumination imaging license plate recognition method and device based on deep learning end-to-end
CN112686252A (en) License plate detection method and device
CN112330544A (en) Image smear processing method, device, equipment and medium
Shu et al. Small moving vehicle detection via local enhancement fusion for satellite video
CN111027564A (en) Low-illumination imaging license plate recognition method and device based on deep learning integration
CN110751667A (en) Method for detecting infrared dim small target under complex background based on human visual system
Lashkov et al. Edge-computing-facilitated nighttime vehicle detection investigations with CLAHE-enhanced images
CN116311212B (en) Ship number identification method and device based on high-speed camera and in motion state
CN110633705A (en) Low-illumination imaging license plate recognition method and device
CN112785550B (en) Image quality value determining method and device, storage medium and electronic device

Legal Events

Date Code Title Description
PB01 Publication
CB02 Change of applicant information

Address after: 601, 6 / F, building 2, No. 18, Kechuang 11th Street, Daxing District, Beijing, 100176

Applicant after: Jingdong Technology Information Technology Co.,Ltd.

Address before: 601, 6 / F, building 2, No. 18, Kechuang 11th Street, Daxing District, Beijing, 100176

Applicant before: Jingdong Shuke Haiyi Information Technology Co.,Ltd.

Address after: 601, 6 / F, building 2, No. 18, Kechuang 11th Street, Daxing District, Beijing, 100176

Applicant after: Jingdong Shuke Haiyi Information Technology Co.,Ltd.

Address before: 601, 6 / F, building 2, No. 18, Kechuang 11th Street, Daxing District, Beijing, 100176

Applicant before: BEIJING HAIYI TONGZHAN INFORMATION TECHNOLOGY Co.,Ltd.

SE01 Entry into force of request for substantive examination
GR01 Patent grant