CN112614121A - Multi-scale small-target equipment defect identification and monitoring method - Google Patents

Multi-scale small-target equipment defect identification and monitoring method

Info

Publication number
CN112614121A
Authority
CN
China
Prior art keywords
box
default
convolution
scale
target equipment
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011592556.6A
Other languages
Chinese (zh)
Inventor
封琰
谭毓卿
袁源
张海林
吴童生
王兴顺
李沛然
樊海峰
梁珑
田洪滨
展毅晟
芦国云
郭妍
谢占兰
卢涛
冯小霞
张青梅
沈娟
马雅静
刘有文
严隆兴
余国栋
杨品梅
邓蓉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
QINGHAI SANXIN RURAL POWER CO Ltd
Hainan Power Supply Co Of State Grid Qinghai Electric Power Co
Original Assignee
QINGHAI SANXIN RURAL POWER CO Ltd
Hainan Power Supply Co Of State Grid Qinghai Electric Power Co
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2020-12-29
Filing date
2020-12-29
Publication date
2021-04-06
Application filed by QINGHAI SANXIN RURAL POWER CO Ltd and Hainan Power Supply Co Of State Grid Qinghai Electric Power Co
Priority to CN202011592556.6A
Publication of CN112614121A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/0002 Inspection of images, e.g. flaw detection
    • G06T7/0004 Industrial image inspection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/25 Determination of region of interest [ROI] or a volume of interest [VOI]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10004 Still image; Photographic image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20024 Filtering details
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30181 Earth observation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Multimedia (AREA)
  • Quality & Reliability (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to the technical field of machine vision, and in particular to an image-based method for identifying and monitoring defects in small-target equipment. The multi-scale small-target equipment defect identification and monitoring method is characterized in that: (1) a single-pass object detector is constructed for multiple categories; (2) small convolution filters are used to predict the class scores and position offsets of a fixed set of default bounding boxes on the feature map; (3) predictions at different scales are generated from feature maps of different scales and are explicitly separated by aspect ratio. By predicting object regions on the feature maps of different convolution layers, the method outputs the coordinates of discretized default boxes with multiple scales and aspect ratios, and uses small convolution kernels to predict the box coordinate offsets of a series of candidate boxes together with the confidence for each category.

Description

Multi-scale small-target equipment defect identification and monitoring method
Technical Field
The invention relates to the technical field of machine vision, and in particular to an image-based method for identifying and monitoring defects in small-target equipment.
Background
In the task of detecting and identifying defective equipment targets, the target may appear at any position in the image, and its size and aspect ratio are not known in advance, which makes detection and identification difficult. Because the size of the target is uncertain, classifying every possible position and region size in the image would consume a large amount of computing resources. It is therefore necessary to first generate candidate regions (region proposals) that are likely to contain objects.
The convolutional neural network is a type of neural network and one of the most commonly used networks in deep learning; it is widely applied in machine vision, text processing, numerical analysis, and other fields. Deep learning is the most important branch of machine learning and has reached a level of performance in many fields that earlier machine learning methods could not achieve. The convolutional neural network can therefore be regarded as representative of current mainstream artificial-intelligence detection approaches.
Disclosure of Invention
In order to further improve the detection accuracy for multi-scale small targets in aerial images, the invention provides a ResNet50 variant network structure that enhances convolutional feature extraction for multi-scale small targets. Increasing the network width allows each layer of the network to learn both sparse and non-sparse features and also improves the network's adaptability to multi-scale small targets. In addition, two consecutive 3 × 3 convolutions are used in place of a single 5 × 5 convolution: they cover the same receptive field while reducing the number of convolution-layer weight parameters.
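The receptive-field and parameter claim for stacked 3 × 3 convolutions can be checked with a few lines of arithmetic. The sketch below assumes stride-1, undilated convolutions and ignores biases; the channel count of 64 and the function names are hypothetical and chosen only for illustration.

```python
def stacked_receptive_field(kernel_sizes):
    """Receptive field of a stack of stride-1, undilated convolutions."""
    rf = 1
    for k in kernel_sizes:
        rf += k - 1  # each layer widens the field by (k - 1) pixels
    return rf


def conv_weights(kernel_size, in_channels, out_channels):
    """Number of weights in one square convolution layer (biases ignored)."""
    return kernel_size * kernel_size * in_channels * out_channels


channels = 64  # hypothetical channel count, not taken from the patent
two_3x3 = 2 * conv_weights(3, channels, channels)
one_5x5 = conv_weights(5, channels, channels)

print(stacked_receptive_field([3, 3]))  # 5
print(stacked_receptive_field([5]))     # 5
print(two_3x3, one_5x5)                 # 73728 102400
```

Per input/output channel pair, the stacked form needs 2 × 9 = 18 weights against 25 for the single 5 × 5 kernel, which is the reduction the paragraph refers to.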
(1) A single-pass object detector is constructed for multiple categories.
(2) Small convolution filters are used to predict the class scores and position offsets of a fixed set of default bounding boxes on the feature map.
(3) To achieve high detection accuracy, predictions at different scales are generated from feature maps of different scales, and the predictions are explicitly separated by aspect ratio.
The model adds several feature layers to the end of the base network; these layers predict the offsets to default boxes of different scales and aspect ratios, together with their associated confidences.
(1) Multi-scale feature maps for detection: convolutional feature layers are added to the end of the truncated base network. These layers decrease progressively in size and allow predictions of detections at multiple scales; the convolutional model used for detection is different for each feature layer.
(2) Convolutional predictors for detection: each added feature layer (or, optionally, an existing feature layer of the base network) can produce a fixed set of predictions using a set of convolution filters. These filters are indicated at the top of the SSD network architecture diagram. For a feature layer of size m × n with p channels, a 3 × 3 × p convolution kernel is applied, producing either a score for a category or a coordinate offset relative to a default box. An output value is produced at each of the m × n locations where the kernel is applied. The bounding box offset outputs are measured relative to a default box, whose position is in turn defined relative to the feature map.
(3) Default boxes and aspect ratios: a set of default bounding boxes is associated with each feature map cell at the top of the network. The default boxes tile the feature map in a convolutional manner, so that the position of each box relative to its corresponding cell is fixed. For each feature map cell, the offsets relative to the default box shapes in the cell are predicted, along with the per-class scores of the instance in each box. Specifically, for each of the k boxes at a given location, c class scores and 4 offsets relative to the original default box are computed. This requires a total of (c + 4)k filters at each location in the feature map, yielding (c + 4)kmn outputs for an m × n feature map. The default boxes are similar to the anchor boxes used in Faster R-CNN, but they are applied to several feature maps of different resolutions. Using different default box shapes on several feature maps effectively discretizes the space of possible output box shapes.
(4) Matching strategy: first, each ground truth box is matched to the default box with the best Jaccard overlap, as in MultiBox, which ensures that each ground truth box corresponds to a unique default box. Unlike MultiBox, default boxes are then also matched to any ground truth box whose Jaccard overlap with them is greater than a threshold. The Jaccard overlap is given by the following formula:
J(A, B) = |A ∩ B| / |A ∪ B|
As the formula shows, the Jaccard overlap is the IoU: the intersection of the two sets (here, a default box A and a ground truth box B) divided by their union.
(5) Data augmentation: for each training image, one of the following options is selected at random:
Sample a patch from the original image such that the minimum Jaccard overlap (IoU) between the patch and the objects is 0.1, 0.3, 0.5, 0.7, or 0.9.
Randomly sample a patch.
The size of each sampled patch is [0.1, 1] of the original image size, and its aspect ratio is between 1/2 and 2. The overlapping part of a ground truth box is kept if the center of the box lies inside the sampled patch. After these sampling steps, each sampled patch is resized to a fixed size and flipped horizontally with a probability of 0.5.
(7) Adding atrous convolution: modifying the network structure changes the size of the receptive field. The atrous (dilated convolution) algorithm is therefore adopted, which improves the recognition accuracy of the model.
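The default-box layout and the output count described in item (3) can be sketched as follows. This is only an illustration under stated assumptions: the patent does not give the box-size formula, so the common convention w = s·√a, h = s/√a (scale s, aspect ratio a) is assumed here, and the function names, the scale 0.3, and the class count 21 are hypothetical.

```python
from itertools import product


def num_outputs(m, n, k, c):
    """Outputs for an m x n feature map with k default boxes per cell:
    c class scores plus 4 offsets per box, i.e. (c + 4) * k * m * n."""
    return (c + 4) * k * m * n


def default_boxes(m, n, scale, aspect_ratios):
    """Centre-form default boxes (cx, cy, w, h) in [0, 1] image coordinates.

    One box per aspect ratio is placed at the centre of every cell, so the
    position of each box relative to its cell is fixed.
    """
    boxes = []
    for row, col in product(range(m), range(n)):
        cx = (col + 0.5) / n
        cy = (row + 0.5) / m
        for ar in aspect_ratios:
            # Assumed convention: width/height ratio equals the aspect ratio.
            w = scale * ar ** 0.5
            h = scale / ar ** 0.5
            boxes.append((cx, cy, w, h))
    return boxes


boxes = default_boxes(4, 4, scale=0.3, aspect_ratios=[1.0, 2.0, 0.5])
print(len(boxes))                          # 48 boxes: 4 * 4 cells, 3 per cell
print(num_outputs(m=4, n=4, k=3, c=21))    # 1200
```

Calling `default_boxes` once per feature map, with a different scale for each, gives the multi-scale set of boxes that the predictors regress against.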
The invention combines the regression approach of YOLO with the anchor box mechanism of Faster R-CNN to provide a multi-scale target detection and identification method. By predicting object regions on the feature maps of different convolution layers, the method outputs the coordinates of discretized default boxes with multiple scales and aspect ratios, and uses small convolution kernels to predict the box coordinate offsets of a series of candidate boxes and the confidence for each category.
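The matching strategy in item (4) above can be sketched in two stages. This is a simplified illustration, not the patent's implementation: boxes are assumed to be in corner form (x1, y1, x2, y2), the threshold value 0.4 in the example is hypothetical because the patent does not state one, and if two ground truth boxes pick the same best default box the later one simply overwrites the earlier one.

```python
def iou(a, b):
    """Jaccard overlap of two corner-form boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match(default_boxes, gt_boxes, threshold):
    """Return {default_index: gt_index} following the two-stage strategy."""
    if not default_boxes or not gt_boxes:
        return {}
    matches = {}
    # Stage 1: every ground truth box claims its best-overlapping default box.
    for g, gt in enumerate(gt_boxes):
        best = max(range(len(default_boxes)),
                   key=lambda d: iou(default_boxes[d], gt))
        matches[best] = g
    # Stage 2: any other default box is matched to its best ground truth box
    # if their overlap exceeds the threshold.
    for d, box in enumerate(default_boxes):
        if d in matches:
            continue
        overlaps = [iou(box, gt) for gt in gt_boxes]
        g = max(range(len(gt_boxes)), key=overlaps.__getitem__)
        if overlaps[g] > threshold:
            matches[d] = g
    return matches


defaults = [(0.0, 0.0, 0.5, 0.5), (0.1, 0.1, 0.6, 0.6), (0.5, 0.5, 1.0, 1.0)]
gts = [(0.0, 0.0, 0.5, 0.5)]
print(match(defaults, gts, threshold=0.4))  # {0: 0, 1: 0}
```

In the example, the first default box is the best match, the second is added in stage 2 because its overlap (about 0.47) exceeds the assumed threshold, and the third has no overlap and stays unmatched.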
Drawings
FIG. 1 is a flow chart of data processing according to an embodiment of the present invention.
Detailed Description
The technical solution of the present invention is further explained below with reference to the accompanying drawings.
Target detection and identification algorithms based on region-based convolutional neural networks detect targets by combining candidate regions (region proposals) with a convolutional neural network (CNN) classifier: the positions in the image where a target may appear, i.e. the candidate regions, are found first, and the convolutional neural network is then used to extract features.
The invention provides a ResNet50 variant network structure that enhances convolutional feature extraction for multi-scale small targets. Increasing the network width allows each layer of the network to learn both sparse and non-sparse features and also improves the network's adaptability to multi-scale small targets. In addition, two consecutive 3 × 3 convolutions are used in place of a single 5 × 5 convolution: they cover the same receptive field while reducing the number of convolution-layer weight parameters.
(1) A single-pass object detector is constructed for multiple categories; it is faster and more accurate than prior-art methods while maintaining a higher detection rate.
(2) Small convolution filters are used to predict the class scores and position offsets of a fixed set of default bounding boxes on the feature map.
(3) To achieve high detection accuracy, predictions at different scales are generated from feature maps of different scales, and the predictions are explicitly separated by aspect ratio.
The model adds several feature layers to the end of the base network; these layers predict the offsets to default boxes of different scales and aspect ratios, together with their associated confidences.
(1) Multi-scale feature maps for detection: convolutional feature layers are added to the end of the truncated base network. These layers decrease progressively in size and allow predictions of detections at multiple scales; the convolutional model used for detection is different for each feature layer.
(2) Convolutional predictors for detection: each added feature layer (or, optionally, an existing feature layer of the base network) can produce a fixed set of predictions using a set of convolution filters. These filters are indicated at the top of the SSD network architecture diagram. For a feature layer of size m × n with p channels, a 3 × 3 × p convolution kernel is applied, producing either a score for a category or a coordinate offset relative to a default box. An output value is produced at each of the m × n locations where the kernel is applied. The bounding box offset outputs are measured relative to a default box, whose position is in turn defined relative to the feature map.
(3) Default boxes and aspect ratios: a set of default bounding boxes is associated with each feature map cell at the top of the network. The default boxes tile the feature map in a convolutional manner, so that the position of each box relative to its corresponding cell is fixed. For each feature map cell, the offsets relative to the default box shapes in the cell are predicted, along with the per-class scores of the instance in each box. Specifically, for each of the k boxes at a given location, c class scores and 4 offsets relative to the original default box are computed. This requires a total of (c + 4)k filters at each location in the feature map, yielding (c + 4)kmn outputs for an m × n feature map. The default boxes are similar to the anchor boxes used in Faster R-CNN, but they are applied to several feature maps of different resolutions. Using different default box shapes on several feature maps effectively discretizes the space of possible output box shapes.
(4) Matching strategy: first, each ground truth box is matched to the default box with the best Jaccard overlap, as in MultiBox, which ensures that each ground truth box corresponds to a unique default box. Unlike MultiBox, default boxes are then also matched to any ground truth box whose Jaccard overlap with them is greater than a threshold. The Jaccard overlap is given by the following formula:
J(A, B) = |A ∩ B| / |A ∪ B|
The Jaccard overlap is the IoU, i.e. the intersection of the two sets (here, a default box A and a ground truth box B) divided by their union.
(5) Data augmentation: for each training image, one of the following options is selected at random:
Sample a patch from the original image such that the minimum Jaccard overlap (IoU) between the patch and the objects is 0.1, 0.3, 0.5, 0.7, or 0.9.
Randomly sample a patch.
The size of each sampled patch is [0.1, 1] of the original image size, and its aspect ratio is between 1/2 and 2. The overlapping part of a ground truth box is kept if the center of the box lies inside the sampled patch. After these sampling steps, each sampled patch is resized to a fixed size and flipped horizontally with a probability of 0.5.
(7) Adding atrous convolution: modifying the network structure changes the size of the receptive field. The atrous (dilated convolution) algorithm is therefore adopted, which improves the recognition accuracy of the model.
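The geometric part of the data augmentation in item (5) can be sketched as below. Several points are assumptions rather than details from the patent: coordinates are normalised to [0, 1], the size factor in [0.1, 1] is interpreted as a side-length scale, kept boxes are clipped to the patch, and the minimum-overlap constraint is omitted for brevity. The function names are hypothetical.

```python
import random


def sample_patch(rng, min_scale=0.1, max_scale=1.0, min_ar=0.5, max_ar=2.0):
    """Sample a patch (x1, y1, x2, y2) in normalised image coordinates."""
    while True:
        scale = rng.uniform(min_scale, max_scale)
        ar = rng.uniform(min_ar, max_ar)
        w = scale * ar ** 0.5
        h = scale / ar ** 0.5
        if w <= 1.0 and h <= 1.0:  # retry until the patch fits in the image
            break
    x1 = rng.uniform(0.0, 1.0 - w)
    y1 = rng.uniform(0.0, 1.0 - h)
    return (x1, y1, x1 + w, y1 + h)


def keep_boxes(patch, gt_boxes):
    """Keep ground truth boxes whose centre lies inside the patch.

    Each kept box is clipped so that only its overlap with the patch remains.
    """
    px1, py1, px2, py2 = patch
    kept = []
    for x1, y1, x2, y2 in gt_boxes:
        cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
        if px1 <= cx <= px2 and py1 <= cy <= py2:
            kept.append((max(x1, px1), max(y1, py1),
                         min(x2, px2), min(y2, py2)))
    return kept


rng = random.Random(0)
patch = sample_patch(rng)
flip = rng.random() < 0.5  # horizontal flip decision with probability 0.5
print(keep_boxes((0.0, 0.0, 0.5, 0.5),
                 [(0.2, 0.2, 0.6, 0.6), (0.6, 0.6, 0.9, 0.9)]))
# [(0.2, 0.2, 0.5, 0.5)]
```

Resizing the patch to the fixed network input size and applying the flip to the pixels and box coordinates would follow these steps in a full pipeline.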

Claims (5)

1. A multi-scale small target equipment defect identification monitoring method is characterized in that:
(1) constructing a single-pass object detector for a plurality of classes;
(2) predicting category scores and position offsets for a fixed set of default bounding boxes on the feature map using a small convolution filter;
(3) predictions of different scales are generated from feature maps of different scales, and are explicitly separated by aspect ratios.
2. The method for identifying and monitoring the defects of the multi-scale small target equipment as claimed in claim 1, wherein the method comprises multi-scale feature map detection: convolutional feature layers are added to the end of the truncated base network, the sizes of the layers decrease progressively to obtain predictions of detections at multiple scales, and the convolutional model used for detection is different for each feature layer.
3. The method for identifying and monitoring the defects of the multi-scale small target equipment as claimed in claim 2, wherein the method comprises convolutional predictors for detection: each added feature layer (or, optionally, an existing feature layer of the base network) may use a set of convolution filters to produce a fixed set of predictions; for a feature layer of size m × n with p channels, a 3 × 3 × p convolution kernel is applied to produce a category score or a coordinate offset relative to a default box; an output value is produced at each of the m × n locations where the kernel is applied; the bounding box offset outputs are measured relative to the default box, and the position of the default box is defined relative to the feature map.
4. The method for identifying and monitoring the defects of the multi-scale small target equipment as claimed in claim 3, wherein the method comprises default boxes and aspect ratios: a set of default bounding boxes is associated with each feature map cell at the top of the network; the default boxes tile the feature map in a convolutional manner so that the position of each box relative to its corresponding cell is fixed; and, for each feature map cell, the offsets relative to the default box shapes in the cell and the per-class scores of the instance in each box are predicted.
5. The method for identifying and monitoring the defects of the multi-scale small target equipment as claimed in claim 4, wherein the matching strategy is as follows: first, each ground truth box is matched to the default box with the best Jaccard overlap, as in MultiBox, which ensures that each ground truth box corresponds to a unique default box; unlike MultiBox, a default box is then also matched to any ground truth box whose Jaccard overlap with it is larger than a threshold.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011592556.6A CN112614121A (en) 2020-12-29 2020-12-29 Multi-scale small-target equipment defect identification and monitoring method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011592556.6A CN112614121A (en) 2020-12-29 2020-12-29 Multi-scale small-target equipment defect identification and monitoring method

Publications (1)

Publication Number Publication Date
CN112614121A (en) 2021-04-06

Family

ID=75248808

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011592556.6A Pending CN112614121A (en) 2020-12-29 2020-12-29 Multi-scale small-target equipment defect identification and monitoring method

Country Status (1)

Country Link
CN (1) CN112614121A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107657233A (en) * 2017-09-28 2018-02-02 东华大学 Static sign language real-time identification method based on modified single multi-target detection device
CN110443778A (en) * 2019-06-25 2019-11-12 浙江工业大学 A method of detection industrial goods random defect
CN110660040A (en) * 2019-07-24 2020-01-07 浙江工业大学 Industrial product irregular defect detection method based on deep learning
CN110826514A (en) * 2019-11-13 2020-02-21 国网青海省电力公司海东供电公司 Construction site violation intelligent identification method based on deep learning
CN110826577A (en) * 2019-11-06 2020-02-21 国网新疆电力有限公司电力科学研究院 High-voltage isolating switch state tracking identification method based on target tracking
CN111753682A (en) * 2020-06-11 2020-10-09 中建地下空间有限公司 Hoisting area dynamic monitoring method based on target detection algorithm
CN111881970A (en) * 2020-07-23 2020-11-03 国网天津市电力公司 Intelligent outer broken image identification method based on deep learning

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113378864A (en) * 2021-08-16 2021-09-10 浙江啄云智能科技有限公司 Method, device and equipment for determining anchor frame parameters and readable storage medium
CN113378864B (en) * 2021-08-16 2021-11-12 浙江啄云智能科技有限公司 Method, device and equipment for determining anchor frame parameters and readable storage medium
CN117197133A (en) * 2023-11-06 2023-12-08 湖南睿图智能科技有限公司 Control system and method for vision robot in complex industrial environment
CN117197133B (en) * 2023-11-06 2024-01-30 湖南睿图智能科技有限公司 Control system and method for vision robot in complex industrial environment

Similar Documents

Publication Publication Date Title
CN108898047B (en) Pedestrian detection method and system based on blocking and shielding perception
Li et al. Automatic pavement crack detection by multi-scale image fusion
CN109117876B (en) Dense small target detection model construction method, dense small target detection model and dense small target detection method
CN111797697B (en) Angle high-resolution remote sensing image target detection method based on improved CenterNet
CN107609525B (en) Remote sensing image target detection method for constructing convolutional neural network based on pruning strategy
CN108846835B (en) Image change detection method based on depth separable convolutional network
CN109784203B (en) Method for inspecting contraband in weak supervision X-ray image based on layered propagation and activation
CN110135522B (en) Intelligent method for detecting and marking small target integration of remote sensing image
CN108257114A (en) A kind of transmission facility defect inspection method based on deep learning
CN111950488B (en) Improved Faster-RCNN remote sensing image target detection method
CN110826379B (en) Target detection method based on feature multiplexing and YOLOv3
CN112101278A (en) Hotel point cloud classification method based on k nearest neighbor feature extraction and deep learning
CN104978570B (en) The detection and recognition methods of traffic sign in driving video based on incremental learning
CN107729843B (en) Low-floor tramcar pedestrian identification method based on radar and visual information fusion
CN111860106B (en) Unsupervised bridge crack identification method
CN111798417A (en) SSD-based remote sensing image target detection method and device
CN112614121A (en) Multi-scale small-target equipment defect identification and monitoring method
CN114694178A (en) Method and system for monitoring safety helmet in power operation based on fast-RCNN algorithm
CN115170611A (en) Complex intersection vehicle driving track analysis method, system and application
CN111738114A (en) Vehicle target detection method based on anchor-free accurate sampling remote sensing image
CN113609895A (en) Road traffic information acquisition method based on improved Yolov3
CN116596875A (en) Wafer defect detection method and device, electronic equipment and storage medium
CN117689995A (en) Unknown spacecraft level detection method based on monocular image
CN116188943A (en) Solar radio spectrum burst information detection method and device
CN112767543B (en) FY-3D infrared hyperspectral cloud detection method based on logistic regression

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination