CN116229458A - Method for detecting inclusion based on YOLOV5 - Google Patents

Method for detecting inclusion based on YOLOV5

Info

Publication number
CN116229458A
CN116229458A
Authority
CN
China
Prior art keywords
inclusion
model
yolov5
detection
fluid
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310231853.5A
Other languages
Chinese (zh)
Inventor
王兴建
文雪梅
曹俊兴
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu University of Technology
Original Assignee
Chengdu University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu University of Technology
Priority to CN202310231853.5A
Publication of CN116229458A
Legal status: Pending

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/60 Type of objects
    • G06V20/69 Microscopic objects, e.g. biological cells or cellular parts
    • G06V20/693 Acquisition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/60 Type of objects
    • G06V20/69 Microscopic objects, e.g. biological cells or cellular parts
    • G06V20/695 Preprocessing, e.g. image segmentation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Sampling And Sample Adjustment (AREA)

Abstract

The invention belongs to the field of oil and gas exploration and provides a method for detecting fluid inclusions based on YOLOv5. The YOLOv5 model is improved according to the characteristics of fluid inclusions so that it is suited to fluid inclusion detection. The technical core of the invention comprises: collecting and creating a fluid inclusion dataset; adding a CA (Coordinate Attention) mechanism to the model; replacing the PANet in the Neck part of the YOLOv5 model with a BiFPN; adding a small-target detection layer to the model; and training better weights for detection. The invention can intelligently analyse inclusion thin sections under the microscope in real time, judge whether an object is a fluid inclusion, and thereby realise rapid and effective identification of fluid inclusions.

Description

Method for detecting inclusion based on YOLOV5
Technical Field
The invention relates to the field of oil and gas exploration, in particular to a method for detecting fluid inclusions based on YOLOv5.
Background
Inclusions trapped in minerals are the most complete and direct samples of the original fluid (or melt) preserved to the present day, and temperature measurements on inclusions provide a quantitative basis for determining fluid properties, fluid sources, diagenetic stages, and the diagenetic environment. They are of important guiding significance for oil and gas resource evaluation, reservoir geochemistry, fluid typing, fluid-source analysis, and exploration. The homogenization temperature of oil and gas inclusions can be used to determine the timing of hydrocarbon charging, and differences in charging time allow the hydrocarbon accumulation stages to be divided directly. Combined with the geological and structural characteristics of the sampled strata and the burial-thermal history of the reservoir, the homogenization temperature can also be used to estimate burial time and the palaeogeothermal gradient, infer the diagenetic history, and study source-rock maturity. To carry out an inclusion study, the inclusions in the mineral must first be found. The usual procedure is to observe the inclusion thin section under an optical microscope, delineate the approximate area of interest, and then proceed to the subsequent temperature-measurement study.
The object detection task is to have a computer automatically detect the position and class of target objects of interest in an image or video, and it is a classical task in computer vision. Current deep learning methods for object detection fall into two categories: two-stage detection algorithms and one-stage detection algorithms. The former first generates a series of candidate boxes as samples and then classifies those samples with a convolutional neural network; the latter converts box localization directly into a regression problem without generating candidate boxes. The two approaches differ in performance: the former is superior in detection and localization accuracy, while the latter is faster. As a one-stage detector, YOLOv5 has the advantages of a small computational load and high recognition speed, is widely used for object recognition at present, and achieves good recognition results.
Finding inclusions under an optical microscope can also be regarded as an image-processing task of finding patterns with specific shapes. Fluid inclusions are mostly elliptical, circular, or irregular, and a small number of pure liquid-phase inclusions are irregular. Fluid inclusions mainly comprise single-phase and two-phase inclusions, and in most gas-liquid two-phase inclusions the gas phase jumps vigorously. Because of this vigorous jumping of the gas phase, the instantaneous image has obvious characteristics, and such distinctively shaped images can be identified by a deep learning method, i.e., inclusion object detection. YOLOv5 is a high-performance, general-purpose object detection model that completes target localization and target classification in a single pass, so YOLOv5 is chosen as the basic backbone for object detection.
Because inclusions are small, finding them under an optical microscope is generally time-consuming and laborious, relies on a certain amount of experience, and manual searching inevitably suffers from missed and mistaken identifications. The method improves the YOLOv5 model and improves the accuracy and effectiveness of fluid inclusion detection.
Disclosure of Invention
The invention aims to solve the technical problem of providing a method for detecting fluid inclusions based on YOLOv5. Because inclusions are small targets, the feature extraction network and the feature fusion network of the YOLOv5 model are studied and improved, and a small-target detection layer is added, improving the detection capability for fluid inclusions. The invention overcomes the shortcomings of the current manual search and realizes efficient and accurate identification of inclusions.
In order to achieve the above purpose, the following technical solution is adopted; the specific steps are as follows:
Step one, inclusion image collection: observe inclusions under an optical microscope and record videos of them; sample the recorded videos, extracting frames at different times to obtain images.
Step two, image annotation and dataset division: annotate the bounding-box positions and classes of the inclusions in the obtained images, then randomly divide the annotated data into a training set and a test set at a ratio of 4:1.
Step three: image data enhancement: and (3) processing the training image by using the torchvision, rotating and cutting, and increasing the number of training set pictures so as to improve the recognition capability of the model.
Step four, building the model: the YOLOv5 detection network consists of a backbone, a Neck, and an output module. The backbone contains a BottleneckCSP module and a Focus module; the BottleneckCSP module enhances the learning capability of the whole convolutional neural network, and the Focus module slices the picture, expanding the input channels to 4 times the original and obtaining a downsampled feature map through a single convolution. The Neck adopts a combined FPN and PAN structure: a conventional FPN layer is combined with a bottom-up feature pyramid, the extracted semantic features are fused with position features, and backbone-layer and detection-layer features are fused, so the model obtains richer feature information. The output module makes predictions from the image features and outputs a vector containing the class probability of the target object, the objectness score, and the position of the object bounding box.
Adding an attention mechanism: a Coordinate Attention (CA) mechanism is introduced into the YOLOv5 backbone network. The CA mechanism encodes channel relationships and long-range dependencies with precise position information; the operation is divided into two steps, coordinate information embedding and coordinate attention generation. Global average pooling is first decomposed into two directions, horizontal and vertical. Specifically, given an input X, each channel is first encoded along the horizontal and vertical coordinates using pooling kernels of size (H, 1) and (1, W), respectively. Thus, the output of the c-th channel at height h can be expressed as:
$z_c^h(h) = \frac{1}{W} \sum_{0 \le i < W} x_c(h, i)$

Likewise, the output of the c-th channel at width w can be expressed as:

$z_c^w(w) = \frac{1}{H} \sum_{0 \le j < H} x_c(j, w)$
These two transforms aggregate features along the two spatial directions respectively, yielding a pair of direction-aware feature maps. Encoding both horizontal and vertical position information into the channel attention enables a mobile network to attend over a wide range of positions without excessive computation. The mechanism not only captures inter-channel information but also takes direction-dependent position information into account, which helps the model locate and identify targets more accurately; it is flexible and lightweight enough to be simply inserted into the core structure of a mobile network; and it is beneficial for locating and identifying inclusions.
Optimizing the Neck part of the model: the PANet of the original network is replaced with a BiFPN, improving detection accuracy. The BiFPN introduces learnable weight factors to characterize the importance of different input features while repeatedly applying top-down and bottom-up multi-scale feature fusion. The BiFPN uses cross-scale connections, removing the PANet nodes that contribute little to feature fusion and adding a skip connection between the input and output nodes of the same scale, so that more features are fused without adding much cost. On the same feature scale, each bidirectional path is treated as one feature network layer and the same layer is reused multiple times, achieving higher-level feature fusion and improving detection accuracy. In addition, the feature map obtained by 4-fold downsampling of the original input picture is fed into the feature fusion network; this new-scale feature map has a smaller receptive field and relatively rich position information, which improves the detection of small targets.
Adjusting the detection scales: like conventional object detection networks, the original YOLOv5s network starts feature fusion from the layer-3 feature layer. A small-target detection layer is formed by adding the layer-2 feature layer to the feature fusion network, improving the network's detection capability for small targets. The 160×160 feature map, which was not originally fused in the feature extraction network, is added to the detection layers, and one extra upsampling operation and one extra downsampling operation are added to the feature fusion network, so that the final number of output detection layers increases to 4. After the detection layer is added, the number of anchor boxes increases correspondingly from 9 to 12, and the 3 added anchor boxes all have different aspect ratios and are used for detecting small targets.
Step five, training the model and tuning parameters to optimize it: the training set obtained from the sample division is used to train the improved YOLOv5 model. The loss function is computed at each iteration and the parameter values are updated to minimize the loss until the model converges, while overfitting is prevented.
Step six, after model training is completed, the model weight parameters are saved in .pt format. The saved weight file is then reloaded and used to detect inclusions, determining whether inclusions are present in the image.
Drawings
FIG. 1 is a view of the acquired and processed inclusion images;
FIG. 2 shows image annotation with Make Sense;
FIG. 3 shows the CA attention mechanism;
FIG. 4 shows the improved YOLOv5 network model;
FIG. 5 shows the detection results.
Detailed Description
Step one, inclusion image collection: since there is no existing dataset of inclusions, the dataset of the present invention was collected from five dolomite thin sections in the laboratory. The inclusions were observed under an optical microscope and videos of them were recorded. The recorded videos were then sampled with OpenCV at fixed intervals, extracting frames at different times to obtain 500 pictures.
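As an illustration of this sampling step, a minimal OpenCV frame-extraction sketch is given below; the video path, output directory, and sampling interval are hypothetical and not fixed by the invention.

```python
import os
import cv2  # OpenCV, used here to sample frames from the recorded microscope video

def sample_frames(video_path, out_dir, every_n=30):
    """Save one frame out of every `every_n` frames of the video as a JPEG image."""
    os.makedirs(out_dir, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    idx, saved = 0, 0
    while True:
        ok, frame = cap.read()
        if not ok:                      # end of video
            break
        if idx % every_n == 0:
            cv2.imwrite(os.path.join(out_dir, f"inclusion_{saved:05d}.jpg"), frame)
            saved += 1
        idx += 1
    cap.release()
    return saved

# Example (paths are hypothetical):
# sample_frames("dolomite_slice_01.mp4", "dataset/images", every_n=30)
```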
Step two, image annotation and dataset division: the bounding-box positions and categories of the inclusions in the obtained pictures are labeled with Make Sense, and the dataset is then divided into a training set and a validation set at a ratio of 4:1.
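A minimal sketch of the random 4:1 split, assuming images with YOLO-format .txt label files exported from Make Sense (the directory layout and file names are assumptions):

```python
import random
import shutil
from pathlib import Path

def split_dataset(img_dir, out_root, train_ratio=0.8, seed=0):
    """Randomly assign images (and their YOLO .txt labels) to train/val at a 4:1 ratio."""
    images = sorted(Path(img_dir).glob("*.jpg"))
    random.Random(seed).shuffle(images)
    n_train = int(len(images) * train_ratio)
    for i, img in enumerate(images):
        subset = "train" if i < n_train else "val"
        for sub, src in (("images", img), ("labels", img.with_suffix(".txt"))):
            dst = Path(out_root) / sub / subset
            dst.mkdir(parents=True, exist_ok=True)
            if src.exists():
                shutil.copy(src, dst / src.name)

# split_dataset("dataset/images", "dataset_split")   # paths are hypothetical
```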
Step three: image data enhancement: and (3) processing the training image by using the torchvision, rotating and cutting, and increasing the number of training set pictures so as to improve the recognition capability of the model.
Step four, building the model: the YOLOv5 detection network consists of a backbone, a Neck, and an output module. The backbone contains a BottleneckCSP module and a Focus module; the BottleneckCSP module enhances the learning capability of the whole convolutional neural network, and the Focus module slices the picture, expanding the input channels to 4 times the original and obtaining a downsampled feature map through a single convolution. The Neck adopts a combined FPN and PAN structure: a conventional FPN layer is combined with a bottom-up feature pyramid, the extracted semantic features are fused with position features, and backbone-layer and detection-layer features are fused, so the model obtains richer feature information. The output module makes predictions from the image features and outputs a vector containing the class probability of the target object, the objectness score, and the position of the object bounding box.
Adding an attention mechanism: a Coordinate Attention (CA) mechanism is introduced into the YOLOv5 backbone network. The CA mechanism encodes channel relationships and long-range dependencies with precise position information; the operation is divided into two steps, coordinate information embedding and coordinate attention generation. Global average pooling is first decomposed into two directions, horizontal and vertical. Specifically, given an input X, each channel is first encoded along the horizontal and vertical coordinates using pooling kernels of size (H, 1) and (1, W), respectively. Thus, the output of the c-th channel at height h can be expressed as:
$z_c^h(h) = \frac{1}{W} \sum_{0 \le i < W} x_c(h, i)$

Likewise, the output of the c-th channel at width w can be expressed as:

$z_c^w(w) = \frac{1}{H} \sum_{0 \le j < H} x_c(j, w)$
These two transforms aggregate features along the two spatial directions respectively, yielding a pair of direction-aware feature maps. Encoding both horizontal and vertical position information into the channel attention enables a mobile network to attend over a wide range of positions without excessive computation. The mechanism not only captures inter-channel information but also takes direction-dependent position information into account, which helps the model locate and identify targets more accurately; it is flexible and lightweight enough to be simply inserted into the core structure of a mobile network; and it is beneficial for locating and identifying inclusions.
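A PyTorch sketch of a Coordinate Attention block consistent with the description above; the channel-reduction ratio and activation follow the original CA design and are assumptions rather than values fixed by this invention:

```python
import torch
import torch.nn as nn

class CoordAtt(nn.Module):
    """Coordinate Attention: pool along H and W separately, then re-weight the input."""
    def __init__(self, channels, reduction=32):
        super().__init__()
        mid = max(8, channels // reduction)
        self.pool_h = nn.AdaptiveAvgPool2d((None, 1))   # (B, C, H, 1): average over width
        self.pool_w = nn.AdaptiveAvgPool2d((1, None))   # (B, C, 1, W): average over height
        self.conv1 = nn.Conv2d(channels, mid, kernel_size=1)
        self.bn1 = nn.BatchNorm2d(mid)
        self.act = nn.Hardswish()
        self.conv_h = nn.Conv2d(mid, channels, kernel_size=1)
        self.conv_w = nn.Conv2d(mid, channels, kernel_size=1)

    def forward(self, x):
        b, c, h, w = x.shape
        x_h = self.pool_h(x)                          # (B, C, H, 1)
        x_w = self.pool_w(x).permute(0, 1, 3, 2)      # (B, C, W, 1)
        y = self.act(self.bn1(self.conv1(torch.cat([x_h, x_w], dim=2))))
        y_h, y_w = torch.split(y, [h, w], dim=2)
        a_h = torch.sigmoid(self.conv_h(y_h))                        # attention along height
        a_w = torch.sigmoid(self.conv_w(y_w.permute(0, 1, 3, 2)))    # attention along width
        return x * a_h * a_w
```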
Optimizing the Neck part of the model: the PANet of the original network is replaced with a BiFPN, improving detection accuracy. The BiFPN introduces learnable weight factors to characterize the importance of different input features while repeatedly applying top-down and bottom-up multi-scale feature fusion. The BiFPN uses cross-scale connections, removing the PANet nodes that contribute little to feature fusion and adding a skip connection between the input and output nodes of the same scale, so that more features are fused without adding much cost. On the same feature scale, each bidirectional path is treated as one feature network layer and the same layer is reused multiple times, achieving higher-level feature fusion and improving detection accuracy. In addition, the feature map obtained by 4-fold downsampling of the original input picture is fed into the feature fusion network; this new-scale feature map has a smaller receptive field and relatively rich position information, which improves the detection of small targets.
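The learnable weighting used by BiFPN fusion nodes can be sketched as follows (fast normalized fusion; the input feature maps are assumed to have already been resized and projected to a common shape):

```python
import torch
import torch.nn as nn

class WeightedFusion(nn.Module):
    """Fuse N same-shape feature maps with learnable, normalized, non-negative weights."""
    def __init__(self, n_inputs, eps=1e-4):
        super().__init__()
        self.w = nn.Parameter(torch.ones(n_inputs))
        self.eps = eps

    def forward(self, features):
        w = torch.relu(self.w)            # keep the weights non-negative
        w = w / (w.sum() + self.eps)      # fast normalized fusion
        return sum(wi * f for wi, f in zip(w, features))

# Example: fuse a top-down path feature with a same-scale lateral (skip) feature
# fuse = WeightedFusion(2); out = fuse([p4_td, p4_in])   # names are hypothetical
```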
Adjusting the detection scales: like conventional object detection networks, the original YOLOv5s network starts feature fusion from the layer-3 feature layer. A small-target detection layer is formed by adding the layer-2 feature layer to the feature fusion network, improving the network's detection capability for small targets. The 160×160 feature map, which was not originally fused in the feature extraction network, is added to the detection layers, and one extra upsampling operation and one extra downsampling operation are added to the feature fusion network, so that the final number of output detection layers increases to 4. After the detection layer is added, the number of anchor boxes increases correspondingly from 9 to 12, and the 3 added anchor boxes all have different aspect ratios and are used for detecting small targets.
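An illustrative anchor layout for the four output scales (P2-P5) after the extra detection layer is added; the 12 width-height pairs below are hypothetical placeholders that would in practice be re-estimated from the labeled inclusion boxes, e.g. by k-means clustering:

```python
# 12 anchors = 3 per detection layer x 4 layers (pixel values are illustrative only)
anchors = {
    "P2/4":  [(8, 9),    (13, 11),   (10, 16)],   # added small-target layer (160x160 map)
    "P3/8":  [(19, 22),  (28, 25),   (24, 36)],
    "P4/16": [(44, 40),  (61, 55),   (52, 78)],
    "P5/32": [(98, 86),  (142, 118), (196, 173)],
}
```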
Step five, training the model and tuning parameters to optimize it: the training set obtained from the sample division is used to train the improved YOLOv5 model. The loss function is computed at each iteration and the parameter values are updated to minimize the loss until the model converges, while overfitting is prevented.
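In practice training may be launched through the YOLOv5 repository's train.py script; purely as a schematic of the iteration described above, a generic skeleton might look like the following (the loss interface, optimizer settings, and stopping criterion are placeholders, not the exact YOLOv5 ones):

```python
import torch

def train(model, loader, epochs=100, lr=0.01, patience=10):
    """Generic training skeleton: minimize the loss each iteration, stop when it converges."""
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.937)
    best, wait = float("inf"), 0
    for epoch in range(epochs):
        total = 0.0
        for imgs, targets in loader:
            loss = model(imgs, targets)   # assumed to return the detection loss
            opt.zero_grad()
            loss.backward()
            opt.step()
            total += loss.item()
        if total < best - 1e-3:
            best, wait = total, 0
            torch.save(model.state_dict(), "best.pt")   # keep the best weights (guards against overfitting)
        else:
            wait += 1
            if wait >= patience:          # simple early stopping once the loss stops improving
                break
```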
Step six, after model training is completed, the model weight parameters are saved in .pt format. The saved weight file is then reloaded and used to detect inclusions, determining whether inclusions are present in the image.
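A sketch of reloading the saved .pt weights and running detection on a single image; loading a custom checkpoint through torch.hub and the Ultralytics YOLOv5 repository is one common route, and the paths and threshold below are assumptions:

```python
import torch

# Load the trained weights (checkpoint path and repository source are assumptions)
model = torch.hub.load("ultralytics/yolov5", "custom", path="best.pt")
model.conf = 0.25                       # confidence threshold (illustrative)

results = model("inclusion_slice.jpg")  # hypothetical test image
results.print()                         # summary: detected inclusions and their scores
results.save()                          # writes the annotated image to runs/detect/
boxes = results.xyxy[0]                 # tensor rows: [x1, y1, x2, y2, confidence, class]
```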

Claims (2)

1. The technology provides a method for detecting fluid inclusions based on YOLOv5, essentially characterized in that the YOLOv5 algorithm is used for target identification: a dataset is produced according to the characteristics of fluid inclusions, and the official YOLOv5 algorithm is improved by adding a CA attention mechanism to the model, replacing the PANet of the Neck part with a BiFPN, and adding a small-target detection layer, making it more suitable for inclusion detection. The key elements of the technology are the preparation of the inclusion dataset and the improvement of the YOLOv5 model.
2. A method for detecting fluid inclusions based on YOLOv5, characterized by comprising the following steps:
(1) Observe inclusions under an optical microscope and record videos of them; sample the recorded videos, extracting frames at different times to obtain pictures;
(2) Annotate the bounding-box positions and classes of the inclusions in the obtained pictures, then randomly divide the annotated data into a training set and a test set at a ratio of 4:1;
(3) Process the training images with torchvision, applying rotation and cropping, to increase the number of training set pictures and thereby improve the recognition capability of the model;
(4) Improve the official YOLOv5 algorithm: add a CA attention mechanism to the model, replace the PANet of the Neck part with a BiFPN, and add a small-target detection layer, making it suitable for fluid inclusion detection;
(5) Train the model and tune the parameters to optimize it: compute the loss function and update the parameter values to minimize the loss until the model converges, while preventing overfitting;
(6) After model training is completed, save the model weight file, reload it, and use the weight file to detect inclusions, determining whether inclusions are present in the image.
CN202310231853.5A 2023-03-10 2023-03-10 Method for detecting inclusion based on YOLOV5 Pending CN116229458A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310231853.5A CN116229458A (en) 2023-03-10 2023-03-10 Method for detecting inclusion based on YOLOV5

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310231853.5A CN116229458A (en) 2023-03-10 2023-03-10 Method for detecting inclusion based on YOLOV5

Publications (1)

Publication Number Publication Date
CN116229458A 2023-06-06

Family

ID=86588996

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310231853.5A Pending CN116229458A (en) 2023-03-10 2023-03-10 Method for detecting inclusion based on YOLOV5

Country Status (1)

Country Link
CN (1) CN116229458A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117496666A (en) * 2023-11-16 2024-02-02 成都理工大学 Intelligent and efficient drowning rescue system and method


Similar Documents

Publication Publication Date Title
Ma et al. Rpt: Learning point set representation for siamese visual tracking
CN113688665B (en) Remote sensing image target detection method and system based on semi-supervised iterative learning
CN109284779A (en) Object detection method based on deep full convolution network
CN110751209B (en) Intelligent typhoon intensity determination method integrating depth image classification and retrieval
CN114638784A (en) Method and device for detecting surface defects of copper pipe based on FE-YOLO
CN112949408B (en) Real-time identification method and system for target fish passing through fish channel
CN116664558A (en) Method, system and computer equipment for detecting surface defects of steel
CN113487600B (en) Feature enhancement scale self-adaptive perception ship detection method
CN113313094B (en) Vehicle-mounted image target detection method and system based on convolutional neural network
CN111062383A (en) Image-based ship detection depth neural network algorithm
CN111008576A (en) Pedestrian detection and model training and updating method, device and readable storage medium thereof
CN116883393B (en) Metal surface defect detection method based on anchor frame-free target detection algorithm
CN116229458A (en) Method for detecting inclusion based on YOLOV5
CN113239753A (en) Improved traffic sign detection and identification method based on YOLOv4
CN114298187B (en) Target detection method integrating improved attention mechanism
CN113902793B (en) Method, system and electronic equipment for predicting end-to-end building height based on single-vision remote sensing image
CN113361528B (en) Multi-scale target detection method and system
CN113469097B (en) Multi-camera real-time detection method for water surface floaters based on SSD network
CN108981728B (en) Intelligent vehicle navigation map building method
CN112418207B (en) Weak supervision character detection method based on self-attention distillation
Zhang et al. Research on pipeline defect detection based on optimized faster r-cnn algorithm
CN113723558A (en) Remote sensing image small sample ship detection method based on attention mechanism
CN117829243A (en) Model training method, target detection device, electronic equipment and medium
CN116665009A (en) Pipeline magnetic flux leakage image detection method based on multi-scale SSD network
CN112015937B (en) Picture geographic positioning method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination