CN115019169A - Single-stage water surface small target detection method and device - Google Patents

Publication number
CN115019169A
Authority
CN
China
Prior art keywords
feature
water surface
information fusion
sparse
target detection
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210603898.6A
Other languages
Chinese (zh)
Inventor
张卫东
陈丽
柏林
熊明磊
董超
杨云祥
覃善兴
孙永强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hainan University
Original Assignee
Hainan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hainan University
Priority to CN202210603898.6A
Publication of CN115019169A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/10 Terrestrial scenes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07 Target detection
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00 Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/30 Assessment of water resources

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Image Processing (AREA)

Abstract

The invention relates to a method and a device for detecting small targets on a water surface. The method comprises: acquiring images and applying data enhancement to the acquired images to expand the data; performing feature extraction on the acquired images and generating corresponding feature maps; introducing a feature pyramid structure and a sparse attention mechanism, and performing information fusion on feature maps of different scales; weighting the feature vectors after information fusion; and balancing the numbers of positive and negative samples of the fused feature vectors by adopting a Focal Loss function. By introducing a new data enhancement mode and a new feature information fusion mode, and by combining a feature pyramid structure with a sparse attention mechanism, the invention enhances detection performance for small water surface targets and reduces missed detections and false detections.

Description

Single-stage water surface small target detection method and device
Technical Field
The invention relates to target detection technology in the field of computer vision, and in particular to a single-stage method and device for detecting small targets on a water surface.
Background
In recent years, in order to develop marine resources and safeguard maritime interests, research on unmanned equipment for water environments has attracted increasing attention at home and abroad. Target detection is one of the key technologies for building unmanned water equipment, and small target detection in particular, such as the detection of distant ships and persons, unmanned aerial vehicle navigation, obstacle avoidance, and the detection of small floating objects, is a difficult point in water operations. Compared with land, the water environment is more complex, less safe, and less stable, and small targets occupy few pixels and carry few features, so detecting small targets on water is more challenging.
Current target detection algorithms fall broadly into two categories: two-stage methods based on region proposals and single-stage methods based on regression. The representative two-stage algorithms are the R-CNN series, which are characterized by high detection precision but a large amount of computation, and cannot achieve real-time detection. Single-stage detection methods include the YOLO series, the SSD algorithm, and the like; they complete the localization and classification tasks directly in one pass and realize end-to-end detection, offering real-time detection capability at slightly lower precision than two-stage detection algorithms.
Although single-stage methods have been continuously improved and their detection performance has risen, detecting small targets in water scenes remains a difficult problem, and targeted improvement of small target detection is an urgent need in the development of existing unmanned water equipment.
Disclosure of Invention
In view of the above, the present invention provides a single-stage water surface small target detection method and device, so as to solve the problems of missed detection and false detection caused by the high difficulty of detecting small water surface targets. The method specifically includes the following steps:
S1, acquiring an image, and performing data enhancement on the acquired image to expand the data;
S2, performing feature extraction on the acquired image and generating corresponding feature maps;
S3, introducing a feature pyramid structure and a sparse attention mechanism, and performing information fusion on feature maps of different scales;
S4, weighting the feature vectors after information fusion;
S5, adopting a Focal Loss function to balance the numbers of positive and negative samples when the fused feature vectors are weighted.
Further, S1 specifically includes: expanding the acquired water surface image data with a data enhancement strategy, namely copying small targets in a picture and pasting them at different positions, so that the number of small targets in the samples is increased, the amount of data matching the prior boxes is increased, and the water surface image data is improved.
Further, S2 specifically includes: introducing the single-stage SSD network model, with VGG16 as the backbone feature extraction network, to acquire feature information of small water surface targets.
Further, the feature fusion part of the feature pyramid structure and the sparse attention mechanism in S3 includes the following steps:
SS1: the feature map after feature extraction has a size of 38 × 38, and feature maps at the different scales 38 × 38, 19 × 19, 10 × 10, 5 × 5, 3 × 3 and 1 × 1 are obtained through a plurality of convolution operations;
SS2: stacking each sampled feature map with its corresponding feature map to form a feature pyramid structure, so as to obtain feature information across different levels and perform information fusion;
SS3: after feature information fusion, feature maps of sizes 10 × 10, 19 × 19 and 38 × 38 are obtained, and a sparse attention mechanism is added after each feature map;
SS4: generating the N most representative positions by sparse position search sampling to form a sparse feature set, and obtaining an attention map $P_{att}$ through 1 × 1 conv and softmax operations, expressed as:
$$P_{att} = \mathrm{softmax}\left[Q^{T} \cdot Sa(Q)\right]$$
where Q represents the query, Sa represents the sparse sampling operation, and T represents the matrix transposition operation.
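The attention map in SS4 can be illustrated with a minimal pure-Python sketch. Several points here are assumptions rather than details given in the source: Q is taken to be a C × L matrix (C channels, L flattened spatial positions) that has already passed through the 1 × 1 convolution, the "most representative" positions are chosen by largest L2 norm, and the softmax is applied row-wise over the N sampled positions. The function names are hypothetical.

```python
import math


def sparse_sample(q, n):
    """Sa(Q): keep the n positions (columns) of q with the largest L2 norm.

    q is a C x L matrix stored as a list of C rows. Selecting by norm is an
    assumption; the source only says "most representative positions".
    """
    num_pos = len(q[0])
    norms = [math.sqrt(sum(row[j] ** 2 for row in q)) for j in range(num_pos)]
    keep = sorted(range(num_pos), key=lambda j: norms[j], reverse=True)[:n]
    keep.sort()  # preserve the spatial order of the kept positions
    return [[row[j] for j in keep] for row in q], keep


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def sparse_attention_map(q, n):
    """Return P_att = softmax(Q^T . Sa(Q)) as an L x n matrix."""
    sampled, keep = sparse_sample(q, n)
    channels, num_pos = len(q), len(q[0])
    scores = [
        [sum(q[c][i] * sampled[c][k] for c in range(channels))
         for k in range(len(keep))]
        for i in range(num_pos)
    ]
    return [softmax(row) for row in scores], keep


# Toy query: C = 2 channels, L = 4 flattened spatial positions.
Q = [[1.0, 0.0, 2.0, 0.1],
     [0.0, 0.5, 2.0, 0.1]]
p_att, kept = sparse_attention_map(Q, n=2)
```

Because every position attends only to the N sampled positions, the score matrix is L × N instead of L × L, which is where the saving over dense attention comes from in this sketch.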
Further, S4 specifically includes: the feature extraction network pre-trains the weights of the fused feature vectors; the attention map from the sparse attention mechanism is matrix-multiplied with the sparse weight matrix V to obtain a weighted result, and the weights are updated accordingly.
Further, S5 specifically includes: introducing a focal classification loss function to balance the numbers of positive and negative samples, which can be expressed as:
Figure BDA0003670578100000031
where n and m are the numbers of negative and positive samples respectively, P is the estimated probability with P ∈ [0, 1], and a and r are hyper-parameters with a ∈ [0, 1] and r ∈ [0, 5]. The parameters are then adjusted according to the specific data, so that the model focuses more on training the negative samples.
A single-stage water surface small target detection device, comprising:
a data enhancement module: performing data enhancement on the acquired image to expand the data;
a feature extraction module: performing feature extraction on the input picture and generating a corresponding feature map;
an information fusion module: introducing a feature pyramid structure and a sparse attention mechanism, and performing information fusion on feature maps of different scales;
a feature weighting module: weighting the feature vectors after information fusion;
a loss function setting module: balancing the numbers of positive and negative samples of the fused feature vectors by adopting a Focal Loss function.
The invention has the advantages that a new data enhancement mode and a new feature information fusion mode are introduced, and a feature pyramid structure and a sparse attention mechanism are combined, so that the detection performance for small water surface targets is enhanced and missed detections and false detections are reduced.
Drawings
FIG. 1 is a schematic flow chart of a single-stage water surface small target detection method according to the present invention;
FIG. 2 is a process diagram of feature fusion of the present invention;
FIG. 3 is a flow chart of the sparse attention mechanism of the present invention;
FIG. 4 is a schematic diagram of a single-stage water surface small target detection device of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail with reference to specific embodiments below.
It is to be noted that technical terms or scientific terms used herein should have the ordinary meaning as understood by those having ordinary skill in the art to which the present invention belongs, unless otherwise defined. The use of "first," "second," and similar terms in the present application does not denote any order, quantity, or importance; rather, the terms are used to distinguish one element from another. The word "comprising" or "comprises", and the like, means that the element or item listed before the word covers the element or item listed after the word and its equivalents, but does not exclude other elements or items. The terms "connected" or "coupled" and the like are not restricted to physical or mechanical connections, but may include electrical connections, whether direct or indirect. "Upper", "lower", "left", "right", and the like are used merely to indicate relative positional relationships, and when the absolute position of the object being described changes, the relative positional relationships may change accordingly.
As shown in FIG. 1, a single-stage water surface small target detection method includes the following steps:
S1, acquiring an image, and performing data enhancement on the acquired image to expand the data;
S2, performing feature extraction on the acquired image and generating corresponding feature maps;
S3, introducing a feature pyramid structure and a sparse attention mechanism, and performing information fusion on feature maps of different scales;
S4, weighting the feature vectors after information fusion;
S5, adopting a Focal Loss function to balance the numbers of positive and negative samples when the fused feature vectors are weighted.
S1 specifically includes: because small target samples are scarce in the data set, the acquired water surface images are expanded with a data enhancement strategy, namely small targets in a picture are copied and pasted at different positions, so that the number of small targets in the samples is increased, the amount of data matching the prior boxes is increased, and the water surface image data is improved.
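The copy-and-paste strategy of S1 can be sketched as follows. This is only an illustration under stated assumptions: the image is a plain H × W list of lists, boxes are (x, y, w, h) tuples in pixels, a target counts as "small" when its longer side is at most `max_side` (the source gives no threshold), and pasted copies are kept from overlapping existing boxes. The function name and all parameters are hypothetical.

```python
import random


def copy_paste_small_targets(image, boxes, copies=2, max_side=32,
                             rng=None, max_tries=50):
    """Paste copies of each small target at random non-overlapping positions.

    Returns a new image and the extended list of boxes. The size threshold
    and the non-overlap rule are assumptions, not taken from the source.
    """
    rng = rng or random.Random(0)
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]
    new_boxes = list(boxes)

    def overlaps(a, b):
        ax, ay, aw, ah = a
        bx, by, bw, bh = b
        return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah

    for (x, y, w, h) in boxes:
        if max(w, h) > max_side:
            continue  # only small targets are duplicated
        patch = [row[x:x + w] for row in image[y:y + h]]
        for _ in range(copies):
            for _ in range(max_tries):
                cand = (rng.randint(0, width - w), rng.randint(0, height - h), w, h)
                if not any(overlaps(cand, b) for b in new_boxes):
                    for dy in range(h):
                        out[cand[1] + dy][cand[0]:cand[0] + w] = patch[dy]
                    new_boxes.append(cand)
                    break
    return out, new_boxes


# Toy 20 x 20 image with one 2 x 2 target whose pixels are 9.
img = [[0] * 20 for _ in range(20)]
for yy in (4, 5):
    for xx in (3, 4):
        img[yy][xx] = 9
aug_img, aug_boxes = copy_paste_small_targets(img, [(3, 4, 2, 2)], copies=2)
```

Each pasted copy adds one more ground-truth box, which is how the augmentation raises the number of prior boxes that can be matched to small targets during training.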
S2 specifically includes: introducing the single-stage SSD network model, determining the feature extraction network, and selecting VGG16 as the backbone feature extraction network to obtain feature information of small water surface targets.
S3 specifically includes: the feature fusion part of the feature pyramid structure and the sparse attention mechanism comprises the following specific steps:
SS1: the feature map after feature extraction has a size of 38 × 38, and feature maps at the different scales 38 × 38, 19 × 19, 10 × 10, 5 × 5, 3 × 3 and 1 × 1 are obtained through a plurality of convolution operations;
SS2: as shown in FIG. 2, after a series of up-sampling operations, each up-sampled feature map and its corresponding feature map undergo a stacking (concat) operation to form a feature pyramid structure, so as to obtain feature information across different levels and perform feature fusion;
SS3: after feature information fusion, feature maps of sizes 10 × 10, 19 × 19 and 38 × 38 are obtained, and a sparse attention mechanism (SA) is added after each feature map;
SS4: in water surface small target detection, key information makes up a small proportion while secondary information such as background makes up a large proportion, so the traditional attention mechanism is abandoned and a sparse attention mechanism is introduced, as shown in FIG. 3. The sparse attention mechanism SA specifically works as follows: the N most representative positions are generated by sparse position search sampling to form a sparse feature set, and an attention map $P_{att}$ is obtained through 1 × 1 conv and softmax operations, which can be expressed as:
$$P_{att} = \mathrm{softmax}\left[Q^{T} \cdot Sa(Q)\right]$$
where Q represents the query, Sa represents the sparse sampling operation, and T represents the matrix transposition operation.
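The up-sampling and stacking in SS2 can be sketched in pure Python as below. The assumptions are significant: feature maps are lists of channels (each an H × W list of lists), up-sampling is nearest-neighbour resizing to the finer map's size (needed because 10 → 19 is not an integer factor), and stacking is channel concatenation. Any convolution applied after the concatenation in a real network is omitted, and the helper names are hypothetical.

```python
def resize_nearest(channel, out_h, out_w):
    """Nearest-neighbour resize of one H x W channel to out_h x out_w."""
    in_h, in_w = len(channel), len(channel[0])
    return [
        [channel[i * in_h // out_h][j * in_w // out_w] for j in range(out_w)]
        for i in range(out_h)
    ]


def fuse(coarse, fine):
    """Up-sample the coarse map to the fine map's size and stack channels."""
    out_h, out_w = len(fine[0]), len(fine[0][0])
    upsampled = [resize_nearest(ch, out_h, out_w) for ch in coarse]
    return fine + upsampled  # channel-wise concatenation


def make_map(channels, size, value):
    """Constant toy feature map with the given channel count and side length."""
    return [[[value] * size for _ in range(size)] for _ in range(channels)]


# Toy single-channel maps at the three scales that receive sparse attention.
c10 = make_map(1, 10, 3.0)
c19 = make_map(1, 19, 2.0)
c38 = make_map(1, 38, 1.0)

p10 = c10               # the coarsest of the three levels is used as-is here
p19 = fuse(p10, c19)    # 2 channels at 19 x 19
p38 = fuse(p19, c38)    # 3 channels at 38 x 38
```

In this sketch the 38 × 38 output carries channels from all three levels, so the shallow, high-resolution map that is most useful for small targets also sees the deeper semantic features.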
S4 specifically includes: the feature extraction network is used to pre-train the weights; the attention map generated by the sparse attention mechanism is matrix-multiplied with the sparse weight matrix V to obtain the weighted result, and the weighted result is passed to the feature weighting module for weight updating.
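The matrix multiplication in S4 can be made concrete with a small sketch. The shapes are assumptions: the attention map is taken as an L × N matrix (L positions, N sampled positions) and V as an N × C matrix holding one C-dimensional row per sampled position, giving an L × C weighted result. The values below are toy numbers, not data from the source.

```python
def matmul(a, b):
    """Plain matrix product of a (rows x inner) and b (inner x cols)."""
    inner = len(b)
    if any(len(row) != inner for row in a):
        raise ValueError("inner dimensions do not match")
    cols = len(b[0])
    return [
        [sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for row in a
    ]


# Toy attention map: 3 positions attending to 2 sampled positions.
p_att = [[0.5, 0.5],
         [1.0, 0.0],
         [0.25, 0.75]]
# Toy V: one 2-dimensional feature row per sampled position.
v = [[2.0, 0.0],
     [0.0, 4.0]]

weighted = matmul(p_att, v)  # 3 x 2 weighted features
```

Each output row is a convex combination of the rows of V, because every row of the attention map sums to one after the softmax.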
S5 specifically includes: setting the loss function to Focal Loss. Small targets on the water surface, including distant ships, persons, obstacles, floating objects and the like, occupy only a small area of the picture, while background such as the sky covers a large area, so only a small fraction of the prediction boxes contain targets. The invention therefore introduces a focal classification loss function to balance the numbers of positive and negative samples, which can be expressed as:
Figure BDA0003670578100000061
where n and m are the numbers of negative and positive samples respectively, P is the estimated probability with P ∈ [0, 1], and a and r are hyper-parameters with a ∈ [0, 1] and r ∈ [0, 5].
The parameters are adjusted according to the specific data so that the model focuses on training the negative samples, thereby balancing the numbers of positive and negative samples.
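The formula itself appears only as an image in this text, so the sketch below uses the widely known standard binary focal loss rather than the patent's exact expression. Mapping the hyper-parameters a and r onto the usual balancing factor and focusing exponent is an assumption, as are the default values 0.25 and 2.0 (common choices that fall inside the stated ranges) and the function names.

```python
import math


def focal_loss(p, y, a=0.25, r=2.0, eps=1e-12):
    """Standard binary focal loss for one prediction.

    p: estimated probability of the positive class, in [0, 1]
    y: 1 for a positive sample, 0 for a negative sample
    a: balancing factor, r: focusing exponent (assumed roles of a and r)
    """
    p_t = p if y == 1 else 1.0 - p      # probability assigned to the true class
    a_t = a if y == 1 else 1.0 - a      # class-dependent balancing weight
    return -a_t * (1.0 - p_t) ** r * math.log(max(p_t, eps))


def mean_focal_loss(probs, labels, a=0.25, r=2.0):
    """Average focal loss over a batch of predictions."""
    return sum(focal_loss(p, y, a, r) for p, y in zip(probs, labels)) / len(probs)


# One positive and three negatives, mimicking a background-dominated image.
probs = [0.7, 0.05, 0.10, 0.60]
labels = [1, 0, 0, 0]
batch_loss = mean_focal_loss(probs, labels)
```

With r = 0 and a = 0.5 this expression reduces to half the ordinary cross-entropy; larger r shrinks the contribution of well-classified samples, so the many easy background boxes do not dominate the total loss.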
Based on the above embodiment, a single-stage water surface small target detection device is provided, as shown in FIG. 4, including:
a data enhancement module: performing data enhancement on the acquired image to expand the data;
a feature extraction module: performing feature extraction on the input picture and generating a corresponding feature map;
an information fusion module: introducing a feature pyramid structure and a sparse attention mechanism, and performing information fusion on feature maps of different scales;
a feature weighting module: weighting the feature vectors after information fusion;
a loss function setting module: balancing the numbers of positive and negative samples of the fused feature vectors by adopting a Focal Loss function.
The image processing device further comprises an input module and an output module, wherein the input module is used for inputting the image, and the output module is used for outputting the processed result.
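How the modules might be wired together can be sketched with a small container of callables. This is a hypothetical arrangement, not the device's actual implementation: the class name, the method names, and the choice to apply data enhancement only during training are all assumptions, and the lambdas in the usage example are stand-ins for real modules.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class SmallTargetDetector:
    """Hypothetical composition of the five modules listed above."""
    enhance: Callable[[Any], Any]        # data enhancement module
    extract: Callable[[Any], Any]        # feature extraction module
    fuse: Callable[[Any], Any]           # information fusion module
    weight: Callable[[Any], Any]         # feature weighting module
    loss: Callable[[Any, Any], float]    # loss function setting module

    def forward(self, image, training=False):
        """Input module -> (enhance) -> extract -> fuse -> weight -> output."""
        if training:
            image = self.enhance(image)  # assumed to be a training-only step
        return self.weight(self.fuse(self.extract(image)))

    def training_loss(self, image, target):
        """Run the training path and score the result with the loss module."""
        return self.loss(self.forward(image, training=True), target)


# Stand-in callables on plain numbers, only to show the data flow.
detector = SmallTargetDetector(
    enhance=lambda x: x + 1,
    extract=lambda x: x * 2,
    fuse=lambda x: x + 3,
    weight=lambda x: x * 10,
    loss=lambda pred, target: abs(pred - target),
)
inference_out = detector.forward(1)
train_loss = detector.training_loss(1, 60)
```

Keeping each module behind a plain callable makes it possible to swap, for example, the backbone or the attention block without touching the rest of the pipeline.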
According to the invention, a new data enhancement mode and a new feature information fusion mode are introduced, and a feature pyramid structure and a sparse attention mechanism are combined, so that the detection performance for small water surface targets is enhanced and missed detections and false detections are reduced.
Those of ordinary skill in the art will understand that the discussion of any embodiment above is exemplary only and is not intended to imply that the scope of the disclosure, including the claims, is limited to these examples; within the idea of the invention, features in the above embodiments or in different embodiments may also be combined, steps may be implemented in any order, and there are many other variations of the different aspects of the invention as described above, which are not provided in detail for the sake of brevity.
The present invention is intended to embrace all such alternatives, modifications and variations which fall within the broad scope of the appended claims. Therefore, any omissions, modifications, substitutions, improvements and the like that may be made without departing from the spirit and principles of the invention are intended to be included within the scope of the invention.

Claims (7)

1. A single-stage water surface small target detection method, characterized by specifically comprising the following steps:
S1, acquiring an image, and performing data enhancement on the acquired image to expand the data;
S2, performing feature extraction on the acquired image and generating corresponding feature maps;
S3, introducing a feature pyramid structure and a sparse attention mechanism, and performing information fusion on feature maps of different scales;
S4, weighting the feature vectors after information fusion;
S5, adopting a Focal Loss function to balance the numbers of positive and negative samples when the fused feature vectors are weighted.
2. The single-stage water surface small target detection method according to claim 1, wherein S1 specifically comprises:
expanding the acquired water surface image data with a data enhancement strategy, namely copying small targets in a picture and pasting them at different positions, so that the number of small targets in the samples is increased, the amount of data matching the prior boxes is increased, and the water surface image data is improved.
3. The single-stage water surface small target detection method according to claim 1, wherein S2 specifically comprises:
introducing the single-stage SSD network model, with VGG16 as the backbone feature extraction network, to acquire feature information of small water surface targets.
4. The single-stage water surface small target detection method according to claim 1, wherein the feature fusion part of the feature pyramid structure and the sparse attention mechanism in S3 comprises the following steps:
SS1: the feature map after feature extraction has a size of 38 × 38, and feature maps at the different scales 38 × 38, 19 × 19, 10 × 10, 5 × 5, 3 × 3 and 1 × 1 are obtained through a plurality of convolution operations;
SS2: stacking each sampled feature map with its corresponding feature map to form a feature pyramid structure, so as to obtain feature information across different levels and perform information fusion;
SS3: after feature information fusion, feature maps of sizes 10 × 10, 19 × 19 and 38 × 38 are obtained, and a sparse attention mechanism is added after each feature map;
SS4: generating the N most representative positions by sparse position search sampling to form a sparse feature set, and obtaining an attention map $P_{att}$ through 1 × 1 conv and softmax operations, expressed as:
$$P_{att} = \mathrm{softmax}\left[Q^{T} \cdot Sa(Q)\right]$$
where Q represents the query, Sa represents the sparse sampling operation, and T represents the matrix transposition operation.
5. The single-stage water surface small target detection method according to claim 1, wherein S4 specifically comprises:
the feature extraction network is used to pre-train the weights; the attention map from the sparse attention mechanism is matrix-multiplied with the sparse weight matrix V to obtain a weighted result, and the weights are updated accordingly.
6. The single-stage water surface small target detection method according to claim 1, wherein S5 specifically comprises:
introducing a focal classification loss function to balance the numbers of positive and negative samples, which can be expressed as:
Figure FDA0003670578090000021
and adjusting parameters according to specific data to enable the model to focus on training of the negative samples.
7. A single-stage water surface small target detection device, comprising:
a data enhancement module: performing data enhancement on the acquired image to expand the data;
a feature extraction module: performing feature extraction on the input picture and generating a corresponding feature map;
an information fusion module: introducing a feature pyramid structure and a sparse attention mechanism, and performing information fusion on feature maps of different scales;
a feature weighting module: weighting the feature vectors after information fusion;
a loss function setting module: balancing the numbers of positive and negative samples of the fused feature vectors by adopting a Focal Loss function.
CN202210603898.6A 2022-05-31 2022-05-31 Single-stage water surface small target detection method and device Pending CN115019169A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210603898.6A CN115019169A (en) 2022-05-31 2022-05-31 Single-stage water surface small target detection method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210603898.6A CN115019169A (en) 2022-05-31 2022-05-31 Single-stage water surface small target detection method and device

Publications (1)

Publication Number Publication Date
CN115019169A true CN115019169A (en) 2022-09-06

Family

ID=83071630

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210603898.6A Pending CN115019169A (en) 2022-05-31 2022-05-31 Single-stage water surface small target detection method and device

Country Status (1)

Country Link
CN (1) CN115019169A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111914917A (en) * 2020-07-22 2020-11-10 西安建筑科技大学 Target detection improved algorithm based on feature pyramid network and attention mechanism
CN113065558A (en) * 2021-04-21 2021-07-02 浙江工业大学 Lightweight small target detection method combined with attention mechanism
CN113177546A (en) * 2021-04-30 2021-07-27 中国科学技术大学 Target detection method based on sparse attention module

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
庞立新等: "一种基于注意力机制RetinaNet的小目标检测方法", 《制导与引信》, vol. 40, no. 4, 31 December 2019 (2019-12-31), pages 1 *

Similar Documents

Publication Publication Date Title
CN110246141B (en) Vehicle image segmentation method based on joint corner pooling under complex traffic scene
CN108960211B (en) Multi-target human body posture detection method and system
CN113065558A (en) Lightweight small target detection method combined with attention mechanism
Gasienica-Jozkowy et al. An ensemble deep learning method with optimized weights for drone-based water rescue and surveillance
Kang et al. A survey of deep learning-based object detection methods and datasets for overhead imagery
CN108596108B (en) Aerial remote sensing image change detection method based on triple semantic relation learning
CN109241902B (en) Mountain landslide detection method based on multi-scale feature fusion
EP3690744B1 (en) Method for integrating driving images acquired from vehicles performing cooperative driving and driving image integrating device using same
CN116758130A (en) Monocular depth prediction method based on multipath feature extraction and multi-scale feature fusion
CN111079739A (en) Multi-scale attention feature detection method
CN115035361A (en) Target detection method and system based on attention mechanism and feature cross fusion
CN115631344B (en) Target detection method based on feature self-adaptive aggregation
CN111931686A (en) Video satellite target tracking method based on background knowledge enhancement
CN113569981A (en) Power inspection bird nest detection method based on single-stage target detection network
Zhang et al. Finding nonrigid tiny person with densely cropped and local attention object detector networks in low-altitude aerial images
CN113297959A (en) Target tracking method and system based on corner attention twin network
CN115100545A (en) Target detection method for small parts of failed satellite under low illumination
CN114821018A (en) Infrared dim target detection method for constructing convolutional neural network by utilizing multidirectional characteristics
CN112800932B (en) Method for detecting remarkable ship target in offshore background and electronic equipment
Liu et al. Density saliency for clustered building detection and population capacity estimation
CN111209919A (en) Marine ship significance detection method and system
CN115019169A (en) Single-stage water surface small target detection method and device
CN116091946A (en) Yolov 5-based unmanned aerial vehicle aerial image target detection method
CN115661692A (en) Unmanned aerial vehicle detection method and system based on improved CenterNet detection network
CN115035429A (en) Aerial photography target detection method based on composite backbone network and multiple measuring heads

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination