CN116912675B - Underwater target detection method and system based on feature migration - Google Patents

Underwater target detection method and system based on feature migration

Info

Publication number
CN116912675B
Authority
CN
China
Prior art keywords
underwater
model
dca
yolov5
target
Prior art date
Legal status
Active
Application number
CN202311175150.1A
Other languages
Chinese (zh)
Other versions
CN116912675A (en)
Inventor
姜宇
宋建军
于慧勇
齐红
王跃航
赵明浩
阮寒涛
郭千仞
Current Assignee
Jilin University
Original Assignee
Jilin University
Priority date
Filing date
Publication date
Application filed by Jilin University filed Critical Jilin University
Priority to CN202311175150.1A priority Critical patent/CN116912675B/en
Publication of CN116912675A publication Critical patent/CN116912675A/en
Application granted granted Critical
Publication of CN116912675B publication Critical patent/CN116912675B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/05 Underwater scenes
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G06N3/0475 Generative networks
    • G06N3/08 Learning methods
    • G06N3/094 Adversarial learning
    • G06N3/096 Transfer learning
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; blind source separation
    • G06V10/80 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806 Fusion of extracted features
    • G06V10/82 Arrangements using neural networks
    • G06V2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07 Target detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Image Processing (AREA)
  • Image Analysis (AREA)

Abstract

An underwater target detection method and system based on feature migration relates to the technical field of underwater machine vision target detection. The method solves the problems of poor underwater image quality, large recognition errors and poor generalization capability in existing underwater target detection methods. The method comprises the following steps: constructing a reference data set and using it to train a migration countermeasure learning network model, then adopting the trained migration countermeasure learning network model to migrate the features of underwater high-definition images onto underwater fuzzy target images; adding two layers of coordinate attention enhancement modules into the backbone network of YOLOv5, and adding a group of anchor frames and a SIOU position loss function, to obtain a DCA-YOLOv5 target detection model; and performing target detection on the feature-enhanced underwater high-definition target images with the DCA-YOLOv5 target detection model to obtain the position and category information of the targets. The invention is suitable for underwater fuzzy scene enhancement and high-precision underwater target detection.

Description

Underwater target detection method and system based on feature migration
Technical Field
The invention relates to the technical field of underwater machine vision target detection.
Background
With the continuous progress and development of technology, humans have discovered that the ocean holds a large amount of resources, have gradually begun to explore it, and have carried out a large amount of scientific research to explore the marine environment and the mysteries of its living beings.
Underwater target detection refers to a process of automatically detecting, locating, and identifying a target object in an underwater environment. Applications for underwater target detection include fields such as underwater monitoring, marine science research, subsea resource exploration, underwater robotic operation, and the like.
Currently, underwater target detection faces the following challenges:
Poor underwater image quality: due to factors such as absorption, scattering and reflection in the underwater transmission medium, underwater images are affected by light attenuation, noise interference and blur distortion, so the quality of underwater images is poor and target detection is difficult.
Lack of data: because of the high cost and difficulty of acquiring underwater images, few underwater data sets are currently available, which makes research on underwater target detection algorithms difficult.
To address these challenges, many deep learning-based underwater target detection algorithms, such as the Faster R-CNN algorithm, the YOLO algorithm and the SSD algorithm, have emerged in recent years. These algorithms extract image features using deep convolutional neural networks and employ a variety of techniques for target detection and localization, such as region proposals, anchor boxes and dilated convolution. In addition, some researchers have attempted improvements in data enhancement methods, feature fusion techniques and other aspects to improve the performance and robustness of underwater target detection.
However, current deep learning-based underwater target detectors still suffer from the following drawbacks:
Unstable detection performance: due to the complexity of the underwater environment, the performance of an underwater target detector is unstable under different scenarios and conditions. For example, the accuracy and recall of the detector may vary greatly with water quality, illumination and target size.
Insufficient generalization ability: due to the diversity of the underwater environment, a trained underwater target detector may not adapt to new, unknown underwater environments or target types, which limits its generalization ability and requires adjustment and optimization for specific tasks.
Large recognition errors: due to noise, distortion and other problems in underwater images, underwater target detectors often exhibit large recognition errors, especially for targets with complex shapes and varied appearances, such as seaweed and coral reefs.
Therefore, how to improve the accuracy of target detection, reduce recognition errors, and improve generalization ability in complex underwater environments is a hotspot and difficulty of current research.
Disclosure of Invention
The invention solves the problems of poor quality of underwater images, large recognition errors and poor generalization capability of the existing underwater target detection method.
In order to achieve the above object, the present invention provides the following solutions:
the invention provides an underwater target detection method based on characteristic migration, which comprises the following steps:
S1, constructing a reference data set by adopting an underwater fuzzy target image and an underwater high-definition image;
S2, training a migration countermeasure learning network model by adopting the reference data set to obtain the trained migration countermeasure learning network model;
S3, adopting the trained migration countermeasure learning network model to migrate the features of the underwater high-definition image onto the underwater fuzzy target image to obtain a feature-enhanced underwater high-definition target image;
S4, adding a two-layer dual-flow coordinate attention module DCA into the backbone network of YOLOv5 to obtain a DCA-YOLOv5 model;
S5, adding a group of anchor frames into the DCA-YOLOv5 model to obtain a DCA-YOLOv5 model with a small target detection head;
S6, adding a SIOU position loss function into the DCA-YOLOv5 model with the small target detection head to obtain a DCA-YOLOv5 target detection model;
S7, performing target detection on the feature-enhanced underwater high-definition target image by adopting the DCA-YOLOv5 target detection model to obtain the position and category information of the target.
Further, in a preferred embodiment, the step S1 specifically includes:
S11, acquiring an underwater fuzzy target image and a tag data set;
S12, acquiring an underwater high-definition image data set;
S13, adding the underwater fuzzy target image and the underwater high-definition image data set, in a 1:1 ratio, into a training set X and a test set Y;
S14, obtaining a reference data set according to the tag data set, the training set X and the test set Y.
Further, in a preferred embodiment, the step S3 specifically includes:
S31, training a generator G and a generator F by adopting the trained migration countermeasure learning network model;
S32, adopting the generator G to process the underwater fuzzy target images in the training set and generate underwater high-definition target images;
S33, adopting the generator F to process the underwater high-definition target images in the test set and generate underwater fuzzy target images;
S34, training a discriminator D_X and a discriminator D_Y by adopting the trained migration countermeasure learning network model;
S35, adopting the discriminator D_X to compare the generated underwater high-definition target images with the acquired underwater high-definition images to obtain a D_X discrimination result, and adopting the discriminator D_Y to compare the generated underwater fuzzy target images with the acquired underwater fuzzy target images to obtain a D_Y discrimination result;
S36, continuously optimizing the generator G with the D_X discrimination result and continuously optimizing the generator F with the D_Y discrimination result until an optimal generator G and an optimal generator F are obtained;
S37, balancing the images generated by the optimal generator G and the optimal generator F by adopting a cycle consistency loss and a loss function until a feature-enhanced underwater high-definition target image is obtained.
Further, in a preferred embodiment, the step S4 specifically includes:
S41, adding either one of the two attention enhancement modules of the two-layer dual-flow coordinate attention module DCA after the initial two-layer convolution of the YOLOv5 backbone network to obtain a YOLOv5 model with a one-layer attention mechanism;
S42, adding the other attention enhancement module of the two-layer dual-flow coordinate attention module DCA before the spatial pyramid pooling module of the backbone network of the YOLOv5 model with the one-layer attention mechanism to obtain a DCA-YOLOv5 model.
Further, in a preferred embodiment, the anchor frame size in the step S5 is:
further, in a preferred embodiment, the SIOU location loss function in step S6 includes an angle loss, a distance loss, a shape loss, and an intersection ratio loss;
the angle loss is expressed as:

$$\Lambda = 1 - 2\sin^{2}\left(\arcsin\frac{c_h}{\sigma} - \frac{\pi}{4}\right)$$

the distance loss is expressed as:

$$\Delta = \sum_{t=x,y}\left(1 - e^{-\gamma\rho_{t}}\right), \qquad \gamma = 2 - \Lambda$$

the shape loss is expressed as:

$$\Omega = \sum_{t=w,h}\left(1 - e^{-\omega_{t}}\right)^{\theta}, \qquad \omega_{w} = \frac{\lvert w - w^{gt}\rvert}{\max\left(w, w^{gt}\right)}, \quad \omega_{h} = \frac{\lvert h - h^{gt}\rvert}{\max\left(h, h^{gt}\right)}$$

the cross-ratio loss is expressed as:

$$IOU = \frac{\lvert B \cap B^{gt}\rvert}{\lvert B \cup B^{gt}\rvert}$$

the SIOU position loss function is expressed as:

$$L_{SIOU} = 1 - IOU + \frac{\Delta + \Omega}{2}$$

wherein \(c_h\) is the difference in height between the center points of the real and predicted frames, \(\sigma\) is the distance between the center points of the real and predicted frames, \(\gamma\) is a weight coefficient, \(\rho_x\) is the distance loss weight in the x-direction, \(\rho_y\) is the distance loss weight in the y-direction, \(w\) is the width of the prediction frame, \(h\) is the height of the prediction frame, \(w^{gt}\) is the width of the real frame, \(h^{gt}\) is the height of the real frame, \(\theta\) controls the degree of concern for shape loss, and \(IOU\) is the ratio of the intersection of the predicted and real frame areas.
Further, in a preferred embodiment, the step S7 specifically includes:
S71, preprocessing the feature-enhanced underwater high-definition target images and dividing them into a training set and a test set in a 7:3 ratio;
S72, training the DCA-YOLOv5 target detection model by adopting the training set to obtain a training model;
S73, verifying the training model by adopting a validation set to obtain an optimal model;
S74, performing target detection on the feature-enhanced underwater high-definition target image by adopting the optimal model to obtain the position and category information of the target.
The invention also provides an underwater target detection system based on feature migration, the system comprising:
the storage device is used for constructing a reference data set by adopting the underwater fuzzy target image and the underwater high-definition image;
the storage device is used for training the migration countermeasure learning network model by adopting the reference data set and obtaining the trained migration countermeasure learning network model;
the storage device is used for migrating the characteristics of the underwater high-definition image to the underwater fuzzy target image by adopting the trained migration countermeasure learning network model to obtain the characteristic-enhanced underwater high-definition target image;
the storage device is used for adding the two-layer dual-flow coordinate attention module DCA into the backbone network of YOLOv5 to obtain a DCA-YOLOv5 model;
the storage device is used for adding a group of anchor frames into the DCA-YOLOv5 model to obtain a DCA-YOLOv5 model with a small target detection head;
the storage device is used for adding a SIOU position loss function into the DCA-YOLOv5 model with the small target detection head to obtain a DCA-YOLOv5 target detection model;
and the storage device is used for carrying out target detection on the underwater high-definition target image with the enhanced characteristics by adopting the DCA-YOLOv5 target detection model to obtain the position and the category information of the target.
The invention also provides a computer readable storage medium, on which a computer program is stored, which when being executed by a processor, performs a method for detecting an underwater target based on feature migration as described in any one of the above.
The invention also provides a computer device, which comprises a memory and a processor, wherein the memory stores a computer program, and when the processor runs the computer program stored in the memory, the processor executes the underwater target detection method based on the characteristic migration.
The beneficial effects of the invention are as follows:
the invention provides a feature migration-based underwater target detection method, which adopts a migration-resistant learning network model to migrate the features of an underwater high-definition image to an underwater fuzzy target image, solves the problems of image blurring, light weakness and the like caused by a series of factors such as uneven underwater illumination, water quality blurring and the like, and provides good visual expression.
The invention provides an underwater target detection method based on feature migration, in which a two-layer dual-flow coordinate attention module DCA is added into the backbone network of YOLOv5 to obtain a DCA-YOLOv5 model. The DCA-YOLOv5 model thus has enhanced extraction of shallow target information and abstract target information, can focus better on underwater target information, and improves the accuracy of model detection. Meanwhile, a group of anchor frames is added into the DCA-YOLOv5 model so that it has a wider detection range, and a SIOU position loss function is added into the DCA-YOLOv5 model, which effectively accelerates model convergence and improves detection precision.
The invention is suitable for underwater fuzzy scene enhancement and high-precision underwater target detection.
Drawings
FIG. 1 is a flow chart of a method for detecting an underwater target based on feature migration according to an embodiment;
FIG. 2 is a diagram of a DCA-YOLOv5 model configuration according to one embodiment;
FIG. 3 is a schematic diagram of the convolution structure of FIG. 2;
fig. 4 is a block diagram of a coordinate attention module according to one embodiment.
Wherein Residual is denoted as a residual module, X Avg Pool as global average pooling along the horizontal direction, Y Avg Pool as global average pooling along the vertical direction, Concat as splicing along the channel dimension, Conv2d as convolution, BatchNorm as batch normalization, Non-linear as a nonlinear operation, Sigmoid as the Sigmoid activation function, Re-weight as weighted fusion, and Split as the split operation.
Detailed Description
The following describes the embodiments of the present invention in further detail with reference to the drawings and examples. The following examples will assist those skilled in the art in further understanding the present invention, but are not intended to limit the invention in any way. It should be noted that variations and modifications can be made by those skilled in the art without departing from the spirit of the invention, all of which fall within the scope of the invention.
Referring to fig. 1 to 4, the present embodiment provides a method for detecting an underwater target based on feature migration, where the method is as follows:
S1, constructing a reference data set by adopting an underwater fuzzy target image and an underwater high-definition image;
S2, training a migration countermeasure learning network model by adopting the reference data set to obtain the trained migration countermeasure learning network model;
S3, adopting the trained migration countermeasure learning network model to migrate the features of the underwater high-definition image onto the underwater fuzzy target image to obtain a feature-enhanced underwater high-definition target image;
S4, adding a two-layer dual-flow coordinate attention module DCA into the backbone network of YOLOv5 to obtain a DCA-YOLOv5 model;
S5, adding a group of anchor frames into the DCA-YOLOv5 model to obtain a DCA-YOLOv5 model with a small target detection head;
S6, adding a SIOU position loss function into the DCA-YOLOv5 model with the small target detection head to obtain a DCA-YOLOv5 target detection model;
S7, performing target detection on the feature-enhanced underwater high-definition target image by adopting the DCA-YOLOv5 target detection model to obtain the position and category information of the target.
In practical application, as shown in fig. 1, this embodiment prepares an underwater diver image dataset, i.e., the underwater fuzzy target images, and an underwater high-definition image dataset. An underwater camera is a camera specially designed for underwater shooting, which can adapt to different underwater environments through different optical lenses and filters. High-definition underwater cameras typically have higher pixel counts and more advanced image sensors, and can provide clearer and more detailed underwater images. A reference data set is constructed from the underwater fuzzy target images and the underwater high-definition images, and a migration countermeasure learning network model is trained on the reference data set to obtain the trained model. The trained migration countermeasure learning network model is then adopted to migrate the features of the underwater high-definition images onto the underwater fuzzy target images, producing 1000 feature-enhanced underwater high-definition target images. In the YOLOv5 backbone network, a two-layer dual-flow coordinate attention module is added; DCA denotes the dual-flow coordinate attention module, meaning that one coordinate attention module is used in the low-level semantic feature extraction part and another in the high-level semantic feature extraction part, yielding the DCA-YOLOv5 model shown in fig. 4. A group of anchor frames is then added to the DCA-YOLOv5 model to obtain a DCA-YOLOv5 model with a small target detection head, which has higher detection accuracy. A SIOU position loss function is added to the DCA-YOLOv5 model with the small target detection head to obtain the DCA-YOLOv5 target detection model, as shown in figures 2 and 3. The feature-enhanced underwater high-definition target images are input into the DCA-YOLOv5 target detection model to obtain the position and category information of the targets. Finally, the output results are post-processed, including de-duplication, screening and non-maximum suppression, to obtain the final underwater diver detection results.
The embodiment provides an underwater target detection method based on feature migration, which adopts a migration countermeasure learning network model to migrate the features of an underwater high-definition image to an underwater fuzzy target image, solves the problems of image blurring, weak light and the like caused by a series of factors such as uneven underwater illumination, fuzzy water quality and the like, and provides good visual expression.
The embodiment provides an underwater target detection method based on feature migration, which adds two layers of coordinate attention enhancement modules into a backbone network of YOLOv5 to obtain a DCA-YOLOv5 model, so that the DCA-YOLOv5 model enhances the extraction capability of shallow target information and abstract target information, thereby being capable of better focusing on underwater target information and improving the accuracy of model detection. Meanwhile, an anchor frame is added into the DCA-YOLOv5 model, so that the DCA-YOLOv5 model has a wider detection range, and a SIOU position loss function is added into the DCA-YOLOv5 model, so that the DCA-YOLOv5 model can effectively accelerate the convergence of the model and improve the detection precision.
In the second embodiment, the step S1 in the method for detecting an underwater target based on feature migration according to the first embodiment is illustrated, where the step S1 specifically includes:
S11, acquiring an underwater fuzzy target image and a tag data set;
S12, acquiring an underwater high-definition image data set;
S13, adding the underwater fuzzy target image and the underwater high-definition image data set, in a 1:1 ratio, into a training set X and a test set Y;
S14, obtaining a reference data set according to the tag data set, the training set X and the test set Y.
In the practical application of this embodiment, 1000 underwater fuzzy target images and a tag data set are prepared, and 1000 underwater high-definition images are prepared. The 1000 underwater fuzzy target images and the 1000 underwater high-definition images are added to the corresponding training set X and test set Y, the ratio of the training set X to the test set Y being 1:1, and operations such as resizing, cropping and normalization are performed on the images so that they fit the input requirements of the network model. A generator G, a generator F, a discriminator D_X and a discriminator D_Y are constructed with the migration countermeasure learning network architecture. The generator G is trained on the 1000 underwater fuzzy target images and is responsible for converting underwater fuzzy target images into underwater high-definition target images. The generator F is trained on the 1000 underwater high-definition images and is responsible for converting underwater high-definition target images into underwater fuzzy target images. The discriminator D_X continuously optimizes the generator G through its discrimination result, and the discriminator D_Y continuously optimizes the generator F through its discrimination result, until an optimal generator G and an optimal generator F are obtained. A cycle consistency loss and a loss function are adopted to balance the images generated by the optimal generator G and the optimal generator F, until the optimal generator G yields feature-enhanced underwater high-definition target images.
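For concreteness, the data set preparation described above can be sketched in Python as follows. The directory layout, file names and transform parameters are illustrative assumptions, not values taken from the patent:

```python
import os
from PIL import Image
from torchvision import transforms

# Illustrative preprocessing covering the operations named above:
# resizing, cropping and normalization to fit the network input.
preprocess = transforms.Compose([
    transforms.Resize((256, 256)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5] * 3, std=[0.5] * 3),
])

def build_reference_dataset(blurred_dir, hd_dir, n_images=1000):
    """Build training set X (fuzzy images) and test set Y (HD images) in a 1:1 ratio."""
    blurred_files = sorted(os.listdir(blurred_dir))[:n_images]
    hd_files = sorted(os.listdir(hd_dir))[:n_images]
    X = [preprocess(Image.open(os.path.join(blurred_dir, f)).convert("RGB"))
         for f in blurred_files]
    Y = [preprocess(Image.open(os.path.join(hd_dir, f)).convert("RGB"))
         for f in hd_files]
    return X, Y
```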
In the third embodiment, the step S3 in the method for detecting an underwater target based on feature migration according to the second embodiment is illustrated, where the step S3 specifically includes:
S31, training a generator G and a generator F by adopting the trained migration countermeasure learning network model;
S32, adopting the generator G to process the underwater fuzzy target images in the training set and generate underwater high-definition target images;
S33, adopting the generator F to process the underwater high-definition target images in the test set and generate underwater fuzzy target images;
S34, training a discriminator D_X and a discriminator D_Y by adopting the trained migration countermeasure learning network model;
S35, adopting the discriminator D_X to compare the generated underwater high-definition target images with the acquired underwater high-definition images to obtain a D_X discrimination result, and adopting the discriminator D_Y to compare the generated underwater fuzzy target images with the acquired underwater fuzzy target images to obtain a D_Y discrimination result;
S36, continuously optimizing the generator G with the D_X discrimination result and continuously optimizing the generator F with the D_Y discrimination result until an optimal generator G and an optimal generator F are obtained;
S37, balancing the images generated by the optimal generator G and the optimal generator F by adopting a cycle consistency loss and a loss function until a feature-enhanced underwater high-definition target image is obtained.
In practical application, feature migration is performed on the training set X and the test set Y divided in step S13 of the second embodiment, and a generator G and a generator F are trained through the trained migration countermeasure learning network model. The generator G takes an underwater fuzzy target image as input and outputs an underwater high-definition target image, i.e., \(G: X \rightarrow Y\); the generator F takes an underwater high-definition target image as input and outputs an underwater fuzzy target image, i.e., \(F: Y \rightarrow X\).
Training the migration countermeasure learning network model also yields a discriminator D_X and a discriminator D_Y. The discriminator D_X compares the generated underwater high-definition target images with the acquired underwater high-definition images to obtain a D_X discrimination result, and the discriminator D_Y compares the generated underwater fuzzy target images with the acquired underwater fuzzy target images to obtain a D_Y discrimination result. Each generator and its discriminator play a game against each other so that both parties evolve, until an optimal generator G and an optimal generator F are obtained. The relationship is as follows:

$$\min_{G}\max_{D} V(D, G) = \mathbb{E}_{x \sim p_{data}(x)}\left[\log D(x)\right] + \mathbb{E}_{z \sim p_{z}(z)}\left[\log\left(1 - D(G(z))\right)\right]$$

wherein \(z\) is the noise data fed to the generator, obeying the prior distribution \(p_{z}(z)\), and \(x\) represents real data obeying the true sample distribution \(p_{data}(x)\). When \(D(x)\) is larger, the discriminator identifies real samples more accurately; when \(D(G(z))\) is smaller, \(V(D, G)\) is larger and the discrimination ability is stronger. If the discriminator is fixed while the generation capacity of the generator is enhanced, the discriminator will come to judge the fake data as true, i.e., \(D(G(z))\) increases; hence the objective \(\min_{G}\max_{D} V(D, G)\).
The link between different domains is established through a dual learning method, and at the same time a cycle consistency loss is established:

$$L_{cyc}(G, F) = \mathbb{E}_{x \sim p_{data}(x)}\left[\left\lVert F(G(x)) - x\right\rVert_{1}\right] + \mathbb{E}_{y \sim p_{data}(y)}\left[\left\lVert G(F(y)) - y\right\rVert_{1}\right]$$

wherein \(x\) and \(y\) represent real data obeying the true sample distributions \(p_{data}(x)\) and \(p_{data}(y)\) respectively; \(F(G(x))\) is the generated reconstructed image; the subscript 1 denotes the \(L_{1}\) norm, which is added to prevent overfitting. The source-domain image is reconstructed so that the gap between the reconstruction and the source-domain image is reduced, which solves the problem of content change during feature migration.
Thus, the loss function of feature migration countermeasure learning is the sum of the adversarial losses and the cycle consistency loss:

$$L(G, F, D_X, D_Y) = L_{GAN}(G, D_X, X, Y) + L_{GAN}(F, D_Y, Y, X) + \lambda L_{cyc}(G, F)$$

wherein \(L_{GAN}(G, D_X, X, Y)\) is the adversarial generation loss of \(G\) and \(D_X\), \(L_{GAN}(F, D_Y, Y, X)\) is the adversarial generation loss of \(F\) and \(D_Y\), \(L_{cyc}\) is the cycle consistency loss, and \(\lambda\) is a weight value for adjusting the importance of the loss.
The trained feature migration countermeasure learning network thus has the capability of transferring the high-definition features of underwater high-definition images into underwater fuzzy target images, thereby obtaining underwater high-definition target images.
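As a concrete reference for the objectives above, a minimal PyTorch sketch of the generator-side loss is given below. G, F, D_X and D_Y are the networks named above; the binary cross-entropy criterion and the weight value λ = 10 are illustrative assumptions:

```python
import torch
import torch.nn as nn

bce = nn.BCEWithLogitsLoss()  # adversarial criterion (assumed choice)
l1 = nn.L1Loss()              # L1 norm used by the cycle consistency loss
lam = 10.0                    # weight λ for the cycle term (assumed value)

def generator_losses(G, F, D_X, D_Y, x, y):
    """x: real underwater fuzzy images; y: real underwater high-definition images."""
    fake_y = G(x)   # fuzzy -> high-definition
    fake_x = F(y)   # high-definition -> fuzzy
    # Adversarial terms: each generator tries to make its discriminator
    # label its output as real (D_X judges generated HD images, D_Y judges
    # generated fuzzy images, the pairing used in this embodiment).
    pred_hd = D_X(fake_y)
    pred_blur = D_Y(fake_x)
    adv = bce(pred_hd, torch.ones_like(pred_hd)) \
        + bce(pred_blur, torch.ones_like(pred_blur))
    # Cycle consistency: F(G(x)) ≈ x and G(F(y)) ≈ y, measured with the L1 norm.
    cyc = l1(F(fake_y), x) + l1(G(fake_x), y)
    return adv + lam * cyc
```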
In the fourth embodiment, the step S4 in the method for detecting an underwater target based on feature migration according to the first embodiment is illustrated, where the step S4 specifically includes:
S41, adding either one of the two attention enhancement modules of the two-layer dual-flow coordinate attention module DCA after the initial two-layer convolution of the YOLOv5 backbone network to obtain a YOLOv5 model with a one-layer attention mechanism;
S42, adding the other attention enhancement module of the two-layer dual-flow coordinate attention module DCA before the spatial pyramid pooling module of the backbone network of the YOLOv5 model with the one-layer attention mechanism to obtain a DCA-YOLOv5 model.
In practical application, this embodiment improves the YOLOv5 network. Either one of the two coordinate attention enhancement modules is added after the initial two-layer convolution of the YOLOv5 backbone network, yielding a YOLOv5 model with a one-layer attention mechanism, which enhances the model's ability to extract shallow target feature information and position information. The other coordinate attention enhancement module is added before the spatial pyramid pooling module of the backbone network of the YOLOv5 model with the one-layer attention mechanism, yielding the DCA-YOLOv5 model, which enhances the model's ability to extract abstract target feature information and position information. The features extracted by the two layers of coordinate attention are fused in the neck network, so that the DCA-YOLOv5 network model captures global feature information. Coordinate attention decomposes channel attention into two one-dimensional feature encodings along the two spatial directions and aggregates features along each spatial direction. In this way, remote dependencies can be captured in one spatial direction while precise position information is preserved in the other. The resulting feature maps are then separately encoded into a pair of direction-aware and position-sensitive attention maps, which can be applied complementarily to the input feature map to enhance the representation of the object of interest. This captures not only cross-channel information but also direction-aware and position-aware information, extracting feature and position information better and thus helping the model locate and identify underwater targets more accurately. To enable the attention module to capture remote dependencies with accurate position information, the global pooling is decomposed into a pair of one-dimensional feature encoding operations:

$$z_{c} = \frac{1}{H \times W}\sum_{i=1}^{H}\sum_{j=1}^{W} x_{c}(i, j)$$

wherein \(z_{c}\) is the output of the c-th channel, \(H\) and \(W\) are the height and width of the pooling kernel, and \(x_{c}\) is the input of the c-th channel.
Given an input \(X\), pooling kernels of size \((H, 1)\) or \((1, W)\) encode each channel along the horizontal coordinate and the vertical coordinate, respectively. The output of the c-th channel at height \(h\) can therefore be expressed as:

$$z_{c}^{h}(h) = \frac{1}{W}\sum_{0 \le i < W} x_{c}(h, i)$$
Meanwhile, the output of the c-th channel at width \(w\) can be written as:

$$z_{c}^{w}(w) = \frac{1}{H}\sum_{0 \le j < H} x_{c}(j, w)$$
These two encodings aggregate features along the vertical and horizontal directions respectively, forming a pair of direction-aware feature maps; each feature map contains the remote feature dependencies of the input features in one spatial direction and accurate position information in the other spatial direction. Embedding the coordinate information greatly deepens the network's attention to the target area of interest.
The coordinate attention mechanism embeds the coordinate information by splicing the feature maps generated in the two directions and transforming them with a shared 1×1 convolution \(F_{1}\):

$$f = \delta\left(F_{1}\left(\left[z^{h}, z^{w}\right]\right)\right)$$

wherein \(\delta\) is a nonlinear activation function. The generated feature map \(f \in \mathbb{R}^{C/r \times (H+W)}\) encodes the spatial information in the horizontal and vertical directions. After nonlinear processing and batch normalization, \(f\) is split into \(f^{h} \in \mathbb{R}^{C/r \times H}\) and \(f^{w} \in \mathbb{R}^{C/r \times W}\), where \(r\) is the reduction rate, whose role is to reduce the complexity of the model and prevent overfitting.
Two convolution transforms \(F_{h}\) and \(F_{w}\) then convert \(f^{h}\) and \(f^{w}\) into feature maps with the same number of channels as the input \(X\), and a Sigmoid activation function \(\sigma\) is applied to obtain the attention weights \(g^{h}\) and \(g^{w}\):

$$g^{h} = \sigma\left(F_{h}\left(f^{h}\right)\right), \qquad g^{w} = \sigma\left(F_{w}\left(f^{w}\right)\right)$$
Through the above two steps, the output of the coordinate attention is:

$$y_{c}(i, j) = x_{c}(i, j) \times g_{c}^{h}(i) \times g_{c}^{w}(j)$$

The coordinate attention thus encodes spatial information, including attention from the horizontal direction and attention from the vertical direction. The attentions in the two directions are applied to the input tensor simultaneously, and each entry of the two attention maps reflects whether the object of interest is present in the corresponding row and column. Coordinate attention strengthens the network's attention to the target, thereby improving the accuracy of target detection.
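A minimal PyTorch sketch of the coordinate attention block described above (directional pooling, a shared 1×1 convolution F1, splitting, and two Sigmoid-gated 1×1 convolutions F_h and F_w) follows; the reduction rate and the choice of nonlinearity δ are illustrative assumptions:

```python
import torch
import torch.nn as nn

class CoordinateAttention(nn.Module):
    def __init__(self, channels, reduction=32):
        super().__init__()
        mid = max(8, channels // reduction)  # C/r channels; r is the reduction rate
        self.conv1 = nn.Conv2d(channels, mid, kernel_size=1)   # shared 1x1 transform F1
        self.bn = nn.BatchNorm2d(mid)                          # batch normalization
        self.act = nn.Hardswish()                              # nonlinearity δ (assumed)
        self.conv_h = nn.Conv2d(mid, channels, kernel_size=1)  # F_h
        self.conv_w = nn.Conv2d(mid, channels, kernel_size=1)  # F_w

    def forward(self, x):
        n, c, h, w = x.shape
        # One-dimensional global pooling along each spatial direction.
        x_h = x.mean(dim=3, keepdim=True)                       # (n, c, h, 1)
        x_w = x.mean(dim=2, keepdim=True).permute(0, 1, 3, 2)   # (n, c, w, 1)
        # Splice along the spatial axis, transform, then split back.
        f = self.act(self.bn(self.conv1(torch.cat([x_h, x_w], dim=2))))
        f_h, f_w = torch.split(f, [h, w], dim=2)
        g_h = torch.sigmoid(self.conv_h(f_h))                        # (n, c, h, 1)
        g_w = torch.sigmoid(self.conv_w(f_w.permute(0, 1, 3, 2)))    # (n, c, 1, w)
        # y_c(i, j) = x_c(i, j) * g_h(i) * g_w(j)
        return x * g_h * g_w
```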
This embodiment provides an underwater target detection method based on feature migration, which adds two layers of coordinate attention enhancement modules into the backbone network of YOLOv5 to obtain a DCA-YOLOv5 model. The DCA-YOLOv5 model thus has enhanced extraction of shallow target information and abstract target information, can focus better on underwater target information, and improves the accuracy of model detection.
In a fifth embodiment, the anchor frames of step S5 in the method for detecting an underwater target based on feature migration according to the first embodiment are illustrated: they are a group of small-size anchor frames assigned to the added small target detection scale.
In practical application, this embodiment adds a group of small anchor frames to the DCA-YOLOv5 model to obtain a DCA-YOLOv5 model with a small target detection head. After the 19th layer of the neck part of the DCA-YOLOv5 model, the feature map continues to be processed by convolution, upsampling and other operations so that it keeps expanding; at the 22nd layer, the resulting feature map is spliced with the same-size feature map from the third layer of the backbone network, yielding a larger detection head for detecting small underwater targets. The DCA-YOLOv5 model finally produces 4 effective prediction feature heads. When a 640×640 underwater image is input, the feature head sizes on the 4 scales are 160×160, 80×80, 40×40 and 20×20, used for detecting small, medium and large targets respectively.
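The extra high-resolution branch can be sketched as follows; the layer indices, channel widths and module name are illustrative assumptions and do not reproduce the patent's exact YOLOv5 configuration:

```python
import torch
import torch.nn as nn

class SmallTargetHead(nn.Module):
    """Illustrative P2 branch: upsample the 80x80 neck feature and splice it with
    the stride-4 backbone feature to obtain a 160x160 map for small targets."""
    def __init__(self, neck_ch=128, backbone_p2_ch=64, out_ch=128):
        super().__init__()
        self.up = nn.Upsample(scale_factor=2, mode="nearest")
        self.fuse = nn.Conv2d(neck_ch + backbone_p2_ch, out_ch, kernel_size=1)

    def forward(self, neck_p3, backbone_p2):
        # neck_p3: (n, neck_ch, 80, 80) for a 640x640 input
        # backbone_p2: (n, backbone_p2_ch, 160, 160), i.e. stride 4
        x = self.up(neck_p3)                    # -> (n, neck_ch, 160, 160)
        x = torch.cat([x, backbone_p2], dim=1)  # splice with the backbone map
        return self.fuse(x)                     # 160x160 feature head

# With this branch, a 640x640 input yields heads at strides 4/8/16/32,
# i.e. 160x160, 80x80, 40x40 and 20x20.
```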
In a sixth embodiment, the method for detecting an underwater target based on feature migration according to the first embodiment is exemplified by the SIOU position loss function of step S6, where the SIOU position loss function includes an angle loss, a distance loss, a shape loss, and an intersection ratio loss;
wherein the angle loss is expressed as:

$$\Lambda = 1 - 2\sin^{2}\left(\arcsin\frac{c_h}{\sigma} - \frac{\pi}{4}\right)$$

the distance loss is expressed as:

$$\Delta = \sum_{t=x,y}\left(1 - e^{-\gamma\rho_{t}}\right), \qquad \gamma = 2 - \Lambda$$

the shape loss is expressed as:

$$\Omega = \sum_{t=w,h}\left(1 - e^{-\omega_{t}}\right)^{\theta}, \qquad \omega_{w} = \frac{\lvert w - w^{gt}\rvert}{\max\left(w, w^{gt}\right)}, \quad \omega_{h} = \frac{\lvert h - h^{gt}\rvert}{\max\left(h, h^{gt}\right)}$$

the cross-ratio loss is expressed as:

$$IOU = \frac{\lvert B \cap B^{gt}\rvert}{\lvert B \cup B^{gt}\rvert}$$

the SIOU position loss function is expressed as:

$$L_{SIOU} = 1 - IOU + \frac{\Delta + \Omega}{2}$$

wherein \(c_h\) is the difference in height between the center points of the real and predicted frames, \(\sigma\) is the distance between the center points of the real and predicted frames, \(\gamma\) is a weight coefficient, \(\rho_x\) is the distance loss weight in the x-direction, \(\rho_y\) is the distance loss weight in the y-direction, \(w\) is the width of the prediction frame, \(h\) is the height of the prediction frame, \(w^{gt}\) is the width of the real frame, \(h^{gt}\) is the height of the real frame, \(\theta\) controls the degree of concern for shape loss, and \(IOU\) is the ratio of the intersection of the predicted and real frame areas.
In practical application, the loss function of the DCA-YOLOv5 model mainly comprises classification loss, confidence loss and coordinate loss, where the coordinate loss function is CIOU Loss. CIOU Loss evaluates the model by considering the center point distance between the prediction frame and the real frame, the aspect ratio difference and the degree of overlap. The formula is as follows:

$$L_{CIOU} = 1 - IOU + \frac{\rho^{2}\left(b, b^{gt}\right)}{c^{2}} + \alpha v$$

$$v = \frac{4}{\pi^{2}}\left(\arctan\frac{w^{gt}}{h^{gt}} - \arctan\frac{w}{h}\right)^{2}, \qquad \alpha = \frac{v}{(1 - IOU) + v}$$

wherein \(b\) is the center point of the prediction frame, \(b^{gt}\) is the center point of the real frame, \(IOU\) is the intersection ratio of the predicted and real frame areas, \(\rho\) is the Euclidean distance between the centers of the prediction frame and the real frame, \(c\) is the diagonal length of the minimum enclosing box of the prediction frame and the real frame, \(\alpha\) is a weight function, \(v\) measures the consistency of the aspect ratio between the two boxes, \(w^{gt}/h^{gt}\) is the aspect ratio of the real frame, and \(w/h\) is the aspect ratio of the prediction frame.
However, the coordinate loss function CIOU Loss only considers factors such as the center point distance and the aspect ratio difference, and does not consider the influence of target position error on detection precision, so model accuracy is limited and convergence is slow. This embodiment therefore uses SIOU Loss instead of CIOU Loss as the coordinate loss function. SIOU Loss consists of four parts: the angle loss (Angle cost), the distance loss (Distance cost), the shape loss (Shape cost) and the intersection ratio loss (IOU cost).
For the angle loss \(\Lambda\), the formula is as follows:

$$\Lambda = 1 - 2\sin^{2}\left(\arcsin\frac{c_h}{\sigma} - \frac{\pi}{4}\right)$$

wherein \(c_h\) is the difference in height between the center points of the real and predicted frames and \(\sigma\) is the distance between the center points of the real and predicted frames.

For the distance loss \(\Delta\), the formula is as follows:

$$\Delta = \sum_{t=x,y}\left(1 - e^{-\gamma\rho_{t}}\right), \qquad \gamma = 2 - \Lambda$$

wherein \(\gamma\) is a weight coefficient, and \(\rho_x\) and \(\rho_y\) represent the distance loss weights in the x-direction and the y-direction, respectively.

For the shape loss \(\Omega\), the formula is as follows:

$$\Omega = \sum_{t=w,h}\left(1 - e^{-\omega_{t}}\right)^{\theta}, \qquad \omega_{w} = \frac{\lvert w - w^{gt}\rvert}{\max\left(w, w^{gt}\right)}, \quad \omega_{h} = \frac{\lvert h - h^{gt}\rvert}{\max\left(h, h^{gt}\right)}$$

wherein \(w\) and \(h\) are the width and height of the prediction box, \(w^{gt}\) and \(h^{gt}\) are the width and height of the real box, and \(\theta\) controls the degree of concern over shape loss.

For the IOU loss, the formula is as follows:

$$IOU = \frac{\lvert B \cap B^{gt}\rvert}{\lvert B \cup B^{gt}\rvert}$$

wherein \(IOU\) is the intersection ratio of the predicted frame and real frame areas.

The final SIOU Loss formula is:

$$L_{SIOU} = 1 - IOU + \frac{\Delta + \Omega}{2}$$
the SIOU Loss is used as a coordinate Loss function, so that the convergence of a model can be effectively quickened, and the detection precision is improved.
In a seventh embodiment, the step S7 in the method for detecting an underwater target based on feature migration according to the first embodiment is illustrated, where the step S7 specifically includes:
S71, preprocessing the feature-enhanced underwater high-definition target images and dividing them into a training set and a test set in a 7:3 ratio;
S72, training the DCA-YOLOv5 target detection model by adopting the training set to obtain a training model;
S73, verifying the training model by adopting a validation set to obtain an optimal model;
S74, performing target detection on the feature-enhanced underwater high-definition target image by adopting the optimal model to obtain the position and category information of the target.
In practical application, the feature-enhanced underwater high-definition target image data set is divided into a training set and a test set in a 7:3 ratio, and the images are preprocessed, including scaling, rotation, normalization and channel-order transformation, to improve the accuracy and generalization capability of model detection. The DCA-YOLOv5 model is trained with the training set to obtain a training model, and the training model is verified with a validation set to obtain an optimal model; the validation set is a portion divided from the training set, typically thirty percent of the training set. The optimal model is adopted to perform target detection on the feature-enhanced underwater high-definition target images to obtain detection results, which are post-processed, including de-duplication, screening and non-maximum suppression, to finally obtain the position and category information of the targets.
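The post-processing named above (screening by confidence and non-maximum suppression) can be sketched with torchvision as follows; the thresholds are typical values, assumed rather than specified by the patent:

```python
import torch
from torchvision.ops import nms

def postprocess(boxes, scores, classes, conf_thres=0.25, iou_thres=0.45):
    """boxes: (N, 4) xyxy; scores: (N,); classes: (N,) integer labels."""
    keep = scores > conf_thres                 # screening by confidence
    boxes, scores, classes = boxes[keep], scores[keep], classes[keep]
    # Class-aware NMS: offset boxes per class so that boxes of different
    # classes never suppress each other.
    offsets = classes.float().unsqueeze(1) * 4096.0
    idx = nms(boxes + offsets, scores, iou_thres)
    return boxes[idx], scores[idx], classes[idx]   # final positions and categories
```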
The eighth embodiment provides an underwater target detection system based on feature migration; the system comprises:
the storage device is used for constructing a reference data set by adopting the underwater fuzzy target image and the underwater high-definition image;
the storage device is used for training the migration countermeasure learning network model by adopting the reference data set and obtaining the trained migration countermeasure learning network model;
the storage device is used for migrating the characteristics of the underwater high-definition image to the underwater fuzzy target image by adopting the trained migration countermeasure learning network model to obtain the characteristic-enhanced underwater high-definition target image;
the storage device is used for adding the two-layer dual-flow coordinate attention module DCA into the backbone network of YOLOv5 to obtain a DCA-YOLOv5 model;
the storage device is used for adding a group of anchor frames into the DCA-YOLOv5 model to obtain a DCA-YOLOv5 model with a small target detection head;
the storage device is used for adding a SIOU position loss function into the DCA-YOLOv5 model with the small target detection head to obtain a DCA-YOLOv5 target detection model;
and the storage device is used for carrying out target detection on the underwater high-definition target image with the enhanced characteristics by adopting the DCA-YOLOv5 target detection model to obtain the position and the category information of the target.
In a ninth embodiment, a computer-readable storage medium is provided, on which a computer program is stored; when the computer program is executed by a processor, the feature migration-based underwater target detection method according to any one of the first to seventh embodiments is performed.
In a tenth embodiment, a computer device is provided, comprising a memory and a processor, wherein the memory stores a computer program; when the processor runs the computer program stored in the memory, the processor executes the feature migration-based underwater target detection method according to any one of the first to seventh embodiments.
An eleventh embodiment is a comparison experiment between the underwater target detection method based on feature migration described in any one of the first to tenth embodiments and the existing reference model YOLOv5 and the newer target detection models YOLOv7, YOLOv8 and RT-DETR. The experimental results indicate that, on mAP@0.5, mAP@0.75 and mAP@0.5:0.95, the feature migration-based underwater target detection method is respectively 5.4%, 4.3% and 3.6% higher than YOLOv5; 5.1%, 3.9% and 3.4% higher than YOLOv7; 4.2%, 3.7% and 3.6% higher than YOLOv8; and 3.1% and 2.9% higher than RT-DETR on the first two metrics.
Therefore, the underwater target detection method based on the feature migration improves the accuracy of model detection and improves the detection precision.
Furthermore, the terms "first," "second," and the like, are used for descriptive purposes only and are not to be construed as indicating or implying a relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first" or "a second" may explicitly or implicitly include at least one such feature. In the description of the present invention, the meaning of "plurality" means at least two, for example, two, three, etc., unless specifically defined otherwise.
In the description of the present specification, a description referring to terms "one embodiment," "some embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic representations of the above terms are not necessarily directed to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, the different embodiments or examples described in this specification and the features of the different embodiments or examples may be combined and combined by those skilled in the art without contradiction.
The above description is only an example of the present invention and is not limited to the present invention, but various modifications and changes will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the scope of the claims of the present invention.

Claims (8)

1. The method for detecting the underwater target based on the characteristic migration is characterized by comprising the following steps:
S1, constructing a reference data set by adopting an underwater fuzzy target image and an underwater high-definition image;
S2, training a migration countermeasure learning network model by adopting the reference data set to obtain the trained migration countermeasure learning network model;
S3, adopting the trained migration countermeasure learning network model to migrate the features of the underwater high-definition image onto the underwater fuzzy target image to obtain a feature-enhanced underwater high-definition target image;
S4, adding a two-layer dual-flow coordinate attention module DCA into the backbone network of YOLOv5 to obtain a DCA-YOLOv5 model;
S5, adding a group of anchor frames into the DCA-YOLOv5 model to obtain a DCA-YOLOv5 model with a small target detection head;
the anchor frames are a group of small-size anchor frames assigned to the added small target detection scale;
S6, adding a SIOU position loss function into the DCA-YOLOv5 model with the small target detection head to obtain a DCA-YOLOv5 target detection model;
S7, performing target detection on the feature-enhanced underwater high-definition target image by adopting the DCA-YOLOv5 target detection model to obtain the position and category information of the target;
the step S4 specifically includes:
S41, adding either one of the two attention enhancement modules of the two-layer dual-flow coordinate attention module DCA after the initial two-layer convolution of the YOLOv5 backbone network to obtain a YOLOv5 model with a one-layer attention mechanism;
S42, adding the other attention enhancement module of the two-layer dual-flow coordinate attention module DCA before the spatial pyramid pooling module of the backbone network of the YOLOv5 model with the one-layer attention mechanism to obtain the DCA-YOLOv5 model;
the features extracted by the two layers of coordinate attention in the DCA-YOLOv5 model are fused in the neck network, so that the DCA-YOLOv5 model captures global feature information.
2. The method for detecting an underwater target based on feature migration according to claim 1, wherein the step S1 specifically comprises:
S11, acquiring an underwater fuzzy target image and a tag data set;
S12, acquiring an underwater high-definition image data set;
S13, adding the underwater fuzzy target image and the underwater high-definition image data set, in a 1:1 ratio, into a training set X and a test set Y;
S14, obtaining a reference data set according to the tag data set, the training set X and the test set Y.
3. The method for detecting an underwater target based on feature migration according to claim 2, wherein the step S3 specifically comprises:
S31, training a generator G and a generator F by adopting the trained migration countermeasure learning network model;
S32, adopting the generator G to process the underwater fuzzy target images in the training set and generate underwater high-definition target images;
S33, adopting the generator F to process the underwater high-definition target images in the test set and generate underwater fuzzy target images;
S34, training a discriminator D_X and a discriminator D_Y by adopting the trained migration countermeasure learning network model;
S35, adopting the discriminator D_X to compare the generated underwater high-definition target images with the acquired underwater high-definition images to obtain a D_X discrimination result, and adopting the discriminator D_Y to compare the generated underwater fuzzy target images with the acquired underwater fuzzy target images to obtain a D_Y discrimination result;
S36, continuously optimizing the generator G with the D_X discrimination result and continuously optimizing the generator F with the D_Y discrimination result until an optimal generator G and an optimal generator F are obtained;
S37, balancing the images generated by the optimal generator G and the optimal generator F by adopting a cycle consistency loss and a loss function until a feature-enhanced underwater high-definition target image is obtained.
4. The method for detecting an underwater target based on feature migration according to claim 1, wherein the SIOU location loss function in the step S6 includes an angle loss, a distance loss, a shape loss, and an intersection ratio loss;
the angle loss is expressed as:

$$\Lambda = 1 - 2\sin^{2}\left(\arcsin\frac{c_h}{\sigma} - \frac{\pi}{4}\right)$$

the distance loss is expressed as:

$$\Delta = \sum_{t=x,y}\left(1 - e^{-\gamma\rho_{t}}\right), \qquad \gamma = 2 - \Lambda$$

the shape loss is expressed as:

$$\Omega = \sum_{t=w,h}\left(1 - e^{-\omega_{t}}\right)^{\theta}, \qquad \omega_{w} = \frac{\lvert w - w^{gt}\rvert}{\max\left(w, w^{gt}\right)}, \quad \omega_{h} = \frac{\lvert h - h^{gt}\rvert}{\max\left(h, h^{gt}\right)}$$

the cross-ratio loss is expressed as:

$$IOU = \frac{\lvert B \cap B^{gt}\rvert}{\lvert B \cup B^{gt}\rvert}$$

the SIOU position loss function is expressed as:

$$L_{SIOU} = 1 - IOU + \frac{\Delta + \Omega}{2}$$

wherein \(c_h\) is the difference in height between the center points of the real and predicted frames, \(\sigma\) is the distance between the center points of the real and predicted frames, \(\gamma\) is a weight coefficient, \(\rho_x\) is the distance loss weight in the x-direction, \(\rho_y\) is the distance loss weight in the y-direction, \(w\) is the width of the prediction frame, \(h\) is the height of the prediction frame, \(w^{gt}\) is the width of the real frame, \(h^{gt}\) is the height of the real frame, \(\theta\) controls the degree of concern for shape loss, and \(IOU\) is the ratio of the intersection of the predicted and real frame areas.
5. The method for detecting an underwater target based on feature migration according to claim 1, wherein the step S7 is specifically:
S71, preprocessing the feature-enhanced underwater high-definition target images and dividing them into a training set and a test set in a 7:3 ratio;
S72, training the DCA-YOLOv5 target detection model by adopting the training set to obtain a training model;
S73, verifying the training model by adopting a validation set to obtain an optimal model;
S74, performing target detection on the feature-enhanced underwater high-definition target image by adopting the optimal model to obtain the position and category information of the target.
6. An underwater target detection system based on feature migration, characterized in that the system comprises:
the storage device is used for constructing a reference data set by adopting the underwater fuzzy target image and the underwater high-definition image;
the storage device is used for training the migration countermeasure learning network model by adopting the reference data set and obtaining the trained migration countermeasure learning network model;
the storage device is used for migrating the characteristics of the underwater high-definition image to the underwater fuzzy target image by adopting the trained migration countermeasure learning network model to obtain the characteristic-enhanced underwater high-definition target image;
the storage device is used for adding the two-layer double-flow coordinate attention module DCA into a backbone network of the Yolov5 to obtain a DCA-Yolov5 model;
a storage means for adding a set of anchor boxes to the DCA-YOLOv5 model, obtaining a DCA-YOLOv5 model with a small target inspection head;
the size of the anchor frame is
Storage means for adding a SIOU position loss function to the DCA-YOLOv5 model with small target inspection head to obtain a DCA-YOLOv5 target detection model;
the storage device is used for carrying out target detection on the underwater high-definition target image with the enhanced characteristics by adopting the DCA-YOLOv5 target detection model to obtain the position and the category information of the target;
the specific steps for obtaining the DCA-YOLOv5 model are as follows:
S41, adding one of the two attention enhancement modules of the two-layer dual-stream coordinate attention module DCA after the initial two convolution layers of the YOLOv5 backbone network to obtain a YOLOv5 model with a one-layer attention mechanism;
S42, adding the other attention enhancement module of the two-layer dual-stream coordinate attention module DCA to the spatial pyramid pooling module of the backbone network of the YOLOv5 model with the one-layer attention mechanism to obtain the DCA-YOLOv5 model;
the features extracted by the two coordinate attention layers in the DCA-YOLOv5 model are fused in the neck network, so that the DCA-YOLOv5 model captures global feature information.
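For reference, below is a sketch of one coordinate-attention branch of the kind a two-layer dual-stream DCA module could stack, following the published coordinate attention design; the reduction ratio, activation, and class name are assumptions, and the patent's exact DCA wiring inside the YOLOv5 backbone is not reproduced here.

```python
import torch
import torch.nn as nn

class CoordinateAttention(nn.Module):
    """One coordinate-attention branch: position-aware channel reweighting along H and W."""

    def __init__(self, channels, reduction=32):
        super().__init__()
        mid = max(8, channels // reduction)
        self.conv1 = nn.Conv2d(channels, mid, kernel_size=1)
        self.bn = nn.BatchNorm2d(mid)
        self.act = nn.SiLU()
        self.conv_h = nn.Conv2d(mid, channels, kernel_size=1)
        self.conv_w = nn.Conv2d(mid, channels, kernel_size=1)

    def forward(self, x):
        n, c, h, w = x.shape
        # Encode position along each axis with 1D average pooling
        x_h = x.mean(dim=3, keepdim=True)                       # (n, c, h, 1)
        x_w = x.mean(dim=2, keepdim=True).permute(0, 1, 3, 2)   # (n, c, w, 1)
        # Shared transform over the concatenated directional descriptors
        y = self.act(self.bn(self.conv1(torch.cat([x_h, x_w], dim=2))))
        y_h, y_w = torch.split(y, [h, w], dim=2)
        # Per-direction attention maps, then reweight the input feature map
        a_h = torch.sigmoid(self.conv_h(y_h))                        # (n, c, h, 1)
        a_w = torch.sigmoid(self.conv_w(y_w.permute(0, 1, 3, 2)))    # (n, c, 1, w)
        return x * a_h * a_w
```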
7. A computer readable storage medium, characterized in that the computer readable storage medium stores a computer program which, when executed by a processor, performs the underwater target detection method based on feature migration as claimed in any one of claims 1-5.
8. A computer device, comprising a memory and a processor, characterized in that the memory stores a computer program, and the processor, when running the computer program stored in the memory, performs the underwater target detection method based on feature migration as claimed in any one of claims 1-5.
CN202311175150.1A 2023-09-13 2023-09-13 Underwater target detection method and system based on feature migration Active CN116912675B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311175150.1A CN116912675B (en) 2023-09-13 2023-09-13 Underwater target detection method and system based on feature migration

Publications (2)

Publication Number Publication Date
CN116912675A CN116912675A (en) 2023-10-20
CN116912675B true CN116912675B (en) 2023-11-28

Family

ID=88355033

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311175150.1A Active CN116912675B (en) 2023-09-13 2023-09-13 Underwater target detection method and system based on feature migration

Country Status (1)

Country Link
CN (1) CN116912675B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117746233A (en) * 2023-12-08 2024-03-22 江苏海洋大学 Target lightweight detection method for unmanned cleaning ship in water area

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112884671A (en) * 2021-02-26 2021-06-01 南京航空航天大学 Fuzzy image restoration method based on unsupervised generation countermeasure network
CN113052200A (en) * 2020-12-09 2021-06-29 江苏科技大学 Sonar image target detection method based on yolov3 network
CN113592715A (en) * 2021-08-05 2021-11-02 昆明理工大学 Super-resolution image reconstruction method for small sample image set
CN115731441A (en) * 2022-11-29 2023-03-03 浙江大学 Target detection and attitude estimation method based on data cross-modal transfer learning
CN115984681A (en) * 2022-12-14 2023-04-18 燕山大学 Underwater target detection method based on YOLOv5
CN116302874A (en) * 2023-01-06 2023-06-23 苏州浪潮智能科技有限公司 Model capability test method and device, electronic equipment, storage medium and product
CN116543295A (en) * 2023-04-07 2023-08-04 南京信息工程大学 Lightweight underwater target detection method and system based on degradation image enhancement

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
"Grounding Wire Target Detection Based on CotNet-Improved YOLOv5"; Huang Hao et al.; Computer Systems & Applications; p. 287 *
"Research on Underwater Image Enhancement Algorithms Based on Deep Adversarial Networks"; Li Hanyu; China Master's Theses Full-text Database; Chapter 4 *

Similar Documents

Publication Publication Date Title
CN115601549B (en) River and lake remote sensing image segmentation method based on deformable convolution and self-attention model
CN110728658A (en) High-resolution remote sensing image weak target detection method based on deep learning
Hu et al. Structure‐aware 3D reconstruction for cable‐stayed bridges: A learning‐based method
CN111626295B (en) Training method and device for license plate detection model
CN116912675B (en) Underwater target detection method and system based on feature migration
CN114419467A (en) Training method and device for target detection model of rotating ship and storage medium
CN113191204B (en) Multi-scale blocking pedestrian detection method and system
CN111652864A (en) Casting defect image generation method for generating countermeasure network based on conditional expression
CN116486231A (en) Concrete crack detection method based on improved YOLOv5
CN116168240A (en) Arbitrary-direction dense ship target detection method based on attention enhancement
CN116342536A (en) Aluminum strip surface defect detection method, system and equipment based on lightweight model
Koziarski et al. Marine snow removal using a fully convolutional 3d neural network combined with an adaptive median filter
Guan et al. RoadCapsFPN: Capsule feature pyramid network for road extraction from VHR optical remote sensing imagery
CN117173412A (en) Medical image segmentation method based on CNN and Transformer fusion network
CN111696167A (en) Single image super-resolution reconstruction method guided by self-example learning
CN114445726B (en) Sample library establishing method and device based on deep learning
CN115527105A (en) Underwater target detection method based on multi-scale feature learning
CN116385915A (en) Water surface floater target detection and tracking method based on space-time information fusion
CN112069997B (en) Unmanned aerial vehicle autonomous landing target extraction method and device based on DenseHR-Net
Kee et al. Cracks identification using mask region-based denoised deformable convolutional network
CN114998990B (en) Method and device for identifying safety behaviors of personnel on construction site
CN117011688B (en) Method, system and storage medium for identifying diseases of underwater structure
Huo et al. Fast Small Object Detection Algorithm Based on Feature Enhancement and Reconstruction
Näs Water Level Measurement From Images Using Object Detection
Balcıoğlu et al. Machine Design Automation Model for Metal Production Defect Recognition with Deep Graph Convolutional Neural Network. Electronics 2023, 12, 825

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant