CN117037132A - Ship water gauge reading detection and identification method based on machine vision - Google Patents
- Publication number
- CN117037132A (application CN202311003627.8A)
- Authority
- CN
- China
- Prior art keywords
- ship
- water gauge
- water
- area
- predicted
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/60—Type of objects
- G06V20/62—Text, e.g. of license plates, overlay texts or captions on TV images
- G06V20/63—Scene text, e.g. street names
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/048—Activation functions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/52—Scale-space analysis, e.g. wavelet analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/762—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using clustering, e.g. of similar faces in social networks
- G06V10/763—Non-hierarchical techniques, e.g. based on statistics of modelling distributions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
- G06V10/765—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects using rules for classification or partitioning the feature space
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/14—Image acquisition
- G06V30/148—Segmentation of character regions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/19—Recognition using electronic means
- G06V30/191—Design or setup of recognition systems or techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/19—Recognition using electronic means
- G06V30/191—Design or setup of recognition systems or techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
- G06V30/19173—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/07—Target detection
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A90/00—Technologies having an indirect contribution to adaptation to climate change
- Y02A90/30—Assessment of water resources
Abstract
The invention discloses a ship water gauge reading detection and identification method based on machine vision, relating to the technical field of image processing. The method comprises the following steps: S1, collecting pictures of the water gauge area of passing ships; S2, establishing the ship-water boundary line; S3, detecting and labeling the ship water gauge scale; S4, obtaining a precise area to be predicted; and S5, carrying out character prediction on the area to be predicted. First, the strategy of labeling the integer and decimal parts of the ship water gauge reading separately avoids complex post-processing logic, and the method retains high accuracy and robustness even when the shooting angle is inclined or the hull is curved; in practical application, as the training data grow, the detection accuracy of the YOLOv5 target detection algorithm on the integer and decimal parts of the water gauge continues to improve. Second, the Mask RCNN instance segmentation algorithm is introduced to separate the ship from the water area, so that the accuracy of the area to be predicted is guaranteed regardless of whether the water body in the actual scene is turbid.
Description
Technical Field
The invention relates to the technical field of image processing, in particular to a ship water gauge reading detection and identification method based on machine vision.
Background
Ship water gauge weighing (draft survey) is a method for measuring the weight of cargo carried by a ship, and it is widely applied to weight measurement of low-value bulk cargo transported by ship. The accuracy of ship water gauge reading identification directly influences the accuracy of ship cargo capacity calculation, and in turn affects trade settlement, customs taxation and the like, so guaranteeing the accuracy of ship waterline detection and water gauge reading identification is of great significance. Currently, the measurement of ship water gauge lines mainly comprises the following methods:
1. Manual observation: after the ship berths, an experienced water gauge observer observes and records the readings manually, then compares and averages the observed results to obtain a final value. This method is strongly affected by subjective and objective factors: the water gauge reading is difficult to observe when the water surface fluctuates greatly, weather and viewing-angle limitations may prevent the observer from obtaining accurate values, and manual observation is too inefficient to support high-throughput port traffic;
2. Ship waterline detection based on deep learning: with the rapid development of deep learning, more and more convolutional-neural-network models are used to detect and identify ship water gauge readings. A typical pipeline first rectifies the original input image through affine transformation, then uses an instance segmentation algorithm to extract the ship water gauge area and the ship-water boundary position, sends the water gauge area picture into an OCR character recognition algorithm to read the water gauge values, and finally determines the integer and decimal values of the water gauge through complex post-processing logic combined with the estimated ship-water boundary position to obtain the final reading. This approach places high accuracy demands on image preprocessing and instance segmentation; in practical application, water gauge distortion caused by inclined shooting angles or hull surface curvature is common, in which case the accuracy of the method is low. Moreover, the complex post-processing logic only fits particular scenarios and must be redesigned whenever the ship type or monitoring scene changes, so methods that determine integer and decimal values through post-processing generalize poorly and cannot be deployed at scale. To solve these problems, the inventor proposes a ship water gauge reading detection and identification method based on machine vision.
Disclosure of Invention
To solve the problem of low accuracy in ship waterline detection, the invention aims to provide a ship water gauge reading detection and identification method based on machine vision.
In order to solve the technical problems, the invention adopts the following technical scheme: a ship water gauge reading detection and identification method based on machine vision comprises the following steps:
s1, acquiring a picture of the water gauge area of a passing ship through an unmanned aerial vehicle or a wharf-mounted fixed monitoring camera, wherein image acquisition must ensure that the light is sufficient and that the ship water gauge graduations and the ship-water boundary line are clearly visible;
s2, performing instance segmentation of the ship area and the water area by using a Mask-RCNN model to obtain the ship-water boundary line;
s3, with meters (m) as the unit and Arabic numerals representing the integer and decimal parts, respectively predicting the integer part and the decimal part of the ship water gauge scale by using the YOLOv5 target detection algorithm to obtain the coordinate position of each water gauge reading bounding box and its category, wherein the integer part is labeled as the "int" category and the decimal part as the "float" category;
s4, with the upper left corner of the original image as the coordinate origin, respectively finding the target bounding box with the largest center-point y-axis coordinate among the "int" and "float" targets, namely the effective reading area, which is further refined into a more accurate area to be predicted by combining it with the ship-water boundary line obtained in S2;
s5, carrying out character prediction on the area to be predicted through an OCR algorithm; to guarantee the accuracy of the water gauge reading prediction and avoid interference from water surface fluctuation, all water gauge readings detected within 10 seconds are collected during actual inference and their average value is taken as the final prediction result.
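The selection logic of S4 can be sketched as follows — a minimal illustration assuming detections arrive as (class, box) tuples in pixel coordinates with the origin at the top-left; the tuple layout is an assumption for illustration, not the patent's actual interface:

```python
def effective_reading_boxes(detections):
    """Per class, keep the detection whose center lies lowest in the image
    (largest center y, origin at top-left), i.e. closest to the waterline.

    `detections`: list of (class_name, (x1, y1, x2, y2)) tuples (assumed layout).
    """
    best = {}
    for cls, (x1, y1, x2, y2) in detections:
        cy = (y1 + y2) / 2.0
        if cls not in best or cy > best[cls][0]:
            best[cls] = (cy, (x1, y1, x2, y2))
    return {cls: box for cls, (cy, box) in best.items()}

dets = [
    ("int",   (100, 40, 140, 70)),
    ("int",   (100, 200, 140, 230)),   # lowest "int" box -> effective reading area
    ("float", (150, 260, 170, 280)),
]
print(effective_reading_boxes(dets))
```

The lowest box per class is then intersected with the ship-water boundary from S2 to obtain the final area to be predicted.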
Preferably, in S3, under such a labeling strategy, the target bounding boxes of different categories have different sizes and aspect ratios, so that the targets can be predicted on different prediction feature maps by the subsequent YOLOv5 target detection algorithm.
Preferably, in S3, YOLOv5 is an anchor-based target detection algorithm, that is, regression prediction is performed with respect to a priori anchor boxes. Its basic principle is to scale the input image to the size required by the model through preprocessing operations, and then use a series of convolutions, batch normalization and activation functions to extract features and downsample the image by factors of 8, 16 and 32; because of the different downsampling rates, large objects in the training set are usually predicted on the small-size prediction feature map, and small objects on the large-size prediction feature map.
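The three downsampling rates mentioned above fix the grid sizes of the YOLOv5 prediction feature maps; a one-line sketch, assuming the 640 x 640 input resolution used later in the embodiment:

```python
def feature_map_sizes(input_size, strides=(8, 16, 32)):
    """Grid sizes of the three YOLOv5 prediction feature maps for a square
    input; a larger stride yields a smaller map, used for larger objects."""
    return [input_size // s for s in strides]

print(feature_map_sizes(640))  # -> [80, 40, 20]
```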
Compared with the prior art, the invention has the beneficial effects that:
1. in the invention, the integer and decimal parts of the ship water gauge reading are directly distinguished and detected by the target detection algorithm, avoiding complex numerical judgment logic in post-processing and improving the accuracy and robustness of the algorithm model;
2. the invention fuses multiple models to segment the ship water gauge image, detect targets and perform OCR character recognition; since the ship water gauge reading depends on the prediction output of multiple models, the detection result is more reliable. Meanwhile, the invention simplifies the water gauge reading detection flow, avoids redundant operations and improves the detection efficiency of the algorithm model;
3. by loading a pre-trained model and fine-tuning, the invention can adapt to different types of ship water gauges, can easily be transferred to other similar water gauge reading tasks, and has strong generalization capability;
4. the strategy of labeling the integer and decimal parts of the ship water gauge reading avoids complex post-processing logic and retains high accuracy and robustness even when the shooting angle is inclined or the hull is curved; in practical application, as the training data grow, the detection accuracy of the YOLOv5 target detection algorithm on the integer and decimal parts of the water gauge continues to improve; in addition, the Mask RCNN instance segmentation algorithm is introduced to separate the ship from the water area, so the accuracy of the area to be predicted is guaranteed regardless of whether the water body in the actual scene is turbid.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of the flow of detection and identification of the readings of the water gauge of the ship according to the present invention.
FIG. 2 is a diagram showing an example of the water gauge reading label according to the present invention.
Fig. 3 is a comparison of an original picture and a binarized image according to the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Embodiment: referring to FIGS. 1-3, the invention provides a ship water gauge reading detection and identification method based on machine vision, which comprises the following steps:
s1, acquiring video of the ship water gauge area with an unmanned aerial vehicle or a fixed camera, and extracting frames from the video to obtain original image data; to ensure the training effect of the subsequent deep neural network model, the collected data must cover scenes with different illumination, weather and ship types, and the ship water gauge scale and the ship-water boundary line must be clearly visible in the pictures;
s2, labeling the water gauge readings in the original image data with the LabelImg open-source image annotation tool, marking the integer parts of the water gauge readings as the "int" category and the decimal parts as the "float" category for training the subsequent YOLOv5 target detection algorithm, as shown in FIG. 2; and labeling the ship area and the water area with the Labelme image annotation tool using the "polygon" annotation type, an original picture and the corresponding binarized image being shown in FIG. 3;
s3, for the optical character recognition algorithm, pre-training a CRNN model on the open-source optical character recognition data set ICDAR2013, and fine-tuning the model with the labeled ship water gauge reading data set;
s4, based on the water gauge reading target detection data set from S2, clustering the water gauge reading target bounding boxes with the K-Means algorithm to generate three scales, each scale having three aspect ratios (1:1, 1:2 and 2:1), for a total of 9 anchors used to train the YOLOv5 target detection model; the model input resolution is set to 640 x 640 to ensure that the bounding boxes of small water gauge readings still retain feature information after downsampling; and training a Mask RCNN model with the ship/water segmentation data set from S2;
s5, in the actual inference process, using the Mask RCNN segmentation model to predict the position of the ship-water boundary line in the picture, using YOLOv5 to detect the integer region and decimal region of the water gauge, and obtaining the effective reading area of the ship water gauge through filtering logic;
s6, sending the effective reading area into the OCR character recognition model to identify the specific reading; in the post-processing stage, the detection window is set to 10 s, 80% of the video frames within the 10 s are extracted and sent into the algorithm models, and the detection results of the extracted frames are combined by weighted summation to obtain the final water gauge reading result. Besides avoiding the randomness of a single inference and the interference caused by water surface fluctuation, this post-processing matches the conventional practice of manual observation while also saving computation.
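The fusion of per-frame readings in S6 can be sketched as follows. The text describes a weighted summation but does not specify the weights, so uniform weights (a plain average, as in the claims) are assumed here:

```python
def fuse_readings(readings, weights=None):
    """Fuse per-frame water-gauge readings gathered over the 10 s window.

    With no weights this reduces to a plain mean; the actual weighting
    scheme is not specified in the text, so uniform weights are assumed.
    """
    if not readings:
        raise ValueError("no readings detected in the window")
    if weights is None:
        weights = [1.0] * len(readings)
    return sum(r * w for r, w in zip(readings, weights)) / sum(weights)

print(fuse_readings([8.20, 8.24, 8.22, 8.26]))  # ≈ 8.23
```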
In S3, under this labeling strategy, target bounding boxes of different categories have different sizes and aspect ratios, so that the targets can be predicted on different prediction feature maps by the subsequent YOLOv5 target detection algorithm.
In S3, YOLOv5 is an anchor-based target detection algorithm, that is, regression prediction is performed with respect to a priori anchor boxes; its basic principle is to scale the input image to the size required by the model through preprocessing operations, and then use a series of convolutions, batch normalization and activation functions to extract features and downsample the image by factors of 8, 16 and 32; because of the different downsampling rates, large objects in the training set are usually predicted on the small-size prediction feature map, and small objects on the large-size prediction feature map.
Working principle: the detection performance of an anchor-based algorithm is very sensitive to the size, aspect ratio and number of anchor boxes; to improve the detection rate of ship water gauge readings, the K-Means algorithm is adopted to cluster the anchor boxes. In the YOLOv5 target detection algorithm, a target bounding box is usually represented by the coordinates of its upper-left and lower-right corner points, denoted (x1, y1, x2, y2). When clustering target bounding boxes, only the width and height of the boxes are used as features; however, the actually sampled pictures differ in size, so the width and height of each target bounding box must be normalized by the width and height of its picture. The calculation formula is as follows:

w' = (x2 - x1) / W,  h' = (y2 - y1) / H

where W and H are the width and height of the picture;
it is found in practice that if the Euclidean distance of the conventional K-Means clustering algorithm is used as the metric, clusters of large target bounding boxes incur larger errors than clusters of small target bounding boxes during the clustering iterations; therefore the invention computes the distance from each target bounding box to each cluster center using the overlap ratio IoU (Intersection over Union). The calculation formulas of the overlap ratio IoU and the distance d are as follows:

IoU(box, anchor) = intersection(box, anchor) / union(box, anchor)

d(box, anchor) = 1 - IoU(box, anchor)

In the above formulas, intersection(box, anchor) denotes the intersection area of the target bounding box and the anchor box, and union(box, anchor) denotes their union area; it follows that if the target bounding box and the anchor box overlap completely, i.e. IoU = 1, the distance between them is 0. In the iterative process, the distance d is computed repeatedly, each target bounding box is assigned to its nearest anchor box to form clusters, and the mean width and height of all target bounding boxes in each cluster are computed to update the width and height of that anchor box, until no further change occurs. The anchor boxes obtained by K-Means clustering better match the bounding-box aspect ratios of the ship water gauge reading data set, so the integer and decimal regions can be predicted better during the YOLOv5 training stage;
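The clustering procedure described above can be sketched as follows. Since only normalized widths and heights are clustered, both boxes are aligned at the origin when computing IoU — a standard convention for anchor clustering that is assumed here rather than stated in the text:

```python
import random

def iou_wh(box, anchor):
    """IoU of two (width, height) pairs, both anchored at the origin."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union

def kmeans_anchors(boxes, k, iters=100, seed=0):
    """Cluster normalized (w, h) boxes with distance d = 1 - IoU.

    Returns k anchor (w, h) pairs; a toy re-implementation for
    illustration, not the patent's actual code.
    """
    rng = random.Random(seed)
    anchors = rng.sample(boxes, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for b in boxes:
            # minimal distance d = 1 - IoU is the maximal IoU
            i = max(range(k), key=lambda j: iou_wh(b, anchors[j]))
            clusters[i].append(b)
        new = [
            (sum(b[0] for b in c) / len(c), sum(b[1] for b in c) / len(c))
            if c else anchors[i]
            for i, c in enumerate(clusters)
        ]
        if new == anchors:   # converged: assignments no longer change
            break
        anchors = new
    return anchors

# toy example: two obvious size groups -> two anchors
boxes = [(0.10, 0.10), (0.12, 0.10), (0.40, 0.42), (0.42, 0.40)]
print(sorted(kmeans_anchors(boxes, 2), key=lambda a: a[0] * a[1]))
```

On the real data set, three such scales with three aspect ratios each would yield the 9 anchors used in S4 of the embodiment.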
in practical application, the effective reading area obtained from the YOLOv5 target detection model alone often contains errors due to the influence of water color, and the detector is especially prone to false detections when the gauge marks are reflected on the water surface; the Mask RCNN segmentation network is therefore introduced to perform instance segmentation of the ship area and the water area, and the ship-water boundary line is obtained by classifying each pixel of the image. If the YOLOv5 target detection algorithm produces a false detection at the boundary line, the y-axis coordinate of the lower-right corner of the target bounding box lies below the ship-water boundary line; in that case, post-processing logic constrains the lower-right corner of any target bounding box that exceeds the ship-water boundary line back onto the boundary line. Likewise, if the upper-left corner of a bounding box inferred by YOLOv5 lies below the ship-water boundary line, the bounding box is considered an invalid target, and such targets are filtered out directly in the post-processing logic, ensuring that the targets fed into the subsequent OCR character recognition algorithm are exactly the water gauge readings observable at the ship-water boundary line. Through the combined action of multiple models, the false detection rate of ship water gauge readings is greatly reduced; notably, the cascaded models can easily be applied to other similar water gauge reading tasks with good detection results.
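The waterline filtering and clamping rules described above can be sketched as follows, assuming for simplicity a horizontal ship-water boundary at a single y-coordinate (in practice the boundary from Mask RCNN is a per-column curve):

```python
def refine_boxes_with_waterline(boxes, waterline_y):
    """Apply the post-processing rules described above (origin at top-left):
      - a box whose top edge lies below the waterline is invalid -> dropped;
      - a box whose bottom edge crosses the waterline is clamped onto it.
    """
    refined = []
    for (x1, y1, x2, y2) in boxes:
        if y1 >= waterline_y:              # upper-left corner below boundary: false detection
            continue
        refined.append((x1, y1, x2, min(y2, waterline_y)))
    return refined

boxes = [(10, 50, 40, 90), (10, 120, 40, 160), (10, 200, 40, 240)]
print(refine_boxes_with_waterline(boxes, 150))
# -> [(10, 50, 40, 90), (10, 120, 40, 150)]
```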
It will be apparent to those skilled in the art that various modifications and variations can be made to the present invention without departing from the spirit or scope of the invention. Thus, it is intended that the present invention also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.
Claims (3)
1. The ship water gauge reading detection and identification method based on machine vision is characterized by comprising the following steps:
s1, acquiring a picture of the water gauge area of a passing ship through an unmanned aerial vehicle or a wharf-mounted fixed monitoring camera, wherein image acquisition must ensure that the light is sufficient and that the ship water gauge graduations and the ship-water boundary line are clearly visible;
s2, performing instance segmentation of the ship area and the water area by using a Mask-RCNN model to obtain the ship-water boundary line;
s3, with meters (m) as the unit and Arabic numerals representing the integer and decimal parts, respectively predicting the integer part and the decimal part of the ship water gauge scale by using the YOLOv5 target detection algorithm to obtain the coordinate position of each water gauge reading bounding box and its category, wherein the integer part is labeled as the "int" category and the decimal part as the "float" category;
s4, with the upper left corner of the original image as the coordinate origin, respectively finding the target bounding box with the largest center-point y-axis coordinate among the "int" and "float" targets, namely the effective reading area, which is further refined into a more accurate area to be predicted by combining it with the ship-water boundary line obtained in S2;
s5, carrying out character prediction on the area to be predicted through an OCR algorithm; to guarantee the accuracy of the water gauge reading prediction and avoid interference from water surface fluctuation, all water gauge readings detected within 10 seconds are collected during actual inference and their average value is taken as the final prediction result.
2. The ship water gauge reading detection and identification method based on machine vision according to claim 1, wherein in S3, under this labeling strategy, target bounding boxes of different categories have different sizes and aspect ratios, so that the targets can be predicted on different prediction feature maps by the subsequent YOLOv5 target detection algorithm.
3. The ship water gauge reading detection and identification method based on machine vision according to claim 1, wherein in S3, YOLOv5 is an anchor-based target detection algorithm, that is, regression prediction is performed with respect to a priori anchor boxes; its basic principle is to scale the input image to the size required by the model through preprocessing operations, and then use a series of convolutions, batch normalization and activation functions to extract features and downsample the image by factors of 8, 16 and 32; because of the different downsampling rates, large objects in the training set are usually predicted on the small-size prediction feature map, and small objects on the large-size prediction feature map.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311003627.8A CN117037132A (en) | 2023-08-10 | 2023-08-10 | Ship water gauge reading detection and identification method based on machine vision |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117037132A true CN117037132A (en) | 2023-11-10 |
Family
ID=88631228
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311003627.8A Withdrawn CN117037132A (en) | 2023-08-10 | 2023-08-10 | Ship water gauge reading detection and identification method based on machine vision |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117037132A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117437495A (en) * | 2023-12-20 | 2024-01-23 | 武汉理工大学 | Water gauge line identification method and device for ship, electronic equipment and storage device |
CN117854045A (en) * | 2024-03-04 | 2024-04-09 | 东北大学 | Automatic driving-oriented vehicle target detection method |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2021238030A1 (en) | Water level monitoring method for performing scale recognition on the basis of partitioning by clustering | |
CN109993166B (en) | Automatic reading identification method for pointer instrument based on scale searching | |
US20210374466A1 (en) | Water level monitoring method based on cluster partition and scale recognition | |
CN117037132A (en) | Ship water gauge reading detection and identification method based on machine vision | |
CN111160120A (en) | Fast R-CNN article detection method based on transfer learning | |
CN111598098B (en) | Water gauge water line detection and effectiveness identification method based on full convolution neural network | |
CN110781885A (en) | Text detection method, device, medium and electronic equipment based on image processing | |
CN102975826A (en) | Portable ship water gauge automatic detection and identification method based on machine vision | |
CN111652213A (en) | Ship water gauge reading identification method based on deep learning | |
CN107220962B (en) | Image detection method and device for tunnel cracks | |
CN110619328A (en) | Intelligent ship water gauge reading identification method based on image processing and deep learning | |
CN113077392B (en) | High-accuracy automatic reading method for fuzzy photo of pointer table | |
CN112734729B (en) | Water gauge water level line image detection method and device suitable for night light supplement condition and storage medium | |
CN115147418B (en) | Compression training method and device for defect detection model | |
CN112132131A (en) | Measuring cylinder liquid level identification method and device | |
CN114241469A (en) | Information identification method and device for electricity meter rotation process | |
CN115775236A (en) | Surface tiny defect visual detection method and system based on multi-scale feature fusion | |
CN116228780A (en) | Silicon wafer defect detection method and system based on computer vision | |
CN116188756A (en) | Instrument angle correction and indication recognition method based on deep learning | |
CN117557565B (en) | Detection method and device for lithium battery pole piece | |
CN113255555A (en) | Method, system, processing equipment and storage medium for identifying Chinese traffic sign board | |
CN117315670A (en) | Water meter reading area detection method based on computer vision | |
CN116935369A (en) | Ship water gauge reading method and system based on computer vision | |
CN114742849B (en) | Leveling instrument distance measuring method based on image enhancement | |
CN116486212A (en) | Water gauge identification method, system and storage medium based on computer vision |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | | |
WW01 | Invention patent application withdrawn after publication | | |
Application publication date: 2023-11-10