CN113643258A - Method for detecting loss fault of skirt board at side part of train based on deep learning - Google Patents


Info

Publication number
CN113643258A
CN113643258A (application CN202110925053.4A)
Authority
CN
China
Prior art keywords
convolution
train
skirt
image
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110925053.4A
Other languages
Chinese (zh)
Inventor
邓艳 (Deng Yan)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin Kejia General Mechanical and Electrical Co Ltd
Original Assignee
Harbin Kejia General Mechanical and Electrical Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin Kejia General Mechanical and Electrical Co Ltd filed Critical Harbin Kejia General Mechanical and Electrical Co Ltd
Priority to CN202110925053.4A priority Critical patent/CN113643258A/en
Publication of CN113643258A publication Critical patent/CN113643258A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/0002 Inspection of images, e.g. flaw detection
    • G06T7/0004 Industrial image inspection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Quality & Reliability (AREA)
  • Image Analysis (AREA)

Abstract

A method for detecting a train side skirt board loss fault based on deep learning, belonging to the technical field of train side skirt board loss fault detection. The invention solves the problem of poor generalization capability in traditional fault identification algorithms. High-definition imaging equipment is installed around the train track, and a high-definition image is captured after the train passes the equipment. Taking the position of the axle as a reference, the image of the region where the small skirt board at the side bogie is located and the image of the region where the skirt board at the non-bogie is located are obtained; a trained YOLOF network then performs loss fault detection on the obtained skirt board region images, any skirt board loss fault is uploaded as an alarm, and workers take corresponding action according to the recognition result to ensure the safe operation of the train. The method can be applied to the detection of train side skirt board loss faults.

Description

Method for detecting loss fault of skirt board at side part of train based on deep learning
Technical Field
The invention belongs to the technical field of train side skirt board loss fault detection, and particularly relates to a method for detecting a train side skirt board loss fault based on deep learning.
Background
The skirt board at the side of a motor train unit covers parts such as cover plates, grids and the battery box; it protects these side-mounted parts and plays an important role in the safe running of the train. However, the side skirt board is easily lost while the train runs at high speed, and if the loss is not discovered in time, running safety is seriously endangered. Manual image inspection is ill-suited to this task: motor train units run at high density, the time available for inspection is short, and inspectors are prone to fatigue and omissions in the course of their work, leading to missed and false detections of skirt board loss faults that affect driving safety. To overcome these shortcomings of manual inspection, existing methods identify side skirt board loss faults with image processing algorithms. However, because external factors such as illumination and weather cause the quality of the acquired train images to vary, and traditional image processing algorithms depend strongly on image quality, the generalization capability of such fault identification algorithms is poor.
Disclosure of Invention
The invention aims to solve the problem that the generalization capability of the traditional fault identification algorithm is poor, and provides a method for detecting the loss fault of a skirt board at the side part of a train based on deep learning.
The technical scheme adopted by the invention for solving the technical problems is as follows: a method for detecting a loss fault of a skirt board at the side part of a train based on deep learning comprises the following steps:
step one, acquiring an image of the side of the train and intercepting the image of the region where the skirt board part is located from the acquired image;
step two, taking the image intercepted in step one as the input image and feeding it into a trained YOLOF network;
the YOLOF network structure comprises a Backbone module, an Encoder module and a Head module, wherein after an input image is input into the Backbone module, an output result of the Backbone module is input into the Encoder module, an output result of the Encoder module is input into the Head module, and a fault detection result is output through the Head module;
the Encoder module comprises a first convolution layer, a second convolution layer and 4 serially connected residual units with different void (dilation) rates; the output of the Backbone module passes in turn through the first and second convolution layers of the Encoder module, the output of the second convolution layer is input into the first residual unit, and the output of the second convolution layer is fused with the output of the first residual unit to obtain a fusion result A1;
the fusion result A1 is input into the second residual unit and fused with the output of the second residual unit to obtain a fusion result B1;
the fusion result B1 is input into the third residual unit and fused with the output of the third residual unit to obtain a fusion result C1;
the fusion result C1 is input into the fourth residual unit and fused with the output of the fourth residual unit to obtain a fusion result D1, which is taken as the output of the Encoder module.
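The fusion chain above (the input of each residual unit added to that unit's output, giving A1 through D1 in turn) can be sketched as follows. The residual units are replaced by toy scalar functions, since only the wiring is being illustrated; the element-wise-addition form of the fusion is an assumption, and the real units would be dilated convolutions over feature maps.

```python
def encoder_fusion(x, units):
    """Chain the Encoder's residual units: each stage's input is fused
    (added) with that stage's output, yielding A1, B1, C1, D1 in turn."""
    out = x
    for unit in units:
        out = out + unit(out)
    return out  # D1, the Encoder module's output

# Toy stand-ins for the four residual units with void ratios 1, 3, 5, 7;
# each just scales its input by 0.1 for illustration.
units = [lambda t: 0.1 * t] * 4
d1 = encoder_fusion(1.0, units)
print(round(d1, 4))  # each stage multiplies by 1.1, so 1.1**4 = 1.4641
```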
Further, the specific process of step one is as follows:
an image of the side of the train is acquired with imaging equipment arranged around the train track, and the image of the region where the skirt board part is located is intercepted from the acquired image, taking the position of the axle in the side image as a reference.
Further, the images of the region where the skirt board part is located include the skirt board image at the side bogie and the skirt board image at the side non-bogie.
Further, the void ratio of the first residual unit is 1, the void ratio of the second residual unit is 3, the void ratio of the third residual unit is 5, and the void ratio of the fourth residual unit is 7.
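Serially connecting units with these void (dilation) rates grows the receptive field quickly, which is how the Encoder covers targets of different sizes. A sketch of the growth, assuming one 3 × 3 dilated convolution per residual unit (a simplification of the real unit):

```python
def receptive_field(dilations, kernel=3):
    """Receptive field after each serial dilated convolution.

    With stride 1 throughout, a kxk convolution with dilation d
    enlarges the receptive field by (k - 1) * d pixels.
    """
    rf = 1
    fields = []
    for d in dilations:
        rf += (kernel - 1) * d
        fields.append(rf)
    return fields

# Void ratios 1, 3, 5, 7 from the text.
print(receptive_field([1, 3, 5, 7]))  # [3, 9, 19, 33]
```

The jump from a 3-pixel to a 33-pixel receptive field across the four units is what lets the network see both the small bogie skirt boards and the larger non-bogie ones.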
Further, the structure of the Backbone module is specifically as follows:
starting from its input end, the Backbone module sequentially comprises a convolution layer, a maximum pooling layer, a first convolution block, a second convolution block, a third convolution block and a fourth convolution block, wherein:
the first convolution block comprises 3 serially connected convolution units, the second comprises 4-8, the third comprises 6-36, and the fourth comprises 3; each convolution unit contains 3 convolution layers.
Further, in the Backbone module, each convolution unit of the first convolution block comprises a 64-channel convolution layer with a 1 × 1 kernel, a 64-channel convolution layer with a 3 × 3 kernel, and a 256-channel convolution layer with a 1 × 1 kernel;
each convolution unit of the second convolution block comprises a 128-channel convolution layer with a 1 × 1 kernel, a 128-channel convolution layer with a 3 × 3 kernel, and a 512-channel convolution layer with a 1 × 1 kernel;
each convolution unit of the third convolution block comprises a 256-channel convolution layer with a 1 × 1 kernel, a 256-channel convolution layer with a 3 × 3 kernel, and a 1024-channel convolution layer with a 1 × 1 kernel;
each convolution unit of the fourth convolution block comprises a 512-channel convolution layer with a 1 × 1 kernel, a 512-channel convolution layer with a 3 × 3 kernel, and a 2048-channel convolution layer with a 1 × 1 kernel.
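A quick consistency check on the channel numbers above: every block follows the bottleneck pattern (1 × 1 reduce, 3 × 3, 1 × 1 expand) with an expansion factor of four, the same layout as a ResNet-style backbone.

```python
# Channel triples (1x1 reduce, 3x3, 1x1 expand) of the four convolution
# blocks, copied from the text; all of them satisfy the bottleneck
# convention of a 4x expansion from the reduce width.
blocks = [
    (64, 64, 256),
    (128, 128, 512),
    (256, 256, 1024),
    (512, 512, 2048),
]
ratios = [expand // reduce for reduce, _mid, expand in blocks]
print(ratios)  # [4, 4, 4, 4]
```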
Further, the Head module classifies and regresses the output of the Encoder module. If the Head module's output indicates that a target in the image belongs to the side skirt board loss fault class, its score is greater than or equal to the score threshold, and the minimum enclosing rectangle of the skirt board loss position has a length greater than the set length threshold and a width greater than the set width threshold, then a side skirt board loss fault exists in the input image; otherwise, no side skirt board loss fault exists.
Further, for classification, the Head module adopts the following loss function:

$$L_{fl} = -\alpha (1 - y')^{\gamma}\, y \log y' - (1 - \alpha)\, {y'}^{\gamma}\, (1 - y) \log(1 - y')$$

where $L_{fl}$ is the loss function, $\alpha$ is a balance factor, $\gamma$ is an adjustment factor, $y'$ is the predicted probability after the activation function of the Head module, and $y$ is the true label.
Further, for regression, the Head module adopts the loss function GIoU_loss:

$$IoU = \frac{|A \cap B|}{|A \cup B|}$$

$$GIoU = IoU - \frac{|C \setminus (A \cup B)|}{|C|}$$

$$GIoU\_loss = 1 - GIoU$$

where A is the prediction box, B is the target box, IoU is the intersection-over-union ratio, and C is the minimum enclosing rectangle of the prediction box and the target box.
Further, the training process of the YOLOF network is:
data amplification is performed on historically acquired images of the skirt board region at the side bogie and of the skirt board region at the side non-bogie, and the amplified images are labelled: skirt board loss faults are labelled as positive samples, while water stains, oil stains and other train parts in the images are labelled as negative samples; the YOLOF network is then trained with the labelled images;
training is stopped when the set maximum number of training iterations is reached or the detection accuracy of the YOLOF network no longer increases, yielding the trained YOLOF network.
The invention has the following beneficial effects: the invention provides a method for detecting a train side skirt board loss fault based on deep learning. High-definition imaging equipment is installed around the train track, and a high-definition image is captured after the train passes. Taking the position of the axle as a reference, the image of the region where the small skirt board at the side bogie is located and the image of the region where the skirt board at the non-bogie is located are obtained; a trained YOLOF network then performs loss fault detection on the obtained skirt board region images, any skirt board loss fault is uploaded as an alarm, and workers take corresponding action according to the recognition result to ensure the safe operation of the train.
The method extracts features through neural network layers and can thus exploit image information more fully. Because training the YOLOF network model requires batches of sample images as the training set, the trained parameters reflect the characteristics of the whole batch, which reduces the dependency on the quality of any single image during detection. Moreover, the YOLOF network model designed by the invention contains an Encoder module that covers objects of different sizes by serially connecting several residual units with different void ratios, obtaining features with different receptive fields, so that side skirt board loss faults of multiple sizes can be detected. Even when the acquired image to be detected is of poor quality or the size of the skirt board image changes, the method can still achieve high-precision detection with the YOLOF network model, and therefore has better generalization capability.
Drawings
FIG. 1 is a flow chart of the method of the present invention;
fig. 2 is a network structure diagram of the Backbone module;
fig. 3 is a network structure diagram of the Encoder module.
Detailed Description
It should be noted that, in the present invention, the embodiments disclosed in the present application may be combined with each other without conflict.
Embodiment one: this embodiment is described with reference to fig. 1 and fig. 3. The method for detecting a train side skirt board loss fault based on deep learning is realized by the following steps:
step one, acquiring an image of the side of the train and intercepting the image of the region where the skirt board part is located from the acquired image;
step two, taking the image intercepted in step one as the input image and feeding it into a trained YOLOF network;
the YOLOF network structure comprises a Backbone module, an Encoder module and a Head module, wherein after an input image is input into the Backbone module, an output result of the Backbone module is input into the Encoder module, an output result of the Encoder module is input into the Head module, and a fault detection result is output through the Head module;
the Encoder module comprises a first convolution layer, a second convolution layer and 4 serially connected residual units with different void (dilation) rates; the output of the Backbone module passes in turn through the first and second convolution layers of the Encoder module, the output of the second convolution layer is input into the first residual unit, and the output of the second convolution layer is fused with the output of the first residual unit to obtain a fusion result A1;
the fusion result A1 is input into the second residual unit and fused with the output of the second residual unit to obtain a fusion result B1;
the fusion result B1 is input into the third residual unit and fused with the output of the third residual unit to obtain a fusion result C1;
the fusion result C1 is input into the fourth residual unit and fused with the output of the fourth residual unit to obtain a fusion result D1, which is taken as the output of the Encoder module.
If the output features of the Backbone module were input directly into the Head module, the algorithm could not cope with the large size variation of skirt board targets in this fault detection scene. For the skirt board at the side of the train, the image sizes of the skirt board at the bogie and at the non-bogie differ greatly (the skirt board at the bogie is relatively small), so such an algorithm (i.e. feeding the Backbone output straight into the Head module) detects skirt board faults with low accuracy. The present method designs a YOLOF network containing an Encoder module that covers objects of different sizes by serially connecting several units with different void ratios; this solves the problem of a single receptive field, obtains features with different receptive fields, enables detection of side skirt board loss faults of multiple sizes, lets the network cope well with skirt board size changes, and improves the accuracy of skirt board fault detection.
Adopting automatic image identification in this embodiment improves the efficiency and stability of fault detection. Because different weather and environmental factors give images captured at different times different grey levels, performing side skirt board fault detection with the YOLOF deep learning network structure has two benefits. On one hand, it improves the generalization of the fault detection algorithm and alleviates the problem of external factors such as weather and illumination degrading image quality and thereby fault recognition. On the other hand, the SIMO (single-in, single-out) feature extraction mode of YOLOF has lower algorithmic complexity than a MIMO (multiple-in, multiple-out) mode, so detection efficiency improves when recognizing the large number of skirt board images of a whole train, which is a precondition for meeting the per-train average fault detection time target. Meanwhile, the problems that genuine skirt board faults are relatively rare and that the data set contains many negative samples such as the side water filling port are alleviated.
Embodiment two: this embodiment differs from embodiment one in that: the specific process of step one is as follows:
an image of the side of the train is acquired with imaging equipment arranged around the train track, and the image of the region where the skirt board part is located is intercepted from the acquired image, taking the position of the axle in the side image as a reference.
A line-scan camera is adopted, so that after the train passes the imaging equipment a wide-field, high-precision two-dimensional high-definition image is formed. With the position of the train axle as a reference, the side skirt board is divided into the skirt board at the bogie and the skirt board at the non-bogie; the skirt board images of the two parts are intercepted and identified separately, which reduces the time spent processing the side skirt board images of the whole train and improves identification accuracy.
Embodiment three: this embodiment differs from embodiment one or two in that: the images of the region where the skirt board part is located comprise the skirt board image at the side bogie and the skirt board image at the side non-bogie.
Embodiment four: this embodiment differs from embodiment one in that: the void ratio of the first residual unit is 1, that of the second is 3, that of the third is 5, and that of the fourth is 7.
Embodiment five: this embodiment differs from embodiment one in that: the structure of the Backbone module is as follows:
starting from its input end, the Backbone module sequentially comprises a convolution layer, a maximum pooling layer, a first convolution block, a second convolution block, a third convolution block and a fourth convolution block, wherein:
the first convolution block comprises 3 serially connected convolution units, the second comprises 4-8, the third comprises 6-36, and the fourth comprises 3; each convolution unit contains 3 convolution layers.
In the Backbone module adopted by the present invention, the second and third convolution blocks contain 4 and 6 serially connected convolution units respectively; a schematic diagram of the Backbone module is shown in fig. 2.
Embodiment six: this embodiment differs from embodiment five in that: in the Backbone module, each convolution unit of the first convolution block comprises a 64-channel convolution layer with a 1 × 1 kernel, a 64-channel convolution layer with a 3 × 3 kernel, and a 256-channel convolution layer with a 1 × 1 kernel;
each convolution unit of the second convolution block comprises a 128-channel convolution layer with a 1 × 1 kernel, a 128-channel convolution layer with a 3 × 3 kernel, and a 512-channel convolution layer with a 1 × 1 kernel;
each convolution unit of the third convolution block comprises a 256-channel convolution layer with a 1 × 1 kernel, a 256-channel convolution layer with a 3 × 3 kernel, and a 1024-channel convolution layer with a 1 × 1 kernel;
each convolution unit of the fourth convolution block comprises a 512-channel convolution layer with a 1 × 1 kernel, a 512-channel convolution layer with a 3 × 3 kernel, and a 2048-channel convolution layer with a 1 × 1 kernel.
The working process of the Backbone module is as follows:
(1) apply a 64-channel 7 × 7 convolution with stride 2 to the input image, then a 3 × 3 max pooling with stride 2, obtaining the C1 feature layer;
(2) pass the result of step (1) through the 3 convolution units composed of 64-channel 1 × 1, 64-channel 3 × 3 and 256-channel 1 × 1 convolutions, obtaining the C2 feature layer;
(3) pass the result of step (2) through the 4 convolution units composed of 128-channel 1 × 1, 128-channel 3 × 3 and 512-channel 1 × 1 convolutions, obtaining the C3 feature layer;
(4) pass the result of step (3) through the 6 convolution units composed of 256-channel 1 × 1, 256-channel 3 × 3 and 1024-channel 1 × 1 convolutions, obtaining the C4 feature layer;
(5) pass the result of step (4) through the 3 convolution units composed of 512-channel 1 × 1, 512-channel 3 × 3 and 2048-channel 1 × 1 convolutions, obtaining the C5 feature layer.
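The steps above imply a progressive downsampling that can be sketched numerically. The stem strides (7 × 7 conv and max pool, both stride 2) come from step (1); placing a stride-2 unit at the start of blocks 2-4 follows the standard ResNet-50 convention and is an assumption here, since the text only specifies channel counts for those blocks.

```python
def feature_shapes(h, w):
    """Channel count and spatial size of C1..C5 for an h x w input."""
    shapes = {}
    h, w = h // 2, w // 2          # 7x7 conv, stride 2
    h, w = h // 2, w // 2          # 3x3 max pool, stride 2
    shapes["C1"] = (64, h, w)
    shapes["C2"] = (256, h, w)     # first block keeps this resolution
    h, w = h // 2, w // 2          # assumed stride-2 unit in block 2
    shapes["C3"] = (512, h, w)
    h, w = h // 2, w // 2          # assumed stride-2 unit in block 3
    shapes["C4"] = (1024, h, w)
    h, w = h // 2, w // 2          # assumed stride-2 unit in block 4
    shapes["C5"] = (2048, h, w)
    return shapes

print(feature_shapes(512, 512)["C5"])  # (2048, 16, 16)
```

Under these assumptions C5 is 1/32 of the input resolution, which is the single feature level the YOLOF-style Encoder then processes.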
Embodiment seven: this embodiment differs from embodiment one in that: the Head module classifies and regresses the output of the Encoder module. If the Head module's output indicates that a target in the image belongs to the side skirt board loss fault class, the score is greater than or equal to the score threshold, and the minimum enclosing rectangle of the skirt board loss position has a length greater than the set length threshold and a width greater than the set width threshold, then a side skirt board loss fault exists in the input image; otherwise, no side skirt board loss fault exists.
The length and width thresholds are set according to the size of the actually intercepted skirt board sub-image. Because the sub-image contains components such as the water filling port and the sewage draining port, the thresholds must be larger than those components' length and width in the sub-image, which reduces false alarms caused by cover plate loss of the sewage draining port, water filling port and similar components. The score threshold is 0.7. Finally, the box of the fault position (the coordinates of the top-left and bottom-right points) is mapped to the coordinates of the whole train image to generate the message information.
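The decision rule can be sketched as a simple predicate. The 0.7 score threshold comes from the text; the concrete length and width thresholds below are illustrative assumptions that would in practice be set from the sub-image size, as described above.

```python
def is_loss_fault(score, box_len, box_wid,
                  score_thr=0.7, len_thr=80, wid_thr=40):
    """Report a skirt-board loss fault only when the detection score
    clears the threshold AND the minimum enclosing rectangle of the
    loss region exceeds both size thresholds (so small components like
    a water filling port cannot trigger a false alarm).
    len_thr and wid_thr here are hypothetical pixel values."""
    return score >= score_thr and box_len > len_thr and box_wid > wid_thr

print(is_loss_fault(0.85, 120, 60))  # True: confident, large region
print(is_loss_fault(0.85, 30, 20))   # False: too small, likely a port cover
print(is_loss_fault(0.50, 120, 60))  # False: score below 0.7
```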
Embodiment eight: this embodiment differs from embodiment seven in that: for classification, the Head module adopts the following loss function:

$$L_{fl} = -\alpha (1 - y')^{\gamma}\, y \log y' - (1 - \alpha)\, {y'}^{\gamma}\, (1 - y) \log(1 - y')$$

where $L_{fl}$ is the loss function, $\alpha$ is a balance factor, $\gamma$ is an adjustment factor, $y'$ is the predicted probability after the activation function of the Head module, and $y$ is the true label.
Adjusting the $\gamma$ factor reduces the loss contributed by easily classified samples, so the trained model focuses more on hard and misclassified samples. $\alpha$ is a balance factor; adjusting it compensates for the uneven proportion of positive and negative samples, improving detection accuracy.
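This description matches the standard focal loss; a numeric sketch follows, where the exact binary form and the α, γ values are assumptions reconstructed from that description:

```python
import math

def focal_loss(y_true, y_pred, alpha=0.25, gamma=2.0):
    """Focal loss for one binary sample: gamma down-weights easy,
    confidently classified samples; alpha balances the two classes."""
    if y_true == 1:
        return -alpha * (1.0 - y_pred) ** gamma * math.log(y_pred)
    return -(1.0 - alpha) * y_pred ** gamma * math.log(1.0 - y_pred)

easy = focal_loss(1, 0.95)  # confident correct prediction
hard = focal_loss(1, 0.30)  # badly mis-scored positive
print(hard > 100 * easy)    # the hard sample dominates the total loss
```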
Embodiment nine: this embodiment differs from embodiment seven in that: for regression, the Head module adopts the loss function GIoU_loss:

$$IoU = \frac{|A \cap B|}{|A \cup B|}$$

$$GIoU = IoU - \frac{|C \setminus (A \cup B)|}{|C|}$$

$$GIoU\_loss = 1 - GIoU$$

where A is the prediction box, B is the target box, IoU is the intersection-over-union ratio, and C is the minimum enclosing rectangle of the prediction box and the target box.
GIoU has scale invariance and attends not only to the overlapping region but also to the non-overlapping region, solving the problem that the difference between non-overlapping boxes cannot be evaluated. When the prediction box and the target box overlap completely, GIoU = IoU = 1; when they do not overlap, GIoU decreases as the distance between them grows, approaching -1.
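These two limit cases can be checked numerically for axis-aligned boxes; a minimal sketch:

```python
def box_area(b):
    """Area of an (x1, y1, x2, y2) box; empty boxes contribute 0."""
    x1, y1, x2, y2 = b
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)

def giou_loss(a, b):
    """GIoU_loss = 1 - GIoU for prediction box a and target box b."""
    # Intersection of a and b.
    inter = box_area((max(a[0], b[0]), max(a[1], b[1]),
                      min(a[2], b[2]), min(a[3], b[3])))
    union = box_area(a) + box_area(b) - inter
    iou = inter / union
    # C: minimum enclosing rectangle of both boxes.
    c = box_area((min(a[0], b[0]), min(a[1], b[1]),
                  max(a[2], b[2]), max(a[3], b[3])))
    giou = iou - (c - union) / c
    return 1.0 - giou

print(giou_loss((0, 0, 2, 2), (0, 0, 2, 2)))  # identical boxes -> 0.0
print(giou_loss((0, 0, 1, 1), (2, 2, 3, 3)))  # disjoint boxes -> loss > 1
```

For the disjoint pair, IoU is 0 but the enclosing-rectangle term still penalizes the distance between the boxes, which is exactly the property the text relies on.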
Embodiment ten: this embodiment differs from embodiment one in that: the training process of the YOLOF network is as follows:
data amplification is performed on historically acquired images of the skirt board region at the side bogie and of the skirt board region at the side non-bogie, and the amplified images are labelled: skirt board loss faults are labelled as positive samples, while water stains, oil stains and other train parts in the images are labelled as negative samples; the YOLOF network is then trained with the labelled images;
training is stopped when the set maximum number of training iterations is reached or the detection accuracy of the YOLOF network no longer increases, yielding the trained YOLOF network.
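The stopping rule can be sketched as a small loop over per-epoch accuracies; stopping on the first non-improving epoch (a patience of 1) is an assumption, since the text does not specify how long "no longer increases" is tolerated.

```python
def train_until_converged(accuracies, max_epochs):
    """Return (epochs actually run, best accuracy), stopping early the
    first time the detection accuracy fails to improve, or at max_epochs."""
    best = -1.0
    for epoch, acc in enumerate(accuracies[:max_epochs], start=1):
        if acc <= best:
            return epoch - 1, best  # accuracy no longer increasing
        best = acc
    return min(len(accuracies), max_epochs), best

# Simulated per-epoch accuracies: training halts at epoch 3, where 0.9
# first fails to improve, and never reaches the 0.95 that follows.
print(train_until_converged([0.6, 0.8, 0.9, 0.9, 0.95], max_epochs=10))
```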
The above calculation examples of the present invention merely explain its calculation model and calculation flow in detail and are not intended to limit its embodiments. It will be apparent to those skilled in the art that other variations and modifications can be made on the basis of the above description, and all such obvious variations and modifications fall within the protection scope of the present invention.

Claims (10)

1. A method for detecting a train side skirt board loss fault based on deep learning, characterized in that the method specifically comprises the following steps:
step one, acquiring an image of the side of the train and intercepting the image of the region where the skirt board part is located from the acquired image;
step two, taking the image intercepted in step one as the input image and feeding it into a trained YOLOF network;
the YOLOF network structure comprises a Backbone module, an Encoder module and a Head module, wherein after an input image is input into the Backbone module, an output result of the Backbone module is input into the Encoder module, an output result of the Encoder module is input into the Head module, and a fault detection result is output through the Head module;
the Encoder module comprises a first convolution layer, a second convolution layer and 4 residual error units with different void ratios, wherein the residual error units are connected in series, the output of the backsbone module sequentially passes through the first convolution layer and the second convolution layer of the Encoder module, the output result of the second convolution layer is input into the first residual error unit, and the output result of the second convolution layer is fused with the output result of the first residual error unit to obtain a fused result A1;
inputting the fusion result A1 into the second residual error unit, and fusing the fusion result A1 with the output result of the second residual error unit to obtain a fusion result B1;
inputting the fusion result B1 into the third residual error unit, and fusing the fusion result B1 with the output result of the third residual error unit to obtain a fusion result C1;
inputting the fusion result C1 into the fourth residual unit, fusing the fusion result C1 with the output result of the fourth residual unit to obtain a fusion result D1, and taking the fusion result D1 as the output result of the Encoder module.
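The cumulative fusion chain of claim 1 can be sketched in plain Python. The residual-unit bodies below are hypothetical stand-ins (real units would be dilated-convolution stacks), so only the wiring, not the arithmetic, reflects the claim:

```python
# Sketch of the Encoder's cumulative residual fusion (claim 1):
# each unit's input is added to its own output, and the running
# sum (A1, B1, C1, D1) feeds the next unit in the chain.

def make_residual_unit(dilation):
    def unit(x):
        # Placeholder body; a real unit is a dilated-convolution stack.
        return [v * 0.1 * dilation for v in x]
    return unit

def encoder(features, dilations=(1, 3, 5, 7)):
    fused = features  # output of the Encoder's two projection convs
    for d in dilations:
        unit = make_residual_unit(d)
        # Element-wise fusion of the unit's input with its output.
        fused = [a + b for a, b in zip(fused, unit(fused))]
    return fused
```

Because every unit sees the accumulated result of all previous units, features extracted at different dilation rates are mixed at every stage rather than only at the end.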
2. The method for detecting a missing skirt board fault on the side of a train based on deep learning of claim 1, wherein the specific process of step one is as follows:
an image of the side of the train is acquired with imaging equipment installed alongside the train track, and the image of the region where the skirt board is located is cropped from the acquired image with reference to the position of the axle in the image of the side of the train.
3. The method for detecting a missing skirt board fault on the side of a train based on deep learning of claim 1 or 2, wherein the image of the region where the skirt board is located comprises an image of the skirt board at a bogie on the side of the train and an image of the skirt board at a non-bogie position on the side of the train.
4. The method for detecting a missing skirt board fault on the side of a train based on deep learning of claim 1, wherein the dilation rate of the first residual unit is 1, the dilation rate of the second residual unit is 3, the dilation rate of the third residual unit is 5, and the dilation rate of the fourth residual unit is 7.
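These increasing dilation rates widen the receptive field without adding parameters. Assuming 3 × 3 kernels inside the residual units (as in the YOLOF design; the claim itself does not state the kernel size), the effective kernel extent grows as follows:

```python
def effective_kernel(k, d):
    # Effective spatial extent of a k x k convolution with dilation
    # rate d: the kernel taps are spread d pixels apart.
    return d * (k - 1) + 1

# Dilation rates 1, 3, 5, 7 from claim 4, assuming 3 x 3 kernels:
extents = [effective_kernel(3, d) for d in (1, 3, 5, 7)]
```

The four units thus cover extents of 3, 7, 11 and 15 pixels, letting the Encoder match skirt board regions of varying scale with a single feature level.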
5. The method for detecting a missing skirt board fault on the side of a train based on deep learning of claim 1, wherein the structure of the Backbone module is as follows:
starting from its input end, the Backbone module comprises, in sequence, a convolution layer, a maximum pooling layer, a first convolution block, a second convolution block, a third convolution block and a fourth convolution block, wherein:
the first convolution block comprises 3 serially connected convolution units, the second convolution block comprises 4 to 8 serially connected convolution units, the third convolution block comprises 6 to 36 serially connected convolution units, and the fourth convolution block comprises 3 serially connected convolution units; each convolution unit contains 3 convolution layers.
6. The method for detecting a missing skirt board fault on the side of a train based on deep learning of claim 5, wherein, in the Backbone module, each convolution unit of the first convolution block comprises a 64-channel convolution layer with a 1 × 1 kernel, a 64-channel convolution layer with a 3 × 3 kernel and a 256-channel convolution layer with a 1 × 1 kernel;
each convolution unit of the second convolution block comprises a 128-channel convolution layer with a 1 × 1 kernel, a 128-channel convolution layer with a 3 × 3 kernel and a 512-channel convolution layer with a 1 × 1 kernel;
each convolution unit of the third convolution block comprises a 256-channel convolution layer with a 1 × 1 kernel, a 256-channel convolution layer with a 3 × 3 kernel and a 1024-channel convolution layer with a 1 × 1 kernel;
each convolution unit of the fourth convolution block comprises a 512-channel convolution layer with a 1 × 1 kernel, a 512-channel convolution layer with a 3 × 3 kernel and a 2048-channel convolution layer with a 1 × 1 kernel.
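The channel plan of claims 5 and 6 matches a ResNet-style bottleneck backbone. A minimal configuration sketch (the table and names below are illustrative, not from the patent) shows that at the low ends of the claimed ranges, with 4 and 6 units in the middle blocks, the backbone has exactly the 49 convolution layers of ResNet-50:

```python
# Hypothetical configuration table for the Backbone of claims 5-6.
# Each entry: (number of serially connected units, bottleneck
# channels, output channels).
BLOCKS = [
    (3, 64, 256),
    (4, 128, 512),    # claim 5 allows 4 to 8 units
    (6, 256, 1024),   # claim 5 allows 6 to 36 units
    (3, 512, 2048),
]

def unit_layers(mid, out):
    # One convolution unit per claim 6: 1x1 reduce, 3x3, 1x1 expand.
    return [("1x1", mid), ("3x3", mid), ("1x1", out)]
```

With these unit counts the four blocks contribute 48 convolution layers; adding the stem convolution gives 49, the convolutional depth of ResNet-50, which the claimed ranges (up to 8 and 36 units) extend toward deeper ResNet variants.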
7. The method as claimed in claim 1, wherein the Head module performs classification and regression on the output of the Encoder module; if the output of the Head module indicates that a target in the image belongs to the missing-skirt-board fault class, its score is greater than or equal to a set score threshold, and the minimum bounding rectangle of the missing-skirt-board location has a length greater than a set length threshold and a width greater than a set width threshold, the input image is determined to contain a missing skirt board fault; otherwise, no missing skirt board fault exists.
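The decision rule of claim 7 reduces to a few comparisons. The sketch below assumes a hypothetical detection tuple format and class name, neither of which is specified in the patent:

```python
def skirt_board_missing(detections, score_thr, len_thr, wid_thr):
    # detections: hypothetical (class_name, score, (x1, y1, x2, y2))
    # tuples from the Head module; "skirt_missing" is an illustrative
    # class name. Returns True when any detection satisfies all the
    # threshold conditions of claim 7.
    for cls, score, (x1, y1, x2, y2) in detections:
        if (cls == "skirt_missing"
                and score >= score_thr
                and (x2 - x1) > len_thr     # bounding-rectangle length
                and (y2 - y1) > wid_thr):   # bounding-rectangle width
            return True
    return False
```

The size thresholds act as a post-filter, discarding low-confidence or implausibly small boxes that would otherwise trigger false alarms.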
8. The method for detecting a missing skirt board fault on the side of a train based on deep learning of claim 7, wherein the loss function used by the Head module for classification is:
L_fl = -α·y·(1 - y')^γ·log(y') - (1 - α)·(1 - y)·(y')^γ·log(1 - y')
wherein L_fl is the loss function, α is a balance factor, γ is a modulating factor, y' is the probability of the detection result after the activation function of the Head module, and y is the probability of the true label.
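A plain-Python sketch of this binary focal loss follows; the α and γ defaults below are the common values from the Focal Loss paper, not values fixed by the patent:

```python
import math

def focal_loss(y_pred, y_true, alpha=0.25, gamma=2.0):
    # Binary focal loss as described in claim 8: y_pred is y', the
    # probability after the Head's activation; y_true is the label y
    # in {0, 1}. alpha balances positive vs negative samples, gamma
    # down-weights easy (well-classified) examples.
    return (-alpha * y_true * (1.0 - y_pred) ** gamma * math.log(y_pred)
            - (1.0 - alpha) * (1.0 - y_true) * y_pred ** gamma
            * math.log(1.0 - y_pred))
```

With γ = 0 and α = 0.5 this reduces to half the ordinary cross-entropy; with γ > 0, confidently correct predictions contribute almost nothing, which is why the loss suits the heavily imbalanced positive/negative labeling of claim 10.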
9. The method for detecting a missing skirt board fault on the side of a train based on deep learning of claim 7, wherein the loss function GIoU_loss used by the Head module for regression is:
IoU = |A ∩ B| / |A ∪ B|
GIoU = IoU - |C \ (A ∪ B)| / |C|
GIoU_loss = 1 - GIoU
where A is the prediction box, B is the target box, IoU is the intersection-over-union of A and B, and C is the minimum bounding rectangle of the prediction box and the target box.
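For axis-aligned boxes the three formulas of claim 9 can be computed directly. The sketch below assumes (x1, y1, x2, y2) box coordinates, a representation the patent does not specify:

```python
def giou_loss(a, b):
    # a, b: axis-aligned boxes as (x1, y1, x2, y2), per claim 9.
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    # |A ∩ B|: overlap area (zero when the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter          # |A ∪ B|
    iou = inter / union
    # C: minimum bounding rectangle enclosing both boxes.
    area_c = ((max(ax2, bx2) - min(ax1, bx1))
              * (max(ay2, by2) - min(ay1, by1)))
    giou = iou - (area_c - union) / area_c   # |C \ (A ∪ B)| / |C|
    return 1.0 - giou
```

Unlike plain IoU, the GIoU term stays informative for non-overlapping boxes: the further apart the prediction and target, the larger the enclosing rectangle C and the larger the loss.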
10. The method for detecting a missing skirt board fault on the side of a train based on deep learning of claim 1, wherein the training process of the YOLOF network is as follows:
data augmentation is performed on historically acquired images of the skirt board region at bogies on the side of the train and at non-bogie positions on the side of the train; the augmented images are annotated, with a missing skirt board labeled as a positive sample and water stains, oil stains and other train parts labeled as negative samples; the YOLOF network is then trained with the annotated images;
training stops when the set maximum number of training iterations is reached or the detection accuracy of the YOLOF network no longer improves, yielding the trained YOLOF network.
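The stopping rule of claim 10 (maximum iterations reached, or accuracy no longer increasing) can be sketched as follows; the patience parameter is an assumption, since the claim does not say over how many rounds "no longer increases" is judged:

```python
def should_stop(accuracies, max_iters, patience=1):
    # accuracies: per-round detection accuracy of the network so far.
    # Stop when the maximum number of rounds is reached, or when the
    # best accuracy of the last `patience` rounds fails to beat the
    # best accuracy seen before them.
    if len(accuracies) >= max_iters:
        return True
    if len(accuracies) > patience:
        recent_best = max(accuracies[-patience:])
        prior_best = max(accuracies[:-patience])
        return recent_best <= prior_best
    return False
```

A larger patience tolerates brief plateaus before declaring that accuracy has stopped improving.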
CN202110925053.4A 2021-08-12 2021-08-12 Method for detecting loss fault of skirt board at side part of train based on deep learning Pending CN113643258A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110925053.4A CN113643258A (en) 2021-08-12 2021-08-12 Method for detecting loss fault of skirt board at side part of train based on deep learning


Publications (1)

Publication Number Publication Date
CN113643258A true CN113643258A (en) 2021-11-12

Family

ID=78421193

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110925053.4A Pending CN113643258A (en) 2021-08-12 2021-08-12 Method for detecting loss fault of skirt board at side part of train based on deep learning

Country Status (1)

Country Link
CN (1) CN113643258A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115272850A (en) * 2022-07-20 2022-11-01 哈尔滨市科佳通用机电股份有限公司 Railway wagon BAB type brake adjuster pull rod head breaking fault identification method

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112233096A (en) * 2020-10-19 2021-01-15 哈尔滨市科佳通用机电股份有限公司 Vehicle apron board fault detection method
CN112418334A (en) * 2020-11-26 2021-02-26 哈尔滨市科佳通用机电股份有限公司 Method for identifying deformation fault of skirtboard grating of railway bullet train
CN112465818A (en) * 2020-12-18 2021-03-09 哈尔滨市科佳通用机电股份有限公司 Method for detecting foreign matter fault of apron board


Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
HAMID REZATOFIGHI et al.: "Generalized Intersection over Union: A Metric and A Loss for Bounding Box Regression", arXiv *
KAIMING HE et al.: "Deep Residual Learning for Image Recognition", arXiv *
QIANG CHEN et al.: "You Only Look One-level Feature", arXiv *
TSUNG-YI LIN et al.: "Focal Loss for Dense Object Detection", arXiv *
迷路在代码中: "Analyzing the Focal Loss function: eliminating class imbalance + mining", CSDN *
Zhengzhou Railway Bureau: "High-Speed Railway EMUs", China Railway Publishing House, 31 July 2012


Similar Documents

Publication Publication Date Title
CN110321923B (en) Target detection method, system and medium for fusion of different-scale receptive field characteristic layers
CN109840521B (en) Integrated license plate recognition method based on deep learning
CN111079747B (en) Railway wagon bogie side frame fracture fault image identification method
CN107316007B (en) Monitoring image multi-class object detection and identification method based on deep learning
CN104809443B (en) Detection method of license plate and system based on convolutional neural networks
CN111784633B (en) Insulator defect automatic detection algorithm for electric power inspection video
CN107316001A (en) Small and intensive method for traffic sign detection in a kind of automatic Pilot scene
CN106845383A (en) People's head inspecting method and device
CN105844621A (en) Method for detecting quality of printed matter
CN111091544B (en) Method for detecting breakage fault of side integrated framework of railway wagon bogie
CN105574550A (en) Vehicle identification method and device
CN109726717A (en) A kind of vehicle comprehensive information detection system
CN115965915B (en) Railway wagon connecting pull rod breaking fault identification method and system based on deep learning
CN116342894B (en) GIS infrared feature recognition system and method based on improved YOLOv5
CN106407951A (en) Monocular vision-based nighttime front vehicle detection method
CN106650752A (en) Vehicle body color recognition method
CN108648210B (en) Rapid multi-target detection method and device under static complex scene
CN115424128A (en) Fault image detection method and system for lower link of freight car bogie
CN107247967A (en) A kind of vehicle window annual test mark detection method based on R CNN
CN105989600A (en) Characteristic point distribution statistics-based power distribution network device appearance detection method and system
CN113643258A (en) Method for detecting loss fault of skirt board at side part of train based on deep learning
CN112418334B (en) Method for identifying deformation fault of skirtboard grating of railway bullet train
CN109284752A (en) A kind of rapid detection method of vehicle
CN112613560A (en) Method for identifying front opening and closing damage fault of railway bullet train head cover based on Faster R-CNN
CN108647679B (en) Car logo identification method based on car window coarse positioning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20211112