CN112232144A - Personnel overboard detection and identification method based on improved residual error neural network - Google Patents
Personnel overboard detection and identification method based on improved residual error neural network
- Publication number
- Publication number: CN112232144A (application CN202011035521.2A)
- Authority
- CN
- China
- Prior art keywords
- neural network
- sample data
- time
- water
- frequency characteristic
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/41—Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The invention discloses a man-overboard detection and identification method based on an improved residual neural network. On the basis of the ResNet34 residual neural network, an SE (squeeze-and-excitation) module is added to each residual block to form the improved network. Collected audio data of man-overboard and non-man-overboard events are then processed into a two-class time-frequency feature-map dataset and used to train the improved residual network, yielding a residual neural network model with high detection and identification accuracy. Finally, audio data collected in real time is converted into a time-frequency feature map and input to the trained model, which outputs a real-time recognition result. The model unifies the drowning detection and identification stages, replacing most of the conventional processing pipeline with a single neural network while achieving higher accuracy.
Description
Technical Field
The invention belongs to the field of signal processing, and particularly relates to a man overboard detection and identification method.
Background
Reports of people accidentally falling into lakes, reservoirs, and similar bodies of water are, unfortunately, common. According to the World Health Organization, about 372,000 people die from drowning worldwide every year (on average, 42 people every hour), including not only the victims who fell into the water but also would-be rescuers. The golden window for rescue after a person falls into the water is only about 5 minutes, and when such an event occurs suddenly, relying on human lookouts alone often fails to find the victim in time, leading to many casualties. Moreover, in severe weather, visibility is low and a drowning event is difficult to detect visually, which motivates introducing an acoustic-signal-based detection method for the underwater environment; however, the complexity and variability of that environment make acoustic detection and identification of drowning events difficult.
In the traditional field of underwater acoustic target detection and identification, detection and identification are carried out in two separate stages: the processing is complex and the accuracy of each stage is limited, so the final recognition performance is unsatisfactory. Existing drowning detection and identification methods therefore cannot quickly and accurately determine whether a person has fallen into the water.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention provides a man-overboard detection and identification method based on an improved residual neural network. On the basis of the ResNet34 residual neural network, an SE (squeeze-and-excitation) module is added to each residual block to form the improved network. Collected audio data of man-overboard and non-man-overboard events are then processed into a two-class time-frequency feature-map dataset and used to train the improved residual network, yielding a residual neural network model with high detection and identification accuracy. Finally, audio data collected in real time is converted into a time-frequency feature map and input to the trained model, which outputs a real-time recognition result. The model unifies the drowning detection and identification stages, replacing most of the conventional processing pipeline with a single neural network while achieving higher accuracy.
The technical solution adopted by the invention to solve this problem comprises the following steps:
Step 1: place hydrophones in the water and collect audio signals of the surrounding environment. Divide the collected signals into five cases: a person falling into the water; a person falling into the water and struggling; small debris falling into the water; large debris falling into the water; and no object falling into the water. Take the audio signals of the two person-related cases (falling and struggling) as positive sample data, and the audio signals of the three remaining cases as negative sample data.
step 2: performing sliding window slicing processing on each audio signal of the positive sample data and the negative sample data, and then performing short-time Fourier transform to obtain a time-frequency characteristic graph of each audio signal; then, the size of each time-frequency characteristic graph is adjusted to be l1*l2And carrying out pixel value normalization processing; labeling all the time-frequency characteristic graphs after the processing: the label of the time-frequency characteristic graph corresponding to the positive sample data is 0, and the label of the time-frequency characteristic graph corresponding to the negative sample data is 1; the time-frequency characteristic graph corresponding to the labeled positive sample data forms a positive sample data set, and the time-frequency characteristic graph corresponding to the labeled negative sample data forms a negative sample data set;
Step 3: randomly select a% of the time-frequency feature maps in the positive sample dataset as the positive training set and use the remainder as the positive test set, where 50 < a < 100; likewise, randomly select b% of the negative sample dataset as the negative training set and use the remainder as the negative test set, where 50 < b < 100.
Merge the positive and negative training sets and randomly shuffle their order to form the overall training set; merge the positive and negative test sets to form the overall test set.
Step 4: construct the improved residual neural network model.
Step 4-1: build a 5-stage residual network based on ResNet34: stage 1 consists of 2 convolutional layers and 2 batch-normalization layers, and stages 2 to 5 consist of 3, 4, 6, and 3 residual blocks, respectively. Add an SE module to each residual block; the SE module consists of p global average pooling layers and q fully connected layers.
Step 4-2: define the loss function:
loss = -α_t · (1 - p_t)^γ · log(p_t)
where p_t is the probability the improved residual network assigns to the sample's true class (positive or negative); α_t is a weight coefficient, α_t ∈ (0, 1); and γ is a modulation coefficient, γ ∈ (0, 1).
Step 5: train the improved residual neural network model constructed in Step 4 on the overall training set, using the loss function defined in Step 4-2 as the objective and the Adam algorithm as the optimizer, for B rounds (epochs) in total. After each round, evaluate the recognition accuracy of the model on the overall test set, and save the model with the highest accuracy over the B rounds as the optimal model.
Draw a confusion matrix for the optimal model, and compute the precision and recall obtained when testing it on the overall test set.
Step 6: use the optimal model trained in Step 5 as the final detection and identification model. Apply sliding-window slicing to the audio signal acquired by the hydrophone in real time, then apply the short-time Fourier transform to obtain its time-frequency feature map. Input the feature map to the final model, which outputs whether a person has fallen into the water.
Preferably, l1 = 224 and l2 = 224.
Preferably, a = 70 and b = 70.
Preferably, p = 1 and q = 2.
Preferably, B = 100.
The invention has the following beneficial effects. The proposed method integrates the drowning detection and identification stages, replacing most of the processing pipeline with a single neural network and achieving higher accuracy. It also substantially improves precision and recall, reducing both the waste of manpower and material resources caused by low precision (false alarms) and the drowning casualties caused by low recall (missed detections).
Drawings
FIG. 1 is a flow chart of the method of the present invention.
Fig. 2 is a diagram of the residual block improved by the method of the present invention.
FIG. 3 is a final confusion matrix map of the optimal model obtained by the present invention.
Fig. 4 is a training loss curve of an embodiment of the present invention.
FIG. 5 is a training accuracy curve according to an embodiment of the present invention.
FIG. 6 is a test accuracy curve according to an embodiment of the present invention.
Detailed Description
The invention is further illustrated with reference to the following figures and examples.
As shown in FIG. 1, the present invention provides a man-overboard detection and identification method based on an improved residual neural network, which comprises the following steps:
step 1: placing hydrophones in water and collecting audio signals of the surrounding environment; dividing the collected audio signals into 5 conditions, namely falling into water by a person, falling into water by the person and struggling, falling into water by small-sized sundries, falling into water by large-sized sundries and no object falling into water; the method comprises the following steps of taking audio signals of two situations of falling water of people and struggling as positive sample data, and taking audio signals of three situations of falling water of small sundries and falling water of large sundries and no falling water object as negative sample data;
step 2: performing sliding window slicing processing on each audio signal of the positive sample data and the negative sample data, and then performing short-time Fourier transform to obtain a time-frequency characteristic graph of each audio signal; then adjusting the size of each time-frequency feature graph to 224 x 224, and carrying out pixel value normalization processing; labeling all the time-frequency characteristic graphs after the processing: the label of the time-frequency characteristic graph corresponding to the positive sample data is 0, and the label of the time-frequency characteristic graph corresponding to the negative sample data is 1; the time-frequency characteristic graph corresponding to the labeled positive sample data forms a positive sample data set, and the time-frequency characteristic graph corresponding to the labeled negative sample data forms a negative sample data set;
Step 3: randomly select 70% of the time-frequency feature maps in the positive sample dataset as the positive training set and use the remainder as the positive test set; likewise, randomly select 70% of the negative sample dataset as the negative training set and use the remainder as the negative test set.
Merge the positive and negative training sets and randomly shuffle their order to form the overall training set; merge the positive and negative test sets to form the overall test set.
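The per-class 70/30 split and the shuffle of the merged training set can be sketched as follows; this is a minimal illustration with toy lists standing in for feature maps, since the patent does not prescribe an implementation:

```python
import numpy as np

def split_and_shuffle(pos, neg, frac=0.7, seed=0):
    """Split each class frac/(1 - frac) into train/test, then merge and shuffle the training halves."""
    rng = np.random.default_rng(seed)

    def split(data):
        idx = rng.permutation(len(data))   # random order of indices
        cut = int(frac * len(data))
        return [data[i] for i in idx[:cut]], [data[i] for i in idx[cut:]]

    pos_tr, pos_te = split(pos)
    neg_tr, neg_te = split(neg)
    train = pos_tr + neg_tr
    rng.shuffle(train)                     # overall training set in random order
    return train, pos_te + neg_te          # overall test set needs no shuffling

# Toy "datasets": 10 positive and 10 negative items
train, test = split_and_shuffle(list(range(10)), list(range(10, 20)))
```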
Step 4: construct the improved residual neural network model.
Step 4-1: build a 5-stage residual network based on ResNet34: stage 1 consists of 2 convolutional layers and 2 batch-normalization layers, and stages 2 to 5 consist of 3, 4, 6, and 3 residual blocks, respectively. Add an SE module to each residual block; here the SE module consists of 1 global average pooling layer and 2 fully connected layers.
The SE module adaptively recalibrates channel-wise feature responses: it learns to use global information to selectively emphasize informative features and suppress less useful ones, yielding a significant performance improvement over the base network at a very small additional computational cost.
the structure of the residual block added to the SE module is shown in FIG. 2;
Step 4-2: define the loss function using Focal loss:
loss = -α_t · (1 - p_t)^γ · log(p_t)
where p_t is the probability the improved residual network assigns to the sample's true class (positive or negative); α_t is a weight coefficient, α_t ∈ (0, 1), used to down-weight the loss contribution of the over-represented class; and γ is a modulation coefficient, γ ∈ (0, 1), used to down-weight the loss contribution of easily classified samples.
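The Focal-loss formula above can be sketched for a single binary prediction as follows; α_t = 0.25 and γ = 0.5 are stand-in values chosen inside the (0, 1) ranges the method specifies, not values stated in the patent:

```python
import numpy as np

def focal_loss(p, y, alpha_t=0.25, gamma=0.5):
    """Focal loss for one binary prediction.
    p: predicted probability of the positive class; y: true label (1 = positive)."""
    p_t = p if y == 1 else 1.0 - p          # probability assigned to the true class
    return -alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)

# A confidently correct prediction contributes far less loss than a hard one,
# which is exactly the down-weighting of easy samples described above.
easy = focal_loss(0.95, 1)   # p_t = 0.95, well classified
hard = focal_loss(0.30, 1)   # p_t = 0.30, badly classified
```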
Step 5: train the improved residual neural network model constructed in Step 4 on the overall training set, using the loss function defined in Step 4-2 as the objective and the Adam algorithm as the optimizer, for 100 rounds in total. After each round, evaluate the recognition accuracy of the model on the overall test set, and save the model with the highest accuracy over the 100 rounds as the optimal model.
Draw a confusion matrix for the optimal model, as shown in FIG. 3, and compute the precision and recall obtained when testing it on the overall test set. The three visualization curves generated during training and testing are shown in FIG. 4, FIG. 5, and FIG. 6.
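Precision and recall follow directly from the confusion-matrix counts; a small sketch with hypothetical counts (not the patent's actual results) is:

```python
import numpy as np

def precision_recall(conf):
    """conf: 2x2 confusion matrix, rows = true class, columns = predicted class,
    index 0 = positive (person overboard), index 1 = negative."""
    tp, fn = conf[0, 0], conf[0, 1]   # positives correctly found / missed
    fp = conf[1, 0]                   # negatives wrongly flagged as positive
    return tp / (tp + fp), tp / (tp + fn)

# Hypothetical test-set counts, for illustration only
conf = np.array([[90, 10],
                 [5, 95]])
precision, recall = precision_recall(conf)
```

Low precision wastes rescue resources on false alarms, while low recall means missed drowning events, which is why both are reported alongside accuracy.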
Step 6: use the optimal model trained in Step 5 as the final detection and identification model. Apply sliding-window slicing to the audio signal acquired by the hydrophone in real time, then apply the short-time Fourier transform to obtain its time-frequency feature map. Input the feature map to the final model, which outputs whether a person has fallen into the water.
The invention integrates the drowning detection and identification process: a single optimal neural network model is trained on data collected in the underwater acoustic environment. In practical application, the collected signal waveform is preprocessed into a time-frequency feature map and input to the trained model, which quickly outputs an accurate recognition result.
As shown in Table 1, compared with a conventional detector followed by a support-vector-machine classifier, the improved residual neural network model achieves higher accuracy with a simpler processing pipeline; compared with a five-layer convolutional neural network, it has stronger feature-extraction capability and therefore higher accuracy. In the data of Table 1, the ResNet50 and SE-ResNeXt50 (32 × 4d) models score lower than ResNet34 because their complexity is too high for this task: they overfit the training set and ultimately perform poorly on the test set, which also shows that overly complex models are unsuitable here. The comparison in Table 1 further shows that adding an SE module to the original ResNet34 improves recognition accuracy. The method also substantially improves precision and recall, reducing both the resource waste caused by low precision and the casualties caused by low recall, demonstrating the effectiveness and reliability of the improved residual network for drowning detection and identification.
TABLE 1 recognition accuracy of six models
Claims (5)
1. A man-overboard detection and identification method based on an improved residual neural network, characterized by comprising the following steps:
Step 1: place hydrophones in the water and collect audio signals of the surrounding environment; divide the collected signals into five cases: a person falling into the water; a person falling into the water and struggling; small debris falling into the water; large debris falling into the water; and no object falling into the water; take the audio signals of the two person-related cases as positive sample data and the audio signals of the three remaining cases as negative sample data;
Step 2: apply sliding-window slicing to each audio signal of the positive and negative sample data, then apply the short-time Fourier transform to obtain a time-frequency feature map of each slice; resize each map to l1 × l2 and normalize its pixel values; label all the processed maps, with label 0 for maps from positive sample data and label 1 for maps from negative sample data; the labeled positive maps form the positive sample dataset and the labeled negative maps form the negative sample dataset;
Step 3: randomly select a% of the time-frequency feature maps in the positive sample dataset as the positive training set and use the remainder as the positive test set, where 50 < a < 100; randomly select b% of the negative sample dataset as the negative training set and use the remainder as the negative test set, where 50 < b < 100;
merge the positive and negative training sets and randomly shuffle their order to form the overall training set; merge the positive and negative test sets to form the overall test set;
Step 4: construct the improved residual neural network model:
Step 4-1: build a 5-stage residual network based on ResNet34, wherein stage 1 consists of 2 convolutional layers and 2 batch-normalization layers, and stages 2 to 5 consist of 3, 4, 6, and 3 residual blocks, respectively; add an SE module to each residual block, the SE module consisting of p global average pooling layers and q fully connected layers;
Step 4-2: define the loss function:
loss = -α_t · (1 - p_t)^γ · log(p_t)
where p_t is the probability the improved residual network assigns to the sample's true class (positive or negative); α_t is a weight coefficient, α_t ∈ (0, 1); and γ is a modulation coefficient, γ ∈ (0, 1);
Step 5: train the improved residual neural network model constructed in Step 4 on the overall training set, using the loss function defined in Step 4-2 as the objective and the Adam algorithm as the optimizer, for B rounds in total; evaluate the recognition accuracy of the model on the overall test set after each round, and save the model with the highest accuracy over the B rounds as the optimal model;
draw a confusion matrix for the optimal model, and compute the precision and recall obtained when testing it on the overall test set;
Step 6: use the optimal model trained in Step 5 as the final detection and identification model; apply sliding-window slicing to the audio signal acquired by the hydrophone in real time, then apply the short-time Fourier transform to obtain its time-frequency feature map; input the feature map to the final model, which outputs whether a person has fallen into the water.
2. The man-overboard detection and identification method based on an improved residual neural network as claimed in claim 1, wherein l1 = 224 and l2 = 224.
3. The man-overboard detection and identification method based on an improved residual neural network as claimed in claim 1, wherein a = 70 and b = 70.
4. The man-overboard detection and identification method based on an improved residual neural network as claimed in claim 1, wherein p = 1 and q = 2.
5. The man-overboard detection and identification method based on an improved residual neural network as claimed in claim 1, wherein B = 100.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011035521.2A CN112232144A (en) | 2020-09-27 | 2020-09-27 | Personnel overboard detection and identification method based on improved residual error neural network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011035521.2A CN112232144A (en) | 2020-09-27 | 2020-09-27 | Personnel overboard detection and identification method based on improved residual error neural network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112232144A true CN112232144A (en) | 2021-01-15 |
Family
ID=74119359
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011035521.2A Pending CN112232144A (en) | 2020-09-27 | 2020-09-27 | Personnel overboard detection and identification method based on improved residual error neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112232144A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112599123A (en) * | 2021-03-01 | 2021-04-02 | 珠海亿智电子科技有限公司 | Lightweight speech keyword recognition network, method, device and storage medium |
CN114359373A (en) * | 2022-01-10 | 2022-04-15 | 杭州巨岩欣成科技有限公司 | Swimming pool drowning prevention target behavior identification method and device, computer equipment and storage medium |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GB2502982A (en) * | 2012-06-12 | 2013-12-18 | Jeremy Ross Nedwell | Swimming pool entry alarm and swimmer inactivity alarm |
CN111325143A (en) * | 2020-02-18 | 2020-06-23 | 西北工业大学 | Underwater target identification method under unbalanced data set condition |
- 2020
- 2020-09-27: application CN202011035521.2A filed in China, published as CN112232144A, status Pending
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GB2502982A (en) * | 2012-06-12 | 2013-12-18 | Jeremy Ross Nedwell | Swimming pool entry alarm and swimmer inactivity alarm |
CN111325143A (en) * | 2020-02-18 | 2020-06-23 | 西北工业大学 | Underwater target identification method under unbalanced data set condition |
Non-Patent Citations (1)
Title |
---|
JIE HU 等: "Squeeze-and-Excitation Networks", 《IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE》 * |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112599123A (en) * | 2021-03-01 | 2021-04-02 | 珠海亿智电子科技有限公司 | Lightweight speech keyword recognition network, method, device and storage medium |
CN114359373A (en) * | 2022-01-10 | 2022-04-15 | 杭州巨岩欣成科技有限公司 | Swimming pool drowning prevention target behavior identification method and device, computer equipment and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Scheifele et al. | Indication of a Lombard vocal response in the St. Lawrence River beluga | |
CN110245608B (en) | Underwater target identification method based on half tensor product neural network | |
CN108648748B (en) | Acoustic event detection method under hospital noise environment | |
CN106653032B (en) | Based on the animal sounds detection method of multiband Energy distribution under low signal-to-noise ratio environment | |
CN111680706A (en) | Double-channel output contour detection method based on coding and decoding structure | |
CN111179273A (en) | Method and system for automatically segmenting leucocyte nucleoplasm based on deep learning | |
CN112232144A (en) | Personnel overboard detection and identification method based on improved residual error neural network | |
CN108680245A (en) | Whale globefish class Click classes are called and traditional Sonar Signal sorting technique and device | |
CN114155879B (en) | Abnormal sound detection method for compensating abnormal perception and stability by using time-frequency fusion | |
CN113191178B (en) | Underwater sound target identification method based on auditory perception feature deep learning | |
CN111986699B (en) | Sound event detection method based on full convolution network | |
CN115188387B (en) | Effective marine mammal sound automatic detection and classification method | |
CN115798516B (en) | Migratable end-to-end acoustic signal diagnosis method and system | |
CN116386649A (en) | Cloud-edge-collaboration-based field bird monitoring system and method | |
CN113758709A (en) | Rolling bearing fault diagnosis method and system combining edge calculation and deep learning | |
Fristrup et al. | Characterizing acoustic features of marine animal sounds | |
Xie et al. | Acoustic feature extraction using perceptual wavelet packet decomposition for frog call classification | |
CN116884416A (en) | Wild animal audio acquisition and detection system, method, storage medium and electronic equipment | |
CN107886049B (en) | Visibility recognition early warning method based on camera probe | |
CN110322894B (en) | Sound-based oscillogram generation and panda detection method | |
CN114581705A (en) | Fruit ripening detection method and system based on YOLOv4 model and convolutional neural network | |
CN112235727B (en) | Personnel flow monitoring and analyzing method and system based on MAC data | |
CN114882906A (en) | Novel environmental noise identification method and system | |
CN114386572A (en) | Motor multi-signal deep learning detection method | |
CN113571050A (en) | Voice depression state identification method based on Attention and Bi-LSTM |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20210115 |