CN113436726B - Automatic lung pathological sound analysis method based on multi-task classification - Google Patents

Automatic lung pathological sound analysis method based on multi-task classification

Info

Publication number
CN113436726B
CN113436726B (application CN202110728236.7A)
Authority
CN
China
Prior art keywords
lung
sound
task
neural network
audio
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110728236.7A
Other languages
Chinese (zh)
Other versions
CN113436726A (en)
Inventor
许静
张建雯
吴彦峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nankai University
Original Assignee
Nankai University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nankai University filed Critical Nankai University
Priority to CN202110728236.7A priority Critical patent/CN113436726B/en
Publication of CN113436726A publication Critical patent/CN113436726A/en
Application granted granted Critical
Publication of CN113436726B publication Critical patent/CN113436726B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H: HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20: ICT specially adapted for medical diagnosis, medical simulation or medical data mining, for computer-aided diagnosis, e.g. based on medical expert systems
    • A: HUMAN NECESSITIES
    • A61: MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B: DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B7/00: Instruments for auscultation
    • A61B7/003: Detecting lung or respiration noise
    • A: HUMAN NECESSITIES
    • A61: MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B: DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B7/00: Instruments for auscultation
    • A61B7/02: Stethoscopes
    • A61B7/04: Electric stethoscopes
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/047: Probabilistic or stochastic networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Public Health (AREA)
  • Medical Informatics (AREA)
  • Theoretical Computer Science (AREA)
  • Molecular Biology (AREA)
  • Data Mining & Analysis (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Evolutionary Computation (AREA)
  • Animal Behavior & Ethology (AREA)
  • Surgery (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Veterinary Medicine (AREA)
  • Acoustics & Sound (AREA)
  • Heart & Thoracic Surgery (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Probability & Statistics with Applications (AREA)
  • Databases & Information Systems (AREA)
  • Pathology (AREA)
  • Epidemiology (AREA)
  • Primary Health Care (AREA)
  • Pulmonology (AREA)
  • Measurement Of The Respiration, Hearing Ability, Form, And Blood Characteristics Of Living Organisms (AREA)

Abstract

The invention discloses an automatic lung pathological sound analysis method based on multi-task classification, relating to the technical field of lung pathology analysis. The method comprises the following step: inputting the extracted audio features into a multi-task classification model based on the convolutional neural network MobileNetV2, where the model comprises an output task that uses the audio features for lung pathological sound identification and an output task that uses the audio features for lung disease prediction. By adopting multi-task learning, the invention implicitly increases the amount of training data and improves the generalization performance of the model through the domain knowledge carried by the multiple labels attached to the same data, thereby improving the prediction accuracy of the MobileNetV2 multi-task classification model. In addition, the lightweight MobileNetV2 multi-task classification model has few parameters and places small demands on the computing power and memory of the training device, so the prediction and classification tasks can be completed on mobile or embedded devices.

Description

Automatic lung pathological sound analysis method based on multi-task classification
Technical Field
The invention relates to the technical field of lung pathology analysis, in particular to an automatic lung pathological sound analysis method based on multi-task classification.
Background
Studies have shown that exacerbations in subjects with pulmonary conditions (e.g., asthma, chronic obstructive pulmonary disease (COPD), emphysema, cystic fibrosis, etc.) are characterized by a combination of symptoms. Breathing defects cause dyspnea (shortness of breath) and coughing. In fact, increased dyspnea and increased sputum purulence and/or volume (which leads to increased coughing) are generally considered the most distinct or major symptoms of an exacerbation of pulmonary disease.
The lung sound signal is a physiological sound signal generated by the human respiratory system as air is exchanged with the outside during ventilation; its generation mechanism is complex, and it is rich in physiological and pathological information. Listening to breath sounds with a stethoscope remains the main method for screening and diagnosing respiratory diseases. However, stethoscope-based diagnosis has several disadvantages: it is highly subjective, cannot provide continuous monitoring, is limited by human hearing and memory, and requires professional medical staff to interpret the auscultation signals. These problems are especially significant in under-resourced areas and during epidemics of respiratory disease. Automated analysis of lung sounds can provide auxiliary diagnosis and reduce the workload of professional medical staff, and is therefore of great significance for intelligent healthcare.
At present, automated analysis of lung sounds mainly comprises two tasks: lung pathological sound identification and lung disease prediction. Lung sounds (breath sounds) are divided into normal sounds and pathological sounds. There are many types of lung pathological sounds, the most common falling into two categories: crackles and wheezes. The main task of lung pathological sound identification is to judge whether pathological sounds exist in a segment of lung sound signal, which helps to screen for early lung diseases; lung disease prediction is to predict, through analysis of the lung sound signal, whether the patient has a lung disease and which type. At present, existing lung sound data sets are small and the noise interference in lung sounds is large, so it is difficult for a single lung sound recognition task to distinguish relevant from irrelevant features; the model generalizes weakly and classifies poorly. Moreover, the network models used are complex and have many parameters, placing large demands on the computing power and memory of the training device, so they need to run on large servers.
The invention patent CN103417241B discloses a diagnostic instrument host, three lung sound probes with acoustic-electric sensors, and a wireless electronic stethoscope. The diagnostic instrument host comprises a computer for diagnosis and a signal amplifier; the signal amplifier is connected to the corresponding interface of the computer through a lead; the three lung sound probes are connected to the signal output terminals of the signal amplifier through leads; and the wireless electronic stethoscope communicates with the corresponding interface of the computer through wireless transmission. Lung sounds are collected for all respiratory diseases, and the simultaneous collection and automatic analysis of lung sounds from multiple regions is important for detecting pathological adventitious lung sounds, greatly facilitating the diagnosis and treatment of patients. That invention can establish a clinically usable means of lung sound feature analysis, applied to the clinical diagnosis and treatment of diseases such as infantile pneumonia, adding an objective diagnostic means for these diseases, with important application prospects for children's health care. However, the method suffers from low computational precision and cannot effectively obtain a lung pathological sound identification result and a lung disease prediction result for the corresponding patient.
An effective solution to the problems in the related art has not been proposed yet.
Disclosure of Invention
In view of the problems in the related art, the invention provides an automatic lung pathological sound analysis method based on multi-task classification, aiming to overcome the following technical problems: existing lung sound data sets are small and the noise interference in lung sounds is large, so a single lung sound recognition task has difficulty distinguishing relevant from irrelevant features, the model generalizes weakly, and classification performance is poor; and the network models used in the prior art are complex, have many parameters, and place large demands on the computing power and memory of training equipment, requiring operation on a large server.
The technical scheme of the invention is realized as follows:
An automatic lung pathological sound analysis method based on multi-task classification comprises the following steps:
inputting the extracted audio features into a multi-task classification model based on the convolutional neural network MobileNetV2, wherein the model comprises an output task that uses the extracted audio features for lung pathological sound identification and an output task that uses the extracted audio features for lung disease prediction, as follows:
The output for the lung pathological sound identification task comprises the following steps:
the features are input into two fully connected layers of sizes 512 and 128 with the ReLU6 activation function, which increases the nonlinearity of the neural network model, and dropout regularization, which prevents overfitting; the calculation of a fully connected layer is:
y_i = W^T x_i + b;
where y_i is the output vector of the fully connected layer, x_i is the input vector of the fully connected layer, and W and b are parameters that the neural network needs to learn. The ReLU activation function is calculated as:
y = max(0, x), i.e. y = x when x > 0 and y = 0 otherwise;
where x is the input of the rectified linear unit (ReLU) activation function and y is its output;
adding a softmax activation function layer to obtain the model's prediction of the lung pathological sound category, and computing the cross-entropy loss of the lung sound identification task from the prediction and the lung sound label, expressed as:
loss_l = weight[class_l] * ( -x[class_l] + log( Σ_j exp(x[j]) ) );
where x is the input vector of the softmax layer, class_l is the label representing the lung pathological sound of the breathing-cycle audio, weight[class_l] is the balance weight of the breathing-cycle label class, and x[j] is the input-vector component corresponding to each category in the softmax layer;
The output for the lung disease prediction task comprises the following steps:
adding a fully connected layer, a ReLU activation function, dropout regularization and a softmax activation function layer to obtain the model's prediction of the patient's disease, and computing the cross-entropy loss of the patient disease classification task, expressed as:
loss_d = weight[class_d] * ( -x[class_d] + log( Σ_j exp(x[j]) ) );
where x is the input vector of the softmax layer, class_d is one of the eight labels indicating the patient's lung disease, weight[class_d] is the balance weight of each class, and x[j] is the input-vector component corresponding to each category in the softmax layer.
Further, the loss function of the multi-task classification model of the convolutional neural network MobileNetV2 is the sum of cross-entropy losses of each task, and the expression is as follows:
loss = loss_l + loss_d
further, the method also comprises the following steps:
the method comprises the steps of collecting lung sound audio data information in advance, preprocessing the lung sound audio data information, unifying breathing period audio segments with different lengths, and using the unified breathing period audio segments as input data of a multitask classification model of a convolutional neural network MobileNet V2;
performing labeling training data, including labeling the type of lung pathological sound and labeling the type of lung diseases;
extracting acoustic features, extracting the Mel frequency spectrogram features of each section of lung sound breathing cycle audio signal, obtaining a spectrogram from the audio signal through short-time Fourier transform, changing the spectrogram into a Mel frequency spectrogram through a Mel scale filter bank, and cutting off a full black empty part to obtain a spectrum feature part;
and obtaining a lung pathological sound identification result of the input respiratory cycle characteristic data and a prediction result of the lung disease of the corresponding patient based on a multi-task classification model of the convolutional neural network MobileNet V2.
Further, the lung sound audio data information preprocessing comprises the following steps:
cutting the lung sound audio data by taking a breathing cycle as a unit;
removing audio noise of the cut lung sound audio data on the basis of a fifth-order Butterworth band-pass filter;
the amplitude of the denoised lung sound audio data is uniformly mapped to the range from -1 to 1 using standard normalization, expressed as:
x_norm = 2 * (x - x_min) / (x_max - x_min) - 1;
and then segmentation and repeated-segment padding are carried out, so that breathing-cycle audio segments of different lengths are unified to a fixed length value and used as input data of the multi-task classification model of the convolutional neural network MobileNetV2.
Further, the acquisition of the spectrogram comprises the following steps:
framing and windowing the lung sound breathing period audio signal;
then Fourier transform is carried out on each frame;
the results of each frame are stacked along another dimension to obtain a spectrogram.
The invention has the beneficial effects that:
the invention relates to a lung pathological sound automatic analysis method based on multitask classification, which comprises the steps of extracting acoustic features by collecting lung sound audio data information in advance, extracting the Mel frequency spectrogram features of each section of lung sound breathing cycle audio signal, obtaining the frequency spectrum feature part, inputting a multitask classification model of a convolutional neural network MobileNetV2 to obtain the lung pathological sound recognition result of the input breathing cycle feature data and the prediction result of lung diseases of a corresponding patient, implicitly increasing the training data quantity by adopting a multitask learning method, improving the generalization performance of the model by the field knowledge of a plurality of label information of the same data, improving the prediction accuracy of the multitask classification model of the convolutional neural network MobileNetV2, using the lightweight multitask classification model of the convolutional neural network MobileNetV2, having fewer parameters and smaller requirements on the calculation capability and the memory size of training equipment, so that the predictive classification task can be done on a mobile or embedded device.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the embodiments are briefly described below. Obviously, the drawings in the following description show only some embodiments of the present invention, and other drawings can be obtained from them by those skilled in the art without creative effort.
FIG. 1 is a flow chart of a method for automated analysis of lung pathology sounds based on multi-task classification according to an embodiment of the present invention;
fig. 2 is a schematic diagram of a neural network architecture of a method for automated analysis of lung pathology sounds based on multitask classification according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments that can be derived by one of ordinary skill in the art from the embodiments given herein are intended to be within the scope of the present invention.
According to an embodiment of the invention, a method for automated analysis of lung pathology sounds based on multi-task classification is provided.
As shown in fig. 1, the method for automatically analyzing lung pathological sounds based on multi-task classification according to an embodiment of the present invention includes the following steps:
collecting lung sound audio data information in advance, preprocessing the lung sound audio data information, unifying breathing period audio segments with different lengths, and using the unified breathing period audio segments as input data of a neural network;
labeling the training data, including labeling the type of lung pathological sound and the type of lung disease;
extracting acoustic features: extracting the Mel spectrogram features of each lung sound breathing-cycle audio signal by obtaining a spectrogram from the audio signal through a short-time Fourier transform, converting the spectrogram into a Mel spectrogram through a Mel-scale filter bank, and cropping away the completely black, empty regions to obtain the effective spectral feature part;
and obtaining a lung pathological sound identification result of the input respiratory cycle characteristic data and a prediction result of lung diseases of a corresponding patient based on a light-weighted multi-task classification model of the convolutional neural network MobileNet V2.
Specifically, the method comprises the following steps:
the method comprises the following steps: and (4) preprocessing data.
1) Cut the lung sound audio data into segments, taking the breathing cycle as the unit. Since the sampling rate of the collected lung sound data set ranges from 4 kHz to 44.1 kHz, the present solution uses down-sampling to normalize all audio to 4 kHz.
2) Because the collected lung sound data contains considerable noise, a fifth-order Butterworth band-pass filter is used to remove audio noise such as heartbeat sounds and background conversation. The Butterworth band-pass filter keeps the frequency response curve in the pass band maximally flat, without ripple, and lets it roll off gradually to zero in the stop band.
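As a sketch of steps 1) and 2), the snippet below resamples a recording to 4 kHz and applies a fifth-order Butterworth band-pass filter with SciPy; the 100 Hz to 1800 Hz pass band and the helper name are assumptions for illustration, since the patent does not state the cut-off frequencies.

import librosa
from scipy.signal import butter, filtfilt

def load_and_denoise(path, target_sr=4000, low_hz=100.0, high_hz=1800.0):
    # Load the breathing-cycle recording and resample it to 4 kHz
    audio, _ = librosa.load(path, sr=target_sr)
    # Fifth-order Butterworth band-pass filter (maximally flat pass band)
    nyquist = target_sr / 2.0
    b, a = butter(N=5, Wn=[low_hz / nyquist, high_hz / nyquist], btype="bandpass")
    # Zero-phase filtering removes heartbeat and background-conversation noise
    return filtfilt(b, a, audio)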
3) The amplitude of the data is uniformly mapped onto the interval from -1 to 1 using standard normalization, expressed as:
x_norm = 2 * (x - x_min) / (x_max - x_min) - 1;
and each dimension of the data is standardized to a specific interval, so that the convergence speed of the gradient descent method based model can be increased.
4) A fixed input length is set, here 8 s, and breathing-cycle audio segments of different lengths are unified to this fixed length by segmenting long cycles and filling short ones with repeated segments; features are then extracted from them and used as input data for the neural network.
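A minimal sketch of steps 3) and 4): the cycle is mapped into [-1, 1] and forced to a fixed 8 s length by truncating long cycles and tiling short ones. The exact normalization formula is not reproduced in the patent text, so min-max scaling is assumed here.

import numpy as np

def normalize_and_pad(audio, sr=4000, target_seconds=8):
    # Min-max scale the amplitude into [-1, 1] (assumed form of the normalization)
    a_min, a_max = audio.min(), audio.max()
    audio = 2.0 * (audio - a_min) / (a_max - a_min + 1e-8) - 1.0
    # Unify every breathing cycle to a fixed 8 s length
    target_len = sr * target_seconds
    if len(audio) >= target_len:
        return audio[:target_len]                     # segment long cycles
    repeats = int(np.ceil(target_len / len(audio)))
    return np.tile(audio, repeats)[:target_len]       # repeat short cycles to fill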
Step two: labeling the training data.
1) The training data are labeled with lung pathological sound labels of four types: normal sound, abnormal sound containing only crackles, abnormal sound containing only wheezes, and abnormal sound containing both crackles and wheezes.
2) The patient corresponding to each lung sound breathing cycle is identified and labeled with a lung disease label of one of eight classes: healthy (no disease), bronchiolitis, lower respiratory tract infection, asthma, chronic obstructive pulmonary disease, bronchiectasis, upper respiratory tract infection, and pneumonia.
Step three: audio features of the data are extracted.
The Mel spectrogram features (mel-spectrogram) of each lung sound breathing-cycle audio signal are extracted, and a spectrogram is obtained from the audio signal through the short-time Fourier transform (STFT). The principle is that the audio signal is divided into frames and windowed, a Fourier transform is performed on each frame, and the results of all frames are stacked along another dimension, yielding a two-dimensional signal similar to an image, namely the spectrogram.
The obtained Mel spectrogram feature maps show that several high-frequency regions of the audio are completely black, which interferes with the neural network's learning of the features, so the completely black, empty part is cropped away to ensure that the neural network learns from the effective spectral feature part.
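A brief librosa-based sketch of this feature-extraction step: STFT, Mel filter bank, conversion to decibels, and cropping of the uniformly black (near-constant) Mel bands. The n_fft, hop_length, n_mels values and the dynamic-range threshold are illustrative assumptions, not values given in the patent.

import numpy as np
import librosa

def mel_features(audio, sr=4000, n_fft=512, hop_length=128, n_mels=64):
    # Short-time Fourier transform + Mel-scale filter bank
    mel = librosa.feature.melspectrogram(y=audio, sr=sr, n_fft=n_fft,
                                         hop_length=hop_length, n_mels=n_mels)
    mel_db = librosa.power_to_db(mel, ref=np.max)
    # Crop rows that are effectively solid black, i.e. Mel bands whose
    # dynamic range over time is negligible and carries no information
    active = (mel_db.max(axis=1) - mel_db.min(axis=1)) > 1.0
    return mel_db[active]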
Step four: the neural network is trained using lung sound data.
As shown in fig. 2, the neural network architecture is a multitask classification model based on a lightweight convolutional neural network MobileNetV2, and a lung pathological sound identification result of input respiratory cycle feature data and a prediction result of a lung disease of a corresponding patient are finally obtained.
Specifically, the neural network architecture comprises:
inputting the extracted audio features, namely Mel-spectral graph features (mel-spectral) graphs into a lightweight network MobileNet V2 module with pre-training weights on a large image data set ImageNet, wherein the step length of a Bottleneck module of a MobileNet V2 module is 1, the size of a convolution kernel is marked in a convolution layer, and then batch normalization and an activation function ReLU6 layer are set, and a ReLU6 activation function represents a common ReLU activation function but limits the maximum output value to 6, so that the low-precision figures of float16/int8 can be used in mobile terminal equipment, and the good numerical resolution can be achieved, and the precision loss can be avoided; followed by a depth separable convolution (depthwise separable convolution), for which the convolution kernel is applied to all input channels, unlike the standard convolution, which first uses a different convolution kernel for each input channel, that is, one convolution kernel for each input channel, and then combines the outputs again using the standard convolution, has an overall effect similar to that of a standard convolution but with a greatly reduced amount of computation and model parameters, followed by batch normalization, the ReLU6 activation function layer, the convolution layer, batch normalization, and the linear activation function, where the linear transformation is used instead of the ReLU6 activation function, and the loss of information by the non-linear activation layer can be avoided. The convolution operation of the bottleeck module increases the number of channels of the picture first and decreases last, in contrast to the usual residual block, in order to extract more channel information. Finally, the output and the original input are subjected to element addition. The Bottleneck module step size is 2, and since the output is not the same as the original output dimension, no element addition is performed.
In addition, the MobileNetV2 module used here removes the final classifier layer of the MobileNet network; the overall framework of the network is shown in Table 1:
TABLE 1. Overall network framework
Input channels | Operation  | t | c    | n | s
3              | Conv2d     | - | 32   | 1 | 2
32             | Bottleneck | 1 | 16   | 1 | 1
16             | Bottleneck | 6 | 24   | 2 | 2
24             | Bottleneck | 6 | 32   | 3 | 2
32             | Bottleneck | 6 | 64   | 4 | 2
64             | Bottleneck | 6 | 96   | 3 | 1
96             | Bottleneck | 6 | 160  | 3 | 2
160            | Bottleneck | 6 | 320  | 1 | 1
320            | Conv2d 1x1 | - | 1280 | 1 | 1
In Table 1 above, each row represents a series of operations repeated n times; the Bottleneck operation is shown in Fig. 2. t is the expansion coefficient of the Bottleneck operation's input channels, i.e., the number of channels in the middle part is t times the number of input channels; n is the number of times the operation is repeated; c is the number of output channels; and s is the stride used the first time the module is applied (all subsequent repetitions use stride 1). Convolution operations whose kernel size is not marked use 3 × 3 kernels.
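In practice, a trunk matching Table 1 can be taken directly from torchvision; the sketch below loads MobileNetV2 with ImageNet weights and keeps only the feature extractor, dropping the classifier as described. The weights argument shown is the newer torchvision API (older versions use pretrained=True instead); the variable names are illustrative.

import torch.nn as nn
import torchvision

# MobileNetV2 feature extractor pre-trained on ImageNet, with the classifier removed
backbone = torchvision.models.mobilenet_v2(weights="IMAGENET1K_V1").features
shared_trunk = nn.Sequential(backbone, nn.AdaptiveAvgPool2d(1), nn.Flatten())
# shared_trunk(x) maps a batch of 3-channel spectrogram images to 1280-d feature vectors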
Because abnormal lung sounds are related to the patient's lung disease, the two tasks share parameters in the MobileNetV2 network module and are learned jointly and in parallel, so that both the differences between the tasks and the connections between them are taken into account.
The model then splits into two outputs. The first output serves the lung pathological sound recognition task: the features continue into two fully connected layers of sizes 512 and 128 with the ReLU6 activation function, which increases the nonlinearity of the neural network model, and dropout regularization, which prevents overfitting. The calculation of a fully connected layer is:
y_i = W^T x_i + b;
where y_i is the output vector of the fully connected layer, x_i is the input vector of the fully connected layer, and W and b are parameters that the neural network needs to learn. The ReLU activation function is calculated as:
y = max(0, x), i.e. y = x when x > 0 and y = 0 otherwise;
where x is the input of the rectified linear unit (ReLU) activation function and y is its output.
Finally, a softmax activation function layer is added to obtain the model's prediction of the lung pathological sound category, and the cross-entropy loss of the lung sound recognition task is computed from the prediction and the lung sound label, expressed as:
loss_l = weight[class_l] * ( -x[class_l] + log( Σ_j exp(x[j]) ) );
where x is the input vector of the softmax layer; class_l is one of the four lung pathological sound labels of the breathing-cycle audio, namely normal sound (no abnormal sound), crackles only, wheezes only, or crackles and wheezes simultaneously; weight[class_l] is the balance weight of the breathing-cycle label category, obtained by negating the proportion of samples of the current category among the total samples, and is used to alleviate the data imbalance caused by there being many normal lung sound samples and few abnormal ones; and x[j] is the input-vector component corresponding to each category in the softmax layer, with j running from 1 to the number of categories, 4.
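A small sketch of how such balance weights can be computed; the reading assumed here is weight_c = 1 - n_c/N (the complement of each class's share of the samples), which gives the rarer abnormal classes larger weights, and the weights are then passed to a standard weighted cross-entropy.

import torch
from collections import Counter

def balance_weights(labels, num_classes):
    """Assumed weighting: 1 minus each class's proportion of the total samples."""
    counts = Counter(labels)
    total = len(labels)
    return torch.tensor([1.0 - counts.get(c, 0) / total for c in range(num_classes)])

# e.g. for the four lung-sound categories (names here are hypothetical):
# w_sound = balance_weights(cycle_labels, num_classes=4)
# loss_l = torch.nn.CrossEntropyLoss(weight=w_sound)(sound_logits, sound_targets)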
The second output serves the lung disease prediction task and uses the same structure: a fully connected layer, a ReLU activation function, dropout regularization and a softmax activation function layer are added to obtain the model's prediction of the patient's disease, and the cross-entropy loss of the patient disease classification task is computed as:
loss_d = weight[class_d] * ( -x[class_d] + log( Σ_j exp(x[j]) ) );
where x is the input vector of the softmax layer; class_d is one of the eight lung disease labels of the patient, namely healthy (no disease), bronchiolitis, lower respiratory tract infection, asthma, chronic obstructive pulmonary disease, bronchiectasis, upper respiratory tract infection, or pneumonia; weight[class_d] is the balance weight of each category, obtained by negating the proportion of samples of the current category among the total samples, and is used to alleviate the data imbalance caused by there being many healthy (non-diseased) samples, few samples of each disease, and large differences in their proportions; and x[j] is the input-vector component corresponding to each category in the softmax layer, with j running from 1 to the number of categories, 8.
The neural network parameters in these two branch architectures are no longer shared, so that the network learns parameters that differ between the two tasks. The loss function of the neural network model is the sum of the cross-entropy losses of the tasks, expressed as:
loss = loss_l + loss_d
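Putting the pieces together, the two heads and the summed loss described above can be sketched as follows. The 512 and 128 layer sizes and the loss = loss_l + loss_d combination follow the text; the final projection layers, the dropout rate, and mirroring the same two-layer structure in the disease head are assumptions of this sketch.

import torch.nn as nn
import torch.nn.functional as F

class MultiTaskLungModel(nn.Module):
    def __init__(self, trunk, feat_dim=1280, n_sounds=4, n_diseases=8, p_drop=0.5):
        super().__init__()
        self.trunk = trunk                      # shared MobileNetV2 feature extractor

        def head(n_out):
            return nn.Sequential(
                nn.Linear(feat_dim, 512), nn.ReLU6(inplace=True), nn.Dropout(p_drop),
                nn.Linear(512, 128), nn.ReLU6(inplace=True), nn.Dropout(p_drop),
                nn.Linear(128, n_out))

        self.sound_head = head(n_sounds)        # 4-way lung pathological sound output
        self.disease_head = head(n_diseases)    # 8-way lung disease output

    def forward(self, x):
        feat = self.trunk(x)                    # parameters shared by both tasks
        return self.sound_head(feat), self.disease_head(feat)

def multitask_loss(sound_logits, disease_logits, y_sound, y_disease, w_sound, w_disease):
    loss_l = F.cross_entropy(sound_logits, y_sound, weight=w_sound)
    loss_d = F.cross_entropy(disease_logits, y_disease, weight=w_disease)
    return loss_l + loss_d                      # loss = loss_l + loss_d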
step five: and (5) performing predictive diagnosis on the examinee.
Once training is complete and the neural network has converged, the updated network parameters can be used for prediction. The lung audio signal of an examinee (a person seeking medical care) is recorded during breathing, one breathing cycle at a time, processed as in Step one to obtain the Mel spectrogram feature map, and input into the neural network, which outputs the prediction result for the examinee's lung pathological sounds.
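A sketch of this prediction step, reusing the hypothetical helpers from the earlier sketches (load_and_denoise, normalize_and_pad, mel_features) and the MultiTaskLungModel object; replicating the single Mel map into a 3-channel image is an assumption made so the input matches the ImageNet-pretrained trunk.

import torch

@torch.no_grad()
def predict(model, wav_path):
    model.eval()
    audio = normalize_and_pad(load_and_denoise(wav_path))
    mel = torch.tensor(mel_features(audio), dtype=torch.float32)
    x = mel.unsqueeze(0).repeat(3, 1, 1).unsqueeze(0)   # 1 x 3 x H x W image-like input
    sound_logits, disease_logits = model(x)
    sound_pred = sound_logits.softmax(-1).argmax(-1).item()
    disease_pred = disease_logits.softmax(-1).argmax(-1).item()
    return sound_pred, disease_pred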
In summary, the above technical solution uses a multi-task classification model based on the lightweight network MobileNetV2 to identify pathological lung sounds and lung diseases. The main innovation of the framework is to exploit the fact that abnormalities in a patient's breath sounds are correlated with the patient's lung disease information in order to perform multi-task learning, and to reduce model complexity by using a lightweight model. The advantages are:
1. the multi-task classification model can effectively improve the lung sound identification accuracy rate for the following reasons:
1) Implicit data augmentation. Multi-task learning effectively increases the amount of training data, and because every task carries some noise, learning two tasks at the same time yields a more general representation. If only lung pathological sound recognition is learned, there is a risk of overfitting; learning lung pathological sound classification and lung disease classification together averages out the noise patterns, so that the model obtains a better feature representation in the parameter-sharing layers.
2) Attention focusing. Since the collected lung sound data is noisy, small in volume, and high-dimensional, it is difficult for the model to distinguish relevant from irrelevant features; multi-task learning helps the model focus on the features that really matter, because the lung disease identification task provides additional evidence for the relevance or irrelevance of features.
3) Eavesdropping. Some features x may be easy to learn for the lung disease identification task but difficult for the lung pathological sound identification task, perhaps because the sound identification task interacts with x in a more complicated way or because other features hinder the learning of x. Multi-task learning allows the model to eavesdrop, i.e., the lung pathological sound identification task learns the features x through the lung disease prediction task.
4) Representation bias. A hypothesis space that performs well on a sufficient number of training tasks will also perform well on new tasks from the same environment, which helps the model generalize to new tasks.
5) Regularization. Multi-task learning plays the same role as regularization by introducing an inductive bias, reducing the risk of overfitting and the model's capacity to fit random noise.
2. The model is based on the lightweight network MobileNetV2: its complexity is low and its parameter count is small, only 13.88 M, so its demands on computing power and memory are low. Training and prediction tasks that originally had to run on a large server can be completed on mobile or embedded devices, and training and prediction speed is increased.
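For reference, the parameter count of any such assembled PyTorch model (e.g. the MultiTaskLungModel sketched earlier) can be checked with the usual idiom below; this is a generic check, not a figure taken from the patent.

def count_parameters_millions(model):
    # Trainable parameters, reported in millions
    return sum(p.numel() for p in model.parameters() if p.requires_grad) / 1e6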
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims (5)

1. A lung pathological sound automatic analysis method based on multitask classification is characterized by comprising the following steps:
inputting the extracted audio features into a multitask classification model of a convolutional neural network MobileNet V2, wherein the multitask classification model of the convolutional neural network MobileNet V2 comprises a task of outputting and using the extracted audio features for lung pathological sound identification and a task of outputting and using the extracted audio features for lung disease prediction, and the method comprises the following steps:
the output is used for lung pathological sound identification task, and the method comprises the following steps:
input to two fully-connected layers of sizes 512 and 128, the ReLU6 activation function, used to increase the nonlinearity of the neural network model, and using the dropout parameter normalization method, used to prevent overfitting, the calculation formula for the fully-connected layers is as follows:
y_i = W^T x_i + b;
wherein y_i is the output vector of the fully connected layer, x_i is the input vector of the fully connected layer, W and b represent parameters that the neural network needs to learn, and the ReLU activation function is expressed as:
y = max(0, x), i.e. y = x when x > 0 and y = 0 otherwise;
wherein x is the input of the linear correction unit ReLU activation function and y is the output of the linear correction unit ReLU activation function;
adding a softmax activation function layer to obtain a prediction result of the model for lung pathological sound category identification, and calculating the cross entropy loss of the lung sound identification task by using the prediction result and the lung sound label, wherein the expression is as follows:
loss_l = weight[class_l] * ( -x[class_l] + log( Σ_j exp(x[j]) ) );
wherein x is the input vector of the softmax layer, class_l is a label representing the lung pathological sound of the breathing-cycle audio, weight[class_l] is the balance weight of the breathing-cycle label class, and x[j] represents the input-vector component corresponding to each category in the softmax layer;
the output is used for a lung disease prediction task, comprising the steps of:
adding a full connection layer, a ReLU activation function, a dropout parameter normalization method and a softmax activation function layer in advance to obtain a prediction result of the model on the patient suffering from the disease, and calculating the cross entropy loss of a patient suffering from the disease classification task, wherein the expression is as follows:
loss_d = weight[class_d] * ( -x[class_d] + log( Σ_j exp(x[j]) ) );
wherein x is the input vector of the softmax layer, class_d is a label indicating the patient's lung disease, weight[class_d] is the balance weight of each class, and x[j] represents the input-vector component corresponding to each category in the softmax layer.
2. The method for the automated analysis of pulmonary pathology sounds based on multitask classification according to claim 1, characterized in that the loss function of the multitask classification model of the convolutional neural network MobileNetV2 is the sum of cross-entropy losses of each task, expressed as follows:
loss = loss_l + loss_d
3. the method for the automated analysis of pulmonary pathology sounds based on multitasking classification according to claim 2, characterized by the further steps of:
the method comprises the steps of collecting lung sound audio data information in advance, preprocessing the lung sound audio data information, unifying breathing period audio segments with different lengths, and using the unified breathing period audio segments as input data of a multitask classification model of a convolutional neural network MobileNet V2;
performing labeling training data, including labeling the type of lung pathological sound and labeling the type of lung diseases;
extracting acoustic features, extracting the Mel frequency spectrogram features of each section of lung sound breathing cycle audio signal, obtaining a spectrogram from the audio signal through short-time Fourier transform, changing the spectrogram into a Mel frequency spectrogram through a Mel scale filter bank, and cutting off a full black empty part to obtain a spectrum feature part;
and obtaining a lung pathological sound identification result of the input respiratory cycle characteristic data and a prediction result of the lung disease of the corresponding patient based on a multi-task classification model of the convolutional neural network MobileNet V2.
4. The method for the automated analysis of lung pathology sound based on multitask classification according to claim 3, characterized in that said lung sound audio data information preprocessing includes the following steps:
cutting the lung sound audio data by taking a breathing cycle as a unit;
removing audio noise of the cut lung sound audio data on the basis of a fifth-order Butterworth band-pass filter;
the amplitude of the denoised lung sound audio data is uniformly mapped to a range from -1 to 1 by using standard normalization, and the data is represented as:
x_norm = 2 * (x - x_min) / (x_max - x_min) - 1;
and then, segmenting and repeating segment filling are carried out, so that the breathing cycle audio segments with different lengths are unified into a fixed length value and are used as input data of a multitask classification model of the convolutional neural network MobileNet V2.
5. The method for the automated analysis of pulmonary pathological sounds based on multitasking classification according to claim 4, characterized in that said acquisition of spectrogram includes the following steps:
framing and windowing the lung sound breathing period audio signal;
then Fourier transform is carried out on each frame;
the results of each frame are stacked along another dimension to obtain a spectrogram.
CN202110728236.7A 2021-06-29 2021-06-29 Automatic lung pathological sound analysis method based on multi-task classification Active CN113436726B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110728236.7A CN113436726B (en) 2021-06-29 2021-06-29 Automatic lung pathological sound analysis method based on multi-task classification

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110728236.7A CN113436726B (en) 2021-06-29 2021-06-29 Automatic lung pathological sound analysis method based on multi-task classification

Publications (2)

Publication Number Publication Date
CN113436726A CN113436726A (en) 2021-09-24
CN113436726B true CN113436726B (en) 2022-03-04

Family

ID=77757694

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110728236.7A Active CN113436726B (en) 2021-06-29 2021-06-29 Automatic lung pathological sound analysis method based on multi-task classification

Country Status (1)

Country Link
CN (1) CN113436726B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114141366B (en) * 2021-12-31 2024-03-26 杭州电子科技大学 Auxiliary analysis method for cerebral apoplexy rehabilitation evaluation based on voice multitasking learning
CN114391827A (en) * 2022-01-06 2022-04-26 普昶钦 Pre-hospital emphysema diagnosis device based on convolutional neural network
CN117059283A (en) * 2023-08-15 2023-11-14 宁波市鄞州区疾病预防控制中心 Speech database classification and processing system based on pulmonary tuberculosis early warning

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109273085A (en) * 2018-11-23 2019-01-25 南京清科信息科技有限公司 The method for building up in pathology breath sound library, the detection system of respiratory disorder and the method for handling breath sound
WO2019229543A1 (en) * 2018-05-29 2019-12-05 Healthy Networks Oü Managing respiratory conditions based on sounds of the respiratory system
CN110739070A (en) * 2019-09-26 2020-01-31 南京工业大学 brain disease diagnosis method based on 3D convolutional neural network
CN111144474A (en) * 2019-12-25 2020-05-12 昆明理工大学 Multi-view, multi-scale and multi-task lung nodule classification method
CN111554319A (en) * 2020-06-24 2020-08-18 广东工业大学 Multichannel cardiopulmonary sound abnormity identification system and device based on low-rank tensor learning
CN112633405A (en) * 2020-12-30 2021-04-09 上海联影智能医疗科技有限公司 Model training method, medical image analysis device, medical image analysis equipment and medical image analysis medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2006037331A1 (en) * 2004-10-04 2006-04-13 Statchip Aps A handheld home monitoring sensors network device

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019229543A1 (en) * 2018-05-29 2019-12-05 Healthy Networks Oü Managing respiratory conditions based on sounds of the respiratory system
CN109273085A (en) * 2018-11-23 2019-01-25 南京清科信息科技有限公司 The method for building up in pathology breath sound library, the detection system of respiratory disorder and the method for handling breath sound
CN110739070A (en) * 2019-09-26 2020-01-31 南京工业大学 brain disease diagnosis method based on 3D convolutional neural network
CN111144474A (en) * 2019-12-25 2020-05-12 昆明理工大学 Multi-view, multi-scale and multi-task lung nodule classification method
CN111554319A (en) * 2020-06-24 2020-08-18 广东工业大学 Multichannel cardiopulmonary sound abnormity identification system and device based on low-rank tensor learning
CN112633405A (en) * 2020-12-30 2021-04-09 上海联影智能医疗科技有限公司 Model training method, medical image analysis device, medical image analysis equipment and medical image analysis medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Pre-trained Convolutional Neural Networks for the …; Valentyn Vaityshyn, Hanna Porieva, Anastasiia Makarenkova; 2019 IEEE; 2019-12-31; full text *
Research on Lung Sound Recognition Methods Based on Deep Learning and Transfer Learning; Du Kang; Collection of Outstanding Chinese Master's Theses; 2021-02-05; full text *

Also Published As

Publication number Publication date
CN113436726A (en) 2021-09-24

Similar Documents

Publication Publication Date Title
CN113436726B (en) Automatic lung pathological sound analysis method based on multi-task classification
Fahad et al. Microscopic abnormality classification of cardiac murmurs using ANFIS and HMM
Lella et al. Automatic COVID-19 disease diagnosis using 1D convolutional neural network and augmentation with human respiratory sound based on parameters: cough, breath, and voice
Ari et al. Detection of cardiac abnormality from PCG signal using LMS based least square SVM classifier
Delgado-Trejos et al. Digital auscultation analysis for heart murmur detection
Singh et al. Short unsegmented PCG classification based on ensemble classifier
Baghel et al. ALSD-Net: Automatic lung sounds diagnosis network from pulmonary signals
CN111370120B (en) Heart diastole dysfunction detection method based on heart sound signals
Maity et al. Transfer learning based heart valve disease classification from Phonocardiogram signal
CN111789629A (en) Breath sound intelligent diagnosis and treatment system and method based on deep learning
Huang et al. Deep learning-based lung sound analysis for intelligent stethoscope
Roy et al. RDLINet: A Novel Lightweight Inception Network for Respiratory Disease Classification Using Lung Sounds
CN111938691B (en) Basic heart sound identification method and equipment
CN113974607A (en) Sleep snore detecting system based on impulse neural network
Joshi et al. AI-CardioCare: Artificial Intelligence Based Device for Cardiac Health Monitoring
CN113449636B (en) Automatic aortic valve stenosis severity classification method based on artificial intelligence
CN215349053U (en) Congenital heart disease intelligent screening robot
Balasubramanian et al. Machine Learning-Based Classification of Pulmonary Diseases through Real-Time Lung Sounds.
Naveen et al. Deep learning based classification of heart diseases from heart sounds
Dhavala et al. An MFCC features-driven subject-independent convolution neural network for detection of chronic and non-chronic pulmonary diseases
Li et al. Adaptive noise cancellation and classification of lung sounds under practical environment
EP4364669A1 (en) Detection a respiratory disease based on chest sounds
Pradhan et al. Cascaded PFLANN Model for Intelligent Health Informatics in Detection of Respiratory Diseases from Speech Using Bio-inspired Computation
Geng et al. Research on Abnormal Lung Sound Recognition and Diagnosis Based on Improved CNN and Transfomer
van Gorp et al. Aleatoric Uncertainty Estimation of Overnight Sleep Statistics Through Posterior Sampling Using Conditional Normalizing Flows

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant