CN117268796A - Vehicle fault acoustic event detection method - Google Patents


Info

Publication number
CN117268796A
CN117268796A (application CN202311531071.XA)
Authority
CN
China
Prior art keywords
audio
fault
sound source
correlation
network
Prior art date
Legal status
Granted
Application number
CN202311531071.XA
Other languages
Chinese (zh)
Other versions
CN117268796B (en)
Inventor
魏建国
阿颖
张鸿程
卢雨飞
路文焕
Current Assignee
Tianjin University
Original Assignee
Tianjin University
Priority date
Filing date
Publication date
Application filed by Tianjin University filed Critical Tianjin University
Priority to CN202311531071.XA priority Critical patent/CN117268796B/en
Publication of CN117268796A publication Critical patent/CN117268796A/en
Application granted granted Critical
Publication of CN117268796B publication Critical patent/CN117268796B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G01MEASURING; TESTING
    • G01MTESTING STATIC OR DYNAMIC BALANCE OF MACHINES OR STRUCTURES; TESTING OF STRUCTURES OR APPARATUS, NOT OTHERWISE PROVIDED FOR
    • G01M17/00Testing of vehicles
    • G01M17/007Wheeled or endless-tracked vehicles
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01SRADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S5/00Position-fixing by co-ordinating two or more direction or position line determinations; Position-fixing by co-ordinating two or more distance determinations
    • G01S5/18Position-fixing by co-ordinating two or more direction or position line determinations; Position-fixing by co-ordinating two or more distance determinations using ultrasonic, sonic, or infrasonic waves
    • G01S5/22Position of source determined by co-ordinating a plurality of position lines defined by path-difference measurements
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10Complex mathematical operations
    • G06F17/14Fourier, Walsh or analogous domain transformations, e.g. Laplace, Hilbert, Karhunen-Loeve, transforms
    • G06F17/141Discrete Fourier transforms
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • G06N3/0442Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/18Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being spectral information of each sub-band
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/27Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique
    • G10L25/30Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique using neural networks
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Signal Processing (AREA)
  • Computing Systems (AREA)
  • Human Computer Interaction (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Biomedical Technology (AREA)
  • Acoustics & Sound (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Mathematical Analysis (AREA)
  • Computational Mathematics (AREA)
  • Discrete Mathematics (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Databases & Information Systems (AREA)
  • Algebra (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Measurement Of Mechanical Vibrations Or Ultrasonic Waves (AREA)

Abstract

The invention discloses a vehicle fault acoustic event detection method. Fault audio is obtained based on a mirror-image source model algorithm; generalized cross-correlation-phase transform (GCC-PHAT) features are extracted with the generalized cross-correlation method, and log-Mel spectrum features are extracted by taking the logarithm of the Mel spectrum. Feature fusion is carried out by a multi-channel branched convolutional recurrent network, so that the GCC-PHAT features assist the audio-segment classification result. The method has the advantages of a high recognition rate, accurate localization, easy training, and strong robustness, and meets the requirement of real-time diagnosis in vehicle fault acoustic event detection.

Description

Vehicle fault acoustic event detection method
Technical Field
The invention belongs to the technical field of acoustic event detection and positioning, and particularly relates to a vehicle fault acoustic event detection method.
Background
An automobile is a complex system composed of a drive system, chassis, body, and electrical equipment, and its parts may lose specific functions due to aging, wear, or external collision. When a component is worn or deformed, the worn or faulty point rubs against or impacts other structures, generating a pulse signal with a characteristic frequency. The current mainstream acoustic fault-diagnosis methods collect vibration signals while the faulty equipment operates and identify the fault location and fault type through signal processing, machine learning, and other analysis methods. However, these methods place strong demands on the background knowledge of fault-detection personnel and on the cost of detection equipment, and the many uncertainty factors in machine operation make diagnosis more complex. Consequently, a vehicle producing abnormal noise generally must travel to a designated inspection site for fault detection, so the need to monitor the vehicle's state in real time during daily driving cannot be met.
Disclosure of Invention
Aiming at the difficulty traditional acoustic methods have in diagnosing vehicle faults in real time, the invention provides a multi-channel, multi-feature vehicle fault acoustic event detection method. Fault audio is obtained based on a mirror-image source model (Image Source Model, ISM) algorithm; generalized cross-correlation-phase transform features F_GCC are extracted with the generalized cross-correlation method (Generalized Cross-Correlation, GCC); log-Mel spectrum features F_LM are extracted by taking the logarithm of the Mel spectrum (Log-Mel spectrum); and feature fusion is carried out by a multi-channel branched convolutional recurrent network, so that the F_GCC (azimuth) features assist the audio-segment classification result. The method has the advantages of a high recognition rate, accurate localization, easy training, and strong robustness, and meets the requirement of real-time diagnosis in vehicle fault acoustic event detection.
The aim of the invention is achieved by the following technical scheme.
A vehicle fault acoustic event detection method comprising the steps of:
S1, the automobile cabin is treated as a three-dimensional cuboid simulation space; one corner of the simulation space is taken as the origin, and a C-channel microphone array is placed at coordinates (x_m, y_m, z_m) relative to the origin; a sound source is placed at a random position around the microphone array, its coordinates relative to the origin being (x_s, y_s, z_s); the source coordinates (x_s, y_s, z_s) are changed by adjusting the position of the sound source, and after each adjustment the microphone array is simulated according to the mirror-image source model (Image Source Model, ISM) algorithm to obtain the fault audio x_i(n) of the sound source on channel i:
$$x_i(n) = s(n) * h_i(n) \qquad (1)$$
wherein s(n) is the original fault audio emitted by the sound source (the original fault audio refers to the original signal before propagation), and h_i(t) is the impulse response of the sound source propagating through the simulation space to the i-th microphone;
the impulse response is:

$$h_i(t) = \sum_{k} \frac{\beta^{o_k}}{4\pi d_k}\, w\!\left(t - \frac{d_k}{v}\right), \qquad d_k = \lVert r_k - m_i \rVert \qquad (2)$$

wherein r_k is the coordinate, relative to the origin, of the k-th mirror image of the sound source s(n); o_k is the reflection order of the mirror sound source at coordinate r_k; β ∈ [0, 1] is the absorption coefficient of the wall; w is the window function; m_i is the position of the i-th microphone; v is the sound speed; and t is the arrival time corresponding to the n-th sampling point;
the absorption coefficient β, the maximum reflection order O, and the impulse-response length N_h can be calculated by the following formulas (3)-(5):

$$\beta = \sqrt{1 - \frac{0.161\, V_{room}}{S\, T_{60}}} \qquad (3)$$

$$O = \left\lceil \frac{v\, T_{60}}{2\, r_{\min}} \right\rceil \qquad (4)$$

$$N_h = \lceil T_{60}\, F \rceil \qquad (5)$$

wherein v is the sound speed, T_60 is the reverberation time, r_min is the minimum outer-sphere radius of the simulation space, V_room is the volume of the simulation space, S is the surface area of the simulation space, and F is the sampling rate.
The window function w(t) is as in formula (6):

$$w(t) = \begin{cases} \dfrac{1}{2}\left(1 + \cos\dfrac{2\pi t}{T_w}\right), & |t| \le \dfrac{T_w}{2} \\ 0, & \text{otherwise} \end{cases} \qquad (6)$$

wherein T_w is the window width.
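Under stated assumptions (uniform wall absorption, nearest-sample delays instead of the fractional-delay window of formula (6), and the Example 2 geometry as defaults), step S1 and formulas (1)-(5) can be sketched in Python with NumPy. The function names and the image-source enumeration below are illustrative, not the patent's implementation:

```python
import numpy as np

def ism_impulse_response(room, src, mic, beta=0.9, max_order=2,
                         fs=12000, c=340.0, t60=0.3):
    """Simplified mirror-image source model, cf. formula (2): uniform wall
    absorption beta, nearest-sample delays instead of the windowed
    fractional delay of formula (6)."""
    Lx, Ly, Lz = room
    n_h = int(np.ceil(t60 * fs))          # impulse-response length, cf. (5)
    h = np.zeros(n_h)
    rng = range(-max_order, max_order + 1)
    for px in (0, 1):
      for py in (0, 1):
        for pz in (0, 1):
          for mx in rng:
            for my in rng:
              for mz in rng:
                # position of the image source for this reflection combination
                ix = (1 - 2 * px) * src[0] + 2 * mx * Lx
                iy = (1 - 2 * py) * src[1] + 2 * my * Ly
                iz = (1 - 2 * pz) * src[2] + 2 * mz * Lz
                # total number of wall reflections for this image
                order = (abs(mx - px) + abs(mx) + abs(my - py) + abs(my)
                         + abs(mz - pz) + abs(mz))
                d = np.sqrt((ix - mic[0]) ** 2 + (iy - mic[1]) ** 2
                            + (iz - mic[2]) ** 2)
                n = int(round(fs * d / c))
                if n < n_h:
                    h[n] += beta ** order / (4.0 * np.pi * d)
    return h

def simulate_fault_audio(s, room, src, mic, **kw):
    """Formula (1): convolve the original fault audio with the impulse response."""
    return np.convolve(s, ism_impulse_response(room, src, mic, **kw))
```

The direct path (order 0, shortest distance) dominates the response, so the strongest tap lands at the direct-path delay.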
S2, the log-Mel spectrum features F_LM and the generalized cross-correlation-phase transform features F_GCC are obtained based on the fault audio W.
In S2, extracting the log-Mel spectrum features F_LM comprises the following steps:
First, pre-emphasis, framing, and windowing operations are applied to the fault audio W, and the spectrum X_i(k) is obtained through the Fourier transform, whose formula is as follows:

$$X_i(k) = \sum_{n=0}^{N-1} x_i(n)\, \mathrm{ham}(n)\, e^{-j 2\pi n k / K}, \qquad 0 \le k \le K - 1 \qquad (7)$$

wherein x_i(n) is the signal at the n-th sampling point of the i-th channel of the fault audio after the pre-emphasis, framing, and windowing operations, ham(n) is the Hamming window function, N is the frame length of the fault audio after those operations, K is the discrete Fourier transform length, j is the imaginary unit, and k numbers the sampling points within the Fourier transform interval.
Then the modulus of X_i(k) is squared and passed through a Mel-scale triangular filter bank to obtain the Mel spectrum, and finally the logarithm of the Mel spectrum is taken to obtain the log-Mel spectrum features F_LM:

$$F_{LM}(i, m) = \log \mathrm{Mel}(i, m) = \log\!\left(\sum_{k} \lvert X_i(k) \rvert^2 H_m(k)\right) \qquad (8)$$

wherein Mel(i, m) is the Mel spectrum and H_m(k) is the m-th Mel-scale triangular filter.
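A minimal NumPy sketch of the log-Mel extraction of formulas (7)-(8). The hop size and the small constant added before the logarithm are assumptions not given in the text:

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_mels=64, k_dft=256, fs=12000):
    """Mel-scale triangular filters H_m(k) of formula (8)."""
    pts = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(fs / 2), n_mels + 2))
    bins = np.floor((k_dft + 1) * pts / fs).astype(int)
    H = np.zeros((n_mels, k_dft // 2 + 1))
    for m in range(1, n_mels + 1):
        lo, cen, hi = bins[m - 1], bins[m], bins[m + 1]
        for k in range(lo, cen):                      # rising edge
            H[m - 1, k] = (k - lo) / max(cen - lo, 1)
        for k in range(cen, hi):                      # falling edge
            H[m - 1, k] = (hi - k) / max(hi - cen, 1)
    return H

def log_mel(x, k_dft=256, hop=128, n_mels=64, fs=12000):
    """Formulas (7)-(8): Hamming-windowed DFT per frame, then the log of
    the Mel spectrum (a small epsilon avoids log(0))."""
    ham = np.hamming(k_dft)
    H = mel_filterbank(n_mels, k_dft, fs)
    frames = [x[t:t + k_dft] * ham
              for t in range(0, len(x) - k_dft + 1, hop)]
    X = np.fft.rfft(np.array(frames), n=k_dft, axis=1)   # formula (7)
    return np.log(np.abs(X) ** 2 @ H.T + 1e-10)          # formula (8)
```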
In S2, the generalized cross-correlation-phase transform features F_GCC are the feature values obtained between every pair of channel signals of the fault audio W by the generalized cross-correlation method with phase-transform weighting:

$$R_{ef}(\tau) = \int_{-\infty}^{+\infty} \frac{X_e(\omega)\, X_f^{*}(\omega)}{\lvert X_e(\omega)\, X_f^{*}(\omega) \rvert}\, e^{j \omega \tau}\, d\omega \qquad (9)$$

$$\hat{\tau}_{ef} = \arg\max_{\tau} R_{ef}(\tau) \qquad (10)$$

wherein R_ef(τ) is the cross-correlation function between the signals of the e-th and f-th channels selected from the fault audio W; e and f are any two non-duplicate channel numbers selected from the C channel signals of the microphone array, e, f ∈ {1, …, C}; X_e(ω) and X_f(ω) are the spectra of the e-th and f-th channel signals obtained by equation (7), with * denoting complex conjugation; ω is the angular frequency; and τ_ef is the time difference of arrival from the sound source between the e-th and f-th microphones of the array.
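The GCC-PHAT computation of formulas (9)-(10) can be sketched with NumPy FFTs as a discrete version of the integral; the zero-padding length and the small stabilizing constant are assumptions:

```python
import numpy as np

def gcc_phat(xe, xf):
    """GCC-PHAT, cf. formulas (9)-(10): phase-transform-weighted
    cross-power spectrum, inverse-transformed to a cross-correlation
    over lags. Returns the peak lag (in samples; divide by the sampling
    rate for seconds) and the centered correlation curve."""
    n = len(xe) + len(xf)
    Xe = np.fft.rfft(xe, n=n)
    Xf = np.fft.rfft(xf, n=n)
    cross = Xe * np.conj(Xf)
    # phase transform: keep only phase information, formula (9)
    r = np.fft.irfft(cross / (np.abs(cross) + 1e-12), n=n)
    max_lag = n // 2
    r = np.concatenate((r[-max_lag:], r[:max_lag + 1]))  # center zero lag
    tau = int(np.argmax(np.abs(r))) - max_lag            # formula (10)
    return tau, r
```

With this sign convention, a negative lag means the first channel leads (arrives earlier than) the second.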
S3, the log-Mel spectrum features F_LM and the generalized cross-correlation-phase transform features F_GCC of audio segments of length T from the fault audio W are input into the multi-channel branched convolutional recurrent network for training, iterating until the loss function MSE converges, so as to obtain the trained multi-channel branched convolutional recurrent network, wherein the output matrix Y of the network is defined as:

$$Y = \left[\, y_1, \ldots, y_V,\; \hat{x}_s, \hat{y}_s, \hat{z}_s \,\right] \qquad (11)$$

wherein (y_1, …, y_V) represents the audio-segment classification result: a one-hot encoding of the V fault categories; and (x̂_s, ŷ_s, ẑ_s) is the predicted value of the coordinates, relative to the origin, of the sound source corresponding to the audio segment.
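For training, the target form of formula (11), a one-hot class vector concatenated with the source coordinates per output frame, can be sketched as follows; the helper name and the frame-wise replication of the coordinates are assumptions:

```python
import numpy as np

def build_target(class_idx, src_xyz, n_frames, n_classes=4):
    """Target matrix Y, cf. formula (11), for one audio segment: each of
    the n_frames output rows holds a one-hot class vector followed by the
    (x_s, y_s, z_s) source coordinates."""
    y = np.zeros((n_frames, n_classes + 3))
    y[:, class_idx] = 1.0       # one-hot fault category
    y[:, n_classes:] = src_xyz  # source coordinates relative to the origin
    return y
```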
In S3, the multi-channel branched convolutional recurrent network is formed by connecting a dual-branch convolutional network, a bidirectional gated recurrent unit, and fully connected layers in series. Each branch of the dual-branch convolutional network consists of three convolution blocks in series, and each convolution block consists of a convolutional layer, a normalization layer, a pooling layer, and a dropout layer in series. The outputs of the two branches are spliced along the feature dimension and connected to the input of the bidirectional gated recurrent unit in the next layer.
In S3, the dual-branch convolutional network performs deep feature extraction, after which the branch outputs are spliced and converted into a feature matrix of size T/5 × 384, wherein the log-Mel spectrum feature F_LM input to the multi-channel branched convolutional recurrent network has size T × 1 × 64, and the generalized cross-correlation-phase transform feature F_GCC input to the network has size T × C(C−1)/2 × 64.
The feature matrix output by the bidirectional gated recurrent unit has size T/5 × 256.
There are two fully connected layers, and the output matrix Y of the multi-channel branched convolutional recurrent network has size T/5 × (V + 3).
In S3, the Adam optimizer is used to train the multi-channel branched convolutional recurrent network.
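A hedged PyTorch sketch of the network of S3: the patent's per-block kernel sizes (Table 1) did not survive extraction, so this uses a single convolution block per branch with illustrative channel counts, chosen only so the stated sizes come out (inputs of length T = 250 with 64 bins, output 50 × 7 for C = 4, V = 4):

```python
import torch
import torch.nn as nn

def conv_block(in_ch):
    # one convolution block: conv -> batch norm -> max pool -> dropout
    return nn.Sequential(
        nn.Conv2d(in_ch, 64, kernel_size=3, padding=1),
        nn.BatchNorm2d(64),
        nn.ReLU(),
        nn.MaxPool2d((5, 4)),   # time /5, frequency /4
        nn.Dropout(0.1),
    )

class DualBranchCRNN(nn.Module):
    """Dual-branch CNN -> feature splice -> BiGRU -> two FC layers.
    One conv block per branch (the patent uses three); sizes are
    illustrative assumptions, not the patented configuration."""
    def __init__(self, gcc_pairs=6, n_mels=64, n_out=7):
        super().__init__()
        self.branch_lm = conv_block(1)            # log-Mel branch
        self.branch_gcc = conv_block(gcc_pairs)   # GCC-PHAT branch
        feat = 2 * 64 * (n_mels // 4)             # spliced feature size
        self.gru = nn.GRU(feat, 128, batch_first=True, bidirectional=True)
        self.fc1 = nn.Linear(256, 128)
        self.fc2 = nn.Linear(128, n_out)

    def forward(self, lm, gcc):
        # lm: (B, 1, T, 64), gcc: (B, pairs, T, 64)
        z = torch.cat((self.branch_lm(lm), self.branch_gcc(gcc)), dim=1)
        z = z.permute(0, 2, 1, 3).flatten(2)      # (B, T/5, channels*freq)
        z, _ = self.gru(z)
        return self.fc2(torch.relu(self.fc1(z)))  # (B, T/5, n_out)
```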
S4, fault audio collected from a real vehicle is taken as the test set, and the log-Mel spectrum features F_LM and generalized cross-correlation-phase transform features F_GCC of the audio segments of length T in the test set are input into the trained multi-channel branched convolutional recurrent network for prediction.
The invention has the characteristics and beneficial effects that:
(1) Unlike traditional fault-diagnosis methods based on single-channel signals, the vehicle fault acoustic event detection method collects multi-channel signals with a microphone array and extracts sound-source azimuth features to assist fault diagnosis, which makes the model easier to train and improves its robustness across different scenes.
(2) Addressing the difficulty of collecting fault data, a large amount of simulation data is generated from a small amount of real data by means of the mirror-image source model algorithm, improving the quality of model training.
(3) The dual-branch convolutional network solves the fault-sound recognition and fault sound-source azimuth estimation problems simultaneously, and through feature splicing the two subtasks assist each other.
Drawings
Fig. 1 is a block diagram of a multi-channel branched convolutional loop network.
Detailed Description
The vehicle malfunction acoustic event detection method of the present invention will be described in detail with reference to the accompanying drawings and embodiments.
Example 1
A vehicle fault acoustic event detection method comprising the steps of:
S1, the automobile cabin is treated as a three-dimensional cuboid simulation space; one corner of the simulation space is taken as the origin, and a C-channel microphone array is placed at coordinates (x_m, y_m, z_m) relative to the origin; a sound source is placed at a random position around the microphone array, its coordinates relative to the origin being (x_s, y_s, z_s); the source coordinates (x_s, y_s, z_s) are changed by adjusting the position of the sound source, and after each adjustment the microphone array is simulated according to the mirror-image source model (Image Source Model, ISM) algorithm (J. B. Allen and D. A. Berkley, "Image method for efficiently simulating small-room acoustics," J. Acoust. Soc. Am., vol. 65, no. 4, pp. 943-950, 1979) to obtain the fault audio x_i(n) of the sound source on channel i:

$$x_i(n) = s(n) * h_i(n) \qquad (1)$$

wherein s(n) is the original fault audio emitted by the sound source (the original fault audio refers to the original signal before propagation), and h_i(t) is the impulse response of the sound source propagating through the simulation space to the i-th microphone;
the impulse response is:

$$h_i(t) = \sum_{k} \frac{\beta^{o_k}}{4\pi d_k}\, w\!\left(t - \frac{d_k}{v}\right), \qquad d_k = \lVert r_k - m_i \rVert \qquad (2)$$

wherein r_k is the coordinate, relative to the origin, of the k-th mirror image of the sound source s(n); o_k is the reflection order of the mirror sound source at coordinate r_k; β ∈ [0, 1] is the absorption coefficient of the wall; w is the window function; m_i is the position of the i-th microphone; v is the sound speed; and t is the arrival time corresponding to the n-th sampling point;
the absorption coefficient β, the maximum reflection order O, and the impulse-response length N_h can be calculated by the following formulas (3)-(5):

$$\beta = \sqrt{1 - \frac{0.161\, V_{room}}{S\, T_{60}}} \qquad (3)$$

$$O = \left\lceil \frac{v\, T_{60}}{2\, r_{\min}} \right\rceil \qquad (4)$$

$$N_h = \lceil T_{60}\, F \rceil \qquad (5)$$

wherein v is the sound speed, T_60 is the reverberation time, r_min is the minimum outer-sphere radius of the simulation space, V_room is the volume of the simulation space, S is the surface area of the simulation space, and F is the sampling rate.
The window function w(t) is as in formula (6):

$$w(t) = \begin{cases} \dfrac{1}{2}\left(1 + \cos\dfrac{2\pi t}{T_w}\right), & |t| \le \dfrac{T_w}{2} \\ 0, & \text{otherwise} \end{cases} \qquad (6)$$

wherein T_w is the window width.
S2, the log-Mel spectrum features F_LM and the generalized cross-correlation-phase transform features F_GCC are obtained based on the fault audio W.
In S2, extracting the log-Mel spectrum features F_LM comprises the following steps:
First, pre-emphasis, framing, and windowing operations are applied to the fault audio W, and the spectrum X_i(k) is obtained through the Fourier transform, whose formula is as follows:

$$X_i(k) = \sum_{n=0}^{N-1} x_i(n)\, \mathrm{ham}(n)\, e^{-j 2\pi n k / K}, \qquad 0 \le k \le K - 1 \qquad (7)$$

wherein x_i(n) is the signal at the n-th sampling point of the i-th channel of the fault audio after the pre-emphasis, framing, and windowing operations, ham(n) is the Hamming window function, N is the frame length of the fault audio after those operations, K is the discrete Fourier transform length, j is the imaginary unit, and k numbers the sampling points within the Fourier transform interval.
Then the modulus of X_i(k) is squared and passed through a Mel-scale triangular filter bank to obtain the Mel spectrum, and finally the logarithm of the Mel spectrum is taken to obtain the log-Mel spectrum features F_LM:

$$F_{LM}(i, m) = \log \mathrm{Mel}(i, m) = \log\!\left(\sum_{k} \lvert X_i(k) \rvert^2 H_m(k)\right) \qquad (8)$$

wherein Mel(i, m) is the Mel spectrum and H_m(k) is the m-th Mel-scale triangular filter.
In S2, the generalized cross-correlation-phase transform features F_GCC are the feature values obtained between every pair of channel signals of the fault audio W by the generalized cross-correlation method with phase-transform weighting:

$$R_{ef}(\tau) = \int_{-\infty}^{+\infty} \frac{X_e(\omega)\, X_f^{*}(\omega)}{\lvert X_e(\omega)\, X_f^{*}(\omega) \rvert}\, e^{j \omega \tau}\, d\omega \qquad (9)$$

$$\hat{\tau}_{ef} = \arg\max_{\tau} R_{ef}(\tau) \qquad (10)$$

wherein R_ef(τ) is the cross-correlation function between the signals of the e-th and f-th channels selected from the fault audio W; e and f are any two non-duplicate channel numbers selected from the C channel signals of the microphone array, e, f ∈ {1, …, C}; X_e(ω) and X_f(ω) are the spectra of the e-th and f-th channel signals obtained by equation (7), with * denoting complex conjugation; ω is the angular frequency; and τ_ef is the time difference of arrival from the sound source between the e-th and f-th microphones of the array, which can be calculated from the distance between the microphones and the sound source and the speed of sound in air.
S3, the log-Mel spectrum features F_LM and the generalized cross-correlation-phase transform features F_GCC of audio segments of length T from the fault audio W are input into the multi-channel branched convolutional recurrent network for training, iterating until the loss function MSE converges, so as to obtain the trained multi-channel branched convolutional recurrent network, wherein the output matrix Y of the network is defined as:

$$Y = \left[\, y_1, \ldots, y_V,\; \hat{x}_s, \hat{y}_s, \hat{z}_s \,\right] \qquad (11)$$

wherein (y_1, …, y_V) represents the audio-segment classification result: a one-hot encoding of the V fault categories; and (x̂_s, ŷ_s, ẑ_s) is the predicted value of the coordinates, relative to the origin, of the sound source corresponding to the audio segment.
In S3, as shown in fig. 1, the multi-channel branched convolutional recurrent network is formed by connecting a dual-branch convolutional network, a bidirectional gated recurrent unit, and fully connected layers in series. Each branch of the dual-branch convolutional network consists of three convolution blocks in series, and each convolution block consists of a convolutional layer, a normalization layer, a pooling layer, and a dropout layer in series. The outputs of the two branches are spliced along the feature dimension and connected to the input of the bidirectional gated recurrent unit in the next layer.
In S3, the dual-branch convolutional network performs deep feature extraction, after which the branch outputs are spliced and converted into a feature matrix of size T/5 × 384, wherein the log-Mel spectrum feature F_LM input to the multi-channel branched convolutional recurrent network has size T × 1 × 64, and the generalized cross-correlation-phase transform feature F_GCC input to the network has size T × C(C−1)/2 × 64.
The feature matrix output by the bidirectional gated recurrent unit has size T/5 × 256.
There are two fully connected layers, and the output matrix Y of the multi-channel branched convolutional recurrent network has size T/5 × (V + 3).
In S3, the Adam optimizer is used to train the multi-channel branched convolutional recurrent network.
S4, fault audio collected from a real vehicle is taken as the test set, and the log-Mel spectrum features F_LM and generalized cross-correlation-phase transform features F_GCC of the audio segments of length T in the test set are input into the trained multi-channel branched convolutional recurrent network for prediction.
Example 2
Experiments were carried out using the vehicle fault acoustic event detection method of example 1, and specific parameters are as follows:
Four kinds of audio in the Case Western Reserve University (CWRU) bearing fault data set, namely inner-race damage, outer-race damage, rolling-element damage, and normal, are taken as the original fault audio (V = 4);
The automobile cabin is treated as a three-dimensional cuboid 4 meters long, 1.8 meters wide, and 1.5 meters high. The signal-to-noise ratio in the simulation space is set to 30 dB, the reverberation time T_60 to 0.3 s, the sampling rate F to 12000 Hz, and the sound speed v to 340 m/s. The lower-left corner of the simulation space is taken as the origin; a circular 4-channel microphone array of radius 13 cm is placed at coordinates (2, 0.9, 0.5) relative to the origin. The sound source is placed on a ring of radius 0.8 m centered on the microphone array, and its coordinates (x_s, y_s, z_s) are changed by moving it along the ring at angular intervals of 10°; after each adjustment, the microphone array is simulated according to the mirror-image source model (Image Source Model, ISM) algorithm to obtain the fault audio x_i(n) of the sound source.
The discrete Fourier transform length K is 256, and 64 Mel-scale triangular filters are provided.
In S3, the dual-branch convolutional network performs deep feature extraction, after which the branch outputs are spliced and converted into a feature matrix of size 50 × 384, wherein the log-Mel spectrum feature F_LM input to the multi-channel branched convolutional recurrent network has size 250 × 1 × 64, and the generalized cross-correlation-phase transform feature F_GCC input to the network has size 250 × 6 × 64. The feature matrix output by the bidirectional gated recurrent unit has size 50 × 256. There are two fully connected layers, and the output matrix Y of the multi-channel branched convolutional recurrent network has size 50 × 7.
The number of hidden units of the bidirectional gated recurrent unit is 128. The first fully connected layer (FC1) has input size 50 × 256 and output size 50 × 128; the second fully connected layer (FC2) has input size 50 × 128 and output size 50 × 7. The stride and padding of the convolutional layers in the three convolution blocks (CB1, CB2, and CB3) are (1, 1); the max-pooling layers have stride (5, 4) and padding (1, 1); the convolution kernel sizes are shown in Table 1.
TABLE 1 convolution block parameters
In the training process of S3, the learning rate is set to 10⁻⁴, the dropout rate is 0.1, the batch size of the training set is 128, the Adam optimizer is adopted, and the number of iterations is 50.
Evaluation indices: the classification accuracy ACC and the azimuth error AE are defined as follows:

$$ACC = \frac{TP}{TP + FP} \qquad (12)$$

$$AE = \frac{1}{Q} \sum_{q=1}^{Q} \left\lvert \hat{\theta}_q - \theta_q \right\rvert \qquad (13)$$

wherein TP and FP denote, in turn, the true positives and false positives in the audio-segment classification results; Q is the number of audio segments; θ̂_q is the azimuth angle (estimated value) converted from the q-th predicted coordinates (x̂_s, ŷ_s, ẑ_s); and θ_q is the azimuth converted from the true coordinates, relative to the origin, of the sound source corresponding to the q-th audio segment.
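The evaluation indices of formulas (12)-(13) can be sketched as follows; the conversion of Cartesian coordinates to an azimuth about the microphone-array center is an assumption about how the angles θ_q are obtained:

```python
import numpy as np

def classification_accuracy(tp, fp):
    """Formula (12): accuracy over true and false positives."""
    return tp / (tp + fp)

def azimuth_error(pred_xyz, true_xyz, mic_xyz):
    """Formula (13): mean absolute azimuth difference in degrees.
    Each predicted/true source position is converted to an azimuth
    relative to the microphone-array center (an assumed conversion)."""
    def azimuth(p):
        p = np.asarray(p, dtype=float)
        return np.degrees(np.arctan2(p[:, 1] - mic_xyz[1],
                                     p[:, 0] - mic_xyz[0]))
    diff = np.abs(azimuth(pred_xyz) - azimuth(true_xyz))
    diff = np.minimum(diff, 360.0 - diff)   # wrap differences to [0, 180]
    return diff.mean()
```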
Fault audio collected from a real vehicle is taken as the test set, and the log-Mel spectrum features F_LM and generalized cross-correlation-phase transform features F_GCC of the audio segments of length T in the test set are input into the multi-channel branched convolutional recurrent network for prediction. The results show that the fault-type recognition accuracy of the vehicle fault acoustic event detection method reaches 76.63%, while the direction-angle estimation error is 14.58°.
Example 3
The experiment was performed using the vehicle fault acoustic event detection method, substantially identically to Example 2, with the only difference that the "multi-channel branched convolutional recurrent network" of Example 2 was replaced with a SELDnet network. The SELDnet network adopted in this example is described in detail in: Sharath Adavanne, Archontis Politis, Joonas Nikunen, and Tuomas Virtanen, "Sound event localization and detection of overlapping sources using convolutional recurrent neural networks," IEEE Journal of Selected Topics in Signal Processing (JSTSP), 2018.
The classification accuracy and azimuth error of example 2 and example 3 were compared and the results are shown in table 2.
Table 2 comparative experimental results
The foregoing describes exemplary embodiments of the invention. It should be understood that those skilled in the art may make simple variations, modifications, or other equivalent arrangements without departing from the spirit of the invention.

Claims (10)

1. A method for detecting a vehicle fault acoustic event, comprising the steps of:
S1, the automobile cabin is treated as a three-dimensional cuboid simulation space; one corner of the simulation space is taken as the origin, and a C-channel microphone array is placed at coordinates (x_m, y_m, z_m) relative to the origin; a sound source is placed at a random position around the microphone array, its coordinates relative to the origin being (x_s, y_s, z_s); the source coordinates (x_s, y_s, z_s) are changed by adjusting the position of the sound source, and after each adjustment the microphone array is simulated according to the mirror-image source model algorithm to obtain the fault audio x_i(n) of the sound source:

$$x_i(n) = s(n) * h_i(n)$$

wherein s(n) is the original fault audio emitted by the sound source and h_i(t) is the impulse response of the sound source propagating through the simulation space;
S2, the log-Mel spectrum features F_LM and the generalized cross-correlation-phase transform features F_GCC are obtained based on the fault audio W;
S3, fault audio frequencyLogarithmic mel-spectral characteristics of an audio fragment of medium length T>And generalized cross-correlation-phase transformation feature->Training is carried out in the multichannel branch convolution loop network, iteration training is carried out until a loss function MSE converges, and the trained multichannel branch convolution loop network is obtained, wherein the definition of an output matrix Y of the multichannel branch convolution loop network is as follows:
wherein,representing the audio clip classification result: />Single-hot encoding of species class; />The predicted value of the sound source corresponding to the audio fragment relative to the origin coordinate is obtained;
S4, fault audio collected from a real vehicle is taken as the test set, and the log-Mel spectrum features F_LM and generalized cross-correlation-phase transform features F_GCC of the audio segments of length T in the test set are input into the trained multi-channel branched convolutional recurrent network for prediction.
2. The vehicle fault acoustic event detection method according to claim 1, wherein the impulse response is:
h(t) = Σ_{i=1}^{M} [ β^{g_i} / (4π‖r_i − r_m‖) ] · w(t − ‖r_i − r_m‖ / c)
wherein M is the number of mirror images of the sound source, r_i is the coordinate, relative to the origin, of the i-th mirror sound source, g_i is the reflection order of the mirror sound source at coordinate r_i, β is the absorption coefficient of the wall, β ∈ [0, 1], w(·) is the window function, c is the speed of sound, and t is the arrival time corresponding to the n-th sampling point, t = n / f_s.
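The impulse response of claim 2 follows the classical mirror-image-source construction for a cuboid room. The sketch below is an illustrative reimplementation under simplifying assumptions: a single uniform absorption coefficient `beta`, a small maximum order, and nearest-sample rounding in place of the windowed fractional delay of claim 4:

```python
import itertools
import numpy as np

def image_source_rir(src, mic, room, beta, c=343.0, fs=16000,
                     max_order=2, length=2048):
    """Mirror-image-source impulse response for a cuboid room: each image
    at distance d adds beta**g / (4*pi*d) at sample round(d / c * fs),
    where g is the image's total reflection order."""
    src, mic, room = map(np.asarray, (src, mic, room))
    h = np.zeros(length)
    orders = range(-max_order, max_order + 1)
    for n in itertools.product(orders, repeat=3):      # periodic image index
        for q in itertools.product((0, 1), repeat=3):  # mirror parity per axis
            n_, q_ = np.array(n), np.array(q)
            img = 2 * n_ * room + (1 - 2 * q_) * src   # mirrored source position
            g = np.abs(2 * n_ - q_).sum()              # total reflection count
            d = np.linalg.norm(img - mic)
            k = int(np.round(d / c * fs))              # arrival sample index
            if k < length:
                h[k] += beta ** g / (4 * np.pi * d)
    return h

# cabin-sized cuboid; the direct path (g = 0) arrives first and strongest
h = image_source_rir(src=[1.0, 0.8, 0.5], mic=[2.0, 1.0, 0.9],
                     room=[2.5, 1.5, 1.2], beta=0.6)
```

The parity/period decomposition (image at 2nL ± x_s per axis, with |2n − q| reflections) reproduces the standard Allen-Berkley enumeration of mirror sources.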
3. The method for detecting a vehicle fault acoustic event according to claim 2, wherein β and the maximum reflection order N can be calculated from the following formulas:
β = 24 ln(10) · V / (c · S · T_60),    N = ⌈ c · T_60 / (2 · r_min) ⌉
wherein c is the speed of sound, T_60 is the reverberation time, r_min is the radius of the minimum outer sphere of the simulated space, V is the volume of the simulated space, S is the surface area of the simulated space, and f_s is the sampling rate.
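The published text of claim 3 lists the symbols but the formulas themselves did not survive extraction; a plausible reconstruction consistent with Sabine's reverberation equation (an assumption, not the patent's verbatim formula) computes the absorption from the target T_60 and bounds the image order by how far sound travels within one reverberation time:

```python
import math

def inverse_sabine(rt60, volume, surface, r_min, c=343.0):
    """Estimate wall absorption beta and maximum reflection order N
    from a target reverberation time via Sabine's formula."""
    beta = 24 * math.log(10) * volume / (c * surface * rt60)  # Sabine absorption
    N = math.ceil(c * rt60 / (2 * r_min))  # images beyond c*T60 are negligible
    return beta, N

# cabin-sized cuboid 2.5 m x 1.5 m x 1.2 m, T60 = 0.25 s
V = 2.5 * 1.5 * 1.2
S = 2 * (2.5 * 1.5 + 2.5 * 1.2 + 1.5 * 1.2)
r = 0.5 * math.sqrt(2.5**2 + 1.5**2 + 1.2**2)  # min enclosing sphere radius
beta, N = inverse_sabine(0.25, V, S, r)
```

For this cabin the estimate gives an absorption of about 0.17 (well inside [0, 1]) and a maximum order of a few tens, which is the regime in which image-source simulation is tractable.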
4. A method of detecting a vehicle fault acoustic event according to claim 3, wherein the window function w(t) is as follows:
w(t) = (1/2) · (1 + cos(2πt / T_w)) for |t| ≤ T_w / 2, and w(t) = 0 otherwise
wherein T_w is the window width.
5. The method for detecting a vehicle fault acoustic event according to claim 4, wherein in S2, extracting the log-mel spectrum features F_mel comprises the following steps:
first, pre-emphasis, framing and windowing operations are performed on the fault audio W, and the spectrum X_i(k) is obtained by Fourier transform, wherein the formula of the Fourier transform is as follows:
X_i(k) = Σ_{n=0}^{N−1} x_i(n) · g(n) · e^{−j2πkn/N_DFT},    0 ≤ k < N_DFT
wherein x_i(n) is the signal at the n-th sampling point of the i-th channel of the fault audio W after the pre-emphasis, framing and windowing operations, g(n) is the Hamming window function, N is the frame length of the fault audio W after the pre-emphasis, framing and windowing operations, N_DFT is the discrete Fourier transform length, j is the imaginary unit, and k is the number of the sampling point within the Fourier transform interval;
then, the modulus of X_i(k) is squared and passed through the mel-scale triangular filters to obtain the mel spectrum, and finally the logarithm of the mel spectrum is taken to obtain the log-mel spectrum features F_mel:
S_mel(m) = Σ_k H_m(k) · |X_i(k)|²,    F_mel(m) = ln S_mel(m)
wherein S_mel is the mel spectrum and H_m(k) is the m-th mel-scale triangular filter.
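The extraction pipeline of claim 5 can be sketched in plain numpy. The frame length, hop, and pre-emphasis coefficient are common-practice assumptions; the 64-filter mel bank matches the 64-bin feature size stated in claim 8:

```python
import numpy as np

def log_mel(x, fs=16000, n_fft=1024, hop=512, n_mels=64, pre=0.97):
    """Log-mel spectrogram of a single-channel signal x: (frames, n_mels)."""
    x = np.append(x[0], x[1:] - pre * x[:-1])            # pre-emphasis
    n_frames = 1 + (len(x) - n_fft) // hop
    win = np.hamming(n_fft)                              # windowing per frame
    frames = np.stack([x[i*hop:i*hop + n_fft] * win for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2      # |X_i(k)|^2

    # mel-scale triangular filter bank H_m(k)
    mel = lambda f: 2595 * np.log10(1 + f / 700)
    imel = lambda m: 700 * (10 ** (m / 2595) - 1)
    pts = imel(np.linspace(mel(0), mel(fs / 2), n_mels + 2))
    bins = np.floor((n_fft + 1) * pts / fs).astype(int)
    H = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        lo, ce, hi = bins[m - 1], bins[m], bins[m + 1]
        H[m - 1, lo:ce] = (np.arange(lo, ce) - lo) / max(ce - lo, 1)
        H[m - 1, ce:hi] = (hi - np.arange(ce, hi)) / max(hi - ce, 1)

    return np.log(power @ H.T + 1e-10)                   # ln(mel spectrum)

F = log_mel(np.random.default_rng(0).standard_normal(16000))
print(F.shape)  # (30, 64)
```

Running it per channel on each length-T segment yields the T × 64 log-mel feature map fed to one branch of the network.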
6. The method of claim 5, wherein in S2, the generalized cross-correlation-phase transform features F_gcc are the feature values obtained, between every pair of channel signals of the fault audio W, by the generalized cross-correlation method based on phase transformation, namely:
R_{c1,c2}(τ) = (1/2π) ∫ [ X_{c1}(ω) · X*_{c2}(ω) / |X_{c1}(ω) · X*_{c2}(ω)| ] · e^{jωτ} dω
wherein R_{c1,c2}(τ) represents the cross-correlation function between the signal of the c1-th channel and the signal of the c2-th channel selected from the fault audio W, c1 and c2 being any two non-duplicate channel numbers selected from the C channel signals of the microphone array, c1, c2 = 1, ..., C; X_{c1}(ω) denotes the spectrum obtained from the signal of the c1-th channel, X_{c2}(ω) denotes the spectrum obtained from the signal of the c2-th channel, ω is the angular frequency, and R_{c1,c2}(τ) peaks at the time difference of arrival τ_{c1,c2} from the sound source to the c1-th and c2-th microphones of the microphone array.
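The GCC-PHAT computation of claim 6 for one channel pair, sketched in numpy: the cross-power spectrum is normalized to unit magnitude (the phase transform), so the inverse transform peaks sharply at the inter-channel time difference of arrival. The `max_lag` truncation is an illustrative choice:

```python
import numpy as np

def gcc_phat(x1, x2, max_lag=32):
    """GCC-PHAT between two channel signals; returns the correlation
    values at lags -max_lag..max_lag and the estimated TDOA in samples."""
    n = len(x1) + len(x2)
    X1, X2 = np.fft.rfft(x1, n), np.fft.rfft(x2, n)
    S = X1 * np.conj(X2)
    r = np.fft.irfft(S / (np.abs(S) + 1e-12), n)         # phase transform
    r = np.concatenate((r[-max_lag:], r[:max_lag + 1]))  # lags -max_lag..max_lag
    return r, int(np.argmax(r)) - max_lag

# a 7-sample inter-channel delay is recovered as the GCC-PHAT peak
rng = np.random.default_rng(1)
s = rng.standard_normal(2048)
x1, x2 = s[:-7], s[7:]          # channel 2 leads channel 1 by 7 samples
r, tdoa = gcc_phat(x1, x2)
print(tdoa)  # 7
```

Evaluating this for all C(C−1)/2 microphone pairs and stacking the lag vectors frame by frame yields the second feature map fed to the network.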
7. The method for detecting the acoustic event of the vehicle fault according to claim 1, wherein in S3, the multi-channel branch convolutional recurrent network is formed by sequentially connecting a dual-branch convolutional network, a bidirectional gated recurrent unit and fully connected layers in series; each branch of the dual-branch convolutional network is formed by connecting three convolutional blocks in series, each convolutional block being formed by connecting a convolutional layer, a normalization layer, a pooling layer and a dropout layer in series; and the outputs of the two branches of the dual-branch convolutional network are concatenated into features and connected to the input of the bidirectional gated recurrent unit located in the next layer.
8. The method for detecting acoustic events of vehicle failure according to claim 7, wherein in S3, the log-mel spectrum features F_mel input to the multi-channel branch convolutional recurrent network have a size of T×1×64, the GCC-PHAT features F_gcc input to the multi-channel branch convolutional recurrent network have a size of T×C(C−1)/2×64, and after depth feature extraction the dual-branch convolutional network concatenates and converts them into a feature matrix of size T/5×384.
9. The method of claim 8, wherein the feature matrix output by the bidirectional gated recurrent unit has a size of T/5×256.
10. The method for detecting acoustic events of vehicle faults as claimed in claim 9, wherein the fully connected layers are two layers, and the size of the output matrix Y of the multi-channel branch convolutional recurrent network is T/5×(K+3), wherein K is the number of fault classes;
In S3, the optimizer of the multi-channel branch convolution loop network adopts an Adam optimizer.
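Claims 7 through 10 can be sketched as a PyTorch model. The filter counts, pooling factors and dropout rate below are assumptions chosen so the intermediate sizes match the claims (time pooled 5x, concatenated depth 384, BiGRU output 256, output T/5 x (K + 3) with K fault classes); the patent does not publish these hyperparameters:

```python
import torch
import torch.nn as nn

def branch(in_ch):
    """One branch: three conv blocks, each conv -> batch-norm -> pooling ->
    dropout in series. Pooling: time 5x in block 1, frequency 4x per block."""
    layers, chans, pools = [], [in_ch, 64, 128, 192], [(5, 4), (1, 4), (1, 4)]
    for i in range(3):
        layers += [nn.Conv2d(chans[i], chans[i + 1], 3, padding=1),
                   nn.BatchNorm2d(chans[i + 1]),
                   nn.MaxPool2d(pools[i]),
                   nn.Dropout(0.3)]
    return nn.Sequential(*layers)

class MultiBranchCRNN(nn.Module):
    def __init__(self, C=4, K=5):
        super().__init__()
        self.mel_branch = branch(1)                 # log-mel input: T x 1 x 64
        self.gcc_branch = branch(C * (C - 1) // 2)  # GCC-PHAT: T x C(C-1)/2 x 64
        self.gru = nn.GRU(384, 128, batch_first=True, bidirectional=True)
        self.fc = nn.Sequential(nn.Linear(256, 128), nn.Linear(128, K + 3))

    def forward(self, mel, gcc):
        # inputs: (batch, channels, T, 64); each branch -> (batch, 192, T/5, 1)
        a, b = self.mel_branch(mel), self.gcc_branch(gcc)
        x = torch.cat([a, b], dim=1).squeeze(-1).transpose(1, 2)  # (batch, T/5, 384)
        x, _ = self.gru(x)                                        # (batch, T/5, 256)
        return self.fc(x)                                         # (batch, T/5, K+3)

model = MultiBranchCRNN(C=4, K=5)
mel = torch.randn(2, 1, 50, 64)      # T = 50 frames, 64 mel bins
gcc = torch.randn(2, 6, 50, 64)      # 6 = C(C-1)/2 microphone pairs
y = model(mel, gcc)                  # (2, 10, 8): T/5 frames x (K + 3)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)  # claim 10: Adam optimizer
loss = nn.MSELoss()(y, torch.zeros_like(y))          # claim 1: MSE loss
```

Each output frame carries the one-hot class scores in its first K entries and the predicted source coordinates in the last three, matching the definition of Y in claim 1.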
CN202311531071.XA 2023-11-16 2023-11-16 Vehicle fault acoustic event detection method Active CN117268796B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311531071.XA CN117268796B (en) 2023-11-16 2023-11-16 Vehicle fault acoustic event detection method

Publications (2)

Publication Number Publication Date
CN117268796A true CN117268796A (en) 2023-12-22
CN117268796B CN117268796B (en) 2024-01-26

Family

ID=89208335

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311531071.XA Active CN117268796B (en) 2023-11-16 2023-11-16 Vehicle fault acoustic event detection method

Country Status (1)

Country Link
CN (1) CN117268796B (en)

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2013079568A1 (en) * 2011-12-02 2013-06-06 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Apparatus and method for microphone positioning based on a spatial power density
CN104094613A (en) * 2011-12-02 2014-10-08 弗劳恩霍弗促进应用研究注册公司 Apparatus and method for microphone positioning based on a spatial power density
CN107820158A (en) * 2017-07-07 2018-03-20 大连理工大学 A kind of three-dimensional audio generating means based on the response of head coherent pulse
CN108182949A (en) * 2017-12-11 2018-06-19 华南理工大学 A kind of highway anomalous audio event category method based on depth conversion feature
CN108828501A (en) * 2018-04-29 2018-11-16 桂林电子科技大学 The method that real-time tracking positioning is carried out to moving sound in sound field environment indoors
CN109616138A (en) * 2018-12-27 2019-04-12 山东大学 Voice signal blind separating method and ears hearing assistance system based on segmentation frequency point selection
CN110188670A (en) * 2019-05-29 2019-08-30 广西释码智能信息技术有限公司 Face image processing process, device in a kind of iris recognition and calculate equipment
WO2022007846A1 (en) * 2020-07-08 2022-01-13 华为技术有限公司 Speech enhancement method, device, system, and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LIANG YU et al., APPLIED ACOUSTICS *

Also Published As

Publication number Publication date
CN117268796B (en) 2024-01-26

Similar Documents

Publication Publication Date Title
CN107644650B (en) Improved sound source positioning method based on progressive serial orthogonalization blind source separation algorithm and implementation system thereof
CN112560822B (en) Road sound signal classification method based on convolutional neural network
CN102147458B (en) Method and device for estimating direction of arrival (DOA) of broadband sound source
CN107507625B (en) Sound source distance determining method and device
CN104407328A (en) Method and system for positioning sound source in enclosed space based on spatial pulse response matching
CN113763986A (en) Air conditioner indoor unit abnormal sound detection method based on sound classification model
CN109597021B (en) Direction-of-arrival estimation method and device
CN112348052A (en) Power transmission and transformation equipment abnormal sound source positioning method based on improved EfficientNet
CN116576956A (en) Multisource vibration signal separation method based on distributed optical fiber acoustic wave sensing
CN112052712B (en) Power equipment state monitoring and fault identification method and system
Traa et al. Blind multi-channel source separation by circular-linear statistical modeling of phase differences
CN110782041B (en) Structural modal parameter identification method based on machine learning
Tcherniak et al. Application of transmissibility matrix method to NVH source contribution analysis
Zhang et al. Underwater target recognition based on spectrum learning with convolutional neural network
CN112180318B (en) Sound source direction of arrival estimation model training and sound source direction of arrival estimation method
CN117268796B (en) Vehicle fault acoustic event detection method
CN113919389A (en) GIS fault diagnosis method and system based on voiceprint imaging
CN112397090B (en) Real-time sound classification method and system based on FPGA
Li et al. Speaker and direction inferred dual-channel speech separation
CN111968671B (en) Low-altitude sound target comprehensive identification method and device based on multidimensional feature space
CN115267672A (en) Method for detecting and positioning sound source
CN104991245A (en) Unmanned aerial vehicle early warning apparatus and early warning method thereof
Youssef et al. From monaural to binaural speaker recognition for humanoid robots
CN110361696B (en) Closed space sound source positioning method based on time reversal technology
CN117630818A (en) Feature preprocessing and extracting method for multi-channel audio positioning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant