CN102722983B

CN102722983B - Audio and video based detection system and method for horn blowing of motor vehicle in violation of regulations

Info

Publication number: CN102722983B
Application number: CN201210194327.8A
Authority: CN
Inventors: 翟国庆; 谢辉
Original assignee: Zhejiang University ZJU
Current assignee: Zhejiang Lixin Zhongzhi Acoustic Technology Co ltd
Priority date: 2012-06-13
Filing date: 2012-06-13
Publication date: 2014-03-12
Anticipated expiration: 2032-06-13
Also published as: CN102722983A

Abstract

The invention discloses an audio and video based detection system and method for horn blowing of a motor vehicle in violation of regulations. The detection system comprises an audio collection unit, a video collection unit, a main control unit and a storage unit. The detection method comprises the following steps of: sampling a sound signal in a designated area; calculating an actual sound pressure level of the sound signal; if a difference between a sound pressure level of a current sampling value and the actual sound pressure level of a previous sampling value is greater than a preset value, starting to collect a video signal of the designated area and calculating the modified sound pressure level of the current sampling value; calculating a standard deviation between grid sound source sound power levels according to the modified sound pressure level; calculating a quadratic sum of a difference value between an actual time difference of a sound sensor receiving a sudden sound and a theoretic time difference of all grid point sounds transmitted to the sound sensor; and recognizing a grid position corresponding to a horn-blowing point according to the quadratic sum of the difference value and the standard deviation. The audio and video based detection system and method for horn blowing of the motor vehicle in violation of the regulations have the advantages of stronger practicability, convenience for operation and accuracy in locating and are convenient to help managers to determine a vehicle in violation of the regulations.

Description

A method for detecting illegal motor vehicle horn honking based on audio and video

Technical field

The present invention relates to the field of environmental noise measurement, and in particular to a system and method for detecting illegal motor vehicle horn honking based on audio and video.

Background

With the development of modern industry, environmental pollution has followed, and noise pollution is one form of it. According to a recent World Health Organization (WHO) epidemiological study of European countries, noise pollution has become an important environmental factor affecting quality of life and health, and is regarded, together with water pollution and air pollution, as one of the three major environmental problems worldwide. Excessive exposure to noise not only seriously affects mental health but also increases the risk of conditions such as cardiovascular disease. In most Chinese cities, environmental noise complaints account for 40% to 50% of all environmental complaints, rank first among them, and continue to rise. A report on the burden of disease from noise, recently published by the WHO and the European Union's Joint Research Centre and covering the health effects of noise pollution, points out that in the European Region the disease burden caused by noise pollution is second only to that of air pollution. The WHO has therefore called on countries around the world to take effective measures to reduce noise pollution.

Noise is the sound emitted when a sounding body vibrates irregularly; it propagates as a wave through a medium (solid, liquid, or gas). In general, sound waves audible to the human ear have frequencies of 20 to 20000 Hz and are called audible sound; those below 20 Hz are called infrasound, and those above 20000 Hz are called ultrasound.

The perceived pitch of a sound depends on its frequency: high-frequency sound seems sharp, while low-frequency sound seems dull. Loudness is determined by the intensity of the sound. From a physical point of view, noise is an irregular combination of sounds of various frequencies and intensities.

Judging whether a sound is noise from the physical point of view alone is not enough; subjective factors often play the decisive role. Even the same sound may be judged differently by a person in different states or moods, so that it may be perceived as either noise or music. From a physiological point of view, any sound that interferes with rest, study, or work, that is, any unwanted sound, is called noise. When noise harms people and the surrounding environment, it constitutes noise pollution.

Since the Industrial Revolution, the creation and use of all kinds of machinery has brought prosperity and progress to mankind, but has also produced more and louder noise. To reduce urban environmental noise pollution, most Chinese cities have introduced horn-honking bans. However, with the rapid growth in the number of urban motor vehicles, city streets are increasingly crowded and illegal honking by drivers remains prominent, even rebounding in some areas. It seriously disturbs the normal life and work of nearby residents, especially their rest and sleep at night, and the public has persistently called on traffic authorities to strengthen supervision of illegal honking. Because no automatic system or method for detecting illegal honking exists, traffic police can only rely on manpower, sending officers to supervise on site the road sections where illegal honking is most prominent and identifying offending vehicles by ear. Drivers who are penalized sometimes deny the violation, and because no objective evidence can be provided, the authorities face the awkward situation of a real violation that cannot be penalized. This contrasts sharply with the effective automatic monitoring and penalizing of violations such as speeding and running red lights.

Searches of the relevant literature and patents show no report, in China or abroad, of an automatic system or method for detecting illegal motor vehicle horn honking.

Summary of the invention

The invention provides a system and a method for detecting illegal motor vehicle horn honking based on audio and video, which enable automatic monitoring and investigation of illegal honking by motor vehicles in cities.

A system for detecting illegal motor vehicle horn honking based on audio and video comprises:

an audio acquisition unit for acquiring the sound signal of a designated area;

a video acquisition unit for acquiring the video signal of the designated area;

a main control unit for receiving the sound signal, triggering the video acquisition unit when the sound signal exceeds a set value, and identifying the sound source position in the acquired video signal;

a storage unit for receiving and storing the sound signal and the video signal via the main control unit.

The storage unit may be an internal or external storage medium connected to the main control unit through a suitable interface. The main control unit may be a microcontroller, a microcomputer, or another device capable of receiving, processing, and outputting signals.

The video acquisition unit may be video image acquisition equipment such as a digital camera (or video camera).

The audio acquisition unit is a sound sensor.

A sound sensor generally acquires an analog signal, which must be converted by analog-to-digital conversion into a form the main control unit can recognize. A sound card is therefore connected between the sound sensor and the main control unit; it performs analog-to-digital conversion or preprocessing and may be internal or external.

To acquire sound signals from several designated areas, or to sample the same designated area at several points, a plurality of sound sensors is used, arranged as a sound sensor array.

A software system may be installed in the main control unit to coordinate the units. It automatically acquires, stores, and updates the sound and video signals, processes the sound signal, and identifies the position of the honking point.

The detection system of the present invention may be fixed in an area where illegal honking needs to be monitored, or mounted on a mobile monitoring vehicle.

In the present invention, the sound sensor array is connected to the audio input port of the sound card by signal cables. Preferably, the sound sensors in the array are arranged in a straight line, perpendicular or parallel to the motor vehicle lanes; they may be placed across the road above each lane, or at the boundary between the motor vehicle lane and the slow lane.

To ensure acquisition quality, the sound sensors in the array should be weather-resistant and number no fewer than three. To satisfy the clearance required by all types of motor vehicle, a mounting height greater than 6 m is advisable.

The resolution of the digital camera (or video camera) is greater than 800×600, and it may be connected to the microcomputer via USB or another port. The sound card samples at 16 bits or more, at a sampling rate of 44.1 kHz or more. An external sound card or external storage medium may be provided with an additional external power supply. When the CPU frequency of the microcomputer is high (much greater than the sound card sampling frequency), synchronized sampling of the sound signal by every sound sensor channel can be achieved by time-sharing.

The present invention also provides a detection method that uses dynamically sampled and recorded audio and video signals to detect illegal honking events, identify the position of the offending vehicle, and keep the minimum audio and video material needed as evidence.

A method for detecting illegal motor vehicle horn honking based on audio and video comprises the following steps:

(1) Arrange a plurality of sampling points (generally at least two) in the designated area and acquire sound signals; each sampling point can be regarded as one sound sensor. Sampling is generally at equal intervals, with the interval determined by the sound card sampling frequency. Because the sound sensor acquires an analog signal, the sound card converts it into a binary digital signal, which is saved together with the sampling time as a temporary wave file (*.wav).

(2) Calculate the actual sound pressure level of the sound signal using a preset calibration value.

The strength of the sound signal is represented by the corresponding voltage value. The voltage value V_{ij} of the sound signal is therefore first converted into the actual sound pressure p_{ij}, from which the actual sound pressure level L_{ij} is calculated as L_{ij} = 20 lg(p_{ij}/p_0), where p_0 is the reference sound pressure.

In each subscript, i is the number of the sound signal channel;

j is the number of the sampled value within one channel.

For example, the 8th voltage value in the 5th channel is V_{58}, with corresponding actual sound pressure p_{58} and actual sound pressure level L_{58}.

(3) If, in any channel, the difference between the actual sound pressure levels of the current sampled value and the previous sampled value is greater than a preset value (hereinafter the trigger condition), the current sampled value is considered to contain a sudden sound, and acquisition of the video signal of the designated area is started. At the same time, with one sampling point taken as the reference sampling point, the actual time difference between the arrival of the sudden sound at each sampling point and its arrival at the reference sampling point is calculated.

For example, when L_{ij} - L_{i(j-1)} > preset value at the i-th sound sensor (the preset value is adjustable according to the required sensitivity of the system; 8 dB or more is generally advisable), the video acquisition device is started and records video frames at equal intervals (saved frame by frame), with a frame interval of less than 0.1 s and at least 3 consecutive frames saved.

At the same time, based on the microcomputer CPU clock signal, the time t_i (i = 1, 2, ..., n, where i is the channel number) is extracted for each channel: the time of the sampled value that satisfies the trigger condition and lies closest to the moment video acquisition started. Any one sound sensor may be taken as channel 1, that is, as the reference sampling point, and the actual time difference between the arrival of the sudden sound (such as a horn) at each sound sensor and at sound sensor 1 is calculated as Δt_{i-1} = t_i - t_1, where

t_1 is the time at which the sudden sound reaches sound sensor 1 (the reference sampling point);

t_i is the time at which the sudden sound reaches each sound sensor (each sampling point).
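The trigger condition and the actual time differences of step (3) can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not the implementation described in the patent: the function names `find_trigger` and `actual_time_differences`, the list inputs, and the choice of list index 0 for channel 1 are all hypothetical.

```python
def find_trigger(levels, threshold_db=8.0):
    """Return the index j of the first sample whose level exceeds the
    previous sample's level by more than threshold_db, or None.

    levels: actual sound pressure levels L_ij (dB) of one channel.
    threshold_db: the preset value; the text suggests 8 dB or more.
    """
    for j in range(1, len(levels)):
        if levels[j] - levels[j - 1] > threshold_db:
            return j
    return None


def actual_time_differences(trigger_times):
    """Compute t_i - t_1 for every channel.

    trigger_times: arrival times t_i in seconds, with channel 1
    (the reference sampling point) stored at list index 0.
    """
    t1 = trigger_times[0]
    return [t - t1 for t in trigger_times]
```

The first entry of the returned list is always zero, since the reference channel is compared with itself.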

(4) For the current sampled value that satisfies the trigger condition in step (3), subtract the background value from its actual sound pressure level to obtain the corrected sound pressure level:

L_{i'} = 10 \lg (10^{0.1 L_{ij}} - 10^{0.1 L_{i (j - 1)}}) .
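As a worked illustration of the background correction in step (4), the short Python sketch below subtracts the previous sample's level from the current one on an energy basis. The function name `corrected_level` and the error handling are assumptions added for the example.

```python
import math


def corrected_level(l_current, l_background):
    """Energy-based subtraction of a background level from a total level.

    l_current:    L_ij, level of the sample containing the sudden sound (dB).
    l_background: L_i(j-1), level of the previous sample (dB).
    """
    if l_current <= l_background:
        # The logarithm is undefined unless the total exceeds the background.
        raise ValueError("current level must exceed the background level")
    return 10.0 * math.log10(10 ** (0.1 * l_current) - 10 ** (0.1 * l_background))
```

For a total of 80 dB over a 70 dB background the correction is small (about 0.46 dB), which is consistent with the trigger condition requiring a jump of 8 dB or more.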

(5) Divide the designated area into a number of grids and, from the corrected sound pressure levels, use the grid method to calculate the standard deviation of the sound source sound power level at each grid point; also calculate the theoretical time difference between the arrival of a sound from each grid point at each sampling point and at the reference sampling point.

The designated area is divided into grids (square grids smaller than 1 m × 1 m are advisable). Given the corrected sound pressure level L_{i'} of channel i, a sudden noise such as a horn is treated as a point source sounding at a grid point. Using the non-directional radiation model for semi-free space,

L_{i'} = L_{W_{ik}} - 20 \lg r - 8

(where r is the distance from grid point k to sensor i), with L_{i'} and r known, the sound power level of the point source at grid point k (k is the grid point index, k = 1, 2, ..., M, where M is the largest index) can be obtained by inversion:

L_{W_{ik}} = L_{i'} + 20 \lg r + 8 .

From the definition of sound power level,

L_{W_{ik}} = 10 \lg (W_{ik} / W_{0})

(W_0 is the reference sound power, 10^{-12} W), the corresponding sound power W_{ik} is calculated (i is the channel number, i = 1, 2, ..., n, where n is the largest number), and the standard deviation SD_k of the W_{ik} at each grid point is calculated:

{SD}_{k} = \sqrt{\frac{1}{n} Σ_{i = 1}^{n} {(W_{ik} - \frac{1}{n} Σ_{i = 1}^{n} W_{ik})}^{2}} .
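The inversion for one grid point can be sketched in Python as below. The sketch assumes the standard hemispherical-spreading relation for a non-directional point source, L_p = L_W - 20 lg r - 8, as the semi-free-space model; the function names `source_power` and `power_std` are hypothetical. If the true source sits at the grid point being tested, every channel should imply the same sound power, so SD_k is close to zero there.

```python
import math

W_REF = 1e-12  # reference sound power W0 in watts (from the text)


def source_power(l_corrected, r):
    """Sound power (W) that a point source at distance r (m) would need
    to produce the corrected level l_corrected (dB) at a sensor.

    Assumes L_p = L_W - 20*lg(r) - 8 (semi-free space, non-directional).
    """
    l_w = l_corrected + 20.0 * math.log10(r) + 8.0
    return W_REF * 10 ** (l_w / 10.0)


def power_std(levels, distances):
    """Population standard deviation SD_k of the powers W_ik implied by
    each channel for one grid point k.

    levels:    corrected levels L_i' per channel (dB).
    distances: distance from grid point k to each sensor (m).
    """
    powers = [source_power(l, r) for l, r in zip(levels, distances)]
    mean = sum(powers) / len(powers)
    return math.sqrt(sum((w - mean) ** 2 for w in powers) / len(powers))
```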

After the designated area has been divided into grids, the theoretical time difference between the arrival of a sound from each grid point at each sampling point and at the reference sampling point must be calculated.

For ease of description, the sampling points are referred to below as the sound sensors used in practice.

From the sound path from each grid point to each sound sensor (the straight-line distance r from the grid point to the sensor) and the speed of sound in air c (taken as 340 m/s), the time required for a sound at the k-th grid point (any grid point) to reach any sound sensor is t_{ik} = r/c (i is the channel number, which also identifies the sound sensor, i = 1, 2, ..., n, where n is the largest number).

With one sound sensor taken as the reference sensor, the theoretical time difference between the arrival of a sound from the k-th grid point at each sound sensor and at the reference sensor is Δt_{i-1, k} = t_{ik} - t_{1k}, where

t_{1k} is the time for a sound from the k-th grid point to reach sound sensor 1 (the reference sampling point);

t_{ik} is the time for a sound from the k-th grid point to reach each sound sensor (each sampling point).

Once the positions of the grid points and sound sensors are fixed, the theoretical time differences can be calculated in advance and stored in the microcomputer.
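A minimal Python sketch of this precomputation is given below. The coordinate layout (tuples in metres), the function name `theoretical_time_differences`, and the convention that the reference sensor is first in the list are assumptions for illustration.

```python
import math

SPEED_OF_SOUND = 340.0  # m/s, the value of c used in the text


def theoretical_time_differences(grid_point, sensors):
    """Return t_ik - t_1k for every sensor, for one grid point k.

    grid_point: coordinates of the grid point, e.g. (x, y, z) in metres.
    sensors:    list of sensor coordinates; sensors[0] is the reference.
    """
    times = [math.dist(grid_point, s) / SPEED_OF_SOUND for s in sensors]
    return [t - times[0] for t in times]
```

Because the result depends only on geometry, it could be evaluated once per grid point and cached, which matches the remark that the values can be stored in advance.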

(6) Compare the theoretical time differences of each grid point with the actual time differences of the sudden sound, calculate the sum of squared differences, and select the 3 to 5 grid points with the smallest sums. The sum of squared differences is

T_{k} = Σ_{i = 1}^{n} {(Δ t_{i - 1} - Δ t_{i - 1, k})}^{2} .

"Smallest" here means that the sums of squared differences of the 3 to 5 selected grid points are smaller than those of all other grid points; that is, the T_k are sorted in ascending order and the grid points corresponding to the first 3 to 5 values are selected.
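The computation of T_k and the selection of the smallest values can be sketched in Python as follows. The helper names `squared_difference_sum` and `smallest_indices` are hypothetical, and the default of 3 candidates is simply the lower end of the 3 to 5 range given in the text.

```python
def squared_difference_sum(actual, theoretical):
    """T_k: sum over channels of (actual - theoretical) time difference squared.

    actual:      actual time differences, one per channel.
    theoretical: theoretical time differences for grid point k, same order.
    """
    return sum((a - t) ** 2 for a, t in zip(actual, theoretical))


def smallest_indices(values, count=3):
    """Indices of the `count` smallest entries, in ascending order of value."""
    return sorted(range(len(values)), key=values.__getitem__)[:count]
```

The same `smallest_indices` helper could also rank the standard deviations SD_k, since both selections pick the first few values after an ascending sort.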

(7) Select the 3 to 5 grid points with the smallest standard deviations of sound source sound power level from step (5), and take the intersection of these grid points with those selected in step (6). Using the least squares method, on the principle of minimizing the sum of distances to the grid points in the intersection, identify the grid position corresponding to the honking point, and mark the honking point in the video signal acquired in step (3) according to this grid position.

That is, the standard deviations SD_k are sorted in ascending order and the grid points corresponding to the first 3 to 5 values are selected before the intersection with the grid points determined in step (6) is taken.

When grid distances are calculated in this step, each grid is approximated as a point; the center of the grid can generally be used.
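One possible reading of this step is sketched below in Python. The text does not spell out the least-squares calculation, so the sketch makes an explicit assumption: the target is the point minimizing the sum of squared distances to the grid centers in the intersection (their centroid), snapped to the nearest grid center. The function name `locate_horn` and the two-dimensional coordinates are hypothetical.

```python
def locate_horn(td_candidates, sd_candidates, grid_centers):
    """Return the index of the grid point identified as the honking point.

    td_candidates: grid indices with the smallest T_k (step 6).
    sd_candidates: grid indices with the smallest SD_k (step 7).
    grid_centers:  list of (x, y) grid center coordinates in metres.
    Returns None when the two candidate sets do not overlap.
    """
    common = sorted(set(td_candidates) & set(sd_candidates))
    if not common:
        return None
    # Assumption: the least-squares point for a set of points is their
    # centroid, which minimizes the sum of squared distances to them.
    cx = sum(grid_centers[k][0] for k in common) / len(common)
    cy = sum(grid_centers[k][1] for k in common) / len(common)
    # Snap to the nearest grid center so the result is a grid position.
    return min(
        range(len(grid_centers)),
        key=lambda k: (grid_centers[k][0] - cx) ** 2
        + (grid_centers[k][1] - cy) ** 2,
    )
```

An empty intersection is returned as None here so that the caller can decide how to handle it; the source does not say what should happen in that case.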

To reduce the load on the system, the method preferably further comprises step (8): saving only the sound signal that satisfies the trigger condition in step (3).

The sound signal is saved as a wave file (*.wav). It starts 3 seconds before acquisition of the video signal of the designated area begins and ends 10 seconds after that moment, for a total of 13 seconds; the other temporary *.wav files are deleted.
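The 13-second evidence window can be expressed as a simple slice over the buffered samples. The Python sketch below is illustrative only; the function name `evidence_window`, the in-memory sample list, and the clipping at the buffer edges are assumptions, and writing the result to a *.wav file is left out.

```python
def evidence_window(samples, trigger_index, sample_rate=44100,
                    before_s=3, after_s=10):
    """Return the samples from before_s seconds before the trigger to
    after_s seconds after it (13 s in total with the default values).

    samples:       buffered samples of one channel.
    trigger_index: index of the sample at which video acquisition started.
    sample_rate:   samples per second; 44.1 kHz is the minimum in the text.
    """
    start = max(0, trigger_index - before_s * sample_rate)
    end = min(len(samples), trigger_index + after_s * sample_rate)
    return samples[start:end]
```

Keeping 3 seconds before the trigger implies that at least that much audio must still be held in the temporary buffer when the trigger fires.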

An administrator plays back each *.wav file, listens to confirm whether a motor vehicle horn is present so as to rule out triggering by other sudden sounds, determines the license plate number of the offending vehicle from the corresponding video signal, and imposes a penalty.

The detection system and method of the present invention are practical, easy to operate, and accurate in positioning, and help administrators identify offending vehicles.

Brief description of the drawings

Fig. 1 is a structural block diagram of the detection system of the present invention.

Detailed description of the embodiments

As shown in Fig. 1, the audio- and video-based system for detecting illegal motor vehicle horn honking according to the present invention comprises:

an audio acquisition unit, which is a sound sensor, for acquiring the sound signal of a designated area;

a video acquisition unit, which is a digital camera (or video camera), for acquiring the video signal of the designated area;

a main control unit, which is a microcontroller or microcomputer, for receiving the sound signal, triggering the video acquisition unit (the digital camera or video camera) when the sound signal exceeds a set value, and identifying the sound source position in the acquired video signal;

a storage unit, internal or external, for receiving and storing the sound signal and the video signal via the main control unit.

There are no fewer than 3 sound sensors, arranged in a straight line perpendicular or parallel to the motor vehicle lanes. They may be placed across the road above each lane, or at the boundary between the motor vehicle lane and the slow lane. To satisfy the clearance required by all types of motor vehicle, they are mounted higher than 6 m.

Because a sound sensor generally acquires an analog signal, which must be converted by analog-to-digital conversion into a form the microcomputer can recognize, a sound card is connected between the sound sensors and the microcomputer. The sound sensors are connected to the audio input port of the sound card by signal cables, and the audio output of the sound card is connected to the microcomputer. The sound card may be internal or external; it samples at 16 bits or more, at a sampling rate of 44.1 kHz or more.

The digital camera (or video camera) has a resolution greater than 800×600 and is connected to the microcomputer via USB or another port. When the microcomputer CPU frequency is high (much greater than the sound card sampling frequency), synchronized sampling of the sound signal by every sound sensor channel can be achieved by time-sharing.

An external sound card or external storage unit may be provided with an additional external power supply.

The present invention uses dynamically sampled and recorded audio and video signals to detect illegal honking events, identify the position of the offending vehicle, and keep the minimum audio and video material needed as evidence. It comprises the following steps:

(1) Through the sound card connected to the sound sensor array, sample and record at equal intervals (the interval depends on the sound card sampling frequency) the j-th (j = 1, 2, ...) voltage value V_{ij} of the i-th sound signal channel (i = 1, 2, ..., n, where i is the channel number). The data are saved in sequence, in binary format together with the sampling time, as temporary wave files (*.wav). The file for channel i may be named i.wav; when n sound sensors sample synchronously and save data, there are n temporary files in total.

(2) Using the preset calibration value, convert the voltage value V_{ij} in real time to the actual sound pressure p_{ij}:

p_{ij} = (1.0024 / V_{0}) V_{ij} .

The calibration value is obtained by fixing a calibrator on a sound sensor. The calibrator produces a standard sound signal with a frequency of 1000 Hz and a sound pressure level of 94 dB, which corresponds to a sound pressure of about 1.0024 Pa. If the voltage value sampled under these conditions is V_0, then 1.0024/V_0 is the calibration value of that sound sensor.

From the actual sound pressure p_{ij}, the real-time actual sound pressure level of the sampled value is calculated as L_{ij} = 20 lg(p_{ij}/p_0), where p_0 is the reference sound pressure (2 × 10^{-5} Pa).
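The calibration and level calculation of step (2) can be sketched in Python as follows. The function names `calibration_factor` and `sound_pressure_level` are hypothetical, and taking the magnitude of the voltage is an assumption made because raw audio samples are signed while the text does not say how negative values are handled.

```python
import math

P_REF = 2e-5           # reference sound pressure p0 in Pa (from the text)
CAL_PRESSURE = 1.0024  # Pa, the pressure the text gives for the 94 dB tone


def calibration_factor(v0):
    """Calibration value in Pa per volt for one sensor, given the
    voltage v0 sampled while the calibrator is attached."""
    return CAL_PRESSURE / v0


def sound_pressure_level(v_ij, cal):
    """Convert one sampled voltage into a sound pressure level in dB."""
    p_ij = abs(v_ij) * cal  # assumption: use the magnitude of the sample
    if p_ij == 0.0:
        return float("-inf")  # lg(0) is undefined; treat silence as -inf
    return 20.0 * math.log10(p_ij / P_REF)
```

Feeding the calibration voltage itself back through `sound_pressure_level` should return approximately 94 dB, which is a quick sanity check on a sensor's calibration value.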

(3) If, at any sound sensor, for example sound sensor i, L_{ij} - L_{i(j-1)} > preset value (the preset value is adjustable according to the required sensitivity of the system, with 8 dB or more generally advisable; hereinafter the trigger condition), the current sampled value is considered to contain a sudden sound. The digital camera (or video camera) is triggered and records video frames at equal intervals (saved frame by frame), with a frame interval of less than 0.1 s and at least 3 consecutive frames saved. At the same time, based on the microcomputer CPU clock signal, the time t_i (i = 1, 2, ..., n) is extracted for each channel: the time of the sampled value that satisfies the trigger condition and lies closest to the moment video acquisition started. The actual time difference Δt_{i-1} between the arrival of the sudden sound (such as a horn) at sound sensor i and at sound sensor 1 is then calculated as Δt_{i-1} = t_i - t_1, where t_1 is the time at which the sudden sound reaches sound sensor 1 and t_i is the time at which it reaches each sound sensor. Otherwise, steps (1) and (2) are repeated.

(4) According to the principle of sound superposition, use the formula

L_{i'} = 10 \lg (10^{0.1 L_{ij}} - 10^{0.1 L_{i (j - 1)}})

to subtract the background noise value and obtain the corrected sound pressure level L_{i'} produced by the sudden sound at sound sensor i.

(5) Sound field inversion by the grid method

The monitored ground area is divided into grids (square grids smaller than 1 m × 1 m are advisable). Given the corrected sound pressure level L_{i'} produced at sound sensor i by the sudden sound (a horn or other sudden noise), the sudden sound is treated as a point source sounding at a grid point. Using the non-directional radiation model for semi-free space,

L_{i'} = L_{W_{ik}} - 20 \lg r - 8

(where r is the distance from grid point k to sensor i), with L_{i'} and r known, the sound power level of the point source at grid point k (k = 1, 2, ..., M) can be obtained by inversion:

L_{W_{ik}} = L_{i'} + 20 \lg r + 8 .

From the definition of sound power level,

L_{W_{ik}} = 10 \lg (W_{ik} / W_{0})

(W_0 is the reference sound power, 10^{-12} W), the corresponding sound power W_{ik} (i = 1, 2, ..., n) is calculated, and the standard deviation SD_k of the W_{ik} (i = 1, 2, ..., n) at each grid point is calculated:

{SD}_{k} = \sqrt{\frac{1}{n} Σ_{i = 1}^{n} {(W_{ik} - \frac{1}{n} Σ_{i = 1}^{n} W_{ik})}^{2}} .

From the sound path from each grid point to each sound sensor (the straight-line distance r from the grid point to the sensor) and the speed of sound in air c (taken as 340 m/s), calculate the time required for a sudden sound at the k-th grid point to reach sound sensor i, t_{ik} = r/c (i = 1, 2, ..., n), and then the theoretical time difference between its arrival at sound sensor i and at sound sensor 1, Δt_{i-1, k} = t_{ik} - t_{1k}. Here t_{1k} is the time for a sound from the k-th grid point to reach sound sensor 1, and t_{ik} is the time for it to reach each sound sensor. Once the positions of the grids and sound sensors are fixed, Δt_{i-1, k} can be calculated in advance and stored in the microcomputer.

(6) From the actual time differences Δt_{i-1} given in step (3) and the theoretical time differences Δt_{i-1, k} given in step (5), calculate for each grid point the sum of squared differences T_k between the theoretical time differences Δt_{i-1, k} and the actual time differences Δt_{i-1} of the sudden sound:

T_{k} = Σ_{i = 1}^{n} {(Δ t_{i - 1} - Δ t_{i - 1, k})}^{2} .

Select the 3 to 5 grid points with the smallest sums of squared differences; that is, sort the T_k in ascending order and select the grid points corresponding to the first 3 to 5 values.

(7) Target position identification

Sort the standard deviations SD_k of the sound source sound power level in ascending order, select the grid points corresponding to the first 3 to 5 values, and take the intersection of these grid points with those determined in step (6). Using the least squares method, on the principle of minimizing the sum of distances to the grid points in the intersection, identify the grid position corresponding to the sudden sound (hereinafter the target position). The video image is overlaid to scale on the grid map, and the motor vehicle that overlaps or is nearest to the target position is listed as the suspected honking vehicle (suspected, because the trigger may have been another sudden noise). Its position is marked on the video image and the image is saved.

(8) Dynamic deletion of invalid data

Retain the voltage values V_{ij} recorded by the sound sensor nearest in straight-line distance to the target position, from 3 seconds before the digital camera was triggered to 10 seconds after, 13 seconds in total. Save them permanently as a *.wav wave file, bind the file to the video screenshot of the suspected offending vehicle, and delete the other temporary *.wav files. (The administrator may periodically copy the finally identified wave files and video screenshots directly from the storage medium; a wired or wireless network may also be set up to transmit the data periodically to the administrator's computer.)

(9) The administrator plays back each *.wav file, listens to confirm whether a motor vehicle horn is present so as to rule out triggering by other sudden sounds, finally determines the license plate number of the offending vehicle from the video screenshot, and imposes a penalty.

Claims

1. A method for detecting illegal motor vehicle horn honking based on audio and video, characterized by comprising the following steps:

(1) arranging a plurality of sampling points in a designated area and acquiring sound signals;

(2) calculating the actual sound pressure level of the sound signals using a preset calibration value;

(3) if, in any channel of the sound signals, the difference between the actual sound pressure levels of the current sampled value and the previous sampled value is greater than a preset value, considering the current sampled value to contain a sudden sound and starting acquisition of the video signal of the designated area; at the same time, taking one sampling point as the reference sampling point and calculating the actual time difference between the arrival of the sudden sound at each sampling point and at the reference sampling point;

(4) for the current sampled value satisfying the condition in step (3), subtracting the background value from its actual sound pressure level to obtain a corrected sound pressure level;

(5) dividing the designated area into a number of grids and, from the corrected sound pressure level, calculating by the grid method the standard deviation of the sound source sound power level at each grid point; calculating the theoretical time difference between the arrival of a sound from each grid point at each sampling point and at the reference sampling point;

(6) comparing the theoretical time differences of each grid point with the actual time differences of the sudden sound, calculating the sum of squared differences, and selecting the 3 to 5 grid points with the smallest sums;

2. The method for detecting illegal motor vehicle horn honking based on audio and video as claimed in claim 1, characterized in that only the sound signal satisfying the condition in step (3) is saved.

3. The method for detecting illegal motor vehicle horn honking based on audio and video as claimed in claim 2, characterized in that the saved sound signal starts 3 seconds before acquisition of the video signal of the designated area begins and ends 10 seconds after acquisition of the video signal begins.