CN104243894A - Audio and video fused monitoring method - Google Patents

Audio and video fused monitoring method

Info

Publication number
CN104243894A
Authority
CN
China
Prior art keywords
target
sound
signal
video signal
feature
Prior art date
Legal status
Pending
Application number
CN201310231183.3A
Other languages
Chinese (zh)
Inventor
陈孝良
李晓东
Current Assignee
Institute of Acoustics CAS
Original Assignee
Institute of Acoustics CAS
Priority date
Filing date
Publication date
Application filed by Institute of Acoustics CAS
Priority to CN201310231183.3A
Publication of CN104243894A
Legal status: Pending

Landscapes

  • Image Analysis (AREA)

Abstract

The invention relates to an audio and video fused monitoring method. The method comprises: collecting an audio signal and a video signal and conditioning the collected signals; performing collaborative preprocessing on the conditioned signals; judging whether the obtained signal comprises both an audio signal and a video signal; when both are present, analyzing the audio and video signals jointly and finding, from the result of the fused analysis, the target information they contain; if only an audio signal is present, analyzing the audio signal independently to obtain the target information it contains; and determining, from the obtained target information, whether the pose of the camera needs to be adjusted; if so, the pose is adjusted and the process repeats. Adjusting the camera pose comprises focusing, supplementary lighting, and angle adjustment.

Description

Audio and Video Fused Monitoring Method
Technical field
The present invention relates to the field of surveillance, and in particular to an audio and video fused monitoring method.
Background art
Video monitoring is a primary means of surveillance. Traditional video monitoring relies mainly on low-resolution monocular video sensors, and in the face of complex dynamic scenes and growing demands for intelligent real-time early warning, it faces two major challenges. First, video sensors have a narrow field of view, are easily occluded, and are susceptible to poor-visibility weather and changing light, such as rain, snow, fog, and day-night transitions. Second, detection, localization, and tracking are performed over large volumes of continuous video streams with high algorithmic complexity; the real-time performance of intelligent analysis of high-definition video in particular is poor, and cost and power consumption are also problems, which limits the application of HD video sensors in surveillance.
To address these challenges, extensive research has been carried out at home and abroad on making video monitoring more intelligent and real-time. One line of work extends and deepens intelligent video analysis based on high-level video processing algorithms; methods such as panoramic imaging, stereo cameras, and 3-D modeling compensate to some extent for the narrow field of view of monocular video sensors. Another line of work builds on multi-sensor data fusion theory, using features extracted from multiple homogeneous or heterogeneous sensors to realize object-oriented intelligent analysis. In recent years the video surveillance field has explored multi-camera linkage and the fusion of heterogeneous signals such as GPS, radar, laser, and infrared.
However, sound, a natural signal of interest, has so far received little attention in the surveillance field, mainly because microphone array technology has lagged behind. With the development of array and sensing technology, acoustic detection research based on microphone arrays has made considerable progress, with application demonstrations in fields such as medical monitoring, consumer electronics, border protection, and industrial control. Acoustic detection based on microphone arrays excels at detecting, locating, and tracking dispersed and transient moving targets, and offers low power consumption, all-weather operation, no occlusion, no blind zones, and good real-time performance, making it well suited to surveillance. However, because surveillance scenes are complex and noisy, existing microphone array localization techniques cannot be applied directly to surveillance scene analysis. In addition, acoustic detection yields relatively little information, so a microphone array alone cannot meet the demands of surveillance. At present there is no complete technical scheme for audio and video fused monitoring adapted to the surveillance field.
Summary of the invention
The object of the present invention is to overcome defects of video-only monitoring such as a narrow field of view, susceptibility to environmental conditions, and the small amount of information obtained, by providing an audio and video fused monitoring method based on a microphone array and a pan-tilt camera.
To achieve this object, the invention provides an audio and video fused monitoring method, comprising:
Step 1) collecting audio and video signals, and conditioning the collected signals;
Step 2) performing collaborative preprocessing on the conditioned signals obtained in step 1); the collaborative preprocessing comprises compression, filtering, denoising, and enhancement of the signals;
Step 3) judging whether the signal obtained in step 2) comprises both an audio signal and a video signal; when both are present, performing step 4); if only an audio signal is present, performing step 5);
Step 4) performing fused analysis on the audio and video signals, finding the target information contained in the audio and video signals according to the result of the fused analysis, and then performing step 6);
Step 5) performing independent analysis on the audio signal to obtain the target information contained in it, and then performing step 6);
Step 6) determining, according to the target information obtained in step 4) or step 5), whether the pose of the camera needs to be adjusted; if so, adjusting the pose of the camera and then re-executing step 1); wherein the camera pose adjustment comprises focusing, supplementary lighting, and angle adjustment.
In the above technical scheme, the method further comprises:
Step 7) performing pattern recognition on the current audio and video signals to obtain semantic information about the target event, including keywords, time, bearing, category, and state; the pattern recognition comprises behavior understanding, discrimination control, and state estimation, wherein the behavior understanding extracts motion features to obtain the keywords of the target event; the discrimination control, based on the result of the behavior understanding, further obtains information such as the time and bearing of the event and compares it with the corresponding keyword thresholds to determine the category of the target event; and the state estimation, according to the determined category, estimates the importance of the target event from the preset feature values of that category and sets an alarm level accordingly;
Step 8) capturing key information and core segments from the pattern-recognized audio and video signals, splicing and editing multiple segments into semantic information reflecting the monitored content, compressing and encoding the semantic information, and finally transmitting it over a network.
In the above technical scheme, step 4) comprises:
Step 4-1) extracting background noise data from a background noise database and building a background model; wherein the background noise database stores the background noise of multiple typical scenes under multiple meteorological conditions; the meteorological conditions include special weather such as wind, rain, snow, and fog, and the typical scenes include calls for help, whistles, collisions, explosions, gunshots, low-altitude flight, and crowd gathering;
Step 4-2) extracting multiple pieces of target feature information from a target feature database and combining them with the background noise model built in step 4-1) to obtain virtual target features; wherein the target feature database stores target features, including basic features, transform-domain features, statistical features, and motion features of the audio or video signal, together with their information in time, space, spectrum, and phase;
Step 4-3) comparing the audio and video signals generated in step 2) with the virtual target features generated in step 4-2) to extract the target features of the audio and video signals generated in step 2);
Step 4-4) using Bayesian analysis on the target feature extraction result of step 4-3) to make a probabilistic decision, finding the events contained in the collected audio and video signals by maximum a posteriori probability;
Step 4-5) applying, to the target detected in step 4-4), beamforming and direction-of-arrival estimation methods based on the target features and the background noise model, and computing, according to the laws of acoustic signal propagation, the energy, phase, and Doppler effect of a moving sound source target in open-space and enclosed-space environments, so as to locate the target and determine its coordinates;
Step 4-6) tracking the located target.
In the above technical scheme, between step 4-3) and step 4-4), steps 4-1) to 4-3) are also executed multiple times.
In the above technical scheme, in step 4-3), comparing the audio and video signals with the virtual target features yields a set of target feature values; these values are sorted from high to low by similarity, and the feature values above a preset threshold in the sorted result constitute the target feature extraction result of the audio and video signals.
In the above technical scheme, in step 4-6), the tracking comprises controlling the camera pose according to the coordinates determined by the microphone array, realizing focusing, supplementary lighting, and angle adjustment.
In the above technical scheme, step 5) comprises:
Step 5-1) extracting background noise data from the background noise database and building a background model, and extracting target features from the target feature database;
Step 5-2) applying beamforming and direction-of-arrival estimation methods based on the target features and the background noise model, and computing, according to the laws of acoustic signal propagation, the contributions of the energy, phase, and Doppler effect of a moving sound source target in open-space and enclosed-space environments to the models for detecting, locating, and tracking distributed targets, thereby performing optimized recognition, classification, localization, and tracking of acoustic targets.
In the above technical scheme, in step 6), step 1) is re-executed no more than 3 times.
The advantages of the invention are:
1) The invention introduces acoustic features as parameters into video detection and tracking algorithms. Acoustic signal processing has low algorithmic complexity and good real-time performance, which can improve the performance of video target recognition and tracking algorithms.
2) The invention extracts compound features that fuse the two heterogeneous signals, audio and video, making up for the shortcomings of traditional video surveillance; it offers all-weather, occlusion-free, blind-zone-free detection, localization, and tracking, and can improve the response speed of the surveillance system.
3) The invention performs automatic analysis and semantic understanding of audio and video data, captures key information and core segments from the monitored scene, splices and edits them into semantic information reflecting the monitored content, and transmits it over the network after compression and encoding, which can avoid the ever-growing mass of data in surveillance networks.
4) The invention integrates the collection, analysis, computation, and communication of multi-channel audio and video signals into one device, solving the problem that oversized microphone arrays are hard to install, while supporting wireless transmission and power line communication (PLC), which can avoid the higher cost of extensive cabling.
Brief description of the drawings
Fig. 1 is a flowchart of the audio and video fused monitoring method of the present invention;
Fig. 2 is a schematic diagram of camera pose adjustment.
Detailed description
The invention is further described below with reference to the accompanying drawings.
The audio and video fused monitoring method of the present invention monitors a scene using the sound signal obtained by a microphone array and the video signal obtained by a camera.
Before the steps of the method are described in detail, the concepts involved are first explained.
Target feature database: a target is an unexpected abnormal event in the monitored scene. The target feature database stores target features, including basic features, transform-domain features, statistical features, and motion features of the audio or video signal, together with their information in time, space, spectrum, and phase (e.g. mean, variance, cepstrum, envelope).
Background noise database: stores the background noise of multiple typical scenes under multiple meteorological conditions. The meteorological conditions include special weather such as wind, rain, snow, and fog; the typical scenes include calls for help, whistles, collisions, explosions, gunshots, low-altitude flight, and crowd gathering.
The method of the present invention is described below with reference to the drawings.
Referring to Fig. 1, the method comprises the following steps:
Step 1) collecting audio and video signals, and conditioning the collected signals.
In this step, the video signal is collected by a camera and the audio signal by a microphone array. Normally the collected signal contains both audio and video; however, when the microphone array or the camera fails, the collected signal may contain only audio or only video, and subsequent operations can still proceed in such cases.
Step 2) performing collaborative preprocessing on the conditioned signals obtained in step 1).
In this step, the collaborative preprocessing comprises compression, filtering, denoising, and enhancement performed in turn on the signal; the implementations of these operations are well known to those skilled in the art and are not repeated here.
If the signal collected in step 1) contains both audio and video, collaborative compression and collaborative filtering are used when compressing and filtering; if it contains only audio or only video, it is processed as a single signal.
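The patent does not fix concrete preprocessing algorithms. As a minimal sketch only, the pipeline below chains toy denoising and enhancement stages and falls back to single-signal processing when one modality is missing; every function and parameter name here is hypothetical:

```python
import numpy as np

def denoise(x, alpha=0.9):
    """Toy spectral-subtraction denoising: subtract a crude noise floor."""
    spec = np.fft.rfft(x)
    mag, phase = np.abs(spec), np.angle(spec)
    floor = alpha * np.median(mag)                 # crude noise-floor estimate
    mag = np.maximum(mag - floor, 0.0)
    return np.fft.irfft(mag * np.exp(1j * phase), n=len(x))

def enhance(x):
    """Peak normalization standing in for 'enhancement'."""
    peak = np.max(np.abs(x))
    return x / peak if peak > 0 else x

def collaborative_preprocess(audio=None, video=None):
    """Both modalities present -> joint ('collaborative') path;
    otherwise each available signal is processed on its own."""
    if audio is not None and video is not None:
        # Collaborative mode: e.g. align both streams to one time base
        # before filtering (alignment omitted in this sketch).
        return enhance(denoise(audio)), video
    if audio is not None:
        return enhance(denoise(audio)), None
    return None, video
```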
Step 3) judging whether the signal obtained in step 2) contains both an audio signal and a video signal; when both are present, performing step 4); if only an audio signal is present, performing step 5).
As mentioned above, the collected signal may contain only a video signal or only an audio signal; the analysis of a video-only signal is outside the scope of this application.
Step 4) performing fused analysis on the audio and video signals, finding the target information contained in them according to the result of the fused analysis, and then performing step 6).
Step 5) performing independent analysis on the audio signal to obtain the target information contained in it, and then performing step 6).
Step 6) determining, according to the target information obtained in step 4) or step 5), whether the camera pose needs to be adjusted. Referring to Fig. 2, when adjusting the pose, the current pose of the camera is first sensed; then, according to the direction and range of the acoustic target determined from the sound signal received by the microphone array, the difference between the current pose and the target pose is computed, and the pose is adjusted accordingly. The pose adjustment comprises operations such as focusing, supplementary lighting, and angle adjustment.
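A minimal sketch of the pose-difference computation, assuming the microphone array supplies the target's azimuth, elevation, and range, and that zoom is chosen from range; the `Pose` fields and the 10-meters-per-zoom-step rule are invented for illustration:

```python
from dataclasses import dataclass

@dataclass
class Pose:
    pan_deg: float    # horizontal angle of the pan-tilt head
    tilt_deg: float   # vertical angle
    zoom: float       # focal-length factor

def pose_delta(current: Pose, az_deg: float, el_deg: float, range_m: float) -> Pose:
    """Difference between the current camera pose and the pose that points
    at the acoustic target reported by the microphone array."""
    target_zoom = max(1.0, range_m / 10.0)   # hypothetical: one zoom step per 10 m
    return Pose(pan_deg=az_deg - current.pan_deg,
                tilt_deg=el_deg - current.tilt_deg,
                zoom=target_zoom - current.zoom)

# Target heard at azimuth 40 deg, elevation 5 deg, 35 m away:
print(pose_delta(Pose(10.0, 0.0, 1.0), 40.0, 5.0, 35.0))
# Pose(pan_deg=30.0, tilt_deg=5.0, zoom=2.5)
```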
After the camera pose is adjusted, step 1) can be re-executed to re-collect or supplement the signal; the new result undergoes collaborative preprocessing and audio-video fused analysis as in the preceding steps, and the outcome can be used to adjust the camera pose further. This positioning loop is repeated at most 3 times, to guarantee convergence and speed, and is automatically disabled during tracking.
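The re-acquisition loop just described might be organized as below, with the 3-pass cap and the tracking lockout made explicit; `acquire`, `analyze`, and `adjust_camera` are hypothetical stand-ins for steps 1) through 6):

```python
MAX_RELOCATE = 3   # the method caps the positioning loop at 3 passes

def positioning_loop(acquire, analyze, adjust_camera, tracking_active=False):
    """Repeat acquire -> analyze -> adjust until the pose is good enough,
    at most MAX_RELOCATE times; skipped entirely while a track is active."""
    if tracking_active:                        # loop is disabled during tracking
        return None
    target = None
    for _ in range(MAX_RELOCATE):
        signals = acquire()                    # step 1): (re)collect signals
        target, pose_ok = analyze(signals)     # steps 2)-5): preprocess and fuse
        if pose_ok:                            # step 6): no adjustment needed
            break
        adjust_camera(target)                  # focusing / lighting / angle
    return target
```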
The above describes the basic steps of the method. As a preferred implementation, in another embodiment the method further comprises:
Step 7) performing pattern recognition on the current audio and video signals to obtain semantic information about the target event, such as keywords, time, bearing, category, and state. The pattern recognition comprises behavior understanding, discrimination control, and state estimation. Behavior understanding mainly extracts motion features to obtain the keywords of the target event, for example "collision" or "explosion"; discrimination control, based mainly on the result of behavior understanding, further obtains information such as the time and bearing of the event and compares it with the corresponding keyword thresholds to determine the category of the target event; state estimation, mainly according to the determined category, estimates the importance of the target event from the preset feature values of that category and sets an alarm level accordingly.
Step 8) capturing key information and core segments from the pattern-recognized audio and video signals, splicing and editing multiple segments into semantic information reflecting the monitored content, compressing and encoding the semantic information, and finally transmitting it over a network.
Through steps 7) and 8), the audio and video signals obtained during monitoring can be retrieved conveniently in subsequent operations, improving retrieval efficiency and aiding further use of the monitored information.
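As an illustration of the decision chain in step 7) (behavior understanding, then discrimination control, then state estimation), the sketch below maps a detected keyword and its score to an event record and an alarm level; the thresholds, importance weights, and level scale are all invented:

```python
# Hypothetical per-keyword detection thresholds and importance weights.
KEYWORD_THRESHOLDS = {"collision": 0.6, "explosion": 0.4, "gunshot": 0.5}
IMPORTANCE = {"collision": 2, "explosion": 5, "gunshot": 5}

def discriminate(keyword, score, time_s, bearing_deg):
    """Discrimination control: accept the event only if its score clears the
    keyword's threshold, then attach the time/bearing semantics."""
    if score < KEYWORD_THRESHOLDS.get(keyword, 1.0):
        return None
    return {"keyword": keyword, "time_s": time_s, "bearing_deg": bearing_deg}

def alarm_level(event, score):
    """State estimation: scale the preset importance by the detection score."""
    weight = IMPORTANCE[event["keyword"]]
    return min(5, round(weight * score) + 1)   # alarm levels 1..5, invented scale

event = discriminate("explosion", score=0.8, time_s=123.4, bearing_deg=42.0)
if event is not None:
    print(alarm_level(event, 0.8))   # -> 5
```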
The implementation of the relevant steps of the method is further described below.
In step 4), the fused analysis of the audio and video signals comprises several sub-steps:
Step 4-1) extracting background noise data from the background noise database and building a background model.
As described above, the background noise database contains the background noise of multiple typical scenes under multiple meteorological conditions. In this step, background noise data matching the external conditions at the time of monitoring are selected from the database and used to build the model.
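One plausible reading of the background modeling in step 4-1) is to pool the stored noise clips that match the current weather and scene and summarize their spectra; the database layout and the mean/std statistics below are assumptions, not the patent's specification:

```python
import numpy as np

def build_background_model(noise_db, weather, scene, n_fft=512):
    """noise_db: dict keyed by (weather, scene) -> list of 1-D noise clips.
    Returns mean and std of the magnitude spectrum across the matching clips."""
    clips = noise_db[(weather, scene)]
    specs = np.stack([np.abs(np.fft.rfft(c[:n_fft], n=n_fft)) for c in clips])
    return specs.mean(axis=0), specs.std(axis=0)

# Hypothetical database with two recorded clips of a rainy street scene.
rng = np.random.default_rng(0)
noise_db = {("rain", "street"): [rng.normal(size=4096) for _ in range(2)]}
mean_spec, std_spec = build_background_model(noise_db, "rain", "street")
```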
Step 4-2) extracting multiple pieces of target feature information from the target feature database and combining them with the background noise model built in step 4-1) to obtain virtual target features.
Step 4-3) comparing the audio and video signals generated in step 2) with the virtual target features generated in step 4-2) to extract the target features of the audio and video signals generated in step 2).
In this step, comparing the audio and video signals with the virtual target features yields a set of target feature values; these values are sorted from high to low by similarity, and the feature values above a preset threshold in the sorted result are exactly the target feature extraction result of the audio and video signals.
It should be noted that when the method runs on a resource-constrained embedded operating system, it may not be possible to read all of the information in the target feature database and the background noise database at once; in that case steps 4-1) to 4-3) are executed multiple times to obtain a more accurate target feature extraction result.
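A minimal sketch of the matching in steps 4-2) and 4-3), with cosine similarity standing in for whatever comparison a real implementation would use, plus the chunked database reads suggested for embedded systems; the 0.8 threshold is arbitrary:

```python
import numpy as np

def match_targets(signal_feat, virtual_feats, threshold=0.8):
    """Compare the observed feature vector with each virtual target feature,
    sort by similarity (high to low), keep those above the preset threshold."""
    def cosine(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    scored = [(name, cosine(signal_feat, v)) for name, v in virtual_feats.items()]
    scored.sort(key=lambda kv: kv[1], reverse=True)
    return [(name, s) for name, s in scored if s >= threshold]

def match_in_chunks(signal_feat, feature_db_chunks, threshold=0.8):
    """Embedded variant: the feature database is read chunk by chunk (one pass
    of steps 4-1)..4-3) per chunk) and the partial results are merged."""
    hits = []
    for chunk in feature_db_chunks:
        hits.extend(match_targets(signal_feat, chunk, threshold))
    hits.sort(key=lambda kv: kv[1], reverse=True)
    return hits
```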
Step 4-4) using Bayesian analysis on the target feature extraction result of step 4-3) to make a probabilistic decision, finding the events contained in the collected audio and video signals by maximum a posteriori probability.
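Step 4-4) reads as standard maximum a posteriori classification over event hypotheses. A sketch under that reading, with invented priors and one-dimensional Gaussian likelihoods:

```python
import numpy as np

def map_event(feature, events):
    """Return the event maximizing posterior ∝ prior × likelihood; `events`
    maps name -> (prior, mean, std) of a 1-D Gaussian feature model."""
    best, best_log_post = None, -np.inf
    for name, (prior, mu, sigma) in events.items():
        log_lik = -0.5 * ((feature - mu) / sigma) ** 2 - np.log(sigma)
        log_post = np.log(prior) + log_lik        # unnormalized log-posterior
        if log_post > best_log_post:
            best, best_log_post = name, log_post
    return best

# Invented models: an 'explosion' has a much larger energy feature than background.
events = {"background": (0.9, 0.1, 0.05), "explosion": (0.1, 0.9, 0.1)}
print(map_event(0.85, events))   # -> 'explosion'
```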
Step 4-5) applying, to the target detected in step 4-4), beamforming and direction-of-arrival estimation methods refined by the target features and the background noise model, and computing, according to the laws of acoustic signal propagation, the energy, phase, and Doppler effect of a moving sound source target in open-space and enclosed-space environments, so as to locate the target and determine its coordinates.
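For the direction-of-arrival part of step 4-5), a classic delay-and-sum beamformer scanned over candidate angles is one possible realization; the sketch assumes a uniform linear array, far-field propagation, and a known speed of sound, none of which the patent fixes:

```python
import numpy as np

C = 343.0   # speed of sound, m/s

def doa_delay_and_sum(frames, mic_x, fs, angles_deg):
    """frames: (n_mics, n_samples) array; mic_x: mic positions on a line (m).
    Steer to each candidate angle, sum coherently, return the max-power angle."""
    n_mics, n = frames.shape
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    spectra = np.fft.rfft(frames, axis=1)
    best_angle, best_power = None, -np.inf
    for ang in angles_deg:
        delays = mic_x * np.sin(np.deg2rad(ang)) / C     # far-field delays (s)
        # Undo each channel's propagation delay by a phase shift, then sum.
        steering = np.exp(2j * np.pi * freqs[None, :] * delays[:, None])
        power = np.sum(np.abs(np.sum(spectra * steering, axis=0)) ** 2)
        if power > best_power:
            best_angle, best_power = ang, power
    return best_angle

# 4-mic array, 8 cm spacing; simulate a 1 kHz tone arriving from 30 degrees.
fs, f0 = 16000, 1000.0
mic_x = np.arange(4) * 0.08
t = np.arange(1024) / fs
delays = mic_x * np.sin(np.deg2rad(30.0)) / C
frames = np.stack([np.cos(2 * np.pi * f0 * (t - d)) for d in delays])
print(doa_delay_and_sum(frames, mic_x, fs, np.arange(-90, 91, 1)))  # ≈ 30
```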
Step 4-6) tracking the located target. The tracking comprises controlling the pose of the pan-tilt camera according to the coordinates determined by the microphone array, realizing operations such as focusing, supplementary lighting, and angle adjustment, ensuring that video of the intended target can be captured continuously and stably under multi-target conditions, and realizing fast and accurate switching between multiple targets.
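The fast switching between multiple targets in step 4-6) suggests keeping tracked targets in a priority order; a toy scheduler under that assumption (the priorities, IDs, and coordinates are made up):

```python
import heapq

def next_target(tracks):
    """tracks: list of (priority, target_id, coords); highest priority first.
    Returns the target the pan-tilt camera should switch to next."""
    heap = [(-priority, target_id, coords) for priority, target_id, coords in tracks]
    heapq.heapify(heap)
    _, target_id, coords = heapq.heappop(heap)
    return target_id, coords

print(next_target([(2, "car", (5.0, 1.0, 0.0)), (5, "gunshot", (20.0, 3.0, 1.5))]))
# -> ('gunshot', (20.0, 3.0, 1.5))
```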
In step 5), the independent analysis of the audio signal comprises:
Step 5-1) extracting background noise data from the background noise database and building a background model, and extracting target features from the target feature database.
Step 5-2) applying beamforming and direction-of-arrival estimation methods based on the target features and the background noise model, and computing, according to the laws of acoustic signal propagation, the contributions of the energy, phase, and Doppler effect of a moving sound source target in open-space and enclosed-space environments to the models for detecting, locating, and tracking distributed targets, thereby performing optimized recognition, classification, localization, and tracking of acoustic targets.
Finally, it should be noted that the above embodiments merely illustrate rather than limit the technical scheme of the invention. Although the invention has been described in detail with reference to these embodiments, those of ordinary skill in the art should understand that modifications or equivalent replacements that do not depart from the spirit and scope of the technical scheme are all encompassed within the claims of the invention.

Claims (8)

1. An audio and video fused monitoring method, comprising:
Step 1) collecting audio and video signals, and conditioning the collected signals;
Step 2) performing collaborative preprocessing on the conditioned signals obtained in step 1), the collaborative preprocessing comprising compression, filtering, denoising, and enhancement of the signals;
Step 3) judging whether the signal obtained in step 2) comprises both an audio signal and a video signal; when both are present, performing step 4); if only an audio signal is present, performing step 5);
Step 4) performing fused analysis on the audio and video signals, finding the target information contained in the audio and video signals according to the result of the fused analysis, and then performing step 6);
Step 5) performing independent analysis on the audio signal to obtain the target information contained in it, and then performing step 6);
Step 6) determining, according to the target information obtained in step 4) or step 5), whether the pose of the camera needs to be adjusted; if so, adjusting the pose of the camera and then re-executing step 1); wherein the camera pose adjustment comprises focusing, supplementary lighting, and angle adjustment.
2. The audio and video fused monitoring method according to claim 1, characterized in that it further comprises:
Step 7) performing pattern recognition on the current audio and video signals to obtain semantic information about the target event, comprising keywords, time, bearing, category, and state; the pattern recognition comprising behavior understanding, discrimination control, and state estimation, wherein the behavior understanding extracts motion features to obtain the keywords of the target event; the discrimination control, based on the result of the behavior understanding, further obtains information such as the time and bearing of the event and compares it with the corresponding keyword thresholds to determine the category of the target event; and the state estimation, according to the determined category, estimates the importance of the target event from the preset feature values of that category and sets an alarm level accordingly;
Step 8) capturing key information and core segments from the pattern-recognized audio and video signals, splicing and editing multiple segments into semantic information reflecting the monitored content, compressing and encoding the semantic information, and finally transmitting it over a network.
3. The audio and video fused monitoring method according to claim 1 or 2, characterized in that step 4) comprises:
Step 4-1) extracting background noise data from a background noise database and building a background model; wherein the background noise database stores the background noise of multiple typical scenes under multiple meteorological conditions; the meteorological conditions comprise special weather such as wind, rain, snow, and fog, and the typical scenes comprise calls for help, whistles, collisions, explosions, gunshots, low-altitude flight, and crowd gathering;
Step 4-2) extracting multiple pieces of target feature information from a target feature database and combining them with the background noise model built in step 4-1) to obtain virtual target features; wherein the target feature database stores target features, comprising basic features, transform-domain features, statistical features, and motion features of the audio or video signal, together with their information in time, space, spectrum, and phase;
Step 4-3) comparing the audio and video signals generated in step 2) with the virtual target features generated in step 4-2) to extract the target features of the audio and video signals generated in step 2);
Step 4-4) using Bayesian analysis on the target feature extraction result of step 4-3) to make a probabilistic decision, finding the events contained in the collected audio and video signals by maximum a posteriori probability;
Step 4-5) applying, to the target detected in step 4-4), beamforming and direction-of-arrival estimation methods based on the target features and the background noise model, and computing, according to the laws of acoustic signal propagation, the energy, phase, and Doppler effect of a moving sound source target in open-space and enclosed-space environments, so as to locate the target and determine its coordinates;
Step 4-6) tracking the located target.
4. The audio and video fused monitoring method according to claim 3, characterized in that between step 4-3) and step 4-4), steps 4-1) to 4-3) are also executed multiple times.
5. The audio and video fused monitoring method according to claim 3, characterized in that in step 4-3), comparing the audio and video signals with the virtual target features yields a set of target feature values; these values are sorted from high to low by similarity, and the feature values above a preset threshold in the sorted result constitute the target feature extraction result of the audio and video signals.
6. The audio and video fused monitoring method according to claim 3, characterized in that in step 4-6), the tracking comprises controlling the camera pose according to the coordinates determined by the microphone array, realizing focusing, supplementary lighting, and angle adjustment.
7. The audio and video fused monitoring method according to claim 1 or 2, characterized in that step 5) comprises:
Step 5-1) extracting background noise data from the background noise database and building a background model, and extracting target features from the target feature database;
Step 5-2) applying beamforming and direction-of-arrival estimation methods based on the target features and the background noise model, and computing, according to the laws of acoustic signal propagation, the contributions of the energy, phase, and Doppler effect of a moving sound source target in open-space and enclosed-space environments to the models for detecting, locating, and tracking distributed targets, thereby performing optimized recognition, classification, localization, and tracking of acoustic targets.
8. The audio and video fused monitoring method according to claim 1 or 2, characterized in that in step 6), step 1) is re-executed no more than 3 times.
CN201310231183.3A 2013-06-09 2013-06-09 Audio and video fused monitoring method Pending CN104243894A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310231183.3A CN104243894A (en) 2013-06-09 2013-06-09 Audio and video fused monitoring method


Publications (1)

Publication Number Publication Date
CN104243894A true CN104243894A (en) 2014-12-24

Family

ID=52231136

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310231183.3A Pending CN104243894A (en) 2013-06-09 2013-06-09 Audio and video fused monitoring method

Country Status (1)

Country Link
CN (1) CN104243894A (en)


Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1381131A (en) * 2000-03-21 2002-11-20 皇家菲利浦电子有限公司 Hands-free home video production camcorder
CN1586074A (en) * 2001-11-13 2005-02-23 皇家飞利浦电子股份有限公司 A system and method for providing an awareness of remote people in the room during a videoconference
US20030174210A1 (en) * 2002-03-04 2003-09-18 Nokia Corporation Video surveillance method, video surveillance system and camera application module
CN101017591A (en) * 2007-02-06 2007-08-15 重庆大学 Video safety prevention and monitoring method based on biology sensing and image information fusion
CN101030323A (en) * 2007-04-23 2007-09-05 凌子龙 Automatic evidence collecting device on crossroad for vehicle horning against traffic regulation
CN101364408A (en) * 2008-10-07 2009-02-11 西安成峰科技有限公司 Sound image combined monitoring method and system
CN101753992A (en) * 2008-12-17 2010-06-23 深圳市先进智能技术研究所 Multi-mode intelligent monitoring system and method
CN101771814A (en) * 2009-12-29 2010-07-07 天津市亚安科技电子有限公司 Pan and tilt camera with sound identification and positioning function

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107223332A (en) * 2015-03-19 2017-09-29 英特尔公司 Audio-visual scene analysis based on acoustics camera
CN107223332B (en) * 2015-03-19 2021-02-05 英特尔公司 Audio visual scene analysis based on acoustic camera
CN106202903A (en) * 2016-07-04 2016-12-07 广州瑞康本圣生物科技有限公司 Internet of Things wisdom hospital event Flow driving engine method
CN108389586A (en) * 2017-05-17 2018-08-10 宁波桑德纳电子科技有限公司 A kind of long-range audio collecting device, monitoring device and long-range collection sound method
CN110532888A (en) * 2019-08-01 2019-12-03 悉地国际设计顾问(深圳)有限公司 A kind of monitoring method, apparatus and system
CN112396801A (en) * 2020-11-16 2021-02-23 苏州思必驰信息科技有限公司 Monitoring alarm method, monitoring alarm device and storage medium

Similar Documents

Publication Publication Date Title
CN107818571B (en) Ship automatic tracking method and system based on deep learning network and average drifting
JP5385893B2 (en) POSITIONING SYSTEM AND SENSOR DEVICE
CN104243894A (en) Audio and video fused monitoring method
CN110991289A (en) Abnormal event monitoring method and device, electronic equipment and storage medium
CN112270680B (en) Low altitude unmanned detection method based on sound and image fusion
CN103198838A (en) Abnormal sound monitoring method and abnormal sound monitoring device used for embedded system
Andersson et al. Fusion of acoustic and optical sensor data for automatic fight detection in urban environments
US20170019639A1 (en) Integrated monitoring cctv, abnormality detection apparatus, and method for operating the apparatus
CN102254394A (en) Antitheft monitoring method for poles and towers in power transmission line based on video difference analysis
CN112261719B (en) Area positioning method combining SLAM technology with deep learning
CN113096397A (en) Traffic jam analysis method based on millimeter wave radar and video detection
CN115034324B (en) Multi-sensor fusion perception efficiency enhancement method
CN105809890A (en) School-bus-safety-oriented missed-child detecting method
CN111353496B (en) Real-time detection method for infrared dim targets
CN105825520A (en) Monocular SLAM (Simultaneous Localization and Mapping) method capable of creating large-scale map
CN110377066A (en) A kind of control method of inspection device, device and equipment
CN108965789B (en) Unmanned aerial vehicle monitoring method and audio-video linkage device
CN107390164B (en) A kind of continuous tracking method of underwater distributed multi-source target
CN202958578U (en) Bird situation monitoring and bird repelling system for airport
CN110597077B (en) Method and system for realizing intelligent scene switching based on indoor positioning
CN105590021B (en) Dynamic quantity audio source tracking method based on microphone array
CN111784750A (en) Method, device and equipment for tracking moving object in video image and storage medium
CN109188419B (en) Method and device for detecting speed of obstacle, computer equipment and storage medium
CN113432276B (en) Method and equipment for automatically adjusting air conditioner and air conditioner
CN115359329A (en) Unmanned aerial vehicle tracking and identifying method and system based on audio-visual cooperation

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication (application publication date: 20141224)