WO2024065389A1 - Method, system and electronic device for detecting camera interference - Google Patents


Info

Publication number
WO2024065389A1
PCT/CN2022/122563 (CN2022122563W)
Authority
WO
WIPO (PCT)
Prior art keywords
image
target
response
camera
target image
Prior art date
Application number
PCT/CN2022/122563
Other languages
English (en)
French (fr)
Inventor
李飞 (Li Fei)
Original Assignee
京东方科技集团股份有限公司 (BOE Technology Group Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 京东方科技集团股份有限公司 (BOE Technology Group Co., Ltd.)
Priority to PCT/CN2022/122563 (published as WO2024065389A1)
Priority to CN202280003367.1A (published as CN118120222A)
Publication of WO2024065389A1

Classifications

    • H — ELECTRICITY
    • H04 — ELECTRIC COMMUNICATION TECHNIQUE
    • H04N — PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 — Details of television systems
    • H04N5/14 — Picture signal circuitry for video frequency region

Definitions

  • the present disclosure relates to the field of intelligent video surveillance, and in particular to a method for detecting camera interference, a system for detecting camera interference, and an electronic device.
  • the video sequence is automatically analyzed by computer vision analysis methods to achieve moving target detection, classification, recognition, tracking, etc.
  • the behavior of the target is analyzed by pre-set rules to provide a reference for taking further measures (such as automatically alarming when the object enters the defense area).
  • the camera is interfered with by human or other unpredictable factors, the area monitored by the camera is inconsistent with the area the user wants to monitor, which will seriously affect the performance of the intelligent video surveillance system.
  • traditional algorithms are sensitive to illumination when performing image processing. For example, for the same target in two frames of images, the tracking results obtained are very inaccurate when the illumination changes greatly.
  • camera interference refers to drastic changes in the surveillance image that last for a certain period of time, while some short-term accidental changes are considered normal.
  • the main types of camera interference include: camera obstruction, camera offset, camera out of focus, camera tremor, abnormal camera image brightness, camera image distortion, camera failure, etc.
  • Embodiments of the present disclosure provide a method for detecting camera interference, a system for detecting camera interference, and an electronic device.
  • the present disclosure provides a method for detecting camera interference, comprising: acquiring multiple frames of images through the camera, and selecting at least one image connected area of the first frame of image as at least one tracking template; based on the at least one tracking template, performing tracking calculation on each target image to obtain at least one feature response map corresponding to the at least one tracking template for each target image; wherein the target image is a frame of image in the multiple frames after the first frame of image; and judging whether the camera is interfered with based on the at least one feature response map of each target image.
  • selecting at least one image connected area of a first frame image as at least one tracking template includes: performing image segmentation on the first frame image using a selective search algorithm to generate multiple image connected areas; performing contour extraction on the multiple image connected areas using a morphological method to obtain contour features of the multiple image connected areas; and screening the multiple image connected areas using a rectangularity feature, and selecting at least one image connected area with the largest rectangularity among the multiple image connected areas as at least one tracking template.
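As an illustrative sketch (not part of the patent), the screening step above can be expressed in Python. The helper names `rectangularity` and `select_templates` are invented for this example, and an axis-aligned bounding box stands in for the smallest circumscribed rectangle used in the disclosure:

```python
def rectangularity(pixels):
    """Area of a connected region (its pixel count) divided by the area
    of its bounding rectangle. The disclosure uses the smallest
    circumscribed rectangle; an axis-aligned box is used here for
    simplicity."""
    rows = [r for r, _ in pixels]
    cols = [c for _, c in pixels]
    box = (max(rows) - min(rows) + 1) * (max(cols) - min(cols) + 1)
    return len(pixels) / box

def select_templates(regions, n=4):
    """Screen the candidate connected regions and keep the n regions
    with the largest rectangularity as tracking templates."""
    return sorted(regions, key=rectangularity, reverse=True)[:n]
```

A solid rectangular region scores 1.0 and is preferred over irregular shapes, which matches the intuition that near-rectangular, clearly bounded areas make stable templates.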
  • the feature response map is calculated by the following formula (Formula 1): RF(z, x) = φ(z) * φ(x) + b, where * denotes the cross-correlation of the two feature maps
  • RF represents a feature response map of each target area in at least one target area corresponding to the at least one tracking template in each target image
  • z represents the first frame image
  • x represents each target image
  • φ(z) represents the feature map of each tracking template of the first frame image; φ(x) represents the feature map of each target image
  • b represents a constant.
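As a minimal single-channel sketch of the cross-correlation behind Formula 1 (the actual SiamFC algorithm correlates deep, multi-channel feature maps produced by a shared convolutional network; the function name here is invented):

```python
import numpy as np

def response_map(template_feat, target_feat, b=0.0):
    """RF(z, x) = phi(z) * phi(x) + b: slide the template feature map
    over the target feature map and record the correlation at every
    offset (single-channel, no-padding sketch)."""
    th, tw = template_feat.shape
    H, W = target_feat.shape
    rf = np.empty((H - th + 1, W - tw + 1))
    for i in range(rf.shape[0]):
        for j in range(rf.shape[1]):
            rf[i, j] = np.sum(template_feat * target_feat[i:i+th, j:j+tw]) + b
    return rf
```

The peak of the resulting map marks the offset where the target region best matches the template, which is what the later response-score step thresholds.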
  • judging whether the camera is interfered with according to at least one feature response map of each target image includes: obtaining a response score value of a target area of the target image corresponding to the at least one tracking template based on each feature response map of the at least one feature response map of each target image; obtaining a response score value of each target area in each target image for the at least one tracking template to judge whether the camera is interfered with; wherein obtaining the response score value of the target area of the target image corresponding to the at least one tracking template includes: processing each value in the feature response map using a sigmoid function; and then taking the largest value among the processed values as the response score value of the target area of the target image corresponding to the at least one tracking template; wherein the sigmoid function is as follows: f(x) = 1 / (1 + e^(-x))
  • x represents each value in the feature response map
  • f(x) represents a function for processing each value in the feature response map
  • judging whether the camera is interfered with according to at least one feature response map of each target image further includes: comparing the response score value of each target area in each target image with a first response threshold; in response to the response score value of the target area being less than the first response threshold, assigning a label value of 0 to the target area; in response to the response score value of the target area being greater than the first response threshold, assigning a label value of 1 to the target area; averaging the label values of all target areas in each target image to calculate the total score ratio of all target areas in each target image: S = (1/N) · Σ s_i (i = 1, ..., N), where s_i is the label value of the i-th target area
  • S represents the total score ratio of all target areas in each target image
  • N is the number of target areas in each target image, and N is a positive integer
  • i is a positive integer and 1 ≤ i ≤ N.
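The score-and-label computation described above can be sketched as follows (an illustrative sketch only; the function and parameter names are invented, not from the disclosure):

```python
import math

def response_score(rf_values):
    """Pass every value of a feature response map through the sigmoid
    f(x) = 1 / (1 + e^(-x)) and keep the largest result as the target
    area's response score (always in the range 0 to 1)."""
    return max(1.0 / (1.0 + math.exp(-v)) for v in rf_values)

def total_score_ratio(scores, t11=0.5):
    """Label each target area 1 when its response score exceeds the
    first response threshold T11 (else 0), then average the labels to
    obtain the total score ratio S."""
    labels = [1 if s > t11 else 0 for s in scores]
    return sum(labels) / len(labels)
```

Because the sigmoid is monotonically increasing, taking the maximum after squashing is equivalent to squashing the maximum; the sketch follows the order stated in the disclosure.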
  • judging whether the camera is interfered with according to at least one feature response map of each target image further comprises: comparing the total score ratio of all target areas in each target image with a second response threshold; judging that the camera is blocked in response to the total score ratio of all target areas in each frame of K frame images being less than the second response threshold; judging that the camera is not blocked in response to the total score ratio of all target areas in each target image being greater than or equal to the second response threshold; wherein K is an integer greater than or equal to 1.
  • the method in response to the camera being not obstructed, the method further includes: calculating the average value of the deviations between the center points of all target areas in each target image and the center points of the corresponding tracking template; comparing the average value of the deviations between the center points of all target areas in each target image and the center points of the corresponding tracking template with an offset threshold; in response to the average value of the deviations between the center points of all target areas in each frame of M frames of images and the center points of the corresponding tracking template being greater than or equal to the offset threshold, determining that the camera is offset; wherein M is an integer greater than or equal to 1; in response to the average value of the deviations between the center points of all target areas in each target image and the center points of the corresponding tracking template being less than the offset threshold, determining that the camera is not offset.
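The center-point deviation test above can be sketched as follows (illustrative only; the function name is invented, and the M-frame persistence check is left to the caller as described in the passage):

```python
import math

def judge_offset(template_centers, target_centers, t3=150.0):
    """Average the Euclidean deviations between the center point of
    each tracked target area and the center point of its corresponding
    tracking template; the camera is judged offset when the average
    reaches the offset threshold T3."""
    devs = [math.dist(p, q) for p, q in zip(template_centers, target_centers)]
    mean_dev = sum(devs) / len(devs)
    return mean_dev >= t3, mean_dev
```

A caller would apply this per frame and only raise the offset alarm when the condition holds for M consecutive frames.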
  • the at least one tracking template of the first frame image includes four tracking templates.
  • the present disclosure provides a system for detecting camera interference, comprising: a camera configured to acquire multiple frames of images; a template selection module configured to select at least one image connected area of a first frame of image as at least one tracking template, a tracking calculation module configured to perform tracking calculation on each target image based on the at least one tracking template to obtain at least one feature response map corresponding to the at least one tracking template for each target image; wherein the target image is a frame of images in multiple frames after the first frame of image; and an interference judgment module configured to judge whether the camera is interfered with based on the at least one feature response map of each target image.
  • the template selection module is further configured to: perform image segmentation on the first frame image using a selective search algorithm to generate multiple image connected areas; perform contour extraction on the multiple image connected areas using a morphological method to obtain contour features of the multiple image connected areas; and further screen the multiple image connected areas using rectangularity features, and select at least one image connected area with the largest rectangularity among the multiple image connected areas as at least one tracking template.
  • the feature response map is calculated by the following formula (Formula 1): RF(z, x) = φ(z) * φ(x) + b, where * denotes the cross-correlation of the two feature maps
  • RF represents a feature response map of each target area in at least one target area corresponding to the at least one tracking template in each target image
  • z represents the first frame image
  • x represents each target image
  • φ(z) represents the feature map of each tracking template of the first frame image; φ(x) represents the feature map of each target image
  • b represents a constant.
  • the interference judgment module is further configured to: obtain a response score value of a target area of the target image corresponding to the at least one tracking template based on each feature response map in the at least one feature response map corresponding to the at least one tracking template of each target image; obtain a response score value of each target area in each target image for the at least one tracking template to determine whether the camera is interfered with; wherein obtaining the response score value of the target area of the target image corresponding to the at least one tracking template comprises: processing each value in the feature response map using a sigmoid function; and then taking the largest value among the processed values as the response score value of the target area of the target image corresponding to the at least one tracking template; wherein the sigmoid function is as follows: f(x) = 1 / (1 + e^(-x))
  • x represents each value in the feature response map
  • f(x) represents a function for processing each value in the feature response map
  • the interference judgment module is further configured to: compare the response score value of each target area in each target image with a first response threshold; in response to the response score value of the target area being less than the first response threshold, assign a label value of 0 to the target area; in response to the response score value of the target area being greater than the first response threshold, assign a label value of 1 to the target area; average the label values of all target areas in each target image to calculate the total score ratio of all target areas in each target image: S = (1/N) · Σ s_i (i = 1, ..., N), where s_i is the label value of the i-th target area
  • S represents the total score ratio of all target areas in each target image
  • N is the number of target areas in each target image, and N is a positive integer
  • i is a positive integer and 1 ≤ i ≤ N.
  • the interference judgment module is further configured to: compare the total score ratio of all target areas in each target image with a second response threshold; in response to the total score ratio of all target areas in each frame image in K frame images being less than the second response threshold, judge that the camera is blocked; in response to the total score ratio of all target areas in each target image being greater than or equal to the second response threshold, judge that the camera is not blocked; wherein K is an integer greater than or equal to 1.
  • the interference judgment module is further configured to: in response to the camera being not blocked, calculate the average value of the deviations between the center points of all target areas in each target image and the center points of the corresponding tracking template; compare the average value of the deviations between the center points of all target areas in each target image and the center points of the corresponding tracking template with an offset threshold; in response to the average value of the deviations between the center points of all target areas in each frame image of M frames and the center points of the corresponding tracking template being greater than or equal to the offset threshold, judge that the camera is offset; wherein M is an integer greater than or equal to 1; in response to the average value of the deviations between the center points of all target areas in each target image and the center points of the corresponding tracking template being less than the offset threshold, judge that the camera is not offset.
  • the at least one tracking template of the first frame image includes four tracking templates.
  • the present disclosure provides an electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the method for detecting camera interference according to the first aspect.
  • the target area in the image is automatically acquired, and the regional features are compared and analyzed using the Siamese-network SiamFC tracking algorithm to detect whether the camera is interfered with (mainly including whether the camera is blocked or moved in the present disclosure).
  • the present disclosure uses a method of deep feature comparison tracking to achieve long-interval frame extraction detection, which can effectively solve the problems of camera offset and occlusion detection without excessively consuming the computer's computing resources, and can also avoid the problem of inaccurate detection results caused by traditional algorithms being affected by changes in illumination.
  • FIG. 1 is a flow chart of a method for detecting camera interference provided by an embodiment of the present disclosure.
  • FIGS. 2a and 2b are schematic diagrams showing target areas in a scenario of monitoring a street.
  • FIG. 2c is a schematic diagram showing the operation process of the SiamFC algorithm.
  • FIG. 2d is a schematic diagram showing an example of calculating a response score value.
  • FIG. 2e shows a graph of the sigmoid function.
  • FIG. 3a is a block diagram of a system for detecting camera interference provided by an embodiment of the present disclosure.
  • FIG. 3b is a block diagram of a system for detecting camera interference provided by an embodiment of the present disclosure.
  • FIG. 3c is a block diagram of a specific implementation of a system for detecting camera interference provided in an embodiment of the present disclosure.
  • FIG. 4 shows a schematic diagram of the hardware structure of an electronic device provided by this embodiment.
  • Embodiments of the present disclosure provide a method for detecting camera interference, a system for detecting camera interference, and an electronic device, which substantially eliminate one or more problems caused by limitations and disadvantages of the prior art.
  • the present disclosure provides a method for detecting camera interference, comprising: acquiring multiple frames of images through the camera, and selecting at least one image connected area of the first frame of image as at least one tracking template, for each frame of image in the multiple frames after the first frame of image, tracking calculation is performed on each frame of image based on at least one tracking template of the first frame of image to obtain at least one feature response map corresponding to at least one tracking template for each frame of image in the multiple frames after the first frame of image; and judging whether the camera is interfered with according to at least one feature response map of each frame of image in the multiple frames after the first frame of image.
  • an embodiment of the present disclosure provides a method for detecting camera interference.
  • FIG. 1 is a flow chart of a method for detecting camera interference provided by an embodiment of the present disclosure. As shown in FIG. 1, an embodiment of the present disclosure provides a method for detecting camera interference, comprising:
  • each image connected area of the at least one image connected area of the first frame image is selected as a tracking template for use in subsequent processes.
  • the tracking template has obvious features compared to other areas of the first frame image.
  • This distinctive feature makes the tracking template clearly distinguishable from other areas; in an ideal state, the position of a tracking template with such features is basically fixed in each frame of the multiple frames of images.
  • the user can select the tracking template according to actual needs, and the present disclosure does not limit this.
  • FIG2a and FIG2b are schematic diagrams of target areas in the scenario of monitoring a street.
  • four areas (i.e., area 1, area 2, area 3, and area 4 in FIG. 2a) may be selected as tracking templates.
  • the positions of the four areas are fixed and each area is also stationary.
  • an algorithm or a manual selection method may be used to select at least one image connected area of the first frame image as at least one tracking template.
  • using an algorithm to select at least one image connected area of a first frame image as at least one tracking template includes: using a selective search algorithm (Selective Search) to perform image segmentation on the first frame image to generate multiple image connected areas, and using a morphological method to extract contours of the multiple image connected areas to obtain contour features of the multiple image connected areas.
  • contour extraction is performed on multiple image connected regions to obtain contour features of the multiple image connected regions, including: processing the multiple image connected regions by area features, and then further screening the multiple image connected regions by using rectangularity features and/or contour inheritance relationships, and selecting N image connected regions with the largest rectangularity among the multiple image connected regions as N tracking templates.
  • rectangularity is the area of a shape divided by the area of the smallest circumscribed rectangle.
  • an image connected region refers to a region consisting of pixels with the same pixel value and adjacent positions in an image.
  • the area feature mainly refers to the size and shape of the image connected region. That is to say, in some embodiments of the present disclosure, the image connected region is processed mainly from the two aspects of size and shape.
  • tracking calculation of each frame image based on at least one tracking template of the first frame image is achieved by using the SiamFC algorithm.
  • SiamFC (Fully-Convolutional Siamese Networks for Object Tracking) is a Siamese-network tracker that uses the image connected area in the first frame image as the tracking template, and performs similarity search and calculation on the corresponding target area in multiple frames after the first frame image, thereby achieving target tracking.
  • Figure 2c is a schematic diagram of the SiamFC algorithm operation process.
  • z represents the first frame image, whose size is 127*127*3;
  • x represents each target image, whose size is 255*255*3.
  • the two are subjected to feature extraction through a weight-sharing Siamese network, and finally the feature map of the tracking template (feature-map-0) is used as the convolution kernel to perform convolution calculation on the feature map of each target image (feature-map-1) to generate a feature response map RF (Response Feature map), which is calculated as follows (Formula 1): RF(z, x) = φ(z) * φ(x) + b
  • RF represents a feature response map of each target area in at least one target area corresponding to the at least one tracking template in each target image
  • z represents the first frame image
  • x represents each target image
  • φ(z) represents the feature map of each tracking template of the first frame image; φ(x) represents the feature map of each target image
  • b represents a constant.
  • each image connected region in FIG2a is selected as a tracking template.
  • SiamFC tracking calculations are performed at corresponding positions in subsequent video frames; whether the camera is blocked is determined by judging the feature response and calculating the response score value, and whether the camera is offset is determined by judging the feature response and calculating the deviation between the center points.
  • the method may further include acquiring a feature map of the tracking template.
  • S3. Determine whether the camera is interfered with according to at least one feature response map of each target image (i.e., each frame of the multiple frames after the first frame).
  • the types of camera interference mainly include: camera being blocked, camera offset, camera out of focus, camera shaking violently, abnormal camera imaging brightness, camera imaging distortion, camera failure, etc.
  • the following describes a method for detecting camera interference provided by an embodiment of the present disclosure, taking two types of camera interference, including camera obstruction and camera offset, as examples.
  • judging whether the camera is interfered with according to at least one feature response map of each target image includes: obtaining a response score value of a target area of the target image corresponding to the at least one tracking template based on each feature response map of the at least one feature response map corresponding to the at least one tracking template of each target image, that is, the similarity between at least one target area in each target image and a corresponding area in the first frame image as a template. If the two areas are exactly the same, the response score value is 1; if the two areas are completely different, the response score value is 0.
  • judging whether the camera is interfered with according to at least one feature response map of each target image further includes: obtaining a response score value of each target area in each target image for the at least one tracking template to judge whether the camera is interfered with.
  • for one tracking template, a response score value of one target area in each target image is obtained; then, for each of the other tracking templates, the step of obtaining a response score value is repeated, so that a response score value of each target area in each target image is obtained.
  • the response score value may be calculated according to the above formula 1.
  • each value in the feature response map RF calculated according to the above formula 1 is processed, and then the largest value among the processed values is taken as the response score value of the target area of the target image corresponding to the at least one tracking template.
  • FIG. 2d is a schematic diagram showing an example of calculating a response score value.
  • each value in the feature response graph RF calculated according to the above formula 1 is processed by the sigmoid function to obtain a response score value in the range of 0 to 1.
  • the sigmoid function is as follows: f(x) = 1 / (1 + e^(-x))
  • x represents each value in the feature response map
  • f(x) represents a function for processing each value in the feature response map.
  • Each value in the feature response map RF is substituted for x in the sigmoid function to obtain a corresponding value in the range of 0 to 1.
  • the four values of -23, 10, 15 and -1 included in the feature response graph RF are processed by the sigmoid function respectively to obtain corresponding values of 0.01, 0.45, 0.75 and 0.1, and the four values obtained are all in the range of 0 to 1.
  • Fig. 2e shows a graph of the sigmoid function. As shown in Fig. 2e, the values obtained by the sigmoid function are all in the range of 0 to 1, so by applying the sigmoid function, the response score value obtained is also in the range of 0 to 1.
  • judging whether the camera is interfered with according to at least one feature response map of each target image further includes: comparing the response score value of each target area in each target image with a first response threshold; in response to the response score value of the target area being less than the first response threshold, assigning a label value of 0 to the target area; in response to the response score value of the target area being greater than the first response threshold, assigning a label value of 1 to the target area; averaging the label values of all target areas in each target image to calculate the total score ratio of all target areas in each target image: S = (1/N) · Σ s_i (i = 1, ..., N), where s_i is the label value of the i-th target area
  • S represents the total score ratio of all target areas in each target image
  • N is the number of target areas in each target image, and N is a positive integer
  • i is a positive integer and 1 ≤ i ≤ N, where N is the number of target areas.
  • judging whether the camera is interfered with according to at least one feature response map of each target image further includes: comparing the total score ratio of all target areas in each target image (that is, the average of the label values of the N target areas) with the second response threshold T12. In response to the total score ratio of all target areas in each frame of K frame images being less than the second response threshold, the camera is judged to be blocked; in response to the total score ratio of all target areas in each target image being greater than or equal to the second response threshold, the camera is judged to be unobstructed.
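The K-frame persistence rule above, which treats short accidental drops as normal, can be sketched as follows (illustrative only; the function and parameter names are invented, not from the disclosure):

```python
def camera_blocked(score_ratios, t12=0.5, k=5):
    """Report the camera as blocked only when the total score ratio S
    stays below the second response threshold T12 for K consecutive
    frames; shorter drops reset the counter and raise no alarm."""
    run = 0  # length of the current run of low-score frames
    for s in score_ratios:
        run = run + 1 if s < t12 else 0
        if run >= k:
            return True
    return False
```

Keeping a consecutive-frame counter rather than a global count is what distinguishes sustained interference from momentary changes such as a person briefly passing in front of the lens.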
  • the first response threshold T11 and the second response threshold T12 can be set as needed. In some embodiments of the present disclosure, the first response threshold T11 and the second response threshold T12 can be the same. In some embodiments of the present disclosure, the first response threshold T11 and the second response threshold T12 can be different. For example, the first response threshold T11 and the second response threshold T12 can both be set to 0.5.
  • the type of alarm may include sound, emitting light of different colors (e.g., emitting red light or emitting yellow light), etc.
  • the present disclosure is not limited thereto.
  • other types of alarms may also be used to prompt the user that the camera is interfered with.
  • the method further includes calculating a deviation of a center point between each target region in each target image and a corresponding tracking template.
  • the deviation of a center point between each target region in each target image and a corresponding tracking template in the first frame image is a Euclidean distance between a center point of each target region in each target image and a center point of a corresponding tracking template in the first frame image.
  • the method also includes: calculating the average value of the deviations between the center points of all target areas in each target image and the center points of the corresponding tracking template; comparing the average value of the deviations between the center points of all target areas in each target image and the center points of the corresponding tracking template with the offset threshold; in response to the average value of the deviations between the center points of all target areas in each frame image of the M frames and the center points of the corresponding tracking template being greater than or equal to the offset threshold, it is judged that the camera is offset; wherein M is an integer greater than or equal to 1; in response to the average value of the deviations between the center points of all target areas in each target image and the center points of the corresponding tracking template being less than the offset threshold, it is judged that the camera is not offset.
  • the average value D of the N deviations of the center points between the N target areas in each target image and the corresponding N tracking templates in the first frame image is calculated.
  • in response to D ≥ T3, that is, the average value D of the N deviations being greater than or equal to the offset threshold T3, and this result lasting for M frames of images, it is determined that the camera is interfered with (e.g., offset).
  • in response to D < T3, that is, the average value D of the N deviations being less than the offset threshold T3, it is determined that the camera is not interfered with (e.g., offset).
  • an alarm is issued to the user to prompt the user that the camera is interfered with (e.g., offset).
  • the offset threshold T3 can be set as needed.
  • the offset threshold T3 can be set to 150.
  • the type of alarm may include sound, emitting light of different colors (e.g., emitting red light), etc.
  • the present disclosure is not limited to this. In other embodiments of the present disclosure, other types of alarms may also be used to prompt the user that the camera is interfered with (e.g., offset).
  • Figure 2a shows a first frame image, in which four tracking templates (i.e., area 1, area 2, area 3, and area 4) have been set;
  • Figure 2b shows a target image (i.e., a frame image after the first frame image), in which the camera is not blocked but offset.
  • the image shown in FIG. 2a is regarded as the first frame image
  • the image shown in FIG. 2b is regarded as a target image (i.e., a frame image after the first frame image).
  • region 1 and region 2 still exist in the image after the shift (the image shown in FIG. 2b), so they are easily tracked using the SiamFC algorithm in the above step S2, and their response score values are relatively large (i.e., region 1 has a high similarity in the two frames of images, and region 2 has a high similarity in the two frames of images), such as 0.75 (as shown in FIG. 2d), which is greater than the first response threshold T11 (e.g., 0.5).
  • the average value D of the N center-point deviations between the N target areas in each target image and the corresponding N tracking templates in the first frame image is D = (d1 + d2)/2; that is, in this embodiment, D averages the deviation of the center point of target area 1 (the distance between the center points of target area 1 in the original image and the offset image) and the deviation of the center point of target area 2.
  • d1 is the Euclidean distance between the center point P1 of target area 1 in the original image (as shown in FIG. 2a) and the center point P3 of target area 1 in the offset image (as shown in FIG. 2b).
  • d2 is the Euclidean distance between the center point P2 of target area 2 in the original image (as shown in FIG. 2a) and the center point P4 of target area 2 in the offset image (as shown in FIG. 2b).
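In code, the average center-point deviation D reduces to a few lines. A minimal Python sketch; the coordinates below are hypothetical and chosen only so that d1 = d2 = 200, matching the worked numbers:

```python
import math

def center_deviation(orig_centers, new_centers):
    """Average Euclidean distance between matched center points."""
    dists = [math.hypot(x0 - x1, y0 - y1)
             for (x0, y0), (x1, y1) in zip(orig_centers, new_centers)]
    return sum(dists) / len(dists)

# Hypothetical center points P1, P2 (first frame) and P3, P4 (target image).
P1, P2 = (100, 120), (400, 130)
P3, P4 = (260, 240), (560, 250)

D = center_deviation([P1, P2], [P3, P4])
T3 = 150  # offset threshold from the embodiment
print(D, D >= T3)  # the camera is judged offset when D >= T3 persists for M frames
```
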
  • in practical applications, this method can be deployed on a computer together with some other methods (e.g., restricted-area intrusion detection).
  • this method is a safeguard algorithm that ensures the key-area monitoring algorithms can operate normally and effectively.
  • the method of one or more embodiments of the present disclosure may be performed by a single device, such as a computer or server.
  • the method of this embodiment may also be applied in a distributed scenario and completed by multiple devices cooperating with each other.
  • one of the multiple devices may only perform one or more steps in the method of one or more embodiments of the present disclosure, and the multiple devices may interact with each other to complete the described method.
  • embodiments of the present disclosure provide a system for detecting camera interference.
  • FIG3a is a block diagram of a system for detecting camera interference provided by an embodiment of the present disclosure.
  • the system for detecting camera interference provided by an embodiment of the present disclosure includes: a camera configured to acquire multiple frames of images; a template selection module configured to select at least one image connected area of a first frame of image as at least one tracking template; a tracking calculation module configured to perform tracking calculation on each target image based on the at least one tracking template to obtain at least one feature response map corresponding to the at least one tracking template of each target image; wherein the target image is one of the multiple frames of image after the first frame of image; and an interference judgment module configured to judge whether the camera is interfered with according to at least one feature response map of each target image.
  • the template selection module may also be configured to obtain a feature map of the tracking template.
  • an algorithm or a manual selection method may be used to select at least one image connected area of the first frame image as at least one tracking template.
  • an algorithm is used to select at least one image connected area of the first frame image as at least one tracking template, including: using a selective search algorithm (Selective Search) to perform image segmentation on the first frame image to generate multiple image connected areas, and using a morphological method to extract contours of the multiple image connected areas to obtain contour features of the multiple image connected areas.
  • contour extraction is performed on the multiple image connected regions to obtain their contour features, including: first processing the multiple image connected regions by their area features, then further screening them by using rectangularity features and/or contour inheritance relationships, and selecting the N image connected regions with the largest rectangularity among the multiple image connected regions as the N tracking templates.
  • rectangularity is the area of a shape divided by the area of the smallest circumscribed rectangle.
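A minimal sketch of this rectangularity measure, under the simplifying assumption of an axis-aligned bounding rectangle (the minimum circumscribed rectangle of the disclosure may be rotated; in practice something like OpenCV's cv2.minAreaRect would supply it). The polygon area comes from the shoelace formula:

```python
def polygon_area(pts):
    """Shoelace formula for the area of a simple polygon."""
    n = len(pts)
    s = 0.0
    for i in range(n):
        x0, y0 = pts[i]
        x1, y1 = pts[(i + 1) % n]
        s += x0 * y1 - x1 * y0
    return abs(s) / 2.0

def rectangularity(pts):
    """Shape area divided by the area of its (here: axis-aligned) bounding rectangle."""
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    box_area = (max(xs) - min(xs)) * (max(ys) - min(ys))
    return polygon_area(pts) / box_area

square = [(0, 0), (10, 0), (10, 10), (0, 10)]
triangle = [(0, 0), (10, 0), (0, 10)]
print(rectangularity(square))    # 1.0: a rectangle fills its bounding box
print(rectangularity(triangle))  # 0.5
```

Regions with rectangularity close to 1 are near-rectangular, which is why the N highest-rectangularity connected regions make stable tracking templates.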
  • the interference judgment module is further configured to: obtain, based on each feature response map in the at least one feature response map corresponding to the at least one tracking template of each target image, a response score value of the target area of the target image corresponding to the at least one tracking template; and, for the at least one tracking template, obtain the response score value of each target area in each target image to judge whether the camera is interfered with.
  • the response score value may be calculated according to the above formula 1.
  • each value in the feature response map RF calculated according to the above formula 1 is processed, and then the largest value among the processed values is taken as the response score value of the target area of the target image corresponding to the at least one tracking template.
  • each value in the feature response graph RF calculated according to the above formula 1 is processed by the sigmoid function to obtain a response score value in the range of 0 to 1.
  • the sigmoid function is as follows:
  • x represents each value in the feature response graph
  • f(x) represents a function for processing each value in the feature response graph.
  • Each value in the feature response graph RF is substituted into x to obtain a corresponding value in the range of 0 to 1.
  • the values obtained by the sigmoid function are all in the range of 0 to 1. Therefore, by applying the sigmoid function, the response score value obtained is also in the range of 0 to 1.
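Since the sigmoid is monotonically increasing, taking the largest processed value is the same as applying the sigmoid to the map's maximum. A minimal sketch; the response-map values below are hypothetical:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def response_score(response_map):
    """Map every value of the feature response map into (0, 1) with the
    sigmoid function and take the maximum as the response score value."""
    return max(sigmoid(v) for v in response_map)

rf = [-2.0, 0.0, 1.1, -0.3]  # hypothetical feature response map values
score = response_score(rf)
print(round(score, 3))  # sigmoid(1.1), roughly 0.75
```
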
  • the interference judgment module is further configured to: compare the response score value of each target area in each target image with a first response threshold; in response to the response score value of the target area being less than the first response threshold, assign a label value of 0 to the target area; in response to the response score value of the target area being greater than the first response threshold, assign a label value of 1 to the target area; average the label values of all target areas in each target image to calculate the total score ratio of all target areas in each target image
  • S represents the total score ratio of all target areas in each target image
  • N is the number of target areas in each target image, and N is a positive integer
  • i is an integer with 1 ≤ i ≤ N, where N is the number of target areas in each target image.
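The labeling and averaging above can be sketched as follows, using the FIG. 2a/2b example scores (0.75 for the two surviving areas, 0.0001 for the two lost ones) and T11 = 0.5:

```python
def total_score_ratio(scores, t11=0.5):
    """Assign label 1 to target areas whose response score exceeds T11,
    label 0 otherwise, and average the labels over the N target areas."""
    labels = [1 if s > t11 else 0 for s in scores]
    return sum(labels) / len(labels)

# FIG. 2a/2b example: areas 1 and 2 score 0.75, areas 3 and 4 score 0.0001.
S = total_score_ratio([0.75, 0.75, 0.0001, 0.0001])
print(S)  # (1 + 1 + 0 + 0) / 4 = 0.5
```
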
  • the interference determination module is further configured to: compare the total score ratio of all target areas in each target image (i.e., the average of the label values of the N target areas) with the second response threshold value T12.
  • S ⁇ T12 that is, the total score ratio S of the target area is less than the second response threshold value T12, at this time, almost all the tracked target areas are lost, and this result lasts for K frames of images, then it is judged that the camera is blocked.
  • S ⁇ T12 that is, the total score ratio S of the target area is greater than or equal to the second response threshold value T12, then it is judged that the camera is not blocked.
  • the first response threshold T11 and the second response threshold T12 can be set as needed. In some embodiments of the present disclosure, the first response threshold T11 and the second response threshold T12 can be the same. In some embodiments of the present disclosure, the first response threshold T11 and the second response threshold T12 can be different. For example, the first response threshold T11 and the second response threshold T12 can both be set to 0.5.
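The "lasts for K frames" condition amounts to a persistence counter over the sampled frames. A minimal sketch; the S sequence and K = 3 are hypothetical:

```python
def detect_blocked(score_ratios, t12=0.5, k=3):
    """Return True if the total score ratio S stays below T12 for K
    consecutive frames, i.e., the camera is judged to be blocked."""
    consecutive = 0
    for s in score_ratios:
        consecutive = consecutive + 1 if s < t12 else 0
        if consecutive >= k:
            return True
    return False

print(detect_blocked([0.75, 0.5, 0.25, 0.0, 0.0]))  # True: 3 consecutive frames below 0.5
print(detect_blocked([0.25, 0.75, 0.25, 0.75]))     # False: the low score never persists
```

Requiring persistence over K frames is what lets brief accidental changes (a person walking past the lens) pass without an alarm.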
  • the type of alarm may include sound, emitting light of different colors (e.g., emitting red light or emitting yellow light), etc.
  • the present disclosure is not limited thereto.
  • other types of alarms may also be used to prompt the user that the camera is interfered with.
  • in response to the total score ratio S of the target areas being greater than or equal to the second response threshold T12, it is judged that the camera is not interfered with (occluded); on this basis, the interference judgment module is further configured to calculate the deviation between the center point of each target area in each target image and the center point of the corresponding tracking template.
  • the interference judgment module is further configured to: calculate the average value of the deviation between the center point of all target areas in each target image and the center point of the corresponding tracking template; compare the average value with the offset threshold; in response to the average value being greater than or equal to the offset threshold, and this result lasts for M frames of images, it is judged that the camera is offset; wherein M is an integer greater than or equal to 1; in response to the average value being less than the offset threshold, it is judged that the camera is not offset.
  • the interference judgment module is further configured to: for each target image, calculate the average value D of the N deviations of the center points between the N target areas in each target image and the corresponding N tracking templates in the first frame image.
  • D ⁇ T3 that is, the average value D of the N deviations is greater than or equal to the offset threshold T3, and this result lasts for M frames of images, it is determined that the camera is interfered with (e.g., offset).
  • D ⁇ T3 that is, the average value D of the N deviations is less than the offset threshold T3, it is determined that the camera is not interfered with (e.g., offset).
  • an alarm is issued to the user to prompt the user that the camera is interfered with (e.g., offset).
  • the offset threshold T3 can be set as needed.
  • the type of alarm may include sound, emitting light of different colors (e.g., emitting red light), etc.
  • the present disclosure is not limited to this. In other embodiments of the present disclosure, other types of alarms may also be used to prompt the user that the camera is interfered with (e.g., offset).
  • FIG3b is a block diagram of a system for detecting camera interference provided by an embodiment of the present disclosure.
  • FIG3c is a block diagram of a specific implementation of a system for detecting camera interference provided by an embodiment of the present disclosure.
  • the system for detecting camera interference provided by an embodiment of the present disclosure further includes: a convolution operation module (Fc-model), which is configured to use a feature map (feature-map-0) of at least one tracking template of the first frame image obtained by the template selection module as a convolution kernel to perform convolution calculation on a feature map (feature-map-1) of a corresponding target area of each frame image (i.e., each target image) in the subsequent multiple frames of images obtained by the tracking calculation module.
  • the template selection module and the tracking calculation module are implemented by Alexnet modules.
  • the template selection module and the tracking calculation module are configured to perform the same operation, have the same weights, but have different image sizes, for example, 127*127*3 and 255*255*3, respectively.
  • the template selection module and the tracking calculation module are Alexnet modules using CNN (convolutional neural network).
  • the template selection module and the tracking calculation module may be obtained by training a preset CNN.
  • the convolutional neural network may be an untrained or uncompleted multi-layer convolutional neural network.
  • the convolutional neural network may, for example, include a convolutional layer, a pooling layer, a fully connected layer, and a loss layer.
  • the non-first convolutional layer in the convolutional neural network may be connected to at least one convolutional layer before the non-first convolutional layer.
  • the non-first convolutional layer may be connected to all convolutional layers before it, or it may be connected to only some of the convolutional layers before it.
  • the template selection module, the tracking calculation module and the convolution operation module are respectively accelerated and converted in a GPU (graphics processing unit) of NVIDIA through a tensorRT module and deployed through a Triton inference server (a deployment inference tool developed by NVIDIA).
  • the tensorRT module includes a first tensorRT module correspondingly connected to the template selection module and a second tensorRT module correspondingly connected to the tracking calculation module.
  • the first tensorRT module and the second tensorRT module are a deep learning inference engine developed by NVIDIA.
  • the convolution operation module is implemented by an ONNX module (Open Neural Network Exchange), which is an intermediate representation format used for conversion in various deep learning training and inference frameworks.
  • the tracking calculation module and the convolution operation module are connected through the module combination function of the Triton inference server, that is, the output of the tracking calculation module is connected to the input of the convolution operation module, and finally the final tracking result is generated.
  • the SiamFC algorithm operates as follows: the template selection module requests the first tensorRT module to obtain the request result, and then uses the obtained request data to update the convolution operation module; the polling function of the Triton inference server is used to detect module changes, and then the convolution operation module is reloaded into the GPU memory; the tracking calculation module and the convolution operation module are merged into one module, that is, the output of the tracking calculation module is the input of the convolution operation module; the tracking calculation module requests the merged module to obtain the tracking result.
  • the template selection module represents the Alexnet module that, at the first frame, infers the feature map feature-map-0 of the at least one tracking template of the first frame image shown in FIG. 2c.
  • the tracking calculation module represents the Alexnet module that, at each subsequent frame, infers the feature map feature-map-1 of the corresponding target area of each frame image in the subsequent multiple frames shown in FIG. 2c.
  • the template selection module and the tracking calculation module have the same weight, but when performing module acceleration conversion and solidification, the image input sizes of the template selection module and the tracking calculation module are 127*127*3 and 255*255*3 respectively.
  • the convolution operation module (Fc-model) represents a processing module that uses the feature map feature-map-0 of the at least one tracking template of the first frame image as a convolution kernel to perform convolution calculation and activation on the output of the tracking calculation module.
  • one or more embodiments of the present disclosure provide an electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein when the processor executes the program, the method for detecting camera interference as described in any one of the above embodiments is implemented.
  • FIG4 shows a schematic diagram of the hardware structure of an electronic device provided in this embodiment.
  • the electronic device 1000 may include: a processor 1010, a memory 1020, an input/output interface 1030, a communication interface 1040, and a bus 1050.
  • the processor 1010, the memory 1020, the input/output interface 1030, and the communication interface 1040 are connected to each other through the bus 1050 in the electronic device.
  • the processor 1010 can be implemented by a general-purpose CPU (Central Processing Unit), a microprocessor, an application specific integrated circuit (ASIC), or one or more integrated circuits, and is used to execute relevant programs to implement the method for detecting camera interference as described in any of the above embodiments.
  • the memory 1020 can be implemented in the form of ROM (Read Only Memory), RAM (Random Access Memory), static storage device, dynamic storage device, etc.
  • the memory 1020 can store an operating system and other application programs.
  • the relevant program code is stored in the memory 1020 and called and executed by the processor 1010.
  • the input/output interface 1030 is used to connect the input/output module to realize information input and output.
  • the input/output module can be set as a component in the electronic device (not shown in the figure), or it can be externally connected to the electronic device to provide corresponding functions.
  • the input device may include a keyboard, a mouse, a touch screen, a microphone, various sensors, etc.
  • the output device may include a display, a speaker, a vibrator, an indicator light, etc.
  • the communication interface 1040 is used to connect a communication module (not shown) to realize communication interaction between the electronic device and other devices.
  • the communication module can realize communication through a wired mode (such as USB, network cable, etc.) or a wireless mode (such as mobile network, WIFI, Bluetooth, etc.).
  • the bus 1050 is used to transmit information between the various components of the electronic device (e.g., the processor 1010, the memory 1020, the input/output interface 1030, and the communication interface 1040).
  • although the above electronic device shows only the processor 1010, the memory 1020, the input/output interface 1030, the communication interface 1040 and the bus 1050, in a specific implementation the electronic device may also include other components necessary for normal operation.
  • the above electronic device may also only include the components necessary for implementing the method for detecting camera interference described in the embodiment of the present disclosure, and does not necessarily include all the components shown in the figure.
  • the embodiment of the present disclosure also provides a non-transitory computer-readable storage medium having computer-executable instructions stored thereon, wherein the instructions, when executed by a processor, perform the above-mentioned method for detecting camera interference.
  • each box in the flow chart or block diagram can represent a module, a program segment or a part of a code, and the module, the program segment or a part of the code contains at least one executable instruction for realizing the specified logical function.
  • the functions marked in the box can also occur in a sequence different from that marked in the accompanying drawings.
  • the boxes represented in two sequences in succession can actually be executed substantially in parallel, and they can sometimes be executed in the opposite order, depending on the functions involved.
  • each box in the block diagram and/or the flow chart, and the combination of the boxes in the block diagram and/or the flow chart can be realized by a dedicated hardware-based system that performs a specified function or operation, or can be realized by a combination of dedicated hardware and computer instructions.
  • each of the components may be a software program set in a computer or a mobile smart device, or may be a separately configured hardware device.
  • the names of these components do not, in some cases, constitute limitations on the components themselves.
  • known power/ground connections to integrated circuit (IC) chips and other components may or may not be shown in the provided figures.
  • the device may be shown in the form of a block diagram so as to avoid making one or more embodiments of the present disclosure difficult to understand, and this also takes into account the fact that the details of the implementation of these devices shown in the form of block diagrams are highly dependent on the platform on which one or more embodiments of the present disclosure will be implemented (i.e., these details should be fully within the scope of understanding of those skilled in the art).

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Image Analysis (AREA)

Abstract

The present disclosure provides a method for detecting camera interference, including: acquiring multiple frames of images through the camera, and selecting at least one image connected area of a first frame image as at least one tracking template; for each frame image of the multiple frames of images after the first frame image, performing tracking calculation on that frame image based on the at least one tracking template of the first frame image to obtain at least one feature response map, corresponding to the at least one tracking template, of each frame image after the first frame image; and judging whether the camera is interfered with according to the at least one feature response map of each frame image after the first frame image.

Description

Method, system and electronic device for detecting camera interference

Technical Field

The present disclosure relates to the field of intelligent video surveillance, and in particular to a method for detecting camera interference, a system for detecting camera interference, and an electronic device.
Background Art

With the development of science and technology and people's ever-increasing awareness of security, a new generation of video surveillance systems with intelligent video surveillance functions has been widely used in all aspects of society, such as transportation, the military, airports, banks, video conferencing, commerce, industry, and so on.

Without human intervention, computer vision analysis methods are used to automatically analyze video sequences to achieve moving target detection, classification, recognition, tracking, etc., and on this basis the behavior of targets is analyzed according to pre-set rules, so as to provide a reference for taking further measures (for example, automatically raising an alarm when an object enters a protected area). However, when the camera is interfered with by human or other unpredictable factors, so that the area monitored by the camera is inconsistent with the area the user wants to monitor, the performance of the intelligent video surveillance system is severely affected. In general, users cannot observe in real time whether the corresponding area is correct, so the intelligent monitoring function of the camera fails; in some scenarios involving the monitoring of critical camera positions, this may even lead to major safety or production accidents. In addition, traditional algorithms are affected by illumination when performing image processing. For example, for the same target in two frames of images, the tracking result obtained is very inaccurate when the illumination changes greatly.

At present, camera interference is generally understood to mean that the monitoring picture changes drastically and the change lasts for a certain period of time, while some brief accidental changes are regarded as normal. The main types of camera interference include: the camera being blocked, camera offset, camera defocus, violent camera shake, abnormal imaging brightness, imaging distortion, camera failure, and so on.
Summary of the Invention

Embodiments of the present disclosure provide a method for detecting camera interference, a system for detecting camera interference, and an electronic device.

In one aspect, the present disclosure provides a method for detecting camera interference, including: acquiring multiple frames of images through the camera, and selecting at least one image connected area of a first frame image as at least one tracking template; performing tracking calculation on each target image based on the at least one tracking template to obtain at least one feature response map, corresponding to the at least one tracking template, of each target image, wherein the target image is one frame of the multiple frames of images after the first frame image; and judging whether the camera is interfered with according to the at least one feature response map of each target image.

In some embodiments of the present disclosure, selecting at least one image connected area of the first frame image as at least one tracking template includes: performing image segmentation on the first frame image by using a selective search algorithm to generate multiple image connected areas; performing contour extraction on the multiple image connected areas by using a morphological method to obtain contour features of the multiple image connected areas; and screening the multiple image connected areas by using rectangularity features, selecting the at least one image connected area with the largest rectangularity among the multiple image connected areas as the at least one tracking template.
In some embodiments of the present disclosure, the feature response map is calculated by the following formula:

RF = f(z, x) = φ(z) * φ(x) + b

where RF represents the feature response map of each target area of the at least one target area, in each target image, corresponding to the at least one tracking template; z represents the first frame image and x represents each target image; φ(z) represents the feature map of each tracking template of the first frame image, and φ(x) represents the feature map of each target image; f(z, x) represents the function of the SiamFC algorithm that performs the convolution calculation, with the feature map φ(z) generated from the tracking template as the convolution kernel, on the feature map φ(x) of each target image; and b represents a constant.
In some embodiments of the present disclosure, judging whether the camera is interfered with according to the at least one feature response map of each target image includes: obtaining, based on each feature response map of the at least one feature response map of each target image corresponding to the at least one tracking template, a response score value of the target area of that target image corresponding to the at least one tracking template; and, for the at least one tracking template, obtaining the response score value of each target area in each target image to judge whether the camera is interfered with; wherein obtaining the response score value of the target area of that target image corresponding to the at least one tracking template includes: processing each value in the feature response map with a sigmoid function, and then taking the largest one of the processed values as the response score value of the target area of that target image corresponding to the at least one tracking template; wherein the sigmoid function is as follows:

f(x) = 1 / (1 + e^(−x))

where x represents each value in the feature response map, and f(x) represents the function that processes each value in the feature response map.
In some embodiments of the present disclosure, judging whether the camera is interfered with according to the at least one feature response map of each target image further includes: comparing the response score value of each target area in each target image with a first response threshold; in response to the response score value of a target area being less than the first response threshold, assigning that target area a label value of 0; in response to the response score value of a target area being greater than the first response threshold, assigning that target area a label value of 1; and averaging the label values of all target areas in each target image to calculate the total score ratio of all target areas in each target image:

S = (1/N) · Σ_{i=1}^{N} x_i

where S represents the total score ratio of all target areas in each target image; N is the number of target areas in each target image, and N is a positive integer; x_i represents the label value of the i-th target area in each target image, with x_i = 0 or x_i = 1; and i is a positive integer with 1 ≤ i ≤ N.
In some embodiments of the present disclosure, judging whether the camera is interfered with according to the at least one feature response map of each target image further includes: comparing the total score ratio of all target areas in each target image with a second response threshold; in response to the total score ratio of all target areas in each frame of K frames of images being less than the second response threshold, judging that the camera is blocked; and in response to the total score ratio of all target areas in each target image being greater than or equal to the second response threshold, judging that the camera is not blocked; wherein K is an integer greater than or equal to 1.

In some embodiments of the present disclosure, in response to the camera not being blocked, the method further includes: calculating the average value of the deviations between the center points of all target areas in each target image and the center points of the corresponding tracking templates; comparing the average value with an offset threshold; in response to the average value of the deviations between the center points of all target areas and the center points of the corresponding tracking templates in each frame of M frames of images being greater than or equal to the offset threshold, judging that the camera is offset, wherein M is an integer greater than or equal to 1; and in response to the average value of the deviations in each target image being less than the offset threshold, judging that the camera is not offset.

In some embodiments of the present disclosure, the at least one tracking template of the first frame image includes four tracking templates.
In a second aspect, the present disclosure provides a system for detecting camera interference, including: a camera configured to acquire multiple frames of images; a template selection module configured to select at least one image connected area of a first frame image as at least one tracking template; a tracking calculation module configured to perform tracking calculation on each target image based on the at least one tracking template to obtain at least one feature response map, corresponding to the at least one tracking template, of each target image, wherein the target image is one frame of the multiple frames of images after the first frame image; and an interference judgment module configured to judge whether the camera is interfered with according to the at least one feature response map of each target image.

In some embodiments of the present disclosure, the template selection module is further configured to: perform image segmentation on the first frame image by using a selective search algorithm to generate multiple image connected areas; perform contour extraction on the multiple image connected areas by using a morphological method to obtain contour features of the multiple image connected areas; and further screen the multiple image connected areas by using rectangularity features, selecting the at least one image connected area with the largest rectangularity among the multiple image connected areas as the at least one tracking template.
In some embodiments of the present disclosure, the feature response map is calculated by the following formula:

RF = f(z, x) = φ(z) * φ(x) + b

where RF represents the feature response map of each target area of the at least one target area, in each target image, corresponding to the at least one tracking template; z represents the first frame image and x represents each target image; φ(z) represents the feature map of each tracking template of the first frame image, and φ(x) represents the feature map of each target image; f(z, x) represents the function of the SiamFC algorithm that performs the convolution calculation, with the feature map φ(z) generated from the tracking template as the convolution kernel, on the feature map φ(x) of each target image; and b represents a constant.
In some embodiments of the present disclosure, the interference judgment module is further configured to: obtain, based on each feature response map of the at least one feature response map of each target image corresponding to the at least one tracking template, a response score value of the target area of that target image corresponding to the at least one tracking template; and, for the at least one tracking template, obtain the response score value of each target area in each target image to judge whether the camera is interfered with; wherein obtaining the response score value of the target area of that target image corresponding to the at least one tracking template includes: processing each value in the feature response map with a sigmoid function, and then taking the largest one of the processed values as the response score value of the target area of that target image corresponding to the at least one tracking template; wherein the sigmoid function is as follows:

f(x) = 1 / (1 + e^(−x))

where x represents each value in the feature response map, and f(x) represents the function that processes each value in the feature response map.
In some embodiments of the present disclosure, the interference judgment module is further configured to: compare the response score value of each target area in each target image with a first response threshold; in response to the response score value of a target area being less than the first response threshold, assign that target area a label value of 0; in response to the response score value of a target area being greater than the first response threshold, assign that target area a label value of 1; and average the label values of all target areas in each target image to calculate the total score ratio of all target areas in each target image:

S = (1/N) · Σ_{i=1}^{N} x_i

where S represents the total score ratio of all target areas in each target image; N is the number of target areas in each target image, and N is a positive integer; x_i represents the label value of the i-th target area in each target image, with x_i = 0 or x_i = 1; and i is a positive integer with 1 ≤ i ≤ N.
In some embodiments of the present disclosure, the interference judgment module is further configured to: compare the total score ratio of all target areas in each target image with a second response threshold; in response to the total score ratio of all target areas in each frame of K frames of images being less than the second response threshold, judge that the camera is blocked; and in response to the total score ratio of all target areas in each target image being greater than or equal to the second response threshold, judge that the camera is not blocked; wherein K is an integer greater than or equal to 1.

In some embodiments of the present disclosure, the interference judgment module is further configured to: in response to the camera not being blocked, calculate the average value of the deviations between the center points of all target areas in each target image and the center points of the corresponding tracking templates; compare the average value with an offset threshold; in response to the average value of the deviations between the center points of all target areas and the center points of the corresponding tracking templates in each frame of M frames of images being greater than or equal to the offset threshold, judge that the camera is offset, wherein M is an integer greater than or equal to 1; and in response to the average value of the deviations in each target image being less than the offset threshold, judge that the camera is not offset.

In some embodiments of the present disclosure, the at least one tracking template of the first frame image includes four tracking templates.
In a third aspect, the present disclosure provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the method for detecting camera interference according to the first aspect.
In the method for detecting camera interference, the system for detecting camera interference, and the electronic device provided by the embodiments of the present disclosure, target areas in the image are acquired automatically, and the Siamese-network SiamFC tracking algorithm is used to compare and analyze area features, thereby detecting whether the camera is interfered with (in the present disclosure, mainly whether the camera is blocked or moved). By using deep-feature comparison tracking, the present disclosure realizes frame-sampling detection at long intervals, which can effectively solve the problems of detecting camera offset and occlusion without excessively consuming the computing resources of the computer, and can also avoid the inaccurate detection results that traditional algorithms produce under illumination changes.
Brief Description of the Drawings

According to various disclosed embodiments, the following drawings are merely examples for illustrative purposes and are not intended to limit the scope of the present invention.

FIG. 1 is a flow chart of a method for detecting camera interference provided by an embodiment of the present disclosure.

FIG. 2a and FIG. 2b are schematic diagrams of target areas in a scenario in which a street is monitored.

FIG. 2c is a schematic diagram of the operation process of the SiamFC algorithm.

FIG. 2d is a schematic diagram of an example of calculating a response score value.

FIG. 2e is a graph of the sigmoid function.

FIG. 3a is a block diagram of a system for detecting camera interference provided by an embodiment of the present disclosure.

FIG. 3b is a block diagram of a system for detecting camera interference provided by an embodiment of the present disclosure.

FIG. 3c is a block diagram of a specific implementation of a system for detecting camera interference provided by an embodiment of the present disclosure.

FIG. 4 is a schematic diagram of the hardware structure of an electronic device provided in this embodiment.
Detailed Description

The present disclosure will now be described more specifically with reference to the following embodiments. It should be noted that the following description of some embodiments presented herein is for the purposes of illustration and description only; it is not exhaustive or limited to the precise forms disclosed.

It should be noted that, unless otherwise defined, technical or scientific terms used in one or more embodiments of the present disclosure shall have the ordinary meaning understood by persons of ordinary skill in the field to which the present disclosure belongs. "First", "second" and similar words used in one or more embodiments of the present disclosure do not denote any order, quantity or importance, but are merely used to distinguish different components. Words such as "include" or "comprise" mean that the element or item preceding the word covers the elements or items listed after the word and their equivalents, without excluding other elements or items. Words such as "connect" or "couple" are not limited to physical or mechanical connections, but may include electrical connections, whether direct or indirect. "Up", "down", "left", "right" and the like are only used to express relative positional relationships; when the absolute position of the described object changes, the relative positional relationship may change accordingly.

Embodiments of the present disclosure provide a method for detecting camera interference, a system for detecting camera interference, and an electronic device, which substantially eliminate one or more problems caused by the limitations and disadvantages of the prior art. In one aspect, the present disclosure provides a method for detecting camera interference, including: acquiring multiple frames of images through the camera, and selecting at least one image connected area of a first frame image as at least one tracking template; for each frame image of the multiple frames of images after the first frame image, performing tracking calculation on that frame image based on the at least one tracking template of the first frame image to obtain at least one feature response map, corresponding to the at least one tracking template, of each frame image after the first frame image; and judging whether the camera is interfered with according to the at least one feature response map of each frame image after the first frame image.
In a first aspect, an embodiment of the present disclosure provides a method for detecting camera interference. FIG. 1 is a flow chart of the method for detecting camera interference provided by an embodiment of the present disclosure. As shown in FIG. 1, the method includes:

S1. Acquiring multiple frames of images through the camera, and selecting at least one image connected area of a first frame image as at least one tracking template.

In some embodiments of the present disclosure, the multiple frames of images are acquired through the camera at a certain time interval. In some embodiments of the present disclosure, the time interval can be set as needed, for example to 1 hour. That is to say, in some embodiments of the present disclosure, the images do not have to be collected continuously but are collected at a certain time interval, which greatly reduces the consumption of computing resources. In some embodiments of the present disclosure, each image connected area of the at least one image connected area of the first frame image is selected as a tracking template for use in subsequent processing. In some embodiments of the present disclosure, a tracking template has obvious features compared with the other areas of the first frame image. Such obvious features can clearly distinguish the tracking template from the other areas, and ideally, the position of a tracking template with obvious features is basically fixed in each of the multiple frames of images. In some embodiments of the present disclosure, the user may select tracking templates according to actual needs, which is not limited by the present disclosure.

FIG. 2a and FIG. 2b are schematic diagrams of target areas in a scenario in which a street is monitored. For example, as shown in FIG. 2a, a total of four areas (namely area 1, area 2, area 3 and area 4 in FIG. 2a) are selected as tracking templates. Ideally, the positions of the four areas are fixed in the multiple frames of images, and each area is also stationary.

In some embodiments of the present disclosure, an algorithm or a manual selection method may be used to select at least one image connected area of the first frame image as at least one tracking template.

In some embodiments of the present disclosure, using an algorithm to select at least one image connected area of the first frame image as at least one tracking template includes: performing image segmentation on the first frame image by using a selective search algorithm (Selective Search) to generate multiple image connected areas, and performing contour extraction on the multiple image connected areas by using a morphological method to obtain the contour features of the multiple image connected areas.

In some embodiments of the present disclosure, performing contour extraction on the multiple image connected areas to obtain their contour features includes: processing the multiple image connected areas by their area features, then further screening them by using rectangularity (Rectangularity) features and/or contour inheritance relationships, and selecting the N image connected areas with the largest rectangularity among the multiple image connected areas as N tracking templates, where N is the number of image connected areas and is an integer greater than or equal to 1. As shown in FIG. 2a, N = 4.

In some embodiments of the present disclosure, rectangularity is the area of a shape divided by the area of its minimum circumscribed rectangle. In some embodiments of the present disclosure, an image connected area refers to an area composed of pixels that have the same pixel value and are adjacent in position. In some embodiments of the present disclosure, the area features mainly refer to the size and shape of the image connected areas; that is to say, in some embodiments of the present disclosure, the image connected areas are processed mainly in terms of size and shape.
S2. Performing tracking calculation on each target image based on the at least one tracking template to obtain at least one feature response map, corresponding to the at least one tracking template, of each target image, wherein the target image is one frame of the multiple frames of images after the first frame image.

In some embodiments of the present disclosure, performing tracking calculation on each frame image based on the at least one tracking template of the first frame image is implemented by using the SiamFC algorithm. In some embodiments of the present disclosure, the SiamFC (Fully-Convolutional Siamese Networks for Object Tracking) algorithm is a target tracking algorithm based on a Siamese network: it uses an image connected area in the first frame image as a tracking template and, in the multiple frames of images after the first frame image, performs similarity search and calculation on the corresponding target area to realize target tracking.

FIG. 2c is a schematic diagram of the operation process of the SiamFC algorithm. As shown in FIG. 2c, z represents the first frame image, whose size is 127*127*3, and x represents each target image, whose size is 255*255*3. Features of both are extracted by a weight-sharing Siamese network, and finally convolution calculation is performed on the feature map of each target image (feature-map-1) with the feature map generated from the tracking template (feature-map-0) as the convolution kernel, generating the feature response map RF (Respond Feature map), calculated as follows:

RF = f(z, x) = φ(z) * φ(x) + b    (Formula 1)

where RF represents the feature response map of each target area of the at least one target area, in each target image, corresponding to the at least one tracking template; z represents the first frame image and x represents each target image; φ(z) represents the feature map of each tracking template of the first frame image, and φ(x) represents the feature map of each target image; f(z, x) represents the function of the SiamFC algorithm that performs the convolution calculation, with the feature map φ(z) generated from the tracking template as the convolution kernel, on the feature map φ(x) of each target image; and b represents a constant.
For example, as shown in FIG. 2a and FIG. 2b, four tracking templates are created (each image connected area in FIG. 2a is selected as a tracking template). SiamFC tracking calculation is performed at the corresponding positions in subsequent video frames; whether the camera is blocked is determined by judging the feature responses and calculating the response score values, and whether the camera is offset is determined by judging the feature responses and calculating the deviations between center points.

In some embodiments of the present disclosure, the method may further include obtaining the feature map of the tracking template.
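Formula 1 above is a cross-correlation: the template feature map slides over the search feature map, and an inner product plus a bias is recorded at each position. A minimal single-channel pure-Python sketch (real SiamFC feature maps are multi-channel CNN outputs; the toy maps below are hypothetical):

```python
def cross_correlate(template, search, b=0.0):
    """Slide the template feature map z over the search feature map x and
    record the inner product at each position: RF = phi(z) * phi(x) + b."""
    th, tw = len(template), len(template[0])
    sh, sw = len(search), len(search[0])
    rf = []
    for i in range(sh - th + 1):
        row = []
        for j in range(sw - tw + 1):
            v = sum(template[a][c] * search[i + a][j + c]
                    for a in range(th) for c in range(tw))
            row.append(v + b)
        rf.append(row)
    return rf

z = [[1, 0],
     [0, 1]]                  # template feature map phi(z)
x = [[1, 0, 0],
     [0, 1, 0],
     [0, 0, 1]]               # search feature map phi(x)
print(cross_correlate(z, x))  # peaks where the template pattern matches
```

The position of the largest response in RF is where the template is most similar to the search region, which is exactly the value the sigmoid-based response score is taken from.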
S3. Judging whether the camera is interfered with according to the at least one feature response map of each target image (i.e., each frame image of the multiple frames of images after the first frame image).

In some embodiments of the present disclosure, the main types of camera interference include: the camera being blocked, camera offset, camera defocus, violent camera shake, abnormal imaging brightness, imaging distortion, camera failure, and so on.

The method for detecting camera interference provided by the embodiments of the present disclosure is described below by taking two types of camera interference, namely the camera being blocked and the camera being offset, as examples.

In an embodiment in which the type of camera interference includes the camera being interfered with (e.g., blocked), judging whether the camera is interfered with according to the at least one feature response map of each target image includes: obtaining, based on each feature response map of the at least one feature response map of each target image corresponding to the at least one tracking template, the response score value of the target area of that target image corresponding to the at least one tracking template, i.e., the similarity between the at least one target area in each target image and the corresponding template area in the first frame image. If the two areas are exactly the same, the response score value is 1; if the two areas are completely different, the response score value is 0.

In some embodiments of the present disclosure, judging whether the camera is interfered with according to the at least one feature response map of each target image further includes: for the at least one tracking template, obtaining the response score value of each target area in each target image to judge whether the camera is interfered with. In some embodiments of the present disclosure, for one tracking template, the response score value of one target area in each target image is obtained; the step of obtaining the response score value of one target area is then repeated for each of the other tracking templates, so as to obtain the response score value of each target area in each target image.
In the embodiments of the present disclosure, the response score value may be calculated according to the above Formula 1. In the embodiments of the present disclosure, each value in the feature response map RF calculated according to Formula 1 is processed, and then the largest one of the processed values is taken as the response score value of the target area of that target image corresponding to the at least one tracking template.

FIG. 2d is a schematic diagram of an example of calculating a response score value. Specifically, as shown in FIG. 2d, each value in the feature response map RF calculated according to Formula 1 is processed by the sigmoid function to obtain a response score value in the range of 0 to 1, where the sigmoid function is as follows:

f(x) = 1 / (1 + e^(−x))

where x represents each value in the feature response map, and f(x) represents the function that processes each value in the feature response map. Each value in the feature response map RF is substituted for x to obtain a corresponding value in the range of 0 to 1. As shown in FIG. 2d, the four values −23, 10, 15 and −1 included in the feature response map RF are each processed by the sigmoid function to obtain the corresponding values 0.01, 0.45, 0.75 and 0.1, all of which are in the range of 0 to 1.

FIG. 2e is a graph of the sigmoid function. As shown in FIG. 2e, the values produced by the sigmoid function all lie in the range of 0 to 1; therefore, by applying the sigmoid function, the response score values obtained are also in the range of 0 to 1.
In the embodiments of the present disclosure, judging whether the camera is interfered with according to the at least one feature response map of each target image further includes: comparing the response score value of each target area in each target image with a first response threshold; in response to the response score value of a target area being less than the first response threshold, assigning that target area a label value of 0; in response to the response score value of a target area being greater than the first response threshold, assigning that target area a label value of 1; and averaging the label values of all target areas in each target image to calculate the total score ratio of all target areas in each target image:

S = (1/N) · Σ_{i=1}^{N} x_i

where S represents the total score ratio of all target areas in each target image; N is the number of target areas in each target image, and N is a positive integer; x_i represents the label value of the i-th target area in each target image, with x_i = 0 or x_i = 1; and i is a positive integer with 1 ≤ i ≤ N.

In some embodiments of the present disclosure, if the response score value of a target area is less than the preset first response threshold, the target area is assigned the label value x_i = 0; otherwise, the target area is assigned the label value x_i = 1, where i is an integer with 1 ≤ i ≤ N and N is the number of target areas.
In some embodiments of the present disclosure, judging whether the camera is interfered with according to the at least one feature response map of each target image includes: comparing the total score ratio S of all target areas in each target image (i.e., the average of the label values of the N target areas) with the second response threshold T12. In response to the total score ratio of all target areas in each frame of K frames of images being less than the second response threshold, it is judged that the camera is blocked; in response to the total score ratio of all target areas in each target image being greater than or equal to the second response threshold, it is judged that the camera is not blocked. In other words, when S < T12, i.e., the total score ratio S of the target areas is less than the second response threshold T12, almost all of the tracked target areas have been lost; if this result lasts for K frames of images, it is judged that the camera is interfered with. When S ≥ T12, i.e., the total score ratio S of the target areas is greater than or equal to the second response threshold T12, it is judged that the camera is not interfered with. Here, K is an integer greater than or equal to 1.

In some embodiments of the present disclosure, when it is judged that the camera is interfered with (e.g., blocked), an alarm is issued to the user to prompt the user that the camera is interfered with (e.g., blocked). In some embodiments of the present disclosure, the first response threshold T11 and the second response threshold T12 can be set as needed. In some embodiments of the present disclosure, T11 and T12 can be the same; in some embodiments of the present disclosure, they can be different. For example, T11 and T12 can both be set to 0.5. In some embodiments of the present disclosure, the type of alarm may include sound, emitting light of different colors (e.g., emitting red light or yellow light), etc. However, the present disclosure is not limited thereto; in other embodiments of the present disclosure, other types of alarms may also be used to prompt the user that the camera is interfered with.
In an embodiment in which the type of camera interference includes camera offset, the method further includes calculating the deviation between the center point of each target area in each target image and the center point of the corresponding tracking template. In some embodiments of the present disclosure, this deviation is the Euclidean distance between the center point of each target area in each target image and the center point of the corresponding tracking template in the first frame image.

In some embodiments of the present disclosure, if the total score ratio S of the target areas is greater than or equal to the second response threshold T12, it is judged that the camera is not interfered with (blocked); on this basis, the method further includes: calculating the average value of the deviations between the center points of all target areas in each target image and the center points of the corresponding tracking templates; comparing this average value with an offset threshold; in response to the average value of the deviations in each frame of M frames of images being greater than or equal to the offset threshold, judging that the camera is offset, wherein M is an integer greater than or equal to 1; and in response to the average value of the deviations in each target image being less than the offset threshold, judging that the camera is not offset.

In some embodiments of the present disclosure, for each target image, the average value D of the N center-point deviations between the N target areas in that target image and the corresponding N tracking templates in the first frame image is calculated. When D ≥ T3, i.e., the average value D of the N deviations is greater than or equal to the offset threshold T3, and this result lasts for M frames of images, it is judged that the camera is interfered with (e.g., offset). When D < T3, i.e., the average value D of the N deviations is less than the offset threshold T3, it is judged that the camera is not interfered with (e.g., offset).

In some embodiments of the present disclosure, when it is judged that the camera is interfered with (e.g., offset), an alarm is issued to the user to prompt the user that the camera is interfered with (e.g., offset). In some embodiments of the present disclosure, the offset threshold T3 can be set as needed; for example, T3 can be set to 150. In some embodiments of the present disclosure, the type of alarm may include sound, emitting light of different colors (e.g., emitting red light), etc. However, the present disclosure is not limited thereto; in other embodiments of the present disclosure, other types of alarms may also be used to prompt the user that the camera is interfered with (e.g., offset).
For example, in the embodiment shown in FIG. 2a and FIG. 2b, FIG. 2a shows the first frame image, in which four tracking templates (namely area 1, area 2, area 3 and area 4) have been set, and FIG. 2b shows a target image (i.e., a frame image after the first frame image) in which the camera is not blocked but is offset.

Specifically, in the embodiment shown in FIG. 2a and FIG. 2b, the image shown in FIG. 2a is regarded as the first frame image, and the image shown in FIG. 2b is regarded as a target image (i.e., a frame image after the first frame image). Obviously, area 1 and area 2 still exist in the offset image (the image shown in FIG. 2b), so they are easily tracked by the SiamFC algorithm in the above step S2, and their response score values are relatively large (i.e., each of area 1 and area 2 has a high similarity across the two frames of images), for example 0.75 (as shown in FIG. 2d, the response score value 0.75 is the maximum response position of the target area), which is greater than the first response threshold T11 (e.g., 0.5), so the label values are x_1 = 1 and x_2 = 1. Area 3 and area 4 have disappeared from the image shown in FIG. 2b, so their response score values are very low, e.g., 0.0001, far less than the first response threshold T11, so the label values are x_3 = 0 and x_4 = 0. The total score ratio of the target areas is therefore calculated as S = (1 + 1 + 0 + 0)/4 = 0.5, which is greater than or equal to the second response threshold T12, so it can be determined that the camera is not blocked.
In the embodiment shown in Figs. 2a and 2b, since target region 3 and target region 4 have disappeared from the offset image, whether the camera is offset only needs to be determined from target region 1 and target region 2. Specifically, let P1(x1, y1) be the center point of region 1 in the original image (Fig. 2a) and P2(x2, y2) be the center point of region 2 in the original image (Fig. 2a); let P3(x3, y3) be the center point of target region 1 in the offset image (Fig. 2b) and P4(x4, y4) be the center point of target region 2 in the offset image (Fig. 2b). The average D of the N center-point deviations between the N target regions in each target image and the corresponding N tracking templates in the first frame image, which in this embodiment is the average of the center-point deviation of target region 1 between the original image and the offset image (i.e., the distance between the center points of target region 1 in the two images) and the center-point deviation of target region 2, is:

D = (d1 + d2)/2

where d1 is the Euclidean distance between the center point P1 of target region 1 in the original image (Fig. 2a) and the center point P3 of target region 1 in the offset image (Fig. 2b),

d1 = √((x1 − x3)² + (y1 − y3)²)

and d2 is the Euclidean distance between the center point P2 of target region 2 in the original image (Fig. 2a) and the center point P4 of target region 2 in the offset image (Fig. 2b),

d2 = √((x2 − x4)² + (y2 − y4)²)
For example, if D = 200 and the offset threshold equals 150, the camera is considered to have been offset, and the user is then notified that the camera has been offset.
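The offset test in this example can be sketched as follows; the center-point coordinates are made-up values chosen so that D = 200, and the threshold of 150 follows the example above (the M-frame persistence check is omitted for brevity):

```python
import math

def center_deviation(p_template, p_target):
    """Euclidean distance between a tracking template's center point
    and the tracked target region's center point."""
    return math.dist(p_template, p_target)

def camera_offset(template_centers, target_centers, t3=150):
    """Average the per-region center-point deviations into D and
    compare D with the offset threshold T3."""
    devs = [center_deviation(a, b)
            for a, b in zip(template_centers, target_centers)]
    d = sum(devs) / len(devs)
    return d, d >= t3

# Regions 1 and 2: template centers P1, P2 vs. offset-image centers P3, P4.
d, is_offset = camera_offset([(0, 0), (10, 10)], [(120, 160), (130, 170)])
print(d, is_offset)  # 200.0 True
```

Here d1 = d2 = 200, so D = 200 ≥ T3 = 150 and the camera is reported as offset.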
In practical applications, the present method can be deployed on a computer together with other methods (e.g., restricted-area intrusion detection). The present method serves as a safeguard algorithm that ensures such key-region monitoring algorithms can run normally and effectively.
It should be noted that the method of one or more embodiments of the present disclosure may be executed by a single device, such as a computer or a server. The method of this embodiment may also be applied in a distributed scenario and completed by multiple devices cooperating with one another. In such a distributed scenario, one of the multiple devices may execute only one or more steps of the method of one or more embodiments of the present disclosure, and the multiple devices interact with one another to complete the method.
Specific embodiments of the present disclosure have been described above. Other embodiments are within the scope of the appended claims. In some cases, the actions or steps recited in the claims may be performed in an order different from that in the embodiments and still achieve the desired results. In addition, the processes depicted in the drawings do not necessarily require the particular order shown, or a sequential order, to achieve the desired results. In some implementations, multitasking and parallel processing are also possible or may be advantageous.
In a second aspect, embodiments of the present disclosure provide a system for detecting camera interference.
Fig. 3a is a block diagram of a system for detecting camera interference provided by an embodiment of the present disclosure. As shown in Fig. 3a, the system for detecting camera interference provided by the embodiment of the present disclosure includes: a camera configured to acquire multiple frames of images; a template selection module configured to select at least one image connected region of a first frame image as at least one tracking template; a tracking computation module configured to perform, based on the at least one tracking template, tracking computation on each target image to obtain at least one feature response map of each target image corresponding to the at least one tracking template, wherein the target image is a frame among the multiple frames of images after the first frame image; and an interference determination module configured to determine, according to the at least one feature response map of each target image, whether the camera is interfered with.
In some embodiments of the present disclosure, the template selection module may be further configured to acquire the feature map of a tracking template. In some embodiments of the present disclosure, the at least one image connected region of the first frame image may be selected as the at least one tracking template either algorithmically or manually. In some embodiments of the present disclosure, selecting the at least one image connected region of the first frame image as the at least one tracking template algorithmically includes: performing image segmentation on the first frame image using a selective search algorithm to generate multiple image connected regions, and performing contour extraction on the multiple image connected regions using morphological methods to obtain the contour features of the multiple image connected regions.
In some embodiments of the present disclosure, performing contour extraction on the multiple image connected regions to obtain their contour features includes: processing the multiple image connected regions by their area feature, then further screening them using the rectangularity feature and/or the inheritance relationship of the contours, and selecting the N image connected regions with the largest rectangularity among the multiple image connected regions as N tracking templates, where N is the number of selected image connected regions and is an integer greater than or equal to 1. As shown in Fig. 2a, N = 4.
In some embodiments of the present disclosure, the rectangularity of a shape is its area divided by the area of its minimum bounding rectangle.
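As a sketch of this screening step, the rectangularity of a polygonal contour can be computed as below. The shoelace formula and an axis-aligned bounding box are used here as a simplification (the disclosure's minimum bounding rectangle may be rotated, as in OpenCV's minAreaRect), and the sample contours are invented:

```python
def shoelace_area(pts):
    """Polygon area via the shoelace formula."""
    n = len(pts)
    s = sum(pts[i][0] * pts[(i + 1) % n][1] - pts[(i + 1) % n][0] * pts[i][1]
            for i in range(n))
    return abs(s) / 2

def rectangularity(pts):
    """Contour area divided by bounding-rectangle area (axis-aligned here)."""
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    rect = (max(xs) - min(xs)) * (max(ys) - min(ys))
    return shoelace_area(pts) / rect

square = [(0, 0), (4, 0), (4, 4), (0, 4)]   # perfectly rectangular contour
triangle = [(0, 0), (4, 0), (0, 4)]         # fills half of its bounding box
regions = {"square": square, "triangle": triangle}
# Keep the N most rectangular regions as tracking templates (N = 1 here).
best = max(regions, key=lambda k: rectangularity(regions[k]))
print(rectangularity(square), rectangularity(triangle), best)  # 1.0 0.5 square
```

A rectangularity close to 1 indicates a nearly rectangular region, which is why the screening keeps the regions with the largest values.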
In embodiments where the type of camera interference includes the camera being occluded, the interference determination module is further configured to: obtain, based on each feature response map of the at least one feature response map of each target image corresponding to the at least one tracking template, the response score value of the target region of that target image corresponding to the at least one tracking template; and obtain, for the at least one tracking template, the response score value of each target region in each target image, so as to determine whether the camera is interfered with.
In embodiments of the present disclosure, the response score value may be computed according to Formula 1 above. In embodiments of the present disclosure, each value in the feature response map RF computed according to Formula 1 is processed, and the largest of the processed values is taken as the response score value of the target region of that target image corresponding to the at least one tracking template.
Specifically, as shown in Fig. 2d, each value in the feature response map RF computed according to Formula 1 above is processed by a sigmoid function to obtain a response score value in the range of 0 to 1, where the sigmoid function is

f(x) = 1/(1 + e^(−x))

where x denotes each value in the feature response map and f(x) denotes the function that processes each value in the feature response map. Substituting each value of the feature response map RF for x yields a corresponding value in the range of 0 to 1. As shown in Fig. 2e, the values produced by the sigmoid function all lie in the range of 0 to 1; therefore, by applying the sigmoid function, the resulting response score values also lie in the range of 0 to 1.
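This scoring step can be sketched as follows; the small response map below is made-up data, not taken from the figures:

```python
import math

def sigmoid(x):
    """Squash a raw response value into the (0, 1) range."""
    return 1 / (1 + math.exp(-x))

def response_score(response_map):
    """Apply the sigmoid to every value of the feature response map RF
    and take the maximum as the region's response score value."""
    return max(sigmoid(v) for row in response_map for v in row)

rf = [[-3.0, -1.0, 0.0],
      [-0.5,  1.1, 0.4]]   # toy 2x3 feature response map
print(round(response_score(rf), 3))  # 0.75 (= sigmoid(1.1), the peak response)
```

The maximum of the processed map plays the role of the 0.75 peak in the example of Fig. 2d.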
In some embodiments of the present disclosure, the interference determination module is further configured to: compare the response score value of each target region in each target image with a first response threshold; in response to the response score value of a target region being smaller than the first response threshold, assign that target region a marker value of 0; in response to the response score value of a target region being greater than the first response threshold, assign that target region a marker value of 1; and average the marker values of all target regions in each target image to compute the total score ratio of all target regions in each target image,

S = (x_1 + x_2 + … + x_N)/N

where S denotes the total score ratio of all target regions in each target image; N is the number of target regions in each target image and is a positive integer; x_i denotes the marker value of the i-th target region in each target image, with x_i = 0 or x_i = 1; and i is a positive integer with 1 ≤ i ≤ N.
In some embodiments of the present disclosure, if the response score value of a target region in each target image is smaller than the preset first response threshold, that target region is assigned the marker value x_i = 0; otherwise it is assigned the marker value x_i = 1, where i is an integer with 1 ≤ i ≤ N and N is the number of target regions.
In some embodiments of the present disclosure, the interference determination module is further configured to compare the total score ratio of all target regions in each target image,

S = (x_1 + x_2 + … + x_N)/N

(i.e., the average of the marker values of the N target regions), with the second response threshold T12. When S < T12, that is, the total score ratio S of the target regions is smaller than the second response threshold T12, almost all of the tracked target regions have been lost; if this result persists for K frames of images, the camera is determined to be occluded. When S ≥ T12, that is, the total score ratio S of the target regions is greater than or equal to the second response threshold T12, the camera is determined not to be occluded.
In some embodiments of the present disclosure, when it is determined that the camera is interfered with (e.g., occluded), an alert is issued to the user to indicate that the camera is interfered with (e.g., occluded). In some embodiments of the present disclosure, the first response threshold T11 and the second response threshold T12 may be set as needed. In some embodiments of the present disclosure, the first response threshold T11 and the second response threshold T12 may be the same; in some embodiments of the present disclosure, they may be different. For example, both the first response threshold T11 and the second response threshold T12 may be set to 0.5. In some embodiments of the present disclosure, the type of alert may include a sound, emitting light of different colors (e.g., red light or yellow light), and the like. However, the present disclosure is not limited thereto; in other embodiments of the present disclosure, other types of alerts may also be used to notify the user that the camera is interfered with.
In embodiments where the type of camera interference includes camera offset, the interference determination module is further configured to compute, for each target image, the center-point deviation between each target region and the corresponding tracking template. When the total score ratio S of the target regions is greater than or equal to the second response threshold T12, the camera is determined not to be interfered with (occluded); on this basis, the interference determination module is further configured to: compute the average of the deviations between the center points of all target regions in each target image and the center points of the corresponding tracking templates; compare this average with an offset threshold; in response to the average being greater than or equal to the offset threshold, with this result persisting for M frames of images, determine that the camera is offset, where M is an integer greater than or equal to 1; and in response to the average being smaller than the offset threshold, determine that the camera is not offset.
In some embodiments of the present disclosure, the interference determination module is further configured to compute, for each target image, the average D of the N center-point deviations between the N target regions in that target image and the corresponding N tracking templates in the first frame image. When D ≥ T3, that is, the average D of the N deviations is greater than or equal to the offset threshold T3, and this result persists for M frames of images, the camera is determined to be interfered with (e.g., offset). When D < T3, that is, the average D of the N deviations is smaller than the offset threshold T3, the camera is determined not to be interfered with (e.g., offset).
In some embodiments of the present disclosure, when it is determined that the camera is interfered with (e.g., offset), an alert is issued to the user to indicate that the camera is interfered with (e.g., offset). In some embodiments of the present disclosure, the offset threshold T3 may be set as needed. In some embodiments of the present disclosure, the type of alert may include a sound, emitting light of different colors (e.g., red light), and the like. However, the present disclosure is not limited thereto; in other embodiments of the present disclosure, other types of alerts may also be used to notify the user that the camera is interfered with (e.g., offset).
Fig. 3b is a block diagram of a system for detecting camera interference provided by an embodiment of the present disclosure. Fig. 3c is a block diagram of a specific implementation of the system for detecting camera interference provided by an embodiment of the present disclosure. As shown in Figs. 3b and 3c, the system for detecting camera interference provided by the embodiment of the present disclosure further includes a convolution operation module (Fc-model) configured to use the feature map (feature-map-0) of the at least one tracking template of the first frame image, acquired by the template selection module, as a convolution kernel to perform convolution computation on the feature map (feature-map-1) of the corresponding target region of each of the subsequent frames of images (i.e., each target image) acquired by the tracking computation module.
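The core of this module, cross-correlating the template feature map over the search feature map and adding a bias as in the SiamFC formula, can be sketched in pure Python on single-channel toy maps (real feature maps are multi-channel and the correlation runs batched on the GPU; the sample maps are invented):

```python
def cross_correlate(template, search, b=0.0):
    """Slide the template feature map over the search feature map
    ('valid' positions only) and add the constant bias b, producing
    the feature response map RF."""
    th, tw = len(template), len(template[0])
    sh, sw = len(search), len(search[0])
    rf = []
    for i in range(sh - th + 1):
        row = []
        for j in range(sw - tw + 1):
            s = sum(template[u][v] * search[i + u][j + v]
                    for u in range(th) for v in range(tw))
            row.append(s + b)
        rf.append(row)
    return rf

template = [[1, 0],
            [0, 1]]                 # feature-map-0 (used as the kernel)
search = [[1, 0, 0],
          [0, 1, 0],
          [0, 0, 1]]                # feature-map-1
print(cross_correlate(template, search))  # [[2, 0], [0, 2]] -> peaks at matches
```

The positions where the template pattern reappears in the search map produce the largest responses, which is exactly what the later sigmoid scoring step consumes.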
In some embodiments of the present disclosure, the template selection module and the tracking computation module are implemented by AlexNet modules. The two modules are configured to perform the same operations and share the same weights, but have different input image sizes, e.g., 127*127*3 and 255*255*3 respectively. In some embodiments of the present disclosure, the template selection module and the tracking computation module are AlexNet modules employing a CNN (convolutional neural network).
In some optional implementations of this embodiment, the template selection module and the tracking computation module may be obtained by training a preset CNN, where the convolutional neural network may be an untrained or not-fully-trained multi-layer convolutional neural network. The convolutional neural network may include, for example, convolutional layers, pooling layers, fully connected layers and a loss layer. In addition, a non-first convolutional layer in the network may be connected to at least one convolutional layer preceding it; for example, the non-first convolutional layer may be connected to all of the convolutional layers preceding it, or to only some of the convolutional layers preceding it.
In some embodiments of the present disclosure, the template selection module, the tracking computation module and the convolution operation module are each accelerated and converted by a TensorRT module on an NVIDIA GPU (graphics processing unit) and deployed via the Triton Inference Server (an inference deployment tool developed by NVIDIA).
In some embodiments of the present disclosure, the TensorRT modules include a first TensorRT module connected to the template selection module and a second TensorRT module connected to the tracking computation module. In some embodiments of the present disclosure, the first TensorRT module and the second TensorRT module are instances of a deep learning inference engine developed by NVIDIA.
In some embodiments of the present disclosure, the convolution operation module is implemented via an ONNX (Open Neural Network Exchange) module; ONNX is an intermediate representation format used for conversion between various deep learning training and inference frameworks.
In some embodiments of the present disclosure, when deploying with the Triton Inference Server, the tracking computation module and the convolution operation module are connected through the server's model-ensemble capability, that is, the output of the tracking computation module is connected to the input of the convolution operation module, finally generating the final tracking result.
Specifically, the SiamFC algorithm runs as follows: the template selection module requests the first TensorRT module to obtain the request result; the convolution operation module is then updated with the obtained request data; the polling (poll) function of the Triton Inference Server is used to detect the module change, after which the convolution operation module is reloaded into GPU memory; the tracking computation module and the convolution operation module are merged into a single module, i.e., the output of the tracking computation module becomes the input of the convolution operation module; and the tracking computation module requests the merged module to obtain the tracking result.
For example, as shown in Figs. 3b and 3c, the template selection module denotes the AlexNet module that, in the first frame, infers the feature map feature-map-0 of the at least one tracking template of the first frame image in Fig. 2c; the tracking computation module denotes the AlexNet module that, in each subsequent frame, infers the feature map feature-map-1 of the corresponding target region of each of the subsequent frames of images in Fig. 2c. The template selection module and the tracking computation module have the same weights; only their input image sizes, fixed when the modules are converted and frozen for acceleration, differ, being 127*127*3 and 255*255*3 respectively. The convolution operation module (Fc-model) denotes the processing module that uses the feature map feature-map-0 of the at least one tracking template of the first frame image as a convolution kernel to perform convolution computation and activation on the output of the tracking computation module.
In a third aspect, one or more embodiments of the present disclosure further provide an electronic device, including a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor, when executing the program, implements the method for detecting camera interference according to any one of the above embodiments.
Fig. 4 is a schematic diagram of the hardware structure of an electronic device provided by this embodiment. As shown in Fig. 4, the electronic device 1000 may include a processor 1010, a memory 1020, an input/output interface 1030, a communication interface 1040 and a bus 1050, where the processor 1010, the memory 1020, the input/output interface 1030 and the communication interface 1040 are communicatively connected to one another inside the electronic device via the bus 1050.
The processor 1010 may be implemented as a general-purpose CPU (central processing unit), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits, and is used to execute relevant programs so as to implement the method for detecting camera interference according to any one of the above embodiments.
The memory 1020 may be implemented in the form of ROM (read-only memory), RAM (random access memory), a static storage device, a dynamic storage device, or the like. The memory 1020 may store an operating system and other application programs; when the method for detecting camera interference according to any one of the above embodiments is implemented in software or firmware, the relevant program code is stored in the memory 1020 and invoked and executed by the processor 1010.
The input/output interface 1030 is used to connect input/output modules for information input and output. The input/output modules may be provided as components within the electronic device (not shown) or externally connected to the electronic device to provide the corresponding functions. Input devices may include a keyboard, a mouse, a touch screen, a microphone and various sensors; output devices may include a display, a speaker, a vibrator, indicator lights, and the like.
The communication interface 1040 is used to connect a communication module (not shown) to enable communication between this electronic device and other devices. The communication module may communicate in a wired manner (e.g., USB, network cable) or wirelessly (e.g., mobile network, Wi-Fi, Bluetooth).
The bus 1050 is used to transfer information between the components of the electronic device (e.g., the processor 1010, the memory 1020, the input/output interface 1030 and the communication interface 1040).
It should be noted that although the above electronic device shows only the processor 1010, the memory 1020, the input/output interface 1030, the communication interface 1040 and the bus 1050, in a specific implementation the electronic device may further include other components necessary for normal operation. Moreover, those skilled in the art will understand that the above electronic device may also contain only the components necessary to implement the method for detecting camera interference described in the embodiments of the present disclosure, without necessarily containing all the components shown in the figure.
Embodiments of the present disclosure further provide a non-transitory computer-readable storage medium storing computer-executable instructions, where the instructions, when executed by a processor, perform the above method for detecting camera interference.
Those of ordinary skill in the art should understand that the discussion of any embodiment above is merely exemplary and is not intended to imply that the scope of the present disclosure (including the claims) is limited to these examples. Under the inventive concept of the present disclosure, technical features of the above embodiments or of different embodiments may also be combined, steps may be implemented in any order, and there exist many other variations of the different aspects of one or more embodiments of the present disclosure as described above, which are not provided in detail for the sake of brevity.
The flowcharts and block diagrams in the drawings illustrate the possible architectures, functions and operations of systems, methods and electronic devices according to various embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code that contains at least one executable instruction for implementing the specified logical function. It should also be noted that in some alternative implementations the functions noted in the blocks may occur in an order different from that noted in the drawings; for example, two blocks shown in succession may in fact be executed substantially in parallel, or sometimes in the reverse order, depending on the functions involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by dedicated hardware-based systems that perform the specified functions or operations, or by combinations of dedicated hardware and computer instructions.
The components involved in the embodiments of the present disclosure may be implemented in software or in hardware. The described components may also be provided in a processor; for example, each component may be a software program installed in a computer or mobile smart device, or a separately configured hardware apparatus. The names of these components do not, under certain circumstances, constitute a limitation on the components themselves.
In addition, to simplify the description and discussion, and so as not to obscure one or more embodiments of the present disclosure, well-known power/ground connections to integrated circuit (IC) chips and other components may or may not be shown in the provided drawings. Furthermore, apparatuses may be shown in block diagram form to avoid obscuring one or more embodiments of the present disclosure, which also takes into account the fact that details of the implementation of such block-diagram apparatuses are highly dependent on the platform on which the one or more embodiments of the present disclosure are to be implemented (i.e., such details should be well within the understanding of those skilled in the art). Where specific details (e.g., circuits) are set forth to describe exemplary embodiments of the present disclosure, it will be apparent to those skilled in the art that one or more embodiments of the present disclosure may be practiced without these specific details or with variations of these specific details. Accordingly, the description should be regarded as illustrative rather than restrictive.
Although the present disclosure has been described in conjunction with specific embodiments thereof, many alternatives, modifications and variations of these embodiments will be apparent to those of ordinary skill in the art in light of the foregoing description. For example, other memory architectures (e.g., dynamic RAM (DRAM)) may use the embodiments discussed.
One or more embodiments of the present disclosure are intended to cover all such alternatives, modifications and variations that fall within the broad scope of the appended claims. Therefore, any omission, modification, equivalent replacement, improvement, etc. made within the spirit and principles of one or more embodiments of the present disclosure shall be included within the protection scope of the present disclosure.

Claims (17)

  1. A method for detecting camera interference, comprising:
    acquiring, by the camera, multiple frames of images, and selecting at least one image connected region of a first frame image as at least one tracking template;
    performing, based on the at least one tracking template, tracking computation on each target image to obtain at least one feature response map of each target image corresponding to the at least one tracking template, wherein the target image is a frame among the multiple frames of images after the first frame image; and
    determining, according to the at least one feature response map of each target image, whether the camera is interfered with.
  2. The method for detecting camera interference according to claim 1, wherein selecting at least one image connected region of the first frame image as at least one tracking template comprises:
    performing image segmentation on the first frame image using a selective search algorithm to generate multiple image connected regions;
    performing contour extraction on the multiple image connected regions using morphological methods to obtain contour features of the multiple image connected regions; and
    screening the multiple image connected regions using the rectangularity feature, and selecting at least one image connected region with the largest rectangularity among the multiple image connected regions as the at least one tracking template.
  3. The method for detecting camera interference according to claim 2, wherein the feature response map is computed by the following formula:
    RF = f(z, x) = φ(z) ∗ φ(x) + b
    wherein RF denotes the feature response map of each target region among at least one target region in each target image corresponding to the at least one tracking template; z denotes the first frame image and x denotes each target image; φ(z) denotes the feature map of each tracking template of the first frame image, and φ(x) denotes the feature map of each target image; f(z, x) denotes the function of the SiamFC algorithm that performs convolution computation on the feature map φ(x) of each target image using the feature map φ(z) generated from the tracking template as the convolution kernel; and b denotes a constant.
  4. The method for detecting camera interference according to claim 3, wherein
    determining, according to the at least one feature response map of each target image, whether the camera is interfered with comprises: obtaining, based on each feature response map of the at least one feature response map of each target image corresponding to the at least one tracking template, a response score value of the target region of that target image corresponding to the at least one tracking template; and obtaining, for the at least one tracking template, the response score value of each target region in each target image to determine whether the camera is interfered with;
    wherein obtaining the response score value of the target region of that target image corresponding to the at least one tracking template comprises: processing each value in the feature response map with a sigmoid function, and then taking the largest of the processed values as the response score value of the target region of that target image corresponding to the at least one tracking template;
    wherein the sigmoid function is as follows:
    f(x) = 1/(1 + e^(−x))
    wherein x denotes each value in the feature response map, and f(x) denotes the function that processes each value in the feature response map.
  5. The method for detecting camera interference according to claim 4, wherein determining, according to the at least one feature response map of each target image, whether the camera is interfered with further comprises:
    comparing the response score value of each target region in each target image with a first response threshold; in response to the response score value of a target region being smaller than the first response threshold, assigning that target region a marker value of 0; in response to the response score value of a target region being greater than the first response threshold, assigning that target region a marker value of 1; and
    averaging the marker values of all target regions in each target image to compute a total score ratio of all target regions in each target image,
    S = (x_1 + x_2 + … + x_N)/N
    wherein S denotes the total score ratio of all target regions in each target image; N is the number of target regions in each target image and is a positive integer; x_i denotes the marker value of the i-th target region in each target image, with x_i = 0 or x_i = 1; and i is a positive integer with 1 ≤ i ≤ N.
  6. The method for detecting camera interference according to claim 5, wherein
    determining, according to the at least one feature response map of each target image, whether the camera is interfered with further comprises:
    comparing the total score ratio of all target regions in each target image with a second response threshold; in response to the total score ratio of all the target regions being smaller than the second response threshold in each of K frames of images, determining that the camera is occluded; and in response to the total score ratio of all the target regions in each target image being greater than or equal to the second response threshold, determining that the camera is not occluded, wherein K is an integer greater than or equal to 1.
  7. The method for detecting camera interference according to claim 6, wherein, in response to the camera not being occluded, the method further comprises:
    computing an average of the deviations between the center points of all target regions in each target image and the center points of the corresponding tracking templates; and
    comparing the average of the deviations between the center points of all target regions in each target image and the center points of the corresponding tracking templates with an offset threshold; in response to the average deviation being greater than or equal to the offset threshold in each of M frames of images, determining that the camera is offset, wherein M is an integer greater than or equal to 1; and in response to the average deviation in each target image being smaller than the offset threshold, determining that the camera is not offset.
  8. The method for detecting camera interference according to any one of claims 1 to 7, wherein the at least one tracking template of the first frame image comprises four tracking templates.
  9. A system for detecting camera interference, comprising:
    a camera configured to acquire multiple frames of images;
    a template selection module configured to select at least one image connected region of a first frame image as at least one tracking template;
    a tracking computation module configured to perform, based on the at least one tracking template, tracking computation on each target image to obtain at least one feature response map of each target image corresponding to the at least one tracking template, wherein the target image is a frame among the multiple frames of images after the first frame image; and
    an interference determination module configured to determine, according to the at least one feature response map of each target image, whether the camera is interfered with.
  10. The system for detecting camera interference according to claim 9, wherein the template selection module is further configured to:
    perform image segmentation on the first frame image using a selective search algorithm to generate multiple image connected regions;
    perform contour extraction on the multiple image connected regions using morphological methods to obtain contour features of the multiple image connected regions; and
    further screen the multiple image connected regions using the rectangularity feature, and select at least one image connected region with the largest rectangularity among the multiple image connected regions as the at least one tracking template.
  11. The system for detecting camera interference according to claim 10, wherein the feature response map is computed by the following formula:
    RF = f(z, x) = φ(z) ∗ φ(x) + b
    wherein RF denotes the feature response map of each target region among at least one target region in each target image corresponding to the at least one tracking template; z denotes the first frame image and x denotes each target image; φ(z) denotes the feature map of each tracking template of the first frame image, and φ(x) denotes the feature map of each target image; f(z, x) denotes the function of the SiamFC algorithm that performs convolution computation on the feature map φ(x) of each target image using the feature map φ(z) generated from the tracking template as the convolution kernel; and b denotes a constant.
  12. The system for detecting camera interference according to claim 11, wherein the interference determination module is further configured to: obtain, based on each feature response map of the at least one feature response map of each target image corresponding to the at least one tracking template, a response score value of the target region of that target image corresponding to the at least one tracking template; and obtain, for the at least one tracking template, the response score value of each target region in each target image to determine whether the camera is interfered with;
    wherein obtaining the response score value of the target region of that target image corresponding to the at least one tracking template comprises: processing each value in the feature response map with a sigmoid function, and then taking the largest of the processed values as the response score value of the target region of that target image corresponding to the at least one tracking template;
    wherein the sigmoid function is as follows:
    f(x) = 1/(1 + e^(−x))
    wherein x denotes each value in the feature response map, and f(x) denotes the function that processes each value in the feature response map.
  13. The system for detecting camera interference according to claim 12, wherein the interference determination module is further configured to:
    compare the response score value of each target region in each target image with a first response threshold; in response to the response score value of a target region being smaller than the first response threshold, assign that target region a marker value of 0; in response to the response score value of a target region being greater than the first response threshold, assign that target region a marker value of 1; and
    average the marker values of all target regions in each target image to compute a total score ratio of all target regions in each target image,
    S = (x_1 + x_2 + … + x_N)/N
    wherein S denotes the total score ratio of all target regions in each target image; N is the number of target regions in each target image and is a positive integer; x_i denotes the marker value of the i-th target region in each target image, with x_i = 0 or x_i = 1; and i is a positive integer with 1 ≤ i ≤ N.
  14. The system for detecting camera interference according to claim 13, wherein the interference determination module is further configured to:
    compare the total score ratio of all target regions in each target image with a second response threshold; in response to the total score ratio of all the target regions being smaller than the second response threshold in each of K frames of images, determine that the camera is occluded; and in response to the total score ratio of all the target regions in each target image being greater than or equal to the second response threshold, determine that the camera is not occluded, wherein K is an integer greater than or equal to 1.
  15. The system for detecting camera interference according to claim 14, wherein the interference determination module is further configured to, in response to the camera not being occluded:
    compute an average of the deviations between the center points of all target regions in each target image and the center points of the corresponding tracking templates; and
    compare the average of the deviations between the center points of all target regions in each target image and the center points of the corresponding tracking templates with an offset threshold; in response to the average deviation being greater than or equal to the offset threshold in each of M frames of images, determine that the camera is offset, wherein M is an integer greater than or equal to 1; and in response to the average deviation in each target image being smaller than the offset threshold, determine that the camera is not offset.
  16. The system for detecting camera interference according to any one of claims 9 to 15, wherein the at least one tracking template of the first frame image comprises four tracking templates.
  17. An electronic device, comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the method for detecting camera interference according to any one of claims 1 to 8.
PCT/CN2022/122563 2022-09-29 2022-09-29 Method, system and electronic device for detecting camera interference WO2024065389A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2022/122563 WO2024065389A1 (zh) 2022-09-29 2022-09-29 Method, system and electronic device for detecting camera interference
CN202280003367.1A CN118120222A (zh) 2022-09-29 2022-09-29 Method, system and electronic device for detecting camera interference

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2022/122563 WO2024065389A1 (zh) 2022-09-29 2022-09-29 Method, system and electronic device for detecting camera interference

Publications (1)

Publication Number Publication Date
WO2024065389A1 true WO2024065389A1 (zh) 2024-04-04

Family

ID=90475328

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/122563 WO2024065389A1 (zh) 2022-09-29 2022-09-29 用于检测摄像头干扰的方法、***和电子设备

Country Status (2)

Country Link
CN (1) CN118120222A (zh)
WO (1) WO2024065389A1 (zh)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2004013614A (ja) * 2002-06-07 2004-01-15 Matsushita Electric Ind Co Ltd Object tracking device, object tracking device control method, and image recognition system
CN103955718A (zh) * 2014-05-15 2014-07-30 厦门美图之家科技有限公司 Method for recognizing the main subject object of an image
CN107948465A (zh) * 2017-12-11 2018-04-20 南京行者易智能交通科技有限公司 Method and device for detecting that a camera is interfered with
CN109685827A (zh) * 2018-11-30 2019-04-26 南京理工大学 DSP-based target detection and tracking method
CN111340850A (zh) * 2020-03-20 2020-06-26 军事科学院系统工程研究院系统总体研究所 UAV ground-target tracking method based on a Siamese network and center logistic loss
CN113822223A (zh) * 2021-10-12 2021-12-21 精英数智科技股份有限公司 Method and device for detecting occlusion or movement of a camera
CN113902677A (zh) * 2021-09-08 2022-01-07 九天创新(广东)智能科技有限公司 Camera occlusion detection method and device, and intelligent robot


Also Published As

Publication number Publication date
CN118120222A (zh) 2024-05-31


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22960027

Country of ref document: EP

Kind code of ref document: A1