WO2021098657A1 - Video detection method and apparatus, terminal device, and readable storage medium - Google Patents


Info

Publication number
WO2021098657A1
Authority
WO
WIPO (PCT)
Prior art keywords
video
frame
video frame
prediction
nth
Application number
PCT/CN2020/129171
Other languages
French (fr)
Chinese (zh)
Inventor
乔宇 (Qiao Yu)
彭小江 (Peng Xiaojiang)
Original Assignee
Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences (中国科学院深圳先进技术研究院)
Application filed by Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences (中国科学院深圳先进技术研究院)
Publication of WO2021098657A1

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 - Scenes; Scene-specific elements
    • G06V20/40 - Scenes; Scene-specific elements in video content
    • G06V20/44 - Event detection
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/22 - Matching criteria, e.g. proximity measures

Definitions

  • This application belongs to the field of computer vision technology, and in particular relates to a video detection method, device, terminal device, and readable storage medium.
  • Video anomaly detection can detect whether the video contains abnormal events.
  • The abnormal events in the video refer to special events that differ from normal behavior in a specific scene. Abnormal events may endanger public safety or disturb social order and cause serious consequences. Therefore, determining whether an abnormal event exists through video anomaly detection plays an important role in maintaining social order.
  • In the related art, all N frames of the video to be detected are first input to an autoencoder, the autoencoder performs convolution operations on the input N frames to obtain a reconstructed video frame, and it is finally determined, according to that video frame, whether the video to be detected is abnormal.
  • The embodiments of the present application provide a video detection method, device, terminal device, and readable storage medium, so as to alleviate the problem that false detections are prone to occur when abnormal events are detected, leading to unsatisfactory detection results.
  • an embodiment of the present application provides a video detection method, including:
  • the video to be tested includes N video frames, where N is an integer greater than 1.
  • The prediction frame of the (i+1)-th video frame among the N video frames is calculated as follows: the error image of the i-th video frame is input into the trained prediction network model for processing to obtain the prediction frame of the (i+1)-th video frame, where the error image of the i-th video frame is obtained by subtracting the prediction frame of the i-th video frame from the i-th video frame, 1 ≤ i ≤ N-1, and i is an integer.
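The recursive error-image scheme described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: `predict_next` stands in for the trained prediction network model, and all names are chosen here for clarity.

```python
import numpy as np

def detect_video(frames, predict_next, preset_error):
    # frames:       the N video frames of the video to be tested
    # predict_next: stand-in for the trained prediction network model,
    #               mapping an error image to the next predicted frame
    # preset_error: preset error image used for the first prediction
    error = preset_error
    predicted = None
    for frame in frames:
        predicted = predict_next(error)   # prediction frame for this step
        error = frame - predicted         # error image drives the next step
    # degree of difference between the Nth frame and its predicted frame
    return float(np.linalg.norm(frames[-1] - predicted))
```

Because each error image feeds the next prediction, the loop carries temporal information forward frame by frame, which is the property the embodiments rely on.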
  • the execution subject of the video detection method is a terminal with image processing capability.
  • the terminal may be a physical terminal, such as a desktop computer, a server, a notebook computer, a tablet computer, etc., or a virtual terminal, such as a cloud server, cloud computing, etc. It should be understood that the above execution subject is only an example, and it is not limited to the above terminal.
  • the prediction frame of the first video frame is obtained after inputting the preset error image into the prediction network model for processing.
  • Calculating the degree of difference between the predicted frame of the Nth video frame and the Nth video frame may be done by subtracting the predicted frame of the Nth video frame from the Nth video frame and taking the modulus.
  • Alternatively, the degree of difference may be calculated by first repairing the predicted frame of the Nth video frame with a preset repair algorithm, and then subtracting the repaired predicted frame from the Nth video frame and taking the modulus.
  • If the degree of difference meets a preset condition, it is determined that the video to be tested is abnormal; for example, when the degree of difference is greater than a first preset threshold, it is determined that the video to be tested is abnormal.
  • Alternatively, the reciprocal of the degree of difference may first be normalized to obtain a normality score; then, when the normality score is less than a second preset threshold, it is determined that the video to be tested is abnormal.
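A rough sketch of this scoring variant follows. The patent does not fix a normalization scheme, so the `max_reciprocal` normalizer and the [0, 100] scale below are assumptions made for illustration only.

```python
def normality_score(difference, max_reciprocal, scale=100.0):
    # Normalize the reciprocal of the difference degree onto [0, scale].
    # max_reciprocal is an assumed normalizer (e.g. the largest
    # reciprocal observed on normal videos); the patent does not fix one.
    reciprocal = 1.0 / difference
    return scale * min(reciprocal / max_reciprocal, 1.0)

def is_abnormal(difference, max_reciprocal, second_threshold=30.0):
    # Abnormal when the normality score falls below the second
    # preset threshold (30 in the example given later in the text).
    return normality_score(difference, max_reciprocal) < second_threshold
```

A larger difference degree yields a smaller reciprocal and thus a lower normality score, so the threshold comparison flips direction relative to comparing the difference degree itself.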
  • the predictive network model is a Long Short-Term Memory (LSTM) model or a Gated Recurrent Unit (GRU) model.
  • an embodiment of the present application provides a video detection device, including:
  • the acquisition module is configured to acquire a video to be tested, and the video to be tested includes N video frames, where N is an integer greater than 1.
  • the prediction module is used to sequentially obtain the predicted frames of the N video frames.
  • The prediction frame of the (i+1)-th video frame among the N video frames is calculated as follows: the error image of the i-th video frame is input into the trained prediction network model for processing to obtain the prediction frame of the (i+1)-th video frame, where the error image of the i-th video frame is obtained by subtracting the prediction frame of the i-th video frame from the i-th video frame, 1 ≤ i ≤ N-1, and i is an integer.
  • the calculation module is used to calculate the degree of difference between the predicted frame of the Nth video frame and the Nth video frame.
  • the determining module is configured to determine that the video to be tested has an abnormality if the degree of difference meets a preset condition.
  • the video detection device may be the execution subject of the first aspect, and its specific form is the same as that of the first aspect, and details are not described herein again.
  • the prediction frame of the first video frame is obtained after inputting the preset error image into the prediction network model for processing.
  • the calculation module is specifically configured to subtract the predicted frame of the Nth video frame from the Nth video frame and take the modulus to obtain the degree of difference.
  • the calculation module is specifically configured to repair the predicted frame of the Nth video frame by using a preset repair algorithm. Then, the predicted frame of the Nth video frame after repair is subtracted from the Nth video frame and the modulus is obtained to obtain the difference degree.
  • the determining module is specifically configured to determine that the video to be tested has an abnormality when the degree of difference is greater than a first preset threshold.
  • the determining module is also used to first normalize the reciprocal of the difference degree to obtain the normality score. Then, when the normality score is less than the second preset threshold, it is determined that the video to be tested is abnormal.
  • the predictive network model is an LSTM model or a GRU model.
  • The embodiments of the present application provide a terminal device, including: a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the method provided in the first aspect when executing the computer program.
  • Embodiments of the present application provide a computer-readable storage medium that stores a computer program, and the computer program, when executed by a processor, implements the method provided in the first aspect.
  • the embodiments of the present application provide a computer program product, which when the computer program product runs on a terminal device, causes the terminal device to execute the method provided in the first aspect.
  • The embodiment of the present application has a beneficial effect: first, according to the video to be tested, the predicted frames of the N video frames are sequentially obtained, where the error image of the i-th video frame is obtained by subtracting the prediction frame of the i-th video frame from the i-th video frame, and the prediction frame of the (i+1)-th video frame is obtained by inputting the error image of the i-th video frame into the trained prediction network model for processing. Then the degree of difference between the Nth video frame and the prediction frame of the Nth video frame is calculated. Finally, if the degree of difference meets the preset condition, it is determined that the video to be tested is abnormal.
  • Since the generated prediction frames take the temporal dependencies of the video to be tested into account, the obtained prediction frame of the Nth video frame is more accurate.
  • Consequently, the probability of false detection is reduced and the detection accuracy is improved.
  • Fig. 1 is a schematic diagram of an application scenario provided by an embodiment of the present application
  • FIG. 2 is a schematic flowchart of a video detection method provided by an embodiment of the present application.
  • FIG. 3 is a schematic flowchart of a video detection method provided by another embodiment of the present application.
  • FIG. 4 is a schematic flowchart of a video detection method provided by another embodiment of the present application.
  • FIG. 5 is a schematic diagram of an application scenario provided by another embodiment of the present application.
  • FIG. 6 is a schematic structural diagram of a video detection device provided by an embodiment of the present application.
  • Fig. 7 is a schematic structural diagram of a terminal device provided by an embodiment of the present application.
  • The term “if” can be construed as “when”, “once”, “in response to determining”, or “in response to detecting”.
  • The phrase “if it is determined” can be interpreted, depending on the context, to mean “once determined”, “in response to determining”, “once it is detected that a preset condition is met”, or “in response to detecting that a preset condition is met”.
  • The video detection method provided by the embodiments of this application can be applied to terminal devices such as mobile phones, tablet computers, wearable devices, in-vehicle devices, augmented reality (AR)/virtual reality (VR) devices, notebook computers, ultra-mobile personal computers (UMPC), netbooks, personal digital assistants (PDA), security cameras, surveillance cameras, and the like.
  • Fig. 1 is a schematic diagram of an application scenario provided by an embodiment of the present application.
  • this scene includes at least one video capture device 11 and at least one terminal device connected to the video capture device 11.
  • the video capture device 11 may be various forms of cameras, for example, it may be a security camera, a surveillance camera, a camera integrated on a notebook, a camera integrated on a smart phone, and so on.
  • the terminal device may be at least one of the server 12, a personal computer 13, a smart phone 14, a tablet computer 15.
  • The terminal device may obtain the video to be tested collected by the video capture device 11 and detect it.
  • the video capture device 11 may be connected to the server 12 in communication. After the video capture device 11 captures the video to be tested, it sends the video to be tested to the server 12 through a wired or wireless network.
  • The server 12 processes the video to be tested to obtain its detection result.
  • The detection results can be stored in the server 12 or sent to a designated database for storage, and can then be retrieved from the server 12 or the database by other devices, such as smart phones, personal computers, and tablets.
  • Alternatively, the video capture device 11 may be connected to at least one of the personal computer 13, the smart phone 14, and the tablet computer 15; the video to be tested is sent to the terminal device connected to it and processed on that device to obtain the detection result, and the video to be tested and its detection result are displayed on the device's screen.
  • the video capture device 11 may also be integrated on the terminal device to implement the solution provided in this application.
  • For example, the camera of the smart phone 14 can serve as the video capture device 11 to collect the video to be tested, which is then stored in the memory of the smart phone 14; the processor of the smart phone 14 then executes the corresponding executable program to process the video to be tested in the memory, obtains the detection result, and displays it on the screen of the smart phone 14.
  • FIG. 1 is only an example of the application scenario of the present application, and does not constitute a limitation on the application scenario for executing the video detection method provided in the present application. In actual application, it may include more devices than shown in the figure.
  • The server 12 can also communicate with a database to store the video to be tested and the detection results, communicate with a screen to display the detection results, or communicate with an alarm device to give a reminder when the detection result is abnormal; there is no restriction here.
  • Wireless networks may include wireless local area networks (Wireless Local Area Networks, WLAN) (such as Wi-Fi networks), Bluetooth, ZigBee, mobile communication networks, Near Field Communication (NFC), infrared (Infrared, IR), and other communication solutions.
  • Wired networks can include optical fiber networks, telecommunication networks, intranets, etc., such as Local Area Network (LAN), Wide Area Network (WAN), Metropolitan Area Network (MAN), and Public Switched Telephone Network (PSTN). There are no restrictions on the types of wireless and wired networks.
  • FIG. 2 is a schematic flowchart of a video detection method provided by an embodiment of the present application.
  • the method can be applied to terminal devices in the above-mentioned scenarios, such as the server 12, the personal computer 13, the smart phone 14, the tablet computer 15, or the car computer 16, but is not limited to this.
  • The video detection method includes:
  • N is an integer greater than 1.
  • the video to be tested may be a video clip directly collected by the above-mentioned video capture device 11, or it may be a stored video clip retrieved by the terminal device that executes the method after the video capture device 11 collects and stores the video clip. There is no restriction here.
  • the video to be tested is usually a video clip based on the RGB color space, including at least 2 video frames.
  • The number of frames of the video to be tested is determined by the duration of the video and its frame rate (Frames Per Second, FPS). For example, if the duration of the video to be tested is 2 seconds and the FPS is 30, the video to be tested includes 60 video frames.
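The frame-count relation above can be expressed directly (a trivial helper, named here for illustration):

```python
def frame_count(duration_seconds, fps):
    # Number of frames = duration x frames per second.
    return int(duration_seconds * fps)
```

With the text's example, a 2-second clip at 30 FPS yields 60 frames.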
  • the duration of the video to be tested can be used as a detection step.
  • the length of the detection step can be set according to the actual situation of the application, and there is no limitation here.
  • The prediction frame of the (i+1)-th video frame among the N video frames is calculated as follows: the error image of the i-th video frame is input into the trained prediction network model for processing to obtain the prediction frame of the (i+1)-th video frame, where the error image of the i-th video frame is obtained by subtracting the prediction frame of the i-th video frame from the i-th video frame, 1 ≤ i ≤ N-1, and i is an integer.
  • the predictive network model may be a time series network model.
  • the time series network model may include an LSTM model or a GRU model.
  • After the error image at the i-th moment (that is, the error image of the i-th video frame) is input into the prediction network model, the prediction network model generates the prediction frame of the next moment (that is, the prediction frame of the (i+1)-th video frame) according to the input error image.
  • the prediction frame of the first video frame is obtained by inputting the preset error image into the prediction network model for processing.
  • The preset error image may be obtained by subtracting a blank image (i.e., all zeros) from the first video frame, or the first video frame may be used directly as the preset error image.
  • the predictive network model can be trained through preset samples and preset sample labels, and the training method is a conventional method in the field, which will not be repeated here.
  • the difference degree is used to indicate the similarity between the Nth video frame and the prediction frame of the Nth video frame.
  • the similarity between the Nth video frame and the predicted frame of the Nth video frame can be calculated by, for example, histogram matching, cosine similarity, and mean hash algorithm.
  • the similarity between the two can also be expressed by the distance between the images.
  • the similarity between the two can be calculated by the Euclidean distance, Mahalanobis distance, Manhattan distance, etc., but it is not limited to this.
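Two of the measures mentioned above, sketched with NumPy by flattening each frame to a vector; the function names are chosen here and are not from the patent:

```python
import numpy as np

def cosine_similarity(frame_a, frame_b):
    # Flatten each frame to a vector and compare their directions.
    va = np.asarray(frame_a, dtype=float).ravel()
    vb = np.asarray(frame_b, dtype=float).ravel()
    return float(va @ vb / (np.linalg.norm(va) * np.linalg.norm(vb)))

def euclidean_distance(frame_a, frame_b):
    # L2 distance between the two frames, element-wise.
    diff = np.asarray(frame_a, dtype=float) - np.asarray(frame_b, dtype=float)
    return float(np.linalg.norm(diff))
```

Note that cosine similarity ignores overall brightness scaling (two frames differing only by a constant factor score 1.0), while Euclidean distance does not; which measure suits a deployment depends on the scene.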
  • The preset conditions can be set according to the actual application, the accuracy required for video detection, the interference present in the scene contained in the video, and other factors, so as to ensure that abnormalities are detected accurately without causing false detections by being either too sensitive or too insensitive.
  • the abnormality of the video to be tested refers to the existence of special events that are different from normal events in the video to be tested.
  • Special events can include a motor vehicle driving on a sidewalk, pedestrians crossing a motorway, and violent behaviors such as robbery or fighting; if the scene in the video to be tested is indoors, special events can include, for example, the appearance of heavy smoke, the appearance of open flames, and overcrowding.
  • Since the generated prediction frames take the temporal dependencies of the video to be tested into account, the obtained prediction frame of the Nth video frame is more accurate. Therefore, when whether the video to be tested is abnormal is confirmed according to the predicted frame of the Nth video frame and the Nth video frame itself, the probability of false detection is reduced and the detection accuracy is improved.
  • Calculating the degree of difference between the predicted frame of the Nth video frame and the Nth video frame may be done by subtracting the predicted frame of the Nth video frame from the Nth video frame and taking the modulus.
  • Subtracting the predicted frame of the Nth video frame from the Nth video frame and taking the modulus can be achieved through the L2 norm. For example, if the predicted frame of the Nth video frame is P_N and the Nth video frame is I_N, then the degree of difference is ||I_N - P_N||_2.
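A minimal NumPy rendering of this subtraction-and-modulus step (the symbol names echo the I_N notation in the text and are otherwise illustrative):

```python
import numpy as np

def difference_degree(frame_n, predicted_n):
    # Degree of difference: subtract the predicted frame from the
    # Nth video frame and take the modulus (L2 norm).
    diff = np.asarray(frame_n, dtype=float) - np.asarray(predicted_n, dtype=float)
    return float(np.linalg.norm(diff))
```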
  • the calculation of the difference between the Nth video frame and the prediction frame of the Nth video frame in S23 may also be implemented through the process shown in FIG. 3. Please refer to FIG. 3.
  • the foregoing S23 may include:
  • S231 Repair the predicted frame of the Nth video frame by using a preset repair algorithm.
  • The predicted frame of the Nth video frame is repaired to correct the noise and distortion introduced by the prediction process that would affect detection, yielding the repaired predicted frame R N of the Nth video frame.
  • The preset repair algorithm can be a convolutional autoencoder; repairing the predicted frame with a convolutional autoencoder is a routine method for those skilled in the art and will not be repeated here.
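To show where the repair step fits in the pipeline, here is a deliberately simplified stand-in. The patent's actual repair algorithm is a trained convolutional autoencoder, which is not reproduced here; the mean filter below merely illustrates a repair function's shape (frame in, denoised frame out).

```python
import numpy as np

def repair_predicted_frame(frame, k=3):
    # Stand-in repair: a k x k mean filter with edge padding that
    # smooths isolated prediction noise. A trained convolutional
    # autoencoder would replace this function in the patent's scheme.
    frame = np.asarray(frame, dtype=float)
    pad = k // 2
    padded = np.pad(frame, pad, mode="edge")
    out = np.empty_like(frame)
    for i in range(frame.shape[0]):
        for j in range(frame.shape[1]):
            out[i, j] = padded[i:i + k, j:j + k].mean()
    return out
```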
  • The degree of difference between the Nth video frame I_N and the repaired predicted frame R_N of the Nth video frame can still be calculated through the L2 norm, i.e., ||I_N - R_N||_2.
  • Calculating the difference with the repaired predicted frame can effectively reduce the inflation of the difference degree caused by noise, distortion, and the like, which lowers the probability of false detection and improves the accuracy of video detection.
  • If the degree of difference meets a preset condition, it is determined that the video to be tested is abnormal; for example, when the degree of difference is greater than a first preset threshold, it is determined that the video to be tested is abnormal.
  • For example, when the degree of difference is greater than 70%, it can be determined that the video to be tested is abnormal; that is, the first preset threshold can be set to 70%. It should be understood that in scenarios with different accuracy requirements, the first preset threshold can also be set to other values such as 63%, 72%, or 86%.
  • the process of determining that the video to be tested is abnormal may also be implemented through the process shown in FIG. 4.
  • the above S24 may include:
  • the normality score of the video to be tested can be obtained.
  • the interval range of the normality score can be [0,1] or [0,100], which is not limited here.
  • The higher the normality score, the higher the probability that the video to be tested is normal. For example, if the interval of the normality score is [0,100], then 0 can indicate that the video to be tested is abnormal and 100 can indicate that it is normal.
  • For example, the second preset threshold may be 30. It should be understood that in scenarios with different accuracy requirements, the second preset threshold can also be set to other values such as 35, 27, or 12, which is not limited here.
  • The normality score of the video to be tested is used to determine whether an abnormality exists in the video to be tested. Because a normality score is more in line with users' habits, users can more intuitively judge the degree of abnormality in the video to be tested, which improves the user experience.
  • Fig. 5 is a schematic diagram of an application scenario provided by another embodiment of the present application.
  • Figure 5 shows a scenario of reminding dangerous behaviors when the solution of the present application is applied to automatic driving of a car.
  • The application of the video detection method provided by this application is explained below. It should be clear that the following application methods are only examples, not limitations.
  • a video capture device 11 (not shown) can be set in at least one of the car’s air intake grille 16, rear bumper 17, and side mirrors 18.
  • In this scenario, the video capture device 11 may be a micro camera, an infrared camera, or the like, used to collect video images while the car is being driven.
  • the video capture device 11 can be connected to an on-board computer (not shown) through a cable.
  • The on-board computer receives at least one video to be tested that is sent by the video capture device 11 and shot while the car is driving, and processes each video to be tested to obtain a detection result.
  • the camera installed on the air intake grille 16 can collect video images in front of the car
  • the camera installed on the rear bumper 17 can collect video images behind the car
  • The cameras installed on the side mirrors 18 can collect video images of both sides of the car.
  • The detection frequency of the camera facing the car's direction of travel can be increased.
  • For example, the FPS of the camera on the air intake grille 16 can be set to 120 and the detection period to 0.1 seconds; then every 0.1 seconds, it is detected whether there is an abnormality in a video to be tested comprising 12 video frames. The predicted frame of the first video frame in each detection cycle can use the predicted frame of the 12th video frame from the previous detection cycle.
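The cycle arithmetic in these examples reduces to frames per cycle = FPS x detection period (a trivial helper, named here for illustration):

```python
def frames_per_detection(fps, detection_period_seconds):
    # Frames examined in each detection cycle = FPS x detection period.
    return int(round(fps * detection_period_seconds))
```

At 120 FPS with a 0.1-second period this gives 12 frames per cycle; at 30 FPS with a 0.2-second period, 6 frames.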
  • If an abnormality is detected in the video to be tested, such as a pedestrian 192 in the driving direction or another obstacle that may cause a safety problem, the driver can be reminded to pay attention to safety by voice or light, or the system can cooperate with other sensors to achieve automatic braking, automatic avoidance, and other functions.
  • The cameras installed on the rearview mirrors 18 on both sides can adopt the same settings to detect whether other vehicles 191 are approaching on either side of the car or whether other situations that may cause safety hazards exist; if so, the driver can likewise be reminded by voice or light to pay attention to safety, or other sensors can be cooperated with to realize functions such as automatic braking and automatic avoidance.
  • Accordingly, the detection frequency of the camera facing away from the car's driving direction can be reduced to lessen the load on the on-board computer.
  • For example, the FPS of the camera on the rear bumper 17 can be set to 30 and the detection period to 0.2 seconds, so every 0.2 seconds it only needs to be detected whether there is an abnormality in a video to be tested comprising 6 video frames.
  • The setting parameters of the camera on the rear bumper 17 in the above example can be exchanged with those of the camera on the air intake grille 16, so as to more sensitively detect pedestrians 192 or other obstacles behind the car.
  • FIG. 6 is a schematic structural diagram of a video detection device provided in an embodiment of the present application. For ease of description, only parts related to the embodiment of the present application are shown.
  • the device includes:
  • the acquiring module 31 is configured to acquire a video to be tested, and the video to be tested includes N video frames, where N is an integer greater than one.
  • the prediction module 32 is configured to sequentially obtain the predicted frames of the N video frames.
  • The prediction frame of the (i+1)-th video frame among the N video frames is calculated as follows: the error image of the i-th video frame is input into the trained prediction network model for processing to obtain the prediction frame of the (i+1)-th video frame, where the error image of the i-th video frame is obtained by subtracting the prediction frame of the i-th video frame from the i-th video frame, 1 ≤ i ≤ N-1, and i is an integer.
  • the calculation module 33 is configured to calculate the degree of difference between the predicted frame of the Nth video frame and the Nth video frame.
  • the determining module 34 is configured to determine that the video to be tested has an abnormality if the degree of difference meets a preset condition.
  • the prediction frame of the first video frame is obtained after inputting the preset error image into the prediction network model for processing.
  • the calculation module 33 is specifically configured to subtract the predicted frame of the Nth video frame from the Nth video frame and take the modulus to obtain the degree of difference.
  • the calculation module 33 is specifically configured to repair the predicted frame of the Nth video frame by using a preset repair algorithm. Then, the predicted frame of the Nth video frame after repair is subtracted from the Nth video frame and the modulus is obtained to obtain the difference degree.
  • the determining module 34 is specifically configured to determine that the video to be tested has an abnormality when the degree of difference is greater than a first preset threshold.
  • the determining module 34 is further configured to first normalize the reciprocal of the difference degree to obtain the normality score. Then, when the normality score is less than the second preset threshold, it is determined that the video to be tested is abnormal.
  • the predictive network model is an LSTM model or a GRU model.
  • Fig. 7 is a schematic structural diagram of a terminal device provided by an embodiment of the present application.
  • An embodiment of the present application also provides a terminal device 4. The terminal device 4 includes: at least one processor 41, a memory 42, and a computer program 43 stored in the memory and executable on the at least one processor 41. The processor 41 implements the steps in any of the foregoing method embodiments when executing the computer program 43.
  • The structure shown in FIG. 7 does not constitute a limitation on the terminal device 4, which may include more or fewer components than shown, a combination of some components, or different components; for example, the terminal device 4 may also include a display screen, indicator lights, a motor, controls (such as buttons), a gyroscope sensor, an acceleration sensor, etc.
  • The processor 41 may be a central processing unit (Central Processing Unit, CPU), and the processor 41 may also be another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc.
  • the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
  • the memory 42 may be an internal storage unit of the terminal device 4 in some embodiments, such as a hard disk or memory of the terminal device 4. In other embodiments, the memory 42 may also be an external storage device of the terminal device 4, for example, a plug-in hard disk equipped on the terminal device 4, a smart memory card (Smart Media Card, SMC), and a Secure Digital (SD) Card, Flash Card, etc. Further, the memory 42 may also include both an internal storage unit of the terminal device 4 and an external storage device.
  • the memory 42 is used to store an operating system, an application program, a boot loader (BootLoader), data, and other programs, such as the program code of the computer program. The memory 42 can also be used to temporarily store data that has been obtained or will be obtained.
  • the embodiments of the present application also provide a computer-readable storage medium, where the computer-readable storage medium stores a computer program, and when the computer program is executed by a processor, the steps in each of the foregoing method embodiments can be realized.
  • the embodiments of the present application provide a computer program product; when the computer program product runs on a mobile terminal, the mobile terminal executes the steps in the foregoing method embodiments.
  • if the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium.
  • the computer program can be stored in a computer-readable storage medium; when the computer program is executed by a processor, the steps of the foregoing method embodiments can be implemented.
  • the computer program includes computer program code, and the computer program code may be in the form of source code, object code, an executable file, or some intermediate form.
  • the computer-readable medium may at least include: any entity or device capable of carrying the computer program code to the photographing device/terminal device, a recording medium, a computer memory, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), an electric carrier signal, a telecommunications signal, and a software distribution medium, for example, a USB flash drive, a mobile hard disk, a floppy disk, or a CD-ROM.
  • in some jurisdictions, computer-readable media cannot be electric carrier signals and telecommunications signals.
  • the disclosed device/terminal device and method may be implemented in other ways.
  • the device/terminal device embodiments described above are merely illustrative.
  • the division of the modules or units is only a logical function division, and there may be other divisions in actual implementation; for example, multiple units or components can be combined or integrated into another system, or some features can be omitted or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.


Abstract

The present application relates to the technical field of computer vision, and provides a video detection method and apparatus, a terminal device, and a readable storage medium. The method comprises: acquiring a video to be detected, the video to be detected comprising N video frames; sequentially acquiring prediction frames of the N video frames, wherein a prediction frame of an (i+1)th video frame is obtained by inputting an error image of an ith video frame into a trained prediction network model for processing; and if the degree of difference between an Nth video frame and a prediction frame of the Nth video frame satisfies a preset condition, determining that the video to be detected has an abnormality. Since each prediction frame is computed from the error image of the previous frame, the generated prediction frame takes into account the temporal influence of the video to be detected, so that the acquired prediction frame of the Nth video frame is more accurate; in addition, when confirming, according to the prediction frame of the Nth video frame and the Nth video frame, whether the video to be detected is abnormal, the probability of false detection is reduced and the detection accuracy is improved.

Description

Video detection method, device, terminal equipment and readable storage medium
Technical Field
This application belongs to the field of computer vision technology, and in particular relates to a video detection method, device, terminal device, and readable storage medium.
Background Art
Video anomaly detection can detect whether a video contains abnormal events. An abnormal event in a video refers to a special event that differs from normal behavior in a specific scene; abnormal events may endanger public safety or affect social order and cause serious consequences. Therefore, determining whether an abnormal event exists through video anomaly detection plays an important role in maintaining social order.
In the prior art, the N frames of the video to be detected are first input as a whole into an autoencoder, the autoencoder performs a convolution operation on the input N frames to obtain a video frame, and finally whether the video to be detected has an abnormality is determined according to that video frame.
However, since the prior art does not consider the influence of timing on the detection of abnormal events, false detection is prone to occur when detecting abnormal events, and the detection effect is not ideal.
Summary of the Invention
The embodiments of the present application provide a video detection method, device, terminal device, and readable storage medium, so as to alleviate the problem that false detections are prone to occur when detecting abnormal events, leading to unsatisfactory detection results.
In the first aspect, an embodiment of the present application provides a video detection method, including:
Obtaining a video to be tested, where the video to be tested includes N video frames, and N is an integer greater than 1.
Sequentially obtaining the prediction frames of the N video frames.
Wherein the prediction frame of the (i+1)th video frame among the N video frames is calculated as follows: the error image of the ith video frame is input into a trained prediction network model for processing to obtain the prediction frame of the (i+1)th video frame; the error image of the ith video frame is obtained by subtracting the prediction frame of the ith video frame from the ith video frame, where 1≤i≤N-1 and i is an integer.
Calculating the degree of difference between the Nth video frame and the prediction frame of the Nth video frame.
If the degree of difference meets a preset condition, determining that the video to be tested is abnormal.
In a possible implementation of the first aspect, the execution subject of the video detection method is a terminal with image processing capability. Exemplarily, the terminal may be a physical terminal, such as a desktop computer, a server, a notebook computer, or a tablet computer, or a virtual terminal, such as a cloud server or a cloud computing service. It should be understood that the above execution subjects are only examples, and the method is not limited to the above terminals.
It should be noted that the prediction frame of the first video frame is obtained by inputting a preset error image into the prediction network model for processing.
In some implementations, the degree of difference between the Nth video frame and its prediction frame can be calculated by subtracting the prediction frame of the Nth video frame from the Nth video frame and taking the modulus.
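As a minimal sketch, this subtraction-and-modulus computation might look as follows; note that the patent does not fix a particular norm, so the L2 norm of the residual is an assumption here:

```python
import numpy as np

def difference_degree(nth_frame: np.ndarray, predicted_frame: np.ndarray) -> float:
    """Subtract the prediction frame of the Nth video frame from the
    Nth video frame and take the modulus (assumed: L2 norm)."""
    residual = nth_frame.astype(np.float64) - predicted_frame.astype(np.float64)
    return float(np.linalg.norm(residual))

# A perfect prediction yields zero difference; any deviation increases it.
frame = np.ones((4, 4, 3))
print(difference_degree(frame, frame))          # 0.0
print(difference_degree(frame, frame * 0) > 0)  # True
```

A larger value indicates a larger gap between the actual frame and its prediction, which is then compared against the preset condition.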
In other implementations, the degree of difference between the Nth video frame and its prediction frame may also be calculated by first repairing the prediction frame of the Nth video frame with a preset repair algorithm, and then subtracting the repaired prediction frame from the Nth video frame and taking the modulus.
Optionally, determining that the video to be tested is abnormal when the degree of difference meets the preset condition may be: determining that the video to be tested is abnormal when the degree of difference is greater than a first preset threshold.
Optionally, the reciprocal of the degree of difference may first be normalized to obtain a normality score; then, when the normality score is less than a second preset threshold, it is determined that the video to be tested is abnormal.
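The optional normality-score variant could be sketched as below; min-max normalization of the reciprocals is an assumption, since the patent does not specify the normalization method:

```python
import numpy as np

def normality_scores(difference_degrees, second_threshold=0.5):
    """Normalize the reciprocals of the difference degrees to [0, 1];
    frames whose normality score falls below the threshold are abnormal.
    Assumes strictly positive, non-identical difference degrees."""
    recip = 1.0 / np.asarray(difference_degrees, dtype=np.float64)
    scores = (recip - recip.min()) / (recip.max() - recip.min())
    return scores, scores < second_threshold

scores, abnormal = normality_scores([1.0, 2.0, 4.0])
# The smallest difference degree maps to score 1.0, the largest to 0.0.
```

Under this convention a low normality score (large difference from the prediction) triggers the abnormality decision.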
Wherein the prediction network model is a long short-term memory (Long Short-Term Memory, LSTM) model or a gated recurrent unit (Gated Recurrent Unit, GRU) model.
In the second aspect, an embodiment of the present application provides a video detection device, including:
An acquisition module, configured to acquire a video to be tested, where the video to be tested includes N video frames and N is an integer greater than 1. A prediction module, configured to sequentially obtain the prediction frames of the N video frames, wherein the prediction frame of the (i+1)th video frame among the N video frames is calculated by inputting the error image of the ith video frame into a trained prediction network model for processing; the error image of the ith video frame is obtained by subtracting the prediction frame of the ith video frame from the ith video frame, where 1≤i≤N-1 and i is an integer. A calculation module, configured to calculate the degree of difference between the Nth video frame and the prediction frame of the Nth video frame. A determining module, configured to determine that the video to be tested is abnormal if the degree of difference meets a preset condition.
In a possible implementation of the second aspect, the video detection device may be the execution subject of the first aspect; its specific form is the same as that of the first aspect and will not be repeated here.
It should be noted that the prediction frame of the first video frame is obtained by inputting a preset error image into the prediction network model for processing.
In some implementations, the calculation module is specifically configured to subtract the prediction frame of the Nth video frame from the Nth video frame and take the modulus to obtain the degree of difference.
In other implementations, the calculation module is specifically configured to repair the prediction frame of the Nth video frame with a preset repair algorithm, and then subtract the repaired prediction frame from the Nth video frame and take the modulus to obtain the degree of difference.
Optionally, the determining module is specifically configured to determine that the video to be tested is abnormal when the degree of difference is greater than a first preset threshold.
Optionally, the determining module is further configured to first normalize the reciprocal of the degree of difference to obtain a normality score, and then determine that the video to be tested is abnormal when the normality score is less than a second preset threshold.
Wherein the prediction network model is an LSTM model or a GRU model.
In the third aspect, an embodiment of the present application provides a terminal device, including: a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the method provided in the first aspect when executing the computer program.
In the fourth aspect, an embodiment of the present application provides a computer-readable storage medium storing a computer program, where the computer program, when executed by a processor, implements the method provided in the first aspect.
In the fifth aspect, an embodiment of the present application provides a computer program product, which, when running on a terminal device, causes the terminal device to execute the method provided in the first aspect.
It can be understood that, for the beneficial effects of the second to fifth aspects, reference may be made to the related description of the first aspect, and details are not repeated here.
Compared with the prior art, the embodiments of the present application have the following beneficial effects. First, the prediction frames of the N video frames are sequentially obtained according to the video to be tested, where the error image of the ith video frame is obtained by subtracting the prediction frame of the ith video frame from the ith video frame, and the prediction frame of the (i+1)th video frame is obtained by inputting the error image of the ith video frame into a trained prediction network model for processing. Then the degree of difference between the Nth video frame and its prediction frame is calculated. Finally, if the degree of difference meets a preset condition, it is determined that the video to be tested is abnormal. Since each prediction frame is computed from the error image of the previous frame, the generated prediction frame takes into account the temporal influence of the video to be tested, making the obtained prediction frame of the Nth video frame more accurate; therefore, when confirming whether the video to be tested is abnormal according to the prediction frame of the Nth video frame and the Nth video frame itself, the probability of false detection is reduced and the detection accuracy is improved.
Description of the Drawings
In order to more clearly describe the technical solutions in the embodiments of the present application, the following briefly introduces the drawings needed in the description of the embodiments or the prior art. Obviously, the drawings in the following description are only some embodiments of the present application; for those of ordinary skill in the art, other drawings can be obtained based on these drawings without creative labor.
FIG. 1 is a schematic diagram of an application scenario provided by an embodiment of the present application;
FIG. 2 is a schematic flowchart of a video detection method provided by an embodiment of the present application;
FIG. 3 is a schematic flowchart of a video detection method provided by another embodiment of the present application;
FIG. 4 is a schematic flowchart of a video detection method provided by another embodiment of the present application;
FIG. 5 is a schematic diagram of an application scenario provided by another embodiment of the present application;
FIG. 6 is a schematic structural diagram of a video detection device provided by an embodiment of the present application;
FIG. 7 is a schematic structural diagram of a terminal device provided by an embodiment of the present application.
Detailed Description
In the following description, for the purpose of illustration rather than limitation, specific details such as specific system structures and technologies are set forth for a thorough understanding of the embodiments of the present application. However, it should be clear to those skilled in the art that the present application can also be implemented in other embodiments without these specific details. In other cases, detailed descriptions of well-known systems, devices, circuits, and methods are omitted to avoid unnecessary details from obscuring the description of this application.
As used in the specification of this application and the appended claims, the term "if" can be construed, depending on the context, as "when", "once", "in response to determining", or "in response to detecting". Similarly, the phrase "if it is determined" can be interpreted, depending on the context, as "once determined", "in response to determining", "once it is detected that a preset condition is met", or "in response to detecting that a preset condition is met".
In addition, in the description of the specification of this application and the appended claims, the terms "first", "second", "third", etc. are only used to distinguish the description, and cannot be understood as indicating or implying relative importance.
References to "possible implementations" or "some implementations" in this specification mean that one or more embodiments of this application include a specific feature, structure, or characteristic described in connection with that implementation. Therefore, the phrases "in one embodiment", "in some embodiments", "in some other embodiments", etc. appearing in different places in this specification do not necessarily all refer to the same embodiment, but mean "one or more but not all embodiments", unless specifically emphasized otherwise. The terms "including", "comprising", "having" and their variations all mean "including but not limited to", unless specifically emphasized otherwise.
The video detection method provided by the embodiments of this application can be applied to terminal devices such as mobile phones, tablet computers, wearable devices, in-vehicle devices, augmented reality (AR)/virtual reality (VR) devices, notebook computers, ultra-mobile personal computers (UMPC), netbooks, personal digital assistants (PDA), security cameras, and surveillance cameras; the embodiments of this application do not impose any restrictions on the specific type of terminal device.
FIG. 1 is a schematic diagram of an application scenario provided by an embodiment of the present application.
Referring to FIG. 1, the scenario includes at least one video capture device 11 and at least one terminal device connected to the video capture device 11.
In some embodiments, the video capture device 11 may be a camera in various forms, for example, a security camera, a surveillance camera, a camera integrated in a notebook, a camera integrated in a smart phone, and so on.
As an example and not a limitation, the terminal device may be at least one of the server 12, the personal computer 13, the smart phone 14, and the tablet computer 15, and may obtain the video to be tested collected by the video capture device 11 and detect it. For example, the video capture device 11 may be communicatively connected to the server 12; after capturing the video to be tested, the video capture device 11 sends it to the server 12 through a wired or wireless network, and the server 12 processes the video to be tested to obtain its detection result. The detection result can be stored in the server 12 or sent to a designated database for storage, and the video to be tested and the detection result can then be retrieved from the server 12 or the database through other devices, such as a smart phone, a personal computer, or a tablet computer, for review or processing. Alternatively, the video capture device 11 may be communicatively connected to at least one of the personal computer 13, the smart phone 14, and the tablet computer 15, and send the video to be tested to the terminal device connected to it; that device processes the video to be tested to obtain the detection result, and displays the video to be tested and its detection result on its screen.
In some other embodiments, the video capture device 11 may also be integrated into the terminal device to implement the solution provided in this application.
As an example and not a limitation, the camera of the smart phone 14 can be used as the video capture device 11 to collect the video to be tested, which is then stored in the memory of the smart phone 14; the processor of the smart phone 14 then executes the corresponding executable program to process the video to be tested in the memory, obtains the detection result, and displays it on the screen of the smart phone 14.
Those skilled in the art can understand that FIG. 1 is only an example of an application scenario of the present application and does not constitute a limitation on the application scenarios for executing the video detection method provided herein; in actual applications, more devices than shown in the figure may be included. For example, when the video capture device 11 is communicatively connected to the server 12, the server 12 may also be communicatively connected to a database for storing the video to be tested and the detection results, to a screen for displaying the detection results, or to an alarm device for giving a reminder when the detection result is abnormal, which is not limited here.
The wireless network may include wireless local area networks (Wireless Local Area Networks, WLAN) (such as Wi-Fi networks), Bluetooth, Zigbee, mobile communication networks, near field communication (Near Field Communication, NFC), infrared (Infrared, IR), and other communication solutions. The wired network may include optical fiber networks, telecommunication networks, intranets, etc., such as a local area network (Local Area Network, LAN), a wide area network (Wide Area Network, WAN), a metropolitan area network (Metropolitan Area Network, MAN), and a public switched telephone network (Public Switched Telephone Network, PSTN). The types of wireless networks and wired networks are not limited here.
FIG. 2 is a schematic flowchart of a video detection method provided by an embodiment of the present application. As an example and not a limitation, the method can be applied to the terminal devices in the above-mentioned scenario, such as the server 12, the personal computer 13, the smart phone 14, the tablet computer 15, or the in-vehicle computer 16, but is not limited to these.
Referring to FIG. 2, the video detection method includes:
S21. Obtain a video to be tested, where the video to be tested includes N video frames.
Here, N is an integer greater than 1.
In some embodiments, the video to be tested may be a video clip directly collected by the above-mentioned video capture device 11, or a stored video clip retrieved by the terminal device executing this method after the video capture device 11 collects and stores it; this is not limited here.
It should be noted that the video to be tested is usually a video clip based on the RGB color space and includes at least 2 video frames. The number of frames of the video to be tested is determined by its duration and its frames per second (Frames Per Second, FPS); for example, if the video to be tested lasts 2 seconds at 30 FPS, it includes 60 video frames.
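The frame count N in the example above follows directly from the duration and the FPS:

```python
duration_seconds = 2        # length of the video to be tested
frames_per_second = 30      # FPS of the video to be tested
n_frames = duration_seconds * frames_per_second
print(n_frames)  # 60, the N of the example video
```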
During video detection, the video needs to be detected continuously; the duration of the video to be tested can be used as a detection step, and the length of the detection step can be set according to the actual situation of the application, which is not limited here.
S22. Sequentially obtain the prediction frames of the N video frames.
Wherein the prediction frame of the (i+1)th video frame among the N video frames is calculated as follows: the error image of the ith video frame is input into a trained prediction network model for processing to obtain the prediction frame of the (i+1)th video frame; the error image of the ith video frame is obtained by subtracting the prediction frame of the ith video frame from the ith video frame, where 1≤i≤N-1 and i is an integer.
The prediction network model may be a temporal network model; as an example and not a limitation, the temporal network model may include an LSTM model, a GRU model, or the like.
After the error image of the ith moment (that is, the error image of the ith video frame) is input into the prediction network model, the prediction network model generates the prediction frame of the next moment (that is, the prediction frame of the (i+1)th video frame) according to the input error image.
It should be noted that the prediction frame of the first video frame is obtained by inputting a preset error image into the prediction network model for processing; for example, the preset error image may be obtained by subtracting a blank (all-zero) image from the first video frame, or the first video frame may be directly used as the preset error image.
The prediction network model can be trained with preset samples and preset sample labels; the training method is a conventional technique in the field and will not be repeated here.
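The per-frame recursion of step S22 can be sketched as follows. Here `predict_next` stands in for the trained LSTM/GRU prediction network, whose interface is a hypothetical assumption, as are the L2-norm difference and the threshold value:

```python
import numpy as np

def detect_video(frames, predict_next, threshold):
    """S22-S25 sketch: the error image of frame i feeds the prediction
    of frame i+1; the Nth frame is compared with its prediction frame."""
    # The first prediction uses a preset error image; here, the first
    # frame minus a blank (all-zero) image, i.e. the first frame itself.
    error_image = frames[0].astype(np.float64)
    prediction = None
    for frame in frames:
        prediction = predict_next(error_image)  # predict the current frame
        error_image = frame - prediction        # error image for the next step
    difference = float(np.linalg.norm(frames[-1] - prediction))
    return difference > threshold, difference   # (abnormal?, difference degree)

# Toy stand-in model that simply echoes its input, to exercise the recursion.
frames = [np.full((2, 2), float(t)) for t in range(3)]
abnormal, diff = detect_video(frames, predict_next=lambda e: e, threshold=10.0)
```

With a real trained model, a video whose frames are well predicted from the running error images yields a small final difference, while an abnormal event inflates it past the threshold.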
S23. Calculate the degree of difference between the Nth video frame and the prediction frame of the Nth video frame.
The degree of difference represents the similarity between the Nth video frame and its prediction frame: the lower the similarity between the two, the larger the gap between the Nth video frame and the prediction frame, meaning the video to be tested contains a video frame that differs too much from the prediction frame, that is, an abnormality exists.
As an example and not a limitation, the similarity between the Nth video frame and its prediction frame can be calculated by methods such as histogram matching, cosine similarity, or the mean hash algorithm. Alternatively, the similarity between the two can also be expressed by the distance between the images; for example, it can be calculated by the Euclidean distance, the Mahalanobis distance, the Manhattan distance, etc., but is not limited to these.
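For illustration, two of the measures named above (cosine similarity and Euclidean distance) can be computed on flattened frames as below; the remaining measures (histogram matching, mean hash, Mahalanobis, Manhattan) are omitted for brevity:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity of two frames viewed as flat vectors (1.0 = same direction)."""
    a = a.ravel().astype(np.float64)
    b = b.ravel().astype(np.float64)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def euclidean_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Euclidean distance between two frames; larger means less similar."""
    return float(np.linalg.norm(a.astype(np.float64) - b.astype(np.float64)))

frame = np.ones((4, 4))
print(cosine_similarity(frame, 2 * frame))  # 1.0 -- same direction
print(euclidean_distance(frame, frame))     # 0.0 -- identical frames
```

Either quantity can serve as the basis of the degree of difference, with distance-like measures used directly and similarity-like measures inverted.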
S24、判断差异度是否符合预设条件。S24. Determine whether the degree of difference meets a preset condition.
一些实施方式中,可以根据实际应用时,可根据视频检测要求的精度、视频检测中包含的场景中的干扰信息等因素,设置预设条件,以确保视频检测时既能准确检测到异常,又不会因过于灵敏或过于迟钝而造成误检。In some implementations, preset conditions can be set according to actual applications, the accuracy required for video detection, the interference information in the scene contained in the video detection, and other factors, so as to ensure that the abnormality can be accurately detected during video detection. It will not cause false detection due to being too sensitive or too slow.
S25、若差异度符合预设条件,则确定待测视频存在异常。S25. If the degree of difference meets the preset condition, it is determined that there is an abnormality in the video to be tested.
S26、若差异度不符合预设条件,则确定待测视频无异常。S26. If the degree of difference does not meet the preset condition, it is determined that the video to be tested has no abnormality.
需要说明的是，待测视频存在异常，指的是待测视频中存在有别于正常事件的特别事件。例如，若待测视频中的场景为街道，则特别事件可以包括如在人行道上行驶的机动车、横穿机动车道的行人、抢劫或斗殴等暴力行为；若待测视频中的场景为室内，则特别事件可以包括如出现浓烟、出现明火、人群过于拥挤等。It should be noted that an abnormality in the video to be tested refers to the presence of a special event that differs from normal events. For example, if the scene in the video to be tested is a street, special events can include a motor vehicle driving on a sidewalk, a pedestrian crossing the motorway, or violent behavior such as robbery or fighting; if the scene is indoors, special events can include, for example, the appearance of heavy smoke, the appearance of open flames, or overcrowding.
在以上实施方式中，由于计算预测帧时是根据前一帧的误差图像预测得到的，因此生成的预测帧考虑到了待测视频的时序影响，使得获取的第N个视频帧的预测帧更准确，进而在根据第N个视频帧的预测帧和第N个视频帧确认待测视频是否异常时，降低了误检的概率，提高了检测准确度。In the above embodiments, since each prediction frame is computed from the error image of the previous frame, the generated prediction frame takes the temporal dependencies of the video to be tested into account, making the obtained prediction frame of the Nth video frame more accurate. Therefore, when determining whether the video to be tested is abnormal based on the Nth video frame and its prediction frame, the probability of false detection is reduced and the detection accuracy is improved.
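A minimal sketch of the recursion described above: each prediction frame is produced from the previous frame's error image. The `predict` function here is a hypothetical stand-in for the trained prediction network model (an LSTM or GRU in the embodiments); the real model is learned from samples, as stated earlier.

```python
import numpy as np

def predict(error_image: np.ndarray) -> np.ndarray:
    # Hypothetical stand-in for the trained prediction network model:
    # it simply echoes the error image. A real system would run a
    # trained LSTM/GRU-based model here.
    return error_image.copy()

def prediction_frames(frames, preset_error):
    """Return the prediction frame for each of the N video frames."""
    preds = [predict(preset_error)]       # prediction for frame 1 uses the preset error image
    for i in range(len(frames) - 1):
        error = frames[i] - preds[i]      # error image of frame i
        preds.append(predict(error))      # prediction of frame i+1
    return preds

frames = [np.full((2, 2), float(v)) for v in (1, 2, 3)]
preds = prediction_frames(frames, preset_error=np.zeros((2, 2)))
# With this toy stand-in: pred1 = 0, pred2 = frame1 - pred1 = 1, pred3 = frame2 - pred2 = 1
```

The structure (error image in, next prediction out) is what carries the temporal information; only the inner model is simplified here.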
一些实施方式中，可以通过将第N个视频帧的预测帧与第N个视频帧相减并取模，得到第N个视频帧与其预测帧的差异度。In some implementations, the degree of difference between the Nth video frame and its predicted frame can be obtained by subtracting the predicted frame of the Nth video frame from the Nth video frame and taking the modulus of the result.
其中，将第N个视频帧的预测帧与第N个视频帧相减并取模可以通过L2范数实现。例如，若第N个视频帧的预测帧为Î_N，第N个视频帧为I_N，则差异度为 e_N = ‖Î_N − I_N‖₂。Here, subtracting the predicted frame of the Nth video frame from the Nth video frame and taking the modulus can be implemented through the L2 norm. For example, if the predicted frame of the Nth video frame is Î_N and the Nth video frame is I_N, then the degree of difference is e_N = ‖Î_N − I_N‖₂.
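The L2-norm difference described above can be evaluated directly with NumPy, whose `linalg.norm` computes the L2 norm of the flattened difference by default; the frame values below are hypothetical.

```python
import numpy as np

def difference_degree(frame_n: np.ndarray, pred_n: np.ndarray) -> float:
    """e_N = ||I_hat_N - I_N||_2 : L2 norm of the prediction error."""
    return float(np.linalg.norm(pred_n.astype(float) - frame_n.astype(float)))

frame_n = np.array([[1.0, 2.0], [3.0, 4.0]])
pred_n = np.array([[1.0, 2.0], [3.0, 1.0]])  # differs in a single pixel by 3

print(difference_degree(frame_n, pred_n))  # 3.0
```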
另一些实施方式中,上述S23中计算第N个视频帧与第N个视频帧的预测帧的差异值还可以通过如图3所示的流程实现,请参阅图3,上述S23可以包括:In other embodiments, the calculation of the difference between the Nth video frame and the prediction frame of the Nth video frame in S23 may also be implemented through the process shown in FIG. 3. Please refer to FIG. 3. The foregoing S23 may include:
S231、通过预设修复算法,对第N个视频帧的预测帧进行修复。S231: Repair the predicted frame of the Nth video frame by using a preset repair algorithm.
一些实施方式中，对第N个视频帧的预测帧Î_N进行修复，修正预测过程中产生的噪声、畸变等影响检测的问题，得到修复后的第N个视频帧的预测帧R_N。其中，预设修复算法可以是卷积自编码器，使用卷积自编码器对Î_N进行修复是本领域技术人员的常规手段，在此不再赘述。In some embodiments, the predicted frame Î_N of the Nth video frame is repaired to correct problems such as noise and distortion introduced by the prediction process that affect detection, yielding the repaired predicted frame R_N of the Nth video frame. The preset repair algorithm may be a convolutional autoencoder; using a convolutional autoencoder to repair Î_N is a routine technique for those skilled in the art and will not be detailed here.
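The text specifies a convolutional autoencoder as the repair algorithm; training one is out of scope for a short sketch, so the example below substitutes a simple 3x3 mean filter purely as a placeholder for the repair step (damping isolated noise in the predicted frame). The filter choice is an assumption for illustration, not the method described above.

```python
import numpy as np

def repair(pred: np.ndarray) -> np.ndarray:
    """Placeholder repair step: 3x3 mean filter with edge replication.
    A real implementation would run a trained convolutional autoencoder."""
    padded = np.pad(pred.astype(float), 1, mode="edge")
    out = np.zeros_like(pred, dtype=float)
    h, w = pred.shape
    for y in range(h):
        for x in range(w):
            out[y, x] = padded[y:y + 3, x:x + 3].mean()
    return out

noisy = np.zeros((5, 5))
noisy[2, 2] = 9.0        # a single "noise" pixel in the predicted frame
repaired = repair(noisy)
print(repaired[2, 2])    # 1.0, the spike is spread out and damped
```

Any denoising operator fits this slot; what matters for the method is that the repaired R_N, not the raw prediction, is compared against the Nth video frame.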
S232、将修复后的第N个视频帧的预测帧与第N个视频帧相减并取模,得到差异度。S232. The predicted frame of the Nth video frame after repair is subtracted from the Nth video frame and the modulus is taken to obtain the degree of difference.
参考上述示例，依旧可以使用L2范数计算修复后的第N个视频帧的预测帧R_N与第N个视频帧I_N的差异度，即 e_N = ‖R_N − I_N‖₂。Referring to the above example, the L2 norm can still be used to calculate the degree of difference between the repaired predicted frame R_N of the Nth video frame and the Nth video frame I_N, i.e., e_N = ‖R_N − I_N‖₂.
通过预设修复算法将第N个视频帧的预测帧进行修复后，再与第N个视频帧计算差异度，可以有效地减少由于噪声、畸变等情况导致检测时差异度提高的问题，减少了出现误检的概率，提高了视频检测的准确度。After the predicted frame of the Nth video frame is repaired by the preset repair algorithm, the degree of difference is then calculated against the Nth video frame. This effectively mitigates the degree of difference being inflated by noise, distortion, and similar issues during detection, reduces the probability of false detection, and improves the accuracy of video detection.
可选地,若差异度符合预设条件,则确定待测视频存在异常,可以在差异度大于第一预设阈值时,确定待测视频存在异常。Optionally, if the degree of difference meets a preset condition, it is determined that the video to be tested is abnormal, and when the degree of difference is greater than a first preset threshold, it is determined that the video to be tested is abnormal.
需要说明的是,差异度越大,则表示待测视频中存在异常的可能性越大。It should be noted that the greater the degree of difference, the greater the possibility that there is an abnormality in the video to be tested.
作为示例而非限定，当差异度大于70%时，即可确定待测视频存在异常，即，第一预设阈值可以设置为70%。应当理解，在对于不同精度要求的场景下，第一预设阈值还可以设置为63%、72%、86%等其他数值。As an example and not a limitation, when the degree of difference is greater than 70%, the video to be tested can be determined to be abnormal; that is, the first preset threshold can be set to 70%. It should be understood that, in scenarios with different accuracy requirements, the first preset threshold can also be set to other values such as 63%, 72%, or 86%.
可选地,上述S24中若差异度符合预设条件,则确定待测视频存在异常的过程,还可以通过如图4所示的流程实现。请参阅图4,上述S24可以包括:Optionally, if the degree of difference in S24 meets a preset condition, the process of determining that the video to be tested is abnormal may also be implemented through the process shown in FIG. 4. Please refer to Figure 4, the above S24 may include:
S241、对差异度的倒数进行归一化,得到正常度分数。S241. Normalize the reciprocal of the difference degree to obtain a normality score.
一些实施方式中,将差异度的倒数进行归一化后,可以得到待测视频的正常度分数,正常度分数的区间范围可以为[0,1]或者[0,100],在此不做限制。In some embodiments, after normalizing the reciprocal of the difference degree, the normality score of the video to be tested can be obtained. The interval range of the normality score can be [0,1] or [0,100], which is not limited here.
其中，正常度分数越高，则表示待测视频为正常的概率越高，例如，若正常度分数的区间为[0,100]，则0分可以表示待测视频存在异常，100分表示待测视频正常。The higher the normality score, the higher the probability that the video to be tested is normal. For example, if the range of the normality score is [0,100], then 0 points can indicate that the video to be tested is abnormal, and 100 points can indicate that it is normal.
S242、若正常度分数小于第二预设阈值时,确定待测视频存在异常。S242: If the normality score is less than the second preset threshold, it is determined that there is an abnormality in the video to be tested.
参考S241中的示例，一种可能的实现方式中，可以在正常度分数小于30分时，确定待测视频存在异常，即第二预设阈值为30。应当理解，在对于不同精度要求的场景下，第二预设阈值还可以设置为35、27、12等其他数值，在此不做限制。Referring to the example in S241, in one possible implementation, the video to be tested can be determined to be abnormal when the normality score is less than 30 points, that is, the second preset threshold is 30. It should be understood that, in scenarios with different accuracy requirements, the second preset threshold can also be set to other values such as 35, 27, or 12, which is not limited here.
其中，将差异度的倒数归一化后，通过待测视频的正常度分数确定待测视频是否存在异常，在展示检测结果时，由于正常度分数更加符合用户的习惯，因此可以让用户更直观地确定待测视频存在异常的程度，提高用户体验。Here, after the reciprocal of the difference degree is normalized, the normality score of the video to be tested is used to determine whether the video is abnormal. When the detection result is displayed, the normality score better matches users' habits, allowing users to grasp more intuitively how abnormal the video to be tested is, which improves the user experience.
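A sketch of S241 and S242 under stated assumptions: the text does not fix the exact normalization, so this example maps the reciprocal of the difference degree into [0, 100] with the bounded transform x / (1 + x), then flags an anomaly when the score falls below the second preset threshold (30, per the example above). The normalization choice is an assumption.

```python
SECOND_PRESET_THRESHOLD = 30.0  # from the example in the text

def normality_score(difference: float) -> float:
    """Map the reciprocal of the difference degree into [0, 100].
    Uses x / (1 + x) with x = 1/difference, an assumed normalization;
    the text leaves the exact normalization open."""
    if difference == 0.0:
        return 100.0                  # identical frames: fully normal
    x = 1.0 / difference
    return 100.0 * x / (1.0 + x)      # equivalent to 100 / (1 + difference)

def is_abnormal(difference: float) -> bool:
    return normality_score(difference) < SECOND_PRESET_THRESHOLD

print(normality_score(0.0))  # 100.0
print(is_abnormal(0.1))      # False: score is about 90.9
print(is_abnormal(9.0))      # True: score is 10.0
```

Any monotone mapping from 1/e to a bounded interval would serve; the threshold then plays the role of the second preset threshold in S242.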
应理解,上述实施例中各步骤的序号的大小并不意味着执行顺序的先后,各过程的执行顺序应以其功能和内在逻辑确定,而不应对本申请实施例的实施过程构成任何限定。It should be understood that the size of the sequence number of each step in the foregoing embodiment does not mean the order of execution. The execution sequence of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiment of the present application.
图5是本申请另一实施例提供的应用场景示意图。Fig. 5 is a schematic diagram of an application scenario provided by another embodiment of the present application.
请参阅图5，图5中展示了将本申请的方案应用于汽车自动驾驶时，对危险行为进行提醒的场景。在此，以汽车自动驾驶为例，对本申请提供的视频检测方法的应用进行解释，应当明确，以下应用方式仅为示例，而并非限制。Please refer to FIG. 5, which shows a scenario in which the solution of this application is applied to automatic driving of a car to warn of dangerous situations. Here, taking automatic driving as an example, the application of the video detection method provided by this application is explained. It should be clear that the following application is only an example and not a limitation.
该场景中，可以在汽车的进气格栅16、后保险杠17、两侧后视镜18等至少一个位置设置视频采集设备11（未示出），视频采集设备11在本场景中可以采用微型摄像头、红外摄像头等，用于采集汽车行驶时车身周围的影像。视频采集设备11可以通过电缆与车载电脑（未示出）连接，车载电脑接收视频采集设备11发送的汽车行驶时拍摄的至少一个待测视频，并对每个待测视频进行处理，得到检测结果。In this scenario, a video capture device 11 (not shown) can be installed at at least one of the car's air intake grille 16, rear bumper 17, and side mirrors 18. In this scenario, the video capture device 11 can be a miniature camera, an infrared camera, etc., used to capture images of the car's surroundings while it is driving. The video capture device 11 can be connected to an on-board computer (not shown) through a cable; the on-board computer receives at least one video to be tested, shot while the car is driving and sent by the video capture device 11, and processes each video to be tested to obtain detection results.
其中，设置于进气格栅16的摄像头可以采集汽车前方的视频影像，设置于后保险杠17的摄像头可以采集汽车后方的视频影像，设置于两侧后视镜18的摄像头可以采集汽车两侧的视频影像。汽车自动驾驶时，可以提高拍摄汽车行驶方向的摄像头的检测频率。例如，可以将进气格栅16上的摄像头采集的FPS设置为120，检测周期为0.1秒，则每0.1秒内，需检测包括12个视频帧的待测视频是否存在异常。其中，每次检测的第1个视频帧的预测帧可以使用上一个检测周期内第12个视频帧的预测帧。在一个检测周期中，若检测出待测视频存在异常，如行驶方向存在行人192或者其他可能引起安全问题的障碍物，则可以通过语音或灯光提醒驾驶员注意安全，或者，还可以协同其他传感器，实现自动刹车、自动避让等功能。Among them, the camera on the air intake grille 16 can capture video images in front of the car, the camera on the rear bumper 17 can capture video images behind the car, and the cameras on the side mirrors 18 can capture video images on both sides of the car. When the car drives automatically, the detection frequency of the camera facing the driving direction can be increased. For example, the FPS of the camera on the air intake grille 16 can be set to 120 with a detection cycle of 0.1 seconds; then, in every 0.1 seconds, a video to be tested containing 12 video frames needs to be checked for abnormalities. The prediction frame of the first video frame in each detection cycle can reuse the prediction frame of the 12th video frame from the previous cycle. If an abnormality is detected within a cycle, such as a pedestrian 192 or another obstacle posing a safety risk in the driving direction, the driver can be alerted by voice or light, or the system can cooperate with other sensors to implement functions such as automatic braking and automatic avoidance.
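The frame counts in the example above follow directly from FPS multiplied by the detection period; a quick check with the two camera configurations mentioned in the text:

```python
def frames_per_detection(fps: int, period_seconds: float) -> int:
    """Number of video frames accumulated in one detection period."""
    return round(fps * period_seconds)

print(frames_per_detection(120, 0.1))  # 12, front camera (intake grille)
print(frames_per_detection(30, 0.2))   # 6, rear camera (rear bumper)
```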
同时，设置于两侧后视镜18的摄像头可以采取同样的设置，用于检测汽车两侧是否存在靠近的其他车辆191或其他可能造成安全隐患的情况，若存在，也可以通过语音或灯光提醒驾驶员注意安全，或协同其他传感器，实现自动刹车、自动避让等功能。At the same time, the cameras installed on the side mirrors 18 can adopt the same settings to detect whether another vehicle 191 is approaching on either side of the car or whether other potential safety hazards exist; if so, the driver can likewise be alerted by voice or light, or other sensors can be coordinated with to implement functions such as automatic braking and automatic avoidance.
还需要说明的是，汽车前进行驶时，汽车后方相对安全，可以降低与汽车行驶方向相反的摄像头的检测频率，以减少行车电脑的负荷。例如，可以将后保险杠17上的摄像头采集的FPS设置为30，检测周期为0.2秒，则每0.2秒内，仅需检测包括6个视频帧的待测视频是否存在异常。It should also be noted that, when the car is driving forward, the area behind the car is relatively safe, so the detection frequency of the camera facing opposite to the driving direction can be reduced to lessen the load on the on-board computer. For example, the FPS of the camera on the rear bumper 17 can be set to 30 with a detection cycle of 0.2 seconds; then, in every 0.2 seconds, only a video to be tested containing 6 video frames needs to be checked for abnormalities.
还有一种场景，在汽车自动泊车时，可以将上述示例中后保险杠17上的摄像头的设置参数与进气格栅16上的摄像头的设置参数互换，以更加灵敏地检测位于汽车后方的行人192或其他障碍物。In another scenario, when the car parks automatically, the setting parameters of the camera on the rear bumper 17 in the above example can be swapped with those of the camera on the air intake grille 16, so as to detect pedestrians 192 or other obstacles behind the car more sensitively.
对应于上文实施例所述的视频检测方法,图6是本申请一实施例提供的视频检测装置的结构示意图,为了便于说明,仅示出了与本申请实施例相关的部分。Corresponding to the video detection method described in the above embodiment, FIG. 6 is a schematic structural diagram of a video detection device provided in an embodiment of the present application. For ease of description, only parts related to the embodiment of the present application are shown.
参照图6,该装置包括:Referring to Figure 6, the device includes:
获取模块31,用于获取待测视频,所述待测视频包括N个视频帧,其中,N为大于1的整数。The acquiring module 31 is configured to acquire a video to be tested, and the video to be tested includes N video frames, where N is an integer greater than one.
预测模块32，用于依次获取所述N个视频帧的预测帧。其中，所述N个视频帧中第i+1个视频帧的预测帧的计算方式为：将第i个视频帧的误差图像，输入到已训练的预测网络模型进行处理，得到所述第i+1个视频帧的预测帧，所述第i个视频帧的误差图像是根据所述第i个视频帧和所述第i个视频帧的预测帧相减所得，1≤i≤N-1，i为整数。The prediction module 32 is configured to sequentially obtain the predicted frames of the N video frames. The predicted frame of the (i+1)-th video frame among the N video frames is calculated as follows: the error image of the i-th video frame is input into the trained prediction network model for processing to obtain the predicted frame of the (i+1)-th video frame, where the error image of the i-th video frame is obtained by subtracting the predicted frame of the i-th video frame from the i-th video frame, 1≤i≤N-1, and i is an integer.
计算模块33,用于计算所述第N个视频帧与所述第N个视频帧的预测帧的差异度。The calculation module 33 is configured to calculate the degree of difference between the predicted frame of the Nth video frame and the Nth video frame.
确定模块34,用于若所述差异度符合预设条件,则确定所述待测视频存在异常。The determining module 34 is configured to determine that the video to be tested has an abnormality if the degree of difference meets a preset condition.
需要说明的是,第1个视频帧的预测帧是将预设误差图像输入到预测网络模型进行处理后得到的。It should be noted that the prediction frame of the first video frame is obtained after inputting the preset error image into the prediction network model for processing.
一些实施方式中,计算模块33,具体用于将第N个视频帧的预测帧与第N 个视频帧相减并取模,得到差异度。In some implementation manners, the calculation module 33 is specifically configured to subtract the predicted frame of the Nth video frame from the Nth video frame and take the modulus to obtain the degree of difference.
另一些实施方式中,计算模块33,具体用于通过预设修复算法,对第N个视频帧的预测帧进行修复。然后将修复后的第N个视频帧的预测帧与第N个视频帧相减并取模,得到差异度。In other embodiments, the calculation module 33 is specifically configured to repair the predicted frame of the Nth video frame by using a preset repair algorithm. Then, the predicted frame of the Nth video frame after repair is subtracted from the Nth video frame and the modulus is obtained to obtain the difference degree.
可选地,确定模块34,具体用于在差异度大于第一预设阈值时,确定待测视频存在异常。Optionally, the determining module 34 is specifically configured to determine that the video to be tested has an abnormality when the degree of difference is greater than a first preset threshold.
可选地,确定模块34,还用于先对差异度的倒数进行归一化,得到正常度分数。然后,在正常度分数小于第二预设阈值时,确定待测视频存在异常。Optionally, the determining module 34 is further configured to first normalize the reciprocal of the difference degree to obtain the normality score. Then, when the normality score is less than the second preset threshold, it is determined that the video to be tested is abnormal.
其中,预测网络模型为LSTM模型或者为GRU模型。Among them, the predictive network model is an LSTM model or a GRU model.
需要说明的是，上述模块之间的信息交互、执行过程等内容，由于与本申请方法实施例基于同一构思，其具体功能及带来的技术效果，具体可参见方法实施例部分，此处不再赘述。It should be noted that the information exchange and execution processes between the above modules are based on the same concept as the method embodiments of this application; for their specific functions and technical effects, refer to the method embodiments, which will not be repeated here.
所属领域的技术人员可以清楚地了解到，为了描述的方便和简洁，仅以上述各功能单元、模块的划分进行举例说明，实际应用中，可以根据需要而将上述功能分配由不同的功能单元、模块完成，即将所述装置的内部结构划分成不同的功能单元或模块，以完成以上描述的全部或者部分功能。实施例中的各功能单元、模块可以集成在一个处理单元中，也可以是各个单元单独物理存在，也可以两个或两个以上单元集成在一个单元中，上述集成的单元既可以采用硬件的形式实现，也可以采用软件功能单元的形式实现。另外，各功能单元、模块的具体名称也只是为了便于相互区分，并不用于限制本申请的保护范围。上述系统中单元、模块的具体工作过程，可以参考前述方法实施例中的对应过程，在此不再赘述。Those skilled in the art can clearly understand that, for convenience and brevity of description, the division into the above functional units and modules is only an example. In practical applications, the above functions can be allocated to different functional units and modules as needed; that is, the internal structure of the device can be divided into different functional units or modules to complete all or part of the functions described above. The functional units and modules in the embodiments can be integrated into one processing unit, each unit can exist alone physically, or two or more units can be integrated into one unit; the integrated unit can be implemented in the form of hardware or in the form of a software functional unit. In addition, the specific names of the functional units and modules are only used to distinguish them from each other and are not used to limit the protection scope of this application. For the specific working processes of the units and modules in the above system, refer to the corresponding processes in the foregoing method embodiments, which will not be repeated here.
图7是本申请实施例提供的终端设备的结构示意图。Fig. 7 is a schematic structural diagram of a terminal device provided by an embodiment of the present application.
请参照图7，本申请实施例还提供了一种终端设备4，该终端设备4包括：至少一个处理器41、存储器42以及存储在存储器中并可在至少一个处理器41上运行的计算机程序43，处理器41执行计算机程序43时实现上述任意各个方法实施例中的步骤。Referring to FIG. 7, an embodiment of the present application further provides a terminal device 4. The terminal device 4 includes: at least one processor 41, a memory 42, and a computer program 43 stored in the memory and executable on the at least one processor 41. When the processor 41 executes the computer program 43, the steps in any of the foregoing method embodiments are implemented.
需要说明的是，上述图7并不构成对终端设备4结构的限定，终端设备4可以包括比图示更多或更少的部件，或者组合某些部件，或者不同的部件，例如终端设备4还可以包括显示屏、指示灯、马达、控件（例如按键）、陀螺仪传感器、加速度传感器等。It should be noted that FIG. 7 does not constitute a limitation on the structure of the terminal device 4, which may include more or fewer components than shown, combine some components, or use different components; for example, the terminal device 4 may further include a display screen, indicator lights, a motor, controls (such as buttons), a gyroscope sensor, an acceleration sensor, etc.
处理器41可以是中央处理单元（Central Processing Unit，CPU），该处理器41还可以是其他通用处理器、数字信号处理器（Digital Signal Processor，DSP）、专用集成电路（Application Specific Integrated Circuit，ASIC）、现成可编程门阵列（Field-Programmable Gate Array，FPGA）或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件等。通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器等。The processor 41 may be a central processing unit (CPU), and may also be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
所述存储器42在一些实施例中可以是终端设备4的内部存储单元,例如终端设备4的硬盘或内存。存储器42在另一些实施例中也可以是终端设备4的外部存储设备,例如终端设备4上配备的插接式硬盘,智能存储卡(Smart Media Card,SMC),安全数字(Secure Digital,SD)卡,闪存卡(Flash Card)等。进一步地,所述存储器42还可以既包括终端设备4的内部存储单元也包括外部存储设备。所述存储器42用于存储操作***、应用程序、引导装载程序(BootLoader)、数据以及其他程序等,例如所述计算机程序的程序代码等。所述存储器42还可以用于暂时地存储已经得到或者将要得到的数据。The memory 42 may be an internal storage unit of the terminal device 4 in some embodiments, such as a hard disk or memory of the terminal device 4. In other embodiments, the memory 42 may also be an external storage device of the terminal device 4, for example, a plug-in hard disk equipped on the terminal device 4, a smart memory card (Smart Media Card, SMC), and a Secure Digital (SD) Card, Flash Card, etc. Further, the memory 42 may also include both an internal storage unit of the terminal device 4 and an external storage device. The memory 42 is used to store an operating system, an application program, a boot loader (BootLoader), data, and other programs, such as the program code of the computer program. The memory 42 can also be used to temporarily store data that has been obtained or will be obtained.
本申请实施例还提供了一种计算机可读存储介质,所述计算机可读存储介质存储有计算机程序,所述计算机程序被处理器执行时实现可实现上述各个方法实施例中的步骤。The embodiments of the present application also provide a computer-readable storage medium, where the computer-readable storage medium stores a computer program, and when the computer program is executed by a processor, the steps in each of the foregoing method embodiments can be realized.
本申请实施例提供了一种计算机程序产品,当计算机程序产品在移动终端上运行时,使得移动终端执行时实现可实现上述各个方法实施例中的步骤。The embodiments of the present application provide a computer program product. When the computer program product runs on a mobile terminal, the steps in the foregoing method embodiments can be realized when the mobile terminal is executed.
所述集成的单元如果以软件功能单元的形式实现并作为独立的产品销售或使用时，可以存储在一个计算机可读取存储介质中。基于这样的理解，本申请实现上述实施例方法中的全部或部分流程，可以通过计算机程序来指令相关的硬件来完成，所述的计算机程序可存储于一计算机可读存储介质中，该计算机程序在被处理器执行时，可实现上述各个方法实施例的步骤。其中，所述计算机程序包括计算机程序代码，所述计算机程序代码可以为源代码形式、对象代码形式、可执行文件或某些中间形式等。所述计算机可读介质至少可以包括：能够将计算机程序代码携带到拍照装置/终端设备的任何实体或装置、记录介质、计算机存储器、只读存储器（ROM，Read-Only Memory）、随机存取存储器（RAM，Random Access Memory）、电载波信号、电信信号以及软件分发介质，例如U盘、移动硬盘、磁碟或者光盘等。在某些司法管辖区，根据立法和专利实践，计算机可读介质不可以是电载波信号和电信信号。If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium. Based on this understanding, all or part of the processes in the methods of the above embodiments of this application can be implemented by instructing relevant hardware through a computer program. The computer program can be stored in a computer-readable storage medium, and when executed by a processor, the steps of the foregoing method embodiments can be implemented. The computer program includes computer program code, which may be in the form of source code, object code, an executable file, or some intermediate form. The computer-readable medium may include at least: any entity or device capable of carrying the computer program code to the photographing device/terminal device, a recording medium, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunication signal, and a software distribution medium, for example, a USB flash drive, a removable hard disk, a magnetic disk, or an optical disc. In some jurisdictions, according to legislation and patent practice, computer-readable media may not be electrical carrier signals and telecommunication signals.
在上述实施例中,对各个实施例的描述都各有侧重,某个实施例中没有详述或记载的部分,可以参见其它实施例的相关描述。In the above-mentioned embodiments, the description of each embodiment has its own emphasis. For parts that are not described in detail or recorded in an embodiment, reference may be made to related descriptions of other embodiments.
本领域普通技术人员可以意识到,结合本文中所公开的实施例描述的各示例的单元及算法步骤,能够以电子硬件、或者计算机软件和电子硬件的结合来实现。这些功能究竟以硬件还是软件方式来执行,取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本申请的范围。A person of ordinary skill in the art may realize that the units and algorithm steps of the examples described in combination with the embodiments disclosed herein can be implemented by electronic hardware or a combination of computer software and electronic hardware. Whether these functions are performed by hardware or software depends on the specific application and design constraint conditions of the technical solution. Professionals and technicians can use different methods for each specific application to implement the described functions, but such implementation should not be considered beyond the scope of this application.
在本申请所提供的实施例中,应该理解到,所揭露的装置/终端设备和方法,可以通过其它的方式实现。例如,以上所描述的装置/终端设备实施例仅仅是示意性的,例如,所述模块或单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可以结合或者可以集成到另一个***,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通讯连接可以是通过一些接口,装置或单元的间接耦合或通讯连接,可以是电性,机械或其它的形式。In the embodiments provided in this application, it should be understood that the disclosed device/terminal device and method may be implemented in other ways. For example, the device/terminal device embodiments described above are merely illustrative. For example, the division of the modules or units is only a logical function division, and there may be other divisions in actual implementation, such as multiple units. Or components can be combined or integrated into another system, or some features can be omitted or not implemented. In addition, the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.
所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者 也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。The units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
以上所述实施例仅用以说明本申请的技术方案，而非对其限制；尽管参照前述实施例对本申请进行了详细的说明，本领域的普通技术人员应当理解：其依然可以对前述各实施例所记载的技术方案进行修改，或者对其中部分技术特征进行等同替换；而这些修改或者替换，并不使相应技术方案的本质脱离本申请各实施例技术方案的精神和范围，均应包含在本申请的保护范围之内。The above embodiments are only intended to illustrate the technical solutions of this application, not to limit them. Although this application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions recorded in the foregoing embodiments or equivalently replace some of their technical features; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of this application, and shall all fall within the protection scope of this application.

Claims (10)

  1. 一种视频检测方法,其特征在于,包括:A video detection method, characterized in that it comprises:
    获取待测视频,所述待测视频包括N个视频帧,其中,N为大于1的整数;Acquiring a video to be tested, where the video to be tested includes N video frames, where N is an integer greater than 1;
    依次获取所述N个视频帧的预测帧，其中，所述N个视频帧中第i+1个视频帧的预测帧的计算方式为：将第i个视频帧的误差图像，输入到已训练的预测网络模型进行处理，得到所述第i+1个视频帧的预测帧，所述第i个视频帧的误差图像是根据所述第i个视频帧和所述第i个视频帧的预测帧相减所得，1≤i≤N-1，i为整数；sequentially obtaining predicted frames of the N video frames, wherein the predicted frame of the (i+1)-th video frame among the N video frames is calculated by: inputting an error image of the i-th video frame into a trained prediction network model for processing to obtain the predicted frame of the (i+1)-th video frame, the error image of the i-th video frame being obtained by subtracting the predicted frame of the i-th video frame from the i-th video frame, where 1≤i≤N-1 and i is an integer;
    计算所述第N个视频帧与所述第N个视频帧的预测帧的差异度;Calculating the degree of difference between the predicted frame of the Nth video frame and the Nth video frame;
    若所述差异度符合预设条件,则确定所述待测视频存在异常。If the degree of difference meets a preset condition, it is determined that the video to be tested is abnormal.
  2. 根据权利要求1所述的方法,其特征在于,第1个视频帧的预测帧是将预设误差图像输入到所述预测网络模型进行处理后得到的。The method according to claim 1, wherein the prediction frame of the first video frame is obtained by inputting a preset error image into the prediction network model for processing.
  3. 根据权利要求1所述的方法,其特征在于,所述计算所述第N个视频帧与所述第N个视频帧的预测帧的差异度,包括:The method according to claim 1, wherein the calculating the difference between the predicted frame of the Nth video frame and the Nth video frame comprises:
    将所述第N个视频帧的预测帧与所述第N个视频帧相减并取模,得到所述差异度。The prediction frame of the Nth video frame and the Nth video frame are subtracted and a modulo is obtained to obtain the degree of difference.
  4. 根据权利要求1所述的方法,其特征在于,所述计算所述第N个视频帧与所述第N个视频帧的预测帧的差异值,包括:The method according to claim 1, wherein the calculating the difference value between the prediction frame of the Nth video frame and the Nth video frame comprises:
    通过预设修复算法,对所述第N个视频帧的预测帧进行修复;Repairing the predicted frame of the Nth video frame by using a preset repair algorithm;
    将修复后的所述第N个视频帧的预测帧与所述第N个视频帧相减并取模,得到所述差异度。The predicted frame of the Nth video frame after repair is subtracted from the Nth video frame and a modulo is taken to obtain the degree of difference.
  5. 根据权利要求1-4任一项所述的方法,其特征在于,所述若所述差异度符合预设条件,则确定所述待测视频存在异常,包括:The method according to any one of claims 1-4, wherein the determining that the video to be tested has an abnormality if the degree of difference meets a preset condition comprises:
    若所述差异度大于第一预设阈值,则确定所述待测视频存在异常。If the degree of difference is greater than the first preset threshold, it is determined that the video to be tested is abnormal.
  6. 根据权利要求1-4任一项所述的方法,其特征在于,所述若所述差异度 符合预设条件,则确定所述待测视频存在异常,包括:The method according to any one of claims 1 to 4, wherein the determining that the video to be tested has an abnormality if the degree of difference meets a preset condition comprises:
    对所述差异度的倒数进行归一化,得到正常度分数;Normalize the reciprocal of the difference degree to obtain the normality score;
    若所述正常度分数小于第二预设阈值,则确定所述待测视频存在异常。If the normality score is less than a second preset threshold, it is determined that the video to be tested is abnormal.
  7. 根据权利要求1-4任一项所述的方法,其特征在于,所述预测网络模型为长短时记忆网络LSTM模型或者为门控循环神经网络GRU模型。The method according to any one of claims 1 to 4, wherein the predictive network model is a long and short-term memory network LSTM model or a gated recurrent neural network GRU model.
  8. 一种视频检测装置,其特征在于,包括:A video detection device, characterized in that it comprises:
    获取模块,用于获取待测视频,所述待测视频包括N个视频帧,其中,N为大于1的整数;An acquisition module, configured to acquire a video to be tested, the video to be tested includes N video frames, where N is an integer greater than 1;
    预测模块，用于依次获取所述N个视频帧的预测帧，其中，所述N个视频帧中第i+1个视频帧的预测帧的计算方式为：将第i个视频帧的误差图像，输入到已训练的预测网络模型进行处理，得到所述第i+1个视频帧的预测帧，所述第i个视频帧的误差图像是根据所述第i个视频帧和所述第i个视频帧的预测帧相减所得，1≤i≤N-1，i为整数；a prediction module configured to sequentially obtain predicted frames of the N video frames, wherein the predicted frame of the (i+1)-th video frame among the N video frames is calculated by: inputting an error image of the i-th video frame into a trained prediction network model for processing to obtain the predicted frame of the (i+1)-th video frame, the error image of the i-th video frame being obtained by subtracting the predicted frame of the i-th video frame from the i-th video frame, where 1≤i≤N-1 and i is an integer;
    计算模块,用于计算所述第N个视频帧与所述第N个视频帧的预测帧的差异度;A calculation module, configured to calculate the difference degree between the Nth video frame and the prediction frame of the Nth video frame;
    确定模块,用于若所述差异度符合预设条件,则确定所述待测视频存在异常。The determining module is configured to determine that the video to be tested has an abnormality if the degree of difference meets a preset condition.
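The module decomposition in claim 8 mirrors the method steps: a recursive loop in which each frame's error image (the frame minus its own prediction frame) is fed into the trained prediction network to produce the next frame's prediction, followed by a difference-degree check on the last frame. Below is a minimal Python sketch under stated assumptions: the stub predictor stands in for the trained LSTM/GRU model, the zero initial prediction for the first frame and the mean-absolute-difference definition of the "difference degree" are choices made for illustration, and frames are flattened to 1-D float lists.

```python
def make_stub_predictor(frame_len):
    """Stateful stand-in for the trained LSTM/GRU prediction network.

    It reconstructs the current frame from its own last prediction plus the
    error image, i.e. it predicts "the previous frame repeats". A real
    implementation would run a trained recurrent model on the error image.
    """
    state = {"pred": [0.0] * frame_len}

    def predict_next(error_image):
        current = [p + e for p, e in zip(state["pred"], error_image)]
        state["pred"] = current
        return current  # used as the prediction frame for the next frame

    return predict_next

def detect_anomaly(frames, predict_next, first_threshold):
    """Illustrative sketch of the recursive loop described in claim 8.

    frames: list of N video frames, each flattened to a list of floats.
    Returns (difference degree of frame N, abnormal flag).
    """
    pred = [0.0] * len(frames[0])  # assumed initial prediction for frame 1
    for frame in frames[:-1]:      # i = 1 .. N-1
        # error image of frame i: frame i minus its prediction frame
        error = [f - p for f, p in zip(frame, pred)]
        pred = predict_next(error)  # prediction frame of frame i+1
    last = frames[-1]
    # mean absolute difference as the "difference degree" (one possible choice)
    diff = sum(abs(f - p) for f, p in zip(last, pred)) / len(last)
    return diff, diff > first_threshold  # claim 5: compare to first threshold
```

With this stub, a static scene produces a zero difference degree (normal), while a sudden change in the last frame is flagged as abnormal.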
  9. A terminal device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the method according to any one of claims 1 to 7.
  10. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the method according to any one of claims 1 to 7.
PCT/CN2020/129171 2019-11-18 2020-11-16 Video detection method and apparatus, terminal device, and readable storage medium WO2021098657A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201911128730.9A CN110889351B (en) 2019-11-18 2019-11-18 Video detection method, device, terminal equipment and readable storage medium
CN201911128730.9 2019-11-18

Publications (1)

Publication Number Publication Date
WO2021098657A1 (en) 2021-05-27

Family ID=69747861

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/129171 WO2021098657A1 (en) 2019-11-18 2020-11-16 Video detection method and apparatus, terminal device, and readable storage medium

Country Status (2)

Country Link
CN (1) CN110889351B (en)
WO (1) WO2021098657A1 (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110889351B (en) * 2019-11-18 2023-09-26 中国科学院深圳先进技术研究院 Video detection method, device, terminal equipment and readable storage medium
CN113486853B (en) * 2021-07-29 2024-02-27 北京百度网讯科技有限公司 Video detection method and device, electronic equipment and medium
CN113435432B (en) * 2021-08-27 2021-11-30 腾讯科技(深圳)有限公司 Video anomaly detection model training method, video anomaly detection method and device
CN113762134B (en) * 2021-09-01 2024-03-29 沈阳工业大学 Method for detecting surrounding obstacles in automobile parking based on vision

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104036250A (en) * 2014-06-16 2014-09-10 上海大学 Video pedestrian detecting and tracking method
US20190156125A1 (en) * 2017-11-21 2019-05-23 Uber Technologies, Inc. Characterizing Content with a Predictive Error Representation
CN110298323A (en) * 2019-07-02 2019-10-01 中国科学院自动化研究所 Detection method of fighting based on video analysis, system, device
CN110414313A (en) * 2019-06-06 2019-11-05 平安科技(深圳)有限公司 Abnormal behaviour alarm method, device, server and storage medium
CN110889351A (en) * 2019-11-18 2020-03-17 中国科学院深圳先进技术研究院 Video detection method and device, terminal equipment and readable storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104281858B (en) * 2014-09-15 2018-07-10 中安消技术有限公司 Three dimensional convolution neural network training method, video accident detection method and device
CN109214253B (en) * 2017-07-07 2022-11-11 阿里巴巴集团控股有限公司 Video frame detection method and device

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113569770A (en) * 2021-07-30 2021-10-29 北京市商汤科技开发有限公司 Video detection method and device, electronic equipment and storage medium
CN113569770B (en) * 2021-07-30 2024-06-11 北京市商汤科技开发有限公司 Video detection method and device, electronic equipment and storage medium
CN113705370B (en) * 2021-08-09 2023-06-30 百度在线网络技术(北京)有限公司 Method and device for detecting illegal behaviors of live broadcasting room, electronic equipment and storage medium
CN113705370A (en) * 2021-08-09 2021-11-26 百度在线网络技术(北京)有限公司 Method and device for detecting illegal behavior of live broadcast room, electronic equipment and storage medium
CN113671917B (en) * 2021-08-19 2022-08-02 中国科学院自动化研究所 Detection method, system and equipment for abnormal state of multi-modal industrial process
CN113671917A (en) * 2021-08-19 2021-11-19 中国科学院自动化研究所 Detection method, system and equipment for abnormal state of multi-modal industrial process
CN113688925A (en) * 2021-08-31 2021-11-23 惠州学院 Attendance number identification method, electronic device and storage medium
CN113688925B (en) * 2021-08-31 2023-10-24 惠州学院 Attendance number identification method, electronic equipment and storage medium
CN114040197A (en) * 2021-11-29 2022-02-11 北京字节跳动网络技术有限公司 Video detection method, device, equipment and storage medium
CN114040197B (en) * 2021-11-29 2023-07-28 北京字节跳动网络技术有限公司 Video detection method, device, equipment and storage medium
CN114782284A (en) * 2022-06-17 2022-07-22 广州三七极耀网络科技有限公司 Motion data correction method, device, equipment and storage medium
CN117079079A (en) * 2023-09-27 2023-11-17 中电科新型智慧城市研究院有限公司 Training method of video anomaly detection model, video anomaly detection method and system
CN117079079B (en) * 2023-09-27 2024-03-15 中电科新型智慧城市研究院有限公司 Training method of video anomaly detection model, video anomaly detection method and system

Also Published As

Publication number Publication date
CN110889351B (en) 2023-09-26
CN110889351A (en) 2020-03-17

Similar Documents

Publication Publication Date Title
WO2021098657A1 (en) Video detection method and apparatus, terminal device, and readable storage medium
CN110390262B (en) Video analysis method, device, server and storage medium
CN101441712B (en) Flame video recognition method and fire hazard monitoring method and system
EP3561780A1 (en) Driving behavior determination method, device, equipment and storage medium
US9165212B1 (en) Person counting device, person counting system, and person counting method
WO2019020103A1 (en) Target recognition method and apparatus, storage medium and electronic device
US9483944B2 (en) Prediction of free parking spaces in a parking area
CN112085952B (en) Method and device for monitoring vehicle data, computer equipment and storage medium
US20210133468A1 (en) Action Recognition Method, Electronic Device, and Storage Medium
CN111274881A (en) Driving safety monitoring method and device, computer equipment and storage medium
WO2019223655A1 (en) Detection of non-motor vehicle carrying passenger
CN110738150B (en) Camera linkage snapshot method and device and computer storage medium
CN107004353B (en) Traffic violation management system and traffic violation management method
US20160210759A1 (en) System and method of detecting moving objects
CN112052815A (en) Behavior detection method and device and electronic equipment
CN113052098B (en) Scratch-resistant early warning method for vehicle, related device and computer storage medium
US10710537B2 (en) Method and system for detecting an incident , accident and/or scam of a vehicle
TWI774034B (en) Driving warning method, system and equipment based on internet of vehicle
CN202058304U (en) Vehicle supervision system of parking lot
KR20160069685A (en) recognizing system of vehicle number for parking crossing gate
CN114373189A (en) Behavior detection method and apparatus, terminal device and storage medium
WO2022183663A1 (en) Event detection method and apparatus, and electronic device, storage medium and program product
CN111241918B (en) Vehicle tracking prevention method and system based on face recognition
CN112308723A (en) Vehicle detection method and system
CN111191603B (en) Method and device for identifying people in vehicle, terminal equipment and medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20891268

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20891268

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 19.01.2023)