WO2020248386A1 - Video analysis method, apparatus, computer device and storage medium - Google Patents

Video analysis method, apparatus, computer device and storage medium

Info

Publication number
WO2020248386A1
WO2020248386A1 (PCT/CN2019/103373)
Authority
WO
WIPO (PCT)
Prior art keywords
target object
abnormal
image
video
video image
Prior art date
Application number
PCT/CN2019/103373
Other languages
English (en)
French (fr)
Inventor
盖超
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司 filed Critical 平安科技(深圳)有限公司
Publication of WO2020248386A1 publication Critical patent/WO2020248386A1/zh

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/78Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/783Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/7837Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using objects detected or recognised in the video content
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/44008Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream

Definitions

  • This application relates to the field of image recognition technology, and in particular to a video analysis method, device, computer equipment and storage medium.
  • the first aspect of the present application provides a video analysis method, the method includes:
  • the detecting the target object in the video image to obtain the target object category includes:
  • the tracking the target object in the video image to obtain the state of the target object includes:
  • when the target object does not appear in the detection range in the current video frame, it is determined that the state of the target object is abnormal.
  • the judging whether the business scenario is abnormal includes:
  • when the abnormal model outputs the abnormal scene corresponding to the image to be recognized, it is confirmed that the business scene is abnormal.
  • the key information includes the time and place when the business scene is abnormal, and the picture file intercepted from the video image when the business scene is abnormal.
  • the method further includes:
  • the third-party business platform includes a public security system and a traffic control system.
  • the method further includes:
  • a second aspect of the present application provides a video analysis device, the device includes:
  • the receiving module is used to receive the video image collected by the camera
  • the detection module is used to detect the target object in the video image to obtain the target object category
  • a tracking module used to track the target object in the video image to obtain the state of the target object
  • An analysis module configured to analyze and obtain the business scene contained in the video image according to the category of the target object and the state of the target object;
  • the judgment module is used to judge whether the business scenario is abnormal.
  • the processing module is configured to record key information when the business scene is abnormal when the business scene in the video image is abnormal.
  • a third aspect of the present application provides a computer device that includes a processor and a memory, and the processor is configured to implement the video analysis method when executing computer-readable instructions stored in the memory.
  • a fourth aspect of the present application provides a non-volatile readable storage medium having computer-readable instructions stored thereon; when the computer-readable instructions are executed by a processor, the video analysis method is implemented.
  • the video analysis method, device, computer equipment and storage medium described in this application can analyze a video image to obtain the business scenario contained in it, determine whether the business scenario is abnormal, and, when it is, record the key information at the time of the abnormality. The key information can then be sent to the corresponding third-party platform so that the exception is handled in time.
  • FIG. 1 is a flowchart of a video analysis method provided in Embodiment 1 of the present application.
  • FIG. 2 is a diagram of functional modules in a preferred embodiment of the video analysis device of this application provided in the second embodiment of this application.
  • Fig. 3 is a schematic diagram of a computer device provided in Embodiment 3 of the present application.
  • the video analysis method of the embodiment of the present application is applied in a hardware environment composed of at least one computer device and a mobile terminal connected to the computer device through a network.
  • Networks include but are not limited to: wide area network, metropolitan area network or local area network.
  • the video analysis method in the embodiments of the present application may be executed by a computer device or a mobile terminal, or jointly by the computer device and the mobile terminal.
  • the video analysis function provided by the method of this application can be directly integrated on the computer device, or a client for implementing the method of this application can be installed.
  • the method provided in this application can also run on a computer or other device in the form of a software development kit (SDK), which provides an interface through which the video analysis function can be realized.
  • FIG. 1 is a flowchart of a video analysis method provided in Embodiment 1 of the present application. According to different needs, the execution order in this flowchart can be changed, and some steps can be omitted.
  • Step S1 receiving the video image collected by the camera.
  • the video image is collected by a camera, and the camera is installed in different business scenarios.
  • the business scenario describes a scenario that requires target object detection and/or video analysis.
  • the business scenario may be an intelligent traffic business scenario that recognizes traffic accidents, congestion, vehicle speed, traffic flow prediction, loss of vehicle control, vehicle trajectories, intrusion by people or bicycles, violations of traffic laws, thrown objects, and so on.
  • the business scenario may also be a smart park business scenario that identifies personnel intrusion, left-behind items, lost property monitoring, license plate analysis, vehicle trajectories, traffic flow analysis, pedestrian flow analysis, fire or smoke, and so on.
  • the business scenario may also be a ferry monitoring business scenario, such as detecting illegal ships, overloading, dense crowds, whether passengers are wearing life jackets, or people falling into the water.
  • the business scenarios may also be scenarios such as unmanned driving, financial scenarios, equipment login, airport and public area monitoring.
  • the cameras may be of different models and specifications from different manufacturers, and the video analysis method can process and analyze the video images they capture in a unified way.
  • the video analysis method further includes:
  • the video image may be video decoded by a graphics processing unit (GPU) to obtain each frame of the video image.
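In practice the decoding itself would be delegated to a (GPU) video decoder via a video library; the patent does not specify one. As a minimal illustration of the framing step only, the sketch below splits an already-decoded raw RGB24 byte stream into per-frame buffers. The function name and the raw-stream assumption are illustrative, not part of the application.

```python
def split_raw_frames(stream: bytes, width: int, height: int, channels: int = 3):
    """Split an already-decoded raw RGB24 byte stream into per-frame buffers.

    A real pipeline would obtain `stream` from a hardware video decoder;
    here we only illustrate how each frame of the video image is recovered.
    """
    frame_size = width * height * channels
    n_frames = len(stream) // frame_size
    return [stream[i * frame_size:(i + 1) * frame_size] for i in range(n_frames)]
```

For example, 24 bytes of raw data for 2x2 RGB frames yield two 12-byte frames.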
  • Step S2 Detect the target object in the video image to obtain the target object category.
  • the target objects in the video image include people, animals, vehicles, buildings, smoke and so on.
  • detecting the target object in the video image to obtain the target object category includes:
  • the target object in the video image includes a static target object and a moving target object.
  • the stationary target object can be identified through a template-based detection method. Specifically, it includes: determining the contour of the target object shape in the video image, and matching the contour of the target object shape with a pre-stored template file.
  • for example, if the outline of the target object's shape is determined to be a rectangle, the rectangle is feature-matched against a pre-stored door template file to identify the target object.
  • the template file of the door is rectangular.
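The template-based matching described above can be sketched as a normalized cross-correlation between the template and every image patch; this is a generic illustration under stated assumptions (2-D grayscale numpy arrays, a fixed score threshold), not the patent's exact matching procedure.

```python
import numpy as np

def match_template(image, template, threshold=0.9):
    """Slide `template` over `image` (2-D grayscale arrays) and return the
    (row, col) top-left positions where the normalized cross-correlation
    score exceeds `threshold`."""
    ih, iw = image.shape
    th, tw = template.shape
    t = template - template.mean()
    t_norm = np.sqrt((t ** 2).sum())
    hits = []
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            patch = image[r:r + th, c:c + tw]
            p = patch - patch.mean()
            p_norm = np.sqrt((p ** 2).sum())
            if p_norm == 0 or t_norm == 0:
                continue  # flat patch or flat template: no correlation defined
            score = (p * t).sum() / (p_norm * t_norm)
            if score >= threshold:
                hits.append((r, c))
    return hits
```

A production system would use an optimized routine (e.g. FFT-based correlation) rather than the explicit double loop shown here.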
  • when the target object in the video image is a moving target object, it can be identified by at least one of the background difference method, the frame difference method, and the optical flow method.
  • the background difference method performs background modeling on a relatively fixed scene in the video image, and during detection the moving target object is obtained from the difference between the current image and the background model; the frame difference method compares pixels at corresponding positions between consecutive frames to obtain the position of the moving target object; the optical flow method uses time-varying optical flow vector characteristics to detect the moving target object in the video image.
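The first two methods can be sketched in a few lines. Below, the background model is taken to be the per-pixel median of the frame history (one common choice; the patent does not fix the modeling technique), and both methods flag pixels whose absolute difference exceeds a threshold.

```python
import numpy as np

def background_difference(frames, threshold=25):
    """Background difference: model the background as the per-pixel median
    of the earlier frames, then flag pixels of the latest frame that deviate
    from the background model by more than `threshold`."""
    background = np.median(np.stack(frames[:-1]).astype(np.int16), axis=0)
    diff = np.abs(frames[-1].astype(np.int16) - background)
    return diff > threshold

def frame_difference(prev_frame, cur_frame, threshold=25):
    """Frame difference: compare pixels at corresponding positions of two
    consecutive frames and flag large changes as motion."""
    diff = np.abs(cur_frame.astype(np.int16) - prev_frame.astype(np.int16))
    return diff > threshold
```

The returned boolean masks mark candidate moving-object pixels; a real detector would additionally group them into connected regions.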
  • the methods for detecting static target objects and moving target objects in a video image are not limited to the above enumeration; any method suitable for detecting a target object in a video image can be applied here.
  • the methods for detecting stationary target objects and moving target objects in a video image in this embodiment are all existing technologies, and will not be described in detail herein.
  • for example, when the target object in the video image is recognized as a car, it may be determined that the category of the target object is vehicle.
  • the detection and classification of the target object is a very basic task in vision technology, and its purpose is to track some objects of interest in the scene, including conventional target object detection, person detection, vehicle detection, and so on.
  • the basic attributes of the target object in the video image can be obtained by decomposing the target object in the video image, where the basic attributes include color, motion track, shape, structure, etc., and then The obtained basic attributes are compared with the basic attributes of the target object pre-stored in the database, so as to accurately identify the target object in the video image.
  • the database stores a table corresponding to the basic attributes of the target object and the target object category.
  • determining the category of the target object specifically includes: obtaining the basic attributes of the target object by decomposing the target object in the video image; comparing the obtained basic attributes with the basic attributes of target objects stored in advance in a database; and, when the acquired basic attributes are consistent with the basic attributes of a target object in the database, querying the correspondence table of basic attributes and target object categories stored in the database to obtain the category of the target object.
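The attribute-to-category lookup can be sketched as a table scan. The table contents below (shapes, motion labels, categories) are hypothetical stand-ins for the correspondence table the application stores in a database.

```python
# Hypothetical correspondence table of basic attributes -> category,
# standing in for the table stored in the database.
ATTRIBUTE_TABLE = [
    ({"shape": "rectangle", "motion": "static"}, "door"),
    ({"shape": "box", "motion": "moving", "structure": "wheels"}, "vehicle"),
    ({"shape": "upright", "motion": "moving"}, "pedestrian"),
]

def classify_target(attributes):
    """Compare the basic attributes decomposed from a target object against
    each stored entry; on a full match, return the corresponding category."""
    for stored, category in ATTRIBUTE_TABLE:
        if all(attributes.get(key) == value for key, value in stored.items()):
            return category
    return None
```

An unmatched attribute set yields None, i.e. the target object cannot be categorized from the table.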
  • Step S3 tracking the target object in the video image to obtain the state of the target object.
  • the state of the target object can be determined by tracking the target object in the video image.
  • the method for tracking the target object in the video image includes:
  • e) Determine whether the target object appears in the detection range of the current video frame: if the target object does not appear in the detection range of the current video frame, determine that the state of the target object is abnormal; if the target object appears in the detection range of the current video frame, determine the image area of the target object in the current video frame, that is, the state of the target object is normal.
  • the preceding video frames refer to the k video frames before the current video frame
  • estimating the current video frame from the preceding k video frames and comparing it against the detection result requires little computation, can handle the occasional loss or occlusion of the target object in the video, and yields higher detection accuracy.
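One simple way to estimate the target's position from the preceding k frames is linear extrapolation of its centroid track, then checking the estimate against the detection range. This is a hedged sketch of the idea, not the patent's specific estimator.

```python
def predict_position(history, k=3):
    """Estimate the target's next (x, y) centroid by linear extrapolation
    over its last k observed centroids."""
    pts = history[-k:]
    if len(pts) < 2:
        return pts[-1]
    (x0, y0), (x1, y1) = pts[0], pts[-1]
    steps = len(pts) - 1
    vx, vy = (x1 - x0) / steps, (y1 - y0) / steps
    return (x1 + vx, y1 + vy)

def target_state(history, detection_range, k=3):
    """Return 'normal' if the predicted position falls inside the detection
    range (x_min, y_min, x_max, y_max), otherwise 'abnormal'."""
    x, y = predict_position(history, k)
    x_min, y_min, x_max, y_max = detection_range
    return "normal" if x_min <= x <= x_max and y_min <= y <= y_max else "abnormal"
```

For instance, a target moving one unit right per frame at (1,0), (2,0), (3,0) is predicted at (4,0); if the detection range ends at x = 3, the state is judged abnormal.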
  • Step S4 Analyze the business scene contained in the video image according to the category of the target object and the state of the target object.
  • the category of the target object can be obtained according to the detection result, and the state of the target object can be determined according to the tracking result, so that the business scene contained in the video image can be analyzed.
  • for example, suppose the category of the target object is car. If the car does not appear in the detection range of the current video frame, it can be determined that the state of the car is abnormal; if the car is in a congested state, it can be learned that the business scene contained in the video image is an intelligent transportation business scene.
  • if the category of the target object is pedestrian and the pedestrian does not appear in the detection range of the current video frame, it can be determined that the state of the pedestrian is abnormal; if the pedestrian has fallen down, it can be learned that the business scene contained in the video image is an intelligent traffic business scene.
  • if the category of the target object is door and the door does not appear in the detection range of the current video frame, it can be confirmed that the state of the door is abnormal; if the door is kept open, it can be judged that the business scene contained in the video image is a smart security business scene.
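The three examples above amount to a mapping from (target object category, observed condition) to a business scene. The rule table below mirrors those examples only; it is an illustrative stand-in, not an exhaustive rule set from the application.

```python
# Hypothetical rules mapping (category, observed condition) -> business scene,
# mirroring the car / pedestrian / door examples in the text.
SCENE_RULES = {
    ("car", "congested"): "intelligent transportation",
    ("pedestrian", "fallen"): "intelligent transportation",
    ("door", "kept open"): "smart security",
}

def infer_business_scene(category, condition):
    """Analyze the business scene from the target object's category and state."""
    return SCENE_RULES.get((category, condition), "unknown")
```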
  • Step S5 Determine whether the business scene in the video image is abnormal.
  • when the business scene in the video image is abnormal, step S6 is entered; when the business scene in the video image is not abnormal, the process ends.
  • the video image may be input to a pre-trained abnormal model, and whether the business scene in the video image is abnormal can be determined according to the abnormal model. Specifically: when it is determined that the target object is abnormal, extract the current video frame as an abnormal image; import the abnormal image as an image to be recognized into a pre-trained abnormal model, where the abnormal model is used to characterize the correspondence between the image to be recognized and the abnormal scene; when the abnormal model outputs the abnormal scene corresponding to the image to be recognized, confirm that the business scene is abnormal. The abnormal models include models corresponding to different business scenarios.
  • the abnormal models corresponding to the intelligent transportation business scene include a traffic accident model, a traffic congestion model, a traffic violation model, etc.
  • when the business scene is a smart park business scene, the corresponding abnormal models include a personal belongings model, a personnel intrusion model, etc.
  • the abnormal models corresponding to the ferry monitoring business scene include an overload model, a falling-into-water model, an illegal ship model, etc.
  • the current video frame is extracted as an abnormal image, and the abnormal image is imported into a pre-trained traffic congestion model as an image to be recognized.
  • when the traffic congestion model outputs the traffic congestion scene corresponding to the image to be recognized, it is confirmed that the intelligent transportation business scene corresponding to the video image is abnormal; when the traffic congestion model does not output a traffic congestion scene corresponding to the image to be recognized, it is confirmed that the intelligent transportation business scene corresponding to the video image is normal.
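The judgment flow above can be sketched with the per-scenario models treated as interchangeable callables. The toy congestion model below (a vehicle-count threshold) is a placeholder so the flow is runnable; a real model would be the trained classifier described later in the text.

```python
def judge_business_scene(abnormal_image, abnormal_models):
    """Run the extracted abnormal image through the per-scenario abnormal
    models. Each model is any callable that returns the abnormal scene label
    it recognizes, or None; the first hit confirms the abnormality."""
    for name, model in abnormal_models.items():
        scene = model(abnormal_image)
        if scene is not None:
            return {"abnormal": True, "model": name, "scene": scene}
    return {"abnormal": False, "model": None, "scene": None}

# Toy stand-in for a trained traffic congestion model: flags congestion when
# the "image" (here simplified to a list of detected vehicles) is too full.
def toy_congestion_model(detected_vehicles, limit=10):
    return "traffic congestion" if len(detected_vehicles) > limit else None
```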
  • the above-mentioned abnormal model is a machine learning model trained based on a picture sample set.
  • the picture samples include abnormal business scene picture samples and normal business scene picture samples.
  • the machine learning model is an artificial intelligence model capable of image recognition, including the convolutional neural network (CNN) model, the recurrent neural network (RNN) model, and the deep neural network (DNN) model.
  • the convolutional neural network (CNN) is a multi-layer neural network that can continuously reduce the dimensionality of an image recognition problem involving a huge amount of data until the problem becomes trainable. Therefore, the machine learning model in the embodiments of the present application may be a CNN model.
  • the ResNet network proposes a residual learning framework that reduces the burden of training networks substantially deeper than those used previously, and solves the degradation problem in which the accuracy of other neural networks decreases as the network deepens.
  • the machine learning model may be the ResNet model, a type of CNN. It should be noted that this is only an example; other machine learning models capable of image recognition are also applicable to this application and will not be repeated here.
  • Step S6 When the business scene in the video image is abnormal, record the key information when the business scene is abnormal.
  • the key information includes the time and place when the business scene is abnormal, and the picture file intercepted from the video image when the business scene is abnormal.
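The key-information record can be assembled as a simple structure. The field names below are illustrative; the application only specifies that the time, the place, and the intercepted picture file are recorded.

```python
import time

def record_key_information(business_scene, place, picture_file, now=None):
    """Assemble the key information logged when a business scene is abnormal:
    the time and place of the abnormality, plus the picture file intercepted
    from the video image."""
    timestamp = time.strftime(
        "%Y-%m-%d %H:%M:%S",
        time.localtime(now if now is not None else time.time()),
    )
    return {
        "business_scene": business_scene,
        "time": timestamp,
        "place": place,
        "picture_file": picture_file,
    }
```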
  • the video analysis method further includes sending the recorded key information to a third-party service platform.
  • the third-party service platform includes a public security system, a traffic control system, etc.
  • this helps the third-party business platform obtain the key information in time when an exception occurs in a business scenario, so that the exception can be processed in time.
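Forwarding the key information to a third-party platform could be sketched as an HTTP POST carrying the record as JSON. The endpoint URL is a placeholder, not a real API, and the request is built but not sent.

```python
import json
import urllib.request

def build_key_information_report(key_info,
                                 endpoint="https://platform.example/api/report"):
    """Build (but do not send) the HTTP POST that would forward the key
    information to a third-party business platform. The endpoint is a
    hypothetical placeholder for the platform's real interface."""
    body = json.dumps(key_info).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Dispatching would then be `urllib.request.urlopen(req)` against the platform's actual endpoint, with whatever authentication it requires.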
  • the video analysis method further includes displaying the key information when the business scenario is abnormal. Specifically, information such as the abnormal picture, the time, and the location of the business scene is displayed on a display screen.
  • the video analysis method includes: receiving a video image collected by a camera; detecting the target object in the video image to obtain the category of the target object; tracking the target object in the video image to obtain the state of the target object; analyzing the category and state of the target object to obtain the business scene contained in the video image; judging whether the business scene is abnormal; and, when the business scene in the video image is abnormal, recording the key information at the time of the abnormality. It is thus possible to analyze in real time whether the business scene corresponding to the video image is abnormal and, when an abnormality is confirmed, record the key information so that it can be sent to the corresponding third-party platform and the exception can be handled in time.
  • FIG. 2 is a diagram of functional modules in a preferred embodiment of the video analysis device of this application.
  • the video analysis device 20 runs in a computer device.
  • the video analysis device 20 may include multiple functional modules composed of computer-readable instruction code segments.
  • the instruction codes of each computer-readable instruction code segment in the video analysis device 20 can be stored in a memory and executed by at least one processor to perform the video analysis function (see FIG. 1 and the related description for details).
  • the video analysis device 20 can be divided into multiple functional modules according to the functions it performs.
  • the functional modules may include: a receiving module 201, a detection module 202, a tracking module 203, an analysis module 204, a judgment module 205, and a processing module 206.
  • the module referred to in this application is a series of computer-readable instruction code segments that can be executed by at least one processor to complete a fixed function, and that are stored in a memory. The functions of each module will be detailed in the following embodiments.
  • the receiving module 201 is used to receive video images collected by a camera.
  • the video image is collected by a camera, and the camera is installed in different business scenarios.
  • the business scenario describes a scenario that requires target object detection and/or video analysis.
  • the business scenario may be an intelligent traffic business scenario that recognizes traffic accidents, congestion, vehicle speed, traffic flow prediction, loss of vehicle control, vehicle trajectories, intrusion by people or bicycles, violations of traffic laws, thrown objects, and so on.
  • the business scenario may also be a smart park business scenario that identifies personnel intrusion, left-behind items, lost property monitoring, license plate analysis, vehicle trajectories, traffic flow analysis, pedestrian flow analysis, fire or smoke, and so on.
  • the business scenario may also be a ferry monitoring business scenario, such as detecting illegal ships, overloading, dense crowds, whether passengers are wearing life jackets, or people falling into the water.
  • the camera and the computer device are connected through a wired or wireless network communication.
  • the camera sends the collected video images to the computer device through a wired or wireless network.
  • the business scenarios may also be scenarios such as unmanned driving, financial scenarios, equipment login, airport and public area monitoring.
  • the cameras may be of different models and specifications from different manufacturers
  • the video analysis device 20 can process and analyze the video images taken by such cameras in a unified way.
  • the video analysis device 20 may also decode the video image.
  • the video image may be video decoded by a graphics processing unit (GPU) to obtain each frame of the video image.
  • the detection module 202 is used to detect the target object in the video image to obtain the target object category.
  • the target objects in the video image include people, animals, vehicles, buildings, smoke and so on.
  • detecting the target object in the video image to obtain the target object category includes:
  • the target object in the video image includes a static target object and a moving target object.
  • the stationary target object can be identified through a template-based detection method. Specifically, it includes: determining the contour of the target object shape in the video image, and performing feature matching between the contour of the target object shape and a pre-stored template file.
  • when the target object in the video image is a moving target object, it can be identified by at least one of the background difference method, the frame difference method, and the optical flow method.
  • the background difference method is to perform background modeling on a relatively fixed scene in the video image, and the moving target object is obtained from the difference between the current image and the background model during detection;
  • the frame difference method compares pixels at corresponding positions between consecutive frames to obtain the position of the moving target object;
  • the optical flow method uses the time-varying optical flow vector characteristics to detect the moving target object in the video image.
  • the methods for detecting the stationary target object and the moving target object in the video image are not limited to those listed above; any method suitable for detecting a target object in a video image can be applied here.
  • the methods for detecting stationary target objects and moving target objects in a video image in this embodiment are all existing technologies, and will not be described in detail herein.
  • for example, when the target object in the video image is recognized as a car, it may be determined that the category of the target object is vehicle.
  • the detection and classification of the target object is a very basic task in vision technology, and its purpose is to track some objects of interest in the scene, including conventional target object detection, person detection, vehicle detection, and so on.
  • the basic attributes of the target object in the video image can be obtained by decomposing the target object in the video image, where the basic attributes include color, motion track, shape, structure, etc., and then The obtained basic attributes are compared with the basic attributes of the target object pre-stored in the database, so as to accurately identify the target object in the video image.
  • the database stores a table corresponding to the basic attributes of the target object and the target object category.
  • determining the category of the target object specifically includes: obtaining the basic attributes of the target object by decomposing the target object in the video image; comparing the obtained basic attributes with the basic attributes of target objects stored in advance in a database; and, when the acquired basic attributes are consistent with the basic attributes of a target object in the database, querying the correspondence table of basic attributes and target object categories stored in the database to obtain the category of the target object.
  • the tracking module 203 is configured to track the target object in the video image to obtain the state of the target object.
  • the state of the target object can be determined by tracking the target object in the video image.
  • the method for tracking the target object in the video image includes:
  • e) Determine whether the target object appears in the detection range of the current video frame: if the target object does not appear in the detection range of the current video frame, determine that the state of the target object is abnormal; if the target object appears in the detection range of the current video frame, determine the image area of the target object in the current video frame.
  • the preceding video frames refer to the k video frames before the current video frame
  • estimating the current video frame from the preceding k video frames and comparing it against the detection result requires little computation, can handle the occasional loss or occlusion of the target object in the video, and yields higher detection accuracy.
  • the analysis module 204 is configured to analyze and obtain the business scenario contained in the video image according to the category of the target object and the state of the target object.
  • the category of the target object can be obtained according to the detection result, and the state of the target object can be determined according to the tracking result, so that the business scene contained in the video image can be analyzed.
  • for example, suppose the target object is a car. If the car does not appear in the detection range of the current video frame, it can be determined that the state of the car is abnormal; if the car is in a congested state, it can be learned that the business scene contained in the video image is an intelligent transportation business scene.
  • if the category of the target object is pedestrian and the pedestrian does not appear in the detection range of the current video frame, it can be determined that the state of the pedestrian is abnormal; if the pedestrian has fallen down, it can be learned that the business scene contained in the video image is an intelligent traffic business scene.
  • the judgment module 205 is used to judge whether the business scene in the video image is abnormal. When the business scene in the video image is abnormal, the key information when the business scene is abnormal is recorded.
  • the video image may be input to a pre-trained abnormal model, and whether the business scene in the video image is abnormal can be determined according to the abnormal model. Specifically: when it is determined that the target object is abnormal, extract the current video frame as an abnormal image; import the abnormal image as an image to be recognized into a pre-trained abnormal model, where the abnormal model is used to characterize the correspondence between the image to be recognized and the abnormal scene; when the abnormal model outputs the abnormal scene corresponding to the image to be recognized, confirm that the business scene is abnormal. The abnormal models include models corresponding to different business scenarios.
  • the abnormal models corresponding to the intelligent transportation business scenario include a traffic accident model, a traffic congestion model, a traffic violation model, etc.
  • when the business scenario is a smart park business scenario, the corresponding abnormal models include a personal belongings model, a personnel intrusion model, etc.
  • the abnormal models corresponding to the ferry monitoring business scene include an overload model, a falling-into-water model, an illegal ship model, etc.
  • for example, when traffic congestion appears in the current video frame of the video image, the current video frame is extracted as an abnormal image, and the abnormal image is imported, as an image to be recognized, into a pre-trained traffic congestion model.
  • when the traffic congestion model outputs a traffic congestion scene corresponding to the image to be recognized, it is confirmed that the intelligent transportation business scene corresponding to the video image is abnormal; when the traffic congestion model does not output a traffic congestion scene corresponding to the image to be recognized, it is confirmed that the intelligent transportation business scene corresponding to the video image is normal.
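The decision step described above can be sketched in a few lines. This is a minimal illustration, not the patent's implementation: the classifier itself is assumed to be a pre-trained model and is represented here only by its per-scene output scores; the names `judge_scene` and the scene labels are illustrative assumptions.

```python
def judge_scene(scores, threshold=0.5):
    """Return the abnormal scene label when the model 'outputs an abnormal
    scene' (an abnormal class clears the threshold); otherwise return None,
    meaning the business scene is confirmed normal."""
    label, best = max(scores.items(), key=lambda kv: kv[1])
    if label != "normal" and best >= threshold:
        return label
    return None

# A congested frame: the congestion score dominates, so the scene is abnormal.
assert judge_scene({"normal": 0.1, "traffic_congestion": 0.8}) == "traffic_congestion"
# A normal frame: no abnormal scene is output.
assert judge_scene({"normal": 0.9, "traffic_congestion": 0.05}) is None
```

In practice the scores would come from the pre-trained CNN mentioned below; only the thresholding logic is shown here.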
  • the above-mentioned abnormal model is a machine learning model trained based on a picture sample set.
  • the picture samples include abnormal business scene picture samples and normal business scene picture samples.
  • the machine learning model is an artificial-intelligence algorithm model capable of image recognition, including the convolutional neural network model (CNN), the recurrent neural network model (RNN), and the deep neural network model (DNN).
  • the convolutional neural network model CNN is a multi-layer neural network that can progressively reduce the dimensionality of an image recognition problem involving a huge amount of data until the problem becomes trainable; therefore, the machine learning model in the embodiments of this application may be a CNN model.
  • the ResNet network proposes a residual learning framework that reduces the burden of network training. Such a network is substantially deeper than previously used networks and solves the problem, seen in other neural networks, of accuracy dropping as the network deepens.
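The residual-learning idea behind ResNet can be shown with a toy forward pass: instead of learning a mapping H(x) directly, a block learns the residual F(x) and outputs F(x) + x, so identity information always has a shortcut path. This is a sketch of the principle only, not a real ResNet layer; the weight shapes are illustrative.

```python
import numpy as np

def residual_block(x, w1, w2):
    """y = F(x) + x, with F a small two-layer ReLU transform."""
    f = np.maximum(0.0, x @ w1) @ w2   # F(x), the learned residual
    return f + x                       # shortcut (identity) connection

rng = np.random.default_rng(0)
x = rng.normal(size=(1, 4))
# With zero weights the block reduces to the identity mapping, which is
# exactly what makes very deep stacks of such blocks easy to optimize.
zeros = np.zeros((4, 4))
assert np.allclose(residual_block(x, zeros, zeros), x)
```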
  • the machine learning model may be the ResNet model in the family of convolutional neural network models (CNN). It should be noted that this is only an example; other machine learning models capable of image recognition are likewise applicable to this application and are not repeated here.
  • the processing module 206 is configured to record key information when the business scene is abnormal when the business scene in the video image is abnormal.
  • the key information includes the time and place when the business scene is abnormal, and the picture file, captured from the video image, showing the business scene at the moment of the abnormality.
  • the video analysis device 20 can also send the recorded key information to a third-party service platform.
  • the third-party service platform includes a public security system, a traffic control system, etc.
  • sending the recorded key information to the third-party business platform helps the third-party business platform obtain, in time, the key information about an abnormality occurring in a business scene, so that the abnormality can be handled promptly.
  • the video analysis device 20 can also display the key information about an abnormal business scene. Specifically, the display screen shows the time and place when the business scene is abnormal, and the picture file captured from the video image at the moment of the abnormality.
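The "key information" record kept by the processing module might look like the following. This is a hedged sketch: the field names, the place string, and the snapshot path are all illustrative assumptions, not taken from the patent.

```python
import json
import time

def make_key_info(place, snapshot_path, scene, ts=None):
    """Assemble the key information recorded when a business scene is
    abnormal: time, place, and the picture file cut from the video image."""
    return {
        "time": ts if ts is not None else time.strftime("%Y-%m-%d %H:%M:%S"),
        "place": place,
        "scene": scene,
        "snapshot": snapshot_path,   # picture file captured from the video
    }

record = make_key_info("Gate 3 crossing", "frames/000123.jpg",
                       "traffic_congestion", ts="2019-06-14 10:00:00")
# The record can then be serialized and sent to a third-party platform.
payload = json.dumps(record)
assert json.loads(payload)["place"] == "Gate 3 crossing"
assert json.loads(payload)["snapshot"].endswith(".jpg")
```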
  • the video analysis device 20 includes a receiving module 201, a detection module 202, a tracking module 203, an analysis module 204, a judgment module 205, and a processing module 206.
  • the receiving module 201 is used to receive a video image collected by a camera;
  • the detection module 202 is used to detect a target object in the video image to obtain the target object category;
  • the tracking module 203 is used to track the target object in the video image to obtain the state of the target object;
  • the analysis module 204 is configured to analyze and obtain the business scene contained in the video image according to the category of the target object and the status of the target object;
  • the judgment module 205 is used to judge whether the business scene is abnormal; and the processing module 206 is used to record, when the business scene in the video image is abnormal, the key information about the abnormality.
  • the aforementioned integrated unit implemented in the form of a software function module may be stored in a non-volatile readable storage medium.
  • the above-mentioned software function modules are stored in a storage medium and include several instructions to cause a computer device (which may be a personal computer, a dual-screen device, a network device, etc.) or a processor to execute parts of the methods described in the embodiments of this application.
  • FIG. 3 is a schematic diagram of a computer device provided in Embodiment 3 of this application.
  • the computer device 3 includes: a database 31, a memory 32, at least one processor 33, computer readable instructions 34 stored in the memory 32 and executable on the at least one processor 33, and at least one communication bus 35 .
  • the computer-readable instructions 34 may be divided into one or more modules/units, and the one or more modules/units are stored in the memory 32 and executed by the at least one processor 33 Execute to complete this application.
  • the one or more modules/units may be a series of computer-readable instruction segments capable of completing specific functions, and the instruction segments are used to describe the execution process of the computer-readable instructions 34 in the computer device 3.
  • the computer device 3 is a device that can automatically perform numerical calculation and/or information processing according to preset or stored instructions. Its hardware includes, but is not limited to, a microprocessor, an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA), a digital signal processor (Digital Signal Processor, DSP), embedded devices, etc.
  • the schematic diagram in FIG. 3 is only an example of the computer device 3 and does not constitute a limitation on the computer device 3; the device may include more or fewer components than shown, combine certain components, or use different components.
  • the computer device 3 may also include input and output devices, network access devices, buses, etc.
  • the database (Database) 31 is a warehouse built on the computer device 3 to organize, store and manage data according to a data structure. Databases are usually divided into three types: hierarchical database, network database and relational database. In this embodiment, the database 31 is used to store the video images and the like.
  • the at least one processor 33 may be a central processing unit (Central Processing Unit, CPU), another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc.
  • the processor 33 can be a microprocessor or the processor 33 can also be any conventional processor, etc.
  • the processor 33 is the control center of the computer device 3 and uses various interfaces and lines to connect the parts of the entire computer device 3.
  • the memory 32 can be used to store the computer-readable instructions 34 and/or modules/units; the processor 33 runs or executes the computer-readable instructions and/or modules/units stored in the memory 32 and calls the data stored in the memory 32 to realize the various functions of the computer device 3.
  • the memory 32 may mainly include a program storage area and a data storage area.
  • the program storage area may store an operating system, an application program required by at least one function (such as a sound playback function, an image playback function, etc.), etc.;
  • the data storage area may store data created according to the use of the computer device 3 (such as audio data), etc.
  • the memory 32 may also include non-volatile memory, such as a hard disk, an internal memory, a plug-in hard disk, a smart media card (Smart Media Card, SMC), a secure digital (Secure Digital, SD) card, a flash card (Flash Card), at least one magnetic disk storage device, a flash memory device, or another non-volatile solid-state storage device.
  • Computer readable instruction codes are stored in the memory 32, and the at least one processor 33 can call the computer readable instruction codes stored in the memory 32 to perform related functions.
  • the various modules described in FIG. 2 (the receiving module 201, detection module 202, tracking module 203, analysis module 204, judgment module 205, and processing module 206) are computer-readable instruction codes stored in the memory 32 and executed by the at least one processor 33, thereby realizing the functions of the modules to achieve the purpose of video analysis.
  • the receiving module 201 is used to receive video images collected by a camera
  • the detection module 202 is configured to detect a target object in the video image to obtain the target object category;
  • the tracking module 203 is configured to track the target object in the video image to obtain the state of the target object;
  • the analysis module 204 is configured to analyze and obtain the business scene contained in the video image according to the category of the target object and the state of the target object;
  • the judgment module 205 is used to judge whether the business scenario is abnormal.
  • the processing module 206 is configured to record key information when the business scene is abnormal when the business scene in the video image is abnormal.
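The six modules listed above can be wired into one pipeline: receive, detect, track, analyze, judge, process. The sketch below is a structural illustration only; each stage is a plain callable standing in for a module, and all stage names and the dummy return values are assumptions.

```python
def run_pipeline(frame, detect, track, analyze, judge, process):
    """Run one frame through the module chain described in the text."""
    category = detect(frame)            # detection module: object category
    state = track(frame)                # tracking module: object state
    scene = analyze(category, state)    # analysis module: business scene
    if judge(scene):                    # judgment module: abnormal?
        return process(scene)           # processing module: record key info
    return None                        # normal scene: nothing recorded

result = run_pipeline(
    frame="<frame>",
    detect=lambda f: "car",
    track=lambda f: "missing_from_detection_range",
    analyze=lambda c, s: ("intelligent_traffic", s),
    judge=lambda scene: scene[1] == "missing_from_detection_range",
    process=lambda scene: {"scene": scene[0], "abnormal": True},
)
assert result == {"scene": "intelligent_traffic", "abnormal": True}
```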
  • if the integrated modules/units of the computer device 3 are implemented in the form of software functional units and sold or used as independent products, they may be stored in a non-volatile readable storage medium.
  • all or part of the processes in the methods of the above embodiments of this application may also be implemented by a computer program instructing the relevant hardware.
  • the computer program can be stored in a non-volatile readable storage medium.
  • the computer program includes computer readable instruction code
  • the computer readable instruction code may be in the form of source code, object code, executable file, or some intermediate form.
  • the non-volatile readable medium may include: any entity or device capable of carrying the computer-readable instruction code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM, Read-Only Memory), etc.
  • the computer device 3 may also include a power source (such as a battery) for supplying power to various components.
  • the power source may be logically connected to the at least one processor 33 through a power management system, so that functions such as charging, discharging, and power-consumption management are implemented through the power management system.
  • the power supply may also include one or more DC or AC power supplies, recharging systems, power failure detection circuits, power converters or inverters, power supply status indicators and other arbitrary components.
  • the computer device 3 may also include a Bluetooth module, a Wi-Fi module, etc., which will not be repeated here.
  • the functional units in the various embodiments of the present application may be integrated in the same processing unit, or each unit may exist alone physically, or two or more units may be integrated in the same unit.
  • the above-mentioned integrated unit can be implemented in the form of hardware or in the form of hardware plus software functional modules.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Library & Information Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Image Analysis (AREA)

Abstract

A video analysis method, comprising: receiving a video image collected by a camera; detecting a target object in the video image to obtain the category of the target object; tracking the target object in the video image to obtain the state of the target object; analyzing, according to the category of the target object and the state of the target object, the business scene contained in the video image; judging whether the business scene is abnormal; and, when the business scene in the video image is abnormal, recording the key information about when the business scene is abnormal. The present application also provides a video analysis apparatus, a computer device, and a storage medium. Through the present application, the key information about an abnormal event occurring in a video image can be obtained, so that the abnormal event can be handled in time.

Description

Video analysis method and apparatus, computer device, and storage medium
This application claims priority to the Chinese patent application filed with the Chinese Patent Office on June 14, 2019, with application number 201910517477.X and the invention title "Video analysis method and apparatus, server, and storage medium", the entire content of which is incorporated herein by reference.
Technical field
This application relates to the technical field of image recognition, and in particular to a video analysis method and apparatus, a computer device, and a storage medium.
Background
With the continuous development of video surveillance technology, video surveillance is now widely applied in smart cities, digital cities, smart parks, intelligent transportation, ferry monitoring, and other kinds of projects. The Internet of Things is the foundation of the smart city, and video surveillance will be its core. However, when analyzing surveillance video, a user has to replay the video and inspect the video images frame by frame to find abnormal events, which costs a great deal of time and labor.
Summary
In view of the above, it is necessary to provide a video analysis method and apparatus, a computer device, and a storage medium that can obtain in time the key information about an abnormal event occurring in a video image.
A first aspect of this application provides a video analysis method, the method comprising:
receiving a video image collected by a camera;
detecting a target object in the video image to obtain the category of the target object;
tracking the target object in the video image to obtain the state of the target object;
analyzing, according to the category of the target object and the state of the target object, the business scene contained in the video image;
judging whether the business scene is abnormal; and
when the business scene in the video image is abnormal, recording the key information about when the business scene is abnormal.
Preferably, detecting the target object in the video image to obtain the category of the target object comprises:
obtaining the basic attributes of the target object in the video image by decomposing the target object in the video image;
comparing the obtained basic attributes with the basic attributes of target objects pre-stored in a database; and
when the obtained basic attributes are consistent with the basic attributes of a target object in the database, querying the correspondence table between basic attributes and target-object categories stored in the database to obtain the category of the target object.
Preferably, tracking the target object in the video image to obtain the state of the target object comprises:
determining the target object in the current video frame;
obtaining the image region of the target object in the preceding video frames and the image features of the image region, where the preceding video frames are the k video frames before the current video frame and k is a positive integer;
performing motion estimation on the target object according to its image region in the preceding video frames, to determine the predicted region of the target object in the current video frame;
determining, according to the predicted region, the detection range of the target object in the current video frame;
judging whether the target object appears within the detection range in the current video frame;
if the target object appears within the detection range in the current video frame, determining the image region of the target object in the current video frame; and
if the target object does not appear within the detection range in the current video frame, determining that the target object is abnormal.
Preferably, judging whether the business scene is abnormal comprises:
when the target object is determined to be abnormal, extracting the current video frame as an abnormal image;
importing the abnormal image, as an image to be recognized, into a pre-trained abnormal model, where the abnormal model characterizes the correspondence between images to be recognized and abnormal scenes; and
when the abnormal model outputs an abnormal scene corresponding to the image to be recognized, confirming that the business scene is abnormal.
Preferably, the key information includes the time and place when the business scene is abnormal, and the picture file, captured from the video image, showing the business scene at the moment of the abnormality.
Preferably, the method further comprises:
sending the recorded key information to a third-party business platform, where the third-party business platform includes a public security system and a traffic control system.
Preferably, after receiving the video image collected by the camera, the method further comprises:
decoding the video image.
A second aspect of this application provides a video analysis apparatus, the apparatus comprising:
a receiving module, for receiving a video image collected by a camera;
a detection module, for detecting a target object in the video image to obtain the category of the target object;
a tracking module, for tracking the target object in the video image to obtain the state of the target object;
an analysis module, for analyzing, according to the category of the target object and the state of the target object, the business scene contained in the video image;
a judgment module, for judging whether the business scene is abnormal; and
a processing module, for recording, when the business scene in the video image is abnormal, the key information about when the business scene is abnormal.
A third aspect of this application provides a computer device, the computer device comprising a processor and a memory, where the processor implements the video analysis method when executing computer-readable instructions stored in the memory.
A fourth aspect of this application provides a non-volatile readable storage medium storing computer-readable instructions, where the computer-readable instructions, when executed by a processor, implement the video analysis method.
The video analysis method and apparatus, computer device, and storage medium of this application can analyze a video image to obtain the business scene contained in the video image and judge whether the business scene is abnormal; when the business scene is abnormal, the key information about the abnormality is recorded. The key information can then be sent to the corresponding third-party platform so that the abnormality can be handled in time.
Brief description of the drawings
FIG. 1 is a flowchart of the video analysis method provided in Embodiment 1 of this application.
FIG. 2 is a functional module diagram of a preferred embodiment of the video analysis apparatus provided in Embodiment 2 of this application.
FIG. 3 is a schematic diagram of the computer device provided in Embodiment 3 of this application.
The following detailed description will further explain this application in conjunction with the above drawings.
Detailed description
To make the above objectives, features, and advantages of this application clearer and easier to understand, this application is described in detail below in conjunction with the drawings and specific embodiments. It should be noted that, where no conflict arises, the embodiments of this application and the features in the embodiments may be combined with one another.
Many specific details are set forth in the following description to facilitate a full understanding of this application. The described embodiments are only some, not all, of the embodiments of this application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of this application without creative effort fall within the protection scope of this application.
Unless otherwise defined, all technical and scientific terms used herein have the same meanings as commonly understood by those skilled in the technical field of this application. The terms used in the specification of this application are only for the purpose of describing specific embodiments and are not intended to limit this application.
The terms "first", "second", and "third" in the specification, claims, and above drawings of this application are used to distinguish different objects, not to describe a particular order. In addition, the term "comprise" and any variation thereof are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device comprising a series of steps or units is not limited to the listed steps or units, but optionally further includes unlisted steps or units, or optionally further includes other steps or units inherent to such processes, methods, products, or devices.
The video analysis method of the embodiments of this application is applied in a hardware environment composed of at least one computer device and a mobile terminal connected to the computer device through a network. The network includes, but is not limited to, a wide area network, a metropolitan area network, or a local area network. The video analysis method of the embodiments of this application may be executed by the computer device, by the mobile terminal, or jointly by the computer device and the mobile terminal.
For a computer device that needs to perform the video analysis method, the video analysis function provided by the method of this application may be integrated directly on the computer device, or a client for implementing the method of this application may be installed. As another example, the method provided by this application may also run on a device such as a computer in the form of a software development kit (Software Development Kit, SDK), providing an interface to the video analysis function in the form of an SDK; the computer or other device can then implement the video analysis function through the provided interface.
Embodiment 1
FIG. 1 is a flowchart of the video analysis method provided in Embodiment 1 of this application. The execution order in the flowchart may be changed and some steps may be omitted according to different requirements.
Step S1: receive a video image collected by a camera.
In this embodiment, video images are collected by cameras installed in different business scenes. A business scene describes a scene in which target-object detection and/or video analysis is required. For example, the business scene may be an intelligent transportation business scene for recognizing traffic accidents, congestion, vehicle speed detection, traffic-flow prediction, out-of-control vehicles, vehicle trajectories, pedestrian or bicycle intrusion, traffic-law violations, thrown or spilled objects, etc.; it may also be a smart-park business scene for recognizing personnel intrusion, left-behind objects, lost-property monitoring, 车***, vehicle trajectories, traffic-flow analysis, people-flow analysis, fireworks or smoke, etc.; it may also be a ferry monitoring business scene for detecting illegal vessels, overloading, dense crowds, whether life jackets are worn, falling into water, etc.
The business scene may also be a scene such as autonomous driving, a financial scene, device login, or the monitoring of airports and public areas.
In this embodiment, the cameras may be of different models and specifications from different manufacturers, and the video analysis method can uniformly process and analyze the video images captured by cameras of different models and specifications from different manufacturers.
Preferably, after receiving the video image collected by the camera, the video analysis method further includes:
a step of decoding the video image.
Specifically, the video image may be decoded by a graphics processing unit (GPU) to obtain each frame of the video image.
Step S2: detect a target object in the video image to obtain the category of the target object.
In this embodiment, the target objects in the video image include people, animals, vehicles, buildings, smoke, etc.
Specifically, detecting the target object in the video image to obtain the category of the target object includes:
(1) recognizing the target object in the video image;
In this embodiment, the target objects in the video image include stationary target objects and moving target objects.
When the target object in the video image is a stationary target object, it may be recognized by a template-based detection method. Specifically, the contour of the shape of the target object in the video image is determined, and the contour of the target object's shape is feature-matched against a pre-stored template file.
For example, when the target object in the video image is a door, the contour of the target object's shape may be determined to be a rectangle, and the rectangle is feature-matched against a pre-stored door template file to recognize the target object, where the door template file is a rectangle.
When the target object in the video image is a moving target object, it may be recognized by at least one of the background subtraction method, the frame difference method, and the optical flow method. The background subtraction method builds a background model of the relatively fixed scene in the video image, and during detection obtains the moving target object from the difference between the current image and the background model; the frame difference method obtains the position of the moving target object by comparing pixels at corresponding positions between adjacent frames of the video sequence; the optical flow method detects the moving target object in the video image using the time-varying characteristics of optical flow vectors.
It should be understood that the methods for detecting stationary and moving target objects in a video image are not limited to those listed above; any method suitable for detecting a target object in a video image is applicable here. In addition, the methods for detecting stationary and moving target objects in this embodiment are all existing techniques and are not described in detail herein.
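The frame-difference idea above can be sketched with plain arrays: pixels whose value changes by more than a threshold between adjacent frames are treated as belonging to a moving target object. This is a minimal illustration, assuming grayscale frames; the function name and threshold are illustrative, and a real system would add filtering and connected-component grouping.

```python
import numpy as np

def moving_mask(prev_frame, curr_frame, threshold=25):
    """Boolean mask of pixels that moved between two grayscale frames."""
    # Promote to a signed type so the subtraction cannot wrap around.
    diff = np.abs(curr_frame.astype(np.int16) - prev_frame.astype(np.int16))
    return diff > threshold

prev = np.zeros((4, 4), dtype=np.uint8)
curr = prev.copy()
curr[1:3, 1:3] = 200          # a small bright object appears
mask = moving_mask(prev, curr)
assert mask.sum() == 4        # exactly the 2x2 object region changed
assert not moving_mask(prev, prev).any()
```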
(2) determining the category of the target object.
For example, when the target object recognized in the video image is a car, the category of the target object may be determined to be a vehicle. Detecting and classifying target objects is a very basic task in vision technology; its purpose is to track objects of interest in the scene, including conventional target-object detection, person detection, vehicle detection, and so on.
In this embodiment, the basic attributes of the target object in the video image may be obtained by decomposing the target object in the video image, where the basic attributes include color, motion trajectory, shape, structure, etc.; the obtained basic attributes are then compared with the basic attributes of target objects pre-stored in a database, so that the target object in the video image is accurately recognized. The database stores a correspondence table between the basic attributes of target objects and target-object categories.
Determining the category of the target object specifically includes: obtaining the basic attributes of the target object in the video image by decomposing the target object in the video image; comparing the obtained basic attributes with the basic attributes of target objects pre-stored in the database; and, when the obtained basic attributes are consistent with the basic attributes of a target object in the database, querying the correspondence table between basic attributes and target-object categories stored in the database to obtain the category of the target object.
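The attribute-comparison step can be sketched as a lookup: decomposed basic attributes are matched against pre-stored entries, and the attribute-to-category table yields the category. The table contents and attribute keys below are illustrative assumptions standing in for the database described in the text.

```python
# Stand-in for the database's attribute / category correspondence table.
ATTRIBUTE_TABLE = [
    ({"shape": "rectangular", "motion": "static"}, "door"),
    ({"shape": "four-wheeled", "motion": "moving"}, "vehicle"),
    ({"shape": "upright", "motion": "moving"}, "pedestrian"),
]

def classify(attributes):
    """Return the category whose stored attributes are all consistent with
    the decomposed attributes; None if nothing in the table matches."""
    for stored, category in ATTRIBUTE_TABLE:
        if all(attributes.get(k) == v for k, v in stored.items()):
            return category
    return None

assert classify({"shape": "four-wheeled", "motion": "moving", "color": "red"}) == "vehicle"
assert classify({"shape": "round", "motion": "moving"}) is None
```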
Step S3: track the target object in the video image to obtain the state of the target object.
After target-object detection is completed, the motion trajectory of each detected target object needs to be calculated so as to track the target object in the video image. In this embodiment, the state of the target object can be determined by tracking the target object in the video image.
The method of tracking the target object in the video image includes:
a) determining the target object in the current video frame;
b) obtaining the image region of the target object in the preceding video frames and the image features of the image region, where the preceding video frames are the k video frames before the current video frame and k is a positive integer;
c) performing motion estimation on the target object according to its image region in the preceding video frames, to determine the predicted region of the target object in the current video frame;
d) determining, according to the predicted region, the detection range of the target object in the current video frame;
e) judging whether the target object appears within the detection range in the current video frame; if the target object does not appear within the detection range, determining that the state of the target object is abnormal; if the target object appears within the detection range, determining the image region of the target object in the current video frame, i.e., the state of the target object is normal.
Since the preceding video frames are the k video frames before the current video frame, using these k frames to predict and cross-check the current frame requires little computation, solves the problem of the target object occasionally being lost or occluded in the video, and gives high detection accuracy.
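Steps a) to e) can be sketched as follows: predict the target's position in the current frame from its regions in the previous k frames (here a simple constant-velocity motion estimate over object centers, an assumption, since the patent does not fix the estimator), widen the prediction into a detection range, and flag the target abnormal if it is not found inside that range.

```python
import numpy as np

def predict_center(history):
    """history: object centers in the previous k frames, oldest first."""
    h = np.asarray(history, dtype=float)
    velocity = (h[-1] - h[0]) / (len(h) - 1)   # average motion per frame
    return h[-1] + velocity                     # predicted region center

def is_abnormal(history, detected_center, search_radius=20.0):
    """True when the target does not appear within the detection range."""
    if detected_center is None:                 # target not detected at all
        return True
    predicted = predict_center(history)
    return np.linalg.norm(np.asarray(detected_center) - predicted) > search_radius

history = [(0, 0), (10, 0), (20, 0)]            # moving right, 10 px/frame
assert tuple(predict_center(history)) == (30.0, 0.0)
assert not is_abnormal(history, (31, 2))        # inside the detection range
assert is_abnormal(history, None)               # disappeared -> abnormal
```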
Step S4: analyze, according to the category of the target object and the state of the target object, the business scene contained in the video image.
In this embodiment, the category of the target object can be obtained from the detection result and the state of the target object can be determined from the tracking result, so that the business scene contained in the video image can be obtained by analysis.
For example, if the detection result shows that the category of the target object is a car and the car does not appear within the detection range in the current video frame, it can be determined that the state of the car is abnormal; if the car is in a congested state, it can be concluded that the business scene contained in the video image is an intelligent transportation business scene.
As another example, if the detection result shows that the category of the target object is a pedestrian and the pedestrian does not appear within the detection range in the current video frame, it can be determined that the state of the pedestrian is abnormal; if the pedestrian has fallen down, it can be concluded that the business scene contained in the video image is an intelligent transportation business scene.
As another example, if the detection result shows that the category of the target object is a door and the door does not appear within the detection range in the current video, it can be confirmed that the state of the door is abnormal; if the door remains open, it can be judged that the business scene contained in the video image is an intelligent security business scene.
Step S5: judge whether the business scene in the video image is abnormal. When the business scene in the video image is abnormal, go to step S6; when the business scene in the video image is not abnormal, end the process.
In this embodiment, whether the business scene in the video image is abnormal can be judged by analyzing the category of the target object and the state of the target object. For example, it is judged whether the target object appears within the detection range in the current video frame; if the target object does not appear within the detection range, the state of the target object is determined to be abnormal, i.e., the business scene corresponding to the target object is also abnormal.
In other embodiments, the video image may be input into a pre-trained abnormal model, and whether the business scene in the video image is abnormal is judged according to the abnormal model. Specifically, when the target object is determined to be abnormal, the current video frame is extracted as an abnormal image; the abnormal image is imported, as an image to be recognized, into a pre-trained abnormal model, where the abnormal model characterizes the correspondence between images to be recognized and abnormal scenes; when the abnormal model outputs an abnormal scene corresponding to the image to be recognized, it is confirmed that the business scene is abnormal. The abnormal models include models corresponding to different business scenes. For example, when the business scene is an intelligent transportation business scene, the corresponding abnormal models include a traffic accident model, a traffic congestion model, a traffic-violation model, etc.; when the business scene is a smart-park business scene, the corresponding abnormal models include a left-behind personal belongings model, a personnel intrusion model, etc.; when the business scene is a ferry monitoring business scene, the corresponding abnormal models include an overload model, a falling-into-water model, an illegal vessel model, etc.
For example, when traffic congestion appears in the current video frame of the video image, the current video frame is extracted as an abnormal image and imported, as an image to be recognized, into a pre-trained traffic congestion model. When the traffic congestion model outputs a traffic congestion scene corresponding to the image to be recognized, it is confirmed that the intelligent transportation business scene corresponding to the video image is abnormal; when the traffic congestion model does not output a traffic congestion scene corresponding to the image to be recognized, it is confirmed that the intelligent transportation business scene corresponding to the video image is normal.
The abnormal model is a machine learning model trained on a picture sample set. The picture samples include picture samples of abnormal business scenes and picture samples of normal business scenes. The machine learning model is an artificial-intelligence algorithm model capable of image recognition, including the convolutional neural network model (CNN), the recurrent neural network model (RNN), and the deep neural network model (DNN). Among these, the CNN is a multi-layer neural network that can progressively reduce the dimensionality of an image recognition problem involving a huge amount of data until the problem becomes trainable; therefore, the machine learning model in the embodiments of this application may be a CNN model.
In the evolution of CNN architectures, many CNN networks have appeared, including LeNet, AlexNet, VGGNet, GoogleNet, and ResNet. The ResNet network proposes a residual learning framework that reduces the burden of network training; such a network is substantially deeper than previously used networks and solves the problem, seen in other neural networks, of accuracy dropping as the network deepens. In this embodiment, the machine learning model may be the ResNet model in the CNN family. It should be noted that this is only an example; other machine learning models capable of image recognition are likewise applicable to this application and are not repeated here.
Step S6: when the business scene in the video image is abnormal, record the key information about when the business scene is abnormal.
In this embodiment, the key information includes the time and place when the business scene is abnormal, the picture file captured from the video image at the moment of the abnormality, etc.
Further, the video analysis method also includes sending the recorded key information to a third-party business platform. The third-party business platform includes a public security system, a traffic control system, etc. Sending the recorded key information to the third-party business platform helps the third-party business platform obtain, in time, the key information about an abnormality occurring in a business scene, so that the abnormality can be handled promptly.
Further, the video analysis method also includes: displaying the key information about the abnormal business scene. Specifically, information such as the abnormal picture, time, and place of the business scene is shown on the display screen.
In summary, the video analysis method provided by this application includes: receiving a video image collected by a camera; detecting a target object in the video image to obtain the category of the target object; tracking the target object in the video image to obtain the state of the target object; analyzing, according to the category and state of the target object, the business scene contained in the video image; judging whether the business scene is abnormal; and, when the business scene in the video image is abnormal, recording the key information about the abnormality. Whether the business scene corresponding to the video image is abnormal can be analyzed in real time, and when the abnormality is confirmed, the key information about the abnormality is recorded. The key information can then be sent to the corresponding third-party platform so that the abnormality can be handled in time.
Embodiment 2
FIG. 2 is a functional module diagram of a preferred embodiment of the video analysis apparatus of this application.
In some embodiments, the video analysis apparatus 20 runs in a computer device. The video analysis apparatus 20 may include a plurality of functional modules composed of computer-readable instruction code segments. The instruction codes of the computer-readable instruction code segments in the video analysis apparatus 20 may be stored in a memory and executed by at least one processor to perform the video analysis function (see FIG. 1 and its related description for details).
In this embodiment, the video analysis apparatus 20 may be divided into a plurality of functional modules according to the functions it performs. The functional modules may include: a receiving module 201, a detection module 202, a tracking module 203, an analysis module 204, a judgment module 205, and a processing module 206. A module in this application refers to a series of computer-readable instruction code segments that can be executed by at least one processor, can complete a fixed function, and are stored in a memory. In some embodiments, the functions of the modules will be detailed in subsequent embodiments.
The receiving module 201 is used to receive a video image collected by a camera.
In this embodiment, video images are collected by cameras installed in different business scenes. A business scene describes a scene in which target-object detection and/or video analysis is required. For example, the business scene may be an intelligent transportation business scene for recognizing traffic accidents, congestion, vehicle speed detection, traffic-flow prediction, out-of-control vehicles, vehicle trajectories, pedestrian or bicycle intrusion, traffic-law violations, thrown or spilled objects, etc.; it may also be a smart-park business scene for recognizing personnel intrusion, left-behind objects, lost-property monitoring, 车***, vehicle trajectories, traffic-flow analysis, people-flow analysis, fireworks or smoke, etc.; it may also be a ferry monitoring business scene for detecting illegal vessels, overloading, dense crowds, whether life jackets are worn, falling into water, etc.
In this embodiment, the camera communicates with the computer device through a wired or wireless network and sends the collected video images to the computer device through the wired or wireless network.
The business scene may also be a scene such as autonomous driving, a financial scene, device login, or the monitoring of airports and public areas.
In this embodiment, the cameras may be of different models and specifications from different manufacturers, and the video analysis apparatus 20 can uniformly process and analyze the video images captured by cameras of different models and specifications from different manufacturers.
Preferably, after receiving the video image collected by the camera, the video analysis apparatus 20 may also decode the video image.
Specifically, the video image may be decoded by a graphics processing unit (GPU) to obtain each frame of the video image.
The detection module 202 is used to detect a target object in the video image to obtain the category of the target object.
In this embodiment, the target objects in the video image include people, animals, vehicles, buildings, smoke, etc.
Specifically, detecting the target object in the video image to obtain the category of the target object includes:
(1) recognizing the target object in the video image;
In this embodiment, the target objects in the video image include stationary target objects and moving target objects.
When the target object in the video image is a stationary target object, it may be recognized by a template-based detection method. Specifically, the contour of the shape of the target object in the video image is determined, and the contour of the target object's shape is feature-matched against a pre-stored template file.
When the target object in the video image is a moving target object, it may be recognized by at least one of the background subtraction method, the frame difference method, and the optical flow method. The background subtraction method builds a background model of the relatively fixed scene in the video image, and during detection obtains the moving target object from the difference between the current image and the background model; the frame difference method obtains the position of the moving target object by comparing pixels at corresponding positions between adjacent frames of the video sequence; the optical flow method detects the moving target object in the video image using the time-varying characteristics of optical flow vectors.
In this embodiment, the methods for detecting stationary and moving target objects in a video image are not limited to those listed above; any method suitable for detecting a target object in a video image is applicable here. In addition, the methods for detecting stationary and moving target objects in this embodiment are all existing techniques and are not described in detail herein.
(2) determining the category of the target object.
For example, when the target object recognized in the video image is a car, the category of the target object may be determined to be a vehicle. Detecting and classifying target objects is a very basic task in vision technology; its purpose is to track objects of interest in the scene, including conventional target-object detection, person detection, vehicle detection, and so on.
In this embodiment, the basic attributes of the target object in the video image may be obtained by decomposing the target object in the video image, where the basic attributes include color, motion trajectory, shape, structure, etc.; the obtained basic attributes are then compared with the basic attributes of target objects pre-stored in a database, so that the target object in the video image is accurately recognized. The database stores a correspondence table between the basic attributes of target objects and target-object categories.
Determining the category of the target object specifically includes: obtaining the basic attributes of the target object in the video image by decomposing the target object in the video image; comparing the obtained basic attributes with the basic attributes of target objects pre-stored in the database; and, when the obtained basic attributes are consistent with the basic attributes of a target object in the database, querying the correspondence table between basic attributes and target-object categories stored in the database to obtain the category of the target object.
The tracking module 203 is used to track the target object in the video image to obtain the state of the target object.
After target-object detection is completed, the motion trajectory of each detected target object needs to be calculated so as to track the target object in the video image. In this embodiment, the state of the target object can be determined by tracking the target object in the video image.
The method of tracking the target object in the video image includes:
a) determining the target object in the current video frame;
b) obtaining the image region of the target object in the preceding video frames and the image features of the image region, where the preceding video frames are the k video frames before the current video frame and k is a positive integer;
c) performing motion estimation on the target object according to its image region in the preceding video frames, to determine the predicted region of the target object in the current video frame;
d) determining, according to the predicted region, the detection range of the target object in the current video frame;
e) judging whether the target object appears within the detection range in the current video frame; if the target object does not appear within the detection range, determining that the state of the target object is abnormal; if the target object appears within the detection range, determining the image region of the target object in the current video frame.
Since the preceding video frames are the k video frames before the current video frame, using these k frames to predict and cross-check the current frame requires little computation, solves the problem of the target object occasionally being lost or occluded in the video, and gives high detection accuracy.
The analysis module 204 is used to analyze, according to the category of the target object and the state of the target object, the business scene contained in the video image.
In this embodiment, the category of the target object can be obtained from the detection result and the state of the target object can be determined from the tracking result, so that the business scene contained in the video image can be obtained by analysis.
For example, if the detection result shows that the target object is a car and the car does not appear within the detection range in the current video frame, it can be determined that the state of the car is abnormal; if the car is in a congested state, it can be concluded that the business scene contained in the video image is an intelligent transportation business scene.
As another example, if the detection result shows that the category of the target object is a pedestrian and the pedestrian does not appear within the detection range in the current video frame, it can be determined that the state of the pedestrian is abnormal; if the pedestrian has fallen down, it can be concluded that the business scene contained in the video image is an intelligent transportation business scene.
The judgment module 205 is used to judge whether the business scene in the video image is abnormal. When the business scene in the video image is abnormal, the key information about the abnormality is recorded.
In this embodiment, whether the business scene in the video image is abnormal can be judged by analyzing the category of the target object and the state of the target object. For example, it is judged whether the target object appears within the detection range in the current video frame; if the target object does not appear within the detection range, the state of the target object is determined to be abnormal, i.e., the business scene corresponding to the target object is also abnormal.
In other embodiments, the video image may be input into a pre-trained abnormal model, and whether the business scene in the video image is abnormal is judged according to the abnormal model. Specifically, when the target object is determined to be abnormal, the current video frame is extracted as an abnormal image; the abnormal image is imported, as an image to be recognized, into a pre-trained abnormal model, where the abnormal model characterizes the correspondence between images to be recognized and abnormal scenes; when the abnormal model outputs an abnormal scene corresponding to the image to be recognized, it is confirmed that the business scene is abnormal. The abnormal models include models corresponding to different business scenes. For example, when the business scene is an intelligent transportation business scene, the corresponding abnormal models include a traffic accident model, a traffic congestion model, a traffic-violation model, etc.; when the business scene is a smart-park business scene, the corresponding abnormal models include a left-behind personal belongings model, a personnel intrusion model, etc.; when the business scene is a ferry monitoring business scene, the corresponding abnormal models include an overload model, a falling-into-water model, an illegal vessel model, etc.
For example, when traffic congestion appears in the current video frame of the video image, the current video frame is extracted as an abnormal image and imported, as an image to be recognized, into a pre-trained traffic congestion model. When the traffic congestion model outputs a traffic congestion scene corresponding to the image to be recognized, it is confirmed that the intelligent transportation business scene corresponding to the video image is abnormal; when the traffic congestion model does not output a traffic congestion scene corresponding to the image to be recognized, it is confirmed that the intelligent transportation business scene corresponding to the video image is normal.
The abnormal model is a machine learning model trained on a picture sample set. The picture samples include picture samples of abnormal business scenes and picture samples of normal business scenes. The machine learning model is an artificial-intelligence algorithm model capable of image recognition, including the convolutional neural network model (CNN), the recurrent neural network model (RNN), and the deep neural network model (DNN). Among these, the CNN is a multi-layer neural network that can progressively reduce the dimensionality of an image recognition problem involving a huge amount of data until the problem becomes trainable; therefore, the machine learning model in the embodiments of this application may be a CNN model.
In the evolution of CNN architectures, many CNN networks have appeared, including LeNet, AlexNet, VGGNet, GoogleNet, and ResNet. The ResNet network proposes a residual learning framework that reduces the burden of network training; such a network is substantially deeper than previously used networks and solves the problem, seen in other neural networks, of accuracy dropping as the network deepens. In this embodiment, the machine learning model may be the ResNet model in the CNN family. It should be noted that this is only an example; other machine learning models capable of image recognition are likewise applicable to this application and are not repeated here.
The processing module 206 is used to record, when the business scene in the video image is abnormal, the key information about when the business scene is abnormal.
In this embodiment, the key information includes the time and place when the business scene is abnormal, the picture file captured from the video image at the moment of the abnormality, etc.
Further, the video analysis apparatus 20 may also send the recorded key information to a third-party business platform. The third-party business platform includes a public security system, a traffic control system, etc. Sending the recorded key information to the third-party business platform helps the third-party business platform obtain, in time, the key information about an abnormality occurring in a business scene, so that the abnormality can be handled promptly.
Further, the video analysis apparatus 20 may also display the key information about the abnormal business scene. Specifically, the time and place when the business scene is abnormal and the picture file captured from the video image at the moment of the abnormality are shown on a display screen.
In summary, the video analysis apparatus 20 provided by this application includes a receiving module 201, a detection module 202, a tracking module 203, an analysis module 204, a judgment module 205, and a processing module 206. The receiving module 201 is used to receive a video image collected by a camera; the detection module 202 is used to detect a target object in the video image to obtain the category of the target object; the tracking module 203 is used to track the target object in the video image to obtain the state of the target object; the analysis module 204 is used to analyze, according to the category and state of the target object, the business scene contained in the video image; the judgment module 205 is used to judge whether the business scene is abnormal; and the processing module 206 is used to record, when the business scene in the video image is abnormal, the key information about the abnormality. Whether the business scene corresponding to the video image is abnormal can be analyzed in real time, and when the abnormality is confirmed, the key information about the abnormality is recorded. The key information can then be sent to the corresponding third-party platform so that the abnormality can be handled in time.
The above integrated units implemented in the form of software function modules may be stored in a non-volatile readable storage medium. The software function modules are stored in a storage medium and include several instructions to cause a computer device (which may be a personal computer, a dual-screen device, a network device, etc.) or a processor to execute parts of the methods described in the embodiments of this application.
Embodiment 3
FIG. 3 is a schematic diagram of the computer device provided in Embodiment 3 of this application.
The computer device 3 includes: a database 31, a memory 32, at least one processor 33, computer-readable instructions 34 stored in the memory 32 and executable on the at least one processor 33, and at least one communication bus 35.
The at least one processor 33 implements the steps in the above video analysis method embodiment when executing the computer-readable instructions 34.
Exemplarily, the computer-readable instructions 34 may be divided into one or more modules/units, which are stored in the memory 32 and executed by the at least one processor 33 to complete this application. The one or more modules/units may be a series of computer-readable instruction segments capable of completing specific functions; the instruction segments are used to describe the execution process of the computer-readable instructions 34 in the computer device 3.
The computer device 3 is a device that can automatically perform numerical calculation and/or information processing according to preset or stored instructions. Its hardware includes, but is not limited to, a microprocessor, an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA), a digital signal processor (Digital Signal Processor, DSP), embedded devices, etc. Those skilled in the art will understand that FIG. 3 is merely an example of the computer device 3 and does not constitute a limitation on the computer device 3; the device may include more or fewer components than shown, combine certain components, or use different components. For example, the computer device 3 may also include input/output devices, network access devices, buses, etc.
The database (Database) 31 is a warehouse built on the computer device 3 that organizes, stores, and manages data according to a data structure. Databases are usually divided into three types: hierarchical databases, network databases, and relational databases. In this embodiment, the database 31 is used to store the video images and the like.
The at least one processor 33 may be a central processing unit (Central Processing Unit, CPU), another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. The processor 33 may be a microprocessor or any conventional processor; the processor 33 is the control center of the computer device 3 and uses various interfaces and lines to connect the parts of the entire computer device 3.
The memory 32 may be used to store the computer-readable instructions 34 and/or modules/units; the processor 33 implements the various functions of the computer device 3 by running or executing the computer-readable instructions and/or modules/units stored in the memory 32 and by calling the data stored in the memory 32. The memory 32 may mainly include a program storage area and a data storage area, where the program storage area may store the operating system and the application programs required for at least one function (such as a sound playback function or an image playback function), and the data storage area may store data created according to the use of the computer device 3 (such as audio data). In addition, the memory 32 may also include non-volatile memory, such as a hard disk, an internal memory, a plug-in hard disk, a smart media card (Smart Media Card, SMC), a secure digital (Secure Digital, SD) card, a flash card (Flash Card), at least one magnetic disk storage device, a flash memory device, or another non-volatile solid-state storage device.
Computer-readable instruction code is stored in the memory 32, and the at least one processor 33 can call the computer-readable instruction code stored in the memory 32 to perform related functions. For example, the modules described in FIG. 2 (the receiving module 201, detection module 202, tracking module 203, analysis module 204, judgment module 205, and processing module 206) are computer-readable instruction codes stored in the memory 32 and executed by the at least one processor 33, thereby realizing the functions of the modules to achieve the purpose of video analysis.
The receiving module 201 is used to receive a video image collected by a camera;
the detection module 202 is used to detect a target object in the video image to obtain the category of the target object;
the tracking module 203 is used to track the target object in the video image to obtain the state of the target object;
the analysis module 204 is used to analyze, according to the category of the target object and the state of the target object, the business scene contained in the video image;
the judgment module 205 is used to judge whether the business scene is abnormal; and
the processing module 206 is used to record, when the business scene in the video image is abnormal, the key information about when the business scene is abnormal.
If the integrated modules/units of the computer device 3 are implemented in the form of software functional units and sold or used as independent products, they may be stored in a non-volatile readable storage medium. Based on this understanding, all or part of the processes in the methods of the above embodiments of this application may also be completed by a computer program instructing the relevant hardware; the computer program may be stored in a non-volatile readable storage medium, and when executed by a processor, the computer program can implement the steps of each of the above method embodiments. The computer program includes computer-readable instruction code, which may be in the form of source code, object code, an executable file, some intermediate form, etc. The non-volatile readable medium may include: any entity or device capable of carrying the computer-readable instruction code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM, Read-Only Memory), etc.
Although not shown, the computer device 3 may also include a power source (such as a battery) that supplies power to the components. Preferably, the power source may be logically connected to the at least one processor 33 through a power management system, so that functions such as charging, discharging, and power-consumption management are implemented through the power management system. The power source may also include one or more DC or AC power supplies, a recharging system, a power-failure detection circuit, a power converter or inverter, a power status indicator, and other arbitrary components. The computer device 3 may also include a Bluetooth module, a Wi-Fi module, etc., which are not described again here.
It should be understood that the embodiments are for illustration only, and the scope of the patent application is not limited by this structure.
In the several embodiments provided in this application, it should be understood that the disclosed electronic device and method may be implemented in other ways. For example, the electronic device embodiments described above are merely illustrative; for instance, the division of the units is only a logical functional division, and there may be other ways of division in actual implementation.
In addition, the functional units in the embodiments of this application may be integrated in the same processing unit, each unit may exist alone physically, or two or more units may be integrated in the same unit. The above integrated units may be implemented in the form of hardware, or in the form of hardware plus software function modules.
For those skilled in the art, it is obvious that this application is not limited to the details of the above exemplary embodiments, and that this application can be implemented in other specific forms without departing from the spirit or basic features of this application. Therefore, from whatever point of view, the embodiments should be regarded as exemplary and non-limiting. The scope of this application is defined by the appended claims rather than by the above description, and it is therefore intended that all changes falling within the meaning and scope of equivalent elements of the claims be included in this application. No reference sign in the claims should be regarded as limiting the claim concerned. Moreover, it is clear that the word "comprise" does not exclude other units, and the singular does not exclude the plural. Multiple units or devices recited in the system claims may also be implemented by one unit or device through software or hardware. Words such as "first" and "second" are used to denote names and do not denote any particular order.
Finally, it should be noted that the above embodiments are only used to illustrate, not to limit, the technical solution of this application. Although this application has been described in detail with reference to the preferred embodiments, those of ordinary skill in the art should understand that the technical solution of this application can be modified or equivalently replaced without departing from the spirit and scope of the technical solution of this application.

Claims (20)

  1. A video analysis method, characterized in that the method comprises:
    receiving a video image collected by a camera;
    detecting a target object in the video image to obtain the category of the target object;
    tracking the target object in the video image to obtain the state of the target object;
    analyzing, according to the category of the target object and the state of the target object, the business scene contained in the video image;
    judging whether the business scene is abnormal; and
    when the business scene in the video image is abnormal, recording the key information about when the business scene is abnormal.
  2. The video analysis method according to claim 1, characterized in that detecting the target object in the video image to obtain the category of the target object comprises:
    obtaining the basic attributes of the target object in the video image by decomposing the target object in the video image;
    comparing the obtained basic attributes with the basic attributes of target objects pre-stored in a database; and
    when the obtained basic attributes are consistent with the basic attributes of a target object in the database, querying the correspondence table between basic attributes and target-object categories stored in the database to obtain the category of the target object.
  3. The video analysis method according to claim 1, characterized in that tracking the target object in the video image to obtain the state of the target object comprises:
    determining the target object in the current video frame;
    obtaining the image region of the target object in the preceding video frames and the image features of the image region, where the preceding video frames are the k video frames before the current video frame and k is a positive integer;
    performing motion estimation on the target object according to its image region in the preceding video frames, to determine the predicted region of the target object in the current video frame;
    determining, according to the predicted region, the detection range of the target object in the current video frame;
    judging whether the target object appears within the detection range in the current video frame;
    if the target object appears within the detection range in the current video frame, determining the image region of the target object in the current video frame; and
    if the target object does not appear within the detection range in the current video frame, determining that the target object is abnormal.
  4. The video analysis method according to claim 3, characterized in that judging whether the business scene is abnormal comprises:
    when the target object is determined to be abnormal, extracting the current video frame as an abnormal image;
    importing the abnormal image, as an image to be recognized, into a pre-trained abnormal model, where the abnormal model characterizes the correspondence between images to be recognized and abnormal scenes; and
    when the abnormal model outputs an abnormal scene corresponding to the image to be recognized, confirming that the business scene is abnormal.
  5. The video analysis method according to claim 1, characterized in that the key information includes the time and place when the business scene is abnormal, and the picture file, captured from the video image, showing the business scene at the moment of the abnormality.
  6. The video analysis method according to claim 5, characterized in that the method further comprises:
    sending the recorded key information to a third-party business platform, where the third-party business platform includes a public security system and a traffic control system.
  7. The video analysis method according to claim 1, characterized in that, after receiving the video image collected by the camera, the method further comprises:
    decoding the video image.
  8. A video analysis apparatus, characterized in that the apparatus comprises:
    a receiving module, for receiving a video image collected by a camera;
    a detection module, for detecting a target object in the video image to obtain the category of the target object;
    a tracking module, for tracking the target object in the video image to obtain the state of the target object;
    an analysis module, for analyzing, according to the category of the target object and the state of the target object, the business scene contained in the video image;
    a judgment module, for judging whether the business scene is abnormal; and
    a processing module, for recording, when the business scene in the video image is abnormal, the key information about when the business scene is abnormal.
  9. A computer device, characterized in that the computer device comprises a processor and a memory, the memory storing at least one computer-readable instruction, and the processor executing the at least one computer-readable instruction to implement the following steps:
    receiving a video image collected by a camera;
    detecting a target object in the video image to obtain the category of the target object;
    tracking the target object in the video image to obtain the state of the target object;
    analyzing, according to the category of the target object and the state of the target object, the business scene contained in the video image;
    judging whether the business scene is abnormal; and
    when the business scene in the video image is abnormal, recording the key information about when the business scene is abnormal.
  10. The computer device according to claim 9, characterized in that, when the processor executes the at least one computer-readable instruction to implement detecting the target object in the video image to obtain the category of the target object, the implementation comprises:
    obtaining the basic attributes of the target object in the video image by decomposing the target object in the video image;
    comparing the obtained basic attributes with the basic attributes of target objects pre-stored in a database; and
    when the obtained basic attributes are consistent with the basic attributes of a target object in the database, querying the correspondence table between basic attributes and target-object categories stored in the database to obtain the category of the target object.
  11. The computer device according to claim 9, characterized in that, when the processor executes the at least one computer-readable instruction to implement tracking the target object in the video image to obtain the state of the target object, the implementation comprises:
    determining the target object in the current video frame;
    obtaining the image region of the target object in the preceding video frames and the image features of the image region, where the preceding video frames are the k video frames before the current video frame and k is a positive integer;
    performing motion estimation on the target object according to its image region in the preceding video frames, to determine the predicted region of the target object in the current video frame;
    determining, according to the predicted region, the detection range of the target object in the current video frame;
    judging whether the target object appears within the detection range in the current video frame;
    if the target object appears within the detection range in the current video frame, determining the image region of the target object in the current video frame; and
    if the target object does not appear within the detection range in the current video frame, determining that the target object is abnormal.
  12. The computer device according to claim 11, characterized in that, when the processor executes the at least one computer-readable instruction to implement judging whether the business scene is abnormal, the implementation comprises:
    when the target object is determined to be abnormal, extracting the current video frame as an abnormal image;
    importing the abnormal image, as an image to be recognized, into a pre-trained abnormal model, where the abnormal model characterizes the correspondence between images to be recognized and abnormal scenes; and
    when the abnormal model outputs an abnormal scene corresponding to the image to be recognized, confirming that the business scene is abnormal.
  13. The computer device according to claim 9, characterized in that the processor, when executing the at least one computer-readable instruction, is further configured to implement the following step:
    sending the recorded key information to a third-party business platform, where the third-party business platform includes a public security system and a traffic control system.
  14. The computer device according to claim 9, characterized in that the processor executes the at least one computer-readable instruction so that, after the video image collected by the camera is received, the following step is further implemented:
    decoding the video image.
  15. A non-volatile readable storage medium storing at least one computer-readable instruction, characterized in that the at least one computer-readable instruction, when executed by a processor, implements the following steps:
    receiving a video image collected by a camera;
    detecting a target object in the video image to obtain the category of the target object;
    tracking the target object in the video image to obtain the state of the target object;
    analyzing, according to the category of the target object and the state of the target object, the business scene contained in the video image;
    judging whether the business scene is abnormal; and
    when the business scene in the video image is abnormal, recording the key information about when the business scene is abnormal.
  16. The storage medium according to claim 15, characterized in that, when the at least one computer-readable instruction is executed by the processor to implement detecting the target object in the video image to obtain the category of the target object, the implementation comprises:
    obtaining the basic attributes of the target object in the video image by decomposing the target object in the video image;
    comparing the obtained basic attributes with the basic attributes of target objects pre-stored in a database; and
    when the obtained basic attributes are consistent with the basic attributes of a target object in the database, querying the correspondence table between basic attributes and target-object categories stored in the database to obtain the category of the target object.
  17. The storage medium according to claim 15, characterized in that, when the at least one computer-readable instruction is executed by the processor to implement tracking the target object in the video image to obtain the state of the target object, the implementation comprises:
    determining the target object in the current video frame;
    obtaining the image region of the target object in the preceding video frames and the image features of the image region, where the preceding video frames are the k video frames before the current video frame and k is a positive integer;
    performing motion estimation on the target object according to its image region in the preceding video frames, to determine the predicted region of the target object in the current video frame;
    determining, according to the predicted region, the detection range of the target object in the current video frame;
    judging whether the target object appears within the detection range in the current video frame;
    if the target object appears within the detection range in the current video frame, determining the image region of the target object in the current video frame; and
    if the target object does not appear within the detection range in the current video frame, determining that the target object is abnormal.
  18. The storage medium according to claim 17, characterized in that, when the at least one computer-readable instruction is executed by the processor to implement judging whether the business scene is abnormal, the implementation comprises:
    when the target object is determined to be abnormal, extracting the current video frame as an abnormal image;
    importing the abnormal image, as an image to be recognized, into a pre-trained abnormal model, where the abnormal model characterizes the correspondence between images to be recognized and abnormal scenes; and
    when the abnormal model outputs an abnormal scene corresponding to the image to be recognized, confirming that the business scene is abnormal.
  19. The storage medium according to claim 15, characterized in that the at least one computer-readable instruction, when executed by the processor, is further used to implement the following step:
    sending the recorded key information to a third-party business platform, where the third-party business platform includes a public security system and a traffic control system.
  20. The storage medium according to claim 15, characterized in that the at least one computer-readable instruction is executed by the processor so that, after the video image collected by the camera is received, the following step is further implemented:
    decoding the video image.
PCT/CN2019/103373 2019-06-14 2019-08-29 视频分析方法、装置、计算机设备及存储介质 WO2020248386A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910517477.X 2019-06-14
CN201910517477.XA CN110390262B (zh) 2019-06-14 2019-06-14 视频分析方法、装置、服务器及存储介质

Publications (1)

Publication Number Publication Date
WO2020248386A1 (zh) 2020-12-17

Family

ID=68285438

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/103373 2019-06-14 2019-08-29 Video analysis method and apparatus, computer device, and storage medium WO2020248386A1 (zh)

Country Status (2)

Country Link
CN (1) CN110390262B (zh)
WO (1) WO2020248386A1 (zh)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111339894A (zh) * 2020-02-20 2020-06-26 支付宝(杭州)信息技术有限公司 Data processing and risk identification method, apparatus, device, and medium
CN111652043A (zh) * 2020-04-15 2020-09-11 北京三快在线科技有限公司 Object state recognition method and apparatus, image acquisition device, and storage medium
CN113552123A (zh) * 2020-04-17 2021-10-26 华为技术有限公司 Visual detection method and visual detection apparatus
CN111680610A (zh) * 2020-06-03 2020-09-18 合肥中科类脑智能技术有限公司 Construction scene abnormality monitoring method and apparatus
CN111783591B (zh) * 2020-06-23 2024-04-26 北京百度网讯科技有限公司 Anomaly detection method, apparatus, device, and storage medium
CN111832492B (zh) * 2020-07-16 2024-06-04 平安科技(深圳)有限公司 Method and apparatus for discriminating static traffic abnormalities, computer device, and storage medium
CN112804489B (zh) * 2020-12-31 2023-02-17 重庆文理学院 Internet-plus-based smart construction site management system and method
CN113891072B (zh) * 2021-12-08 2022-02-11 北京拙河科技有限公司 Video surveillance and anomaly analysis system and method based on gigapixel-scale data
CN114792368A (zh) * 2022-04-28 2022-07-26 上海兴容信息技术有限公司 Method and system for intelligently determining store compliance
CN116708899B (zh) * 2022-06-30 2024-01-23 北京生数科技有限公司 Video processing method, apparatus, and storage medium for synthesizing virtual avatars
CN115834621A (zh) * 2022-11-16 2023-03-21 山东新一代信息产业技术研究院有限公司 Artificial-intelligence-based rapid accident handling apparatus and method

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107194318A (zh) * 2017-04-24 2017-09-22 北京航空航天大学 Scene recognition method assisted by object detection
CN107346415A (zh) * 2017-06-08 2017-11-14 小草数语(北京)科技有限公司 Video image processing method and apparatus, and monitoring device
CN109063667A (zh) * 2018-08-14 2018-12-21 视云融聚(广州)科技有限公司 Scene-based video recognition mode optimization and push method
US20190095716A1 (en) * 2017-09-26 2019-03-28 Ambient AI, Inc Systems and methods for intelligent and interpretive analysis of video image data using machine learning
CN109598885A (zh) * 2018-12-21 2019-04-09 广东中安金狮科创有限公司 Monitoring system and alarm method thereof

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8200011B2 (en) * 2007-09-27 2012-06-12 Behavioral Recognition Systems, Inc. Context processor for video analysis system
CN108830204B * 2018-06-01 2021-10-19 中国科学技术大学 Target-oriented anomaly detection method in surveillance video

Cited By (33)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112633126A (zh) * 2020-12-18 2021-04-09 联通物联网有限责任公司 Video processing method and apparatus
CN112711994A (zh) * 2020-12-21 2021-04-27 航天信息股份有限公司 Method and system for detecting illegal operation behavior based on scene recognition
CN112634329B (zh) * 2020-12-26 2024-02-13 西安电子科技大学 Method and apparatus for predicting scene target activity based on spatio-temporal and-or graphs
CN112634329A (zh) * 2020-12-26 2021-04-09 西安电子科技大学 Method and apparatus for predicting scene target activity based on spatio-temporal and-or graphs
CN112749636A (zh) * 2020-12-29 2021-05-04 精英数智科技股份有限公司 Monitoring method, apparatus, system, and storage medium for coal mine water exploration and drainage
CN112749636B (zh) * 2020-12-29 2023-10-31 精英数智科技股份有限公司 Monitoring method, apparatus, system, and storage medium for coal mine water exploration and drainage
CN112991280A (zh) * 2021-03-03 2021-06-18 望知科技(深圳)有限公司 Visual inspection method, system, and electronic device
CN112991280B (zh) * 2021-03-03 2024-05-28 望知科技(深圳)有限公司 Visual inspection method, system, and electronic device
CN113065456A (zh) * 2021-03-30 2021-07-02 上海商汤智能科技有限公司 Information prompting method and apparatus, electronic device, and computer storage medium
CN113378005A (zh) * 2021-06-03 2021-09-10 北京百度网讯科技有限公司 Event processing method and apparatus, electronic device, and storage medium
CN113378005B (zh) * 2021-06-03 2023-06-02 北京百度网讯科技有限公司 Event processing method and apparatus, electronic device, and storage medium
CN113361468A (zh) * 2021-06-30 2021-09-07 北京百度网讯科技有限公司 Business quality inspection method, apparatus, device, and storage medium
CN113422935A (zh) * 2021-07-06 2021-09-21 城云科技(中国)有限公司 Video stream processing method, apparatus, and system
CN113673351A (zh) * 2021-07-21 2021-11-19 浙江大华技术股份有限公司 Behavior detection method, device, and storage medium
CN113705370A (zh) * 2021-08-09 2021-11-26 百度在线网络技术(北京)有限公司 Method and apparatus for detecting violations in live-streaming rooms, electronic device, and storage medium
CN113705370B (zh) * 2021-08-09 2023-06-30 百度在线网络技术(北京)有限公司 Method and apparatus for detecting violations in live-streaming rooms, electronic device, and storage medium
CN113724220A (zh) * 2021-08-27 2021-11-30 青岛创新奇智科技集团股份有限公司 Video processing method and apparatus, electronic device, and computer-readable storage medium
CN113763860A (zh) * 2021-09-14 2021-12-07 杭州海康消防科技有限公司 Display color determination method and apparatus, electronic device, and storage medium
CN113763860B (zh) * 2021-09-14 2024-05-31 杭州海康消防科技有限公司 Display color determination method and apparatus, electronic device, and storage medium
WO2023040151A1 (zh) * 2021-09-17 2023-03-23 上海商汤智能科技有限公司 Algorithm application element generation method and apparatus, electronic device, computer-readable storage medium, and computer program product
CN113936258A (zh) * 2021-10-15 2022-01-14 北京百度网讯科技有限公司 Image processing method and apparatus, electronic device, and storage medium
CN113992890A (zh) * 2021-10-22 2022-01-28 北京明略昭辉科技有限公司 Monitoring method and apparatus, storage medium, and electronic device
CN114205565B (zh) * 2022-02-15 2022-07-29 云丁网络技术(北京)有限公司 Surveillance video distribution method and system
CN114205565A (zh) * 2022-02-15 2022-03-18 云丁网络技术(北京)有限公司 Surveillance video distribution method and system
CN114378862A (zh) * 2022-03-02 2022-04-22 北京云迹科技股份有限公司 Cloud-platform-based automatic robot abnormality repair method and apparatus, and robot
CN114378862B (zh) * 2022-03-02 2024-05-10 北京云迹科技股份有限公司 Cloud-platform-based automatic robot abnormality repair method and apparatus, and robot
CN114682520A (zh) * 2022-04-12 2022-07-01 浪潮软件集团有限公司 Defective-product sorting apparatus based on a domestic CPU and an artificial intelligence accelerator card
CN117079211A (zh) * 2023-08-16 2023-11-17 广州腾方科技有限公司 Security monitoring system for a network equipment room and method thereof
CN117079211B (zh) * 2023-08-16 2024-06-04 广州腾方科技有限公司 Security monitoring system for a network equipment room and method thereof
CN116953416B (zh) * 2023-09-19 2023-12-08 英迪格(天津)电气有限公司 Monitoring system for the operating state of railway power transformation and distribution equipment
CN116953416A (zh) * 2023-09-19 2023-10-27 英迪格(天津)电气有限公司 Monitoring system for the operating state of railway power transformation and distribution equipment
CN117079079A (zh) * 2023-09-27 2023-11-17 中电科新型智慧城市研究院有限公司 Training method for a video anomaly detection model, video anomaly detection method, and system
CN117079079B (zh) * 2023-09-27 2024-03-15 中电科新型智慧城市研究院有限公司 Training method for a video anomaly detection model, video anomaly detection method, and system

Also Published As

Publication number Publication date
CN110390262B (zh) 2023-06-30
CN110390262A (zh) 2019-10-29

Similar Documents

Publication Publication Date Title
WO2020248386A1 (zh) Video analysis method and apparatus, computer device, and storage medium
US11840239B2 (en) Multiple exposure event determination
US10706330B2 (en) Methods and systems for accurately recognizing vehicle license plates
WO2021135879A1 (zh) Vehicle data monitoring method and apparatus, computer device, and storage medium
CN111310562B (zh) Artificial-intelligence-based vehicle driving risk management and control method and related devices
US10552687B2 (en) Visual monitoring of queues using auxillary devices
US20230153698A1 (en) Methods and systems for accurately recognizing vehicle license plates
CN109191829B (zh) Road safety monitoring method and system, and computer-readable storage medium
CN109360362A (zh) Railway video surveillance recognition method, system, and computer-readable medium
CN102902960B (zh) Abandoned object detection method based on Gaussian modeling and target contours
CN112766069A (zh) Deep-learning-based vehicle illegal parking detection method and apparatus, and electronic device
US20160210759A1 (en) System and method of detecting moving objects
CN112233428B (zh) Traffic flow prediction method, apparatus, storage medium, and device
WO2021022698A1 (zh) Tailgating detection method and apparatus, electronic device, and storage medium
CN111079621A (zh) Object detection method and apparatus, electronic device, and storage medium
CN114663871A (zh) Image recognition method, training method, apparatus, system, and storage medium
CN114926791A (zh) Method and apparatus for detecting abnormal lane changes of vehicles at intersections, storage medium, and electronic device
CN113380021A (zh) Vehicle state detection method and apparatus, server, and computer-readable storage medium
Jiao et al. Traffic behavior recognition from traffic videos under occlusion condition: a Kalman filter approach
CN103761345A (zh) Video retrieval method based on OCR character recognition technology
CN114693722B (zh) Vehicle driving behavior detection method, detection apparatus, and detection device
EP4071728A1 (en) Artificial intelligence model integration and deployment for providing a service
CN112153341B (zh) Task supervision method, apparatus, system, electronic device, and storage medium
Huang et al. A bus crowdedness sensing system using deep-learning based object detection
CN113902987A (zh) Swarm-intelligence object detection method, storage medium, and processor

Legal Events

Date Code Title Description

121 Ep: the epo has been informed by wipo that ep was designated in this application
    Ref document number: 19933143; Country of ref document: EP; Kind code of ref document: A1

NENP Non-entry into the national phase
    Ref country code: DE

122 Ep: pct application non-entry in european phase
    Ref document number: 19933143; Country of ref document: EP; Kind code of ref document: A1
