WO2023071188A1 - Method, apparatus, electronic device, and storage medium for abnormal behavior detection - Google Patents

Method, apparatus, electronic device, and storage medium for abnormal behavior detection

Info

Publication number
WO2023071188A1
Authority
WO
WIPO (PCT)
Prior art keywords
target
detection
target object
video frame
frame
Prior art date
Application number
PCT/CN2022/096440
Other languages
English (en)
French (fr)
Inventor
袁熙
王宇杰
Original Assignee
上海商汤智能科技有限公司
Priority date
Filing date
Publication date
Application filed by 上海商汤智能科技有限公司
Publication of WO2023071188A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Definitions

  • The present disclosure relates to the field of security technology, and in particular to a method, device, electronic device, and storage medium for abnormal behavior detection.
  • Detecting abnormal behavior in a monitored area, such as illegal acts, traffic accidents, and other abnormal events, is an important problem in computer vision.
  • Most cameras in a monitored area only record and cannot identify abnormal behavior automatically. Real-time identification usually relies on staff on duty, and past incidents are traced by reviewing recordings, which is extremely inefficient.
  • Embodiments of the present disclosure provide at least a method, device, electronic device, and storage medium for abnormal behavior detection.
  • An embodiment of the present disclosure provides a method for abnormal behavior detection, the method comprising:
  • pairing a plurality of target objects located on different sides of the obstructing object to obtain a target object detection pair, and determining a target video frame area containing the target object detection pair;
  • performing abnormal behavior detection on the target video frame area by using a trained behavior detection neural network.
  • With this method, the obstructing object in the video frame and the target objects located on both sides of it are detected first. The target objects are then paired based on the distance between the target objects on the two sides, which determines the target video frame area containing the target object detection pair. Finally, the trained behavior detection neural network detects abnormal behavior in the target video frame area.
  • Detection uses the target video frame area formed by pairing target objects.
  • This area can correspond to the area where abnormal behavior occurs, which avoids running abnormal behavior detection on other, irrelevant areas.
  • Detection accuracy is high, and because the trained behavior detection neural network detects anomalies directly, detection efficiency is significantly improved.
  • the detecting of the obstructing object in the video frame and the target objects located on both sides of the obstructing object includes:
  • using the trained obstacle detection neural network to perform obstacle detection on the video frame to obtain the obstructing object detection mark to which the obstructing object belongs; and using the trained pedestrian detection neural network to perform target object detection on the target video frame to obtain the target object detection frame corresponding to the target object;
  • a target object detection pair including:
  • a second target object paired with the first target object is determined from among the at least one second target object.
  • The trained obstacle detection neural network and the trained pedestrian detection neural network can be used to detect obstructing objects and target objects respectively, with high detection efficiency. Pairing is then determined from the distance between the detected target object detection frames, so that the two paired frames cover the abnormal behavior as far as possible. For example, when the two paired target object detection frames are close together, this suggests to some extent that the two pedestrians are passing objects in violation of the rules, which further improves the accuracy of anomaly detection.
  • before a second target object paired with the first target object is determined from among the at least one second target object, based on the distance between the target object detection frame of at least one second target object located on the other side and the target object detection frame of the first target object, the method further includes:
  • performing size enlargement on the target object detection frame of the first target object and on the target object detection frame of at least one second target object on the other side, respectively, according to a preset enlargement ratio.
  • Enlarging the frames amplifies, to a certain extent, the influence of the distance between detection frames on abnormal behavior detection, which can improve detection accuracy.
  • the distance between the two target object detection frames is determined according to the following steps:
  • the distance between the two target detection lines is used as the distance between the two target object detection frames.
  • the distance between the target object detection frame of the second target object and the target object detection frame of the first target object is determined according to the following steps:
  • the distance between the two distance reference marks is determined as the distance between the target object detection frame of the second target object and the target object detection frame of the first target object.
  • the detecting of the obstructing object in the video frame and the target objects located on both sides of the obstructing object includes:
  • an obstructing object in the target video frame and target objects located on both sides of the obstructing object are detected.
  • The video frame screening operation can be performed first, followed by target object detection, to better capture abnormal behavior.
  • the selecting of multiple video frames from the video segment in time sequence to obtain the target video sequence includes:
  • the performing of abnormal behavior detection on the target video frame region by using the trained behavior detection neural network includes:
  • the behavior detection neural network is trained according to the following steps:
  • using the multi-frame video frame samples as the input data of the behavior detection neural network to be trained, and using the abnormal behavior indicator labels marked for the multi-frame video frame samples as comparative supervision data for the output of the behavior detection neural network to be trained, performing at least one round of network training on the behavior detection neural network to be trained to obtain a trained behavior detection neural network.
  • the method further includes at least one of the following:
  • alarm prompt information is generated.
  • An embodiment of the present disclosure also provides a device for abnormal behavior detection, the device including:
  • an acquisition module configured to acquire video frames collected in the preset management area;
  • a first detection module configured to detect an obstructing object in the video frame, and target objects located on both sides of the obstructing object;
  • a determining module configured to pair multiple target objects located on different sides of the obstructing object according to the distance between the target objects located on both sides of the obstructing object to obtain a target object detection pair, and to determine the target video frame area containing the target object detection pair;
  • a second detection module configured to detect abnormal behavior in the target video frame area by using the trained behavior detection neural network.
  • An embodiment of the present disclosure further provides an electronic device, including a processor, a memory, and a bus. The memory stores machine-readable instructions executable by the processor. When the electronic device is running, the processor communicates with the memory through the bus, and when the machine-readable instructions are executed by the processor, the steps of the abnormal behavior detection method described in the first aspect or any of its implementations are executed.
  • An embodiment of the present disclosure further provides a computer-readable storage medium on which a computer program is stored; when the computer program is run by a processor, the steps of the abnormal behavior detection method described in the first aspect or any of its implementations are executed.
  • FIG. 1 shows a flowchart of a method for abnormal behavior detection provided by an embodiment of the present disclosure;
  • FIG. 2 shows a schematic diagram of an abnormal behavior detection device provided by an embodiment of the present disclosure;
  • FIG. 3 shows a schematic diagram of an electronic device provided by an embodiment of the present disclosure.
  • The present disclosure provides a method, device, electronic device, and storage medium for abnormal behavior detection that detect target video frame regions obtained by pairing target objects, with high detection efficiency and accuracy.
  • The method for detecting abnormal behavior provided by the embodiments of the present disclosure is generally executed by an electronic device with a certain computing capability.
  • The electronic device includes, for example, a terminal device, a server, or another processing device. The terminal device can be user equipment (UE), a mobile device, a cellular phone, a cordless phone, a personal digital assistant (PDA), a handheld device, a computing device, an in-vehicle device, a wearable device, and so on.
  • The abnormal behavior detection method may be implemented by a processor calling computer-readable instructions stored in a memory.
  • FIG. 1 is a flowchart of a method for abnormal behavior detection provided by an embodiment of the present disclosure.
  • The method includes S101 to S104, wherein:
  • S101: Obtain video frames collected in a preset management area;
  • S102: Detect an obstructing object in the video frame, and target objects located on both sides of the obstructing object;
  • S103: According to the distance between the target objects located on both sides of the obstructing object, pair multiple target objects located on different sides of the obstructing object to obtain a target object detection pair, and determine the target video frame area containing the target object detection pair;
  • S104: Use the trained behavior detection neural network to detect abnormal behaviors in the target video frame area.
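  • As an illustration only, steps S102 to S104 could be wired together as in the following sketch. The function name run_pipeline, the four callables, and the (x1, y1, x2, y2) box layout are assumptions made for this example and do not come from the disclosure. The detectors and the classifier are passed in rather than implemented, and the target video frame area is taken here to be the smallest box enclosing both paired frames.

```python
from typing import Any, Callable, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels; an assumed layout


def run_pipeline(
    frame: Any,
    detect_obstacle: Callable[[Any], Any],
    detect_targets: Callable[[Any, Any], Tuple[List[Box], List[Box]]],
    pair_targets: Callable[[List[Box], List[Box]], List[Tuple[Box, Box]]],
    classify_region: Callable[[Any, Box], int],
) -> List[Tuple[Box, int]]:
    """Run S102 to S104 on one frame obtained in S101 (all callables are hypothetical)."""
    obstacle = detect_obstacle(frame)                  # S102: obstructing object
    side_a, side_b = detect_targets(frame, obstacle)   # S102: targets on each side
    results = []
    for box_a, box_b in pair_targets(side_a, side_b):  # S103: pairing by distance
        # S103: assumed region = smallest box enclosing both paired frames
        region = (
            min(box_a[0], box_b[0]),
            min(box_a[1], box_b[1]),
            max(box_a[2], box_b[2]),
            max(box_a[3], box_b[3]),
        )
        # S104: the trained behavior network is represented by classify_region
        results.append((region, classify_region(frame, region)))
    return results
```

  • Passing the detectors in as callables keeps the sketch independent of any particular network or framework.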
  • The abnormal behavior detection method in the embodiments of the present disclosure can be applied mainly in the field of smart cities. For example, it can detect pedestrians in a subway station passing items that have not gone through the security check over a guardrail, or students fighting at a school guardrail. No specific limitation is made here.
  • The embodiments of the present disclosure provide a method, device, electronic device, and storage medium for abnormal behavior detection based on target object pairing and the associated target video frame regions, so as to improve the efficiency and accuracy of abnormal behavior detection.
  • The preset management area differs between application scenarios; that is, its range can be set flexibly for each scenario.
  • The preset management area here may be the area near the subway entrance.
  • The video frame can be captured by a camera device covering the preset management area, and can be a single frame or multiple consecutive frames. To better capture abnormal behavior, multiple consecutive video frames, for example a video segment, may be used here.
  • Embodiments of the present disclosure can first detect the obstructing object in the video frame and the target objects located on both sides of the obstructing object, and then pair the target objects located on both sides of the obstructing object to obtain the target object detection pair.
  • Target objects can be detected either with image processing methods or with a trained detection neural network. Because a detection neural network can, to a certain extent, extract deeper features of the target object, a detection neural network can be used here to detect obstructing objects and target objects.
  • Traditional behavior recognition methods usually apply full-image data augmentation or other preprocessing to the input video sequence and then feed it to a classification model for prediction, an approach common with public academic video datasets.
  • This approach only suits human-centered video behavior recognition. Videos captured by cameras in specific scenarios often contain more information and cover a larger field of view, and the location of the event and the scale of the people involved are random. It is therefore unreasonable to simply use the full image as the model input.
  • Target objects can be paired first, and then the target video frame area can be determined based on the pairing result to realize abnormal behavior detection for the target video frame area.
  • Non-full-image behavior detection can improve, as far as possible, the probability and accuracy of capturing abnormal behavior.
  • The pairing of target objects described above can be determined based on the distance between the target objects located on both sides of the obstructing object. This mainly serves to detect the abnormal behavior of passing objects across a fence: the closer two target objects are, the higher, to a certain extent, the probability that they are acting abnormally across the obstructing object; conversely, the farther apart they are, the lower that probability.
  • Abnormal behavior detection can be performed on the target video frame region with the trained behavior detection neural network; that is, a target video frame region obtained by pairing two target objects that are close together is more likely to contain abnormal behavior.
  • The obstructing object in the embodiments of the present disclosure can be an obstacle such as a railing or a guardrail.
  • The target object can be a pedestrian.
  • The trained obstacle detection neural network can be used to perform obstacle detection on the video frame to obtain the obstructing object detection mark to which the obstructing object belongs; and the trained pedestrian detection neural network can be used to perform target object detection on the target video frame to obtain a target object detection frame corresponding to the target object.
  • The obstacle detection neural network can be trained on video frame samples and on annotation information obtained by labeling the obstacles in those samples; training learns the relationship between the video frame samples and information such as the position, size, and direction of the obstacles in them.
  • The obstructing object detection mark here can be a detection line or a detection frame.
  • Similarly, the pedestrian detection neural network can be trained on video frame samples and on annotation information obtained by labeling pedestrians in those samples; training learns the corresponding relationship between the samples and the pedestrian annotations.
  • The relative positional relationship between pedestrians and obstacles also needs to be considered.
  • For a first target object located on one side of the obstructing object, a second target object paired with the first target object is determined from among at least one second target object located on the other side, based on the distance between the target object detection frame of each such second target object and the target object detection frame of the first target object.
  • The first target object may be any of the first target objects on one side of the obstructing object, or a target object specified from among them; the embodiments of the present disclosure place no specific limitation on this.
  • The target object detection frame in the embodiments of the present disclosure may be a regular rectangular frame or another regular shape, for example a circular or oval frame. In practical applications, it may also be an irregular detection frame that contains only the target object.
  • When the target object detection frame of the first target object and the target object detection frame of the second target object on the other side are both rectangular frames, two target detection lines that belong to different target object detection frames and have the smallest distance between them can be selected from the two frames, and the distance between these two lines is used as the distance between the two target object detection frames.
  • The coordinate information of the two target object detection frames in the video frame can be determined first, then the position information of the target detection lines of each frame, and finally the two target detection lines that belong to different frames and are closest to each other can be selected.
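  • One possible reading of the smallest distance between detection lines, for axis-aligned rectangular frames that do not contain one another, is the gap between their nearest edges. The sketch below computes that gap. The function name box_gap and the (x1, y1, x2, y2) layout are assumptions for illustration, and returning zero for touching or overlapping frames is a simplification the disclosure does not specify.

```python
import math
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2); an assumed layout


def box_gap(a: Box, b: Box) -> float:
    """Gap between the nearest edges of two axis-aligned boxes.

    Returns 0.0 when the boxes touch or overlap (a simplifying assumption).
    """
    # Horizontal gap between the facing vertical edges, or 0 if they overlap in x.
    dx = max(b[0] - a[2], a[0] - b[2], 0.0)
    # Vertical gap between the facing horizontal edges, or 0 if they overlap in y.
    dy = max(b[1] - a[3], a[1] - b[3], 0.0)
    return math.hypot(dx, dy)
```

  • For two pedestrians standing on either side of a vertical railing, dy is usually zero and the result reduces to the horizontal distance between the two facing edges.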
  • Corresponding distance reference marks can be selected from the target object detection frame of the second target object and the target object detection frame of the first target object respectively, and the distance between the two target object detection frames is then determined from the distance between the two distance reference marks.
  • The distance reference mark may be the center point of the detection frame, the center line of the detection frame, or another mark that is meaningful as a distance reference; no specific limitation is made here.
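  • Taking the center point as the distance reference mark, the distance between two frames is the Euclidean distance between their centers. The following is a minimal sketch; the function names and the (x1, y1, x2, y2) layout are assumptions for illustration.

```python
import math
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2); an assumed layout


def center(box: Box) -> Tuple[float, float]:
    """Center point of a box, used here as the distance reference mark."""
    return ((box[0] + box[2]) / 2, (box[1] + box[3]) / 2)


def center_distance(a: Box, b: Box) -> float:
    """Euclidean distance between the center points of two boxes."""
    (ax, ay), (bx, by) = center(a), center(b)
    return math.hypot(ax - bx, ay - by)
```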
  • An operation of enlarging the target object detection frames may be performed first. That is, the target object detection frame of the first target object and the target object detection frame of at least one second target object located on the other side may be enlarged according to the preset enlargement ratio, and pairing is then achieved by determining the distance between the two enlarged target object detection frames.
  • The detection frame of each target object can be enlarged by a factor of 1.5, or by 1, 2, 3, and so on. Then, for a pedestrian on one side of the railing, the closest pedestrian detection frame on the opposite side can be found from that pedestrian's detection frame, identifying the pedestrian on the opposite side who may be passing objects across the fence.
  • the zoom-in operation can be performed synchronously, so that the detection frames of each target object are at the same reference level, and the accuracy of subsequent pairing is improved.
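  • The enlargement and nearest-frame pairing described above can be sketched as follows. Scaling each frame about its own center, using the edge gap as the distance, and the names enlarge, gap, and pair_nearest are all assumptions made for this example; the ratio of 1.5 is one of the values mentioned in the text.

```python
import math
from typing import List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2); an assumed layout


def enlarge(box: Box, ratio: float) -> Box:
    """Scale a box about its center by the given ratio (assumed interpretation)."""
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
    half_w, half_h = (x2 - x1) * ratio / 2, (y2 - y1) * ratio / 2
    return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)


def gap(a: Box, b: Box) -> float:
    """Gap between the nearest edges of two axis-aligned boxes (0.0 if overlapping)."""
    dx = max(b[0] - a[2], a[0] - b[2], 0.0)
    dy = max(b[1] - a[3], a[1] - b[3], 0.0)
    return math.hypot(dx, dy)


def pair_nearest(
    side_a: List[Box], side_b: List[Box], ratio: float = 1.5
) -> List[Tuple[int, Optional[int]]]:
    """For each box on side A, return the index of the nearest enlarged box on side B."""
    big_b = [enlarge(b, ratio) for b in side_b]  # same ratio for every frame
    pairs = []
    for i, a in enumerate(side_a):
        big_a = enlarge(a, ratio)
        # default=None covers the case where side B has no target objects.
        best = min(range(len(big_b)), key=lambda j: gap(big_a, big_b[j]), default=None)
        pairs.append((i, best))
    return pairs
```

  • The edge gap is used rather than the center distance because scaling a frame about its center leaves the center distance unchanged, so enlargement would have no effect on it.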
  • the video frames in the embodiments of the present disclosure may be video segments of multiple consecutive frames. Considering the continuous characteristics between each frame of the video clip, the video frame extraction operation can be carried out first, and then the target object detection can be carried out, which can be realized by the following steps:
  • Step 1: Select multiple video frames in time sequence from the video clip to obtain the target video sequence;
  • Step 2: For each target video frame in the target video sequence, detect an obstructing object in the target video frame and target objects located on both sides of the obstructing object.
  • The target video sequence can be determined through the following steps:
  • Step 1: Divide the video segment into multiple video frame groups according to the preset division interval;
  • Step 2: For each video frame group in the plurality of video frame groups, select a video frame from the video frame group as a target video frame in the target video sequence;
  • Step 3: Combine the video frames selected from the multiple video frame groups in time sequence to obtain the target video sequence.
  • The preset division interval here can be a time interval, for example one video frame group every 0.5 seconds; a frame-count interval, for example one video frame group every 5 frames; or another division method, which is not specifically limited here.
  • Selecting target video frames from the divided video frame groups reduces the amount of computation to a certain extent while still allowing sufficient behavior information to be detected.
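  • The three sampling steps above can be sketched with a frame-count interval. The text only says that one frame is selected from each group; choosing the middle frame, and the name sample_target_sequence, are assumptions made for this example.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def sample_target_sequence(frames: Sequence[T], group_size: int = 5) -> List[T]:
    """Split frames into consecutive groups and keep one frame per group, in time order."""
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    selected = []
    for start in range(0, len(frames), group_size):
        # The last group may be shorter than group_size.
        length = min(group_size, len(frames) - start)
        # Assumed choice: the middle frame of the group.
        selected.append(frames[start + length // 2])
    return selected
```

  • With 12 frames and a group size of 5, the groups cover frames 0 to 4, 5 to 9, and 10 to 11, and the sketch keeps one frame from each.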
  • the paired target objects can be determined according to the above method, and then the target video frame area containing the target object detection pair can be determined.
  • The target video frame region contains not only the target object detection pair but also the obstructing object between the two target objects.
  • The target video frame area corresponding to each target video frame can be input into the trained behavior detection neural network in sequence to determine the target video frame in which abnormal behavior occurs, and the position information, within that target video frame, of the target object performing the abnormal behavior.
  • the target video frame area can be extracted from the corresponding target video frame, and the extracted video frame area can be directly input to the trained behavior detection neural network for abnormal behavior detection.
  • the abnormal behavior detection here can determine the position information of the target object in the target video frame where the abnormal behavior occurs, so that it is convenient for the management personnel to deal with the abnormal situation in time.
  • abnormal behavior detection can be realized based on the trained behavior detection neural network.
  • the behavior detection neural network can be trained according to the following steps:
  • Step 1: Obtain multi-frame video frame samples;
  • Step 2: Use the multi-frame video frame samples as the input data of the behavior detection neural network to be trained, and use the abnormal behavior indicator labels marked for the multi-frame video frame samples as comparative supervision data for the output of the behavior detection neural network to be trained; perform at least one round of network training on the behavior detection neural network to be trained to obtain a trained behavior detection neural network.
  • The comparative supervision data of the behavior detection neural network can be derived from the abnormal behavior indicator labels annotated on the multi-frame video frame samples.
  • When the network output is far from the comparative supervision data, the performance of the network is not yet good enough and another round of network training is needed.
  • The abnormal behavior indicator label may correspond to a specific abnormal behavior identifier, such as 1 for delivery behavior, 2 for fighting behavior, and so on.
  • The detection result obtained from the abnormal behavior detection can also be sent to the management terminal, which can quickly grasp the abnormal behavior based on the detection result and respond accordingly.
  • The embodiments of the present disclosure can also generate alarm prompt information based on the detection results, and can remind managers to respond in a timely manner through voice broadcasts and similar means.
  • Prompt information with different warning strengths can also be generated for different detection results. For example, in the subway scene, the abnormal behavior of delivering ordinary goods can trigger an ordinary prompt, while the abnormal behavior of delivering dangerous goods can trigger a strong reminder.
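  • As a hedged illustration of graded prompts, the sketch below maps a detection result to alarm prompt information. Labels 1 and 2 follow the example in the text. The label 0 for normal behavior, the Detection fields, the dangerous_goods flag, and the returned dictionary are all hypothetical and are not described in the disclosure.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional, Tuple


class Behavior(IntEnum):
    NORMAL = 0    # assumed; the text only gives labels 1 and 2
    DELIVERY = 1  # passing objects across the obstructing object
    FIGHTING = 2


@dataclass
class Detection:
    behavior: Behavior
    region: Tuple[float, float, float, float]  # target video frame area, assumed layout
    dangerous_goods: bool = False              # hypothetical flag for this sketch


def build_alert(detection: Detection) -> Optional[dict]:
    """Return alarm prompt information, or None when no abnormal behavior was found."""
    if detection.behavior == Behavior.NORMAL:
        return None
    # Stronger reminder for dangerous goods, ordinary prompt otherwise.
    level = "strong" if detection.dangerous_goods else "ordinary"
    return {"level": level, "behavior": detection.behavior.name, "region": detection.region}
```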
  • The order in which the steps are written does not imply a strict execution order and does not limit the implementation process in any way.
  • The specific execution order of the steps should be determined by their functions and possible internal logic.
  • The embodiments of the present disclosure also provide a device for detecting abnormal behavior corresponding to the method for detecting abnormal behavior. Because the device solves the problem on a principle similar to that of the method, the implementation of the device can refer to the implementation of the method, and repeated descriptions are omitted.
  • FIG. 2 is a schematic diagram of an abnormal behavior detection device provided by an embodiment of the present disclosure.
  • the device includes: an acquisition module 201, a first detection module 202, a determination module 203, and a second detection module 204; wherein,
  • An acquisition module 201 configured to acquire video frames collected in a preset management area
  • a first detection module 202 configured to detect an obstructing object in the video frame, and target objects located on both sides of the obstructing object;
  • the determining module 203 is used to pair multiple target objects located on different sides of the obstructing object according to the distance between the target objects located on both sides of the obstructing object to obtain a target object detection pair; and determine the target video containing the target object detection pair frame area;
  • the second detection module 204 is configured to use the trained behavior detection neural network to perform abnormal behavior detection on the target video frame area.
  • With this device, the obstructing object in the video frame and the target objects located on both sides of it are detected first. The target objects are then paired based on the distance between the target objects on the two sides, which determines the target video frame area containing the target object detection pair. Finally, the trained behavior detection neural network detects abnormal behavior in the target video frame area.
  • Detection uses the target video frame area formed by pairing target objects.
  • This area can correspond to the area where abnormal behavior occurs, which avoids running abnormal behavior detection on other, irrelevant areas.
  • Detection accuracy is high, and because the trained behavior detection neural network detects anomalies directly, detection efficiency is significantly improved.
  • the first detection module 202 is configured to detect the obstructing object in the video frame and the target objects located on both sides of the obstructing object according to the following steps:
  • the determining module 203 is configured to pair the target objects located on both sides of the obstructing object according to the distance between the target objects located on both sides of the obstructing object according to the following steps to obtain a target object detection pair:
  • a second target object paired with the first target object is determined among the target objects.
  • the determining module 203 is also configured to:
  • before a second target object paired with the first target object is determined from the at least one second target object, based on the distance between the target object detection frame of the at least one second target object located on the other side and the target object detection frame of the first target object,
  • perform size enlargement on the target object detection frame of the first target object and on the target object detection frame of at least one second target object located on the other side, respectively, according to a preset enlargement ratio.
  • the determining module 203 is configured to determine the distance between two target object detection frames according to the following steps:
  • the distance between two target detection lines is taken as the distance between two target object detection frames.
  • the distance between the target object detection frame of the second target object and the target object detection frame of the first target object is determined according to the following steps:
  • the distance between the two distance reference marks is determined as the distance between the target object detection frame of the second target object and the target object detection frame of the first target object.
  • the first detection module 202 is configured to detect an obstructing object in the video frame and target objects located on both sides of the obstructing object according to the following steps:
  • an obstructing object in the target video frame and target objects located on both sides of the obstructing object are detected.
  • the first detection module 202 is configured to select multiple video frames from the video clip in time sequence according to the following steps to obtain the target video sequence:
  • the second detection module 204 is configured to use a trained behavior detection neural network to perform abnormal behavior detection on the target video frame region according to the following steps:
  • the second detection module 204 is configured to train the behavior detection neural network according to the following steps:
  • the multi-frame video frame samples are used as the input data of the behavior detection neural network to be trained, the abnormal behavior indicator labels marked for the multi-frame video frame samples are used as comparative supervision data for the output of the behavior detection neural network to be trained, and at least one round of network training is performed on the behavior detection neural network to be trained to obtain a trained behavior detection neural network.
  • the second detection module 204 is also configured to, after using the trained behavior detection neural network to detect abnormal behavior in the target video frame area, send the detection result obtained by the abnormal behavior detection to the management terminal; and/or generate alarm prompt information based on the detection result obtained by the abnormal behavior detection.
  • An embodiment of the present disclosure also provides an electronic device. FIG. 3 is a schematic structural diagram of the electronic device provided by the embodiment of the present disclosure, which includes a processor 301, a memory 302, and a bus 303.
  • The memory 302 stores machine-readable instructions executable by the processor 301 (for example, execution instructions corresponding to the acquisition module 201, the first detection module 202, the determination module 203, and the second detection module 204 in the device in FIG. 2). When the electronic device is running, the processor 301 communicates with the memory 302 through the bus 303, and when the machine-readable instructions are executed by the processor 301, the following processes are performed:
  • multiple target objects located on different sides of the obstructing object are paired to obtain the target object detection pair; and determine the target video frame area containing the target object detection pair;
  • Embodiments of the present disclosure also provide a computer-readable storage medium, on which a computer program is stored, and when the computer program is run by a processor, the steps of the abnormal behavior detection method described in the above-mentioned method embodiments are executed.
  • the storage medium may be a volatile or non-volatile computer-readable storage medium.
  • An embodiment of the present disclosure also provides a computer program product that carries program code; the instructions included in the program code can be used to execute the steps of the abnormal behavior detection method described in the above method embodiments. For details, refer to the foregoing method embodiments, which are not repeated here.
  • the above-mentioned computer program product may be specifically implemented by means of hardware, software or a combination thereof.
  • in one optional embodiment, the computer program product is embodied as a computer storage medium; in another optional embodiment, the computer program product is embodied as a software product, such as a software development kit (SDK).
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or may be distributed to multiple network units. Part or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in each embodiment of the present disclosure may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.
  • if the functions are realized in the form of software function units and sold or used as independent products, they can be stored in a non-volatile computer-readable storage medium executable by a processor.
  • the technical solution of the present disclosure, in essence, or the part of it that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions that cause an electronic device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of the present disclosure.
  • the aforementioned storage media include: a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disc, and other media that can store program code.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

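The present disclosure provides a method, device, electronic device and storage medium for abnormal behavior detection. The method includes: acquiring video frames captured within a preset management area; detecting an obstructing object in the video frames and target objects located on both sides of the obstructing object; pairing multiple target objects located on different sides of the obstructing object according to the distance between the target objects on the two sides, to obtain a target object detection pair, and determining a target video frame region containing the target object detection pair; and performing abnormal behavior detection on the target video frame region using a trained behavior detection neural network. The target video frame region may correspond to a region in which abnormal behavior exists, which avoids the influence of other, unrelated regions on abnormal behavior detection and yields high detection accuracy; and because the trained behavior detection neural network performs the detection directly, detection efficiency is significantly improved.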

Description

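Method, device, electronic device and storage medium for abnormal behavior detection
The present disclosure claims priority to Chinese patent application No. 202111271743.9, filed with the China Patent Office on October 29, 2021 and entitled "Method, device, electronic device and storage medium for abnormal behavior detection", the entire contents of which are incorporated herein by reference.
Technical Field
The present disclosure relates to the field of security technology, and in particular to a method, device, electronic device and storage medium for abnormal behavior detection.
Background
With the development of technology and the advance of smart cities, video is used ever more widely and has gradually been deployed in shopping malls, traffic intersections, banks, stations and similar places to protect the personal and property safety of people in the filmed areas.
Detecting abnormal behavior in a filmed area is an important problem in computer vision, for example detecting illegal acts, traffic accidents and other abnormal events. However, most cameras in such areas are used only for recording and cannot identify abnormal behavior automatically. Real-time identification usually relies on staff on duty, and anomalies are traced by reviewing recordings afterwards, which is extremely inefficient.
Summary
Embodiments of the present disclosure provide at least a method, device, electronic device and storage medium for abnormal behavior detection.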
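In a first aspect, embodiments of the present disclosure provide a method for abnormal behavior detection, the method including:
acquiring video frames captured within a preset management area;
detecting an obstructing object in the video frames, and target objects located on both sides of the obstructing object;
pairing multiple target objects located on different sides of the obstructing object according to the distance between the target objects on the two sides of the obstructing object, to obtain a target object detection pair; and determining a target video frame region containing the target object detection pair;
performing abnormal behavior detection on the target video frame region using a trained behavior detection neural network.
With the above method, the obstructing object and the target objects on both sides of it can first be detected in the acquired video frames; the target objects can then be paired based on the distance between the target objects on the two sides, to determine a target video frame region containing the target object detection pair; finally, a trained behavior detection neural network can be used to perform abnormal behavior detection on that region. Detection therefore operates on a target video frame region formed by pairing target objects. This region may correspond to a region in which abnormal behavior exists, which avoids the influence of other, unrelated regions on abnormal behavior detection and yields high detection accuracy; and because the trained behavior detection neural network performs the detection directly, detection efficiency is significantly improved.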
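In a possible implementation, where the obstructing object is a barrier and the target objects are pedestrians, detecting the obstructing object in the video frames and the target objects located on both sides of the obstructing object includes:
performing obstructing object detection on the video frame using a trained barrier detection neural network, to obtain an obstructing object detection mark of the obstructing object; and performing target object detection on the target video frame using a trained pedestrian detection neural network, to obtain target object detection boxes corresponding to the target objects;
pairing the multiple target objects located on different sides of the obstructing object according to the distance between the target objects on the two sides of the obstructing object, to obtain the target object detection pair, includes:
for a first target object located on one side of the obstructing object, determining, from at least one second target object located on the other side, the second target object paired with the first target object, based on the distance between the target object detection box of each of the at least one second target object and the target object detection box of the first target object.
Here, the trained barrier detection neural network and the trained pedestrian detection neural network can be used to detect the obstructing object and the target objects respectively, which is efficient. Pairing is then determined from the distance between the detected target object detection boxes, so that the two paired boxes cover the abnormal behavior as far as possible. For example, when two paired target object detection boxes are close to each other, this indicates to some extent that the two pedestrians may be passing an object in violation of the rules, which further improves the accuracy of anomaly detection.
In a possible implementation, before determining, from the at least one second target object, the second target object paired with the first target object based on the distance between the target object detection box of each of the at least one second target object located on the other side and the target object detection box of the first target object, the method further includes:
enlarging, by a preset enlargement ratio, the target object detection box of the first target object and the target object detection box of each of the at least one second target object located on the other side.
Here, enlarging the detection boxes can, to some extent, increase the influence that the distance between detection boxes has on abnormal behavior detection, improving detection accuracy.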
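In a possible implementation, where the target object detection box of the second target object and the target object detection box of the first target object are both rectangular boxes, the distance between the two target object detection boxes is determined as follows:
selecting, from the two target object detection boxes, the two target detection lines that belong to different target object detection boxes and have the smallest distance between them;
taking the distance between the two target detection lines as the distance between the two target object detection boxes.
In a possible implementation, the distance between the target object detection box of the second target object and the target object detection box of the first target object is determined as follows:
selecting a corresponding distance reference mark from each of the target object detection box of the second target object and the target object detection box of the first target object;
determining the distance between the two distance reference marks as the distance between the target object detection box of the second target object and the target object detection box of the first target object.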
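In a possible implementation, where the acquired video frames are a video clip, detecting the obstructing object in the video frames and the target objects located on both sides of the obstructing object includes:
selecting multiple video frames from the video clip in temporal order, to obtain a target video sequence;
for each target video frame in the target video sequence, detecting the obstructing object in the target video frame and the target objects located on both sides of the obstructing object.
Here, because consecutive video frames may contain similar frames, the video frames can first be filtered and target object detection performed afterwards, so as to capture abnormal behavior better.
In a possible implementation, selecting multiple video frames from the video clip in temporal order to obtain the target video sequence includes:
dividing the video clip into multiple video frame groups according to a preset division interval;
for each of the multiple video frame groups, selecting one video frame from the video frame group as one target video frame of the target video sequence;
combining, in temporal order, the video frames selected from the multiple video frame groups, to obtain the target video sequence.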
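In a possible implementation, performing abnormal behavior detection on the target video frame region using the trained behavior detection neural network includes:
inputting the target video frame region corresponding to each target video frame into the trained behavior detection neural network in turn, and determining the target video frames in which abnormal behavior occurs, together with position information of the target objects in those target video frames.
In a possible implementation, the behavior detection neural network is trained as follows:
acquiring multiple video frame samples;
using the multiple video frame samples as the input data of the behavior detection neural network to be trained, using the abnormal behavior indicator labels annotated for the multiple video frame samples as comparative supervision data for the output of the behavior detection neural network to be trained, and performing at least one round of network training on the behavior detection neural network to be trained, to obtain the trained behavior detection neural network.
In a possible implementation, after performing abnormal behavior detection on the target video frame region using the trained behavior detection neural network, the method further includes at least one of the following:
sending the detection result obtained from the abnormal behavior detection to a management terminal;
generating alarm prompt information based on the detection result obtained from the abnormal behavior detection.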
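In a second aspect, embodiments of the present disclosure further provide a device for abnormal behavior detection, the device including:
an acquisition module, configured to acquire video frames captured within a preset management area;
a first detection module, configured to detect an obstructing object in the video frames, and target objects located on both sides of the obstructing object;
a determination module, configured to pair multiple target objects located on different sides of the obstructing object according to the distance between the target objects on the two sides of the obstructing object, to obtain a target object detection pair; and to determine a target video frame region containing the target object detection pair;
a second detection module, configured to perform abnormal behavior detection on the target video frame region using a trained behavior detection neural network.
In a third aspect, embodiments of the present disclosure further provide an electronic device, including a processor, a memory and a bus, where the memory stores machine-readable instructions executable by the processor; when the electronic device is running, the processor communicates with the memory through the bus, and the machine-readable instructions, when executed by the processor, perform the steps of the method for abnormal behavior detection according to the first aspect or any of its implementations.
In a fourth aspect, embodiments of the present disclosure further provide a computer-readable storage medium on which a computer program is stored; when run by a processor, the computer program performs the steps of the method for abnormal behavior detection according to the first aspect or any of its implementations.
For the effects of the above device, electronic device and computer-readable storage medium for abnormal behavior detection, refer to the description of the above method for abnormal behavior detection; they are not repeated here.
To make the above objects, features and advantages of the present disclosure more apparent and easier to understand, preferred embodiments are described in detail below with reference to the accompanying drawings.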
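Brief Description of the Drawings
To explain the technical solutions of the embodiments of the present disclosure more clearly, the drawings used in the embodiments are briefly introduced below. The drawings are incorporated into and form part of this specification; they show embodiments consistent with the present disclosure and, together with the specification, serve to explain its technical solutions. It should be understood that the following drawings show only some embodiments of the present disclosure and should therefore not be regarded as limiting its scope; a person of ordinary skill in the art can derive other related drawings from them without creative effort.
FIG. 1 shows a flowchart of a method for abnormal behavior detection provided by an embodiment of the present disclosure;
FIG. 2 shows a schematic diagram of a device for abnormal behavior detection provided by an embodiment of the present disclosure;
FIG. 3 shows a schematic diagram of an electronic device provided by an embodiment of the present disclosure.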
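Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the present disclosure clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are evidently only some, not all, of the embodiments of the present disclosure. The components of the embodiments generally described and shown in the drawings can be arranged and designed in a variety of configurations. The following detailed description of the embodiments provided in the drawings is therefore not intended to limit the scope of the claimed disclosure, but merely represents selected embodiments. All other embodiments obtained by a person skilled in the art from the embodiments of the present disclosure without creative effort fall within the scope of protection of the present disclosure.
It should be noted that similar reference numerals and letters denote similar items in the following drawings; once an item is defined in one drawing, it need not be defined or explained again in subsequent drawings.
The term "and/or" herein merely describes an association and indicates that three relationships are possible; for example, "A and/or B" can mean A alone, both A and B, or B alone. In addition, the term "at least one" herein denotes any one of multiple items or any combination of at least two of them; for example, "including at least one of A, B and C" can mean including any one or more elements selected from the set consisting of A, B and C.
Research has found that most cameras in filmed areas are used only for recording and do not identify abnormal behavior automatically. Real-time identification usually relies on staff on duty, and anomalies are traced by reviewing recordings afterwards, which is extremely inefficient.
In addition, as video big data grows, an enormous volume of video is produced, and filtering video content by human effort alone is unrealistic. How to use computer vision and deep learning to automatically detect abnormal events occurring in video has therefore become a problem in urgent need of a solution.
People can usually recognize abnormal behavior through common sense and an understanding of where objects are located in space. For example, we can establish that abnormal behavior is occurring by confirming that people on either side of a railing are moving an item over the top of it. Machines, however, have no common sense, only visual features. Consequently, the stronger the visual features, the better the resulting anomaly detection performance tends to be. Recognizing abnormal events with computer vision is also extremely difficult. Possible challenges include the scarcity of labeled data because such events are rare, large inter-class and intra-class variance, differing subjective definitions of abnormal events, low resolution of the captured video, and so on.
For detecting the abnormal behavior of passing objects across a barrier in smart-city scenarios, locating pedestrians from the camera's viewpoint is a challenge that has to be solved. Solving it allows abnormal events in the video content of the filmed scene to be analyzed automatically, providing a convenient service to the relevant departments.
Based on the above research, the present disclosure provides a method, device, electronic device and storage medium for abnormal behavior detection in which the target video frame region to be examined is obtained by pairing target objects, so that both detection efficiency and detection accuracy are high.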
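To aid understanding of this embodiment, a method for abnormal behavior detection disclosed in the embodiments of the present disclosure is first described in detail. The method is generally executed by an electronic device with a certain computing capability, for example a terminal device, a server or another processing device. The terminal device may be user equipment (UE), a mobile device, a cellular phone, a cordless phone, a personal digital assistant (PDA), a handheld device, a computing device, a vehicle-mounted device, a wearable device, and so on. In some possible implementations, the method can be implemented by a processor calling computer-readable instructions stored in a memory.
Referring to FIG. 1, which is a flowchart of the method for abnormal behavior detection provided by an embodiment of the present disclosure, the method includes S101 to S104, where:
S101: acquire video frames captured within a preset management area;
S102: detect an obstructing object in the video frames, and target objects located on both sides of the obstructing object;
S103: pair multiple target objects located on different sides of the obstructing object according to the distance between the target objects on the two sides of the obstructing object, to obtain a target object detection pair; and determine a target video frame region containing the target object detection pair;
S104: perform abnormal behavior detection on the target video frame region using a trained behavior detection neural network.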
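As a reading aid, the flow S101 to S104 can be sketched as a small Python skeleton. This is only an illustration under stated assumptions: the function name `detect_abnormal_behavior`, the five callables passed into it, and the convention that label `0` means "no abnormal behavior" are all hypothetical and do not come from the disclosure.

```python
from typing import Any, Callable, List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), an assumed box format


def detect_abnormal_behavior(
    frames: Sequence[Any],
    detect_barrier: Callable[[Any], Any],
    detect_pedestrians: Callable[[Any], List[Box]],
    pair_across_barrier: Callable[[Any, List[Box]], List[Tuple[Box, Box]]],
    region_for_pair: Callable[[Tuple[Box, Box]], Box],
    classify_region: Callable[[Any, Box], int],
) -> List[Tuple[int, Box, int]]:
    """Return (frame index, region, label) for every region flagged as abnormal."""
    results = []
    for index, frame in enumerate(frames):                  # S101: acquired frames
        barrier = detect_barrier(frame)                     # S102: obstructing object
        boxes = detect_pedestrians(frame)                   # S102: target objects
        for pair in pair_across_barrier(barrier, boxes):    # S103: pairing
            region = region_for_pair(pair)                  # S103: target region
            label = classify_region(frame, region)          # S104: behavior network
            if label != 0:  # assumption: 0 stands for "normal"
                results.append((index, region, label))
    return results
```

Passing the detectors in as callables keeps the skeleton independent of any particular network; each stage is sketched separately where the description discusses it.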
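To aid understanding of the method for abnormal behavior detection provided by the embodiments of the present disclosure, its application scenarios are briefly introduced next. The method is mainly applicable to the smart-city field. For example, it can detect the abnormal behavior of pedestrians in a subway passing items that have not gone through security screening over a guardrail, or the abnormal behavior of students fighting at a school guardrail, and so on; no specific limitation is imposed here.
In the related art, anomalies are mostly identified by assigning dedicated staff to stand watch, which is time-consuming and laborious. Under manual supervision, abnormal behavior may also fail to be captured in time because staff are distracted or temporarily away from their post, so applicability is poor.
It is precisely to solve these problems that the embodiments of the present disclosure provide a method, device, electronic device and storage medium that pair target objects and perform abnormal behavior detection on the associated target video frame region, so as to improve the efficiency and accuracy of abnormal behavior detection.
The preset management area differs between application scenarios; that is, its extent can be set flexibly according to the scenario. Taking a subway scenario as an example, the preset management area can be the area near a subway entrance. The video frames can be shot by a camera capable of capturing the preset management area, and can be a single frame or multiple consecutive frames. To capture abnormal behavior better, multiple consecutive video frames can be used here, for example a video clip.
Once video frames have been captured, the embodiments of the present disclosure can first detect the obstructing object in the video frames and the target objects located on both sides of it, and then pair the target objects located on the two sides to obtain target object detection pairs.
Target objects can be detected either with image processing methods or with a trained detection neural network. Because a detection neural network can mine deeper features of the target objects, which improves detection accuracy to some extent, a detection neural network can be used here to detect the obstructing object and the target objects.
Conventional behavior recognition methods usually apply data augmentation or other preprocessing to the whole frame of the input video sequence and then feed it into a classification model for prediction, which is common for public academic video datasets. However, this approach only suits human-centered video behavior recognition. Video shot by cameras in specific scenes usually contains more information and covers a wider field of view. Moreover, both the location of the target event and the scale of the human body are random. Simply using the whole frame as the model input is therefore unreasonable.
For this reason, before performing abnormal behavior detection, the embodiments of the present disclosure can first pair the target objects and then determine the target video frame region from the pairing result, so that abnormal behavior detection is performed on the target video frame region. Detecting behavior in the target video frame region rather than in the whole frame improves, as far as possible, the probability and accuracy of capturing abnormal behavior.
The pairing of target objects described above can be determined from the distance between the target objects located on the two sides of the obstructing object. This is mainly intended to detect the abnormal behavior of passing objects across a barrier: the smaller the distance, the higher, to some extent, the probability that the two target objects are acting abnormally across the obstructing object; conversely, the larger the distance, the lower that probability.
Here, abnormal behavior detection can be performed on the target video frame region with the trained behavior detection neural network; that is, a target video frame region obtained by pairing two target objects that are close together is more likely to contain abnormal behavior.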
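In the embodiments of the present disclosure, the obstructing object can be a barrier such as a railing or guardrail, and the target objects can be pedestrians. A trained barrier detection neural network can be used to perform obstructing object detection on the video frame to obtain an obstructing object detection mark of the obstructing object, and a trained pedestrian detection neural network can be used to perform target object detection on the target video frame to obtain the target object detection boxes corresponding to the target objects.
The barrier detection neural network can be trained on video frame samples and the annotation information obtained by labeling barriers in those samples; what is learned can be the relationship between a video frame sample and information such as the position, size and orientation of the barrier in it. The obstructing object detection mark can be a detection line or a detection box.
Likewise, the pedestrian detection neural network can be trained on video frame samples and the annotation information obtained by labeling pedestrians in those samples; what is learned can be the relationship between a video frame sample and information such as the positions of the pedestrians in it.
For the behavior of passing objects across a barrier, the relative position of pedestrians and the barrier also has to be considered. Here, for a first target object located on one side of the obstructing object, the second target object paired with the first target object can be determined from at least one second target object located on the other side, based on the distance between the target object detection box of each of the at least one second target object and the target object detection box of the first target object.
The first target object can be any one of the first target objects on one side of the obstructing object, or one designated from among them; the embodiments of the present disclosure impose no specific limitation on this.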
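The pairing step can be illustrated with a minimal Python sketch. It rests on assumptions the disclosure does not state: the obstructing object detection mark is taken to be a detection line given by two points, the side of a pedestrian is decided by the sign of a cross product at the box center, and the center-to-center distance is used for pairing. The names `side_of_line` and `pair_across_barrier` are hypothetical.

```python
import math


def center(box):
    """Center point of an (x1, y1, x2, y2) box."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2)


def side_of_line(point, line):
    """Return +1 or -1 for the two sides of the detection line, 0 if on it."""
    (ax, ay), (bx, by) = line
    px, py = point
    cross = (bx - ax) * (py - ay) - (by - ay) * (px - ax)
    return 1 if cross > 0 else -1 if cross < 0 else 0


def pair_across_barrier(boxes, line):
    """Pair each box on one side with the nearest box on the other side."""
    first_side = [b for b in boxes if side_of_line(center(b), line) > 0]
    second_side = [b for b in boxes if side_of_line(center(b), line) < 0]
    pairs = []
    for first in first_side:
        if not second_side:
            break  # nobody on the opposite side, so nothing to pair
        nearest = min(
            second_side,
            key=lambda s: math.dist(center(first), center(s)),
        )
        pairs.append((first, nearest))
    return pairs
```

Any of the box distances described in this section could replace the center distance in the `key` function without changing the structure of the pairing.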
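The target object detection box in the embodiments of the present disclosure can be a regular rectangular box or another regular shape, for example a circular or elliptical box; in practical applications it can also be an irregular detection box that contains only the target object.
Where the target object detection box of the first target object and the target object detection box of the second target object on the other side are both rectangular boxes, the two target detection lines that belong to different target object detection boxes and have the smallest distance between them can be selected from the two boxes, and the distance between these two target detection lines is then taken as the distance between the two target object detection boxes.
In a specific application, the coordinate information of the two target object detection boxes in the video frame can be determined first, followed by the position information of the target detection lines of each box, after which the two closest target detection lines from different target object detection boxes are selected.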
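One way to read the edge-based distance for axis-aligned rectangles is as the gap between the two boxes, which coincides with the smallest distance between an edge of one box and an edge of the other whenever the boxes do not overlap. The sketch below makes that assumption explicit; the function name `box_gap` and the `(x1, y1, x2, y2)` box format are hypothetical, and overlapping boxes are simply given a distance of 0.

```python
import math


def box_gap(a, b):
    """Gap between two axis-aligned boxes given as (x1, y1, x2, y2).

    For non-overlapping boxes this equals the distance between the two
    closest edges taken from different boxes; overlapping boxes return 0.0.
    """
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    dx = max(bx1 - ax2, ax1 - bx2, 0.0)  # horizontal separation, 0 if overlapping
    dy = max(by1 - ay2, ay1 - by2, 0.0)  # vertical separation, 0 if overlapping
    return math.hypot(dx, dy)
```

For two pedestrians standing side by side across a railing, `dy` is typically 0 and the result reduces to the horizontal gap between the facing vertical edges.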
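In addition, in the embodiments of the present disclosure, a corresponding distance reference mark can first be selected from each of the target object detection box of the second target object and the target object detection box of the first target object, and the distance between the two target object detection boxes can then be determined from the distance between the two distance reference marks. The distance reference mark can be the center point of the detection box, the center line of the detection box, or another mark that is meaningful as a distance reference; no specific limitation is imposed here.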
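Taking the box center point as the distance reference mark, the distance can be sketched as follows. The helper names `box_center` and `center_distance` and the `(x1, y1, x2, y2)` box format are assumptions made for illustration.

```python
import math


def box_center(box):
    """Center point of an (x1, y1, x2, y2) detection box."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2)


def center_distance(a, b):
    """Euclidean distance between the center points of two detection boxes."""
    return math.dist(box_center(a), box_center(b))
```

A center-line variant would compare only the x coordinates (or only the y coordinates) of the two centers, depending on the orientation of the barrier.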
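To pair target objects more quickly, the embodiments of the present disclosure can enlarge the target object detection boxes before computing distances. That is, the target object detection box of the first target object and the target object detection box of each of the at least one second target object on the other side can each be enlarged by a preset enlargement ratio, and pairing is then carried out by determining the distance between the two enlarged target object detection boxes.
In a specific application, each target object detection box can be expanded by a factor of 1.5, or alternatively by a factor of 1, 2, 3 and so on. Then, for a pedestrian on one side of the railing, the pedestrian detection box on the opposite side that is closest to that pedestrian's detection box can be found, identifying the pedestrian on the opposite side with whom objects may be passed across the barrier.
It should be noted that the enlargement can be applied simultaneously to the multiple target objects on the different sides, so that all target object detection boxes are at the same reference level, which improves the accuracy of the subsequent pairing.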
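The enlargement can be sketched as below. The disclosure only gives the ratio; scaling about the box center and clamping to the frame boundary are assumptions of this sketch, as are the name `enlarge_box` and its parameters.

```python
def enlarge_box(box, scale=1.5, frame_size=None):
    """Scale an (x1, y1, x2, y2) box about its center by `scale`.

    If `frame_size` is given as (width, height), the result is clamped so
    that it stays inside the frame. Both choices are illustrative assumptions.
    """
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
    half_w = (x2 - x1) * scale / 2
    half_h = (y2 - y1) * scale / 2
    nx1, ny1, nx2, ny2 = cx - half_w, cy - half_h, cx + half_w, cy + half_h
    if frame_size is not None:
        width, height = frame_size
        nx1, ny1 = max(0.0, nx1), max(0.0, ny1)
        nx2, ny2 = min(float(width), nx2), min(float(height), ny2)
    return (nx1, ny1, nx2, ny2)
```

Applying the same `scale` to every box on both sides keeps the boxes at a common reference level, as the text above requires.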
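The video frames in the embodiments of the present disclosure can be a video clip of multiple consecutive frames. Given the continuity between the frames of a video clip, frames can first be extracted and target objects detected afterwards, which can be implemented as follows:
Step 1: select multiple video frames from the video clip in temporal order, to obtain a target video sequence;
Step 2: for each target video frame in the target video sequence, detect the obstructing object in the target video frame and the target objects located on both sides of the obstructing object.
For the method of detecting target objects in a target video frame, refer to the description above of target object detection on video frames; it is not repeated here.
In the embodiments of the present disclosure, the target video sequence can be determined as follows:
Step 1: divide the video clip into multiple video frame groups according to a preset division interval;
Step 2: for each of the multiple video frame groups, select one video frame from the video frame group as one target video frame of the target video sequence;
Step 3: combine, in temporal order, the video frames selected from the multiple video frame groups, to obtain the target video sequence.
The preset division interval can be a time interval, for example one video frame group every 0.5 seconds, or a frame-count interval, for example one video frame group every 5 frames, or some other form of division; no specific limitation is imposed here.
Selecting target video frames from the divided video frame groups reduces the amount of computation to some extent while still ensuring that ample behavior information can be detected.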
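The three steps for building the target video sequence can be sketched with a frame-count interval. Picking the middle frame of each group is an assumption of this sketch (the disclosure only says one frame is selected per group), and the name `sample_target_sequence` is hypothetical.

```python
def sample_target_sequence(frames, group_size=5):
    """Split `frames` into consecutive groups and keep one frame per group."""
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    # Step 1: divide the clip into groups of `group_size` consecutive frames.
    groups = [frames[i:i + group_size] for i in range(0, len(frames), group_size)]
    # Steps 2 and 3: take the middle frame of each group; iterating the
    # groups in order preserves the temporal order of the selected frames.
    return [group[len(group) // 2] for group in groups]
```

For a time-based interval, `group_size` could be derived from the frame rate, for example 0.5 seconds at 10 frames per second gives groups of 5 frames.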
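For a target video frame, the paired target objects can be determined with the above method, and the target video frame region containing the target object detection pair can then be determined. Besides the target object detection pair, the target video frame region also contains the obstructing object lying between the two target objects.
In the method for abnormal behavior detection provided by the embodiments of the present disclosure, the target video frame region corresponding to each target video frame can be input into the trained behavior detection neural network in turn, to determine the target video frames in which abnormal behavior occurs and the position information of the target objects in those frames.
In a specific application, the target video frame region can be cropped out of the corresponding target video frame, and the cropped region input directly into the trained behavior detection neural network for abnormal behavior detection.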
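A minimal sketch of forming and cropping the target video frame region follows. It assumes the region is the smallest rectangle covering both boxes of the pair (which then also covers the barrier segment between them), that boxes use integer pixel coordinates `(x1, y1, x2, y2)`, and that a frame is a list of pixel rows; `pair_region` and `crop` are hypothetical names.

```python
def pair_region(first, second, frame_size):
    """Smallest rectangle covering both boxes, clamped to the frame."""
    width, height = frame_size
    x1 = max(0, min(first[0], second[0]))
    y1 = max(0, min(first[1], second[1]))
    x2 = min(width, max(first[2], second[2]))
    y2 = min(height, max(first[3], second[3]))
    return (x1, y1, x2, y2)


def crop(frame, region):
    """Cut `region` out of a frame stored as a list of rows of pixels."""
    x1, y1, x2, y2 = region
    return [row[x1:x2] for row in frame[y1:y2]]
```

With an array library the crop would be a single slice such as `frame[y1:y2, x1:x2]`; the list-based version is used here only to keep the sketch dependency-free.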
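The abnormal behavior detection here can determine the position information of the target objects in the target video frames in which abnormal behavior occurs, which makes it easier for management staff to handle the abnormal situation promptly.
In the embodiments of the present disclosure, abnormal behavior detection can be implemented with a trained behavior detection neural network, which can be trained as follows:
Step 1: acquire multiple video frame samples;
Step 2: use the multiple video frame samples as the input data of the behavior detection neural network to be trained, use the abnormal behavior indicator labels annotated for the multiple video frame samples as comparative supervision data for the output of the behavior detection neural network to be trained, and perform at least one round of network training on the behavior detection neural network to be trained, to obtain the trained behavior detection neural network.
Here, the abnormal behavior indicator labels annotated for the multiple video frame samples serve as the comparative supervision data for the behavior detection neural network. The closer the network output is to this supervision data, the better the network performs; conversely, the further the output is from the supervision data, the poorer the performance, and the network needs to be trained again.
The abnormal behavior indicator label can be an identifier of a specific abnormal behavior, for example an object-passing behavior identified as 1, a fighting behavior identified as 2, and so on.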
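The idea of comparing the network output with the annotated labels over one or more training rounds can be illustrated with a deliberately tiny stand-in. The sketch below is not the behavior detection neural network of the disclosure: it is a single-layer binary classifier trained by gradient descent on made-up two-dimensional feature vectors, with label 1 for "abnormal" and 0 for "normal". All names and data are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(samples, labels, epochs=200, lr=0.5):
    """Run `epochs` training rounds and return weights, bias and per-round loss."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    losses = []
    for _ in range(epochs):                       # "at least one round" of training
        total = 0.0
        for x, y in zip(samples, labels):
            p = sigmoid(sum(w * v for w, v in zip(weights, x)) + bias)
            # Cross-entropy between the output p and the supervision label y.
            total += -(y * math.log(p + 1e-12) + (1 - y) * math.log(1 - p + 1e-12))
            err = p - y                           # gradient of the loss w.r.t. z
            weights = [w - lr * err * v for w, v in zip(weights, x)]
            bias -= lr * err
        losses.append(total / len(samples))
    return weights, bias, losses


def predict(weights, bias, x):
    """Return 1 (abnormal) or 0 (normal) for feature vector x."""
    return 1 if sigmoid(sum(w * v for w, v in zip(weights, x)) + bias) >= 0.5 else 0
```

A falling loss across rounds corresponds to the output moving closer to the supervision data; a multi-class version with labels such as 1 for passing objects and 2 for fighting would replace the sigmoid with a softmax over the classes.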
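In the method for abnormal behavior detection provided by the embodiments of the present disclosure, after abnormal behavior detection has been performed, the detection result can also be sent to a management terminal. From the detection result, the management terminal can quickly learn of the abnormal behavior and respond rapidly.
In addition, the embodiments of the present disclosure can generate alarm prompt information based on the detection result, and management staff can be reminded to respond promptly by means such as voice broadcast. In a specific application, prompts of different strength can be generated for different detection results. For example, in a subway scenario, the abnormal behavior of passing ordinary goods can be signaled with an ordinary prompt, whereas the abnormal behavior of passing dangerous goods can be signaled with a strong reminder.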
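Graded alarm prompts can be sketched as a small mapping from a detection result to a prompt. The function name `build_alert`, the `item_is_dangerous` flag, the level names and the convention that label `0` means "no abnormal behavior" are all assumptions made for illustration; the disclosure does not define a message format.

```python
def build_alert(label, item_is_dangerous=False):
    """Return an alarm prompt for an abnormal result, or None for a normal one."""
    if label == 0:  # assumption: 0 stands for "no abnormal behavior"
        return None
    level = "strong" if item_is_dangerous else "normal"
    return {
        "label": label,
        "level": level,
        "message": f"Abnormal behavior {label} detected ({level} alert)",
    }
```

The returned dictionary could be sent to the management terminal as is, or handed to a voice broadcast component that chooses its wording from `level`.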
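A person skilled in the art will understand that, in the above method of the specific embodiments, the order in which the steps are written does not imply a strict order of execution and does not limit the implementation in any way; the actual order of execution of the steps should be determined by their functions and possible internal logic.
Based on the same inventive concept, the embodiments of the present disclosure also provide a device for abnormal behavior detection corresponding to the method. Because the principle by which the device solves the problem is similar to that of the method described above, the implementation of the device can refer to the implementation of the method, and repeated content is not described again.
Referring to FIG. 2, which is a schematic diagram of a device for abnormal behavior detection provided by an embodiment of the present disclosure, the device includes an acquisition module 201, a first detection module 202, a determination module 203 and a second detection module 204, where:
the acquisition module 201 is configured to acquire video frames captured within a preset management area;
the first detection module 202 is configured to detect an obstructing object in the video frames, and target objects located on both sides of the obstructing object;
the determination module 203 is configured to pair multiple target objects located on different sides of the obstructing object according to the distance between the target objects on the two sides of the obstructing object, to obtain a target object detection pair; and to determine a target video frame region containing the target object detection pair;
the second detection module 204 is configured to perform abnormal behavior detection on the target video frame region using a trained behavior detection neural network.
With the above device, the obstructing object and the target objects on both sides of it can first be detected in the acquired video frames; the target objects can then be paired based on the distance between the target objects on the two sides, to determine a target video frame region containing the target object detection pair; finally, a trained behavior detection neural network can be used to perform abnormal behavior detection on that region. Detection therefore operates on a target video frame region formed by pairing target objects. This region may correspond to a region in which abnormal behavior exists, which avoids the influence of other, unrelated regions on abnormal behavior detection and yields high detection accuracy; and because the trained behavior detection neural network performs the detection directly, detection efficiency is significantly improved.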
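In a possible implementation, where the obstructing object is a barrier and the target objects are pedestrians, the first detection module 202 is configured to detect the obstructing object in the video frames and the target objects located on both sides of the obstructing object as follows:
performing obstructing object detection on the video frame using a trained barrier detection neural network, to obtain an obstructing object detection mark of the obstructing object; and performing target object detection on the target video frame using a trained pedestrian detection neural network, to obtain target object detection boxes corresponding to the target objects;
the determination module 203 is configured to pair the target objects located on the two sides of the obstructing object according to the distance between them, to obtain the target object detection pair, as follows:
for a first target object located on one side of the obstructing object, determining, from at least one second target object located on the other side, the second target object paired with the first target object, based on the distance between the target object detection box of each of the at least one second target object and the target object detection box of the first target object.
In a possible implementation, the determination module 203 is further configured to:
before determining, from the at least one second target object, the second target object paired with the first target object based on the distance between the target object detection box of each of the at least one second target object located on the other side and the target object detection box of the first target object, enlarge, by a preset enlargement ratio, the target object detection box of the first target object and the target object detection box of each of the at least one second target object located on the other side.
In a possible implementation, where the target object detection box of the second target object and the target object detection box of the first target object are both rectangular boxes, the determination module 203 is configured to determine the distance between the two target object detection boxes as follows:
selecting, from the two target object detection boxes, the two target detection lines that belong to different target object detection boxes and have the smallest distance between them;
taking the distance between the two target detection lines as the distance between the two target object detection boxes.
In a possible implementation, the distance between the target object detection box of the second target object and the target object detection box of the first target object is determined as follows:
selecting a corresponding distance reference mark from each of the target object detection box of the second target object and the target object detection box of the first target object;
determining the distance between the two distance reference marks as the distance between the target object detection box of the second target object and the target object detection box of the first target object.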
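In a possible implementation, where the acquired video frames are a video clip, the first detection module 202 is configured to detect the obstructing object in the video frames and the target objects located on both sides of the obstructing object as follows:
selecting multiple video frames from the video clip in temporal order, to obtain a target video sequence;
for each target video frame in the target video sequence, detecting the obstructing object in the target video frame and the target objects located on both sides of the obstructing object.
In a possible implementation, the first detection module 202 is configured to select multiple video frames from the video clip in temporal order, to obtain the target video sequence, as follows:
dividing the video clip into multiple video frame groups according to a preset division interval;
for each of the multiple video frame groups, selecting one video frame from the video frame group as one target video frame of the target video sequence;
combining, in temporal order, the video frames selected from the multiple video frame groups, to obtain the target video sequence.
In a possible implementation, the second detection module 204 is configured to perform abnormal behavior detection on the target video frame region using the trained behavior detection neural network as follows:
inputting the target video frame region corresponding to each target video frame into the trained behavior detection neural network in turn, and determining the target video frames in which abnormal behavior occurs, together with position information of the target objects in those target video frames.
In a possible implementation, the second detection module 204 is configured to train the behavior detection neural network as follows:
acquiring multiple video frame samples;
using the multiple video frame samples as the input data of the behavior detection neural network to be trained, using the abnormal behavior indicator labels annotated for the multiple video frame samples as comparative supervision data for the output of the behavior detection neural network to be trained, and performing at least one round of network training on the behavior detection neural network to be trained, to obtain the trained behavior detection neural network.
In a possible implementation, the second detection module 204 is further configured to, after performing abnormal behavior detection on the target video frame region using the trained behavior detection neural network, send the detection result obtained from the abnormal behavior detection to a management terminal; and/or generate alarm prompt information based on the detection result obtained from the abnormal behavior detection.
For the processing flow of each module in the device and the interaction flow between the modules, refer to the relevant description in the above method embodiments; they are not detailed again here.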
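The embodiments of the present disclosure also provide an electronic device. FIG. 3 is a schematic structural diagram of the electronic device provided by an embodiment of the present disclosure, which includes a processor 301, a memory 302 and a bus 303. The memory 302 stores machine-readable instructions executable by the processor 301 (for example, the execution instructions corresponding to the acquisition module 201, the first detection module 202, the determination module 203 and the second detection module 204 of the device in FIG. 2). When the electronic device is running, the processor 301 communicates with the memory 302 through the bus 303, and the machine-readable instructions, when executed by the processor 301, perform the following processing:
acquiring video frames captured within a preset management area;
detecting an obstructing object in the video frames, and target objects located on both sides of the obstructing object;
pairing multiple target objects located on different sides of the obstructing object according to the distance between the target objects on the two sides of the obstructing object, to obtain a target object detection pair; and determining a target video frame region containing the target object detection pair;
performing abnormal behavior detection on the target video frame region using a trained behavior detection neural network.
The embodiments of the present disclosure also provide a computer-readable storage medium on which a computer program is stored; when run by a processor, the computer program performs the steps of the method for abnormal behavior detection described in the above method embodiments. The storage medium can be a volatile or non-volatile computer-readable storage medium.
The embodiments of the present disclosure also provide a computer program product carrying program code; the instructions included in the program code can be used to perform the steps of the method for abnormal behavior detection described in the above method embodiments. For details, refer to the above method embodiments, which are not repeated here.
The above computer program product can be implemented in hardware, software or a combination of the two. In one optional embodiment, the computer program product is embodied as a computer storage medium; in another optional embodiment, it is embodied as a software product, such as a software development kit (SDK).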
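A person skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the system and device described above can refer to the corresponding processes in the foregoing method embodiments and are not repeated here. In the several embodiments provided in the present disclosure, it should be understood that the disclosed system, device and method can be implemented in other ways. The device embodiments described above are merely illustrative. For example, the division into units is only a division by logical function, and other divisions are possible in an actual implementation; as a further example, multiple units or components can be combined or integrated into another system, or some features can be omitted or not executed. Furthermore, the mutual coupling, direct coupling or communication connection shown or discussed can be indirect coupling or communication connection through some communication interfaces, devices or units, and can be electrical, mechanical or of another form.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they can be located in one place or distributed over multiple network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present disclosure can be integrated into one processing unit, each unit can exist physically on its own, or two or more units can be integrated into one unit.
If the functions are implemented in the form of software functional units and sold or used as independent products, they can be stored in a non-volatile computer-readable storage medium executable by a processor. Based on this understanding, the technical solution of the present disclosure, in essence, or the part of it that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause an electronic device (which can be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the methods described in the embodiments of the present disclosure. The aforementioned storage medium includes a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disc and other media that can store program code.
Finally, it should be noted that the embodiments described above are only specific implementations of the present disclosure, intended to illustrate its technical solutions rather than to limit them, and the scope of protection of the present disclosure is not limited to them. Although the present disclosure has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that anyone familiar with the technical field can, within the technical scope disclosed herein, still modify the technical solutions recorded in the foregoing embodiments, readily conceive of changes to them, or make equivalent replacements of some of their technical features; such modifications, changes or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present disclosure, and shall all fall within the scope of protection of the present disclosure. The scope of protection of the present disclosure shall therefore be determined by the scope of protection of the claims.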

Claims (13)

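  1. A method for abnormal behavior detection, characterized in that the method comprises:
    acquiring video frames captured within a preset management area;
    detecting an obstructing object in the video frames, and target objects located on both sides of the obstructing object;
    pairing multiple target objects located on different sides of the obstructing object according to the distance between the target objects on the two sides of the obstructing object, to obtain a target object detection pair; and determining a target video frame region containing the target object detection pair;
    performing abnormal behavior detection on the target video frame region using a trained behavior detection neural network.
  2. The method according to claim 1, characterized in that, where the obstructing object is a barrier and the target objects are pedestrians, detecting the obstructing object in the video frames and the target objects located on both sides of the obstructing object comprises:
    performing obstructing object detection on the video frame using a trained barrier detection neural network, to obtain an obstructing object detection mark of the obstructing object; and performing target object detection on the target video frame using a trained pedestrian detection neural network, to obtain target object detection boxes corresponding to the target objects;
    pairing the multiple target objects located on different sides of the obstructing object according to the distance between the target objects on the two sides of the obstructing object, to obtain the target object detection pair, comprises:
    for a first target object located on one side of the obstructing object, determining, from at least one second target object located on the other side, the second target object paired with the first target object, based on the distance between the target object detection box of each of the at least one second target object and the target object detection box of the first target object.
  3. The method according to claim 2, characterized in that, before determining, from the at least one second target object, the second target object paired with the first target object based on the distance between the target object detection box of each of the at least one second target object located on the other side and the target object detection box of the first target object, the method further comprises:
    enlarging, by a preset enlargement ratio, the target object detection box of the first target object and the target object detection box of each of the at least one second target object located on the other side.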
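  4. The method according to claim 2 or 3, characterized in that, where the target object detection box of the second target object and the target object detection box of the first target object are both rectangular boxes, the distance between the two target object detection boxes is determined as follows:
    selecting, from the two target object detection boxes, the two target detection lines that belong to different target object detection boxes and have the smallest distance between them;
    taking the distance between the two target detection lines as the distance between the two target object detection boxes.
  5. The method according to claim 2 or 3, characterized in that the distance between the target object detection box of the second target object and the target object detection box of the first target object is determined as follows:
    selecting a corresponding distance reference mark from each of the target object detection box of the second target object and the target object detection box of the first target object;
    determining the distance between the two distance reference marks as the distance between the target object detection box of the second target object and the target object detection box of the first target object.
  6. The method according to any one of claims 1 to 5, characterized in that, where the acquired video frames are a video clip, detecting the obstructing object in the video frames and the target objects located on both sides of the obstructing object comprises:
    selecting multiple video frames from the video clip in temporal order, to obtain a target video sequence;
    for each target video frame in the target video sequence, detecting the obstructing object in the target video frame and the target objects located on both sides of the obstructing object.
  7. The method according to claim 6, characterized in that selecting multiple video frames from the video clip in temporal order to obtain the target video sequence comprises:
    dividing the video clip into multiple video frame groups according to a preset division interval;
    for each of the multiple video frame groups, selecting one video frame from the video frame group as one target video frame of the target video sequence;
    combining, in temporal order, the video frames selected from the multiple video frame groups, to obtain the target video sequence.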
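  8. The method according to claim 6 or 7, characterized in that performing abnormal behavior detection on the target video frame region using the trained behavior detection neural network comprises:
    inputting the target video frame region corresponding to each target video frame into the trained behavior detection neural network in turn, and determining the target video frames in which abnormal behavior occurs, together with position information of the target objects in those target video frames.
  9. The method according to any one of claims 1 to 8, characterized in that the behavior detection neural network is trained as follows:
    acquiring multiple video frame samples;
    using the multiple video frame samples as the input data of the behavior detection neural network to be trained, using the abnormal behavior indicator labels annotated for the multiple video frame samples as comparative supervision data for the output of the behavior detection neural network to be trained, and performing at least one round of network training on the behavior detection neural network to be trained, to obtain the trained behavior detection neural network.
  10. The method according to any one of claims 1 to 9, characterized in that, after performing abnormal behavior detection on the target video frame region using the trained behavior detection neural network, the method further comprises at least one of the following:
    sending the detection result obtained from the abnormal behavior detection to a management terminal;
    generating alarm prompt information based on the detection result obtained from the abnormal behavior detection.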
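  11. A device for abnormal behavior detection, characterized in that the device comprises:
    an acquisition module, configured to acquire video frames captured within a preset management area;
    a first detection module, configured to detect an obstructing object in the video frames, and target objects located on both sides of the obstructing object;
    a determination module, configured to pair multiple target objects located on different sides of the obstructing object according to the distance between the target objects on the two sides of the obstructing object, to obtain a target object detection pair; and to determine a target video frame region containing the target object detection pair;
    a second detection module, configured to perform abnormal behavior detection on the target video frame region using a trained behavior detection neural network.
  12. An electronic device, characterized by comprising a processor, a memory and a bus, wherein the memory stores machine-readable instructions executable by the processor; when the electronic device is running, the processor communicates with the memory through the bus, and the machine-readable instructions, when executed by the processor, perform the steps of the method for abnormal behavior detection according to any one of claims 1 to 10.
  13. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, and the computer program, when run by a processor, performs the steps of the method for abnormal behavior detection according to any one of claims 1 to 10.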
PCT/CN2022/096440 2021-10-29 2022-05-31 一种异常行为检测的方法、装置、电子设备及存储介质 WO2023071188A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111271743.9 2021-10-29
CN202111271743.9A CN113901946A (zh) 2021-10-29 2021-10-29 一种异常行为检测的方法、装置、电子设备及存储介质

Publications (1)

Publication Number Publication Date
WO2023071188A1 true WO2023071188A1 (zh) 2023-05-04

Family

ID=79026849

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/096440 WO2023071188A1 (zh) 2021-10-29 2022-05-31 一种异常行为检测的方法、装置、电子设备及存储介质

Country Status (2)

Country Link
CN (1) CN113901946A (zh)
WO (1) WO2023071188A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117392758A (zh) * 2023-12-05 2024-01-12 广州阿凡提电子科技有限公司 基于视频分析的用户行为识别方法及***

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113901946A (zh) * 2021-10-29 2022-01-07 上海商汤智能科技有限公司 一种异常行为检测的方法、装置、电子设备及存储介质

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107977646A (zh) * 2017-12-19 2018-05-01 北京博睿视科技有限责任公司 一种隔栏递物检测算法
US20200013148A1 (en) * 2018-07-06 2020-01-09 Mitsubishi Electric Research Laboratories, Inc. System and Method for Detecting Motion Anomalies in Video
CN111325937A (zh) * 2020-03-07 2020-06-23 北京迈格威科技有限公司 翻越行为检测方法、装置和电子***
CN112084987A (zh) * 2020-09-16 2020-12-15 杨晓敏 一种基于人工智能的地铁逃票行为的检测方法与***
CN112560649A (zh) * 2020-12-09 2021-03-26 广州云从鼎望科技有限公司 一种行为动作检测方法、***、设备及介质
CN112668377A (zh) * 2019-10-16 2021-04-16 清华大学 信息识别***及其方法
CN112818844A (zh) * 2021-01-29 2021-05-18 成都商汤科技有限公司 安检异常事件检测方法及装置、电子设备和存储介质
CN113177439A (zh) * 2021-04-08 2021-07-27 中通服咨询设计研究院有限公司 一种行人翻越马路护栏检测方法
CN113901946A (zh) * 2021-10-29 2022-01-07 上海商汤智能科技有限公司 一种异常行为检测的方法、装置、电子设备及存储介质

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117392758A (zh) * 2023-12-05 2024-01-12 广州阿凡提电子科技有限公司 基于视频分析的用户行为识别方法及***
CN117392758B (zh) * 2023-12-05 2024-03-26 广州阿凡提电子科技有限公司 基于视频分析的用户行为识别方法及***

Also Published As

Publication number Publication date
CN113901946A (zh) 2022-01-07

Similar Documents

Publication Publication Date Title
Siebert et al. Detecting motorcycle helmet use with deep learning
EP3343443B1 (en) Object detection for video camera self-calibration
WO2023071188A1 (zh) 一种异常行为检测的方法、装置、电子设备及存储介质
Benezeth et al. Abnormal events detection based on spatio-temporal co-occurences
CN102799935B (zh) 一种基于视频分析技术的人流量统计方法
CN105574506A (zh) 基于深度学习和大规模集群的智能人脸追逃***及方法
WO2018210047A1 (zh) 数据处理方法、数据处理装置、电子设备及存储介质
US11587327B2 (en) Methods and systems for accurately recognizing vehicle license plates
CN104933710A (zh) 基于监控视频下的商店人流轨迹智能分析方法
Zhang et al. Crowd density estimation based on statistical analysis of local intra-crowd motions for public area surveillance
Bouma et al. Automatic detection of suspicious behavior of pickpockets with track-based features in a shopping mall
Zin et al. A Markov random walk model for loitering people detection
Saunier et al. A public video dataset for road transportation applications
Badura et al. Intelligent traffic system: Cooperation of MANET and image processing
Shirazi et al. Vision-based pedestrian behavior analysis at intersections
Balali et al. Video-based detection and classification of US traffic signs and mile markers using color candidate extraction and feature-based recognition
WO2018210039A1 (zh) 数据处理方法、数据处理装置及存储介质
Ashraf et al. HVD-net: a hybrid vehicle detection network for vision-based vehicle tracking and speed estimation
CN112508626A (zh) 一种信息处理方法、装置、电子设备及存储介质
Noh et al. SafetyCube: Framework for potential pedestrian risk analysis using multi-dimensional OLAP
Zhou et al. Rapid and robust traffic accident detection based on orientation map
Lao et al. Human running detection: Benchmark and baseline
Saeed et al. Object identification-based stall detection and stall legitimacy analysis for traffic patterns
Kadiķis et al. Vehicle classification in video using virtual detection lines
Mahin et al. A simple approach for abandoned object detection

Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE