WO2023000253A1 - Climbing behavior early warning method and device, electronic device, and storage medium - Google Patents

Climbing behavior early warning method and device, electronic device, and storage medium

Info

Publication number
WO2023000253A1
WO2023000253A1 (PCT/CN2021/107847, CN 2021107847 W)
Authority
WO
WIPO (PCT)
Prior art keywords
target
area
target area
behavior information
bottom edge
Prior art date
Application number
PCT/CN2021/107847
Other languages
English (en)
French (fr)
Inventor
Wang Guangli (王光利)
Original Assignee
BOE Technology Group Co., Ltd. (京东方科技集团股份有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by BOE Technology Group Co., Ltd.
Priority to CN202180001950.4A (publication CN115917589A)
Priority to EP21950508.8A (publication EP4336491A4)
Priority to PCT/CN2021/107847 (publication WO2023000253A1)
Priority to US17/971,498 (publication US11990010B2)
Publication of WO2023000253A1
Priority to US18/144,366 (publication US20230316760A1)

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00: Scenes; Scene-specific elements
    • G06V20/50: Context or environment of the image
    • G06V20/52: Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16: Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161: Detection; Localisation; Normalisation
    • G06V40/166: Detection; Localisation; Normalisation using acquisition arrangements
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16: Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168: Feature extraction; Face representation
    • G06V40/171: Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16: Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172: Classification, e.g. identification
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20: Movements or behaviour, e.g. gesture recognition
    • G06V40/23: Recognition of whole body movements, e.g. for sport training
    • G: PHYSICS
    • G08: SIGNALLING
    • G08B: SIGNALLING OR CALLING SYSTEMS; ORDER TELEGRAPHS; ALARM SYSTEMS
    • G08B13/00: Burglar, theft or intruder alarms
    • G08B13/18: Actuation by interference with heat, light, or radiation of shorter wavelength; Actuation by intruding sources of heat, light, or radiation of shorter wavelength
    • G08B13/189: Actuation using passive radiation detection systems
    • G08B13/194: Actuation using passive radiation detection systems using image scanning and comparing systems
    • G08B13/196: Actuation using passive radiation detection systems using image scanning and comparing systems using television cameras
    • G08B13/19602: Image analysis to detect motion of the intruder, e.g. by frame subtraction
    • G08B13/19606: Discriminating between target movement or movement in an area of interest and other non-significative movements, e.g. target movements induced by camera shake or movements of pets, falling leaves, rotating fan
    • G: PHYSICS
    • G08: SIGNALLING
    • G08B: SIGNALLING OR CALLING SYSTEMS; ORDER TELEGRAPHS; ALARM SYSTEMS
    • G08B13/00: Burglar, theft or intruder alarms
    • G08B13/18: Actuation by interference with heat, light, or radiation of shorter wavelength; Actuation by intruding sources of heat, light, or radiation of shorter wavelength
    • G08B13/189: Actuation using passive radiation detection systems
    • G08B13/194: Actuation using passive radiation detection systems using image scanning and comparing systems
    • G08B13/196: Actuation using passive radiation detection systems using image scanning and comparing systems using television cameras
    • G08B13/19602: Image analysis to detect motion of the intruder, e.g. by frame subtraction
    • G08B13/19613: Recognition of a predetermined image pattern or behaviour pattern indicating theft or intrusion
    • G: PHYSICS
    • G08: SIGNALLING
    • G08B: SIGNALLING OR CALLING SYSTEMS; ORDER TELEGRAPHS; ALARM SYSTEMS
    • G08B31/00: Predictive alarm systems characterised by extrapolation or other computation using updated historic data

Definitions

  • the present disclosure relates to the technical field of data processing, and in particular to a climbing behavior early warning method and device, electronic equipment, and a storage medium.
  • the uncivilized behavior of tourists also increases, such as scrawling graffiti on cultural relics, climbing sculptures, etc.
  • the sculptures may be damaged, tourists themselves may be harmed, and other tourists may be adversely affected.
  • video surveillance systems are usually installed in existing scenic spots, and security personnel watch the monitoring display screens in real time to detect uncivilized behaviors in a timely manner.
  • the present disclosure provides a climbing behavior early warning method and device, electronic equipment, and a storage medium to solve the deficiencies of related technologies.
  • a climbing behavior early warning method comprising:
  • the video image data including the detected target and at least one object
  • determining that the object enters a target area corresponding to the detected target includes:
  • the spatio-temporal relationship refers to the relative spatial relationship between the object area and the target area at different times;
  • the first preset condition includes at least one of the following: the object area is within the target area and the distance between the bottom edge of the object area and the bottom edge of the target area exceeds a set distance threshold; or the object area successively touches the edge of the target area and the two marking lines, and the distance between the bottom edge of the object area and the bottom edge of the target area exceeds the set distance threshold; wherein the two marking lines are set between the edge of the target area and the detected target.
  • the spatio-temporal relationship includes at least one of the following:
  • the object area is within the target area; the object area touches the edge of the target area and then the two marking lines; the object area touches the two marking lines and then the edge of the target area; the distance between the bottom edge of the object area and the bottom edge of the target area exceeds the set distance threshold; the distance between the bottom edge of the object area and the bottom edge of the target area is less than the set distance threshold; the object area is outside the target area.
  • obtain the object area where the target object is located including:
  • An object whose head is located in the target area is selected as a target object, and an object area where the target object is located is obtained.
  • obtaining the positions of heads of each object in multiple video frames in the video image data includes:
  • acquiring the behavior information of the object includes:
  • the head of the target object is located in the target area;
  • the behavior information includes human body posture;
  • determining that the behavior information indicates that the object climbs the detected target includes:
  • the behavior information includes human body posture
  • the behavior information indicates that the target object is climbing the detected target.
  • the method further includes:
  • when the facial image satisfies a preset requirement, an identification code matching the facial image is obtained; the preset requirement includes that key points of the face are detected and that the confidence of the recognition result exceeds a set confidence threshold;
  • a climbing behavior early warning device comprising:
  • a data acquisition module configured to acquire video image data, the video image data including the detected target and at least one object;
  • An information acquisition module configured to acquire behavior information of the object when it is determined that the object enters the target area corresponding to the detected target;
  • a video marking module configured to mark the video frame where the object is located when it is determined that the behavior information indicates that the object is climbing the detected target.
  • the information acquisition module includes:
  • the area acquisition submodule is used to acquire the target area where the detected target is located in the multiple video frames of the video image data, and acquire the object area where the target object is located; the head of the target object is located in the target area;
  • a relationship acquiring submodule configured to acquire the spatio-temporal relationship between the object area and the target area; the spatio-temporal relationship refers to the relative spatial relationship between the object area and the target area at different times;
  • An area determination submodule configured to determine that the target object enters the target area when it is determined that the spatio-temporal relationship satisfies a first preset condition
  • the first preset condition includes at least one of the following: the object area is within the target area and the distance between the bottom edge of the object area and the bottom edge of the target area exceeds a set distance threshold; or the object area successively touches the edge of the target area and the two marking lines, and the distance between the bottom edge of the object area and the bottom edge of the target area exceeds the set distance threshold; wherein the two marking lines are set between the edge of the target area and the detected target.
  • the spatio-temporal relationship includes at least one of the following:
  • the object area is within the target area; the object area touches the edge of the target area and then the two marking lines; the object area touches the two marking lines and then the edge of the target area; the distance between the bottom edge of the object area and the bottom edge of the target area exceeds the set distance threshold; the distance between the bottom edge of the object area and the bottom edge of the target area is less than the set distance threshold; the object area is outside the target area.
  • the region acquisition submodule includes:
  • a position acquisition unit configured to acquire the position of the head of each object in the multiple video frames in the video image data and the object area where each object is located;
  • the object selection unit is configured to select an object whose head is located in the target area as a target object, and obtain the object area where the target object is located.
  • the location acquisition unit includes:
  • a feature acquisition subunit configured to acquire preset image features of each video frame in the multiple video frames
  • a position prediction subunit configured to identify the recognized position of the head in the current video frame based on the preset image features, and predict the predicted position of the head in the next video frame;
  • the position acquisition subunit is used to match the identified position and the predicted position, and update the predicted position to the identified position after the matching is passed, so as to obtain the position of the same head in two adjacent video frames.
  • the information acquisition module includes:
  • the position acquisition sub-module is used to acquire the position of the key part of the behavior information of the target object in the multiple video frames of the video image data; the head of the target object is located in the target area; the behavior information includes the posture of the human body;
  • the vector generation sub-module is used to generate a one-dimensional vector for the key parts of the behavior information in each video frame according to the preset expression sequence;
  • the image acquisition sub-module is used to concatenate the corresponding one-dimensional vectors in each video frame to obtain a frame of RGB image; the RGB channel in the RGB image corresponds to the xyz axis coordinates of each behavioral information key part;
  • the behavior information acquisition sub-module is configured to acquire the behavior information of the target object according to the RGB image.
  • the video tagging module includes:
  • a position determining submodule configured to determine the position of a specified part of the target object based on the behavior information;
  • the behavior information includes human body posture;
  • a target determination submodule configured to determine that the behavior information indicates that the target object is climbing the detected target.
  • the device also includes:
  • An image acquisition module configured to acquire the facial image of the target object
  • an identification code acquisition module configured to acquire an identification code that matches the facial image when the facial image satisfies a preset requirement; the preset requirement includes that the key points of the face are detected and that the confidence of the recognition result exceeds a set confidence threshold;
  • the signal generation module is configured to generate warning information when it is determined that there is no object matching the identification code in the specified database.
  • an electronic device including:
  • a processor, and memory for storing a computer program executable by said processor;
  • the processor is configured to execute the computer program in the memory, so as to realize the above method.
  • a computer-readable storage medium is provided, and when an executable computer program in the storage medium is executed by a processor, the above method can be implemented.
  • the solution provided by the embodiments of the present disclosure can acquire video image data, the video image data including the detected target and at least one object; when it is determined that the object enters the target area corresponding to the detected target, the behavior information of the object is acquired; when it is determined that the behavior information indicates that the object is climbing the detected target, the video frame where the object is located is marked.
  • the behavior of the object climbing the detected target can be found in time, and the management efficiency can be improved.
  • Fig. 1 is a flow chart of a climbing behavior warning method according to an exemplary embodiment.
  • Fig. 2 is a flow chart of determining the current behavior of a target object according to an exemplary embodiment.
  • Fig. 3 is a flow chart of tracking the same head according to an exemplary embodiment.
  • Fig. 4 is a flow chart showing the current behavior of the target object according to an exemplary embodiment.
  • Fig. 5 is a schematic diagram showing the effect of an action of acquiring a target object according to an exemplary embodiment.
  • Fig. 6 is a flow chart of determining whether behavior information indicates that an object climbs a detected target according to an exemplary embodiment.
  • Fig. 7 is a schematic diagram showing the effect of the spatio-temporal relationship between the object area and the target area according to an exemplary embodiment.
  • Fig. 8 is a flow chart showing another climbing behavior early warning method according to an exemplary embodiment.
  • Fig. 9 is a block diagram of a climbing behavior warning device according to an exemplary embodiment.
  • FIG. 1 is a flow chart of a climbing behavior early warning method according to an exemplary embodiment.
  • a climbing behavior early warning method includes steps 11 to 13 .
  • in step 11, video image data is acquired, and the video image data includes the detected target and at least one object.
  • the electronic device may be connected to the camera, and receive the video image data output by the camera. That is, when the camera is turned on, it can collect video frames to form a video frame stream, and then encode and compress the video frames before sending them to the electronic device. The electronic device can obtain the above-mentioned video image data after decoding and other processing of the received image data.
  • the shooting range of the above-mentioned camera usually points at a specified detected target, where the detected target may include, but is not limited to, statues in scenic spots, cultural relics in museums, security guardrails, etc.; in other words, the video image data acquired by the electronic device includes the detected target.
  • the video image data may or may not include objects, where the objects may be tourists or management personnel.
  • since the solutions provided by the present disclosure are applied to scenes that include objects, only scenes in which the video image data includes at least one object are considered in the following embodiments.
  • in step 12, when it is determined that the object enters the target area corresponding to the detected target, the behavior information of the object is acquired.
  • the electronic device can process the above video image data, so as to determine whether the object enters the target area corresponding to the detected target. Referring to FIG. 2 , steps 21 to 23 are included.
  • the electronic device may acquire the target area where the detected target is located in the multiple video frames of the video image data, and acquire the object area where the target object is located.
  • a target recognition model, such as a convolutional neural network (CNN) model, may be pre-stored in the electronic device.
  • the electronic device can input each video frame of the video image data into the target recognition model, and the target recognition model can recognize the detected target in each video frame of the video image data, and then generate the minimum circumscribed rectangle according to the shape of the detected target, then The area corresponding to the smallest circumscribed rectangle in the video frame is the target area, that is, the target area where the detected target in multiple video frames is located can be obtained through the above recognition process.
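The target-area step above reduces to computing a bounding rectangle over the pixels attributed to the detected target. A minimal axis-aligned sketch (the patent's recognition model is not specified; the function name and coordinates below are hypothetical):

```python
def min_bounding_rect(points):
    """Axis-aligned minimum circumscribed rectangle of a set of
    (x, y) points, returned as (x_min, y_min, x_max, y_max)."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs), max(ys))

# Pixels attributed to the detected target by the recognition model
# (hypothetical coordinates).
target_pixels = [(40, 12), (55, 30), (38, 48), (60, 45)]
print(min_bounding_rect(target_pixels))  # (38, 12, 60, 48)
```

A rotated minimum-area rectangle (or circle, rhombus, etc., as the next paragraph notes) could be substituted without changing the rest of the pipeline.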
  • the above minimum circumscribed rectangle may also be replaced by other preset shapes, such as a circle or a rhombus; as long as the target area can be obtained, the corresponding solution falls within the protection scope of the present disclosure.
  • a head detection model, such as a convolutional network model, can be pre-stored in the electronic device.
  • the head detection model is a lightweight detection model based on a CNN, which makes it suitable for electronic devices with low resource allocation, or for upgrading and retrofitting existing monitoring systems.
  • the detection result can have a high degree of confidence.
  • the lightweight detection model can be obtained through model compression and model pruning.
  • model compression performs parameter compression on the trained model, so that the model carries fewer parameters, thereby reducing memory usage and improving processing efficiency.
  • Model pruning refers to retaining important weights and removing unimportant weights under the premise of ensuring the accuracy of CNN. Usually, the closer the weight value is to 0, the less important the weight is.
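The "weights closest to 0 are least important" rule can be sketched as magnitude pruning with a binary mask, in the spirit of the diagonal mask described below (a toy illustration on a flat weight list, not the patent's actual implementation):

```python
def prune_by_magnitude(weights, pruning_rate):
    """Zero out the fraction `pruning_rate` of weights with the
    smallest absolute value, returning (pruned weights, binary mask)."""
    n_prune = int(len(weights) * pruning_rate)
    # Indices of the weights closest to 0 -- treated as least important.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    mask = [1.0] * len(weights)
    for i in order[:n_prune]:
        mask[i] = 0.0
    return [w * m for w, m in zip(weights, mask)], mask

pruned, mask = prune_by_magnitude([0.9, -0.05, 0.4, 0.01, -0.7], 0.4)
# mask marks the two smallest-magnitude weights (-0.05 and 0.01) as removed.
```

In practice the mask would be applied per layer (or with a global pruning rate, as the next paragraph notes), followed by fine-tuning to recover accuracy.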
  • model pruning can be implemented in two ways: 1. modify (or leave unmodified) the blob structure, define the diagonal mask directly in it, and rewrite the original matrix into a sparse-matrix storage format; 2. use a new method to compute the multiplication of the sparse matrix and a vector. In other words, there are two starting points for pruning. The first is to start from the blob and save the diagonal mask in the blob structure; this blob-based method can run the diagonal-mask operations directly on the CPU or GPU, which is more efficient. The second is to start from the layer and define the diagonal mask directly; this method is relatively simple, but less efficient.
  • the global pruning rate may be set, or the pruning rate may be set separately for each layer.
  • the actual value of the pruning rate can be obtained through experiments.
  • the accuracy of the model will decrease after the non-important weights are removed.
  • the sparsity of the model increases, which can reduce the overfitting of the model, and the model accuracy will also improve after fine-tuning.
  • the blob-based method can run the diagonal-mask operations directly on the CPU or GPU, which is more efficient but requires a deeper understanding of the source code, while the layer-based method is relatively simple but less efficient.
  • the present disclosure can optimize the confidence threshold of the lightweight detection model. For example, the confidence threshold for the head is first gradually reduced from a preset value (such as 0.7) until the recall rate of the head detection result exceeds a recall threshold. Then, the tracking results of the head tracking model are combined with the above detection results, the recall rate and precision for the same head are monitored, and the confidence threshold is fine-tuned until the recall rate for the same head exceeds the recall threshold and the precision exceeds the precision threshold; for example, both the recall threshold and the precision threshold may exceed 0.98.
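The first stage of that tuning (lower the threshold from 0.7 until recall clears the target) can be sketched as follows; the scores, labels, and function names are hypothetical stand-ins for the model's detection confidences and ground truth:

```python
def recall_at(scores, labels, threshold):
    """Recall of head detections whose confidence is at least `threshold`."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    pos = sum(labels)
    return tp / pos if pos else 0.0

def tune_confidence_threshold(scores, labels, start=0.70, target_recall=0.98):
    """Step the head-confidence threshold down from `start` in 0.05
    increments until recall reaches `target_recall`."""
    for t in [x / 100 for x in range(int(start * 100), -1, -5)]:
        if recall_at(scores, labels, t) >= target_recall:
            return t
    return 0.0
```

The second stage described above would then fine-tune the returned threshold jointly against the tracker's per-head recall and precision.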
  • the electronic device can input each video frame into the lightweight detection model, which can detect the head of each object in each video frame at various angles, such as the front, the back, the side, and the top. Based on the one-to-one relationship between heads and objects, and combined with the shape of each object, the minimum circumscribed rectangle is generated as the object area where each object is located. That is, the electronic device can obtain the position of the head of each object in multiple video frames of the video image data, as well as the object area where each object is located.
  • the electronic device may combine the above target area to select an object whose head is located within the target area as the target object, and at the same time select an object area corresponding to the smallest circumscribed rectangle of the target object, that is, obtain the object area where the target object is located.
  • the above-mentioned head detection model can detect the head of the object in each video frame, but it cannot determine whether the heads in two adjacent video frames are the same object.
  • the process of the electronic device acquiring the head position in each video frame may include acquiring the head position of the same object in different video frames, as shown in FIG. 3 , including steps 31 to 33 .
  • the electronic device can obtain preset image features of the current video frame, such as color features or histogram of oriented gradients (HOG) features; the preset image features can be selected according to the specific scene.
  • any preset image feature that can effectively distinguish the heads of different objects while reducing computational complexity falls within the protection scope of the present disclosure. It can be understood that reducing the computational complexity of this step lowers the resource requirements that the disclosed solution places on the electronic device, which helps to broaden its application range.
  • the electronic device may identify the recognized position of the head in the current video frame based on the preset image features.
  • Step 32 can be realized by using the above-mentioned lightweight detection model, and will not be repeated here.
  • the light-weight detection model can be used to quickly identify the position of the head, which is conducive to realizing real-time detection.
  • the electronic device may also predict the predicted position of the head in the next video frame of the current video frame.
  • the electronic device may process the video frames using fast tracking based on a Kalman filter model, so as to predict the position and movement speed of the head. It should be noted that, since this example only focuses on the predicted position of the head, how the motion velocity is used is not described in detail; it can be processed according to the requirements of the Kalman filter model, and the corresponding scheme falls within the protection scope of the present disclosure.
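Since only the predicted position is used here, the relevant piece of the Kalman filter is the constant-velocity state-transition step. A minimal sketch (state layout is an assumption; the covariance update is omitted, so this is the prediction step only, not a full filter):

```python
def kalman_predict(state, dt=1.0):
    """Constant-velocity prediction: state = (x, y, vx, vy).
    Returns the predicted state one frame (dt) ahead."""
    x, y, vx, vy = state
    return (x + vx * dt, y + vy * dt, vx, vy)

# Head at (100, 50) moving 3 px/frame right and 1 px/frame down.
print(kalman_predict((100.0, 50.0, 3.0, 1.0)))  # (103.0, 51.0, 3.0, 1.0)
```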
  • the electronic device can match the identified position and the predicted position, where the matching can be implemented using the cosine distance of feature vectors; for example, the match passes when the cosine similarity of the feature vectors corresponding to the identified position and the predicted position exceeds a set cosine threshold, such as 0.85.
  • after the match passes, the electronic device can update the predicted position to the identified position, and thus obtain the position of the same head in the current video frame and the next video frame. By tracking the same head in this way, losing the object can be avoided, which helps to improve detection accuracy.
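The cosine-distance match between the identified and predicted heads can be sketched as below; the feature vectors are hypothetical (in practice they would be the preset image features, e.g. color or HOG descriptors):

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def heads_match(identified_feat, predicted_feat, threshold=0.85):
    """Match passes when cosine similarity exceeds the set threshold;
    the predicted position would then be updated to the identified one."""
    return cosine_similarity(identified_feat, predicted_feat) >= threshold
```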
  • the process of head tracking by an electronic device is as follows:
  • in video frame Frame 0, the head detection model detects 3 head detections; there are currently no tracks, so these 3 detections are initialized as tracks;
  • in video frame Frame 1, the head detection model again detects 3 detections. For the tracks from Frame 0, the new track positions are first predicted; then the predicted tracks are matched against the detections, where the matching can use the Hungarian algorithm, yielding (track, detection) matching pairs; finally, each track is updated with the detection in its matching pair.
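For the three tracks in this example, the Hungarian-style matching can be illustrated with an exhaustive minimum-cost assignment (fine for small counts; a real system would use the Hungarian algorithm proper, e.g. `scipy.optimize.linear_sum_assignment`). The cost function and coordinates below are assumptions:

```python
from itertools import permutations

def assign(tracks, detections):
    """Exhaustive minimum-cost assignment of detections to tracks,
    with Euclidean distance as the cost. A stand-in for the
    Hungarian algorithm, adequate for small counts."""
    def cost(t, d):
        return ((t[0] - d[0]) ** 2 + (t[1] - d[1]) ** 2) ** 0.5
    best, best_cost = None, float("inf")
    for perm in permutations(range(len(detections))):
        c = sum(cost(tracks[i], detections[j]) for i, j in enumerate(perm))
        if c < best_cost:
            best, best_cost = list(enumerate(perm)), c
    return best  # list of (track_index, detection_index) pairs

tracks = [(10, 10), (50, 20), (90, 40)]      # predicted positions, Frame 1
detections = [(52, 21), (11, 12), (88, 41)]  # detected heads, Frame 1
print(assign(tracks, detections))  # [(0, 1), (1, 0), (2, 2)]
```

Each resulting (track, detection) pair then updates its track, as described above.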
  • the electronic device may acquire the spatio-temporal relationship between the object area and the target area; the spatio-temporal relationship refers to the relative spatial relationship between the object area and the target area at different times.
  • the electronic device can set two marking lines inside the target area, where the first marking line is closer to the edge of the target area than the second marking line; that is, the second marking line is located between the first marking line and the detected target.
  • the principles are as follows:
  • the electronic device may determine the spatio-temporal relationship between the object area and the target area according to the two marking lines, wherein the spatio-temporal relationship refers to the relative positional relationship between the object area and the target area in space at different times.
  • the spatio-temporal relationship includes at least one of the following: the object area is within the target area; the object area touches the edge of the target area and then the two marking lines; the object area touches the two marking lines and then the edge of the target area; the distance between the bottom edge of the object area and the bottom edge of the target area exceeds the set distance threshold; the distance between the bottom edge of the object area and the bottom edge of the target area is less than the set distance threshold; the object area is outside the target area.
  • when an object enters, the object area will move from outside the target area to inside it as time goes by; that is, the object area will first "touch" the first marking line and then "touch" the second marking line.
  • when an object leaves, the object area will move from inside the target area to outside it; that is, the object area will first "touch" the second marking line and then "touch" the first marking line.
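The touch-order logic above amounts to reading the movement direction off the sequence of crossing events. A minimal sketch (the event labels "edge", "line1", "line2" are hypothetical names for the target-area edge and the two marking lines):

```python
def crossing_direction(touch_order):
    """Infer direction of motion from the order in which the object
    area touches the target-area edge and the two marking lines."""
    if touch_order == ["edge", "line1", "line2"]:
        return "entering"   # outside -> inside the target area
    if touch_order == ["line2", "line1", "edge"]:
        return "leaving"    # inside -> outside the target area
    return "unknown"
```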
  • in step 23, when it is determined that the spatio-temporal relationship satisfies the first preset condition, the electronic device may determine that the current behavior of the target object does not belong to the target behavior.
  • a first preset condition may be pre-stored in the electronic device, and the first preset condition includes at least one of the following: the object area is within the target area and the distance between the bottom edge of the object area and the bottom edge of the target area exceeds a set distance threshold; or the object area touches the edge of the target area and the two marking lines successively, and the distance between the bottom edge of the object area and the bottom edge of the target area exceeds the set distance threshold; wherein the two marking lines are set between the edge of the target area and the detected target. The first preset condition can be set according to the specific scene; as long as it can distinguish a target object that merely passes by the detected target from one engaging in uncivilized behavior, the corresponding solution falls within the protection scope of the present disclosure.
  • the electronic device may determine whether the spatio-temporal relationship determined in step 22 satisfies the first preset condition. When it is determined that the spatio-temporal relationship satisfies the first preset condition, the electronic device may determine that the current behavior of the target object does not belong to the target behavior, that is, the target object is merely passing by the detected target. When it is determined that the spatio-temporal relationship does not satisfy the first preset condition, that is, a second preset condition is satisfied, the electronic device may determine that the current behavior of the target object may belong to the target behavior; at this time, the electronic device may acquire the behavior information of the object entering the target area. It can be understood that the behavior information at least includes human body posture. Referring to FIG. 4, steps 41 to 44 are included.
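One branch of the first preset condition (object area within the target area, bottom-edge distance over the threshold) can be sketched geometrically; boxes are assumed to be (x1, y1, x2, y2) in image coordinates with y2 as the bottom edge, which is an assumption about the representation:

```python
def passes_first_condition(obj, target, dist_threshold):
    """First-preset-condition sketch: the object area lies within the
    target area AND the distance between the two bottom edges exceeds
    the threshold -- interpreted as the object merely passing by.
    Boxes are (x1, y1, x2, y2); y2 is the bottom edge."""
    inside = (obj[0] >= target[0] and obj[1] >= target[1]
              and obj[2] <= target[2] and obj[3] <= target[3])
    bottom_gap = abs(target[3] - obj[3])
    return inside and bottom_gap > dist_threshold
```

When this returns False (the second preset condition), the pipeline would proceed to extract the object's behavior information as in steps 41 to 44.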
  • in step 41, the electronic device may obtain the positions of the key parts of the behavior information of the target object in each video frame.
  • a key point extraction model may be pre-stored in the electronic device; each video frame is then input to the key point extraction model, which extracts the key points of the target object in each video frame.
  • the key points may include left arm bone points, right arm bone points, left leg bone points, right leg bone points and trunk bone points.
  • in step 42, the electronic device can generate a one-dimensional vector from the key parts of the behavior information in each video frame according to a preset expression order; for the one-dimensional vector, refer to the vectors below the second-row and third-row graphics shown in FIG. 5, such as [63, 64, 97, 103, 121, 124].
  • the above expression order may be any of the following: left arm bone points, right arm bone points, left leg bone points, right leg bone points and trunk bone points; left arm bone points, right arm bone points, trunk bone points, left leg bone points and right leg bone points; or left arm bone points, trunk bone points, left leg bone points, right arm bone points and right leg bone points. That is to say, the arrangement order of the key points of the left and right arms, the left and right legs and the trunk can be adjusted, and the corresponding solutions fall within the protection scope of the present disclosure.
  • in step 43, the electronic device can concatenate the corresponding one-dimensional vectors of each video frame in the video image data to obtain one frame of RGB image; the RGB channels in the RGB image respectively correspond to the x, y and z axis coordinates of each key part of the behavior information.
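A minimal sketch of steps 42 and 43, assuming T video frames each providing K keypoints in the preset expression order with (x, y, z) coordinates. The array shapes and the use of NumPy are illustrative assumptions, not the patent's code.

```python
import numpy as np

def keypoints_to_rgb(frames: list) -> np.ndarray:
    """Concatenate the per-frame one-dimensional keypoint vectors into a
    single K x T x 3 'RGB' image: rows index keypoints (in the preset
    expression order), columns index video frames, and the three
    channels hold the x, y and z coordinates respectively."""
    return np.stack(frames, axis=1).astype(np.float32)

# 32 video frames, 15 keypoints per frame, (x, y, z) per keypoint.
frames = [np.random.rand(15, 3) for _ in range(32)]
img = keypoints_to_rgb(frames)   # shape (15, 32, 3)
```

The resulting pseudo-image can then be fed to an ordinary image classifier, which is one common motivation for this kind of skeleton-to-image encoding.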
  • in step 44, the electronic device may acquire the behavior information of the target object according to the RGB image.
  • in one example, the electronic device can perform classification based on a 3D-skeleton-point behavior detection method, including behavior expression based on key point coordinates (the effect is shown in the first row of FIG. 5), with spatial descriptors (leftmost figure in the third row of FIG. 5), geometric descriptors (middle figure in the third row of FIG. 5) and key frame descriptors (rightmost figure in the third row of FIG. 5). After considering the correlation of subspace key points to improve discrimination, and evaluating the matching degree of different video sequences based on a dynamic programming model, the behavior information of the target object can finally be obtained.
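The text does not spell out which dynamic programming model is used to compare video sequences; one common choice with that shape is dynamic time warping, sketched here on scalar per-frame descriptors purely for illustration.

```python
def dtw_distance(seq_a, seq_b, dist=lambda a, b: abs(a - b)):
    """Dynamic-programming (DTW) matching cost between two sequences,
    e.g. per-frame pose descriptors of two video clips. Lower cost
    means the sequences align better."""
    n, m = len(seq_a), len(seq_b)
    INF = float("inf")
    # D[i][j]: minimal cost of aligning the first i and first j elements.
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = dist(seq_a[i - 1], seq_b[j - 1])
            D[i][j] = cost + min(D[i - 1][j],      # skip a frame of seq_a
                                 D[i][j - 1],      # skip a frame of seq_b
                                 D[i - 1][j - 1])  # match both frames
    return D[n][m]
```

For real pose vectors, `dist` would be replaced by a vector distance such as the Euclidean norm between keypoint vectors.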
  • in step 13, when it is determined that the behavior information indicates that the object is climbing the detected target, the video frame where the object is located is marked.
  • the electronic device may determine whether the behavior information indicates that the object climbs the detected target, see FIG. 6 , including steps 61 and 62 .
  • in step 61, the electronic device may determine the position of a specified part of the target object based on the behavior information. Taking the specified part as the object's legs as an example, after the action of the target object is determined, the positions of the left leg and the right leg of the target object can be determined. Referring to FIG. 7, the right leg of the target object on the left side of the middle sculpture is within the target area, and both the left and right legs of the target object on the right side of that sculpture are within the target area; both legs of the target object at the left sculpture are within the target area.
  • in step 62, when the position of the specified part is within the target area and its distance from the bottom edge of the target area exceeds a set distance threshold, the electronic device may determine that the above behavior information indicates that the target object is climbing the detected target.
  • it can be understood that when the target object merely passes by the detected target, the bottom edge of the object area of the target object and the bottom edge of the target area theoretically overlap, i.e., the distance is 0. Considering the walking action of the target object, the legs are lifted to a certain height, which may cause the bottom edge of the object area to be slightly higher than the bottom edge of the target area, i.e., there is a certain distance between them (e.g., 10-30 cm, which can be set). The above distance threshold is therefore set to exclude the impact of objects merely passing by the detected target. In other words, when the position of the specified part is within the target area and its distance from the bottom edge of the target area exceeds the set distance threshold, the electronic device can determine that the target object is climbing the detected target.
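Step 62 can be sketched as a simple geometric test. The tuple layout of the target area and the coordinate convention (image y growing downward) are assumptions for illustration.

```python
def indicates_climbing(part_xy, target_box, dist_threshold):
    """target_box = (left, top, right, bottom) in image coordinates
    (y grows downward). Returns True when the specified part (e.g. a
    leg keypoint) lies inside the target area AND its distance above
    the target area's bottom edge exceeds dist_threshold, which filters
    out the small leg lift of a person merely walking past."""
    x, y = part_xy
    left, top, right, bottom = target_box
    inside = left <= x <= right and top <= y <= bottom
    lifted = (bottom - y) > dist_threshold
    return inside and lifted
```

With a threshold corresponding to roughly 10-30 cm, a leg raised well above the ground triggers the test, while a walking stride does not.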
  • in this embodiment, when it is determined that the target object is climbing the detected target, the video frame where the target object is located is marked.
  • in some examples, when marking the corresponding video frame, the facial image of the target object can also be extracted and associated with the video frame, so that when reviewing the video frame a manager can see the facial image at the same time and confirm the identity of the target object in a timely manner.
  • in this way, by marking video frames in the video image data, the preset target behavior (i.e., uncivilized behavior) can be discovered in time, improving management efficiency.
  • the electronic device may also generate an early warning signal, referring to FIG. 8 , including steps 81-83.
  • the electronic device may acquire a facial image of the target object.
  • the facial image can be obtained synchronously while identifying the head of the target object, or after it is determined that the current behavior of the target object is the target behavior. It can be understood that not every object located in the target area needs its behavior determined, so the latter approach acquires fewer facial images than the former, thereby reducing the amount of data processing.
  • in step 82, when the facial image satisfies a preset requirement, the electronic device can obtain an identification code matching the facial image; the preset requirement includes that the key points of the face can be obtained and that the confidence of the recognition result exceeds a set confidence threshold.
  • for example, the electronic device may acquire attribute information of the facial image, where the attribute information may include but is not limited to gender, age, height, skin color and facial key point positions. The electronic device can then generate an identification code matching the facial image according to the attribute information and store it in a designated database.
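One hedged way to realize steps 82 and 83 is sketched below: the attribute information is canonicalized and hashed into a code, which is then checked against the designated database (here, a set of staff codes). The hashing scheme and the field names are illustrative assumptions, not the patent's method.

```python
import hashlib
from typing import Optional, Set

def identification_code(attrs: dict) -> str:
    # Canonicalize the attribute information (gender, age, height, skin
    # color, facial key point positions, ...) and hash it into a short,
    # stable code. SHA-256 is an illustrative choice.
    canonical = "|".join(f"{k}={attrs[k]}" for k in sorted(attrs))
    return hashlib.sha256(canonical.encode()).hexdigest()[:16]

def maybe_warn(code: str, designated_db: Set[str]) -> Optional[str]:
    # No match in the designated database -> the person is a tourist,
    # not a manager, so early warning information is generated.
    if code not in designated_db:
        return "A tourist is climbing the sculpture, please pay attention."
    return None
```

A manager whose attributes are already registered produces no warning, while an unregistered visitor does.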
  • in step 83, when it is determined that no object in the designated database matches the above identification code, it can be determined that the target object is not a manager but a tourist. At this time, the electronic device can generate early warning information, e.g., "a tourist is climbing the sculpture, please pay attention". The electronic device can also provide the early warning information to the corresponding personnel, for example by notifying management personnel by phone or short message, or by directly calling the police.
  • an embodiment of the present disclosure also provides a climbing behavior early warning device, see FIG. 9 , the device includes:
  • a data acquisition module 91 configured to acquire video image data, the video image data including the detected target and at least one object;
  • An information acquisition module 92 configured to acquire behavior information of the object when it is determined that the object enters the target area corresponding to the detected target;
  • the video marking module 93 is configured to mark the video frame where the object is located when it is determined that the behavior information indicates that the object is climbing the detected target.
  • the information acquisition module includes:
  • an area acquisition submodule configured to acquire the target area where the detected target is located in the multiple video frames of the video image data, and to acquire the object area where the target object is located; the head of the target object is located in the target area;
  • a relationship acquisition submodule configured to acquire the spatio-temporal relationship between the object area and the target area; the spatio-temporal relationship refers to the relative positional relationship between the object area and the target area in space at different times;
  • an area determination submodule configured to determine, when it is determined that the spatio-temporal relationship satisfies a first preset condition, that the target object enters the target area;
  • the first preset condition includes at least one of the following: the object area is within the target area and the distance between the bottom edge of the object area and the bottom edge of the target area exceeds a set distance threshold; or the object area successively touches the edge of the target area and the two marker lines, and the distance between the bottom edge of the object area and the bottom edge of the target area exceeds the set distance threshold; wherein the two marker lines are set between the boundary of the target area and the detected target.
  • the spatio-temporal relationship includes at least one of the following:
  • the object area is within the target area; the object area successively touches the edge of the target area and the two marker lines; the object area successively touches the two marker lines and the edge of the target area; the distance between the bottom edge of the object area and the bottom edge of the target area exceeds the set distance threshold; the distance between the bottom edge of the object area and the bottom edge of the target area is less than the set distance threshold; the object area is outside the target area.
  • the region acquisition submodule includes:
  • a position acquisition unit configured to acquire the position of the head of each object in the multiple video frames in the video image data and the object area where each object is located;
  • the object selection unit is configured to select an object whose head is located in the target area as a target object, and obtain the object area where the target object is located.
  • the location acquisition unit includes:
  • a feature acquisition subunit configured to acquire preset image features of each video frame in the multiple video frames
  • a position prediction subunit configured to identify the recognized position of the head in the current video frame based on the preset image features, and predict the predicted position of the head in the next video frame;
  • a position acquisition subunit configured to match the identified position and the predicted position, and to update the predicted position to the identified position after the matching passes, so as to obtain the position of the same head in two adjacent video frames.
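The matching performed by the position acquisition subunit is described elsewhere in this document as a cosine test on feature vectors with a settable threshold (0.85 being the example value given). A minimal sketch, with the feature vectors themselves assumed to come from an upstream appearance model:

```python
import math

def cosine(u, v):
    # Cosine similarity between two equal-length feature vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def match_positions(identified_feat, predicted_feat, threshold=0.85):
    """Treat a head detection and a Kalman-predicted track as the same
    head when the cosine similarity of their feature vectors exceeds
    the threshold (0.85 is the example value in the description)."""
    return cosine(identified_feat, predicted_feat) > threshold
```

On a match, the predicted position is overwritten with the identified position, which keeps the track locked to the same head across adjacent frames.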
  • the information acquisition module includes:
  • the position acquisition sub-module is used to acquire the position of the key part of the behavior information of the target object in the multiple video frames of the video image data; the head of the target object is located in the target area; the behavior information includes the posture of the human body;
  • the vector generation sub-module is used to generate a one-dimensional vector for the key parts of the behavior information in each video frame according to the preset expression sequence;
  • an image acquisition submodule configured to concatenate the corresponding one-dimensional vectors of each video frame to obtain one frame of RGB image; the RGB channels in the RGB image respectively correspond to the x, y and z axis coordinates of each key part of the behavior information;
  • the behavior information acquisition sub-module is configured to acquire the behavior information of the target object according to the RGB image.
  • the behavior information includes human posture;
  • the video tagging module includes:
  • a position determining submodule configured to determine the position of a specified part of the target object based on the behavior information
  • a target determination submodule configured to determine, when the position of the specified part is within the target area and its distance from the bottom edge of the target area exceeds the set distance threshold, that the behavior information indicates that the target object is climbing the detected target.
  • the device also includes:
  • An image acquisition module configured to acquire the facial image of the target object
  • an identification code acquisition module configured to acquire, when the facial image satisfies a preset requirement, an identification code matching the facial image; the preset requirement includes that the key points of the face can be obtained and that the confidence of the recognition result exceeds a set confidence threshold;
  • the signal generation module is configured to generate warning information when it is determined that there is no object matching the identification code in the specified database.
  • an electronic device comprising:
  • a processor;
  • a memory for storing a computer program executable by the processor;
  • the processor is configured to execute the computer program in the memory, so as to realize the steps of the method as shown in FIG. 1 .
  • a computer-readable storage medium, such as a memory including instructions, where the above executable computer program can be executed by a processor to realize the steps of the method shown in FIG. 1.
  • the readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, or the like.


Abstract

A climbing behavior early warning method and apparatus, an electronic device and a storage medium. The method includes: acquiring video image data, the video image data including a detected target and at least one object (11); when it is determined that the object enters a target area corresponding to the detected target, acquiring behavior information of the object (12); and when it is determined that the behavior information indicates that the object is climbing the detected target, marking the video frame where the object is located (13). By marking video frames in the video image data, climbing of the detected target can be discovered in time, improving management efficiency.

Description

Climbing Behavior Early Warning Method and Apparatus, Electronic Device, Storage Medium. Technical Field
The present disclosure relates to the technical field of data processing, and in particular to a climbing behavior early warning method and apparatus, an electronic device and a storage medium.
Background Art
As the number of tourists in scenic spots increases, uncivilized tourist behaviors, such as spraying graffiti on cultural relics and climbing sculptures, also increase. Taking climbing a sculpture as an example, the sculpture may be damaged during the climb, the climbing tourist may be injured, and a bad impression is left on other tourists.
To discover and address such uncivilized behaviors in time, existing scenic spots usually install video surveillance systems, with security personnel watching the monitoring screens in real time so that uncivilized behaviors can be spotted promptly.
However, watching multiple scenes at the same time easily fatigues the security personnel, and since uncivilized behaviors are occasional, the accuracy of the resulting early warning is relatively poor.
Summary of the Invention
The present disclosure provides a climbing behavior early warning method and apparatus, an electronic device and a storage medium, to remedy the deficiencies of the related art.
根据本公开实施例的第一方面,提供一种攀爬行为预警方法,所述方法包括:
获取视频图像数据,所述视频图像数据包括被检测目标和至少一个对象;
当确定所述对象进入所述被检测目标对应的目标区域时,获取所述对象的行为信息;
当确定所述行为信息表征所述对象攀爬所述被检测目标时,标记所述对象所在视频帧。
可选地,确定所述对象进入所述被检测目标对应的目标区域,包括:
获取所述视频图像数据多视频帧中所述被检测目标所在的目标区域,以及获取目标对象所在的对象区域;所述目标对象的头部位于所述目标区域内;
获取所述对象区域和所述目标区域的时空关系;所述时空关系是指在不同时刻时所述对象区域和所述目标区域在空间上的相对位置关系;
当确定所述时空关系满足第一预设条件时,确定所述目标对象的进入所述目标区域;
第一预设条件包括以下至少一种:对象区域在目标区域之内且所述对象区域的底边与所述目标区域的底边的距离超过设定距离阈值,对象区域先后触碰所述目标区域的边缘和两条标识线且所述对象区域的底边与所述目标区域的底边的距离超过设定距离阈值;其中两条所述标识线设置在所述目标区域的连线与所述被检测目标之间。
可选地,所述时空关系包括以下至少一种:
对象区域在目标区域之内、对象区域先后触碰所述目标区域的边缘和两条标识线、对象区域先后触碰所述目标区域的两条标识线和边缘、对象区域的底边与目标区域的底边的距离超过设定距离阈值、对象区域的底边与目标区域的底边的距离小于设定距离阈值、对象区域在目标区域之外。
可选地,获取目标对象所在的对象区域,包括:
获取所述视频图像数据中多视频帧内各对象头部的位置和各对象所在的对象区域;
选取头部位于所述目标区域内的对象作为目标对象,并获取所述目标对象所在的对象区域。
可选地,获取所述视频图像数据中多视频帧内各对象头部的位置,包括:
获取所述多视频帧内各视频帧的预设图像特征;
基于所述预设图像特征识别当前视频帧中头部的识别位置,以及预测下一视频帧中头部的预测位置;
对所述识别位置和所述预测位置进行匹配,并当匹配通过后将所述预测位置更新为所述识别位置,获得相邻两帧视频帧中同一头部的位置。
可选地,获取所述对象的行为信息,包括:
获取所述视频图像数据多视频帧中目标对象的行为信息关键部位的位置;所述目标对象的头部位于所述目标区域内;所述行为信息包括人体姿态;
按照预设的表述顺序,将各视频帧中行为信息关键部位生成一维向量;
将各视频帧中对应一维向量进行级联,得到一帧RGB图像;所述RGB图像中RGB通道分别对应每个行为信息关键部位的xyz轴坐标;
根据所述RGB图像获取所述目标对象的行为信息。
可选地,确定所述行为信息表征所述对象攀爬所述被检测目标,包括:
基于所述行为信息确定目标对象的指定部位的位置;所述行为信息包括人体姿态;
当所述指定部位的位置位于所述目标区域之内且与所述目标区域的底边的距离超过设定距离阈值时,确定所述行为信息表征所述目标对象攀爬所述被检测目标。
可选地,标记所述对象所在视频帧之后,所述方法还包括:
获取目标对象的面部图像;
当所述面部图像满足预设要求时,获取与所述面部图像相匹配的识别码;所述预设要求包括能够面部的关键点且识别结果的置信度超过设定置信度阈值;
当确定指定数据库中不存在与所述识别码相匹配的对象时,生成预警信息。
根据本公开实施例的第二方面,提供一种攀爬行为预警装置,所述装置包括:
数据获取模块,用于获取视频图像数据,所述视频图像数据包括被检测目标和至少一个对象;
信息获取模块,用于当确定所述对象进入所述被检测目标对应的目标区域时,获取所述对象的行为信息;
视频标记模块,用于当确定所述行为信息表征所述对象攀爬所述被检测目标时,标记所述对象所在视频帧。
可选地,所述信息获取模块包括:
区域获取子模块,用于获取所述视频图像数据多视频帧中所述被检测目标所在的目标区域,以及获取目标对象所在的对象区域;所述目标对象的头部位于所述目标区域内;
关系获取子模块,用于获取所述对象区域和所述目标区域的时空关系;所述时空关系是指在不同时刻时所述对象区域和所述目标区域在空间上的相对位置关系;
区域确定子模块,用于当确定所述时空关系满足第一预设条件时,确定所述目标对象的进入所述目标区域;
第一预设条件包括以下至少一种:对象区域在目标区域之内且所述对象区域的底边与所述目标区域的底边的距离超过设定距离阈值,对象区域先后触碰所述目标区域的边缘和两条标识线且所述对象区域的底边与所述目标区域的底边的距离超过设定距离阈值;其中两条所述标识线设置在所述目标区域的连线与所述被检测目标之间。
可选地,所述时空关系包括以下至少一种:
对象区域在目标区域之内、对象区域先后触碰所述目标区域的边缘和两条标识线、对象区域先后触碰所述目标区域的两条标识线和边缘、对象区域的底边与目标区域的底边的距离超过设定距离阈值、对象区域的底边与目标区域的底边的距离小于设定距离阈值、对象区域在目标区域之外。
可选地,所述区域获取子模块包括:
位置获取单元,用于获取所述视频图像数据中多视频帧内各对象头部的位置和各对象所在的对象区域;
对象选取单元,用于选取头部位于所述目标区域内的对象作为目标对象,并获取所述目标对象所在的对象区域。
可选地,所述位置获取单元包括:
特征获取子单元,用于获取所述多视频帧内各视频帧的预设图像特征;
位置预测子单元,用于基于所述预设图像特征识别当前视频帧中头部的识别位置,以及预测下一视频帧中头部的预测位置;
位置获取子单元,用于对所述识别位置和所述预测位置进行匹配,并当匹配通过后将所述预测位置更新为所述识别位置,获得相邻两帧视频帧中同一头部的位置。
可选地,所述信息获取模块包括:
位置获取子模块,用于获取所述视频图像数据多视频帧中目标对象的行为信息关键部位的位置;所述目标对象的头部位于所述目标区域内;所述行为信息包括人体姿态;
向量生成子模块,用于按照预设的表述顺序,将各视频帧中行为信息关键部位生成一维向量;
图像获取子模块,用于将各视频帧中对应一维向量进行级联,得到一帧RGB图像;所述RGB图像中RGB通道分别对应每个行为信息关键部位的xyz轴坐标;
行为信息获取子模块,用于根据所述RGB图像获取所述目标对象的行为信息。
可选地,所述视频标记模块包括:
位置确定子模块,用于基于所述行为信息确定目标对象的指定部位的位置;所述行为信息包括人体姿态;
目标确定子模块,用于当所述指定部位的位置位于所述目标区域之内且与所述目标区域的底边的距离超过设定距离阈值时,确定所述行为信息表征所述目标对象攀爬所述被检测目标。
可选地,所述装置还包括:
图像获取模块,用于获取目标对象的面部图像;
识别码获取模块,用于当所述面部图像满足预设要求时,获取与所述面部图像相匹配的识别码;所述预设要求包括能够面部的关键点且识别结果的置信度超过设定置信度阈值;
信号生成模块,用于当确定指定数据库中不存在与所述识别码相匹配的对象时,生成预警信息。
根据本公开实施例的第三方面,提供一种电子设备,包括:
处理器;
用于存储所述处理器可执行的计算机程序的存储器;
其中,所述处理器被配置为执行所述存储器中的计算机程序,以实现上述的方法。
根据本公开实施例的第四方面,提供一种计算机可读存储介质,当所述存储介质中的可执行的计算机程序由处理器执行时,能够实现上述的方法。
本公开的实施例提供的技术方案可以包括以下有益效果:
由上述实施例可知,本公开实施例提供的方案可以获取视频图像数据;所述视频图像数据包括被检测目标和至少一个对象;当确定所述对象进入所述被检测目标对应的目标区域时,获取所述对象的行为信息;当确定所述行为信息表征所述对象攀爬所述被检测目标时,标记所述对象所在视频帧。这样,本实施例中通过标记视频图像数据中的视频帧,可以及时发现对象攀爬被检测目标的行为,提高管理效率。
应当理解的是,以上的一般描述和后文的细节描述仅是示例性和解释性的,并不能限制本公开。
附图说明
此处的附图被并入说明书中并构成本说明书的一部分,示出了符合本公开的实施例,并与说明书一起用于解释本公开的原理。
图1是根据一示例性实施例示出的一种攀爬行为预警方法的流程图。
图2是根据一示例性实施例示出的确定目标对象的当前行为的流程图。
图3是根据一示例性实施例示出的跟踪同一头部的流程图。
图4是根据一示例性实施例示出的目标对象的当前行为的流程图。
图5是根据一示例性实施例示出的获取目标对象的动作的效果示意图。
图6是根据一示例性实施例示出的确定行为信息是否表征对象攀爬被检测目标的流程图。
图7是根据一示例性实施例示出的对象区域和目标区域的时空关系的效果示意图。
图8是根据一示例性实施例示出的另一种攀爬行为预警方法的流程图。
图9是根据一示例性实施例示出的一种攀爬行为预警装置的框图。
具体实施方式
这里将详细地对示例性实施例进行说明,其示例表示在附图中。下面的描述涉及附图时,除非另有表示,不同附图中的相同数字表示相同或相似的要素。以下示例性所描述的实施例并不代表与本公开相一致的所有实施例。相反,它们仅是与如所附权利要求书中所详述的、本公开的一些方面相一致的装置例子。
为解决上述技术问题,本公开实施例提供了一种攀爬行为预警方法,适用于电子设备,图1是根据一示例性实施例示出的一种攀爬行为预警方法的流程图。参见图1,一种攀爬行为预警方法,包括步骤11~步骤13。
在步骤11中,获取视频图像数据,所述视频图像数据包括被检测目标和至少一个对象。
本实施例中,电子设备可以与摄像头连接,并接收摄像头输出的视频图像数据。即摄像头在开启状态下可以采集视频帧形成视频帧流,然后对视频帧进行编码和压缩等处理后再发送给电子设备。电子设备对接收到的图像数据进行解码等处理后即可获得上述视频图像数据。
考虑到本公开提供的方案是监测一些目标行为,如攀爬、涂鸦等不文明行为,因此上述摄像头的拍摄范围通常指向指定的被检测目标,其中被检测目标可以包括但不限于 景区的雕像、博物馆的文物、安全护栏等,或者说电子设备获取的视频图像数据包括被检测目标。
可理解的是,视频图像数据中可能包括对象也可能不包括对象,其中对象可以是游客或者管理人员。考虑到本公开提供的方案应用于包括对象的场景,因此后续实施例中仅考虑视频图像数据中包括至少一个对象的场景。
在步骤12中,当确定所述对象进入所述被检测目标对应的目标区域时,获取所述对象的行为信息。
本实施例中,电子设备可以处理上述视频图像数据,从而确定对象是否进入被检测目标对应的目标区域,参见图2,包括步骤21~步骤23。
在步骤21中,电子设备可以获取所述视频图像数据多视频帧中所述被检测目标所在的目标区域,以及获取目标对象所在的对象区域。
以获取目标区域为例,电子设备内可以预先存储目标识别模型,如卷积网络模型(CNN)。电子设备可以将视频图像数据的各视频帧输入到目标识别模型,该目标识别模型可以识别出视频图像数据的各视频帧中的被检测目标,然后根据被检测目标的形状生成最小外接矩形,那么视频帧中与该最小外接矩形对应的区域即是目标区域,也就是说,经过上述识别过程可以获得多视频帧中被检测目标所在的目标区域。可理解的是,上述最小外接矩形还可以采用其他预设形状替代,例如圆形、菱形等,在能够获得目标区域的情况下,相应方案落入本公开的保护范围。
以获取对象区域为例,电子设备内可以预先存储头部检测模型,如卷积网络模型,本示例中头部检测模型为基于CNN的轻量检测模型,可以适于电子设备的资源配置较低的场景,或者适用于对现有监控***中的升级改造场景。这样,本示例中通过设置上述轻量检测模型,即在减少轻量检测模型的参数量的情况下保持识别性能,可以使检测结果具有较高的置信度。
本示例中,轻量检测模型可以通过模型压缩(Model Compression)和模型剪枝(Pruning)。其中,模型压缩即在已经训练好的模型上进行参数压缩,使得模型携带更少的模型参数,从而减少占用较多内存的问题,达到提高处理效率的效果。
模型剪枝是指在保证CNN精度的前提下,保留重要的权重而去掉不重要的权重,通常情况下权重值越接近于0则该权重越不重要。模型剪枝可以包括:1、修改blob的结构或者不修改,直接定义对角线mask,将原来的矩阵改写成稀疏矩阵的存储方式;2、 采用新的方式来计算稀疏矩阵和向量的相乘。也就是说,在进行剪枝时,有两个出发点,一是从blob出发,修改blob;将对角线mask保存于blob结构中。基于blob的方式可以将对角线mask的相关运算直接运行在Cpu或者Gpu上,效率更高。二是从层layer出发,直接定义对角线mask,此方式较为简单,但效率相对较低。
需要说明的是,在设置剪枝率时,可以是设置全局剪枝率,也可以是针对每一层分别设置剪枝率。实际应用中,剪枝率的实际值可以通过实验法获得。
还需要说明的是,一般情况下,将非重要的权重去掉之后模型精度会下降。但是,去掉非重要权重之后模型稀疏性增加,从而可以减少模型的过拟合,并且在经过微调之后模型精度也会提升。
在进行剪枝时,有两个出发点,一是从blob出发,修改blob,将对角线mask保存于blob结构中,二是从layer出发,直接定义对角线mask。这两种方式各有各的特点,基于blob的方式可以将对角线mask的相关运算直接运行在cpu或者gpu上,效率更高,但是需要对源代码有较高的理解;而基于layer的方式较为简单,但效率相对较低。
本公开可以对上述轻量检测模型中的置信度作优化,如:首先,将头部的置信度阈值由预设数值(如0.7)逐渐降低直到头部检测结果的召回率超过召回率阈值。然后,结合头部跟踪模型的跟踪结果和上述检测结果,关注同一头部的召回率和精度,继续调整(微调)头部的置信度阈值,直至同一头部的召回率超过召回率阈值和精度超过精度阈值,例如召回率阈值和精度阈值两者的取值均超过0.98。这样,本示例中通过对头部的置信度进行优化,可以在跟踪目标对象的过程中达到同一个头部具有较好的召回率(recall)和精度(precision),最终达到召回率和精度相平衡的效果。
本示例中,电子设备可以将各视频帧输入到该轻量检测模型,该轻量检测模型可以检测出各视频帧中对象的头部,例如正面、背面、侧面、上面等各种角度的头部,并且基于头部和对象一一对应的关系结合对象的形状生成最小外接矩形以及各对象所在的对象区域,即电子设备可以获取视频图像数据中多视频帧内各对象头部的位置和各对象所在的对象区域。然后,电子设备可以结合上述目标区域,选择出头部位于目标区域之内的对象作为目标对象,同时可以选择出目标对象的最小外接矩形对应的对象区域,即获得目标对象所在的对象区域。
可理解的是,上述头部检测模型可以检测出各视频帧中对象的头部,但是无法 确定相邻2帧视频帧中的头部是否为同一对象。为此,电子设备获取各视频帧内头部的位置的过程中,可以包括获取同一对象的头部在不同视频帧的位置,参见图3,包括步骤31~步骤33。
在步骤31中,针对多视频帧中的各视频帧,电子设备可以获取当前视频帧的预设图像特征,如颜色特征或者方向梯度直方图特征,可根据具体场景选择预设的图像特征,在该预设图像特征能够有效区分不同对象的头部以及降低计算复杂度的情况下均落入本公开的保护范围。可理解的是,本步骤中通过降低计算复杂度,可以降低本公开方案对电子设备的资源的需求,有利于扩大本公开方案的应用范围。
在步骤32中,电子设备可以基于所述预设图像特征识别当前视频帧中头部的识别位置。步骤32中可以采用上述轻量检测模型实现,在此不再赘述。本步骤中通过轻量检测模型可以快速识别出头部的位置,有利于实现检测的实时性。
在步骤32中,电子设备还可以预测当前视频帧的下一视频帧中头部的预测位置。例如,电子设备可以采用基于卡尔曼滤波模型的快速跟踪处理视频帧,从而对头部的位置和头部的运动速度进行预设。需要说明的是,由于本示例中仅关注头部的预测位置,因此对于如何利用运动速度未作详述,可以根据卡尔曼滤波模型的需求进行处理,相应方案落入本公开的保护范围。
在步骤33中,电子设备可以对所述识别位置和所述预测位置进行匹配,其中匹配可以采用特征向量的余弦距离方式来实现,如当识别位置和预测位置对应的特征向量的余弦值超过余弦值阈值(可设置,如0.85以上)时,可以确定识别位置和预测位置通过匹配。当匹配通过后,电子设备可以将预测位置更新为识别位置,获得当前视频帧和下一视频帧中同一头部的位置。这样,本示例中通过跟踪同一头部,可以避免对象丢失,有利于提升检测准确度。
例如,电子设备进行头部跟踪的流程为:
视频帧Frame 0:头部检测模型检测到Frame 0中包括3个头部detections,当前没有任何tracks,将这3个detections初始化为tracks;
视频帧Frame 1:头部检测模型又检测到3个detections;对于Frame 0中的tracks先进行预测得到新的tracks;然后,将新的tracks与detections进行匹配,匹配模型可以包括使用匈牙利模型,从而得到(track,detection)匹配对;最后用每对匹配对中的detection更新对应的track。
在步骤22中,电子设备可以获取所述对象区域和所述目标区域的时空关系;所述时空关系是指在不同时刻时所述对象区域和所述目标区域在空间上的相对位置关系。
本实施例中,电子设备可以在目标区域的内部设置2条标志线,其中第1条标志线距离目标区域的边缘比第2条标志线近,即第2条标志线位于第1条标志线和被检测目标之间,原理如下包括:
(1)通过在目标区域的顶部边处设置两条水平标志线识别对象直接竖直进出目标区域的情况;
(2)通过在目标区域的左侧边处设置两条竖直标志线,识别对象从左侧平行进出目标区域的情况;
(3)通过在目标区域的右侧边处设置两条竖直标志线,识别对象从右侧与平行进出目标区域的情况;
(4)通过在目标区域的底部边处设置一条水平线,识别对象与地面的距离,从而区分对象路过被检测目标还是有可能攀爬被检测目标的行为。
本实施例中,电子设备可以根据2条标志线确定对象区域和目标区域的时空关系,其中上述时空关系是指在不同时刻时对象区域和目标区域在空间上的相对位置关系。其中,时空关系包括以下至少一种:对象区域在目标区域之内、对象区域先后触碰目标区域的边缘和两条标识线、对象区域先后触碰目标区域的两条标识线和边缘、对象区域的底边与目标区域的底边的距离超过设定距离阈值、对象区域的底边与目标区域的底边的距离小于设定距离阈值、对象区域在目标区域之外。
以对象区域进入目标区域为例,随着时间的推移对象区域会从目标区域的外部移动到目标区域的内部,即对象区域会先“触碰”到第1条标志线,然后再“触碰”到第2条标志线。以对象区域离开目标区域为例,随着时间的推移,对象区域会从目标区域的内部移动到目标区域的外部,即对象区域会先“触碰”到第2条标志线,然后再“触碰”到第1条标志线。
在步骤23中,当确定所述时空关系满足第一预设条件时,电子设备可以确定所述目标对象的当前行为不属于所述目标行为。
本实施例中,电子设备内可以预先存储第一预设条件,该第一预设条件包括以下至少一种:对象区域在目标区域之内且所述对象区域的底边与所述目标区域的底边的距离超过设定距离阈值,对象区域先后触碰所述目标区域的边缘和两条标识线且所述对 象区域的底边与所述目标区域的底边的距离超过设定距离阈值;其中两条所述标识线设置在所述目标区域的连线与所述被检测目标之间,可以根据具体场景设置该第一预设条件,在能够确定出目标对象属于路过被检测目标即不属于不文明行为的情况下,相应方案落入本公开的保护范围。
本实施例中,电子设备可以判断步骤22中所确定的时空关系是否满足第一预设条件。当确定时空关系满足第一预设条件,电子设备可以确定目标对象的当前行为不属于所述目标行为,即目标对象属于路过被检测目标。当确定时空关系不满足第一预设条件即满足第二预设条件时,电子设备可以确定目标对象的当前行为有可能属于所述目标行为,此时电子设备可以获取进入目标区域的对象的行为信息,可理解的是,该行为信息至少包括人体姿态。参见图4,包括步骤41~步骤44。
在步骤41中,针对视频图像数据多视频帧中各视频帧,电子设备可以获取各视频帧中目标对象的行为信息关键部位的位置。例如,电子设备内可以预先存储的关键点提取模型,然后将各视频帧输入到关键点提取模型,由关键点提取模型即可提取到各视频帧中的目标对象的关键点。其中,关键点可以包括左手臂骨骼点、右手臂骨骼点、左腿骨骼点、右腿骨骼点和躯干骨骼点。
在步骤42中,电子设备可以按照预设的表述顺序,将各视频帧中行为信息关键部位生成一维向量,一维向量可以参见图5所示第二行图形和第三行图形下方的向量,如[63,64,97,103,121,124]。其中,上述表达顺序可以包括以下至少一种:左手臂骨骼点、右手臂骨骼点、左腿骨骼点、右腿骨骼点和躯干骨骼点;左手臂骨骼点、右手臂骨骼点、躯干骨骼点、左腿骨骼点和右腿骨骼点;左手臂骨骼点、躯干骨骼点、左腿骨骼点、右手臂骨骼点和右腿骨骼点,也就是说调整左右手、左右腿和躯干的关键点的排列顺序,相应方案落入本公开的保护范围。
在步骤43中,电子设备可以将视频图像数据中各视频帧中对应一维向量进行级联,得到一帧RGB图像;所述RGB图像中RGB通道分别对应每个行为信息关键部位的xyz轴坐标。
在步骤44中,电子设备可以根据RGB图像获取目标对象的行为信息。在一示例中,电子设备可以基于3D骨骼点的行为行为信息检测方法进行分类,包括:基于关键点坐标的行为行为信息表达(效果如图5中第一行图形所示),包括空间描述子(效果如图5中第三行最左侧图形所示),几何描述子(效果如图5中第三行中间图形所示)、关键帧描述子(效果如图5中第三行最右侧图形所示);考虑子空间关键点的相关性来 提升判别度以及基于动态规划模型来考虑不同视频序列的匹配度等处理后,最终可获得目标对象的行为信息。
在步骤13中,当确定所述行为信息表征所述对象攀爬所述被检测目标时,标记所述对象所在视频帧。
本实施例中,在确定目标对象的行为信息后,电子设备可以确定该行为信息是否表征对象攀爬被检测目标,参见图6,包括步骤61和步骤62。在步骤61中,电子设备可以基于所述行为信息确定所述目标对象的指定部位的位置。以指定部位是对象的腿部为例,在确定出目标对象的动作之后,就可以确定出目标对象左腿和右腿的位置。参见图7,位于中间的雕塑左侧处目标对象的右腿部位于目标区域之内,雕塑右侧的目标对象的左腿和右腿部位于目标区域之内;位于左侧的雕塑处的目标对象的两腿部位均位于目标区域之内。需要说明的是,实际应用中,目标区域的无需显现,因此图7中目标区域的边均采用虚线表示,以方便理解本公开的方案。在步骤62中,当所述指定部位的位置位于所述目标区域之内且与所述目标区域的底边的距离超过设定距离阈值时,电子设备可以确定上述行为信息表征目标对象攀爬被检测目标。
可理解的是,当目标对象路过被检测目标时,目标对象的对象区域的底边与目标区域的底边理论上是重叠的,即距离是0;考虑到目标对象步行的动作,腿部会抬起一定的高度可能造成对象区域的底边略微高于目标区域的高度,即对象区域的底边与目标区域的底边之间存在一定的距离(如10~30cm,可设置),故设置上述设定距离阈值以保证排除对象路过被检测目标带来的影响。或者说,当指定部位的位置位于所述目标区域之内且与所述目标区域的底边的距离超过设定距离阈值时,电子设备就可以确定目标对象攀爬被检测目标。
本实施例中,当确定目标对象攀爬被检测目标时,标记目标对象所在的视频帧。在一些示例中,在标记对应的视频帧时,还可以提取目标对象的面部图像,并关联视频帧与面部图像,从而方便管理人员在回看上述视频帧可同时看到面部图像达到及时确认目标对象的身份的效果。这样,本实施例中通过标记视频图像数据中的视频帧,可以及时发现预设的目标行为(即不文明行为),提高管理效率。
在一实施例中,在步骤13之后,电子设备还可以生成预警信号,参见图8,包括步骤81~步骤83。
在步骤81中,电子设备可以获取目标对象的面部图像。其中,在识别目标对象 头部的过程中可以同步获得该面部图像,或者在确定目标对象的当前行为是目标行为之后获取面部图像。可理解的是,并不是所有位于目标区域内的对象均需要判断其行为,因此后者所需要获取的面部图像的数量要少于前者获取图像的数量,从而可以减少数据处理量。
在步骤82中,当面部图像满足预设要求时,电子设备可以获取与所述面部图像相匹配的识别码;所述预设要求包括获得面部的关键点且识别结果的置信度超过设定置信度阈值。例如,电子设备可以获取面积图像的属性信息,其中属性信息可以包括但不限于性别、年龄、身高、肤色以及面部关键点位置。然后,电子设备可以根据属性信息生成与面部图像相匹配的识别码,并存储到指定数据库中。
在步骤83中,当确定指定数据库中不存在与上述识别码相匹配的对象时,可以确定目标对象并不是管理人员而是游客,此时电子设备可以生成预警信息,如有游客正在攀爬雕塑,请保持关注。当然,电子设备还可以将上述预警信息提供给相应人员,例如,通过电话或者短消息方式通知管理人员,或者直接报警。
可见,本实施例中通过识别目标对象可以排除管理人员采用目标行为来维护被检测目标的场景,达到提升预警的准确度。
在上述实施例提供的一种攀爬行为预警方法的基础上,本公开实施例还提供了一种攀爬行为预警装置,参见图9,所述装置包括:
数据获取模块91,用于获取视频图像数据,所述视频图像数据包括被检测目标和至少一个对象;
信息获取模块92,用于当确定所述对象进入所述被检测目标对应的目标区域时,获取所述对象的行为信息;
视频标记模块93,用于当确定所述行为信息表征所述对象攀爬所述被检测目标时,标记所述对象所在视频帧。
在一实施例中,所述信息获取模块包括:
区域获取子模块,用于获取所述视频图像数据多视频帧中所述被检测目标所在的目标区域,以及获取目标对象所在的对象区域;所述目标对象的头部位于所述目标区域内;
关系获取子模块,用于获取所述对象区域和所述目标区域的时空关系;所述时 空关系是指在不同时刻时所述对象区域和所述目标区域在空间上的相对位置关系;
区域确定子模块,用于当确定所述时空关系满足第一预设条件时,确定所述目标对象的进入所述目标区域;
第一预设条件包括以下至少一种:对象区域在目标区域之内且所述对象区域的底边与所述目标区域的底边的距离超过设定距离阈值,对象区域先后触碰所述目标区域的边缘和两条标识线且所述对象区域的底边与所述目标区域的底边的距离超过设定距离阈值;其中两条所述标识线设置在所述目标区域的连线与所述被检测目标之间。
在一实施例中,所述时空关系包括以下至少一种:
对象区域在目标区域之内、对象区域先后触碰所述目标区域的边缘和两条标识线、对象区域先后触碰所述目标区域的两条标识线和边缘、对象区域的底边与目标区域的底边的距离超过设定距离阈值、对象区域的底边与目标区域的底边的距离小于设定距离阈值、对象区域在目标区域之外。
在一实施例中,所述区域获取子模块包括:
位置获取单元,用于获取所述视频图像数据中多视频帧内各对象头部的位置和各对象所在的对象区域;
对象选取单元,用于选取头部位于所述目标区域内的对象作为目标对象,并获取所述目标对象所在的对象区域。
在一实施例中,所述位置获取单元包括:
特征获取子单元,用于获取所述多视频帧内各视频帧的预设图像特征;
位置预测子单元,用于基于所述预设图像特征识别当前视频帧中头部的识别位置,以及预测下一视频帧中头部的预测位置;
位置获取子单元,用于对所述识别位置和所述预测位置进行匹配,并当匹配通过后将所述预测位置更新为所述识别位置,获得相邻两帧视频帧中同一头部的位置。
在一实施例中,所述信息获取模块包括:
位置获取子模块,用于获取所述视频图像数据多视频帧中目标对象的行为信息关键部位的位置;所述目标对象的头部位于所述目标区域内;所述行为信息包括人体姿态;
向量生成子模块,用于按照预设的表述顺序,将各视频帧中行为信息关键部位生成一维向量;
图像获取子模块,用于将各视频帧中对应一维向量进行级联,得到一帧RGB图像;所述RGB图像中RGB通道分别对应每个行为信息关键部位的xyz轴坐标;
行为信息获取子模块,用于根据所述RGB图像获取所述目标对象的行为信息。所述行为信息包括人体姿态;
在一实施例中,所述视频标记模块包括:
位置确定子模块,用于基于所述行为信息确定目标对象的指定部位的位置;
目标确定子模块,用于当所述指定部位的位置位于所述目标区域之内且与所述目标区域的底边的距离超过设定距离阈值时,确定所述行为信息表征所述目标对象攀爬所述被检测目标。
在一实施例中,所述装置还包括:
图像获取模块,用于获取目标对象的面部图像;
识别码获取模块,用于当所述面部图像满足预设要求时,获取与所述面部图像相匹配的识别码;所述预设要求包括能够面部的关键点且识别结果的置信度超过设定置信度阈值;
信号生成模块,用于当确定指定数据库中不存在与所述识别码相匹配的对象时,生成预警信息。
需要说明的是,本实施例中示出的装置与图1所示方法实施例的内容相匹配,可以参考上述方法实施例的内容,在此不再赘述。
在示例性实施例中,还提供了一种电子设备,包括:
处理器;
用于存储所述处理器可执行的计算机程序的存储器;
其中,所述处理器被配置为执行所述存储器中的计算机程序,以实现如图1所述方法的步骤。
在示例性实施例中,还提供了一种计算机可读存储介质,例如包括指令的存储器,上述可执行的计算机程序可由处理器执行,以实现如图1所述方法的步骤。其中, 可读存储介质可以是ROM、随机存取存储器(RAM)、CD-ROM、磁带、软盘和光数据存储设备等。
本领域技术人员在考虑说明书及实践这里公开的公开后,将容易想到本公开的其它实施方案。本公开旨在涵盖任何变型、用途或者适应性变化,这些变型、用途或者适应性变化遵循本公开的一般性原理并包括本公开未公开的本技术领域中的公知常识或惯用技术手段。说明书和实施例仅被视为示例性的,本公开的真正范围和精神由下面的权利要求指出。
应当理解的是,本公开并不局限于上面已经描述并在附图中示出的精确结构,并且可以在不脱离其范围进行各种修改和改变。本公开的范围仅由所附的权利要求来限制。

Claims (18)

  1. 一种攀爬行为预警方法,其特征在于,所述方法包括:
    获取视频图像数据,所述视频图像数据包括被检测目标和至少一个对象;
    当确定所述对象进入所述被检测目标对应的目标区域时,获取所述对象的行为信息;
    当确定所述行为信息表征所述对象攀爬所述被检测目标时,标记所述对象所在视频帧。
  2. 根据权利要求1所述的方法,其特征在于,确定所述对象进入所述被检测目标对应的目标区域,包括:
    获取所述视频图像数据多视频帧中所述被检测目标所在的目标区域,以及获取目标对象所在的对象区域;所述目标对象的头部位于所述目标区域内;
    获取所述对象区域和所述目标区域的时空关系;所述时空关系是指在不同时刻时所述对象区域和所述目标区域在空间上的相对位置关系;
    当确定所述时空关系满足第一预设条件时,确定所述目标对象的进入所述目标区域;
    第一预设条件包括以下至少一种:对象区域在目标区域之内且所述对象区域的底边与所述目标区域的底边的距离超过设定距离阈值,对象区域先后触碰所述目标区域的边缘和两条标识线且所述对象区域的底边与所述目标区域的底边的距离超过设定距离阈值;其中两条所述标识线设置在所述目标区域的连线与所述被检测目标之间。
  3. 根据权利要求2所述的方法,其特征在于,所述时空关系包括以下至少一种:
    对象区域在目标区域之内、对象区域先后触碰所述目标区域的边缘和两条标识线、对象区域先后触碰所述目标区域的两条标识线和边缘、对象区域的底边与目标区域的底边的距离超过设定距离阈值、对象区域的底边与目标区域的底边的距离小于设定距离阈值、对象区域在目标区域之外。
  4. 根据权利要求2所述的方法,其特征在于,获取目标对象所在的对象区域,包括:
    获取所述视频图像数据中多视频帧内各对象头部的位置和各对象所在的对象区域;
    选取头部位于所述目标区域内的对象作为目标对象,并获取所述目标对象所在的对象区域。
  5. 根据权利要求4所述的方法,其特征在于,获取所述视频图像数据中多视频帧内各对象头部的位置,包括:
    获取所述多视频帧内各视频帧的预设图像特征;
    基于所述预设图像特征识别当前视频帧中头部的识别位置,以及预测下一视频帧中 头部的预测位置;
    对所述识别位置和所述预测位置进行匹配,并当匹配通过后将所述预测位置更新为所述识别位置,获得相邻两帧视频帧中同一头部的位置。
  6. 根据权利要求1所述的方法,其特征在于,获取所述对象的行为信息,包括:
    获取所述视频图像数据多视频帧中目标对象的行为信息关键部位的位置;所述目标对象的头部位于所述目标区域内;所述行为信息包括人体姿态;
    按照预设的表述顺序,将各视频帧中行为信息关键部位生成一维向量;
    将各视频帧中对应一维向量进行级联,得到一帧RGB图像;所述RGB图像中RGB通道分别对应每个行为信息关键部位的xyz轴坐标;
    根据所述RGB图像获取所述目标对象的行为信息。
  7. 根据权利要求1所述的方法,其特征在于,确定所述行为信息表征所述对象攀爬所述被检测目标,包括:
    基于所述行为信息确定目标对象的指定部位的位置;所述行为信息包括人体姿态;
    当所述指定部位的位置位于所述目标区域之内且与所述目标区域的底边的距离超过设定距离阈值时,确定所述行为信息表征所述目标对象攀爬所述被检测目标。
  8. 根据权利要求1所述的方法,其特征在于,标记所述对象所在视频帧之后,所述方法还包括:
    获取目标对象的面部图像;
    当所述面部图像满足预设要求时,获取与所述面部图像相匹配的识别码;所述预设要求包括能够面部的关键点且识别结果的置信度超过设定置信度阈值;
    当确定指定数据库中不存在与所述识别码相匹配的对象时,生成预警信息。
  9. 一种攀爬行为预警装置,其特征在于,所述装置包括:
    数据获取模块,用于获取视频图像数据,所述视频图像数据包括被检测目标和至少一个对象;
    信息获取模块,用于当确定所述对象进入所述被检测目标对应的目标区域时,获取所述对象的行为信息;
    视频标记模块,用于当确定所述行为信息表征所述对象攀爬所述被检测目标时,标记所述对象所在视频帧。
  10. 根据权利要求9所述的装置,其特征在于,所述信息获取模块包括:
    区域获取子模块,用于获取所述视频图像数据多视频帧中所述被检测目标所在的目标区域,以及获取目标对象所在的对象区域;所述目标对象的头部位于所述目标区域内;
    关系获取子模块,用于获取所述对象区域和所述目标区域的时空关系;所述时空关系是指在不同时刻时所述对象区域和所述目标区域在空间上的相对位置关系;
    区域确定子模块,用于当确定所述时空关系满足第一预设条件时,确定所述目标对象的进入所述目标区域;
    第一预设条件包括以下至少一种:对象区域在目标区域之内且所述对象区域的底边与所述目标区域的底边的距离超过设定距离阈值,对象区域先后触碰所述目标区域的边缘和两条标识线且所述对象区域的底边与所述目标区域的底边的距离超过设定距离阈值;其中两条所述标识线设置在所述目标区域的连线与所述被检测目标之间。
  11. 根据权利要求10所述的装置,其特征在于,所述时空关系包括以下至少一种:
    对象区域在目标区域之内、对象区域先后触碰所述目标区域的边缘和两条标识线、对象区域先后触碰所述目标区域的两条标识线和边缘、对象区域的底边与目标区域的底边的距离超过设定距离阈值、对象区域的底边与目标区域的底边的距离小于设定距离阈值、对象区域在目标区域之外。
  12. 根据权利要求9所述的装置,其特征在于,所述区域获取子模块包括:
    位置获取单元,用于获取所述视频图像数据中多视频帧内各对象头部的位置和各对象所在的对象区域;
    对象选取单元,用于选取头部位于所述目标区域内的对象作为目标对象,并获取所述目标对象所在的对象区域。
  13. 根据权利要求12所述的装置,其特征在于,所述位置获取单元包括:
    特征获取子单元,用于获取所述多视频帧内各视频帧的预设图像特征;
    位置预测子单元,用于基于所述预设图像特征识别当前视频帧中头部的识别位置,以及预测下一视频帧中头部的预测位置;
    位置获取子单元,用于对所述识别位置和所述预测位置进行匹配,并当匹配通过后将所述预测位置更新为所述识别位置,获得相邻两帧视频帧中同一头部的位置。
  14. 根据权利要求9所述的装置,其特征在于,所述信息获取模块包括:
    位置获取子模块,用于获取所述视频图像数据多视频帧中目标对象的行为信息关键部位的位置;所述目标对象的头部位于所述目标区域内;所述行为信息包括人体姿态;
    向量生成子模块,用于按照预设的表述顺序,将各视频帧中行为信息关键部位生成一维向量;
    图像获取子模块,用于将各视频帧中对应一维向量进行级联,得到一帧RGB图像;所述RGB图像中RGB通道分别对应每个行为信息关键部位的xyz轴坐标;
    行为信息获取子模块,用于根据所述RGB图像获取所述目标对象的行为信息。
  15. 根据权利要求9所述的装置,其特征在于,所述视频标记模块包括:
    位置确定子模块,用于基于所述行为信息确定目标对象的指定部位的位置;所述行为信息包括人体姿态;
    目标确定子模块,用于当所述指定部位的位置位于所述目标区域之内且与所述目标区域的底边的距离超过设定距离阈值时,确定所述行为信息表征所述目标对象攀爬所述被检测目标。
  16. 根据权利要求9所述的装置,其特征在于,所述装置还包括:
    图像获取模块,用于获取目标对象的面部图像;
    识别码获取模块,用于当所述面部图像满足预设要求时,获取与所述面部图像相匹配的识别码;所述预设要求包括能够面部的关键点且识别结果的置信度超过设定置信度阈值;
    信号生成模块,用于当确定指定数据库中不存在与所述识别码相匹配的对象时,生成预警信息。
  17. 一种电子设备,其特征在于,包括:
    处理器;
    用于存储所述处理器可执行的计算机程序的存储器;
    其中，所述处理器被配置为执行所述存储器中的计算机程序，以实现如权利要求1～8任一项所述的方法。
  18. 一种计算机可读存储介质,其特征在于,当所述存储介质中的可执行的计算机程序由处理器执行时,能够实现如权利要求1~8任一项所述的方法。
PCT/CN2021/107847 2021-07-22 2021-07-22 攀爬行为预警方法和装置、电子设备、存储介质 WO2023000253A1 (zh)

Priority Applications (5)

Application Number Priority Date Filing Date Title
CN202180001950.4A CN115917589A (zh) 2021-07-22 2021-07-22 攀爬行为预警方法和装置、电子设备、存储介质
EP21950508.8A EP4336491A4 (en) 2021-07-22 2021-07-22 METHOD AND DEVICE FOR EARLY WARNING OF CLIMBING BEHAVIOR, ELECTRODE DEVICE AND STORAGE MEDIUM
PCT/CN2021/107847 WO2023000253A1 (zh) 2021-07-22 2021-07-22 攀爬行为预警方法和装置、电子设备、存储介质
US17/971,498 US11990010B2 (en) 2021-07-22 2022-10-21 Methods and apparatuses for early warning of climbing behaviors, electronic devices and storage media
US18/144,366 US20230316760A1 (en) 2021-07-22 2023-05-08 Methods and apparatuses for early warning of climbing behaviors, electronic devices and storage media

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/107847 WO2023000253A1 (zh) 2021-07-22 2021-07-22 攀爬行为预警方法和装置、电子设备、存储介质

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/971,498 Continuation-In-Part US11990010B2 (en) 2021-07-22 2022-10-21 Methods and apparatuses for early warning of climbing behaviors, electronic devices and storage media

Publications (1)

Publication Number Publication Date
WO2023000253A1 true WO2023000253A1 (zh) 2023-01-26

Family ID=84980330

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/107847 WO2023000253A1 (zh) 2021-07-22 2021-07-22 Climbing behavior early warning method and apparatus, electronic device, and storage medium

Country Status (4)

Country Link
US (1) US11990010B2 (zh)
EP (1) EP4336491A4 (zh)
CN (1) CN115917589A (zh)
WO (1) WO2023000253A1 (zh)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116403360A (zh) * 2023-06-06 2023-07-07 天一智能科技(东莞)有限公司 Multifunctional smart pole control method and system with early warning function
CN117115924B (zh) * 2023-10-23 2024-01-23 中建三局集团华南有限公司 AI-based climbing scaffold monitoring method and system
CN117115863B (zh) * 2023-10-24 2024-02-09 中建三局集团华南有限公司 Intelligent monitoring method and system for climbing scaffolds

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5594422A (en) * 1994-05-19 1997-01-14 Comsis Corporation Universally accessible smoke detector
US20130046462A1 (en) * 2011-08-15 2013-02-21 Honeywell International Inc. Aircraft vision system including a runway position indicator
CN110263623A (zh) * 2019-05-07 2019-09-20 平安科技(深圳)有限公司 Train climbing monitoring method, apparatus, terminal and storage medium
CN110942582A (zh) * 2019-12-23 2020-03-31 福建省特种设备检验研究院 Machine-vision-based monitoring and alarm method for abnormal behavior of persons on escalator handrails
CN111191511A (zh) * 2019-12-03 2020-05-22 北京联合大学 Dynamic real-time behavior recognition method and system for prisons
CN111209774A (zh) * 2018-11-21 2020-05-29 杭州海康威视数字技术股份有限公司 Target behavior recognition and display method, apparatus, device, and readable medium
CN111931633A (zh) * 2020-08-05 2020-11-13 珠海完全网络科技有限公司 Behavior analysis and micro-expression analysis method based on video recognition
CN112183317A (zh) * 2020-09-27 2021-01-05 武汉大学 Method for detecting violation behaviors at live-line work sites based on a spatio-temporal graph convolutional neural network
JP2021026292A (ja) * 2019-07-31 2021-02-22 Kddi株式会社 Sports action recognition apparatus, method and program
CN113052139A (zh) * 2021-04-25 2021-06-29 合肥中科类脑智能技术有限公司 Climbing behavior detection method and system based on a deep-learning two-stream network

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8711217B2 (en) * 2000-10-24 2014-04-29 Objectvideo, Inc. Video surveillance system employing video primitives
US10140832B2 (en) 2016-01-26 2018-11-27 Flir Systems, Inc. Systems and methods for behavioral based alarms
US10157532B2 (en) * 2016-12-21 2018-12-18 Walmart Apollo, Llc Detection system for unsafe activity at a shelving unit
CN109522793B (zh) 2018-10-10 2021-07-23 华南理工大学 Machine-vision-based method for detecting and recognizing abnormal behaviors of multiple persons
CN109754411A (zh) 2018-11-22 2019-05-14 济南艾特网络传媒有限公司 Method and system for detecting stair-climbing and window-entering theft behavior based on optical-flow target tracking
CN110378259A (zh) 2019-07-05 2019-10-25 桂林电子科技大学 Multi-target behavior recognition method and system for surveillance video
CN110598596A (zh) 2019-08-29 2019-12-20 深圳市中电数通智慧安全科技股份有限公司 Dangerous behavior monitoring method, apparatus and electronic device
CN110446015A (zh) 2019-08-30 2019-11-12 北京青岳科技有限公司 Abnormal behavior monitoring method and system based on computer vision

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
GUAN DA; YAN LEI; YANG YIBO; XU WENFU: "A small climbing robot for the intelligent inspection of nuclear power plants", 2014 4TH IEEE INTERNATIONAL CONFERENCE ON INFORMATION SCIENCE AND TECHNOLOGY, IEEE, 26 April 2014 (2014-04-26), pages 484 - 487, XP032657284, DOI: 10.1109/ICIST.2014.6920522 *
LI WEI, CHEN TAIXING; LI ZHE: "Abnormal Event Detection of Regional Weather Station Based on Intelligent Video Analysis", HAINAN DAXUE XUEBAO (ZIRAN KEXUE BAN) -NATURAL SCIENCE JOURNAL OF HAINAN UNIVERSITY, HAINAN DAXUE XUEBAO BIANJIBU, CN, vol. 39, no. 1, 31 March 2021 (2021-03-31), CN , pages 36 - 43, XP093026051, ISSN: 1004-1729, DOI: 10.15886/j.cnki.hdxbzkb.2021.0006 *
See also references of EP4336491A4 *

Also Published As

Publication number Publication date
US20230039549A1 (en) 2023-02-09
EP4336491A1 (en) 2024-03-13
US11990010B2 (en) 2024-05-21
EP4336491A4 (en) 2024-04-24
CN115917589A (zh) 2023-04-04

Similar Documents

Publication Publication Date Title
WO2023000253A1 (zh) Climbing behavior early warning method and apparatus, electronic device, and storage medium
US11468660B2 (en) Pixel-level based micro-feature extraction
CN108053427B (zh) Improved multi-target tracking method, system and apparatus based on KCF and Kalman filtering
CN108062349B (zh) Video surveillance method and system based on structured video data and deep learning
US11354901B2 (en) Activity recognition method and system
CN108009473B (zh) Video structuring method, system and storage apparatus based on target behavior attributes
WO2018188453A1 (zh) Method for determining a face region, storage medium, and computer device
WO2016066038A1 (zh) Image subject extraction method and system
WO2020042419A1 (zh) Gait-based identity recognition method and apparatus, and electronic device
US20190130580A1 (en) Methods and systems for applying complex object detection in a video analytics system
CN110532970B (zh) Age and gender attribute analysis method, system, device and medium for 2D face images
WO2016149938A1 (zh) Video surveillance method, video surveillance system, and computer program product
CN107256386A (zh) Human behavior analysis method based on deep learning
WO2021218671A1 (zh) Target tracking method and apparatus, storage medium, and computer program
CN106778637B (zh) Method for counting male and female passenger flow
CN110781844A (zh) Security patrol monitoring method and apparatus
CN112700568B (zh) Identity authentication method, device, and computer-readable storage medium
CN114627339A (zh) Intelligent recognition and tracking method for border-crossing personnel in dense jungle areas, and storage medium
CN114359825A (zh) Monitoring method and related products
CN113177439A (zh) Method for detecting pedestrians climbing over road guardrails
US20160140395A1 (en) Adaptive sampling for efficient analysis of ego-centric videos
EP4287145A1 (en) Statistical model-based false detection removal algorithm from images
JP6851246B2 (ja) 物体検出装置
CN113139504B (zh) Identity recognition method, apparatus, device, and storage medium
CN110751034B (zh) Pedestrian behavior recognition method and terminal device

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase

Ref document number: 2021950508

Country of ref document: EP

ENP Entry into the national phase

Ref document number: 2021950508

Country of ref document: EP

Effective date: 20231208

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21950508

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE