CN111160314B - Violent sorting identification method and device - Google Patents


Info

Publication number
CN111160314B
CN111160314B (application CN202010005394.5A)
Authority
CN
China
Prior art keywords
video information
article
sorting
determining
violent
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010005394.5A
Other languages
Chinese (zh)
Other versions
CN111160314A (en)
Inventor
刘永霞
汪建新
吴明辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Miaozhen Information Technology Co Ltd
Original Assignee
Miaozhen Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Miaozhen Information Technology Co Ltd filed Critical Miaozhen Information Technology Co Ltd
Priority to CN202010005394.5A priority Critical patent/CN111160314B/en
Publication of CN111160314A publication Critical patent/CN111160314A/en
Application granted granted Critical
Publication of CN111160314B publication Critical patent/CN111160314B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 - Scenes; Scene-specific elements
    • G06V20/50 - Context or environment of the image
    • G06V20/52 - Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 - Scenes; Scene-specific elements
    • G06V20/40 - Scenes; Scene-specific elements in video content
    • G06V20/41 - Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G06V20/42 - Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items of sport video content

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The application provides a violent sorting identification method and device. The method comprises: first, acquiring video information of the article sorting process; inputting the video information into a pre-trained violent sorting recognition model and determining a violent sorting recognition result corresponding to the video information; then, in the case that the violent sorting recognition result of the video information is violent sorting, determining the residence time in the air of the article in the video information; and finally, determining a final violent sorting recognition result of the video information based on the residence time. In this process, video information of the article sorting process is acquired so that suspected violent sorting video information can be determined, and violent sorting behavior is finally confirmed based on the residence time in the air of the article in that video, so that the accuracy of identifying violent sorting behavior is greatly improved.

Description

Violent sorting identification method and device
Technical Field
The application relates to the technical field of monitoring, in particular to a method and a device for identifying violent sorting.
Background
With the continuous growth of online shopping, the load on the express delivery industry has increased steadily. During online shopping peaks and holidays in particular, parcel volume rises sharply, and repeated news reports of violent sorting of parcels have, to a certain extent, harmed the express delivery industry and even online shopping websites.
At present, judging whether parcels are sorted violently mostly depends on manual review, which is costly and whose accuracy cannot be guaranteed.
Therefore, how to accurately identify violent sorting behavior is a problem to be solved.
Disclosure of Invention
Therefore, the application aims to provide a violent sorting identification method and device, which can improve the accuracy of identifying violent sorting behavior and thereby reduce its occurrence.
In a first aspect, an embodiment of the present application provides a method for identifying violent sorting, including:
acquiring video information corresponding to each article during the sorting of a plurality of articles, wherein the video information corresponding to each article comprises a plurality of video images;
sequentially inputting a plurality of video images corresponding to each article into a pre-trained violent sorting recognition model, and determining violent sorting recognition results corresponding to each article;
in the case that the violent sorting identification result of any article is violent sorting, determining the residence time of that article in the air based on the video information of that article;
based on the residence time, a final violent sorting identification result of the article is determined.
In an alternative embodiment, the acquiring video information during the sorting of the articles includes:
acquiring original video information of a plurality of articles in a sorting process;
and sequentially carrying out object identification on each frame of image in the original video information, and intercepting the video information corresponding to each object in the sorting process from the original video information based on an object identification result.
In an optional implementation manner, the capturing the video information corresponding to each article in the sorting process from the original video information includes:
intercepting original videos corresponding to the articles in the sorting process from the original video information;
and performing interval sampling on video images in the original video corresponding to each object respectively, and obtaining video information corresponding to each object based on a plurality of video images obtained by interval sampling.
In an optional implementation manner, the sequentially performing object identification on each frame of image in the video information includes:
sequentially inputting each frame of image into a pre-trained first recognition model according to the order of the timestamps of the frames in the video information, and determining the position information of the article in each frame of image;
Based on the article identification result, intercepting the video information corresponding to each article in the sorting process from the video information, including:
determining images corresponding to the sorting of the articles respectively based on the position information of the articles in the images of the frames;
for each item, video information corresponding to the item is generated based on an image corresponding to the item when the item is sorted.
In an alternative embodiment, the behavior recognition model includes one or more of the following:
a recurrent neural network (RNN), a long short-term memory network (LSTM), and a gated recurrent unit (GRU).
In an alternative embodiment, the determining the residence time of the object in the video information in the air includes:
sequentially inputting each frame of image in the video information into a pre-trained second recognition model, and acquiring an article and a human hand recognition result corresponding to each frame of image in the video information;
determining a first target image when a human hand is separated from the object and a second target image when the object falls to the ground from each frame image in the video information based on the object and human hand identification results respectively corresponding to each frame image;
The dwell time is determined based on the time stamps of the first target image and the second target image.
In an alternative embodiment, the determining the final violent sorting recognition result of the video information based on the residence time includes:
comparing the residence time with a preset residence time threshold;
when the residence time is greater than the residence time threshold, determining that the violent sorting identification result of the video information is violent sorting;
and when the residence time is less than or equal to the residence time threshold, determining that the violent sorting identification result of the video information is non-violent sorting.
In a second aspect, an embodiment of the present application further provides an apparatus for identifying violent sorting, where the apparatus includes: the device comprises an acquisition module, a first determination module, a second determination module and a third determination module, wherein:
the acquisition module is used for acquiring video information corresponding to each article during the sorting of a plurality of articles, wherein the video information corresponding to each article comprises a plurality of video images;
the first determining module is used for sequentially inputting a plurality of video images corresponding to each article into a pre-trained violent sorting and identifying model and determining violent sorting and identifying results corresponding to each article;
the second determining module is used for, in the case that the violent sorting identification result of any article is violent sorting, determining the residence time of that article in the air based on the video information of that article;
and a third determining module for determining a final violent sorting recognition result of the article based on the residence time.
In an alternative embodiment, the acquiring module is specifically configured to, when acquiring video information during the sorting process of the articles:
acquiring original video information of a plurality of articles in a sorting process;
and sequentially carrying out object identification on each frame of image in the original video information, and intercepting the video information corresponding to each object in the sorting process from the original video information based on an object identification result.
In an alternative embodiment, the obtaining module is configured to, when intercepting, from the original video information, the video information corresponding to each article in the sorting process, respectively:
intercepting original videos corresponding to the articles in the sorting process from the original video information;
and performing interval sampling on video images in the original video corresponding to each object respectively, and obtaining video information corresponding to each object based on a plurality of video images obtained by interval sampling.
In an optional implementation manner, the acquiring module is configured to, when sequentially identifying the objects in each frame of image in the video information:
sequentially inputting each frame of image into a pre-trained first recognition model according to the order of the timestamps of the frames in the video information, and determining the position information of the article in each frame of image;
based on the article identification result, intercepting the video information corresponding to each article in the sorting process from the video information, including:
determining images corresponding to the sorting of the articles respectively based on the position information of the articles in the images of the frames;
for each item, video information corresponding to the item is generated based on an image corresponding to the item when the item is sorted.
In an alternative embodiment, the behavior recognition model includes one or more of the following:
a recurrent neural network (RNN), a long short-term memory network (LSTM), and a gated recurrent unit (GRU).
In an alternative embodiment, the second determining module is configured, when determining a residence time of the object in the air in the video information, to:
sequentially inputting each frame of image in the video information into a pre-trained second recognition model, and acquiring an article and a human hand recognition result corresponding to each frame of image in the video information;
Determining a first target image when a human hand is separated from the object and a second target image when the object falls to the ground from each frame image in the video information based on the object and human hand identification results respectively corresponding to each frame image;
the dwell time is determined based on the time stamps of the first target image and the second target image.
In an alternative embodiment, the third determining module is specifically configured to, when determining the final violent sorting recognition result of the video information based on the residence time:
comparing the residence time with a preset residence time threshold;
when the residence time is greater than the residence time threshold, determining that the violent sorting identification result of the video information is violent sorting;
and when the residence time is less than or equal to the residence time threshold, determining that the violent sorting identification result of the video information is non-violent sorting.
In a third aspect, an embodiment of the present application further provides a computer apparatus, including: a processor, a memory and a bus, the memory storing machine-readable instructions executable by the processor, the processor and the memory in communication via the bus when the computer device is running, the machine-readable instructions when executed by the processor performing the steps of the first aspect, or any of the possible implementations of the first aspect.
In a fourth aspect, embodiments of the present application also provide a computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the first aspect, or any of the possible implementations of the first aspect.
After video information corresponding to each article during the sorting of a plurality of articles is acquired, the plurality of video images corresponding to each article are sequentially input into a pre-trained violent sorting recognition model, and a violent sorting recognition result for each article is determined; then, in the case that the violent sorting recognition result of any article is violent sorting, the residence time of that article in the air is determined based on its video information, and the final violent sorting recognition result of the article is determined based on that residence time. In this process, video information suspected of violent sorting is first screened out by the violent sorting recognition model, and the final violent sorting recognition result is then determined based on the residence time of the article in the air, so that the accuracy of identifying violent sorting behavior is greatly improved.
In order to make the above objects, features and advantages of the present application more comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the embodiments will be briefly described below, it being understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered as limiting the scope, and other related drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart illustrating a method of identifying violent sorting in accordance with an embodiment of the present application;
fig. 2 is a schematic structural diagram of an identification device for violent sorting according to an embodiment of the present application;
fig. 3 shows a schematic structural diagram of a computer device according to an embodiment of the present application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are only some embodiments of the present application, not all embodiments. The components of the embodiments of the present application generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the application, as presented in the figures, is not intended to limit the scope of the application, as claimed, but is merely representative of selected embodiments of the application. All other embodiments, which can be made by a person skilled in the art without making any inventive effort, are intended to be within the scope of the present application.
In the prior art, the identification of violent sorting behavior is mostly performed manually, which suffers from high cost and limited efficiency and accuracy.
Based on this, in the method and device for identifying violent sorting provided by the embodiments of the present application, after video information corresponding to each article during the sorting of a plurality of articles is acquired, the plurality of video images corresponding to each article are sequentially input into a pre-trained violent sorting recognition model, and a violent sorting recognition result for each article is determined; then, in the case that the violent sorting recognition result of any article is violent sorting, the residence time of that article in the air is determined based on its video information, and the final violent sorting recognition result of the article is determined based on that residence time. In this process, video information suspected of violent sorting is first screened out by the violent sorting recognition model, and the final violent sorting recognition result is then determined based on the residence time of the article in the air, so that the accuracy of identifying violent sorting behavior is greatly improved.
It should be noted that: like reference numerals and letters denote like items in the following figures, and thus once an item is defined in one figure, no further definition or explanation thereof is necessary in the following figures.
To facilitate understanding of the present embodiment, the method for identifying violent sorting disclosed in the embodiments of the present application is first described in detail. The execution subject of this method is generally a computer information retrieval system; in particular, the execution subject may also be another computer device.
Example 1
Referring to fig. 1, a flowchart of a method for identifying violent sorting according to a first embodiment of the present application is shown, where the method includes steps S101 to S104, where:
S101: acquiring video information corresponding to each article during the sorting of a plurality of articles; wherein the video information corresponding to each article comprises a plurality of video images;
S102: sequentially inputting the plurality of video images corresponding to each article into a pre-trained violent sorting recognition model, and determining a violent sorting recognition result corresponding to each article;
S103: in the case that the violent sorting identification result of any article is violent sorting, determining the residence time of that article in the air based on its video information;
S104: determining a final violent sorting identification result of the article based on the residence time.
Hereinafter, each of the above-mentioned S101 to S104 will be described in detail.
Step one: in S101 above, the video information of the article sorting process may be acquired in the following manner:
acquiring original video information in the sorting process of a plurality of articles; and sequentially carrying out object identification on each frame of image in the original video information, and intercepting the video information corresponding to each object in the sorting process from the original video information based on an object identification result.
For example, monitoring cameras are generally installed at express companies or express delivery points: on the one hand, to guard against theft of parcels and to provide video clues once a loss event occurs; on the other hand, to monitor whether couriers exhibit violent sorting behavior during sorting. Such surveillance video may be used directly as the original video information.
The video information of the article sorting process is intercepted from the complete surveillance video, yielding multiple frames of images. For example, while acquiring video information, whether the surveillance video still corresponds to the same article is detected in real time; once frames corresponding to a different article are detected, a preset number of frames before and/or after the detected image are intercepted as the video information of that article.
In addition, in another embodiment, the video information corresponding to each article in the sorting process may be intercepted from the original video information in the following manner, for example:
intercepting original videos corresponding to the articles in the sorting process from the original video information; and performing interval sampling on video images in the original video corresponding to each object respectively, and obtaining video information corresponding to each object based on a plurality of video images obtained by interval sampling.
Here, the frequency of interval sampling can be set according to actual needs, so that the image processing capacity of a model is reduced, and the detection efficiency is improved.
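By way of a non-limiting sketch (the function name, frame representation, and sampling step are illustrative assumptions, not part of the disclosure), interval sampling of an article's clip might look like:

```python
def sample_frames(frames, step=5):
    """Keep every `step`-th frame of a clip to reduce the number
    of images the recognition model must process; `step` would be
    set according to actual needs."""
    return frames[::step]

# A hypothetical 30-frame clip sampled at step 5 keeps 6 frames.
clip = list(range(30))            # stand-in for 30 video images
sampled = sample_frames(clip, step=5)
```

A larger `step` lowers the model's workload at the cost of temporal resolution.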
After video information is obtained, sequentially inputting each frame of image into a first recognition model trained in advance according to the sequence of time stamps of each frame of image in the video information, and determining the position information of an article in each frame of image;
based on the article identification result, intercepting the video information corresponding to each article in the sorting process from the video information, including:
determining images corresponding to the sorting of the articles respectively based on the position information of the articles in the images of the frames;
for each item, video information corresponding to the item is generated based on an image corresponding to the item when the item is sorted.
The violent sorting video information may also be intercepted from the video information as follows: while acquiring the video information, each frame of image is synchronously input into the violent sorting identification model in sequence, and whether violent sorting video information exists is detected in real time; when violent sorting video information is detected, a preset number of video images before and/or after the image that triggered the detection are intercepted as the violent sorting video information.
Illustratively, each article corresponds to one piece of video information. Based on the position information of the article in each frame of image, the state corresponding to the article is determined, for example: when the article is in a human hand, the frame is regarded as the start of the piece of video information; when the article lies on the ground, the frame is regarded as its end. Based on these states, the images corresponding to each article during sorting are determined, and the video information corresponding to each article is generated.
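The segmentation rule described above, i.e. a clip begins when the article is in a hand and ends when it lies on the ground, can be sketched as follows (the state labels and function name are illustrative assumptions, not from the disclosure):

```python
def segment_clip(states):
    """Given per-frame article states (e.g. derived from a detector's
    position information), return (start, end) frame indices of one
    sorting event: the clip begins at the first frame where the article
    is in a hand and ends at the first subsequent frame where it is on
    the ground."""
    start = end = None
    for i, state in enumerate(states):
        if state == "in_hand" and start is None:
            start = i
        if state == "on_ground" and start is not None:
            end = i
            break
    return start, end

# Hypothetical per-frame states for one article.
states = ["bg", "in_hand", "in_air", "in_air", "on_ground"]
start, end = segment_clip(states)   # frames 1..4 form the clip
```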
Step two: in S102 above, the video information is input into the pre-trained violent sorting recognition model, and the violent sorting recognition result corresponding to the video information is determined.
The violence sorting recognition model is trained by adopting the following modes:
acquiring a plurality of sample videos and label information corresponding to each sample video, the label information indicating whether the article is violently sorted; wherein each sample video comprises a plurality of frames of sample images;
inputting the sample video into a behavior recognition model aiming at each sample video to obtain a violent sorting recognition result corresponding to the sample video;
and training the behavior recognition model based on the violent sorting recognition result corresponding to the sample video and the label information to obtain the violent sorting recognition model.
That is, a plurality of sample videos is obtained in advance, and the label information corresponding to each sample video, i.e. whether it is a violent sorting sample video or a non-violent sorting sample video, is determined.
Wherein each sample video comprises a plurality of frames of sample images.
That is, the plurality of sample videos obtained in advance, together with the label information indicating whether the article is violently sorted, are input into the behavior recognition model to obtain the violent sorting recognition result corresponding to each sample video.
Wherein the behavior recognition model comprises any one of:
a recurrent neural network (Recurrent Neural Network, RNN), a long short-term memory network (Long Short-Term Memory, LSTM), and a gated recurrent unit (Gated Recurrent Unit, GRU).
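As a rough illustration of one such candidate, a GRU cell can be written in scalar form as below (the weights, names, and scalar simplification are illustrative assumptions; a practical model operates on vector-valued frame features with learned weight matrices):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gru_step(x, h, w):
    """One step of a scalar GRU cell. z: update gate, r: reset gate,
    n: candidate state. `w` is a dict of illustrative weights."""
    z = sigmoid(w["wz"] * x + w["uz"] * h)
    r = sigmoid(w["wr"] * x + w["ur"] * h)
    n = math.tanh(w["wn"] * x + w["un"] * (r * h))
    return (1 - z) * h + z * n

def gru_classify(seq, w, threshold=0.5):
    """Run a sequence of per-frame features through the GRU and
    threshold the final hidden state as a violent / non-violent score."""
    h = 0.0
    for x in seq:
        h = gru_step(x, h, w)
    return sigmoid(h) > threshold
```

The recurrence is what lets the model capture temporal patterns (e.g. a throwing motion) across the frames of one article's clip.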
Illustratively, based on the plurality of sample videos obtained in advance, relative spatio-temporal features of the three-dimensional human pose are extracted from each frame of sample image in the violent and non-violent sample videos, for example: joint-to-joint distance features, joint-to-bone distance features, joint-to-plane distance features, bone-to-bone angle features, bone-to-plane angle features, plane-to-plane angle features, joint rotation features, and the like.
And obtaining a violent sorting recognition result corresponding to the sample video by utilizing the relative space-time characteristics of the human body three-dimensional posture of each frame of sample image in the violent sample video and the non-violent sample video in the plurality of sample videos.
Training the behavior recognition model based on the obtained violent sorting recognition result corresponding to the sample video and the label information corresponding to the sample video to obtain a violent sorting recognition model which is used for the subsequent recognition process of the video information.
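The supervised training procedure described above, i.e. fitting a model to sample videos labelled violent (1) or non-violent (0), can be sketched with a deliberately simplified stand-in model (the scalar feature, logistic model, and hyperparameters are illustrative assumptions, not the disclosed behavior recognition model):

```python
import math

def train_classifier(samples, labels, lr=0.1, epochs=200):
    """Illustrative supervised training loop: each sample video is
    reduced to one scalar feature (the mean over its frame features)
    and a logistic model is fit to the labels by gradient descent."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            f = sum(x) / len(x)
            p = 1.0 / (1.0 + math.exp(-(w * f + b)))
            g = p - y                 # gradient of the log-loss
            w -= lr * g * f
            b -= lr * g
    return w, b

def predict(w, b, x):
    f = sum(x) / len(x)
    return 1.0 / (1.0 + math.exp(-(w * f + b))) > 0.5
```

The same loop shape applies when the stand-in is replaced by an RNN/LSTM/GRU over per-frame pose features.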
Step three: in S103 above, in the case that the violent sorting identification result of any article obtained in step S102 is violent sorting, the residence time of that article in the air is determined based on its video information.
Illustratively, if it is determined in step S102 that the violent sorting recognition result of a piece of video information is violent sorting, the multi-frame images included in that violent sorting video information are processed as follows:
Sequentially inputting each frame of image in the video information into a pre-trained second recognition model, and acquiring an article and a human hand recognition result corresponding to each frame of image in the video information;
determining a first target image when a human hand is separated from the object and a second target image when the object falls to the ground from each frame image in the video information based on the object and human hand identification results respectively corresponding to each frame image;
The dwell time is determined based on the time stamps of the first target image and the second target image.
The residence time of the article is then obtained from the first target image, in which the human hand separates from the article, the second target image, in which the article lands on the ground, and the timestamps corresponding to these two images.
The first recognition model and the second recognition model may be the same model; that is, in the case that the violent sorting recognition result of the video information is violent sorting, the first target image in which the human hand separates from the article and the second target image in which the article lands may be determined directly from the recognition results of the first recognition model, the timestamps corresponding to the two target images may be determined, and the residence time of the article in the air may be determined from those timestamps.
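A minimal sketch of the residence-time computation (the function names, a constant frame rate, and timestamps in seconds are assumptions on top of the disclosure):

```python
def residence_time(release_ts, landing_ts):
    """Air time of the article: the difference between the timestamp
    of the first target image (hand releases the article) and the
    second target image (article lands). Timestamps in seconds."""
    return landing_ts - release_ts

def frame_timestamp(frame_index, fps=25.0):
    """If only frame indices are available, a timestamp can be
    derived from an assumed constant frame rate."""
    return frame_index / fps

# Hypothetical example: release at frame 10, landing at frame 35, 25 fps.
t = residence_time(frame_timestamp(10), frame_timestamp(35))
```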
Fourth: in step S104, the final violent sorting recognition result of the video information is determined based on the residence time obtained in step S103.
Specifically, the residence time is compared with a preset residence time threshold; the threshold may be chosen according to the actual sorting scenario.
When the residence time is greater than the threshold, the violent sorting identification result of the video information is determined to be violent sorting;
when the residence time is less than or equal to the threshold, the violent sorting identification result of the video information is determined to be non-violent sorting.
For example, when the time from the article leaving the hand to the article landing is relatively long, the trajectory of the article during sorting can be considered not to meet the preset requirements: the article may have been thrown relatively high or relatively far, i.e. subjected to violent sorting.
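The threshold comparison itself is a one-line decision; the 0.5-second default below is an assumed value standing in for the application-specific threshold described above:

```python
def final_result(dwell_seconds: float, threshold_seconds: float = 0.5) -> str:
    """Compare the airborne dwell time with a preset threshold to
    confirm or clear a suspected violent sorting event."""
    return "violent" if dwell_seconds > threshold_seconds else "non-violent"
```

Note the boundary behavior matches the text: a dwell time exactly equal to the threshold is classified as non-violent.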
For example, after violent sorting of an article is determined, the video information captured while the article was being sorted can be stored and the responsible personnel can be traced, thereby reducing violent sorting behavior.
After the video information corresponding to each article during the sorting of a plurality of articles is acquired, the plurality of video images corresponding to each article are sequentially input into a pre-trained violent sorting recognition model, and a violent sorting recognition result corresponding to each article is determined. Then, when the violent sorting recognition result of any article is violent sorting, the residence time of the article in the air is determined based on the video information of that article, and the final violent sorting recognition result of the article is determined based on the residence time. In this process, video information suspected of violent sorting is first screened by the violent sorting recognition model, and the final violent sorting recognition result is then determined based on the residence time of the article in the air, which greatly improves the accuracy of identifying violent sorting behavior.
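The two-stage process summarized above can be sketched as follows; `classify` and `dwell` are hypothetical callables standing in for the trained recognition model and the dwell-time computation:

```python
def identify_violent_sorting(frames, classify, dwell, threshold=0.5):
    """Two-stage identification for one article's sorting clip.

    classify(frames) -> bool          stage 1: model flags suspected violent sorting
    dwell(frames)    -> float | None  stage 2: airborne time in seconds
    """
    if not classify(frames):
        return "non-violent"          # model did not flag the clip
    t = dwell(frames)
    if t is None:
        return "non-violent"          # release or landing frame not found
    return "violent" if t > threshold else "non-violent"
```

The dwell-time check only runs on clips the model has already flagged, which is what filters out false positives from stage 1.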
Example two
Referring to fig. 2, which is a schematic diagram of a violent sorting identification device according to a second embodiment of the present application, the device includes: an acquisition module 21, a first determination module 22, a second determination module 23, and a third determination module 24, wherein:
the acquisition module 21 is configured to acquire the video information corresponding to each article during the sorting of a plurality of articles, wherein the video information corresponding to each article comprises a plurality of video images;
the first determination module 22 is configured to sequentially input the plurality of video images corresponding to each article into a pre-trained violent sorting recognition model and to determine the violent sorting recognition result corresponding to each article;
the second determination module 23 is configured to determine, when the violent sorting identification result of any article is violent sorting, the residence time of that article in the air based on the video information of the article;
the third determination module 24 is configured to determine the final violent sorting recognition result of the article based on the residence time.
After the video information corresponding to each article during the sorting of a plurality of articles is acquired, the plurality of video images corresponding to each article are sequentially input into a pre-trained violent sorting recognition model, and a violent sorting recognition result corresponding to each article is determined. Then, when the violent sorting recognition result of any article is violent sorting, the residence time of the article in the air is determined based on the video information of that article, and the final violent sorting recognition result of the article is determined based on the residence time. In this process, video information suspected of violent sorting is first screened by the violent sorting recognition model, and the final violent sorting recognition result is then determined based on the residence time of the article in the air, which greatly improves the accuracy of identifying violent sorting behavior.
In an alternative embodiment, when acquiring the video information during the sorting of the articles, the acquisition module 21 is specifically configured to:
acquire the original video information of the plurality of articles during the sorting process;
sequentially perform article identification on each frame of image in the original video information, and extract, from the original video information, the video information corresponding to each article during the sorting process based on the article identification results.
In an alternative embodiment, when extracting from the original video information the video information corresponding to each article during the sorting process, the acquisition module 21 is configured to:
extract, from the original video information, the original video corresponding to each article during the sorting process;
sample the video images in the original video corresponding to each article at intervals, and obtain the video information corresponding to each article based on the plurality of video images obtained by the interval sampling.
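Interval sampling of this kind is a simple stride over the frame list; the step of 5 below is an assumed sampling interval, not a value the patent specifies:

```python
def interval_sample(frames, step=5):
    """Down-sample an article's clip by keeping every `step`-th frame;
    the retained frames form the clip's "video information"."""
    return frames[::step]
```

Sampling reduces the number of images fed to the recognition model without discarding the overall motion of the article.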
In an alternative embodiment, when sequentially performing article identification on each frame of image in the video information, the acquisition module 21 is configured to:
sequentially input each frame of image into a pre-trained first recognition model in the order of the timestamps of the frames in the video information, and determine the position information of the article in each frame of image;
and extracting, from the video information, the video information corresponding to each article during the sorting process based on the article identification results includes:
determining, based on the position information of the articles in the frames, the images corresponding to each article while it is being sorted;
for each article, generating the video information corresponding to that article based on the images corresponding to the article while it is being sorted.
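Assuming the first recognition model's per-frame position information has already been associated into per-article identifiers (a tracking step the patent leaves unspecified), grouping the frames into one clip per article might look like:

```python
def clips_per_article(frame_detections):
    """Split a surveillance stream into one clip per article.

    frame_detections: list of (frame, article_id) pairs, where article_id
    comes from matching the detected article positions across frames
    (None when no article is present in the frame).
    Returns {article_id: [frames...]} preserving frame order.
    """
    clips = {}
    for frame, article_id in frame_detections:
        if article_id is not None:
            clips.setdefault(article_id, []).append(frame)
    return clips
```

Each value in the returned dictionary is the per-article "video information" that is then interval-sampled and fed to the violent sorting recognition model.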
In an alternative embodiment, the violent sorting recognition model is obtained by training a behavior recognition model, which includes one or more of the following:
a recurrent neural network (RNN), a long short-term memory network (LSTM), and a gated recurrent unit (GRU).
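For illustration, the recurrence at the heart of a GRU can be written out in scalar form (hidden size 1, biases omitted for brevity); a real behavior recognition model would use a framework such as PyTorch with full weight matrices, so this is only a sketch of the mechanism:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gru_step(h, x, w):
    """One step of a scalar GRU cell. w holds six weights: update gate
    (wz, uz), reset gate (wr, ur), candidate state (wh, uh)."""
    z = sigmoid(w["wz"] * x + w["uz"] * h)             # update gate
    r = sigmoid(w["wr"] * x + w["ur"] * h)             # reset gate
    h_cand = math.tanh(w["wh"] * x + w["uh"] * r * h)  # candidate state
    return (1 - z) * h + z * h_cand                    # blended new state

def run_gru(xs, w, h0=0.0):
    """Fold a sequence of per-frame features into one hidden state, which
    a final layer would map to a violent / non-violent decision."""
    h = h0
    for x in xs:
        h = gru_step(h, x, w)
    return h
```

The gating is what lets the model retain evidence of a throw across many frames; LSTM adds a separate cell state, and a plain RNN drops the gates entirely.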
In an alternative embodiment, when determining the residence time of the article in the video information in the air, the second determination module 23 is configured to:
sequentially input each frame of image in the video information into a pre-trained second recognition model, and acquire the article and human-hand recognition results corresponding to each frame of image in the video information;
determine, based on the article and human-hand recognition results corresponding to each frame of image, a first target image in which the human hand separates from the article and a second target image in which the article lands on the ground from the frames of the video information;
determine the residence time based on the timestamps of the first target image and the second target image.
In an alternative embodiment, when determining the final violent sorting recognition result of the video information based on the residence time, the third determination module 24 is specifically configured to:
compare the residence time with a preset residence time threshold;
when the residence time is greater than the threshold, determine that the violent sorting identification result of the video information is violent sorting;
when the residence time is less than or equal to the threshold, determine that the violent sorting identification result of the video information is non-violent sorting.
Example three
The embodiment of the present application further provides a computer device 300, as shown in fig. 3, which is a schematic structural diagram of the computer device 300 provided in the embodiment of the present application, including:
a processor 31, a memory 32, and a bus 33. The memory 32 is used to store execution instructions and includes an internal memory 321 and an external memory 322. The internal memory 321 temporarily stores operation data for the processor 31 and data exchanged with the external memory 322 (such as a hard disk); the processor 31 exchanges data with the external memory 322 through the internal memory 321. When the computer device 300 is running, the processor 31 and the memory 32 communicate over the bus 33, so that the processor 31 executes the following instructions in user mode:
acquiring video information corresponding to each article in the sorting process of a plurality of articles, wherein the video information corresponding to each article comprises a plurality of video images;
sequentially inputting the plurality of video images corresponding to each article into a pre-trained violent sorting recognition model, and determining the violent sorting recognition result corresponding to each article;
determining, when the violent sorting identification result of any article is violent sorting, the residence time of that article in the air based on the video information of the article;
determining the final violent sorting identification result of the article based on the residence time.
In a possible implementation manner, in the instructions executed by the processor 31, acquiring the video information during the sorting of the articles includes:
acquiring the original video information of the plurality of articles during the sorting process;
sequentially performing article identification on each frame of image in the original video information, and extracting, from the original video information, the video information corresponding to each article during the sorting process based on the article identification results.
In a possible implementation manner, in the instructions executed by the processor 31, extracting from the original video information the video information corresponding to each article in the sorting process includes:
extracting, from the original video information, the original video corresponding to each article during the sorting process;
sampling the video images in the original video corresponding to each article at intervals, and obtaining the video information corresponding to each article based on the plurality of video images obtained by the interval sampling.
In a possible implementation manner, in the instructions executed by the processor 31, sequentially performing article identification on each frame of image in the video information includes:
sequentially inputting each frame of image into a pre-trained first recognition model in the order of the timestamps of the frames in the video information, and determining the position information of the article in each frame of image;
and extracting, from the video information, the video information corresponding to each article during the sorting process based on the article identification results includes:
determining, based on the position information of the articles in the frames, the images corresponding to each article while it is being sorted;
for each article, generating the video information corresponding to that article based on the images corresponding to the article while it is being sorted.
In a possible implementation manner, in the instructions executed by the processor 31, the violent sorting recognition model is obtained by training a behavior recognition model, which includes one or more of the following:
a recurrent neural network (RNN), a long short-term memory network (LSTM), and a gated recurrent unit (GRU).
In a possible implementation manner, in the instructions executed by the processor 31, determining the residence time of the article in the video information in the air includes:
sequentially inputting each frame of image in the video information into a pre-trained second recognition model, and acquiring the article and human-hand recognition results corresponding to each frame of image in the video information;
determining, based on the article and human-hand recognition results corresponding to each frame of image, a first target image in which the human hand separates from the article and a second target image in which the article lands on the ground from the frames of the video information;
determining the residence time based on the timestamps of the first target image and the second target image.
In a possible implementation manner, in the instructions executed by the processor 31, determining the final violent sorting identification result of the video information based on the residence time includes:
comparing the residence time with a preset residence time threshold;
when the residence time is greater than the threshold, determining that the violent sorting identification result of the video information is violent sorting;
when the residence time is less than or equal to the threshold, determining that the violent sorting identification result of the video information is non-violent sorting.
The embodiment of the application also provides a computer-readable storage medium on which a computer program is stored; when executed by a processor, the computer program performs the steps of the violent sorting identification method described in the above method embodiment.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described system and apparatus may refer to corresponding procedures in the foregoing method embodiments, which are not described herein again.
In the several embodiments provided by the present application, it should be understood that the disclosed systems, devices, and methods may be implemented in other manners. The apparatus embodiments described above are merely illustrative; for example, the division into units is merely a logical functional division, and there may be other divisions in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Further, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit.
If the functions are implemented in the form of software functional units and sold or used as a stand-alone product, they may be stored in a non-volatile, processor-executable computer-readable storage medium. Based on this understanding, the technical solution of the present application, or the part of it that contributes to the prior art, may be embodied in the form of a software product stored in a storage medium and comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the methods according to the embodiments of the present application. The aforementioned storage medium includes any medium capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
Finally, it should be noted that the above embodiments are only specific implementations of the present application, used to illustrate rather than limit its technical solutions, and the protection scope of the present application is not limited thereto. Any person skilled in the art may, within the technical scope disclosed in the present application, modify the technical solutions described in the foregoing embodiments or substitute equivalents for some of their technical features; such modifications or substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present application and are intended to be covered by the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (9)

1. A method of identifying violent sorting, comprising:
acquiring video information corresponding to each article in the sorting process of a plurality of articles, wherein the video information corresponding to each article comprises a plurality of video images;
sequentially inputting the plurality of video images corresponding to each article into a pre-trained violent sorting recognition model, and determining the violent sorting recognition result corresponding to each article;
determining, in the case that the violent sorting identification result of any article is violent sorting, the residence time of that article in the air based on the video information of the article;
determining the final violent sorting recognition result of the article based on the residence time;
wherein determining the residence time of the article in the video information in the air comprises:
sequentially inputting each frame of image in the video information into a pre-trained second recognition model, and acquiring the article and human-hand recognition results corresponding to each frame of image in the video information;
determining, based on the article and human-hand recognition results corresponding to each frame of image, a first target image in which the human hand separates from the article and a second target image in which the article lands on the ground from the frames of the video information;
determining the residence time based on the timestamps of the first target image and the second target image.
2. The method of claim 1, wherein acquiring the video information during the sorting of the articles comprises:
acquiring the original video information of the plurality of articles during the sorting process;
sequentially performing article identification on each frame of image in the original video information, and extracting, from the original video information, the video information corresponding to each article during the sorting process based on the article identification results.
3. The method of claim 2, wherein extracting the video information corresponding to each article in the sorting process from the original video information comprises:
extracting, from the original video information, the original video corresponding to each article during the sorting process;
sampling the video images in the original video corresponding to each article at intervals, and obtaining the video information corresponding to each article based on the plurality of video images obtained by the interval sampling.
4. The method of claim 2 or 3, wherein sequentially performing article identification on each frame of image in the video information comprises:
sequentially inputting each frame of image into a pre-trained first recognition model in the order of the timestamps of the frames in the video information, and determining the position information of the article in each frame of image;
and extracting, from the video information, the video information corresponding to each article during the sorting process based on the article identification results comprises:
determining, based on the position information of the articles in the frames, the images corresponding to each article while it is being sorted;
for each article, generating the video information corresponding to that article based on the images corresponding to the article while it is being sorted.
5. The method of claim 1, wherein the violent sorting recognition model is obtained by training a behavior recognition model; the behavior recognition model includes one or more of the following:
a recurrent neural network (RNN), a long short-term memory network (LSTM), and a gated recurrent unit (GRU).
6. The method of claim 1, wherein determining the final violent sorting recognition result of the video information based on the residence time comprises:
comparing the residence time with a preset residence time threshold;
when the residence time is greater than the threshold, determining that the violent sorting identification result of the video information is violent sorting;
when the residence time is less than or equal to the threshold, determining that the violent sorting identification result of the video information is non-violent sorting.
7. A violent sorting identification device, comprising:
an acquisition module, configured to acquire the video information corresponding to each article in the sorting process of a plurality of articles, wherein the video information corresponding to each article comprises a plurality of video images;
a first determination module, configured to sequentially input the plurality of video images corresponding to each article into a pre-trained violent sorting recognition model and to determine the violent sorting recognition result corresponding to each article;
a second determination module, configured to determine, when the violent sorting identification result of any article is violent sorting, the residence time of that article in the air based on the video information of the article;
a third determination module, configured to determine the final violent sorting recognition result of the article based on the residence time;
wherein, when determining the residence time of the article in the video information in the air, the second determination module is configured to:
sequentially input each frame of image in the video information into a pre-trained second recognition model, and acquire the article and human-hand recognition results corresponding to each frame of image in the video information;
determine, based on the article and human-hand recognition results corresponding to each frame of image, a first target image in which the human hand separates from the article and a second target image in which the article lands on the ground from the frames of the video information;
determine the residence time based on the timestamps of the first target image and the second target image.
8. A computer device, comprising: a processor, a memory, and a bus, wherein the memory stores machine-readable instructions executable by the processor; when the computer device is running, the processor and the memory communicate over the bus, and when the machine-readable instructions are executed by the processor, they perform the steps of the identification method according to any one of claims 1 to 6.
9. A computer-readable storage medium, characterized in that a computer program is stored thereon which, when executed by a processor, performs the steps of the identification method according to any one of claims 1 to 6.
CN202010005394.5A 2020-01-03 2020-01-03 Violent sorting identification method and device Active CN111160314B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010005394.5A CN111160314B (en) 2020-01-03 2020-01-03 Violent sorting identification method and device


Publications (2)

Publication Number Publication Date
CN111160314A CN111160314A (en) 2020-05-15
CN111160314B true CN111160314B (en) 2023-08-29

Family

ID=70561046

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010005394.5A Active CN111160314B (en) 2020-01-03 2020-01-03 Violent sorting identification method and device

Country Status (1)

Country Link
CN (1) CN111160314B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112668410B (en) * 2020-12-15 2024-03-29 浙江大华技术股份有限公司 Sorting behavior detection method, system, electronic device and storage medium

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105809392A (en) * 2016-02-29 2016-07-27 陆勃屹 Remotely monitoring method and system for express delivery packages
CN106713857A (en) * 2016-12-15 2017-05-24 重庆凯泽科技股份有限公司 Campus security system and method based on intelligent videos
CN106897670A (en) * 2017-01-19 2017-06-27 南京邮电大学 A kind of express delivery violence sorting recognition methods based on computer vision
CN107358194A (en) * 2017-07-10 2017-11-17 南京邮电大学 A kind of violence sorting express delivery determination methods based on computer vision
CN107679156A (en) * 2017-09-27 2018-02-09 努比亚技术有限公司 A kind of video image identification method and terminal, readable storage medium storing program for executing
CN109165894A (en) * 2018-08-02 2019-01-08 王金虎 The unmanned picking delivery of unmanned plane is registered one's residence method and system
CN109857025A (en) * 2019-02-11 2019-06-07 北京印刷学院 Express item in-transit state monitoring system
CN110544054A (en) * 2019-09-30 2019-12-06 北京物资学院 Anti-violence sorting active express sorting operation assisting and evaluating system and method

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8792677B2 (en) * 2012-04-19 2014-07-29 Intelligence Based Integrated Security Systems, Inc. Large venue security method


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Ji Lerong. Simulation and Optimization of the Express Sorting *** at Company A's Transfer Center Based on Flexsim. China Master's Theses Full-text Database (Economics and Management Science), 2019, (No. 1), J145-417. *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant