CN112257527A - Mobile phone detection method based on multi-target fusion and space-time video sequence

Mobile phone detection method based on multi-target fusion and space-time video sequence

Info

Publication number
CN112257527A
CN112257527A
Authority
CN
China
Prior art keywords
frame
mobile phone
video image
detection
detection method
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011079614.5A
Other languages
Chinese (zh)
Other versions
CN112257527B (en)
Inventor
龚勋 (Gong Xun)
王琛中 (Wang Chenzhong)
王立 (Wang Li)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southwest Jiaotong University
Original Assignee
Southwest Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southwest Jiaotong University filed Critical Southwest Jiaotong University
Priority to CN202011079614.5A
Publication of CN112257527A
Application granted
Publication of CN112257527B
Legal status: Active (current)
Anticipated expiration: (date not listed)

Links

Images

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 - Scenes; Scene-specific elements
    • G06V20/40 - Scenes; Scene-specific elements in video content
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/107 - Static hand or arm
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00 - Reducing energy consumption in communication networks
    • Y02D30/70 - Reducing energy consumption in communication networks in wireless communication networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Human Computer Interaction (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a mobile phone detection method based on multi-target fusion and a space-time video sequence. The method comprises: training an improved yolo model to obtain a detection model, and inputting video image frames to run the detection model to obtain a first-frame prediction; decoding the first-frame prediction, removing boxes whose score falls below a preset value, performing non-maximum suppression (NMS) with a DIoU threshold, and suppressing a mobile phone box when, in the decoded result of a frame, only the mobile phone box appears; taking the suppressed result as the target template and an input video image frame as the candidate search region, feeding both into a fully-connected Siamese network, and selecting the result with the highest score-map similarity to mark the mobile phone box in the video image frame; once the set number of frames has been tracked, repeating the above steps until the video input ends. The invention builds on a lightweight detection network from the one-stage family, finely modifies the network structure and the training and detection procedures, and achieves higher detection accuracy without reducing detection speed.

Description

Mobile phone detection method based on multi-target fusion and space-time video sequence
Technical Field
The invention relates to the technical field of image processing, and in particular to a mobile phone detection method based on multi-target fusion and space-time video sequences.
Background
Detection accuracy and speed are the core problems of object detection. To obtain more accurate results, a heavyweight, high-accuracy detection algorithm is usually chosen, which severely limits the inference speed of the system on mobile-end devices.
Chinese patent application No. 202010048048.5 discloses an intelligent monitoring method, device and readable medium for recognizing unauthorized photographing with mobile phones. An intelligent monitoring system performs machine learning on a massive collection of mobile phone appearances; a camera probe, communicating with the monitoring system in real time, is installed wherever photographing must be prevented; the camera transmits captured images to the monitoring system in real time; the system identifies whether a mobile phone is present; if so, it judges from the captured images whether the phone is being used to take pictures, and if it judges that it is, it outputs alarm information in real time so that staff can intervene promptly. That scheme performs initial detection with an algorithm using Darknet53 as the backbone and then monitors with methods such as skeleton generation and action recognition; other methods use similar algorithms for initial localization and then detect by searching from the whole image down to local regions. Owing to this design, however, such detection systems are essentially not real-time on mobile devices.
Disclosure of Invention
The invention aims to overcome the shortcomings of the prior art by providing a mobile phone detection method based on multi-target fusion and space-time video sequences that remedies the deficiencies of existing detection methods.
The purpose of the invention is achieved by the following technical scheme: the mobile phone detection method based on multi-target fusion and space-time video sequences comprises the following steps:
training the improved yolo model to obtain a detection model, and inputting a video image frame to run the detection model to obtain a first-frame prediction;
decoding the first-frame prediction, removing boxes whose score falls below a preset value, performing non-maximum suppression (NMS) with a DIoU threshold, and suppressing a mobile phone box when, in the decoded result of a frame, only the mobile phone box appears;
taking the suppressed result as the target template and an input video image frame as the candidate search region, feeding both into a fully-connected Siamese network, and selecting the result with the highest score-map similarity to mark the mobile phone box in the video image frame;
if the set number of frames has been tracked, repeating the above steps until the video image input ends.
Further, if the set number of frames has not yet been tracked, the mobile phone detection method repeats the step of taking the suppressed result as the target template, inputting a video image frame as the candidate search region, feeding both into the fully-connected Siamese network, and selecting the result with the highest score-map similarity to mark the mobile phone box in the video image frame.
Further, before training the improved yolo model to obtain a detection model and inputting video image frames to run it for the first-frame prediction, the mobile phone detection method comprises the step of acquiring a training set and a test set.
Further, the step of acquiring the training set and the test set comprises: splitting the recorded video into frames, labeling the resulting pictures, extracting a subset of frames at intervals to construct a data set, and dividing the data set into a training set and a test set at a set ratio.
Further, decoding the first-frame prediction, removing boxes whose score falls below a preset value, performing NMS with a DIoU threshold, and suppressing a mobile phone box that appears alone in a decoded frame comprises:
decoding the first-frame prediction according to the formulas b_x = sigmoid(t_x) + c_x, b_y = sigmoid(t_y) + c_y, b_w = p_w · e^(t_w), b_h = p_h · e^(t_h), conf = sigmoid(raw_conf) and prob = sigmoid(raw_prob);
removing boxes whose confidence or class probability fails the score threshold of 0.4, and performing NMS with a DIoU threshold of 0.1;
and, if the decoded result of a frame contains a mobile phone box but no person box, hand box or camera box, rejecting the phone-related prediction boxes in that image, thereby suppressing the mobile phone box.
Further, the improvements to the yolo model include the following:
adding an s branch to yolov3-tiny for detecting small objects, to improve the detection of small targets such as cameras;
on the basis of the preceding structure, adding SPP (Spatial Pyramid Pooling), SAM (Spatial Attention Module) and CAM (Channel Attention Module) modules with residual connections, to strengthen feature extraction (a sketch of the SPP block follows below).
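A minimal PyTorch sketch of the SPP block named above is given below. The kernel sizes (5, 9, 13) and its placement before the detection head follow common YOLO practice and are assumptions here; the SAM and CAM attention modules and the residual connection are omitted.

```python
import torch
import torch.nn as nn

class SPP(nn.Module):
    """Spatial Pyramid Pooling block as commonly added to YOLO necks:
    parallel max-pools at several kernel sizes, concatenated with the
    input so the head sees multi-scale context."""
    def __init__(self, kernels=(5, 9, 13)):
        super().__init__()
        # stride 1 and padding k//2 keep the spatial size unchanged
        self.pools = nn.ModuleList(
            nn.MaxPool2d(k, stride=1, padding=k // 2) for k in kernels)

    def forward(self, x):
        # channel count grows from C to C * (1 + len(kernels))
        return torch.cat([x] + [p(x) for p in self.pools], dim=1)
```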
The invention has the following advantages: the mobile phone detection method based on multi-target fusion and a space-time video sequence builds on a lightweight detection network from the one-stage family and finely modifies the network structure and the training and detection procedures, achieving high detection accuracy without reducing detection speed. Detected targets are then followed with a tracking algorithm, which handles difficult samples with heavy occlusion or strong viewing-angle tilt, reduces the system's resource consumption, and greatly improves the overall inference speed on mobile devices.
Drawings
FIG. 1 is a schematic flow diagram of the process of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all the embodiments. The components of the embodiments of the present application, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present application, as presented in the figures, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present application without making any creative effort, shall fall within the protection scope of the present application. The invention is further described below with reference to the accompanying drawings.
As shown in FIG. 1, the invention relates to a mobile phone detection method based on multi-target fusion and a space-time video sequence, which specifically comprises the following steps:
and S1, performing framing processing on the video recorded by the camera in the actual application scene, and randomly extracting partial pictures at intervals to construct a data set. And labeling the mobile phone, the human body, the hand and the camera in each image by using LabelImg labeling software, and dividing the data set into a training set and a test set according to a certain proportion.
S2: the detection model is trained with the improved yolov3 network. The training inputs are the training-set pictures and their corresponding labels; the network outputs the predicted offsets t_x, t_y, t_w and t_h, the raw confidence, and the raw class probabilities.
Further, during training, a focal loss is used for the confidence loss. Considering that the positive/negative sample imbalance of the yolov3 network model is much lower than that of RetinaNet, α is set to 0.4. The confidence loss is computed as:
L_focal = -α_t · (1 - p_t)^γ · log(p_t)
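A sketch of this confidence loss in PyTorch. α = 0.4 follows the patent; γ = 2.0 is the common default from the focal-loss literature and is an assumption here, as are the function and argument names.

```python
import torch

def confidence_focal_loss(raw_conf, target, alpha=0.4, gamma=2.0):
    """Focal loss on objectness confidence; `target` holds 0/1 labels."""
    p = torch.sigmoid(raw_conf)                  # predicted confidence
    p_t = torch.where(target == 1, p, 1 - p)     # prob. of the true label
    alpha_t = torch.where(target == 1,
                          torch.full_like(p, alpha),
                          torch.full_like(p, 1 - alpha))
    # clamp avoids log(0) on perfectly confident predictions
    return (-alpha_t * (1 - p_t) ** gamma
            * torch.log(p_t.clamp(min=1e-7))).mean()
```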
and S3, operating the detection model to obtain a predicted value of the first frame.
S4: the prediction is decoded according to the following formulas; boxes whose confidence or class probability fails the score threshold of 0.40 are removed, and NMS is performed with a DIoU threshold of 0.1:
b_x = sigmoid(t_x) + c_x
b_y = sigmoid(t_y) + c_y
b_w = p_w · e^(t_w)
b_h = p_h · e^(t_h)
conf = sigmoid(raw_conf)
prob = sigmoid(raw_prob)
wherein b_x, b_y, b_h and b_w respectively denote the center coordinates and the height and width of the predicted box; p_h and p_w respectively denote the height and width of the prior box; t_x and t_y denote the predicted offsets of the object center from the top-left corner of its grid cell, and t_w and t_h the predicted offsets relative to the prior box; c_x and c_y denote the coordinates of the top-left corner of the grid cell; and score = conf (confidence) × prob (class probability).
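The decoding formulas and the DIoU-based NMS of step S4 can be sketched as follows in NumPy. Only the two thresholds (score 0.4, DIoU 0.1) come from the patent; the greedy suppression loop and all function and argument names are illustrative assumptions.

```python
import numpy as np

def decode(t_xy, t_wh, raw_conf, raw_prob, c_xy, p_wh):
    """Apply the decoding formulas above (array shapes assumed compatible)."""
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    b_xy = sigmoid(t_xy) + c_xy        # box center
    b_wh = p_wh * np.exp(t_wh)         # box width/height
    conf = sigmoid(raw_conf)
    prob = sigmoid(raw_prob)
    return b_xy, b_wh, conf[..., None] * prob   # score = conf * prob

def diou(a, b):
    """DIoU of two (x1, y1, x2, y2) boxes: IoU minus the squared center
    distance over the squared diagonal of the enclosing box."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    iou = inter / (area_a + area_b - inter + 1e-9)
    center_dist = (((a[0] + a[2]) - (b[0] + b[2])) ** 2
                   + ((a[1] + a[3]) - (b[1] + b[3])) ** 2) / 4.0
    ex1, ey1 = min(a[0], b[0]), min(a[1], b[1])
    ex2, ey2 = max(a[2], b[2]), max(a[3], b[3])
    diag = (ex2 - ex1) ** 2 + (ey2 - ey1) ** 2 + 1e-9
    return iou - center_dist / diag

def diou_nms(boxes, scores, score_thr=0.4, diou_thr=0.1):
    """Drop low-score boxes, then greedily suppress any box whose DIoU
    with an already-kept box exceeds the threshold."""
    order = [i for i in np.argsort(-scores) if scores[i] >= score_thr]
    keep = []
    while order:
        i = order.pop(0)
        keep.append(i)
        order = [j for j in order if diou(boxes[i], boxes[j]) <= diou_thr]
    return keep
```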
S5: if, according to the decoded result of a frame, a mobile phone box appears without any person box, hand box or camera box, the mobile phone box is suppressed.
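A sketch of the suppression rule in S5, assuming the decoded detections of one frame are available as (class, box, score) tuples; the class names are illustrative.

```python
def suppress_lone_phones(detections):
    """Reject phone boxes in a frame that has no person, hand or camera box.

    `detections` is a list of (class_name, box, score) tuples for one
    decoded frame.
    """
    context = {"person", "hand", "camera"}
    has_context = any(cls in context for cls, _, _ in detections)
    if has_context:
        return detections
    # no supporting context: drop all phone predictions in this frame
    return [d for d in detections if d[0] != "phone"]
```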
S6: the suppressed result is used as the target template and the video image frame as the candidate search region; both are fed into the fully-connected Siamese network, which produces a similarity score map by template matching.
S7: the result with the highest similarity is selected to mark the mobile phone in the video image frame.
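Steps S6 and S7 can be sketched as SiamFC-style cross-correlation in PyTorch: the template's feature map is used as a convolution kernel over the search-region feature map, and the peak of the resulting score map localizes the phone. The shared backbone that produces both feature maps is omitted, and all names are assumptions.

```python
import torch
import torch.nn.functional as F

def score_map(template_feat, search_feat):
    """Cross-correlate template and search features.

    template_feat: (1, C, h, w) features of the target template
    search_feat:   (1, C, H, W) features of the candidate search region
    Using the template as a convolution kernel yields a similarity map
    of shape (1, 1, H - h + 1, W - w + 1).
    """
    return F.conv2d(search_feat, template_feat)

def locate_peak(smap):
    """Return the (row, col) of the highest-similarity response."""
    flat = smap.view(-1).argmax()
    W = smap.shape[-1]
    return divmod(flat.item(), W)
```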
S8: judge whether the set number of frames has been tracked; if not, repeat steps S6-S8; if so, go to step S9.
S9: repeat steps S3-S9 until the video image input ends.
In terms of multi-target association, the contributions of the invention are as follows:
it was found that the giou (generalized Intersection over union) -based position loss (used in the present invention) presents an imbalance opposite to the variance-based position loss, for which the average tag box size and average position loss of the s, m, l branches are statistically generalized,and combining the quantity proportion of each branch frame, and adopting a negative exponential function (a.e)-b/x) Unbalanced fitting correction is carried out on the basic function, and the problem of unbalanced position loss of the large and small frames based on the GIoU is solved.
Under the premise that, given enough data, the average position loss of each branch's boxes is nearly equal, the average label-box size and average position loss of the s, m and l branches are accumulated during the first warm-up epoch (the initial training period with a small learning rate). Combined with the proportion of boxes on each branch, the unbalanced-fitting correction with the negative exponential base function a·e^(-b/x) is applied to adjust the position-loss weight of each branch in subsequent iterations, relieving the GIoU position-loss imbalance between large and small boxes.
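An illustrative sketch of this correction, assuming SciPy's curve_fit. The per-branch statistics below are placeholder numbers, and the way the fitted curve is converted into branch weights is an assumption, since the patent does not give the exact formula.

```python
import numpy as np
from scipy.optimize import curve_fit

# Per-branch statistics gathered during the warm-up epoch.
# The concrete numbers are placeholders for illustration only.
avg_box_size = np.array([14.0, 45.0, 120.0])   # mean label-box size: s, m, l
avg_pos_loss = np.array([0.90, 0.55, 0.35])    # mean GIoU position loss: s, m, l

def neg_exp(x, a, b):
    """Base correction function a * exp(-b / x)."""
    return a * np.exp(-b / x)

(a, b), _ = curve_fit(neg_exp, avg_box_size, avg_pos_loss, p0=(1.0, 10.0))

# Assumed weighting: scale each branch inversely to its fitted loss so the
# s, m and l branches contribute comparably in subsequent iterations.
branch_weights = avg_pos_loss.mean() / neg_exp(avg_box_size, a, b)
```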
The label-rewriting problem in yolo was identified: an anchor box already assigned to one object may, with some probability, be overwritten by a later object, so the covered object is never trained. The specific improvement proceeds as follows (a simplified sketch follows the list):
if an anchor box has already been given a label by an original object, judge whether that original object holds a unique box;
if the original object holds a unique box, judge whether the current object can be assigned another anchor; if so, cancel the current object's claim on this anchor box; otherwise, search for the anchor box with the next-highest IoU and assign that instead;
if the original object does not hold a unique box, distinguish the cases: if the anchor is the current object's highest-IoU anchor but not the original object's highest-IoU anchor, the original assignment is overwritten; if it is not the current object's highest-IoU anchor but is the original object's highest-IoU anchor, judge whether the current object can be assigned another anchor, cancelling its claim on this anchor box if so and overwriting the original assignment otherwise; if the anchor is the highest-IoU anchor for neither object, the one with the lower IoU is overwritten.
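A much-simplified sketch of the re-assignment idea: it implements only the core rule that an object must not be stripped of its sole anchor, not the full case analysis above; all names are assumptions.

```python
def assign_object(obj_id, ranked_anchors, assigned, counts):
    """Assign `obj_id` to its best free (or safely stealable) anchor.

    ranked_anchors: anchor indices for this object, highest IoU first.
    assigned: anchor index -> object id currently holding that anchor.
    counts: object id -> number of anchors the object currently holds.
    An occupied anchor is only overwritten if its current owner keeps
    at least one other anchor; otherwise the next-best anchor is tried.
    """
    for a in ranked_anchors:
        owner = assigned.get(a)
        if owner is None:
            assigned[a] = obj_id
            counts[obj_id] = counts.get(obj_id, 0) + 1
            return True
        if counts.get(owner, 0) > 1:   # owner keeps another box: safe steal
            counts[owner] -= 1
            assigned[a] = obj_id
            counts[obj_id] = counts.get(obj_id, 0) + 1
            return True
    return False   # no assignable anchor remains for this object
```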
Considering that a primary/auxiliary distinction is needed between the mobile phone and the other auxiliary detection targets, all losses of the mobile phone class are multiplied by a priority coefficient of 1.10.
The threshold obtained by ATSS (Adaptive Training Sample Selection) is bounded: when the threshold falls below a preset value, the corresponding candidate training samples are considered low quality, so the ATSS threshold is abandoned and only the candidate with the highest IoU is selected. In the invention, the preset value is 0.10.
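A sketch of the bounded selection, assuming ATSS's published mean-plus-standard-deviation threshold over candidate IoUs; the 0.10 floor follows the patent, the rest is illustrative.

```python
import numpy as np

def select_positive_samples(ious, floor=0.10):
    """ATSS-style positive-sample selection with a lower bound.

    ious: IoU of each candidate anchor with the ground-truth box.
    ATSS takes mean + std of the candidate IoUs as the positivity
    threshold; if that threshold falls below `floor`, the candidates
    are deemed low quality and only the single best-IoU anchor is kept.
    """
    thr = ious.mean() + ious.std()
    if thr < floor:
        return np.array([ious.argmax()])
    return np.flatnonzero(ious >= thr)
```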
The multiple target objects are associated at essentially no extra computational cost, reducing the computing resources consumed by the cognition-style detection scheme.
In terms of space-time information fusion, the invention exploits context in both the temporal and spatial dimensions, markedly alleviating the occlusion and drift problems during tracking.
The foregoing describes preferred embodiments of the invention. It is to be understood that the invention is not limited to the precise forms disclosed herein, and that various other combinations, modifications and environments falling within the scope of the inventive concept, whether described above or apparent to those skilled in the relevant art, may be resorted to. Modifications and variations may be effected by those skilled in the art without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (6)

1. A mobile phone detection method based on multi-target fusion and a space-time video sequence, characterized in that the mobile phone detection method comprises the following steps:
training the improved yolo model to obtain a detection model, and inputting a video image frame to run the detection model to obtain a first-frame prediction;
decoding the first-frame prediction, removing boxes whose score falls below a preset value, performing non-maximum suppression (NMS) with a DIoU threshold, and suppressing a mobile phone box when, in the decoded result of a frame, only the mobile phone box appears;
taking the suppressed result as the target template and an input video image frame as the candidate search region, feeding both into a fully-connected Siamese network, and selecting the result with the highest score-map similarity to mark the mobile phone box in the video image frame;
if the set number of frames has been tracked, repeating the above steps until the video image input ends.
2. The mobile phone detection method based on multi-target fusion and a space-time video sequence according to claim 1, characterized in that: if the set number of frames has not yet been tracked, the steps of taking the suppressed result as the target template, inputting a video image frame as the candidate search region, feeding both into the fully-connected Siamese network, and selecting the result with the highest score-map similarity to mark the mobile phone box in the video image frame are repeated.
3. The mobile phone detection method based on multi-target fusion and a space-time video sequence according to claim 1, characterized in that: the mobile phone detection method further comprises the step of acquiring a training set and a test set before training the improved yolov3 model to obtain a detection model and inputting video image frames to run the detection model for the first-frame prediction.
4. The mobile phone detection method based on multi-target fusion and a space-time video sequence according to claim 3, characterized in that: the step of acquiring the training set and the test set comprises: splitting the recorded video into frames, labeling the resulting pictures, extracting a subset of frames at intervals to construct a data set, and dividing the data set into a training set and a test set at a set ratio.
5. The mobile phone detection method based on multi-target fusion and a space-time video sequence according to claim 1, characterized in that: decoding the first-frame prediction, removing boxes whose score falls below a preset value, performing NMS with a DIoU threshold, and suppressing a mobile phone box that appears alone in a decoded frame comprises the following steps:
decoding the first-frame prediction according to the formulas b_x = sigmoid(t_x) + c_x, b_y = sigmoid(t_y) + c_y, b_w = p_w · e^(t_w), b_h = p_h · e^(t_h), conf = sigmoid(raw_conf) and prob = sigmoid(raw_prob);
removing boxes whose confidence or class probability fails the score threshold of 0.4, and performing NMS with a DIoU threshold of 0.1;
and, if the decoded result of a frame contains a mobile phone box but no person box, hand box or camera box, rejecting the phone-related prediction boxes in that image, thereby suppressing the mobile phone box.
6. The mobile phone detection method based on multi-target fusion and a space-time video sequence according to claim 1, characterized in that the improvements to the yolo model include the following:
adding an s branch to yolov3-tiny for detecting small objects, to improve the detection of small targets such as cameras;
on the basis of the model structure of the preceding step, adding SPP, SAM and CAM modules with residual connections, to strengthen feature extraction.
CN202011079614.5A 2020-10-10 2020-10-10 Mobile phone detection method based on multi-target fusion and space-time video sequence Active CN112257527B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011079614.5A CN112257527B (en) 2020-10-10 2020-10-10 Mobile phone detection method based on multi-target fusion and space-time video sequence

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011079614.5A CN112257527B (en) 2020-10-10 2020-10-10 Mobile phone detection method based on multi-target fusion and space-time video sequence

Publications (2)

Publication Number Publication Date
CN112257527A true CN112257527A (en) 2021-01-22
CN112257527B CN112257527B (en) 2022-09-02

Family

ID=74242754

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011079614.5A Active CN112257527B (en) 2020-10-10 2020-10-10 Mobile phone detection method based on multi-target fusion and space-time video sequence

Country Status (1)

Country Link
CN (1) CN112257527B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112733821A (en) * 2021-03-31 2021-04-30 成都西交智汇大数据科技有限公司 Target detection method fusing lightweight attention model
CN112967289A (en) * 2021-02-08 2021-06-15 上海西井信息科技有限公司 Security check package matching method, system, equipment and storage medium
CN113139092A (en) * 2021-04-28 2021-07-20 北京百度网讯科技有限公司 Video searching method and device, electronic equipment and medium

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108614894A (en) * 2018-05-10 2018-10-02 西南交通大学 A kind of face recognition database's constructive method based on maximum spanning tree
CN109508710A (en) * 2018-10-23 2019-03-22 东华大学 Based on the unmanned vehicle night-environment cognitive method for improving YOLOv3 network
CN109934121A (en) * 2019-02-21 2019-06-25 江苏大学 A kind of orchard pedestrian detection method based on YOLOv3 algorithm
CN110443210A (en) * 2019-08-08 2019-11-12 北京百度网讯科技有限公司 A kind of pedestrian tracting method, device and terminal
CN110472467A (en) * 2019-04-08 2019-11-19 江西理工大学 The detection method for transport hub critical object based on YOLO v3
CN110619309A (en) * 2019-09-19 2019-12-27 天津天地基业科技有限公司 Embedded platform face detection method based on octave convolution sum YOLOv3
WO2020047854A1 (en) * 2018-09-07 2020-03-12 Intel Corporation Detecting objects in video frames using similarity detectors
CN111161311A (en) * 2019-12-09 2020-05-15 中车工业研究院有限公司 Visual multi-target tracking method and device based on deep learning
AU2020100705A4 (en) * 2020-05-05 2020-06-18 Chang, Jiaying Miss A helmet detection method with lightweight backbone based on yolov3 network
CN111753767A (en) * 2020-06-29 2020-10-09 广东小天才科技有限公司 Method and device for automatically correcting operation, electronic equipment and storage medium

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108614894A (en) * 2018-05-10 2018-10-02 西南交通大学 A kind of face recognition database's constructive method based on maximum spanning tree
WO2020047854A1 (en) * 2018-09-07 2020-03-12 Intel Corporation Detecting objects in video frames using similarity detectors
CN109508710A (en) * 2018-10-23 2019-03-22 东华大学 Based on the unmanned vehicle night-environment cognitive method for improving YOLOv3 network
CN109934121A (en) * 2019-02-21 2019-06-25 江苏大学 A kind of orchard pedestrian detection method based on YOLOv3 algorithm
CN110472467A (en) * 2019-04-08 2019-11-19 江西理工大学 The detection method for transport hub critical object based on YOLO v3
CN110443210A (en) * 2019-08-08 2019-11-12 北京百度网讯科技有限公司 A kind of pedestrian tracting method, device and terminal
CN110619309A (en) * 2019-09-19 2019-12-27 天津天地基业科技有限公司 Embedded platform face detection method based on octave convolution sum YOLOv3
CN111161311A (en) * 2019-12-09 2020-05-15 中车工业研究院有限公司 Visual multi-target tracking method and device based on deep learning
AU2020100705A4 (en) * 2020-05-05 2020-06-18 Chang, Jiaying Miss A helmet detection method with lightweight backbone based on yolov3 network
CN111753767A (en) * 2020-06-29 2020-10-09 广东小天才科技有限公司 Method and device for automatically correcting operation, electronic equipment and storage medium

Non-Patent Citations (7)

* Cited by examiner, † Cited by third party
Title
PRANAV ADARSH et al.: "YOLO v3-Tiny: Object Detection and Recognition using one stage improved model", 2020 6th International Conference on Advanced Computing and Communication Systems (ICACCS) *
TAKUYA FUKAGAI et al.: "Speed-Up of Object Detection Neural Network with GPU", 2018 25th IEEE International Conference on Image Processing (ICIP) *
ZHEDONG ZHENG et al.: "Pedestrian Alignment Network for Large-scale Person Re-Identification", IEEE Transactions on Circuits and Systems for Video Technology *
ZIQI YANG et al.: "A Temporal Sequence Dual-Branch Network for Classifying Hybrid Ultrasound Data of Breast Cancer", IEEE Access *
HOU Zhiqiang et al.: "Improved Faster R-CNN algorithm based on dual-threshold non-maximum suppression" (基于双阈值-非极大值抑制的Faster R-CNN改进算法), Opto-Electronic Engineering (光电工程) *
LIU Tingyu et al.: "Intelligent small-target detection method for rapidly constructing digital-twin models of workshop personnel macro behavior" (面向车间人员宏观行为数字孪生模型快速构建的小目标智能检测方法), Computer Integrated Manufacturing Systems (计算机集成制造系统) *
YI Shi et al.: "Pheasant recognition method based on an enhanced Tiny-YOLOv3 model" (基于增强型Tiny-YOLOV3模型的野鸡识别方法), Transactions of the Chinese Society of Agricultural Engineering (农业工程学报) *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112967289A (en) * 2021-02-08 2021-06-15 上海西井信息科技有限公司 Security check package matching method, system, equipment and storage medium
CN112733821A (en) * 2021-03-31 2021-04-30 成都西交智汇大数据科技有限公司 Target detection method fusing lightweight attention model
CN113139092A (en) * 2021-04-28 2021-07-20 北京百度网讯科技有限公司 Video searching method and device, electronic equipment and medium
CN113139092B (en) * 2021-04-28 2023-11-03 北京百度网讯科技有限公司 Video searching method and device, electronic equipment and medium

Also Published As

Publication number Publication date
CN112257527B (en) 2022-09-02

Similar Documents

Publication Publication Date Title
CN112257527B (en) Mobile phone detection method based on multi-target fusion and space-time video sequence
WO2020173226A1 (en) Spatial-temporal behavior detection method
CN108806334A (en) A kind of intelligent ship personal identification method based on image
CN113052876B (en) Video relay tracking method and system based on deep learning
CN110555420B (en) Fusion model network and method based on pedestrian regional feature extraction and re-identification
CN107256386A (en) Human behavior analysis method based on deep learning
CN115661615A (en) Training method and device of image recognition model and electronic equipment
CN111898566B (en) Attitude estimation method, attitude estimation device, electronic equipment and storage medium
CN114022837A (en) Station left article detection method and device, electronic equipment and storage medium
CN109800756A (en) A kind of text detection recognition methods for the intensive text of Chinese historical document
CN113096159A (en) Target detection and track tracking method, model and electronic equipment thereof
US20230222841A1 (en) Ensemble Deep Learning Method for Identifying Unsafe Behaviors of Operators in Maritime Working Environment
CN109753901A (en) Indoor pedestrian's autonomous tracing in intelligent vehicle, device, computer equipment and storage medium based on pedestrian's identification
CN113065568A (en) Target detection, attribute identification and tracking method and system
CN113901911B (en) Image recognition method, image recognition device, model training method, model training device, electronic equipment and storage medium
CN109086725A (en) Hand tracking and machine readable storage medium
CN116883883A (en) Marine ship target detection method based on generation of anti-shake of countermeasure network
CN114821647A (en) Sleeping post identification method, device, equipment and medium
WO2022205329A1 (en) Object detection method, object detection apparatus, and object detection system
CN111539390A (en) Small target image identification method, equipment and system based on Yolov3
WO2023070955A1 (en) Method and apparatus for detecting tiny target in port operation area on basis of computer vision
CN110210436A (en) A kind of vehicle-mounted camera line walking image-recognizing method
CN115131826A (en) Article detection and identification method, and network model training method and device
CN113469138A (en) Object detection method and device, storage medium and electronic equipment
CN112069997A (en) Unmanned aerial vehicle autonomous landing target extraction method and device based on DenseHR-Net

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant