CN113449634A - Video detection method and device for processing under strong light environment - Google Patents

Video detection method and device for processing under strong light environment

Info

Publication number
CN113449634A
CN113449634A (application number CN202110718258.5A)
Authority
CN
China
Prior art keywords
layer
picture
convolution
detection
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110718258.5A
Other languages
Chinese (zh)
Inventor
谢尔康 (Xie Erkang)
姜蓓蓓 (Jiang Beibei)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Hansheng Information Technology Co ltd
Original Assignee
Shanghai Hansheng Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Hansheng Information Technology Co ltd filed Critical Shanghai Hansheng Information Technology Co ltd
Priority to CN202110718258.5A priority Critical patent/CN113449634A/en
Publication of CN113449634A publication Critical patent/CN113449634A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a video detection method and device for processing under a strong light environment. The method comprises the following steps: acquiring streaming data with an image acquisition unit; synthesizing the streaming data into a picture and importing the picture into a deep neural network; converting, by the deep neural network, the size of the picture into N x N to obtain a converted picture; after the converted picture is subjected to multiple detections by the convolution layers, giving the parameters of each detection target according to preset confidence parameters; and, after the multiple detections, adding a convolutional layer, an upsampling layer, a splicing layer and a convolution group, wherein the convolution group receives data from a convolutional layer in the base network. By adding the convolutional layer, upsampling layer, splicing layer and convolution group, the method and device improve the accuracy, stability and precision of prediction.

Description

Video detection method and device for processing under strong light environment
Technical Field
The present invention relates to a video detection method and apparatus, and in particular to a video detection method and apparatus for processing under a strong light environment.
Background
At present, conventional wharves are steadily developing into automated wharves, and the conditions inside the bridge crane cabs and operation cabs of a port wharf need to be detected in real time to realize automated operation.
However, the environment inside a port wharf bridge crane cab is generally complex: all-weather operation, for example on an island, produces alternating strong and weak light in the cab, and the operation cab moves with the bridge crane during handling and therefore shakes frequently. As a result, video recognition technology suffers from a high false-detection rate and high computational cost.
There is therefore a need for a method of video detection that addresses the above-mentioned problems and disadvantages.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a video detection method and device for processing under a strong light environment, in which the accuracy, stability and precision of prediction are improved by adding a convolutional layer, an upsampling layer, a splicing layer and a convolution group.
The technical scheme adopted by the invention to solve the above technical problem is to provide a video detection method for processing under a strong light environment, comprising the following steps:
acquiring streaming data by an image acquisition unit;
synthesizing the streaming data into a picture and then importing the picture into a deep neural network;
the deep neural network converts the size of the picture into N x N to obtain a converted picture;
after the converted picture is subjected to multiple detections by the convolution layers, giving the parameters of each detection target according to preset confidence parameters;
after the multiple detections, adding a convolutional layer, an upsampling layer, a splicing layer and a convolution group, wherein the convolution group is used for receiving data from a convolutional layer in the base network.
Preferably, the image acquisition unit comprises a vision camera, a lidar and a camera.
Preferably, after the streaming data is synthesized into a picture and then introduced into a neural network, the method further comprises classifying the picture according to a preset rule.
Preferably, the parameters of the detection target include a central x coordinate, a central y coordinate, and a width and a height of the detection frame.
Preferably, the receptive field in the multiple detections corresponds to 16x downsampling, and the anchor box sizes used are (10, 13), (16, 30) and (32, 33).
The present invention further provides a video detection apparatus for processing a video in a strong light environment, which comprises:
an image acquisition unit for acquiring streaming data;
the image synthesis unit is used for synthesizing the streaming data into an image and then importing the image into the deep neural network;
the image conversion unit is used for converting the size of the image into N x N through the deep neural network to obtain a converted image;
the parameter acquisition unit is used for giving the parameters of each detection target according to preset confidence parameters after the converted picture is subjected to multiple detections by the convolution layers;
and the convolution unit is used for adding a convolutional layer, an upsampling layer, a splicing layer and a convolution group after the multiple detections, wherein the convolution group is used for receiving data from a convolutional layer in the base network.
Preferably, the image acquisition unit includes: vision camera, laser radar, camera.
Preferably, after the streaming data is synthesized into a picture and then introduced into a neural network, the method further comprises classifying the picture according to a preset rule.
Preferably, the parameters of the detection target include a central x coordinate, a central y coordinate, and a width and a height of the detection frame.
Preferably, the receptive field in the multiple detections corresponds to 16x downsampling, and the anchor box sizes used are (10, 13), (16, 30) and (32, 33).
Compared with the prior art, the invention has the following beneficial effects: in the method and device for processing video detection under a strong light environment, a convolutional layer, an upsampling layer, a splicing layer and a convolution group are added, and the convolution group receives data from a convolutional layer in the base network, which improves the accuracy, stability and precision of prediction;
further, by using four levels of upsampling, deeper detection can be achieved and the probability of false detection is reduced.
Drawings
FIG. 1 is a flow chart of a video detection method for processing under a strong light environment according to an embodiment of the present invention;
FIG. 2 is a block diagram of a video detection apparatus for processing under a strong light environment according to an embodiment of the present invention;
FIG. 3 is a diagram of the neural network layers used in a video detection method for processing under a strong light environment according to an embodiment of the present invention;
fig. 4 is a schematic diagram of a convolution group in a neural network used in a method for processing video detection in a strong light environment according to an embodiment of the present invention.
Detailed Description
The invention is further described below with reference to the figures and examples.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention. However, it will be apparent to one of ordinary skill in the art that the present invention may be practiced without these specific details. Accordingly, the particular details set forth are merely exemplary, and departures may be made from them while remaining within the spirit and scope of the present invention.
Referring now to fig. 1, fig. 1 is a flow chart of a video detection method for processing under a strong light environment according to an embodiment of the present invention. The embodiment of the invention provides a video detection method for processing under a strong light environment, comprising the following steps:
step 101: acquiring streaming data by an image acquisition unit;
step 102: synthesizing the streaming data into a picture and then importing the picture into a deep neural network;
step 103: the deep neural network converts the size of the picture into N x N to obtain a converted picture;
step 104: after the converted picture is subjected to multiple detections by the convolution layers, giving the parameters of each detection target according to preset confidence parameters;
step 105: after the multiple detections, adding a convolutional layer, an upsampling layer, a splicing layer and a convolution group, wherein the convolution group is used for receiving data from a convolutional layer in the base network.
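The confidence filtering in step 104 amounts to thresholding candidate boxes by their confidence score. A minimal sketch, assuming each raw detection is a (center_x, center_y, width, height, confidence) tuple; the tuple layout and the threshold value are illustrative, not specified by the patent:

```python
def filter_detections(raw, conf_threshold=0.5):
    """Keep detections whose confidence meets the preset threshold and return
    their (center_x, center_y, width, height) parameters. `raw` is a list of
    (cx, cy, w, h, confidence) tuples; this layout is an illustrative assumption."""
    return [(cx, cy, w, h) for (cx, cy, w, h, conf) in raw if conf >= conf_threshold]
```

For example, with a threshold of 0.5, a box scored 0.9 is kept and a box scored 0.2 is discarded.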
In a specific implementation, the streaming data is RTSP (Real Time Streaming Protocol) streaming data.
The image acquisition unit comprises a vision camera, a lidar and a camera.
After the streaming data is synthesized into a picture and imported into the neural network, the picture is classified according to preset rules. Synthesizing the streaming data into a picture includes synthesizing it with the OpenCV library. The preset rules cover the operating rules in the wharf bridge crane cab, for example illegal operations such as smoking, operating electronic equipment while working, and not fastening the seat belt, and can be preset.
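The preset rules can be expressed as a lookup from detected class labels to cab-operation violations. A minimal sketch; the label strings and rule names below are illustrative placeholders, not taken from the patent:

```python
# Hypothetical mapping from detected class labels to cab-operation violations.
# Both the labels and the rule names are illustrative, not from the patent.
VIOLATION_RULES = {
    "cigarette": "smoking in the cab",
    "phone": "operating electronic equipment while working",
    "no_seatbelt": "seat belt not fastened",
}

def classify_frame(detected_labels):
    """Return the violations triggered by the class labels detected in one frame."""
    return [VIOLATION_RULES[label] for label in detected_labels if label in VIOLATION_RULES]
```

Labels with no matching rule (e.g. an ordinary "person" detection) are simply ignored.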
The deep neural network converts the size of the picture into N x N to obtain a converted picture: a given image or video frame, denoted mat, is first resized to N x N, where N can be preset and defaults to 416. The convolutional layers then detect the converted picture in sequence, and finally the parameters of all detection targets meeting the requirements are given according to the specified confidence parameters.
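The resize step can be sketched without any deep-learning framework. A minimal nearest-neighbour version in plain Python, standing in for the library resize (e.g. OpenCV's) a real implementation would use, with N defaulting to 416 as in the text:

```python
def resize_nearest(mat, n=416):
    """Resize a picture (a list of rows of pixel values) to n x n by
    nearest-neighbour sampling. n defaults to 416 as in the patent; this is a
    stand-in for a library resize such as OpenCV's cv2.resize."""
    h, w = len(mat), len(mat[0])
    return [[mat[y * h // n][x * w // n] for x in range(n)] for y in range(n)]
```

Each output pixel (x, y) simply copies the source pixel whose coordinates scale to the same relative position.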
The parameters of a detection target comprise the center x coordinate, the center y coordinate, and the width and height of the detection box. During detection, three detections are performed, corresponding to different receptive fields.
The receptive field in the multiple detections corresponds to 16x downsampling, and the anchor box sizes used are (10, 13), (16, 30) and (32, 33).
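The 16x receptive field together with small anchor boxes matches a YOLO-style detection head. A hedged sketch of how one grid-cell prediction could be decoded into pixel coordinates; the sigmoid/exponential decoding is the standard YOLO convention and is assumed here, not stated in the patent:

```python
import math

ANCHORS = [(10, 13), (16, 30), (32, 33)]  # anchor box sizes from the patent
STRIDE = 16                               # 16x receptive field / downsampling

def decode_box(grid_x, grid_y, tx, ty, tw, th, anchor_index):
    """Decode one raw prediction (tx, ty, tw, th) at grid cell (grid_x, grid_y)
    into pixel-space (cx, cy, w, h), using the standard YOLO convention:
    sigmoid offsets within the cell, exponential scaling of the anchor."""
    aw, ah = ANCHORS[anchor_index]
    sigmoid = lambda v: 1.0 / (1.0 + math.exp(-v))
    cx = (grid_x + sigmoid(tx)) * STRIDE
    cy = (grid_y + sigmoid(ty)) * STRIDE
    return cx, cy, aw * math.exp(tw), ah * math.exp(th)
```

With all raw values at zero, the box sits at the center of its cell with exactly the anchor's size.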
In a specific implementation, after the original prediction heads 1, 2 and 3, a convolutional layer, an upsampling layer and a splicing layer identical to those of the previous level are connected, together with a convolution group that receives the data of a convolutional layer in the base network. Finally, prediction 4 is computed jointly with these data; using this prediction as a cross-check further strengthens the data of the convolutional layer. That is, four levels of upsampling achieve deeper detection and reduce the probability of false detection.
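The added fourth head can be followed as simple shape bookkeeping: the upsampling layer doubles the spatial size, and the splicing layer concatenates channels from the base-network convolutional layer. The channel and grid sizes below are illustrative, not taken from the patent:

```python
def upsample_shape(shape, factor=2):
    """(channels, h, w) of a feature map after upsampling by `factor`."""
    c, h, w = shape
    return (c, h * factor, w * factor)

def concat_shape(a, b):
    """Shape after splicing two feature maps along the channel axis;
    spatial dimensions must already match."""
    assert a[1:] == b[1:], "spatial dimensions must agree before splicing"
    return (a[0] + b[0], a[1], a[2])

# After the third head (e.g. 128 x 52 x 52), upsample and splice with a
# base-network feature map (e.g. 64 x 104 x 104) to feed the fourth head.
head3 = (128, 52, 52)
fused = concat_shape(upsample_shape(head3), (64, 104, 104))
```

The fused map keeps the finer 104 x 104 grid while carrying channels from both paths, which is what allows the deeper, fourth-level detection.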
Referring now to fig. 2, fig. 2 is a block diagram of a video detection apparatus for processing under a strong light environment according to an embodiment of the present invention. The embodiment of the present invention provides a video detection apparatus 21 for processing under a strong light environment, comprising:
an image acquisition unit 211 for acquiring streaming data;
a picture synthesizing unit 212, configured to synthesize the streaming data into a picture and then import the picture into a deep neural network;
a picture conversion unit 213, configured to convert the size of the picture into N × N through the deep neural network to obtain a converted picture;
a parameter acquisition unit 214, configured to give the parameters of each detection target according to preset confidence parameters after the converted picture is subjected to multiple detections by the convolution layers;
a convolution unit 215, configured to add a convolutional layer, an upsampling layer, a splicing layer and a convolution group after the multiple detections, wherein the convolution group is used for receiving data from a convolutional layer in the base network.
In a specific implementation, the image acquisition unit 211 comprises a vision camera, a lidar and a camera.
After the streaming data is synthesized into a picture and imported into the neural network, the picture is classified according to preset rules.
The parameters of a detection target comprise the center x coordinate, the center y coordinate, and the width and height of the detection box.
The receptive field in the multiple detections corresponds to 16x downsampling, and the anchor box sizes used are (10, 13), (16, 30) and (32, 33).
Referring now to fig. 3 and 4, fig. 3 is a diagram of the neural network layers used in a video detection method for processing under a strong light environment according to an embodiment of the present invention, and fig. 4 is a schematic diagram of a convolution group in that neural network.
After the original prediction heads 1, 2 and 3, a convolutional layer, an upsampling layer and a splicing layer identical to those of the previous level are connected, together with a convolution group, wherein the convolution group is used for receiving the data of a convolutional layer in the base network.
In a specific implementation, a convolution group comprises, in order, a 1x1 convolutional layer, a 3x3 convolutional layer, a 1x1 convolutional layer, a 3x3 convolutional layer, and a 1x1 convolutional layer.
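The five-layer group alternates 1x1 and 3x3 kernels. A sketch that tracks channel widths through the group, assuming the common bottleneck pattern in which 1x1 layers halve the width and 3x3 layers double it back (the patent gives only the kernel sizes, not channel widths, so the halving/doubling rule is an assumption):

```python
# Kernel sizes of the convolution group, in order, as stated in the patent.
CONV_GROUP_KERNELS = [1, 3, 1, 3, 1]

def conv_group_channels(in_channels):
    """Channel count after each layer of the group, under the assumed
    bottleneck pattern: 1x1 halves the width, 3x3 doubles it back."""
    widths, c = [], in_channels
    for k in CONV_GROUP_KERNELS:
        c = c // 2 if k == 1 else c * 2
        widths.append(c)
    return widths
```

Starting from 512 input channels this yields 256, 512, 256, 512, 256, the familiar alternating bottleneck of YOLO-style necks.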
In summary, in the method and device for processing video detection under a strong light environment provided by the invention, a convolutional layer, an upsampling layer, a splicing layer and a convolution group are added, and the convolution group receives data from a convolutional layer in the base network, which improves the accuracy, stability and precision of prediction;
further, by using four levels of upsampling, deeper detection can be achieved and the probability of false detection is reduced.
Those of ordinary skill in the art will appreciate that the units and steps of the examples described in connection with the embodiments disclosed herein may be implemented in electronic hardware, computer software, or a combination of both, and the components and steps of each example have been described above generally in terms of their functionality in order to clearly illustrate the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends on the particular application and the design constraints of the technical solution. Skilled artisans may implement the described functionality in different ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The above embodiments are only intended to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art will understand that various changes in form and detail may be made without departing from the spirit and scope of the invention.

Claims (10)

1. A video detection method for processing under a strong light environment is characterized by comprising the following steps:
acquiring streaming data by an image acquisition unit;
synthesizing the streaming data into a picture and then importing the picture into a deep neural network;
the deep neural network converts the size of the picture into N x N to obtain a converted picture;
after the converted picture is subjected to multiple detections by the convolution layers, giving the parameters of each detection target according to preset confidence parameters;
after the multiple detections, adding a convolutional layer, an upsampling layer, a splicing layer and a convolution group, wherein the convolution group is used for receiving data from a convolutional layer in the base network.
2. The video detection method for processing under a strong light environment according to claim 1, wherein the image acquisition unit comprises a vision camera, a lidar and a camera.
3. The method according to claim 1, further comprising classifying the pictures according to a preset rule after synthesizing the streaming data into the pictures and importing the pictures into a neural network.
4. The method of claim 1, wherein the parameters of the detection target include center x coordinate, center y coordinate, width and height of the detection frame.
5. The video detection method for processing under a strong light environment according to claim 1, wherein the receptive field in the multiple detections corresponds to 16x downsampling, and the anchor box sizes used are (10, 13), (16, 30) and (32, 33).
6. A video detection apparatus for handling high light environments, comprising:
an image acquisition unit for acquiring streaming data;
the image synthesis unit is used for synthesizing the streaming data into an image and then importing the image into the deep neural network;
the image conversion unit is used for converting the size of the image into N x N through the deep neural network to obtain a converted image;
the parameter acquisition unit is used for giving the parameters of each detection target according to preset confidence parameters after the converted picture is subjected to multiple detections by the convolution layers;
and the convolution unit is used for adding a convolutional layer, an upsampling layer, a splicing layer and a convolution group after the multiple detections, wherein the convolution group is used for receiving data from a convolutional layer in the base network.
7. The video detection apparatus for processing under a strong light environment according to claim 6, wherein the image acquisition unit comprises a vision camera, a lidar and a camera.
8. The video detection apparatus for processing under a strong light environment according to claim 6, further comprising classifying the pictures according to a preset rule after the streaming data is synthesized into pictures and imported into the neural network.
9. The video detection apparatus for processing under a strong light environment according to claim 6, wherein the parameters of the detection target include the center x coordinate, the center y coordinate, and the width and height of the detection box.
10. The video detection apparatus for processing under a strong light environment according to claim 6, wherein the receptive field in the multiple detections corresponds to 16x downsampling, and the anchor box sizes used are (10, 13), (16, 30) and (32, 33).
CN202110718258.5A 2021-06-28 2021-06-28 Video detection method and device for processing under strong light environment Pending CN113449634A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110718258.5A CN113449634A (en) 2021-06-28 2021-06-28 Video detection method and device for processing under strong light environment

Publications (1)

Publication Number Publication Date
CN113449634A 2021-09-28

Family

ID=77813285

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110718258.5A Pending CN113449634A (en) 2021-06-28 2021-06-28 Video detection method and device for processing under strong light environment

Country Status (1)

Country Link
CN (1) CN113449634A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019144575A1 (en) * 2018-01-24 2019-08-01 中山大学 Fast pedestrian detection method and device
CN110210621A (en) * 2019-06-06 2019-09-06 大连理工大学 A kind of object detection method based on residual error network improvement
CN111310862A (en) * 2020-03-27 2020-06-19 西安电子科技大学 Deep neural network license plate positioning method based on image enhancement in complex environment
CN111898699A (en) * 2020-08-11 2020-11-06 海之韵(苏州)科技有限公司 Automatic detection and identification method for hull target



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination