CN113449634A - Video detection method and device for processing under strong light environment - Google Patents
- Publication number
- CN113449634A (application number CN202110718258.5A)
- Authority
- CN
- China
- Prior art keywords
- layer
- picture
- convolution
- detection
- neural network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Images
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Computing Systems (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Life Sciences & Earth Sciences (AREA)
- Molecular Biology (AREA)
- Artificial Intelligence (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Image Analysis (AREA)
- Image Processing (AREA)
Abstract
The invention discloses a video detection method and device for processing under a strong light environment. The method comprises the following steps: acquiring streaming data through an image acquisition unit; synthesizing the streaming data into a picture and importing the picture into a deep neural network; converting, by the deep neural network, the size of the picture to N x N to obtain a converted picture; and, after the converted picture has undergone multiple detections by the convolutional layers, outputting the parameters of each detection target according to preset confidence parameters. After the multiple detections, the method further adds a convolutional layer, an upsampling layer, a splicing layer, and a convolution group, the convolution group being used for receiving data of a convolutional layer in the base network. By adding the convolutional layer, upsampling layer, splicing layer, and convolution group, the method and device improve prediction accuracy, stability, and precision.
Description
Technical Field
The present invention relates to a video detection method and apparatus, and in particular to a video detection method and apparatus for processing video under a strong light environment.
Background
At present, conventional wharves are steadily developing into automated wharves, and the conditions inside the bridge-crane cab and operation cabin of a port wharf need to be detected in real time to enable automated operation.
In practice, however, the environment inside a port bridge-crane cab is complex. All-weather operation, for example on an island, exposes the cab to alternating strong and weak light; moreover, the operation cabin moves with the bridge crane during handling and shakes frequently. As a result, conventional video recognition techniques suffer from high false-detection rates and high computational cost.
A video detection method that addresses the above problems and disadvantages is therefore needed.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a video detection method and device for processing under a strong light environment, in which prediction accuracy, stability, and precision are improved by adding a convolutional layer, an upsampling layer, a splicing layer, and a convolution group.
To solve the above technical problem, the invention provides a video detection method for processing under a strong light environment, comprising the following steps:
acquiring streaming data by an image acquisition unit;
synthesizing the streaming data into a picture and then importing the picture into a deep neural network;
the deep neural network converts the size of the picture into N x N to obtain a converted picture;
after the converted picture has undergone multiple detections by the convolutional layers, outputting the parameters of each detection target according to preset confidence parameters;
after the multiple detections, the method further comprises adding a convolutional layer, an upsampling layer, a splicing layer, and a convolution group, the convolution group being used for receiving data of a convolutional layer in the base network.
Preferably, the image acquisition unit comprises: a vision camera, a lidar, and a camera.
Preferably, after the streaming data is synthesized into a picture and imported into the neural network, the method further comprises classifying the picture according to preset rules.
Preferably, the parameters of the detection target comprise the center x coordinate, the center y coordinate, and the width and height of the detection frame.
Preferably, the receptive field in the multiple detections is downsampled 16 times, and the anchor frame sizes used are (10, 13), (16, 30), and (32, 33).
The present invention further provides a video detection apparatus for processing under a strong light environment, comprising:
an image acquisition unit for acquiring streaming data;
the image synthesis unit is used for synthesizing the streaming data into an image and then importing the image into the deep neural network;
the image conversion unit is used for converting the size of the image into N x N through the deep neural network to obtain a converted image;
the parameter acquisition unit is used for outputting the parameters of a detection target according to a preset confidence parameter after the converted picture has undergone multiple detections by the convolutional layers;
and the convolution unit is used for adding, after the multiple detections, a convolutional layer, an upsampling layer, a splicing layer, and a convolution group, the convolution group being used for receiving data of a convolutional layer in the base network.
Preferably, the image acquisition unit comprises: a vision camera, a lidar, and a camera.
Preferably, after the streaming data is synthesized into a picture and imported into the neural network, the picture is classified according to preset rules.
Preferably, the parameters of the detection target comprise the center x coordinate, the center y coordinate, and the width and height of the detection frame.
Preferably, the receptive field in the multiple detections is downsampled 16 times, and the anchor frame sizes used are (10, 13), (16, 30), and (32, 33).
Compared with the prior art, the invention has the following beneficial effects: the method and device for video detection under a strong light environment add a convolutional layer, an upsampling layer, a splicing layer, and a convolution group, the convolution group receiving data of a convolutional layer in the base network, so that prediction accuracy, stability, and precision are improved;
further, using four levels of upsampling enables deeper detection and reduces the probability of false detection.
Drawings
FIG. 1 is a flow chart of a video detection method for processing under a strong light environment according to an embodiment of the present invention;
FIG. 2 is a block diagram of a video detection apparatus for processing under a strong light environment according to an embodiment of the present invention;
FIG. 3 is a diagram of the layer structure of the neural network used in a video detection method for processing under a strong light environment according to an embodiment of the present invention;
fig. 4 is a schematic diagram of a convolution group in the neural network used in a video detection method for processing under a strong light environment according to an embodiment of the present invention.
Detailed Description
The invention is further described below with reference to the figures and examples.
In the following description, numerous specific details are set forth to provide a thorough understanding of the present invention. It will be apparent to one of ordinary skill in the art, however, that the present invention may be practiced without these specific details. The particular details set forth are merely exemplary and may be varied while remaining within the spirit and scope of the present invention.
Referring now to fig. 1, fig. 1 is a flow chart of a video detection method for processing under a strong light environment according to an embodiment of the present invention. The embodiment provides a video detection method for processing under a strong light environment, comprising the following steps:
step 101: acquiring streaming data by an image acquisition unit;
step 102: synthesizing the streaming data into a picture and then importing the picture into a deep neural network;
step 103: the deep neural network converts the size of the picture into N x N to obtain a converted picture;
step 104: after the converted picture has undergone multiple detections by the convolutional layers, outputting the parameters of each detection target according to preset confidence parameters;
step 105: after the multiple detections, the method further comprises adding a convolutional layer, an upsampling layer, a splicing layer, and a convolution group, the convolution group being used for receiving data of a convolutional layer in the base network.
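The five steps above can be sketched as a minimal pipeline. The function names and the placeholder detection logic below are illustrative stand-ins, not from the patent; the network stages are replaced by stubs so only the control flow and the confidence filtering of step 104 are shown:

```python
def synthesize_frames(stream_chunks):
    """Step 102 stand-in: merge streamed chunks into one 'picture'."""
    picture = []
    for chunk in stream_chunks:
        picture.extend(chunk)
    return picture

def resize_to_square(picture, n=416):
    """Step 103 stand-in: record the target N x N size (N defaults to 416)."""
    return {"data": picture, "size": (n, n)}

def detect(converted, conf_threshold=0.5):
    """Steps 104-105 stand-in: keep only candidate targets whose confidence
    meets the preset threshold; real candidates would come from the network."""
    candidates = [
        {"cx": 100, "cy": 120, "w": 40, "h": 60, "conf": 0.9},  # dummy data
        {"cx": 10, "cy": 15, "w": 5, "h": 5, "conf": 0.2},
    ]
    return [c for c in candidates if c["conf"] >= conf_threshold]

def run_pipeline(stream_chunks):
    picture = synthesize_frames(stream_chunks)   # step 102
    converted = resize_to_square(picture)        # step 103
    return detect(converted)                     # steps 104-105
```

In this sketch the low-confidence candidate is discarded, mirroring how the preset confidence parameter gates which detection targets are output.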
In a specific implementation, the streaming data is RTSP (Real Time Streaming Protocol) streaming data.
The image acquisition unit comprises: a vision camera, a lidar, and a camera.
After the streaming data is synthesized into a picture and imported into the neural network, the picture is classified according to preset rules. Synthesizing the streaming data into a picture comprises composing it with the OpenCV library. The preset rules cover the operating rules inside the wharf bridge-crane cab, such as prohibitions on smoking, operating electronic devices while working, failing to fasten a seat belt, and other violations, and can be preset.
The deep neural network converts the size of the picture to N x N to obtain a converted picture: a given image or video frame, denoted mat, first has its size converted to N x N, where N can be preset and defaults to 416. The converted image is then detected by the convolutional layers in sequence, and the parameters of all detection targets that meet the requirements are finally output according to the specified confidence parameter.
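One way to realize the N x N size conversion is nearest-neighbor resampling, sketched below with NumPy. This is an assumption-laden illustration: the patent does not specify the interpolation method, and a production system would more likely use a library routine such as OpenCV's resize, possibly with letterbox padding to preserve aspect ratio.

```python
import numpy as np

def resize_nearest(img, n=416):
    """Resize an H x W x C image to n x n with nearest-neighbor sampling.
    A minimal stand-in for the size conversion described in the patent."""
    h, w = img.shape[:2]
    rows = np.arange(n) * h // n   # source row for each output row
    cols = np.arange(n) * w // n   # source column for each output column
    return img[rows][:, cols]
```

For example, a 1080p camera frame (1080 x 1920 x 3) would come out as 416 x 416 x 3, ready for the convolutional layers.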
The parameters of the detection target comprise the center x coordinate, the center y coordinate, and the width and height of the detection frame. Three detections are performed during the detection process, corresponding to different receptive fields.
The receptive field in the multiple detections is downsampled 16 times, and the anchor frame sizes used are (10, 13), (16, 30), and (32, 33).
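A YOLO-style decoding of these target parameters can be sketched as follows. The sigmoid/exponential parameterization and the name `decode_cell` are assumptions — the patent only specifies the output quantities (center x, center y, width, height), the 16x downsampling, and the anchor sizes:

```python
import numpy as np

ANCHORS = [(10, 13), (16, 30), (32, 33)]  # anchor frame sizes from the patent
STRIDE = 16                                # 16x receptive-field downsampling

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def decode_cell(raw, cell_x, cell_y, anchor):
    """Decode one raw prediction (tx, ty, tw, th, tconf) for one grid cell
    into (cx, cy, w, h, confidence) in input-image pixels."""
    tx, ty, tw, th, tconf = raw
    aw, ah = anchor
    cx = (cell_x + sigmoid(tx)) * STRIDE   # center x, offset within the cell
    cy = (cell_y + sigmoid(ty)) * STRIDE   # center y
    w = aw * np.exp(tw)                    # width scaled from the anchor
    h = ah * np.exp(th)                    # height scaled from the anchor
    return cx, cy, w, h, sigmoid(tconf)
```

Detections whose decoded confidence falls below the preset confidence parameter would then be discarded.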
In a specific implementation, after the original predictions 1, 2, and 3, a convolutional layer, an upsampling layer, and a splicing layer identical to those of the preceding level are attached, together with a convolution group that receives the data of a convolutional layer in the base network. A fourth prediction is then computed from this data and, used as a cross-check, further strengthens the convolutional-layer features. In other words, using four levels of upsampling enables deeper detection and reduces the probability of false detection.
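The shape bookkeeping of such a fourth prediction branch can be illustrated in plain Python. The concrete channel counts and feature-map sizes below are hypothetical, chosen only to show how upsampling and splicing make the deep map compatible with the base-network layer:

```python
def upsample2x(shape):
    """Double the spatial dimensions of a (C, H, W) feature map."""
    c, h, w = shape
    return (c, h * 2, w * 2)

def splice(shape_a, shape_b):
    """Splicing layer: channel-wise concatenation; spatial sizes must match."""
    assert shape_a[1:] == shape_b[1:], "spatial dims must agree before splicing"
    return (shape_a[0] + shape_b[0], shape_a[1], shape_a[2])

# Hypothetical shapes: a deep map feeding prediction 3, and a shallower
# base-network convolutional layer that the convolution group reads from.
deep = (256, 13, 13)
base_skip = (128, 26, 26)

# Upsample the deep map, then splice it with the base-network features;
# the result is what the convolution group processes for prediction 4.
fused = splice(upsample2x(deep), base_skip)
```

The spliced map combines coarse, semantically rich features with finer base-network detail, which is what allows the fourth prediction to act as a cross-check on the earlier three.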
Referring now to fig. 2, fig. 2 is a block diagram of a video detection apparatus for processing under a strong light environment according to an embodiment of the present invention. The embodiment provides a video detection apparatus 21 for processing under a strong light environment, comprising:
an image acquisition unit 211 for acquiring streaming data;
a picture synthesizing unit 212, configured to synthesize the streaming data into a picture and then import the picture into a deep neural network;
a picture conversion unit 213, configured to convert the size of the picture into N × N through the deep neural network to obtain a converted picture;
a parameter obtaining unit 214, configured to output the parameters of a detection target according to a preset confidence parameter after the converted picture has undergone multiple detections by the convolutional layers; and
a convolution unit 215, configured to add, after the multiple detections, a convolutional layer, an upsampling layer, a splicing layer, and a convolution group, the convolution group receiving data of a convolutional layer in the base network.
In a specific implementation, the image acquisition unit 211 comprises: a vision camera, a lidar, and a camera.
After the streaming data is synthesized into a picture and imported into the neural network, the picture is classified according to preset rules.
The parameters of the detection target comprise the center x coordinate, the center y coordinate, and the width and height of the detection frame.
The receptive field in the multiple detections is downsampled 16 times, and the anchor frame sizes used are (10, 13), (16, 30), and (32, 33).
Referring now to fig. 3 and fig. 4, fig. 3 is a diagram of the layer structure of the neural network used in a video detection method for processing under a strong light environment according to an embodiment of the present invention, and fig. 4 is a schematic diagram of a convolution group in that neural network.
After the original predictions 1, 2, and 3, a convolutional layer, an upsampling layer, and a splicing layer identical to those of the preceding level are connected, together with a convolution group, the convolution group being used for receiving the data of a convolutional layer in the base network.
In a specific implementation, a convolution group comprises, in order: a 1x1 convolutional layer, a 3x3 convolutional layer, a 1x1 convolutional layer, a 3x3 convolutional layer, and a 1x1 convolutional layer.
In summary, the method and apparatus for video detection under a strong light environment provided by the invention add a convolutional layer, an upsampling layer, a splicing layer, and a convolution group, the convolution group being used for receiving data of a convolutional layer in the base network, so that prediction accuracy, stability, and precision are improved;
further, using four levels of upsampling enables deeper detection and reduces the probability of false detection.
Those of ordinary skill in the art will appreciate that the units and steps of the examples described in connection with the embodiments disclosed herein may be implemented in electronic hardware, computer software, or a combination of the two; the foregoing description presents the components and steps of the examples in functional terms to illustrate this interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends on the particular application and the design constraints of the technical solution. Skilled artisans may implement the described functionality differently for each particular application, but such implementation decisions should not be interpreted as departing from the scope of the present invention.
The above examples are provided only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those skilled in the art that various changes in form and detail may be made without departing from the spirit and scope of the invention.
Claims (10)
1. A video detection method for processing under a strong light environment is characterized by comprising the following steps:
acquiring streaming data by an image acquisition unit;
synthesizing the streaming data into a picture and then importing the picture into a deep neural network;
the deep neural network converts the size of the picture into N x N to obtain a converted picture;
after the converted picture has undergone multiple detections by the convolutional layers, outputting the parameters of each detection target according to preset confidence parameters;
after the multiple detections, the method further comprises adding a convolutional layer, an upsampling layer, a splicing layer, and a convolution group, the convolution group being used for receiving data of a convolutional layer in the base network.
2. The video detection method for processing under a strong light environment of claim 1, wherein the image acquisition unit comprises: a vision camera, a lidar, and a camera.
3. The method according to claim 1, further comprising classifying the pictures according to a preset rule after synthesizing the streaming data into the pictures and importing the pictures into a neural network.
4. The method of claim 1, wherein the parameters of the detection target include center x coordinate, center y coordinate, width and height of the detection frame.
5. The video detection method for processing under a strong light environment according to claim 1, wherein the receptive field in the multiple detections is downsampled 16 times, and the anchor frame sizes used are (10, 13), (16, 30), and (32, 33).
6. A video detection apparatus for handling high light environments, comprising:
an image acquisition unit for acquiring streaming data;
the image synthesis unit is used for synthesizing the streaming data into an image and then importing the image into the deep neural network;
the image conversion unit is used for converting the size of the image into N x N through the deep neural network to obtain a converted image;
the parameter acquisition unit is used for outputting the parameters of a detection target according to a preset confidence parameter after the converted picture has undergone multiple detections by the convolutional layers;
and the convolution unit is used for adding, after the multiple detections, a convolutional layer, an upsampling layer, a splicing layer, and a convolution group, the convolution group being used for receiving data of a convolutional layer in the base network.
7. The video detection apparatus for processing under a strong light environment according to claim 6, wherein the image acquisition unit comprises: a vision camera, a lidar, and a camera.
8. The video detection apparatus according to claim 6, further comprising classifying the pictures according to preset rules after the streaming data is synthesized into pictures and imported into the neural network.
9. The video detection apparatus according to claim 6, wherein the parameters of the detection target comprise the center x coordinate, the center y coordinate, and the width and height of the detection frame.
10. The video detection apparatus for processing under a strong light environment according to claim 6, wherein the receptive field in the multiple detections is downsampled 16 times, and the anchor frame sizes used are (10, 13), (16, 30), and (32, 33).
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110718258.5A CN113449634A (en) | 2021-06-28 | 2021-06-28 | Video detection method and device for processing under strong light environment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113449634A true CN113449634A (en) | 2021-09-28 |
Family
ID=77813285
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110718258.5A Pending CN113449634A (en) | 2021-06-28 | 2021-06-28 | Video detection method and device for processing under strong light environment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113449634A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2019144575A1 (en) * | 2018-01-24 | 2019-08-01 | 中山大学 | Fast pedestrian detection method and device |
CN110210621A (en) * | 2019-06-06 | 2019-09-06 | 大连理工大学 | A kind of object detection method based on residual error network improvement |
CN111310862A (en) * | 2020-03-27 | 2020-06-19 | 西安电子科技大学 | Deep neural network license plate positioning method based on image enhancement in complex environment |
CN111898699A (en) * | 2020-08-11 | 2020-11-06 | 海之韵(苏州)科技有限公司 | Automatic detection and identification method for hull target |
- 2021-06-28: application CN202110718258.5A filed in China; published as CN113449634A, status Pending
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP3935499B2 (en) | Image processing method, image processing apparatus, and image processing program | |
CN109636754B (en) | Extremely-low-illumination image enhancement method based on generation countermeasure network | |
CN110642109B (en) | Vibration detection method and device for lifting equipment, server and storage medium | |
CN104106260B (en) | Control based on geographical map | |
CN109313806A (en) | Image processing apparatus, image processing system, image processing method and program | |
CN110717532A (en) | Real-time detection method for robot target grabbing area based on SE-RetinaGrasp model | |
TW201822708A (en) | Heart rate activity detecting system based on motion images and method thereof | |
CN110149467A (en) | Mobile phone | |
CN107547839A (en) | Remote control table based on graphical analysis | |
CN113449634A (en) | Video detection method and device for processing under strong light environment | |
CN113936252A (en) | Battery car intelligent management system and method based on video monitoring | |
CN109716350A (en) | Optical pickup and electronic equipment | |
CN107248151B (en) | Intelligent liquid crystal display detection method and system based on machine vision | |
CN109934768B (en) | Sub-pixel displacement image acquisition method based on registration mode | |
CN112183287A (en) | People counting method of mobile robot under complex background | |
CN111414886A (en) | Intelligent recognition system for human body dynamic characteristics | |
CN111147815A (en) | Video monitoring system | |
CN110136085A (en) | A kind of noise-reduction method and device of image | |
CN111402210B (en) | Super-resolution positioning method and system for single-molecule fluorescence signal image | |
CN108182400A (en) | The recognition methods of charactron Dynamic Announce and system | |
CN101115132B (en) | Method for obtaining high signal-to-noise ratio image | |
JP3784474B2 (en) | Gesture recognition method and apparatus | |
JP3627249B2 (en) | Image processing device | |
JP3413778B2 (en) | Image processing device | |
CN113518179A (en) | Method and device for identifying and positioning objects in large range of video |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||