CN118172720A - Dangerous construction posture detection method in oilfield operation scene - Google Patents

Dangerous construction posture detection method in oilfield operation scene

Info

Publication number
CN118172720A
CN118172720A (application CN202211576407.XA)
Authority
CN
China
Prior art keywords
image
model
information
operator
dangerous
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211576407.XA
Other languages
Chinese (zh)
Inventor
王美
王逸飞
薛娟
杨欣欣
王凯月
李慧颖
延伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Petroleum and Chemical Corp
Technology Inspection Center of Sinopec Shengli Oilfield Co
Shengli Oilfield Testing and Evaluation Research Co Ltd
Original Assignee
China Petroleum and Chemical Corp
Technology Inspection Center of Sinopec Shengli Oilfield Co
Shengli Oilfield Testing and Evaluation Research Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Petroleum and Chemical Corp, Technology Inspection Center of Sinopec Shengli Oilfield Co, and Shengli Oilfield Testing and Evaluation Research Co Ltd
Priority claimed from application CN202211576407.XA
Publication of CN118172720A
Legal status: Pending

Classifications

    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Image Analysis (AREA)

Abstract

The invention relates to a dangerous construction posture detection method in an oilfield operation scene, comprising the following steps: acquiring a monitoring video V_i of a construction-site operation field; screening operator scene images I_i from the monitoring-equipment video; labeling the operator scene images I_i to obtain labeled data I_ann, and dividing I_ann into training samples I_train and validation samples I_valid; training a neural network model with I_train and I_valid to obtain a trained neural network model; acquiring a real-time monitoring video V_test and extracting N consecutive frames of operation-site images as a test group I_test; running the test group I_test through the model to obtain the position information and category information of site operators in each test image; feeding the extracted operator target boxes into a human pose estimation algorithm to obtain the pose information p of each target; and integrating the results and deciding whether to raise an alarm. The invention can detect in real time whether operators pose potential safety hazards during operation, discover unsafe personnel behaviors in time, issue early warnings, and thereby avoid accidents.

Description

Dangerous construction posture detection method in oilfield operation scene
Technical Field
The invention relates to the field of object detection, and in particular to a method for detecting dangerous construction postures in oilfield operation scenes.
Background
The field of object detection has been developing for more than twenty years. From early traditional methods to today's deep learning methods, accuracy has grown steadily higher and speed steadily faster, benefiting from the continued development of deep learning and related technologies and their ongoing adoption in industry and academia. Recognizing special postures in oilfield operation scenes is a new challenge. The main difficulties are that a standalone object detector has a high false-alarm rate when classifying postures, oilfield operation scenes are complex, construction postures are unusual, and little prior knowledge is available. At present there is no method that combines object detection with human pose estimation to monitor operators on a work site.
Traditional detection methods generally judge only a single image and are prone to false alarms and missed detections. To meet more accurately the security requirement of recognizing human postures on petroleum work sites and ensuring safe operation, a technical implementation based on multi-feature object detection is essential. Moreover, after the video captured by a camera passes through a series of image-processing steps, the feature maps downsampled by different factors lose information to different degrees; and when the camera is far from the worker, the target becomes small and detecting such a small target becomes difficult.
There is therefore an urgent need for a method of detecting dangerous construction postures in confined spaces under oilfield operation scenes, so that hidden dangers can be found in time, the life and property of personnel can be protected, and accidents can be avoided.
Disclosure of Invention
The embodiment of the invention provides a dangerous construction posture detection method in an oilfield operation scene. The following presents a simplified summary in order to provide a basic understanding of some aspects of the disclosed embodiments. This summary is not an extensive overview and is intended to neither identify key/critical elements nor delineate the scope of such embodiments. Its sole purpose is to present some concepts in a simplified form as a prelude to the more detailed description that is presented later.
The embodiment of the invention provides a dangerous construction posture detection method in an oilfield operation scene, which aims to realize real-time detection of dangerous posture recognition in a limited space in the oilfield operation scene.
In a method of detecting a hazardous construction pose in an oilfield operation scenario, the improvement comprising:
(1) Acquiring a monitoring video V_i of a construction-site operation field;
(2) Screening operator scene images I_i from the monitoring-equipment video;
(3) Labeling the operator scene images I_i to obtain labeled data I_ann, and dividing I_ann into training samples I_train and validation samples I_valid;
(4) Training the neural network model with the training samples I_train and validation samples I_valid to obtain a trained neural network model;
(5) Acquiring a real-time monitoring video V_test and extracting N consecutive frames of operation-site images as a test group I_test;
(6) Running the test group I_test through the model to obtain the position information and category information of site operators in each test image;
(7) Feeding the extracted operator target boxes into a human pose estimation algorithm to obtain the pose information p of each target;
(8) Integrating the results and deciding whether to raise an alarm.
Preferably, step (4) comprises:
4-1 Initializing the models: constructing and initializing the object detection model YOLOv and the human pose estimation model;
4-2 Iteratively training the models: training the object detection model and the human pose estimation model with the training set, computing the loss function from the network outputs and the ground truth, updating the model weights, and evaluating both models on the validation samples to obtain the best-performing models.
Further, the object detection model YOLOv comprises three parts, namely a backbone network, a feature pyramid network, and a network head, together with an adaptive spatial feature fusion (ASFF) module.
Further, step (4-2) includes:
4-21 Image data enhancement, comprising multi-source and single-source enhancement; multi-source enhancement includes Mosaic and Mixup, and single-source enhancement includes HSV augmentation and random flipping;
4-22 Image normalization, where the normalization operation is:
x' = (x - m) / σ
where x is an input pixel value, m is the mean of the image pixels, and σ is the standard deviation of the image pixels.
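The normalization of step 4-22 can be sketched in plain Python (a minimal illustration over a flat list of pixel values; real pipelines would operate on NumPy arrays or tensors, and the guard against a constant image is an added assumption):

```python
def normalize_image(pixels, m=None, sigma=None):
    """Normalize pixel values to zero mean and unit variance: x' = (x - m) / sigma."""
    if m is None:
        m = sum(pixels) / len(pixels)
    if sigma is None:
        var = sum((x - m) ** 2 for x in pixels) / len(pixels)
        sigma = var ** 0.5 or 1.0  # guard: a constant image would give sigma = 0
    return [(x - m) / sigma for x in pixels]

# Toy 4-pixel "image": mean 3.0, standard deviation sqrt(5)
result = normalize_image([0.0, 2.0, 4.0, 6.0])
```

In practice m and σ are usually precomputed dataset statistics rather than per-image values, which is why the function also accepts them as arguments.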
Preferably, step (6) includes:
6-1 Scaling the input image to a standard size while keeping its original aspect ratio, and normalizing the image;
6-2 Detecting the image obtained in step 6-1 with the object detection model YOLOv to obtain the position and category information of site operators in the test image, including operator coordinates b ∈ R^(4×n), categories c ∈ R^(1×n), and confidences s ∈ R^(1×n), where n is the number of persons detected in the image; this provides prior knowledge for posture recognition;
6-3 Post-processing the result, including applying non-maximum suppression to the detected bounding boxes to remove boxes with high overlap, and using the confidence information to remove false-positive boxes.
Further, the non-maximum suppression operation proceeds as follows:
6-31 Divide all predicted bounding boxes b by category c;
6-32 Within each category, sort the bounding boxes b in descending order of classification confidence s;
6-33 Keep the highest-confidence bounding box b_high in each category;
6-34 Iteratively compute the intersection-over-union IoU of b_high with each remaining bounding box b_rest, and if IoU(b_high, b_rest) > β, remove the corresponding b_rest from the input; the IoU is computed as:
IoU(b_high, b_rest) = area(b_high ∩ b_rest) / area(b_high ∪ b_rest)
6-35 Repeat steps 6-32 to 6-34 until b_rest is empty;
where β is the IoU threshold for non-maximum suppression.
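Steps 6-31 to 6-35 amount to standard per-class non-maximum suppression, which can be sketched as follows (boxes as (x1, y1, x2, y2) tuples; the threshold β = 0.5 and the toy detections are illustrative only):

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def nms(detections, beta=0.5):
    """detections: list of (box, score, cls). Returns the kept detections."""
    kept = []
    classes = {d[2] for d in detections}
    for c in classes:                                    # 6-31: split by class
        rest = sorted((d for d in detections if d[2] == c),
                      key=lambda d: d[1], reverse=True)  # 6-32: sort by score
        while rest:                                      # 6-35: loop until empty
            best = rest.pop(0)                           # 6-33: keep top score
            kept.append(best)
            # 6-34: drop boxes overlapping the kept one by more than beta
            rest = [d for d in rest if iou(best[0], d[0]) <= beta]
    return kept

dets = [((0, 0, 10, 10), 0.9, "person"),
        ((1, 1, 10, 10), 0.8, "person"),   # heavy overlap, gets suppressed
        ((20, 20, 30, 30), 0.7, "person")]
print(len(nms(dets)))  # prints 2
```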
Preferably, step (7) includes:
7-1 Cropping each person from the image according to the predicted coordinates b ∈ R^(4×n) and categories c ∈ R^(1×n);
7-2 Scaling the cropped image to a fixed size and normalizing it;
7-3 Feeding the normalized image into the human pose estimation algorithm to obtain the operator's pose information p.
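Steps 7-1 and 7-2 can be sketched as the coordinate arithmetic below (a hedged illustration: the box-clamping helper and the 192×256 target size are assumptions for the example, not values fixed by the patent):

```python
def crop_box(box, img_w, img_h):
    """Clamp a predicted (x1, y1, x2, y2) box to the image bounds before cropping."""
    x1, y1, x2, y2 = box
    x1, y1 = max(0, int(x1)), max(0, int(y1))
    x2, y2 = min(img_w, int(x2)), min(img_h, int(y2))
    return x1, y1, x2, y2

def letterbox_params(crop_w, crop_h, dst_w, dst_h):
    """Scale factor and padding to fit a crop into dst_w x dst_h, keeping aspect ratio."""
    scale = min(dst_w / crop_w, dst_h / crop_h)
    new_w, new_h = round(crop_w * scale), round(crop_h * scale)
    pad_x, pad_y = (dst_w - new_w) // 2, (dst_h - new_h) // 2
    return scale, new_w, new_h, pad_x, pad_y
```

A usage example: a 100×200 person crop fitted into a 192×256 pose-network input scales by 1.28 to 128×256 and is padded 32 pixels on each side horizontally.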
Preferably, step (8) includes:
8-1 Judging whether the operator's pose information p is complete, and deleting detection results whose bounding boxes yield incomplete pose information, so that only complete pose information p remains;
8-2 Comparing the acquired operator poses in sequence: the operator's posture is compared across three consecutive frames, and if the human pose estimation algorithm classifies the operator's posture as a dangerous action in all three consecutive frames, the operator is judged to be engaged in dangerous activity and alarm information is output.
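The completeness check of step 8-1 and the three-consecutive-frame rule of step 8-2 can be sketched as follows (the 18-keypoint completeness test is a hypothetical stand-in for whatever criterion the pose model actually uses):

```python
from collections import deque

def pose_is_complete(pose, n_keypoints=18):
    """Hypothetical completeness check: all keypoints present (non-None)."""
    return len(pose) == n_keypoints and all(kp is not None for kp in pose)

class DangerAlarm:
    """Raise an alarm only after N consecutive frames classified as dangerous."""

    def __init__(self, n_frames=3):
        self.history = deque(maxlen=n_frames)

    def update(self, pose, is_dangerous):
        if pose is None or not pose_is_complete(pose):
            return False                 # 8-1: drop incomplete detections
        self.history.append(is_dangerous)
        # 8-2: alarm once the window is full and every frame in it is dangerous
        return len(self.history) == self.history.maxlen and all(self.history)
```

Requiring the full window of dangerous frames suppresses one-frame flickers from the detector, at the cost of a two-frame alarm delay.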
Further, the human pose estimation algorithm comprises:
8-21 Performing object detection on the image to be tested and extracting the target boxes of all detected operators;
8-22 Resizing the extracted target boxes, scaling their length and width to a uniform size;
8-23 Extracting features from the scaled target boxes with a ResNet network;
8-24 Feeding these features into two parallel convolutional branches: the first branch predicts a set of part affinity fields (PAFs), which encode the degree of association between body parts; the second branch predicts a set of confidence maps, each representing a particular part of the human skeleton;
8-25 Forming a bipartite graph between part pairs using the part confidence maps, and pruning the weaker links in the bipartite graph using the PAF values; through the above steps, an estimated human skeleton is obtained and assigned to each person in the image.
Preferably, the monitoring equipment is mounted on a utility pole or street lamp at the construction site and can capture images of the work site;
The neural network model adopts the YOLOv network structure, and the YOLOv network uses feature maps downsampled by factors of 8, 16, and 32.
The technical scheme provided by the embodiment of the invention can have the following beneficial effects:
Through the object detection technique, the invention can detect in real time whether operators pose potential safety hazards, which not only meets management requirements but also allows unsafe personnel behaviors to be discovered in time and early warnings to be issued, thereby avoiding accidents.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention as claimed.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention.
FIG. 1 is a flow chart illustrating a method for detecting a dangerous construction gesture in an oilfield operation scenario in accordance with an exemplary embodiment.
FIG. 2 is a partial schematic diagram of a neural network model in a method for detecting a risk construction pose in an oilfield operation scenario according to an exemplary embodiment.
Detailed Description
The following description and the drawings sufficiently illustrate specific embodiments of the invention to enable those skilled in the art to practice them. The embodiments represent only possible variations. Individual components and functions are optional unless explicitly required, and the sequence of operations may vary. Portions and features of some embodiments may be included in, or substituted for, those of others. The scope of embodiments of the invention encompasses the full ambit of the claims, as well as all available equivalents of the claims. Embodiments may be referred to herein, individually or collectively, by the term "invention" merely for convenience and without intending to voluntarily limit the scope of this application to any single invention or inventive concept if more than one is in fact disclosed. Relational terms such as first and second may be used solely to distinguish one entity or action from another without necessarily requiring or implying any actual such relationship or order between them. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed. The various embodiments are described herein in a progressive manner, each focusing on its differences from the others; for identical and similar parts, the embodiments may be referred to one another. Since the structures, products and the like disclosed in the embodiments correspond to the method parts disclosed herein, their description is relatively brief, and the relevant details can be found in the description of the method parts.
The invention is further described below with reference to the accompanying drawings and examples:
The invention discloses a dangerous construction posture detection method in an oilfield operation scene, comprising the following steps:
(1) Acquiring a monitoring video V_i with the monitoring equipment of the work site;
(2) Screening images I_i containing operator scenes from the monitoring video;
(3) Labeling the acquired images with a labeling tool: the annotator marks the position b_gt ∈ R^4 and category c_gt ∈ R^1 of each person in the sample image data I_in to obtain a certain amount of labeled data I_ann, which is divided into training samples I_train and validation samples I_valid in a certain proportion;
(4) Feeding the training samples into the object detection model and the human pose estimation model to train them, and evaluating both models on the validation samples to obtain the best-performing models;
(5) Field-testing the models: acquiring the real-time video to be tested and extracting, with a frame-extraction algorithm, test images I_test that reflect the condition of the work site over a period of time;
(6) Testing the images I_test with the trained object detection model and human pose estimation model to obtain the position and category information of site operators in each test image, including operator coordinates b ∈ R^(4×n), categories c ∈ R^(1×n), and confidences s ∈ R^(1×n), where n is the number of persons detected in the image; this provides prior knowledge for posture recognition;
(7) Feeding the extracted operator target boxes into the human pose estimation algorithm to obtain the pose information p of each target;
(8) Integrating the results and deciding whether to raise an alarm. First, the completeness of the pose information p is checked; if a target box does not contain complete pose information, the object detection result is corrected and most erroneous detections are deleted. For correct detections, the pose information p is analyzed. The human pose estimation algorithm comprises the following steps:
Step one: firstly, carrying out target detection on an image to be detected, and taking out target frames of all detection operators to prepare for a human body posture estimation algorithm.
Step two: and performing resolution processing on the extracted target frame, and scaling the length and the width of the original target frame to a uniform size so that the human body estimation method can uniformly process.
Step three: the human body posture estimation algorithm first uses ResNet network to extract the characteristics of the scaled target frame.
Step four: these features are then input into two parallel branches of the convolutional layer. The first branch predicts a set of component affinity fields, PAF representing the degree of association between components. The second branch predicts a set of confidence maps, each confidence map representing a particular portion of the human skeleton map.
Step five: using the component confidence map, a bipartite graph is formed between the component pairs. Weaker links in the bipartite graph are then pruned using the PAF values. Through the above steps we can estimate the human skeleton map and assign it to each person in the image.
As shown in Fig. 2, a part of the trained neural network model is illustrated (since the actual neural network model is large, only about 1/30 of it is shown).
In the above technical solution, step (4) includes:
4-1 Initializing the models: constructing and initializing the object detection model YOLOv and the human pose estimation model. The object detection model YOLOv comprises three parts, namely a backbone network, a feature pyramid network, and a network head. To obtain better performance, the invention adds an adaptive spatial feature fusion (ASFF) module;
The object detection model YOLOv serves as the detector. It is deployed on a background server and integrated with the front-end monitoring cameras to form a dangerous-behavior detection system for operators. For a detected operator, if dangerous behavior is confirmed by human pose estimation over a sufficient number of frames, an alarm is raised immediately and the operator's position information is output, ensuring that the hazard is discovered and rectified at the earliest possible moment.
In oilfield operation scenarios, workers may adopt many unusual postures while working in confined spaces, for example during high-altitude pipeline work or construction inside enclosed facilities. The invention judges these dangerous special postures specifically for oilfield operation scenes and confined spaces:
(1) Judgment combining object detection with a human pose estimation algorithm:
In a confined space the camera is far from the work scene, so the operator target is small, and the scene contains many sources of interference, for example smaller facilities that resemble human bodies. Using an object detection algorithm alone, the localization and discrimination of operators with dangerous behaviors would be heavily disturbed by the background, introducing confusion and errors into the subsequent alarm logic. Therefore, human pose estimation is added on top of object detection to judge whether a worker is behaving dangerously. At the same time, the pose algorithm can verify the accuracy of the object detection, further improving overall accuracy. If the human pose estimation algorithm classifies an operator's posture as dangerous over multiple consecutive frames, the operator is judged to be engaged in dangerous activity and alarm information is output.
(2) Adaptive spatial feature fusion (ASFF) module:
Dangerous behaviors in confined spaces include workers performing high-altitude pipeline work, working inside enclosed facilities, and the like. Such special postures rarely appear in daily life, so recognizing them accurately requires an object detector with stronger image feature extraction capability, which in turn provides better prior knowledge for the subsequent human posture recognition.
In the YOLOv network, in order to fully exploit the semantic information of high-level features and the fine-grained information of low-level features, the network outputs multi-level features through a feature pyramid network (FPN) to realize multi-feature-map prediction. This lets the network detect targets at various scales while fusing high-level and low-level features. However, conflicts exist between the feature maps of different levels in this structure; these conflicts interfere with gradient computation during training and reduce the effectiveness of the feature pyramid.
The FPN module generates three feature maps of different sizes, which are then fused by the ASFF module; connecting the feature maps of different sizes in this way weakens the conflicts between them. The specific steps are as follows:
Step 1: For the output of each level's feature map, up- or down-sample the remaining feature maps to obtain feature maps of the same size and depth, to facilitate the subsequent fusion;
Step 2: Feed the three processed levels of feature maps into a 1×1×n convolution to obtain three spatial weight vectors, each of size n×h×w;
Step 3: Concatenate them along the channel direction to obtain a 3n×h×w weight-fusion map;
Step 4: To obtain a weight map with 3 channels, apply a 1×1×3 convolution to this feature map to obtain a 3×h×w weight vector;
Step 5: Normalize along the channel direction and multiply the 3 weight vectors onto the 3 feature maps to obtain a fused c×h×w feature map;
Step 6: Apply a 3×3 convolution to obtain the prediction output with 256 output channels;
The fusion formula in step 5 is:
y = α · X^1 + β · X^2 + γ · X^3
where α, β, and γ are the feature-fusion weight coefficients, normalized so that α + β + γ = 1, and X^l denotes the feature map of level l.
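The per-pixel weighted fusion of steps 2 to 5 can be sketched in pure Python (only the softmax normalization and weighted sum are shown; in the real ASFF module the weight logits come from learned 1×1 convolutions, and the maps are multi-channel tensors):

```python
import math

def asff_fuse(level_maps, weight_logits):
    """Fuse three same-sized feature maps with per-pixel softmax weights.

    level_maps   : three h x w maps (nested lists), already resampled to a
                   common size and depth (step 1).
    weight_logits: three h x w maps of raw weight logits (steps 2 to 4).
    Returns the fused map y = a*X1 + b*X2 + g*X3 with a + b + g = 1 (step 5).
    """
    h, w = len(level_maps[0]), len(level_maps[0][0])
    fused = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            logits = [wl[i][j] for wl in weight_logits]
            exps = [math.exp(v) for v in logits]
            total = sum(exps)
            weights = [e / total for e in exps]   # softmax: weights sum to 1
            fused[i][j] = sum(wt * lm[i][j]
                              for wt, lm in zip(weights, level_maps))
    return fused
```

With equal logits every level contributes one third; a dominant logit lets one level's feature pass through almost unchanged, which is exactly how conflicting levels can be suppressed pixel by pixel.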
Conflicting information is filtered spatially through this weighted fusion. In the original feature pyramid, if a position holds a positive sample in one level's feature map but a negative sample at the corresponding position in another level's map, the discontinuity interferes with the gradients and reduces training efficiency. By adding the ASFF module, the weight coefficient of the corresponding negative sample can be controlled through the weight parameters so that its gradient does not disturb the result; this filters out the conflicting information and further strengthens the network's feature fusion capability.
(3) Improved algorithm based on OpenPose:
The application scenario of the invention is a confined space in an oilfield operation scene, which places high demands on the real-time performance of the pose algorithm. Moreover, the personnel postures in this scene are special: the original OpenPose algorithm works well for detecting common human postures, but for the special postures here the network needs a backbone with stronger feature extraction capability.
The original OpenPose architecture has two parts. First, a feature map F is extracted by the convolutional neural network VGG19; then F is fed into a two-branch multi-stage network, in which the upper branch predicts part affinity fields (PAFs), recording the position and direction information between keypoints, and the lower branch predicts part confidence maps (PCMs), each representing a particular part of the human skeleton.
The invention improves on the OpenPose algorithm for the oilfield operation scene. The original model uses VGG19 for feature extraction; its network is relatively shallow and its feature extraction relatively weak, so ResNet18, with fewer parameters and better results, is used instead. ResNet is a deep convolutional neural network with a residual structure; since the structure is no longer a simple stack of convolution kernels, it alleviates the degradation problems of vanishing and exploding gradients in deep networks. The invention also adds depthwise separable convolution to the branches. Depthwise separable convolution splits a convolution into two stages, a channel-by-channel (depthwise) convolution followed by a point-by-point (1×1) convolution, realizing the same function while further reducing the parameter count and computation cost compared with conventional convolution.
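The parameter saving from depthwise separable convolution mentioned above can be checked with simple arithmetic (the 3×3, 256-to-256-channel layer is an illustrative example, not a layer specified by the patent; biases are ignored):

```python
def conv_params(k, c_in, c_out):
    """Parameter count of a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    """Depthwise (k x k per input channel) plus pointwise (1 x 1) convolution."""
    return k * k * c_in + c_in * c_out

# Example: a 3x3 convolution mapping 256 -> 256 channels
std = conv_params(3, 256, 256)                 # 589824 parameters
sep = depthwise_separable_params(3, 256, 256)  # 67840 parameters
ratio = std / sep                              # roughly 8.7x fewer parameters
```

The saving grows with kernel size and channel count, which is why the substitution pays off in real-time settings like this one.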
The improved OpenPose algorithm of the invention is faster and more effective than the original algorithm and is suitable for complex construction-site operation scenes.
4-2 Iteratively training the models: training the object detection model and the human pose estimation model with the training set, computing the loss function from the network outputs and the ground truth, updating the model weights, and evaluating both models on the validation samples to obtain the best-performing models.
In the above technical solution, step (4-2) includes:
4-21 Image data enhancement, comprising multi-source and single-source enhancement; multi-source enhancement includes Mosaic and Mixup, and single-source enhancement includes HSV augmentation and random flipping;
4-22 Image normalization, where the normalization operation is:
x' = (x - m) / σ
where x is an input pixel value, m is the mean of the image pixels, and σ is the standard deviation of the image pixels.
In the above technical solution, step (6) includes:
6-1 Scaling the input image to a suitable size while keeping its original aspect ratio, and normalizing the image to ensure training speed and efficiency;
6-2 Detecting the image obtained in step 6-1 with the object detection model YOLOv to obtain the position and category information of site operators in the test image, including operator coordinates b ∈ R^(4×n), categories c ∈ R^(1×n), and confidences s ∈ R^(1×n), where n is the number of persons detected in the image; this provides prior knowledge for posture recognition;
6-3 Post-processing the result, including applying non-maximum suppression to the detected bounding boxes to remove boxes with high overlap, and using the confidence information to remove false-positive boxes.
In the above technical solution, the non-maximum suppression operation proceeds as follows:
6-31 Divide all predicted bounding boxes b by category c;
6-32 Within each category, sort the bounding boxes b in descending order of classification confidence s;
6-33 Keep the highest-confidence bounding box b_high in each category;
6-34 Iteratively compute the intersection-over-union IoU of b_high with each remaining bounding box b_rest, and if IoU(b_high, b_rest) > β, remove the corresponding b_rest from the input; the IoU is computed as:
IoU(b_high, b_rest) = area(b_high ∩ b_rest) / area(b_high ∪ b_rest)
6-35 Repeat steps 6-32 to 6-34 until b_rest is empty;
where β is the IoU threshold for non-maximum suppression.
In the above technical solution, step (7) includes:
7-1 Cropping each person from the image according to the coordinates b ∈ R^(4×n) and categories c ∈ R^(1×n) predicted in step (6);
7-2 Scaling the cropped image to a fixed size and normalizing it;
7-3 Feeding the normalized image into the human pose estimation algorithm to obtain the operator's pose information p.
In the above technical solution, the step (8) includes
8-1 Judging whether the gesture information p of the operator is complete, and judging that the gesture information extracted from the boundary frame of the operator is incomplete, wherein the original detection result of the high probability of the extracted target frame of the operator is wrong, and deleting the detection results to obtain complete gesture information p;
8-2, comparing in turn all the operator posture information obtained in 8-1: the operator postures in three consecutive frames are compared, and when the human posture estimation algorithm detects a dangerous action posture for the operator in all three consecutive frames, the operator is judged to be engaged in dangerous activity and alarm information is output.
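The three-consecutive-frame rule of step 8-2 can be sketched as a small stateful check. A minimal sketch: the class name `DangerAlarm` and the window parameter are illustrative, and the per-frame dangerous/safe flag is assumed to come from the posture classifier.

```python
from collections import deque

class DangerAlarm:
    """Raise an alarm only when k consecutive frames are flagged as dangerous."""
    def __init__(self, k=3):
        self.k = k
        self.history = deque(maxlen=k)   # sliding window of the last k flags

    def update(self, is_dangerous):
        """Feed one per-frame flag; return True when all of the last k frames were dangerous."""
        self.history.append(bool(is_dangerous))
        return len(self.history) == self.k and all(self.history)
```

Requiring k consecutive detections suppresses spurious single-frame alarms from pose-estimation noise.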
In the above technical solution, the human body posture estimation algorithm includes
8-21, performing target detection on the image to be detected, and extracting the target frames of all detected operators in preparation for the human body posture estimation algorithm.
8-22, performing resolution processing on the extracted target frames, scaling the length and width of each original target frame to a uniform size so that the human posture estimation method can process them uniformly.
8-23, the human body posture estimation algorithm first uses a ResNet network to extract features from the scaled target frame.
8-24, these features are then input into two parallel convolutional branches. The first branch predicts a set of part affinity fields (PAFs), which represent the degree of association between body parts. The second branch predicts a set of confidence maps, each representing a particular part of the human skeleton map.
8-25, a bipartite graph is formed between pairs of parts using the part confidence maps; weaker links in the bipartite graph are then pruned using the PAF values. Through the above steps, a human skeleton map is estimated and assigned to each person in the image.
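The PAF-based pruning of step 8-25 can be sketched as follows: each candidate limb is scored by how well the PAF aligns with the vector between its two joints, and weak or conflicting links in the bipartite graph are discarded. This is a simplified illustration; the greedy matching, the function names `limb_score` and `match_limbs`, the sample count, and the 0.3 threshold are assumptions, not the patent's exact procedure.

```python
import numpy as np

def limb_score(paf_x, paf_y, p_a, p_b, n_samples=10):
    """Average alignment of the PAF with the unit vector from joint a to joint b."""
    v = np.array(p_b, dtype=float) - np.array(p_a, dtype=float)
    norm = np.linalg.norm(v)
    if norm == 0:
        return 0.0
    u = v / norm
    ts = np.linspace(0.0, 1.0, n_samples)
    pts = (np.array(p_a, dtype=float)[None, :] + ts[:, None] * v).round().astype(int)
    # sample the 2-D field at each point along the candidate limb; points are (x, y)
    field = np.stack([paf_x[pts[:, 1], pts[:, 0]], paf_y[pts[:, 1], pts[:, 0]]], axis=1)
    return float((field @ u).mean())

def match_limbs(cands_a, cands_b, paf_x, paf_y, thresh=0.3):
    """Greedily match two joint-candidate lists, pruning links with weak PAF support."""
    scored = [(limb_score(paf_x, paf_y, a, b), i, j)
              for i, a in enumerate(cands_a) for j, b in enumerate(cands_b)]
    scored.sort(key=lambda t: t[0], reverse=True)
    used_a, used_b, pairs = set(), set(), []
    for s, i, j in scored:
        if s < thresh or i in used_a or j in used_b:
            continue  # prune: weak PAF support, or joint already assigned
        pairs.append((i, j))
        used_a.add(i)
        used_b.add(j)
    return pairs
```

With a field pointing uniformly along a limb, that limb's score approaches 1 and the pair is kept; misaligned candidates score near 0 and are pruned.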
In the above technical scheme, the monitoring equipment is installed on a telegraph pole or a street lamp in the construction site and can acquire images of the operation site; the horizontal distance between the monitoring equipment and the operation site is within 100 meters; the persons output by the neural network model are those whose bounding boxes have a confidence greater than 0.4.
The neural network model adopts the YOLOv network model structure, and the YOLOv network model uses feature maps downsampled by factors of 8, 16 and 32. Consecutive N frames of operation-site images to be tested are acquired from the monitoring video every N frames; the group to be tested comprises N frames of operation-site images, where N is 45. The sample data also comprise filling sample data, which are obtained by copying the actually acquired operation-site pictures and changing their brightness, hue and saturation to simulate pictures under different weather conditions.
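The "filling sample data" augmentation described above can be sketched as random photometric jitter. A minimal sketch covering brightness and saturation only (hue rotation is omitted for brevity); the function name, jitter ranges, and fixed random seed are illustrative assumptions, not the patent's parameters.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed for reproducibility of the sketch

def weather_jitter(img, brightness=0.3, saturation=0.3):
    """Simulate lighting/weather variation on a copied site image (uint8 RGB)."""
    img = img.astype(np.float32)
    # brightness: scale all channels by a random factor
    img *= 1.0 + rng.uniform(-brightness, brightness)
    # saturation: interpolate between the image and its grayscale version
    gray = img.mean(axis=2, keepdims=True)
    alpha = 1.0 + rng.uniform(-saturation, saturation)
    img = gray + alpha * (img - gray)
    return np.clip(img, 0, 255).astype(np.uint8)
```

Each call produces a new plausible variant of the same scene, padding the training set with simulated weather conditions.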
In the above technical scheme, the model is trained for 70 iterations after loading ImageNet pre-trained weights. For evaluation, 1-3 test scenes were selected across four time periods (morning, noon, afternoon, evening) and a sufficient number of images were tested; the specific results are shown in the following table.
| Algorithm type | Test scenes | Video frames | Correctly identified | Incorrectly identified | Accuracy |
| --- | --- | --- | --- | --- | --- |
| Personnel detection | 2 | 5001 | 4965 | 36 | 99.2% |
| Personnel detection | 2 | 5208 | 5178 | 30 | 99.4% |
| Personnel detection | 2 | 4314 | 4224 | 90 | 97.9% |
| Personnel detection | 3 | 6465 | 6348 | 117 | 98.1% |
It is to be understood that the invention is not limited to the arrangements and instrumentalities shown in the drawings and described above, and that various modifications and changes may be made without departing from its scope. The scope of the invention is limited only by the appended claims.

Claims (10)

1. A method for detecting a dangerous construction posture in an oilfield operation scene, comprising:
(1) Acquiring a monitoring video V i of a construction site operation field;
(2) Screening operator scene images I i from the monitoring equipment video;
(3) Labeling the personnel scene image I i to obtain labeling data I ann, and dividing the labeling data I ann into a training sample I train and a verification sample I valid;
(4) Training the neural network model by using a training sample I train and a verification sample I valid to obtain a trained neural network model;
(5) Acquiring a real-time monitoring video V test and acquiring continuous N frames of operation site images to be tested as a group I test to be tested;
(6) Testing the group I test to be tested to obtain the position information and the category information of the site operation personnel in each test image;
(7) Inputting a target frame of an extraction operator into a human body posture estimation algorithm so as to acquire posture information p of a target;
(8) And integrating results and judging an alarm.
2. The method for detecting a dangerous construction situation in an oilfield operation scenario according to claim 1, wherein the step (4) comprises
4-1, initializing the model: completing the construction and initialization of the target detection model YOLOv and the human body posture estimation model;
4-2, iteratively training the model: training the target detection model and the human body posture estimation model with the training set, computing the loss function from the network output values and the ground truth, updating the model weights, and evaluating the target detection model and the human body posture estimation model on the verification sample to obtain the best-performing model.
3. The method for detecting a dangerous construction posture in an oilfield operation scene according to claim 2, wherein the target detection model YOLOv comprises three parts, namely a backbone network, a feature pyramid network and a network head, together with an adaptive spatial feature fusion (ASFF) module.
4. The method for detecting a dangerous construction situation in an oilfield operation scenario according to claim 2, wherein the step (4-2) comprises
4-21, Performing image data enhancement, wherein the data enhancement comprises multi-source data enhancement and single-source data enhancement; the multi-source data enhancement comprises a Mosaic enhancement and a Mixup enhancement, and the single-source data enhancement comprises an HSV enhancement and random overturn;
4-22, image normalization, wherein the image normalization formula is:
x̂ = (x − m) / σ
where x is a pixel value, m is the mean of the image pixels and σ is the standard deviation of the image pixels.
5. The method for detecting a dangerous construction posture in an oilfield operation scenario according to claim 1, wherein,
The step (6) comprises
6-1, Scaling the input image to a standard size according to the original proportion, and carrying out normalization operation on the image;
6-2, detecting the image obtained in step 6-1 by using the target detection model YOLOv to obtain the position information and category information of the construction-site operators in the test image, including operator coordinates b ∈ R^{4×n}, categories c ∈ R^{1×n} and confidences s ∈ R^{1×n}, where n is the number of persons detected in the image, thereby providing prior knowledge for posture recognition;
6-3, post-processing the result, including performing non-maximum suppression on the bounding boxes of the detection result to remove boxes with a high degree of overlap, and removing false-positive boxes by using the confidence information.
6. The method for detecting a dangerous construction posture in an oilfield operation scene according to claim 5, wherein the non-maximum suppression operation comprises:
6-31, dividing all predicted bounding boxes b by category c;
6-32, within each category, sorting the bounding boxes b in descending order of classification confidence s;
6-33, retaining the bounding box b_high with the highest confidence in each category;
6-34, iteratively computing the intersection over union IoU between b_high and each remaining bounding box b_rest, and removing the corresponding b_rest from the input if IoU(b_high, b_rest) > β, where the IoU is computed as:
IoU(b_high, b_rest) = area(b_high ∩ b_rest) / area(b_high ∪ b_rest)
6-35, repeating steps 6-32 to 6-34 until b_rest is empty;
where β is the IoU threshold for non-maximum suppression.
7. The method for detecting a dangerous construction situation in an oilfield operation scenario according to claim 1, wherein the step (7) comprises
7-1, cropping the person regions according to the operator coordinates b ∈ R^{4×n} and categories c ∈ R^{1×n} in the prediction result;
7-2, scaling the cut image to a fixed size and performing normalization operation;
7-3, inputting the normalized image into a human body posture estimation algorithm to obtain posture information p of the operator.
8. The method for detecting a dangerous construction situation in an oilfield operation scenario according to claim 1, wherein the step (8) comprises
8-1, judging whether the posture information p of the operator is complete, and deleting detection results whose posture information extracted from the operator's bounding box is incomplete, to obtain complete posture information p;
8-2, sequentially comparing the acquired operator posture information: the operator postures in three consecutive frames are compared, and when the human posture estimation algorithm detects a dangerous action posture for the operator in all three consecutive frames, the operator is judged to be engaged in dangerous activity and alarm information is output.
9. The method for detecting a dangerous construction situation in an oilfield operation scenario according to claim 8, wherein the human body posture estimation algorithm comprises
8-21, performing target detection on the image to be detected, and extracting the target frames of all detected operators;
8-22, performing resolution processing on the extracted target frames, scaling the length and width of each original target frame to a uniform size;
8-23, extracting features from the scaled target frame by using a ResNet network;
8-24, inputting these features into two parallel convolutional branches; the first branch predicts a set of part affinity fields (PAFs), which represent the degree of association between body parts; the second branch predicts a set of confidence maps, each representing a particular part of the human skeleton map;
8-25, forming a bipartite graph between pairs of parts using the part confidence maps; pruning weaker links in the bipartite graph using the PAF values; through the above steps, an estimated human skeleton map is obtained and assigned to each person in the image.
10. The method for detecting the dangerous construction posture in the oilfield operation scene according to claim 1, wherein the monitoring equipment is installed on a telegraph pole or a street lamp in a construction site, and can acquire images of the operation site;
The neural network model adopts YOLOv network model structure, and YOLOv network model adopts 8 times, 16 times and 32 times downsampling characteristic diagrams.
CN202211576407.XA 2022-12-09 2022-12-09 Dangerous construction posture detection method in oilfield operation scene Pending CN118172720A (en)

Publications (1)

Publication Number Publication Date
CN118172720A true CN118172720A (en) 2024-06-11

