CN117727046A - Novel mountain torrent front-end instrument and meter reading automatic identification method and system - Google Patents

Novel mountain torrent front-end instrument and meter reading automatic identification method and system

Info

Publication number
CN117727046A
Authority
CN
China
Prior art keywords
pointer
instrument
module
convolution
image
Prior art date
Legal status
Pending
Application number
CN202311791640.4A
Other languages
Chinese (zh)
Inventor
陈源
段瑞杰
王晓龙
安国成
Current Assignee
Eccom Network System Co ltd
Original Assignee
Eccom Network System Co ltd
Priority date
Filing date
Publication date
Application filed by Eccom Network System Co ltd
Priority to CN202311791640.4A
Publication of CN117727046A

Classifications

    • Y — GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 — TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A — TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A10/00 — TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE at coastal zones; at river basins
    • Y02A10/40 — Controlling or monitoring, e.g. of flood or hurricane; Forecasting, e.g. risk assessment or mapping

Landscapes

  • Image Processing (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a novel automatic recognition method and system for the readings of instruments and meters at the front end of mountain torrents, comprising the following steps: constructing the backbone network of an A-U²-Net convolutional neural network model; adding the features extracted by depth separable convolution to the output features of the ECA module and inputting the sum into the next feature extraction layer for further feature extraction; fusing the semantic features output in the encoding stage; constructing an up-sampling kernel prediction module and a content-aware module; preprocessing the pointer instrument data set, annotating the pointer instruments in Pascal VOC format, and inputting the data set into a YOLOv7 model for training; inputting the meter images and the generated data into the improved A-U²-Net model for training; after the combined algorithm detects the result for an input picture, the result is decoded and written onto the separated picture. The invention eliminates the invalid feature interference that remains in the feature channels when the RSU module extracts multi-scale features, and further enhances the robustness of the model.

Description

Novel mountain torrent front-end instrument and meter reading automatic identification method and system
Technical Field
The invention relates to the field of computer vision, in particular to a novel automatic identification method and system for mountain torrent front-end instrument readings.
Background
Smart water conservancy builds on the informatization, digitalization and intelligentization of water conservancy. It uses high-performance digital information technologies such as cloud computing, the Internet of Things and artificial intelligence to deeply develop and highly integrate water conservancy information resources, realizes networked and intelligent water conservancy perception, transmission and application, can provide accurate water resource information for water conservancy managers, assists scientific decision-making, and effectively prevents hidden water conservancy dangers while improving water resource utilization. At present, some water conservancy pointer-type instruments in China are still supervised manually. In dam corridor operation and maintenance, for example, the installed water pressure testers, seepage pressure gauges, soil pressure gauges and harmful gas monitors cannot report monitoring data remotely because of communication faults or the lack of a communication interface, so information is still collected by manual inspection and transcription. This mode not only wastes time and labor, with an efficiency that can hardly meet the intensity required for daily instrument monitoring, but is also easily affected by subjective and objective factors such as illumination, the distance between the instrument and the observation position, individual reading habits, psychological state and fatigue, and it carries certain safety risks.
With the combination of artificial intelligence technology and water conservancy informatization, automatic identification of various pointer meters has attracted great attention. For example, some researchers use the Hough transform to map the plane coordinates of a picture into Hough space, convert RGB space into HSV space, and use color features to detect the start and end scale marks of a meter; others select feature point detection algorithms such as SIFT and ORB and realize instrument positioning by matching image corner points so as to read the pointer value. However, these methods place strict requirements on image quality, are easily affected by environmental factors such as illumination, deformation and occlusion, and their reading recognition accuracy cannot meet meter reading requirements under complex backgrounds. In deep learning, target detection algorithms and image segmentation algorithms are mainly based on convolutional neural networks and have been applied well in many production settings. U²-Net is a salient object detection network based on an encoder-decoder structure, whose backbone is formed by stacking RSU (ReSidual U-block) modules into a U-shaped structure; its performance is excellent, but the network structure still has room for improvement when used directly for extracting and segmenting the scales and pointers of water conservancy pointer instruments under complex backgrounds.
In summary, automatic identification of various water conservancy pointer meters faces three main problems: first, how to accurately detect the dial area in complex and changeable scenes; second, how to accurately segment the pointer and scale areas within the dial area, since algorithms such as the Hough transform, color feature extraction and SIFT can hardly segment the pointer, dial scale and related information adaptively and therefore do not meet actual production requirements; third, how to automatically correct the dial position in the face of occlusion and angle changes, so as to guarantee the accuracy of the readings of various pointer meters and meet actual production needs.
Application number CN202310100878.1, titled "Novel mountain torrent front-end instrument reading automatic identification algorithm and device", relates to a deep-learning-based water conservancy pointer instrument reading identification method comprising the following steps: constructing an instrument detection model; training the instrument detection model: constructing a sample set in which the labeling information of each sample includes dial position information and the positions of several key points, the key points including the pointer end point and the dial screw point; inputting the sample set into the instrument detection model to obtain a prediction result; calculating the losses between the prediction result and the labeling information with a loss function, including target confidence loss, dial positioning loss and key point positioning loss; iteratively updating the instrument detection model parameters based on the losses; acquiring an image to be processed; inputting the image to be processed into the trained instrument detection model to obtain the dial position and the key point positions; and calculating the meter reading from the relative position relationship between the pointer end point and the dial screw point.
That method detects the water conservancy pointer instrument panel, the dial screw point, the pointer end point and the scale starting point purely through YOLOv7-based target detection, without an innovatively designed network structure. The accuracy of the instrument reading depends excessively on the dial screw point and pointer end point information, and when this information is partially occluded the algorithm cannot be applied in actual production.
That patent does not provide an automatic pointer instrument reading algorithm that combines the YOLOv7 target detection algorithm with an improved U²-Net network, cannot increase the segmentation capability of the model while reducing the number of model parameters, and does not provide an anti-interference, accurate and efficient water conservancy pointer instrument reading method.
Disclosure of Invention
Aiming at the defects in the prior art, the invention aims to provide a novel automatic identification method and system for the readings of instruments and meters at the front end of mountain torrents.
The invention provides a novel automatic recognition method for the readings of instruments and meters at the front end of mountain torrents, which comprises the following steps:
Step S1: constructing the backbone network of the A-U²-Net convolutional neural network model;
Step S2: adding the features extracted by depth separable convolution to the output features of the ECA module, and then inputting the summed features into the next feature extraction layer for further feature extraction;
Step S3: fusing the semantic features output by the encoding stage and inputting them into the symmetric decoding stage for processing; completing the up-sampling operation by constructing an up-sampling kernel prediction module and a content-aware module;
Step S4: preprocessing the pointer instrument data set, annotating the pointer instruments in Pascal VOC format, and inputting the data set into a YOLOv7 model for training;
Step S5: inputting the meter images and the generated data into the improved A-U²-Net model for training, saving the YOLOv7 target detection and A-U²-Net image segmentation model weights, and combining them; after the combined algorithm detects the result for an input picture, the result is decoded and written onto the separated picture.
Preferably, in said step S1:
The backbone network of the A-U²-Net convolutional neural network model is constructed by introducing a channel attention ECA module into the internal encoding stage of the RSU module. The ECA module takes a feature map of size H×W×C as input, applies global average pooling to obtain a channel descriptor of size [C,1], and learns through a weight-shared one-dimensional convolution whose kernel size is adaptively determined by a mapping of the channel dimension C, as shown in the following equation:
where k is the convolution kernel size, representing the coverage of local cross-channel interaction; C is the number of channels; and γ and b adjust the ratio between the channel number C and the convolution kernel size. The learned weights are then applied to the input tensor to realize channel weighting, calculated as:
where X is the input tensor, GAP is global average pooling, and σ is the sigmoid activation function.
Preferably, in said step S2:
Depth separable convolution is used; a residual structure adds the features extracted by the depth separable convolution to the output features of the ECA module, and the summed features are input into the next feature extraction layer for further feature extraction. The input feature map first undergoes channel-by-channel (depthwise) convolution, performing the convolution operation with one kernel per channel, followed by point-by-point (1×1) convolution;
in the step S3:
A semantic embedded branch structure is added at the skip connections; it fuses the different semantic features output by the encoding stage, inputs the fused semantic features into the symmetric decoding stage for processing, and models a global multi-scale context. The high-level features are convolved and up-sampled, then multiplied pixel by pixel with the low-level semantic features;
The CARAFE operator is used to replace the bilinear interpolation in the SEB structure to complete the up-sampling operation, and an up-sampling kernel prediction module and a content-aware module are constructed. The up-sampling kernel prediction module generates an up-sampling kernel from the content of the input features through feature map channel compression, content encoding, and up-sampling kernel prediction and normalization: a 1×1 convolution kernel reduces the number of channels of the input feature map, a convolution layer of size k_encoder × k_encoder performs up-sampling kernel prediction to obtain an up-sampling kernel of shape σH × σW × k_up², and the up-sampling kernel obtained in the previous step is normalized with a Softmax function. The content-aware module takes the dot product of a k_up × k_up feature block in the original image with the predicted up-sampling kernel at that point to obtain the output value, as given by the following formula:
where r = ⌊k_up/2⌋ (k_up being the reassembly kernel size), w_(n,m) is the reassembly kernel, x_(i+n,j+m) is the pixel value at the sampled position, and n and m are the sampling offsets relative to the target position (i, j).
Preferably, in said step S4:
The collected water conservancy pointer instrument data set is preprocessed before training: image enhancement algorithms including bilateral filtering, median filtering and fuzzy-set functions are used to improve the contrast of the instruments and eliminate noise interference in the pointer instrument samples, the bilateral filter being given by formula (4):
where c(ξ, x) represents the geometric proximity between the neighborhood center point x and a neighboring point ξ, dξ is a normalization parameter, and k_d is the image pixel-domain kernel;
The pointer instrument VOC-format data set is annotated with the LabelImg image annotation software, and the annotated data set is divided into a training set, a validation set and a test set; instrument coordinate information is extracted from the annotated training samples and the instruments are cropped out, the cropped samples are annotated with LabelMe to produce JSON data, and mask labels are generated;
The data set is input into the YOLOv7 model for training: the input image passes through the backbone for feature extraction to obtain feature maps at 3 different scales, and the head further integrates these multi-scale features to complete the localization of large, medium and small pointer instrument targets, thereby determining the positions of all pointer instruments in the image and cropping out the sub-image of each dial.
Preferably, in said step S5:
The cropped instrument images and the generated mask data are input into the improved A-U²-Net model for training to complete the segmentation of the pointer and dial regions in the dial image, and morphological operations are combined to suppress blurred pointer and scale regions;
For the segmented pointer and dial information, the relative position relationship between the pointer and the dial is obtained: the pointer and scale information segmented by the A-U²-Net model is expanded in polar coordinates, a polar coordinate system is established with the center of the sample image separated in the detection stage as the origin, the segmented pointer and scale regions are mapped into a rectangular image, and the starting position of the scale and the centroid position of the pointer are calculated to complete the reading of the water conservancy pointer instrument, the polar-to-rectangular coordinate conversion formulas and the reading formula being shown in (5), (6) and (7):
x=r+ρcos(θ) (5)
y=r-ρsin(θ) (6)
wherein ρ is the polar radius and θ is the polar angle;
The YOLOv7 target detection and A-U²-Net image segmentation model weights are saved and combined into an end-to-end structure; after the final combined algorithm detects the result for the original input picture, the result is decoded and written onto the separated picture.
The invention provides a novel automatic recognition system for the readings of instruments and meters at the front end of a mountain torrent, which comprises the following components:
Module M1: constructing the backbone network of the A-U²-Net convolutional neural network model;
Module M2: adding the features extracted by depth separable convolution to the output features of the ECA module, and then inputting the summed features into the next feature extraction layer for further feature extraction;
Module M3: fusing the semantic features output by the encoding stage and inputting them into the symmetric decoding stage for processing; completing the up-sampling operation by constructing an up-sampling kernel prediction module and a content-aware module;
Module M4: preprocessing the pointer instrument data set, annotating the pointer instruments in Pascal VOC format, and inputting the data set into a YOLOv7 model for training;
Module M5: inputting the meter images and the generated data into the improved A-U²-Net model for training, saving the YOLOv7 target detection and A-U²-Net image segmentation model weights, and combining them; after the combined algorithm detects the result for an input picture, the result is decoded and written onto the separated picture.
Preferably, in said module M1:
The backbone network of the A-U²-Net convolutional neural network model is constructed by introducing a channel attention ECA module into the internal encoding stage of the RSU module. The ECA module takes a feature map of size H×W×C as input, applies global average pooling to obtain a channel descriptor of size [C,1], and learns through a weight-shared one-dimensional convolution whose kernel size is adaptively determined by a mapping of the channel dimension C, as shown in the following equation:
where k is the convolution kernel size, representing the coverage of local cross-channel interaction; C is the number of channels; and γ and b adjust the ratio between the channel number C and the convolution kernel size. The learned weights are then applied to the input tensor to realize channel weighting, calculated as:
where X is the input tensor, GAP is global average pooling, and σ is the sigmoid activation function.
Preferably, in said module M2:
Depth separable convolution is used; a residual structure adds the features extracted by the depth separable convolution to the output features of the ECA module, and the summed features are input into the next feature extraction layer for further feature extraction. The input feature map first undergoes channel-by-channel (depthwise) convolution, performing the convolution operation with one kernel per channel, followed by point-by-point (1×1) convolution;
in the module M3:
A semantic embedded branch structure is added at the skip connections; it fuses the different semantic features output by the encoding stage, inputs the fused semantic features into the symmetric decoding stage for processing, and models a global multi-scale context. The high-level features are convolved and up-sampled, then multiplied pixel by pixel with the low-level semantic features;
The CARAFE operator is used to replace the bilinear interpolation in the SEB structure to complete the up-sampling operation, and an up-sampling kernel prediction module and a content-aware module are constructed. The up-sampling kernel prediction module generates an up-sampling kernel from the content of the input features through feature map channel compression, content encoding, and up-sampling kernel prediction and normalization: a 1×1 convolution kernel reduces the number of channels of the input feature map, a convolution layer of size k_encoder × k_encoder performs up-sampling kernel prediction to obtain an up-sampling kernel of shape σH × σW × k_up², and the up-sampling kernel obtained in the previous step is normalized with a Softmax function. The content-aware module takes the dot product of a k_up × k_up feature block in the original image with the predicted up-sampling kernel at that point to obtain the output value, as given by the following formula:
where r = ⌊k_up/2⌋ (k_up being the reassembly kernel size), w_(n,m) is the reassembly kernel, x_(i+n,j+m) is the pixel value at the sampled position, and n and m are the sampling offsets relative to the target position (i, j).
Preferably, in said module M4:
The collected water conservancy pointer instrument data set is preprocessed before training: image enhancement algorithms including bilateral filtering, median filtering and fuzzy-set functions are used to improve the contrast of the instruments and eliminate noise interference in the pointer instrument samples, the bilateral filter being given by formula (4):
where c(ξ, x) represents the geometric proximity between the neighborhood center point x and a neighboring point ξ, dξ is a normalization parameter, and k_d is the image pixel-domain kernel;
The pointer instrument VOC-format data set is annotated with the LabelImg image annotation software, and the annotated data set is divided into a training set, a validation set and a test set; instrument coordinate information is extracted from the annotated training samples and the instruments are cropped out, the cropped samples are annotated with LabelMe to produce JSON data, and mask labels are generated;
The data set is input into the YOLOv7 model for training: the input image passes through the backbone for feature extraction to obtain feature maps at 3 different scales, and the head further integrates these multi-scale features to complete the localization of large, medium and small pointer instrument targets, thereby determining the positions of all pointer instruments in the image and cropping out the sub-image of each dial.
Preferably, in said module M5:
The cropped instrument images and the generated mask data are input into the improved A-U²-Net model for training to complete the segmentation of the pointer and dial regions in the dial image, and morphological operations are combined to suppress blurred pointer and scale regions;
For the segmented pointer and dial information, the relative position relationship between the pointer and the dial is obtained: the pointer and scale information segmented by the A-U²-Net model is expanded in polar coordinates, a polar coordinate system is established with the center of the sample image separated in the detection stage as the origin, the segmented pointer and scale regions are mapped into a rectangular image, and the starting position of the scale and the centroid position of the pointer are calculated to complete the reading of the water conservancy pointer instrument, the polar-to-rectangular coordinate conversion formulas and the reading formula being shown in (5), (6) and (7):
x=r+ρcos(θ) (5)
y=r-ρsin(θ) (6)
Wherein ρ is the polar diameter and θ is the polar angle;
The YOLOv7 target detection and A-U²-Net image segmentation model weights are saved and combined into an end-to-end structure; after the final combined algorithm detects the result for the original input picture, the result is decoded and written onto the separated picture.
Compared with the prior art, the invention has the following beneficial effects:
1. The invention introduces the channel attention ECA module into the internal encoding of the RSU module, eliminating the invalid feature interference that remains in the feature channels when the RSU module extracts multi-scale features, and further enhancing the robustness of the model;
2. The invention uses depth separable convolution in place of conventional convolution in the down-sampling stage of the RSU module, and uses a residual structure to fuse the features extracted by the depth separable convolution with the features extracted by the ECA module before inputting them into the next layer for feature extraction, so that while the computational cost of the network's feature extraction structure is reduced, each level can focus on more effective feature information channels and its feature extraction capability is effectively enhanced;
3. The invention constructs an up-sampling kernel prediction module and a content-aware module at the skip connections, adds a semantic embedded branch (SEB) structure, and replaces the bilinear interpolation in the SEB structure with the CARAFE operator, combining them into a new CSEB module to complete the up-sampling operation, which effectively prevents the network from losing excessive spatial information through the skip connections;
4. After the pointer and dial information are separated, the invention uses the mapping relationship between polar and spatial coordinates to retrieve the dial starting point and pointer position, providing a high-precision automatic meter reading algorithm.
Drawings
Other features, objects and advantages of the present invention will become more apparent upon reading of the detailed description of non-limiting embodiments, given with reference to the accompanying drawings in which:
FIG. 1 is a schematic comparison of the RSU module before and after modification;
FIG. 2 is a schematic diagram of a feature extraction architecture;
FIG. 3 is a schematic diagram of the overall A-U²-Net network architecture;
FIG. 4 is a schematic diagram of an ECA module;
FIG. 5 is a schematic diagram of an automatic reading flow of a water conservancy pointer instrument;
fig. 6 is a schematic diagram showing the test results.
Detailed Description
The present invention will be described in detail with reference to specific examples. The following examples will assist those skilled in the art in further understanding the present invention, but are not intended to limit the invention in any way. It should be noted that variations and modifications could be made by those skilled in the art without departing from the inventive concept. These are all within the scope of the present invention.
Example 1:
The invention belongs to the computer vision fields of target detection and image segmentation, and in particular relates to a method for automatically identifying pointer instruments that combines a YOLOv7-based target detection algorithm with an improved U²-Net algorithm.
Aiming at the defects of the prior art, the invention enhances the contrast between the sample dial and the background with a filtering algorithm, separates the complex background of the instrument with the YOLOv7 target detection algorithm, and proposes a new network architecture, A-U²-Net, to solve the technical problems in reading water conservancy pointer-type instruments;
1) Enhancing instrument contrast, precisely cropping the target area and weakening background influence: for automatic reading of water conservancy pointer-type instruments, the key question is whether the reading accuracy and algorithm efficiency can meet actual production requirements, yet instrument images contain a large amount of interference such as irrelevant background, data noise and degradation. To improve the stability and accuracy of readings, algorithms such as bilateral filtering, median filtering and fuzzy-set functions are introduced to address blurred imaging, over-exposure and low contrast, and the YOLOv7 target detection algorithm is combined to detect the area containing the instrument dial and pointer, eliminating interference from other objects and narrowing the segmentation range;
2) Improving the U²-Net network structure to strengthen the network's ability to segment pointers and scales: aiming at the difficulty, low precision and missed segmentation in segmenting the instrument panel and pointer, an improved U²-Net network architecture is proposed. To improve the feature extraction capability of the RSU module and reduce its parameter count, a channel attention module and depth separable convolution are combined on a residual structure to form a new feature extraction layer; to prevent the loss of spatial information, a semantic embedded branch structure is added to the skip connections between the outer encoder and decoder and the CARAFE operator is adopted for up-sampling, introducing more semantic information into the low-level features to strengthen inter-stage feature fusion, reducing the spatial information lost through the skip connections, further improving the feature extraction capability of the model, and realizing accurate segmentation of the dial and pointer;
3) Efficient and accurate registration of pointer instrument readings: after the instrument dial, pointer and related information are segmented, a polar coordinate system is established with the center of the sub-image separated by the YOLOv7 target detection algorithm as the origin, the pointer and scale information is mapped into a rectangular image, the centroid of the pointer and the starting position of the dial are located, and the pointer reading is obtained accurately.
The novel automatic recognition method for mountain torrent front-end meter readings provided by the invention, as shown in FIGS. 1-6, comprises the following steps:
Step S1: constructing the backbone network of the A-U²-Net convolutional neural network model;
specifically, in the step S1:
The backbone network of the A-U²-Net convolutional neural network model is constructed by introducing a channel attention ECA module into the internal encoding stage of the RSU module. The ECA module takes a feature map of size H×W×C as input, applies global average pooling to obtain a channel descriptor of size [C,1], and learns through a weight-shared one-dimensional convolution whose kernel size is adaptively determined by a mapping of the channel dimension C, as shown in the following equation:
where k is the convolution kernel size, representing the coverage of local cross-channel interaction; C is the number of channels; and γ and b adjust the ratio between the channel number C and the convolution kernel size. The learned weights are then applied to the input tensor to realize channel weighting, calculated as:
where X is the input tensor, GAP is global average pooling, and σ is the sigmoid activation function.
Step S2: adding the features extracted by depth separable convolution to the output features of the ECA module, and then inputting the summed features into the next feature extraction layer for further feature extraction;
Specifically, in the step S2:
Depth separable convolution is used; a residual structure adds the features extracted by the depth separable convolution to the output features of the ECA module, and the summed features are input into the next feature extraction layer for further feature extraction. The input feature map first undergoes channel-by-channel (depthwise) convolution, performing the convolution operation with one kernel per channel, followed by point-by-point (1×1) convolution;
Step S3: fusing the semantic features output by the encoding stage and inputting them into the symmetric decoding stage for processing; completing the up-sampling operation by constructing an up-sampling kernel prediction module and a content-aware module;
in the step S3:
A semantic embedded branch structure is added at the skip connections; it fuses the different semantic features output by the encoding stage, inputs the fused semantic features into the symmetric decoding stage for processing, and models a global multi-scale context. The high-level features are convolved and up-sampled, then multiplied pixel by pixel with the low-level semantic features;
The CARAFE operator is used to replace the bilinear interpolation in the SEB structure to complete the up-sampling operation, and an up-sampling kernel prediction module and a content-aware module are constructed. The up-sampling kernel prediction module generates an up-sampling kernel from the content of the input features through feature map channel compression, content encoding, and up-sampling kernel prediction and normalization: a 1×1 convolution kernel reduces the number of channels of the input feature map, a convolution layer of size k_encoder × k_encoder performs up-sampling kernel prediction to obtain an up-sampling kernel of shape σH × σW × k_up², and the up-sampling kernel obtained in the previous step is normalized with a Softmax function. The content-aware module takes the dot product of a k_up × k_up feature block in the original image with the predicted up-sampling kernel at that point to obtain the output value, as given by the following formula:
where r = ⌊k_up/2⌋ (k_up being the reassembly kernel size), w_(n,m) is the reassembly kernel, x_(i+n,j+m) is the pixel value at the sampled position, and n and m are the sampling offsets relative to the target position (i, j).
Step S4: preprocessing the pointer instrument data set, annotating the pointer instruments in Pascal VOC format, and inputting the data set into a YOLOv7 model for training;
specifically, in the step S4:
The collected water conservancy pointer instrument data set is preprocessed before training: image enhancement algorithms including bilateral filtering, median filtering and fuzzy-set functions are used to improve the contrast of the instruments and eliminate noise interference in the pointer instrument samples, the bilateral filter being given by formula (4):
where c(ξ, x) represents the geometric proximity between the neighborhood center point x and a neighboring point ξ, dξ is a normalization parameter, and k_d is the image pixel-domain kernel;
The pointer instrument VOC-format data set is annotated with the LabelImg image annotation software, and the annotated data set is divided into a training set, a validation set and a test set; instrument coordinate information is extracted from the annotated training samples and the instruments are cropped out, the cropped samples are annotated with LabelMe to produce JSON data, and mask labels are generated;
The data set is input into the YOLOv7 model for training: the input image passes through the backbone for feature extraction to obtain feature maps at 3 different scales, and the head further integrates these multi-scale features to complete the localization of large, medium and small pointer instrument targets, thereby determining the positions of all pointer instruments in the image and cropping out the sub-image of each dial.
Step S5: inputting the meter images and the generated data into the improved A-U²-Net model for training, saving the YOLOv7 target detection and A-U²-Net image segmentation model weights, and combining them; after the combined algorithm detects the result for an input picture, the result is decoded and written onto the separated picture.
Specifically, in the step S5:
The cropped instrument images and the generated mask data are input into the improved A-U²-Net model for training to complete the segmentation of the pointer and dial regions in the dial image, and morphological operations are combined to suppress blurred pointer and scale regions;
For the segmented pointer and dial information, the relative position relationship between the pointer and the dial is obtained: the pointer and scale information segmented by the A-U²-Net model is expanded in polar coordinates, a polar coordinate system is established with the center of the sample image separated in the detection stage as the origin, the segmented pointer and scale regions are mapped into a rectangular image, and the starting position of the scale and the centroid position of the pointer are calculated to complete the reading of the water conservancy pointer instrument, the polar-to-rectangular coordinate conversion formulas and the reading formula being shown in (5), (6) and (7):
x=r+ρcos(θ) (5)
y=r-ρsin(θ) (6)
Wherein ρ is the polar diameter and θ is the polar angle;
The YOLOv7 target detection and A-U²-Net image segmentation model weights are saved and combined into an end-to-end structure; after the final combined algorithm detects the result for the original input picture, the result is decoded and written onto the separated picture.
Example 2:
example 2 is a preferable example of example 1 to more specifically explain the present invention.
The invention also provides a novel automatic recognition system for mountain torrent front-end meter readings, which can be realized by executing the steps of the above automatic recognition method for mountain torrent front-end meter readings; that is, those skilled in the art can understand the automatic recognition method as a preferred embodiment of the automatic recognition system.
The invention provides a novel automatic recognition system for the readings of instruments and meters at the front end of a mountain torrent, which comprises the following components:
Module M1: constructing the backbone network of the A-U²-Net convolutional neural network model;
specifically, in the module M1:
The backbone network of the A-U²-Net convolutional neural network model is constructed by introducing a channel attention ECA module into the internal encoding stage of the RSU module. The ECA module takes a feature map of size H×W×C as input, applies global average pooling to obtain a channel descriptor of size [C,1], and learns through a weight-shared one-dimensional convolution whose kernel size is adaptively determined by a mapping of the channel dimension C, as shown in the following equation:
where k is the convolution kernel size, representing the coverage of local cross-channel interaction; C is the number of channels; and γ and b adjust the ratio between the channel number C and the convolution kernel size. The learned weights are then applied to the input tensor to realize channel weighting, calculated as:
where X is the input tensor, GAP is global average pooling, and σ is the sigmoid activation function.
Module M2: adding the extracted features and the ECA module output features by using depth separable convolution, and then inputting the added features into a next feature extraction layer for feature extraction;
specifically, in the module M2:
Depth separable convolution is used; a residual structure adds the features extracted by the depth separable convolution to the output features of the ECA module, and the summed features are input into the next feature extraction layer for further feature extraction. The input feature map first undergoes channel-by-channel (depthwise) convolution, performing the convolution operation with one kernel per channel, followed by point-by-point (1×1) convolution;
Module M3: fusing the semantic features output by the encoding stage and inputting them into the symmetric decoding stage for processing; completing the up-sampling operation by constructing an up-sampling kernel prediction module and a content-aware module;
In the module M3:
A semantic embedded branch structure is added at the skip connections; it fuses the different semantic features output by the encoding stage, inputs the fused semantic features into the symmetric decoding stage for processing, and models a global multi-scale context. The high-level features are convolved and up-sampled, then multiplied pixel by pixel with the low-level semantic features;
The CARAFE operator is used to replace the bilinear interpolation in the SEB structure to complete the up-sampling operation, and an up-sampling kernel prediction module and a content-aware module are constructed. The up-sampling kernel prediction module generates an up-sampling kernel from the content of the input features through feature map channel compression, content encoding, and up-sampling kernel prediction and normalization: a 1×1 convolution kernel reduces the number of channels of the input feature map, a convolution layer of size k_encoder × k_encoder performs up-sampling kernel prediction to obtain an up-sampling kernel of shape σH × σW × k_up², and the up-sampling kernel obtained in the previous step is normalized with a Softmax function. The content-aware module takes the dot product of a k_up × k_up feature block in the original image with the predicted up-sampling kernel at that point to obtain the output value, as given by the following formula:
where r = ⌊k_up/2⌋ (k_up being the reassembly kernel size), w_(n,m) is the reassembly kernel, x_(i+n,j+m) is the pixel value at the sampled position, and n and m are the sampling offsets relative to the target position (i, j).
Module M4: preprocessing a pointer instrument data set, marking the pointer instrument with a VOC (volatile organic compound) format data set, and inputting the data set into a YOLOV7 model for training;
specifically, in the module M4:
The collected water conservancy pointer instrument data set is preprocessed before training: image enhancement algorithms including bilateral filtering, median filtering and fuzzy-set functions are used to improve the contrast of the instruments and eliminate noise interference in the pointer instrument samples, the bilateral filter being given by formula (4):
where c(ξ, x) represents the geometric proximity between the neighborhood center point x and a neighboring point ξ, dξ is a normalization parameter, and k_d is the image pixel-domain kernel;
The pointer instrument VOC-format data set is annotated with the LabelImg image annotation software, and the annotated data set is divided into a training set, a validation set and a test set; instrument coordinate information is extracted from the annotated training samples and the instruments are cropped out, the cropped samples are annotated with LabelMe to produce JSON data, and mask labels are generated;
The data set is input into the YOLOv7 model for training: the input image passes through the backbone for feature extraction to obtain feature maps at 3 different scales, and the head further integrates these multi-scale features to complete the localization of large, medium and small pointer instrument targets, thereby determining the positions of all pointer instruments in the image and cropping out the sub-image of each dial.
Module M5: inputting meter images and generated data to improvementsA-U 2 Training in Net model, preserving Yolov7 target detection and A-U 2 -preservation of Net image segmentation model weights and combining them; after the input picture detects the result through the combination algorithm, the result is decoded and written on the separated picture.
Specifically, in the module M5:
The cropped instrument images and the generated mask data are input into the improved A-U²-Net model for training to complete the segmentation of the pointer and dial regions in the dial image, and morphological operations are combined to suppress blurred pointer and scale regions;
For the segmented pointer and dial information, the relative position relationship between the pointer and the dial is obtained: the pointer and scale information segmented by the A-U²-Net model is expanded in polar coordinates, a polar coordinate system is established with the center of the sample image separated in the detection stage as the origin, the segmented pointer and scale regions are mapped into a rectangular image, and the starting position of the scale and the centroid position of the pointer are calculated to complete the reading of the water conservancy pointer instrument, the polar-to-rectangular coordinate conversion formulas and the reading formula being shown in (5), (6) and (7):
x=r+ρcos(θ) (5)
y=r-ρsin(θ) (6)
wherein ρ is the polar radius and θ is the polar angle;
The YOLOv7 target detection and A-U²-Net image segmentation model weights are saved and combined into an end-to-end structure; after the final combined algorithm detects the result for the original input picture, the result is decoded and written onto the separated picture.
Example 3:
example 3 is a preferable example of example 1 to more specifically explain the present invention.
The invention provides a new network architecture, A-U²-Net, that fuses an attention mechanism.
The new network architecture A-U²-Net:
Step 1: construct the backbone network of the A-U²-Net convolutional neural network model; the modules are shown in FIG. 1, where the dashed part is the modified structure. To eliminate the invalid feature interference that remains in the feature channels when the RSU module extracts multi-scale features, a channel attention ECA module is introduced into the internal encoding stage of the RSU module. The module takes a feature map of size H×W×C as input, applies global average pooling to obtain a channel descriptor of size [C,1], and learns through a weight-shared one-dimensional convolution whose kernel size is adaptively determined by a mapping of the channel dimension C, as given by the following formula:
where k is the convolution kernel size, representing the coverage of local cross-channel interaction; C is the number of channels; γ and b adjust the ratio between the channel number C and the convolution kernel size, with default values of 2 and 1. Finally, the learned weights are applied to the input tensor to realize channel weighting. The calculation formula is:
where X is the input tensor, GAP is global average pooling, and σ is the sigmoid activation function.
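For clarity, the following is a minimal PyTorch sketch of the channel-attention block described in this step; it assumes the standard ECA formulation with γ = 2 and b = 1, and the class and argument names are illustrative rather than taken from the patent.

import math
import torch
import torch.nn as nn

class ECA(nn.Module):
    """Efficient Channel Attention: GAP -> weight-shared 1-D conv -> sigmoid -> channel weighting."""
    def __init__(self, channels: int, gamma: int = 2, b: int = 1):
        super().__init__()
        # Kernel size adaptively determined from the channel dimension C (forced to be odd).
        t = int(abs(math.log2(channels) / gamma + b / gamma))
        k = t if t % 2 else t + 1
        self.avg_pool = nn.AdaptiveAvgPool2d(1)
        self.conv = nn.Conv1d(1, 1, kernel_size=k, padding=k // 2, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (N, C, H, W) -> global average pooling gives a (N, C, 1, 1) channel descriptor.
        y = self.avg_pool(x)
        # Treat the channels as a 1-D sequence so the convolution weights are shared across channels.
        y = self.conv(y.squeeze(-1).transpose(-1, -2)).transpose(-1, -2).unsqueeze(-1)
        w = self.sigmoid(y)
        return x * w  # channel weighting applied to the input tensor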
Step 2: use depth separable convolution to replace the conventional convolution and reduce the computational cost of the network's feature extraction structure; a residual structure adds the features extracted by the depth separable convolution to the output features of the ECA module, and the sum is input into the next feature extraction layer, so that the output features at each level can focus on more effective feature information channels and the extraction of effective features at each level is enhanced. Take an input feature map of 7×7×3 and an output of 4 channels as an example. The input feature map first undergoes channel-by-channel (depthwise) convolution with 3 kernels of size 3×3×1; the computation of this step is 3×3×(7-3+1)×(7-3+1)×3 = 675 and the parameter count is 3×3×3 = 27. Point-by-point convolution is then performed with kernels of size 1×1×M, where M is the number of channels output by the previous layer; the computation of this step is 1×1×7×7×3×4 = 588 and the parameter count is 1×1×3×4 = 12. The depth separable convolution therefore requires a total of 1263 operations and 39 parameters, whereas the conventional convolution requires 3×3×(7-3+1)×(7-3+1)×3×4 = 2700 operations and 4×3×3×3 = 108 parameters, both significantly higher than the depth separable convolution. The residual structure combining the depth separable convolution with the attention mechanism is shown in FIG. 2;
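A hedged sketch of such a feature-extraction layer is given below: depthwise followed by pointwise convolution, with the ECA output added through a residual branch. The exact placement of the ECA block and the 1×1 projection used to match channel counts are assumptions; only the depthwise-separable structure and the residual addition follow directly from the text. The ECA class is the one defined in the sketch after step 1.

import torch
import torch.nn as nn

class DSConvECA(nn.Module):
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        # Channel-by-channel (depthwise) convolution: one 3x3 kernel per input channel.
        self.depthwise = nn.Conv2d(in_ch, in_ch, 3, padding=1, groups=in_ch, bias=False)
        # Point-by-point (1x1) convolution mixes channels and sets the output width.
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)
        self.act = nn.ReLU(inplace=True)
        self.eca = ECA(out_ch)  # channel-attention block from the previous sketch
        # 1x1 projection so the residual branch matches the output channel count (an assumption).
        self.proj = nn.Conv2d(in_ch, out_ch, 1, bias=False) if in_ch != out_ch else nn.Identity()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feat = self.act(self.bn(self.pointwise(self.depthwise(x))))
        # Residual addition of the separable-convolution features and the ECA output features.
        return feat + self.eca(self.proj(x))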
Step 3: add a semantic embedded branch (SEB) structure at the skip connections; this structure fuses the different semantic features output by the encoding stage, inputs the fused semantic features into the symmetric decoding stage for processing, models a global multi-scale context, and reduces missed segmentation. The operation mainly consists of convolving and up-sampling the high-level features and multiplying them pixel by pixel with the low-level semantic features.
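A minimal sketch of this semantic embedded branch follows, assuming a 3×3 convolution on the high-level features and bilinear up-sampling (which step 4 then replaces with CARAFE); layer names are illustrative.

import torch
import torch.nn as nn
import torch.nn.functional as F

class SEB(nn.Module):
    def __init__(self, high_ch: int, low_ch: int):
        super().__init__()
        self.conv = nn.Conv2d(high_ch, low_ch, 3, padding=1, bias=False)

    def forward(self, low: torch.Tensor, high: torch.Tensor) -> torch.Tensor:
        # Align the channel count, then up-sample to the spatial size of the low-level features.
        high = self.conv(high)
        high = F.interpolate(high, size=low.shape[-2:], mode="bilinear", align_corners=False)
        return low * high  # pixel-wise multiplication injects high-level semantics into low-level features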
Step 4: use the CARAFE operator to replace the bilinear interpolation in the SEB structure to complete the up-sampling operation, and construct an up-sampling kernel prediction module and a content-aware module. The up-sampling kernel prediction module generates an up-sampling kernel from the content of the input features in 3 steps, namely feature map channel compression, content encoding, and up-sampling kernel prediction and normalization: a 1×1 convolution kernel reduces the number of channels of the input feature map, a convolution layer of size k_encoder × k_encoder performs up-sampling kernel prediction to obtain an up-sampling kernel of shape σH × σW × k_up², and finally the up-sampling kernel obtained in the previous step is normalized with a Softmax function. The content-aware module takes the dot product of a k_up × k_up feature block in the original image with the predicted up-sampling kernel at that point to obtain the output value. The formula is shown below.
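The following sketch shows one way the kernel-prediction and content-aware reassembly steps could be implemented in PyTorch; the compressed channel width, encoder kernel size and up-sampling kernel size are illustrative defaults, not values specified by the patent.

import torch
import torch.nn as nn
import torch.nn.functional as F

class CARAFE(nn.Module):
    def __init__(self, channels: int, scale: int = 2, k_enc: int = 3, k_up: int = 5, c_mid: int = 64):
        super().__init__()
        self.scale, self.k_up = scale, k_up
        # Kernel prediction module: channel compression followed by content encoding.
        self.compress = nn.Conv2d(channels, c_mid, 1)
        self.encode = nn.Conv2d(c_mid, scale * scale * k_up * k_up, k_enc, padding=k_enc // 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        n, c, h, w = x.shape
        s, k = self.scale, self.k_up
        kernels = self.encode(self.compress(x))             # (n, s*s*k*k, h, w)
        kernels = F.pixel_shuffle(kernels, s)                # (n, k*k, s*h, s*w): one kernel per output pixel
        kernels = F.softmax(kernels, dim=1)                  # normalize each predicted up-sampling kernel
        # Content-aware reassembly: gather the k_up x k_up neighborhood of each source location.
        patches = F.unfold(x, k, padding=k // 2)             # (n, c*k*k, h*w)
        patches = patches.view(n, c * k * k, h, w)
        patches = F.interpolate(patches, scale_factor=s, mode="nearest")  # map each output pixel to its source
        patches = patches.view(n, c, k * k, s * h, s * w)
        return (patches * kernels.unsqueeze(1)).sum(dim=2)   # dot product of feature block and kernel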
Finally, the invention combines YOLOv7 + U²-Net to identify water conservancy pointer instruments; the specific steps are as follows:
Automatic identification algorithm for water conservancy pointer-type instrument readings:
Step 1: preprocess the collected water conservancy pointer instrument data set before training. Because instrument images contain a large amount of irrelevant background, the resolution of the main object is low, and over-exposure or insufficient illumination makes the contrast between the instrument and the background too low, image enhancement algorithms such as bilateral filtering, median filtering and fuzzy-set functions are studied to improve the contrast of the instrument and eliminate noise interference in the pointer instrument samples. Bilateral filtering is a nonlinear filtering algorithm; in essence it seeks a compromise between the coordinate adjacency and the value similarity of image pixels, and can weaken the influence of noise while preserving image edge information, as shown in formula (4):
where c(ξ, x) represents the geometric proximity between the neighborhood center point x and a neighboring point ξ, and dξ is a normalization parameter.
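A hedged OpenCV sketch of this preprocessing is shown below; the filter parameters are illustrative, and CLAHE is used here merely as a stand-in for the fuzzy-set contrast enhancement mentioned in the text.

import cv2

def preprocess_meter(img_bgr):
    # Bilateral filtering: smooths noise while keeping dial and pointer edges.
    smoothed = cv2.bilateralFilter(img_bgr, d=9, sigmaColor=75, sigmaSpace=75)
    # Median filtering suppresses remaining impulse noise.
    denoised = cv2.medianBlur(smoothed, 5)
    # Contrast stretch on the luminance channel (illustrative substitute for the fuzzy-set enhancement).
    lab = cv2.cvtColor(denoised, cv2.COLOR_BGR2LAB)
    l, a, b = cv2.split(lab)
    l = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8)).apply(l)
    return cv2.cvtColor(cv2.merge((l, a, b)), cv2.COLOR_LAB2BGR)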
Step 2: annotate the pointer instrument data set in Pascal VOC format with the LabelImg image annotation software and divide the annotated data set into a training set, a validation set and a test set; extract the instrument coordinate information from the annotated training samples and crop out the instruments, then annotate the cropped samples with LabelMe to produce JSON data and generate mask labels.
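The mask-label generation can be sketched as follows, assuming the usual LabelMe JSON layout with polygon shapes; the class names and id values are illustrative.

import json
import numpy as np
import cv2

def labelme_to_mask(json_path, class_ids=None):
    if class_ids is None:
        class_ids = {"pointer": 1, "scale": 2}  # illustrative label-to-id mapping
    with open(json_path, "r", encoding="utf-8") as f:
        ann = json.load(f)
    mask = np.zeros((ann["imageHeight"], ann["imageWidth"]), dtype=np.uint8)
    for shape in ann["shapes"]:
        if shape["label"] in class_ids:
            pts = np.array(shape["points"], dtype=np.int32)
            cv2.fillPoly(mask, [pts], class_ids[shape["label"]])  # rasterize the polygon annotation
    return mask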
Step 3: input the data set into the YOLOv7 model for training. The input image passes through the backbone for feature extraction to obtain feature maps at 3 different scales, and the head further integrates these multi-scale features to complete the localization of large, medium and small pointer instrument targets, thereby determining the positions of all pointer instruments in the image and cropping out the sub-image of each dial.
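The cropping of dial sub-images from the detector output could look like the following sketch, where the dial bounding boxes are assumed to come from a hypothetical wrapper around the trained YOLOv7 weights.

import cv2

def crop_dials(img_bgr, boxes):
    """boxes: iterable of (x1, y1, x2, y2) dial detections in pixel coordinates."""
    h, w = img_bgr.shape[:2]
    crops = []
    for x1, y1, x2, y2 in boxes:
        x1, y1 = max(0, int(x1)), max(0, int(y1))
        x2, y2 = min(w, int(x2)), min(h, int(y2))
        crops.append(img_bgr[y1:y2, x1:x2].copy())  # one sub-image per detected dial
    return crops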
Step 4: input the instrument images cropped in step 2 and the generated mask data into the improved A-U²-Net model for training to complete the segmentation of the pointer and dial regions in the dial image, and combine morphological operations to suppress blurred pointer and scale regions;
Step 5: for the segmented pointer and dial information, obtain the relative position relationship between the pointer and the dial. Expand the pointer and scale information segmented by the A-U²-Net model in polar coordinates: establish a polar coordinate system with the center of the sample image separated in the detection stage as the origin, map the segmented pointer and scale regions into a rectangular image, and calculate the starting position of the scale and the centroid position of the pointer to complete the reading of the water conservancy pointer instrument. The polar-to-rectangular coordinate conversion formulas and the reading formula are shown in (5), (6) and (7):
x=r+ρcos(θ) (5)
y=r-ρsin(θ) (6)
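Formula (7), the reading formula, is not reproduced above; the sketch below illustrates one way the polar unwrapping and reading could be computed with OpenCV, assuming a linear scale and using the angular position of the pointer centroid between the scale start and end angles. All parameter names and values are illustrative.

import cv2
import numpy as np

def read_meter(mask, scale_start_deg, scale_end_deg, v_min, v_max, pointer_id=1):
    h, w = mask.shape[:2]
    centre = (w / 2.0, h / 2.0)                 # origin of the polar coordinate system
    polar_h, polar_w = 360, int(max(h, w) / 2)  # rows span 0-360 degrees, columns span the radius
    # Map the dial into a rectangular image (polar expansion).
    polar = cv2.warpPolar(mask, (polar_w, polar_h), centre, max(h, w) / 2.0, cv2.WARP_POLAR_LINEAR)
    ys, xs = np.nonzero(polar == pointer_id)
    if ys.size == 0:
        return None
    pointer_angle = float(ys.mean())            # centroid row ~ pointer angle in degrees
    frac = (pointer_angle - scale_start_deg) / (scale_end_deg - scale_start_deg)
    return v_min + frac * (v_max - v_min)        # linear interpolation along the scale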
Step 6: save the YOLOv7 target detection and A-U²-Net image segmentation model weights and combine them into an end-to-end structure; after the final combined algorithm detects the result for the original input picture, the result is decoded and written onto the separated picture. The overall algorithm flow is shown in FIG. 5.
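A hedged sketch of this end-to-end combination is given below; detect_dials and segment_dial stand for hypothetical wrappers around the saved YOLOv7 and A-U²-Net weights, read_meter is the reading sketch above, and the drawing parameters are illustrative.

import cv2

def read_all_meters(img_bgr, detect_dials, segment_dial, scale_cfg):
    results = []
    for box in detect_dials(img_bgr):            # dial bounding boxes from the YOLOv7 weights
        x1, y1, x2, y2 = [int(v) for v in box]
        dial = img_bgr[y1:y2, x1:x2].copy()       # separate the dial sub-image
        mask = segment_dial(dial)                 # pointer/scale mask from the A-U2-Net weights
        value = read_meter(mask, **scale_cfg)     # reading via the polar unwrap above
        if value is not None:                     # decode and write the result onto the sub-image
            cv2.putText(dial, f"{value:.2f}", (5, 25),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 0, 255), 2)
        results.append((box, value, dial))
    return results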
Those skilled in the art will appreciate that the invention provides a system and its individual devices, modules, units, etc. that can be implemented entirely by logic programming of method steps, in addition to being implemented as pure computer readable program code, in the form of logic gates, switches, application specific integrated circuits, programmable logic controllers, embedded microcontrollers, etc. Therefore, the system and various devices, modules and units thereof provided by the invention can be regarded as a hardware component, and the devices, modules and units for realizing various functions included in the system can also be regarded as structures in the hardware component; means, modules, and units for implementing the various functions may also be considered as either software modules for implementing the methods or structures within hardware components.
The foregoing describes specific embodiments of the present invention. It is to be understood that the invention is not limited to the particular embodiments described above; those skilled in the art may make various changes or modifications within the scope of the appended claims without affecting the spirit of the invention. The embodiments of the present application and the features in the embodiments may be combined with one another arbitrarily provided there is no conflict.

Claims (10)

1. A novel mountain torrent front-end instrument and meter reading automatic identification method, characterized by comprising the following steps:
step S1: constructing an A-U2-Net convolutional neural network model backbone network;
step S2: adding the extracted features and the ECA module output features by using depth separable convolution, and then inputting the added features into a next feature extraction layer for feature extraction;
step S3: fusing semantic features output by the encoding stage, and inputting the semantic features into a symmetrical decoding stage for processing; finishing up-sampling operation, and constructing an up-sampling kernel prediction module and a content perception module;
step S4: preprocessing the pointer instrument data set, labeling the pointer instrument data set in VOC format, and inputting the data set into a YOLOv7 model for training;
step S5: inputting the instrument images and the generated data into the improved A-U2-Net model for training, saving the YOLOv7 target detection model weights and the A-U2-Net image segmentation model weights and combining them; after the input picture passes through the combined algorithm to obtain the detection result, the result is decoded and written onto the separated picture.
2. The method for automatically identifying the readings of the instrument at the front end of the mountain torrents according to claim 1, wherein in the step S1:
constructing the A-U2-Net convolutional neural network model backbone network: a channel attention ECA module is introduced into the internal encoding stage of the RSU module; the ECA module takes a feature map of size H × W × C as input, applies global average pooling to obtain a feature vector of size [C, 1, 1], and learns through a weight-shared one-dimensional convolution, wherein the convolution kernel size is adaptively determined by a mapping of the channel dimension C, as shown in the following equation:
wherein: k is the convolution kernel size, representing the coverage of local cross-channel interaction; C represents the number of channels; γ and b are used to adjust the proportional relationship between the channel number C and the convolution kernel size; the learned weights are then applied to the input tensor to realize channel weighting, with the calculation formula:
wherein: X is the input tensor; GAP is global average pooling; σ is the sigmoid activation function.
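As a concrete illustration of claim 2, a minimal PyTorch sketch of the ECA channel-attention step is given below; the adaptive kernel-size rule k = |log2(C)/γ + b/γ| rounded to the nearest odd number is taken from the published ECA-Net formulation and is an assumed form of the equation referenced above.

import math
import torch
import torch.nn as nn

class ECA(nn.Module):
    """Efficient Channel Attention: GAP -> shared 1-D conv -> sigmoid -> scale."""

    def __init__(self, channels, gamma=2, b=1):
        super().__init__()
        # Adaptive kernel size from the channel count (assumed ECA-Net rule).
        t = int(abs(math.log2(channels) / gamma + b / gamma))
        k = t if t % 2 else t + 1
        self.conv = nn.Conv1d(1, 1, kernel_size=k, padding=k // 2, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):                         # x: (N, C, H, W)
        y = x.mean(dim=(2, 3))                    # global average pooling -> (N, C)
        y = self.conv(y.unsqueeze(1)).squeeze(1)  # weight-shared 1-D conv over channels
        w = self.sigmoid(y).unsqueeze(-1).unsqueeze(-1)
        return x * w                              # channel weighting of the input tensor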
3. The method for automatically identifying the readings of the instrument at the front end of the mountain torrents according to claim 1, wherein in the step S2:
depthwise separable convolution is used, and a residual structure is utilized to add the features extracted by the depthwise separable convolution to the output features of the ECA module; the sum is then input to the next feature extraction layer for further feature extraction; the input feature map first undergoes channel-by-channel (depthwise) convolution, performing the convolution operation with one kernel per channel, and then point-by-point convolution;
in the step S3:
a semantic embedding branch (SEB) structure is added to the skip connection part; the semantic embedding branch fuses the different semantic features output by the encoding stage, and the fused semantic features are input to the symmetric decoding stage for processing, modeling a global multi-scale context; the high-level features are convolved and up-sampled and then multiplied pixel by pixel with the low-level semantic features;
a CARAFE operator is used in place of bilinear interpolation in the SEB structure to complete the up-sampling operation, and an up-sampling kernel prediction module and a content perception module are constructed; the up-sampling kernel prediction module generates the up-sampling kernel according to the content of the input features, completing feature map channel compression, content encoding, up-sampling kernel prediction and normalization: the number of channels of the input feature map is reduced with a 1 × 1 convolution kernel, and up-sampling kernel prediction is performed with a convolution layer of size k_encoder × k_encoder to obtain the up-sampling kernels; the up-sampling kernels obtained in the previous step are normalized with a Softmax function; the content perception module takes the dot product of a k × k feature block in the original feature map with the up-sampling kernel predicted for that point to obtain the output value, with the formula:
wherein r is k_up² (the recombination kernel size), w_{i′(n,m)} is the recombination kernel, x_{(i+n,j+m)} is the pixel value of the target position, and n and m are the sampling offsets.
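As a concrete illustration of claim 3, minimal PyTorch sketches of a depthwise separable convolution and a CARAFE-style up-sampler (up-sampling kernel prediction module plus content perception/reassembly step) follow; the hyper-parameters (k_up = 5, k_encoder = 3, compressed channel count 64) are assumptions borrowed from the CARAFE paper's defaults rather than values recited in the claim.

import torch
import torch.nn as nn
import torch.nn.functional as F

class DepthwiseSeparableConv(nn.Module):
    """Channel-by-channel (depthwise) conv followed by point-by-point (1x1) conv."""
    def __init__(self, channels, kernel_size=3):
        super().__init__()
        self.depthwise = nn.Conv2d(channels, channels, kernel_size,
                                   padding=kernel_size // 2, groups=channels)
        self.pointwise = nn.Conv2d(channels, channels, 1)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

class CARAFE(nn.Module):
    """Content-aware upsampling: kernel prediction + content-aware reassembly."""
    def __init__(self, channels, scale=2, k_up=5, k_encoder=3, compressed=64):
        super().__init__()
        self.scale, self.k_up = scale, k_up
        # Kernel prediction module: channel compression then content encoding.
        self.compress = nn.Conv2d(channels, compressed, 1)
        self.encode = nn.Conv2d(compressed, scale * scale * k_up * k_up,
                                k_encoder, padding=k_encoder // 2)

    def forward(self, x):                                # x: (N, C, H, W)
        n, c, h, w = x.shape
        kernels = self.encode(self.compress(x))          # (N, s^2*k^2, H, W)
        kernels = F.pixel_shuffle(kernels, self.scale)   # (N, k^2, sH, sW)
        kernels = F.softmax(kernels, dim=1)              # normalised upsampling kernels
        # Content perception / reassembly: dot product of each k_up x k_up
        # neighbourhood of the source feature with its predicted kernel.
        unfolded = F.unfold(x, self.k_up, padding=self.k_up // 2)   # (N, C*k^2, H*W)
        unfolded = unfolded.view(n, c, self.k_up * self.k_up, h, w)
        unfolded = F.interpolate(
            unfolded.view(n, c * self.k_up * self.k_up, h, w),
            scale_factor=self.scale, mode="nearest"
        ).view(n, c, self.k_up * self.k_up, h * self.scale, w * self.scale)
        return (unfolded * kernels.unsqueeze(1)).sum(dim=2)         # (N, C, sH, sW)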
4. The method for automatically identifying the readings of the instrument at the front end of the mountain torrents according to claim 1, wherein in the step S4:
preprocessing the collected water conservancy pointer instrument data set before training, improving the contrast of the instrument by using image enhancement algorithms comprising bilateral filtering, median filtering and fuzzy set functions, and eliminating noise interference in the pointer instrument samples, wherein the bilateral filtering is as shown in formula (4):
wherein c(ξ, x) represents the geometric proximity of the neighborhood center point x to the neighboring point ξ, dξ is the normalization parameter, and k_d is the image pixel domain kernel;
labeling the pointer instrument data set in VOC format using the LabelImg image annotation software, and dividing the labeled data set into a training set, a validation set and a test set; extracting instrument coordinate information from the labeled training samples, cropping out the instruments, annotating the cropped samples with LabelMe software to produce json data, and generating mask labels;
inputting the data set into the YOLOv7 model for training, wherein the input image passes through the backbone for feature extraction, yielding 3 feature maps of different scales; the head further integrates these 3 scales of features to complete the localization of large, medium and small pointer-instrument targets, so that the positions of all pointer instruments in the image are determined and a sub-image of each dial is cropped out and separated.
5. The method for automatically identifying the readings of the instrument at the front end of the mountain torrents according to claim 1, wherein in the step S5:
inputting the cropped instrument images and the generated mask data into the improved A-U2-Net model for training, to complete the segmentation of the pointer and dial regions in the dial image, and combining morphological operations to weaken the blurred regions of the pointer and the scale marks;
obtaining the relative position relationship between the pointer and the dial from the segmented pointer and dial information; performing polar coordinate expansion on the pointer and scale information segmented by the A-U2-Net model, establishing a polar coordinate system with the center of the sample image separated in the detection stage as the origin, mapping the segmented scale and pointer regions into a rectangular image, calculating the starting position of the scale and the centroid position of the pointer, and completing the reading of the water conservancy pointer instrument, wherein the conversion formulas from polar coordinates to the rectangular coordinate system and the reading formula are as shown in (5), (6) and (7):
x=r+ρcos(θ) (5)
y=r-ρsin(θ) (6)
wherein ρ is the polar diameter and θ is the polar angle;
saving the YOLOv7 target detection model weights and the A-U2-Net image segmentation model weights and combining them into an end-to-end structure; after the original input picture passes through the final combined algorithm to obtain the detection result, the result is decoded and written onto the separated picture.
6. A novel mountain torrent front-end instrument and meter reading automatic identification system, characterized by comprising:
module M1: constructing an A-U2-Net convolutional neural network model backbone network;
module M2: adding the extracted features and the ECA module output features by using depth separable convolution, and then inputting the added features into a next feature extraction layer for feature extraction;
module M3: fusing semantic features output by the encoding stage, and inputting the semantic features into a symmetrical decoding stage for processing; finishing up-sampling operation, and constructing an up-sampling kernel prediction module and a content perception module;
module M4: preprocessing the pointer instrument data set, labeling the pointer instrument data set in VOC format, and inputting the data set into a YOLOv7 model for training;
module M5: inputting the instrument images and the generated data into the improved A-U2-Net model for training, saving the YOLOv7 target detection model weights and the A-U2-Net image segmentation model weights and combining them; after the input picture passes through the combined algorithm to obtain the detection result, the result is decoded and written onto the separated picture.
7. The novel mountain torrent front-end instrument and meter reading automatic identification system according to claim 6, characterized in that, in said module M1:
constructing the A-U2-Net convolutional neural network model backbone network: a channel attention ECA module is introduced into the internal encoding stage of the RSU module; the ECA module takes a feature map of size H × W × C as input, applies global average pooling to obtain a feature vector of size [C, 1, 1], and learns through a weight-shared one-dimensional convolution, wherein the convolution kernel size is adaptively determined by a mapping of the channel dimension C, as shown in the following equation:
wherein: k is the convolution kernel size, representing the coverage of local cross-channel interaction; C represents the number of channels; γ and b are used to adjust the proportional relationship between the channel number C and the convolution kernel size; the learned weights are then applied to the input tensor to realize channel weighting, with the calculation formula:
wherein: X is the input tensor; GAP is global average pooling; σ is the sigmoid activation function.
8. The novel mountain torrent front-end instrument and meter reading automatic identification system according to claim 6, characterized in that, in said module M2:
depthwise separable convolution is used, and a residual structure is utilized to add the features extracted by the depthwise separable convolution to the output features of the ECA module; the sum is then input to the next feature extraction layer for further feature extraction; the input feature map first undergoes channel-by-channel (depthwise) convolution, performing the convolution operation with one kernel per channel, and then point-by-point convolution;
in the module M3:
a semantic embedding branch (SEB) structure is added to the skip connection part; the semantic embedding branch fuses the different semantic features output by the encoding stage, and the fused semantic features are input to the symmetric decoding stage for processing, modeling a global multi-scale context; the high-level features are convolved and up-sampled and then multiplied pixel by pixel with the low-level semantic features;
a CARAFE operator is used in place of bilinear interpolation in the SEB structure to complete the up-sampling operation, and an up-sampling kernel prediction module and a content perception module are constructed; the up-sampling kernel prediction module generates the up-sampling kernel according to the content of the input features, completing feature map channel compression, content encoding, up-sampling kernel prediction and normalization: the number of channels of the input feature map is reduced with a 1 × 1 convolution kernel, and up-sampling kernel prediction is performed with a convolution layer of size k_encoder × k_encoder to obtain the up-sampling kernels; the up-sampling kernels obtained in the previous step are normalized with a Softmax function; the content perception module takes the dot product of a k × k feature block in the original feature map with the up-sampling kernel predicted for that point to obtain the output value, with the formula:
wherein r is k_up² (the recombination kernel size), w_{i′(n,m)} is the recombination kernel, x_{(i+n,j+m)} is the pixel value of the target position, and n and m are the sampling offsets.
9. The novel mountain torrent front-end instrument and meter reading automatic identification system according to claim 6, characterized in that, in said module M4:
preprocessing the collected water conservancy pointer instrument data set before training, improving the contrast of the instrument by using image enhancement algorithms comprising bilateral filtering, median filtering and fuzzy set functions, and eliminating noise interference in the pointer instrument samples, wherein the bilateral filtering is as shown in formula (4):
wherein c(ξ, x) represents the geometric proximity of the neighborhood center point x to the neighboring point ξ, dξ is the normalization parameter, and k_d is the image pixel domain kernel;
labeling the pointer instrument data set in VOC format using the LabelImg image annotation software, and dividing the labeled data set into a training set, a validation set and a test set; extracting instrument coordinate information from the labeled training samples, cropping out the instruments, annotating the cropped samples with LabelMe software to produce json data, and generating mask labels;
inputting the data set into the YOLOv7 model for training, wherein the input image passes through the backbone for feature extraction, yielding 3 feature maps of different scales; the head further integrates these 3 scales of features to complete the localization of large, medium and small pointer-instrument targets, so that the positions of all pointer instruments in the image are determined and a sub-image of each dial is cropped out and separated.
10. The novel mountain torrent front-end instrument and meter reading automatic identification system according to claim 6, characterized in that, in said module M5:
inputting the cropped instrument images and the generated mask data into the improved A-U2-Net model for training, to complete the segmentation of the pointer and dial regions in the dial image, and combining morphological operations to weaken the blurred regions of the pointer and the scale marks;
obtaining the relative position relationship between the pointer and the dial from the segmented pointer and dial information; performing polar coordinate expansion on the pointer and scale information segmented by the A-U2-Net model, establishing a polar coordinate system with the center of the sample image separated in the detection stage as the origin, mapping the segmented scale and pointer regions into a rectangular image, calculating the starting position of the scale and the centroid position of the pointer, and completing the reading of the water conservancy pointer instrument, wherein the conversion formulas from polar coordinates to the rectangular coordinate system and the reading formula are as shown in (5), (6) and (7):
x=r+ρcos(θ) (5)
y=r-ρsin(θ) (6)
wherein ρ is the polar diameter and θ is the polar angle;
saving the YOLOv7 target detection model weights and the A-U2-Net image segmentation model weights and combining them into an end-to-end structure; after the original input picture passes through the final combined algorithm to obtain the detection result, the result is decoded and written onto the separated picture.
CN202311791640.4A 2023-12-22 2023-12-22 Novel mountain torrent front-end instrument and meter reading automatic identification method and system Pending CN117727046A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311791640.4A CN117727046A (en) 2023-12-22 2023-12-22 Novel mountain torrent front-end instrument and meter reading automatic identification method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311791640.4A CN117727046A (en) 2023-12-22 2023-12-22 Novel mountain torrent front-end instrument and meter reading automatic identification method and system

Publications (1)

Publication Number Publication Date
CN117727046A true CN117727046A (en) 2024-03-19

Family

ID=90208728

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311791640.4A Pending CN117727046A (en) 2023-12-22 2023-12-22 Novel mountain torrent front-end instrument and meter reading automatic identification method and system

Country Status (1)

Country Link
CN (1) CN117727046A (en)


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117974648A (en) * 2024-03-29 2024-05-03 中国机械总院集团江苏分院有限公司 Fabric flaw detection method
CN117974648B (en) * 2024-03-29 2024-06-04 中国机械总院集团江苏分院有限公司 Fabric flaw detection method

Similar Documents

Publication Publication Date Title
CN111210435B (en) Image semantic segmentation method based on local and global feature enhancement module
CN109446992B (en) Remote sensing image building extraction method and system based on deep learning, storage medium and electronic equipment
CN115601549B (en) River and lake remote sensing image segmentation method based on deformable convolution and self-attention model
CN111340738B (en) Image rain removing method based on multi-scale progressive fusion
Zhou et al. BOMSC-Net: Boundary optimization and multi-scale context awareness based building extraction from high-resolution remote sensing imagery
CN113888550B (en) Remote sensing image road segmentation method combining super-resolution and attention mechanism
CN111460936A (en) Remote sensing image building extraction method, system and electronic equipment based on U-Net network
CN113052106B (en) Airplane take-off and landing runway identification method based on PSPNet network
Hou et al. BSNet: Dynamic hybrid gradient convolution based boundary-sensitive network for remote sensing image segmentation
CN114724155A (en) Scene text detection method, system and equipment based on deep convolutional neural network
CN113838064B (en) Cloud removal method based on branch GAN using multi-temporal remote sensing data
CN112884758B (en) Defect insulator sample generation method and system based on style migration method
CN117727046A (en) Novel mountain torrent front-end instrument and meter reading automatic identification method and system
CN116205962B (en) Monocular depth estimation method and system based on complete context information
CN110992366A (en) Image semantic segmentation method and device and storage medium
CN113887472A (en) Remote sensing image cloud detection method based on cascade color and texture feature attention
CN115410081A (en) Multi-scale aggregated cloud and cloud shadow identification method, system, equipment and storage medium
CN116958827A (en) Deep learning-based abandoned land area extraction method
CN111860465A (en) Remote sensing image extraction method, device, equipment and storage medium based on super pixels
Sun et al. IRDCLNet: Instance segmentation of ship images based on interference reduction and dynamic contour learning in foggy scenes
Zuo et al. A remote sensing image semantic segmentation method by combining deformable convolution with conditional random fields
CN114581789A (en) Hyperspectral image classification method and system
CN113628180A (en) Semantic segmentation network-based remote sensing building detection method and system
CN113610032A (en) Building identification method and device based on remote sensing image
AU2021104479A4 (en) Text recognition method and system based on decoupled attention mechanism

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination