CN114550197A - Terminal strip image detection information matching method - Google Patents


Info

Publication number
CN114550197A
CN114550197A (application CN202210118012.9A)
Authority
CN
China
Prior art keywords
terminal
matching
information
text
terminal block
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210118012.9A
Other languages
Chinese (zh)
Inventor
王会增
师元康
齐肖彬
张岩坡
刘海锋
王乐
尚柳
张韶光
袁冰
王昭雷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Super High Voltage Branch Of State Grid Hebei Electric Power Co ltd
State Grid Corp of China SGCC
Original Assignee
Super High Voltage Branch Of State Grid Hebei Electric Power Co ltd
State Grid Corp of China SGCC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Super High Voltage Branch Of State Grid Hebei Electric Power Co ltd, State Grid Corp of China SGCC filed Critical Super High Voltage Branch Of State Grid Hebei Electric Power Co ltd
Priority to CN202210118012.9A priority Critical patent/CN114550197A/en
Publication of CN114550197A publication Critical patent/CN114550197A/en
Pending legal-status Critical Current


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Character Discrimination (AREA)

Abstract

The invention provides a terminal strip image detection information matching method. The method performs iterative solution with the Kuhn-Munkres optimal-matching algorithm for weighted bipartite graphs: the loss matrix is solved from the spatial relationship between detections, the M × N relationship is judged logically to weight the matrix, and bipartite matching then solves the assignment between terminal numbers and terminal spool text information. Data association is performed on the terminal strip OCR detection and recognition results, which are finally linked with CAD database information to form a tabulated processing result. The matching-pair information is used to build a data table that is checked against the CAD drawing information, realizing practical identification, matching and checking of terminal strips by workers during maintenance operations; this improves equipment-maintenance efficiency, reduces the operator error rate, and supports intelligent decision-making by workers.

Description

Terminal strip image detection information matching method
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a terminal block image detection information matching method.
Background
Artificial intelligence is regarded as the core driving force of the fourth industrial revolution and has an increasingly significant influence on society and the economy. As the technology matures and investment from governments and industry grows, the adoption of artificial intelligence will continue to accelerate; governments worldwide have issued relevant policies and raised it to the level of national strategy. The operation and development mode of the national power grid is also changing: the grid is evolving toward high-voltage large grids, wide-area interconnection, and flexible, adaptive distribution networks, gradually forming an energy internet. Traditional technical methods can no longer meet the rapid development needs of power-grid services or resolve their pain points. Artificial intelligence system design must therefore be developed comprehensively across models, samples, platforms, general components, professional intelligent applications, operating mechanisms and an open ecosystem, accelerating the deployment of AI applications, promoting the digital transformation and upgrading of the company's power grid, and fully enabling power-grid services.
In terms of AI application construction, the State Grid has successfully applied intelligent UAV inspection to replace traditional manual inspection, greatly improving the discovery rate of hidden defects. Intelligent video analysis of field operations has been deployed in the safety-supervision field, realizing automatic identification of typical violations. In infrastructure construction, intelligent violation alarms have been implemented at high-risk, complex-process sites, effectively identifying potential safety hazards at power-operation sites.
With the revival of neural networks, scene text detection and recognition have advanced rapidly; character recognition in natural-scene images has become a hot topic in computer vision, and many deep-learning algorithms for scene text detection and recognition have emerged in recent years. OCR is a general image-understanding technology of significant value for research on information retrieval, power-system information loops, and the conversion of drawings and pictures into data. However, the difficulties in detection, recognition and transcription caused by complicated structures, diverse types, complex and varied natural environments and character distortion remain unsolved. In practical applications failure cases occur frequently for different reasons, mostly because models lack generalization capability on new data: even though an OCR model can handle many variations of font, orientation, angle, curvature and background, it still fails on some uncommon fonts, symbols and backgrounds.
This work addresses the application requirement, arising during intelligent secondary-loop maintenance of power-system equipment, of comparing and rechecking the terminal-strip association information of equipment-room electrical cabinets against standard CAD drawing information, so that on-site secondary-loop identification can intelligently assist personnel in decision-making. Applied research is carried out on on-site terminal strip image and text detection and recognition, construction of a structured CAD-drawing database, and checking and calibration of the terminal-number correspondence. Comparing CAD drawings with image-recognition results during maintenance operations improves equipment-maintenance efficiency, reduces the operator error rate, and realizes practical intelligent decision support for secondary operations.
Because of the shooting angle, or the "oblique" correspondence that exists on site between terminal numbers and terminal spools, the terminal numbers and spool texts are not in one-to-one correspondence, and the bipartite matching iteration matrix is sparse, i.e. the information is asymmetric. In this data-association matching stage, a traditional bipartite-graph data-association method is likely to converge to a local optimum, causing missed or false matches and greatly reducing matching accuracy.
Disclosure of Invention
In view of the above analysis of terminal-strip data, the invention mainly solves the bipartite-graph data-association matching problem under information asymmetry. The final terminal strip OCR detection and recognition results must be further filtered by business logic, and optimal matching and association must be performed on the detection and recognition results of terminal numbers and spool text information. The purpose of the invention is to provide a terminal strip image detection information matching method. The technical scheme of the invention is as follows:
A terminal strip image detection information matching method: iterative solution is performed with the Kuhn-Munkres optimal-matching algorithm for weighted bipartite graphs; the loss matrix is solved from the spatial relationship between detections; the "M × N" relationship is judged logically to weight the matrix; bipartite matching then solves the assignment between terminal numbers and terminal spool text information; data association is performed on the terminal strip OCR detection and recognition results, which are finally linked with the CAD database information to form a tabulated processing result.
Preferably, the concrete procedure is: the terminal strip is detected and recognized with OCR, and a bipartite-graph data-association matching algorithm forms the final OCR image-data matching result, generating the tabular conversion from image to terminal information; on the CAD-drawing side, a drawing database of the corresponding drawing is built according to terminal-strip industry standards, and attribute-bearing information is assigned to each terminal to form the terminal-information association; finally, the OCR detection data are compared with the CAD database results, closing the terminal-strip information checking loop and assisting inspection staff in checking and comparison.
Preferably, the loss matrix is solved from a spatial relationship, where the spatial relationship is the Euclidean distance between the image position coordinates of the terminal-number texts and those of the spool texts; these distances form the loss matrix.
Preferably, M in M × N denotes the number of terminal-number texts and N denotes the number of spool texts; the logical relationship refers to the relative sizes of M and N.
Preferably, the specific procedure for solving the assignment matching comprises the following steps:
1) First, detect and recognize all texts on the terminal strip to obtain the text content and the corresponding image position coordinate information;
2) Then judge from the text content whether each recognized text is a number, and divide the terminal-number and spool detections, with their image position information, into two groups of sequences;
3) Following 2), compare the sizes of the two sequences and solve the Euclidean distances between the terminal-number text image coordinates and the spool text image coordinates to form the loss matrix;
4) Solve the matching on the loss matrix with the Kuhn-Munkres algorithm, and obtain the terminal-number/spool matching pairs from the column indexes corresponding to the optimal row indexes.
Preferably, the numbers in step 2) comprise 0, 1, 2, 3, 4, 5, 6, 7, 8 and 9.
Preferably, the sizes of M and N are compared in step 3); if M ≤ N, the terminal-number texts index the rows of the loss matrix and the spool texts index the columns, and vice versa.
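Under the assumption that the OCR output is available as (text, centre-coordinate) pairs and that SciPy's `linear_sum_assignment` serves as the Kuhn-Munkres solver, steps 1) to 4) can be sketched as follows; this is an illustrative sketch, not the patented implementation, and the input format is an assumption:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_terminals(ocr_results):
    """Match terminal-number texts to spool texts from OCR output.

    ocr_results: list of (text, (x, y)) tuples, text content plus the
    centre coordinate of its bounding box (a hypothetical input format).
    Returns a list of (number_text, spool_text) matching pairs.
    """
    # Step 2: split detections into digit (terminal number) and
    # non-digit (spool) sequences by text content.
    numbers = [(t, np.array(p)) for t, p in ocr_results if t.isdigit()]
    spools = [(t, np.array(p)) for t, p in ocr_results if not t.isdigit()]

    # Step 3: compare sizes M and N; the smaller group indexes the rows
    # so the loss matrix is rectangular with rows <= columns.
    swapped = len(numbers) > len(spools)
    rows, cols = (spools, numbers) if swapped else (numbers, spools)

    # Euclidean distance between every row/column coordinate pair.
    loss = np.array([[np.linalg.norm(rp - cp) for _, cp in cols]
                     for _, rp in rows])

    # Step 4: Kuhn-Munkres optimal assignment on the loss matrix.
    ri, ci = linear_sum_assignment(loss)
    pairs = []
    for r, c in zip(ri, ci):
        num, sp = (cols[c], rows[r]) if swapped else (rows[r], cols[c])
        pairs.append((num[0], sp[0]))
    return pairs
```

For example, `match_terminals([("1", (0, 0)), ("A1", (0, 1)), ("2", (5, 0)), ("B2", (5, 1))])` pairs each number with the spool text directly below it.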
Preferably, a terminal strip image labeling processing step and a terminal strip image detection and recognition step are performed before information matching.
Preferably, the terminal strip image labeling processing step includes: terminal strips at a non-horizontal angle are labeled with four points, and the labeling box hugs the alphanumeric characters with no gap; characters on the same spool are labeled as a whole, and spaces between characters or numbers do not split the label but are themselves labeled; where several consecutive spaces occur, only a single space is labeled.
Preferably, the terminal strip image detection and recognition step includes: two-stage OCR character recognition is adopted; the first stage uses a differentiable-binarization DBNet network structure for text detection, with the difference that BAM modules are added at different stages of the model backbone to strengthen its representation, so that text segmentation boundaries can still be located effectively when text structure information is missing; the second stage performs text recognition with a CNN + RNN + CTC method.
The invention has the beneficial effects that:
the invention provides a detection technical approach for correlation matching of terminal row terminal numbers and terminal row tube texts based on image OCR text detection recognition results. The method comprises the steps of adopting a terminal number of a terminal row and text detection identification information (including text content and corresponding image position coordinates of a text example) of a terminal row wire tube, judging whether the text belongs to numbers to carry out bipartite graph data grouping, forming a terminal number and wire tube data sequence, solving one-to-one corresponding Euclidean distances according to the image position coordinates of the two data sequences, and forming a loss matrix; and (3) carrying out iterative solution on the loss matrixes of the two data sequences by adopting an optimal matching Kuhn-Munkres algorithm of the weighted bipartite graph, and finally obtaining an optimal line number and matching pair of line pipes according to row and column indexes. The matching pair information is utilized to form a data table for being checked and compared with the CAD drawing information, the practical application of identifying, matching and checking tasks of the terminal block of the working personnel in the maintenance operation process is finally realized, the equipment maintenance operation efficiency is improved, the operation work error rate of the working personnel is reduced, and the intelligent decision making effect of the working personnel is realized.
Drawings
FIG. 1 is a diagram of a terminal strip with text labeled correctly for illustration;
FIG. 2 is a diagram of a common labeling form and solution;
FIG. 3 is a diagram of a terminal block textual character OCR detection recognition network architecture;
FIG. 4 is a diagram of the BAM (Bottleneck Attention Module, BMVC 2018);
FIG. 5 is a diagram of an Improved-DB-Net network architecture;
FIG. 6 is a graph showing the effect of detection of DB-Net versus Improved-DB-Net;
FIG. 7 is a diagram of the effect of TIA data enhancement method;
FIG. 8 is a diagram of a CNN + RNN network architecture;
FIG. 9 is a matching graph of asymmetric bipartite graphs of information;
FIG. 10 is a diagram illustrating the effect of bipartite matching of a terminal strip;
FIG. 11 is a chart of tabulated processing results;
FIG. 12 is a technical roadmap of the consistency-result comparison.
Detailed Description
In the description of the present invention, "a plurality" means two or more unless otherwise specified. The terms "inner," "upper," "lower," and the like, refer to an orientation or a state relationship based on that shown in the drawings, which is for convenience in describing and simplifying the description, and do not indicate or imply that the referenced device or element must have a particular orientation, be constructed and operated in a particular orientation, and thus should not be construed as limiting the invention.
In the description of the present invention, it should be noted that, unless otherwise explicitly specified or limited, the terms "mounted" and "connected" are to be interpreted broadly, e.g., as fixedly connected, detachably connected, or integrally connected; as mechanically or electrically connected; and as directly connected or indirectly connected through an intermediary. The specific meanings of the above terms in the present invention are understood by those of ordinary skill in the art according to the specific situation.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged under appropriate circumstances in order to facilitate the description of the embodiments of the invention herein.
The present invention will be described in further detail with reference to the accompanying drawings and specific embodiments.
1 Terminal strip image labeling processing scheme
For labeling the alphanumeric and text content on the terminal strip, and in order to compensate for data deficiencies and prevent missed labels, the labeling method broadly follows the principle of "label everything that can be labeled". Correct labeling is shown in FIG. 1.
The principle is that a terminal strip at a non-horizontal angle (tilted upwards or downwards) is labeled with four points, ensuring the labeling box hugs the alphanumeric characters with no gap. Characters on the same spool are labeled as a whole; spaces between characters (numbers) do not split the label, but the spaces themselves are labeled, and where several consecutive spaces occur, only a single space is labeled.
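A labeling record satisfying these rules might be stored as a four-point polygon per text instance; the field names and values below are purely illustrative, not a format defined by the invention:

```python
# A hypothetical four-point annotation record for one text instance on a
# tilted terminal strip. Points run clockwise from the top-left corner
# and hug the characters with no gap; text on one spool is one record,
# and any run of internal spaces is labelled as a single space.
annotation = {
    "text": "1YQJ 101",  # one spool's text, single internal space kept
    "points": [[412, 88], [503, 96], [501, 124], [410, 116]],
}

def is_axis_aligned(points):
    """True if the quadrilateral is an upright rectangle, i.e. the
    simpler two-point (x0, y0, x1, y1) labelling would have sufficed."""
    xs = {p[0] for p in points}
    ys = {p[1] for p in points}
    return len(xs) == 2 and len(ys) == 2
```

The tilted record above needs all four points, which is exactly the case the four-point labeling rule covers.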
Field data of terminal strips often exhibit various problems. To support terminal-strip detection and recognition, common data-processing methods from the current OCR field are referenced and effectively adjusted for the different situations, as shown in FIG. 2.
2 Terminal strip image detection and recognition method
CNN is a convolutional neural network structure; RNN is a recurrent neural network structure; CTC is a way to avoid manual alignment of inputs and outputs. Together they form a common method in speech recognition and OCR.
CNN + RNN + CTC works as follows:
A CNN convolutional network extracts convolutional features from the input text image; on top of these, an RNN recurrent network extracts character-sequence features, representing the contextual semantic information between text sequences, i.e. the actual content of the text. Using the CTC loss, the series of label distributions produced by the recurrent layer is converted into the final label sequence, i.e. the actual text content. CTC (Connectionist Temporal Classification) avoids manual alignment of inputs and outputs and is well suited to speech recognition and OCR applications.
Backbone: generally refers to a convolutional neural network that aggregates image features at different image granularities.
Neck: generally refers to a series of network layers that mix and combine image features and pass them to the prediction layer.
Head: generally predicts on the image features and produces the output.
Following this OCR technical route, the first stage locates character positions through image segmentation and crops the located character regions; the second stage performs OCR character recognition on the cropped images.
BDA (Base Data Augmentation, data enhancement based on basic image processing) includes, but is not limited to, changes in color, noise, perspective, etc., to generate new images. TIA is a new and effective distorted-text augmentation strategy: reference points are set, the points are randomly perturbed, and the corresponding local image regions are geometrically transformed to form new transformed images.
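As an illustration of BDA-style basic augmentation (a sketch only; the actual pipeline, and TIA in particular, is more involved), a minimal NumPy version of brightness jitter plus additive noise might look like:

```python
import numpy as np

def bda_augment(image, rng):
    """Minimal BDA-style augmentation sketch (not the TIA method):
    random brightness jitter plus additive Gaussian noise, applied to a
    float image normalised to [0, 1]. Factor ranges are illustrative."""
    out = image * rng.uniform(0.7, 1.3)              # illumination change
    out = out + rng.normal(0.0, 0.02, image.shape)   # sensor-like noise
    return np.clip(out, 0.0, 1.0)                    # keep valid range
```

Perspective warps and the reference-point perturbation of TIA would slot into the same pipeline as additional transforms.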
CNN + RNN: the CNN convolutional network processes the text image to obtain sequence features of the image, and these image sequence features are the input of the RNN recurrent network structure, which learns the context features of the character sequence.
Sequence-to-sequence problems: machine translation, speech translation and OCR all map one sequence to another and require alignment between speech and text, but such alignment is sometimes very difficult. If a model is trained directly without alignment, differences in speaking rate or in inter-character distance make it hard to converge. This is the motivation for CTC.
Considering the data form of field terminal strips and the current state of OCR detection and recognition research, most state-of-the-art OCR systems adopt a two-stage route of segmentation-based detection followed by character recognition on the segmentation results: text is segmented bottom-up, and the actual content is then computed from each segmented text instance. Truly end-to-end deep-learning network architectures for OCR remain immature; defining a single network loss objective easily produces contradictions and conflicts during learning (detection tends toward regression while recognition tends toward classification), resulting in difficult training, insufficient generalization, large models, low speed and low accuracy, which is unsuitable for practical industrial application.
This scheme adopts two-stage OCR character recognition. The first stage uses a differentiable-binarization DBNet network structure for text detection, with the difference that BAM (Bottleneck Attention Module) modules are added at different stages of the model backbone to strengthen its representation, so that text segmentation boundaries can still be located effectively when text structure information is missing. The second stage performs text recognition with CNN + RNN + CTC, currently among the best-performing text-recognition architectures.
Overall, the network consists of the two parts shown in FIG. 3 (terminal strip character OCR detection and recognition architecture): a detection end and a recognition end.
(1) Terminal strip character OCR detection end
The detection end is the first stage of the OCR detection and recognition model. First, effective data augmentation is applied to the data collected on site, including character deformation, illumination, noise and similar processing. Second, a strong CNN feature-extraction structure, ResNet, serves as the backbone, while the neck-end SFF (spatial feature fusion) fuses feature distributions from different CNN stages, providing detection capability at different scales and addressing the scale problem of the targets to be detected. Finally, the head end segments the characters from the input image, and the first stage predicts and locates the text position information from the segmentation result.
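The multi-scale fusion idea of the neck can be illustrated with a toy NumPy sketch; the real network fuses learned convolutional features, so the nearest-neighbour upsampling plus summation here is only a stand-in for that fusion:

```python
import numpy as np

def fuse_pyramid(features):
    """Sketch of FPN/SFF-style fusion: nearest-neighbour upsample each
    coarser CNN stage to the finest resolution, then sum, so one map
    carries responses from all scales. `features` is a list of
    (C, H, W) arrays whose H and W halve at each deeper stage."""
    c, h, w = features[0].shape
    fused = np.zeros((c, h, w))
    for f in features:
        scale = h // f.shape[1]            # integer upsampling factor
        fused += f.repeat(scale, axis=1).repeat(scale, axis=2)
    return fused
```

Small targets keep their fine-stage responses while large targets contribute through the coarse stages, which is how multi-scale fusion addresses the scale problem.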
At bottom, the problems of locating incomplete text information and of imprecise text-boundary location reflect a lack of generalization capability in the model. This scheme uses ResNet as the backbone and adds a Bottleneck Attention Module (BAM) at the residual data-fusion stage, as shown in FIG. 4. BAM is a method for enhancing attention, used to strengthen the representation capability of a network model. It is based on the observation that a person looking at an image cannot attend to all of it at once but concentrates on the important objects and a few important points, much like the human eye; this inclines the model toward the boundaries it should attend to. The attention module learns to focus on different content in different channels, and through relationship expansion coefficients it effectively fits the intermediate features in space and position, increasing the expressive power of the information. This working mechanism matches the human visual system's "locate the region" and "locate the position" response to key sensitive stimuli. The BAM work also emphasizes that adding attention modules at the residual ends of different stages is a key point for enhancing the expression of information flow.
Following the BAM idea, this scheme uses ResNet50 as the backbone and adds BAM modules at the four network stages stage1, stage2, stage3 and stage4 to attend to boundary information at different stages; the improved Improved-DB-Net network structure is shown in FIG. 5.
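A heavily simplified numeric sketch of the BAM refinement F' = F + F · σ(channel branch + spatial branch): the channel branch follows the paper's global average pool and bottleneck MLP, but the spatial branch here is reduced to the per-pixel channel mean as a stand-in for the paper's dilated convolutions, so this is illustrative only:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def bam_refine(feature, w1, w2):
    """Simplified BAM-style refinement of a (C, H, W) feature map:
    F' = F + F * sigmoid(channel_logits + spatial_logits).
    The channel branch is GAP followed by a bottleneck MLP (w1, w2);
    the spatial branch is just the per-pixel channel mean, standing in
    for the dilated-convolution branch of the original module."""
    gap = feature.mean(axis=(1, 2))                  # (C,) global pool
    ch_logits = w2 @ np.maximum(w1 @ gap, 0.0)       # bottleneck MLP
    sp_logits = feature.mean(axis=0)                 # (H, W) stand-in
    mask = sigmoid(ch_logits[:, None, None] + sp_logits[None, :, :])
    return feature + feature * mask
```

Because the mask lies in (0, 1) and is added residually, every activation is scaled by a factor between 1 and 2, which is why the module can emphasize boundaries without suppressing the original residual signal.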
The modified network structure is trained on a public dataset with identical data and training parameters, and the detection effect is compared on the same terminal spool image; the comparison result is shown in FIG. 6 and indicates a degree of boundary-learning capability.
(2) Terminal strip character OCR recognition end
The other part, OCR recognition, is the second stage. Its input is the 4-point position output of the first stage: the regressed text regions are cropped from the original image using these positions and fed into the second-stage network.
The warped-text recognition and transcription problem arises because the model structure is not learned effectively and the terminal-strip data contain too few curved samples. Data augmentation is a common way to improve text recognition. Besides text BDA (Base Data Augmentation, data enhancement based on basic image processing), Luo et al. recently proposed a new augmentation method for character recognition, TIA (Learn to Augment: Joint Data Augmentation and Network Optimization for Text Recognition, 2020), as shown in FIG. 7.
In this research, BDA + TIA data augmentation first expands the illumination variation of the natural environment and second amplifies the number of distorted text samples, enhancing representation, improving the model's learning capability, raising OCR transcription accuracy, and addressing the recognition and transcription of distorted text. The second stage otherwise uses the CNN + RNN network structure shown in FIG. 8. CNN + RNN (Convolutional Recurrent Neural Network, CRNN) is a popular image-text recognition model that can recognize long text sequences (the RNN is usually a bidirectional long short-term memory network, BLSTM). It comprises a CNN feature-extraction layer and a BLSTM sequence-feature extraction layer and can be trained jointly end to end. BLSTM and CTC components learn the contextual relationships within character images, effectively improving text-recognition accuracy and making the model more robust. In prediction, the front end extracts features of the text image with a standard CNN, and the recurrent (RNN) layer predicts the label (ground-truth) distribution of the feature sequences obtained from the convolutional layers. The OCR text sequence here is not time-dependent data but a character sequence; what is solved is the mapping relationship between earlier and later elements of the sequence data.
In the second stage, after CNN + RNN, the feature vectors are fused to extract the context features of the character sequence, the probability distribution of each column of features is obtained, and the text sequence is finally predicted through the transcription layer (CTC, Connectionist Temporal Classification). The transcription stage uses CTC decoding for alignment. First, the label distributions obtained from the recurrent layer are converted into the final recognition result through operations such as de-duplication and integration. Second, this solves the sequence-to-sequence alignment problem of characters in an image: as in common scene-text OCR, if the model is trained directly without alignment, convergence is difficult because of varying text position distributions, varying distances between terminal characters, or distortion and deformation of the characters. CTC avoids manual alignment of inputs and outputs, is an effective method well suited to OCR text-alignment applications, and effectively performs aligned transcription of terminal characters.
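The de-duplication and blank-removal step of CTC decoding described above can be sketched directly (greedy best-path decoding; the label indices and the choice of 0 as the blank index are illustrative):

```python
def ctc_greedy_decode(frame_labels, blank=0):
    """CTC greedy decoding: take the best label per time step (already
    argmax-ed here), collapse consecutive repeats, then drop blanks.
    A repeated character survives only if a blank separates the runs."""
    out = []
    prev = None
    for label in frame_labels:
        if label != prev and label != blank:
            out.append(label)
        prev = label
    return out
```

For instance, the frame sequence [0, 1, 1, 0, 1, 2, 2, 0] decodes to [1, 1, 2]: the blank between the two runs of 1 preserves the doubled character.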
3 Data association matching and tabulation processing
The final terminal block OCR detection and recognition result needs further judgment by service logic: optimal matching and association must be performed between the detected terminal numbers and the spool text information. However, the counts of terminal numbers and of spool texts are in general unequal ("many-to-few" or "few-to-many", i.e. M × N or N × M), which poses an optimal-solution problem under information asymmetry, i.e. it maps to bipartite-graph data association matching. The pairing problem between terminal block terminal numbers and spools is thus a bipartite graph problem, a special model in graph theory. Given an undirected graph G = (V, E), if the vertex set V can be partitioned into two mutually disjoint subsets (A, B) such that the two vertices i and j of every edge (i, j) belong to the two different subsets (i in A, j in B), then G is called a bipartite graph. Colloquially, the nodes of the graph are divided into two sets under the condition that every edge joins a point in one set to a point in the other set, i.e. no edge connects two points within the same set.
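The bipartiteness condition above (every edge crosses between the two vertex sets) is equivalent to the graph being 2-colourable. A minimal sketch with hypothetical vertex indices, where one colour stands for terminal-number texts and the other for spool texts:

```python
from collections import deque

def is_bipartite(n, edges):
    """2-colour an undirected graph by BFS; bipartite iff no edge joins same-colour vertices."""
    adj = [[] for _ in range(n)]
    for i, j in edges:
        adj[i].append(j)
        adj[j].append(i)
    color = [None] * n
    for start in range(n):
        if color[start] is not None:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if color[v] is None:
                    color[v] = 1 - color[u]  # opposite set
                    queue.append(v)
                elif color[v] == color[u]:
                    return False  # edge inside one set: not bipartite
    return True

# Terminal numbers {0, 1} vs spool texts {2, 3}: edges only cross the two sets.
print(is_bipartite(4, [(0, 2), (0, 3), (1, 2)]))  # → True
print(is_bipartite(3, [(0, 1), (1, 2), (2, 0)]))  # → False (odd cycle)
```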
In this study, the Kuhn-Munkres algorithm for optimal matching on a weighted bipartite graph is adopted for iterative solution, achieving the matching effect shown in FIG. 9.
For the bipartite-graph data association matching problem with asymmetric information, this study uses the key spatial relationship to compute a loss matrix, logically judges the M × N relationship to weight the matrix, and finally performs bipartite matching to solve the assignment matching between terminal numbers and terminal spool text information. The matching effect is shown in fig. 10.
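A minimal sketch of the loss-matrix construction and assignment step, assuming hypothetical detection coordinates and using SciPy's `linear_sum_assignment` (a Hungarian/Kuhn-Munkres-style solver that also accepts rectangular cost matrices) in place of the study's own implementation:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Hypothetical image-centre coordinates (x, y) of detected texts.
terminal_nums = np.array([[10.0, 20.0], [10.0, 60.0], [10.0, 100.0]])               # M = 3
spool_texts = np.array([[40.0, 22.0], [40.0, 58.0], [40.0, 98.0], [40.0, 300.0]])   # N = 4

# Loss matrix of pairwise Euclidean distances; since M <= N, the terminal
# numbers index the rows and the spool texts index the columns.
loss = np.linalg.norm(terminal_nums[:, None, :] - spool_texts[None, :, :], axis=2)

# Assignment minimising total loss; each row index gets one distinct column index.
rows, cols = linear_sum_assignment(loss)
pairs = list(zip(rows.tolist(), cols.tolist()))
print(pairs)  # → [(0, 0), (1, 1), (2, 2)]; the distant spool text at y=300 stays unmatched
```

With M < N, one spool text is necessarily left unpaired, which mirrors the "few-to-many" case described above.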
Finally, data association is carried out according to the terminal block OCR detection and recognition results and with the CAD database information to form the tabulated processing result, as shown in fig. 11. The invention adopts deep-learning-based core algorithms such as image character recognition, data information association matching, and database association matching to realize on-site terminal block data detection and recognition and association comparison between terminal numbers and terminal block information. Specifically, the terminal block result is recognized and detected using OCR (optical character recognition), the final OCR image data matching result is formed through the bipartite-graph data association matching algorithm, and the tabular conversion from image to terminal information is then generated; the CAD drawing end establishes a drawing database of the corresponding drawing according to the standard specification of the terminal block industry and assigns attribute-bearing information to each terminal to form the terminal information association. Finally, the OCR detection data are compared with the CAD database result, completing the closed loop of the terminal block information checking process and assisting the inspector's checking and comparison work. The consistency comparison of results is shown in fig. 12.
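The final comparison against the CAD database amounts to a keyed join plus a consistency flag per terminal. A minimal sketch with hypothetical terminal and spool labels (not data from the patent):

```python
# Hypothetical OCR matching result: terminal number text -> matched spool text.
ocr_pairs = {"1": "101A", "2": "102A", "3": "103B"}

# Hypothetical CAD drawing database: terminal number -> designed spool label.
cad_db = {"1": "101A", "2": "102A", "3": "103A"}

# Tabulated consistency comparison of OCR detections against the CAD database.
table = []
for terminal, spool in sorted(ocr_pairs.items()):
    expected = cad_db.get(terminal)
    table.append({
        "terminal": terminal,
        "ocr_spool": spool,
        "cad_spool": expected,
        "consistent": spool == expected,
    })

for row in table:
    print(row)  # terminal "3" is flagged inconsistent (103B vs 103A)
```

Rows with `consistent` False are exactly the wiring discrepancies the inspector needs to review, closing the checking loop described above.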
It is to be understood that the invention is not limited in its application to the details of construction and the arrangement of components set forth herein. The invention is capable of other embodiments and of being practiced and carried out in various ways. The foregoing variations and modifications fall within the scope of the present invention. It will be understood that the invention disclosed and defined herein extends to all alternative combinations of two or more of the individual features mentioned or evident from the text and/or drawings. The embodiments described herein explain the best modes known for practicing the invention and will enable others skilled in the art to utilize the invention.

Claims (10)

1. A terminal block image detection information matching method, characterized in that the Kuhn-Munkres algorithm for optimal matching on a weighted bipartite graph is adopted for iterative solution to achieve the matching effect; a spatial relationship is used to compute a loss matrix; the "M × N" relationship is logically judged to weight the matrix; bipartite matching is finally performed to solve the assignment matching between terminal numbers and terminal spool text information; data association is carried out according to the terminal block OCR detection and recognition results and is associated with the CAD database information; and a tabulated processing result is finally formed.
2. The terminal block image detection information matching method according to claim 1,
the method is concretely expressed as follows: the terminal block result is recognized and detected using OCR (optical character recognition), and the final OCR image data matching result is formed through the bipartite-graph data association matching algorithm, thereby generating the tabular conversion from image to terminal information; the CAD drawing end establishes a drawing database of the corresponding drawing according to the standard specification of the terminal block industry and assigns attribute-bearing information to each terminal to form the terminal information association; finally, the OCR detection data are compared with the CAD database result, completing the closed loop of the terminal block information checking process and assisting the inspector's checking and comparison work.
3. The terminal block image detection information matching method according to claim 1, wherein the loss matrix is computed using the spatial relationship, the spatial relationship being the Euclidean distance between the position coordinates of the terminal number text image and the position coordinates of the spool text image on the terminal block, which distances form the loss matrix.
4. The terminal block image detection information matching method according to claim 1, wherein M in the M × N indicates the number of terminal number texts on the terminal block, and N indicates the number of spool texts; the logical relationship refers to the relative sizes of M and N.
5. The terminal block image detection information matching method according to claim 1, wherein the specific solution process of the assignment matching comprises the following steps:
1) first, all texts of the terminal block are detected and recognized to obtain the terminal block text content and the corresponding image position coordinate information;
2) then, whether each recognized text is numeric is judged according to the text content, and the image position information is divided into two groups of sequences: the terminal numbers and the spool texts of the terminal block;
3) according to 2), the sizes of the two groups of sequences are compared, and the Euclidean distances between the position coordinates of the terminal number text images and the position coordinates of the spool text images are computed to form the loss matrix;
4) matching is solved from the loss matrix with the Kuhn-Munkres algorithm, and the matching pairs of terminal numbers and terminal spools are obtained from the column index corresponding to each optimal row index.
6. The terminal block image detection information matching method according to claim 5, wherein the numbers in step 2) include 0, 1, 2, 3, 4, 5, 6, 7, 8, 9.
7. The terminal block image detection information matching method according to claim 5, wherein the relative sizes of M and N are judged in step 3); if M is less than or equal to N, the terminal number text serves as the row index of the loss matrix and the spool text serves as the column index of the loss matrix, and vice versa.
8. The terminal block image detection information matching method according to claim 1, wherein a terminal block image labeling processing step and a terminal block image detection and recognition step are further provided before the information matching.
9. The terminal block image detection information matching method according to claim 8, wherein the terminal block image labeling processing step comprises: a terminal block at a non-horizontal angle is labeled with four points, and the labeling frame fits tightly around the alphanumerics without leaving a gap; the characters on the same spool are labeled as a whole, and if there are spaces between the characters or numbers the label is not split, the spaces being included in the label; in the labeling process, a run of several consecutive spaces is labeled as a single space.
10. The terminal block image detection information matching method according to claim 8, wherein the terminal block image detection and recognition step comprises: two-stage OCR character recognition is adopted; the first stage adopts the differentiable binarization network DBNet as the text detection structure, with the difference that a BAM attention module is added at different stages of the model backbone to enhance the feature representation of the model, so that the text target segmentation boundary can still be effectively located even when text structure information is missing; the second stage realizes text recognition with the CNN + RNN + CTC method.
CN202210118012.9A 2022-02-08 2022-02-08 Terminal strip image detection information matching method Pending CN114550197A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210118012.9A CN114550197A (en) 2022-02-08 2022-02-08 Terminal strip image detection information matching method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210118012.9A CN114550197A (en) 2022-02-08 2022-02-08 Terminal strip image detection information matching method

Publications (1)

Publication Number Publication Date
CN114550197A true CN114550197A (en) 2022-05-27

Family

ID=81673241

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210118012.9A Pending CN114550197A (en) 2022-02-08 2022-02-08 Terminal strip image detection information matching method

Country Status (1)

Country Link
CN (1) CN114550197A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115130696A (en) * 2022-07-08 2022-09-30 华能核能技术研究院有限公司 Nuclear power plant termination error-proofing method and device


Similar Documents

Publication Publication Date Title
CN110334705B (en) Language identification method of scene text image combining global and local information
CN112884064B (en) Target detection and identification method based on neural network
Lei et al. Intelligent fault detection of high voltage line based on the Faster R-CNN
CN110070536B (en) Deep learning-based PCB component detection method
CN112528963A (en) Intelligent arithmetic question reading system based on MixNet-YOLOv3 and convolutional recurrent neural network CRNN
KR20220013298A (en) Method and device for recognizing characters
CN111950453A (en) Optional-shape text recognition method based on selective attention mechanism
CN110852368A (en) Global and local feature embedding and image-text fusion emotion analysis method and system
CN110610166A (en) Text region detection model training method and device, electronic equipment and storage medium
CN112836650B (en) Semantic analysis method and system for quality inspection report scanning image table
CN110569843B (en) Intelligent detection and identification method for mine target
CN111476210B (en) Image-based text recognition method, system, device and storage medium
CN110648310A (en) Weak supervision casting defect identification method based on attention mechanism
CN110598698B (en) Natural scene text detection method and system based on adaptive regional suggestion network
CN110796018A (en) Hand motion recognition method based on depth image and color image
CN114092742B (en) Multi-angle-based small sample image classification device and method
CN114550153A (en) Terminal block image detection and identification method
CN112633088B (en) Power station capacity estimation method based on photovoltaic module identification in aerial image
CN114863091A (en) Target detection training method based on pseudo label
CN105654054A (en) Semi-supervised neighbor propagation learning and multi-visual dictionary model-based intelligent video analysis method
CN114898472A (en) Signature identification method and system based on twin vision Transformer network
CN110321803A (en) A kind of traffic sign recognition method based on SRCNN
CN113496210B (en) Photovoltaic string tracking and fault tracking method based on attention mechanism
CN114550197A (en) Terminal strip image detection information matching method
CN117609536A (en) Language-guided reference expression understanding reasoning network system and reasoning method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination