CN117009516A

CN117009516A - Converter station fault strategy model training method, pushing method and device

Info

Publication number: CN117009516A
Application number: CN202310811621.7A
Authority: CN
Inventors: 乔柱桥; 江��一; 王玉俊; 孙豪; 魏金林; 梁迪团; 杨洋; 周翔; 陈图腾; 王超; 任君; 黄剑湘; 李少森; 黄殿龙; 李祥斌
Original assignee: Kunming Bureau of Extra High Voltage Power Transmission Co
Current assignee: Kunming Bureau of Extra High Voltage Power Transmission Co
Priority date: 2023-07-04
Filing date: 2023-07-04
Publication date: 2023-11-07

Abstract

The application relates to a converter station fault strategy model training method, a pushing method and a device. The training method comprises the following steps: acquiring a fault data set of a converter station, wherein the fault data set comprises fault cases and an operation and maintenance corpus of the converter station; obtaining a plurality of groups of keywords based on the fault cases and the operation and maintenance corpus; determining a plurality of fault data vectors corresponding to the converter station based on the plurality of groups of keywords; labeling each fault data vector to obtain a label type sequence corresponding to each fault data vector; and taking the fault data vectors and the label type sequences as training data, training a text classification model based on the training data, and taking the trained text classification model as a fault strategy model. The method can improve the efficiency with which existing fault cases are used when handling sudden faults.

Description

Converter station fault strategy model training method, pushing method and device

Technical Field

The present application relates to the field of artificial intelligence technology, and in particular, to a converter station fault policy model training method, a pushing method, an apparatus, a computer device, a storage medium, and a computer program product.

Background

With the development of the electric power market and the interconnection of power grids, the scale and number of converter stations continue to grow. Because a converter station contains a large amount of equipment that operates for long periods, fault types and fault handling cases are diverse. Operation and maintenance personnel can diagnose and check the running state of converter station equipment, or the cause of a fault, by referring to existing fault cases, which provide experience and examples for dealing with emergencies.

However, most existing fault cases can currently be found only by searching through a large number of documents, so a sudden problem cannot be solved in time, and fault handling cases and emergency plan texts cannot be put to good use in coping with sudden faults.

Therefore, how to improve the efficiency with which existing fault cases are used when handling sudden problems is a technical problem to be solved.

Disclosure of Invention

In view of the foregoing, it is desirable to provide a converter station fault policy model training method, a pushing method, an apparatus, a computer device, a computer readable storage medium, and a computer program product that can improve the efficiency of utilization of existing fault cases when handling sudden problems.

In a first aspect, the application provides a converter station fault strategy model training method. The method comprises the following steps:

acquiring a fault data set of a converter station, wherein the fault data set comprises fault cases and an operation and maintenance corpus of the converter station; the operation and maintenance corpus comprises fault operation and maintenance information of the converter station;

obtaining a plurality of groups of keywords based on the fault cases and the operation and maintenance corpus, wherein one group of keywords is used for representing one fault; determining a plurality of fault data vectors corresponding to the converter station based on the plurality of groups of keywords;

labeling each fault data vector to obtain a label type sequence corresponding to each fault data vector; wherein the tag type sequence includes at least two of a fault type tag, a fault feature tag, an inspection information tag, and a disposition measure tag; the disposal measure label is a binary label and comprises operation and maintenance role information and disposal information aiming at faults;

taking the fault data vector and the mark type sequence as training data, training a text classification model based on the training data, and taking the trained text classification model as a fault strategy model; the fault policy model is used for identifying structured fault data information from unstructured fault information, wherein the structured fault data information corresponds to information in the mark type sequence.

In one embodiment, labeling each fault data vector to obtain the label type sequence corresponding to each fault data vector includes:

obtaining a label type sequence template, wherein the label type sequence template consists of a plurality of label templates, and each label template corresponds to at least one preset label;

if a sub-vector of the fault data vector matches a target preset label of a label template, labeling the sub-vector of the fault data vector with the target preset label;

and after all sub-vectors of the fault data vector have been labeled against the plurality of label templates, obtaining the label type sequence corresponding to the labeled fault data vector.

In one embodiment, if the sub-vector of the fault data vector matches the target preset label of the label template, labeling the target preset label for the sub-vector of the fault data vector includes:

calculating the spatial distance between the sub-vector of the fault data vector and the label feature vector corresponding to the target preset label;

if the spatial distance is smaller than a preset distance value, determining that the label feature vector matches the sub-vector of the fault data vector, and labeling the sub-vector of the fault data vector with the label type corresponding to the label feature vector.

In one embodiment, training the text classification model based on the training data and taking the trained text classification model as the fault strategy model includes:

inputting the training data into the text classification model, and updating model parameters by back-propagation for a preset number of training iterations;

after the text classification model completes the preset number of training iterations, determining, according to a preset verification set, whether an evaluation index of the trained text classification model meets a preset convergence condition;

if the evaluation index does not meet the preset convergence condition, returning to the step of updating the model parameters by back-propagation for the preset number of training iterations; or,

if the evaluation index meets the preset convergence condition, taking the trained text classification model as the fault strategy model.

In one embodiment, before obtaining the plurality of groups of keywords based on the fault cases and the operation and maintenance corpus, the method further includes:

acquiring text data corresponding to image data in the fault data set by optical character recognition (OCR), and replacing the image data with the corresponding text data to optimize the fault data set.

In a second aspect, the application provides a converter station fault strategy pushing method. The method comprises the following steps:

after receiving the current fault type sent by the user terminal, determining the similarity between the current fault type and the fault type of each fault case in the standard case library to obtain a plurality of similarities;

pushing converter station fault handling measures corresponding to the current fault type to the user terminal according to the similarities;

the standard case library is obtained by carrying out structural identification on a plurality of fault cases of the converter station through a fault strategy model; the fault policy model is trained by the method as described in the first aspect.

In one embodiment, determining the similarity between the current fault type and the fault type of each fault case in the standard case library to obtain a plurality of similarities, and pushing, to the user terminal, a converter station fault handling measure corresponding to the current fault type according to the plurality of similarities, includes:

obtaining a current fault vector corresponding to the current fault type and a standard fault vector corresponding to the fault type of each fault case in a standard case library;

calculating cosine similarity scores between the current fault vector and each standard fault vector to obtain a plurality of cosine similarity scores;

and obtaining a disposal measure corresponding to the fault type of which the cosine similarity score reaches a preset condition from the standard case library, and taking the disposal measure as a converter station fault disposal measure pushed to the user terminal.

In one embodiment, each standard case in the standard case library stores a disposal measure and the operation and maintenance role information corresponding to that disposal measure; after the disposal measures corresponding to the fault types whose cosine similarity scores reach the preset condition are obtained from the standard case library, the method further comprises:

determining, from the disposal measures and according to the operation and maintenance role information corresponding to the user terminal, a converter station fault handling measure that matches that role information, wherein the matched measure is used as the converter station fault handling measure pushed to the user terminal.
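The pushing steps above can be sketched as follows. This is a minimal illustration only: it assumes the "preset condition" is a simple score threshold, and the case library layout, the `push_measures` helper, and all vectors and measure texts are hypothetical rather than taken from the application.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def push_measures(current_vector, case_library, role, min_score):
    """Return (score, measure) pairs for sufficiently similar cases, filtered by role."""
    hits = []
    for case in case_library:
        score = cosine(current_vector, case["vector"])
        if score >= min_score:  # assumed form of the preset condition
            hits.extend(
                (score, m["measure"]) for m in case["measures"] if m["role"] == role
            )
    # Best-scoring measures first.
    return sorted(hits, key=lambda hit: hit[0], reverse=True)

# Hypothetical standard case library with made-up vectors and measures.
LIBRARY = [
    {"vector": [1.0, 0.0, 1.0], "measures": [
        {"role": "shift supervisor", "measure": "Report to dispatch"},
        {"role": "operator", "measure": "Check pump status on site"},
    ]},
    {"vector": [0.0, 1.0, 0.0], "measures": [
        {"role": "operator", "measure": "Inspect oil level"},
    ]},
]

pushed = push_measures([1.0, 0.0, 0.8], LIBRARY, role="operator", min_score=0.8)
```

Only the first case clears the threshold here, and only its operator measure is returned, which mirrors the role-based filtering described in the embodiment.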

In a third aspect, the application further provides a converter station fault strategy model training device. The device comprises:

the fault data acquisition module is used for acquiring a fault data set of the converter station, wherein the fault data set comprises fault cases and an operation and maintenance corpus of the converter station; the operation and maintenance corpus comprises fault operation and maintenance information of the converter station;

The fault vector determining module is used for obtaining a plurality of groups of keywords based on the fault cases and the operation and maintenance corpus, and one group of keywords is used for representing one fault; determining a plurality of fault data vectors corresponding to the converter station based on the plurality of groups of keywords;

the fault vector labeling module is used for labeling each fault data vector to obtain a label type sequence corresponding to each fault data vector; wherein the tag type sequence includes at least two of a fault type tag, a fault feature tag, an inspection information tag, and a disposition measure tag; the disposal measure label is a binary label and comprises operation and maintenance role information and disposal information aiming at faults;

the model training module is used for taking the fault data vector and the mark type sequence as training data, training a text classification model based on the training data, and taking the trained text classification model as a fault strategy model; the fault policy model is used for identifying structured fault data information from unstructured fault information, wherein the structured fault data information corresponds to information in the mark type sequence.

In a fourth aspect, the application further provides a converter station fault strategy pushing device. The device comprises:

the similarity calculation module is used for determining, after receiving the current fault type sent by the user terminal, the similarity between the current fault type and the fault type of each fault case in the standard case library, so as to obtain a plurality of similarities;

the disposal measure pushing module is used for pushing the converter station fault disposal measure corresponding to the current fault type to the user terminal according to the plurality of similarities.

In a fifth aspect, the present application also provides a computer device. The computer device comprises a memory storing a computer program and a processor implementing the steps of the method in the first or second aspect when the processor executes the computer program.

In a sixth aspect, the present application also provides a computer readable storage medium. The computer readable storage medium has stored thereon a computer program which, when executed by a processor, implements the steps of the method described in the first or second aspect.

In a seventh aspect, the present application also provides a computer program product. The computer program product comprises a computer program which, when executed by a processor, implements the steps of the method described in the first or second aspect.

According to the converter station fault strategy model training method, pushing method, device, computer equipment, storage medium and computer program product, a fault data set of the converter station is acquired, wherein the fault data set comprises fault cases and an operation and maintenance corpus of the converter station, and the operation and maintenance corpus comprises fault operation and maintenance information of the converter station; a plurality of groups of keywords are obtained based on the fault cases and the operation and maintenance corpus, wherein one group of keywords is used for representing one fault; a plurality of fault data vectors corresponding to the converter station are determined based on the plurality of groups of keywords; each fault data vector is labeled to obtain a label type sequence corresponding to each fault data vector, wherein the label type sequence includes at least two of a fault type label, a fault feature label, an inspection information label, and a disposal measure label, and the disposal measure label is a binary label comprising operation and maintenance role information and disposal information for the fault; the fault data vectors and the label type sequences are taken as training data, a text classification model is trained based on the training data, and the trained text classification model is taken as the fault strategy model; the fault strategy model is used for identifying structured fault data information from unstructured fault information, and the structured fault data information corresponds to information in the label type sequence.

In other words, the text of the fault cases and the operation and maintenance corpus of the converter station is vectorized and the resulting vectors are labeled, so that each fault data vector carries labels and training data built from the fault cases and the operation and maintenance corpus is obtained. Training the text classification model on this training data yields a model that identifies structured fault data information from unstructured fault information, where the structured fault data information comprises a plurality of pieces of key descriptive information about the fault. Furthermore, a structured standard case library can be extracted from the fault cases by the model, and converter station fault strategies can be pushed to maintenance personnel on the basis of the standard case library, which improves the efficiency with which maintenance personnel use existing fault cases when dealing with sudden problems.

Drawings

Fig. 1 is an application environment diagram of a converter station fault policy model training method or a converter station fault policy pushing method in one embodiment;

fig. 2 is a flow chart of a method of training a converter station fault strategy model in one embodiment;

fig. 3 is a flow chart of a method for pushing a converter station fault strategy in one embodiment;

fig. 4 is a flow chart of a converter station fault policy model training method and a converter station fault policy pushing method in an embodiment;

fig. 5 is a block diagram of a converter station fault policy model training device in one embodiment;

fig. 6 is a block diagram of a converter station fault policy pushing device in one embodiment;

fig. 7 is an internal structural diagram of a computer device in one embodiment.

Detailed Description

The present application will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present application more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.

The converter station fault strategy model training method or the converter station fault strategy pushing method provided by the embodiment of the application can be applied to the application environment shown in Fig. 1. The terminal 102 corresponding to the converter station communicates with the server 104 through a network, and the server 104 receives a fault data set, a standard case library and a converter station disposal measure of the converter station sent by the terminal 102. The data storage system may store data that the server 104 needs to process. The data storage system may be integrated on the server 104 or may be located on a cloud or other network server. The terminal 102 may be, but is not limited to, a personal computer, notebook computer, smart phone, tablet computer, internet of things device, or portable wearable device, where the internet of things device may be a smart speaker, smart television, smart air conditioner, smart vehicle device, or the like. The portable wearable device may be a smart watch, smart bracelet, headset, or the like. The server 104 may be implemented as a stand-alone server or as a server cluster of multiple servers.

In one embodiment, as shown in Fig. 2, a converter station fault strategy model training method is provided. The method is described, by way of illustration, as applied to the server 104 in Fig. 1, and includes the following steps:

S202, acquiring a fault data set of a converter station, wherein the fault data set comprises fault cases and an operation and maintenance corpus of the converter station; the operation and maintenance corpus contains fault operation and maintenance information of the converter station.

The server collects a data set containing fault cases and an operation and maintenance corpus of the converter station. The fault cases are records written after operation and maintenance personnel handle operation and maintenance faults of the converter station. The operation and maintenance corpus contains fault operation and maintenance information applicable to the converter station, for example a fault maintenance knowledge base for general equipment, fault types of general equipment, common fault phenomena, emergency handling tasks and the like.

The data set should include information such as the fault case description, the fault type, the fault phenomenon, the locations that should be inspected, and the emergency actions that operation and maintenance personnel in different roles should take. The data set may be obtained from logs of the converter station, existing fault reports, operation and maintenance documents, or expert knowledge.

It will be appreciated that for text content of the extracted dataset, format unification is required. Text processing can be performed through preprocessing steps such as text cleaning, word segmentation, stop word removal and the like, so that consistency and processibility of text data are ensured.
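The preprocessing steps named above (text cleaning, word segmentation, stop-word removal) can be sketched as follows. This is a minimal illustration that assumes English text and whitespace splitting; the stop-word list and the sample sentence are hypothetical, and Chinese-language fault records would need a dedicated word segmentation tool instead.

```python
import re

# Hypothetical stop-word list; a real deployment would use a domain-specific one.
STOP_WORDS = {"the", "a", "an", "of", "and", "is", "was", "at", "in", "on"}

def preprocess(text: str) -> list[str]:
    """Clean, segment, and filter one fault-case text."""
    # Text cleaning: lowercase, then replace everything except letters,
    # digits, and whitespace with a space.
    cleaned = re.sub(r"[^a-z0-9\s]", " ", text.lower())
    # Word segmentation: whitespace splitting stands in for a real segmenter.
    tokens = cleaned.split()
    # Stop-word removal.
    return [t for t in tokens if t not in STOP_WORDS]

tokens = preprocess("The valve cooling pump tripped at 10:32, and the alarm was raised.")
```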

Optionally, if the pictures and tables contained in the dataset are associated with text data of the fault case, the location information in the file is used to associate the content of the pictures and tables with the corresponding text, which helps to better understand the content of the fault case in subsequent task processing.

S204, obtaining a plurality of groups of keywords based on the fault cases and the operation and maintenance corpus, wherein one group of keywords is used for representing one fault; and determining a plurality of fault data vectors corresponding to the converter station based on the plurality of groups of keywords.

The text data can be segmented with a word segmentation tool to obtain the plurality of groups of keywords; alternatively, the required groups of keywords can be obtained by manual segmentation. The keywords in a single fault case together characterize that fault, so the keywords of one fault case are taken as one group of keywords. Accordingly, a plurality of groups of keywords corresponds to a plurality of faults.

It should be appreciated that the keywords or text data may be vectorized in a variety of ways. For example, word embedding, text vectorization, one-hot encoding, a bag-of-words model, TF-IDF (term frequency-inverse document frequency), and the like may be used to obtain the fault data vectors corresponding to the plurality of groups of keywords.
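As one illustration of these options, the sketch below turns groups of keywords into TF-IDF vectors using only the standard library. It uses the plain, unsmoothed definition (term frequency times the logarithm of the inverse document frequency); the keyword groups are made up, and the application does not state which variant or library is actually used.

```python
import math
from collections import Counter

def tfidf_vectors(keyword_groups: list[list[str]]) -> tuple[list[str], list[list[float]]]:
    """Return (vocabulary, one TF-IDF vector per keyword group)."""
    vocab = sorted({w for group in keyword_groups for w in group})
    n_docs = len(keyword_groups)
    # Document frequency: number of groups that contain each word.
    df = Counter(w for group in keyword_groups for w in set(group))
    vectors = []
    for group in keyword_groups:
        counts = Counter(group)
        total = len(group)
        vectors.append([
            (counts[w] / total) * math.log(n_docs / df[w]) if total else 0.0
            for w in vocab
        ])
    return vocab, vectors

# Hypothetical keyword groups, one group per fault case.
groups = [
    ["valve", "cooling", "pump", "trip"],
    ["converter", "transformer", "oil", "temperature", "high"],
    ["valve", "hall", "fire", "alarm"],
]
vocab, vectors = tfidf_vectors(groups)
```

Each row of `vectors` is one fault data vector over the shared vocabulary; words that occur in several groups (here `valve`) receive a lower weight than words unique to one group.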

Optionally, the preprocessed text data is tokenized with a tokenizer and truncated or padded to a fixed length. The tokenized text is then converted into word embedding vectors, which can be obtained using a pre-trained RoBERTa model.
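The truncate-or-pad step can be sketched as follows. This is a minimal illustration that assumes right-padding and also returns an attention mask; the token IDs and the `pad_id` value are hypothetical and do not correspond to any real RoBERTa vocabulary.

```python
def to_fixed_length(token_ids: list[int], max_len: int, pad_id: int = 0) -> tuple[list[int], list[int]]:
    """Truncate or right-pad token IDs to max_len and build an attention mask."""
    ids = token_ids[:max_len]          # truncation (no-op for short inputs)
    mask = [1] * len(ids)              # 1 marks a real token
    padding = max_len - len(ids)
    return ids + [pad_id] * padding, mask + [0] * padding  # 0 marks padding

padded, mask = to_fixed_length([101, 7, 42], max_len=5)
truncated, _ = to_fixed_length([1, 2, 3, 4, 5, 6, 7], max_len=5)
```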

S206, labeling each fault data vector to obtain a label type sequence corresponding to each fault data vector; wherein the label type sequence includes at least two of a fault type label, a fault feature label, an inspection information label, and a disposal measure label; the disposal measure label is a binary label and comprises operation and maintenance role information and disposal information for the fault.

It should be understood that training a model requires labeled training samples, so a label corresponding to each fault data vector must be obtained; here the label type sequence serves as the annotation of the fault data vector. The labels that make up the label type sequence may be a fault type label, a fault feature label, an inspection information label, and a disposal measure label, so that the fault data vector is annotated with multiple labels. The disposal measure label is a binary label: it comprises two sub-labels, which represent the operation and maintenance role information and the disposal information for the fault, respectively. The disposal measure label can therefore express the different fault handling measures assigned to different operation and maintenance roles, so that different fault handling measures can ultimately be determined for different roles through this label.
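One way to picture the label type sequence is as a small record type, sketched below. The class and field names are hypothetical, the disposal measure label is modeled as a two-part (role, measure) pair following the description above, and the sample values are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass(frozen=True)
class DisposalMeasureLabel:
    """Two-part label: who acts, and what they do."""
    role: str      # operation and maintenance role information
    measure: str   # disposal information for the fault

@dataclass
class LabelTypeSequence:
    fault_type: Optional[str] = None
    fault_feature: Optional[str] = None
    inspection_info: Optional[str] = None
    disposal_measures: List[DisposalMeasureLabel] = field(default_factory=list)

    def label_count(self) -> int:
        """Number of the four label kinds that are present."""
        single = [self.fault_type, self.fault_feature, self.inspection_info]
        return sum(v is not None for v in single) + (1 if self.disposal_measures else 0)

    def is_valid(self) -> bool:
        # The text requires at least two of the four label kinds.
        return self.label_count() >= 2

# Hypothetical annotation for one fault data vector.
seq = LabelTypeSequence(
    fault_type="hardware",
    disposal_measures=[
        DisposalMeasureLabel("shift supervisor", "Report to dispatch"),
        DisposalMeasureLabel("operator", "Check pump status on site"),
    ],
)
```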

Optionally, the fault type labels, fault feature labels, inspection information labels, and disposal measure labels may be encoded as sequence labels in the manner of a sequence labeling task. The RoBERTa model may be used to classify each token in the sequence, determining whether the fault data vector to be annotated belongs to a fault type, a fault feature, inspection information, or a disposal measure. The disposal measures of operation and maintenance personnel in different roles are also labeled, with the disposal measure of each role encoded as a binary label in a multi-label classification manner. In this case, the RoBERTa model may classify the disposal measures corresponding to each role. In this way, the label encodings of the different tasks can be integrated into the same RoBERTa model, with which fault data vectors can be identified and labeled with label type sequences.

S208, taking the fault data vector and the mark type sequence as training data, training a text classification model based on the training data, and taking the trained text classification model as a fault strategy model; the fault policy model is used for identifying structured fault data information from unstructured fault information, wherein the structured fault data information corresponds to information in the mark type sequence.

It should be understood that each fault data vector corresponds to a label type sequence, so the two can be used together as training data for supervised training. After training on this data, the fault strategy model can obtain structured fault data information from unstructured fault case data, and the structured fault data information includes the types in the label type sequence.

For example, when a piece of unstructured fault case data is input, structured data as shown in the following table may be obtained through identification by the fault strategy model:

In this case, the fault strategy model converts the unstructured fault case text into text classified under the fault type, fault feature, inspection information, and disposal measure categories in the table, so that the structured information corresponding to the fault case is obtained directly, together with which disposal measure each operation and maintenance role in the fault case should perform.

Optionally, the fault strategy model may be a RoBERTa model, which can simultaneously learn the associations among, and feature representations of, fault types, fault phenomena, inspection information, and the disposal measures of different roles, enabling comprehensive text understanding and information extraction. The fault strategy model may also be another model such as BERT (Bidirectional Encoder Representations from Transformers), a convolutional neural network, a recurrent neural network, or a self-attention model.

Taking the RoBERTa model as an example, the following are the main layers of the RoBERTa model and their corresponding functions:

Input embedding layer (Input Embedding Layer): the token sequence of the input text is converted into an embedding vector representation.

This layer uses token embeddings (Token Embeddings), position embeddings (Position Embeddings), and segment embeddings (Segment Embeddings) to represent the input.

Transformer encoder layers (Transformer Encoder Layers): the RoBERTa model is formed by stacking a plurality of Transformer encoder layers of the same structure.

Each encoder layer contains a self-attention mechanism (Self-Attention Mechanism) and a feed-forward neural network (Feed-forward Neural Network).

These encoder layers help to capture the semantic relationships and contextual information of the input text.

Residual connections (Residual Connections): inside each Transformer encoder layer, a residual connection adds the input of the layer to its output, which helps information flow between layers and avoids the vanishing gradient problem.

Normalization layer (Layer Normalization): after each Transformer encoder layer, layer normalization is applied to normalize the input features, improving the robustness and training speed of the model.

Masking layer (Masking): the self-attention mechanism in the RoBERTa model employs a masking mechanism that prevents information leakage by masking (Mask) the input to ensure that each token sees only the preceding tokens.

Pooling Layer (Pooling Layer): the output of the RoBERTa model may be summarized by a pooling layer, such as average pooling or maximum pooling, to obtain a global representation or a fixed length text representation.
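The two pooling options mentioned for obtaining a fixed-length text representation can be sketched as follows. This is a minimal illustration on plain Python lists; the token vectors are made up, and the use of a mask to skip padding positions is an assumption rather than something stated in the application.

```python
def mean_pool(token_vectors: list[list[float]], mask: list[int]) -> list[float]:
    """Average the vectors of all non-padding tokens, dimension by dimension."""
    kept = [v for v, m in zip(token_vectors, mask) if m]
    return [sum(column) / len(kept) for column in zip(*kept)]

def max_pool(token_vectors: list[list[float]], mask: list[int]) -> list[float]:
    """Take the per-dimension maximum over all non-padding tokens."""
    kept = [v for v, m in zip(token_vectors, mask) if m]
    return [max(column) for column in zip(*kept)]

# Three token vectors of dimension 2; the last position is padding.
token_vectors = [[1.0, 4.0], [3.0, 2.0], [9.0, 9.0]]
mask = [1, 1, 0]
mean_repr = mean_pool(token_vectors, mask)
max_repr = max_pool(token_vectors, mask)
```

Whatever the input length, both functions return a vector with one value per embedding dimension.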

On top of the RoBERTa model, an additional fully connected layer is added for feature extraction and classification. A softmax activation function converts the model output into a class probability distribution, and training uses a conditional random field (CRF) loss function and an SGD optimizer.
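The softmax step can be sketched as follows. This is a minimal illustration: the logits stand in for the output of the fully connected layer and are made up, and the label names are hypothetical identifiers for the four label kinds.

```python
import math

def softmax(logits: list[float]) -> list[float]:
    """Convert raw scores into a probability distribution."""
    # Subtracting the maximum improves numerical stability without changing the result.
    highest = max(logits)
    exps = [math.exp(x - highest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

LABELS = ["fault_type", "fault_feature", "inspection_info", "disposal_measure"]
probs = softmax([2.0, 0.5, 0.1, -1.0])
predicted = LABELS[probs.index(max(probs))]
```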

According to the above converter station fault strategy model training method, the text of the fault cases and the operation and maintenance corpus of the converter station is vectorized and the resulting vectors are labeled, so that each fault data vector carries labels and training data built from the fault cases and the operation and maintenance corpus is obtained. Training the text classification model on this training data yields a model that identifies structured fault data information from unstructured fault information. The model can be used to extract a structured standard case library from the fault cases, and converter station fault strategies can be pushed to maintenance personnel on the basis of the standard case library, which improves the efficiency with which maintenance personnel use existing fault cases when dealing with sudden problems.

It will be appreciated that labeling requires determining which fault data vector is to be labeled and which label type sequence it should be labeled with. A label type sequence template consisting of a plurality of label templates may therefore be used. For example, if the label templates are a fault type label template and a disposal measure label template, the sub-vectors of the fault data vector are traversed to judge whether each sub-vector matches the fault type label template; if it does not match, no label is applied, and if it matches, the sub-vector is labeled with the fault type label. The disposal measure label is obtained in the same way.

Typically, each label template corresponds to a plurality of preset labels. For example, the fault type template may include fault types such as a hardware type, a software type, and a physical type. The template corresponding to the disposal measure label may have a plurality of preset sub-labels, for example a sub-label for operation and maintenance personnel information and a sub-label for the specific disposal measure; the preset sub-labels for operation and maintenance personnel information may include station leader, station chief, shift supervisor, operator, and operation and maintenance personnel.

After all the sub-vectors pass through the traversal and labeling of the various label templates, the fault data vector is labeled correspondingly.

In this embodiment, through the above steps, the fault data vector can be labeled, and the labeled label type sequence corresponding to the fault data vector can be obtained.
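The traversal described in this embodiment can be sketched as follows. This is a minimal illustration in which each label template is represented by a hypothetical matcher function that returns the matching preset label or `None`; the toy matchers and sub-vectors are invented and say nothing about how matching is actually decided.

```python
from typing import Callable, Dict, List, Optional, Sequence

Vector = Sequence[float]
# A matcher returns the matching preset label, or None when nothing matches.
Matcher = Callable[[Vector], Optional[str]]

def annotate(sub_vectors: List[Vector], templates: Dict[str, Matcher]) -> List[Dict[str, str]]:
    """For every sub-vector, try every label template and record the matches."""
    sequence = []
    for sub in sub_vectors:
        labels = {}
        for template_name, match in templates.items():
            preset = match(sub)
            if preset is not None:  # no match: nothing is labeled for this template
                labels[template_name] = preset
        sequence.append(labels)
    return sequence

# Toy matchers standing in for real template matching.
templates = {
    "fault_type": lambda v: "hardware" if v[0] > 0.5 else None,
    "disposal_measure": lambda v: "shift supervisor" if v[1] > 0.5 else None,
}
label_sequence = annotate([[0.9, 0.1], [0.2, 0.8], [0.1, 0.1]], templates)
```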

It should be appreciated that whether a sub-vector matches a preset label may be determined from the relationship between the label feature vector corresponding to the label template and the sub-vector of the fault data vector. The spatial distance between the two vectors can be used to measure how well they match. A preset distance value is set as a threshold; when the spatial distance between the sub-vector and the label feature vector is smaller than the threshold, the two vectors are close and their similarity is high, so they are considered matched and the sub-vector is labeled.

In this embodiment, through the above steps, whether the two are similar can be determined based on the spatial distance, so as to obtain the matching degree between the sub-vector and the preset label.
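A distance-threshold match of this kind can be sketched as follows. This is a minimal illustration that assumes Euclidean distance and, when several preset labels are candidates, picks the nearest one before applying the threshold; the application specifies neither choice, and the feature vectors and threshold are made up.

```python
import math
from typing import Dict, List, Optional

def match_label(sub_vector: List[float],
                label_features: Dict[str, List[float]],
                max_distance: float) -> Optional[str]:
    """Return the nearest preset label if it lies within max_distance, else None."""
    best_label, best_distance = None, math.inf
    for label, feature in label_features.items():
        distance = math.dist(sub_vector, feature)  # Euclidean distance (assumed)
        if distance < best_distance:
            best_label, best_distance = label, distance
    # Match only when the spatial distance is smaller than the preset value.
    return best_label if best_distance < max_distance else None

# Hypothetical label feature vectors for two preset fault type labels.
features = {"hardware": [1.0, 0.0], "software": [0.0, 1.0]}
near = match_label([0.9, 0.1], features, max_distance=0.5)
far = match_label([0.5, 0.5], features, max_distance=0.5)
```

The first sub-vector lies about 0.14 from the `hardware` feature vector and is labeled; the second is about 0.71 from both and stays unlabeled.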

It should be understood that the training data is input into the model and the model is trained for a preset number of training iterations; during training, the parameters inside the model are continuously updated until the model finally converges.

Optionally, the model is trained using the prepared training data. The vectorized fault case description is taken as input, the corresponding label is taken as output, and the model parameters are updated through back propagation. The trained model is evaluated using a held-out validation set or cross-validation. Metrics such as accuracy, recall, and F1 score are calculated to measure the performance of the model, and the model is adjusted and improved according to the evaluation result.
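The evaluation metrics named above can be computed as in the following sketch. It treats one label as the positive class; the function name `classification_metrics` and the toy label lists are hypothetical and are not data from the application.

```python
from typing import Dict, Sequence


def classification_metrics(
    y_true: Sequence[str],
    y_pred: Sequence[str],
    positive_label: str,
) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for one label treated as positive."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive_label and p == positive_label)
    fp = sum(1 for t, p in pairs if t != positive_label and p == positive_label)
    fn = sum(1 for t, p in pairs if t == positive_label and p != positive_label)
    correct = sum(1 for t, p in pairs if t == p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if precision + recall else 0.0
    return {
        "accuracy": correct / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy validation labels (illustrative only).
truth = ["fault_type", "fault_type", "other", "other"]
predicted = ["fault_type", "other", "other", "other"]
print(classification_metrics(truth, predicted, positive_label="fault_type"))
```

For a multi-label sequence such as the one described here, the same computation would typically be repeated per label and then averaged; that averaging step is left out of the sketch.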

In addition, knowledge about operation and maintenance is added to training of the model as a corpus to help improve accuracy and robustness of the model in extracting places to be checked from the fault case text. The operation and maintenance related knowledge may include various fault types, common fault phenomena, emergency handling tasks, and the like. The knowledge is taken as additional training data, so that the model can better understand the specific terms and the context of the operation and maintenance field when learning the fault case text. The addition of the operation and maintenance related knowledge can provide richer background information for the model, and help the model to better understand the fault case text, so that the extraction accuracy of the corresponding checked place is improved.

The operation and maintenance related knowledge is consolidated into a text form and combined with the fault case data set. In the training process, the model simultaneously learns semantic representation of fault case text and related information of operation and maintenance knowledge, so that the recognition capability of the model at a place to be checked is improved.

In this embodiment, through the above steps, a structured text data extraction model for a fault case of the converter can be trained, so that key information in an unstructured fault case can be rapidly obtained.

In one embodiment, before obtaining the plurality of groups of keywords based on the fault cases and the operation and maintenance corpus, the method further includes:

It should be appreciated that, since the fault cases are mostly Word and PDF files, text extraction is required first. The text in the files can be extracted using tools such as pdfminer, textract, and python-docx. For documents containing pictures, OCR (Optical Character Recognition) technology may be used to extract text from the pictures; an OCR tool converts the text in a picture into editable text form. For files containing tables, the table data may be extracted and converted to a structured form using tools such as tabula-py, python-docx, and pandas. This makes subsequent data processing and analysis more convenient.
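A minimal sketch of the text extraction step is shown below, assuming the pdfminer.six and python-docx packages are installed. The dispatcher `extract_case_text` and the two helper names are hypothetical; OCR and table extraction are omitted.

```python
from pathlib import Path


def extract_pdf_text(path: str) -> str:
    """Extract plain text from a PDF file using pdfminer.six."""
    from pdfminer.high_level import extract_text  # imported lazily

    return extract_text(path)


def extract_docx_text(path: str) -> str:
    """Extract paragraph text from a .docx file using python-docx."""
    from docx import Document  # imported lazily

    return "\n".join(paragraph.text for paragraph in Document(path).paragraphs)


def extract_case_text(path: str) -> str:
    """Choose an extractor based on the file extension of the fault case."""
    suffix = Path(path).suffix.lower()
    if suffix == ".pdf":
        return extract_pdf_text(path)
    if suffix == ".docx":
        return extract_docx_text(path)
    raise ValueError(f"unsupported fault case file type: {suffix!r}")
```

Note that python-docx reads the .docx format only; legacy .doc files would need to be converted first or handled by another tool.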

In one embodiment, as shown in fig. 3, a method for pushing a fault policy of a converter station is provided, and the method is applied to the server 104 in fig. 1 for illustration, and includes the following steps:

S302, after receiving the current fault type sent by the user terminal, determining the similarity between the current fault type and the fault type of each fault case in the standard case library, and obtaining a plurality of similarities.

It should be appreciated that the scenario of converter station fault policy pushing is as follows: when facing the fault of the converter station, the user can input the current fault type and send the current fault type to the server 104 through the terminal, so as to obtain a specific fault strategy returned by the server 104.

The standard case library is obtained by carrying out structural identification on a plurality of fault cases of the converter station according to the trained fault strategy model, so that each fault case is structured data in the standard case library and comprises the fault type and the treatment measure of each fault case.

Since the current fault type transmitted by the user terminal is text data and the fault type of each fault case in the standard case library is also text data, the text similarity can be calculated.

S303, pushing the converter station fault handling measures corresponding to the current fault type to the user terminal according to the similarities.

The standard case library is obtained by carrying out structural identification on a plurality of fault cases of the converter station through a fault strategy model; the fault strategy model is obtained through training the method in the converter station fault strategy model training method.

It should be understood that the higher the similarity, the better the current fault type matches a fault type in the standard case library, and therefore the more similar the fault strategies for the two are likely to be. On this basis, the corresponding treatment measures in the standard case library are acquired and sent to the user terminal.

Optionally, each time a new fault case is input, the fault type, fault characteristics, inspection points and disposal measures in the fault case text are automatically extracted through model identification. Taking the identification result of a fault case of the action of a certain converter transformer pressure relief valve as an example:

Optionally, as an embodiment, the system is used together with the fault recognition system in the current converter station. When a fault occurs, a plurality of similar fault cases can be automatically selected according to the fault type, and an emergency treatment strategy corresponding to the role of the operation and maintenance personnel is provided, so that operation and maintenance efficiency is improved.

Optionally, as an embodiment, each new fault case is added into the database as a new fault type, so as to improve the coverage rate of the fault case, and the new fault case is added into the data set as a new training sample, so as to improve the accuracy rate of named entity identification.

If the current fault case coincides with a fault case in the database, the case as handled after the current fault has been dealt with is used to update and optimize the similar existing fault case. If the current fault case does not coincide with any fault case in the database, the currently identified fault case is added as a new case.
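The update rule above can be sketched as follows. The description does not define when two cases "coincide"; this sketch assumes, purely for illustration, that they coincide when their fault type strings are identical. The dictionary layout and the name `upsert_case` are hypothetical.

```python
from typing import Dict

Case = Dict[str, str]


def upsert_case(library: Dict[str, Case], case: Case) -> str:
    """Update an existing case with the same fault type, or add a new one."""
    key = case["fault_type"]  # assumption: identical fault type means "coincides"
    if key in library:
        library[key].update(case)  # refine the stored case with the new handling
        return "updated"
    library[key] = dict(case)  # supplement the library with a new case
    return "added"


library: Dict[str, Case] = {}
print(upsert_case(library, {"fault_type": "type A", "disposal": "measure 1"}))
print(upsert_case(library, {"fault_type": "type A", "disposal": "measure 2"}))
print(len(library))
```

In practice the coincidence test could instead reuse the similarity score with a high threshold; that variant is not shown here.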

Optionally, cosine similarity is used to measure the angle between two vectors, reflecting their degree of similarity. For the output vectors of two texts, the cosine similarity between them can be calculated to evaluate how similar the texts are. The specific steps are as follows:

(1) Vectorizing the input fault type;

(2) Inputting the vectorized text into a database, and obtaining an output vector representation of the existing fault type in the database;

(3) Calculating cosine similarity scores between vectors by using a cosine similarity formula;

$$\cos(A, B) = \frac{A \cdot B}{\|A\| \, \|B\|}$$

wherein A and B represent the output vectors of the two texts, A · B represents the dot product of the vectors, and ‖A‖ and ‖B‖ represent the norms (lengths) of the vectors;

(4) Judging the similarity of the texts according to the cosine similarity score. A score near 1 indicates that the two texts are very similar, and a score near -1 indicates that the two texts are very dissimilar.

In this embodiment, the cosine similarity between the current fault vector and the plurality of standard fault vectors can be rapidly calculated through the cosine similarity and the text vector.
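The cosine similarity calculation and the ranking of standard fault vectors can be sketched as below. The vectors are toy values, and the names `cosine_similarity`, `rank_cases`, and the case identifiers are hypothetical; how the texts are vectorized is outside the scope of this sketch.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product of a and b divided by the product of their norms."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_cases(
    current_vector: Sequence[float],
    standard_vectors: Dict[str, Sequence[float]],
) -> List[Tuple[str, float]]:
    """Score every standard fault vector and sort from most to least similar."""
    scores = [
        (case_id, cosine_similarity(current_vector, vector))
        for case_id, vector in standard_vectors.items()
    ]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Toy vectors (illustrative only).
standard = {"case-1": [1.0, 0.0], "case-2": [0.0, 1.0], "case-3": [-1.0, 0.0]}
for case_id, score in rank_cases([2.0, 0.0], standard):
    print(case_id, round(score, 3))
```

Because the score depends only on the angle between the vectors, the query `[2.0, 0.0]` scores the same against `case-1` as `[1.0, 0.0]` would.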

In one embodiment, each standard case in the standard case library correspondingly stores a treatment measure and the operation and maintenance role information corresponding to the treatment measure; after the treatment measures corresponding to the fault types whose cosine similarity scores reach the preset condition are obtained from the standard case library, the method further comprises:

It should be understood that, when the user terminal sends the current fault type, the operation and maintenance role information corresponding to the user terminal is sent together with it. This role information indicates who is handling the current fault. Therefore, to push the fault handling measures more accurately, the converter station fault handling measures corresponding to the terminal's operation and maintenance role information are further selected from the handling measures of the matched standard case.

In this embodiment, through the operation and maintenance role information of the terminal, the fault handling measures of the converter station corresponding to the operation and maintenance role can be matched, so as to achieve the effect of pushing the corresponding fault handling measures for the operation and maintenance role.
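Role-based selection of measures can be sketched as follows. The record layout, the role strings, and the placeholder actions are hypothetical; the application states only that each treatment measure is stored together with its operation and maintenance role information.

```python
from typing import Dict, List

# Hypothetical shape of one standard case: every measure carries the role it is for.
example_case: Dict[str, object] = {
    "fault_type": "converter transformer pressure relief valve action",
    "measures": [
        {"role": "shift supervisor", "action": "placeholder action A"},
        {"role": "operator", "action": "placeholder action B"},
        {"role": "operator", "action": "placeholder action C"},
    ],
}


def measures_for_role(case: Dict[str, object], role: str) -> List[str]:
    """Keep only the measures whose stored role equals the terminal's role."""
    measures = case["measures"]
    return [m["action"] for m in measures if m["role"] == role]


print(measures_for_role(example_case, "operator"))
print(measures_for_role(example_case, "station chief"))
```

A role with no stored measures yields an empty list; whether to fall back to the full set of measures in that situation is a design choice the description leaves open.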

Optionally, as an embodiment, as shown in fig. 4, the method includes:

S410, collecting fault cases of converter station equipment, and preprocessing the fault cases to form a data set.

S420, improving the RoBERTa model, and training the model by using fault cases and operation and maintenance related knowledge texts.

S430, extracting fault event characteristics (including fault types, fault phenomena, inspection information and role-specific treatment measures) from the fault cases by using the trained model.

S440, establishing a standard structured fault case library of the converter station.

S450, inputting the type of the fault, and outputting the fault case with the highest similarity.

After S430, if a new fault case is generated, the process goes to S431, where the new fault case is input; the intelligently identified structured fault case text is then output and added to the standard structured fault case library.

It should be understood that, although the steps in the flowcharts related to the embodiments described above are shown sequentially as indicated by the arrows, these steps are not necessarily performed in the order indicated by the arrows. Unless explicitly stated herein, the order of execution of the steps is not strictly limited, and the steps may be executed in other orders. Moreover, at least some of the steps in the flowcharts described in the above embodiments may include a plurality of steps or a plurality of stages, which are not necessarily performed at the same time, but may be performed at different times; these steps or stages are also not necessarily performed sequentially, but may be performed in turn or alternately with at least some of the other steps or stages.

Based on the same inventive concept, the embodiment of the application also provides a converter station fault strategy model training device for realizing the above-mentioned converter station fault strategy model training method. The implementation scheme of the solution provided by the device is similar to the implementation scheme recorded in the method, so the specific limitation in the embodiments of the device for training the fault policy model of the converter station provided below can be referred to the limitation of the training method of the fault policy model of the converter station in the above description, and the description is omitted here.

In one embodiment, as shown in fig. 5, there is provided a converter station fault policy model training apparatus 500, comprising: a fault data acquisition module 501, a fault vector determination module 502, a fault vector labeling module 503, and a model training module 504, wherein:

a fault data obtaining module 501, configured to obtain a fault data set of a converter station, where the fault data set includes a fault case and an operation and maintenance corpus of the converter station; the operation and maintenance corpus comprises fault operation and maintenance information of the converter station;

the fault vector determining module 502 is configured to obtain a plurality of groups of keywords based on the fault case and the operation and maintenance corpus, where one group of keywords is used to characterize a fault; determining a plurality of fault data vectors corresponding to the converter station based on the plurality of groups of keywords;

a fault vector labeling module 503, configured to label each fault data vector, so as to obtain a label type sequence corresponding to each fault data vector; wherein the label type sequence includes at least two of a fault type label, a fault feature label, an inspection information label, and a disposal measure label; the disposal measure label is a binary label and comprises operation and maintenance role information and disposal information for the fault;

the model training module 504 is configured to use the fault data vector and the label type sequence as training data, train a text classification model based on the training data, and use the trained text classification model as a fault policy model; the fault policy model is used for identifying structured fault data information from unstructured fault information, wherein the structured fault data information corresponds to information in the label type sequence.

Further, in one embodiment, the fault vector labeling module 503 is further configured to:

Further, in one embodiment, the model training module 504 is further configured to:

Further, in one embodiment, the converter station fault policy model training device 500 further includes a character recognition module, configured to obtain text data corresponding to image data in the fault data set by means of OCR (optical character recognition) technology, and to replace the image data with the corresponding text data so as to optimize the fault data set.

The modules in the converter station fault policy model training device can be implemented in whole or in part by software, hardware, or a combination thereof. The above modules may be embedded in, or independent of, a processor in the computer device in hardware form, or may be stored in a memory of the computer device in software form, so that the processor can call them and execute the operations corresponding to the above modules.

Based on the same inventive concept, the embodiment of the application also provides a converter station fault strategy pushing device for realizing the above related converter station fault strategy pushing method. The implementation of the solution provided by the device is similar to the implementation described in the above method, so the specific limitation in the embodiments of the present application for one or more fault policy pushing devices for a converter station may be referred to the limitation of the fault policy pushing method for a converter station in the foregoing description, which is not repeated herein.

In one embodiment, as shown in fig. 6, there is provided a converter station fault policy pushing device 600, including: a similarity calculation module 601 and a disposition measure pushing module 602, wherein:

the similarity calculation module 601 is configured to determine, after receiving a current fault type sent by a user terminal, a similarity between the current fault type and a fault type of each fault case in a standard case library, so as to obtain a plurality of similarities;

a handling measure pushing module 602, configured to push, to the user terminal, a converter station fault handling measure corresponding to the current fault type according to the multiple similarities;

the standard case library is obtained by carrying out structural identification on a plurality of fault cases of the converter station through a fault strategy model; the fault strategy model is obtained through training by the converter station fault strategy model training method.

Further, in one embodiment, the similarity calculation module 601 is further configured to:

and calculating cosine similarity scores of the current fault type vector and each standard fault vector to obtain a plurality of cosine similarity scores.

The disposition measure pushing module 602 is further configured to:

Further, in an embodiment, the disposition measure pushing module 602 is further configured to determine, from the disposition measures, a converter station fault disposition measure matching the operation and maintenance role information according to the operation and maintenance role information corresponding to the user terminal, as a converter station fault disposition measure pushed to the user terminal.

In one embodiment, a computer device is provided, which may be a server, the internal structure of which may be as shown in fig. 7. The computer device includes a processor, a memory, an Input/Output interface (I/O) and a communication interface. The processor, the memory and the input/output interface are connected through a system bus, and the communication interface is connected to the system bus through the input/output interface. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer programs, and a database. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The database of the computer device is for storing a fault dataset of the converter station. The input/output interface of the computer device is used to exchange information between the processor and the external device. The communication interface of the computer device is used for communicating with an external terminal through a network connection. The computer program, when executed by the processor, implements a converter station fault strategy model training method or a converter station fault strategy pushing method.

It will be appreciated by those skilled in the art that the structure shown in FIG. 7 is merely a block diagram of some of the structures associated with the present inventive arrangements and is not limiting of the computer device to which the present inventive arrangements may be applied, and that a particular computer device may include more or fewer components than shown, or may combine some of the components, or have a different arrangement of components.

In an embodiment, there is also provided a computer device comprising a memory and a processor, the memory having stored therein a computer program, the processor implementing the steps of the method embodiments described above when the computer program is executed.

In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored which, when executed by a processor, carries out the steps of the method embodiments described above.

In an embodiment, a computer program product is provided, comprising a computer program which, when executed by a processor, implements the steps of the method embodiments described above.

It should be noted that, the user information (including but not limited to user equipment information, user personal information, etc.) and the data (including but not limited to data for analysis, stored data, presented data, etc.) related to the present application are information and data authorized by the user or sufficiently authorized by each party, and the collection, use and processing of the related data need to comply with the related laws and regulations and standards of the related country and region.

Those skilled in the art will appreciate that implementing all or part of the above described methods may be accomplished by way of a computer program stored on a non-transitory computer readable storage medium, which when executed, may comprise the steps of the embodiments of the methods described above. Any reference to memory, database, or other medium used in embodiments provided herein may include at least one of non-volatile and volatile memory. The nonvolatile Memory may include Read-Only Memory (ROM), magnetic tape, floppy disk, flash Memory, optical Memory, high density embedded nonvolatile Memory, resistive random access Memory (ReRAM), magnetic random access Memory (Magnetoresistive Random Access Memory, MRAM), ferroelectric Memory (Ferroelectric Random Access Memory, FRAM), phase change Memory (Phase Change Memory, PCM), graphene Memory, and the like. Volatile memory can include random access memory (Random Access Memory, RAM) or external cache memory, and the like. By way of illustration, and not limitation, RAM can be in the form of a variety of forms, such as static random access memory (Static Random Access Memory, SRAM) or dynamic random access memory (Dynamic Random Access Memory, DRAM), and the like. The databases referred to in the embodiments provided herein may include at least one of a relational database and a non-relational database. The non-relational database may include, but is not limited to, a blockchain-based distributed database, and the like. The processor referred to in the embodiments provided in the present application may be a general-purpose processor, a central processing unit, a graphics processor, a digital signal processor, a programmable logic unit, a data processing logic unit based on quantum computing, or the like, but is not limited thereto.

The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of the technical features in the above embodiments are described; however, as long as there is no contradiction in a combination of these technical features, it should be considered to be within the scope of this description.

The foregoing examples illustrate only a few embodiments of the application and are described in detail, but they are not to be construed as limiting the scope of the application. It should be noted that those skilled in the art can make several variations and modifications without departing from the spirit of the application, all of which fall within the scope of the application. Accordingly, the scope of protection of the application shall be determined by the appended claims.

Claims

1. A method for training a converter station fault strategy model, the method comprising:

2. The method of claim 1, wherein labeling each fault data vector to obtain a label type sequence corresponding to each fault data vector comprises:

3. The method of claim 2, wherein labeling the sub-vector of the fault data vector with the target preset tag if the sub-vector of the fault data vector matches the target preset tag of the tag template, comprises:

4. The method of claim 1, wherein training a text classification model based on the training data, using the trained text classification model as a fault policy model, comprises:

5. The method of claim 1, wherein, before obtaining the plurality of groups of keywords based on the fault cases and the operation and maintenance corpus, the method further comprises:

6. A converter station fault policy pushing method, characterized in that the method comprises:

the standard case library is obtained by carrying out structural identification on a plurality of fault cases of the converter station through a fault strategy model; the fault policy model is trained by the method of any one of claims 1 to 5.

7. The method of claim 6, wherein the determining the similarity of the current fault type to the fault type of each fault case in the standard case library obtains a plurality of similarities; pushing, to the user terminal, a converter station fault handling measure corresponding to the current fault type according to the plurality of similarities, including:

8. The method of claim 7, wherein each standard case in the standard case library correspondingly stores a treatment measure and operation and maintenance role information corresponding to the treatment measure; after the treatment measures corresponding to the fault types, for which the cosine similarity scores reach the preset conditions, are obtained from the standard case library, the treatment measures further comprise:

9. A converter station fault policy model training device, the device comprising:

10. A converter station fault policy pushing device, characterized in that the device comprises:

11. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any one of claims 1 to 8 when the computer program is executed.

12. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method of any of claims 1 to 8.

13. A computer program product comprising a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the method of any one of claims 1 to 8.