WO2016188279A1 - Fault spectrum generation method, and fault spectrum based detection method and apparatus - Google Patents

Fault spectrum generation method, and fault spectrum based detection method and apparatus

Info

Publication number
WO2016188279A1
WO2016188279A1 PCT/CN2016/080015 CN2016080015W WO2016188279A1 WO 2016188279 A1 WO2016188279 A1 WO 2016188279A1 CN 2016080015 W CN2016080015 W CN 2016080015W WO 2016188279 A1 WO2016188279 A1 WO 2016188279A1
Authority
WO
WIPO (PCT)
Prior art keywords
fault
detection
work order
order data
word
Prior art date
Application number
PCT/CN2016/080015
Other languages
English (en)
French (fr)
Inventor
刘迅
Original Assignee
阿里巴巴集团控股有限公司
刘迅
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 阿里巴巴集团控股有限公司, 刘迅
Publication of WO2016188279A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/22 Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing

Definitions

  • the present application relates to the technical field of computers, and in particular, to a fault spectrum generation method, a fault spectrum based detection method, a fault spectrum generation device, and a fault spectrum based detection device.
  • the user submits a work order to the work order system for inspection and maintenance to solve the problem.
  • the existing work order system is mainly composed of two subsystems: autonomous answering system and customer service answering system.
  • step-by-step troubleshooting takes a lot of time, so fault detection is slow; and the technical documents accumulated by the work order system are generally numerous and complicated to work with, demanding a great deal of effort from users.
  • reading technical documents needs to accumulate knowledge in the field, and the technical threshold is high. It is difficult for a user with weak technical skills or customer service to solve the problem alone.
  • in view of the above problems, embodiments of the present application are proposed to provide a fault spectrum generation method and a fault spectrum based detection method, together with a corresponding fault spectrum generation device and fault spectrum based detection device, that overcome the above problems or at least partially solve them.
  • the embodiment of the present application discloses a method for generating a fault spectrum, including:
  • each first work order data includes fault information and detection information
  • Each type of fault spectrum model is trimmed to obtain each type of fault spectrum.
  • the fault spectrum includes a connected root node and a leaf node, where the root node represents fault information, the leaf node represents detection information, at least some of the leaf nodes have a logical relationship, and a leaf node has one or more parent nodes.
  • the step of extracting a common feature word from the detection information comprises:
  • At least part of the first participle is extracted as a common feature word according to the weight.
  • the step of extracting a common feature word from the detection information further includes:
  • the step of performing a pruning process on each type of fault spectrum model comprises:
  • the subtree is a set of one or more leaf nodes
  • the step of performing a pruning process on each type of fault spectrum model further includes:
  • the fault spectrum model is trimmed according to a preset pruning manner.
  • the step of performing a pruning process on each type of fault spectrum model further includes:
  • a leaf node whose logical relationship is illegal is cut out from the fault spectrum model.
  • the embodiment of the present application further discloses a fault spectrum based detection method, including:
  • the detection is performed according to the one or more detection paths, and the detection result is obtained.
  • the fault spectrum includes a connected root node and a leaf node, where the root node represents fault information, the leaf node represents detection information, at least some of the leaf nodes have a logical relationship, and a leaf node has one or more parent nodes.
  • the step of extracting keywords from the second work order data comprises:
  • the step of extracting keywords from the second work order data further includes:
  • the step of searching for one or more detection paths according to the feature words comprises:
  • the step of performing detection according to the one or more detection paths, and obtaining the detection result includes:
  • the candidate detection result of the final leaf node is set as the detection result.
  • the fault spectrum is generated by:
  • each first work order data includes fault information and detection information
  • Each type of fault spectrum model is trimmed to obtain each type of fault spectrum.
  • the embodiment of the present application further discloses a fault spectrum generating apparatus, including:
  • a work order data acquisition module configured to acquire first work order data of one or more categories; each first work order data includes fault information and detection information;
  • a common feature word extraction module configured to extract a common feature word from the detection information as a feature vector for each type of first work order data
  • a fault spectrum model learning module configured to learn a logical relationship between the fault information and the feature vector for each type of first work order data, and obtain each type of fault spectrum model
  • the fault spectrum model trimming module is used to trim each type of fault spectrum model to obtain each type of fault spectrum.
  • the fault spectrum includes a connected root node and a leaf node, where the root node represents fault information, the leaf node represents detection information, at least some of the leaf nodes have a logical relationship, and a leaf node has one or more parent nodes.
  • the public feature word extraction module comprises:
  • a first word segmentation processing module configured to perform word segmentation processing on the detection information to obtain one or more first word segments
  • a word frequency statistics module configured to count the word frequency of the first participle
  • a weight calculation module configured to calculate a weight of the first word segment by using a word frequency of the first word segment
  • a first word segment extraction submodule configured to extract at least part of the first word segments as common feature words according to the weight.
  • the public feature word extraction module further includes:
  • a first matching submodule configured to perform matching in the preset stop word bank by using the one or more first word segments
  • a first removal submodule configured to remove the first word segments that are successfully matched.
  • the fault spectrum model pruning module comprises:
  • subtree search submodule configured to search for the same subtree in the fault spectrum model;
  • the subtree is a set of one or more leaf nodes;
  • connection submodule for connecting a parent node of the same subtree to one of the subtrees when found
  • the first pruning submodule is used to cut out other subtrees other than the connected subtree in the same subtree.
  • the fault spectrum model pruning module further includes:
  • the second trimming submodule is configured to perform the trimming process on the fault spectrum model according to a preset pruning manner.
  • the fault spectrum model pruning module further includes:
  • a third pruning submodule configured to cut out a leaf node whose logical relationship is illegal from the fault spectrum model.
  • the embodiment of the present application further discloses a fault spectrum based detection apparatus, including:
  • a keyword extraction module configured to extract keywords from the second work order data when the second work order data is received
  • a fault spectrum finding module configured to search for a fault spectrum corresponding to the category of the second work order data
  • a detection path finding module configured to search for one or more detection paths according to the keyword in the fault spectrum
  • a detecting module configured to perform detection according to the one or more detection paths, and obtain a detection result.
  • the fault spectrum includes a connected root node and a leaf node, the root node represents fault information, the leaf node represents detection information, at least some of the leaf nodes have a logical relationship, and a leaf node has one or more parent nodes.
  • the keyword extraction module includes:
  • a second word segmentation processing sub-module configured to perform word segmentation processing on the second work order data to obtain one or more second word segments
  • a part of speech recognition sub-module configured to identify part of speech of the one or more second participles
  • a second word segmentation sub-module configured to extract keywords from the one or more second word segments according to the part of speech.
  • the keyword extraction module further includes:
  • a second matching submodule configured to perform matching in the preset stop word bank by using the one or more second word segments
  • a second removal submodule configured to remove the second word segments that are successfully matched.
  • the detection path searching module includes:
  • a root node matching submodule configured to find a root node that matches the keyword in the fault spectrum
  • a leaf node traversal submodule configured to traverse one or more leaf nodes connected to the root node to obtain one or more detection paths.
  • the detecting module comprises:
  • a detection information obtaining submodule configured to acquire, for each detection path, detection information of one or more leaf node representations in the detection path;
  • a candidate detection result obtaining sub-module configured to perform detection according to the detection information represented by the current leaf node, to obtain a candidate detection result
  • a leaf node search submodule configured to search for a next leaf node whose logical relationship matches the candidate detection result, and to call the candidate detection result obtaining submodule again, until the final leaf node has been executed;
  • the detection result setting sub-module is configured to set a candidate detection result of the final leaf node as the detection result.
  • the fault spectrum is generated by calling the following module:
  • a work order data acquisition module configured to acquire first work order data of one or more categories; each first work order data includes fault information and detection information;
  • a common feature word extraction module configured to extract a common feature word from the detection information as a feature vector for each type of first work order data
  • a fault spectrum model learning module configured to learn a logical relationship between the fault information and the feature vector for each type of first work order data, and obtain each type of fault spectrum model
  • the fault spectrum model trimming module is used to trim each type of fault spectrum model to obtain each type of fault spectrum.
  • the embodiment of the present application establishes a fault spectrum, so that the subsequent detection supports concurrent detection according to the one or more detection paths, which reduces the detection time and improves the detection efficiency.
  • detection using the fault spectrum is simple to operate and greatly reduces the frequency of manual participation, which reduces the effort required of the user.
  • the knowledge points in the knowledge base formed from the massive work order data are used to deal with faults, which greatly lowers the technical threshold and makes it easier for users or customer service staff with weak technical skills to solve problems on their own.
  • FIG. 1 is a flow chart showing the steps of an embodiment of a method for generating a fault spectrum according to the present application
  • FIGS. 2A and 2B are diagrams showing an example of trimming of a fault spectrum model of the present application
  • FIGS. 3A and 3B are diagrams showing an example of trimming of a fault spectrum model of the present application.
  • FIG. 4 is a flow chart showing the steps of an embodiment of a fault spectrum based detection method according to the present application.
  • Figure 5 is a diagram showing an example of a detection path of the present application.
  • FIG. 6A is a diagram showing an example of a conventional detection
  • FIG. 6B is a diagram showing an example of detection of the present application.
  • FIG. 7 is a structural block diagram of an embodiment of a fault spectrum generating apparatus of the present application.
  • FIG. 8 is a structural block diagram of an embodiment of a fault spectrum based detecting apparatus of the present application.
  • Referring to FIG. 1, a flow chart of the steps of an embodiment of the method for generating a fault spectrum of the present application is shown, which may specifically include the following steps:
  • Step 101 Acquire first work order data of one or more categories
  • the first work order data in the history can be stored.
  • the typical first work order data is written into knowledge points and stored in the knowledge base.
  • each first work order data may include: date, user ID, product, problem classification, problem (fault information), solution (detection information), communication record, and the like.
  • the fault information may be information describing the fault that occurred, and the detection information may be information describing how to detect and resolve the fault, and the two correspond to each other.
  • the fault information is “DB (Database) access slow”, and the detection information is “Please check the network congestion first”.
  • Step 102 Extract, for each type of first work order data, a common feature word from the detection information as a feature vector;
  • the public feature word which is a word common to some of the first work order data in the class, can be used to characterize the detection information as a parameter of the training sample.
  • step 102 may include the following sub-steps:
  • Sub-step S11 performing word segmentation processing on the detection information to obtain one or more first word segments
  • word segmentation can be handled in the following ways:
  • Word segmentation based on string matching refers to matching the Chinese character string to be analyzed against the entries of a preset machine dictionary according to a certain strategy. If a string is found in the dictionary, the match succeeds (a word is identified).
  • The word segmentation method based on feature scanning or mark segmentation refers to first identifying and segmenting some words with obvious features in the string to be analyzed. Using these words as breakpoints, the original string can be divided into smaller strings, which are then segmented mechanically. Alternatively, word segmentation can be combined with part-of-speech tagging, using the rich part-of-speech information to help segmentation decisions and, in turn, checking and adjusting the segmentation results during the tagging process.
  • the word segmentation method based on understanding refers to the effect of identifying words by letting the computer simulate the understanding of the sentence.
  • the basic idea is to perform syntactic and semantic analysis at the same time as word segmentation, and use syntactic information and semantic information to deal with ambiguity.
  • The statistics-based word segmentation method counts the frequency of adjacent co-occurring character combinations in the corpus, calculates their mutual information, and calculates the adjacent co-occurrence probability of two Chinese characters X and Y.
  • the mutual information can reflect the closeness of the relationship between Chinese characters. When the degree of tightness is above a certain threshold, the word group may be considered to constitute a word.
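The statistical idea above can be sketched as follows. This is a minimal illustration, not the method of the application: it scores each adjacent character pair with pointwise mutual information, log2(P(XY) / (P(X)·P(Y))), and treats pairs above a threshold as candidate words. The function names, the threshold, and the toy corpus are assumptions made for the example.

```python
import math
from collections import Counter


def pmi_table(corpus):
    """Pointwise mutual information of every adjacent character pair in the corpus."""
    chars = Counter()
    pairs = Counter()
    for sentence in corpus:
        chars.update(sentence)
        pairs.update(zip(sentence, sentence[1:]))
    n_chars = sum(chars.values())
    n_pairs = sum(pairs.values())
    table = {}
    for (x, y), count in pairs.items():
        p_xy = count / n_pairs          # probability of X and Y being adjacent
        p_x = chars[x] / n_chars        # probability of X on its own
        p_y = chars[y] / n_chars        # probability of Y on its own
        table[x + y] = math.log2(p_xy / (p_x * p_y))
    return table


def likely_words(corpus, threshold):
    """Adjacent pairs whose closeness exceeds the (assumed) threshold."""
    return {pair for pair, score in pmi_table(corpus).items() if score > threshold}


if __name__ == "__main__":
    # Toy corpus (hypothetical): three short Chinese phrases about networks.
    print(pmi_table(["网络拥堵", "网络中断", "检测网络"]))
```

On a corpus this small the scores are not meaningful; in practice the counts would come from a large body of work order text.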
  • the above word segmentation methods are only examples.
  • when implementing the embodiment of the present application, those skilled in the art may adopt other word segmentation methods according to actual needs, which is not limited by the embodiment of the present application.
  • Sub-step S12 counting the word frequency of the first participle
  • Sub-step S13 calculating the weight of the first participle by the word frequency of the first participle
  • the weight of the first participle can be calculated by TF-IDF (term frequency–inverse document frequency, a commonly used weighting technique for information retrieval and information mining).
  • TF-IDF can be used to assess how important a word is to a file set or to a file in a corpus.
  • the importance of a word increases proportionally with the number of times it appears in the file, but decreases with the frequency at which it appears in the corpus.
  • Sub-step S14 extracting at least part of the first participle as a common feature word according to the weight.
  • the N first word segments with the highest weights (N is a positive integer, such as 10) can be extracted as common feature words.
  • the extracted common feature words are as follows:
  • step 102 may further include the following sub-steps:
  • Sub-step S15 using the one or more first word segments to perform matching in a preset stop word bank
  • Sub-step S16 removing the first word segments that are successfully matched.
  • the stop word bank can store words that occur with high frequency but carry little actual meaning, mainly adverbs, function words, modal particles, and the like, such as "yes" and "is".
  • the meaningless words among the first word segments can be filtered out by the stop word bank.
  • the detection information "Please check the network congestion first" can be divided into first word segments such as "please", "you", "first", "right", "network congestion", "detection", "bar", etc.
  • by using the stop word bank, meaningless words such as "please", "you", "first", "right", and "bar" can be removed.
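Sub-steps S11 to S16 can be sketched as one small pipeline. The sketch below assumes the detection information has already been segmented, uses a hypothetical stop word bank, and ranks words by the sum of their TF-IDF weights over the class. The IDF variant `ln(N / df) + 1` and the summing across work orders are assumptions chosen for the example; the application does not specify them.

```python
import math
from collections import Counter

# Hypothetical stop word bank, taken from the example words in the text.
STOP_WORDS = {"please", "you", "first", "right", "bar"}


def remove_stop_words(segments):
    """Sub-steps S15/S16: drop segments that match the stop word bank."""
    return [w for w in segments if w not in STOP_WORDS]


def common_feature_words(documents, top_n=10):
    """documents: one list of word segments per piece of detection information."""
    cleaned = [remove_stop_words(doc) for doc in documents]
    doc_freq = Counter()
    for doc in cleaned:
        doc_freq.update(set(doc))           # number of documents containing each word
    n_docs = len(cleaned)
    weights = Counter()
    for doc in cleaned:
        if not doc:
            continue
        term_counts = Counter(doc)          # sub-step S12: word frequency
        for word, count in term_counts.items():
            tf = count / len(doc)
            idf = math.log(n_docs / doc_freq[word]) + 1.0  # assumed IDF variant
            weights[word] += tf * idf       # sub-step S13: weight
    return [word for word, _ in weights.most_common(top_n)]  # sub-step S14


if __name__ == "__main__":
    docs = [
        ["please", "you", "first", "right", "network congestion", "detection", "bar"],
        ["detection", "network congestion", "detection"],
        ["detection", "port"],
    ]
    print(common_feature_words(docs, top_n=2))
```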
  • Step 103 For each type of first work order data, learn a logical relationship between the fault information and the feature vector, and obtain each type of fault spectrum model;
  • a training device may be preset, which may be used to learn logical relationships between data of various dimensions (i.e., fault information and feature vectors), such as a support vector machine (SVM) or a decision tree.
  • the support vector machine maps the sample space into a high-dimensional or even infinite-dimensional feature space (a Hilbert space) through a nonlinear mapping p, so that a problem that is not linearly separable in the original sample space is transformed into a linearly separable problem in the feature space.
  • A random forest is a forest built in a random way. The forest contains many decision trees, and there is no correlation between the decision trees. Once the forest is obtained, when a new input sample arrives, each decision tree in the forest makes its own judgment about which class the sample should belong to (for a classification algorithm), and the class selected most often is the prediction for the sample.
  • the decision tree is a decision analysis method that, given the known probabilities of various situations, builds a tree to obtain the probability that the expected net present value is greater than or equal to zero, in order to evaluate project risk and judge feasibility.
  • the method is a graphical method that intuitively uses probability analysis.
  • the fault spectrum model can be fitted. If the error (CP) is less than a preset error threshold (such as 0.001), the fitting is stopped. The trained fault spectrum model, as shown in FIG. 2A, is a tree structure that includes a root node and leaf nodes, and at least some of the leaf nodes have a logical relationship.
  • the root node represents the fault information
  • the leaf node represents the detection information
  • "A: DB connection failure" is a root node, which represents fault information
  • "B: network disconnection" is a leaf node, which represents detection information, and its root node is "A: DB connection failure"
  • Step 104 Trim each type of fault spectrum model to obtain each type of fault spectrum.
  • the fault spectrum model can be trimmed according to actual needs to obtain the fault spectrum.
  • the fault spectrum may include a connected root node and a leaf node, the root node may represent fault information, the leaf node may represent detection information, at least some of the leaf nodes may have a logical relationship, and the leaf node may have one or more parent nodes.
  • the trimmed fault spectrum can be stored in the fault spectrum warehouse (database).
  • step 104 can include the following sub-steps:
  • Sub-step S21 searching for the same sub-tree in the fault spectrum model
  • the subtree may be a set of one or more leaf nodes
  • Sub-step S22 when found, connect the parent node of the same sub-tree to one of the sub-trees;
  • Sub-step S23 in the same subtree, the other subtrees other than the connected subtree are clipped.
  • the fault spectrum is a tree-like structure (a directed acyclic graph, DAG) rather than a strict tree structure.
  • some subtrees in a binary tree may duplicate other subtrees, which makes the binary tree cumbersome, with too many branches and unclear logical relationships. A DAG, however, contains no duplicate subtrees, because whenever a subtree is repeated, the parent node can drop its own copy and point to the other identical subtree. The tree-like structure is therefore smaller than the binary tree, and its logic is clearer.
  • the subtrees consisting of the four leaf nodes "H: Configuration Check", "F: Whitelist", "D: Detection Password", and "J: Detection Port" are repeated.
  • as shown in FIG. 3B, one of the subtrees can be deleted so that the parent nodes "B: Network Broken" and "E: Local Detection" of the subtree point to the same subtree.
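Sub-steps S21 to S23 amount to sharing structurally identical subtrees so that the tree becomes a DAG. A minimal sketch follows; the `Node` class, the function name, and the shape of the repeated subtree (H with three children) are assumptions for illustration, since the figures are not reproduced here.

```python
class Node:
    """A node of the fault spectrum model (hypothetical representation)."""

    def __init__(self, label, children=None):
        self.label = label
        self.children = children or []


def merge_duplicate_subtrees(root):
    """Share identical subtrees so that each one is stored only once."""
    canonical = {}  # signature -> the single node kept for that subtree

    def visit(node):
        # Children are merged first, so identical subtrees end up with
        # identical child objects and therefore identical signatures.
        node.children = [visit(child) for child in node.children]
        signature = (node.label, tuple(id(child) for child in node.children))
        return canonical.setdefault(signature, node)

    return visit(root)


def repeated_subtree():
    # Assumed shape: H is the parent of F, D and J.
    return Node("H: Configuration Check", [
        Node("F: Whitelist"),
        Node("D: Detection Password"),
        Node("J: Detection Port"),
    ])


if __name__ == "__main__":
    tree = Node("A: DB connection failure", [
        Node("B: Network Broken", [repeated_subtree()]),
        Node("E: Local Detection", [repeated_subtree()]),
    ])
    dag = merge_duplicate_subtrees(tree)
    b, e = dag.children
    print(b.children[0] is e.children[0])
```

Child order is treated as significant here; two subtrees with the same children in a different order would not be merged.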
  • step 104 may further include the following sub-steps:
  • Sub-step S24 the fault spectrum model is trimmed according to a preset pruning mode.
  • the trained fault spectrum model may contain deeper levels, which may result in numerous steps for detection.
  • the fault spectrum model may be trimmed by a preset pruning method, such as a prune() function, which reduces the number of levels of the fault spectrum model within an acceptable range of detection error, so as to reduce the complexity of the fault spectrum model and the number of detection steps.
  • the level of the fault spectrum model shown in FIG. 2A is 6 layers, and after trimming by the pruning method, a 4-layer fault spectrum model as shown in FIG. 2B is obtained.
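The effect of reducing a 6-level model to 4 levels can be illustrated with a simple depth cut. This is not the prune() function mentioned above, and it ignores the detection-error criterion entirely; it only shows what "fewer levels" means structurally. The nested-dict representation and the function names are assumptions.

```python
def prune_to_depth(node, max_depth, depth=1):
    """Return a copy of the tree that keeps only the first max_depth levels."""
    if depth >= max_depth:
        return {"label": node["label"], "children": []}
    return {
        "label": node["label"],
        "children": [prune_to_depth(child, max_depth, depth + 1)
                     for child in node["children"]],
    }


def depth_of(node):
    """Number of levels in the tree, counting the root as level 1."""
    return 1 + max((depth_of(child) for child in node["children"]), default=0)


def chain(levels):
    """Build a hypothetical single-branch tree with the given number of levels."""
    node = {"label": f"level {levels}", "children": []}
    for i in range(levels - 1, 0, -1):
        node = {"label": f"level {i}", "children": [node]}
    return node


if __name__ == "__main__":
    model = chain(6)
    print(depth_of(model), depth_of(prune_to_depth(model, 4)))
```

A real pruning method would pick which branches to cut by comparing the error before and after each cut, rather than cutting everything below a fixed depth.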
  • step 104 may further include the following sub-steps:
  • Sub-step S25 cutting out the leaf node whose logical relationship is illegal from the fault spectrum model.
  • the leaf nodes that are illegal for some logical relationships may be cut out by manual confirmation or by legal logic relationship to improve the accuracy.
  • Referring to FIG. 4, a flow chart of the steps of an embodiment of the fault spectrum based detection method of the present application is shown, which may specifically include the following steps:
  • Step 401 When receiving the second work order data, extract keywords from the second work order data;
  • the embodiment of the present application can be applied to a virtual customer service system, which can use the streaming real-time processing framework Storm to ensure that detection is completed with minimal delay.
  • other streaming real-time processing frameworks, such as S4 (Simple Scalable Streaming System), MillWheel, and Kinesis, can also be applied, and the embodiment of the present application does not limit this.
  • the user can submit the second work order data to the virtual customer service system through a browser, an independent application, etc., and the virtual customer service system can remove the irrelevant interference information and purify the second work order data.
  • in general, the second work order data is allowed to contain one piece of fault information, so as to solve one problem. If the user raises a question, information unrelated to that question can be regarded as interference information.
  • the second work order data may be classified by product, and information unrelated to the current product may be regarded as interference information.
  • for example, a second question from the user that is unrelated to the first question is interference information.
  • a keyword may be extracted, and the keyword may be information that reflects the characteristics of the second work order data (ie, fault information).
  • step 401 can include the following sub-steps:
  • Sub-step S31 performing word segmentation processing on the second work order data to obtain one or more second word segments
  • word segmentation can be handled in the following ways:
  • a word segmentation method based on feature scanning or marker segmentation.
  • word segmentation processing mode is only an example.
  • other word segmentation processing modes may be set according to actual conditions, which is not limited by the embodiment of the present application.
  • those skilled in the art may also adopt other word segmentation processing methods according to actual needs, and the embodiment of the present application does not limit this.
  • Sub-step S32 identifying the part of speech of the one or more second word segments
  • Sub-step S33 extracting keywords from the one or more second word segments according to the part of speech.
  • part-of-speech analysis may be performed on the second word segments to obtain the part of speech of each second word segment, such as noun, verb, adjective, adverb, preposition, conjunction, auxiliary word, and the like.
  • nouns and verbs can be used to form keywords, nouns can be used to determine the target object, and verbs can infer the main semantics.
  • For example, in a second work order data, user A asks whether there is a UDF function for taking the maximum value.
  • the verb is "whether there is" and the noun is "UDF function", that is, user A's question (i.e., the keyword) is whether there is a UDF function.
  • step 401 may further include the following sub-steps:
  • Sub-step S34 using the one or more second word segments to perform matching in a preset stop word bank
  • Sub-step S35 removing the second word segments that are successfully matched.
  • the meaningless words in the second participle can be filtered by the stop word.
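Sub-steps S32 to S35 can be sketched as a filter over word segments that already carry part-of-speech tags. The tagger itself is out of scope here, so the input is assumed to be a list of `(word, tag)` pairs; the tag names `"n"` and `"v"`, the stop word bank, and the sample segments are hypothetical.

```python
KEYWORD_TAGS = {"n", "v"}      # assumed tag names for nouns and verbs
STOP_WORDS = {"please", "a"}   # hypothetical stop word bank


def extract_keywords(tagged_segments):
    """Keep nouns and verbs that are not in the stop word bank."""
    return [
        word
        for word, tag in tagged_segments
        if tag in KEYWORD_TAGS and word not in STOP_WORDS
    ]


if __name__ == "__main__":
    # Hypothetical segmentation of a question about a UDF function.
    segments = [("please", "v"), ("is there", "v"), ("a", "u"),
                ("UDF function", "n"), ("quickly", "d")]
    print(extract_keywords(segments))
```

Nouns identify the target object and verbs carry the main semantics, which is why only those two tags are kept in this sketch.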
  • Step 402 Find a fault spectrum corresponding to the category of the second work order data
  • the fault spectrum can be pre-trained and stored in the fault spectrum warehouse (database).
  • text similarity can be used to find the fault spectrum corresponding to the category of the second work order data, that is, the fault spectrum warehouse is searched for a fault spectrum similar to the keyword in the second work order data.
  • the fault spectrum of the "DB access slow” category can be matched according to the text similarity.
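A lookup by text similarity might look like the sketch below. The application does not say which similarity measure is used; this example substitutes the ratio from Python's standard `difflib.SequenceMatcher`. The warehouse is assumed to be a plain dictionary from category name to fault spectrum, and the category names and placeholder values are hypothetical.

```python
from difflib import SequenceMatcher


def find_fault_spectrum(keywords, warehouse):
    """Return the (category, spectrum) whose category is most similar to the keywords."""
    query = " ".join(keywords).lower()

    def similarity(category):
        # Ratio in [0, 1]; 1.0 means the two strings are identical.
        return SequenceMatcher(None, query, category.lower()).ratio()

    best = max(warehouse, key=similarity)
    return best, warehouse[best]


if __name__ == "__main__":
    # Hypothetical warehouse: the values stand in for stored fault spectra.
    warehouse = {
        "DB access slow": "spectrum-1",
        "DB connection failure": "spectrum-2",
    }
    print(find_fault_spectrum(["DB", "access", "slow"], warehouse))
```

A production system would more likely compare weighted term vectors, but the lookup structure would be the same.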
  • the fault spectrum may include a connected root node and a leaf node, the root node may represent fault information, the leaf node may represent detection information, at least some of the leaf nodes may have a logical relationship, and the leaf node may have one or more parent nodes.
  • Step 403 in the fault spectrum, searching for one or more detection paths according to the keyword;
  • the detection path may record information of a detection mode and a detection sequence.
  • step 403 can include the following sub-steps:
  • Sub-step S41 in the fault spectrum, searching for a root node that matches the keyword
  • Sub-step S42 traversing one or more leaf nodes connected to the root node to obtain one or more detection paths.
  • the fault spectrum is a tree-like structure (directed acyclic graph DAG), and therefore, the "node" matching can be performed from the top to the bottom when the detection path is retrieved.
  • the root node can be located in the fault spectrum, and all the leaf nodes reachable from the root node are traversed to form the detection paths.
  • the number of leaf nodes in the layer directly below the root node is generally the same as the number of detection paths.
  • the connection may be direct, where the root node is directly connected to the child node, or indirect, where the root node is connected to the child node through other nodes.
  • the leaf node "E: Local Detection” is directly connected to the root node "A: DB Connection Failure"
  • the leaf node "H: Configuration Troubleshooting” is indirectly connected to the root node "A: DB Connection Failure”.
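One way to read sub-steps S41 and S42 is as a top-down traversal that lists every route from the matched root node to a final leaf node. The sketch below takes that reading; the adjacency-dictionary representation and the edges between the example nodes are assumptions, since the figures are not reproduced here.

```python
def detection_paths(spectrum, node):
    """List every path from `node` down to a node that has no next layer."""
    children = spectrum.get(node, [])
    if not children:
        return [[node]]
    return [
        [node] + path
        for child in children
        for path in detection_paths(spectrum, child)
    ]


if __name__ == "__main__":
    # Hypothetical fault spectrum: H is shared by B and E, so this is a DAG.
    spectrum = {
        "A: DB Connection Failure": ["B: Network Broken", "E: Local Detection"],
        "B: Network Broken": ["H: Configuration Troubleshooting"],
        "E: Local Detection": ["H: Configuration Troubleshooting"],
        "H: Configuration Troubleshooting": [],
    }
    for path in detection_paths(spectrum, "A: DB Connection Failure"):
        print(" -> ".join(path))
```

Because the structure is acyclic, the recursion always terminates; the shared node H simply appears in more than one path.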
  • Step 404 Perform detection according to the one or more detection paths to obtain a detection result.
  • the concurrent detection is performed according to the one or more detection paths, which reduces the detection time and improves the detection efficiency.
  • step 404 can include the following sub-steps:
  • Sub-step S51 acquiring, for each detection path, detection information of one or more leaf node representations in the detection path;
  • Sub-step S52 performing detection according to the detection information represented by the current leaf node, and obtaining candidate detection results
  • Sub-step S53 searching for the next leaf node whose logical relationship matches the candidate detection result, and returning to sub-step S51 until the final leaf node has been executed;
  • Sub-step S54 the candidate detection result of the final leaf node is set as the detection result.
  • the logical relationship and the detection information represented by the leaf node may be referred to as a rule, that is, when a certain condition (logical relationship) is met, an operation (detection information) is performed.
  • for example, when DB access is slow, the bandwidth and traffic of the network are checked. This is a logical relationship.
  • Rules are pre-defined in the rule engine, such as JBoss Rules. Once the condition is triggered, the rule engine executes this rule, such as executing the network status detection command ifstat.
  • the detection path is detected layer by layer according to a logical relationship until the final leaf node, and part of the child nodes that do not conform to the logical relationship can be avoided.
  • the final leaf node may refer to a leaf node that does not have a leaf node of the next layer, and is not necessarily the lowest leaf node in the detection path.
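Sub-steps S51 to S54, together with the concurrent execution of several detection paths, can be sketched as below. This is an illustration under stated assumptions, not the rule-engine implementation described above: each node's detection is a plain Python callable, the logical relationships are a dictionary keyed by `(node, candidate result)`, and the node names and results are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def run_path(start, checks, transitions):
    """Follow one detection path from `start` until a final leaf node is reached."""
    node = start
    while True:
        result = checks[node]()                       # S52: candidate detection result
        next_node = transitions.get((node, result))   # S53: match the logical relationship
        if next_node is None:                         # no next layer: final leaf node
            return node, result                       # S54: final detection result
        node = next_node


def detect(starts, checks, transitions):
    """Run all detection paths concurrently and collect their results in order."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda s: run_path(s, checks, transitions), starts))


if __name__ == "__main__":
    # Hypothetical checks; real ones would run commands or query monitoring data.
    checks = {
        "B: network": lambda: "ok",
        "C: DB load": lambda: "high",
        "G: slow SQL": lambda: "found",
    }
    transitions = {("C: DB load", "high"): "G: slow SQL"}
    print(detect(["B: network", "C: DB load"], checks, transitions))
```

Child nodes whose logical relationship does not match the candidate result are never visited, which is how a path avoids unnecessary checks.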
  • the root node "A: DB access slow” represents fault information, and the elements are network, DB load, and SQL (Structured Query Language).
  • 70% of the work order data concerns very common, highly repetitive problems. Handling these problems with the existing work order system takes more than 50% of customer service resources and involves a great deal of duplicated labor, so efficiency is very low. One of the purposes of the embodiment of the present application is to solve this 70% of common problems automatically.
  • the existing work order system generally relies on the user's own experience to guide troubleshooting, that is, the direction of guidance is "person → problem"; the operation is complicated, the user must expend a lot of effort, and the technical threshold is high.
  • the embodiment of the present application is to let the system guide the troubleshooting process.
  • the fault spectrum generated by the massive work order data guides the user to check the problem, and is reverse guidance, that is, the guiding direction is “system ⁇ problem ⁇ person”, and the operation is simple, and the operation is greatly reduced.
  • the frequency of manual participation reduces the user's energy consumption.
  • the knowledge points in the knowledge base formed by the massive work order data locks greatly reduce the technical threshold, and facilitate the users with weak technical skills or customer service to solve the problem alone.
  • the existing work order system is inspected in the serial check, that is, without any inherent logical relationship, and all the nodes in FIG. 6A need to be sorted out in an orderly manner, for example: A ⁇ B ⁇ C ⁇ D ⁇ E until you find a problem. Therefore, when the time complexity is O(N), N is the number of nodes, that is, all nodes must be checked.
  • the fault spectrum can be generated in the following manner:
  • Sub-step S61: acquiring first work order data of one or more categories, each piece of first work order data including fault information and detection information;
  • Sub-step S62: for each category of first work order data, extracting common feature words from the detection information as a feature vector;
  • Sub-step S63: for each category of first work order data, learning the logical relationship between the fault information and the feature vector to obtain a fault spectrum model for each category;
  • Sub-step S64: trimming the fault spectrum model of each category to obtain a fault spectrum for each category.
  • the fault spectrum may include a connected root node and a leaf node, where the root node may represent fault information, and the leaf node may represent detection information, and at least some of the leaf nodes may have a logical relationship.
  • a leaf node can have one or more parent nodes.
  • sub-step S62 may include the following sub-steps:
  • Sub-step S621: performing word segmentation on the detection information to obtain one or more first word segments;
  • Sub-step S622: counting the word frequency of the first word segments;
  • Sub-step S623: calculating the weight of the first word segments from their word frequency;
  • Sub-step S624: extracting at least some of the first word segments as common feature words according to the weight.
  • sub-step S62 may further include the following sub-steps:
  • Sub-step S625: matching the one or more first word segments against a preset stop word library;
  • Sub-step S626: removing the first word segments that are successfully matched.
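The feature word sub-steps (segmentation output, stop word filtering, word frequency, weighting, top-N selection) can be sketched as below. This is a minimal illustration, not the patented implementation: the stop word list, the sample tokens, the smoothed IDF formula, and the choice to sum TF-IDF weights across the work orders of one category are all assumptions, and the input is assumed to be already segmented.

```python
import math
from collections import Counter

# Hypothetical stop word library; the source only states that a preset one exists.
STOP_WORDS = {"please", "you", "first", "the", "to"}


def common_feature_words(docs, top_n=3):
    """docs: one token list per work order (detection information, already segmented)."""
    # Stop word filtering: drop tokens that match the stop word library.
    filtered = [[t for t in doc if t not in STOP_WORDS] for doc in docs]

    # Document frequency of each token within the category.
    doc_freq = Counter()
    for doc in filtered:
        doc_freq.update(set(doc))

    n_docs = len(filtered)
    weights = Counter()
    for doc in filtered:
        if not doc:
            continue
        term_freq = Counter(doc)  # word frequency inside one work order
        for term, count in term_freq.items():
            # Smoothed IDF (an assumption; other TF-IDF variants would also fit).
            idf = math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0
            weights[term] += (count / len(doc)) * idf

    # Keep the top_n tokens by weight; ties are broken alphabetically.
    ranked = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:top_n]]


sample = [
    ["please", "check", "network", "congestion"],
    ["check", "network", "bandwidth"],
    ["please", "check", "port"],
]
features = common_feature_words(sample)
```

Summing the per-document weights favors words that recur across several work orders of the category, which matches the idea of words shared by part of the category.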
  • sub-step S64 may include the following sub-steps:
  • Sub-step S641: searching for identical subtrees in the fault spectrum model, where a subtree is a set of one or more leaf nodes;
  • Sub-step S642: when identical subtrees are found, connecting the parent nodes of the identical subtrees to one of the subtrees;
  • Sub-step S643: among the identical subtrees, cutting off the subtrees other than the connected one.
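One way to read sub-steps S641 to S643 is as structural de-duplication: every subtree gets a key built from its node label and its (condition, child) edges, and each parent is re-pointed at the first subtree seen with that key, which turns the tree into a DAG. The sketch below assumes this reading. The `Node` class and the empty edge conditions are hypothetical, and the node names follow the FIG. 3A/3B example in the description.

```python
class Node:
    def __init__(self, label, children=None):
        self.label = label
        # Each child is (condition, Node); condition stands for the logical relationship.
        self.children = children or []


def merge_identical_subtrees(root):
    canonical = {}  # structural key -> the single subtree that is kept

    def visit(node):
        kept_children, child_keys = [], []
        for condition, child in node.children:
            kept, key = visit(child)
            kept_children.append((condition, kept))  # connect the parent to the kept copy
            child_keys.append((condition, key))
        node.children = kept_children
        # Identical means same leaf nodes and same logical relationships between them.
        key = (node.label, tuple(sorted(child_keys)))
        return canonical.setdefault(key, node), key

    return visit(root)[0]


def config_subtree():
    return Node("H: configuration check", [
        ("", Node("F: whitelist")),
        ("", Node("D: check password")),
        ("", Node("J: check port")),
    ])


tree = Node("A: DB connection failed", [
    ("", Node("E: local check", [("", config_subtree())])),
    ("", Node("B: network down", [("", config_subtree()), ("", Node("C: repair network"))])),
])
spectrum = merge_identical_subtrees(tree)
local_check = spectrum.children[0][1]
network_down = spectrum.children[1][1]
```

After the call, both parents reference the same "H" node object, and the duplicate copy is no longer reachable.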
  • sub-step S64 may further include the following sub-step:
  • Sub-step S644: trimming the fault spectrum model according to a preset pruning manner.
  • sub-step S64 may further include the following sub-step:
  • Sub-step S645: cutting leaf nodes whose logical relationships are illegal out of the fault spectrum model.
  • Referring to FIG. 7, a structural block diagram of an embodiment of a fault spectrum generating apparatus of the present application is shown, which may specifically include the following modules:
  • the work order data obtaining module 701 is configured to acquire first work order data of one or more categories; each first work order data includes fault information and detection information;
  • the common feature word extraction module 702 is configured to extract a common feature word from the detection information as a feature vector for each type of first work order data;
  • the fault spectrum model learning module 703 is configured to learn a logical relationship between the fault information and the feature vector for each type of first work order data, and obtain each type of fault spectrum model;
  • the fault spectrum model trimming module 704 is configured to perform trimming processing on each type of fault spectrum model to obtain each type of fault spectrum.
  • the fault spectrum may include a connected root node and a leaf node, where the root node may represent fault information, and the leaf node may represent detection information, and at least some of the leaf nodes may have a logical relationship.
  • a leaf node can have one or more parent nodes.
  • the common feature word extraction module 702 may include the following sub-modules:
  • a first word segmentation processing module configured to perform word segmentation on the detection information to obtain one or more first word segments;
  • a word frequency statistics module configured to count the word frequency of the first word segments;
  • a weight calculation module configured to calculate the weight of the first word segments from their word frequency;
  • a first word segment extraction sub-module configured to extract at least some of the first word segments as common feature words according to the weight.
  • the common feature word extraction module 702 may further include the following sub-modules:
  • a matching sub-module configured to match the one or more first word segments against a preset stop word library;
  • a removal sub-module configured to remove the first word segments that are successfully matched.
  • the fault spectrum model trimming module 704 may include the following sub-modules:
  • a subtree search sub-module configured to search for identical subtrees in the fault spectrum model, where a subtree is a set of one or more leaf nodes;
  • a connection sub-module configured to connect, when identical subtrees are found, the parent nodes of the identical subtrees to one of the subtrees;
  • a first trimming sub-module configured to cut off, among the identical subtrees, the subtrees other than the connected one.
  • the fault spectrum model trimming module 704 may further include the following sub-module:
  • a second trimming sub-module configured to trim the fault spectrum model according to a preset pruning manner.
  • the fault spectrum model trimming module 704 may further include the following sub-module:
  • a third trimming sub-module configured to cut leaf nodes whose logical relationships are illegal out of the fault spectrum model.
  • Referring to FIG. 8, a structural block diagram of an embodiment of a fault spectrum based detecting apparatus of the present application is shown, which may specifically include the following modules:
  • a keyword extraction module 801 configured to extract a keyword from the second work order data when the second work order data is received
  • the fault spectrum finding module 802 is configured to search for a fault spectrum corresponding to the category of the second work order data
  • a detection path searching module 803 configured to search for one or more detection paths according to the keyword in the fault spectrum
  • the detecting module 804 is configured to perform detection according to the one or more detection paths to obtain a detection result.
  • the fault spectrum may include a connected root node and a leaf node, where the root node may represent fault information, and the leaf node may represent detection information, and at least some of the leaf nodes may have a logical relationship.
  • a leaf node can have one or more parent nodes.
  • the keyword extraction module 801 may include the following sub-modules:
  • a second word segmentation processing sub-module configured to perform word segmentation on the second work order data to obtain one or more second word segments;
  • a part-of-speech recognition sub-module configured to identify the part of speech of the one or more second word segments;
  • a second word segment extraction sub-module configured to extract keywords from the one or more second word segments according to the part of speech.
  • the keyword extraction module 801 may further include the following sub-modules:
  • a second matching sub-module configured to match the one or more first word segments against a preset stop word library;
  • a second removal sub-module configured to remove the second word segments that are successfully matched.
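The keyword extraction sub-modules can be illustrated with a toy example that keeps only nouns and verbs after stop word filtering. The part-of-speech lexicon and the stop word list here are hypothetical stand-ins for a real tagger and a real stop word library, and the input is assumed to be segmented already. The sample question follows the "how to debug a UDF function" example in the description.

```python
# Hypothetical part-of-speech lexicon; a real system would use a trained tagger.
POS_LEXICON = {
    "how": "adverb",
    "to": "particle",
    "debug": "verb",
    "udf": "noun",
    "function": "noun",
}
# Hypothetical stop word library.
STOP_WORDS = {"how", "to"}


def extract_keywords(tokens):
    # Remove word segments that match the stop word library.
    kept = [t for t in tokens if t.lower() not in STOP_WORDS]
    # Keep nouns (the target object) and verbs (the main semantics).
    return [t for t in kept if POS_LEXICON.get(t.lower()) in ("noun", "verb")]


keywords = extract_keywords(["how", "to", "debug", "UDF", "function"])
```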
  • the detection path lookup module 803 may include the following sub-modules:
  • a root node matching submodule configured to find a root node that matches the keyword in the fault spectrum
  • a leaf node traversal sub-module configured to traverse the one or more leaf nodes connected to the root node to obtain one or more detection paths.
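Root matching and leaf traversal can be sketched as follows, assuming the fault spectrum is stored as an adjacency map and that each first-layer leaf node under the root starts one detection path. The node letters follow the FIG. 3B example in the description; the substring-based root matching and the data layout are assumptions made for illustration.

```python
from collections import deque

# Hypothetical storage of the FIG. 3B fault spectrum.
ROOT_LABELS = {"A": "DB connection failed"}
EDGES = {"A": ["E", "B"], "E": ["H"], "B": ["C", "H"], "H": ["F", "D", "J"]}


def match_root(keywords, root_labels=ROOT_LABELS):
    """Return the first root whose label contains every keyword, else None."""
    for root, label in root_labels.items():
        if all(k.lower() in label.lower() for k in keywords):
            return root
    return None


def detection_paths(root, edges=EDGES):
    """One path per first-layer leaf node: the nodes reachable from it, top down."""
    paths = []
    for first in edges.get(root, []):
        order, seen, queue = [root], {first}, deque([first])
        while queue:
            node = queue.popleft()
            order.append(node)
            for child in edges.get(node, []):
                if child not in seen:  # a shared subtree is visited once per path
                    seen.add(child)
                    queue.append(child)
        paths.append(order)
    return paths


root = match_root(["DB", "connection"])
paths = detection_paths(root)
```

Because the paths are independent of one another, they are natural units for concurrent execution.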
  • the detecting module 804 may include the following submodules:
  • a detection information obtaining submodule configured to acquire, for each detection path, detection information of one or more leaf node representations in the detection path;
  • a candidate detection result obtaining sub-module configured to perform detection according to the detection information represented by the current leaf node, to obtain a candidate detection result
  • a leaf node search sub-module configured to search for the next leaf node whose logical relationship matches the candidate detection result, and to return to invoking the candidate detection result obtaining sub-module until the final leaf node has been executed;
  • the detection result setting sub-module is configured to set a candidate detection result of the final leaf node as the detection result.
  • the fault spectrum can be generated by calling the following modules:
  • a work order data acquisition module configured to acquire first work order data of one or more categories; each first work order data includes fault information and detection information;
  • a common feature word extraction module configured to extract a common feature word from the detection information as a feature vector for each type of first work order data
  • a fault spectrum model learning module configured to learn a logical relationship between the fault information and the feature vector for each type of first work order data, and obtain each type of fault spectrum model
  • the fault spectrum model trimming module is used to trim each type of fault spectrum model to obtain each type of fault spectrum.
  • the common feature word extraction module 702 may include the following sub-modules:
  • a first word segmentation processing module configured to perform word segmentation on the detection information to obtain one or more first word segments;
  • a word frequency statistics module configured to count the word frequency of the first word segments;
  • a weight calculation module configured to calculate the weight of the first word segments from their word frequency;
  • a first word segment extraction sub-module configured to extract at least some of the first word segments as common feature words according to the weight.
  • the common feature word extraction module may further include the following sub-modules:
  • a matching sub-module configured to match the one or more first word segments against a preset stop word library;
  • a removal sub-module configured to remove the first word segments that are successfully matched.
  • the fault spectrum model trimming module may include the following sub-modules:
  • a subtree search sub-module configured to search for identical subtrees in the fault spectrum model, where a subtree is a set of one or more leaf nodes;
  • a connection sub-module configured to connect, when identical subtrees are found, the parent nodes of the identical subtrees to one of the subtrees;
  • a first trimming sub-module configured to cut off, among the identical subtrees, the subtrees other than the connected one.
  • the fault spectrum model trimming module may further include the following sub-module:
  • a second trimming sub-module configured to trim the fault spectrum model according to a preset pruning manner.
  • the fault spectrum model trimming module may further include the following sub-module:
  • a third trimming sub-module configured to cut leaf nodes whose logical relationships are illegal out of the fault spectrum model.
  • Because the apparatus embodiments are substantially similar to the method embodiments, they are described relatively briefly; for the relevant parts, refer to the description of the method embodiments.
  • Those skilled in the art should understand that the embodiments of the present application can be provided as a method, an apparatus, or a computer program product. Therefore, the embodiments of the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, the embodiments of the present application can take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, and optical storage) that contain computer-usable program code.
  • In a typical configuration, the computer device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
  • The memory may include non-persistent memory in a computer-readable medium, random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM).
  • Memory is an example of a computer-readable medium.
  • Computer-readable media include permanent and non-permanent, removable and non-removable media, which can implement information storage by any method or technology.
  • The information can be computer-readable instructions, data structures, program modules, or other data.
  • Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible to a computing device.
  • As defined herein, computer-readable media do not include transitory media, such as modulated data signals and carrier waves.
  • Embodiments of the present application are described with reference to flowcharts and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the present application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions.
  • These computer program instructions can be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing terminal device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing terminal device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
  • These computer program instructions can also be stored in a computer-readable memory that can direct a computer or other programmable data processing terminal device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction device, and the instruction device implements the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
  • The fault spectrum generation method, the fault spectrum based detection method, the fault spectrum generating apparatus, and the fault spectrum based detecting apparatus provided by the present application have been described in detail above. Specific examples are used herein to explain the principle and implementation of the present application, and the description of the above embodiments is only intended to help in understanding the method of the present application and its core idea. Meanwhile, for those of ordinary skill in the art, the specific implementation and the scope of application may vary according to the idea of the present application. In summary, the contents of this specification should not be construed as limiting the present application.

Abstract

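A fault spectrum generation method, a fault spectrum based detection method, and corresponding apparatuses. The generation method includes: acquiring first work order data of one or more categories (101), each piece of first work order data including fault information and detection information; for each category of first work order data, extracting common feature words from the detection information as a feature vector (102); for each category of first work order data, learning the logical relationship between the fault information and the feature vector to obtain a fault spectrum model for each category (103); and trimming the fault spectrum model of each category to obtain a fault spectrum for each category (104). Establishing the fault spectrum allows subsequent detection to run concurrently along one or more detection paths, which reduces detection time and improves detection efficiency. In addition, detection with the fault spectrum is simple to operate, greatly reduces the frequency of manual participation, and lowers the effort required of the user.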

Description

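Fault spectrum generation method, and fault spectrum based detection method and apparatus
Technical Field
The present application relates to the technical field of computers, and in particular to a fault spectrum generation method, a fault spectrum based detection method, a fault spectrum generating apparatus, and a fault spectrum based detecting apparatus.
Background
With the rapid development of technology, various products, such as virtual hosts and cloud platforms, have widely entered people's life, study, and work.
Usually, when a product fails, the user submits a work order to a work order system for detection and maintenance in order to resolve the fault.
The existing work order system mainly consists of two subsystems: a self-service answering system and a customer service answering system.
In the work order system, users need to consult the help center documents themselves or follow wizard prompts to resolve the fault.
Because users have to troubleshoot step by step according to documents or prompts, that is, serial troubleshooting, much time is consumed and fault detection is slow. Moreover, the technical documents accumulated by a work order system are usually numerous and complicated to follow, which costs users a great deal of effort. In addition, reading technical documents usually requires accumulated domain knowledge, so the technical threshold is high, and users or customer service staff with weak technical skills can hardly solve problems on their own.
Summary
In view of the above problems, embodiments of the present application are proposed to provide a fault spectrum generation method, a fault spectrum based detection method, and a corresponding fault spectrum generating apparatus and fault spectrum based detecting apparatus that overcome the above problems or at least partially solve them.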
为了解决上述问题,本申请实施例公开了一种故障谱的生成方法,包括:
获取一个或多个类别的第一工单数据;每个第一工单数据中包括故障信 息与检测信息;
针对每类第一工单数据,从所述检测信息中提取公共特征词,作为特征向量;
针对每类第一工单数据,学习所述故障信息与所述特征向量之间的逻辑关系,获得每类故障谱模型;
对每类故障谱模型进行修剪处理,获得每类故障谱。
优选地,所述故障谱中包括相连的根节点与叶子节点,所述根节点表征故障信息,所述叶子节点表征检测信息,至少部分叶子节点之间具有逻辑关系,所述叶子节点具有一个或多个父节点。
优选地,所述从所述检测信息中提取公共特征词的步骤包括:
对所述检测信息进行分词处理,获得一个或多个第一分词;
统计所述第一分词的词频;
通过所述第一分词的词频计算所述第一分词的权重;
按照所述权重提取至少部分第一分词作为公共特征词。
优选地,所述从所述检测信息中提取公共特征词的步骤还包括:
采用所述一个或多个第一分词在预置的停用词库中进行匹配;
移除匹配成功的第一分词。
优选地,所述对每类故障谱模型进行修剪处理的步骤包括:
在所述故障谱模型中查找相同的子树;所述子树为一个或多个叶子节点的集合;
当查找到时,将相同的子树的父节点连接至其中一个子树;
在相同的子树中,剪去已连接的子树之外的其他的子树。
优选地,所述对每类故障谱模型进行修剪处理的步骤还包括:
按照预设的剪枝方式对所述故障谱模型进行修剪处理。
优选地,所述对每类故障谱模型进行修剪处理的步骤还包括:
从所述故障谱模型剪去逻辑关系不合法的叶子节点。
本申请实施例还公开了一种基于故障谱的检测方法,包括:
当接收到第二工单数据时,从所述第二工单数据中提取关键词;
查找所述第二工单数据所属类别对应的故障谱;
在所述故障谱中,根据所述关键词查找一个或多个检测路径;
依据所述一个或多个检测路径进行检测,获得检测结果。
优选地,所述故障谱中包括相连的根节点与叶子节点,所述根节点表征故障信息,所述叶子节点表征检测信息,至少部分叶子节点之间具有逻辑关系,所述叶子节点具有一个或多个父节点。
优选地,所述从所述第二工单数据中提取关键词的步骤包括:
对所述第二工单数据进行分词处理,获得一个或多个第二分词;
识别所述一个或多个第二分词的词性;
按照所述词性从所述一个或多个第二分词中提取关键词。
优选地,所述从所述第二工单数据中提取关键词的步骤还包括:
采用所述一个或多个第一分词在预置的停用词库中进行匹配;
移除匹配成功的第二分词。
优选地,所述在所述故障谱中,根据所述特征词查找一个或多个检测路径的步骤包括:
在所述故障谱中,查找与所述关键词匹配的根节点;
遍历与所述根节点相连的一个或多个叶子节点,获得一个或多个检测路径。
优选地,所述依据所述一个或多个检测路径进行检测,获得检测结果的步骤包括:
针对每个检测路径,获取所述检测路径中的一个或多个叶子节点表征的检测信息;
按照当前叶子节点表征的检测信息进行检测,获得候选检测结果;
查找逻辑关系与所述候选检测结果匹配的下一叶子节点,返回执行按照当前叶子节点表征的检测信息进行检测的步骤,直至执行至最终的叶子节点;
将最终的叶子节点的候选检测结果设置为检测结果。
优选地,所述故障谱通过以下方式生成:
获取一个或多个类别的第一工单数据;每个第一工单数据中包括故障信息与检测信息;
针对每类第一工单数据,从所述检测信息中提取公共特征词,作为特征向量;
针对每类第一工单数据,学习所述故障信息与所述特征向量之间的逻辑关系,获得每类故障谱模型;
对每类故障谱模型进行修剪处理,获得每类故障谱。
本申请实施例还公开了一种故障谱的生成装置,包括:
工单数据获取模块,用于获取一个或多个类别的第一工单数据;每个第一工单数据中包括故障信息与检测信息;
公共特征词提取模块,用于针对每类第一工单数据,从所述检测信息中提取公共特征词,作为特征向量;
故障谱模型学习模块,用于针对每类第一工单数据,学习所述故障信息与所述特征向量之间的逻辑关系,获得每类故障谱模型;
故障谱模型修剪模块,用于对每类故障谱模型进行修剪处理,获得每类故障谱。
优选地,所述故障谱中包括相连的根节点与叶子节点,所述根节点表征故障信息,所述叶子节点表征检测信息,至少部分叶子节点之间具有逻辑关系,所述叶子节点具有一个或多个父节点。
优选地,所述公共特征词提取模块包括:
第一分词处理模块,用于对所述检测信息进行分词处理,获得一个或多个第一分词;
词频统计模块,用于统计所述第一分词的词频;
权重计算模块,用于通过所述第一分词的词频计算所述第一分词的权重;
第一分词提取子模块,用于按照所述权重提取至少部分第一分词作为公 共特征词。
优选地,所述公共特征词提取模块还包括:
第一匹配子模块,用于采用所述一个或多个第一分词在预置的停用词库中进行匹配;
第一移除子模块,用于移除匹配成功的第一分词。
优选地,所述故障谱模型修剪模块包括:
子树查找子模块,用于在所述故障谱模型中查找相同的子树;所述子树为一个或多个叶子节点的集合;
连接子模块,用于在查找到时,将相同的子树的父节点连接至其中一个子树;
第一修剪子模块,用于在相同的子树中,剪去已连接的子树之外的其他的子树。
优选地,所述故障谱模型修剪模块还包括:
第二修剪子模块,用于按照预设的剪枝方式对所述故障谱模型进行修剪处理。
优选地,所述故障谱模型修剪模块还包括:
第三修剪子模块,用于从所述故障谱模型剪去逻辑关系不合法的叶子节点。
本申请实施例还公开了一种基于故障谱的检测装置,包括:
关键词提取模块,用于在接收到第二工单数据时,从所述第二工单数据中提取关键词;
故障谱查找模块,用于查找所述第二工单数据所属类别对应的故障谱;
检测路径查找模块,用于在所述故障谱中,根据所述关键词查找一个或多个检测路径;
检测模块,用于依据所述一个或多个检测路径进行检测,获得检测结果。
优选地,所述故障谱中包括相连的根节点与叶子节点,所述根节点表征故障信息,所述叶子节点表征检测信息,至少部分叶子节点之间具有逻辑关 系,所述叶子节点具有一个或多个父节点。
优选地,所述关键词提取模块包括:
第二分词处理子模块,用于对所述第二工单数据进行分词处理,获得一个或多个第二分词;
词性识别子模块,用于识别所述一个或多个第二分词的词性;
第二分词提取子模块,用于按照所述词性从所述一个或多个第二分词中提取关键词。
优选地,所述关键词提取模块还包括:
第二匹配子模块,用于采用所述一个或多个第一分词在预置的停用词库中进行匹配;
第二移除子模块,用于移除匹配成功的第二分词。
优选地,所述检测路径查找模块包括:
根节点匹配子模块,用于在所述故障谱中,查找与所述关键词匹配的根节点;
叶子节点遍历子模块,用于遍历与所述根节点相连的一个或多个叶子节点,获得一个或多个检测路径。
优选地,所述检测模块包括:
检测信息获取子模块,用于针对每个检测路径,获取所述检测路径中的一个或多个叶子节点表征的检测信息;
候选检测结果获取子模块,用于按照当前叶子节点表征的检测信息进行检测,获得候选检测结果;
叶子节点查找子模块,用于查找逻辑关系与所述候选检测结果匹配的下一叶子节点,返回调用候选检测结果获取子模块,直至执行至最终的叶子节点;
检测结果设置子模块,用于将最终的叶子节点的候选检测结果设置为检测结果。
优选地,所述故障谱通过调用以下模块生成:
工单数据获取模块,用于获取一个或多个类别的第一工单数据;每个第 一工单数据中包括故障信息与检测信息;
公共特征词提取模块,用于针对每类第一工单数据,从所述检测信息中提取公共特征词,作为特征向量;
故障谱模型学习模块,用于针对每类第一工单数据,学习所述故障信息与所述特征向量之间的逻辑关系,获得每类故障谱模型;
故障谱模型修剪模块,用于对每类故障谱模型进行修剪处理,获得每类故障谱。
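The embodiments of the present application include the following advantages:
By establishing a fault spectrum, the embodiments of the present application allow subsequent detection to run concurrently along one or more detection paths, which reduces detection time and improves detection efficiency. Detection with the fault spectrum is also simple to operate, which greatly reduces the frequency of manual participation and the effort required of the user. In addition, faults are handled with the knowledge points in a knowledge base formed from massive work order data, which greatly lowers the technical threshold and makes it easier for users or customer service staff with weak technical skills to solve problems on their own.
Brief Description of the Drawings
FIG. 1 is a flow chart of the steps of an embodiment of a fault spectrum generation method of the present application;
FIG. 2A and FIG. 2B are diagrams of an example of trimming a fault spectrum model of the present application;
FIG. 3A and FIG. 3B are diagrams of an example of trimming a fault spectrum model of the present application;
FIG. 4 is a flow chart of the steps of an embodiment of a fault spectrum based detection method of the present application;
FIG. 5 is a diagram of an example of a detection path of the present application;
FIG. 6A is a diagram of an example of existing detection;
FIG. 6B is a diagram of an example of detection of the present application;
FIG. 7 is a structural block diagram of an embodiment of a fault spectrum generating apparatus of the present application;
FIG. 8 is a structural block diagram of an embodiment of a fault spectrum based detecting apparatus of the present application.
Detailed Description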
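To make the above objects, features, and advantages of the present application clearer and easier to understand, the present application is described in further detail below with reference to the accompanying drawings and specific embodiments.
Referring to FIG. 1, a flow chart of the steps of an embodiment of a fault spectrum generation method of the present application is shown, which may specifically include the following steps:
Step 101: acquire first work order data of one or more categories;
In practical applications, massive historical first work order data can be stored; after the data is analyzed and summarized, typical first work order data is written up as knowledge points and saved in a knowledge base.
Generally, each piece of first work order data may include elements such as date, user ID, product, problem category, problem (fault information), solution (detection information), and communication records.
The fault information may be information recording the fault that occurred, and the detection information may be information recording how to detect and resolve that fault; the two correspond to each other.
For example, in a certain piece of work order data, the fault information is "DB (Database) access slow", and the detection information is "Please check for network congestion first".
Using the problem category, a sufficient number of pieces of first work order data belonging to the same category can be extracted as training samples.
Step 102: for each category of first work order data, extract common feature words from the detection information as a feature vector;
Common feature words are words shared by some of the first work order data in the category; they can be used to characterize the detection information and serve as parameters of the training samples.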
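In a preferred embodiment of the present application, step 102 may include the following sub-steps:
Sub-step S11: perform word segmentation on the detection information to obtain one or more first word segments;
In a specific implementation, word segmentation can be performed in the following ways:
1. Word segmentation based on string matching: the Chinese character string to be analyzed is matched against the entries of a preset machine dictionary according to a certain strategy; if a string is found in the dictionary, the match succeeds (a word is recognized).
2. Word segmentation based on feature scanning or mark segmentation: words with obvious features are first identified and cut out of the string to be analyzed; using these words as breakpoints, the original string can be divided into smaller strings that are then segmented mechanically. Alternatively, word segmentation is combined with part-of-speech tagging, so that rich part-of-speech information helps the segmentation decisions, and the tagging process in turn checks and adjusts the segmentation results.
3. Word segmentation based on understanding: the computer simulates human understanding of a sentence to recognize words. The basic idea is to perform syntactic and semantic analysis during segmentation, and to use syntactic and semantic information to resolve ambiguity.
4. Word segmentation based on statistics: the frequency of each combination of adjacent co-occurring characters in the corpus is counted, and their mutual information is calculated, together with the adjacent co-occurrence probability of two Chinese characters X and Y. Mutual information reflects how tightly Chinese characters are bound together. When the tightness exceeds a certain threshold, the character group can be considered likely to form a word.
Of course, the above word segmentation methods are only examples. When implementing the embodiments of the present application, other word segmentation methods may be set according to the actual situation or adopted by those skilled in the art according to actual needs; the embodiments of the present application do not limit this.
Sub-step S12: count the word frequency of the first word segments;
Sub-step S13: calculate the weight of the first word segments from their word frequency;
In practical applications, the weight of a first word segment can be calculated by TF-IDF (term frequency–inverse document frequency, a weighting technique commonly used in information retrieval and text mining).
Specifically, TF-IDF can be used to evaluate how important a word is to one document in a document set or corpus. The importance of a word increases in proportion to the number of times it appears in the document, but decreases in inverse proportion to its frequency in the corpus.
Sub-step S14: extract at least some of the first word segments as common feature words according to the weight.
If the weights of the first word segments are calculated by TF-IDF, the top N first word segments with the highest weights (N is a positive integer, such as 10) can be extracted as common feature words.
Overall, the fault information of each category and its common feature words are obtained.
For example, for the connection failure category, the extracted common feature words are as follows:
interception, error log, partial failure, …, whitelist;
error report, verification failure, …, password;
connection failure, access denied, …, port.
In another preferred embodiment of the present application, step 102 may further include the following sub-steps:
Sub-step S15: match the one or more first word segments against a preset stop word library;
Sub-step S16: remove the first word segments that are successfully matched.
The stop word library can store words that occur very frequently but carry little actual meaning, mainly adverbs, function words, modal particles, and the like, such as "是" ("is") and "而是" ("but rather").
In the embodiments of the present application, before sub-step S12, meaningless words can be filtered out of the first word segments using the stop words.
For example, the detection information "请您首先对网络拥塞检测吧" ("Please check for network congestion first") can be divided into first word segments such as "请" (please), "您" (you), "首先" (first), "对" (for), "网络拥塞" (network congestion), "检测" (check), and "吧" (a modal particle); using the stop word library, meaningless words such as "请", "您", "首先", "对", and "吧" can be removed.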
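Step 103: for each category of first work order data, learn the logical relationship between the fault information and the feature vector to obtain a fault spectrum model for each category;
In the embodiments of the present application, a trainer can be set in advance to learn the logical relationships among the data of each dimension (that is, the fault information and the feature vector), such as a support vector machine (SVM), a decision tree, or a random forest; the embodiments of the present application do not limit this.
A support vector machine uses a nonlinear mapping p to map the sample space into a high-dimensional or even infinite-dimensional feature space (a Hilbert space), so that a problem that is not linearly separable in the original sample space becomes linearly separable in the feature space.
A random forest is a forest built in a random manner and composed of many decision trees, with no correlation between the individual trees. Once the forest is obtained, when a new input sample arrives, each decision tree in the forest makes its own judgment about which class the sample should belong to (for classification algorithms), and the class chosen most often is predicted as the class of the sample.
A decision tree is a decision analysis method that, given the known probabilities of various situations, builds a decision tree to obtain the probability that the expected net present value is greater than or equal to zero, in order to evaluate project risk and judge feasibility; it is a graphical method that applies probability analysis intuitively.
When the trainer is trained, the fault spectrum model can be fitted; if the error (CP) is smaller than a preset error threshold (such as 0.001), fitting stops. The trained fault spectrum model, shown in FIG. 2A, is a tree structure that includes a root node and leaf nodes, with logical relationships between at least some of the leaf nodes.
As shown in FIG. 2A, values such as "2.5" and "3.1" are nodes (including the root node and leaf nodes) and represent feature vectors, while expressions such as "mmax<6100" and "syct>=360" represent logical relationships.
In addition, the root node represents fault information, and the leaf nodes represent detection information.
Specifically, as shown in FIG. 3A, "A: DB connection failed" is the root node and represents fault information; "E: local check" and "B: network down" are leaf nodes that represent detection information and have no logical relationship with the root node "A: DB connection failed".
"H: configuration check" and "C: repair network" are child nodes of "B: network down", that is, "B: network down" is the parent node of "H: configuration check" and "C: repair network", and there is a logical relationship between them (not shown in the figure).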
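Step 104: trim the fault spectrum model of each category to obtain a fault spectrum for each category.
In practical applications, the fault spectrum model can be trimmed according to actual needs to obtain the fault spectrum.
The fault spectrum may include a connected root node and leaf nodes; the root node may represent fault information, the leaf nodes may represent detection information, at least some of the leaf nodes may have logical relationships with one another, and a leaf node may have one or more parent nodes.
The trimmed fault spectrum can be stored in a fault spectrum warehouse (a database).
In a preferred embodiment of the present application, step 104 may include the following sub-steps:
Sub-step S21: search for identical subtrees in the fault spectrum model;
where a subtree may be a set of one or more leaf nodes;
Sub-step S22: when identical subtrees are found, connect the parent nodes of the identical subtrees to one of the subtrees;
Sub-step S23: among the identical subtrees, cut off the subtrees other than the connected one.
In the embodiments of the present application, some subtrees may be duplicated. Duplicated subtrees can therefore be checked for recursively; when one is found, the node is pointed at the other subtree and its own subtree is deleted, so that some leaf nodes have multiple parent nodes (indicating that one phenomenon may have several causes). This forms a tree-like structure, namely a directed acyclic graph (DAG, a directed graph in which it is impossible to start from a vertex and return to it by following edges).
A fault spectrum with this tree-like structure is not a tree such as a binary tree. In a binary tree, some subtrees may duplicate others, which makes the structure lengthy, with too many branches and unclear logical relationships. A DAG contains no duplicated subtrees, because whenever a duplicate occurs the parent node can delete its own subtree and point to the other identical subtree. Compared with tree structures such as binary trees, a DAG therefore has fewer levels and clearer logic.
It should be noted that "identical" (that is, duplicated) means that the leaf nodes are the same and the logical relationships between the leaf nodes are the same.
As shown in FIG. 3A, the subtree composed of the four leaf nodes "H: configuration check", "F: whitelist", "D: check password", and "J: check port" is duplicated. As shown in FIG. 3B, one of the copies can be deleted so that the parent nodes "B: network down" and "E: local check" point to the same subtree.
In another preferred embodiment of the present application, step 104 may further include the following sub-step:
Sub-step S24: trim the fault spectrum model according to a preset pruning manner.
In general, a trained fault spectrum model may have many levels, which may lead to many detection steps.
In the embodiments of the present application, the fault spectrum model can be trimmed by a preset pruning manner, such as the prune() function, to reduce the number of levels of the model within an acceptable range of detection error, thereby reducing the complexity of the model and the number of detection steps.
For example, the fault spectrum model shown in FIG. 2A has 6 levels; after pruning, the 4-level fault spectrum model shown in FIG. 2B is obtained.
In another preferred embodiment of the present application, step 104 may further include the following sub-step:
Sub-step S25: cut leaf nodes whose logical relationships are illegal out of the fault spectrum model.
In the embodiments of the present application, leaf nodes whose logical relationships are illegal can be cut off through manual confirmation or through checking against legal logical relationships, which improves accuracy.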
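Referring to FIG. 4, a flow chart of the steps of an embodiment of a fault spectrum based detection method of the present application is shown, which may specifically include the following steps:
Step 401: when second work order data is received, extract keywords from the second work order data;
The embodiments of the present application can be applied in a virtual customer service system, which can use the streaming real-time processing framework Storm to ensure that detection completes with very low latency.
Of course, besides Storm, streaming real-time processing frameworks such as S4 (Simple Scalable Streaming System), MillWheel, and Kinesis can also be used; the embodiments of the present application do not limit this.
In a specific implementation, the user can submit second work order data to the virtual customer service system through a browser, a standalone application, or the like, and the virtual customer service system can remove irrelevant interfering information to clean the second work order data.
Typically, the second work order data is allowed to contain one piece of fault information and to address one problem; if the user raises one problem, information unrelated to that problem can be regarded as interfering information.
Furthermore, in some cases the second work order data can be classified by product, and information unrelated to the current product can likewise be regarded as interfering information.
For example, in a certain piece of second work order data, the user asks: "Why can't the UDF function in my SQL be executed?" The virtual customer service system answers: "Because UDF permission is not currently open to external users, your UDF cannot be executed." The user then asks: "Understood. Another question: how do I download my logs?"
In this example, the user's second question is unrelated to the first and is interfering information.
For the second work order data from which interfering information has been filtered, keywords can be extracted; a keyword can be information that reflects the characteristics of the second work order data (that is, the fault information).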
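In a preferred embodiment of the present application, step 401 may include the following sub-steps:
Sub-step S31: perform word segmentation on the second work order data to obtain one or more second word segments;
In a specific implementation, word segmentation can be performed in the following ways:
1. Word segmentation based on string matching.
2. Word segmentation based on feature scanning or mark segmentation.
3. Word segmentation based on understanding.
4. Word segmentation based on statistics.
Of course, the above word segmentation methods are only examples. When implementing the embodiments of the present application, other word segmentation methods may be set according to the actual situation or adopted by those skilled in the art according to actual needs; the embodiments of the present application do not limit this.
Sub-step S32: identify the part of speech of the one or more second word segments;
Sub-step S33: extract keywords from the one or more second word segments according to the part of speech.
In the embodiments of the present application, part-of-speech analysis can be performed on the second word segments to obtain the part of speech of each, such as noun, verb, adjective, adverb, preposition, conjunction, or particle.
Keywords can be composed of nouns and verbs: nouns can be used to determine the target object, and verbs can be used to infer the main semantics.
For example, in a certain piece of second work order data, user A asks "Is there a UDF function for the maximum value?". In this example, the verb is "is there" and the noun is "UDF function", so user A's question (that is, the keyword) is whether "there is a UDF function".
As another example, in a certain piece of second work order data, user B asks "How do I debug a UDF function?". In this example, the verb is "debug" and the noun is "UDF function", so user B's question (that is, the keyword) asks for the method of "debugging a UDF function".
In a preferred embodiment of the present application, step 401 may further include the following sub-steps:
Sub-step S34: match the one or more first word segments against a preset stop word library;
Sub-step S35: remove the second word segments that are successfully matched.
In the embodiments of the present application, before sub-step S32, meaningless words can be filtered out of the second word segments using the stop words.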
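Step 402: search for the fault spectrum corresponding to the category to which the second work order data belongs;
In the embodiments of the present application, fault spectra can be trained in advance and stored in the fault spectrum warehouse (a database).
In practical applications, the fault spectrum corresponding to the category of the second work order data can be found by text similarity, that is, by searching the fault spectrum warehouse for a fault spectrum similar to the keywords in the second work order data.
For example, if the keywords are "DB", "query", "wait", "slow", and so on, the fault spectrum of the category "DB access slow" can be matched by text similarity.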
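The text-similarity lookup of step 402 could be approximated with a simple set-overlap measure. The sketch below uses Jaccard similarity, which is an assumption: the source does not name a specific similarity measure. The warehouse contents and token lists are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two token collections (0.0 when both are empty)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


# Hypothetical fault spectrum warehouse: category name -> tokens describing the category.
WAREHOUSE = {
    "DB access slow": ["db", "access", "slow"],
    "DB connection failed": ["db", "connection", "failed"],
}


def find_spectrum(keywords, warehouse=WAREHOUSE):
    """Return the category whose tokens are most similar to the keywords."""
    normalized = [k.lower() for k in keywords]
    return max(warehouse, key=lambda name: jaccard(normalized, warehouse[name]))


category = find_spectrum(["DB", "query", "wait", "slow"])
```

With these sample tokens, "DB access slow" shares two keywords with the query while "DB connection failed" shares only one, so the first category is selected.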
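In the embodiments of the present application, the fault spectrum may include a connected root node and leaf nodes; the root node may represent fault information, the leaf nodes may represent detection information, at least some of the leaf nodes may have logical relationships with one another, and a leaf node may have one or more parent nodes.
Step 403: in the fault spectrum, search for one or more detection paths according to the keywords;
In a specific implementation, a detection path can record information about the detection methods and the detection order.
In a preferred embodiment of the present application, step 403 may include the following sub-steps:
Sub-step S41: in the fault spectrum, search for the root node that matches the keywords;
Sub-step S42: traverse the one or more leaf nodes connected to the root node to obtain one or more detection paths.
In the embodiments of the present application, the fault spectrum is a tree-like structure (a directed acyclic graph, DAG), so detection paths can be retrieved by matching nodes from top to bottom.
The root node can be located in the fault spectrum according to the keywords, and all the leaf nodes passed through when traversing downward from that root node make up a detection path.
For example, the fault spectrum shown in FIG. 3B contains two detection paths, namely "A→E→H→F/D/J" and "A→B→C/H→F/D/J".
Because the root node generally has no logical relationship with the leaf nodes of the next layer, the number of leaf nodes in that next layer is generally equal to the number of detection paths.
It should be noted that "connected" may mean that the root node and a child node are directly connected, or that they are indirectly connected.
For example, as shown in FIG. 3B, the leaf node "E: local check" is directly connected to the root node "A: DB connection failed", and the leaf node "H: configuration check" is indirectly connected to the root node "A: DB connection failed".
Step 404: perform detection according to the one or more detection paths to obtain a detection result.
In the embodiments of the present application, concurrent detection along the one or more detection paths is supported, which reduces detection time and improves detection efficiency.
In a preferred embodiment of the present application, step 404 may include the following sub-steps: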
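Sub-step S51: for each detection path, acquire the detection information represented by one or more leaf nodes in the detection path;
Sub-step S52: perform detection according to the detection information represented by the current leaf node, to obtain a candidate detection result;
Sub-step S53: search for the next leaf node whose logical relationship matches the candidate detection result, and return to sub-step S51 until the final leaf node has been executed;
Sub-step S54: set the candidate detection result of the final leaf node as the detection result.
In the embodiments of the present application, a logical relationship together with the detection information represented by a leaf node may be called a rule: when a certain condition (the logical relationship) is met, a certain operation (the detection information) is performed.
For example, when DB access is slow, the bandwidth and traffic of the network are checked; this is a logical relationship.
Rules are predefined in a rule engine, such as JBoss Rules (a business rule engine). Once the condition is triggered, the rule engine executes the rule, for example by running the network status detection command ifstat.
In a specific implementation, detection proceeds layer by layer along the detection path according to the logical relationships until the final leaf node is reached, so child nodes that do not satisfy the logical relationships need not be executed. The final leaf node may refer to a leaf node that has no leaf node in the next layer; it is not necessarily the lowest leaf node in the detection path.
For example, in the detection path shown in FIG. 5, the root node "A: DB access slow" represents the fault information, and the factors involved are the network, the DB load, and SQL (Structured Query Language).
First, the child node "B: network congestion check" is used to determine whether the network has a problem. If it does (that is, "Y"), the other factors can hardly matter (so "C: DB load check" need not be executed), and the problem is handled manually according to the child node "H: contact network engineer". If the network has no problem (that is, "N"), the child node "C: DB load check" is used to determine whether the DB load is high. If the DB load is high, access will appear slow externally even if the SQL itself has no problem (so "D: slow SQL check" need not be executed), and the child node "J: SQL thread check" is used to confirm whether the SQL threads are running normally. If the DB load is low, detection proceeds according to the child node "D: slow SQL check", namely the checks of the leaf nodes "K: index", "M: execution plan", and "N: lock".
If execution reaches the child nodes "H: contact network engineer", "J: SQL thread check", "K: index", "M: execution plan", or "N: lock", detection can be terminated and the detection result obtained.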
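The rule-driven walk of sub-steps S51 to S54 over the FIG. 5 example can be sketched as follows. The rule table mirrors the node letters and the Y/N and high/low branches described above. The probe functions are hypothetical stand-ins for real checks (such as a rule engine running a network command), and the outcome label "checked" for node D is an assumption introduced only to fan out to K, M, and N.

```python
from collections import deque

# node -> {candidate detection result: next leaf nodes}; nodes absent here are final.
RULES = {
    "B": {"Y": ["H"], "N": ["C"]},        # network congestion check
    "C": {"high": ["J"], "low": ["D"]},   # DB load check
    "D": {"checked": ["K", "M", "N"]},    # slow SQL check: index, execution plan, lock
}


def run_detection(first_leaf, probes, rules=RULES):
    executed, results = [], {}
    pending = deque([first_leaf])
    while pending:
        node = pending.popleft()
        outcome = probes[node]()          # S52: run the check of the current leaf node
        executed.append(node)
        if node not in rules:             # final leaf node
            results[node] = outcome       # S54: its candidate result is a detection result
            continue
        # S53: follow only the leaf nodes whose logical relationship matches the outcome.
        pending.extend(rules[node].get(outcome, []))
    return executed, results


# Simulated scenario: network fine, DB load low, index problem found.
probes = {
    "B": lambda: "N",
    "C": lambda: "low",
    "D": lambda: "checked",
    "K": lambda: "index missing",
    "M": lambda: "ok",
    "N": lambda: "ok",
}
executed, results = run_detection("B", probes)
```

In this scenario nodes H and J are never executed, which is the saving the logical relationships are meant to provide.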
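According to certain statistics, 70% of work order data concerns basic, common problems that are highly repetitive. Handling them with the existing work order system takes up more than 50% of customer service resources and involves a great deal of repeated labor at low efficiency. One purpose of the embodiments of the present application is to solve this 70% of common problems automatically.
In the existing work order system, users generally guide troubleshooting by their own experience, that is, the guidance direction is "person → problem". The operation is complicated, consumes much of the user's energy, and has a high technical threshold.
The embodiments of the present application let the system guide the troubleshooting process: the fault spectrum generated from massive work order data guides the user in checking the problem. This is reverse guidance, that is, the guidance direction is "system → problem → person". The operation is simple, the frequency of manual participation is greatly reduced, and the user's energy consumption is lowered. At the same time, handling problems with the knowledge points in the knowledge base formed from massive work order data greatly lowers the technical threshold, so that users or customer service staff with weaker technical skills can solve problems on their own.
As shown in FIG. 6A, the existing work order system checks serially, that is, without using any inherent logical relationship: all the nodes in FIG. 6A must be checked one by one in no particular order, for example A→B→C→D→E, until the problem is found. The time complexity is therefore O(N), where N is the number of nodes; that is, all the nodes have to be checked.
In the embodiments of the present application, as shown in FIG. 6B, when troubleshooting is performed concurrently through the fault spectrum (for example, B and C are executed concurrently), the nodes have logical relationships, so it is generally unnecessary to check all the nodes one by one, which saves many steps. For example, if the check at C leads to D, E need not be executed; conversely, if the check at C leads to E, D need not be executed. Given the structure of the fault spectrum, the branching structure (halving the candidates at each step) saves many troubleshooting steps, and the troubleshooting complexity can be as low as O(log2N), where N is the number of nodes.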
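The gap between O(N) and O(log2N) can be made concrete with a small calculation. The sketch assumes the most favorable case for the fault spectrum, a full and balanced binary branching structure in which each check rules out half of the remaining nodes. Real fault spectra are DAGs of arbitrary shape, so this is only the lower bound the text refers to.

```python
import math


def serial_checks(n_nodes):
    """Worst case for serial troubleshooting: every node is checked."""
    return n_nodes


def branching_checks(n_nodes):
    """Checks along one root-to-leaf route in a full, balanced binary structure."""
    return math.ceil(math.log2(n_nodes + 1))


comparison = {n: (serial_checks(n), branching_checks(n)) for n in (7, 127, 1023)}
```

For 1023 nodes, serial troubleshooting may need 1023 checks, while a balanced branching structure needs 10 along any single route.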
在本申请的一种优选实施例中,所述故障谱可以通过以下方式生成:
子步骤S61,获取一个或多个类别的第一工单数据;每个第一工单数据中包括故障信息与检测信息;
子步骤S62,针对每类第一工单数据,从所述检测信息中提取公共特征词,作为特征向量;
子步骤S63,针对每类第一工单数据,学习所述故障信息与所述特征向量之间的逻辑关系,获得每类故障谱模型;
子步骤S64,对每类故障谱模型进行修剪处理,获得每类故障谱。
在实际应用中,所述故障谱中可以包括相连的根节点与叶子节点,所述根节点可以表征故障信息,所述叶子节点可以表征检测信息,至少部分叶子节点之间可以具有逻辑关系,所述叶子节点可以具有一个或多个父节点。
在本申请的一种优选实施例中,子步骤S62可以包括如下子步骤:
子步骤S621,对所述检测信息进行分词处理,获得一个或多个第一分词;
子步骤S622,统计所述第一分词的词频;
子步骤S623,通过所述第一分词的词频计算所述第一分词的权重;
子步骤S624,按照所述权重提取至少部分第一分词作为公共特征词。
在本申请的另一种优选实施例中,子步骤S62还可以包括如下子步骤:
子步骤S625,采用所述一个或多个第一分词在预置的停用词库中进行匹配;
子步骤S626,移除匹配成功的第一分词。
在本申请的一种优选实施例中,子步骤S64可以包括如下子步骤:
子步骤S641,在所述故障谱模型中查找相同的子树;所述子树为一个或多个叶子节点的集合;
子步骤S642,当查找到时,将相同的子树的父节点连接至其中一个子树;
子步骤S643,在相同的子树中,剪去已连接的子树之外的其他的子树。
在本申请的另一种优选实施例中,子步骤S64还可以包括如下子步骤:
子步骤S644,按照预设的剪枝方式对所述故障谱模型进行修剪处理。
在本申请的另一种优选实施例中,子步骤S64还可以包括如下子步骤:
子步骤S645,从所述故障谱模型剪去逻辑关系不合法的叶子节点。
需要说明的是,对于方法实施例,为了简单描述,故将其都表述为一系列的动作组合,但是本领域技术人员应该知悉,本申请实施例并不受所描述的动作顺序的限制,因为依据本申请实施例,某些步骤可以采用其他顺序或者同时进行。其次,本领域技术人员也应该知悉,说明书中所描述的实施例均属于优选实施例,所涉及的动作并不一定是本申请实施例所必须的。
参照图7,示出了本申请的一种故障谱的生成装置实施例的结构框图,具体可以包括如下模块:
工单数据获取模块701,用于获取一个或多个类别的第一工单数据;每个第一工单数据中包括故障信息与检测信息;
公共特征词提取模块702,用于针对每类第一工单数据,从所述检测信息中提取公共特征词,作为特征向量;
故障谱模型学习模块703,用于针对每类第一工单数据,学习所述故障信息与所述特征向量之间的逻辑关系,获得每类故障谱模型;
故障谱模型修剪模块704,用于对每类故障谱模型进行修剪处理,获得每类故障谱。
在具体实现中,所述故障谱中可以包括相连的根节点与叶子节点,所述根节点可以表征故障信息,所述叶子节点可以表征检测信息,至少部分叶子节点之间可以具有逻辑关系,所述叶子节点可以具有一个或多个父节点。
在本申请的一种优选实施例中,所述公共特征词提取模块702可以包括如下子模块:
第一分词处理模块,用于对所述检测信息进行分词处理,获得一个或多个第一分词;
词频统计模块,用于统计所述第一分词的词频;
权重计算模块,用于通过所述第一分词的词频计算所述第一分词的权重;
第一分词提取子模块,用于按照所述权重提取至少部分第一分词作为公共特征词。
在本申请的一种优选实施例中,所述公共特征词提取模块702还可以包括如下子模块:
匹配子模块,用于采用所述一个或多个第一分词在预置的停用词库中进行匹配;
移除子模块,用于移除匹配成功的第一分词。
在本申请的一种优选实施例中,所述故障谱模型修剪模块704可以包括如下子模块:
子树查找子模块,用于在所述故障谱模型中查找相同的子树;所述子树为一个或多个叶子节点的集合;
连接子模块,用于在查找到时,将相同的子树的父节点连接至其中一个子树;
第一修剪子模块,用于在相同的子树中,剪去已连接的子树之外的其他的子树。
在本申请的一种优选实施例中,所述故障谱模型修剪模块704还可以包括如下子模块:
第二修剪子模块,用于按照预设的剪枝方式对所述故障谱模型进行修剪处理。
在本申请的一种优选实施例中,所述故障谱模型修剪模块704还可以包括如下子模块:
第三修剪子模块,用于从所述故障谱模型剪去逻辑关系不合法的叶子节点。
参照图8,示出了本申请的一种基于故障谱的检测装置实施例的结构框 图,具体可以包括如下模块:
关键词提取模块801,用于在接收到第二工单数据时,从所述第二工单数据中提取关键词;
故障谱查找模块802,用于查找所述第二工单数据所属类别对应的故障谱;
检测路径查找模块803,用于在所述故障谱中,根据所述关键词查找一个或多个检测路径;
检测模块804,用于依据所述一个或多个检测路径进行检测,获得检测结果。
在具体实现中,所述故障谱中可以包括相连的根节点与叶子节点,所述根节点可以表征故障信息,所述叶子节点可以表征检测信息,至少部分叶子节点之间可以具有逻辑关系,所述叶子节点可以具有一个或多个父节点。
在本申请的一种优选实施例中,所述关键词提取模块801可以包括如下子模块:
第二分词处理子模块,用于对所述第二工单数据进行分词处理,获得一个或多个第二分词;
词性识别子模块,用于识别所述一个或多个第二分词的词性;
第二分词提取子模块,用于按照所述词性从所述一个或多个第二分词中提取关键词。
在本申请的一种优选实施例中,所述关键词提取模块801还可以包括如下子模块:
第二匹配子模块,用于采用所述一个或多个第一分词在预置的停用词库中进行匹配;
第二移除子模块,用于移除匹配成功的第二分词。
在本申请的一种优选实施例中,所述检测路径查找模块803可以包括如下子模块:
根节点匹配子模块,用于在所述故障谱中,查找与所述关键词匹配的根节点;
叶子节点遍历子模块,用于遍历与所述根节点相连的一个或多个叶子节点,获得一个或多个检测路径。
在本申请的一种优选实施例中,所述检测模块804可以包括如下子模块:
检测信息获取子模块,用于针对每个检测路径,获取所述检测路径中的一个或多个叶子节点表征的检测信息;
候选检测结果获取子模块,用于按照当前叶子节点表征的检测信息进行检测,获得候选检测结果;
叶子节点查找子模块,用于查找逻辑关系与所述候选检测结果匹配的下一叶子节点,返回调用候选检测结果获取子模块,直至执行至最终的叶子节点;
检测结果设置子模块,用于将最终的叶子节点的候选检测结果设置为检测结果。
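In a preferred embodiment of the present application, the fault spectrum may be generated by invoking the following modules:
a work order data acquisition module, configured to acquire first work order data of one or more categories, where each piece of first work order data includes fault information and detection information;
a common feature word extraction module, configured to, for each category of first work order data, extract common feature words from the detection information as a feature vector;
a fault spectrum model learning module, configured to, for each category of first work order data, learn the logical relationship between the fault information and the feature vector to obtain a fault spectrum model for each category;
a fault spectrum model pruning module, configured to prune the fault spectrum model of each category to obtain a fault spectrum for each category.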
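In a preferred embodiment of the present application, the common feature word extraction module may include the following sub-modules:
a first word segmentation module, configured to perform word segmentation on the detection information to obtain one or more first segmented words;
a word frequency statistics module, configured to count the word frequency of the first segmented words;
a weight calculation module, configured to calculate the weight of each first segmented word from its word frequency;
a first segmented word extraction sub-module, configured to extract at least some of the first segmented words as common feature words according to the weights.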
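In a preferred embodiment of the present application, the common feature word extraction module may further include the following sub-modules:
a matching sub-module, configured to match the one or more first segmented words against a preset stop-word lexicon;
a removal sub-module, configured to remove the first segmented words that are matched successfully.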
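In a preferred embodiment of the present application, the fault spectrum model pruning module may include the following sub-modules:
a subtree search sub-module, configured to search the fault spectrum model for identical subtrees, where a subtree is a set of one or more leaf nodes;
a connection sub-module, configured to, when identical subtrees are found, connect the parent nodes of the identical subtrees to one of those subtrees;
a first pruning sub-module, configured to prune, among the identical subtrees, all subtrees other than the connected one.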
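In a preferred embodiment of the present application, the fault spectrum model pruning module may further include the following sub-module:
a second pruning sub-module, configured to prune the fault spectrum model according to a preset pruning method.
In a preferred embodiment of the present application, the fault spectrum model pruning module may further include the following sub-module:
a third pruning sub-module, configured to prune from the fault spectrum model the leaf nodes whose logical relationships are invalid.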
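Because the apparatus embodiments are substantially similar to the method embodiments, they are described relatively briefly; for related details, refer to the corresponding parts of the method embodiments.
The embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and for the parts that are the same or similar, the embodiments may be referred to one another.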
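Those skilled in the art should understand that the embodiments of the present application may be provided as a method, an apparatus, or a computer program product. Accordingly, the embodiments of the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the embodiments of the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, optical storage, and the like) that contain computer-usable program code.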
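In a typical configuration, the computer device includes one or more processors (CPUs), an input/output interface, a network interface, and memory. The memory may include non-permanent storage in a computer-readable medium, in the form of random access memory (RAM) and/or non-volatile memory such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium. Computer-readable media include permanent and non-permanent, removable and non-removable media, which may store information by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible to a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.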
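The embodiments of the present application are described with reference to flowcharts and/or block diagrams of methods, terminal devices (systems), and computer program products according to the embodiments of the present application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, may be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing terminal device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing terminal device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or another programmable data processing terminal device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or another programmable data processing terminal device, such that a series of operational steps is performed on the computer or other programmable terminal device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable terminal device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.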
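Although preferred embodiments of the present application have been described, those skilled in the art may make further changes and modifications to these embodiments once they learn of the basic inventive concept. The appended claims are therefore intended to be construed as covering the preferred embodiments and all changes and modifications that fall within the scope of the embodiments of the present application.
Finally, it should also be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "include", "comprise", and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or terminal device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or terminal device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or terminal device that includes the element.
The fault spectrum generation method, the fault spectrum based detection method, the fault spectrum generation apparatus, and the fault spectrum based detection apparatus provided by the present application have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present application, and the description of the above embodiments is intended only to help in understanding the method of the present application and its core idea. A person of ordinary skill in the art may, following the idea of the present application, make changes to the specific implementations and the scope of application. In summary, the content of this specification should not be construed as limiting the present application.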

Claims (28)

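  1. A fault spectrum generation method, characterized by comprising:
    acquiring first work order data of one or more categories, where each piece of first work order data includes fault information and detection information;
    for each category of first work order data, extracting common feature words from the detection information as a feature vector;
    for each category of first work order data, learning the logical relationship between the fault information and the feature vector to obtain a fault spectrum model for each category;
    pruning the fault spectrum model of each category to obtain a fault spectrum for each category.
  2. The method according to claim 1, characterized in that the fault spectrum includes a root node and leaf nodes connected to it, the root node represents fault information, the leaf nodes represent detection information, at least some of the leaf nodes have logical relationships with one another, and a leaf node has one or more parent nodes.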
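  3. The method according to claim 1 or 2, characterized in that the step of extracting common feature words from the detection information comprises:
    performing word segmentation on the detection information to obtain one or more first segmented words;
    counting the word frequency of the first segmented words;
    calculating the weight of each first segmented word from its word frequency;
    extracting at least some of the first segmented words as common feature words according to the weights.
  4. The method according to claim 3, characterized in that the step of extracting common feature words from the detection information further comprises:
    matching the one or more first segmented words against a preset stop-word lexicon;
    removing the first segmented words that are matched successfully.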
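  5. The method according to claim 1, 2, or 4, characterized in that the step of pruning the fault spectrum model of each category comprises:
    searching the fault spectrum model for identical subtrees, where a subtree is a set of one or more leaf nodes;
    when identical subtrees are found, connecting the parent nodes of the identical subtrees to one of those subtrees;
    pruning, among the identical subtrees, all subtrees other than the connected one.
  6. The method according to claim 5, characterized in that the step of pruning the fault spectrum model of each category further comprises:
    pruning the fault spectrum model according to a preset pruning method.
  7. The method according to claim 5, characterized in that the step of pruning the fault spectrum model of each category further comprises:
    pruning from the fault spectrum model the leaf nodes whose logical relationships are invalid.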
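  8. A fault spectrum based detection method, characterized by comprising:
    when second work order data is received, extracting keywords from the second work order data;
    looking up the fault spectrum corresponding to the category to which the second work order data belongs;
    searching the fault spectrum for one or more detection paths according to the keywords;
    performing detection according to the one or more detection paths to obtain a detection result.
  9. The method according to claim 8, characterized in that the fault spectrum includes a root node and leaf nodes connected to it, the root node represents fault information, the leaf nodes represent detection information, at least some of the leaf nodes have logical relationships with one another, and a leaf node has one or more parent nodes.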
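  10. The method according to claim 8, characterized in that the step of extracting keywords from the second work order data comprises:
    performing word segmentation on the second work order data to obtain one or more second segmented words;
    identifying the part of speech of the one or more second segmented words;
    extracting keywords from the one or more second segmented words according to the part of speech.
  11. The method according to claim 10, characterized in that the step of extracting keywords from the second work order data further comprises:
    matching the one or more second segmented words against a preset stop-word lexicon;
    removing the second segmented words that are matched successfully.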
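  12. The method according to claim 9, characterized in that the step of searching the fault spectrum for one or more detection paths according to the keywords comprises:
    searching the fault spectrum for a root node that matches the keywords;
    traversing the one or more leaf nodes connected to the root node to obtain one or more detection paths.
  13. The method according to claim 12, characterized in that the step of performing detection according to the one or more detection paths to obtain a detection result comprises:
    for each detection path, acquiring the detection information represented by the one or more leaf nodes in the detection path;
    performing detection according to the detection information represented by the current leaf node to obtain a candidate detection result;
    finding the next leaf node whose logical relationship matches the candidate detection result, and returning to the step of performing detection according to the detection information represented by the current leaf node, until the final leaf node has been executed;
    setting the candidate detection result of the final leaf node as the detection result.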
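  14. The method according to claim 8, 9, 10, 11, 12, or 13, characterized in that the fault spectrum is generated by:
    acquiring first work order data of one or more categories, where each piece of first work order data includes fault information and detection information;
    for each category of first work order data, extracting common feature words from the detection information as a feature vector;
    for each category of first work order data, learning the logical relationship between the fault information and the feature vector to obtain a fault spectrum model for each category;
    pruning the fault spectrum model of each category to obtain a fault spectrum for each category.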
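  15. A fault spectrum generation apparatus, characterized by comprising:
    a work order data acquisition module, configured to acquire first work order data of one or more categories, where each piece of first work order data includes fault information and detection information;
    a common feature word extraction module, configured to, for each category of first work order data, extract common feature words from the detection information as a feature vector;
    a fault spectrum model learning module, configured to, for each category of first work order data, learn the logical relationship between the fault information and the feature vector to obtain a fault spectrum model for each category;
    a fault spectrum model pruning module, configured to prune the fault spectrum model of each category to obtain a fault spectrum for each category.
  16. The apparatus according to claim 15, characterized in that the fault spectrum includes a root node and leaf nodes connected to it, the root node represents fault information, the leaf nodes represent detection information, at least some of the leaf nodes have logical relationships with one another, and a leaf node has one or more parent nodes.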
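  17. The apparatus according to claim 15 or 16, characterized in that the common feature word extraction module comprises:
    a first word segmentation module, configured to perform word segmentation on the detection information to obtain one or more first segmented words;
    a word frequency statistics module, configured to count the word frequency of the first segmented words;
    a weight calculation module, configured to calculate the weight of each first segmented word from its word frequency;
    a first segmented word extraction sub-module, configured to extract at least some of the first segmented words as common feature words according to the weights.
  18. The apparatus according to claim 17, characterized in that the common feature word extraction module further comprises:
    a first matching sub-module, configured to match the one or more first segmented words against a preset stop-word lexicon;
    a first removal sub-module, configured to remove the first segmented words that are matched successfully.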
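  19. The apparatus according to claim 15, 16, or 18, characterized in that the fault spectrum model pruning module comprises:
    a subtree search sub-module, configured to search the fault spectrum model for identical subtrees, where a subtree is a set of one or more leaf nodes;
    a connection sub-module, configured to, when identical subtrees are found, connect the parent nodes of the identical subtrees to one of those subtrees;
    a first pruning sub-module, configured to prune, among the identical subtrees, all subtrees other than the connected one.
  20. The apparatus according to claim 19, characterized in that the fault spectrum model pruning module further comprises:
    a second pruning sub-module, configured to prune the fault spectrum model according to a preset pruning method.
  21. The apparatus according to claim 19, characterized in that the fault spectrum model pruning module further comprises:
    a third pruning sub-module, configured to prune from the fault spectrum model the leaf nodes whose logical relationships are invalid.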
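  22. A fault spectrum based detection apparatus, characterized by comprising:
    a keyword extraction module, configured to extract keywords from second work order data when the second work order data is received;
    a fault spectrum search module, configured to look up the fault spectrum corresponding to the category to which the second work order data belongs;
    a detection path search module, configured to search the fault spectrum for one or more detection paths according to the keywords;
    a detection module, configured to perform detection according to the one or more detection paths to obtain a detection result.
  23. The apparatus according to claim 22, characterized in that the fault spectrum includes a root node and leaf nodes connected to it, the root node represents fault information, the leaf nodes represent detection information, at least some of the leaf nodes have logical relationships with one another, and a leaf node has one or more parent nodes.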
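  24. The apparatus according to claim 22, characterized in that the keyword extraction module comprises:
    a second word segmentation sub-module, configured to perform word segmentation on the second work order data to obtain one or more second segmented words;
    a part-of-speech recognition sub-module, configured to identify the part of speech of the one or more second segmented words;
    a second segmented word extraction sub-module, configured to extract keywords from the one or more second segmented words according to the part of speech.
  25. The apparatus according to claim 24, characterized in that the keyword extraction module further comprises:
    a second matching sub-module, configured to match the one or more second segmented words against a preset stop-word lexicon;
    a second removal sub-module, configured to remove the second segmented words that are matched successfully.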
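  26. The apparatus according to claim 23, characterized in that the detection path search module comprises:
    a root node matching sub-module, configured to search the fault spectrum for a root node that matches the keywords;
    a leaf node traversal sub-module, configured to traverse the one or more leaf nodes connected to the root node to obtain one or more detection paths.
  27. The apparatus according to claim 25, characterized in that the detection module comprises:
    a detection information acquisition sub-module, configured to, for each detection path, acquire the detection information represented by the one or more leaf nodes in the detection path;
    a candidate detection result acquisition sub-module, configured to perform detection according to the detection information represented by the current leaf node to obtain a candidate detection result;
    a leaf node search sub-module, configured to find the next leaf node whose logical relationship matches the candidate detection result and to invoke the candidate detection result acquisition sub-module again, until the final leaf node has been executed;
    a detection result setting sub-module, configured to set the candidate detection result of the final leaf node as the detection result.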
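  28. The apparatus according to claim 22, 23, 24, 25, 26, or 27, characterized in that the fault spectrum is generated by invoking the following modules:
    a work order data acquisition module, configured to acquire first work order data of one or more categories, where each piece of first work order data includes fault information and detection information;
    a common feature word extraction module, configured to, for each category of first work order data, extract common feature words from the detection information as a feature vector;
    a fault spectrum model learning module, configured to, for each category of first work order data, learn the logical relationship between the fault information and the feature vector to obtain a fault spectrum model for each category;
    a fault spectrum model pruning module, configured to prune the fault spectrum model of each category to obtain a fault spectrum for each category.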
PCT/CN2016/080015 2015-05-25 2016-04-22 一种故障谱的生成、基于故障谱的检测方法和装置 WO2016188279A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510272657.8A CN106294038B (zh) 2015-05-25 2015-05-25 一种故障谱的生成、基于故障谱的检测方法和装置
CN201510272657.8 2015-05-25

Publications (1)

Publication Number Publication Date
WO2016188279A1 (zh) 2016-12-01

Family

ID=57393592

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/080015 WO2016188279A1 (zh) 2015-05-25 2016-04-22 一种故障谱的生成、基于故障谱的检测方法和装置

Country Status (2)

Country Link
CN (1) CN106294038B (zh)
WO (1) WO2016188279A1 (zh)

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109582670A (zh) * 2018-10-31 2019-04-05 深圳市元征科技股份有限公司 一种车辆维修方案的推荐方法及相关设备
CN109785919A (zh) * 2018-11-30 2019-05-21 平安科技(深圳)有限公司 名词匹配方法、装置、设备及计算机可读存储介质
CN110147387A (zh) * 2019-05-08 2019-08-20 腾讯科技(上海)有限公司 一种根因分析方法、装置、设备及存储介质
CN110222182A (zh) * 2019-06-06 2019-09-10 腾讯科技(深圳)有限公司 一种语句分类方法及相关设备
CN111191529A (zh) * 2019-12-17 2020-05-22 中移(杭州)信息技术有限公司 一种处理异常工单的方法及系统
CN111191937A (zh) * 2019-12-31 2020-05-22 深圳市计通智能技术有限公司 一种告警危害评估方法、系统及终端设备
CN111259149A (zh) * 2020-01-19 2020-06-09 清华大学 化学品事故分类方法、装置、计算机设备和存储介质
CN112256830A (zh) * 2020-10-21 2021-01-22 北京工业大数据创新中心有限公司 一种设备排查信息获取方法、装置和设备故障排查系统
CN112445893A (zh) * 2019-09-05 2021-03-05 北京国双科技有限公司 一种信息搜索方法、装置、设备及存储介质
CN112817948A (zh) * 2019-11-15 2021-05-18 北京三快在线科技有限公司 数据检测的方法、装置、可读存储介质以及电子设备
CN113589191A (zh) * 2021-07-07 2021-11-02 江苏毅星新能源科技有限公司 一种电源故障诊断系统及方法
CN115221892A (zh) * 2022-07-12 2022-10-21 中国电信股份有限公司 工单数据处理方法及装置、存储介质及电子设备
CN116366377A (zh) * 2023-06-02 2023-06-30 深信服科技股份有限公司 恶意文件检测方法、装置、设备及存储介质
CN116542634A (zh) * 2023-06-21 2023-08-04 中国电信股份有限公司 工单处理方法、装置和计算机可读存储介质
CN117891411A (zh) * 2024-03-14 2024-04-16 济宁蜗牛软件科技有限公司 一种海量档案数据优化存储方法

Families Citing this family (5)

Publication number Priority date Publication date Assignee Title
CN109905269B (zh) * 2018-01-17 2020-11-17 华为技术有限公司 确定网络故障的方法和装置
CN108470022B (zh) * 2018-01-18 2021-11-23 南京邮电大学 一种基于运维管理的智能工单质检方法
CN109063217B (zh) * 2018-10-29 2020-11-03 广东电网有限责任公司广州供电局 电力营销***中的工单分类方法、装置及其相关设备
CN109684447A (zh) * 2018-12-13 2019-04-26 贵州电网有限责任公司 一种基于文本挖掘的电网调度运行日志故障信息分析方法
CN112258371A (zh) * 2020-11-17 2021-01-22 珠海大横琴科技发展有限公司 一种故障处理的方法和装置

Citations (4)

Publication number Priority date Publication date Assignee Title
CN101742540A (zh) * 2010-02-05 2010-06-16 华为技术有限公司 在线自诊断的方法及装置
CN101846992A (zh) * 2010-05-07 2010-09-29 上海理工大学 基于数控机床故障案例的故障树构造方法
CN103310389A (zh) * 2013-05-31 2013-09-18 南方电网科学研究院有限责任公司 基于故障模式与故障树建立的架空输电线路故障检修方法
CN104376033A (zh) * 2014-08-01 2015-02-25 中国人民解放军装甲兵工程学院 一种基于故障树和数据库技术的故障诊断方法

Family Cites Families (6)

Publication number Priority date Publication date Assignee Title
CA2449470A1 (en) * 2003-11-14 2005-05-14 Casebank Technologies Inc. Case-based reasoning system and method having fault isolation manual trigger cases
CN101907681B (zh) * 2010-07-15 2012-07-04 南京航空航天大学 基于gsd_svdd的模拟电路动态在线故障诊断方法
CN102346756B (zh) * 2010-12-24 2013-04-03 镇江诺尼基智能技术有限公司 一种设备故障解决方案知识管理与检索系统及方法
CN104008110A (zh) * 2013-02-26 2014-08-27 成都勤智数码科技股份有限公司 一种运维工单自动转知识库的方法
CN104063458B (zh) * 2014-06-26 2017-09-29 北京奇虎科技有限公司 一种对终端故障问题提供对应解决方案的方法及装置
CN104502103A (zh) * 2014-12-07 2015-04-08 北京工业大学 一种基于模糊支持向量机的轴承故障诊断方法

Cited By (26)

Publication number Priority date Publication date Assignee Title
CN109582670A (zh) * 2018-10-31 2019-04-05 深圳市元征科技股份有限公司 一种车辆维修方案的推荐方法及相关设备
CN109582670B (zh) * 2018-10-31 2023-04-07 深圳市元征科技股份有限公司 一种车辆维修方案的推荐方法及相关设备
CN109785919A (zh) * 2018-11-30 2019-05-21 平安科技(深圳)有限公司 名词匹配方法、装置、设备及计算机可读存储介质
CN109785919B (zh) * 2018-11-30 2023-06-23 平安科技(深圳)有限公司 名词匹配方法、装置、设备及计算机可读存储介质
CN110147387A (zh) * 2019-05-08 2019-08-20 腾讯科技(上海)有限公司 一种根因分析方法、装置、设备及存储介质
CN110147387B (zh) * 2019-05-08 2023-06-09 腾讯科技(上海)有限公司 一种根因分析方法、装置、设备及存储介质
CN110222182A (zh) * 2019-06-06 2019-09-10 腾讯科技(深圳)有限公司 一种语句分类方法及相关设备
CN110222182B (zh) * 2019-06-06 2022-12-27 腾讯科技(深圳)有限公司 一种语句分类方法及相关设备
CN112445893A (zh) * 2019-09-05 2021-03-05 北京国双科技有限公司 一种信息搜索方法、装置、设备及存储介质
CN112817948A (zh) * 2019-11-15 2021-05-18 北京三快在线科技有限公司 数据检测的方法、装置、可读存储介质以及电子设备
CN111191529A (zh) * 2019-12-17 2020-05-22 中移(杭州)信息技术有限公司 一种处理异常工单的方法及系统
CN111191529B (zh) * 2019-12-17 2023-04-28 中移(杭州)信息技术有限公司 一种处理异常工单的方法及系统
CN111191937B (zh) * 2019-12-31 2023-12-29 深圳市计通智能技术有限公司 一种告警危害评估方法、系统及终端设备
CN111191937A (zh) * 2019-12-31 2020-05-22 深圳市计通智能技术有限公司 一种告警危害评估方法、系统及终端设备
CN111259149A (zh) * 2020-01-19 2020-06-09 清华大学 化学品事故分类方法、装置、计算机设备和存储介质
CN111259149B (zh) * 2020-01-19 2024-03-12 清华大学 化学品事故分类方法、装置、计算机设备和存储介质
CN112256830B (zh) * 2020-10-21 2023-09-08 北京工业大数据创新中心有限公司 一种设备排查信息获取方法、装置和设备故障排查系统
CN112256830A (zh) * 2020-10-21 2021-01-22 北京工业大数据创新中心有限公司 一种设备排查信息获取方法、装置和设备故障排查系统
CN113589191A (zh) * 2021-07-07 2021-11-02 江苏毅星新能源科技有限公司 一种电源故障诊断系统及方法
CN113589191B (zh) * 2021-07-07 2024-03-01 郴州雅晶源电子有限公司 一种电源故障诊断系统及方法
CN115221892A (zh) * 2022-07-12 2022-10-21 中国电信股份有限公司 工单数据处理方法及装置、存储介质及电子设备
CN115221892B (zh) * 2022-07-12 2024-02-27 中国电信股份有限公司 工单数据处理方法及装置、存储介质及电子设备
CN116366377A (zh) * 2023-06-02 2023-06-30 深信服科技股份有限公司 恶意文件检测方法、装置、设备及存储介质
CN116366377B (zh) * 2023-06-02 2023-11-07 深信服科技股份有限公司 恶意文件检测方法、装置、设备及存储介质
CN116542634A (zh) * 2023-06-21 2023-08-04 中国电信股份有限公司 工单处理方法、装置和计算机可读存储介质
CN117891411A (zh) * 2024-03-14 2024-04-16 济宁蜗牛软件科技有限公司 一种海量档案数据优化存储方法

Also Published As

Publication number Publication date
CN106294038A (zh) 2017-01-04
CN106294038B (zh) 2019-10-18

Similar Documents

Publication Publication Date Title
WO2016188279A1 (zh) 一种故障谱的生成、基于故障谱的检测方法和装置
KR102485179B1 (ko) 설명 정보 확정 방법, 장치, 전자 기기 및 컴퓨터 저장 매체
US10025819B2 (en) Generating a query statement based on unstructured input
Papadakis et al. Three-dimensional entity resolution with JedAI
WO2020259260A1 (zh) 一种结构化查询语言sql注入检测方法及装置
JP6309644B2 (ja) スマート質問回答の実現方法、システム、および記憶媒体
WO2018157805A1 (zh) 一种自动问答处理方法及自动问答***
US9424294B2 (en) Method for facet searching and search suggestions
US9176949B2 (en) Systems and methods for sentence comparison and sentence-based search
US20180095962A1 (en) Translation of natural language questions and requests to a structured query format
US7953754B2 (en) Method and system for finding the focus of a document
US10853357B2 (en) Extensible automatic query language generator for semantic data
CN104050256A (zh) 基于主动学习的问答方法及采用该方法的问答***
EP3799640A1 (en) Semantic parsing of natural language query
CN106570180A (zh) 基于人工智能的语音搜索方法及装置
CN111967761A (zh) 一种基于知识图谱的监控预警方法、装置及电子设备
US10628749B2 (en) Automatically assessing question answering system performance across possible confidence values
US10282678B2 (en) Automated similarity comparison of model answers versus question answering system output
Fagin et al. Declarative cleaning of inconsistencies in information extraction
Nayak et al. Knowledge graph based automated generation of test cases in software engineering
Shekarpour et al. RQUERY: rewriting natural language queries on knowledge graphs to alleviate the vocabulary mismatch problem
CN109471889B (zh) 报表加速方法、***、计算机设备和存储介质
CN110909126A (zh) 一种信息查询方法及装置
Tran et al. Simplified effective method for identifying semantic relations from a knowledge graph
Zhu et al. A N-gram based approach to auto-extracting topics from research articles

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16799168

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16799168

Country of ref document: EP

Kind code of ref document: A1