JP6085574B2

JP6085574B2 - Work record content analysis apparatus, method and program

Info

Publication number: JP6085574B2
Application number: JP2014026891A
Authority: JP
Inventors: 暁渡邉; 達明木村; 剛豊野; 研西松
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2014-02-14
Filing date: 2014-02-14
Publication date: 2017-02-22
Anticipated expiration: 2034-02-14
Also published as: JP2015153188A

Description

The present invention relates to a work record content analysis apparatus, method, and program, and more particularly to a work record content analysis apparatus, method, and program for analyzing work records that describe the work performed by operators during system operation.

Today, to keep an audit trail of operator work and to accumulate operator knowledge, the circumstances of each incident in system operation and the work done in response are stored as work records and managed centrally. Work records contain much of the information needed to run operations efficiently, but they accumulate in such volume that extracting information by hand is difficult, and because they are written as free text, mechanical analysis is difficult as well.

Against this background, there is a method that extracts keywords from work records and analyzes the content of the records based on those keywords (see, for example, Non-Patent Document 1).

There is also a technique that classifies actions by morphological analysis that does not depend on language, dictionary, or corpus (see, for example, Non-Patent Document 2).

Non-Patent Document 1: Rahul Potharaju, Navendu Jain, and Cristina Nita-Rotaru. "Juggling the jigsaw: Towards automated problem inference from network trouble tickets." Proceedings of the 10th USENIX Conference on Networked Systems Design and Implementation, 2013.
Non-Patent Document 2: MeCab: Yet Another Part-of-Speech and Morphological Analyzer, http://mecab.***code.com/svn/trunk/mecab/doc/index.html
Non-Patent Document 3: J. Nivre. "An efficient algorithm for projective dependency parsing." In IWPT, 2003.
Non-Patent Document 4: Jason M. Eisner. "Three New Probabilistic Models for Dependency Parsing: An Exploration." In COLING, 1996.
Non-Patent Document 5: L. R. Rabiner. "A tutorial on hidden Markov models and selected applications in speech recognition." Proceedings of the IEEE, 1989.
Non-Patent Document 6: T. Kudo, K. Yamamoto, and Y. Matsumoto. "Applying conditional random fields to Japanese morphological analysis." In EMNLP, 2004.

However, the conventional technique described above only obtains the "failed module", the "method used to identify the failure", and the "resolution of the failure" from a work record. It cannot identify and classify records by cause location, and it cannot analyze the content of every sentence in a work record.

In addition, the method that classifies actions by morphological analysis requires words indicating the cause location to be defined manually in advance from the free-text field, so its feature quantities cannot be obtained automatically.

The present invention has been made in view of the above points. Its object is to provide a work record content analysis apparatus, method, and program that, for work records whose sentence-level content is not known to the user, can obtain feature quantities tied to the cause location without burdensome preparatory work, classify the records according to the cause of failure, and analyze the entire content of the work records.

According to one aspect, there is provided a work record content analysis apparatus for analyzing the content of a user's work records, comprising:
content analysis unit determination means for extracting, when a work record is given, the analysis unit sentences to be analyzed;
content label determination means for obtaining the content label of each analysis unit sentence by referring to a content label DB in which analysis unit sentences and content labels are associated in advance;
feature quantity extraction means for extracting, from the analysis unit sentence, a feature quantity for content analysis by combining the words contained in the analysis unit sentence with the content label obtained by the content label determination means; and
content analysis means for obtaining an analysis result, based on one or more feature quantities obtained by the feature quantity extraction means, by computing a conditional probability or a joint probability, or by selecting feature quantities that occur frequently in a set of work records.

According to one aspect, work records whose sentence-level content is not known to the user can be classified by cause location, and content labels and analysis unit sentences can be provided as the analysis result of the work records.

FIG. 1 is a configuration example of a work record content analysis apparatus according to an embodiment of the present invention.
FIG. 2 is a flowchart of the processing of the work record content analysis apparatus according to an embodiment of the present invention.
FIG. 3 is a system configuration example of the first embodiment of the present invention.
FIG. 4 is a system configuration example of the second embodiment of the present invention.
FIG. 5 is a system configuration example of the third embodiment of the present invention.

Embodiments of the present invention are described below with reference to the drawings.

FIG. 1 shows a configuration example of a work record content analysis apparatus according to an embodiment of the present invention.

The work record content analysis apparatus 100 shown in the figure includes a user interface unit 110, a content analysis unit determination unit 120, a content label determination unit 130, a feature quantity extraction unit 140, and a content analysis unit 150. Although not shown, it also has storage means such as a memory.

The user interface unit 110 receives an analysis unit from the user and passes it to the content analysis unit determination unit 120 and the content label determination unit 130. It also outputs the analysis result obtained by the content analysis unit 150.

The content analysis unit determination unit 120 extracts analysis unit sentences from the work record information read from the work record DB 10, based on the analysis unit or a predetermined extraction method.

For each analysis unit sentence, the content label determination unit 130 determines which of the predefined content labels applies, associates the analysis unit sentence with that content label, and stores the pair in the content label DB 20. When an analysis unit sentence is given via the user interface unit 110 or from the content analysis unit determination unit 120, it retrieves the content label for that sentence from the content label DB 20. The retrieved content label can also be output to the user via the user interface unit 110.

The feature quantity extraction unit 140 obtains the feature quantities of an analysis unit sentence. Feature quantities include those derived from the words in the analysis unit sentence and its content label, the result of parsing the analysis unit sentence, the word set of summary information, feature quantities of system messages, and numerical system information.

The content analysis unit 150 analyzes the analysis unit sentences based on the feature quantities obtained by the feature quantity extraction unit 140. The content analysis result is output via the user interface unit 110.

The work record DB 10 stores work records that document how operators responded to failures that occurred during system operation.

For each analysis unit sentence, the content label DB 20 stores the content label associated with that sentence after the user has determined which of the predefined content labels applies. Table 1 shows an example of content label definitions.

For example, when the goal is statistical analysis of the man-hours in work records and only the analysis unit sentences that describe work actually performed are to be extracted, two content labels are defined: "work" and "non-work". Content labels are not necessarily discrete values. For example, when descriptions that are important to other people are to be extracted from the description of the work, the importance of each analysis unit sentence is expressed as a score, and that score serves as the content label.

In what follows, the problem addressed by the present invention is solved by assigning to each sentence of a work record a content label that indicates its content.

FIG. 2 is a flowchart of the processing of the work record content analysis apparatus according to an embodiment of the present invention.

Step 101) The content analysis unit determination unit 120 extracts analysis unit sentences from the work record information read from the work record DB 10 by one of the following methods.

(1) Insert delimiters into the work record information read from the work record DB 10 and treat each delimited segment as an analysis unit sentence.

(2) Obtain an analysis unit from the user via the user interface unit 110 and extract analysis unit sentences from the work record information read from the work record DB 10 according to that analysis unit.

(3) Split the work record information read from the work record DB 10 at each newline character and use each segment as an analysis unit.

(4) Use the entire text of the work record information read from the work record DB 10 as a single analysis unit.

The analysis unit sentences obtained by any of the above methods are output to the content label determination unit 130 or the feature quantity extraction unit 140.
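The extraction methods of step 101 can be sketched as follows. This is a minimal illustration, not the apparatus's actual implementation: the function name `extract_analysis_units`, the `method` keywords, and the default delimiter pattern are all assumptions introduced here.

```python
import re
from typing import List, Optional


def extract_analysis_units(record: str, method: str = "newline",
                           delimiter: Optional[str] = None) -> List[str]:
    """Split one work record into analysis unit sentences.

    method (hypothetical keywords, not taken from the patent):
      "delimiter" - split on a delimiter pattern (methods 1 and 2)
      "newline"   - split at each newline character (method 3)
      "whole"     - treat the entire record as one unit (method 4)
    """
    if method == "whole":
        units = [record]
    elif method == "newline":
        units = record.splitlines()
    elif method == "delimiter":
        # Assumed default: a Japanese or ASCII sentence-ending period.
        pattern = delimiter if delimiter is not None else r"[。.]"
        units = re.split(pattern, record)
    else:
        raise ValueError(f"unknown method: {method}")
    # Drop empty fragments and surrounding whitespace.
    return [u.strip() for u in units if u.strip()]


record = "Alert received. Restarted the service.\nConfirmed recovery."
print(extract_analysis_units(record, "newline"))
print(extract_analysis_units(record, "delimiter"))
print(extract_analysis_units(record, "whole"))
```

A user-supplied analysis unit (method 2) would correspond to passing a custom `delimiter` pattern in this sketch.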

Step 102) For each analysis unit sentence obtained in step 101, the content label determination unit 130 determines which of the content labels predefined and stored in the content label DB 20 applies.

Step 103) The feature quantity extraction unit 140 obtains the feature quantities of the analysis unit sentence that are needed to analyze its content.

Feature quantities can be obtained by the following methods.

(1) Using morphological analysis:
If the work record DB 10 contains a user-supplied Japanese title or summary of a work record, that text is morphologically analyzed and the resulting set of words is used as the feature quantity. For example, given the title
"ホストXでパケット異常発生" ("Packet anomaly occurred on host X")
the text is split into the words
"ホストX", "で", "パケット", "異常", "発生" ("host X", "on", "packet", "anomaly", "occurrence")
and this set of words is used as the feature quantity. This allows the content analysis unit 150 to classify work records into groups with similar feature quantities.
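As a rough illustration of how word-set feature quantities could be compared, the sketch below measures the overlap of two word sets. The Jaccard measure is an assumption made here (the text only says that records with similar feature quantities can be grouped), the second and third titles are invented examples, and the token lists are written out by hand rather than produced by a real morphological analyzer.

```python
from typing import Iterable, Set


def word_set(tokens: Iterable[str]) -> Set[str]:
    """Feature quantity: the set of words produced by a morphological analyzer."""
    return set(tokens)


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two word sets (an assumed similarity measure)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Tokens written by hand, as an analyzer such as MeCab might return them.
title_1 = word_set(["ホストX", "で", "パケット", "異常", "発生"])
title_2 = word_set(["ホストY", "で", "パケット", "異常", "発生"])  # invented example
title_3 = word_set(["ディスク", "容量", "不足"])                    # invented example

print(jaccard(title_1, title_2))  # 4 shared words out of 6 distinct words
print(jaccard(title_1, title_3))  # no shared words
```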

(2) Using templates:
System messages can be used as feature quantities. When a system message obtained from the work record DB 10 indicates an abnormal state (for example, an anomaly alert issued by a monitoring system or a device error log), the message is converted into a template.

For example, given the system messages "ホストXでCPU負荷が90%を超えました" ("CPU load exceeded 90% on host X") and "2014/01/01 01:10:10 Error:system initiation is not defined", each message is turned into a template by removing the parts that vary with the situation:
"****でCPU負荷が**%を超えました", "****/**/** **:**:** Error:system initiation is not defined"
An ID is assigned to every message that matches the same template. If the anomaly described in a work record produced three messages with ID 100, ID 200, and ID 300, the set of IDs (100, 200, 300) is used as the feature quantity.
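The template approach can be sketched as below. The masking rules in `MASKS`, the class name `TemplateIndex`, and the ID numbering scheme (starting at 100 in steps of 100) are assumptions chosen to mirror the example; the source does not specify how variable parts are detected or how IDs are allocated.

```python
import re
from typing import Dict, FrozenSet, List

# Assumed masking rules for the parts of a message that vary with the situation.
MASKS = [
    (re.compile(r"\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}"), "****/**/** **:**:**"),
    (re.compile(r"\d+%"), "**%"),
    (re.compile(r"ホスト\w+?(?=で)"), "****"),
]


class TemplateIndex:
    """Assigns a stable integer ID to each distinct message template."""

    def __init__(self, first_id: int = 100, step: int = 100) -> None:
        self._ids: Dict[str, int] = {}
        self._next = first_id
        self._step = step

    def to_template(self, message: str) -> str:
        """Replace the variable parts of a message with asterisks."""
        for pattern, replacement in MASKS:
            message = pattern.sub(replacement, message)
        return message

    def template_id(self, message: str) -> int:
        """Return the ID of the message's template, allocating one if new."""
        template = self.to_template(message)
        if template not in self._ids:
            self._ids[template] = self._next
            self._next += self._step
        return self._ids[template]

    def features(self, messages: List[str]) -> FrozenSet[int]:
        """Feature quantity: the set of template IDs seen in one work record."""
        return frozenset(self.template_id(m) for m in messages)


index = TemplateIndex()
messages = [
    "ホストXでCPU負荷が90%を超えました",
    "2014/01/01 01:10:10 Error:system initiation is not defined",
    "ホストYでCPU負荷が95%を超えました",  # invented: same template as the first
]
print(index.to_template(messages[0]))
print(sorted(index.features(messages)))
```

Because the first and third messages reduce to the same template, they share one ID, so the feature set for this record contains two IDs rather than three.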

(3) Using numerical system information:
Numerical values recorded in the work record, such as packet counts and response times, are used as a feature vector. For example, for the feature quantities ("CPU usage", "disk write speed", "device temperature"), the vector (0.5, 4000, 298) is the feature quantity.

(4) Using analysis unit sentences and content labels:
Let $X_i$ be the information contained in the i-th analysis unit sentence and $Y_i$ the content label of the i-th analysis unit sentence. A feature quantity $F$ can then be defined freely in terms of
$$X_1, \dots, X_n,\; Y_1, \dots, Y_{i-1},\; Y_{i+1}, \dots, Y_n.$$

For example, whether the analysis unit sentence whose content is being estimated contains the word "緊急" ("urgent") can be expressed as
$$F(X_1, \dots, X_n, Y_1, \dots, Y_{i-1}, Y_{i+1}, \dots, Y_n) = \begin{cases} 1 & \text{if "緊急"} \in X_i \\ 0 & \text{otherwise.} \end{cases}$$
Words contained in an analysis unit sentence can be obtained with, for example, the morphological analyzer MeCab (see, for example, Non-Patent Document 2).

Similarly, to use as a feature quantity the fact that the content label of the immediately preceding analysis unit sentence is "根拠" ("rationale"), the content label $Y_{i-1}$ of that sentence is stored in the content label DB 20 once it has been determined. Retrieving it from the content label DB 20 gives
$$F(X_1, \dots, X_n, Y_1, \dots, Y_{i-1}, Y_{i+1}, \dots, Y_n) = \begin{cases} 1 & \text{if } Y_{i-1} = \text{"根拠"} \\ 0 & \text{otherwise.} \end{cases}$$
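The two indicator features above can be written as small functions over the word sets X and the labels Y. This is a sketch under assumptions: the names `contains_word`, `previous_label_is`, and `feature_vector`, the zero-based index `i`, and the example sentences are introduced here for illustration only.

```python
from typing import Callable, List, Optional, Sequence, Set

# X: one word set per analysis unit sentence.
# Y: one content label per sentence, None where the label is not yet known.
Words = Sequence[Set[str]]
Labels = Sequence[Optional[str]]
Feature = Callable[[Words, Labels, int], int]


def contains_word(word: str) -> Feature:
    """F = 1 if `word` is in X_i, otherwise 0."""
    def f(X: Words, Y: Labels, i: int) -> int:
        return 1 if word in X[i] else 0
    return f


def previous_label_is(label: str) -> Feature:
    """F = 1 if Y_{i-1} equals `label`, otherwise 0 (0 for the first sentence)."""
    def f(X: Words, Y: Labels, i: int) -> int:
        return 1 if i > 0 and Y[i - 1] == label else 0
    return f


def feature_vector(features: List[Feature], X: Words, Y: Labels, i: int) -> List[int]:
    """Combine several feature functions into one vector for sentence i."""
    return [f(X, Y, i) for f in features]


# Invented two-sentence record; the label of sentence 1 is to be estimated.
X = [{"原因", "は", "不明"}, {"緊急", "対応", "実施"}]
Y = ["根拠", None]
features = [contains_word("緊急"), previous_label_is("根拠")]
print(feature_vector(features, X, Y, 1))
print(feature_vector(features, X, Y, 0))
```

Returning each feature as a closure makes it straightforward to use one feature or to combine several, as the text allows.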

A single feature quantity may be used, or several may be combined, including the feature quantities described in step 104 below.

Step 104) The feature quantity extraction unit 140 can also extract feature quantities by analyzing the structure among the analysis unit sentences.

In this case, pairs (Xp, Xq) of analysis unit sentences with related content are obtained. For example, for a work record organized as "introduction, development, turn, conclusion" (起・承・転・結), the pairs [introduction, development], [development, turn], and [turn, conclusion] are obtained as related analysis unit sentences.

Likewise, for a work record organized as "purpose, rationale 1, rationale 2, rationale 3", the pairs [purpose, rationale 1], [purpose, rationale 2], and [purpose, rationale 3] are obtained as related analysis unit sentences.

Methods for this kind of structural analysis include the parsing techniques Shift-Reduce (see, for example, Non-Patent Document 3) and the Eisner method (see, for example, Non-Patent Document 4).

The feature quantity extraction unit 140 can thus analyze the structure of the analysis unit sentences and use it as a feature quantity, obtaining pairs (Xp, Xq) of analysis unit sentences with related content.

Structural feature quantities can be defined from this structure. For example, the feature quantities of the analysis unit sentences related to the sentence under analysis can be used. For a work record with the structure (A, B), (B, C), (C, D), the feature quantities of A and C are used as feature quantities of analysis unit sentence B.
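The (A, B), (B, C), (C, D) example can be sketched as a lookup of related sentences followed by a union of their features. The function names and the per-sentence feature sets below are hypothetical, and the pairs are given directly rather than produced by a parser.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

Pair = Tuple[str, str]


def related_units(pairs: Iterable[Pair]) -> Dict[str, List[str]]:
    """Map each analysis unit sentence to the sentences it is paired with."""
    neighbours: Dict[str, List[str]] = defaultdict(list)
    for p, q in pairs:
        neighbours[p].append(q)
        neighbours[q].append(p)
    return dict(neighbours)


def structural_features(unit: str, pairs: Iterable[Pair],
                        own_features: Dict[str, Set[str]]) -> Set[str]:
    """Union of the features of every sentence related to `unit`."""
    result: Set[str] = set()
    for other in related_units(pairs).get(unit, []):
        result |= own_features[other]
    return result


pairs = [("A", "B"), ("B", "C"), ("C", "D")]
# Invented per-sentence features, for illustration only.
own = {"A": {"alert"}, "B": {"restart"}, "C": {"recovered"}, "D": {"closed"}}
print(related_units(pairs)["B"])
print(sorted(structural_features("B", pairs, own)))
```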

Step 105) The content analysis unit 150 performs content analysis based on the feature quantities obtained in steps 103 and 104.

Possible content analysis methods include the HMM (Hidden Markov Model), which models joint probabilities (see, for example, Non-Patent Document 5), and CRF (Conditional Random Fields), which models conditional probabilities (see, for example, Non-Patent Document 6).
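To make the HMM option concrete, the sketch below decodes the most probable sequence of content labels with the Viterbi algorithm. It assumes one observed feature per analysis unit sentence and a non-empty observation sequence. All probabilities, the observation words, and the function name `viterbi` are invented for illustration; in practice the parameters would be estimated from training data.

```python
import math
from typing import Dict, List, Sequence

Prob = Dict[str, float]


def _log(p: float) -> float:
    """Logarithm that maps probability 0 to negative infinity."""
    return math.log(p) if p > 0 else float("-inf")


def viterbi(observations: Sequence[str], states: Sequence[str],
            start_p: Prob, trans_p: Dict[str, Prob],
            emit_p: Dict[str, Prob]) -> List[str]:
    """Most probable content label sequence under an HMM (log-space Viterbi)."""
    # best[t][s]: log-probability of the best path ending in state s at step t.
    best = [{s: _log(start_p[s]) + _log(emit_p[s].get(observations[0], 0.0))
             for s in states}]
    back: List[Dict[str, str]] = [{}]
    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev = max(states, key=lambda r: best[t - 1][r] + _log(trans_p[r][s]))
            best[t][s] = (best[t - 1][prev] + _log(trans_p[prev][s])
                          + _log(emit_p[s].get(observations[t], 0.0)))
            back[t][s] = prev
    # Trace the best path backwards from the best final state.
    state = max(states, key=lambda s: best[-1][s])
    path = [state]
    for t in range(len(observations) - 1, 0, -1):
        state = back[t][state]
        path.append(state)
    return path[::-1]


states = ["work", "non-work"]
start_p = {"work": 0.5, "non-work": 0.5}
trans_p = {"work": {"work": 0.6, "non-work": 0.4},
           "non-work": {"work": 0.4, "non-work": 0.6}}
emit_p = {"work": {"restart": 0.8, "thanks": 0.2},
          "non-work": {"restart": 0.1, "thanks": 0.9}}
labels = viterbi(["restart", "thanks", "restart"], states, start_p, trans_p, emit_p)
print(labels)
```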

One approach is for the content analysis unit 150 to have the analysis system 40 connected to the work record content analysis apparatus 100 learn the relationship between feature quantities and content labels in advance, and then to obtain content labels for a new work record from the feature quantities of its analysis unit sentences. The training data can be supplied by the user in advance, supplied interactively by the user while the analysis system 40 is running, or taken from a separate data set assumed to carry the same content label. As an example of the last option, every sentence in a work manual could be used as data with the content label "work".

Learning does not necessarily require labeled training data. For example, to extract only the important analysis unit sentences from the set of work records selected for analysis by the content analysis unit determination unit 120, the feature quantities that occur frequently in that set, as obtained by the feature quantity extraction unit 140, may be taken to indicate the important analysis unit sentences.
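The frequency-based alternative can be sketched without any labeled data. The threshold `min_count`, the function names, and the example feature sets are assumptions; the source only says that frequently occurring feature quantities may mark important sentences.

```python
from collections import Counter
from typing import List, Set


def frequent_features(units: List[Set[str]], min_count: int) -> Set[str]:
    """Features that occur in at least `min_count` analysis unit sentences."""
    counts = Counter(f for features in units for f in features)
    return {f for f, c in counts.items() if c >= min_count}


def important_units(units: List[Set[str]], min_count: int) -> List[int]:
    """Indices of sentences that contain at least one frequent feature."""
    frequent = frequent_features(units, min_count)
    return [i for i, features in enumerate(units) if features & frequent]


# Invented feature sets, one per analysis unit sentence.
units = [
    {"restart", "service"},
    {"checked", "log"},
    {"restart", "router"},
    {"lunch"},
]
print(sorted(frequent_features(units, 2)))
print(important_units(units, 2))
```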

Step 106) The content analysis result obtained by the content analysis unit 150 is output. The result consists of content labels, analysis unit sentences judged to be important, analysis unit sentences with content labels attached, and the like.

Examples of applying the work record content analysis apparatus described in the above embodiment follow.

[First embodiment]
FIG. 3 is a system configuration example of the first embodiment of the present invention.

In this figure, components identical to those in FIG. 1 are given the same reference numerals, and their description is omitted.

The system that is the target of the work stored in the work record DB 10 may be of any kind, for example an enterprise IT system or a large-scale network.

The system shown in FIG. 3 assumes the case where a content label is obtained for each analysis unit sentence of a work record that the user 30 has input for analysis.

When the apparatus is used as a system for assigning content labels to work records, the user supplies the work record to be analyzed via the user interface unit 110 of the work record content analysis apparatus 100. The content label determination unit 130 analyzes the content labels of that work record and records them in the content label DB 20. The content labels of the work record are then retrieved from the content label DB 20 and presented to the user 30. The analysis and the presentation need not take place at the same time; the analysis may have been performed beforehand, with the content labels already stored in the content label DB 20.

This allows the user 30 to obtain only those analysis unit sentences of a work record that carry the content labels the user needs.

[Second embodiment]
FIG. 4 is a system configuration example of the second embodiment of the present invention.

The system shown in the figure assumes the case where content labels are analyzed simultaneously for multiple work record DBs 10_1 to 10_n. This makes it possible to obtain analysis unit sentences with the same content label from different types of work records and to link those work records to each other. For example, content analysis can be performed simultaneously on manually written work records of an IT system and on work records such as the command logs of the same IT system. The work records that carry the same content label are obtained, associated with that content label, and stored in the work record DB 10 that holds each work record.

Specifically, the user 30 selects, for each work record DB 10, the work records to be analyzed. The work record content analysis apparatus 100 obtains those work records from the work record DB 10, and the content label determination unit 130 analyzes their content labels. This analysis may either share the feature quantities of the analysis unit sentences across the multiple work records or treat each work record independently.

[Third embodiment]
FIG. 5 is a system configuration example of the third embodiment of the present invention.

Components identical to those in FIG. 1 are given the same reference numerals, and their description is omitted.

The system shown in FIG. 5 analyzes the content of the work record descriptions and presents the results to multiple users.

As shown in FIG. 5, when there are multiple users (A, B), the analysis unit sentences presented differ from user to user. For example, based on the content label estimation result for a work record, only the sentences with certain content labels are presented to user A and the rest to user B. In this case, each of users A and B inputs the work record to be analyzed together with the content labels of the analysis unit sentences that the user wants to receive. In the work record content analysis apparatus 100, the content label determination unit 130 obtains the content labels of the work record, and only the analysis unit sentences whose content labels were requested by each user are presented to that user via the user interface unit 110.

As a result, in the operation of an IT system, for example, when the tasks that a remote operator (user A) and an on-site worker (user B) must carry out at the same time are to be obtained from a work record, the remote operator (user A) can be given the analysis unit sentences (those with specific content labels) for the work the remote operator should perform, and the on-site worker (user B) can be given the analysis unit sentences (those with specific content labels) for the work the on-site worker should perform.
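The per-user presentation in the third embodiment amounts to filtering labeled analysis unit sentences by the labels each user requested. The sketch below illustrates this under assumptions: the label names "remote work", "on-site work", and "background", the example sentences, and the function name `units_for_users` are all hypothetical.

```python
from typing import Dict, List, Set, Tuple

LabeledUnit = Tuple[str, str]  # (analysis unit sentence, content label)


def units_for_users(units: List[LabeledUnit],
                    requested: Dict[str, Set[str]]) -> Dict[str, List[str]]:
    """For each user, keep only the sentences whose label that user asked for."""
    return {
        user: [text for text, label in units if label in labels]
        for user, labels in requested.items()
    }


# Invented labeled sentences from one work record.
units = [
    ("Restart the daemon over SSH.", "remote work"),
    ("Replace the failed disk in rack 3.", "on-site work"),
    ("Ticket opened by monitoring.", "background"),
]
requested = {"A": {"remote work"}, "B": {"on-site work"}}
result = units_for_users(units, requested)
print(result["A"])
print(result["B"])
```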

The operation of each component of the work record content analysis apparatus 100 shown in FIG. 1 can be implemented as a program, which can be installed on a computer used as the work record content analysis apparatus or distributed over a network.

The present invention is not limited to the above embodiments and examples, and various modifications and applications are possible within the scope of the claims.

10, 10_1 to 10_n: Work record DB
20: Content label DB
30: User (terminal)
40: Analysis system
100: Work record content analysis apparatus
110: User interface unit
120: Content analysis unit determination unit
130: Content label determination unit
140: Feature quantity extraction unit
150: Content analysis unit

Claims

A work record content analysis apparatus for analyzing the content of a user's work records, comprising:
content analysis unit determination means for extracting analysis unit sentences when a work record is given;
content label determination means for obtaining the content label of each analysis unit sentence by referring to a content label DB in which analysis unit sentences and content labels are associated in advance;
feature quantity extraction means for extracting, from the analysis unit sentence, a feature quantity for content analysis by combining the words contained in the analysis unit sentence with the content label obtained by the content label determination means; and
content analysis means for obtaining an analysis result, based on one or more feature quantities obtained by the feature quantity extraction means, by computing a conditional probability or a joint probability, or by selecting feature quantities that occur frequently in a set of work records.

The work record content analysis apparatus according to claim 1, wherein the content analysis unit determination means includes means for extracting the analysis unit sentences by any one of:
inserting delimiters into the work record;
splitting the work record at each newline character;
using an input analysis unit; or
using the entire text of the work record.

The work record content analysis apparatus according to claim 1, wherein the feature quantity extraction means includes means for parsing the analysis unit sentences and, based on the parsing result, using pairs of analysis unit sentences with related content as a feature quantity.

The work record content analysis apparatus according to claim 1, wherein the content label determination means includes means for storing the content label obtained for the analysis unit sentence in the content label DB.

The work record content analysis apparatus according to claim 1, wherein the content label determination means includes means for associating the content label with the work record to be analyzed, which was obtained from a work record DB storing the work records, and storing it in that work record DB.

A work record content analysis method for analyzing the content of a user's work records, performed in an apparatus having content analysis unit determination means, content label determination means, feature quantity extraction means, and content analysis means, the method comprising:
a content analysis unit determination step in which the content analysis unit determination means extracts analysis unit sentences when a work record is given;
a content label determination step in which the content label determination means obtains the content label of each analysis unit sentence by referring to a content label DB in which analysis unit sentences and content labels are associated in advance;
a feature quantity extraction step in which the feature quantity extraction means extracts, from the analysis unit sentence, a feature quantity for content analysis by combining the words contained in the analysis unit sentence with the content label obtained in the content label determination step; and
a content analysis step in which the content analysis means obtains an analysis result, based on one or more feature quantities obtained in the feature quantity extraction step, by computing a conditional probability or a joint probability, or by selecting feature quantities that occur frequently in a set of work records.

A work record content analysis program for causing a computer to function as each means of the work record content analysis apparatus according to any one of claims 1 to 5.