JPH077417B2

JPH077417B2 - Sentence inspection device

Info

Publication number: JPH077417B2
Application number: JP63256188A
Authority: JP
Inventors: 俊一福島
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 1988-10-11
Filing date: 1988-10-11
Publication date: 1995-01-30
Anticipated expiration: 2010-01-30
Also published as: JPH02103658A

Description

DETAILED DESCRIPTION OF THE INVENTION [Industrial field of use] The present invention relates to a sentence inspection device that detects errors or inappropriate passages in input text.

[Conventional technology]

Japanese writing styles can be divided into the plain style (jōtai: the da / de aru style) and the polite style (keitai: the desu / masu style, the de arimasu style, and the de gozaimasu style). For example, sentences (1) and (2) below are in the plain style, and sentences (3) and (4) are in the polite style.

次の通りだ。 … (1)
次の通りである。 … (2)
次の通りです。 … (3)
次の通りでございます。 … (4)

(All four sentences mean "It is as follows.")

In general, it is undesirable for plain-style and polite-style sentences to be mixed within a single text. Checks for such mixing are described in "Japanese Text Creation Support System COMET" (Fukushima et al., Institute of Electronics and Communication Engineers Technical Report OS86-21, 1986) and in JP-A-61-229155, "Japanese Word Processing Method". That method provides a style expression storage means that stores expressions characterizing a style (plain/polite), detects the stored expressions in the text, and at the same time counts the detected expressions separately for the plain style and the polite style. If both counts are 1 or more, the plain and polite styles are mixed.
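The counting check described above can be sketched as follows. This is a minimal illustration, not the cited systems' implementation: the table `STYLE_EXPRESSIONS`, the function names, and the longest-match scanning strategy are all assumptions made for the example.

```python
# Hypothetical table of style-marking expressions (an assumption for this
# sketch; the cited systems' actual tables are not given in the text).
STYLE_EXPRESSIONS = {
    "である": "plain",
    "だ": "plain",
    "でございます": "polite",
    "です": "polite",
}


def count_styles(text):
    """Count plain and polite expressions, preferring the longest match."""
    counts = {"plain": 0, "polite": 0}
    by_length = sorted(STYLE_EXPRESSIONS, key=len, reverse=True)
    i = 0
    while i < len(text):
        for expr in by_length:
            if text.startswith(expr, i):
                counts[STYLE_EXPRESSIONS[expr]] += 1
                i += len(expr)
                break
        else:
            i += 1  # no stored expression starts at this character
    return counts


def is_mixed(text):
    """Styles are mixed when both counts are 1 or more."""
    counts = count_styles(text)
    return counts["plain"] >= 1 and counts["polite"] >= 1


print(is_mixed("次の通りだ。次の通りです。"))      # sentences (1) and (3) together
print(is_mixed("次の通りです。次の通りでございます。"))  # sentences (3) and (4) together
```

Note that this check only says whether both styles occur; it says nothing about where they occur, which is the gap the invention addresses.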

[Problems to be Solved by the Invention]

Style inspection requires not only the conventional check for mixing of the plain and polite styles, but also a check on the positions at which polite or plain forms are used.

For example, (5) below is a plain-style sentence, and (6), (7), and (8) are all polite-style versions of (5). Even though they are all polite-style sentences, (6), (7), and (8) differ in the positions and the number of polite forms used in mid-sentence (the underlined portions are the expressions that characterize the polite style). As a result, politeness increases in the order (5) < (6) < (7) < (8).

データは次に示した通りだが、安易に結論は出せない。 … (5)
データは次に示した通りだが、安易に結論は出せません。 … (6)
データは次に示した通りですが、安易に結論は出せません。 … (7)
データは次に示しました通りですが、安易に結論は出せません。 … (8)

(All four sentences mean "The data are as shown below, but no conclusion can be drawn easily.")

Beyond the distinction between plain-style and polite-style sentences, Japanese permits various forms of polite-style sentence, but some of these forms are undesirable depending on the kind of text being written. In an in-house document, for example, (8) is excessively polite and is a form to be avoided.

In addition, a single text should not only be unified into either plain-style or polite-style sentences; when polite-style sentences are used, the way polite forms are used should also be unified. For example, a text such as (9) below uses polite forms irregularly and is unnatural as Japanese (the underlined portions are the expressions that characterize the polite style).

超新星から届いたと考えられます素粒子のデータを収集しました。そのデータは、次に示した通りだが、安易に結論は出せません。他のグループもデータを収集していますが、そのデータとの比較が必要です。 … (9)

("We collected data on elementary particles thought to have arrived from a supernova. The data are as shown below, but no conclusion can be drawn easily. Other groups are also collecting data, and a comparison with their data is needed.")

Problems like these can be solved by inspecting the positions at which polite and plain forms are used. For example, suppose the inspection applies the condition that polite forms are used at the end of a sentence (immediately before the period) and immediately before the conjunctive particle 「が」, and that plain forms are used everywhere else. Then, among (5) to (8), sentences (5), (6), and (8) are judged inappropriate and (7) is judged appropriate. Likewise, (9) is judged inappropriate, whereas a text such as (10) below is judged appropriate (the underlined portions are the expressions that characterize the polite style).

超新星から届いたと考えられる素粒子のデータを収集しました。そのデータは、次に示した通りですが、安易に結論は出せません。他のグループもデータを収集していますが、そのデータとの比較が必要です。 … (10)

Until now, the only way to inspect the positions at which polite and plain forms are used has been for a person to do it.

An object of the present invention is to provide a sentence inspection device that can inspect the positions at which polite and plain forms are used. In addition, by learning from a reference text, the device makes it easy to set the conditions on the positions at which polite and plain forms are used.

[Means for Solving the Problems]

The sentence inspection device of the present invention is a sentence inspection device that detects errors or inappropriate passages in input Japanese text, and comprises: a style expression storage means that stores expressions characterizing a style; a style expression detection means that detects, in the Japanese text, the expressions stored in the style expression storage means; a switching means that switches between a learning mode and an inspection mode; a position condition learning means that, in the learning mode, extracts conditions on the positions at which the style-characterizing expressions are used, based on the detection results of the style expression detection means; a position condition storage means that stores the conditions extracted by the position condition learning means; and a position condition judgment means that, in the inspection mode, judges whether the position of each expression detected by the style expression detection means satisfies the conditions stored in the position condition storage means.

[Embodiments]

The present invention is described below with reference to the drawings.

Fig. 1 is a block diagram showing the configuration of a first embodiment of the sentence inspection device according to the present invention.

In Fig. 1, the text input means 1 inputs Japanese text. A kana-kanji conversion input device, a pen-touch keyboard, a character recognition device, or the like is used.

The text storage means 2 stores the Japanese text input by the text input means 1 as a character code string. An IC memory, a magnetic disk device, a magnetic tape device, an optical disk device, or the like is used.

The style expression storage means 3 stores expressions that characterize a style (plain/polite). An IC memory, a magnetic disk device, a magnetic tape device, an optical disk device, or the like is used. Figs. 2 and 3 show examples of the contents of the style expression storage means 3. In Fig. 2, the character strings of expressions characterizing the plain style and those characterizing the polite style are stored together, sorted in character code order of the style expression 30. Each expression carries a style identification 31, which is information identifying it as either plain or polite. In Fig. 3, the expressions characterizing the plain style (a) and those characterizing the polite style (b) are registered separately. Consequently, no style identification 31 is attached to the individual expressions.

The style expression detection means 4 detects, in the Japanese text stored in the text storage means 2, the expressions stored in the style expression storage means 3. A computer CPU or the like is used. The style expression detection means 4 compares and collates the Japanese text stored in the text storage means 2 with the expressions stored in the style expression storage means 3, and sends the position in the text and the style identification of each detected expression to the position condition judgment means 6 and the position condition learning means 9.
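The collation step above can be sketched as a simple scan over the text. The table below is an assumption: its entries are the expressions that the worked examples in this description report as detected, and the actual contents of Fig. 2 may differ. The leftmost-longest matching strategy and all names are likewise choices made for this sketch; the description only says that the text and the stored expressions are compared and collated.

```python
# Assumed style expression table (style expression 30 -> style identification 31).
STYLE_TABLE = {
    "いた": "plain", "した": "plain", "だ": "plain",
    "います": "polite", "しました": "polite", "られます": "polite",
    "せません": "polite", "です": "polite", "でしょう": "polite",
}


def detect_style_expressions(text, table=STYLE_TABLE):
    """Return (expression, start, end, style) tuples.

    Positions are 1-based and inclusive, counted in characters from the
    beginning of the text, as in the worked examples of this description.
    """
    by_length = sorted(table, key=len, reverse=True)
    hits = []
    i = 0
    while i < len(text):
        match = next((e for e in by_length if text.startswith(e, i)), None)
        if match is None:
            i += 1
            continue
        hits.append((match, i + 1, i + len(match), table[match]))
        i += len(match)  # do not re-detect inside a matched expression
    return hits


# Sentence (7) from the description.
for hit in detect_style_expressions("データは次に示した通りですが、安易に結論は出せません。"):
    print(hit)
```

Preferring the longest expression keeps 「しました」 from also being reported as the plain expression 「した」, which is consistent with the detection lists given later in the description.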

The position condition storage means 5 stores conditions on the positions at which expressions characterizing a style (plain/polite) are used. The stored conditions are written by the position condition learning means 9 when the switching means 10 specifies the learning mode. An IC memory, a magnetic disk device, a magnetic tape device, an optical disk device, or the like is used. Figs. 4 and 5 show examples of the contents of the position condition storage means. The position condition storage means 5 of Fig. 4 expresses the conditions by registering the character strings of the expressions that directly follow an expression characterizing the polite style. That is, it expresses the condition that an expression characterizing the polite style is immediately followed by one of 「。」, 「が、」, or 「ので、」, and that an expression characterizing the plain style is immediately followed by something other than 「。」, 「が、」, or 「ので、」. The position condition storage means 5a of Fig. 5 consists of a style identification 50, a condition type 51, and a connection expression 52. The style identification 50 indicates whether the condition concerns the position of an expression characterizing the plain style or of one characterizing the polite style. When the condition type 51 is 「+」, the character string in the connection expression 52 is permitted as the expression directly following that style expression; when the condition type 51 is 「-」, the character string in the connection expression 52 is prohibited as the expression directly following that style expression. Fig. 5 expresses the same conditions as Fig. 4.

The switching means 10 switches between the learning mode and the inspection mode. It may be implemented as a specific key on the keyboard or as a toggle switch. When the learning mode is specified, the switching means 10 activates the position condition learning means 9; when the inspection mode is specified, it activates the position condition judgment means 6 (only one of the position condition learning means 9 and the position condition judgment means 6 is activated at a time).

When the learning mode is specified, the position condition learning means 9 extracts conditions on the positions at which expressions characterizing a style (plain/polite) are used, based on the detection results of the style expression detection means 4. A computer CPU or the like is used. In the learning mode, a reference text that satisfies the conditions on style positions is input from the text input means 1. The method of extracting the conditions depends on how conditions are described in the position condition storage means 5. With the description method shown in Fig. 4, for example, it suffices to receive from the style expression detection means 4 the positions at which polite expressions were detected and to extract the character string immediately following each one, up to the punctuation mark. With the description method shown in Fig. 5, a table is prepared in advance for the patterns that combine each conceivable connection expression 52 with a style identification (polite/plain) 50, with the condition type 51 set to 「-」 (the entries of Fig. 5 whose condition type 51 is 「-」). The character string immediately following each style expression detected in the reference text by the style expression detection means 4 is then matched against the character strings of the connection expressions 52, and 「+」 is written into the condition type 51 of the entry corresponding to the matched connection expression and style identification. The position condition learning means 9 writes these conditions into the position condition storage means 5.
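The Fig. 5 style of learning described above (start with every entry at 「-」, then flip matched entries to 「+」) might look like the following sketch. The candidate list `CANDIDATE_CONNECTIONS`, the tuple layout of the detections, and the function names are assumptions for illustration; the reference sentence and its detected positions are the ones used in the worked example of this description.

```python
# Assumed candidate connection expressions 52; the real list is not given.
CANDIDATE_CONNECTIONS = ["。", "が、", "ので、"]
STYLES = ["polite", "plain"]  # style identification 50


def initial_table():
    """Every (style, connection) pattern starts with condition type '-'."""
    return {(style, conn): "-" for style in STYLES for conn in CANDIDATE_CONNECTIONS}


def learn(text, detections, table=None):
    """Write '+' for each (style, connection) pair seen in the reference text.

    detections: (start, end, style) with 1-based inclusive character positions.
    """
    table = initial_table() if table is None else dict(table)
    for _start, end, style in detections:
        following = text[end:]  # text right after the detected expression
        for conn in CANDIDATE_CONNECTIONS:
            if following.startswith(conn):
                table[(style, conn)] = "+"
    return table


REFERENCE = "今は晴れていますが、予報では雨ですので、傘が必要でしょう。"
DETECTIONS = [(6, 8, "polite"), (16, 17, "polite"), (25, 28, "polite")]

for key, condition_type in learn(REFERENCE, DETECTIONS).items():
    print(key, condition_type)
```

With this reference sentence the three polite entries become 「+」 and the three plain entries stay 「-」, which matches the statement that Fig. 5 expresses the same conditions as Fig. 4. How a following string outside the candidate list should be judged is not spelled out in the text, so the sketch does not attempt it.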

When the inspection mode is specified, the position condition judgment means 6 judges whether the position of each expression detected by the style expression detection means 4 satisfies the conditions stored in the position condition storage means 5. A computer CPU or the like is used. In the inspection mode, the text to be inspected is input. The position condition judgment means 6 first receives from the style expression detection means 4 the position at which an expression characterizing a style (plain/polite) was detected. It then reads the conditions for that style from the position condition storage means 5 and, by searching the Japanese text stored in the text storage means 2, judges whether the position of the detected expression satisfies those conditions. The details of this judgment depend on how conditions are described in the position condition storage means 5; for the position condition storage means 5 shown in Fig. 4, the judgment proceeds as in the flowchart of Fig. 6. The position condition judgment means 6 outputs the position of each detected expression together with the judgment result.

There are two ways to control the operation timing of the style expression detection means 4 and the position condition judgment means 6: the position condition judgment means 6 can perform its judgment each time the style expression detection means 4 detects one style-characterizing expression, or it can perform all of its judgments together after the style expression detection means 4 has detected every style-characterizing expression.

Next, the operation of the above sentence inspection device is described using an example. The contents of the style expression storage means 3 are those of the example in Fig. 2. As for operation timing, the position condition judgment means 6 performs its judgments together after the style expression detection means 4 has detected every style-characterizing expression.

First, suppose that the switching means 10 specifies the learning mode, and that the following reference text (15) has been input and stored in the text storage means 2.

今は晴れていますが、予報では雨ですので、傘が必要でしょう。 … (15)

("It is sunny now, but the forecast is for rain, so an umbrella will probably be needed.")

The style expression detection means 4 collates text (15), stored in the text storage means 2, with the character strings stored in the style expression storage means 3, and detects the following character strings as expressions characterizing a style. The brackets give the position of the detected expression (in characters from the beginning of the text) and its style identification.

います [6-8, polite]
です [16-17, polite]
でしょう [25-28, polite]

Based on this result, the position condition learning means 9 extracts the conditions on the positions at which style-characterizing expressions are used. Here, the character string immediately following each of the above polite expressions is taken out up to the punctuation mark, so that conditions like those of Fig. 4 are written into the position condition storage means 5.
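The extraction just described can be traced in a few lines. This is a sketch under the assumption that "up to the punctuation mark" means up to and including the next 「、」 or 「。」; the function names and the tuple layout are illustrative, while the text and positions are those of example (15).

```python
PUNCTUATION = "、。"


def following_up_to_punctuation(text, end):
    """Text after 1-based position `end`, up to and including the next 、 or 。."""
    rest = text[end:]
    for offset, ch in enumerate(rest):
        if ch in PUNCTUATION:
            return rest[: offset + 1]
    return rest  # no punctuation found: keep the remainder as-is


def learn_allowed_connections(text, detections):
    """Collect the strings that directly follow polite expressions."""
    allowed = set()
    for _expr, _start, end, style in detections:
        if style == "polite":
            allowed.add(following_up_to_punctuation(text, end))
    return allowed


TEXT_15 = "今は晴れていますが、予報では雨ですので、傘が必要でしょう。"
DETECTIONS_15 = [
    ("います", 6, 8, "polite"),
    ("です", 16, 17, "polite"),
    ("でしょう", 25, 28, "polite"),
]

print(sorted(learn_allowed_connections(TEXT_15, DETECTIONS_15)))
```

The resulting set contains 「が、」, 「ので、」, and 「。」, the three strings that the description attributes to Fig. 4.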

The setting of the position conditions is now complete. Suppose that the inspection mode is then specified through the switching means 10, and that example text (9) shown earlier is input from the text input means 1 and stored in the text storage means 2. The style expression detection means 4 collates text (9), stored in the text storage means 2, with the character strings stored in the style expression storage means 3, and detects the following character strings as expressions characterizing a style.

いた [7-8, plain]
られます [12-15, polite]
しました [26-29, polite]
した [41-42, plain]
だ [45-45, plain]
せません [55-58, polite]
います [75-77, polite]
です [92-93, polite]

Because the device is now in the inspection mode, the position condition judgment means 6 operates instead of the position condition learning means 9. It compares the character string immediately following each expression detected by the style expression detection means 4 with the character strings stored in the position condition storage means 5, and makes its judgment according to the flowchart of Fig. 6.

Concretely, for [7-8, plain], the character string starting at the 9th character of the text, 「と考え…」, matches none of 「。」, 「が、」, and 「ので、」 stored in the position condition storage means 5. Since the style identification is plain, the condition is judged to be satisfied. For [12-15, polite], the character string starting at the 16th character, 「素粒子の…」, matches none of 「。」, 「が、」, and 「ので、」 stored in the position condition storage means 5. Since the style identification is polite, the condition is judged not to be satisfied. For [26-29, polite], the character string starting at the 30th character, 「。その…」, matches 「。」 stored in the position condition storage means 5. Since the style identification is polite, the condition is judged to be satisfied. The remaining expressions are handled in the same way.

As a result, the position condition judgment means 6 outputs the following information.

[7-8, condition satisfied]
[12-15, condition not satisfied]
[26-29, condition satisfied]
[41-42, condition satisfied]
[45-45, condition not satisfied]
[55-58, condition satisfied]
[75-77, condition satisfied]
[92-93, condition satisfied]

That is, for text (9), the style of the underlined portions in (11) below is found to be faulty, and the positions at which plain and polite forms are used have thus been inspected.

超新星から届いたと考えられます素粒子のデータを収集しました。そのデータは、次に示した通りだが、安易に結論は出せません。他のグループもデータを収集していますが、そのデータとの比較が必要です。 … (11)

The first underlined portion (「られます」) should be 「られる」: a place where the polite form is excessive has been detected. The second underlined portion (「だ」) should be 「です」: a place where the polite form is lacking has been detected.
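The judgment traced above can be reproduced with a short sketch. It follows the rule stated for the Fig. 4 representation: a polite expression must be directly followed by one of the stored strings, and a plain expression must not be. The function and variable names are assumptions for illustration; the text, the detected positions, and the stored strings are taken from the worked example.

```python
# Strings stored in the position condition storage means 5 after learning from (15).
ALLOWED_AFTER_POLITE = {"。", "が、", "ので、"}


def satisfies(text, end, style, allowed=ALLOWED_AFTER_POLITE):
    """Judge one detected expression ending at 1-based position `end`."""
    following = text[end:]
    followed_by_stored = any(following.startswith(a) for a in allowed)
    # Polite forms must be followed by a stored string; plain forms must not be.
    return followed_by_stored if style == "polite" else not followed_by_stored


TEXT_9 = (
    "超新星から届いたと考えられます素粒子のデータを収集しました。"
    "そのデータは、次に示した通りだが、安易に結論は出せません。"
    "他のグループもデータを収集していますが、そのデータとの比較が必要です。"
)
DETECTIONS_9 = [
    (7, 8, "plain"), (12, 15, "polite"), (26, 29, "polite"), (41, 42, "plain"),
    (45, 45, "plain"), (55, 58, "polite"), (75, 77, "polite"), (92, 93, "polite"),
]

results = [(start, end, satisfies(TEXT_9, end, style)) for start, end, style in DETECTIONS_9]
for start, end, ok in results:
    print(f"[{start}-{end}, {'condition satisfied' if ok else 'condition not satisfied'}]")
```

Only the expressions at [12-15] and [45-45] fail, corresponding to 「られます」 (excess politeness) and 「だ」 (missing politeness) in (11).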

The contents stored in the style expression storage means 3 may also be restricted. For example, they may be limited to the expressions characterizing the plain style, as in Fig. 3(a). In that case only places where the polite form is lacking are detected, such as the second underlined portion of (11) (places where the polite form is excessive are not detected). Conversely, if the contents of the style expression storage means 3 are limited to the expressions characterizing the polite style, as in Fig. 3(b), only places where the polite form is excessive are detected, such as the first underlined portion of (11) (places where the polite form is lacking are not detected).

The style identification of the expressions stored in the style expression storage means 3 is not limited to the two classes plain/polite; three or more classes may also be used. Fig. 7 shows an example of the contents of the style expression storage means 3 using three kinds of style identification: plain, polite 1, and polite 2. In Fig. 7, the expressions whose style identification is polite 2 belong to what is called the de gozaimasu style. The conditions can also be subdivided according to the kinds of style identification. Fig. 8 shows an example of the contents of the position condition storage means 5a corresponding to the kinds of style identification in Fig. 7.

Fig. 9 is a block diagram showing the configuration of a second embodiment of the sentence inspection device according to the present invention. The second embodiment adds a word dictionary storage means 7 and a text analysis means 8 to the first embodiment described above.

The word dictionary storage means 7 stores a word dictionary in which at least the written form and the part of speech of Japanese words are registered. An IC memory, a magnetic disk device, a magnetic tape device, an optical disk device, or the like is used. The text analysis means 8 analyzes the text stored in the text storage means 2. A computer CPU or the like is used.

The text is analyzed with reference to the word dictionary stored in the word dictionary storage means 7, and the analysis yields the phrase (bunsetsu) and word units, the parts of speech of the words, and so on. The text analysis means 8 and the word dictionary storage means 7 are known means and can be realized, for example, as in "Storage of a Japanese Dictionary and Automatic Segmentation of Japanese Sentences" (Nagao et al., "Joho Shori" (Information Processing), Vol. 19, No. 6, 1978). The text analysis means 8 writes the analysis result for the text stored in the text storage means 2 back into the text storage means 2, so the text storage means 2 holds not only the character code string of the text but also information on phrase and word units and on the parts of speech of the words. For example, as a result of text analysis, contents such as (12) below are stored for sentence (6). Brackets give the part of speech of each word, and / marks a phrase boundary.

データ [noun] は [case particle] / 次 [noun] に [case particle] / 示 [verb stem] し [godan ending, continuative form] た [auxiliary verb 「た」, attributive form] / 通り [noun] だ [auxiliary verb 「だ」, terminal form] が [conjunctive particle 「が」] 、 [comma] / 安易 [adjectival noun stem] に [adjectival noun ending, continuative form] / 結論 [noun] は [case particle] / 出 [verb stem] せ [shimo-ichidan ending, continuative form] ませ [auxiliary verb 「ます」, irrealis form] ん [auxiliary verb 「ん」, terminal form] 。 [period] … (12)

In the second embodiment, the expressions stored in the style expression storage means 3 and the position condition storage means 5 hold not only the character string of each expression but also the part-of-speech information of the words that make up the expression. Fig. 10 shows an example of the contents of the style expression storage means 3 in the second embodiment. Fig. 11 shows an example of the contents of the position condition storage means 5 in the second embodiment.

When the style expression detection means 4 and the position condition judgment means 6 perform the character string collation described for the first embodiment, they collate the parts of speech of the words as well as the character strings. In addition, the position condition learning means 9 extracts the connection expression immediately following a style expression as a word (or a sequence of words), including its part of speech. As a result, compared with the first embodiment, the second embodiment eliminates detection errors by the style expression detection means 4 and judgment errors by the position condition judgment means 6, and the performance of the sentence inspection device improves.
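To illustrate why collating parts of speech helps, the sketch below compares plain substring matching with matching on (surface, part of speech) pairs. Everything here is hypothetical: the example sentence 「結論はまだ出せません。」 is not from the patent, the tokens are written by hand rather than produced by a morphological analyzer, and the table entries are assumptions, since the contents of Fig. 10 are not reproduced in the text.

```python
# Hand-written analysis result for a hypothetical sentence; a real system
# would obtain these (surface, part of speech) pairs from text analysis.
TOKENS = [
    ("結論", "noun"), ("は", "particle"), ("まだ", "adverb"),
    ("出", "verb stem"), ("せ", "shimo-ichidan ending (continuative)"),
    ("ませ", "auxiliary verb ます (irrealis)"),
    ("ん", "auxiliary verb ん (terminal)"), ("。", "period"),
]

# Assumed entries: (surface, part-of-speech prefix) -> style identification.
POS_STYLE_TABLE = {
    ("だ", "auxiliary verb だ"): "plain",
    ("ませ", "auxiliary verb ます"): "polite",
    ("です", "auxiliary verb です"): "polite",
}


def detect_by_string(text, surfaces):
    """Character-string collation only (first embodiment style)."""
    return [s for s in surfaces if s in text]


def detect_by_pos(tokens, table=POS_STYLE_TABLE):
    """Collate both the surface string and the part of speech."""
    hits = []
    for surface, pos in tokens:
        for (entry_surface, pos_prefix), style in table.items():
            if surface == entry_surface and pos.startswith(pos_prefix):
                hits.append((surface, style))
    return hits


text = "".join(surface for surface, _pos in TOKENS)
print(detect_by_string(text, ["だ"]))  # finds 「だ」 inside the adverb 「まだ」
print(detect_by_pos(TOKENS))           # reports only the auxiliary verb 「ませ」
```

In this hypothetical case the string-only check reports a plain-style 「だ」 that is really part of another word, while the part-of-speech check does not, which is the kind of detection error the second embodiment is said to remove.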

[Effects of the Invention]

As described above, the present invention makes it possible to inspect the positions at which polite and plain forms are used. That is, the style of Japanese text can be inspected not only for mixing of the plain and polite styles, as before, but also for how polite forms are used, which helps in writing more natural Japanese text.

Furthermore, the sentence inspection device of the present invention can inspect style with the conditions on where the polite or plain style may be used changed to suit the user's preference or the text being written. For example, in the first embodiment, if the position condition learning means 9 writes only 「。」 and 「ので、」 to the position condition storage means 5 as conditions, the portions detected as stylistically odd are as shown in (13) below, a result that differs from (11).

超新星から届いたと考えられます素粒子のデータを収集しました。そのデータは、次に示した通りだが、安易に結論は出せません。他のグループもデータを収集していますが、そのデータとの比較が必要です。 …（13）

We collected data on elementary particles thought to have arrived from a supernova. The data are as shown below, but no conclusion can be drawn lightly. Other groups are also collecting data, and a comparison with their data is necessary. … (13)

Similarly, if no expression to be used immediately after the polite style is written to the position condition storage means 5, the detection result is as shown in (14) below. This corresponds to an inspection that aims to unify the entire text in the plain style.

超新星から届いたと考えられます素粒子のデータを収集しました。そのデータは、次に示した通りだが、安易に結論は出せません。他のグループもデータを収集していますが、そのデータとの比較が必要です。 …（14）

We collected data on elementary particles thought to have arrived from a supernova. The data are as shown below, but no conclusion can be drawn lightly. Other groups are also collecting data, and a comparison with their data is necessary. … (14)
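How the stored position conditions change the result for the text of examples (13) and (14) can be sketched in Python as follows. This is an illustration under stated assumptions, not the device itself: the list of polite-style expressions is a hypothetical stand-in for the contents of the style expression storage means 3, collation is by character string only (as in the first embodiment), and the function names are invented here.

```python
from typing import List, Tuple

# Hypothetical polite-style expressions (stand-in for style expression storage means 3)
POLITE_EXPRESSIONS = ["ません", "ました", "ます", "です"]

SAMPLE = (
    "超新星から届いたと考えられます素粒子のデータを収集しました。"
    "そのデータは、次に示した通りだが、安易に結論は出せません。"
    "他のグループもデータを収集していますが、そのデータとの比較が必要です。"
)

def detect_polite(text: str, expressions: List[str]) -> List[Tuple[int, str]]:
    """Scan left to right, preferring the longest expression at each offset."""
    ordered = sorted(expressions, key=len, reverse=True)
    hits = []
    i = 0
    while i < len(text):
        for expr in ordered:
            if text.startswith(expr, i):
                hits.append((i, expr))
                i += len(expr)
                break
        else:
            i += 1  # no expression starts here; move on one character
    return hits

def flag_violations(text: str, allowed_followers: List[str]) -> List[Tuple[int, str]]:
    """Return polite expressions NOT immediately followed by an allowed string."""
    flagged = []
    for start, expr in detect_polite(text, POLITE_EXPRESSIONS):
        end = start + len(expr)
        if not any(text.startswith(f, end) for f in allowed_followers):
            flagged.append((start, expr))
    return flagged

if __name__ == "__main__":
    # Conditions as in (13): polite style allowed only before 「。」 or 「ので、」
    print(flag_violations(SAMPLE, ["。", "ので、"]))
    # Conditions as in (14): no allowed position, so every polite expression is flagged
    print(flag_violations(SAMPLE, []))
```

Under these assumptions, the first call flags the ます in 考えられます (followed by a noun) and the ます in していますが, while the second call flags all five polite expressions, which matches the "unify everything in the plain style" reading of (14).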

Because the user can set the conditions in this way, the result is a sentence inspection device that is highly flexible, able to inspect style according to the user's preference and the text being written, and easy to operate. Moreover, the conditions can be learned automatically from a reference text, so setting them is extremely easy.
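Learning the conditions from a reference text can be sketched in Python along the following lines. The extraction rule used here (take the characters after a polite expression up to and including the next 「。」 or 「、」, within a short window) is an assumption chosen for illustration, as are the expression list, the reference text, and the function names; the rule actually used by the position condition learning means 9 is described with the first embodiment.

```python
from typing import List, Optional, Set

# Hypothetical polite-style expressions (stand-in for style expression storage means 3)
POLITE_EXPRESSIONS = ["ません", "ました", "ます", "です"]

# Hypothetical reference text written in the style the user wants to allow
REFERENCE = "データを収集しました。結果は良好ですので、報告します。比較も行いますが、結論はまだです。"

def follower(text: str, end: int, max_len: int = 4) -> Optional[str]:
    """Assumed rule: text from `end` up to and including the next 。 or 、,
    provided that punctuation mark lies within max_len characters."""
    for j in range(end, min(end + max_len, len(text))):
        if text[j] in "。、":
            return text[end:j + 1]
    return None  # no nearby punctuation, so nothing is learned from this hit

def learn_conditions(reference: str, expressions: List[str]) -> Set[str]:
    """Collect the strings that follow polite expressions in the reference text."""
    ordered = sorted(expressions, key=len, reverse=True)
    learned: Set[str] = set()
    i = 0
    while i < len(reference):
        for expr in ordered:
            if reference.startswith(expr, i):
                i += len(expr)
                found = follower(reference, i)
                if found is not None:
                    learned.add(found)
                break
        else:
            i += 1
    return learned

if __name__ == "__main__":
    print(sorted(learn_conditions(REFERENCE, POLITE_EXPRESSIONS)))
```

In the device described here, the learned set would be written to the position condition storage means 5 in learning mode and consulted in inspection mode.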

[Brief description of drawings]

FIGS. 1 and 9 are block diagrams showing the configurations of embodiments of the present invention. FIGS. 2, 3, 7, and 10 show examples of the contents of the style expression storage means. FIGS. 4, 5, 8, and 11 show examples of the contents of the position condition storage means. FIG. 6 is a flowchart of the determination processing in the position condition determination means.

1: text input means; 2: text storage means; 3: style expression storage means; 4: style expression detection means; 5: position condition storage means; 6: position condition determination means; 7: word dictionary storage means; 8: text analysis means; 9: position condition learning means; 10: switching means.

Claims

[Claims]

1. A sentence inspection device for detecting erroneous or inappropriate portions in input Japanese text, comprising:
style expression storage means for storing expressions that characterize a writing style;
style expression detection means for detecting, in the Japanese text, the expressions stored in the style expression storage means;
switching means for switching between a learning mode and an inspection mode;
position condition learning means for extracting, in the learning mode, conditions on the positions at which the style-characterizing expressions are used, based on the detection results of the style expression detection means;
position condition storage means for storing the conditions extracted by the position condition learning means; and
position condition determination means for determining, in the inspection mode, whether the position of an expression detected by the style expression detection means satisfies the conditions stored in the position condition storage means.