JP2003302997A

JP2003302997A - Speech controller

Info

Publication number: JP2003302997A
Application number: JP2002109169A
Authority: JP
Inventors: Yoshifumi Tanimoto; 好史谷本
Original assignee: Murata Machinery Ltd
Current assignee: Murata Machinery Ltd
Priority date: 2002-04-11
Filing date: 2002-04-11
Publication date: 2003-10-24

Abstract

PROBLEM TO BE SOLVED: To make the contents of voice control easy to change by changing a voice structured document, to make the document easy to upgrade, and to make compilation results easy to share.

SOLUTION: A structured document that structures the operation tree of a device main body 20 is converted into a voice structured document and stored so that it can be input and output freely by communication. An operator's instructions from a microphone are converted into control information by means of the voice structured document to control the device main body 20. The operator's frequency of use for each field and the misrecognition rate are compiled, converted into a structured document, and output. Based on the compilation result, the organization of the tree is changed so that the operator can input more easily, and keywords are changed to make speech recognition easier.

COPYRIGHT: (C)2004,JPO

Description

Detailed Description of the Invention

[0001]

[Field of the Invention] The present invention relates to a voice control device, and more particularly to making the voice control structure easy to change.

[0002]

[Prior Art] It is known to control various devices according to operation commands that an operator inputs by voice through a voice interface using a microphone, a speaker, and the like. In this case the operation commands form, for example, a tree structure, and the order in which commands are input and the wording of the commands to be input are fixed when the device is designed. Consequently, inputting operation commands in an unplanned order, or inputting voice with unplanned keywords, is not permitted as a general rule. Voice control is therefore not as flexible as expected.

[0003]

[Problems to Be Solved by the Invention] The basic object of the present invention is to provide a voice control device with a more comfortable voice interface by making it easy to change the information needed for voice control, such as the tree system of operation commands presented to the operator and the wording of the commands (claims 1 to 3). An additional object of the invention of claim 2 is to make it easy to remotely change, correct, upgrade, and share the voice structured document, which is the core of the voice control device. An additional object of the invention of claim 3 is to make it easy to analyze how the operator actually uses voice control, so that the voice structured document can readily be upgraded to improve usability for the operator.

[0004]

[Configuration of the Invention] The voice control device of the present invention is provided with a voice interface and means for changeably storing a voice structured document in which voice control information is structured. "Changeably" means, for example, that the voice structured document can be modified by input/output or by overwriting, or that a chip or the like storing the voice structured document can be replaced. The invention further provides means for generating control commands for the device main body to be controlled, based on the operator's voice instructions from the voice interface and on the voice structured document (claim 1).

[0005] Preferably, means is provided for inputting and outputting the voice structured document by communication (claim 2).

[0006] Particularly preferably, means for aggregating the contents of the operator's instructions to the voice interface and means for converting the aggregation result into a structured document are provided (claim 3).

[0007]

[Operation and Effects of the Invention] In the present invention, the operator's voice instructions are interpreted by means of the voice structured document. For example, an input such as "yes" or "no" is interpreted as the answer to the preceding question from the voice structured document, and the meaning of a keyword input by the operator is likewise interpreted according to its context in the voice structured document. Because the voice structured document forms a layer separate from the control commands for the device main body, changing the voice structured document easily changes the voice interface presented to the operator, with no need to change the contents or system of the control commands for the device main body (claim 1).
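The layering described in this paragraph can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the table `VOICE_DOCUMENT`, the functions `interpret` and `to_control_command`, and the `SET_...` command format do not appear in the publication. The sketch only shows that the same recognized word is interpreted through the field that asked the question, and that the command layer does not depend on prompt wording.

```python
# Hypothetical sketch: all names below are illustrative, not from the patent.

# Stand-in for the voice structured document layer. Each field has a prompt
# and a type that decides how recognized text is interpreted. Editing this
# table changes the voice interface without touching the command layer.
VOICE_DOCUMENT = {
    "execute": {"prompt": "Execute the copy?", "type": "boolean"},
    "copies": {"prompt": "Number of copies?", "type": "number"},
}


def interpret(field_name, recognized_text):
    """Interpret recognized text in the context of the field that asked."""
    field = VOICE_DOCUMENT[field_name]
    text = recognized_text.strip().lower()
    if field["type"] == "boolean":
        if text in ("yes", "no"):
            return text == "yes"
        raise ValueError(f"expected yes/no for {field_name!r}, got {text!r}")
    if field["type"] == "number":
        return int(text)  # raises ValueError on non-numeric input
    raise ValueError(f"unknown field type {field['type']!r}")


def to_control_command(field_name, value):
    """Device command layer (assumed format), independent of prompt wording."""
    return {"command": f"SET_{field_name.upper()}", "value": value}


command = to_control_command("execute", interpret("execute", "Yes"))
```

Rewording a prompt in `VOICE_DOCUMENT` leaves `to_control_command` untouched, which is the separation the paragraph attributes to claim 1.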

[0008] In the invention of claim 2, the voice structured document can be input and output by communication, so it can be changed, corrected, or upgraded remotely, or shared by inputting it into other voice control devices. These are examples of the effects of inputting and outputting the voice structured document by communication; not all of them need to be carried out.

[0009] In the invention of claim 3, the contents of the operator's instructions to the voice interface are aggregated and converted into a structured document, so the aggregated structured document holds, for example, the operator instruction data for one device. By referring to this structured document, the voice structured document can be upgraded to match how operators actually use the device.

[0010]

[Embodiment] FIGS. 1 to 4 show an embodiment. In the figures, 2 is a voice structured document storage unit, which stores a voice structured document such as VoiceXML. The voice structured document is obtained by structuring the tree system of operation commands for the device main body and converting the result into a voice structured document. Seen from the operator, the contents of the voice structured document are prompts, help, and the like for inputting operation commands. Seen from the control side of the device main body, they are the input sequence of operation commands and the like; the operator's voice input in response to a prompt is converted into an operation command based on the data in the voice structured document.

[0011] 4 is a structured document written in XML or the like, here one that structures the tree of operation commands of the device main body. Document conversion means 6 converts it into a voice structured document, which is stored in the voice structured document storage unit 2. 8 is a speaker, which announces to the operator the prompts, help, and the like described in the voice structured document after DA conversion by the DA converter 7. 10 is a microphone, 12 is an AD converter, 14 is voice recognition means, and 16 is text conversion means, which converts the operator's voice input into text and outputs it. The document conversion means 6 may be omitted.

[0012] 18 is a control unit, which generates control commands for the device main body 20 using the output of the text conversion means 16 and the voice structured document stored in the voice structured document storage unit 2. The device main body 20 is, for example, an image device such as a facsimile machine, a copier, or an image scanner; here it is a multifunction machine combining facsimile, copy, and image scanner functions. The type of the device main body 20 is arbitrary, however; a device that can exchange input and output with the outside world by communication is preferable, and it may be, for example, a digital home appliance. The control unit 18 is also provided with non-voice input means 22 so that input can be made to the control unit 18 when, for example, the voice recognition means 14 cannot recognize speech correctly. The non-voice input means 22 may be input means that the device main body 20 already has, such as a numeric keypad, a touch panel, or various switches.

[0013] 24 is aggregation means, which aggregates the contents of the operator's instructions and, based on the items input from the non-voice input means 22, aggregates cases of misrecognized voice input by content. In aggregating voice input, attention is paid, for example, to cases where the operator input a value other than the default among the items to be input. Such cases are aggregated by type of operation command. For the multifunction machine above, for example, with a default of one copy, the aggregation shows whether the number of copies is one in most cases, so that the operator need only be allowed to input it when necessary, or whether a number other than one is input often enough that it is better to ask for the number of copies every time. Likewise, when the default copy size is A4, the aggregation shows whether keeping A4 as the default is appropriate.
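The kind of aggregation described in this paragraph could be sketched in Python as follows. The log format, the `DEFAULTS` table, the function name `non_default_rates`, and the 50% threshold are all assumptions chosen for illustration; the publication does not specify any of them.

```python
from collections import Counter

# Assumed defaults for two operation commands (illustrative only).
DEFAULTS = {"copies": 1, "size": "A4"}


def non_default_rates(log, defaults):
    """Per operation command, the share of inputs that differ from the default."""
    totals = Counter()
    non_default = Counter()
    for command, value in log:
        totals[command] += 1
        if value != defaults[command]:
            non_default[command] += 1
    return {command: non_default[command] / totals[command] for command in totals}


# Hypothetical usage log: (operation command, value the operator input).
log = [
    ("copies", 1), ("copies", 1), ("copies", 3), ("copies", 1),
    ("size", "A4"), ("size", "A3"),
]
rates = non_default_rates(log, DEFAULTS)

# Assumed policy: ask every time once half of the inputs are non-default.
always_ask = [command for command, rate in rates.items() if rate >= 0.5]
```

With this sample log, the number of copies is rarely changed and could stay optional, while the size is changed often enough that asking every time may be preferable.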

[0014] For example, if copies are usually made as a single copy, at A4 size, with a reduction/enlargement ratio of 100%, a prompt such as "The usual copy?" can be provided, and if the answer (the operator's input) is "yes", it is interpreted as one copy, A4 size, and a 100% ratio. This makes inputting operation commands simpler. If instead the input size is mostly A4 or A3, the number of copies is mostly one, and the ratio is mostly 100% for A4 and 71% for A3, the device may first ask for the input size; for A4 it asks, for example, "One copy at 100%, is that right?" and branches only if the answer is no. For A3 it asks, for example, "One copy at 71%, is that right?" and branches if the answer differs. By aggregating the operator's instructions item by item, the most common patterns of actual use can be identified, and the system of questions (prompts) to the operator can be changed to make input simpler.
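As a rough illustration of how the most common usage pattern might be found from aggregated jobs, the following Python sketch counts setting combinations overall and per paper size. The job records, field names, and prompt wording are hypothetical; the publication describes the idea but not a data format.

```python
from collections import Counter

# Hypothetical job log; the field names are assumptions for illustration.
JOBS = (
    [{"size": "A4", "copies": 1, "ratio": 100}] * 3
    + [{"size": "A3", "copies": 1, "ratio": 71}] * 2
    + [{"size": "A4", "copies": 2, "ratio": 100}]
)


def usual_settings(jobs):
    """Most frequent (copies, size, ratio) combination and its share of jobs."""
    counts = Counter((j["copies"], j["size"], j["ratio"]) for j in jobs)
    pattern, n = counts.most_common(1)[0]
    return pattern, n / len(jobs)


def usual_by_size(jobs):
    """Most frequent (copies, ratio) pair for each paper size."""
    by_size = {}
    for j in jobs:
        by_size.setdefault(j["size"], Counter())[(j["copies"], j["ratio"])] += 1
    return {size: c.most_common(1)[0][0] for size, c in by_size.items()}


def confirmation_prompt(size, usual):
    """Build the yes/no question asked after the operator states the size."""
    copies, ratio = usual[size]
    return f"Copies: {copies}, ratio: {ratio}%. Is that right?"


overall = usual_settings(JOBS)
per_size = usual_by_size(JOBS)
```

The overall result could define a "usual copy" shortcut, while the per-size result supports the size-first dialog with a single confirmation question.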

[0015] Next, by aggregating which operation commands are often cancelled by input from the non-voice input means 22 or the like, and for which operation commands unrecognizable keywords or impossible commands are often input, it is possible to determine for which operation commands and interpretation rules misrecognition of the input speech or misunderstanding of the prompt tends to occur. For example, an answer such as "114%", which belongs to a different prompt, given in response to the prompt asking for the number of copies is a case of an impossible command. When this happens often, it is better to change the wording of the prompt or the order in which control information (commands) is requested. In this way, by replacing prompts and help that tend to cause misrecognition or erroneous input with other prompts and help, the misrecognition rate and erroneous input rate can be reduced.
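One possible way to count such cases is sketched below in Python. The validation patterns, the three outcome labels, and the function names are assumptions for illustration only; a real recognizer would supply its own result codes. An answer that fits a different prompt, such as "114%" given to the copies prompt, is treated here as an impossible command.

```python
import re
from collections import Counter

# Assumed answer formats per prompt (illustrative, not from the patent).
VALIDATORS = {
    "copies": re.compile(r"\d+"),
    "ratio": re.compile(r"\d+%"),
}


def classify(field, answer):
    """Label an answer as ok, impossible (fits another prompt), or unrecognized."""
    if VALIDATORS[field].fullmatch(answer):
        return "ok"
    if any(p.fullmatch(answer) for name, p in VALIDATORS.items() if name != field):
        return "impossible"
    return "unrecognized"


def error_rates(answers):
    """Share of answers per prompt that were not accepted."""
    totals, errors = Counter(), Counter()
    for field, answer in answers:
        totals[field] += 1
        if classify(field, answer) != "ok":
            errors[field] += 1
    return {field: errors[field] / totals[field] for field in totals}


# Hypothetical recognized answers: (prompt that was asked, recognized text).
answers = [
    ("copies", "2"), ("copies", "114%"), ("ratio", "71%"),
    ("copies", "1"), ("copies", "uh"),
]
rates = error_rates(answers)
```

A prompt with a high rate in `rates` would be a candidate for rewording or for being asked at a different point in the sequence.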

[0016] The frequency of the contents of operators' voice input, the misrecognition rate, and the like are preferably aggregated across a plurality of voice control devices, and the voice structured document changed on that basis. For this purpose, aggregate values such as the frequency with which values other than the default were input for each operation command and interpretation rule, and the misrecognition rate for each operation command and interpretation rule, are converted into a structured document such as XML by the structured document creation means 26.
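A minimal sketch of such a conversion, using Python's standard `xml.etree.ElementTree`, is shown below. The element and attribute names (`usage`, `command`, `nonDefaultRate`, `misrecognitionRate`) and the device identifier are invented for illustration; the publication does not define a schema for the aggregated document.

```python
import xml.etree.ElementTree as ET


def stats_to_xml(device_id, stats):
    """Serialize per-command aggregate values as a small XML document."""
    root = ET.Element("usage", {"device": device_id})
    for command, values in sorted(stats.items()):
        node = ET.SubElement(root, "command", {"name": command})
        for key, value in sorted(values.items()):
            # Each aggregate value becomes a child element with 2 decimals.
            ET.SubElement(node, key).text = f"{value:.2f}"
    return ET.tostring(root, encoding="unicode")


# Hypothetical aggregate values for one device.
xml_text = stats_to_xml(
    "mfp-001",
    {"copies": {"nonDefaultRate": 0.25, "misrecognitionRate": 0.5}},
)
```

Because the result is plain XML, documents from several devices could be collected and merged at a service center with ordinary XML tooling.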

[0017] 28 is input/output communication means, which can preferably exchange input and output with the outside world by communication in accordance with HTTP (Hypertext Transfer Protocol) or FTP (File Transfer Protocol), or in a form such as e-mail. The input/output communication means 28 communicates the voice structured document, the structured document on operator usage and problems created by the structured document creation means 26, and the like with the outside world, such as a service center. 30 is editing means used for editing the voice structured document and the like; however, the voice structured document may instead be edited in the outside world, such as at a service center, in which case the editing means need not be provided in the voice control device. In the embodiment, the device main body 20 is a multifunction machine with a facsimile function, so the modem, LAN interface, or the like used for the facsimile function may serve as the input/output communication means 28.

[0018] FIG. 2 shows the voice control device of FIG. 1 in a more schematic, simplified form. The same reference numerals as in FIG. 1 denote the same elements. The core of the voice control device is a voice structured document 32 such as VoiceXML. On one side it is combined with a human interface 34 consisting of a speaker, a microphone, and the like: prompts, help, and the like from the voice structured document 32 are fed to the human interface 34 to inform the operator, and the operator inputs through the human interface 34 operation commands for the device main body 20 or information for generating operation commands. Using this input and the context defined by the voice structured document 32 (the attributes of the field in the voice structured document, used to interpret the voice input data), the control unit 18 generates control commands and controls the device main body 20.

[0019] The voice structured document 32 is obtained by converting into a voice structured document a structured document that structures the operation system for controlling the device main body 20. The voice structured document 32 can be communicated to and from the outside world 38 via the input/output communication means, so that it can be upgraded, changed or corrected, and shared with other voice control devices. Characteristics obtained from the human interface 34, such as which values are frequently specified for which operation commands and which items are prone to misrecognition, are sent to the aggregation means 24 and aggregated; the result is converted into a structured document, made into a database, sent to the outside world 38, and used for purposes such as upgrading the voice structured document 32.

[0020] As described above, in the embodiment the control unit 18 that controls the device main body 20, the voice structured document 32, and the hardware of the human interface 34 are functionally independent, so the voice structured document alone can be changed and the voice interface presented to the operator can be changed virtually freely.

[0021] FIGS. 3 and 4 show an example of inputting a copy control command on a copy/facsimile/image scanner multifunction machine. 40 in FIG. 3 is the key operation tree for the copy operation: to make a copy, it is necessary, for example, to select the magnification, select the number of copies, and press the start key. Written as an XML document, this key operation tree becomes the structured document 42 of FIG. 3. Converting this into a VoiceXML document gives the voice structured document 44 of FIG. 4. Each field represents one item, and there are three items: magnification selection, number-of-copies selection, and selection of whether to execute. For each item, when the operator's voice command "help" is detected, the device announces by voice the range of selectable magnifications, the range of selectable copy counts, or that "yes" or "no" should be input to indicate whether to execute. FIG. 4 also shows an input/output example 46 between the voice interface based on the voice structured document 44 and the operator, in which M (machine) marks speech from the voice interface and O (operator) marks the operator's instructions.
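The figures themselves are not reproduced in this text, so the following Python sketch only illustrates the kind of conversion the paragraph describes. The input schema (`operation` and `item` elements with `prompt` and `help` attributes) is an assumption, as is the wording of the prompts; the output uses the VoiceXML element names `vxml`, `form`, `field`, `prompt`, and `help`, with namespaces and grammars omitted for brevity.

```python
import xml.etree.ElementTree as ET

# Assumed XML form of the copy key operation tree (not the actual FIG. 3).
KEY_TREE_XML = """
<operation name="copy">
  <item name="ratio" prompt="Magnification?" help="Choose from 50% to 400%."/>
  <item name="copies" prompt="Number of copies?" help="Say the number of copies."/>
  <item name="execute" prompt="Execute?" help="Answer yes or no."/>
</operation>
""".strip()


def to_voice_document(tree_xml):
    """Turn each item of the key operation tree into a VoiceXML-style field."""
    operation = ET.fromstring(tree_xml)
    vxml = ET.Element("vxml", {"version": "2.0"})
    form = ET.SubElement(vxml, "form", {"id": operation.get("name")})
    for item in operation.findall("item"):
        field = ET.SubElement(form, "field", {"name": item.get("name")})
        ET.SubElement(field, "prompt").text = item.get("prompt")
        ET.SubElement(field, "help").text = item.get("help")
    return vxml


voice_doc = to_voice_document(KEY_TREE_XML)
field_names = [f.get("name") for f in voice_doc.iter("field")]
```

Each item of the tree becomes one field, so the three-item copy operation yields three fields in the same order.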

[0022] Following the first item in the voice structured document 44, the device prompts "Magnification?"; if the operator asks for help, it announces that the selectable range is 50% to 400%. The operator then specifies a magnification. For the next item, the number of copies, the device prompts "Number of copies?"; if the operator requests help, it announces the range of selectable copy counts, and then accepts the number of copies from the operator. Next it prompts the operator for the last field, whether to execute, and if "execute" is input it responds with, for example, "Copying in progress." Because this procedure requires three inputs, a shortcut input such as "the usual copy" is made possible; this requires defining what "the usual copy" means, and the operator's usage is aggregated for that purpose.
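The field-by-field exchange described above can be simulated with a short Python sketch. The field table, the help wording for the copies and execute fields, and the function `run_dialog` are assumptions for illustration; speech recognition is replaced by a list of already-recognized operator inputs, and the transcript uses the M/O labels from the input/output example.

```python
# Assumed field definitions (illustrative, not the actual FIG. 4 document).
FIELDS = [
    {"name": "ratio", "prompt": "Magnification?", "help": "Choose from 50% to 400%."},
    {"name": "copies", "prompt": "Number of copies?", "help": "Say the number of copies."},
    {"name": "execute", "prompt": "Execute?", "help": "Answer yes or no."},
]


def run_dialog(fields, operator_inputs):
    """Walk the fields in order and return (answers, transcript)."""
    inputs = iter(operator_inputs)
    answers, transcript = {}, []
    for field in fields:
        transcript.append(("M", field["prompt"]))
        # Keep consuming inputs for this field until a non-help answer arrives.
        for said in inputs:
            transcript.append(("O", said))
            if said.lower() == "help":
                transcript.append(("M", field["help"]))
                continue
            answers[field["name"]] = said
            break
    if answers.get("execute") in ("execute", "yes"):
        transcript.append(("M", "Copying in progress."))
    return answers, transcript


answers, transcript = run_dialog(FIELDS, ["help", "141%", "2", "execute"])
```

In this run the operator asks for help once and then gives three answers, which shows why a shortcut that replaces the three inputs with a single confirmation can be attractive.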

[0023] In the embodiment, a new voice input environment can be provided to the operator merely by changing the voice structured document, without changing any other part. A more usable voice interface can therefore be provided. Since the voice structured document can be freely input and output by communication, it can be upgraded, changed, or corrected remotely. When an easy-to-use voice structured document is obtained, for example through editing by the editing means, it can be transferred to other voice control devices and shared among a plurality of voice control devices. Furthermore, by aggregating what items operators input and for what items misrecognition occurs, and turning the result into a structured document, the actual usage by operators and its problems can be compiled into a database, for example across a plurality of control devices. Based on such a database, the contents of the voice structured document can easily be upgraded.

[Brief Description of the Drawings]

FIG. 1 is a block diagram of the voice control device of the embodiment.

FIG. 2 is a diagram showing the operation scheme of the voice control device of the embodiment.

FIG. 3 is a diagram showing the key operation tree of the copier in the embodiment and an example of a structured document that structures it.

FIG. 4 is a diagram showing a voice structured document representing the key operation tree of the copier in the embodiment, and an example of input/output between the copier and the operator based on it.

[Explanation of Reference Numerals]

2 Voice structured document storage unit
4 Structured document
6 Document conversion means
7 DA converter
8 Speaker
10 Microphone
12 AD converter
14 Voice recognition means
16 Text conversion means
18 Control unit
20 Device main body
22 Non-voice input means
24 Aggregation means
26 Structured document creation means
28 Input/output communication means
30 Editing means
32 Voice structured document
34 Human interface
36 Structured document
38 Outside world
40 Key operation tree
42 Structured document
44 Voice structured document
46 Input/output example

Continuation of front page: (51) Int.Cl.⁷, identification code, FI, theme code (reference): G10L 3/00 571V

Claims

[Claims]

1. A voice control device comprising: a voice interface; means for changeably storing a voice structured document in which voice control information is structured; and means for generating a control command for a device main body to be controlled, based on an operator's voice instruction from the voice interface and on the voice structured document.

2. The voice control device according to claim 1, further comprising means for inputting and outputting the voice structured document by communication.

3. The voice control device according to claim 1 or 2, further comprising means for aggregating the contents of the operator's instructions to the voice interface, and means for converting the aggregation result into a structured document.