JP2008039826A

JP2008039826A - Voice guidance apparatus

Info

Publication number: JP2008039826A
Application number: JP2006209838A
Authority: JP
Inventors: Takeshi Kikuta; 雄菊田; Masahiko Ikegami; 征彦池上
Original assignee: ARUNETTO KK
Current assignee: ARUNETTO KK
Priority date: 2006-08-01
Filing date: 2006-08-01
Publication date: 2008-02-21

Abstract

PROBLEM TO BE SOLVED: To provide a voice output apparatus capable of selectively providing sightseeing guidance in a tourist's native language.
SOLUTION: Place name information in voice data of a plurality of languages stored in a data server 1 is detected, the detected place name information is replaced with voice data in a predetermined language, and the resulting plurality of voice data are output simultaneously as voice by an FM output device 2. To impress the place name on the listener, the place name is deliberately output in Japanese and repeated a plurality of times.
COPYRIGHT: (C)2008,JPO&INPIT

Description

The present invention relates to a voice output device, and more particularly to a voice output device used to provide voice guidance in a plurality of different languages.

Tourists who come to Japan from abroad are not necessarily proficient in Japanese. When they visit sightseeing spots around Japan, information such as the current location and the name of the next destination is easier to understand if the guidance is given in their native language.
For this reason, on sightseeing buses for foreign visitors, for example, information on the current location and the next sightseeing spot is announced in a plurality of languages. In general, a recording tape or the like is used to give the same guidance in each language sequentially, that is, one language after another in time.
JP-A-2002-135452

However, with the method of giving the same guidance sequentially in a plurality of languages as described above, the explanations in the individual languages are played in turn, so it takes time for the explanation in each language to finish. As a result, a person whose native language is a particular language hears the necessary guidance at a shifted time.

In addition, when the content of the sightseeing guidance is added to or changed, the added or changed content must be recorded in every language. This generally requires as many speakers as there are languages (or a person who can speak several languages), which increases labor and other costs accordingly.

A voice output device according to the present invention comprises a data storage unit storing a plurality of voice data in different languages; a voice synthesis unit that reads out the plurality of voice data and synthesizes each of the read voice data with voice data of place name information in a predetermined language for output; and a voice output unit that outputs the plurality of synthesized voice data as voice.
The voice synthesis unit performs the synthesis in such a manner that the voice data of the place name information is repeated a plurality of times, and the voice data of the place name information is Japanese voice data.
Another voice output device according to the present invention comprises a data storage unit storing a plurality of voice data in different languages; a place name information detection unit that detects place name information in each of the plurality of voice data; a place name information replacement unit that replaces each detected place name information with voice data in a predetermined language; and a voice output unit that outputs, in a multiplexed manner, predetermined ones of the plurality of voice data in which the place name information has been replaced.
In a preferred embodiment, the voice synthesis unit performs the synthesis in such a manner that the voice data of the place name information is repeated a plurality of times. Likewise, the voice output unit performs the voice output in such a manner that the voice data of the place name information is repeated a plurality of times. The voice data of the place name information is, for example, Japanese voice data.

In this voice output device, the voice output unit includes, for example, a wireless communication unit that performs the voice output by wireless communication. Specifically, an AM/FM digital communication device, for example, is used as the wireless communication unit.

In the present invention, voice data for giving guidance of predetermined content at a sightseeing spot is divided into place name information and the remaining part (common information). For the common information, a plurality of common information voice data are created, each spoken in a different language (for example, Japanese, English, Chinese, or Korean). For the place name information, place name information voice data spoken in a specific language (for example, Japanese) is created. The place name information voice data is then synthesized with each of the common information voice data in a predetermined order, producing a plurality of voice data that share the same place name pronunciation but differ in the language of the common information.
Alternatively, a plurality of voice data are created by speaking the guidance of predetermined content in each of a plurality of languages (for example, Japanese, English, Chinese, and Korean), and these voice data are stored in a storage device. Separately, place name information voice data in which a given place name is pronounced in a specific language (for example, Japanese) is created and likewise stored in the storage device. In use, the place name information in the stored voice data is detected and replaced with the place name information voice data. Predetermined ones of the voice data whose place name information has been replaced in this way are then output as voice.
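The first approach above (a shared place name clip combined with per-language common clips) can be sketched as follows. This is only an illustrative sketch: the names `PLACE_SLOT`, `compose_guidance`, and `compose_all` are hypothetical, and plain strings stand in for recorded voice data, which the source does not specify in this form.

```python
# Hypothetical marker showing where the place name clip belongs in each
# language's sequence of common-information clips.
PLACE_SLOT = None


def compose_guidance(common_clips, place_clip):
    """Fill the place name slot of one language's clip sequence."""
    return [place_clip if clip is PLACE_SLOT else clip for clip in common_clips]


def compose_all(common_by_language, place_clip):
    """Build one guidance sequence per language, all sharing one place clip."""
    return {
        language: compose_guidance(clips, place_clip)
        for language, clips in common_by_language.items()
    }


# Strings stand in for recorded audio (assumption for illustration only).
common_by_language = {
    "ja": ["ja:lead-in", PLACE_SLOT, "ja:closing"],
    "en": ["en:lead-in", PLACE_SLOT, "en:closing"],
    "zh": ["zh:lead-in", PLACE_SLOT, "zh:closing"],
    "ko": ["ko:lead-in", PLACE_SLOT, "ko:closing"],
}
place_clip = "ja:place-name"  # the place name is recorded once, in Japanese

guidance = compose_all(common_by_language, place_clip)
```

With this layout, changing the place name means re-recording only `place_clip`; the per-language common clips are untouched.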

The plurality of voice data are output from the voice output unit in a multiplexed manner, that is, simultaneously. The voice output is preferably performed by wireless communication. Specifically, an AM/FM digital communication device, for example, is used.
For example, when this apparatus uses FM communication, multiplexed output is achieved by transmitting the plurality of voice data simultaneously by FM, each language on a different frequency band. Each user is given a corresponding FM receiver and tunes it to the frequency band assigned to the desired language, and can thereby listen to the voice output in that language.
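The per-language frequency assignment described above can be sketched as a simple lookup. The function names and the frequency values below are hypothetical placeholders; the source does not state which frequencies are used.

```python
def assign_channels(languages, base_khz=76_100, step_khz=400):
    """Give each language its own carrier frequency in kHz.

    base_khz and step_khz are arbitrary illustrative values, not taken
    from the source.
    """
    if len(set(languages)) != len(languages):
        raise ValueError("each language must be listed only once")
    return {lang: base_khz + i * step_khz for i, lang in enumerate(languages)}


def broadcast(guidance_by_language, channels):
    """Pair each language's guidance with its carrier.

    Every entry of the result is on air at the same time, which models
    the multiplexed (simultaneous) output.
    """
    return {channels[lang]: clip for lang, clip in guidance_by_language.items()}


def receive(on_air, channels, language):
    """Model a listener tuning a receiver to the chosen language's carrier."""
    return on_air[channels[language]]


channels = assign_channels(["ja", "en", "zh", "ko"])
on_air = broadcast(
    {"ja": "ja guidance", "en": "en guidance",
     "zh": "zh guidance", "ko": "ko guidance"},
    channels,
)
heard = receive(on_air, channels, "en")
```

Integer kilohertz values are used here rather than floating-point megahertz so that the carriers compare exactly.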

According to the present invention, voice output in a plurality of different languages can be performed simultaneously as described above, which avoids the shift in guidance timing that occurs in the conventional method.
When the place name information changes, or when the language used for the place name information is changed, only the place name information voice data needs to be changed. If the place name information voice data is in Japanese, for example, the new data can be created by a single Japanese speaker, so only one person's labor cost is incurred and the change can be made at low cost.
Giving only the place name in a specific language (for example, the language of the country in which the guidance is given) has a further effect. For example, within English guidance, outputting only the place name in Japanese (with Japanese pronunciation), whose intonation and accent differ from those of English, draws the listener's attention to the place name, and repeating it several times, preferably two or three times, calls further attention to it. Because place name information is particularly important in sightseeing guidance, heightening the listener's awareness in this way has the advantage of making the name easier to remember.

An embodiment of the present invention is described below.
FIG. 1 shows a voice output device according to an embodiment of the present invention. This voice output device is composed of a data server 1 and an FM output device 2. The data server 1 includes a voice data storage unit 11, a place name information detection unit 12, a place name information replacement unit 13, a voice data synthesis unit 14, and the like.

The voice data storage unit 11 is configured by a storage device such as a RAM or an HD (hard disk), and stores voice data (for example, digital data in MP3 format, WMP format, or the like) for voice output of predetermined guidance in a plurality of different languages (for example, Japanese, English, Chinese, and Korean). The voice data are stored in such a manner that at least the place name information voice data corresponding to the place name information can be detected and replaced. More specifically, the place name information voice data is stored separately, as data, from the voice data for the other information (common information voice data).

In this embodiment, all of the place name information is in Japanese, while the common information is in each of the languages mentioned above, for example Japanese, English, Chinese, and Korean. The predetermined guidance is given by combining the common information in each language, in a predetermined manner, with the place name information output as Japanese speech.

The place name information detection unit 12 is a function realizing means configured by, for example, a CPU and the required computer program, and detects the place name information voice data among the voice data stored in the voice data storage unit 11. The detection is performed, for example, by attaching a specific header to each place name information voice data in advance and checking for the presence or absence of this header.

The place name information replacement unit 13 is likewise a function realizing means configured by, for example, a CPU and the required computer program, and replaces the place name information voice data detected by the place name information detection unit 12 with other place name information voice data. Through the replacement, the original data is deleted and the new data is saved.
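The header-based detection and the subsequent replacement can be sketched as below. The header value `b"PLCE"`, the function names, and the byte-string clip layout are all assumptions made for illustration; the source says only that a specific header is attached to each place name clip.

```python
PLACE_HEADER = b"PLCE"  # hypothetical marker; the source does not define one


def is_place_name(clip: bytes) -> bool:
    """Detection step: a clip is a place name clip if it carries the header."""
    return clip.startswith(PLACE_HEADER)


def detect_place_names(clips):
    """Return the positions of all place name clips in the stored data."""
    return [i for i, clip in enumerate(clips) if is_place_name(clip)]


def replace_place_names(clips, new_payload: bytes):
    """Replacement step: swap every detected clip for the new recording.

    The old clip is dropped and the new one, tagged with the same header,
    takes its position; common-information clips are left unchanged.
    """
    replacement = PLACE_HEADER + new_payload
    return [replacement if is_place_name(clip) else clip for clip in clips]


# Byte strings stand in for encoded audio (assumption for illustration).
stored = [b"en:lead-in", PLACE_HEADER + b"en:place-name", b"en:closing"]
positions = detect_place_names(stored)
updated = replace_place_names(stored, b"ja:place-name")
```

Because the replacement keeps the header, the new clip can itself be detected and replaced later, for example when the place name changes.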

The voice data synthesis unit 14 is likewise a function realizing means configured by, for example, a CPU and the required computer program, and has the function of reading out the voice data stored in the voice data storage unit 11 in the required order, synthesizing them, and outputting the result.
The above is an example in which the place name information detection unit 12 and the place name information replacement unit 13 are combined to replace the place name information voice data in the voice data with predetermined place name information voice data. Instead, the following method may be used without the place name information detection unit 12 and the place name information replacement unit 13.
That is, when the voice data synthesis unit 14 reads out the voice data stored in the voice data storage unit 11, the voice data synthesis unit 14 may synthesize the predetermined place name information voice data with the common information voice data and output the result.

The FM output device 2 includes voice data conversion units 211 to 214, AM/FM digital communication devices 221 to 224, FM transmitters 231 to 234, and the like. The voice data conversion units 211 to 214 convert the voice data in four different languages, transmitted from the voice data synthesis unit 14 of the data server 1, into audio suitable for FM transmission, for example analog audio. The AM/FM digital communication devices 221 to 224 convert the converted audio into AM or FM signals; in this embodiment, they convert it into FM signals. The FM transmitters 231 to 234 transmit the converted signals on FM radio waves of predetermined, mutually different frequencies.

The apparatus of this embodiment is installed and used, for example, on a sightseeing bus carrying foreign passengers. In this case, a foreign passenger who needs guidance uses his or her own FM receiver and tunes it to the frequency corresponding to the desired language to listen to the guidance in that language. FM receivers may also be lent or sold as needed so that passengers can listen to the voice information in the language they need.

FIGS. 3(a) to 3(d) show an example of guidance given in four different languages: from top to bottom, Japanese, English, Chinese, and Korean. In this guidance, only the portion "Tokyo", which corresponds to the place name, is common to all four; the remaining portions are sentences in the respective languages.

In this example, "Tokyo" is pronounced twice in each guidance, in its Japanese pronunciation. Pronouncing the name twice in succession lets the listener recognize the place name more reliably than when it is pronounced only once: even if the listener misses the first utterance, the same place name immediately follows a second time. The name may also be repeated three times, but repeating it four or more times becomes tedious and is not preferable.
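The repetition rule above can be sketched as a small helper. The function name, the `MAX_REPEATS` limit enforced as a hard error, and the sample sentence are assumptions for illustration; the source only states that two or three repetitions are preferred and that four or more are not.

```python
MAX_REPEATS = 3  # assumption: treat "four or more is tedious" as a hard limit


def repeat_place_names(sequence, place_clips, times=2):
    """Repeat each place name clip `times` times in succession.

    Clips not listed in `place_clips` are passed through once, unchanged.
    """
    if not 1 <= times <= MAX_REPEATS:
        raise ValueError(f"times must be between 1 and {MAX_REPEATS}")
    out = []
    for clip in sequence:
        out.extend([clip] * times if clip in place_clips else [clip])
    return out


# Hypothetical guidance; strings stand in for audio clips.
english = ["We will soon arrive at", "Tokyo", "."]
spoken = repeat_place_names(english, {"Tokyo"})
```

Rejecting values above three is a design choice of this sketch; an implementation could equally allow them and leave the decision to the operator.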

FIG. 1 is an explanatory diagram showing an embodiment of the voice guidance apparatus of the present invention.
FIG. 2 is an explanatory diagram showing a specific configuration example of the data server of FIG. 1.
FIGS. 3(a) to 3(d) are explanatory diagrams showing an example of sightseeing guidance in Japanese, English, Chinese, and Korean, respectively.

Explanation of symbols

1 Data server
2 FM output device
11 Voice data storage unit
12 Place name information detection unit
13 Place name information replacement unit

Claims

1. A voice output device comprising:
a data storage unit storing a plurality of voice data in different languages;
a voice synthesis unit that reads out the plurality of voice data and synthesizes each of the read voice data with voice data of place name information in a predetermined language for output; and
a voice output unit that outputs the plurality of synthesized voice data as voice.

2. The voice output device according to claim 1, wherein the voice synthesis unit performs the synthesis in such a manner that the voice data of the place name information is repeated a plurality of times.

3. The voice output device according to claim 1 or 2, wherein the voice data of the place name information is Japanese voice data.

4. A voice output device comprising:
a data storage unit storing a plurality of voice data in different languages;
a place name information detection unit that detects place name information in each of the plurality of voice data;
a place name information replacement unit that replaces each detected place name information with voice data in a predetermined language; and
a voice output unit that outputs, in a multiplexed manner, predetermined ones of the plurality of voice data in which the place name information has been replaced.

5. The voice output device according to claim 4, wherein the voice output unit performs the voice output in such a manner that the voice data of the place name information is repeated a plurality of times.

6. The voice output device according to claim 4 or 5, wherein the voice data of the place name information is Japanese voice data.

7. The voice output device according to any one of claims 1 to 6, wherein the voice output unit includes a wireless communication unit that performs the voice output by wireless communication.

8. The voice output device according to claim 7, wherein the wireless communication unit is an AM/FM digital communication device.