JP2000081897A

JP2000081897A - Method of recording speech information, speech information recording medium, and method and device of reproducing speech information

Info

Publication number: JP2000081897A
Application number: JP10249672A
Authority: JP
Inventors: Hiroshi Sekiguchi; 博司関口
Original assignee: KANAASU DATA KK
Current assignee: KANAASU DATA KK
Priority date: 1998-09-03
Filing date: 1998-09-03
Publication date: 2000-03-21
Anticipated expiration: 2018-09-03
Also published as: JP3617603B2

Abstract

PROBLEM TO BE SOLVED: To obtain, as listening-practice audio for Japanese learners of English, speech information whose reproduction time is extended or shortened and whose amplitude is emphasized or attenuated in arbitrary parts, without altering the frequency components of the original speech information. SOLUTION: In this recording method, a first speech information sequence sampled at a first period is divided into a plurality of frequency components. For each frequency component, sinusoidal wave data is generated in which the amplitude and the number of waveforms of a predetermined part have been changed relative to an amplitude information sequence extracted successively at a second period. A second speech information sequence, synthesized by adding together the sinusoidal wave data for the individual frequency components, is recorded on a predetermined recording medium 15. The recording medium 15 obtained in this way holds a speech information sequence that is extended or shortened and emphasized or attenuated in arbitrary parts without any change in frequency.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

TECHNICAL FIELD The present invention relates to a method of recording audio information on a disc-shaped recording medium such as a CD-ROM, MD, or MO, or on a tape-shaped recording medium such as a DAT; to an audio information recording medium on which that audio information is recorded; and to a method and a device for reproducing audio information by reading and reproducing the audio information sequence recorded on that medium.

[0002]

2. Description of the Related Art Various teaching materials in which audio information is recorded on a recording medium such as a cassette tape have long been available for self-study of languages such as English conversation, for practicing shigin (poetry recitation), for self-study of law, and for other purposes. Taking self-study material for English conversation as an example, the main conventional recording medium is a cassette tape (or record) on which a series of English utterances (audio information) is recorded, and the learner uses this tape together with a textbook. Such materials are available at various levels, from beginner to advanced.

[0003] Japanese Patent No. 2581700 proposes an information recording medium such as a CD-ROM that includes at least three areas: a first area in which an audio information sequence suited to advanced learners (speech uttered at natural speed), divided into a plurality of sections, is recorded; a second area in which an audio information sequence suited to beginners (clearly enunciated speech that is linguistically the same in meaning but derived differently), made up of equivalent sections corresponding to those sections, is recorded; and a third area in which information is recorded that expresses the relationship between corresponding sections of the advanced and beginner audio information sequences in terms of the recording position of each section on the medium. The same patent also proposes a reproducing method that includes switching reproduction between corresponding sections of a medium with this structure.

[0004]

PROBLEMS TO BE SOLVED BY THE INVENTION As described above, on the information recording medium of Japanese Patent No. 2581700, speech uttered by a native speaker is recorded in the first area, and an audio information sequence with the same linguistic meaning but spoken slowly is recorded in the second area. Therefore, when the learner fails to catch the sound while the sequence in the first area is being reproduced, switching to the sequence with the same content in the second area lets the learner understand the meaning of the speech that was missed. (The correspondence between the section of the first sequence being reproduced and the section of the second sequence to be reproduced is recorded in the third area.)

[0005] However, although listening to the information in the second area lets an English learner understand the information recorded in the first area, the learner still does not become able to hear the first-area information itself. In particular, sounds that cannot be caught do not become audible simply through repeated listening. It is well known that Japanese learners of English have difficulty hearing phonemes that do not exist in Japanese, especially consonants, and that this hinders conversation with native speakers.

[0006] The present invention relates to a technique for improving listening ability for the original speech by having the learner listen to speech edited in advance so that hard-to-hear parts are easier to hear. Its object is to provide, as listening-practice audio for English learners, a method of recording audio information in which the amplitude of the frequency components and the reproduction time are selectively edited without changing the frequency components of the original audio information, together with an audio information recording medium and a method and device for reproducing audio information.

[0007]

SUMMARY OF THE INVENTION The present invention relates to a technique for newly generating, recording, and reproducing, as listening-practice audio, an audio information sequence in which desired parts of a captured audio information sequence are emphasized or attenuated, or in which the reproduction time is partially extended or shortened, without changing the frequency components of the captured sequence. To keep the sound quality of the reproduced audio unchanged, the invention applies the desired edits not to the sampled audio information but to each of its frequency components, and then synthesizes the edited frequency components to obtain a new audio information sequence. This configuration makes it possible to provide listening-practice audio in which the parts that Japanese learners of English find hard to hear are selectively emphasized and/or extended. Conversely, for advanced learners who wish to improve their listening ability further, it makes it possible to provide audio that is selectively attenuated or whose reproduction time is shortened.

[0008] Specifically, the method of recording audio information according to the invention divides a first audio information sequence, sampled at a first period (for example, the 44.1 kHz audio clock of a music CD), into a plurality of frequency components (hereinafter called channels), and obtains amplitude information for each channel at a second period (for example, corresponding to the number of data points needed to form one waveform). This amplitude information is given by the amount of amplitude variation of the waveform spanning, for example, 100 data points of the first audio information sequence. If 100 data points do not form one complete waveform, the number of data points is increased (the second period is lengthened) until one waveform is formed. The second period need only be a period with regularity.
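The per-channel amplitude extraction described in [0008] can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes the "amplitude variation" is the peak-to-peak swing inside each window, and the function name `extract_amplitudes` and its parameters are hypothetical.

```python
import math

def extract_amplitudes(channel_samples, samples_per_cycle, base_window=100):
    """Return one peak-to-peak amplitude per analysis window for one channel.

    channel_samples   -- band-limited samples of one channel at 44.1 kHz
    samples_per_cycle -- samples in one cycle of the channel's center frequency
    base_window       -- nominal window length (100 samples)
    """
    # Assumption: widen the window when 100 samples cannot hold one full cycle.
    window = max(base_window, math.ceil(samples_per_cycle))
    amplitudes = []
    for start in range(0, len(channel_samples) - window + 1, window):
        frame = channel_samples[start:start + window]
        # Assumption: "amplitude variation" is taken as the peak-to-peak swing.
        amplitudes.append(max(frame) - min(frame))
    return amplitudes

# A 441 Hz tone has exactly 100 samples per cycle at 44.1 kHz.
tone = [0.5 * math.sin(2 * math.pi * 441 * n / 44100) for n in range(400)]
amps = extract_amplitudes(tone, samples_per_cycle=100)
```

For a low channel whose cycle is longer than 100 samples, the same call simply uses a longer window, which corresponds to lengthening the second period for that case.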

[0009] Next, for the amplitude information sequence of each channel obtained in this way (the sequence of amplitude information extracted at the second period for that channel), a plurality of corrected amplitude information sequences are generated by editing that selectively changes the amplitude information. One corrected amplitude information sequence is obtained for each channel, that is, for each frequency component. V data is then generated. It consists of information component groups, each made up of the amplitude information extracted at the same timing across the corrected amplitude information sequences of all channels, and of control information, prepared for each information component group, that instructs extension or shortening of the audio reproduction time relative to the first period.

[0010] Subsequently, from the V data generated as second-period data, sine wave data is generated for each channel. This sine wave data has the (corrected) amplitude given by the V data and the data interval of the first period, and contains the number of waves corresponding to the reproduction time specified by the control information. The sine wave data generated for the individual channels is added together in sequence to produce audio data at the first period (the second audio information sequence), and this audio data is recorded on a predetermined recording medium.
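The synthesis step in [0010] amounts to summing one sinusoid per channel. The sketch below assumes each channel is rendered at its fixed center frequency and that phase is derived from a running sample index so that consecutive frames join smoothly; `synthesize_frame` and its parameters are illustrative names, not taken from the patent.

```python
import math

SAMPLE_RATE = 44100  # first period, in Hz

def synthesize_frame(amplitudes, center_freqs, n_samples, start_sample=0):
    """Sum one sinusoid per channel over n_samples output samples.

    Changing n_samples changes how long the frame plays (more or fewer
    waves) while every channel keeps its own frequency.
    """
    out = []
    for n in range(start_sample, start_sample + n_samples):
        t = n / SAMPLE_RATE
        out.append(sum(a * math.sin(2 * math.pi * f * t)
                       for a, f in zip(amplitudes, center_freqs)))
    return out

normal = synthesize_frame([1.0], [441.0], n_samples=100)
stretched = synthesize_frame([1.0, 0.5], [441.0, 882.0], n_samples=200)
```

Rendering a frame over 200 samples instead of 100 doubles its duration without touching the channel frequencies, which is the property the paragraph relies on.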

[0011] In the recording method according to the invention, each item of amplitude information in the amplitude information sequence of each channel, extracted at the second period, is edited so that arbitrary parts are selectively emphasized or attenuated. That is, the corrected amplitude information sequences are generated by selectively resetting, to larger or smaller values, the amplitude values given by the amplitude information of predetermined parts that correspond to one another across the channels. In addition, to avoid unnatural amplitude changes in the reproduced sound, the method is characterized in that, for each channel, each amplitude of the generated sine wave data is determined by a value obtained by linear interpolation between adjacent items of amplitude information in the corrected amplitude information sequence.
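The linear interpolation mentioned in [0011] can be illustrated with a short sketch. It assumes the amplitude ramps from one frame's value toward the next frame's value across the samples of the frame; the function name is hypothetical.

```python
def interpolate_amplitudes(a_start, a_end, n_samples):
    """Ramp linearly from a_start toward a_end over n_samples samples.

    The value a_end itself is the first sample of the following frame,
    so it is not included here.
    """
    step = (a_end - a_start) / n_samples
    return [a_start + step * i for i in range(n_samples)]

ramp = interpolate_amplitudes(0.0, 1.0, 4)
```

Each per-sample amplitude would then scale the corresponding sample of that channel's sine wave, so the envelope changes gradually instead of jumping at frame boundaries.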

[0012] As described above, the recording method according to the invention is configured to change the amplitude of arbitrary parts of the amplitude information sequence generated for each channel. In addition, control information instructing extension or shortening of the reproduction time is prepared for each information component group, which gathers the amplitude information of all channels extracted at the second period. This makes it possible to selectively emphasize or attenuate the reproduced speech in arbitrary parts, and also to partially extend or shorten the reproduction time, without changing the frequency components.

[0013] The reason is as follows. Even when the aim is simply to let listeners, mainly Japanese, hear natural-speed English played back slowly, it is not enough to extend or shorten the audio reproduction time simply and uniformly for every frequency component, because for some kinds of speech sound the temporal change of the spectrum in the consonant part can signify a different sound linguistically. For example, in the pronunciations BA and PA the spectra themselves have almost the same shape; the only difference is that the spectrum changes quickly in the former and slowly in the latter. Consequently, if BA is extended in time including its consonant part, it will be heard as PA. To prevent this, the extension of the consonant part is held within the limit at which it is still heard as BA, and only the vowel part is extended or shortened to the desired reproduction time, so that the sound continues to be heard as BA. A vowel part, on the other hand, is still heard as the same vowel however much it is extended or shortened, so it can be set to any desired length (desired reproduction time). It is also necessary to selectively emphasize, by a factor of two or three, only the quiet consonant sounds that are too weak for Japanese listeners to catch. Emphasizing the vowel parts as well would make the whole too loud and defeat the purpose, so the emphasis must be selective. For these reasons, the method of reproducing audio information according to the invention is configured to edit the amplitude information sequence of each channel into a corrected amplitude information sequence in which the parts that beginners find especially hard to hear are selectively emphasized, and to sequentially record, together with the V data made up of the amplitude information generated at the same timing in the corrected amplitude information sequences of the channels, control information instructing extension of the reproduction time. Conversely, for advanced learners, the audio information sequence may be edited selectively, taking into account the characteristics of each speech sound described above, so that the reproduced speech is attenuated or the reproduction time is shortened in desired parts.
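The selective editing argued for in [0013] can be pictured as a per-frame plan in which consonant frames get a capped stretch and an amplitude boost, while vowel frames get the full stretch and no boost. The sketch below is only an illustration: the frame labels, the limit value, and the function name `plan_frame_edits` are hypothetical, and the patent describes this editing as an operator-driven step rather than an automatic rule.

```python
def plan_frame_edits(labels, target_stretch, consonant_stretch_limit, consonant_gain):
    """Return a (stretch, gain) pair per frame.

    labels -- "C" for a consonant frame, "V" for a vowel frame (hypothetical input)
    """
    edits = []
    for label in labels:
        if label == "C":
            # Keep the consonant within the range where it is still heard correctly.
            stretch = min(target_stretch, consonant_stretch_limit)
            edits.append((stretch, consonant_gain))
        else:
            # Vowels tolerate any stretch and are not emphasized.
            edits.append((target_stretch, 1.0))
    return edits

edits = plan_frame_edits(["C", "V", "V"], target_stretch=2.0,
                         consonant_stretch_limit=1.2, consonant_gain=2.0)
```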

[0014] Furthermore, when a male voice has been recorded on a predetermined recording medium by the recording method described above and is reproduced with the reproduction time extended, the listener may have the illusion that the sound has shifted to a lower pitch, even though the frequency spectrum of the output speech is unchanged. Conversely, when the voice is reproduced with the reproduction time shortened, the listener may have the illusion that it has shifted to a higher pitch. The control information therefore preferably includes frequency shift instruction information that allows all frequency components to be shifted toward the treble or the bass by about a semitone or a whole tone on reproduction.
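In 12-tone equal temperament a semitone corresponds to a frequency ratio of 2^(1/12) and a whole tone to 2^(2/12), so the shift described in [0014] can be sketched as a single multiplication. The function name below is hypothetical, and applying the same ratio to every channel is an assumption about how "shifting all frequency components" would be realized.

```python
def shift_frequency(freq_hz, semitones):
    """Shift a frequency by a number of equal-tempered semitones.

    Positive values move toward the treble, negative values toward the bass.
    """
    return freq_hz * 2 ** (semitones / 12)

up_one_semitone = shift_frequency(77.78, 1)
up_one_octave = shift_frequency(440.0, 12)
```

Because the channel center frequencies are themselves spaced a semitone apart, a one-semitone shift moves each channel onto the center frequency of its neighbor.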

[0015] A dramatic learning effect can also be expected when the technique of the invention is combined with the technique disclosed in Japanese Patent No. 2581700 mentioned above. That is, audio information in which arbitrary parts are extended and/or emphasized is prepared separately, corresponding to variable-length sections obtained by dividing a native speaker's speech at natural breaks in utterance. The learner can then repeatedly reproduce and listen to speech that was not caught, and can also listen to a version in which the hard-to-hear parts of that speech are emphasized and extended, so improved listening ability for the original speech can be expected. For advanced learners who want to raise the learning effect more actively, audio information in which arbitrary parts are shortened and/or attenuated can be prepared separately alongside the sectioned native-speaker speech. This makes it possible to deliberately shorten the reproduction time or to make consonant parts harder to hear (by reducing their amplitude), enabling study in combination with the native speaker's speech.

[0016] Specifically, the first audio information sequence is an audio information sequence corresponding to one or more sentences, each composed of a string of words to be reproduced and output by predetermined audio reproducing means, and it is recorded on the recording medium divided into variable-length sections, one for each piece of information separated at a break in pronunciation. The second audio information sequence is accordingly recorded on the predetermined recording medium in sections that correspond to the sections of the first audio information sequence. The recording medium further stores recording position identification information that identifies each switchable section by its recording position on the medium, so that the predetermined audio reproducing means can switch between the first and second audio information sequences during reproduction. Recording in advance the correspondence between the sections of the first audio information sequence and those of the second in this way makes it possible to reproduce one or more desired sections repeatedly, and to switch in real time between natural-speed speech and the same utterance prepared for each learner's level.

[0017] The recording method according to the invention thus yields an audio information recording medium on which predetermined audio information (not waveform data, but the corrected amplitude information sequence of each frequency component) is recorded.

[0018] Applicable recording media for such audio information include disc-shaped recording media such as CD-ROM, MD, and MO, and tape-shaped recording media such as DAT. Such a recording medium necessarily holds at least a second audio information sequence produced as follows: a first audio information sequence sampled at a first period is divided into a plurality of frequency components; for each frequency component, sine wave data is generated in which the amplitude of a predetermined part and the number of waveforms of a predetermined part are changed relative to the amplitude information sequence extracted successively at a second period; and the sine wave data for the individual frequency components is added together. In other words, the second audio information sequence recorded on the medium is one in which amplitude and reproduction time have been selectively edited by changing at least the amplitude or the number of waveforms, for parts that correspond to one another across the frequency components making up the first audio information sequence sampled at the predetermined period.

[0019] Furthermore, in the audio information recording medium according to the invention, the first audio information sequence, which corresponds to one or more sentences each composed of a string of words to be reproduced and output by predetermined audio reproducing means, is recorded divided into variable-length sections, one for each piece of information separated at a break in pronunciation. This allows the medium to be combined with the technique disclosed in Japanese Patent No. 2581700 mentioned above.

[0020] On an audio information recording medium configured as above, the second audio information sequence is recorded together with the first, in sections that correspond to the sections of the first audio information sequence. The medium also stores recording position identification information that identifies each switchable section by its recording position on the medium, so that the predetermined audio reproducing means can switch between the two sequences during reproduction. With such a medium, the method and device for reproducing audio information according to the invention can switch in real time, even while one audio information sequence is being reproduced, to the corresponding section of the other sequence.
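One way to picture the recording position identification information in [0020] is as a table of section start positions in the two sequences. The layout below (a sorted list of start-position pairs) and the function name are assumptions made for illustration; the patent does not specify the table format here.

```python
from bisect import bisect_right

def corresponding_position(section_table, pos_in_first):
    """Map a playback position in the first sequence to its section.

    section_table -- sorted list of (start_in_first, start_in_second) pairs,
                     one per section (hypothetical layout)
    Returns (section_index, start_in_second) for the section containing
    pos_in_first.
    """
    starts = [start_first for start_first, _ in section_table]
    index = bisect_right(starts, pos_in_first) - 1
    if index < 0:
        raise ValueError("position precedes the first section")
    return index, section_table[index][1]

table = [(0, 0), (1000, 2500), (2400, 6000)]
section, jump_to = corresponding_position(table, 1500)
```

A player that misses something at position 1500 of the natural-speed sequence would look up section 1 and restart from position 2500 of the edited sequence.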

[0021] The embodiments of the invention described above may be commercialized in several forms: sale of recording software (a program that allows the above recording method to be carried out on a personal computer or the like, or a recording medium on which that program is recorded), a dedicated recording device, a user manual, or any combination of these; sale of the audio information recording medium on its own; and sale of the audio information recording medium, audio reproduction software (including a program executable on a personal computer or the like, or a recording medium on which that program is recorded), a dedicated reproducing device, a user manual, or any combination of these.

[0022]

DESCRIPTION OF THE EMBODIMENTS An embodiment of the invention is described below with reference to FIGS. 1 to 14. In the drawings, identical parts are given identical reference numerals, and duplicate description is omitted.

[0023] The present invention is a technique that makes it possible, for example in listening practice for English learners, to provide audio information in which hard-to-hear parts have been selectively emphasized or attenuated in advance, or in which the reproduction time has been extended or shortened. A learner who listens to audio edited in advance in this way can be expected to improve in listening ability for the original speech.

[0024] FIG. 1 is a conceptual diagram that schematically explains the operation of recording audio information according to the invention. First, natural-speed speech by a native speaker (the first audio information) is sampled via a microphone 11 or the like at, for example, the 44.1 kHz audio clock of a music CD (the first period), taken into the main body of the PC 1, and temporarily stored on a hard disk or the like. The captured audio information is then filtered to divide it into the channels (frequency components) defined in the table of FIG. 2. The frequency range of the captured audio information is 75 Hz to 10,000 Hz, and the sampling frequency is 44.1 kHz (a period of 22.68 μs), matching the audio clock of a music CD. The number of channels is 85 (7 octaves plus 1 note), and the center frequencies (center f) of channels #1 to #85 are set to form a semitone series in equal temperament (12-tone equal temperament per octave), from 77.78 Hz (D#) to 9,960 Hz (D#).
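The channel layout in [0024] can be reproduced numerically: 85 center frequencies spaced one equal-tempered semitone apart, starting at 77.78 Hz. The sketch below assumes exact 2^(1/12) spacing; the actual table in FIG. 2 is not reproduced here and may use rounded values.

```python
def channel_center_frequencies(base_hz=77.78, n_channels=85):
    """Center frequencies on a 12-tone equal-tempered semitone grid."""
    return [base_hz * 2 ** (k / 12) for k in range(n_channels)]

freqs = channel_center_frequencies()
lowest, highest = freqs[0], freqs[-1]
```

Channel #85 lies 84 semitones (7 octaves) above channel #1, so its center frequency is 77.78 × 128 ≈ 9,956 Hz, consistent with the roughly 9,960 Hz quoted in the text.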

[0025] From the data divided into channels #1 to #85 in this way, amplitude information is extracted every 2.268 ms (equivalent to 100 data points at 44.1 kHz sampling; where 100 data points cannot form one waveform, the number of data points is increased). In this embodiment, therefore, the sampling rate of the amplitude information in each of channels #1 to #85 (the second period) is 441 samples/s (2.268 ms). This sampling rate need only have a regular period. For example, an embodiment may alternate between different rates, capturing and processing 100 data points, then 120 data points, and repeating.
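The timing figures in [0025] follow directly from the 44.1 kHz clock and the 100-sample window, as the short calculation below shows. The variable names are illustrative only.

```python
SAMPLE_RATE_HZ = 44100   # audio clock of a music CD
FRAME_SAMPLES = 100      # data points per amplitude value

sample_period_us = 1e6 / SAMPLE_RATE_HZ                 # about 22.68 microseconds
frame_period_ms = 1e3 * FRAME_SAMPLES / SAMPLE_RATE_HZ  # about 2.268 ms
frames_per_second = SAMPLE_RATE_HZ / FRAME_SAMPLES      # 441 amplitude values per second

# One cycle of the lowest channel (77.78 Hz) spans far more than 100 samples,
# which is the case where the number of data points has to be increased.
lowest_channel_cycle = SAMPLE_RATE_HZ / 77.78
```

Only channels at or above 441 Hz complete a full cycle within 100 samples, so the lower channels are the ones that need the longer window.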

[0026] The control system 10 of the PC 1 then applies various edits to the amplitude information of channels #1 to #85 sampled every 2.268 ms (the edits can also be made via a display 12 and an input device 13 such as a keyboard or mouse), and generates a corrected amplitude information group for every 2.268 ms. The corrected amplitude information of each of channels #1 to #85 (the elements making up the corrected amplitude information group) is expressed in 1 byte (8 bits) each, and 2 bytes of control information are added to produce 87 bytes (85 channels × 1 byte + 2 bytes) of V data 19.

【００２７】The corrected amplitude information is obtained by editing the amplitude information in the amplitude information sequence of each of channels #1 to #85 (amplitude information sampled at 2.268 ms intervals) so that arbitrary portions are selectively emphasized or attenuated. That is, for the amplitude information sequences of channels #1 to #85, the amplitude values given by the amplitude information of a predetermined portion that corresponds across the channels are selectively reset to larger or smaller values, and the corrected amplitude information sequences are thereby generated. The control information, specified by the editing operation described above, consists of extension instruction information (1 byte), which instructs extension or shortening of the time over which the frequency components of channels #1 to #85 are to be reproduced, and frequency shift instruction information (1 byte), which indicates whether the frequency components corresponding to channels #1 to #85 are to be reproduced shifted as a whole by a semitone or a whole tone toward the bass or the treble.
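
The 87-byte V data layout described above can be illustrated with a minimal Python sketch. The class and function names are hypothetical, and placing the two control bytes ahead of the 85 amplitude bytes is an assumption made for illustration; the text fixes only the sizes (85 × 1 byte + 2 bytes).

```python
from dataclasses import dataclass
from typing import List

NUM_CHANNELS = 85               # channels #1 to #85
FRAME_SIZE = NUM_CHANNELS + 2   # 85 amplitude bytes + 2 control bytes = 87


@dataclass
class VData:
    amplitudes: List[int]   # corrected amplitude per channel, 0..255 each
    stretch: int            # extension instruction information (1 byte)
    shift_on: bool          # frequency shift instruction information (1 byte)


def pack_vdata(frame: VData) -> bytes:
    """Serialize one frame. Control bytes first is an assumed ordering."""
    if len(frame.amplitudes) != NUM_CHANNELS:
        raise ValueError("expected one amplitude per channel")
    control = bytes([frame.stretch, 1 if frame.shift_on else 0])
    return control + bytes(frame.amplitudes)


def unpack_vdata(raw: bytes) -> VData:
    """Inverse of pack_vdata under the same assumed ordering."""
    if len(raw) != FRAME_SIZE:
        raise ValueError("a V data frame is 87 bytes")
    return VData(list(raw[2:]), raw[0], raw[1] == 1)
```

One such frame would be produced every 2.268 ms of source audio, so one second of speech corresponds to 441 frames.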

【００２８】The extension instruction information is expressed as a number of reproduction clocks, that is, over how many ms one data item is reproduced. For example, if this information is expressed as one half of the number of reproduction clocks, a value of 50 gives the original reproduction time; setting it to 100 means reproduction over 200 clocks of the 44.1 kHz clock, which doubles the reproduction time (since this information is represented by 1 byte, the reproduction time can be extended up to a maximum of 256 ÷ 50 = 5.12 times). Conversely, setting it to 25 means reproduction over 50 clocks of the 44.1 kHz clock, which halves the reproduction time. The frequency shift instruction information is set to ON "1" when all frequency components are to be shifted toward the bass or the treble, and to OFF "0" when no shift is needed.
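
The arithmetic of the extension instruction can be checked with a short Python sketch. The function names are hypothetical, and the range check on the one-byte value is an assumption of this sketch rather than something stated in the text.

```python
BASE_CLOCKS = 100      # 44.1 kHz clocks per amplitude frame (2.268 ms)
REFERENCE_VALUE = 50   # stored value that leaves the duration unchanged


def clocks_for(stretch_value: int) -> int:
    """Number of 44.1 kHz clocks over which one frame is reproduced."""
    if not 0 < stretch_value <= 255:   # assumed valid range for a 1-byte value
        raise ValueError("stretch value must fit in one byte and be positive")
    return 2 * stretch_value           # the byte stores half the clock count


def stretch_ratio(stretch_value: int) -> float:
    """Reproduction time relative to the original."""
    return clocks_for(stretch_value) / BASE_CLOCKS
```

With these definitions, a stored value of 50 maps to 100 clocks (unchanged duration), 100 maps to 200 clocks (doubled), and 25 maps to 50 clocks (halved), matching the examples in the paragraph.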

【００２９】A new audio information sequence is generated on the basis of the V data 19, which the control system 10 produces as described above by applying the desired edits to the amplitude information of channels #1 to #85 sampled at 2.268 ms intervals.

【００３０】To generate a new audio information sequence from the generated V data 19, an external device 16 having sine wave generation circuits 16-1 to 16-85, which generate sine waves of the wavelengths corresponding to channels #1 to #85, is required. Each of the generation circuits 16-1 to 16-85 has a ROM in which basic sine wave data of the frequency corresponding to its channel is recorded, and a RAM (#1 to #85) in which the generated sine wave data is temporarily recorded. Each circuit writes into its RAM sine wave data that is shaped according to the corrected amplitude information of the V data 19 sent from the control system 10 and that has the number of waveforms indicated by the extension instruction information in the control information. The data interval of this sine wave data is 22.68 µs, the data interval of a 44.1 kHz sampling frequency.

【００３１】The sine wave data written in the RAMs #1 to #85 of the generation circuits 16-1 to 16-85 is read out sequentially at 44.1 kHz timing and summed to generate audio data (an audio information sequence). This audio data is sent to the control system 10 and output from the control system 10 via the I/O to an input/output device 14 such as a CD-ROM writer. The input/output device 14 records the 44.1 kHz audio data sent from the control system 10 on a predetermined audio information recording medium 15 such as a CD-ROM.

【００３２】In the generation of sine wave data by the generation circuits 16-1 to 16-85, to avoid unnatural amplitude changes in the reproduced sound, each amplitude of the sine wave data for each of channels #1 to #85 is determined by a value obtained by linear interpolation between adjacent items of amplitude information in the corrected amplitude information sequence. The audio data generated by the external device 16 may also be output directly as sound from the speaker 18 via the DAC 17 and the AMP. Disc-shaped recording media such as CD-ROM, MD and MO, and tape-shaped recording media such as DAT, can be used as the audio information recording medium 15.
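
The synthesis described in the preceding paragraphs (one amplitude-shaped sine wave per channel, with linear interpolation between adjacent amplitude values, summed into audio data) can be sketched in software as follows. This is only an illustration under stated assumptions: the source uses ROM tables and hardware circuits rather than `math.sin`, and the function names and the channel frequencies passed in are hypothetical.

```python
import math
from typing import List, Sequence

SAMPLE_RATE = 44_100       # Hz
SAMPLES_PER_FRAME = 100    # one amplitude value every 2.268 ms


def interpolate_envelope(amps: Sequence[float],
                         per_frame: int = SAMPLES_PER_FRAME) -> List[float]:
    """Linearly interpolate between adjacent amplitude values."""
    env: List[float] = []
    for prev, cur in zip(amps, amps[1:]):
        for k in range(per_frame):
            env.append(prev + (cur - prev) * k / per_frame)
    return env


def synthesize(channel_amps: Sequence[Sequence[float]],
               freqs: Sequence[float]) -> List[float]:
    """Sum one amplitude-shaped sine wave per channel."""
    envelopes = [interpolate_envelope(a) for a in channel_amps]
    n = min(len(e) for e in envelopes)
    out = [0.0] * n
    for env, f in zip(envelopes, freqs):
        for i in range(n):
            out[i] += env[i] * math.sin(2 * math.pi * f * i / SAMPLE_RATE)
    return out
```

Passing a larger `per_frame` (for example 200 instead of 100) lengthens the envelope without touching the sine frequency, which is the sense in which the reproduction time changes while the frequency components stay the same.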

【００３３】This invention mainly concerns a technique that lets Japanese listeners hear natural-speed English played back slowly, but simply and uniformly extending or shortening the reproduction time of every frequency component is not sufficient. FIG. 3 shows the basic shape of a speech spectrum; depending on the kind of sound uttered, the temporal change of the spectrum in the consonant part can make it a linguistically different sound. For example, the pronunciations of BA and PA have almost the same spectral shape and differ only in that the spectrum changes quickly in the former and slowly in the latter. Consequently, if the sound BA is extended in time including its consonant part, it is heard as PA. This can be prevented by limiting the extension of the consonant part to the limit at which it is still heard as BA and extending or shortening only the vowel part to the desired reproduction time, so that the sound is still heard as BA. The vowel part, on the other hand, is heard as the same vowel however much it is extended or shortened, so it can be set to any desired length (desired reproduction time). It is also necessary to selectively emphasize (increase the amplitude of) only the small consonant sounds that are too weak for Japanese listeners to catch, by a factor of two or three. Emphasizing the vowel part as well would make the whole too loud and would have no effect, so the emphasis must be selective. For these reasons, in the audio information reproducing method according to this invention, the amplitude information sequence of each channel is edited into a corrected amplitude information sequence in which the portions that beginners find particularly hard to hear are selectively emphasized, and V data, composed of the amplitude information generated at the same timing in the corrected amplitude information sequences of the channels, is recorded sequentially together with control information instructing extension of the reproduction time. Conversely, for advanced learners, the audio information sequence may be selectively edited, taking into account the characteristics of the sounds described above, so that the reproduced sound is attenuated or the reproduction time is shortened at desired portions.

【００３４】Furthermore, when a male voice is recorded on a predetermined recording medium by the recording method described above and is reproduced with the reproduction time extended and/or a desired portion emphasized, the listener may have the illusion that the sound has shifted lower, even though the frequency spectrum of the output sound is unchanged. Conversely, reproduction with the reproduction time shortened and/or a desired portion attenuated may give the illusion that the sound has shifted higher. The control information therefore includes frequency shift instruction information that allows the whole set of frequency components to be reproduced shifted by about a semitone or a whole tone toward the bass or the treble.

【００３５】This invention is well suited to techniques, such as that disclosed in the above-mentioned Japanese Patent No. 2581700, for reproducing a recording medium on which the speech of a native speaker is recorded. A configuration in which this invention is applied to that technique is described below.

【００３６】Combined with the technique disclosed in Japanese Patent No. 2581700, this invention can be expected to yield a dramatic improvement in learning. Specifically, by separately preparing audio information in which arbitrary portions are selectively extended or shortened, or emphasized or attenuated, in correspondence with the variable-length sections obtained by dividing a native speaker's utterances at breaks in the utterance, the learner can repeatedly replay speech that could not be caught and, to improve listening ability, can also hear it as speech in which the hard-to-hear portions are extended or shortened, or emphasized or attenuated.

【００３７】FIG. 4 is a diagram conceptually illustrating the various information, including the audio information sequence, to be recorded on the audio recording medium according to this invention.

【００３８】First, the first audio information sequence (an audio information sequence sampled at 44.1 kHz) recorded on the audio information recording medium 15 is a continuous audio information sequence, such as the dialogue of performers in a film or conversation in everyday surroundings, that consists of a plurality of sentences of different lengths and in which randomly occurring non-speech periods may exist between sentences (the speech of each speaker), such as periods in which no sound is reproduced, only noise is reproduced, or only music (BGM) is reproduced. The first audio information sequence is therefore an audio information sequence corresponding to one or more sentences, each composed of a plurality of words, to be reproduced and output by a predetermined audio reproducing means. As shown in FIG. 4, it is recorded in a first area of the audio information recording medium 15 divided into variable-length sections (hereinafter called segments), one for each portion of audio information delimited at breaks in pronunciation.

【００３９】In English conversation by native speakers, one sentence is generally uttered in about 3 seconds, so it is reasonable to place the pronunciation breaks that determine the segments of the audio information sequence to be recorded between sentences, so that each sentence forms one of the variable-length segments 621, 622 and 799 of the audio information sequence, as shown in FIG. 1(a), (b) or (d). Conversation also contains extremely short sentences, as shown in FIG. 1(c), and such a sentence 701 likewise forms one segment. For an extremely long sentence, as shown in FIG. 1(e), a pronunciation break falls before a conjunction, a relative or the like, so it is reasonable to form such a sentence from two consecutive segments 801 and 802. A segment of the audio information sequence to be recorded is thus a recording unit of audio information divided at some break in utterance (a breathing position) or in language (grammar).

【００４０】In the audio information recording method according to this invention, a second audio information sequence is first generated in which arbitrary portions of each of the segments obtained by dividing the first audio information sequence as described above are selectively edited (the amplitude of each frequency component changed, the reproduction time changed). Specifically, as shown in FIG. 5, the second audio information sequence is recorded on the predetermined audio information recording medium 15 by an apparatus consisting of the PC 1, which edits each frequency component, and the external device 16, which generates the edited audio data (the second audio information sequence).

【００４１】As shown in FIG. 5, the external device 16 consists of a master board 165, which generates the audio data, and a slave board 166, which carries the sine wave generation circuits 16-1 to 16-85 provided for the respective channels. The master board 165 has a timing controller 171 and a FIFO 172 for supplying the V data from the PC 1 to the generation circuits 16-1 to 16-85 in accordance with control signals, an adder 173 that sequentially adds the sine wave data (16 bits) sent from the generation circuits 16-1 to 16-85 to generate audio data (16 bits), and a RAM 174 serving as a buffer that temporarily stores the generated audio data to be sent to the PC 1. The master board 165 shown in FIG. 5 is also provided with a DAC 175 and an AMP 176 so that the generated audio data can be reproduced directly through a speaker 177; on instruction from the PC 1, the first audio information sequence and the newly edited second audio information sequence can thus be played through the speaker as many times as desired and compared by ear (the structure for audio reproduction may instead be provided on the PC 1 side, as shown in FIG. 1). The slave board 166 carries the sine wave generation circuits 16-1 to 16-85, each of which generates a sine wave of the predetermined frequency for its channel, and each of which has a ROM in which the data for generating the sine wave is recorded and a RAM (#1 to #85) serving as a buffer that stores the generated sine wave data.

【００４２】The master board 165 and the slave board 166 are connected by 32 bus lines in total: 30 signal lines plus GND and Vcc. In the figure, the bus group 167 is the V-data bus group for supplying V data to the generation circuits 16-1 to 16-85, and the bus group 168 is the audio-data bus group for sending the sine wave data used to generate the audio data from the generation circuits to the master board 165.

【００４３】Next, an embodiment in which the audio information recording method according to this invention is applied to the technique disclosed in Japanese Patent No. 2581700 is described with reference to FIG. 5 and the flowcharts of FIGS. 6 and 7.

【００４４】First, the V data is generated on the PC 1 side. The PC 1 samples a continuous audio information sequence (the first audio information sequence) at 44.1 kHz (16 bits per data item), temporarily stores the sampled data corresponding to the first audio information sequence on a hard disk (step ST1), and divides it into a plurality of segments as shown in FIG. 4 (step ST2).

【００４５】Next, for one of the divided segments, a digital band-pass filter program first expands into memory the waveform information within the bandwidth of the first channel #1 (75.57 Hz to 80.06 Hz). The data is expanded at the data interval corresponding to the 44.1 kHz rate. Average amplitude information (8 bits) is then extracted for every 100 data items (step ST3). If, as noted above, 100 data items do not contain one full waveform of the frequency component of the first channel #1, the number of data items is increased to a number that does contain one waveform before the amplitude information is obtained. The same operation is repeated, shifting by 100 data items each time, until the sampled data of the target segment is exhausted. This operation yields, for the target channel (the first channel #1), an amplitude information sequence with a data interval of 2.268 ms (441 items of amplitude information per second). When the amplitude extraction for the first channel #1 is finished (step ST5), the digital band-pass filter separates out the frequencies of the second channel #2 and steps ST3 to ST5 are repeated; changing the target channel in this way (step ST7), amplitude information sequences of the target segment are generated for the first channel #1 through the 85th channel #85.
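
The per-channel amplitude extraction of step ST3 can be sketched as below. The sketch assumes that band-pass filtering has already been applied to the input, that "average amplitude" means the mean absolute sample value, and that the result is scaled from 16-bit samples to 8 bits; the function name and these choices are illustrative and are not specified in the text. The widening of the window for low-frequency channels is not handled here.

```python
from typing import List, Sequence

BLOCK = 100  # samples per amplitude value at 44.1 kHz (2.268 ms)


def extract_amplitudes(band_samples: Sequence[int],
                       block: int = BLOCK,
                       full_scale: int = 32768) -> List[int]:
    """Return one 8-bit mean-absolute amplitude per block of band-limited samples."""
    amps: List[int] = []
    for start in range(0, len(band_samples) - block + 1, block):
        chunk = band_samples[start:start + block]
        mean_abs = sum(abs(s) for s in chunk) / block
        # Scale from the 16-bit sample range down to one byte (assumed mapping).
        amps.append(min(255, int(mean_abs * 256 / full_scale)))
    return amps
```

One second of input (44,100 samples) yields 441 amplitude values, which matches the 2.268 ms data interval stated above.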

【００４６】The above operation is performed, changing the target segment each time (step ST10), until all segments of the first audio information sequence sampled in step ST1 have been processed (step ST9).

【００４７】Next, the PC 1 edits the amplitude information sequences of the 85 channels obtained for each segment by steps ST1 to ST9 as follows, and generates the V data (step ST11).

【００４８】First, the group of amplitude information sequences for the 85 channels generated for each divided segment is read from the hard disk where it is stored, and its amplitude waveforms are displayed in turn on the monitor 12.

【００４９】In the actual editing work, a desired portion of the displayed amplitude waveform is selected and its reproduction time is specified (a clock value of 50 being the reference). If necessary, the portion to be changed is selected and its amplitude is changed (set as a multiple of the displayed amplitude), or a frequency shift toward the bass or the treble is specified. For example, the consonant parts in a segment may be given twice the amplitude and 1.5 times the reproduction time, while the vowel parts keep their amplitude and are given 2.5 times the reproduction time. The amplitude information sequences obtained are thus selectively edited at arbitrary portions to generate corrected amplitude information sequences in which each item of amplitude information has been modified.
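
The consonant and vowel example above can be expressed as a small Python sketch. The `Edit` record, the frame-index ranges, and the clamping of amplitudes to one byte are hypothetical details introduced for illustration; the text describes the editing only as an interactive operation on displayed waveforms.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

REFERENCE_VALUE = 50   # stored stretch value meaning "unchanged duration"


@dataclass
class Edit:
    start: int       # first frame index of the edited portion
    end: int         # one past the last frame index
    gain: float      # amplitude multiplier
    stretch: float   # reproduction-time multiplier


def apply_edits(frames: Sequence[Sequence[int]],
                edits: Sequence[Edit]) -> Tuple[List[List[int]], List[int]]:
    """Return (corrected amplitudes per frame, stretch value per frame)."""
    corrected = [list(f) for f in frames]
    stretch_values = [REFERENCE_VALUE] * len(frames)
    for e in edits:
        for i in range(e.start, min(e.end, len(frames))):
            # Clamp to one byte, since each amplitude is stored in 8 bits.
            corrected[i] = [min(255, round(a * e.gain)) for a in frames[i]]
            stretch_values[i] = round(REFERENCE_VALUE * e.stretch)
    return corrected, stretch_values
```

For instance, marking the first two frames as a consonant (`gain=2.0`, `stretch=1.5`) and the next two as a vowel (`gain=1.0`, `stretch=2.5`) gives stretch values of 75 and 125, that is, 150 and 250 clocks per frame.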

【００５０】Then, from the corrected amplitude information sequences of the 85 channels, the information components at the same timing, which correspond to one another across the sequences, are collected into groups, and control information consisting of the information instructing the change of reproduction time and the information instructing the frequency shift is added to each group, yielding V data with a data interval of 2.268 ms.

【００５１】The V data (87 bytes per data item) prepared on the PC 1 side as described above is sent to the master board 165 of the external device 16, and from the master board 165 over the data bus to the sine wave generation circuits 16-1 to 16-85 on the slave board 166. The slave board 166 actually consists of eleven boards carrying eight circuits each (on the eleventh board only five of the eight circuits are used), and each circuit generates the sine wave data of its channel (step ST12). The circuits are identical in configuration except for the ROM holding the sine waveform data and the setting of the 7-bit DIP switch that designates the corresponding channel.
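
The board count follows from 85 channels at eight circuits per board: ten full boards plus five circuits on the eleventh. A minimal sketch of one possible mapping is shown below; assigning channels to boards in sequential order is an assumption, since the text states only the counts, and the function name is hypothetical.

```python
from typing import Tuple

CIRCUITS_PER_BOARD = 8
NUM_CHANNELS = 85


def locate_circuit(channel: int) -> Tuple[int, int]:
    """Map a channel number (1..85) to (board, circuit), both 1-based."""
    if not 1 <= channel <= NUM_CHANNELS:
        raise ValueError("channel must be in 1..85")
    board, circuit = divmod(channel - 1, CIRCUITS_PER_BOARD)
    return board + 1, circuit + 1
```

A 7-bit DIP switch can represent 128 values, which is enough to address all 85 channels.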

【００５２】Each circuit, responsible for one channel, first receives the header (2 bytes) of the 87-byte V data sent from the master board 165, which is common to all circuits; of the corrected amplitude information in the V data, it receives only the item (1 byte) for its own channel. Each circuit examines the reproduction time in the received header to determine over how many 44.1 kHz clocks the waveform is to be shaped and output. For example, if the specified reproduction time is 50, the data is reproduced over 100 clocks (the reproduction time is unchanged); if it is 110, the data is reproduced over 220 clocks (the reproduction time is roughly doubled). Each circuit's ROM holds the sine wave data of the frequency for which the circuit is responsible, at the data interval it has when output at 44.1 kHz; exactly M waves of the sine wave of that frequency are stored from address zero to address N of the ROM (M and N being natural numbers). The processor in each circuit increments the ROM address by 1 each time it produces one item of sine wave data (every 22.68 µs), and returns to zero after address N. In this way an accurate sine wave can be produced without discontinuities. Each item of sine wave data is generated by multiplying the basic data stored in the ROM by the received corrected amplitude information, and each amplitude value used is obtained by linear interpolation between the current amplitude information and the previous amplitude information.
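
The wrap-around table lookup can be modeled in a few lines of Python. This is a software sketch of the idea only: the names are hypothetical, the table contents are computed with `math.sin` rather than read from a ROM, and `length` here corresponds to N + 1 entries (addresses zero through N).

```python
import math
from typing import List


def build_rom(cycles: int, length: int, peak: int = 32767) -> List[int]:
    """Table of `length` samples holding exactly `cycles` whole sine periods."""
    return [round(peak * math.sin(2 * math.pi * cycles * k / length))
            for k in range(length)]


class SineCircuit:
    """Steps through the table one address per 44.1 kHz clock, wrapping to zero."""

    def __init__(self, rom: List[int]) -> None:
        self.rom = rom
        self.addr = 0

    def next_sample(self, amplitude: float) -> float:
        value = self.rom[self.addr] * amplitude
        self.addr = (self.addr + 1) % len(self.rom)
        return value
```

Because the table holds a whole number of periods, the step from the last address back to address zero is the same phase increment as every other step, so the output has no discontinuity. The generated frequency is `cycles * 44100 / length` Hz, so M and N can be chosen to approximate each channel's frequency.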

【００５３】As described above, each circuit refers to its ROM at 44.1 kHz (a period of 22.68 µs), multiplies the data read by the coefficient obtained by the interpolation, and stores the resulting sine wave data in its output buffer, RAM #1 to #85.

【００５４】Under control signals from the master board 165, each circuit is given a timing (with a 22.68 µs period) at which to send the sine wave data stored in its RAM #1 to #85 onto the output bus (16 bits), and drives the bus only at that time. The time slot given to one circuit is 266 ns (22.68 µs ÷ 85). The capture timing on the master board 165 side is given by a clock and a synchronization signal; the number of clocks after the synchronization signal at which a circuit sends its data equals the channel number set on the DIP switch. In addition, so that each circuit can output sine wave data at a frequency shifted by a semitone (or a whole tone) when the frequency shift instruction information in the header of the V data is ON, two kinds of waveform data are stored in the ROM of each circuit and either can be selected.

【００５５】The master board 165 receives the sine wave data generated by the circuits 16-1 to 16-85 at a rate of 85 data items per 22.68 µs (a data interval of 22.68 µs ÷ 85 = 266 ns). In practice, the adder 173 adds the sine wave data as it is received from the circuits, generating 44.1 kHz audio data (the second audio information sequence) (step ST13). The generated audio data is stored sequentially in the buffer RAM 174 and sent to the PC 1.
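
The time-slot arithmetic and the summation performed by the adder can be sketched as follows. The function names are hypothetical, and clamping the sum to the 16-bit range is an assumption of this sketch; the text says only that 16-bit sine wave data is added into 16-bit audio data.

```python
from typing import Iterable

SAMPLE_PERIOD_NS = 22_680   # one 44.1 kHz sample period (22.68 us) in ns
NUM_CHANNELS = 85


def slot_width_ns() -> float:
    """Bus time available to each circuit within one sample period."""
    return SAMPLE_PERIOD_NS / NUM_CHANNELS


def mix(samples: Iterable[int]) -> int:
    """Add one sample per channel and clamp to the 16-bit range (assumed)."""
    total = sum(samples)
    return max(-32768, min(32767, total))
```

The slot width evaluates to roughly 266.8 ns, consistent with the 266 ns figure in the text.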

【００５６】The PC 1 records the received audio data on the predetermined recording medium 15 while controlling the input/output device 14 (step ST14), and the audio information recording medium according to this invention is thereby obtained.

【００５７】Next, embodiments of the audio information recording medium according to this invention to which the technique disclosed in Japanese Patent No. 2581700 is applied are described.

【００５８】First Embodiment of the Audio Information Recording Medium
In the first embodiment, at least two kinds of audio information sequences and recording position identification information are recorded. The first audio information sequence consists, for example, of English speech spoken by a native speaker at natural speed, and is divided into a plurality of variable-length segments at breaks in pronunciation (the end of a sentence, or a break in utterance or grammar within a sentence where a breath can be taken), as described above. The second audio information sequence is obtained by selectively editing arbitrary portions of the first audio information sequence as described above, and is divided into a plurality of variable-length segments corresponding to the segments of the first audio information sequence. The recording position identification information indicates at least where on the audio recording medium each segment of the first and second audio information sequences is recorded. Thus, for example, the position on the medium of the segment "It's・・・not・・・much・・of・・a・・・problem." of the second audio information sequence, which corresponds to the t-th segment "It's not much of a problem." of the first audio information sequence, can be determined from the recording position identification information.

【００５９】As a result, the first and second audio information sequences and the recording position identification information are not recorded independently of one another but in a fixed relationship, and the audio information sequences are organically combined segment by segment. That is, the first and second audio information sequences form a pair, and it is the recording position identification information that relates them segment by segment. In this embodiment, the recording position identification information is recorded in the directory area of the audio information recording medium and includes at least information on the start position of each segment.

【００６０】以上のような構造を備えた音声情報記録媒
体（第１実施形態）の再生方法では、記録されたセグメ
ントごとに順番に音声再生が行われるが、特に、この再
生方法では、当該音声情報記録媒体に記録された第１音
声情報列から第２音声情報列への再生切換え（あるいは
第２音声情報列から第１音声情報列への再生切換え）が
可能であることを特徴としている。なお、この再生切換
え動作は、セグメントを単位として行われる。例えば、
第１音声情報列のｔ番目のセグメントが再生されている
ときに第２音声情報列の再生指示が入力されると（割込
み要求の発生）、記録位置識別情報に基づいて第２音声
情報列の対応するｔ番目のセグメントを読み出し、その
対応するセグメントの音声再生が実行される。また逆
に、第２音声情報列から第１音声情報列への再生切換え
も、上述した再生切換え動作と同様に各セグメント単位
で行われる。In the method of reproducing the audio information recording medium having the above structure (first embodiment), audio is reproduced in order, one recorded segment at a time. In particular, this reproduction method is characterized in that reproduction can be switched from the first audio information sequence recorded on the medium to the second audio information sequence (or from the second to the first). This switching is performed in units of segments. For example, when an instruction to reproduce the second audio information sequence is input (an interrupt request occurs) while the t-th segment of the first audio information sequence is being reproduced, the corresponding t-th segment of the second audio information sequence is read out on the basis of the recording position identification information, and that segment is reproduced. Conversely, switching from the second audio information sequence to the first is likewise performed segment by segment.
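
The segment-wise switching described above can be sketched as follows. This is only an illustration under stated assumptions, not the implementation of the patent: the `Player` class, the `Directory` mapping, and the sequence labels `"A"` and `"B"` are hypothetical names, positions are simplified to (offset, length) pairs, and the values for segment 622 are made up.

```python
from dataclasses import dataclass

# Hypothetical directory: for each segment index t, the (offset, length)
# of that segment in each audio information sequence.
Directory = dict[int, dict[str, tuple[int, int]]]


@dataclass
class Player:
    directory: Directory
    sequence: str = "A"   # sequence currently being reproduced
    segment: int = 0      # index t of the segment being reproduced

    def locate(self) -> tuple[int, int]:
        """Return the (offset, length) of the segment to read."""
        return self.directory[self.segment][self.sequence]

    def switch(self, target: str) -> tuple[int, int]:
        """Handle a switch request: keep t, change only the sequence."""
        if target not in self.directory[self.segment]:
            raise KeyError(f"segment {self.segment} has no sequence {target!r}")
        self.sequence = target
        return self.locate()

    def advance(self) -> None:
        """Move on to the next segment of the current sequence."""
        self.segment += 1


directory: Directory = {
    621: {"A": (826, 6000), "B": (2026, 17400)},
    622: {"A": (6826, 5000), "B": (19426, 9000)},  # made-up values
}
player = Player(directory, sequence="A", segment=621)
switched = player.switch("B")  # still segment 621, now read from sequence B
```

Because the lookup is keyed by the segment index, a switch never lands in the middle of a segment; the listener always hears the whole corresponding segment of the other sequence.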

【００６１】なお、この再生方法では、上述の再生切換
え動作の他、リピート再生等の種々の変形が可能であ
る。その代表的なものとして、いわゆる戻し指令があ
る。すなわち、再生中の停止命令により一時再生を中断
した後に戻し指令が入力されたときは、指令された量だ
け音声情報の読み出し位置を戻すことによりより操作者
の希望に合った音声情報の再生が行われる。In this reproduction method, besides the reproduction switching operation described above, various modifications such as repeat reproduction are possible. A typical one is the so-called return command: when a return command is input after reproduction has been temporarily interrupted by a stop command, the read position of the audio information is moved back by the commanded amount, so that the audio information is reproduced in a way that better matches the operator's wishes.

【００６２】音声記録記録媒体に係る第２実施形態この第２実施形態は、上述された第１実施形態と基本的
には同じ構造であるが、上記第１音声情報列及び第２音
声情報列の他、第１音声情報列の内容と等価な意味内容
であるが別の音声情報であり、例えば単語を区切って話
すゆっくりとした速さの英語の音声情報である第３音声
情報列を備えていることを特徴としている。また、この
第３音声情報列も、複数の可変長セグメントから構成さ
れており、上記記録位置識別情報は、これら第１〜第３
音声情報列における各セグメント間での記録位置を管理
している。したがって、この第２実施形態における音声
情報の再生方法は第１実施形態と同様である。 Second Embodiment of the Audio Information Recording Medium. The second embodiment has basically the same structure as the first embodiment described above, but is characterized in that, in addition to the first and second audio information sequences, it includes a third audio information sequence whose semantic content is equivalent to that of the first audio information sequence but which is different audio, for example slow English speech in which each word is spoken separately. The third audio information sequence is also composed of a plurality of variable-length segments, and the recording position identification information manages the recording positions of the segments of the first to third audio information sequences. The method of reproducing the audio information in the second embodiment is therefore the same as in the first embodiment.

【００６３】なお、この実施形態において、重要なこと
は、上記第１音声情報列と、第３音声情報列はそれぞれ
複数の可変長セグメントに区分されているが、互いにセ
グメントごとにその意味内容が対応していることであ
る。例えば、第１音声情報列のｔ番目（図４（ａ）では
６２１番目）のセグメントがネイティブスピーカの話
す”It's not much of a problem.”であるときは、第
３音声情報列のｔ番目のセグメントは各単語を区切って
話す”It is not much of a problem.”となる。ただ
し、第２音声情報列と対応した内容でかつ別の音声情報
からなるということは、言語上は同一の意味で発声の異
なるものであることを示している。In this embodiment, what is important is that, although the first and third audio information sequences are each divided into a plurality of variable-length segments, their semantic content corresponds segment by segment. For example, when the t-th segment (the 621st in FIG. 4(a)) of the first audio information sequence is the native speaker's "It's not much of a problem.", the t-th segment of the third audio information sequence is "It is not much of a problem." spoken with each word separated. Note that content that corresponds to the second audio information sequence yet consists of different audio means speech that has the same linguistic meaning but is uttered differently.

【００６４】音声情報記録媒体に係る第３実施形態さらに、この発明に係る音声情報記録媒体の第３実施形
態について説明する。この第３実施形態に係る音声情報
記録媒体は、第１及び第２音声情報列の他、さらに文法
解説等の音声情報列である第４音声情報列が当該音声情
報記録媒体に記録されている点が、上述の第１実施形態
に係る音声情報記録媒体と異なる。 Third Embodiment of Audio Information Recording Medium A third embodiment of the audio information recording medium according to the present invention will be described. In the audio information recording medium according to the third embodiment, in addition to the first and second audio information strings, a fourth audio information string which is an audio information string such as a grammar explanation is recorded on the audio information recording medium. This is different from the audio information recording medium according to the first embodiment described above.

【００６５】ここで重要なことは、上記第３音声情報列
は第１及び第２音声情報列の１又は２以上の可変長セグ
メントをひとまとまりとしたセグメント群に区分されて
いることである。換言すれば、この第４音声情報列の１
つのセグメント群は第１及び第２音声情報列の１又は２
以上のセグメントを包含しており、したがって、第４音
声情報列の１つのセグメント群は第１及び第２音声情報
列の１又は２以上のセグメントと対になっている。特
に、この構成は図４（ｅ）に示されたように、１つのセ
ンテンスが複数のセグメントに区分された場合を想定し
ている。What is important here is that the fourth audio information sequence is divided into segment groups, each of which gathers one or more variable-length segments of the first and second audio information sequences. In other words, one segment group of the fourth audio information sequence covers one or more segments of the first and second audio information sequences, and is therefore paired with those one or more segments. In particular, this configuration assumes the case where one sentence is divided into a plurality of segments, as shown in FIG. 4(e).
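
The pairing between one segment group and one or more segments can be pictured with a small lookup table. The sketch below is illustrative only: the dictionary layout, the function name `group_of`, and the segment and group numbers are hypothetical and are not taken from the figures.

```python
# Hypothetical layout: each segment group of the explanatory sequence lists
# the segment indices of the first/second sequences that it covers.
segment_groups: dict[int, list[int]] = {
    0: [801, 802],  # one sentence that was split into two segments
    1: [803],       # one sentence that is a single segment
}


def group_of(segment: int, groups: dict[int, list[int]]) -> int:
    """Return the segment group that covers the given segment."""
    for group_id, members in groups.items():
        if segment in members:
            return group_id
    raise KeyError(f"segment {segment} is not covered by any group")
```

With this table, a request for the explanation while either segment 801 or 802 is playing resolves to the same group, which is the behaviour the paragraph above describes for a sentence split across segments.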

【００６６】また、この第３実施形態の音声情報記録媒
体において、所定の領域に記録された記録位置識別情報
には、上記第４音声情報列の内容の記録位置をもセグメ
ント群ごとに示す情報も含まれている。したがって、第
１、第２及び第４音声情報列と記録位置識別情報は互い
に一定の関係をもって媒体に記録され、各音声情報列は
セグメントあるいはセグメント群を単位として有機的に
組み合わされている。なお、この第３実施形態において
も、記録位置識別情報は当該音声情報記録媒体のディレ
クトリ領域に記録され、各音声情報列におけるセグメン
トの先頭位置に関する情報を含んでいる。また、この実
施形態においても、第１音声情報列の音声情報と等価で
あって、単語を区切って話すゆっくりとした速さの第３
音声情報列をさらに記録してもよい。In the audio information recording medium of the third embodiment, the recording position identification information recorded in a predetermined area also includes information indicating, for each segment group, where the contents of the fourth audio information sequence are recorded. The first, second, and fourth audio information sequences and the recording position identification information are therefore recorded on the medium in a fixed relationship to one another, and the audio information sequences are organically combined in units of segments or segment groups. In the third embodiment too, the recording position identification information is recorded in the directory area of the audio information recording medium and includes information on the start position of the segments in each audio information sequence. Also in this embodiment, a third audio information sequence, equivalent to the audio of the first audio information sequence but spoken slowly with each word separated, may additionally be recorded.

【００６７】以上のような構造を備えた音声情報記録媒
体（第３実施形態）の再生方法は、基本的に上述された
第１実施形態の場合と同じであるが、第１及び第２音声
情報列間での再生切換えの他、該第１及び第２音声情報
列と第４音声情報列との間においても再生切換え動作を
行う点が異なる。The method of reproducing the audio information recording medium having the above structure (third embodiment) is basically the same as in the first embodiment described above, except that, in addition to switching reproduction between the first and second audio information sequences, reproduction is also switched between those sequences and the fourth audio information sequence.

【００６８】例えば、第１音声情報列の再生中にネイテ
ィブスピーカの”It's not much of a problem.”が聴き
取れなかったときは、再生中の第１音声情報列から第２
音声情報列に再生を切換えることにより、選択的に伸長
等の編集が施された音声”It's ・・・not・・・ much・・of・・a・
・・problem.”を聴くことができる。そして、この日本語
の意味や文法を知りたいときは、さらに、再生中の音声
情報列から第4音声情報列へ再生を切換えればよい。も
ちろん、この再生方法においても、上述の第１実施形態
に係る音声情報記録媒体の再生方法で説明された戻し指
令や停止命令を組み合せて使えるよう応用できることは
言うまでもない。また、この再生方法においても、切換
え再生及びリピート再生が可能である。For example, when the listener cannot catch the native speaker's "It's not much of a problem." during reproduction of the first audio information sequence, switching reproduction from the first to the second audio information sequence lets the listener hear the selectively edited (for example, stretched) audio "It's ... not ... much ... of ... a ... problem.". To learn its meaning in Japanese or its grammar, the listener can then switch reproduction from the sequence being reproduced to the fourth audio information sequence. Needless to say, this reproduction method can also be combined with the return command and stop command described for the reproduction method of the first embodiment. Switching reproduction and repeat reproduction are also possible with this reproduction method.

【００６９】音声情報記録媒体に係る第４実施形態この発明に係る音声情報記録媒体の第４実施形態は、基
本的に上述の第１実施形態の場合と同様であるが、第１
及び第２音声情報列の他、文字情報列が記録されている
点が主に異なる。この文字情報列は、第１又は第２音声
情報列に対応する内容の文字情報に相当しており、例え
ばネイティブスピーカが話す英語（音声）に対応する文
字情報に相当している。 Fourth Embodiment of the Audio Information Recording Medium. The fourth embodiment of the audio information recording medium according to the present invention is basically the same as the first embodiment described above; the main difference is that a character information sequence is recorded in addition to the first and second audio information sequences. This character information sequence is character information whose content corresponds to the first or second audio information sequence, for example the text of the English spoken by the native speaker.

【００７０】この文字情報列も、第１及び第２音声情報
列の各セグメントと対応するセグメントに区分されてい
る。また、この第４実施形態に係る音声情報記録媒体に
おいても、記録位置識別情報には、この文字情報列の記
録位置を各音声情報列のそれぞれのセグメントごとにそ
れらの先頭位置に関する情報が含まれ、当該音声情報記
録媒体のディレクトリ領域に記録される。したがって、
第１及び第２音声情報列と文字情報列はそれぞれセグメ
ント単位で対応することになる。This character information sequence is also divided into segments corresponding to the segments of the first and second audio information sequences. In the audio information recording medium according to the fourth embodiment as well, the recording position identification information includes the recording position of the character information sequence, namely the start position of each of its segments corresponding to the segments of the audio information sequences, and is recorded in the directory area of the audio information recording medium. The first and second audio information sequences and the character information sequence therefore correspond to one another segment by segment.

【００７１】なお、この第４実施形態に係る音声情報記
録媒体において、上述の第３実施形態における第４音声
情報列を記録情報として加えるときは、第１及び第２音
声情報列と文字情報列の１又は２以上のセグメントは第
３音声情報列の１つのセグメント群にも対応することに
なる。この構成においても、上記記録位置識別情報に
は、各セグメントの先頭位置が含まれ、かつ当該音声記
録媒体のディレクトリ領域に記録される。そして、上述
の第３実施形態と同様に、この第４実施形態でも、第１
音声情報列の音声情報と等価であって、単語を区切って
話すゆっくりとした速さの第３音声情報列をさらに記録
してもよい。In the audio information recording medium according to the fourth embodiment, when the fourth audio information sequence of the third embodiment is added to the recorded information, one or more segments of the first and second audio information sequences and of the character information sequence also correspond to one segment group of the fourth audio information sequence. In this configuration too, the recording position identification information includes the start position of each segment and is recorded in the directory area of the audio recording medium. As in the third embodiment described above, a third audio information sequence, equivalent to the audio of the first audio information sequence but spoken slowly with each word separated, may additionally be recorded in the fourth embodiment.

【００７２】以上のような構造を備えた音声情報記録媒
体（第４実施形態）の再生方法も、基本的に上述の第２
実施形態の場合と同様であるが、第１又は第２音声情報
列の再生中に文字情報列がディスプレイ表示される点が
異なる。The method of reproducing the audio information recording medium having the above structure (fourth embodiment) is also basically the same as in the second embodiment described above, except that the character information sequence is shown on a display while the first or second audio information sequence is being reproduced.

【００７３】例えば、第１音声情報列のセグメント”I
t's not much of a problem.”が再生されているとき
は、所定の表示部に”It's not much of a problem.”
もしくは”It is not much of a problem.”がディスプ
レイ表示される。なお、この表示については再生中の音
声情報列と時間的に完全に同期している必要はなく、文
字が少しずつ遅れて表示されたり、あるいは少しずつ先
に表示されたりしてもよい。また、この再生方法でも、
切換え再生及びリピート再生が可能である。For example, while the segment "It's not much of a problem." of the first audio information sequence is being reproduced, "It's not much of a problem." or "It is not much of a problem." is shown on a predetermined display unit. The display need not be perfectly synchronized in time with the audio information sequence being reproduced; the characters may appear slightly behind or slightly ahead of the audio. Switching reproduction and repeat reproduction are also possible with this reproduction method.

【００７４】次に、この発明に係る音声記録媒体の具体
的な構造を、図８〜図１１を用いて、以下詳細に説明す
る。Next, a specific structure of the audio recording medium according to the present invention will be described in detail with reference to FIGS. 8 to 11.

【００７５】図８は、この発明に係る音声情報記録媒体
の例として、上述の第３実施形態を英会話独習用に適用
したときの各音声情報列Ａ、Ｂ、Ｃと、その記録内容を
説明するための図である。この図において、音声情報列
Ａはネイティブスピーカの話す英語の情報列（第１音声
情報列）であり、複数のセグメント６２１、６２２から
構成されている。音声情報列Ｂは図６及び図７に示され
たフローチャートを用いて説明されたように選択的に該
第１情報列の所定部分が伸長等するよう編集された情報
列（第２音声情報列）である。また、音声情報列Ｃは日
本語の解説をする情報列（第３音声情報列）であり、こ
の音声情報列Ｃに含まれるセグメント群は、各音声情報
列Ａ、Ｂの各セグメント６２１、６２２にそれぞれ対応
している。FIG. 8 is a diagram for explaining, as an example of the audio information recording medium according to the present invention, audio information sequences A, B, and C and their recorded contents when the third embodiment described above is applied to self-study of English conversation. In this figure, audio information sequence A is the English spoken by a native speaker (the first audio information sequence) and is composed of a plurality of segments 621, 622. Audio information sequence B is a sequence (the second audio information sequence) edited, as described with the flowcharts of FIGS. 6 and 7, so that predetermined parts of the first audio information sequence are selectively stretched or otherwise modified. Audio information sequence C is a sequence giving explanations in Japanese (the third audio information sequence), and the segment groups it contains correspond to segments 621 and 622 of audio information sequences A and B.

【００７６】また、図９は、図８に示された態様におけ
る１セグメント当りの時間と容量の関係を説明するため
の表である。この表において、１秒間は６キロバイトの
容量に対応している。例えば音声情報列Ａのセグメント
６２１では、”It's”の発声時間が０．２秒、その容量
が１．２ＫＢ（キロバイト）、”not”の発声時間が
０．１秒、その容量が０．６ＫＢ（キロバイト）、”mu
ch of a”の発声時間が０．４秒、その容量が２．４Ｋ
Ｂ（キロバイト）、そして”problem”の発声時間が
０．３秒、その容量が１．８ＫＢ（キロバイト）であ
り、セグメント６２１全体の発声時間は２．０秒、その
容量は１２ＫＢ（キロバイト）となる。FIG. 9 is a table for explaining the relationship between time and capacity per segment in the form shown in FIG. 8. In this table, one second corresponds to a capacity of 6 kilobytes. For example, in segment 621 of audio information sequence A, "It's" has an utterance time of 0.2 seconds and a capacity of 1.2 KB (kilobytes), "not" has 0.1 seconds and 0.6 KB, "much of a" has 0.4 seconds and 2.4 KB, and "problem" has 0.3 seconds and 1.8 KB; the utterance time of segment 621 as a whole is 2.0 seconds and its capacity is 12 KB.
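
The time-to-capacity relationship in this table is a fixed rate, which a few lines of arithmetic make explicit. The sketch assumes that "kilobyte" here means 1,000 bytes; the function and variable names are illustrative only.

```python
BYTES_PER_SECOND = 6_000  # "one second corresponds to 6 kilobytes"


def capacity_kb(seconds: float) -> float:
    """Capacity in kilobytes (1 KB = 1,000 bytes assumed) for a duration."""
    return round(seconds * BYTES_PER_SECOND / 1000, 3)


# Utterance times of the parts of segment 621, as listed in the text.
utterance_times = {"It's": 0.2, "not": 0.1, "much of a": 0.4, "problem": 0.3}
word_capacities = {word: capacity_kb(t) for word, t in utterance_times.items()}
segment_capacity = capacity_kb(2.0)  # the 2.0-second figure quoted for the segment
```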

【００７７】さらに、図１０は、図８及び図９に示され
た形態におけるディレクトリ領域の記録内容を説明する
ための表である。この表において、ディレクトリ領域
は、１セグメント当り９×３＝２７バイト（Ｂ）で構成
される。音声情報列Ａ、Ｂ、Ｃはそれぞれ図８の音声情
報列Ａ、Ｂ、Ｃに対応している。また、１バイトのＣは
属性を示し、Ｃ＝０は音声情報列Ａ、Ｃ＝６４は音声情
報列Ｂであることを意味する。また、Ｃ＝１２８、１２
９は音声情報列Ｃであることを意味し、特にＣ＝１２９
のとき、すなわちビット表現（８ビット（ｂｉｔ））
で”１００００００１”のときは前のセグメントと同じ
解説対象であることを示す（音声情報列Ｃの解説対象と
なる同じセグメント群に属していることを示し、例えば
図４（ｅ）のセグメント８０１、８０２の場合が相当す
る）。FIG. 10 is a table for explaining the recorded contents of the directory area in the form shown in FIGS. 8 and 9. In this table, the directory area consists of 9 × 3 = 27 bytes (B) per segment. Audio information sequences A, B, and C correspond to audio information sequences A, B, and C of FIG. 8, respectively. The 1-byte field C indicates an attribute: C = 0 means audio information sequence A and C = 64 means audio information sequence B. C = 128 and C = 129 both mean audio information sequence C; in particular, C = 129, that is "10000001" in 8-bit representation, indicates that the segment has the same explanation target as the preceding segment (it belongs to the same segment group explained by audio information sequence C, as is the case for segments 801 and 802 in FIG. 4(e)).

【００７８】位置情報のＭ、Ｓ、Ｂ（各１バイト）は産
業界で標準になっているＣＤ−ＲＯＭ上の位置を表わす
パラメータである。すなわちＭは分、Ｓは秒、Ｂはブロ
ックをそれぞれ示す。また、１ブロックは２，０４８バ
イトであり、７５ブロックで１秒分を構成している。し
たがって、最大の数はＭ＝５９、Ｓ＝５９、Ｂ＝７４と
なる。次の２バイトのＳＢはスタートバイトを示し、そ
の次の３バイトのＬＬＬは各セグメント全体の長さを示
している。なお、位置を示すパラメータに分、秒を使う
理由はＣＤ−ＲＯＭはもともと音楽用として開発された
ためであり、始めからの時間として記録位置を表現する
ようになっている。そのためＣＤ−ＲＯＭを当該音声情
報記録媒体として採用した場合には、この分と秒は再生
時の時間とは全く無関係であり、単に記録媒体上の記録
位置を表わしている情報にすぎないことになる。The position fields M, S, and B (one byte each) are the parameters that have become the industry standard for expressing a position on a CD-ROM: M is minutes, S is seconds, and B is blocks. One block is 2,048 bytes, and 75 blocks make up one second, so the largest values are M = 59, S = 59, and B = 74. The following 2-byte field SB is the start byte, and the 3-byte field LLL after it is the total length of the segment. Minutes and seconds are used as position parameters because the CD-ROM was originally developed for music, so a recording position is expressed as a time from the beginning. When a CD-ROM is used as the audio information recording medium, these minutes and seconds therefore have nothing to do with reproduction time; they merely denote a recording position on the medium.
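
One way to picture the 9-byte directory entry (attribute C, then M, S, B, SB, and LLL) is as a packed record. The sketch below is an assumption-laden illustration: the text does not state the byte order or whether the start byte is counted from 0 or 1, so big-endian fields and a 0-based offset are assumed here, and `DirEntry`, `parse_entry`, and `absolute_offset` are hypothetical names.

```python
import struct
from typing import NamedTuple

BLOCK_SIZE = 2048        # bytes per block
BLOCKS_PER_SECOND = 75   # blocks per second of CD-ROM address

# Attribute values described for the 1-byte field C.
SEQUENCE_BY_ATTR = {0: "A", 64: "B", 128: "C", 129: "C"}


class DirEntry(NamedTuple):
    attr: int        # C: attribute
    minute: int      # M
    second: int      # S
    block: int       # B
    start_byte: int  # SB: start byte within the block (2 bytes)
    length: int      # LLL: segment length in bytes (3 bytes)


def parse_entry(raw: bytes) -> DirEntry:
    """Decode one 9-byte entry; big-endian byte order is an assumption."""
    if len(raw) != 9:
        raise ValueError("a directory entry is 9 bytes long")
    attr, m, s, b, sb = struct.unpack(">BBBBH", raw[:6])
    length = int.from_bytes(raw[6:9], "big")
    return DirEntry(attr, m, s, b, sb, length)


def absolute_offset(entry: DirEntry) -> int:
    """Byte offset from the start of the medium (start byte taken as 0-based)."""
    blocks = (entry.minute * 60 + entry.second) * BLOCKS_PER_SECOND + entry.block
    return blocks * BLOCK_SIZE + entry.start_byte


# Entry for sequence A at 0 min 11 s 3 blocks, start byte 826, length 6,000.
raw = bytes([0, 0, 11, 3]) + (826).to_bytes(2, "big") + (6000).to_bytes(3, "big")
entry = parse_entry(raw)
```

Three such entries, one per audio information sequence, give the 27 bytes per segment mentioned for the directory area.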

【００７９】その結果、例えば音声情報列Ａにおけるセ
グメント６２１の”It's not much of a problem.”は、
Ｏ分１１秒３ブロックの８２６バイト目から６，０００
バイトの長さでネイティプスビーカの話す英語の音声情
報が記録され、音声情報列Ｂにおける対応するセグメン
トは０分１１秒３ブロックの２，０２６バイト目から１
７，４００バイトの長さで選択的に伸長された上記ネイ
ティブスピーカーの英語が記録され、音声情報列Ｃのセ
グメント群は０分１１秒６ブロックの１，２８２バイト
目から７２，０００バイトの長さで日本語解説が記録さ
れる。なお、６２１、６２２等のセグメントナンバーは
メモリ上にはなく、そのアドレスに対応している。ま
た、各セグメントの関係を示す記録位置識別情報は、こ
のディレクトリ領域に含まれる。As a result, for example, for segment 621 of audio information sequence A, "It's not much of a problem.", the native speaker's English audio is recorded with a length of 6,000 bytes starting at byte 826 of 0 minutes 11 seconds 3 blocks; the corresponding segment of audio information sequence B, the same native speaker's English selectively stretched, is recorded with a length of 17,400 bytes starting at byte 2,026 of 0 minutes 11 seconds 3 blocks; and the segment group of audio information sequence C, the Japanese explanation, is recorded with a length of 72,000 bytes starting at byte 1,282 of 0 minutes 11 seconds 6 blocks. Segment numbers such as 621 and 622 are not stored in memory; they correspond to addresses. The recording position identification information indicating the relationship between the segments is contained in this directory area.

【００８０】さらに具体的には、第１０図に示されたデ
ィレクトリ領域の記録内容から、当該音声情報記録媒体
の０分１１秒３ブロックにおける８２６バイト目から８
２６＋６，０００−１＝６，８２５バイト目までの領域
には、セグメントが６２１で属性Ｃが０の音声情報列す
なわちネイティブスピーカが話す”It's not much of a
problem.”に相当する情報が記録される。また、当該音
声情報記録媒体の０分１１秒３ブロックにおける２，０
２６バイト目から２，０２６＋１７，４００−１＝１
９，４２５バイト目までの領域には、セグメントが６２
１で属性Ｃが６４の音声情報列すなわち選択的に伸長さ
れた音声情報が記録される。さらに、当該音声情報記録
媒体の０分１１秒６ブロックにおける１，２８２バイト
目から１，２８２＋７２，０００−１＝７３，２８１バ
イト目までの領域には、セグメントが６２１で属性Ｃが
１２８の音声情報列すなわち日本語の解説に相当する情
報が記録される。More specifically, from the recorded contents of the directory area shown in FIG. 10, the area from byte 826 to byte 826 + 6,000 - 1 = 6,825 at 0 minutes 11 seconds 3 blocks of the audio information recording medium holds the audio information sequence with segment 621 and attribute C = 0, that is, the information corresponding to the native speaker's "It's not much of a problem.". The area from byte 2,026 to byte 2,026 + 17,400 - 1 = 19,425 at 0 minutes 11 seconds 3 blocks holds the audio information sequence with segment 621 and attribute C = 64, that is, the selectively stretched audio information. The area from byte 1,282 to byte 1,282 + 72,000 - 1 = 73,281 at 0 minutes 11 seconds 6 blocks holds the audio information sequence with segment 621 and attribute C = 128, that is, the information corresponding to the Japanese explanation.
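
The end positions quoted above all follow the same inclusive-range rule, last = start + length - 1. A short check, with illustrative names only:

```python
def last_byte(start: int, length: int) -> int:
    """Last byte of an inclusive range that begins at `start`."""
    return start + length - 1


# (start byte, length) for segment 621 in each sequence, from the text above.
records = {
    "A (C=0)": (826, 6_000),
    "B (C=64)": (2_026, 17_400),
    "C (C=128)": (1_282, 72_000),
}
end_positions = {name: last_byte(start, n) for name, (start, n) in records.items()}
```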

【００８１】このように、図１０に示されたディレクト
リ領域を設ければ、図９に示されたような再生時間及び
容量で図８に示された各音声情報列が記録可能である。As described above, if the directory area shown in FIG. 10 is provided, each audio information sequence shown in FIG. 8 can be recorded with the reproduction time and capacity as shown in FIG.

【００８２】次に、各セグメント６２１、６２２に関す
る情報は、例えば図１１（ａ）に示された可変長セグメ
ントのヘッダー部に記録される。このヘッダー部は、図
１１（ｂ）に示されたように、先頭から文字情報や画像
情報の有無等を示すための１バイト領域（１Ｂ）、音声
情報列Ａ用に用意された領域であって情報列タイプ（音
声情報列Ａ、Ｂ等を区別するための情報）を示す１バイ
トデータ、そのデータ長を示す３バイトデータ、及び予
備の１バイトデータから構成された５バイト領域（５
Ｂ）、音声情報列Ｂ用に用意された領域であって情報列
タイプを示す１バイトデータ、そのデータ長を示す３バ
イトデータ、及び予備の１バイトデータから構成された
５バイト領域（５Ｂ）、音声情報列Ｃ用に用意された領
域であって情報列タイプを示す１バイトデータ及びその
データ長を示す３バイトデータから構成された４バイト
領域（４Ｂ）、文字情報列Ｄ用に用意された領域であっ
て情報列タイプを示す１バイトデータ及びそのデータ長
を示す３バイトデータから構成された４バイト領域（４
Ｂ）、同様に文字情報列Ｄ用に用意された領域であって
アドレスを示す３バイトデータ及びそのデータ長を示す
３バイトデータから構成された６バイト領域（６Ｂ）、
上記第３音声情報列のような他の情報列（タイプＥ）用
に用意された４バイト領域（４Ｂ）、及び予備の３バイ
ト領域（３Ｂ）からなる、３２バイトの領域である。Next, information on each of the segments 621 and 622 is recorded, for example, in the header section of the variable-length segment shown in FIG. 11(a). As shown in FIG. 11(b), this header section is a 32-byte area consisting of, in order from the start: a 1-byte area (1B) indicating, among other things, whether character information or image information is present; a 5-byte area (5B) reserved for audio information sequence A, made up of 1 byte indicating the information sequence type (information for distinguishing audio information sequences A, B, and so on), 3 bytes indicating its data length, and 1 spare byte; a 5-byte area (5B) reserved for audio information sequence B, made up of 1 byte indicating the information sequence type, 3 bytes indicating its data length, and 1 spare byte; a 4-byte area (4B) reserved for audio information sequence C, made up of 1 byte indicating the information sequence type and 3 bytes indicating its data length; a 4-byte area (4B) reserved for character information sequence D, made up of 1 byte indicating the information sequence type and 3 bytes indicating its data length; a 6-byte area (6B), also reserved for character information sequence D, made up of 3 bytes indicating an address and 3 bytes indicating its data length; a 4-byte area (4B) reserved for another information sequence (type E) such as the third audio information sequence; and a spare 3-byte area (3B).
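
The header layout listed above can be checked by summing the field sizes and deriving each field's offset. The field names in this sketch are hypothetical labels chosen for illustration; only the sizes and their order come from the text.

```python
# (label, size in bytes) in the order given for the segment header.
HEADER_LAYOUT = [
    ("presence_flags", 1),    # character/image information present or not
    ("sequence_a", 5),        # type (1) + length (3) + spare (1)
    ("sequence_b", 5),        # type (1) + length (3) + spare (1)
    ("sequence_c", 4),        # type (1) + length (3)
    ("text_d_type_len", 4),   # type (1) + length (3)
    ("text_d_addr_len", 6),   # address (3) + length (3)
    ("other_e", 4),           # another information sequence (type E)
    ("spare", 3),
]


def field_offsets(layout: list[tuple[str, int]]) -> dict[str, tuple[int, int]]:
    """Map each label to (offset from header start, size)."""
    offsets = {}
    position = 0
    for label, size in layout:
        offsets[label] = (position, size)
        position += size
    return offsets


offsets = field_offsets(HEADER_LAYOUT)
header_size = sum(size for _, size in HEADER_LAYOUT)
```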

【００８３】次に、図１２〜図１４を用いて、この発明
に係る音声情報の再生方法及び装置構成を説明する。Next, the audio information reproducing method and apparatus configuration according to the present invention will be described with reference to FIGS. 12 to 14.

【００８４】まず、図１２は、この発明に係る音声情報
の再生方法実現するための再生装置の全体構成を示す斜
視図である。この図からも分かるように、当該音声記録
媒体は、例えばポータブルなＣＤプレイヤ（再生装置本
体２００）により再生可能なＣＤ−ＲＯＭであり、この
再生装置本体２００はコード接続されたハンドセット８
０によりリモート制御される。このハンドセット８０に
は少なくとも再生中のセグメント番号を表示する液晶デ
ィスプレイ（ＬＣＤ）等の表示部２１０や、各種制御用
ボタン群２４０が設けられている。また、操作者は再生
装置本体２００で再生された音声情報をイヤホン１３０
を介して聴くことができる。First, FIG. 12 is a perspective view showing the overall configuration of a reproducing apparatus for realizing the audio information reproducing method according to the present invention. As the figure shows, the audio recording medium is, for example, a CD-ROM that can be reproduced by a portable CD player (reproducing apparatus main body 200), and the main body 200 is remotely controlled by a handset 80 connected to it by a cord. The handset 80 is provided with a display unit 210, such as a liquid crystal display (LCD), that shows at least the number of the segment being reproduced, and with a group of control buttons 240. The operator can listen to the audio information reproduced by the main body 200 through an earphone 130.

【００８５】また、図１３は、図１２に示された再生装
置の構成を示すブロック図である。この図に示されたよ
うに、当該音声情報記録媒体１５であるＣＤ−ＲＯＭは
再生機構２０５にセットされる。再生機構２０５はディ
スクインターフェイス（Ｉ／Ｆ）３０及びバス４０を介
してＣＰＵ５０に接続されている。また、バス４０には
プログラムを格納するための例えば３２キロバイト（Ｋ
Ｂ）のＲＯＭ６０と、ディレクトリや音声情報列を一時
的に格納するための例えば２５６キロバイトのＲＡＭ７
０とが接続されている。さらに、バス４０には手動操作
のためのハンドセット８０との間で情報の授受を行なう
ハンドセットインターフェイス（Ｉ／Ｆ）９０と、音声
出力用のアンプ（ＡＭＰ）１００を介して外部端子１１
０及びハンドセット８０に接続されたＤ／Ａコンパータ
１２に接続されている。なお、ハンドセット８０には上
述されたようにイヤホン１３０が接続されている。FIG. 13 is a block diagram showing the configuration of the reproducing apparatus shown in FIG. 12. As shown in this figure, the CD-ROM serving as the audio information recording medium 15 is set in the reproducing mechanism 205. The reproducing mechanism 205 is connected to the CPU 50 via a disk interface (I/F) 30 and a bus 40. Also connected to the bus 40 are a ROM 60 of, for example, 32 kilobytes (KB) for storing the program and a RAM 70 of, for example, 256 kilobytes for temporarily storing the directory and audio information sequences. The bus 40 is further connected to a handset interface (I/F) 90, which exchanges information with the handset 80 used for manual operation, and to a D/A converter 12, which is connected to an external terminal 110 and the handset 80 via an audio output amplifier (AMP) 100. As described above, the earphone 130 is connected to the handset 80.

【００８６】図１４（ａ）、（ｂ）は、それぞれＲＯＭ
６０及びＲＡＭ７０のメモリ割り当て状況を説明するた
めの図である。図１４（ａ）に示されたように、３２キ
ロバイトのＲＯＭ６０にはプログラムが格納される。一
方、図１４（ｂ）に示されたように、ＲＡＭ７０には、
（５０＋５０）＝１００キロバイトのバッファ（５０ブ
ロック分に相当）と、（７５＋７５）＝１５０キロバイ
トのディレクトリと、６キロバイト分のシステムエリア
が割り当てられる。したがって、ＲＡＭ７０には常時５
０ブロック分の音声情報列が保持され、かつ１５０キロ
バイト÷２７≒５，５５５セグメント分のディレクトリ
（音声情報列Ａの部分のみで約３０分間に相当）が保持
される。FIGS. 14(a) and 14(b) illustrate the memory allocation of the ROM 60 and the RAM 70, respectively. As shown in FIG. 14(a), the program is stored in the 32-kilobyte ROM 60. As shown in FIG. 14(b), the RAM 70 is allocated a buffer of (50 + 50) = 100 kilobytes (equivalent to 50 blocks), a directory of (75 + 75) = 150 kilobytes, and a 6-kilobyte system area. The RAM 70 therefore always holds 50 blocks of the audio information sequence, together with a directory for 150 kilobytes ÷ 27 ≈ 5,555 segments (equivalent to about 30 minutes for audio information sequence A alone).
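
The RAM budget can be reproduced with simple arithmetic. Note the assumption made explicit in the comments: the buffer figure matches 50 blocks of 2,048 bytes (binary kilobytes), whereas the 5,555-segment figure matches 150,000 bytes divided by 27 (decimal kilobytes). The variable names are illustrative.

```python
BLOCK_SIZE = 2048            # bytes per block
DIRECTORY_ENTRY_BYTES = 27   # 9 bytes x 3 sequences per segment

# Buffer: 50 blocks, which is 100 kilobytes when 1 KB = 1,024 bytes.
buffer_bytes = 50 * BLOCK_SIZE

# Directory: 150 kilobytes; 1 KB = 1,000 bytes reproduces the quoted 5,555.
directory_segments = 150_000 // DIRECTORY_ENTRY_BYTES

# Buffer + directory + system area, in kilobytes, against the 256 KB RAM.
ram_total_kb = 100 + 150 + 6
```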

【００８７】なお、上述の具体例では当該音声情報記録
媒体としてＣＤ−ＲＯＭを用いているが、その代表的な
ものの容量は５５２メガバイト（ＭＢ）である。ＣＤ−
ＲＯＭではアドレスを表わすのに分、秒、ブロックの単
位を用いている。また、１ブロックは２，０４８バイ
ト、７５ブロックは１秒、６０秒は１分であるため、該
ＣＤ−ＲＯＭのアドレスの最大の値は５９分５９秒７４
ブロックである。逆に、このＣＤ−ＲＯＭの容量は２，
０４８×７５×６０×６０＝５５２．９６メガバイトで
ある。このうち、最初から２秒分はＣＤ−ＲＯＭのフォ
ーマットとしてユーザは使えないので、正確には最大容
量とし５５２．６５２８ＭＢとなる。さらに、最初から
２０秒に相当するところまでディレクトリ領域が割り当
てられると、３メガバイトのディレクトリ容量をＣＤ−
ＲＯＭに確保することができる。In the specific example above, a CD-ROM is used as the audio information recording medium; a typical CD-ROM has a capacity of 552 megabytes (MB). A CD-ROM expresses addresses in units of minutes, seconds, and blocks. Since one block is 2,048 bytes, 75 blocks make one second, and 60 seconds make one minute, the largest CD-ROM address is 59 minutes 59 seconds 74 blocks. The capacity of the CD-ROM is accordingly 2,048 × 75 × 60 × 60 = 552.96 megabytes. Of this, the first two seconds are taken by the CD-ROM format and are not available to the user, so the maximum usable capacity is, to be precise, 552.6528 MB. Furthermore, if the directory area is allocated from the beginning up to the position corresponding to 20 seconds, a directory capacity of 3 megabytes can be secured on the CD-ROM.
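
The capacity figures above follow directly from the block size and the minute/second/block addressing. A brief check, with 1 MB taken as 1,000,000 bytes and illustrative names:

```python
BLOCK_SIZE = 2048        # bytes per block
BLOCKS_PER_SECOND = 75   # blocks per second of address


def bytes_for(minutes: int = 0, seconds: int = 0, blocks: int = 0) -> int:
    """Bytes covered by an address span given in minutes, seconds, and blocks."""
    total_blocks = (minutes * 60 + seconds) * BLOCKS_PER_SECOND + blocks
    return total_blocks * BLOCK_SIZE


total_bytes = bytes_for(minutes=60)               # whole 60-minute address space
usable_bytes = total_bytes - bytes_for(seconds=2)  # minus the first two seconds
directory_bytes = bytes_for(seconds=20)           # area up to the 20-second mark
```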

【００８８】なお、上述されたこの発明の実施形態に
は、音声情報の記録ソフト（上述の記録方法をパーソナ
ルコンピュータ等で実施可能なプログラム、あるいは該
プログラムが記録された記録媒体）、専用記録装置、使
用マニュアル、あるいはこれらの組合わせによる販売、
係る音声情報記録媒体単体での販売の他、該音声情報記
録媒体、再生ソフト（パーソナルコンピュータ等で実効
可能なプログラム、あるいは該プログラムを記録した記
録媒体を含む）、専用再生装置、使用マニュアル、ある
いはこれらの組合わせによる販売が考えられる。The embodiments of the invention described above may be sold as audio information recording software (a program that implements the above recording method on a personal computer or the like, or a recording medium on which that program is recorded), a dedicated recording device, a user manual, or any combination of these. Besides sale of the audio information recording medium alone, sale of the recording medium together with reproduction software (including a program executable on a personal computer or the like, or a recording medium on which that program is recorded), a dedicated reproducing device, a user manual, or any combination of these is also conceivable.

【００８９】[0089]

【発明の効果】以上のようにこの発明は、第１周期でサ
ンプリングされた第１音声情報列から分割された複数の
周波数成分について、所望の部分に振幅を変更（強調あ
るいは減衰）したり波数を変更（再生時間を伸長するよ
うに増やすかあるいは再生時間を短縮のために減らす）
することにより、修正された正弦波データを生成し、こ
れら各周波数成分の正弦波データを加算することによ
り、新たに合成された第２音声情報列を所定の記録媒体
に記録する。このように記録された所望の音声情報列
は、周波数を変えることなく任意の部分で再生時間を伸
長あるいは短縮したり、任意部分の音声が強調あるいは
減衰された音声として再生できるという効果がある。As described above, according to the present invention, for each of a plurality of frequency components divided from a first audio information sequence sampled at a first period, modified sine wave data is generated by changing the amplitude of a desired part (emphasis or attenuation) or changing its number of waves (increasing it to extend the reproduction time or decreasing it to shorten the reproduction time), and the sine wave data of the frequency components are added together to record a newly synthesized second audio information sequence on a predetermined recording medium. The audio information sequence recorded in this way can be reproduced with the reproduction time of an arbitrary part extended or shortened, or with the audio of an arbitrary part emphasized or attenuated, without any change in frequency.
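
The idea of changing the number of waves and the amplitude per frequency component, then summing the components, can be illustrated with a toy additive synthesizer. This is a simplified sketch and not the procedure of the patent: the frame-based representation, the function name `synthesize`, and all parameter values are assumptions made for illustration.

```python
import math


def synthesize(amplitudes, freqs_hz, frame_s, sample_rate, stretch, gain):
    """Sum sinusoids frame by frame.

    amplitudes[k][n] is the amplitude of frequency component k in frame n.
    stretch[n] scales the duration of frame n (more or fewer cycles at the
    same frequency); gain[n] scales its amplitude.
    """
    out = []
    for n in range(len(stretch)):
        n_samples = round(frame_s * stretch[n] * sample_rate)
        for _ in range(n_samples):
            t = len(out) / sample_rate  # running time keeps the phase continuous
            out.append(sum(
                gain[n] * amp[n] * math.sin(2 * math.pi * f * t)
                for amp, f in zip(amplitudes, freqs_hz)
            ))
    return out


amplitudes = [[1.0, 1.0], [0.5, 0.5]]  # two components, two frames
freqs_hz = [200.0, 400.0]

plain = synthesize(amplitudes, freqs_hz, 0.01, 8000, [1.0, 1.0], [1.0, 1.0])
edited = synthesize(amplitudes, freqs_hz, 0.01, 8000, [1.0, 2.0], [1.0, 1.5])
```

In `edited`, the second frame lasts twice as long and is louder, yet each component is still generated at 200 Hz and 400 Hz; only the number of cycles and the amplitude change.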

【００９０】また、この発明は、日本国特許第２５８１
７００号に開示された技術との組合わが可能であり、ネ
イティブスピーカーの発声音を発声の節目で分割した可
変長の区画に対応して、任意部分の音声が伸長及び／又
は強調された音声情報を別途用意することにより、初級
学習者は聞き取れなかった音声を繰り返し再生して聞く
ことができるとともに、係る音声の聞き取り難い部分が
強調・伸長された音声としても聞くことが可能になると
いう効果がある。また上級学習者とっては、任意部分の
音声が短縮及び／又は減衰された音声情報を別途用意す
ることにより、ネイティブスピーカーの発声音の再生と
組合わせて、より積極的な学習が可能になるという効果
がある。Also, the present invention can be combined with the technique disclosed in Japanese Patent No. 2581700. By separately preparing audio information in which arbitrary parts are stretched and/or emphasized, corresponding to the variable-length sections obtained by dividing a native speaker's utterance at utterance breakpoints, a beginner can not only replay repeatedly audio that could not be caught but also hear the hard-to-catch parts as emphasized and stretched audio. For advanced learners, separately preparing audio information in which arbitrary parts are shortened and/or attenuated enables more active learning in combination with reproduction of the native speaker's utterance.

[Brief description of the drawings]

FIG. 1 is a conceptual diagram for explaining the audio information recording operation according to the present invention.

FIG. 2 is a table showing an example of the frequency components (channels) into which sampled input audio information is divided.

FIG. 3 is a diagram for explaining the basic shape of an audio spectrum.

FIG. 4 is a diagram conceptually explaining the various types of information, including the audio information sequence, to be recorded on the audio recording medium according to the present invention.

FIG. 5 is a diagram showing the overall configuration of the peripheral devices for realizing the audio information recording method according to the present invention.

FIG. 6 is a flowchart (part 1) for explaining the audio information recording method according to the present invention.

FIG. 7 is a flowchart (part 2) for explaining the audio information recording method according to the present invention.

FIG. 8 is a diagram for explaining each audio information sequence of an audio recording medium according to the present invention applied to self-study of English conversation, together with its recorded contents.

FIG. 9 is a table for explaining the relationship between time and capacity per segment for each audio information sequence shown in FIG. 8.

FIG. 10 is a table for explaining the recorded contents (including recording position identification information) of the directory area in the audio recording medium shown in FIGS. 8 and 9.

FIG. 11 is a diagram showing the configuration of the variable-length segments to be recorded on the audio recording medium according to the present invention.

FIG. 12 is a perspective view showing the overall configuration of a reproducing apparatus that realizes the audio recording medium reproducing method according to the present invention.

FIG. 13 is a block diagram showing the configuration of the reproducing apparatus shown in FIG. 12.

FIG. 14 is a diagram for explaining the memory allocation of the ROM and RAM shown in FIG. 13.

[Explanation of symbols]

1 ... PC, 10 ... control system, 14 ... input/output device, 15 ... audio information recording medium, 19 ... V data, 16-1 to 16-85 ... sine wave data generation circuits, 17, 175 ... DAC, 18, 177 ... speaker, 173 ... adder.

Claims

[Claims]

1. An audio information recording method for recording audio information on a predetermined recording medium, comprising: dividing a first audio information sequence sampled at a first period into a plurality of frequency components; generating, for each of the plurality of frequency components, a corrected amplitude information sequence in which one or more predetermined portions of an amplitude information sequence, composed of amplitude information extracted at a second period corresponding to at least one waveform, have been selectively edited; generating V data composed of information component groups, each consisting of the amplitude information that was extracted at the same timing and corresponds to one another across the frequency components in the corrected amplitude information sequences, and of control information, prepared for each information component group, for instructing extension or shortening of the audio reproduction time with the first period as a reference; generating, for each of the plurality of frequency components, sine wave data that has the amplitude given by the generated V data and a data interval of the first period, and whose number of waves corresponds to the reproduction time specified by the control information included in the V data; and recording on the predetermined recording medium a second audio information sequence of the first period obtained by sequentially adding the sine wave data generated for each of the plurality of frequency components.
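As an illustration of the synthesis step recited in claim 1, the following Python sketch adds one sine wave per frequency channel, using the amplitudes of each frame and a per-frame stretch factor standing in for the control information. The function name `synthesize`, the frame layout `(amplitudes, stretch)`, and the use of a sample count in place of a wave count are assumptions made for illustration; they are not taken from the specification.

```python
import math


def synthesize(frames, freqs_hz, sample_rate, frame_len):
    """Additively synthesize audio from per-frame channel amplitudes.

    frames      : list of (amplitudes, stretch) pairs; `amplitudes` holds one
                  value per frequency channel, `stretch` scales the frame's
                  duration (hypothetical stand-in for the control information).
    freqs_hz    : centre frequency of each channel, in Hz.
    sample_rate : output sampling rate in Hz (the "first period" is 1/sample_rate).
    frame_len   : number of output samples in an unstretched frame.
    """
    out = []
    # One running phase per channel keeps each sine wave continuous across
    # frames, so stretching a frame changes duration but not frequency.
    phases = [0.0] * len(freqs_hz)
    for amplitudes, stretch in frames:
        n_samples = int(round(frame_len * stretch))
        for _ in range(n_samples):
            sample = 0.0
            for ch, (amp, freq) in enumerate(zip(amplitudes, freqs_hz)):
                sample += amp * math.sin(phases[ch])
                phases[ch] += 2.0 * math.pi * freq / sample_rate
            out.append(sample)
    return out
```

In this sketch a stretch of 2.0 doubles the duration of a frame and 0.5 halves it, while the channel frequencies stay fixed, which mirrors the claimed extension or shortening without a change in frequency.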

2. The audio information recording method according to claim 1, wherein the corrected amplitude information sequence for each of the plurality of frequency components is generated by changing the amplitude values given by the amplitude information of selected portions that correspond to one another across the amplitude information sequences of the plurality of frequency components.
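The amplitude edit of claim 2 can be pictured as applying the same change to the same frame range in every channel. The sketch below assumes the edit is a simple gain and that each channel's amplitude information sequence is a Python list; `edit_amplitudes` and its parameters are hypothetical names.

```python
def edit_amplitudes(channels, start, stop, gain):
    """Scale frames start..stop-1 of every channel by `gain`.

    channels : list of amplitude sequences, one per frequency channel,
               all indexed by the same frame number.
    gain     : values above 1.0 emphasize the portion, values below 1.0 attenuate it.
    Returns new lists; the input sequences are left unmodified.
    """
    return [
        [amp * gain if start <= i < stop else amp for i, amp in enumerate(seq)]
        for seq in channels
    ]
```

Because the same frame range is edited in every channel, the spectral balance of the selected portion is preserved while its overall level changes.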

3. The audio information recording method according to claim 1 or 2, wherein, for each of the plurality of frequency components, each amplitude of the generated sine wave data is determined by a value obtained by linear interpolation between mutually adjacent pieces of amplitude information in the corrected amplitude information sequence.
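The linear interpolation of claim 3 can be sketched as follows for a single channel. The helper names and the choice of `n` output values per pair of adjacent amplitude values are assumptions for illustration only.

```python
def interpolate_amplitudes(a_start, a_end, n):
    """Return n amplitudes moving linearly from a_start toward a_end (a_end excluded)."""
    return [a_start + (a_end - a_start) * i / n for i in range(n)]


def amplitude_envelope(amps, n):
    """Build a per-sample envelope from one channel's amplitude sequence.

    amps : non-empty list of amplitude values, one per extraction instant.
    n    : number of output samples between two adjacent amplitude values.
    """
    env = []
    for a0, a1 in zip(amps, amps[1:]):
        env.extend(interpolate_amplitudes(a0, a1, n))
    env.append(amps[-1])  # close the envelope on the final amplitude value
    return env
```

Interpolating rather than holding each amplitude constant avoids abrupt steps in the envelope at the boundaries between extraction instants.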

4. The audio information recording method according to any one of claims 1 to 3, wherein the control information includes frequency shift instruction information for reproducing the plurality of frequency components as a whole in a state shifted toward higher or lower pitch.

5. The audio information recording method according to any one of claims 1 to 4, wherein the first audio information sequence is an audio information sequence corresponding to one or more sentences composed of a word sequence to be reproduced and output by predetermined audio reproducing means, and is recorded on the predetermined recording medium divided into variable-length sections, one for each piece of information split at a break in pronunciation.

6. The audio information recording method according to claim 5, wherein the second audio information sequence is recorded on the predetermined recording medium in sections divided so as to correspond to the sections of the first audio information sequence, and recording position identification information, which indicates each switchable section by its recording position on the predetermined recording medium, is further recorded on the recording medium so that the first audio information sequence and the second audio information sequence can be switched and reproduced by the predetermined audio reproducing means.

7. An audio information recording medium on which a second audio information sequence has been recorded by the audio information recording method according to claim 1.

8. An audio information recording medium on which is recorded at least a second audio information sequence whose amplitude and reproduction time have been selectively edited by changing at least the amplitude or the number of waveforms of one or more portions that correspond to one another across the frequency components constituting a first audio information sequence sampled at a predetermined period.

9. The audio information recording medium according to claim 7 or 8, wherein the first audio information sequence is an audio information sequence corresponding to one or more sentences composed of a word sequence to be reproduced and output by predetermined audio reproducing means, and is recorded divided into variable-length sections, one for each piece of information split at a break in pronunciation.

10. The audio information recording medium according to claim 9, wherein the second audio information sequence is recorded in sections divided so as to correspond to the sections of the first audio information sequence, and recording position identification information, which indicates each switchable section by its recording position on the predetermined recording medium, is further recorded so that the first audio information sequence and the second audio information sequence can be switched and reproduced by the predetermined audio reproducing means.

11. An audio information reproducing method for reproducing an audio information sequence recorded in advance on a predetermined recording medium, wherein the recording medium contains at least: a first audio information sequence corresponding to one or more sentences composed of a word sequence to be reproduced and output by predetermined audio reproducing means, divided into variable-length sections, one for each piece of information split at a break in pronunciation; a second audio information sequence divided so as to correspond to the sections of the first audio information sequence, whose amplitude and reproduction time have been selectively changed by changing at least the amplitude or the number of waveforms of one or more portions that correspond to one another across the frequency components constituting the first audio information sequence; and recording position identification information that indicates each switchable section by its recording position on the predetermined recording medium, so that the first audio information sequence and the second audio information sequence can be switched and reproduced by the predetermined audio reproducing means; the method comprising at least: a first step of, in response to a command to reproduce the second audio information sequence that is input during reproduction of the first audio information sequence or after its interruption, reading from the recording medium, based on the recording position identification information, the audio information of the section in the second audio information sequence that corresponds to the section in the first audio information sequence being reproduced, and reproducing the read audio information by the predetermined audio reproducing means; and a second step of, in response to a command to reproduce the first audio information sequence that is input during reproduction of the second audio information sequence or after its interruption, reading from the recording medium, based on the recording position identification information, the audio information of the section in the first audio information sequence that corresponds to the section in the second audio information sequence being reproduced, and reproducing the read audio information by the predetermined audio reproducing means.
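The two switching steps of claim 11 amount to looking up, in a directory of recording positions, the section of the other sequence that has the same index as the section currently being played. The sketch below models that lookup only; `Section`, `SwitchablePlayer`, the stream keys `"first"` and `"second"`, and the byte-offset layout are all hypothetical, and actual media access and audio output are omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Section:
    offset: int  # assumed recording position of the section on the medium
    length: int  # assumed length of the section


class SwitchablePlayer:
    """Tracks which section is playing and resolves switches between streams."""

    def __init__(self, directory):
        # directory maps a stream name to its list of sections; section i of
        # every stream is assumed to cover the same part of the utterance.
        self.directory = directory
        self.stream = "first"
        self.index = 0

    def current(self):
        """Return the section currently selected for reproduction."""
        return self.directory[self.stream][self.index]

    def switch_to(self, stream):
        """Switch streams, keeping the section index, and return the section to read."""
        if stream not in self.directory:
            raise KeyError(stream)
        self.stream = stream
        return self.current()
```

For example, a player positioned on the second section of the first sequence that receives a command to play the second sequence would read the second section of that sequence, and a later command for the first sequence would return it to the matching original section.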

12. An audio information sequence reproducing apparatus for carrying out the audio information reproducing method according to claim 11.