JP3639461B2

JP3639461B2 - Audio signal pitch period extraction method, audio signal pitch period extraction apparatus, audio signal time axis compression apparatus, audio signal time axis expansion apparatus, audio signal time axis compression / expansion apparatus

Info

Publication number: JP3639461B2
Application number: JP17824399A
Authority: JP
Inventors: 健生井上
Original assignee: Sanyo Electric Co Ltd
Current assignee: Sanyo Electric Co Ltd
Priority date: 1998-09-29
Filing date: 1999-06-24
Publication date: 2005-04-20
Anticipated expiration: 2019-06-24
Also published as: CN1320256A; EP1136981A1; CN1158640C; CA2345712A1; WO2000019407A1; JP2000305581A; EP1136981A4

Abstract

A voice signal pitch period detecting method detects the pitch period of an input voice waveform from a segment of predetermined duration, taking out a predetermined number of pitch periods at a time. When the detected pitch period is not more than a predetermined reference value, the waveform of the predetermined number of pitch periods that follows the waveform just processed is assumed to have the same pitch period as the one currently detected, which reduces the number of times pitch period detection is performed.

Description

【０００１】
【発明の属する技術分野】
本発明は、音声信号のピッチ周期抽出方法、及び音声信号のピッチ周期抽出装置、並びに音声信号の時間軸圧縮装置、及び音声信号の時間軸伸長装置、さらには音声信号の時間軸圧縮伸長装置に関する。
【０００２】
【従来の技術】
半導体メモリ等に音声を記録する場合やディジタル伝送系等で音声を伝送する場合には、音声レベルを直接符号化するＰＣＭ方法のほか、記録側で音声の特徴を表すパラメータ形式で分析して記録し、再生側でそのパラメータから音声を合成する音声符号化方法等が注目されている。
【０００３】
斯かる音声の特徴を表すパラメータの１つにピッチ周期があり、このピッチ周期は一般的に声の高さを表すものである。
【０００４】
該ピッチ周期抽出方法の１つに自己相関を利用するものがある。
【０００５】
自己相関を用いたピッチ周期抽出法には、信号は時間制限されていると仮定し、時間長Ｔsの区間内だけに信号が存在し、その時間長Ｔsの区間外では信号は常にゼロとして自己相関を求める短時間自己相関を用いる方法がある。これは、コロナ社発行「音声のディジタル信号処理」（上）−L.R.Rabiner＆R.W.Schafer著、鈴木久喜訳−p152-p152にも記載されているように、いま、音声波形をディジタル音声データｘ(ｎ)で表すと、前述の方法による短時間自己相関値Ｒｎ(ｋ)は下記のようになる。
【０００６】
【数１】

【０００７】
ここで、Ｔsは音声信号が存在すると仮定した時間区間、ｋは短時間自己相関値Ｒｎ(ｋ)を算出するときに音声波形を遅延させる際の遅延時間であり、Ｔs≫ｋの関係にある。
【０００８】
【発明が解決しようとする課題】
然し乍ら、音声信号の圧縮・伸長処理を行う場合には、音声波形のピッチ周期を求める必要があり、ピッチ周期が短い波形の場合にはピッチ周期が長い波形に比べて単位時間当たりのピッチ周期の抽出回数が多くなり、これによりピッチ周期抽出時間を要してしまう結果、処理手段（プロセッサ）に負担が掛かる問題点を有していた。
【０００９】
さらに詳述すると、ピッチ周期を抽出する時間期間Ｔｓは、想定される最大ピッチ（つまり最も長いピッチ周期）の２倍に設定している。そして２つのピッチ周期分を抽出しながら圧縮や伸長の処理を行うので（圧縮については図２、伸長については図７に示すが、その詳細は後述する）ピッチ周期が長い波形の場合は、図１４（ｂ）に示すように、ピッチ周期２つ分ずつ波形を抽出してもピッチ周期を抽出する時間期間Ｔｓはオーバーラップすることはない。
【００１０】
然し、ピッチ周期が短い波形の場合は、図１４（ａ）に示すように、ピッチ周期２つ分ずつ波形を抽出していくと、ピッチ周期を抽出する時間期間Ｔｓがオーバーラップしてしまう（ピッチ周期抽出１、２、３を参照）。これは、前述したように、ピッチ周期を抽出する時間期間Ｔｓが、長いピッチ周期を有する波形のピッチ周期を抽出する時間期間Ｔｓの２倍〜３倍の固定幅であるために生じる問題である。
【００１１】
このようなことから、ピッチ周期が短い波形の場合は、ピッチ周期が長い波形に比べて、単位時間当たりのピッチ周期の抽出回数が多くなり、ピッチ周期の抽出処理を行う処理手段（プロセッサ）に大きな負担がかかっていた。
【００１２】
ところで、前記したように、図１４（ａ）に示すようにピッチ周期が短い波形の場合にはピッチ周期を抽出する時間期間Ｔｓがオーバーラップしている。
【００１３】
従って、例えば、前記図１４（ａ）の場合において、ピッチ周期抽出１とピッチ周期抽出３のところで自己相関値Ｒｎ(ｋ)を計算し、ピッチ周期抽出２のところでは自己相関値Ｒｎ(ｋ)を計算しないようにしても影響が少ないことがわかった。即ち、ピッチ周期が短い波形の場合には、自己相関値Ｒｎ(ｋ)を毎回計算しなくても影響が少ないといえる。
【００１４】
また、人の声は同じピッチ周期で繰り返される波形で構成されることが多い。ピッチ周期が短い波形で構成される音声（即ち女性などの高い声）の場合には、ピッチ周期が長い波形で構成される音声（即ち男性などの低い声）に比べて、所定期間内における同一ピッチ周期の波形の数が多いことになる。
【００１５】
ピッチ周期が短い波形で構成される音声の場合は、前記図１４（ａ）のようにピッチ周期を抽出する時間期間Ｔｓがオーバーラップすることになるが、前述のように、人の声は同じピッチ周期で繰り返される波形で構成されることが多いため、このような観点からも、自己相関値Ｒｎ(ｋ)を毎回計算しなくても影響が少ないことがわかった。
【００１６】
従って、本発明は、このような着目点に基づいてなされたものであり、短い処理時間で入力音声信号からピッチ周期を抽出する音声信号のピッチ周期抽出方法、及び音声信号のピッチ周期抽出装置等を提供することを目的とする。
【００１７】
【課題を解決するための手段】
上記の課題を解決するために本発明は、ピッチ周期を抽出する時間期間 Ts を入力音声信号の時間軸方向に順次シフトさせ、それぞれの時間期間 Ts 毎に前記入力音声信号のピッチ周期を抽出するピッチ周期抽出方法において、前記時間期間 Ts が所定のシフト位置にあるときに抽出したピッチ周期が、予め設定した閾値以下の場合、その後のシフト位置における前記時間期間 Ts に対してはピッチ周期を抽出せず、当該その後のシフト位置における時間期間 Ts に対し、前記所定のシフト位置に前記時間期間 Ts があるときに抽出した前記閾値以下のピッチ周期を割り当てる、ことを特徴とする。
【００１８】
また本発明は、ピッチ周期を抽出する時間期間 Ts を入力音声信号の時間軸方向に順次シフトさせ、それぞれの時間期間 Ts 毎に前記入力音声信号のピッチ周期を抽出するピッチ周期抽出装置において、前記時間期間 Ts の音声信号からピッチ周期を抽出するピッチ周期抽出手段と、前記ピッチ周期抽出手段によって抽出されたピッチ周期を記憶する記憶手段と、前記ピッチ周期に対し閾値を設定する閾値設定手段と、前記ピッチ周期抽出手段によって抽出されたピッチ周期と前記閾値設定手段にて設定された閾値とを比較し、当該ピッチ周期が前記閾値以下であるかどうかを判定するピッチ周期判定手段と、前記ピッチ周期判定手段によって、前記ピッチ周期抽出手段によって抽出されたピッチ周期が前記閾値以下であると判定された場合、その後のシフト位置における前記時間期間 Ts に対してはピッチ周期を抽出せず、当該その後のシフト位置における前記時間期間 Ts に対し、前記記憶手段に記憶されたピッチ周期を割り当てるピッチ周期割り当て手段と、を備えることを特徴とする。
【００１９】
また本発明は、ピッチ周期を抽出する時間期間 Ts を入力音声信号の時間軸方向に順次シフトさせ、それぞれの時間期間 Ts 毎に前記入力音声信号のピッチ周期を抽出するピッチ周期抽出装置と、該ピッチ周期抽出装置によって抽出されたピッチ周期に基づいて入力音声信号の時間軸の圧縮処理を行う時間軸圧縮手段と、を備えた音声信号の時間軸圧縮装置において、前記ピッチ周期抽出装置は、前記時間期間 Ts の音声信号からピッチ周期を抽出するピッチ周期抽出手段と、前記ピッチ周期抽出手段によって抽出されたピッチ周期を記憶する記憶手段と、前記ピッチ周期に対し閾値を設定する閾値設定手段と、前記ピッチ周期抽出手段によって抽出されたピッチ周期と前記閾値設定手段にて設定された閾値とを比較し、当該ピッチ周期が前記閾値以下であるかどうかを判定するピッチ周期判定手段と、前記ピッチ周期判定手段によって、前記ピッチ周期抽出手段によって抽出されたピッチ周期が前記閾値以下であると判定された場合、その後のシフト位置における前記時間期間 Ts に対してはピッチ周期を抽出せず、当該その後のシフト位置における前記時間期間 Ts に対し、前記記憶手段に記憶されたピッチ周期を割り当てるピッチ周期割り当て手段と、を備えることを特徴とする。
【００２０】
また本発明は、ピッチ周期を抽出する時間期間 Ts を入力音声信号の時間軸方向に順次シフトさせ、それぞれの時間期間 Ts 毎に前記入力音声信号のピッチ周期を抽出するピッチ周期抽出装置と、該ピッチ周期抽出装置によって抽出されたピッチ周期に基づいて入力音声信号の時間軸の伸長処理を行う時間軸伸長手段と、を備えた音声信号の時間軸伸長装置において、前記ピッチ周期抽出装置は、前記時間期間 Ts の音声信号からピッチ周期を抽出するピッチ周期抽出手段と、前記ピッチ周期抽出手段によって抽出されたピッチ周期を記憶する記憶手段と、前記ピッチ周期に対し閾値を設定する閾値設定手段と、前記ピッチ周期抽出手段によって抽出されたピッチ周期と前記閾値設定手段にて設定された閾値とを比較し、当該ピッチ周期が前記閾値以下であるかどうかを判定するピッチ周期判定手段と、前記ピッチ周期判定手段によって、前記ピッチ周期抽出手段によって抽出されたピッチ周期が前記閾値以下であると判定された場合、その後のシフト位置における前記時間期間 Ts に対してはピッチ周期を抽出せず、当該その後のシフト位置における前記時間期間 Ts に対し、前記記憶手段に記憶されたピッチ周期を割り当てるピッチ周期割り当て手段と、を備えることを特徴とする。
【００２１】
また本発明は、入力された音声信号を符号化する符号化手段と、前記符号化手段で符号化された信号を格納するメモリ手段と、前記メモリ手段に格納された信号を復号する復号化手段と、ピッチ周期を抽出する時間期間 Ts を前記復号化手段より出力される信号の時間軸方向に順次シフトさせ、それぞれの時間期間 Ts 毎に前記復号化手段より出力される信号のピッチ周期を抽出するピッチ周期抽出装置と、前記ピッチ周期抽出装置によって抽出されたピッチ周期に基づいて前記復号化手段より出力される信号の時間軸の圧縮処理を行う時間軸圧縮手段と、前記ピッチ周期抽出装置によって抽出されたピッチ周期に基づいて前記復号化手段より出力される信号の時間軸の伸長処理を行う時間軸伸長手段と、前記復号化手段より出力される信号を選択的に前記時間軸圧縮手段または時間軸伸長手段に導く選択手段と、を備えた音声信号の時間軸圧縮伸長装置において、前記ピッチ周期抽出装置は、前記時間期間 Ts の信号からピッチ周期を抽出するピッチ周期抽出手段と、前記ピッチ周期抽出手段によって抽出されたピッチ周期を記憶する記憶手段と、前記ピッチ周期に対し閾値を設定する閾値設定手段と、前記ピッチ周期抽出手段によって抽出されたピッチ周期と前記閾値設定手段にて設定された閾値とを比較し、当該ピッチ周期が前記閾値以下であるかどうかを判定するピッチ周期判定手段と、前記ピッチ周期判定手段によって、前記ピッチ周期抽出手段によって抽出されたピッチ周期が前記閾値以下であると判定された場合、その後のシフト位置における前記時間期間 Ts に対してはピッチ周期を抽出せず、当該その後のシフト位置における前記時間期間 Ts に対し、前記記憶手段に記憶されたピッチ周期を割り当てるピッチ周期割り当て手段と、を備えることを特徴とする。
【００２２】
また本発明は、ピッチ周期を抽出する時間期間 Ts を入力音声信号の時間軸方向に順次シフトさせ、それぞれの時間期間 Ts 毎に前記入力音声信号のピッチ周期を抽出するピッチ周期抽出装置と、該ピッチ周期抽出装置によって抽出されたピッチ周期に基づいて入力音声信号の時間軸の圧縮処理を行う時間軸圧縮手段と、前記時間軸圧縮手段によって時間軸圧縮された信号を格納するメモリ手段と、前記ピッチ周期抽出装置によって抽出されたピッチ周期に基づいて、前記メモリ手段に格納された信号の時間軸の伸長処理を行う時間軸伸長手段と、を備えた音声信号の時間軸圧縮伸長装置において、前記ピッチ周期抽出装置は、前記時間期間 Ts の音声信号からピッチ周期を抽出するピッチ周期抽出手段と、前記ピッチ周期抽出手段によって抽出されたピッチ周期を記憶する記憶手段と、前記ピッチ周期に対し閾値を設定する閾値設定手段と、前記ピッチ周期抽出手段によって抽出されたピッチ周期と前記閾値設定手段にて設定された閾値とを比較し、当該ピッチ周期が前記閾値以下であるかどうかを判定するピッチ周期判定手段と、前記ピッチ周期判定手段によって、前記ピッチ周期抽出手段によって抽出されたピッチ周期が前記閾値以下であると判定された場合、その後のシフト位置における前記時間期間 Ts に対してはピッチ周期を抽出せず、当該その後のシフト位置における前記時間期間 Ts に対し、前記記憶手段に記憶されたピッチ周期を割り当てるピッチ周期割り当て手段と、を備えることを特徴とする。
【００２３】
また本発明はピッチ周期を抽出する時間期間 Ts １を入力音声信号の時間軸方向に順次シフトさせ、それぞれの時間期間 Ts １毎に前記入力音声信号のピッチ周期を抽出する第１のピッチ周期抽出装置と、前記第１のピッチ周期抽出装置によって抽出されたピッチ周期に基づいて入力音声信号の時間軸の圧縮処理を行う時間軸圧縮手段と、前記時間軸圧縮手段によって時間軸圧縮された信号を格納するメモリ手段と、ピッチ周期を抽出する時間期間 Ts ２を前記メモリ手段に格納された信号の時間軸方向に順次シフトさせ、それぞれの時間期間 Ts ２毎に前記メモリ手段に格納された信号のピッチ周期を抽出する第２のピッチ周期抽出装置と、前記第２のピッチ周期抽出装置によって抽出されたピッチ周期に基づいて、前記メモリ手段に格納された信号の時間軸の伸長処理を行う時間軸伸長手段と、を備えた音声信号の時間軸圧縮伸長装置において、前記第１のピッチ周期抽出装置は、前記時間期間 Ts １の音声信号からピッチ周期を抽出する第１のピッチ周期抽出手段と、前記第１のピッチ周期抽出手段によって抽出されたピッチ周期を記憶する第１の記憶手段と、前記ピッチ周期に対し閾値を設定する第１の閾値設定手段と、前記第１のピッチ周期抽出手段によって抽出されたピッチ周期と前記第１の閾値設定手段にて設定された閾値とを比較し、当該ピッチ周期が前記閾値以下であるかどうかを判定する第１のピッチ周期判定手段と、前記第１のピッチ周期判定手段によって、前記第１のピッチ周期抽出手段によって抽出されたピッチ周期が前記閾値以下であると判定された場合、その後のシフト位置における前記時間期間 Ts １に対してはピッチ周期を抽出せず、当該その後のシフト位置における前記時間期間 Ts １に対し、前記第１の記憶手段に記憶されたピッチ周期を割り当てる第１のピッチ周期割り当て手段と、を備え、前記第２のピッチ周期抽出装置は、前記時間期間 Ts ２の信号からピッチ周期を抽出する第２のピッチ周期抽出手段と、前記第２のピッチ周期抽出手段によって抽出されたピッチ周期を記憶する第２の記憶手段と、前記ピッチ周期に対し閾値を設定する第２の閾値設定手段と、前記第２のピッチ周期抽出手段によって抽出されたピッチ周期と前記第２の閾値設定手段にて設定された閾値とを比較し、当該ピッチ周期が前記閾値以下であるかどうかを判定する第２のピッチ周期判定手段と、前記第２のピッチ周期判定手段によって、前記第２のピッチ周期抽出手段によって抽出されたピッチ周期が前記閾値以下であると判定された場合、その後のシフト位置における前記時間期間 Ts ２に対してはピッチ周期を抽出せず、当該その後のシフト位置における前記時間期間 Ts ２に対し、前記第２の記憶手段に記憶されたピッチ周期を割り当てる第２のピッチ周期割り当て手段と、を備えることを特徴とする。
【００２４】
また本発明は、ピッチ周期を抽出する時間期間 Ts を入力音声信号の時間軸方向に順次シフトさせ、それぞれの時間期間 Ts 毎に前記入力音声信号のピッチ周期を抽出するピッチ周期抽出装置と、前記ピッチ周期抽出装置によって抽出されたピッチ周期に基づいて入力音声信号の時間軸の圧縮処理を行う時間軸圧縮手段と、前記時間軸圧縮手段によって時間軸圧縮された信号を複数の帯域に分割して符号化する帯域分割符号化手段と、前記帯域分割符号化手段によって符号化された信号を格納するメモリ手段と、前記メモリ手段に格納された信号を復号する帯域分割復号化手段と、前記ピッチ周期抽出装置で抽出したピッチ周期に基づいて、前記帯域分割復号化手段から出力される信号の時間軸の伸長処理を行う時間軸伸長手段と、を備えた音声信号の時間軸圧縮伸長装置において、前記ピッチ周期抽出装置は、前記時間期間 Ts の音声信号からピッチ周期を抽出するピッチ周期抽出手段と、前記ピッチ周期抽出手段によって抽出されたピッチ周期を記憶する記憶手段と、前記ピッチ周期に対し閾値を設定する閾値設定手段と、前記ピッチ周期抽出手段によって抽出されたピッチ周期と前記閾値設定手段にて設定された閾値とを比較し、当該ピッチ周期が前記閾値以下であるかどうかを判定するピッチ周期判定手段と、前記ピッチ周期判定手段によって、前記ピッチ周期抽出手段によって抽出されたピッチ周期が前記閾値以下であると判定された場合、その後のシフト位置における前記時間期間 Ts に対してはピッチ周期を抽出せず、当該その後のシフト位置における前記時間期間 Ts に対し、前記記憶手段に記憶されたピッチ周期を割り当てるピッチ周期割り当て手段と、を備えることを特徴とする。
【００２５】
また本発明は、ピッチ周期を抽出する時間期間 Ts １を入力音声信号の時間軸方向に順次シフトさせ、それぞれの時間期間 Ts １毎に前記入力音声信号のピッチ周期を抽出する第１のピッチ周期抽出装置と、前記第１のピッチ周期抽出装置によって抽出されたピッチ周期に基づいて入力音声信号の時間軸の圧縮処理を行う時間軸圧縮手段と、前記時間軸圧縮手段によって時間軸圧縮された信号を複数の帯域に分割して符号化する帯域分割符号化手段と、前記帯域分割符号化手段にによって符号化された信号を格納するメモリ手段と、前記メモリ手段に格納された信号を復号する帯域分割復号化手段と、ピッチ周期を抽出する時間期間 Ts ２を前記帯域分割復号化手段から出力される信号の時間軸方向に順次シフトさせ、それぞれの時間期間 Ts ２毎に前記帯域分割復号化手段から出力される信号のピッチ周期を抽出する第２のピッチ周期抽出装置と、前記第２のピッチ周期抽出装置によって抽出されたピッチ周期に基づいて、前記帯域分割復号化手段から出力される信号の時間軸の伸長処理を行う時間軸伸長手段と、を備えた音声信号の時間軸圧縮伸長装置において、前記第１のピッチ周期抽出装置は、前記時間期間 Ts １の音声信号からピッチ周期を抽出する第１のピッチ周期抽出手段と、前記第１のピッチ周期抽出手段によって抽出されたピッチ周期を記憶する第１の記憶手段と、前記ピッチ周期に対し閾値を設定する第１の閾値設定手段と、前記第１のピッチ周期抽出手段によって抽出されたピッチ周期と前記第１の閾値設定手段にて設定された閾値とを比較し、当該ピッチ周期が前記閾値以下であるかどうかを判定する第１のピッチ周期判定手段と、前記第１のピッチ周期判定手段によって、前記第１のピッチ周期抽出手段によって抽出されたピッチ周期が前記閾値以下であると判定された場合、その後のシフト位置における前記時間期間 Ts １に対してはピッチ周期を抽出せず、当該その後のシフト位置における前記時間期間 Ts １に対し、前記第１の記憶手段に記憶されたピッチ周期を割り当てる第１のピッチ周期割り当て手段と、を備え、前記第２のピッチ周期抽出装置は、前記時間期間 Ts ２の信号からピッチ周期を抽出する第２のピッチ周期抽出手段と、前記第２のピッチ周期抽出手段によって抽出されたピッチ周期を記憶する第２の記憶手段と、前記ピッチ周期に対し閾値を設定する第２の閾値設定手段と、前記第２のピッチ周期抽出手段によって抽出されたピッチ周期と前記第２の閾値設定手段にて設定された閾値とを比較し、当該ピッチ周期が前記閾値以下であるかどうかを判定する第２のピッチ周期判定手段と、前記第２のピッチ周期判定手段によって、前記第２のピッチ周期抽出手段によって抽出されたピッチ周期が前記閾値以下であると判定された場合、その後のシフト位置における前記時間期間 Ts ２に対してはピッチ周期を抽出せず、当該その後のシフト位置における前記時間期間 Ts ２に対し、前記第２の記憶手段に記憶されたピッチ周期を割り当てる第２のピッチ周期割り当て手段と、を備えることを特徴とする音声信号の時間軸圧縮伸長装置。
【００２６】
また本発明は、ピッチ周期を抽出する時間期間 Ts を入力音声信号の時間軸方向に順次シフトさせ、それぞれの時間期間 Ts 毎に前記入力音声信号のピッチ周期を抽出するピッチ周期抽出装置と、再生速度情報を出力する再生速度設定手段と、前記再生速度設定手段からの再生速度情報及び、前記ピッチ周期抽出装置によって抽出されたピッチ周期に基づいて、入力音声信号の時間軸の圧縮処理を行う時間軸圧縮手段と、を備えた音声信号の時間軸圧縮装置において、前記ピッチ周期抽出装置は、前記時間期間 Ts の音声信号からピッチ周期を抽出するピッチ周期抽出手段と、前記ピッチ周期抽出手段によって抽出されたピッチ周期を記憶する記憶手段と、前記ピッチ周期に対し閾値を設定する閾値設定手段と、前記ピッチ周期抽出手段によって抽出されたピッチ周期と前記閾値設定手段にて設定された閾値とを比較し、当該ピッチ周期が前記閾値以下であるかどうかを判定するピッチ周期判定手段と、前記ピッチ周期判定手段によって、前記ピッチ周期抽出手段によって抽出されたピッチ周期が前記閾値以下であると判定された場合、その後のシフト位置における前記時間期間 Ts に対してはピッチ周期を抽出せず、当該その後のシフト位置における前記時間期間 Ts に対し、前記記憶手段に記憶されたピッチ周期を割り当てるピッチ周期割り当て手段と、を備えることを特徴とする。
【００２７】
また本発明は、前記再生速度設定手段からの再生速度情報に応じて、前記所定値の実質的な値を変化させることを特徴とする。
【００２８】
また本発明は、ピッチ周期を抽出する時間期間 Ts を入力音声信号の時間軸方向に順次シフトさせ、それぞれの時間期間 Ts 毎に前記入力音声信号のピッチ周期を抽出するピッチ周期抽出装置と、再生速度情報を出力する再生速度設定手段と、前記再生速度設定手段からの再生速度情報及び、前記ピッチ周期抽出装置によって抽出されたピッチ周期に基づいて、入力音声信号の時間軸の伸長処理を行う時間軸伸長手段と、を備えた音声信号の時間軸伸長装置において、前記ピッチ周期抽出装置は、前記時間期間 Ts の音声信号からピッチ周期を抽出するピッチ周期抽出手段と、前記ピッチ周期抽出手段によって抽出されたピッチ周期を記憶する記憶手段と、前記ピッチ周期に対し閾値を設定する閾値設定手段と、前記ピッチ周期抽出手段によって抽出されたピッチ周期と前記閾値設定手段にて設定された閾値とを比較し、当該ピッチ周期が前記閾値以下であるかどうかを判定するピッチ周期判定手段と、前記ピッチ周期判定手段によって、前記ピッチ周期抽出手段によって抽出されたピッチ周期が前記閾値以下であると判定された場合、その後のシフト位置における前記時間期間 Ts に対してはピッチ周期を抽出せず、当該その後のシフト位置における前記時間期間 Ts に対し、前記記憶手段に記憶されたピッチ周期を割り当てるピッチ周期割り当て手段と、を備えることを特徴とする。
【００２９】
また本発明は、前記再生速度設定手段からの再生速度情報に応じて、前記所定値の実質的な値を変化させることを特徴とする。
【００３０】
また本発明は、入力された音声信号を符号化する符号化手段と、前記符号化手段で符号化された信号を格納するメモリ手段と、前記メモリ手段に格納された信号を復号する復号化手段と、ピッチ周期を抽出する時間期間 Ts を前記復号化手段より出力される信号の時間軸方向に順次シフトさせ、それぞれの時間期間 Ts 毎に前記復号化手段より出力される信号のピッチ周期を抽出するピッチ周期抽出装置と、再生速度情報を出力する再生速度設定手段と、前記再生速度設定手段からの再生速度情報及び、前記ピッチ周期抽出装置によって抽出されたピッチ周期に基づいて、前記復号化手段より出力される信号の時間軸の圧縮処理を行う時間軸圧縮手段と、前記再生速度設定手段からの再生速度情報及び、前記ピッチ周期抽出装置によって抽出されたピッチ周期に基づいて、前記復号化手段より出力される信号の時間軸の伸長処理を行う時間軸伸長手段と、前記復号化手段より出力される信号を選択的に前記時間軸圧縮手段または時間軸伸長手段に導く選択手段と、を備えた音声信号の時間軸圧縮伸長装置において、前記ピッチ周期抽出装置は、前記時間期間 Ts の信号からピッチ周期を抽出するピッチ周期抽出手段と、前記ピッチ周期抽出手段によって抽出されたピッチ周期を記憶する記憶手段と、前記ピッチ周期に対し閾値を設定する閾値設定手段と、前記ピッチ周期抽出手段によって抽出されたピッチ周期と前記閾値設定手段にて設定された閾値とを比較し、当該ピッチ周期が前記閾値以下であるかどうかを判定するピッチ周期判定手段と、前記ピッチ周期判定手段によって、前記ピッチ周期抽出手段によって抽出されたピッチ周期が前記閾値以下であると判定された場合、その後のシフト位置における前記時間期間 Ts に対してはピッチ周期を抽出せず、当該その後のシフト位置における前記時間期間 Ts に対し、前記記憶手段に記憶されたピッチ周期を割り当てるピッチ周期割り当て手段と、を備えることを特徴とする。
【００３１】
また本発明は、前記再生速度設定手段からの再生速度情報に応じて、前記所定値の実質的な値を変化させることを特徴とする。
【００３２】
【発明の実施の形態】
以下、本発明の実施の形態を図１乃至図１０に基づいて説明する。
【００３３】
先ず、図１は音声信号の早聞き（早送り再生）を行うための再生装置の概略を示すブロック図である。同図において、１は入力されたディジタル音声信号（以下「音声信号」という。）の声の高さを表わすピッチ周期を抽出するピッチ周期抽出手段である。
【００３４】
２は前記ピッチ周期抽出手段１で求められたピッチ周期と、予め定められた閾値（後述する）との比較・判定を行うピッチ周期判定手段である。
【００３５】
前記ピッチ周期判定手段２が前記ピッチ周期抽出手段１の抽出したピッチ周期と後述する閾値とを比較・判定した結果、そのピッチ周期が閾値より小さいか、若しくは等しい場合には、前記ピッチ周期抽出手段１はその後新たに音声信号のピッチ周期を抽出しないが、その詳細については、図３のフローチャートに基づいて後述する。
【００３６】
ここで、ピッチ周期の抽出方法は周知の自己相関を用いたピッチ周期抽出法を用いる。
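The autocorrelation-based extraction referred to here can be sketched as follows. This is a minimal illustration rather than the patent's implementation: the function names, the lag search range (`min_lag`, `max_lag`), the rectangular window, and the sine test signal are all assumptions introduced for the example.

```python
import math

def short_time_autocorr(x, start, ts, lag):
    """Autocorrelation at `lag` over a window of ts samples from `start`.

    Samples outside the window are treated as zero, so the sum simply
    stops where x[m + lag] would leave the window.
    """
    end = min(start + ts, len(x))
    return sum(x[m] * x[m + lag] for m in range(start, end - lag))

def extract_pitch_period(x, start, ts, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the largest autocorrelation."""
    best_lag, best_val = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        val = short_time_autocorr(x, start, ts, lag)
        if val > best_val:
            best_lag, best_val = lag, val
    return best_lag

# A sine with a 40-sample period stands in for voiced speech (assumption).
signal = [math.sin(2 * math.pi * n / 40) for n in range(400)]
pitch = extract_pitch_period(signal, start=0, ts=240, min_lag=20, max_lag=120)
```

Every lag in the search range costs one pass over the window, which is why each extraction is expensive enough to be worth skipping when it is not needed.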
【００３７】
３は前記ピッチ周期抽出手段１で求めたピッチ周期の情報を一時的に記憶しておくバッファメモリからなるバッファ手段、また４は図２に示すように、例えばピッチ周期２個分の音声波形を切り出した後、１ピッチ周期分の波形Ａ、波形Ｂのそれぞれに１から０に、及び０から１へ、直線的に変化する重み係数を乗じた波形（波形Ａ’及び波形Ｂ’）を生成した後に足し合わせることによって波形Ｃを得ることで、音声信号の時間軸圧縮を行う時間軸圧縮手段、５はピッチ周期抽出手段１側、又はバッファメモリ３側の何れかに切り替わる切替手段である。
【００３８】
尚、前記切替手段５は、前記ピッチ周期判定手段２の比較・判定結果に従って適宜ピッチ周期抽出手段１側、又はバッファメモリ３側の何れかに切り替えられる。また、６は閾値設定手段、７は再生速度設定手段であり、該再生速度設定手段は、使用者によって選択・設定された再生速度情報を出力する。
【００３９】
次に、本発明の音声信号のピッチ周期抽出方法を図３のフローチャートを用いて説明する。図３において、先ずステップＳ１では使用者が再生速度設定手段７を操作して再生速度を設定する。設定された再生速度情報はピッチ周期判定手段２へ送られる。尚、再生速度は、例えば０．５倍速〜２．０倍速の中の予め決められたパターンの中から選択するか、あるいは数値を直接入力するように構成しても良い。
【００４０】
ステップＳ２では、前記ステップＳ１で設定された再生速度に基づき、音声信号のピッチ周期に関する閾値ＳＨ、及び分岐変数ｕｐｄについての初期値を設定する（ただし、閾値ＳＨ＜分岐変数ｕｐｄ）。
【００４１】
【表１】

【００４２】
ここでは、一例として、使用者が前記ステップＳ１にて再生速度として２倍速を選択したものとし、閾値ＳＨ＝２４０（サンプル）、分岐変数ｕｐｄ＝３００（サンプル）として説明する（Ｓ２）。
【００４３】
尚、この「サンプル」とは、音声信号がディジタル信号である場合に、所望のサンプリング周波数に従ってサンプリングされた音声信号の数をいう。下記表１は再生速度と、閾値ＳＨ及び分岐変数ｕｐｄとの関係を表している。
【００４４】
尚、表１において、再生速度（再生速度設定手段からの再生速度情報）に応じて分岐変数ｕｐｄの計算方法を変えている（変数ｐｎｕｍに乗ずる係数を変えている）のは、歪を少なくするためである。この理由を以下に示す。
【００４５】
図１３（ａ）に示すように、例えば再生速度が２倍速の場合、１ピッチ周期目の波形と２ピッチ周期目の波形を１つの波形に圧縮して１つ目の出力波形とし、次に３ピッチ周期目の波形と４ピッチ周期目の波形を圧縮し２つ目の出力波形とする。即ち、１度のピッチ周期の抽出範囲に４ピッチ周期以上入っていれば、２つ目の出力波形を生成する際にピッチ周期の抽出は必要ない。このため、前記表１に示すように、変数ｐｎｕｍに乗ずる係数を４としている。
【００４６】
また、図１３（ｂ）に示すように、例えば再生速度が１．５倍速の場合は、１ピッチ周期目の波形と２ピッチ周期目の波形を１つの波形に圧縮して１つ目の出力波形とし、次に３ピッチ周期目をそのまま２つ目の出力波形とし、さらに４ピッチ周期目の波形と５ピッチ周期目の波形を圧縮し３つ目の出力波形とする。即ち、１度のピッチ周期の抽出範囲に５ピッチ周期以上入っていれば３つ目の出力波形を生成する際にピッチ周期の抽出は必要ない。このため、前記表１に示すように、変数ｐｎｕｍに乗ずる係数を５としている。
【００４７】
そして他の再生速度の場合についても同様にしてそれぞれ適切な係数を与えており、このように変数ｐｎｕｍに乗ずる係数を最適な値に設定することで歪が低減することが確認できた。
【００４８】
また、前記表１では、変数ｐｎｕｍに乗ずる係数を再生速度設定手段からの再生速度情報に応じて変更するようにしているが、これに代えて、再生速度に応じて閾値ＳＨを変化させるように構成しても実質的に同じことが実現できる。即ち、「再生速度に応じて変数ｐｎｕｍに乗ずる係数または閾値ＳＨを変える」ということは、請求項１１、請求項１３、請求項１５における「再生速度設定手段からの再生速度情報に応じて、所定値の実質的な値を変化させる」ということを意味している。
【００４９】
次にステップＳ３では、音声信号の早聞きを行うための再生装置に入力される音声信号の読み込みを行う。
【００５０】
ステップＳ４では、分岐変数ｕｐｄと閾値ＳＨとを比較し、分岐変数ｕｐｄの値が閾値ＳＨの値より大きければステップＳ８に進み、音声信号からピッチ周期を抽出して変数ｐｎｕｍに格納する。一方、分岐変数ｕｐｄの値が閾値ＳＨの値より小さいか若しくは等しければステップＳ５に進む。
【００５１】
このステップＳ４において、前述のようにステップＳ２の初期値の設定で閾値ＳＨの値を２４０サンプルに、また分岐変数ｕｐｄの値を閾値ＳＨより大きい３００サンプルに設定したため、初期値の設定直後は必ずステップＳ８に進んで、音声信号のピッチ周期を抽出することになる。このため、前記図１における切換え手段５はピッチ周期抽出手段１側に切換えられている。
【００５２】
尚、音声信号のピッチ周期の抽出方法は周知の自己相関を用いたピッチ周期抽出法を用いる。
【００５３】
ステップＳ９では、前記ステップＳ８で抽出したピッチ周期ｐｎｕｍの値を分岐変数ｕｐｄに設定する。この場合、前記表１にしたがって、ｕｐｄ←４×ｐｎｕｍとなる（矢印は代入を示す、以下同様）。
【００５４】
次に、ステップＳ６においては、音声信号の時間軸圧縮を行う。この時間軸圧縮方法は前記図２に示すように、例えばピッチ周期２個分の音声波形を切り出した後、１ピッチ周期分のそれぞれの波形にそれぞれ異なる重み係数、例えば１から０に、及び０から１に、直線的に変化する重み係数を乗じた後に足し合わせることによって時間軸圧縮処理、即ち早聞きを行うことが可能である。
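The time axis compression of step S6 (FIG. 2) can be sketched as a linear cross-fade of two consecutive pitch periods into one. The sketch below is an illustration under assumptions: the function name is hypothetical, and whether the ramp reaches exactly 0 and 1 at the end samples is a choice made here, not something the text specifies.

```python
def compress_two_periods(segment, pitch):
    """Cross-fade two consecutive pitch periods into one (waveform C)."""
    a = segment[:pitch]            # waveform A: first pitch period
    b = segment[pitch:2 * pitch]   # waveform B: second pitch period
    out = []
    for i in range(pitch):
        w = i / (pitch - 1) if pitch > 1 else 1.0   # ramps from 0 to 1
        out.append((1.0 - w) * a[i] + w * b[i])     # A fades out, B fades in
    return out

# Two flat "periods" of 4 samples each make the ramp easy to see.
c = compress_two_periods([1.0, 1.0, 1.0, 1.0, 3.0, 3.0, 3.0, 3.0], pitch=4)
```

The output starts on waveform A and ends on waveform B, so it joins the preceding and following audio without a jump while taking half the time.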
【００５５】
ステップＳ７では、継続して処理を行うか否かを判定し、継続して処理を行うのであれば、ステップＳ３に戻り、一方処理を終了するのであれば終了する。
【００５６】
尚、処理を終了する条件としては、例えば使用者が音声の再生を停止すべく停止ボタン（図示せず）を操作した場合などである。
【００５７】
一方、前記ステップＳ４において、分岐変数ｕｐｄの値が閾値ＳＨの値より小さいか若しくは等しくなった場合は、前記図１における切換え手段５をバッファ手段３側に切換え、ステップＳ５において、前記表１にしたがってｕｐｄ←ｕｐｄ＋２×ｐｎｕｍに設定し、次にステップＳ６にて時間軸圧縮を行ない、ステップＳ３へ戻る。
【００５８】
従って、このＳ５で設定された分岐変数ｕｐｄの値が、閾値ＳＨの値より小さいか、若しくは等しい限りは、ステップＳ８における音声信号のピッチ周期ｐｎｕｍの抽出処理及びステップＳ９による分岐変数ｕｐｄの設定を行うことはない（その間は図１の切換え手段５もバッファ手段３側に切換えられたままである）。即ち、ピッチ周期が短い場合には、ステップＳ８における音声信号のピッチ周期ｐｎｕｍの抽出処理及びステップＳ９による分岐変数ｕｐｄの設定を行う必要がないということである。
【００５９】
以上の処理を繰り返すことによって、図４（ａ）に示すように、ピッチ周期が短い場合、ピッチ周期を抽出した音声波形に連続するピッチ周期２個分の波形についてはピッチ周期の抽出を行う必要がなくなるため、ピッチ周期抽出手段１の処理負担は軽減されることになる。
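The loop of FIG. 3 (steps S2 to S9) for 2x playback can be sketched as below, as read from the description: `upd` is compared with the threshold `SH`, a new pitch period is extracted only when `upd > SH`, and otherwise the buffered `pnum` is reused. The function names, the `extract` callback, the stop condition, and the all-zero test input are assumptions for illustration. The coefficients 4 and 2 and the values SH = 240 and upd = 300 are the 2x-speed example values given in the text.

```python
def crossfade(segment, pitch):
    """Blend two pitch periods into one with linear weights (as in FIG. 2)."""
    return [((pitch - i) * segment[i] + i * segment[pitch + i]) / pitch
            for i in range(pitch)]

def fast_playback(x, extract, sh=240, upd_init=300, first_coef=4, step_coef=2):
    """Return (compressed samples, number of pitch extractions).

    `extract(x, pos)` is a hypothetical callback that must return a
    positive pitch period in samples.
    """
    upd = upd_init                      # S2: upd_init > sh forces S8 first
    pos, pnum, calls, out = 0, 0, 0, []
    while pos < len(x):                 # S7: continue while input remains
        if upd > sh:                    # S4
            pnum = extract(x, pos)      # S8: extract a new pitch period
            calls += 1
            upd = first_coef * pnum     # S9: upd <- 4 * pnum
        else:
            upd += step_coef * pnum     # S5: upd <- upd + 2 * pnum
        if pos + 2 * pnum > len(x):
            break
        out.extend(crossfade(x[pos:pos + 2 * pnum], pnum))   # S6
        pos += 2 * pnum
    return out, calls

samples = [0.0] * 1200
out_skip, calls_skip = fast_playback(samples, lambda x, pos: 30)
out_all, calls_all = fast_playback(samples, lambda x, pos: 30, sh=0)
```

With a 30-sample pitch period, the threshold of 240 lets four compression steps share one extraction. Setting `sh=0` disables the skipping and extracts on every step, which is the behaviour the invention aims to avoid.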
【００６０】
一方、図４（ｂ）に示すように、ピッチ周期が長い場合には、単位時間当たりに抽出するピッチ周期の抽出回数は少ないため、ピッチ周期抽出手段１の処理負担は以前と変わることはない。
【００６１】
前述の実施の形態では、入力された音声信号を時間軸圧縮処理した例を述べたが、本発明はこれには限られず、入力された音声信号を時間軸伸長する場合には図１の「時間軸圧縮手段４」の代わりに、図５に示すように「時間軸伸長手段８」を備えることにより、音声信号の遅聞き（ゆっくり再生）を行うための再生装置とすることができる。
【００６２】
この時間軸伸長手段８は、図７に示すように、例えばピッチ周期３個分の音声波形を切り出した後、ピッチ周期２個分の波形Ａに例えば０から１に直線的に変化する重み係数を乗じて波形Ａ’を生成し、またピッチ周期２個分の波形Ｂに例えば１から０に直線的に変化する重み係数を乗じて波形Ｂ’を生成し、それぞれを足し合わせることによって、ピッチ周期１個分の波形Ｄ及び波形Ｅを得ることで、時間軸伸長処理、即ち遅聞きを行うことが可能である。
【００６３】
この場合の動作について、図６のフローチャートに基づいて説明する。ここでは、一例として、使用者が前記ステップＳ１１にて再生速度として０．５倍速を選択したものとし、閾値ＳＨ＝２４０（サンプル）、分岐変数ｕｐｄ＝３００（サンプル）として説明する（Ｓ１２）。
【００６４】
次にステップＳ１３では、音声信号の遅聞きを行うための再生装置に入力される音声信号の読み込みを行う。
【００６５】
ステップＳ１４では、分岐変数ｕｐｄと閾値ＳＨとを比較し、分岐変数ｕｐｄの値が閾値ＳＨの値より大きければステップＳ１８に進み、音声信号からピッチ周期を抽出して変数ｐｎｕｍに格納する。一方、分岐変数ｕｐｄの値が閾値ＳＨの値より小さいか若しくは等しければステップＳ１５に進む。
【００６６】
前記ステップＳ１５及び後述するステップＳ１９における分岐変数ｕｐｄの計算式と再生速度との関係は下記の表２のように設定されている。
【００６７】
【表２】

【００６８】
このステップＳ１４において、前述のようにステップＳ１２の初期値の設定で閾値ＳＨの値を２４０サンプルに、また分岐変数ｕｐｄの値を閾値ＳＨより大きい３００サンプルに設定したため、初期値の設定直後は必ずステップＳ１８に進んで、音声信号のピッチ周期を抽出することになる。このため、前記図５における切換え手段５はピッチ周期抽出手段１側に切換えられている。
【００６９】
尚、音声信号のピッチ周期の抽出方法は周知の自己相関を用いたピッチ周期抽出法を用いる。
【００７０】
ステップＳ１９では、前記ステップＳ１８で抽出したピッチ周期ｐｎｕｍの値を分岐変数ｕｐｄに設定する。この場合、前記表２にしたがって、ｕｐｄ←３×ｐｎｕｍとなる。
【００７１】
次に、ステップＳ１６においては、音声信号の時間軸伸長を行う。この時間軸伸長方法は前記図７に示すように、例えばピッチ周期２個分の音声波形を切り出した後、１ピッチ周期分のそれぞれの波形にそれぞれ異なる重み係数、例えば１から０に、及び０から１に変化する重み係数を乗じた後に足し合わせることによって時間軸伸長処理、即ち遅聞きを行うことが可能である。
【００７２】
次に、ステップＳ１７では、継続して処理を行うか否かを判定し、継続して処理を行うのであれば、ステップＳ１３に戻り、一方処理を終了するのであれば終了する。
【００７３】
尚、処理を終了する条件としては、例えば使用者が音声の再生を停止すべく停止ボタン（図示せず）を操作した場合などである。
【００７４】
一方、前記ステップＳ１４において、分岐変数ｕｐｄの値が閾値ＳＨの値より小さいか若しくは等しくなった場合は、前記図５における切換え手段５をバッファ手段３側に切換え、ステップＳ１５において、前記表２にしたがってｕｐｄ←ｕｐｄ＋ｐｎｕｍに設定し、次にステップＳ１６にて時間軸伸長を行ない、ステップＳ１３へ戻る。
【００７５】
従って、このＳ１５で設定された分岐変数ｕｐｄの値が、閾値ＳＨの値より小さいか、若しくは等しい限りは、ステップＳ１８における音声信号のピッチ周期ｐｎｕｍの抽出処理及びステップＳ１９による分岐変数ｕｐｄの設定を行うことはない（その間は図５の切換え手段５もバッファ手段３側に切換えられたままである）。
【００７６】
即ち、ピッチ周期が短い場合には、ステップＳ１８における音声信号のピッチ周期ｐｎｕｍの抽出処理及びステップＳ１９による分岐変数ｕｐｄの設定を行う必要がなくなるため、ピッチ周期抽出手段１の処理負担は軽減されることになる。
【００７７】
また、前述の実施の形態では、音声信号の早聞きを行うための再生装置、音声信号の遅聞きを行うための再生装置それぞれ単独の例について述べたが、図８に示すように、これらの機能を併せ持った装置とすることもできる。
【００７８】
図８において、１２は入力された音声信号を既存のＡＤＰＣＭ処理によって符号化するＡＤＰＣＭ符号化手段、９は前記ＡＤＰＣＭ符号化手段１２で符号化された信号を格納するメモリ、１３は前記メモリ９からの信号を復号するＡＤＰＣＭ復号化手段、１４は選択手段、４は前記選択手段１４を介して前記ＡＤＰＣＭ復号化手段１３からの信号が導かれる時間軸圧縮手段、８は前記選択手段１４を介して前記ＡＤＰＣＭ復号化手段１３からの信号が導かれる時間軸伸長手段である。
【００７９】
該構成により、選択手段１４を時間軸圧縮手段４側に切り換えることで早聞き出力信号を得られ、一方、選択手段１４を時間軸伸長手段８側に切り替えることで遅聞き出力信号を得られる。
【００８０】
また、図９に示すように時間軸圧縮した音声信号を一旦メモリ９に格納し、そのメモリから読み出した音声信号を時間軸伸長することにより音声信号の圧縮伸長を実現する装置にも適用することができる。
【００８１】
なお、この実施例では時間軸圧縮を行う部分に本発明のピッチ周期抽出装置を適用している。また、該装置は、音声信号の早聞きまたは遅聞きを行うためのものではなく、音声信号に時間軸圧縮処理を施してメモリに格納することで、少ないメモリに多くの信号を記録するための装置である。以下、図１０乃至図１２に示す装置においても同様である。
【００８２】
図１０は前記図９に示す装置において、時間軸伸長処理を行う部分にも本発明のピッチ周期抽出装置を適用した例を示している。この例では、時間軸伸長処理を行う側にも、第２ピッチ周期抽出手段２１、第２ピッチ周期判定手段２２、第２バッファ２３、第２切り換え手段２４、第２閾値設定手段２５を設けているので、メモリ９に格納された信号からピッチ周期を検出することができるため、メモリ９にピッチ周期まで格納する必要がなく、メモリ９の容量を節約することができる。
【００８３】
図１１は前記図９に示す装置に、さらに帯域分割符号化手段１０及び帯域分割復号化手段１１を備えた例を示している。この例においては、帯域分割符号化手段１０及び帯域分割復号化手段１１により、時間軸方向のみならず周波数帯域方向にも信号の圧縮・伸長処理を行うことができる。
【００８４】
図１２は、前記図１１に示した回路に、さらに時間軸伸長処理を行う側にも、第２ピッチ周期抽出手段２１、第２ピッチ周期判定手段２２、第２バッファ２３、第２切り換え手段２４、第２閾値設定手段２５を設けたものである。この例においては、帯域分割符号化手段１０及び帯域分割復号化手段１１により、時間軸方向のみならず周波数帯域方向にも信号の圧縮・伸長処理を行うことができ、さらに第２ピッチ周期抽出手段２１、第２ピッチ周期判定手段２２、第２バッファ２３、第２切り換え手段２４、第２閾値設定手段２５によって、帯域分割復号化手段１１から出力される信号よりピッチ周期を抽出するので、メモリ９にピッチ周期まで格納する必要がなく、メモリ９の容量を節約することができる。
【００８５】
【発明の効果】
以上、詳述した如く本発明に依れば、ピッチ周期が短い音声波形についてピッチ周期を抽出する場合に、最初の音声波形についてのみピッチ周期を抽出すればよく、この音声波形に続く音声波形のピッチ周期を求める必要がなくなる結果、ピッチ周期抽出手段の処理負担が軽減される効果を奏する。
【００８６】
また、再生速度設定手段からの再生速度情報に応じて、所定値の実質的な値を変化させることで、再生時の音の歪を少なくすることが出来る。
【図面の簡単な説明】
【図１】本発明のピッチ周期抽出装置の構成を示すブロック図である。
【図２】時間軸圧縮処理を説明するためのフローチャートである。
【図３】本発明のピッチ周期抽出装置の動作を示すフローチャートである。
【図４】時間軸圧縮処理を説明するためのフローチャートである。
【図５】本発明の時間軸伸長装置の構成を示すブロック図である。
【図６】本発明の時間軸伸長装置の動作を示すフローチャートである。
【図７】時間軸伸長処理を説明するための図である。
【図８】本発明の時間軸圧縮伸長装置の構成を示すブロック図である。
【図９】本発明の音声信号記録再生装置の構成を示すブロック図である。
【図１０】本発明の他の音声信号記録再生装置の構成を示すブロック図である。
【図１１】本発明の他の音声信号記録再生装置の構成を示すブロック図である。
【図１２】本発明の音声信号記録再生装置の構成を示すブロック図である。
【図１３】歪を低減できる効果を説明するための図である。
【図１４】従来の時間軸圧縮処理を示す図である。
【符号の説明】
１ ピッチ周期抽出手段
２ ピッチ周期判定手段
３ バッファ手段
４ 時間軸圧縮手段
５ 切換え手段
６ 閾値設定手段
８ 時間軸伸長手段
９ メモリ手段
１０ 帯域分割符号化手段
１１ 帯域分割復号化手段
１２ ＡＤＰＣＭ符号化手段
１３ ＡＤＰＣＭ復号化手段
１４ 選択手段
２１ 第２ピッチ周期抽出手段
２２ 第２ピッチ周期判定手段
２３ 第２バッファ
２４ 第２切換え手段
２５ 第２閾値設定手段
[0001]
[Technical Field of the Invention]
The present invention relates to an audio signal pitch period extraction method, an audio signal pitch period extraction apparatus, an audio signal time axis compression apparatus, an audio signal time axis expansion apparatus, and an audio signal time axis compression/expansion apparatus.
[0002]
[Prior art]
When audio is recorded in a semiconductor memory or the like, or transmitted over a digital transmission system or the like, attention has focused not only on the PCM method, which encodes the audio level directly, but also on speech coding methods in which the recording side analyzes the audio and records it as parameters representing its characteristics, and the playback side synthesizes the audio from those parameters.
[0003]
One of the parameters representing such characteristics of speech is the pitch period, which generally represents how high or low the voice is.
[0004]
One of the pitch period extraction methods uses autocorrelation.
[0005]
One pitch period extraction method based on autocorrelation uses the short-time autocorrelation: the signal is assumed to be time-limited, so that it exists only within an interval of time length Ts and is always zero outside that interval, and the autocorrelation is computed under that assumption. As described in "Digital Signal Processing of Voice" (Vol. 1) published by Corona, by L. R. Rabiner and R. W. Schafer, translated by Kuki Suzuki, p152-p152, if the audio waveform is represented by digital audio data x(n), the short-time autocorrelation value Rn(k) obtained by this method is as follows.
[0006]
[Expression 1]

[0007]
Here, Ts is the time interval in which the audio signal is assumed to exist, and k is the delay applied to the audio waveform when calculating the short-time autocorrelation value Rn(k); the two satisfy Ts >> k.
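Expression 1 did not survive extraction. As a hedged reconstruction, assuming the rectangular-window form of the short-time autocorrelation that matches the stated assumptions (the signal is treated as zero outside an interval of length Ts beginning at sample n, and k is the delay), it would read roughly as follows. The summation limits and the indexing are assumptions and are not taken from the patent text.

```latex
R_n(k) = \sum_{m=0}^{T_s-1-k} x(n+m)\, x(n+m+k), \qquad 0 \le k < T_s
```

Under this form, the pitch period is usually taken to be the delay k > 0 at which R_n(k) is largest.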
[0008]
[Problems to be solved by the invention]
However, compressing or expanding an audio signal requires the pitch period of the audio waveform. For a waveform with a short pitch period, the pitch period must be extracted more times per unit time than for a waveform with a long pitch period. The extraction therefore takes more time, which places a burden on the processing means (processor).
[0009]
More specifically, the time period Ts over which the pitch period is extracted is set to twice the assumed maximum pitch (that is, the longest pitch period). Compression and expansion are performed while taking out two pitch periods at a time (compression is shown in FIG. 2 and expansion in FIG. 7; details are given later). For a waveform with a long pitch period, as shown in FIG. 14(b), the time periods Ts for extracting the pitch period therefore do not overlap even when the waveform is taken out two pitch periods at a time.
[0010]
However, for a waveform with a short pitch period, as shown in FIG. 14(a), the time periods Ts for extracting the pitch period overlap when the waveform is taken out two pitch periods at a time (see pitch period extractions 1, 2, and 3). As described above, this problem arises because the time period Ts for extracting the pitch period has a fixed width of two to three times the time period Ts used to extract the pitch period of a waveform having a long pitch period.
[0011]
Consequently, for a waveform with a short pitch period, the pitch period is extracted more times per unit time than for a waveform with a long pitch period, which places a heavy burden on the processing means (processor) that performs the extraction.
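The load difference can be made concrete with a little arithmetic. The sketch below assumes one extraction per processing step of two pitch periods and uses illustrative numbers (a 240-sample window Ts, taken as twice an assumed longest period of 120 samples); the function names and values are hypothetical.

```python
def count_extractions(total_samples, pitch_period, periods_per_step=2):
    """Number of pitch extractions when one is run per processing step."""
    step = periods_per_step * pitch_period   # samples consumed per step
    return total_samples // step

def windows_overlap(pitch_period, ts, periods_per_step=2):
    """True when consecutive Ts windows overlap (step shorter than Ts)."""
    return periods_per_step * pitch_period < ts

TS = 240  # assumed analysis window, twice an assumed longest period of 120
short_calls = count_extractions(2400, pitch_period=30)
long_calls = count_extractions(2400, pitch_period=120)
short_overlaps = windows_overlap(30, TS)
long_overlaps = windows_overlap(120, TS)
```

For the same 2400 samples, the short-pitch waveform triggers four times as many extractions as the long-pitch one, and only its windows overlap.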
[0012]
As noted above, for a waveform with a short pitch period, the time periods Ts for extracting the pitch period overlap, as shown in FIG. 14(a).
[0013]
It was therefore found that, in the case of FIG. 14(a) for example, calculating the autocorrelation value Rn(k) at pitch period extractions 1 and 3 but not at pitch period extraction 2 has little effect. In other words, for a waveform with a short pitch period, there is little effect even if the autocorrelation value Rn(k) is not calculated every time.
[0014]
In addition, the human voice often consists of a waveform that repeats with the same pitch period. A voice made up of waveforms with a short pitch period (a high voice, such as a woman's) contains more waveforms of the same pitch period within a given interval than a voice made up of waveforms with a long pitch period (a low voice, such as a man's).
[0015]
For a voice made up of waveforms with a short pitch period, the time periods Ts for extracting the pitch period overlap as in FIG. 14(a). Because, as described above, the human voice often consists of a waveform that repeats with the same pitch period, it was found from this viewpoint as well that there is little effect even if the autocorrelation value Rn(k) is not calculated every time.
[0016]
The present invention was made on the basis of these observations, and its object is to provide an audio signal pitch period extraction method, an audio signal pitch period extraction apparatus, and the like that extract the pitch period from an input audio signal in a short processing time.
[0017]
[Means for Solving the Problems]
To solve the above problems, the present invention provides a pitch period extraction method in which a time period Ts for extracting a pitch period is shifted sequentially along the time axis of an input audio signal and the pitch period of the input audio signal is extracted for each time period Ts, characterized in that, when the pitch period extracted with the time period Ts at a given shift position is equal to or less than a preset threshold, no pitch period is extracted for the time period Ts at the subsequent shift position, and the pitch period at or below the threshold that was extracted with the time period Ts at the given shift position is assigned to the time period Ts at that subsequent shift position.
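The method can be sketched as a loop over the shift positions of Ts. This is a simplified illustration under assumptions, not the claimed implementation: `extract` is a hypothetical callback that returns the pitch period for a given shift position, exactly one subsequent position is skipped after a short pitch period, and the pitch values are made up.

```python
def assign_pitch_periods(num_windows, extract, threshold):
    """Walk the shift positions of Ts; skip extraction after a short pitch.

    `extract(i)` is a hypothetical callback returning the pitch period
    (in samples) for the window at shift position i.
    Returns (pitch period per window, number of extract() calls).
    """
    periods, calls = [], 0
    i = 0
    while i < num_windows:
        pitch = extract(i)
        calls += 1
        periods.append(pitch)
        # Short pitch: reuse it for the next shift position without extracting.
        if pitch <= threshold and i + 1 < num_windows:
            periods.append(pitch)
            i += 2
        else:
            i += 1
    return periods, calls

true_pitch = [50, 52, 51, 130, 128, 49]   # illustrative values only
periods, calls = assign_pitch_periods(6, lambda i: true_pitch[i], threshold=60)
```

Six windows are covered with four extractions here. The reused values differ slightly from the true ones (50 instead of 52), which is the small error the description argues has little effect.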
[0018]
The present invention also provides a pitch period extraction device in which a time period Ts for extracting a pitch period is shifted sequentially along the time axis of an input audio signal and the pitch period of the input audio signal is extracted for each time period Ts, the device comprising: pitch period extracting means for extracting a pitch period from the audio signal in the time period Ts; storage means for storing the pitch period extracted by the pitch period extracting means; threshold setting means for setting a threshold for the pitch period; pitch period determining means for comparing the pitch period extracted by the pitch period extracting means with the threshold set by the threshold setting means and determining whether the pitch period is equal to or less than the threshold; and pitch period assigning means which, when the pitch period determining means determines that the pitch period extracted by the pitch period extracting means is equal to or less than the threshold, extracts no pitch period for the time period Ts at the subsequent shift position and assigns the pitch period stored in the storage means to the time period Ts at that subsequent shift position.
[0019]
The present invention also provides an audio signal time axis compression apparatus comprising: a pitch period extraction device in which a time period Ts for extracting a pitch period is shifted sequentially along the time axis of an input audio signal and the pitch period of the input audio signal is extracted for each time period Ts; and time axis compression means for compressing the time axis of the input audio signal on the basis of the pitch period extracted by the pitch period extraction device, wherein the pitch period extraction device comprises: pitch period extracting means for extracting a pitch period from the audio signal in the time period Ts; storage means for storing the pitch period extracted by the pitch period extracting means; threshold setting means for setting a threshold for the pitch period; pitch period determining means for comparing the pitch period extracted by the pitch period extracting means with the threshold set by the threshold setting means and determining whether the pitch period is equal to or less than the threshold; and pitch period assigning means which, when the pitch period determining means determines that the pitch period extracted by the pitch period extracting means is equal to or less than the threshold, extracts no pitch period for the time period Ts at the subsequent shift position and assigns the pitch period stored in the storage means to the time period Ts at that subsequent shift position.
[0020]
The present invention also provides an audio signal time axis expansion apparatus comprising: a pitch period extraction device in which a time period Ts for extracting a pitch period is shifted sequentially along the time axis of an input audio signal and the pitch period of the input audio signal is extracted for each time period Ts; and time axis expansion means for expanding the time axis of the input audio signal on the basis of the pitch period extracted by the pitch period extraction device, wherein the pitch period extraction device comprises: pitch period extracting means for extracting a pitch period from the audio signal in the time period Ts; storage means for storing the pitch period extracted by the pitch period extracting means; threshold setting means for setting a threshold for the pitch period; pitch period determining means for comparing the pitch period extracted by the pitch period extracting means with the threshold set by the threshold setting means and determining whether the pitch period is equal to or less than the threshold; and pitch period assigning means which, when the pitch period determining means determines that the pitch period extracted by the pitch period extracting means is equal to or less than the threshold, extracts no pitch period for the time period Ts at the subsequent shift position and assigns the pitch period stored in the storage means to the time period Ts at that subsequent shift position.
[0021]
The present invention also provides an audio signal time axis compression/expansion apparatus comprising: encoding means for encoding an input audio signal; memory means for storing the signal encoded by the encoding means; decoding means for decoding the signal stored in the memory means; a pitch period extraction device in which a time period Ts for extracting a pitch period is shifted sequentially along the time axis of the signal output from the decoding means and the pitch period of that signal is extracted for each time period Ts; time axis compression means for compressing the time axis of the signal output from the decoding means on the basis of the pitch period extracted by the pitch period extraction device; time axis expansion means for expanding the time axis of the signal output from the decoding means on the basis of the pitch period extracted by the pitch period extraction device; and selection means for selectively routing the signal output from the decoding means to the time axis compression means or the time axis expansion means, wherein the pitch period extraction device comprises: pitch period extracting means for extracting a pitch period from the signal in the time period Ts; storage means for storing the pitch period extracted by the pitch period extracting means; threshold setting means for setting a threshold for the pitch period; pitch period determining means for comparing the pitch period extracted by the pitch period extracting means with the threshold set by the threshold setting means and determining whether the pitch period is equal to or less than the threshold; and pitch period assigning means which, when the pitch period determining means determines that the pitch period extracted by the pitch period extracting means is equal to or less than the threshold, extracts no pitch period for the time period Ts at the subsequent shift position and assigns the pitch period stored in the storage means to the time period Ts at that subsequent shift position.
[0022]
The present invention also provides an audio signal time axis compression/expansion apparatus comprising: a pitch period extracting device that sequentially shifts a time period Ts for extracting a pitch period in the time axis direction of an input audio signal and extracts the pitch period of the input audio signal for each time period Ts; time axis compression means for performing time axis compression processing on the input audio signal based on the pitch period extracted by the pitch period extracting device; memory means for storing the signal time-axis-compressed by the time axis compression means; and time axis extension means for extending the time axis of the signal stored in the memory means based on the pitch period extracted by the pitch period extracting device, characterized in that the pitch period extracting device comprises: pitch period extracting means for extracting a pitch period from the audio signal in the time period Ts; storage means for storing the pitch period extracted by the pitch period extracting means; threshold setting means for setting a threshold for the pitch period; pitch period determining means for comparing the pitch period extracted by the pitch period extracting means with the threshold set by the threshold setting means and determining whether the pitch period is equal to or less than the threshold; and pitch period assigning means for, when the pitch period determining means determines that the pitch period extracted by the pitch period extracting means is equal to or less than the threshold, assigning the pitch period stored in the storage means to the time period Ts at the subsequent shift position without extracting a pitch period for that time period.
[0023]
The present invention also provides an audio signal time axis compression/expansion apparatus comprising: a first pitch period extracting device that sequentially shifts a time period Ts1 for extracting a pitch period in the time axis direction of an input audio signal and extracts the pitch period of the input audio signal for each time period Ts1; time axis compression means for performing time axis compression processing on the input audio signal based on the pitch period extracted by the first pitch period extracting device; memory means for storing the signal compressed by the time axis compression means; a second pitch period extracting device that sequentially shifts a time period Ts2 for extracting a pitch period in the time axis direction of the signal stored in the memory means and extracts the pitch period of the signal stored in the memory means for each time period Ts2; and time axis extension means for extending the time axis of the signal stored in the memory means based on the pitch period extracted by the second pitch period extracting device, characterized in that the first pitch period extracting device comprises: first pitch period extracting means for extracting a pitch period from the audio signal in the time period Ts1; first storage means for storing the pitch period extracted by the first pitch period extracting means; first threshold setting means for setting a threshold for the pitch period; first pitch period determining means for comparing the pitch period extracted by the first pitch period extracting means with the threshold set by the first threshold setting means and determining whether the pitch period is equal to or less than the threshold; and first pitch period assigning means for, when the first pitch period determining means determines that the pitch period extracted by the first pitch period extracting means is equal to or less than the threshold, assigning the pitch period stored in the first storage means to the time period Ts1 at the subsequent shift position without extracting a pitch period for that time period, and the second pitch period extracting device comprises: second pitch period extracting means for extracting a pitch period from the signal in the time period Ts2; second storage means for storing the pitch period extracted by the second pitch period extracting means; second threshold setting means for setting a threshold for the pitch period; second pitch period determining means for comparing the pitch period extracted by the second pitch period extracting means with the threshold set by the second threshold setting means and determining whether the pitch period is equal to or less than the threshold; and second pitch period assigning means for, when the second pitch period determining means determines that the pitch period extracted by the second pitch period extracting means is equal to or less than the threshold, assigning the pitch period stored in the second storage means to the time period Ts2 at the subsequent shift position without extracting a pitch period for that time period.
[0024]
The present invention also provides an audio signal time axis compression/expansion apparatus comprising: a pitch period extracting device that sequentially shifts a time period Ts for extracting a pitch period in the time axis direction of an input audio signal and extracts the pitch period of the input audio signal for each time period Ts; time axis compression means for performing time axis compression processing on the input audio signal based on the pitch period extracted by the pitch period extracting device; band division encoding means for dividing the signal time-axis-compressed by the time axis compression means into a plurality of bands and encoding it; memory means for storing the signal encoded by the band division encoding means; band division decoding means for decoding the signal stored in the memory means; and time axis extension means for extending the time axis of the signal output from the band division decoding means based on the pitch period extracted by the pitch period extracting device, characterized in that the pitch period extracting device comprises: pitch period extracting means for extracting a pitch period from the audio signal in the time period Ts; storage means for storing the pitch period extracted by the pitch period extracting means; threshold setting means for setting a threshold for the pitch period; pitch period determining means for comparing the pitch period extracted by the pitch period extracting means with the threshold set by the threshold setting means and determining whether the pitch period is equal to or less than the threshold; and pitch period assigning means for, when the pitch period determining means determines that the pitch period extracted by the pitch period extracting means is equal to or less than the threshold, assigning the pitch period stored in the storage means to the time period Ts at the subsequent shift position without extracting a pitch period for that time period.
[0025]
The present invention also provides an audio signal time axis compression/expansion apparatus comprising: a first pitch period extracting device that sequentially shifts a time period Ts1 for extracting a pitch period in the time axis direction of an input audio signal and extracts the pitch period of the input audio signal for each time period Ts1; time axis compression means for compressing the time axis of the input audio signal based on the pitch period extracted by the first pitch period extracting device; band division encoding means for dividing the signal compressed by the time axis compression means into a plurality of bands and encoding it; memory means for storing the signal encoded by the band division encoding means; band division decoding means for decoding the signal stored in the memory means; a second pitch period extracting device that sequentially shifts a time period Ts2 for extracting a pitch period in the time axis direction of the signal output from the band division decoding means and extracts the pitch period of that signal for each time period Ts2; and time axis extension means for performing time axis extension processing on the signal output from the band division decoding means based on the pitch period extracted by the second pitch period extracting device, characterized in that the first pitch period extracting device comprises: first pitch period extracting means for extracting a pitch period from the audio signal in the time period Ts1; first storage means for storing the pitch period extracted by the first pitch period extracting means; first threshold setting means for setting a threshold for the pitch period; first pitch period determining means for comparing the pitch period extracted by the first pitch period extracting means with the threshold set by the first threshold setting means and determining whether the pitch period is equal to or less than the threshold; and first pitch period assigning means for, when the first pitch period determining means determines that the pitch period extracted by the first pitch period extracting means is equal to or less than the threshold, assigning the pitch period stored in the first storage means to the time period Ts1 at the subsequent shift position without extracting a pitch period for that time period, and the second pitch period extracting device is configured in the same manner for the time period Ts2, including second pitch period assigning means for assigning the pitch period stored in second storage means to the time period Ts2 at the subsequent shift position.
[0026]
The present invention also provides an audio signal time axis compression apparatus comprising: a pitch period extracting device that sequentially shifts a time period Ts for extracting a pitch period in the time axis direction of an input audio signal and extracts the pitch period of the input audio signal for each time period Ts; playback speed setting means for outputting playback speed information; and time axis compression means for performing time axis compression processing on the input audio signal based on the playback speed information from the playback speed setting means and the pitch period extracted by the pitch period extracting device, characterized in that the pitch period extracting device comprises: pitch period extracting means for extracting a pitch period from the audio signal in the time period Ts; storage means for storing the pitch period extracted by the pitch period extracting means; threshold setting means for setting a threshold for the pitch period; pitch period determining means for comparing the pitch period extracted by the pitch period extracting means with the threshold set by the threshold setting means and determining whether the pitch period is equal to or less than the threshold; and pitch period assigning means for, when the pitch period determining means determines that the pitch period extracted by the pitch period extracting means is equal to or less than the threshold, assigning the pitch period stored in the storage means to the time period Ts at the subsequent shift position without extracting a pitch period for that time period.
[0027]
Further, the present invention is characterized in that a substantial value of the predetermined value is changed according to reproduction speed information from the reproduction speed setting means.
[0028]
The present invention also provides an audio signal time axis extending device comprising: a pitch period extracting device that sequentially shifts a time period Ts for extracting a pitch period in the time axis direction of an input audio signal and extracts the pitch period of the input audio signal for each time period Ts; playback speed setting means for outputting playback speed information; and time axis extension means for performing time axis extension processing on the input audio signal based on the playback speed information from the playback speed setting means and the pitch period extracted by the pitch period extracting device, characterized in that the pitch period extracting device comprises: pitch period extracting means for extracting a pitch period from the audio signal in the time period Ts; storage means for storing the pitch period extracted by the pitch period extracting means; threshold setting means for setting a threshold for the pitch period; pitch period determining means for comparing the pitch period extracted by the pitch period extracting means with the threshold set by the threshold setting means and determining whether the pitch period is equal to or less than the threshold; and pitch period assigning means for, when the pitch period determining means determines that the pitch period extracted by the pitch period extracting means is equal to or less than the threshold, assigning the pitch period stored in the storage means to the time period Ts at the subsequent shift position without extracting a pitch period for that time period.
[0029]
Further, the present invention is characterized in that a substantial value of the predetermined value is changed according to reproduction speed information from the reproduction speed setting means.
[0030]
The present invention also provides an audio signal time axis compression/expansion apparatus comprising: encoding means for encoding an input audio signal; memory means for storing the signal encoded by the encoding means; decoding means for decoding the signal stored in the memory means; a pitch period extracting device that sequentially shifts a time period Ts for extracting a pitch period in the time axis direction of the signal output from the decoding means and extracts the pitch period of that signal for each time period Ts; playback speed setting means for outputting playback speed information; time axis compression means for compressing the time axis of the signal output from the decoding means based on the playback speed information from the playback speed setting means and the pitch period extracted by the pitch period extracting device; time axis extension means for extending the time axis of the signal output from the decoding means based on the playback speed information from the playback speed setting means and the pitch period extracted by the pitch period extracting device; and selection means for selectively guiding the signal output from the decoding means to the time axis compression means or the time axis extension means, characterized in that the pitch period extracting device comprises: pitch period extracting means for extracting a pitch period from the signal in the time period Ts; storage means for storing the pitch period extracted by the pitch period extracting means; threshold setting means for setting a threshold for the pitch period; pitch period determining means for comparing the pitch period extracted by the pitch period extracting means with the threshold set by the threshold setting means and determining whether the pitch period is equal to or less than the threshold; and pitch period assigning means for, when the pitch period determining means determines that the pitch period extracted by the pitch period extracting means is equal to or less than the threshold, assigning the pitch period stored in the storage means to the time period Ts at the subsequent shift position without extracting a pitch period for that time period.
[0031]
Further, the present invention is characterized in that a substantial value of the predetermined value is changed according to reproduction speed information from the reproduction speed setting means.
[0032]
DETAILED DESCRIPTION OF THE INVENTION
Hereinafter, embodiments of the present invention will be described with reference to FIGS.
[0033]
First, FIG. 1 is a block diagram showing an outline of a playback apparatus for performing fast listening (fast forward playback) of an audio signal. In the figure, reference numeral 1 denotes pitch period extracting means for extracting a pitch period representing the pitch of an input digital audio signal (hereinafter referred to as “audio signal”).
[0034]
Reference numeral 2 denotes a pitch period determining means for comparing and determining a pitch period obtained by the pitch period extracting means 1 and a predetermined threshold (described later).
[0035]
When the pitch period determining means 2 compares the pitch period extracted by the pitch period extracting means 1 with a threshold value described later and finds that the pitch period is smaller than or equal to the threshold value, the pitch period extracting means 1 does not newly extract the pitch period of the subsequent audio signal. The details will be described later based on the flowchart of FIG. 3.
[0036]
Here, a pitch period extraction method using a well-known autocorrelation is used as a pitch period extraction method.
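As an illustration of what autocorrelation-based extraction involves, the following is a minimal sketch, not the extractor used in the embodiment. It assumes the short-time form in which samples outside the analysis frame are treated as zero, and the function names and the lag search range (`min_lag`, `max_lag`) are hypothetical choices made for this example.

```python
import math


def short_time_autocorr(frame, k):
    """Sum of frame[n] * frame[n + k]; samples beyond the frame count as zero."""
    return sum(frame[n] * frame[n + k] for n in range(len(frame) - k))


def extract_pitch_period(frame, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the largest autocorrelation.

    The search range is an assumption of this sketch; the source text does
    not state one.
    """
    if not 0 < min_lag <= max_lag < len(frame):
        raise ValueError("lags must satisfy 0 < min_lag <= max_lag < len(frame)")
    best_lag, best_val = min_lag, float("-inf")
    for k in range(min_lag, max_lag + 1):
        r = short_time_autocorr(frame, k)
        if r > best_val:
            best_lag, best_val = k, r
    return best_lag


# Synthetic test signal: a sine wave repeating every 50 samples.
frame = [math.sin(2 * math.pi * n / 50) for n in range(400)]
pitch = extract_pitch_period(frame, 20, 120)
```

Each call evaluates one product sum per candidate lag over the whole frame, which is why skipping the extraction for some frames reduces the processing load.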
[0037]
Reference numeral 3 denotes buffer means comprising a buffer memory for temporarily storing the pitch period information obtained by the pitch period extracting means 1. Reference numeral 4 denotes time axis compressing means which, as shown for example in FIG. 2, cuts out an audio waveform of two pitch periods, generates waveforms A′ and B′ by multiplying waveform A and waveform B, each one pitch period long, by weighting factors that vary linearly from 1 to 0 and from 0 to 1 respectively, and adds them together to obtain waveform C, thereby compressing the time axis of the audio signal. Reference numeral 5 denotes switching means for switching to either the pitch period extracting means 1 side or the buffer memory 3 side.
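The weighted addition performed by the time axis compressing means can be sketched as follows. This is an illustrative sketch only: the function name is hypothetical, and it assumes the linear weights reach exactly 1 and 0 at the first and last samples of the pitch period, a detail the text does not specify.

```python
def compress_two_periods(segment, pitch):
    """Blend two consecutive pitch periods (A, then B) into one period C.

    A is weighted by a factor falling linearly from 1 to 0 (giving A'),
    B by a factor rising linearly from 0 to 1 (giving B'), and C = A' + B'.
    """
    if pitch < 2 or len(segment) < 2 * pitch:
        raise ValueError("need pitch >= 2 and at least two pitch periods")
    a = segment[:pitch]
    b = segment[pitch:2 * pitch]
    c = []
    for i in range(pitch):
        w = i / (pitch - 1)  # rises linearly from 0 to 1 (endpoint choice assumed)
        c.append((1.0 - w) * a[i] + w * b[i])
    return c


# Two flat "periods" make the cross-fade easy to see.
c = compress_two_periods([1.0] * 5 + [3.0] * 5, 5)
```

Two pitch periods of input become one pitch period of output, so applying this to every pair of periods halves the duration.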
[0038]
The switching means 5 is appropriately switched to either the pitch period extracting means 1 side or the buffer memory 3 side according to the comparison / determination result of the pitch period determining means 2. Reference numeral 6 denotes threshold setting means, and reference numeral 7 denotes reproduction speed setting means. The reproduction speed setting means outputs reproduction speed information selected and set by the user.
[0039]
Next, the pitch period extraction method for audio signals according to the present invention will be described with reference to the flowchart of FIG. 3. First, in step S1, the user operates the playback speed setting means 7 to set the playback speed. The set playback speed information is sent to the pitch period determining means 2. The playback speed may be selected from predetermined patterns ranging, for example, from 0.5 times speed to 2.0 times speed, or may be entered directly as a numerical value.
[0040]
In step S2, a threshold value SH for the pitch period of the audio signal and an initial value of the branch variable upd are set based on the playback speed set in step S1 (threshold value SH < branch variable upd).
[0041]
[Table 1]

[0042]
Here, as an example, the case where the user has selected double speed as the playback speed in step S1 and the threshold SH = 240 (samples) and the branch variable upd = 300 (samples) are set (S2) will be described.
[0043]
Here, "sample" means the number of audio samples obtained at the desired sampling frequency when the audio signal is a digital signal. Table 1 shows the relationship between the playback speed, the threshold value SH, and the branch variable upd.
[0044]
In Table 1, the calculation method of the branch variable upd is changed according to the playback speed (playback speed information from the playback speed setting means), that is, the coefficient by which the variable pnum is multiplied is changed, because doing so reduces distortion. The reason is as follows.
[0045]
As shown in FIG. 13A, for example, when the playback speed is double speed, the waveform of the first pitch period and the waveform of the second pitch period are compressed into one waveform to obtain the first output waveform, The waveform of the 3rd pitch period and the waveform of the 4th pitch period are compressed into a second output waveform. That is, if the extraction range of one pitch period is 4 pitch periods or more, it is not necessary to extract the pitch period when generating the second output waveform. For this reason, as shown in Table 1, the coefficient multiplied by the variable pnum is set to 4.
[0046]
As shown in FIG. 13B, for example, when the playback speed is 1.5 times, the waveform of the first pitch period and the waveform of the second pitch period are compressed into one waveform to form the first output waveform, the waveform of the third pitch period is then used as the second output waveform, and the waveforms of the fourth and fifth pitch periods are compressed to form the third output waveform. That is, if the extraction range of one pitch period extraction is 5 pitch periods or more, it is not necessary to extract the pitch period when generating the third output waveform. For this reason, as shown in Table 1, the coefficient multiplied by the variable pnum is set to 5.
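The counting argument for the coefficients 4 and 5 can be checked with a small sketch. The data layout and function names below are hypothetical; only the two patterns themselves (FIG. 13A for double speed, FIG. 13B for 1.5 times speed) are taken from the description.

```python
# Each tuple lists the input pitch periods (1-based) that feed one output
# waveform. Two entries mean the periods are compressed into one waveform;
# one entry means the period is passed through unchanged.
PATTERNS = {
    2.0: [(1, 2), (3, 4)],        # FIG. 13A: compress, compress
    1.5: [(1, 2), (3,), (4, 5)],  # FIG. 13B: compress, pass through, compress
}


def periods_covered(speed):
    """Number of input pitch periods spanned by the listed output waveforms."""
    return max(p for step in PATTERNS[speed] for p in step)


def upd_after_extraction(speed, pnum):
    """Value assigned to upd right after an extraction: coefficient x pnum."""
    return periods_covered(speed) * pnum
```

For example, `periods_covered(2.0)` gives 4 and `periods_covered(1.5)` gives 5, matching the coefficients stated above; coefficients for other playback speeds are not derived here because Table 1 is not reproduced in the text.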
[0047]
In the case of other playback speeds, appropriate coefficients are given in the same manner, and it has been confirmed that the distortion is reduced by setting the coefficient multiplied to the variable pnum to an optimum value.
[0048]
In Table 1, the coefficient by which the variable pnum is multiplied is changed according to the playback speed information from the playback speed setting means. Substantially the same effect can be obtained by instead changing the threshold SH according to the playback speed. That is, changing either the coefficient multiplied by the variable pnum or the threshold value SH according to the playback speed is what is meant in claims 11, 13, and 15 by changing the substantial value of the predetermined value according to the playback speed information from the playback speed setting means.
[0049]
Next, in step S3, the audio signal input to the playback device for quickly listening to the audio signal is read.
[0050]
In step S4, the branch variable upd is compared with the threshold value SH. If the value of the branch variable upd is larger than the threshold value SH, the process proceeds to step S8, where the pitch period is extracted from the audio signal and stored in the variable pnum. On the other hand, if the value of the branch variable upd is smaller than or equal to the value of the threshold SH, the process proceeds to step S5.
[0051]
On the first pass through step S4, the threshold value SH has been set to 240 samples and the branch variable upd to 300 samples, which is larger than the threshold value SH, in the initial value setting of step S2 as described above. The process therefore proceeds to step S8, and the pitch period of the audio signal is extracted. For this purpose, the switching means 5 in FIG. 1 is switched to the pitch period extracting means 1 side.
[0052]
Note that a pitch period extraction method using a well-known autocorrelation is used as a method for extracting a pitch period of an audio signal.
[0053]
In step S9, a value based on the pitch period pnum extracted in step S8 is set in the branch variable upd. In this case, according to Table 1, upd ← 4 × pnum (the arrow denotes assignment; the same applies hereinafter).
[0054]
Next, in step S6, time axis compression of the audio signal is performed. In this time axis compression method, as shown in FIG. 2, a speech waveform of, for example, two pitch periods is cut out, the waveform of each pitch period is multiplied by a different weighting factor, for example one that changes linearly from 1 to 0 and one that changes linearly from 0 to 1, and the results are added together, so that time axis compression processing, that is, fast listening, can be performed.
[0055]
In step S7, it is determined whether or not the process is to be continued. If the process is to be continued, the process returns to step S3, and if the process is to be terminated, the process is terminated.
[0056]
The condition for terminating the process is, for example, a case where the user operates a stop button (not shown) to stop the sound reproduction.
[0057]
On the other hand, when the value of the branch variable upd is smaller than or equal to the value of the threshold value SH in step S4, the switching means 5 in FIG. 1 is switched to the buffer means 3 side and the process proceeds to step S5, where upd ← upd + 2 × pnum is set. Time axis compression is then performed in step S6, and the process returns to step S3.
[0058]
Therefore, as long as the value of the branch variable upd set in step S5 is smaller than or equal to the value of the threshold SH, the extraction of the pitch period pnum of the audio signal in step S8 and the setting of the branch variable upd in step S9 are not performed (during that time, the switching means 5 in FIG. 1 remains switched to the buffer means 3 side). That is, when the pitch period is short, it is not necessary to perform the extraction of the pitch period pnum of the audio signal in step S8 or the setting of the branch variable upd in step S9.
[0059]
By repeating the above processing, as shown in FIG. 4A, when the pitch period is short, it is not necessary to extract the pitch period for the two pitch periods that follow the speech waveform from which the pitch period was extracted, so the processing load on the pitch period extracting means 1 is reduced.
[0060]
On the other hand, as shown in FIG. 4B, when the pitch period is long, the number of pitch period extractions per unit time is small to begin with, so the processing load on the pitch period extracting means 1 does not change.
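The branching of steps S3 to S9 can be sketched as follows for double speed. This is a minimal sketch under stated assumptions: `pitch_source` is a hypothetical stand-in for the extraction of step S8, the function and argument names are illustrative, and the coefficients 4 and 2 are the double-speed values quoted in the text.

```python
def run_compression_loop(pitch_source, n_iterations, sh=240, upd=300,
                         extract_coeff=4, reuse_coeff=2):
    """Follow steps S3 to S9; return the pitch used on each pass and the
    number of passes on which a pitch period was actually extracted."""
    used = []
    extractions = 0
    pnum = None
    for i in range(n_iterations):        # S3: read the next part of the signal
        if upd > sh:                     # S4: compare upd with threshold SH
            pnum = pitch_source(i)       # S8: extract a new pitch period
            extractions += 1
            upd = extract_coeff * pnum   # S9: upd <- 4 x pnum
        else:
            upd = upd + reuse_coeff * pnum  # S5: reuse the buffered pitch period
        used.append(pnum)                # S6: compression would use pnum here
    return used, extractions


# Short pitch period (50 samples): extraction happens on every other pass.
short_used, short_count = run_compression_loop(lambda i: 50, 6)
# Long pitch period (100 samples): 4 x 100 > 240, so every pass extracts.
long_used, long_count = run_compression_loop(lambda i: 100, 6)
```

With a constant pitch period of 50 samples the six passes need only three extractions, whereas with 100 samples all six passes extract, which corresponds to the short-pitch and long-pitch cases described for FIG. 4A and FIG. 4B.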
[0061]
In the above-described embodiment, an example in which the input audio signal is subjected to time axis compression processing has been described. However, the present invention is not limited to this. When the input audio signal is to be extended along the time axis, providing the time axis extension means 8 in place of the time axis compression means 4, as shown in FIG. 5, yields a playback device for slow listening (slow playback) of an audio signal.
[0062]
As shown in FIG. 7, the time axis extension means 8 cuts out a speech waveform of, for example, three pitch periods, generates waveform A′ by multiplying waveform A, which spans two pitch periods, by a weighting factor that varies linearly from, for example, 0 to 1, generates waveform B′ by multiplying waveform B, which also spans two pitch periods, by a weighting factor that varies linearly from, for example, 1 to 0, and adds the two together. Waveform D and waveform E, each one pitch period long, are thereby obtained, so that time axis extension processing, that is, slow listening, can be performed.
[0063]
The operation in this case will be described based on the flowchart of FIG. Here, as an example, the case where the user has selected 0.5 times speed as the playback speed in step S11 and the threshold SH = 240 (samples) and the branch variable upd = 300 (samples) are set (S12) will be described.
[0064]
Next, in step S13, the audio signal input to the playback device for slow listening of the audio signal is read.
[0065]
In step S14, the branch variable upd is compared with the threshold value SH. If the value of the branch variable upd is larger than the threshold value SH, the process proceeds to step S18, where the pitch period is extracted from the audio signal and stored in the variable pnum. On the other hand, if the value of the branch variable upd is smaller than or equal to the threshold value SH, the process proceeds to step S15.
[0066]
The relationship between the playback speed and the calculation formulas for the branch variable upd in step S15 and step S19, both described later, is set as shown in Table 2 below.
[0067]
[Table 2]

[0068]
On the first pass through step S14, the threshold value SH has been set to 240 samples and the branch variable upd to 300 samples, which is larger than the threshold value SH, in the initial value setting of step S12 as described above. The process therefore proceeds to step S18, and the pitch period of the audio signal is extracted. For this purpose, the switching means 5 in FIG. 5 is switched to the pitch period extracting means 1 side.
[0069]
Note that a pitch period extraction method using a well-known autocorrelation is used as a method for extracting a pitch period of an audio signal.
[0070]
In step S19, a value based on the pitch period pnum extracted in step S18 is set in the branch variable upd. In this case, according to Table 2, upd ← 3 × pnum.
[0071]
Next, in step S16, time axis extension of the audio signal is performed. As shown in FIG. 7, this time axis extension method cuts out a speech waveform of, for example, two pitch periods, multiplies the waveform of each pitch period by a different weighting factor, for example one that changes from 1 to 0 and one that changes from 0 to 1, and adds the results together, so that time axis extension processing, that is, slow listening, can be performed.
[0072]
Next, in step S17, it is determined whether or not the process is to be continued. If the process is to be continued, the process returns to step S13, and if not, the process is terminated.
[0073]
The condition for terminating the process is, for example, a case where the user operates a stop button (not shown) to stop the sound reproduction.
[0074]
On the other hand, when the value of the branch variable upd is smaller than or equal to the value of the threshold value SH in step S14, the switching means 5 in FIG. 5 is switched to the buffer means 3 side and the process proceeds to step S15, where upd ← upd + pnum is set. Time axis extension is then performed in step S16, and the process returns to step S13.
[0075]
Therefore, as long as the value of the branch variable upd set in step S15 is smaller than or equal to the value of the threshold SH, the extraction of the pitch period pnum of the audio signal in step S18 and the setting of the branch variable upd in step S19 are not performed (during that time, the switching means 5 in FIG. 5 remains switched to the buffer means 3 side).
[0076]
That is, when the pitch period is short, it is not necessary to perform the extraction of the pitch period pnum of the audio signal in step S18 or the setting of the branch variable upd in step S19, so the processing load on the pitch period extracting means 1 is reduced.
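To make the behaviour of the branch variable concrete for the 0.5 times speed case, the sketch below traces upd over several passes. It is illustrative only: the function name is hypothetical, and it assumes that every extraction returns the same pitch period `pnum`, which a real signal would not guarantee.

```python
def trace_branch_variable(pnum, n_iterations, sh=240, upd=300):
    """Return (action, upd) for each pass through steps S14 to S19."""
    trace = []
    for _ in range(n_iterations):
        if upd > sh:              # S14: upd exceeds threshold SH
            action = "extract"    # S18: a new pitch period is extracted
            upd = 3 * pnum        # S19: upd <- 3 x pnum
        else:
            action = "reuse"      # buffered pitch period is used instead
            upd = upd + pnum      # S15: upd <- upd + pnum
        trace.append((action, upd))
    return trace


trace = trace_branch_variable(pnum=50, n_iterations=5)
```

With SH = 240, an initial upd of 300, and a pitch period of 50 samples, upd runs 150, 200, 250, 150, 200, so under these assumptions an extraction is needed on only one pass in three.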
[0077]
Further, in the above-described embodiment, the reproduction apparatus for performing the early listening of the audio signal and the reproduction apparatus for performing the delayed listening of the audio signal have been described. However, as shown in FIG. It can also be a device having both functions.
[0078]
In FIG. 8, 12 is ADPCM encoding means that encodes an input audio signal by conventional ADPCM processing, 9 is a memory that stores the signal encoded by the ADPCM encoding means 12, 13 is ADPCM decoding means that decodes the signal from the memory 9, 14 is selection means, 4 is time axis compression means that receives the signal from the ADPCM decoding means 13 via the selection means 14, and 8 is time axis expansion means that receives the signal from the ADPCM decoding means 13 via the selection means 14.
[0079]
With this configuration, a fast listening output signal can be obtained by switching the selection means 14 to the time axis compression means 4 side, while a slow listening output signal can be obtained by switching the selection means 14 to the time axis expansion means 8 side.
[0080]
Also, as shown in FIG. 9, the time-axis-compressed audio signal may be stored once in the memory 9, and the audio signal read from the memory 9 may then be extended on the time axis, so that the audio signal is both compressed and expanded.
[0081]
In this embodiment, the pitch period extracting device of the present invention is applied to the portion that performs time axis compression. The purpose of this device is not fast listening or slow listening of an audio signal; rather, by storing the audio signal in a memory after time axis compression, it stores a large amount of signal in a small memory. The same applies to the apparatus shown in FIGS.
[0082]
FIG. 10 shows an example in which the pitch period extracting device of the present invention is applied to the portion performing the time axis extension processing in the device shown in FIG. In this example, the second pitch period extracting means 21, the second pitch period determining means 22, the second buffer 23, the second switching means 24, and the second threshold value setting means 25 are also provided on the time axis extension processing side. Therefore, since the pitch period can be detected from the signal stored in the memory 9, it is not necessary to store the pitch period in the memory 9, and the capacity of the memory 9 can be saved.
[0083]
FIG. 11 shows an example in which the apparatus shown in FIG. 9 is further provided with band division encoding means 10 and band division decoding means 11. In this example, the band division encoding means 10 and the band division decoding means 11 can perform signal compression / decompression processing not only in the time axis direction but also in the frequency band direction.
[0084]
FIG. 12 shows an example in which the second pitch period extraction means 21, the second pitch period determining means 22, the second buffer 23, the second switching means 24, and the second threshold value setting means 25 are added to the circuit shown in FIG. In this example, the band division encoding means 10 and the band division decoding means 11 allow signal compression / decompression not only in the time axis direction but also in the frequency band direction. Furthermore, the second pitch period extraction means 21, the second pitch period determining means 22, the second buffer 23, the second switching means 24, and the second threshold value setting means 25 extract the pitch period from the signal output from the band division decoding means 11. The pitch period itself therefore need not be stored, and the capacity of the memory 9 can be saved.
[0085]
[Effect of the invention]
As described above in detail, according to the present invention, when the pitch period is extracted for a speech waveform having a short pitch period, it suffices to extract the pitch period for the first speech waveform only. Because the pitch period need not be obtained for the subsequent waveforms, the processing load of the pitch period extraction means is reduced.
[0086]
Further, by changing the substantial value of the predetermined value according to the reproduction speed information from the reproduction speed setting means, it is possible to reduce the distortion of the sound during reproduction.
[Brief description of the drawings]
FIG. 1 is a block diagram showing a configuration of a pitch period extracting device of the present invention.
FIG. 2 is a flowchart for explaining time axis compression processing.
FIG. 3 is a flowchart showing the operation of the pitch period extracting device of the present invention.
FIG. 4 is a flowchart for explaining time axis compression processing.
FIG. 5 is a block diagram showing a configuration of a time axis extension device of the present invention.
FIG. 6 is a flowchart showing the operation of the time axis extension apparatus of the present invention.
FIG. 7 is a diagram for explaining time axis extension processing.
FIG. 8 is a block diagram showing a configuration of a time-axis compression / decompression apparatus according to the present invention.
FIG. 9 is a block diagram showing a configuration of an audio signal recording / reproducing apparatus of the present invention.
FIG. 10 is a block diagram showing a configuration of another audio signal recording / reproducing apparatus of the present invention.
FIG. 11 is a block diagram showing a configuration of another audio signal recording / reproducing apparatus of the present invention.
FIG. 12 is a block diagram showing a configuration of an audio signal recording / reproducing apparatus of the present invention.
FIG. 13 is a diagram for explaining the effect of reducing distortion.
FIG. 14 is a diagram illustrating a conventional time axis compression process.
[Explanation of symbols]
1 Pitch period extraction means
2 Pitch period determination means
3 Buffer means
4 Time axis compression means
5 Switching means
6 Threshold setting means
8 Time axis extension means
9 Memory means
10 Band division coding means
11 Band division decoding means
12 ADPCM encoding means
13 ADPCM decoding means
14 Selection means
21 Second pitch period extraction means
22 Second pitch period determining means
23 Second buffer
24 Second switching means
25 Second threshold value setting means

Claims

A pitch period extraction method in which a time period Ts for extracting a pitch period is sequentially shifted in the time axis direction of an input audio signal and the pitch period of the input audio signal is extracted for each time period Ts,
characterized in that, when the pitch period extracted with the time period Ts at a predetermined shift position is equal to or less than a preset threshold, no pitch period is extracted for the time period Ts at a subsequent shift position, and the pitch period equal to or less than the threshold that was extracted with the time period Ts at the predetermined shift position is assigned to the time period Ts at that subsequent shift position.

A pitch period extraction device for an audio signal, which sequentially shifts a time period Ts for extracting a pitch period in the time axis direction of an input audio signal and extracts the pitch period of the input audio signal for each time period Ts, the device comprising:
pitch period extracting means for extracting a pitch period from the audio signal of the time period Ts;
storage means for storing the pitch period extracted by the pitch period extracting means;
threshold setting means for setting a threshold for the pitch period;
pitch period determining means for comparing the pitch period extracted by the pitch period extracting means with the threshold set by the threshold setting means and determining whether the pitch period is equal to or less than the threshold; and
pitch period assigning means for, when the pitch period determining means determines that the pitch period extracted by the pitch period extracting means is equal to or less than the threshold, not extracting a pitch period for the time period Ts at a subsequent shift position and assigning the pitch period stored in the storage means to the time period Ts at that subsequent shift position.

A time axis compression apparatus for an audio signal, comprising:
a pitch period extraction device that sequentially shifts a time period Ts for extracting a pitch period in the time axis direction of an input audio signal and extracts the pitch period of the input audio signal for each time period Ts; and
time axis compression means for compressing the time axis of the input audio signal based on the pitch period extracted by the pitch period extraction device,
wherein the pitch period extraction device comprises:
pitch period extracting means for extracting a pitch period from the audio signal of the time period Ts;
storage means for storing the pitch period extracted by the pitch period extracting means;
threshold setting means for setting a threshold for the pitch period;
pitch period determining means for comparing the pitch period extracted by the pitch period extracting means with the threshold set by the threshold setting means and determining whether the pitch period is equal to or less than the threshold; and
pitch period assigning means for, when the pitch period determining means determines that the pitch period extracted by the pitch period extracting means is equal to or less than the threshold, not extracting a pitch period for the time period Ts at a subsequent shift position and assigning the pitch period stored in the storage means to the time period Ts at that subsequent shift position.

A time axis extension apparatus for an audio signal, comprising:
a pitch period extraction device that sequentially shifts a time period Ts for extracting a pitch period in the time axis direction of an input audio signal and extracts the pitch period of the input audio signal for each time period Ts; and
time axis extension means for extending the time axis of the input audio signal based on the pitch period extracted by the pitch period extraction device,
wherein the pitch period extraction device comprises:
pitch period extracting means for extracting a pitch period from the audio signal of the time period Ts;
storage means for storing the pitch period extracted by the pitch period extracting means;
threshold setting means for setting a threshold for the pitch period;
pitch period determining means for comparing the pitch period extracted by the pitch period extracting means with the threshold set by the threshold setting means and determining whether the pitch period is equal to or less than the threshold; and
pitch period assigning means for, when the pitch period determining means determines that the pitch period extracted by the pitch period extracting means is equal to or less than the threshold, not extracting a pitch period for the time period Ts at a subsequent shift position and assigning the pitch period stored in the storage means to the time period Ts at that subsequent shift position.

A time axis compression / expansion apparatus for an audio signal, comprising:
encoding means for encoding an input audio signal;
memory means for storing the signal encoded by the encoding means;
decoding means for decoding the signal stored in the memory means;
a pitch period extraction device that sequentially shifts a time period Ts for extracting a pitch period in the time axis direction of the signal output from the decoding means and extracts the pitch period of the signal output from the decoding means for each time period Ts;
time axis compression means for compressing the time axis of the signal output from the decoding means based on the pitch period extracted by the pitch period extraction device;
time axis extension means for extending the time axis of the signal output from the decoding means based on the pitch period extracted by the pitch period extraction device; and
selection means for selectively guiding the signal output from the decoding means to the time axis compression means or the time axis extension means,
wherein the pitch period extraction device comprises:
pitch period extracting means for extracting a pitch period from the signal of the time period Ts;
storage means for storing the pitch period extracted by the pitch period extracting means;
threshold setting means for setting a threshold for the pitch period;
pitch period determining means for comparing the pitch period extracted by the pitch period extracting means with the threshold set by the threshold setting means and determining whether the pitch period is equal to or less than the threshold; and
pitch period assigning means for, when the pitch period determining means determines that the pitch period extracted by the pitch period extracting means is equal to or less than the threshold, not extracting a pitch period for the time period Ts at a subsequent shift position and assigning the pitch period stored in the storage means to the time period Ts at that subsequent shift position.

A time axis compression / expansion apparatus for an audio signal, comprising:
a pitch period extraction device that sequentially shifts a time period Ts for extracting a pitch period in the time axis direction of an input audio signal and extracts the pitch period of the input audio signal for each time period Ts;
time axis compression means for compressing the time axis of the input audio signal based on the pitch period extracted by the pitch period extraction device;
memory means for storing the signal time-axis-compressed by the time axis compression means; and
time axis extension means for extending the time axis of the signal stored in the memory means based on the pitch period extracted by the pitch period extraction device,
wherein the pitch period extraction device comprises:
pitch period extracting means for extracting a pitch period from the audio signal of the time period Ts;
storage means for storing the pitch period extracted by the pitch period extracting means;
threshold setting means for setting a threshold for the pitch period;
pitch period determining means for comparing the pitch period extracted by the pitch period extracting means with the threshold set by the threshold setting means and determining whether the pitch period is equal to or less than the threshold; and
pitch period assigning means for, when the pitch period determining means determines that the pitch period extracted by the pitch period extracting means is equal to or less than the threshold, not extracting a pitch period for the time period Ts at a subsequent shift position and assigning the pitch period stored in the storage means to the time period Ts at that subsequent shift position.

A time axis compression / expansion apparatus for an audio signal, comprising:
a first pitch period extraction device that sequentially shifts a time period Ts1 for extracting a pitch period in the time axis direction of an input audio signal and extracts the pitch period of the input audio signal for each time period Ts1;
time axis compression means for compressing the time axis of the input audio signal based on the pitch period extracted by the first pitch period extraction device;
memory means for storing the signal time-axis-compressed by the time axis compression means;
a second pitch period extraction device that sequentially shifts a time period Ts2 for extracting a pitch period in the time axis direction of the signal stored in the memory means and extracts the pitch period of the signal stored in the memory means for each time period Ts2; and
time axis extension means for extending the time axis of the signal stored in the memory means based on the pitch period extracted by the second pitch period extraction device,
wherein the first pitch period extraction device comprises:
first pitch period extracting means for extracting a pitch period from the audio signal of the time period Ts1;
first storage means for storing the pitch period extracted by the first pitch period extracting means;
first threshold setting means for setting a threshold for the pitch period;
first pitch period determining means for comparing the pitch period extracted by the first pitch period extracting means with the threshold set by the first threshold setting means and determining whether the pitch period is equal to or less than the threshold; and
first pitch period assigning means for, when the first pitch period determining means determines that the pitch period extracted by the first pitch period extracting means is equal to or less than the threshold, not extracting a pitch period for the time period Ts1 at a subsequent shift position and assigning the pitch period stored in the first storage means to the time period Ts1 at that subsequent shift position,
and wherein the second pitch period extraction device comprises:
second pitch period extracting means for extracting a pitch period from the signal of the time period Ts2;
second storage means for storing the pitch period extracted by the second pitch period extracting means;
second threshold setting means for setting a threshold for the pitch period;
second pitch period determining means for comparing the pitch period extracted by the second pitch period extracting means with the threshold set by the second threshold setting means and determining whether the pitch period is equal to or less than the threshold; and
second pitch period assigning means for, when the second pitch period determining means determines that the pitch period extracted by the second pitch period extracting means is equal to or less than the threshold, not extracting a pitch period for the time period Ts2 at a subsequent shift position and assigning the pitch period stored in the second storage means to the time period Ts2 at that subsequent shift position.

A time axis compression / expansion apparatus for an audio signal, comprising:
a pitch period extraction device that sequentially shifts a time period Ts for extracting a pitch period in the time axis direction of an input audio signal and extracts the pitch period of the input audio signal for each time period Ts;
time axis compression means for compressing the time axis of the input audio signal based on the pitch period extracted by the pitch period extraction device;
band division encoding means for dividing the signal time-axis-compressed by the time axis compression means into a plurality of bands and encoding it;
memory means for storing the signal encoded by the band division encoding means;
band division decoding means for decoding the signal stored in the memory means; and
time axis extension means for extending the time axis of the signal output from the band division decoding means based on the pitch period extracted by the pitch period extraction device,
wherein the pitch period extraction device comprises:
pitch period extracting means for extracting a pitch period from the audio signal of the time period Ts;
storage means for storing the pitch period extracted by the pitch period extracting means;
threshold setting means for setting a threshold for the pitch period;
pitch period determining means for comparing the pitch period extracted by the pitch period extracting means with the threshold set by the threshold setting means and determining whether the pitch period is equal to or less than the threshold; and
pitch period assigning means for, when the pitch period determining means determines that the pitch period extracted by the pitch period extracting means is equal to or less than the threshold, not extracting a pitch period for the time period Ts at a subsequent shift position and assigning the pitch period stored in the storage means to the time period Ts at that subsequent shift position.

A time axis compression / expansion apparatus for an audio signal, comprising:
a first pitch period extraction device that sequentially shifts a time period Ts1 for extracting a pitch period in the time axis direction of an input audio signal and extracts the pitch period of the input audio signal for each time period Ts1;
time axis compression means for compressing the time axis of the input audio signal based on the pitch period extracted by the first pitch period extraction device;
band division encoding means for dividing the signal time-axis-compressed by the time axis compression means into a plurality of bands and encoding it;
memory means for storing the signal encoded by the band division encoding means;
band division decoding means for decoding the signal stored in the memory means;
a second pitch period extraction device that sequentially shifts a time period Ts2 for extracting a pitch period in the time axis direction of the signal output from the band division decoding means and extracts the pitch period of the signal output from the band division decoding means for each time period Ts2; and
time axis extension means for extending the time axis of the signal output from the band division decoding means based on the pitch period extracted by the second pitch period extraction device,
wherein the first pitch period extraction device comprises:
first pitch period extracting means for extracting a pitch period from the audio signal of the time period Ts1;
first storage means for storing the pitch period extracted by the first pitch period extracting means;
first threshold setting means for setting a threshold for the pitch period;
first pitch period determining means for comparing the pitch period extracted by the first pitch period extracting means with the threshold set by the first threshold setting means and determining whether the pitch period is equal to or less than the threshold; and
first pitch period assigning means for, when the first pitch period determining means determines that the pitch period extracted by the first pitch period extracting means is equal to or less than the threshold, not extracting a pitch period for the time period Ts1 at a subsequent shift position and assigning the pitch period stored in the first storage means to the time period Ts1 at that subsequent shift position,
and wherein the second pitch period extraction device comprises:
second pitch period extracting means for extracting a pitch period from the signal of the time period Ts2;
second storage means for storing the pitch period extracted by the second pitch period extracting means;
second threshold setting means for setting a threshold for the pitch period;
second pitch period determining means for comparing the pitch period extracted by the second pitch period extracting means with the threshold set by the second threshold setting means and determining whether the pitch period is equal to or less than the threshold; and
second pitch period assigning means for, when the second pitch period determining means determines that the pitch period extracted by the second pitch period extracting means is equal to or less than the threshold, not extracting a pitch period for the time period Ts2 at a subsequent shift position and assigning the pitch period stored in the second storage means to the time period Ts2 at that subsequent shift position.

A time-axis compression apparatus for an audio signal, comprising:
a pitch period extraction device that sequentially shifts a time period Ts, from which a pitch period is extracted, along the time axis of an input audio signal and extracts the pitch period of the input audio signal for each time period Ts;
playback speed setting means for outputting playback speed information; and
time-axis compression means for compressing the time axis of the input audio signal based on the playback speed information from the playback speed setting means and the pitch period extracted by the pitch period extraction device,
wherein the pitch period extraction device includes:
pitch period extracting means for extracting a pitch period from the audio signal in the time period Ts;
storage means for storing the pitch period extracted by the pitch period extracting means;
threshold setting means for setting a threshold for the pitch period;
pitch period determining means for comparing the pitch period extracted by the pitch period extracting means with the threshold set by the threshold setting means and determining whether the pitch period is equal to or less than the threshold; and
pitch period assigning means for, when the pitch period determining means determines that the pitch period extracted by the pitch period extracting means is equal to or less than the threshold, assigning the pitch period stored in the storage means to the time period Ts at the subsequent shift position without extracting a pitch period for that time period.
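
The extract, store, compare and reuse steps recited for the pitch period extraction device can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the patented implementation: the function names `autocorr_pitch` and `assign_pitch_periods`, the lag search range, the non-overlapping shift of exactly Ts samples, and the choice to reuse the stored value for exactly one subsequent window are all assumptions made for the example. The pitch estimator is a plain short-time autocorrelation peak search.

```python
def autocorr_pitch(frame, min_lag, max_lag):
    """Return the lag (in samples) that maximises the short-time autocorrelation."""
    best_lag, best_val = min_lag, float("-inf")
    for k in range(min_lag, max_lag + 1):
        # Short-time autocorrelation: the signal is treated as zero outside the frame.
        r = sum(frame[n] * frame[n + k] for n in range(len(frame) - k))
        if r > best_val:
            best_lag, best_val = k, r
    return best_lag


def assign_pitch_periods(signal, ts, threshold, min_lag=20, max_lag=200):
    """Assign one pitch period to each window of ts samples.

    Assumptions (not taken from the claims): windows do not overlap, and a
    stored pitch period is reused for exactly one subsequent window.
    Returns (pitch period per window, number of extractions performed).
    """
    periods = []
    extractions = 0
    stored = None        # plays the role of the "storage means"
    reuse_next = False   # set by the "pitch period determining means"
    pos = 0
    while pos + ts <= len(signal):
        if reuse_next:
            # "Assigning means": no extraction, reuse the stored pitch period.
            periods.append(stored)
            reuse_next = False
        else:
            stored = autocorr_pitch(signal[pos:pos + ts], min_lag, max_lag)
            extractions += 1
            periods.append(stored)
            # Compare the extracted pitch period with the threshold.
            reuse_next = stored <= threshold
        pos += ts
    return periods, extractions
```

With a pulse train whose period is 50 samples, 1600 samples of input, `ts=400` and `threshold=80`, every window is assigned a pitch period of 50 samples while the autocorrelation search runs for only two of the four windows.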

11. The time-axis compression apparatus for an audio signal according to claim 10, wherein the effective value of the predetermined value is changed in accordance with the playback speed information from the playback speed setting means.

A time-axis expansion apparatus for an audio signal, comprising:
a pitch period extraction device that sequentially shifts a time period Ts, from which a pitch period is extracted, along the time axis of an input audio signal and extracts the pitch period of the input audio signal for each time period Ts;
playback speed setting means for outputting playback speed information; and
time-axis expansion means for expanding the time axis of the input audio signal based on the playback speed information from the playback speed setting means and the pitch period extracted by the pitch period extraction device,
wherein the pitch period extraction device includes:
pitch period extracting means for extracting a pitch period from the audio signal in the time period Ts;
storage means for storing the pitch period extracted by the pitch period extracting means;
threshold setting means for setting a threshold for the pitch period;
pitch period determining means for comparing the pitch period extracted by the pitch period extracting means with the threshold set by the threshold setting means and determining whether the pitch period is equal to or less than the threshold; and
pitch period assigning means for, when the pitch period determining means determines that the pitch period extracted by the pitch period extracting means is equal to or less than the threshold, assigning the pitch period stored in the storage means to the time period Ts at the subsequent shift position without extracting a pitch period for that time period.

13. The time-axis expansion apparatus for an audio signal according to claim 12, wherein the effective value of the predetermined value is changed in accordance with the playback speed information from the playback speed setting means.

A time-axis compression/expansion apparatus for an audio signal, comprising:
encoding means for encoding an input audio signal;
memory means for storing the signal encoded by the encoding means;
decoding means for decoding the signal stored in the memory means;
a pitch period extraction device that sequentially shifts a time period Ts, from which a pitch period is extracted, along the time axis of the signal output from the decoding means and extracts the pitch period of the signal output from the decoding means for each time period Ts;
playback speed setting means for outputting playback speed information;
time-axis compression means for compressing the time axis of the signal output from the decoding means based on the playback speed information from the playback speed setting means and the pitch period extracted by the pitch period extraction device;
time-axis expansion means for expanding the time axis of the signal output from the decoding means based on the playback speed information from the playback speed setting means and the pitch period extracted by the pitch period extraction device; and
selecting means for selectively routing the signal output from the decoding means to the time-axis compression means or the time-axis expansion means,
wherein the pitch period extraction device includes:
pitch period extracting means for extracting a pitch period from the signal in the time period Ts;
storage means for storing the pitch period extracted by the pitch period extracting means;
threshold setting means for setting a threshold for the pitch period;
pitch period determining means for comparing the pitch period extracted by the pitch period extracting means with the threshold set by the threshold setting means and determining whether the pitch period is equal to or less than the threshold; and
pitch period assigning means for, when the pitch period determining means determines that the pitch period extracted by the pitch period extracting means is equal to or less than the threshold, assigning the pitch period stored in the storage means to the time period Ts at the subsequent shift position without extracting a pitch period for that time period.
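
The role of the selecting means, which routes the decoded signal to either the compression path or the expansion path, can be sketched as below. This is a hedged illustration only. The claim does not state the criterion used for the selection, so routing on whether the playback speed is above or below 1.0 is an assumption. The functions `compress_once`, `expand_once` and `route` are hypothetical names, and the crossfade-based compressor and expander are simple stand-ins rather than the processing defined in the patent. Encoding, memory and decoding are omitted; `decoded` is assumed to be the decoder output and to hold at least two pitch periods.

```python
def compress_once(samples, pitch):
    """Stand-in compressor: crossfade two adjacent pitch periods into one."""
    a, b = samples[:pitch], samples[pitch:2 * pitch]
    merged = [a[i] * (1 - i / pitch) + b[i] * (i / pitch) for i in range(pitch)]
    return merged + samples[2 * pitch:]


def expand_once(samples, pitch):
    """Stand-in expander: insert one crossfaded period after the first period."""
    a, b = samples[:pitch], samples[pitch:2 * pitch]
    blend = [b[i] * (1 - i / pitch) + a[i] * (i / pitch) for i in range(pitch)]
    return a + blend + samples[pitch:]


def route(decoded, pitch, speed):
    """Hypothetical selecting means: choose a path from the playback speed.

    Assumption: speed > 1.0 means faster playback (compress), speed < 1.0
    means slower playback (expand), and 1.0 passes the signal through.
    """
    if speed > 1.0:
        return compress_once(decoded, pitch)
    if speed < 1.0:
        return expand_once(decoded, pitch)
    return list(decoded)
```

In this sketch a single call shortens or lengthens the signal by exactly one pitch period; a real device would repeat the operation along the signal until the requested speed ratio is reached.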

15. The time-axis compression/expansion apparatus for an audio signal according to claim 14, wherein the effective value of the predetermined value is changed in accordance with the playback speed information from the playback speed setting means.