JP3719503B2

JP3719503B2 - Music player

Info

Publication number: JP3719503B2
Application number: JP2001174899A
Authority: JP
Inventors: 健次三輪
Original assignee: Sharp Corp
Current assignee: Sharp Corp
Priority date: 2001-06-08
Filing date: 2001-06-08
Publication date: 2005-11-24
Anticipated expiration: 2021-06-08
Also published as: JP2002366197A

Description

【０００１】
【発明の属する技術分野】
本発明は、圧縮デジタル音楽データ信号に対して伸長処理を施して音楽再生を行う音楽再生装置に関する。例えば、データ伝送手段として無線を利用した衛星デジタルＴＶ（テレビジョン）放送、地上波デジタルＴＶ放送、地上波デジタルラジオ放送等のチューナに使用される音楽再生装置、データ伝送手段として有線を利用したケーブルＴＶのデジタル放送等のチューナに使用される音楽再生装置、データ伝送手段としてインターネットを利用したインターネットラジオ、インターネット電話に使用される音楽再生装置等、或いはオフラインにより情報を伝達するパッケージ系情報媒体としてＭＤ（ミニ・ディスク）のプレーヤー、ＩＣカードもしくはＩＣカードに内蔵されたフラッシュメモリに圧縮音源データを記録させたＩＣ半導体音楽プレーヤー等の音楽再生装置に関する。
【０００２】
【従来の技術】
近年においては、半導体技術の発展により、大容量のデジタルデータを高速に処理することが可能となっている。それに伴って、音楽データに関しても、デジタル圧縮されたオーディオコンテンツが流通しており、デジタル放送の普及も進んでいる。
【０００３】
音楽データをデジタル圧縮するための方式としては、ＭＤで使用されるＡＴＲＡＣ（ＡｄａｐｔｉｖｅＴｒａｎｓｆｏｒｍＡｃｏｕｓｔｉｃＣｏｄｉｎｇ）、ＰＣ（パーソナルコンピュータ）等で使用されるＭＰ３（ＭＰＥＧ−１ＡｕｄｉｏＬａｙｅｒ３）、デジタルＴＶ放送等で使用されるＡＡＣ（ＭＰＥＧ−２ＡＡＣ）等が挙げられる。
【０００４】
いずれの方式においても、まず、前処理として、アナログデータである音楽データを標本化および量子化してデジタル化し、帯域フィルタ等により帯域分割して圧縮単位毎にデータ分割し、時間軸に沿って表されたデータ（時間成分信号）を周波数軸に沿って表されたデータ（周波数成分信号）に変換した後、圧縮処理が行われる。例えば、音楽の場合には４４．１ｋＨｚまたは４８ｋＨｚ等の周波数でサンプリングが行われ、２４ｂｉｔデータまたは１６ｂｉｔデータ等に量子化される。一方、音声の場合には８ｋＨｚ等の周波数でサンプリングが行われ、８ｂｉｔデータ等に量子化される。時間軸データから周波数軸データへの変換は、データを帯域フィルタ等により帯域分割し、各帯域内でＭＤＣＴ（ＭｏｄｉｆｉｅｄＤｉｓｃｒｅｔｅＣｏｓｉｎｅＴｒａｎｓｆｏｒｍ）等を用いてデータ変換することにより行われる。
【０００５】
圧縮処理としては、聴覚心理モデル等を用いて人間が知覚できない音楽成分をデータから削除する等の非可逆データ圧縮、データをハフマン符号等のエントロピー符号化する可逆データ圧縮等が行われる。さらに、ドラム、シンバル等のアタック（衝撃）音に対する再現性を向上させるため、アタック音部分ではデータを圧縮する単位時間幅を短くしてサブフレーム化する方法、アタック音部分のみ、他の部分とは別に切り出して処理する方法等が用いられている。
【０００６】
周波数成分に変換された音楽データには、音源データの周波数に関するデータと位相に関するデータとが含まれる。例えば図１（ａ）に示すような単一周波数からなる音源データを第１フレーム〜第３フレームの３つのサウンドフレームに分割した場合、切り出された各サウンドフレームでは位相が異なる。このため、各時間軸データを変換して得られる各周波数軸データは、音源データの周波数が同じであっても、位相データが異なるデータとなる。
【０００７】
なお、再生波形において各サウンドフレーム間の連結状態を良好にして信号遷移を滑らかにするために、図１（ｂ）〜図１（ｄ）に示す窓関数と各サウンドフレームの時間軸データとを乗算する窓処理が行われる。窓関数の曲線部分は、データ圧縮・伸長方式によって異なり、例えば１／４円またはｓｉｎ波形等が用いられる。この窓処理により、窓関数の平坦部分（値「１」の部分）では、時間軸データがそのまま再生波形として出力され、各サウンドフレーム間の連結部分では、各時間軸データと窓関数の曲線部分とを乗算した結果を加算した波形が再生波形として出力される。その結果、図１（ｅ）に示す再生波形が得られる。
【０００８】
ところで、デジタル圧縮された音楽データを転送するデータ転送方式は、時間的制限を伴わないデータ転送方式と、時間的制限を伴うデータ転送方式とに分けられる。時間的制限を伴わないデータ転送方式の例としては、例えばインターネット等を用いて音楽データファイルをダウンロードし、全データの受信を完了した後で、受信データを受信端末側のパーソナルコンピュータ等により伸長・再生する場合等が挙げられる。このデータ転送方式において、元データにエラーが含まれず、データ伝送時にエラーが発生した場合には、受信端末側でデータ伝送時のエラーを検出し、送信側に対してデータ再送要求を行うことにより、最終的にエラーが無い音楽データファイルを受信することができる。データ伝送時のエラーは、例えば伝送単位であるパケット単位でパリティデータを付加すること等により、受信側で検出することができる。なお、元データの破壊等によって元データにエラーが含まれる場合には、その元データを基にして伝送系における誤りを検出するための検出符号が付加される。このような場合には、受信側では、例えばファイルフォーマットに異常がある等のような特別な場合を除いて、エラーを検出することは不可能である。
【０００９】
時間的制限が伴うデータ転送方式の例としては、例えば衛星デジタルＴＶ放送が挙げられる。衛星デジタルＴＶ放送では、ＭＰＥＧ−２（ＭｏｖｉｎｇＰｉｃｔｕｒｅＥｘｐｅｒｔｓＧｒｏｕｐｐｈａｓｅ２）、ＡＡＣ（ＡｄｖａｎｃｅｄＡｕｄｉｏＣｏｄｉｎｇ）等の音楽圧縮方式で圧縮されたデジタル音楽データが放送される。放送されるデータは、伝送系における誤り訂正・検出符号として、リードソロモン符号、畳み込み符号等の強力な誤り訂正・検出符号が付加されて伝送される。しかしながら、天候状態、電波の受信状態等によっては、このような誤り訂正・検出符号が充分受信されず、受信側で誤り訂正ができない場合もある。このような場合に、時間的に遅れることなく音楽再生を行う必要があるＴＶ放送等の分野では、データを再送する等の手段を用いることができないため、エラーが生じた部分に関しては受信側で正しく音楽を再生することができない。そこで、従来においては、例えば受信データからエラーが検出された場合には、エラーを含む部分を無音化（ミュート）する等の処理が行われている。
【００１０】
さらに、衝撃対策用バッファメモリ（ショックプルーフメモリ）を備えたＭＤ再生機器においても、連続的な振動等によってＭＤからの正常なデータ読み取りが阻害される場合がある。このような場合に、バッファメモリに格納されたデータ分以上にデータ読み取りができない期間が続くと、例えばエラーを含む部分を無音化する処理、直前の再生データを繰り返して再生する処理等が行われている。
【００１１】
【発明が解決しようとする課題】
上述したように、時間的に遅れることなく音楽を再生する必要があるＴＶ放送等の分野では、データの再送等の手段を取ることができず、エラーが生じた部分に関しては受信端末側で正しく音楽を再生することができない。
【００１２】
そこで、従来においては、受信データからエラーが検出された場合に、エラーを含む部分を無音化（ミュート）する方法等が用いられている。例えば図２（ａ）に示すような音源波形をデジタル圧縮データとして伝送する際に第２サウンドフレームにエラーが発生した場合、受信側では第２サウンドフレームの圧縮データとして無音データが伸長される。そして、図２（ｂ）〜図２（ｄ）に示す各サウンドフレームの時間軸データと窓関数の値とが乗算され、図２（ｅ）に示すような再生波形が生成される。なお、再生ＡＭＰ（増幅器）にて出力を絞る方法等によりエラーを含む部分を無音化することも可能である。しかし、このようにエラー部分を無音化する方法では、図２（ｅ）に示すように、本来連続して出力されるべき音楽がミュート処理によって途切れるために音質が低下するという問題がある。さらに、エラー部分を無音化する方法では、エラーが長期化したときに利用者が受信データにエラーが生じているのか、再生装置自体の故障であるのかを判断することが容易ではないという問題もある。
【００１３】
さらに、従来においては、受信データからエラーが検出された場合に、直前の再生データを繰り返して再生する方法も用いられている。例えば図３（ａ）に示すような音源波形をデジタル圧縮データとして伝送する際に第２サウンドフレームにエラーが発生した場合、受信側では第２サウンドフレームの圧縮データとして直前の第１サウンドフレームの圧縮データが伸長される。そして、図３（ｂ）〜図３（ｄ）に示す各サウンドフレームの時間データと窓関数の値とが乗算され、図３（ｅ）に示すような再生波形が生成される。しかし、このように直前の再生データを繰り返して再生する方法では、周波数が同じ音源波形を圧縮したデータであっても、サウンドフレーム毎に異なる位相データが含まれるため、直前のフレームサウンドデータを用いた場合でも、図３（ｅ）に示すように、再生波形にひずみが発生して音質が低下するという問題がある。
【００１４】
本発明は、このような従来技術の課題を解決するためになされたものであり、再生波形にひずみを生じさせることなくデータ伝送時のエラーを補完することができ、さらに、データにエラーがあった場合に利用者に知らせることができる音楽再生装置を提供することを目的とする。
【００１５】
【課題を解決するための手段】
本発明の音楽再生装置は、入力信号であるデジタル音楽データの周波数成分信号を、記録単位毎に時間成分信号に変換する信号変換手段と、前記記録単位毎に、前記時間成分信号の入力の開始期間および終端期間に該時間成分信号の入力レベルを増加させる増加部および減衰させる減衰部がそれぞれ設けられるとともに、該開始期間および該終端期間を除く期間に該時間成分信号の入力レベルを維持させる維持部が設けられた窓データをそれぞれ生成する窓データ生成手段と、前記記録単位毎に、前記信号変換手段に入力される前記入力信号の状態を検出する入力信号判定手段と、前記入力信号判定手段によって前記記録単位の前記入力信号に誤りが無いと判定された場合に、前記信号変換手段から出力された前記記録単位の時間成分信号と前記窓データ生成部にて生成された窓データとを乗算して、前記記録単位毎に得られる乗算結果を、該記録単位の窓データの前記増加部に該記録単位の直前の記録単位の前記減衰部を重複させた状態で連結して出力し、前記入力信号判定手段によって前記記録単位の前記入力信号に誤りがあると判定された場合に、誤りがあると判定された記録単位において前記信号変換手段による信号変換処理および前記窓データ生成手段による窓データの生成処理を行わずに、その直前の記録単位の前記信号変換手段から出力された時間成分信号を、該直前の記録単位における窓データの維持部を延長して該維持部と乗算して出力する信号変換制御手段とを備えている。
【００１６】
好ましくは、前記入力信号判定手段によって前記記録単位の前記入力信号に誤りがあると判定された場合に、該誤りの発生期間が長くなると、該記録単位における前記信号変換手段の出力に代えて、無音処理が行われる。
【００１７】
さらに好ましくは、少なくとも１つの特定音楽の周波数成分信号を圧縮したデータが予め格納されている記憶手段と、前記無音処理の後に、さらに、前記入力信号判定手段によって前記記録単位の前記入力信号に誤りがあると判定された場合に、前記無音処理に代えて、前記記憶手段に格納された特定音楽の周波数成分信号の圧縮データを選択するデータ選択制御手段とをさらに備える。
【００１９】
以下に、本発明の作用について説明する。
【００２０】
本発明にあっては、入力信号判定手段によって入力信号の誤り有無および誤りが発生している期間等を検出する。そして、入力信号に誤りが含まれている場合には、信号変換制御手段によって信号変換手段を制御して、誤りが発生する前の記録単位（サウンドフレームまたはサブフレーム）の信号変換処理期間を複数の記録単位に対応するように延長させ、誤りが発生した記録単位に対しては信号変換処理を行わせない。誤りが発生する前の記録単位に対するデータを用いて波形的に連続する音データを生成させるため、直前の再生データを繰り返し再生する従来技術のような位相差による音質の低下を防ぐことが可能である。
【００２１】
本発明にあっては、無音データを圧縮データとして記憶手段に予め記録しておくことにより、入力信号に誤りが発生する前の記録単位と誤りが発生した記録単位とで周波数、音量および位相等の音成分変化が大きい場合等に、無音データを選択して誤りが発生した記録単位を無音化することが可能である。さらに、警告音等の周波数成分信号を圧縮データとして記憶手段に予め記録しておくことにより、誤りが長期化した場合等に警告音データを選択して警告音を再生し、データに誤りが含まれていることを利用者に知らせることが可能である。
【００２２】
誤りが発生する前の記録単位に対応するデータを用いる処理、無音化処理および警告音発生処理のいずれを行うかについては、入力信号の誤り発生有無、誤り発生期間、音成分変化等に応じて、選択手段により選択することが可能である。
【００２３】
さらに、入力信号に誤りが含まれている場合に、窓処理制御手段によって窓処理手段を制御して誤りが発生する前の記録単位に対する窓処理期間を複数の記録単位に対応するように延長させ、誤りが発生した記録単位に対しては窓処理を行わせない。これにより、記録単位間のデータ連結部において音成分を円滑に遷移させて再生波形を生成するための窓処理を誤りの有無に応じて制御することが可能となる。
【００２４】
【発明の実施の形態】
以下に、本発明の実施の形態について、図面に基づいて説明する。
【００２５】
具体的な実施の形態について説明する前に、音楽データを圧縮・伸長する処理について、図４を用いて基本的な説明を行う。
【００２６】
音楽データを圧縮処理する際には、まず、前処理として、アナログデータである音楽データを標本化および量子化してデジタル化し、帯域フィルタ−１〜ｍ等により帯域分割して圧縮単位毎にデータ分割し、時間軸データ（時間成分信号）を周波数軸データ（周波数成分信号）に変換した後に、圧縮処理を行う。例えば音楽の場合、ＣＤ（コンパクト・ディスク）等に記録する場合には４４．１ｋＨｚの周波数でサンプリングを行って２４ｂｉｔデータまたは１６ｂｉｔデータなどに量子化し、ＤＡＴ（ＤｉｇｉｔａｌＡｕｄｉｏＴａｐｅ）もしくはＰＣ（パーソナルコンピュータ）等に記録する場合には４８ｋＨｚの周波数でサンプリングを行って２４ｂｉｔデータまたは１６ｂｉｔデータ等に量子化する。一方、音声の場合には８ｋＨｚ等の周波数でサンプリングを行い、８ｂｉｔデータ等に量子化する。時間軸データから周波数軸データへの変換の際には、データを帯域フィルタ等により帯域分割し、各帯域内でＭＤＣＴ等を用いてデータ変換する。
【００２７】
例えば、図５（ａ）に示す入力波形に対して標本化・量子化を行って図５（ｂ）に示す時間軸データとし、この時間軸データに対して図５（ｃ）に示すような周波数の異なるサンプリング波形１〜ｎ等を乗算してＭＤＣＴ処理を行い、各サンプリング波形の周波数成分データ（スペクトラム成分データ）を求める。
【００２８】
このようにして得られたスペクトラム成分データに対して、聴覚心理モデル等を用いて人間が知覚できない音楽成分をデータから削除する等の非可逆データ圧縮、およびデータをハフマン符号等のエントロピー符号化する可逆データ圧縮等を行い、さらに、有効桁数の調整等も行う。なお、エントロピー符号化等の処理は、圧縮技術によっては行われない場合もある。さらに、ドラム、シンバル等のアタック音に対する再現性を向上するため、アタック音部分ではデータを圧縮する単位時間幅を短くしてサブフレーム化し、データの時間追従性を向上させてもよい。または、アタック音部分のみ、他の部分とは別に切り出して処理してもよい。
【００２９】
伸長処理では、上記圧縮処理にて圧縮されたデータを伸長し、周波数成分データに展開する。各周波数成分データは、圧縮時の逆変換であるＩＭＤＣＴ（ＩｎｖｅｒｓｅＭｏｄｉｆｉｅｄＤｉｓｃｒｅｔｅＣｏｓｉｎｅＴｒａｎｓｆｏｒｍ）等を用いて時間軸データに変換される。具体的には、図６に示すように、周波数成分データを録音時の周波数サンプリング波形と同じ波形に乗算し、各時間軸での乗算結果を加算することにより、サウンドフレーム毎に周波数軸データから時間軸データへの変換を行う。さらに、サウンドフレーム間の信号遷移を円滑化するために、窓処理を行う。この窓処理によって、前サウンドフレームと次サウンドフレームとを連結部分で重複させ、重複部分において前サウンドフレームの窓関数の終端部（減衰部）と次サウンドフレームの窓関数の開始部（増加部）とを加算することにより、サウンドフレームの連結部で周波数、音量、位相等の音成分を円滑に遷移させることができる。
【００３０】
（実施形態１）
本実施形態１では、入力信号の状態として誤り有りおよび誤り無しの２状態を検出し、誤り有りの場合に前サウンドフレームの信号変換処理を延長する例について説明する。この音楽再生装置は、データ誤りが発生する頻度が比較的少ないデータ伝送システムに好適である。
【００３１】
図７は、実施形態１の音楽再生装置において、周波数成分信号（周波数軸データ）から時間成分信号（時間軸データ）に変換する処理を説明するためのブロック図である。なお、この処理は、通常、ＤＳＰ（ＤｉｇｉｔａｌＳｉｇｎａｌＰｒｏｃｅｓｓｏｒ）またはＲＩＳＣ（ＲｅｄｕｃｅｄＩｎｓｔｒｕｃｔｉｏｎＳｅｔＣｏｍｐｕｔｅｒ）等によって演算処理されるが、ここでは機能ブロック図として示し、説明を行う。
【００３２】
デジタル圧縮データは、インターネット等により伝送されて音楽再生装置に受信され、またはＭＤ等の記録媒体から読み出されて入力される。入力されたデジタル圧縮データは、誤り訂正・検出符号等により誤り訂正・検出されてバッファメモリに格納される。ＭＤ等からデータを再度読み込むことが可能なシステムにおいてデータ誤りが検出された場合には、読み込まれたデータを破壊して再度データの読み込みを行う。しかし、データを再度読み込むことが可能なシステムであっても、装置の振動等によってバッファメモリに格納可能な期間以上連続してデータ誤りが発生した場合には、誤り検出結果を含んだ状態でデータが再生処理部に入力される。図７中の入力信号は、このような誤り検出結果を含む入力データであり、誤り検出結果は入力信号判定部Ｕ１により判定される。
【００３３】
入力信号判定部Ｕ１にて入力信号に誤りが無いと判定された場合には、データ伸長処理部Ｕ２において、入力データに対してハフマン符号等のエントロピー符号を復号するデータ伸長処理が行われる。なお、エントロピー符号によるデータ圧縮以外にも、スケールファクタにより有効桁数を制限する等の圧縮方法があり、各々のデータ圧縮時の圧縮方法に従って伸長処理を行う。本実施形態では、データ圧縮・伸長方法については問わないものとする。このデータ伸長処理によって、周波数成分データが求められる。この周波数成分データには、音の音量成分、周波数成分および位相成分に関する情報が含まれている。
【００３４】
周波数軸−時間軸データ変換処理部Ｕ３では、以上のようにして求めた周波数成分データ１〜ｎと各周波数に対応する周波数波形データ１〜ｎとを乗算し、乗算結果を各周波数毎に加算することにより、周波数軸データから時間軸データへの変換を行い、フレームサウンドデータ（時間成分データ）を生成する。なお、周波数軸データから時間軸データへの変換処理は、後述する窓関数が０ではなくなるポイント（図８中のＡ点）から処理するものとする。フレームタイミングは、フレームタイミング発生部Ｕ５によって生成されて各部分に出力される。
【００３５】
周波数波形データ１〜ｎは、録音時の各サンプリング周波数に対応する周波数の波形データである。各周波数波形データは、最も精度の高い１周期分の基本波形データから必要な時間分解能により標本化および量子化して抽出した波形データのデータテーブルから、各周波数の位相データを引数としてデータを参照するテーブルデータ参照方式によって求められる。なお、各周波数波形データを演算処理によって求めることも可能である。
【００３６】
入力信号判定部（Ｕ１）にて入力信号に誤りが無いと判定された場合には、各周波数波形データの参照用引数値である位相データを初期化し、１サウンドフレーム分のデータ処理（図８のＡ−Ｄ間）を行う。
【００３７】
一方、入力信号判定部（Ｕ１）にて入力信号に誤りがあると判定された場合には、周波数成分データの更新処理は行わず、誤りが含まれない最新の周波数成分データが保持され、各周波数波形データの各参照用引数値である位相データも初期化されない。この状態で、データ処理（図８のＤ−Ｆ間）を行うことにより、前サウンドフレームから波形的に連続する音データ（サウンドデータ）を生成することができる。この処理は、本来規定された１サウンドフレーム期間の処理をさらに１フレーム期間分延長して２フレームに対応させるものである。さらに、入力信号の誤りが連続する場合には、上記処理を繰り返すことにより位相差を含まない複数フレーム分のフレームサウンドデータを処理することができる。
【００３８】
次に、窓処理について説明する。窓処理は、サウンドフレームの連結部において音データの連続性を補正するために、音データ（フレームサウンドデータ）と窓関数を乗算することにより行われる。窓関数は、図７に示す窓データ生成部Ｕ４により生成される。
【００３９】
窓関数には、図９に示すように、先端部および後端部に平坦な「０」部分があり、その間に「非０」部分がある。「非０」部分は、前部の増加部分と中央の平坦な「１」部分と後部の減衰部分からなる。増加部分は、前フレームの窓関数における減衰部分と重複しており、前フレームとの連結を調整するために用いられる。減衰部分は、後フレームの窓関数における増加部分と重複しており、後フレームとの連結を調整するために用いられる。中央の平坦な「１」部分は、他フレームからの影響を受けない部分である。増加部分の前および減衰部分の後の「０」部分は、音データ（フレームサウンドデータ）と乗算した結果が０となり、前フレームデータの中央の平坦部分（窓関数が「１」の部分）に加算されても音生成に影響を与えない。
【００４０】
入力信号判定部（Ｕ１）にて誤りが無いと判定された場合には、前フレームのフレームサウンドデータ（図９（ｂ））と窓関数の減衰部分（図９（ａ））とを乗算した結果と、当該フレームのフレームサウンドデータ（図９（ｄ））と窓関数の増加部分（図９（ｃ））とを乗算した結果とを加算してサウンドデータを生成し、生成したサウンドデータを前フレームのフレームサウンドデータ（図９（ｂ））に連結して出力部へ出力する。続いて、窓関数の平坦な「１」部分（図９（ｃ））と当該フレームのフレームサウンドデータ（図９（ｄ））とを乗算した結果を出力部へ出力する。その後、当該フレームのフレームサウンドデータ（図９（ｄ））と窓関数の減少部分（図９（ｃ））とを乗算した結果と、後フレームのフレームサウンドデータ（図９（ｆ））と窓関数の増加部分（図９（ｅ））とを乗算した結果とを加算して出力部へ出力する。これにより、図９（ｇ）に示すような連結サウンドデータが得られる。なお、実際には再生される時間軸データは図６に示すようなデジタルデータであり、この後にアナログデータに変換する処理を行うが、簡略化のために図９ではフレームサウンドデータおよび連結サウンドデータをアナログデータとして表している。
【００４１】
一方、入力信号判定部（Ｕ１）にて誤りがあると判定された場合には、当該フレームと前フレームとを連結するための窓処理を行わず、周波数軸−時間軸データ変換部（Ｕ３）から連続して出力されるフレームサウンドデータ（前フレームの周波数成分データを引き続いて時間軸データに変換して生成したフレームサウンドデータ）と窓関数の平坦な「１」部分（図８のＣ−Ｅ部分）とを乗算して出力部へ出力する。この処理は、本来規定された１サウンドフレーム期間の窓処理をさらに１フレーム期間分延長して２フレームに対応させるものである。但し、音再生の連続性を保つため、窓処理をサウンドフレーム分単位で分割して行い、サウンドフレーム期間毎に出力部へのデータ出力を行う。さらに、入力信号の誤りが連続する場合には、上記処理を繰り返すことにより、窓処理期間を複数フレームに対応させることができる。
【００４２】
以上のように、フレーム毎の音データ（フレームサウンドデータ）と窓関数（乗算係数）とを掛け合わせてフレームサウンドデータを生成し、前後のフレームサウンドデータと連結することにより音データ（サウンドデータ）を生成する。さらに、誤りが発生した場合には、フレームサウンドデータ生成期間および窓処理期間を延長して、音データを生成する。
【００４３】
なお、データの誤り有無によって窓処理が影響されないように、図８に示す前フレームのＣ−Ｄ間の音データを当該フレームの前部に複写することにより、当該フレームを補完して１フレーム分データとしてもよい。この場合、同じＣ−Ｄ間の周波数成分データを、前フレームの窓関数減衰部分のデータおよび当該フレームの窓関数増加部分のデータとして窓処理を行って、音データを生成することができる。これにより、データの誤りがあっても、データに誤りが無い場合と同じ窓処理を行うことができるという利点がある。しかし、この方法では、前フレームのデータの後部を当該フレームの前部に複写するという処理が増えるため、窓処理を延長する方法と比べて処理量が少なくなる方を選択することができる。
【００４４】
さらに、音再生処理では、周波数帯域毎に再生した音データを帯域合成する処理、デジタルデータをアナログ変換する処理、ＡＭＰにより増幅する処理等も行われるが、これらの処理は従来技術と同様に行うことができるので、ここでは説明を省略する。
【００４５】
本実施形態によれば、データ誤りが発生する頻度が比較的少ないデータ伝送システムにおいて、入力データの誤りに対応して音楽を再生すると共に、音質の低下を少なくすることができる。本実施形態は、データ誤りが発生する頻度が比較的少なく、かつ、音成分が比較的なだらかに遷移している音源に対して有効であり、例えば誤り発生期間が長期化する場合および音成分が大きく変化する音源等に対しては、無音化する処理の方が適していることもある。
【００４６】
誤り発生期間が長期化したか否かについては、システム制御用ＬＳＩ等を用いて外部から監視することができる。一方、音成分の変化が大きいか否かについては、どのような圧縮処理が行われているかによって判定することができる。例えば、音成分の急激な変化に対応してサウンドフレームをサブフレーム化する処理が行われているか否かを監視する方法等が挙げられる。サウンドフレーム期間は、音の再現性によって規定される。例えばサウンドフレーム期間を長くした場合、周波数成分に変換後のデータ圧縮効率を向上させることができる。しかし、再生時に１サウンドフレーム内で周波数、音量等の音質を変更することができないため、時間応答性が悪くなって再生音楽の品質が悪くなる。これを防ぐため、例えばドラム、シンバル等、短時間に周波数および音量に急激な変化を伴うアタック音に関しては、圧縮単位毎にデータを分割する際に、図９（ｈ）に示すような窓部分が短い窓関数を用いてサブフレーム化し、図９（ｉ）に示すような連結サウンドデータを生成することにより、音量の急激な変化に対する時間応答性を改善する方法を用いることができる。または、アタック音部分のみを他の部分とは別に抽出して符号化処理を行う等の方法を用いてもよい。これらの処理が行われているか否かを監視することにより、音成分変化が急激であるか否かを判断することができる。
【００４７】
（実施形態２）
本実施形態２では、入力信号の状態として、誤りの発生状態に応じて複数の状態を検出し、各々の状態に応じて信号変換処理および窓変換処理を制御する例について説明する。この音楽再生装置は、データ誤りが発生する頻度が比較的多いデータ伝送システムに好適であり、例えば衛星デジタル放送等の無線系デジタル伝送システム等が挙げられる。
【００４８】
図１０は、実施形態２の音楽再生装置において検出される入力信号の状態を説明するための図である。図１１は、実施形態２の音楽再生装置において、周波数成分信号（周波数軸データ）から時間成分信号（時間軸データ）に変換する処理を説明するためのブロック図である。
【００４９】
本実施形態では、入力信号判定部Ｕ１にて入力信号の誤り有無を検出し、状態判定・保持部Ｕ７にて誤り発生期間および誤り発生状態等を判定して、誤りが無い場合を通常状態Ｓ１、誤り発生期間が短い場合をエラー状態１Ｓ２、誤り発生期間が中程度または誤り状態から復帰中である場合をエラー状態２Ｓ３、誤り発生期間が長い場合をエラー状態３Ｓ４として、状態判定・保持部Ｕ７に格納する。さらに、状態を細分化、拡大化して増やすことも可能である。状態判定・保持部Ｕ７としては、例えば状態を保持する２ビットの記憶素子、状態を判定するランダムロジック、誤り有無を判定する判定回路、誤りフレーム数をカウントするカウンタおよびタイミング回路等からなる通常のステートマシンを用いることが可能であり、その詳細な説明については省略する。
【００５０】
入力信号判定部Ｕ１にて入力信号に誤りが無いと判定された場合には、状態判定・保持部Ｕ７にて通常状態Ｓ１と判定される。この場合には、通常処理として、入力された圧縮データをデータ伸長処理部Ｕ２にて伸長し、入力選択部Ｕ６にて伸長データを選択して周波数軸−時間軸データ変換部Ｕ３にて変換し、窓処理を行って音楽データとして再生する。通常状態Ｓ１において、入力信号判定部Ｕ１にて入力信号に誤りがあると判定され、前のフレームから音成分の変化量が小さいフレームである場合には、エラー状態１Ｓ２に状態遷移し、変化量が大きいフレームである場合にはエラー状態２Ｓ３に状態遷移する。
【００５１】
エラー状態１Ｓ２では、実施形態１と同様に、直前フレームのフレームサウンドデータに対する周波数軸−時間軸データ変換処理期間および窓処理期間を延長させることにより、位相および音量等の連続性を保った状態で音データを生成する。通常、音成分変化が少ない音源に対して数サウンドフレーム分のデータ補完を行うためには、無音処理に比べて、直前フレームのフレームサウンドデータを用いて音データを再生する方法が有効である。エラー状態１Ｓ２において、誤りがさらに連続する場合には、一定回数以上でエラー状態２Ｓ３に状態遷移する。
【００５２】
エラー状態２Ｓ３では、無音（ミュート）処理を行い、音を出力させない。本実施形態では、周波数成分データ１〜ｎを全て０にしたデータを無音データとして予め記憶素子に格納しておき、入力選択部Ｕ６にて無音データを選択して無音を再生する。なお、周波数成分データ１〜ｎを全て０にしたデータを圧縮したデータを予め記憶素子に格納させておき、入力選択部Ｕ６により選択してデータ伸長部Ｕ２にてデータ伸長を行うようにしてもよい。エラー状態２Ｓ３において、誤りが発生した場合にはエラー状態３Ｓ４に状態遷移し、誤りが発生しない場合には通常状態Ｓ１に状態遷移する。
【００５３】
エラー状態３Ｓ４では、警告音を出力し、入力信号に誤りが検出されたことを利用者に通知する。本実施形態では、サウンドフレームの周波数を基本とし、その整数倍の周波数を警告音として用いる。例えば、電子ブザー等に用いられる４ｋＨｚまたは２ｋＨｚ近傍の周波数を用いることができる。サウンドフレームの周波数を整数倍した周波数を用いることにより、各サウンドフレームにおける波形の位相および音量等を等しくすることができるため、各サウンドフレームで共通の周波数成分データを用いることができる。この周波数成分データを警告音データとして予め記憶素子に格納しておき、入力選択部Ｕ６にて警告音データを選択して警告音を再生する。なお、警告音の周波数成分データを圧縮したデータを予め記憶素子に格納させておき、入力選択部Ｕ６により選択してデータ伸長部Ｕ２にてデータ伸長を行うようにしてもよい。エラー状態３Ｓ４では、誤りが続く限り警告音を出力し、誤りが無くなった場合にはエラー状態２Ｓ３を経由して通常状態Ｓ１に状態遷移する。
【００５４】
本実施形態における処理は、エラー状態２Ｓ３とエラー状態３Ｓ４とで合わせて２つのサウンドフレーム分の周波数成分データと、ステートマシーンと、周辺処理装置によって実現することが可能である。
【００５５】
なお、状態をさらに細分化して、例えばエラー状態２Ｓ３からエラー状態３Ｓ４に遷移するときには段階的に音量を大きくし、その反対にエラー状態３Ｓ４からエラー状態２Ｓ３に遷移するときには段階的に音量を小さくする状態を加えてもよい。または、エラー状態３Ｓ４からさらに誤りが連続した場合に、警告音の音成分を変化させた警告音を発生させる状態を加えてもよい。
【００５６】
【発明の効果】
以上詳述したように、本発明によれば、短い期間の誤りに対しては位相および音量等の連続性を保った状態で音データを補完して再生することができる。長い期間の誤りに比べて発生し易い短い期間の誤りに対して、従来の無音化処理等に比べて、自然な音再生を行うことができるので、本発明は非常に有効な技術である。さらに、本発明によれば、長期間の誤りに対しては無音化処理を用いる等、誤りの状態によって適切な処理を選択して行うことができる。
【００５７】
さらに、本発明によれば、長期間の誤りが生じた場合に、警告音を発生して、利用者に誤りを知らせることができる。従来では、無音化処理により利用者が機械故障またはボリューム故障と誤解して誤って音量を上げ、受信状態が正常化した後に大音量で音出力される等の不具合が生じていたが、本発明によればこのような不具合を防ぐことができる。
【００５８】
本発明は、周波数軸−時間軸変換技術を用いる音楽圧縮技術全般に応用可能であり、圧縮・伸長技術には影響を与えない。本発明によれば、伝送時に用いられる誤り・訂正符号等による誤り訂正能力を超える誤りに対しても、音補完を行うことができるので、データ伝送システム全体における誤り対応能力を向上させることができる。
【図面の簡単な説明】
【図１】（ａ）〜（ｅ）は、通常の音再生処理を説明するための波形図である。
【図２】（ａ）〜（ｅ）は、従来のエラー発生時における無音化処理を説明するための波形図である。
【図３】（ａ）〜（ｅ）は、従来のエラー発生時における前フレームデータ使用処理を説明するための波形図である。
【図４】基本的な音楽データの圧縮・伸長処理について説明するためのフロー図である。
【図５】（ａ）〜（ｃ）は、圧縮処理の前処理について説明するための図である。
【図６】周波数軸−時間軸データ変換処理を説明するための図である。
【図７】実施形態１の音楽再生装置を説明するための機能ブロック図である。
【図８】周波数軸−時間軸データ変換処理を説明するための波形図である。
【図９】（ａ）〜（ｉ）は、窓処理を説明するための波形図である。
【図１０】実施形態２の音楽再生装置における入力信号の状態遷移を説明するための図である。
【図１１】実施形態２の音楽再生装置を説明するための機能ブロック図である。
【符号の説明】
Ｕ１入力信号判定部
Ｕ２データ伸長部
Ｕ３周波数軸−時間軸データ変換部
Ｕ４窓データ生成部
Ｕ５フレームタイミング発生部
Ｕ６入力選択部
Ｕ７状態判定・保持部
[0001]
BACKGROUND OF THE INVENTION
The present invention relates to a music playback apparatus that plays back music by decompressing a compressed digital music data signal. Examples include: music playback apparatuses used in tuners for satellite digital TV (television) broadcasting, terrestrial digital TV broadcasting, terrestrial digital radio broadcasting, and the like, which use radio as the data transmission means; music playback apparatuses used in tuners for cable TV digital broadcasting and the like, which use a wired connection as the data transmission means; music playback apparatuses used for Internet radio and Internet telephony, which use the Internet as the data transmission means; and, for packaged media that convey information offline, MD (MiniDisc) players and IC semiconductor music players in which compressed sound source data is recorded on an IC card or in flash memory built into an IC card.
[0002]
[Prior art]
In recent years, advances in semiconductor technology have made it possible to process large volumes of digital data at high speed. As a result, digitally compressed audio content is now distributed for music data as well, and digital broadcasting is becoming widespread.
[0003]
Methods for digitally compressing music data include ATRAC (Adaptive Transform Acoustic Coding), used in MD; MP3 (MPEG-1 Audio Layer 3), used on PCs (personal computers) and the like; and AAC (MPEG-2 AAC), used in digital TV broadcasting and the like.
[0004]
In any of these methods, preprocessing comes first: the music data, which is analog data, is sampled and quantized into digital form, divided into bands by band filters or the like, and divided into data for each compression unit. The data represented along the time axis (time component signal) is then converted into data represented along the frequency axis (frequency component signal), after which compression is performed. For music, for example, sampling is performed at a frequency such as 44.1 kHz or 48 kHz, and the samples are quantized into 24-bit or 16-bit data. For voice, sampling is performed at a frequency such as 8 kHz, and the samples are quantized into 8-bit data or the like. The conversion from time axis data to frequency axis data is performed by dividing the data into bands using band filters or the like and transforming the data within each band using the MDCT (Modified Discrete Cosine Transform) or the like.
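The sampling and quantization step described above can be illustrated with a minimal Python sketch. The function name, the 1 kHz test tone, and the 1 ms duration are assumptions chosen for illustration; only the sampling rates and bit depths come from the text.

```python
import math

def sample_and_quantize(signal, duration_s, sample_rate_hz, bits):
    """Sample signal(t) (expected range [-1, 1]) and quantize to signed integers."""
    full_scale = 2 ** (bits - 1) - 1           # 32767 for 16-bit, 127 for 8-bit
    n_samples = round(duration_s * sample_rate_hz)
    samples = []
    for n in range(n_samples):
        x = signal(n / sample_rate_hz)         # sampling: evaluate at t = n / fs
        x = max(-1.0, min(1.0, x))             # clip to the representable range
        samples.append(round(x * full_scale))  # quantization: nearest integer step
    return samples

def tone(t):
    """Hypothetical test input: a 1 kHz sine wave."""
    return math.sin(2 * math.pi * 1000.0 * t)

music = sample_and_quantize(tone, 0.001, 44100, 16)  # music-style settings
voice = sample_and_quantize(tone, 0.001, 8000, 8)    # voice-style settings
```

The same 1 ms of sound yields 44 samples at 44.1 kHz but only 8 samples at 8 kHz, which is why the frame sizes and data rates of music and voice codecs differ so much.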
[0005]
As compression processing, lossy data compression such as deleting music components that cannot be perceived by humans from data using an auditory psychological model or the like, lossless data compression such as entropy coding such as Huffman code, and the like are performed. Furthermore, in order to improve the reproducibility to attack (impact) sounds such as drums, cymbals, etc., the attack sound part is a method of sub-frame by shortening the unit time width for compressing data, only the attack sound part, and other parts Alternatively, a method of cutting out and processing is used.
[0006]
The music data converted into the frequency component includes data relating to the frequency and data relating to the phase of the sound source data. For example, when sound source data having a single frequency as shown in FIG. 1A is divided into three sound frames of the first frame to the third frame, the phases of the extracted sound frames are different. For this reason, each frequency axis data obtained by converting each time axis data is data having different phase data even if the frequency of the sound source data is the same.
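The point that equal-frequency frames carry different phase data can be checked numerically. The sketch below assumes a 1 kHz tone sampled at 48 kHz and a hypothetical frame length of 1000 samples; none of these values is taken from the specification.

```python
import math

def frame_start_phases(freq_hz, sample_rate_hz, frame_len, n_frames):
    """Phase (radians, wrapped to [0, 2*pi)) of a pure tone at the first sample of each frame."""
    phases = []
    for i in range(n_frames):
        start_sample = i * frame_len
        phase = 2 * math.pi * freq_hz * start_sample / sample_rate_hz
        phases.append(phase % (2 * math.pi))
    return phases

# Assumed values: 1 kHz tone, 48 kHz sampling, 1000-sample frames, three frames.
phases = frame_start_phases(freq_hz=1000.0, sample_rate_hz=48000,
                            frame_len=1000, n_frames=3)
```

Because the frame length is not a whole number of tone periods (48 samples each), every frame starts at a different point in the cycle, so the three frames have three different starting phases.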
[0007]
To improve the connection between sound frames in the reproduced waveform and to smooth the signal transition, window processing is performed in which the window functions shown in FIGS. 1B to 1D are multiplied by the time axis data of each sound frame. The curved portion of the window function differs depending on the data compression / decompression method; for example, a quarter circle or a sine waveform is used. With this window processing, in the flat portion of the window function (the portion with the value "1") the time axis data is output as the reproduced waveform as it is, and at the junction between sound frames the reproduced waveform is the sum of the results of multiplying each frame's time axis data by the curved portion of its window function. As a result, the reproduced waveform shown in FIG. 1E is obtained.
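The window processing can be sketched as a multiply followed by an overlap-add. The sketch below is illustrative only: the ramp and flat lengths are arbitrary, and the ramps use a squared-sine shape, an assumption made here so that the falling ramp of one frame and the rising ramp of the next sum to exactly 1.

```python
import math

def make_window(ramp, flat):
    """Rising ramp, flat '1' section, falling ramp (mirror image of the rise)."""
    rise = [math.sin(0.5 * math.pi * (i + 0.5) / ramp) ** 2 for i in range(ramp)]
    fall = rise[::-1]
    return rise + [1.0] * flat + fall

def overlap_add(frames, window, hop):
    """Multiply each frame by the window and add it into the output at offset i * hop."""
    out = [0.0] * (hop * (len(frames) - 1) + len(window))
    for i, frame in enumerate(frames):
        for n, (x, w) in enumerate(zip(frame, window)):
            out[i * hop + n] += x * w
    return out

RAMP, FLAT = 8, 24          # assumed lengths in samples
HOP = RAMP + FLAT           # frames advance by one ramp plus one flat section
window = make_window(RAMP, FLAT)

# Three overlapping frames cut from one continuous test tone.
signal = [math.sin(2 * math.pi * n / 20.0) for n in range(HOP * 3 + RAMP)]
frames = [signal[i * HOP: i * HOP + len(window)] for i in range(3)]
reproduced = overlap_add(frames, window, HOP)
```

Apart from the fade-in at the very start and the fade-out at the very end, `reproduced` matches `signal`, both in the flat sections and across the two junctions.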
[0008]
Data transfer methods for digitally compressed music data can be divided into those without time restrictions and those with time restrictions. An example of a data transfer method without time restrictions is downloading a music data file over the Internet or the like and, after all of the data has been received, decompressing and playing back the received data on a personal computer or the like on the receiving terminal side. In this method, if the original data contains no error and an error occurs during data transmission, the receiving terminal detects the transmission error and sends a data retransmission request to the transmitting side, so that an error-free music data file can ultimately be received. An error during data transmission can be detected on the receiving side by, for example, adding parity data to each packet, which is the transmission unit. If, however, the original data itself contains an error, for example because the original data has been corrupted, the detection code for detecting errors in the transmission system is generated from that already-erroneous data. In that case the receiving side cannot detect the error, except in special cases such as an abnormality in the file format.
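Per-packet parity with a retransmission request can be sketched as follows. This is a simplified illustration: the single XOR parity byte, the function names, and the retry limit are assumptions, and a one-byte parity cannot detect every error pattern.

```python
def add_parity(payload: bytes) -> bytes:
    """Append one byte that is the XOR of all payload bytes."""
    parity = 0
    for b in payload:
        parity ^= b
    return payload + bytes([parity])

def has_error(packet: bytes) -> bool:
    """True if the XOR over payload and parity byte is non-zero."""
    acc = 0
    for b in packet:
        acc ^= b
    return acc != 0

def receive_with_retransmit(send, max_attempts=3):
    """Call send() again (a retransmission request) until a packet passes the check."""
    for _ in range(max_attempts):
        packet = send()
        if not has_error(packet):
            return packet[:-1]       # strip the parity byte
    return None                      # no error-free copy within the limit

good = add_parity(b"frame-0001")
corrupted = bytes([good[0] ^ 0x04]) + good[1:]   # one bit flipped in transit
attempts = iter([corrupted, good])               # first try fails, retry succeeds
payload = receive_with_retransmit(lambda: next(attempts))

# Data that was already damaged before the parity was computed still passes.
damaged_at_source = add_parity(b"frXme-0001")
```

The last line reflects the limitation noted in the text: a code computed from already-corrupted source data protects only the transmission, so `has_error(damaged_at_source)` is false.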
[0009]
An example of a data transfer method with time restrictions is satellite digital TV broadcasting. In satellite digital TV broadcasting, digital music data compressed by a music compression method such as MPEG-2 (Moving Picture Experts Group phase 2) or AAC (Advanced Audio Coding) is broadcast. The broadcast data is transmitted with a powerful error correction / detection code, such as a Reed-Solomon code or a convolutional code, added as the error correction / detection code for the transmission system. However, depending on weather conditions, radio wave reception conditions, and the like, such error correction / detection codes may not be received adequately, and error correction may not be possible on the receiving side. In fields such as TV broadcasting, where music must be played back without delay, means such as data retransmission cannot be used in this situation, so the receiving side cannot correctly play back the music for the portion in which the error occurred. Conventionally, therefore, when an error is detected in the received data, processing such as silencing (muting) the portion containing the error is performed.
[0010]
Furthermore, even in an MD playback device equipped with an anti-shock buffer memory (shock-proof memory), continuous vibration or the like may prevent data from being read normally from the MD. If data cannot be read for a period longer than the data stored in the buffer memory can cover, processing such as silencing the portion containing the error or repeatedly playing back the immediately preceding playback data is performed.
[0011]
[Problems to be solved by the invention]
As described above, in fields such as TV broadcasting, where music must be played back without delay, measures such as data retransmission cannot be taken, and the receiving terminal cannot correctly play back the music for the portion in which an error has occurred.
[0012]
Conventionally, therefore, when an error is detected in received data, methods such as silencing (muting) the portion containing the error are used. For example, if an error occurs in the second sound frame when a sound source waveform such as that shown in FIG. 2A is transmitted as digitally compressed data, the receiving side decompresses silence data as the compressed data of the second sound frame. The time axis data of each sound frame shown in FIGS. 2B to 2D is then multiplied by the value of the window function to generate a reproduced waveform such as that shown in FIG. 2E. The portion containing the error can also be silenced by, for example, reducing the output at the playback AMP (amplifier). With this method of silencing the error portion, however, as shown in FIG. 2E, music that should be output continuously is interrupted by the mute processing, so the sound quality deteriorates. A further problem is that, when the error persists, it is not easy for the user to determine whether an error has occurred in the received data or the playback apparatus itself has failed.
[0013]
A method of repeatedly playing back the immediately preceding playback data when an error is detected in received data is also used conventionally. For example, if an error occurs in the second sound frame when a sound source waveform such as that shown in FIG. 3A is transmitted as digitally compressed data, the receiving side decompresses the compressed data of the immediately preceding first sound frame as the compressed data of the second sound frame. The time axis data of each sound frame shown in FIGS. 3B to 3D is then multiplied by the value of the window function to generate a reproduced waveform such as that shown in FIG. 3E. With this method, however, even for data obtained by compressing a sound source waveform of constant frequency, each sound frame contains different phase data. Consequently, even when the immediately preceding frame sound data is used, the reproduced waveform is distorted and the sound quality deteriorates, as shown in FIG. 3E.
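The phase mismatch caused by repeating the previous frame can be seen in a small numerical sketch. It assumes a 1 kHz tone at 48 kHz with a hypothetical 1000-sample frame and, for simplicity, omits the window processing, so it shows only the raw jump at the frame junction rather than the windowed distortion of FIG. 3E.

```python
import math

def tone_frame(freq_hz, sample_rate_hz, start_sample, length):
    """Samples of a pure tone, starting at an absolute sample index."""
    return [math.sin(2 * math.pi * freq_hz * (start_sample + n) / sample_rate_hz)
            for n in range(length)]

def max_step(samples):
    """Largest difference between two neighbouring samples."""
    return max(abs(b - a) for a, b in zip(samples, samples[1:]))

RATE, FREQ, FRAME = 48000, 1000.0, 1000      # assumed values
frame1 = tone_frame(FREQ, RATE, 0, FRAME)
frame2 = tone_frame(FREQ, RATE, FRAME, FRAME)

correct = frame1 + frame2                    # error-free playback
repeated = frame1 + frame1                   # frame 2 lost, frame 1 played again

jump_correct = abs(correct[FRAME] - correct[FRAME - 1])
jump_repeated = abs(repeated[FRAME] - repeated[FRAME - 1])
```

In this setup the error-free signal never moves by more than about 0.13 between neighbouring samples, while the repeated frame produces a jump of about 0.92 at the junction, because frame 1 restarts at phase 0 instead of continuing the cycle.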
[0014]
The present invention has been made to solve these problems of the prior art. Its object is to provide a music playback apparatus that can fill in for errors that occur during data transmission without distorting the reproduced waveform and that can also notify the user when the data contains an error.
[0015]
[Means for Solving the Problems]
The music playback apparatus of the present invention comprises: signal conversion means for converting a frequency component signal of digital music data, which is an input signal, into a time component signal for each recording unit; window data generating means for generating, for each recording unit, window data that has an increasing portion that raises the input level of the time component signal during the start period of its input, an attenuating portion that lowers that level during the end period, and a maintaining portion that holds that level during the period other than the start period and the end period; input signal determination means for detecting, for each recording unit, the state of the input signal input to the signal conversion means; and signal conversion control means. When the input signal determination means determines that the input signal of a recording unit contains no error, the signal conversion control means multiplies the time component signal of that recording unit output from the signal conversion means by the window data generated by the window data generating means, and outputs the multiplication results obtained for each recording unit concatenated so that the increasing portion of the window data of that recording unit overlaps the attenuating portion of the immediately preceding recording unit. When the input signal determination means determines that the input signal of a recording unit contains an error, the signal conversion control means performs neither the signal conversion processing by the signal conversion means nor the window data generation processing by the window data generating means for that recording unit; instead, it extends the maintaining portion of the window data of the immediately preceding recording unit, multiplies the time component signal output from the signal conversion means for that immediately preceding recording unit by the extended maintaining portion, and outputs the result.
[0016]
Preferably, when the input signal determination means determines that the input signal of a recording unit contains an error and the period over which the error occurs becomes long, silence processing is performed in place of the output of the signal conversion means for that recording unit.
[0017]
More preferably, the apparatus further comprises: storage means in which data obtained by compressing the frequency component signal of at least one specific piece of music is stored in advance; and data selection control means for selecting, in place of the silence processing, the compressed data of the frequency component signal of the specific music stored in the storage means when, after the silence processing, the input signal determination means again determines that the input signal of a recording unit contains an error.
[0019]
The operation of the present invention will be described below.
[0020]
In the present invention, the input signal determination means detects whether the input signal contains an error, the period over which the error occurs, and the like. When the input signal contains an error, the signal conversion control means controls the signal conversion means so that the signal conversion processing period of the recording unit (sound frame or subframe) preceding the error is extended to cover a plurality of recording units, and signal conversion processing is not performed for the recording unit in which the error occurred. Because sound data that is continuous in waveform is generated from the data of the recording unit preceding the error, the deterioration in sound quality caused by the phase difference in the prior art, which repeatedly plays back the immediately preceding playback data, can be prevented.
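The idea of extending the conversion period instead of restarting it can be sketched as follows. This is a simplified model under stated assumptions: each good frame is represented as a list of hypothetical `(amplitude, frequency, phase)` tuples, the conversion is plain additive synthesis, and the window processing is left out (the maintaining portion is treated as a constant 1).

```python
import math

def synthesize(frames, frame_len, sample_rate_hz):
    """frames: one entry per frame, either a list of (amplitude, freq_hz, phase_rad)
    tuples for an error-free frame or None for a frame flagged as erroneous."""
    out = []
    held = None   # most recent error-free frequency component data
    n = 0         # samples elapsed since that data was loaded
    for components in frames:
        if components is not None:
            held, n = components, 0          # good frame: load data, restart phase index
        elif held is None:
            out.extend([0.0] * frame_len)    # nothing to extend yet: output silence
            continue
        # Erroneous frame: keep 'held' and keep counting n, so the waveform continues.
        for _ in range(frame_len):
            t = n / sample_rate_hz
            out.append(sum(a * math.sin(2 * math.pi * f * t + p) for a, f, p in held))
            n += 1
    return out

RATE, FREQ, FRAME = 48000, 1000.0, 1000      # assumed values

def phase_at(start_sample):
    return 2 * math.pi * FREQ * start_sample / RATE

frames = [[(1.0, FREQ, phase_at(0))], None, [(1.0, FREQ, phase_at(2 * FRAME))]]
output = synthesize(frames, FRAME, RATE)
reference = [math.sin(2 * math.pi * FREQ * n / RATE) for n in range(3 * FRAME)]
```

For a steady tone the concealed output is indistinguishable from the error-free reference, including across the lost second frame, because the phase index is never reset while the error lasts.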
[0021]
In the present invention, silence data is recorded in advance in the storage means as compressed data. This makes it possible to select the silence data and silence the recording unit in which the error occurred when, for example, the sound components such as frequency, volume, and phase change greatly between the recording unit preceding the error and the recording unit in which the error occurred. Furthermore, by recording the frequency component signal of a warning sound or the like in advance in the storage means as compressed data, the warning sound data can be selected and the warning sound played back when, for example, the error persists, which informs the user that the data contains an error.
[0022]
Which of the three is performed, namely the processing that uses the data of the recording unit preceding the error, the silence processing, or the warning sound generation processing, can be selected by the selection means according to whether the input signal contains an error, the period over which the error occurs, the change in sound components, and the like.
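One way to picture this selection is as a small state machine over the four states that the description of Embodiment 2 names (normal state S1 and error states 1 to 3, S2 to S4). The sketch below is an assumption-laden illustration: the threshold `MAX_EXTENDED_FRAMES` and the return from S2 to S1 on an error-free frame are choices made here, not values stated in the text.

```python
from enum import Enum

class State(Enum):
    NORMAL = "S1"   # decode the received frame
    EXTEND = "S2"   # error state 1: extend the previous frame
    MUTE = "S3"     # error state 2: silence
    WARN = "S4"     # error state 3: warning sound

ACTION = {State.NORMAL: "decode", State.EXTEND: "extend previous frame",
          State.MUTE: "mute", State.WARN: "warning tone"}

MAX_EXTENDED_FRAMES = 3   # assumed limit on consecutive extended frames

def next_state(state, error, large_change, extended_frames):
    if state is State.NORMAL:
        if not error:
            return State.NORMAL
        return State.MUTE if large_change else State.EXTEND
    if state is State.EXTEND:
        if not error:
            return State.NORMAL              # assumption: recover directly
        return State.MUTE if extended_frames >= MAX_EXTENDED_FRAMES else State.EXTEND
    if state is State.MUTE:
        return State.WARN if error else State.NORMAL
    return State.WARN if error else State.MUTE   # WARN returns via MUTE

def run(error_flags, large_change_flags=None):
    """Return the action chosen for each frame, given per-frame error flags."""
    state, extended, actions = State.NORMAL, 0, []
    for i, error in enumerate(error_flags):
        large = bool(large_change_flags and large_change_flags[i])
        state = next_state(state, error, large, extended)
        extended = extended + 1 if state is State.EXTEND else 0
        actions.append(ACTION[state])
    return actions

actions = run([False, True, True, True, True, True, True, False, False])
```

With these assumed settings, a run of six erroneous frames is handled by three extended frames, one muted frame, and two frames of warning tone, followed by one muted frame on the way back to normal decoding.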
[0023]
Furthermore, when the input signal contains an error, the window processing control means controls the window processing means so that the window processing period of the recording unit preceding the error is extended to cover a plurality of recording units, and window processing is not performed for the recording unit in which the error occurred. The window processing, which generates the reproduced waveform by making the sound components transition smoothly at the data junction between recording units, can thus be controlled according to whether an error is present.
[0024]
DETAILED DESCRIPTION OF THE INVENTION
Embodiments of the present invention will be described below with reference to the drawings.
[0025]
Before describing specific embodiments, the basic processing for compressing and decompressing music data will be described with reference to FIG. 4.
[0026]
When music data is compressed, preprocessing comes first: the music data, which is analog data, is sampled and quantized into digital form, divided into bands by band filters 1 to m or the like, and divided into data for each compression unit. The time axis data (time component signal) is then converted into frequency axis data (frequency component signal), after which compression is performed. For music, for example, when recording on a CD (compact disc) or the like, sampling is performed at a frequency of 44.1 kHz and the samples are quantized into 24-bit or 16-bit data; when recording on a DAT (Digital Audio Tape), a PC (personal computer), or the like, sampling is performed at a frequency of 48 kHz and the samples are quantized into 24-bit or 16-bit data. For voice, sampling is performed at a frequency such as 8 kHz and the samples are quantized into 8-bit data or the like. In the conversion from time axis data to frequency axis data, the data is divided into bands by band filters or the like, and the data is transformed within each band using the MDCT or the like.
[0027]
For example, the input waveform shown in FIG. 5A is sampled and quantized to obtain the time axis data shown in FIG. 5B. MDCT processing is then performed by multiplying the time axis data by the sampling waveforms 1 to n having different frequencies shown in FIG. 5C, to obtain the frequency component data (spectrum component data) for each sampling waveform.
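The multiplication by sampling waveforms can be sketched as follows, assuming the standard MDCT cosine basis (the specification does not give the exact basis, so the formula and the names `sampling_waveform` and `mdct` are illustrative assumptions). A frame of 2N samples yields N coefficients.

```python
import math

def sampling_waveform(k, half):
    """Cosine waveform number k (0-based) over one frame of 2 * half samples."""
    return [math.cos(math.pi / half * (n + 0.5 + half / 2) * (k + 0.5))
            for n in range(2 * half)]

def mdct(frame):
    """Multiply the frame by each sampling waveform and sum the products."""
    half = len(frame) // 2
    return [sum(x * w for x, w in zip(frame, sampling_waveform(k, half)))
            for k in range(half)]

HALF = 8
# Hypothetical input consisting only of waveform 3, scaled by 2.0
frame = [2.0 * w for w in sampling_waveform(3, HALF)]
spectrum = mdct(frame)
```

Because the waveforms are mutually orthogonal over the frame, only `spectrum[3]` is non-zero here (its value is the amplitude times `HALF`), which is what makes the result usable as per-frequency component data.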
[0028]
The spectrum component data obtained in this way is subjected to irreversible (lossy) data compression, such as deleting from the data music components that humans cannot perceive by using a psychoacoustic model or the like, and to reversible (lossless) data compression, such as entropy coding with a Huffman code or the like, and the number of significant digits is adjusted. Note that processing such as entropy coding may not be performed depending on the compression technique. Furthermore, in order to improve reproducibility of attack sounds such as drums and cymbals, the unit time width for compressing data may be shortened into subframes in the attack sound portion to improve the time followability of the data. Alternatively, only the attack sound portion may be cut out and processed separately from the other portions.
[0029]
In the decompression process, the data compressed by the compression process is expanded into frequency component data. Each frequency component data is converted into time axis data by using the IMDCT (Inverse Modified Discrete Cosine Transform), which is the inverse of the transform used at the time of compression. Specifically, as shown in FIG. 6, the frequency component data is multiplied by the same waveforms as the frequency sampling waveforms used at the time of recording, and the multiplication results are added at each point on the time axis, so that the frequency axis data is converted into time axis data for each sound frame. Further, window processing is performed to smooth the signal transition between sound frames. By this window processing, the previous sound frame and the next sound frame are overlapped at the connection portion, and in the overlapping portion the end portion (attenuation portion) of the window function of the previous sound frame is combined with the start portion (increase portion) of the window function of the next sound frame, so that sound components such as frequency, volume, and phase can transition smoothly at the connection portion of the sound frames.
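The inverse transform and the overlapped window processing can be sketched together. This is a minimal illustration assuming the standard MDCT/IMDCT pair with a sine window applied on both the compression and reproduction sides; the specification leaves the window curve open, and all function names are hypothetical.

```python
import math

def basis(n, k, half):
    return math.cos(math.pi / half * (n + 0.5 + half / 2) * (k + 0.5))

def sine_window(half):
    return [math.sin(math.pi * (n + 0.5) / (2 * half)) for n in range(2 * half)]

def mdct(frame):
    half = len(frame) // 2
    return [sum(frame[n] * basis(n, k, half) for n in range(2 * half))
            for k in range(half)]

def imdct(coeffs):
    """Multiply each coefficient by its waveform and add the results per time point."""
    half = len(coeffs)
    return [2.0 / half * sum(coeffs[k] * basis(n, k, half) for k in range(half))
            for n in range(2 * half)]

def encode_decode(signal, half):
    """Window, transform, inverse-transform, window again, and overlap-add."""
    win = sine_window(half)
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - 2 * half + 1, half):
        frame = [signal[start + n] * win[n] for n in range(2 * half)]
        rebuilt = imdct(mdct(frame))
        for n in range(2 * half):
            out[start + n] += rebuilt[n] * win[n]
    return out
```

Each frame on its own comes back distorted at its edges; only the sum of the attenuating end of one frame and the increasing start of the next restores the original samples, which is why a missing frame disturbs the connection portion.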
[0030]
(Embodiment 1)
In the first embodiment, an example will be described in which two states, with and without errors, are detected as input signal states, and signal conversion processing for the previous sound frame is extended when there is an error. This music playback apparatus is suitable for a data transmission system in which the frequency of occurrence of data errors is relatively low.
[0031]
FIG. 7 is a block diagram for explaining processing for converting a frequency component signal (frequency axis data) into a time component signal (time axis data) in the music playback device of the first embodiment. This process is usually performed by a DSP (Digital Signal Processor) or RISC (Reduced Instruction Set Computer), etc., but here it is shown as a functional block diagram for explanation.
[0032]
The digital compressed data is either transmitted via the Internet or the like and received by the music playback device, or read from a recording medium such as an MD and input. The input digital compressed data undergoes error correction and detection by means of an error correction / detection code or the like and is stored in the buffer memory. In a system that can read data again, such as from an MD, when a data error is detected the read data is discarded and the data is read again. However, even in a system that can read data again, if data errors continue for longer than the period that the buffer memory can cover, for example because of vibration of the device, data including the error detection result is input to the reproduction processing unit. The input signal in FIG. 7 is input data including such an error detection result, and the error detection result is evaluated by the input signal determination unit U1.
[0033]
When the input signal determination unit U1 determines that there is no error in the input signal, the data expansion processing unit U2 performs a data expansion process for decoding an entropy code such as a Huffman code on the input data. In addition to data compression by entropy codes, there are compression methods such as limiting the number of significant digits by a scale factor, and decompression processing is performed according to the compression method at the time of each data compression. In the present embodiment, the data compression / decompression method does not matter. By this data expansion process, frequency component data is obtained. This frequency component data includes information regarding the volume component, frequency component, and phase component of the sound.
[0034]
The frequency axis-time axis data conversion processing unit U3 multiplies the frequency component data 1 to n obtained as described above by the frequency waveform data 1 to n corresponding to each frequency, and adds the multiplication result for each frequency. Thus, the frequency axis data is converted to the time axis data, and frame sound data (time component data) is generated. Note that the conversion processing from the frequency axis data to the time axis data is performed from a point (point A in FIG. 8) where a window function described later is not zero. The frame timing is generated by the frame timing generation unit U5 and output to each part.
[0035]
The frequency waveform data 1 to n are waveform data having frequencies corresponding to the respective sampling frequencies at the time of recording. Each frequency waveform data is obtained by a table lookup scheme: a data table holds waveform data sampled and quantized, at the required time resolution, from the most accurate basic waveform data for one cycle, and the table is referenced using the phase data of each frequency as an argument. Each frequency waveform data can also be obtained by arithmetic processing.
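The table lookup scheme can be sketched as below. The table size of 1024 entries, the use of a sine as the basic waveform, and the representation of phase as a fraction of one cycle are assumptions for illustration only.

```python
import math

TABLE_SIZE = 1024  # assumed time resolution of the one-cycle table
SINE_TABLE = [math.sin(2 * math.pi * i / TABLE_SIZE) for i in range(TABLE_SIZE)]

def waveform_from_table(cycles_per_frame, frame_len, phase=0.0):
    """Return (samples, next_phase); phase is a fraction of one cycle in [0, 1)."""
    step = cycles_per_frame / frame_len   # phase advance per output sample
    samples = []
    for _ in range(frame_len):
        # The phase data is the lookup argument into the one-cycle table
        samples.append(SINE_TABLE[int(phase * TABLE_SIZE) % TABLE_SIZE])
        phase = (phase + step) % 1.0
    return samples, phase

samples, next_phase = waveform_from_table(3, 64)
```

Returning the running phase lets a caller either reset it for the next frame or pass it back in unchanged, which is the distinction the error handling relies on.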
[0036]
When the input signal determination unit (U1) determines that there is no error in the input signal, it initializes the phase data, which is the reference argument value for each frequency waveform data, and performs data processing for one sound frame (between A and D in FIG. 8).
[0037]
On the other hand, when the input signal determination unit (U1) determines that there is an error in the input signal, the frequency component data is not updated and the latest frequency component data not including the error is held. The phase data that is the reference argument value of each frequency waveform data is not initialized. By performing data processing (between D and F in FIG. 8) in this state, sound data whose waveform is continuous with the previous sound frame can be generated. In this process, the processing originally defined for one sound frame period is extended by one more frame period so as to cover two frames. Further, when errors in the input signal continue, frame sound data for a plurality of frames free of phase discontinuity can be generated by repeating the above processing.
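The difference between initializing and holding the phase data can be sketched as follows. The frame representation (a list of `(cycles_per_frame, amplitude)` pairs, or `None` for a frame flagged as erroneous) and the initial phase of zero are assumptions for illustration.

```python
import math

def render_frame(components, phases, frame_len):
    """Sum the component waveforms for one frame and return the advanced phases."""
    samples = [0.0] * frame_len
    new_phases = []
    for (cycles, amp), phase in zip(components, phases):
        for n in range(frame_len):
            samples[n] += amp * math.sin(2 * math.pi * (phase + cycles * n / frame_len))
        new_phases.append((phase + cycles) % 1.0)
    return samples, new_phases

def play(frames, frame_len):
    """frames: list of component lists, or None where an error was detected."""
    held, phases, out = None, None, []
    for frame in frames:
        if frame is not None:
            held = frame
            phases = [0.0] * len(frame)    # no error: phase data is initialized
        elif held is None:
            out.extend([0.0] * frame_len)  # nothing to hold yet: output silence
            continue
        # On error, the held components and the running phases are reused as-is
        samples, phases = render_frame(held, phases, frame_len)
        out.extend(samples)
    return out
```

For example, `play([[(2.5, 1.0)], None], 64)` produces 128 samples of one unbroken sinusoid, whereas resetting the phase at the second frame would introduce a jump because 2.5 cycles do not end on a cycle boundary.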
[0038]
Next, window processing will be described. The window processing is performed by multiplying the sound data (frame sound data) by a window function in order to correct the continuity of the sound data at the connection portion of the sound frames. The window function is generated by the window data generation unit U4 shown in FIG. 7.
[0039]
As shown in FIG. 9, the window function has a flat “0” portion at the front end and at the rear end, and a “non-zero” portion between them. The “non-zero” portion consists of a front increase portion, a central flat “1” portion, and a rear attenuation portion. The increase portion overlaps the attenuation portion of the window function of the previous frame and is used to adjust the connection with the previous frame. The attenuation portion overlaps the increase portion of the window function of the following frame and is used to adjust the connection with the following frame. The flat “1” portion at the center is the portion that is not affected by other frames. The “0” portions before the increase portion and after the attenuation portion yield 0 when multiplied by the sound data (frame sound data), so adding the result to the flat central portion of the previous frame data (the portion where the window function is “1”) does not affect sound generation.
[0040]
When the input signal determination unit (U1) determines that there is no error, the result of multiplying the frame sound data of the previous frame (FIG. 9(b)) by the attenuation portion of its window function (FIG. 9(a)) is added to the result of multiplying the frame sound data of the current frame (FIG. 9(d)) by the increase portion of its window function (FIG. 9(c)) to generate sound data, which is connected to the sound data of the previous frame (FIG. 9(b)) and output to the output unit. Subsequently, the result of multiplying the frame sound data of the current frame (FIG. 9(d)) by the flat “1” portion of the window function (FIG. 9(c)) is output to the output unit. After that, the result of multiplying the frame sound data of the current frame (FIG. 9(d)) by the attenuation portion of the window function (FIG. 9(c)) is added to the result of multiplying the frame sound data of the following frame (FIG. 9(f)) by the increase portion of its window function (FIG. 9(e)), and the sum is output to the output unit. As a result, linked sound data as shown in FIG. 9(g) is obtained. In practice, the time axis data to be reproduced is digital data as shown in FIG. 6 and is converted to analog data afterward; for simplicity, however, the figure represents it as analog data.
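The error-free connection of frames can be sketched as a windowed overlap-add. The squared-sine ramp is an assumption chosen so that the increase and attenuation portions sum to 1 where they overlap; the specification mentions a quarter circle or sine curve, and the zero portions of the window are omitted here because they contribute nothing.

```python
import math

def window(ramp, flat):
    """Increase portion, flat '1' portion, attenuation portion."""
    rise = [math.sin(0.5 * math.pi * (i + 0.5) / ramp) ** 2 for i in range(ramp)]
    return rise + [1.0] * flat + rise[::-1]

def connect(frames, ramp, flat):
    """Multiply each frame by the window and add the overlapping ramps."""
    hop = ramp + flat                       # one sound frame period
    win = window(ramp, flat)
    out = [0.0] * (hop * len(frames) + ramp)
    for index, frame in enumerate(frames):
        for i, sample in enumerate(frame):
            out[index * hop + i] += sample * win[i]
    return out

# Two hypothetical frames of constant level 1.0, each ramp + flat + ramp samples long
linked = connect([[1.0] * 16, [1.0] * 16], ramp=4, flat=8)
```

In this example the output stays at level 1.0 across the connection portion, since the falling gain of the first frame and the rising gain of the second always add up to 1.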
[0041]
On the other hand, when the input signal determination unit (U1) determines that there is an error, the window processing for connecting the frame to the previous frame is not performed. Instead, the frame sound data from the frequency axis-time axis data conversion unit (U3) (frame sound data generated by continuing to convert the frequency component data of the previous frame into time axis data) is multiplied by the flat “1” portion of the window function (the portion between C and E in FIG. 8) and output to the output unit. In this processing, the window processing originally defined for one sound frame period is extended by one more frame period so as to cover two frames. However, in order to maintain the continuity of sound reproduction, the window processing is still performed in units of sound frames, and data is output to the output unit every sound frame period. Furthermore, when errors in the input signal continue, the window processing period can be made to cover a plurality of frames by repeating the above processing.
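The extension of the flat “1” portion over an erroneous frame can be sketched as follows. For brevity each frame carries a single hypothetical component `(cycles_per_frame, amplitude)`, `None` marks a frame with an error, and the squared-sine ramp is an assumption; a real device would emit the data one frame period at a time rather than in merged segments.

```python
import math

def ramp_up(length):
    return [math.sin(0.5 * math.pi * (i + 0.5) / length) ** 2 for i in range(length)]

def conceal(frames, ramp, hop):
    """Overlap-add with the flat window portion extended across errored frames."""
    # Group every good frame with the errored frames that directly follow it.
    segments = []                      # entries: [start_frame, frame_count, component]
    for index, frame in enumerate(frames):
        if frame is not None:
            segments.append([index, 1, frame])
        elif segments:
            segments[-1][1] += 1       # error: extend by one more frame period
    out = [0.0] * (hop * len(frames) + ramp)
    rise = ramp_up(ramp)
    for start, count, (cycles, amp) in segments:
        length = count * hop + ramp
        for i in range(length):
            if i < ramp:
                gain = rise[i]                   # increase portion
            elif i >= length - ramp:
                gain = rise[length - 1 - i]      # attenuation portion
            else:
                gain = 1.0                       # (extended) flat portion
            out[start * hop + i] += gain * amp * math.sin(2 * math.pi * cycles * i / hop)
    return out
```

With `conceal([(2.5, 1.0), None], ramp=8, hop=32)` the sinusoid runs through the boundary at sample 32 with a gain of 1, so neither a fade nor a phase jump appears where the erroneous frame begins.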
[0042]
As described above, frame sound data is generated for each frame, the frame sound data is multiplied by the window function (multiplication coefficients), and the result is concatenated with the preceding and following frame sound data to generate the sound data. Further, when an error occurs, sound data is generated by extending the frame sound data generation period and the window processing period.
[0043]
So that the window processing is not affected by the presence or absence of data errors, the sound data between C and D of the previous frame shown in FIG. 8 may instead be used as the data of the front part of the current frame. In this case, sound data can be generated by performing window processing on the same frequency component data between C and D, used both as the data for the attenuation portion of the window function of the previous frame and as the data for the increase portion of the window function of the current frame. This has the advantage that, even if there is an error in the data, the same window processing can be performed as when there is no error. However, since this method adds a step of copying the rear part of the data of the previous frame to the front part of the current frame, it should be compared with the method of extending the window processing, and whichever requires less processing may be selected.
[0044]
Furthermore, the sound reproduction process also includes a process of synthesizing the sound data reproduced for each frequency band, a process of converting digital data into analog data, a process of amplification by an amplifier, and the like. Since these can be performed in the usual manner, their description is omitted here.
[0045]
According to this embodiment, in a data transmission system in which data errors occur relatively infrequently, music can be played back while coping with input data errors, and deterioration in sound quality can be reduced. The present embodiment is effective for a sound source in which data errors occur relatively infrequently and the sound components change relatively gently. By contrast, when the error occurrence period is prolonged, or for a sound source whose sound components change greatly, silence processing may be more appropriate.
[0046]
Whether or not the error occurrence period has become long can be monitored from outside by a system control LSI or the like. Whether or not the change in the sound components is large can be determined from the kind of compression processing being performed, for example by monitoring whether sound frames are being subframed in response to a sudden change in the sound components. The sound frame period is determined by the required sound reproducibility. For example, lengthening the sound frame period improves the data compression efficiency after conversion into frequency components. However, since sound properties such as frequency and volume cannot be changed within one sound frame at the time of reproduction, the time responsiveness deteriorates and the quality of the reproduced music is degraded. To prevent this, for attack sounds such as drums and cymbals, in which frequency and volume change abruptly in a short time, the window portion can be subframed using a short window function, as shown in FIG. 9, when the data is divided into compression units, generating concatenated sound data as shown in FIG. 9(i) and thereby improving the time response to a sudden change in volume. Alternatively, a method may be used in which only the attack sound portion is extracted separately from the other portions and encoded. By monitoring whether or not these processes are being performed, it is possible to determine whether or not the sound components are changing abruptly.
[0047]
(Embodiment 2)
In the second embodiment, an example will be described in which a plurality of states are detected according to an error occurrence state as the state of an input signal, and signal conversion processing and window conversion processing are controlled according to each state. This music reproducing apparatus is suitable for a data transmission system in which data errors frequently occur, and examples thereof include a wireless digital transmission system such as satellite digital broadcasting.
[0048]
FIG. 10 is a diagram for explaining a state of an input signal detected in the music playback device according to the second embodiment. FIG. 11 is a block diagram for explaining processing for converting a frequency component signal (frequency axis data) into a time component signal (time axis data) in the music playback device of the second embodiment.
[0049]
In the present embodiment, the input signal determination unit U1 detects the presence or absence of an error in the input signal, and the state determination / holding unit U7 determines the error occurrence period, the error occurrence state, and the like. The state is classified as the normal state S1 when there is no error, error state 1 S2 when the error occurrence period is short, error state 2 S3 when the error occurrence period is medium or the signal is recovering from an error state, and error state 3 S4 when the error occurrence period is long, and is stored in the state determination / holding unit U7. The number of states can also be increased by subdividing them. The state determination / holding unit U7 can be implemented as an ordinary state machine made up of, for example, a memory consisting of a 2-bit storage element for holding the state, random logic for determining the state, a determination circuit for determining whether there is an error, a counter for counting the number of error frames, a timing circuit, and the like; detailed description is omitted.
[0050]
When the input signal determination unit U1 determines that there is no error in the input signal, the state determination / holding unit U7 determines the normal state S1. In this case, as the normal processing, the input compressed data is expanded by the data expansion processing unit U2, the expanded data is selected by the input selection unit U6 and converted by the frequency axis-time axis data conversion unit U3, and then window processing is performed and the music data is reproduced. In the normal state S1, when the input signal determination unit U1 determines that there is an error in the input signal, the state transitions to error state 1 S2 if the change in the sound components from the previous frame is small, and to error state 2 S3 if the change is large.
[0051]
In error state 1 S2, as in the first embodiment, sound data is generated with continuity of phase, volume, and the like maintained by extending the frequency axis-time axis data conversion processing period and the window processing period for the frame sound data of the immediately preceding frame. In general, for interpolating data over several sound frames of a sound source whose sound components change little, reproducing sound data from the frame sound data of the immediately preceding frame is more effective than silence processing. In error state 1 S2, when errors continue for more than a certain number of times, the state transitions to error state 2 S3.
[0052]
In error state 2 S3, silence (mute) processing is performed and no sound is output. In the present embodiment, data in which all of the frequency component data 1 to n are set to 0 is stored in advance in a storage element as silence data, and silence is reproduced by selecting this silence data at the input selection unit U6. Alternatively, data obtained by compressing data in which the frequency component data 1 to n are all 0 may be stored in the storage element in advance, selected by the input selection unit U6, and expanded by the data expansion unit U2. In error state 2 S3, if a further error occurs the state transitions to error state 3 S4, and if no error occurs the state transitions to the normal state S1.
[0053]
In error state 3 S4, a warning sound is output to notify the user that an error has been detected in the input signal. In the present embodiment, the warning sound uses a frequency that is an integer multiple of the sound frame frequency. For example, a frequency in the vicinity of 4 kHz or 2 kHz, as used for electronic buzzers and the like, can be used. By using a frequency obtained by multiplying the sound frame frequency by an integer, the phase and volume of the waveform can be made equal in every sound frame, so that common frequency component data can be used for each sound frame. This frequency component data is stored in advance in the storage element as warning sound data, and the warning sound is reproduced by selecting the warning sound data at the input selection unit U6. Alternatively, data obtained by compressing the frequency component data of the warning sound may be stored in advance in the storage element, selected by the input selection unit U6, and expanded by the data expansion unit U2. In error state 3 S4, the warning sound is output as long as the error continues, and when the error ceases, the state transitions to the normal state S1 via error state 2 S3.
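The choice of a warning frequency that is an integer multiple of the sound frame frequency can be sketched as follows. The 48 kHz sampling rate, the frame length of 1024 samples, and the function name are assumptions for illustration; the specification only states that a frequency near 2 kHz or 4 kHz can be used.

```python
import math

def warning_tone_frame(sample_rate, frame_len, target_hz):
    """Pick the integer multiple of the frame frequency nearest to target_hz."""
    frame_rate = sample_rate / frame_len              # sound frame frequency in Hz
    multiple = max(1, round(target_hz / frame_rate))
    tone_hz = multiple * frame_rate
    # One frame holds exactly `multiple` cycles, so every frame is identical
    samples = [math.sin(2 * math.pi * multiple * n / frame_len) for n in range(frame_len)]
    return tone_hz, samples

tone_hz, one_frame = warning_tone_frame(48000, 1024, 2000)
```

Under these assumed values the frame frequency is 46.875 Hz and the selected tone is the 43rd multiple. Because each frame starts and ends at the same phase, the same stored data can simply be repeated frame after frame.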
[0054]
The processing in the present embodiment can be realized with frequency component data for two sound frames (one for error state 2 S3 and one for error state 3 S4), a state machine, and peripheral processing devices.
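The state transitions of this embodiment can be sketched as a small state machine. The limit of three extended frames and the return from error state 1 to the normal state when the error ceases are assumptions; the specification says only “a certain number of times” and does not spell out that recovery path.

```python
from enum import Enum

class State(Enum):
    NORMAL = "S1"    # decode and reproduce normally
    ERROR_1 = "S2"   # extend the previous frame
    ERROR_2 = "S3"   # silence (mute)
    ERROR_3 = "S4"   # warning sound

MAX_EXTENDED_FRAMES = 3  # assumed threshold

def step(state, extended, has_error, large_change=False):
    """Return (next_state, number_of_frames_extended_so_far)."""
    if state is State.NORMAL:
        if not has_error:
            return State.NORMAL, 0
        return (State.ERROR_2, 0) if large_change else (State.ERROR_1, 1)
    if state is State.ERROR_1:
        if not has_error:
            return State.NORMAL, 0               # assumed recovery path
        if extended >= MAX_EXTENDED_FRAMES:
            return State.ERROR_2, 0
        return State.ERROR_1, extended + 1
    if state is State.ERROR_2:
        return (State.ERROR_3, 0) if has_error else (State.NORMAL, 0)
    # ERROR_3: keep warning while errors last, then leave via ERROR_2
    return (State.ERROR_3, 0) if has_error else (State.ERROR_2, 0)

def trace(error_flags):
    """Run the machine over per-frame error flags and list the visited states."""
    state, extended, visited = State.NORMAL, 0, []
    for flag in error_flags:
        state, extended = step(state, extended, flag)
        visited.append(state.value)
    return visited
```

A run of six consecutive erroneous frames followed by two good ones therefore passes through extension, silence, and warning sound before returning to normal reproduction via the silence state.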
[0055]
The states may be further subdivided. For example, a state in which the volume is raised stepwise may be added for the transition from error state 2 S3 to error state 3 S4, and conversely a state in which the volume is lowered stepwise may be added for the transition from error state 3 S4 to error state 2 S3. Alternatively, a state that outputs a warning sound with altered sound components may be added for the case where errors continue further in error state 3 S4.
[0056]
[Effects of the Invention]
As described above in detail, according to the present invention, sound data can be interpolated and reproduced with continuity of phase, volume, and the like maintained for errors of short duration. Short-period errors are more likely to occur than long-period errors, and for them the present invention achieves more natural sound reproduction than conventional silence processing or the like, which makes it a very effective technique. Furthermore, according to the present invention, appropriate processing can be selected according to the error state, such as using silence processing for a long-period error.
[0057]
Furthermore, according to the present invention, when a long-period error occurs, a warning sound can be generated to notify the user of the error. Conventionally, silence processing could lead the user to mistake the silence for a mechanical failure or a volume problem and to raise the volume by mistake. The present invention prevents such problems.
[0058]
The present invention can be applied to music compression technology in general that uses frequency axis-time axis conversion, and does not affect the compression / decompression technique itself. According to the present invention, sound interpolation can be performed even for errors that exceed the correction capability of the error correction code used at the time of transmission, so that the error handling capability of the data transmission system as a whole can be improved.
[Brief description of the drawings]
FIGS. 1A to 1E are waveform diagrams for explaining normal sound reproduction processing;
FIGS. 2A to 2E are waveform diagrams for explaining a conventional silence process when an error occurs.
FIGS. 3A to 3E are waveform diagrams for explaining a conventional frame data use process when an error occurs.
FIG. 4 is a flowchart for explaining basic music data compression / decompression processing;
FIGS. 5A to 5C are diagrams for explaining pre-processing of compression processing;
FIG. 6 is a diagram for explaining frequency axis-time axis data conversion processing;
FIG. 7 is a functional block diagram for explaining the music playback device according to the first embodiment;
FIG. 8 is a waveform diagram for explaining frequency axis-time axis data conversion processing;
FIGS. 9A to 9I are waveform diagrams for explaining window processing.
FIG. 10 is a diagram for explaining state transition of an input signal in the music playback device according to the second embodiment.
FIG. 11 is a functional block diagram for explaining a music playback device according to a second embodiment;
[Explanation of symbols]
U1 Input signal determination unit
U2 Data expansion processing unit
U3 Frequency axis-time axis data conversion processing unit
U4 Window data generation unit
U5 Frame timing generation unit
U6 Input selection unit
U7 State determination / holding unit

Claims

A music playback device comprising:
signal conversion means for converting a frequency component signal of digital music data, which is an input signal, into a time component signal for each recording unit;
window data generation means for generating, for each recording unit, window data provided with an increasing portion that increases the input level of the time component signal in a start period of the input of the time component signal, an attenuating portion that attenuates the input level of the time component signal in an end period of the input, and a maintaining portion that maintains the input level of the time component signal in the period excluding the start period and the end period;
input signal determination means for detecting, for each recording unit, the state of the input signal input to the signal conversion means; and
signal conversion control means which, when the input signal determination means determines that the input signal of the recording unit has no error, multiplies the time component signal of the recording unit output from the signal conversion means by the window data generated by the window data generation means, and outputs the multiplication results obtained for each recording unit connected in a state in which the attenuating portion of the immediately preceding recording unit overlaps the increasing portion of the window data of the recording unit, and which, when the input signal determination means determines that the input signal of the recording unit has an error, does not perform the signal conversion processing by the signal conversion means or the window data generation processing by the window data generation means for the recording unit determined to have the error, but extends the maintaining portion of the window data of the immediately preceding recording unit and outputs the time component signal output from the signal conversion means for the immediately preceding recording unit multiplied by the maintaining portion.

The music playback device according to claim 1, wherein, when the input signal determination means determines that the input signal of the recording unit has an error and the period over which the error occurs becomes long, silence processing is performed instead of the output of the signal conversion means in the recording unit.

The music playback device according to claim 1, further comprising:
storage means in which data obtained by compressing a frequency component signal of at least one specific piece of music is stored in advance; and
data selection control means for selecting, when the input signal determination means further determines after the silence processing that the input signal of the recording unit has an error, the compressed data of the frequency component signal of the specific music stored in the storage means instead of the silence processing.