JP3796240B2 - Network telephone and voice decoding apparatus

Info

Publication number: JP3796240B2
Application number: JP2003336494A
Authority: JP
Inventors: 浩三奥田; 美香桐本; 啓之平井; 宏樹大西
Original assignee: Sanyo Electric Co Ltd
Current assignee: Sanyo Electric Co Ltd
Priority date: 2002-09-30
Filing date: 2003-09-26
Publication date: 2006-07-12
Anticipated expiration: 2023-09-26
Also published as: JP2004282692A

Description

The present invention relates to a network telephone that uses VoIP, such as an Internet telephone, and to a voice decoding apparatus.

Internet telephones, which carry voice calls over the Internet, have already been developed. They rely on a technology called VoIP (Voice over Internet Protocol), which makes it possible to carry voice calls, that is, to send and receive voice data, over TCP/IP networks such as the Internet or an intranet.

Unlike a conventional telephone, an Internet telephone compresses the voice signal, packetizes it, and carries the call over an IP network. In this kind of device, packet arrival times often vary (jitter) depending on the state of the IP network; in other words, the interval between packets arriving over the IP network is often not constant. To output decoded speech continuously on the receiving side, however, the encoded data must be passed to the decoder at regular intervals. For this reason, as shown in FIG. 1, a jitter absorption buffer 101 is placed in front of the decoder 102 to absorb the jitter.

The jitter absorption buffer 101 has a plurality of buffer units (packet storage units) for storing a plurality of packets. Arriving packets are stored in the buffer units from the left in packet-number order. The packet in the leftmost buffer unit is read out at fixed intervals and passed to the decoder 102. Each time a packet is passed to the decoder 102, the remaining packets in the jitter absorption buffer 101 shift one position to the left. The decoder 102 decodes the packet (encoded data) it receives from the jitter absorption buffer 101 and outputs the result.
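The buffer behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the patent's implementation: the class name `JitterBuffer`, the `head_seq` bookkeeping, and the choice to return `None` for packets that fall outside the buffer are all assumptions made for the example.

```python
from typing import List, Optional


class JitterBuffer:
    """Fixed-size slot array; slot 0 is the leftmost (output) end."""

    def __init__(self, size: int, first_seq: int = 0) -> None:
        self.slots: List[Optional[bytes]] = [None] * size
        # Packet number that belongs in slot 0 (assumed bookkeeping).
        self.head_seq = first_seq

    def store(self, seq: int, payload: bytes) -> Optional[int]:
        """Store a packet at the slot matching its packet number.

        Returns the slot index, or None if the packet falls outside
        the buffer (too late or too early).
        """
        index = seq - self.head_seq
        if 0 <= index < len(self.slots):
            self.slots[index] = payload
            return index
        return None

    def pop(self) -> Optional[bytes]:
        """Pass slot 0 to the decoder and shift the rest one slot left."""
        payload = self.slots.pop(0)
        self.slots.append(None)
        self.head_seq += 1
        return payload


buf = JitterBuffer(size=4, first_seq=10)
buf.store(12, b"c")   # lands in slot 2
buf.store(10, b"a")   # lands in slot 0
first = buf.pop()     # b"a" goes to the decoder; b"c" moves to slot 1
```

An empty slot at the output end returns `None` here; how a real decoder conceals such a gap is outside the scope of this sketch.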

As shown in FIG. 2(a), the distribution of the buffer-unit positions in which arriving packets are stored, taken at the time the packet at the leftmost end of the jitter absorption buffer 101 is passed to the decoder, is referred to here as the packet arrival time distribution. The name is used because, with the left end of the jitter absorption buffer 101 as the origin, time on the rightward axis, and probability on the upward axis, the distribution represents the times at which arriving packets are stored.

When the packet arrival time distribution is S0, as shown in FIG. 2(a), the jitter absorption buffer 101 works efficiently. In the distribution S0 of FIG. 2(a), an arriving packet is most likely to be stored in the fifth buffer unit from the left.

If the fixed delay in the IP network decreases during a call, the distribution of packets arriving at the jitter absorption buffer 101 moves from S0 to S1, as shown in FIG. 2(b). In this case, even though the fixed delay in the IP network has decreased, the jitter absorption buffer 101 introduces a fixed delay of time T, which impairs smooth conversation.

If the fixed delay in the IP network increases during a call, the distribution of packets arriving at the jitter absorption buffer 101 moves from S0 to S2, as shown in FIG. 2(c). In this case, packets that arrive in the portion lying outside the jitter absorption buffer 101 cannot be output to the decoder 102, and voice quality degrades just as it does with packet loss.

If the amount of jitter in the IP network increases during a call, the distribution of packets arriving at the jitter absorption buffer 101 changes from S0 to S3, as shown in FIG. 2(d). In this case, packets that arrive in the portion lying outside the jitter absorption buffer 101 cannot be output to the decoder 102, and voice quality degrades just as it does with packet loss.

If the amount of jitter in the IP network decreases during a call, the distribution of packets arriving at the jitter absorption buffer 101 changes from S0 to S4, as shown in FIG. 2(e). In this case, although less buffering is needed to absorb the jitter in the IP network, the jitter absorption buffer 101 still introduces a fixed delay of time T, so the buffer is used inefficiently.

One conceivable way to bring the packet arrival time distribution to the optimal distribution is to adjust the number of packets stored in the jitter absorption buffer 101. For example, when the distribution is as in FIG. 2(b) or FIG. 2(e), packets stored in the jitter absorption buffer 101 are discarded (thinned out) to bring the distribution to the optimal one. When the distribution is as in FIG. 2(c) or FIG. 2(d), packets stored in the jitter absorption buffer 101 are duplicated to bring the distribution to the optimal one.

However, adjusting the number of packets stored in the jitter absorption buffer 101 (the packet accumulation amount) has the drawback that discarding or duplicating packets degrades the quality of the output speech.

Conventionally, whether to discard (thin out) or duplicate packets stored in the jitter absorption buffer 101 has been decided by computing the arrival delay deviation of a number of packets and acting on the computed value. With this method, however, a sufficient amount of data is needed to compute a reliable arrival delay deviation (a statistic), so control of the packet accumulation amount in the jitter absorption buffer 101 lags.

Controlling the packet accumulation amount in the jitter absorption buffer 101 amounts to controlling the delay from when a packet is stored in the jitter absorption buffer until that packet is decoded.

An object of the present invention is to provide a network telephone and a voice decoding apparatus that can adjust the packet arrival time distribution to the optimal distribution without discarding or duplicating packets stored in the jitter absorption buffer.

Another object of the present invention is to provide a network telephone and a voice decoding apparatus that can reduce the control lag when controlling the delay from when a packet is stored in the jitter absorption buffer until it is decoded.

A first voice decoding apparatus according to the present invention comprises a jitter absorption buffer having a plurality of buffer units for storing received packets, and decoding means for decoding the packets stored in the jitter absorption buffer. A received packet is stored at the position in the jitter absorption buffer that corresponds to its packet number, taking as reference the packet number of the packet at the output end of the jitter absorption buffer, from which packets are output to the decoding means. The apparatus is characterized in that it further comprises: playback speed changing means for converting the playback speed of the decoded audio signal obtained by the decoding means; an output buffer for temporarily storing the digital audio signal output from the playback speed changing means; means for reading out the digital audio signal stored in the output buffer at predetermined time intervals; playback speed control means for controlling the playback speed changing means based on the position at which a received packet is stored in the jitter absorption buffer; and decoding timing control means for controlling the decoding timing of the decoding means based on the amount of data stored in the output buffer. Within the jitter absorption buffer are set a first region consisting of a required number of buffer units starting from the output end, a second region consisting of a required number of buffer units on the other-end side of the first region, and a third region consisting of a required number of buffer units on the other-end side of the second region. The playback speed control means controls the playback speed changing means so that the playback speed decreases when a received packet is stored in the first region, and so that the playback speed increases when received packets are stored in the third region a predetermined number of times in succession.
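The region-based rule in the paragraph above can be illustrated with a small sketch. The names `SpeedController`, `region1_end`, `region3_start`, and `fast_threshold` are hypothetical, and the `"keep"` result for cases the text does not cover (a second-region arrival, or a third-region run that is still too short) is an assumption of this example rather than something the description specifies.

```python
class SpeedController:
    """Maps the slot where a packet was stored to a playback-speed decision."""

    def __init__(self, region1_end: int, region3_start: int,
                 fast_threshold: int) -> None:
        self.region1_end = region1_end        # slots [0, region1_end) = region 1
        self.region3_start = region3_start    # slots [region3_start, ...) = region 3
        self.fast_threshold = fast_threshold  # consecutive region-3 arrivals needed
        self.region3_run = 0

    def on_packet_stored(self, slot: int) -> str:
        if slot < self.region1_end:
            # Region 1: the packet arrived close to its decode time, so slow down.
            self.region3_run = 0
            return "slow"
        if slot >= self.region3_start:
            # Region 3: speed up only after enough consecutive arrivals.
            self.region3_run += 1
            if self.region3_run >= self.fast_threshold:
                return "fast"
            return "keep"  # assumption: no change until the run is long enough
        # Region 2: assumed to break the region-3 run and leave the speed alone.
        self.region3_run = 0
        return "keep"


ctl = SpeedController(region1_end=2, region3_start=6, fast_threshold=3)
decisions = [ctl.on_packet_stored(s) for s in (1, 6, 7, 7)]
```

Reacting to a single first-region arrival while requiring a run of third-region arrivals mirrors the asymmetry in the text: a late packet risks an audible gap right away, whereas excess delay can be trimmed more cautiously.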

As the decoding timing control means, for example, one that requests the decoding means to decode a packet when the amount of data stored in the output buffer falls below a predetermined reference amount is used.

A second voice decoding apparatus according to the present invention comprises a jitter absorption buffer having a plurality of buffer units for storing received packets, and decoding means for decoding the packets stored in the jitter absorption buffer. A received packet is stored at the position in the jitter absorption buffer that corresponds to its packet number, taking as reference the packet number of the packet at the output end of the jitter absorption buffer, from which packets are output to the decoding means. Within the jitter absorption buffer are set a first region consisting of a required number of buffer units starting from the output end, a second region consisting of a required number of buffer units on the other-end side of the first region, and a third region consisting of a required number of buffer units on the other-end side of the second region. The apparatus is characterized in that it comprises delay time control means which, when a received packet is stored in the first region, performs control that lengthens the delay from when a packet is stored in the jitter absorption buffer until that packet is decoded, and which, when received packets are stored in the third region a predetermined number of times in succession, performs control that shortens that delay.

As the delay time control means, for example, one comprising the following is used: playback speed changing means for converting the playback speed of the decoded audio signal obtained by the decoding means; an output buffer for temporarily storing the digital audio signal output from the playback speed changing means; means for reading out the digital audio signal stored in the output buffer at predetermined time intervals; means for controlling the playback speed changing means so that the playback speed decreases when a received packet is stored in the first region of the jitter absorption buffer, and so that the playback speed increases when received packets are stored in the third region a predetermined number of times in succession; and decoding timing control means for controlling the decoding timing of the decoding means based on the amount of data stored in the output buffer.

Alternatively, as the delay time control means, one of the following kind is used. When a received packet is stored in the first region of the jitter absorption buffer, it controls the readout from the jitter absorption buffer and the packets sent to the decoding means so that the packet read from the jitter absorption buffer at a packet read timing is decoded repeatedly over a plurality of consecutive packet read timings, including the current one, and so that reading of packets from the jitter absorption buffer is inhibited in the meantime. When received packets are stored in the third region a predetermined number of times in succession, it controls the readout and the packets sent to the decoding means so that a plurality of packets stored in the jitter absorption buffer are read out at once at a packet read timing, only one of them is decoded, and the others are discarded.
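A rough sketch of this read-side variant follows. It is illustrative only: `ReadScheduler`, the `repeat` and `burst` parameters, and the mode strings are hypothetical names, a `deque` stands in for the jitter absorption buffer, and decoding the first packet of a burst is an assumption (the text only says that one of the packets is decoded).

```python
from collections import deque
from typing import Deque, Optional


class ReadScheduler:
    """Chooses which packet to decode at each packet read timing."""

    def __init__(self, repeat: int = 2, burst: int = 2) -> None:
        self.repeat = repeat  # read timings over which one packet is reused
        self.burst = burst    # packets pulled at once when shortening the delay
        self.held: Optional[bytes] = None
        self.hold_left = 0

    def on_read_timing(self, queue: Deque[bytes], mode: str) -> Optional[bytes]:
        if self.hold_left > 0:
            # Reading is inhibited: decode the held packet again.
            self.hold_left -= 1
            return self.held
        if not queue:
            return None
        if mode == "lengthen":
            # First-region case: reuse this packet over `repeat` read timings.
            self.held = queue.popleft()
            self.hold_left = self.repeat - 1
            return self.held
        if mode == "shorten":
            # Third-region case: read several packets, decode one, drop the rest.
            batch = [queue.popleft() for _ in range(min(self.burst, len(queue)))]
            return batch[0]  # assumption: the first packet is the one decoded
        return queue.popleft()


packets = deque([b"1", b"2", b"3", b"4", b"5"])
sched = ReadScheduler(repeat=2, burst=2)
out = [
    sched.on_read_timing(packets, "lengthen"),  # b"1"
    sched.on_read_timing(packets, "normal"),    # b"1" again, read inhibited
    sched.on_read_timing(packets, "shorten"),   # b"2"; b"3" is discarded
    sched.on_read_timing(packets, "normal"),    # b"4"
]
```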

A first network telephone according to the present invention comprises a jitter absorption buffer having a plurality of buffer units for storing received packets, and decoding means for decoding the packets stored in the jitter absorption buffer. A received packet is stored at the position in the jitter absorption buffer that corresponds to its packet number, taking as reference the packet number of the packet at the output end of the jitter absorption buffer, from which packets are output to the decoding means. The network telephone is characterized in that it further comprises: playback speed changing means for converting the playback speed of the decoded audio signal obtained by the decoding means; an output buffer for temporarily storing the digital audio signal output from the playback speed changing means; means for reading out the digital audio signal stored in the output buffer at predetermined time intervals; playback speed control means for controlling the playback speed changing means based on the position at which a received packet is stored in the jitter absorption buffer; and decoding timing control means for controlling the decoding timing of the decoding means based on the amount of data stored in the output buffer. Within the jitter absorption buffer are set a first region consisting of a required number of buffer units starting from the output end, a second region consisting of a required number of buffer units on the other-end side of the first region, and a third region consisting of a required number of buffer units on the other-end side of the second region. The playback speed control means controls the playback speed changing means so that the playback speed decreases when a received packet is stored in the first region, and so that the playback speed increases when received packets are stored in the third region a predetermined number of times in succession.

A second network telephone according to the present invention comprises a jitter absorption buffer having a plurality of buffer units for storing received packets, and decoding means for decoding the packets stored in the jitter absorption buffer. A received packet is stored at the position in the jitter absorption buffer that corresponds to its packet number, taking as reference the packet number of the packet at the output end of the jitter absorption buffer, from which packets are output to the decoding means. Within the jitter absorption buffer are set a first region consisting of a required number of buffer units starting from the output end, a second region consisting of a required number of buffer units on the other-end side of the first region, and a third region consisting of a required number of buffer units on the other-end side of the second region. The network telephone is characterized in that it comprises delay time control means which, when a received packet is stored in the first region, performs control that lengthens the delay from when a packet is stored in the jitter absorption buffer until that packet is decoded, and which, when received packets are stored in the third region a predetermined number of times in succession, performs control that shortens that delay.

According to the present invention, the packet arrival time distribution can be adjusted to the optimal distribution without discarding or duplicating packets stored in the jitter absorption buffer.

Also according to the present invention, the control lag can be reduced when controlling the delay from when a packet is stored in the jitter absorption buffer until it is decoded.

Embodiments in which the present invention is applied to an Internet telephone are described below with reference to FIGS. 3 to 14.

The first embodiment is described below.

[1] Configuration of the Internet telephone

FIG. 3 shows the configuration of the Internet telephone.

The Internet telephone comprises an A/D converter 1, a D/A converter 2, a DSP (voice decoding apparatus) 3, a microcomputer 4, and a network controller 5.

The input audio signal is converted into a digital audio signal by the A/D converter 1 and sent to the DSP 3. In the DSP 3, the digital audio signal is compressed and then packetized. The packets produced by the DSP 3 are sent out to the IP network via the microcomputer 4 and the network controller 5.

Packets arriving over the IP network are sent to the DSP 3 via the network controller 5 and the microcomputer 4. In the DSP 3, the packets are decoded. The digital audio signal obtained by the DSP 3 is converted into an analog audio signal by the D/A converter 2 and output.

FIG. 4 shows the detailed configuration of the DSP 3.

The DSP 3 comprises means for generating transmission packets and means for generating a decoded audio signal.

The means for generating transmission packets comprise an encoder 31, which compresses the input audio signal received from the A/D converter 1, and an RTP packetizing unit 32, which packetizes the encoded data produced by the encoder 31 to generate RTP packets.

The means for generating the decoded audio signal comprise a jitter absorption buffer 33, a decoder 34, a playback speed changing unit (hereinafter, variable speed playback unit) 35, an output buffer 36, a playback speed control unit 37, and a decoding timing control unit 38. The playback speed control unit 37 and the decoding timing control unit 38 are in practice implemented as a single control unit, but are treated as two units here for convenience of explanation.

Like the jitter absorption buffer 101 of FIG. 1, the jitter absorption buffer 33 has a plurality of buffer units (packet storage units). Arriving packets are stored in the buffer units from the left in packet-number order. The packet in the leftmost buffer unit is read out at a predetermined timing and passed to the decoder 34. Each time a packet is passed to the decoder 34, the remaining packets in the jitter absorption buffer 33 shift one position to the left.

The decoder 34 decodes the packet (encoded data) it receives from the jitter absorption buffer 33. The decoded audio signal produced by the decoder 34 is sent to the variable speed playback unit 35, where playback speed conversion (speech rate conversion) is applied. The digital audio signal output from the variable speed playback unit 35 is accumulated in the output buffer 36. The digital audio signal accumulated in the output buffer 36 is read out one sample at a time at predetermined time intervals and output to the D/A converter 2.

The playback speed control unit 37 controls the variable speed playback unit 35 based on the buffer amount (packet accumulation amount) of the jitter absorption buffer 33. The decoding timing control unit 38 controls the decoding timing of the decoder 34 based on the amount of data stored in the output buffer 36.

The distinguishing feature of the means for generating the decoded audio signal is that the packet output timing (decoding timing) of the jitter absorption buffer 33 is controlled by controlling the playback speed of the decoded audio signal according to the buffer amount (packet accumulation amount) of the jitter absorption buffer 33. A packet is output from the jitter absorption buffer 33 when the amount of data held in the output buffer 36 falls below a predetermined reference amount.

As a result, the buffer amount in the jitter absorption buffer 33, in other words the delay from when a packet is stored in the jitter absorption buffer 33 until it is decoded, can be adjusted so that the packet arrival time distribution sits at the optimal position, without discarding or duplicating packets stored in the jitter absorption buffer 33. Note that only the playback speed of the reproduced speech is changed; the pitch width is left unchanged.

[2] Operation of the means for generating a decoded audio signal

The operation of the means for generating the decoded audio signal is described in more detail below.

During a call, when the distribution of packets arriving at the jitter absorption buffer 33 is as shown by the broken line S1 in FIG. 5(a) and is to be moved to the solid-line distribution S0, the variable speed playback unit 35 is controlled so that the playback speed increases. To increase the playback speed, the variable speed playback unit 35 generates, for example, a two-pitch waveform from a three-pitch waveform, as shown in FIG. 6.

Specifically, of the three pitch periods of the original waveform, the first two are multiplied by a weight that falls linearly (a straight line sloping down to the right), and the last two are multiplied by a weight that rises linearly (a straight line sloping up to the right). Adding these two weighted two-pitch waveforms yields a waveform two pitch periods long.
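The weighted addition just described might look like the following sketch. It assumes the pitch period is already known and given in samples (`pitch`), that the input holds exactly three pitch periods, and that the linear weights run from 1 to 0 and from 0 to 1 across the full two-pitch span; the function name `compress_3_to_2` is hypothetical.

```python
from typing import List, Sequence


def compress_3_to_2(wave: Sequence[float], pitch: int) -> List[float]:
    """Blend three pitch periods into two by linear cross-fading."""
    if len(wave) != 3 * pitch:
        raise ValueError("wave must contain exactly three pitch periods")
    n = 2 * pitch
    front = wave[:n]      # first two pitch periods, weight falls 1 -> 0
    back = wave[pitch:]   # last two pitch periods, weight rises 0 -> 1
    out = []
    for i in range(n):
        rise = i / (n - 1) if n > 1 else 0.0
        out.append((1.0 - rise) * front[i] + rise * back[i])
    return out


# A perfectly periodic toy signal keeps its shape, only one period shorter.
period = [0.0, 1.0, 2.0, 3.0]
shortened = compress_3_to_2(period * 3, pitch=4)   # 12 samples in, 8 out
```

Because the output starts exactly on the first input sample and ends exactly on the last one, the shortened segment joins its neighbours without a discontinuity.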

When the playback speed is increased in this way, the amount of output data produced per packet decreases, so the data accumulated in the output buffer 36 falls below the predetermined reference amount sooner, and the packet output timing (decoding timing) of the jitter absorption buffer 33 is advanced. In other words, the delay from when a packet is stored in the jitter absorption buffer 33 until it is decoded becomes shorter. As a result, the packet arrival time distribution moves to the optimal position S0.
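The link between playback speed and decoding timing can be checked with a toy simulation. Everything numeric here is an assumption for illustration (160 samples per packet, a reference level of 80 samples, one sample drained per tick), as is the function name `count_decodes`; the only behaviour taken from the text is that a packet is decoded whenever the output buffer level falls below the reference amount.

```python
def count_decodes(ticks: int, frame_len: int, speed: float, reference: int) -> int:
    """Count decode requests while the output buffer drains one sample per tick."""
    level = 0
    decodes = 0
    # Samples written per decoded packet after speed conversion.
    produced = max(1, round(frame_len / speed))
    for _ in range(ticks):
        while level < reference:   # below the reference amount: decode a packet
            level += produced
            decodes += 1
        level -= 1                 # one sample goes to the D/A converter
    return decodes


normal = count_decodes(ticks=1600, frame_len=160, speed=1.0, reference=80)
faster = count_decodes(ticks=1600, frame_len=160, speed=1.5, reference=80)
slower = count_decodes(ticks=1600, frame_len=160, speed=0.75, reference=80)
```

In this model `faster` exceeds `normal`, which exceeds `slower`: faster playback pulls packets out of the jitter absorption buffer more often, and slower playback pulls them less often.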

During a call, when the distribution of packets arriving at the jitter absorption buffer 33 is as shown by the broken line S2 in FIG. 5(b) and is to be moved to the solid-line distribution S0, the variable speed playback unit 35 is controlled so that the playback speed decreases. To decrease the playback speed, the variable speed playback unit 35 generates, for example, a four-pitch waveform from a three-pitch waveform, as shown in FIG. 7.

Specifically, of the three pitch periods of the original waveform, the first two are multiplied by a weight that rises linearly (a straight line sloping up to the right), and the last two are multiplied by a weight that falls linearly (a straight line sloping down to the right). Adding these two weighted two-pitch waveforms yields a waveform two pitch periods long. Replacing the middle pitch period of the original waveform with this two-pitch waveform yields a waveform four pitch periods long.
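The expansion step can be sketched in the same style. As before, the pitch period in samples (`pitch`) is assumed to be known, the input is assumed to hold exactly three pitch periods, and the name `expand_3_to_4` is hypothetical.

```python
from typing import List, Sequence


def expand_3_to_4(wave: Sequence[float], pitch: int) -> List[float]:
    """Stretch three pitch periods to four by inserting a cross-faded segment."""
    if len(wave) != 3 * pitch:
        raise ValueError("wave must contain exactly three pitch periods")
    n = 2 * pitch
    front = wave[:n]      # first two pitch periods, weight rises 0 -> 1
    back = wave[pitch:]   # last two pitch periods, weight falls 1 -> 0
    blended = []
    for i in range(n):
        rise = i / (n - 1) if n > 1 else 0.0
        blended.append(rise * front[i] + (1.0 - rise) * back[i])
    # The two-pitch blend replaces the single middle pitch period.
    return list(wave[:pitch]) + blended + list(wave[n:])


period = [0.0, 1.0, 2.0, 3.0]
stretched = expand_3_to_4(period * 3, pitch=4)   # 12 samples in, 16 out
```

The blend begins on the first sample of the original middle period and ends on its last sample, so it joins the untouched first and third periods without a jump.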

When the playback speed is decreased in this way, the amount of data produced per packet increases, so the data stored in the output buffer 36 falls below the predetermined reference amount later, and the packet output timing (decoding timing) from the jitter absorption buffer 33 becomes later. In other words, the delay from when a packet is stored in the jitter absorption buffer 33 until that packet is decoded becomes longer. As a result, the distribution of packet arrival times moves to the optimum position S0.

During a call, when the amount of jitter in the IP network increases so that the distribution of packets arriving at the jitter absorption buffer 33 is as shown by the broken line S3 in FIG. 5(c), and the distribution is to be moved to the solid-line distribution S0, the variable-speed playback unit 35 is controlled so that the playback speed becomes slower, which delays the packet output timing from the jitter absorption buffer 33.

During a call, when the amount of jitter in the IP network decreases so that the distribution of packets arriving at the jitter absorption buffer 33 is as shown by the broken line S4 in FIG. 5(d), and the distribution is to be moved to the solid-line distribution S0, the variable-speed playback unit 35 is controlled so that the playback speed becomes faster, which advances the packet output timing from the jitter absorption buffer 33.

[3] Description of the playback speed control performed by the playback speed control unit 37

In FIG. 8, packets are read from the leftmost buffer section of the jitter absorption buffer 33, and S0 is the target distribution of packet arrival times. The region consisting of the three buffer sections at the left end of the jitter absorption buffer 33 is defined as buffer region A (first region), the region consisting of the one buffer section immediately to the right of buffer region A is defined as buffer region B (second region), and the region to the right of buffer region B is defined as buffer region C (third region). The number of buffer sections in each of regions A, B, and C can be changed by configuration.

The basic idea of playback speed control is as follows. As shown in FIG. 9(a), when the actual distribution of packet arrival times S2 is shifted to the left of the target distribution S0, arriving packets come to be stored in buffer region A of the jitter absorption buffer 33. Therefore, when an arriving packet is stored in buffer region A, the playback speed control unit 37 controls the variable-speed playback unit 35 so that the playback speed becomes slower. As a result, the packet output timing (decoding timing) to the decoder 34 becomes later.

On the other hand, as shown in FIG. 9(b), when the actual distribution of packet arrival times S1 is shifted to the right of the target distribution S0, no arriving packet is stored in the region consisting of buffer regions A and B for a certain period. That is, for a certain period, arriving packets are stored only in buffer region C. Therefore, when no arriving packet has been stored in the region consisting of buffer regions A and B for a certain period, the playback speed control unit 37 controls the variable-speed playback unit 35 so that the playback speed becomes faster. As a result, the packet output timing (decoding timing) to the decoder 34 becomes earlier.

FIG. 10 shows the initialization procedure.

In the initialization performed at power-on, the counter b_cnt is set to a predetermined value B_THL (for example, 100) (step 1). In addition, the playback speed control setting given to the variable-speed playback unit 35 is set to the state in which the playback speed is not changed (step 2).

FIG. 11 shows the playback speed control procedure.

The playback speed control process is performed each time input processing of an arriving packet into the jitter absorption buffer 33 starts.

When packet input processing starts, it is determined whether the packet input position in the jitter absorption buffer 33 lies in buffer region A of FIG. 8 (step 11). If the packet input position lies in buffer region A, it is judged that the actual distribution of packet arrival times S2 is shifted to the left of the target distribution S0, as shown in FIG. 9(a); the predetermined value B_THL is stored in the counter b_cnt (step 12), and the playback speed control setting is set to the state in which the playback speed is slowed (step 13). The packet is then stored in the jitter absorption buffer 33 (step 20), and the current packet input processing ends.

If it is determined in step 11 that the packet input position does not lie in buffer region A, it is determined whether the packet input position lies in buffer region B (step 14). If the packet input position lies in buffer region B, it is judged that the actual distribution of packet arrival times very likely matches the target distribution; the predetermined value B_THL is stored in the counter b_cnt (step 15), and the playback speed control setting is set to the state in which the playback speed is not changed (step 16). The packet is then stored in the jitter absorption buffer 33 (step 20), and the current packet input processing ends.

If it is determined in step 14 that the packet input position does not lie in buffer region B, the counter value b_cnt is decremented by 1 (step 17). It is then determined whether the counter value b_cnt has become 0 or less (step 18). If the counter value b_cnt is greater than 0, it is judged that the actual distribution of packet arrival times very likely matches the target distribution, and the playback speed control setting is set to the state in which the playback speed is not changed (step 16). The packet is then stored in the jitter absorption buffer 33 (step 20), and the current packet input processing ends.

If it is determined in step 18 that the counter value b_cnt has become 0 or less, it is judged that the actual distribution of packet arrival times S1 is shifted to the right of the target distribution S0, as shown in FIG. 9(b), and the playback speed control setting is set to the state in which the playback speed is increased (step 19). The packet is then stored in the jitter absorption buffer 33 (step 20), and the current packet input processing ends.
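The decision flow of steps 11 to 19, together with the initialization of steps 1 and 2, can be summarized in a short sketch. It is a reading of the flow described above rather than the actual DSP firmware: the class and enum names are hypothetical, buffer sections are assumed to be indexed from 0 at the output (left) end, and the region sizes follow the example of FIG. 8 (three sections for region A, one for region B). Storing the packet itself (step 20) is omitted.

```python
from enum import Enum

B_THL = 100          # example value given in the text
REGION_A_SIZE = 3    # buffer sections in region A (FIG. 8 example)
REGION_B_SIZE = 1    # buffer sections in region B (FIG. 8 example)


class Speed(Enum):
    SLOWER = "slower"
    UNCHANGED = "unchanged"
    FASTER = "faster"


class PlaybackSpeedController:
    def __init__(self):
        # Initialization at power-on (steps 1 and 2).
        self.b_cnt = B_THL
        self.speed = Speed.UNCHANGED

    def on_packet_input(self, slot):
        """Update the speed setting for a packet stored at `slot`.

        slot is a hypothetical 0-based index, 0 being the output end.
        """
        if slot < REGION_A_SIZE:                      # step 11: region A
            self.b_cnt = B_THL                        # step 12
            self.speed = Speed.SLOWER                 # step 13
        elif slot < REGION_A_SIZE + REGION_B_SIZE:    # step 14: region B
            self.b_cnt = B_THL                        # step 15
            self.speed = Speed.UNCHANGED              # step 16
        else:                                         # region C
            self.b_cnt -= 1                           # step 17
            if self.b_cnt > 0:                        # step 18
                self.speed = Speed.UNCHANGED          # step 16
            else:
                self.speed = Speed.FASTER             # step 19
        return self.speed
```

Because any packet landing in region A or B reloads the counter, the faster setting is reached only after B_THL consecutive packets have landed in region C.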

[4] Description of the decoding timing control procedure

FIG. 12 shows the decoding timing control procedure.

When output processing to the D/A converter 2 (D/A output processing) starts, one data item is output from the output buffer 36 (step 31). It is then determined whether the amount of data in the output buffer 36 has fallen below a predetermined reference amount B_DATA_THL (step 32). If the amount of data in the output buffer 36 is equal to or greater than the predetermined reference amount, the current D/A output processing ends.

If it is determined in step 32 that the amount of data in the output buffer 36 has fallen below the predetermined reference amount B_DATA_THL, decoding is requested of the decoder 34 (step 33), and then the current D/A output processing ends.
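Steps 31 to 33 amount to a small per-sample routine, sketched below. The function name, the callback `request_decode`, and the default threshold value are hypothetical; the text names the reference amount B_DATA_THL but gives no value for it. The handling of an empty buffer is also an assumption, since the text does not cover that case.

```python
from collections import deque

B_DATA_THL = 160  # hypothetical value; the text does not specify one


def da_output_step(output_buffer, request_decode, threshold=B_DATA_THL):
    """One D/A output step: emit one sample, then check the fill level."""
    # Step 31: output one data item (None if the buffer is empty; assumption).
    sample = output_buffer.popleft() if output_buffer else None

    # Step 32: has the stored amount fallen below the reference amount?
    if len(output_buffer) < threshold:
        # Step 33: ask the decoder for the next packet.
        request_decode()
    return sample
```

Since the output buffer drains at a fixed rate, producing fewer samples per packet (faster playback) makes this threshold trip sooner, and producing more samples (slower playback) makes it trip later, which is how the playback speed shifts the decoding timing.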

A second embodiment is described below. In the second embodiment, the overall configuration of the Internet telephone is the same as that shown in FIG. 3, but the configuration of the DSP 3 differs from that shown in FIG. 4.

FIG. 13 shows the detailed configuration of the DSP 3.

The DSP 3 includes means for generating transmission packets and means for generating a decoded audio signal. As in FIG. 4, the means for generating transmission packets includes an encoder 31 that compresses the input audio signal from the A/D converter 1, and an RTP packetizing unit 32 that packetizes the encoded data obtained by the encoder 31 to generate RTP packets.

Unlike FIG. 4, the means for generating the decoded audio signal includes a jitter absorption buffer 33, a decoder 34, an output buffer 36, and a delay time control unit 39. The delay time control unit 39 is provided after the jitter absorption buffer 33 and before the decoder 34, and controls the delay from when a packet is stored in the jitter absorption buffer 33 until that packet is decoded. In this embodiment, the timing for reading a packet from the jitter absorption buffer 33 (decoding timing) arrives at fixed intervals.

The delay time control performed by the delay time control unit 39 is described next.

In FIG. 8, packets are read from the leftmost buffer section of the jitter absorption buffer 33, and S0 is the target distribution of packet arrival times. The region consisting of the three buffer sections at the left end of the jitter absorption buffer 33 is defined as buffer region A, the region consisting of the one buffer section immediately to the right of buffer region A is defined as buffer region B, and the region to the right of buffer region B is defined as buffer region C. The number of buffer sections in each of regions A, B, and C can be changed by configuration.

As shown in FIG. 9(a), when the actual distribution of packet arrival times S2 is shifted to the left of the target distribution S0, arriving packets come to be stored in buffer region A of the jitter absorption buffer 33. When an arriving packet is stored in buffer region A of the jitter absorption buffer 33, the delay time control unit 39 performs processing equivalent to duplicating a packet stored in the jitter absorption buffer 33.

Specifically, one packet read from the jitter absorption buffer 33 at a given decoding timing is sent to the decoder 34 and also held; at the next decoding timing, no new packet is read from the jitter absorption buffer 33, and the held packet (the packet read at the previous decoding timing) is sent to the decoder 34. The reading of packets from the jitter absorption buffer 33 and the packets sent to the decoder 34 are controlled in this way. As a result, the delay from when a packet is stored in the jitter absorption buffer 33 until that packet is decoded becomes longer. The operation mode in which the delay time control unit 39 performs this control is called the delay time extension mode.

On the other hand, as shown in FIG. 9(b), when the actual distribution of packet arrival times S1 is shifted to the right of the target distribution S0, no arriving packet is stored in the region consisting of buffer regions A and B for a certain period. That is, for a certain period, arriving packets are stored only in buffer region C. When no arriving packet has been stored in the region consisting of buffer regions A and B for a certain period, the delay time control unit 39 performs processing equivalent to deleting (thinning out) a packet stored in the jitter absorption buffer 33.

Specifically, at a decoding timing, two packets are read in succession from the jitter absorption buffer 33, one of them is discarded, and only the other is sent to the decoder 34. The reading of packets from the jitter absorption buffer 33 and the packets sent to the decoder 34 are controlled in this way. As a result, the delay from when a packet is stored in the jitter absorption buffer 33 until that packet is decoded becomes shorter. The operation mode in which the delay time control unit 39 performs this control is called the delay time shortening mode.

In the normal operation mode, the delay time control unit 39 reads one packet from the jitter absorption buffer 33 at each decoding timing and sends that packet to the decoder 34.
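The three operation modes can be illustrated with a minimal sketch of what happens at each decoding timing. Several simplifications are assumptions: the jitter absorption buffer is modeled as a plain queue rather than a set of numbered buffer sections, all names are hypothetical, the extension mode is assumed to keep alternating between reading and resending while it stays selected, and in the shortening mode the earlier of the two packets is the one discarded (the text only says that one of the two is discarded).

```python
from collections import deque
from enum import Enum


class Mode(Enum):
    NORMAL = "normal"
    EXTEND = "delay time extension"
    SHORTEN = "delay time shortening"


class DelayTimeController:
    def __init__(self):
        self.mode = Mode.NORMAL   # set at initialization
        self.held = None          # packet kept for resending in EXTEND mode

    def next_packet(self, jitter_buffer):
        """Return the packet to hand to the decoder at this decoding timing."""
        # Extension mode: resend the held packet without reading a new one.
        if self.mode is Mode.EXTEND and self.held is not None:
            packet, self.held = self.held, None
            return packet

        self.held = None
        if not jitter_buffer:
            return None           # nothing to decode (assumption)

        # Shortening mode: read two packets and discard one of them.
        if self.mode is Mode.SHORTEN and len(jitter_buffer) >= 2:
            jitter_buffer.popleft()

        packet = jitter_buffer.popleft()
        if self.mode is Mode.EXTEND:
            self.held = packet    # keep a copy for the next decoding timing
        return packet
```

In the extension mode the buffer is drained at half the normal rate, so packets wait longer before decoding; in the shortening mode it is drained at twice the normal rate, so they wait less.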

FIG. 14 shows the operation mode determination procedure performed by the delay time control unit 39.

In the initialization performed at power-on, the counter b_cnt is set to the predetermined value B_THL (for example, 100), and the normal operation mode is set as the operation mode of the delay time control unit 39.

The delay time control process is performed each time input processing of an arriving packet into the jitter absorption buffer 33 starts.

When packet input processing starts, it is determined whether the packet input position in the jitter absorption buffer 33 lies in buffer region A of FIG. 8 (step 111). If the packet input position lies in buffer region A, it is judged that the actual distribution of packet arrival times S2 is shifted to the left of the target distribution S0, as shown in FIG. 9(a); the predetermined value B_THL is stored in the counter b_cnt (step 112), and the operation mode is set to the delay time extension mode (step 113). The packet is then stored in the jitter absorption buffer 33 (step 120), and the current packet input processing ends.

If it is determined in step 111 that the packet input position does not lie in buffer region A, it is determined whether the packet input position lies in buffer region B (step 114). If the packet input position lies in buffer region B, it is judged that the actual distribution of packet arrival times very likely matches the target distribution; the predetermined value B_THL is stored in the counter b_cnt (step 115), and the operation mode is set to the normal operation mode (step 116). The packet is then stored in the jitter absorption buffer 33 (step 120), and the current packet input processing ends.

If it is determined in step 114 that the packet input position does not lie in buffer region B, the counter value b_cnt is decremented by 1 (step 117). It is then determined whether the counter value b_cnt has become 0 or less (step 118). If the counter value b_cnt is greater than 0, it is judged that the actual distribution of packet arrival times very likely matches the target distribution, and the operation mode is set to the normal operation mode (step 116). The packet is then stored in the jitter absorption buffer 33 (step 120), and the current packet input processing ends.

If it is determined in step 118 that the counter value b_cnt has become 0 or less, it is judged that the actual distribution of packet arrival times S1 is shifted to the right of the target distribution S0, as shown in FIG. 9(b), and the operation mode is set to the delay time shortening mode (step 119). The packet is then stored in the jitter absorption buffer 33 (step 120), and the current packet input processing ends.

FIG. 1 is a block diagram showing the prior art.
FIG. 2 is a schematic diagram for explaining the problem with the prior art of FIG. 1.
FIG. 3 is a block diagram showing the configuration of the Internet telephone.
FIG. 4 is a block diagram showing the configuration of the DSP of FIG. 3.
FIG. 5 is a schematic diagram for explaining the basic idea of the present invention.
FIG. 6 is a schematic diagram for explaining the processing of the variable-speed playback unit 35 when the playback speed is increased.
FIG. 7 is a schematic diagram for explaining the processing of the variable-speed playback unit 35 when the playback speed is decreased.
FIG. 8 is a schematic diagram for explaining playback speed control.
FIG. 9 is a schematic diagram for explaining the basic idea of playback speed control.
FIG. 10 is a flowchart showing the initialization procedure.
FIG. 11 is a flowchart showing the playback speed control procedure.
FIG. 12 is a flowchart showing the decoding timing control procedure.
FIG. 13 is a block diagram showing another configuration example of the DSP.
FIG. 14 is a flowchart showing the operation mode determination procedure performed by the delay time control unit 39 of FIG. 13.

Explanation of symbols

3 DSP
33 Jitter absorption buffer
34 Decoder
35 Variable-speed playback unit
36 Output buffer
37 Playback speed control unit
38 Decoding timing control unit
39 Delay time control unit

Claims

A voice decoding apparatus comprising a jitter absorption buffer having a plurality of buffer sections for storing received packets, and decoding means for decoding the packets stored in the jitter absorption buffer, in which a received packet is stored at the position in the jitter absorption buffer that corresponds to the packet number of the received packet, taking as a reference the packet number of the packet at the output end of the jitter absorption buffer from which packets are output to the decoding means, the apparatus comprising:
playback speed changing means for converting the playback speed of the decoded audio signal obtained by the decoding means;
an output buffer for temporarily storing the digital audio signal output from the playback speed changing means, and means for reading out the digital audio signal stored in the output buffer at predetermined time intervals;
playback speed control means for controlling the playback speed changing means based on the storage position of the received packet in the jitter absorption buffer; and
decoding timing control means for controlling the decoding timing of the decoding means based on the amount of data stored in the output buffer,
wherein a first region consisting of a required number of buffer sections from the output end of the jitter absorption buffer, a second region consisting of a required number of buffer sections on the other-end side of the jitter absorption buffer relative to the first region, and a third region consisting of a required number of buffer sections on the other-end side of the jitter absorption buffer relative to the second region are set in the jitter absorption buffer, and
the playback speed control means controls the playback speed changing means so that the playback speed becomes slower when a received packet is stored in the first region in the jitter absorption buffer, and controls the playback speed changing means so that the playback speed becomes faster when received packets are stored in the third region in the jitter absorption buffer a predetermined number of times in succession.

The voice decoding apparatus according to claim 1, wherein the decoding timing control means requests the decoding means to decode a packet when the amount of data stored in the output buffer becomes smaller than a predetermined reference amount.

A voice decoding apparatus comprising a jitter absorption buffer having a plurality of buffer sections for storing received packets, and decoding means for decoding the packets stored in the jitter absorption buffer, in which a received packet is stored at the position in the jitter absorption buffer that corresponds to the packet number of the received packet, taking as a reference the packet number of the packet at the output end of the jitter absorption buffer from which packets are output to the decoding means,
wherein a first region consisting of a required number of buffer sections from the output end of the jitter absorption buffer, a second region consisting of a required number of buffer sections on the other-end side of the jitter absorption buffer relative to the first region, and a third region consisting of a required number of buffer sections on the other-end side of the jitter absorption buffer relative to the second region are set in the jitter absorption buffer, and
the apparatus comprises delay time control means that, when a received packet is stored in the first region in the jitter absorption buffer, performs control such that the delay from when a packet is stored in the jitter absorption buffer until that packet is decoded becomes longer, and, when received packets are stored in the third region in the jitter absorption buffer a predetermined number of times in succession, performs control such that the delay from when a packet is stored in the jitter absorption buffer until that packet is decoded becomes shorter.

The voice decoding apparatus according to claim 3, wherein the delay time control means comprises:
playback speed changing means for converting the playback speed of the decoded audio signal obtained by the decoding means;
an output buffer for temporarily storing the digital audio signal output from the playback speed changing means, and means for reading out the digital audio signal stored in the output buffer at predetermined time intervals;
means for controlling the playback speed changing means so that the playback speed becomes slower when a received packet is stored in the first region in the jitter absorption buffer, and so that the playback speed becomes faster when received packets are stored in the third region in the jitter absorption buffer a predetermined number of times in succession; and
decoding timing control means for controlling the decoding timing of the decoding means based on the amount of data stored in the output buffer.

The voice decoding apparatus according to claim 3, wherein, when a received packet is stored in the first region in the jitter absorption buffer, the delay time control means controls the reading of the jitter absorption buffer and the packets sent to the decoding means so that a packet read from the jitter absorption buffer at a packet read timing is decoded repeatedly at a plurality of consecutive packet read timings including the current one, and so that reading of packets from the jitter absorption buffer is prohibited in the meantime; and, when received packets are stored in the third region in the jitter absorption buffer a predetermined number of times in succession, the delay time control means controls the reading of packets from the jitter absorption buffer and the packets sent to the decoding means so that a plurality of packets stored in the jitter absorption buffer are read at once at a packet read timing, only one of them is decoded, and the others are discarded.

A network telephone comprising a jitter absorption buffer having a plurality of buffer units for storing received packets, and decoding means for decoding the packets stored in the jitter absorption buffer, in which a received packet is stored at the position in the jitter absorption buffer that corresponds to its packet number, taking as a reference the packet number of the packet at the output end of the jitter absorption buffer from which packets are output to the decoding means, the network telephone further comprising:
reproduction speed changing means for converting the reproduction speed of the decoded voice signal obtained by the decoding means;
an output buffer for temporarily storing the digital voice signal output from the reproduction speed changing means; means for reading out the digital voice signal stored in the output buffer at predetermined time intervals;
reproduction speed control means for controlling the reproduction speed changing means based on the position at which a received packet is stored in the jitter absorption buffer; and
decoding timing control means for controlling the timing of decoding by the decoding means based on the amount of data stored in the output buffer,
wherein the jitter absorption buffer is divided into a first area consisting of a required number of buffer units counted from the output end of the jitter absorption buffer, a second area consisting of a required number of buffer units located on the other-end side of the first area, and a third area consisting of a required number of buffer units located on the other-end side of the second area, and
wherein the reproduction speed control means controls the reproduction speed changing means so that the reproduction speed is reduced when a received packet is stored in the first area of the jitter absorption buffer, and so that the reproduction speed is increased when received packets have been stored in the third area of the jitter absorption buffer a predetermined number of consecutive times.
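The area-based reproduction speed control recited above can be sketched as follows. This is only an illustrative sketch: the class name `SpeedController`, the area boundaries, the speed factors 0.9 and 1.1, and the threshold of three consecutive packets are hypothetical values, not taken from the claim.

```python
class SpeedController:
    """Hypothetical sketch: picks a speed factor from the slot a packet lands in."""

    def __init__(self, first_end=2, second_end=6, consecutive_needed=3,
                 slow=0.9, fast=1.1):
        self.first_end = first_end    # slots [0, first_end) form the first area
        self.second_end = second_end  # slots [first_end, second_end) form the second area
        self.consecutive_needed = consecutive_needed  # the "predetermined number of times"
        self.slow = slow              # assumed factor below 1.0 (slower reproduction)
        self.fast = fast              # assumed factor above 1.0 (faster reproduction)
        self.third_run = 0            # consecutive packets stored in the third area

    def area(self, slot):
        """Map a slot index (0 = output end) to area 1, 2 or 3."""
        if slot < self.first_end:
            return 1
        if slot < self.second_end:
            return 2
        return 3

    def on_packet_stored(self, slot):
        """Return the reproduction speed factor to apply after storing a packet."""
        area = self.area(slot)
        # Only an unbroken run of third-area packets counts toward the threshold.
        self.third_run = self.third_run + 1 if area == 3 else 0
        if area == 1:
            return self.slow   # buffer nearly empty: slow down to build up delay
        if self.third_run >= self.consecutive_needed:
            return self.fast   # buffer persistently full: speed up to cut delay
        return 1.0


controller = SpeedController()
factors = [controller.on_packet_stored(slot) for slot in (0, 3, 7, 7, 7, 3)]
print(factors)
```

A single packet landing in the third area does not change the speed here; only a sustained run does, which mirrors the asymmetry between the two conditions in the claim.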

The network telephone according to claim 6, wherein the decoding timing control means requests the decoding means to decode a packet when the amount of data stored in the output buffer falls below a predetermined reference amount.
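The decoding timing rule above can be illustrated with a small simulation, in which a decode request is issued whenever the output buffer holds less than the reference amount. The function name `decode_request_ticks` and all sample counts are assumptions chosen for illustration; in particular, `frame_samples` stands for the number of samples one decoded packet yields after reproduction speed conversion.

```python
def decode_request_ticks(ticks, frame_samples=160, read_samples=80, reference=160):
    """Return the read ticks at which a decode request is issued (hypothetical sketch).

    ticks:         number of fixed-interval output reads to simulate
    frame_samples: samples added to the output buffer per decoded packet (assumed)
    read_samples:  samples drained from the output buffer per read (assumed)
    reference:     the predetermined reference amount, in samples (assumed)
    """
    stored = 0
    requests = []
    for tick in range(ticks):
        # Request decoding only when the stored amount is below the reference.
        if stored < reference:
            requests.append(tick)
            stored += frame_samples
        # The output side reads a fixed amount at each predetermined interval.
        stored -= min(read_samples, stored)
    return requests


print(decode_request_ticks(6))                     # normal-length frames
print(decode_request_ticks(6, frame_samples=240))  # frames lengthened by slower reproduction
```

Because slower reproduction makes each decoded packet last longer in the output buffer, decode requests become less frequent, so packets stay in the jitter absorption buffer longer without any change to the fixed output read interval.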

A network telephone comprising a jitter absorption buffer having a plurality of buffer units for storing received packets, and decoding means for decoding the packets stored in the jitter absorption buffer, in which a received packet is stored at the position in the jitter absorption buffer that corresponds to its packet number, taking as a reference the packet number of the packet at the output end of the jitter absorption buffer from which packets are output to the decoding means,
wherein the jitter absorption buffer is divided into a first area consisting of a required number of buffer units counted from the output end of the jitter absorption buffer, a second area consisting of a required number of buffer units located on the other-end side of the first area, and a third area consisting of a required number of buffer units located on the other-end side of the second area, and
wherein the network telephone further comprises delay time control means that, when a received packet is stored in the first area of the jitter absorption buffer, performs control to lengthen the delay time from when a packet is stored in the jitter absorption buffer until that packet is decoded, and that, when received packets have been stored in the third area of the jitter absorption buffer a predetermined number of consecutive times, performs control to shorten the delay time from when a packet is stored in the jitter absorption buffer until that packet is decoded.
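The placement rule in this claim, where a received packet is stored at the position given by its packet number relative to the packet at the output end, can be sketched as below. The class name `JitterBuffer`, the buffer size, and the area lengths are hypothetical. Packet number wraparound and the handling of late or too-early packets are outside this sketch.

```python
class JitterBuffer:
    """Hypothetical sketch of number-indexed storage with three areas."""

    def __init__(self, size=8, first_len=2, second_len=4):
        self.slots = [None] * size    # slot 0 is the output end
        self.head_number = 0          # packet number expected at the output end
        self.first_len = first_len    # buffer units in the first area (assumed)
        self.second_len = second_len  # buffer units in the second area (assumed)

    def area(self, offset):
        """Map a slot offset from the output end to area 1, 2 or 3."""
        if offset < self.first_len:
            return 1
        if offset < self.first_len + self.second_len:
            return 2
        return 3

    def store(self, number, payload):
        """Store a packet by number; return its area, or None if it does not fit."""
        offset = number - self.head_number
        if offset < 0 or offset >= len(self.slots):
            return None  # arrived too late, or too far ahead of the output end
        self.slots[offset] = payload
        return self.area(offset)

    def pop(self):
        """Hand the output-end packet to the decoder and shift the buffer by one."""
        payload = self.slots.pop(0)
        self.slots.append(None)
        self.head_number += 1
        return payload


jb = JitterBuffer()
print(jb.store(0, "a"), jb.store(3, "d"), jb.store(7, "h"))
```

The area returned by `store` is the input a delay time controller would act on: area 1 suggests the buffer is running dry, and repeated area 3 results suggest more delay is being held than the current jitter requires.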

The network telephone according to claim 8, wherein the delay time control means comprises:
reproduction speed changing means for converting the reproduction speed of the decoded voice signal obtained by the decoding means;
an output buffer for temporarily storing the digital voice signal output from the reproduction speed changing means; means for reading out the digital voice signal stored in the output buffer at predetermined time intervals;
means for controlling the reproduction speed changing means so that the reproduction speed is reduced when a received packet is stored in the first area of the jitter absorption buffer, and so that the reproduction speed is increased when received packets have been stored in the third area of the jitter absorption buffer a predetermined number of consecutive times; and
decoding timing control means for controlling the timing of decoding by the decoding means based on the amount of data stored in the output buffer.

The network telephone according to claim 8, wherein the delay time control means: when a received packet is stored in the first area of the jitter absorption buffer, controls reading from the jitter absorption buffer and the packets sent to the decoding means so that the packet read from the jitter absorption buffer at a packet read timing is decoded repeatedly over a plurality of consecutive packet read timings, including the current one, and so that reading of packets from the jitter absorption buffer is prohibited during that period; and, when received packets have been stored in the third area of the jitter absorption buffer a predetermined number of consecutive times, controls reading of packets from the jitter absorption buffer and the packets sent to the decoding means so that a plurality of packets stored in the jitter absorption buffer are read at once at a packet read timing, only one of them is decoded, and the others are discarded.