WO2017107544A1 - Network jitter processing method, apparatus, and terminal device - Google Patents

Network jitter processing method, apparatus, and terminal device

Info

Publication number
WO2017107544A1
WO2017107544A1 PCT/CN2016/097257 CN2016097257W WO2017107544A1 WO 2017107544 A1 WO2017107544 A1 WO 2017107544A1 CN 2016097257 W CN2016097257 W CN 2016097257W WO 2017107544 A1 WO2017107544 A1 WO 2017107544A1
Authority
WO
WIPO (PCT)
Prior art keywords
buffer
voice data
time interval
preset
interval
Prior art date
Application number
PCT/CN2016/097257
Other languages
English (en)
French (fr)
Inventor
李敬
王林章
梁善桂
吴子敬
Original Assignee
小米科技有限责任公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 小米科技有限责任公司
Priority to JP2016565349A (patent JP6382345B2, ja)
Priority to RU2016148816A (patent RU2651215C1, ru)
Priority to KR1020177024410A (patent KR101986549B1, ko)
Publication of WO2017107544A1 (patent WO2017107544A1, zh)

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 Traffic control in data switching networks
    • H04L47/10 Flow control; Congestion control
    • H04L47/28 Flow control; Congestion control in relation to timing considerations
    • H04L47/283 Flow control; Congestion control in relation to timing considerations in response to processing delays, e.g. caused by jitter or round trip time [RTT]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L49/00 Packet switching elements
    • H04L49/90 Buffering arrangements
    • H04L49/9005 Buffering arrangements using dynamic buffer space allocation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/1066 Session management
    • H04L65/1083 In-session procedures
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/60 Network streaming of media packets
    • H04L65/65 Network streaming protocols, e.g. real-time transport protocol [RTP] or real-time control protocol [RTCP]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/80 Responding to QoS
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/08 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L43/0852 Delays
    • H04L43/087 Jitter

Definitions

  • The present disclosure relates to the field of smart terminal device technologies, and in particular to a network jitter processing method, apparatus, and terminal device.
  • VoIP: Voice over Internet Protocol.
  • In VoIP voice communication, the transmitting end generally uses a fixed-frame-rate speech coding algorithm and transmits data at uniform time intervals.
  • The factor with the greatest impact on voice quality at the receiving end is not the end-to-end transmission delay but the jitter of the time interval at which the receiving end receives data. The related art, however, lacks a method of resolving network jitter based on the jitter of that time interval.
  • The present disclosure provides a network jitter processing method, apparatus, and terminal device for resolving network jitter according to the jitter of the time interval at which a receiving end receives data.
  • A network jitter processing method, including:
  • the buffer is adjusted according to the buffer target size to resolve network jitter.
  • The technical solution may include the following beneficial effects: the probability distribution of the time interval at which the receiving end receives voice data packets changes with network jitter, and that time interval is a random variable. This embodiment calculates the buffer target size from the probability distribution of the time interval and the allowable jitter probability, so the buffer target size changes with network jitter, and adjusting the buffer according to the buffer target size makes the buffer size follow the network jitter. Because the size of the buffer is closely related to its ability to overcome network jitter, network jitter can be resolved by adjusting the buffer to the buffer target size.
  • the n-th time interval is $J_n = R_n - R_{n-1}$, where $R_n$ represents the receiving moment of the n-th voice data packet, $R_{n-1}$ represents the receiving moment of the (n-1)-th voice data packet, $1 \le n \le N-1$, and N represents the total number of the voice data packets;
  • there are a plurality of preset intervals, and the i-th preset interval is […], where T is the fixed time interval at which the transmitting end sends the voice data packets and K is a positive integer;
  • the probability distribution of the time interval is calculated according to the preset interval to which each time interval belongs, including:
  • The technical solution may include the following beneficial effects: the number of time intervals falling into each preset interval is counted over all time intervals, and the ratio of that number to the total number of voice data packets is taken as the probability that the time interval falls within that preset interval. Obtaining this probability for every preset interval yields the probability distribution of the time interval.
  • when $N \ge N_{max}$, for any n with $n \ge N_{max}$, the number of time intervals corresponding to the preset interval to which the $(n - N_{max} + 1)$-th time interval belongs is decremented by 1, where $N_{max}$ is a threshold of the total number of voice data packets.
  • The technical solution may include the following beneficial effects: each time a new time interval is added, the oldest time interval is removed, so that the total number of time intervals distributed over the preset intervals stays constant as the receiving end receives more voice data packets.
  • the calculating a buffer target size according to the probability distribution and the allowable jitter probability includes:
  • the buffer is adjusted according to the buffer target size to resolve network jitter, including:
  • if the current size of the buffer is greater than the buffer target size, deleting non-speech data in the buffer so that the difference between the current size of the buffer and the buffer target size is within a preset range.
  • The technical solution may include the following beneficial effects: the buffer target size is calculated from the probability distribution of the time interval and the allowable jitter probability, and the current size of the buffer is adjusted according to the buffer target size, so that the size of the buffer is dynamically adjusted with network jitter and network jitter is resolved.
  • the method further includes:
  • The technical solution may include the following beneficial effects: a buffer is added between the decoder and the sound emitting module at the receiving end; the RTP message receiving module of the receiving end passes each received voice data packet to the decoder as an RTP message; the decoder decodes the RTP message to obtain voice data or non-speech data and puts it into the buffer; and dynamically adjusting the buffer handles the network jitter.
  • a network jitter processing apparatus including:
  • a receiving recording module configured to receive a plurality of voice data packets and record the receiving moment of each voice data packet;
  • a calculating module configured to calculate, according to the receiving moments, the time interval between the reception of every two adjacent voice data packets among the plurality of voice data packets;
  • a statistics module configured to compute a probability distribution of the time interval according to the preset interval to which each time interval belongs;
  • the calculating module being further configured to calculate a buffer target size according to the probability distribution and the allowable jitter probability;
  • an adjusting module configured to adjust the buffer according to the buffer target size to resolve network jitter.
  • The technical solution may include the following beneficial effects: the probability distribution of the time interval at which the receiving end receives voice data packets changes with network jitter, and that time interval is a random variable. This embodiment calculates the buffer target size from the probability distribution of the time interval and the allowable jitter probability, so the buffer target size changes with network jitter, and adjusting the buffer according to the buffer target size makes the buffer size follow the network jitter. Because the size of the buffer is closely related to its ability to overcome network jitter, network jitter can be resolved by adjusting the buffer to the buffer target size.
  • the n-th time interval is $J_n = R_n - R_{n-1}$, where $R_n$ represents the receiving moment of the n-th voice data packet, $R_{n-1}$ represents the receiving moment of the (n-1)-th voice data packet, $1 \le n \le N-1$, and N represents the total number of the voice data packets;
  • there are a plurality of preset intervals, and the i-th preset interval is […], where T is the fixed time interval at which the transmitting end sends the voice data packets and K is a positive integer;
  • the statistics module includes:
  • a first statistics submodule configured to count, over all time intervals, the number $N_i$ of time intervals falling into each preset interval, where $i \ge 0$ is the identification number of the preset interval and $N_i \ge 0$;
  • a first calculation submodule configured to calculate, according to each $N_i$ and the total number N of the voice data packets, the probability that the time interval falls within each preset interval.
  • The technical solution may include the following beneficial effects: the number of time intervals falling into each preset interval is counted over all time intervals, and the ratio of that number to the total number of voice data packets is taken as the probability that the time interval falls within that preset interval. Obtaining this probability for every preset interval yields the probability distribution of the time interval.
  • the statistics module further includes:
  • a second statistics submodule configured to, when $N \ge N_{max}$, for any n with $n \ge N_{max}$, decrement by 1 the number of time intervals corresponding to the preset interval to which the $(n - N_{max} + 1)$-th time interval belongs, where $N_{max}$ is a threshold of the total number of voice data packets.
  • The technical solution may include the following beneficial effects: each time a new time interval is added, the oldest time interval is removed, so that the total number of time intervals distributed over the preset intervals stays constant as the receiving end receives more voice data packets.
  • the calculating module includes:
  • a second calculation submodule configured to calculate the m that satisfies the condition […], where P represents a known allowable jitter probability;
  • a third calculation submodule configured to calculate the sum of the sizes of the 0th through m-th preset intervals of the plurality of preset intervals, the sum being the buffer target size.
  • the adjusting module includes:
  • a first adjustment submodule configured to generate data in the buffer for frame filling if the current size of the buffer is smaller than the buffer target size and the effective length of the buffer is 0;
  • a second adjustment submodule configured to delete non-speech data in the buffer if the current size of the buffer is greater than the buffer target size, so that the difference between the current size of the buffer and the buffer target size is within a preset range.
  • The technical solution may include the following beneficial effects: the buffer target size is calculated from the probability distribution of the time interval and the allowable jitter probability, and the current size of the buffer is adjusted according to the buffer target size, so that the size of the buffer is dynamically adjusted with network jitter and network jitter is resolved.
  • the device further includes:
  • a decoding module configured to, after the receiving recording module receives the plurality of voice data packets and records the receiving moment of each voice data packet, decode each of the plurality of voice data packets to obtain voice data or non-speech data, and store the voice data or the non-speech data in the buffer.
  • The technical solution may include the following beneficial effects: a buffer is added between the decoder and the sound emitting module at the receiving end; the RTP message receiving module of the receiving end passes each received voice data packet to the decoder as an RTP message; the decoder decodes the RTP message to obtain voice data or non-speech data and puts it into the buffer; and dynamically adjusting the buffer handles the network jitter.
  • a terminal device including:
  • a memory configured to store processor executable instructions
  • wherein the processor is configured to:
  • adjust the buffer according to the buffer target size to resolve network jitter.
  • FIG. 1A is a flowchart of Embodiment 1 of a network jitter processing method according to an exemplary embodiment;
  • FIG. 1B is a schematic view of the receiving end of the embodiment shown in FIG. 1A;
  • FIG. 2 is a flowchart of Embodiment 2 of a network jitter processing method according to an exemplary embodiment;
  • FIG. 3 is a flowchart of Embodiment 3 of a network jitter processing method according to an exemplary embodiment;
  • FIG. 4 is a flowchart of Embodiment 4 of a network jitter processing method according to an exemplary embodiment;
  • FIG. 5 is a flowchart of Embodiment 5 of a network jitter processing method according to an exemplary embodiment;
  • FIG. 6 is a block diagram of Embodiment 1 of a network jitter processing apparatus according to an exemplary embodiment;
  • FIG. 7 is a block diagram of Embodiment 2 of a network jitter processing apparatus according to an exemplary embodiment;
  • FIG. 8 is a block diagram of a terminal device according to an exemplary embodiment;
  • FIG. 9 is a block diagram of another terminal device according to an exemplary embodiment.
  • FIG. 1A is a flowchart of Embodiment 1 of a network jitter processing method according to an exemplary embodiment. The method may be performed by a network jitter processing apparatus, which may be integrated in a terminal device. As shown in FIG. 1A, the method includes the following steps:
  • In step 101, a plurality of voice data packets are received, and the receiving moment of each voice data packet is recorded.
  • Two terminal devices perform voice communication through VoIP, and the protocol they use for voice communication is the Real-time Transport Protocol (RTP). The transmitting end sends multiple RTP messages to the receiving end, each RTP message including voice data; the receiving end receives the multiple RTP messages and records the receiving moment of each RTP message.
  • In step 102, the time interval between the reception of every two adjacent voice data packets among the plurality of voice data packets is calculated according to the receiving moments.
  • The n-th time interval is $J_n = R_n - R_{n-1}$, where $R_n$ represents the receiving moment of the n-th voice data packet, $R_{n-1}$ represents the receiving moment of the (n-1)-th voice data packet, $1 \le n \le N-1$, and N represents the total number of voice data packets.
  • The receiving moment of the (n-1)-th voice data packet is $R_{n-1}$, the receiving moment of the n-th voice data packet is $R_n$, and the number of all the RTP messages is N, with $1 \le n \le N-1$.
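The interval calculation of step 102 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `inter_arrival_intervals` and the sample receive moments are hypothetical, and times are assumed to be in milliseconds.

```python
from typing import List


def inter_arrival_intervals(receive_times: List[float]) -> List[float]:
    """Return J_n = R_n - R_{n-1} for n = 1 .. N-1."""
    return [curr - prev for prev, curr in zip(receive_times, receive_times[1:])]


# Hypothetical receive moments (ms) for N = 5 packets.
receive_times_ms = [0.0, 21.0, 39.0, 65.0, 80.0]
intervals_ms = inter_arrival_intervals(receive_times_ms)
print(intervals_ms)  # [21.0, 18.0, 26.0, 15.0]
```

N packets yield N-1 intervals, which matches the index range given for n.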
  • In step 103, the probability distribution of the time interval is computed according to the preset interval to which each time interval belongs.
  • There are a plurality of preset intervals, and the i-th preset interval is […], where T is the fixed time interval at which the transmitting end sends the voice data packets, K is a positive integer, and $i \ge 0$.
  • The number of time intervals falling into each preset interval is counted over all time intervals; that is, for each preset interval […], the number $N_i$ of time intervals falling into that preset interval is counted. Since $J_n$ varies with n, the time interval J at which the receiving end receives two adjacent voice data packets is a random variable.
  • The ratio of the number of time intervals $N_i$ corresponding to a preset interval to the total number N of voice data packets is the probability $p_i$ that the random variable J falls into that preset interval. The probability distribution of the random variable J is obtained from the probabilities $p_i$, $i \ge 0$, that J falls into each preset interval.
  • For example, K = 8.
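The counting in step 103 can be sketched as below. The expression for the i-th preset interval is not reproduced in this text, so the sketch assumes, purely for illustration, that the i-th interval is the half-open range [i·T/K, (i+1)·T/K); the function name `interval_distribution` is likewise hypothetical. As in the text, $p_i = N_i / N$ with N the number of packets.

```python
from collections import Counter
from typing import Dict, List


def interval_distribution(intervals: List[float], t: float, k: int) -> Dict[int, float]:
    """Map each occupied preset-interval index i to p_i = N_i / N.

    ASSUMPTION: the i-th preset interval is [i*t/k, (i+1)*t/k).
    N is the packet count, i.e. len(intervals) + 1, as in the text.
    """
    width = t / k
    counts = Counter(int(j // width) for j in intervals)
    n_packets = len(intervals) + 1
    return {i: counts[i] / n_packets for i in sorted(counts)}


# T = 20 ms and K = 8 give an assumed interval width of 2.5 ms.
p = interval_distribution([21.0, 18.0, 26.0, 15.0, 19.0], t=20.0, k=8)
print(sorted(p))  # [6, 7, 8, 10]
```

Because N packets produce only N-1 intervals, the probabilities computed this way sum to (N-1)/N rather than exactly 1.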
  • In step 104, a buffer target size is calculated based on the probability distribution and the allowable jitter probability.
  • The allowable jitter probability indicates the probability of network jitter that the buffer can overcome. Let $\sum_{i=0}^{m} p_i$ be the probability that the random variable J falls into the first m+1 preset intervals and $\sum_{i=0}^{m+1} p_i$ the probability that it falls into the first m+2 preset intervals. If the allowable jitter probability P satisfies the condition […], then the sum of the sizes of the 0th through m-th preset intervals is the buffer target size.
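The selection of m in step 104 might look like the sketch below. The exact condition on P is missing from this text, so the code assumes one plausible reading, $\sum_{i=0}^{m} p_i \le P < \sum_{i=0}^{m+1} p_i$, together with an assumed uniform interval width T/K. The names `find_m` and `buffer_target_size`, the fallback behavior, and the sample distribution are all hypothetical.

```python
from typing import Dict


def find_m(p: Dict[int, float], allowed: float) -> int:
    """Return m with sum(p_0..p_m) <= allowed < sum(p_0..p_{m+1}).

    ASSUMPTION: this inequality is one reading of the elided condition.
    `p` must be non-empty and maps interval index i to p_i.
    """
    last = max(p)
    cumulative = []
    running = 0.0
    for i in range(last + 1):
        running += p.get(i, 0.0)
        cumulative.append(running)
    for m in range(last):
        if cumulative[m] <= allowed < cumulative[m + 1]:
            return m
    # Fallbacks (also assumptions): cover every occupied interval, or only interval 0.
    return last if cumulative[last] <= allowed else 0


def buffer_target_size(p: Dict[int, float], allowed: float, t: float, k: int) -> float:
    """Sum the sizes of intervals 0..m, assuming each has width t/k."""
    return (find_m(p, allowed) + 1) * t / k


p = {7: 0.2, 8: 0.5, 9: 0.2, 10: 0.1}
m = find_m(p, allowed=0.8)
target_ms = buffer_target_size(p, allowed=0.8, t=20.0, k=8)
print(m, target_ms)  # 8 22.5
```

Under this reading a larger P selects a larger m and therefore a larger buffer target size.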
  • In step 105, the buffer is adjusted according to the buffer target size to resolve network jitter.
  • A buffer is added between the decoder and the sound emitting module at the receiving end. The RTP message receiving module of the receiving end passes each received voice data packet to the decoder as an RTP message, and the decoder decodes the RTP message.
  • The length of the voice data or non-speech data obtained after decoding one RTP message is L, and the total size of the buffer is K*L; that is, the buffer is divided into K blocks, each of length L, and the sequence numbers of the blocks are 0, 1, 2, ..., K-1.
  • The playback module periodically reads data from the buffer for playback. It is assumed that the playback module reads data from the block with sequence number $P_{read}$ in the buffer for playback, and the decoder writes the decoded output data to the block with sequence number $P_{write}$.
  • $P_{write} - P_{read}$ indicates the effective data length of the buffer and also represents the delay introduced by the buffer.
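The block buffer described above can be sketched as a small ring buffer. This is an illustrative assumption, not the patent's implementation: the class name `BlockBuffer` is hypothetical, and the sketch keeps `p_read` and `p_write` as ever-increasing counters that are mapped to block sequence numbers 0..K-1 by a modulo, so that their difference is always the effective data length.

```python
from typing import List, Optional


class BlockBuffer:
    """K blocks of one decoded frame each; p_write - p_read is the effective length."""

    def __init__(self, k: int) -> None:
        self.blocks: List[Optional[bytes]] = [None] * k
        self.p_read = 0   # frames consumed so far by the playback module
        self.p_write = 0  # frames produced so far by the decoder

    def effective_length(self) -> int:
        return self.p_write - self.p_read

    def write(self, frame: bytes) -> bool:
        if self.effective_length() == len(self.blocks):
            return False  # buffer full; the caller decides what to drop
        self.blocks[self.p_write % len(self.blocks)] = frame
        self.p_write += 1
        return True

    def read(self) -> Optional[bytes]:
        if self.effective_length() == 0:
            return None  # nothing to play
        frame = self.blocks[self.p_read % len(self.blocks)]
        self.p_read += 1
        return frame


buf = BlockBuffer(k=8)
buf.write(b"frame-0")
buf.write(b"frame-1")
first = buf.read()
print(first, buf.effective_length())  # b'frame-0' 1
```

Each frame still waiting in the buffer adds one frame period of playback delay, which is why the effective length also measures the delay.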
  • Since the size of the buffer is closely related to its ability to overcome network jitter, the buffer is adjusted according to the buffer target size to resolve network jitter.
  • The probability distribution of the time interval at which the receiving end receives voice data packets changes with network jitter, and that time interval is a random variable. This embodiment calculates the buffer target size from the probability distribution of the time interval and the allowable jitter probability, so the buffer target size changes with network jitter, and adjusting the buffer according to the buffer target size makes the buffer size follow the network jitter. Because the size of the buffer is closely related to its ability to overcome network jitter, network jitter can be resolved by adjusting the buffer to the buffer target size.
  • FIG. 2 is a flowchart of Embodiment 2 of a network jitter processing method according to an exemplary embodiment. As shown in FIG. 2, the method may include the following steps:
  • In step 201, a plurality of voice data packets are received, and the receiving moment of each voice data packet is recorded.
  • In step 202, the time interval between the reception of every two adjacent voice data packets among the plurality of voice data packets is calculated according to the receiving moments.
  • In step 203, the number $N_i$ of time intervals falling into each preset interval is counted over all time intervals, where $i \ge 0$ is the identification number of the preset interval and $N_i \ge 0$.
  • In step 204, the probability that the time interval falls within each preset interval is calculated according to each $N_i$ and the total number N of the voice data packets.
  • A statistical value corresponding to each preset interval is obtained. For example, if the i-th preset interval corresponds to $N_i$ statistical values, then $N_i / N$ is the probability that the random variable J falls within the i-th preset interval, and the probability that J falls within each preset interval can be calculated in the same way.
  • In step 205, a buffer target size is calculated based on the probability distribution and the allowable jitter probability.
  • In step 206, the buffer is adjusted according to the buffer target size to resolve network jitter.
  • The ratio of the number of time intervals of each preset interval to the total number of voice data packets is used as the probability that the time interval falls within that preset interval; obtaining this probability for every preset interval yields the probability distribution of the time interval.
  • FIG. 3 is a flowchart of Embodiment 3 of a network jitter processing method according to an exemplary embodiment. As shown in FIG. 3, the method may include the following steps:
  • In step 301, a plurality of voice data packets are received, and the receiving moment of each voice data packet is recorded.
  • In step 302, the time interval between the reception of every two adjacent voice data packets among the plurality of voice data packets is calculated according to the receiving moments.
  • In step 303, the number $N_i$ of time intervals falling into each preset interval is counted over all time intervals, where $i \ge 0$ is the identification number of the preset interval and $N_i \ge 0$.
  • In step 304, the probability that the time interval falls within each preset interval is calculated according to each $N_i$ and the total number N of the voice data packets.
  • In step 305, when $N \ge N_{max}$, for any n with $n \ge N_{max}$, the number of time intervals corresponding to the preset interval to which the $(n - N_{max} + 1)$-th time interval belongs is decremented by 1, where $N_{max}$ is a threshold of the total number of voice data packets.
  • The receiving end has received 1000 voice data packets with sequence numbers 0, 1, 2, ..., 999, and the 1000 voice data packets correspond to 999 time intervals $J_1, J_2, \ldots, J_{999}$; the threshold of the total number of voice data packets at the receiving end is 1000.
  • When $J_{1000}$ is calculated, assuming $J_{1000}$ belongs to the 10th preset interval, the statistical value corresponding to the 10th preset interval is incremented by 1, and the statistical value corresponding to the preset interval to which $J_1$ belongs is decremented by 1.
  • When $J_{1001}$ is calculated, if $J_{1001}$ belongs to the 8th preset interval, the statistical value corresponding to the 8th preset interval is incremented by 1 and, at the same time, the statistical value corresponding to the preset interval to which $J_2$ belongs is decremented by 1. That is, each time a new time interval is added, the oldest time interval is removed, so that the total number of time intervals distributed over the preset intervals does not change as the receiving end receives more voice data packets.
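The sliding-window bookkeeping of step 305 can be sketched with a `collections.deque` that remembers which preset interval each time interval fell into. The class name `SlidingIntervalCounts` and the choice to store interval indices in the window are assumptions made for illustration; the text only states that the count for the oldest interval is decremented.

```python
from collections import Counter, deque


class SlidingIntervalCounts:
    """Per-interval counts over the most recent n_max - 1 time intervals."""

    def __init__(self, n_max: int) -> None:
        # n_max packets correspond to n_max - 1 intervals, as in the example above.
        self.capacity = n_max - 1
        self.window: deque = deque()
        self.counts: Counter = Counter()

    def add(self, interval_index: int) -> None:
        """Record a new time interval and retire the oldest one if the window is full."""
        self.window.append(interval_index)
        self.counts[interval_index] += 1
        if len(self.window) > self.capacity:
            oldest = self.window.popleft()
            self.counts[oldest] -= 1


stats = SlidingIntervalCounts(n_max=4)  # small hypothetical threshold
for index in (10, 8, 8):
    stats.add(index)
stats.add(9)  # retires the oldest interval, which fell into interval 10
print(stats.counts[10], stats.counts[8], stats.counts[9])  # 0 2 1
```

The total of all counts stays at `n_max - 1` once the window is full, mirroring the constant total described in the text.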
  • In step 306, a buffer target size is calculated based on the probability distribution and the allowable jitter probability.
  • In step 307, the buffer is adjusted according to the buffer target size to resolve network jitter.
  • FIG. 4 is a flowchart of Embodiment 4 of a network jitter processing method according to an exemplary embodiment. As shown in FIG. 4, the method may include the following steps:
  • In step 401, a plurality of voice data packets are received, and the receiving moment of each voice data packet is recorded.
  • In step 402, the time interval between the reception of every two adjacent voice data packets among the plurality of voice data packets is calculated according to the receiving moments.
  • In step 403, the number $N_i$ of time intervals falling into each preset interval is counted over all time intervals, where $i \ge 0$ is the identification number of the preset interval and $N_i \ge 0$.
  • In step 404, the probability that the time interval falls within each preset interval is calculated according to each $N_i$ and the total number N of the voice data packets.
  • In step 405, when $N \ge N_{max}$, for any n with $n \ge N_{max}$, the number of time intervals corresponding to the preset interval to which the $(n - N_{max} + 1)$-th time interval belongs is decremented by 1, where $N_{max}$ is a threshold of the total number of voice data packets.
  • In step 406, the m that satisfies the condition […] is calculated, where P represents a known allowable jitter probability.
  • The allowable jitter probability indicates the probability of network jitter that the buffer can overcome. Given the probability $\sum_{i=0}^{m} p_i$ that the random variable J falls into the first m+1 preset intervals, the probability $\sum_{i=0}^{m+1} p_i$ that it falls into the first m+2 preset intervals, and the known preset allowable jitter probability P, the m that satisfies the condition […] is calculated.
  • In step 407, the sum of the sizes of the 0th through m-th preset intervals of the plurality of preset intervals is used as the buffer target size.
  • In step 408, if the current size of the buffer is smaller than the buffer target size, data is generated in the buffer for frame filling; if the current size of the buffer is larger than the buffer target size, non-speech data in the buffer is deleted so that the difference between the current size of the buffer and the buffer target size is within a preset range.
  • The buffer is adjusted according to the buffer target size to resolve network jitter. The specific adjustment method is as follows. If the current size of the buffer is smaller than the buffer target size, data is generated in the buffer for frame filling: for speech data, the signal is repeated according to the pitch period to keep the speech smooth; for non-speech data, noise is generated from an energy estimate of the background noise and smoothed. If the current size of the buffer is greater than the buffer target size, non-speech data in the buffer is deleted to reduce the current size of the buffer, and deletion stops once the difference between the current size of the buffer and the buffer target size is within the preset range.
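The adjustment rule of step 408 can be sketched as below. Several simplifications are assumed for illustration only: sizes are measured in whole frames, each frame is tagged as `"speech"` or `"non-speech"`, and the filler frame is supplied by the caller rather than synthesized by pitch-period repetition or noise generation. The function name `adjust_buffer` and its parameters are hypothetical.

```python
from typing import List, Tuple

Frame = Tuple[str, bytes]  # (kind, payload); kind is "speech" or "non-speech"


def adjust_buffer(frames: List[Frame], target: int, tolerance: int,
                  fill_frame: Frame) -> List[Frame]:
    """Return a copy of `frames` grown or shrunk toward `target` frames."""
    adjusted = list(frames)
    if len(adjusted) < target:
        # Too small: append generated filler frames (assumed: fill up to the target).
        adjusted.extend([fill_frame] * (target - len(adjusted)))
        return adjusted
    # Too large: delete non-speech frames, oldest first, until within tolerance.
    i = 0
    while len(adjusted) - target > tolerance and i < len(adjusted):
        if adjusted[i][0] == "non-speech":
            del adjusted[i]
        else:
            i += 1
    return adjusted


queue = [("speech", b"s1"), ("non-speech", b"n1"), ("speech", b"s2"),
         ("non-speech", b"n2"), ("non-speech", b"n3")]
shrunk = adjust_buffer(queue, target=3, tolerance=0, fill_frame=("non-speech", b""))
print([kind for kind, _ in shrunk])  # ['speech', 'speech', 'non-speech']
```

Speech frames are never deleted in this sketch, so the buffer can stay above the target when too few non-speech frames are available.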
  • The buffer target size is calculated from the probability distribution of the time interval and the allowable jitter probability, and the current size of the buffer is adjusted according to the buffer target size, so that the size of the buffer is dynamically adjusted with network jitter, thereby resolving network jitter.
  • FIG. 5 is a flowchart of Embodiment 5 of a network jitter processing method according to an exemplary embodiment. As shown in FIG. 5, the method may include the following steps:
  • In step 501, a plurality of voice data packets are received, and the receiving moment of each voice data packet is recorded.
  • In step 502, each of the plurality of voice data packets is decoded to obtain voice data or non-speech data, and the voice data or the non-speech data is stored in the buffer.
  • Step 502 is consistent with the description of FIG. 1B in Embodiment 1, and details are not repeated here.
  • In step 503, the time interval between the reception of every two adjacent voice data packets among the plurality of voice data packets is calculated according to the receiving moments.
  • In step 504, the number $N_i$ of time intervals falling into each preset interval is counted over all time intervals, where $i \ge 0$ is the identification number of the preset interval and $N_i \ge 0$.
  • In step 505, the probability that the time interval falls within each preset interval is calculated according to each $N_i$ and the total number N of the voice data packets.
  • In step 506, when $N \ge N_{max}$, for any n with $n \ge N_{max}$, the number of time intervals corresponding to the preset interval to which the $(n - N_{max} + 1)$-th time interval belongs is decremented by 1, where $N_{max}$ is a threshold of the total number of voice data packets.
  • In step 507, the m that satisfies the condition […] is calculated, where P represents a known allowable jitter probability.
  • The allowable jitter probability indicates the probability of network jitter that the buffer can overcome. Given the probability $\sum_{i=0}^{m} p_i$ that the random variable J falls into the first m+1 preset intervals, the probability $\sum_{i=0}^{m+1} p_i$ that it falls into the first m+2 preset intervals, and the known preset allowable jitter probability P, the m that satisfies the condition […] is calculated.
  • In step 508, the sum of the sizes of the 0th through m-th preset intervals of the plurality of preset intervals is used as the buffer target size.
  • In step 509, if the current size of the buffer is smaller than the buffer target size and the effective length of the buffer is 0, data is generated in the buffer for frame filling; if the current size of the buffer is greater than the buffer target size, non-speech data in the buffer is deleted so that the difference between the current size of the buffer and the buffer target size is within a preset range.
  • Suppose the buffer target length is 3. If the network is blocked, the effective data length in the buffer drops to 2 and then 1, and no frame filling is needed yet. If the blocked packets arrive at this point, the effective length of the buffer increases again. If the blocked packets still do not arrive and the effective length of the buffer becomes 0, frame filling is required.
  • The target length of the buffer will become longer, for example 5, which means that when the network is blocked the buffer still has 4 frames of data to play; after those 4 frames are played, if the blocked data packets still have not arrived, frame filling is needed. A longer buffer target length therefore reduces the probability of frame filling and improves sound quality.
  • The packet loss tolerance is unchanged, and the target length of the buffer will become smaller, for example 1, meaning that the delay introduced by the buffer is only one frame. The delay becomes shorter, but if the network is blocked, frame filling is needed after 1 frame of data is played.
  • The packet loss tolerance becomes larger: the target length of the buffer will become longer, which means that the delay increases but the probability of frame filling decreases. This is the cost of the user setting a larger packet loss probability.
  • The packet loss tolerance becomes smaller: the target length of the buffer will become smaller, which means that the delay is reduced but the probability of frame filling increases, because the user accepts more lost-packet frames in exchange for less delay.
  • the RTP message receiving module of the receiving end transmits the received voice data packet as an RTP message to the decoder, and the decoder
  • the RTP message is decoded to obtain voice data or non-speech data, and the voice data or non-speech data is put into the buffer, and the dynamic adjustment of the buffer realizes the processing of the network jitter.
  • FIG. 6 is a block diagram of a first embodiment of a network jitter processing apparatus according to an exemplary embodiment. As shown in FIG. 6, the apparatus includes a receiving recording module 61, a calculating module 62, a statistics module 63, and an adjusting module 64.
  • the receiving and recording module 61 is configured to receive a plurality of voice data packets and record the reception time of each voice data packet;
  • the calculation module 62 is configured to calculate, from the reception times, the time intervals between the reception of every two adjacent voice data packets among the plurality of voice data packets;
  • the statistics module 63 is configured to compile a probability distribution of the time intervals according to the preset interval to which each time interval belongs;
  • the calculation module 62 is further configured to calculate a buffer target size from the probability distribution and the allowable jitter probability;
  • the adjustment module 64 is configured to adjust the buffer according to the buffer target size to resolve network jitter.
  • In this embodiment, the probability distribution of the time intervals at which the receiving end receives voice data packets changes as network jitter changes, and that time interval is a random variable.
  • This embodiment calculates the buffer target size from the probability distribution of the time intervals and the allowable jitter probability, so the buffer target size tracks network jitter. Adjusting the buffer according to the buffer target size makes the buffer size track network jitter as well. Because the size of the buffer is closely tied to the amount of network jitter it can absorb, adjusting the buffer to the buffer target size resolves network jitter.
  • FIG. 7 is a block diagram of a second embodiment of a network jitter processing apparatus according to an exemplary embodiment.
  • As shown in FIG. 7, on the basis of the embodiment shown in FIG. 6, the n-th time interval is J_n = R_n - R_{n-1}, where R_n denotes the reception time of the n-th voice data packet, R_{n-1} denotes the reception time of the (n-1)-th voice data packet, 1 ≤ n ≤ (N-1), and N denotes the total number of voice data packets; there are multiple preset intervals, and the i-th preset interval is given by Figure PCTCN2016097257-appb-000028, where
  • T is the fixed time interval at which the transmitting end sends the voice data packets, and
  • K is a positive integer.
  • The statistics module 63 includes a first statistics sub-module 631, a first calculation sub-module 632, and a second statistics sub-module 633.
  • The first statistics sub-module 631 is configured to count, among all the time intervals, the number N_i (i ≥ 0) of time intervals that fall into each preset interval, where i is the identification number of the preset interval and N_i ≥ 0;
  • the first calculation sub-module 632 is configured to calculate, from each N_i and the total number N of voice data packets, the probability that a time interval falls into each preset interval (Figure PCTCN2016097257-appb-000029);
  • the second statistics sub-module 633 is configured to, when N ≥ N_max, for any n with n ≥ N_max, decrement by 1 the count of time intervals for the preset interval to which the (n - N_max + 1)-th time interval belongs, where N_max is the threshold on the total number of voice data packets.
  • The calculation module 62 includes a second calculation sub-module 621 and a third calculation sub-module 622.
  • The second calculation sub-module 621 is configured to calculate the m that satisfies the condition in Figure PCTCN2016097257-appb-000030, where P denotes the known allowable jitter probability;
  • the third calculation sub-module 622 is configured to calculate the sum of the sizes of the 0th through the m-th consecutive preset intervals, the sum being the buffer target size.
  • the adjustment module 64 includes a first adjustment sub-module 641 and a second adjustment sub-module 642.
  • the first adjustment sub-module 641 is configured to generate data in the buffer to complete the frame if the current size of the buffer is smaller than the buffer target size and the effective length of the buffer is 0;
  • the second adjustment sub-module 642 is configured to delete non-speech data in the buffer if the current size of the buffer is greater than the buffer target size, so that the current size of the buffer and the buffer The difference in target size is within the preset range.
  • The apparatus further includes a decoding module 65 configured to, after the receiving and recording module 61 receives the plurality of voice data packets and records the reception time of each voice data packet, decode each of the plurality of voice data packets to obtain speech data or non-speech data, and store the speech data or the non-speech data in the buffer.
  • the network jitter processing apparatus provided in this embodiment may be used to implement the technical solution of the method embodiment shown in any of FIG. 2 to FIG. 5.
  • In this embodiment, the number of time intervals falling into each preset interval is counted, and the ratio of that count to the total number of voice data packets is taken as the probability that a time interval falls into that preset interval; the probabilities for all preset intervals together give the probability distribution of the time intervals.
  • Each time a new time interval is added, the oldest time interval is removed, so that the total number of time intervals distributed across the preset intervals stays constant as the receiving end receives more voice data packets.
  • The buffer target size is calculated from the probability distribution of the time intervals and the allowable jitter probability, and the current size of the buffer is adjusted according to the buffer target size, so that the buffer size is adjusted dynamically as network jitter changes, which resolves network jitter.
  • A buffer is added between the decoder and the playback module at the receiving end. The RTP packet receiving module at the receiving end passes the received voice data packets, specifically RTP packets, to the decoder; the decoder decodes the RTP packets to obtain speech data or non-speech data and places it in the buffer; and dynamic adjustment of the buffer handles network jitter.
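  • The internal functions and structure of the network jitter processing apparatus are described above. As shown in FIG. 8, in practice the network jitter processing apparatus can be implemented as a terminal device, including:
  • a processor;
  • a memory configured to store processor-executable instructions;
  • wherein the processor is configured to: receive a plurality of voice data packets and record the reception time of each voice data packet; calculate, from the reception times, the time intervals between the reception of every two adjacent voice data packets among the plurality of voice data packets; compile a probability distribution of the time intervals according to the preset interval to which each time interval belongs; calculate a buffer target size from the probability distribution and the allowable jitter probability; and adjust the buffer according to the buffer target size to resolve network jitter.
  • In this embodiment, the probability distribution of the time intervals at which the receiving end receives voice data packets changes as network jitter changes, and that time interval is a random variable.
  • This embodiment calculates the buffer target size from the probability distribution of the time intervals and the allowable jitter probability, so the buffer target size tracks network jitter. Adjusting the buffer according to the buffer target size makes the buffer size track network jitter as well. Because the size of the buffer is closely tied to the amount of network jitter it can absorb, adjusting the buffer to the buffer target size resolves network jitter.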
  • FIG. 9 is a block diagram of another terminal device, according to an exemplary embodiment.
  • the terminal device 800 can be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, a fitness device, a personal digital assistant, and the like.
  • the terminal device 800 may include one or more of the following components: a processing component 802, a memory 804, power component 806, multimedia component 808, audio component 810, input/output (I/O) interface 812, sensor component 814, and communication component 816.
  • Processing component 802 typically controls the overall operation of terminal device 800, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations.
  • Processing component 802 can include one or more processors 820 to execute instructions to perform all or part of the steps of the above described methods.
  • processing component 802 can include one or more modules to facilitate interaction between component 802 and other components.
  • processing component 802 can include a multimedia module to facilitate interaction between multimedia component 808 and processing component 802.
  • Memory 804 is configured to store various types of data to support operation at device 800. Examples of such data include instructions for any application or method operating on terminal device 800, contact data, phone book data, messages, pictures, videos, and the like.
  • The memory 804 can be implemented by any type of volatile or non-volatile storage device, or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disk.
  • Power component 806 provides power to various components of terminal device 800.
  • Power component 806 can include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for terminal device 800.
  • the multimedia component 808 includes a screen that provides an output interface between the terminal device 800 and a user.
  • the screen can include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen can be implemented as a touch screen to receive input signals from the user.
  • the touch panel includes one or more touch sensors to sense touches, slides, and gestures on the touch panel. The touch sensor may sense not only the boundary of the touch or sliding action, but also the duration and pressure associated with the touch or slide operation.
  • the multimedia component 808 includes a front camera and/or a rear camera. When the device 800 is in an operation mode, such as a shooting mode or a video mode, the front camera and/or the rear camera can receive external multimedia data. Each front and rear camera can be a fixed optical lens system or have focal length and optical zoom capabilities.
  • the audio component 810 is configured to output and/or input an audio signal.
  • the audio component 810 includes a microphone (MIC) that is configured to receive an external audio signal when the terminal device 800 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode.
  • the received audio signal may be further stored in memory 804 or transmitted via communication component 816.
  • the audio component 810 also includes a speaker for outputting an audio signal.
  • the I/O interface 812 provides an interface between the processing component 802 and the peripheral interface module, which may be a keyboard, a click wheel, a button, or the like. These buttons may include, but are not limited to, a home button, a volume button, a start button, and a lock button.
  • Sensor assembly 814 includes one or more sensors for providing terminal device 800 with a status assessment of various aspects.
  • For example, the sensor component 814 can detect the on/off state of the device 800 and the relative positioning of components, such as the display and keypad of the terminal device 800.
  • The sensor component 814 can also detect a change in position of the terminal device 800 or of one of its components, the presence or absence of user contact with the terminal device 800, the orientation or acceleration/deceleration of the terminal device 800, and changes in the temperature of the terminal device 800.
  • Sensor assembly 814 can include a proximity sensor configured to detect the presence of nearby objects without any physical contact.
  • Sensor assembly 814 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications.
  • the sensor assembly 814 can also include an acceleration sensor, a gyro sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
  • Communication component 816 is configured to facilitate wired or wireless communication between terminal device 800 and other devices.
  • the terminal device 800 can access a wireless network based on a communication standard such as WiFi, 2G or 3G, or a combination thereof.
  • communication component 816 receives broadcast signals or broadcast associated information from an external broadcast management system via a broadcast channel.
  • In an exemplary embodiment, the communication component 816 also includes a near field communication (NFC) module to facilitate short-range communication.
  • For example, the NFC module can be implemented based on radio frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
  • In an exemplary embodiment, the terminal device 800 may be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components, for performing the above methods.
  • In an exemplary embodiment, there is also provided a non-transitory computer-readable storage medium comprising instructions, such as the memory 804 comprising instructions, which are executable by the processor 820 of the terminal device 800 to perform the above method.
  • For example, the non-transitory computer-readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, or the like.
  • A non-transitory computer-readable storage medium, wherein instructions in the storage medium, when executed by a processor of a mobile terminal, enable the mobile terminal to perform a network jitter processing method, the method comprising:
  • detecting the gesture of the user when taking a photo, the user's terminal device being in a horizontally placed state when the photo is taken;
  • determining, according to a correspondence between shooting gestures and photo display directions, the photo display direction corresponding to the detected gesture, the photo display direction being the display direction of the photo relative to the frame of the terminal device in the horizontally placed state;
  • displaying the photo according to the photo display direction and the direction of gravity.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Multimedia (AREA)
  • Business, Economics & Management (AREA)
  • General Business, Economics & Management (AREA)
  • Environmental & Geological Engineering (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)
  • Telephonic Communication Services (AREA)

Abstract

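The present disclosure relates to a network jitter processing method, apparatus, and terminal device. The method includes: receiving a plurality of voice data packets and recording the reception time of each voice data packet; calculating, from the reception times, the time intervals between the reception of every two adjacent voice data packets among the plurality of voice data packets; compiling a probability distribution of the time intervals according to the preset interval to which each time interval belongs; calculating a buffer target size from the probability distribution and an allowable jitter probability; and adjusting the buffer according to the buffer target size to resolve network jitter. (FIG. 1A)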

Description

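Network jitter processing method, apparatus, and terminal device
This application is based on and claims priority to Chinese patent application No. 201510967352.9, filed on December 21, 2015, the entire contents of which are incorporated herein by reference.
Technical Field
The present disclosure relates to the technical field of smart terminal devices, and in particular to a network jitter processing method, apparatus, and terminal device.
Background
With growing network bandwidth and the spread of the mobile Internet, real-time voice communication applications that run over the Internet on mobile phones have developed rapidly. Compared with wired networks, wireless networks have a larger impact, in terms of packet loss rate, network delay, and network jitter, on the quality of real-time voice communication services such as Voice over Internet Protocol (VoIP).
In VoIP voice communication, the transmitting end generally uses a fixed-frame-rate speech coding algorithm and sends data at uniform time intervals. In this case, the factor that most affects voice quality at the receiving end is not the end-to-end transmission delay but the jitter in the time intervals at which the receiving end receives data. The related art, however, lacks a method that resolves network jitter based on that jitter.
Summary
The present disclosure provides a network jitter processing method, apparatus, and terminal device for resolving network jitter based on the jitter in the time intervals at which the receiving end receives data.
According to a first aspect of embodiments of the present disclosure, a network jitter processing method is provided, including:
receiving a plurality of voice data packets and recording the reception time of each voice data packet;
calculating, from the reception times, the time intervals between the reception of every two adjacent voice data packets among the plurality of voice data packets;
compiling a probability distribution of the time intervals according to the preset interval to which each time interval belongs;
calculating a buffer target size from the probability distribution and an allowable jitter probability;
adjusting the buffer according to the buffer target size to resolve network jitter.
This technical solution may provide the following benefits: the probability distribution of the time intervals at which the receiving end receives voice data packets changes as network jitter changes, and that time interval is a random variable. This embodiment calculates the buffer target size from the probability distribution of the time intervals and the allowable jitter probability, so the buffer target size tracks network jitter. Adjusting the buffer according to the buffer target size makes the buffer size track network jitter as well. Because the size of the buffer is closely tied to the amount of network jitter it can absorb, adjusting the buffer to the buffer target size resolves network jitter.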
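In a first possible implementation of the first aspect, the n-th time interval is J_n = R_n - R_{n-1}, where R_n denotes the reception time of the n-th voice data packet, R_{n-1} denotes the reception time of the (n-1)-th voice data packet, 1 ≤ n ≤ (N-1), and N denotes the total number of voice data packets;
there are multiple preset intervals, and the i-th preset interval is
Figure PCTCN2016097257-appb-000001
where T is the fixed time interval at which the transmitting end sends the voice data packets, and K is a positive integer;
compiling the probability distribution of the time intervals according to the preset interval to which each time interval belongs includes:
counting, among all the time intervals, the number N_i (i ≥ 0) of time intervals that fall into each preset interval, where i is the identification number of the preset interval and N_i ≥ 0;
calculating, from each N_i and the total number N of voice data packets, the probability that a time interval falls into each preset interval:
Figure PCTCN2016097257-appb-000002
This technical solution may provide the following benefits: the number of time intervals falling into each preset interval is counted, and the ratio of that count to the total number of voice data packets is taken as the probability that a time interval falls into that preset interval. The probabilities for all preset intervals together give the probability distribution of the time intervals.
According to the first possible implementation of the first aspect, in a second possible implementation of the first aspect, when N ≥ N_max, for any n with n ≥ N_max, the count of time intervals for the preset interval to which the (n - N_max + 1)-th time interval belongs is decremented by 1, where N_max is the threshold on the total number of voice data packets.
This technical solution may provide the following benefits: each time a new time interval is added, the oldest time interval is removed, so that the total number of time intervals distributed across the preset intervals stays constant as the receiving end receives more voice data packets.
According to the second possible implementation of the first aspect, in a third possible implementation of the first aspect, calculating the buffer target size from the probability distribution and the allowable jitter probability includes:
calculating the m that satisfies the condition
Figure PCTCN2016097257-appb-000003
where P denotes the known allowable jitter probability;
taking the sum of the sizes of the 0th through the m-th consecutive preset intervals as the buffer target size.
According to any one of the first aspect through the third possible implementation of the first aspect, in a fourth possible implementation of the first aspect, adjusting the buffer according to the buffer target size to resolve network jitter includes:
if the current size of the buffer is smaller than the buffer target size and the valid length of the buffer is 0, generating data in the buffer to fill frames;
if the current size of the buffer is larger than the buffer target size, deleting non-speech data from the buffer so that the difference between the current size of the buffer and the buffer target size falls within a preset range.
This technical solution may provide the following benefits: the buffer target size is calculated from the probability distribution of the time intervals and the allowable jitter probability, and the current size of the buffer is adjusted according to the buffer target size, so that the buffer size is adjusted dynamically as network jitter changes, which resolves network jitter.
According to the fourth possible implementation of the first aspect, in a fifth possible implementation of the first aspect, after receiving the plurality of voice data packets and recording the reception time of each voice data packet, the method further includes:
decoding each of the plurality of voice data packets to obtain speech data or non-speech data, and storing the speech data or the non-speech data in the buffer.
This technical solution may provide the following benefits: a buffer is added between the decoder and the playback module at the receiving end. The RTP packet receiving module at the receiving end passes the received voice data packets, specifically RTP packets, to the decoder; the decoder decodes the RTP packets to obtain speech data or non-speech data and places it in the buffer; and dynamic adjustment of the buffer handles network jitter.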
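According to a second aspect of embodiments of the present disclosure, a network jitter processing apparatus is provided, including:
a receiving and recording module configured to receive a plurality of voice data packets and record the reception time of each voice data packet;
a calculation module configured to calculate, from the reception times, the time intervals between the reception of every two adjacent voice data packets among the plurality of voice data packets;
a statistics module configured to compile a probability distribution of the time intervals according to the preset interval to which each time interval belongs;
the calculation module being further configured to calculate a buffer target size from the probability distribution and an allowable jitter probability;
an adjustment module configured to adjust the buffer according to the buffer target size to resolve network jitter.
This technical solution may provide the following benefits: the probability distribution of the time intervals at which the receiving end receives voice data packets changes as network jitter changes, and that time interval is a random variable. This embodiment calculates the buffer target size from the probability distribution of the time intervals and the allowable jitter probability, so the buffer target size tracks network jitter. Adjusting the buffer according to the buffer target size makes the buffer size track network jitter as well. Because the size of the buffer is closely tied to the amount of network jitter it can absorb, adjusting the buffer to the buffer target size resolves network jitter.
In a first possible implementation of the second aspect, the n-th time interval is J_n = R_n - R_{n-1}, where R_n denotes the reception time of the n-th voice data packet, R_{n-1} denotes the reception time of the (n-1)-th voice data packet, 1 ≤ n ≤ (N-1), and N denotes the total number of voice data packets;
there are multiple preset intervals, and the i-th preset interval is
Figure PCTCN2016097257-appb-000004
where T is the fixed time interval at which the transmitting end sends the voice data packets, and K is a positive integer;
the statistics module includes:
a first statistics sub-module configured to count, among all the time intervals, the number N_i (i ≥ 0) of time intervals that fall into each preset interval, where i is the identification number of the preset interval and N_i ≥ 0;
a first calculation sub-module configured to calculate, from each N_i and the total number N of voice data packets, the probability that a time interval falls into each preset interval:
Figure PCTCN2016097257-appb-000005
This technical solution may provide the following benefits: the number of time intervals falling into each preset interval is counted, and the ratio of that count to the total number of voice data packets is taken as the probability that a time interval falls into that preset interval. The probabilities for all preset intervals together give the probability distribution of the time intervals.
According to the first possible implementation of the second aspect, in a second possible implementation of the second aspect, the statistics module further includes:
a second statistics sub-module configured to, when N ≥ N_max, for any n with n ≥ N_max, decrement by 1 the count of time intervals for the preset interval to which the (n - N_max + 1)-th time interval belongs, where N_max is the threshold on the total number of voice data packets.
This technical solution may provide the following benefits: each time a new time interval is added, the oldest time interval is removed, so that the total number of time intervals distributed across the preset intervals stays constant as the receiving end receives more voice data packets.
According to the second possible implementation of the second aspect, in a third possible implementation of the second aspect, the calculation module includes:
a second calculation sub-module configured to calculate the m that satisfies the condition
Figure PCTCN2016097257-appb-000006
where P denotes the known allowable jitter probability;
a third calculation sub-module configured to calculate the sum of the sizes of the 0th through the m-th consecutive preset intervals, the sum being the buffer target size.
According to any one of the second aspect through the third possible implementation of the second aspect, in a fourth possible implementation of the second aspect, the adjustment module includes:
a first adjustment sub-module configured to, if the current size of the buffer is smaller than the buffer target size and the valid length of the buffer is 0, generate data in the buffer to fill frames;
a second adjustment sub-module configured to, if the current size of the buffer is larger than the buffer target size, delete non-speech data from the buffer so that the difference between the current size of the buffer and the buffer target size falls within a preset range.
This technical solution may provide the following benefits: the buffer target size is calculated from the probability distribution of the time intervals and the allowable jitter probability, and the current size of the buffer is adjusted according to the buffer target size, so that the buffer size is adjusted dynamically as network jitter changes, which resolves network jitter.
According to the fourth possible implementation of the second aspect, in a fifth possible implementation of the second aspect, the apparatus further includes:
a decoding module configured to, after the receiving and recording module receives the plurality of voice data packets and records the reception time of each voice data packet, decode each of the plurality of voice data packets to obtain speech data or non-speech data, and store the speech data or the non-speech data in the buffer.
This technical solution may provide the following benefits: a buffer is added between the decoder and the playback module at the receiving end. The RTP packet receiving module at the receiving end passes the received voice data packets, specifically RTP packets, to the decoder; the decoder decodes the RTP packets to obtain speech data or non-speech data and places it in the buffer; and dynamic adjustment of the buffer handles network jitter.
According to a third aspect of embodiments of the present disclosure, a terminal device is provided, including:
a processor;
a memory configured to store processor-executable instructions;
wherein the processor is configured to:
receive a plurality of voice data packets and record the reception time of each voice data packet;
calculate, from the reception times, the time intervals between the reception of every two adjacent voice data packets among the plurality of voice data packets;
compile a probability distribution of the time intervals according to the preset interval to which each time interval belongs;
calculate a buffer target size from the probability distribution and an allowable jitter probability;
adjust the buffer according to the buffer target size to resolve network jitter.
It should be understood that the foregoing general description and the following detailed description are exemplary and explanatory only and do not limit the present disclosure.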
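Brief Description of the Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the specification, serve to explain the principles of the present disclosure.
FIG. 1A is a flowchart of Embodiment 1 of a network jitter processing method according to an exemplary embodiment;
FIG. 1B is a schematic diagram of the receiving end in the embodiment shown in FIG. 1A;
FIG. 2 is a flowchart of Embodiment 2 of a network jitter processing method according to an exemplary embodiment;
FIG. 3 is a flowchart of Embodiment 3 of a network jitter processing method according to an exemplary embodiment;
FIG. 4 is a flowchart of Embodiment 4 of a network jitter processing method according to an exemplary embodiment;
FIG. 5 is a flowchart of Embodiment 5 of a network jitter processing method according to an exemplary embodiment;
FIG. 6 is a block diagram of Embodiment 1 of a network jitter processing apparatus according to an exemplary embodiment;
FIG. 7 is a block diagram of Embodiment 2 of a network jitter processing apparatus according to an exemplary embodiment;
FIG. 8 is a block diagram of a terminal device according to an exemplary embodiment;
FIG. 9 is a block diagram of another terminal device according to an exemplary embodiment.
The above drawings show specific embodiments of the present disclosure, which are described in more detail below. The drawings and the written description are not intended to limit the scope of the inventive concept in any way, but to explain the concept of the present disclosure to those skilled in the art by reference to specific embodiments.
Detailed Description
Exemplary embodiments are described in detail here, with examples shown in the accompanying drawings. Where the following description refers to the drawings, the same numerals in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatuses and methods consistent with some aspects of the present disclosure as detailed in the appended claims.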
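FIG. 1A is a flowchart of Embodiment 1 of a network jitter processing method according to an exemplary embodiment. The method may be performed by a network jitter processing apparatus, which may be integrated into a terminal device. As shown in FIG. 1A, the method includes the following steps:
In step 101, a plurality of voice data packets are received and the reception time of each voice data packet is recorded.
In this embodiment, two terminal devices communicate by voice over VoIP using the Real-time Transport Protocol (RTP). While the user at the receiving end listens to the user at the transmitting end speak, the transmitting end sends a plurality of RTP packets, each containing voice data, to the receiving end; the receiving end receives the RTP packets and records the reception time of each one.
In step 102, the time intervals between the reception of every two adjacent voice data packets among the plurality of voice data packets are calculated from the reception times.
The n-th time interval is J_n = R_n - R_{n-1}, where R_n denotes the reception time of the n-th voice data packet, R_{n-1} denotes the reception time of the (n-1)-th voice data packet, 1 ≤ n ≤ (N-1), and N denotes the total number of voice data packets.
The time intervals between all pairs of adjacent voice data packets are calculated from the reception time of each RTP packet. For example, if the (n-1)-th voice data packet is received at R_{n-1} and the n-th at R_n, then J_n = R_n - R_{n-1} is the n-th time interval; with N RTP packets in total, 1 ≤ n ≤ (N-1).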
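The interval calculation of step 102 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patent's implementation: the function name `inter_arrival_intervals` and the millisecond timestamps are hypothetical and chosen only for the example.

```python
def inter_arrival_intervals(receive_times):
    """Return J_n = R_n - R_{n-1} for every pair of adjacent packets.

    receive_times holds the recorded reception times R_0 .. R_{N-1};
    N packets yield N-1 intervals.
    """
    return [later - earlier
            for earlier, later in zip(receive_times, receive_times[1:])]


# Hypothetical reception times in milliseconds for five packets.
arrivals_ms = [0, 20, 45, 60, 100]
intervals = inter_arrival_intervals(arrivals_ms)
```

With a sender that emits one packet every 20 ms, intervals that stray from 20 ms are the jitter the later steps try to absorb.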
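In step 103, a probability distribution of the time intervals is compiled according to the preset interval to which each time interval belongs.
In this embodiment, there are multiple preset intervals, and the i-th preset interval is
Figure PCTCN2016097257-appb-000007
where T is the fixed time interval at which the transmitting end sends the voice data packets, K is a positive integer, and i ≥ 0. The number of time intervals falling into each preset interval is counted; that is, for each preset interval
Figure PCTCN2016097257-appb-000008
the number N_i of time intervals that fall into that preset interval is counted. Because J_n varies with n, the time interval J between the reception of two adjacent voice data packets is a random variable. The ratio of the count N_i for a preset interval to the total number N of voice data packets is the probability p_i that the random variable J falls into that preset interval, and the probabilities p_i (i ≥ 0) for all preset intervals give the probability distribution of J. Preferably, K = 8. A larger K divides the statistical intervals more finely and yields a more accurate probability distribution, but if K is too large and not enough packets have been counted, some positions of the distribution will hold too few samples and the data will be distorted.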
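The counting in step 103 can be sketched as a histogram. The formula for the i-th preset interval survives here only as an image reference, so this sketch assumes equal-width intervals of size T/K, with interval i covering [i*T/K, (i+1)*T/K); that layout, and the names `interval_histogram` and `interval_probabilities`, are assumptions made for illustration. Following the text, each count N_i is divided by the total number of packets N.

```python
from collections import Counter


def interval_histogram(intervals, T, K):
    """Count how many time intervals fall into each preset interval.

    Assumption: preset interval i covers [i*T/K, (i+1)*T/K).
    """
    counts = Counter()
    for j in intervals:
        counts[int(j * K // T)] += 1
    return counts


def interval_probabilities(counts, total_packets):
    """p_i = N_i / N, where N is the total number of packets (per the text)."""
    return {i: n / total_packets for i, n in counts.items()}


# Six packets give five intervals; T = 20 ms and K = 8 (the preferred value).
counts = interval_histogram([20, 25, 15, 40, 20], T=20, K=8)
probs = interval_probabilities(counts, total_packets=6)
```

Because N packets produce only N-1 intervals, the probabilities computed this way sum to (N-1)/N rather than exactly 1; the difference becomes negligible as N grows.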
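In step 104, a buffer target size is calculated from the probability distribution and the allowable jitter probability.
In this embodiment, the allowable jitter probability is the probability of network jitter that the buffer can absorb. The probabilities that the random variable J falls into the first m+1 preset intervals are summed,
Figure PCTCN2016097257-appb-000009
as are the probabilities for the first m+2 preset intervals,
Figure PCTCN2016097257-appb-000010
and if the allowable jitter probability P satisfies the condition
Figure PCTCN2016097257-appb-000011
then the sum of the sizes of the 0th through the m-th consecutive preset intervals is the buffer target size.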
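The search for m in step 104 can be sketched as a cumulative sum. The exact condition is only available as an image reference, so this sketch assumes it reads sum(p_0..p_m) ≤ P ≤ sum(p_0..p_{m+1}) and that every preset interval has width T/K; both are assumptions, as are the function names and the two fallback branches. `Fraction` is used so the boundary comparisons are exact.

```python
from fractions import Fraction


def find_m(probs, P):
    """Smallest m with sum(p_0..p_m) <= P <= sum(p_0..p_{m+1}) (assumed form)."""
    if P < probs[0]:
        return 0  # assumed fallback: P lies below the first interval's mass
    lower = 0
    for m in range(len(probs) - 1):
        lower += probs[m]
        upper = lower + probs[m + 1]
        if lower <= P <= upper:
            return m
    return len(probs) - 1  # assumed fallback: P exceeds all recorded mass


def buffer_target_size(m, T, K):
    """Sum of the sizes of intervals 0..m, assuming each has width T/K."""
    return (m + 1) * Fraction(T, K)


# Hypothetical distribution concentrated around interval 8 (T = 20 ms, K = 8).
probs = [Fraction(0)] * 6 + [Fraction(1, 10), Fraction(2, 10),
                             Fraction(4, 10), Fraction(2, 10), Fraction(1, 10)]
m = find_m(probs, Fraction(95, 100))
target_ms = buffer_target_size(m, T=20, K=8)
```

In this example the cumulative probability reaches 0.9 at interval 9 and 1.0 at interval 10, so an allowable jitter probability of 0.95 selects m = 9 and a target of 25 ms.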
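In step 105, the buffer is adjusted according to the buffer target size to resolve network jitter.
Referring to FIG. 1B, a buffer is added between the decoder and the playback module at the receiving end. The RTP packet receiving module at the receiving end passes the received voice data packets, specifically RTP packets, to the decoder, which decodes them to obtain speech data or non-speech data and places it in the buffer. Preferably, the speech or non-speech data obtained by decoding one RTP packet has length L and the total size of the buffer is K*L; that is, the buffer is divided into K blocks of length L each, numbered 0, 1, 2, ..., K-1. The playback module periodically reads data from the buffer for playback. Suppose the playback module reads from the block numbered P_read and the decoder writes its decoded output to the block numbered P_write; then P_write - P_read is the valid data length of the buffer and also represents the delay introduced by the buffer.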
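The block-structured buffer of FIG. 1B can be sketched as a small ring buffer. The text does not say how P_read and P_write wrap around, so this sketch assumes monotonically increasing counters that are reduced modulo K only when indexing a block; the class name `BlockBuffer` and its methods are hypothetical.

```python
class BlockBuffer:
    """K fixed-size blocks shared by a decoder (writer) and a player (reader)."""

    def __init__(self, num_blocks):
        self.blocks = [None] * num_blocks
        self.p_read = 0   # next block the playback module will read
        self.p_write = 0  # next block the decoder will write

    def valid_length(self):
        # P_write - P_read: buffered frames, i.e. the delay in frames.
        return self.p_write - self.p_read

    def write(self, frame):
        if self.valid_length() == len(self.blocks):
            raise OverflowError("buffer full")
        self.blocks[self.p_write % len(self.blocks)] = frame
        self.p_write += 1

    def read(self):
        if self.valid_length() == 0:
            return None  # nothing buffered; the caller would need to fill a frame
        frame = self.blocks[self.p_read % len(self.blocks)]
        self.p_read += 1
        return frame


buf = BlockBuffer(num_blocks=8)
for name in ("f0", "f1", "f2"):
    buf.write(name)
first = buf.read()
```

Keeping the counters unbounded makes the valid length a plain subtraction, which matches the P_write - P_read expression in the text.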
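Because the size of the buffer is closely tied to the amount of network jitter it can absorb, the buffer is adjusted according to the buffer target size to resolve network jitter.
In this embodiment, the probability distribution of the time intervals at which the receiving end receives voice data packets changes as network jitter changes, and that time interval is a random variable. This embodiment calculates the buffer target size from the probability distribution of the time intervals and the allowable jitter probability, so the buffer target size tracks network jitter. Adjusting the buffer according to the buffer target size makes the buffer size track network jitter as well. Because the size of the buffer is closely tied to the amount of network jitter it can absorb, adjusting the buffer to the buffer target size resolves network jitter.
FIG. 2 is a flowchart of Embodiment 2 of a network jitter processing method according to an exemplary embodiment. As shown in FIG. 2, the method may include the following steps:
In step 201, a plurality of voice data packets are received and the reception time of each voice data packet is recorded.
In step 202, the time intervals between the reception of every two adjacent voice data packets among the plurality of voice data packets are calculated from the reception times.
In step 203, the number N_i (i ≥ 0) of time intervals falling into each preset interval is counted among all the time intervals, where i is the identification number of the preset interval and N_i ≥ 0.
In this embodiment, there are multiple preset intervals, and the i-th preset interval is
Figure PCTCN2016097257-appb-000012
with i ≥ 0. If there are N RTP packets in total, the time intervals are J_1, J_2, ..., J_{N-1}. The number of values among J_1, J_2, ..., J_{N-1} that fall into each preset interval is counted, with N_i denoting the number that fall into the i-th preset interval. For example, if among J_1, J_2, ..., J_{N-1} only J_2, J_3, and J_10 satisfy the condition
Figure PCTCN2016097257-appb-000013
Figure PCTCN2016097257-appb-000014
that is, three of the values J_1, J_2, ..., J_{N-1} fall within the i-th preset interval, then N_i = 3. The same applies to the other preset intervals.
In step 204, the probability that a time interval falls into each preset interval is calculated from each N_i and the total number N of voice data packets:
Figure PCTCN2016097257-appb-000015
Step 203 yields the count for each preset interval. For example, if the i-th preset interval has a count of N_i, then
Figure PCTCN2016097257-appb-000016
is the probability that the random variable J falls into the i-th preset interval. In this way, the probability that J falls into each preset interval can be calculated.
In step 205, a buffer target size is calculated from the probability distribution and the allowable jitter probability.
In step 206, the buffer is adjusted according to the buffer target size to resolve network jitter.
In this embodiment, the number of time intervals falling into each preset interval is counted, and the ratio of that count to the total number of voice data packets is taken as the probability that a time interval falls into that preset interval. The probabilities for all preset intervals together give the probability distribution of the time intervals.
FIG. 3 is a flowchart of Embodiment 3 of a network jitter processing method according to an exemplary embodiment. As shown in FIG. 3, the method may include the following steps:
In step 301, a plurality of voice data packets are received and the reception time of each voice data packet is recorded.
In step 302, the time intervals between the reception of every two adjacent voice data packets among the plurality of voice data packets are calculated from the reception times.
In step 303, the number N_i (i ≥ 0) of time intervals falling into each preset interval is counted among all the time intervals, where i is the identification number of the preset interval and N_i ≥ 0.
In step 304, the probability that a time interval falls into each preset interval is calculated from each N_i and the total number N of voice data packets:
Figure PCTCN2016097257-appb-000017
In step 305, when N ≥ N_max, for any n with n ≥ N_max, the count of time intervals for the preset interval to which the (n - N_max + 1)-th time interval belongs is decremented by 1, where N_max is the threshold on the total number of voice data packets.
As a reasonable assumption, suppose that in this embodiment the receiving end has so far received 1000 voice data packets, numbered 0, 1, 2, ..., 999. These 1000 packets correspond to 999 time intervals, J_1, J_2, ..., J_999, and the threshold on the total number of voice data packets the receiving end can receive is 1000. When the receiving end receives the packet numbered 1000, J_1000 is calculated; if J_1000 belongs to the 10th preset interval, the count for the 10th preset interval is incremented by 1 and, at the same time, the count for the preset interval to which J_1 belongs is decremented by 1. Likewise, when the receiving end receives the packet numbered 1001, J_1001 is calculated; if J_1001 belongs to the 8th preset interval, the count for the 8th preset interval is incremented by 1 and the count for the preset interval to which J_2 belongs is decremented by 1. In other words, each time a new time interval is added, the oldest time interval is removed, so that the total number of time intervals distributed across the preset intervals stays constant as the receiving end receives more voice data packets.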
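The sliding-window bookkeeping of step 305 can be sketched with a queue that remembers which preset interval each recent time interval fell into. This is an illustrative sketch: the class name `SlidingHistogram` and the parameter `max_intervals` (N_max - 1 in the example above, since 1000 packets give 999 intervals) are assumptions.

```python
from collections import Counter, deque


class SlidingHistogram:
    """Per-interval counts over the most recent max_intervals time intervals."""

    def __init__(self, max_intervals):
        self.max_intervals = max_intervals
        self.bins = deque()      # preset-interval index of each stored J_n, oldest first
        self.counts = Counter()  # N_i for each preset interval i

    def add(self, bin_index):
        self.bins.append(bin_index)
        self.counts[bin_index] += 1
        if len(self.bins) > self.max_intervals:
            oldest = self.bins.popleft()
            self.counts[oldest] -= 1  # drop the longest-lived interval


hist = SlidingHistogram(max_intervals=3)
for b in (10, 8, 8):
    hist.add(b)
hist.add(9)  # window is full, so the oldest entry (interval 10) is removed
```

Once the window is full, every `add` performs one increment and one decrement, so the total count stays fixed and the distribution follows recent network conditions.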
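In step 306, a buffer target size is calculated from the probability distribution and the allowable jitter probability.
In step 307, the buffer is adjusted according to the buffer target size to resolve network jitter.
In this embodiment, each time a new time interval is added, the oldest time interval is removed, so that the total number of time intervals distributed across the preset intervals stays constant as the receiving end receives more voice data packets.
FIG. 4 is a flowchart of Embodiment 4 of a network jitter processing method according to an exemplary embodiment. As shown in FIG. 4, the method may include the following steps:
In step 401, a plurality of voice data packets are received and the reception time of each voice data packet is recorded.
In step 402, the time intervals between the reception of every two adjacent voice data packets among the plurality of voice data packets are calculated from the reception times.
In step 403, the number N_i (i ≥ 0) of time intervals falling into each preset interval is counted among all the time intervals, where i is the identification number of the preset interval and N_i ≥ 0.
In step 404, the probability that a time interval falls into each preset interval is calculated from each N_i and the total number N of voice data packets:
Figure PCTCN2016097257-appb-000018
In step 405, when N ≥ N_max, for any n with n ≥ N_max, the count of time intervals for the preset interval to which the (n - N_max + 1)-th time interval belongs is decremented by 1, where N_max is the threshold on the total number of voice data packets.
In step 406, the m that satisfies the condition
Figure PCTCN2016097257-appb-000019
is calculated, where P denotes the known allowable jitter probability.
The allowable jitter probability is the probability of network jitter that the buffer can absorb. The probabilities that the random variable J falls into the first m+1 preset intervals are summed,
Figure PCTCN2016097257-appb-000020
as are the probabilities for the first m+2 preset intervals,
Figure PCTCN2016097257-appb-000021
and, given the known preset allowable jitter probability P, the m that satisfies the condition
Figure PCTCN2016097257-appb-000022
is calculated.
In step 407, the sum of the sizes of the 0th through the m-th consecutive preset intervals is taken as the buffer target size.
In step 408, if the current size of the buffer is smaller than the buffer target size, data is generated in the buffer to fill frames; if the current size of the buffer is larger than the buffer target size, non-speech data is deleted from the buffer so that the difference between the current size of the buffer and the buffer target size falls within a preset range.
The buffer is adjusted according to the buffer target size to resolve network jitter, as follows. If the current size of the buffer is smaller than the buffer target size, data is generated in the buffer to fill frames: speech data is repeated according to the pitch period to keep the speech smooth, and for non-speech data, noise is generated from an energy estimate of the background noise and smoothed. If the current size of the buffer is larger than the buffer target size, non-speech data is deleted from the buffer to reduce its current size, and deletion stops once the difference between the current size of the buffer and the buffer target size falls within the preset range.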
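The two branches of the adjustment in step 408 can be sketched as follows. This is a simplified illustration: frames are modeled as hypothetical `(kind, payload)` tuples, `tolerance` stands in for the preset range, and the actual generation of fill frames (pitch-period repetition or comfort noise) is left out.

```python
def needs_fill(current_len, target_len):
    """True when the buffer is shorter than its target and frames must be generated."""
    return current_len < target_len


def trim_non_speech(frames, target_len, tolerance=0):
    """Drop non-speech frames, oldest first, until the buffer length is
    within `tolerance` frames of the target. Speech frames are never dropped."""
    excess = len(frames) - target_len - tolerance
    kept = []
    for kind, payload in frames:
        if excess > 0 and kind == "non_speech":
            excess -= 1  # delete this non-speech frame
            continue
        kept.append((kind, payload))
    return kept


frames = [("speech", "a"), ("non_speech", "n1"), ("speech", "b"),
          ("non_speech", "n2"), ("non_speech", "n3")]
trimmed = trim_non_speech(frames, target_len=3)
```

Deleting only non-speech frames shortens the delay without cutting into audible speech; if the buffer holds too few non-speech frames, it simply stays above the target until more of them arrive.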
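In this embodiment, the buffer target size is calculated from the probability distribution of the time intervals and the allowable jitter probability, and the current size of the buffer is adjusted according to the buffer target size, so that the buffer size is adjusted dynamically as network jitter changes, which resolves network jitter.
FIG. 5 is a flowchart of Embodiment 5 of a network jitter processing method according to an exemplary embodiment. As shown in FIG. 5, the method may include the following steps:
In step 501, a plurality of voice data packets are received and the reception time of each voice data packet is recorded.
In step 502, each of the plurality of voice data packets is decoded to obtain speech data or non-speech data, and the speech data or the non-speech data is stored in the buffer.
As shown in FIG. 1B, step 502 is consistent with the description of FIG. 1B in Embodiment 1 and is not repeated here.
In step 503, the time intervals between the reception of every two adjacent voice data packets among the plurality of voice data packets are calculated from the reception times.
In step 504, the number N_i (i ≥ 0) of time intervals falling into each preset interval is counted among all the time intervals, where i is the identification number of the preset interval and N_i ≥ 0.
In step 505, the probability that a time interval falls into each preset interval is calculated from each N_i and the total number N of voice data packets:
Figure PCTCN2016097257-appb-000023
In step 506, when N ≥ N_max, for any n with n ≥ N_max, the count of time intervals for the preset interval to which the (n - N_max + 1)-th time interval belongs is decremented by 1, where N_max is the threshold on the total number of voice data packets.
In step 507, the m that satisfies the condition
Figure PCTCN2016097257-appb-000024
is calculated, where P denotes the known allowable jitter probability.
The allowable jitter probability is the probability of network jitter that the buffer can absorb. The probabilities that the random variable J falls into the first m+1 preset intervals are summed,
Figure PCTCN2016097257-appb-000025
as are the probabilities for the first m+2 preset intervals,
Figure PCTCN2016097257-appb-000026
and, given the known preset allowable jitter probability P, the m that satisfies the condition
Figure PCTCN2016097257-appb-000027
is calculated.
In step 508, the sum of the sizes of the 0th through the m-th consecutive preset intervals is taken as the buffer target size.
In step 509, if the current size of the buffer is smaller than the buffer target size and the valid length of the buffer is 0, data is generated in the buffer to fill frames; if the current size of the buffer is larger than the buffer target size, non-speech data is deleted from the buffer so that the difference between the current size of the buffer and the buffer target size falls within a preset range.
Suppose that under some network condition and a configured packet loss tolerance probability the buffer target length is 3. If the network is congested, the valid data length in the buffer drops to 2 and then 1, and no frame filling is needed at that point. If the delayed packets then arrive in succession, the valid length of the buffer grows and frame filling is still not needed. If the delayed packets do not arrive and the valid length of the buffer drops to 0, frame filling is required.
Suppose network conditions worsen while the packet loss tolerance probability is unchanged. The buffer target length grows, for example to 5, which means that when congestion occurs the buffer still holds 4 frames of data for playback; frame filling is needed only if the delayed packets have still not arrived after those 4 frames have been played. A longer buffer target length therefore reduces the probability of frame filling and improves sound quality.
Suppose network conditions improve while the packet loss tolerance probability is unchanged. The buffer target length shrinks, for example to 1, which means the buffer introduces a delay of only one frame. The delay is shorter, but if the network is congested, frame filling is needed after that one frame has been played.
Suppose network conditions are unchanged and the packet loss tolerance probability increases. In theory, the buffer target length grows, which means the delay increases while the probability of frame filling falls; this is the cost of the user setting a larger packet loss tolerance probability.
Suppose network conditions are unchanged and the packet loss tolerance probability decreases. In theory, the buffer target length shrinks, which means the delay falls while the probability of frame filling rises, because the user accepts more frame filling for lost packets in exchange for less delay.
In this embodiment, a buffer is added between the decoder and the playback module at the receiving end. The RTP packet receiving module at the receiving end passes the received voice data packets, specifically RTP packets, to the decoder; the decoder decodes the RTP packets to obtain speech data or non-speech data and places it in the buffer; and dynamic adjustment of the buffer handles network jitter.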
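FIG. 6 is a block diagram of Embodiment 1 of a network jitter processing apparatus according to an exemplary embodiment. As shown in FIG. 6, the apparatus includes a receiving and recording module 61, a calculation module 62, a statistics module 63, and an adjustment module 64.
The receiving and recording module 61 is configured to receive a plurality of voice data packets and record the reception time of each voice data packet;
the calculation module 62 is configured to calculate, from the reception times, the time intervals between the reception of every two adjacent voice data packets among the plurality of voice data packets;
the statistics module 63 is configured to compile a probability distribution of the time intervals according to the preset interval to which each time interval belongs;
the calculation module 62 is further configured to calculate a buffer target size from the probability distribution and the allowable jitter probability;
the adjustment module 64 is configured to adjust the buffer according to the buffer target size to resolve network jitter.
In this embodiment, the probability distribution of the time intervals at which the receiving end receives voice data packets changes as network jitter changes, and that time interval is a random variable. This embodiment calculates the buffer target size from the probability distribution of the time intervals and the allowable jitter probability, so the buffer target size tracks network jitter. Adjusting the buffer according to the buffer target size makes the buffer size track network jitter as well. Because the size of the buffer is closely tied to the amount of network jitter it can absorb, adjusting the buffer to the buffer target size resolves network jitter.
FIG. 7 is a block diagram of Embodiment 2 of a network jitter processing apparatus according to an exemplary embodiment. As shown in FIG. 7, on the basis of the embodiment shown in FIG. 6, the n-th time interval is J_n = R_n - R_{n-1}, where R_n denotes the reception time of the n-th voice data packet, R_{n-1} denotes the reception time of the (n-1)-th voice data packet, 1 ≤ n ≤ (N-1), and N denotes the total number of voice data packets; there are multiple preset intervals, and the i-th preset interval is
Figure PCTCN2016097257-appb-000028
where T is the fixed time interval at which the transmitting end sends the voice data packets, and K is a positive integer.
The statistics module 63 includes a first statistics sub-module 631, a first calculation sub-module 632, and a second statistics sub-module 633.
The first statistics sub-module 631 is configured to count, among all the time intervals, the number N_i (i ≥ 0) of time intervals that fall into each preset interval, where i is the identification number of the preset interval and N_i ≥ 0;
the first calculation sub-module 632 is configured to calculate, from each N_i and the total number N of voice data packets, the probability that a time interval falls into each preset interval:
Figure PCTCN2016097257-appb-000029
The second statistics sub-module 633 is configured to, when N ≥ N_max, for any n with n ≥ N_max, decrement by 1 the count of time intervals for the preset interval to which the (n - N_max + 1)-th time interval belongs, where N_max is the threshold on the total number of voice data packets.
The calculation module 62 includes a second calculation sub-module 621 and a third calculation sub-module 622.
The second calculation sub-module 621 is configured to calculate the m that satisfies the condition
Figure PCTCN2016097257-appb-000030
where P denotes the known allowable jitter probability;
the third calculation sub-module 622 is configured to calculate the sum of the sizes of the 0th through the m-th consecutive preset intervals, the sum being the buffer target size.
The adjustment module 64 includes a first adjustment sub-module 641 and a second adjustment sub-module 642.
The first adjustment sub-module 641 is configured to, if the current size of the buffer is smaller than the buffer target size and the valid length of the buffer is 0, generate data in the buffer to fill frames;
the second adjustment sub-module 642 is configured to, if the current size of the buffer is larger than the buffer target size, delete non-speech data from the buffer so that the difference between the current size of the buffer and the buffer target size falls within a preset range.
The apparatus further includes a decoding module 65 configured to, after the receiving and recording module 61 receives the plurality of voice data packets and records the reception time of each voice data packet, decode each of the plurality of voice data packets to obtain speech data or non-speech data, and store the speech data or the non-speech data in the buffer.
The network jitter processing apparatus provided in this embodiment can be used to carry out the technical solution of the method embodiment shown in any of FIGS. 2 to 5.
In this embodiment, the number of time intervals falling into each preset interval is counted, and the ratio of that count to the total number of voice data packets is taken as the probability that a time interval falls into that preset interval; the probabilities for all preset intervals together give the probability distribution of the time intervals. Each time a new time interval is added, the oldest time interval is removed, so that the total number of time intervals distributed across the preset intervals stays constant as the receiving end receives more voice data packets. The buffer target size is calculated from the probability distribution of the time intervals and the allowable jitter probability, and the current size of the buffer is adjusted according to the buffer target size, so that the buffer size is adjusted dynamically as network jitter changes, which resolves network jitter. A buffer is added between the decoder and the playback module at the receiving end; the RTP packet receiving module at the receiving end passes the received voice data packets, specifically RTP packets, to the decoder; the decoder decodes the RTP packets to obtain speech data or non-speech data and places it in the buffer; and dynamic adjustment of the buffer handles network jitter.
As for the network jitter processing apparatus in the above embodiments, the specific manner in which each module and sub-module operates has been described in detail in the method embodiments and is not elaborated here.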
以上描述了网络抖动处理装置的内部功能和结构,如图8所示,实际中,该网络抖动处理装置可实现为终端设备,包括:
处理器;
被配置为存储处理器可执行指令的存储器;
其中,所述处理器被配置为:
接收多个语音数据包,并记录各个语音数据包的接收时刻;
依据所述接收时刻计算接收所述多个语音数据包中所有相邻两个语音数据包的时间间隔;
依据各个时间间隔所属的预设区间统计所述时间间隔的概率分布;
依据所述概率分布和容许抖动概率计算缓冲区目标大小;
依据所述缓冲区目标大小调整缓冲区以解决网络抖动。
本实施例中,由于接收端接收语音数据包的时间间隔的概率分布随网络抖动的变化而变化,接收端接收语音数据包的时间间隔是一个随机变量,本实施例通过时间间隔的概率分布和容许抖动概率计算缓冲区目标大小,因此缓冲区目标大小随着网络抖动而变化,依据缓冲区目标大小调整缓冲区,可使缓冲区的大小随着网络抖动而变化,由于缓冲区的大小与其能够克服的网络抖动的能力密切相关,通过对缓冲区目标大小调整缓冲区可解决网络抖动。
图9是根据一示例性实施例示出的另一种终端设备的框图。例如,终端设备800可以是移动电话,计算机,数字广播终端,消息收发设备,游戏控制台,平板设备,医疗设备,健身设备,个人数字助理等。
参照图10,终端设备800可以包括以下一个或多个组件:处理组件802,存储器 804,电力组件806,多媒体组件808,音频组件810,输入/输出(I/O)的接口812,传感器组件814,以及通信组件816。
处理组件802通常控制终端设备800的整体操作,诸如与显示,电话呼叫,数据通信,相机操作和记录操作相关联的操作。处理组件802可以包括一个或多个处理器820来执行指令,以完成上述的方法的全部或部分步骤。此外,处理组件802可以包括一个或多个模块,便于处理组件802和其他组件之间的交互。例如,处理组件802可以包括多媒体模块,以方便多媒体组件808和处理组件802之间的交互。
存储器804被配置为存储各种类型的数据以支持在设备800的操作。这些数据的示例包括用于在终端设备800上操作的任何应用程序或方法的指令,联系人数据,电话簿数据,消息,图片,视频等。存储器804可以由任何类型的易失性或非易失性存储设备或者它们的组合实现,如静态随机存取存储器(SRAM),电可擦除可编程只读存储器(EEPROM),可擦除可编程只读存储器(EPROM),可编程只读存储器(PROM),只读存储器(ROM),磁存储器,快闪存储器,磁盘或光盘。
电力组件806为终端设备800的各种组件提供电力。电力组件806可以包括电源管理***,一个或多个电源,及其他与为终端设备800生成、管理和分配电力相关联的组件。
多媒体组件808包括在所述终端设备800和用户之间的提供一个输出接口的屏幕。在一些实施例中,屏幕可以包括液晶显示器(LCD)和触摸面板(TP)。如果屏幕包括触摸面板,屏幕可以被实现为触摸屏,以接收来自用户的输入信号。触摸面板包括一个或多个触摸传感器以感测触摸、滑动和触摸面板上的手势。所述触摸传感器可以不仅感测触摸或滑动动作的边界,而且还检测与所述触摸或滑动操作相关的持续时间和压力。在一些实施例中,多媒体组件808包括一个前置摄像头和/或后置摄像头。当设备800处于操作模式,如拍摄模式或视频模式时,前置摄像头和/或后置摄像头可以接收外部的多媒体数据。每个前置摄像头和后置摄像头可以是一个固定的光学透镜***或具有焦距和光学变焦能力。
音频组件810被配置为输出和/或输入音频信号。例如,音频组件810包括一个麦克风(MIC),当终端设备800处于操作模式,如呼叫模式、记录模式和语音识别模式时,麦克风被配置为接收外部音频信号。所接收的音频信号可以被进一步存储在存储器804或经由通信组件816发送。在一些实施例中,音频组件810还包括一个扬声器,用于输出音频信号。
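The I/O interface 812 provides an interface between the processing component 802 and peripheral interface modules, which may be a keyboard, a click wheel, buttons, and the like. These buttons may include, but are not limited to, a home button, a volume button, a start button, and a lock button.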
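The sensor component 814 includes one or more sensors for providing status assessments of various aspects of the terminal device 800. For example, the sensor component 814 can detect the open/closed state of the terminal device 800 and the relative positioning of components, such as the display and keypad of the terminal device 800. The sensor component 814 can also detect a change in position of the terminal device 800 or of one of its components, the presence or absence of user contact with the terminal device 800, the orientation or acceleration/deceleration of the terminal device 800, and temperature changes of the terminal device 800. The sensor component 814 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. The sensor component 814 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 814 may further include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.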
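The communication component 816 is configured to facilitate wired or wireless communication between the terminal device 800 and other devices. The terminal device 800 can access a wireless network based on a communication standard, such as WiFi, 2G, or 3G, or a combination thereof. In an exemplary embodiment, the communication component 816 receives a broadcast signal or broadcast-related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 816 further includes a near field communication (NFC) module to facilitate short-range communication. For example, the NFC module may be implemented based on radio frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.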
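In an exemplary embodiment, the terminal device 800 may be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components, for performing the method described above.
In an exemplary embodiment, a non-transitory computer-readable storage medium including instructions is also provided, for example the memory 804 including instructions, and the instructions can be executed by the processor 820 of the terminal device 800 to complete the method described above. For example, the non-transitory computer-readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, or the like.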
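A non-transitory computer-readable storage medium, wherein when instructions in the storage medium are executed by a processor of a mobile terminal, the mobile terminal is enabled to perform a network jitter processing method, the method including:
receiving a plurality of voice data packets, and recording the reception time of each voice data packet;
calculating, from the reception times, the time interval between every two adjacent voice data packets among the plurality of voice data packets;
compiling the probability distribution of the time intervals according to the preset interval to which each time interval belongs;
calculating a buffer target size from the probability distribution and a tolerable jitter probability;
adjusting the buffer according to the buffer target size to handle network jitter.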
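Other embodiments of the present disclosure will readily occur to those skilled in the art after considering the specification and practicing the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the present disclosure that follow its general principles and include common knowledge or customary technical means in the art not disclosed herein. The specification and embodiments are to be regarded as exemplary only, with the true scope and spirit of the present disclosure being indicated by the following claims.
It should be understood that the present disclosure is not limited to the precise structures described above and shown in the drawings, and that various modifications and changes may be made without departing from its scope. The scope of the present disclosure is limited only by the appended claims.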

Claims (13)

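  1. A network jitter processing method, characterized in that the method comprises:
    receiving a plurality of voice data packets, and recording the reception time of each voice data packet;
    calculating, from the reception times, the time interval between every two adjacent voice data packets among the plurality of voice data packets;
    compiling the probability distribution of the time intervals according to the preset interval to which each time interval belongs;
    calculating a buffer target size from the probability distribution and a tolerable jitter probability;
    adjusting the buffer according to the buffer target size to handle network jitter.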
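  2. The method according to claim 1, characterized in that the n-th time interval is J_n = R_n - R_(n-1), where R_n denotes the reception time of the n-th voice data packet, R_(n-1) denotes the reception time of the (n-1)-th voice data packet, 1≤n≤(N-1), and N denotes the total number of the voice data packets;
    there are a plurality of the preset intervals, and the i-th preset interval is
    Figure PCTCN2016097257-appb-100001
    where T is the fixed time interval at which the sending end sends the voice data packets, and K is a positive integer;
    the compiling the probability distribution of the time intervals according to the preset interval to which each time interval belongs comprises:
    counting, among all the time intervals, the number N_i of time intervals falling into each preset interval, i≥0, where i denotes the identification number of the preset interval and N_i≥0;
    calculating, from each N_i and the total number N of the voice data packets, the probability that a time interval falls into each preset interval
    Figure PCTCN2016097257-appb-100002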
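  3. The method according to claim 2, characterized in that when N≥Nmax, for any n, if n≥Nmax, the number of time intervals of the preset interval to which the (n-Nmax+1)-th time interval belongs is decremented by 1, where Nmax is a threshold on the total number of the voice data packets.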
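  4. The method according to claim 3, characterized in that the calculating a buffer target size from the probability distribution and a tolerable jitter probability comprises:
    calculating the m that satisfies the condition
    Figure PCTCN2016097257-appb-100003
    where P denotes a known tolerable jitter probability;
    taking the sum of the sizes of the 0th through m-th consecutive preset intervals among the plurality of preset intervals as the buffer target size.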
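  5. The method according to any one of claims 1-4, characterized in that the adjusting the buffer according to the buffer target size to handle network jitter comprises:
    if the current size of the buffer is smaller than the buffer target size and the effective length of the buffer is 0, generating data in the buffer to pad frames;
    if the current size of the buffer is larger than the buffer target size, deleting non-voice data from the buffer so that the difference between the current size of the buffer and the buffer target size is within a preset range.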
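  6. The method according to claim 5, characterized in that, after the receiving a plurality of voice data packets and recording the reception time of each voice data packet, the method further comprises:
    decoding each of the plurality of voice data packets to obtain voice data or non-voice data, and storing the voice data or the non-voice data in the buffer.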
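  7. A network jitter processing apparatus, characterized in that the apparatus comprises:
    a receiving and recording module configured to receive a plurality of voice data packets and record the reception time of each voice data packet;
    a calculation module configured to calculate, from the reception times, the time interval between every two adjacent voice data packets among the plurality of voice data packets;
    a statistics module configured to compile the probability distribution of the time intervals according to the preset interval to which each time interval belongs;
    the calculation module being further configured to calculate a buffer target size from the probability distribution and a tolerable jitter probability;
    an adjustment module configured to adjust the buffer according to the buffer target size to handle network jitter.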
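  8. The apparatus according to claim 7, characterized in that the n-th time interval is J_n = R_n - R_(n-1), where R_n denotes the reception time of the n-th voice data packet, R_(n-1) denotes the reception time of the (n-1)-th voice data packet, 1≤n≤(N-1), and N denotes the total number of the voice data packets;
    there are a plurality of the preset intervals, and the i-th preset interval is
    Figure PCTCN2016097257-appb-100004
    where T is the fixed time interval at which the sending end sends the voice data packets, and K is a positive integer;
    the statistics module comprises:
    a first statistics submodule configured to count, among all the time intervals, the number N_i of time intervals falling into each preset interval, i≥0, where i denotes the identification number of the preset interval and N_i≥0;
    a first calculation submodule configured to calculate, from each N_i and the total number N of the voice data packets, the probability that a time interval falls into each preset interval
    Figure PCTCN2016097257-appb-100005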
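  9. The apparatus according to claim 8, characterized in that the statistics module further comprises:
    a second statistics submodule configured to, when N≥Nmax, for any n, if n≥Nmax, decrement by 1 the number of time intervals of the preset interval to which the (n-Nmax+1)-th time interval belongs, where Nmax is a threshold on the total number of the voice data packets.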
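  10. The apparatus according to claim 9, characterized in that the calculation module comprises:
    a second calculation submodule configured to calculate the m that satisfies the condition
    Figure PCTCN2016097257-appb-100006
    where P denotes a known tolerable jitter probability;
    a third calculation submodule configured to calculate the sum of the sizes of the 0th through m-th consecutive preset intervals among the plurality of preset intervals, the sum being the buffer target size.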
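  11. The apparatus according to any one of claims 7-10, characterized in that the adjustment module comprises:
    a first adjustment submodule configured to, if the current size of the buffer is smaller than the buffer target size and the effective length of the buffer is 0, generate data in the buffer to pad frames;
    a second adjustment submodule configured to, if the current size of the buffer is larger than the buffer target size, delete non-voice data from the buffer so that the difference between the current size of the buffer and the buffer target size is within a preset range.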
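  12. The apparatus according to claim 11, characterized in that the apparatus further comprises:
    a decoding module configured to, after the receiving and recording module receives the plurality of voice data packets and records the reception time of each voice data packet, decode each of the plurality of voice data packets to obtain voice data or non-voice data, and store the voice data or the non-voice data in the buffer.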
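  13. A terminal device, characterized by comprising:
    a processor;
    a memory configured to store instructions executable by the processor;
    wherein the processor is configured to:
    receive a plurality of voice data packets, and record the reception time of each voice data packet;
    calculate, from the reception times, the time interval between every two adjacent voice data packets among the plurality of voice data packets;
    compile the probability distribution of the time intervals according to the preset interval to which each time interval belongs;
    calculate a buffer target size from the probability distribution and a tolerable jitter probability;
    adjust the buffer according to the buffer target size to handle network jitter.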
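The buffer adjustment recited in claims 5 and 11 can be illustrated with the sketch below. It is one interpretation only: the `Frame` type, measuring buffer size as the total duration of queued frames, treating "effective length 0" as an empty buffer, and the parameter names `frame_ms` and `tolerance_ms` are all assumptions introduced for the example.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque


@dataclass
class Frame:
    duration_ms: float
    is_voice: bool
    is_padding: bool = False


def adjust_buffer(buffer: Deque[Frame], target_ms: float,
                  frame_ms: float, tolerance_ms: float) -> None:
    """Pad an empty buffer up to the target, or drop non-voice frames
    until the excess over the target is within the tolerance."""
    if frame_ms <= 0:
        raise ValueError("frame_ms must be positive")
    current_ms = sum(frame.duration_ms for frame in buffer)
    if current_ms < target_ms and len(buffer) == 0:
        # Underrun case: generate padding frames (for example silence).
        while current_ms < target_ms:
            buffer.append(Frame(frame_ms, is_voice=False, is_padding=True))
            current_ms += frame_ms
    elif current_ms > target_ms:
        # Oversize case: remove non-voice frames only, oldest first.
        kept: Deque[Frame] = deque()
        for frame in buffer:
            if not frame.is_voice and current_ms - target_ms > tolerance_ms:
                current_ms -= frame.duration_ms
            else:
                kept.append(frame)
        buffer.clear()
        buffer.extend(kept)


playout: Deque[Frame] = deque()
adjust_buffer(playout, target_ms=60.0, frame_ms=20.0, tolerance_ms=0.0)
print(len(playout))
```

Dropping only non-voice frames shrinks the buffer without discarding speech, which is why the shrink step may stop short of the target when too few non-voice frames are queued.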
PCT/CN2016/097257 2015-12-21 2016-08-30 Network jitter processing method, apparatus, and terminal device WO2017107544A1 (zh)

Priority Applications (3)

Application Number Priority Date Filing Date Title
JP2016565349A JP6382345B2 (ja) 2015-12-21 2016-08-30 Network jitter processing method, apparatus, terminal device, program, and recording medium
RU2016148816A RU2651215C1 (ru) 2015-12-21 2016-08-30 Method and device for processing network jitter, and terminal
KR1020177024410A KR101986549B1 (ko) 2015-12-21 2016-08-30 Network jitter processing method, apparatus, terminal device, program, and storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510967352.9A CN105939289B (zh) 2015-12-21 2015-12-21 Network jitter processing method, apparatus, and terminal device
CN201510967352.9 2015-12-21

Publications (1)

Publication Number Publication Date
WO2017107544A1 (zh) 2017-06-29

Family

ID=57153042

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/097257 WO2017107544A1 (zh) 2015-12-21 2016-08-30 Network jitter processing method, apparatus, and terminal device

Country Status (8)

Country Link
US (1) US10129161B2 (zh)
EP (1) EP3185480B1 (zh)
JP (1) JP6382345B2 (zh)
KR (1) KR101986549B1 (zh)
CN (1) CN105939289B (zh)
ES (1) ES2786400T3 (zh)
RU (1) RU2651215C1 (zh)
WO (1) WO2017107544A1 (zh)

Also Published As

Publication number Publication date
ES2786400T3 (es) 2020-10-09
KR20170113610A (ko) 2017-10-12
RU2651215C1 (ru) 2018-04-18
JP6382345B2 (ja) 2018-08-29
CN105939289B (zh) 2019-03-12
KR101986549B1 (ko) 2019-09-30
EP3185480B1 (en) 2020-02-05
US20170180263A1 (en) 2017-06-22
US10129161B2 (en) 2018-11-13
CN105939289A (zh) 2016-09-14
JP2018503274A (ja) 2018-02-01
EP3185480A1 (en) 2017-06-28

Legal Events

Date Code Title Description
ENP Entry into the national phase; Ref document number: 2016565349; Country of ref document: JP; Kind code of ref document: A
ENP Entry into the national phase; Ref document number: 2016148816; Country of ref document: RU; Kind code of ref document: A
121 Ep: the epo has been informed by wipo that ep was designated in this application; Ref document number: 16877356; Country of ref document: EP; Kind code of ref document: A1
ENP Entry into the national phase; Ref document number: 20177024410; Country of ref document: KR; Kind code of ref document: A
NENP Non-entry into the national phase; Ref country code: DE
122 Ep: pct application non-entry in european phase; Ref document number: 16877356; Country of ref document: EP; Kind code of ref document: A1