Packet loss hiding method and device suitable for Bluetooth voice call and Bluetooth voice processing chip
Technical Field
The invention belongs to the technical field of Bluetooth communication, and particularly relates to a packet loss hiding method and device suitable for Bluetooth voice call and a Bluetooth voice processing chip.
Background
Bluetooth voice communication is a short-range wireless communication technology that dispenses with traditional wires and transmits voice data over the air. Packet errors and packet loss therefore occur easily at the receiving end, making the voice discontinuous and greatly affecting Bluetooth call quality and communication distance. Packet loss concealment (PLC) technology compensates for lost data packets with synthesized data packets and exploits the masking effect of the human ear to conceal the loss, so that Bluetooth voice call quality can be improved to a great extent and the call sounds continuous, as if no data packets had been lost or corrupted.
As shown in fig. 1, when a receiving end of a bluetooth headset detects a packet loss of a bluetooth voice packet, after the packet loss is hidden by a PLC module, an interrupted voice waveform is converted into a continuous voice waveform, and then the continuous voice waveform is converted into a voice by a digital-to-analog converter (DAC) and played.
At present, packet loss concealment methods mainly comprise insertion techniques (silence, white noise and packet repetition), time-domain correction techniques and waveform substitution based on pitch detection. Insertion techniques fill the gap caused by packet loss with a simple substitute, commonly silence, noise or the previous voice packet. Time-domain correction techniques use the voice packets before and after the gap, together with filtering, to make the data across the gap as continuous as possible; this achieves some concealment, but the effect is not obvious and the method has significant limitations in use. Waveform substitution based on the pitch period detects the pitch period of the voice by an algorithm; for unvoiced sound, simple repetition is used to fill the waveform gap, and for voiced sound, the pitch period closest to the gap is used to fill it. This method is simple and effective and can achieve a good concealment effect. However, in the case of continuous packet loss the concealment effect is still poor, and when waveform substitution is performed the data before and after the gap are not sufficiently continuous, resulting in poor call quality.
Disclosure of Invention
The embodiment of the invention provides a packet loss hiding method and device suitable for Bluetooth voice communication and a Bluetooth voice processing chip, and aims to solve the existing problems.
In an embodiment of the present invention, a packet loss hiding method applicable to a bluetooth voice call is provided, where the method includes:
a pitch period detection step: detecting the pitch period before the loss of the Bluetooth voice data packet by a correlation coefficient method, and obtaining the correlation coefficient Rmax corresponding to the pitch period;
a waveform substitution step: generating a substitute waveform according to the waveform corresponding to the lost data packet in the previous pitch period of the lost data packet, and substituting the substitute waveform for the waveform of the lost data packet;
an adaptive filtering step: performing adaptive smoothing filtering on the waveforms on both sides of the waveform gap formed by the lost data packet, according to the waveforms corresponding to them in the previous pitch period.
The adaptive filtering step specifically includes:
respectively extracting the waveform LD of a data packet on the left side of a waveform gap formed by the lost data packet and the waveform RD of a data packet on the right side of the waveform gap formed by the lost data packet, wherein the voice data of each data packet comprises 30 voice sampling points;
respectively extracting waveforms LDP and RDP of a previous pitch period corresponding to the waveforms LD and RD;
respectively filtering the waveforms LD and RD by using the waveforms LDP and RDP to obtain the smoothed waveforms LDC and RDC, wherein the filtering formula is as follows:
LDC[i]=( LDP[i]* (i/Rf) + LD[i]* (30-(i/Rf)))/30,
RDC[i]=( RDP[i]* (30-(i/Rf)) + RD[i]* (i/Rf))/30,
wherein i is an integer between 0 and 29, and Rf = Rmax + 1.
In the embodiment of the present invention, before the pitch period detecting step, a bluetooth voice data packet loss detecting step is further included, which detects whether the received bluetooth voice data packet is lost.
In an embodiment of the present invention, the pitch period detecting step specifically includes:
setting N continuous voice data sampling points on the left side from the lost data packet as a template window, and setting a sliding window with the length same as that of the template window, wherein N is more than or equal to 120;
sliding the sliding window from the 40 th point on the left of the lost data packet to the 120 th voice data sampling point on the left of the lost data packet, and respectively calculating the correlation coefficient R of the sliding window and the template window;
and comparing the correlation coefficients of the points to obtain the maximum correlation coefficient Rmax, and taking the pitch period corresponding to the correlation coefficient Rmax as the pitch period before the Bluetooth voice data packet is lost.
In the embodiment of the present invention, in the waveform replacing step,
when the number of the lost data packets is one, the substitute waveform adopts a waveform corresponding to the lost data packet in a pitch period before the lost data packet;
and when the lost data packets are a plurality of continuous data packets, taking the waveform corresponding to the first lost data packet in the previous pitch period as a substitute waveform of the first data packet, sequentially attenuating each voice sampling point of the waveform corresponding to the other lost data packet in the previous pitch period by 1/(15 x 30) to form a substitute waveform of the other lost data packet, and if the number of the continuous lost data packets is more than 16, directly attenuating the substitute waveform after more than 16 data packets to 0.
In an embodiment of the present invention, a packet loss hiding device suitable for a bluetooth voice call is further provided, where the device includes:
the pitch period detection unit is used for detecting the pitch period before the Bluetooth voice data packet is lost by a correlation coefficient method and obtaining the correlation coefficient Rmax corresponding to the pitch period;
the waveform replacing unit is used for generating a replacing waveform according to the waveform corresponding to the lost data packet in the previous pitch period of the lost data packet and replacing the waveform of the lost data packet with the replacing waveform; and
and the adaptive filtering unit is used for performing adaptive smooth filtering on the waveforms on two sides of the waveform gap formed by the lost data packet according to the corresponding waveforms of the waveforms on two sides of the waveform gap formed by the lost data packet in the previous pitch period.
The filtering formula adopted by the self-adaptive filtering unit is as follows:
LDC[i]=( LDP[i]* (i/Rf) + LD[i]* (30-(i/Rf)))/30,
RDC[i]=( RDP[i]* (30-(i/Rf)) + RD[i]* (i/Rf))/30,
wherein Rf = Rmax +1, i is an integer between 0 and 29,
LD and RD respectively represent the waveform of a packet to the left and the waveform of a packet to the right of a waveform gap formed by the lost packet, LDP and RDP respectively represent the waveforms of the previous pitch period corresponding to the waveforms LD and RD, and LDC and RDC respectively represent the waveforms corresponding to the waveforms LD and RD after filtering.
In the embodiment of the present invention, the packet loss hiding device suitable for bluetooth voice communication further includes a bluetooth voice data detecting unit, where the bluetooth voice data detecting unit is configured to detect whether the bluetooth voice data packet is lost.
In the embodiment of the invention, the pitch period detection unit performs pitch period detection by adopting the following steps:
setting N continuous voice data sampling points on the left side from the lost data packet as a template window, and setting a sliding window with the length same as that of the template window, wherein N is more than or equal to 120;
sliding the sliding window from the 40 th point on the left of the lost data packet to the 120 th voice data sampling point on the left of the lost data packet, and respectively calculating the correlation coefficient R of the sliding window and the template window;
and comparing the correlation coefficients of the points to obtain the maximum correlation coefficient Rmax, and taking the pitch period corresponding to the correlation coefficient Rmax as the pitch period before the Bluetooth voice data packet is lost.
In the embodiment of the present invention, the waveform substitution unit generates a substitution waveform by using the following waveform substitution formula:
W[j]= WP[j],1≤j≤30,
W[j]= WP[j]*(1-(j-30)/(15*30)),30<j≤480,
W[j]=0,j>480,
where W represents a substitute waveform for the missing data packet, WP represents a waveform corresponding to the missing data packet in a previous pitch period, and j represents a j-th point in the waveform.
In the embodiment of the present invention, a bluetooth voice processing chip is further provided, where the bluetooth voice processing chip includes at least one processor, a memory, and an interface, and the at least one processor, the memory, and the interface are all connected through a bus;
the memory stores computer-executable instructions;
and the at least one processor executes the computer execution instruction stored in the memory, so that the Bluetooth voice processing chip executes the packet loss hiding method suitable for the Bluetooth voice call.
Compared with the prior art, the invention adopts a multi-frame attenuation method when substituting the waveform of lost packets, which effectively avoids erroneous sound; even when the amount of packet loss is large, the result remains acceptable to the human ear. When the substitute waveform is filtered, an adaptive window filtering technique is adopted, so that the filtering strength is adjusted adaptively. This is better suited to the variable nature of voice, improves the smoothness of the substitute waveform and the call quality, and avoids the insufficient flexibility of a fixed window. When the overall packet loss rate of the Bluetooth voice call is less than 15%, packet loss errors can be effectively hidden, the Bluetooth voice call quality is improved, and the voice communication distance is increased.
Drawings
Fig. 1 is a block diagram of a bluetooth call packet loss processing system in the prior art;
Fig. 2 is a flowchart of an implementation of a packet loss hiding method suitable for a bluetooth voice call according to an embodiment of the present invention;
Fig. 3 is a flow diagram of an implementation of pitch period detection in Fig. 2;
Fig. 4 is a waveform diagram of pitch period detection in Fig. 2;
Fig. 5 is a schematic diagram of a substitute waveform for a lost voice packet;
Fig. 6 is a flow chart of an implementation of the adaptive smoothing filter of Fig. 2;
Fig. 7 is a schematic structural diagram of a packet loss concealment apparatus suitable for bluetooth voice communication according to a second embodiment of the present invention;
Fig. 8 is a schematic structural diagram of a bluetooth voice processing chip according to a third embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
The following detailed description of the implementation of the present invention is made with reference to specific embodiments:
Example one
Fig. 2 shows an implementation flow of a packet loss concealment method suitable for a bluetooth voice call according to an embodiment of the present invention, where the method includes steps S1 to S4, which are described in detail below.
Step S1: and detecting the packet loss of the Bluetooth voice data.
In this step, whether the received Bluetooth voice data has been lost is detected. During a Bluetooth call, the voice is encoded and encrypted at the mobile phone and transmitted to the receiving end over RF, and the controller of the receiving end decodes and decrypts the over-the-air voice data. If reception succeeds and verification passes, correct voice data is obtained and played to the listener through the DAC. If reception fails or verification does not pass, the packet is marked as an error packet or an empty packet, and packet loss concealment is performed in the subsequent steps: a new voice waveform is synthesized as compensation so that the error is hidden as far as possible.
Step S2: and detecting a pitch period.
It should be noted that the pitch frequency of the human voice ranges from approximately 66 Hz to 200 Hz, and the voice sampling frequency of a Bluetooth voice call is 8000 Hz, so the pitch period of human voice ranges from 40 to 120 voice data sampling points (samples); that is, the maximum period is 120 voice data sampling points. A Bluetooth voice packet contains 30 voice data sampling points. When a Bluetooth packet is lost, the pitch period before the loss must be determined first.
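The arithmetic behind the 40 to 120 sample range can be checked with a short Python sketch. The constant and function names here are illustrative assumptions and do not appear in the embodiment.

```python
SAMPLE_RATE_HZ = 8000   # Bluetooth voice sampling frequency stated in the text
PACKET_SAMPLES = 30     # samples per Bluetooth voice packet stated in the text


def period_in_samples(pitch_hz):
    """Number of samples spanned by one pitch period at the given pitch frequency."""
    return SAMPLE_RATE_HZ / pitch_hz


def packet_duration_ms():
    """Duration of one 30-sample voice packet in milliseconds."""
    return 1000.0 * PACKET_SAMPLES / SAMPLE_RATE_HZ


shortest = period_in_samples(200.0)           # highest pitch, shortest period
longest = period_in_samples(8000.0 / 120.0)   # about 66.7 Hz, longest period
```

Under these figures a single packet is shorter than even the shortest pitch period, which is why the samples before the gap are needed to estimate the period.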
In step S2, the pitch period is calculated by the correlation coefficient method. Because a larger correlation coefficient indicates stronger correlation between the data, the maximum Rmax of the correlation coefficient R is found, and the point at which it occurs is taken as the starting point of the pitch period. As shown in fig. 3 and 4, pitch period detection specifically includes:
step S201: setting N continuous voice data sampling points on the left side from the lost data packet as a template window, and setting a sliding window with the length same as that of the template window, wherein N is more than or equal to 120;
step S202: sliding the sliding window from the 40 th point on the left of the lost data packet to the 120 th voice data sampling point on the left of the lost data packet, and respectively calculating the correlation coefficient R of the sliding window and the template window;
step S203: and comparing the correlation coefficients of the points to obtain the maximum correlation coefficient Rmax, and taking the pitch period corresponding to the correlation coefficient Rmax as the pitch period before the Bluetooth voice data packet is lost.
In step S201, the larger the value of N, the more accurate the calculated pitch period; if N is too small, the pitch period may be calculated incorrectly. In this embodiment N is therefore set to N ≥ 120, that is, at least one maximum pitch period, which prevents pitch period calculation errors. In step S202, because the pitch period ranges from 40 to 120 voice data sampling points, the sliding range of the sliding window is set to 40-120 voice data sampling points when calculating the correlation coefficient, which avoids redundant calculation and speeds up pitch period detection.
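Steps S201 to S203 can be sketched in Python as follows. This is a minimal illustration rather than the implementation of the embodiment: the function names `correlation` and `detect_pitch`, the choice of the Pearson correlation coefficient as R, and the requirement that at least N + 120 samples of history are buffered are all assumptions not stated in the text.

```python
import math


def correlation(a, b):
    """Pearson correlation coefficient of two equal-length sequences (assumed form of R)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a)
                    * sum((y - mean_b) ** 2 for y in b))
    return num / den if den > 0 else 0.0


def detect_pitch(history, n=120, min_lag=40, max_lag=120):
    """Return (pitch_period, r_max) from the samples preceding the lost packet.

    history: samples received before the gap, oldest first.
    """
    if len(history) < n + max_lag:
        raise ValueError("need at least n + max_lag samples of history")
    template = history[-n:]                  # S201: N samples just left of the gap
    best_lag, r_max = min_lag, -1.0
    for lag in range(min_lag, max_lag + 1):  # S202: slide from 40 to 120 samples back
        start = len(history) - n - lag
        window = history[start:start + n]
        r = correlation(window, template)
        if r > r_max:                        # S203: keep the maximum coefficient
            best_lag, r_max = lag, r
    return best_lag, r_max
```

For a clean periodic input whose period lies in the 40 to 120 range, the lag with the largest coefficient is the period itself, and `r_max` is close to 1.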
Step S3: and the waveform substitution is to generate a substitution waveform according to the waveform corresponding to the lost data packet in the previous pitch period of the lost data packet and substitute the substitution waveform for the waveform of the lost data packet.
It should be noted that, because the Bluetooth link is unstable, one packet loss is often followed by several consecutive packet losses. To address this characteristic of Bluetooth voice communication, so that consecutive frame loss remains acceptable to the human ear and errors are hidden as far as possible, this embodiment performs waveform substitution with a multi-frame attenuation method on top of simple waveform substitution. The specific waveform substitution rules are as follows:
when the number of the lost data packets is one, the substitute waveform adopts a waveform corresponding to the lost data packet in a pitch period before the lost data packet;
and when the lost data packets are a plurality of continuous data packets, taking the waveform corresponding to the first lost data packet in the previous pitch period as a substitute waveform of the first data packet, sequentially attenuating each voice sampling point of the waveform corresponding to the other lost data packet in the previous pitch period by 1/(15 x 30) to form a substitute waveform of the other lost data packet, and if the number of the continuous lost data packets is more than 16, directly attenuating the substitute waveform after more than 16 data packets to 0.
Since each bluetooth voice data packet includes 30 voice data samples, the waveform substitution rule can be expressed by the following formula:
W[j]= WP[j],1≤j≤30,
W[j]= WP[j]*(1-(j-30)/(15*30)),30<j≤480,
W[j]=0,j>480,
where W represents a substitute waveform for the missing data packet, WP represents a waveform corresponding to the missing data packet in a previous pitch period, and j represents a j-th point in the waveform.
Bluetooth voice packet loss experiments show that, under heavy packet loss, simple periodic waveform substitution directly introduces erroneous sound, and the result is unacceptable to the listener. A schematic diagram of a specific waveform substitution is shown in fig. 5.
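The substitution formula above can be sketched in Python as follows. This is an illustration under stated assumptions, not the implementation of the embodiment: the function name `substitute` is hypothetical, and WP is assumed to be obtained by repeating the last detected pitch period of the received signal, which the text does not spell out.

```python
PACKET = 30        # samples per Bluetooth voice packet
FADE_PACKETS = 15  # packets over which the substitute waveform fades to zero


def substitute(history, pitch, lost_packets):
    """Build the substitute waveform W for `lost_packets` consecutive lost packets.

    history: samples received before the gap, oldest first.
    pitch:   detected pitch period in samples.
    """
    last_period = history[-pitch:]  # assumed source of WP: the last pitch period
    out = []
    for j in range(1, lost_packets * PACKET + 1):   # j is 1-based, as in the formula
        wp = last_period[(j - 1) % pitch]
        if j <= PACKET:
            gain = 1.0                              # first lost packet: plain copy
        elif j <= PACKET * (FADE_PACKETS + 1):      # 30 < j <= 480: linear fade
            gain = 1.0 - (j - PACKET) / (FADE_PACKETS * PACKET)
        else:
            gain = 0.0                              # j > 480: silence
        out.append(wp * gain)
    return out
```

With this gain schedule the amplitude falls by 1/450 per sample after the first packet and reaches exactly zero at j = 480, so the transition to silence beyond 16 packets has no step.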
Step S4: and the adaptive smoothing filter is used for performing adaptive smoothing filtering on the waveforms on two sides of the waveform gap formed by the lost data packet according to the corresponding waveforms of the waveforms on two sides of the waveform gap formed by the lost data packet in the previous pitch period.
After the waveform substitution step is completed, a relatively large discontinuity may exist between the original voice waveform and the substitute waveform, and noise often results if it is left unprocessed. To make the transition between the original voice waveform and the substitute waveform smooth and avoid such noise, thereby improving call quality, smoothing filtering needs to be applied to the two groups of data. Smoothing filtering removes the noise caused by the transition, but if the filtering is too strong the sound itself is distorted and the voice sounds muffled to the listener. So that the smoothing filter gives a smooth transition without introducing excessive erroneous sound, this embodiment adopts an adaptive window filtering technique, which adaptively adjusts the filtering strength. It is better suited to the variable nature of voice, improves the smoothness of the substitute waveform and the call quality, and avoids the insufficient flexibility of a fixed window. As shown in fig. 6, the adaptive filtering step specifically includes:
step S401: respectively extracting the waveform LD of a data packet on the left side of a waveform gap formed by the lost data packet and the waveform RD of a data packet on the right side of the waveform gap formed by the lost data packet, wherein the voice data of each data packet comprises 30 voice sampling points;
step S402: respectively extracting waveforms LDP and RDP of a previous pitch period corresponding to the waveforms LD and RD;
step S403: respectively filtering the waveforms LD and RD by using the waveforms LDP and RDP to obtain the smoothed waveforms LDC and RDC, wherein the filtering formula is as follows:
LDC[i]=( LDP[i]* (i/Rf) + LD[i]* (30-(i/Rf)))/30,
RDC[i]=( RDP[i]* (30-(i/Rf)) + RD[i]* (i/Rf))/30,
wherein i is an integer between 0 and 29, and Rf = Rmax + 1.
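The filtering formula of step S403 can be sketched in Python as follows. This is a minimal illustration: the function name `smooth_edges` is hypothetical, the four 30-sample waveforms are taken as ready-made inputs, and the guard against Rf ≤ 0 is an added assumption (the text does not say what happens if Rmax is -1 or lower).

```python
PACKET = 30  # samples per Bluetooth voice packet


def smooth_edges(ld, rd, ldp, rdp, r_max):
    """Return (LDC, RDC), the smoothed packets to the left and right of the gap.

    ld, rd:   packets left and right of the gap (30 samples each).
    ldp, rdp: the corresponding waveforms one pitch period earlier.
    r_max:    maximum correlation coefficient from pitch period detection.
    """
    rf = r_max + 1.0
    if rf <= 0.0:
        raise ValueError("r_max must be greater than -1")
    ldc = [(ldp[i] * (i / rf) + ld[i] * (PACKET - i / rf)) / PACKET
           for i in range(PACKET)]
    rdc = [(rdp[i] * (PACKET - i / rf) + rd[i] * (i / rf)) / PACKET
           for i in range(PACKET)]
    return ldc, rdc
```

In each formula the two weights sum to 30 before the division, so every output sample is a weighted average of the original sample and its counterpart one pitch period earlier. A larger Rmax gives a larger Rf, which scales i/Rf down and changes how quickly the blend moves across the packet; this is the adaptive part of the window.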
Example two
Fig. 7 shows a packet loss concealment apparatus suitable for bluetooth voice communication according to a second embodiment of the present invention, which includes a bluetooth voice data detection unit 710, a pitch period detection unit 720, a waveform substitution unit 730, and an adaptive filtering unit 740. The following is a detailed description.
The bluetooth voice data detecting unit 710 is configured to detect whether the bluetooth voice data packet is lost;
the pitch period detection unit 720 is configured to detect the pitch period before a bluetooth voice packet is lost by using a correlation coefficient method, and obtain the correlation coefficient Rmax corresponding to the pitch period;
the pitch period detection unit 720 performs pitch period detection by the following steps:
setting N continuous voice data sampling points on the left side from the lost data packet as a template window, and setting a sliding window with the length same as that of the template window, wherein N is more than or equal to 120;
sliding the sliding window from the 40 th point on the left of the lost data packet to the 120 th voice data sampling point on the left of the lost data packet, and respectively calculating the correlation coefficient R of the sliding window and the template window;
and comparing the correlation coefficients of the points to obtain the maximum correlation coefficient Rmax, and taking the pitch period corresponding to the correlation coefficient Rmax as the pitch period before the Bluetooth voice data packet is lost.
The waveform replacing unit 730, configured to generate a replacing waveform according to a waveform corresponding to the lost packet in a previous pitch period of the lost packet, and replace the waveform of the lost packet with the replacing waveform;
the waveform substitution unit 730 generates a substitution waveform by using the following waveform substitution formula:
W[j]= WP[j],1≤j≤30,
W[j]= WP[j]*(1-(j-30)/(15*30)),30<j≤480,
W[j]=0,j>480,
where W represents a substitute waveform for the missing data packet, WP represents a waveform corresponding to the missing data packet in a previous pitch period, and j represents a j-th point in the waveform.
The adaptive filtering unit 740 is configured to perform adaptive smoothing filtering on the waveforms on both sides of the waveform gap formed by the lost packet according to the waveforms corresponding to the waveforms on both sides of the waveform gap formed by the lost packet in the previous pitch period, where the filtering formula is as follows:
LDC[i]=( LDP[i]* (i/Rf) + LD[i]* (30-(i/Rf)))/30,
RDC[i]=( RDP[i]* (30-(i/Rf)) + RD[i]* (i/Rf))/30,
wherein Rf = Rmax +1, i is an integer between 0 and 29,
LD and RD respectively represent the waveform of a packet to the left and the waveform of a packet to the right of a waveform gap formed by the lost packet, LDP and RDP respectively represent the waveforms of the previous pitch period corresponding to the waveforms LD and RD, and LDC and RDC respectively represent the waveforms corresponding to the waveforms LD and RD after filtering.
Example three
As shown in fig. 8, a bluetooth voice processing chip according to a third embodiment of the present invention includes at least one processor 810, a memory 820 and an interface 830, where the at least one processor 810, the memory 820 and the interface 830 are all connected by a bus;
the memory 820 stores computer-executable instructions;
the at least one processor 810 executes the computer-executable instructions stored in the memory, so that the bluetooth voice processing chip executes the packet loss concealment method applicable to bluetooth voice communication according to the first embodiment.
In summary, the method, apparatus and chip of the present invention use a multi-frame attenuation method when substituting the waveform of lost Bluetooth voice packets, which effectively avoids erroneous sound; even when the amount of packet loss is large, the result remains acceptable to the human ear. When the substitute waveform is filtered, an adaptive window filtering technique is used, so that the filtering strength is adjusted adaptively. This is better suited to the variable nature of voice, improves the smoothness of the substitute waveform and the call quality, and avoids the insufficient flexibility of a fixed window. When the overall packet loss rate of the Bluetooth voice call is less than 15%, packet loss errors can be effectively hidden, the Bluetooth voice call quality is improved, and the voice communication distance is increased.
Notably, one of ordinary skill in the art will understand that: the steps or part of the steps for implementing the above method embodiments may be implemented by hardware related to program instructions, the program may be stored in a computer-readable storage medium, and when executed, the program performs the steps including the above method embodiments, and the storage medium includes: various media that can store program codes, such as ROM, RAM, magnetic or optical disks.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.