CN117411607A - Audio synchronous playing method, system, audio playing device and audio system

Info

Publication number: CN117411607A
Application number: CN202210800641.XA
Authority: CN
Inventors: 蔡李镇
Original assignee: Ju Li Zhuhai Microelectronics Co ltd
Current assignee: Ju Li Zhuhai Microelectronics Co ltd
Priority date: 2022-07-06
Filing date: 2022-07-06
Publication date: 2024-01-16

Abstract

The method determines a target playing time for a target audio packet according to a preset playing delay, a target offset duration, the receiving time of the target audio packet and the audio duration of the target audio packet, and then plays the target audio packet at the target playing time, so that one or more audio playing devices play the audio synchronously. A plurality of audio playing devices can thus play the same audio accurately and synchronously without using extra bandwidth for time service, which reduces bandwidth requirements and improves the utilization of communication bandwidth.

Description

Audio synchronous playing method, system, audio playing device and audio system

Technical Field

The disclosure relates to the technical field of communication, in particular to an audio synchronous playing method, an audio synchronous playing system, audio playing equipment and an audio system.

Background

In general, to achieve a better audio playback effect, a plurality of audio playing devices may be connected to one terminal device, i.e., a one-to-many connection. In the related art, synchronous playing across a plurality of audio playing devices requires a unified timing scheme between the audio playing devices and the terminal device: the audio playing devices rely on accurate time service from the terminal device, and the audio playing devices must continuously communicate with one another in real time to agree on clock information for synchronous playing, so that the playing time difference among them stays within a preset range. However, this synchronization method requires establishing communication among the audio playing devices, occupies additional communication bandwidth, and reduces the effective bandwidth utilization of the overall communication system.

Disclosure of Invention

The disclosure aims to provide an audio synchronous playing method, an audio synchronous playing system, an audio playing device and an audio system, so that audio playing devices can play audio synchronously across multiple devices without relying on time service from the audio sending device.

In a first aspect, the present disclosure provides an audio synchronous playing method, applied to an audio playing device, the method including:

determining a target offset duration, wherein the target offset duration characterizes a time interval between a receiving time when an audio playback module of the audio playing device receives a target audio packet sent by an audio sending device and a starting point of a transmission window where the audio sending device transmits the target audio packet;

determining a target playing time for playing the target audio packet according to a preset playing delay, the target offset duration, the receiving time and the audio duration of the target audio packet, wherein the preset playing delay represents the duration required for at least one audio playing device connected with the audio sending device to complete receiving the target audio packet and synchronously playing the target audio packet;

and playing the target audio packet at the target playing time.

Optionally, playing the target audio packet at the target playing time includes:

when the target audio packet is the first audio packet of an audio stream, determining a waiting duration according to the decoding duration of the target audio packet, the target playing time and the receiving time;

and when the audio playing device has finished decoding the target audio packet and the waiting duration has elapsed, determining that the target playing time is reached, and playing the target audio packet.

Optionally, the method further comprises:

determining, in the process of playing audio packets, an actual playing delay according to the target offset duration, the decoding duration of a next target audio packet, the audio duration of the next target audio packet and the buffer remaining duration of the currently played audio packet, wherein the buffer remaining duration represents the remaining audio duration of the currently played audio packet when the next target audio packet has been decoded;

and determining a target playing rate for playing the audio packet with the buffer remaining duration according to the actual playing delay and the preset playing delay, wherein the target playing rate keeps the difference between the preset playing delay and the actual playing delay determined according to the target playing rate within a preset range.

Optionally, the determining the actual playing delay according to the target offset duration, the decoding duration of the next target audio packet, the audio duration of the next target audio packet, and the buffer remaining duration of the currently played audio packet includes:

and determining the actual playing delay according to the sum of the target offset duration, the decoding duration of the next target audio packet, the audio duration of the next target audio packet and the buffer remaining duration.

Optionally, the determining, according to the actual playing delay and the preset playing delay, a target playing rate for playing the audio packet with the buffer remaining duration includes:

when the actual playing delay is smaller than the preset playing delay, reducing the playing rate of the audio playing device to obtain the target playing rate;

and when the actual playing delay is larger than the preset playing delay, increasing the playing rate of the audio playing device to obtain the target playing rate.
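The optional rate-adjustment steps above can be sketched as follows. This is only an illustrative sketch, assuming all durations are expressed in microseconds. The function names, the `tolerance_us` dead band and the `step` size are hypothetical; the disclosure fixes only the sum used for the actual playing delay and the direction of the adjustment.

```python
def actual_play_delay_us(target_offset_us, next_decode_us, next_audio_us, buffer_remaining_us):
    """Actual playing delay as the sum of the four durations named in the text."""
    return target_offset_us + next_decode_us + next_audio_us + buffer_remaining_us


def target_play_rate(current_rate, actual_delay_us, preset_delay_us,
                     tolerance_us=0, step=0.001):
    """Adjust the playing rate; tolerance_us and step are illustrative assumptions."""
    if actual_delay_us < preset_delay_us - tolerance_us:
        return current_rate * (1.0 - step)  # actual delay below preset: play slower
    if actual_delay_us > preset_delay_us + tolerance_us:
        return current_rate * (1.0 + step)  # actual delay above preset: play faster
    return current_rate  # within the preset range: leave the rate unchanged


delay = actual_play_delay_us(4_200, 2_500, 10_000, 3_000)
print(delay, target_play_rate(1.0, delay, 20_000))
```

A real implementation would likely scale the step with the size of the deviation; the fixed step here only shows the direction of the correction.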

Optionally, the determining the target offset duration includes:

determining a window offset duration and a receiving offset duration, wherein the window offset duration represents a time interval between a start time at which the audio sending device begins transmitting the target audio packet to the audio playing device and the starting point of the transmission window, and the receiving offset duration represents a time interval between the time when the audio playback module of the audio playing device receives the target audio packet and the start time;

and obtaining the target offset duration according to the sum of the window offset duration and the receiving offset duration.

Optionally, the determining, according to a preset playing delay, the target offset duration, the receiving time and the audio duration of the target audio packet, the target playing time for playing the target audio packet includes:

determining a first target duration according to the sum of the target offset duration and the audio duration of the target audio packet;

determining a second target duration according to the difference between the preset playing delay and the first target duration;

and determining the time which is spaced from the receiving time by the second target time length as the target playing time.

In a second aspect, the present disclosure provides an audio synchronized playback system applied to an audio playback device, the system comprising:

a first determining module configured to determine a target offset duration, where the target offset duration characterizes a time interval between a receiving time when an audio playback module of the audio playing device receives a target audio packet sent by an audio sending device and a starting point of a transmission window where the audio sending device transmits the target audio packet;

a second determining module configured to determine a target playing time for playing the target audio packet according to a preset playing delay, the target offset duration, the receiving time and the audio duration of the target audio packet, wherein the preset playing delay represents the duration required for at least one audio playing device connected with the audio sending device to complete receiving the target audio packet and synchronously playing the target audio packet;

and a playing module configured to play the target audio packet at the target playing time.

In a third aspect, the present disclosure provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the method of the first aspect.

In a fourth aspect, the present disclosure provides an audio playback apparatus comprising:

a memory having a computer program stored thereon;

a controller for executing the computer program in the memory to implement the steps of the method of the first aspect.

In a fifth aspect, the present disclosure provides an audio system comprising an audio transmitting apparatus and at least one audio playing apparatus, wherein:

the audio transmitting apparatus is configured to transmit a target audio packet to the at least one audio playing apparatus;

the at least one audio playing apparatus is configured to:

and playing the target audio packet at the target playing time.

Through the above technical solution, when the audio sending device sends an audio packet to an audio playing device, that device does not need to exchange clock information with the other audio playing devices connected to the audio sending device; that is, no clock alignment is performed between different audio playing devices. Each audio playing device calculates its target playing time according to its own target offset duration, the receiving time of the target audio packet, the audio duration of the target audio packet and the preset playing delay, and synchronous audio playing across one or more audio playing devices is thereby achieved. The method ensures that a plurality of audio playing devices play the same audio accurately and synchronously without using extra bandwidth to exchange clock information, which reduces bandwidth requirements and improves the utilization of communication bandwidth. Experiments show that, measured against the same starting point on the audio sending device, the delay error of audio playback by the plurality of audio playing devices can be kept within 6us.

Additional features and advantages of the present disclosure will be set forth in the detailed description which follows.

Drawings

The accompanying drawings are included to provide a further understanding of the disclosure, and are incorporated in and constitute a part of this specification, illustrate the disclosure and together with the description serve to explain, but do not limit the disclosure. In the drawings:

Fig. 1 is an application scenario diagram illustrating an audio synchronized playback method according to an exemplary embodiment.

Fig. 2 is a flowchart illustrating an audio synchronized playback method according to an exemplary embodiment.

Fig. 3 is a schematic diagram illustrating targeted audio packet transmission according to an exemplary embodiment.

Fig. 4 is a schematic diagram of a transmission window shown according to an example embodiment.

Fig. 5 is a schematic diagram illustrating clock conversion according to an exemplary embodiment.

Fig. 6 is a schematic diagram illustrating wait time periods according to an example embodiment.

Fig. 7 is a flowchart illustrating an audio synchronized playback method according to another exemplary embodiment.

Fig. 8 is a schematic diagram illustrating audio playback in accordance with an exemplary embodiment.

Fig. 9 is a schematic diagram showing actual play delays according to an exemplary embodiment.

Fig. 10 is a schematic diagram showing a module connection of an audio synchronized playback system according to an exemplary embodiment.

Fig. 11 is a block diagram illustrating an audio playback device according to an exemplary embodiment.

Detailed Description

Specific embodiments of the present disclosure are described in detail below with reference to the accompanying drawings. It should be understood that the detailed description and specific examples, while indicating and illustrating the disclosure, are not intended to limit the disclosure.

It should be noted that, all actions for acquiring signals, information or data in the present disclosure are performed under the condition of conforming to the corresponding data protection rule policy of the country of the location and obtaining the authorization given by the owner of the corresponding device.

The embodiment of the disclosure provides an audio system comprising an audio sending device and one or more audio playing devices. The audio sending device and the audio playing devices may be connected through Bluetooth; for example, the audio sending device may transmit audio to the one or more audio playing devices based on CIS channels and BIS channels in LE Audio (Bluetooth Low Energy Audio). Of course, the connection may also be made through other networks, such as a WiFi network, a cellular communication network or wired communication. The audio sending device may be a terminal device such as a mobile terminal or a vehicle-mounted terminal, and the audio playing device may be a device with an audio playing function, such as a mobile terminal, an earphone or a sound box. Fig. 1 is a schematic view of an application scenario of an audio synchronous playing method according to an exemplary embodiment. As shown in Fig. 1, an audio sending device may be connected to a first audio playing device, a second audio playing device, …, and an nth audio playing device, respectively.

Fig. 2 is a flowchart illustrating an audio synchronized playback method according to an exemplary embodiment. As shown in Fig. 2, an embodiment of the present disclosure provides an audio synchronous playing method that may be applied to the audio playing device shown in Fig. 1. Specifically, the method may be executed by an audio synchronous playing system that is implemented in software and/or hardware and configured in the audio playing device. The method may include the following steps.

In step 210, a target offset duration is determined, where the target offset duration characterizes a time interval between a time when an audio playback module of the audio playing device receives a target audio packet sent by an audio sending device and a starting point of a transmission window where the audio sending device transmits the target audio packet.

Here, Fig. 3 is a schematic diagram illustrating transmission of a target audio packet according to an exemplary embodiment. As shown in Fig. 3, the audio sending device 30 communicates with the audio playing device 31. The audio playing device 31 receives the target audio packet sent by the audio sending device 30 through the data transmission module 311 and passes the received target audio packet to the audio playback module 312. The audio playback module 312 decodes the target audio packet and converts the digital signal into an analog signal to drive the audio output module (not shown in Fig. 3) to play sound.

According to a synchronous transmission protocol between the audio sending device and the audio playing devices, the audio sending device transmits target audio packets, within a transmission window, to each of the one or more audio playing devices connected to it. Fig. 4 is a schematic diagram of a transmission window according to an exemplary embodiment. As shown in Fig. 4, within the transmission window the audio sending device transmits a target audio packet to the first audio playing device in a first sub-transmission window that starts a first window offset duration after the starting point of the transmission window, to the second audio playing device in a second sub-transmission window that starts a second window offset duration after the starting point, and to the nth audio playing device in an nth sub-transmission window that starts an nth window offset duration after the starting point. In other words, in each transmission window the audio sending device sends the target audio packet to the audio playing devices at different times. It should be noted that the target audio packets transmitted in the same transmission window may be identical, or may carry different channels of the same audio, such as left-channel audio and right-channel audio.

It should be understood that the window offset duration refers to a time interval between a start time of transmission of the target audio packet by the audio transmitting apparatus to the audio playing apparatus and a start point of a transmission window of the audio transmitting apparatus. The window offset duration may be negotiated when the audio transmitting apparatus establishes a communication connection with the audio playing apparatus. After the window offset duration, the audio transmitting apparatus transmits the target audio packet to the audio playing apparatus, and within the transmission window, the target audio packet is allowed to be transmitted one or more times so as to be retransmitted when the transmission is disturbed, thereby improving the probability of successful transmission of the target audio packet.

In addition, a single transmission of the target audio packet takes time, so the audio playing device receives the target audio packet with a delay relative to the starting point of the transmission window; the processing delay of the data transmission module of the audio playing device is the receiving offset duration. The receiving offset duration characterizes the time interval between the time when the audio playback module of the audio playing device receives the target audio packet and the start time at which the audio sending device begins transmitting the target audio packet to the audio playing device. It should be noted that, if the target audio packet cannot be transmitted successfully within the whole transmission window, an empty packet is returned to the audio playback module of the audio playing device after the last transmission failure.

In some embodiments, the audio playing device may determine a window offset duration and a receiving offset duration corresponding to the audio playing device, and obtain the target offset duration according to a sum of the window offset duration and the receiving offset duration.

For example, suppose the transmission window of the audio sending device is 10000us long, the starting point of the transmission window in which target audio packet A is transmitted is at 100000us, the window offset duration corresponding to audio playing device B is 3000us, and the receiving offset duration is 1200us. Then at 103000us the audio sending device starts to transmit target audio packet A to the data transmission module of audio playing device B, and at 104200us the audio playback module of audio playing device B receives target audio packet A. The corresponding target offset duration is 4200us.
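The arithmetic of this example can be reproduced with a short sketch. The function and variable names are illustrative assumptions, not part of the disclosure, and all values are in microseconds on the clock of the audio sending device.

```python
def target_offset_us(window_offset_us, receive_offset_us):
    """Target offset duration = window offset duration + receiving offset duration."""
    return window_offset_us + receive_offset_us


window_start_us = 100_000    # starting point of the transmission window
window_offset_us = 3_000     # negotiated for audio playing device B
receive_offset_us = 1_200    # delay until the audio playback module has the packet

send_start_us = window_start_us + window_offset_us
receive_time_us = send_start_us + receive_offset_us
offset_us = target_offset_us(window_offset_us, receive_offset_us)

print(send_start_us, receive_time_us, offset_us)  # 103000 104200 4200
```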

It should be noted that the window offset duration corresponding to each audio playing device is fixed, and the receiving offset duration is bounded by the sub-transmission window. The window offset duration and the size of the sub-transmission window are determined by the synchronous transmission protocol and agreed when the audio playing device establishes a communication connection with the audio sending device. For example, as shown in Fig. 4, the first sub-transmission window, the second sub-transmission window and the nth sub-transmission window each contain the corresponding receiving offset duration. The sending and receiving of the target audio packet can be completed within the time range of the sub-transmission window.

In addition, the target offset duration described above is measured on the clock of the audio playing device. The audio playback module of the audio playing device has a clock that starts counting from 0us after start-up. During transmission of the target audio packet, the starting point of the transmission window, the receiving offset duration and the window offset duration are timed by the clock of the audio sending device. When the audio playing device plays back audio, the starting point of the transmission window, the receiving offset duration and the window offset duration, which are timed by the clock of the audio sending device, need to be converted to the clock of the audio playback module.

Fig. 5 is a schematic diagram illustrating clock conversion according to an exemplary embodiment. As shown in Fig. 5, on the clock of the audio sending device, when the starting point of the transmission window is at 100000us and the window offset duration corresponding to audio playing device B is 3000us, the audio sending device starts to transmit target audio packet A to the data transmission module of audio playing device B at 103000us. The receiving offset duration corresponding to audio playing device B is 1200us, so at 104200us the audio playback module of audio playing device B receives target audio packet A, and the corresponding target offset duration is 4200us.

On the clock of the audio playing device, if the audio playback module of audio playing device B receives target audio packet A at 210000us, the corresponding starting point of the transmission window is at 205800us, that is, 210000us minus the 4200us target offset duration.
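The conversion in this example amounts to subtracting the target offset duration from the locally observed receiving time. The sketch below assumes that the clocks of the two devices tick at the same rate over the few milliseconds involved, so that a duration measured on one clock can be reused on the other; the function name is hypothetical.

```python
def window_start_on_playback_clock(receive_time_local_us, target_offset_us):
    """Map the starting point of the transmission window onto the playback clock.

    Only a duration (the target offset) crosses between clocks, so no
    absolute timestamp from the audio sending device is needed.
    """
    return receive_time_local_us - target_offset_us


print(window_start_on_playback_clock(210_000, 4_200))  # 205800
```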

In step 220, determining a target playing time for playing the target audio packet according to a preset playing time delay, the target offset time length, the receiving time and the audio time length of the target audio packet, where the preset playing time delay characterizes a time length required by at least one audio playing device connected with the audio sending device to complete receiving the target audio packet and synchronously playing the target audio packet.

Here, the preset playing delay is a time window within which one or more audio playing devices can complete receiving the target audio packet, decoding it, synchronously starting to play it and synchronously finishing playing it. The preset playing delay may therefore be the maximum time needed by the one or more audio playing devices to complete receiving the target audio packet, decoding it and synchronously finishing playing it. Different audio playing devices have the same preset playing delay, which may be negotiated between the audio sending device and the audio playing devices. For example, the preset playing delay may be 20000us; within this 20000us, all audio playing devices connected to the audio sending device can receive the target audio packet, decode it, start playing it at the same time and finish playing it at the same time.

It should be noted that the preset playing delay may be determined by the audio sending device from the decoding duration, the target offset duration, the communication time and other timing information that it acquires from each audio playing device when that device connects to the audio sending device.

The audio duration of the target audio packet refers to the duration of the audio data used when the target audio packet is encoded. For example, for audio sampled at 48kHz in which each target audio packet encodes 480 sampling points, the audio duration of the target audio packet is 10000us. The audio duration of the target audio packet may also refer to the duration required to play the target audio packet at a preset playing speed. For example, when the preset playing speed matches the sampling rate of the target audio packet, the duration required to play the target audio packet is substantially equal to the duration of the audio data used when the audio packet was encoded.
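As a quick check of the 48kHz example, the audio duration follows directly from the number of sampling points and the sampling rate. The helper name below is an assumption made for illustration.

```python
def audio_duration_us(samples_per_packet, sample_rate_hz):
    """Duration of the audio carried by one packet, in microseconds."""
    return samples_per_packet * 1_000_000 // sample_rate_hz


print(audio_duration_us(480, 48_000))  # 10000
```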

The target playing time refers to the time when the audio playback module of the audio playing device starts playing the target audio packet. Each target audio packet has a corresponding target playing time. On the time axis, the end time of the previous audio packet of the audio stream is the playing time of the next audio packet.

In some embodiments, a first target duration may be determined as the sum of the target offset duration and the audio duration of the target audio packet, a second target duration may then be determined as the difference between the preset playing delay and the first target duration, and the time that follows the receiving time by the second target duration may be determined as the target playing time.

For example:

first target duration = target offset duration + audio duration;

second target duration = preset playing delay - first target duration;

target playing time = receiving time + second target duration.
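The three relations can be combined into one small function. This is a sketch under the assumption that all quantities are integers in microseconds and that the receiving time is read from the clock of the audio playback module; the names are illustrative. The sample values are the ones used in the examples of this description.

```python
def target_play_time_us(receive_time_us, preset_delay_us,
                        target_offset_us, audio_duration_us):
    """Target playing time on the clock of the audio playback module."""
    first_target = target_offset_us + audio_duration_us   # offset + audio duration
    second_target = preset_delay_us - first_target        # remaining delay budget
    return receive_time_us + second_target


print(target_play_time_us(210_000, 20_000, 4_200, 10_000))  # 215800
```

With a receiving time of 210000us this gives 215800us, which is the starting point of the transmission window on the local clock (205800us) plus the preset playing delay minus the audio duration.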

It should be understood that the target playing times at which different audio playing devices play the same target audio packet coincide, even though the target offset duration differs from one audio playing device to another.

In step 230, the target audio packet is played at the target playing time.

The target playing time is referenced to the clock in the audio playback module of the audio playing device. When the target playing time is reached, the audio playback module of the audio playing device starts playing the target audio packet, so that the audio playing devices connected with the audio sending device play the same target audio packet synchronously.

Therefore, when the audio sending device sends an audio packet to an audio playing device, that device does not need to exchange clock information with the other audio playing devices connected to the audio sending device; that is, no clock alignment is performed between different audio playing devices. Each audio playing device calculates its target playing time according to its own target offset duration, the receiving time of the target audio packet, the audio duration of the target audio packet and the preset playing delay, and synchronous audio playing across one or more audio playing devices is thereby achieved. The method ensures that a plurality of audio playing devices play the same audio accurately and synchronously without using extra bandwidth to exchange clock information, which reduces bandwidth requirements and improves the utilization of communication bandwidth. Experiments show that, measured against the same starting point on the audio sending device, the delay error of audio playback by an audio playing device can be kept within 6us, and the time difference between the synchronous audio playing of a plurality of audio playing devices can be kept within 12us.

In some implementations, when the target audio packet is the first audio packet of the audio stream, a waiting duration may be determined according to the decoding duration of the target audio packet, the target playing time and the receiving time. When the audio playing device has finished decoding the target audio packet and the waiting duration has elapsed, the target playing time is determined to have been reached, and the target audio packet is played.

Here, the target audio packet is an audio packet divided from an audio stream, and the divided target audio packets are sequentially arranged in time order to form the audio stream. Since the target offset durations corresponding to different audio playback devices are different, the waiting durations corresponding to different audio playback devices connected to the audio transmission device may be different. The audio playing device connected with the audio sending device plays the same target audio packet synchronously after the corresponding waiting time.

The decoding duration of the target audio packet refers to the duration required by the audio playing device to decode the target audio packet.

The waiting duration refers to the time the audio playing device needs to wait after it finishes decoding the target audio packet. Once decoding is finished and the waiting duration has elapsed, the target playing time is determined to have been reached, and the audio playback module of the audio playing device starts to play the target audio packet. Different audio playing devices have the same preset playing delay, and for each audio playing device the corresponding target offset duration, decoding duration and audio duration can be determined, so the waiting duration of each audio playing device can be determined.

It should be understood that the target playing time and the receiving time of the target audio packet are referenced to a clock in the audio playback module.

The audio playback module decodes the target audio packet in the time window between the receiving time and the target playing time. Typically, the target playing time is reached directly after the audio playback module finishes decoding the target audio packet. However, the preset playing delay is the time required for one or more audio playing devices to complete receiving the target audio packet, decoding it, synchronously starting to play it and synchronously finishing playing it, and the decoding durations of the individual audio playing devices differ, so the corresponding waiting durations also differ.

Fig. 6 is a schematic diagram illustrating a waiting duration according to an exemplary embodiment. As shown in Fig. 6, assume that the audio duration of the target audio packet is 10000us and the preset playing delay is 20000us. The audio sending device opens a transmission window for target audio packet A at 100000us and, after a window offset duration of 3000us, starts transmitting target audio packet A to audio playing device B at 103000us; the audio playback module of audio playing device B receives target audio packet A after a receiving offset duration of 1200us. On the clock of the audio playback module, the receiving time is 210000us and the starting point of the transmission window is 205800us. At 210000us the audio playback module starts to decode target audio packet A; with a decoding duration of 2500us, decoding completes at 212500us. Since the target playing time of target audio packet A is 215800us, the corresponding waiting duration is 3300us. That is, after the audio playback module finishes decoding target audio packet A, it waits a further 3300us before the target playing time is reached.
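The waiting duration in the Fig. 6 example is what remains between the end of decoding and the target playing time. A minimal sketch, with hypothetical names and all values in microseconds on the clock of the audio playback module:

```python
def waiting_duration_us(target_play_time_us, receive_time_us, decode_duration_us):
    """Time to wait after decoding finishes before playback may start."""
    decode_done_us = receive_time_us + decode_duration_us  # decoding starts on receipt
    return target_play_time_us - decode_done_us


print(waiting_duration_us(215_800, 210_000, 2_500))  # 3300
```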

It should be noted that, after their respective waiting durations, different audio playing devices play the same target audio packet at the same time. Here, "at the same time" means at the same instant on the world clock. For example, a first audio playing device may play the target audio packet a at 215800us on the clock of its audio playback module, while a second audio playing device may play the target audio packet a at 216800us on the clock of its audio playback module; on the world clock, however, the first audio playing device and the second audio playing device play the target audio packet a synchronously.
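The following Python sketch illustrates why different local readings can correspond to one world-clock instant. It assumes, purely for illustration, that each local clock equals the world clock plus a fixed offset, that the window starts at 100000us on the world clock, and that the second device has a target offset duration of 4700us; none of these values or names appear in the disclosure.

```python
PRESET_DELAY_US = 20_000
AUDIO_US = 10_000
WINDOW_START_WORLD_US = 100_000  # assumed world-clock instant of the window start

# Hypothetical devices: local clock = world clock + clock_offset_us.
DEVICES = {
    "first": {"clock_offset_us": 105_800, "target_offset_us": 4_200},
    "second": {"clock_offset_us": 106_800, "target_offset_us": 4_700},
}


def target_play_local_us(receive_local_us, target_offset_us):
    # Receiving time + (preset delay - target offset duration - audio duration).
    return receive_local_us + (PRESET_DELAY_US - target_offset_us - AUDIO_US)


def play_instants(device):
    # Reception happens one target offset duration after the window start.
    receive_world_us = WINDOW_START_WORLD_US + device["target_offset_us"]
    receive_local_us = receive_world_us + device["clock_offset_us"]
    play_local_us = target_play_local_us(receive_local_us,
                                         device["target_offset_us"])
    # Convert the local target playing time back to the world clock.
    play_world_us = play_local_us - device["clock_offset_us"]
    return play_local_us, play_world_us


results = {name: play_instants(dev) for name, dev in DEVICES.items()}
for name, (local_us, world_us) in results.items():
    print(f"{name}: local {local_us}us, world {world_us}us")
```

Because the target offset duration is added when the packet is received and subtracted again when the target playing time is computed, it cancels out, so both devices land on the same world-clock instant even though their local readings (215800us and 216800us) differ.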

In other embodiments, if a plurality of audio packets are included in the audio packet linked list for audio playback, the target duration may be determined according to the target offset duration, the decoding duration of the target audio packet, the audio duration of the target audio packet, and the sum of the audio durations of other audio packets, and further the waiting duration may be determined according to a difference between the preset play delay and the target duration.

In the above embodiments of determining the waiting duration, the transmission time consumed by the data transmission module of each audio playing device to send an audio packet to its audio playback module is assumed by default to be the same for all audio playing devices, so the influence of this transmission time on synchronous playing is not considered when the waiting duration is calculated.

In some implementations, the waiting duration may be calculated taking into account the time the data transmission module takes to send the audio packet to the audio playback module. For example, assuming that the preset play delay is 20000us, the target offset duration is 4200us, the decoding duration is 2500us, the audio duration is 10000us, and the transmission time is 10us, the corresponding waiting duration = 20000us - 10000us - 4200us - 2500us - 10us = 3290us.
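The waiting-duration variants described above (a single packet, several queued packets, and an optional transmission time) can be gathered into one helper. This is a minimal sketch: the function name and parameters are hypothetical, and combining the queued-packet term with the transmission-time term in a single formula is an assumption rather than something the text states.

```python
def waiting_duration_us(preset_delay_us, target_offset_us, decode_us, audio_us,
                        other_audio_us=(), transfer_us=0):
    """Waiting duration between end of decoding and the target playing time.

    other_audio_us: audio durations of other packets in the audio packet list.
    transfer_us: time for the data transmission module to hand the packet
                 to the audio playback module (0 if ignored).
    """
    # Target duration = everything that consumes part of the preset delay.
    target_duration_us = (target_offset_us + decode_us + audio_us
                          + sum(other_audio_us) + transfer_us)
    # Waiting duration = preset play delay - target duration.
    return preset_delay_us - target_duration_us


# Fig. 6 values, transmission time ignored.
base = waiting_duration_us(20_000, 4_200, 2_500, 10_000)
# Same values with a 10us transmission time.
with_transfer = waiting_duration_us(20_000, 4_200, 2_500, 10_000, transfer_us=10)
# Hypothetical case: 40000us preset delay and two other queued 10000us packets.
queued = waiting_duration_us(40_000, 4_200, 2_500, 10_000,
                             other_audio_us=(10_000, 10_000))
print(base, with_transfer, queued)
```

The first two calls use the figures given in the text; the third uses made-up values only to show how queued packets enter the sum.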

As shown in Fig. 6, once the waiting duration of 3300us has elapsed, the target playing time of 215800us is reached and the audio playback module plays the target audio packet a.

That is, the audio playing device finishes decoding the target audio packet and, once the waiting duration has elapsed, the target playing time is reached and the audio playback module of the audio playing device starts playing the target audio packet.

It should be noted that, for the first audio packet in the audio stream, by calculating the waiting duration, the audio stream can be synchronously played by a plurality of audio playing devices.

Fig. 7 is a flowchart illustrating an audio synchronized playback method according to another exemplary embodiment. As shown in fig. 7, the audio synchronized playback method may further include the following steps.

In step 410, during the process of playing the audio packet, determining an actual playing delay according to the target offset duration, the decoding duration of the next target audio packet, the audio duration of the next target audio packet, and the buffer remaining duration of the currently played audio packet, where the buffer remaining duration characterizes the remaining audio duration of the currently played audio packet when the decoding of the next target audio packet is completed.

Here, the buffer remaining duration refers to the remaining audio duration of the currently played audio packet at the moment decoding of the next target audio packet is completed. The buffer remaining duration may also be the time required to play that remaining audio at the default playing rate. Fig. 8 is a schematic diagram illustrating audio playback according to an exemplary embodiment. As shown in Fig. 8, the audio packet list holds a plurality of audio packets, of which audio packet N is the most recently received; when producing sound, the audio playback module takes audio packet N-2 from the audio packet list and plays it. Therefore, when the next target audio packet is audio packet N-1, calculating the actual playing delay must take into account the buffer remaining duration of the currently played audio packet N-2.

In some embodiments, the actual playing delay may be determined according to a sum of the target offset duration, the decoding duration of the next target audio packet, the audio duration of the next target audio packet, and the buffer remaining duration.

Fig. 9 is a schematic diagram showing actual play delays according to an exemplary embodiment. As shown in Fig. 9, assume that the preset play delay is 20000us and the audio duration of the next target audio packet is 10000us. On the clock of the audio playing device, the starting point of the transmission window in which the audio transmitting device transmits the next target audio packet is 205800us; after the target offset duration of 4200us, at 210000us, the audio playback module of the audio playing device receives the next target audio packet. After the decoding duration of 2500us, at 212500us, the audio playback module finishes decoding the next target audio packet. The time at which playback of the next target audio packet should begin is 215800us.

Assuming that the buffer remaining duration is 3298us, the actual play delay = target offset duration + decoding duration + audio duration + buffer remaining duration = 4200us + 2500us + 10000us + 3298us = 19998us.
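As a quick check of this sum, the following Python sketch adds the four terms named in step 410 using the Fig. 9 values. The function name is an illustrative assumption.

```python
def actual_play_delay_us(target_offset_us, decode_us, audio_us,
                         buffer_remaining_us):
    # Actual play delay = target offset duration + decoding duration
    #   + audio duration of the next target packet + buffer remaining duration.
    return target_offset_us + decode_us + audio_us + buffer_remaining_us


PRESET_DELAY_US = 20_000
actual_us = actual_play_delay_us(target_offset_us=4_200, decode_us=2_500,
                                 audio_us=10_000, buffer_remaining_us=3_298)
# Positive shortfall: the actual delay is shorter than the preset delay.
shortfall_us = PRESET_DELAY_US - actual_us
print(actual_us, shortfall_us)
```

The 2us shortfall against the 20000us preset play delay is what the playing-rate adjustment of step 420 is meant to absorb.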

In step 420, a target playing rate for playing the audio packet with the buffered remaining duration is determined according to the actual playing delay and the preset playing delay, where the target playing rate maintains a difference between the preset playing delay and the actual playing delay determined according to the target playing rate within a preset range.

Here, while audio packets are being played, the audio playback module calculates the actual playing delay in real time or periodically. It then calculates, from the actual playing delay and the preset playing delay, a target playing rate for playing the audio that remains in the buffer (the buffer remaining duration), and adjusts its playing rate to the target playing rate.

As shown in fig. 9, the time 215800us is a target playing time corresponding to a next target audio packet, and when the audio playing device finishes playing the currently played audio packet at the first playing rate, the target playing time of the next target audio packet should be reached.

The first playing rate refers to the current playing rate of the audio playback module of the audio playing device. For example, if the audio playback module plays the currently played audio packet at rate a and rate a has not been adjusted by the time playback reaches the next target audio packet, rate a is taken as the first playing rate. If the target audio packet is the first audio packet of the audio stream, the playing rate currently set in the audio playback module is taken as the first playing rate.

Since the actual playing delay is the sum of the target offset duration, the decoding duration of the next target audio packet, the audio duration of the next target audio packet and the buffer remaining duration, and the buffer remaining duration can be changed by the playing rate, by adjusting the first playing rate of the audio playback module to the target playing rate, the difference between the preset playing delay and the actual playing delay determined according to the target playing rate can be maintained within the preset range.

The preset range can be set according to actual conditions and may be, for example, 0us or -6us to 6us. When the preset range is 0us, the time at which the currently played audio packet finishes playing coincides with the target playing time of the next target audio packet.

For example, assume that the preset range is 0us and the preset play delay is 20000us. As shown in Fig. 9, the actual play delay is 19998us, which indicates that the audio playback module finishes playing the currently played audio packet at 215798us. The target playing time of the next target audio packet is 215800us, so if the playing rate of the audio playback module is not adjusted, either the next target audio packet starts playing 2us early, or it is held until its target playing time and a 2us audio interruption occurs.

By slowing the first playing rate down to the target playing rate, the time needed to play out the 3298us buffer remaining duration can be stretched to 3300us. That is, at 215800us the audio playback module finishes playing the current audio packet and starts playing the next target audio packet, so that the next target audio packet is played with exactly the preset delay.
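The text does not give a formula for the target playing rate. One plausible reading, sketched below in Python, scales the first playing rate by the ratio of the buffered audio to the time in which that audio must now be played out. The function name and this ratio are assumptions made for illustration only.

```python
from fractions import Fraction


def target_rate_factor(preset_delay_us, actual_delay_us, buffer_remaining_us):
    """Factor to apply to the first playing rate (assumed model).

    A factor below 1 slows playback down; above 1 speeds it up.
    """
    # Time in which the buffered audio must be played so that the
    # actual play delay matches the preset play delay.
    required_play_time_us = buffer_remaining_us + (preset_delay_us - actual_delay_us)
    if required_play_time_us <= 0:
        raise ValueError("buffered audio cannot absorb the delay error")
    return Fraction(buffer_remaining_us, required_play_time_us)


factor = target_rate_factor(preset_delay_us=20_000, actual_delay_us=19_998,
                            buffer_remaining_us=3_298)
# Playing 3298us of audio at this factor takes 3298 / factor microseconds.
stretched_us = Fraction(3_298) / factor
print(factor, float(factor), stretched_us)
```

`Fraction` is used here only to keep the arithmetic exact; under this model the 3298us of buffered audio is stretched to 3300us, matching the example in the text.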

In some embodiments, the playing rate of the audio playing device is reduced to obtain the target playing rate when the actual playing delay is smaller than the preset playing delay.

In some embodiments, when the actual playing delay is greater than the preset playing delay, the playing rate of the audio playing device is increased, and the target playing rate is obtained.
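The two cases above, together with the preset range, amount to a simple three-way decision. The Python sketch below expresses that decision; the function name, the string return values, and the representation of the preset range as a (low, high) pair are illustrative assumptions.

```python
def adjust_direction(preset_delay_us, actual_delay_us, preset_range_us=(0, 0)):
    """Decide how to change the playing rate.

    preset_range_us: allowed (low, high) bounds for
                     preset delay - actual delay, in microseconds.
    """
    low, high = preset_range_us
    diff = preset_delay_us - actual_delay_us
    if low <= diff <= high:
        # Difference already within the preset range: leave the rate alone.
        return "keep"
    if diff > high:
        # Actual delay is smaller than the preset delay: reduce the rate.
        return "slow down"
    # Actual delay is greater than the preset delay: increase the rate.
    return "speed up"


print(adjust_direction(20_000, 19_998))           # actual delay too small
print(adjust_direction(20_000, 20_003))           # actual delay too large
print(adjust_direction(20_000, 19_998, (-6, 6)))  # within a -6us to 6us range
```

With a preset range of 0us any nonzero difference triggers an adjustment, whereas a range of -6us to 6us tolerates the 2us difference from the Fig. 9 example.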

It should be appreciated that after adjusting the first playback rate to the target playback rate, the audio playback module plays the audio packets at the target playback rate until the next adjustment of the playback rate.

For example, the first playback rate is adjusted to the second playback rate in the middle of playing the first audio packet. If the playing rate is not adjusted when the second audio packet is played, continuing to play the second audio packet at the second playing rate.

Therefore, by adjusting the playing rate of the audio packets in the playing process of the audio packets, one or more audio playing devices connected with the audio sending device can accurately and synchronously play the same audio packets.

Fig. 10 is a schematic diagram showing a module connection of an audio synchronized playback system according to an exemplary embodiment. As shown in fig. 10, an embodiment of the present disclosure proposes an audio synchronized playback system, the system 1000 is applied to an audio playback device, and the system 1000 includes:

A first determining module 1001, configured to determine a target offset duration, where the target offset duration characterizes a time interval between a time when an audio playback module of the audio playing device receives a target audio packet sent by an audio sending device and a start point of a transmission window where the audio sending device transmits the target audio packet;

a second determining module 1002, configured to determine a target playing time for playing the target audio packet according to a preset playing delay, the target offset time length, the receiving time and the audio time length of the target audio packet, where the preset playing delay characterizes a time length required by at least one audio playing device connected to the audio sending device to complete receiving the target audio packet and synchronously playing the target audio packet;

a playing module 1003 configured to play the target audio packet at the target playing time.

Optionally, the playing module 1003 includes:

a first determining subunit configured to determine a waiting duration according to the decoding duration of the target audio packet, the target playing time and the receiving time when the target audio packet is the first audio packet of the audio stream;

a playing subunit configured to determine that the target playing time has been reached, and to play the target audio packet, once the audio playing device has finished decoding the target audio packet and the waiting duration has elapsed.

Optionally, the system 1000 further comprises:

a first computing unit configured to determine, in the process of playing audio packets, an actual playing delay according to the target offset duration, the decoding duration of a next target audio packet, the audio duration of the next target audio packet and the buffer remaining duration of a currently played audio packet, wherein the buffer remaining duration represents the remaining audio duration of the currently played audio packet when decoding of the next target audio packet is completed;

a second computing unit configured to determine, according to the actual playing delay and the preset playing delay, a target playing rate for playing the audio of the buffer remaining duration, wherein the target playing rate keeps the difference between the preset playing delay and the actual playing delay determined according to the target playing rate within a preset range.

Optionally, the first computing unit is specifically configured to:

Optionally, the second computing unit is specifically configured to:

Optionally, the first determining module 1001 includes:

a second determining subunit configured to determine a window offset duration and a receiving offset duration, where the window offset duration characterizes the time interval between the starting point of the transmission window and the start time at which the audio sending device begins transmitting the target audio packet to the audio playing device, and the receiving offset duration characterizes the time interval between that start time and the time at which the audio playback module of the audio playing device receives the target audio packet;

a third determining subunit configured to obtain the target offset duration according to the sum of the window offset duration and the receiving offset duration.

Optionally, the second determining module 1002 includes:

a fourth determining subunit configured to determine a first target duration according to a sum of the target offset duration and an audio duration of the target audio packet;

a fifth determining subunit configured to determine a second target duration according to a difference between the preset play delay and the first target duration;

a sixth determination subunit configured to determine, as the target playing time, a time spaced apart from the receiving time by the second target time period.

The specific manner in which the individual functional modules of the systems in the above embodiments perform their operations has been described in detail in connection with the method embodiments and will not be repeated here.

Fig. 11 is a block diagram illustrating an audio playback device according to an exemplary embodiment. As shown in fig. 11, the electronic device 700 may include: a processor 701, a memory 702. The electronic device 700 may also include one or more of a multimedia component 703, an input/output (I/O) interface 704, and a communication component 705.

The processor 701 is configured to control the overall operation of the electronic device 700 to perform all or part of the steps of the above audio synchronous playing method. The memory 702 is used to store various types of data to support operation of the electronic device 700; such data may include, for example, instructions for any application or method operating on the electronic device 700, as well as application-related data such as contact data, sent and received messages, pictures, audio, and video. The memory 702 may be implemented by any type of volatile or non-volatile memory device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disk. The multimedia component 703 may include a screen and an audio component. The screen may be, for example, a touch screen, and the audio component is used to output and/or input audio signals. For example, the audio component may include a microphone for receiving external audio signals; the received audio signals may be further stored in the memory 702 or transmitted through the communication component 705. The audio component further includes at least one speaker for outputting audio signals. The I/O interface 704 provides an interface between the processor 701 and other interface modules, which may be a keyboard, a mouse, buttons, and the like. The buttons may be virtual buttons or physical buttons. The communication component 705 is used for wired or wireless communication between the electronic device 700 and other devices. The wireless communication may be, for example, Wi-Fi, Bluetooth, near field communication (NFC), 2G, 3G, 4G, NB-IoT, eMTC, 5G, or a combination of one or more of these, which is not limited herein. Accordingly, the communication component 705 may include a Wi-Fi module, a Bluetooth module, an NFC module, and the like.

In an exemplary embodiment, the electronic device 700 may be implemented by one or more application specific integrated circuits (Application Specific Integrated Circuit, abbreviated as ASIC), digital signal processor (Digital Signal Processor, abbreviated as DSP), digital signal processing device (Digital Signal Processing Device, abbreviated as DSPD), programmable logic device (Programmable Logic Device, abbreviated as PLD), field programmable gate array (Field Programmable Gate Array, abbreviated as FPGA), controller, microcontroller, microprocessor, or other electronic components for performing the above-described audio synchronized playback method.

In another exemplary embodiment, a computer readable storage medium is also provided, comprising program instructions which, when executed by a processor, implement the steps of the audio synchronized playback method described above. For example, the computer readable storage medium may be the memory 702 including program instructions described above, which are executable by the processor 701 of the electronic device 700 to perform the audio synchronized playback method described above.

In another exemplary embodiment, a computer program product is also provided, comprising a computer program executable by a programmable apparatus, the computer program having code portions for performing the above-described audio synchronized playback method when executed by the programmable apparatus.

The embodiment of the disclosure also provides an audio system, which comprises an audio sending device and at least one audio playing device, wherein:

the at least one audio playback device is configured to:

and playing the target audio packet at the target playing time.

The preferred embodiments of the present disclosure have been described in detail above with reference to the accompanying drawings, but the present disclosure is not limited to the specific details of the above embodiments, and various simple modifications may be made to the technical solutions of the present disclosure within the scope of the technical concept of the present disclosure, and all the simple modifications belong to the protection scope of the present disclosure.

In addition, the specific features described in the foregoing embodiments may be combined in any suitable manner, and in order to avoid unnecessary repetition, the present disclosure does not further describe various possible combinations.

Moreover, the various embodiments of the present disclosure may be combined with one another in any manner, and as long as such a combination does not depart from the spirit of the present disclosure, it should likewise be regarded as content disclosed by the present disclosure.

Claims

1. An audio synchronous playing method, which is applied to an audio playing device, comprising:

And playing the target audio packet at the target playing time.

2. The method of claim 1, wherein playing the target audio packet at the target play time comprises:

3. The method according to claim 1 or 2, characterized in that the method further comprises:

4. The method of claim 3, wherein determining the actual playback delay based on the target offset duration, the decoding duration of the next target audio packet, the audio duration of the next target audio packet, and the buffer remaining duration of the currently played audio packet comprises:

5. The method of claim 3, wherein determining a target playback rate for playing the buffered remaining length of audio packets based on the actual playback delay and the preset playback delay comprises:

6. The method of claim 1, wherein the determining the target offset duration comprises:

7. The method of claim 1, wherein determining the target playing time for playing the target audio packet according to the preset playing delay, the target offset time length, the receiving time and the audio time length of the target audio packet comprises:

8. An audio synchronized playback system for use with an audio playback device, the system comprising:

9. A non-transitory computer readable storage medium, having stored thereon a computer program, characterized in that the program when executed by a processor realizes the steps of the method according to any of claims 1-7.

10. An audio playback device, comprising:

a memory having a computer program stored thereon;

a controller for executing the computer program in the memory to implement the steps of the method of any one of claims 1-7.

11. An audio system comprising an audio transmitting apparatus and at least one audio playing apparatus, wherein:

the at least one audio playback device is configured to:

And playing the target audio packet at the target playing time.