CN112887404B - Audio transmission control method, equipment and computer readable storage medium

Info

Publication number: CN112887404B
Application number: CN202110103836.4A
Authority: CN
Inventors: 廖松茂
Original assignee: Nubia Technology Co Ltd
Current assignee: Nubia Technology Co Ltd
Priority date: 2021-01-26
Filing date: 2021-01-26
Publication date: 2023-08-11
Anticipated expiration: 2041-01-26
Also published as: CN112887404A

Abstract

The invention discloses an audio transmission control method, a device, and a computer-readable storage medium. The method comprises: creating a collection queue for recording received audio data; acquiring the audio data to be played from the collection queue; when the audio data to be played is extracted, removing the extracted audio data from the collection queue; and, when the audio transmission delay in the current state, as determined from the real-time capacity of the collection queue, exceeds a preset value, reducing the amount of audio data received into the collection queue so as to reduce the audio transmission delay. This provides a user-friendly audio transmission control scheme: delay that may arise during screen recording or screen casting is detected promptly, its degree is judged accurately, and it is handled without the user perceiving it, which improves the user experience.

Description

Audio transmission control method, equipment and computer readable storage medium

Technical Field

The present invention relates to the field of mobile communications, and in particular, to an audio transmission control method, apparatus, and computer readable storage medium.

Background

In the prior art, as intelligent terminal devices continue to develop, users cast or record the device screen more and more often. Existing screen casting and screen recording schemes, however, may suffer from delay in the control of audio transmission. Specifically, during screen casting or recording, audio playback may fall out of step with the transmission of the original audio, and the delay accumulates as the casting or recording session grows longer. That is, when screen casting or recording has just started, the user is unlikely to perceive the delay; after a long session, the accumulated delay becomes increasingly obvious. Such audio delay interferes with normal screen casting or recording, prevents a cast or recorded file with synchronized video and audio from being generated, and degrades the user experience.

Disclosure of Invention

In order to solve the above technical drawbacks in the prior art, the present invention provides an audio transmission control method, which includes:

creating a collection queue for recording received audio data;

acquiring the audio data to be played from the collection queue;

when the audio data to be played is extracted, removing the extracted audio data from the collection queue; and

when the audio transmission delay in the current state, as determined from the real-time capacity of the collection queue, exceeds a preset value, reducing the amount of audio data received into the collection queue so as to reduce the audio transmission delay.

Optionally, creating the collection queue for recording the received audio data includes:

generating a first thread for receiving the audio data transmitted over the network and decoding the audio data; and

creating the collection queue by the first thread, and recording, through the collection queue, the audio data that has been received and decoded in the current state but not yet played.

Optionally, acquiring the audio data to be played from the collection queue includes:

generating a second thread for monitoring the playing state of the audio data; and

acquiring the audio data to be played from the collection queue through the second thread.

Optionally, removing the extracted audio data from the collection queue when the audio data to be played is extracted includes:

extracting the audio data to be played from the collection queue through the second thread; and

removing the audio data to be played from the collection queue through the first thread.

Optionally, reducing the amount of audio data received into the collection queue when the audio transmission delay in the current state, as determined from the real-time capacity of the collection queue, exceeds a preset value includes:

presetting a preset capacity value of the collection queue for monitoring the delay; and

acquiring the real-time capacity of the collection queue, and determining how the real-time capacity compares with the preset capacity value.

Optionally, reducing the amount of audio data received into the collection queue when the audio transmission delay in the current state, as determined from the real-time capacity of the collection queue, exceeds a preset value further includes:

if the real-time capacity is smaller than the preset capacity value, determining that the delay does not exceed the preset value;

if the real-time capacity is larger than the preset capacity value, determining that the delay exceeds the preset value;

presetting a frame skip condition for processing the received audio data;

when the delay is determined to exceed the preset value, reducing the subsequent amount of audio data received into the collection queue according to the frame skip condition, so as to reduce the audio transmission delay;

presetting an amplitude condition for extracting audio data according to amplitude; and

when the delay is determined to exceed the preset value, detecting and deleting the received audio data that meets the amplitude condition, so as to reduce the subsequent amount of audio data received into the collection queue and reduce the audio transmission delay.
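
The frame skip condition and the amplitude condition are not specified further in the text. As a minimal sketch, assuming 16-bit PCM frames in native byte order, the two conditions might look like the following. The function names, the `skip_every` parameter, and the `quiet_peak` cutoff are hypothetical and chosen only for illustration.

```python
from array import array


def peak_amplitude(frame: bytes) -> int:
    """Peak absolute sample value of a 16-bit PCM frame (assumed format)."""
    samples = array("h")
    samples.frombytes(frame)
    return max((abs(s) for s in samples), default=0)


def skip_frames(frames, skip_every=2):
    """Frame skip condition (assumed form): drop every skip_every-th frame."""
    return [f for i, f in enumerate(frames) if (i + 1) % skip_every != 0]


def drop_quiet_frames(frames, quiet_peak=200):
    """Amplitude condition (assumed form): drop frames whose peak is at or
    below quiet_peak, on the assumption that they are hard to hear."""
    return [f for f in frames if peak_amplitude(f) > quiet_peak]
```

Either function would be applied to incoming frames only while the delay is judged to exceed the preset value; otherwise every frame is admitted to the collection queue unchanged.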

The invention also proposes an audio transmission control device comprising a memory, a processor and a computer program stored on said memory and executable on said processor, said computer program implementing the steps of the audio transmission control method as defined in any one of the above when executed by said processor.

The present invention also proposes a computer-readable storage medium having stored thereon an audio transmission control program which, when executed by a processor, implements the steps of the audio transmission control method as set forth in any one of the preceding claims.

The audio transmission control method, device, and computer-readable storage medium of the present invention create a collection queue for recording received audio data; acquire the audio data to be played from the collection queue; remove the extracted audio data from the collection queue when the audio data to be played is extracted; and, when the audio transmission delay in the current state, as determined from the real-time capacity of the collection queue, exceeds a preset value, reduce the amount of audio data received into the collection queue so as to reduce the audio transmission delay. This provides a user-friendly audio transmission control scheme: delay that may arise during screen recording or screen casting is detected promptly, its degree is judged accurately, and it is handled without the user perceiving it, which improves the user experience.

Drawings

The invention will be further described with reference to the accompanying drawings and examples, in which:

fig. 1 is a schematic diagram of a hardware structure of a mobile terminal according to the present invention;

fig. 2 is a schematic diagram of a communication network system according to an embodiment of the present invention;

fig. 3 is a flowchart of a first embodiment of the audio transmission control method of the present invention;

fig. 4 is a flowchart of a second embodiment of the audio transmission control method of the present invention;

fig. 5 is a flowchart of a third embodiment of the audio transmission control method of the present invention;

fig. 6 is a flowchart of a fourth embodiment of the audio transmission control method of the present invention;

fig. 7 is a flowchart of a fifth embodiment of the audio transmission control method of the present invention;

fig. 8 is a flowchart of a sixth embodiment of the audio transmission control method of the present invention;

fig. 9 is a flowchart of a seventh embodiment of the audio transmission control method of the present invention;

fig. 10 is a flowchart of an eighth embodiment of the audio transmission control method of the present invention.

Detailed Description

It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.

In the following description, suffixes such as "module", "component", or "unit" for representing elements are used only for facilitating the description of the present invention, and have no specific meaning per se. Thus, "module," "component," or "unit" may be used in combination.

The terminal may be implemented in various forms. For example, the terminals described in the present invention may include mobile terminals such as cell phones, tablet computers, notebook computers, palm computers, personal digital assistants (Personal Digital Assistant, PDA), portable media players (Portable Media Player, PMP), navigation devices, wearable devices, smart bracelets, pedometers, and fixed terminals such as digital TVs, desktop computers, and the like.

The following description takes a mobile terminal as an example. Those skilled in the art will understand that, apart from elements used specifically for mobile purposes, the configuration according to the embodiments of the present invention can also be applied to fixed terminals.

Referring to fig. 1, which is a schematic diagram of the hardware structure of a mobile terminal implementing various embodiments of the present invention, the mobile terminal 100 may include: an RF (Radio Frequency) unit 101, a WiFi module 102, an audio output unit 103, an A/V (audio/video) input unit 104, a sensor 105, a display unit 106, a user input unit 107, an interface unit 108, a memory 109, a processor 110, and a power supply 111. Those skilled in the art will appreciate that the structure shown in fig. 1 does not limit the mobile terminal, which may include more or fewer components than shown, combine certain components, or arrange the components differently.

The following describes the components of the mobile terminal in detail with reference to fig. 1:

The radio frequency unit 101 may be used to receive and transmit signals while information is being received or a call is in progress. Specifically, it receives downlink information from a base station and passes it to the processor 110 for processing, and it transmits uplink data to the base station. Typically, the radio frequency unit 101 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low noise amplifier, a duplexer, and the like. In addition, the radio frequency unit 101 may also communicate with networks and other devices via wireless communication. The wireless communication may use any communication standard or protocol, including but not limited to GSM (Global System for Mobile Communications), GPRS (General Packet Radio Service), CDMA2000 (Code Division Multiple Access 2000), WCDMA (Wideband Code Division Multiple Access), TD-SCDMA (Time Division-Synchronous Code Division Multiple Access), FDD-LTE (Frequency Division Duplexing-Long Term Evolution), and TDD-LTE (Time Division Duplexing-Long Term Evolution).

WiFi is a short-range wireless transmission technology. Through the WiFi module 102, the mobile terminal can help a user send and receive e-mail, browse web pages, access streaming media, and the like, providing the user with wireless broadband Internet access. Although fig. 1 shows a WiFi module 102, it is not an essential component of the mobile terminal and can be omitted as required without changing the essence of the invention.

The audio output unit 103 may convert audio data received by the radio frequency unit 101 or the WiFi module 102 or stored in the memory 109 into an audio signal and output as sound when the mobile terminal 100 is in a call signal reception mode, a talk mode, a recording mode, a voice recognition mode, a broadcast reception mode, or the like. Also, the audio output unit 103 may also provide audio output (e.g., a call signal reception sound, a message reception sound, etc.) related to a specific function performed by the mobile terminal 100. The audio output unit 103 may include a speaker, a buzzer, and the like.

The A/V input unit 104 is used to receive an audio or video signal. The A/V input unit 104 may include a graphics processor (Graphics Processing Unit, GPU) 1041 and a microphone 1042. The graphics processor 1041 processes image data of still pictures or video obtained by an image capturing device (e.g., a camera) in a video capturing mode or an image capturing mode. The processed image frames may be displayed on the display unit 106. The image frames processed by the graphics processor 1041 may be stored in the memory 109 (or other storage medium) or transmitted via the radio frequency unit 101 or the WiFi module 102. The microphone 1042 can receive sound in a phone call mode, a recording mode, a voice recognition mode, and the like, and can process such sound into audio data. In the phone call mode, the processed audio (voice) data may be converted into a format that can be transmitted to a mobile communication base station via the radio frequency unit 101 and then output. The microphone 1042 may implement various types of noise cancellation (or suppression) algorithms to cancel (or suppress) noise or interference generated while receiving and transmitting audio signals.

The mobile terminal 100 also includes at least one sensor 105, such as a light sensor, a motion sensor, and other sensors. Specifically, the light sensor includes an ambient light sensor and a proximity sensor: the ambient light sensor can adjust the brightness of the display panel 1061 according to the brightness of the ambient light, and the proximity sensor can turn off the display panel 1061 and/or the backlight when the mobile terminal 100 is moved to the ear. As one kind of motion sensor, the accelerometer can detect the magnitude of acceleration in all directions (generally three axes) and, when stationary, the magnitude and direction of gravity. It can be used in applications that recognize the attitude of the mobile phone (such as switching between landscape and portrait, related games, and magnetometer attitude calibration) and in vibration-recognition functions (such as a pedometer and tap detection). Other sensors that may also be configured in the mobile phone, such as fingerprint sensors, pressure sensors, iris sensors, molecular sensors, gyroscopes, barometers, hygrometers, thermometers, and infrared sensors, are not described in detail here.

The display unit 106 is used to display information input by a user or information provided to the user. The display unit 106 may include a display panel 1061, and the display panel 1061 may be configured in the form of a liquid crystal display (Liquid Crystal Display, LCD), an Organic Light-Emitting Diode (OLED), or the like.

The user input unit 107 may be used to receive input numeric or character information and to generate key signal inputs related to user settings and function control of the mobile terminal. In particular, the user input unit 107 may include a touch panel 1071 and other input devices 1072. The touch panel 1071, also referred to as a touch screen, may collect touch operations by a user on or near it (e.g., operations performed on or near the touch panel 1071 with a finger, a stylus, or any other suitable object or accessory) and drive the corresponding connection device according to a predetermined program. The touch panel 1071 may include two parts: a touch detection device and a touch controller. The touch detection device detects the position of the user's touch, detects the signal produced by the touch operation, and transmits the signal to the touch controller. The touch controller receives touch information from the touch detection device, converts it into touch point coordinates, sends the coordinates to the processor 110, and can receive and execute commands sent by the processor 110. Further, the touch panel 1071 may be implemented in various types, such as resistive, capacitive, infrared, and surface acoustic wave. The user input unit 107 may include other input devices 1072 in addition to the touch panel 1071. In particular, the other input devices 1072 may include, but are not limited to, one or more of a physical keyboard, function keys (e.g., volume control keys, switch keys, etc.), a trackball, a mouse, and a joystick, which are not specifically limited herein.

Further, the touch panel 1071 may overlay the display panel 1061. When the touch panel 1071 detects a touch operation on or near it, the operation is passed to the processor 110 to determine the type of touch event, and the processor 110 then provides a corresponding visual output on the display panel 1061 according to the type of touch event. Although in fig. 1 the touch panel 1071 and the display panel 1061 are two independent components implementing the input and output functions of the mobile terminal, in some embodiments the touch panel 1071 may be integrated with the display panel 1061 to implement those functions, which is not limited herein.

The interface unit 108 serves as an interface through which at least one external device can be connected with the mobile terminal 100. For example, the external devices may include a wired or wireless headset port, an external power (or battery charger) port, a wired or wireless data port, a memory card port, a port for connecting a device having an identification module, an audio input/output (I/O) port, a video I/O port, an earphone port, and the like. The interface unit 108 may be used to receive input (e.g., data information, power, etc.) from an external device and transmit the received input to one or more elements within the mobile terminal 100 or may be used to transmit data between the mobile terminal 100 and an external device.

Memory 109 may be used to store software programs as well as various data. The memory 109 may mainly include a storage program area that may store an operating system, application programs required for at least one function (such as a sound playing function, an image playing function, etc.), and a storage data area; the storage data area may store data (such as audio data, phonebook, etc.) created according to the use of the handset, etc. In addition, memory 109 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid-state storage device.

The processor 110 is a control center of the mobile terminal, connects various parts of the entire mobile terminal using various interfaces and lines, and performs various functions of the mobile terminal and processes data by running or executing software programs and/or modules stored in the memory 109 and calling data stored in the memory 109, thereby performing overall monitoring of the mobile terminal. Processor 110 may include one or more processing units; preferably, the processor 110 may integrate an application processor that primarily handles operating systems, user interfaces, applications, etc., with a modem processor that primarily handles wireless communications. It will be appreciated that the modem processor described above may not be integrated into the processor 110.

The mobile terminal 100 may further include a power source 111 (e.g., a battery) for supplying power to the respective components. Preferably, the power source 111 is logically connected to the processor 110 through a power management system, which manages charging, discharging, and power consumption.

Although not shown in fig. 1, the mobile terminal 100 may further include a bluetooth module or the like, which is not described herein.

In order to facilitate understanding of the embodiments of the present invention, a communication network system on which the mobile terminal of the present invention is based will be described below.

Referring to fig. 2, which is a schematic diagram of a communication network system according to an embodiment of the present invention, the communication network system is an LTE system of universal mobile communication technology. The LTE system includes, communicatively connected in sequence, a UE (User Equipment) 201, an E-UTRAN (Evolved UMTS Terrestrial Radio Access Network) 202, an EPC (Evolved Packet Core) 203, and an operator's IP services 204.

Specifically, the UE 201 may be the terminal 100 described above, and is not described again here.

The E-UTRAN 202 includes an eNodeB 2021, other eNodeBs 2022, and so on. The eNodeB 2021 may be connected with the other eNodeBs 2022 by a backhaul (e.g., an X2 interface), the eNodeB 2021 is connected to the EPC 203, and the eNodeB 2021 may provide access from the UE 201 to the EPC 203.

The EPC 203 may include an MME (Mobility Management Entity) 2031, an HSS (Home Subscriber Server) 2032, other MMEs 2033, an SGW (Serving Gateway) 2034, a PGW (PDN Gateway) 2035, a PCRF (Policy and Charging Rules Function) 2036, and so on. The MME 2031 is a control node that handles signaling between the UE 201 and the EPC 203, providing bearer and connection management. The HSS 2032 provides registers, such as a home location register (not shown), and holds user-specific information about service characteristics, data rates, and the like. All user data may be sent through the SGW 2034. The PGW 2035 may provide IP address allocation and other functions for the UE 201. The PCRF 2036 is the policy and charging control decision point for service data flows and IP bearer resources; it selects and provides available policy and charging control decisions for a policy and charging enforcement function (not shown).

The IP services 204 may include the Internet, intranets, an IMS (IP Multimedia Subsystem), or other IP services.

Although the LTE system is described above as an example, it should be understood by those skilled in the art that the present invention is not limited to LTE systems, but may be applied to other wireless communication systems, such as GSM, CDMA2000, WCDMA, TD-SCDMA, and future new network systems.

Based on the above mobile terminal hardware structure and the communication network system, various embodiments of the method of the present invention are provided.

Example 1

Fig. 3 is a flowchart of a first embodiment of the audio transmission control method of the present invention. An audio transmission control method, the method comprising:

S1, creating a collection queue for recording received audio data;

S2, acquiring the audio data to be played from the collection queue;

S3, when the audio data to be played is extracted, removing the extracted audio data from the collection queue;

S4, when the audio transmission delay in the current state, as determined from the real-time capacity of the collection queue, exceeds a preset value, reducing the amount of audio data received into the collection queue so as to reduce the audio transmission delay.
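
Steps S1 to S4 can be illustrated with a small single-threaded sketch. The class name `CollectionQueue`, its method names, and the policy of refusing every new frame while the queue is over its threshold are assumptions made for illustration; the text only requires that the amount received be reduced.

```python
from collections import deque


class CollectionQueue:
    """Toy model of steps S1-S4 (names and drop policy are assumptions)."""

    def __init__(self, threshold):
        self._frames = deque()      # S1: queue recording received audio data
        self.threshold = threshold  # preset capacity value

    def delay_exceeded(self):
        # S4 (detection): real-time capacity compared with the preset value
        return len(self._frames) > self.threshold

    def receive(self, frame):
        # S4 (mitigation): admit fewer frames while the delay is too large
        if self.delay_exceeded():
            return False
        self._frames.append(frame)
        return True

    def next_to_play(self):
        # S2 + S3: take the next frame to play and remove it from the queue
        return self._frames.popleft() if self._frames else None
```

With `threshold=2`, the fourth consecutive `receive` call is refused until `next_to_play` drains the queue back to the threshold.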

Optionally, in this embodiment, it is considered that existing screen casting or screen recording schemes may suffer from delay in the control of audio transmission. Specifically, during screen casting or recording, audio playback may fall out of step with the transmission of the original audio, and the delay accumulates as the session grows longer. That is, when screen casting or recording has just started, the user is unlikely to perceive the delay, whereas after a long session the accumulated delay becomes increasingly obvious. This embodiment is therefore directed at detecting and resolving the delay in time.

Optionally, in this embodiment, the transmission-and-decoding part and the consumption part of the audio path are split into separate threads, introducing the producer-consumer idea. The state of the audio producer's queue and the state of the audio consumer are monitored in real time. When it is determined that a clearly perceptible delay may occur, low-frequency frame detection and identification is performed on the PCM (Pulse Code Modulation) data, and audio frames that the user cannot perceive are dynamically discarded, so that perceptible delay is avoided.

Optionally, in this embodiment, as in the example above, the degree of audio delay is difficult to determine, and a judgment that relies on a person's subjective perception is prone to error, particularly when the delay is not very obvious. This embodiment therefore needs to detect such audio delay promptly, that is, to determine a likely transmission delay quantitatively.

Optionally, in this embodiment, as in the example above, the whole execution flow is split according to the producer-consumer model. After network data is received, the audio is decoded into PCM data. This part runs in its own thread and is called the producer part: it produces playable PCM data and puts it into a collection queue, List.

Optionally, in this embodiment, as in the example above, the part that plays the PCM data is called the consumer part. It also runs in its own thread, which is responsible only for obtaining the corresponding PCM data from the List and then playing it through the standard audio playback facilities of Qt (a cross-platform application framework).
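
The producer and consumer parts described above can be sketched as two threads sharing one list. This is a minimal illustration and not the embodiment's implementation: `run_pipeline`, `decode`, and `play` are hypothetical names, and `play` stands in for the Qt playback call, which is not shown in the text.

```python
import threading
from collections import deque


def run_pipeline(packets, decode, play):
    """Run a producer thread and a consumer thread over a shared PCM list."""
    pcm_list = deque()               # the collection queue ("List")
    cond = threading.Condition()
    done = False

    def producer():
        nonlocal done
        for packet in packets:
            frame = decode(packet)   # network data -> PCM data
            with cond:
                pcm_list.append(frame)
                cond.notify()
        with cond:
            done = True              # no more data will arrive
            cond.notify()

    def consumer():
        while True:
            with cond:
                while not pcm_list and not done:
                    cond.wait()
                if not pcm_list:     # empty and producer finished
                    return
                frame = pcm_list.popleft()
            play(frame)              # play outside the lock

    threads = [threading.Thread(target=producer),
               threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

Calling `play` outside the lock lets the producer keep appending while a frame is being played, which is what allows the List to grow when playback falls behind.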

Optionally, in this embodiment, as described above, because the List is dynamic, whether the audio is likely to be delayed can be determined by checking the capacity of the List.

Optionally, in this embodiment, as described above, the current audio delay can be inferred from the size of the List. To quantify this determination, a threshold T may be set for the size of the List. If the size of the List is greater than T, it is determined that a noticeable audio delay may be about to occur; in that case, the subsequent amount of audio data received into the collection queue must be reduced so as to reduce the possible audio transmission delay.
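
The text does not say how the threshold T is chosen. One plausible reading, assuming every queued frame holds a fixed duration of audio, is that the buffered delay is roughly the List size multiplied by the frame duration, so T can be derived from a delay budget. The 20 ms frame duration and the function names below are assumptions for illustration only.

```python
def buffered_delay_ms(list_size, frame_ms=20.0):
    """Estimated delay represented by the frames waiting in the List."""
    return list_size * frame_ms


def threshold_for_delay(max_delay_ms, frame_ms=20.0):
    """Largest List size whose estimated delay stays within the budget."""
    return int(max_delay_ms // frame_ms)


def delay_exceeded(list_size, threshold_t):
    """The comparison from the text: size of the List greater than T."""
    return list_size > threshold_t
```

Under these assumptions, a 200 ms budget gives T = 10, so an eleventh queued frame (about 220 ms of audio) triggers the reduction.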

The advantage of this embodiment is that a collection queue for recording received audio data is created; the audio data to be played is then acquired from the collection queue; when the audio data to be played is extracted, the extracted audio data is removed from the collection queue; and when the audio transmission delay in the current state, as determined from the real-time capacity of the collection queue, exceeds a preset value, the amount of audio data received into the collection queue is reduced so as to reduce the audio transmission delay. This provides a user-friendly audio transmission control scheme: delay that may arise during screen recording or screen casting is detected promptly, its degree is judged accurately, and it is handled without the user perceiving it, which improves the user experience.

Example 2

Fig. 4 is a flowchart of a second embodiment of the audio transmission control method according to the present invention. Based on the above embodiment, creating the collection queue for recording received audio data includes:

S11, generating a first thread for receiving the audio data transmitted over the network and decoding the audio data;

S12, creating the collection queue by the first thread, and recording, through the collection queue, the audio data that has been received and decoded in the current state but not yet played.

Optionally, in this embodiment, a first thread is generated for receiving the audio data transmitted over the network and decoding the audio data, where the audio data may also be system-layer audio data received during screen recording or screen casting.

Optionally, in this embodiment, the first thread creates the collection queue and records, through the collection queue, the audio data that has been received and decoded in the current state but not yet played. The collection queue records the audio data as data segments or data frames, so that the capacity of the queue can later be determined from the number of data segments or data frames.

Optionally, in this embodiment, when the system has multiple screen recording or screen casting processes, a corresponding collection queue is created for each process, so that delay determination and delay reduction are performed separately for each.
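
Keeping one collection queue per screen recording or screen casting process might be sketched as a small registry keyed by a session identifier. `QueueRegistry`, `session_id`, and the shared threshold are hypothetical; the text does not say how processes are identified or whether thresholds differ between them.

```python
from collections import defaultdict, deque


class QueueRegistry:
    """One collection queue per recording or casting session (assumed design)."""

    def __init__(self, threshold):
        self.threshold = threshold
        self._queues = defaultdict(deque)  # session_id -> queue of frames

    def push(self, session_id, frame):
        self._queues[session_id].append(frame)

    def pop(self, session_id):
        queue = self._queues[session_id]
        return queue.popleft() if queue else None

    def capacity(self, session_id):
        # Capacity is counted in frames (or segments), as in this embodiment
        return len(self._queues[session_id])

    def delayed_sessions(self):
        # Delay is judged separately for every session
        return sorted(sid for sid, q in self._queues.items()
                      if len(q) > self.threshold)
```

Because each session has its own queue, a backlog in one casting session does not cause frames to be dropped in another.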

The advantage of this embodiment is that a first thread is generated for receiving the audio data transmitted over the network and decoding the audio data, and the first thread then creates the collection queue and records, through the collection queue, the audio data that has been received and decoded in the current state but not yet played. This provides the basis for creating the collection queue in a user-friendly audio transmission control scheme, so that delay that may arise during screen recording or screen casting is detected promptly, its degree is judged accurately, and it is handled without the user perceiving it, which improves the user experience.

Example 3

Fig. 5 is a flowchart of a third embodiment of the audio transmission control method according to the present invention. Based on the above embodiment, acquiring the audio data to be played from the collection queue includes:

S21, generating a second thread for monitoring the playing state of the audio data;

S22, acquiring the audio data to be played from the collection queue through the second thread.

Optionally, in this embodiment, during the recording or the projection, the audio data is played in a same step, so in order to identify whether there is a delay in the played audio data, this embodiment generates a second thread for monitoring the playing state of the audio data;

optionally, in this embodiment, during the process that the audio data is played synchronously, the audio data to be played is obtained in real time from the collection queue through the second thread.

The method has the advantages that the second thread for monitoring the playing state of the audio data is generated; and then, acquiring the audio data to be played in the set queue through the second thread. The method provides a reading scheme of an aggregate queue for realizing a humanized audio transmission control scheme, so that possible time delay can be found in time in the screen recording or screen throwing process, and the degree of the time delay can be accurately judged, and a non-perception time delay judging and processing scheme is provided, so that the user experience is enhanced.
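As a rough illustration of steps S21 and S22, the Python sketch below starts a second thread that acquires frames from a queue as they become available. The names `playback_loop` and `played` are hypothetical, and appending to a list stands in for handing a frame to the audio output. Note one simplification: `queue.Queue.get()` removes the item as it returns it, whereas in this document removal from the set queue is performed by the first thread.

```python
import queue
import threading


def playback_loop(set_queue, played):
    # Body of the second thread: wait for the next frame and "play" it.
    while True:
        frame = set_queue.get()
        if frame is None:          # sentinel: no more audio will arrive
            break
        played.append(frame)       # stand-in for sending to the audio output


set_queue = queue.Queue()
played = []

second_thread = threading.Thread(target=playback_loop, args=(set_queue, played))
second_thread.start()

for frame in (b"a", b"b", b"c"):   # frames as the first thread would record them
    set_queue.put(frame)
set_queue.put(None)

second_thread.join()
print(played)  # [b'a', b'b', b'c']
```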

Example four

Fig. 6 is a flowchart of a fourth embodiment of the audio transmission control method according to the present invention. Based on the above embodiment, removing the extracted audio data from the set queue when the audio data to be played is extracted includes:

S31, extracting the audio data to be played from the set queue through the second thread;

S32, removing the audio data to be played from the set queue through the first thread.

Optionally, in this embodiment, the received audio data is cached in the set queue; the second thread then extracts the audio data to be played from the set queue, and the first thread removes that audio data from the set queue.

Alternatively, in this embodiment, the set queue may take another recording form: the audio data is marked in segments, the marks are recorded by the first thread, and the audio data itself is buffered by the system. The second thread then extracts the mark of the audio data to be played from the set queue, and the first thread removes that mark from the set queue; that is, the capacity of the queue is determined by the number of marks.

The beneficial effect of this embodiment is as follows: the audio data to be played is extracted from the set queue through the second thread, and is then removed from the set queue through the first thread. This provides the dynamic monitoring scheme for the set queue in a user-friendly audio transmission control scheme, so that possible delay can be detected promptly during screen recording or screen projection and its degree accurately judged, providing delay determination and handling that is imperceptible to the user and enhancing the user experience.
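The mark-based variant can be sketched as follows in Python. This is a simplified illustration, not the implementation described in the patent: `MarkQueue` and its methods are hypothetical names, marks are plain integers, and the calls are made sequentially here rather than from two running threads (the lock only shows where synchronization would be needed).

```python
import threading
from collections import deque


class MarkQueue:
    """Set queue that records segment marks; the audio is buffered elsewhere."""

    def __init__(self):
        self._marks = deque()
        self._extracted = set()
        self._lock = threading.Lock()

    def record(self, mark):
        # Called by the first thread when a segment has been decoded.
        with self._lock:
            self._marks.append(mark)

    def extract_next(self):
        # Called by the second thread: return the oldest mark not yet handed
        # to the player, without removing it from the queue.
        with self._lock:
            for mark in self._marks:
                if mark not in self._extracted:
                    self._extracted.add(mark)
                    return mark
            return None

    def remove_extracted(self):
        # Called by the first thread: remove marks that were already extracted.
        with self._lock:
            removed = 0
            while self._marks and self._marks[0] in self._extracted:
                self._extracted.discard(self._marks.popleft())
                removed += 1
            return removed

    def capacity(self):
        # Capacity is the number of marks still recorded.
        with self._lock:
            return len(self._marks)


audio_buffer = {mark: b"\x00\x00" * 4 for mark in range(5)}  # held by the system
mark_queue = MarkQueue()
for mark in audio_buffer:
    mark_queue.record(mark)

played = [mark_queue.extract_next(), mark_queue.extract_next()]
removed = mark_queue.remove_extracted()
print(played, removed, mark_queue.capacity())  # [0, 1] 2 3
```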

Example five

Fig. 7 is a flowchart of a fifth embodiment of the audio transmission control method according to the present invention. Based on the above embodiment, when the delay of the audio transmission in the current state, obtained according to the real-time capacity of the set queue, exceeds a preset value, reducing the receiving amount of the audio data in the set queue to reduce the delay of the audio transmission includes:

S41, presetting a preset capacity value corresponding to the set queue for monitoring the delay;

S42, acquiring the real-time capacity of the set queue, and determining the size relation between the real-time capacity and the preset capacity value.

Optionally, in this embodiment, when the set queue is used to cache the audio data as described in the above example, a preset capacity value corresponding to the set queue is set in advance for monitoring the delay, where the preset capacity value is a data capacity value.

Optionally, in this embodiment, when the set queue records marks of the buffered audio data as described in the above example, a preset capacity value corresponding to the set queue is set in advance for monitoring the delay, where the preset capacity value is a count of marks.

Optionally, in this embodiment, to improve detection accuracy, a lower preset capacity value is used when the user is in a delay-sensitive scenario with stricter delay requirements, such as synchronized audio and video playback, so that the subsequent delay reduction operation is triggered in time.

The beneficial effect of this embodiment is as follows: a preset capacity value corresponding to the set queue is set in advance for monitoring the delay; the real-time capacity of the set queue is then acquired, and the size relation between the real-time capacity and the preset capacity value is determined. This provides the capacity judgment scheme for the set queue in a user-friendly audio transmission control scheme, so that possible delay can be detected promptly during screen recording or screen projection and its degree accurately judged, providing delay determination and handling that is imperceptible to the user and enhancing the user experience.
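Steps S41 and S42 might look like the Python sketch below. All numbers are assumptions chosen for illustration: the preset capacity values (5 and 20), the scenario names, and the 20 ms frame duration used to convert a frame count into an approximate playback backlog are not taken from the text.

```python
# Hypothetical preset capacity values, counted in frames (or marks).
PRESET_CAPACITY = {"av_sync": 5, "default": 20}
FRAME_MS = 20  # assumed duration of one queued frame, in milliseconds


def preset_capacity_for(scenario):
    # S41: delay-sensitive scenarios are given a lower preset capacity value.
    return PRESET_CAPACITY.get(scenario, PRESET_CAPACITY["default"])


def compare_capacity(real_time_capacity, preset_capacity):
    # S42: -1, 0 or 1 when the real-time capacity is below, equal to,
    # or above the preset capacity value.
    return ((real_time_capacity > preset_capacity)
            - (real_time_capacity < preset_capacity))


def backlog_ms(real_time_capacity, frame_ms=FRAME_MS):
    # Rough playback backlog implied by the frames still in the queue.
    return real_time_capacity * frame_ms


print(compare_capacity(12, preset_capacity_for("av_sync")))  # 1
print(compare_capacity(12, preset_capacity_for("default")))  # -1
print(backlog_ms(12))                                        # 240
```

The same queue length of 12 frames is above the preset value in the audio-video synchronization scenario but below it in the default scenario, which is the effect the lower preset value is meant to achieve.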

Example six

Fig. 8 is a flowchart of a sixth embodiment of the audio transmission control method according to the present invention. Based on the above embodiment, when the delay of the audio transmission in the current state, obtained according to the real-time capacity of the set queue, exceeds a preset value, reducing the receiving amount of the audio data in the set queue to reduce the delay of the audio transmission further includes:

S43, if the real-time capacity is smaller than the preset capacity value, determining that the delay does not exceed the preset value;

S44, if the real-time capacity is larger than the preset capacity value, determining that the delay exceeds the preset value.

Optionally, in this embodiment, a plurality of different preset capacity values are set, and when the real-time capacity is greater than the lowest preset capacity value, the receiving amount of the next audio data is acquired.

Optionally, in this embodiment, when the acquired receiving amount of the next audio data is lower than a preset value, it is determined whether the real-time capacity is greater than the next-lowest preset capacity value.

Optionally, in this embodiment, if the real-time capacity is greater than the next-lowest preset capacity value, it is determined that the delay has exceeded the preset value.

The beneficial effect of this embodiment is as follows: if the real-time capacity is smaller than the preset capacity value, it is determined that the delay does not exceed the preset value; if the real-time capacity is larger than the preset capacity value, it is determined that the delay exceeds the preset value. This provides the delay judgment scheme based on the set queue in a user-friendly audio transmission control scheme, so that possible delay can be detected promptly during screen recording or screen projection and its degree accurately judged, providing delay determination and handling that is imperceptible to the user and enhancing the user experience.
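The decision in S43 and S44, together with the optional multi-level check, can be sketched in Python as follows. Several points are assumptions rather than statements from the text: the function and parameter names are hypothetical, a real-time capacity equal to the preset capacity value is treated as not exceeding it, and the case where the next receiving amount is not below its preset value is treated as "delay not yet confirmed".

```python
def exceeds_preset(real_time_capacity, preset_capacity):
    # S43 / S44: the delay is considered to exceed the preset value only when
    # the real-time capacity is larger than the preset capacity value.
    # (The equal case is not specified; it is treated as "not exceeded" here.)
    return real_time_capacity > preset_capacity


def staged_delay_check(real_time_capacity, preset_capacities,
                       next_receive_amount, receive_floor):
    # Requires at least two preset capacity values.
    lowest, next_lowest = sorted(preset_capacities)[:2]

    # Stage 1: compare against the lowest preset capacity value.
    if not exceeds_preset(real_time_capacity, lowest):
        return False

    # Stage 2: only continue when the next receiving amount is below its
    # preset value. The opposite branch is an assumption of this sketch.
    if next_receive_amount >= receive_floor:
        return False

    # Stage 3: compare against the next-lowest preset capacity value.
    return exceeds_preset(real_time_capacity, next_lowest)


presets = [5, 10, 20]  # hypothetical preset capacity values
print(staged_delay_check(12, presets, next_receive_amount=2, receive_floor=4))  # True
print(staged_delay_check(8, presets, next_receive_amount=2, receive_floor=4))   # False
print(staged_delay_check(4, presets, next_receive_amount=2, receive_floor=4))   # False
```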

Example seven

Fig. 9 is a flowchart of a seventh embodiment of the audio transmission control method according to the present invention. Based on the above embodiment, when the delay of the audio transmission in the current state, obtained according to the real-time capacity of the set queue, exceeds a preset value, reducing the receiving amount of the audio data in the set queue to reduce the delay of the audio transmission further includes:

S45, presetting a frame skip condition for processing the received audio data;

S46, when the delay is determined to exceed the preset value, reducing the subsequent receiving amount of the audio data in the set queue according to the frame skip condition, so as to reduce the delay of the audio transmission.

Optionally, in this embodiment, an interval frame-skipping scheme is applied to the playback of the current audio data. Specifically, one audio frame is discarded every several frames: the discarded frame is not sent to the Qt sound playback thread, and the next frame is played directly. This minimizes the obvious breaks in the sound that dropping a large run of consecutive frames could cause.

Optionally, in this embodiment, the interval between discarded audio frames is determined according to the range in which the capacity difference used in the delay determination falls.

Optionally, in this embodiment, if the capacity difference used in the delay determination falls in a higher range, the interval between discarded audio frames is correspondingly reduced.

Optionally, in this embodiment, if the capacity difference used in the delay determination falls in a lower range, the interval between discarded audio frames is correspondingly enlarged.

The beneficial effect of this embodiment is as follows: a frame skip condition for processing the received audio data is preset; then, when the delay is determined to exceed the preset value, the subsequent receiving amount of the audio data in the set queue is reduced according to the frame skip condition, so as to reduce the delay of the audio transmission. This provides the frame-dropping scheme applied after delay determination in a user-friendly audio transmission control scheme, so that possible delay can be detected promptly during screen recording or screen projection and its degree accurately judged, providing delay determination and handling that is imperceptible to the user and enhancing the user experience.
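A minimal Python sketch of interval frame skipping is given below. The range boundaries (10 and 20) and the resulting intervals (10, 5 and 3) are hypothetical values, and the capacity difference is assumed to be the real-time capacity minus the preset capacity value, which the text does not state explicitly.

```python
def skip_interval(capacity_difference):
    # A larger capacity difference (assumed: real-time capacity minus preset
    # capacity value) gives a smaller interval, so frames are dropped more often.
    if capacity_difference >= 20:
        return 3
    if capacity_difference >= 10:
        return 5
    return 10


def frames_to_play(frames, interval):
    # Discard every `interval`-th frame and forward the rest, so dropped
    # frames are spread out instead of forming one long gap.
    if interval < 2:
        raise ValueError("interval must be at least 2, or every frame is dropped")
    return [frame for index, frame in enumerate(frames, start=1)
            if index % interval != 0]


frames = list(range(1, 11))                      # ten frames numbered 1..10
print(frames_to_play(frames, skip_interval(12))) # [1, 2, 3, 4, 6, 7, 8, 9]
```

With an interval of 5, frames 5 and 10 are discarded and the eight remaining frames are forwarded in their original order.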

Example eight

Fig. 10 is a flowchart of an eighth embodiment of the audio transmission control method according to the present invention. Based on the above embodiment, when the delay of the audio transmission in the current state, obtained according to the real-time capacity of the set queue, exceeds a preset value, reducing the receiving amount of the audio data in the set queue to reduce the delay of the audio transmission further includes:

S47, presetting an amplitude condition for screening the audio data according to amplitude;

S48, when the delay is determined to exceed the preset value, detecting and deleting the received audio data that meets the amplitude condition, so as to reduce the subsequent receiving amount of the audio data in the set queue and reduce the delay of the audio transmission.

Optionally, in this embodiment, the sound amplitude is calculated from the PCM data decoded by the decoding thread. If the amplitude is found to be 0, the audio is clearly silent, so it does not need to be added to the List for the consumer to consume.

Optionally, in this embodiment, the amplitude condition is determined according to the content of the current audio data. For example, when audio-sensitive content such as in-game voice is being played, a lower amplitude threshold is used as the discard condition, which reduces the amount discarded and avoids erroneous discards.

The beneficial effect of this embodiment is as follows: an amplitude condition for screening the audio data according to amplitude is preset; then, when the delay is determined to exceed the preset value, the received audio data that meets the amplitude condition is detected and deleted, so as to reduce the subsequent receiving amount of the audio data in the set queue and reduce the delay of the audio transmission. This provides the amplitude-based discarding scheme applied after delay determination in a user-friendly audio transmission control scheme, so that possible delay can be detected promptly during screen recording or screen projection and its degree accurately judged, providing delay determination and handling that is imperceptible to the user and enhancing the user experience.
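The amplitude screening can be illustrated with the Python sketch below. It assumes the decoded PCM is 16-bit signed little-endian and uses the peak sample value as the amplitude; the text does not specify the sample format or how the amplitude is computed, and the names `peak_amplitude` and `drop_quiet_frames` are hypothetical.

```python
import struct


def peak_amplitude(pcm_bytes):
    # Largest absolute sample value in a frame of 16-bit little-endian PCM.
    # The byte length must be a multiple of 2.
    return max((abs(sample) for (sample,) in struct.iter_unpack("<h", pcm_bytes)),
               default=0)


def drop_quiet_frames(frames, threshold=0):
    # Keep only frames whose peak amplitude is above the threshold.
    # With threshold 0, only completely silent frames are discarded.
    return [frame for frame in frames if peak_amplitude(frame) > threshold]


silent = struct.pack("<4h", 0, 0, 0, 0)
quiet = struct.pack("<4h", 3, -2, 1, 0)
loud = struct.pack("<4h", 1200, -900, 400, 0)

print(len(drop_quiet_frames([silent, quiet, loud])))               # 2
print(len(drop_quiet_frames([silent, quiet, loud], threshold=5)))  # 1
```

A lower threshold discards fewer frames, which matches the idea of using a lower amplitude threshold for audio-sensitive content.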

Example nine

Based on the above embodiments, the present invention also proposes an audio transmission control device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, which computer program, when executed by the processor, implements the steps of the audio transmission control method according to any one of the above.

It should be noted that the above device embodiments and method embodiments belong to the same concept, the specific implementation process of the device embodiments is detailed in the method embodiments, and technical features in the method embodiments are correspondingly applicable to the device embodiments, which are not repeated herein.

Example ten

Based on the above embodiments, the present invention also proposes a computer-readable storage medium having stored thereon an audio transmission control program which, when executed by a processor, implements the steps of the audio transmission control method as set forth in any one of the above.

It should be noted that the medium embodiment and the method embodiments belong to the same concept; the specific implementation process of the medium embodiment is detailed in the method embodiments, and the technical features in the method embodiments are correspondingly applicable to the medium embodiment, which is not repeated herein.

It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.

The foregoing embodiment numbers of the present invention are merely for the purpose of description, and do not represent the advantages or disadvantages of the embodiments.

From the above description of the embodiments, it will be clear to those skilled in the art that the above-described embodiment method may be implemented by means of software plus a necessary general hardware platform, but of course may also be implemented by means of hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art in the form of a software product stored in a storage medium (e.g. ROM/RAM, magnetic disk, optical disk) comprising instructions for causing a terminal (which may be a mobile phone, a computer, a server, an air conditioner, or a network device, etc.) to perform the method according to the embodiments of the present invention.

The embodiments of the present invention have been described above with reference to the accompanying drawings, but the present invention is not limited to the above-described embodiments, which are merely illustrative and not restrictive, and many forms may be made by those having ordinary skill in the art without departing from the spirit of the present invention and the scope of the claims, which are to be protected by the present invention.

Claims

1. An audio transmission control method, the method comprising:

creating a collection queue for recording received audio data;

acquiring audio data to be played in the set queue;

when the time delay of the audio transmission in the current state is obtained according to the real-time capacity of the set queue and exceeds a preset value, reducing the receiving amount of the audio data in the set queue so as to slow down the time delay of the audio transmission;

the creating a collection queue for recording received audio data includes:

creating the set queue by the first thread, and recording the audio data which is received in the current state, decoded and not yet played through the set queue;

the obtaining the audio data to be played in the collection queue includes:

generating a second thread for monitoring the playing state of the audio data;

acquiring the audio data to be played from the set queue through the second thread;

when the audio data to be played is extracted, rejecting the extracted audio data in the set queue includes:

removing the audio data to be played from the set queue through the first thread;

when the time delay of the audio transmission in the current state is obtained according to the real-time capacity of the set queue and exceeds a preset value, the method for reducing the receiving amount of the audio data in the set queue to slow down the time delay of the audio transmission comprises the following steps:

acquiring the real-time capacity of the set queue, and determining the size relation between the real-time capacity and the preset capacity value;

when the time delay of the audio transmission in the current state is obtained according to the real-time capacity of the set queue and exceeds a preset value, the method reduces the receiving amount of the audio data in the set queue so as to slow down the time delay of the audio transmission, and further comprises the following steps:

if the real-time capacity is larger than the preset capacity value, determining that the time delay exceeds the preset value;

presetting a frame skip condition for processing the received audio data;

when the time delay is determined to exceed the preset value, reducing the subsequent receiving amount of the audio data in the set queue according to the frame skip condition so as to slow down the time delay of the audio transmission;

or,

2. An audio transmission control device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, which when executed by the processor, implements the steps of the audio transmission control method according to claim 1.

3. A computer-readable storage medium, wherein an audio transmission control program is stored on the computer-readable storage medium, which when executed by a processor, implements the steps of the audio transmission control method according to claim 1.