CN116260797A - Processing method, device, equipment and storage medium for audio and video data transmission - Google Patents

Publication number
CN116260797A
Authority
CN
China
Prior art keywords
terminal
audio
video data
transmission channel
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111497996.8A
Other languages
Chinese (zh)
Inventor
周吾昆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Youzhuju Network Technology Co Ltd
Original Assignee
Beijing Youzhuju Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Youzhuju Network Technology Co Ltd
Priority to CN202111497996.8A
Publication of CN116260797A
Current legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/1066 Session management
    • H04L65/1069 Session establishment or de-establishment
    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09B EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B5/00 Electrically-operated educational appliances
    • G09B5/08 Electrically-operated educational appliances providing for individual presentation of information to a plurality of student stations
    • G09B5/14 Electrically-operated educational appliances providing for individual presentation of information to a plurality of student stations with provision for individual teacher-student communication
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/80 Responding to QoS
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/104 Peer-to-peer [P2P] networks
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/14 Session management
    • H04L67/141 Setup of application sessions

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Business, Economics & Management (AREA)
  • Multimedia (AREA)
  • General Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • Educational Administration (AREA)
  • Educational Technology (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The disclosure provides a processing method, a device, equipment and a storage medium for audio and video data transmission. The method comprises the following steps: a transmission channel is established between a first terminal and a second terminal; the first terminal receives, through the transmission channel, video picture data acquired by the second terminal; the first terminal calibrates the video picture data with pre-acquired audio data to obtain audio and video data, the audio data and the video picture data being acquired synchronously; and the first terminal pushes the audio and video data to a streaming media server, which is used for pushing the audio and video data to a third terminal. Establishing the transmission channel between the first terminal and the second terminal ensures stable and reliable audio and video data transmission between them, so that the first terminal can stably and reliably push the time-synchronized audio and video data to the streaming media server, which safeguards the experience of both parties in online teaching.

Description

Processing method, device, equipment and storage medium for audio and video data transmission
Technical Field
The disclosure relates to the field of data processing, and in particular relates to a processing method, a device, equipment and a storage medium for audio and video data transmission.
Background
At present, online network technology is increasingly applied in education and teaching: teachers can give lessons through intelligent terminal devices and the Internet, and students can learn by watching course audio and video online through intelligent terminal devices and the Internet.
When giving lessons through intelligent terminal devices and the Internet, a teacher usually uses a mobile phone together with an iPad. Specifically, a TCP (Transmission Control Protocol) connection is established between the mobile phone end and the iPad end. The mobile phone end collects video data of the teacher's lesson and transmits it to the iPad end over the TCP connection, while the iPad end synchronously collects audio data. The iPad end then performs time synchronization calibration on the video data from the mobile phone end and the locally collected audio data, and pushes the result to a media server to realize online teaching.
Because the teacher's mobile phone and iPad transmit video data over a TCP connection, a poor network environment raises the packet loss rate, which readily triggers TCP congestion control and reduces the transmission bandwidth. The transmission of video data is affected, and the video received by the iPad suffers from problems such as picture delay and picture stuttering. In addition, after the network environment recovers, the backlogged data is received all at once, so the video picture may be played at an accelerated rate, which degrades the experience of both parties in online teaching.
Therefore, how to ensure stable and reliable audio and video data transmission between the two devices at the teacher end of online teaching, and thereby improve the experience of both parties, is a technical problem that currently needs to be solved.
Disclosure of Invention
In order to solve the above technical problems or at least partially solve the above technical problems, the embodiments of the present disclosure provide a processing method for audio and video data transmission, which establishes a transmission channel between a first terminal and a second terminal, so as to ensure the stability and reliability of audio and video data transmission between the first terminal and the second terminal, thereby enabling the first terminal to stably and reliably push audio and video data after time alignment to a streaming media server, and ensuring the experience of both online teaching parties.
In a first aspect, the present disclosure provides a processing method for audio and video data transmission, where the method includes:
a transmission channel is established between the first terminal and the second terminal;
the first terminal receives the video picture data acquired by the second terminal through the transmission channel;
the first terminal calibrates the video picture data with the pre-acquired audio data to obtain audio and video data; wherein the audio data and the video picture data are synchronously acquired;
the first terminal pushes the audio and video data to a streaming media server; the streaming media server is used for pushing the audio and video data to a third terminal.
In an alternative embodiment, the establishing a transmission channel between the first terminal and the second terminal includes:
establishing application layer connection between a first terminal and a second terminal;
the first terminal sends a request offer message to the second terminal through the application layer connection; wherein the offer message carries session description protocol SDP information of the first terminal;
the first terminal receives a response answer message from the second terminal through the application layer connection; the answer message carries SDP information of the second terminal;
and the first terminal establishes a peer-to-peer connection P2P transmission channel with the second terminal based on the SDP information of the second terminal.
In an alternative embodiment, the establishing a transmission channel between the first terminal and the second terminal includes:
establishing application layer connection between a first terminal and a second terminal;
the first terminal receives the offer message from the second terminal through the application layer connection; wherein, the offer message carries session description protocol SDP information of the second terminal;
the first terminal collects SDP information of the first terminal and generates an answer message based on the SDP information;
the first terminal sends the answer message to the second terminal through the application layer connection; the answer message is used for establishing P2P connection between the first terminal and the second terminal.
In an alternative embodiment, the establishing an application layer connection between the first terminal and the second terminal includes:
the first terminal establishes a TCP connection with the second terminal based on a transmission control protocol TCP.
In an alternative embodiment, the first terminal establishes a TCP connection with the second terminal based on a transmission control protocol TCP, including:
and the first terminal establishes a TCP connection with the second terminal based on the two-dimensional code of the first terminal or the second terminal.
In an alternative embodiment, the establishing a transmission channel between the first terminal and the second terminal includes:
and establishing a user datagram protocol UDP data transmission channel between the first terminal and the second terminal.
In an alternative embodiment, the first terminal starts timing when it detects that it has begun establishing the application layer connection with the second terminal, and stops timing when it detects that the P2P transmission channel between itself and the second terminal has been successfully established, thereby obtaining a timing result;
the first terminal determines the timing result as the time taken to establish the connection between itself and the second terminal.
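The timing step above can be sketched as follows. This is only an illustration under stated assumptions: the class name `ConnectionTimer` and its two callback names are hypothetical, and the choice of a monotonic clock is ours, since the disclosure does not specify how the timer is implemented.

```python
import time
from typing import Callable, Optional


class ConnectionTimer:
    """Hypothetical helper: times the span from the start of application layer
    connection setup to successful establishment of the P2P channel."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock                    # injectable so the timer can be tested
        self._start: Optional[float] = None
        self.elapsed: Optional[float] = None   # the "timing result", in seconds

    def on_app_layer_connect_start(self) -> None:
        """Call when the first terminal begins the application layer connection."""
        self._start = self._clock()
        self.elapsed = None

    def on_p2p_established(self) -> float:
        """Call when the P2P transmission channel is successfully established."""
        if self._start is None:
            raise RuntimeError("timing was never started")
        self.elapsed = self._clock() - self._start
        return self.elapsed
```

A monotonic clock is used in this sketch so that wall-clock adjustments during setup do not distort the measured duration.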
In a second aspect, the present disclosure further provides a processing apparatus for audio and video data transmission, where the apparatus includes:
the establishing module is used for establishing a transmission channel with the second terminal;
the receiving module is used for receiving the video picture data acquired by the second terminal through the transmission channel;
the calibration module is used for calibrating the video picture data with the pre-acquired audio data to obtain audio and video data; wherein the audio data and the video picture data are synchronously acquired;
the pushing module is used for pushing the audio and video data to the streaming media server; the streaming media server is used for pushing the audio and video data to a third terminal.
In a third aspect, the present disclosure provides a computer readable storage medium having instructions stored therein, which when run on a terminal device, cause the terminal device to implement the above-described method.
In a fourth aspect, the present disclosure provides a device comprising: a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the above-described method when executing the computer program.
In a fifth aspect, the present disclosure provides a computer program product comprising computer programs/instructions which when executed by a processor implement the above-described method.
Compared with the prior art, the technical scheme provided by the embodiment of the disclosure has the following advantages:
the embodiment of the disclosure provides a processing method for audio and video data transmission. First, a transmission channel is established between a first terminal and a second terminal. The first terminal then receives, through the transmission channel, video picture data acquired by the second terminal, and calibrates the video picture data with pre-acquired audio data to obtain audio and video data, the audio data and the video picture data being acquired synchronously. The first terminal pushes the audio and video data to the streaming media server, which is used for pushing the audio and video data to the third terminal. By establishing the transmission channel between the first terminal and the second terminal, the method ensures stable and reliable audio and video data transmission between them, so that the first terminal can stably and reliably push the time-synchronized audio and video data to the streaming media server, which safeguards the experience of both parties in online teaching.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the disclosure and together with the description, serve to explain the principles of the disclosure.
In order to more clearly illustrate the embodiments of the present disclosure or the solutions in the prior art, the drawings that are required for the description of the embodiments or the prior art will be briefly described below, and it will be obvious to those skilled in the art that other drawings can be obtained from these drawings without inventive effort.
Fig. 1 is a flowchart of a processing method for audio and video data transmission according to an embodiment of the present disclosure;
fig. 2 is a schematic diagram of the connection setup time between terminals according to an embodiment of the present disclosure;
fig. 3 is a schematic data interaction diagram of a processing method for audio and video data transmission according to an embodiment of the present disclosure;
fig. 4 is a schematic structural diagram of a processing device for audio and video data transmission according to an embodiment of the present disclosure;
fig. 5 is a schematic structural diagram of a processing device for audio and video data transmission according to an embodiment of the present disclosure.
Detailed Description
In order that the above objects, features and advantages of the present disclosure may be more clearly understood, a further description of aspects of the present disclosure will be provided below. It should be noted that, without conflict, the embodiments of the present disclosure and features in the embodiments may be combined with each other.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present disclosure, but the present disclosure may be practiced otherwise than as described herein; it will be apparent that the embodiments in the specification are only some, but not all, embodiments of the disclosure.
At present, online network technology is increasingly applied in education and teaching: teachers can give lessons through intelligent terminal devices and the Internet, and students can learn by watching course audio and video online through intelligent terminal devices and the Internet.
When giving lessons through intelligent terminal devices and the Internet, a teacher usually uses a mobile phone together with an iPad. Specifically, a TCP connection is established between the mobile phone end and the iPad end. The mobile phone end collects video data of the teacher's lesson and transmits it to the iPad end over the TCP connection, while the iPad end synchronously collects audio data. The iPad end then performs time synchronization calibration on the video data from the mobile phone end and the locally collected audio data, and pushes the result to a media server to realize online teaching.
Because the teacher's mobile phone and iPad transmit video data over a TCP connection, a poor network environment raises the packet loss rate, which readily triggers TCP congestion control and reduces the transmission bandwidth. The transmission of video data is affected, and the video received by the iPad suffers from problems such as picture delay and picture stuttering. In addition, after the network environment recovers, the backlogged data is received all at once, so the video picture may be played at an accelerated rate, which degrades the experience of both parties in online teaching.
In this application scenario, how to ensure stable and reliable audio and video data transmission between the two devices at the teacher end of online teaching, and thereby improve the experience of both parties, is the technical problem to be solved by the embodiments of the disclosure.
Therefore, the embodiment of the disclosure provides a processing method for audio and video data transmission. First, a transmission channel is established between a first terminal and a second terminal. The first terminal then receives, through the transmission channel, video picture data acquired by the second terminal, and calibrates the video picture data with pre-acquired audio data to obtain audio and video data, the audio data and the video picture data being acquired synchronously. The first terminal pushes the audio and video data to the streaming media server, which is used for pushing the audio and video data to the third terminal. By establishing the transmission channel between the first terminal and the second terminal, the method ensures stable and reliable audio and video data transmission between them, so that the first terminal can stably and reliably push the time-synchronized audio and video data to the streaming media server, which safeguards the experience of both parties in online teaching.
Based on this, an embodiment of the present disclosure provides a processing method for audio and video data transmission. Referring to fig. 1, which is a flowchart of the processing method for audio and video data transmission provided by the embodiment of the present disclosure, the method includes:
S101: A transmission channel is established between the first terminal and the second terminal.
In the embodiment of the disclosure, the first terminal and the second terminal are in the same local area network, and may be intelligent terminals equipped with cameras and microphones, such as smart phones and tablet computers. In an online teaching application scenario, for example, the first terminal and the second terminal may both be teacher-side terminals: the first terminal may be a tablet computer client (such as an iPad teaching end) and the second terminal may be a smart phone client (such as a mobile phone pushing end). The iPad teaching end and the mobile phone pushing end are in the same local area network.
In the embodiment of the disclosure, a transmission channel is established between the first terminal and the second terminal; for example, a P2P (Peer to Peer) transmission channel may be established. P2P is a distributed application architecture that distributes tasks and workloads among peers, and is a form of networking built on a peer-to-peer computing model at the application layer. When a P2P transmission channel is established between the first terminal and the second terminal, the two terminals have equal status in the channel: the first terminal acts as a server providing services to the second terminal while also using the services provided by the second terminal, and conversely the second terminal acts as a server providing services to the first terminal while also using the services provided by the first terminal.
In an alternative embodiment, the peer-to-peer connection P2P transmission channel between the first terminal and the second terminal may be established according to the following steps A1-A4.
Step A1: An application layer connection is established between the first terminal and the second terminal.
In the embodiment of the disclosure, a connection is established between the first terminal and the second terminal at the application layer. For example, a TCP (Transmission Control Protocol) connection may be established between the first terminal and the second terminal, and communication between them is then carried out based on the TCP protocol.
In an alternative embodiment, the first terminal establishes a TCP connection with the second terminal based on the transmission control protocol TCP.
In the embodiment of the disclosure, the TCP connection is an auxiliary data transmission channel established at the application layer. It is used to transmit the SDP (Session Description Protocol) information of the first terminal and of the second terminal, in support of the subsequent establishment of the P2P connection.
In an alternative embodiment, the first terminal establishes a TCP connection with the second terminal based on the two-dimensional code of the first terminal.
In the embodiment of the disclosure, the first terminal generates a two-dimensional code, the second terminal scans the two-dimensional code, and the TCP connection between the first terminal and the second terminal is established based on the two-dimensional code generated by the first terminal.
In another alternative embodiment, the first terminal establishes a TCP connection with the second terminal based on the two-dimensional code of the second terminal.
In the embodiment of the disclosure, the second terminal generates a two-dimensional code, the first terminal scans the two-dimensional code, and the TCP connection between the first terminal and the second terminal is established based on the two-dimensional code generated by the second terminal.
For example, in an online teaching application scenario, the first terminal may be an iPad teaching end, and the second terminal may be a mobile phone pushing end. When the teacher-side user clicks the start button, the iPad teaching end automatically turns on its camera and microphone and generates a two-dimensional code. The mobile phone pushing end scans the two-dimensional code, and the TCP connection between the iPad teaching end and the mobile phone pushing end is established based on the two-dimensional code.
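The disclosure does not state what the two-dimensional code contains. As a minimal sketch, assume it encodes the listening address of the terminal that displays it as a `tcp://host:port` string; this payload format and the function names below are hypothetical. Rendering and scanning the code image are omitted, and both roles run on the loopback interface so the example is self-contained.

```python
import socket
from typing import Tuple
from urllib.parse import urlparse


def make_pairing_payload(host: str, port: int) -> str:
    """Text the displaying terminal would encode into its two-dimensional code."""
    return f"tcp://{host}:{port}"


def parse_pairing_payload(payload: str) -> Tuple[str, int]:
    """Recover (host, port) on the scanning terminal; reject malformed input."""
    parsed = urlparse(payload)
    if parsed.scheme != "tcp" or parsed.hostname is None or parsed.port is None:
        raise ValueError(f"not a pairing payload: {payload!r}")
    return parsed.hostname, parsed.port


def pair_over_loopback() -> bytes:
    """Both roles in one process: listen, 'scan' the payload, connect, greet."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))   # port 0 lets the OS pick a free port
    listener.listen(1)
    listener.settimeout(2.0)
    payload = make_pairing_payload(*listener.getsockname())
    # The scanning terminal decodes the payload and opens the TCP connection.
    client = socket.create_connection(parse_pairing_payload(payload), timeout=2.0)
    conn, _peer = listener.accept()
    try:
        client.sendall(b"hello")
        return conn.recv(16)
    finally:
        conn.close()
        client.close()
        listener.close()
```

In a real deployment the payload would carry the LAN address of the displaying terminal rather than the loopback address.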
In the embodiment of the disclosure, after the code has been scanned, a WebRTC (Web Real-Time Communication) connection can be established between the first terminal and the second terminal, and the SDP information of the two terminals is exchanged by means of a signaling server. The signaling server helps the two ends establish a connection while exposing as little private information as possible, which improves security. Signaling may be implemented, for example, by establishing a socket connection.
Step A2: the first terminal sends a request offer message to the second terminal through the application layer connection.
The offer message carries session description protocol SDP information of the first terminal.
In the embodiment of the disclosure, the first terminal sends a request offer message to the second terminal through the socket connection, and waits for the response answer message of the second terminal. The offer message includes the message type (denoted by type), the sender user name (denoted by name), the receiver user name (denoted by target), the SDP information of the first terminal, and so on. For example, the offer message includes type: "video-offer", name: the first terminal, target: the second terminal, and the SDP information of the first terminal.
Step A3: the first terminal receives a reply answer message from the second terminal through the application layer connection.
The answer message carries SDP information of the second terminal.
In the embodiment of the disclosure, after the second terminal receives the offer message sent by the first terminal, the second terminal sends a response answer message to the first terminal through socket connection. The answer message includes a message type (e.g., type) of the answer message, a sender user name (e.g., name), a receiver user name (e.g., target), SDP information of the second terminal, and the like. For example, the answer message includes type: "video-answer", name: second terminal, target: SDP information of the first terminal, the second terminal, etc.
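The offer and answer messages described in steps A2 and A3 can be sketched as JSON objects. The `type`, `name` and `target` fields come from the description; the `sdp` key name, the JSON encoding, the user names and the placeholder SDP strings are assumptions made for illustration only.

```python
import json

REQUIRED_FIELDS = {"type", "name", "target", "sdp"}  # "sdp" key name is assumed


def make_signal(msg_type: str, name: str, target: str, sdp: str) -> str:
    """Serialize one signaling message for sending over the socket connection."""
    if msg_type not in ("video-offer", "video-answer"):
        raise ValueError(f"unexpected message type: {msg_type}")
    return json.dumps({"type": msg_type, "name": name, "target": target, "sdp": sdp})


def parse_signal(raw: str) -> dict:
    """Decode a received message and check that every expected field is present."""
    message = json.loads(raw)
    missing = REQUIRED_FIELDS - message.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return message


# Step A2: the first terminal builds its offer (placeholder SDP body).
offer = make_signal("video-offer", "first-terminal", "second-terminal", "v=0\r\n")
# Step A3: the second terminal replies with its own SDP information.
answer = make_signal("video-answer", "second-terminal", "first-terminal", "v=0\r\n")
```

Note that the sender and receiver names swap between the offer and the answer, which is what lets each side route the reply to its peer.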
Step A4: the first terminal establishes a peer-to-peer connection P2P transmission channel with the second terminal based on SDP information of the second terminal.
In the embodiment of the present disclosure, following steps A2 and A3 above, the ICE (Interactive Connectivity Establishment) protocol may be applied to complete the establishment of the P2P transmission channel using the SDP information exchanged between the first terminal and the second terminal. If establishment of the P2P transmission channel is unsuccessful, step A4 is repeated: the ICE protocol is applied again, based on the SDP information, to establish the P2P transmission channel between the first terminal and the second terminal. Once the P2P transmission channel is successfully established, audio and video data can be transmitted between the first terminal and the second terminal through it.
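The retry behaviour of step A4 can be sketched as a simple loop. The callable `try_connect` is a hypothetical stand-in for one ICE-based establishment attempt, and the `max_attempts` bound is our assumption; the description only says that step A4 is repeated on failure and does not give a limit.

```python
from typing import Callable


def establish_with_retry(try_connect: Callable[[], bool], max_attempts: int = 5) -> int:
    """Repeat the establishment attempt until it succeeds.

    Returns the number of attempts used. The upper bound is an assumption
    added so that the loop cannot spin forever on a broken network.
    """
    for attempt in range(1, max_attempts + 1):
        if try_connect():   # one attempt based on the exchanged SDP information
            return attempt
    raise ConnectionError(f"P2P channel not established after {max_attempts} attempts")
```

A production implementation would likely add a delay between attempts; that is left out here to keep the sketch short.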
In an alternative embodiment, the peer-to-peer connection P2P transmission channel between the first terminal and the second terminal may be established according to the following steps B1-B4.
Step B1: An application layer connection is established between the first terminal and the second terminal.
In the embodiment of the present disclosure, the specific process of step B1 is the same as that described in detail for step A1 in the above embodiment, and is not repeated here.
Step B2: the first terminal receives the offer message from the second terminal through the application layer connection.
Wherein the offer message carries session description protocol SDP information of the second terminal.
In the embodiment of the disclosure, the first terminal receives the offer message from the second terminal through the socket connection. The offer message includes the message type (denoted by type), the sender user name (denoted by name), the receiver user name (denoted by target), the SDP information of the second terminal, and so on. For example, the offer message includes type: "video-offer", name: the second terminal, target: the first terminal, and the SDP information of the second terminal.
Step B3: The first terminal collects its own SDP information and generates an answer message based on the SDP information.
In the embodiment of the disclosure, the first terminal collects its own SDP information and generates an answer message based on it. The answer message includes the message type (denoted by type), the sender user name (denoted by name), the receiver user name (denoted by target), and the SDP information of the first terminal. For example, the answer message includes type: "video-answer", name: the first terminal, target: the second terminal, and the SDP information of the first terminal.
Step B4: the first terminal sends an answer message to the second terminal through the application layer connection.
The answer message is used for establishing P2P connection between the first terminal and the second terminal.
In the embodiment of the disclosure, the first terminal sends the answer message to the second terminal through the socket connection. The ICE protocol is then applied to complete the establishment of the P2P transmission channel using the SDP information exchanged between the first terminal and the second terminal. Once the P2P transmission channel is successfully established, audio and video data can be transmitted between the first terminal and the second terminal through it.
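Steps B2 to B4, seen from the answering side, can be sketched as a single handler. As before, the JSON encoding and the `sdp` key name are assumptions, and `handle_offer` is a hypothetical name; the function only builds the answer and returns the remote SDP information, leaving the actual ICE negotiation out of scope.

```python
import json
from typing import Tuple


def handle_offer(raw_offer: str, local_name: str, local_sdp: str) -> Tuple[str, str]:
    """Validate a received offer and build the matching answer.

    Returns (remote_sdp, raw_answer). The "sdp" key name is an assumption.
    """
    offer = json.loads(raw_offer)
    if offer.get("type") != "video-offer":
        raise ValueError("expected a video-offer message")
    if offer.get("target") != local_name:
        raise ValueError("offer is addressed to a different terminal")
    answer = {
        "type": "video-answer",
        "name": local_name,        # sender of the answer
        "target": offer["name"],   # reply to whoever sent the offer
        "sdp": local_sdp,          # SDP information collected in step B3
    }
    return offer["sdp"], json.dumps(answer)
```

Checking the `target` field before answering keeps a terminal from replying to an offer that was meant for another peer.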
In an alternative embodiment, a user datagram protocol (UDP) data transmission channel is established between the first terminal and the second terminal.
In the embodiment of the disclosure, UDP (User Datagram Protocol) is a protocol for handling data packets, and a UDP data transmission channel may be used to transmit audio and video data between the first terminal and the second terminal. One way of establishing the P2P transmission channel between the first terminal and the second terminal is to establish a UDP data transmission channel between them.
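A minimal sketch of sending one datagram over UDP, using Python's standard `socket` module on the loopback interface so that both ends run in one process. The function name and payload are hypothetical, and real media transport would add packetization, sequence numbers and loss handling, none of which are shown.

```python
import socket


def udp_loopback_demo(payload: bytes) -> bytes:
    """Send one datagram from a sender socket to a receiver socket and return it."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))   # port 0 lets the OS pick a free port
    receiver.settimeout(2.0)          # UDP gives no delivery guarantee, so bound the wait
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # No handshake: the datagram is simply addressed to the receiver.
        sender.sendto(payload, receiver.getsockname())
        data, _addr = receiver.recvfrom(65535)
        return data
    finally:
        sender.close()
        receiver.close()
```

Unlike TCP, UDP applies no congestion control or retransmission of its own, so a lost packet does not stall the packets behind it.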
S102: the first terminal receives the video picture data acquired by the second terminal through the transmission channel.
In the embodiment of the disclosure, after the second terminal collects the video picture data, it sends the video picture data to the first terminal through the P2P transmission channel. For example, in an online teaching application scenario, the first terminal may be an iPad teaching end, and the second terminal may be a mobile phone pushing end. When the teacher-side user clicks the start button, the mobile phone pushing end automatically turns on its camera and collects video picture data.
S103: The first terminal calibrates the video picture data with the pre-acquired audio data to obtain audio and video data.
Wherein, the audio data and the video picture data are synchronously collected.
In the embodiment of the disclosure, while the second terminal collects video picture data, the first terminal synchronously collects audio data. After receiving the video picture data transmitted by the second terminal through the P2P transmission channel, the first terminal calibrates the video picture data with the audio data, for example by performing time synchronization calibration on them, so as to obtain audio and video data.
For example, in an online teaching application scenario, the first terminal may be an iPad teaching end, and the second terminal may be a mobile phone push end. When the teacher-side user clicks a start button, the mobile phone push end automatically opens its camera and collects video picture data; meanwhile, the iPad teaching end automatically opens its camera and microphone and collects audio data. After receiving the video picture data transmitted by the mobile phone push end through the P2P transmission channel, the iPad teaching end performs time synchronization calibration on the video picture data and the audio data to obtain the final audio and video data.
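The disclosure does not state how the time synchronization calibration is performed. One possible approach, sketched below under the assumption that video frames and audio chunks carry timestamps in milliseconds on a shared clock, is to pair each video frame with the nearest audio chunk within a tolerance. The function name align and the 20 ms default tolerance are hypothetical.

```python
from bisect import bisect_left
from typing import List, Optional, Tuple


def align(video_ts: List[int], audio_ts: List[int],
          tolerance_ms: int = 20) -> List[Tuple[int, Optional[int]]]:
    """Pair each video timestamp with the nearest audio timestamp.

    A video frame with no audio chunk within tolerance_ms is paired with None.
    """
    audio_sorted = sorted(audio_ts)
    pairs = []
    for v in video_ts:
        i = bisect_left(audio_sorted, v)
        # Only the neighbours on either side of the insertion point can be nearest.
        candidates = audio_sorted[max(i - 1, 0):i + 1]
        best = min(candidates, key=lambda a: abs(a - v), default=None)
        if best is not None and abs(best - v) > tolerance_ms:
            best = None
        pairs.append((v, best))
    return pairs
```

For example, align([0, 33, 66, 500], [0, 20, 40, 60, 80]) pairs the first three frames with audio at 0, 40, and 60 ms and leaves the frame at 500 ms unpaired.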
S104: the first terminal pushes the audio and video data to the streaming media server.
The streaming media server is used for pushing the audio and video data to the third terminal.
In the embodiment of the disclosure, the streaming media server includes an RTC (Real-Time Communication) server, and the first terminal, the second terminal, and the third terminal may each apply to the RTC server to join the same RTC room. The first terminal applies to the RTC server to join the RTC room and, after joining successfully, can push audio and video data to the RTC server. The third terminal applies to the RTC server to join the RTC room and, after joining successfully, can pull the audio and video data from the RTC server.
In an alternative embodiment, the first terminal starts timing when it detects that it has started to establish the application layer connection with the second terminal, ends timing when it detects that the P2P transmission channel has been successfully established with the second terminal, and obtains a timing result. The first terminal determines the timing result as the connection time between itself and the second terminal.
Fig. 2 is a schematic diagram of the time consumed between terminals according to an embodiment of the present disclosure. The time consumed between the first terminal and the second terminal is divided into two parts. The first part is the connection time, i.e. the timing result t1+t2+t3+t4+t5 in fig. 2, measured from the perspective of the first terminal; the second part is the transmission time.
In the embodiment of the disclosure, the first terminal starts timing when it detects that it has started to establish the TCP connection with the second terminal, and ends timing when it detects that the P2P transmission channel has been successfully established with the second terminal; the corresponding timing result is the connection time between the first terminal and the second terminal. As shown in fig. 2, t1 is the time from the first terminal starting to create the offer until the offer is successfully created, including setting the local SDP information of the first terminal; t2 is the time from the offer being successfully created until the first terminal starts sending the offer to the second terminal; t3 is the time from starting to send the offer until the answer returned by the second terminal is received, including setting the far-end SDP information of the second terminal; t4 is the time taken to set the answer of the second terminal locally on the first terminal and start the ICE protocol connection; t5 is the time taken to establish the ICE protocol connection, after which data transmission starts.
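The join, push, and pull flow can be illustrated with a small in-memory model. This is not an RTC server implementation; the class RtcRoomServer and its methods are hypothetical stand-ins that only mirror the rule stated above, namely that a terminal must join a room before pushing to it or pulling from it.

```python
from collections import defaultdict
from typing import Dict, List, Set


class RtcRoomServer:
    """Toy in-memory stand-in for an RTC server with rooms (illustrative only)."""

    def __init__(self) -> None:
        self._members: Dict[str, Set[str]] = defaultdict(set)
        self._streams: Dict[str, List[bytes]] = defaultdict(list)

    def join(self, room: str, terminal: str) -> bool:
        """Admit a terminal to a room; this sketch always accepts."""
        self._members[room].add(terminal)
        return True

    def push(self, room: str, terminal: str, packet: bytes) -> None:
        """Store a media packet pushed by a terminal that has joined the room."""
        if terminal not in self._members[room]:
            raise PermissionError(f"{terminal} has not joined {room}")
        self._streams[room].append(packet)

    def pull(self, room: str, terminal: str) -> List[bytes]:
        """Return the packets of the room to a terminal that has joined it."""
        if terminal not in self._members[room]:
            raise PermissionError(f"{terminal} has not joined {room}")
        return list(self._streams[room])


server = RtcRoomServer()
server.join("class-1", "first-terminal")
server.join("class-1", "third-terminal")
server.push("class-1", "first-terminal", b"av-packet-0")
received = server.pull("class-1", "third-terminal")
```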
In the embodiment of the present disclosure, during t3+t4+t5 the second terminal carries out its side of the connection establishment in parallel, shown as t6+t7 in fig. 2, where t6 is the time from the second terminal receiving the offer of the first terminal until the second terminal creates its own answer, and sets the SDP information of the second terminal as the far-end SDP information; t7 is the time from the second terminal sending the answer until the ICE connection is successfully established.
In the embodiment of the disclosure, the transmission time is the time required for one video frame to travel from the capturing end to the receiving peer. As shown in fig. 2, t8 is the time from the successful ICE establishment until the first video frame is received.
In the processing method for audio and video data transmission provided by the embodiment of the disclosure, a transmission channel is first established between a first terminal and a second terminal. The first terminal then receives, through the transmission channel, the video picture data collected by the second terminal, and calibrates the video picture data with pre-collected audio data to obtain audio and video data, where the audio data and the video picture data are collected synchronously. The first terminal pushes the audio and video data to the streaming media server, which pushes the audio and video data to the third terminal. By establishing the transmission channel between the first terminal and the second terminal, the method ensures stable and reliable transmission of audio and video data between them, so that the first terminal can stably and reliably push the time-synchronized audio and video data to the streaming media server, ensuring a good experience for both parties in online teaching.
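The relationship between the intervals in fig. 2 can be worked through with a short calculation. The event names and the millisecond values below are invented for illustration; only the definitions of t1 to t5, their sum as the connection time, and t8 as the transmission time follow the text.

```python
from typing import Dict

# Events on the first terminal, in the order described for fig. 2 (names assumed).
EVENTS = [
    "offer_start",      # first terminal starts creating the offer
    "offer_created",    # offer created successfully          (end of t1)
    "offer_sent",       # first terminal starts sending offer (end of t2)
    "answer_received",  # answer received from second terminal (end of t3)
    "ice_started",      # answer set locally, ICE started     (end of t4)
    "ice_connected",    # ICE connection established          (end of t5)
]


def breakdown(ts: Dict[str, int]) -> Dict[str, int]:
    """Derive t1..t5, the connection time, and t8 from event timestamps (ms)."""
    out = {f"t{i + 1}": ts[EVENTS[i + 1]] - ts[EVENTS[i]] for i in range(5)}
    out["connect"] = sum(out[f"t{i}"] for i in range(1, 6))
    out["t8"] = ts["first_frame"] - ts["ice_connected"]  # transmission time
    return out


sample = {"offer_start": 0, "offer_created": 15, "offer_sent": 20,
          "answer_received": 120, "ice_started": 130, "ice_connected": 330,
          "first_frame": 400}
result = breakdown(sample)
```

With these sample values the connection time is 15 + 5 + 100 + 10 + 200 = 330 ms and t8 is 70 ms.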
Based on the above processing method for audio and video data transmission, taking an online teaching application scenario as an example, the first terminal may be an iPad teaching terminal, the second terminal may be a mobile phone push terminal, and the third terminal may be a student terminal.
S301: the iPad teaching end generates a two-dimensional code.
In the embodiment of the disclosure, an iPad teaching end starts a camera and generates a two-dimensional code.
S302: the mobile phone push end scans the two-dimensional code.
In the embodiment of the disclosure, the mobile phone push end starts its camera and scans the two-dimensional code generated by the iPad teaching end, so as to establish a TCP connection between the iPad teaching end and the mobile phone push end.
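The disclosure does not say what the two-dimensional code contains. One plausible design, assumed here purely for illustration, is that it encodes the address on which the iPad teaching end listens, so that the scanning side knows where to open the TCP connection. The tcp:// payload format and both function names are hypothetical; rendering and scanning the code itself are out of scope for this sketch.

```python
from typing import Tuple
from urllib.parse import urlsplit


def make_qr_payload(host: str, port: int) -> str:
    """Build the text to embed in the two-dimensional code (format assumed)."""
    return f"tcp://{host}:{port}"


def parse_qr_payload(payload: str) -> Tuple[str, int]:
    """Recover the host and port from the scanned text."""
    parts = urlsplit(payload)
    if parts.scheme != "tcp" or parts.hostname is None or parts.port is None:
        raise ValueError(f"unsupported payload: {payload!r}")
    return parts.hostname, parts.port


host, port = parse_qr_payload(make_qr_payload("192.168.1.20", 9000))
```

The scanning side could then pass the recovered pair to socket.create_connection to open the TCP connection.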
S303: the iPad teaching end sends a request offer message to the mobile phone push end through the application layer connection.
The offer message carries session description protocol SDP information of the iPad teaching end.
In the embodiment of the disclosure, the iPad teaching end sends a request offer message to the mobile phone push end through the socket connection, where the offer message includes the message type: "video-offer", the sender user name: iPad teaching end, the receiver user name target: mobile phone push end, the SDP information of the iPad teaching end, etc.
S304: the mobile phone push end receives the offer message sent by the iPad teaching end and generates a response answer message.
The answer message carries SDP information of the mobile phone push end.
In the embodiment of the disclosure, the mobile phone push end receives the offer message sent by the iPad teaching end and generates a response answer message, where the answer message includes the message type: "video-answer", the sender user name: mobile phone push end, the receiver user name target: iPad teaching end, the SDP information of the mobile phone push end, etc.
S305: the iPad teaching end receives the answer message from the mobile phone push end through the application layer connection.
In the embodiment of the disclosure, the iPad teaching end receives the answer message from the mobile phone push end through the socket connection.
S306: the iPad teaching end establishes a P2P transmission channel with the mobile phone push end based on the SDP information of the mobile phone push end.
In the embodiment of the disclosure, SDP information is exchanged between the iPad teaching end and the mobile phone push end through S303-S305, and the ICE protocol is then applied to establish the P2P transmission channel on the basis of the exchanged SDP information. The P2P transmission channel can be used for transmitting audio and video data between the iPad teaching end and the mobile phone push end.
S307: the mobile phone push end collects video picture data, and the iPad teaching end synchronously collects audio data.
In the embodiment of the disclosure, the mobile phone push end collects video picture data and can apply operations such as beautification, filters, and virtual background effects to the video picture data; meanwhile, the iPad teaching end synchronously collects audio data.
S308: the mobile phone push end sends the video picture data to the iPad teaching end through the P2P transmission channel.
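The exchange in S303-S305 can be sketched with a pair of connected sockets standing in for the TCP connection between the two ends. The newline-delimited JSON framing, the "sdp" key, the placeholder SDP strings, and the function names are assumptions; the ICE negotiation that follows the exchange is not modelled.

```python
import json
import socket
from typing import Dict, Tuple


def send_message(sock: socket.socket, message: Dict[str, str]) -> None:
    """Send one JSON message terminated by a newline (framing assumed)."""
    sock.sendall(json.dumps(message).encode("utf-8") + b"\n")


def exchange_sdp(ipad_sdp: str, phone_sdp: str) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Run the offer/answer exchange and return (offer seen by phone, answer seen by iPad)."""
    ipad, phone = socket.socketpair()  # stands in for the TCP connection
    ipad_in = ipad.makefile("r", encoding="utf-8")
    phone_in = phone.makefile("r", encoding="utf-8")
    try:
        # S303: the iPad teaching end sends the offer.
        send_message(ipad, {"type": "video-offer", "name": "ipad",
                            "target": "phone", "sdp": ipad_sdp})
        # S304: the mobile phone push end receives it and replies with an answer.
        offer = json.loads(phone_in.readline())
        send_message(phone, {"type": "video-answer", "name": "phone",
                             "target": offer["name"], "sdp": phone_sdp})
        # S305: the iPad teaching end receives the answer.
        answer = json.loads(ipad_in.readline())
        return offer, answer
    finally:
        ipad_in.close()
        phone_in.close()
        ipad.close()
        phone.close()


offer, answer = exchange_sdp("ipad-sdp-placeholder", "phone-sdp-placeholder")
```

After the exchange each side holds the other's SDP information, which is the precondition the text gives for applying the ICE protocol.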
S309: the iPad teaching end performs time synchronization calibration on the video picture data and the pre-acquired audio data to obtain audio and video data.
In the embodiment of the disclosure, after receiving the video picture data transmitted by the mobile phone push end through the P2P transmission channel, the iPad teaching end performs time synchronization calibration on the video picture data and the audio data to obtain audio and video data.
S310: the iPad teaching end pushes the audio and video data to the RTC server.
In the embodiment of the disclosure, the iPad teaching end can apply to the RTC server to join the RTC room, and push audio and video data to the RTC server after joining the RTC room successfully.
S311: the RTC server pushes the audio and video data to the student end.
In the embodiment of the disclosure, the student end can apply to the RTC server to join the RTC room, and pull the audio and video data from the RTC server after joining the RTC room successfully, so that online teaching from the teacher to the students is realized.
In the processing method for audio and video data transmission provided by the embodiment of the disclosure, a peer-to-peer (P2P) transmission channel is first established between the iPad teaching end and the mobile phone push end. The iPad teaching end then receives, through the P2P transmission channel, the video picture data collected by the mobile phone push end, and performs time synchronization calibration on the video picture data and the pre-collected audio data to obtain audio and video data, where the audio data and the video picture data are collected synchronously. The iPad teaching end pushes the audio and video data to the RTC server, which pushes the audio and video data to the student end. By establishing the P2P transmission channel between the iPad teaching end and the mobile phone push end, the method ensures stable and reliable transmission of audio and video data between them, so that the iPad teaching end can stably and reliably push the time-synchronized audio and video data to the RTC server, ensuring a good experience for both parties in online teaching.
Based on the above method embodiments, the disclosure further provides a processing device for audio/video data transmission, referring to fig. 4, which is a schematic structural diagram of the processing device for audio/video data transmission provided in the embodiments of the disclosure, where the processing device 400 for audio/video data transmission includes:
an establishing module 401, configured to establish a transmission channel with the second terminal;
a receiving module 402, configured to receive, through the transmission channel, video frame data collected by the second terminal;
the calibration module 403 is configured to calibrate the video frame data with the pre-acquired audio data to obtain audio/video data; wherein the audio data and the video picture data are synchronously acquired;
a pushing module 404, configured to push the audio and video data to a streaming media server; the streaming media server is used for pushing the audio and video data to a third terminal.
In an alternative embodiment, the establishing module 401 includes:
the first establishing sub-module is used for establishing application layer connection with the second terminal;
a first sending sub-module, configured to send a request offer message to the second terminal through the application layer connection; wherein, the offer message carries session description protocol SDP information;
a first receiving sub-module, configured to receive a response answer message from the second terminal through the application layer connection; the answer message carries SDP information of the second terminal;
and the second establishing sub-module is used for establishing a peer-to-peer (P2P) transmission channel with the second terminal based on the SDP information of the second terminal.
In an alternative embodiment, the establishing module 401 includes:
the first establishing sub-module is used for establishing application layer connection with the second terminal;
the second receiving submodule is used for receiving the offer message from the second terminal through the application layer connection; wherein, the offer message carries session description protocol SDP information of the second terminal;
the generation sub-module is used for collecting the SDP information of the first terminal itself and generating an answer message based on the SDP information;
a second sending sub-module, configured to send the answer message to the second terminal through the application layer connection; the answer message is used for establishing P2P connection with the second terminal.
In an alternative embodiment, the first establishing sub-module includes:
and the third establishing sub-module is used for establishing TCP connection with the second terminal based on a transmission control protocol TCP.
In an alternative embodiment, the third establishing sub-module is configured to:
and establishing TCP connection with the second terminal based on the two-dimensional code.
In an alternative embodiment, the establishing module 401 includes:
and the fourth establishing sub-module is used for establishing a user datagram protocol UDP data transmission channel with the second terminal.
In an alternative embodiment, the apparatus further comprises:
the timing module is used for starting timing when the first terminal detects that the first terminal starts to establish the application layer connection with the second terminal, ending timing when the first terminal detects that the P2P transmission channel is successfully established between the first terminal and the second terminal, and acquiring a timing result;
and the determining module is used for determining the timing result as the connection time between the first terminal and the second terminal.
In the processing device for audio and video data transmission provided in the embodiment of the present disclosure, a transmission channel is first established between a first terminal and a second terminal. The first terminal then receives, through the transmission channel, the video picture data collected by the second terminal, and calibrates the video picture data with pre-collected audio data to obtain audio and video data, where the audio data and the video picture data are collected synchronously. The first terminal pushes the audio and video data to the streaming media server, which pushes the audio and video data to the third terminal. By establishing the transmission channel between the first terminal and the second terminal, the device ensures stable and reliable transmission of audio and video data between them, so that the first terminal can stably and reliably push the time-synchronized audio and video data to the streaming media server, ensuring a good experience for both parties in online teaching.
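A possible shape for the timing and determining modules is sketched below. The class name ConnectionTimer, its method names, and the injectable clock are assumptions for illustration; the text only requires that timing starts when the application layer connection begins and ends when the P2P transmission channel is established.

```python
import time
from typing import Callable, Optional


class ConnectionTimer:
    """Measures the time from connection start to P2P channel establishment."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock  # injectable so the timer can be tested deterministically
        self._start: Optional[float] = None
        self.result: Optional[float] = None

    def on_connect_start(self) -> None:
        """Call when the application layer connection starts being established."""
        self._start = self._clock()
        self.result = None

    def on_p2p_established(self) -> float:
        """Call when the P2P channel is established; returns the connection time."""
        if self._start is None:
            raise RuntimeError("on_connect_start() was never called")
        self.result = self._clock() - self._start
        return self.result


ticks = iter([10.0, 12.5])  # fake clock readings in seconds
timer = ConnectionTimer(clock=lambda: next(ticks))
timer.on_connect_start()
elapsed = timer.on_p2p_established()
```

A monotonic clock is used by default because wall-clock time can jump during the measurement.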
In addition to the above method and apparatus, the embodiments of the present disclosure further provide a computer readable storage medium, where instructions are stored, when the instructions are executed on a terminal device, to enable the terminal device to implement the processing method for audio/video data transmission according to the embodiments of the present disclosure.
The embodiment of the disclosure also provides a computer program product, which comprises a computer program/instruction, and the computer program/instruction realizes the processing method of audio and video data transmission according to the embodiment of the disclosure when being executed by a processor.
In addition, the embodiment of the present disclosure further provides a processing device 500 for audio and video data transmission, as shown in fig. 5, which may include:
a processor 501, a memory 502, an input device 503 and an output device 504. The number of processors 501 in the processing device for the transmission of audio and video data may be one or more, one processor being taken as an example in fig. 5. In some embodiments of the present disclosure, the processor 501, memory 502, input device 503, and output device 504 may be connected by a bus or other means, with bus connections being exemplified in fig. 5.
The memory 502 may be used to store software programs and modules, and the processor 501 performs various functional applications and data processing of the processing device for audio-video data transmission by running the software programs and modules stored in the memory 502. The memory 502 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, application programs required for at least one function, and the like. In addition, memory 502 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid-state storage device. The input means 503 may be used to receive input numeric or character information and to generate signal inputs related to user settings and function control of the processing device for the transmission of audiovisual data.
In particular, in this embodiment, the processor 501 loads executable files corresponding to the processes of one or more application programs into the memory 502 according to the following instructions, and the processor 501 executes the application programs stored in the memory 502, so as to implement the various functions of the processing device for audio and video data transmission.
It should be noted that in this document, relational terms such as "first" and "second" and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The foregoing is merely a specific embodiment of the disclosure to enable one skilled in the art to understand or practice the disclosure. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the disclosure. Thus, the present disclosure is not intended to be limited to the embodiments shown and described herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (11)

1. A method for processing audio and video data transmission, the method comprising:
a transmission channel is established between the first terminal and the second terminal;
the first terminal receives the video picture data acquired by the second terminal through the transmission channel;
the first terminal calibrates the video picture data with the pre-acquired audio data to obtain audio and video data; wherein the audio data and the video picture data are synchronously acquired;
the first terminal pushes the audio and video data to a streaming media server; the streaming media server is used for pushing the audio and video data to a third terminal.
2. The method of claim 1, wherein establishing a transmission channel between the first terminal and the second terminal comprises:
establishing application layer connection between a first terminal and a second terminal;
the first terminal sends a request offer message to the second terminal through the application layer connection; wherein, the offer message carries session description protocol SDP information of the first terminal;
the first terminal receives a response answer message from the second terminal through the application layer connection; the answer message carries SDP information of the second terminal;
and the first terminal establishes a peer-to-peer connection P2P transmission channel with the second terminal based on the SDP information of the second terminal.
3. The method of claim 1, wherein establishing a transmission channel between the first terminal and the second terminal comprises:
establishing application layer connection between a first terminal and a second terminal;
the first terminal receives the offer message from the second terminal through the application layer connection; wherein, the offer message carries session description protocol SDP information of the second terminal;
the first terminal collects SDP information of the first terminal and generates an answer message based on the SDP information;
the first terminal sends the answer message to the second terminal through the application layer connection; the answer message is used for establishing P2P connection between the first terminal and the second terminal.
4. A method according to claim 2 or 3, wherein establishing an application layer connection between the first terminal and the second terminal comprises:
the first terminal establishes a TCP connection with the second terminal based on a transmission control protocol TCP.
5. The method of claim 4, wherein the first terminal establishes a TCP connection with the second terminal based on a transmission control protocol TCP, comprising:
and the first terminal establishes the TCP connection with the second terminal based on a two-dimensional code of the first terminal or the second terminal.
6. The method of claim 1, wherein establishing a transmission channel between the first terminal and the second terminal comprises:
and establishing a user datagram protocol UDP data transmission channel between the first terminal and the second terminal.
7. A method according to claim 2 or 3, characterized in that the method further comprises:
starting timing when the first terminal detects that the first terminal starts to establish the application layer connection with the second terminal, ending timing when the first terminal detects that the P2P transmission channel is successfully established between the first terminal and the second terminal, and acquiring a timing result;
the first terminal determines the timing result as the connection time between itself and the second terminal.
8. A processing device for audio and video data transmission, the device comprising:
the establishing module is used for establishing a transmission channel with the second terminal;
the receiving module is used for receiving the video picture data acquired by the second terminal through the transmission channel;
the calibration module is used for calibrating the video picture data with the pre-acquired audio data to obtain audio and video data; wherein the audio data and the video picture data are synchronously acquired;
the pushing module is used for pushing the audio and video data to the streaming media server; the streaming media server is used for pushing the audio and video data to a third terminal.
9. A computer readable storage medium, characterized in that the computer readable storage medium has stored therein instructions, which when run on a terminal device, cause the terminal device to implement the method of any of claims 1-7.
10. An apparatus, comprising: a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the method of any one of claims 1-7 when the computer program is executed.
11. A computer program product, characterized in that it comprises a computer program/instruction which, when executed by a processor, implements the method according to any of claims 1-7.
CN202111497996.8A 2021-12-09 2021-12-09 Processing method, device, equipment and storage medium for audio and video data transmission Pending CN116260797A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111497996.8A CN116260797A (en) 2021-12-09 2021-12-09 Processing method, device, equipment and storage medium for audio and video data transmission

Publications (1)

Publication Number Publication Date
CN116260797A true CN116260797A (en) 2023-06-13

Family

ID=86684786

Country Status (1)

Country Link
CN (1) CN116260797A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination