CN105681715A - Audio and video processing method and apparatus - Google Patents

Audio and video processing method and apparatus

Info

Publication number
CN105681715A
Authority
CN
China
Prior art keywords
video
audio frequency
unit
playable
audio
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610122421.0A
Other languages
Chinese (zh)
Other versions
CN105681715B (en)
Inventor
兰玉龙
李植珩
曹星忠
何永辉
王玉帝
彭陆渝
吴铭津
赵友明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Tencent Cloud Computing Beijing Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN201610122421.0A priority Critical patent/CN105681715B/en
Publication of CN105681715A publication Critical patent/CN105681715A/en
Priority to PCT/CN2017/075642 priority patent/WO2017148442A1/en
Application granted granted Critical
Publication of CN105681715B publication Critical patent/CN105681715B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/60 Network streaming of media packets
    • H04L65/75 Media network packet handling
    • H04L65/762 Media network packet handling at the source
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/06 Protocols specially adapted for file transfer, e.g. file transfer protocol [FTP]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/83 Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/845 Structuring of content, e.g. decomposing content into time segments
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 Details of television systems
    • H04N5/76 Television signal recording

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The embodiments of the invention disclose an audio and video processing method and apparatus. In the process of audio and video stream acquisition, the collected audio and video stream is compressed into playable audio and video units at intervals of a preset duration. The playable audio and video units are then sent to a server. After the audio and video stream acquisition is finished, a synthesis instruction is sent to the server, so that the server synthesizes the received playable audio and video units according to the synthesis instruction and obtains the corresponding audio and video. This scheme compresses the collected audio and video into playable audio and video units and uploads them to the server while the audio and video are still being acquired; that is, the audio and video are compressed in segments and uploaded to the server at the same time as they are collected, without waiting for the acquisition to finish before compressing. Compared with the prior art, the audio and video processing method and apparatus shorten the time taken to record and upload audio and video and improve the recording and uploading efficiency.

Description

An audio and video processing method and device
Technical field
The present invention relates to the field of communication technologies, and in particular to an audio and video processing method and device.
Background technology
With the development of the Internet and terminal technology, more and more users use terminals to watch or upload videos for study and entertainment. For example, a user can upload audio and video recorded by himself for friends, family members, and others to watch.
In the prior art, the scheme for recording and uploading audio and video is: record the audio and video; after recording is finished, compress the recorded audio and video; and after compression is finished, upload the compressed audio and video data to a server over the network. In addition, in some cases (for example, in an IM system), after the audio and video data has been uploaded to the server, a notification also needs to be sent so that the other party can download it.
In the course of research on and practice with the prior art, the inventors of the present invention found that in the current scheme for recording and uploading audio and video the steps of recording, compressing, uploading, and notifying are performed serially, and therefore the time taken to record and upload audio and video is long.
Summary of the invention
The embodiments of the present invention provide an audio and video processing method and device that can shorten the time taken to record and upload audio and video.
An embodiment of the present invention provides an audio and video processing method, including:
in the process of audio and video stream acquisition, compressing the collected audio and video stream into playable audio and video units at intervals of a preset duration;
sending the playable audio and video units to a server;
after the audio and video stream acquisition is completed, sending a synthesis instruction to the server, so that the server synthesizes the received playable audio and video units into the corresponding audio and video according to the synthesis instruction.
Correspondingly, an embodiment of the present invention also provides an audio and video processing device, including:
a compression unit, configured to compress the collected audio and video stream into playable audio and video units at intervals of a preset duration in the process of audio and video stream acquisition;
an audio and video sending unit, configured to send the playable audio and video units to a server;
an instruction sending unit, configured to send a synthesis instruction to the server after the audio and video stream acquisition is completed, so that the server synthesizes the received playable audio and video units into the corresponding audio and video according to the synthesis instruction.
In addition, an embodiment of the present invention also provides another audio and video processing method, including:
receiving playable audio and video units sent by a terminal in the process of audio and video stream acquisition;
receiving a synthesis instruction sent by the terminal after the audio and video stream acquisition is completed;
synthesizing the received playable audio and video units according to the synthesis instruction to obtain the audio and video.
Correspondingly, an embodiment of the present invention also provides another audio and video processing device, including:
an audio and video receiving unit, configured to receive playable audio and video units sent by a terminal in the process of audio and video stream acquisition;
an instruction receiving unit, configured to receive a synthesis instruction sent by the terminal after the audio and video stream acquisition is completed;
a synthesis unit, configured to synthesize the received playable audio and video units according to the synthesis instruction to obtain the audio and video.
In the embodiments of the present invention, during audio and video stream acquisition the collected audio and video stream is compressed into playable audio and video units at intervals of a preset duration; the playable audio and video units are then sent to a server; and after the acquisition is completed, a synthesis instruction is sent to the server so that the server synthesizes the received playable audio and video units into the corresponding audio and video according to the synthesis instruction. Because this scheme compresses the collected audio and video into playable units and uploads them to the server while the audio and video are still being acquired, that is, the audio and video are compressed in segments and uploaded at the same time as they are collected, there is no need to wait for acquisition to finish before compressing and uploading. Compared with the prior art, this shortens the time taken to record and upload audio and video and improves the recording and uploading efficiency.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the accompanying drawings required for describing the embodiments are briefly introduced below. Apparently, the accompanying drawings in the following description are only some embodiments of the present invention, and those skilled in the art may derive other drawings from them without creative effort.
Fig. 1 is a flowchart of an audio and video processing method according to Embodiment 1 of the present invention;
Fig. 2 is a flowchart of an audio and video processing method according to Embodiment 2 of the present invention;
Fig. 3 is a flowchart of an audio and video processing method according to Embodiment 3 of the present invention;
Fig. 4 is a schematic structural diagram of an audio and video processing device according to Embodiment 4 of the present invention;
Fig. 5 is a schematic structural diagram of an audio and video processing device according to Embodiment 5 of the present invention.
Detailed description of the invention
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some rather than all of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
The embodiments of the present invention provide an audio and video processing method and device, which are described in detail below.
Embodiment 1
This embodiment is described from the perspective of an audio and video processing device, which may be integrated in a device that needs to upload or transmit audio and video, such as a terminal.
An audio and video processing method includes: in the process of audio and video stream acquisition, compressing the collected audio and video stream into playable audio and video units at intervals of a preset duration; sending the playable audio and video units to a server; and after the audio and video stream acquisition is completed, sending a synthesis instruction to the server, so that the server synthesizes the received playable audio and video units into the corresponding audio and video according to the synthesis instruction. As shown in Fig. 1, the specific flow of the audio and video processing method may be as follows:
101. In the process of audio and video stream acquisition, compress the collected audio and video stream into playable audio and video units at intervals of a preset duration.
For example, the audio and video stream is collected through a camera and a microphone, and while the stream is being collected, the collected audio and video stream is encoded at intervals of the preset duration to form playable audio and video units.
A playable audio and video unit is an audio and video unit that can be played independently, and it may include a playable audio unit and a playable video unit. The preset duration can be set according to actual requirements; for example, it may be set to 1 s, that is, in the process of collecting the audio and video stream, the collected audio and video stream is compressed into a playable audio and video unit every 1 s.
Specifically, compressing the collected audio and video stream into playable audio and video units at intervals of the preset duration may be: compressing the newly collected audio and video stream into a playable audio and video unit every preset duration. For example, a timer is started when recording of the audio and video stream starts; when the timer reaches 1 s, the audio and video stream collected so far is compressed into a playable audio and video unit (this stream is the newly added stream relative to the initial state); when the timer reaches 2 s, the newly collected audio and video stream is compressed into another playable audio and video unit; and so on, until all of the collected audio and video stream has been compressed.
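The timer-driven segmentation described above can be pictured with the following minimal sketch, in which capture(), encode(), send_queue, and stop_flag are hypothetical stand-ins for the capture pipeline, the encoder, the sending queue, and the end-of-recording signal; the 1 s interval is only the example value used above.

```python
import time

SEGMENT_SECONDS = 1  # the "preset duration" from the example above

def record_and_segment(capture, encode, send_queue, stop_flag):
    """Compress newly captured frames into a playable unit every SEGMENT_SECONDS."""
    buffered = []                          # frames captured since the last segment
    deadline = time.monotonic() + SEGMENT_SECONDS
    while not stop_flag.is_set():
        buffered.append(capture())         # hypothetical: grab one audio/video frame
        if time.monotonic() >= deadline:
            unit = encode(buffered)        # hypothetical: encode into a playable unit
            send_queue.put(unit)           # hand off for upload while recording continues
            buffered = []
            deadline += SEGMENT_SECONDS
    if buffered:                           # flush whatever was captured after the last tick
        send_queue.put(encode(buffered))
```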
There are many ways in this embodiment to compress the collected audio and video stream into playable audio and video units. For example, the video data and the audio data in the stream may be compressed separately, and the compressed audio data and video data may then be combined into audio and video data. That is, the step of "compressing the collected audio and video stream into playable audio and video units at intervals of a preset duration" may include:
compressing the video data and the audio data in the audio and video stream separately at intervals of the preset duration, to obtain a playable audio unit and a playable video unit;
combining the playable audio unit and the playable video unit, to obtain a playable audio and video unit.
The compression in this embodiment is an encoding process. Optionally, the audio coding format may be AAC (Advanced Audio Coding), MP3, WMA, or the like, and the video coding format may be H.263, H.264, or the like.
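Purely as an illustration of producing a self-contained unit with the codecs named above (H.264 video plus AAC audio), one could drive an external encoder such as ffmpeg; the file names and this particular tool choice are assumptions, not part of the embodiment.

```python
import subprocess

def encode_playable_unit(raw_segment_path: str, out_path: str) -> None:
    """Encode one captured segment into an independently playable MP4 (H.264 + AAC)."""
    subprocess.run(
        ["ffmpeg", "-y", "-i", raw_segment_path,
         "-c:v", "libx264",          # H.264 video
         "-c:a", "aac",              # AAC audio
         "-movflags", "+faststart",  # move metadata to the front so the file plays as a unit
         out_path],
        check=True,
    )

# e.g. encode_playable_unit("segment_raw.avi", "1.mp4")
```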
To facilitate the subsequent upload of the playable audio and video units, in the method of this embodiment a playable audio and video unit may be placed in a queue after the audio and video stream has been compressed into it; playable audio and video units can then be extracted from the queue when they are to be sent, which guarantees the order in which the units are sent and prevents units from being missed. That is, the step of "compressing the collected audio and video stream into playable audio and video units at intervals of a preset duration" may include: compressing the collected audio and video stream into playable audio and video units at intervals of the preset duration, and placing the playable audio and video units into a sending queue.
102. Send the playable audio and video units to a server.
Specifically, in the process of audio and video stream acquisition, the playable audio and video units are sent to the server over the network; that is, the playable audio and video units are sent while the audio and video stream is still being collected, which shortens the time taken to upload the audio and video.
There can be many ways to send the playable audio and video units. For example, they can be sent out of order, or serially, or in parallel (for example, the transport layer may use UDP (User Datagram Protocol) or one or more concurrent TCP (Transmission Control Protocol) connections); as another example, UDT (UDP-based Data Transfer Protocol) or other application-layer protocol stacks and transmission systems may be used.
Optionally, when the playable audio and video units are placed into a sending queue, the step of "sending the playable audio and video units to a server" may include: extracting playable audio and video units from the sending queue, and sending the extracted playable audio and video units to the server.
To enable the server side to synthesize the playable audio and video units into a complete piece of audio and video, the method of this embodiment may also set an identifier for each playable audio and video unit after it is generated, for example by numbering it. In addition, to enable the server side to synthesize the required audio and video, that is, the specified audio and video, the identifier of the audio and video to which the playable audio and video units belong may also be sent to the server, so that the playable units belonging to that audio and video can be selected and synthesized. That is, the step of "sending the playable audio and video units to a server" may include:
setting an identifier for the playable audio and video unit;
sending the identifier of the audio and video to which the playable audio and video unit belongs, together with the playable audio and video unit and its identifier, to the server.
There are many ways to set the identifier for a playable audio and video unit. For example, the identifier may be set according to the generation time of the playable audio and video unit; that is, the step of "setting an identifier for the playable audio and video unit" may include: setting an identifier for the playable audio and video unit according to its generation time.
Specifically, identifiers may be set for the playable audio and video units in the order in which they are generated. For example, when five playable audio and video units are generated by successive compressions, they may be numbered 1.mp4, 2.mp4, ..., 5.mp4, where the unit generated by the first compression is numbered 1, the unit generated by the second compression is numbered 2, and so on, until the unit generated by the fifth compression is numbered 5.
In this embodiment, the identifier of the audio and video to which a playable audio and video unit belongs may be unique. Optionally, a verification parameter of the first video frame of a playable audio and video unit, such as its md5, may be used as the identifier of the audio and video. That is, the step of "sending the identifier of the audio and video to which the playable audio and video unit belongs, together with the playable audio and video unit and its identifier, to the server" may include:
obtaining the verification parameter of the first video frame of the playable audio and video unit, and using the verification parameter as the identifier (which may be named sid) of the audio and video to which the playable audio and video unit belongs;
sending the identifier of the audio and video to which the playable audio and video unit belongs, together with the playable audio and video unit and its identifier, to the server.
For example, the md5 of the first video frame of the playable audio and video unit generated by the first compression may be used as the unique identifier (which may be named sid) of the audio and video to which the playable audio and video units belong, and so on.
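A sketch of deriving that identifier is shown below, assuming the first video frame is available as raw bytes; the function name is hypothetical.

```python
import hashlib

def make_sid(first_video_frame: bytes) -> str:
    """Use the md5 of the first video frame as the unique identifier (sid) of the recording."""
    return hashlib.md5(first_video_frame).hexdigest()
```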
103. After the audio and video stream acquisition is completed, send a synthesis instruction to the server, so that the server synthesizes the received playable audio and video units into the corresponding audio and video according to the synthesis instruction.
For example, after recording of the audio and video stream is finished, a synthesis instruction may be sent to a video server on the network side, and the video server can then synthesize the received playable audio and video units according to the synthesis instruction to obtain the complete audio and video.
The synthesis instruction may indicate the audio and video that needs to be synthesized and the playable audio and video units that participate in the synthesis. For example, when the audio and video unit number list in the synthesis instruction is empty and the instruction carries a sid, the synthesis instruction instructs the server to synthesize all of the received audio and video units corresponding to that sid.
As another example, the synthesis instruction may carry the identifiers of the playable audio and video units that participate in the synthesis (for example, the numbers of the playable audio and video units) and the identifier of the audio and video to which those playable units belong (for example, the above sid). The server can then select the corresponding playable audio and video units according to the identifier of the audio and video (since the server stores playable audio and video units of numerous videos), determine the playable units to be synthesized from the selected playable units according to the identifiers of the playable units that participate in the synthesis, and finally synthesize the determined playable units.
For example, the synthesis instruction carries a sid and the numbers of the audio and video units that participate in the synthesis (say, the numbers 1, 4, and 5). The synthesis instruction then instructs the server to synthesize the received units corresponding to the carried numbers (that is, to synthesize the received 1.mp4, 4.mp4, and 5.mp4). This case can be used when the audio and video is clipped on the client side and the server synthesizes the clipped audio and video.
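The two cases can be pictured as a small message carrying the sid and an optional number list; the field names and the sid value below are placeholders chosen only to make the cases concrete.

```python
# Case 1: empty number list, so the server merges every unit received for this sid.
merge_all = {"sid": "d41d8cd98f00b204e9800998ecf8427e", "unit_numbers": []}

# Case 2: units 2 and 3 were clipped out on the client, so only 1, 4 and 5 are merged
# (the server concatenates 1.mp4, 4.mp4 and 5.mp4).
merge_clip = {"sid": "d41d8cd98f00b204e9800998ecf8427e", "unit_numbers": [1, 4, 5]}
```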
It can be understood that in other embodiments a cancel instruction may be sent to the server so that the server deletes the received audio and video units. For example, the cancel instruction carries the sid and a cancel flag, and the server removes the audio and video units corresponding to the sid after receiving the instruction.
As can be seen from the above, in this embodiment, in the process of audio and video stream acquisition the collected audio and video stream is compressed into playable audio and video units at intervals of a preset duration; the playable audio and video units are then sent to a server; and after the acquisition is completed, a synthesis instruction is sent to the server, so that the server synthesizes the received playable audio and video units into the corresponding audio and video according to the synthesis instruction. Because this scheme compresses the collected audio and video into playable units and uploads them to the server while the audio and video are still being acquired, that is, the audio and video are compressed in segments and uploaded at the same time as they are collected, there is no need to wait for acquisition to finish before compressing and uploading. Compared with the prior art, this shortens the time taken to record and upload audio and video, improves the recording and uploading efficiency, and greatly reduces the user's waiting time; from the user's perspective the upload succeeds as soon as recording finishes, achieving zero waiting.
In addition, this scheme compresses the audio and video into independently playable units. When the audio and video is to be played in streaming mode, the back end does not need to synthesize the audio and video and then slice the synthesized audio and video, which saves server resources; the client can download the audio and video units directly from a streaming media server and play them in order. This improves the speed at which recorded video can be streamed and improves the user experience.
Embodiment 2
The method described in Embodiment 1 is described in further detail below with an example.
In this embodiment, a detailed description is given by taking the audio and video processing device being integrated in a terminal as an example.
The audio and video processing device may be integrated in the terminal in many ways, for example, installed in the terminal in the form of a client or other software.
As shown in Fig. 2, the specific flow of the audio and video processing method may be as follows:
201. The terminal collects an audio and video stream through a camera and a microphone, and in the process of collecting the stream compresses the newly collected audio and video stream into a playable audio and video unit every duration T.
T can be set according to actual requirements, for example, 1 s or 2 s.
For example, an IM client or a video client in the terminal collects the audio and video stream through the camera and microphone of the terminal, and while collecting compresses the newly collected audio and video into a playable audio and video unit every 1 s.
202. The terminal sets a number for the playable audio and video unit according to the generation time of the unit, and places the playable audio and video unit and its number into a sending queue.
For example, the IM client or video client sets numbers for the playable audio and video units according to the order in which they are generated, so that the first generated playable unit is numbered 1, the second generated playable unit is numbered 2, ..., and the nth generated playable unit is numbered n. After a playable audio and video unit is generated by compression, it can be placed into the sending queue.
203. While placing playable audio and video units and their numbers into the sending queue, the terminal reads playable audio and video units and their numbers from the sending queue and obtains the unique identifier sid of the audio and video to which the playable units belong.
For example, while the IM client or video client places playable audio and video units and their numbers into the sending queue, it also reads playable units and their numbers from the queue, and uses the md5 of the first video frame of the first audio and video unit as the unique identifier (named sid) of the audio and video.
204. The terminal sends the read playable audio and video units, their numbers, and the sid to the server.
For example, the IM client or video client sends the read playable audio and video units and their numbers, together with the sid corresponding to the playable units, to the server.
205. After the audio and video stream acquisition is completed, the terminal sends a synthesis instruction to the server, the synthesis instruction carrying the sid and a number list, so that the server synthesizes the received audio and video units according to the synthesis instruction.
For example, after the IM client or video client finishes the acquisition, it sends a synthesis instruction carrying the sid and a number list to the server.
The number list may indicate the audio and video units that participate in the synthesis. For example, when the number list is empty, it indicates that the background server needs to synthesize all of the received audio and video units; when the number list contains the numbers of audio and video units, it indicates that the background server needs to synthesize the units corresponding to the numbers in the list. For example, when the audio and video recorded by the client is clipped, the synthesis instruction can carry the sid and the numbers of the units that participate in the synthesis: if 2.mp4 and 3.mp4 are cut away, the synthesis instruction carries the numbers 1, 4, and 5, and the server can then find the corresponding 1.mp4, 4.mp4, and 5.mp4 according to the sid and the carried numbers, and splice the three audio and video units to synthesize sample.mp4.
As can be seen from the above, in this embodiment the terminal collects an audio and video stream through a camera and a microphone and, in the process of collecting the stream, compresses the newly collected audio and video stream into a playable audio and video unit every duration T; the terminal then sets a number for each playable audio and video unit according to its generation time and places the playable unit and its number into a sending queue; while placing playable units and their numbers into the sending queue, the terminal reads playable units and their numbers from the queue and obtains the unique identifier sid of the audio and video to which the playable units belong; the terminal sends the read playable units, their numbers, and the sid to the server; and after the audio and video stream acquisition is completed, the terminal sends a synthesis instruction carrying the sid and a number list to the server, so that the server synthesizes the received audio and video units according to the synthesis instruction. Because this scheme compresses the collected audio and video into playable units and uploads them to the server while the audio and video are still being acquired, that is, the audio and video are compressed in segments and uploaded at the same time as they are collected, there is no need to wait for acquisition to finish before compressing and uploading. Compared with the prior art, this shortens the time taken to record and upload audio and video, improves the recording and uploading efficiency, and greatly reduces the user's waiting time; from the user's perspective the upload succeeds as soon as recording finishes, achieving zero waiting.
In addition, this scheme compresses the audio and video into independently playable units. When the audio and video is to be played in streaming mode, the back end does not need to synthesize the audio and video and then slice the synthesized audio and video, which saves server resources; the client can download the audio and video units directly from a streaming media server and play them in order. This improves the speed at which recorded video can be streamed and improves the user experience.
Embodiment 3
This embodiment is described from the perspective of an audio and video processing device, which may be integrated in a device that needs to synthesize audio and video, such as a server.
An audio and video processing method includes: receiving playable audio and video units sent by a terminal; then receiving a synthesis instruction sent by the terminal; and synthesizing the received playable audio and video units according to the synthesis instruction to obtain the audio and video.
As shown in Fig. 3, the specific flow of the audio and video processing method may be as follows:
301. Receive playable audio and video units sent by a terminal.
For example, the playable audio and video units sent by the terminal may be received while the terminal is collecting the audio and video stream, and the playable audio and video units may be obtained by the terminal compressing the collected audio and video stream.
To synthesize the audio and video, the terminal side also needs to send the identifier of the audio and video to which the playable audio and video units belong, together with the playable units and their identifiers. In this case, the step of "receiving playable audio and video units sent by a terminal" may include: receiving the playable audio and video units and their identifiers, and the identifier of the audio and video to which the playable units belong, sent by the terminal.
302. Receive a synthesis instruction sent by the terminal.
For example, the synthesis instruction sent by the terminal is received after the terminal has finished collecting the audio and video stream. The synthesis instruction may indicate the audio and video that needs to be synthesized and the playable audio and video units that participate in the synthesis. For example, the synthesis instruction may carry the identifiers of the target playable audio and video units that need to participate in the synthesis and the identifier of the audio and video to which the target playable units belong, so that the server can determine, from the received playable units, the target playable units that participate in the synthesis.
303. Synthesize the received playable audio and video units according to the synthesis instruction to obtain the audio and video.
Specifically, the playable audio and video units that need to participate in the synthesis may first be determined according to the synthesis instruction, and the playable units that participate in the synthesis may then be synthesized. That is, the step of "synthesizing the received playable audio and video units according to the synthesis instruction" may include:
determining, from the received playable audio and video units according to the synthesis instruction, the target playable audio and video units that participate in the synthesis;
synthesizing the target playable audio and video units.
For example, when the synthesis instruction indicates that all of the received playable audio and video units are target playable units that participate in the synthesis (for example, when the audio and video unit number list in the synthesis instruction is empty), all of the received playable units are determined to be the target playable units, and all of the received playable units are synthesized. As another example, when the synthesis instruction indicates that certain units among the received playable units are the target playable units that participate in the synthesis (for example, when the audio and video unit number list in the synthesis instruction contains the numbers of the units that participate in the synthesis), the playable units indicated by the synthesis instruction among the received playable units are determined to be the target playable units, and the determined target playable units are synthesized.
When the synthesis instruction carries the identifiers of the target playable audio and video units that need to participate in the synthesis and the identifier of the audio and video to which the target playable units belong, the step of "determining, from the received playable audio and video units according to the synthesis instruction, the target playable audio and video units that participate in the synthesis" may include:
selecting the corresponding playable audio and video units from the received playable units according to the identifier of the audio and video to which the target playable units belong and the identifiers of the audio and video to which the received playable units belong, to obtain candidate playable audio and video units;
determining the target playable audio and video units from the candidate playable units according to the identifiers of the target playable units and the identifiers of the candidate playable units.
For example, the identifier of the audio and video to which each received playable unit belongs is obtained and compared with the identifier of the audio and video to which the target playable units belong; if they are the same, the received playable unit is selected, so that multiple candidate playable units are obtained. After the selection, the identifiers of the target playable units are compared with the identifiers of the candidate playable units; if they are the same, the candidate playable unit is determined to be a target playable unit.
As another example, the synthesis instruction carries the numbers of the target playable audio and video units that need to participate in the synthesis and the unique identifier (named sid) of the audio and video to which the target playable units belong. In this case, the corresponding playable units can be selected from the received playable units according to the sid (that is, the playable units whose audio and video identifier is the sid are selected), to obtain multiple candidate playable units; then the target playable units are determined from the candidate playable units according to the numbers of the target playable units and the numbers of the candidate playable units (that is, the candidate playable units whose numbers are the same as the numbers carried in the synthesis instruction are determined to be the target playable units).
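A minimal server-side sketch of this selection is given below, assuming each received unit is stored together with its sid and its number; the field and function names are illustrative only.

```python
def select_target_units(received_units, sid, unit_numbers):
    """Pick the units belonging to the requested recording (sid); if a number list
    is given, keep only those numbers, in order, otherwise keep everything."""
    candidates = {u["number"]: u for u in received_units if u["sid"] == sid}
    if not unit_numbers:                       # empty list: merge every unit received
        unit_numbers = sorted(candidates)
    return [candidates[n] for n in unit_numbers if n in candidates]
```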
There are many ways to synthesize the target playable audio and video units. For example, the target playable audio and video units may include target playable video units and target playable audio units; in this case, the video units and the audio units may be spliced separately, and the spliced video units and audio units may then be merged. That is, the step of "synthesizing the target playable audio and video units" may include:
splicing the target playable video units and the target playable audio units respectively according to the identifiers corresponding to the target playable audio and video units;
merging the spliced target playable video units with the spliced target playable audio units.
To prevent the synthesized audio and video from showing a corrupted picture, stuttering, or loss of synchronization between the audio and the video, this embodiment may remove frames when splicing the video units and/or the audio units, so as to keep the audio and the video in synchronization; removing frames from the audio units can prevent stuttering. In this case, the step of "splicing the target playable video units and the target playable audio units respectively according to the identifiers corresponding to the target playable audio and video units" may specifically include:
sorting the target playable video units according to the identifiers corresponding to the target playable audio and video units, and splicing the sorted target playable video units according to a preset splicing rule;
removing audio frames from the target audio units according to a preset removal rule, and splicing the target playable audio units after the removal according to the identifiers corresponding to the target playable audio and video units.
The order of splicing the video units and the audio units is not restricted: the video units may be spliced first and then the audio units, or both may be spliced at the same time.
The preset splicing rule of this embodiment can be set according to actual requirements. For example, to prevent a corrupted picture at the place where video units are spliced, the preset splicing rule may be: if the last frame of a video unit is an I-frame and the first frame of the next video unit spliced with it is also an I-frame, the duplicated I-frame needs to be discarded, otherwise some players will show a corrupted picture at the splice point. That is, the step of "splicing the sorted target playable video units according to a preset splicing rule" may include:
detecting whether the last video frame of the current target playable video unit is an I-frame, and whether the first video frame of the next target playable video unit spliced with it is an I-frame;
if both are I-frames, removing the first video frame of the next target playable video unit;
splicing the current target playable video unit with the next target playable video unit after the removal (see the sketch below).
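A sketch of this splice rule, assuming each video unit is a sequence of decoded frame objects with an is_i_frame flag; a real implementation would work on the encoded frames, so this only illustrates the duplicate-I-frame check.

```python
def splice_video_units(video_units):
    """Concatenate video units in order; when a unit ends with an I-frame and the next
    unit also starts with an I-frame, drop the next unit's first frame so the duplicated
    key frame does not corrupt the picture at the joint."""
    spliced = []
    prev_last = None
    for unit in video_units:
        frames = list(unit)
        if prev_last is not None and prev_last.is_i_frame and frames and frames[0].is_i_frame:
            frames = frames[1:]        # remove the duplicated leading I-frame
        spliced.extend(frames)
        if frames:
            prev_last = frames[-1]
    return spliced
```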
The preset removal rule can also be set according to actual requirements. For example, to prevent the synthesized audio and video from stuttering, the preset removal rule may be: remove the three leading audio frames and the two trailing audio frames of an audio unit, or remove the two leading audio frames of an audio unit. That is, the step of "removing audio frames from the target audio units according to a preset removal rule" may include:
removing the three leading audio frames and the two trailing audio frames of the target playable audio unit, or removing the two leading audio frames of the target playable audio unit.
For example, to make the embodiment applicable to various platforms, the preset removal rule of this embodiment can be set according to the data coding scheme and the platform on which the coding is implemented. For example, for AAC coding on the iOS platform, the first three frames and the last two frames of the audio data of each audio and video unit are removed, and for AAC coding on the Android platform, the first two frames of the audio data of each audio and video unit are removed.
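A sketch of this platform-dependent trimming is shown below; the function signature and platform strings are assumptions used only to restate the rule above in code.

```python
def trim_audio_frames(audio_frames, platform):
    """Drop boundary AAC frames of one unit before splicing, per the removal rule."""
    if platform == "ios":        # AAC on iOS: drop 3 leading and 2 trailing frames
        return audio_frames[3:-2] if len(audio_frames) > 5 else []
    if platform == "android":    # AAC on Android: drop 2 leading frames
        return audio_frames[2:]
    return audio_frames          # other platforms: apply whatever rule is configured
```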
Optionally, in the method of this embodiment, after the synthesis is completed, a notification may be sent to the other party to download. In addition, the method of this embodiment may also include: receiving an audio and video upload cancel instruction, and deleting the received playable audio and video units according to the instruction. For example, the server receives an audio and video upload cancel instruction sent by the terminal, the instruction carrying the above sid and a cancel flag, and the server then removes all of the audio and video units corresponding to the sid.
Optionally, in other scenarios, if the audio and video needs to be played in streaming mode, the back end may not synthesize the playable video units and may instead send the received playable audio and video units directly to the playback terminal for streaming playback. Compared with the prior art, the back end does not need to synthesize and slice the audio and video, which saves server resources.
As can be seen from the above, in this embodiment the playable audio and video units sent by the terminal are received; the synthesis instruction sent by the terminal after the audio and video stream acquisition is completed is then received; and the received playable audio and video units are synthesized according to the synthesis instruction to obtain the audio and video. This scheme can receive the playable audio and video units sent by the terminal while the terminal is still collecting the audio and video stream, and synthesize the received playable units after the acquisition is completed. Compared with the prior art, this shortens the time taken to record and upload audio and video and greatly reduces the user's waiting time; from the user's perspective the upload succeeds as soon as recording finishes, achieving zero waiting and improving the user experience.
Embodiment 4
To better implement the above methods, an embodiment of the present invention further provides an audio and video processing device. As shown in Fig. 4, the audio and video processing device may include a compression unit 401, an audio and video sending unit 402, and an instruction sending unit 403, as follows:
(1) Compression unit 401
The compression unit 401 is configured to compress the collected audio and video stream into playable audio and video units at intervals of a preset duration in the process of audio and video stream acquisition.
Specifically, the compression unit 401 may include an audio and video compression subunit and an audio and video combining subunit;
the audio and video compression subunit is configured to compress the video data and the audio data in the audio and video stream separately at intervals of the preset duration, to obtain a playable audio unit and a playable video unit;
the audio and video combining subunit is configured to combine the playable audio unit and the playable video unit, to obtain a playable audio and video unit.
For example, the compression unit may be specifically configured to compress the collected audio and video stream into playable audio and video units at intervals of the preset duration and place the playable audio and video units into a sending queue.
For example, the newly collected audio and video stream is compressed into a playable audio and video unit every preset duration. For example, a timer is started when recording of the audio and video stream starts; when the timer reaches 1 s, the audio and video stream collected so far is compressed into a playable audio and video unit (this stream is the newly added stream relative to the initial state); when the timer reaches 2 s, the newly collected audio and video stream is compressed into another playable audio and video unit; and so on, until all of the collected audio and video stream has been compressed.
The compression in this embodiment is an encoding process. Optionally, the audio coding format may be AAC, MP3, WMA, or the like, and the video coding format may be H.263, H.264, or the like.
(2) Audio and video sending unit 402
The audio and video sending unit 402 is configured to send the playable audio and video units to a server.
There can be many ways to send the playable audio and video units. For example, they can be sent out of order, or serially, or in parallel (for example, the transport layer may use UDP or one or more concurrent TCP connections); as another example, UDT or other application-layer protocol stacks and transmission systems may be used.
Optionally, when the compression unit 401 places the playable audio and video units into a sending queue, the audio and video sending unit 402 can extract playable audio and video units from the sending queue and send the extracted playable audio and video units to the server.
To enable the server side to synthesize the audio and video, the audio and video sending unit 402 may include an identifier setting subunit and an audio and video sending subunit;
the identifier setting subunit is configured to set an identifier for the playable audio and video unit;
the audio and video sending subunit is configured to send the identifier of the audio and video to which the playable audio and video unit belongs, together with the playable audio and video unit and its identifier, to the server.
For example, the audio and video sending subunit may be specifically configured to:
obtain the verification parameter of the first video frame of the playable audio and video unit, and use the verification parameter as the identifier of the audio and video to which the playable audio and video unit belongs;
send the identifier of the audio and video to which the playable audio and video unit belongs, together with the playable audio and video unit and its identifier, to the server.
As another example, the identifier setting subunit may be specifically configured to set an identifier for the playable audio and video unit according to its generation time.
(3) Instruction sending unit 403
The instruction sending unit 403 is configured to send a synthesis instruction to the server after the audio and video stream acquisition is completed, so that the server synthesizes the received playable audio and video units into the corresponding audio and video according to the synthesis instruction.
The synthesis instruction may indicate the audio and video that needs to be synthesized and the playable audio and video units that participate in the synthesis. For example, when the audio and video unit number list in the synthesis instruction is empty and the instruction carries a sid, the synthesis instruction instructs the server to synthesize all of the received audio and video units corresponding to that sid.
As another example, the synthesis instruction may carry the identifiers of the playable audio and video units that participate in the synthesis (for example, the numbers of the playable audio and video units) and the identifier of the audio and video to which those playable units belong (for example, the above sid). The server can then select the corresponding playable audio and video units according to the identifier of the audio and video (since the server stores playable audio and video units of numerous videos), determine the playable units to be synthesized from the selected playable units according to the identifiers of the playable units that participate in the synthesis, and finally synthesize the determined playable units.
For example, the synthesis instruction carries a sid and the numbers of the audio and video units that participate in the synthesis (say, the numbers 1, 4, and 5). The synthesis instruction then instructs the server to synthesize the received units corresponding to the carried numbers (that is, to synthesize the received 1.mp4, 4.mp4, and 5.mp4). This case can be used when the audio and video is clipped on the client side and the server synthesizes the clipped audio and video.
In specific implementation, the above units may be implemented as independent entities, or may be combined arbitrarily and implemented as one or several entities. For the specific implementation of the above units, reference may be made to the foregoing method embodiments, and details are not repeated here.
The audio and video processing device may be integrated in a terminal, for example, installed in the terminal in the form of a client or other software; the terminal may specifically include a device such as a mobile phone, a tablet computer, a notebook computer, or a personal computer (PC).
As can be seen from the above, in this embodiment the compression unit 401 compresses the collected audio and video stream into playable audio and video units at intervals of a preset duration in the process of audio and video stream acquisition; the audio and video sending unit 402 then sends the playable audio and video units to a server; and after the acquisition is completed, the instruction sending unit 403 sends a synthesis instruction to the server, so that the server synthesizes the received playable audio and video units into the corresponding audio and video according to the synthesis instruction. Because this scheme compresses the collected audio and video into playable units and uploads them to the server while the audio and video are still being acquired, that is, the audio and video are compressed in segments and uploaded at the same time as they are collected, there is no need to wait for acquisition to finish before compressing and uploading. Compared with the prior art, this shortens the time taken to record and upload audio and video, improves the recording and uploading efficiency, and greatly reduces the user's waiting time; from the user's perspective the upload succeeds as soon as recording finishes, achieving zero waiting.
In addition, this scheme compresses the audio and video into independently playable units. When the audio and video is to be played in streaming mode, the back end does not need to synthesize the audio and video and then slice the synthesized audio and video, which saves server resources; the client can download the audio and video units directly from a streaming media server and play them in order. This improves the speed at which recorded video can be streamed and improves the user experience.
Embodiment five,
Correspondingly, the embodiment of the present invention also provides for another kind of audio frequency and video and processes device, as it is shown in figure 5, these audio frequency and video process device can also include audio frequency and video reception unit 501, instruction reception unit 502 and synthesis unit 503, as follows:
(1) audio frequency and video receive unit 501;
Audio frequency and video receive unit 501, for receiving the playable audio frequency and video unit that terminal sends in the process that audio/video flow gathers.
Such as, audio frequency and video receive unit 501 and can receive the mark of audio frequency and video belonging to the playable audio frequency and video unit and mark thereof and this playable audio frequency and video unit that terminal sends in the process that audio/video flow gathers.
(2) instruction reception unit 502;
Instruction reception unit 502, for receiving the synthetic instruction that this terminal completes to send afterwards in audio/video flow collection.
The synthesis instruction may indicate the audio and video that needs to be synthesized and the playable audio/video units that participate in the synthesis. For example, the synthesis instruction may carry the identifiers of the target playable audio/video units that need to participate in the synthesis and the identifier of the audio and video to which these target units belong, so that the server can determine, from the received playable audio/video units, the target playable audio/video units that participate in the synthesis.
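As an illustration, such a synthesis instruction could be carried as a small JSON-style payload like the one below; the field names and format are assumptions for this sketch, not part of the patent.

```python
# Hypothetical payload of a synthesis instruction: the identifier of the audio
# and video to be synthesized plus the identifiers of the target units.
synthesis_instruction = {
    "video_id": "a1b2c3",            # audio/video the target units belong to
    "unit_ids": [                    # target playable units participating in synthesis
        "a1b2c3-000000",
        "a1b2c3-000001",
        "a1b2c3-000002",
    ],
}
```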
(3) Synthesis unit 503;
The synthesis unit 503 is configured to perform synthesis processing on the received playable audio/video units according to the synthesis instruction, so as to obtain the audio and video.
Specifically, the synthesis unit 503 may include a determining sub-unit and a synthesis sub-unit:
the determining sub-unit is configured to determine, from the received playable audio/video units and according to the synthesis instruction, the target playable audio/video units that participate in the synthesis;
the synthesis sub-unit is configured to perform synthesis processing on the target playable audio/video units.
For example, when the synthesis instruction carries the identifiers of the target playable audio/video units that need to participate in the synthesis and the identifier of the audio and video to which these target units belong, and the audio/video receiving unit 501 receives the playable audio/video units sent by the terminal during collection together with their identifiers and the identifier of the audio and video to which they belong, the determining sub-unit may specifically be configured to:
select, from the received playable audio/video units, the units whose audio/video identifier matches the identifier of the audio and video to which the target playable audio/video units belong, so as to obtain candidate playable audio/video units;
determine the target playable audio/video units from the candidate playable audio/video units according to the identifiers of the target playable audio/video units that need to participate in the synthesis and the identifiers of the candidate playable audio/video units.
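A minimal sketch of this two-step selection is shown below, assuming each received unit is kept as a dict with "video_id", "unit_id" and "data" keys; this storage layout is an illustrative assumption, not something the patent specifies.

```python
# Sketch of the determining sub-unit: first filter by the audio/video identifier,
# then keep only the units named in the synthesis instruction.
def select_target_units(received_units, target_video_id, target_unit_ids):
    # Step 1: candidate units = received units belonging to the requested audio/video.
    candidates = [u for u in received_units if u["video_id"] == target_video_id]
    # Step 2: from the candidates, keep the units whose identifiers the instruction lists.
    wanted = set(target_unit_ids)
    return [u for u in candidates if u["unit_id"] in wanted]
```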
The target playable audio/video units may include target playable video units and target playable audio units. In this case, the synthesis sub-unit may specifically be configured to:
splice the target playable video units and the target playable audio units respectively, according to the identifiers corresponding to the target playable audio/video units;
merge the spliced target playable video units with the spliced target playable audio units.
To prevent the synthesized audio and video from being out of sync, which would cause problems such as stuttering or corrupted frames, the synthesis sub-unit may, for example, specifically be configured to:
sort the target playable video units according to their corresponding identifiers, and splice the sorted target playable video units according to a preset splicing rule;
remove audio frames from the target playable audio units according to a preset removal rule, and splice the target playable audio units after the removal according to their corresponding identifiers;
merge the spliced target playable video units with the spliced target playable audio units.
For another example, the step performed by the synthesis sub-unit of "splicing the sorted target playable video units according to a preset splicing rule" may specifically include:
detecting whether the last video frame of the current target playable video unit is an I-frame, and whether the first video frame of the next target playable video unit to be spliced with it is an I-frame;
if so, removing the first video frame of the next target playable video unit;
splicing the current target playable video unit with the next target playable video unit after the removal.
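A sketch of this splicing rule follows, under the assumptions that the sorted target video units are lists of frame objects exposing an is_i_frame flag and that the rule drops the leading I-frame of the next unit when both boundary frames are I-frames; the frame representation and that reading of the rule are illustrative assumptions.

```python
# Sketch of the preset splicing rule for the sorted target playable video units.
def splice_video_units(sorted_units):
    if not sorted_units:
        return []
    spliced = list(sorted_units[0])
    for next_unit in sorted_units[1:]:
        # Boundary check: last frame spliced so far vs. first frame of the next unit.
        if spliced[-1].is_i_frame and next_unit[0].is_i_frame:
            next_unit = next_unit[1:]   # drop the duplicate leading I-frame
        spliced.extend(next_unit)
    return spliced
```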
For another example, the step performed by the synthesis sub-unit of "removing audio frames from the target playable audio units according to a preset removal rule" may specifically include:
removing three audio frames at the head and two audio frames at the tail of a target playable audio unit, or removing two audio frames at the head of a target playable audio unit.
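A corresponding sketch of the removal rule is given below; which of the two variants applies (three head frames plus two tail frames, or two head frames only) is treated here as a caller choice, an assumption made purely for illustration.

```python
# Sketch of the preset removal rule applied to a target playable audio unit,
# represented as a list of audio frames.
def trim_audio_unit(audio_frames, trim_tail=True):
    if trim_tail:
        return audio_frames[3:-2]   # remove three head frames and two tail frames
    return audio_frames[2:]         # remove two head frames only
```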
In specific implementation, each of the above units may be implemented as an independent entity, or the units may be combined arbitrarily and implemented as the same entity or as several entities. For the specific implementation of the above units, reference may be made to the foregoing method embodiments, which are not repeated here.
The audio and video processing apparatus may be integrated in a device such as a server.
As can be seen from the above, in the audio and video processing apparatus of this embodiment the audio/video receiving unit 501 receives the playable audio/video units sent by the terminal during collection of the audio/video stream, the instruction receiving unit 502 then receives the synthesis instruction sent by the terminal after the collection is completed, and the synthesis unit 503 performs synthesis processing on the received playable audio/video units according to the synthesis instruction to obtain the audio and video. Because the scheme can receive the playable audio/video units while the terminal is still collecting the audio/video stream, and synthesize the received units once collection is completed, compared with the prior art it shortens the time needed to record and upload audio and video and greatly reduces the user's waiting time: from the user's point of view the upload succeeds as soon as recording ends, achieving zero waiting and improving the user experience.
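Tying the pieces together, the sketch below shows how the three server-side units could cooperate, reusing the hypothetical helpers from the earlier sketches (select_target_units, splice_video_units, trim_audio_unit) and an in-memory store; the data layout and the merge() placeholder are assumptions for illustration, not the patent's prescribed implementation.

```python
# End-to-end sketch of the server-side apparatus: the receiving unit stores the
# incoming playable units, and the synthesis unit assembles them on instruction.
received_units = []                                   # filled by the receiving unit

def on_unit_received(video_id, unit_id, data):
    """Audio/video receiving unit: store one playable unit and its identifiers."""
    received_units.append({"video_id": video_id, "unit_id": unit_id, "data": data})

def merge(video_frames, audio_frames):
    """Placeholder for the final muxing step that combines the spliced streams."""
    return {"video": video_frames, "audio": audio_frames}

def on_synthesis_instruction(instruction):
    """Synthesis unit: select, splice and merge the target playable units."""
    targets = select_target_units(received_units,
                                  instruction["video_id"],
                                  instruction["unit_ids"])
    targets.sort(key=lambda u: u["unit_id"])          # order by identifier (generation order)
    video = splice_video_units([u["data"].video_frames for u in targets])
    audio = []
    for u in targets:
        audio.extend(trim_audio_unit(u["data"].audio_frames))
    return merge(video, audio)
```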
Those of ordinary skill in the art will appreciate that all or part of the steps in the various methods of the above embodiments may be completed by instructing the relevant hardware through a program, and the program may be stored in a computer-readable storage medium; the storage medium may include a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, and the like.
The audio and video processing method, apparatus and system provided by the embodiments of the present invention have been described in detail above. Specific examples are used herein to set forth the principles and implementations of the present invention, and the description of the above embodiments is only intended to help understand the method of the present invention and its core idea. Meanwhile, those skilled in the art may make changes to the specific implementations and application scope according to the idea of the present invention. In summary, the content of this specification should not be construed as limiting the present invention.

Claims (24)

1. An audio and video processing method, characterized by comprising:
during collection of an audio/video stream, compressing the collected audio/video stream into a playable audio/video unit every preset duration;
sending the playable audio/video unit to a server;
after collection of the audio/video stream is completed, sending a synthesis instruction to the server, so that the server synthesizes the received playable audio/video units into corresponding audio and video according to the synthesis instruction.
2. The audio and video processing method according to claim 1, characterized in that the step of sending the playable audio/video unit to a server specifically comprises:
setting an identifier for the playable audio/video unit;
sending the playable audio/video unit, its identifier and the identifier of the audio and video to which the playable audio/video unit belongs to the server.
3. The audio and video processing method according to claim 2, characterized in that the step of setting an identifier for the playable audio/video unit specifically comprises:
setting the identifier for the playable audio/video unit according to the generation time of the playable audio/video unit.
4. The audio and video processing method according to claim 2, characterized in that the step of sending the playable audio/video unit, its identifier and the identifier of the audio and video to which it belongs to the server specifically comprises:
obtaining a check parameter of the first video frame of the playable audio/video unit, and using the check parameter as the identifier of the audio and video to which the playable audio/video unit belongs;
sending the playable audio/video unit, its identifier and the identifier of the audio/video group to which the playable audio/video unit belongs to the server.
5. The audio and video processing method according to claim 1, characterized in that the step of compressing the collected audio/video stream into a playable audio/video unit every preset duration specifically comprises:
compressing the video data and the audio data in the audio/video stream respectively every preset duration, to obtain a playable audio unit and a playable video unit;
synthesizing the playable audio unit and the playable video unit to obtain the playable audio/video unit.
6. The audio and video processing method according to claim 1, characterized in that the step of compressing the collected audio/video stream into a playable audio/video unit every preset duration specifically comprises:
compressing the collected audio/video stream into a playable audio/video unit every preset duration, and putting the playable audio/video unit into a sending queue;
the step of sending the playable audio/video unit to a server specifically comprises:
extracting a playable audio/video unit from the sending queue, and sending the extracted playable audio/video unit to the server.
7. The audio and video processing method according to claim 2, characterized in that the synthesis instruction carries the identifiers of the playable audio/video units participating in the synthesis and the identifier of the audio and video to which the playable audio/video units belong.
8. An audio and video processing method, characterized by comprising:
receiving playable audio/video units sent by a terminal during collection of an audio/video stream;
receiving a synthesis instruction sent by the terminal after collection of the audio/video stream is completed;
performing synthesis processing on the received playable audio/video units according to the synthesis instruction, to obtain audio and video.
9. The audio and video processing method according to claim 8, characterized in that the step of performing synthesis processing on the received playable audio/video units according to the synthesis instruction specifically comprises:
determining, from the received playable audio/video units and according to the synthesis instruction, target playable audio/video units participating in the synthesis;
performing synthesis processing on the target playable audio/video units.
10. The audio and video processing method according to claim 9, characterized in that the synthesis instruction carries the identifiers of the target playable audio/video units that need to participate in the synthesis and the identifier of the audio and video to which the target playable audio/video units belong;
the step of receiving the playable audio/video units sent by the terminal specifically comprises:
receiving the playable audio/video units sent by the terminal, their identifiers and the identifier of the audio and video to which the playable audio/video units belong;
the step of determining, from the received playable audio/video units and according to the synthesis instruction, the target playable audio/video units participating in the synthesis specifically comprises:
selecting corresponding playable audio/video units from the received playable audio/video units according to the identifier of the audio and video to which the target playable audio/video units belong and the identifiers of the audio and video to which the received playable audio/video units belong, to obtain candidate playable audio/video units;
determining the target playable audio/video units from the candidate playable audio/video units according to the identifiers of the target playable audio/video units that need to participate in the synthesis and the identifiers of the candidate playable audio/video units.
11. The audio and video processing method according to claim 10, characterized in that the target playable audio/video units include target playable video units and target playable audio units; the step of performing synthesis processing on the target playable audio/video units specifically comprises:
splicing the target playable video units and the target playable audio units respectively, according to the identifiers corresponding to the target playable audio/video units;
merging the spliced target playable video units with the spliced target playable audio units.
12. The audio and video processing method according to claim 11, characterized in that the step of splicing the target playable video units and the target playable audio units respectively according to the identifiers corresponding to the target playable audio/video units specifically comprises:
sorting the target playable video units according to their corresponding identifiers, and splicing the sorted target playable video units according to a preset splicing rule;
removing audio frames from the target playable audio units according to a preset removal rule, and splicing the target playable audio units after the removal according to their corresponding identifiers.
13. The audio and video processing method according to claim 12, characterized in that the step of splicing the sorted target playable video units according to a preset splicing rule specifically comprises:
detecting whether the last video frame of the current target playable video unit is an I-frame, and whether the first video frame of the next target playable video unit to be spliced with it is an I-frame;
if so, removing the first video frame of the next target playable video unit;
splicing the current target playable video unit with the next target playable video unit after the removal.
14. The audio and video processing method according to claim 12, characterized in that the step of removing audio frames from the target playable audio units according to a preset removal rule specifically comprises:
removing three audio frames at the head and two audio frames at the tail of a target playable audio unit, or removing two audio frames at the head of a target playable audio unit.
15. An audio and video processing apparatus, characterized by comprising:
a compression unit, configured to compress, during collection of an audio/video stream, the collected audio/video stream into a playable audio/video unit every preset duration;
an audio/video sending unit, configured to send the playable audio/video unit to a server;
an instruction sending unit, configured to send a synthesis instruction to the server after collection of the audio/video stream is completed, so that the server synthesizes the received playable audio/video units into corresponding audio and video according to the synthesis instruction.
16. The audio and video processing apparatus according to claim 15, characterized in that the audio/video sending unit comprises an identifier setting sub-unit and an audio/video sending sub-unit;
the identifier setting sub-unit is configured to set an identifier for the playable audio/video unit;
the audio/video sending sub-unit is configured to send the playable audio/video unit, its identifier and the identifier of the audio and video to which the playable audio/video unit belongs to the server.
17. The audio and video processing apparatus according to claim 16, characterized in that the audio/video sending sub-unit is specifically configured to:
obtain a check parameter of the first video frame of the playable audio/video unit, and use the check parameter as the identifier of the audio and video to which the playable audio/video unit belongs;
send the playable audio/video unit, its identifier and the identifier of the audio/video group to which the playable audio/video unit belongs to the server.
18. The audio and video processing apparatus according to claim 15, characterized in that the compression unit comprises an audio/video compression sub-unit and an audio/video synthesis sub-unit;
the audio/video compression sub-unit is configured to compress the video data and the audio data in the audio/video stream respectively every preset duration, to obtain a playable audio unit and a playable video unit;
the audio/video synthesis sub-unit is configured to synthesize the playable audio unit and the playable video unit to obtain the playable audio/video unit.
19. The audio and video processing apparatus according to claim 15, characterized in that
the compression unit is specifically configured to compress the collected audio/video stream into a playable audio/video unit every preset duration, and put the playable audio/video unit into a sending queue;
the audio/video sending unit is specifically configured to extract a playable audio/video unit from the sending queue, and send the extracted playable audio/video unit to the server.
20. An audio and video processing apparatus, characterized by comprising:
an audio/video receiving unit, configured to receive playable audio/video units sent by a terminal during collection of an audio/video stream;
an instruction receiving unit, configured to receive a synthesis instruction sent by the terminal after collection of the audio/video stream is completed;
a synthesis unit, configured to perform synthesis processing on the received playable audio/video units according to the synthesis instruction, to obtain audio and video.
21. The audio and video processing apparatus according to claim 20, characterized in that the synthesis unit specifically comprises a determining sub-unit and a synthesis sub-unit;
the determining sub-unit is configured to determine, from the received playable audio/video units and according to the synthesis instruction, target playable audio/video units participating in the synthesis;
the synthesis sub-unit is configured to perform synthesis processing on the target playable audio/video units.
22. The audio and video processing apparatus according to claim 21, characterized in that the synthesis instruction carries the identifiers of the target playable audio/video units that need to participate in the synthesis and the identifier of the audio and video to which the target playable audio/video units belong;
the audio/video receiving unit is specifically configured to receive the playable audio/video units sent by the terminal during collection of the audio/video stream, their identifiers and the identifier of the audio and video to which the playable audio/video units belong;
the determining sub-unit is specifically configured to:
select corresponding playable audio/video units from the received playable audio/video units according to the identifier of the audio and video to which the target playable audio/video units belong and the identifiers of the audio and video to which the received playable audio/video units belong, to obtain candidate playable audio/video units;
determine the target playable audio/video units from the candidate playable audio/video units according to the identifiers of the target playable audio/video units that need to participate in the synthesis and the identifiers of the candidate playable audio/video units.
23. The audio and video processing apparatus according to claim 22, characterized in that the target playable audio/video units include target playable video units and target playable audio units;
the synthesis sub-unit is specifically configured to:
splice the target playable video units and the target playable audio units respectively, according to the identifiers corresponding to the target playable audio/video units;
merge the spliced target playable video units with the spliced target playable audio units.
24. The audio and video processing apparatus according to claim 23, characterized in that the synthesis sub-unit is specifically configured to:
sort the target playable video units according to their corresponding identifiers, and splice the sorted target playable video units according to a preset splicing rule;
remove audio frames from the target playable audio units according to a preset removal rule, and splice the target playable audio units after the removal according to their corresponding identifiers;
merge the spliced target playable video units with the spliced target playable audio units.
CN201610122421.0A 2016-03-03 2016-03-03 A kind of audio/video processing method and device Active CN105681715B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201610122421.0A CN105681715B (en) 2016-03-03 2016-03-03 A kind of audio/video processing method and device
PCT/CN2017/075642 WO2017148442A1 (en) 2016-03-03 2017-03-03 Audio and video processing method and apparatus, and computer storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610122421.0A CN105681715B (en) 2016-03-03 2016-03-03 A kind of audio/video processing method and device

Publications (2)

Publication Number Publication Date
CN105681715A true CN105681715A (en) 2016-06-15
CN105681715B CN105681715B (en) 2019-11-15

Family

ID=56306699

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610122421.0A Active CN105681715B (en) 2016-03-03 2016-03-03 A kind of audio/video processing method and device

Country Status (2)

Country Link
CN (1) CN105681715B (en)
WO (1) WO2017148442A1 (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110969552B (en) * 2019-12-09 2023-12-08 广州长鹏光电科技有限公司 System and method for checking and checking video authenticity
CN111274449B (en) * 2020-02-18 2023-08-29 腾讯科技(深圳)有限公司 Video playing method, device, electronic equipment and storage medium
CN113838488B (en) * 2020-06-24 2023-03-14 北京字节跳动网络技术有限公司 Audio playing packet generation method and device and audio playing method and device
CN112615869B (en) * 2020-12-22 2022-08-26 平安银行股份有限公司 Audio data processing method, device, equipment and storage medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105681715B (en) * 2016-03-03 2019-11-15 腾讯科技(深圳)有限公司 A kind of audio/video processing method and device

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102750339A (en) * 2012-06-05 2012-10-24 北京交通大学 Positioning method of repeated fragments based on video reconstruction
US20140280400A1 (en) * 2013-03-15 2014-09-18 Stephane G. Legay System and method for improved data accessibility
CN105187896A (en) * 2015-09-09 2015-12-23 北京暴风科技股份有限公司 Multi-segment media file playing method and system
CN105338424A (en) * 2015-10-29 2016-02-17 努比亚技术有限公司 Video processing method and system
CN105357593A (en) * 2015-10-30 2016-02-24 努比亚技术有限公司 Method, device and system for uploading video

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017148442A1 (en) * 2016-03-03 2017-09-08 腾讯科技(深圳)有限公司 Audio and video processing method and apparatus, and computer storage medium
CN106101595A (en) * 2016-07-12 2016-11-09 中科创达软件股份有限公司 A kind of segmentation Video data processing method, system and terminal
CN108231089A (en) * 2016-12-09 2018-06-29 百度在线网络技术(北京)有限公司 Method of speech processing and device based on artificial intelligence
CN108231089B (en) * 2016-12-09 2020-11-03 百度在线网络技术(北京)有限公司 Speech processing method and device based on artificial intelligence
WO2020006753A1 (en) * 2018-07-06 2020-01-09 深圳市汇顶科技股份有限公司 Data compression method and apparatus
CN109299223A (en) * 2018-10-15 2019-02-01 百度在线网络技术(北京)有限公司 Method and device for inquiry instruction
CN109640082A (en) * 2018-10-26 2019-04-16 西安科锐盛创新科技有限公司 Audio/video multimedia data processing method and its equipment
CN109640082B (en) * 2018-10-26 2021-02-12 浙江鼎越电子有限公司 Audio and video multimedia data processing method and equipment thereof
CN109949792A (en) * 2019-03-28 2019-06-28 优信拍(北京)信息科技有限公司 The synthetic method and device of Multi-audio-frequency
CN110602122A (en) * 2019-09-20 2019-12-20 北京达佳互联信息技术有限公司 Video processing method and device, electronic equipment and storage medium
WO2021052058A1 (en) * 2019-09-20 2021-03-25 北京达佳互联信息技术有限公司 Video processing method and apparatus, electronic device and storage medium
CN111475245A (en) * 2020-04-08 2020-07-31 腾讯科技(深圳)有限公司 Dynamic picture display method and device, electronic equipment and computer storage medium

Also Published As

Publication number Publication date
WO2017148442A1 (en) 2017-09-08
CN105681715B (en) 2019-11-15

Similar Documents

Publication Publication Date Title
CN105681715A (en) Audio and video processing method and apparatus
CN101626516B (en) Reproducing apparatus and method
CN104410807B (en) A kind of multi-channel video synchronized playback method and device
CN108282598A (en) A kind of software director system and method
CN103165156B (en) Audio video synchronization Play System and video broadcasting method, CD
CN104869467A (en) Information output method and system for media playing, and apparatuses
CN105791938B (en) The joining method and device of multimedia file
CN1941144B (en) Data recording and reproducing apparatus and method therefor
CN110324643A (en) A kind of video recording method and system
CN106792152A (en) A kind of image synthesizing method and terminal
CN104092920A (en) Audio and video synchronizing method
CN109905749B (en) Video playing method and device, storage medium and electronic device
CN101232611A (en) Image process apparatus and method thereof
CN110300322A (en) A kind of method of screen recording, client and terminal device
CN101383961B (en) Content reproduction appratus, content reproduction method, and content reproduction system
CN106604047A (en) Multi-video-stream video direct broadcasting method and device
CN106060609B (en) Obtain the method and device of picture
CN102802002B (en) Method for mobile phone to play back 3-dimensional television videos
JP4760786B2 (en) Video / audio encoded data editing device
CN105898435A (en) Data synchronizing method and device
JP6642016B2 (en) Distribution equipment and information equipment
CN102457656B (en) Method, device and system for realizing multi-channel nonlinear acquisition and editing
CN107135407A (en) Synchronous method and system in a kind of piano video teaching
CN102694989A (en) Method and system for transmitting television program multimedia file
CN1829310A (en) Method and apparatus for determining title of recorded content

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20210924

Address after: 518057 Tencent Building, No. 1 High-tech Zone, Nanshan District, Shenzhen City, Guangdong Province, 35 floors

Patentee after: TENCENT TECHNOLOGY (SHENZHEN) Co.,Ltd.

Patentee after: TENCENT CLOUD COMPUTING (BEIJING) Co.,Ltd.

Address before: 2, 518000, East 403 room, SEG science and Technology Park, Zhenxing Road, Shenzhen, Guangdong, Futian District

Patentee before: TENCENT TECHNOLOGY (SHENZHEN) Co.,Ltd.