CN110300274A - Video file recording method, device, and storage medium
- Publication number: CN110300274A
- Application number: CN201810235113.8A
- Authority: CN (China)
- Prior art keywords: data, text data, video, text, audio data
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
- G10L15/30—Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N5/00—Details of television systems
- H04N5/222—Studio circuitry; Studio devices; Studio equipment
- H04N5/262—Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects ; Cameras specially adapted for the electronic generation of special effects
- H04N5/278—Subtitling
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N5/00—Details of television systems
- H04N5/76—Television signal recording
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Signal Processing (AREA)
- Television Signal Processing For Recording (AREA)
Abstract
The invention discloses a video file recording method, device, and storage medium, belonging to the field of Internet technology. The method includes: during video recording, displaying a video recording interface that includes a subtitle addition option; when a subtitle addition instruction is received, obtaining the text data converted from the audio data collected during video recording; displaying the text data on the video recording interface; and, when a video recording end instruction is received, generating a video file that includes the text data. During video file recording, the invention converts the collected audio data into text data and displays the converted text data on the video recording interface. The user does not need to make a subtitle file in advance, which saves substantial resources and simplifies production.
Description
Technical field
The present invention relates to the field of Internet technology, and in particular to a video file recording method, device, and storage medium.
Background
With the development of Internet technology, social applications are widely used in users' daily lives and have become a main tool for communication between users. To meet users' needs, social applications provide a video recording function. With it, users can record videos and add text to the recorded videos, which makes interaction between users more entertaining.
At present, a video file is recorded as follows: a first video file is recorded using the video recording function of a social application; a subtitle file made by the user is obtained; and the subtitle file is merged into the first video file to obtain a second video file.
However, the related art requires the user to make a subtitle file in advance and, after the first video file has been recorded, to merge the subtitle file with the recorded first video file. This consumes considerable resources and makes production cumbersome.
Summary of the invention
To solve the problems in the prior art, embodiments of the present invention provide a video file recording method, device, and storage medium. The technical solution is as follows:
In one aspect, a video file recording method is provided, the method including:
during video recording, displaying a video recording interface, the video recording interface including a subtitle addition option;
when a subtitle addition instruction is received, obtaining, according to the audio data collected during video recording, the text data converted from the audio data;
displaying the text data on the video recording interface; and
when a video recording end instruction is received, generating a video file that includes the text data.
In another aspect, a video file recording device is provided, the device including:
a display module, configured to display a video recording interface during video recording, the video recording interface including a subtitle addition option;
an obtaining module, configured to, when a subtitle addition instruction is received, obtain, according to the audio data collected during video recording, the text data converted from the audio data;
the display module, further configured to display the text data on the video recording interface; and
a generation module, configured to generate a video file that includes the text data when a video recording end instruction is received.
In another aspect, a terminal is provided. The terminal includes a processor and a memory. The memory stores at least one instruction, at least one program, a code set, or an instruction set, and the at least one instruction, the at least one program, the code set, or the instruction set is loaded and executed by the processor to implement the video file recording method.
In another aspect, a computer-readable storage medium is provided. The storage medium stores at least one instruction, at least one program, a code set, or an instruction set, and the at least one instruction, the at least one program, the code set, or the instruction set is loaded and executed by a processor to implement the video file recording method.
The technical solution provided by the embodiments of the present invention has the following beneficial effects:
During video file recording, the collected audio data is converted into text data, and the converted text data is displayed on the video recording interface. The user does not need to make a subtitle file in advance, which saves substantial resources and simplifies production.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the drawings needed for describing the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 shows an implementation environment for a video file recording method provided by an embodiment of the present invention;
Fig. 2 is a flowchart of a video file recording method provided by an embodiment of the present invention;
Fig. 3 is a schematic diagram of a video recording interface provided by an embodiment of the present invention;
Fig. 4 is a schematic diagram of an audio data collection and upload process provided by an embodiment of the present invention;
Fig. 5 is a schematic diagram of a video recording interface provided by an embodiment of the present invention;
Fig. 6 is a schematic diagram of a video recording interface provided by an embodiment of the present invention;
Fig. 7 is a schematic diagram of a video recording interface provided by an embodiment of the present invention;
Fig. 8 is a schematic diagram of a video recording interface provided by an embodiment of the present invention;
Fig. 9 is a schematic diagram of a video recording interface provided by an embodiment of the present invention;
Fig. 10 is a schematic diagram of a video recording interface provided by an embodiment of the present invention;
Fig. 11 is a schematic diagram of a video recording interface provided by an embodiment of the present invention;
Fig. 12 is a schematic diagram of a video recording interface provided by an embodiment of the present invention;
Fig. 13 is a schematic diagram of a process for aligning text data with a video file provided by an embodiment of the present invention;
Fig. 14 is a schematic diagram of a video preview interface provided by an embodiment of the present invention;
Fig. 15 is a schematic structural diagram of a video file recording device provided by an embodiment of the present invention;
Fig. 16 is a schematic structural diagram of a terminal provided by an embodiment of the present invention.
Detailed description of the embodiments
To make the objectives, technical solutions, and advantages of the present invention clearer, the embodiments of the present invention are described in further detail below with reference to the drawings.
Fig. 1 shows the implementation environment for the video file recording method provided by an embodiment of the present invention. Referring to Fig. 1, the implementation environment includes a terminal 101, a server 102, and a terminal 103.
The terminal 101 may be a smartphone, a tablet computer, a laptop, or the like; the embodiments of the present invention do not specifically limit the product type of the terminal 101. To meet users' needs, the terminal 101 has at least one social application installed. The social application can call the camera of the terminal 101 to capture image data of the user and call the microphone of the terminal 101 to capture audio data of the user, thereby implementing the video recording function.
The server 102 is a social application server. Through the social application installed on the terminal 101, the server 102 provides the terminal 101 with communication services and other services, for example a service that converts audio data into text data.
The product type of the terminal 103 is the same as that of the terminal 101 and is likewise not specifically limited. The terminal 103 may have the same social application installed as the terminal 101 and, through that application, receive video files recorded by the terminal 101 and forwarded by the server. Alternatively, the terminal 103 may have a different social application installed, or none at all; in that case, the terminal 103 can receive the video files recorded by the terminal 101 via Bluetooth, NFC (Near Field Communication), or other means.
An embodiment of the present invention provides a video file recording method. Referring to Fig. 2, the method flow includes:
201. During video recording, the terminal displays a video recording interface.
When the user wants to record a video, the user selects any social application on the terminal that has a video recording function. On detecting the selection, the terminal runs the social application and, when the video recording option of the social application is selected, displays the video recording interface.
The video recording interface shows the user's current video image in real time during recording. It includes various options related to video recording, for example a stop-recording option, a publish option, a save option, and picture optimization options such as contrast and brightness. It also includes a subtitle addition option, with which the terminal can add text data during video recording. In addition to these function options, the video recording interface shows times related to the recording, such as the recorded duration and the remaining recording duration.
202. When a subtitle type selection instruction is received, the terminal displays at least one subtitle type option.
During video recording, when the terminal detects that the subtitle addition option has been selected, it receives a subtitle type selection instruction. Triggered by this instruction, the terminal displays at least one subtitle type option on the video recording interface, and each subtitle type option corresponds to one subtitle display form. The subtitle type options are, for example, a first type option, a second type option, and a third type option. The display form of the first type option may be a bullet screen (barrage); the display form of the second type option may be a bullet screen with added text; and the display form of the third type option may be text that moves outward from the center.
Fig. 3 shows a video recording interface. Referring to Fig. 3, the video recording interface displays a subtitle addition option. When the terminal detects that the subtitle addition option has been selected, it displays three type options on the video recording interface: the first type option, the second type option, and the third type option.
203. When a subtitle addition instruction is received, the terminal obtains, according to the audio data collected during video recording, the text data converted from the audio data.
During video recording, when the terminal detects that any subtitle type option has been selected, it captures an audio signal through the microphone, obtains audio data from the audio signal, and then obtains the text data converted from the audio data collected during video recording. After detecting the selection and before converting the audio data into text data, the terminal also displays a subtitle addition prompt on the video recording interface to tell the user that subtitles can be added to the recorded video. The prompt may read, for example, "Speak while recording to add subtitles".
The terminal obtains the text data converted from the audio data collected during video recording as follows:
2031. During video recording, the terminal resamples the collected audio signal at a preset sampling rate to obtain at least one frame of audio data.
The preset sampling rate is an audio sampling rate that the server supports and can be negotiated between the terminal and the server. In the embodiments of the present invention, 16000 Hz is preferred.
By resampling the collected audio signal at the preset sampling rate, the terminal converts it into at least one frame of audio data. The duration of each frame of audio data can be determined from the preset sampling rate. For example, if the preset sampling rate is 16000, the duration of each frame of audio data is 1000 × 1/16000 = 0.0625 milliseconds.
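The embodiment does not say how the resampling in step 2031 is performed. The following is only a minimal sketch, assuming a mono signal held as a list of numbers and plain linear interpolation; the names `sample_period_ms` and `resample_linear` are hypothetical and are not part of the described method.

```python
def sample_period_ms(sample_rate_hz):
    """Duration of one sample in milliseconds at the given rate."""
    return 1000.0 / sample_rate_hz


def resample_linear(samples, src_rate_hz, dst_rate_hz):
    """Resample a mono signal by linear interpolation (illustrative only)."""
    if not samples:
        return []
    n_out = int(len(samples) * dst_rate_hz / src_rate_hz)
    out = []
    for i in range(n_out):
        pos = i * src_rate_hz / dst_rate_hz      # position in the source signal
        left = int(pos)
        right = min(left + 1, len(samples) - 1)  # clamp at the last sample
        frac = pos - left
        out.append(samples[left] * (1.0 - frac) + samples[right] * frac)
    return out
```

For example, `resample_linear(captured, 48000, 16000)` would reduce a hypothetical 48000 Hz capture to the preset rate. A production resampler would normally low-pass filter before downsampling to limit aliasing, which this sketch omits.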
2032. When the at least one frame of audio data meets a preset condition, the terminal converts the at least one frame of audio data to obtain text data.
To do so, the terminal can use steps 20321 to 20323:
20321. The terminal calculates the volume value of each frame of audio data.
Taking any one frame of audio data as an example, the terminal obtains the amplitude of each sample in the frame, squares each amplitude, sums the squares over all samples in the frame, and then takes the logarithm of that sum to obtain the volume value of the frame.
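The volume calculation in step 20321 can be sketched as follows. The text specifies only "the logarithm of the sum of squares"; the base-10 logarithm and the factor of 10 used here are assumptions, chosen because they give a decibel-like scale, and the function name `frame_volume` is hypothetical.

```python
import math


def frame_volume(samples):
    """Volume of one frame: logarithm of the sum of squared sample amplitudes.

    Assumption: 10 * log10(...) scaling; the source does not state the base
    or any scale factor.
    """
    energy = sum(s * s for s in samples)
    if energy <= 0:
        return float("-inf")  # an all-zero frame has no defined logarithm
    return 10.0 * math.log10(energy)
```

Under this assumed scaling, a frame of ten samples that each have amplitude 100 has a sum of squares of 100000 and therefore a volume value of about 50.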
20322. When the volume value of any frame of audio data is greater than a specified threshold, the terminal stores that audio data.
The specified threshold is determined according to the volume of human speech and is generally 45. By comparing the volume value of each frame of audio data with the specified threshold, the terminal screens the at least one frame of audio data and filters out the non-speech audio data.
When the volume value of a frame of audio data is greater than the specified threshold, the terminal preprocesses that frame to obtain clear, usable speech data. The preprocessing is as follows: the terminal applies noise reduction to the audio data and then doubles the amplitude of each sample of the noise-reduced audio data, obtaining the preprocessed audio data.
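A minimal sketch of this preprocessing is given below. It assumes 16-bit signed integer samples and clips amplified values to that range; neither the sample format nor the clipping is stated in the text. The noise-reduction step is represented by a placeholder callable `denoise`, which defaults to doing nothing, because the algorithm is not specified at this point. All names are hypothetical.

```python
INT16_MIN, INT16_MAX = -32768, 32767  # assumed 16-bit signed sample range


def amplify(samples, gain=2):
    """Multiply every sample by gain, clipping to the assumed 16-bit range."""
    return [max(INT16_MIN, min(INT16_MAX, s * gain)) for s in samples]


def preprocess(samples, denoise=lambda frame: frame):
    """Noise-reduce a frame (placeholder), then double each sample amplitude."""
    return amplify(denoise(samples), gain=2)
```

Clipping is one possible way to handle samples that would overflow after doubling; a real implementation might use a limiter instead.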
20323. When the total duration of the stored audio data reaches a first preset duration, the terminal converts the audio data whose total duration is the first preset duration to obtain text data.
The first preset duration may be 100 milliseconds, 200 milliseconds, and so on. The terminal can convert the audio data whose total duration is the first preset duration in either of two ways. In the first, the terminal converts the audio data locally to obtain the text data. In the second, the terminal sends the audio data to the server over the network, and the server performs the conversion.
Because the terminal collects a large amount of audio data over the whole video recording process, it needs a way to tell the pieces of audio data apart. Taking the first preset duration as the unit, the terminal can assign a data index to each piece of audio data whose total duration is the first preset duration. The data index can be assigned in the order in which the audio data is collected: for example, the first piece of audio data that reaches the first preset duration is given data index 1, the second is given data index 2, and so on. The data index is mainly used to align the text data with the speech in the video when the text data is merged with the video file. When the terminal sends a piece of audio data of the first preset duration to the server, it also sends the corresponding data index. The server converts the audio data into text data, uses the data index of the audio data as the data index of the text data, and sends the text data and its data index to the terminal.
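The buffering and indexing described above can be sketched as follows, assuming a 16000 Hz sampling rate, a first preset duration of 100 milliseconds, and frames passed in as lists of samples. The class name `SegmentBuffer` and its method are hypothetical.

```python
class SegmentBuffer:
    """Accumulates stored frames and emits indexed segments (illustrative)."""

    def __init__(self, sample_rate_hz=16000, segment_ms=100):
        # Number of samples that make up the first preset duration.
        self.min_samples = sample_rate_hz * segment_ms // 1000
        self.samples = []
        self.next_index = 1  # data indexes start at 1, in collection order

    def push(self, frame):
        """Store a frame; return (data_index, samples) once enough is stored."""
        self.samples.extend(frame)
        if len(self.samples) < self.min_samples:
            return None
        segment, self.samples = self.samples, []
        index, self.next_index = self.next_index, self.next_index + 1
        return index, segment
```

Each returned pair can then be converted locally or sent to the server together with its data index, so that the resulting text data can later be matched to the right stretch of speech.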
In another embodiment of the present invention, if the volume value of a frame of audio data is less than the specified threshold, the terminal measures how long the volume value of the audio data stays below the specified threshold. When the volume value stays below the specified threshold throughout a second preset duration, the terminal sends heartbeat data to the server to keep the heartbeat with the server alive and avoid a communication interruption. The second preset duration can be negotiated between the terminal and the server and may be 1 second, 2 seconds, 3 seconds, and so on; in practice, the embodiment of the present invention uses 3 seconds.
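One way to track the silent period is sketched below. The threshold of 45 and the 3-second duration come from the text; treating a volume exactly equal to the threshold as silence, and restarting the timer after each heartbeat, are assumptions. The class name `SilenceMonitor` is hypothetical, and timestamps are passed in explicitly so the logic does not depend on a real clock.

```python
class SilenceMonitor:
    """Decides when heartbeat data should be sent (illustrative)."""

    def __init__(self, threshold=45.0, timeout_s=3.0):
        self.threshold = threshold
        self.timeout_s = timeout_s
        self.silent_since = None  # time at which the current silence began

    def update(self, volume, now_s):
        """Feed one frame's volume; return True if a heartbeat is due."""
        if volume > self.threshold:
            self.silent_since = None  # speech detected, reset the timer
            return False
        if self.silent_since is None:
            self.silent_since = now_s
            return False
        if now_s - self.silent_since >= self.timeout_s:
            self.silent_since = now_s  # assumed: restart after each heartbeat
            return True
        return False
```

The caller would invoke `update` once per frame and send heartbeat data to the server whenever it returns `True`.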
When the text data is obtained, the terminal stores the correspondence between the data index and the text data, so that in subsequent operations the corresponding text data can be looked up by data index.
The text data converted from the same audio data can differ depending on context. For example, suppose the audio data received by the server is "ni". From the pronunciation alone, the server obtains the text "mud" (a character pronounced "ni") and sends it to the terminal. After the first preset duration, the server receives the audio data "haoma". Combining the preceding and following context, the server determines that the user actually said "how are you" ("ni hao ma") rather than "mud, okay?". The text data already sent to the terminal then needs to be corrected.
Specifically, the text data is corrected as follows: the server sends the text data and its data index to the terminal. On receiving them, the terminal checks whether the data index is already stored locally. If it is, the terminal updates the stored text data corresponding to that data index with the received text data. If it is not, the terminal simply stores the received text data and its data index.
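This correction amounts to an insert-or-update keyed by data index. A minimal sketch, with the hypothetical class name `SubtitleStore` and an in-memory dictionary standing in for the terminal's local storage:

```python
class SubtitleStore:
    """Maps data indexes to text data and applies corrections (illustrative)."""

    def __init__(self):
        self._by_index = {}

    def apply(self, data_index, text):
        """Store new text, or overwrite the text already held for the index.

        Returns "updated" if the index was already stored, else "stored".
        """
        action = "updated" if data_index in self._by_index else "stored"
        self._by_index[data_index] = text
        return action

    def ordered_text(self):
        """Text data in data-index order, as needed for later alignment."""
        return [self._by_index[i] for i in sorted(self._by_index)]
```

With the example above, the terminal would first store "mud" under index 1 and later overwrite it when the server resends index 1 with the corrected text.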
Fig. 4 shows the audio data collection and upload process provided by an embodiment of the present invention. Referring to Fig. 4, during video recording the terminal captures the user's voice signal in real time through the microphone and resamples the captured voice signal at a sampling rate of 16000 to obtain at least one frame of audio data. The terminal then calculates the volume value of each frame of audio data and checks whether the volume value is less than 45. If the volume value of the audio data stays below 45 for 3 seconds, the terminal sends heartbeat data to the server. If the volume value of a frame of audio data is greater than 45, the terminal applies Nsx noise reduction to that frame, amplifies the noise-reduced audio data by a factor of 2, and caches the frame. The terminal then checks whether the data length (that is, the total duration) of the cached audio data is greater than or equal to 100 milliseconds. If it is, the terminal sends the cached audio data to the server; if it is less than 100 milliseconds, the terminal continues to store audio data.
204. The terminal displays the text data on the video recording interface in the display form corresponding to the selected subtitle type option.
In the embodiments of the present invention, each subtitle type option corresponds to one display form. Once the text data has been obtained, the terminal displays it on the video recording interface in the display form corresponding to the selected subtitle type.
The form in which the text data is displayed on the video recording interface depends on the selected subtitle type option. The cases are as follows:
In the first case, the subtitle type option includes the first type option.
The first type option is the bullet-screen (barrage) form. Before text data is displayed on the video recording interface with the first type option, the following settings are needed:
1. Set at least one track (trajectory); each track is used to display the text corresponding to the text data.
2. Set the way the text enters the screen. The entry modes include entering from the left and exiting on the right, entering from the right and exiting on the left, entering from the top and exiting at the bottom, entering from the bottom and exiting at the top, entering from the upper left and exiting at the lower right, and so on. The user can choose according to preference.
3. Set the text movement speed. The speed can be random; text on the same track moves at the same speed, and text on different tracks moves at different speeds.
4. Set the text color. Text on the same track may have the same or different colors, and text on different tracks may have the same or different colors.
5. Set the text size. The size can also be random; text on the same track may have the same or different sizes, and text on different tracks may have the same or different sizes.
Based on these settings, the terminal obtains the text data and its display parameters, including font, color, size, movement speed, and display position, and then draws the text in the text data on the video recording interface according to those display parameters, so that text with different bullet-screen effects is shown on the video recording interface.
Because a user recording a video speaks sentence by sentence, the terminal also displays the text data on the tracks sentence by sentence. A display mechanism for at least one sentence on at least one track is therefore needed. It can follow these rules:
First, as far as possible, avoid crowding sentences onto one track while the remaining tracks are idle;
Second, when all tracks are occupied, select the track that has been in use the longest and replace the sentence currently on that track;
Third, if no new sentence is received, keep scrolling the old sentences in a loop.
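The first two rules can be sketched as a small allocator. This is a simplification that places each sentence on exactly one track, so it does not reproduce layouts in which one sentence is shown on several tracks at once; the third rule (looping old sentences) is a rendering concern and is left out. The class name `TrackAllocator` and the explicit timestamps are hypothetical.

```python
class TrackAllocator:
    """Chooses a track for each new sentence (illustrative simplification)."""

    def __init__(self, track_count=4):
        self.tracks = [None] * track_count  # sentence currently on each track
        self.since = [0.0] * track_count    # time each track was last assigned

    def assign(self, sentence, now_s):
        """Prefer an idle track; otherwise replace the longest-used track."""
        idle = [i for i, s in enumerate(self.tracks) if s is None]
        if idle:
            track = idle[0]  # rule 1: fill idle tracks before crowding
        else:
            # Rule 2: the track assigned earliest has been in use the longest.
            track = min(range(len(self.tracks)), key=lambda i: self.since[i])
        self.tracks[track] = sentence
        self.since[track] = now_s
        return track
```

With two tracks, for instance, a third sentence replaces the first one, because the first track has been occupied the longest.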
Assume that the number of tracks is set to 4 and that two sentences are displayed. The display forms of the sentences on the video recording interface are shown in Fig. 5, Fig. 6, and Fig. 7.
Referring to the left part of Fig. 5, the terminal displays the first sentence on all 4 tracks. Referring to the right part of Fig. 5, when the second sentence is received, the terminal replaces the first sentence on the 4 tracks with the second sentence and displays the second sentence on the 4 tracks.
Referring to the left part of Fig. 6, the terminal displays the first sentence on 1 track and the second sentence on 3 tracks. Referring to the right part of Fig. 6, when the third sentence is received, the terminal displays it on the track that showed the first sentence and also selects 2 of the 3 tracks showing the second sentence to display it. At this point, 3 tracks display the third sentence and 1 track displays the second sentence.
Referring to the left part of Fig. 7, the terminal displays the first sentence on 3 tracks. Referring to the right part of Fig. 7, when the second sentence is received, the terminal displays it on the remaining track and also selects 1 of the 3 occupied tracks to display it. At this point, 2 tracks display the first sentence and 2 tracks display the second sentence.
Fig. 5, Fig. 6, and Fig. 7 are schematic diagrams of the display forms. For the actual display form, see the actual interface shown in Fig. 8.
In the second case, the subtitle type options include a second type option.
The second type option is a barrage option that combines text with a designated picture. The designated picture may be a picture the user downloads from the network, a picture provided in a picture database, an emoticon that comes with the terminal, or the like. The text and the designated picture may be adjacent to each other, or may be located at any positions on the screen. The size, color and movement speed of the text, the display mechanism and so on are set in the same way as for the barrage form described above, and are not repeated here.
Before the text data is displayed using the second type option, the movement mode of the text and the designated picture needs to be set on the video recording interface. Specifically, the text and the designated picture may always move together at the same speed, or may always move at different speeds. Alternatively, for the case in which the text is adjacent to and located behind the designated picture, the text and the designated picture first move at the same speed, and the designated picture slows down once it has moved a designated distance; because the text keeps moving at the original speed, the designated picture appears to hide the text. The designated distance may be set by the user according to personal preference, or may be set by developers. To make this more entertaining, when the terminal detects that the designated picture has moved the designated distance, the designated picture may hide the text with an animation while slowing down. If the text is completely hidden before the designated picture reaches the other side of the screen, the designated picture disappears together with the text.
Based on the above settings, the terminal obtains the text data and the designated picture, obtains the display parameters of the text data (including font, size, movement speed, color, display position and so on) and the display parameters of the designated picture (including picture position, movement speed and so on), and then draws the text in the text data and the designated picture on the video recording interface according to these display parameters, so that the text and the picture are displayed on the video recording interface with different barrage effects.
Referring to Fig. 9, the designated picture is set to a bean-man picture and each character is displayed in the form of a red bean. While each character moves along with the bean-man picture, the bean-man picture slows down once it has moved half the length of the screen. Because the movement speed of the characters is unchanged, the characters are gradually hidden by the bean-man picture. To make the process more vivid, the bean-man in the picture keeps changing, opening and closing its mouth to produce the effect of the "bean-man" eating "beans". When the characters have been completely eaten, the bean-man picture disappears as well.
Fig. 9 is a schematic diagram of the display form; for the actual display form, refer to the actual interface diagram shown in Fig. 10.
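The hiding effect of the second type option can be illustrated with a one-dimensional sketch. It assumes the text starts directly behind the designated picture, that both move at `speed` until the picture has covered `slow_after` (the designated distance), and that the picture then continues at `slow_speed`. The function name, parameter names and units are hypothetical and are used only for illustration.

```python
def hidden_fraction(t, speed, slow_speed, slow_after, text_len):
    """Fraction of the text (0.0 to 1.0) covered by the picture at time t."""
    if t * speed <= slow_after:
        # Picture and text still move together; nothing is hidden yet.
        picture = t * speed
    else:
        # The picture slowed down after covering the designated distance.
        t0 = slow_after / speed
        picture = slow_after + (t - t0) * slow_speed
    text = t * speed  # the text keeps its original speed throughout
    overlap = text - picture
    return min(max(overlap / text_len, 0.0), 1.0)
```

For example, with a screen 1000 units long, `slow_after=500` matches the case of Fig. 9, in which the picture slows down after moving half the screen length. Once the returned value reaches 1.0 the text is completely hidden, and the picture can be removed together with it.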
In the third case, the subtitle type options include a third type option.
The third type option is an option for controlling the text in the text data to move outward from a preset position as the center. Before the text data is displayed using the third type option, the preset position and the movement mode need to be set on the video recording interface.
The preset position may be any position on the screen. If the user's face image is detected on the screen, the position of the user's mouth may be used as the preset position; if no face image of the user is detected on the screen, any position may be used as the preset position. The movement mode may be movement toward both sides from the preset position, movement in multiple directions from the preset position, or the like. To make the effect more pronounced, the characters of each sentence may be separated when displayed, with each character shown in a randomly chosen color; the characters of one sentence may all use the same color or may use different colors. While the text is moving outward from the preset position, each character grows larger as its moving distance increases and finally disappears; this process generally lasts about 400 milliseconds.
Based on the above settings, the terminal obtains the text data and its display parameters, including font, size, movement speed, color, display position, movement mode and so on, and then draws the text in the text data on the video recording interface according to these display parameters, so that text with different dynamic effects is displayed on the video recording interface.
Referring to Figure 11, the preset position is set to the position of the user's mouth. While the text grows from small to large, the terminal detects the position of the user's mouth and controls the text to keep moving from the mouth toward the corners of the mouth on both sides.
Figure 11 is a schematic diagram of the display form; for the actual display form, refer to the actual interface diagram shown in Figure 12.
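The movement of a single character under the third type option can be sketched as a linear interpolation over the roughly 400-millisecond lifetime mentioned above. The function name `glyph_state`, the parameters `max_distance` and `max_scale`, and the choice of linear interpolation are assumptions made for illustration; the text only states that a character grows as it moves away and finally disappears.

```python
def glyph_state(elapsed_ms, origin, direction, max_distance,
                max_scale=2.0, duration_ms=400):
    """Return (x, y, scale) for one character, or None once it has disappeared.

    `origin` is the preset position (for example the detected mouth position);
    `direction` is a unit vector such as (-1, 0) or (1, 0).
    """
    if elapsed_ms >= duration_ms:
        return None  # the character has finished its animation
    progress = elapsed_ms / duration_ms
    x = origin[0] + direction[0] * max_distance * progress
    y = origin[1] + direction[1] * max_distance * progress
    # The character grows as its distance from the preset position increases.
    scale = 1.0 + (max_scale - 1.0) * progress
    return (x, y, scale)
```

Calling the function with `direction=(-1, 0)` for some characters and `direction=(1, 0)` for others gives the "move toward both sides" mode; other unit vectors give the multi-direction mode.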
205, when a video recording end instruction is received, the terminal generates a video file that includes the text data.
During video recording, the terminal collects the user's voice signal in real time and displays the text data on the video recording interface using the method above. Because converting a voice signal into text takes a certain amount of time after the signal is collected, the text data displayed on the video recording interface lags behind the user's voice during recording, so the text shown on the video recording interface does not match the voice signal currently being collected. For this reason, when the video recording end instruction is received, the terminal aligns the text data with the video images to generate the video file that includes the text data.
The terminal generates the video file that includes the text data through the following steps:
2051, the terminal obtains the recording time of the audio data corresponding to the text data.
The terminal may obtain the recording time of the audio data corresponding to the text data in ways that include, but are not limited to, the following two:
In the first way, the text data has a data index. When obtaining the recording time of the audio data corresponding to the text data, the terminal may use the product of the data index of the text data and the first preset duration as that recording time. For example, if the data index of the text data is 2 and the first preset duration is 100 milliseconds, the terminal multiplies the two and obtains a recording time of 200 milliseconds for the audio data corresponding to the text data.
It should be noted that the recording time described in this step is a time relative to the moment at which subtitle addition was turned on; to determine the recording time of the audio data corresponding to the text data, the time of the subtitle-addition moment needs to be added.
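The first way amounts to a single multiplication plus an offset, as in the following sketch. The function and parameter names are hypothetical; the 100-millisecond default is the example value of the first preset duration used above.

```python
FIRST_PRESET_MS = 100  # example value of the first preset duration


def audio_recording_time_ms(data_index, subtitle_added_at_ms=0,
                            first_preset_ms=FIRST_PRESET_MS):
    """Recording time of the audio that produced the text with this data index.

    The product index * duration is relative to the moment subtitle addition
    was turned on, so that moment is added to obtain the final time.
    """
    return subtitle_added_at_ms + data_index * first_preset_ms
```

For a data index of 2 this gives 200 milliseconds relative to the subtitle-addition moment, which matches the example above.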
In the second way, when obtaining the recording time of the audio data corresponding to the text data, the terminal may obtain the receiving time of the text data and use the difference between the receiving time and the third preset duration as the recording time of the text data.
The third preset duration is the time from when the audio data is sent from the terminal to the server until the converted text data is returned to the terminal. It is determined by the processing time of the server and the network state, and may be 500 milliseconds, 600 milliseconds, 800 milliseconds and so on. For example, if the receiving time of the text data is 10:01:00 (10 minutes 1 second) and the third preset duration is 800 milliseconds, the recording time of the text data is 10:00:200 (10 minutes 0 seconds 200 milliseconds).
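The second way can be sketched as a subtraction on millisecond timestamps. The sketch assumes that the example timestamps above are written as minutes:seconds:milliseconds, which is the reading under which the example result is consistent; the function names are hypothetical.

```python
def recording_time_ms(received_at_ms, third_preset_ms=800):
    """Recording time = receiving time minus the round-trip (third preset) duration."""
    return received_at_ms - third_preset_ms


def format_mm_ss_ms(ms):
    """Format a millisecond count as minutes:seconds:milliseconds."""
    minutes, rest = divmod(ms, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{minutes:02d}:{seconds:02d}:{millis:03d}"
```

A receiving time of 10 minutes 1 second is 601000 milliseconds; subtracting 800 milliseconds and formatting the result reproduces the value given in the example.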
2052, the terminal merges the text data with the video images according to the recording time of the audio data and the recording time of each frame of video image, to obtain the video file.
During the merging of the text data and the video images, the terminal obtains one frame of video image at a time in recording-time order, and, according to the recording time of that frame and the recording times of the text data not yet merged, merges with that frame the unmerged text data whose recording time is earlier than the recording time of that frame. Because the text data obtained during video recording has a particular display effect, the display data corresponding to the selected subtitle type option needs to be packaged together with the text data and the video images. In this way, when the terminal or another terminal plays the video file, it can not only display text synchronized with the user's voice, but also display that text with the display effect corresponding to the subtitle type option the user selected, which makes the interaction more entertaining.
During the merging of the text data and the video images, if no further video image can be obtained, the merging of the text data with the video images is complete. If no unmerged text data with a recording time earlier than that of the current frame can be obtained, the user was not speaking when the frame was recorded or had turned off the subtitle addition function; in this case the frame is added to the video file as it is.
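The merging in step 2052 can be sketched as a single pass over the frames in recording-time order. The data layout (frame times in milliseconds, text items as `(time_ms, text)` pairs) and the function name are assumptions made for illustration. The sketch attaches text whose recording time is not later than the frame time; the description uses both "earlier than" and "not later than", so the treatment of equal times is an assumption.

```python
def merge_text_with_frames(frame_times_ms, texts):
    """Attach each unmerged text item to the first frame recorded at or after it.

    frame_times_ms: frame recording times in ascending order.
    texts: iterable of (recording_time_ms, text) pairs.
    Returns a list of (frame_time_ms, [texts merged into that frame]).
    """
    pending = sorted(texts)
    next_text = 0
    merged = []
    for frame_time in frame_times_ms:
        attached = []
        # Consume every not-yet-merged text recorded no later than this frame.
        while next_text < len(pending) and pending[next_text][0] <= frame_time:
            attached.append(pending[next_text][1])
            next_text += 1
        # A frame with no matching text is still added, unchanged.
        merged.append((frame_time, attached))
    return merged
```

Frames with an empty list correspond to the case in which the user was not speaking or had turned off subtitle addition.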
Figure 13 is a schematic diagram of the process of aligning the text data with the video file. Referring to Figure 13, when text data sent by the server is received, the terminal determines from the data index of the text data whether that text data is already stored locally. If the data index is stored locally, the terminal updates the locally stored text data with the received text data. If the data index is not stored locally, the terminal determines the recording time of the audio data corresponding to the text data from the data index and the first preset duration of 100 milliseconds, and then saves the text data locally. The terminal then draws the text data on the video recording interface according to the subtitle type option. When the video recording end instruction is received, the terminal generates a temporary video file A and extracts video images from video file A one frame at a time in chronological order. If a video image cannot be extracted, the terminal can send video file A. If a video image is extracted successfully, the terminal obtains from the locally cached text data the text data whose recording time is not later than that of the frame, and merges the obtained text data with the frame to obtain a new video file B; if there is no text data whose recording time is not later than that of the frame, the terminal adds the frame to video file B as it is.
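The update-or-store branch at the start of the Figure 13 flow can be sketched with a dictionary keyed by data index. The cache layout, the function name and the field names are hypothetical; the 100-millisecond default is the first preset duration named above.

```python
def on_text_received(cache, data_index, text, first_preset_ms=100):
    """Update the cached text for a known index, or store a new entry.

    `cache` maps data index -> {"text": ..., "recorded_at_ms": ...}.
    """
    entry = cache.get(data_index)
    if entry is not None:
        # Index already stored locally: refresh the text, keep the recording time.
        entry["text"] = text
    else:
        # New index: derive the recording time from the index, then store.
        cache[data_index] = {
            "text": text,
            "recorded_at_ms": data_index * first_preset_ms,
        }
    return cache
```

Updating in place suits a server that returns successively refined recognition results for the same stretch of audio; that behaviour of the server is an assumption here, not something the description spells out.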
After generating the video file that includes the text data, the terminal may send the video file directly to other terminals, or may first preview the recorded video file with a preview function before sending it. While previewing the recorded video file, the terminal may add special effect elements to it. The specific process is as follows: when a preview instruction for the video file is received, the terminal displays a preview interface that includes at least one special effect element addition option, each of which corresponds to one special effect element. The special effect elements include at least one of graffiti, mosaic, emoticons and the like. When a selection operation on any special effect element addition option is detected, the terminal receives a special effect element addition instruction, adds the special effect element corresponding to the selected option to the video file, and displays the video image with the added special effect element on the preview interface. Figure 14 is a schematic diagram of the preview interface of the terminal. Referring to Figure 14, three special effect elements (graffiti, mosaic and emoticon) are shown on the video image played on the preview interface. When the special effect element addition instruction is received, the terminal sends the video file with the added special effect element to other terminals. The other terminals receive the video file and play it, with its text data and special effect elements, to interact with the recording user.
In the method provided by the embodiment of the present invention, audio data collected during the recording of a video file is converted into text data, and the converted text data is displayed on the video recording interface. The user does not need to produce a subtitle file in advance, which saves a large amount of resources and simplifies production. Subtitles can be added while the video file is being recorded, and the text data can be aligned with the video images, so that video recording is more real-time. In addition, colorful text can be rendered in the display form selected by the user, which enriches the video content and makes the video more entertaining.
Referring to Figure 15, an embodiment of the present invention provides a device for recording a video file, the device including:
a display module 1501, configured to display a video recording interface during video recording, the video recording interface including a subtitle addition option;
an obtaining module 1502, configured to, when a subtitle addition instruction is received, obtain the text data converted from the audio data collected during video recording;
the display module 1501, further configured to display the text data on the video recording interface;
a generation module 1503, configured to generate, when a video recording end instruction is received, a video file that includes the text data.
In another embodiment of the present invention, the generation module 1503 is configured to obtain the recording time of the audio data corresponding to the text data, and to merge the text data with the video images according to the recording time of the audio data and the recording time of each frame of video image, to obtain the video file.
In another embodiment of the present invention, the text data has a data index, and the obtaining module 1502 is configured to use the product of the data index of the text data and the first preset duration as the recording time of the audio data corresponding to the text data.
In another embodiment of the present invention, the obtaining module 1502 is configured to obtain the receiving time of the text data, and to use the difference between the receiving time of the text data and the third preset duration as the recording time of the text data, the third preset duration being determined according to the sending time of the audio data and the receiving time of the text data.
In another embodiment of the present invention, the generation module 1503 is configured to, for any frame of video image, merge the video image with the text data according to the recording time of the video image and the recording times of the text data not yet merged.
In another embodiment of the present invention, the obtaining module 1502 is configured to resample the collected audio signal at a preset sampling rate during video recording to obtain at least one frame of audio data, and, when the at least one frame of audio data meets a preset condition, to convert the at least one frame of audio data to obtain the text data.
In another embodiment of the present invention, the obtaining module 1502 is configured to obtain the volume value of each frame of audio data; when the volume value of any frame of audio data is greater than a specified threshold, to store that audio data; and when the total duration of the stored audio data reaches the first preset duration, to send the audio data whose total duration is the first preset duration to the server.
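The volume gating and batching performed by the obtaining module can be sketched as follows. The class name `AudioBatcher`, the `send` callback and the assumption that the frame duration divides the first preset duration evenly are all hypothetical and are used only for illustration.

```python
class AudioBatcher:
    """Stores frames louder than a threshold and sends them in fixed-length batches."""

    def __init__(self, send, threshold, first_preset_ms=100):
        self.send = send                    # callback that uploads one batch
        self.threshold = threshold          # specified volume threshold
        self.first_preset_ms = first_preset_ms
        self.frames = []
        self.stored_ms = 0

    def push(self, frame, volume, frame_ms):
        # Only frames whose volume is greater than the threshold are stored.
        if volume <= self.threshold:
            return
        self.frames.append(frame)
        self.stored_ms += frame_ms
        # Once the stored audio reaches the first preset duration, send it.
        if self.stored_ms >= self.first_preset_ms:
            self.send(self.frames)
            self.frames = []
            self.stored_ms = 0
```

With 20-millisecond frames and a 100-millisecond first preset duration, every fifth stored frame triggers one upload, and quiet frames are skipped without resetting the batch.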
In another embodiment of the present invention, the device further includes:
a sending module, configured to send heartbeat data to the server when the volume values of the audio data are all less than the specified threshold within a second preset duration, the heartbeat data being used to maintain the heartbeat connection with the server, and the second preset duration being greater than the first preset duration.
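The condition under which the sending module emits heartbeat data can be sketched as a counter of continuous silence. The class name, the method name and the choice to restart the counter after each heartbeat are assumptions made for illustration.

```python
class SilenceMonitor:
    """Signals when a heartbeat should be sent after sustained silence."""

    def __init__(self, threshold, second_preset_ms):
        self.threshold = threshold
        self.second_preset_ms = second_preset_ms
        self.silent_ms = 0

    def on_frame(self, volume, frame_ms):
        """Return True if heartbeat data should be sent after this frame."""
        if volume < self.threshold:
            self.silent_ms += frame_ms
            if self.silent_ms >= self.second_preset_ms:
                self.silent_ms = 0  # start counting the next silent interval
                return True
        else:
            # Any frame at or above the threshold interrupts the silent interval.
            self.silent_ms = 0
        return False
```

Because the second preset duration is greater than the first preset duration, a heartbeat is sent only when no audio batch would have been uploaded in that interval anyway.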
In another embodiment of the present invention, the text data has a data index, and the device further includes:
an update module, configured to, when the data index is already stored locally, update the text data corresponding to the data index according to the received text data;
a storage module, configured to store the data index and the text data when the data index is not stored locally.
In another embodiment of the present invention, the display module 1501 is configured to display at least one subtitle type option when a subtitle type selection instruction is received, each subtitle type option corresponding to one subtitle display form; and, when a text data acquisition instruction is received, to perform the step of obtaining the text data.
In another embodiment of the present invention, the subtitle type options include a first type option, and the display module 1501 is configured to display the text data on the video recording interface in barrage form when the selected subtitle type option is the first type option; or,
the subtitle type options include a second type option, the second type option includes a designated picture, and the display module 1501 is configured to display, in barrage form on the video recording interface, the process of the text data moving along with the designated picture when the selected subtitle type option is the second type option; or,
the subtitle type options include a third type option, and the display module 1501 is configured to display, on the video recording interface, the process of the text in the text data moving outward from a preset position as the center when the selected subtitle type option is the third type option.
In another embodiment of the present invention, the display module 1501 is configured to display a preview interface when a preview instruction for the video file is received, the preview interface including at least one special effect element addition option, each corresponding to one special effect element; and, when a special effect element addition instruction is received, to add the special effect element corresponding to the selected special effect element addition option to the video file and display the video image with the added special effect element on the preview interface.
In summary, in the device provided by the embodiment of the present invention, audio data collected during the recording of a video file is converted into text data, and the converted text data is displayed on the video recording interface. The user does not need to produce a subtitle file in advance, which saves a large amount of resources and simplifies production. Subtitles can be added while the video file is being recorded, and the text data can be aligned with the video images, so that video recording is more real-time. In addition, colorful text can be rendered in the display form selected by the user, which enriches the video content and makes the video more entertaining.
Figure 16 is a structural block diagram of a terminal 1600 provided by an exemplary embodiment of the present invention. The terminal 1600 may be a smartphone, a tablet computer, an MP3 player (Moving Picture Experts Group Audio Layer III), an MP4 player (Moving Picture Experts Group Audio Layer IV), a laptop computer or a desktop computer. The terminal 1600 may also be referred to by other names such as user equipment, portable terminal, laptop terminal or desktop terminal.
Generally, the terminal 1600 includes a processor 1601 and a memory 1602.
The processor 1601 may include one or more processing cores, for example a 4-core processor or an 8-core processor. The processor 1601 may be implemented in at least one hardware form among DSP (Digital Signal Processing), FPGA (Field-Programmable Gate Array) and PLA (Programmable Logic Array). The processor 1601 may also include a main processor and a coprocessor. The main processor, also called a CPU (Central Processing Unit), processes data in the awake state; the coprocessor is a low-power processor that processes data in the standby state. In some embodiments, the processor 1601 may be integrated with a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content to be displayed on the display screen. In some embodiments, the processor 1601 may also include an AI (Artificial Intelligence) processor, which handles computing operations related to machine learning.
The memory 1602 may include one or more computer-readable storage media, which may be non-transitory. The memory 1602 may also include high-speed random access memory and non-volatile memory, such as one or more magnetic disk storage devices or flash memory devices. In some embodiments, the non-transitory computer-readable storage medium in the memory 1602 stores at least one instruction, which is executed by the processor 1601 to implement the method for recording a video file provided in the method embodiments of this application.
In some embodiments, the terminal 1600 optionally further includes a peripheral device interface 1603 and at least one peripheral device. The processor 1601, the memory 1602 and the peripheral device interface 1603 may be connected by a bus or signal lines. Each peripheral device may be connected to the peripheral device interface 1603 by a bus, a signal line or a circuit board. Specifically, the peripheral devices include at least one of a radio frequency circuit 1604, a touch display screen 1605, a camera assembly 1606, an audio circuit 1607, a positioning component 1608 and a power supply 1609.
The peripheral device interface 1603 may be used to connect at least one I/O (Input/Output) related peripheral device to the processor 1601 and the memory 1602. In some embodiments, the processor 1601, the memory 1602 and the peripheral device interface 1603 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 1601, the memory 1602 and the peripheral device interface 1603 may be implemented on a separate chip or circuit board, which is not limited in this embodiment.
The radio frequency circuit 1604 is used to receive and transmit RF (Radio Frequency) signals, also called electromagnetic signals. The radio frequency circuit 1604 communicates with communication networks and other communication devices through electromagnetic signals. It converts electrical signals into electromagnetic signals for transmission, or converts received electromagnetic signals into electrical signals. Optionally, the radio frequency circuit 1604 includes an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card and so on. The radio frequency circuit 1604 may communicate with other terminals through at least one wireless communication protocol. The wireless communication protocols include, but are not limited to, metropolitan area networks, mobile communication networks of each generation (2G, 3G, 4G and 5G), wireless local area networks and/or WiFi (Wireless Fidelity) networks. In some embodiments, the radio frequency circuit 1604 may also include circuits related to NFC (Near Field Communication), which is not limited in this application.
The display screen 1605 is used to display a UI (User Interface). The UI may include graphics, text, icons, video and any combination thereof. When the display screen 1605 is a touch display screen, it is also capable of collecting touch signals on or above its surface. A touch signal may be input to the processor 1601 as a control signal for processing. In this case, the display screen 1605 may also be used to provide virtual buttons and/or a virtual keyboard, also called soft buttons and/or a soft keyboard. In some embodiments, there may be one display screen 1605, arranged on the front panel of the terminal 1600; in other embodiments, there may be at least two display screens 1605, arranged on different surfaces of the terminal 1600 or in a folding design; in still other embodiments, the display screen 1605 may be a flexible display screen arranged on a curved surface or a folding surface of the terminal 1600. The display screen 1605 may even be set in a non-rectangular irregular shape, that is, a special-shaped screen. The display screen 1605 may be made of materials such as LCD (Liquid Crystal Display) or OLED (Organic Light-Emitting Diode).
The camera assembly 1606 is used to capture images or video. Optionally, the camera assembly 1606 includes a front camera and a rear camera. Generally, the front camera is arranged on the front panel of the terminal and the rear camera is arranged on the back of the terminal. In some embodiments, there are at least two rear cameras, each being any one of a main camera, a depth-of-field camera, a wide-angle camera and a telephoto camera, so that the main camera and the depth-of-field camera can be fused to realize a background blurring function, and the main camera and the wide-angle camera can be fused to realize panoramic shooting and VR (Virtual Reality) shooting functions or other fused shooting functions. In some embodiments, the camera assembly 1606 may also include a flash. The flash may be a single-color-temperature flash or a dual-color-temperature flash. A dual-color-temperature flash is a combination of a warm-light flash and a cold-light flash, and can be used for light compensation at different color temperatures.
The audio circuit 1607 may include a microphone and a speaker. The microphone is used to collect sound waves from the user and the environment, convert them into electrical signals, and input them to the processor 1601 for processing or to the radio frequency circuit 1604 to realize voice communication. For stereo capture or noise reduction, there may be multiple microphones arranged at different parts of the terminal 1600. The microphone may also be an array microphone or an omnidirectional microphone. The speaker is used to convert electrical signals from the processor 1601 or the radio frequency circuit 1604 into sound waves. The speaker may be a conventional thin-film speaker or a piezoelectric ceramic speaker. A piezoelectric ceramic speaker can convert electrical signals not only into sound waves audible to humans, but also into sound waves inaudible to humans for purposes such as ranging. In some embodiments, the audio circuit 1607 may also include a headphone jack.
The positioning component 1608 is used to locate the current geographic position of the terminal 1600 to implement navigation or LBS (Location Based Service). The positioning component 1608 may be a positioning component based on the GPS (Global Positioning System) of the United States, the BeiDou system of China, the GLONASS system of Russia or the Galileo system of the European Union.
The power supply 1609 is used to supply power to the components in the terminal 1600. The power supply 1609 may be alternating current, direct current, a disposable battery or a rechargeable battery. When the power supply 1609 includes a rechargeable battery, the rechargeable battery may support wired charging or wireless charging, and may also support fast charging technology.
In some embodiments, the terminal 1600 further includes one or more sensors 1610. The one or more sensors 1610 include, but are not limited to, an acceleration sensor 1611, a gyroscope sensor 1612, a pressure sensor 1613, a fingerprint sensor 1614, an optical sensor 1615 and a proximity sensor 1616.
The acceleration sensor 1611 can detect the magnitude of acceleration on the three coordinate axes of the coordinate system established with the terminal 1600. For example, the acceleration sensor 1611 may be used to detect the components of gravitational acceleration on the three coordinate axes. The processor 1601 may control the touch display screen 1605 to display the user interface in landscape view or portrait view according to the gravitational acceleration signal collected by the acceleration sensor 1611. The acceleration sensor 1611 may also be used to collect motion data of a game or of the user.
The gyroscope sensor 1612 can detect the body orientation and rotation angle of the terminal 1600, and can cooperate with the acceleration sensor 1611 to collect the user's 3D actions on the terminal 1600. Based on the data collected by the gyroscope sensor 1612, the processor 1601 can implement the following functions: motion sensing (for example, changing the UI according to the user's tilt operation), image stabilization during shooting, game control and inertial navigation.
The pressure sensor 1613 may be arranged on the side frame of the terminal 1600 and/or in the lower layer of the touch display screen 1605. When the pressure sensor 1613 is arranged on the side frame of the terminal 1600, it can detect the user's grip signal on the terminal 1600, and the processor 1601 performs left-hand/right-hand recognition or shortcut operations according to the grip signal collected by the pressure sensor 1613. When the pressure sensor 1613 is arranged in the lower layer of the touch display screen 1605, the processor 1601 controls the operable controls on the UI according to the user's pressure operation on the touch display screen 1605. The operable controls include at least one of a button control, a scroll bar control, an icon control and a menu control.
The fingerprint sensor 1614 is used to collect the user's fingerprint. The processor 1601 identifies the user according to the fingerprint collected by the fingerprint sensor 1614, or the fingerprint sensor 1614 identifies the user according to the collected fingerprint. When the user's identity is recognized as a trusted identity, the processor 1601 authorizes the user to perform relevant sensitive operations, which include unlocking the screen, viewing encrypted information, downloading software, making payments, changing settings and so on. The fingerprint sensor 1614 may be arranged on the front, back or side of the terminal 1600. When a physical button or a manufacturer logo is provided on the terminal 1600, the fingerprint sensor 1614 may be integrated with the physical button or the manufacturer logo.
Optical sensor 1615 is for acquiring ambient light intensity.In one embodiment, processor 1601 can be according to light
The ambient light intensity that sensor 1615 acquires is learned, the display brightness of touch display screen 1605 is controlled.Specifically, work as ambient light intensity
When higher, the display brightness of touch display screen 1605 is turned up;When ambient light intensity is lower, the aobvious of touch display screen 1605 is turned down
Show brightness.In another embodiment, the ambient light intensity that processor 1601 can also be acquired according to optical sensor 1615, is moved
The acquisition parameters of state adjustment CCD camera assembly 1606.
The proximity sensor 1616, also called a distance sensor, is generally arranged on the front panel of the terminal 1600. The proximity sensor 1616 is used to collect the distance between the user and the front of the terminal 1600. In one embodiment, when the proximity sensor 1616 detects that the distance between the user and the front of the terminal 1600 gradually decreases, the processor 1601 controls the touch display screen 1605 to switch from the screen-on state to the screen-off state; when the proximity sensor 1616 detects that the distance between the user and the front of the terminal 1600 gradually increases, the processor 1601 controls the touch display screen 1605 to switch from the screen-off state to the screen-on state.
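The switching behaviour described for the proximity sensor can be sketched as follows. This is a minimal illustration only: the function name, the state labels "on" and "off", and the comparison of two consecutive distance readings are assumptions, not details given in the embodiment.

```python
def next_screen_state(current_state: str, previous_distance: float, distance: float) -> str:
    """Return the screen state after a new proximity reading.

    Assumption: "gradually decreasing" and "gradually increasing" are
    approximated by comparing two consecutive readings.
    """
    if distance < previous_distance:
        return "off"  # user is approaching the front panel
    if distance > previous_distance:
        return "on"   # user is moving away from the front panel
    return current_state  # distance unchanged: keep the current state
```

A real terminal would more likely smooth several readings or apply a threshold before switching, to avoid flicker from sensor noise.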
Those skilled in the art will understand that the structure shown in Figure 16 does not limit the terminal 1600, which may include more or fewer components than illustrated, combine certain components, or adopt a different component arrangement.
The terminal provided by the embodiment of the present invention converts the audio data collected during video file recording into text data and displays the converted text data on the video recording interface. This process does not require the user to produce a subtitle file in advance, which saves considerable resources and simplifies production. Subtitles can be added during video file recording, and the text data can be aligned with the video images, which improves the real-time performance of the recording process. In addition, colorful stylized text can be rendered according to the display form selected by the user, enriching the video content and making the video more entertaining.
An embodiment of the present invention provides a computer-readable storage medium. The storage medium stores at least one instruction, at least one program, a code set, or an instruction set, and the at least one instruction, the at least one program, the code set, or the instruction set is loaded and executed by a processor to implement the video file recording method shown in Fig. 2.
The computer-readable storage medium provided by the embodiment of the present invention converts the audio data collected during video file recording into text data and displays the converted text data on the video recording interface. This process does not require the user to produce a subtitle file in advance, which saves considerable resources and simplifies production. Subtitles can be added during video file recording, and the text data can be aligned with the video images, which improves the real-time performance of the recording process. In addition, colorful stylized text can be rendered according to the display form selected by the user, enriching the video content and making the video more entertaining.
It should be understood that, when the video file recording device provided by the above embodiment records a video file, the division into the above functional modules is used only as an example. In practical applications, the above functions may be assigned to different functional modules as needed; that is, the internal structure of the video file recording device may be divided into different functional modules to complete all or part of the functions described above. In addition, the video file recording device provided by the above embodiment and the video file recording method embodiments belong to the same concept; for the specific implementation process, see the method embodiments, which are not repeated here.
Those of ordinary skill in the art will understand that all or part of the steps of the above embodiments may be completed by hardware, or by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, and the storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit the invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.
Claims (15)
1. A video file recording method, characterized in that the method comprises:
displaying a video recording interface during video recording, the video recording interface including a subtitle addition option;
when a subtitle addition instruction is received, obtaining, according to audio data collected during video recording, text data converted from the audio data;
displaying the text data on the video recording interface;
when a video recording end instruction is received, generating a video file including the text data.
2. The method according to claim 1, characterized in that generating the video file including the text data when the video recording end instruction is received comprises:
obtaining the recording time of the audio data corresponding to the text data;
merging the text data with the video images according to the recording time of the audio data and the recording time of each frame of video image, to obtain the video file.
3. The method according to claim 2, characterized in that the text data has a data index, and obtaining the recording time of the audio data corresponding to the text data comprises:
taking the product of the data index of the text data and a first preset duration as the recording time of the audio data corresponding to the text data.
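As an illustration of the computation in claim 3, a minimal Python sketch follows. The function name and the choice of seconds as the time unit are assumptions made for illustration; the claim does not fix a value for the first preset duration.

```python
def recording_time_from_index(data_index: int, first_preset_duration_s: float) -> float:
    """Recording time of the audio behind a text result, in seconds.

    Assumption: audio is converted in consecutive chunks of
    `first_preset_duration_s` seconds, and `data_index` counts those
    chunks starting from 0.
    """
    if data_index < 0:
        raise ValueError("data_index must be non-negative")
    return data_index * first_preset_duration_s
```

For example, with a hypothetical first preset duration of 2 seconds, the text result with data index 3 maps to a recording time of 6 seconds.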
4. The method according to claim 2, characterized in that obtaining the recording time of the audio data corresponding to the text data comprises:
obtaining the receiving time of the text data;
taking the difference between the receiving time of the text data and a third preset duration as the recording time of the text data, the third preset duration being determined according to the sending time of the audio data and the receiving time of the text data.
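The subtraction in claim 4 can be sketched as below. The claim states only that the third preset duration is determined from the sending time of the audio data and the receiving time of the text data; averaging the observed delays, as done here, is an assumption, and both function names are hypothetical.

```python
from typing import Sequence


def estimate_third_preset_duration(send_times_s: Sequence[float],
                                   receive_times_s: Sequence[float]) -> float:
    """Estimate the conversion delay as the mean of (receive - send).

    Assumption: averaging is one possible way to derive the duration;
    the claim does not specify the method.
    """
    if not send_times_s or len(send_times_s) != len(receive_times_s):
        raise ValueError("need equally sized, non-empty time lists")
    delays = [r - s for s, r in zip(send_times_s, receive_times_s)]
    return sum(delays) / len(delays)


def recording_time_from_receipt(receive_time_s: float,
                                third_preset_duration_s: float) -> float:
    """Recording time = receiving time of the text minus the preset delay."""
    return receive_time_s - third_preset_duration_s
```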
5. The method according to claim 2, characterized in that merging the text data with the video images according to the recording time of the audio data and the recording time of each frame of video image comprises:
for any frame of video image, merging the video image with the text data according to the recording time of the video image and the recording time of the text data that has not yet been merged.
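One way to picture the per-frame merge of claim 5 is the sketch below. The merge policy (each frame carries the most recent text whose recording time is not later than the frame's recording time) is an assumption, as are the `TextEntry` type and the function name; the claim itself leaves the policy open.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass
class TextEntry:
    recording_time_s: float  # recording time of the corresponding audio
    text: str


def merge_text_with_frames(frame_times_s: Sequence[float],
                           entries: Sequence[TextEntry]
                           ) -> List[Tuple[float, Optional[str]]]:
    """Pair each frame time with the subtitle text to draw on that frame.

    Assumes `frame_times_s` is sorted in ascending order.
    """
    pending = sorted(entries, key=lambda e: e.recording_time_s)
    merged: List[Tuple[float, Optional[str]]] = []
    current: Optional[str] = None
    i = 0
    for t in frame_times_s:
        # Consume every not-yet-merged entry that starts at or before this frame.
        while i < len(pending) and pending[i].recording_time_s <= t:
            current = pending[i].text
            i += 1
        merged.append((t, current))
    return merged
```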
6. The method according to claim 1, characterized in that obtaining the text data converted from the audio data comprises:
during video recording, resampling the collected audio signal at a preset sample rate to obtain at least one frame of audio data;
when the at least one frame of audio data meets a preset condition, converting the at least one frame of audio data to obtain the text data.
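The resampling step of claim 6 can be illustrated with a simple linear-interpolation resampler. This is a sketch under stated assumptions: the claim does not name a resampling algorithm, the function name is hypothetical, and no anti-aliasing filter is applied, which a production implementation would normally need when reducing the sample rate.

```python
from typing import List, Sequence


def resample_linear(samples: Sequence[float], src_rate: int, dst_rate: int) -> List[float]:
    """Resample `samples` from `src_rate` Hz to `dst_rate` Hz by linear interpolation."""
    if src_rate <= 0 or dst_rate <= 0:
        raise ValueError("sample rates must be positive")
    if not samples:
        return []
    n_out = int(len(samples) * dst_rate / src_rate)
    out: List[float] = []
    for i in range(n_out):
        pos = i * src_rate / dst_rate          # position in the source signal
        left = int(pos)
        right = min(left + 1, len(samples) - 1)
        frac = pos - left
        out.append(samples[left] * (1.0 - frac) + samples[right] * frac)
    return out
```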
7. The method according to claim 6, characterized in that converting the at least one frame of audio data to obtain the text data when the at least one frame of audio data meets the preset condition comprises:
obtaining the volume value of each frame of audio data;
when the volume value of any frame of audio data is greater than a specified threshold, storing that frame of audio data;
when the total duration of the stored audio data reaches a first preset duration, converting the audio data whose total duration is the first preset duration to obtain the text data.
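The gating and buffering steps of claim 7 can be sketched as follows. Using the root-mean-square amplitude as the volume value, a fixed frame duration, and the class and method names are all assumptions for illustration; the claim does not define how the volume value is computed.

```python
import math
from typing import List, Optional, Sequence


def rms_volume(frame: Sequence[float]) -> float:
    """Volume value of one frame; RMS is an assumed definition."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


class SpeechBuffer:
    """Stores loud frames and releases them once enough audio has accumulated."""

    def __init__(self, frame_duration_s: float,
                 first_preset_duration_s: float,
                 volume_threshold: float) -> None:
        self.frame_duration_s = frame_duration_s
        self.first_preset_duration_s = first_preset_duration_s
        self.volume_threshold = volume_threshold
        self._frames: List[List[float]] = []

    def push(self, frame: Sequence[float]) -> Optional[List[List[float]]]:
        """Return the frames ready for conversion, or None if not enough yet."""
        if rms_volume(frame) <= self.volume_threshold:
            return None  # frame is not loud enough to be stored
        self._frames.append(list(frame))
        if len(self._frames) * self.frame_duration_s >= self.first_preset_duration_s:
            ready, self._frames = self._frames, []
            return ready
        return None
```

The frames returned by `push` would then be handed to whatever speech-to-text conversion the terminal uses; that call is outside this sketch.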
8. The method according to claim 6, characterized in that the method further comprises:
when the volume values of the audio data are all less than the specified threshold within a second preset duration, sending heartbeat data to the server, the heartbeat data being used to maintain the heartbeat connection with the server, and the second preset duration being greater than the first preset duration.
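The silence-triggered heartbeat of claim 8 can be sketched as a small counter. The class name, the fixed frame duration, and the decision to reset the counter after each heartbeat are assumptions; the actual transmission to the server is left out and represented only by the boolean return value.

```python
class SilenceHeartbeat:
    """Signals when a heartbeat should be sent after a stretch of quiet frames."""

    def __init__(self, frame_duration_s: float, second_preset_duration_s: float) -> None:
        self.frame_duration_s = frame_duration_s
        self.second_preset_duration_s = second_preset_duration_s
        self._silent_frames = 0

    def observe(self, frame_is_loud: bool) -> bool:
        """Return True when the caller should send heartbeat data now."""
        if frame_is_loud:
            self._silent_frames = 0  # speech resumed; restart the silence timer
            return False
        self._silent_frames += 1
        if self._silent_frames * self.frame_duration_s >= self.second_preset_duration_s:
            self._silent_frames = 0  # assumed: restart timing after each heartbeat
            return True
        return False
```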
9. The method according to claim 1, characterized in that the text data has a data index, and after obtaining the text data, the method further comprises:
when the data index is already stored locally, updating the text data corresponding to the data index according to the text data;
when the data index is not stored locally, storing the data index and the text data.
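The update-or-store rule of claim 9 maps naturally onto a dictionary keyed by data index, as in the sketch below. The function name, the in-memory dictionary as the local store, and the returned status strings are assumptions for illustration.

```python
from typing import Dict


def store_text(local_store: Dict[int, str], data_index: int, text: str) -> str:
    """Insert or update the text for `data_index`; report which case applied."""
    if data_index in local_store:
        local_store[data_index] = text  # index known: replace the earlier text
        return "updated"
    local_store[data_index] = text      # index unknown: store index and text
    return "stored"
```

Under this rule, a later result carrying the same data index replaces the earlier text rather than adding a second entry.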
10. The method according to claim 1, characterized in that the method further comprises:
when a subtitle type selection instruction is received, displaying at least one subtitle type option, each subtitle type option corresponding to a subtitle display form;
when a text data acquisition instruction is received, performing the step of obtaining the text data.
11. The method according to claim 10, characterized in that the subtitle type option includes a first type option, and displaying the text data on the video recording interface comprises:
when the selected subtitle type option is the first type option, displaying the text data on the video recording interface in bullet-screen (barrage) form; or,
the subtitle type option includes a second type option, the second type option includes a designated picture, and displaying the text data on the video recording interface comprises:
when the selected subtitle type option is the second type option, displaying on the video recording interface, in bullet-screen form, the process of the text data moving along with the designated picture; or,
the subtitle type option includes a third type option, and displaying the text data on the video recording interface comprises:
when the selected subtitle type option is the third type option, displaying on the video recording interface the process of the characters in the text data moving around a preset position as the center.
12. The method according to any one of claims 1 to 11, characterized in that, after generating the video file including the text data when the video recording end instruction is received, the method further comprises:
when a preview instruction for the video file is received, displaying a preview interface, the preview interface including at least one special-effect element addition option, each special-effect element addition option corresponding to a special-effect element;
when a special-effect element addition instruction is received, adding the special-effect element corresponding to the selected addition option to the video file, and displaying the video images with the special-effect element added on the preview interface.
13. A video file recording device, characterized in that the device comprises:
a display module, configured to display a video recording interface during video recording, the video recording interface including a subtitle addition option;
an obtaining module, configured to, when a subtitle addition instruction is received, obtain, according to audio data collected during video recording, text data converted from the audio data;
the display module being further configured to display the text data on the video recording interface;
a generation module, configured to generate a video file including the text data when a video recording end instruction is received.
14. A terminal, characterized in that the terminal comprises a processor and a memory, the memory stores at least one instruction, at least one program, a code set, or an instruction set, and the at least one instruction, the at least one program, the code set, or the instruction set is loaded and executed by the processor to implement the video file recording method according to any one of claims 1 to 12.
15. A computer-readable storage medium, characterized in that the storage medium stores at least one instruction, at least one program, a code set, or an instruction set, and the at least one instruction, the at least one program, the code set, or the instruction set is loaded and executed by a processor to implement the video file recording method according to any one of claims 1 to 12.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810235113.8A CN110300274B (en) | 2018-03-21 | 2018-03-21 | Video file recording method, device and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810235113.8A CN110300274B (en) | 2018-03-21 | 2018-03-21 | Video file recording method, device and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110300274A (en) | 2019-10-01 |
CN110300274B (en) | 2022-05-10 |
Family
ID=68025346
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810235113.8A Active CN110300274B (en) | 2018-03-21 | 2018-03-21 | Video file recording method, device and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110300274B (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20020010589A1 (en) * | 2000-07-24 | 2002-01-24 | Tatsushi Nashida | System and method for supporting interactive operations and storage medium |
CN101382937A (en) * | 2008-07-01 | 2009-03-11 | 深圳先进技术研究院 | Multimedia resource processing method based on speech recognition and on-line teaching system thereof |
CN105812920A (en) * | 2016-03-14 | 2016-07-27 | 腾讯科技(深圳)有限公司 | Media information processing method and media information processing device |
CN106412645A (en) * | 2016-09-09 | 2017-02-15 | 广州酷狗计算机科技有限公司 | Method and apparatus for uploading video file to multimedia server |
CN107316642A (en) * | 2017-06-30 | 2017-11-03 | 联想(北京)有限公司 | Video file method for recording, audio file method for recording and mobile terminal |
CN107360460A (en) * | 2017-07-31 | 2017-11-17 | 深圳回收宝科技有限公司 | A kind of method, equipment and storage medium for detecting video addition captions |
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110888710A (en) * | 2019-12-05 | 2020-03-17 | 广州酷狗计算机科技有限公司 | Method and device for adding subtitles, computer equipment and storage medium |
CN112363899A (en) * | 2020-11-10 | 2021-02-12 | 杭州和利时自动化有限公司 | Operation replay method, device, equipment and computer readable storage medium |
CN112533052A (en) * | 2020-11-27 | 2021-03-19 | 北京字跳网络技术有限公司 | Video sharing method and device, electronic equipment and storage medium |
WO2022111375A1 (en) * | 2020-11-27 | 2022-06-02 | 北京字跳网络技术有限公司 | Video sharing method and apparatus, electronic device, and storage medium |
US11956531B2 (en) | 2020-11-27 | 2024-04-09 | Beijing Zitiao Network Technology Co., Ltd. | Video sharing method and apparatus, electronic device, and storage medium |
CN113784169A (en) * | 2021-09-10 | 2021-12-10 | 湖南快乐阳光互动娱乐传媒有限公司 | Video recording method and device with bullet screen |
CN113784169B (en) * | 2021-09-10 | 2023-06-27 | 湖南快乐阳光互动娱乐传媒有限公司 | Video recording method and device with barrage |
CN114157877A (en) * | 2021-10-08 | 2022-03-08 | 钉钉(中国)信息技术有限公司 | Playback data generation method and device, and playback method and device |
CN114157877B (en) * | 2021-10-08 | 2024-04-16 | 钉钉(中国)信息技术有限公司 | Playback data generation method and device, playback method and device |
CN114554246A (en) * | 2022-02-23 | 2022-05-27 | 北京纵横无双科技有限公司 | Medical science popularization video production method and system based on UGC mode |
CN114554246B (en) * | 2022-02-23 | 2024-05-31 | 北京纵横无双科技有限公司 | UGC mode-based medical science popularization video production method and system |
CN114745585A (en) * | 2022-04-06 | 2022-07-12 | Oppo广东移动通信有限公司 | Subtitle display method, device, terminal and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN110300274B (en) | 2022-05-10 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||