CN104731944A

CN104731944A - Video searching method and device

Info

Publication number: CN104731944A
Application number: CN201510148886.9A
Authority: CN
Inventors: 邹明双
Original assignee: Nubia Technology Co Ltd
Current assignee: Nubia Technology Co Ltd
Priority date: 2015-03-31
Filing date: 2015-03-31
Publication date: 2015-06-24

Abstract

The invention discloses a video searching method. The video searching method comprises the following steps: a video source file is decomposed into multiple video frames, and the time point corresponding to each video frame is recorded; each video frame is compared against preset search characteristics to obtain the set of time points corresponding to all video frames that contain the search characteristics; the time point set is divided into multiple time periods on the basis of preset conditions, and the corresponding video clips are captured from the video source file according to the divided time periods. The invention further discloses a video searching device. Because the user does not need to perform capture operations manually, and every video frame in the video source file is compared, video clips containing the search characteristics set by the user can be searched for and captured simply, rapidly and accurately.

Description

Video searching method and device

Technical field

The present invention relates to the field of video technology, and in particular to a video searching method and device.

Background technology

At present, a video file often contains video segments of scenes or persons that a user likes. If the user only wants to intercept and save those segments, the prior art allows video segments to be generated by manual interception in the video file using software. However, if the user wants to intercept every video segment in the file that contains the scenes or persons the user likes, the manual interception operation must be repeated many times, and the start and end positions of each interception must be located manually in the video file, which is complicated and time-consuming.

Summary of the invention

The main purpose of the present invention is to propose a video searching method and device that can search for and intercept qualifying video segments quickly and easily.

To achieve the above object, the present invention provides a video searching method comprising the following steps:

decomposing a video source file into individual video frames, and recording the time point corresponding to each video frame;

comparing each video frame against preset search characteristics to obtain the time point set corresponding to all video frames that contain the search characteristics;

dividing the time point set into several time periods based on a preset condition, and intercepting the corresponding video segments from the video source file according to the divided time periods.

Preferably, the step of dividing the time point set into several time periods based on a preset condition and intercepting the corresponding video segments from the video source file according to the divided time periods comprises:

assigning adjacent time points in the time point set whose difference is less than a first preset value to the same time period, and assigning adjacent time points whose difference is greater than the first preset value to different time periods;

taking the minimum time point in each time period as the start point and the maximum time point in each time period as the end point, and intercepting from the video source file the video segment between the start point and the end point.

Preferably, the step of dividing the time point set into several time periods based on a preset condition and intercepting the corresponding video segments from the video source file according to the divided time periods further comprises:

deleting any time period in which the difference between the maximum time point and the minimum time point is less than a second preset value.

Preferably, the search characteristics comprise preset text information, sound information and/or picture information.

Preferably, the step of comparing each video frame against the preset search characteristics to obtain the time point set corresponding to all video frames that contain the search characteristics comprises:

comparing each video frame against the preset search characteristics based on the scale-invariant feature transform (SIFT) algorithm, and obtaining the time point set formed by the time points corresponding to each video frame that contains the search characteristics.

In addition, to achieve the above object, the present invention also proposes a video searching device comprising:

a decomposing module, configured to decompose a video source file into individual video frames and to record the time point corresponding to each video frame;

a comparison module, configured to compare each video frame against preset search characteristics and to obtain the time point set corresponding to all video frames that contain the search characteristics;

a division and interception module, configured to divide the time point set into several time periods based on a preset condition and to intercept the corresponding video segments from the video source file according to the divided time periods.

Preferably, the division and interception module comprises:

a division unit, configured to assign adjacent time points in the time point set whose difference is less than a first preset value to the same time period, and to assign adjacent time points whose difference is greater than the first preset value to different time periods;

an interception unit, configured to take the minimum time point in each time period as the start point and the maximum time point in each time period as the end point, and to intercept from the video source file the video segment between the start point and the end point.

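Preferably, the division and interception module is further configured to delete any time period in which the difference between the maximum time point and the minimum time point is less than a second preset value.

Preferably, the comparison module is specifically configured to compare each video frame against the preset search characteristics based on the scale-invariant feature transform (SIFT) algorithm, and to obtain the time point set formed by the time points corresponding to each video frame that contains the search characteristics.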

The video searching method and device proposed by the present invention record the time point of each video frame in the video source file, compare each video frame against the preset search characteristics to obtain the time point set corresponding to all video frames that contain the search characteristics, and then divide the time point set into several time periods according to a preset condition, so that all video segments containing the search characteristics can be found in, and intercepted from, the video source file. Because the user does not need to perform interception operations manually, and every video frame in the video source file is compared, video segments containing the search characteristics set by the user can be searched for and intercepted simply, quickly and accurately.

Brief description of the drawings

Fig. 1 is a schematic diagram of the hardware structure of a terminal for implementing each embodiment of the present invention;

Fig. 2 is a flow chart of an embodiment of the video searching method of the present invention;

Fig. 3 is a detailed flow chart of step S30 in Fig. 2;

Fig. 4 is a functional block diagram of an embodiment of the video searching device of the present invention;

Fig. 5 is a detailed functional block diagram of the division and interception module 03 in Fig. 4.

The realization of the object, the functional characteristics and the advantages of the present invention will be further described with reference to the accompanying drawings in conjunction with the embodiments.

Embodiment

It should be understood that the specific embodiments described herein are only intended to explain the present invention and are not intended to limit it.

A terminal for implementing each embodiment of the present invention is now described with reference to the accompanying drawings. In the following description, suffixes such as "module", "part" or "unit" are used to denote elements only to facilitate the explanation of the present invention and have no specific meaning in themselves. Therefore, "module" and "part" may be used interchangeably.

Fig. 1 is a schematic diagram of the hardware structure of a terminal for implementing each embodiment of the present invention.

The terminal 1100 comprises:

A processor 1110, which is a general-purpose central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits.

A memory 1120, used for storing the various software programs and data of the terminal and for running the software programs. The memory 1120 may be one or more of RAM, EPROM, an SSD, an SD card and a hard disk (HD). The video files and special effects provided by the embodiments of the present invention are also run and stored on the memory 1120.

A sensor 1130, used for measuring and recording data. The sensor 1130 may comprise any one or more of the following: a camera, a GPS module, a gravity sensor, an acceleration sensor, a distance sensor, an optical sensor, a microphone and a loudspeaker.

A transceiver circuit 1140, used for providing communication functions, including one or more of a cellular network (GSM/UMTS/LTE/CDMA, etc.), a wireless local area network (WLAN), near-field communication (NFC), Bluetooth, and the like.

A power supply 1150, used for providing a direct-current supply or for converting an alternating-current supply into a direct-current supply.

An I/O interface circuit 1160, used for providing external interfaces. Optionally, the I/O interface circuit 1160 may comprise any one or more of the following interfaces: a USB interface, an SD card interface and a key interface.

A display 1170, which may be the display screen and/or the touch screen of the terminal. The display screen is used for displaying video files and special effects; the touch screen is used for receiving the user's touch operations and converting them into user operation instructions.

An I/O control circuit 1180, used for controlling the data interaction between the various input/output circuit components, in particular, for example, the data interaction between the processor 1110 and the I/O interface circuit 1160 or the display 1170.

Based on the above terminal hardware structure and communication system, each embodiment of the video searching method of the present invention is proposed.

As shown in Fig. 2, in an embodiment of the video searching method of the present invention, the video searching method comprises:

Step S10: decomposing the video source file into individual video frames, and recording the time point corresponding to each video frame;

In the present embodiment, the video source file that the user wants to search is first decomposed and converted, so that the whole video source file is broken down into individual video frames. Because each video frame is data in picture format, the subsequent comparison is made easier. At the same time, each video frame corresponds to a time point on the time scale of the video; the time point corresponding to each video frame is recorded, and each video frame is cached together with its time point.
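The recording of time points in step S10 can be sketched as follows. This is a minimal illustration only: it assumes a constant frame rate, so that the time point of frame i is i divided by the frame rate, which the source does not state. The function name `frame_time_points` and the use of a dictionary as the cache are hypothetical. A real decoder would normally report a presentation timestamp per frame, which should be preferred for variable-frame-rate video.

```python
from fractions import Fraction
from typing import Dict, List


def frame_time_points(frame_count: int, fps: Fraction) -> List[float]:
    """Return the time point (seconds) of each frame, ASSUMING a constant frame rate."""
    if frame_count < 0 or fps <= 0:
        raise ValueError("frame_count must be >= 0 and fps must be > 0")
    # Exact rational arithmetic avoids accumulating rounding error over many frames.
    return [float(Fraction(i) / fps) for i in range(frame_count)]


# Cache each frame index together with its time point (hypothetical cache layout).
cache: Dict[int, float] = dict(enumerate(frame_time_points(5, Fraction(25))))
print(cache)
```

In a full implementation the cache would hold the decoded picture data of each frame alongside its time point, so that step S20 can read both.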

Step S20: comparing each video frame against the preset search characteristics to obtain the time point set corresponding to all video frames that contain the search characteristics;

Each cached video frame is compared against the preset search characteristics. The preset search characteristics may be search characteristics received as user input, or search characteristics set by the user in advance; they are the features that must be present in the video segments the user wants to intercept. The search characteristics may be text information, sound information or picture information that the user wants to search for, or other characteristic information. According to the comparison result, each video frame that contains the search characteristics is treated as a video frame that meets the user's search requirement, and the time point set formed by the time points corresponding to each such video frame is obtained.

It should be noted that, in the present embodiment, the search characteristics used for comparison may be a single kind of information among text information, sound information and picture information, or several kinds of information among them. When the user inputs one kind of information as the search characteristic, a video frame that contains that information is treated as a video frame that meets the user's search requirement. When the user inputs several search characteristics at once, for example text information, sound information and picture information together, a video frame is treated as meeting the user's search requirement only if it contains all of the search characteristics input by the user; otherwise the video frame is treated as not meeting the user's search requirement and is filtered out and discarded. The sound information input by the user may be a sound clip, a recording file or the like, and the picture information may be a screenshot of a face, a place, a scene or the like.
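The "all input characteristics must be present" rule of step S20 can be sketched as below. The detectors themselves are outside the scope of this sketch: each one is represented by a hypothetical predicate that takes a frame and reports whether its characteristic is present. The names `Matcher` and `matching_time_points` are illustrative and do not come from the source.

```python
from typing import Callable, Dict, Iterable, List, Tuple

# Hypothetical matcher type: takes one decoded frame, returns True if the
# characteristic (text, sound or picture) is found in that frame.
Matcher = Callable[[object], bool]


def matching_time_points(
    frames: Iterable[Tuple[float, object]],
    matchers: Dict[str, Matcher],
) -> List[float]:
    """Keep the time point of every frame that contains ALL search characteristics."""
    if not matchers:
        raise ValueError("at least one search characteristic is required")
    return [
        t for t, frame in frames
        if all(match(frame) for match in matchers.values())
    ]


# Toy frames: each frame is modelled as the set of characteristics it contains.
frames = [(0.0, {"text"}), (0.04, {"text", "picture"}), (0.08, {"picture"})]
both = {"text": lambda f: "text" in f, "picture": lambda f: "picture" in f}
print(matching_time_points(frames, both))
```

With a single matcher the same function reduces to the single-characteristic case described above, so no separate code path is needed.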

Step S30: dividing the time point set into several time periods based on a preset condition, and intercepting the corresponding video segments from the video source file according to the divided time periods.

The time point set is analysed and screened according to the preset condition. For example, parameter thresholds such as the difference between adjacent time points in the time point set, or the number of adjacent time points, may be set in order to divide the set into several time periods, each with a start and an end. The corresponding video segments are then intercepted from the video source file according to the divided time periods. Because every divided time period contains time points corresponding to video frames that match the search characteristics, every video segment intercepted from the video source file according to those time periods contains video frames that match the search characteristics. In other words, every intercepted video segment contains the search characteristics and meets the user's search requirement.

In the present embodiment, the time point of each video frame in the video source file is recorded, each video frame is compared against the preset search characteristics to obtain the time point set corresponding to all video frames that contain the search characteristics, and the time point set is then divided into several time periods according to a preset condition, so that all video segments containing the search characteristics can be found in, and intercepted from, the video source file. Because the user does not need to perform interception operations manually, and every video frame in the video source file is compared, video segments containing the search characteristics set by the user can be searched for and intercepted simply, quickly and accurately.

Further, as shown in Fig. 3, the above step S30 may comprise:

Step S301: assigning adjacent time points in the time point set whose difference is less than a first preset value to the same time period, and assigning adjacent time points whose difference is greater than the first preset value to different time periods;

Step S302: taking the minimum time point in each time period as the start point and the maximum time point in each time period as the end point, and intercepting from the video source file the video segment between the start point and the end point.

When the time point set is analysed and screened, all time points in the set may first be sorted in order. The set is then divided into several time periods by assigning adjacent time points whose difference is less than the first preset value to the same time period, and adjacent time points whose difference is greater than the first preset value to different time periods.

For example, starting from the first (that is, the minimum) time point in the time point set, the first time point is selected as a start point, and each time point is subtracted from the one that follows it in turn. If the difference between two adjacent time points is less than the first preset value n1, the comparison continues with the next time point. If the difference between two adjacent time points is greater than the first preset value n1, the earlier of the two, that is, the current time point, is marked as an end point, and the first time point, the current time point and all time points between them are assigned to the same time period. The time point following the current time point is then marked as the start point of another time period, and the difference comparison continues from that start point with the time point after it. Proceeding in this way, the time point set is divided into several different time periods.
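The division procedure of step S301 can be sketched as a single pass over the sorted time points. The function name `split_into_periods` is illustrative. The source only specifies the "less than" and "greater than" cases; treating a difference exactly equal to n1 as belonging to the same time period is an assumption of this sketch.

```python
from typing import Iterable, List


def split_into_periods(time_points: Iterable[float], n1: float) -> List[List[float]]:
    """Group sorted time points; a gap greater than n1 starts a new time period.

    ASSUMPTION: a gap exactly equal to n1 keeps both points in the same period,
    because the source leaves that case unspecified.
    """
    points = sorted(time_points)
    if not points:
        return []
    periods = [[points[0]]]
    for previous, current in zip(points, points[1:]):
        if current - previous > n1:
            periods.append([current])    # previous point ended the old period
        else:
            periods[-1].append(current)  # still inside the current period
    return periods


print(split_into_periods([10.0, 1.0, 1.5, 2.0, 30.0, 10.5], n1=1.0))
```

Sorting first means the caller does not have to deliver the time points in playback order, which matches the sorting step described in the text.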

With the minimum time point in each time period taken as the start point and the maximum time point in each time period taken as the end point, the video segment between the start point and the end point can be intercepted from the video source file. In the present embodiment, time periods with a start and an end are obtained by comparing the difference between adjacent time points in the time point set with the first preset value. As a result, in the video segments finally intercepted for the different time periods, matching video frames that are close together in time fall into the same video segment, while matching video frames separated by too large an interval fall into different video segments. The video source file is thus searched and intercepted more effectively, and the watchability of the intercepted video segments is improved.
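Step S302 can be sketched as follows. The source does not name any tool for the actual interception; this sketch assumes the ffmpeg command-line program and only builds the argument list, without running it. The function names `periods_to_clips` and `clip_command` and the file names are hypothetical. The options used are `-ss` (seek position), `-i` (input file), `-t` (duration) and `-c copy` (copy streams without re-encoding).

```python
from typing import List, Sequence, Tuple


def periods_to_clips(periods: Sequence[Sequence[float]]) -> List[Tuple[float, float]]:
    """Start point = minimum time point, end point = maximum time point of each period."""
    return [(min(p), max(p)) for p in periods]


def clip_command(source: str, start: float, end: float, output: str) -> List[str]:
    """Build (but do not run) an ffmpeg argument list for one clip. ASSUMES ffmpeg."""
    if end < start:
        raise ValueError("end point must not precede start point")
    return [
        "ffmpeg",
        "-ss", f"{start:.3f}",      # seek to the start point in the input
        "-i", source,
        "-t", f"{end - start:.3f}", # keep this many seconds from the start point
        "-c", "copy",               # stream copy, no re-encoding
        output,
    ]


clips = periods_to_clips([[1.0, 1.5, 2.0], [10.0, 10.5]])
print([clip_command("in.mp4", s, e, f"clip{i}.mp4") for i, (s, e) in enumerate(clips)])
```

Stream copy can only cut at keyframes, so the clip boundaries produced this way are approximate; re-encoding instead of `-c copy` would give frame-accurate cuts at the cost of speed.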

Further, in other embodiments, the above step S30 may also delete any time period in which the difference between the maximum time point and the minimum time point is less than a second preset value.

After the time point set has been divided into several time periods, the divided time periods may be screened further: any time period in which the difference between the maximum time point and the minimum time point is less than the second preset value n2 is deleted. In this way, time periods that contain only a small number of video frames are filtered out, which avoids intercepting video segments that are too short to be meaningful and improves the efficiency of video interception.
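The screening by the second preset value n2 can be sketched as a simple filter. The function name `drop_short_periods` is illustrative, and each time period is assumed to be a non-empty list of time points.

```python
from typing import List, Sequence


def drop_short_periods(periods: Sequence[Sequence[float]], n2: float) -> List[List[float]]:
    """Delete every time period whose (max - min) span is less than n2."""
    # Periods are assumed non-empty; a single-point period has a span of 0.
    return [list(p) for p in periods if max(p) - min(p) >= n2]


periods = [[1.0, 1.5, 2.0], [10.0, 10.5], [30.0]]
print(drop_short_periods(periods, n2=1.0))
```

A time period consisting of a single matching frame always has a span of zero, so any positive n2 removes such isolated hits.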

Further, in other embodiments, the above step S20 may be carried out as follows.

In the present embodiment, after preset search characteristics such as text information, sound information or picture information are received, the scale-invariant feature transform (SIFT) algorithm is used to compare each decomposed video frame, and the time point set formed by the time points corresponding to each video frame that contains the search characteristics is obtained. When the user inputs several search characteristics at once, for example text information, sound information and picture information together, a video frame is treated as meeting the user's search requirement only if it contains all of the search characteristics input by the user; otherwise the video frame is treated as not meeting the user's search requirement and is filtered out and discarded.

The SIFT algorithm used in the present embodiment to compare each decomposed video frame is a computer vision algorithm for detecting and describing local features in images. It finds extreme points in scale space and extracts their position, scale and rotation invariants. Its applications include object recognition, robotic mapping and navigation, image stitching, 3D modelling, gesture recognition, video tracking and motion matching. The description and detection of local image features helps to identify objects. SIFT features are based on points of interest in the local appearance of an object and are independent of the size and rotation of the image. Their tolerance to changes in lighting, to noise and to small changes in viewpoint is also quite high. Because of these properties, the features are highly distinctive and relatively easy to extract, so that objects can be identified with few mismatches even in a very large feature database. The detection rate for partially occluded objects described with SIFT features is also quite high; as few as three SIFT object features are enough to compute position and orientation. With current computer hardware and a small feature database, recognition speed can approach real time. SIFT features carry a large amount of information and are suited to fast and accurate matching in massive databases.

In essence, the SIFT algorithm searches for key points in different scale spaces and computes the orientation of each key point. The key points that SIFT finds are highly salient points that do not change with factors such as illumination, affine transformation and noise, for example corner points, edge points, bright spots in dark regions and dark spots in bright regions.

The SIFT algorithm can be broken down into the following four steps:

1. Scale-space extremum detection: image locations are searched over all scales. Potential points of interest that are invariant to scale and rotation are identified by means of a difference-of-Gaussian function.

2. Key point localization: at each candidate location, a detailed model is fitted to determine position and scale. Key points are selected according to their stability.

3. Orientation assignment: one or more orientations are assigned to each key point location based on the local gradient directions of the image. All subsequent operations on the image data are performed relative to the orientation, scale and position of the key point, which provides invariance to these transformations.

4. Key point description: in the neighbourhood around each key point, the local image gradients are measured at the selected scale. These gradients are transformed into a representation that tolerates relatively large local shape distortion and changes in illumination.
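The source gives no formulas for these steps. For orientation, the scale space and gradient quantities used in steps 1 and 3 are commonly written as follows in the standard formulation of SIFT; the symbols I (input image), G (Gaussian kernel), L (smoothed image), D (difference of Gaussians), k (constant scale factor), m (gradient magnitude) and θ (gradient orientation) are not taken from the source.

```latex
% Step 1: scale space and difference of Gaussians
G(x, y, \sigma) = \frac{1}{2\pi\sigma^{2}} \exp\!\left(-\frac{x^{2} + y^{2}}{2\sigma^{2}}\right)

L(x, y, \sigma) = G(x, y, \sigma) * I(x, y)

D(x, y, \sigma) = L(x, y, k\sigma) - L(x, y, \sigma)

% Step 3: gradient magnitude and orientation at the key point's scale
m(x, y) = \sqrt{\bigl(L(x+1, y) - L(x-1, y)\bigr)^{2} + \bigl(L(x, y+1) - L(x, y-1)\bigr)^{2}}

\theta(x, y) = \operatorname{atan2}\bigl(L(x, y+1) - L(x, y-1),\; L(x+1, y) - L(x-1, y)\bigr)
```

Here * denotes convolution in x and y. Candidate key points in step 1 are the local extrema of D across both position and scale.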

In the present embodiment, the suitability of the SIFT algorithm for fast and accurate matching in massive databases is exploited to compare each decomposed video frame. This greatly improves the accuracy and speed of the comparison, so that video segments containing the search characteristics set by the user can subsequently be searched for and intercepted simply, quickly and accurately.
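How a frame might be judged to "contain" a picture search characteristic once SIFT descriptors are available can be sketched as below. The source does not specify the matching rule; this sketch assumes nearest-neighbour matching with a distance ratio test, a ratio of 0.75, and a minimum of three accepted matches, all of which are illustrative choices. Descriptor extraction itself is not shown, and the toy descriptors are 2-dimensional instead of the 128 dimensions a real SIFT descriptor has.

```python
import math
from typing import Sequence

Descriptor = Sequence[float]


def count_good_matches(
    query: Sequence[Descriptor], frame: Sequence[Descriptor], ratio: float = 0.75
) -> int:
    """Count query descriptors whose nearest frame descriptor passes the ratio test."""
    good = 0
    for q in query:
        distances = sorted(math.dist(q, d) for d in frame)
        # Accept only if the best match is clearly closer than the second best.
        if len(distances) >= 2 and distances[0] < ratio * distances[1]:
            good += 1
    return good


def frame_contains(
    query: Sequence[Descriptor], frame: Sequence[Descriptor], min_matches: int = 3
) -> bool:
    """ASSUMED decision rule: at least min_matches descriptors must match."""
    return count_good_matches(query, frame) >= min_matches


query = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
similar = [(0.1, 0.0), (10.0, 0.1), (0.0, 10.1), (50.0, 50.0)]
unrelated = [(50.0, 50.0), (51.0, 50.0)]
print(frame_contains(query, similar), frame_contains(query, unrelated))
```

This brute-force search is quadratic in the number of descriptors; for long videos an approximate nearest-neighbour index would normally replace the inner loop.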

The present invention further provides a video searching device. Referring to Fig. 4, in an embodiment of the video searching device of the present invention, the video searching device comprises:

A decomposing module 01, configured to decompose a video source file into individual video frames and to record the time point corresponding to each video frame;

A comparison module 02, configured to compare each video frame produced by the decomposing module 01 against the preset search characteristics and to obtain the time point set corresponding to all video frames that contain the search characteristics;

It should be noted that, in the present embodiment, the search characteristics used for comparison may be a single kind of information among text information, sound information and picture information, or several kinds of information among them. When the user inputs one kind of information, for example text information, as the search characteristic, a video frame is treated as meeting the user's search requirement as long as it contains the text information input by the user. When the user inputs several search characteristics at once, for example text information, sound information and picture information together, a video frame is treated as meeting the user's search requirement only if it contains all of the search characteristics input by the user; otherwise the video frame is treated as not meeting the user's search requirement and is filtered out and discarded. The sound information input by the user may be a sound clip, a recording file or the like, and the picture information may be a screenshot of a face, a place, a scene or the like.

A division and interception module 03, configured to divide the time point set obtained by the comparison module 02 into several time periods based on a preset condition and to intercept the corresponding video segments from the video source file according to the divided time periods.

Further, as shown in Fig. 5, the above division and interception module 03 may comprise:

A division unit 031, configured to assign adjacent time points in the time point set whose difference is less than the first preset value to the same time period, and to assign adjacent time points whose difference is greater than the first preset value to different time periods;

An interception unit 032, configured to take the minimum time point in each time period as the start point and the maximum time point in each time period as the end point, and to intercept from the video source file the video segment between the start point and the end point.

Further, in other embodiments, the above division and interception module 03 is also configured to delete any time period in which the difference between the maximum time point and the minimum time point is less than the second preset value.
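The cooperation of the three modules can be sketched as one small class whose methods stand in for the decomposing module 01, the comparison module 02 and the division and interception module 03. This is an illustration under several assumptions, not the device itself: the class and method names are hypothetical, frames are assumed to arrive at a constant frame rate, the comparison is delegated to a caller-supplied predicate, and the interception step returns start and end points without cutting any file.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class VideoSearchDevice:
    contains_feature: Callable[[object], bool]  # stands in for comparison module 02
    n1: float                                   # first preset value (max gap)
    n2: float = 0.0                             # second preset value (min span)

    def decompose(self, frames: Sequence[object], fps: float) -> List[Tuple[float, object]]:
        """Module 01: pair each frame with its time point (constant fps ASSUMED)."""
        return [(i / fps, frame) for i, frame in enumerate(frames)]

    def compare(self, timed: Sequence[Tuple[float, object]]) -> List[float]:
        """Module 02: time points of the frames that contain the search characteristic."""
        return [t for t, frame in timed if self.contains_feature(frame)]

    def divide_and_intercept(self, points: Sequence[float]) -> List[Tuple[float, float]]:
        """Module 03: group by gap n1, then drop periods shorter than n2."""
        periods: List[List[float]] = []
        for t in sorted(points):
            if periods and t - periods[-1][1] <= self.n1:
                periods[-1][1] = t       # extend the current period's end point
            else:
                periods.append([t, t])   # open a new period at this time point
        return [(start, end) for start, end in periods if end - start >= self.n2]

    def search(self, frames: Sequence[object], fps: float) -> List[Tuple[float, float]]:
        return self.divide_and_intercept(self.compare(self.decompose(frames, fps)))


# Toy input: each frame is just a flag saying whether the characteristic is present.
device = VideoSearchDevice(contains_feature=bool, n1=1.5, n2=1.0)
print(device.search([True, True, True, False, False, False, False, True, False], fps=1.0))
```

Keeping only the start and end of each period, instead of every time point, is enough for interception because only the minimum and maximum time points are used as clip boundaries.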

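Further, in other embodiments, the above comparison module 02 is specifically configured to compare each decomposed video frame against the preset search characteristics using the SIFT algorithm, and to obtain the time point set formed by the time points corresponding to each video frame that contains the search characteristics. The SIFT algorithm is as described in the method embodiment above.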

It should be noted that, in this document, the terms "comprise", "include" and any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article or device that comprises a series of elements includes not only those elements but also other elements that are not explicitly listed, or elements that are inherent to the process, method, article or device. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article or device that comprises the element.

The serial numbers of the above embodiments of the present invention are for description only and do not indicate the relative merits of the embodiments.

From the above description of the embodiments, those skilled in the art will clearly understand that the methods of the above embodiments can be implemented by software together with the necessary general-purpose hardware platform, or by hardware alone, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention, in essence, or the part of it that contributes over the prior art, may be embodied in the form of a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk or an optical disc) and includes several instructions that cause a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device or the like) to perform the methods described in the embodiments of the present invention.

The above are only preferred embodiments of the present invention and do not thereby limit the patent scope of the present invention. Any equivalent structure or equivalent process transformation made using the contents of the description and drawings of the present invention, or any direct or indirect application in other related technical fields, is likewise included in the scope of patent protection of the present invention.

Claims

1. A video searching method, characterized in that the video searching method comprises the steps of:

2. The video searching method as claimed in claim 1, characterized in that dividing the time point set into several time periods based on a preset condition and intercepting the corresponding video segments from the video source file according to the divided time periods comprises:

taking the minimum time point in each time period as the start point and the maximum time point in each time period as the end point, and intercepting from the video source file the video segment between the start point and the end point.

3. The video searching method as claimed in claim 2, characterized in that dividing the time point set into several time periods based on a preset condition and intercepting the corresponding video segments from the video source file according to the divided time periods further comprises:

4. The video searching method as claimed in claim 1, characterized in that the search characteristics comprise preset text information, sound information and/or picture information.

5. The video searching method according to any one of claims 1 to 4, characterized in that comparing each video frame against the preset search characteristics to obtain the time point set corresponding to all video frames that contain the search characteristics comprises:

6. A video searching device, characterized in that the video searching device comprises:

7. The video searching device as claimed in claim 6, characterized in that the division and interception module comprises:

8. The video searching device as claimed in claim 7, characterized in that the division and interception module is further configured to:

9. The video searching device as claimed in claim 6, characterized in that the search characteristics comprise preset text information, sound information and/or picture information.

10. The video searching device according to any one of claims 6 to 9, characterized in that the comparison module is specifically configured to: