CN114827713B - Video processing method and device, computer readable storage medium and electronic equipment - Google Patents

Video processing method and device, computer readable storage medium and electronic equipment

Info

Publication number
CN114827713B
CN114827713B
Authority
CN
China
Prior art keywords
video
event
intercepted
preset
processing method
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110082809.3A
Other languages
Chinese (zh)
Other versions
CN114827713A (en)
Inventor
成云峰
杨太任
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Oppo Mobile Telecommunications Corp Ltd
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority to CN202110082809.3A
Priority to PCT/CN2021/126446 (published as WO2022156294A1)
Publication of CN114827713A
Application granted
Publication of CN114827713B
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/433Content storage operation, e.g. storage operation in response to a pause request, caching operations
    • H04N21/4334Recording operations
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/433Content storage operation, e.g. storage operation in response to a pause request, caching operations
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/44016Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving splicing one content stream with another content stream, e.g. for substituting a video clip
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/83Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/845Structuring of content, e.g. decomposing content into time segments
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/83Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/845Structuring of content, e.g. decomposing content into time segments
    • H04N21/8456Structuring of content, e.g. decomposing content into time segments by decomposing the content in the time domain, e.g. in time segments

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Television Signal Processing For Recording (AREA)
  • Image Analysis (AREA)
  • Management Or Editing Of Information On Record Carriers (AREA)

Abstract

The disclosure provides a video processing method, a video processing apparatus, a computer-readable storage medium, and an electronic device, and relates to the technical field of video processing. The video processing method includes: starting a video capture task when a first event occurs in the video; determining whether a second event occurs in the video within a predetermined period after the first event ends; if the second event occurs, determining whether a third event occurs in the video within a predetermined period after the second event ends; if the third event occurs, treating the third event as the second event; and if neither the second event nor the third event occurs, ending the video capture task to determine the captured video clip. At least two of the first event, the second event, and the third event are associated events. The method and apparatus can capture, from a video, a single continuous clip covering multiple associated events, so that the clip the user watches is continuous and each event is complete.

Description

Video processing method and device, computer readable storage medium and electronic equipment
Technical Field
The present disclosure relates to the field of video processing technology, and in particular, to a video processing method, a video processing apparatus, a computer readable storage medium, and an electronic device.
Background
Video, an important way to communicate information, has been widely used in many fields such as monitoring, education, entertainment, medical treatment, intelligent driving, etc.
Video often contains content that is of no interest to the user, and the proportion of such content may be large, resulting in a poor viewing experience and high storage pressure. Some schemes for capturing video clips have emerged. However, these schemes may produce poor capture results, for example by losing information that the user cares about.
Disclosure of Invention
The disclosure provides a video processing method, a video processing apparatus, a computer-readable storage medium, and an electronic device, so as to overcome, at least to some extent, the problem of poor video capture results.
According to a first aspect of the present disclosure, there is provided a video processing method, including: starting a video capture task when a first event occurs in the video; determining whether a second event occurs in the video within a predetermined period after the first event ends; if the second event occurs, determining whether a third event occurs in the video within a predetermined period after the second event ends; if the third event occurs, treating the third event as the second event; and if neither the second event nor the third event occurs, ending the video capture task to determine the captured video clip. At least two of the first event, the second event, and the third event are associated events.
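The chained-event capture rule of the first aspect can be sketched as follows. This is a minimal illustration, not the patent's implementation: the event list, the `(start, end)` timestamp representation, and the `predetermined` window length are assumptions for the sketch.

```python
def capture_clip(events, predetermined):
    """Given a time-sorted list of (start, end) intervals of associated
    events and a predetermined wait window (seconds), return the
    (start, end) of the captured clip under the chained-event rule:
    the task starts at the first event and only ends once a full
    window elapses with no follow-up event."""
    if not events:
        return None
    clip_start = events[0][0]      # first event starts the capture task
    last_end = events[0][1]
    for start, end in events[1:]:
        if start - last_end <= predetermined:
            last_end = end         # follow-up event arrived in time: keep capturing
        else:
            break                  # gap exceeded the window: task already ended
    # the task ends a full window after the last chained event
    return clip_start, last_end + predetermined
```

For example, with events at 0–10 s and 30–50 s and a 60 s window, the two events are chained into one clip; with a 10 s window, the 20 s gap ends the task after the first event.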
According to a second aspect of the present disclosure, there is provided a video processing method, including: starting a video capture task when a first event occurs in the video; if no event associated with the first event appears within a predetermined period after the first event ends, ending the video capture task to determine the captured video clip; and if a second event associated with the first event occurs within the predetermined period after the first event ends and no event associated with the first event occurs within the predetermined period after the second event ends, ending the video capture task to determine the captured video clip.
According to a third aspect of the present disclosure, there is provided a video processing apparatus, including: a task starting module, configured to start a video capture task when a first event occurs in the video; an event determining module, configured to determine whether a second event occurs in the video within a predetermined period after the first event ends, to determine, if the second event occurs, whether a third event occurs in the video within a predetermined period after the second event ends, and to treat the third event as the second event if the third event occurs; and a first video capture module, configured to end the video capture task if neither the second event nor the third event occurs, so as to determine the captured video clip. At least two of the first event, the second event, and the third event are associated events.
According to a fourth aspect of the present disclosure, there is provided a video processing apparatus, including: a task starting module, configured to start a video capture task when a first event occurs in the video; a second video capture module, configured to end the video capture task to determine the captured video clip if no event associated with the first event appears within a predetermined period after the first event ends; and a third video capture module, configured to end the video capture task to determine the captured video clip if a second event associated with the first event occurs within the predetermined period after the first event ends and no event associated with the first event occurs within the predetermined period after the second event ends.
According to a fifth aspect of the present disclosure, there is provided a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the video processing method described above.
According to a sixth aspect of the present disclosure, there is provided an electronic device comprising a processor; and a memory for storing one or more programs that, when executed by the processor, cause the processor to implement the video processing method described above.
In some embodiments of the present disclosure, when a first event occurs in the video, a video capture task is started, and whether a second event occurs in the video is determined within a predetermined period after the first event ends; if the second event occurs, the time at which the capture task ends is determined based on when the second event ends and on what follows in the video, so as to determine the captured video clip. On the one hand, the present disclosure can capture a video clip covering multiple associated events from a video; on the other hand, the captured clip is a single continuous clip, so the clip the user watches is continuous and each event is relatively complete.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the disclosure and together with the description, serve to explain the principles of the disclosure. It will be apparent to those of ordinary skill in the art that the drawings in the following description are merely examples of the disclosure and that other drawings may be derived from them without undue effort. In the drawings:
FIG. 1 illustrates a video schematic containing user movement events in some techniques;
FIG. 2 is a schematic diagram illustrating a manner of intercepting the video of FIG. 1 for a fixed period of time;
FIG. 3 shows a schematic diagram of another example taken with a fixed duration;
FIG. 4 illustrates a video schematic diagram containing user movement events in other techniques;
FIG. 5 shows a schematic diagram of an exemplary system architecture of a video processing scheme of an embodiment of the present disclosure;
FIG. 6 illustrates a schematic diagram of an electronic device suitable for use in implementing embodiments of the present disclosure;
FIG. 7 schematically illustrates a flowchart of a video processing method according to an exemplary embodiment of the present disclosure;
FIG. 8 schematically illustrates a flowchart of the overall process of a video processing scheme according to an embodiment of the present disclosure;
FIG. 9 schematically illustrates a flowchart of a scheme for video capture with cloud participation according to another embodiment of the present disclosure;
FIG. 10 schematically illustrates a flowchart of a video processing method according to another exemplary embodiment of the present disclosure;
FIG. 11 schematically illustrates a block diagram of a video processing apparatus according to an exemplary embodiment of the present disclosure;
FIG. 12 schematically illustrates a block diagram of a video processing apparatus according to another exemplary embodiment of the present disclosure;
FIG. 13 schematically illustrates a block diagram of a video processing apparatus according to still another exemplary embodiment of the present disclosure;
FIG. 14 schematically illustrates a block diagram of a video processing apparatus according to yet another exemplary embodiment of the present disclosure.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. However, the exemplary embodiments may be embodied in many forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of the example embodiments to those skilled in the art. The described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided to give a thorough understanding of embodiments of the present disclosure. One skilled in the relevant art will recognize, however, that the aspects of the disclosure may be practiced without one or more of the specific details, or with other methods, components, devices, steps, etc. In other instances, well-known technical solutions have not been shown or described in detail to avoid obscuring aspects of the present disclosure.
Furthermore, the drawings are merely schematic illustrations of the present disclosure and are not necessarily drawn to scale. The same reference numerals in the drawings denote the same or similar parts, and thus a repetitive description thereof will be omitted. Some of the block diagrams shown in the figures are functional entities and do not necessarily correspond to physically or logically separate entities. These functional entities may be implemented in software or in one or more hardware modules or integrated circuits or in different networks and/or processor devices and/or microcontroller devices.
The flowcharts depicted in the figures are exemplary only and need not include all steps. For example, some steps may be decomposed, and some steps may be combined or partially combined, so the actual order of execution may change according to the actual situation. In addition, the terms "first," "second," "third," and the like are used for distinguishing purposes only and should not be taken as limiting the present disclosure; the embodiments and implementations of the present application, and the specific features thereof, may be combined with one another where no conflict arises.
FIG. 1 illustrates a video containing a user movement event in some techniques. Referring to FIG. 1, user movement occurs in the video during the one minute from 13:00:00 to 13:01:00.
In some techniques, the video shown in FIG. 1 is captured with a fixed capture duration in order to obtain the frames in which the user moves. For example, referring to FIG. 2, with a fixed duration of 5 minutes, the 5-minute clip from 13:00:00 to 13:05:00 is captured.
However, no user movement occurs during the four minutes from 13:01:00 to 13:05:00. If these four minutes are included in the captured clip, storage space is wasted, the user wastes time watching them, and the experience is poor.
In addition, a mismatch between the fixed capture duration and the event duration can also cause the opposite problem. For example, referring to FIG. 3, user movement occurs in the video throughout the 6 minutes from 13:00:00 to 13:06:00, while the fixed duration is configured as 5 minutes. In this case, the 1-minute clip from 13:05:00 to 13:06:00 is not captured, so the user movement event is incomplete and event information is missed.
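The two fixed-duration failure modes above can be quantified with simple interval arithmetic. This is an illustrative sketch (times in seconds mirror the FIG. 2 and FIG. 3 examples; the function name and interval representation are assumptions):

```python
def fixed_capture_loss(event, clip):
    """Compare an event interval with a fixed-duration clip, both given
    as (start, end) in seconds. Returns (wasted, missed): seconds of
    clip containing no event, and seconds of event falling outside
    the clip."""
    e0, e1 = event
    c0, c1 = clip
    overlap = max(0, min(e1, c1) - max(e0, c0))
    wasted = (c1 - c0) - overlap   # eventless footage stored anyway
    missed = (e1 - e0) - overlap   # event footage that was cut off
    return wasted, missed
```

In the FIG. 2 case (a 60 s event inside a 300 s clip), 240 s of eventless footage are wasted; in the FIG. 3 case (a 360 s event and a 300 s clip), 60 s of the event are missed.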
FIG. 4 illustrates a video containing user movement events in other techniques. Referring to FIG. 4, two events, user movement 1 and user movement 2, occur within 13:00:00 to 13:00:50, at 13:00:00 to 13:00:10 and 13:00:30 to 13:00:50, respectively.
In these techniques, the video clips corresponding to user movement 1 and user movement 2 are extracted separately and then merged to obtain the captured clip.
However, on the one hand, such merging causes discontinuities in the resulting clip, affecting the user's viewing; on the other hand, splicing video clips together is complex and not easy to implement.
In view of this, the present disclosure provides a new video processing scheme.
Fig. 5 shows a schematic diagram of an exemplary system architecture of a video processing scheme of an embodiment of the present disclosure.
As shown in FIG. 5, the system architecture may include a terminal device 51 and a cloud 53, connected through a network that may use various connection types, such as wired or wireless communication links or fiber-optic cables.
The terminal device 51 may interact with the cloud 53 through the network to receive or transmit messages and the like. The terminal device 51 may be a mobile phone, a tablet computer, a smart wearable device, a personal computer, any of various video monitoring devices (e.g., a doorbell or a camera), and the like. In different scenarios, the terminal device may also be referred to as a terminal, a mobile terminal, or an intelligent terminal. The cloud 53 may be a single server or a server cluster formed by a plurality of servers, and may also be referred to as a cloud server or simply a server.
In some examples in which the video processing scheme of the present disclosure is performed by the terminal device 51, the terminal device 51 may start a video capture task when a first event occurs in the video, and determine whether a second event occurs in the video within a predetermined period after the first event ends. If the second event occurs, the terminal device determines whether a third event occurs in the video within a predetermined period after the second event ends. If the third event occurs, the third event is treated as the second event and the check is repeated within a new predetermined period, looping in this way. If the terminal device 51 determines that the second event or the third event does not occur, the video capture task is ended to determine the captured video clip. At least two of the first event, the second event, and the third event are associated events; more specifically, all three may be associated with one another, or the first event may be separately associated with the second event and with the third event. Note that associated events may be the same event or related events, and the association may be user-defined or preset by the system; for example, a falling event and a crying event may be set as associated events.
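The associated-event relation described above (same event type, or a user-defined or preset pairing such as a fall event with a crying event) can be modeled as a symmetric lookup. The event-type names and pairings below are illustrative assumptions, not from the patent:

```python
# Preset pairs of associated event types; user-defined pairs could be
# added to this set at runtime. Names are hypothetical examples.
ASSOCIATED_PAIRS = {
    frozenset({"fall", "crying"}),
    frozenset({"doorbell", "person_at_door"}),
}

def are_associated(event_a, event_b):
    """Two events are associated if they are the same type or form a
    configured pair (frozenset makes the check order-independent)."""
    return event_a == event_b or frozenset({event_a, event_b}) in ASSOCIATED_PAIRS
```

A frozenset key makes `are_associated("fall", "crying")` and `are_associated("crying", "fall")` equivalent without storing both orderings.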
In one embodiment, the terminal device 51 may remove the final segment of the predetermined duration from the captured clip to generate the target clip, and may then upload the target clip to the cloud 53 for storage. The target clip may also be stored locally (on the device performing the capture task, such as a camera or a mobile phone) or on another device connected to it, for example transmitted over a wireless or wired link to the memory of a television, a mobile phone, or the like.
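Removing the trailing wait window could be done, for instance, as an ffmpeg stream copy that keeps everything except the last window. The sketch below only builds the command line; file names and durations are illustrative assumptions, and `-t` (output duration) and `-c copy` (no re-encode) are standard ffmpeg options:

```python
def build_trim_command(src, dst, clip_duration, window):
    """Build an ffmpeg command that drops the final `window` seconds
    (the eventless wait period) from a captured clip of length
    `clip_duration` seconds, using stream copy to avoid re-encoding."""
    keep = clip_duration - window
    if keep <= 0:
        raise ValueError("clip is not longer than the trailing window")
    return ["ffmpeg", "-i", src, "-t", str(keep), "-c", "copy", dst]
```

For a 110 s captured clip with a 60 s window, this keeps the first 50 s, which is exactly the span from the first event's start to the last event's end in the earlier chained-capture example.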
Alternatively, the terminal device 51 may transmit the captured clip to a designated device, which removes the final segment of the predetermined duration and generates the target clip. The designated device may be a device other than the terminal device 51, such as a cloud server, a mobile phone, or a television.
In another embodiment, the terminal device 51 may upload the captured clip to the cloud 53. In response to a video acquisition request for the clip, the cloud 53 may remove the final segment of the predetermined duration, generate the target clip, and send it to the requesting end that initiated the request. The requesting end may be the terminal device 51 or another device; the disclosure is not limited in this respect.
In addition, after receiving the captured clip from the terminal device 51, the cloud 53 may immediately remove the final segment of the predetermined duration, generate the target clip, and store it, so that the target clip can be sent to the requesting end when the cloud 53 receives a video acquisition request.
In other examples in which the video processing scheme of the present disclosure is performed by the terminal device 51, the terminal device 51 may start a video capture task when a first event occurs in the video. If no event associated with the first event appears within a predetermined period after the first event ends, the capture task is ended to determine the captured clip. If a second event associated with the first event occurs within the predetermined period after the first event ends, and no event associated with the first event occurs within the predetermined period after the second event ends, the capture task is likewise ended to determine the captured clip.
In some examples in which the video processing scheme of the present disclosure is performed by the cloud 53, the cloud 53 may receive video data from the terminal device 51 and analyze it, starting a video capture task when a first event occurs in the video. Within a predetermined period after the first event ends, it is determined whether a second event occurs in the video. If the second event occurs, it is determined whether a third event occurs within a predetermined period after the second event ends. If the third event occurs, the loop is repeated with the third event treated as the second event. If the cloud 53 determines that the second event or the third event does not occur, the video capture task is ended to determine the captured video clip. At least two of the first event, the second event, and the third event are associated events; more specifically, all three may be associated events.
The cloud 53 may further trim the captured clip to remove the final segment of the predetermined duration, generate the target clip, and store it. This may be done immediately after the captured clip is determined, or after the corresponding video acquisition request is received; the disclosure is not limited in this respect.
In other examples in which the video processing scheme of the present disclosure is performed by the cloud 53, the cloud 53 may start a video capture task when a first event occurs in the video. If no event associated with the first event appears within a predetermined period after the first event ends, the capture task is ended to determine the captured clip. If a second event associated with the first event occurs within the predetermined period after the first event ends, the capture task is ended once the predetermined period has elapsed after the second event ends, to determine the captured clip.
In addition, it should be noted that, on the one hand, the video processing scheme of the present disclosure may be applied to video monitoring scenes, in which the video is captured by a camera and analyzed in real time so as to extract clips that meet the user's needs; on the other hand, the scheme may also be used to analyze existing video.
Fig. 6 shows a schematic diagram of an electronic device suitable for use in implementing exemplary embodiments of the present disclosure. The terminal device of the present disclosure may be configured in the form of an electronic device as shown in fig. 6. It should be noted that the electronic device shown in fig. 6 is only an example, and should not impose any limitation on the functions and the application scope of the embodiments of the present disclosure.
The electronic device of the present disclosure includes at least a processor and a memory for storing one or more programs that when executed by the processor, enable the processor to implement the video processing method of the exemplary embodiments of the present disclosure.
Specifically, as shown in fig. 6, the electronic device 600 may include: processor 610, internal memory 621, external memory interface 622, universal serial bus (Universal Serial Bus, USB) interface 630, charge management module 640, power management module 641, battery 642, antenna 1, antenna 2, mobile communication module 650, wireless communication module 660, audio module 670, speaker 671, receiver 672, microphone 673, ear-piece interface 674, sensor module 680, display 690, camera module 691, indicator 692, motor 693, keys 694, and user identification module (Subscriber Identification Module, SIM) card interface 695, among others. The sensor module 680 may include a depth sensor, a pressure sensor, a gyroscope sensor, a barometric sensor, a magnetic sensor, an acceleration sensor, a distance sensor, a proximity sensor, a fingerprint sensor, a temperature sensor, a touch sensor, an ambient light sensor, a bone conduction sensor, and the like.
It is to be understood that the illustrated structure of the presently disclosed embodiments does not constitute a particular limitation of the electronic device 600. In other embodiments of the present disclosure, electronic device 600 may include more or fewer components than shown, or certain components may be combined, or certain components may be split, or different arrangements of components. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.
The processor 610 may include one or more processing units, such as: an application processor (Application Processor, AP), a modem processor, a graphics processor (Graphics Processing Unit, GPU), an image signal processor (Image Signal Processor, ISP), a controller, a video codec, a digital signal processor (Digital Signal Processor, DSP), a baseband processor, and/or a neural network processor (Neural-network Processing Unit, NPU), among others. The different processing units may be separate devices or may be integrated in one or more processors. In addition, a memory may be provided in the processor 610 for storing instructions and data.
The electronic device 600 may implement a photographing function through an ISP, a camera module 691, a video codec, a GPU, a display 690, an application processor, and the like. In some embodiments, the electronic device 600 may include 1 or N camera modules 691, where N is a positive integer greater than 1, and if the electronic device 600 includes N cameras, one of the N cameras is a master camera.
The internal memory 621 may be used to store computer-executable program code that includes instructions. The internal memory 621 may include a storage program area and a storage data area. The external memory interface 622 may be used to connect an external memory card, such as a Micro SD card, to enable expansion of the memory capabilities of the electronic device 600.
The present disclosure also provides a computer-readable storage medium that may be included in the electronic device described in the above embodiments; or may exist alone without being incorporated into the electronic device.
The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this disclosure, a computer-readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable storage medium may transmit, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable storage medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The computer-readable storage medium carries one or more programs which, when executed by one of the electronic devices, cause the electronic device to implement the methods described in the embodiments below.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units involved in the embodiments of the present disclosure may be implemented by means of software, or may be implemented by means of hardware, and the described units may also be provided in a processor. Wherein the names of the units do not constitute a limitation of the units themselves in some cases.
The video processing method of the exemplary embodiment of the present disclosure may include steps 1 to 4, specifically:
in step 1, a video capture task is initiated when a first event occurs in the video.
In step 2, it is determined whether a second event occurs in the video within a predetermined period of time after the end of the first event. If no second event occurs, step 3 is executed; if a second event occurs, step 4 is executed.
In step 3, the video capture task is ended to determine the captured video clip.
In step 4, it is determined whether a third event occurs in the video within a predetermined period of time after the end of the second event. If a third event occurs, the third event is treated as the second event and step 4 is executed again in a loop; if no third event occurs, step 3 is performed.
At least two of the first event, the second event and the third event are related events.
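The looped judgment of steps 1 through 4 can be sketched as a small state machine. The sketch below is illustrative only: it assumes a simplified per-second timeline of booleans as the event signal, and the helper name `capture_task` is hypothetical, not part of the disclosed implementation.

```python
def capture_task(timeline, period):
    """Steps 1-4 over a per-second timeline of booleans (True means an
    associated event is occurring during that second). Returns the
    (start, end) seconds of the capture task: capture starts at the
    first event (step 1) and ends only when no associated event occurs
    within `period` seconds after the previous event ends (steps 2-4).
    """
    if True not in timeline:
        return None                       # no first event: no task
    start = timeline.index(True)          # step 1: start the task
    end = start
    while True:
        while end < len(timeline) and timeline[end]:
            end += 1                      # advance to the event's end
        window = timeline[end:end + period]
        if True in window:                # step 2/4: event reoccurs
            end += window.index(True)     # jump to it and loop again
        else:                             # step 3: end the task,
            return start, end + period    # keeping the trailing period
```

Each chained event keeps the task open, and the returned span includes the final predetermined period after the last event, matching the behavior described for steps S806 to S810 later in the disclosure.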
Fig. 7 schematically shows a flowchart of a video processing method of an exemplary embodiment of the present disclosure. The steps of the video processing method of the present disclosure will be described below by taking the example in which the terminal device executes the steps shown in fig. 7. Referring to fig. 7, the video processing method may include the steps of:
s70, when a first event occurs in the video, starting a video interception task.
The video addressed by the scheme of the present disclosure may be video captured by a camera in real time; the present disclosure does not limit the content of the video (i.e., the object captured by the camera). The camera may be a fixed camera, such as a monitoring camera in a parking lot or a manufacturing shop. The camera may also be a mobile camera, such as a camera on a mobile phone, with which a user can shoot while moving to acquire information about the surrounding scene.
The video addressed by the scheme of the present disclosure may also be a previously shot video, which is retrieved from memory when it needs to be analyzed. Similarly, the present disclosure does not limit the type of such previously shot video.
In an exemplary embodiment of the present disclosure, the first event may be a preset event, and the preset event may include a user preset event or a system preset event. A user preset event may be an event that the user demonstrates in advance; the terminal device may shoot and save the demonstrated event. For example, taking the appearance of a face as the preset event, the terminal device may capture an image containing a face and an image not containing a face, and the user may then select, on a preset event configuration interface, the image containing the face as the image containing the preset event. In addition, the preset event may also be an event preset by the system when the terminal device leaves the factory; the present disclosure does not limit the type of preset event.
For another example, the first event may be an event of interest to the user, or a predetermined type of event preset by the system. For example, the first event may be any one or more of the following conditions: a face exists in the shooting scene, an object (e.g., a person, an animal, etc.) moves, a device in the scene emits a prompt signal, a cry or scream occurs, a fall occurs, etc. The type of the first event is not limited by the present disclosure.
According to some embodiments of the present disclosure, first, a terminal device may extract video frame images from a video at predetermined time intervals. Wherein the predetermined time interval is related to the scene type, and can be set based on the scene, and the value of the predetermined time interval is not limited in the disclosure.
Because not every frame is processed, but rather video frame images are extracted at predetermined time intervals, the processing load on the terminal device is greatly reduced and resources are saved.
It is readily understood that, in scenes where the image content changes rapidly, or in other scenes requiring careful analysis, every frame of the video may be extracted for processing.
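As a rough illustration of interval-based sampling (not the disclosed implementation; `classify` stands in for whatever detection model is used, and the frame source is abstracted as an iterable):

```python
def detect_events_sampled(frames, fps, interval_seconds, classify):
    """Run `classify(frame) -> bool` only on frames sampled every
    `interval_seconds`, rather than on every frame, reducing the
    processing load on the terminal device. `frames` is any iterable
    of decoded frames; returns (timestamp, detected) pairs for the
    sampled frames."""
    step = max(1, round(fps * interval_seconds))
    results = []
    for index, frame in enumerate(frames):
        if index % step == 0:
            results.append((index / fps, classify(frame)))
    return results
```

Setting `interval_seconds` to zero degenerates to per-frame processing (`step` is clamped to 1), covering the careful-analysis case mentioned above.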
Next, feature extraction may be performed on the video frame image. Specifically, a machine learning model based on deep learning may be used to process the video frame images to extract features of the video frame images. Among other things, the present disclosure does not limit the structure and training process of the machine learning model. In addition, features of the video frame image may also be extracted using methods such as histograms, which are not limited by the present disclosure.
Then, according to the extracted features, it can be determined whether the video frame image has the preset event. It will be appreciated that the output of the machine learning model may be the result of whether a preset event has occurred. In addition, according to the features extracted by the machine learning model, further analysis can be performed to obtain a result of whether a preset event occurs.
For example, in the case where the first event is to determine that there is a cat in the scene, the video frame image may be input into a trained convolutional neural network, and feature extraction may be performed by the convolutional neural network to classify whether there is a cat in the video frame image.
In addition, in view of the possible occurrence of errors in the determination of a single frame, according to other embodiments of the present disclosure, a scheme for determining whether a first event (or referred to as a preset event) occurs based on a plurality of frames is also provided.
Specifically, first, a target video frame image in which a preset object appears for the first time may be determined from the video according to the extracted features. The preset object is an object used to determine that an event is the preset event; it can be understood that the preset object serves as an identifier of the preset event. Next, if the preset object is present in one or more video frame images following the target video frame image, it is determined that the preset event occurs in the video, and the target video frame image is taken as the starting point of the preset event.
For example, in 100 consecutive frames, if a face appears in the 5th frame, it is determined whether a face also appears in the 6th frame, or in a predetermined number of subsequent video frame images (e.g., the 6th to 10th frames). If those frames are all judged to contain faces, it can be determined that a face-appearance event occurs in the video, with the 5th frame as the event's starting point.
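The multi-frame confirmation just described can be sketched as follows; this is an illustrative helper, not the disclosed implementation, and `detections` would come from the per-frame feature analysis:

```python
def first_confirmed_frame(detections, confirm_count):
    """Return the index of the target frame where the preset object
    first appears, confirmed by the object also appearing in the next
    `confirm_count` frames; this guards against single-frame errors.
    `detections` is a per-frame list of booleans. Returns None if no
    confirmed appearance exists."""
    for i, present in enumerate(detections):
        window = detections[i + 1 : i + 1 + confirm_count]
        if present and len(window) == confirm_count and all(window):
            return i
    return None
```

In the example above, a face appearing from the 5th frame onward (index 4 when counting from zero) with five confirming frames yields index 4 as the starting point, while an isolated single-frame detection is rejected.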
When a first event occurs in the video, the terminal device may initiate a video capture task.
In this case, as described above, the video capture task may be started from the target video frame image. Continuing the above example, the video capture task may be initiated from the 5th frame image.
According to some embodiments of the present disclosure, initiating a video capture task includes initiating a capture operation on the video. Specifically, in the case where the video is a video captured by a camera in real time, the operation of starting the video capturing task includes starting recording the video.
According to further embodiments of the present disclosure, initiating a video capture task includes recording the time in the video at which the first event begins to occur as the video capture start time. It will be appreciated that the time at which the first event begins to occur is the point at which the first event first appears in the video, i.e., the instant of transition from the first event not occurring to the first event occurring. In addition, the video capture start time may be a time within the video, i.e., a relative time; however, it may also represent an absolute real-world time, which the present disclosure does not limit.
S72, determining whether the video has a second event or not within a preset time period after the first event is ended.
The following description will take, as an example, the association of the second event with the first event.
In some embodiments of the present disclosure, the second event being associated with the first event means that the second event is of the same event type as the first event: for example, both involve a face appearing, both involve user movement, or both involve the presence of another specified object (e.g., a cat, a specified device, etc.).
In other embodiments of the present disclosure, the second event being associated with the first event may also mean that the second event is identical to the first event. For example, the second event and the first event are both the appearance of user A's face. It is understood that "identical" here means that the image content corresponding to the two events is the same; the position and size at which that content appears need not be identical.
In further embodiments of the present disclosure, the second event being associated with the first event means: the second event may be a subsequent event to the first event. For example, assembling an article includes two steps, process a and process b, where process a is performed before process b is performed, in which case the event corresponding to process a is a first event and the event corresponding to process b is a second event.
At the end of the first event detected by step S70, a timer may be started to determine whether the video has a second event within a predetermined period of time. The predetermined time period is related to an application scenario of the present disclosure, for example, may be 10 seconds, 30 seconds, or the like, which is not limited by the present disclosure.
That is, after the event detected in step S70 is ended, it is determined whether or not a next event corresponding thereto has occurred within a predetermined period of time. For example, in the case of detecting a face, when the face disappears from the video, the timer is started, and whether a face appears again is detected for a predetermined period of time.
In addition, the manner of determining whether the second event exists may be the same as that of determining the second event in step S70, that is, whether the event occurs may be determined by analysis of the video frame images.
In the case that the second event occurs in the video, the terminal device executes step S74; in case it is determined that the second event does not occur in the video, the terminal device performs step S78.
Further, for the process of detecting the end of the first event, it is possible to detect whether the first event ends in combination with one or more frames, similarly to the case of detecting the occurrence of the first event described above.
For example, suppose the first event is found to have ended at the 20th frame image. In this case, the judgment over one or more subsequent frames may be performed, and if the first event does not reoccur in those frames, the 20th frame is taken as the frame at which the first event ends.
In addition, it is understood that, when multiple frames are used to determine whether the second event occurs, it may be required that several images (for example, 3 frames, 5 frames, etc.) containing the object corresponding to the second event all appear within the predetermined period of time before the second event is determined to occur in the video; alternatively, the second event may be determined to occur as long as a single image containing the corresponding object appears within the predetermined period of time.
S74, determining whether the video has a third event or not within a preset time period after the second event is finished.
In some embodiments of the present disclosure, the third event may be associated with the first event or the second event, where association has the same meaning as described in step S72 and is not repeated here. It should be noted that at least two of the first event, the second event, and the third event are associated with each other; in a more specific case, all three events are associated with one another.
If it is determined in step S72 that the second event occurs in the video, the terminal device may determine whether a third event occurs in the video within a predetermined period of time after the second event ends.
Specifically, the method of determining whether the third event occurs in the video may also be performed by extracting features of the video frame image and analyzing the features.
In case it is determined that the third event occurs in the video, the terminal device performs step S76; in case it is determined that the third event does not occur in the video, the terminal device performs step S78.
In addition, it is likewise understood that, when multiple frames are used to determine whether the third event occurs, it may be required that several images (for example, 3 frames, 5 frames, etc.) containing the object corresponding to the third event all appear within the predetermined period of time before the third event is determined to occur in the video; alternatively, the third event may be determined to occur as long as a single image containing the corresponding object appears within the predetermined period of time.
And S76, taking the third event as a second event.
If it is determined in step S74 that the video has a third event, the third event is taken as a second event, and the operation of determining whether the video has the third event within a predetermined period of time after the end of the second event is performed in step S74. Thus, as shown in fig. 7, the loop process of step S74 and step S76 is formed.
It can be seen that, as long as another associated event occurs within the predetermined period of time after an event ends, the loop continues to execute; only when no associated event occurs within the predetermined period after an event ends does the process jump from step S74 to step S78.
For example, the predetermined time period is 10 seconds. If the event a is finished, the event b associated with the event a occurs within 10 seconds, then continuing to judge whether the event associated with the event a (or the event b) occurs within 10 seconds after the event b is finished, if the associated event c occurs, then continuing to judge whether the event associated with the previous event occurs within 10 seconds after the event c is finished, … …, and so on.
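The a-b-c chaining in this example can be illustrated over event (start, end) timestamps. The helper below is a hypothetical sketch, not the disclosed implementation; it assumes the events are already detected and sorted by start time:

```python
def merge_chained_events(events, period):
    """Given associated events as sorted (start, end) times in seconds,
    return the span covered by the chain beginning at the first event:
    each next event joins the chain if it starts within `period`
    seconds of the previous event's end (the step S74/S76 loop)."""
    chain_start, chain_end = events[0]
    for start, end in events[1:]:
        if start - chain_end <= period:
            chain_end = end         # associated event within period
        else:
            break                   # gap exceeds period: chain ends
    return chain_start, chain_end
```

With a 10-second period, events at (0, 5) and (8, 12) chain together (gap of 3 seconds), while a later event at (30, 35) does not (gap of 18 seconds), so the chained span is (0, 12).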
S78, if the second event or the third event does not occur, ending the video interception task to determine the intercepted video clip.
In an embodiment where the operation of starting the video capture task includes starting the capturing operation on the video, the operation of ending the video capture task by the terminal device includes: ending the intercepting operation of the video. Specifically, when the video is a video captured by the camera in real time, ending the video capturing task includes stopping recording the video.
In an embodiment where starting the video capture task includes recording a video capture start time, if it is determined in step S72 that the second event does not occur, ending the video capture task includes: recording the time that is the predetermined period after the end of the first event as the video capture end time. A span of the video can then be determined from the video capture start time and the video capture end time, and a cut operation performed over that span to determine the intercepted video clip.
Likewise, if it is determined in step S74 that the third event does not occur, ending the video capture task includes: recording the time that is the predetermined period after the end of the second event as the video capture end time. A span of the video can then be determined from the video capture start time and the video capture end time, and a cut operation performed over that span to determine the intercepted video clip.
For example, in the video, the video capturing start time is 01:30, and the video capturing end time is 03:00, in which case, the terminal device may capture a video clip corresponding to 01:30 to 03:00 from the video, that is, determine the captured video clip.
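For the 01:30-to-03:00 example, converting the recorded relative times into a cut span might look like the following; this is illustrative only, and the "MM:SS" format is an assumption taken from the example:

```python
def clip_span(start_mmss, end_mmss):
    """Convert the recorded video-capture start/end times (relative
    "MM:SS" positions within the video) into a (start, end) span in
    seconds over which the cut operation can be performed."""
    def to_seconds(mmss):
        minutes, seconds = mmss.split(":")
        return int(minutes) * 60 + int(seconds)
    start, end = to_seconds(start_mmss), to_seconds(end_mmss)
    if end <= start:
        raise ValueError("capture end time must follow the start time")
    return start, end
```

The resulting (90, 180) span in seconds corresponds to the clip from 01:30 to 03:00 in the example.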
After determining the intercepted video clip, and in view of the fact that no corresponding event exists in the last predetermined time period of the clip, the terminal device may remove the video segment of the last predetermined duration from the intercepted video clip to generate a target video clip. In addition, the terminal device may upload the target video clip to the cloud for storage.
Therefore, the cloud end can respond to the video acquisition request corresponding to the target video segment sent by the terminal equipment or other equipment, and send the target video segment to the equipment sending the request.
In addition, considering that the processing resources of the terminal equipment are limited, the terminal equipment can directly upload the cut video clips to the cloud.
In this case, in some embodiments, the cloud end may, in response to a video acquisition request corresponding to the intercepted video clip, reject a video clip of a last predetermined duration from the intercepted video clip to generate a target video clip, and send the target video clip to a requesting end that initiates the video acquisition request for viewing by a user.
In other embodiments, the cloud may remove the video segment of the last predetermined duration from the intercepted video clip, then generate and store the target video clip, so that the cloud responds to a video acquisition request corresponding to the intercepted video clip by sending the target video clip to the request end that initiated the request, for the user to watch. It can be appreciated that, in some embodiments of the present disclosure, at least two consecutive associated events (including identical events) occurring within the preset time interval can be intercepted from the video, with each associated event kept uninterrupted (the video between any two consecutive associated events is also intercepted), improving the viewing experience for the user.
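Whether performed on the terminal device or on the cloud, removing the trailing predetermined-duration segment (which by construction contains no event) is a simple end-time adjustment. A minimal sketch under that assumption:

```python
def trim_trailing_period(clip_start, clip_end, period):
    """Produce the target clip by removing the last `period` seconds
    (the event-free tail) from an intercepted clip given as start/end
    times in seconds."""
    trimmed_end = clip_end - period
    if trimmed_end <= clip_start:
        raise ValueError("clip is not longer than the trailing period")
    return clip_start, trimmed_end
```

For the earlier example, trimming a 10-second trailing period from the (90, 180) clip yields a (90, 170) target clip.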
The overall process of the video processing scheme according to the embodiment of the present disclosure will be described below with reference to fig. 8, taking the occurrence of the same preset event as an example.
In step S802, the terminal device monitors a video shot by the camera in real time. The camera can be integrated on the terminal equipment, and in addition, the camera can be connected with the terminal equipment in a wired or wireless mode, so that the terminal equipment can acquire videos.
In step S804, the terminal device determines whether a preset event occurs in the video. If so, step S806 is performed; if not, return to step S802.
In step S806, after the preset event ends, recording is extended by N seconds, where N seconds corresponds to the predetermined time period, for example, 10 seconds, 30 seconds, etc.
In step S808, the terminal device determines whether a preset event occurs again within N seconds. If so, returning to step S806; if not, step S810 is performed.
In step S810, the terminal device determines a truncated video clip, where the truncated video clip includes N seconds of video clip after the end of the last preset event.
In step S812, the terminal device removes the last N seconds from the video clip determined in step S810, and uploads the result to the cloud for storage.
Fig. 9 schematically illustrates a flow chart of a scheme for video capture by cloud participation according to another embodiment of the present disclosure.
In step S902, the cloud acquires and stores a video clip captured by the terminal device. The process of determining the cut video clip by the terminal device may be as shown in the above steps S802 to S810.
In step S904, the cloud receives a video acquisition request corresponding to the video clip.
In step S906, the cloud may remove the last N seconds from the video clip and send the resulting clip to the request end of the video acquisition request.
In addition, the present disclosure also provides another video processing method for a scene in which only video containing two related events needs to be output. Referring to fig. 10, the video processing method may include the steps of:
s102, when a first event occurs in the video, starting a video interception task.
Step S102 is the same as step S70 described above, and will not be described again.
S104, if the related event of the first event does not appear within a preset time after the first event is ended, ending the video interception task to determine the intercepted video clip.
Determining whether two events are associated is similar to the case where the first event is associated with the second event in step S72. After the first event occurs, the terminal device may determine whether an associated event of the first event occurs within the predetermined time period after the first event ends; if no associated event occurs, the video capture task is ended to determine the intercepted video clip.
The process of ending the video capturing task to determine the captured video clip is similar to the process of step S78, and will not be described again.
S106, if a second event associated with the first event occurs within a preset time period after the first event is ended and the associated event of the first event does not occur within the preset time period after the second event is ended, ending the video interception task to determine the intercepted video clip.
If the event associated with the first event occurs within a predetermined time period after the end of the first event and is recorded as a second event, the terminal device may end the video capturing task to determine the captured video clip after the second event is ended and the associated event of the first event (or the second event) does not occur within the predetermined time period.
In the present exemplary solution, considering that events in some scenes often have strong continuity, the video capture task is ended only after a predetermined period of time following the end of the second event, which avoids omitting or discarding relevant information that may exist in the video during that predetermined period.
In addition, aiming at other scenes, the method and the device can also remove the video with the last preset time length, and generate a target video clip for storage.
In some embodiments of the present disclosure, the terminal device may reject a video clip of a last predetermined duration from the intercepted video clip, and generate a target video clip. In addition, the terminal equipment can upload the target video clip to the cloud for storage.
Therefore, the cloud end can respond to the video acquisition request corresponding to the target video segment sent by the terminal equipment or other equipment, and send the target video segment to the equipment sending the request.
In addition, considering that the processing resources of the terminal equipment are limited, the terminal equipment can directly upload the cut video clips to the cloud.
In this case, in some embodiments, the cloud end may, in response to a video acquisition request corresponding to the intercepted video clip, reject a video clip of a last predetermined duration from the intercepted video clip to generate a target video clip, and send the target video clip to a requesting end that initiates the video acquisition request for viewing by a user.
In other embodiments, the cloud end may remove the video segment of the last predetermined duration from the intercepted video segment, generate and store the target video segment, so that the cloud end responds to the video acquisition request corresponding to the intercepted video segment, and sends the target video segment to the request end initiating the video acquisition request for the user to watch.
Based on the video processing method of the present disclosure, on one hand, video clips covering a plurality of associated events can be intercepted from the video; on another hand, the intercepted video clip is a continuous segment, so the clip watched by the user is continuous and the events in it are complete; on yet another hand, storing the intercepted clip rather than the full video greatly saves storage space.
It should be noted that although the steps of the methods in the present disclosure are depicted in the accompanying drawings in a particular order, this does not require or imply that the steps must be performed in that particular order, or that all illustrated steps be performed, to achieve desirable results. Additionally or alternatively, certain steps may be omitted, multiple steps combined into one step to perform, and/or one step decomposed into multiple steps to perform, etc.
Further, a video processing apparatus is also provided in this example embodiment.
Fig. 11 schematically shows a block diagram of a video processing apparatus of an exemplary embodiment of the present disclosure. Referring to fig. 11, the video processing apparatus 11 according to an exemplary embodiment of the present disclosure may include a task starting module 111, an event detecting module 113, and a first video intercepting module 115.
Specifically, the task initiation module 111 may be configured to initiate a video capturing task when a first event occurs in a video; the event detection module 113 may be configured to determine whether the video has a second event within a predetermined time period after the end of the first event; if the second event occurs, determining whether a third event occurs in the video within a preset time period after the second event is finished; if the third event occurs, the third event is taken as a second event; the first video capture module 115 may be configured to end the video capture task to determine a captured video clip if the second event or the third event does not occur; at least two of the first event, the second event and the third event are related events.
According to an exemplary embodiment of the present disclosure, the first video capture module 115 may be further configured to perform: and eliminating the video clips with the last preset time length from the intercepted video clips to generate target video clips.
According to an exemplary embodiment of the present disclosure, the first video capture module 115 may be further configured to perform: and transmitting the cut video clips to the designated equipment so that the designated equipment can remove the video clips with the last preset duration from the cut video clips to generate target video clips.
According to an exemplary embodiment of the present disclosure, referring to fig. 12, the video processing apparatus 12 may further include a video clip uploading module 121, as compared to the video processing apparatus 11.
Specifically, the video clip upload module 121 may be configured to perform: and uploading the cut video clips to the cloud. In this case, the cloud end responds to the video acquisition request corresponding to the intercepted video segment, eliminates the video segment with the last preset duration from the intercepted video segment, generates a target video segment, and sends the target video segment to a request end initiating the video acquisition request; or the cloud end eliminates the video segment with the last preset time length from the intercepted video segment, generates a target video segment and stores the target video segment, so that the cloud end responds to the video acquisition request corresponding to the intercepted video segment and sends the target video segment to a request end for initiating the video acquisition request.
According to an example embodiment of the present disclosure, the process of the task initiation module 111 initiating the video capture task may be configured to perform: when a first event occurs in the video, an intercept operation is initiated on the video. In this case, the process of ending the video capture task by the first video capture module 115 may be configured to perform: ending the intercepting operation of the video.
According to an example embodiment of the present disclosure, the process of the task initiation module 111 initiating the video capture task may be configured to perform: the time at which the first event starts to occur in the video is recorded as the video capture start time. In this case, the process of the first video capture module 115 ending the video capture task to determine the captured video clip may be configured to perform: under the condition that the second event does not appear, recording the time which is determined to be longer than the preset time after the first event is ended and is taken as video interception ending time, and intercepting the video based on the video interception starting time and the video interception ending time so as to determine intercepted video clips; and under the condition that the third event does not occur, recording the time which is determined to be longer than the preset time after the second event is ended and is used as the video interception ending time, and intercepting the video based on the video interception starting time and the video interception ending time so as to determine the intercepted video clip.
According to an exemplary embodiment of the present disclosure, the first event is a preset event, the preset event including a user-preset event or a system-preset event. In this case, referring to fig. 13, the video processing apparatus 13 may further include an image analysis module 131 in addition to the modules of the video processing apparatus 11.
In particular, the image analysis module 131 may be configured to: extract features from video frame images in the video; and determine, according to the extracted features, whether a preset event occurs in the video.
According to an exemplary embodiment of the present disclosure, the process by which the image analysis module 131 determines whether a preset event occurs in the video according to the extracted features may be configured as: determining, from the video according to the extracted features, a target video frame image in which a preset object appears, the preset object being an object used to determine that an event is the preset event; if the preset object is also present in one or more video frame images following the target video frame image, determining that the preset event occurs in the video; and starting the video capture task from the target video frame image.
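A minimal sketch of this persistence check follows, assuming a caller-supplied predicate `contains_preset_object` stands in for the feature-extraction model described above; the function name and the `confirm_frames` parameter are illustrative, not from the patent.

```python
# Illustrative sketch: the preset event is confirmed only when the
# preset object persists across the target frame and the following frames.

def detect_preset_event(frames, contains_preset_object, confirm_frames=3):
    """Return the index of the target frame from which the capture task
    should start, or None if no preset event is confirmed.

    A frame is a target frame when the preset object appears in it and
    in each of the next `confirm_frames` frames."""
    for i in range(len(frames) - confirm_frames):
        if all(contains_preset_object(frames[j])
               for j in range(i, i + confirm_frames + 1)):
            return i  # start the capture task from this target frame
    return None
```

Requiring the object to persist over several consecutive frames filters out single-frame false positives from the feature extractor.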
According to an exemplary embodiment of the present disclosure, the video is a video captured by a camera in real time.
Further, another video processing apparatus is also provided in the present exemplary embodiment.
Fig. 14 schematically shows a block diagram of a video processing apparatus according to another exemplary embodiment of the present disclosure. Referring to fig. 14, the video processing apparatus 14 according to an exemplary embodiment of the present disclosure may include a task initiation module 111, a second video capture module 141, and a third video capture module 143.
Specifically, the task initiation module 111 may be configured to initiate a video capture task when a first event occurs in the video; the second video capture module 141 may be configured to end the video capture task to determine the intercepted video clip if no event associated with the first event occurs within a predetermined duration after the end of the first event; and the third video capture module 143 may be configured to end the video capture task to determine the intercepted video clip if a second event associated with the first event occurs within the predetermined duration after the end of the first event and no event associated with the first event occurs within the predetermined duration after the end of the second event.
According to an exemplary embodiment of the present disclosure, the third video capture module 143 may be further configured to: remove the last video segment of the predetermined duration from the intercepted video segment to generate the target video segment.
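The tail-removal step can be sketched as a one-line helper operating on a (start, end) time window in seconds; the function name `trim_tail` is an assumed, illustrative name.

```python
# Illustrative sketch of removing the last predetermined-duration
# segment from an intercepted clip.

def trim_tail(clip_start, clip_end, preset_duration):
    """Remove the trailing preset-duration segment so the target clip
    ends roughly when the last associated event ends."""
    # Guard against clips shorter than the preset duration.
    return clip_start, max(clip_start, clip_end - preset_duration)
```

Because the capture window intentionally runs one predetermined duration past the last event (while waiting for a possible associated event), trimming that same duration off the tail yields a target clip without the idle waiting period.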
According to an exemplary embodiment of the present disclosure, the third video capture module 143 may be further configured to: transmit the intercepted video segment to a designated device, so that the designated device removes the last video segment of the predetermined duration from the intercepted video segment to generate the target video segment.
According to an exemplary embodiment of the present disclosure, the video processing apparatus 14 may further include the video clip uploading module 121 described above.
Since each functional module of the video processing apparatus according to the embodiments of the present disclosure is the same as that in the above-described method embodiments, a detailed description thereof is omitted here.
From the above description of embodiments, those skilled in the art will readily appreciate that the example embodiments described herein may be implemented in software, or in software in combination with the necessary hardware. Thus, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (which may be a CD-ROM, a USB flash drive, a removable hard disk, etc.) or on a network, and which includes several instructions to cause a computing device (which may be a personal computer, a server, a terminal device, a network device, etc.) to perform the method according to the embodiments of the present disclosure.
Furthermore, the above-described figures are only schematic illustrations of the processes included in the method according to the exemplary embodiments of the present disclosure, and are not intended to be limiting. It will be readily appreciated that the processes shown in the above figures do not indicate or limit their temporal order. In addition, it is also readily understood that these processes may be performed synchronously or asynchronously, for example, in a plurality of modules.
It should be noted that although in the above detailed description several modules or units of a device for action execution are mentioned, such a division is not mandatory. Indeed, the features and functionality of two or more modules or units described above may be embodied in one module or unit in accordance with embodiments of the present disclosure. Conversely, the features and functions of one module or unit described above may be further divided into a plurality of modules or units to be embodied.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice in the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with the true scope and spirit of the disclosure being indicated by the following claims.
It is to be understood that the present disclosure is not limited to the precise arrangements and instrumentalities shown in the drawings, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (17)

1. A video processing method, comprising:
when a first event occurs in a video, starting a video capture task;
determining whether a second event occurs in the video within a predetermined duration after the end of the first event;
if the second event occurs, determining whether a third event occurs in the video within the predetermined duration after the end of the second event;
if the third event occurs, taking the third event as the second event;
if the second event or the third event does not occur, ending the video capture task to determine an intercepted video segment, and removing the last video segment of the predetermined duration from the intercepted video segment to generate a target video segment;
wherein the first event, the second event, and the third event are events associated with one another, and the second event is a subsequent event to the first event; or
the first event is associated with each of the second event and the third event, and the second event is a subsequent event to the first event.
2. The video processing method according to claim 1, characterized in that the video processing method further comprises:
removing the last video segment of the predetermined duration from the intercepted video segment to generate the target video segment.
3. The video processing method according to claim 1, characterized in that the video processing method further comprises:
transmitting the intercepted video segment to a designated device, so that the designated device removes the last video segment of the predetermined duration from the intercepted video segment to generate the target video segment.
4. The video processing method according to claim 1, characterized in that the video processing method further comprises:
uploading the intercepted video segment to a cloud;
wherein the cloud, in response to a video acquisition request corresponding to the intercepted video segment, removes the last video segment of the predetermined duration from the intercepted video segment to generate a target video segment and sends the target video segment to a requesting end that initiated the video acquisition request; or the cloud removes the last video segment of the predetermined duration from the intercepted video segment, generates the target video segment, and stores it, so that in response to a video acquisition request corresponding to the intercepted video segment, the cloud sends the target video segment to the requesting end that initiated the video acquisition request.
5. The video processing method of claim 1, wherein initiating the video capture task comprises: starting an intercept operation on the video;
and ending the video capture task comprises: ending the intercept operation on the video.
6. The video processing method of claim 1, wherein initiating the video capture task comprises: recording the time at which the first event starts to occur in the video as a video capture start time;
and ending the video capture task to determine the intercepted video segment comprises: in the case where the second event does not occur, recording the time at which the predetermined duration elapses after the end of the first event as a video capture end time, and intercepting the video based on the video capture start time and the video capture end time to determine the intercepted video segment; and in the case where the third event does not occur, recording the time at which the predetermined duration elapses after the end of the second event as the video capture end time, and intercepting the video based on the video capture start time and the video capture end time to determine the intercepted video segment.
7. The video processing method according to claim 1, wherein the first event is a preset event, the preset event including a user-preset event or a system-preset event; the video processing method further comprising:
extracting features from video frame images in the video;
and determining, according to the extracted features, whether the preset event occurs in the video.
8. The video processing method according to claim 7, wherein determining whether the preset event occurs in the video according to the extracted features comprises:
determining, from the video according to the extracted features, a target video frame image in which a preset object appears, the preset object being an object used to determine that an event is the preset event;
if the preset object is present in one or more video frame images following the target video frame image, determining that the preset event occurs in the video;
and starting the video capture task from the target video frame image.
9. The video processing method according to any one of claims 1 to 8, wherein the video is a video captured by a camera in real time.
10. A video processing method, comprising:
when a first event occurs in a video, starting a video capture task;
if no event associated with the first event occurs within a predetermined duration after the end of the first event, ending the video capture task to determine an intercepted video segment;
if a second event associated with the first event occurs within the predetermined duration after the end of the first event and no event associated with the first event occurs within the predetermined duration after the end of the second event, ending the video capture task to determine an intercepted video segment, and removing the last video segment of the predetermined duration from the intercepted video segment to generate a target video segment;
wherein the second event is a subsequent event to the first event.
11. The video processing method according to claim 10, characterized in that the video processing method further comprises:
and eliminating the video clips with the preset time length from the intercepted video clips to generate target video clips.
12. The video processing method according to claim 10, characterized in that the video processing method further comprises:
transmitting the intercepted video segment to a designated device, so that the designated device removes the last video segment of the predetermined duration from the intercepted video segment to generate the target video segment.
13. The video processing method according to claim 10, characterized in that the video processing method further comprises:
uploading the intercepted video segment to a cloud;
wherein the cloud, in response to a video acquisition request corresponding to the intercepted video segment, removes the last video segment of the predetermined duration from the intercepted video segment to generate a target video segment and sends the target video segment to a requesting end that initiated the video acquisition request; or the cloud removes the last video segment of the predetermined duration from the intercepted video segment, generates the target video segment, and stores it, so that in response to a video acquisition request corresponding to the intercepted video segment, the cloud sends the target video segment to the requesting end that initiated the video acquisition request.
14. A video processing apparatus, comprising:
a task initiation module, configured to start a video capture task when a first event occurs in a video;
an event determination module, configured to determine whether a second event occurs in the video within a predetermined duration after the end of the first event; if the second event occurs, determine whether a third event occurs in the video within the predetermined duration after the end of the second event; and if the third event occurs, take the third event as the second event; and
a first video capture module, configured to, if the second event or the third event does not occur, end the video capture task to determine an intercepted video segment, and remove the last video segment of the predetermined duration from the intercepted video segment to generate a target video segment;
wherein the first event, the second event, and the third event are events associated with one another, and the second event is a subsequent event to the first event; or
the first event is associated with each of the second event and the third event, and the second event is a subsequent event to the first event.
15. A video processing apparatus, comprising:
a task initiation module, configured to start a video capture task when a first event occurs in a video;
a second video capture module, configured to end the video capture task to determine an intercepted video segment if no event associated with the first event occurs within a predetermined duration after the end of the first event; and
a third video capture module, configured to, if a second event associated with the first event occurs within the predetermined duration after the end of the first event and no event associated with the first event occurs within the predetermined duration after the end of the second event, end the video capture task to determine an intercepted video segment, and remove the last video segment of the predetermined duration from the intercepted video segment to generate a target video segment;
wherein the second event is a subsequent event to the first event.
16. A computer readable storage medium, on which a computer program is stored, characterized in that the program, when executed by a processor, implements the video processing method according to any one of claims 1 to 13.
17. An electronic device, comprising:
a processor;
a memory for storing one or more programs that, when executed by the processor, cause the processor to implement the video processing method of any of claims 1-13.
CN202110082809.3A 2021-01-21 2021-01-21 Video processing method and device, computer readable storage medium and electronic equipment Active CN114827713B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202110082809.3A CN114827713B (en) 2021-01-21 2021-01-21 Video processing method and device, computer readable storage medium and electronic equipment
PCT/CN2021/126446 WO2022156294A1 (en) 2021-01-21 2021-10-26 Video processing method and apparatus, computer readable storage medium, and electronic device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110082809.3A CN114827713B (en) 2021-01-21 2021-01-21 Video processing method and device, computer readable storage medium and electronic equipment

Publications (2)

Publication Number Publication Date
CN114827713A CN114827713A (en) 2022-07-29
CN114827713B true CN114827713B (en) 2023-08-08

Family

ID=82524330

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110082809.3A Active CN114827713B (en) 2021-01-21 2021-01-21 Video processing method and device, computer readable storage medium and electronic equipment

Country Status (2)

Country Link
CN (1) CN114827713B (en)
WO (1) WO2022156294A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102340625A (en) * 2010-07-16 2012-02-01 安讯士有限公司 Method for event initiated video capturing and a video camera for capture event initiated video
CN105791730A (en) * 2014-12-23 2016-07-20 北京同步科技有限公司 Prerecording system and method applied to video monitoring
CN111355990A (en) * 2020-03-17 2020-06-30 网易(杭州)网络有限公司 Video acquisition method and device, computer readable storage medium and electronic equipment

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9141859B2 (en) * 2008-11-17 2015-09-22 Liveclips Llc Method and system for segmenting and transmitting on-demand live-action video in real-time
CN102547139A (en) * 2010-12-30 2012-07-04 北京新岸线网络技术有限公司 Method for splitting news video program, and method and system for cataloging news videos
US10867376B2 (en) * 2015-08-28 2020-12-15 Nec Corporation Analysis apparatus, analysis method, and storage medium
CN105681749A (en) * 2016-01-12 2016-06-15 上海小蚁科技有限公司 Method, device and system for previewing videos and computer readable media
CN110830847B (en) * 2019-10-24 2022-05-06 杭州威佩网络科技有限公司 Method and device for intercepting game video clip and electronic equipment
CN112069937A (en) * 2020-08-21 2020-12-11 深圳市商汤科技有限公司 Event detection method and device, electronic equipment and storage medium


Also Published As

Publication number Publication date
WO2022156294A1 (en) 2022-07-28
CN114827713A (en) 2022-07-29

Similar Documents

Publication Publication Date Title
CN109714415B (en) Data processing method and device
CN102655585B (en) Video conference system and time delay testing method, device and system thereof
CN109960969B (en) Method, device and system for generating moving route
CN111970524B (en) Control method, device, system, equipment and medium for interactive live broadcast and microphone connection
CN113067994B (en) Video recording method and electronic equipment
WO2020049980A1 (en) A method for identifying potential associates of at least one target person, and an identification device
CN112511859B (en) Video processing method, device and storage medium
CN111919451A (en) Live broadcasting method, live broadcasting device and terminal
CN108616769B (en) Video-on-demand method and device
CN112770151A (en) Method, device and storage medium for supporting multi-person interception of screen-projected playing picture
CN107979754B (en) A kind of test method and device based on camera application
CN108540817B (en) Video data processing method, device, server and computer readable storage medium
CN114827713B (en) Video processing method and device, computer readable storage medium and electronic equipment
CN108881119B (en) Method, device and system for video concentration
CN110414322B (en) Method, device, equipment and storage medium for extracting picture
CN113506320B (en) Image processing method and device, electronic equipment and storage medium
CN116708892A (en) Sound and picture synchronous detection method, device, equipment and storage medium
TWI542194B (en) Three-dimensional image processing system, apparatus and method for the same
WO2017049474A1 (en) Filming method and smart wristband
CN112437332B (en) Playing method and device of target multimedia information
CN109874036B (en) Video analysis method and device, equipment and storage medium
CN113641247A (en) Sight angle adjusting method and device, electronic equipment and storage medium
CN108828965B (en) Positioning method, electronic equipment, intelligent household system and storage medium
CN108810395B (en) Method and device for quickly displaying initial frame image and quickly shooting image
CN113506322B (en) Image processing method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant