CN114266785A - Optical flow prediction method, optical flow prediction device, electronic device, and storage medium - Google Patents

Optical flow prediction method, optical flow prediction device, electronic device, and storage medium

Info

Publication number: CN114266785A
Application number: CN202111570506.2A (filed 2021-12-21; published 2022-04-01)
Authority: CN (China)
Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis)
Prior art keywords: optical flow, sky, image, sky area
Other languages: Chinese (zh)
Inventors: 刘晶晶, 徐宁
Current and original assignee: Beijing Dajia Internet Information Technology Co Ltd
Priority: CN202111570506.2A, priority date 2021-12-21
Classification (Landscapes): Image Analysis (AREA)
Abstract

The present disclosure relates to an optical flow prediction method, an optical flow prediction apparatus, an electronic device, and a storage medium, the optical flow prediction method including: acquiring an image including a sky region; acquiring texture features of the image, and predicting optical flow in the image based on the acquired texture features of the image; determining a sky region in the image by segmenting the image; obtaining an optical flow of the sky area based on the predicted optical flow and the determined sky area.

Description

Optical flow prediction method, optical flow prediction device, electronic device, and storage medium
Technical Field
The present disclosure relates to the field of signal processing, and in particular, to an optical flow prediction method, an optical flow prediction apparatus, an electronic device, and a storage medium.
Background
The popularization of smartphones allows everyone to shoot photos and videos anytime and anywhere to record memorable moments in daily life and travel. Among the material people shoot every day, landscape and portrait photos containing a sky background account for a considerable proportion. At present, many computer vision and computer graphics techniques help people edit sky photos and videos and thereby create more attractive multimedia content. For example, one may replace the gray sky in an original image with a blue sky, add an interesting dynamic object to the sky area of a photograph, or generate a dynamic sky video. Unlike other sky photograph editing techniques, dynamic sky generation does not need an additional sky image; the generated video is still based on the sky background of the original photograph. The technique can generate a dynamic sky video (also referred to as a "sky cinemagraph") as long as the motion pattern of the sky region is known. In practice, however, subsequent applications built on the sky optical flow often perform poorly because the sky optical flow cannot be predicted accurately. For example, if the predicted sky optical flow is not accurate enough, the visual effect of the generated dynamic sky video will be poor (e.g., the motion is not vivid enough or shows visual defects). In other words, how well the sky optical flow is predicted determines, to a large extent, the visual effect of the generated sky cinemagraph. In view of this, a technique capable of predicting the sky optical flow more accurately is required.
Disclosure of Invention
The present disclosure provides an optical flow prediction method, an optical flow prediction apparatus, an electronic device, and a storage medium, so as to at least solve the problem of poor optical flow prediction accuracy in the related art.
According to a first aspect of the embodiments of the present disclosure, there is provided an optical flow prediction method, wherein the optical flow prediction method includes: acquiring an image including a sky region; acquiring texture features of the image, and predicting optical flow in the image based on the acquired texture features of the image; determining a sky region in the image by segmenting the image; obtaining an optical flow of the sky area based on the predicted optical flow and the determined sky area.
Optionally, the obtaining an optical flow of the sky area based on the predicted optical flow and the determined sky area includes: filtering out the optical flow of non-sky areas by comparing the predicted optical flow with the determined sky area; and determining the extent to which the optical flow remaining after the filtering covers the sky area, and obtaining the optical flow of the sky area based on the remaining optical flow according to the determined extent.
Optionally, the determining the extent to which the remaining optical flow covers the sky area and the obtaining the optical flow of the sky area based on the remaining optical flow according to the determined extent include: determining whether the optical flow remaining after filtering out the optical flow of non-sky areas covers more than a predetermined proportion of the sky area; if it covers more than the predetermined proportion of the sky area, determining the remaining optical flow as the optical flow of the sky area; if it does not, determining a completeness of the remaining optical flow according to the area of the sky region covered by the remaining optical flow; if the completeness exceeds a preset threshold, obtaining the optical flow of the sky area through optical flow propagation based on the remaining optical flow; and if the completeness does not exceed the preset threshold, obtaining the optical flow of the sky area using a preset optical flow template and the determined sky area.
Optionally, the obtaining of the optical flow of the sky area using a preset optical flow template and the determined sky area includes: selecting, based on the determined characteristics of the sky area, the optical flow template with the highest degree of matching with the sky area from a plurality of preset optical flow templates, and obtaining the optical flow of the sky area according to the selected optical flow template.
Optionally, the obtaining an optical flow of the sky area according to the selected optical flow template includes: obtaining an optical flow of the sky area by performing a random optical flow perturbation on the selected optical flow template.
Optionally, the characteristic of the sky region includes at least one of a position, a shape, and a size of the sky region in the image.
Optionally, the predicting optical flow in the image based on the acquired texture features of the image includes: predicting the optical flow in the image using a pre-trained deep generative adversarial network model based on the texture features, wherein the deep generative adversarial network model is trained based on sky time-lapse videos.
Optionally, the determining the sky region in the image by segmenting the image includes: distinguishing, based on the image, a sky region and a non-sky region in the image using a pre-trained sky segmentation model, wherein the sky segmentation model is trained based on sky segmentation labeling data from sky time-lapse videos.
According to a second aspect of the embodiments of the present disclosure, there is provided an optical flow prediction apparatus including: an image acquisition unit configured to acquire an image including a sky region; an image optical flow prediction unit configured to acquire texture features of the image and predict an optical flow in the image based on the acquired texture features; a sky region determination unit configured to determine a sky region in the image by segmenting the image; and a sky optical flow obtaining unit configured to obtain an optical flow of the sky area based on the predicted optical flow and the determined sky area.
Optionally, the obtaining an optical flow of the sky area based on the predicted optical flow and the determined sky area includes: filtering out the optical flow of non-sky areas by comparing the predicted optical flow with the determined sky area; and determining the extent to which the optical flow remaining after the filtering covers the sky area, and obtaining the optical flow of the sky area based on the remaining optical flow according to the determined extent.
Optionally, the determining the extent to which the remaining optical flow covers the sky area and the obtaining the optical flow of the sky area based on the remaining optical flow according to the determined extent include: determining whether the optical flow remaining after filtering out the optical flow of non-sky areas covers more than a predetermined proportion of the sky area; if it covers more than the predetermined proportion of the sky area, determining the remaining optical flow as the optical flow of the sky area; if it does not, determining a completeness of the remaining optical flow according to the area of the sky region covered by the remaining optical flow; if the completeness exceeds a preset threshold, obtaining the optical flow of the sky area through optical flow propagation based on the remaining optical flow; and if the completeness does not exceed the preset threshold, obtaining the optical flow of the sky area using a preset optical flow template and the determined sky area.
Optionally, the obtaining of the optical flow of the sky area using a preset optical flow template and the determined sky area includes: selecting, based on the determined characteristics of the sky area, the optical flow template with the highest degree of matching with the sky area from a plurality of preset optical flow templates, and obtaining the optical flow of the sky area according to the selected optical flow template.
Optionally, the obtaining an optical flow of the sky area according to the selected optical flow template includes: obtaining an optical flow of the sky area by performing a random optical flow perturbation on the selected optical flow template.
Optionally, the characteristic of the sky region includes at least one of a position, a shape, and a size of the sky region in the image.
Optionally, the predicting optical flow in the image based on the acquired texture features of the image includes: predicting the optical flow in the image using a pre-trained deep generative adversarial network model based on the texture features, wherein the deep generative adversarial network model is trained on sky time-lapse videos by learning the mapping relationship between the sky texture and the sky optical flow in those videos.
Optionally, the determining the sky region in the image by segmenting the image includes: distinguishing, based on the image, a sky region and a non-sky region in the image using a pre-trained sky segmentation model, wherein the sky segmentation model is trained based on sky segmentation labeling data from sky time-lapse videos.
According to a third aspect of the embodiments of the present disclosure, there is provided an electronic apparatus, including: at least one processor; at least one memory storing computer-executable instructions, wherein the computer-executable instructions, when executed by the at least one processor, cause the at least one processor to perform the optical flow prediction method as described above.
According to a fourth aspect of embodiments of the present disclosure, there is provided a computer-readable storage medium storing instructions that, when executed by at least one processor, cause the at least one processor to perform the optical flow prediction method as described above.
According to a fifth aspect of the embodiments of the present disclosure, there is provided a computer program product including computer instructions, wherein the computer instructions, when executed by a processor, implement the optical flow prediction method as described above.
The technical solutions provided by the embodiments of the present disclosure bring at least the following beneficial effects. According to the optical flow prediction method of the embodiments of the present disclosure, the optical flow in the image is predicted based on the texture features of the acquired image, the sky area in the image is determined by segmenting the image, and the optical flow of the sky area is then obtained based on both the predicted optical flow and the determined sky area, so that the optical flow of the sky area can be predicted more accurately.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate exemplary embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure and are not to be construed as limiting the disclosure.
FIG. 1 is an exemplary system architecture to which exemplary embodiments of the present disclosure may be applied;
FIG. 2 is a flow chart of an optical flow prediction method of an exemplary embodiment of the present disclosure;
FIG. 3 is a schematic diagram illustrating an optical flow prediction method of an exemplary embodiment of the present disclosure;
FIG. 4 is a diagram illustrating an example of an optical flow of a sky area obtained using an optical flow prediction method according to an exemplary embodiment of the present disclosure;
FIG. 5 is a block diagram illustrating an optical flow prediction apparatus of an exemplary embodiment of the present disclosure;
FIG. 6 is a block diagram of an electronic device according to an exemplary embodiment of the present disclosure.
Detailed Description
In order to make the technical solutions of the present disclosure better understood by those of ordinary skill in the art, the technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings.
It should be noted that the terms "first," "second," and the like in the description and claims of the present disclosure and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the disclosure described herein are capable of operation in sequences other than those illustrated or otherwise described herein. The embodiments described in the following examples do not represent all embodiments consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.
Herein, the expression "at least one of the items" covers three parallel cases: "any one of the items", "a combination of any plural ones of the items", and "all of the items". For example, "including at least one of A and B" covers the following three parallel cases: (1) including A; (2) including B; (3) including A and B. For another example, "performing at least one of step one and step two" covers the following three parallel cases: (1) performing step one; (2) performing step two; (3) performing step one and step two.
Fig. 1 illustrates an exemplary system architecture 100 in which exemplary embodiments of the present disclosure may be applied.
As shown in fig. 1, the system architecture 100 may include terminal devices 101, 102, and 103, a network 104, and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, and 103 and the server 105, and may include various connection types, such as wired links, wireless communication links, or fiber optic cables. A user may use the terminal devices 101, 102, and 103 to interact with the server 105 over the network 104 to receive or send messages (e.g., image or video data upload requests, image or video data download requests). Various communication client applications, such as audio and video communication software, audio and video recording software, instant messaging software, conference software, mailbox clients, and social platform software, may be installed on the terminal devices 101, 102, and 103, as may various image or video shooting and editing applications. The terminal devices 101, 102, and 103 may be hardware or software. When they are hardware, they may be various electronic devices having a display screen and capable of playing, recording, and editing audio and video, including but not limited to smartphones, tablet computers, laptop portable computers, and desktop computers. When they are software, they may be installed in the electronic devices listed above and may be implemented as multiple pieces of software or software modules (for example, to provide distributed services) or as a single piece of software or software module. No particular limitation is imposed here.
The terminal devices 101, 102, and 103 may be equipped with an image capturing component (e.g., a camera) to capture image or video data. In practice, the smallest visual unit making up a video is a frame: each frame is a static image, and a temporally successive sequence of frames is composited to form a motion video. Further, the terminal devices 101, 102, and 103 may be equipped with a component (e.g., a speaker) that converts an electric signal into sound for playback, and with a device (e.g., a microphone) that converts an analog audio signal into a digital audio signal to pick up sound. In addition, the terminal devices 101, 102, and 103 can perform voice or video communication with each other.
The server 105 may be a server providing various services, such as a background server providing support for multimedia applications installed on the terminal devices 101, 102, and 103. The background server may analyze and store received data such as audio/video upload requests, may receive audio/video download requests sent by the terminal devices 101, 102, and 103, and may feed back the requested audio/video data to those terminal devices.
The server may be hardware or software. When the server is hardware, it may be implemented as a distributed server cluster formed by multiple servers, or may be implemented as a single server. When the server is software, it may be implemented as multiple pieces of software or software modules (e.g., to provide distributed services), or as a single piece of software or software module. And is not particularly limited herein.
It should be noted that the optical flow prediction method provided by the embodiment of the present disclosure is generally executed by a terminal device, but may also be executed by a server, or may also be executed by cooperation of the terminal device and the server. Accordingly, the optical flow prediction means may be provided in the terminal device, in the server, or in both the terminal device and the server.
It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation, and the disclosure is not limited thereto.
With the progress of artificial intelligence technology, most existing sky optical flow prediction methods use a deep neural network model to learn, from massive sky time-lapse videos, how to judge the dynamic pattern of a sky area from the texture features of a picture. That is, these methods directly predict the sky optical flow in the image. They extract real optical flow from sky time-lapse videos and use it as labeling data for training the neural network. Owing to the strong learning capability of deep neural networks, a model trained on a large amount of such data acquires the ability to predict optical flow from the image scene. For example, these methods can judge the motion direction of the whole sky area from the shape and distribution of clouds, while fine-tuning the direction and amplitude of the dynamic change of each sky pixel based on local texture features. Their greatest advantage is that the mapping between static sky texture and optical flow (motion pattern) can be learned autonomously with the help of existing optical flow extraction algorithms and massive video data, without time-consuming and labor-intensive manual labeling; given a new sky photo, the sky optical flow can be obtained without human intervention. A disadvantage, however, is that the training data comes primarily from time-lapse videos. When a time-lapse frame contains other dynamically changing areas, such as rivers or lake surfaces, the sky optical flow obtained by the optical flow extraction algorithm carries labeling errors, so the trained model may predict optical flow on some non-sky areas. Limited by the scale of the training data, it is also difficult to generate realistic dynamic pictures from the optical flow of these non-sky regions. In addition, sky time-lapse videos contain a limited range of sky scenes, whereas the diversity of sky images in real scenes is high. As a result, the optical flow generated by this kind of method contains more errors: the predicted optical flow of some sky pixels is zero, while non-sky areas obtain large optical flow. For dynamic sky video generation, this means that part of the sky area stays still while static objects exhibit image displacement, which greatly degrades the visual effect of the dynamic sky video.
Compared with directly predicting the sky optical flow in an image, sky segmentation is a more tractable computer vision problem. A relatively robust way of generating a sky optical flow is therefore to obtain the sky area in an image by sky segmentation and then impose a dynamic change pattern on that area. Based on the sky segmentation result, such a method first uses a neural network encoder to extract depth features from the sky region; it then applies a projective (homography) transformation to the depth features using a manually predefined homography matrix; finally, the transformed features are passed through a decoder to generate the corresponding video frames. Because it uses the sky segmentation result, this method can suppress optical flow in non-sky areas well, and most sky pixels obtain relatively good dynamic textures. Its problem, however, is that it requires an artificially defined homography matrix to generate the sky dynamics. Although changing the homography matrix also changes the motion pattern of the sky, the method ignores the importance of the image features of the sky area in determining its dynamic pattern, which easily makes the generated dynamic sky inconsistent with the picture scene. For example, in some sky images a vertical texture displacement is visually more reasonable, whereas this method is likely to impose a horizontal motion.
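To make the homography-based prior art concrete: a predefined 3x3 homography can be applied to a sky photo (or its feature map) with OpenCV. The following is a hedged sketch of that prior-art idea, not the method of the present disclosure; the matrix values and the file name are hypothetical:

```python
import cv2
import numpy as np

# Hypothetical 3x3 homography encoding a slow horizontal drift; in the
# prior-art method this matrix is predefined by hand, not inferred from
# the image content.
H = np.array([[1.0, 0.0, 2.5],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]], dtype=np.float32)

frame = cv2.imread("sky_photo.jpg")                 # assumed input path
h, w = frame.shape[:2]
next_frame = cv2.warpPerspective(frame, H, (w, h))  # drifted sky texture
```

Because the same matrix is applied regardless of the scene, a drift like this can contradict the cloud layout in the photo, which is exactly the weakness described above.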
In summary, as mentioned in the background, subsequent applications based on the sky optical flow often perform poorly in practice because the sky optical flow cannot be predicted accurately. For example, if the predicted sky optical flow is not accurate enough, the generated dynamic sky video will have a poor visual effect (e.g., it is not realistic enough, or there are visual flaws).
In view of this, the optical flow prediction method according to the exemplary embodiments of the present disclosure can obtain harmonious and vivid motion patterns (optical flows) on most real sky images, making it convenient to generate, based on the predicted optical flow, dynamic sky videos with good visual effects and few image quality problems.
FIG. 2 is a flowchart of an optical flow prediction method according to an exemplary embodiment of the present disclosure.
Referring to fig. 2, in step S210, an image including a sky area is acquired. Here, the image may be a single image including the sky area. As an example, an image including a sky region may be acquired in response to an operation instruction of a user to acquire the image. Here, the operation instruction may be, for example, a touch operation of the user on the user interface, but is not limited thereto. Any manner of acquiring an image (manual or automatic) may be employed to acquire an image including a region of the sky, and the present disclosure is not limited thereto.
In step S220, texture features of the image are acquired, and the optical flow in the image is predicted based on the acquired texture features. Specifically, in step S220, after the texture features of the image are acquired, the optical flow in the image may be predicted using a pre-trained deep generative adversarial network model based on those texture features. Here, the optical flow refers to the displacement of each pixel of the image in the image coordinate system, which corresponds to the motion pattern of the image. The deep generative adversarial network model is used to predict the optical flow over the entire image and is hereinafter also referred to as the "optical flow prediction model".
The deep generative adversarial network model is a type of deep neural network model. In the prediction stage it works just like a conventional deep neural network: the optical flow of each image pixel is predicted from the texture features of the image fed into the network. The main difference lies in the training strategy. A conventional deep neural network only learns the pixel-level mapping between static texture and optical flow, whereas the deep generative adversarial network model, in addition to this local pixel-level mapping, introduces an adversarial loss function computed on the whole image so that the predicted optical flow better matches the global image scene. This reduces optical flow prediction distortion caused by local overfitting and improves the generalization ability of the model.
The deep generative adversarial network model may be trained on sky time-lapse videos by learning the mapping relationship between the sky texture and the sky optical flow in those videos. Collecting a large amount of sky time-lapse video to cover more sky textures and sky motion patterns (i.e., sky optical flows) improves the model's ability to adapt to practical application scenarios. As is well known to those skilled in the art, a deep generative adversarial network model generally includes a generator network and a discriminator network. When such a network is used to predict the sky optical flow, the generator predicts a sky optical flow from the sky texture in a frame of a sky time-lapse video, and the discriminator judges whether the predicted sky optical flow is a real sky optical flow from the time-lapse video; in other words, the discriminator outputs the probability that the flow predicted by the generator is real. Through continued training with the whole-image adversarial loss function, the generator learns to predict sky optical flows that are sufficiently "realistic" that the discriminator can hardly tell whether they are real or not, at which point training is complete. Through this process, the deep generative adversarial network model is trained on sky time-lapse videos by learning the mapping relationship between the sky texture and the sky optical flow.
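The disclosure does not specify a concrete network architecture or training code. The following PyTorch sketch is therefore only an illustration, under assumed toy architectures, of the training strategy described above: a generator maps an image to a two-channel flow field, and a discriminator supplies a whole-image adversarial loss on (image, flow) pairs alongside a per-pixel flow error. Layer sizes, optimizers, and the equal loss weighting are all assumptions:

```python
import torch
import torch.nn as nn

class FlowGenerator(nn.Module):
    # Assumed toy generator: RGB image -> per-pixel (dx, dy) flow field.
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 2, 3, padding=1),
        )

    def forward(self, img):
        return self.net(img)

class FlowDiscriminator(nn.Module):
    # Judges whether an (image, flow) pair looks like a real time-lapse sample.
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(5, 32, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, 1),
        )

    def forward(self, img, flow):
        return self.net(torch.cat([img, flow], dim=1))

G, D = FlowGenerator(), FlowDiscriminator()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

def train_step(img, real_flow):
    # img: (B, 3, H, W) frame from a sky time-lapse video;
    # real_flow: (B, 2, H, W) flow extracted from that video (the labels).
    ones = torch.ones(img.size(0), 1)
    zeros = torch.zeros(img.size(0), 1)
    fake_flow = G(img)

    # Discriminator: real pairs -> 1, generated pairs -> 0.
    d_loss = bce(D(img, real_flow), ones) + bce(D(img, fake_flow.detach()), zeros)
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator: fool D (whole-image adversarial term) plus pixel-level flow error.
    g_loss = bce(D(img, fake_flow), ones) + (fake_flow - real_flow).abs().mean()
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```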
Therefore, by predicting the optical flow in the image from its texture features using the deep generative adversarial network model in step S220, the present disclosure can effectively reduce optical flow prediction distortion caused by local overfitting and can adapt to practical application scenarios.
Next, in step S230, a sky region in the image is determined by segmenting the image. According to an example embodiment, a pre-trained sky segmentation model (also referred to as a "sky matting model") may be used to distinguish sky regions from non-sky regions in the image. Here, the sky segmentation model is trained based on sky segmentation labeling data extracted from sky time-lapse videos, which may be the same sky time-lapse videos used above to train the optical flow prediction model.
The sky segmentation model can distinguish the sky region from the non-sky region in the image, which helps obtain the optical flow of the sky region based on the optical flow predicted in step S220 and the determined sky region (hereinafter this is also referred to as optimizing the optical flow). According to an exemplary embodiment, in order to better support subsequent sky cinemagraph applications, the sky segmentation model may be trained using a large amount of sky segmentation labeling data. The sky segmentation model performs binary classification on each pixel in the image to distinguish the sky area from the non-sky area. Any pixel-level binary classification model can be used as the sky segmentation model; preferably, it is a deep neural network model, whose advantage is that image features representing sky regions can be learned automatically from a large number of sky segmentation labeling images, without manually designing image features and classification models. When a fully convolutional deep network model is used, the sky segmentation model can be applied to images of various aspect ratios and sizes. As an example, the sky segmentation model may be trained by extracting sky segmentation labeling data from sky time-lapse videos, refining the labeling at complex sky edges, and training on the refined labeling data. In a sky time-lapse video, only the sky area shows obvious dynamic change, while non-sky areas show no or only very slight image change; therefore, when extracting sky segmentation labeling data from such videos, sky and non-sky areas can be distinguished through image motion information, as sketched below.
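As a rough sketch of this labeling idea (the thresholds, and the assumption that per-frame-pair flow fields have already been extracted, are illustrative rather than taken from the disclosure), sky pixels can be taken to be those that move in most frame pairs of a time-lapse clip:

```python
import numpy as np

def sky_mask_from_timelapse(flows, motion_thresh=0.5, temporal_ratio=0.8):
    # flows: list of (H, W, 2) optical flow fields between consecutive frames.
    # In a sky time-lapse, only the sky shows sustained motion, so pixels
    # that move in most frame pairs are labeled as sky. Thresholds are
    # illustrative assumptions.
    moving = [np.linalg.norm(f, axis=2) > motion_thresh for f in flows]
    frac_moving = np.mean(moving, axis=0)   # (H, W), fraction of pairs moving
    return frac_moving > temporal_ratio     # True = sky pixel label
```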
After the sky area in the image is determined by segmentation, the optical flow of the sky area may be obtained in step S240 based on the predicted optical flow and the determined sky area. Based on the sky segmentation result, dynamic change in non-sky areas can be well suppressed while the sky area presents a more complete dynamic texture. Specifically, in step S240, the predicted optical flow may be compared with the determined sky area to filter out the optical flow of non-sky areas; then, the extent to which the remaining optical flow covers the sky area may be determined, and the optical flow of the sky area may be obtained from the remaining optical flow according to the determined extent.
As an example, determining the extent to which the remaining optical flow covers the sky area, and obtaining the optical flow of the sky area accordingly, may proceed as follows. First, it is determined whether the optical flow remaining after filtering out the optical flow of non-sky areas covers more than a predetermined proportion of the sky area. If it does (at which point the remaining optical flow may be considered a complete sky optical flow), the remaining optical flow is determined to be the optical flow of the sky area. For example, using the segmented sky area, the proportion of sky pixels with predicted optical flow relative to the whole sky area in the image can be computed; when this proportion is greater than a predetermined proportion (e.g., 95%), the predicted sky optical flow can be considered complete. Conversely, if the predetermined proportion of the sky area is not covered (at which point the remaining optical flow may be considered incomplete), the completeness of the remaining optical flow is determined from the area of the sky region it covers. For example, the completeness may be the ratio between the area of the sky region covered by the remaining optical flow and the area of the sky region, or a value derived from that ratio (for example, the ratio multiplied by a predetermined value); the completeness may even simply be the covered area itself. There are many possible definitions of completeness, and the present disclosure is not limited in this respect, as long as it is determined from the area of the sky region covered by the remaining optical flow. If the completeness exceeds a preset threshold, the optical flow of the sky area is obtained through optical flow propagation based on the remaining optical flow. That is, when the optical flow field does not yet cover most of the sky area, optical flow propagation is used to conduct the known optical flow to the whole sky area to obtain a more complete sky optical flow. Here, optical flow propagation transfers the predicted sky optical flow to sky pixels without predicted optical flow through a neighborhood rule; for example, it may be performed using a breadth-first search algorithm. Conversely, if the completeness does not exceed the preset threshold, the optical flow of the sky area is obtained using a preset optical flow template and the determined sky area. The reason is that in some extreme scenes only a small area of sky optical flow can be obtained, which must be regarded as unreliable, so a preset optical flow template is used instead to fill the sky area of the image. For example, the optical flow template with the highest degree of matching with the sky area can be selected from a plurality of preset optical flow templates based on the determined characteristics of the sky area, and the optical flow of the sky area can be obtained from the selected template, e.g., by applying a random optical flow perturbation to it, which increases the local diversity of the optical flow in the image area.
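A minimal NumPy sketch of this decision chain follows. The 95% coverage ratio, the completeness threshold, the template matching score, and the multiplicative perturbation are illustrative assumptions; the breadth-first search implements the neighborhood propagation rule mentioned above:

```python
import numpy as np
from collections import deque

# Illustrative thresholds: the disclosure only speaks of a "predetermined
# proportion" (e.g., 95%) and a "preset threshold", so these numbers are
# assumptions for the sketch.
COVER_RATIO = 0.95
COMPLETENESS_THRESH = 0.30

def template_score(template, sky_mask):
    # Hypothetical matching score: fraction of sky pixels on which the
    # template defines motion. The disclosure instead matches on
    # characteristics such as position, shape, and size.
    support = np.linalg.norm(template, axis=2) > 1e-3
    return (support & sky_mask).sum() / max(sky_mask.sum(), 1)

def propagate_flow(flow, known, sky_mask):
    # Breadth-first propagation: conduct known flow to uncovered sky
    # pixels through a 4-neighborhood rule.
    flow, known = flow.copy(), known.copy()
    H, W = sky_mask.shape
    q = deque(zip(*np.nonzero(known)))
    while q:
        y, x = q.popleft()
        for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            ny, nx = y + dy, x + dx
            if 0 <= ny < H and 0 <= nx < W and sky_mask[ny, nx] and not known[ny, nx]:
                flow[ny, nx] = flow[y, x]
                known[ny, nx] = True
                q.append((ny, nx))
    return flow

def perturb(flow, scale=0.05):
    # Random optical flow perturbation to add local diversity (assumed form).
    return flow * (1.0 + scale * np.random.randn(*flow.shape))

def sky_optical_flow(pred_flow, sky_mask, templates):
    # pred_flow: (H, W, 2) flow predicted over the whole image;
    # sky_mask:  (H, W) bool, True at sky pixels;
    # templates: list of (H, W, 2) preset optical flow templates.
    flow = np.where(sky_mask[..., None], pred_flow, 0.0)   # filter non-sky flow
    has_flow = (np.linalg.norm(flow, axis=2) > 1e-3) & sky_mask
    covered = has_flow.sum() / max(sky_mask.sum(), 1)      # doubles as completeness

    if covered >= COVER_RATIO:            # complete sky optical flow
        return flow
    if covered >= COMPLETENESS_THRESH:    # propagate known flow over the sky
        return propagate_flow(flow, has_flow, sky_mask)
    # Too little reliable flow: fall back to the best-matching template.
    best = max(templates, key=lambda t: template_score(t, sky_mask))
    return perturb(np.where(sky_mask[..., None], best, 0.0))
```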
As an example, the characteristics of the sky region may include at least one of the position, shape, and size of the sky region in the image, but are not limited thereto.
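For illustration, such characteristics could be computed from the binary sky mask as follows; these particular descriptors are assumptions, since the disclosure names only position, shape, and size:

```python
import numpy as np

def sky_region_features(sky_mask):
    # Simple descriptors of a binary sky mask for template matching:
    # normalized centroid (position), area ratio (size), bbox aspect (shape).
    ys, xs = np.nonzero(sky_mask)
    h, w = sky_mask.shape
    centroid = (ys.mean() / h, xs.mean() / w)
    area_ratio = sky_mask.mean()
    bbox_aspect = (xs.ptp() + 1) / (ys.ptp() + 1)
    return centroid, area_ratio, bbox_aspect
```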
Optionally, the method shown in fig. 2 may further include: generating a sky cinemagraph from the obtained optical flow of the sky area. Since the optical flow prediction method shown in fig. 2 combines optical flow prediction with sky segmentation (that is, the optical flow of the sky area is obtained based on both the predicted optical flow and the determined sky area), a more accurate sky optical flow can be obtained. Correspondingly, a sky cinemagraph generated from this more accurate sky optical flow is more vivid, better fits the sky scene in the image, and has a better visual effect.
Fig. 3 is a schematic diagram illustrating an optical flow prediction method according to an exemplary embodiment of the present disclosure. To more intuitively understand the optical flow prediction method shown in fig. 2, an example of the optical flow prediction method shown in fig. 2 is briefly described below with reference to fig. 3.
As shown in fig. 3, after an image including the sky (referred to as the "sky image" in fig. 3) is acquired, the sky image is input to the above-mentioned optical flow prediction model and sky matting model, respectively. The optical flow prediction model then predicts the optical flow in the image, and the sky matting model determines the sky region. Next, the optical flow of the sky area may be obtained based on the optical flow predicted by the optical flow prediction model and the sky area determined by the sky matting model.
Specifically, it may first be determined whether the sky optical flow is complete, in the manner described above for step S240. If it is complete, it is determined to be the final optical flow of the sky area. If not, it is further determined whether the completeness exceeds a preset threshold (e.g., X); if it does, the known optical flow is conducted to the whole sky area through optical flow propagation to obtain the optical flow of the sky area. Otherwise, an optical flow template with the highest degree of matching with the sky area may be selected from a plurality of preset optical flow templates based on the determined features of the sky area, and the optical flow of the sky area may be obtained by applying a random optical flow perturbation to the selected template.
By combining the deep-learning-based optical flow prediction model with the sky matting model, the sky optical flow can be predicted robustly. Furthermore, as mentioned above in the description of step S240, in some extreme cases a preset sky optical flow template may be applied to obtain the optical flow of the sky area. Therefore, for sky images in various real scenes, the optical flow prediction method according to the exemplary embodiments of the present disclosure can predict a good sky optical flow and generate dynamic sky short videos with good visual effects and few image quality problems.
Fig. 4 is a view illustrating an example of an optical flow of a sky area obtained by an optical flow prediction method according to an exemplary embodiment of the present disclosure.
The sky optical flow can be predicted accurately and robustly from a single image using the optical flow prediction method of the exemplary embodiments of the present disclosure. As shown in fig. 4, the first row shows single sky images (images including a sky area), and the second row shows visualizations of the obtained sky optical flow. Different colors represent different flow directions, the shade of the color represents the flow magnitude, blank areas indicate no dynamic change (i.e., no optical flow), and the ray direction in each grid cell is the specific direction of the optical flow. As the example of fig. 4 shows, the sky optical flow obtained by the optical flow prediction method according to the exemplary embodiments of the present disclosure fits the sky scene in the image relatively well: it captures the motion pattern of the whole sky region while suppressing the generation of dynamic texture in non-sky regions. In addition, the obtained sky optical flow has strong local diversity, so a dynamic sky video generated from it will be more vivid and have fewer image quality problems.
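A visualization in this spirit can be produced with the common HSV flow-coloring scheme; this is a generic sketch rather than the exact rendering used for fig. 4 (here, zero-motion pixels render black rather than blank):

```python
import cv2
import numpy as np

def visualize_flow(flow):
    # Hue encodes flow direction, brightness encodes flow magnitude.
    fx = flow[..., 0].astype(np.float32)
    fy = flow[..., 1].astype(np.float32)
    mag, ang = cv2.cartToPolar(fx, fy)                        # ang in radians
    hsv = np.zeros((*flow.shape[:2], 3), dtype=np.uint8)
    hsv[..., 0] = (ang * 180 / np.pi / 2).astype(np.uint8)    # direction -> hue
    hsv[..., 1] = 255
    hsv[..., 2] = cv2.normalize(mag, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
    return cv2.cvtColor(hsv, cv2.COLOR_HSV2BGR)
```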
Fig. 5 is a block diagram illustrating an optical flow prediction apparatus according to an exemplary embodiment of the present disclosure.
Referring to fig. 5, the optical flow prediction apparatus 500 may include an image acquisition unit 510, an image optical flow prediction unit 520, a sky region determination unit 530, and a sky optical flow obtaining unit 540. Specifically, the image acquisition unit 510 may be configured to acquire an image including a sky region. The image optical flow prediction unit 520 may be configured to acquire texture features of the image and predict an optical flow in the image based on the acquired texture features. The sky region determination unit 530 may be configured to determine a sky region in the image by segmenting the image. The sky optical flow obtaining unit 540 may be configured to obtain an optical flow of the sky area based on the predicted optical flow and the determined sky area.
Further, although not shown in fig. 5, the optical flow prediction apparatus 500 may optionally further include a sky cinemagraph generation unit configured to generate a sky cinemagraph from the obtained optical flow of the sky area.
Since the optical flow prediction method shown in fig. 2 can be executed by the optical flow prediction apparatus 500 shown in fig. 5, and the image acquisition unit 510, the image optical flow prediction unit 520, the sky area determination unit 530 and the sky optical flow obtaining unit 540 can respectively execute the operations corresponding to step S210, step S220, step S230 and step S240 in fig. 2, any relevant details related to the operations executed by the units in fig. 5 can be referred to the corresponding description of fig. 2, and are not repeated here.
Furthermore, it should be noted that although the optical flow prediction apparatus 500 is described above as being divided into units that respectively execute corresponding processes, it is clear to those skilled in the art that the processes described above can also be executed by the optical flow prediction apparatus 500 without any specific division into units, or without clear demarcation between the units. In addition, the optical flow prediction apparatus 500 may further include other units, for example, a storage unit.
Fig. 6 is a block diagram of an electronic device according to an exemplary embodiment of the present disclosure.
Referring to FIG. 6, an electronic device 600 may include at least one memory 601 and at least one processor 602, the at least one memory storing computer-executable instructions that, when executed by the at least one processor, cause the at least one processor 602 to perform an optical flow prediction method in accordance with embodiments of the present disclosure.
By way of example, the electronic device may be a PC, a tablet device, a personal digital assistant, a smartphone, or any other device capable of executing the above set of instructions. The electronic device need not be a single electronic device; it may be any collection of devices or circuits that can execute the above instructions (or instruction sets) individually or in combination. The electronic device may also be part of an integrated control system or system manager, or may be configured as a portable electronic device that interfaces locally or remotely (e.g., via wireless transmission).
In an electronic device, a processor may include a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), a programmable logic device, a special-purpose processor system, a microcontroller, or a microprocessor. By way of example, and not limitation, processors may also include analog processors, digital processors, microprocessors, multi-core processors, processor arrays, network processors, and the like.
The processor may execute instructions or code stored in the memory, which may also store data. The instructions and data may also be transmitted or received over a network via a network interface device, which may employ any known transmission protocol.
The memory may be integral to the processor, e.g., RAM or flash memory disposed within an integrated circuit microprocessor or the like. Further, the memory may comprise a stand-alone device, such as an external disk drive, storage array, or any other storage device usable by a database system. The memory and the processor may be operatively coupled or may communicate with each other, such as through an I/O port, a network connection, etc., so that the processor can read files stored in the memory.
In addition, the electronic device may also include a video display (such as a liquid crystal display) and a user interaction interface (such as a keyboard, mouse, touch input device, etc.). All components of the electronic device may be connected to each other via a bus and/or a network.
According to an embodiment of the present disclosure, there may also be provided a computer-readable storage medium storing instructions that, when executed by at least one processor, cause the at least one processor to perform the optical flow prediction method according to the exemplary embodiments of the present disclosure. Examples of the computer-readable storage medium herein include: read-only memory (ROM), random-access programmable read-only memory (PROM), electrically erasable programmable read-only memory (EEPROM), random-access memory (RAM), dynamic random-access memory (DRAM), static random-access memory (SRAM), flash memory, non-volatile memory, CD-ROM, CD-R, CD+R, CD-RW, CD+RW, DVD-ROM, DVD-R, DVD+R, DVD-RW, DVD+RW, DVD-RAM, BD-ROM, BD-R, BD-R LTH, BD-RE, Blu-ray or optical disc storage, hard disk drive (HDD), solid-state drive (SSD), card-type memory (such as a multimedia card, a Secure Digital (SD) card, or an eXtreme Digital (XD) card), magnetic tape, a floppy disk, a magneto-optical data storage device, an optical data storage device, a hard disk, a solid-state disk, and any other device configured to store a computer program and any associated data, data files, and data structures in a non-transitory manner and to provide them to a processor or computer so that the processor or computer can execute the program. The instructions in the computer-readable storage medium or computer program described above may be run in an environment deployed on computer equipment such as a client, a host, a proxy device, or a server; further, in one example, the computer program and any associated data, data files, and data structures are distributed across networked computer systems so that they are stored, accessed, and executed in a distributed fashion by one or more processors or computers.
According to an embodiment of the present disclosure, there may also be provided a computer program product including computer instructions that, when executed by a processor, implement the optical flow prediction method according to an exemplary embodiment of the present disclosure.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.

Claims (10)

1. An optical flow prediction method, comprising:
acquiring an image including a sky region;
acquiring texture features of the image, and predicting optical flow in the image based on the acquired texture features of the image;
determining a sky region in the image by segmenting the image;
obtaining an optical flow of the sky area based on the predicted optical flow and the determined sky area.
2. The optical-flow prediction method of claim 1, wherein the obtaining an optical flow of the sky area based on the predicted optical flow and the determined sky area comprises:
filtering out optical flow of non-sky areas by comparing the predicted optical flow to the determined sky area;
determining the extent to which the optical flow remaining after filtering out the optical flow of non-sky areas covers the sky area, and obtaining the optical flow of the sky area based on the remaining optical flow according to the determined extent.
3. The optical flow prediction method of claim 2, wherein the determining the extent to which the remaining optical flow covers the sky area and the obtaining the optical flow of the sky area based on the remaining optical flow according to the determined extent comprise:
determining whether the optical flow remaining after filtering out the optical flow of non-sky areas covers more than a predetermined proportion of the sky area;
determining the remaining optical flow as the optical flow of the sky area if it covers more than the predetermined proportion of the sky area;
determining a completeness of the remaining optical flow according to an area of the sky region covered by the remaining optical flow if it does not cover more than the predetermined proportion of the sky area;
if the completeness exceeds a preset threshold, obtaining the optical flow of the sky area through optical flow propagation based on the remaining optical flow;
if the completeness does not exceed the preset threshold, obtaining the optical flow of the sky area using a preset optical flow template and the determined sky area.
4. The optical flow prediction method of claim 3, wherein the obtaining of the optical flow of the sky area using a preset optical flow template and the determined sky area comprises:
based on the determined characteristics of the sky area, selecting an optical flow template with the highest matching degree with the sky area from a plurality of preset optical flow templates, and obtaining the optical flow of the sky area according to the selected optical flow template.
5. The optical flow prediction method of claim 4, wherein the obtaining an optical flow of the sky area according to the selected optical flow template comprises:
obtaining an optical flow of the sky area by performing a random optical flow perturbation on the selected optical flow template.
6. The optical flow prediction method of claim 4, wherein the features of the sky region include at least one of a location, a shape, and a size of the sky region in the image.
7. The optical flow prediction method of claim 1, wherein the predicting the optical flow in the image based on the obtained texture features of the image comprises:
predicting the optical flow in the image using a pre-trained deep generative adversarial network model based on the texture features, wherein the deep generative adversarial network model is trained on sky time-lapse videos by learning the mapping relationship between the sky texture and the sky optical flow in the sky time-lapse videos.
8. An optical flow prediction apparatus comprising:
an image acquisition unit configured to acquire an image including a sky region;
an image optical flow prediction unit configured to acquire a texture feature of the image and predict an optical flow in the image based on the acquired texture feature of the image;
a sky region determination unit configured to determine a sky region in the image by segmenting the image;
a sky optical flow obtaining unit configured to obtain an optical flow of the sky area based on the predicted optical flow and the determined sky area.
9. An electronic device, comprising:
at least one processor;
at least one memory storing computer-executable instructions,
wherein the computer-executable instructions, when executed by the at least one processor, cause the at least one processor to perform the optical flow prediction method of any one of claims 1 to 7.
10. A computer-readable storage medium storing instructions that, when executed by at least one processor, cause the at least one processor to perform the optical flow prediction method of any one of claims 1 to 7.
Application: CN202111570506.2A, filed 2021-12-21 (priority date 2021-12-21)
Publication: CN114266785A, published 2022-04-01 (status: pending)
Family ID: 80828736
Country of publication: CN


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination