WO2022166896A1 - 视频生成方法、装置、设备及可读存储介质 - Google Patents

视频生成方法、装置、设备及可读存储介质 Download PDF

Info

Publication number
WO2022166896A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
sketch
images
pixel
mask
Prior art date
Application number
PCT/CN2022/075037
Other languages
English (en)
French (fr)
Inventor
王旭
刘凯
Original Assignee
北京字跳网络技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京字跳网络技术有限公司 filed Critical 北京字跳网络技术有限公司
Priority to EP22749175.0A priority Critical patent/EP4277261A4/en
Priority to BR112023015702A priority patent/BR112023015702A2/pt
Priority to JP2023547370A priority patent/JP2024506014A/ja
Priority to US18/264,232 priority patent/US20240095981A1/en
Publication of WO2022166896A1 publication Critical patent/WO2022166896A1/zh

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/002D [Two Dimensional] image generation
    • G06T11/20Drawing from basic elements, e.g. lines or circles
    • G06T11/203Drawing of straight lines or curves
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00Details of television systems
    • H04N5/222Studio circuitry; Studio devices; Studio equipment
    • H04N5/262Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects ; Cameras specially adapted for the electronic generation of special effects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/002D [Two Dimensional] image generation
    • G06T11/001Texturing; Colouring; Generation of texture or colour
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T13/00Animation
    • G06T13/802D [Two Dimensional] animation, e.g. using sprites
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/70Denoising; Smoothing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168Feature extraction; Face representation
    • G06V40/171Local features and components; Facial parts ; Occluding parts, e.g. glasses; Geometrical relationships
    • GPHYSICS
    • G11INFORMATION STORAGE
    • G11BINFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B27/00Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
    • G11B27/02Editing, e.g. varying the order of information signals recorded on, or reproduced from, record carriers
    • G11B27/031Electronic editing of digitised analogue information signals, e.g. audio or video signals
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00Details of television systems
    • H04N5/222Studio circuitry; Studio devices; Studio equipment
    • H04N5/262Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects ; Cameras specially adapted for the electronic generation of special effects
    • H04N5/2625Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects ; Cameras specially adapted for the electronic generation of special effects for obtaining an image which is composed of images from a temporal image sequence, e.g. for a stroboscopic effect
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00Details of television systems
    • H04N5/222Studio circuitry; Studio devices; Studio equipment
    • H04N5/262Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects ; Cameras specially adapted for the electronic generation of special effects
    • H04N5/265Mixing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30196Human being; Person
    • G06T2207/30201Face
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2210/00Indexing scheme for image generation or computer graphics
    • G06T2210/12Bounding box
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00Details of television systems
    • H04N5/222Studio circuitry; Studio devices; Studio equipment
    • H04N5/262Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects ; Cameras specially adapted for the electronic generation of special effects
    • H04N5/272Means for inserting a foreground image in a background image, i.e. inlay, outlay
    • H04N2005/2726Means for inserting a foreground image in a background image, i.e. inlay, outlay for simulating a person's appearance, e.g. hair style, glasses, clothes

Definitions

  • the embodiments of the present disclosure relate to the technical field of image processing, and in particular, to a video generation method, apparatus, device, and readable storage medium.
  • Sketching is a kind of image stylization.
  • Sketching is an art form that uses pencil as a medium to express scenes or characters with lines.
  • Sketches can be divided into line-structure drawings and fine realistic sketches.
  • Creating sketches by hand requires the author to have a certain artistic background.
  • With the increasing intelligence of terminal devices, users can obtain sketch images and the like by using them.
  • In the process of acquiring a sketch image, the terminal device performs style conversion processing on the source image selected by the user, thereby obtaining the sketch image.
  • It is desirable to further generate a dynamic video that simulates the process of painting by an artist, drawing the sketch image stroke by stroke.
  • Embodiments of the present disclosure provide a video generation method, apparatus, device, and readable storage medium, which generate, based on an image, a video simulating a painter creating a sketch image stroke by stroke, with a simple process.
  • an embodiment of the present disclosure provides a video generation method, including:
  • generating a plurality of sketch images according to a source image, where the plurality of sketch images respectively correspond to sketch images of the source image under different color depths;
  • generating, based on a target sketch image, multiple sub-images of the target sketch image, where the multiple sub-images respectively correspond to sketch images of the target sketch image under different drawing completion degrees, and the target sketch image is any sketch image among the plurality of sketch images;
  • using each sub-image of each sketch image in the plurality of sketch images as a video frame of a sketching video, and setting the sequence of the video frames according to the order of color depth from light to dark and the order of drawing completion degree from low to high, to generate the sketching video.
  • the present disclosure provides a video generation device, comprising:
  • a first generating unit configured to generate a plurality of sketch images according to the source image, where the plurality of sketch images respectively correspond to sketch images of the source image under different color depths;
  • the second generating unit is configured to generate, based on the target sketch image, multiple sub-images of the target sketch image, where the multiple sub-images respectively correspond to sketch images of the target sketch image under different drawing completion degrees, and the target sketch image is any one of the plurality of sketch images;
  • the third generating unit is configured to use each sub-image of each sketch image in the plurality of sketch images as a video frame of the sketching video, and to set the sequence of the video frames in the order of color depth from light to dark and the order of drawing completion degree from low to high, to generate the sketching video.
  • an electronic device comprising: at least one processor and a memory;
  • the memory stores computer-executable instructions
  • the at least one processor executes the computer-executable instructions stored in the memory to cause the at least one processor to perform the video generation method as described in the first aspect and various possible designs of the first aspect above.
  • a computer-readable storage medium where computer-executable instructions are stored in the computer-readable storage medium, and when a processor executes the computer-executable instructions, The video generation method described above in the first aspect and various possible designs of the first aspect is implemented.
  • an embodiment of the present disclosure provides a computer program product, where the computer program product includes a computer program stored in a readable storage medium; at least one processor of an electronic device reads the computer program from the readable storage medium, and the computer program is executed by the at least one processor to cause the electronic device to perform the video generation method as described in the first aspect and various possible designs of the first aspect above.
  • an embodiment of the present disclosure provides a computer program that, when executed by a processor, implements the video generation method described in the first aspect and various possible designs of the first aspect.
  • After the electronic device obtains the source image, it generates, based on the source image and in order of color from light to dark, the sketch images of different stages in the process of simulating the painter's drawing. For each target sketch image, it generates multiple sub-images in order of drawing completion degree from low to high, simulating the process of the painter drawing that sketch image. It then uses each sub-image of each sketch image in the multiple sketch images as a video frame of the sketching video, and sets the sequence of the video frames according to the order of color depth from light to dark and the order of drawing completion degree from low to high, to generate the sketching video.
  • various stages of the painter's creation of sketch images are simulated through image processing, the process is simple, and no deep learning process is required, and the efficiency is high.
  • FIG. 1 is a schematic diagram of a network architecture of a video generation method provided by an embodiment of the present disclosure
  • FIG. 2 is a flowchart of a video generation method provided by an embodiment of the present disclosure.
  • FIG. 3 is a schematic diagram of a plurality of sketch images provided by an embodiment of the present disclosure.
  • FIG. 4 is a schematic diagram of a plurality of sub-graphs provided by an embodiment of the present disclosure.
  • FIG. 5 is a schematic diagram of the process of drawing sub-pictures in the video generation method provided by the present disclosure
  • FIG. 6 is a schematic process diagram of a video generation method provided by the present disclosure.
  • FIG. 7 is a schematic diagram of a face key point in the video generation method provided by the present disclosure.
  • FIG. 8 is a schematic diagram of a second mask of the hair region in the video generation method provided by the present disclosure.
  • FIG. 9 is a schematic diagram of a first convex hull area and a second convex hull area in the video generation method provided by the present disclosure.
  • FIG. 10 is a structural block diagram of a video generating apparatus according to an embodiment of the present disclosure.
  • FIG. 11 is a schematic structural diagram of an electronic device for implementing an embodiment of the present disclosure.
  • In view of this, the present disclosure contemplates generating, based on the source image, a video that simulates the painter's stroke-by-stroke drawing process; the process is simple and meets the needs of ordinary users.
  • FIG. 1 is a schematic diagram of a network architecture of a video generation method provided by an embodiment of the present disclosure.
  • the network architecture includes a terminal device 1 , a server 2 and a network 3 , and the terminal device 1 and the server 2 establish a network connection through the network 3 .
  • The network 3 includes various types of network connections, such as wired or wireless communication links, or fiber-optic cables.
  • the user uses the terminal device 1 to interact with the server 2 through the network 3 to receive or send messages and the like.
  • Various communication client applications are installed on the terminal device 1, such as video playback applications, shopping applications, search applications, instant communication tools, email clients, social platform software, and the like.
  • the terminal device 1 may be hardware or software.
  • If the terminal device 1 is hardware, it is, for example, a mobile phone, a tablet computer, an e-book reader, a laptop computer, or a desktop computer.
  • If the terminal device 1 is software, it can be installed in the hardware devices listed above. In this case, the terminal device 1 is, for example, multiple software modules or a single software module, which is not limited by the embodiments of the present disclosure.
  • The server 2 is a server capable of providing various services, for example receiving the source image sent by the terminal device and generating, based on the source image, a video simulating a painter creating a sketch image stroke by stroke.
  • the server 2 may be hardware or software.
  • If the server 2 is hardware, it is a single server or a distributed server cluster composed of multiple servers.
  • If the server 2 is software, it may be multiple software modules or a single software module, etc., which is not limited in the embodiments of the present disclosure.
  • The numbers of terminal devices 1, servers 2 and networks 3 in FIG. 1 are only illustrative. In actual implementation, any number of terminal devices 1, servers 2 and networks 3 are deployed according to actual requirements.
  • In scenarios where the method is performed by the terminal device alone, the server 2 and the network 3 in FIG. 1 may not exist.
  • FIG. 2 is a flowchart of a video generation method provided by an embodiment of the present disclosure.
  • the execution subject of this embodiment is an electronic device, and the electronic device is, for example, the terminal device or the server in the above-mentioned FIG. 1 .
  • This embodiment includes:
  • the electronic device obtains the source image locally, or obtains the source image from the Internet.
  • The source image is also denoted image_src.
  • the source image is a red green blue (RGB) image, a black-and-white photo, etc., which is not limited in the embodiment of the present disclosure.
  • the plurality of sketch images respectively correspond to sketch images of the source image at different color depths.
  • the process of creating a sketch image by a professional painter is a step-by-step process, and the visual experience of the entire work is from light to dark colors.
  • the electronic device generates a plurality of sketch images according to the source image, and the colors of the sketch images are sequentially deepened.
  • FIG. 3 please refer to FIG. 3 .
  • FIG. 3 is a schematic diagram of a plurality of sketch images provided by an embodiment of the present disclosure.
  • the leftmost is the source image, which can be a color RGB image or a black and white photo, etc., 1-4 are different sketch images.
  • the electronic device processes the source image by using an image processing algorithm, thereby obtaining sketch images 1-4 based on the source image.
  • the sketch image 4 is, for example, the final product, which is equivalent to the finished product created by a professional painter.
  • sketch images 1-4 are merely illustrative to illustrate different sketch images, and do not necessarily mean that there are four sketch images. In practical implementation, the number of sketch images with colors ranging from light to dark may be less than or greater than 4.
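The disclosure does not fix a particular image-processing algorithm for producing sketch images of increasing color depth. As a hedged illustration only, the following NumPy sketch uses a well-known pencil-sketch technique (invert, blur, color-dodge blend) and then blends a white canvas toward the final sketch with increasing weights; the function names, blur radius, and depth weights are all assumptions, not the patented method.

```python
import numpy as np

def box_blur(img, r):
    """Simple separable box blur using a moving average along each axis."""
    k = 2 * r + 1
    kernel = np.ones(k) / k
    out = np.apply_along_axis(lambda m: np.convolve(m, kernel, mode="same"), 0, img)
    return np.apply_along_axis(lambda m: np.convolve(m, kernel, mode="same"), 1, out)

def pencil_sketch(gray, blur_radius=8):
    """Illustrative pencil-sketch conversion (not the patented algorithm):
    invert the grayscale image, blur it, then apply a color-dodge blend."""
    inverted = 255.0 - gray
    blurred = box_blur(inverted, blur_radius)
    # Color dodge: sketch = gray * 255 / (255 - blurred), clipped to [0, 255]
    sketch = gray * 255.0 / np.maximum(255.0 - blurred, 1.0)
    return np.clip(sketch, 0.0, 255.0)

def sketch_sequence(gray, depths=(0.25, 0.5, 0.75, 1.0)):
    """Blend a white canvas toward the final sketch to obtain sketch
    images whose colors deepen in sequence (light to dark)."""
    final = pencil_sketch(gray)
    return [255.0 * (1.0 - d) + final * d for d in depths]
```

With four depth weights, this yields four progressively darker sketch images, mirroring sketch images 1-4 in FIG. 3; the actual number and spacing of depths is a design choice.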
  • the plurality of sub-images respectively correspond to the sketch images of the target sketch image under different drawing completion degrees
  • the target sketch image is any one of the plurality of sketch images.
  • When a professional painter creates the target sketch image, it is impossible to complete it in one stroke; it is completed step by step, over multiple strokes.
  • the electronic device generates sketch images with different drawing completion degrees for the target sketch image, which are hereinafter referred to as sub-images.
  • For an adjacent first sub-image and second sub-image, the second sub-image contains more stroke areas than the first sub-image. For example, if the drawing completion order indicates that the mouth is drawn first and then the eyes, the first sub-image includes the drawn mouth, and the second sub-image includes the drawn eyes in addition to the drawn mouth.
  • Alternatively, the first sub-image contains the drawn outline of the mouth, and the second sub-image contains not only the drawn outline of the mouth but also the filling of the mouth, etc.
  • FIG. 4 is a schematic diagram of multiple sub-graphs provided by an embodiment of the present disclosure.
  • the length of the video is 15 seconds, 30 frames per second, and 450 frames of images in total.
  • 30 frames of the 450 frames of images are used for drawing the outline of the human face, which is equivalent to having 30 sub-images, and the images of the 30 frames are equivalent to simulating the outline of the human face drawn by a professional painter in 30 strokes.
  • the stroke area is the outline of the face.
  • The first frame image contains the least of the face outline, the second frame image gradually more, and by the 30th frame image the entire face outline is drawn.
  • The electronic device uses each sub-image of each sketch image in the plurality of sketch images as a video frame of the sketching video, sets the sequence of the video frames in the order of color depth from light to dark and the order of drawing completion degree from low to high, and generates the sketching video.
  • Specifically, all the sub-images are used as video frames. The sub-images of each target sketch image are then ordered by drawing completion degree from low to high, yielding a sub-video that simulates the creation of that target sketch image; these sub-videos are sequenced in order of color from light to dark and synthesized, resulting in the sketching video.
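The frame-ordering step above can be sketched as a simple flattening: iterate over the sketch images from lightest to darkest, and within each, over its sub-images from lowest to highest completion. The helper below is an illustrative assumption, not part of the disclosure.

```python
def order_frames(sub_images_per_sketch):
    """Flatten per-sketch sub-image lists into the final frame sequence.

    sub_images_per_sketch: one list of sub-images per sketch image,
    the outer list ordered by color depth (lightest first), each inner
    list ordered by drawing completion degree (lowest first).
    """
    frames = []
    for sub_images in sub_images_per_sketch:  # color: light -> dark
        frames.extend(sub_images)             # completion: low -> high
    return frames
```

The resulting list is the video-frame sequence; encoding it into an actual video file is a separate, unspecified step.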
  • When the electronic device is a server, it sends the sketching video to a mobile terminal such as a mobile phone, so that the mobile terminal can play the sketching video.
  • When the electronic device is a mobile terminal such as a mobile phone, it directly plays the sketching video or stores it locally.
  • In this embodiment, after acquiring the source image, the electronic device generates, in order of color from light to dark, the sketch images of different stages in the process of simulating the painter's drawing based on the source image. For each target sketch image, it generates multiple sub-images in order of drawing completion degree from low to high, simulating the process of the painter drawing that sketch image. It then uses each sub-image of each sketch image in the multiple sketch images as a video frame of the sketching video, and sets the sequence of the video frames according to the order of color depth from light to dark and the order of drawing completion degree from low to high, to generate the sketching video.
  • various stages of the painter's creation of sketch images are simulated through image processing, the process is simple, and no deep learning process is required, and the efficiency is high.
  • When the electronic device generates the multiple sub-images of the target sketch image based on the target sketch image, it determines the growth order of the mask values of the pixels in a first mask. The initial mask value of each pixel in the first mask is 0, and the growth of a pixel in the first mask means that its mask value changes from 0 to 1. The first mask is used to make the background of the target sketch image gradually transition to the target sketch image according to the growth order, thereby generating the multiple sub-images; each growth in the growth order corresponds one-to-one to a sub-image among the multiple sub-images.
  • Specifically, the electronic device sets a first mask for the target sketch image, and the initial mask values of the pixels in the first mask are all 0.
  • The electronic device determines the growth order of the pixels in the first mask. Each growth indicates which pixels' mask values change from 0 to 1, and corresponds to one or more strokes in the professional painter's painting process. For example, if a certain growth indicates that the mask values of the pixels of the upper lip contour change from 0 to 1, the upper lip contour can be drawn according to that growth.
  • After determining the target sketch image and the background of the target sketch image, for each growth in the growth order, the electronic device first determines, from the first mask, the first pixel set corresponding to that growth.
  • the latter sketch image is actually obtained by continuing to draw on the basis of the former sketch image in the real creation process.
  • the process of creating the target sketch image is actually a process of drawing on the background of the target sketch image, so that the background is transformed into the target sketch image.
  • it is necessary to control the variation of the mask value of the pixels in the first mask. For example, initially, the mask value of each pixel in the first mask is 0, and at this time, the background does not change.
  • The electronic device determines the first pixel set from the first mask according to the growth. For example, during a certain growth, the mask values of the pixels representing the contour of the upper lip in the first mask change to 1, and those pixels are used as the first pixel set.
  • a second pixel set is determined from the target sketch image, and a third pixel set is determined from the background, the first pixel set, the second pixel set One-to-one correspondence with the pixels in the third pixel set.
  • The pixels of the target sketch image, the first mask and the background are in one-to-one correspondence. Therefore, after the electronic device determines the first pixel set from the first mask, it can determine the second pixel set from the target sketch image and the third pixel set from the background.
  • The electronic device then determines the pixel value of a fourth pixel according to the mask value of the first pixel in the first pixel set, the pixel value of the second pixel in the second pixel set, and the pixel value of the third pixel in the third pixel set:
  • pixel value of the fourth pixel = mask value of the first pixel × pixel value of the second pixel + pixel value of the third pixel × (1 − mask value of the first pixel).
  • The third pixel in the background is updated to the fourth pixel, obtaining the sub-image corresponding to the growth.
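The per-pixel update described above is a standard mask-weighted (alpha) blend and can be vectorized over whole images. The sketch below assumes NumPy arrays of matching shape; the function name is illustrative.

```python
import numpy as np

def grow_subimage(background, target_sketch, mask):
    """One growth step, vectorized:
    fourth = mask * second (sketch) + (1 - mask) * third (background).
    Pixels whose mask value has grown to 1 take the sketch value;
    pixels still at 0 keep the background value."""
    m = mask.astype(float)
    return m * target_sketch + (1.0 - m) * background
```

Applying this repeatedly, with ever more mask values flipped to 1, makes the target sketch image appear on the background stroke by stroke.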
  • FIG. 5 is a schematic diagram of a process of drawing sub-images in the video generation method provided by the present disclosure.
  • the pixel value of each pixel on the background is 255, that is, the background is a pure white background.
  • Suppose the pixels corresponding to a certain growth of the first mask are 6 pixels whose mask values become 1.
  • Then a polyline containing the 6 pixels is finally drawn on the background, and the gray values of the 6 pixels are 90, 0, 23, 23, 255 and 89.
  • When the target sketch image is not the sketch image with the lightest color among the plurality of sketch images, the background is the sketch image that is adjacent to and precedes the target sketch image among the plurality of sketch images; when the target sketch image is the sketch image with the lightest color among the plurality of sketch images, the background is a white image with the same size as the target sketch image.
  • the sketch image 2 is obtained by continuing to draw the sketch image 1, and the sketch image 3 is obtained on the basis of the sketch image 2.
  • the sketch image 1 is actually drawn on a white background. That is to say, the background of the sketch image 1 is a white background, the pixel value of each pixel of the white background is 255, the background of the sketch image 2 is the sketch image 1, and so on.
  • sketch images at different stages correspond to different backgrounds, so as to accurately simulate the purpose of drawing sketch images by painters.
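The background-selection rule above can be stated compactly: the lightest sketch image is drawn on a pure-white canvas, and every other sketch image is drawn on top of the adjacent, lighter one. A minimal NumPy sketch (the function name is assumed):

```python
import numpy as np

def background_for(sketch_images, index):
    """sketch_images is ordered from lightest (index 0) to darkest.
    The lightest sketch is drawn on a pure-white canvas (all 255);
    every other sketch is drawn on top of the previous, lighter one."""
    if index == 0:
        return np.full_like(sketch_images[0], 255)
    return sketch_images[index - 1]
```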
  • Each growth in the growth order of the first mask corresponds one-to-one to a sub-image among the multiple sub-images, and the growth order includes the growth order of the face contour of the person in the source image.
  • the electronic device determines the increasing order of the mask values of the pixels in the first mask, firstly, extracts the face key points in the source image to obtain a key point set. Afterwards, the electronic device determines a second mask of the hair region of the person in the source image according to the source image.
  • The electronic device determines the first convex hull region of the person's face according to the key point set, and determines a second convex hull region from the first mask according to the intersection of the first convex hull region and the second mask. Finally, the electronic device sequentially connects the face contour key points in the key point set on the second convex hull region according to the stroke speed, to obtain the growth order of the face contour in the source image; the stroke speed is determined based on the duration of the video.
  • FIG. 6 is a schematic process diagram of the video generation method provided by the present disclosure.
  • the electronic device uses the facial key point detection model to extract the facial feature key points and the facial contour key points, and uses the hair segmentation model to determine the second mask of the hair region of the person .
  • FIG. 7 is a schematic diagram of a face key point in the video generation method provided by the present disclosure.
  • the key points of the human face include the key points of 7 parts of the left eyebrow, the right eyebrow, the left eye, the right eye, the nose, the mouth and the outline of the human face.
  • the numbers next to the key points in the figure represent the serial numbers of the key points. For example, there are 34 key points in the face contour, and the key points represented by the serial numbers 1 to 34 are the key points of the face contour.
  • FIG. 8 is a schematic diagram of the second mask of the hair area in the video generation method provided by the present disclosure. Please refer to FIG. 8.
  • The function of the second mask of the hair region is to avoid out-of-order strokes in the simulated painting process. In actual painting, the face outline is drawn first, then the eyebrows, then the eyes, and finally the hair; the hair should not appear first. The second mask therefore blocks the hair region, so that hair does not appear first in the simulated painting, which would make the simulation unnatural.
  • When drawing the face contour, the electronic device first determines the growth order of the face contour in the first mask. During this determination, the electronic device determines the first convex hull region from the first mask according to the key point set; for example, connecting the face contour key points, the left eyebrow key points and the right eyebrow key points in sequence yields the first convex hull region. After that, the electronic device determines the intersection of the first convex hull region and the second mask described in FIG. 8, and then determines the second convex hull region accordingly.
  • FIG. 9 is a schematic diagram of the first convex hull region and the second convex hull region in the video generation method provided by the present disclosure. Please refer to FIG. 9. Since part of the first convex hull region is occluded by hair, the hair-occluded area must be removed from the first convex hull region to avoid hair appearing prematurely during the simulated drawing, thereby obtaining the second convex hull region. The position of the face can be determined from the background according to the second convex hull region. Next, the face outline, facial features, etc. are drawn at the position of the face in the background, and then the areas other than the hair and the person are drawn, so that the target sketch image appears little by little on the background, thereby simulating the process of drawing the target sketch image.
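One consistent reading of this step, in boolean-mask form: the second convex hull region is the first convex hull region with its intersection with the hair mask removed. The helper below is an illustrative assumption operating on boolean NumPy masks (constructing the convex hull itself from key points is left to a geometry library).

```python
import numpy as np

def second_convex_hull(first_hull_mask, hair_mask):
    """Remove the hair-occluded pixels (the intersection with the hair
    mask) from the first convex hull region, yielding the second
    convex hull region."""
    return first_hull_mask & ~hair_mask
```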
  • the process of drawing the face contour is actually a process in which the mask values of the pixels representing the face contour in the first mask are sequentially changed from 0 to 1, that is, the growth process of the pixels representing the face contour in the first mask.
  • the growth process includes the growth sequence and the stroke speed.
  • the growth sequence is the process of connecting the key points of the face contour in sequence. For example, the 1-34th face contour key points are connected in sequence to obtain the growth sequence of the face contour.
  • the growth rate is determined by the length of the video. For example, in a 450-frame video, if you want to draw the outline of the face in 3 strokes, it is equivalent to increasing the first mask 3 times during the process of drawing the outline of the face, each time increasing by about 11 pixels.
  • the outline of the face is drawn by simulating three strokes of the painter, corresponding to three sub-images, namely three image frames.
  • If the face outline is to be drawn in one second, the first mask grows 30 times, which is equivalent to simulating the process of an artist drawing the face outline in 30 strokes.
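The stroke speed thus reduces to splitting the ordered contour points into as many increments as there are strokes (frames) budgeted for the contour. A minimal sketch, with the function name assumed:

```python
def growth_schedule(ordered_points, num_strokes):
    """Split ordered contour points into num_strokes increments; each
    increment lists the points whose mask value flips 0 -> 1 in that
    stroke (one stroke = one sub-image = one video frame)."""
    per_stroke = -(-len(ordered_points) // num_strokes)  # ceiling division
    return [ordered_points[i:i + per_stroke]
            for i in range(0, len(ordered_points), per_stroke)]
```

For 34 face-contour key points and 3 strokes, each increment covers roughly 11-12 points, matching the example above.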
  • after the outline is drawn, the mask value at the face contour positions on the first mask is 1, and it remains 1 thereafter.
  • after the electronic device determines the growth order of the outline of the human face, it also needs to determine the growth order of each facial feature area.
  • the electronic device determines the key points of the different areas of the face from the key point set; performs interpolation on each area of the face according to that area's key points; and, on the second convex hull region, determines the growth order of the corresponding regions of the face in the source image according to the stroke speed.
  • the face contour is relatively simple, and the polyline obtained by sequentially connecting the face key points can represent the face contour.
  • for the facial feature areas, however, the image obtained by sequentially connecting the key points cannot produce a realistic sketch image.
  • Referring to Figure 7, after the two rows of key points on the left eyebrow are connected in sequence, there is a blank part in the middle; obviously, this cannot meet the requirements of a sketch image.
  • moreover, if each row of key points is connected in turn, the left eyebrow would be drawn with only two strokes, which is obviously unreasonable. Therefore, for the facial feature areas, it is not enough to connect the key points in sequence.
  • different facial features therefore need to be interpolated in different ways. Then, according to the interpolated areas, the growth order corresponding to each facial feature area is determined on the second convex hull area in the background according to the stroke speed, in preparation for subsequently drawing the facial features stroke by stroke.
  • the key points of the left eyebrow include two rows of key points, and the key points of the upper row correspond one-to-one to the key points of the lower row.
  • the electronic device determines the average of corresponding key points to interpolate the key points of the left eyebrow region into 3 rows of key points. Further, the electronic device may also interpolate the key points of the left eyebrow region into 4 or 5 rows, which is not limited in the embodiments of the present disclosure; the number of rows is related to the precision. After interpolation, it takes seven or eight strokes to draw the left eyebrow, which more reasonably simulates the painting process of the artist.
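The averaging-based row interpolation described above might look like this in outline. This is our own sketch (the helper name `interpolate_rows` is hypothetical); a real implementation would operate on detected landmark coordinates.

```python
import numpy as np

def interpolate_rows(upper, lower, extra_rows=1):
    """Insert extra_rows evenly spaced rows of key points between the upper
    and lower eyebrow rows by linear averaging. The 3-row case in the text
    corresponds to extra_rows=1, i.e. the midpoint (average) row."""
    upper = np.asarray(upper, dtype=float)
    lower = np.asarray(lower, dtype=float)
    rows = [upper]
    for i in range(1, extra_rows + 1):
        t = i / (extra_rows + 1)
        rows.append((1 - t) * upper + t * lower)  # weighted average of the two rows
    rows.append(lower)
    return rows

# toy eyebrow: upper and lower rows of three key points each
up = [(0, 0), (2, 0), (4, 0)]
lo = [(0, 2), (2, 2), (4, 2)]
rows = interpolate_rows(up, lo, extra_rows=1)  # 3 rows: upper, midpoint, lower
```

Raising `extra_rows` to 2 or 3 gives the 4-row or 5-row variants mentioned above, trading more strokes for higher precision.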
  • for the eyeball region in the face, the eyeball region is interpolated as circles according to the key points of the eyeball region.
  • the pupil cannot be drawn with horizontal strokes. Therefore, for the eyeball area, a circular interpolation method is used: first a circle representing the eyeball is drawn, and then smaller circles are interpolated one by one until a solid circle is formed. During interpolation, the average distance from the center of the eyeball to the points on the circumference is determined and taken as the radius, and a circle is interpolated with the center of the eyeball as the center. From the perspective of the first mask, when the first mask grows, the image formed by each set of growing pixels is not a polyline but one or more circles.
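The circular interpolation of the eyeball could be sketched as follows, assuming (as described above) that the outer radius is the mean distance from the eyeball centre to its rim key points. The names are ours, not the disclosure's.

```python
import numpy as np

def eyeball_circles(center, rim_points, steps):
    """Interpolate the eyeball as concentric circles: the mean distance from
    the centre to the rim key points gives the outer radius, and the radius
    shrinks step by step until a solid disc is formed."""
    cx, cy = center
    dists = [np.hypot(x - cx, y - cy) for x, y in rim_points]
    outer = float(np.mean(dists))  # averaged radius, as in the text
    return [outer * (steps - i) / steps for i in range(steps)]

# hypothetical eyeball: centre plus four rim key points at distance 3
radii = eyeball_circles((10, 10), [(13, 10), (7, 10), (10, 13), (10, 7)], steps=3)
```

Each returned radius corresponds to one growth step of the first mask, so the pixels added per frame form a circle rather than a polyline.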
  • a plurality of curves are vertically interpolated according to the key points of the mouth region.
  • the key points of the mouth area can only outline the mouth. After the outline of the mouth is completed, the filling is interpolated along vertical lines. From the point of view of the first mask, after the first mask completes the growth of the mouth contour, for the lips, the image formed by the pixels grown each time is a vertical line.
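The vertical-line filling of the lips after the outline is drawn might be sketched like this on a toy mask. `fill_mouth_vertically` and the per-column boundary maps are our own hypothetical constructs, not the disclosure's API.

```python
import numpy as np

def fill_mouth_vertically(mask, top, bottom):
    """After the mouth outline is drawn, fill the lips one vertical line per
    growth step: column x grows from the top lip boundary row top[x] down to
    the bottom lip boundary row bottom[x]."""
    for x in sorted(top):
        y0, y1 = top[x], bottom[x]
        mask[y0:y1 + 1, x] = 1  # one vertical stroke = one growth step
    return mask

# toy 6x4 mask with made-up lip boundaries per column
mask = np.zeros((6, 4), dtype=np.uint8)
mask = fill_mouth_vertically(mask,
                             top={0: 1, 1: 0, 2: 0, 3: 1},
                             bottom={0: 3, 1: 4, 2: 4, 3: 3})
```

In a full implementation each column (or small group of columns) would be flipped in a separate frame, so the lips appear to be shaded stroke by stroke.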
  • the last few frames of images are used to supplement the hair and the background areas other than the person.
  • For example, 90 frames are used to generate sketch image 1 in Figure 3: the first 88 of the 90 frames are used to draw the outline of the face and the facial features, the 89th frame is used to add the hair, and the 90th frame is used to add the background areas other than the person.
  • when the electronic device generates a plurality of sketch images according to the source image, it first generates a grayscale image based on the source image. Then, the electronic device determines a plurality of Gaussian kernels, where the Gaussian kernels correspond one-to-one to the sketch images, and the Gaussian kernel corresponding to a sketch image with a darker color is larger than the Gaussian kernel corresponding to a sketch image with a lighter color. Next, the electronic device performs Gaussian blur on the grayscale image with each of the Gaussian kernels to obtain a Gaussian blurred image corresponding to each kernel. Finally, the electronic device generates the multiple sketch images according to the Gaussian blurred image corresponding to each Gaussian kernel and the grayscale image.
  • after acquiring the source image, the electronic device performs noise reduction processing such as median filtering on the source image. Afterwards, the electronic device performs grayscale processing on each pixel of the source image, thereby converting the source image into a grayscale image. After obtaining the grayscale image, the electronic device determines a Gaussian kernel for the Gaussian blur according to Gaussian convolution or the like, and uses the Gaussian kernel to perform Gaussian blurring on the grayscale image to obtain a Gaussian blurred image. The larger the size of the Gaussian kernel, the darker the color of the Gaussian blurred image obtained by blurring the grayscale image with that kernel.
  • the electronic device uses the Gaussian blurred image and the grayscale image as materials to generate a black-and-white sketch image. For example, the electronic device fuses the pixels of the Gaussian blurred image with the corresponding pixels of the grayscale image to generate the sketch image.
  • the source image, grayscale image, Gaussian blurred image and sketch image have the same size, with pixels in one-to-one correspondence. Therefore, the electronic device can determine the pixel value of the corresponding pixel in the black-and-white sketch image from the pixel values of the pixels in the grayscale image and the Gaussian blurred image, thereby obtaining the sketch image. For example, after the Gaussian blurred image is generated, the electronic device uses the dodge mode to extract the effect, that is, the following formula (1) is used for extraction, where image_target represents the pixel value of the pixel in the sketch image, image_gray represents the pixel value of the pixel in the grayscale image, and gray_blur1 represents the pixel value of the pixel in the Gaussian blurred image.
  • dodging based on different Gaussian kernels can obtain sketch images of different levels.
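Formula (1) itself is not reproduced in this text; the variables above match the standard "dodge by Gaussian blur" construction, which a minimal NumPy-only sketch might implement as follows. This is our own illustration under that assumption, not the disclosure's code, and `gaussian_blur`/`sketch` are hypothetical names.

```python
import numpy as np

def gaussian_blur(gray, ksize, sigma=None):
    """Separable Gaussian blur of a float grayscale image (edge padding)."""
    if sigma is None:
        sigma = 0.3 * ((ksize - 1) * 0.5 - 1) + 0.8  # OpenCV-style default rule
    xs = np.arange(ksize) - (ksize - 1) / 2
    k = np.exp(-xs ** 2 / (2 * sigma ** 2))
    k /= k.sum()
    pad = ksize // 2
    padded = np.pad(gray, pad, mode="edge")
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, "valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, k, "valid"), 0, rows)

def sketch(gray, ksize):
    """Dodge the grayscale image (image_gray) by its Gaussian blur
    (gray_blur1): flat regions go white, edges stay dark. A larger kernel
    yields a darker sketch level, matching the text."""
    blur = gaussian_blur(gray.astype(float), ksize)
    out = gray.astype(float) * 255.0 / np.maximum(blur, 1e-6)
    return np.clip(np.rint(out), 0, 255).astype(np.uint8)

# e.g. a flat grayscale image dodges to pure white
gray = np.full((8, 8), 100, dtype=np.uint8)
out = sketch(gray, ksize=3)
```

Running `sketch` with several kernel sizes on the same grayscale image would give the multiple sketch levels described above.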
  • FIG. 10 is a structural block diagram of a video generation apparatus according to an embodiment of the present disclosure.
  • the device includes: an acquiring unit 11, a first generating unit 12, a second generating unit 13, and a third generating unit 14.
  • the acquiring unit 11 is used to acquire the source image.
  • the first generating unit 12 is configured to generate a plurality of sketch images according to the source image, where the plurality of sketch images respectively correspond to sketch images of the source image with different color depths.
  • the second generating unit 13 is configured to generate, based on a target sketch image, multiple sub-images of the target sketch image, where the multiple sub-images respectively correspond to sketch images of the target sketch image at different drawing completion degrees, and the target sketch image is any one of the plurality of sketch images.
  • the third generating unit 14 is configured to use each sub-image of each sketch image in the plurality of sketch images as a video frame of the sketch drawing video, set the sequence of the video frames in the order of color depth from light to dark and drawing completion degree from low to high, and generate the sketch drawing video.
  • the second generating unit 13 is configured to determine the growth order of the mask values of the pixels in the first mask, where the initial mask value of the pixels in the first mask is 0, the growth of a pixel in the first mask indicates that its mask value is changed from 0 to 1, and the first mask is used to make the background of the target sketch image gradually change into the target sketch image; and to generate the multiple sub-images according to the growth order, the target sketch image and the background of the target sketch image, where each increase in the growth order corresponds one-to-one to a sub-image in the multiple sub-images.
  • when the second generating unit 13 generates the plurality of sub-images according to the growth order, the target sketch image and the background of the target sketch image, it is configured to, for each growth: determine a first pixel set corresponding to the growth from the first mask; according to the first pixel set, determine a second pixel set from the target sketch image and a third pixel set from the background, where the pixels in the first pixel set, the second pixel set and the third pixel set are in one-to-one correspondence; and update the third pixels to the fourth pixels.
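The per-growth pixel update described above (first, second and third pixel sets) can be illustrated minimally as a masked composite. This is our own sketch of the idea, not the apparatus's implementation; `next_subimage` is a hypothetical name.

```python
import numpy as np

def next_subimage(mask, growth_pixels, target, background):
    """One growth step: flip the listed mask pixels (the first pixel set)
    from 0 to 1, then take each pixel from the target sketch image where the
    mask is 1 and from the background where it is still 0."""
    for y, x in growth_pixels:
        mask[y, x] = 1
    return np.where(mask == 1, target, background)

# toy 2x2 example: two pixels grow in this step
mask = np.zeros((2, 2), dtype=np.uint8)
target = np.array([[10, 20], [30, 40]], dtype=np.uint8)
background = np.full((2, 2), 255, dtype=np.uint8)
frame = next_subimage(mask, [(0, 0), (1, 1)], target, background)
```

Repeating this step over the whole growth order produces the sequence of sub-images that form the drawing animation.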
  • when the target sketch image is not the sketch image with the lightest color among the plurality of sketch images, the background is the sketch image among the plurality of sketch images that is adjacent to the target sketch image and located before it.
  • when the source image includes a person and the growth order includes the growth order of the face contour of the person in the source image, the second generating unit 13, when determining the growth order of the mask values of the pixels in the first mask, is configured to: extract the face key points in the source image to obtain a key point set; determine, according to the source image, the second mask of the hair area of the person in the source image; determine the first convex hull area of the person's face according to the key point set; determine the second convex hull area from the second mask according to the intersection of the first convex hull area and the second mask; and, on the second convex hull area, connect the face contour key points in the key point set in sequence according to the stroke speed to obtain the growth order of the face contour of the person in the source image, where the stroke speed is determined according to the duration of the video.
  • after the second generating unit 13 sequentially connects the face contour key points in the key point set on the second convex hull region according to the stroke speed to obtain the growth order of the face contour of the person in the source image, it is further configured to: determine the key points of different regions of the face from the key point set; interpolate each region of the face according to its key points; and, according to the interpolated regions, determine the growth order of the corresponding regions of the person's face in the source image on the second convex hull area according to the stroke speed.
  • when the second generating unit 13 interpolates the different regions of the face according to their key points, it is configured to: for the eyebrow region of the face, horizontally interpolate multiple curves according to the key points of the eyebrow region; for the eyeball region of the face, interpolate the eyeball region as circles according to the key points of the eyeball region; and for the mouth region of the face, vertically interpolate multiple curves according to the key points of the mouth region.
  • the first generating unit 12 is configured to: generate a grayscale image based on the source image; determine a plurality of Gaussian kernels, where the Gaussian kernels correspond one-to-one to the sketch images and the Gaussian kernel corresponding to a darker sketch image is larger than that corresponding to a lighter sketch image; perform Gaussian blur on the grayscale image with each of the Gaussian kernels to obtain a Gaussian blurred image corresponding to each kernel; and generate the plurality of sketch images according to the Gaussian blurred image corresponding to each Gaussian kernel and the grayscale image.
  • FIG. 11 is a schematic structural diagram of an electronic device for implementing an embodiment of the present disclosure.
  • the electronic device 200 may be a terminal device or a server.
  • the terminal device may include, but is not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, personal digital assistants (PDAs), tablet computers (PADs), portable multimedia players (PMPs), and in-vehicle terminals (such as in-vehicle navigation terminals), and fixed terminals such as digital TVs and desktop computers.
  • the electronic device shown in FIG. 11 is only an example, and should not impose any limitation on the functions and scope of use of the embodiments of the present disclosure.
  • the electronic device 200 may include a processing device (such as a central processing unit or a graphics processor) 201, which may execute various appropriate actions and processes according to a program stored in a read-only memory (ROM) 202 or a program loaded from a storage device 208 into a random access memory (RAM) 203.
  • the RAM 203 also stores various programs and data required for the operation of the electronic device 200.
  • the processing device 201, the ROM 202, and the RAM 203 are connected to each other through a bus 204.
  • An input/output (I/O) interface 205 is also connected to the bus 204.
  • the following devices can be connected to the I/O interface 205: input devices 206 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, and gyroscope; output devices 207 including, for example, a liquid crystal display (LCD), speaker, and vibrator; storage devices 208 including, for example, a magnetic tape and a hard disk; and a communication device 202.
  • Communication means 202 may allow electronic device 200 to communicate wirelessly or by wire with other devices to exchange data. While FIG. 11 shows the electronic device 200 having various means, it should be understood that not all of the illustrated means are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided.
  • embodiments of the present disclosure include a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for performing the method illustrated in the flowchart.
  • the computer program may be downloaded and installed from the network via the communication device 202, or from the storage device 208, or from the ROM 202.
  • the processing apparatus 201 When the computer program is executed by the processing apparatus 201, the above-mentioned functions defined in the methods of the embodiments of the present disclosure are executed.
  • the computer-readable medium mentioned above in the present disclosure may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the above two.
  • the computer-readable storage medium can be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus or device, or a combination of any of the above.
  • Computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
  • a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.
  • a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave with computer-readable program code embodied thereon. Such propagated data signals may take a variety of forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • a computer-readable signal medium can also be any computer-readable medium other than a computer-readable storage medium that can transmit, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
  • the program code contained on the computer readable medium can be transmitted by any suitable medium, including but not limited to: electric wire, optical cable, radio frequency (RF for short), etc., or any suitable combination of the above.
  • the above-mentioned computer-readable medium may be included in the above-mentioned electronic device; or may exist alone without being assembled into the electronic device.
  • the aforementioned computer-readable medium carries one or more programs, and when the aforementioned one or more programs are executed by the electronic device, causes the electronic device to execute the methods shown in the foregoing embodiments.
  • Computer program code for carrying out operations of the present disclosure may be written in one or more programming languages, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • the program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server.
  • the remote computer can be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or the connection can be made to an external computer (for example, through the Internet using an Internet service provider).
  • each block in the flowchart or block diagrams may represent a module, segment, or portion of code that contains one or more executable instructions for implementing the specified logical functions.
  • the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved.
  • each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by dedicated hardware-based systems that perform the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
  • the units involved in the embodiments of the present disclosure may be implemented in a software manner, and may also be implemented in a hardware manner.
  • the name of the unit does not constitute a limitation of the unit itself under certain circumstances, for example, the first obtaining unit may also be described as "a unit that obtains at least two Internet Protocol addresses".
  • For example, without limitation, exemplary types of hardware logic components that may be used include: field programmable gate arrays (FPGA), application specific integrated circuits (ASIC), application specific standard parts (ASSP), systems on chip (SOC), complex programmable logic devices (CPLD), and the like.
  • a machine-readable medium may be a tangible medium that may contain or store a program for use by or in connection with the instruction execution system, apparatus or device.
  • the machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium.
  • Machine-readable media may include, but are not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, devices, or devices, or any suitable combination of the foregoing.
  • machine-readable storage media would include one or more wire-based electrical connections, portable computer disks, hard disks, random access memory (RAM), read only memory (ROM), erasable programmable read only memory (EPROM or flash memory), fiber optics, compact disk read only memory (CD-ROM), optical storage, magnetic storage, or any suitable combination of the foregoing.
  • a video generation method is provided, comprising: acquiring a source image; generating a plurality of sketch images according to the source image, where the plurality of sketch images respectively correspond to sketch images of the source image at different color depths; generating, based on a target sketch image, multiple sub-images of the target sketch image, where the multiple sub-images respectively correspond to sketch images of the target sketch image at different drawing completion degrees and the target sketch image is any one of the plurality of sketch images; and using each sub-image of each sketch image in the plurality of sketch images as a video frame of the sketch drawing video, setting the sequence of the video frames in the order of color depth from light to dark and drawing completion degree from low to high, and generating the sketch drawing video.
  • the generating of the plurality of sub-images of the target sketch image based on the target sketch image includes: determining the growth order of the mask values of the pixels in a first mask, where the initial mask value of the pixels in the first mask is 0, the growth of a pixel in the first mask indicates that its mask value is changed from 0 to 1, and the first mask is used to make the background of the target sketch image gradually change into the target sketch image according to the growth order; and generating the plurality of sub-images according to the growth order, the target sketch image and the background of the target sketch image, where each increase in the growth order corresponds one-to-one to a sub-image in the plurality of sub-images.
  • when the target sketch image is not the sketch image with the lightest color among the plurality of sketch images, the background is the sketch image among the plurality of sketch images that is adjacent to the target sketch image and located before it; when the target sketch image is the sketch image with the lightest color among the plurality of sketch images, the background is a white image of the same size as the target sketch image.
  • when the source image includes a person and the growth order includes the growth order of the face contour of the person in the source image, the determining of the growth order of the mask values of the pixels in the first mask includes: extracting the face key points in the source image to obtain a key point set; determining, according to the source image, the second mask of the hair area of the person in the source image; determining the first convex hull area of the person's face according to the key point set; determining the second convex hull area from the second mask according to the intersection of the first convex hull area and the second mask; and, on the second convex hull area, connecting the face contour key points in the key point set in sequence according to the stroke speed to obtain the growth order of the face contour of the person in the source image, where the stroke speed is determined according to the duration of the video.
  • after the face contour key points in the key point set are sequentially connected on the second convex hull region according to the stroke speed to obtain the growth order of the face contour of the person in the source image, the method further includes: determining the key points of different regions of the face from the key point set; interpolating each region of the face according to its key points; and, according to the interpolated regions, determining the growth order of the corresponding regions of the person's face in the source image on the second convex hull region according to the stroke speed.
  • the interpolating of the different regions of the face according to their key points includes: for the eyebrow region of the face, horizontally interpolating multiple curves according to the key points of the eyebrow region; for the eyeball region of the face, interpolating the eyeball region as circles according to the key points of the eyeball region; and for the mouth region of the face, vertically interpolating multiple curves according to the key points of the mouth region.
  • the generating of the plurality of sketch images according to the source image includes: generating a grayscale image based on the source image; determining a plurality of Gaussian kernels, where the Gaussian kernels correspond one-to-one to the sketch images and the Gaussian kernel corresponding to a darker sketch image is larger than that corresponding to a lighter sketch image; performing Gaussian blur on the grayscale image with each of the Gaussian kernels to obtain a Gaussian blurred image corresponding to each kernel; and generating the plurality of sketch images according to the Gaussian blurred images corresponding to the Gaussian kernels and the grayscale image.
  • a video generation apparatus is provided, comprising: an acquiring unit, configured to acquire a source image;
  • a first generating unit configured to generate a plurality of sketch images according to the source image, where the plurality of sketch images respectively correspond to sketch images of the source image under different color depths;
  • a second generating unit, configured to generate, based on a target sketch image, multiple sub-images of the target sketch image, where the multiple sub-images respectively correspond to sketch images of the target sketch image at different drawing completion degrees, and the target sketch image is any one of the plurality of sketch images;
  • a third generating unit, configured to use each sub-image of each sketch image in the plurality of sketch images as a video frame of the sketch drawing video, set the sequence of the video frames in the order of color depth from light to dark and drawing completion degree from low to high, and generate the sketch drawing video.
  • the second generating unit is configured to determine the growth order of the mask values of the pixels in a first mask, where the initial mask value of the pixels in the first mask is 0, the growth of a pixel in the first mask indicates that its mask value is changed from 0 to 1, and the first mask is used to make the background of the target sketch image gradually change into the target sketch image according to the growth order; and to generate the plurality of sub-images according to the growth order, the target sketch image and the background of the target sketch image, where each increase in the growth order corresponds one-to-one to a sub-image in the plurality of sub-images.
  • for each growth, a first pixel set corresponding to the growth is determined from the first mask; according to the first pixel set, a second pixel set is determined from the target sketch image and a third pixel set is determined from the background, where the pixels in the first pixel set, the second pixel set and the third pixel set are in one-to-one correspondence; and the third pixels are updated to the fourth pixels to obtain the sub-image corresponding to the growth.
  • when the target sketch image is not the sketch image with the lightest color among the plurality of sketch images, the background is the sketch image among the plurality of sketch images that is adjacent to the target sketch image and located before it; when the target sketch image is the sketch image with the lightest color among the plurality of sketch images, the background is a white image of the same size as the target sketch image.
  • when the source image includes a person and the growth order includes the growth order of the face contour of the person in the source image, the second generating unit, when determining the growth order of the mask values of the pixels, is configured to: extract the face key points in the source image to obtain a key point set; determine, according to the source image, the second mask of the hair area of the person in the source image; determine the first convex hull area of the person's face according to the key point set; determine the second convex hull area from the second mask according to the intersection of the first convex hull area and the second mask; and, on the second convex hull area, connect the face contour key points in the key point set in sequence according to the stroke speed to obtain the growth order of the face contour of the person in the source image, where the stroke speed is determined according to the duration of the video.
  • after the second generating unit sequentially connects the face contour key points in the key point set on the second convex hull region according to the stroke speed to obtain the growth order of the face contour of the person in the source image, it is further configured to: determine the key points of different regions of the face from the key point set; interpolate each region of the face according to its key points; and, according to the interpolated regions, determine the growth order of the corresponding regions of the person's face in the source image on the second convex hull region according to the stroke speed.
  • when the second generating unit interpolates the different regions of the face according to their key points, it is configured to: for the eyebrow region of the face, horizontally interpolate multiple curves according to the key points of the eyebrow region; for the eyeball region of the face, interpolate the eyeball region as circles according to the key points of the eyeball region; and for the mouth region of the face, vertically interpolate multiple curves according to the key points of the mouth region.
  • the first generating unit is configured to generate a grayscale image based on the source image, and to determine a plurality of Gaussian kernels
  • the Gaussian kernels in the plurality of Gaussian kernels correspond one-to-one to the sketch images in the plurality of sketch images, and the Gaussian kernel corresponding to a darker sketch image is larger than the Gaussian kernel corresponding to a lighter sketch image
  • Gaussian blurring is performed on the grayscale image with each of the plurality of Gaussian kernels to obtain a Gaussian blurred image corresponding to each of the plurality of Gaussian kernels; the plurality of sketch images are generated according to the Gaussian blurred image corresponding to each of the plurality of Gaussian kernels and the grayscale image.
  • an electronic device comprising: at least one processor and a memory;
  • the memory stores computer-executable instructions
  • the at least one processor executes the computer-executable instructions stored in the memory, causing the at least one processor to perform the video generation method as described in the first aspect and the various possible designs of the first aspect above.
  • a computer-readable storage medium, where computer-executable instructions are stored in the computer-readable storage medium, and when a processor executes the computer-executable instructions, the video generation method described in the first aspect and the various possible designs of the first aspect above is implemented.
  • a computer program product comprising: a computer program, the computer program being stored in a readable storage medium; at least one processor of an electronic device reads the computer program from the readable storage medium, and the at least one processor executes the computer program to cause the electronic device to perform the video generation method as described in the first aspect and the various possible designs of the first aspect above.
  • a computer program that, when executed by a processor, implements the video generation method described in the first aspect and the various possible designs of the first aspect.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Signal Processing (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Human Computer Interaction (AREA)
  • Processing Or Creating Images (AREA)
  • Image Processing (AREA)
  • Indexing, Searching, Synchronizing, And The Amount Of Synchronization Travel Of Record Carriers (AREA)
  • Television Signal Processing For Recording (AREA)

Abstract

Embodiments of the present disclosure provide a video generation method, apparatus, device and readable storage medium. After acquiring a source image, an electronic device generates, based on the source image and in order of color from light to dark, sketch images simulating the different stages of an artist drawing a target sketch; for each target sketch image, it generates, in order of drawing completion from low to high, a plurality of sub-images simulating the process of the artist drawing that sketch image; it then takes each sub-image of each of the plurality of sketch images as a video frame of a sketch-drawing video, sets the order of the video frames by color depth from light to dark and by drawing completion from low to high, and generates the sketch-drawing video. In this process, the stages of an artist creating a sketch image are simulated by image processing; the process is simple, requires no deep learning, and is efficient.

Description

Video generation method, apparatus, device and readable storage medium
Cross-reference to related application
This application claims priority to Chinese Patent Application No. 202110163139.8, filed with the China National Intellectual Property Administration on February 5, 2021 and entitled "Video generation method, apparatus, device and readable storage medium", the entire contents of which are incorporated herein by reference.
Technical field
Embodiments of the present disclosure relate to the technical field of image processing, and in particular to a video generation method, apparatus, device and readable storage medium.
Background
Sketching is a kind of image stylization. A real-world sketch is an art form that uses pencil as the medium and lines to depict scenery or people. Sketches can be divided into line-structure drawings and finely realistic sketches. Creating a sketch by hand requires a certain level of artistic skill.
As terminal devices have become more intelligent, users can obtain sketch images and the like with them. In the process of obtaining a sketch image, the terminal device performs style-transfer processing on a source image selected by the user, thereby obtaining a sketch image. With the development of video, users are no longer satisfied with obtaining static images; they expect to obtain a dynamic video that simulates the process of an artist painting and draws the sketch image stroke by stroke.
However, most existing videos of this kind are real recordings of artists with artistic training drawing live, which is difficult for ordinary users and essentially impossible for them to produce.
Summary
Embodiments of the present disclosure provide a video generation method, apparatus, device and readable storage medium that generate, based on an image, a video simulating an artist creating a sketch image stroke by stroke, with a simple process.
In a first aspect, an embodiment of the present disclosure provides a video generation method, including:
acquiring a source image;
generating a plurality of sketch images according to the source image, the plurality of sketch images respectively corresponding to sketch images of the source image at different color depths;
generating, based on a target sketch image, a plurality of sub-images of the target sketch image, the plurality of sub-images respectively corresponding to sketch images of the target sketch image at different degrees of drawing completion, the target sketch image being any one of the plurality of sketch images;
taking each sub-image of each of the plurality of sketch images as a video frame of a sketch-drawing video, setting the order of the video frames by color depth from light to dark and by drawing completion from low to high, and generating the sketch-drawing video.
In a second aspect, the present disclosure provides a video generation apparatus, including:
an acquisition unit, configured to acquire a source image;
a first generating unit, configured to generate a plurality of sketch images according to the source image, the plurality of sketch images respectively corresponding to sketch images of the source image at different color depths;
a second generating unit, configured to generate, based on a target sketch image, a plurality of sub-images of the target sketch image, the plurality of sub-images respectively corresponding to sketch images of the target sketch image at different degrees of drawing completion, the target sketch image being any one of the plurality of sketch images;
a third generating unit, configured to take each sub-image of each of the plurality of sketch images as a video frame of a sketch-drawing video, set the order of the video frames by color depth from light to dark and by drawing completion from low to high, and generate the sketch-drawing video.
In a third aspect, according to one or more embodiments of the present disclosure, an electronic device is provided, including: at least one processor and a memory;
the memory stores computer-executable instructions;
the at least one processor executes the computer-executable instructions stored in the memory, causing the at least one processor to perform the video generation method described in the first aspect and the various possible designs of the first aspect above.
In a fourth aspect, according to one or more embodiments of the present disclosure, a computer-readable storage medium is provided, in which computer-executable instructions are stored; when a processor executes the computer-executable instructions, the video generation method described in the first aspect and the various possible designs of the first aspect above is implemented.
In a fifth aspect, an embodiment of the present disclosure provides a computer program product, including a computer program stored in a readable storage medium; at least one processor of an electronic device reads the computer program from the readable storage medium, and the at least one processor executes the computer program to cause the electronic device to perform the video generation method described in the first aspect and the various possible designs of the first aspect above.
In a sixth aspect, an embodiment of the present disclosure provides a computer program that, when executed by a processor, implements the video generation method described in the first aspect and the various possible designs of the first aspect above.
According to the video generation method, apparatus, device and readable storage medium provided by the embodiments of the present disclosure, after acquiring a source image, an electronic device generates, based on the source image and in order of color from light to dark, sketch images simulating the different stages of an artist drawing a target sketch; for each target sketch image, it generates, in order of drawing completion from low to high, a plurality of sub-images simulating the process of the artist drawing that sketch image; it then takes each sub-image of each of the plurality of sketch images as a video frame of a sketch-drawing video, sets the order of the video frames by color depth from light to dark and by drawing completion from low to high, and generates the sketch-drawing video. In this process, the stages of an artist creating a sketch image are simulated by image processing; the process is simple, requires no deep learning, and is efficient.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present disclosure or in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below illustrate some embodiments of the present disclosure, and those of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic diagram of a network architecture for a video generation method provided by an embodiment of the present disclosure;
FIG. 2 is a flowchart of a video generation method provided by an embodiment of the present application;
FIG. 3 is a schematic diagram of a plurality of sketch images provided by an embodiment of the present disclosure;
FIG. 4 is a schematic diagram of a plurality of sub-images provided by an embodiment of the present disclosure;
FIG. 5 is a schematic diagram of the process of drawing sub-images in the video generation method provided by the present disclosure;
FIG. 6 is a schematic diagram of the process of the video generation method provided by the present disclosure;
FIG. 7 is a schematic diagram of face key points in the video generation method provided by the present disclosure;
FIG. 8 is a schematic diagram of the second mask of the hair region in the video generation method provided by the present disclosure;
FIG. 9 is a schematic diagram of the first convex hull region and the second convex hull region in the video generation method provided by the present disclosure;
FIG. 10 is a structural block diagram of a video generation apparatus provided by an embodiment of the present disclosure;
FIG. 11 is a schematic structural diagram of an electronic device for implementing embodiments of the present disclosure.
Detailed description
To make the objectives, technical solutions and advantages of the embodiments of the present disclosure clearer, the technical solutions in the embodiments of the present disclosure are described clearly and completely below with reference to the drawings in the embodiments of the present disclosure. Obviously, the described embodiments are some, not all, of the embodiments of the present disclosure. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present disclosure without creative effort fall within the scope of protection of the present disclosure.
At present, many short videos of the process of drawing pencil sketches are very popular with users. Such videos are obtained by filming a professional artist creating a pencil sketch live. Since ordinary users have no artistic training, this way of filming and recording short videos live is obviously not applicable to them.
Therefore, the present disclosure considers generating, based on a source image and a target sketch, a video that simulates an artist's stroke-by-stroke drawing process; the process is simple and meets the needs of ordinary users.
FIG. 1 is a schematic diagram of a network architecture for a video generation method provided by an embodiment of the present disclosure. Referring to FIG. 1, the network architecture includes a terminal device 1, a server 2 and a network 3; the terminal device 1 and the server 2 establish a network connection through the network 3. The network 3 includes various connection types, such as wired or wireless communication links or optical fiber cables.
A user uses the terminal device 1 to interact with the server 2 through the network 3 to receive or send messages and the like. Various communication client applications are installed on the terminal device 1, such as video playback applications, shopping applications, search applications, instant messaging tools, email clients and social platform software.
The terminal device 1 may be hardware or software. When the terminal device 1 is hardware, it is, for example, a mobile phone, a tablet computer, an e-book reader, a laptop computer or a desktop computer. When the terminal device 1 is software, it can be installed in the hardware devices listed above; in that case, the terminal device 1 is, for example, a plurality of software modules or a single software module, which the embodiments of the present disclosure do not limit.
The server 2 is a server capable of providing various services, configured to receive the source image sent by the terminal device and to generate, based on the source image, a video simulating an artist creating a sketch image stroke by stroke.
The server 2 may be hardware or software. When the server 2 is hardware, the server 2 is a single server or a distributed server cluster composed of multiple servers. When the server 2 is software, it may be a plurality of software modules or a single software module, which the embodiments of the present disclosure do not limit.
It should be understood that the numbers of terminal devices 1, servers 2 and networks 3 in FIG. 1 are merely illustrative. In actual implementation, any number of terminal devices 1, servers 2 and networks 3 are deployed according to actual needs.
In addition, when the video generation method of the present disclosure is executed by the terminal device 1, no network connection is needed, so the server 2 and the network 3 in FIG. 1 above may be absent.
Below, based on the network architecture shown in FIG. 1, the video generation method described in the embodiments of the present application is explained in detail. For an example, refer to FIG. 2.
FIG. 2 is a flowchart of the video generation method provided by an embodiment of the present application. The executing subject of this embodiment is an electronic device, for example the terminal device or the server in FIG. 1 above. This embodiment includes:
101. Acquire a source image.
For example, the electronic device acquires the source image locally or from the Internet. The source image is also denoted image_src. The source image is a red-green-blue (RGB) image, a black-and-white photo or the like, which the embodiments of the present disclosure do not limit.
102. Generate a plurality of sketch images according to the source image.
The plurality of sketch images respectively correspond to sketch images of the source image at different color depths.
For example, a professional artist's creation of a sketch image is a step-by-step process, and the visual impression of the whole work is that its color goes from light to dark. In this step, the electronic device produces, from the source image, a plurality of sketch images whose colors deepen in turn. For an example, refer to FIG. 3.
FIG. 3 is a schematic diagram of a plurality of sketch images provided by an embodiment of the present disclosure. Referring to FIG. 3, the leftmost image is the source image, which may be a color RGB image, a black-and-white photo or the like; ①-④ are different sketch images. In an actual creation process, a work is actually produced from ① to ④. Therefore, the electronic device processes the source image with image processing algorithms, thereby obtaining sketch images ①-④ from the source image. Sketch image ④, for example, is the final product, i.e. equivalent to a finished work created by a professional artist.
It should be noted that the above sketch images ①-④ merely illustrate different sketch images and do not imply that there are necessarily four of them. In actual implementation, the number of sketch images from light to dark may be smaller or larger than four.
103. Generate, based on a target sketch image, a plurality of sub-images of the target sketch image.
The plurality of sub-images respectively correspond to sketch images of the target sketch image at different degrees of drawing completion, and the target sketch image is any one of the plurality of sketch images.
For example, for any one of the plurality of sketch images, hereinafter called the target sketch image, a professional artist cannot possibly complete it in one stroke when creating it; it is drawn step by step over many strokes. To simulate this process, the electronic device generates, for the target sketch image, sketch images at different degrees of drawing completion, hereinafter called sub-images. For adjacent first and second sub-images, the second sub-image contains more stroke regions than the first. For example, if the degree of drawing completion indicates that the mouth is drawn before the eyes, the first sub-image contains the finished mouth, and the second sub-image contains the finished eyes in addition to the finished mouth.
In addition, even a single part may need multiple strokes to complete. For instance, the first sub-image contains the finished mouth contour, and the second sub-image contains, in addition to the mouth contour, the filling of the mouth and so on.
Below, taking the face contour as an example, the plurality of sub-images are explained. For an example, refer to FIG. 4, which is a schematic diagram of a plurality of sub-images provided by an embodiment of the present disclosure.
Referring to FIG. 4, suppose the video is 15 seconds long at 30 frames per second, containing 450 frames in total. If 30 of those 450 frames are used for drawing the face contour, there are effectively 30 sub-images, and those 30 frames are equivalent to simulating a professional artist finishing the face contour in 30 strokes. The stroke regions in the successive sub-images increase. The stroke region here is the face contour; as shown in FIG. 4, among the 30 frames, the first frame contains the least face contour, the second frame gradually more, and by the 30th frame the entire face contour has been drawn.
104. Take each sub-image of each of the plurality of sketch images as a video frame of a sketch-drawing video, set the order of the video frames by color depth from light to dark and by drawing completion from low to high, and generate the sketch-drawing video.
For example, after generating the sub-images of each of the plurality of sketch images, the electronic device takes all the sub-images as video frames. It then sets the order of the sub-images of each target sketch image by drawing completion from low to high, obtaining a sub-video simulating the creation of that target sketch image; it further sets the order of the sub-videos by color from light to dark and concatenates them, obtaining the sketch-drawing video.
When the electronic device is a server, the electronic device sends the sketch-drawing video to a mobile terminal such as a mobile phone for playback. Alternatively, when the electronic device is a mobile terminal such as a mobile phone, it plays the sketch-drawing video directly, stores it locally, or the like.
According to the video generation method provided by this embodiment of the present disclosure, after acquiring a source image, the electronic device generates, based on the source image and in order of color from light to dark, sketch images simulating the different stages of an artist drawing a target sketch; for each target sketch image, it generates, in order of drawing completion from low to high, a plurality of sub-images simulating the process of the artist drawing that sketch image; it then takes each sub-image of each of the plurality of sketch images as a video frame of the sketch-drawing video, sets the order of the video frames by color depth from light to dark and by drawing completion from low to high, and generates the sketch-drawing video. In this process, the stages of an artist creating a sketch image are simulated by image processing; the process is simple, requires no deep learning, and is efficient.
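The nested frame ordering of steps 101-104 can be sketched as follows. This is an illustrative Python snippet, not part of the patent: the function name `make_sketch_video_frames` and the `subimage_fn` callback are hypothetical, and grayscale images are assumed to be numpy arrays.

```python
import numpy as np

def make_sketch_video_frames(sketch_images, subimage_fn):
    """Order frames light-to-dark across sketch stages and
    low-to-high completion within each stage.

    sketch_images: list of HxW uint8 arrays, ordered from the
        lightest sketch image to the darkest (step 102's output).
    subimage_fn: callable(target, background) -> list of sub-images
        ordered by increasing drawing completion (step 103).
    """
    frames = []
    h, w = sketch_images[0].shape
    background = np.full((h, w), 255, dtype=np.uint8)  # white canvas
    for target in sketch_images:
        # sub-video simulating the creation of this stage's sketch
        frames.extend(subimage_fn(target, background))
        # the next stage continues drawing on this stage's result
        background = target
    return frames
```

A video encoder would then simply write `frames` out in order; the stage-to-stage background hand-off mirrors the rule that sketch ② is drawn on top of sketch ①.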
In the above embodiments, when the electronic device generates the plurality of sub-images of the target sketch image based on the target sketch image, it determines a growth order of the mask values of the pixels in a first mask. The initial mask value of each pixel in the first mask is 0; a growth of a pixel in the first mask indicates that the pixel's mask value changes from 0 to 1; the first mask is used to make the background of the target sketch image gradually turn into the target sketch image according to the growth order. Then, according to the growth order, the target sketch image and the background of the target sketch image, it generates the plurality of sub-images, each growth in the growth order corresponding one-to-one to a sub-image among the plurality of sub-images.
For example, for any one of the plurality of sketch images, hereinafter called the target sketch image, the electronic device sets a first mask for it, the initial mask value of every pixel of which is 0. The electronic device then determines the growth order of the pixels in the first mask; each growth indicates which pixels' mask values change from 0 to 1 and corresponds to one or more strokes in a professional artist's drawing process. For example, if a growth indicates that the mask values of the pixels of the upper-lip contour change from 0 to 1, the upper-lip contour can be drawn according to that growth.
With this solution, by generating sub-images of different stages of drawing the sketch strokes, the artist's drawing of the sketch image is simulated accurately.
In the above embodiments, when the electronic device generates the plurality of sub-images according to the growth order, the target sketch image and the background of the target sketch image, it first determines, for each growth in the growth order, a first pixel set corresponding to that growth from the first mask.
For example, among the plurality of sketch images ordered from light to dark, for two adjacent sketch images, in a real creation process the latter sketch image is actually obtained by continuing to draw on the basis of the former.
Therefore, the process of creating the target sketch image is in fact a process of drawing on the background of the target sketch image so that the background turns into the target sketch image. To realize this, the changes of the mask values of the pixels in the first mask must be controlled. For example, initially the mask values of all pixels in the first mask are 0, and the background has not changed at all. In each growth, the electronic device determines the first pixel set from the first mask according to that growth. For example, if in a growth the mask values of the pixels representing the upper-lip contour change to 1, the pixels representing the upper-lip contour are taken as the first pixel set.
Then, according to the first pixel set, a second pixel set is determined from the target sketch image and a third pixel set is determined from the background, the pixels in the first pixel set, the second pixel set and the third pixel set corresponding one-to-one.
For example, the pixels of the target sketch image, the first mask and the background correspond one-to-one; therefore, once the electronic device determines the first pixel set from the first mask, it can determine the second pixel set from the target sketch image and the third pixel set from the background.
After that, the electronic device determines the pixel value of a fourth pixel according to the mask value of a first pixel in the first pixel set, the pixel value of the second pixel in the second pixel set and the pixel value of the third pixel in the third pixel set: pixel value of the fourth pixel = mask value of the first pixel × pixel value of the second pixel + pixel value of the third pixel × (1 - mask value of the first pixel). The third pixel in the background is updated to the fourth pixel, yielding the sub-image corresponding to that growth.
For an example, refer to FIG. 5, which is a schematic diagram of the process of drawing sub-images in the video generation method provided by the present disclosure. Referring to FIG. 5, the pixel value of every pixel of the background is 255; that is, the background is pure white. Suppose the pixels corresponding to one growth of the first mask are six pixels whose mask values become 1; then a polyline containing those six pixels is finally drawn on the background, the grayscale values of the six pixels being 90, 0, 23, 23, 255 and 89 respectively.
With this solution, by simulating the process of drawing each sub-image on the background, the artist's drawing of the sketch image is simulated accurately.
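The fourth-pixel formula above is an ordinary mask blend and can be sketched in a few lines of numpy (an illustrative snippet; the name `blend_step` is hypothetical, and whole-image arrays are used instead of explicit pixel sets, which is equivalent because unmasked pixels keep the background value):

```python
import numpy as np

def blend_step(mask, target, background):
    """One growth step: pixel = mask*target + (1-mask)*background.

    mask: HxW float array of 0/1 mask values (the first mask after
        this growth); target, background: HxW uint8 images.
    Returns the sub-image corresponding to this growth.
    """
    out = (mask * target.astype(np.float32)
           + (1.0 - mask) * background.astype(np.float32))
    return out.astype(np.uint8)
```

With the FIG. 5 numbers (white background, six masked pixels with grayscale values 90, 0, 23, 23, 255, 89), the masked positions take the target values and every other pixel stays 255.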
In the above embodiments, when the target sketch image is not the lightest sketch image among the plurality of sketch images, the background is the sketch image among the plurality of sketch images that is adjacent to and precedes the target sketch image; when the target sketch image is the lightest sketch image among the plurality of sketch images, the background is a white image with the same size as the target sketch image.
For example, referring again to FIG. 3, sketch image ② is obtained by continuing to draw on sketch image ①, and sketch image ③ is obtained on the basis of sketch image ②. Sketch image ① is in fact drawn on a white background. That is, the background of sketch image ① is a white background in which every pixel value is 255, the background of sketch image ② is sketch image ①, and so on.
With this solution, sketch images at different stages correspond to different backgrounds, accurately simulating the artist's drawing of the sketch image.
In the above embodiments, each growth in the growth order of the first mask corresponds one-to-one to a sub-image among the plurality of sub-images, and the growth order includes the growth order of the face contour of the person in the source image. When the source image contains a person and the electronic device determines the growth order of the mask values of the pixels in the first mask, it first extracts the face key points in the source image to obtain a key point set. The electronic device then determines, according to the source image, a second mask of the hair region of the person in the source image. Next, according to the key point set, it determines a first convex hull region of the person's face, and according to the intersection of the first convex hull region and the second mask, it determines a second convex hull region from the first mask. Finally, on the second convex hull region, the electronic device sequentially connects the face contour key points in the key point set at a stroke speed, obtaining the growth order of the face contour of the person in the source image, the stroke speed being determined according to the duration of the video.
For an example, refer to FIG. 6, which is a schematic diagram of the process of the video generation method provided by the present disclosure. Referring to FIG. 6, after acquiring the source image, the electronic device extracts the facial-feature key points and the face contour key points with a face key point detection model, and determines the second mask of the person's hair region with a hair segmentation model.
FIG. 7 is a schematic diagram of face key points in the video generation method provided by the present disclosure. Referring to FIG. 7, the face key points include key points of seven parts: left eyebrow, right eyebrow, left eye, right eye, nose, mouth and face contour. The numbers next to the key points in the figure are their serial numbers; for example, the face contour has 34 key points in total, so the key points numbered 1 to 34 are the key points of the face contour.
FIG. 8 is a schematic diagram of the second mask of the hair region in the video generation method provided by the present disclosure. Referring to FIG. 8, the role of the second mask of the hair region is to avoid a disordered sequence when simulating the drawing process; for example, in real drawing, the face contour is drawn first, then the eyebrows, then the eyes, and the hair is filled in last, rather than the hair appearing first. The second mask can therefore cover the hair region, preventing the hair from appearing first in the simulated process, which would make the whole simulation look very unnatural.
When drawing the face contour, the growth order of the face contour in the first mask must first be determined. During this determination, the electronic device determines the first convex hull region from the first mask according to the key point set. For example, the electronic device connects the face contour key points, the left eyebrow key points and the right eyebrow key points in order, and can thereby obtain the first convex hull region. It then determines the intersection of the first convex hull region and the second mask of FIG. 8, and can thereby determine the second convex hull region.
FIG. 9 is a schematic diagram of the first convex hull region and the second convex hull region in the video generation method provided by the present disclosure. Referring to FIG. 9, since there is a hair-occluded area in the first convex hull region, to avoid the problem of the hair appearing first in the simulated drawing process, the hair-occluded area must be removed from the first convex hull region, yielding the second convex hull region. The position of the face in the background can be determined from the second convex hull region. Next, the face contour, facial features and so on are drawn at the position of the face in the background; then the hair and the regions other than the person are drawn, so that the target sketch image appears on the background bit by bit, thereby simulating the process of drawing the target sketch image.
Drawing the face contour is in fact the process in which the mask values of the pixels representing the face contour in the first mask change from 0 to 1 in turn, i.e., the growth process of those pixels. This growth process comprises the growth order and the stroke speed. The growth order is the process of connecting the face contour key points in turn; for example, face contour key points 1 to 34 are connected in turn, yielding the growth order of the face contour. The growth speed is determined according to the length of the video. For example, for a 450-frame video in which the face contour is to be finished in 3 strokes, the first mask grows 3 times during the drawing of the face contour, each growth covering roughly 11 pixels; visually, this is equivalent to simulating the artist finishing the face contour in 3 strokes, corresponding to three sub-images, i.e. three image frames. As another example, to draw the face contour in one second, the first mask grows 30 times, equivalent to simulating the artist finishing the face contour in 30 strokes.
After the face contour is drawn, the mask values at the face contour positions of the first mask are 1, and they remain 1 during the subsequent drawing of the facial features.
With this solution, the artist's drawing of the face contour is simulated accurately.
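The contour growth described above — connect the ordered key points, then split the resulting polyline into as many growths as the frame budget allows — can be sketched as follows. This is an illustrative Python snippet under the assumption that key points are joined by linear interpolation; the name `contour_growth_order` is hypothetical.

```python
import numpy as np

def contour_growth_order(contour_points, num_strokes):
    """Split the face-contour polyline into `num_strokes` growths.

    contour_points: list of (x, y) key points in drawing order
        (e.g. face contour key points 1..34).
    Returns a list of pixel chunks; in growth i, the mask values of
    the pixels along chunk i change from 0 to 1.
    """
    # rasterize the polyline by linear interpolation between key points
    pts = []
    for (x0, y0), (x1, y1) in zip(contour_points, contour_points[1:]):
        n = max(abs(x1 - x0), abs(y1 - y0), 1)
        for t in np.linspace(0.0, 1.0, n, endpoint=False):
            pts.append((round(x0 + t * (x1 - x0)),
                        round(y0 + t * (y1 - y0))))
    pts.append(tuple(contour_points[-1]))
    # stroke speed: pixels per growth, derived from the frame budget
    return [list(chunk)
            for chunk in np.array_split(np.array(pts, dtype=int), num_strokes)]
```

With `num_strokes = 3` this reproduces the "three strokes for the contour" example; with `num_strokes = 30` it matches the one-second, 30-frame variant.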
Referring again to FIG. 6, after determining the growth order of the face contour, the electronic device also needs to determine the growth order of each facial-feature region. During this determination, the electronic device determines the key points of the different regions of the face from the key point set; interpolates each region of the face according to that region's key points; and, according to the interpolated regions, determines on the second convex hull region, at the stroke speed, the growth order of the corresponding region of the person's face in the source image.
For example, the face contour is relatively simple: the polyline obtained by connecting the face key points in turn can represent the face contour. For the facial-feature regions, however, the image obtained by connecting their key points in turn does not make a realistic sketch image. For example, referring to FIG. 7, after the two rows of key points of the left eyebrow are connected in turn, there is a blank area in between, which obviously cannot satisfy the requirements of a sketch image. Moreover, connecting each row of key points in turn, so that the left eyebrow is produced in two strokes, is clearly unreasonable. Therefore, for the facial-feature regions, simply connecting the key points in turn is far from enough; different regions need to be interpolated in different ways, and according to the interpolated regions the growth order corresponding to each facial-feature region is determined within the second convex hull region of the background at the stroke speed, in preparation for subsequently drawing the features with separate strategies.
With this solution, different interpolation algorithms are used for different facial-feature regions, accurately simulating the artist's drawing of the facial features.
In the above embodiments, when the electronic device interpolates the different regions of the face according to those regions' key points, for the eyebrow region of the face it interpolates a plurality of curves horizontally according to the key points of the eyebrow region.
For example, referring again to FIG. 7 and taking the left eyebrow region as an example, the key points of the left eyebrow comprise two rows, the key points of the upper row corresponding one-to-one to those of the lower row. For each pair of key points the electronic device determines the average, thereby interpolating the key points of the left eyebrow region into three rows. Further, the electronic device may interpolate the key points of the left eyebrow region into four or five rows, which the embodiments of the present disclosure do not limit; the number of rows is related to the precision. After interpolation, seven or eight strokes are needed to draw the left eyebrow, which simulates the artist's drawing process more reasonably.
For the eyeball region of the face, the eyeball region is interpolated as circular regions according to the key points of the eyeball region.
For example, since the eyeball is circular, the pupil cannot be drawn line by line. Therefore, for the eyeball region, circular interpolation is used: a circle representing the eyeball is drawn first, and then small circles are interpolated one by one until a solid circle is formed. During interpolation, the average distance between the center of the eyeball and the points on its circumference is determined and taken as the radius, and a circle centered on the center of the eyeball is interpolated. From the perspective of the first mask, when the first mask grows, the image formed by the pixels of each growth is not a polyline but one or more circles.
For the mouth region of the face, a plurality of curves are interpolated vertically according to the key points of the mouth region.
For example, the key points of the mouth region can only draw the mouth contour; after the mouth contour is completed, the filling is interpolated as vertical lines. From the perspective of the first mask, after the first mask completes the growth of the mouth contour, for the lip parts the image formed by the pixels of each growth is a series of vertical lines.
With this solution, by using different interpolation algorithms for the facial-feature regions, the artist's drawing of the facial features is simulated accurately.
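Two of the region-specific interpolations above can be sketched in numpy. This is an illustrative snippet, not the patent's implementation: the function names are hypothetical, and the row count and circle step are assumed parameters.

```python
import numpy as np

def interpolate_eyebrow_rows(top_row, bottom_row, n_rows=3):
    """Horizontally interpolate curves between the two eyebrow
    key-point rows, so the brow is filled over several strokes.
    top_row, bottom_row: (N, 2) corresponding key points."""
    top = np.asarray(top_row, dtype=float)
    bot = np.asarray(bottom_row, dtype=float)
    # weight 0 -> top row, weight 1 -> bottom row, averages in between
    return [(1.0 - w) * top + w * bot
            for w in np.linspace(0.0, 1.0, n_rows)]

def interpolate_eyeball_circles(center, rim_points, step=1.0):
    """Interpolate concentric circles from the rim inward until the
    pupil is solid; the radius is the mean center-to-rim distance."""
    c = np.asarray(center, dtype=float)
    r = float(np.mean(np.linalg.norm(
        np.asarray(rim_points, dtype=float) - c, axis=1)))
    theta = np.linspace(0.0, 2.0 * np.pi, 64)
    return [np.stack([c[0] + rr * np.cos(theta),
                      c[1] + rr * np.sin(theta)], axis=1)
            for rr in np.arange(r, 0.0, -step)]
```

Each returned curve or circle would correspond to one growth of the first mask: a row-curve per eyebrow stroke, a ring per eyeball stroke; the mouth's vertical-line filling follows the same pattern with columns instead of rows.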
In the above embodiments, after drawing the face contour and facial features, the electronic device uses the last few frames to fill in the hair and the background region other than the person. For example, in a 450-frame video, 90 frames are used to produce sketch image ① of FIG. 3; the first 88 of those 90 frames are used to draw the face contour and facial features, the 89th frame to add the hair, and the 90th frame to add the background region other than the person.
In the above embodiments, when the electronic device generates the plurality of sketch images according to the source image, it first generates a grayscale image based on the source image. The electronic device then determines a plurality of Gaussian kernels, the Gaussian kernels in the plurality of Gaussian kernels corresponding one-to-one to the sketch images in the plurality of sketch images; the Gaussian kernel corresponding to a darker sketch image is larger than the Gaussian kernel corresponding to a lighter sketch image. Next, the electronic device performs Gaussian blurring on the grayscale image with each of the plurality of Gaussian kernels, obtaining a Gaussian blurred image corresponding to each kernel. Finally, the electronic device generates the plurality of sketch images according to the Gaussian blurred image corresponding to each of the plurality of Gaussian kernels and the grayscale image.
For example, after acquiring the source image, the electronic device denoises it with median filtering or the like. The electronic device then performs grayscale processing on each pixel of the source image, converting the source image into a grayscale image. After obtaining the grayscale image, the electronic device determines the Gaussian kernel for Gaussian blurring according to Gaussian convolution or the like, and blurs the grayscale image with that kernel to obtain a Gaussian blurred image. The larger the Gaussian kernel, the darker the Gaussian blurred image obtained by blurring the grayscale image with it.
After obtaining the Gaussian blurred image corresponding to each of the plurality of sketch images, the electronic device uses the Gaussian blurred image and the grayscale image as material to generate a black-and-white sketch image. For example, the electronic device fuses the pixels of the Gaussian blurred image with the corresponding pixels of the grayscale image, thereby generating the sketch image. During fusion, the source image, the grayscale image, the Gaussian blurred image and the sketch image have the same size and their pixels correspond one-to-one. Therefore, the electronic device can determine the pixel value of the corresponding pixel of the black-and-white sketch image from the pixel values of the pixels of the grayscale image and the Gaussian blurred image, thereby obtaining the sketch image. For example, after generating the Gaussian blurred image, the electronic device performs effect extraction in dodge mode, i.e. with the following formula (1):
image_target = (image_gray / gray_blur) × 255    formula (1)
where image_target is the pixel value of a pixel of the sketch image, image_gray is the pixel value of the corresponding pixel of the grayscale image, and gray_blur is the pixel value of the corresponding pixel of the Gaussian blurred image.
With this solution, dodging based on different Gaussian kernels yields sketch images of different levels.
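A minimal numpy sketch of this light-to-dark generation with formula (1) follows. It is illustrative only: the sigma values are assumed, the function names are hypothetical, and Gaussian blurring is implemented as a separable convolution purely for self-containment.

```python
import numpy as np

def _gaussian_blur(img, sigma):
    """Separable Gaussian convolution with a (2*radius+1)-tap kernel."""
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1, dtype=np.float32)
    k = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    k /= k.sum()
    pad = np.pad(img, radius, mode="edge")
    # convolve rows, then columns; 'valid' restores the original size
    tmp = np.apply_along_axis(lambda r: np.convolve(r, k, mode="valid"), 1, pad)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="valid"), 0, tmp)

def sketch_images_from_gray(image_gray, sigmas=(2, 5, 10, 20)):
    """Generate light-to-dark sketch images by color dodge:
    image_target = (image_gray / gray_blur) * 255 (formula (1)).
    Larger kernels (larger sigmas) yield darker sketch images."""
    gray = image_gray.astype(np.float32)
    sketches = []
    for sigma in sigmas:  # ordered small -> large, i.e. light -> dark
        gray_blur = _gaussian_blur(gray, sigma)
        target = gray / np.maximum(gray_blur, 1.0) * 255.0
        sketches.append(np.clip(target, 0, 255).astype(np.uint8))
    return sketches
```

Where the grayscale pixel equals its blurred value the dodge saturates to white, so flat regions stay blank and only edges survive; the larger the kernel, the more of the image departs from its blur and the darker the resulting sketch.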
Corresponding to the video generation method of the above embodiments, FIG. 10 is a structural block diagram of a video generation apparatus provided by an embodiment of the present disclosure. For ease of explanation, only the parts related to the embodiments of the present disclosure are shown. Referring to FIG. 10, the apparatus includes: an acquisition unit 11, a first generating unit 12, a second generating unit 13 and a third generating unit 14.
The acquisition unit 11 is configured to acquire a source image.
The first generating unit 12 is configured to generate a plurality of sketch images according to the source image, the plurality of sketch images respectively corresponding to sketch images of the source image at different color depths.
The second generating unit 13 is configured to generate, based on a target sketch image, a plurality of sub-images of the target sketch image, the plurality of sub-images respectively corresponding to sketch images of the target sketch image at different degrees of drawing completion, the target sketch image being any one of the plurality of sketch images.
The third generating unit 14 is configured to take each sub-image of each of the plurality of sketch images as a video frame of a sketch-drawing video, set the order of the video frames by color depth from light to dark and by drawing completion from low to high, and generate the sketch-drawing video.
In an embodiment of the present disclosure, the second generating unit 13 is configured to determine a growth order of the mask values of the pixels in a first mask, where the initial mask value of each pixel in the first mask is 0, a growth of a pixel in the first mask indicates that the pixel's mask value changes from 0 to 1, and the first mask is used to make the background of the target sketch image gradually turn into the target sketch image according to the growth order; and to generate the plurality of sub-images according to the growth order, the target sketch image and the background of the target sketch image, each growth in the growth order corresponding one-to-one to a sub-image among the plurality of sub-images.
In an embodiment of the present disclosure, when generating the plurality of sub-images according to the growth order, the target sketch image and the background of the target sketch image, the second generating unit 13 is configured to: for each growth in the growth order, determine from the first mask a first pixel set corresponding to that growth; determine, according to the first pixel set, a second pixel set from the target sketch image and a third pixel set from the background, the pixels in the first pixel set, the second pixel set and the third pixel set corresponding one-to-one; determine the pixel value of a fourth pixel according to the mask value of a first pixel in the first pixel set, the pixel value of the second pixel in the second pixel set and the pixel value of the third pixel in the third pixel set, where pixel value of the fourth pixel = mask value of the first pixel × pixel value of the second pixel + pixel value of the third pixel × (1 - mask value of the first pixel); and update the third pixel in the background to the fourth pixel, obtaining the sub-image corresponding to that growth.
In an embodiment of the present disclosure, when the target sketch image is not the lightest sketch image among the plurality of sketch images, the background is the sketch image among the plurality of sketch images that is adjacent to and precedes the target sketch image; when the target sketch image is the lightest sketch image among the plurality of sketch images, the background is a white image with the same size as the target sketch image.
In an embodiment of the present disclosure, the source image contains a person, and the growth order includes the growth order of the face contour of the person in the source image; when determining the growth order of the mask values of the pixels in the first mask, the second generating unit 13 is configured to extract the face key points in the source image to obtain a key point set; determine, according to the source image, a second mask of the hair region of the person in the source image; determine, according to the key point set, a first convex hull region of the person's face; determine, according to the intersection of the first convex hull region and the second mask, a second convex hull region from the second mask; and sequentially connect, on the second convex hull region at a stroke speed, the face contour key points in the key point set, obtaining the growth order of the face contour of the person in the source image, the stroke speed being determined according to the duration of the video.
In an embodiment of the present disclosure, after sequentially connecting the face contour key points in the key point set on the second convex hull region at the stroke speed and obtaining the growth order of the face contour of the person in the source image, the second generating unit 13 is further configured to determine key points of the different regions of the face from the key point set; interpolate each region of the face according to that region's key points; and determine, according to the interpolated regions, on the second convex hull region at the stroke speed, the growth order of the corresponding region of the person's face in the source image.
In an embodiment of the present disclosure, when the second generating unit 13 interpolates the different regions of the face according to those regions' key points: for the eyebrow region of the face, a plurality of curves are interpolated horizontally according to the key points of the eyebrow region; for the eyeball region of the face, the eyeball region is interpolated as circular regions according to the key points of the eyeball region; for the mouth region of the face, a plurality of curves are interpolated vertically according to the key points of the mouth region.
In an embodiment of the present disclosure, the first generating unit 12 is configured to generate a grayscale image based on the source image; determine a plurality of Gaussian kernels, the Gaussian kernels in the plurality of Gaussian kernels corresponding one-to-one to the sketch images in the plurality of sketch images, with the Gaussian kernel corresponding to a darker sketch image larger than the Gaussian kernel corresponding to a lighter sketch image; perform Gaussian blurring on the grayscale image with each of the plurality of Gaussian kernels, obtaining a Gaussian blurred image corresponding to each kernel; and generate the plurality of sketch images according to the Gaussian blurred image corresponding to each of the plurality of Gaussian kernels and the grayscale image.
The apparatus provided by this embodiment can be used to execute the technical solutions of the above method embodiments; its implementation principles and technical effects are similar and are not repeated here.
FIG. 11 is a schematic structural diagram of an electronic device for implementing embodiments of the present disclosure; the electronic device 200 may be a terminal device or a server. The terminal device may include, but is not limited to, mobile terminals such as mobile phones, laptop computers, digital broadcast receivers, personal digital assistants (PDAs), tablet computers (Portable Android Devices, PADs), portable multimedia players (PMPs) and vehicle-mounted terminals (e.g. vehicle navigation terminals), as well as fixed terminals such as digital TVs and desktop computers. The electronic device shown in FIG. 11 is merely an example and should not impose any limitation on the functions and scope of use of the embodiments of the present disclosure.
As shown in FIG. 11, the electronic device 200 may include a processing apparatus (e.g. a central processing unit, a graphics processing unit, etc.) 201, which can perform various appropriate actions and processing according to a program stored in a read-only memory (ROM) 202 or a program loaded from a storage apparatus 208 into a random access memory (RAM) 203. Various programs and data needed for the operation of the electronic device 200 are also stored in the RAM 203. The processing apparatus 201, the ROM 202 and the RAM 203 are connected to one another by a bus 204. An input/output (I/O) interface 205 is also connected to the bus 204.
Generally, the following apparatuses can be connected to the I/O interface 205: input apparatuses 206 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer and gyroscope; output apparatuses 207 including, for example, a liquid crystal display (LCD), speaker and vibrator; storage apparatuses 208 including, for example, a magnetic tape and hard disk; and a communication apparatus 209. The communication apparatus 209 can allow the electronic device 200 to communicate wirelessly or by wire with other devices to exchange data. Although FIG. 11 shows the electronic device 200 with various apparatuses, it should be understood that implementing or possessing all of the apparatuses shown is not required; more or fewer apparatuses may alternatively be implemented or possessed.
In particular, according to embodiments of the present disclosure, the process described above with reference to the flowchart can be implemented as a computer software program. For example, embodiments of the present disclosure include a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for executing the method shown in the flowchart. In such an embodiment, the computer program can be downloaded and installed from a network through the communication apparatus 209, or installed from the storage apparatus 208, or installed from the ROM 202. When the computer program is executed by the processing apparatus 201, the above functions defined in the methods of the embodiments of the present disclosure are executed.
It should be noted that the above computer-readable medium of the present disclosure may be a computer-readable signal medium or a computer-readable storage medium or any combination of the two. The computer-readable storage medium may be, for example, but is not limited to, an electric, magnetic, optical, electromagnetic, infrared or semiconductor system, apparatus or device, or any combination of the above. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM, or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present disclosure, a computer-readable storage medium may be any tangible medium that contains or stores a program, which can be used by or in combination with an instruction execution system, apparatus or device. In the present disclosure, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, which carries computer-readable program code. Such a propagated data signal may take many forms, including but not limited to electromagnetic signals, optical signals or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, which can send, propagate or transmit a program for use by or in combination with an instruction execution system, apparatus or device. The program code contained on the computer-readable medium can be transmitted with any appropriate medium, including but not limited to: electric wire, optical cable, radio frequency (RF), etc., or any suitable combination of the above.
The above computer-readable medium may be contained in the above electronic device, or may exist separately without being assembled into the electronic device.
The above computer-readable medium carries one or more programs; when the one or more programs are executed by the electronic device, the electronic device executes the methods shown in the above embodiments.
Computer program code for executing the operations of the present disclosure can be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code can execute entirely on the user's computer, partly on the user's computer, as an independent software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In cases involving a remote computer, the remote computer can be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or can be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the drawings illustrate the architecture, functions and operations of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, program segment or part of code, which contains one or more executable instructions for implementing the specified logical functions. It should also be noted that in some alternative implementations, the functions marked in the blocks may occur in an order different from that marked in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, and they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented with a dedicated hardware-based system that performs the specified functions or operations, or can be implemented with a combination of dedicated hardware and computer instructions.
The units described in the embodiments of the present disclosure can be implemented by software or by hardware. The name of a unit does not in some cases constitute a limitation of the unit itself; for example, the first acquisition unit can also be described as "a unit that acquires at least two Internet Protocol addresses".
The functions described above herein can be executed at least in part by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that can be used include: field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), application specific standard parts (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), and so on.
In the context of the present disclosure, a machine-readable medium may be a tangible medium that can contain or store a program for use by or in combination with an instruction execution system, apparatus or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, electronic, magnetic, optical, electromagnetic, infrared or semiconductor systems, apparatuses or devices, or any suitable combination of the above. More specific examples of the machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
In a first aspect, according to one or more embodiments of the present disclosure, a video generation method is provided, including: acquiring a source image; generating a plurality of sketch images according to the source image, the plurality of sketch images respectively corresponding to sketch images of the source image at different color depths; generating, based on a target sketch image, a plurality of sub-images of the target sketch image, the plurality of sub-images respectively corresponding to sketch images of the target sketch image at different degrees of drawing completion, the target sketch image being any one of the plurality of sketch images; taking each sub-image of each of the plurality of sketch images as a video frame of a sketch-drawing video, setting the order of the video frames by color depth from light to dark and by drawing completion from low to high, and generating the sketch-drawing video.
According to one or more embodiments of the present disclosure, the generating, based on a target sketch image, a plurality of sub-images of the target sketch image includes: determining a growth order of the mask values of the pixels in a first mask, where the initial mask value of each pixel in the first mask is 0, a growth of a pixel in the first mask indicates that the pixel's mask value changes from 0 to 1, and the first mask is used to make the background of the target sketch image gradually turn into the target sketch image according to the growth order; generating the plurality of sub-images according to the growth order, the target sketch image and the background of the target sketch image, each growth in the growth order corresponding one-to-one to a sub-image among the plurality of sub-images.
According to one or more embodiments of the present disclosure, the generating the plurality of sub-images according to the growth order, the target sketch image and the background of the target sketch image includes: for each growth in the growth order, determining from the first mask a first pixel set corresponding to that growth; determining, according to the first pixel set, a second pixel set from the target sketch image and a third pixel set from the background, the pixels in the first pixel set, the second pixel set and the third pixel set corresponding one-to-one; determining the pixel value of a fourth pixel according to the mask value of a first pixel in the first pixel set, the pixel value of the second pixel in the second pixel set and the pixel value of the third pixel in the third pixel set, where pixel value of the fourth pixel = mask value of the first pixel × pixel value of the second pixel + pixel value of the third pixel × (1 - mask value of the first pixel); updating the third pixel in the background to the fourth pixel to obtain the sub-image corresponding to that growth.
According to one or more embodiments of the present disclosure, when the target sketch image is not the lightest sketch image among the plurality of sketch images, the background is the sketch image among the plurality of sketch images that is adjacent to and precedes the target sketch image; when the target sketch image is the lightest sketch image among the plurality of sketch images, the background is a white image with the same size as the target sketch image.
According to one or more embodiments of the present disclosure, the source image contains a person, the growth order includes the growth order of the face contour of the person in the source image, and the determining a growth order of the mask values of the pixels in a first mask includes: extracting the face key points in the source image to obtain a key point set; determining, according to the source image, a second mask of the hair region of the person in the source image; determining, according to the key point set, a first convex hull region of the person's face; determining, according to the intersection of the first convex hull region and the second mask, a second convex hull region from the second mask; sequentially connecting, on the second convex hull region at a stroke speed, the face contour key points in the key point set to obtain the growth order of the face contour of the person in the source image, the stroke speed being determined according to the duration of the video.
According to one or more embodiments of the present disclosure, after the sequentially connecting, on the second convex hull region at a stroke speed, the face contour key points in the key point set to obtain the growth order of the face contour of the person in the source image, the method further includes: determining key points of the different regions of the face from the key point set; interpolating each region of the face according to that region's key points; determining, according to the interpolated regions, on the second convex hull region at the stroke speed, the growth order of the corresponding region of the person's face in the source image.
According to one or more embodiments of the present disclosure, the interpolating the different regions of the face according to the key points of the regions includes: for the eyebrow region of the face, horizontally interpolating a plurality of curves according to the key points of the eyebrow region; for the eyeball region of the face, interpolating the eyeball region as circular regions according to the key points of the eyeball region; for the mouth region of the face, vertically interpolating a plurality of curves according to the key points of the mouth region.
According to one or more embodiments of the present disclosure, the generating a plurality of sketch images according to the source image includes: generating a grayscale image based on the source image; determining a plurality of Gaussian kernels, the Gaussian kernels in the plurality of Gaussian kernels corresponding one-to-one to the sketch images in the plurality of sketch images, with the Gaussian kernel corresponding to a darker sketch image larger than the Gaussian kernel corresponding to a lighter sketch image; performing Gaussian blurring on the grayscale image with each of the plurality of Gaussian kernels to obtain a Gaussian blurred image corresponding to each of the plurality of Gaussian kernels; generating the plurality of sketch images according to the Gaussian blurred image corresponding to each of the plurality of Gaussian kernels and the grayscale image.
In a second aspect, according to one or more embodiments of the present disclosure, a video generation apparatus is provided, including:
an acquisition unit, configured to acquire a source image;
a first generating unit, configured to generate a plurality of sketch images according to the source image, the plurality of sketch images respectively corresponding to sketch images of the source image at different color depths;
a second generating unit, configured to generate, based on a target sketch image, a plurality of sub-images of the target sketch image, the plurality of sub-images respectively corresponding to sketch images of the target sketch image at different degrees of drawing completion, the target sketch image being any one of the plurality of sketch images;
a third generating unit, configured to take each sub-image of each of the plurality of sketch images as a video frame of a sketch-drawing video, set the order of the video frames by color depth from light to dark and by drawing completion from low to high, and generate the sketch-drawing video.
According to one or more embodiments of the present disclosure, the second generating unit is configured to determine a growth order of the mask values of the pixels in a first mask, where the initial mask value of each pixel in the first mask is 0, a growth of a pixel in the first mask indicates that the pixel's mask value changes from 0 to 1, and the first mask is used to make the background of the target sketch image gradually turn into the target sketch image according to the growth order; and to generate the plurality of sub-images according to the growth order, the target sketch image and the background of the target sketch image, each growth in the growth order corresponding one-to-one to a sub-image among the plurality of sub-images.
According to one or more embodiments of the present disclosure, when generating the plurality of sub-images according to the growth order, the target sketch image and the background of the target sketch image, the second generating unit is configured to: for each growth in the growth order, determine from the first mask a first pixel set corresponding to that growth; determine, according to the first pixel set, a second pixel set from the target sketch image and a third pixel set from the background, the pixels in the first pixel set, the second pixel set and the third pixel set corresponding one-to-one; determine the pixel value of a fourth pixel according to the mask value of a first pixel in the first pixel set, the pixel value of the second pixel in the second pixel set and the pixel value of the third pixel in the third pixel set, where pixel value of the fourth pixel = mask value of the first pixel × pixel value of the second pixel + pixel value of the third pixel × (1 - mask value of the first pixel); and update the third pixel in the background to the fourth pixel, obtaining the sub-image corresponding to that growth.
According to one or more embodiments of the present disclosure, when the target sketch image is not the lightest sketch image among the plurality of sketch images, the background is the sketch image among the plurality of sketch images that is adjacent to and precedes the target sketch image; when the target sketch image is the lightest sketch image among the plurality of sketch images, the background is a white image with the same size as the target sketch image.
According to one or more embodiments of the present disclosure, the source image contains a person, and the growth order includes the growth order of the face contour of the person in the source image; when determining the growth order of the mask values of the pixels in the first mask, the second generating unit is configured to extract the face key points in the source image to obtain a key point set; determine, according to the source image, a second mask of the hair region of the person in the source image; determine, according to the key point set, a first convex hull region of the person's face; determine, according to the intersection of the first convex hull region and the second mask, a second convex hull region from the second mask; and sequentially connect, on the second convex hull region at a stroke speed, the face contour key points in the key point set, obtaining the growth order of the face contour of the person in the source image, the stroke speed being determined according to the duration of the video.
According to one or more embodiments of the present disclosure, after sequentially connecting the face contour key points in the key point set on the second convex hull region at the stroke speed and obtaining the growth order of the face contour of the person in the source image, the second generating unit is further configured to determine key points of the different regions of the face from the key point set; interpolate each region of the face according to that region's key points; and determine, according to the interpolated regions, on the second convex hull region at the stroke speed, the growth order of the corresponding region of the person's face in the source image.
According to one or more embodiments of the present disclosure, when the second generating unit interpolates the different regions of the face according to those regions' key points: for the eyebrow region of the face, a plurality of curves are interpolated horizontally according to the key points of the eyebrow region; for the eyeball region of the face, the eyeball region is interpolated as circular regions according to the key points of the eyeball region; for the mouth region of the face, a plurality of curves are interpolated vertically according to the key points of the mouth region.
According to one or more embodiments of the present disclosure, the first generating unit is configured to generate a grayscale image based on the source image;
determine a plurality of Gaussian kernels, the Gaussian kernels in the plurality of Gaussian kernels corresponding one-to-one to the sketch images in the plurality of sketch images, with the Gaussian kernel corresponding to a darker sketch image larger than the Gaussian kernel corresponding to a lighter sketch image;
perform Gaussian blurring on the grayscale image with each of the plurality of Gaussian kernels, obtaining a Gaussian blurred image corresponding to each kernel; and generate the plurality of sketch images according to the Gaussian blurred image corresponding to each of the plurality of Gaussian kernels and the grayscale image.
In a third aspect, according to one or more embodiments of the present disclosure, an electronic device is provided, including: at least one processor and a memory;
the memory stores computer-executable instructions;
the at least one processor executes the computer-executable instructions stored in the memory, causing the at least one processor to perform the video generation method described in the first aspect and the various possible designs of the first aspect above.
In a fourth aspect, according to one or more embodiments of the present disclosure, a computer-readable storage medium is provided, in which computer-executable instructions are stored; when a processor executes the computer-executable instructions, the video generation method described in the first aspect and the various possible designs of the first aspect above is implemented.
In a fifth aspect, according to one or more embodiments of the present disclosure, a computer program product is provided, including a computer program stored in a readable storage medium; at least one processor of an electronic device reads the computer program from the readable storage medium, and the at least one processor executes the computer program to cause the electronic device to perform the video generation method described in the first aspect and the various possible designs of the first aspect above.
In a sixth aspect, according to one or more embodiments of the present disclosure, a computer program is provided that, when executed by a processor, implements the video generation method described in the first aspect and the various possible designs of the first aspect above.
The above description is only a description of the preferred embodiments of the present disclosure and of the technical principles applied. Those skilled in the art should understand that the scope of disclosure involved in the present disclosure is not limited to technical solutions formed by specific combinations of the above technical features, and should also cover other technical solutions formed by any combination of the above technical features or their equivalent features without departing from the above disclosed concept, for example technical solutions formed by replacing the above features with technical features with similar functions disclosed in (but not limited to) the present disclosure.
Furthermore, although the operations are depicted in a specific order, this should not be understood as requiring that these operations be executed in the specific order shown or in sequential order. Under certain circumstances, multitasking and parallel processing may be advantageous. Likewise, although several specific implementation details are contained in the above discussion, these should not be construed as limiting the scope of the present disclosure. Certain features described in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable sub-combination.
Although the subject matter has been described in language specific to structural features and/or methodological logical actions, it should be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or actions described above. On the contrary, the specific features and actions described above are merely example forms of implementing the claims.

Claims (13)

  1. A video generation method, comprising:
    acquiring a source image;
    generating a plurality of sketch images according to the source image, the plurality of sketch images respectively corresponding to sketch images of the source image at different color depths;
    generating, based on a target sketch image, a plurality of sub-images of the target sketch image, the plurality of sub-images respectively corresponding to sketch images of the target sketch image at different degrees of drawing completion, the target sketch image being any one of the plurality of sketch images;
    taking each sub-image of each of the plurality of sketch images as a video frame of a sketch-drawing video, setting the order of the video frames by color depth from light to dark and by drawing completion from low to high, and generating the sketch-drawing video.
  2. The method according to claim 1, wherein the generating, based on a target sketch image, a plurality of sub-images of the target sketch image comprises:
    determining a growth order of mask values of pixels in a first mask, wherein the initial mask value of each pixel in the first mask is 0, a growth of a pixel in the first mask indicates that the mask value of the pixel changes from 0 to 1, and the first mask is used to make the background of the target sketch image gradually turn into the target sketch image according to the growth order;
    generating the plurality of sub-images according to the growth order, the target sketch image and the background of the target sketch image, each growth in the growth order corresponding one-to-one to a sub-image among the plurality of sub-images.
  3. The method according to claim 2, wherein the generating the plurality of sub-images according to the growth order, the target sketch image and the background of the target sketch image comprises:
    for each growth in the growth order, determining from the first mask a first pixel set corresponding to the growth;
    determining, according to the first pixel set, a second pixel set from the target sketch image and a third pixel set from the background, the pixels in the first pixel set, the second pixel set and the third pixel set corresponding one-to-one;
    determining a pixel value of a fourth pixel according to a mask value of a first pixel in the first pixel set, a pixel value of a second pixel in the second pixel set and a pixel value of a third pixel in the third pixel set, wherein pixel value of the fourth pixel = mask value of the first pixel × pixel value of the second pixel + pixel value of the third pixel × (1 - mask value of the first pixel);
    updating the third pixel in the background to the fourth pixel to obtain the sub-image corresponding to the growth.
  4. The method according to claim 2 or 3, wherein
    when the target sketch image is not the lightest sketch image among the plurality of sketch images, the background is the sketch image among the plurality of sketch images that is adjacent to and precedes the target sketch image;
    when the target sketch image is the lightest sketch image among the plurality of sketch images, the background is a white image with the same size as the target sketch image.
  5. The method according to any one of claims 2-4, wherein the source image contains a person, the growth order comprises a growth order of a face contour of the person in the source image, and the determining a growth order of mask values of pixels in a first mask comprises:
    extracting face key points in the source image to obtain a key point set;
    determining, according to the source image, a second mask of a hair region of the person in the source image;
    determining, according to the key point set, a first convex hull region of the person's face;
    determining, according to the intersection of the first convex hull region and the second mask, a second convex hull region from the second mask;
    sequentially connecting, on the second convex hull region at a stroke speed, the face contour key points in the key point set to obtain the growth order of the face contour of the person in the source image, the stroke speed being determined according to the duration of the video.
  6. The method according to claim 5, wherein after the sequentially connecting, on the second convex hull region at a stroke speed, the face contour key points in the key point set to obtain the growth order of the face contour of the person in the source image, the method further comprises:
    determining key points of different regions of the face from the key point set;
    interpolating each region of the face according to the key points of that region;
    determining, according to the interpolated regions, on the second convex hull region at the stroke speed, the growth order of the corresponding region of the person's face in the source image.
  7. The method according to claim 6, wherein the interpolating the different regions of the face according to the key points of the regions comprises:
    for an eyebrow region of the face, horizontally interpolating a plurality of curves according to the key points of the eyebrow region;
    for an eyeball region of the face, interpolating the eyeball region as circular regions according to the key points of the eyeball region;
    for a mouth region of the face, vertically interpolating a plurality of curves according to the key points of the mouth region.
  8. The method according to any one of claims 1-7, wherein the generating a plurality of sketch images according to the source image comprises:
    generating a grayscale image based on the source image;
    determining a plurality of Gaussian kernels, the Gaussian kernels in the plurality of Gaussian kernels corresponding one-to-one to the sketch images in the plurality of sketch images, wherein the Gaussian kernel corresponding to a darker sketch image is larger than the Gaussian kernel corresponding to a lighter sketch image;
    performing Gaussian blurring on the grayscale image with each of the plurality of Gaussian kernels to obtain a Gaussian blurred image corresponding to each of the plurality of Gaussian kernels;
    generating the plurality of sketch images according to the Gaussian blurred image corresponding to each of the plurality of Gaussian kernels and the grayscale image.
  9. A video generation apparatus, comprising:
    an acquisition unit, configured to acquire a source image;
    a first generating unit, configured to generate a plurality of sketch images according to the source image, the plurality of sketch images respectively corresponding to sketch images of the source image at different color depths;
    a second generating unit, configured to generate, based on a target sketch image, a plurality of sub-images of the target sketch image, the plurality of sub-images respectively corresponding to sketch images of the target sketch image at different degrees of drawing completion, the target sketch image being any one of the plurality of sketch images;
    a third generating unit, configured to take each sub-image of each of the plurality of sketch images as a video frame of a sketch-drawing video, set the order of the video frames by color depth from light to dark and by drawing completion from low to high, and generate the sketch-drawing video.
  10. An electronic device, comprising: at least one processor and a memory;
    the memory stores computer-executable instructions;
    the at least one processor executes the computer-executable instructions stored in the memory, causing the at least one processor to perform the video generation method according to any one of claims 1-8.
  11. A computer-readable storage medium, wherein computer-executable instructions are stored in the computer-readable storage medium, and when a processor executes the computer-executable instructions, the video generation method according to any one of claims 1-8 is implemented.
  12. A computer program product, comprising a computer program which, when executed by a processor, implements the video generation method according to any one of claims 1-8.
  13. A computer program which, when executed by a processor, implements the video generation method according to any one of claims 1-8.
PCT/CN2022/075037 2021-02-05 2022-01-29 视频生成方法、装置、设备及可读存储介质 WO2022166896A1 (zh)

Priority Applications (4)

Application Number Priority Date Filing Date Title
EP22749175.0A EP4277261A4 (en) 2021-02-05 2022-01-29 VIDEO GENERATING METHOD AND APPARATUS, AS WELL AS APPARATUS AND READABLE STORAGE MEDIUM
BR112023015702A BR112023015702A2 (pt) 2021-02-05 2022-01-29 Método e aparelho de geração de vídeo, dispositivo e meio de armazenamento legível
JP2023547370A JP2024506014A (ja) 2021-02-05 2022-01-29 動画生成方法、装置、機器及び可読記憶媒体
US18/264,232 US20240095981A1 (en) 2021-02-05 2022-01-29 Video generation method and apparatus, device and readable storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110163139.8A CN112995534B (zh) 2021-02-05 2021-02-05 视频生成方法、装置、设备及可读存储介质
CN202110163139.8 2021-02-05

Publications (1)

Publication Number Publication Date
WO2022166896A1 true WO2022166896A1 (zh) 2022-08-11

Family

ID=76348275

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/075037 WO2022166896A1 (zh) 2021-02-05 2022-01-29 视频生成方法、装置、设备及可读存储介质

Country Status (6)

Country Link
US (1) US20240095981A1 (zh)
EP (1) EP4277261A4 (zh)
JP (1) JP2024506014A (zh)
CN (1) CN112995534B (zh)
BR (1) BR112023015702A2 (zh)
WO (1) WO2022166896A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116260976A (zh) * 2023-05-15 2023-06-13 深圳比特耐特信息技术股份有限公司 一种视频数据处理应用***

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112995534B (zh) * 2021-02-05 2023-01-24 北京字跳网络技术有限公司 视频生成方法、装置、设备及可读存储介质
CN113747136B (zh) * 2021-09-30 2024-03-22 深圳追一科技有限公司 视频数据处理方法、装置、设备及介质

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3114587U (ja) * 2005-07-08 2005-10-27 アートコレクションハウス株式会社 デッサン用絵画教材
CN109087553A (zh) * 2018-08-23 2018-12-25 广东智媒云图科技股份有限公司 一种临摹绘画方法
CN109448079A (zh) * 2018-10-25 2019-03-08 广东智媒云图科技股份有限公司 一种绘画引导方法及设备
CN109993810A (zh) * 2019-03-19 2019-07-09 广东智媒云图科技股份有限公司 一种智能素描绘画方法、装置、存储介质及终端设备
CN112995534A (zh) * 2021-02-05 2021-06-18 北京字跳网络技术有限公司 视频生成方法、装置、设备及可读存储介质

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6021417A (en) * 1997-10-31 2000-02-01 Foto Fantasy, Inc. Method of stimulating the creation of an artist's drawing or painting, and device for accomplishing same
US6619860B1 (en) * 1997-11-14 2003-09-16 Eastman Kodak Company Photobooth for producing digitally processed images
WO2012093856A2 (en) * 2011-01-04 2012-07-12 Samsung Electronics Co., Ltd. Method and apparatus for creating a live artistic sketch of an image
CN102496180B (zh) * 2011-12-15 2014-03-26 山东师范大学 一种自动生成水墨山水画图像的方法
CN107967667A (zh) * 2017-12-21 2018-04-27 广东欧珀移动通信有限公司 素描的生成方法、装置、终端设备和存储介质
CN110599437A (zh) * 2019-09-26 2019-12-20 北京百度网讯科技有限公司 用于处理视频的方法和装置
CN110738595B (zh) * 2019-09-30 2023-06-30 腾讯科技(深圳)有限公司 图片处理方法、装置和设备及计算机存储介质
CN110717919A (zh) * 2019-10-15 2020-01-21 阿里巴巴(中国)有限公司 图像处理方法、装置、介质和计算设备

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3114587U (ja) * 2005-07-08 2005-10-27 アートコレクションハウス株式会社 デッサン用絵画教材
CN109087553A (zh) * 2018-08-23 2018-12-25 广东智媒云图科技股份有限公司 一种临摹绘画方法
CN109448079A (zh) * 2018-10-25 2019-03-08 广东智媒云图科技股份有限公司 一种绘画引导方法及设备
CN109993810A (zh) * 2019-03-19 2019-07-09 广东智媒云图科技股份有限公司 一种智能素描绘画方法、装置、存储介质及终端设备
CN112995534A (zh) * 2021-02-05 2021-06-18 北京字跳网络技术有限公司 视频生成方法、装置、设备及可读存储介质

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP4277261A4 *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116260976A (zh) * 2023-05-15 2023-06-13 深圳比特耐特信息技术股份有限公司 一种视频数据处理应用***

Also Published As

Publication number Publication date
US20240095981A1 (en) 2024-03-21
BR112023015702A2 (pt) 2023-11-07
JP2024506014A (ja) 2024-02-08
CN112995534B (zh) 2023-01-24
EP4277261A1 (en) 2023-11-15
EP4277261A4 (en) 2024-07-24
CN112995534A (zh) 2021-06-18

Similar Documents

Publication Publication Date Title
WO2022166896A1 (zh) 视频生成方法、装置、设备及可读存储介质
US20180025506A1 (en) Avatar-based video encoding
US20190287283A1 (en) User-guided image completion with image completion neural networks
US9314692B2 (en) Method of creating avatar from user submitted image
CN110766777A (zh) 虚拟形象的生成方法、装置、电子设备及存储介质
KR20240050463A (ko) 얼굴 재연을 위한 시스템 및 방법
CN112017257B (zh) 图像处理方法、设备及存储介质
US10783713B2 (en) Transmutation of virtual entity sketch using extracted features and relationships of real and virtual objects in mixed reality scene
CN112581635B (zh) 一种通用的快速换脸方法、装置、电子设备和存储介质
JP2023549841A (ja) ビデオ処理方法、装置、電子機器及び記憶媒体
CN111340921B (zh) 染色方法、装置和计算机***及介质
WO2023197780A1 (zh) 图像处理方法、装置、电子设备及存储介质
WO2022166907A1 (zh) 图像处理方法、装置、设备及可读存储介质
CN114842120A (zh) 一种图像渲染处理方法、装置、设备及介质
CN110860084A (zh) 一种虚拟画面处理方法及装置
CN115953597B (zh) 图像处理方法、装置、设备及介质
WO2021155666A1 (zh) 用于生成图像的方法和装置
CN112508772B (zh) 图像生成方法、装置及存储介质
CN114422698A (zh) 视频生成方法、装置、设备及存储介质
CN116501901A (zh) 一种数字音乐图像展示方法、***及装置
CN115714888B (zh) 视频生成方法、装置、设备与计算机可读存储介质
WO2023158375A2 (zh) 表情包生成方法及设备
WO2023158370A2 (zh) 表情包生成方法及设备
CN112053450B (zh) 文字的显示方法、装置、电子设备及存储介质
CN115512193A (zh) 面部表情生成方法和装置

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22749175

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 18264232

Country of ref document: US

WWE Wipo information: entry into national phase

Ref document number: 202327052480

Country of ref document: IN

Ref document number: 2023547370

Country of ref document: JP

ENP Entry into the national phase

Ref document number: 2022749175

Country of ref document: EP

Effective date: 20230807

REG Reference to national code

Ref country code: BR

Ref legal event code: B01A

Ref document number: 112023015702

Country of ref document: BR

NENP Non-entry into the national phase

Ref country code: DE

WWE Wipo information: entry into national phase

Ref document number: 11202305881X

Country of ref document: SG

ENP Entry into the national phase

Ref document number: 112023015702

Country of ref document: BR

Kind code of ref document: A2

Effective date: 20230804