WO2023098649A1 - Video generation method, apparatus, device, and storage medium - Google Patents

Video generation method, apparatus, device, and storage medium

Info

Publication number
WO2023098649A1
Authority
WO
WIPO (PCT)
Prior art keywords
portrait
video
image
video frame
background image
Prior art date
Application number
PCT/CN2022/134957
Other languages
English (en)
French (fr)
Inventor
陈一鑫
Original Assignee
北京字节跳动网络技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京字节跳动网络技术有限公司
Publication of WO2023098649A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 Details of television systems
    • H04N5/222 Studio circuitry; Studio devices; Studio equipment
    • H04N5/262 Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects; Cameras specially adapted for the electronic generation of special effects
    • H04N5/2621 Cameras specially adapted for the electronic generation of special effects during image pickup, e.g. digital cameras, camcorders, video cameras having integrated special effects capability
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/23412 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs for generating or manipulating the scene composition of objects, e.g. MPEG-4 objects
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/23424 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving splicing one content stream with another content stream, e.g. for inserting or substituting an advertisement
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/44012 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving rendering scenes according to scene graphs, e.g. MPEG-4 scene graphs
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/44016 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving splicing one content stream with another content stream, e.g. for substituting a video clip
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 Details of television systems
    • H04N5/222 Studio circuitry; Studio devices; Studio equipment
    • H04N5/262 Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects; Cameras specially adapted for the electronic generation of special effects
    • H04N5/272 Means for inserting a foreground image in a background image, i.e. inlay, outlay
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 Details of television systems
    • H04N5/222 Studio circuitry; Studio devices; Studio equipment
    • H04N5/262 Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects; Cameras specially adapted for the electronic generation of special effects
    • H04N5/272 Means for inserting a foreground image in a background image, i.e. inlay, outlay
    • H04N2005/2726 Means for inserting a foreground image in a background image, i.e. inlay, outlay for simulating a person's appearance, e.g. hair style, glasses, clothes
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • Embodiments of the present disclosure relate to the technical field of image processing, for example, to a video generation method, apparatus, device, and storage medium.
  • Short video apps have developed rapidly, become part of users' lives, and gradually enriched users' spare time. Users can record their lives with videos, photos, and the like, and use the special effects provided by a short video app to reprocess that content and express it in richer forms, such as beautification, stylization, and expression editing.
  • Embodiments of the present disclosure provide a video generation method, apparatus, device, and storage medium that can apply a special effect that hides portraits in captured videos, making the generated videos more engaging.
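  • An embodiment of the present disclosure provides a method for generating a video, including:
  • acquiring video frames included in a video to be processed, wherein the video to be processed contains a portrait;
  • performing portrait segmentation on the video frame to obtain a portrait image and a background image;
  • adjusting the transparency of pixels in the portrait image that satisfy a set condition to obtain an adjusted portrait image;
  • processing the background image to obtain a complete background image;
  • fusing the adjusted portrait image with the complete background image to obtain a portrait video frame; and
  • splicing a plurality of the portrait video frames to obtain a target video.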
  • An embodiment of the present disclosure also provides a video generation device, including:
  • a video frame acquisition module configured to acquire video frames included in the video to be processed, wherein the video to be processed contains a portrait;
  • a portrait segmentation module configured to perform portrait segmentation on the video frame to obtain a portrait image and a background image;
  • a portrait image adjustment module configured to adjust the transparency of pixels satisfying the set condition in the portrait image to obtain an adjusted portrait image;
  • a complete background image acquisition module configured to process the background image to obtain a complete background image;
  • a portrait video frame acquisition module configured to fuse the adjusted portrait image with the complete background image to obtain a portrait video frame; and
  • a target video acquisition module configured to splice a plurality of the portrait video frames to obtain the target video.
  • An embodiment of the present disclosure further provides an electronic device, where the electronic device includes: at least one processing device; and
  • a storage device configured to store at least one program;
  • when the at least one program is executed by the at least one processing device, the at least one processing device implements the video generation method according to the embodiments of the present disclosure.
  • the embodiments of the present disclosure further provide a computer-readable medium on which a computer program is stored, and when the program is executed by a processing device, the video generation method as described in the embodiments of the present disclosure is implemented.
  • FIG. 1 is a flow chart of a method for generating a video in an embodiment of the present disclosure;
  • FIG. 2 is a schematic structural diagram of a video generation device in an embodiment of the present disclosure;
  • FIG. 3 is a schematic structural diagram of an electronic device in an embodiment of the present disclosure.
  • The term “comprise” and its variations are open-ended, i.e., “including but not limited to”.
  • The term “based on” means “based at least in part on”.
  • the term “one embodiment” means “at least one embodiment”; the term “another embodiment” means “at least one further embodiment”; the term “some embodiments” means “at least some embodiments.” Relevant definitions of other terms will be given in the description below.
  • FIG. 1 is a flow chart of a video generation method provided by an embodiment of the present disclosure. This embodiment is applicable to applying a hiding effect to portraits in video frames.
  • The method can be executed by a video generation device, which can be implemented in hardware and/or software and can generally be integrated into a device with a video generation function, such as a server, a mobile terminal, or a server cluster. As shown in FIG. 1, the method includes the following steps:
  • Step 110 acquiring video frames included in the video to be processed.
  • the video to be processed contains portraits.
  • a portrait can be understood as an image containing part or all of a human body.
  • The video to be processed may be downloaded from a local database or a server database, pre-recorded by the camera of the terminal, or captured in real time by the camera of the terminal. If the video to be processed is downloaded from a database or pre-recorded, it must be split into frames to obtain the video frames it contains; if it is captured in real time, the video frames captured by the camera can be used directly.
  • video frames can be extracted from the video to be processed at preset intervals, for example: each video frame of the video to be processed can be extracted, or a video frame can be extracted every N video frames.
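  • As a minimal sketch of the interval-based extraction described above, assuming the video has already been decoded into an in-memory sequence of frames. The helper name `sample_frames` and the reading of "every N video frames" as keeping frames 0, N, 2N, and so on are illustrative assumptions, not taken from the disclosure.

```python
from typing import List, Sequence, TypeVar

Frame = TypeVar("Frame")


def sample_frames(frames: Sequence[Frame], step: int = 1) -> List[Frame]:
    """Keep one frame out of every `step` frames.

    step=1 keeps every frame; step=N keeps frames 0, N, 2N, ...
    (hypothetical helper; the interval convention is an assumption).
    """
    if step < 1:
        raise ValueError("step must be a positive integer")
    return list(frames[::step])
```

    With ten decoded frames and `step=3`, the sketch keeps the frames at indices 0, 3, 6, and 9.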
  • Step 120 performing portrait segmentation on the video frame to obtain a portrait image and a background image.
  • The principle of portrait segmentation may be: first identify the portrait in the video frame, then extract the identified portrait, so as to obtain the portrait image and the background image from which the portrait has been removed.
  • Portrait segmentation may be performed on each video frame as follows: perform portrait recognition on each video frame to obtain a portrait mask map and a background mask map; obtain the portrait image according to the portrait mask map and the video frame; and obtain the background image according to the background mask map and the video frame.
  • the portrait mask map can be understood as a mask in which the portrait area is transparent and the background area is black;
  • the background mask map can be understood as a mask in which the background area is transparent and the portrait area is black.
  • The process of performing portrait segmentation on each video frame can be: input the video frame into a semantic recognition model to obtain the confidence that each pixel in the video frame belongs to the portrait; determine the gray value of each pixel according to its confidence to obtain a mask map; and finally obtain the portrait mask map and the background mask map from the mask map.
  • the mask map is a grayscale image
  • the white area is the portrait area
  • the black area is the background area.
  • The process of obtaining the portrait image according to the portrait mask map and the video frame may be: create a new layer (also called a patch) on which the portrait mask map is superimposed on the video frame so that the background area is occluded, thereby obtaining the portrait image.
  • the process of obtaining the background image according to the background mask image and the video frame may be: superimposing the background mask image on the video frame, blocking the portrait area, thereby obtaining the background image.
  • the portrait image and the background image are acquired based on the portrait mask image and the background mask image, which can improve the accuracy of the portrait image and the background image.
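  • The mask-based split described above can be sketched roughly as follows, using plain Python lists for clarity. The per-pixel confidences are taken as given (the semantic recognition model is not shown). The function names, the threshold of 128, and the use of an alpha channel (0 = occluded) to represent the occluded area are assumptions made for illustration only.

```python
from typing import List, Tuple

RGB = Tuple[int, int, int]
RGBA = Tuple[int, int, int, int]


def confidence_to_mask(confidence: List[List[float]]) -> List[List[int]]:
    """Map portrait confidence in [0, 1] to a gray value in [0, 255].

    White (255) marks the portrait area, black (0) the background area.
    """
    return [[round(255 * min(max(c, 0.0), 1.0)) for c in row] for row in confidence]


def split_frame(
    frame: List[List[RGB]], mask: List[List[int]], threshold: int = 128
) -> Tuple[List[List[RGBA]], List[List[RGBA]]]:
    """Return (portrait image, background image) as RGBA pixel grids.

    Pixels occluded by the respective mask get alpha 0 (assumed convention).
    """
    portrait, background = [], []
    for frame_row, mask_row in zip(frame, mask):
        p_row, b_row = [], []
        for (r, g, b), gray in zip(frame_row, mask_row):
            is_portrait = gray >= threshold
            p_row.append((r, g, b, 255 if is_portrait else 0))
            b_row.append((r, g, b, 0 if is_portrait else 255))
        portrait.append(p_row)
        background.append(b_row)
    return portrait, background
```

    The background image produced this way still has a hole where the portrait was, which is why a later step repairs it.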
  • Step 130 adjusting the transparency of pixels in the portrait image that meet the set conditions to obtain an adjusted portrait image.
  • The set condition may be that the color value of the pixel is greater than or smaller than a set value, or that the distance between the pixel and the center point of the video frame is greater than or smaller than a set value.
  • The set value is related to the timestamp of the video frame in the video to be processed.
  • the way to adjust the transparency can be to reduce the transparency of the pixels by a set ratio, for example, adjust to any ratio such as 1/2 or 1/5 of the original transparency. In this embodiment, the transparency can be directly adjusted to 0 to achieve the effect of hiding the pixels.
  • If the set condition is that the distance is greater than the set value, the set value can change from large to small with the timestamp, so that the portrait in the generated video is gradually hidden from the outside to the inside; or the set value can first change from large to small and then from small to large with the timestamp, so that the portrait is first hidden from the outside to the inside and then displayed from the inside to the outside.
  • If the set condition is that the distance is smaller than the set value, the set value can change from small to large with the timestamp, so that the portrait in the generated video is gradually hidden from the inside to the outside; or the set value can first change from small to large and then from large to small with the timestamp, so that the portrait is first hidden from the inside to the outside and then displayed from the outside to the inside.
  • the setting value is related to the time stamp of the video frame, so that the portraits in the generated video show the effect of gradually hiding, which improves the interest of the video.
  • the manner of hiding the pixel may be to hide the coloring of the pixel so that the pixel appears transparent.
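  • A minimal sketch of the distance-based hiding described above, for the case where pixels farther from the frame center than the set value are hidden. The linear schedule in `set_value_at`, the function names, and the RGBA pixel layout (alpha 0 = hidden) are assumptions for illustration; the disclosure only states that the set value is related to the timestamp.

```python
import math
from typing import List, Tuple

RGBA = Tuple[int, int, int, int]


def set_value_at(timestamp: float, duration: float, max_distance: float) -> float:
    """Shrink the set value linearly from max_distance to 0 over the clip
    (assumed schedule: large to small, so hiding runs from outside to inside)."""
    if duration <= 0:
        raise ValueError("duration must be positive")
    progress = min(max(timestamp / duration, 0.0), 1.0)
    return max_distance * (1.0 - progress)


def hide_outside(portrait: List[List[RGBA]], set_value: float) -> List[List[RGBA]]:
    """Set alpha to 0 for pixels whose distance to the frame center exceeds set_value."""
    height, width = len(portrait), len(portrait[0])
    cx, cy = (width - 1) / 2.0, (height - 1) / 2.0
    result = []
    for y, row in enumerate(portrait):
        new_row = []
        for x, (r, g, b, a) in enumerate(row):
            if math.hypot(x - cx, y - cy) > set_value:
                a = 0  # hide the pixel
            new_row.append((r, g, b, a))
        result.append(new_row)
    return result
```

    Calling `hide_outside(portrait, set_value_at(t, duration, max_distance))` for each frame's timestamp `t` would hide progressively more of the portrait as the clip advances.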
  • The transparency of pixels satisfying the set condition in the portrait image may also be adjusted as follows: copy the portrait image at least once to obtain at least one portrait copy; rotate the at least one portrait copy by a set angle around a coordinate axis of three-dimensional space to obtain a rotated portrait copy, the portrait image and the at least one rotated portrait copy forming a portrait group; and adjust the transparency of the pixels in the portrait group that satisfy the set condition to obtain an adjusted portrait group.
  • the manner of duplicating at least one portrait image may be to create at least one new layer, and place the portrait image on the new layer, thereby obtaining at least one duplicate portrait image.
  • the coordinate axis of the three-dimensional space can be the x-axis or the y-axis, and the set angle can be any value between 10-90 degrees.
  • Multiple portrait copies may rotate around different coordinate axes or by different angles. For example, assume the portrait image is copied twice to obtain two portrait copies, one of which is rotated 70 degrees around the x-axis and the other 70 degrees around the y-axis.
  • The transparency of pixels in the portrait group whose distance from the center point of the video frame is greater or smaller than a set value is adjusted, so as to obtain the adjusted portrait group.
  • multiple copies of the portrait are copied, and the transparency of the multiple portraits is adjusted at the same time, which can present a special effect of "transforming shape and changing shadow", thereby improving the interest of the video.
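  • To illustrate how a portrait group might be assembled, the sketch below rotates the corner points of a flat portrait layer around the x-axis or y-axis using the standard 3D rotation matrices. Representing a layer by a list of points relative to the frame center, and the names `rotate_about_axis` and `build_portrait_group`, are assumptions; how the rotated layer is projected and rendered is not covered here.

```python
import math
from typing import List, Sequence, Tuple

Point2D = Tuple[float, float]
Point3D = Tuple[float, float, float]


def rotate_about_axis(x: float, y: float, axis: str, degrees: float) -> Point3D:
    """Rotate the planar point (x, y, 0) around the x- or y-axis."""
    theta = math.radians(degrees)
    if axis == "x":
        return (x, y * math.cos(theta), y * math.sin(theta))
    if axis == "y":
        return (x * math.cos(theta), y, -x * math.sin(theta))
    raise ValueError("axis must be 'x' or 'y'")


def build_portrait_group(
    corners: Sequence[Point2D], rotations: Sequence[Tuple[str, float]]
) -> List[List[Point3D]]:
    """Return the original layer plus one rotated copy per (axis, degrees) pair."""
    group = [[(x, y, 0.0) for x, y in corners]]
    for axis, degrees in rotations:
        group.append([rotate_about_axis(x, y, axis, degrees) for x, y in corners])
    return group
```

    For the example in the text, `build_portrait_group(corners, [("x", 70), ("y", 70)])` yields a group of three layers: the original and two rotated copies.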
  • The following steps are also included: for each pixel in the adjusted portrait group, determine a rotation percentage according to the distance between the pixel and the center point of the video frame; determine the rotation parameter of the pixel according to the rotation percentage and a set rotation angle; and rotate the pixel based on the rotation parameter.
  • The set rotation angle is related to the timestamp of the video frame.
  • the set rotation angle can be changed from small to large until all portraits are hidden.
  • the rotation parameters include a first sub-rotation parameter and a second sub-rotation parameter.
  • The rotation parameter of the pixel can be determined from the rotation percentage and the set rotation angle as follows: determine an intermediate rotation angle according to the rotation percentage and the set rotation angle; use the sine of the intermediate rotation angle as the first sub-rotation parameter, and the cosine of the intermediate rotation angle as the second sub-rotation parameter.
  • the manner of performing rotational offset on the pixel point based on the rotation parameter may be: determining the coordinate information of the pixel point after rotation according to the first sub-rotation parameter and the second sub-rotation parameter.
  • the coordinate information of the pixel point after rotation is determined based on the first sub-rotation parameter and the second sub-rotation parameter, so that the position of the pixel point after rotation can be accurately determined.
  • The following steps can still be performed: for each pixel in the portrait image, determine the rotation percentage according to the distance between the pixel and the center point of the video frame; determine the rotation parameter of the pixel according to the rotation percentage and the set rotation angle, wherein the set rotation angle is related to the timestamp of the video frame; and rotate the pixel based on the rotation parameter.
  • the portrait image of the current frame or the pixels in the adjusted portrait image group are rotated based on the rotation parameters, so that the portrait of the video can have a rotation effect while the transparency changes, thereby improving the interest of the video.
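  • The per-pixel rotation described above can be sketched as follows. The disclosure does not give the exact formulas, so this sketch assumes that the rotation percentage is the pixel's distance from the center divided by a maximum distance, that the intermediate rotation angle is the percentage multiplied by the set rotation angle, and that the pixel is rotated about the frame center with a standard 2D rotation. All names are illustrative.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def rotation_params(
    x: float, y: float, center: Point, max_distance: float, set_angle_deg: float
) -> Tuple[float, float]:
    """Return (first, second) sub-rotation parameters: sine and cosine
    of the intermediate rotation angle."""
    if max_distance <= 0:
        raise ValueError("max_distance must be positive")
    distance = math.hypot(x - center[0], y - center[1])
    percentage = min(distance / max_distance, 1.0)      # assumed definition
    angle = math.radians(percentage * set_angle_deg)    # intermediate rotation angle
    return math.sin(angle), math.cos(angle)


def rotate_pixel(x: float, y: float, center: Point, sin_a: float, cos_a: float) -> Point:
    """Coordinates of the pixel after rotating it about the frame center."""
    dx, dy = x - center[0], y - center[1]
    return (center[0] + dx * cos_a - dy * sin_a,
            center[1] + dx * sin_a + dy * cos_a)
```

    Under these assumptions, pixels near the center barely move while pixels near the edge rotate by almost the full set angle, which gives a swirl-like offset that grows as the set rotation angle increases over time.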
  • the following step is further included: scaling at least one portrait image in the adjusted group of portrait images according to a set ratio to obtain the group of zoomed portrait images.
  • the setting ratio can be set to any value between 50%-90%. Assuming that the setting ratio is 70%, the portrait in the current video frame is scaled to 70% of the portrait in the previous frame.
  • at least one may be randomly selected for scaling, which is not limited here. That is, all the portraits in the portrait group can be zoomed simultaneously, or several of them can be selected for zooming. In this embodiment, zooming at least one portrait image in the portrait image group can make the portraits in the video show the effect of dynamic size change.
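  • As a small worked sketch of the scaling step: if each frame's portrait is scaled to a set ratio of the portrait in the previous frame, the cumulative scale after k frames is the ratio raised to the power k. Scaling point coordinates about the frame center, and both helper names, are assumptions for illustration.

```python
from typing import Tuple

Point = Tuple[float, float]


def cumulative_scale(ratio: float, frame_index: int) -> float:
    """Scale of the portrait in frame `frame_index` relative to frame 0,
    when every frame is `ratio` times the size of the previous one."""
    if not 0.0 < ratio <= 1.0:
        raise ValueError("ratio must be in (0, 1]")
    if frame_index < 0:
        raise ValueError("frame_index must be non-negative")
    return ratio ** frame_index


def scale_about_center(x: float, y: float, center: Point, scale: float) -> Point:
    """Move a point toward the center by the given scale factor."""
    return (center[0] + (x - center[0]) * scale,
            center[1] + (y - center[1]) * scale)
```

    With a set ratio of 70%, the portrait would be at 70% of its original size in the second frame and 49% in the third.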
  • Step 140 process the background image to obtain a complete background image.
  • the background image is the background image with the portrait area cut out, so the background image needs to be repaired.
  • a preset restoration algorithm (such as an inpainting algorithm) may be used to process the background image.
  • The background image may be processed to obtain a complete background image as follows: obtain the optical flow information of the background image of the first video frame of the video to be processed, or of the background image of the video frame preceding the current video frame; then process the optical flow information with the set restoration algorithm to obtain the complete background image of the video frame.
  • the optical flow information may be acquired by inputting video frames into an optical flow information determination model.
  • any optical flow information acquisition method in the related art may be used, which is not limited here.
  • the optical flow information is processed by using the set restoration algorithm to obtain the complete background image of the video frame.
  • the optical flow information of the background image of the first video frame can be used to repair the background image of each subsequent video frame.
  • The advantage of this is that optical flow information is extracted fewer times, which reduces the amount of computation and improves the efficiency of the entire video generation process.
  • the background image of the current video frame can also be repaired by using the optical flow information of the previous video frame, which has the advantage of improving the accuracy of the background image repair.
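  • The sketch below is a deliberately simplified stand-in for the background repair step, not the restoration algorithm of the disclosure. It assumes that a complete reference background (for example, from the first or the previous frame) and a precomputed integer flow field are already available, and fills each cut-out pixel by looking up the reference pixel that the flow points to. The function name, the data layout, and the flow convention are all assumptions.

```python
from typing import List, Tuple

RGB = Tuple[int, int, int]
RGBA = Tuple[int, int, int, int]
Flow = Tuple[int, int]  # (dx, dy) from the current frame into the reference frame


def fill_holes(
    background: List[List[RGBA]],
    reference: List[List[RGB]],
    flow: List[List[Flow]],
) -> List[List[RGB]]:
    """Fill pixels with alpha 0 (the cut-out portrait area) from the reference."""
    height, width = len(background), len(background[0])
    completed = []
    for y in range(height):
        row = []
        for x in range(width):
            r, g, b, a = background[y][x]
            if a == 0:
                dx, dy = flow[y][x]
                # Clamp the source position to the image bounds.
                sx = min(max(x + dx, 0), width - 1)
                sy = min(max(y + dy, 0), height - 1)
                r, g, b = reference[sy][sx]
            row.append((r, g, b))
        completed.append(row)
    return completed
```

    A production implementation would typically estimate the flow with a dedicated model and blend or inpaint the filled region rather than copying pixels directly.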
  • Step 150 fusing the adjusted portrait image with the complete background image to obtain portrait video frames.
  • the portrait image is superimposed with the complete background image to obtain a portrait video frame. If the portrait image is copied to obtain a portrait image group, then the portrait image group is fused with the complete background image to obtain a portrait video frame. If the pixels in the portrait image are rotated, the rotated portrait image group is fused with the complete background image to obtain a portrait video frame. If the portrait image is scaled, the scaled portrait image group is fused with the complete background image to obtain a portrait video frame.
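  • Fusing the adjusted portrait image with the complete background image can be sketched as ordinary alpha blending. The disclosure does not specify the blend formula, so the linear "over" blend below, the function name `fuse`, and the RGBA/RGB pixel layout are assumptions.

```python
from typing import List, Tuple

RGB = Tuple[int, int, int]
RGBA = Tuple[int, int, int, int]


def fuse(portrait: List[List[RGBA]], background: List[List[RGB]]) -> List[List[RGB]]:
    """Blend each portrait pixel over the background using its alpha value."""
    fused = []
    for p_row, b_row in zip(portrait, background):
        row = []
        for (pr, pg, pb, pa), (br, bg, bb) in zip(p_row, b_row):
            alpha = pa / 255.0
            row.append((
                round(alpha * pr + (1.0 - alpha) * br),
                round(alpha * pg + (1.0 - alpha) * bg),
                round(alpha * pb + (1.0 - alpha) * bb),
            ))
        fused.append(row)
    return fused
```

    Pixels whose transparency was set to 0 show only the repaired background, so the hidden part of the portrait appears to vanish into the scene.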
  • Step 160 splicing multiple portrait video frames to obtain the target video.
  • The portrait in each video frame extracted from the video to be processed undergoes the transparency adjustment of the above embodiment, yielding multiple portrait video frames, which are then spliced and encoded to obtain the target video.
  • the transparency of pixels satisfying the set condition in the portrait image may be adjusted to 0, so as to achieve the effect of hiding the pixels.
  • the pixel points satisfying the set condition in the group of portrait images are hidden to obtain the group of hidden portrait images (the group of hidden portrait images may contain at least one portrait image). Then, based on the determined rotation parameters, the pixel points in the hidden portrait group are rotated to obtain the rotated hidden portrait group. Then, at least one portrait image of the rotated hidden portrait image group is scaled according to a set ratio to obtain the scaled hidden portrait image group. Then, the scaled hidden portrait image group is fused with the complete background image to obtain hidden portrait video frames, and finally the hidden portrait video frames are spliced to obtain the target video.
  • the generated target video presents the effect that the portrait gradually hides.
  • In the technical solution of the embodiment of the present disclosure: the video frames contained in the video to be processed are acquired, wherein the video to be processed contains a portrait; portrait segmentation is performed on the video frame to obtain a portrait image and a background image; the transparency of pixels satisfying the set condition in the portrait image is adjusted to obtain an adjusted portrait image; the background image is processed to obtain a complete background image; the adjusted portrait image is fused with the complete background image to obtain a portrait video frame; and multiple portrait video frames are spliced to obtain the target video.
  • the video generation method provided by the embodiments of the present disclosure adjusts the transparency of pixels satisfying the set conditions in the portrait image, and can implement special effect processing for hiding the portraits in the captured video and improve the interest of the generated video.
  • Fig. 2 is a schematic structural diagram of a video generation device provided by an embodiment of the present disclosure. As shown in Fig. 2, the device includes:
  • the video frame obtaining module 210 is configured to obtain the video frame included in the video to be processed; wherein, the video to be processed contains a portrait;
  • the portrait segmentation module 220 is configured to perform portrait segmentation on the video frame to obtain a portrait image and a background image;
  • the portrait image adjustment module 230 is configured to adjust the transparency of pixels satisfying the set conditions in the portrait image to obtain the adjusted portrait image;
  • the complete background image acquisition module 240 is configured to process the background image to obtain a complete background image
  • the portrait video frame acquisition module 250 is configured to fuse the adjusted portrait image with the complete background image to obtain the portrait video frame;
  • the target video acquisition module 260 is configured to splice multiple portrait video frames to obtain the target video.
  • the portrait segmentation module 220 is also set to:
  • the portrait adjustment module 230 is also set to:
  • the portrait video frame acquisition module 250 is also set to:
  • the adjusted portrait image group is fused with the complete background image to obtain a portrait video frame.
  • rotation module set to:
  • the rotation module is also set to:
  • the sine value of the intermediate rotation angle is used as the first sub-rotation parameter, and the cosine value of the intermediate rotation angle is used as the second sub-rotation parameter;
  • Rotate and offset pixels based on rotation parameters including:
  • the coordinate information of the pixel after rotation is determined according to the first sub-rotation parameter and the second sub-rotation parameter.
  • scaling module set to:
  • Scaling at least one portrait image in the adjusted portrait image group according to a set ratio to obtain a scaled portrait image group.
  • the complete background image acquisition module 240 is further set to:
  • the set restoration algorithm is used to process the optical flow information to obtain the complete background image of the video frame.
  • the set condition is that the distance between the pixel point and the center point of the video frame is greater than or smaller than a set value; wherein, the set value is related to the time stamp of the video frame.
  • the above-mentioned device can execute the methods provided in all the foregoing embodiments of the present disclosure, and has corresponding functional modules for executing the above-mentioned methods.
  • Referring to FIG. 3, a schematic structural diagram of an electronic device 300 suitable for implementing an embodiment of the present disclosure is shown.
  • The electronic device in the embodiment of the present disclosure may include, but is not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, personal digital assistants (PDAs), tablet computers (PADs), portable multimedia players (PMPs), and vehicle-mounted terminals (such as vehicle-mounted navigation terminals); fixed terminals such as digital televisions (TVs) and desktop computers; and various forms of servers, such as independent servers or server clusters.
  • the electronic device shown in FIG. 3 is only an example, and should not limit the functions and scope of use of the embodiments of the present disclosure
  • As shown in FIG. 3, the electronic device 300 may include a processing device 301 (such as a central processing unit or a graphics processing unit), which can execute various appropriate actions and processes according to a program stored in a read-only memory (ROM) 302 or a program loaded from a storage device 308 into a random access memory (RAM) 303.
  • The RAM 303 also stores various programs and data necessary for the operation of the electronic device 300.
  • the processing device 301, ROM 302, and RAM 303 are connected to each other through a bus 304.
  • An input/output (Input/Output, I/O) interface 305 is also connected to the bus 304 .
  • The following devices can be connected to the I/O interface 305: an input device 306 including, for example, a touch screen, a touchpad, a keyboard, a mouse, a camera, a microphone, an accelerometer, and a gyroscope; an output device 307 including, for example, a liquid crystal display (LCD), a speaker, and a vibrator; a storage device 308 including, for example, a magnetic tape and a hard disk; and a communication device 309.
  • the communication means 309 may allow the electronic device 300 to perform wireless or wired communication with other devices to exchange data. While FIG. 3 shows electronic device 300 having various means, it should be understood that implementing or having all of the means shown is not a requirement. More or fewer means may alternatively be implemented or provided.
  • Embodiments of the present disclosure include a computer program product comprising a computer program carried on a computer-readable medium, the computer program comprising program code for performing the video generation method.
  • the computer program may be downloaded and installed from the network via the communication means 309, or from the storage means 305, or from the ROM 302.
  • when the computer program is executed by the processing device 301, the above-mentioned functions defined in the methods of the embodiments of the present disclosure are performed.
  • the computer-readable medium mentioned above in the present disclosure may be a computer-readable signal medium or a computer-readable storage medium or any combination of the two.
  • a computer-readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof.
  • more specific examples of computer-readable storage media may include, but are not limited to: an electrical connection having at least one wire, a portable computer diskette, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (Erasable Programmable Read-Only Memory, EPROM, or flash memory), optical fiber, portable compact disc read-only memory (Compact Disc Read-Only Memory, CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
  • a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.
  • a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave carrying computer-readable program code therein. Such propagated data signals may take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • a computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, which can send, propagate, or transmit a program for use by or in conjunction with an instruction execution system, apparatus, or device .
  • the program code contained on the computer readable medium can be transmitted by any appropriate medium, including but not limited to: electric wire, optical cable, radio frequency (Radio Frequency, RF), etc., or any suitable combination of the above.
  • the client and the server can communicate using any currently known or future-developed network protocol such as HTTP (HyperText Transfer Protocol), and can be interconnected with digital data communication in any form or medium (e.g., a communication network).
  • examples of communication networks include local area networks (Local Area Network, LAN), wide area networks (Wide Area Network, WAN), internetworks (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future-developed network.
  • the above-mentioned computer-readable medium may be included in the above-mentioned electronic device, or may exist independently without being incorporated into the electronic device.
  • the above-mentioned computer-readable medium carries at least one program, and when the at least one program is executed by the electronic device, the electronic device is caused to: acquire video frames contained in the video to be processed, wherein the video to be processed contains a portrait; perform portrait segmentation on the video frames to obtain a portrait image and a background image; adjust the transparency of pixels satisfying the set condition in the portrait image to obtain an adjusted portrait image; process the background image to obtain a complete background image; fuse the adjusted portrait image with the complete background image to obtain a portrait video frame; and splice a plurality of the portrait video frames to obtain a target video.
  • computer program code for carrying out the operations of the present disclosure may be written in one or more programming languages or combinations thereof, including but not limited to object-oriented programming languages such as Java, Smalltalk, and C++, and also conventional procedural programming languages such as the "C" language or similar programming languages.
  • the program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server.
  • the remote computer can be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computer (for example, through the Internet using an Internet service provider).
  • each block in the flowchart or block diagram may represent a module, program segment, or part of code that contains at least one executable instruction for implementing the specified logical function.
  • it should also be noted that, in some alternative implementations, the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or they may sometimes be executed in the reverse order, depending upon the functionality involved.
  • each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
  • the units involved in the embodiments described in the present disclosure may be implemented by software or by hardware. Wherein, the name of a unit does not constitute a limitation of the unit itself under certain circumstances.
  • exemplary types of hardware logic components include: Field Programmable Gate Arrays (Field Programmable Gate Arrays, FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (Application Specific Standard Parts, ASSP), System on Chip (System on Chip, SOC), Complex Programmable Logic Device (Complex Programmable Logic Device, CPLD) and so on.
  • a machine-readable medium may be a tangible medium that may contain or store a program for use by or in conjunction with an instruction execution system, apparatus, or device.
  • a machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium.
  • a machine-readable medium may include, but is not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatus, or devices, or any suitable combination of the foregoing.
  • more specific examples of machine-readable storage media would include an electrical connection based on at least one wire, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
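  • the embodiment of the present disclosure discloses a video generation method, including:
  • acquiring video frames contained in the video to be processed, wherein the video to be processed contains a portrait;
  • performing portrait segmentation on the video frames to obtain a portrait image and a background image;
  • adjusting the transparency of pixels satisfying the set condition in the portrait image to obtain an adjusted portrait image;
  • processing the background image to obtain a complete background image;
  • fusing the adjusted portrait image with the complete background image to obtain a portrait video frame;
  • splicing a plurality of the portrait video frames to obtain a target video.
  • optionally, performing portrait segmentation on each video frame to obtain a portrait image and a background image includes:
  • performing portrait recognition on each video frame to obtain a portrait mask image and a background mask image;
  • obtaining the portrait image according to the portrait mask image and the video frame;
  • obtaining the background image according to the background mask image and the video frame.
  • optionally, adjusting the transparency of pixels satisfying the set condition in the portrait image to obtain the adjusted portrait image includes:
  • copying the portrait image at least once to obtain at least one portrait copy;
  • rotating the at least one portrait copy by a set angle about a coordinate axis of three-dimensional space to obtain a rotated portrait copy, the portrait image and the at least one rotated portrait copy forming a portrait image group;
  • adjusting the transparency of pixels satisfying the set condition in the portrait image group to obtain an adjusted portrait image group.
  • fusing the adjusted portrait image with the complete background image to obtain a portrait video frame then includes:
  • fusing the adjusted portrait image group with the complete background image to obtain a portrait video frame.
  • optionally, after the adjusted portrait image group is obtained, the method further includes:
  • for each pixel in the adjusted portrait image group, determining a rotation percentage according to the distance from the pixel to the center point of the video frame;
  • determining a rotation parameter of the pixel according to the rotation percentage and a set rotation angle, wherein the set rotation angle is related to the timestamp of the video frame;
  • rotating the pixel based on the rotation parameter.
  • optionally, determining the rotation parameter of the pixel according to the rotation percentage and the set rotation angle includes:
  • determining an intermediate rotation angle according to the rotation percentage and the set rotation angle;
  • taking the sine of the intermediate rotation angle as a first sub-rotation parameter and the cosine of the intermediate rotation angle as a second sub-rotation parameter.
  • performing a rotational offset on the pixel based on the rotation parameter includes:
  • determining the coordinate information of the pixel after rotation according to the first sub-rotation parameter and the second sub-rotation parameter.
  • optionally, after the adjusted portrait image group is obtained, the method further includes:
  • scaling at least one portrait image in the adjusted portrait image group according to a set ratio to obtain a scaled portrait image group.
  • optionally, processing the background image to obtain a complete background image includes:
  • acquiring optical flow information of the background image of the first video frame of the video to be processed, or of the background image of the video frame preceding the video frame;
  • processing the optical flow information by using a preset restoration algorithm to obtain a complete background image of the video frame.
  • optionally, the set condition is that the distance between the pixel and the center point of the video frame is greater than or less than a set value, wherein the set value is related to the timestamp of the video frame.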
  • the embodiment of the present disclosure discloses a video generation device, including:
  • the video frame acquisition module is configured to acquire video frames included in the video to be processed; wherein, the video to be processed contains portraits;
  • the portrait segmentation module is configured to perform portrait segmentation on the video frame to obtain a portrait image and a background image;
  • a portrait image adjustment module configured to adjust the transparency of pixels satisfying the set conditions in the portrait image to obtain an adjusted portrait image
  • a complete background image acquisition module configured to process the background image to obtain a complete background image
  • a portrait video frame acquisition module configured to fuse the adjusted portrait image with the complete background image to obtain a portrait video frame
  • the target video acquisition module is configured to splice a plurality of the portrait video frames to obtain the target video.
  • an electronic device including:
  • at least one processing device;
  • a storage device configured to store at least one program;
  • when the at least one program is executed by the at least one processing device, the at least one processing device is caused to implement the video generation method described in any one of the embodiments of the present disclosure.
  • the embodiment of the present disclosure discloses a computer-readable medium on which a computer program is stored; when the computer program is executed by a processing device, the video generation method described in any one of the embodiments of the present disclosure is implemented.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Studio Circuits (AREA)
  • Image Processing (AREA)
  • Processing Or Creating Images (AREA)

Abstract

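Embodiments of the present disclosure disclose a video generation method, apparatus, device, and storage medium. The method includes: acquiring video frames contained in a video to be processed, wherein the video to be processed contains a portrait; performing portrait segmentation on the video frames to obtain a portrait image and a background image; adjusting the transparency of pixels in the portrait image that satisfy a set condition to obtain an adjusted portrait image; processing the background image to obtain a complete background image; fusing the adjusted portrait image with the complete background image to obtain a portrait video frame; and splicing a plurality of the portrait video frames to obtain a target video.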

Description

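Video generation method, apparatus, device, and storage medium
This application claims priority to Chinese patent application No. 202111444204.0, filed with the China Patent Office on November 30, 2021, the entire contents of which are incorporated herein by reference.
Technical Field
Embodiments of the present disclosure relate to the technical field of image processing, for example, to a video generation method, apparatus, device, and storage medium.
Background
In recent years, short-video apps have developed rapidly, entered users' lives, and gradually enriched their leisure time. Users can record their lives with videos, photos, and the like, and can reprocess them with the special-effects technologies that short-video apps provide, expressing themselves in richer forms such as beautification, stylization, and expression editing.
Summary
Embodiments of the present disclosure provide a video generation method, apparatus, device, and storage medium that can apply a special effect which hides the portrait in a captured video, making the generated video more interesting.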
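In a first aspect, an embodiment of the present disclosure provides a video generation method, including:
acquiring video frames contained in a video to be processed, wherein the video to be processed contains a portrait;
performing portrait segmentation on the video frames to obtain a portrait image and a background image;
adjusting the transparency of pixels in the portrait image that satisfy a set condition to obtain an adjusted portrait image;
processing the background image to obtain a complete background image;
fusing the adjusted portrait image with the complete background image to obtain a portrait video frame;
splicing a plurality of the portrait video frames to obtain a target video.
In a second aspect, an embodiment of the present disclosure further provides a video generation apparatus, including:
a video frame acquisition module, configured to acquire video frames contained in a video to be processed, wherein the video to be processed contains a portrait;
a portrait segmentation module, configured to perform portrait segmentation on the video frames to obtain a portrait image and a background image;
a portrait image adjustment module, configured to adjust the transparency of pixels in the portrait image that satisfy a set condition to obtain an adjusted portrait image;
a complete background image acquisition module, configured to process the background image to obtain a complete background image;
a portrait video frame acquisition module, configured to fuse the adjusted portrait image with the complete background image to obtain a portrait video frame;
a target video acquisition module, configured to splice a plurality of the portrait video frames to obtain a target video.
In a third aspect, an embodiment of the present disclosure further provides an electronic device, the electronic device including: at least one processing device;
a storage device, configured to store at least one program;
when the at least one program is executed by the at least one processing device, the at least one processing device is caused to implement the video generation method according to the embodiments of the present disclosure.
In a fourth aspect, an embodiment of the present disclosure further provides a computer-readable medium on which a computer program is stored, the program implementing the video generation method according to the embodiments of the present disclosure when executed by a processing device.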
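Brief Description of the Drawings
FIG. 1 is a flowchart of a video generation method in an embodiment of the present disclosure;
FIG. 2 is a schematic structural diagram of a video generation apparatus in an embodiment of the present disclosure;
FIG. 3 is a schematic structural diagram of an electronic device in an embodiment of the present disclosure.
Detailed Description
Embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings.
It should be understood that the steps described in the method embodiments of the present disclosure may be executed in a different order and/or in parallel. In addition, method embodiments may include additional steps and/or omit some of the illustrated steps. The scope of the present disclosure is not limited in this respect.
As used herein, the term "include" and its variants are open-ended, that is, "including but not limited to". The term "based on" means "based at least in part on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one further embodiment"; the term "some embodiments" means "at least some embodiments". Definitions of other terms are given in the description below.
Note that concepts such as "first" and "second" mentioned in the present disclosure are only used to distinguish different apparatuses, modules, or units, and are not used to limit the order of, or the interdependence between, the functions performed by these apparatuses, modules, or units.
Note that the modifiers "a" and "a plurality of" mentioned in the present disclosure are illustrative rather than restrictive; those skilled in the art should understand that, unless the context clearly indicates otherwise, they should be read as "at least one".
The names of messages or information exchanged between the apparatuses in the embodiments of the present disclosure are for illustrative purposes only and are not intended to limit the scope of these messages or information.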
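FIG. 1 is a flowchart of a video generation method provided by an embodiment of the present disclosure. This embodiment is applicable to hiding the portrait in video frames. The method may be executed by a video generation apparatus, which may be composed of hardware and/or software and can generally be integrated in a device with a video generation function; the device may be an electronic device such as a server, a mobile terminal, or a server cluster. As shown in FIG. 1, the method includes the following steps:
Step 110: acquire the video frames contained in the video to be processed.
The video to be processed contains a portrait. A portrait can be understood as an image containing part or all of a human body. The video to be processed may be downloaded from a local database or a server-side database, pre-recorded with the terminal's camera, or captured in real time with the terminal's camera. If the video to be processed is downloaded from a database or pre-recorded, it must be split into frames to obtain the video frames it contains; if it is captured in real time, the video frames captured by the camera can be acquired directly. In this embodiment, video frames may be extracted from the video to be processed at a preset interval; for example, every video frame may be extracted, or one video frame may be extracted every N video frames.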
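The interval-based extraction described in step 110 can be sketched as follows. This is a minimal illustration, not the applicant's implementation: `sample_frames` and its `interval` parameter are hypothetical names, and the frames are stand-in strings rather than decoded images.

```python
def sample_frames(frames, interval=1):
    """Return every `interval`-th frame, starting with the first one.

    `interval=1` keeps every frame; `interval=N` keeps one frame out of
    every N. Both the name and the exact convention are assumptions.
    """
    if interval < 1:
        raise ValueError("interval must be >= 1")
    return frames[::interval]


# Stand-in frames; a real pipeline would hold decoded images here.
frames = [f"frame_{i}" for i in range(10)]
every_frame = sample_frames(frames)
every_third = sample_frames(frames, interval=3)
```

For real-time capture no splitting is needed; the same function could simply be applied to the stream of camera frames as they arrive.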
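Step 120: perform portrait segmentation on the video frame to obtain a portrait image and a background image.
The principle of portrait segmentation may be as follows: first recognize the portrait in the video frame, then cut out the recognized portrait, thereby obtaining the portrait image and the background image from which the portrait has been removed.
Exemplarily, portrait segmentation may be performed on each video frame as follows: perform portrait recognition on each video frame to obtain a portrait mask image and a background mask image; obtain the portrait image according to the portrait mask image and the video frame; obtain the background image according to the background mask image and the video frame.
The portrait mask image can be understood as a mask in which the portrait region is transparent and the background region is black; the background mask image can be understood as a mask in which the background region is transparent and the portrait region is black. Portrait segmentation of each video frame may proceed as follows: input the video frame into a semantic recognition model to obtain, for each pixel in the video frame, the confidence that it belongs to the portrait; determine the gray value of the pixel from that confidence to obtain a mask map; and finally derive the portrait mask image and the background mask image from the mask map. The mask map is a grayscale image in which the white region is the portrait region and the black region is the background region.
In this embodiment, the portrait image may be obtained from the portrait mask image and the video frame as follows: create a new layer (also called a patch) and, on that layer, superimpose the portrait mask image on the video frame so that the background region is occluded, thereby obtaining the portrait image. Similarly, the background image may be obtained from the background mask image and the video frame by superimposing the background mask image on the video frame so that the portrait region is occluded. Obtaining the portrait image and the background image from the portrait mask image and the background mask image can improve the accuracy of both images.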
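The mask-based split in step 120 might look like the sketch below. It assumes the per-pixel confidences are already available (the semantic recognition model itself is out of scope), represents images as nested lists of tuples, and uses a hypothetical gray-value `threshold` to decide which pixels count as portrait; none of these choices come from the source.

```python
def confidence_to_mask(confidence, levels=255):
    """Map per-pixel portrait confidence in [0, 1] to a gray value."""
    return [[round(c * levels) for c in row] for row in confidence]


def split_by_mask(frame, mask, threshold=128):
    """Split an RGB frame into RGBA portrait and background images.

    Pixels whose mask value reaches `threshold` (an assumed cut-off) stay
    opaque in the portrait image and become transparent holes in the
    background image; all other pixels do the opposite.
    """
    portrait, background = [], []
    for frame_row, mask_row in zip(frame, mask):
        portrait_row, background_row = [], []
        for (r, g, b), gray in zip(frame_row, mask_row):
            is_person = gray >= threshold
            portrait_row.append((r, g, b, 255 if is_person else 0))
            background_row.append((r, g, b, 0 if is_person else 255))
        portrait.append(portrait_row)
        background.append(background_row)
    return portrait, background


frame = [[(10, 20, 30), (40, 50, 60)]]   # 1 x 2 RGB frame
confidence = [[0.9, 0.1]]                # left pixel is probably a person
mask = confidence_to_mask(confidence)
portrait, background = split_by_mask(frame, mask)
```

Using the alpha channel to stand in for the "occluded" region keeps both outputs the same size as the frame, which makes the later fusion step a plain per-pixel operation.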
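Step 130: adjust the transparency of the pixels in the portrait image that satisfy the set condition to obtain an adjusted portrait image.
The set condition may be that the color value of a pixel is greater than or less than a set value, or that the distance from the pixel to the center point of the video frame is greater than or less than a set value. The set value is related to the moment at which the video frame occurs in the video to be processed. The transparency may be adjusted by reducing the pixel's transparency by a set ratio, for example to 1/2 or 1/5 of the original transparency, or to any other ratio. In this embodiment, the transparency may be set directly to 0 to hide the pixel. Exemplarily, if the set condition is that the distance from the pixel to the center point of the video frame is greater than the set value, the set value may decrease with the timestamp, so that in the generated video the portrait is gradually hidden from the outside in; or the set value may first decrease and then increase with the timestamp, so that the portrait is first hidden from the outside in and then revealed from the inside out. If the set condition is that the distance from the pixel to the center point of the video frame is less than the set value, the set value may increase with the timestamp, so that the portrait is gradually hidden from the inside out; or the set value may first increase and then decrease with the timestamp, so that the portrait is first hidden from the inside out and then revealed from the outside in. In this embodiment, because the set value is related to the timestamp of the video frame, the portrait in the generated video appears to be hidden gradually, which makes the video more interesting.
In this embodiment, a pixel may be hidden by hiding its coloring so that the pixel appears transparent.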
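One way to read the distance-based condition in step 130 is sketched below for the "hidden from the outside in" case. Several things here are assumptions rather than details from the source: pixel positions are normalized to [0, 1] with the frame center at (0.5, 0.5), the set value falls linearly from a hypothetical `start` to 0 over the clip, and the value the text calls transparency is stored as an alpha channel where 0 means fully hidden.

```python
import math


def threshold_at(timestamp, duration, start=0.8):
    """Set value that shrinks linearly from `start` to 0 (assumed schedule)."""
    progress = min(max(timestamp / duration, 0.0), 1.0)
    return start * (1.0 - progress)


def hide_outer_pixels(portrait, timestamp, duration):
    """Set alpha to 0 for pixels farther from the center than the set value."""
    height, width = len(portrait), len(portrait[0])
    limit = threshold_at(timestamp, duration)
    result = []
    for y, row in enumerate(portrait):
        new_row = []
        for x, (r, g, b, a) in enumerate(row):
            # Normalized pixel-center coordinates in [0, 1].
            u = (x + 0.5) / width
            v = (y + 0.5) / height
            distance = math.hypot(u - 0.5, v - 0.5)
            new_row.append((r, g, b, 0 if distance > limit else a))
        result.append(new_row)
    return result


opaque = [[(255, 255, 255, 255)] * 3 for _ in range(3)]
early = hide_outer_pixels(opaque, timestamp=0.0, duration=4.0)
late = hide_outer_pixels(opaque, timestamp=3.0, duration=4.0)
```

Swapping the comparison to `distance < limit` and letting the set value grow instead would give the inside-out variant described in the same paragraph.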
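Optionally, the transparency of the pixels in the portrait image that satisfy the set condition may also be adjusted as follows: copy the portrait image at least once to obtain at least one portrait copy; rotate the at least one portrait copy by a set angle about a coordinate axis of three-dimensional space to obtain a rotated portrait copy, the portrait image and the at least one rotated portrait copy forming a portrait image group; and adjust the transparency of the pixels in the portrait image group that satisfy the set condition to obtain an adjusted portrait image group.
The portrait image may be copied at least once by creating at least one new layer and placing the portrait image on it, thereby obtaining at least one portrait copy. The coordinate axis of the three-dimensional space may be the x-axis or the y-axis, and the set angle may be any value between 10 and 90 degrees.
If there are multiple portrait copies, then to prevent the rotated copies from overlapping, the copies are rotated about different coordinate axes or by different angles. Exemplarily, suppose the portrait image is copied twice to obtain two portrait copies; one is rotated 70 degrees about the x-axis and the other is rotated 70 degrees about the y-axis.
Exemplarily, after the portrait image group is obtained, for each portrait image in the group, the transparency of the pixels whose distance from the center point of the video frame is greater than or less than the set value is adjusted, thereby obtaining the adjusted portrait image group. In this embodiment, copying the portrait image multiple times and adjusting the transparency of the multiple portrait images at the same time can present a "shape-shifting" special effect, making the video more interesting.
Optionally, after the adjusted portrait image group is obtained, the method further includes the following steps: for each pixel in the adjusted portrait image group, determine a rotation percentage according to the distance from the pixel to the center point of the video frame; determine a rotation parameter of the pixel according to the rotation percentage and a set rotation angle; and rotate the pixel based on the rotation parameter.
The set rotation angle is related to the timestamp of the video frame. In this embodiment, as the moment of the video frame within the video to be processed moves from earlier to later, the set rotation angle may increase until the portrait is completely hidden. The rotation percentage may be calculated from the distance between the pixel and the center point of the video frame with the following formula: p = (0.9 - d) / 0.9, where p is the rotation percentage and d is the distance from the pixel to the center point of the video frame.
The rotation parameter includes a first sub-rotation parameter and a second sub-rotation parameter. The rotation parameter of the pixel may be determined from the rotation percentage and the set rotation angle as follows: determine an intermediate rotation angle according to the rotation percentage and the set rotation angle; take the sine of the intermediate rotation angle as the first sub-rotation parameter and the cosine of the intermediate rotation angle as the second sub-rotation parameter.
The intermediate rotation angle may be calculated with the following formula: θ = p * percent * angle * 8.0, where angle is the set rotation angle, percent is a set percentage (which may be a set value), and p is the rotation percentage. The first sub-rotation parameter is then s = sin θ and the second sub-rotation parameter is c = cos θ.
Correspondingly, the rotational offset of the pixel based on the rotation parameter may be performed by determining the coordinate information of the pixel after rotation according to the first sub-rotation parameter and the second sub-rotation parameter.
Exemplarily, the coordinate information of the pixel after rotation may be calculated from the first sub-rotation parameter and the second sub-rotation parameter with the following formula: (x2, y2) = (x1 * c - y1 * s, x1 * s + y1 * c), where (x1, y1) is the coordinate information of the pixel before rotation and (x2, y2) is the coordinate information of the pixel after rotation. In this embodiment, determining the rotated coordinates from the first and second sub-rotation parameters allows the position of the pixel after rotation to be determined accurately.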
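The three formulas above chain together as in the following sketch. It assumes that (x1, y1) is measured relative to the center point of the video frame in normalized units and that `angle` is in radians; the source states neither, and `rotate_pixel` is a hypothetical name.

```python
import math


def rotate_pixel(x1, y1, angle, percent=1.0):
    """Rotate one pixel using the formulas quoted in the description.

    p     = (0.9 - d) / 0.9            rotation percentage
    theta = p * percent * angle * 8.0  intermediate rotation angle
    s, c  = sin(theta), cos(theta)     first and second sub-rotation parameters
    """
    d = math.hypot(x1, y1)  # distance to the frame center (assumed origin)
    p = (0.9 - d) / 0.9
    theta = p * percent * angle * 8.0
    s, c = math.sin(theta), math.cos(theta)
    return (x1 * c - y1 * s, x1 * s + y1 * c)


# A pixel close to the center is rotated further than one near d = 0.9.
inner = rotate_pixel(0.1, 0.0, angle=0.2)
outer = rotate_pixel(0.9, 0.0, angle=0.2)
```

Because the rotation percentage falls as the distance grows, pixels near the center turn the most, and a pixel at distance 0.9 does not move at all. The formula is not clamped in the source, so for d > 0.9 the percentage becomes negative and the pixel turns the other way.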
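In this embodiment, if the portrait image has not been copied, that is, there is only one portrait image, the following steps can still be performed: for each pixel in the portrait image, determine a rotation percentage according to the distance from the pixel to the center point of the video frame; determine a rotation parameter of the pixel according to the rotation percentage and the set rotation angle, where the set rotation angle is related to the timestamp of the video frame; and rotate the pixel based on the rotation parameter.
In this embodiment, rotating the pixels of the current frame's portrait image, or of the adjusted portrait image group, based on the rotation parameter gives the portrait in the video a rotation effect while its transparency changes, making the video more interesting.
Optionally, after the adjusted portrait image group is obtained, the method further includes the following step: scale at least one portrait image in the adjusted portrait image group according to a set ratio to obtain a scaled portrait image group.
The set ratio may be any value between 50% and 90%. If the set ratio is 70%, the portrait in the current video frame is scaled to 70% of the portrait in the previous frame. In this embodiment, when the portrait images in the adjusted portrait image group are scaled, at least one of them may be chosen at random for scaling, which is not limited here. That is, all portrait images in the group may be scaled at the same time, or only some of them may be selected. Scaling at least one portrait image in the portrait image group makes the portrait in the video appear to change size dynamically.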
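Since each frame's portrait is scaled relative to the previous frame, the scale compounds over time. The sketch below illustrates that reading; scaling about the normalized frame center (0.5, 0.5) and the function names are assumptions, not details from the source.

```python
def cumulative_scale(frame_index, ratio=0.7):
    """Scale of the portrait in frame `frame_index` relative to frame 0."""
    return ratio ** frame_index


def scale_about_center(point, scale, center=(0.5, 0.5)):
    """Move a normalized point toward `center` by the given scale factor."""
    x, y = point
    cx, cy = center
    return (cx + (x - cx) * scale, cy + (y - cy) * scale)


third_frame_scale = cumulative_scale(2)   # 0.7 applied twice
moved = scale_about_center((1.0, 0.5), 0.5)
```

With a set ratio of 70%, the portrait is just under half its original size by the third sampled frame, so the shrinking effect is visible very quickly.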
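Step 140: process the background image to obtain a complete background image.
The background image is the image from which the portrait region has been cut out, so it needs to be repaired. In this embodiment, a preset restoration algorithm (e.g., an inpainting algorithm) may be used to process the background image.
Exemplarily, the background image may be processed to obtain a complete background image as follows: acquire the optical flow information of the background image of the first video frame of the video to be processed, or of the background image of the video frame preceding the current video frame; then process the optical flow information with the preset restoration algorithm to obtain the complete background image of the video frame.
The optical flow information may be obtained by inputting the video frame into an optical flow information determination model; any optical flow acquisition method in the related art may be used in this embodiment, which is not limited here. After the optical flow information is obtained, it is processed with the preset restoration algorithm to obtain the complete background image of the video frame.
In this embodiment, the optical flow information of the background image of the first video frame may be used to repair the background image of every subsequent video frame; the advantage is that optical flow is extracted fewer times, which reduces the amount of computation and improves the efficiency of the whole video generation process. Alternatively, the optical flow information of the previous video frame may be used to repair the background image of the current video frame; the advantage is that the background image is repaired more accurately.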
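The source does not specify the restoration algorithm, so the sketch below is only a simplified stand-in, not an inpainting method: it fills the transparent hole in the current background by copying pixels from a reference background (the first frame's or the previous frame's), displaced by an integer per-pixel flow vector. The `flow` format, the clamping at the borders, and the function name are all assumptions made for illustration.

```python
def fill_background(background, reference, flow=None):
    """Fill transparent pixels of `background` from `reference`.

    `flow[y][x]` is an assumed integer displacement (dx, dy) from a pixel
    in the current frame to its match in the reference frame. Without a
    flow field the same position in the reference frame is used.
    """
    height, width = len(background), len(background[0])
    filled = []
    for y in range(height):
        row = []
        for x in range(width):
            pixel = background[y][x]
            if pixel[3] == 0:  # hole left by the removed portrait
                dx, dy = flow[y][x] if flow else (0, 0)
                sx = min(max(x + dx, 0), width - 1)
                sy = min(max(y + dy, 0), height - 1)
                source = reference[sy][sx]
                if source[3] > 0:  # only copy pixels that are valid there
                    pixel = source
            row.append(pixel)
        filled.append(row)
    return filled


current = [[(1, 1, 1, 255), (0, 0, 0, 0), (3, 3, 3, 255)]]
reference = [[(1, 1, 1, 255), (2, 2, 2, 255), (3, 3, 3, 255)]]
repaired = fill_background(current, reference)
```

Passing the first frame as `reference` for every frame corresponds to the cheaper option in the text, while passing the previous frame's repaired background corresponds to the more accurate one.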
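Step 150: fuse the adjusted portrait image with the complete background image to obtain a portrait video frame.
In this embodiment, if there is only one portrait image, it is superimposed on the complete background image to obtain the portrait video frame. If the portrait image has been copied to obtain a portrait image group, the portrait image group is fused with the complete background image to obtain the portrait video frame. If the pixels in the portrait image have been rotated, the rotated portrait image group is fused with the complete background image to obtain the portrait video frame. If the portrait image has been scaled, the scaled portrait image group is fused with the complete background image to obtain the portrait video frame.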
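The fusion in step 150 can be illustrated with ordinary "over" alpha compositing, applied layer by layer so that a single portrait image and a portrait image group are handled the same way. The source does not name a blending rule, so treating fusion as straight alpha blending is an assumption, as are the function names.

```python
def blend_pixel(foreground, background):
    """Composite one RGBA foreground pixel over an opaque background pixel."""
    r, g, b, a = foreground
    alpha = a / 255.0
    return tuple(
        round(f * alpha + k * (1.0 - alpha))
        for f, k in zip((r, g, b), background[:3])
    )


def fuse(layers, background):
    """Composite each portrait layer, in order, over the complete background."""
    frame = [[pixel[:3] for pixel in row] for row in background]
    for layer in layers:
        frame = [
            [blend_pixel(fg, bg) for fg, bg in zip(layer_row, frame_row)]
            for layer_row, frame_row in zip(layer, frame)
        ]
    return frame


background = [[(10, 20, 30, 255), (10, 20, 30, 255)]]
portrait = [[(200, 100, 0, 255), (200, 100, 0, 0)]]  # right pixel is hidden
video_frame = fuse([portrait], background)
```

A pixel whose alpha was set to 0 in step 130 simply lets the repaired background show through, which is what produces the hiding effect in the output frame.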
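Step 160: splice a plurality of portrait video frames to obtain a target video.
In this embodiment, the transparency adjustment of the above embodiments is applied to the portrait in each video frame extracted from the video to be processed, yielding a plurality of portrait video frames; these portrait video frames are spliced and encoded to obtain the target video.
Optionally, the transparency of the pixels in the portrait image that satisfy the set condition may be set to 0 to hide those pixels. Exemplarily, the pixels in the portrait image group that satisfy the set condition are hidden to obtain a hidden portrait image group (which may contain at least one portrait image). Then, based on the determined rotation parameter, the pixels in the hidden portrait image group are rotated to obtain a rotated hidden portrait image group. Next, at least one portrait image in the rotated hidden portrait image group is scaled according to the set ratio to obtain a scaled hidden portrait image group. The scaled hidden portrait image group is then fused with the complete background image to obtain a hidden-portrait video frame, and finally the hidden-portrait video frames are spliced to obtain the target video. In this embodiment, the generated target video shows the portrait being hidden gradually.
In the technical solution of the embodiments of the present disclosure, video frames contained in a video to be processed are acquired, wherein the video to be processed contains a portrait; portrait segmentation is performed on the video frames to obtain a portrait image and a background image; the transparency of the pixels in the portrait image that satisfy a set condition is adjusted to obtain an adjusted portrait image; the background image is processed to obtain a complete background image; the adjusted portrait image is fused with the complete background image to obtain a portrait video frame; and a plurality of portrait video frames are spliced to obtain a target video. By adjusting the transparency of the pixels in the portrait image that satisfy the set condition, the video generation method provided by the embodiments of the present disclosure can apply a special effect that hides the portrait in a captured video and makes the generated video more interesting.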
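FIG. 2 is a schematic structural diagram of a video generation apparatus provided by an embodiment of the present disclosure. As shown in FIG. 2, the apparatus includes:
a video frame acquisition module 210, configured to acquire video frames contained in a video to be processed, wherein the video to be processed contains a portrait;
a portrait segmentation module 220, configured to perform portrait segmentation on the video frames to obtain a portrait image and a background image;
a portrait image adjustment module 230, configured to adjust the transparency of pixels in the portrait image that satisfy a set condition to obtain an adjusted portrait image;
a complete background image acquisition module 240, configured to process the background image to obtain a complete background image;
a portrait video frame acquisition module 250, configured to fuse the adjusted portrait image with the complete background image to obtain a portrait video frame;
a target video acquisition module 260, configured to splice a plurality of portrait video frames to obtain a target video.
Optionally, the portrait segmentation module 220 is further configured to:
perform portrait recognition on each video frame to obtain a portrait mask image and a background mask image;
obtain the portrait image according to the portrait mask image and the video frame;
obtain the background image according to the background mask image and the video frame.
Optionally, the portrait image adjustment module 230 is further configured to:
copy the portrait image at least once to obtain at least one portrait copy;
rotate the at least one portrait copy by a set angle about a coordinate axis of three-dimensional space to obtain a rotated portrait copy, the portrait image and the at least one rotated portrait copy forming a portrait image group;
adjust the transparency of the pixels in the portrait image group that satisfy the set condition to obtain an adjusted portrait image group.
Optionally, the portrait video frame acquisition module 250 is further configured to:
fuse the adjusted portrait image group with the complete background image to obtain a portrait video frame.
Optionally, the apparatus further includes a rotation module configured to:
for each pixel in the adjusted portrait image group, determine a rotation percentage according to the distance from the pixel to the center point of the video frame;
determine a rotation parameter of the pixel according to the rotation percentage and a set rotation angle, wherein the set rotation angle is related to the timestamp of the video frame;
rotate the pixel based on the rotation parameter.
Optionally, the rotation module is further configured to:
determine an intermediate rotation angle according to the rotation percentage and the set rotation angle;
take the sine of the intermediate rotation angle as a first sub-rotation parameter and the cosine of the intermediate rotation angle as a second sub-rotation parameter;
performing a rotational offset on the pixel based on the rotation parameter includes:
determining the coordinate information of the pixel after rotation according to the first sub-rotation parameter and the second sub-rotation parameter.
Optionally, the apparatus further includes a scaling module configured to:
scale at least one portrait image in the adjusted portrait image group according to a set ratio to obtain a scaled portrait image group.
Optionally, the complete background image acquisition module 240 is further configured to:
acquire the optical flow information of the background image of the first video frame of the video to be processed, or of the background image of the video frame preceding the video frame;
process the optical flow information with a preset restoration algorithm to obtain the complete background image of the video frame.
Optionally, the set condition is that the distance from the pixel to the center point of the video frame is greater than or less than a set value, wherein the set value is related to the timestamp of the video frame.
The above apparatus can execute the methods provided by all the foregoing embodiments of the present disclosure and has the functional modules corresponding to executing those methods. For technical details not described exhaustively in this embodiment, refer to the methods provided by all the foregoing embodiments of the present disclosure.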
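Referring now to FIG. 3, it shows a schematic structural diagram of an electronic device 300 suitable for implementing the embodiments of the present disclosure. The electronic device in the embodiments of the present disclosure may include, but is not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, personal digital assistants (Personal Digital Assistant, PDA), PADs (tablet computers), portable multimedia players (Portable Media Player, PMP), and vehicle-mounted terminals (for example, vehicle-mounted navigation terminals); fixed terminals such as digital televisions (Television, TV) and desktop computers; and servers in various forms, such as independent servers or server clusters. The electronic device shown in FIG. 3 is only an example and should not impose any limitation on the functions and scope of use of the embodiments of the present disclosure.
As shown in FIG. 3, the electronic device 300 may include a processing device (for example, a central processing unit, a graphics processing unit, etc.) 301, which can perform various appropriate actions and processes according to a program stored in a read-only memory (Read-Only Memory, ROM) 302 or a program loaded from the storage device 305 into a random access memory (Random Access Memory, RAM) 303. The RAM 303 also stores various programs and data required for the operation of the electronic device 300. The processing device 301, the ROM 302, and the RAM 303 are connected to one another through a bus 304. An input/output (Input/Output, I/O) interface 305 is also connected to the bus 304.
Generally, the following devices can be connected to the I/O interface 305: an input device 306 including, for example, a touch screen, a touchpad, a keyboard, a mouse, a camera, a microphone, an accelerometer, and a gyroscope; an output device 307 including, for example, a liquid crystal display (Liquid Crystal Display, LCD), a speaker, and a vibrator; a storage device 308 including, for example, a magnetic tape and a hard disk; and a communication device 309. The communication device 309 may allow the electronic device 300 to communicate wirelessly or by wire with other devices to exchange data. Although FIG. 3 shows the electronic device 300 with various devices, it should be understood that it is not required to implement or have all of the devices shown. More or fewer devices may alternatively be implemented or provided.
In particular, according to embodiments of the present disclosure, the process described above with reference to the flowchart can be implemented as a computer software program. For example, embodiments of the present disclosure include a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for executing a word recommendation method. In such an embodiment, the computer program may be downloaded and installed from a network via the communication device 309, installed from the storage device 305, or installed from the ROM 302. When the computer program is executed by the processing device 301, the above functions defined in the methods of the embodiments of the present disclosure are performed.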
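It should be noted that the computer-readable medium described above in the present disclosure may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection having at least one wire, a portable computer diskette, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (Erasable Programmable Read-Only Memory, EPROM, or flash memory), optical fiber, portable compact disc read-only memory (Compact Disc Read-Only Memory, CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present disclosure, a computer-readable storage medium may be any tangible medium that contains or stores a program, and the program can be used by or in combination with an instruction execution system, apparatus, or device. In the present disclosure, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, in which computer-readable program code is carried. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, and it can send, propagate, or transmit a program for use by or in combination with an instruction execution system, apparatus, or device. The program code contained on the computer-readable medium can be transmitted by any appropriate medium, including but not limited to: electric wire, optical cable, radio frequency (Radio Frequency, RF), or any suitable combination of the above.
In some implementations, the client and the server can communicate using any currently known or future-developed network protocol such as HTTP (HyperText Transfer Protocol), and can be interconnected with digital data communication in any form or medium (for example, a communication network). Examples of communication networks include a local area network (Local Area Network, LAN), a wide area network (Wide Area Network, WAN), an internetwork (for example, the Internet), and a peer-to-peer network (for example, an ad hoc peer-to-peer network), as well as any currently known or future-developed network.
The above computer-readable medium may be included in the above electronic device, or it may exist separately without being assembled into the electronic device.
The above computer-readable medium carries at least one program, and when the at least one program is executed by the electronic device, the electronic device is caused to: acquire video frames contained in a video to be processed, wherein the video to be processed contains a portrait; perform portrait segmentation on the video frames to obtain a portrait image and a background image; adjust the transparency of pixels in the portrait image that satisfy a set condition to obtain an adjusted portrait image; process the background image to obtain a complete background image; fuse the adjusted portrait image with the complete background image to obtain a portrait video frame; and splice a plurality of the portrait video frames to obtain a target video.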
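Computer program code for performing the operations of the present disclosure may be written in one or more programming languages or a combination thereof, including but not limited to object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may be executed entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functions, and operations of possible implementations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, program segment, or part of code, which contains at least one executable instruction for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, may be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units described in the embodiments of the present disclosure may be implemented in software or in hardware. The name of a unit does not, in some cases, constitute a limitation on the unit itself.
The functions described above herein may be performed at least in part by at least one hardware logic component. For example, and without limitation, exemplary types of hardware logic components that can be used include: field programmable gate arrays (Field Programmable Gate Array, FPGA), application specific integrated circuits (Application Specific Integrated Circuit, ASIC), application specific standard parts (Application Specific Standard Parts, ASSP), systems on chip (System on Chip, SOC), complex programmable logic devices (Complex Programmable Logic Device, CPLD), and so on.
In the context of the present disclosure, a machine-readable medium may be a tangible medium that can contain or store a program for use by or in combination with an instruction execution system, apparatus, or device. A machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of machine-readable storage media would include an electrical connection based on at least one wire, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.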
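According to at least one embodiment of the present disclosure, an embodiment of the present disclosure discloses a video generation method, including:
acquiring video frames contained in a video to be processed, wherein the video to be processed contains a portrait;
performing portrait segmentation on the video frames to obtain a portrait image and a background image;
adjusting the transparency of pixels in the portrait image that satisfy a set condition to obtain an adjusted portrait image;
processing the background image to obtain a complete background image;
fusing the adjusted portrait image with the complete background image to obtain a portrait video frame;
splicing a plurality of the portrait video frames to obtain a target video.
Optionally, performing portrait segmentation on each video frame to obtain a portrait image and a background image includes:
performing portrait recognition on each video frame to obtain a portrait mask image and a background mask image;
obtaining the portrait image according to the portrait mask image and the video frame;
obtaining the background image according to the background mask image and the video frame.
Optionally, adjusting the transparency of pixels in the portrait image that satisfy the set condition to obtain an adjusted portrait image includes:
copying the portrait image at least once to obtain at least one portrait copy;
rotating the at least one portrait copy by a set angle about a coordinate axis of three-dimensional space to obtain a rotated portrait copy, the portrait image and the at least one rotated portrait copy forming a portrait image group;
adjusting the transparency of the pixels in the portrait image group that satisfy the set condition to obtain an adjusted portrait image group;
fusing the adjusted portrait image with the complete background image to obtain a portrait video frame includes:
fusing the adjusted portrait image group with the complete background image to obtain a portrait video frame.
Optionally, after the adjusted portrait image group is obtained, the method further includes:
for each pixel in the adjusted portrait image group, determining a rotation percentage according to the distance from the pixel to the center point of the video frame;
determining a rotation parameter of the pixel according to the rotation percentage and a set rotation angle, wherein the set rotation angle is related to the timestamp of the video frame;
rotating the pixel based on the rotation parameter.
Optionally, determining the rotation parameter of the pixel according to the rotation percentage and the set rotation angle includes:
determining an intermediate rotation angle according to the rotation percentage and the set rotation angle;
taking the sine of the intermediate rotation angle as a first sub-rotation parameter and the cosine of the intermediate rotation angle as a second sub-rotation parameter;
performing a rotational offset on the pixel based on the rotation parameter includes:
determining the coordinate information of the pixel after rotation according to the first sub-rotation parameter and the second sub-rotation parameter.
Optionally, after the adjusted portrait image group is obtained, the method further includes:
scaling at least one portrait image in the adjusted portrait image group according to a set ratio to obtain a scaled portrait image group.
Optionally, processing the background image to obtain a complete background image includes:
acquiring the optical flow information of the background image of the first video frame of the video to be processed, or of the background image of the video frame preceding the video frame;
processing the optical flow information with a preset restoration algorithm to obtain the complete background image of the video frame.
Optionally, the set condition is that the distance from the pixel to the center point of the video frame is greater than or less than a set value, wherein the set value is related to the timestamp of the video frame.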
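According to at least one embodiment of the present disclosure, an embodiment of the present disclosure discloses a video generation apparatus, including:
a video frame acquisition module, configured to acquire video frames contained in a video to be processed, wherein the video to be processed contains a portrait;
a portrait segmentation module, configured to perform portrait segmentation on the video frames to obtain a portrait image and a background image;
a portrait image adjustment module, configured to adjust the transparency of pixels in the portrait image that satisfy a set condition to obtain an adjusted portrait image;
a complete background image acquisition module, configured to process the background image to obtain a complete background image;
a portrait video frame acquisition module, configured to fuse the adjusted portrait image with the complete background image to obtain a portrait video frame;
a target video acquisition module, configured to splice a plurality of the portrait video frames to obtain a target video.
According to at least one embodiment of the present disclosure, an embodiment of the present disclosure discloses an electronic device, including:
at least one processing device;
a storage device, configured to store at least one program;
when the at least one program is executed by the at least one processing device, the at least one processing device is caused to implement the video generation method according to any one of the embodiments of the present disclosure.
According to at least one embodiment of the present disclosure, an embodiment of the present disclosure discloses a computer-readable medium on which a computer program is stored, the computer program implementing the video generation method according to any one of the embodiments of the present disclosure when executed by a processing device.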

Claims (11)

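  1. A video generation method, comprising:
    acquiring video frames contained in a video to be processed, wherein the video to be processed contains a portrait;
    performing portrait segmentation on the video frames to obtain a portrait image and a background image;
    adjusting the transparency of pixels in the portrait image that satisfy a set condition to obtain an adjusted portrait image;
    processing the background image to obtain a complete background image;
    fusing the adjusted portrait image with the complete background image to obtain a portrait video frame;
    splicing a plurality of the portrait video frames to obtain a target video.
  2. The method according to claim 1, wherein performing portrait segmentation on the video frames to obtain a portrait image and a background image comprises:
    performing portrait recognition on each video frame to obtain a portrait mask image and a background mask image;
    obtaining the portrait image according to the portrait mask image and the video frame;
    obtaining the background image according to the background mask image and the video frame.
  3. The method according to claim 1, wherein adjusting the transparency of pixels in the portrait image that satisfy the set condition to obtain an adjusted portrait image comprises:
    copying the portrait image at least once to obtain at least one portrait copy;
    rotating the at least one portrait copy by a set angle about a coordinate axis of three-dimensional space to obtain a rotated portrait copy, the portrait image and the at least one rotated portrait copy forming a portrait image group;
    adjusting the transparency of the pixels in the portrait image group that satisfy the set condition to obtain an adjusted portrait image group;
    and fusing the adjusted portrait image with the complete background image to obtain a portrait video frame comprises:
    fusing the adjusted portrait image group with the complete background image to obtain a portrait video frame.
  4. The method according to claim 3, further comprising, after the adjusted portrait image group is obtained:
    for each pixel in the adjusted portrait image group, determining a rotation percentage according to the distance from the pixel to the center point of the video frame;
    determining a rotation parameter of the pixel according to the rotation percentage and a set rotation angle, wherein the set rotation angle is related to the moment at which the video frame occurs in the video to be processed;
    rotating the pixel based on the rotation parameter.
  5. The method according to claim 4, wherein determining the rotation parameter of the pixel according to the rotation percentage and the set rotation angle comprises:
    determining an intermediate rotation angle according to the rotation percentage and the set rotation angle;
    taking the sine of the intermediate rotation angle as a first sub-rotation parameter and the cosine of the intermediate rotation angle as a second sub-rotation parameter;
    and rotating the pixel based on the rotation parameter comprises:
    determining the coordinate information of the pixel after rotation according to the first sub-rotation parameter and the second sub-rotation parameter.
  6. The method according to claim 3, further comprising, after the adjusted portrait image group is obtained:
    scaling at least one portrait image in the adjusted portrait image group according to a set ratio to obtain a scaled portrait image group.
  7. The method according to claim 1, wherein processing the background image to obtain a complete background image comprises:
    acquiring the optical flow information of the background image of the first video frame of the video to be processed, or of the background image of the video frame preceding the video frame;
    processing the optical flow information with a preset restoration algorithm to obtain the complete background image of the video frame.
  8. The method according to claim 1, wherein the set condition is that the distance from the pixel to the center point of the video frame is greater than or less than a set value, and the set value is related to the moment at which the video frame occurs in the video to be processed.
  9. A video generation apparatus, comprising:
    a video frame acquisition module, configured to acquire video frames contained in a video to be processed, wherein the video to be processed contains a portrait;
    a portrait segmentation module, configured to perform portrait segmentation on the video frames to obtain a portrait image and a background image;
    a portrait image adjustment module, configured to adjust the transparency of pixels in the portrait image that satisfy a set condition to obtain an adjusted portrait image;
    a complete background image acquisition module, configured to process the background image to obtain a complete background image;
    a portrait video frame acquisition module, configured to fuse the adjusted portrait image with the complete background image to obtain a portrait video frame;
    a target video acquisition module, configured to splice a plurality of the portrait video frames to obtain a target video.
  10. An electronic device, comprising:
    at least one processing device;
    a storage device, configured to store at least one program;
    when the at least one program is executed by the at least one processing device, the at least one processing device is caused to implement the video generation method according to any one of claims 1-8.
  11. A computer-readable medium on which a computer program is stored, the computer program implementing the video generation method according to any one of claims 1-8 when executed by a processing device.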
PCT/CN2022/134957 2021-11-30 2022-11-29 Video generation method, apparatus, device, and storage medium WO2023098649A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111444204.0 2021-11-30
CN202111444204.0A CN114040129B (zh) 2021-11-30 2021-11-30 Video generation method, apparatus, device, and storage medium

Publications (1)

Publication Number Publication Date
WO2023098649A1 true WO2023098649A1 (zh) 2023-06-08

Family

ID=80139620

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/134957 WO2023098649A1 (zh) 2021-11-30 2022-11-29 Video generation method, apparatus, device, and storage medium

Country Status (2)

Country Link
CN (1) CN114040129B (zh)
WO (1) WO2023098649A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114040129B (zh) * 2021-11-30 2023-12-05 北京字节跳动网络技术有限公司 Video generation method, apparatus, device, and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
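WO2020179052A1 (ja) * 2019-03-07 2020-09-10 日本電気株式会社 Image processing device, control method, and program
CN111695105A (zh) * 2020-05-29 2020-09-22 北京字节跳动网络技术有限公司 Verification method, apparatus, and electronic device
CN111815649A (zh) * 2020-06-30 2020-10-23 清华大学深圳国际研究生院 Portrait matting method and computer-readable storage medium
CN112308866A (zh) * 2020-11-04 2021-02-02 Oppo广东移动通信有限公司 Image processing method, apparatus, electronic device, and storage medium
CN113132795A (zh) * 2019-12-30 2021-07-16 北京字节跳动网络技术有限公司 Image processing method and apparatus
CN114040129A (zh) * 2021-11-30 2022-02-11 北京字节跳动网络技术有限公司 Video generation method, apparatus, device, and storage medium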

Family Cites Families (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018002533A1 (fr) * 2016-06-30 2018-01-04 Fittingbox Method for concealing an object in an image or a video and associated augmented reality method
CN106851385B (zh) * 2017-02-20 2019-12-27 北京乐我无限科技有限责任公司 Video recording method, apparatus, and electronic device
AU2017272325A1 (en) * 2017-12-08 2019-06-27 Canon Kabushiki Kaisha System and method of generating a composite frame
CN110189246B (zh) * 2019-05-15 2023-02-28 北京字节跳动网络技术有限公司 Stylized image generation method, apparatus, and electronic device
CN110290425B (zh) * 2019-07-29 2023-04-07 腾讯科技(深圳)有限公司 Video processing method, apparatus, and storage medium
CN111145192B (zh) * 2019-12-30 2023-07-28 维沃移动通信有限公司 Image processing method and electronic device
CN111292337B (zh) * 2020-01-21 2024-03-01 广州虎牙科技有限公司 Image background replacement method, apparatus, device, and storage medium
CN111263071B (zh) * 2020-02-26 2021-12-10 维沃移动通信有限公司 Shooting method and electronic device
KR20210120599A (ko) * 2020-03-27 2021-10-07 라인플러스 주식회사 Method and system for providing avatar service
CN111464761A (zh) * 2020-04-07 2020-07-28 北京字节跳动网络技术有限公司 Video processing method, apparatus, electronic device, and computer-readable storage medium
CN111556278B (zh) * 2020-05-21 2022-02-01 腾讯科技(深圳)有限公司 Video processing method, video display method, apparatus, and storage medium
CN111586319B (zh) * 2020-05-27 2024-04-09 北京百度网讯科技有限公司 Video processing method and apparatus
CN112351291A (zh) * 2020-09-30 2021-02-09 深圳点猫科技有限公司 Teaching interaction method, apparatus, and device based on AI portrait segmentation
CN112637517B (zh) * 2020-11-16 2022-10-28 北京字节跳动网络技术有限公司 Video processing method, apparatus, electronic device, and storage medium
CN113014830A (zh) * 2021-03-01 2021-06-22 鹏城实验室 Video blurring method, apparatus, device, and storage medium
CN113362365A (zh) * 2021-06-17 2021-09-07 云从科技集团股份有限公司 Video processing method, system, apparatus, and medium

Also Published As

Publication number Publication date
CN114040129A (zh) 2022-02-11
CN114040129B (zh) 2023-12-05

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22900453

Country of ref document: EP

Kind code of ref document: A1