CN117710614A - Image processing method, device, equipment and medium - Google Patents

Image processing method, device, equipment and medium

Info

Publication number
CN117710614A
Authority
CN
China
Prior art keywords
image
sampling
images
target
virtual
Prior art date
Legal status
Pending
Application number
CN202211097123.2A
Other languages
Chinese (zh)
Inventor
瞿镇一
Current Assignee
Beijing Zitiao Network Technology Co Ltd
Original Assignee
Beijing Zitiao Network Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Zitiao Network Technology Co Ltd filed Critical Beijing Zitiao Network Technology Co Ltd
Priority to CN202211097123.2A priority Critical patent/CN117710614A/en
Publication of CN117710614A publication Critical patent/CN117710614A/en
Pending legal-status Critical Current


Landscapes

  • Processing Or Creating Images (AREA)

Abstract

Embodiments of the present disclosure relate to an image processing method, apparatus, device, and medium. The method includes: acquiring a plurality of target sampling images corresponding to a virtual three-dimensional object and depth information of each target sampling image, wherein each target sampling image contains local surface information of the virtual three-dimensional object and the depth information differs between target sampling images; superimposing the plurality of target sampling images according to the depth information of each target sampling image to obtain a stacked image; and displaying the stacked image. According to the embodiments of the present disclosure, a stereoscopic effect can be presented to the user using images alone, which greatly saves the computing and storage resources otherwise required to present such an effect.

Description

Image processing method, device, equipment and medium
Technical Field
The present disclosure relates to the field of image processing technologies, and in particular, to an image processing method, apparatus, device, and medium.
Background
In scenes that present a virtual 3D model, such as virtual reality, a user may photograph the virtual 3D model in order to keep the captured image for later viewing or to share it with other users. However, the inventor has found that the captured image is still a two-dimensional image, which feels incongruous to the user in a virtual scene and leads to a poor viewing experience. If the user is to perceive a stereoscopic effect when viewing the image, the virtual 3D model generally has to be restored from the image, that is, the photographed virtual 3D model must be re-rendered, which consumes considerable computing and storage resources.
Disclosure of Invention
In order to solve the above technical problems or at least partially solve the above technical problems, the present disclosure provides an image processing method, apparatus, device, and medium.
The embodiment of the disclosure provides an image processing method, which comprises the following steps: acquiring a plurality of target sampling images corresponding to a virtual three-dimensional object, and depth information of each target sampling image; wherein each target sampling image contains local surface information of the virtual three-dimensional object, and depth information of different target sampling images is different; according to the depth information of each target sampling image, carrying out superposition processing on the plurality of target sampling images to obtain a laminated image; and displaying the laminated image.
The embodiment of the disclosure also provides an image processing apparatus, including: the image acquisition module is used for acquiring a plurality of target sampling images corresponding to the virtual three-dimensional object and depth information of each target sampling image; wherein each target sampling image contains local surface information of the virtual three-dimensional object, and depth information of different target sampling images is different; the image superposition module is used for superposing the plurality of target sampling images according to the depth information of each target sampling image to obtain a laminated image; and the layer display module is used for displaying the laminated image.
The embodiment of the disclosure also provides an electronic device, which comprises: a processor; a memory for storing the processor-executable instructions; the processor is configured to read the executable instructions from the memory and execute the instructions to implement an image processing method as provided in an embodiment of the disclosure.
The present disclosure also provides a computer-readable storage medium storing a computer program for executing the image processing method as provided by the embodiments of the present disclosure.
According to the technical solutions provided by the embodiments of the present disclosure, a plurality of target sampling images containing local surface information of the virtual three-dimensional object, together with the corresponding depth information, can be obtained directly; the target sampling images are then superimposed based on the depth information, and the resulting stacked image displays the surface information of the virtual three-dimensional object according to depth, thereby presenting a stereoscopic effect to the user. In this way, no model rendering of the virtual three-dimensional object is required: the stereoscopic effect can be presented to the user using images alone, which greatly saves the computing and storage resources otherwise required to present such an effect.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the disclosure, nor is it intended to be used to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following specification.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the disclosure and together with the description, serve to explain the principles of the disclosure.
In order to more clearly illustrate the embodiments of the present disclosure or the solutions in the prior art, the drawings that are required for the description of the embodiments or the prior art will be briefly described below, and it will be obvious to those skilled in the art that other drawings can be obtained from these drawings without inventive effort.
Fig. 1 is a schematic flow chart of an image processing method according to an embodiment of the disclosure;
FIG. 2 is a schematic diagram of a sampling provided by an embodiment of the present disclosure;
FIG. 3 is a schematic diagram of a sample hierarchy provided by an embodiment of the present disclosure;
FIG. 4 is a schematic diagram of a sampled image provided by an embodiment of the present disclosure;
FIG. 5 is a schematic diagram of a matrix arrangement according to an embodiment of the disclosure;
FIG. 6 is a schematic illustration of a stacked image according to an embodiment of the disclosure;
FIG. 7 is a schematic diagram of eye viewing provided by an embodiment of the present disclosure;
FIG. 8 is a schematic diagram of a stitching process according to an embodiment of the present disclosure;
fig. 9 is a schematic structural diagram of an image processing apparatus according to an embodiment of the present disclosure;
fig. 10 is a schematic structural diagram of an electronic device according to an embodiment of the disclosure.
Detailed Description
In order that the above objects, features and advantages of the present disclosure may be more clearly understood, a further description of aspects of the present disclosure will be provided below. It should be noted that, without conflict, the embodiments of the present disclosure and features in the embodiments may be combined with each other.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present disclosure, but the present disclosure may be practiced otherwise than as described herein; it will be apparent that the embodiments in the specification are only some, but not all, embodiments of the disclosure.
The inventor has found that the multimedia content users share when socializing in virtual scenes such as virtual reality is generally image information without stereoscopic depth, which conflicts with the visual instinct with which users observe the world inside a virtual scene. For example, in some application scenarios, user A captures an image X of a 3D object in a virtual three-dimensional scene and shares image X with user B in that scene so that user B can view it. If image X is displayed to user B directly in 2D form, that is, user B views a flat two-dimensional picture, user B cannot perceive any depth of field; to user B, image X is merely a 2D image and feels incongruous within the virtual three-dimensional scene. If the information of image X is instead displayed to user B by rendering the photographed 3D object, the 3D object must be transmitted, loaded, and rendered, which requires large files, places a heavy load on device performance, is inconvenient for sharing, and whose shading and rendering affect how faithfully the 3D object is reproduced. In addition, if the information of image X is displayed to user B in side-by-side 3D form (separate left-eye and right-eye images), user B's eyes cannot focus freely, the naked-eye effect is limited to a narrow range of viewing angles, and the pseudo-stereoscopic effect produced by the virtual focal point of the side-by-side format easily interferes with focusing on other objects or interfaces in the virtual scene.
In order to improve at least one of the above problems, embodiments of the present disclosure provide an image processing method, apparatus, device, and medium, which are described in detail below. It should be noted that the above is merely an exemplary illustration and should not be construed as an application scenario limitation. In practical applications, any scene that needs to present a stereoscopic effect to a user in an image manner for a virtual three-dimensional object may be suitable for the image processing method of the present disclosure.
Fig. 1 is a flowchart of an image processing method according to an embodiment of the present disclosure, where the method may be performed by an image processing apparatus, and the apparatus may be implemented by software and/or hardware, and may be generally integrated in an electronic device. As shown in fig. 1, the method mainly includes the following steps S102 to S106:
step S102, a plurality of target sampling images corresponding to the virtual three-dimensional object and depth information of each target sampling image are obtained; each target sampling image contains local surface information of a virtual three-dimensional object, and depth information of different target sampling images is different.
The virtual three-dimensional object may be a 3D model in a virtual scene, such as one or more of an item, a building, or a person in the virtual scene, or the entire virtual scene, which is not limited here. In practical applications, the plurality of target sampling images corresponding to the virtual three-dimensional object can be acquired based on a sampling position and a sampling posture (together also referred to as a sampling pose). It will be appreciated that different sampling poses yield different sets of target sampling images. In some implementation examples, user A views the virtual three-dimensional object in a virtual three-dimensional scene and the pose of user A serves as the sampling pose; in other examples, user A photographs the virtual three-dimensional object with a virtual prop (a virtual light-field camera), and the pose of the virtual prop serves as the sampling pose.
In the embodiments of the present disclosure, each of the acquired target sampling images carries local surface information of the virtual three-dimensional object. In some implementation examples, the local surface information of different target sampling images is at least partially different; it may be entirely different or partially the same. In some embodiments, the depth of field of a target sampling image may be determined directly from the distance between the sampling position and the surface of the virtual three-dimensional object corresponding to that image. When this distance is unique (for example, different positions on the surface are all the same distance from the sampling position), the unique distance can be taken as the depth information of the target sampling image. When there are multiple distance values (different positions on the surface are at different distances from the sampling position), in some embodiments a target distance may be determined from the multiple values, for example by taking their weighted average, by averaging the maximum and minimum values, or by taking their mode or median, and the target distance is then used as the depth information of the target sampling image. In other embodiments, each of the multiple distance values may be used as depth information, or several of the maximum, minimum, weighted average, mode, or median may be used together. That the depth information of different target sampling images differs can be understood as the surfaces of the virtual three-dimensional object corresponding to different target sampling images being at different distances from the sampling position.
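As an illustration of the depth-aggregation strategies described above, the following sketch reduces a set of surface-to-sampling-position distances to one depth value for a target sampling image. The helper name, the strategy labels, and the use of NumPy are assumptions introduced for this sketch, not taken from the patent.

```python
import numpy as np

def aggregate_depth(distances, strategy="mean", weights=None):
    """Reduce the distances between surface points and the sampling
    position to one depth value for a target sampling image."""
    d = np.asarray(distances, dtype=float)
    if d.size == 1 or np.allclose(d, d[0]):
        return float(d[0])                      # unique distance: use it directly
    if strategy == "mean":                      # (weighted) average of all distances
        return float(np.average(d, weights=weights))
    if strategy == "minmax":                    # average of the max and min distance
        return float((d.max() + d.min()) / 2)
    if strategy == "median":                    # median of the distances
        return float(np.median(d))
    raise ValueError(f"unknown strategy: {strategy}")
```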
Step S104, superimposing the plurality of target sampling images according to the depth information of each target sampling image to obtain a stacked image.
In some implementations, the plurality of target sampling images may be sorted according to the depth information of each target sampling image and then superimposed based on the sorting result. For example, the target sampling images can be ordered from near to far: the target sampling image corresponding to the surface of the virtual three-dimensional object closest to the sampling position serves as the first (foreground) image of the stacked image, and the target sampling image corresponding to the surface farthest from the sampling position serves as the last (background) image.
In some embodiments, the regions of a target sampling image that do not contain local surface information are hollowed-out or transparent regions. Therefore, when the plurality of target sampling images are superimposed, the hollowed-out regions of each target sampling image let the local surface information on the other target sampling images show through, and the resulting stacked image as a whole presents the union of the local surface information of all target sampling images.
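A minimal sketch of this superposition step, assuming each target sampling image is an RGBA array whose empty regions have zero alpha; the data layout and the front-to-back compositing are assumptions chosen for illustration, and a real implementation would keep the layers separate for display.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class TargetSample:
    rgba: np.ndarray   # H x W x 4, alpha == 0 where there is no surface information
    depth: float       # depth information of this target sampling image

def stack_images(samples):
    """Order target sampling images from near to far and composite them
    front-to-back, letting transparent regions show deeper layers through."""
    ordered = sorted(samples, key=lambda s: s.depth)           # near first
    h, w, _ = ordered[0].rgba.shape
    out = np.zeros((h, w, 4), dtype=float)
    for s in ordered:                                          # front-to-back compositing
        src = s.rgba.astype(float) / 255.0
        vacant = 1.0 - out[..., 3:4]                           # where nearer layers are empty
        out[..., :3] += src[..., :3] * src[..., 3:4] * vacant
        out[..., 3:4] += src[..., 3:4] * vacant
    return (out * 255).astype(np.uint8), ordered               # composite preview + layer order
```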
Step S106, displaying the stacked image.
In practical applications, the stacked image can be displayed according to the viewing pose. Because each target sampling image in the stacked image contains local surface information of the virtual three-dimensional object and the depth information differs between target sampling images, the stacked image as a whole exhibits a stereoscopic effect with depth of field. Moreover, the display mode of the stacked image can be determined from the viewing pose, that is, different viewing poses produce different display effects. For example, the core sampling image the user is looking at in the stacked image can be determined from the user's line of sight; this core sampling image can be understood as the in-focus image, and the remaining target sampling images can be blurred to different degrees to create an eye-focusing effect. When the stacked image is actually displayed, the viewer's eyes can focus freely, achieving a naked-eye 3D effect. It should be noted that although the stacked image obtained in this way resembles a light-field image in a real scene and has a similar effect, the way it is acquired and displayed is entirely different from capturing a light-field image with a light-field camera in a real scene (imaging with a microlens array). The approach provided by the embodiments of the present disclosure can be realized directly in digital space, unconstrained by physical conditions, at the level of acquiring and displaying the stacked image.
According to the technical solutions provided by the embodiments of the present disclosure, a plurality of target sampling images containing local surface information of the virtual three-dimensional object, together with the corresponding depth information, can be obtained directly; the target sampling images are then superimposed based on the depth information, and the resulting stacked image displays the surface information of the virtual three-dimensional object according to depth, thereby presenting a stereoscopic effect to the user. In this way, no model rendering of the virtual three-dimensional object is required: the stereoscopic effect can be presented to the user using images alone, which greatly saves the computing and storage resources otherwise required to present such an effect.
For ease of understanding, reference may be made to the sampling schematic shown in fig. 2, which illustrates sampling a virtual three-dimensional object in a VR virtual 3D scene; the scene contains a near view, a distant view, and a 3D object, and the entire virtual 3D scene can be regarded as the virtual three-dimensional object. It can be appreciated that different positions and poses of the virtual prop (the virtual light-field camera) produce different sampling results. The virtual light-field camera photographs the virtual three-dimensional object in the virtual three-dimensional scene to obtain its surface information; this photographing process can be understood as sampling the surface information of the virtual three-dimensional object, and the position and posture of the virtual light-field camera serve as sampling parameters. The position and posture of the virtual light-field camera can be characterized by 6DOF (six degrees of freedom) parameters, such as the three-dimensional coordinate position together with yaw, pitch, and roll angles, and the sampling parameters may further include the size of the frame used when the virtual light-field camera acquires images. The user can photograph the virtual three-dimensional object with the virtual light-field camera to obtain the plurality of target sampling images. Fig. 2 is merely an example; in practical applications the user may not need a virtual light-field camera at all, and the user's own position and posture may directly serve as the sampling parameters, which is not limited here.
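By way of illustration only, the 6DOF sampling parameters described above could be grouped as in the following sketch; the type and field names are assumptions introduced here and are not defined by the patent.

```python
from dataclasses import dataclass

@dataclass
class SamplingParams:
    position: tuple[float, float, float]  # 3D coordinate of the sampling position
    yaw: float                            # rotation about the vertical axis, in degrees
    pitch: float                          # rotation about the lateral axis, in degrees
    roll: float                           # rotation about the viewing axis, in degrees
    frame_width: float = 1.0              # size of the virtual camera's frame
    frame_height: float = 1.0
```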
The implementation of step S102 is described further below; for example, it may be implemented with reference to the following step one and step two:
step one, determining sampling parameters; the sampling parameters include sampling position and sampling posture. As described above, the position and posture of the virtual camera may be used as the sampling position and sampling posture, or the position and posture of the user may be used as the sampling position and sampling posture, and may be flexibly set according to the needs, which is not limited herein.
Step two, acquiring, based on the sampling parameters, the plurality of target sampling images corresponding to the virtual three-dimensional object and the depth information of each target sampling image. In practical applications, the embodiments of the present disclosure do not limit the specific number of target sampling images, which can be set flexibly as required; in general, the larger the number of target sampling images, the better the stereoscopic effect they can ultimately present. Based on the sampling position and sampling posture, the specific surface information of the virtual three-dimensional object contained in each target sampling image can be obtained; based on the sampling position, the depth information of each target sampling image can be obtained. In this way, the plurality of target sampling images and the depth information of each target sampling image can be obtained objectively and accurately.
In some specific implementation examples, the second step may be implemented with reference to the following steps a and b:
and a, slicing the virtual three-dimensional object based on the sampling parameters to obtain a plurality of parallel tangential planes. Slice processing may also be understood as sample layering, where for ease of understanding, a VR virtual 3D scene may be sample layered (or sliced) with reference to a sample layering schematic shown in fig. 3, resulting in multiple parallel slices. It should be noted that, in practical application, the slicing process may not perform a real slicing operation on a virtual three-dimensional object, but only layer the virtual three-dimensional object, so as to achieve a slicing-like effect in a layered manner.
In some implementations, the virtual three-dimensional object may be sliced at specified intervals; the specified intervals include equal intervals or unequal intervals. The interval may also be understood as a depth of field interval or as the distance between two slices in a virtual scene. In the case where the specified interval is an unequal interval, the interval distance between two adjacent tangential planes close to the sampling position is smaller than the interval distance between two adjacent tangential planes far from the sampling position. In other words, the sampling layering may be performed in the order from near to far at the sampling view angle (such as the view angle of the mirror frame), and the sampling layer (the tangential plane) may be distributed in a near-dense and far-sparse manner, that is, the near-dense and far-sparse slicing frequency is higher than the far-distant slicing frequency, and the slicing manner adopted in fig. 3 is the near-dense and far-sparse slicing manner. The method not only accords with the visual viewing habit of the user, but also can effectively reduce the processing cost by reducing the sampling frequency at the distant view on the basis of presenting a certain stereoscopic impression for the user.
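A sketch of one possible near-dense, far-sparse layering follows, assuming the spacing between consecutive slices grows geometrically; the growth-factor scheme is an assumption, since the patent only requires that near slices be spaced more closely than far ones.

```python
def slice_depths(near, far, n_slices, growth=1.3):
    """Return slice depths from `near` to `far` whose spacing grows by
    `growth` at each step, so slices are dense near the sampling position
    and sparse far from it."""
    steps = [growth ** i for i in range(n_slices - 1)]
    total = sum(steps)
    depths, d = [near], near
    for s in steps:
        d += (far - near) * s / total
        depths.append(d)
    return depths

# Example: 8 slices between depth 1.0 and depth 50.0
print(slice_depths(1.0, 50.0, 8))
```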
Step b, acquiring, based on the plurality of parallel slices, the plurality of target sampling images corresponding to the virtual three-dimensional object and the depth information of each target sampling image. In practical applications, the portion of the virtual three-dimensional object sandwiched between two slices may be regarded as a sampling interval, and one sampling interval corresponds to one target sampling image. In some implementation examples, the plurality of target sampling images corresponding to the virtual three-dimensional object may be obtained with reference to the following steps 1 to 3:
step 1, obtaining a plurality of section groups based on a plurality of parallel sections; wherein the section group comprises two sections.
In some specific implementations, the two slices in a slice group are separated by a specified number of slices among the plurality of parallel slices. The embodiments of the present disclosure do not limit this specified number, which may be 0, 1, 2, 3, or the like. In some embodiments, the specified number may be 0, that is, two adjacent slices form one slice group; for example, with 10 parallel slices numbered 1-10 from near to far, slices 1 and 2 form one group, slices 2 and 3 form one group, slices 3 and 4 form one group, and so on. In some implementations, the specified number may be 1, i.e., slice 1 and slice 3 form one group, slice 2 and slice 4 form one group, slice 3 and slice 5 form one group, and so on. In some implementations, the specified number may be 2, i.e., slice 1 and slice 4 form one group, slice 2 and slice 5 form one group, slice 3 and slice 6 form one group, and so on. In practical applications, a plurality of interleaved slice groups with the desired spacing can be obtained as required.
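The slice-group construction above can be sketched as follows; this is a straightforward interpretation, and the indices here are 0-based rather than the 1-based numbering used in the text.

```python
def make_slice_groups(n_slices, spacing=0):
    """Pair each slice with the slice `spacing + 1` positions further away.
    spacing=0 pairs adjacent slices (0,1), (1,2), ...; spacing=1 gives
    (0,2), (1,3), ...; larger spacing increases the overlap between the
    sampled images of neighbouring groups."""
    step = spacing + 1
    return [(i, i + step) for i in range(n_slices - step)]

print(make_slice_groups(10, spacing=0))  # adjacent slices
print(make_slice_groups(10, spacing=1))  # one slice apart
```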
Step 2, for each slice group, projecting the surface portion of the virtual three-dimensional object sandwiched between the two slices of the group to obtain the sampling image corresponding to that slice group.
For example, the surface portion of the virtual three-dimensional object sandwiched between the two slices of a slice group is projected onto a 2D planar image, yielding the sampling image corresponding to that slice group. For ease of understanding, refer to the sampled-image schematic shown in fig. 4, which illustrates a plurality of sampling images from near to far, each containing the projection of a local surface portion of the virtual three-dimensional object. Note that different grouping schemes yield different slice groups and hence different sampling images. For example, if slice 1 and slice 2 form one group and slice 2 and slice 3 form another, the sampling images obtained in this way contain no overlapping surface portions, and the local surface information of the virtual three-dimensional object in each sampling image is different. If instead slice 1 and slice 3 form one group and slice 2 and slice 4 form another, the resulting sampling images partially overlap: sampling image 1 (from slices 1 and 3) and sampling image 2 (from slices 2 and 4) share the local surface portion of the virtual three-dimensional object sandwiched between slices 2 and 3. Likewise, if slice 1 and slice 4, slice 2 and slice 5, and slice 3 and slice 6 each form a group, the sampling images also partially overlap; for example, sampling image 1 (slices 1 and 4) and sampling image 2 (slices 2 and 5) share the surface portions sandwiched between slices 2 and 3 and between slices 3 and 4.
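Conceptually, the projection for one slice group keeps only the pixels whose depth falls between the two slices. A minimal sketch, assuming a rendered RGB image and a per-pixel depth map are available for the chosen sampling pose; these inputs are assumptions, since the patent does not prescribe how the projection is computed.

```python
import numpy as np

def project_slice_group(rgb, depth_map, near_depth, far_depth):
    """Build the sampling image for one slice group: copy RGB pixels whose
    depth lies in [near_depth, far_depth) and leave the rest transparent."""
    h, w, _ = rgb.shape
    sample = np.zeros((h, w, 4), dtype=rgb.dtype)          # RGBA, fully transparent
    mask = (depth_map >= near_depth) & (depth_map < far_depth)
    sample[mask, :3] = rgb[mask]
    sample[mask, 3] = 255                                   # opaque where surface exists
    return sample
```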
It will be appreciated that the greater the spacing between the two slices of a slice group, the more local surface information is shared between the sampling images of adjacent slice groups, i.e., the larger the overlapping portions. Providing adjacent sampling images with overlapping local surface information in this way is also an optimization for obtaining the target sampling images, and it effectively avoids restricting the viewing angle. Specifically, because the stacked image finally displayed to the user is formed by stacking multiple target sampling images, when the user's viewing position is far off to one side the viewing angle becomes large; under such large-angle observation, gaps can appear at the seams between front and rear layers, and surface patterns or lines may visibly fail to line up, exposing the layered construction. Without overlapping local surface information between target sampling images, the tolerance for such errors is low, the user is more likely to see hollow gaps between adjacent target sampling images, the viewing experience suffers, and the viewing angle has to be restricted. With overlapping information between target sampling images, the error tolerance increases and the probability that the user perceives hollow gaps between target sampling images at larger viewing angles is effectively reduced, so the permissible viewing angle can be widened appropriately and the viewing experience is effectively improved.
Step 3, determining a plurality of target sampling images from the sampling images corresponding to all slice groups. In some specific implementation examples, the target sampling images may be determined with reference to any one of the following modes one to three:
mode one: and taking the sampling images corresponding to all the section groups as target sampling images. That is, each sampling image is taken as a target sampling image, and the target sampling image can comprehensively present the surface information of the virtual three-dimensional object.
Mode two: taking, among the sampling images corresponding to all slice groups, those that meet a preset condition as the target sampling images; that is, selecting the desired sampling images from all sampling images. Illustratively, the preset condition may be that the similarity between sampling images is not higher than a preset similarity threshold, and/or that the number of projected pixels of a sampling image is not lower than a preset pixel count. For example, a virtual three-dimensional scene may contain trees and houses in the foreground while the distant view contains no objects, only open ground; the multiple sampling images obtained for such a distant view may be highly similar to one another, or contain little or no local surface information, so some of the distant-view sampling images can be discarded rather than used as target sampling images. Selecting qualified target sampling images through the similarity threshold and the preset pixel count reduces the subsequent processing cost of the target sampling images, such as storage, transmission, and display costs, while ensuring that the retained images contain more useful information.
Mode three: extracting target sampling images from the sampling images corresponding to all slice groups at a specified interval. The specified interval may be a depth-of-field interval or a distance interval between sampling images. Similarly, extracting target sampling images at specified intervals effectively reduces their number and thereby the subsequent processing cost.
In practical applications, any one of modes one to three can be selected flexibly as required, which is not limited here; a sketch of the mode-two filtering is given below.
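A rough sketch of the mode-two filtering, assuming the sampling images are RGBA arrays whose alpha channel marks projected pixels; the similarity measure used here (intersection-over-union of occupied pixels) is an assumption chosen for simplicity, not the patent's definition.

```python
import numpy as np

def select_target_samples(samples, sim_threshold=0.9, min_pixels=100):
    """Keep sampling images that carry enough projected pixels and are not
    too similar to an already selected image (mode two)."""
    selected = []
    for img in samples:                                  # samples ordered near to far
        occupied = img[..., 3] > 0                       # pixels carrying surface info
        if occupied.sum() < min_pixels:
            continue                                     # too little local surface information
        too_similar = False
        for kept in selected:
            kept_occ = kept[..., 3] > 0
            union = np.logical_or(occupied, kept_occ).sum()
            iou = np.logical_and(occupied, kept_occ).sum() / max(union, 1)
            if iou > sim_threshold:
                too_similar = True                       # nearly the same footprint
                break
        if not too_similar:
            selected.append(img)
    return selected
```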
In some implementations, obtaining the depth information of each target sampling image includes: for each target sampling image, acquiring the distance between the sampling position and the surface portion of the virtual three-dimensional object corresponding to that image, and determining the depth information of the image from this distance. Illustratively, different points on the surface portion sandwiched between slice 1 and slice 2 lie at different distances from the sampling position, so the average of these distances may be used as the depth of field of the target sampling image corresponding to slices 1 and 2.
In practical applications, after the plurality of target sampling images are obtained, they may be arranged in matrix form and the arranged images stored. This facilitates both storage and transmission. For ease of understanding, refer to the matrix-arrangement schematic shown in fig. 5, which illustrates arranging the target sampling images in a matrix; in fig. 5 the gray portions of each target sampling image are local surface information of the virtual three-dimensional object, and the white portions are hollowed out or transparent. This further reduces the storage and transmission resources required for the target sampling images.
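For illustration, packing the target sampling images into one matrix-arranged atlas might look like the following; the near-square grid layout and NumPy tiling are assumptions, since the patent only specifies a matrix arrangement.

```python
import math
import numpy as np

def pack_atlas(samples):
    """Tile equally sized RGBA target sampling images into a near-square
    grid so they can be stored or transmitted as a single image."""
    n = len(samples)
    cols = math.ceil(math.sqrt(n))
    rows = math.ceil(n / cols)
    h, w, c = samples[0].shape
    atlas = np.zeros((rows * h, cols * w, c), dtype=samples[0].dtype)
    for i, img in enumerate(samples):
        r, col = divmod(i, cols)
        atlas[r * h:(r + 1) * h, col * w:(col + 1) * w] = img
    return atlas
```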
After the plurality of target sampling images have been acquired, a stereoscopic effect of the virtual three-dimensional object can be viewed on their basis. For example, after user A photographs the virtual three-dimensional object with the virtual light-field camera and obtains the target sampling images, user A or another user B can play back the image of the virtual three-dimensional object with a virtual prop such as a picture player. At that point the target sampling images are stacked together again according to their depth information; the stacking effect can likewise be seen in fig. 4. After the stacked image is obtained, in order to further improve the viewing effect and give the user a more realistic sense of depth when viewing the stacked image, the display mode of the stacked image can additionally be determined from the user's actual viewing position and viewing posture. For ease of understanding, the implementation of step S106 is described further below.
In some embodiments, presenting the stacked image includes: setting a mask on the surface of the stacked image and displaying the masked stacked image, so that the user views the stacked image through the mask. In some embodiments the mask may be a virtual photo frame. The mask is used to block the portion of the stacked image that overflows beyond the frame (i.e., outside the mask); it is also used to calibrate the initial depth of field of the stereoscopic layers, and/or to limit the range of eye viewing angles at which the user views the stereoscopic layers, so as to prevent the hollowed-out portions of the stacked image from being seen through when the user's viewing angle is too large.
In some specific implementation examples of displaying the masked stacked image, the masked stacked image may be displayed in a preset virtual three-dimensional scene, with the mask transparent relative to that scene. For example, suppose the preset virtual three-dimensional scene used for display is scene B, and the stacked image was captured (sampled) by the user from scene A; the user then sees, inside the mask, scene A formed by the stacked image with a stereoscopic effect, and outside the mask, scene B in which the stacked image is displayed. Furthermore, the mask may have a frame that visually separates the stereoscopic scene realized by the stacked image from the current virtual three-dimensional scene, avoiding the illusion caused by mismatched distances. For ease of understanding, refer to the stacked-image schematic shown in fig. 6, where the black border is the mask frame: the user views the stacked image through the mask, while outside the mask the current scene used to display the stacked image remains visible. In some embodiments of acquiring the target sampling images, the plurality of target sampling images corresponding to the virtual three-dimensional object may be acquired according to both the sampling parameters (sampling position and sampling posture) and a preset mask frame; that is, the presentation of the target sampling images relates not only to the sampling parameters but also to the mask frame. For ease of understanding, the mask frame can be regarded as the frame of the aforementioned virtual light-field camera. Accordingly, the presentation of the stacked image obtained by superimposing the target sampling images also relates to the mask frame: with the sampling position and posture fixed, different mask frame sizes yield different surface information of the virtual three-dimensional object in the resulting stacked image; illustratively, the larger the mask frame, the more surface information of the virtual three-dimensional object the user sees in the stacked image through the mask. In practical applications, when the stacked image is to be displayed, a corresponding mask can be generated according to the preset mask frame, and the stacked image is displayed through it.
Referring to the eye-viewing schematic shown in fig. 7, the user's eyes can focus freely, that is, a naked-eye 3D effect is achieved. When the slicing frequency (or sampling frequency) is high, the number of target sampling images in the stacked image is large, and the naked-eye 3D effect gradually shifts from a layered impression to a volumetric, solid impression.
In some embodiments, the display of the stacked image may be achieved with reference to the following steps A and B:
step A, obtaining viewing parameters of a user; wherein the viewing parameters include a viewing position and a viewing angle.
Considering that different viewing positions and viewing angles produce different picture effects for the stacked image, the stacked image can be displayed based on the user's viewing parameters. In some implementations, the viewing parameters can be obtained from the position and posture of the user's VR headset. In a virtual-reality scene, the user usually wears a VR (virtual reality) headset to view and interact with the virtual scene, and the display mode of the stacked image can be determined from the position and posture of the VR headset together with the position and posture of the stacked image in the virtual-reality scene. In other implementations, the user's eye position and eye-movement information may be determined by a designated sensor, and the viewing parameters determined from them. The designated sensor may be configured on an electronic device with a display screen; the embodiments of the present disclosure do not limit the type of electronic device, which may be a mobile phone, a computer, or the like. The electronic device can display the stacked image on the display screen, determine the user's eye position and eye-movement information through the designated sensor (such as a camera), and then determine the relative position and relative viewing angle between the eyes and the display screen from that information together with the position and posture of the display screen, so as to determine the display mode of the stacked image accordingly. The above are merely examples; any manner of determining the viewing parameters may be used and is not limited here.
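As an illustration, deriving the relative viewing position and direction of the stacked image from the headset pose could be sketched as follows; the 4x4 pose-matrix convention and the "camera looks down -Z" assumption are choices made for this sketch, not specified by the patent.

```python
import numpy as np

def relative_view(headset_pose, image_pose):
    """Express the headset position and forward direction in the stacked
    image's local frame, given 4x4 world poses for both."""
    to_image = np.linalg.inv(image_pose)
    eye_world = headset_pose[:3, 3]
    forward_world = -headset_pose[:3, 2]           # camera looks down -Z by convention
    eye_local = (to_image @ np.append(eye_world, 1.0))[:3]
    forward_local = to_image[:3, :3] @ forward_world
    return eye_local, forward_local                 # viewing position and viewing direction
```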
Step B, displaying the stacked image according to the viewing parameters. It can be understood that when the user observes the stacked image, there is a relative position and a relative angle between the user's eye position and the stacked image in virtual space, and shifting focus to different target sampling images changes the vergence angle, producing a parallax effect and achieving a naked-eye 3D effect. In this way, different positions and/or postures of the user yield different picture effects for the stacked image, which better matches a real 3D viewing experience.
In some embodiments, the step B may refer to the following steps (1) to (2):
and (1) determining that a user views the core sampling image in the stacked image according to the viewing parameters. In practical applications, the core sampling image may be regarded as an image of the user's eye in focus, and as shown in fig. 7, the images of the eyes in focus at different viewing angles may be considered as different, and the viewing angles of the eyes are related to the viewing position and viewing posture.
Step (2), blurring the remaining target sampling images in the stacked image other than the core sampling image, with the degree of blur increasing for target sampling images farther from the core sampling image. That is, progressively stronger blurring is applied to the images adjacent to the core sampling image, which further creates an eye-focusing effect and clearly presents the naked-eye 3D effect to the user.
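A minimal sketch of this focus-dependent blurring, assuming the layers are ordered by depth and using a Gaussian blur from OpenCV; the kernel-size formula is an assumption introduced for illustration.

```python
import cv2

def focus_blur(layers, core_index, strength=2):
    """Blur every layer except the in-focus core layer, with a blur radius
    that grows with the layer's distance (in layers) from the core."""
    out = []
    for i, layer in enumerate(layers):
        dist = abs(i - core_index)
        if dist == 0:
            out.append(layer)                      # core sampling image stays sharp
            continue
        k = 2 * strength * dist + 1                # odd kernel size, larger when farther
        out.append(cv2.GaussianBlur(layer, (k, k), 0))
    return out
```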
In some embodiments, to further improve the viewing effect and reduce the visible gaps or misaligned patterns the user may notice where adjacent target sampling images meet, the embodiments of the present disclosure may further optimize the display of the stacked image. For example, step B above may be implemented with reference to the following steps B1 to B3:
and B1, determining adjacent sampling images to be corrected in the laminated images according to the viewing parameters. The sampling image to be corrected is an image which needs to be subjected to additional optimization processing in the display process, and is also an image which is easy to be perceived by a user to be penetrated.
In some implementations, the core sampling image the user is viewing in the stacked image may be determined from the viewing parameters, and the core sampling image together with its adjacent images is then taken as the sampling images to be corrected. It can be understood that the core sampling image is the target sampling image the user is currently focusing on; taking it and its adjacent target sampling images as the sampling images to be corrected and optimizing them further prevents the user from noticing exposed seams.
Step B2, stitching the adjacent sampling images to be corrected. The stitching process includes stretching and/or alignment.
In some implementations, the gaze region of the user's eyes in the adjacent sampling images to be corrected may be acquired; a matching region corresponding to the gaze region is then determined in the adjacent sampling images to be corrected, where the gaze region and the matching region belong to different sampling images to be corrected, have similar features, and are separated by a distance smaller than a preset threshold; the gaze region and the matching region are then stitched. In this way, similar visual features between adjacent front and rear images can be stretched and aligned so that the pattern between adjacent images is continuous and joined at that particular viewing angle. Moreover, since only a local region of each target sampling image is stitched, the amount of processing is reduced and the distortion caused by large-area stitching is effectively avoided.
On this basis, the embodiments of the present disclosure provide a specific implementation of stitching the gaze region and the matching region: stretching and/or aligning the gaze region and the matching region based on their similar features so that the two regions join. To facilitate understanding of steps B1 and B2 above, refer to the stitching schematic shown in fig. 8: the user's viewing parameters are obtained, the current gaze region is determined from the line from the user's viewpoint to the image, and the similar features (the schematic triangles in the dashed box) of the two adjacent target sampling images currently gazed at are aligned while the region containing them is stretched, so that the adjacent sampling images to be corrected join better and appear seamless at the user's viewing angle.
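One possible, simplified realization of this local stitching is sketched below, assuming 3-channel (BGR) uint8 layers, a gaze region that is already known, and OpenCV template matching to locate the corresponding region in the neighbouring layer; the matching and shifting scheme is an assumption, and a real implementation could use any feature-matching and warping method.

```python
import cv2
import numpy as np

def stitch_gaze_region(near_layer, far_layer, gaze_box, max_shift=20):
    """Find the region of `far_layer` that best matches the gaze region of
    `near_layer` and shift it locally so the two regions line up at the seam."""
    x, y, w, h = gaze_box                               # gaze region in the near layer
    template = near_layer[y:y + h, x:x + w]
    sx, sy = max(x - max_shift, 0), max(y - max_shift, 0)
    search = far_layer[sy:y + h + max_shift, sx:x + w + max_shift]
    res = cv2.matchTemplate(search, template, cv2.TM_CCOEFF_NORMED)
    _, score, _, (mx, my) = cv2.minMaxLoc(res)
    if score < 0.5:
        return far_layer                                # no similar feature: leave unchanged
    dx, dy = (sx + mx) - x, (sy + my) - y               # offset of the matching region
    shift = np.float32([[1, 0, -dx], [0, 1, -dy]])      # translate the match onto the gaze box
    shifted = cv2.warpAffine(far_layer, shift,
                             (far_layer.shape[1], far_layer.shape[0]))
    aligned = far_layer.copy()
    aligned[y:y + h, x:x + w] = shifted[y:y + h, x:x + w]
    return aligned
```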
Step B3, displaying the stacked image after stitching.
Through steps B1 to B3, the display of the target sampling images within the stacked image can be optimized, reducing the cases in which the user notices gaps or misaligned patterns at the seams between adjacent target sampling images, and effectively safeguarding the viewing experience.
In summary, the image processing method provided by the embodiments of the present disclosure samples a virtual three-dimensional object to obtain a plurality of target sampling images that carry local surface information of the object and differ in depth information. Compared with directly storing and transmitting the 3D model of the virtual three-dimensional object, storing and transmitting the target sampling images requires few resources, and when the regions that contain no local surface information are hollowed out, the memory needed to store the information of the virtual three-dimensional object can be compressed even further. In addition, the method requires no complex rendering of the 3D model, places low demands on device performance, and therefore works on a wider range of devices.
Moreover, with the approach provided by the embodiments of the present disclosure, when viewing the stacked image the user can focus on different target sampling images from different viewing angles and positions, changing the vergence angle and achieving a naked-eye 3D effect; different viewing poses thus yield different 3D visual impressions. Further, blurring the target sampling images other than the one the user is focusing on, to different degrees, strengthens the eye-focusing effect and provides a more realistic 3D viewing experience. By optimizing both the acquisition and the display of the target sampling images, the embodiments of the present disclosure effectively widen the user's viewing-angle range, supporting viewing of the stacked image at larger angles for a wide-range naked-eye 3D effect and an effectively improved viewing experience.
The above method provided by the embodiments of the present disclosure thus allows the user to view the stacked image over a larger range of viewing angles, achieving a wide-range naked-eye 3D effect, and the user can obtain different visual experiences by adjusting the viewing position. Further, when viewing the stacked image, the user can change the vergence angle by shifting focus to different target sampling images, achieving the naked-eye 3D effect.
Corresponding to the foregoing image processing method, fig. 9 is a schematic structural diagram of an image processing apparatus according to an embodiment of the present disclosure, where the apparatus may be implemented by software and/or hardware, and may be generally integrated in an electronic device, as shown in fig. 9, and includes:
an image obtaining module 902, configured to obtain a plurality of target sampling images corresponding to the virtual three-dimensional object and depth information of each target sampling image; each target sampling image contains local surface information of a virtual three-dimensional object, and depth information of different target sampling images is different;
the image superposition module 904 is configured to perform superposition processing on the multiple target sampling images according to the depth information of each target sampling image, so as to obtain a stacked image;
and the layer display module 906 is used for displaying the laminated image.
According to the technical solutions provided by the embodiments of the present disclosure, a plurality of target sampling images containing local surface information of the virtual three-dimensional object, together with the corresponding depth information, can be obtained directly; the target sampling images are then superimposed based on the depth information, and the resulting stacked image displays the surface information of the virtual three-dimensional object according to depth, thereby presenting a stereoscopic effect to the user. In this way, no model rendering of the virtual three-dimensional object is required: the stereoscopic effect can be presented to the user using images alone, which greatly saves the computing and storage resources otherwise required to present such an effect.
In some embodiments, the image acquisition module 902 is specifically configured to: determining sampling parameters; the sampling parameters comprise sampling positions and sampling postures; and acquiring a plurality of target sampling images corresponding to the virtual three-dimensional object based on the sampling parameters, and acquiring depth information of each target sampling image.
In some embodiments, the image acquisition module 902 is specifically configured to: slicing the virtual three-dimensional object based on the sampling parameters to obtain a plurality of parallel tangential planes; based on the parallel tangential planes, acquiring a plurality of target sampling images corresponding to the virtual three-dimensional object, and acquiring depth information of each target sampling image.
In some embodiments, the image acquisition module 902 is specifically configured to: slicing the virtual three-dimensional object according to a specified interval; the specified intervals include equal intervals or unequal intervals.
In some embodiments, the separation distance between two adjacent tangential planes near the sampling location is less than the separation distance between two adjacent tangential planes far from the sampling location.
In some embodiments, the image acquisition module 902 is specifically configured to: obtaining a plurality of section groups based on the plurality of parallel sections; wherein the section group comprises two sections; for each section group, projecting the surface part of the virtual three-dimensional object clamped between two sections contained in the section group to obtain a sampling image corresponding to the section group; and determining a plurality of target sampling images from all sampling images corresponding to the section group.
In some embodiments, two cuts in the set of cuts are spaced apart by a specified number of cuts in the plurality of parallel cuts.
In some embodiments, the image acquisition module 902 is specifically configured to: taking all the sampling images corresponding to the section groups as target sampling images; or taking the sampling images which meet the preset conditions in the sampling images corresponding to all the section groups as target sampling images; or extracting target sampling images from all the sampling images corresponding to the section group according to a specified interval.
In some embodiments, the similarity between the sampled images meeting the preset condition is not higher than a preset similarity threshold; and/or the projection pixel quantity of the sampling image meeting the preset condition is not lower than the preset pixel quantity.
In some embodiments, the image acquisition module 902 is specifically configured to: and for each target sampling image, acquiring the distance between the surface part of the virtual three-dimensional object corresponding to the target sampling image and the sampling position, and determining depth information corresponding to the target sampling image according to the distance.
In some embodiments, the image overlay module 904 is specifically configured to: sorting a plurality of target sampling images according to the depth information of each target sampling image; and carrying out superposition processing on the plurality of target sampling images based on the sequencing result.
In some embodiments, the layer presentation module 906 is specifically configured to: and setting a mask on the surface of the laminated image, and displaying the laminated image provided with the mask so that a user can watch the laminated image through the mask.
In some embodiments, the layer presentation module 906 is specifically configured to: and displaying the laminated image provided with the mask in a preset virtual three-dimensional scene, wherein the mask is transparent relative to the virtual three-dimensional scene.
In some embodiments, the layer presentation module 906 is specifically configured to: obtaining a viewing parameter of a user; wherein the viewing parameters include a viewing position and a viewing angle; and displaying the laminated image according to the viewing parameters.
In some embodiments, the layer presentation module 906 is specifically configured to: acquiring the viewing parameters of the user according to the position and pose of the user's VR headset.
In some embodiments, the layer presentation module 906 is specifically configured to: determining the eye position and eye movement information of the user by means of a designated sensor; and determining the viewing parameters based on the eye position and the eye movement information.
In some embodiments, the layer presentation module 906 is specifically configured to: determining, according to the viewing parameters, the core sampling image that the user is viewing in the laminated image; and respectively performing blurring processing on the remaining target sampling images in the laminated image other than the core sampling image, wherein the farther a target sampling image is from the core sampling image, the greater its degree of blurring.
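One simple realization of this distance-dependent blurring, assuming OpenCV is available and using the layer's index distance from the core sampling image as a proxy for its depth distance; sigma_per_layer is an illustrative parameter, not a value taken from the disclosure:

import cv2

def defocus_layers(layers, core_index, sigma_per_layer=1.5):
    # Blur every non-core layer; blur strength grows with its distance
    # (in layer order) from the core sampling image.
    out = []
    for i, (rgba, depth) in enumerate(layers):
        d = abs(i - core_index)
        if d == 0:
            out.append((rgba, depth))               # the core layer stays sharp
            continue
        blurred = cv2.GaussianBlur(rgba, (0, 0), sigma_per_layer * d)
        out.append((blurred, depth))
    return out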
In some embodiments, the layer presentation module 906 is specifically configured to: determining, according to the viewing parameters, adjacent sampling images to be corrected in the laminated image; performing stitching processing on the adjacent sampling images to be corrected; and displaying the laminated image after stitching.
In some embodiments, the layer presentation module 906 is specifically configured to: determining, according to the viewing parameters, the core sampling image that the user is viewing in the laminated image; and taking the core sampling image and the images adjacent to it as the sampling images to be corrected.
In some embodiments, the layer presentation module 906 is specifically configured to: acquiring the gaze region of the user's eyes in the adjacent sampling images to be corrected; determining a matching region corresponding to the gaze region in the adjacent sampling images to be corrected, wherein the gaze region and the matching region belong to different sampling images to be corrected, have similar features, and are separated by a distance smaller than a preset threshold; and performing stitching processing on the gaze region and the matching region.
In some embodiments, the layer presentation module 906 is specifically configured to: stretching and/or aligning the gaze region and the matching region based on their similar features, so as to stitch the gaze region and the matching region together.
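A minimal sketch of one possible alignment step, assuming the gaze region and the matching region have already been cropped to equal-sized single-channel patches; phase correlation is used here merely as a stand-in for whatever feature-based stretching or alignment an implementation might actually choose:

import cv2
import numpy as np

def align_regions(gaze_patch, match_patch):
    # Estimate the translation between the two patches and shift the matching
    # region accordingly; the sign convention may need flipping depending on
    # how the patches were cropped from their sampling images.
    shift, _ = cv2.phaseCorrelate(np.float32(gaze_patch), np.float32(match_patch))
    matrix = np.float32([[1, 0, -shift[0]], [0, 1, -shift[1]]])
    h, w = match_patch.shape[:2]
    return cv2.warpAffine(np.float32(match_patch), matrix, (w, h))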
In some embodiments, the apparatus further comprises an arrangement storage module configured to arrange the plurality of target sampling images in a matrix form and store the arranged target sampling images.
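For the matrix arrangement, the sketch below packs equally sized layers into a simple grid (texture-atlas style) so the whole set can be stored and reloaded as a single image; the column count and the assumption that all layers share one resolution are illustrative:

import numpy as np

def pack_into_atlas(layers, cols=4):
    # Arrange the target sampling images into a rows x cols grid.
    h, w = layers[0][0].shape[:2]
    rows = (len(layers) + cols - 1) // cols
    atlas = np.zeros((rows * h, cols * w, 4), dtype=layers[0][0].dtype)
    for i, (rgba, _) in enumerate(layers):
        r, c = divmod(i, cols)
        atlas[r * h:(r + 1) * h, c * w:(c + 1) * w] = rgba
    return atlas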
In some embodiments, the areas of a target sampling image that do not contain the local surface information are hollowed-out areas or transparent areas.
The image processing apparatus provided by the embodiments of the present disclosure can execute the image processing method provided by any embodiment of the present disclosure, and has the functional modules and beneficial effects corresponding to the method.
Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working procedures of the above-described apparatus embodiments may refer to the corresponding procedures in the method embodiments, which are not repeated here.
The embodiment of the disclosure also provides an electronic device, which includes: a processor; and a memory for storing processor-executable instructions; wherein the processor is configured to read the executable instructions from the memory and execute the instructions to implement the image processing method.
The embodiment of the disclosure also provides a computer readable storage medium storing a computer program for executing the above image processing method.
Fig. 10 is a schematic structural diagram of an electronic device according to an embodiment of the disclosure. As shown in fig. 10, an electronic device 1000 includes one or more processors 1001 and memory 1002.
The processor 1001 may be a Central Processing Unit (CPU) or other form of processing unit having data processing and/or instruction execution capabilities and may control other components in the electronic device 1000 to perform desired functions.
Memory 1002 may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, random access memory (RAM) and/or cache memory. The non-volatile memory may include, for example, read-only memory (ROM), a hard disk, flash memory, and the like. One or more computer program instructions may be stored on the computer-readable storage medium and executed by the processor 1001 to implement the image processing method of the embodiments of the present disclosure described above and/or other desired functions. Various contents such as an input signal, a signal component, and a noise component may also be stored in the computer-readable storage medium.
In one example, the electronic device 1000 may further include: an input device 1003 and an output device 1004, which are interconnected by a bus system and/or other forms of connection mechanisms (not shown).
In addition, the input device 1003 may include, for example, a keyboard, a mouse, and the like.
The output device 1004 may output various information to the outside, including the determined distance information, direction information, and the like. The output device 1004 may include, for example, a display, speakers, a printer, and a communication network and the remote output devices connected thereto.
Of course, for simplicity, only some of the components of the electronic device 1000 that are relevant to the present disclosure are shown in fig. 10; components such as buses and input/output interfaces are omitted. In addition, the electronic device 1000 may include any other suitable components depending on the particular application.
In addition to the methods and apparatus described above, embodiments of the present disclosure may also take the form of a computer program product comprising computer program instructions which, when executed by a processor, cause the processor to perform the image processing method provided by the embodiments of the present disclosure.
The computer program product may include program code for performing the operations of embodiments of the present disclosure, written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++ and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server.
Further, embodiments of the present disclosure may also take the form of a computer-readable storage medium having stored thereon computer program instructions which, when executed by a processor, cause the processor to perform the image processing method provided by the embodiments of the present disclosure.
The computer-readable storage medium may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium may include, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium would include the following: an electrical connection having one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The disclosed embodiments also provide a computer program product comprising a computer program/instruction which, when executed by a processor, implements the image processing method in the disclosed embodiments.
It should be noted that in this document, relational terms such as "first" and "second" and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The foregoing is merely a specific embodiment of the disclosure to enable one skilled in the art to understand or practice the disclosure. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the disclosure. Thus, the present disclosure is not intended to be limited to the embodiments shown and described herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (26)

1. An image processing method, comprising:
acquiring a plurality of target sampling images corresponding to a virtual three-dimensional object, and depth information of each target sampling image; wherein each target sampling image contains local surface information of the virtual three-dimensional object, and depth information of different target sampling images is different;
according to the depth information of each target sampling image, carrying out superposition processing on the plurality of target sampling images to obtain a laminated image;
and displaying the laminated image.
2. The method of claim 1, wherein the acquiring a plurality of target sampling images corresponding to the virtual three-dimensional object and depth information of each of the target sampling images comprises:
determining sampling parameters; the sampling parameters comprise sampling positions and sampling postures;
and acquiring a plurality of target sampling images corresponding to the virtual three-dimensional object based on the sampling parameters, and acquiring depth information of each target sampling image.
3. The method according to claim 2, wherein the acquiring a plurality of target sampling images corresponding to the virtual three-dimensional object based on the sampling parameters, and acquiring depth information of each of the target sampling images, comprises:
Slicing the virtual three-dimensional object based on the sampling parameters to obtain a plurality of parallel tangential planes;
based on the parallel tangential planes, acquiring a plurality of target sampling images corresponding to the virtual three-dimensional object, and acquiring depth information of each target sampling image.
4. A method according to claim 3, wherein said slicing the virtual three-dimensional object comprises:
slicing the virtual three-dimensional object according to a specified interval; the specified intervals include equal intervals or unequal intervals.
5. A method according to claim 3, wherein the separation distance between two adjacent tangential planes close to the sampling location is smaller than the separation distance between two adjacent tangential planes farther from the sampling location.
6. The method of claim 3, wherein the acquiring a plurality of target sampling images corresponding to the virtual three-dimensional object comprises:
obtaining a plurality of section groups based on the plurality of parallel sections; wherein the section group comprises two sections;
for each section group, projecting the surface part of the virtual three-dimensional object clamped between two sections contained in the section group to obtain a sampling image corresponding to the section group;
And determining a plurality of target sampling images from all sampling images corresponding to the section group.
7. The method of claim 6, wherein the two tangential planes in a section group are spaced apart by a specified number of tangential planes among the plurality of parallel tangential planes.
8. The method of claim 6, wherein the determining a plurality of target sampling images from the sampling images corresponding to all of the section groups comprises:
taking all the sampling images corresponding to the section groups as target sampling images; or,
taking the sampling images which meet the preset conditions in the sampling images corresponding to all the section groups as target sampling images; or,
and extracting target sampling images from all sampling images corresponding to the section group according to a specified interval.
9. The method of claim 8, wherein the similarity between the sampling images meeting the preset condition is not higher than a preset similarity threshold; and/or the number of projected pixels in a sampling image meeting the preset condition is not lower than a preset pixel count.
10. The method of claim 6, wherein said acquiring depth information of each of said target sampling images comprises:
And for each target sampling image, acquiring the distance between the surface part of the virtual three-dimensional object corresponding to the target sampling image and the sampling position, and determining depth information corresponding to the target sampling image according to the distance.
11. The method according to claim 1, wherein the superimposing of the plurality of target sampling images based on the depth information of each of the target sampling images comprises:
sorting a plurality of target sampling images according to the depth information of each target sampling image;
and carrying out superposition processing on the plurality of target sampling images based on the sorting result.
12. The method of claim 1, wherein the displaying the laminated image comprises:
and setting a mask on the surface of the laminated image, and displaying the laminated image provided with the mask so that a user can watch the laminated image through the mask.
13. The method of claim 12, wherein the displaying the laminated image provided with the mask comprises:
and displaying the laminated image provided with the mask in a preset virtual three-dimensional scene, wherein the mask is transparent relative to the virtual three-dimensional scene.
14. The method of claim 1, wherein the displaying the laminated image comprises:
obtaining viewing parameters of a user; wherein the viewing parameters include a viewing position and a viewing angle;
and displaying the laminated image according to the viewing parameters.
15. The method of claim 14, wherein the obtaining the viewing parameters of the user comprises:
acquiring the viewing parameters of the user according to the position and pose of the user's VR headset.
16. The method of claim 14, wherein the obtaining the viewing parameters of the user comprises:
determining an eye position and eye movement information of the user by means of a designated sensor;
and determining the viewing parameters based on the eye position and the eye movement information.
17. The method of claim 14, wherein the displaying the laminated image according to the viewing parameters comprises:
determining, according to the viewing parameters, a core sampling image viewed by the user in the laminated image;
and respectively performing blurring processing on the remaining target sampling images in the laminated image other than the core sampling image, wherein the farther a target sampling image is from the core sampling image, the greater its degree of blurring.
18. The method of claim 14, wherein the displaying the laminated image according to the viewing parameters comprises:
determining adjacent sampling images to be corrected in the laminated images according to the viewing parameters;
performing stitching processing on the adjacent sampling images to be corrected;
and displaying the laminated image after stitching.
19. The method of claim 18, wherein the determining adjacent sampling images to be corrected in the laminated image according to the viewing parameters comprises:
determining, according to the viewing parameters, a core sampling image viewed by the user in the laminated image;
and taking the core sampling image and the images adjacent to the core sampling image as the sampling images to be corrected.
20. The method of claim 18, wherein the step of stitching the adjacent sampling images to be corrected comprises:
acquiring a gazing area of the user's eyes in the adjacent sampling images to be corrected;
determining a matching area corresponding to the gazing area in the adjacent sampling images to be corrected; wherein the gazing area and the matching area belong to different sampling images to be corrected, have similar features, and are separated by a distance smaller than a preset threshold;
and performing stitching processing on the gazing area and the matching area.
21. The method of claim 20, wherein the step of stitching the gazing area and the matching area comprises:
stretching and/or aligning the gazing area and the matching area based on the similar features, so as to stitch the gazing area and the matching area together.
22. The method according to claim 1, wherein the method further comprises:
arranging the plurality of target sampling images in a matrix form;
storing the plurality of arranged target sampling images.
23. The method according to any one of claims 1 to 22, wherein the areas of the target sampling image not containing the local surface information are hollowed-out areas or transparent areas.
24. An image processing apparatus, comprising:
the image acquisition module is used for acquiring a plurality of target sampling images corresponding to the virtual three-dimensional object and depth information of each target sampling image; wherein each target sampling image contains local surface information of the virtual three-dimensional object, and depth information of different target sampling images is different;
The image superposition module is used for superposing the plurality of target sampling images according to the depth information of each target sampling image to obtain a laminated image;
and the layer display module is used for displaying the laminated image.
25. An electronic device, the electronic device comprising:
a processor;
a memory for storing the processor-executable instructions;
the processor is configured to read the executable instructions from the memory and execute the instructions to implement the image processing method of any of the preceding claims 1-23.
26. A computer-readable storage medium, characterized in that the storage medium stores a computer program for executing the image processing method according to any one of the preceding claims 1-23.
CN202211097123.2A 2022-09-08 2022-09-08 Image processing method, device, equipment and medium Pending CN117710614A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211097123.2A CN117710614A (en) 2022-09-08 2022-09-08 Image processing method, device, equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211097123.2A CN117710614A (en) 2022-09-08 2022-09-08 Image processing method, device, equipment and medium

Publications (1)

Publication Number Publication Date
CN117710614A true CN117710614A (en) 2024-03-15

Family

ID=90161192

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211097123.2A Pending CN117710614A (en) 2022-09-08 2022-09-08 Image processing method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN117710614A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination