CN117456078B - Neural radiance field rendering method, system and device based on multiple sampling strategies - Google Patents

Neural radiance field rendering method, system and device based on multiple sampling strategies

Info

Publication number
CN117456078B
CN117456078B (application CN202311748881.0A)
Authority
CN
China
Prior art keywords
sampling
value
layer
rendering
representing
Prior art date
Legal status
Active
Application number
CN202311748881.0A
Other languages
Chinese (zh)
Other versions
CN117456078A (en)
Inventor
方顺 (Fang Shun)
张志恒 (Zhang Zhiheng)
Current Assignee
Beijing Xuanguang Technology Co ltd
Original Assignee
Beijing Xuanguang Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Beijing Xuanguang Technology Co ltd
Priority to CN202311748881.0A
Publication of CN117456078A
Application granted
Publication of CN117456078B


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 15/00 - 3D [Three Dimensional] image rendering
    • G06T 15/10 - Geometric effects
    • G06T 15/20 - Perspective computation
    • G06T 15/205 - Image-based rendering
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/0464 - Convolutional networks [CNN, ConvNet]
    • G06T 1/00 - General purpose image data processing
    • G06T 1/20 - Processor architectures; Processor configuration, e.g. pipelining

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Geometry (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Graphics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)
  • Image Generation (AREA)

Abstract

The invention belongs to the field of computer vision, and in particular relates to a neural radiance field rendering method, system and device based on multiple sampling strategies, aiming to solve the problem that, in the prior art, weakly-visible objects are easily ignored, rendered unclearly, lost, or rendered with errors. The invention comprises: aligning a camera with a target object and acquiring a camera direction vector and camera parameters; acquiring rendering information of the target object through the camera and a neural radiance field rendering network based on multiple sampling strategies, the rendering information comprising a signed distance field value, a volume density value, a color value, a depth value and a normal; and acquiring a high-resolution view or a high-resolution three-dimensional reconstruction model at a set viewing angle based on the rendering information. By sampling the target object with different sampling densities, the invention can devote more computing resources to the regions where weakly-visible objects are located, effectively avoiding weakly-visible objects being ignored, unclear or lost during three-dimensional reconstruction or novel view synthesis.

Description

Neural radiance field rendering method, system and device based on multiple sampling strategies
Technical Field
The invention belongs to the field of computer vision, and particularly relates to a neural radiance field rendering method, system and device based on multiple sampling strategies.
Background
Thin objects or parts, tiny objects or parts, objects or parts in unfavorable positions (for example at corners, or far from the camera), objects or parts limited by the viewing angle (for example where the camera angle leaves some objects poorly placed), objects or parts in poor lighting (where weak illumination causes blur), objects or parts subject to several of these limitations at once (for example simultaneously tiny, in a dark corner, and far from the camera), and so on, are collectively referred to here as weakly-visible objects. When a neural radiance field is used for three-dimensional reconstruction or novel view synthesis, weakly-visible objects are prone to rendering problems such as being ignored, unclear, lost or erroneous, so that a good reconstruction or novel-view-synthesis result cannot be obtained.
The root cause of the weakly-visible-object rendering problem is that a neural network converges on low-frequency information much faster than on high-frequency information, so the problem cannot be solved well even with sufficient training data and training time. Traditional methods, or methods that only improve the neural network itself, struggle to optimize both accuracy and computing resources.
To solve this problem, i.e. to capture the high-frequency signals of objects more easily with less training data and shorter training time, a novel sampling strategy is proposed, with rendering performed via a neural radiance field and volume rendering.
Disclosure of Invention
In order to solve the above problems in the prior art, namely that weakly-visible objects are easily ignored, unclear, lost or rendered with errors, the invention provides a neural radiance field rendering method based on multiple sampling strategies, comprising:
step S1, aiming a camera at a target object, and acquiring a current camera direction vector and camera parameters;
step S2, acquiring rendering information through the camera and a neural radiance field rendering network based on multiple sampling strategies, the rendering information comprising a signed distance field value, a volume density value, a color value, a depth value and a normal of the target object;
step S21, performing first-layer sampling and position encoding on the target object in a uniform sampling manner through an image acquisition device and a sampling rendering network to obtain a first-layer sampling point set;
step S22, calculating a feature map of the first-layer sampling point set through a feature extraction network based on the first-layer sampling point set;
step S23, obtaining first rendering information from the feature map of the first-layer sampling point set and the current camera direction vector through a multi-branch rendering information extraction network, the first rendering information comprising a first signed distance field value, a first volume density value, a first color value, a first depth value and a first normal;
step S24, setting a second-layer sampling density according to the first rendering information, and performing second-layer sampling and position encoding on the target object according to the second-layer sampling density to obtain a second-layer sampling point set;
step S25, calculating a feature map of the second-layer sampling point set through the feature extraction network based on the second-layer sampling point set;
step S26, obtaining second rendering information from the feature map of the second-layer sampling point set and the current camera direction vector through the multi-branch rendering information extraction network, the second rendering information comprising a second signed distance field value, a second volume density value, a second color value, a second depth value and a second normal;
the first rendering information and the second rendering information together form the rendering information;
and step S3, acquiring a high-resolution view or a high-resolution three-dimensional reconstruction model at a set viewing angle based on the rendering information.
Further, the neural radiance field rendering network based on multiple sampling strategies specifically comprises:
a sampling rendering network, a feature extraction network and a multi-branch rendering information extraction network;
the sampling rendering network comprises a sampling module and a position encoding unit;
the feature extraction network is constructed from a first multi-layer perceptron with a skip connection;
the multi-branch rendering information extraction network consists of parallel volume density, color, depth and normal acquisition branches; the volume density acquisition branch comprises a second multi-layer perceptron; the color acquisition branch comprises a position encoding unit and a third multi-layer perceptron; the depth acquisition branch comprises a position encoding unit and a fourth multi-layer perceptron; the normal acquisition branch comprises a position encoding unit and a fifth multi-layer perceptron.
By providing the sampling rendering network, the feature extraction network and the multi-branch rendering information extraction network, this scheme can automatically analyze the information of a target object and, through sampling modules with different sampling densities, allocate different sampling precision to different positions, restoring the details of weakly-visible objects better while preserving overall precision. The proposed network structure derives the volume density from signed distance field values, which improves the accuracy of the volume density compared with a traditional neural radiance field network. Also unlike a traditional neural radiance field network, the network of this scheme generates depth values and normals in addition to color values, so that the overall performance of the network can be further optimized through the loss function.
Further, the first multi-layer perceptron comprises 8 hidden layers of 256 channels each, and the input of the first multi-layer perceptron is connected to the 4th hidden layer through a skip connection.
By feeding the shallow, position-encoded input directly to a deep hidden layer of the first multi-layer perceptron through the skip connection, the invention alleviates the low training efficiency of deep multi-layer perceptrons and improves the information-processing capability of the finally trained perceptron.
Further, step S23 comprises:
step S231, inputting the feature map of the first-layer sampling point set into the volume density acquisition branch;
and inputting the feature map of the first-layer sampling point set and the current camera direction vector into the parallel color, depth and normal acquisition branches;
step S232, encoding the current camera direction vector through a position encoding unit to obtain a direction encoding vector;
step S233, acquiring a first signed distance field value through the second multi-layer perceptron based on the feature map of the first-layer sampling point set, and from it a first volume density value;
and acquiring a first color value, a first depth value and a first normal through the third, fourth and fifth multi-layer perceptrons, respectively, based on the feature map of the first-layer sampling point set and the direction encoding vector.
By using three networks with the same structure but different parameters to compute the different kinds of information independently, comprehensive information can be obtained for both weakly-visible and ordinary objects, improving reconstruction precision and the precision of images generated at other viewing angles.
Further, setting the second-layer sampling density according to the first rendering information uses one, or both in parallel, of an importance sampling mode and a weak-visibility sampling mode.
Further, the importance sampling mode comprises:
estimating an information density distribution from the first color value;
obtaining a weighted sampling density from the information density distribution by scaling in a preset proportion, or by scaling in a preset proportion and then shaping;
and taking the weighted sampling density as the second-layer sampling density.
By directly using the information density distribution computed from the first-layer sampling points as the sampling density of the second-layer sampling, the invention exploits the relationship between regions of high or low information density and weakly-visible objects: the sampling precision for weakly-visible objects is increased automatically, without needing a neural network to identify the regions where they lie, which reduces computation and computing-resource consumption.
Further, the shaping comprises one or more of interpolation, weighting, stretching a set interval, compressing a set interval, or filtering.
By further applying shaping on top of the information density distribution computed from the first-layer sampling points, different precision can be applied to weakly-visible objects of different degrees, broadening adaptability to weakly-visible objects of different degrees and types.
Further, the weak-visibility sampling mode specifically comprises setting a second-layer sampling density for the pixels from which the image acquisition device casts rays, according to the first depth value, and performing weak-visibility sampling to obtain weak-visibility sampling points $p_s$:

$p_s = R \cdot \left( D_1(q) \cdot K^{-1} \cdot [u, v, 1]^{T} \right) + t$

where $p_s$ is a weak-visibility sampling point, i.e. a 3D sampling point in space; $q$ is the pixel of the weak-visibility sampling point, $q = (u, v)$, with $u$ and $v$ its coordinates; $D_1$ is the first depth map and $D_1(q)$ its value at coordinate position $q$; $t$ is the camera pose translation vector; $R$ is the camera pose rotation matrix; and $K$ is the camera intrinsic matrix;
and performing position encoding on the weak-visibility sampling points to obtain a weak-visibility sampling point set, which is input into the feature extraction network as the second-layer sampling point set.
By setting up weak-visibility sampling independently according to the depth map, weakly-visible objects can be sampled complementarily to the hierarchical sampling: low-frequency information is acquired through ordinary sampling and high-frequency information through weak-visibility sampling, so weakly-visible regions receive targeted attention, high-frequency information is preserved, and reconstruction precision and completeness are improved.
Further, the neural radiance field rendering network based on multiple sampling strategies is trained as follows:
step A1, acquiring a training set comprising training-set target objects together with signed distance field truth values, volume density truth values, color truth values, depth truth values and normal truth values;
step A2, acquiring training-set signed distance field values, density values, color values, depth values and normals through the neural radiance field rendering network based on multiple sampling strategies;
step A3, calculating the total loss function $L$ based on the training-set signed distance field values, density values, color values, depth values and normals and the corresponding truth values:

$L = \lambda_{pos} L_{pos} + \lambda_{d} L_{depth} + \lambda_{n} L_{normal} + \lambda_{s} L_{smooth} + \lambda_{e} L_{eik} + \lambda_{c} L_{color}$

where $L_{pos}$ is the sampling point position loss, $L_{depth}$ the depth loss, $L_{normal}$ the normal loss, $L_{smooth}$ the smoothness loss, $L_{eik}$ the optical path loss, and $L_{color}$ the color loss; $\lambda_{pos}, \lambda_{d}, \lambda_{n}, \lambda_{s}, \lambda_{e}, \lambda_{c}$ are weight coefficients; the depth, normal, smoothness, optical path and color losses are all L2 losses;
step A4, adjusting the network parameters and repeating steps A1 to A3 until the total loss function falls below a set first threshold, giving the trained neural radiance field rendering network based on multiple sampling strategies.
The invention sets a loss for the overall SDF reconstruction and additionally designs separate loss terms for the weak-visibility sampling part and the hierarchical sampling part, preserving the complete information of the target object to the greatest extent, avoiding weakly-visible parts being ignored or lost, and improving overall restoration precision and the precision of generated novel-view images.
Further, the sampling point position loss $L_{pos}$ is specifically:

$L_{pos} = \frac{1}{|B|} \sum_{p_s \in B} \left| f(p_s) \right|$

where $f(p_s)$ is the signed distance field value of weak-visibility sampling point $p_s$, and $B$ is the mini-batch of pixels sampled in each iteration.
The sampling point position loss is computed only for weakly-visible objects, forcing the weak-visibility sampling points to lie on the object surface as far as possible; it complements the depth loss and the overall SDF loss, improving reconstruction and rendering precision.
Further, the depth loss $L_{depth}$ is specifically:

$L_{depth} = \sum_{r \in R} \left\| \hat{D}(r) - D(r) \right\|^2$

where $r$ is a ray cast from a camera pixel, $R$ is the sampling space of rays cast from all camera pixels, $D(r)$ is the depth truth value and $\hat{D}(r)$ is the predicted depth value; the depth loss is an L2 loss.
By optimizing the depth information of weakly-visible objects with a depth loss, which complements the sampling point position loss, the invention reduces the reconstruction error of weakly-visible objects to the greatest extent.
Further, the normal loss $L_{normal}$ is specifically:

$L_{normal} = \sum_{r \in R} \left\| \hat{N}(r) - N(r) \right\|^2$

where $r$ is a ray cast from a camera pixel, $R$ is the sampling space of rays cast from all camera pixels, $N(r)$ is the normal truth value and $\hat{N}(r)$ is the predicted normal; the normal loss is an L2 loss.
Further, the smoothness loss $L_{smooth}$ is specifically:

$L_{smooth} = \frac{1}{|P|} \sum_{p \in P} \left\| \nabla f(p) - \nabla f(p + \epsilon) \right\|^2$

where $f(p)$ is the signed distance field value of sampling point $p$; $\nabla$ denotes the gradient, so $\nabla f(p)$ is the gradient of the signed distance field at $p$; $\epsilon$ is a random uniform 3D position perturbation, and $\nabla f(p+\epsilon)$ is the gradient at the perturbed sampling point; $P$ is a small batch of sampling points near the target object surface; the smoothness loss is an L2 loss.
Further, the optical path (eikonal) loss $L_{eik}$ is specifically:

$L_{eik} = \frac{1}{|P_1|} \sum_{p \in P_1} \left( \left\| \nabla f(p) \right\|_2 - 1 \right)^2$

where $p$ is a first-layer sampling point and $P_1$ is a small batch of first-layer sampling points; $\nabla f(p)$ is the gradient of the signed distance field value $f(p)$ of a first-layer sampling point; the optical path loss is an L2 loss.
The normal loss is calculated over all sampling points, estimating the difference between surface normals and truth values, which further optimizes model performance and improves reconstruction quality; the smoothness loss encourages smooth surfaces in the reconstructed model or image; and the optical path loss keeps the reconstructed surface gradient at unit length, helping to obtain smoother, more continuous surface estimates. These three global losses safeguard the reconstruction accuracy of the model from different aspects and complement one another.
Further, the color loss $L_{color}$ is specifically:

$L_{color} = \sum_{r \in R} \left( \left\| \hat{C}_1(r) - C(r) \right\|^2 + \left\| \hat{C}_2(r) - C(r) \right\|^2 \right)$

where $r$ is a ray cast from a camera pixel and $R$ is the sampling space of rays cast from all camera pixels; $\hat{C}_1(r)$ is the screen pixel color value predicted from the first-layer samples, $\hat{C}_2(r)$ the one predicted from the second-layer samples, and $C(r)$ the true screen pixel color; the color loss is an L2 loss.
The uniform first-layer sampling network provides the importance-sampling network of the second layer with the weights used to distribute samples; if the volume density output by the uniform sampling network is inaccurate, the second-layer sample distribution is affected. The loss is therefore computed on the sampling points of both layers separately, which safeguards color restoration precision.
In another aspect of the present invention, a neural radiance field rendering system based on multiple sampling strategies is presented, the system comprising:
a shooting parameter acquisition module, configured to aim a camera at a target object and acquire a current camera direction vector and camera parameters;
a rendering module, configured to acquire rendering information through the camera and a neural radiance field rendering network based on multiple sampling strategies, the rendering information comprising a signed distance field value, a volume density value, a color value, a depth value and a normal of the target object;
performing first-layer sampling and position encoding on the target object in a uniform sampling manner through an image acquisition device and a sampling rendering network to obtain a first-layer sampling point set;
calculating a feature map of the first-layer sampling point set through a feature extraction network based on the first-layer sampling point set;
obtaining first rendering information from the feature map of the first-layer sampling point set and the current camera direction vector through a multi-branch rendering information extraction network, the first rendering information comprising a first signed distance field value, a first volume density value, a first color value, a first depth value and a first normal;
setting a second-layer sampling density according to the first rendering information, and performing second-layer sampling and position encoding on the target object according to the second-layer sampling density to obtain a second-layer sampling point set;
calculating a feature map of the second-layer sampling point set through the feature extraction network based on the second-layer sampling point set;
obtaining second rendering information from the feature map of the second-layer sampling point set and the current camera direction vector through the multi-branch rendering information extraction network, the second rendering information comprising a second signed distance field value, a second volume density value, a second color value, a second depth value and a second normal;
the first rendering information and the second rendering information together forming the rendering information;
and an image reconstruction module, configured to acquire a high-resolution view or a high-resolution three-dimensional reconstruction model at a set viewing angle based on the rendering information.
In a third aspect of the present invention, an electronic device is provided, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the processor, the instructions being executed by the processor to implement the above neural radiance field rendering method based on multiple sampling strategies.
In a fourth aspect of the present invention, a computer-readable storage medium is provided, the computer-readable storage medium storing computer instructions for execution by a computer to implement the above neural radiance field rendering method based on multiple sampling strategies.
The invention has the following beneficial effects:
(1) By sampling the target object with different sampling densities, computing resources can be concentrated on the regions where weakly-visible objects are located, effectively avoiding weakly-visible objects being ignored, unclear, lost, or rendered with errors during three-dimensional reconstruction or novel view synthesis.
(2) By adding weak-visibility sampling that targets weakly-visible objects specifically, the precision on weakly-visible objects is improved while overall reconstruction precision or novel-view generation precision is preserved.
(3) By combining hierarchical sampling and weak-visibility sampling, different regions of an image are processed selectively, reducing the model's computing-resource consumption while maintaining the accuracy of details.
(4) By setting the sampling point position loss, depth loss and color loss, targeted learning is performed on weakly-visible objects; together with the signed distance field loss over the whole rendered image, these losses optimize one another and improve the accuracy of the final images.
Drawings
Other features, objects and advantages of the present application will become more apparent upon reading of the detailed description of non-limiting embodiments, made with reference to the following drawings, in which:
FIG. 1 is a flow chart of a neural radiance field rendering method based on multiple sampling strategies in an embodiment of the invention;
FIG. 2 is a flow chart of acquiring the signed distance field information, color, depth and normal of a target object in an embodiment of the present invention;
FIG. 3 is a schematic diagram of the network architecture of a neural radiance field rendering network based on multiple sampling strategies in an embodiment of the present invention;
FIG. 4 is a schematic diagram of hierarchical sampling with different sampling densities in an embodiment of the present invention.
Detailed Description
The present application is described in further detail below with reference to the drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be noted that, for convenience of description, only the portions related to the present invention are shown in the drawings.
It should be noted that, in the case of no conflict, the embodiments and features in the embodiments may be combined with each other. The present application will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.
In order to describe the neural radiance field rendering method based on multiple sampling strategies of the present invention more clearly, each step in the embodiment of the present invention is described in detail below with reference to FIG. 1.
The neural radiance field rendering method based on multiple sampling strategies in the first embodiment of the invention comprises steps S1 to S3, detailed as follows:
step S1, aiming a camera at a target object, and acquiring a current camera direction vector and camera parameters.
Step S2, as shown in FIG. 2, acquiring rendering information through the camera and a neural radiance field rendering network based on multiple sampling strategies, the rendering information comprising signed distance field values, volume density values, color values, depth values and normals of the target object.
In this embodiment, the neural radiance field rendering network based on multiple sampling strategies, as shown in FIG. 3, specifically comprises:
a sampling rendering network, a feature extraction network and a multi-branch rendering information extraction network;
the sampling rendering network comprises a sampling module and a position encoding unit;
the feature extraction network is constructed from a first multi-layer perceptron with a skip connection;
in this embodiment, the first multi-layer perceptron comprises 8 hidden layers of 256 channels each, and the input of the first multi-layer perceptron is connected to the 4th hidden layer through a skip connection. The input of the first multi-layer perceptron is the position-encoded vector, and the output is the signed distance field information (SDF value) of the current point. The skip connection allows feature vectors to be extracted better.
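For illustration only, such a feature extractor (8 hidden layers of 256 channels, with the position-encoded input re-injected at the 4th hidden layer) could be sketched in PyTorch as follows; the class and argument names are not from the patent, and the ReLU activation is an assumption:

```python
import torch
import torch.nn as nn

class FeatureMLP(nn.Module):
    """Sketch of the first multi-layer perceptron: 8 hidden layers of 256
    channels, with a skip connection feeding the encoded input into the
    4th hidden layer.  Layer sizes follow the patent text; activation
    choice and naming are illustrative assumptions."""
    def __init__(self, in_dim=60, width=256, depth=8, skip_at=4):
        super().__init__()
        layers = []
        for i in range(depth):
            d_in = in_dim if i == 0 else (width + in_dim if i == skip_at else width)
            layers.append(nn.Linear(d_in, width))
        self.layers = nn.ModuleList(layers)
        self.skip_at = skip_at

    def forward(self, x_enc):
        h = x_enc
        for i, layer in enumerate(self.layers):
            if i == self.skip_at:
                h = torch.cat([h, x_enc], dim=-1)   # skip connection
            h = torch.relu(layer(h))
        return h   # 256-d feature consumed by the SDF/color/depth/normal heads
```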
The multi-branch rendering information extraction network consists of parallel volume density, color, depth and normal acquisition branches; the volume density acquisition branch comprises a second multi-layer perceptron; the color acquisition branch comprises a position encoding unit and a third multi-layer perceptron; the depth acquisition branch comprises a position encoding unit and a fourth multi-layer perceptron; the normal acquisition branch comprises a position encoding unit and a fifth multi-layer perceptron.
In this embodiment the second, third, fourth and fifth multi-layer perceptrons are all 1-layer, 128-channel networks; their inputs are the direction-encoded camera vector and the volume density, and their outputs are, respectively, the color value, depth value and normal of the current point.
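A minimal sketch of the four parallel heads, each built around a single 128-channel layer as stated, might read as follows; the exact input wiring (feature plus encoded direction) is an assumption inferred from steps S231 to S233:

```python
import torch
import torch.nn as nn

class RenderHeads(nn.Module):
    """Sketch of the multi-branch extraction network: an SDF head on the
    feature alone, and color/depth/normal heads on the feature together
    with the direction-encoded camera vector.  Only the 1-layer /
    128-channel shape comes from the text; the rest is illustrative."""
    def __init__(self, feat_dim=256, dir_dim=24, width=128):
        super().__init__()
        def head(in_dim, out_dim):
            return nn.Sequential(nn.Linear(in_dim, width), nn.ReLU(),
                                 nn.Linear(width, out_dim))
        self.sdf_head = head(feat_dim, 1)             # SDF value -> volume density
        self.color_head = head(feat_dim + dir_dim, 3)
        self.depth_head = head(feat_dim + dir_dim, 1)
        self.normal_head = head(feat_dim + dir_dim, 3)

    def forward(self, feat, dir_enc):
        fd = torch.cat([feat, dir_enc], dim=-1)
        return (self.sdf_head(feat), torch.sigmoid(self.color_head(fd)),
                self.depth_head(fd), self.normal_head(fd))
```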
Step S21, performing first-layer sampling and position encoding on the target object in a uniform sampling manner through an image acquisition device and the sampling rendering network to obtain a first-layer sampling point set.
Before position encoding, the sampling points are normalized to the range [-1, 1].
In this embodiment, the position encoding adopts the following form:

$\gamma(p) = \left( \sin(2^0 \pi p), \cos(2^0 \pi p), \sin(2^1 \pi p), \cos(2^1 \pi p), \ldots, \sin(2^{L-1} \pi p), \cos(2^{L-1} \pi p) \right)$

where sin denotes the sine encoding, cos the cosine encoding, each taken at $L$ frequencies, and $p$ denotes a sampling point. The position encoding outputs $3 \times L \times 2$ parameters: 3 for the three dimensions and 2 for the two encodings; for $L = 10$ this gives $3 \times 10 \times 2 = 60$ parameters.
For a particle position $p = (x, y, z)$, the position is a 3-dimensional vector, and $x$, $y$ and $z$ all need to be encoded.
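As a concrete, non-authoritative reading of this encoding, with the standard octave frequencies $2^k \pi$ assumed:

```python
import torch

def positional_encoding(p, L=10):
    """Frequency-encode 3D points p (shape (..., 3), normalized to [-1, 1]).
    Returns (..., 3*L*2) features: sin and cos at L frequencies per
    coordinate, matching the 3 x L x 2 = 60 parameters quoted for L = 10.
    The 2**k * pi frequency ladder is an assumption."""
    feats = []
    for k in range(L):
        feats.append(torch.sin((2.0 ** k) * torch.pi * p))
        feats.append(torch.cos((2.0 ** k) * torch.pi * p))
    return torch.cat(feats, dim=-1)
```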
Step S22, calculating a feature map of the first-layer sampling point set through the feature extraction network based on the first-layer sampling point set.
The four subsequent outputs (SDF value, color, depth and normal) are all internally related, so an 8-layer, 256-channel-per-layer MLP is used for shared feature extraction, after which several related but distinct quantities are output through 1-layer heads, which improves efficiency. The volume density is the result of further processing of the SDF: it is preferable to obtain the volume density from the SDF rather than generate it directly, because the SDF-derived volume density achieves better results, i.e. an unbiased volume density.
Step S23, obtaining first rendering information from the feature map of the first-layer sampling point set and the current camera direction vector through the multi-branch rendering information extraction network, the first rendering information comprising a first signed distance field value, a first volume density value, a first color value, a first depth value and a first normal.
In this embodiment, step S23 comprises:
step S231, inputting the feature map of the first-layer sampling point set into the volume density acquisition branch;
and inputting the feature map of the first-layer sampling point set and the current camera direction vector into the parallel color, depth and normal acquisition branches;
step S232, encoding the current camera direction vector through a position encoding unit to obtain a direction encoding vector;
for the camera direction vector, the same encoding scheme as the position encoding unit is used; the direction $d$ is also a 3-dimensional vector, and $L = 4$ frequencies are taken, so $3 \times 4 \times 2 = 24$ parameters are required;
step S233, acquiring a first signed distance field value through the second multi-layer perceptron based on the feature map of the first-layer sampling point set, and from it a first volume density value;
and acquiring a first color value, a first depth value and a first normal through the third, fourth and fifth multi-layer perceptrons, respectively, based on the feature map of the first-layer sampling point set and the direction encoding vector.
In this embodiment, the volume density value $\sigma$ is calculated as:

$\sigma(p) = \dfrac{s \, e^{-s f(p)}}{\left( 1 + e^{-s f(p)} \right)^2}$

where $p$ is the sampling point, $f(p)$ is the signed distance field value of sampling point $p$, $s$ is a learnable parameter controlling sparsity near the object surface, and $e$ is the base of the natural logarithm.
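Under the reconstruction above, the SDF-to-density conversion could be sketched as below; the exact functional form is inferred from the symbol list, so treat it as an assumption rather than the patented formula:

```python
import torch

def sdf_to_density(sdf, s):
    """Logistic bell-curve mapping sigma = s*exp(-s*f) / (1 + exp(-s*f))^2.
    Density peaks on the zero level set of the SDF; the learnable
    parameter s controls how sparse the density is away from the surface."""
    e = torch.exp(-s * sdf)
    return s * e / (1.0 + e) ** 2
```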
In this embodiment, the color value, depth value and normal are all obtained in the NeRF manner:

$\hat{C}(r) \approx \sum_{i=1}^{N} T_i \left( 1 - e^{-\sigma_i \delta_i} \right) c_i, \qquad \hat{D}(r) \approx \sum_{i=1}^{N} T_i \left( 1 - e^{-\sigma_i \delta_i} \right) t_i, \qquad \hat{N}(r) \approx \sum_{i=1}^{N} T_i \left( 1 - e^{-\sigma_i \delta_i} \right) n_i, \qquad T_i = \exp\left( -\sum_{j=1}^{i-1} \sigma_j \delta_j \right)$

where $\hat{C}(r)$, $\hat{D}(r)$ and $\hat{N}(r)$ are the predicted color, depth and normal of ray $r$; because they are estimated from a finite number of sampling points, the formulas are written with an approximately-equals sign. $N$ is the number of sampling points; $\left(1 - e^{-\sigma_i \delta_i}\right)$ is the contribution coefficient (opacity) of the current sample, with $\sigma_i$ the density at the current sample and $\delta_i = t_{i+1} - t_i$ the distance between two adjacent sampling points; $T_i$ is the cumulative transmittance, i.e. the product of the transmittances $e^{-\sigma_j \delta_j}$ of the preceding $i-1$ points, and is the discrete form of the continuous cumulative transmittance $T(t) = \exp\left(-\int \sigma \, ds\right)$. $c_i$ is the color of the $i$-th particle, estimated by the network; $t_i$ is the distance of the $i$-th particle from the camera center; and $n_i$ is the normal of the $i$-th particle, obtainable from the gradient of the SDF at that point.
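A compact sketch of this discrete accumulation, applying the same weights T_i * (1 - exp(-sigma_i * delta_i)) to color, depth and normal (tensor shapes are assumptions):

```python
import torch

def volume_render(sigma, color, t, normal):
    """Discrete volume rendering along one ray.
    sigma: (N,) densities; color: (N, 3); t: (N,) sample distances;
    normal: (N, 3).  Returns accumulated color, depth and normal."""
    delta = torch.cat([t[1:] - t[:-1], t.new_tensor([1e10])])      # delta_i = t_{i+1} - t_i
    alpha = 1.0 - torch.exp(-sigma * delta)                        # opacity per sample
    trans = torch.cumprod(torch.cat([t.new_ones(1), 1.0 - alpha + 1e-10]), dim=0)[:-1]  # T_i
    w = trans * alpha                                              # per-sample weight
    return ((w[:, None] * color).sum(0),     # C(r)
            (w * t).sum(0),                  # D(r)
            (w[:, None] * normal).sum(0))    # N(r)
```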
Step S24, setting a second-layer sampling density according to the first rendering information, and performing second-layer sampling and position encoding on the target object according to the second-layer sampling density to obtain a second-layer sampling point set.
The second-layer sampling in this embodiment is essentially importance sampling.
In this embodiment, setting the second-layer sampling density according to the first rendering information uses one, or both in parallel, of an importance sampling mode and a weak-visibility sampling mode.
The importance sampling mode specifically comprises:
estimating an information density distribution from the first color value;
obtaining a weighted sampling density from the information density distribution by scaling in a preset proportion, or by scaling in a preset proportion and then shaping;
and taking the weighted sampling density as the second-layer sampling density. In this embodiment, the effect of the first-layer sampling is shown in the left half of FIG. 4, where the curve represents the information density distribution; the right half of FIG. 4 shows the second-layer sampling performed on top of the first layer: many second-layer sampling points are placed where the information density is high, and fewer where it is low.
In this embodiment, the shaping comprises one or more of interpolation, weighting, stretching a set interval, compressing a set interval, or filtering. For example, the obtained weighted sampling density distribution may contain several bumps, i.e. several weakly-visible regions; but weakly-visible regions of different degrees can show similar densities in the information density distribution, so if the weighted sampling density distribution were used directly to determine the second-layer sampling density, the differences in second-layer sampling density between weakly-visible objects of different degrees would become small. Shaping divides the range of the weighted sampling density obtained from the first-layer sampling points into several intervals; for example, high-density intervals are stretched according to the percentage of second-layer sampling points they should receive, yielding a shaped information density distribution with higher peaks. The whole weighted sampling density distribution can also be stretched so that more second-layer sampling points are allocated to the peaks; or the weighted density distribution can be fitted and adjusted to a set peak shape, giving a flatter or more prominent shaped distribution. With shaping, the number of importance samples needed is not strictly proportional to the information density distribution obtained at the different stages.
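One standard way to realize "more second-layer samples where the (possibly reshaped) weighted density is high" is inverse-transform sampling over the first-layer bins; the sketch below, including a simple exponent-based stretching, is illustrative only:

```python
import torch

def sample_second_layer(bin_edges, weights, n_samples, stretch=1.0):
    """Draw second-layer sample positions from a piecewise-constant PDF
    built from first-layer weights.  stretch > 1 sharpens the peaks (one
    possible shaping), sending more samples to high-information regions.
    bin_edges: (N+1,), weights: (N,)."""
    w = weights.clamp_min(1e-5) ** stretch          # shaped weighted density
    cdf = torch.cumsum(w / w.sum(), dim=0)
    u = torch.rand(n_samples)
    idx = torch.searchsorted(cdf, u).clamp_max(len(weights) - 1)
    lo, hi = bin_edges[idx], bin_edges[idx + 1]
    return lo + (hi - lo) * torch.rand(n_samples)   # uniform within each bin
```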
In this embodiment, the weak-visibility sampling mode specifically comprises setting a second-layer sampling density for the pixels from which the image acquisition device casts rays, according to the first depth value, and performing weak-visibility sampling to obtain weak-visibility sampling points $p_s$:

$p_s = R \cdot \left( D_1(q) \cdot K^{-1} \cdot [u, v, 1]^{T} \right) + t$

where $p_s$ is a weak-visibility sampling point, i.e. a 3D sampling point in space; $q$ is the pixel of the weak-visibility sampling point, $q = (u, v)$, with $u$ and $v$ its coordinates; $D_1$ is the first depth map and $D_1(q)$ its value at coordinate position $q$; $t$ is the camera pose translation vector; $R$ is the camera pose rotation matrix; and $K$ is the camera intrinsic matrix. The camera pose translation vector is the parameter describing the position of the camera center in the world coordinate system, while the pose rotation matrix describes the rotation of the camera coordinate frame relative to the world coordinate system.
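The back-projection just reconstructed (a pixel and its first-layer depth lifted to a world-space 3D point through K, R and t) can be sketched as follows; the camera-to-world convention for R and t is an assumption:

```python
import torch

def backproject(u, v, depth, K, R, t):
    """Lift pixel q = (u, v) with first-layer depth D1(q) to a world point:
    p_s = R @ (D1(q) * K^{-1} @ [u, v, 1]^T) + t."""
    q = torch.tensor([float(u), float(v), 1.0])
    x_cam = depth * (torch.linalg.inv(K) @ q)   # pixel ray scaled by depth
    return R @ x_cam + t                        # weak-visibility sampling point
```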
During the first-layer sampling, the feature extraction network has not yet learned the features of weakly-visible objects (their gradients vanish), so their high-frequency information cannot be learned; weak-visibility sampling according to the depth map is therefore required.
In this embodiment, at least 2 layers of sampling are performed: the first-layer sampling is uniform sampling, and the second-layer sampling density is set by importance sampling based on the first color value alone, by weak-visibility sampling based on the first depth value alone, or by importance sampling based on the first color value and weak-visibility sampling based on the first depth value in parallel.
In the normal case, the first-layer sampling is uniform sampling, and the second-layer sampling applies importance sampling and weak-visibility sampling in parallel to all targets.
This embodiment further comprises a third-layer sampling mode: the first-layer sampling is uniform, the second-layer sampling is importance sampling or weak-visibility sampling, and the third-layer sampling is whichever of importance sampling or weak-visibility sampling differs from the second layer. If third-layer sampling is used, the first and second rendering information together serve as the basis for setting the third-layer sampling density.
This embodiment further comprises a step of first selecting the combination of sampling modes according to whether the current view of the target object contains a weakly-visible region, or according to the area of that region. Specifically, semantic segmentation and instance segmentation methods are used to judge whether a weakly-visible region exists in the current view of the target object.
If a weakly-visible region exists, either the first layer is uniform sampling, the second layer is importance sampling and the third layer is weak-visibility sampling, or the first layer is uniform sampling and the second layer applies importance sampling and weak-visibility sampling in parallel. If no weakly-visible region exists, only the scheme of uniform first-layer sampling and importance second-layer sampling is used. This guarantees the quality of the key regions of interest while keeping the sampling computationally efficient.
The importance sampling mode adopted by this embodiment concentrates sampling points in important regions, and the weak-visibility sampling mode concentrates sampling points in weakly-visible regions; according to the requirements of the sampling task, or the area or importance of the various regions of the current view, the corresponding mode can be used as the third-layer sampling, further increasing precision on top of the already high precision of the second layer.
If a weakly-visible region and an important region overlap, the order of importance sampling and weak-visibility sampling does not affect the final result.
Weak-visibility sampling of all 3D objects would be hugely wasteful, since most objects carry low-frequency information that can be obtained by ordinary sampling; only weakly-visible objects need it. This embodiment uses an importance criterion to determine the sampling positions from the depth loss $L_{depth}$: a threshold $\tau_d$ is set on the per-pixel depth loss, and pixels below this threshold carry low-frequency information and are not points of interest. Besides the depth error, the SDF reconstruction error is also used to filter out the low-frequency information of large planar regions in favor of small high-frequency regions.
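A minimal sketch of this filtering rule (keep only pixels whose per-pixel depth error exceeds the threshold, so low-frequency regions are skipped), with illustrative names:

```python
import torch

def select_weak_visibility_pixels(depth_pred, depth_gt, tau_d):
    """Per-pixel squared depth error as the importance signal: pixels at
    or below tau_d are treated as low-frequency and skipped; the rest are
    candidates for weak-visibility sampling."""
    err = (depth_pred - depth_gt) ** 2
    return torch.nonzero(err > tau_d)   # (row, col) indices to resample
```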
Position encoding is performed on the weak-visibility sampling points to obtain a weak-visibility sampling point set, which is input into the feature extraction network as the second-layer sampling point set.
Step S25, calculating a feature map of the second-layer sampling point set through the feature extraction network based on the second-layer sampling point set.
Step S26, obtaining second rendering information from the feature map of the second-layer sampling point set and the current camera direction vector through the multi-branch rendering information extraction network, the second rendering information comprising a second signed distance field value, a second volume density value, a second color value, a second depth value and a second normal;
the first rendering information and the second rendering information together form the rendering information.
The method of steps S1 to S2 describes a single round, which obtains the signed distance field values, color values, depth values and normals under the current camera's current viewing angle.
By selecting another camera/viewing angle (i.e. changing the camera direction vector and camera parameters) and repeating steps S1 to S2, the signed distance field values, color values, depth values and normals of multiple cameras/viewing angles are obtained, completing model reconstruction or information acquisition for the whole scene.
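Schematically, this per-view loop could be driven as follows (the network interface is assumed for illustration):

```python
def render_all_views(cameras, network):
    """Repeat steps S1-S2 for each camera/viewing angle and collect the
    per-view rendering information (SDF, density, color, depth, normal)."""
    results = []
    for cam in cameras:
        results.append(network.render(cam.direction_vector, cam.params))
    return results
```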
And step S3, acquiring a high-resolution view or a high-resolution three-dimensional reconstruction model of the set view angle based on the rendering information.
The neural radiance field rendering network based on multiple sampling strategies is trained as follows:
step A1, acquiring a training set comprising training-set target objects together with signed distance field truth values, volume density truth values, color truth values, depth truth values and normal truth values;
step A2, acquiring training-set signed distance field values, density values, color values, depth values and normals through the neural radiance field rendering network based on multiple sampling strategies;
step A3, calculating the total loss function $L$ based on the training-set signed distance field values, density values, color values, depth values and normals and the corresponding truth values:

$L = \lambda_{pos} L_{pos} + \lambda_{d} L_{depth} + \lambda_{n} L_{normal} + \lambda_{s} L_{smooth} + \lambda_{e} L_{eik} + \lambda_{c} L_{color}$

where $L_{pos}$ is the sampling point position loss, $L_{depth}$ the depth loss, $L_{normal}$ the normal loss, $L_{smooth}$ the smoothness loss, $L_{eik}$ the optical path loss and $L_{color}$ the color loss; $\lambda_{pos}, \lambda_{d}, \lambda_{n}, \lambda_{s}, \lambda_{e}, \lambda_{c}$ are weight coefficients; the depth, normal, smoothness, optical path and color losses are all L2 losses.
In this embodiment, the sampling point position loss $L_{pos}$ is specifically:

$L_{pos} = \frac{1}{|B|} \sum_{p_s \in B} \left| f(p_s) \right|$

where $f(p_s)$ is the signed distance field value of weak-visibility sampling point $p_s$, and $B$ is the mini-batch of pixels sampled in each iteration. The sampling point position loss targets weakly-visible objects only, ensuring that the weak-visibility sampling points lie on the object surface as far as possible; it must also be constrained together with the depth loss and the overall signed distance field loss.
In this embodiment, the depth loss $L_{depth}$ is specifically:

$L_{depth} = \sum_{r \in R} \left\| \hat{D}(r) - D(r) \right\|^2$

where $r$ is a ray cast from a camera pixel, $R$ is the sampling space of rays cast from all camera pixels, $D(r)$ is the depth truth value and $\hat{D}(r)$ is the predicted depth value; the depth loss is an L2 loss.
In this embodiment, the normal loss $L_{normal}$ is specifically:

$L_{normal} = \sum_{r \in R} \left\| \hat{N}(r) - N(r) \right\|^2$

where $r$ is a ray cast from a camera pixel, $R$ is the sampling space of rays cast from all camera pixels, $N(r)$ is the normal truth value and $\hat{N}(r)$ is the predicted normal; the normal loss is an L2 loss.
In this embodiment, the smoothness loss $L_{smooth}$ is specifically:

$L_{smooth} = \frac{1}{|P|} \sum_{p \in P} \left\| \nabla f(p) - \nabla f(p + \epsilon) \right\|^2$

where $f(p)$ is the signed distance field value of sampling point $p$; $\nabla$ denotes the gradient, so $\nabla f(p)$ is the gradient of the signed distance field at $p$; $\epsilon$ is a random uniform 3D position perturbation, and $\nabla f(p+\epsilon)$ is the gradient at the perturbed sampling point; $P$ is a small batch of sampling points near the target object surface; the smoothness loss is an L2 loss.
In this embodiment, the optical path (eikonal) loss $L_{eik}$ is specifically:

$L_{eik} = \frac{1}{|P_1|} \sum_{p \in P_1} \left( \left\| \nabla f(p) \right\|_2 - 1 \right)^2$

where $p$ is a first-layer sampling point and $P_1$ is a small batch of first-layer sampling points; $\nabla f(p)$ is the gradient of the signed distance field value $f(p)$ of a first-layer sampling point; the optical path loss is an L2 loss.
In this embodiment, the color loss $L_{color}$ is specifically:

$L_{color} = \sum_{r \in R} \left( \left\| \hat{C}_1(r) - C(r) \right\|^2 + \left\| \hat{C}_2(r) - C(r) \right\|^2 \right)$

where $r$ is a ray cast from a camera pixel and $R$ is the sampling space of rays cast from all camera pixels; $\hat{C}_1(r)$ is the screen pixel color value predicted from the first-layer samples, $\hat{C}_2(r)$ the one predicted from the second-layer samples, and $C(r)$ the true screen pixel color; the color loss is an L2 loss. The sum over all screen rays of the L2 error between predicted and true values gives the color loss function. The loss of the uniform sampling network is computed because that network provides the importance-sampling network with the weights used to distribute samples: if the volume density it outputs is inaccurate, the second-layer sample allocation is affected.
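Putting the reconstructed terms together, a hedged sketch of the total training loss (dictionary keys and weight names are placeholders, not the patent's notation):

```python
import torch

def total_loss(out, gt, lam):
    """Weighted sum of the six loss terms.  out/gt are dicts of ray- and
    point-wise tensors; lam holds the weight coefficients lambda_*."""
    l2 = lambda a, b: ((a - b) ** 2).sum()
    L_pos    = out["sdf_weak"].abs().mean()                     # weak-visibility points on the surface
    L_depth  = l2(out["depth"],  gt["depth"])                   # per-ray depth, L2
    L_normal = l2(out["normal"], gt["normal"])                  # per-ray normal, L2
    L_smooth = ((out["grad"] - out["grad_perturbed"]) ** 2).sum(-1).mean()  # SDF gradient smoothness
    L_eik    = ((out["grad1"].norm(dim=-1) - 1.0) ** 2).mean()  # unit-length gradient (optical path)
    L_color  = l2(out["color1"], gt["color"]) + l2(out["color2"], gt["color"])
    return (lam["pos"] * L_pos + lam["d"] * L_depth + lam["n"] * L_normal
            + lam["s"] * L_smooth + lam["e"] * L_eik + lam["c"] * L_color)
```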
A separate loss term may also be set for the volume density values and incorporated into the total loss function.
Step A4, adjusting the network parameters and repeating steps A1 to A3 until the total loss function falls below a set first threshold, giving the trained neural radiance field rendering network based on multiple sampling strategies.
Although the steps are described in the above sequential order in the above embodiments, those skilled in the art will appreciate that, to achieve the effects of the embodiments, the steps need not be performed in that order; they may be performed simultaneously (in parallel) or in reverse order, and such simple variations are within the scope of the present invention.
A second embodiment of the present invention is a neural radiance field rendering system based on multiple sampling strategies, the system comprising:
a shooting parameter acquisition module, configured to aim a camera at a target object and acquire a current camera direction vector and camera parameters;
a rendering module, configured to acquire rendering information through the camera and a neural radiance field rendering network based on multiple sampling strategies, the rendering information comprising a signed distance field value, a volume density value, a color value, a depth value and a normal of the target object;
performing first-layer sampling and position encoding on the target object in a uniform sampling manner through an image acquisition device and a sampling rendering network to obtain a first-layer sampling point set;
calculating a feature map of the first-layer sampling point set through a feature extraction network based on the first-layer sampling point set;
obtaining first rendering information from the feature map of the first-layer sampling point set and the current camera direction vector through a multi-branch rendering information extraction network, the first rendering information comprising a first signed distance field value, a first volume density value, a first color value, a first depth value and a first normal;
setting a second-layer sampling density according to the first rendering information, and performing second-layer sampling and position encoding on the target object according to the second-layer sampling density to obtain a second-layer sampling point set;
calculating a feature map of the second-layer sampling point set through the feature extraction network based on the second-layer sampling point set;
obtaining second rendering information from the feature map of the second-layer sampling point set and the current camera direction vector through the multi-branch rendering information extraction network, the second rendering information comprising a second signed distance field value, a second volume density value, a second color value, a second depth value and a second normal;
the first rendering information and the second rendering information together forming the rendering information;
and an image reconstruction module, configured to acquire a high-resolution view or a high-resolution three-dimensional reconstruction model at a set viewing angle based on the rendering information.
It should be noted that, in the neural radiance field rendering system based on multiple sampling strategies provided in the foregoing embodiments, only the division into the above functional modules is illustrated; in practical applications, the above functions may be allocated to different functional modules as needed, that is, the modules or steps in the embodiments of the present invention may be further decomposed or combined. For example, the modules of the above embodiments may be combined into one module, or further split into multiple sub-modules, to complete all or part of the functions described above. The names of the modules and steps in the embodiments of the present invention serve only to distinguish the respective modules or steps and are not to be construed as unduly limiting the present invention.
An electronic device of a third embodiment of the present invention includes:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the processor, the instructions being executed by the processor to implement the above neural radiance field rendering method based on multiple sampling strategies.
A fourth embodiment of the present invention is a computer-readable storage medium storing computer instructions for execution by a computer to implement the above neural radiance field rendering method based on multiple sampling strategies.
It will be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working process of the storage device and the processing device described above and the related description may refer to the corresponding process in the foregoing method embodiment, which is not repeated herein.
Those of skill in the art will appreciate that the various illustrative modules and method steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that programs corresponding to the software modules and method steps may be stored in random access memory (RAM), memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art. To clearly illustrate this interchangeability of electronic hardware and software, various illustrative components and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as electronic hardware or software depends upon the particular application and design constraints imposed on the solution. Those skilled in the art may implement the described functionality using different approaches for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The terms "first," "second," and the like, are used for distinguishing between similar objects and not for describing a particular sequential or chronological order.
The terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus/apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus/apparatus.
Thus far, the technical solution of the present invention has been described in connection with the preferred embodiments shown in the drawings, but it is easily understood by those skilled in the art that the scope of protection of the present invention is not limited to these specific embodiments. Equivalent modifications and substitutions for related technical features may be made by those skilled in the art without departing from the principles of the present invention, and such modifications and substitutions will be within the scope of the present invention.

Claims (10)

1. A neural radiation field rendering method based on a plurality of sampling strategies, the method comprising:
step S1, aligning a camera with a target object, and acquiring a current camera direction vector and camera parameters;
step S2, acquiring rendering information of the target object through the camera and a neural radiation field rendering network based on multiple sampling strategies, wherein the rendering information comprises a signed distance field information value, a volume density value, a color value, a depth value and a normal;
the neural radiation field rendering network based on various sampling strategies specifically comprises:
a sampling rendering network, a feature extraction network and a multi-branch rendering information extraction network;
the sampling rendering network comprises a sampling module and a position coding unit;
the feature extraction network is constructed based on a first multi-layer perceptron with skip connections;
the multi-branch rendering information extraction network comprises a parallel volume density acquisition branch, a color acquisition branch, a depth acquisition branch and a normal acquisition branch; the volume density acquisition branch comprises a second multi-layer perceptron; the color acquisition branch comprises a position coding unit and a third multi-layer perceptron; the depth acquisition branch comprises a position coding unit and a fourth multi-layer perceptron; the normal acquisition branch comprises a position coding unit and a fifth multi-layer perceptron;
step S21, performing first layer sampling and position coding on the target object in a uniform sampling mode through an image acquisition device and the sampling rendering network to obtain a first layer sampling point set;
step S22, calculating a feature map of the first layer sampling point set through the feature extraction network based on the first layer sampling point set;
step S23, acquiring first rendering information from the feature map of the first layer sampling point set and the current camera direction vector through the multi-branch rendering information extraction network, wherein the first rendering information comprises a first signed distance field information value, a first volume density value, a first color value, a first depth value and a first normal;
step S24, setting a second layer sampling density according to the first rendering information, and carrying out second layer sampling and position coding on the target object according to the second layer sampling density to obtain a second layer sampling point set;
step S25, calculating a feature map of the second layer sampling point set through the feature extraction network based on the second layer sampling point set;
step S26, acquiring second rendering information from the feature map of the second layer sampling point set and the current camera direction vector through the multi-branch rendering information extraction network, wherein the second rendering information comprises a second signed distance field information value, a second volume density value, a second color value, a second depth value and a second normal;
the first rendering information and the second rendering information form the rendering information;
and step S3, acquiring a high-resolution view or a high-resolution three-dimensional reconstruction model of the set viewing angle based on the rendering information.
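By way of illustration only, the two-pass flow of steps S21 to S26 can be sketched in a few lines of PyTorch. In the sketch below, feature_net (the first multi-layer perceptron) and branches (the four parallel heads returning signed distance field value, volume density, color, depth and normal) are hypothetical stand-ins for networks the claim does not fully specify, and deriving the second layer sampling density from volume-rendering weights is one plausible reading of step S24, not the only one.

    import torch

    def render_two_pass(rays_o, rays_d, feature_net, branches,
                        n_coarse=64, n_fine=64, near=0.1, far=6.0):
        # Step S21: uniform first layer samples along each ray.
        t1 = torch.linspace(near, far, n_coarse, device=rays_o.device)
        t1 = t1.expand(rays_o.shape[0], n_coarse).contiguous()
        pts1 = rays_o[:, None, :] + t1[..., None] * rays_d[:, None, :]

        # Steps S22 and S23: shared features, then the four parallel heads.
        feats1 = feature_net(pts1)
        sdf1, sigma1, rgb1, depth1, normal1 = branches(feats1, rays_d)

        # Step S24: turn first-pass densities into volume-rendering weights
        # and resample along each ray at the implied second layer density.
        deltas = t1[:, 1:] - t1[:, :-1]
        alpha = 1.0 - torch.exp(-sigma1[:, :-1] * deltas)
        trans = torch.cumprod(
            torch.cat([torch.ones_like(alpha[:, :1]), 1.0 - alpha + 1e-10], dim=-1),
            dim=-1)[:, :-1]
        weights = alpha * trans
        idx = torch.multinomial(weights + 1e-8, n_fine, replacement=True)
        t2, _ = torch.sort(torch.gather(t1[:, :-1], 1, idx), dim=-1)
        pts2 = rays_o[:, None, :] + t2[..., None] * rays_d[:, None, :]

        # Steps S25 and S26: second feature pass and second rendering pass.
        feats2 = feature_net(pts2)
        return branches(feats2, rays_d)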
2. The method for rendering the neural radiation field based on the plurality of sampling strategies according to claim 1, wherein the step S23 comprises:
step S231, inputting the feature map of the first layer sampling point set into the volume density acquisition branch,
and respectively inputting the feature map of the first layer sampling point set and the current camera direction vector into the parallel color acquisition branch, depth acquisition branch and normal acquisition branch;
step S232, encoding the current camera direction vector through the position coding unit to obtain a direction coding vector;
step S233, acquiring a first signed distance field information value through the second multi-layer perceptron based on the feature map of the first layer sampling point set, and further acquiring a first volume density value;
and acquiring a first color value, a first depth value and a first normal through the third multi-layer perceptron, the fourth multi-layer perceptron and the fifth multi-layer perceptron, respectively, based on the feature map of the first layer sampling point set and the direction coding vector.
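The claims do not spell out the position coding units; a common choice, assumed here purely for illustration, is the frequency encoding of the original NeRF formulation, applied to the camera direction vector before the color, depth and normal branches (the frequency count n_freqs is likewise an assumption):

    import torch

    def positional_encoding(x, n_freqs=10):
        # gamma(x): sin/cos at exponentially growing frequencies, applied
        # per coordinate, as in the standard NeRF frequency encoding.
        freqs = (2.0 ** torch.arange(n_freqs, dtype=x.dtype, device=x.device)) * torch.pi
        angles = x[..., None] * freqs                 # (..., dim, n_freqs)
        enc = torch.cat([angles.sin(), angles.cos()], dim=-1)
        return enc.flatten(-2)                        # (..., dim * 2 * n_freqs)

    # Encoding a batch of camera direction vectors for the color branch:
    dirs = torch.nn.functional.normalize(torch.randn(1024, 3), dim=-1)
    dir_enc = positional_encoding(dirs, n_freqs=4)    # shape (1024, 24)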
3. The method of claim 1, wherein the setting of a second layer sampling density according to the first rendering information adopts one or both of two parallel modes: an importance sampling mode and an amblyopia sampling mode;
the amblyopia sampling mode specifically comprises: setting a second layer sampling density for the pixels of the rays emitted by the pixels of the image acquisition device according to the first depth value, and performing amblyopia sampling to obtain amblyopia sampling points

$\hat{P} = R \cdot \left( D(u, v) \cdot K^{-1} \cdot [u, v, 1]^{T} \right) + t$

wherein $\hat{P}$ denotes an amblyopia sampling point, expressed as a 3D sampling point in space; $(u, v)$ denotes the pixel of the amblyopia sampling point, with $u$ and $v$ representing its coordinates; $D$ denotes the first depth map, and $D(u, v)$ denotes the point at coordinate position $(u, v)$ in the first depth map; $t$ denotes the camera pose translation vector; $R$ denotes the camera pose rotation matrix; $K$ denotes the camera intrinsic parameters;
and carrying out position coding on the amblyopia sampling points to obtain an amblyopia sampling point set, and taking the amblyopia sampling point set as a second layer sampling point set to be input into the feature extraction network.
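The back-projection in the amblyopia sampling mode is the standard pinhole unprojection. A minimal NumPy sketch, assuming the formula as reconstructed above and purely illustrative camera parameters:

    import numpy as np

    def backproject_pixel(u, v, depth_map, K, R, t):
        # Lift pixel (u, v) to a 3D amblyopia sampling point using the first
        # layer depth map: P = R (D(u, v) K^{-1} [u, v, 1]^T) + t.
        d = float(depth_map[v, u])            # first depth value D(u, v)
        pix = np.array([u, v, 1.0])           # homogeneous pixel coordinate
        cam = d * (np.linalg.inv(K) @ pix)    # point in camera coordinates
        return R @ cam + t                    # world-space 3D sampling point

    # Hypothetical 640x480 camera with identity pose:
    K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
    R, t = np.eye(3), np.zeros(3)
    depth_map = np.full((480, 640), 2.0)
    P = backproject_pixel(100, 200, depth_map, K, R, t)   # [-0.88, -0.16, 2.0]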
4. The neural radiation field rendering method based on multiple sampling strategies according to claim 3, wherein the importance sampling mode is specifically:
estimating an information density distribution from the first color value;
converting the information density distribution in a preset proportion, or converting it in a preset proportion and reshaping it, to obtain a weighted sampling density;
and taking the weighted sampling density as the second layer sampling density.
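One plausible realization of the conversion to a weighted sampling density — assumed here, since the claim does not pin down the "preset proportion" — is inverse-transform sampling of the estimated information density along each ray:

    import torch

    def weighted_sampling_density(bins, weights, n_samples):
        # Normalize the per-ray density into a PDF, build its CDF, and draw
        # second layer sample positions by inverse-transform sampling.
        pdf = weights / (weights.sum(dim=-1, keepdim=True) + 1e-8)
        cdf = torch.cumsum(pdf, dim=-1)
        cdf = torch.cat([torch.zeros_like(cdf[..., :1]), cdf], dim=-1)

        u = torch.rand(*cdf.shape[:-1], n_samples, device=bins.device)
        idx = torch.searchsorted(cdf, u, right=True).clamp(1, bins.shape[-1] - 1)
        below = torch.gather(bins, -1, idx - 1)
        above = torch.gather(bins, -1, idx)
        seg = torch.gather(pdf, -1, idx - 1) + 1e-8
        frac = ((u - torch.gather(cdf, -1, idx - 1)) / seg).clamp(0.0, 1.0)
        return below + frac * (above - below)

    # 128 second layer samples per ray from 64 first layer bins:
    bins = torch.linspace(0.1, 6.0, 64).expand(1024, 64).contiguous()
    weights = torch.rand(1024, 64)
    t_fine = weighted_sampling_density(bins, weights, 128)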
5. The multiple sampling strategy based neural radiation field rendering method of claim 4, wherein the training method of the neural radiation field rendering network based on multiple sampling strategies comprises:
step A1, acquiring a training set; the training set comprises a training set target object, a signed distance field information truth value, a volume density truth value, a color truth value, a depth truth value and a normal truth value;
step A2, acquiring a training set signed distance field information value, a training set density value, a training set color value, a training set depth value and a training set normal through the neural radiation field rendering network based on the multiple sampling strategies;
step A3, calculating a total loss function $L$ based on the training set signed distance field information value, the training set density value, the training set color value, the training set depth value and the training set normal, together with the signed distance field information truth value, the color truth value, the depth truth value and the normal truth value:

$L = \lambda_1 L_{loc} + \lambda_2 L_d + \lambda_3 L_n + \lambda_4 L_s + \lambda_5 L_e + \lambda_6 L_c$

wherein $L_{loc}$ is the sampling point position loss; $L_d$ is the depth loss; $L_n$ is the normal loss; $L_s$ is the smoothness loss; $L_e$ is the optical path loss; $L_c$ is the color loss; and $\lambda_1$ to $\lambda_6$ are weight coefficients; the depth loss $L_d$, normal loss $L_n$, smoothness loss $L_s$, optical path loss $L_e$ and color loss $L_c$ are each of L2 type;
and step A4, adjusting the network parameters and repeating steps A1 to A3 until the total loss function is lower than a set first threshold value, thereby obtaining the trained neural radiation field rendering network based on multiple sampling strategies.
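A minimal sketch of step A3's weighted sum; the weight coefficient values below are illustrative assumptions, not values disclosed in the patent:

    import torch

    def total_loss(losses, lambdas):
        # L = lambda_1*L_loc + lambda_2*L_d + ... + lambda_6*L_c (step A3).
        return sum(lambdas[name] * value for name, value in losses.items())

    # Assumed weighting of the six terms:
    lambdas = {"loc": 1.0, "depth": 0.1, "normal": 0.05,
               "smooth": 0.005, "path": 0.1, "color": 1.0}
    losses = {name: torch.tensor(0.5) for name in lambdas}   # placeholder values
    L = total_loss(losses, lambdas)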
6. The multiple sampling strategy based neural radiation field rendering method of claim 5, wherein the sampling point position loss $L_{loc}$ is specifically:

$L_{loc} = \frac{1}{|B|} \sum_{\hat{P} \in B} \left| f(\hat{P}) \right|$

wherein $f(\hat{P})$ denotes the signed distance field information value of the amblyopia sampling point $\hat{P}$, and $B$ denotes the minibatch of sampled pixels in each iteration.
7. The multiple sampling strategy based neural radiation field rendering method of claim 6, wherein the depth loss $L_d$, normal loss $L_n$, smoothness loss $L_s$, optical path loss $L_e$ and color loss $L_c$ are specifically as follows:

the depth loss $L_d$ is:

$L_d = \sum_{r \in \mathcal{R}} \left\| \hat{D}(r) - D(r) \right\|_2^2$

wherein $r$ denotes the ray emitted by each camera pixel; $\mathcal{R}$ denotes the sampling space of the rays emitted by all camera pixels; $D(r)$ denotes the truth value of the depth and $\hat{D}(r)$ the predicted depth value; the depth loss is an L2 loss;

the normal loss $L_n$ is:

$L_n = \sum_{r \in \mathcal{R}} \left\| \hat{N}(r) - N(r) \right\|_2^2$

wherein $r$ denotes the ray emitted by each camera pixel; $\mathcal{R}$ denotes the sampling space of the rays emitted by all camera pixels; $N(r)$ denotes the truth value of the normal and $\hat{N}(r)$ the predicted normal; the normal loss is an L2 loss;

the smoothness loss $L_s$ is:

$L_s = \sum_{P \in \mathcal{S}} \left\| \nabla f(P) - \nabla f(P + \epsilon) \right\|_2^2$

wherein $f(P)$ denotes the signed distance field information value of the sampling point $P$; $\nabla$ denotes the gradient, so that $\nabla f(P)$ is the gradient of the signed distance field information at $P$; $\epsilon$ denotes a random uniform 3D position perturbation, and $\nabla f(P + \epsilon)$ is the gradient of the signed distance field information value at the perturbed sampling point; $\mathcal{S}$ denotes a small batch of sampling points near the surface of the target object; the smoothness loss is an L2 loss;

the optical path loss $L_e$ is:

$L_e = \sum_{x \in \mathcal{X}} \left( \left\| \nabla f(x) \right\|_2 - 1 \right)^2$

wherein $x$ denotes a first layer sampling point and $\mathcal{X}$ a small batch of first layer sampling points; $\nabla$ denotes the gradient; $f(x)$ denotes the signed distance field information value of the first layer sampling point, and $\nabla f(x)$ its gradient; the optical path loss is an L2 loss;

the color loss $L_c$ is:

$L_c = \sum_{r \in \mathcal{R}} \left( \left\| \hat{C}_1(r) - C(r) \right\|_2^2 + \left\| \hat{C}_2(r) - C(r) \right\|_2^2 \right)$

wherein $r$ denotes the ray emitted by each camera pixel; $\mathcal{R}$ denotes the sampling space of the rays emitted by all camera pixels; $\hat{C}_1(r)$ denotes the predicted screen pixel color value from the first layer sampling and $\hat{C}_2(r)$ that from the second layer sampling; $C(r)$ denotes the truth value of the screen pixel color; the color loss is an L2 loss.
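Read together, claims 6 and 7 reduce to a handful of simple penalty terms. The PyTorch sketch below uses mean rather than sum reductions (which only rescales the weight coefficients) and interprets the optical path loss as an eikonal regularizer; both are assumptions on top of the reconstructed formulas above:

    import torch
    import torch.nn.functional as F

    def location_loss(sdf_at_points):
        # Claim 6: the SDF should vanish at the amblyopia sampling points
        # lying on the depth-derived surface (assumed |f(P)| penalty).
        return sdf_at_points.abs().mean()

    def depth_loss(d_pred, d_true):
        # L2 over the rays emitted by all camera pixels.
        return F.mse_loss(d_pred, d_true)

    def normal_loss(n_pred, n_true):
        # L2 between predicted and ground-truth normals per ray.
        return F.mse_loss(n_pred, n_true)

    def smoothness_loss(grad_f, grad_f_perturbed):
        # SDF gradients at near-surface points versus the same points under
        # a random uniform 3D perturbation epsilon.
        return F.mse_loss(grad_f, grad_f_perturbed)

    def optical_path_loss(grad_f):
        # Interpreted as an eikonal term (||grad f|| - 1)^2 on first layer
        # sample points.
        return ((grad_f.norm(dim=-1) - 1.0) ** 2).mean()

    def color_loss(c1_pred, c2_pred, c_true):
        # Both the first and the second layer color predictions are
        # supervised against the same ground-truth pixel color.
        return F.mse_loss(c1_pred, c_true) + F.mse_loss(c2_pred, c_true)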
8. A neural radiation field rendering system based on a plurality of sampling strategies, the system comprising:
the shooting parameter acquisition module is configured to aim a camera at a target object and acquire a current camera direction vector and camera parameters;
The rendering module is configured to acquire rendering information through a camera and a neural radiation field rendering network based on various sampling strategies, wherein the rendering information comprises a signed distance field information value, a volume density value, a color value, a depth value and a normal of a target object;
the neural radiation field rendering network based on various sampling strategies specifically comprises:
a sampling rendering network, a feature extraction network and a multi-branch rendering information extraction network;
the sampling rendering network comprises a sampling module and a position coding unit;
the feature extraction network is constructed based on a first multi-layer perceptron with skip connections;
the multi-branch rendering information extraction network comprises a parallel volume density acquisition branch, a color acquisition branch, a depth acquisition branch and a normal acquisition branch; the volume density acquisition branch comprises a second multi-layer perceptron; the color acquisition branch comprises a position coding unit and a third multi-layer perceptron; the depth acquisition branch comprises a position coding unit and a fourth multi-layer perceptron; the normal acquisition branch comprises a position coding unit and a fifth multi-layer perceptron;
performing first layer sampling and position coding on the target object in a uniform sampling mode through the image acquisition device and the sampling rendering network to obtain a first layer sampling point set;
calculating a feature map of the first layer sampling point set through the feature extraction network based on the first layer sampling point set;
acquiring first rendering information from the feature map of the first layer sampling point set and the current camera direction vector through the multi-branch rendering information extraction network, wherein the first rendering information comprises a first signed distance field information value, a first volume density value, a first color value, a first depth value and a first normal;
setting a second layer sampling density according to the first rendering information, and performing second layer sampling and position coding on the target object according to the second layer sampling density to obtain a second layer sampling point set;
calculating a feature map of the second layer sampling point set through the feature extraction network based on the second layer sampling point set;
acquiring second rendering information from the feature map of the second layer sampling point set and the current camera direction vector through the multi-branch rendering information extraction network, wherein the second rendering information comprises a second signed distance field information value, a second volume density value, a second color value, a second depth value and a second normal;
the first rendering information and the second rendering information form the rendering information;
and the image reconstruction module is configured to acquire a high-resolution view or a high-resolution three-dimensional reconstruction model of the set viewing angle based on the rendering information.
9. An electronic device, comprising:
at least one processor; and
a memory communicatively connected to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to perform the neural radiation field rendering method based on multiple sampling strategies of any one of claims 1-7.
10. A computer-readable storage medium storing computer instructions which, when executed by a computer, implement the neural radiation field rendering method based on multiple sampling strategies of any one of claims 1-7.
CN202311748881.0A 2023-12-19 2023-12-19 Neural radiation field rendering method, system and equipment based on various sampling strategies Active CN117456078B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311748881.0A CN117456078B (en) 2023-12-19 2023-12-19 Neural radiation field rendering method, system and equipment based on various sampling strategies

Publications (2)

Publication Number Publication Date
CN117456078A CN117456078A (en) 2024-01-26
CN117456078B (en) 2024-03-26

Family

ID=89585799

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311748881.0A Active CN117456078B (en) 2023-12-19 2023-12-19 Neural radiation field rendering method, system and equipment based on various sampling strategies

Country Status (1)

Country Link
CN (1) CN117456078B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117745916B (en) * 2024-02-19 2024-05-31 Beijing Xuanguang Technology Co., Ltd. Three-dimensional rendering method and system for multiple multi-type blurred images
CN117745924B (en) * 2024-02-19 2024-05-14 Beijing Xuanguang Technology Co., Ltd. Neural rendering method, system and equipment based on depth unbiased estimation

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20230154101A1 (en) * 2021-11-16 2023-05-18 Disney Enterprises, Inc. Techniques for multi-view neural object modeling

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022246724A1 (en) * 2021-05-27 2022-12-01 Guangdong Oppo Mobile Telecommunications Corp., Ltd. Point cloud decoding and upsampling and model training methods and apparatus
WO2023080921A1 (en) * 2021-11-03 2023-05-11 Google Llc Neural radiance field generative modeling of object classes from single two-dimensional views
CN114241113A (en) * 2021-11-26 2022-03-25 Zhejiang University Efficient nerve radiation field rendering method based on depth-guided sampling
CN114972632A (en) * 2022-04-21 2022-08-30 Alibaba DAMO Academy (Hangzhou) Technology Co., Ltd. Image processing method and device based on nerve radiation field
CN115880419A (en) * 2022-08-20 2023-03-31 Zhejiang University Neural implicit surface generation and interaction method based on voxels
CN117132516A (en) * 2023-07-06 2023-11-28 East China Normal University New view synthesis method based on convoluting nerve radiation field
CN117036612A (en) * 2023-08-18 2023-11-10 Wuhan Chuangsheng Wuxian Digital Technology Co., Ltd. Three-dimensional reconstruction method based on nerve radiation field
CN117058302A (en) * 2023-08-29 2023-11-14 Beihang University NeRF-based generalizable scene rendering method
CN117173343A (en) * 2023-11-03 2023-12-05 Beijing Xuanguang Technology Co., Ltd. Relighting method and relighting system based on nerve radiation field

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
Roessle, B. et al., "Dense Depth Priors for Neural Radiance Fields from Sparse Input Views," 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022-09-27, pp. 12882-12891 *
Shilei Sun et al., "Efficient ray sampling for radiance fields reconstruction," Computers & Graphics, vol. 118, 2023-11-24, pp. 48-59 *
Yuqi Han et al., "Learning-based ray sampling strategy for computation efficient neural radiance field generation," Optoelectronic Imaging and Multimedia Technology IX, vol. 12317, 2023-01-04, pp. 1-7 *
Chen Wang et al., "NeRF-SR: High Quality Neural Radiance Fields using Supersampling," Proceedings of the 30th ACM International Conference on Multimedia, 2022-10-31, pp. 6445-6454 *
Ma Hansheng et al., "A survey of multi-view synthesis techniques for neural radiance fields," Computer Engineering and Applications, 2023-08-11, pp. 1-18 *

Also Published As

Publication number Publication date
CN117456078A (en) 2024-01-26


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant