CN114065928A - Virtual data generation method and device, electronic equipment and storage medium - Google Patents


Info

Publication number
CN114065928A
CN114065928A (application CN202010750439.1A)
Authority
CN
China
Prior art keywords
building
plane
view
virtual
image
Prior art date
Legal status
Pending
Application number
CN202010750439.1A
Other languages
Chinese (zh)
Inventor
沈宇翔
王再冉
黄海斌
蔡东阳
刘裕峰
郭小燕
石磊
Current Assignee
Beijing Dajia Internet Information Technology Co Ltd
Original Assignee
Beijing Dajia Internet Information Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Dajia Internet Information Technology Co Ltd filed Critical Beijing Dajia Internet Information Technology Co Ltd
Priority to CN202010750439.1A
Publication of CN114065928A
Legal status: Pending


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 17/00 Three dimensional [3D] modelling, e.g. data description of 3D objects

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Graphics (AREA)
  • Geometry (AREA)
  • Processing Or Creating Images (AREA)

Abstract

The present disclosure relates to a virtual data generation method, an apparatus, an electronic device, and a storage medium. The method includes: selecting a pre-constructed virtual building model, the virtual building model being composed of a plurality of building planes, each building plane being provided with a corresponding building label; performing multi-view sampling on the virtual building model to obtain a plurality of view images of the virtual building model, each view image including at least one building view plane, each building view plane being the view of its corresponding building plane at the viewing angle of that view image; enhancing each building view plane in each view image together with its corresponding building label to obtain a training plane and a training label for each building view plane; and using the training planes and training labels as virtual data for neural network training. The method can effectively improve the quality of neural network training data.

Description

Virtual data generation method and device, electronic equipment and storage medium
Technical Field
The present disclosure relates to the field of computer vision, and in particular, to a method and an apparatus for generating virtual data, an electronic device, and a storage medium.
Background
In the related art of computer vision, a computing device acquires the capability to detect a specific object by training a neural network. How well the trained network detects depends on whether it has a stable network structure and accurate network parameters. The selection of training data is therefore critical: the volume and quality of the data, the quality of its labels, and the cost of collection largely determine the detection accuracy of the trained network.
Further, when training a neural network for building surface detection, the training data are typically collected from real buildings. Such collection is heavily affected by external factors, and once the real building data are collected, labeling them manually is laborious and error-prone. Both effects degrade the training accuracy of a neural network for building surface detection and thus reduce its detection capability.
Disclosure of Invention
The present disclosure provides a virtual data generation method, an apparatus, an electronic device, and a storage medium, which at least solve the problem in the related art that, when real building data are used as training data, external factors degrade the training accuracy of a neural network for building surface detection and reduce its detection capability. The technical solution of the disclosure is as follows:
according to a first aspect of the embodiments of the present disclosure, there is provided a virtual data generation method, including:
selecting a pre-constructed virtual building model, wherein the virtual building model is composed of a plurality of building planes, and each building plane is provided with a corresponding building label; the building plane is an outer surface area of the virtual building model;
performing multi-view sampling on the virtual building model to obtain a plurality of view images of the virtual building model, wherein each view image comprises at least one building view plane, and each building view plane is the view of its corresponding building plane at the viewing angle corresponding to the view image;
enhancing each building view plane in each view image and a building label corresponding to the building view plane to obtain a training plane corresponding to each building view plane and a training label corresponding to the training plane;
and taking each training plane in each visual angle image and the training label corresponding to each training plane as virtual data for neural network training.
Optionally, the building process of the virtual building model includes:
acquiring a preset geometric body, wherein the geometric body consists of a plurality of triangular surfaces;
randomly selecting one triangular surface from the plurality of triangular surfaces as a starting triangular surface;
dividing a first building plane corresponding to the starting triangular surface in the geometric body, and allocating a building label corresponding to the first building plane, wherein the first building plane is composed of at least one triangular surface, each triangular surface composing the first building plane at least comprises the starting triangular surface, and the similarity between the normals of any two adjacent triangular surfaces in each triangular surface composing the first building plane is smaller than a preset similarity threshold;
and randomly selecting a new starting triangular surface from the remaining triangular surfaces, dividing the building plane corresponding to the new starting triangular surface in the geometric body and allocating its building label, until every triangular surface in the geometric body has been divided into a corresponding building plane, thereby completing the construction of the virtual building model.
Optionally, the performing multi-view sampling on the virtual building model to obtain multiple view images of the virtual building model includes:
using a virtual camera to take multi-view pictures of the virtual building model to obtain a plurality of building images of the virtual building model;
for each building image, when the building image includes a first perspective plane of the virtual building model, removing areas which do not meet sampling conditions in each first perspective plane in the building image, and obtaining a plurality of perspective images of the virtual building model.
Optionally, removing the area, which does not satisfy the sampling condition, in each of the first perspective planes in the building image to obtain multiple perspective images of the virtual building model, includes:
in response to determining that a first perspective plane including an occlusion map exists in each first perspective plane in the building image, performing intersection operation on the first perspective plane including the occlusion map and the occlusion map included in the first perspective plane, removing the occlusion map, and obtaining an intersection plane corresponding to the first perspective plane including the occlusion map;
determining each first visual angle plane not containing an occlusion map and each intersection plane in the building image as a target plane;
and performing detection-frame detection on each target plane, removing the target planes whose detection-frame area is smaller than a preset detection area to obtain the view image corresponding to the building image, and determining the remaining target planes, whose detection-frame areas are not smaller than the preset detection area, as the building view planes in the view image.
Optionally, the process of determining that there is a first perspective plane including an occlusion map in each first perspective plane in the building image includes:
shooting a first image corresponding to the building image at a shooting angle corresponding to the building image; the first image comprises contour information of each first view angle plane and a building label of each first view angle plane at the shooting angle;
for each first perspective plane, in response to detecting that there is an area of the outline information of the first perspective plane, which is not marked with a building label, determining that the first perspective plane contains a corresponding occlusion map.
Optionally, the process of determining that there is a first perspective plane including an occlusion map in each first perspective plane in the building image includes:
acquiring a standard image of a shooting angle corresponding to the building image, wherein the standard image comprises standard view planes corresponding to first view planes of the building image under a scene without obstacle simulation;
and comparing each first visual angle plane of each building image with a standard visual angle plane corresponding to the first visual angle plane, and determining the first visual angle plane as the first visual angle plane comprising the occlusion map in response to detecting that the first visual angle plane is inconsistent with the standard visual angle plane.
Optionally, the enhancing processing is performed on each building view plane in each view image and the building label corresponding to the building view plane to obtain a training plane corresponding to each building view plane and a training label corresponding to the training plane, and the enhancing processing includes:
enhancing the plane characteristics of each building visual angle plane and the label characteristics of each building label corresponding to the building visual angle plane to obtain an enhanced building visual angle plane and an enhanced building label;
and performing feature splicing on the plane features of each building visual angle plane and the plane features of the enhanced building visual angle plane corresponding to the building visual angle plane to obtain a training plane corresponding to the building visual angle plane, and performing feature splicing on the label features of the building labels corresponding to the building visual angle plane and the label features of the enhanced building labels corresponding to the building labels to obtain training labels corresponding to the training plane.
According to a second aspect of the embodiments of the present disclosure, there is provided a virtual data generation apparatus including:
a selecting unit configured to perform selecting a pre-constructed virtual building model, the virtual building model being composed of a plurality of building planes, each building plane being provided with a corresponding building tag; the building plane is an outer surface area of the virtual building model;
the sampling unit is configured to perform multi-view sampling on the virtual building model to obtain a plurality of view images of the virtual building model, each view image comprises at least one building view plane, and each building view plane is a view of a building plane corresponding to the view image under a view corresponding to the view image;
the enhancement unit is configured to perform enhancement processing on each building view plane in each view image and a building label corresponding to the building view plane to obtain a training plane corresponding to each building view plane and a training label corresponding to the training plane;
and the processing unit is configured to use the training planes in each view image and the training labels corresponding to the training planes as virtual data for neural network training.
Optionally, the virtual data generating apparatus further includes a virtual building model constructing unit; the virtual building model construction unit is configured to perform:
acquiring a preset geometric body, wherein the geometric body consists of a plurality of triangular surfaces;
randomly selecting one triangular surface from the plurality of triangular surfaces as a starting triangular surface;
dividing a first building plane corresponding to the starting triangular surface in the geometric body, and allocating a building label corresponding to the first building plane, wherein the first building plane is composed of at least one triangular surface, each triangular surface composing the first building plane at least comprises the starting triangular surface, and the similarity between the normals of any two adjacent triangular surfaces in each triangular surface composing the first building plane is smaller than a preset similarity threshold;
and randomly selecting a new starting triangular surface from the remaining triangular surfaces, dividing the building plane corresponding to the new starting triangular surface in the geometric body and allocating its building label, until every triangular surface in the geometric body has been divided into a corresponding building plane, thereby completing the construction of the virtual building model.
Optionally, the sampling unit includes:
the photographing sub-unit is configured to execute multi-view photographing of the virtual building model by applying a virtual camera to obtain a plurality of building images of the virtual building model;
and the perspective image processing subunit is configured to execute, for each building image, when a first perspective plane of the virtual building model is included in the building image, removing areas which do not meet sampling conditions in each first perspective plane in the building image, and obtaining a plurality of perspective images of the virtual building model.
Optionally, the view image processing subunit includes:
an occlusion map removing subunit configured to perform, in response to determining that there is a first perspective plane including an occlusion map in each first perspective plane in the building image, intersection operation on the first perspective plane including the occlusion map and an occlusion map included therein, removing the occlusion map, and obtaining an intersection plane corresponding to the first perspective plane including the occlusion map;
a first determining subunit configured to perform determination of each first perspective plane not including an occlusion map and each intersection plane in the building image as a target plane;
and the plane removing subunit is configured to perform detection-frame detection on each target plane, remove the target planes whose detection-frame area is smaller than a preset detection area to obtain the view image corresponding to the building image, and determine the remaining target planes, whose detection-frame areas are not smaller than the preset detection area, as the building view planes in the view image.
Optionally, the first determining subunit includes:
the image shooting subunit is configured to shoot a first image corresponding to the building image at a shooting angle corresponding to each building image; the first image comprises contour information of each first view angle plane and a building label of each first view angle plane at the shooting angle;
and the second determining subunit is configured to perform, for each first perspective plane, in response to detecting that an area without a labeled building label exists in the outline information of the first perspective plane, determining that the first perspective plane contains a corresponding occlusion map.
Optionally, the first determining subunit includes:
the acquisition subunit is configured to acquire a standard image of a shooting angle corresponding to each building image, wherein the standard image comprises standard view planes corresponding to first view planes of the building images under a scene without obstacle simulation;
a third determining subunit, configured to perform comparing each first perspective plane of each building image with a standard perspective plane corresponding to the first perspective plane, and in response to detecting that the first perspective plane is inconsistent with the standard perspective plane, determine the first perspective plane as a first perspective plane including an occlusion map.
Optionally, the enhancement unit includes:
the first processing subunit is configured to perform enhancement on the plane feature of each building view plane and the tag feature of each building tag corresponding to the building view plane, so as to obtain an enhanced building view plane and an enhanced building tag;
and the second processing subunit is configured to perform feature splicing on the plane feature of each building view plane and the plane feature of its enhanced building view plane to obtain the training plane corresponding to the building view plane, and to perform feature splicing on the label feature of the building label corresponding to the building view plane and the label feature of its enhanced building label to obtain the training label corresponding to the training plane.
According to a third aspect of the embodiments of the present disclosure, there is provided an electronic apparatus including: a processor; a memory for storing the processor-executable instructions; wherein the processor is configured to execute the instructions to implement the virtual data generation method of the first aspect.
According to a fourth aspect of the embodiments of the present disclosure, there is provided a storage medium, wherein instructions in the storage medium, when executed by a processor of an electronic device, enable the electronic device to perform the virtual data generation method according to the first aspect.
According to a fifth aspect of embodiments of the present disclosure, there is provided a computer program product, wherein instructions of the computer program product, when executed by a processor of an electronic device, enable the electronic device to perform the virtual data generation method according to the first aspect.
The technical scheme provided by the embodiment of the disclosure at least brings the following beneficial effects:
the present disclosure relates to a virtual data generation method, apparatus, electronic device, and storage medium, the method comprising: selecting a pre-constructed virtual building model, wherein the virtual building model is composed of a plurality of building planes, and each building plane is provided with a corresponding building label; performing multi-view sampling on the virtual building model to obtain a plurality of view images of the virtual building model, wherein each view image comprises at least one building view plane, and each building view plane is a view of a building plane corresponding to the view plane under a view angle corresponding to the view image; enhancing each building view plane in each view image and a building label corresponding to the building view plane to obtain a training plane corresponding to each building view plane and a training label corresponding to the training plane; and taking each training plane in each visual angle image and a training label corresponding to each training plane as virtual data for the neural network training. By applying the virtual data generation method provided by the disclosure, the virtual building model consistent with the real building structure is obtained, the virtual building model is sampled, and the sampled data is enhanced to obtain virtual data similar to real sample data, namely, the virtual building model is sampled, so that data of various visual angles of different buildings can be conveniently obtained, and training samples of rich neural networks can be obtained; in the sampling process, the sampling is not required to be carried out in an actual building site, so that the sampling process is not influenced by various external interference factors, the quality of a training sample of the neural network can be improved, and the detection capability of the neural network can be improved.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure and are not to be construed as limiting the disclosure.
FIG. 1 is a flow diagram illustrating a method of virtual data generation in accordance with an exemplary embodiment;
FIG. 2 is a flowchart illustrating a process of obtaining multiple perspective images of a virtual building model, according to an exemplary embodiment;
FIG. 3 is a flow diagram illustrating another process of obtaining multiple perspective images of a virtual building model in accordance with an exemplary embodiment;
FIG. 4 is an exemplary diagram illustrating a virtual camera capturing a virtual building model in accordance with one illustrative embodiment;
FIG. 5 is an exemplary diagram illustrating an image of a building in accordance with one illustrative embodiment;
FIG. 6 is a sample effect example diagram of a virtual building model, shown in accordance with an example embodiment;
FIG. 7 is an exemplary diagram illustrating sampling effects for an actual building, according to an exemplary embodiment;
FIG. 8 is a schematic diagram illustrating a process for obtaining virtual data in accordance with an illustrative embodiment;
FIG. 9 is a block diagram illustrating a virtual data generation apparatus in accordance with an exemplary embodiment;
FIG. 10 is a block diagram illustrating an electronic device in accordance with an exemplary embodiment;
FIG. 11 is a block diagram illustrating another electronic device in accordance with an example embodiment.
Detailed Description
In order to make the technical solutions of the present disclosure better understood by those of ordinary skill in the art, the technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings.
It should be noted that the terms "first," "second," and the like in the description and claims of the present disclosure and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the disclosure described herein are capable of operation in sequences other than those illustrated or otherwise described herein. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.
Fig. 1 is a flowchart illustrating a virtual data generation method according to an exemplary embodiment, where the virtual data generation method is used in an electronic device, as shown in fig. 1, and includes the following steps:
in step S11, a pre-constructed virtual building model is selected, where the virtual building model is composed of multiple building planes, each building plane is provided with a corresponding building label, and the building plane is an outer surface area of the virtual building model.
In the method, before step S11 is executed, a plurality of virtual building models are pre-constructed. Each virtual building model may correspond to a physical building and is consistent with that building in structure, color, appearance ratio, and the like. In step S11, a virtual building model may be selected according to the actual requirements of virtual data generation; in particular, it may be a model specified by the user, who can specify it by sending a model selection instruction to the electronic device.
Specifically, the virtual building is composed of a plurality of building planes, each building plane is an outer surface area of the virtual building, and the whole outer surface of the virtual building is composed of the outer surface areas.
Each building plane is a part of the entire outer surface area of the virtual building. It can be understood that each building plane represents the visual presentation of its corresponding outer surface area in the virtual environment, and may include the shape, building components, building materials, colors, and the like of that area; the visual presentation of each building plane in the virtual environment may be a plane or an approximately planar surface.
Optionally, each building plane is provided with a building label, and the building label is used for describing the building plane, such as one or more of the size, the color and the description information of each object included on the building plane, and the objects included on the building plane may be air conditioners, billboards and the like.
In step S12, multi-view sampling is performed on the virtual building model to obtain multiple view images of the virtual building model, where each view image includes at least one building view plane, and each building view plane is the view of its corresponding building plane at the viewing angle corresponding to the view image.
In the process of sampling the virtual building model, the virtual building can be sampled at each preset viewing angle, so as to obtain a plurality of view images of the virtual building model.
In particular, for each building plane of the virtual building model, the view it presents may differ from one viewing angle to another; the view refers to the portion of the building plane displayed at that viewing angle.
Optionally, the area of each building view plane in the view image is not smaller than a preset threshold, and the size of the threshold may be set according to actual requirements, which is not limited herein.
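As an illustration of how preset viewing angles can be realized, below is a minimal sketch in Python that places a virtual camera at evenly spaced positions on a sphere around the model; the elevation angles, azimuth count, and radius are hypothetical choices for illustration, not values mandated by this disclosure.

```python
import numpy as np

def camera_poses(center, radius, n_azimuth=12, elevations_deg=(10.0, 25.0, 40.0)):
    """Place a virtual camera at preset viewing angles on a sphere around the
    virtual building model and return (position, rotation) pairs, where the
    rotation rows are the camera's right/up/backward axes in world space."""
    center = np.asarray(center, dtype=float)
    poses = []
    for elev in np.deg2rad(elevations_deg):
        for azim in np.linspace(0.0, 2.0 * np.pi, n_azimuth, endpoint=False):
            eye = center + radius * np.array([np.cos(elev) * np.cos(azim),
                                              np.cos(elev) * np.sin(azim),
                                              np.sin(elev)])
            forward = center - eye
            forward /= np.linalg.norm(forward)
            right = np.cross(forward, [0.0, 0.0, 1.0])  # world up = +z
            right /= np.linalg.norm(right)
            up = np.cross(right, forward)
            poses.append((eye, np.stack([right, up, -forward])))
    return poses
```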
In step S13, each building view plane in each view image and the building label corresponding to the building view plane are subjected to enhancement processing, so as to obtain a training plane corresponding to each building view plane and a training label corresponding to the training plane.
Specifically, a generative adversarial network or a network of another form may be used to enhance the building view planes and their corresponding building labels, so as to obtain a training plane for each building view plane and a training label for that training plane. This reduces the difference between the data feature domain of the building view planes and that of building view planes in real scenes, and likewise between the data feature domain of the building labels and that of real building labels; for example, it can reduce the difference between the color features of a building view plane and those of a real building view plane, and between the color label corresponding to the building view plane and a real color label.
In step S14, the training planes in each perspective image and the training labels corresponding to each training plane are used as virtual data for the neural network training.
By applying the method provided by the embodiment of the disclosure, virtual data are obtained from a virtual building model, so data for different buildings can be obtained conveniently and the training samples of the neural network model are enriched. In addition, the collected data do not need to be labeled manually, which reduces labeling costs; the enhanced virtual data resemble real data; and since no external interference arises in the process of obtaining the virtual data, the quality of the training samples is greatly improved, which in turn effectively improves the detection accuracy of the neural network.
In the method provided by the embodiment of the present disclosure, based on the implementation process, specifically, the process of constructing the virtual building model includes:
acquiring a preset geometric body, wherein the geometric body consists of a plurality of triangular surfaces;
randomly selecting one triangular surface from the plurality of triangular surfaces as a starting triangular surface;
dividing a first building plane corresponding to the starting triangular surface in the geometric body, and allocating a building label corresponding to the first building plane, wherein the first building plane is composed of at least one triangular surface, each triangular surface composing the first building plane at least comprises the starting triangular surface, and the similarity between the normals of any two adjacent triangular surfaces in each triangular surface composing the first building plane is smaller than a preset similarity threshold;
and randomly selecting a new starting triangular surface from the remaining triangular surfaces, dividing the building plane corresponding to the new starting triangular surface in the geometric body and allocating its building label, until every triangular surface in the geometric body has been divided into a corresponding building plane, thereby completing the construction of the virtual building model.
The geometric body is constructed by 3D modeling; it may be a building geometry whose structure, material, color, and other information are obtained with reference to an actual building. The geometric body carries no building labels.
Specifically, the first building plane is composed of a starting triangular surface or a plurality of triangular surfaces connected in sequence, the plurality of triangular surfaces connected in sequence at least comprise the starting triangular surface, and if the first building plane is composed of the plurality of triangular surfaces connected in sequence, the similarity between the normals of any two adjacent triangular surfaces in each triangular surface composing the first building plane is smaller than a similarity threshold.
The smaller the similarity between the normals of two adjacent triangular surfaces, the more the two surfaces tend to form a single plane. The similarity between the normals of any two adjacent triangular surfaces composing the first building plane can be determined from the included angle between the two normals, i.e., the angle θ = arccos(n₁·n₂ / (‖n₁‖‖n₂‖)); the "similarity" used here is thus an angular measure, so a smaller value means the two faces are closer to coplanar.
Optionally, when the directions of the normals of the two triangular surfaces are not identical, the smaller the included angle between the normals, the smaller this similarity value.
In the method provided by the embodiment of the present disclosure, one way of dividing the first building plane corresponding to the starting triangular surface in the geometric body may be as follows. The triangular surfaces connected to each edge of the starting triangular surface are determined as triangular surfaces to be compared. The normal of the starting triangular surface is compared for similarity with the normal of each triangular surface to be compared, i.e., the orientation of the starting triangular surface is compared with the orientation of each edge-connected triangular surface, where the similarity between two normals is determined by the included angle between them. When the similarity between the normal of a triangular surface to be compared and the normal of the starting triangular surface is smaller than a sub-similarity threshold, that triangular surface is determined to be a target triangular surface; the similarity threshold may be twice the sub-similarity threshold. Specifically, if the triangular surface connected to any edge of the starting triangular surface is a target triangular surface, the triangular surfaces connected to the other two edges of that target triangular surface are added as new triangular surfaces to be compared, and it is determined in the same way whether these newly added surfaces are target triangular surfaces. If so, the expansion continues, with the triangular surfaces connected to the other two edges of each new target triangular surface added as triangular surfaces to be compared, until a newly added triangular surface to be compared is not a target triangular surface, at which point expansion through it stops. When no more triangular surfaces to be compared can be added, corresponding plane labels are allocated to the starting triangular surface and each target triangular surface, so that together they form a first building plane, and a building label is allocated to the first building plane. A new starting triangular surface is then selected from the remaining triangular surfaces, and the iteration continues until the division of the geometric body is complete.
For example, one triangular surface is randomly selected from the plurality of triangular surfaces as the starting triangular surface, and it is compared for similarity with the surface connected to each of its edges, i.e., with 3 triangular surfaces; each surface whose similarity is smaller than the sub-similarity threshold becomes a target triangular surface. If all 3 are target triangular surfaces, the surfaces connected to the remaining two edges of each target triangular surface become triangular surfaces to be compared; each target triangular surface then contributes two new surfaces, for 6 triangular surfaces to be compared in total. If 4 of these 6 are target triangular surfaces, expansion continues from those 4, yielding 8 new triangular surfaces to be compared, and so on until no newly added surface is a target triangular surface. The target triangular surfaces and the starting triangular surface then form a first building plane, and a new starting triangular surface is selected from the triangular surfaces outside that plane. After the whole geometric body has been divided, a corresponding building label is set for each first building plane to obtain the building planes, and the geometric body whose building planes have been assigned labels is taken as the virtual building model.
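A minimal Python sketch of this region-growing division, assuming the geometric body is given as vertex and face arrays; the angle threshold plays the role of the similarity threshold above (similarity measured by the angle between face normals), and the neighbor-to-neighbor comparison follows the "any two adjacent triangular surfaces" formulation:

```python
import numpy as np
from collections import defaultdict, deque

def face_normals(vertices, faces):
    a, b, c = (vertices[faces[:, i]] for i in range(3))
    n = np.cross(b - a, c - a)
    return n / np.linalg.norm(n, axis=1, keepdims=True)

def divide_building_planes(vertices, faces, angle_thresh_deg=15.0):
    """Assign a FACE-ID (building-plane index) to every triangular face by
    growing regions across shared edges whose adjacent normals differ by a
    small angle, i.e., faces that tend to form one plane."""
    normals = face_normals(vertices, faces)
    edge_to_faces = defaultdict(list)          # faces sharing each edge
    for fi, f in enumerate(faces):
        for e in ((f[0], f[1]), (f[1], f[2]), (f[2], f[0])):
            edge_to_faces[tuple(sorted(e))].append(fi)
    cos_thresh = np.cos(np.deg2rad(angle_thresh_deg))
    labels = -np.ones(len(faces), dtype=int)
    next_label = 0
    for start in range(len(faces)):
        if labels[start] != -1:
            continue                           # already in a building plane
        labels[start] = next_label             # new starting triangular face
        queue = deque([start])
        while queue:                           # expand the plane's boundary
            fi = queue.popleft()
            f = faces[fi]
            for e in ((f[0], f[1]), (f[1], f[2]), (f[2], f[0])):
                for nb in edge_to_faces[tuple(sorted(e))]:
                    if labels[nb] == -1 and np.dot(normals[fi], normals[nb]) >= cos_thresh:
                        labels[nb] = next_label
                        queue.append(nb)
        next_label += 1
    return labels  # one building-plane label per triangular face
```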
By applying the method provided by the embodiment of the disclosure, the plurality of sequentially connected triangular surfaces with small normal similarity are used as a building plane, so that the building plane can be visually presented as a plane, and in the process of dividing the building plane, a corresponding building label is allocated to each building plane, so that the setting efficiency of the building labels can be improved.
In the method provided by the embodiment of the present disclosure, based on the implementation process above, the process of performing multi-view sampling on the virtual building model to obtain multiple view images of the virtual building model, as shown in fig. 2, includes:
in step S21, a virtual camera is used to take multiple-view pictures of the virtual building model, and multiple building images of the virtual building model are obtained.
The building images of the virtual building model may be obtained by photographing in a simulated real scene, where the simulated real scene may include one or more obstacles such as trees, people, vehicles, clouds, and the like, that is, the building images may be images obtained by photographing the virtual building model in a corresponding viewing angle by using a virtual camera when the virtual building model is in the simulated real scene.
Specifically, the virtual camera may be a virtual camera capable of capturing a color image, and the building image may be a color image.
In step S22, for each building image, when the building image includes the first perspective plane of the virtual building model, removing the areas that do not satisfy the sampling condition in each first perspective plane in the building image, and obtaining a plurality of perspective images of the virtual building model.
The first view plane is the view, in the building image, of a building plane of the virtual building model; the building image, which may include both the virtual building model and the simulated real scene, is obtained by shooting with the virtual camera at the viewing angle corresponding to that image.
Specifically, the occlusion-map region in each first view plane and the facet regions among the first view planes may be removed, where a facet is a first view plane whose occlusion-free portion has an area smaller than the preset detection area.
Wherein the building image not including the first viewing plane may be deleted.
By applying the method provided by the embodiment of the disclosure, the area which does not meet the sampling condition in the first visual angle plane can be removed, so that the acquired building visual angle image can be prevented from being interfered, and the quality of the training data can be improved.
In the method provided by the embodiment of the present disclosure, based on the foregoing implementation process, specifically, the removing the area that does not satisfy the sampling condition in each first view plane in the building image to obtain the multiple view images of the virtual building model, as shown in fig. 3, may include:
in step S31, in response to determining that there is a first perspective plane including an occlusion map in each first perspective plane in the building image, intersecting the first perspective plane including the occlusion map with an occlusion map included therein, removing the occlusion map, and obtaining an intersection plane corresponding to the first perspective plane including the occlusion map.
The occlusion map may be an image of each obstacle occluding the building plane of the virtual building at a viewing angle corresponding to the building image, that is, an image of at least one obstacle existing between the lens of the virtual camera and the building plane.
Specifically, at some viewing angles there may be an obstacle between the lens of the virtual camera and a building plane of the virtual building. Referring to fig. 4, an exemplary diagram of a virtual camera shooting a virtual building model provided by the present disclosure, a person stands between the lens of the virtual camera and the building plane and blocks part of it. The building image taken at this viewing angle is shown in fig. 5: the person visible in the first view plane of building plane 2 is the occlusion map of that first view plane, while the first view plane of building plane 1 contains no occlusion.
Specifically, the intersection plane refers to a portion from which the occlusion map in the first view plane is removed, where the occlusion map in the first view plane may be an image of an outline portion of each occlusion object, or an image of a circumscribed geometric portion of an outline of each occlusion object included in the first view plane, where the circumscribed geometry may be a circumscribed rectangle, a circle, an ellipse, or the like.
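As a sketch of the intersection operation, assuming the first view plane and the occlusion map are represented as per-pixel boolean masks (an illustrative representation, not specified by the disclosure):

```python
import numpy as np

def intersection_plane(plane_mask: np.ndarray, occlusion_mask: np.ndarray) -> np.ndarray:
    """Remove the occlusion map from a first view plane: keep only the pixels
    of the plane that are not covered by the obstacle."""
    return plane_mask & ~occlusion_mask

# e.g., for the person in fig. 5 occluding building plane 2 (hypothetical masks):
# visible = intersection_plane(plane2_mask, person_mask)
```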
In step S32, each first perspective plane and each intersection plane in the building image, which do not include an occlusion map, are determined as a target plane.
In step S33, performing detection frame detection on each target plane, removing target planes whose detection frame areas are smaller than a preset detection area to obtain a perspective image corresponding to the building image, and determining each target plane whose remaining detection frame areas are not smaller than the preset detection area as the building perspective plane in the perspective image.
The detection frame detection is carried out on each target plane, so that the target planes with small areas can be eliminated, and the condition of manually eliminating the facets is simulated.
Specifically, the feasible method of detecting the detection frame of each target plane, and removing the target planes with the detection frame area smaller than the preset detection area to obtain the view angle image corresponding to the building image may include:
identifying each target plane by applying a preset linear detection algorithm to obtain at least one detection frame marked on each target plane; determining the area of each detection frame on each target plane, and determining the target detection frame with the largest area of the detection frames on each target plane; comparing the area of each target detection frame with a preset detection area; and if the target detection frame with the detection frame area smaller than the preset detection area exists, removing the target plane to which the target detection frame with the detection frame area smaller than the preset detection area belongs to obtain the view angle image corresponding to the building image.
When the target plane is divided into a plurality of parts by the obstacle, a plurality of detection frames can be generated when the target plane is identified by adopting a straight line detection algorithm.
Specifically, for each target plane, if the target plane has one detection frame, that detection frame may be determined as the target detection frame; if the target plane has multiple detection frames, the detection frame with the largest area may be determined as the target detection frame, i.e., the detection frames may be sorted by area and the largest one selected. Removing the target plane to which a target detection frame with an area smaller than the preset detection area belongs simulates the manual removal of facets, improving both the generation efficiency and the quality of the virtual data.
Optionally, if there is no target detection frame with an area smaller than the preset detection area, the building image labeled with each target detection frame may be used as the view image, and each target plane may be determined as a building view plane in the view image.
In this embodiment, in another possible way of performing multi-view sampling on the virtual building model to obtain multiple view images, there is an alternative to step S33. The step may be: performing detection-frame detection on each target plane to obtain the target detection frame of each target plane, the target detection frame being the detection frame with the largest area on that plane; determining the proportion of the area of each target detection frame to the area of the building image; removing the target planes whose target detection frames have a proportion smaller than a preset proportion threshold to obtain the view image corresponding to the building image; and determining the remaining target planes as the building view planes in the view image.
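A sketch covering both filtering variants (the absolute-area threshold of step S33 and the image-area-ratio variant above); the axis-aligned bounding box stands in for the detection frame produced by the line-detection algorithm, which is an illustrative simplification:

```python
import numpy as np

def detection_frame_area(mask: np.ndarray) -> int:
    """Area of the axis-aligned detection frame around a target-plane mask."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return 0
    return int((ys.max() - ys.min() + 1) * (xs.max() - xs.min() + 1))

def filter_small_planes(target_planes, min_area=None, min_ratio=None, image_area=None):
    """Keep target planes whose detection frame passes the absolute-area
    threshold or the image-area-ratio threshold, simulating the manual
    removal of small facets."""
    kept = []
    for mask in target_planes:
        area = detection_frame_area(mask)
        if min_area is not None and area < min_area:
            continue
        if min_ratio is not None and image_area and area / image_area < min_ratio:
            continue
        kept.append(mask)
    return kept
```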
By applying the method provided by the embodiment of the disclosure, the occlusion graph can be removed and the small plane can be removed under the simulation of an actual application scene, so that the occlusion graph and the small plane are prevented from influencing the training effect of the neural network, and the quality of virtual data is improved.
In the method provided by the embodiment of the present disclosure, based on the implementation process above, a feasible manner of the process of determining that a first view plane including an occlusion map exists among the first view planes in the building image may include:
shooting a first image corresponding to the building image at a shooting angle corresponding to the building image; the first image comprises contour information of each first view angle plane and a building label of each first view angle plane at the shooting angle;
for each first perspective plane, in response to detecting that there is an area of the outline information of the first perspective plane, which is not marked with a building label, determining that the first perspective plane contains a corresponding occlusion map.
Conversely, if no area without a building label exists in the contour information of the first view plane, it can be determined that the first view plane does not include an occlusion map.
Specifically, the first image is obtained by shooting the virtual building model at a given shooting angle with a virtual camera capable of capturing labels; it may include a black-and-white contour image of the virtual building model and the building labels of each first view plane of the virtual building at that shooting angle. If an area not labeled with a building label exists in a first view plane, the building label of that area is blocked by an obstacle, so it can be determined that the first view plane includes a corresponding occlusion map, the occlusion map being the portion of the first view plane corresponding to that area.
In the process of shooting at any shooting angle, a virtual camera capable of shooting a color image and a virtual camera capable of shooting a label can be respectively adopted to shoot the virtual building at the shooting angle, so that a color building image and a first image corresponding to the building image are obtained, and the first image is obtained by shooting the virtual building at the shooting angle by the virtual camera capable of shooting the label.
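A minimal sketch of this test, assuming the label camera produces an integer label image in which unlabeled (occluded) pixels take a reserved value; the value 0 is an assumption for illustration:

```python
import numpy as np

def contains_occlusion(contour_mask: np.ndarray, label_image: np.ndarray,
                       unlabeled: int = 0) -> bool:
    """A first view plane contains an occlusion map if, inside its contour,
    some pixels of the label image carry no building label (the label being
    hidden by an obstacle between the camera and the building plane)."""
    return bool(np.any(label_image[contour_mask] == unlabeled))
```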
By applying the method provided by the embodiment of the disclosure, the contour information of each first visual angle plane is detected, the first visual angle plane including the shielding diagram can be accurately and quickly determined in each first visual angle plane, the shielding diagram in the first visual angle plane can be removed, and the quality of virtual data can be improved.
In the method provided by the embodiment of the present disclosure, based on the implementation process above, another feasible manner of the process of determining that a first view plane including an occlusion map exists among the first view planes in the building image may include:
acquiring a standard image of a shooting angle corresponding to the building image, wherein the standard image comprises standard view planes corresponding to first view planes of the building image under a scene without obstacle simulation;
and comparing each first visual angle plane of each building image with a standard visual angle plane corresponding to the first visual angle plane, and determining the first visual angle plane as the first visual angle plane comprising the occlusion map in response to detecting that the first visual angle plane is inconsistent with the standard visual angle plane.
Specifically, in practical application, after the virtual building model is constructed, objects simulating a real scene may be arranged around it so that the sampled data are closer to real data; for example, virtual object models such as clouds, trees, people, and vehicles may be placed around the virtual building model. Before the obstacle models are arranged, the virtual building model is shot at each shooting angle with a virtual camera capable of capturing color images, yielding a standard image for each shooting angle; that is, each standard image is taken with no obstacle models present, so no obstacle blocks any building plane between the virtual camera and the model. After the obstacle models are arranged around the virtual building model, the model is shot at each shooting angle to obtain the building images; at this point, images of the obstacle models, i.e., the occlusion maps, may be present in the first view planes of a building image.
The standard view plane is an image of a building plane to which the first view plane belongs at the shooting angle, each first view plane included in the building image corresponds to each standard view plane of the standard image one to one, and the corresponding building image and the standard image can be obtained by shooting the virtual building model at the same shooting angle and in different scenes by the same virtual camera.
Specifically, when the first viewing angle plane and the standard viewing angle plane corresponding to the first viewing angle plane are not consistent, it is indicated that the first viewing angle plane includes a corresponding occlusion map, that is, an image area where the first viewing angle plane and the standard viewing angle plane are different is the occlusion map, and if the first viewing angle plane and the standard viewing angle plane are consistent, it is indicated that the first viewing angle plane does not include the occlusion map.
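A sketch of the comparison, assuming the building image and the standard image are aligned color images from the same virtual camera and each first view plane is given as a pixel mask (illustrative assumptions):

```python
import numpy as np

def find_occlusion_maps(building_image, standard_image, plane_masks, tol=0):
    """Compare each first view plane against its standard view plane (same
    camera and shooting angle, obstacle-free scene). Differing pixels inside a
    plane's mask form its occlusion map; planes with no difference are clean."""
    pixel_diff = np.any(np.abs(building_image.astype(np.int32)
                               - standard_image.astype(np.int32)) > tol, axis=-1)
    occlusions = {}
    for plane_id, mask in plane_masks.items():
        occ = pixel_diff & mask
        if occ.any():
            occlusions[plane_id] = occ  # per-pixel occlusion map for this plane
    return occlusions
```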
By applying the method provided by the embodiment of the disclosure, each first visual angle plane is compared with the corresponding standard visual angle plane respectively, the first visual angle plane comprising the shielding graph can be accurately and quickly determined in each first visual angle plane, the shielding graph in the first visual angle plane can be removed, and the quality of virtual data can be improved.
In the method provided by the embodiment of the present disclosure, based on the implementation process, specifically, the enhancing processing is performed on each building view plane in each view image and the building label corresponding to the building view plane to obtain a training plane corresponding to each building view plane and a training label corresponding to the training plane, and the method includes:
enhancing the plane characteristics of each building visual angle plane and the label characteristics of each building label corresponding to the building visual angle plane to obtain an enhanced building visual angle plane and an enhanced building label;
and performing feature splicing on the plane features of each building visual angle plane and the plane features of the enhanced building visual angle plane corresponding to the building visual angle plane to obtain a training plane corresponding to the building visual angle plane, and performing feature splicing on the label features of the building labels corresponding to the building visual angle plane and the label features of the enhanced building labels corresponding to the building labels to obtain training labels corresponding to the training plane.
In the method provided by the embodiment of the disclosure, a preset generative adversarial network, or a data enhancement network of another form, may be used to enhance the plane feature of each building view plane and the label feature of each building label in each view image, improving their visual effect and making the building view planes and building labels clearer. To prevent the enhanced data from deviating from real data, the data before and after enhancement are subjected to feature splicing, so that the plane features of the building view planes and the label features of the building labels stay close to real data.
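A minimal sketch of this step, where "feature splicing" is interpreted as channel-wise concatenation of the data before and after enhancement; the enhancer is passed in as a function (e.g., a trained generative adversarial network), since its architecture is not fixed by the disclosure:

```python
import numpy as np

def make_training_sample(plane, label, enhance):
    """Enhance a building view plane and its building label (enhance() stands
    for, e.g., a trained generative adversarial network), then channel-wise
    concatenate original and enhanced data so the features stay close to the
    real-data distribution."""
    enhanced_plane = enhance(plane)
    enhanced_label = enhance(label)
    training_plane = np.concatenate([plane, enhanced_plane], axis=-1)
    training_label = np.concatenate([label, enhanced_label], axis=-1)
    return training_plane, training_label
```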
By applying the method provided by the embodiment of the disclosure, the building view plane and the building label are enhanced, and the data before and after enhancement are subjected to feature splicing, so that the data feature domain difference between the virtual data and the real data can be reduced, and the features of the virtual data are uniformly distributed.
In the application process, the building label can be processed by a method of simulating a manual marking building, so that the difference between the building label and the feature domain of the real data label is reduced.
In the process of simulating the manual marking of the building, the marking rules of the manual data can be normalized, and errors caused by the difference of different manual marking modes are avoided. The rules for manual marking can be customized according to actual conditions, but the standard is that all real data marking rules must be consistent.
For the geometry of each virtual building model, a separate building plane label is generated for each building plane. The labels may be generated as follows. Determine the geometry of the virtual building model, which is composed of a plurality of triangular surfaces. Select any triangular surface as the starting triangular surface and obtain its normal N_0; take this surface as the seed region of a FACE-ID, denoted F_0, whose boundary initially consists of its 3 edges. Then check in turn the normal N_n of each surface connected to those edges and compute the similarity between N_0 and N_n. If the orientation similarity between the surface connected to an edge and the starting triangular surface is smaller than the threshold, assign F_0 to that surface and expand the boundary of F_0 (adding 2 new edges); otherwise, do not expand across that edge. Iterating in this way, F_0 eventually stops expanding; the triangular surfaces marked F_0 are then taken as one building plane, and a building label is set for it. Next, a new triangular surface without a FACE-ID is selected and the process is repeated. This yields the complete virtual building model, composed of a plurality of building planes provided with building labels.
In the present disclosure, the difference in the data feature domain between the training plane and the real data is also reduced by an optimized sampling manner. Fig. 6 is an exemplary diagram of the sampling effect of the virtual building model provided in this embodiment; specifically, it shows data statistics on the normal directions of the building planes of the virtual building model, where the black points are a visualization of the samples. Fig. 7 is an exemplary diagram of the sampling effect of a real building, showing the corresponding statistics on the normal directions of the building planes of the real building. Comparing fig. 6 and fig. 7, the distribution range of the sampled virtual data covers the distribution of the real data, thereby achieving a "superset" effect.
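As one concrete, purely illustrative sampling layout, virtual-camera positions can be distributed on a sphere around the model so that the sampled plane normals cover a wide range of directions. The spherical grid, radius, and elevation limits below are assumptions; the disclosure does not fix the camera trajectory used to obtain this "superset" coverage.

```python
import numpy as np

def camera_positions(radius: float,
                     n_azimuth: int,
                     n_elevation: int) -> np.ndarray:
    """Sample virtual-camera positions on a sphere around the model.

    Each returned position is intended to look at the model origin;
    sweeping azimuth and elevation broadly is what lets the virtual-data
    distribution cover the real-data distribution.
    """
    az = np.linspace(0.0, 2.0 * np.pi, n_azimuth, endpoint=False)
    el = np.linspace(-np.pi / 3.0, np.pi / 3.0, n_elevation)
    a, e = np.meshgrid(az, el)
    return np.stack([radius * np.cos(e) * np.cos(a),
                     radius * np.cos(e) * np.sin(a),
                     radius * np.sin(e)], axis=-1).reshape(-1, 3)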
Fig. 8 is a schematic flow diagram of obtaining virtual data according to an embodiment of the present disclosure. A building view plane and the building label corresponding to it are enhanced by a generative adversarial network or a network of another type, to obtain an enhanced building view plane and an enhanced building label; the building view plane is spliced with the enhanced building view plane to obtain a training plane; the building label is spliced with the enhanced building label to obtain a training label; and the training plane and the training label are used as virtual data for training a building detection neural network.
Fig. 9 is a block diagram illustrating a virtual data generation apparatus according to an example embodiment. Referring to fig. 9, the apparatus includes a selecting unit 901, a sampling unit 902, an enhancing unit 903, and a processing unit 904.
The selecting unit 901 is configured to perform selecting a pre-constructed virtual building model, where the virtual building model is composed of multiple building planes, and each building plane is provided with a corresponding building tag; the building plane is an outer surface area of the virtual building model;
the sampling unit 902 is configured to perform multi-view sampling on the virtual building model, to obtain multiple view images of the virtual building model, where each view image includes at least one building view plane, and each building view plane is a view of a building plane corresponding to the building view plane at a view angle corresponding to the view image;
the enhancing unit 903 is configured to perform enhancement processing on each building view plane in each view image and a building label corresponding to the building view plane, so as to obtain a training plane corresponding to each building view plane and a training label corresponding to the training plane;
the processing unit 904 is configured to perform, as virtual data for neural network training, the training planes in each of the perspective images and the training labels corresponding to each of the training planes.
In another embodiment provided by the present disclosure, based on the above scheme, optionally, the virtual data generating apparatus further includes a virtual building model constructing unit; the virtual building model construction unit is configured to perform:
acquiring a preset geometric body, wherein the geometric body consists of a plurality of triangular surfaces;
randomly selecting one triangular surface from the plurality of triangular surfaces as a starting triangular surface;
dividing a first building plane corresponding to the starting triangular surface in the geometric body, and allocating a building label corresponding to the first building plane, wherein the first building plane is composed of at least one triangular surface, each triangular surface composing the first building plane at least comprises the starting triangular surface, and the similarity between the normals of any two adjacent triangular surfaces in each triangular surface composing the first building plane is smaller than a preset similarity threshold;
and randomly selecting a new starting triangular surface from the remaining triangular surfaces, dividing a building plane corresponding to the new starting triangular surface in the geometric body, and allocating building labels, until all the triangular surfaces in the geometric body are divided into their corresponding building planes, thereby completing the construction of the virtual building model.
In another embodiment provided in the present disclosure, based on the above scheme, optionally, the sampling unit 902 includes:
the photographing sub-unit is configured to execute multi-view photographing of the virtual building model by applying a virtual camera to obtain a plurality of building images of the virtual building model;
and the view image processing subunit is configured to perform, for each building image, when a first view plane of the virtual building model is included in the building image, removing the areas that do not meet the sampling conditions from each first view plane in the building image, to obtain a plurality of view images of the virtual building model.
In another embodiment provided by the present disclosure, based on the above scheme, optionally, the view image processing subunit includes:
an occlusion map removing subunit configured to perform, in response to determining that a first view plane including an occlusion map exists among the first view planes in the building image, an intersection operation on the first view plane including the occlusion map and the occlusion map included therein, removing the occlusion map and obtaining an intersection plane corresponding to the first view plane including the occlusion map;
a first determining subunit configured to perform determining each first view plane not including an occlusion map and each intersection plane in the building image as a target plane;
and a plane removing subunit configured to perform detection frame detection on each target plane, remove the target planes whose detection frame areas are smaller than a preset detection area to obtain the view image corresponding to the building image, and determine each remaining target plane whose detection frame area is not smaller than the preset detection area as a building view plane in the view image, as sketched below.
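A sketch of the detection frame area test follows. Representing each target plane as a binary mask and taking the detection frame to be its axis-aligned bounding box are assumptions; the disclosure does not fix the detector.

```python
import numpy as np

def filter_target_planes(masks: list[np.ndarray],
                         min_area: float) -> list[np.ndarray]:
    """Keep target planes whose detection frame area reaches min_area."""
    kept = []
    for mask in masks:
        ys, xs = np.nonzero(mask)
        if xs.size == 0:
            continue                           # nothing visible at all
        w = int(xs.max()) - int(xs.min()) + 1  # detection frame width
        h = int(ys.max()) - int(ys.min()) + 1  # detection frame height
        if w * h >= min_area:                  # preset detection area test
            kept.append(mask)                  # a building view plane
    return kept
```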
In another embodiment provided by the present disclosure, based on the above scheme, optionally, the occlusion map removing subunit includes:
the image shooting subunit is configured to shoot, for each building image, a first image corresponding to the building image at the shooting angle corresponding to that building image; the first image comprises the contour information of each first view plane and the building label of each first view plane at the shooting angle;
and the second determining subunit is configured to perform, for each first view plane, in response to detecting that an area not marked with a building label exists in the contour information of the first view plane, determining that the first view plane contains a corresponding occlusion map, as sketched below.
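This check can be sketched as follows, assuming hypothetical pixel-aligned binary masks: `contour_mask` marks the pixels inside the contour of one first view plane in the first image, and `label_mask` marks the pixels of that plane carrying a building label.

```python
import numpy as np

def contains_occlusion(contour_mask: np.ndarray,
                       label_mask: np.ndarray) -> bool:
    """An unlabeled area inside the contour implies an occlusion map."""
    unlabeled = contour_mask & ~label_mask
    return bool(unlabeled.any())
```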
In another embodiment provided by the present disclosure, based on the above scheme, optionally, the occlusion map removing subunit includes:
the acquisition subunit is configured to acquire a standard image at the shooting angle corresponding to each building image, the standard image comprising the standard view planes corresponding to the first view planes of the building image in a simulated scene without obstacles;
and a third determining subunit configured to perform comparing each first view plane of each building image with the standard view plane corresponding to that first view plane, and in response to detecting that the first view plane is inconsistent with the standard view plane, determining the first view plane as a first view plane including an occlusion map, as sketched below.
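The comparison can be sketched as follows, assuming the view planes rendered with and without simulated obstacles are pixel-aligned binary masks keyed by a hypothetical plane identifier.

```python
import numpy as np

def planes_with_occlusion(first_view_planes: dict[int, np.ndarray],
                          standard_view_planes: dict[int, np.ndarray]) -> list[int]:
    """Return the IDs of first view planes that include an occlusion map."""
    return [pid for pid, mask in first_view_planes.items()
            if not np.array_equal(mask, standard_view_planes[pid])]
```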
In another embodiment provided in the present disclosure, based on the above scheme, optionally, the enhancing unit 903 includes:
the first processing subunit is configured to perform enhancement on the plane feature of each building view plane and the tag feature of each building tag corresponding to the building view plane, so as to obtain an enhanced building view plane and an enhanced building tag;
and the second processing subunit is configured to perform feature splicing on the plane features of each building view plane and the plane features of the enhanced building view plane corresponding to that building view plane, to obtain a training plane corresponding to the building view plane, and to perform feature splicing on the label features of the building label corresponding to the building view plane and the label features of the corresponding enhanced building label, to obtain a training label corresponding to the training plane.
FIG. 10 is a block diagram illustrating an electronic device 1000 according to an example embodiment. For example, the electronic device 1000 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, an exercise device, a personal digital assistant, and the like.
Referring to fig. 10, electronic device 1000 may include one or more of the following components: processing component 1002, memory 1004, power component 1006, multimedia component 1008, audio component 1010, input/output (I/O) interface 1012, sensor component 1014, and communications component 1016.
The processing component 1002 generally controls overall operation of the electronic device 1000, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing component 1002 may include one or more processors 1020 to execute instructions to perform all or a portion of the steps of the methods described above. Further, the processing component 1002 may include one or more modules that facilitate interaction between the processing component 1002 and other components. For example, the processing component 1002 may include a multimedia module to facilitate interaction between the multimedia component 1008 and the processing component 1002.
The memory 1004 is configured to store various types of data to support operation at the device 1000. Examples of such data include instructions for any application or method operating on the electronic device 1000, contact data, phonebook data, messages, pictures, videos, and so forth. The memory 1004 may be implemented by any type or combination of volatile or non-volatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.
The power supply component 1006 provides power to the various components of the electronic device 1000. The power components 1006 may include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for the electronic device 1000.
The multimedia component 1008 includes a screen that provides an output interface between the electronic device 1000 and a user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive an input signal from a user. The touch panel includes one or more touch sensors to sense touch, slide, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 1008 includes a front facing camera and/or a rear facing camera. The front camera and/or the rear camera may receive external multimedia data when the device 1000 is in an operating mode, such as a shooting mode or a video mode. Each front camera and rear camera may be a fixed optical lens system or have a focal length and optical zoom capability.
The audio component 1010 is configured to output and/or input audio signals. For example, the audio component 1010 may include a Microphone (MIC) configured to receive external audio signals when the electronic device 1000 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signal may further be stored in the memory 1004 or transmitted via the communication component 1016. In some embodiments, audio component 1010 also includes a speaker for outputting audio signals.
I/O interface 1012 provides an interface between processing component 1002 and peripheral interface modules, which may be keyboards, click wheels, buttons, etc. These buttons may include, but are not limited to: a home button, a volume button, a start button, and a lock button.
The sensor assembly 1014 includes one or more sensors for providing various aspects of status assessment for the electronic device 1000. For example, the sensor assembly 1014 may detect the open/closed state of the device 1000 and the relative positioning of components, such as the display and keypad of the electronic device 1000. The sensor assembly 1014 may also detect a change in position of the electronic device 1000 or of a component of the electronic device 1000, the presence or absence of user contact with the electronic device 1000, the orientation or acceleration/deceleration of the electronic device 1000, and a change in temperature of the electronic device 1000. The sensor assembly 1014 may include a proximity sensor configured to detect the presence of a nearby object without any physical contact. The sensor assembly 1014 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor assembly 1014 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 1016 is configured to facilitate wired or wireless communication between the electronic device 1000 and other devices. The electronic device 1000 may access a wireless network based on a communication standard, such as WiFi, a carrier network (such as 2G, 3G, 4G, or 5G), or a combination thereof. In an exemplary embodiment, the communication component 1016 receives a broadcast signal or broadcast related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 1016 further includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, Infrared Data Association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the electronic device 1000 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, micro-controllers, microprocessors, or other electronic elements for performing the above-described virtual data generation method.
In an exemplary embodiment, a storage medium comprising instructions, such as the memory 1004 comprising instructions, executable by the processor 1020 of the electronic device 1000 to perform the above-described virtual data generation method is also provided. Alternatively, the storage medium may be a non-transitory computer readable storage medium, which may be, for example, a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
In an exemplary embodiment, a computer program product is also provided, which comprises readable program code executable by the processor 1020 of the electronic device 1000 to perform the virtual data generation method of any of the embodiments. Alternatively, the program code may be stored in a storage medium of the electronic device 1000, which may be a non-transitory computer-readable storage medium, such as a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
In addition, the electronic device 1000 includes some functional modules that are not shown, and are not described in detail herein.
Fig. 11 is a schematic structural diagram of another electronic device provided in an embodiment of the present disclosure. Referring to fig. 11, at the hardware level, the electronic device includes a processor, such as a Central Processing Unit (CPU). Optionally, the electronic device further comprises an internal bus 1104, a network interface 1102, a memory 1105, and an I/O controller. The memory 1105 may include a Random-Access Memory (RAM) 1151 and a Read-Only Memory (ROM) 1152, and may also include a mass storage device 1106, such as at least one disk storage. Of course, the electronic device may also include hardware required by other services.
The processor, the network interface, and the memory may be connected to each other via an internal bus, which may be an ISA (Industry Standard Architecture) bus, a PCI (peripheral component interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc.
The memory stores instructions executable by the processor. The processor is configured to execute the instructions stored in the memory, logically forming a virtual data generation apparatus, so as to implement the virtual data generation method provided by any embodiment of the present disclosure and achieve the same technical effects.
The virtual data generation method disclosed in the embodiment of fig. 1 of the present disclosure may be applied to, or implemented by, a processor. The processor may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware in the processor or by instructions in the form of software. The processor may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The various methods, steps, and logic blocks disclosed in one or more embodiments of the present disclosure may be implemented or performed accordingly. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of a method disclosed in connection with one or more embodiments of the present disclosure may be embodied directly in hardware, in a software module executed by a decoding processor, or in a combination of the two. The software module may be located in a storage medium well known in the art, such as a RAM, a flash memory, a ROM, a PROM or an EPROM, or registers. The storage medium is located in the memory, and the processor reads the information in the memory and completes the steps of the above method in combination with its hardware.
Of course, besides the software implementation, the electronic device of the present disclosure does not exclude other implementations, such as logic devices or a combination of software and hardware; that is, the execution subject of the above processing flows is not limited to the individual logic units and may also be hardware or logic devices.
The embodiment of the present disclosure further provides a computer-readable storage medium, where when instructions in the storage medium are executed by a processor of an electronic device, the electronic device is enabled to execute the virtual data generation method provided in any embodiment of the present disclosure, and obtain the same technical effect.
While certain embodiments of the present disclosure have been described above, other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (10)

1. A method for generating virtual data, comprising:
selecting a pre-constructed virtual building model, wherein the virtual building model is composed of a plurality of building planes, and each building plane is provided with a corresponding building label; the building plane is an outer surface area of the virtual building model;
performing multi-view sampling on the virtual building model to obtain a plurality of view images of the virtual building model, wherein each view image comprises at least one building view plane, and each building view plane is a view of a building plane corresponding to the view plane under a view angle corresponding to the view image;
enhancing each building view plane in each view image and a building label corresponding to the building view plane to obtain a training plane corresponding to each building view plane and a training label corresponding to the training plane;
and taking each training plane in each visual angle image and the training label corresponding to each training plane as virtual data for neural network training.
2. The method of claim 1, wherein the construction process of the virtual building model comprises:
acquiring a preset geometric body, wherein the geometric body consists of a plurality of triangular surfaces;
randomly selecting one triangular surface from the plurality of triangular surfaces as a starting triangular surface;
dividing a first building plane corresponding to the starting triangular surface in the geometric body, and allocating a building label corresponding to the first building plane, wherein the first building plane is composed of at least one triangular surface, each triangular surface composing the first building plane at least comprises the starting triangular surface, and the similarity between the normals of any two adjacent triangular surfaces in each triangular surface composing the first building plane is smaller than a preset similarity threshold;
and randomly selecting a new starting triangular surface from the remaining triangular surfaces, dividing a building plane corresponding to the new starting triangular surface in the geometric body, and allocating building labels, until all the triangular surfaces in the geometric body are divided into their corresponding building planes, thereby completing the construction of the virtual building model.
3. The method of claim 1, wherein the multi-view sampling of the virtual building model to obtain multiple view images of the virtual building model comprises:
using a virtual camera to take multi-view pictures of the virtual building model to obtain a plurality of building images of the virtual building model;
for each building image, when the building image includes a first view plane of the virtual building model, removing the areas that do not meet the sampling conditions from each first view plane in the building image, to obtain a plurality of view images of the virtual building model.
4. The method of claim 3, wherein the removing the areas that do not meet the sampling conditions from each first view plane in the building image, to obtain a plurality of view images of the virtual building model, comprises:
in response to determining that a first view plane including an occlusion map exists among the first view planes in the building image, performing an intersection operation on the first view plane including the occlusion map and the occlusion map included therein, removing the occlusion map, and obtaining an intersection plane corresponding to the first view plane including the occlusion map;
determining each first view plane not containing an occlusion map and each intersection plane in the building image as a target plane;
and performing detection frame detection on each target plane, removing the target planes whose detection frame areas are smaller than a preset detection area, to obtain the view image corresponding to the building image, and determining each remaining target plane whose detection frame area is not smaller than the preset detection area as a building view plane in the view image.
5. The method of claim 4, wherein the determining that a first view plane comprising an occlusion map exists among the first view planes in the building image comprises:
shooting a first image corresponding to the building image at the shooting angle corresponding to the building image, the first image comprising contour information of each first view plane and a building label of each first view plane at the shooting angle;
and for each first view plane, in response to detecting that an area not marked with a building label exists in the contour information of the first view plane, determining that the first view plane contains a corresponding occlusion map.
6. The method of claim 4, wherein the determining that a first view plane comprising an occlusion map exists among the first view planes in the building image comprises:
acquiring a standard image at the shooting angle corresponding to each building image, the standard image comprising standard view planes corresponding to the first view planes of the building image in a simulated scene without obstacles;
and comparing each first view plane of each building image with the standard view plane corresponding to that first view plane, and in response to detecting that the first view plane is inconsistent with the standard view plane, determining the first view plane as a first view plane including an occlusion map.
7. The method according to claim 1, wherein the enhancing processing of each building view plane in each view image and the building label corresponding to the building view plane to obtain a training plane corresponding to each building view plane and a training label corresponding to the training plane includes:
enhancing the plane features of each building view plane and the label features of each building label corresponding to the building view plane, to obtain an enhanced building view plane and an enhanced building label;
and performing feature splicing on the plane features of each building view plane and the plane features of the enhanced building view plane corresponding to that building view plane, to obtain a training plane corresponding to the building view plane, and performing feature splicing on the label features of the building label corresponding to the building view plane and the label features of the corresponding enhanced building label, to obtain a training label corresponding to the training plane.
8. A virtual data generation apparatus, comprising:
a selecting unit configured to perform selecting a pre-constructed virtual building model, the virtual building model being composed of a plurality of building planes, each building plane being provided with a corresponding building tag; the building plane is an outer surface area of the virtual building model;
the sampling unit is configured to perform multi-view sampling on the virtual building model to obtain a plurality of view images of the virtual building model, each view image comprising at least one building view plane, and each building view plane being a view of the building plane corresponding to the building view plane under a view angle corresponding to the view image;
the enhancement unit is configured to perform enhancement processing on each building view plane in each view image and a building label corresponding to the building view plane to obtain a training plane corresponding to each building view plane and a training label corresponding to the training plane;
and the processing unit is configured to execute the steps of taking the training planes in each view image and the training labels corresponding to the training planes as virtual data for neural network training.
9. An electronic device, comprising:
a processor;
a memory for storing the processor-executable instructions;
wherein the processor is configured to execute the instructions to implement the virtual data generation method of any of claims 1 to 7.
10. A storage medium, wherein instructions in the storage medium, when executed by a processor of an electronic device, enable the electronic device to perform the virtual data generation method of any one of claims 1 to 7.
CN202010750439.1A 2020-07-30 2020-07-30 Virtual data generation method and device, electronic equipment and storage medium Pending CN114065928A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010750439.1A CN114065928A (en) 2020-07-30 2020-07-30 Virtual data generation method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114065928A true CN114065928A (en) 2022-02-18

Family

ID=80227139


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination