CN116958381A - Automatic generation method for building facade replacement texture - Google Patents

Automatic generation method for building facade replacement texture

Info

Publication number
CN116958381A
Authority
CN
China
Prior art keywords
window
building
texture
network
mask
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310735232.0A
Other languages
Chinese (zh)
Inventor
赵婷婷
李志林
朱军
慎利
遆鹏
谢亚坤
沈星宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southwest Jiaotong University
Original Assignee
Southwest Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southwest Jiaotong University
Priority to CN202310735232.0A
Publication of CN116958381A


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/00 - 3D [Three Dimensional] image rendering
    • G06T15/04 - Texture mapping
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/0475 - Generative networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00 - Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 - Scenes; Scene-specific elements
    • G06V20/10 - Terrestrial scenes
    • G06V20/176 - Urban or other man-made structures

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Graphics (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Molecular Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Geometry (AREA)
  • Processing Or Creating Images (AREA)

Abstract

The invention discloses a method for automatically generating replacement textures for building facades, which comprises the following steps: extracting windows with a Mask R-CNN network to obtain the spatial position information and mask data of the windows on the building facade; generating a building facade wall texture image with a Pix2PixHD network; regularizing the window mask data, retrieving window textures of consistent structure and style from a material library, generating window textures that preserve the spatial layout and geometric structure, attaching them to the generated wall texture, and replacing the original texture data of the building model through texture mapping. The invention solves the problem that building facades are prone to distortion and holes in three-dimensional modeling of urban areas, while preserving the real spatial position, size, layout and other information of the facade windows.

Description

Automatic generation method for building facade replacement texture
Technical Field
The invention relates to the technical field of computer vision image generation, and in particular to a method for automatically generating building facade replacement textures.
Background
Analysis and three-dimensional modeling of real urban scenes are fundamental problems in computer vision and computer graphics. With the wide use of three-dimensional city building models in fields such as housing surveys, stereo measurement and urban planning, the construction of three-dimensional building models has attracted great attention. Owing to its low modeling cost, rich texture detail, short modeling time and high speed, oblique photography modeling has become the mainstream method for large-scale reconstruction.
Currently, oblique photography modeling mostly employs multi-view stereo (MVS) methods. The workflow comprises three steps: first, camera parameters and sparse three-dimensional scene information are recovered by structure from motion (SfM) from aerial images captured by an unmanned aerial vehicle; then dense matching with an MVS method yields a dense three-dimensional point cloud; finally, the three-dimensional building model is obtained by Poisson surface reconstruction of the point cloud followed by texture mapping. However, this process still has problems, such as distortion during model reconstruction; in addition, stretching, clipping, transformation and other operations during mapping can distort the texture, so that the mapped texture fits poorly and looks warped. Moreover, when the UAV images of a building are insufficient in number, or the building is occluded by vegetation, the resulting three-dimensional model exhibits facade distortion, holes and similar defects. Texture replacement of the building facade is therefore required.
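For concreteness, the Poisson-reconstruction step of this pipeline can be sketched with Open3D; the patent describes the pipeline only generically, so the library choice, the file names and the depth parameter below are illustrative assumptions.

```python
# Sketch of the Poisson surface reconstruction step of the MVS pipeline
# described above, using Open3D as one common implementation choice.
# "dense_cloud.ply", "building_mesh.ply" and depth=9 are assumptions.
import open3d as o3d

pcd = o3d.io.read_point_cloud("dense_cloud.ply")  # dense point cloud from MVS
pcd.estimate_normals()                            # Poisson needs oriented normals
mesh, _densities = o3d.geometry.TriangleMesh.create_from_point_cloud_poisson(
    pcd, depth=9)                                 # higher depth = finer mesh
o3d.io.write_triangle_mesh("building_mesh.ply", mesh)
```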
In addition, for image texture generation, GAN networks are currently the common tool for synthesizing picture textures. A building facade image, however, contains not only wall textures but also objects such as windows; if the windows on the facade were also synthesized by image generation, the position, size and layout information of the actual windows would be destroyed, which fails the requirement of expressing the real scene of a three-dimensional building.
Disclosure of Invention
In order to solve the problems in the prior art, the invention aims to provide a method for automatically generating building facade replacement textures, which solves the problem that building facades are prone to distortion and holes in three-dimensional modeling of urban areas, while preserving the real spatial position, size, layout and other information of the facade windows.
In order to achieve the above purpose, the invention adopts the following technical scheme: a method for automatically generating building facade replacement textures, comprising the following steps:
step 1, extracting windows with a Mask R-CNN network to obtain the spatial position information and mask data of the windows on the building facade;
step 2, generating a building facade wall texture image with a Pix2PixHD network;
step 3, regularizing the window mask data, retrieving window textures of consistent structure and style from a material library, generating window textures that preserve the spatial layout and geometric structure, attaching them to the generated wall texture, and replacing the original texture data of the building model through texture mapping.
As a further improvement of the present invention, step 1 is specifically as follows:
produce the building facade window extraction training samples: prepare a building facade dataset, manually annotate the windows with labeling software, store the annotations in Json format, and augment the data with random rotations by multiples of 90 degrees, horizontal flipping and scaling transforms, so as to improve the network's segmentation results and the model's generalization;
fine-tune the parameters of and train the Mask R-CNN network on the window training dataset, test the trained model, and extract the position information and mask files of the building facade windows.
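For illustration, this extraction step can be sketched in a few lines of Python; torchvision's Mask R-CNN implementation stands in for the network (the patent does not name a specific implementation), and the checkpoint name and 0.7 score threshold are assumptions.

```python
# Minimal sketch of window extraction with a fine-tuned Mask R-CNN.
# torchvision's implementation is used as a stand-in; "window_maskrcnn.pth"
# and the 0.7 score threshold are assumptions, not values from the patent.
import torch
import torchvision
from torchvision.transforms.functional import to_tensor
from PIL import Image

model = torchvision.models.detection.maskrcnn_resnet50_fpn(num_classes=2)  # background + window
model.load_state_dict(torch.load("window_maskrcnn.pth"))
model.eval()

image = to_tensor(Image.open("facade.jpg").convert("RGB"))
with torch.no_grad():
    pred = model([image])[0]

keep = pred["scores"] > 0.7                # keep confident detections only
boxes = pred["boxes"][keep]                # (N, 4) window positions in pixels
masks = pred["masks"][keep, 0] > 0.5       # (N, H, W) boolean window masks
```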
As a further improvement of the present invention, step 2 is specifically as follows:
prepare a building facade wall dataset: the input is a noise image and the label is a real wall texture image; augment the image pairs with scaling transforms;
fine-tune the parameters of and train the Pix2PixHD network: after training, test the model to obtain the generated building facade wall texture image.
As a further improvement of the present invention, step 3 is specifically as follows:
regularize the window mask files and complete any undetected windows, then attach the windows to the generated facade wall texture image according to the windows' real layout and size information recorded in the mask files; finally, map the generated facade texture image onto the three-dimensional model through texture mapping.
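A minimal sketch of this regularization-and-compositing step follows. Replacing each detected mask by its bounding rectangle is one simple regularization choice, not an algorithm fixed by the patent, and retrieve_window is a hypothetical helper (one possible retrieval rule is sketched in the embodiment below).

```python
# Sketch of step 3: regularize window masks to rectangles and paste
# retrieved window textures onto the generated wall texture. Using the
# bounding rectangle is one simple regularization choice; the patent does
# not fix the algorithm. retrieve_window is a hypothetical helper.
import cv2
import numpy as np

def regularize(mask: np.ndarray):
    """Replace an irregular binary window mask with its bounding rectangle."""
    ys, xs = np.nonzero(mask)
    return cv2.boundingRect(np.column_stack([xs, ys]).astype(np.int32))

def composite(wall: np.ndarray, masks, window_library) -> np.ndarray:
    """Paste one retrieved window texture per regularized window position."""
    facade = wall.copy()
    for mask in masks:
        x, y, w, h = regularize(mask)
        facade[y:y + h, x:x + w] = retrieve_window(window_library, w, h)
    return facade
```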
The invention uses image generation and window extraction to produce realistic wall textures for the building facade, and preserves the windows' real layout information and actual sizes according to the extracted window mask files.
The invention focuses on the automatic generation of replacement textures for building facades: by replacing distorted and hollow facades while keeping the real spatial position, size, layout and other information of the facade windows, the three-dimensional building model looks more regular and uniform after facade replacement, achieving the purpose of a complete building facade.
The beneficial effects of the invention are as follows:
1. the invention automatically extracts the window masks on the building facade, saving the time of manually measuring window positions and sizes;
2. the facade textures generated by the invention keep the true positions and structural layout of the windows on the original facade, ensuring the authenticity of the window layout on the facade;
3. the facade images generated by the invention are not limited to a 512 x 512 image size; larger images can be generated, which better suits the large facades found in actual models.
Drawings
FIG. 1 is a flow chart of building facade texture generation in an embodiment of the present invention;
FIG. 2 is a flow chart of the Mask R-CNN network in an embodiment of the invention;
FIG. 3 is a schematic diagram of a window mask file extracted in an embodiment of the present invention;
FIG. 4 is a schematic diagram of the GAN network structure in an embodiment of the present invention;
FIG. 5 is a diagram of the Pix2PixHD generator network in an embodiment of the invention;
FIG. 6 is a diagram illustrating the computation of the Pix2PixHD discriminator network loss in an embodiment of the present invention;
FIG. 7 is the input building facade image in an embodiment of the present invention;
FIG. 8 is the regularized window mask file in an embodiment of the present invention;
FIG. 9 shows the building facade result generated in an embodiment of the invention.
Detailed Description
Embodiments of the present invention will be described in detail below with reference to the accompanying drawings.
Examples
In order to extract the window information and layout structure of a building facade, this embodiment extracts windows with a Mask R-CNN network, obtaining the spatial position information and mask data of the windows on the facade; generates a building facade wall texture image with a Pix2PixHD network; then regularizes the window mask data, retrieves window textures of consistent structure and style from a material library, generates window textures that preserve the spatial layout and geometric structure, attaches them to the generated wall texture, and replaces the original texture data of the building model through texture mapping.
The basic steps of the algorithm of the embodiment are as follows:
(1) Produce the building facade window extraction training samples. Prepare a building facade dataset, manually annotate the windows with labeling software, store the annotations in Json format, and augment the data with random rotations by multiples of 90 degrees, horizontal flipping and scaling transforms, so as to improve the network's segmentation results and the model's generalization.
(2) Fine-tune the parameters of and train the Mask R-CNN network on the window training dataset, test the trained model, and extract the position information and mask files of the building facade windows.
(3) Prepare a building facade wall dataset. The input is a noise image and the label is a real wall texture image. Augment the image pairs with scaling transforms.
(4) Fine-tune the parameters of and train the Pix2PixHD network. After training, test the model to obtain the generated building facade wall texture image.
(5) Regularize the window mask files and complete any undetected windows, then attach the windows to the generated facade wall texture image according to the windows' real layout and size information recorded in the mask files. Finally, map the generated facade texture image onto the three-dimensional model through texture mapping.
This embodiment is further described below:
the specific implementation flow chart of the building elevation texture replacement in this embodiment is shown in fig. 1, firstly, parameter adjustment and model training are performed on a model according to a prepared building elevation texture data set, and then a building elevation with a proper size is generated by using the trained model. And then, acquiring a window of the original building elevation by using a Mask-Rcnn, detecting and positioning the window, acquiring a binary file of the window of the building elevation, carrying out Mask regularization on the acquired binary file of the window, selecting a window with proper size and style from a window database prepared in advance by using the regularized window Mask file, attaching the window to the generated building elevation, acquiring the building elevation with a real window layout, and finally mapping the texture image of the acquired building elevation onto a three-dimensional building model.
The field of deep learning has made major breakthroughs in recent years, with most research results based on perception: a computer perceives objects and recognizes content by imitating human thinking. The generative adversarial network (GAN) was proposed by Goodfellow et al. in 2014 and, despite its short history, has had a great impact on the field of artificial intelligence. In the GAN game, the data distribution produced by the generator network (G) is fitted to the real data distribution: the generator receives random noise and outputs a generated building facade picture. The discriminator network (D) computes, from an input facade picture, the probability that the picture is generated or real. The two networks update in opposition according to the returned results and balance each other, and the dynamics finally reach a Nash equilibrium; the GAN network structure is shown in fig. 4. Owing to its strong performance, GAN has gradually been applied in many directions of image processing, including image translation, image restoration, style transfer and image generation. Generating a building facade amounts to generating pictures with a GAN network; because facades in actual models are large, this embodiment uses a Mask R-CNN network to extract and segment window positions and a Pix2PixHD framework to generate high-definition facade textures.
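The adversarial game described above can be made concrete with a generic PyTorch sketch. This illustrates the GAN objective only, not the patent's training code; the discriminator D is assumed to end in a sigmoid so that it outputs probabilities.

```python
# Generic GAN objective sketch: D learns to score real wall textures as 1
# and generated ones as 0, while G learns to make D score its output as 1.
# Illustrative only; D is assumed to output sigmoid probabilities.
import torch
import torch.nn.functional as F

def d_step(D, G, real, noise):
    """Discriminator step: push real -> 1 and generated -> 0."""
    fake = G(noise).detach()                 # do not backpropagate into G here
    real_p, fake_p = D(real), D(fake)
    return (F.binary_cross_entropy(real_p, torch.ones_like(real_p)) +
            F.binary_cross_entropy(fake_p, torch.zeros_like(fake_p)))

def g_step(D, G, noise):
    """Generator step: maximize the chance D mistakes the fake for real."""
    fake_p = D(G(noise))
    return F.binary_cross_entropy(fake_p, torch.ones_like(fake_p))
```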
The Mask R-CNN network structure is shown in fig. 2: a backbone feature extraction network and a feature pyramid network (FPN) perform convolutional feature extraction and fuse low-level information; a region proposal network (RPN) generates proposal windows; the feature windows of each level are then fused and pooled into fixed-size feature maps; finally, classification, bounding-box regression and window mask prediction are performed on the fixed-size feature maps. The windows in the collected facade dataset are annotated manually with labeling software, the data are augmented with random rotations by multiples of 90 degrees, horizontal flipping and scaling transforms, and the model is then trained. The window mask results extracted in the model testing stage are shown in fig. 3.
The workflow of the Pix2PixHD network is shown in figs. 5 and 6: G denotes the generator network and D the discriminator network. The generator produces the data distribution of the target domain from the input data, i.e. it maximizes the probability that the discriminator errs, so that the discriminator mistakes the generated image for a real sample rather than a generated fake; the discriminator aims to maximize the discrimination loss, accurately judging the generator's output and distinguishing the generated images from the label images of the target domain. Generator and discriminator oppose each other during GAN training and finally learn their respective optimal states together, reaching a Nash equilibrium.
To improve the resolution of the generated image, Pix2PixHD designs a coarse-to-fine generator comprising two sub-networks, as shown in fig. 5. G1 is a global generator network and G2 a local enhancement network. The input image first passes through G2 to obtain local shallow features; it is also downsampled by a factor of 2 and fed to G1, a complete encoder-decoder structure that captures the image's global features; finally the local features from G2 and the global features from G1 are added element-wise (Element-wise Add) and passed on through G2. G2 thereby doubles the resolution of the image generated by G1. For the discriminator, to better produce high-resolution pictures, Pix2PixHD adopts the multi-scale discriminator shown in fig. 6: three discriminator components with identical network structure but different image scales, where the images from the generator and the label images are downsampled by factors of 1, 2 and 4, respectively. The downsampled label pictures and generated pictures are given to the three components; the discriminator for the lowest-resolution picture has the largest receptive field and the strongest global sense of the generated image, while the discriminator for the highest-resolution image captures the richest details.
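The two design points just described, the element-wise addition of G1's global features into G2 and the three discriminators fed with images downsampled by factors of 1, 2 and 4, can be sketched as follows; this is an illustrative reconstruction, not the original Pix2PixHD code, and g1, g2_front and g2_back are hypothetical module names.

```python
# Sketch of Pix2PixHD's multi-scale discrimination and coarse-to-fine
# generation as described above. Illustrative reconstruction only.
import torch.nn.functional as F

def multiscale_d(discriminators, image):
    """Run 3 same-architecture discriminators on 1x, 2x, 4x downsampled input."""
    outputs = []
    for i, D in enumerate(discriminators):           # i = 0, 1, 2
        scaled = F.avg_pool2d(image, 2 ** i) if i else image
        outputs.append(D(scaled))
    return outputs

def coarse_to_fine(g1, g2_front, g2_back, image):
    """G2's front end extracts local features; G1 sees the 2x-downsampled
    image for global features; the two maps are added element-wise."""
    local_feat = g2_front(image)
    global_feat = g1(F.avg_pool2d(image, 2))
    return g2_back(local_feat + global_feat)         # element-wise add
```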
The implementation flow of this embodiment, shown in fig. 1, divides into building facade texture dataset creation, model parameter fine-tuning and training, and model testing.
The specific implementation process is as follows:
(1) Create a facade texture dataset for model training. Because facade images are relatively large, they cannot be fed directly into the network for training; moreover, an oversized training set consumes too much memory during training and slows model training. The training data therefore need to be cropped to a suitable size. Finally, data augmentation, mainly horizontal flipping and scaling transforms, increases the diversity of the dataset.
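A minimal sketch of this cropping and augmentation follows; the 512 x 512 crop size and the 0.8 to 1.2 scaling range are example choices, not values mandated by the patent.

```python
# Sketch of dataset preparation: crop large facade textures to a trainable
# size, then augment with horizontal flips and scaling transforms.
# The crop size and scaling range are assumptions for the example.
import random
from PIL import Image

def random_crop(img: Image.Image, size: int = 512) -> Image.Image:
    x = random.randint(0, img.width - size)    # assumes img is larger than size
    y = random.randint(0, img.height - size)
    return img.crop((x, y, x + size, y + size))

def augment(img: Image.Image) -> Image.Image:
    if random.random() < 0.5:                  # horizontal flip
        img = img.transpose(Image.FLIP_LEFT_RIGHT)
    s = random.uniform(0.8, 1.2)               # scaling transform
    return img.resize((int(img.width * s), int(img.height * s)))
```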
(2) Model parameter fine-tuning and training. Fine-tune the hyperparameters of the model, such as learning rate, number of iterations and batch size; observe the training-loss and validation-loss curves during training, and stop training early when the training loss no longer decreases or the validation-set accuracy no longer increases.
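The early-stopping rule stated here can be sketched as follows; the patience value is an assumption, since the patent describes the rule only qualitatively.

```python
# Sketch of the early-stopping rule above: stop when the validation metric
# has not improved for `patience` epochs. patience=10 is an assumption.
def train_with_early_stopping(train_epoch, validate, max_epochs=200, patience=10):
    best, stale = float("-inf"), 0
    for _ in range(max_epochs):
        train_epoch()                     # one pass over the training set
        metric = validate()               # e.g. validation-set accuracy
        if metric > best:
            best, stale = metric, 0
        else:
            stale += 1
            if stale >= patience:         # no improvement: stop early
                break
    return best
```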
(3) Model testing. Call the trained model for testing; inputting a noise image yields the building facade texture image generated by the model.
(4) Segment and extract the windows in the original facade image with the Mask R-CNN network and obtain their position information; regularize the window mask file, retrieve suitable windows from the material library prepared in advance according to the regularized masks, and attach them at the proper positions of the generated facade. The resulting facade texture is then attached to the three-dimensional model through texture mapping.
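One simple way to retrieve a suitable window from the material library is nearest-aspect-ratio matching, sketched below; the patent does not specify the retrieval criterion, so this matching rule is an assumption.

```python
# Sketch of window retrieval from the prepared material library: pick the
# texture whose aspect ratio best matches the regularized opening, then
# resize it to fit. The nearest-aspect-ratio rule is an assumption.
import cv2

def retrieve_window(window_library, w, h):
    """window_library: list of window texture images as numpy arrays."""
    target = w / h
    best = min(window_library,
               key=lambda img: abs(img.shape[1] / img.shape[0] - target))
    return cv2.resize(best, (w, h))       # fit the regularized opening
```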
This embodiment has been fully verified in experiments. Fig. 8 shows the new mask file after size regularization of the extracted window mask file (shown in fig. 3): because the windows in the mask file extracted by the Mask R-CNN network have irregular sizes and shapes, all window sizes are regularized with a regularization algorithm. Fig. 9 shows the result of attaching window files retrieved from the prepared window database onto the generated facade; comparison with fig. 7 shows that the layout positions of the windows remain unchanged. Finally, the result of fig. 9 is mapped onto the three-dimensional building model by texture mapping, yielding a building facade texture that preserves the window structure of the original facade shown in fig. 7.
The foregoing examples merely illustrate specific embodiments of the invention; they are described in detail but are not to be construed as limiting the scope of the invention. It should be noted that those skilled in the art can make several variations and modifications without departing from the spirit of the invention, all of which fall within the scope of the invention.

Claims (4)

1. A method for automatically generating building facade replacement textures, characterized by comprising the following steps:
step 1, extracting windows with a Mask R-CNN network to obtain the spatial position information and mask data of the windows on the building facade;
step 2, generating a building facade wall texture image with a Pix2PixHD network;
step 3, regularizing the window mask data, retrieving window textures of consistent structure and style from a material library, generating window textures that preserve the spatial layout and geometric structure, attaching them to the generated wall texture, and replacing the original texture data of the building model through texture mapping.
2. The method for automatically generating building facade replacement textures according to claim 1, characterized in that step 1 is specifically as follows:
produce the building facade window extraction training samples: prepare a building facade dataset, manually annotate the windows with labeling software, store the annotations in Json format, and augment the data with random rotations by multiples of 90 degrees, horizontal flipping and scaling transforms, so as to improve the network's segmentation results and the model's generalization;
fine-tune the parameters of and train the Mask R-CNN network on the window training dataset, test the trained model, and extract the spatial position information and mask files of the building facade windows.
3. The method for automatically generating building facade replacement textures according to claim 2, characterized in that step 2 is specifically as follows:
prepare a building facade wall dataset: the input is a noise image and the label is a real wall texture image; augment the image pairs with scaling transforms;
fine-tune the parameters of and train the Pix2PixHD network: after training, test the model to obtain the generated building facade wall texture image.
4. The method for automatically generating building facade replacement textures according to claim 3, characterized in that step 3 is specifically as follows:
regularize the window mask files and complete any undetected windows, then attach the windows to the generated facade wall texture image according to the windows' real layout and size information recorded in the mask files; finally, map the generated facade texture image onto the three-dimensional model through texture mapping.
CN202310735232.0A 2023-06-20 2023-06-20 Automatic generation method for building facade replacement texture Pending CN116958381A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310735232.0A CN116958381A (en) 2023-06-20 2023-06-20 Automatic generation method for building facade replacement texture

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310735232.0A CN116958381A (en) 2023-06-20 2023-06-20 Automatic generation method for building facade replacement texture

Publications (1)

Publication Number Publication Date
CN116958381A true

Family

ID=88441824

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310735232.0A Pending CN116958381A (en) 2023-06-20 2023-06-20 Automatic generation method for building facade replacement texture

Country Status (1)

Country Link
CN (1) CN116958381A (en)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114066768A (en) * 2021-11-24 2022-02-18 武汉大势智慧科技有限公司 Building facade image restoration method, device, equipment and storage medium
CN114117614A (en) * 2021-12-01 2022-03-01 武汉大势智慧科技有限公司 Method and system for automatically generating building facade texture

Similar Documents

Publication Publication Date Title
CN111797716B (en) Single target tracking method based on Siamese network
CN109636905B (en) Environment semantic mapping method based on deep convolutional neural network
CN110852267B (en) Crowd density estimation method and device based on optical flow fusion type deep neural network
CN109191369A (en) 2D pictures turn method, storage medium and the device of 3D model
CN113449594B (en) Multilayer network combined remote sensing image ground semantic segmentation and area calculation method
CN110188835B (en) Data-enhanced pedestrian re-identification method based on generative confrontation network model
CN112163498B (en) Method for establishing pedestrian re-identification model with foreground guiding and texture focusing functions and application of method
CN105160310A (en) 3D (three-dimensional) convolutional neural network based human body behavior recognition method
CN114117614A (en) Method and system for automatically generating building facade texture
CN110210431B (en) Point cloud semantic labeling and optimization-based point cloud classification method
CN110852182A (en) Depth video human body behavior recognition method based on three-dimensional space time sequence modeling
CN113011288A (en) Mask RCNN algorithm-based remote sensing building detection method
CN114943876A (en) Cloud and cloud shadow detection method and device for multi-level semantic fusion and storage medium
CN104463962B (en) Three-dimensional scene reconstruction method based on GPS information video
CN106683125A (en) RGB-D image registration method based on 2D/3D mode switching
CN115719445A (en) Seafood identification method based on deep learning and raspberry type 4B module
CN113610024B (en) Multi-strategy deep learning remote sensing image small target detection method
CN113076806A (en) Structure-enhanced semi-supervised online map generation method
CN115115847B (en) Three-dimensional sparse reconstruction method and device and electronic device
CN113920254B (en) Monocular RGB (Red Green blue) -based indoor three-dimensional reconstruction method and system thereof
CN116958381A (en) Automatic generation method for building facade replacement texture
CN117011692A (en) Road identification method and related device
CN113139965A (en) Indoor real-time three-dimensional semantic segmentation method based on depth map
CN112115771A (en) Gait image synthesis method based on star-shaped generation confrontation network
CN117238018B (en) Multi-granularity-based incremental deep and wide network living body detection method, medium and equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination