CN112465970A - Navigation map construction method, device, system, electronic device and storage medium - Google Patents


Info

Publication number: CN112465970A (granted as CN112465970B)
Application number: CN202011362824.5A
Authority: CN (China)
Original language: Chinese (zh)
Inventors: 果泽龄, 刘鹤云
Current and original assignee: Beijing Sinian Zhijia Technology Co., Ltd.
Prior art keywords: map, orthographic projection, image, global, navigation map
Legal status: Active (granted; the legal status and assignee listings are assumptions, not legal conclusions, and Google has not performed a legal analysis)

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 17/00: Three-dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T 17/05: Geographic models
    • G: PHYSICS
    • G01: MEASURING; TESTING
    • G01C: MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C 21/00: Navigation; navigational instruments not provided for in groups G01C 1/00 - G01C 19/00
    • G01C 21/26: Navigation; navigational instruments not provided for in groups G01C 1/00 - G01C 19/00, specially adapted for navigation in a road network
    • G01C 21/28: Navigation in a road network with correlation of data from several navigational instruments
    • G01C 21/30: Map- or contour-matching
    • G01C 21/32: Structuring or formatting of map data
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00: Information retrieval; database structures therefor; file system structures therefor
    • G06F 16/20: Information retrieval of structured data, e.g. relational data
    • G06F 16/29: Geographical information databases
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/21: Design or setup of recognition systems or techniques; extraction of features in feature space; blind source separation
    • G06F 18/214: Generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 19/00: Manipulating 3D models or images for computer graphics
    • G06T 19/003: Navigation within 3D models or images

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Remote Sensing (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Geometry (AREA)
  • Computer Graphics (AREA)
  • Databases & Information Systems (AREA)
  • Computer Hardware Design (AREA)
  • Automation & Control Theory (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Processing Or Creating Images (AREA)
  • Instructional Devices (AREA)
  • Navigation (AREA)

Abstract

The application relates to a navigation map construction method, device, system, electronic device and storage medium. The navigation map construction method comprises the following steps: acquiring a number of orthographic projection images by fixed-height aerial photography with an unmanned aerial vehicle; filtering ground moving targets from the orthographic projection images with a fully trained generative adversarial network model to obtain corresponding static map images; performing three-dimensional scene construction and orthographic projection global map generation from the static map images by aerial triangulation modeling to obtain a partitioned orthographic projection global map; and segmenting the orthographic projection global map with a fully trained deep segmentation model, then splicing the segmented map images with road area identifications obtained by segmentation to obtain a navigation map with road area identifications. The method solves the problems of low precision and huge workload of maps constructed by satellite imagery plus manual labeling, and realizes efficient, high-precision navigation map construction.

Description

Navigation map construction method, device, system, electronic device and storage medium
Technical Field
The present application relates to the field of mapping technologies, and in particular, to a method, an apparatus, a system, an electronic apparatus, and a storage medium for constructing a navigation map.
Background
With the rapid development of sensors, big data and artificial intelligence, the unmanned-driving technology built on them can improve transport efficiency, traffic safety and travel convenience, and has become one of the research hotspots in academia and industry in recent years. In unmanned driving, safety is always the primary issue. A large number of experiments show that accidents cannot be avoided using only existing cameras, radar and other vehicle-mounted sensors; automatic driving therefore needs information beyond the conventional sensing range, and combining a GPS positioning system with a pre-established map for navigation solves this problem.
Most traditional navigation maps are produced from satellite imagery plus manual labeling. The workload is huge, the precision is usually on the order of 10 meters, and the width and position of the lane a vehicle occupies usually cannot be determined, so such maps can serve only as a driving aid and are insufficient as a primary technical means for automatic driving.
At present, no effective solution has been proposed in the related art for the problems of low precision and huge workload of maps constructed by satellite imagery and manual labeling.
Disclosure of Invention
The embodiment of the application provides a navigation map construction method, a navigation map construction device, a navigation map construction system, an electronic device and a storage medium, and aims to at least solve the problems that in the related art, the accuracy of a map constructed in a satellite shooting and artificial labeling mode is low and the workload is huge.
In a first aspect, an embodiment of the present application provides a navigation map construction method, including:
acquiring a plurality of orthographic projection images by using aerial photography with a fixed height of an unmanned aerial vehicle;
filtering ground moving targets from the orthographic projection images by using a fully trained generative adversarial network model to obtain corresponding static map images;
performing three-dimensional scene construction and orthographic projection global map generation from the static map images by using aerial triangulation to obtain a partitioned orthographic projection global map;
and segmenting the orthographic projection global map by using a fully trained deep segmentation model, and splicing the segmented map images with road area identifications obtained by segmentation to obtain a navigation map with road area identifications.
In some of these embodiments, the method further comprises training the generative adversarial network model;
the training of the generative adversarial network model comprises:
acquiring a number of orthographic projection images;
generating a set of contrasting image pairs from the orthographic projection images as a training set;
and training the generative adversarial network model on the training set to obtain the fully trained generative adversarial network model.
In some of these embodiments, the method further comprises training the deep segmentation model;
the training of the deep segmentation model comprises:
acquiring a partitioned orthographic projection global map;
dividing the orthographic projection global map into a training set and a test set;
and training the deep segmentation model on the training set and the test set to obtain a fully trained deep segmentation model.
In some of these embodiments, acquiring a number of orthographic projection images by fixed-height aerial photography with an unmanned aerial vehicle includes:
using fixed-height aerial photography with an unmanned aerial vehicle to obtain a number of orthographic projection images with GPS positioning information for the fixed height and fixed area, and storing the orthographic projection images.
In some embodiments, performing three-dimensional scene construction and orthographic projection global map generation by aerial triangulation modeling from the static map images to obtain a partitioned orthographic projection global map includes:
detecting corresponding (tie) points in adjacent regions of the static map images to obtain pixel points containing spatial coordinates and RGB pixel values;
carrying out raster scene reconstruction from the pixel points to obtain a global 3D map;
and carrying out orthographic projection cutting and splicing of the global 3D map according to a preset area division to obtain the partitioned orthographic projection global map.
In some embodiments, performing raster scene reconstruction from the pixel points to obtain a global 3D map includes:
comparing adjacent points among the pixel points, merging adjacent points lying on the same plane into a grid cell, and texturing the grid with the pixel RGB values, so as to establish a series of grids with RGB colour information at different angles and thereby build the global 3D map.
In a second aspect, an embodiment of the present application provides a navigation map building apparatus, including: the system comprises an acquisition module, a filtering module, a partition building module and a segmentation and splicing module;
the acquisition module is used for acquiring a plurality of orthographic projection images by using aerial photography with a fixed height of the unmanned aerial vehicle;
the filtering module is used for filtering ground moving targets from the orthographic projection images by using a fully trained generative adversarial network model to obtain corresponding static map images;
the partition building module is used for performing three-dimensional scene construction and orthographic projection global map generation from the static map images by aerial triangulation modeling to obtain a partitioned orthographic projection global map;
and the segmentation and splicing module is used for segmenting the orthographic projection global map by using a fully trained deep segmentation model, and splicing the segmented map images with road area identifications obtained by segmentation to obtain a navigation map with road area identifications.
In a third aspect, an embodiment of the present application provides a navigation map building system, including: a terminal device, a transmission device and a server device; the terminal equipment is connected with the server equipment through the transmission equipment;
the terminal equipment is used for acquiring a plurality of orthographic projection images;
the transmission equipment is used for transmitting a plurality of orthographic projection images;
the server device is configured to execute the navigation map construction method according to the first aspect.
In a fourth aspect, an embodiment of the present application provides an electronic device, which includes a memory, a processor, and a computer program stored on the memory and executable on the processor, and the processor, when executing the computer program, implements the navigation map construction method according to the first aspect.
In a fifth aspect, the present application provides a storage medium, on which a computer program is stored, and when the program is executed by a processor, the navigation map construction method according to the first aspect is implemented.
Compared with the related art, the navigation map construction method, device, system, electronic device and storage medium provided by the embodiments of the application acquire a number of orthographic projection images by fixed-height aerial photography with an unmanned aerial vehicle; filter ground moving targets from the orthographic projection images with a fully trained generative adversarial network model to obtain corresponding static map images; perform three-dimensional scene construction and orthographic projection global map generation from the static map images by aerial triangulation to obtain a partitioned orthographic projection global map; and segment the orthographic projection global map with a fully trained deep segmentation model, splicing the segmented map images with road area identifications obtained by segmentation to obtain a navigation map with road area identifications. This solves the problems of low precision and huge workload of maps constructed by satellite imagery and manual labeling, and realizes efficient, high-precision navigation map construction.
The details of one or more embodiments of the application are set forth in the accompanying drawings and the description below to provide a more thorough understanding of the application.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
fig. 1 is a block diagram of a hardware structure of a terminal device of a navigation map construction method according to an embodiment of the present application;
FIG. 2 is a flowchart of a navigation map construction method according to an embodiment of the present application;
FIG. 3 is a flowchart of the training of the generative adversarial network model according to an embodiment of the present application;
FIG. 4 is a flowchart of the training of the deep segmentation model according to an embodiment of the present application;
FIG. 5 is a schematic diagram of aerial triangulation matching of corresponding points in neighboring images according to an embodiment of the present application;
FIG. 6 is a schematic diagram of an aerial triangulation solution according to an embodiment of the present application;
fig. 7 is a block diagram illustrating a navigation map construction apparatus according to an embodiment of the present application.
Reference numerals: 210, acquisition module; 220, filtering module; 230, partition building module; 240, segmentation and splicing module.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be described and illustrated below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments provided in the present application without any inventive step are within the scope of protection of the present application. Moreover, it should be appreciated that in the development of any such actual implementation, as in any engineering or design project, numerous implementation-specific decisions must be made to achieve the developers' specific goals, such as compliance with system-related and business-related constraints, which may vary from one implementation to another.
Reference in the specification to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the specification. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of ordinary skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments without conflict.
Unless defined otherwise, technical or scientific terms referred to herein shall have the ordinary meaning as understood by those of ordinary skill in the art to which this application belongs. Reference to "a," "an," "the," and similar words throughout this application are not to be construed as limiting in number, and may refer to the singular or the plural. The present application is directed to the use of the terms "including," "comprising," "having," and any variations thereof, which are intended to cover non-exclusive inclusions; for example, a process, method, system, article, or apparatus that comprises a list of steps or modules (elements) is not limited to the listed steps or elements, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus. Reference to "connected," "coupled," and the like in this application is not intended to be limited to physical or mechanical connections, but may include electrical connections, whether direct or indirect. Reference herein to "a plurality" means greater than or equal to two. "and/or" describes an association relationship of associated objects, meaning that three relationships may exist, for example, "A and/or B" may mean: a exists alone, A and B exist simultaneously, and B exists alone. Reference herein to the terms "first," "second," "third," and the like, are merely to distinguish similar objects and do not denote a particular ordering for the objects.
The method provided by this embodiment can be executed on a terminal, a computer or a similar computing device. Taking operation on a terminal as an example, fig. 1 is a block diagram of the hardware structure of a terminal for the navigation map construction method according to an embodiment of the present application. As shown in fig. 1, the terminal 10 may include one or more (only one is shown in fig. 1) processors 102 (the processor 102 may include, but is not limited to, a processing device such as a microprocessor (MCU) or a programmable logic device (FPGA)) and a memory 104 for storing data, and optionally a transmission device 106 for communication functions and an input-output device 108. It will be understood by those skilled in the art that the structure shown in fig. 1 is only an illustration and does not limit the structure of the terminal. For example, the terminal 10 may also include more or fewer components than shown in FIG. 1, or have a different configuration than shown in FIG. 1.
The memory 104 may be used to store a computer program, for example, a software program and modules of application software, such as a computer program corresponding to the navigation map construction method in the embodiments of the present application; the processor 102 executes various functional applications and data processing by running the computer program stored in the memory 104, thereby implementing the method described above. The memory 104 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory located remotely from the processor 102, which may be connected to the terminal 10 via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission device 106 is used to receive or transmit data via a network. Specific examples of the network described above may include a wireless network provided by a communication provider of the terminal 10. In one example, the transmission device 106 includes a Network adapter (NIC) that can be connected to other Network devices through a base station to communicate with the internet. In one example, the transmission device 106 may be a Radio Frequency (RF) module, which is used to communicate with the internet in a wireless manner.
The present embodiment provides a navigation map construction method, and fig. 2 is a flowchart of a navigation map construction method according to an embodiment of the present application, and as shown in fig. 2, the flowchart includes the following steps:
step S210, acquiring a plurality of orthographic projection images by using aerial photography with a fixed height of an unmanned aerial vehicle;
step S220, filtering ground moving targets from the orthographic projection images by using a fully trained generative adversarial network model to obtain corresponding static map images;
step S230, performing three-dimensional scene construction and orthographic projection global map generation from the static map images by aerial triangulation modeling to obtain a partitioned orthographic projection global map;
and step S240, segmenting the orthographic projection global map by using the fully trained deep segmentation model, and splicing the segmented map images with road area identifications obtained by segmentation to obtain the navigation map with road area identifications.
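The four steps S210-S240 can be sketched as a minimal pipeline skeleton. Every function body below is a hypothetical placeholder (simple string transforms standing in for the GAN filter, the aerial-triangulation reconstruction, and the segmentation network); the sketch shows only how the stages compose:

```python
from typing import List

def filter_moving_targets(images: List[str]) -> List[str]:
    # Placeholder for the GAN model: orthographic image -> static map image (S220)
    return [img.replace("ortho", "static") for img in images]

def reconstruct_global_map(static_images: List[str]) -> List[str]:
    # Placeholder for aerial triangulation: static images -> partitioned global map (S230)
    return [f"partition_{i}" for i, _ in enumerate(static_images)]

def segment_and_splice(partitions: List[str]) -> str:
    # Placeholder for deep segmentation + splicing: partitions -> navigation map (S240)
    segmented = [p + "_roads" for p in partitions]
    return "+".join(segmented)

def build_navigation_map(ortho_images: List[str]) -> str:
    static = filter_moving_targets(ortho_images)   # S220
    partitions = reconstruct_global_map(static)    # S230
    return segment_and_splice(partitions)          # S240

nav_map = build_navigation_map(["ortho_001.jpg", "ortho_002.jpg"])
```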
It should be noted that the unmanned aerial vehicle performs aerial photography at a fixed height to obtain a number of orthographic projection images with GPS positioning information for the fixed height and fixed area, and the images are stored. The fixed height is 100 meters ± 2 meters, the shooting angle is 90 degrees, pointing vertically down at the horizontal plane, and the shooting track is an S-shaped reciprocating path over the fixed area. Images are captured at fixed intervals of one image every 5 seconds and saved as jpg files with a resolution of 5472×3648. Images collected at different times can be stored in different subfolders, and the images from several folders covering the same area together form the original orthographic projection image dataset. In other embodiments, the specific shooting parameters are not limited, but the above parameters improve calculation efficiency; for example, the fixed height may be 50 meters ± 2 meters, 80 meters ± 2 meters, and so on. Acquiring the orthographic projection images with an unmanned aerial vehicle makes data collection easy, and the large working range of the unmanned aerial vehicle can greatly improve work efficiency. In other embodiments, high-precision map construction can also be performed from satellite maps.
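The precision obtainable from the flight parameters above can be estimated with the standard ground-sampling-distance formula. The 100 m flight height and the 5472-pixel image width come from the text; the 13.2 mm sensor width and 8.8 mm focal length are hypothetical values typical of a 1-inch drone camera, not parameters stated in the patent:

```python
def ground_sampling_distance(height_m: float, focal_mm: float,
                             sensor_width_mm: float, image_width_px: int) -> float:
    """Ground sampling distance (metres per pixel) for a nadir (90-degree) shot:
    GSD = (sensor_width * flight_height) / (focal_length * image_width)."""
    return (sensor_width_mm * height_m) / (focal_mm * image_width_px)

# 100 m height and 5472 px width from the text; camera parameters are assumed.
gsd = ground_sampling_distance(100.0, 8.8, 13.2, 5472)
# roughly 0.027 m/pixel, i.e. centimetre-level ground detail per pixel
```

This illustrates why fixed-height drone imagery can far exceed the roughly 10-metre precision of satellite-based maps mentioned in the background section.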
After the orthographic projection images are obtained, ground moving targets are filtered with the fully trained generative adversarial network model to obtain the corresponding static map images; global scene map construction and area cutting are then performed by an algorithm based on aerial triangulation modeling. In this embodiment, aerial triangulation is a topographic mapping technique that uses a computer to calculate the approximate ground positions of different pixels from each image together with its shooting location, focal length and angle, and can thus assist in building a 3D map.
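For the nadir (straight-down) shots used here, the core geometric relation behind aerial triangulation reduces, over flat ground, to a similar-triangles projection from pixel offsets to ground offsets. This is a sketch of that single-image case only (full aerial triangulation jointly adjusts many overlapping images); all camera parameters below are hypothetical:

```python
def pixel_to_ground(px: float, py: float, cx: float, cy: float,
                    focal_px: float, cam_x: float, cam_y: float,
                    height_m: float) -> tuple:
    """Project an image pixel to ground coordinates for a nadir shot over
    flat ground: ground offset = height * (pixel offset / focal length)."""
    scale = height_m / focal_px
    gx = cam_x + (px - cx) * scale
    gy = cam_y + (py - cy) * scale
    return gx, gy

# Hypothetical camera: principal point at the image centre, focal length
# 3600 px, camera at (500 m, 200 m) east/north, 100 m above flat ground.
gx, gy = pixel_to_ground(2736 + 360, 1824, 2736, 1824, 3600.0, 500.0, 200.0, 100.0)
# a pixel 360 px east of the image centre maps 10 m east of the camera
```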
The file structure of the algorithm of step S230 is:
|-- images/
|   |-- img-1234.jpg                 (image dataset used to construct the map)
|-- opensfm/
|   (mapillary/opensfm map computation information, etc.)
|-- odm_meshing/
|   |-- odm_mesh.ply                 (mesh file)
|-- odm_texturing/
|   |-- odm_textured_model.obj       (3D mesh model file)
|   |-- odm_textured_model_geo.obj   (georeferenced 3D model file)
|-- odm_georeferencing/
|   |-- odm_georeferenced_model.laz  (point cloud model file, if point cloud mapping is used)
|-- odm_orthophoto/
|   |-- odm_orthophoto.tif           (orthographic projection image)
The algorithm is used as follows: the orthographic projection images (the original dataset) needed to construct the map are placed in images/, the 3D map construction and orthographic projection algorithms are invoked through docker on the command line with the orthographic image path, program path and project path specified as parameters, and the algorithm then runs automatically. The output consists mainly of the textured image and 3D model files, the mesh file and the orthographic projection file. The obj-format model file can be opened with conventional model-reading software to check the mapping effect. Because the orthographic projection image is large, a compressed orthographic panorama is placed in the drawings; the original orthographic projection image is cut with a script, the edge parts of the image lacking even information are trimmed off, and the required part is cut and divided into a number of orthographic projection images that serve as a dataset for further processing. Splicing is attempted after cutting as an inspection, and successful splicing proves that the cutting process did not damage the image.
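The cut-then-splice-check step can be sketched with numpy. The 256-pixel tile size is a hypothetical choice; the sketch also mirrors the text's trimming of edge regions by dropping partial tiles at the borders:

```python
import numpy as np

def tile_image(img: np.ndarray, tile: int) -> dict:
    """Cut an orthophoto into non-overlapping square tiles; edge strips that
    do not fill a whole tile are dropped, as the text trims uneven edges."""
    h, w = img.shape[:2]
    tiles = {}
    for r in range(0, h - tile + 1, tile):
        for c in range(0, w - tile + 1, tile):
            tiles[(r // tile, c // tile)] = img[r:r + tile, c:c + tile]
    return tiles

def splice_tiles(tiles: dict, tile: int) -> np.ndarray:
    """Reassemble the tiles, used to check that cutting lost no interior pixels."""
    rows = 1 + max(k[0] for k in tiles)
    cols = 1 + max(k[1] for k in tiles)
    out = np.zeros((rows * tile, cols * tile, 3), dtype=tiles[(0, 0)].dtype)
    for (r, c), t in tiles.items():
        out[r * tile:(r + 1) * tile, c * tile:(c + 1) * tile] = t
    return out

rng = np.random.default_rng(0)
ortho = rng.integers(0, 255, size=(512, 768, 3), dtype=np.uint8)
tiles = tile_image(ortho, 256)
rebuilt = splice_tiles(tiles, 256)
ok = np.array_equal(rebuilt, ortho)   # splice inspection passes
```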
Finally, the orthographic projection global map is segmented by using the fully trained deep segmentation model, and the segmented map images with road area identifications obtained by segmentation are spliced to obtain the navigation map with road area identifications.
Through the above steps, the problems of low precision and huge workload of maps constructed by satellite imagery and manual labeling are solved, and efficient, high-precision navigation map construction is realized.
In this embodiment, two models, a generative adversarial network model and a deep segmentation model, need to be trained in advance; the training process of each is described in detail below.
In some of these embodiments, as shown in FIG. 3, training the generative adversarial network model includes the following steps:
S310, acquiring a number of orthographic projection images;
S320, generating a set of contrasting image pairs from the orthographic projection images to serve as a training set;
and S330, training the generative adversarial network model on the training set to obtain the fully trained generative adversarial network model.
Specifically, the generative adversarial network model is used to filter ground moving targets. The network has three major modules, three inputs and two outputs. The modules are an encoder for feature extraction, a generator for generating an image, and a discriminator for judging how similar the generated image is to the reference image. The inputs are an orthographic projection image containing moving targets, the corresponding image set with moving targets manually filtered out, and added Gaussian random noise. The outputs are the generation result of the generator and the judgment result of the discriminator; both are fed back into the network to update the module parameters, thereby realizing trainable iteration. In each iteration, the network imports the two corresponding images, so that training is achieved.
The orthographic projection images obtained by fixed-height aerial photography contain non-static targets such as moving ground vehicles, so they cannot be used directly for map building. Filtering of ground moving targets can be performed with a learning-based generative adversarial network. In this embodiment, part of the orthographic projection images are copied as a backup to serve as the model training set. On the backup copies, ground moving targets are filtered out by manual comparison, and the target areas are filled with neighbouring pixels: the pixel area of the object to be deleted is selected, an approximate pixel-value interval is calculated from the surrounding pixels, and suitable pixels are extracted to complete the deleted area. Where other objects lie near the target to be filtered, for example a traffic indication line under a vehicle to be removed, manual pixel selection is performed after algorithmic filtering to complete the removal of the ground moving target. After manual filtering, two groups of images are obtained corresponding to scenes at the same position, one with ground moving targets and one without, forming a set of contrasting image pairs that can serve as the training set of the generative adversarial network.
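The fill-from-surroundings step used to build the training set can be sketched as follows. Sampling random pixels from a surrounding ring is a simplified stand-in for "calculating an approximate pixel-value interval and extracting suitable pixels"; the border width and toy image are illustrative:

```python
import numpy as np

def fill_from_surroundings(img: np.ndarray, mask: np.ndarray,
                           border: int = 2, seed: int = 0) -> np.ndarray:
    """Fill the masked (to-be-deleted) region with pixel values sampled from
    a ring of surrounding pixels, approximating the manual infill step."""
    rng = np.random.default_rng(seed)
    rows, cols = np.where(mask)
    r0 = max(rows.min() - border, 0); r1 = min(rows.max() + border + 1, img.shape[0])
    c0 = max(cols.min() - border, 0); c1 = min(cols.max() + border + 1, img.shape[1])
    ring = np.zeros(img.shape[:2], dtype=bool)
    ring[r0:r1, c0:c1] = True
    ring &= ~mask                       # keep only the surrounding pixels
    pool = img[ring]                    # (n, 3) candidate pixel values
    out = img.copy()
    out[mask] = pool[rng.integers(0, len(pool), size=int(mask.sum()))]
    return out

# Toy 8x8 'road' image with a bright 2x2 'vehicle' to remove.
img = np.full((8, 8, 3), 100, dtype=np.uint8)
mask = np.zeros((8, 8), dtype=bool)
mask[3:5, 3:5] = True
img[mask] = 255                         # the moving target
static = fill_from_surroundings(img, mask)
# the vehicle pixels are replaced by road-coloured (100) surroundings
```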
Training deployment. The contrasting image pairs described above are used as the training-set input of a generative adversarial network (GAN). A GAN can generate an image close to the real one through the adversarial interplay of the discriminator D and the generator G. The GAN is trained on the established image pairs so that it can generate images without ground moving targets from images with them. Training consists of building and training the generator and the discriminator. In the generator part, the unmodified image is input to the encoder, its features are extracted by convolutional downsampling, Gaussian noise is added in the feature dimension to give the generation network better operability, and the generation network restores the image dimensions by upsampling, thereby generating an image and computing a loss function. In the discriminator part, the image produced by the generator and the corresponding manually filtered image are input to the discriminator for judgment, and a loss function is computed. After each iteration, the judgment result is stored, the parameters of the generator and discriminator are updated, and iterative training continues, strengthening the generator's ability until the generated image is highly similar to the manually filtered target image, at which point training ends. Finally, all orthographic projection images are processed with the trained generator network model to complete the filtering of moving ground vehicles.
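The alternating discriminator/generator updates can be illustrated with a deliberately tiny 1-D numpy toy (linear generator, logistic discriminator, hand-derived binary-cross-entropy gradients). The real model operates on image tensors with convolutional encoder, generator and discriminator networks, so everything below is a stand-in showing only the update structure:

```python
import numpy as np

rng = np.random.default_rng(1)

# Generator G(z) = g_w * z + g_b;  Discriminator D(x) = sigmoid(d_w * x + d_b).
g_w, g_b = 0.1, 0.0
d_w, d_b = 0.1, 0.0
lr = 0.05

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

for step in range(500):
    real = rng.normal(4.0, 0.5, size=32)   # 'reference' data (1-D stand-in for images)
    z = rng.normal(0.0, 1.0, size=32)      # Gaussian noise input, as in the text
    fake = g_w * z + g_b

    # Discriminator update: push D(real) -> 1, D(fake) -> 0 (BCE gradients).
    p_real, p_fake = sigmoid(d_w * real + d_b), sigmoid(d_w * fake + d_b)
    grad_w = np.mean((p_real - 1) * real) + np.mean(p_fake * fake)
    grad_b = np.mean(p_real - 1) + np.mean(p_fake)
    d_w -= lr * grad_w
    d_b -= lr * grad_b

    # Generator update: push D(fake) -> 1 (non-saturating loss -log D(fake)).
    p_fake = sigmoid(d_w * fake + d_b)
    gf = (p_fake - 1) * d_w                # dLoss/dfake
    g_w -= lr * np.mean(gf * z)
    g_b -= lr * np.mean(gf)

generated_mean = float(np.mean(g_w * rng.normal(size=1000) + g_b))
```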
As the number of training iterations increases, the filtering of ground vehicles in the generated images gradually improves: vehicles to be filtered are first blurred, then vehicles whose colors differ strongly from the ground, such as black vehicles, are successfully filtered, and eventually all kinds of ground vehicles are completely filtered, producing a static map without ground moving vehicles. Finally, the trained generator can convert the original orthographic projection images with ground moving targets in batches into corresponding images without ground moving targets, yielding the static map images and achieving the expected result.
In some embodiments, as shown in fig. 3, the training of the depth segmentation model includes the following steps:
s410, acquiring an orthographic projection global map of the subarea;
s420, dividing the orthographic projection global map into a training set and a test set;
and S430, training the depth segmentation model according to the training set and the test set to obtain a completely-trained depth segmentation model.
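The division of partitioned maps into training and test sets in steps S410-S430 can be sketched as follows; the function name `split_tiles`, the shuffling, and the 20% test fraction are illustrative assumptions, since the patent does not specify the split.

```python
import random

def split_tiles(tile_ids, test_fraction=0.2, seed=0):
    """Split partitioned orthographic-map tiles into training and test sets.

    Shuffling with a fixed seed keeps the split reproducible across runs.
    """
    ids = list(tile_ids)
    random.Random(seed).shuffle(ids)
    n_test = max(1, int(len(ids) * test_fraction))
    return ids[n_test:], ids[:n_test]   # (training set, test set)
```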
Specifically, the partitioned orthographic projection maps are divided in batches into a training set and a test set, and the training set is given manual semantic calibration: road areas, vegetation areas, and other regions are marked according to whether vehicles can travel there. Where only a white lane line separates roads, the lane line is a soft boundary; road-edge vegetation, pedestrian crossings and the like are hard boundaries. Through this annotation, a batch of map data sets marked with segmentation semantics is obtained.
Segmentation model architecture. Semantic segmentation is performed on the processed map data set to extract the road network and establish the high-precision map. In this embodiment, a semantic segmentation network based on an encoder-decoder structure may be employed. The map data set is input into the semantic segmentation network in batches; features are extracted by operations such as multilayer convolution and pooling and classified in the feature space, and the classification results are projected back onto the corresponding image pixels by multilayer convolution and upsampling operations, completing the semantic segmentation task. The classes cover drivable roads, namely the road network required to construct the high-precision map.
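The encoder's spatial downsampling and the decoder's resolution-restoring upsampling can be illustrated with two toy operations. This numpy sketch shows only the shape arithmetic of the encoder-decoder structure; a real segmentation network would interleave learned convolutions, which are omitted here.

```python
import numpy as np

def max_pool2(x):
    """2x2 max pooling: the encoder halving spatial resolution."""
    h, w = x.shape
    return x[:h - h % 2, :w - w % 2].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

def upsample2(x):
    """Nearest-neighbour upsampling: the decoder restoring resolution so the
    classification result can be projected onto the original pixels."""
    return np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)
```

An input passed through `max_pool2` and then `upsample2` returns to its original height and width, which is the structural property that lets the network emit one class per input pixel.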
Training and deployment. The prepared training set is fed into the depth segmentation model for training; after multiple rounds of training, the semantic segmentation results approach the correct results. After training, the test set is used for testing to verify that road areas, bidirectional road separation lines, vegetation areas, sidewalks, and other building areas are correctly identified.
Semantic segmentation can identify the different regions in the orthographic projection image, which can be merged into drivable and non-drivable regions according to their labels. The segmentation result is at the pixel level, that is, the more accurate the original image data acquired in the whole process, the more accurate the segmentation result, which can reach centimeter-level precision, so that accurate information such as road width can be obtained in full. Meanwhile, the segmentation results can be stitched together along with the stitching of the map, combining into a high-precision road network and thus successfully constructing the high-precision map.
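Merging the labelled regions into drivable and non-drivable areas reduces to a per-pixel label lookup, as this sketch shows; the numeric label ids here are hypothetical, not those of the disclosed model.

```python
import numpy as np

# hypothetical label ids produced by the segmentation model
ROAD, LANE_LINE, VEGETATION, SIDEWALK, BUILDING = 0, 1, 2, 3, 4
DRIVABLE = (ROAD, LANE_LINE)

def drivable_mask(label_map):
    """Collapse a per-pixel semantic label map into a boolean
    drivable / non-drivable mask."""
    return np.isin(label_map, DRIVABLE)
```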
In one embodiment, performing three-dimensional scene construction and orthographic projection global map generation by aerial triangulation according to the static map image to obtain the partitioned orthographic projection global map comprises the following steps:
s231, detecting the same-name points of adjacent regions of the static map image to obtain pixel points containing space coordinates and RGB pixel values;
s232, carrying out raster scene reconstruction according to the pixel points to obtain a global 3D map;
and S233, carrying out orthographic projection cutting and splicing on the global 3D map according to preset region division to obtain the partitioned orthographic projection global map.
Specifically, after the static map images are imported into an algorithm based on aerial triangulation modeling, the exterior orientation information of each image can be viewed, namely the three-dimensional position of the camera's optical center at the time of photographing, the three-dimensional angle elements, and the acquisition area and range of the image, so that the GPS-based spatial distribution of the different images can be determined.
Homonymous point detection based on adjacent areas. After aerial triangulation calculates the spatial positions of pixels appearing in different images, similar areas in different images are matched for similar points, as shown in fig. 5; if a matching point is found, it is stored. A matching point with a determined spatial coordinate is called a control point. During subsequent map reconstruction, matching expands outward from the control points; matching points whose spatial coordinates are not yet determined are called tie points, and different images are connected through the tie points. After the relative spatial positions of the different photographs are roughly determined, the algorithm proceeds with region-level pixel matching. A region covered by many photographs is called a densified region; the pixels in it are processed so that the real spatial coordinates are estimated from the many error-bearing coordinate observations. Edge parts covered by fewer photographs require edge-joining: a match is accepted by judging whether the edge-join difference exceeds a limit, the edge-join difference being a quantity that measures the similarity of the two regions being judged, as shown in fig. 6. Through this process, a large number of pixel points containing spatial coordinates and RGB pixel values are obtained.
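Estimating a real spatial coordinate from many error-bearing observations, and checking an edge-join difference against its limit, can be sketched as follows. The patent leaves the estimator and the edge-join measure unspecified, so the mean (as least-squares estimate) and the RMS distance used here are assumptions for illustration.

```python
import numpy as np

def estimate_point(observations):
    """Least-squares estimate of a point's true 3D coordinate from several
    noisy per-image observations: the mean minimises the sum of squared
    residuals over all observations."""
    return np.asarray(observations, dtype=float).mean(axis=0)

def edge_join_within_limit(coords_a, coords_b, limit):
    """Edge-join check between two overlapping regions: the RMS distance
    between their shared points must not exceed `limit` for the two
    regions to be accepted as matching."""
    d = np.linalg.norm(np.asarray(coords_a, float) - np.asarray(coords_b, float), axis=1)
    return float(np.sqrt((d ** 2).mean())) <= limit
```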
Grid scene reconstruction. From the solved pixel points, adjacent points lying in the same plane can be merged into grids (meshes) by comparison of neighbors, and the RGB pixel values are used to texture the grid images; each grid is colored by interpolation and similar methods, establishing a series of grids carrying RGB color information at different angles, and thereby a global 3D map.
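Coloring a reconstructed grid face "by interpolation and similar methods" can be illustrated with barycentric interpolation of vertex colors, a standard mesh-texturing technique offered here as a sketch; the patent does not specify which interpolation scheme is used.

```python
import numpy as np

def barycentric_color(p, tri, colors):
    """Interpolate an RGB color at 2D point p inside a triangle from its
    three vertex colors, weighting each color by p's barycentric
    coordinate with respect to that vertex."""
    a, b, c = (np.asarray(v, float) for v in tri)
    p = np.asarray(p, float)
    v0, v1, v2 = b - a, c - a, p - a
    d00, d01, d11 = v0 @ v0, v0 @ v1, v1 @ v1
    d20, d21 = v2 @ v0, v2 @ v1
    denom = d00 * d11 - d01 * d01
    w1 = (d11 * d20 - d01 * d21) / denom
    w2 = (d00 * d21 - d01 * d20) / denom
    w0 = 1.0 - w1 - w2
    cols = [np.asarray(col, float) for col in colors]
    return w0 * cols[0] + w1 * cols[1] + w2 * cols[2]
```

At the centroid of a triangle all three weights are 1/3, so the interpolated color is the average of the vertex colors.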
Orthographic projection output. With the global 3D map, an orthographic projection image of each region can be output according to the previous region division. For example, the map is divided in plan space into areas by longitude and latitude (x and y coordinates); within each area, each (x, y) position takes only the highest altitude (z coordinate) among the corresponding pixel points and is output as a 2D plane map of limited size, which is a local orthographic projection of the original 3D map. The orthographic projection maps divided into small blocks are convenient to store, and since the orthographic projection image is cut into numbered pieces, they can be automatically stitched back by number into a global orthographic projection map. The acquired orthographic projection images are also interference-resistant: in orthographic projection images acquired at different times, the positions of ground moving vehicles differ, and the vehicles are all filtered, reducing interference; interference from the illumination of images acquired on different dates can be eliminated during 3D map construction; and the more images are collected, the stronger the anti-interference capability.
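The per-cell highest-altitude projection described above can be sketched as follows; this is a minimal illustration, and the cell size, grid shape, and function name `ortho_project` are assumptions, not disclosed parameters.

```python
import numpy as np

def ortho_project(points, colors, cell=1.0, shape=(4, 4)):
    """Project colored 3D points onto a 2D grid, keeping for each (x, y)
    cell only the point with the greatest altitude z, as in the
    orthographic output step."""
    h, w = shape
    best_z = np.full(shape, -np.inf)          # highest altitude seen per cell
    image = np.zeros(shape + (3,), dtype=float)
    for (x, y, z), rgb in zip(points, colors):
        i, j = int(y // cell), int(x // cell)
        if 0 <= i < h and 0 <= j < w and z > best_z[i, j]:
            best_z[i, j] = z
            image[i, j] = rgb
    return image
```

When two points fall in the same cell, only the higher one colors the output pixel, which is exactly why roofs occlude the ground beneath them in an orthographic map.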
In some embodiments, the orthographic projection global map is segmented by the well-trained depth segmentation model, the different areas in the orthographic projection image are identified, and drivable and non-drivable areas are merged according to their labels. The segmentation result is at the pixel level, that is, the more accurate the original image data acquired in the whole process, the more accurate the segmentation result, which can reach centimeter-level precision, so that accurate information such as road width can be obtained in full. Meanwhile, the segmentation results can be stitched together along with the stitching of the map, combining into a high-precision road network and successfully constructing the high-precision map. The present application can thus provide centimeter-level navigation for unmanned driving systems, serving as an important source of the system's external information.
It should be noted that the steps illustrated in the above-described flow diagrams or in the flow diagrams of the figures may be performed in a computer system, such as a set of computer-executable instructions, and that, although a logical order is illustrated in the flow diagrams, in some cases, the steps illustrated or described may be performed in an order different than here.
The present embodiment further provides a navigation map construction apparatus, which is used to implement the foregoing embodiments and preferred implementations; what has already been described is not repeated here. As used hereinafter, the terms "module," "unit," "subunit," and the like may refer to a combination of software and/or hardware implementing a predetermined function. Although the apparatus described in the following embodiments is preferably implemented in software, an implementation in hardware, or a combination of software and hardware, is also possible and contemplated.
Fig. 7 is a block diagram of a navigation map construction apparatus according to an embodiment of the present application, and as shown in fig. 7, the apparatus includes: an acquisition module 210, a filtering module 220, a partition constructing module 230 and a segmentation and splicing module 240;
an acquisition module 210, configured to acquire a plurality of orthographic projection images by fixed-height aerial photography of an unmanned aerial vehicle;
a filtering module 220, configured to filter ground moving targets from the orthographic projection images by using a well-trained generative adversarial network model to obtain corresponding static map images;
a partition construction module 230, configured to perform three-dimensional scene construction and orthographic projection global map generation by aerial triangulation modeling according to the static map images, so as to obtain the partitioned orthographic projection global map;
and a segmentation and splicing module 240, configured to segment the orthographic projection global map by using a well-trained depth segmentation model, and splice the segmented map images with road area identifications obtained by segmentation to obtain a navigation map with road area identifications.
The apparatus solves the problems of low map precision and the huge workload of maps constructed by satellite photography and manual labeling, and realizes efficient, high-precision navigation map construction.
In one embodiment, the apparatus further comprises a first training module, which is used for acquiring a plurality of orthographic projection images; generating a group of contrasting image pairs from the orthographic projection images as a training set; and training the generative adversarial network model according to the training set to obtain the well-trained generative adversarial network model.
In one embodiment, the apparatus further comprises a second training module, which is used for acquiring the partitioned orthographic projection global map; dividing the orthographic projection global map into a training set and a test set; and training the depth segmentation model according to the training set and the test set to obtain the well-trained depth segmentation model.
In one embodiment, the obtaining module 210 is further configured to obtain a plurality of orthographic projection images with GPS positioning information of a fixed height and a fixed area by using the fixed height aerial photography of the unmanned aerial vehicle, and store the orthographic projection images.
In one embodiment, the partition constructing module 230 is further configured to perform homonymous point detection on neighboring areas of the static map image to obtain a pixel point containing a spatial coordinate and an RGB pixel value; carrying out grid scene reconstruction according to the pixel points to obtain a global 3D map; and carrying out orthographic projection cutting splicing on the global 3D map according to preset area division to obtain the partitioned orthographic projection global map.
The above modules may be functional modules or program modules, and may be implemented by software or hardware. For a module implemented by hardware, the modules may be located in the same processor; or the modules can be respectively positioned in different processors in any combination.
The present embodiment also provides an electronic device comprising a memory having a computer program stored therein and a processor configured to execute the computer program to perform the steps of any of the above method embodiments.
Optionally, the electronic apparatus may further include a transmission device and an input/output device, wherein the transmission device is connected to the processor, and the input/output device is connected to the processor.
Optionally, in this embodiment, the processor may be configured to execute the following steps by a computer program:
s1, acquiring a plurality of orthographic projection images by aerial photography with an unmanned aerial vehicle at a fixed height;
s2, filtering ground moving targets from the orthographic projection images by using a well-trained generative adversarial network model to obtain corresponding static map images;
s3, performing three-dimensional scene construction and orthographic projection global map generation by using aerial triangulation according to the static map image to obtain an orthographic projection global map of the partition;
and S4, segmenting the orthographic projection global map by using a well-trained depth segmentation model, and splicing segmented map images with road area identifications obtained by segmentation to obtain a navigation map with the road area identifications.
It should be noted that, for specific examples in this embodiment, reference may be made to examples described in the foregoing embodiments and optional implementations, and details of this embodiment are not described herein again.
In addition, in combination with the navigation map construction method in the foregoing embodiment, the embodiment of the present application may provide a storage medium to implement. The storage medium having stored thereon a computer program; the computer program, when executed by a processor, implements any of the navigation map construction methods in the above embodiments.
It should be understood by those skilled in the art that various features of the above-described embodiments can be combined in any combination, and for the sake of brevity, all possible combinations of features in the above-described embodiments are not described in detail, but rather, all combinations of features which are not inconsistent with each other should be construed as being within the scope of the present disclosure.
The above-mentioned embodiments only express several embodiments of the present application, and the description thereof is more specific and detailed, but not construed as limiting the scope of the invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the concept of the present application, which falls within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims (10)

1. A navigation map construction method is characterized by comprising the following steps:
acquiring a plurality of orthographic projection images by using aerial photography with a fixed height of an unmanned aerial vehicle;
filtering ground moving targets from the orthographic projection images by using a well-trained generative adversarial network model to obtain corresponding static map images;
performing three-dimensional scene construction and orthographic projection global map generation by utilizing aerial triangulation according to the static map image to obtain an orthographic projection global map of the subarea;
and segmenting the orthographic projection global map by using a well-trained depth segmentation model, and splicing segmented map images with road area identifications obtained by segmentation to obtain a navigation map with the road area identifications.
2. The navigation map construction method according to claim 1, further comprising training of the generative adversarial network model;
the training of the generative adversarial network model comprises:
acquiring a plurality of orthographic projection images;
generating a group of contrasting image pairs from the orthographic projection images as a training set;
and training the generative adversarial network model according to the training set to obtain the well-trained generative adversarial network model.
3. The navigation map construction method according to claim 1, further comprising training of a depth segmentation model;
the training of the depth segmentation model comprises the following steps:
acquiring an orthographic projection global map of a subarea;
dividing the orthographic projection global map into a training set and a test set;
and training the depth segmentation model according to the training set and the test set to obtain a completely trained depth segmentation model.
4. The navigation map construction method of any one of claims 1-3, wherein acquiring a plurality of forward projection images by aerial photography at a fixed height of an unmanned aerial vehicle comprises:
the method comprises the steps of utilizing aerial photography with the fixed height of an unmanned aerial vehicle to obtain a plurality of orthographic projection images with GPS positioning information in the fixed height and the fixed area, and storing the orthographic projection images.
5. The method for constructing the navigation map according to claim 4, wherein the step of constructing a three-dimensional scene and generating an orthographic projection global map by using aerial triangulation modeling according to the static map image to obtain the orthographic projection global map of the subarea comprises the following steps:
detecting the same-name points of adjacent regions of the static map image to obtain pixel points containing space coordinates and RGB pixel values;
carrying out raster scene reconstruction according to the pixel points to obtain a global 3D map;
and carrying out orthographic projection cutting splicing on the global 3D map according to preset area division to obtain the partitioned orthographic projection global map.
6. The navigation map construction method according to claim 5, wherein performing raster scene reconstruction according to the pixel points to obtain a global 3D map comprises:
and comparing adjacent points according to the pixel points, combining the adjacent points on the same plane into a grid, and using the pixel RGB value to modify the grid image so as to establish a series of grids with RGB color information at different angles, thereby establishing a global 3D map.
7. A navigation map construction apparatus characterized by comprising: the system comprises an acquisition module, a filtering module, a partition building module and a segmentation and splicing module;
the acquisition module is used for acquiring a plurality of orthographic projection images by fixed-height aerial photography of an unmanned aerial vehicle; the filtering module is used for filtering ground moving targets from the orthographic projection images by using a well-trained generative adversarial network model to obtain corresponding static map images;
the construction partition module is used for performing three-dimensional scene construction and orthographic projection global map generation by utilizing aerial triangular modeling according to the static map image to obtain an orthographic projection global map of a partition;
and the segmentation and splicing module is used for segmenting the orthographic projection global map by utilizing a well-trained depth segmentation model, and splicing segmented map images with road area identifications obtained by segmentation to obtain a navigation map with the road area identifications.
8. A navigation map construction system, comprising: a terminal device, a transmission device and a server device; the terminal equipment is connected with the server equipment through the transmission equipment;
the terminal equipment is used for acquiring a plurality of orthographic projection images;
the transmission equipment is used for transmitting a plurality of orthographic projection images;
the server device is configured to execute the navigation map construction method according to any one of claims 1 to 6.
9. An electronic device comprising a memory and a processor, wherein the memory stores a computer program, and the processor is configured to execute the computer program to perform the navigation map construction method according to any one of claims 1 to 6.
10. A storage medium having stored thereon a computer program, wherein the computer program is configured to execute the navigation map construction method according to any one of claims 1 to 6 when executed.
CN202011362824.5A 2020-11-27 2020-11-27 Navigation map construction method, device, system, electronic device and storage medium Active CN112465970B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011362824.5A CN112465970B (en) 2020-11-27 2020-11-27 Navigation map construction method, device, system, electronic device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011362824.5A CN112465970B (en) 2020-11-27 2020-11-27 Navigation map construction method, device, system, electronic device and storage medium

Publications (2)

Publication Number Publication Date
CN112465970A true CN112465970A (en) 2021-03-09
CN112465970B CN112465970B (en) 2024-03-19

Family

ID=74808086

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011362824.5A Active CN112465970B (en) 2020-11-27 2020-11-27 Navigation map construction method, device, system, electronic device and storage medium

Country Status (1)

Country Link
CN (1) CN112465970B (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113050692A (en) * 2021-03-26 2021-06-29 极研(福建)投资发展有限公司 Intelligent accompanying equipment with social function
CN113076811A (en) * 2021-03-12 2021-07-06 中山大学 Aviation image road extraction method and equipment
CN113344772A (en) * 2021-05-21 2021-09-03 武汉大学 Training method and computer equipment for map artistic migration model
CN114418005A (en) * 2022-01-21 2022-04-29 杭州碧游信息技术有限公司 Game map automatic generation method, device, medium and equipment based on GAN network
CN114577191A (en) * 2022-05-06 2022-06-03 成都纵横通达信息工程有限公司 Surveying and mapping data acquisition method and system based on geospatial information data
CN114926601A (en) * 2022-07-21 2022-08-19 广州乐软网络科技有限公司 Object-oriented map construction method and system

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108710863A (en) * 2018-05-24 2018-10-26 东北大学 Unmanned plane Scene Semantics dividing method based on deep learning and system
CN110361010A (en) * 2019-08-13 2019-10-22 中山大学 It is a kind of based on occupy grating map and combine imu method for positioning mobile robot
US20200072616A1 (en) * 2018-08-30 2020-03-05 Baidu Online Network Technology (Beijing) Co., Ltd. High-precision map generation method, device and computer device
CN111551167A (en) * 2020-02-10 2020-08-18 江苏盖亚环境科技股份有限公司 Global navigation auxiliary method based on unmanned aerial vehicle shooting and semantic segmentation

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108710863A (en) * 2018-05-24 2018-10-26 东北大学 Unmanned plane Scene Semantics dividing method based on deep learning and system
US20200072616A1 (en) * 2018-08-30 2020-03-05 Baidu Online Network Technology (Beijing) Co., Ltd. High-precision map generation method, device and computer device
CN110361010A (en) * 2019-08-13 2019-10-22 中山大学 It is a kind of based on occupy grating map and combine imu method for positioning mobile robot
CN111551167A (en) * 2020-02-10 2020-08-18 江苏盖亚环境科技股份有限公司 Global navigation auxiliary method based on unmanned aerial vehicle shooting and semantic segmentation

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
杨凯: "《中国测绘学科发展蓝皮书》", 测绘出版社, pages: 3 - 7 *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113076811A (en) * 2021-03-12 2021-07-06 中山大学 Aviation image road extraction method and equipment
CN113076811B (en) * 2021-03-12 2023-08-15 中山大学 Aviation image road extraction method and device
CN113050692A (en) * 2021-03-26 2021-06-29 极研(福建)投资发展有限公司 Intelligent accompanying equipment with social function
CN113344772A (en) * 2021-05-21 2021-09-03 武汉大学 Training method and computer equipment for map artistic migration model
CN114418005A (en) * 2022-01-21 2022-04-29 杭州碧游信息技术有限公司 Game map automatic generation method, device, medium and equipment based on GAN network
CN114577191A (en) * 2022-05-06 2022-06-03 成都纵横通达信息工程有限公司 Surveying and mapping data acquisition method and system based on geospatial information data
CN114577191B (en) * 2022-05-06 2022-07-12 成都纵横通达信息工程有限公司 Surveying and mapping data acquisition method and system based on geospatial information data
CN114926601A (en) * 2022-07-21 2022-08-19 广州乐软网络科技有限公司 Object-oriented map construction method and system

Also Published As

Publication number Publication date
CN112465970B (en) 2024-03-19

Similar Documents

Publication Publication Date Title
CN112465970B (en) Navigation map construction method, device, system, electronic device and storage medium
CN112085845B (en) Outdoor scene rapid three-dimensional reconstruction device based on unmanned aerial vehicle image
JP7485749B2 (en) Video-based localization and mapping method and system - Patents.com
CN112085844B (en) Unmanned aerial vehicle image rapid three-dimensional reconstruction method for field unknown environment
CN111462275B (en) Map production method and device based on laser point cloud
CN110648389A (en) 3D reconstruction method and system for city street view based on cooperation of unmanned aerial vehicle and edge vehicle
US10521694B2 (en) 3D building extraction apparatus, method and system
JP7273927B2 (en) Image-based positioning method and system
KR20200110120A (en) A system implementing management solution of road facility based on 3D-VR multi-sensor system and a method thereof
Kuschk Large scale urban reconstruction from remote sensing imagery
US20200279395A1 (en) Method and system for enhanced sensing capabilities for vehicles
CN113077552A (en) DSM (digital communication system) generation method and device based on unmanned aerial vehicle image
JP2023530449A (en) Systems and methods for air and ground alignment
CN114120254A (en) Road information identification method, device and storage medium
CN116309943B (en) Parking lot semantic map road network construction method and device and electronic equipment
Karantzalos et al. Model-based building detection from low-cost optical sensors onboard unmanned aerial vehicles
CN110827340B (en) Map updating method, device and storage medium
CN114419180A (en) Method and device for reconstructing high-precision map and electronic equipment
Liang et al. Efficient match pair selection for matching large-scale oblique UAV images using spatial priors
CN112907659B (en) Mobile equipment positioning system, method and equipment
CN111238524B (en) Visual positioning method and device
CN116612059B (en) Image processing method and device, electronic equipment and storage medium
Wu et al. Building Facade Reconstruction Using Crowd-Sourced Photos and Two-Dimensional Maps
Gonçalves Using structure-from-motion workflows for 3D mapping and remote sensing
US12046003B2 (en) Method and system for visual localization

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant