CN111656760A - Image generation device, image generation method, program, and recording medium - Google Patents

Info

Publication number
CN111656760A
Authority
CN
China
Prior art keywords
dimensional model
image
distance
imaging
captured
Legal status
Pending
Application number
CN201980009014.0A
Other languages
Chinese (zh)
Inventor
陈斌
沈思杰
Current Assignee
SZ DJI Technology Co Ltd
Original Assignee
SZ DJI Technology Co Ltd
Application filed by SZ DJI Technology Co Ltd
Publication of CN111656760A

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01B MEASURING LENGTH, THICKNESS OR SIMILAR LINEAR DIMENSIONS; MEASURING ANGLES; MEASURING AREAS; MEASURING IRREGULARITIES OF SURFACES OR CONTOURS
    • G01B 11/00 Measuring arrangements characterised by the use of optical techniques
    • G01B 11/24 Measuring arrangements characterised by the use of optical techniques for measuring contours or curvatures
    • G01B 11/245 Measuring arrangements characterised by the use of optical techniques for measuring contours or curvatures using a plurality of fixed, simultaneously operating transducers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 1/00 General purpose image data processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 17/00 Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T 17/05 Geographic models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N 23/60 Control of cameras or camera modules
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 5/00 Details of television systems
    • H04N 5/222 Studio circuitry; Studio devices; Studio equipment

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Software Systems (AREA)
  • Geometry (AREA)
  • Remote Sensing (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computer Graphics (AREA)
  • Studio Devices (AREA)
  • Image Analysis (AREA)
  • Length Measuring Devices By Optical Means (AREA)
  • Processing Or Creating Images (AREA)
  • Image Processing (AREA)
  • Control Of Position, Course, Altitude, Or Attitude Of Moving Bodies (AREA)

Abstract

Provided is an image generation device that enables a device with low computational processing capability to easily generate a composite image or the like while suppressing a reduction in the reproducibility of the object represented by the composite image. The image generation device generates a composite image based on a plurality of captured images captured by a flying object, and includes a processing unit that performs processing related to generation of the composite image. The processing unit acquires a plurality of captured images captured by an imaging device provided in the flying object, generates a three-dimensional model based on the plurality of captured images, acquires the postures of the imaging device at the times when the plurality of captured images were captured, calculates the distances between the positions of the imaging device at those times and the three-dimensional model based on the postures of the imaging device and the three-dimensional model, adjusts the sizes of the plurality of captured images based on those distances, and combines the size-adjusted captured images to generate the composite image.

Description

Image generation device, image generation method, program, and recording medium
[ Technical Field ]
The present disclosure relates to an image generation device, an image generation method, a program, and a recording medium that generate a composite image based on a plurality of captured images captured by a flying object.
[ Background of the Invention ]
Conventionally, a platform (unmanned aerial vehicle) that performs imaging while traveling along a predetermined fixed route is known. The platform receives an imaging instruction from a ground base and images an imaging subject. When imaging the subject, the platform flies along the fixed route and images the subject while being tilted with respect to it (see Patent Document 1).
[ Prior Art Documents ]
[ Patent Documents ]
[Patent Document 1] Japanese Patent Application Laid-Open No. 2010-61216
[ Summary of the Invention ]
[ Problem to Be Solved by the Invention ]
It is possible to synthesize a plurality of images captured by the unmanned aerial vehicle of Patent Document 1 to generate a composite image. Further, an orthoimage may be generated based on a plurality of images captured by the unmanned aerial vehicle of Patent Document 1 by projecting a dense three-dimensional point cloud onto a two-dimensional plane using a three-dimensional reconstruction technique. In the former case of generating a composite image, the scale (size) of a reference image must be identified in advance so that all the images can be fitted to the reference image. If a composite image is generated without taking the size of each image into account, the reproducibility of the object represented by the composite image is degraded. In the latter case of generating an orthoimage, a dense three-dimensional point cloud is generated, but the calculation required to generate it is complicated and the calculation cost is high. For example, to generate an orthoimage by three-dimensional reconstruction, it is necessary to sequentially perform sparse point generation (e.g., SfM: Structure from Motion), dense point generation (e.g., MVS: Multi-View Stereo), mesh generation, texture generation, and so on, resulting in a heavy processing load. Therefore, it is difficult to easily generate a composite image or an orthoimage on a tablet terminal or other device whose computational processing capability is not very high.
[ Means for Solving the Problem ]
In one aspect, an image generation device that generates a composite image based on a plurality of captured images captured by a flying object includes a processing portion that executes processing relating to generation of the composite image, the processing portion acquiring the plurality of captured images captured by an imaging device included in the flying object; generating a three-dimensional model based on the plurality of captured images, and acquiring each posture of the imaging device when the plurality of captured images are captured; calculating distances between respective positions of the image pickup device and the three-dimensional model when the plurality of picked-up images are picked up, based on the respective postures of the image pickup device and the three-dimensional model; adjusting the sizes of the plurality of captured images based on the distances between the respective positions of the imaging device and the three-dimensional model; the plurality of captured images after the size adjustment are combined to generate a combined image.
The processing portion may acquire each position and each orientation of the image pickup device at the time of taking the plurality of picked-up images, and calculate a distance between each position of the image pickup device and the three-dimensional model based on each position and each orientation of the image pickup device and the three-dimensional model.
The processing section may calculate a distance between the image pickup device and the first portion of the three-dimensional model corresponding to the image pickup range for each image pickup range picked up by the image pickup device at each position.
The processing section may divide the imaging range captured at each position of the imaging device to generate divided regions of the imaging range, calculate a second part of the three-dimensional model corresponding to the divided regions, and calculate a distance between the imaging device and the second part of the three-dimensional model corresponding to the divided regions in accordance with each of the divided regions.
The distance may be a distance in the vertical direction between each position of the imaging device and the three-dimensional model.
The distance may be a distance between each position of the imaging device and the three-dimensional model in an imaging direction of the imaging device.
The processing unit may generate sparse point cloud data based on the plurality of captured images, and generate a three-dimensional model based on the sparse point cloud data.
The processing unit may project a plurality of three-dimensional points included in the sparse point group data onto the two-dimensional plane; designating the projected plurality of two-dimensional points adjacent in the two-dimensional plane as a group, and designating the plurality of groups; connecting a plurality of three-dimensional points included in the sparse point group data corresponding to the specified adjacent two-dimensional points in groups to generate a plurality of plane data; a three-dimensional model is generated based on the plurality of face data.
In one aspect, an image generation method that generates a composite image based on a plurality of captured images captured by a flying object, includes: acquiring a plurality of captured images captured by an imaging device included in a flying object; generating a three-dimensional model based on the plurality of camera images; acquiring each posture of the image pickup device when a plurality of image pickup images are picked up; calculating distances between respective positions of the image pickup device and the three-dimensional model when the plurality of picked-up images are picked up, based on the respective postures of the image pickup device and the three-dimensional model; adjusting the sizes of the plurality of captured images based on the distances between the respective positions of the imaging device and the three-dimensional model; the plurality of captured images after the size adjustment are combined to generate a combined image.
The step of acquiring the postures may include a step of acquiring each position and each posture of the imaging device at the time of capturing the plurality of captured images. The step of calculating the distances may include a step of calculating the distances between the respective positions of the imaging device and the three-dimensional model based on the respective positions and postures of the imaging device and the three-dimensional model.
The step of calculating the distance may comprise the steps of: the distance between the imaging device and the first portion of the three-dimensional model corresponding to the imaging range is calculated for each imaging range imaged by the imaging device at each position.
The step of calculating the distance may comprise the steps of: dividing an imaging range captured at each position of an imaging device to generate divided regions of the imaging range; calculating a second portion of the three-dimensional model corresponding to the segmented region; and calculating, for each of the divided regions, a distance between the image pickup device and the second portion of the three-dimensional model corresponding to the divided region.
The distance may be a distance in the vertical direction between each position of the imaging device and the three-dimensional model.
The distance may be a distance between each position of the imaging device and the three-dimensional model in an imaging direction of the imaging device.
The step of generating a three-dimensional model may comprise the steps of: generating sparse point group data based on the plurality of camera images; generating a three-dimensional model based on the sparse point cloud data.
The step of generating a three-dimensional model may comprise the steps of: projecting a plurality of three-dimensional points included in the sparse point cloud data onto a two-dimensional plane; designating the projected plurality of two-dimensional points adjacent in the two-dimensional plane as a group, and designating the plurality of groups; connecting a plurality of three-dimensional points included in the sparse point group data corresponding to the specified adjacent two-dimensional points in groups to generate a plurality of plane data; and generating a three-dimensional model based on the plurality of face data.
In one aspect, a program for causing an image generation device that generates a composite image based on a plurality of captured images captured by a flying object to execute: acquiring a plurality of captured images captured by an imaging device included in a flying object; generating a three-dimensional model based on the plurality of camera images; acquiring each posture of the image pickup device when a plurality of image pickup images are picked up; calculating distances between respective positions of the image pickup device and the three-dimensional model when the plurality of picked-up images are picked up, based on the respective postures of the image pickup device and the three-dimensional model; adjusting the sizes of the plurality of captured images based on the distances between the respective positions of the imaging device and the three-dimensional model; and synthesizing the plurality of captured images after the size adjustment to generate a synthesized image.
In one aspect, a recording medium is a computer-readable recording medium having recorded thereon a program for causing an image generation device that generates a composite image based on a plurality of captured images captured by a flying object to execute: acquiring a plurality of captured images captured by an imaging device included in a flying object; generating a three-dimensional model based on the plurality of camera images; acquiring each posture of the image pickup device when a plurality of image pickup images are picked up; calculating distances between respective positions of the image pickup device and the three-dimensional model when the plurality of picked-up images are picked up, based on the respective postures of the image pickup device and the three-dimensional model; adjusting the sizes of the plurality of captured images based on the distances between the respective positions of the imaging device and the three-dimensional model; and synthesizing the plurality of captured images after the size adjustment to generate a synthesized image.
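As a non-authoritative illustration only, the sequence of operations recited in the aspects above (acquire captured images, build a sparse three-dimensional model, compute the distance between each imaging position and the model, resize, and composite) could be sketched in Python as follows. The helper functions build_sparse_model and distance_to_model, the side-by-side placement, and the choice of the first image as the size reference are all assumptions made for this sketch and are not taken from the disclosure.

```python
import numpy as np
import cv2  # assumed available for resizing


def generate_composite(images, poses, build_sparse_model, distance_to_model):
    """Hedged sketch of the claimed flow.

    images : list of HxWx3 uint8 captured images
    poses  : list of (position, orientation) of the imaging device per image
    build_sparse_model, distance_to_model : hypothetical helpers supplied by the caller
    """
    # Generate a (sparse) three-dimensional model from the captured images.
    model = build_sparse_model(images, poses)

    # Distance D between each imaging position and the three-dimensional model.
    distances = [distance_to_model(model, pos, ori) for pos, ori in poses]

    # Resize each image in proportion to its distance (a larger D means the
    # subject appears smaller, so the image is magnified more).
    d_ref = distances[0]  # the first image is treated as the size reference
    resized = [cv2.resize(img, None, fx=d / d_ref, fy=d / d_ref)
               for img, d in zip(images, distances)]

    # Combine the size-adjusted images; real placement would follow the
    # imaging positions, which is omitted in this simplified side-by-side paste.
    canvas_h = max(r.shape[0] for r in resized)
    canvas_w = sum(r.shape[1] for r in resized)
    canvas = np.zeros((canvas_h, canvas_w, 3), dtype=np.uint8)
    x = 0
    for r in resized:
        canvas[:r.shape[0], x:x + r.shape[1]] = r
        x += r.shape[1]
    return canvas
```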
In addition, the summary of the invention does not list all features of the present disclosure. Furthermore, sub-combinations of these feature sets may also constitute the invention.
[ Description of the Drawings ]
Fig. 1 is a schematic diagram showing a first configuration example of a flight system in embodiment 1.
Fig. 2 is a schematic diagram showing a second configuration example of the flight system in embodiment 1.
Fig. 3 is a diagram showing one example of a concrete appearance of the unmanned aerial vehicle.
Fig. 4 is a block diagram illustrating one example of a hardware configuration of an unmanned aerial vehicle.
Fig. 5 is a block diagram showing one example of a hardware configuration of a terminal.
Fig. 6 is a flowchart showing one example of the image composition processing procedure.
Fig. 7 is a diagram showing one example of distances between respective positions of the image pickup section and corresponding portions of the three-dimensional model.
Fig. 8 is a diagram showing an example of deriving the distance between the position of the imaging unit and the three-dimensional model.
Fig. 9 is a diagram showing an example of deriving the distance of each imaging range.
Fig. 10 is a diagram showing an example of deriving the distance of each divided region into which the imaging range is divided.
Fig. 11 is a diagram showing an example of correction of the resizing amount.
Fig. 12 is a flowchart showing one example of the generation process of the three-dimensional model.
Fig. 13 is a diagram showing an example of a three-dimensional point group and a two-dimensional point group.
Fig. 14 is a diagram showing a three-dimensional model generated in a comparative example.
Fig. 15 is a diagram showing an example of the three-dimensional model generated in embodiment 1.
[ Detailed Description of the Embodiments ]
The present disclosure will be described below with reference to embodiments of the invention, but the following embodiments do not limit the invention according to the claims. Not all combinations of the features described in the embodiments are necessarily essential to the solution of the invention.
The claims, the specification, the drawings, and the abstract contain matter subject to copyright protection. The copyright owner does not object to facsimile reproduction of these documents by anyone, as they appear in the files or records of the patent office. However, in all other respects, all copyright rights are reserved.
In the following embodiments, the flying object is exemplified by an Unmanned Aerial Vehicle (UAV). In the drawings of the present specification, the unmanned aerial vehicle is also denoted as "UAV". The image generation system is exemplified by a flight system having an unmanned aerial vehicle and a terminal. The image generating apparatus is mainly exemplified by an unmanned aircraft, but may be a terminal. The terminal may include a smartphone, a tablet terminal, a PC (Personal Computer), or other device. The image generation method defines operations in the image generation apparatus. Further, a program (for example, a program for causing the image generating apparatus to execute various processes) is recorded in the recording medium.
(embodiment mode 1)
Fig. 1 is a schematic diagram showing a first configuration example of a flight system 10 in embodiment 1. The flight system 10 includes an unmanned aerial vehicle 100 and a terminal 80. The unmanned aerial vehicle 100 and the terminal 80 may communicate with each other through wired communication or wireless communication (e.g., a wireless LAN (Local Area Network)). Fig. 1 illustrates a case where the terminal 80 is a portable terminal (e.g., a smartphone or a tablet terminal).
In addition, the flight system may be configured to include an unmanned aerial vehicle, a transmitter (proportional controller), and a portable terminal. When a transmitter is included, the user can control the flight of the unmanned aerial vehicle using control sticks disposed at the left and right on the front of the transmitter. In this case, the unmanned aerial vehicle, the transmitter, and the portable terminal can communicate with each other by wired communication or wireless communication.
Fig. 2 is a schematic diagram showing a second configuration example of the flight system 10 in embodiment 1. In fig. 2, the terminal 80 is exemplified as a PC. The functions of the terminal 80 may be the same in both fig. 1 and fig. 2.
Fig. 3 is a diagram showing one example of a concrete appearance of the unmanned aerial vehicle 100. In fig. 3, a perspective view of the unmanned aerial vehicle 100 is shown when flying in the moving direction STV 0. The unmanned aerial vehicle 100 is an example of a mobile body.
As shown in fig. 3, the roll axis (refer to the x-axis) is set in a direction parallel to the ground and along the moving direction STV 0. In this case, the pitch axis (see the y axis) is set in the direction parallel to the ground and perpendicular to the roll axis, and the yaw axis (see the z axis) is set in the direction perpendicular to the ground and perpendicular to the roll axis and the pitch axis.
The unmanned aerial vehicle 100 includes a UAV main body 102, a universal joint 200, an image pickup unit 220, and a plurality of image pickup units 230.
The UAV body 102 includes a plurality of rotors (propellers). The UAV body 102 causes the unmanned aircraft 100 to fly by controlling the rotation of the plurality of rotors. The UAV body 102 uses, for example, four rotors to fly the unmanned aircraft 100. The number of rotors is not limited to four. Further, the unmanned aerial vehicle 100 may be a fixed-wing aircraft without rotors.
The imaging unit 220 is an imaging camera that images an object (for example, an overhead object, a landscape such as a mountain or river, or a building on the ground) included in a desired imaging range.
The plurality of imaging units 230 are sensing cameras that capture images of the surroundings of the unmanned aircraft 100 in order to control the flight of the unmanned aircraft 100. The two cameras 230 may be provided at the nose, i.e., the front, of the unmanned aircraft 100. The other two image pickup units 230 may be provided on the bottom surface of the unmanned aircraft 100. The two image pickup portions 230 on the front side may be paired to function as a so-called stereo camera. The two image pickup portions 230 on the bottom surface side may also be paired to function as a stereo camera. Three-dimensional spatial data around the unmanned aerial vehicle 100 can be generated based on the images captured by the plurality of image capturing sections 230. In addition, the number of the image pickup units 230 included in the unmanned aerial vehicle 100 is not limited to four. The unmanned aerial vehicle 100 may include at least one image pickup unit 230. The unmanned aircraft 100 may include at least one camera 230 at the nose, tail, sides, bottom, and top of the unmanned aircraft 100, respectively. The angle of view settable in the image pickup section 230 may be larger than the angle of view settable in the image pickup section 220. The image pickup section 230 may have a single focus lens or a fisheye lens.
Fig. 4 is a block diagram showing one example of the hardware configuration of the unmanned aerial vehicle 100. The unmanned aerial vehicle 100 includes a UAV control Unit 110, a communication interface 150, a memory 160, a memory 170, a universal joint 200, a rotor mechanism 210, a camera Unit 220, a camera Unit 230, a GPS receiver 240, an Inertial Measurement Unit (IMU) 250, a magnetic compass 260, an air pressure altimeter 270, an ultrasonic sensor 280, and a laser Measurement instrument 290.
The UAV control Unit 110 is constituted by, for example, a CPU (Central Processing Unit), an MPU (Micro Processing Unit), or a DSP (Digital Signal Processor). The UAV control unit 110 performs signal processing for controlling the operation of each unit of the unmanned aircraft 100 as a whole, input/output processing of data with respect to other units, arithmetic processing of data, and storage processing of data.
The UAV controller 110 controls the flight of the unmanned aircraft 100 according to a program stored in the memory 160. The UAV control 110 may control flight. The UAV control 110 may take aerial images.
The UAV control 110 acquires position information indicating a position of the unmanned aerial vehicle 100. The UAV controller 110 may obtain, from the GPS receiver 240, location information indicating the latitude, longitude, and altitude at which the unmanned aircraft 100 is located. The UAV control unit 110 may acquire latitude and longitude information indicating the latitude and longitude where the unmanned aircraft 100 is located from the GPS receiver 240, and may acquire altitude information indicating the altitude where the unmanned aircraft 100 is located from the barometric altimeter 270 as position information. The UAV control unit 110 may acquire a distance between a point of emission of the ultrasonic wave generated by the ultrasonic sensor 280 and a point of reflection of the ultrasonic wave as the altitude information.
The UAV control 110 may obtain orientation information from the magnetic compass 260 that represents the orientation of the unmanned aircraft 100. The orientation information may be represented by, for example, a bearing corresponding to the orientation of the nose of the unmanned aircraft 100.
The UAV control unit 110 may acquire position information indicating a position where the unmanned aircraft 100 should be present when the imaging unit 220 performs imaging in accordance with the imaging range of the imaging. The UAV control 110 may obtain from the memory 160 location information indicating where the unmanned aircraft 100 should be. The UAV control 110 may obtain, via the communication interface 150, location information from other devices that indicates a location where the unmanned aerial vehicle 100 should be present. The UAV control unit 110 may specify a position where the unmanned aircraft 100 can exist by referring to the three-dimensional map database, and acquire the position as position information indicating a position where the unmanned aircraft 100 should exist.
The UAV control unit 110 can acquire imaging range information indicating imaging ranges of the imaging unit 220 and the imaging unit 230. The UAV control unit 110 may acquire, as a parameter for determining an imaging range, angle-of-view information indicating angles of view of the imaging unit 220 and the imaging unit 230 from the imaging unit 220 and the imaging unit 230. The UAV control unit 110 may acquire information indicating the imaging directions of the imaging unit 220 and the imaging unit 230 as a parameter for determining the imaging range. The UAV control unit 110 may acquire, for example, attitude information indicating an attitude state of the imaging unit 220 from the universal joint 200 as information indicating an imaging direction of the imaging unit 220. The attitude information of the imaging unit 220 may indicate an angle at which the gimbal 200 rotates from the reference rotation angle of the pitch axis and the yaw axis.
The UAV control unit 110 may acquire, as a parameter for determining the imaging range, position information indicating a position where the unmanned aerial vehicle 100 is located. The UAV control unit 110 may obtain imaging range information by defining an imaging range indicating a geographical range to be imaged by the imaging unit 220 and generating imaging range information based on the angles of view and the imaging directions of the imaging unit 220 and the imaging unit 230 and the position of the unmanned aerial vehicle 100.
The UAV control unit 110 may acquire imaging range information from the memory 160. The UAV control section 110 may acquire imaging range information via the communication interface 150.
The UAV control unit 110 controls the universal joint 200, the rotor mechanism 210, the imaging unit 220, and the imaging unit 230. The UAV control unit 110 may control the imaging range of the imaging unit 220 by changing the imaging direction or the angle of view of the imaging unit 220. The UAV control unit 110 can control the imaging range of the imaging unit 220 supported by the gimbal 200 by controlling the rotation mechanism of the gimbal 200.
The imaging range refers to a geographical range imaged by the imaging unit 220 or the imaging unit 230. The imaging range is defined by latitude, longitude, and altitude. The imaging range may be a range of three-dimensional spatial data defined by latitude, longitude, and altitude. The imaging range may be a range of two-dimensional spatial data defined by latitude and longitude. The imaging range may be specified according to the angle of view and the imaging direction of the imaging unit 220 or the imaging unit 230 and the position where the unmanned aerial vehicle 100 is located. The imaging directions of the imaging units 220 and 230 may be defined by the azimuth and the depression angle of the direction in which the front surfaces of the imaging lenses of the imaging units 220 and 230 face. The imaging direction of the imaging unit 220 may be a direction specified from the orientation of the nose of the unmanned aerial vehicle 100 and the attitude state of the imaging unit 220 with respect to the gimbal 200. The imaging direction of the imaging unit 230 may be a direction specified from the orientation of the nose of the unmanned aerial vehicle 100 and the position where the imaging unit 230 is provided.
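As a simple numeric illustration of how an imaging range can follow from the angle of view and the position, the sketch below computes the ground footprint width of a straight-down shot over flat ground; the 84-degree angle of view and the flat-ground, nadir-view assumptions are illustrative only and are not taken from the disclosure.

```python
import math


def ground_footprint_width(altitude_m: float, fov_deg: float) -> float:
    """Width of the ground area covered by a straight-down image over flat
    ground: footprint = 2 * altitude * tan(FOV / 2)."""
    return 2.0 * altitude_m * math.tan(math.radians(fov_deg) / 2.0)


# Example: at 100 m altitude with an 84-degree horizontal angle of view,
# the imaging range is roughly 180 m wide on the ground.
print(round(ground_footprint_width(100.0, 84.0), 1))  # 180.1
```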
The UAV control 110 may determine the environment around the unmanned aircraft 100 by analyzing the plurality of images captured by the plurality of cameras 230. The UAV control 110 may control flight based on the environment surrounding the unmanned aircraft 100, such as avoiding obstacles.
The UAV control unit 110 can acquire stereo information (three-dimensional information) indicating a stereo shape (three-dimensional shape) of an object existing around the unmanned aircraft 100. The object may be, for example, a part of a landscape of a building, a road, a vehicle, a tree, etc. The stereo information is, for example, three-dimensional spatial data. The UAV control unit 110 may acquire the stereoscopic information by generating stereoscopic information indicating a stereoscopic shape of an object existing around the unmanned aircraft 100 from each of the images obtained from the plurality of imaging units 230. The UAV control unit 110 may acquire the stereoscopic information indicating the stereoscopic shape of the object existing around the unmanned aircraft 100 by referring to the three-dimensional map database stored in the memory 160 or the memory 170. The UAV control section 110 can acquire the stereoscopic information relating to the stereoscopic shape of the object existing around the unmanned aircraft 100 by referring to the three-dimensional map database managed by the server existing on the network.
UAV control 110 controls the flight of unmanned aircraft 100 by controlling rotor mechanism 210. That is, the UAV controller 110 controls the position including the latitude, longitude, and altitude of the unmanned aerial vehicle 100 by controlling the rotor mechanism 210. The UAV control unit 110 may control the imaging range of the imaging unit 220 by controlling the flight of the unmanned aerial vehicle 100. The UAV control unit 110 may control an angle of view of the image pickup unit 220 by controlling a zoom lens included in the image pickup unit 220. The UAV control unit 110 may control an angle of view of the image capturing unit 220 by digital zooming using a digital zoom function of the image capturing unit 220.
When the imaging unit 220 is fixed to the unmanned aircraft 100 and the imaging unit 220 cannot be moved, the UAV control unit 110 may cause the imaging unit 220 to capture an image of a desired imaging range in a desired environment by moving the unmanned aircraft 100 to a specific position at a specific date and time. Alternatively, even when the imaging unit 220 does not have the zoom function and the angle of view of the imaging unit 220 cannot be changed, the UAV control unit 110 may cause the imaging unit 220 to capture an image of a desired imaging range in a desired environment by moving the unmanned aerial vehicle 100 to a specific position on a specific date and time.
The communication interface 150 communicates with the terminal 80. The communication interface 150 may perform wireless communication by any wireless communication method. The communication interface 150 may perform wired communication by any wired communication method. The communication interface 150 may transmit a camera image (e.g., an aerial image), additional information (metadata) related to the camera image to the terminal 80.
The memory 160 stores programs and the like necessary for the UAV control unit 110 to control the universal joint 200, the rotor mechanism 210, the imaging unit 220, the imaging unit 230, the GPS receiver 240, the inertial measurement unit 250, the magnetic compass 260, the barometric altimeter 270, the ultrasonic sensor 280, and the laser meter 290. The Memory 160 may be a computer-readable recording medium and may include at least one of flash memories such as an SRAM (Static Random Access Memory), a DRAM (Dynamic Random Access Memory), an EPROM (Erasable Programmable Read Only Memory), an EEPROM (Electrically Erasable Programmable Read-Only Memory), and a USB (Universal Serial Bus) Memory. The memory 160 may be detached from the unmanned aircraft 100. The memory 160 may operate as a working memory.
The memory 170 may include at least one of an HDD (Hard Disk Drive), an SSD (Solid State Drive), an SD card, a USB memory, and other memories. The memory 170 may store various information and various data. The memory 170 may be detachable from the unmanned aircraft 100. The memory 170 may record the camera image.
The memory 160 or the storage 170 may store information of the imaging position and the imaging path generated by the terminal 80 or the unmanned aerial vehicle 100. The information of the imaging position and the imaging path may be set by the UAV control unit 110 as one of the imaging parameters for imaging planned by the unmanned aerial vehicle 100 or the flight parameters for flight scheduled by the unmanned aerial vehicle 100. The setting information may be stored in the memory 160 or the storage 170.
The gimbal 200 may support the camera 220 rotatably about a yaw axis, a pitch axis, and a roll axis. The gimbal 200 can rotate the imaging unit 220 around at least one of the yaw axis, pitch axis, and roll axis, thereby changing the imaging direction of the imaging unit 220.
Rotor mechanism 210 has a plurality of rotors and a plurality of drive motors that rotate the plurality of rotors. The rotary wing mechanism 210 is controlled to rotate by the UAV control unit 110, thereby flying the unmanned aerial vehicle 100. The number of rotors 211 may be, for example, four, or may be other numbers. Further, the unmanned aerial vehicle 100 may be a fixed wing aircraft without rotors.
The imaging unit 220 captures an image of an object in a desired imaging range and generates data of a captured image. Image data (for example, a captured image) obtained by the imaging unit 220 may be stored in the memory of the imaging unit 220 or in the memory 170.
The imaging unit 230 captures an image of the periphery of the unmanned aircraft 100 and generates data of a captured image. The image data of the image pickup section 230 may be stored in the memory 170.
The GPS receiver 240 receives a plurality of signals indicating times transmitted from a plurality of navigation satellites (i.e., GPS satellites) and positions (coordinates) of the respective GPS satellites. The GPS receiver 240 calculates the position of the GPS receiver 240 (i.e., the position of the unmanned aircraft 100) based on the plurality of signals received. The GPS receiver 240 outputs the position information of the unmanned aerial vehicle 100 to the UAV control section 110. In addition, the calculation of the position information of the GPS receiver 240 may be performed by the UAV control section 110 instead of the GPS receiver 240. In this case, information indicating the time and the position of each GPS satellite included in the plurality of signals received by the GPS receiver 240 is input to the UAV control unit 110.
The inertial measurement unit 250 detects the attitude of the unmanned aircraft 100 and outputs the detection result to the UAV control unit 110. The inertial measurement unit 250 can detect the three-axis acceleration and the three-axis angular velocity of the pitch axis, the roll axis, and the yaw axis of the unmanned aerial vehicle 100 in the front-rear direction, the left-right direction, and the up-down direction, as the attitude of the unmanned aerial vehicle 100.
The magnetic compass 260 detects the orientation of the nose of the unmanned aerial vehicle 100, and outputs the detection result to the UAV control section 110.
The barometric altimeter 270 detects the flying height of the unmanned aircraft 100, and outputs the detection result to the UAV control unit 110.
The ultrasonic sensor 280 emits ultrasonic waves, detects ultrasonic waves reflected by the ground or an object, and outputs the detection result to the UAV control unit 110. The detection result may show the distance from the unmanned aerial vehicle 100 to the ground, i.e., the altitude. The detection result may show the distance from the unmanned aerial vehicle 100 to the object (subject).
The laser measurement instrument 290 irradiates a laser beam on an object, receives reflected light reflected by the object, and measures the distance between the unmanned aircraft 100 and the object (subject) by the reflected light. As an example of the laser-based distance measuring method, a time-of-flight method may be cited.
Fig. 5 is a block diagram showing one example of the hardware configuration of the terminal 80. The terminal 80 includes a terminal control unit 81, an operation unit 83, a communication unit 85, a memory 87, a display unit 88, and a storage 89. The terminal 80 may be held by a user who wishes to control the flight of the unmanned aircraft 100.
The terminal control unit 81 is configured by, for example, a CPU, an MPU, or a DSP. The terminal control unit 81 performs signal processing for controlling the operation of each unit of the terminal 80 as a whole, data input/output processing with respect to other units, data arithmetic processing, and data storage processing.
The terminal control unit 81 can acquire data and information from the unmanned aircraft 100 via the communication unit 85. The terminal control section 81 can acquire data and information (for example, various parameters) input via the operation section 83. The terminal control unit 81 can acquire data and information stored in the memory 87. The terminal control unit 81 can transmit data and information (for example, information of position, speed, and flight path) to the unmanned aerial vehicle 100 via the communication unit 85. The terminal control unit 81 may transmit the data and information to the display unit 88, and may cause the display unit 88 to display information based on the data and information.
The operation unit 83 receives and acquires data and information input by the user of the terminal 80. The operation unit 83 may include input devices such as buttons, keys, a touch display screen, and a microphone. Here, the operation unit 83 and the display unit 88 are mainly illustrated as being constituted by a touch panel. In this case, the operation section 83 may perform a touch operation, a click operation, a drag operation, and the like. The operation unit 83 can receive information of various parameters. The information input by the operation unit 83 may be transmitted to the unmanned aircraft 100.
The communication unit 85 performs wireless communication with the unmanned aircraft 100 by various wireless communication methods. The wireless communication means of this wireless communication may include, for example, wireless LAN, Bluetooth (registered trademark), or communication via a public wireless network. The communication unit 85 can perform wired communication by any wired communication method.
The memory 87 may include, for example, a ROM that stores a program defining the operation of the terminal 80 and data of set values, and a RAM that temporarily stores various information and data used when the terminal control unit 81 performs processing. The memory 87 may include memory other than ROM and RAM. The memory 87 may be provided inside the terminal 80. The memory 87 may be configured to be removable from the terminal 80. The program may include an application program.
The Display unit 88 is configured by, for example, an LCD (Liquid Crystal Display), and displays various information and data output from the terminal control unit 81. The display unit 88 can display various data and information related to execution of the application program.
The memory 89 stores and holds various data and information. The memory 89 may be an HDD, SSD, SD card, USB memory, or the like. The memory 89 may be provided inside the terminal 80. The memory 89 may be detachably provided on the terminal 80. The memory 89 may store the camera image acquired from the unmanned aerial vehicle 100 and additional information thereof. Additional information may be stored in memory 87.
The operation of the flight system 10 will be described below. The unmanned aerial vehicle 100 or the terminal 80 of the flight system 10 performs processing relating to the generation of a composite image based on a plurality of captured images captured by the unmanned aerial vehicle 100. The UAV control unit 110 of the unmanned aircraft 100 or the terminal control unit 81 of the terminal 80 is an example of a processing unit that executes processing related to the generation of a composite image. Here, the processing relating to the composite image is illustrated as being performed by the terminal 80.
In the present embodiment, it is assumed that the composite image may be generated by a processor with limited computing power. The composite image can be used as a map image or an orthoimage. Processors with limited computing power include, for example, processors for which composite image generation involving dense point cloud generation is difficult to perform in real time.
Fig. 6 is a flowchart showing one example of the image composition processing procedure. As one example, the processing may be executed by the terminal control section 81 of the terminal 80 executing a program stored in the memory 87. Further, the unmanned aerial vehicle 100 may perform an action to assist the image synthesis process. For example, the unmanned aerial vehicle 100 may provide the terminal 80 with the captured image captured by the imaging section 220 and additional information thereof, or may provide various parameters (for example, flight parameters related to the flight of the unmanned aerial vehicle 100, imaging parameters related to the imaging by the imaging section 220).
The terminal control unit 81 acquires the flight range and various parameters (S1). In this case, the user may input the flight range and the parameters to the terminal 80. The terminal control unit 81 may receive the user input via the operation unit 83 and acquire the input flight range and parameters.
The terminal control unit 81 can acquire map information from an external server via the communication unit 85. For example, when the flight range is set to a rectangular range, the user can provide the information of the flight range by inputting the positions (latitude, longitude) of the four corners of the rectangle on the map information. Further, when the flight range is set to a circular range, the user can provide the information of the flight range by inputting the radius of a circle centered on the flight position. Further, the user can provide the information of the flight range by inputting information such as an area or a specific place name (e.g., Tokyo) and referring to the map information. The terminal control unit 81 can also acquire a flight range stored in the memory 87 or the storage 89. The flight range may be a predetermined range in which the unmanned aircraft 100 is to fly.
The parameters may include imaging parameters related to imaging by the imaging unit 220 and flight parameters related to the flight of the unmanned aerial vehicle 100. The imaging parameters may include an imaging position, an imaging date and time, a distance to the object, an imaging angle of view, an attitude of the unmanned aerial vehicle 100 at the time of imaging, an imaging direction, imaging conditions, and camera parameters (shutter speed, exposure value, imaging mode, and the like). The flight parameters may include a flight position (three-dimensional or two-dimensional position), a flight altitude, a flight speed, a flight acceleration, a flight path, a flight date and time, and the like. The terminal control unit 81 can acquire various parameters stored in the memory 87 or the memory 89.
The terminal control unit 81 can acquire the flight range and various parameters from an external server or from the unmanned aerial vehicle 100 via the communication unit 85. In the unmanned aircraft 100, the flight range and the various parameters may be obtained from the memory 160, obtained from various sensors in the unmanned aircraft 100 (e.g., the GPS receiver 240, the inertial measurement device 250), or derived (e.g., calculated). The terminal control unit 81 can determine the flight path and the imaging positions of the unmanned aerial vehicle 100 based on the acquired flight range and various parameters. The terminal control unit 81 may notify the unmanned aircraft 100 of the determined flight path and imaging positions via the communication unit 85.
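One common way such a flight path and imaging positions could be laid out over a rectangular flight range is a serpentine (back-and-forth) pattern; the sketch below is only an assumed example, and the spacings would normally be derived from the camera footprint and the desired image overlap rather than fixed constants.

```python
from typing import List, Tuple


def serpentine_waypoints(x_min: float, y_min: float, x_max: float, y_max: float,
                         line_spacing: float, shot_spacing: float,
                         altitude: float) -> List[Tuple[float, float, float]]:
    """Generate (x, y, z) imaging positions covering a rectangular flight range
    with back-and-forth passes."""
    waypoints: List[Tuple[float, float, float]] = []
    y = y_min
    reverse = False
    while y <= y_max:
        xs = []
        x = x_min
        while x <= x_max:
            xs.append(x)
            x += shot_spacing
        if reverse:
            xs.reverse()
        waypoints.extend((xi, y, altitude) for xi in xs)
        reverse = not reverse
        y += line_spacing
    return waypoints


# Example: a 200 m x 100 m range, passes every 25 m, a shot every 20 m, at 80 m altitude.
print(len(serpentine_waypoints(0, 0, 200, 100, 25, 20, 80)))  # 55 imaging positions
```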
In the unmanned aerial vehicle 100, the UAV control unit 110 controls the flight according to the determined flight path, and causes the imaging unit 220 to capture (e.g., aerial) an image. In this case, the UAV control section 110 may acquire a plurality of photographic images at different positions and postures during flight. The UAV control 110 may send the camera images to the terminal 80 via the communication interface 150. In the terminal 80, the terminal control unit 81 acquires the captured image via the communication unit 85 and stores the captured image in the memory 89 (S2).
The shooting may be performed at each shooting position along the flight path, and a plurality of shot images are saved in the memory 89. The terminal control unit 81 may acquire additional information related to the captured image via the communication unit 85 and store the additional information in the memory 89. The additional information may include information similar to the various parameters described above (e.g., flight parameters, camera parameters). Therefore, the terminal control unit 81 can acquire information such as the imaging position, the attitude, and the imaging direction of the imaging unit 220 (i.e., the unmanned aircraft 100) at the time of imaging each captured image.
Further, a plurality of captured images may be captured by the same imaging unit 220, or may be captured by different imaging units 220. That is, multiple images may be taken by multiple different unmanned aerial vehicles 100.
The terminal control unit 81 can acquire a plurality of captured images from the communication unit 85 or the memory 89. The terminal control unit 81 may extract a plurality of feature points included in each captured image. A feature point may be a point at any position on the captured image. The terminal control unit 81 may perform matching processing that associates the same feature points among the plurality of feature points included in the respective captured images, and may generate corresponding points as the associated feature points.
In the matching process, the terminal control unit 81 may take into account the difference (reprojection error) between the actual observed position at which each feature point is projected onto the captured image and the position at which each feature point is reproduced on the captured image based on parameters such as the position and orientation of the imaging unit 220. The terminal control unit 81 can perform bundle adjustment (BA) to minimize the reprojection error. Based on the result of the bundle adjustment and the correspondence between the feature points in the respective captured images, the terminal control unit 81 can derive the corresponding points.
The terminal control unit 81 may generate a sparse point group including a plurality of corresponding points. The terminal control unit 81 may generate the sparse point group using, for example, SfM (Structure from Motion). The number of points in the sparse point group may be, for example, several hundred per image. The data of the points contained in the sparse point group may include data representing a three-dimensional position. That is, the sparse point group here is a three-dimensional point group. In this manner, the terminal control unit 81 generates a sparse point group based on the plurality of captured images (S3).
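A minimal two-view stand-in for this step, using OpenCV feature matching and triangulation, is sketched below. It assumes the 3x4 projection matrices P1 and P2 are already derived from the recorded position and posture of the imaging unit for each shot; a full pipeline would match many images and refine the result with bundle adjustment (SfM), which is omitted here.

```python
import cv2
import numpy as np


def sparse_points_from_pair(img1, img2, P1, P2, ratio=0.75):
    """Match features between two captured images and triangulate the matches
    into a sparse 3D point group (a few hundred points is typical)."""
    orb = cv2.ORB_create(nfeatures=2000)
    kp1, des1 = orb.detectAndCompute(img1, None)
    kp2, des2 = orb.detectAndCompute(img2, None)

    # Keep only reliable correspondences via the ratio test.
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING)
    good = []
    for pair in matcher.knnMatch(des1, des2, k=2):
        if len(pair) == 2 and pair[0].distance < ratio * pair[1].distance:
            good.append(pair[0])

    pts1 = np.float32([kp1[m.queryIdx].pt for m in good]).T  # 2xN image points
    pts2 = np.float32([kp2[m.trainIdx].pt for m in good]).T  # 2xN image points

    # Triangulate into homogeneous 3D points, then dehomogenize.
    pts4d = cv2.triangulatePoints(P1, P2, pts1, pts2)         # 4xN
    return (pts4d[:3] / pts4d[3]).T                           # Nx3 sparse point group
```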
The terminal control unit 81 generates the three-dimensional model M based on the sparse point group (S4). In this case, the terminal control unit 81 may generate a plurality of faces (surfaces) sf having adjacent points included in the sparse point group as vertices, and generate the three-dimensional model M represented by the plurality of faces sf. The three-dimensional model M may be, for example, a terrain model representing the shape of the ground. Since the three-dimensional model M is formed based on the sparse point group, it is a sparse three-dimensional model (a rough three-dimensional model). The generated three-dimensional model M may be stored in the memory 89.
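The face generation described here (and in the aspects above: project the sparse 3D points onto a two-dimensional plane, group adjacent projected points, and connect the corresponding 3D points into faces) could plausibly be realized with a 2D Delaunay triangulation, as in the sketch below. The choice of the horizontal (x, y) plane as the projection plane and of Delaunay triangulation as the grouping rule are assumptions of this sketch, not details taken from the disclosure.

```python
import numpy as np
from scipy.spatial import Delaunay


def sparse_mesh(points3d: np.ndarray):
    """Build a rough terrain-like three-dimensional model M from a sparse
    3D point group.

    points3d : (N, 3) array of sparse 3D points
    returns  : (vertices, faces), where each face is a triple of vertex indices
    """
    xy = points3d[:, :2]        # projection onto a two-dimensional plane
    tri = Delaunay(xy)          # groups of mutually adjacent projected points
    faces = tri.simplices       # (M, 3) indices: each row is one face sf
    return points3d, faces
```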
The terminal control unit 81 derives (e.g., calculates) the distance D between each position of the imaging unit 220 and the three-dimensional model M based on the three-dimensional model M and at least one of each position (three-dimensional position) and each posture of the imaging unit 220 at the time of imaging (S5). That is, since the unmanned aerial vehicle 100 moves during flight, a plurality of distances D between the positions of the imaging unit 220 and the three-dimensional model M are derived. Since the three-dimensional model M is generated based on the sparse point group having three-dimensional position information, the position and shape of the three-dimensional model M in the three-dimensional space can be determined. The coordinate space in which the three-dimensional model M exists and the coordinate space in which the imaging unit 220 of the unmanned aircraft 100 exists are the same coordinate space. Therefore, the distance D between each position of the imaging unit 220 and a predetermined position in the three-dimensional model M can be derived. The terminal control unit 81 may calculate the distance D based not only on the position information of the imaging unit 220 but also on the posture information of the imaging unit 220 at each time, using the time information. In this case, the terminal control unit 81 may acquire only the information on the posture, and not the information on the position, of the imaging unit 220.
The terminal control unit 81 adjusts the size (scale) of each captured image captured by the imaging unit 220 at each position based on the acquired distance D (S6). In this case, the terminal control unit 81 may calculate the size of each captured image captured by the imaging unit 220 at each position based on the acquired distance D. The terminal control unit 81 may enlarge or reduce each captured image so that it has the calculated size. The size of each captured image is an index indicating the enlargement rate or reduction rate of each captured image for generating the composite image. A reference captured image (a captured image that is neither enlarged nor reduced) may be present among the plurality of captured images.
For example, the terminal control unit 81 can increase the magnification of the captured image because the larger the distance D between the imaging unit 220 and the three-dimensional model M, the smaller the imaging target (e.g., the ground, a building, an object) included in the captured image captured by the imaging unit 220. Since the shorter the distance D between the imaging unit 220 and the three-dimensional model M, the larger the subject included in the captured image captured by the imaging unit 220, the terminal control unit 81 can reduce the magnification of the captured image.
For example, the terminal control unit 81 can reduce the reduction rate of the captured image because the longer the distance D between the imaging unit 220 and the three-dimensional model M, the smaller the subject included in the captured image captured by the imaging unit 220. Since the shorter the distance D between the imaging unit 220 and the three-dimensional model M, the larger the subject included in the captured image captured by the imaging unit 220, the terminal control unit 81 can increase the reduction rate of the captured image.
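One plausible scale factor consistent with the two preceding paragraphs is the ratio of each image's distance D to the distance of the reference image, so that an image taken from farther away is magnified more; the exact formula is not specified in the description, so the D / D_ref convention below is an assumption of this sketch.

```python
import cv2


def resize_by_distance(images, distances, ref_index=0):
    """Scale each captured image by D / D_ref so that the subject on the
    modeled surface appears at roughly the same size as in the reference image."""
    d_ref = distances[ref_index]
    resized = []
    for img, d in zip(images, distances):
        scale = d / d_ref  # larger distance -> larger magnification
        resized.append(cv2.resize(img, None, fx=scale, fy=scale,
                                  interpolation=cv2.INTER_LINEAR))
    return resized
```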
The terminal control unit 81 synthesizes the size-adjusted captured images to generate a composite image (S7). In this case, for a subject portion that appears redundantly in adjacent captured images in the composite image, the terminal control unit 81 may keep that portion from one of the adjacent captured images and delete the duplicated portions from the others. That is, the overlapping portion of the subject may be drawn as represented in any one of the captured images. The terminal control unit 81 may generate the composite image from the plurality of size-adjusted captured images by a known compositing method.
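A simplistic mosaicking step consistent with this description (overlapping subject portions drawn from only one of the adjacent images) might look as follows; the per-image placement offsets are assumed to be given, for example derived from the imaging positions, which this sketch does not compute.

```python
import numpy as np


def composite(resized_images, offsets, canvas_shape):
    """Paste size-adjusted captured images onto a common canvas.

    offsets      : list of (row, col) top-left placement per image (assumed given)
    canvas_shape : (H, W) of the output composite image
    """
    canvas = np.zeros((*canvas_shape, 3), dtype=np.uint8)
    written = np.zeros(canvas_shape, dtype=bool)
    for img, (r0, c0) in zip(resized_images, offsets):
        h, w = img.shape[:2]
        region = canvas[r0:r0 + h, c0:c0 + w]            # view into the canvas
        mask = ~written[r0:r0 + h, c0:c0 + w]            # pixels not yet drawn
        img_crop = img[:region.shape[0], :region.shape[1]]
        region[mask] = img_crop[mask]                    # keep the earlier image in overlaps
        written[r0:r0 + h, c0:c0 + w] |= mask
    return canvas
```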
In this way, by adjusting the size of each captured image, the terminal 80 can make the positional relationships between the feature points and the corresponding points included in the respective captured images substantially equal. Thus, the terminal 80 can suppress a situation in which the size of a subject (object) included in the composite image differs depending on the original captured image, that is, in which the size of the subject varies among the captured images included in the composite image. For example, the terminal 80 can match the dimensions of objects (e.g., terrain and buildings) reflected in the composite image among the plurality of captured images, and can generate the composite image while reducing the variation in the dimensions of the same or corresponding objects among the plurality of captured images.
Further, since the terminal 80 generates a sparse point group, generates a sparse three-dimensional model, and adjusts the size of each captured image, it is not necessary to generate a dense point group. Therefore, the terminal 80 can reduce the processing load and processing time for generating the composite image without having to provide a processor having high processing power.
Further, distance derivation, size adjustment, composite image generation, and the like can be performed based on the captured image captured by the imaging unit 230, not based on the captured image captured by the imaging unit 220.
Further, at least a part of the image generation processing of fig. 6 may also be performed by the unmanned aerial vehicle 100. For example, the UAV control unit 110 may acquire the flight range and various parameters from the terminal 80 (S1), acquire the captured images (S2), generate the sparse point group (S3), generate the three-dimensional model (S4), derive the distances D (S5), adjust the sizes (S6), and generate the composite image (S7). In addition, the image composition processing may also be shared between the unmanned aerial vehicle 100 and the terminal 80.
Next, an example of deriving the distance D will be described.
Fig. 7 is a diagram showing an example of distances between respective positions of the imaging unit 220 (i.e., respective positions of the unmanned aerial vehicle 100) and the corresponding portions of the three-dimensional model M (e.g., portions included in the imaging range CR). In fig. 7, the imaging unit 220 at each position is denoted by 220a, 220b, and 220c.
The distance D between the imaging unit 220a and the three-dimensional model M is the distance ha. The distance D between the imaging unit 220c and the three-dimensional model M is the distance hc. That is, in fig. 7, the distance D differs at each position of the imaging unit 220. Therefore, if the captured images were combined without resizing, the sizes of the objects appearing in the captured images would not be uniform in the resulting composite image. In the present embodiment, by contrast, the terminal control unit 81 resizes each captured image by a resizing amount (for example, an enlargement ratio or a reduction ratio) corresponding to the distance D, and combines the resized captured images. The terminal 80 can therefore generate a composite image in which the sizes of the objects appearing in the respective captured images are uniform.
Fig. 8 is a diagram showing an example of deriving the distance D.
As shown in fig. 8, the distance D between the position of the imaging unit 220 and the three-dimensional model M may be the distance h1 along the vertical direction (the direction perpendicular to the horizontal plane). That is, the terminal control unit 81 may take as the distance D the distance h1 between the imaging unit 220 and the intersection point C1, where C1 is the intersection of the three-dimensional model M with a straight line L1 that passes through the imaging unit 220 and is parallel to the vertical direction. The position of the imaging unit 220 may be taken as the imaging plane of the imaging unit 220, or as the image center of the captured image captured by the imaging unit 220.
Thus, the terminal 80 can resize the captured image in consideration of the flight altitude of the unmanned aerial vehicle 100 at the time the captured image was captured.
Alternatively, as shown in fig. 8, the distance D between the position of the imaging unit 220 and the three-dimensional model M may be the distance h2 along the imaging direction of the imaging unit 220. That is, the terminal control unit 81 may take as the distance D the distance h2 between the imaging unit 220 and the intersection point C2, where C2 is the intersection of the three-dimensional model M with a straight line L2 that passes through the imaging unit 220 and is parallel to the imaging direction defined by the orientation of the imaging unit 220. In this way, the inclination of the imaging direction with respect to the vertical direction is also taken into account.
Thus, the terminal 80 can resize the captured image in consideration of the distance between the unmanned aerial vehicle 100 and the object appearing in the captured image at the time the image was captured.
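The two distance definitions above can be sketched as follows, assuming for illustration that the three-dimensional model M is available as a height field terrain_height(x, y); the ray-marching loop used for h2 is an illustrative approximation, not the method prescribed by the disclosure.

import numpy as np

def vertical_distance(cam_pos, terrain_height):
    """h1: distance along the vertical direction from the camera position to
    the model surface (terrain_height(x, y) -> z is an assumed height-field
    view of the three-dimensional model M)."""
    x, y, z = cam_pos
    return z - terrain_height(x, y)

def view_direction_distance(cam_pos, view_dir, terrain_height,
                            step=0.5, max_range=500.0):
    """h2: distance along the imaging direction to the first intersection with
    the model, found here by simple ray marching."""
    d = np.asarray(view_dir, dtype=float)
    d /= np.linalg.norm(d)
    p = np.asarray(cam_pos, dtype=float)
    t = 0.0
    while t < max_range:
        q = p + t * d
        if q[2] <= terrain_height(q[0], q[1]):  # ray has reached the surface
            return t
        t += step
    return None  # no intersection within max_range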
In addition, at least a part of the three-dimensional model M is included in the imaging range CR imaged by the imaging unit 220. Therefore, at least a part of the three-dimensional model M is also included in the image range of the captured image captured by the imaging unit 220. One imaging range CR, or one image range, corresponds to one captured image.
The distance D between the position of the imaging unit 220 and the three-dimensional model M can be derived for each captured image. That is, the terminal control unit 81 may derive the distance D to the portion of the three-dimensional model M included in the imaging range CR or the image range for each imaging range CR or each image range.
Specifically, the terminal control unit 81 may calculate the distance DA between the position PA of the imaging unit 220 at which the captured image GMA is captured and the portion MA of the three-dimensional model M (first example of the first portion of the three-dimensional model) included in the imaging range CRA of the imaging unit 220 at the position PA. The terminal control unit 81 may calculate a distance DB between a position PB of the imaging unit 220 at which the captured image GMB is captured and a portion MB of the three-dimensional model M (a second example of the first portion of the three-dimensional model) included in the imaging range CRB of the imaging unit 220 at the position PB.
Fig. 9 is a diagram showing an example of deriving the distance D for each imaging range CR. In fig. 9, in the imaging range CR1 in which the captured image GM1 is captured, the distance D between the imaging unit 220 and the portion MA of the three-dimensional model M is the distance "1". In the imaging range CR2 in which the captured image GM2 is captured, the distance D between the imaging unit 220 and the portion MB of the three-dimensional model M is the distance "2". In this case, the terminal control unit 81 may, for example, enlarge the captured image GM2 corresponding to the imaging range CR2 by a factor of 2 and combine the captured image GM1 with the enlarged captured image GM2. Enlarging by a factor of 2 for the distance "2" is merely an example; the resizing amount, such as the enlargement ratio or reduction ratio, may be determined according to the distance.
In this way, the terminal 80 can roughly derive the distance D between the position of the imaging unit 220 (the camera position) and the imaging target (the three-dimensional model M), one distance per captured image. The processing load on the terminal 80 for deriving the distance D is therefore relatively small, and the processing time relatively short.
Further, the distance D between the position of the imaging unit 220 and the three-dimensional model M may be derived for each divided region DR of the captured image. That is, the distance D may be derived for each divided region DR of the imaging range CR or of the image range GR corresponding to the captured image. In this case, the terminal control unit 81 may generate a plurality of divided regions DR by dividing the imaging range CR or the image range GR, and may calculate, as a distance D, the distance between each divided region DR and the portion of the three-dimensional model M corresponding to that divided region DR. Since the distance D is derived for each divided region DR, as many distances D as there are divisions can be derived for one captured image.
Specifically, the terminal control section 81 may calculate the distance DC1 between the position PC of the imaging section 220 where the captured image GMC is captured and the portion MC1 (an example of the second portion of the three-dimensional model) of the three-dimensional model M included in the divided region DRC1 of the imaging range CRC of the imaging section 220 at the position PC. The terminal control section 81 may calculate the distance DC2 between the position PC and the part MC2 (one example of the second part of the three-dimensional model) of the three-dimensional model M contained in the divided region DRC2 of the shooting range CRC. That is, the distances DC1 and DC2 can be calculated for each of the divided regions DR of the imaging range CRC. In addition, the derivation of the distance D for each divided region DR may be performed for a plurality of imaging ranges CR corresponding to a plurality of captured images.
Fig. 10 is a diagram showing an example of deriving the distance D for each divided region DR obtained by dividing the imaging range CR.
In fig. 10, the imaging range CR11 in which the imaging unit 220a captures the captured image GM11 includes divided regions DR11 to DR19. In the divided region DR11, the distance D11 between the position of the imaging unit 220 that captured the captured image GM11 and the corresponding portion M11 of the three-dimensional model M is the distance "1". Similarly, in the divided region DR12, the distance D12 to the corresponding portion M12 is the distance "0.5". In the divided region DR13, the distance D13 to the corresponding portion M13 is the distance "1". In the divided region DR14, the distance D14 to the corresponding portion M14 is the distance "2". In the divided region DR15, the distance D15 to the corresponding portion M15 is the distance "2". In the divided region DR16, the distance D16 to the corresponding portion M16 is the distance "1.5". In the divided region DR17, the distance D17 to the corresponding portion M17 is the distance "1". In the divided region DR18, the distance D18 to the corresponding portion M18 is the distance "2.5". In the divided region DR19, the distance D19 to the corresponding portion M19 is the distance "1".
In fig. 10, the imaging range CR21 in which the imaging unit 220b captures the captured image GM21 includes divided regions DR21 to DR29. In the divided region DR21, the distance D21 between the position of the imaging unit 220 that captured the captured image GM21 and the corresponding portion M21 of the three-dimensional model M is the distance "2". Similarly, the distance D22 corresponding to the divided region DR22 is the distance "1.5". The distance D23 corresponding to the divided region DR23 is the distance "2". The distance D24 corresponding to the divided region DR24 is the distance "3". The distance D25 corresponding to the divided region DR25 is the distance "3". The distance D26 corresponding to the divided region DR26 is the distance "2.5". The distance D27 corresponding to the divided region DR27 is the distance "2". The distance D28 corresponding to the divided region DR28 is the distance "3.5". The distance D29 corresponding to the divided region DR29 is the distance "2".
In this case, for example, the terminal control unit 81 keeps the portion of the captured image GM11 corresponding to the distance "1" (the portion corresponding to the divided region DR11) as it is (a factor of 1, neither enlarged nor reduced), reduces the portion corresponding to the distance "0.5" (the portion corresponding to the divided region DR12) by a factor of 0.5, and keeps the portion corresponding to the distance "1" (the portion corresponding to the divided region DR13) as it is. The terminal control unit 81 enlarges the portion corresponding to the distance "2" (the portion corresponding to the divided region DR14) by a factor of 2, enlarges the portion corresponding to the distance "2" (the portion corresponding to the divided region DR15) by a factor of 2, and enlarges the portion corresponding to the distance "1.5" (the portion corresponding to the divided region DR16) by a factor of 1.5. The terminal control unit 81 keeps the portion corresponding to the distance "1" (the portion corresponding to the divided region DR17) as it is, enlarges the portion corresponding to the distance "2.5" (the portion corresponding to the divided region DR18) by a factor of 2.5, and keeps the portion corresponding to the distance "1" (the portion corresponding to the divided region DR19) as it is. In this way, the terminal control unit 81 adjusts the sizes of the regions of the captured image GM11 corresponding to the respective divided regions DR11 to DR19.
Similarly, the terminal control unit 81 enlarges the portion of the captured image GM21 corresponding to the distance "2" (the portion corresponding to the divided region DR21) by a factor of 2, enlarges the portion corresponding to the distance "1.5" (DR22) by a factor of 1.5, and enlarges the portion corresponding to the distance "2" (DR23) by a factor of 2. The terminal control unit 81 enlarges the portion corresponding to the distance "3" (DR24) by a factor of 3, the portion corresponding to the distance "3" (DR25) by a factor of 3, and the portion corresponding to the distance "2.5" (DR26) by a factor of 2.5. The terminal control unit 81 enlarges the portion corresponding to the distance "2" (DR27) by a factor of 2, the portion corresponding to the distance "3.5" (DR28) by a factor of 3.5, and the portion corresponding to the distance "2" (DR29) by a factor of 2. In this way, the terminal control unit 81 adjusts the sizes of the regions of the captured image GM21 corresponding to the respective divided regions DR21 to DR29.
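The per-divided-region adjustment can be sketched as follows, assuming the image is divided into a regular grid of regions and the resizing factor simply equals the distance, as in the example above (the text notes this rule is only an example). Stitching the resized blocks back together belongs to the compositing step and is omitted here.

import numpy as np
import cv2

def resize_per_region(image, region_distances, reference_distance=1.0):
    """Sketch of the fig. 10 behaviour: resize each divided region DR of one
    captured image by a factor derived from its own distance D.

    region_distances is a 2D array with one distance per divided region.
    Returns (row, col, resized_block) tuples for later stitching."""
    dist = np.asarray(region_distances, dtype=float)
    rows, cols = dist.shape
    h, w = image.shape[:2]
    bh, bw = h // rows, w // cols
    out = []
    for r in range(rows):
        for c in range(cols):
            block = image[r * bh:(r + 1) * bh, c * bw:(c + 1) * bw]
            scale = dist[r, c] / reference_distance
            size = (max(1, int(bw * scale)), max(1, int(bh * scale)))
            out.append((r, c, cv2.resize(block, size)))
    return out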
In this way, the terminal 80 can derive the distance D between the position of the imaging unit 220 (the camera position) and the imaging target (the three-dimensional model M) over a fine range, one distance for each of the divided regions DR into which the imaging range CR is divided. The terminal 80 can therefore improve the accuracy of the distance D between the camera position and the three-dimensional model M, perform a highly accurate size adjustment based on that distance, and improve the reproducibility of the objects included in the composite image.
For example, when the terrain shape approximated by the three-dimensional model M is complicated and has large differences in elevation, the terminal 80 can derive the distance D between the imaging unit 220 and the three-dimensional model M for each divided region DR into which the imaging range CR is divided. The terminal 80 can then resize the captured image according to the distance D for each divided region DR and synthesize the resized captured images to generate a composite image. When the distance D is derived for each divided region, the terminal 80 can perform the size adjustment more finely than when the distance D is derived for each imaging range.
The relationship between the distance and the resizing amount (here, the magnification) described above is merely an example, and the resizing amount may be determined as appropriate according to the distance.
As shown in fig. 9, resizing each imaging range CR requires a smaller amount of calculation than resizing each divided region DR within the imaging range CR. On the other hand, as shown in fig. 10, resizing each divided region DR allows the terminal 80 to adjust the size more finely than resizing each imaging range CR. In this case, even when the elevation differences of the terrain within one imaging range CR are large and difficult to represent with a single distance, the terminal 80 can perform a size adjustment that follows the changing elevation, and can reduce the variation in scale among the original captured images within the composite image. The terminal 80 can therefore improve the accuracy of the generated composite image.
It is also conceivable to derive the distance between the imaging unit 220 and the three-dimensional model M for each point included in the sparse point group, but in that case many different distances appear and it becomes difficult to determine an appropriate scale. By deriving the distance D for each imaging range CR, i.e., the range of one captured image, or for each divided region DR into which the imaging range CR is divided, the terminal 80 avoids generating a composite image in which the plurality of captured images are simply combined at a uniform size without considering the distance D. In addition, by deriving the distance somewhat coarsely rather than for each point of the sparse point group, the terminal 80 can reduce the amount of computation and shorten the computation time.
Next, the correction of the resizing amount SA in the vicinity of the boundary where the distances D are different will be described.
The terminal control unit 81 generates the composite image SG based on the plurality of size-adjusted captured images GM. In the composite image SG, near a boundary where the distance D corresponding to the imaging range CR or the divided region DR of the underlying captured images GM differs, the resizing amounts SA determined according to the distance D also differ, which can produce discontinuous regions. In this case, the terminal control unit 81 may correct the resizing amount SA in the vicinity of the boundary so that it changes smoothly between the adjacent regions.
The vicinity of the boundary may be, for example, the portion of each imaging range CR within a specific range from the boundary between adjacent imaging ranges CR. When adjacent imaging ranges CR overlap, the specific range may be at least a part of the region where they overlap. The captured images GM corresponding to adjacent imaging ranges CR are adjacent to each other in the composite image SG. Similarly, the vicinity of the boundary may be, for example, the portion of each divided region DR within a certain range from the boundary between adjacent divided regions DR. The portions of the captured image GM corresponding to adjacent divided regions DR are adjacent to each other in the composite image SG.
For example, when the distance D is derived for each imaging range CR and the size adjustment is performed accordingly, the terminal control unit 81 may smoothly change the resizing amount SA in the vicinity of the boundary between adjacent imaging ranges CR. For example, when the distance D is 1 in the imaging range CR31 and 2 in the adjacent imaging range CR32, the terminal control unit 81 may correct the resizing amount SA (for example, the magnification) so that it changes smoothly from 1 to 2 in the vicinity of the boundary.
Fig. 11 is a diagram showing an example of the correction of the resizing amount. In fig. 11, the distance D is 1 in the imaging range CR31 and 2 in the imaging range CR32. Therefore, when the resizing amount SA is not corrected, the resizing amount SA (for example, the magnification) may be 1 in the captured image GM31 corresponding to the imaging range CR31 and 2 in the captured image GM32 corresponding to the imaging range CR32. When the resizing amount SA is corrected, the resizing amount SA outside the vicinity of the boundary may remain 1 in the captured image GM31 and 2 in the captured image GM32. The terminal control unit 81 may then correct the resizing amount SA so that it changes smoothly from 1 to 2 in the vicinity of the boundary, from the captured image GM31 toward the adjacent captured image GM32 in the composite image SG. In this case, the terminal control unit 81 may change the resizing amount SA in the vicinity of the boundary linearly (see the curve g1) or nonlinearly (see the curve g2).
Thus, the terminal 80 can suppress the degradation of image quality of the composite image SG caused by discontinuous regions appearing near the ends (near the boundaries) of captured images GM having different resizing amounts SA. The same applies to the case where the distance D is derived and the size adjustment is performed for each divided region DR.
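The correction near the boundary can be sketched as a blend of the two resizing amounts across a transition band; the band width and the smoothstep curve used for the nonlinear case are illustrative choices, not values taken from the disclosure.

import numpy as np

def blended_scale(x, boundary, band, scale_left, scale_right, nonlinear=False):
    """Resizing amount SA as a function of position x across a boundary.

    Outside the band the amount is constant (e.g. 1 in GM31, 2 in GM32);
    inside it the amount changes smoothly, linearly (curve g1) or with a
    smoothstep-like curve (curve g2)."""
    t = np.clip((x - (boundary - band / 2)) / band, 0.0, 1.0)
    if nonlinear:
        t = t * t * (3 - 2 * t)  # smoothstep: zero slope at both ends
    return scale_left + t * (scale_right - scale_left)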
Next, an example of generating the three-dimensional model M will be described. Here, a terrain model is taken as an example of the generated three-dimensional model M. The terminal control unit 81 may generate the three-dimensional model using a grid gd corresponding to the imaging range CR. The terrain may broadly include, for example, the shape of the imaging target (for example, the ground, a building, or an object) imaged by the imaging unit 220 provided in the flying unmanned aerial vehicle 100.
The grid gd will now be explained. The grid gd may be formed as a mesh pattern and virtually represents the terrain within the imaging range that is the object of the three-dimensional model M. The grid gd may be set over the same range as the imaging range CR or over a range included in the imaging range CR. The grid gd may be lattice-shaped, triangular, another polygon, or another shape, and includes grid points gp as the vertices of the grid gd.
The interval between the grid points gp (the grid interval) may be a predetermined value or may be set arbitrarily by the terminal control unit 81; it may be, for example, 1 m or 2 m. For example, the terminal control unit 81 may set the grid interval via the operation unit 83. In addition, the positions of the sparse point group in the two-dimensional plane (the positions disregarding height in the three-dimensional space) need not coincide with the positions of the grid points gp in the two-dimensional plane (the positions disregarding the grid height).
Fig. 12 is a flowchart showing an example of the process of generating the three-dimensional model M. The generation of the three-dimensional model M shown in fig. 12 is an example, and the three-dimensional model may be generated by another method.
The terminal control unit 81 projects a sparse point group (three-dimensional point group PG3) in the three-dimensional space (XYZ coordinate system) onto the two-dimensional plane (XY plane), and generates a sparse point group (two-dimensional point group PG2) projected onto the two-dimensional plane (S11). The terminal control unit 81 may generate a two-dimensional point group PG2 by setting the height (Z coordinate) of the three-dimensional point group PG3 to a value of 0. Here, the sparse point group may be the sparse point group generated in S3 of fig. 6.
The terminal control unit 81 specifies a plurality of adjacent points included in the two-dimensional point group PG2 (S12). Then, for the specified points, the terminal control unit 81 connects the corresponding points in three-dimensional space, taking into account the heights of the three-dimensional point group PG3 before projection onto the two-dimensional plane, and generates a surface sf (S13). The plurality of points may be specified over all or part of the two-dimensional point group PG2. A plurality of groups, each consisting of points used to generate one surface sf, may be formed, and a surface sf is generated for each group. In this case, the terminal control unit 81 may connect, for 3 adjacent points included in the two-dimensional point group PG2, the corresponding 3 points of the three-dimensional point group PG3 to generate a triangular surface sf; that is, it may perform Delaunay triangulation. The terminal control unit 81 may also generate the surfaces sf by a method other than Delaunay triangulation.
Within the range where each generated surface sf is projected onto the two-dimensional plane, that is, within the range surrounded by the plurality of points specified from the two-dimensional point group PG2, there may be one or more grid points gp. The terminal control unit 81 sets the height of each grid point gp (the grid height) to the position of the surface sf at that grid point. That is, the terminal control unit 81 may set the grid height to the height of the intersection between the surface sf and a straight line passing through the grid point gp along the vertical direction. In this way, the terminal control unit 81 calculates the three-dimensional position of each grid point gp (mesh point) (S14).
The terminal control unit 81 generates the three-dimensional model M based on the three-dimensional positions of the grid points gp (S15). The shape of the three-dimensional model M may be specified by the three-dimensional positions of the grid points gp. Since the three-dimensional position of each grid point gp lies on one of the surfaces sf, the shape of the three-dimensional model M may be a shape obtained by combining the surfaces sf. In this way, the terminal 80 can generate and determine the three-dimensional model M from the three-dimensional positions of the grid points gp of the grid gd.
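A minimal sketch of S11 to S15 in Python, assuming SciPy is available: the Delaunay triangulation in the XY plane plays the role of S12-S13, and linear interpolation over the resulting triangular faces gives the grid heights of S14. The function name and the default grid spacing are illustrative.

import numpy as np
from scipy.spatial import Delaunay
from scipy.interpolate import LinearNDInterpolator

def build_terrain_model(points3d, grid_spacing=1.0):
    """Sketch of S11-S15: project the sparse point group onto the XY plane,
    triangulate there, then sample the resulting faces at the grid points gp
    to obtain the grid heights of the model M.

    points3d: (N, 3) array, the sparse three-dimensional point group PG3.
    Returns grid coordinates and interpolated grid heights (NaN where a grid
    point falls outside the triangulation)."""
    pts = np.asarray(points3d, dtype=float)
    xy, z = pts[:, :2], pts[:, 2]               # S11: drop the height
    tri = Delaunay(xy)                          # S12-S13: faces from 2D adjacency
    face_height = LinearNDInterpolator(tri, z)  # height of each face at (x, y)

    gx = np.arange(xy[:, 0].min(), xy[:, 0].max() + grid_spacing, grid_spacing)
    gy = np.arange(xy[:, 1].min(), xy[:, 1].max() + grid_spacing, grid_spacing)
    gxx, gyy = np.meshgrid(gx, gy)
    grid_heights = face_height(gxx, gyy)        # S14: grid height = face height
    return gxx, gyy, grid_heights               # S15: these define the model M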
In this way, the terminal control unit 81 can project a plurality of three-dimensional points included in a sparse point group (an example of sparse point group data) onto a two-dimensional plane. The terminal control section 81 may designate a plurality of two-dimensional points projected to be adjacent in the two-dimensional plane as one group. The terminal control section 81 can designate a plurality of such groups. The terminal control unit 81 may generate a plurality of surfaces sf (an example of surface data) by connecting a plurality of three-dimensional points included in a sparse point group corresponding to a plurality of designated adjacent two-dimensional points in a group. The terminal control unit 81 may generate the three-dimensional model M based on the plurality of surfaces sf.
Thus, the terminal 80 temporarily projects the three-dimensional point group PG3 onto a two-dimensional plane to generate the two-dimensional point group PG2, and generates the surfaces based on the adjacency relationships of the two-dimensional point group PG2. The terminal 80 can therefore obtain a smooth shape more easily than when the surfaces sf are generated based on the adjacency relationships of the three-dimensional point group PG3. By deriving the shape of the three-dimensional model M from the adjacency relationships of the two-dimensional point group PG2, the terminal 80 can improve the accuracy with which the actual terrain is reproduced.
Fig. 13 is a diagram showing an example of a three-dimensional point group PG3 and a two-dimensional point group PG 2.
The terminal control unit 81 generates the two-dimensional point group PG2 based on the three-dimensional point group PG3 by projecting each point included in the three-dimensional point group PG3 in three-dimensional space (XYZ space) onto the two-dimensional plane (XY plane). The terminal control unit 81 specifies, for example, the adjacent points P21 to P23 (an example of two-dimensional points) included in the two-dimensional point group PG2. The points P31 to P33 (an example of three-dimensional points) of the three-dimensional point group PG3 corresponding to P21 to P23 are the points before P21 to P23 were projected onto the two-dimensional plane, and are the adjacent points used to generate the one surface sf1.
Since the terminal 80 specifies a plurality of points that are adjacent in the two-dimensional plane and connects the corresponding points of the three-dimensional point group PG3 to generate one surface sf, it can suppress intersections or discontinuities between the surfaces sf in three-dimensional space. The terminal 80 can therefore generate a three-dimensional model M whose shape better conforms to the shape of the terrain.
Next, the generation results of the three-dimensional model M of the comparative example and the present embodiment will be described.
In the comparative example, a sparse three-dimensional model is assumed to be generated from the sparse point group data by the screened Poisson surface reconstruction algorithm, which is described in the following non-patent document 1. (Non-patent document 1: Michael Kazhdan, Hugues Hoppe, "Screened Poisson surface reconstruction," ACM Transactions on Graphics (ToG), Volume 32, Issue 3, June 2013, Article No. 29)
Fig. 14 is a diagram showing a sparse three-dimensional model generated by the screened Poisson surface reconstruction algorithm as a comparative example.
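For reference, the comparative example can be sketched as follows, assuming the Open3D library is available; this applies screened Poisson reconstruction directly to the sparse point group and is not the method of the embodiment. The normal-estimation radius and reconstruction depth are illustrative values.

import numpy as np
import open3d as o3d  # assumed library; shown only for the comparative example

def poisson_mesh_from_sparse_points(points3d, depth=8):
    """Comparative example (fig. 14): screened Poisson surface reconstruction
    applied to the sparse point group. It requires per-point normals, which
    are estimated here from neighbouring points and are themselves unreliable
    on a sparse cloud."""
    pcd = o3d.geometry.PointCloud()
    pcd.points = o3d.utility.Vector3dVector(np.asarray(points3d, dtype=float))
    pcd.estimate_normals(
        search_param=o3d.geometry.KDTreeSearchParamHybrid(radius=2.0, max_nn=30))
    mesh, _densities = o3d.geometry.TriangleMesh.create_from_point_cloud_poisson(
        pcd, depth=depth)
    return mesh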
In fig. 14, the landforms G1 and G2 protrude in the horizontal direction and differ from the actual landforms. This is considered to be because, in the comparative example, a vertical line along the vertical direction in three-dimensional space has multiple intersection points with the surface derived from the sparse point group (three-dimensional point group). In addition, because a sparse point group is used in the comparative example, the reproducibility of the three-dimensional model is low, and the accuracy of the connection relationships among the points used to generate the surfaces is reduced.
Fig. 15 is a diagram showing an example of the three-dimensional model M generated in the process of generating the three-dimensional model M according to the present embodiment.
In the present embodiment, the terminal control unit 81 projects the sparse point group (three-dimensional point group PG3) onto a two-dimensional plane and performs, for example, triangulation, so that continuous triangles can be generated in the two-dimensional plane. Therefore, even in three-dimensional space, the terminal 80 can generate triangles that are continuous in the direction along the two-dimensional plane, and can suppress the occurrence of multiple intersection points between a vertical line in three-dimensional space and the surfaces sf derived from the sparse point group. For example, whereas fig. 14 contains the landforms G1 and G2 protruding in the horizontal direction, the corresponding portion in fig. 15 is the smooth landform G3. In this way, projecting the three-dimensional point group PG3 once onto the two-dimensional plane ensures the continuity of the surfaces sf formed in three-dimensional space by connecting the points. The height of each surface sf in three-dimensional space may be based on the heights of the three-dimensional point group before projection onto the two-dimensional plane.
In this way, the terminal 80 (an example of the image generation device) can generate the composite image SG based on the plurality of captured images GM captured by the unmanned aerial vehicle 100 (an example of the flying object). The terminal 80 may include the terminal control unit 81 (an example of a processing unit) that performs the processing related to the generation of the composite image SG. The terminal control unit 81 can acquire the plurality of captured images GM captured by the imaging unit 220 (an example of an imaging device) provided in the unmanned aerial vehicle 100, and can generate the three-dimensional model M based on the plurality of captured images GM. The terminal control unit 81 may acquire at least one of the position and the posture of the imaging unit 220 at the time each of the plurality of captured images GM was captured, and may derive the distance D between each position of the imaging unit 220 and the three-dimensional model M based on the three-dimensional model M and at least one of those positions and postures. The terminal control unit 81 may adjust the sizes of the plurality of captured images GM based on the distances D between the respective positions of the imaging unit 220 and the three-dimensional model M, and may synthesize the plurality of size-adjusted captured images GM to generate the composite image SG. Further, the terminal control unit 81 may generate sparse point group data based on the plurality of captured images GM and generate the three-dimensional model M based on the sparse point group data.
Thus, by adjusting the sizes of the captured images GM and then combining them, the terminal 80 matches the sizes of the objects appearing in the captured images GM and improves the reproducibility of the objects represented by the composite image. In addition, the terminal 80 does not need to sequentially perform all of the processing such as sparse point group generation, dense point group generation, mesh generation, and texture generation in order to generate the composite image. The terminal 80 can therefore reduce the processing load and shorten the processing time for generating the composite image, and can easily generate a composite image even on a tablet terminal or other device whose computational capability is not particularly high. The terminal 80 can also generate an ortho image or the like in addition to the composite image.
The present disclosure has been described above with reference to the embodiments, but the technical scope of the present disclosure is not limited to the scope described in the above embodiments. It will be apparent to those skilled in the art that various changes and modifications can be made in the above embodiments. It should be understood from the description of the claims that the embodiments with such modifications and improvements are also included in the technical scope of the present disclosure.
The execution order of the operations, procedures, steps, and stages in the devices, systems, programs, and methods shown in the claims, the specification, and the drawings may be implemented in any order, as long as it is not specifically indicated by terms such as "before" or "prior to" and as long as the output of a preceding process is not used in a subsequent process. Even if the operational flow in the claims, the specification, and the drawings is described using terms such as "first" and "next" for convenience, this does not necessarily mean that the operations must be performed in this order.
[ description of symbols ]
10 flight system
80 terminal
81 terminal control part
83 operating part
85 communication unit
87 memory
88 display part
89 memory
100 unmanned aircraft
110 UAV control
150 communication interface
160 memory
170 memory
200 universal joint
210 rotating vane mechanism
220, 230 shooting part
240 GPS receiver
250 inertia measuring device
260 magnetic sensor
270 barometer
280 ultrasonic sensor
290 laser measuring device
CR imaging range
DR split area
GM shot image
SG composite image

Claims (18)

  1. An image generating apparatus for generating a composite image based on a plurality of captured images captured by a flying object, the image generating apparatus comprising a processing unit for executing processing relating to generation of the composite image,
    the processing unit acquires a plurality of captured images captured by an imaging device provided in the flying object; generates a three-dimensional model based on the plurality of captured images; acquires each posture of the imaging device when the plurality of captured images are captured, and calculates a distance between each position of the imaging device when the plurality of captured images are captured and the three-dimensional model, based on each posture of the imaging device and the three-dimensional model; adjusts the sizes of the plurality of captured images based on the distances between the respective positions of the imaging device and the three-dimensional model; and synthesizes the plurality of size-adjusted captured images to generate a composite image.
  2. The image generating apparatus according to claim 1, wherein the processing section acquires each position and each posture of the imaging device when the plurality of captured images are captured; and calculates the distances between the respective positions of the imaging device and the three-dimensional model based on the respective positions and postures of the imaging device and the three-dimensional model.
  3. The image generating apparatus according to claim 1 or 2, wherein the processing section calculates a distance between the imaging device and a first portion of the three-dimensional model corresponding to each of the imaging ranges imaged at each position by the imaging device.
  4. The image generating apparatus according to claim 1 or 2, wherein the processing section divides an imaging range captured at each position of the imaging apparatus to produce divided regions of the imaging range; computes a second portion of the three-dimensional model corresponding to each divided region; and calculates, for each of the divided regions, a distance between the imaging device and the second portion of the three-dimensional model corresponding to that divided region.
  5. The image generation apparatus according to any one of claims 1 to 4, wherein the distance is a distance in a vertical direction between each position of the imaging apparatus and the three-dimensional model.
  6. The image generation apparatus according to any one of claims 1 to 4, wherein the distance is a distance between each position of the imaging apparatus and the three-dimensional model in an imaging direction of the imaging apparatus.
  7. The image generation apparatus according to any one of claims 1 to 6, wherein the processing unit generates sparse point group data based on the plurality of captured images, and generates a three-dimensional model based on the sparse point group data.
  8. The image generating apparatus according to claim 7, wherein the processing unit projects a plurality of three-dimensional points included in the sparse point group data onto a two-dimensional plane; designating the projected plurality of two-dimensional points adjacent in the two-dimensional plane as a group, and designating a plurality of the groups; connecting a plurality of three-dimensional points included in the sparse point group data corresponding to the specified adjacent two-dimensional points by the set to generate a plurality of plane data; generating the three-dimensional model based on the plurality of face data.
  9. An image generation method for generating a composite image based on a plurality of captured images captured by a flying object, comprising:
    acquiring a plurality of captured images captured by an imaging device provided in the flying object;
    generating a three-dimensional model based on the plurality of captured images;
    acquiring each posture of the image pickup apparatus when the plurality of picked-up images are picked up;
    calculating distances between respective positions of the image pickup apparatus and the three-dimensional model based on the respective postures of the image pickup apparatus and the three-dimensional model;
    adjusting the size of the plurality of captured images based on the distance between each position of the imaging device and the three-dimensional model; and
    the plurality of captured images after the size adjustment are combined to generate a combined image.
  10. The image generation method of claim 9, wherein the step of acquiring a gesture comprises: acquiring positions and postures of the imaging device when the plurality of captured images are captured;
    the step of calculating the distance comprises: calculating a distance between each position of the image pickup apparatus and the three-dimensional model based on the each position and the each posture of the image pickup apparatus and the three-dimensional model.
  11. The image generation method according to claim 9 or 10, wherein the step of calculating the distance includes the steps of: the distance between the imaging device and the first portion of the three-dimensional model corresponding to the imaging range is calculated for each imaging range imaged by the imaging device at each position.
  12. The image generation method according to claim 9 or 10, wherein the step of calculating the distance includes the steps of:
    dividing an imaging range captured at each position of the imaging device to generate divided regions of the imaging range;
    computing a second portion of the three-dimensional model corresponding to each of the divided regions; and
    for each of the divided regions, a distance between the imaging device and a second portion of the three-dimensional model corresponding to the divided region is calculated.
  13. The image generation method according to any one of claims 9 to 12, wherein the distance is a distance in a vertical direction between each position of the imaging device and the three-dimensional model.
  14. The image generation method according to any one of claims 9 to 12, wherein the distance is a distance between each position of the imaging device and the three-dimensional model in an imaging direction of the imaging device.
  15. The image generation method according to any one of claims 9 to 14, wherein the step of generating the three-dimensional model includes the steps of:
    generating sparse point group data based on the plurality of captured images; and
    generating a three-dimensional model based on the sparse point cloud data.
  16. The image generation method of claim 15, wherein the step of generating the three-dimensional model comprises the steps of:
    projecting a plurality of three-dimensional points included in the sparse point cloud data onto a two-dimensional plane;
    designating the projected plurality of two-dimensional points adjacent in the two-dimensional plane as a group, and designating a plurality of the groups;
    connecting a plurality of three-dimensional points included in the sparse point group data corresponding to the specified adjacent two-dimensional points by the set to generate a plurality of plane data; and
    generating the three-dimensional model based on the plurality of face data.
  17. A program for causing an image generation device, which generates a composite image based on a plurality of captured images captured by a flying object, to execute:
    acquiring a plurality of captured images captured by an imaging device included in the flying object;
    generating a three-dimensional model based on the plurality of captured images;
    acquiring each posture of the image pickup apparatus when the plurality of picked-up images are picked up;
    calculating distances between respective positions of the image pickup apparatus and the three-dimensional model when the plurality of picked-up images are picked up, based on the respective postures of the image pickup apparatus and the three-dimensional model;
    adjusting the size of the plurality of captured images based on the distance between each position of the imaging device and the three-dimensional model; and
    the plurality of captured images after the size adjustment are combined to generate a combined image.
  18. A computer-readable recording medium having recorded thereon a program for causing an image generation device that generates a composite image based on a plurality of captured images captured by a flying object to execute:
    acquiring a plurality of captured images captured by an imaging device included in the flying object;
    generating a three-dimensional model based on the plurality of captured images;
    acquiring each posture of the image pickup apparatus when the plurality of picked-up images are picked up;
    calculating distances between respective positions of the image pickup apparatus and the three-dimensional model when the plurality of picked-up images are picked up, based on the respective postures of the image pickup apparatus and the three-dimensional model;
    adjusting the size of the plurality of captured images based on the distance between each position of the imaging device and the three-dimensional model; and
    the plurality of captured images after the size adjustment are combined to generate a combined image.
CN201980009014.0A 2018-11-30 2019-11-12 Image generation device, image generation method, program, and recording medium Pending CN111656760A (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
JP2018-225741 2018-11-30
JP2018225741A JP2020088821A (en) 2018-11-30 2018-11-30 Image generation device, image generation method, program, and recording medium
PCT/CN2019/117466 WO2020108290A1 (en) 2018-11-30 2019-11-12 Image generating device, image generating method, program and recording medium

Publications (1)

Publication Number Publication Date
CN111656760A true CN111656760A (en) 2020-09-11

Family

ID=70852385

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201980009014.0A Pending CN111656760A (en) 2018-11-30 2019-11-12 Image generation device, image generation method, program, and recording medium

Country Status (3)

Country Link
JP (1) JP2020088821A (en)
CN (1) CN111656760A (en)
WO (1) WO2020108290A1 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1788188A (en) * 2003-06-20 2006-06-14 三菱电机株式会社 Picked-up image display method
CN108628337A (en) * 2017-03-21 2018-10-09 株式会社东芝 Coordinates measurement device, contouring system and path generating method

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2001014492A (en) * 1999-06-29 2001-01-19 Sony Corp Method and device for generating triangle meshes
JP4970296B2 (en) * 2008-01-21 2012-07-04 株式会社パスコ Orthophoto image generation method and photographing apparatus
US9560321B2 (en) * 2011-01-11 2017-01-31 Panasonic Intellectual Property Management Co., Ltd. Image capturing system, camera control device for use therein, image capturing method, camera control method, and computer program
US9185289B2 (en) * 2013-06-10 2015-11-10 International Business Machines Corporation Generating a composite field of view using a plurality of oblique panoramic images of a geographic area
JP6395423B2 (en) * 2014-04-04 2018-09-26 キヤノン株式会社 Image processing apparatus, control method, and program
CN106464843B (en) * 2014-09-05 2019-05-14 堺显示器制品株式会社 Video generation device, image generating method and computer readable storage medium
JP6831003B2 (en) * 2017-03-16 2021-02-17 富士フイルム株式会社 Image synthesizer, image synthesizer and program
JP7251474B2 (en) * 2017-04-28 2023-04-04 ソニーグループ株式会社 Information processing device, information processing method, information processing program, image processing device, and image processing system
JP6803800B2 (en) * 2017-05-19 2020-12-23 エスゼット ディージェイアイ テクノロジー カンパニー リミテッドSz Dji Technology Co.,Ltd Information processing device, aerial photography route generation method, aerial photography route generation system, program, and recording medium

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1788188A (en) * 2003-06-20 2006-06-14 三菱电机株式会社 Picked-up image display method
CN108628337A (en) * 2017-03-21 2018-10-09 株式会社东芝 Coordinates measurement device, contouring system and path generating method

Also Published As

Publication number Publication date
JP2020088821A (en) 2020-06-04
WO2020108290A1 (en) 2020-06-04

Similar Documents

Publication Publication Date Title
KR102001728B1 (en) Method and system for acquiring three dimentional position coordinates in non-control points using stereo camera drone
JP6962775B2 (en) Information processing equipment, aerial photography route generation method, program, and recording medium
JP6675537B1 (en) Flight path generation device, flight path generation method and program, and structure inspection method
WO2018193574A1 (en) Flight path generation method, information processing device, flight path generation system, program and recording medium
JP6878194B2 (en) Mobile platforms, information output methods, programs, and recording media
CN112146629A (en) Multi-angle close-up photography track and attitude planning method
CN111247389B (en) Data processing method and device for shooting equipment and image processing equipment
JP2019115012A (en) Information processing device, flight control instruction method, program, and recording medium
US20200217665A1 (en) Mobile platform, image capture path generation method, program, and recording medium
WO2020052549A1 (en) Information processing apparatus, flight path generation method, and program and recording medium
JP2019028560A (en) Mobile platform, image composition method, program and recording medium
US20210229810A1 (en) Information processing device, flight control method, and flight control system
CN113496503A (en) Point cloud data generation and real-time display method, device, equipment and medium
US20210185235A1 (en) Information processing device, imaging control method, program and recording medium
CN111699454A (en) Flight planning method and related equipment
CN115357052A (en) Method and system for automatically exploring interest points in video picture by unmanned aerial vehicle
CN111656760A (en) Image generation device, image generation method, program, and recording medium
WO2020119572A1 (en) Shape inferring device, shape inferring method, program, and recording medium
WO2021035749A1 (en) Method and device for optimizing three-dimensional reconstruction model, and movable platform
CN112313942A (en) Control device for image processing and frame body control
CN111226093A (en) Information processing device, flight path generation method, program, and recording medium
KR102520189B1 (en) Method and system for generating high-definition map based on aerial images captured from unmanned air vehicle or aircraft
JP6974290B2 (en) Position estimation device, position estimation method, program, and recording medium
Ruzgienė Analysis of camera orientation variation in airborne photogrammetry: images under tilt (roll‑pitch‑yaw) angles
CN116030204A (en) Automatic driving map generation method, device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20200911

WD01 Invention patent application deemed withdrawn after publication