KR102431333B1

KR102431333B1 - Method of autonomous navigation of multiple ships, apparatus for autonomous navigation of multiple ships, and computer program for the method

Info

Publication number: KR102431333B1
Application number: KR1020210186582A
Authority: KR
Inventors: 조경은; 김준오
Original assignee: 국방과학연구소; 동국대학교 산학협력단
Priority date: 2021-12-23
Filing date: 2021-12-23
Publication date: 2022-08-12

Abstract

The present invention relates to a multi-vessel autonomous navigation method, a multi-vessel autonomous navigation apparatus, and a computer program stored in a recording medium to execute the method, which apply artificial intelligence to ship control by controlling a ship through a layered structure based on reinforcement learning. The multi-vessel autonomous navigation method, performed by a computing device, comprises: an autonomous mission step of establishing a navigation plan for the ship by analyzing a preset space according to the ship's mission; an autonomous navigation step of navigating the ship by setting a route that avoids obstacles in the space for which the navigation plan is established; an autonomous control step of controlling a target value of the ship, determined in the autonomous navigation step, based on the environment of the space and the characteristics of the ship; and a step in which the autonomous mission step, the autonomous navigation step, and the autonomous control step are linked to one another and trained by reinforcement learning based on the mission, the navigation plan, the space, the route, and the target value of the ship.

Description

Method of autonomous navigation of multiple ships, apparatus for autonomous navigation of multiple ships, and computer program for the method

Embodiments of the present invention relate to a multi-vessel autonomous navigation method, a multi-vessel autonomous navigation apparatus, and a computer program stored in a recording medium for executing the method. More particularly, they relate to a method, apparatus, and computer program that, in applying artificial intelligence to ship control, can control a ship through a layered structure based on reinforcement learning.

Previously, reinforcement learning algorithms have been applied to only some of the problems involved in operating autonomous ships at sea. For example, autonomous navigation to a destination and ship control for following a designated route have been studied separately. When an autonomous ship is operated at sea to carry out a mission, however, these problems must be considered together; solving only some of them is not enough to operate the ship. Earlier work addressed only part of the problem because, with reinforcement learning algorithms, a large variety of data enlarges the state space, which can make learning difficult or prevent the algorithm from converging. In particular, when the action space of the decision-making grows, correct operation of the autonomous ship is hard to expect. Moreover, reinforcement learning algorithms applied to autonomous ships take a long time to train and are difficult to train.

The present invention is intended to solve various problems, including those described above. Its object is to provide a multi-vessel autonomous navigation method, a multi-vessel autonomous navigation apparatus, and a computer program stored in a recording medium for executing the method, which, in applying artificial intelligence to ship control, can control a ship through a layered structure based on reinforcement learning. This object is exemplary, however, and does not limit the scope of the present invention.

According to one aspect of the present invention, there is provided a multi-vessel autonomous navigation method performed by a computing device, the method including: an autonomous mission step of establishing a navigation plan for a ship by analyzing a preset space according to the ship's mission; an autonomous navigation step of navigating the ship by setting a route that avoids obstacles in the space for which the navigation plan is established; an autonomous control step of controlling a target value of the ship, determined in the autonomous navigation step, based on the environment of the space and the characteristics of the ship; and a step in which the autonomous mission step, the autonomous navigation step, and the autonomous control step are linked to one another and trained by reinforcement learning based on the mission, the navigation plan, the space, the route, and the target value of the ship.

The autonomous mission step, the autonomous navigation step, and the autonomous control step may form three layers, in that order.

The reinforcement learning step may be a step in which the autonomous mission step, the autonomous navigation step, and the autonomous control step are trained by mutual supervised learning, in that order.

The reinforcement learning step may include: determining a current value of the ship, including the ship's direction and speed, according to the output value produced by the autonomous control step; comparing the output value with the target value and passing a reward to the autonomous navigation step; and comparing the target value with the current value and passing a reward to the autonomous mission step.

The reinforcement learning step may be a step in which the autonomous mission step, the autonomous navigation step, and the autonomous control step are each trained separately using the target value.

The multi-vessel autonomous navigation method according to an embodiment of the present invention may further include a step in which the autonomous mission step, the autonomous navigation step, and the autonomous control step, once trained by reinforcement learning, are linked to one another in real time to control the ship.

According to one aspect of the present invention, there is provided a computer program stored in a recording medium for executing the above-described method using a computer.

According to one aspect of the present invention, there is provided a multi-vessel autonomous navigation apparatus including a processor, wherein the processor: performs an autonomous mission step of establishing a navigation plan for a ship by analyzing a preset space according to the ship's mission; performs an autonomous navigation step of navigating the ship by setting a route that avoids obstacles in the space for which the navigation plan is established; performs an autonomous control step of controlling a target value of the ship, determined in the autonomous navigation step, based on the environment of the space and the characteristics of the ship; and links the autonomous mission step, the autonomous navigation step, and the autonomous control step to one another and trains them by reinforcement learning based on the mission, the navigation plan, the space, the route, and the target value of the ship.

The processor may train the autonomous mission step, the autonomous navigation step, and the autonomous control step by mutual supervised learning, in that order.

The processor may determine a current value of the ship, including the ship's direction and speed, according to the output value produced by the autonomous control step, compare the output value with the target value and pass a reward to the autonomous navigation step, and compare the target value with the current value and pass a reward to the autonomous mission step.

The processor may train the autonomous mission step, the autonomous navigation step, and the autonomous control step separately, using the target value.

The processor may control the ship by linking the autonomous mission step, the autonomous navigation step, and the autonomous control step, once trained by reinforcement learning, to one another in real time.

Aspects, features, and advantages other than those described above will become apparent from the following detailed description, claims, and drawings.

According to an embodiment of the present invention configured as described above, it is possible to implement a multi-vessel autonomous navigation method, a multi-vessel autonomous navigation apparatus, and a computer program stored in a recording medium for executing the method, which, in applying artificial intelligence to ship control, can control a ship through a layered structure based on reinforcement learning. Of course, the scope of the present invention is not limited by this effect.

FIG. 1 is a diagram for explaining the configuration and operation of a multi-vessel autonomous navigation apparatus according to an embodiment of the present invention.
FIG. 2 is a diagram for explaining the processor configuration of a multi-vessel autonomous navigation apparatus according to an embodiment of the present invention.
FIG. 3 is a flowchart illustrating a multi-vessel autonomous navigation method according to an embodiment of the present invention.
FIG. 4 is a diagram for explaining the learning process of a multi-vessel autonomous navigation method according to an embodiment of the present invention.
FIG. 5 is a diagram for explaining the execution process of a multi-vessel autonomous navigation method according to an embodiment of the present invention.
FIGS. 6 and 7 are diagrams for explaining a multi-vessel autonomous navigation method according to an embodiment of the present invention.
FIGS. 8 and 9 are diagrams for explaining a multi-vessel autonomous navigation method according to another embodiment of the present invention.

The present invention may be modified in various ways and may have various embodiments; specific embodiments are illustrated in the drawings and described in detail in the detailed description. The effects and features of the present invention, and the methods of achieving them, will become apparent from the embodiments described in detail below together with the drawings. The present invention is not limited to the embodiments disclosed below, however, and may be implemented in various forms.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings. In the description, identical or corresponding components are given the same reference numerals, and redundant descriptions of them are omitted.

In the following embodiments, terms such as "first" and "second" are not used in a limiting sense but to distinguish one component from another. Singular expressions include plural expressions unless the context clearly indicates otherwise. Terms such as "include" or "have" mean that the features or components described in the specification are present, and do not exclude the possibility that one or more other features or components may be added.

In the drawings, the size of components may be exaggerated or reduced for convenience of description. For example, the size and thickness of each component shown in the drawings are chosen arbitrarily for convenience of description, so the present invention is not necessarily limited to what is illustrated.

In the following embodiments, when a part such as a region, component, unit, block, or module is said to be over or on another part, this includes not only the case where it is directly on the other part but also the case where another region, component, unit, block, or module is interposed between them. Likewise, when regions, components, units, blocks, or modules are said to be connected, this includes not only the case where they are directly connected but also the case where they are indirectly connected with another region, component, unit, block, or module interposed between them.

Hereinafter, various embodiments of the present invention are described in detail with reference to the accompanying drawings so that those of ordinary skill in the art to which the present invention pertains can easily practice it.

FIG. 1 is a diagram for explaining the configuration and operation of a multi-vessel autonomous navigation apparatus according to an embodiment of the present invention, and FIG. 2 is a diagram for explaining the processor configuration of the multi-vessel autonomous navigation apparatus according to an embodiment of the present invention.

First, referring to FIG. 1, the multi-vessel autonomous navigation apparatus 100 according to an embodiment of the present invention may include a memory 110, a processor 120, a communication module 130, and an input/output interface 140.

The memory 110 is a computer-readable recording medium and may include random access memory (RAM), read-only memory (ROM), and a permanent mass storage device such as a disk drive. The memory 110 may also temporarily or permanently store program code for controlling the multi-vessel autonomous navigation apparatus 100 and a pre-trained artificial intelligence algorithm or reinforcement learning algorithm.

The processor 120 may perform an autonomous mission step of establishing a navigation plan for a ship by analyzing a preset space according to the ship's mission; perform an autonomous navigation step of navigating the ship by setting a route that avoids obstacles in the space for which the navigation plan is established; perform an autonomous control step of controlling a target value of the ship, determined in the autonomous navigation step, based on the environment of the space and the characteristics of the ship; and link the autonomous mission step, the autonomous navigation step, and the autonomous control step to one another and train them by reinforcement learning based on the ship's mission, navigation plan, space, route, and target value.

The communication module 130 may provide a function for communicating with an external server or an external device through a network. For example, a request generated by the processor 120 of the multi-vessel autonomous navigation apparatus 100 according to program code stored in a recording device such as the memory 110 may be transmitted to an external server through the network under the control of the communication module 130. Conversely, control signals, commands, content, files, and the like provided under the control of the external server's processor may be received by the multi-vessel autonomous navigation apparatus 100 through the network and the communication module 130. For example, control signals or commands of the external server received through the communication module 130 may be passed to the processor 120 or the memory 110, and content or files may be stored in a storage medium that the multi-vessel autonomous navigation apparatus 100 may further include.

The communication method is not limited and may include not only communication over networks that the network may comprise (for example, a mobile communication network, wired Internet, wireless Internet, or broadcasting network) but also short-range wireless communication between devices. For example, the network may include any one or more of a personal area network (PAN), a local area network (LAN), a campus area network (CAN), a metropolitan area network (MAN), a wide area network (WAN), a broadband network (BBN), the Internet, and the like. The network may also include any one or more network topologies, including but not limited to a bus network, a star network, a ring network, a mesh network, a star-bus network, and a tree or hierarchical network.

The communication module 130 may also communicate with an external server through a network. The communication method is not limited, but the network may be a short-range wireless network. For example, the network may be a Bluetooth, Bluetooth Low Energy (BLE), or Wi-Fi network.

The input/output interface 140 may be a means of interfacing with an input/output device. For example, the input device may include a device such as a keyboard or mouse, and the output device may include a device such as a display for showing an application's communication session. As another example, the input/output interface 140 may be a means of interfacing with a device in which input and output functions are integrated, such as a touch screen. As a more specific example, when the processor 120 of the multi-vessel autonomous navigation apparatus 100 processes instructions of a computer program loaded into the memory 110, a service screen or content composed using data provided by an external server may be displayed on the display through the input/output interface 140.

In other embodiments, the multi-vessel autonomous navigation apparatus 100 may include more components than those shown in FIG. 1. For example, it may be implemented to include at least some of the input/output devices described above, or may further include other components such as a battery and charging device that supply power to internal components, various sensors, and a database.

The internal configuration of the processor 120 of the multi-vessel autonomous navigation apparatus 100 according to an embodiment of the present invention is described in detail below with reference to FIG. 2. For ease of understanding, the processor 120 described below is assumed to be the processor 120 of the multi-vessel autonomous navigation apparatus 100 shown in FIG. 1.

The processor 120 of the multi-vessel autonomous navigation apparatus 100 according to an embodiment of the present invention includes an autonomous mission module 121, an autonomous navigation module 122, and an autonomous control module 123. Each of these modules carries a reinforcement learning algorithm and can perform reinforcement learning. In some embodiments, components of the processor 120 may be selectively included in or excluded from the processor 120. In some embodiments, the components of the processor 120 may also be separated or merged to express the functions of the processor 120.
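The three-module arrangement can be pictured with a minimal Python sketch. It only illustrates how the output of each layer feeds the next; the class name `HierarchicalProcessor`, the type aliases, and the `demo_*` stand-in functions are hypothetical and do not come from the patent, and the learned policies that each module would carry are replaced by trivial placeholders.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Waypoint = Tuple[float, float]   # (x, y) in an arbitrary planar frame (assumption)
Target = Tuple[float, float]     # (direction in degrees, speed) (assumption)


@dataclass
class HierarchicalProcessor:
    """Hypothetical container mirroring modules 121, 122, and 123."""
    mission: Callable[[str], List[Waypoint]]             # autonomous mission module 121
    navigation: Callable[[Waypoint, Waypoint], Target]   # autonomous navigation module 122
    control: Callable[[Target, Target], float]           # autonomous control module 123

    def step(self, task: str, position: Waypoint, current: Target) -> float:
        plan = self.mission(task)                     # layer 1: mission -> waypoints
        target = self.navigation(position, plan[0])   # layer 2: waypoint -> target value
        return self.control(target, current)          # layer 3: target value -> command


# Trivial placeholders standing in for learned policies.
def demo_mission(task: str) -> List[Waypoint]:
    return [(10.0, 0.0)]

def demo_navigation(position: Waypoint, waypoint: Waypoint) -> Target:
    return (0.0, 1.0)

def demo_control(target: Target, current: Target) -> float:
    return target[0] - current[0]   # direction error used directly as the command


processor = HierarchicalProcessor(demo_mission, demo_navigation, demo_control)
command = processor.step("search", (0.0, 0.0), (10.0, 1.0))
```

Keeping each layer behind a plain callable makes it possible to swap a placeholder for a trained policy without touching the other two layers.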

The processor 120 and its components may control the multi-vessel autonomous navigation apparatus 100 to perform steps S110 to S140 of the multi-vessel autonomous navigation method of FIG. 3. For example, the processor 120 and its components may be implemented to execute instructions according to the code of an operating system and the code of at least one program contained in the memory 110. Here, the components of the processor 120 may be expressions of different functions of the processor 120 that the processor 120 performs according to instructions provided by program code stored in the multi-vessel autonomous navigation apparatus 100. The internal configuration and specific operation of the processor 120 are described with reference to the flowchart of the multi-vessel autonomous navigation method of FIG. 3.

FIG. 3 is a flowchart illustrating a multi-vessel autonomous navigation method according to an embodiment of the present invention.

In step S110, the processor 120 may perform the autonomous mission step of establishing a navigation plan for the ship by analyzing a preset space according to the ship's mission. For example, when the ship's mission is a search mission, the processor may, in the autonomous mission step, establish a plan for the maritime area covering the search area, the destination, and the mission to be carried out at the destination. For example, the navigation plan may include a navigation plan for a destination or an intermediate destination.

For example, the autonomous mission step may make the decisions needed for the autonomous ship to carry out its mission at sea. For example, when the mission of the autonomous ship is a marine environment survey, a plan is determined for how to move through the space and how to collect samples, and the plan may be established according to the given space. The plan established in the autonomous mission step may then be passed to the autonomous navigation step.
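As a rough illustration of what a mission-layer plan for a search mission might look like, the sketch below lays a back-and-forth sweep of waypoints over a rectangular search area. This is a hand-written stand-in, not the learned policy the patent describes; the function name `sweep_plan`, the rectangular area, and the `lane_spacing` parameter are all assumptions made for the example.

```python
from typing import List, Tuple

Waypoint = Tuple[float, float]   # (x, y) in an arbitrary planar frame (assumption)


def sweep_plan(x_min: float, y_min: float, x_max: float, y_max: float,
               lane_spacing: float) -> List[Waypoint]:
    """Return waypoints that sweep a rectangle lane by lane, alternating direction."""
    if lane_spacing <= 0:
        raise ValueError("lane_spacing must be positive")
    # Number of lanes that fit between y_min and y_max, inclusive of both edges.
    lanes = int((y_max - y_min) // lane_spacing) + 1
    waypoints: List[Waypoint] = []
    for i in range(lanes):
        y = y_min + i * lane_spacing
        # Even lanes run left to right, odd lanes right to left.
        start_x, end_x = (x_min, x_max) if i % 2 == 0 else (x_max, x_min)
        waypoints.append((start_x, y))
        waypoints.append((end_x, y))
    return waypoints


plan = sweep_plan(0.0, 0.0, 100.0, 20.0, 10.0)
```

Each waypoint in `plan` plays the role of an intermediate destination that would be handed to the autonomous navigation step.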

In step S120, the processor 120 may perform the autonomous navigation step of navigating the ship by setting a route that avoids obstacles in the space for which the navigation plan has been established. For example, in the autonomous navigation step the processor may navigate the ship to a position (e.g., a waypoint) given by the autonomous mission step. For example, the autonomous navigation step may avoid dynamic and static obstacles and determine the ship's direction and speed for moving through the given space over time.

For example, in the autonomous navigation step, the ship may navigate autonomously to the target point received from the autonomous mission step. For example, once the ship has moved to the target point, the autonomous navigation step may report completion of the movement, as determined from a position sensor such as GPS, back to the autonomous mission step and receive a reward from the autonomous mission step. In addition, the direction and speed for moving the ship may be passed to the autonomous control step. For example, the autonomous navigation step may collect natural-environment data (waves, currents, wind, etc.) and ship-state data (speed, direction, ship inclination, etc.) obtained through sensors and pass them to the autonomous control step.
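The navigation layer's two outputs, a direction-and-speed target for the control layer and an arrival report for the mission layer, can be sketched as follows. The geometry is a simple straight-line bearing with no obstacle avoidance, so it is only a placeholder for the learned navigation policy; `navigation_target`, `cruise_speed`, and `arrival_radius` are hypothetical names and values.

```python
import math
from typing import Tuple

Waypoint = Tuple[float, float]   # (x, y) in a planar frame, +y = north (assumption)


def navigation_target(position: Waypoint, waypoint: Waypoint,
                      cruise_speed: float = 5.0,
                      arrival_radius: float = 10.0) -> Tuple[float, float, bool]:
    """Return (direction in degrees, speed, arrived) for a straight run to the waypoint."""
    dx = waypoint[0] - position[0]
    dy = waypoint[1] - position[1]
    distance = math.hypot(dx, dy)
    arrived = distance <= arrival_radius          # would be reported to the mission layer
    # Compass-style bearing: 0 degrees = +y (north), 90 degrees = +x (east).
    direction = math.degrees(math.atan2(dx, dy)) % 360.0
    speed = 0.0 if arrived else cruise_speed      # stop once inside the arrival radius
    return direction, speed, arrived


direction, speed, arrived = navigation_target((0.0, 0.0), (100.0, 0.0))
```

In this sketch the `arrived` flag stands in for the completion report that the text says is returned to the autonomous mission step, while `direction` and `speed` form the target value passed down to the autonomous control step.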

In step S130, the processor 120 may perform the autonomous control step of controlling the target value of the ship, determined in the autonomous navigation step, based on the environment of the space and the characteristics of the ship. For example, in the autonomous control step the processor may output a control command that pursues the control targets (direction, speed) decided in the autonomous navigation step, according to the given environment (natural environment: waves, currents, wind, etc.) and the ship's characteristics (weight, length, width, etc.). For example, the control command may differ depending on the type of ship, and the control may include mechanical control.

For example, in the autonomous control step, the direction and speed required for movement may be input from the autonomous navigation step as target values. In addition, in the autonomous control step, the processor may output an action according to the vessel's control method based on the natural environment data and the vessel state data.

In step S140, the processor 120 may perform a reinforcement learning step in which the autonomous mission step, the autonomous navigation step, and the autonomous control step are linked to one another based on the vessel's mission, navigation plan, space, route, and target value. For example, in the processor according to an embodiment of the present invention, the action output in the autonomous control step determines the movement and direction of the vessel, and the result may be compared with the given target value to deliver a reward to the autonomous navigation step. In addition, in the autonomous navigation step, the processor may compare the target value with the current value and deliver a reward to the autonomous mission step. As a result, the three layers of the autonomous mission step, the autonomous navigation step, and the autonomous control step are linked to carry out the overall operation of the autonomous vessel. For example, the processor may train the three layers by reinforcement learning, linking them to one another based on the target value and the reward values of the reinforcement learning step. In addition, each step may, through its trained reinforcement learning, pass a result that satisfies its own task down to the step below it.

The autonomous mission step, the autonomous navigation step, and the autonomous control step according to an embodiment of the present invention may form three layers in that order. For example, the three layers may be arranged in the order of the autonomous mission step, the autonomous navigation step, and the autonomous control step, from the autonomous mission step as the lower layer to the autonomous control step as the upper layer.

For example, the autonomous mission step may be performed in an autonomous mission module of the processor. The autonomous navigation step may be performed in an autonomous navigation module of the processor. The autonomous control step may be performed in an autonomous control module of the processor. For example, the autonomous mission module, the autonomous navigation module, and the autonomous control module form three layers, and the three layers may be linked to carry out the overall operation of the autonomous vessel.

In an embodiment of the present invention, the reinforcement learning step may be a step in which mutual supervised learning is performed in the order of the autonomous mission step, the autonomous navigation step, and the autonomous control step. For example, the reinforcement learning step according to an embodiment of the present invention may be a step of learning by means of a pre-trained reinforcement learning algorithm.

The reinforcement learning algorithm according to the present invention may be trained in advance. For example, the reinforcement learning algorithm according to the present invention may be Q-Learning, SARSA, or DQN (Deep Q-Network). However, the reinforcement learning algorithm of the present invention is not limited to these examples and may be implemented with various types of artificial intelligence algorithms.
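For readers unfamiliar with the two tabular algorithms named above, the following Python sketch shows the standard one-step update rules for Q-Learning and SARSA. It is only an illustration of the textbook algorithms; the state and action labels, the table layout, and the default values of `alpha` and `gamma` are assumptions and are not taken from the patent.

```python
from collections import defaultdict


def q_learning_update(q, state, action, reward, next_state, actions,
                      alpha=0.1, gamma=0.9):
    """Off-policy update: bootstrap from the best action in the next state."""
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


def sarsa_update(q, state, action, reward, next_state, next_action,
                 alpha=0.1, gamma=0.9):
    """On-policy update: bootstrap from the action actually taken next."""
    td_target = reward + gamma * q[(next_state, next_action)]
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Hypothetical usage: an empty table whose unseen entries default to 0.0.
q_table = defaultdict(float)
actions = ["port", "starboard"]
q_learning_update(q_table, "s0", "port", 1.0, "s1", actions, alpha=0.5)
```

DQN replaces the table with a neural network that approximates the same action values; it is omitted here to keep the sketch dependency-free.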

The processor according to an embodiment of the present invention may determine the current value of the vessel, including its direction and speed, according to the output value produced in the autonomous control step. The processor may also compare the output value with the target value and deliver a reward to the autonomous navigation step. The processor may further compare the target value with the current value and deliver a reward to the autonomous mission step.
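The two comparisons described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the patent does not specify a reward formula, so the `Motion` type, the negative-error reward shape, and the function names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Motion:
    heading_deg: float  # direction of the vessel in degrees
    speed: float        # speed in arbitrary units


def heading_error(a_deg: float, b_deg: float) -> float:
    """Smallest absolute angular difference in degrees, in [0, 180]."""
    return abs((a_deg - b_deg + 180.0) % 360.0 - 180.0)


def motion_reward(target: Motion, observed: Motion) -> float:
    """Assumed reward shape: 0 for a perfect match, more negative as error grows."""
    return -(heading_error(target.heading_deg, observed.heading_deg) / 180.0
             + abs(target.speed - observed.speed))


def layer_rewards(target: Motion, control_output: Motion, current: Motion):
    """Return (reward for the navigation step, reward for the mission step)."""
    navigation_reward = motion_reward(target, control_output)  # output vs. target
    mission_reward = motion_reward(target, current)            # target vs. current
    return navigation_reward, mission_reward
```

Using one reward function for both comparisons is a simplification chosen for brevity; each layer could equally use its own measure.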

In addition, the autonomous mission step and the autonomous navigation step according to an embodiment of the present invention may be steps that learn based on the reward received. The autonomous mission step and the autonomous navigation step according to an embodiment of the present invention may also be steps that act based on the reward received.

The reinforcement learning step according to an embodiment of the present invention may be a step in which the autonomous mission step, the autonomous navigation step, and the autonomous control step are each trained separately using the target value. In addition, the processor according to an embodiment of the present invention may control the vessel by linking the trained autonomous mission step, autonomous navigation step, and autonomous control step with one another in real time.
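The real-time linkage of separately trained layers can be pictured as a single control tick in which each layer's output becomes the next layer's input. The Python sketch below is an assumption-laden illustration: the three stand-in policies are hand-written placeholders for trained models, and the names, the heading convention (angle measured from the x-axis), and the command fields are hypothetical.

```python
import math
from typing import Callable, Dict, Tuple

Point = Tuple[float, float]
Target = Tuple[float, float]  # (heading in degrees from the x-axis, speed)


def run_tick(mission_policy: Callable[[Point], Point],
             navigation_policy: Callable[[Point, Point], Target],
             control_policy: Callable[[Target, Dict[str, float]], Dict[str, float]],
             position: Point,
             environment: Dict[str, float]) -> Dict[str, float]:
    """One pass through the layers: mission -> navigation -> control."""
    waypoint = mission_policy(position)             # where to go
    target = navigation_policy(position, waypoint)  # direction and speed
    return control_policy(target, environment)      # vessel-level command


# Stand-in policies (placeholders for separately trained models).
def fixed_goal(position: Point) -> Point:
    return (100.0, 0.0)


def head_to(position: Point, waypoint: Point) -> Target:
    dx, dy = waypoint[0] - position[0], waypoint[1] - position[1]
    return (math.degrees(math.atan2(dy, dx)) % 360.0, 5.0)


def simple_control(target: Target, environment: Dict[str, float]) -> Dict[str, float]:
    heading, speed = target
    # Add thrust to offset an opposing current, if one is reported.
    return {"heading_deg": heading,
            "throttle": speed + environment.get("head_current", 0.0)}


command = run_tick(fixed_goal, head_to, simple_control,
                   (0.0, 0.0), {"head_current": 1.0})
```

Because `run_tick` depends only on the call signatures, any one layer can be retrained or swapped without touching the other two.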

FIG. 4 is a diagram illustrating the learning process of a multi-vessel autonomous navigation method according to an embodiment of the present invention, and FIG. 5 is a diagram illustrating the execution process of a multi-vessel autonomous navigation method according to an embodiment of the present invention.

Referring to FIGS. 4 and 5, a method for training and executing the layered algorithm according to an embodiment of the present invention is schematically illustrated.

The three layered algorithms according to the present invention may be trained by a reinforcement learning algorithm.

In an embodiment of the present invention, the current value of the vessel, including its direction and speed, may be determined according to the output value produced in the autonomous control step (450, 550). The processor may compare the output value with the target value and deliver a reward to the autonomous navigation step (430, 530). The processor may also compare the target value with the current value and deliver a reward to the autonomous mission step (410, 510).

In the layered algorithm according to an embodiment of the present invention, mutual supervised learning may proceed in the order autonomous mission (410) -> autonomous navigation (430) -> autonomous control (450). For example, as shown in FIG. 4, the autonomous navigation (430) may receive a reward according to the result of the autonomous control (450). In addition, the autonomous mission (410) may receive a reward according to the results of the autonomous control (450) and the autonomous navigation (430).

In the learning according to an embodiment of the present invention, results can be evaluated by passing a given target value to each layer. In this way the autonomous mission (410), the autonomous navigation (430), and the autonomous control (450) are trained separately, while each layer evaluates the results, which can improve the clarity of learning and reduce training time. In addition, the learning according to the present invention may be divided, according to the scope of the state and the order in time, from the autonomous mission (410), which is furthest removed from the vessel, down to the autonomous control (450), which handles detailed control of the vessel.

According to the present invention, the layered algorithm can be optimized and applied according to the type of vessel required to perform the autonomous vessel's mission. Because the control method of a vessel differs with its size, weight, power type, and steering method, separating autonomous navigation from autonomous control and linking global decision-making with local decision-making allows the same algorithm to be applied to different types of vessels.
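The separation described above can be illustrated with a shared navigation target that is translated into commands by vessel-specific controllers. The Python sketch below is hypothetical: the two vessel types, their parameters, the proportional control rule, and the sign convention for heading error are all assumptions made for illustration, not details from the patent.

```python
from dataclasses import dataclass
from typing import Dict, Protocol


def signed_heading_error(target_deg: float, current_deg: float) -> float:
    """Signed shortest angular difference in degrees, in [-180, 180)."""
    return (target_deg - current_deg + 180.0) % 360.0 - 180.0


class VesselController(Protocol):
    def command(self, target_deg: float, current_deg: float,
                speed: float) -> Dict[str, float]: ...


@dataclass
class RudderVessel:
    max_rudder_deg: float = 35.0  # assumed mechanical limit

    def command(self, target_deg, current_deg, speed):
        error = signed_heading_error(target_deg, current_deg)
        rudder = max(-self.max_rudder_deg, min(self.max_rudder_deg, error))
        return {"rudder_deg": rudder, "throttle": speed}


@dataclass
class TwinThrusterVessel:
    turn_gain: float = 0.125  # assumed gain, thrust units per degree

    def command(self, target_deg, current_deg, speed):
        bias = self.turn_gain * signed_heading_error(target_deg, current_deg)
        return {"left_thrust": speed + bias, "right_thrust": speed - bias}


def dispatch(fleet: Dict[str, VesselController], target_deg: float,
             speed: float, headings: Dict[str, float]):
    """Send one navigation target to every vessel's own controller."""
    return {name: ctrl.command(target_deg, headings[name], speed)
            for name, ctrl in fleet.items()}
```

The navigation layer only ever produces a direction and a speed; everything that depends on the hull or propulsion stays inside the controller class.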

In addition, according to the present invention, this layered algorithm allows mutual supervised learning to take place without human involvement, using the data and target values exchanged between layers, and allows the system to adapt actively to expanding requirements.

FIGS. 6 and 7 are diagrams illustrating a multi-vessel autonomous navigation method according to an embodiment of the present invention.

Referring to FIGS. 6 and 7, data used for autonomous navigation through the reinforcement learning algorithm according to an embodiment of the present invention are shown by way of example.

For example, the data for autonomous navigation of the autonomous vessel (S) may include the position of the vessel, the position of the target point, the vessel state, the natural state, and Lidar measurements.
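One way to picture the navigation data listed above is as a single observation record that is flattened into a vector for a learning algorithm. The Python sketch below is an assumption: the field names, the three-component vessel and natural states, and the flattening order are illustrative choices that the patent does not specify.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class NavigationObservation:
    vessel_xy: Tuple[float, float]             # position of the vessel
    goal_xy: Tuple[float, float]               # position of the target point
    vessel_state: Tuple[float, float, float]   # speed, direction, inclination
    natural_state: Tuple[float, float, float]  # waves, current, wind
    lidar_ranges: List[float]                  # one range per Lidar beam

    def to_vector(self) -> List[float]:
        """Flatten all fields into one list in a fixed order."""
        return [*self.vessel_xy, *self.goal_xy, *self.vessel_state,
                *self.natural_state, *self.lidar_ranges]


observation = NavigationObservation(
    vessel_xy=(0.0, 0.0),
    goal_xy=(100.0, 50.0),
    vessel_state=(5.0, 90.0, 2.0),
    natural_state=(1.2, 0.4, 6.0),
    lidar_ranges=[30.0, 28.5, 40.0, 40.0],
)
vector = observation.to_vector()
```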

The autonomous navigation of the autonomous vessel according to an embodiment of the present invention may be decided on the basis of the data shown in FIGS. 6 and 7. For example, following the basic structure of a reinforcement learning algorithm, in which actions are taken and rewards received according to the state, the state and data given to the autonomous vessel are as shown in FIGS. 6 and 7. For example, data on the natural state, the vessel state, and dynamic or static obstacles, collected through sensors such as GPS, Lidar, Radar, cameras, and attitude control sensors such as an IMU, may be gathered as measurements relative to the autonomous vessel.
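To make "relative to the autonomous vessel" concrete, the Python sketch below converts an obstacle position in a flat x-y frame into a range and a bearing relative to the vessel's heading. It is an illustration only: onboard sensors such as Lidar typically report relative values directly, and the planar frame, the angle convention (measured from the x-axis), and the function name are assumptions.

```python
import math
from typing import Tuple


def relative_measurement(vessel_xy: Tuple[float, float],
                         vessel_heading_deg: float,
                         obstacle_xy: Tuple[float, float]) -> Tuple[float, float]:
    """Return (distance, bearing relative to heading in degrees, in [-180, 180))."""
    dx = obstacle_xy[0] - vessel_xy[0]
    dy = obstacle_xy[1] - vessel_xy[1]
    distance = math.hypot(dx, dy)
    absolute_bearing = math.degrees(math.atan2(dy, dx))
    relative_bearing = (absolute_bearing - vessel_heading_deg + 180.0) % 360.0 - 180.0
    return distance, relative_bearing


# An obstacle dead ahead of a vessel heading along the y-axis.
distance, bearing = relative_measurement((0.0, 0.0), 90.0, (0.0, 10.0))
```

Expressing obstacles this way keeps the learned policy independent of where in the operating area the vessel happens to be.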

FIGS. 8 and 9 are diagrams illustrating a multi-vessel autonomous navigation method according to another embodiment of the present invention.

Referring to FIGS. 8 and 9, data used for autonomous control through the reinforcement learning algorithm according to an embodiment of the present invention are shown by way of example.

For example, the data for autonomous control of the autonomous vessel (S) may include the vessel state, the natural state, and the target value.

In the case of autonomous control according to an embodiment of the present invention, the data received from the autonomous navigation step may be passed on as state data for the control decision. The macroscopic control determined for navigating to the destination may serve as the target value of the control decision. The control decision may then take the given target value and, considering the state of the vessel and the state of the natural environment, output an optimal vessel control signal.

According to the present invention, the operation of an autonomous vessel in a given space is separated and layered into three parts: a part that analyzes the space according to the objective and establishes a plan, a navigation part that sets an optimal route while avoiding obstacles in real time within the planned space, and a part that performs optimal control in consideration of the vessel's given control method and the current natural environment. Such a vessel operation model can carry out mutual supervised learning automatically. In addition, the layered parts can be combined and reused according to the mission, the given maritime space, or the vessel's control method.

In addition, by layering the control modules and linking them to one another for stable control of the vessel, the present invention is more stable than considering every situation in a single module, and the three layers of reinforcement learning algorithms are configured to cooperate with one another to solve the problem. The decision on the vessel's destination and the decision on the vessel's control, as well as the vessel's route, can be separated and thereby simplified.

In addition, in the present invention, standardizing the data exchanged between the layered reinforcement learning algorithms yields clear target values, and declaring the data obtained through sensors as environment information yields explicit results for each of mission, navigation, and control, which can provide transparency of operation. Standardizing the data also makes it possible to use a standard protocol between the separated layers without having to define a separate, complex communication scheme.
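A standardized inter-layer record of the kind described above could look like the following Python sketch. The patent does not define a message format, so the `LayerMessage` fields, the layer names, the JSON encoding, and the payload keys are all hypothetical and shown only to illustrate the idea of a uniform format for targets and rewards.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class LayerMessage:
    source: str       # e.g. "navigation"
    destination: str  # e.g. "control"
    kind: str         # "target" or "reward"
    payload: dict     # values carried by the message


def encode(message: LayerMessage) -> str:
    """Serialize a message to a JSON string with stable key order."""
    return json.dumps(asdict(message), sort_keys=True)


def decode(text: str) -> LayerMessage:
    """Rebuild a message from its JSON string."""
    return LayerMessage(**json.loads(text))


target = LayerMessage("navigation", "control", "target",
                      {"heading_deg": 90.0, "speed": 5.0})
reward = LayerMessage("control", "navigation", "reward", {"value": -0.25})
round_trip = decode(encode(target))
```

With one record type in both directions, a layer can be logged, replayed, or replaced without changing how its neighbours talk to it.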

In addition, in applying artificial intelligence to vessel control, the present invention presents a method of controlling a vessel through a layered structure based on reinforcement learning, an algorithm well suited to decision-making. In the present invention, the problem is separated during training so that each part is given a clear learning objective suited to its purpose, and the layered algorithm structure can be trained by unsupervised learning without human involvement. In particular, learning is efficient, and the more complex the environment, the more advanced the control skills that can be learned.

In addition, the present invention can increase training speed by separating and layering the vessel control problem, and can ensure clarity of decision-making through that separation. Because the reinforcement learning design is generalized, for example through delayed rewards, it requires no separate complex reward policy or special data and is broadly applicable on the basis of realistic data. Continuously advancing reinforcement learning algorithms can be applied as they are, without separate design work. The invention can be applied simultaneously to autonomous vessels with different performance and characteristics. By separating the control problem, the various vessels required for a maritime mission can be combined and deployed. A wider range of missions can be performed by adding higher-layer algorithms. For example, the same algorithm can be extended hierarchically to cover the performance of a task after arrival at the destination.

In applying the reinforcement learning algorithm according to the present invention, presenting such a hierarchical framework allows application to various vessel types, and extending the layers allows expansion downward to algorithms for energy saving and upward to algorithms responsible for higher-layer missions. The layered reinforcement learning algorithm presented by the present invention enables realistic configuration of, and integration with, autonomous vessels.

The device and/or system described above may be implemented as hardware components, software components, and/or a combination of hardware and software components. The devices and components described in the embodiments may be implemented using one or more general-purpose or special-purpose computers, such as a processor, a controller, an arithmetic logic unit (ALU), a digital signal processor, a microcomputer, a field programmable gate array (FPGA), a programmable logic unit (PLU), a microprocessor, or any other device capable of executing and responding to instructions. The processing device may run an operating system (OS) and one or more software applications that execute on the operating system. The processing device may also access, store, manipulate, process, and generate data in response to execution of the software. For ease of understanding, the description sometimes refers to a single processing device, but a person of ordinary skill in the art will recognize that the processing device may include a plurality of processing elements and/or multiple types of processing elements. For example, the processing device may include a plurality of processors, or one processor and one controller. Other processing configurations, such as parallel processors, are also possible.

The software may include a computer program, code, instructions, or a combination of one or more of these, and may configure the processing device to operate as desired or may instruct the processing device independently or collectively. The software and/or data may be embodied, permanently or temporarily, in any type of machine, component, physical device, virtual equipment, computer storage medium or device, or transmitted signal wave, in order to be interpreted by the processing device or to provide instructions or data to the processing device. The software may be distributed over networked computer systems and stored or executed in a distributed manner. The software and data may be stored on one or more computer-readable recording media.

The method according to the embodiment may be implemented in the form of program instructions executable by various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and configured for the embodiment, or may be known and available to those skilled in the art of computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine code such as that produced by a compiler, but also high-level language code that can be executed by a computer using an interpreter or the like. The hardware devices described above may be configured to operate as one or more software modules to perform the operations of the embodiments, and vice versa.

Although the embodiments have been described above with reference to a limited number of embodiments and drawings, a person of ordinary skill in the art can make various modifications and variations based on the above description. For example, appropriate results may be achieved even if the described techniques are performed in an order different from the described method, and/or the described components of systems, structures, devices, circuits, and the like are coupled or combined in a form different from the described method, or are replaced or substituted by other components or equivalents.

Therefore, other implementations, other embodiments, and equivalents of the claims also fall within the scope of the claims that follow.

100: multi-vessel autonomous navigation device
110: memory
120: processor
130: communication module
140: input/output interface

Claims

A multi-vessel autonomous navigation method performed by a computing device, the method comprising:
an autonomous mission step of establishing a navigation plan for a vessel by analyzing a preset space according to a mission of the vessel;
an autonomous navigation step of navigating the vessel by setting a route that avoids obstacles in the space for which the navigation plan is established;
an autonomous control step of controlling a target value of the vessel determined in the autonomous navigation step based on an environment of the space and characteristics of the vessel; and
a reinforcement learning step in which the autonomous mission step, the autonomous navigation step, and the autonomous control step are linked to one another and trained by reinforcement learning based on the mission of the vessel, the navigation plan, the space, the route, and the target value,
wherein the reinforcement learning step is a step in which mutual supervised learning is performed in the order of the autonomous mission step, the autonomous navigation step, and the autonomous control step, and
wherein the reinforcement learning step comprises:
determining a current value of the vessel, including a direction and a speed of the vessel, according to an output value output in the autonomous control step;
delivering a reward to the autonomous navigation step by comparing the output value with the target value; and
delivering a reward to the autonomous mission step by comparing the target value with the current value.

The method of claim 1,
wherein the autonomous mission step, the autonomous navigation step, and the autonomous control step form three layers in the order of the autonomous mission step, the autonomous navigation step, and the autonomous control step.

(Deleted)

The method of claim 2,
wherein the reinforcement learning step is a step in which the autonomous mission step, the autonomous navigation step, and the autonomous control step are each trained separately using the target value.

The method of claim 2, further comprising:
controlling the vessel by linking the autonomous mission step, the autonomous navigation step, and the autonomous control step, each trained by reinforcement learning, with one another in real time.

A computer program stored in a recording medium for executing, using a computing device, the method of any one of claims 1, 2, 5, and 6.

A multi-vessel autonomous navigation device comprising a processor,
wherein the processor performs an autonomous mission step of establishing a navigation plan for a vessel by analyzing a preset space according to a mission of the vessel, performs an autonomous navigation step of navigating the vessel by setting a route that avoids obstacles in the space for which the navigation plan is established, performs an autonomous control step of controlling a target value of the vessel determined in the autonomous navigation step based on an environment of the space and characteristics of the vessel, and performs reinforcement learning in which the autonomous mission step, the autonomous navigation step, and the autonomous control step are linked to one another based on the mission of the vessel, the navigation plan, the space, the route, and the target value,
wherein the processor performs mutual supervised learning in the order of the autonomous mission step, the autonomous navigation step, and the autonomous control step, and
wherein the processor determines a current value of the vessel, including a direction and a speed of the vessel, according to an output value output in the autonomous control step, delivers a reward to the autonomous navigation step by comparing the output value with the target value, and delivers a reward to the autonomous mission step by comparing the target value with the current value.

The device of claim 8,
wherein the autonomous mission step, the autonomous navigation step, and the autonomous control step form three layers in the order of the autonomous mission step, the autonomous navigation step, and the autonomous control step.

(Deleted)

The device of claim 9,
wherein the processor trains the autonomous mission step, the autonomous navigation step, and the autonomous control step separately from one another using the target value.

The device of claim 9,
wherein the processor controls the vessel by linking the autonomous mission step, the autonomous navigation step, and the autonomous control step, each trained by reinforcement learning, with one another in real time.