JP2019165599A - Control system, learning device, control device and control method

Info

Publication number: JP2019165599A
Application number: JP2018053507A
Authority: JP
Inventors: 房二堀部; Fusaji Horibe
Original assignee: Lixil Corp
Current assignee: Lixil Corp
Priority date: 2018-03-20
Filing date: 2018-03-20
Publication date: 2019-09-26
Anticipated expiration: 2038-03-20
Also published as: JP7090442B2

Abstract

To provide a control system that can control the rotation of a windmill so that its rotational speed is optimized even in unstable wind conditions. SOLUTION: The control system comprises: a learning unit that learns correspondence information between wind speed and rotation of a windmill on the basis of wind speed information indicating the wind speed at the installation location of the windmill, rotation information relating to the rotation of the windmill, and relation information between the wind speed and the rotation that indicates an allowable range and a target for the rotation; a storage unit that stores the correspondence information between the wind speed and the rotation; a state detection unit that detects the wind speed when a rotation speed control parameter for controlling the rotation speed of the windmill is set in the windmill; a determination unit that determines the rotation information on the basis of the wind speed detected by the state detection unit and the correspondence information; and a control unit that controls the rotation of the windmill, on the basis of the rotation information determined by the determination unit, so that the rotation stays within the allowable range and approaches the target. SELECTED DRAWING: Figure 12

Description

The present invention relates to a control system, a learning device, a control device, and a control method.

Conventionally, wind power generation systems have used a technique that controls the output with high efficiency by changing the mounting angle (pitch angle) of the windmill blades. However, many vertical-axis windmills have no blade pitch control. In a wind power generation system without pitch control, the rotational speed of the windmill is reduced when the windmill receives wind at or above a certain speed (strong wind). This prevents the windmill from over-rotating and being forcibly stopped under strong wind, and lets it keep rotating even under strong wind. The range of conditions under which the system can operate has been expanded in this way (see, for example, Patent Document 1).

Japanese Patent No. 4401117

However, wind conditions change from moment to moment, so they are unstable and difficult to predict. Consequently, if the rotational speed of the windmill is reduced in preparation for a strong wind and the expected strong wind does not blow, the generated power drops by an amount corresponding to the reduction in rotational speed.

The present invention has been made in view of these circumstances, and its object is to provide a control system, a learning device, a control device, and a control method capable of controlling a windmill so that its rotational speed is optimized even in unstable wind conditions.

To solve the problem described above, one embodiment of the present invention is a control system that controls a windmill of a power generation system, the control system comprising: a learning unit that learns correspondence information between the wind speed and the rotation on the basis of wind speed information indicating the wind speed at the installation location of the windmill, rotation information regarding the rotation of the windmill, and relation information between the wind speed and the rotation that indicates an allowable range and a target for the rotation; a storage unit that stores the correspondence information between the wind speed and the rotation; a state detection unit that detects the wind speed and the rotation speed when a rotation speed control parameter for controlling the rotation speed of the windmill is set in the windmill; a determination unit that determines the rotation information on the basis of the wind speed and the rotation speed detected by the state detection unit and the correspondence information; and a control unit that controls the rotation of the windmill, on the basis of the rotation information determined by the determination unit, so that the rotation of the windmill stays within the allowable range and approaches the target.

In another embodiment of the present invention, the control system described above further comprises a reward calculation unit that calculates a reward according to a predetermined reward condition on the basis of the wind speed detected by the state detection unit and the rotational speed of the windmill. The relation information includes the reward calculated by the reward calculation unit, and the learning unit is a reinforcement learning model that learns the correspondence information between the wind speed and the rotation on the basis of the reward.

In another embodiment of the present invention, in the control system described above, when the wind speed detected by the state detection unit is equal to or higher than a predetermined wind speed threshold, the reward calculation unit calculates the reward according to whether the rotational speed of the windmill is equal to or higher than a first threshold.

In another embodiment of the present invention, in the control system described above, the reward calculation unit calculates a first-level reward when the wind speed detected by the state detection unit is equal to or higher than the wind speed threshold and the rotational speed of the windmill is equal to or higher than the first threshold, and calculates a second-level reward, higher than the first level, when the rotational speed of the windmill is equal to or higher than a second threshold that is smaller than the first threshold.

In another embodiment of the present invention, in the control system described above, when the wind speed detected by the state detection unit is equal to or higher than the wind speed threshold and the rotational speed of the windmill is less than the second threshold, the reward calculation unit calculates a third-level reward that is higher than the first level and lower than the second level.

Another embodiment of the present invention is a learning device comprising a learning unit that learns relation information between the wind speed and the rotation on the basis of wind speed information indicating the wind speed at the installation location of a windmill of a wind power generation system, rotation information regarding the rotation of the windmill, and relation information between the wind speed and the rotation that indicates an allowable range and a target for the rotation.

Another embodiment of the present invention is a control device comprising: a state detection unit that detects the wind speed when a rotation speed control parameter for controlling the rotation speed of a windmill of a wind power generation system is set in the windmill; a determination unit that determines rotation information regarding the rotation of the windmill on the basis of the wind speed detected by the state detection unit and relation information between the wind speed and the rotation of the windmill; and a control unit that controls the rotation of the windmill on the basis of the rotation information determined by the determination unit.

Another embodiment of the present invention is a control method for controlling a windmill of a power generation system, in which: a learning unit learns correspondence information between the wind speed and the rotation on the basis of wind speed information indicating the wind speed at the installation location of the windmill, rotation information regarding the rotation of the windmill, and relation information between the wind speed and the rotation that indicates an allowable range and a target; a storage unit stores the correspondence information between the wind speed and the rotation; a state detection unit detects the wind speed when a rotation speed control parameter for controlling the rotation speed of the windmill is set in the windmill; a determination unit determines the rotation information on the basis of the wind speed detected by the state detection unit and the correspondence information; and a control unit controls the rotation of the windmill, on the basis of the rotation information determined by the determination unit, so that the rotation of the windmill stays within the allowable range and approaches the target.

As described above, according to the present invention, control of the rotational speed of a windmill can be optimized even in unstable wind conditions.

A block diagram showing an example of the schematic configuration of the wind power generation system 1 according to the first embodiment.
A block diagram showing an example of the configuration of the control device 60 and the learning device 70 of the wind power generation system 1 according to the first embodiment.
A diagram showing an example of reward conditions during strong wind according to the first embodiment.
A diagram showing an example of the relationship between the wind speed and the rotational speed of the windmill in the wind power generation system 1 according to the first embodiment.
A diagram showing an example of the relationship between the wind speed and the generated power in the wind power generation system 1 according to the first embodiment.
A flowchart showing an operation example of the control device 60 according to the first embodiment.
A diagram explaining the target section according to the second embodiment.
A flowchart showing an operation example of the control device 60 according to the second embodiment.
A flowchart showing an operation example of the control device 60 according to the third embodiment.
A flowchart showing an operation example of the control device 60 according to the fourth embodiment.
A block diagram showing an example of the configuration of the control device 60A of the wind power generation system 1A according to the fifth embodiment.
A block diagram showing an example of the configuration of the control device 60B of the wind power generation system 1B according to the sixth embodiment.

Hereinafter, a control system, a learning device, and a control device according to embodiments will be described with reference to the drawings.

<First Embodiment>
FIG. 1 is a block diagram showing an example of the schematic configuration of the wind power generation system 1 according to the first embodiment. The wind power generation system 1 includes a wind power generator main body 10 and a control system 50. Various kinds of information are exchanged between the wind power generator main body 10 and the control system 50.
As shown in FIG. 1, for example, control parameters for controlling the wind power generator main body 10 are output from the control system 50 to the wind power generator main body 10.
Likewise, for example, state parameters indicating the state of the wind power generator main body 10 are output from the wind power generator main body 10 to the control system 50.

The control parameters are, for example, a rotation speed control parameter that controls the rotation speed of the windmill 20 and a power control parameter that controls the amount of regenerative power generated by the generator 30.
The state parameters are, for example, information indicating the wind speed at the windmill 20, the rotational speed of the windmill 20 (hereinafter also simply called the rotational speed), and the amount of regenerative power generated by the generator 30.

The wind power generator main body 10 includes a windmill 20, a generator 30, a rectifying/boosting unit 31, a voltage detection unit 32, a current detection unit 33, a wind speed sensor 41, and a rotational speed sensor 42.
The windmill 20 is configured, for example, as a vertical-axis windmill, such as a straight-blade vertical-axis windmill in which a plurality of straight blades are connected so as to rotate as one body around a rotating shaft extending in the vertical direction.
The windmill 20 is connected, for example, via the rotating shaft to the rotor of the generator 30 described later, and rotates together with that rotor. The rotor of the generator 30 rotates at a rotation speed that corresponds to the amount of regenerative power generated by the generator 30. The amount of regenerative power is in turn subject to MPPT (Maximum Power Point Tracking) control by the control system 50 described later. The rotation speed of the windmill 20 is therefore controlled indirectly through the MPPT control performed by the control system 50.

The generator 30 is a device that converts the rotational force of the windmill 20 into electric power. It is configured, for example, as a three-phase AC generator whose rotor is coupled to the rotating shaft of the windmill 20 and rotates together with the windmill 20 to generate AC power. The generator 30 supplies the generated AC power to the rectifying/boosting unit 31. Besides operating as a generator that supplies power to the rectifying/boosting unit 31, the generator 30 also operates as an electric motor supplied with AC power from the rectifying/boosting unit 31. The generator 30 operates as an electric motor, for example, when assist control is performed to assist rotation at start-up of the windmill 20.

The rectifying/boosting unit 31 converts the AC power generated by the generator 30 into DC power and then converts (boosts) the voltage of that DC power. The rectifying/boosting unit 31 is, for example, a boost chopper circuit. The DC power output from the rectifying/boosting unit 31 corresponds to the regenerative power generated by the generator 30.

The rectifying/boosting unit 31 is a circuit that operates as a boost chopper circuit when the generator 30 performs power generation, and as an inverter when the generator 30 is operated as an electric motor, for example during assist control. The power supplied during assist control may come from a battery (not shown) of the wind power generation system 1.

The voltage detection unit 32 is a known voltmeter. It detects the output voltage of the rectifying/boosting unit 31 and outputs the detected output voltage to the control system 50.
The current detection unit 33 is a known ammeter. It detects the output current of the rectifying/boosting unit 31 and outputs the detected output current to the control system 50.
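As an illustration of how the detected output voltage and output current could feed MPPT control, the following is a minimal sketch, not the method of this publication. It computes the DC output power as the product of voltage and current and then applies one step of perturb-and-observe hill climbing, which is one common MPPT scheme; the text here does not say which MPPT scheme the control system 50 uses. The function names, the `command` variable (standing in for a power control parameter), and the step size are all hypothetical.

```python
def output_power(voltage_v: float, current_a: float) -> float:
    """DC output power [W] from the detected output voltage and current."""
    return voltage_v * current_a


def perturb_and_observe(prev_power: float, power: float,
                        prev_command: float, command: float,
                        step: float = 0.1) -> float:
    """Return the next command for one perturb-and-observe step.

    `command` is a hypothetical stand-in for a power control parameter.
    If the last change of the command raised the power, keep moving in the
    same direction; if it lowered the power, reverse direction.
    """
    direction = 1.0 if command >= prev_command else -1.0
    if power < prev_power:
        direction = -direction
    return command + direction * step


# Example: power rose after the command was increased, so increase it again.
p_prev = output_power(48.0, 2.0)   # 96.0 W
p_now = output_power(48.0, 2.5)    # 120.0 W
next_command = perturb_and_observe(p_prev, p_now, 1.0, 1.1)
```

Repeating this step makes the command oscillate around the operating point where the output power peaks, which is the basic idea behind maximum power point tracking.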

The wind speed sensor 41 is a known wind speed sensor. It is installed, for example, at a predetermined position near the windmill 20 (for example, on a part of the windmill 20 other than the rotating blades) and detects the speed of the wind received by the windmill. The wind speed sensor 41 outputs information indicating the detected wind speed to the control system 50.

The rotational speed sensor 42 detects the rotational speed of the windmill 20. Any sensor that can detect the rotational speed of the rotating shaft portion (not shown) of the windmill 20 may be used, and various known rotational speed sensors are suitable. The rotational speed sensor 42 outputs information indicating the detected rotational speed to the control system 50.

The control system 50 includes a control device 60 and a learning device 70.
The control device 60 controls the rotation speed of the windmill by determining the rotation speed control parameter on the basis of the wind speed detected by the wind speed sensor 41 and the rotational speed detected by the rotational speed sensor 42. The control device 60 uses the learning device 70 to determine the rotation speed control parameter. How it does so is described in detail later.

The learning device 70 is, for example, a device that performs reinforcement learning. In this case, the learning device 70 corresponds to the agent, that is, the learning subject in reinforcement learning, and it learns to control the control target (in this embodiment, the wind power generator main body 10) more appropriately by interacting with it.
The following description uses the case where the learning device 70 performs reinforcement learning as an example, but the invention is not limited to this. The learning device 70 may be any device that learns, on the basis of the state of the control target (the wind power generator main body 10), so that the parameters controlling the control target become more appropriate. The learning device 70 may perform supervised learning, unsupervised learning, or some other form of learning. Here, the state of the control target (the wind power generator main body 10) means the state of the wind power generator main body 10 and of its surroundings, for example variables given by the state parameters such as the wind speed at the windmill 20, the rotational speed of the windmill 20, and the amount of power generated by the generator 30. In addition to states that change from moment to moment, such as the wind speed, the state here also includes predetermined values, for example the limit value of the rotational speed of the windmill 20, the upper and lower limits of the rotational torque of the windmill 20, and the maximum amount of power that the generator 30 can generate.

In this embodiment, the learning device 70 outputs a rotation speed control parameter for controlling the wind power generator main body 10, observes the state of the wind power generator main body 10 that results from the output parameter, and determines the next rotation speed control parameter according to the change in state.
The learning device 70 also receives a reward that depends on the state of the wind power generator main body 10. Using the reward as a clue, the learning device 70 judges how good or bad the rotation speed control parameter it output was, advances its learning accordingly, and becomes able to output more suitable rotation speed control parameters.
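The output, observe, reward cycle described above can be sketched with tabular Q-learning. This is only one possible reinforcement learning algorithm and is chosen here for illustration; the description does not name a specific algorithm. The class name, the discretization of wind speed and rotational speed into bins, and all hyperparameter values are assumptions.

```python
import random
from collections import defaultdict


class QLearningAgent:
    """Hypothetical tabular Q-learning agent (illustrative, not from the source)."""

    def __init__(self, actions, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.actions = list(actions)  # candidate rotation speed control parameters
        self.alpha = alpha            # learning rate
        self.gamma = gamma            # discount factor
        self.epsilon = epsilon        # exploration probability
        self.q = defaultdict(float)   # (state, action) -> estimated value
        self.rng = random.Random(seed)

    def select_action(self, state):
        """Epsilon-greedy choice of the next control parameter."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        """Move Q(state, action) toward reward + gamma * max Q(next_state, .)."""
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        td_target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (td_target - self.q[(state, action)])


def discretize(wind_speed, rotation_speed, wind_bin=1.0, rot_bin=50.0):
    """Map continuous measurements to a discrete state (bin sizes are assumptions)."""
    return (int(wind_speed // wind_bin), int(rotation_speed // rot_bin))


# One interaction step with made-up measurements and a made-up reward.
agent = QLearningAgent(actions=[0, 1, 2], epsilon=0.0)
state = discretize(12.3, 180.0)
agent.update(state, action=1, reward=1.0, next_state=state)
best = agent.select_action(state)
```

In a real loop, the reward passed to `update` would come from the reward calculation, and the selected action would be sent to the wind power generator main body as the next rotation speed control parameter.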

FIG. 2 is a block diagram showing an example of the configuration of the control device 60 of the wind power generation system 1 according to one embodiment of the present invention.
As shown in FIG. 2, the control device 60 includes a parameter acquisition unit 61, a state detection unit 62, a reward calculation unit 63, and a reward output unit 64. The learning device 70 includes a reinforcement learning unit 71. The reinforcement learning unit 71 is an example of the "learning unit".

The parameter acquisition unit 61 acquires the rotation speed control parameter output from the reinforcement learning unit 71 and outputs it to the wind power generator main body 10.

The state detection unit 62 detects state parameters indicating the state of the wind power generator main body 10. The state parameters are information about the windmill 20 and the generator 30 included in the wind power generator main body 10, for example information indicating the wind speed detected by the wind speed sensor 41, the rotational speed detected by the rotational speed sensor 42, and the regenerative power generated by the generator 30. The state detection unit 62 outputs the detected state parameters to the reward calculation unit 63.

Here, the wind speed detected by the wind speed sensor 41 is an example of "wind speed information". The rotational speed detected by the rotational speed sensor 42 is an example of "rotation information". The regenerative power generated by the generator 30 is an example of "power information".

The reward calculation unit 63 calculates a reward on the basis of the information indicating the wind speed and the rotational speed acquired from the state detection unit 62. The reward calculation unit 63 calculates the reward according to a predetermined reward condition.
The reward condition is set, for example, so that a higher reward is obtained when the rotational speed is judged to have been controlled more appropriately for the wind speed.

For example, on the basis of the wind speed and the rotational speed, the reward calculation unit 63 calculates a higher reward when, during strong wind, the windmill is controlled without over-rotating and the generated power is kept from dropping too far. Conversely, it calculates a lower reward when, despite wind conditions suitable for power generation, the rotation of the windmill is suppressed inappropriately and the generated power drops as a result. The reward calculation unit 63 outputs the calculated reward to the reward output unit 64.

The reward output unit 64 outputs the reward acquired from the reward calculation unit 63 to the reinforcement learning unit 71.

The reinforcement learning unit 71 acquires information indicating the wind speed and the rotational speed, which represent the state of the wind power generator main body 10. It also acquires the reward from the reward output unit 64. On the basis of the acquired wind speed and rotational speed information, together with information indicating the allowable range and the target value of the rotation speed predetermined for each wind speed, the reinforcement learning unit 71 uses the reward as a clue to advance its learning so that higher rewards are obtained, and outputs a more appropriate rotation speed control parameter.

Here, the allowable range of the rotation speed is the range permitted for the rotation speed of the windmill 20, determined on the basis of, for example, the mechanical durability limits of the windmill 20 and the generator 30. The target value of the rotation speed is the rotation speed of the windmill at which the regenerative power is maximized; it is determined for each wind speed.
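The two definitions above can be combined in a small sketch: for one wind speed, the target is the rotation speed with the largest regenerative power among the candidates that lie inside the allowable range. The function name, the lookup-table representation, and every number below are hypothetical and serve only to illustrate the definitions.

```python
def target_rotation_speed(power_by_speed: dict, min_speed: float, max_speed: float) -> float:
    """Return the rotation speed [r/min] with maximum power inside the allowable range.

    power_by_speed maps candidate rotation speeds to regenerative power [W]
    for a single wind speed (made-up data, not from the source).
    """
    allowed = {s: p for s, p in power_by_speed.items() if min_speed <= s <= max_speed}
    if not allowed:
        raise ValueError("no candidate rotation speed inside the allowable range")
    return max(allowed, key=allowed.get)


# Made-up measurements for one wind speed: power keeps rising up to 250 r/min,
# but the assumed mechanical limit caps the allowable range at 220 r/min.
measurements = {100.0: 200.0, 150.0: 320.0, 200.0: 410.0, 250.0: 450.0}
target = target_rotation_speed(measurements, min_speed=50.0, max_speed=220.0)
```

With these numbers the unconstrained power peak lies outside the allowable range, so the sketch returns 200 r/min, the best candidate that still respects the limit.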

In wind power generation, controlling the rotation speed of the windmill can become particularly difficult during strong wind. If the rotational speed becomes excessive, the windmill is forcibly stopped and the generated power drops. On the other hand, suppressing the rotational speed too much in preparation for strong wind can also reduce the generated power.
In this embodiment, therefore, the reward condition is set so that the reward for a rotation speed control parameter used during strong wind differs depending on whether the windmill was controlled appropriately. This allows the reinforcement learning unit 71 to learn to output more appropriate rotation speed control parameters even during strong wind.

FIG. 3 is a diagram showing an example of reward conditions during strong wind according to the first embodiment.
As shown in FIG. 3, when the rotational speed during strong wind is equal to or higher than a predetermined first threshold, the reward is set to the first level, which is the lowest level. In other words, when the rotational speed is excessive during strong wind, the lowest reward is given. The first-level reward is, for example, a negative reward.

When the rotational speed is equal to or higher than a predetermined second threshold and lower than the first threshold, the reward is set to the second level, which is the highest level. In other words, when the rotational speed is controlled within the appropriate range during strong wind, the highest reward is given. The second threshold is lower than the first threshold.

When the rotational speed is lower than the predetermined second threshold, the reward is set to the third level, which is lower than the second level and higher than the first level. In other words, when the rotational speed is insufficient during a strong wind, the reward is lower than the highest reward but higher than the reward given when the rotational speed is excessive.

Here, "during a strong wind" means that a wind speed equal to or higher than a predetermined strong wind determination threshold has been detected. This threshold may be set arbitrarily according to the structure and mechanical strength of the windmill. The first and second thresholds described above may be constant regardless of the wind speed, or may take different values depending on the wind speed.

FIG. 4 is a diagram illustrating an example of the relationship between the wind speed and the rotational speed of the windmill in the wind power generation system 1 according to the first embodiment. The horizontal axis in FIG. 4 indicates the wind speed [m/s], and the vertical axis indicates the rotational speed [r/min] of the windmill 20. In FIG. 4, the wind is regarded as strong when the wind speed is B [m/s] or higher.

As shown in FIG. 4, between wind speed A [m/s] and wind speed B [m/s], where the wind is not strong, the windmill is controlled so that the rotational speed is proportional to the wind speed with a positive proportionality coefficient. On the other hand, at wind speeds of B [m/s] or higher, where the wind is strong, it is desirable to control the windmill so that the wind speed and the rotational speed have a predetermined relationship. The predetermined relationship here is, for example, one in which the rotational speed decreases as the wind speed increases. When the windmill is controlled so that the relationship between the wind speed and the rotational speed falls on the characteristic FG in this example, the torque of the generator 30 can be maintained at its maximum, and the maximum generated power can be obtained. In this example, it is most desirable that the rotational speed be controlled along the line indicated by the characteristic FG. For this reason, in the present embodiment, the highest reward is set when, during a strong wind, the rotational speed of the windmill 20 is controlled within the appropriate range for that wind speed. The appropriate range in this case is, for example, a predetermined range that includes the point on the characteristic FG corresponding to the wind speed.

On the other hand, if the rotational speed exceeds the point on the characteristic FG, the windmill is forcibly stopped because the mechanical durability limit of the rotor of the generator 30 may be exceeded. Once the windmill is forcibly stopped, it cannot generate power. That is, when the relationship between the wind speed and the rotational speed lies in the region D, the windmill is more likely to be forcibly stopped, which is undesirable. For this reason, in the present embodiment, the lowest reward is set when, during a strong wind, the rotational speed of the windmill 20 is controlled into the overspeed range for that wind speed. The overspeed range in this case is, for example, a predetermined range included in the region D corresponding to the wind speed.

If the rotational speed falls below the point on the characteristic FG, the rotational speed of the windmill 20 is suppressed more than necessary, and the generator 30 is not generating power even though it could generate more. In this case there is no risk that the windmill 20 is forcibly stopped or that the mechanical durability limit of the generator 30 is exceeded, but the generated power is not being drawn out to the maximum. For this reason, in the present embodiment, when the rotational speed of the windmill 20 is controlled into the underspeed range for the wind speed during a strong wind, a reward is set that is lower than the highest reward but higher than the lowest reward. The underspeed range in this case is, for example, the range indicated by the region FBCG.

FIG. 5 is a diagram illustrating an example of the relationship between the wind speed and the generated power output in the wind power generation system 1 according to the first embodiment. The horizontal axis in FIG. 5 indicates the wind speed [m/s], and the vertical axis indicates the generated power [kW]. FIG. 5 shows the relationship between the wind speed and the generated power when the rotational speed of the windmill is controlled based on the control characteristics of FIG. 4. As in FIG. 4, the wind is regarded as strong in FIG. 5 when the wind speed is B [m/s] or higher.
As shown in FIG. 5, between wind speed A [m/s] and wind speed B [m/s], where the wind is not strong, the generated power increases as a cubic function of the wind speed. On the other hand, at wind speeds of B [m/s] or higher, where the wind is strong, the generated power is maintained at the maximum output Max by controlling the rotational speed along the characteristic FG shown in FIG. 4.
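The shape of the curve in FIG. 5 can be sketched as a small function. This is only an illustration under stated assumptions: the function name and parameters are hypothetical, the cubic segment is scaled so that it meets the maximum output exactly at wind speed B, and zero output is assumed below wind speed A, a range the text does not describe.

```python
def generated_power(wind_speed, a, b, p_max):
    """Generated power [kW] for a wind speed [m/s], following the shape of FIG. 5.

    a, b  : wind speeds A and B [m/s]; b is the strong wind boundary.
    p_max : maximum output Max [kW].
    """
    if wind_speed < a:
        # Assumption: no generation below wind speed A (not covered by the text).
        return 0.0
    if wind_speed < b:
        # Cubic growth, scaled (by assumption) to reach p_max at wind speed B.
        return p_max * (wind_speed / b) ** 3
    # Strong wind: output is held at the maximum.
    return p_max
```

For example, with `a=2.0`, `b=10.0`, and `p_max=8.0`, a wind speed of 5 m/s is half of B, so the output is one eighth of the maximum.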

FIG. 6 is a flowchart illustrating an operation example of the control device 60 according to the first embodiment.
First, the parameter acquisition unit 61 of the control device 60 acquires a rotation speed control parameter from the reinforcement learning unit 71 (step S10).
Next, the state detection unit 62 detects the wind speed and the rotational speed (step S11). It does so by acquiring the wind speed detected by the wind speed sensor 41 and the rotational speed detected by the rotation speed sensor 42. The state detection unit 62 outputs the detected wind speed and rotational speed to the reward calculation unit 63.

Next, the reward calculation unit 63 calculates a reward.
First, the reward calculation unit 63 determines whether the wind is strong (step S12). If the wind is strong, the reward calculation unit 63 determines whether the rotational speed is equal to or higher than the first threshold (step S13). If the rotational speed is equal to or higher than the first threshold, the reward calculation unit 63 sets the first-level reward (step S14). If the rotational speed is lower than the first threshold, the reward calculation unit 63 determines whether the rotational speed is equal to or higher than the second threshold (step S16).

If the rotational speed is equal to or higher than the second threshold, the reward calculation unit 63 sets the second-level reward (step S17). If the rotational speed is lower than the second threshold, the reward calculation unit 63 sets the third-level reward (step S18).

If the wind is not strong in step S12, the reward calculation unit 63 sets a normal-level reward (step S19). For the normal-level reward, for example, the highest reward is given when the rotational speed is controlled within the appropriate range, and when the rotational speed is outside the appropriate range, the reward is reduced according to the degree of deviation from that range, regardless of the direction of the deviation (overspeed or underspeed).

The reward calculation unit 63 outputs the calculated reward to the reward output unit 64. The reward output unit 64 outputs the reward to the reinforcement learning unit 71 (step S15).
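The reward decision of steps S12 to S19 can be sketched as follows. This is a minimal illustration, not the implementation of the reward calculation unit 63: the numeric reward values, the linear penalty used for the normal level, and all function and parameter names are assumptions. The text fixes only the ordering of the three levels and gives a negative value as an example for the first level.

```python
# Reward values are illustrative assumptions; the text only fixes their ordering.
LEVEL_1 = -1.0  # lowest: overspeed in strong wind (negative, per the text's example)
LEVEL_2 = 1.0   # highest: rotational speed in the appropriate range
LEVEL_3 = 0.0   # intermediate: underspeed in strong wind


def normal_level_reward(rot_speed, low, high, penalty=0.01):
    """Step S19: highest reward inside [low, high], reduced by the distance outside it.

    The linear penalty is an assumption; the text only says the reward is
    reduced according to the degree of deviation, in either direction.
    """
    if low <= rot_speed <= high:
        return LEVEL_2
    deviation = (low - rot_speed) if rot_speed < low else (rot_speed - high)
    return LEVEL_2 - penalty * deviation


def calc_reward(wind_speed, rot_speed, strong_wind, first, second, normal_range):
    """Return the reward for one observation, following steps S12 to S19."""
    if wind_speed >= strong_wind:      # S12: strong wind?
        if rot_speed >= first:         # S13: at or above the first threshold?
            return LEVEL_1             # S14: overspeed
        if rot_speed >= second:        # S16: at or above the second threshold?
            return LEVEL_2             # S17: appropriate range
        return LEVEL_3                 # S18: underspeed
    return normal_level_reward(rot_speed, *normal_range)  # S19
```

Because the second threshold is lower than the first, the two comparisons in the strong wind branch partition the rotational speed into exactly the three ranges of FIG. 3.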

<Second Embodiment>
Next, a second embodiment will be described.
This embodiment differs from the other embodiments in that the control target of the control device 60 is the regenerative power. The following describes the points that differ from the embodiments described above. Components having the same or similar functions as those in the embodiments described above are given the same reference numerals, and their description is omitted.

As a premise, in wind power generation, the maximum regenerative power for a given wind speed received by the windmill 20 is uniquely determined as a characteristic value specific to each windmill. This maximum is set as the regenerative power target value for that wind speed, and the regenerative power is controlled so as to approach the target value.

The control device 60 determines a power control parameter using the learning device 70. Based on the power control parameter, the control device 60 performs MPPT (Maximum Power Point Tracking) control so that the DC power output from the rectification/boost unit 31 approaches the power value specified by the power control parameter. Specifically, the control device 60 changes the duty ratio of the PWM signal supplied to the rectification/boost unit 31 so that the output power, determined from the output voltage detected by the voltage detection unit 32 and the output current detected by the current detection unit, reaches the regenerative power target value specified by the power control parameter.
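One way to picture the duty ratio adjustment is a fixed-step update toward the target power. The sketch below is hypothetical: the function name, the step size, and in particular the assumption that a higher duty ratio raises the output power are illustrative only, since the actual direction depends on the converter topology and operating point, which the text does not specify. It tracks a given target and is not a full MPPT algorithm.

```python
def next_duty_ratio(duty, voltage, current, target_power, step=0.01):
    """Return the PWM duty ratio for the next control cycle.

    duty         : current duty ratio, in the range 0.0 to 1.0
    voltage      : detected output voltage [V]
    current      : detected output current [A]
    target_power : regenerative power target [W] given by the power control parameter
    """
    power = voltage * current  # output power from the detected voltage and current
    if power < target_power:
        duty += step           # assumption: raising the duty ratio raises output power
    elif power > target_power:
        duty -= step
    # Keep the duty ratio within its physically meaningful range.
    return min(1.0, max(0.0, duty))
```

Calling the function once per control cycle moves the output power toward the target in small increments, which avoids abrupt load changes on the generator.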

The learning device 70 outputs a power control parameter based on the state parameters detected by the state detection unit 62 and the reward output by the reward output unit 64. The learning device 70 also observes the change in the state of the wind power generator main body 10 that results from setting the output power control parameter in the rectification/boost unit 31 of the wind power generator main body 10, and determines the next power control parameter according to the change in state and the reward.

The reinforcement learning unit 71 acquires information indicating the wind speed, the rotational speed, and the regenerative power, which represent the state of the wind power generator main body 10. Based on this information and on information indicating the regenerative power target value, the reinforcement learning unit 71 advances its learning, using the reward as a clue, so that a higher reward is obtained, and outputs a more appropriate power control parameter. Here, the regenerative power target value is the maximum regenerative power that can be generated, which is determined for each wind speed.

When calculating a reward, the reward calculation unit 63 extracts a predetermined target section over which the reward is calculated. Here, the target section is a predetermined section determined by the wind conditions, for example, a section combining a deceleration section in which the wind speed decreases and an acceleration section in which the wind speed increases.

In wind power generation, if the windmill continues to be driven at the rotational speed that yields the maximum regenerative power for the wind speed even though the wind speed is decreasing, the power generation load can cause the rotation of the windmill to stall. One possible countermeasure is, for example, not to maximize the regenerative power in the deceleration section but instead to maintain the rotational speed so that the rotation does not stall, and then to increase the rotational speed during acceleration so that higher regenerative power is obtained. Controlling the windmill in this way can increase the total power generation amount.

Therefore, in the present embodiment, the reward calculation unit 63 calculates a higher reward when the sum of the regenerative power in the predetermined target section is equal to or greater than a predetermined power threshold, and a lower reward when that sum is less than the power threshold. This allows the learning device 70 to learn to output power control parameters that increase the sum of the regenerative power in the target section. For example, the learning device 70 can learn at which point in the deceleration section to switch from control that maximizes the regenerative power to control that maintains the rotational speed, and at which point in the acceleration section to switch back from control that maintains the rotational speed to control that maximizes the regenerative power, so that the total power generation amount increases.

FIG. 7 is a diagram illustrating target sections in the wind power generation system 1 according to the second embodiment. FIG. 7(a) schematically shows the change in wind speed over time. FIG. 7(b) shows an example of the change in wind speed over time. In FIGS. 7(a) and 7(b), the horizontal axis indicates time [min], and the vertical axis indicates the wind speed [m/s] at the windmill 20.

As shown in FIG. 7(a), the wind speed reaches an acceleration peak P1, then turns to deceleration and reaches a deceleration peak P2, and then turns to acceleration and reaches an acceleration peak P3. The wind speed thus changes while alternating between deceleration and acceleration. The reward calculation unit 63 treats the period from the acceleration peak P1 to the deceleration peak P2 as a deceleration section and the period from the deceleration peak P2 to the acceleration peak P3 as an acceleration section, and extracts a target section that combines the deceleration section with the acceleration section that follows it.

As shown in FIG. 7(b), when the wind speed reaches an acceleration peak at time T1, then decelerates and reaches an acceleration peak again at time T2, the target section is the period from time T1 to T2 (hereinafter simply written as "time T1 to T2"). Similarly, time T2 to T3, time T3 to T4, and so on are each target sections. The length of a target section may be any duration determined by the wind conditions, and may differ from one target section to another. A case such as time T4 to T5, in which the wind speed decelerates, holds for a while, and then decelerates again, may also be treated as a deceleration section and combined with the subsequent acceleration section to form a target section. A section may also be treated as a target section when the deceleration section makes up an extremely small share of it, as in time T5 to T6, or when the acceleration section makes up an extremely small share of it, as in time T6 to T7.

When extracting a target section, the reward calculation unit 63 calculates, for example, the wind speed difference (V(n) − V(n−1)) between the wind speed V(n) detected by the state detection unit 62 and the wind speed V(n−1) detected by the state detection unit 62 immediately before it. From this difference it determines whether the wind speed is decelerating, at a deceleration peak, accelerating, or at an acceleration peak. When the wind speed difference is negative, the reward calculation unit 63 determines that the wind speed is decelerating. When the wind speed difference changes from a negative value to 0 (zero), or to a value within a predetermined range close to 0 (zero), it determines that the wind speed is at a deceleration peak. When the wind speed difference is positive, it determines that the wind speed is accelerating. When the wind speed difference changes from a positive value to 0 (zero), or to a value within a predetermined range close to 0 (zero), it determines that the wind speed is at an acceleration peak.
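The classification by wind speed difference can be sketched as below. The function names, the tolerance `eps` standing in for the "predetermined range close to 0", and the extra `"steady"` label for a near-zero difference that follows another near-zero difference are assumptions; the text defines only the four states.

```python
def classify(prev_diff, diff, eps=0.05):
    """Classify the wind state from two consecutive wind speed differences.

    prev_diff : V(n-1) - V(n-2)
    diff      : V(n) - V(n-1)
    eps       : half-width of the range treated as "close to zero" (assumed value)
    """
    if abs(diff) <= eps:
        if prev_diff < -eps:
            return "deceleration_peak"  # difference moved from negative to near zero
        if prev_diff > eps:
            return "acceleration_peak"  # difference moved from positive to near zero
        return "steady"                 # assumption: case not described in the text
    return "decelerating" if diff < 0 else "accelerating"


def label_samples(speeds, eps=0.05):
    """Label each sample after the first in a sequence of wind speeds [m/s]."""
    labels = []
    prev_diff = 0.0
    for n in range(1, len(speeds)):
        diff = speeds[n] - speeds[n - 1]
        labels.append(classify(prev_diff, diff, eps))
        prev_diff = diff
    return labels
```

Note that this follows the rule as written, so a peak is reported only when a sample with a near-zero difference is observed. If the difference jumps directly from positive to negative between two samples, no peak label is produced, and a practical detector would likely need to handle that sign change as well.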

The period at which the wind speed is detected may be set arbitrarily according to the control period for the wind power generator main body 10 (for example, 10 [ms]), the wind conditions, and so on. In addition, because a certain delay can be expected before the regenerative power corresponding to the wind conditions is output, the reward calculation unit 63 may calculate the reward based on the regenerative power over a time window obtained by shifting the time of the target section by a predetermined delay time. The delay time in this case may be constant regardless of the wind speed, or may vary according to the wind speed.

In the above description, the reward calculation unit 63 calculates the reward based on the magnitude of the sum of the regenerative power in the target section, but the calculation is not limited to this. The reward calculation unit 63 may calculate a higher reward when the rotational speed is maintained without stalling in the deceleration section, and a lower reward when the rotational speed stalls in the deceleration section.

In the above description, the reward calculation unit 63 extracts the target section based on the wind speed, but the extraction is not limited to this. The reward calculation unit 63 may extract the target section using the rotational speed instead of the wind speed, or using both the wind speed and the rotational speed.

The reward calculation unit 63 may also decide, based on the wind speed, whether to calculate the reward from the total power generation amount in the target section. For example, when the wind speed is below a predetermined strong wind threshold, that is, when the wind is not strong, the reward calculation unit 63 decides to calculate the reward from the total power generation amount in the target section. When the wind speed is equal to or higher than the strong wind threshold, that is, when the wind is strong, it decides not to calculate the reward from the total power generation amount in the target section. This is because attempting to increase the power generation amount during a strong wind could cause the windmill 20 to overspeed.

FIG. 8 is a flowchart illustrating an operation example of the control device 60 according to the second embodiment.
First, the parameter acquisition unit 61 of the control device 60 acquires a power control parameter from the reinforcement learning unit 71 (step S20).
Next, the state detection unit 62 detects the wind speed, the rotational speed, and the regenerative power (step S21). It detects the wind speed and the rotational speed by acquiring the wind speed detected by the wind speed sensor 41 and the rotational speed detected by the rotation speed sensor 42. It detects the regenerative power by acquiring the voltage of the regenerative power detected by the voltage detection unit 32 and the current of the regenerative power detected by the current detection unit 33. The state detection unit 62 outputs the detected wind speed, rotational speed, and regenerative power to the reward calculation unit 63.

Next, the reward calculation unit 63 calculates a reward.
First, the reward calculation unit 63 determines whether the wind speed is at an acceleration peak (step S22). If the wind speed is at an acceleration peak, the reward calculation unit 63 determines whether the sum of the regenerative power in the target section (the total power generation amount) is equal to or greater than a predetermined power threshold (step S23). If the total power generation amount is equal to or greater than the power threshold, the reward calculation unit 63 sets the second-level reward (step S24). If the total power generation amount is less than the power threshold, the reward calculation unit 63 sets the first-level reward, which is lower than the second level (step S25).
The reward calculation unit 63 then clears the total power generation amount and returns to the process of step S20 (step S26).

On the other hand, if the wind speed is not at an acceleration peak in step S22, the reward calculation unit 63 adds the detected regenerative power to the total power generation amount and returns to the process of step S20 (step S27).
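The accumulation loop of steps S22 to S27 can be sketched as a small stateful object. The class name, the method name, and the numeric reward values are hypothetical; the text specifies only that the second-level reward is higher than the first-level reward. As in the flowchart description, the sample taken at the acceleration peak itself is not added to the total.

```python
class SectionRewardCalculator:
    """Accumulates regenerative power over a target section (steps S22 to S27)."""

    def __init__(self, power_threshold, second_level=1.0, first_level=0.0):
        self.power_threshold = power_threshold
        self.second_level = second_level  # higher reward (assumed value)
        self.first_level = first_level    # lower reward (assumed value)
        self.total = 0.0                  # total power generation amount so far

    def step(self, at_acceleration_peak, regen_power):
        """Process one sample; return a reward at a section end, otherwise None."""
        if at_acceleration_peak:                     # S22: section ends here
            if self.total >= self.power_threshold:   # S23
                reward = self.second_level           # S24
            else:
                reward = self.first_level            # S25
            self.total = 0.0                         # S26: clear for the next section
            return reward
        self.total += regen_power                    # S27: keep accumulating
        return None
```

A caller would invoke `step` once per detection cycle and forward any non-`None` result to the learner, so one reward is issued per target section.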

<Third Embodiment>
Next, a third embodiment will be described.
This embodiment differs from the other embodiments in that, when the rotational speed of the windmill 20 is controlled, different control is performed depending on the wind conditions. The following describes the points that differ from the embodiments described above. Components having the same or similar functions as those in the embodiments described above are given the same reference numerals, and their description is omitted.

In the present embodiment, the wind conditions are classified into deceleration sections, acceleration sections, and strong wind sections, and the learning device 70 is trained to control the rotational speed of the windmill 20 according to each of the classified sections. Here, the deceleration section and the acceleration section are the same as those in the second embodiment. The strong wind section is a section in which the wind speed is equal to or higher than a predetermined strong wind determination threshold.

In wind power generation, it is desirable in the acceleration section to drive the windmill at the rotational speed that yields the maximum regenerative power for the wind speed. The rotational speed of the windmill 20 tends to increase as the wind speed increases, but the inertia of the windmill causes a delay relative to the wind speed, and this delay is not constant. As a result, even when the windmill is driven at the rotational speed corresponding to the wind speed, the expected maximum regenerative power is not always obtained. For this reason, control in the acceleration section should take the inertia of the windmill into account.

In the deceleration section, if the windmill continues to be driven at the rotational speed that yields the maximum regenerative power for the wind speed, the power generation load can cause the rotation of the windmill to stall. For this reason, control in the deceleration section should take into account the degree of change (deceleration) in the wind speed, that is, the amount of change in wind speed per unit time.

In the strong wind section, the rotational speed is reduced so that the windmill 20 does not overspeed, but reducing it too much can lower the regenerative power. For this reason, control in the strong wind section should take into account both the degree of change (acceleration) in the wind speed, that is, the amount of change in wind speed per unit time, and the regenerative power.

Therefore, in the present embodiment, in the acceleration section, the target value of the rotational speed of the windmill 20 determined by the wind speed is used as a reference, and a predetermined range of rotational speeds that includes this reference target value is used as the target range. The rotational speed of the windmill 20 is controlled within the target range, and a higher reward is given when larger regenerative power is obtained, so that the learning device learns to search for a target value better suited to the wind conditions.

Specifically, the reinforcement learning unit 71 acquires information indicating the wind speed and the rotational speed, which represent the state of the wind power generator main body 10. Based on this information and on information indicating a predetermined range of rotational speeds that can be set for the windmill 20 and that includes the regenerative power target value, the reinforcement learning unit 71 advances its learning, using the reward as a clue, so that a higher reward is obtained, and outputs a more appropriate power control parameter. Here, the predetermined range of rotational speeds that can be set for the windmill 20 is a predetermined range that includes the rotational speed at which the regenerative power determined for each wind speed is maximized. This range preferably lies within the range of rotational speeds permitted for the windmill 20, which is determined based on, for example, the mechanical durability limits of the windmill 20 and the generator 30.

In the acceleration section, the reward calculation unit 63 calculates a higher reward when larger regenerative power is obtained. Thus, in the acceleration section, the reinforcement learning unit 71 outputs a rotation speed control parameter within the target range and, according to the rotation state of the windmill 20 when that parameter is set on the windmill 20, learns control that yields larger regenerative power.

In this embodiment, in the deceleration section, the reward calculation unit 63 calculates the reward according to the degree of wind speed deceleration. Specifically, it calculates a higher reward when the change in the rotational speed of the windmill 20 is smaller relative to the degree of wind speed deceleration. Thus, in the deceleration section, the reinforcement learning unit 71 learns control that keeps the windmill 20 from stalling even when the wind speed decreases.
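The deceleration-section reward can be illustrated with a minimal sketch. The description only requires that a smaller rotational speed change relative to the wind speed deceleration earn a higher reward; the function name `deceleration_reward`, the `1 / (1 + ratio)` mapping, and the `eps` guard are assumptions made for illustration and are not taken from the patent.

```python
def deceleration_reward(wind_start, wind_end, rpm_start, rpm_end, eps=1e-6):
    """Reward for one deceleration section (hypothetical formula)."""
    # Degree of wind speed deceleration over the section, guarded against zero.
    wind_drop = max(wind_start - wind_end, eps)
    # Magnitude of the rotational speed change over the same section.
    rpm_change = abs(rpm_end - rpm_start)
    # A smaller change per unit of wind deceleration maps to a reward nearer 1.
    return 1.0 / (1.0 + rpm_change / wind_drop)
```

With this mapping, a windmill that holds its rotational speed through a lull scores close to 1, while one that loses speed quickly scores close to 0.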

The reward calculation unit 63 also calculates a higher reward when the sum of the regenerative power over the section combining a deceleration section and the subsequent acceleration section (the deceleration target section) is larger. Thus, the reinforcement learning unit 71 learns control such that, even if the regenerative power falls in the deceleration section, larger regenerative power is output in the subsequent acceleration section.

In this embodiment, for the strong wind section, a higher reward is calculated when the sum of the regenerative power is larger over the section (the strong wind target section) that combines the strong wind acceleration section, in which the wind speed accelerates during strong wind, and the subsequent deceleration section. Thus, the reinforcement learning unit 71 learns control that avoids decelerating the windmill so much in the strong wind section that the regenerative power becomes small in the subsequent deceleration section.

In the strong wind section, the reward calculation unit 63 calculates the lowest reward when the rotational speed of the windmill 20 becomes excessive (overspeed).
Thereby, the reinforcement learning unit 71 learns control that does not cause overspeed in the strong wind section.

FIG. 9 is a flowchart illustrating an operation example of the control device 60 according to the third embodiment.
First, the parameter acquisition unit 61 of the control device 60 acquires a rotation speed control parameter within the target range from the reinforcement learning unit 71 (step S30).
Next, the state detection unit 62 detects the wind speed, the rotational speed, and the regenerative power (step S31).
Next, the reward calculation unit 63 determines, based on the acquired wind speed, whether to extract a section (step S32). It determines that a section is to be extracted when the acquired wind speed has changed from strong wind to normal (non-strong) wind, when the wind speed has reached an acceleration peak, or when it has reached a deceleration peak. If no section is extracted, the process returns to step S30.

When extracting a section, the reward calculation unit 63 extracts as a strong wind section the period from when the wind speed becomes equal to or higher than a predetermined strong wind determination threshold until it falls below that threshold. It extracts as a deceleration section the period from an acceleration peak of the wind speed until the subsequent deceleration peak, and as an acceleration section the period from a deceleration peak until the subsequent acceleration peak.
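The section extraction rules can be sketched as follows. This is a simplified illustration over a recorded list of wind speed samples rather than the streaming detection a real controller would need. The function name `extract_sections`, the tuple layout, and the treatment of flat samples are assumptions, not details from the patent.

```python
def extract_sections(wind, strong_threshold):
    """Return (kind, start_index, end_index) tuples for a wind speed series.

    end_index is the sample at which the section closes: the first sample
    below the threshold for "strong", the next turning point otherwise.
    """
    sections = []

    # Strong wind sections: from the first sample at or above the threshold
    # to the first later sample below it.
    start = None
    for i, v in enumerate(wind):
        if v >= strong_threshold and start is None:
            start = i
        elif v < strong_threshold and start is not None:
            sections.append(("strong", start, i))
            start = None

    # Acceleration / deceleration sections between successive turning points.
    direction = 0      # +1 while rising, -1 while falling, 0 before any change
    last_turn = None   # index of the previous acceleration or deceleration peak
    for i in range(1, len(wind)):
        diff = wind[i] - wind[i - 1]
        if diff == 0:
            continue   # flat samples do not change the direction
        new_direction = 1 if diff > 0 else -1
        if direction != 0 and new_direction != direction:
            turn = i - 1  # the wind speed peaked (or bottomed out) here
            if last_turn is not None:
                kind = "acceleration" if direction > 0 else "deceleration"
                sections.append((kind, last_turn, turn))
            last_turn = turn
        direction = new_direction
    return sections
```

A section is reported only once its closing peak has been observed, which mirrors the flowchart's behaviour of returning to step S30 while no section can yet be extracted.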

The reward calculation unit 63 determines whether the extracted section is a strong wind section (step S33). If it is, the reward calculation unit 63 calculates a reward according to the sum of the regenerative power over the strong wind acceleration section and the subsequent deceleration section (step S34). If it is not, the reward calculation unit 63 determines whether the extracted section is a deceleration section (step S35).
If the extracted section is a deceleration section, the reward calculation unit 63 calculates a reward according to the sum of the regenerative power over the deceleration section and the subsequent acceleration section (step S36).
If the extracted section is not a deceleration section, that is, if it is an acceleration section, the reward calculation unit 63 calculates a reward according to the regenerative power (step S37).
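Steps S33 to S37, together with the overspeed rule for the strong wind section, can be sketched as a single dispatch function. The name `section_reward`, the use of the raw power sum as the reward, and the value of `LOWEST_REWARD` are illustrative assumptions; the patent states only that the reward depends on the summed regenerative power and that overspeed earns the lowest reward.

```python
LOWEST_REWARD = -1.0  # assumed value; only "lowest" is specified


def section_reward(kind, power, following_power=(), overspeed=False):
    """Reward for one extracted section (hypothetical scaling).

    kind            -- "strong", "deceleration", or "acceleration"
    power           -- regenerative power samples in the extracted section
    following_power -- samples from the section that follows it
    overspeed       -- True if the windmill exceeded its speed limit
    """
    if kind == "strong":
        if overspeed:
            return LOWEST_REWARD
        # S34: strong wind acceleration section plus the following deceleration.
        return sum(power) + sum(following_power)
    if kind == "deceleration":
        # S36: deceleration section plus the following acceleration section.
        return sum(power) + sum(following_power)
    # S37: an acceleration section is rewarded on its own regenerative power.
    return sum(power)
```

Summing over the paired sections is what lets the learner accept a small loss in one section in exchange for a larger gain in the next.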

<Fourth Embodiment>
Next, a fourth embodiment will be described.
This embodiment differs from the other embodiments in that the limit wind speed is changed according to the wind conditions when the rotation speed of the windmill 20 is controlled. In the following, the differences from the embodiments described above are explained; components having the same or similar functions as in those embodiments are given the same reference numerals, and their description is omitted.

The limit wind speed is the upper limit of the wind speed at which the wind power generation system 1 can generate power. It is determined, for example, by applying a predetermined margin to the breaking wind speed, which is itself determined based on the mechanical durability limits of the windmill 20 and the generator 30. Here, the breaking wind speed is a wind speed high enough to damage or destroy the windmill 20 or the generator 30.

In the wind power generation system 1, operation stops when the wind speed received by the windmill 20 reaches the limit wind speed. In this case, the wind power generation system 1 forcibly stops operation regardless of whether power generation is under control. Consequently, even when power generation is being controlled during strong wind, operation must stop once the wind speed reaches the limit wind speed, which can reduce the regenerative power. It is therefore desirable that, when power generation is appropriately controlled even in strong wind, the limit wind speed be moved closer to the breaking wind speed so that power generation can continue.

Therefore, in this embodiment, based on the degree of control stability in the strong wind section, the limit wind speed is changed toward the breaking wind speed when stable control is being performed, even if the wind speed is approaching the limit wind speed. A higher reward is given when control is more stable, so that the system learns to perform more stable control suited to the wind conditions even in strong wind.

Specifically, in the strong wind section, the reward calculation unit 63 calculates the difference between the rotational speed of the windmill 20 and the maximum rotational speed, as well as the rate of change of the rotational speed of the windmill 20, and calculates the reward based on the calculated difference and rate of change.

When the difference between the rotational speed of the windmill 20 and the maximum rotational speed is less than a predetermined margin threshold, the reward calculation unit 63 calculates a higher reward (second level) if the rate of change of the rotational speed is less than a predetermined change threshold, and a lower reward (first level) if the rate of change is equal to or greater than the change threshold. This is because a low rate of change indicates that the rotation of the windmill 20 is being controlled more stably than a high rate of change does.

When the difference between the rotational speed of the windmill 20 and the maximum rotational speed is equal to or greater than the margin threshold, the reward calculation unit 63 calculates a reward (third level) that is higher than the reward given when the rate of change is equal to or greater than the change threshold (first level) and lower than the reward given when the rate of change is less than the change threshold (second level). This is because, when the difference is at or above the margin threshold, the control can be judged to lean too far toward the safe side for strong wind conditions.

Here, the maximum rotational speed of the windmill 20 is the highest rotational speed at which the windmill 20 can be rotated. The maximum rotational speed is, for example, the upper limit in the strength design.

In the control device 60 of this embodiment, when, in the strong wind section, the difference between the rotational speed of the windmill 20 and the maximum rotational speed is less than the predetermined margin threshold and the rate of change of the rotational speed of the windmill 20 is less than the predetermined change threshold, the limit wind speed is changed toward the breaking wind speed. Specifically, the limit wind speed stored in a wind speed information storage unit (not shown) is rewritten. Control is then performed according to the updated limit wind speed. That is, the control device 60 compares the wind speed detected by the state detection unit 62 with the limit wind speed stored in the wind speed information storage unit and, when the wind speed is equal to or higher than the limit wind speed, stops the operation of the wind power generator main body 10.

The limit wind speed is preferably changed in steps, based on a predetermined limit wind speed change value. The limit wind speed change value may be set arbitrarily based on the structure of the windmill 20, the site conditions, the season, and the like.

FIG. 10 is a flowchart illustrating an operation example of the control device 60 according to the fourth embodiment.
First, the parameter acquisition unit 61 of the control device 60 acquires a rotation speed control parameter from the reinforcement learning unit 71 (step S40).
Next, the state detection unit 62 detects the wind speed, the rotational speed, and the regenerative power (step S41).
Next, the reward calculation unit 63 determines whether the acquired wind speed is a strong wind (step S42).

If the acquired wind speed is a strong wind, the reward calculation unit 63 calculates the difference between the rotational speed acquired from the state detection unit 62 and the maximum rotational speed, and determines whether the calculated difference is equal to or greater than the margin threshold (step S43). If it is, the reward calculation unit 63 calculates the third level reward (step S44).

If the calculated difference is less than the margin threshold, the reward calculation unit 63 calculates the rate of change of the rotational speed acquired from the state detection unit 62 and determines whether the calculated rate of change is less than the predetermined change threshold (step S45). If it is, the reward calculation unit 63 calculates the second level reward (the highest level) (step S46). In this case, the control device 60 changes the limit wind speed toward the breaking wind speed (step S48).
If the calculated rate of change is equal to or greater than the change threshold, the reward calculation unit 63 calculates the first level reward (the lowest level) (step S47).
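The strong wind branch of FIG. 10 (steps S43 to S48) can be sketched as below. The numeric reward values, the class `LimitWindSpeedStore`, its `min_margin` field, and the sampling interval `dt` are assumptions introduced for illustration. The patent specifies only the ordering first level < third level < second level and a stepwise change of the limit wind speed toward the breaking wind speed.

```python
from dataclasses import dataclass

# Assumed reward values; only the ordering LEVEL_1 < LEVEL_3 < LEVEL_2 is given.
LEVEL_1, LEVEL_3, LEVEL_2 = 0.0, 0.5, 1.0


@dataclass
class LimitWindSpeedStore:
    """Hypothetical stand-in for the wind speed information storage unit."""
    limit: float       # current limit wind speed
    breaking: float    # breaking wind speed
    step: float        # limit wind speed change value (one step)
    min_margin: float  # assumed margin always kept below the breaking speed

    def raise_limit(self):
        # Move one step toward the breaking wind speed without passing the margin.
        self.limit = min(self.limit + self.step, self.breaking - self.min_margin)


def strong_wind_step(rpm, prev_rpm, dt, max_rpm,
                     margin_threshold, change_threshold, store):
    """Return the reward for one strong wind sample; may raise the limit."""
    margin = max_rpm - rpm
    if margin >= margin_threshold:         # S43 -> S44: overly cautious control
        return LEVEL_3
    change_rate = abs(rpm - prev_rpm) / dt  # S45
    if change_rate < change_threshold:      # S46 and S48: stable near the maximum
        store.raise_limit()
        return LEVEL_2
    return LEVEL_1                          # S47: unstable near the maximum
```

Keeping the limit update inside the highest-reward branch ties the relaxation of the shutdown threshold directly to evidence of stable control.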

(Fifth embodiment)
Next, a fifth embodiment will be described.
This embodiment differs from the embodiments described above in that the control device 60 controls the rotation speed of the windmill 20 using a learned model.
FIG. 11 is a block diagram illustrating an example of a schematic configuration of a wind power generation system 1A according to a modification of the fifth embodiment. As illustrated in FIG. 11, the control device 60A includes a learned model storage unit 65, a determination unit 66, and a control unit 67.

The learned model storage unit 65 stores a learned model. The learned model is a database in which information (relation information) is stored that indicates the relationship between the state of the wind power generator main body 10, which is the control target, and the control applied to the wind power generator main body 10. Given the state of the wind power generator main body 10, the learned model estimates a parameter indicating an index for controlling the wind power generator main body 10 in that state (hereinafter referred to as a control index parameter).

Here, the control index parameter is information that serves as an index for controlling the wind power generator main body 10. It may be a control parameter itself or information used to derive a control parameter.

For example, when the control index parameter serves as an index for controlling the rotation of the windmill 20, it may be the rotation speed control parameter itself, a numerical value of the rotation speed or rotational speed of the windmill 20, or a relative indication of how to control the rotation speed of the windmill 20, such as increasing or decreasing the rotation speed or rotational speed.

Likewise, when the control index parameter serves as an index for controlling the regenerative power, it may be the power control parameter itself, a target value of the regenerative power, or a relative indication of how to control the regenerative power, such as increasing or decreasing it.

The learned model may be, for example, a model created through the learning performed by the reinforcement learning unit 71 in the embodiments described above. Alternatively, it may be a model that has learned the relationship between the state and the control of a wind power generation system for another windmill that has a structure similar to that of the windmill 20 and is installed in a region similar to the region where the windmill 20 is installed.

The determination unit 66 determines control information relating to the control of the wind power generator main body 10 based on the acquired control index parameter. The control information here indicates the control determined according to the control index parameter; it is, for example, rotation information relating to the rotation of the windmill, or power information relating to the regenerative power. That is, the determination unit 66 determines the rotation information based on a rotation index parameter, and determines the power information based on a power index parameter. The determination unit 66 outputs the determined rotation information to the control unit 67.

The rotation information here includes, for example, information indicating a change in the rotation speed, such as whether to increase or decrease the rotation speed of the windmill, as well as information indicating how the rotation speed is to be changed, such as in steps or all at once.

Similarly, the power information includes, for example, information indicating a change in the regenerative power, such as whether to increase or decrease it, as well as information indicating the degree to which the regenerative power is to be changed, such as in steps or all at once.

The control unit 67 determines a control parameter for controlling the wind power generator main body 10 based on the control information determined by the determination unit 66. For example, based on the rotation information determined by the determination unit 66, the control unit 67 determines a rotation speed control parameter that controls the rotation of the windmill 20 so that the rotation stays within the allowable range and approaches the target. Likewise, based on the power information determined by the determination unit 66, the control unit 67 determines, for example, a power control parameter that controls the regenerative power so that it approaches the target. The control unit 67 outputs the determined control parameter to the wind power generator main body 10 via the parameter acquisition unit 61.

Here, the allowable range for the rotation of the windmill 20 is the range permitted for the rotation speed of the windmill 20, for example the region in FIG. 4 where the rotation speed is lower than characteristic EF and characteristic FG, that is, the region enclosed by AEFGC in FIG. 4. The target is the target value of the rotation speed of the windmill 20, for example a value along characteristic EF and characteristic FG in FIG. 4.

(Sixth embodiment)
Next, a sixth embodiment will be described.
This embodiment differs from the embodiments described above in that the rotation speed of the windmill 20 is controlled using either the control index parameter (hereinafter simply referred to as the parameter) output by the control device 60 using the learned model or the parameter output by the learning device 70.
FIG. 12 is a block diagram illustrating an example of a schematic configuration of a wind power generation system 1B according to a modification of the sixth embodiment. As illustrated in FIG. 12, the control device 60B includes a selection unit 68.

The selection unit 68 outputs to the determination unit 66 either the parameter output from the learned model stored in the learned model storage unit 65 or the parameter output by the learning device 70. The selection unit 68 may decide which of the two to select according to a predetermined phase. For example, in the learning phase, in which the learning device 70 is made to learn the control of the rotation speed of the windmill 20, the selection unit 68 selects the parameter output by the learning device 70. In the control phase, in which a learned model that has learned the control of the rotation speed of the windmill 20 is stored in the learned model storage unit 65 and is used to control the rotation speed of the windmill 20, the selection unit 68 selects the parameter output from that learned model.
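The phase-based selection can be sketched in a few lines. The `Phase` enum and the function `select_parameter` are hypothetical names. The patent states only that the choice between the two parameter sources may follow a predetermined phase.

```python
from enum import Enum


class Phase(Enum):
    LEARNING = "learning"  # the learning device 70 is still being trained
    CONTROL = "control"    # the stored learned model drives the windmill


def select_parameter(phase, learner_output, model_output):
    """Pass on the parameter source that matches the current phase."""
    if phase is Phase.LEARNING:
        return learner_output  # parameter output by the learning device
    return model_output        # parameter output from the stored learned model
```

For instance, `select_parameter(Phase.CONTROL, learner_output=0.8, model_output=0.6)` forwards the learned model's value to the determination unit.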

In at least one of the embodiments described above, the content learned by the reinforcement learning unit 71 may be stored in the learned model storage unit 65 or in another storage unit (not shown), and learning may be continued based on the stored content. In this way, a model that has learned a certain amount of basic control common to windmills 20 can be further trained to perform control suited to the wind conditions of the region where the windmill 20 is installed, seasonal wind conditions, differences in wind conditions between day and night, the weather, and so on.
In at least one of the embodiments described above, the case where the rotation speed control parameter is used as the parameter for controlling the rotation speed of the windmill 20 has been described as an example, but the invention is not limited to this. The control system 50 may control the rotational speed, the rotation time, or the like as the parameter for controlling the rotation speed of the windmill 20. In this case, the parameter for controlling the rotation speed of the windmill 20 may be, for example, a rotational speed parameter or a rotation time parameter. "Rotation control parameter" may be used as a general term for such parameters that control the rotation speed of the windmill 20. That is, the rotation speed control parameter is an example of a "rotation control parameter".

As described above, the control system 50 of the first embodiment includes: the reinforcement learning unit 71, which learns correspondence information between wind speed and rotation based on wind speed information indicating the wind speed at the installation location of the windmill 20 (for example, the wind speed detected by the wind speed sensor 41), rotation information relating to the rotation of the windmill 20 (for example, the rotational speed detected by the rotational speed sensor 42), and relation information between wind speed and rotation that indicates an allowable range and a target (for example, the relationship between wind speed and rotation speed shown in FIG. 4); the learned model storage unit 65, which stores the correspondence information between wind speed and rotation; the state detection unit 62, which detects the wind speed when a rotation control parameter for controlling the rotation speed of the windmill is set on the windmill 20; the determination unit 66, which determines the rotation information based on the wind speed detected by the state detection unit 62 and the correspondence information; and the control unit 67, which controls the rotation of the windmill 20 based on the rotation information determined by the determination unit 66 so that the rotation stays within the allowable range and approaches the target.
Thus, the control system 50 of the embodiment can perform control so as to optimize the rotational speed of the windmill 20 even in unstable wind conditions.

The control system 50 of the first embodiment further includes the reward calculation unit 63, which calculates a reward according to a predetermined reward condition based on the wind speed detected by the state detection unit 62 and the rotational speed of the windmill 20, and the reinforcement learning unit 71 is a reinforcement learning model that learns the correspondence information between wind speed and rotation (for example, the rotation speed control parameter) based on the reward. Thus, the control system 50 of the embodiment can learn more appropriate control using the reward as a clue.

In the control system 50 of the first embodiment, the reward calculation unit 63 calculates a first level reward when the wind speed detected by the state detection unit 62 is a strong wind and the rotational speed of the windmill 20 is equal to or higher than a first threshold. When the detected wind speed is a strong wind and the rotational speed of the windmill 20 is equal to or higher than a second threshold that is smaller than the first threshold, the reward calculation unit 63 calculates a second level reward, which is higher than the first level. When the detected wind speed is a strong wind and the rotational speed of the windmill 20 is less than the second threshold, the reward calculation unit 63 calculates a third level reward, which is higher than the first level and lower than the second level. This allows the control system 50 of the embodiment to control the windmill so that the rotational speed does not become excessive in strong wind. In addition, when the rotational speed is not excessive, the system can be trained to keep the rotational speed in the appropriate range rather than letting it fall short. As a result, forced stops are suppressed even in strong wind, when overspeed is likely, and the system can be trained to keep the generated power at its maximum.
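The three reward levels for strong wind can be written as a short threshold check. The numeric reward values and the function name `strong_wind_reward` are assumptions. The patent gives only the ordering first level < third level < second level and the two thresholds.

```python
# Assumed reward values; only the ordering LEVEL_1 < LEVEL_3 < LEVEL_2 is given.
LEVEL_1, LEVEL_3, LEVEL_2 = 0.0, 0.5, 1.0


def strong_wind_reward(rpm, first_threshold, second_threshold):
    """Reward in strong wind; requires second_threshold < first_threshold."""
    if rpm >= first_threshold:
        return LEVEL_1  # overspeed: lowest reward
    if rpm >= second_threshold:
        return LEVEL_2  # appropriate range: highest reward
    return LEVEL_3      # too slow: better than overspeed, worse than ideal
```

Ranking the too-slow case between the other two is what steers the learner away from simply braking hard whenever the wind is strong.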

The learning device 70 of the first embodiment includes a reinforcement learning unit 71 that learns correspondence information between wind speed and rotation, based on wind speed information indicating the wind speed at the installation location of the windmill 20 of the wind power generation system 1 and rotation information relating to the rotation of the windmill 20. The learning device 70 can therefore learn how the rotation of the windmill should be controlled according to the state of the wind speed and the rotation of the windmill, so that the rotation of the windmill can be controlled more appropriately even when wind conditions are unstable.

The control device 60 of the first embodiment includes: a state detection unit 62 that detects the wind speed when a rotation control parameter for controlling the rotation speed of the windmill 20 of the wind power generation system 1 is set in the windmill 20; a determination unit 66 that determines rotation information relating to the rotation of the windmill, based on the wind speed detected by the state detection unit 62 and on correspondence information between the wind speed and the rotation of the windmill (for example, a learned model stored in the learned model storage unit 65); and a control unit 67 that controls the rotation of the windmill 20 based on the rotation information determined by the determination unit 66. When a state such as the wind speed resembles a state that the learned model has already learned, the control device 60 of the embodiment can thereby perform control according to the control parameter output from the learned model, which allows more appropriate control.
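The flow from state detection to parameter determination described above can be sketched as follows. The specification does not fix the form of the learned model, so this sketch assumes a simple value table keyed by a discretized wind speed and a candidate parameter. The names `LearnedModel`, `wind_bin`, and `determine_parameter`, and the bin width, are hypothetical.

```python
from typing import Dict, Sequence, Tuple

# Hypothetical learned model: (wind-speed bin, candidate parameter) -> learned value.
LearnedModel = Dict[Tuple[int, float], float]


def wind_bin(wind_speed: float, bin_width: float = 2.0) -> int:
    """Discretize a detected wind speed into a state index (assumed scheme)."""
    return int(wind_speed // bin_width)


def determine_parameter(
    model: LearnedModel,
    wind_speed: float,
    candidates: Sequence[float],
    bin_width: float = 2.0,
) -> float:
    """Pick the candidate rotation control parameter with the highest learned value.

    Pairs that were never learned default to 0.0, so for an unseen state
    the first candidate is returned.
    """
    state = wind_bin(wind_speed, bin_width)
    return max(candidates, key=lambda p: model.get((state, p), 0.0))
```

In this sketch the determination step corresponds to the role of the determination unit 66, and the returned value would be handed to whatever plays the role of the control unit 67. Nearby wind speeds fall into the same bin, which is one simple way to treat a detected state as similar to a learned one.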

As described above, the control system 50 of the second embodiment includes: a reinforcement learning unit 71 that learns correspondence information among wind speed, rotation, and regenerative power, based on wind speed information indicating the wind speed at the installation location of the windmill 20 (for example, the wind speed detected by the wind speed sensor 41), rotation information relating to the rotation of the windmill 20 (for example, the rotational speed detected by the rotational speed sensor 42), power information relating to the regenerative power generated by the power generation system (for example, the voltage of the regenerative power detected by the voltage detection unit 32 and the current of the regenerative power detected by the current detection unit 33), and relationship information among wind speed, rotation, and regenerative power that indicates a target for the regenerative power; a learned model storage unit 65 that stores the correspondence information among wind speed, rotation, and regenerative power; a state detection unit 62 that detects the change in regenerative power and the wind speed when the rectification/boost unit 31 is controlled based on a power control parameter for controlling the regenerative power; a determination unit 66 that determines power information based on the regenerative power and wind speed detected by the state detection unit 62 and on the correspondence information; and a control unit 67 that controls the regenerative power, based on the power information determined by the determination unit 66, so that it approaches the target of the power information. The control system 50 of the embodiment can thereby perform control so that the total generated power is maximized even when wind conditions change.

The control system 50 of the second embodiment further includes a reward calculation unit 63 that calculates a reward according to a predetermined reward condition, based on the wind speed detected by the state detection unit 62 and the regenerative power. The reinforcement learning unit 71 is a reinforcement learning model that learns correspondence information between wind speed and regenerative power (for example, a power control parameter) based on the reward. The control system 50 of the embodiment can thereby learn more appropriate control, using the reward as a clue.

In the control system 50 of the second embodiment, the reward calculation unit 63 calculates a first-level reward when the regenerative power detected by the state detection unit 62 is less than a predetermined power threshold, and calculates a second-level reward, higher than the first level, when the detected regenerative power is equal to or greater than the predetermined power threshold. The control system 50 of the embodiment can thereby perform control so that the regenerative power increases.

In the control system 50 of the second embodiment, the reward calculation unit 63 also extracts, based on the wind speed detected by the state detection unit 62, a target section that includes a deceleration section and the acceleration section that follows it. When the sum of the regenerative power detected by the state detection unit 62 over the target section is equal to or greater than a predetermined power threshold, the reward calculation unit 63 calculates the second-level reward. Continuing to output regenerative power during a deceleration section can cause the windmill to stall. The control system 50 of the second embodiment can learn control that accounts for this, such as suppressing the regenerative power output during the deceleration section and outputting more regenerative power during the acceleration section, so that the total regenerative power over the target section increases.

In the control system 50 of the second embodiment, the reward calculation unit 63 calculates a reward when the wind speed detected by the state detection unit 62 is less than a predetermined strong wind determination threshold. The control system 50 of the second embodiment can thereby suppress erroneous control in which an attempt to increase the regenerative power in strong wind as well leads to over-rotation.
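The section-based reward of the second embodiment can be sketched as follows. This is an illustration under assumptions, not the method of the specification. The text does not say how the target section is extracted, so `extract_target_section` simply takes the first run in which the sampled wind speed falls and then rises. The function names, thresholds, and reward values are hypothetical, and applying the strong wind gate to every sample in the section is also an assumption.

```python
from typing import Optional, Sequence, Tuple


def extract_target_section(wind: Sequence[float]) -> Optional[Tuple[int, int]]:
    """Return inclusive (start, end) indices of the first fall-then-rise run, or None."""
    n = len(wind)
    i = 0
    # Skip samples until the wind speed starts to fall.
    while i + 1 < n and wind[i + 1] >= wind[i]:
        i += 1
    start = i
    # Deceleration section: wind speed strictly decreasing.
    while i + 1 < n and wind[i + 1] < wind[i]:
        i += 1
    if i == start:
        return None  # no deceleration found
    bottom = i
    # Acceleration section: wind speed strictly increasing.
    while i + 1 < n and wind[i + 1] > wind[i]:
        i += 1
    if i == bottom:
        return None  # deceleration was not followed by acceleration
    return start, i


def section_reward(
    wind: Sequence[float],
    power: Sequence[float],
    *,
    strong_wind_threshold: float = 15.0,  # assumed value [m/s]
    power_threshold: float = 100.0,       # assumed value for the summed power
    first_level: float = 0.0,
    second_level: float = 1.0,
) -> Optional[float]:
    """Reward the summed regenerative power over the target section."""
    section = extract_target_section(wind)
    if section is None:
        return None
    start, end = section
    # No reward is calculated in strong wind (gate applied per section: assumption).
    if any(w >= strong_wind_threshold for w in wind[start:end + 1]):
        return None
    total = sum(power[start:end + 1])
    return second_level if total >= power_threshold else first_level
```

Because the reward depends on the sum over the whole section rather than on each sample, a policy that outputs little power while the wind is falling and more once it recovers can score as well as, or better than, one that draws power continuously and stalls the windmill.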

The learning device 70 of the second embodiment includes a reinforcement learning unit 71 that learns correspondence information among wind speed, rotation, and regenerative power, based on wind speed information indicating the wind speed at the installation location of the windmill 20 of the wind power generation system 1, rotation information relating to the rotation of the windmill 20, and power information relating to the regenerative power generated by the wind power generation system 1. The learning device 70 can therefore learn how the regenerative power should be controlled according to the state of the wind speed, the rotation of the windmill, and the regenerative power, so that the regenerative power can be controlled more appropriately even when wind conditions change.

The control device 60 of the second embodiment includes: a state detection unit 62 that detects the regenerative power and the wind speed when a power control parameter for controlling the regenerative power generated by the wind power generation system 1 is set in the rectification/boost unit 31; a determination unit 66 that determines power information relating to the regenerative power, based on the regenerative power and wind speed detected by the state detection unit 62 and on correspondence information among the wind speed, the rotation of the windmill, and the regenerative power (for example, a learned model stored in the learned model storage unit 65); and a control unit 67 that controls the regenerative power based on the power information determined by the determination unit 66. When a state such as the wind speed resembles a state that the learned model has already learned, the control device 60 of the embodiment can thereby perform control according to the control parameter output from the learned model, which allows more appropriate control.

As described above, the control system 50 of the third embodiment includes: a reinforcement learning unit 71 that learns correspondence information among wind speed, rotation, and regenerative power, based on wind speed information indicating the wind speed at the installation location of the windmill 20 (for example, the wind speed detected by the wind speed sensor 41), rotation information relating to the rotation of the windmill 20 (for example, the rotational speed detected by the rotational speed sensor 42), power information relating to the regenerative power generated by the power generation system (for example, the voltage of the regenerative power detected by the voltage detection unit 32 and the current of the regenerative power detected by the current detection unit 33), and relationship information among wind speed, rotation, and regenerative power that indicates the range of rotation speeds that can be set for the windmill, including a target rotation speed; a learned model storage unit 65 that stores the correspondence information among wind speed, rotation, and regenerative power; a state detection unit 62 that detects the rotational speed and the wind speed when control is performed based on a rotation control parameter; a determination unit 66 that determines rotation information based on the rotational speed and wind speed detected by the state detection unit 62 and on the correspondence information; and a control unit 67 that controls the rotation of the windmill 20 based on the rotation information determined by the determination unit 66. The control system 50 of the embodiment can thereby perform control so that the amount of generated power is maximized even when wind conditions change.

As described above, the control system 50 of the fourth embodiment includes: a reinforcement learning unit 71 that learns correspondence information among wind speed, rotation, and regenerative power, based on wind speed information indicating the wind speed at the installation location of the windmill 20 (for example, the wind speed detected by the wind speed sensor 41), rotation information relating to the rotation of the windmill 20 (for example, the rotational speed detected by the rotational speed sensor 42), power information relating to the regenerative power generated by the power generation system (for example, the voltage of the regenerative power detected by the voltage detection unit 32 and the current of the regenerative power detected by the current detection unit 33), and relationship information among wind speed, rotation, and regenerative power that indicates the wind speed and the maximum rotation speed at which the windmill can rotate; a learned model storage unit 65 that stores the correspondence information among wind speed, rotation, and regenerative power; a state detection unit 62 that detects the rotational speed and the wind speed when control is performed based on a rotation control parameter; a determination unit 66 that determines rotation information based on the rotational speed and wind speed detected by the state detection unit 62 and on the correspondence information; and a control unit 67 that controls the rotation of the windmill 20 based on the rotation information determined by the determination unit 66. The control system 50 of the embodiment can thereby perform control so that the total generated power is maximized even when wind conditions change.

All or part of the processing performed by each of the control system 50, the control device 60, and the learning device 70 in the embodiments described above may be implemented by a computer. In that case, a program for implementing these functions may be recorded on a computer-readable recording medium, and the functions may be realized by having a computer system read and execute the program recorded on that medium. The "computer system" here includes an OS and hardware such as peripheral devices. The "computer-readable recording medium" refers to portable media such as a flexible disk, a magneto-optical disk, a ROM, or a CD-ROM, and to storage devices such as a hard disk built into a computer system. The term may further include media that hold a program dynamically for a short time, such as a communication line used when the program is transmitted over a network such as the Internet or a communication line such as a telephone line, and media that hold the program for a certain period, such as volatile memory inside a computer system serving as the server or client in that case. The program may implement only some of the functions described above, may implement those functions in combination with a program already recorded in the computer system, or may be implemented using a programmable logic device such as an FPGA.

Embodiments of the present invention have been described in detail above with reference to the drawings. The specific configuration is not limited to these embodiments, however, and also includes designs and the like that do not depart from the gist of the invention.

DESCRIPTION OF SYMBOLS
1 Wind power generation system
10 Wind power generator main body
20 Windmill
30 Generator
31 Rectification/boost unit
32 Voltage detection unit
33 Current detection unit
41 Wind speed sensor
42 Rotational speed sensor
50 Control system
60 Control device
61 Parameter acquisition unit
62 State detection unit
63 Reward calculation unit
64 Reward output unit
65 Learned model storage unit
66 Determination unit
67 Control unit
70 Learning device
71 Reinforcement learning unit

Claims

A control system for controlling a windmill of a power generation system, comprising:
a learning unit that learns correspondence information between wind speed and rotation, based on wind speed information indicating the wind speed at the installation location of the windmill, rotation information relating to the rotation of the windmill, and relationship information between the wind speed and the rotation that indicates an allowable range and a target for the rotation;
a storage unit that stores the correspondence information between the wind speed and the rotation;
a state detection unit that detects the rotation speed and the wind speed when a rotation speed control parameter for controlling the rotation speed of the windmill is set in the windmill;
a determination unit that determines the rotation information based on the rotation speed and the wind speed detected by the state detection unit and on the correspondence information; and
a control unit that controls the windmill, based on the rotation information determined by the determination unit, so that the rotation of the windmill falls within the allowable range and approaches the target.

The control system according to claim 1, further comprising:
a reward calculation unit that calculates a reward according to a predetermined reward condition, based on the wind speed detected by the state detection unit and the rotational speed of the windmill,
wherein the relationship information is the reward calculated by the reward calculation unit, and
the learning unit is a reinforcement learning model that learns the correspondence information between the wind speed and the rotation based on the reward.

The control system according to claim 2, wherein, when the wind speed detected by the state detection unit is equal to or higher than a predetermined wind speed threshold, the reward calculation unit calculates the reward according to whether the rotational speed of the windmill is equal to or higher than a first threshold.

The control system according to claim 3, wherein, when the wind speed detected by the state detection unit is equal to or higher than the wind speed threshold, the reward calculation unit calculates a first-level reward if the rotational speed of the windmill is equal to or higher than the first threshold, and calculates a second-level reward, higher than the first level, if the rotational speed of the windmill is equal to or higher than a second threshold that is smaller than the first threshold.

The control system according to claim 4, wherein, when the wind speed detected by the state detection unit is equal to or higher than the wind speed threshold and the rotational speed of the windmill is lower than the second threshold, the reward calculation unit calculates a third-level reward that is higher than the first level and lower than the second level.

A learning device comprising a learning unit that learns correspondence information between wind speed and rotation, based on wind speed information indicating the wind speed at the installation location of a windmill of a wind power generation system, rotation information relating to the rotation of the windmill, and relationship information between the wind speed and the rotation that indicates an allowable range and a target for the rotation.

A control device comprising:
a state detection unit that detects the wind speed when a rotation speed control parameter for controlling the rotation speed of a windmill of a wind power generation system is set in the windmill;
a determination unit that determines rotation information relating to the rotation of the windmill, based on the wind speed detected by the state detection unit and on correspondence information between the wind speed and the rotation of the windmill; and
a control unit that controls the rotation of the windmill based on the rotation information determined by the determination unit.

A control method for controlling a windmill of a power generation system, wherein:
a learning unit learns correspondence information between wind speed and rotation, based on wind speed information indicating the wind speed at the installation location of the windmill, rotation information relating to the rotation of the windmill, and relationship information between the wind speed and the rotation that indicates an allowable range and a target;
a storage unit stores the correspondence information between the wind speed and the rotation;
a state detection unit detects the rotation speed and the wind speed when a rotation speed control parameter for controlling the rotation speed of the windmill is set in the windmill;
a determination unit determines the rotation information based on the rotation speed and the wind speed detected by the state detection unit and on the correspondence information; and
a control unit controls the windmill, based on the rotation information determined by the determination unit, so that the rotation of the windmill falls within the allowable range and approaches the target.