CN112231870B

CN112231870B - Intelligent generation method for railway line in complex mountain area

Info

Publication number: CN112231870B
Application number: CN202011011062.4A
Authority: CN
Inventors: 何庆; 高天赐; 李子涵; 王平; 高岩; 王启航; 李晨钟; 王晓明; 徐双婷; 钱舒月
Original assignee: Southwest Jiaotong University
Current assignee: Southwest Jiaotong University
Priority date: 2020-09-23
Filing date: 2020-09-23
Publication date: 2022-08-02
Anticipated expiration: 2040-09-23
Also published as: CN112231870A

Abstract

The invention relates to the technical field of railway lines, and in particular to an intelligent generation method for a railway line in a complex mountain area, comprising the following steps: first, building the Environment; second, setting the attributes of the Environment; third, determining the optimal path of the line using the Deep Deterministic Policy Gradient (DDPG) algorithm from reinforcement learning: 3.1, creating a Memory Buffer; 3.2, updating the DDPG structure parameters using the contents stored in the Memory Buffer; 3.3, optimizing the railway line path with the DDPG until the line path converges. The invention can greatly save manpower and material resources and effectively improve the efficiency and level of mountain railway route selection.

Description

Intelligent generation method for railway line in complex mountain area

Technical Field

The invention relates to the technical field of railway lines, in particular to an intelligent generation method of a railway line in a complex mountain area.

Background

Line design and planning are the lead and the foundation of railway construction, and are also core work that is wide in scope and strongly systematic. The alignment of a railway line directly affects the difficulty of the project, the size of the engineering investment, and the safety of construction and operation. Determining the optimal route therefore requires considering not only natural factors such as the geology and topography of the area the railway passes through, but also other constraints in the target area such as existing railways, roads, historic sites, and ecological protection areas. In summary, railway route selection is an optimization decision problem involving multiple constraints; the optimization process is tedious, and optimizing each design variable consumes a large amount of manpower and time.

At present, with the vigorous construction of the Sichuan-Tibet railway, the complex terrain of the western mountainous area brings new challenges to the reasonable arrangement of structures such as bridges, tunnels and roadbeds. These problems can therefore be addressed with steadily developing artificial intelligence and computer application technology, which can automatically generate economical and reasonable railway lines that meet the various constraints and effectively improve the efficiency and level of mountain railway route selection.

Many scholars at home and abroad have applied computer optimization algorithms to the planning and design of railway lines. Easa et al. optimized the construction cost of a railway line by an enumeration method that fully considers the geometric constraints of the three-dimensional alignment during optimization, but the method is computationally expensive and cannot guarantee that the scheme found is globally optimal. Hogan et al. optimized a plane alignment by dynamic programming, in which the optimization target is computed recursively through a state transfer function to obtain the cost value of the total objective function. Jong of the University of Maryland was the first to use a Genetic Algorithm (GA) to optimize highway alignments; the method takes the horizontal intersection points, the vertical-profile intersection points and the radii of the plane circular curves as optimization variables, calculates the fitness (optimization target) of each generation of the population through operations such as selection, crossover and mutation, and screens individual line schemes by fitness to obtain a line scheme with high fitness. Subsequently, Schonfeld combined a GIS (geographic information system) with the genetic algorithm, quantified the influence of the environment on line selection through corresponding calculation formulas, and optimized and compared the construction cost of the railway line against the traction power performance of the train.

In recent years, scholars at Central South University have made great progress in mountain railway route selection. The team of Professor Pu Hao proposed a mountain route optimization method that combines an improved distance transform (DT) algorithm with the Genetic Algorithm (GA), particle swarm optimization (PSO) and similar methods; it fully considers the constraints and construction cost of the route and quantifies factors such as geological hazards and environmental damage as optimization targets, yielding a more economical, safer and more reasonable mountain route scheme. However, the methods currently used at home and abroad are procedurally complicated: generally the line corridor must be generated first and the line then fine-tuned; the methodology is outdated, and the latest optimization theory from the field of artificial intelligence has not been tried.

Disclosure of Invention

The invention provides an intelligent generation method for a complex mountain railway line, which can overcome one or more defects of the prior art.

The invention discloses an intelligent generation method of a complex mountain railway line, which comprises the following steps:

first, building the Environment;

1.1, obtaining the line design information of the target area;

1.2, dividing the study area into a plurality of grids using a Geographic Information System (GIS);

second, setting the attributes of the Environment;

2.1, setting the State of the Agent, namely the line already output by the Agent in the Environment;

the line undergoes N state transition steps (transition steps) in the optimization process, and the State Space at the end of the i-th step is $S_i$:

$$S_i = \{[x_i, y_i, h_i]^T \mid x_i \in [0, W],\ y_i \in [0, H]\};$$

where $i = 1, 2, \dots, N$; $x_i$ and $y_i$ are the plane coordinates and $h_i$ is the elevation of the agent at the end of the i-th step; $W$ and $H$ are the maximum width and height of the target-area grid; when $i = 1$, the agent is at the starting point of the line;

2.2, setting the Action of the Agent, namely the direction of the next section of spatial line output by the Agent;

each transition step is completed by taking an Action, and the Action space $A$ is expressed as follows:

$$A = \{[\Delta x_i, \Delta y_i, G_i]^T \mid \Delta x_i \in [0, W],\ \Delta y_i \in [0, H],\ G_i \in [-G_{max}, G_{max}]\};$$

where $\Delta x_i$ and $\Delta y_i$ are the movements along the horizontal and vertical plane coordinates when the agent takes the action, and $G_i$ is the vertical-section gradient between states $S_i$ and $S_{i+1}$, which must satisfy the vertical-section limit $|G_i| \le G_{max}$, so the range of $G_i$ is set to $[-G_{max}, G_{max}]$;

According to the relation between $S_i$, $S_{i+1}$ and $A_i$, the next state $S_{i+1}$ is calculated as follows:

$$S_{i+1} = [x_{i+1}, y_{i+1}, h_{i+1}] = [x_i + \Delta x_i,\ y_i + \Delta y_i,\ h_i + G_i \times l_i];$$

where $l_i$ is the distance between $S_i$ and $S_{i+1}$, and $d$ is the side length of the grid;
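The state update above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `State`, `Action` and `step_state` are hypothetical, and the length $l_i$ is assumed to be the horizontal length of the move, $d\sqrt{\Delta x_i^2 + \Delta y_i^2}$, because the source formula for $l_i$ is not reproduced in this text.

```python
import math
from typing import NamedTuple


class State(NamedTuple):
    x: float  # plane coordinate, in grid cells
    y: float  # plane coordinate, in grid cells
    h: float  # elevation, in metres


class Action(NamedTuple):
    dx: float  # movement along x, in grid cells
    dy: float  # movement along y, in grid cells
    g: float   # vertical-section gradient (rise over run)


def step_state(s: State, a: Action, d: float, g_max: float) -> State:
    """Apply S_{i+1} = [x_i + dx_i, y_i + dy_i, h_i + G_i * l_i]."""
    if abs(a.g) > g_max:
        raise ValueError("gradient outside [-G_max, G_max]")
    # ASSUMPTION: l_i is the horizontal length of the move, with d the grid side length.
    l_i = d * math.hypot(a.dx, a.dy)
    return State(s.x + a.dx, s.y + a.dy, s.h + a.g * l_i)
```

For instance, moving 3 cells in x and 4 cells in y on a 30 m grid covers 150 m horizontally, so a gradient of 0.01 raises the elevation by 1.5 m.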

2.3, setting the Reward of the Agent, namely the feedback the Environment gives the Agent after the Agent outputs the line direction;

by taking action $A_i$, the agent's state transitions from $S_i$ to $S_{i+1}$, at which point the Environment gives the agent a reward $R_i$. $R_i$ is expressed in terms of three evaluation indexes, for the unit cost of the line, the survival state, and the distance to the line end point, respectively; $u_c$, $u_s$ and $u_d$ are the weight coefficients of the three;
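As a rough sketch of how the three indexes and their weight coefficients might combine, the function below assumes a plain weighted sum. The source expression for $R_i$ is not reproduced in this text, so the weighted-sum form and the function name `reward` are assumptions rather than the patented formula.

```python
def reward(cost_index: float, survival_index: float, distance_index: float,
           u_c: float, u_s: float, u_d: float) -> float:
    """Combine the three evaluation indexes into one scalar reward.

    ASSUMPTION: the indexes are combined as a weighted sum with the
    weight coefficients u_c, u_s and u_d named in the text.
    """
    return u_c * cost_index + u_s * survival_index + u_d * distance_index
```

With a negative cost index, a positive survival index and a positive distance index, the weights decide how strongly cost is traded against feasibility and progress toward the end point.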

third, determining the optimal path of the line using the Deep Deterministic Policy Gradient (DDPG) algorithm from reinforcement learning;

3.1, creating a Memory Buffer to store each transition step from $S_i$ to $S_{i+1}$; a transition step comprises the current state of the agent, the action taken in that state, the reward the environment gives the agent after the action, and the next state the agent reaches after the action;

3.2, updating the DDPG structure parameters using the contents of the Memory Buffer: a number of transition steps are randomly selected from the Memory Buffer for training; the Actor-Net in the Main-Net is first updated by stochastic gradient descent using the calculated policy gradient, and the TD Error value is then calculated using the two neural networks in the Target Net and used to update the Critic-Net in the Main-Net;

3.3, optimizing the railway line path with the DDPG until the line path converges.
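Steps 3.1 and 3.2 can be illustrated with a minimal Memory Buffer and the standard one-step TD error used in DDPG. This is a sketch, not the implementation of the invention: the class and function names, the buffer capacity and the discount factor `gamma` are assumptions that do not appear in the source.

```python
import random
from collections import deque
from typing import Deque, List, NamedTuple, Sequence


class Transition(NamedTuple):
    state: Sequence[float]       # S_i
    action: Sequence[float]      # A_i
    reward: float                # R_i
    next_state: Sequence[float]  # S_{i+1}


class MemoryBuffer:
    """Fixed-size store of transition steps; the oldest entries are dropped first."""

    def __init__(self, capacity: int) -> None:
        self._items: Deque[Transition] = deque(maxlen=capacity)

    def store(self, transition: Transition) -> None:
        self._items.append(transition)

    def sample(self, batch_size: int) -> List[Transition]:
        # Random selection of transition steps for one training update.
        return random.sample(list(self._items), batch_size)

    def __len__(self) -> int:
        return len(self._items)


def td_error(reward: float, q_current: float, q_target_next: float,
             gamma: float = 0.99) -> float:
    """One-step TD error: r + gamma * Q'(s', mu'(s')) - Q(s, a).

    q_target_next is the Target-Net critic's value of the Target-Net
    actor's action in the next state. gamma is a hypothetical default.
    """
    return reward + gamma * q_target_next - q_current
```

Sampling uniformly from the buffer breaks the correlation between consecutive transition steps along one line, which is the usual reason for keeping such a buffer.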

Preferably, the line design information includes: elevation information of the target area, coordinates of the starting point and middle point of the line, forbidden zone information, constraint conditions, and cost standard information.

Preferably, the constraint conditions include: the maximum limiting gradient $G_{max}$, the maximum algebraic difference of adjacent gradients $\Delta G_{max}$, the minimum curve length $L_{Cmin}$, the minimum intermediate straight-line length $L_{Tmin}$, the minimum slope-section length $L_{Smin}$, the maximum allowable bridge height $H_{Bmax}$, and the maximum allowable tunnel length $L_{Tmax}$; the cost standard information includes: the cost per linear meter of bridge $U_{Bi}$, the cost per linear meter of tunnel $U_{Ti}$, and the unit prices of fill and cut $U_{Fi}$ and $U_{Ci}$.

Preferably, the state of the agent is subject to various constraints, including plane constraints, vertical section constraints, building structure constraints, and other constraints.

Preferably, satisfying the plane constraints means:

after $i+1$ state transition steps, the $i$-th turning angle of the line is as follows:

(a) the minimum plane circular curve length ($L_{Cmin}$) should satisfy:

$$L_{Cmin} - \alpha_i R_i \le 0;$$

where $R_i$ is the radius of the plane circular curve at the $i$-th turning angle after $i+1$ state transition steps, which must satisfy $R_{min} - R_i \le 0$;

(b) the minimum intermediate straight-line length ($L_{Tmin}$) between two plane circular curves should satisfy:

$$L_{Tmin} - L_{Ti} \le 0;$$

where $L_{Ti}$ is the intermediate straight-line length, calculated as follows:

satisfying the vertical-section constraints means:

the vertical-section constraints mainly comprise: the maximum limiting gradient ($G_{max}$), the minimum slope-section length ($L_{Smin}$), and the algebraic difference of adjacent gradients ($\Delta G_{max}$):

(a) vertical gradient:

(b) slope-section length:

(c) algebraic difference of adjacent gradients:

$$|G_{i+1} - G_i| \le \Delta G_{max};$$

satisfying the building structure constraints means:

after each transition step, the agent automatically arranges bridges, tunnels and roadbed sections according to the corresponding roadbed-bridge boundary fill height and roadbed-tunnel boundary cut depth, where the height of a bridge above the ground ($H_{Bi}$) should not exceed the maximum allowable bridge height ($H_{Bmax}$) and the full length of a single tunnel ($L_{Tui}$) should not exceed the maximum allowable tunnel length ($L_{Tmax}$), namely:

$$H_{Bi} \le H_{Bmax};$$

$$L_{Tui} \le L_{Tmax};$$

satisfying the other constraints means:

the line cannot cross environmental protection areas or historic sites.
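A few of the constraints listed above translate directly into inequality checks. The sketch below covers only the ones whose inequalities are stated in the text (gradient limit, algebraic difference of adjacent gradients, bridge height and tunnel length); the names `Limits` and `violations` and all numeric values in the usage note are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Limits:
    g_max: float   # maximum limiting gradient G_max
    dg_max: float  # maximum algebraic difference of adjacent gradients
    h_bmax: float  # maximum allowable bridge height, metres
    l_tmax: float  # maximum allowable tunnel length, metres


def violations(lim: Limits, g_prev: float, g_next: float,
               bridge_height: float, tunnel_length: float) -> List[str]:
    """Return the names of the constraints broken by one transition step."""
    broken = []
    if abs(g_next) > lim.g_max:
        broken.append("gradient")
    if abs(g_next - g_prev) > lim.dg_max:
        broken.append("gradient difference")
    if bridge_height > lim.h_bmax:
        broken.append("bridge height")
    if tunnel_length > lim.l_tmax:
        broken.append("tunnel length")
    return broken
```

An empty list means the step satisfies the checked constraints; a non-empty list could be used to assign the negative survival index described later in the text.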

Preferably, the three evaluation indexes are calculated as follows:

1) unit cost index

Line construction costs include the bridge construction cost ($C_B$), the tunnel construction cost ($C_T$), the fill and cut earthwork cost ($C_E$), the environmental protection cost ($C_I$), and the linear cost ($C_L$);

(a) bridge construction cost ($C_B$):

where $n$ is the number of bridges on the whole line; $U_{Bi}$ is the unit construction cost of the $i$-th bridge (¥/m); $L_{Bi}$ is the length of the $i$-th bridge; $C_{Ai}$ is the construction cost of the abutments of the $i$-th bridge (¥);

(b) tunnel construction cost ($C_T$):

where $n$ is the number of tunnels on the whole line; $U_{Ti}$ is the unit construction cost of the $i$-th tunnel (¥/m); $L_{Tui}$ is the length of the $i$-th tunnel; $C_{Pi}$ is the construction cost of the portals of the $i$-th tunnel (¥);

(c) fill and cut earthwork cost ($C_E$):

The cross-sectional area of a roadbed section can be calculated as follows:

$$A = 2(W_s + \Delta h \times i) \times \Delta h;$$

where $W_s$ is the width of the roadbed surface; $\Delta h$ is the fill or cut height of the roadbed; $i$ is the gradient of the roadbed side slope;

the fill and cut earthwork cost is then calculated as:

where $U_{Fi}$ and $U_{Ci}$ are the unit costs of fill and cut for the $i$-th section (¥/m³); $m$ and $n$ are the numbers of fill and cut sections respectively; $A_i$ is the $i$-th cross-sectional area (m²); $L_i$ is the length of the $i$-th section (m);

(d) environmental protection cost ($C_I$):

where $U_i$ is the unit penalty when the line passes through an environmental protection area (¥/m²); $A_{Pi}$ is the area occupied by the line crossing the environmental protection area (m²);

(e) linear cost ($C_L$):

Linear costs are costs related to the line length, including track laying and electrical utility costs:

$$C_L = U_L \times L;$$

where $U_L$ is the unit construction cost for linear costs (¥/m); $L$ is the total line length;

(f) unit construction cost

Based on the above cost standard information, the unit construction cost when the agent's state transitions from $S_i$ to $S_{i+1}$ is calculated, and the unit cost index takes a negative value;

2) survival state index

When the agent's state transitions from $S_i$ to $S_{i+1}$ and all constraints are met, the environment gives the agent a positive survival reward; conversely, if the agent cannot meet all the constraints, the index is negative;

3) evaluation index for the distance to the line end point

The calculation is as follows:

where $d_1$ is the diagonal length of the target area and $d_2$ is the distance from the agent's current position to the line end point.
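The cost components of the unit cost index can be sketched as small functions. Only the cross-sectional area formula and $C_L = U_L \times L$ are stated explicitly in this text; the summation form used for bridges and tunnels (unit cost times length plus a fixed abutment or portal cost, summed over structures) is an assumption inferred from the variable definitions, and all function names are hypothetical.

```python
from typing import Iterable, Tuple


def structure_cost(items: Iterable[Tuple[float, float, float]]) -> float:
    """Total cost of bridges or tunnels.

    Each item is (unit cost per metre, length in metres, fixed cost).
    ASSUMPTION: cost = sum(U_i * L_i + C_i); the source formula is not
    reproduced in this text.
    """
    return sum(u * length + fixed for u, length, fixed in items)


def cross_section_area(w_s: float, dh: float, slope: float) -> float:
    """Roadbed cross-section A = 2 * (W_s + dh * i) * dh, as given in the text."""
    return 2 * (w_s + dh * slope) * dh


def linear_cost(u_l: float, total_length: float) -> float:
    """Linear cost C_L = U_L * L, as given in the text."""
    return u_l * total_length
```

For example, with hypothetical values $W_s = 4$ m, $\Delta h = 2$ m and $i = 1.5$, the area formula gives $2 \times (4 + 3) \times 2 = 28$ m².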

Preferably, in step 3.2, the structural parameters in the Target Net are not updated directly; instead, after every several iteration steps, the parameters of the two neural networks in the Main-Net are copied to the Target Net to update it.
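The periodic copy from Main-Net to Target Net can be sketched as follows. The parameter containers are plain dictionaries here purely for illustration, and the function name `sync_target` and the interval `every` are hypothetical; the text only says the copy happens after several iteration steps.

```python
import copy
from typing import Any, Dict


def sync_target(main_params: Dict[str, Any], target_params: Dict[str, Any],
                step: int, every: int) -> Dict[str, Any]:
    """Return the Target Net parameters to use after iteration `step`.

    Every `every` iterations the Main-Net parameters (Actor-Net and
    Critic-Net) are copied over; otherwise the Target Net is left unchanged.
    """
    if step % every == 0:
        # Deep copy so later Main-Net updates do not leak into the Target Net.
        return copy.deepcopy(main_params)
    return target_params
```

Holding the Target Net fixed between copies keeps the TD Error targets stable while the Main-Net is being trained.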

The traditional railway route selection method is also called manual route selection in a certain sense, and is characterized in that route selection designers evaluate, analyze and compare and select a plurality of railway schemes according to past knowledge accumulation and working experience, and finally determine an optimal route. However, due to time and resource constraints, the designer's preferred routes are always limited, and many potentially viable routes are easily ignored. The intelligent route selection method provided by the invention can greatly save manpower and material resources, and effectively improve the efficiency and level of route selection of mountainous railways.

Drawings

Fig. 1 is a flowchart of the intelligent generation method for a complex mountain railway line in embodiment 1;

Fig. 2 is a schematic diagram of the gridding of topographic information in embodiment 1;

Fig. 3 is a schematic diagram of the environment and attributes of route selection optimization in embodiment 1;

Fig. 4 is a schematic diagram of the plane alignment of the line in embodiment 1;

Fig. 5 is a schematic diagram of the Action in embodiment 1;

Fig. 6 is a schematic diagram of the DDPG structure in embodiment 1;

Fig. 7 is a comparison between the method of embodiment 1 and the manual route selection method.

Detailed Description

For a further understanding of the invention, reference should be made to the following detailed description taken in conjunction with the accompanying drawings and examples. It is to be understood that the examples are illustrative of the invention and not limiting.

Example 1

As shown in fig. 1, the present embodiment provides an intelligent generation method for a complex mountain railway line, which includes the following steps:

first, building the Environment;

1.1, obtaining the line design information of the target area;

1.2, dividing the study area into a plurality of grids using a Geographic Information System (GIS); as shown in fig. 2, the grid side length varies with the accuracy of the data and is typically 30 m, and each grid cell contains its own elevation information;
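The gridded terrain can be pictured as a two-dimensional array of elevations indexed by cell. The sketch below assumes a row-major list of lists with the origin at cell (0, 0) and uses the typical 30 m side length mentioned above; the function names and the storage layout are assumptions, not part of the source.

```python
from typing import List, Tuple

CELL_SIZE_M = 30.0  # typical grid side length given in the text


def cell_of(x_m: float, y_m: float, cell: float = CELL_SIZE_M) -> Tuple[int, int]:
    """Map plane coordinates in metres to a (column, row) grid index."""
    return int(x_m // cell), int(y_m // cell)


def elevation_at(grid: List[List[float]], x_m: float, y_m: float,
                 cell: float = CELL_SIZE_M) -> float:
    """Look up the elevation stored in the grid cell containing (x_m, y_m).

    ASSUMPTION: grid[row][col] holds one elevation value per cell.
    """
    col, row = cell_of(x_m, y_m, cell)
    return grid[row][col]
```

A point 45 m east and 10 m north of the origin falls in column 1, row 0 of a 30 m grid.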

second, setting the attributes of the Environment;

the method is based on Reinforcement Learning and mainly involves the following concepts and attributes, as shown in fig. 2:

1) Environment: comprises the topographic information of the target area, the line design information, the cost standard information, the various constraint conditions, and the like;

2) Agent: can be regarded as the route selection designer or an intelligent route selection program, and is responsible for outputting the optimized line;

3) State: the line already output by the agent in the Environment;

4) Reward: the feedback the environment gives the agent after the agent outputs the line, for example the cost of the line output by the agent and whether the constraints are met;

5) Action: the direction of the next section of spatial line output by the agent;

the line undergoes N state transition steps (transition steps) in the optimization process, and the State Space at the end of the i-th step is $S_i$:

$$S_i = \{[x_i, y_i, h_i]^T \mid x_i \in [0, W],\ y_i \in [0, H]\};$$

as shown in fig. 5, part (a) of fig. 5 is a schematic diagram of the action in plan and part (b) is a schematic diagram of the action in the vertical section; each transition step is completed by taking an Action, and the Action space $A$ is expressed as follows:

According to the relation between $S_i$, $S_{i+1}$ and $A_i$, the next state $S_{i+1}$ is calculated as follows:

where $l_i$ is the distance between $S_i$ and $S_{i+1}$, and $d$ is the grid side length;

in the reward formula, the three terms are the evaluation indexes for the unit cost of the line, the survival state, and the distance to the line end point, respectively; $u_c$, $u_s$ and $u_d$ are the weight coefficients of the three;

third, determining the optimal path of the line using the Deep Deterministic Policy Gradient (DDPG) algorithm from reinforcement learning; the DDPG comprises four neural networks, whose structure is shown in fig. 6 and which fall into two types (Actor-Net and Critic-Net): the input of the Actor-Net is the state of the agent and its output is the action; the Critic-Net evaluates how good the action taken by the agent in the current state is, and the larger its output value (policy gradient), the more favorable the action; in addition, the Actor-Net' and Critic-Net' in the Target-Net in fig. 6 have the same structure as the two neural networks in the Main-Net but are updated differently.

3.1, creating a Memory Buffer to store each transition step from $S_i$ to $S_{i+1}$; a transition step comprises the current state of the agent, the action taken in that state, the reward the environment gives the agent after the action, and the next state the agent reaches after the action;

In this embodiment, the line design information includes: elevation information of the target area, coordinates of the starting point and middle point of the line, forbidden zone information, constraint conditions, and cost standard information.

In this embodiment, the constraint conditions include: the maximum limiting gradient $G_{max}$, the maximum algebraic difference of adjacent gradients $\Delta G_{max}$, the minimum curve length $L_{Cmin}$, the minimum intermediate straight-line length $L_{Tmin}$, the minimum slope-section length $L_{Smin}$, the maximum allowable bridge height $H_{Bmax}$, and the maximum allowable tunnel length $L_{Tmax}$; the cost standard information includes: the cost per linear meter of bridge $U_{Bi}$, the cost per linear meter of tunnel $U_{Ti}$, and the unit prices of fill and cut $U_{Fi}$ and $U_{Ci}$.

In this embodiment, the state of the agent needs to satisfy various restriction conditions, including plane restriction conditions, longitudinal section restriction conditions, building structure restriction conditions, and other restriction conditions.

In this embodiment, satisfying the plane constraints means:

(a) the minimum plane circular curve length ($L_{Cmin}$) should satisfy:

$$L_{Cmin} - \alpha_i R_i \le 0;$$

(b) the minimum intermediate straight-line length ($L_{Tmin}$) between two plane circular curves should satisfy:

$$L_{Tmin} - L_{Ti} \le 0;$$

where, referring to fig. 4, $L_{Ti}$ is the intermediate straight-line length, calculated as follows:

satisfying the vertical-section constraints means:

(a) vertical gradient:

(b) slope-section length:

(c) algebraic difference of adjacent gradients:

$$|G_{i+1} - G_i| \le \Delta G_{max};$$

satisfying the building structure constraints means:

after each transition step, the agent automatically arranges bridges, tunnels and roadbed sections according to the corresponding roadbed-bridge boundary fill height and roadbed-tunnel boundary cut depth, where the height of a bridge above the ground ($H_{Bi}$) should not exceed the maximum allowable bridge height ($H_{Bmax}$) and the full length of a single tunnel ($L_{Tui}$) should not exceed the maximum allowable tunnel length ($L_{Tmax}$), namely:

$$H_{Bi} \le H_{Bmax};$$

$$L_{Tui} \le L_{Tmax};$$

satisfying the other constraints means:

In this embodiment, the three evaluation indexes are calculated as follows:

1) unit cost index

(a) bridge construction cost ($C_B$):

(b) tunnel construction cost ($C_T$):

where $n$ is the number of tunnels on the whole line; $U_{Ti}$ is the unit construction cost of the $i$-th tunnel (¥/m); $L_{Tui}$ is the length of the $i$-th tunnel; $C_{Pi}$ is the construction cost of the portals of the $i$-th tunnel (¥);

(c) fill and cut earthwork cost ($C_E$):

The cross-sectional area of a roadbed section can be calculated as follows:

$$A = 2(W_s + \Delta h \times i) \times \Delta h;$$

the fill and cut earthwork cost is then calculated as:

(d) environmental protection cost ($C_I$):

(e) linear cost ($C_L$):

$$C_L = U_L \times L;$$

(f) unit construction cost

Notably, since the aim of the method is to reduce the construction cost of the line, the unit cost index takes a negative value;

2) survival state index

Because mountain lines have difficulty meeting the various constraints during design and planning, the agent cannot fully explore the target area. A survival state index is therefore added to the method so that the agent can reach the end point more easily;

3) evaluation index for the distance to the line end point

The calculation is as follows:

where $d_1$ is the diagonal length of the target area and $d_2$ is the distance from the agent's current position to the line end point. It can be seen that the closer the agent is to the end point, the greater the reward it receives.

In this embodiment, in step 3.2, the structural parameters in the Target Net are not updated directly; instead, after every several iteration steps, the parameters of the two neural networks in the Main-Net are copied to the Target Net to update it. In fig. 6, the black arrows indicate the update step of the Actor-Net in the Main-Net, the red arrows indicate the update step of the Critic-Net in the Main-Net, and the orange arrows indicate the update step of the Target-Net.

This embodiment compares the method with manual route selection; the result is shown in fig. 7, where the alignment in fig. 7(a) is obtained by the method of this embodiment and the lower-half alignment from E to S in fig. 7(b) is the result of manual route selection. Table 1 compares the economic indicators of the two lines: the total cost of the line obtained by the method of this embodiment is 7.37 percent lower than that of the manual route selection.

TABLE 1 comparison of economic indicators

The present invention and its embodiments have been described above schematically and without limitation; what is shown in the drawings is only one of the embodiments of the present invention, and the actual structure is not limited thereto. Therefore, structural modes and embodiments similar to this technical solution that a person skilled in the art, having received this teaching, designs without inventive effort and without departing from the spirit of the invention shall fall within the scope of protection of the invention.

Claims

1. An intelligent generation method for a railway line in a complex mountain area, characterized by comprising the following steps:

first, building the Environment;

1.1, obtaining the line design information of the target area;

second, setting the attributes of the Environment;

$$S_i = \{[x_i, y_i, h_i]^T \mid x_i \in [0, W],\ y_i \in [0, H]\};$$

each transition step is completed by taking an Action, and the Action space $A$ is expressed as follows:

where $l_i$ is the distance between $S_i$ and $S_{i+1}$, and $d$ is the side length of the grid;

in the reward formula, the three terms are the evaluation indexes for the unit cost of the line, the survival state, and the distance to the line end point, respectively; $u_c$, $u_s$ and $u_d$ are the weight coefficients of the three;

2. The intelligent generation method for a complex mountain railway line according to claim 1, characterized in that the line design information includes: elevation information of the target area, coordinates of the starting point and middle point of the line, forbidden zone information, constraint conditions, and cost standard information.

3. The intelligent generation method for a complex mountain railway line according to claim 2, characterized in that the constraint conditions include: the maximum limiting gradient $G_{max}$, the maximum algebraic difference of adjacent gradients $\Delta G_{max}$, the minimum curve length $L_{Cmin}$, the minimum intermediate straight-line length $L_{Tmin}$, the minimum slope-section length $L_{Smin}$, the maximum allowable bridge height $H_{Bmax}$, and the maximum allowable tunnel length $L_{Tmax}$; the cost standard information includes: the cost per linear meter of bridge $U_{Bi}$, the cost per linear meter of tunnel $U_{Ti}$, and the unit prices of fill and cut $U_{Fi}$ and $U_{Ci}$.

4. The intelligent generation method for a complex mountain railway line according to claim 3, characterized in that the state of the agent must satisfy various constraints, comprising plane constraints, vertical-section constraints, building structure constraints, and other constraints;

satisfying the plane constraints means:

(a) the minimum plane circular curve length $L_{Cmin}$ should satisfy:

$$L_{Cmin} - \alpha_i R_i \le 0;$$

where $R_i$ is the radius of the plane circular curve at the $i$-th turning angle after $i+1$ state transition steps, which must satisfy $R_{min} - R_i \le 0$;

(b) the minimum intermediate straight-line length $L_{Tmin}$ between two plane circular curves should satisfy:

$$L_{Tmin} - L_{Ti} \le 0;$$

satisfying the vertical-section constraints means:

the vertical-section constraints mainly comprise: the maximum limiting gradient $G_{max}$, the minimum slope-section length $L_{Smin}$, and the algebraic difference of adjacent gradients $\Delta G_{max}$:

(a) vertical gradient:

(b) slope-section length:

(c) algebraic difference of adjacent gradients:

$$|G_{i+1} - G_i| \le \Delta G_{max};$$

satisfying the building structure constraints means:

after each transition step, the agent automatically arranges bridges, tunnels and roadbed sections according to the corresponding roadbed-bridge boundary fill height and roadbed-tunnel boundary cut depth, where the height of a bridge above the ground $H_{Bi}$ should not exceed the maximum allowable bridge height $H_{Bmax}$ and the full length of a single tunnel $L_{Tui}$ should not exceed the maximum allowable tunnel length $L_{Tmax}$, namely:

$$H_{Bi} \le H_{Bmax};$$

$$L_{Tui} \le L_{Tmax};$$

satisfying the other constraints means:

5. The intelligent generation method for a complex mountain railway line according to claim 4, characterized in that the three evaluation indexes are calculated as follows:

1) unit cost index

Line construction costs include the bridge construction cost $C_B$, the tunnel construction cost $C_T$, the fill and cut earthwork cost $C_E$, the environmental protection cost $C_I$, and the linear cost $C_L$;

(a) bridge construction cost $C_B$:

where $n$ is the number of bridges on the whole line; $U_{Bi}$ is the unit construction cost of the $i$-th bridge; $L_{Bi}$ is the length of the $i$-th bridge; $C_{Ai}$ is the construction cost of the abutments of the $i$-th bridge;

(b) tunnel construction cost $C_T$:

where $n$ is the number of tunnels on the whole line; $U_{Ti}$ is the unit construction cost of the $i$-th tunnel; $L_{Tui}$ is the length of the $i$-th tunnel; $C_{Pi}$ is the construction cost of the portals of the $i$-th tunnel;

(c) fill and cut earthwork cost $C_E$:

The cross-sectional area of a roadbed section can be calculated as follows:

$$A = 2(W_s + \Delta h \times i) \times \Delta h;$$

the fill and cut earthwork cost is then calculated as:

where $U_{Fi}$ and $U_{Ci}$ are the unit costs of fill and cut for the $i$-th section; $m$ and $n$ are the numbers of fill and cut sections respectively; $A_i$ is the $i$-th cross-sectional area; $L_i$ is the length of the $i$-th section;

(d) environmental protection cost $C_I$:

where $U_i$ is the unit penalty when the line passes through an environmental protection area; $A_{Pi}$ is the area occupied by the line crossing the environmental protection area;

(e) linear cost $C_L$:

$$C_L = U_L \times L;$$

where $U_L$ is the unit construction cost for linear costs; $L$ is the total line length;

(f) unit construction cost

the unit cost index takes a negative value;

2) survival state index

3) evaluation index for the distance to the line end point

The calculation is as follows:

where $d_1$ is the diagonal length of the target area and $d_2$ is the distance from the agent's current position to the line end point.

6. The intelligent generation method for a complex mountain railway line according to claim 1, characterized in that in step 3.2, the structural parameters in the Target Net are not updated directly; instead, after every several iteration steps, the parameters of the two neural networks in the Main-Net are copied to the Target Net to update it.