CN112836858A

CN112836858A - Multi-type intermodal transportation emission reduction path selection method, system and device for containers

Info

Publication number: CN112836858A
Application number: CN202110019565.4A
Authority: CN
Inventors: 王晓宁; 刘民壮; 杨昌运; 宋宇; 崔梓钰; 王丽芬
Original assignee: Harbin Institute of Technology
Current assignee: Harbin Institute of Technology
Priority date: 2021-01-07
Filing date: 2021-01-07
Publication date: 2021-05-25
Anticipated expiration: 2041-01-07
Also published as: CN112836858B

Abstract

A method, a system and a device for selecting a multi-type intermodal transportation emission reduction path for containers, belonging to the technical field of transportation. The invention addresses two problems of existing multi-type intermodal emission reduction path selection methods: determining a transportation strategy is inefficient, and the intermodal transportation mode that is determined yields low actual transportation efficiency. The method first determines a multi-type intermodal transportation emission reduction utility model of the container for different carbon tax collection modes. It then establishes a Q table, taking the container as the agent and the total utility of the multi-type intermodal emission reduction path as the target task. Finally, taking the start and end points of the multi-type intermodal transportation emission reduction path as the start and end points of the Q table, it selects the emission reduction path of the container using a reinforcement learning algorithm. The method is mainly used for emission reduction path selection in multi-type intermodal container transportation.

Description

Multi-type intermodal transportation emission reduction path selection method, system and device for containers

Technical Field

The invention relates to a method, a system and a device for selecting a multi-type intermodal transportation emission reduction path for containers, and belongs to the technical field of transportation.

Background

As the traffic carbon emissions generated by multimodal freight transportation in China keep increasing, research on using carbon tax policy to control carbon emissions has begun in China, but the specific collection schemes still need to be refined.

Multi-type intermodal transportation is a transportation process completed jointly by connecting two or more transportation modes; it is a comprehensive organization of different transportation modes and can reduce intermediate links and transportation cost. Multimodal transportation is mainly carried out with containers, but at present the emission reduction effect under different carbon tax rates has rarely been studied in combination with multimodal transportation path selection.

A transportation enterprise does not select the path for container cargo once and for all; the path is adjusted continually as various utilities change during transportation. Transportation enterprises also learn autonomously, and an enterprise's initial choice of transportation scheme changes under the influence of other transportation enterprises. Current multi-type intermodal container transportation basically considers only whether an effective transfer point exists between the origin and the destination, and arranges transportation of the container if one does. The transportation mode determined by current path selection methods therefore suffers from low transportation efficiency. More importantly, it cannot meet future transportation conditions and regulations; that is, the applicability of current path selection methods is gradually decreasing and may be lost altogether.

Disclosure of Invention

The invention aims to solve the following problems of conventional multi-type intermodal emission reduction path selection methods: determining a transportation strategy is inefficient, the determined intermodal transportation mode yields low actual transportation efficiency, and the path selection method has low applicability.

A multi-type intermodal transportation emission reduction path selection method for containers comprises the following steps:

determining a multi-type intermodal transportation emission reduction utility model of the container for different carbon tax collection modes, wherein the carbon tax collection modes are a single tax rate collection mode and a segmented progressive tax rate collection mode;

the utility model of the multi-type intermodal transportation emission reduction of the container under the single tax rate collection mode is as follows:

the constraints are as follows:

wherein m and n are transportation modes;

is a decision variable indicating whether or not cargo can be transported in the transport mode m between the i and j nodes;

a decision variable indicating whether the transport mode m can be converted into the transport mode n at the node i; C represents the total utility of multi-type intermodal transportation emission reduction of the container;

representing the basic transportation utility of different transportation modes; C_handle is the transfer utility of the cargo; C_safe is the safety utility of the cargo; C_time is the time utility of the cargo; C^E is the carbon emission utility of the cargo; β is the carbon tax utility function under the single tax rate collection mode;

is the carbon emission of transportation mode m during transport from node i to node j;

the shipment quantity in transportation mode m between nodes i and j; q represents the total freight volume;

the utility model of the multi-type intermodal transportation emission reduction of the container under the segmented progressive tax rate collection mode: on the basis of the single tax rate collection mode, the carbon emission utility C^E of the goods is replaced by C^Z:

In the formula, E_n represents the maximum carbon emission amount allowed in each carbon tax collection interval; β_n represents the carbon tax rate of each carbon tax interval;

then, establishing a Q table by taking the container as an intelligent agent and the total utility of the multi-type intermodal emission reduction path as a target task; and taking the starting point and the end point of the multi-type intermodal transportation emission reduction path as the starting point and the end point of a Q table, and selecting the multi-type intermodal transportation emission reduction path of the container by using a reinforcement learning algorithm.

Further, the process for realizing the selection of the multi-type intermodal transportation emission reduction path of the container by using the reinforcement learning-based algorithm comprises the following steps:

letting P be the set of all city nodes in the multimodal transport process and N the set of transportation modes; taking the start and end points of the multi-type intermodal emission reduction path as the start and end points of the Q table;

the intelligent agent starts from a starting point in the Q table, selects an action A from the Q table according to an epsilon-greedy algorithm and executes the action A;

the reward function R is used as feedback information of the action A executed in the current state S to the learning environment of the intelligent agent, and the Q table is updated according to the reward function and state change generated by the action of the intelligent agent;

then, the intelligent agent continuously starts from the current state, selects an action A from the Q table according to an epsilon-greedy algorithm and executes the action A, and updates the Q table;

judging whether the end point of the multi-type intermodal transportation path has been reached; if not, the agent continues from the current state; if the end point has been reached, judging whether the set number of iterations has been reached; if so, the optimization is finished and the Q table obtained is the optimized emission reduction path selection strategy; if not, judging whether the Q table has converged; if it has converged, the optimal strategy for multi-type intermodal transportation emission reduction path selection is obtained, and if it has not, path selection is performed again from the starting point in the Q table.

Further, the reward function R is as follows:

wherein R is the reward function, that is, the feedback the learning environment gives the agent for executing action A in the current state S; C' is a constant used to quantify the reward or penalty after the agent's state changes.

Further, the formula for updating the Q table according to the reward function and the state change generated by the agent's action is as follows:

Q(S,A) = Q(S,A) + α(R + γ·max_{A′} Q(S′,A′) − Q(S,A))

Wherein: Q(S,A) is the expected return obtainable by taking action A in state S; S is the current state; A is the action taken in the current state; S′ is the next state; A′ is the action in the next state; γ is the discount factor; α is the learning rate.

A multi-type intermodal transportation emission reduction path selection system for a container is used for executing the multi-type intermodal transportation emission reduction path selection method for the container.

A container multi-intermodal emission reduction routing apparatus for storing and/or operating a container multi-intermodal emission reduction routing system as claimed in any one of claims 1 to 4.

Advantageous effects:

The multi-type intermodal transportation emission reduction utility model takes multiple freight parameters of container transportation into account: the basic transportation utility, the transfer utility, the time utility, the safety utility and the carbon emission utility of the cargo. In addition, the invention determines the path under both the single tax rate collection mode and the segmented progressive tax rate collection mode, so it has stronger applicability: it is suitable for optimal path selection under various conditions, including conditions where carbon emissions are strictly regulated.

Meanwhile, the invention is a path selection method aiming at various freight parameters, so that the actual path and the transfer mode determined according to the invention have very strong pertinence, the actual transportation efficiency can be improved, and the cost can be effectively reduced.

Drawings

Fig. 1 is a schematic flow chart of a first embodiment.

Detailed Description

The first embodiment is as follows: the present embodiment is described in connection with figure 1,

the method for selecting the multi-type intermodal transportation emission reduction path of the container in the embodiment is used for selecting the multi-type intermodal transportation emission reduction path and the emission reduction transportation mode based on a reinforcement learning algorithm, and comprises the following steps of:

Step 1: establishing a Q table, taking the container as the agent and the total utility of the multi-type intermodal emission reduction path as the target task; the start and end points of the multi-type intermodal transportation emission reduction path serve as the start and end points of the Q table;

let P be the set of all city nodes in the multimodal transport process and N the set of transportation modes; m and n are transportation modes, which in this embodiment take the values 1, 2 and 3, representing road, railway and water transportation respectively;

is a decision variable taking the value 1 or 0, indicating whether the goods can be transported in transportation mode m between nodes i and j;

a decision variable taking the value 1 or 0, indicating whether transportation mode m can be converted into transportation mode n at node i; C represents the total utility of multi-type intermodal transportation emission reduction of the container. In some embodiments, to give the utilities a common basis, the total utility can be expressed as a total cost (the reward and punishment function in reinforcement learning can then be designed in terms of total cost); in other embodiments, other utilities and unified indexes can be used, and each utility can also be converted into a dimensionless index;

represents the basic transportation utility of different transportation modes. In some embodiments the basic transportation utility actually refers to the transportation path length; to give each utility a common basis for combination, this embodiment converts the path length into a basic transportation cost using the transportation volume of the goods and the cost per unit path length. C_handle is the transfer utility of the goods. In some embodiments the transfer utility is actually the labor consumed at the transfer point; to give each utility a common basis, this embodiment converts it into loading, unloading, storage and management costs using the transportation volume of the goods. C_safe is the safety utility of the goods. In some embodiments the safety utility refers to the cargo loss utility, that is, the cargo loss rate arising during loading, unloading and transfer, which depends on the transportation company and the transportation mode; to give each utility a common basis, this embodiment converts it into a cargo damage cost using the transportation volume and the cargo value. C_time is the time utility of the goods. In some embodiments the time utility actually refers to the depreciation rate of the goods caused by transit time; to give each utility a common basis, this embodiment converts it into a depreciation cost using the transportation volume, the cargo value and the transportation time. C^E is the carbon emission utility of the goods, which actually refers to the carbon emissions generated during multimodal transportation and depends on energy conversion and transportation mode; to give each utility a common basis, this embodiment converts it into a carbon emission cost using the carbon tax levied. β is the carbon tax utility function under the single tax rate collection mode;
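As a rough illustration of how the utility components described above could be placed on a common cost basis, the following Python sketch sums them for one transport leg. The additive form, the class and field names, the TEU-based unit cost and all numbers are assumptions made for illustration; they are not taken from the utility formula of the invention.

```python
from dataclasses import dataclass


@dataclass
class LegCost:
    """Cost components of one transport leg, all in the same currency unit."""
    transport: float  # basic transportation cost (path length x volume x unit cost)
    handling: float   # transfer cost (loading, unloading, storage, management)
    safety: float     # cargo damage cost (loss rate x volume x cargo value)
    time: float       # depreciation cost caused by transit time
    carbon: float     # carbon emission cost (emissions x carbon tax)

    def total(self) -> float:
        # Assumption: the total utility is the plain sum of the components.
        return self.transport + self.handling + self.safety + self.time + self.carbon


def basic_transport_cost(distance_km: float, volume_teu: float, unit_cost: float) -> float:
    """Convert a path length into a cost using the volume and a per-TEU-km unit cost."""
    return distance_km * volume_teu * unit_cost


# Hypothetical leg: 500 km, 10 TEU, 2.0 currency units per TEU-km.
leg = LegCost(
    transport=basic_transport_cost(distance_km=500.0, volume_teu=10.0, unit_cost=2.0),
    handling=300.0,
    safety=50.0,
    time=120.0,
    carbon=80.0,
)
print(leg.total())
```

Keeping each component as a separate field makes it easy to reweight time against distance, or to swap the carbon term between tax modes, without touching the rest of the model.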

in the present embodiment, the process of acquiring the transport length and the transport time is as follows:

Log in to the Amap (Gaode Map) open platform and apply for a key for the Web service API. Send a GET request with the requests module in Python to the geocoding interface to obtain longitude and latitude, which are returned as a JSON string; parse it with the json module and store the longitude and latitude of each city node in a dictionary. Use the route planning function of the Amap API to calculate the actual distance between two points and obtain the transportation time, then compile the data into a table with the pandas module so that the road transportation distances between city nodes and the corresponding transportation times can be viewed. Distances and transportation times for railway and waterway routes are crawled in a similar way; only the target website changes from Amap to a ship information website and a train timetable website, and no distance calculation is needed because the distance and time data are crawled directly.
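The acquisition step described above might look roughly like the Python sketch below. The endpoint URLs and the response fields (`geocodes[0].location`, `route.paths[0].distance` and `duration`) are assumptions about the Amap Web service API and should be checked against its current documentation; the sample bodies are hand-written placeholders, not real API output. The `requests` import is deferred so that the parsing functions run without network access.

```python
import json
from typing import Dict, Tuple

# Assumed Amap Web service endpoints; verify against the official documentation.
GEOCODE_URL = "https://restapi.amap.com/v3/geocode/geo"
DRIVING_URL = "https://restapi.amap.com/v3/direction/driving"


def fetch(url: str, params: Dict[str, str]) -> str:
    """Send a GET request and return the response body as text."""
    import requests  # third-party package, imported lazily
    return requests.get(url, params=params, timeout=10).text


def parse_geocode(payload: str) -> Tuple[float, float]:
    """Extract (longitude, latitude) from a geocoding response body."""
    data = json.loads(payload)
    lng, lat = data["geocodes"][0]["location"].split(",")
    return float(lng), float(lat)


def parse_driving(payload: str) -> Tuple[float, float]:
    """Extract (distance in km, duration in hours) of the first planned path."""
    data = json.loads(payload)
    path = data["route"]["paths"][0]
    return float(path["distance"]) / 1000.0, float(path["duration"]) / 3600.0


# Hand-written sample bodies in the assumed shape (placeholder values).
sample_geo = '{"status": "1", "geocodes": [{"location": "126.53,45.80"}]}'
sample_route = '{"status": "1", "route": {"paths": [{"distance": "1200000", "duration": "43200"}]}}'

coords = {"city_a": parse_geocode(sample_geo)}
distance_km, time_h = parse_driving(sample_route)
rows = [{"origin": "city_a", "destination": "city_b",
         "distance_km": distance_km, "time_h": time_h}]
print(coords, rows)
```

In a live run, `fetch(GEOCODE_URL, {"address": ..., "key": ...})` would supply the payloads, and `rows` could be passed to `pandas.DataFrame` to build the distance and time table.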

the constraints are as follows:

wherein,

the quantity of goods transported in transportation mode m is less than the total transportation quantity;

is a continuity constraint: in transport from node i to node k, when a transfer from mode m to mode n occurs at node j, transport from node j to node k proceeds in mode n, which ensures the continuity of the transportation process.
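The continuity constraint can be read as a feasibility check on a candidate route. The following Python sketch is one possible reading; the leg representation and the `transfers` set are hypothetical structures introduced only for illustration.

```python
from typing import List, Set, Tuple

Leg = Tuple[str, str, int]  # (from_node, to_node, mode); modes: 1 road, 2 rail, 3 water


def is_continuous(legs: List[Leg], transfers: Set[Tuple[str, int, int]]) -> bool:
    """Check that consecutive legs connect and mode changes occur only where allowed.

    transfers holds (node, mode_in, mode_out) triples at which a change of
    transport mode is possible.
    """
    for (_, a_to, a_mode), (b_from, _, b_mode) in zip(legs, legs[1:]):
        if a_to != b_from:
            return False  # the next leg does not start where the previous one ended
        if a_mode != b_mode and (a_to, a_mode, b_mode) not in transfers:
            return False  # mode change at a node that does not support it
    return True


allowed = {("j", 1, 2)}  # road to rail is possible at node j
print(is_continuous([("i", "j", 1), ("j", "k", 2)], allowed))
```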

Compared with the single tax rate collection mode, the multi-type intermodal transportation emission reduction utility model of the container under the segmented progressive tax rate collection mode differs only in the carbon emission utility of the goods: the carbon emission utility C^E is replaced by C^Z:

In the formula, E_n represents the maximum carbon emission amount allowed in each carbon tax collection interval; β_n represents the carbon tax rate of each carbon tax interval.
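To make the difference between the two collection modes concrete, the following Python sketch computes a carbon cost under each. It assumes marginal brackets, where only the part of the emission falling inside an interval is taxed at that interval's rate; the formula for C^Z may differ, and the function names and numbers are illustrative only.

```python
from typing import Sequence


def single_rate_tax(emission: float, beta: float) -> float:
    """Carbon cost under a single tax rate: every unit of emission is taxed at beta."""
    return beta * emission


def progressive_tax(emission: float, caps: Sequence[float], rates: Sequence[float]) -> float:
    """Carbon cost under a segmented progressive rate.

    caps[n] is the upper emission limit E_n of interval n (ascending order);
    rates[n] is the rate beta_n of that interval. rates has one more entry
    than caps: the last rate applies to emissions above the final cap.
    Assumption: marginal brackets, as described in the lead-in.
    """
    if len(rates) != len(caps) + 1:
        raise ValueError("rates must have exactly one more entry than caps")
    tax, lower = 0.0, 0.0
    for cap, rate in zip(caps, rates):
        if emission <= lower:
            break
        tax += rate * (min(emission, cap) - lower)
        lower = cap
    if emission > lower:
        tax += rates[-1] * (emission - lower)
    return tax


# Hypothetical schedule: up to 100 at rate 1.0, 100 to 200 at 2.0, above 200 at 3.0.
print(single_rate_tax(250.0, 2.0), progressive_tax(250.0, [100.0, 200.0], [1.0, 2.0, 3.0]))
```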

In the above process, the two carbon tax collection modes are the single tax rate collection mode and the segmented progressive tax rate collection mode. In the segmented progressive mode, different tiers of carbon emission correspond to different tax rates, so the carbon emission utility increases progressively tier by tier. The basic assumptions of the multi-type intermodal transportation emission reduction utility model are as follows: the stations at all city nodes operate well, and no utility arises in the yard stacking link from poor connections; goods are delivered "yard to yard", and the empty container delivery link of the "door to door" case is not considered; goods of the same kind being transported are treated as a single consignment and cannot be split; goods are delivered within the specified time, and delay or loss of goods caused by force majeure and similar factors is not considered; when goods are transshipped at a port city, transshipment occurs only once, and no secondary transshipment occurs.

Initialize a state space and an action space. The state space is the set of multi-type intermodal transport nodes; the action space consists of all transport strategies the agent can adopt at a node, where a transport strategy comprises the choice of the next transport node and the choice of transport mode.
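A minimal Python sketch of this initialization follows, assuming a small hypothetical network in which each state is a city node and each action is a (next node, transport mode) pair. The node names, the `links` structure and the zero initial values are illustrative assumptions.

```python
from typing import Dict, List, Tuple

MODES = {1: "road", 2: "rail", 3: "water"}

State = str                # a city node
Action = Tuple[str, int]   # (next node, transport mode)

# Hypothetical network: links[(i, j)] lists the modes available between i and j.
links: Dict[Tuple[str, str], List[int]] = {
    ("A", "B"): [1, 2],
    ("A", "C"): [1],
    ("B", "D"): [1, 3],
    ("C", "D"): [2],
}


def build_q_table(links: Dict[Tuple[str, str], List[int]]) -> Dict[State, Dict[Action, float]]:
    """Create a Q table with one zero entry per feasible (state, action) pair."""
    q: Dict[State, Dict[Action, float]] = {}
    for (i, j), modes in links.items():
        q.setdefault(i, {})
        q.setdefault(j, {})  # nodes without outgoing links get an empty action set
        for m in modes:
            q[i][(j, m)] = 0.0
    return q


q_table = build_q_table(links)
print(q_table["A"])
```

Storing only feasible pairs keeps infeasible moves out of the action space altogether, so no penalty entries are needed for them.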

Step 2: designing a reward function. If the action of the intelligent agent is favorable for completing the target task, giving the intelligent agent a reward; and if the action of the intelligent agent is not beneficial to completing the target task, giving punishment to the intelligent agent.

Wherein R is the reward function, that is, the feedback the learning environment gives the agent for executing action A in the current state S; C' is a constant used to quantify the reward or penalty after the agent's state changes.

Step 3: the agent starts a new trial from the starting point in the Q table, selects an action A from the Q table according to the ε-greedy algorithm, and executes it.

ε is the exploration rate. During reinforcement learning training, the agent selects the optimal action with probability ε; with probability 1 − ε it does not select the action that maximizes the reward value (reduces the utility the most) but selects another action, which generates more possibilities and helps reach the global optimum.
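The selection rule can be sketched in Python as follows. Note that the sketch follows the convention stated in the text, where ε is the probability of taking the best-valued action; many references define ε the other way round, as the probability of a random action. The function name and the dictionary layout of a Q-table row are assumptions.

```python
import random
from typing import Dict, Hashable, Optional


def choose_action(q_row: Dict[Hashable, float], epsilon: float,
                  rng: Optional[random.Random] = None) -> Hashable:
    """Pick an action from one row of the Q table.

    With probability epsilon the action with the highest Q value is chosen;
    otherwise one of the other actions is chosen uniformly at random.
    """
    rng = rng or random
    best = max(q_row, key=q_row.get)
    others = [a for a in q_row if a != best]
    if not others or rng.random() < epsilon:
        return best
    return rng.choice(others)


row = {"rail_to_B": 2.0, "road_to_B": 1.0, "road_to_C": 0.5}
print(choose_action(row, epsilon=0.8, rng=random.Random(0)))
```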

Step 4: the Q table is updated according to the reward function and the state change generated by the agent's action.

Q(S,A) = Q(S,A) + α(R + γ·max_{A′} Q(S′,A′) − Q(S,A))

Wherein: Q(S,A) is the expected return obtainable by taking action A in state S; S is the current state; A is the action taken in the current state; S′ is the next state; A′ is the action in the next state.

γ is the discount factor, taking a value in [0, 1]; it expresses how much importance the algorithm attaches to reward values obtainable in the future. If γ is 0, the value of the current state-action pair depends only on the immediate reward obtained by the agent; if γ is not 0, it depends not only on the immediate reward but also on the reward value of the next state-action pair, and the closer γ is to 1, the stronger this dependence.

α is the learning rate, taking a value in [0, 1]. If α is 0, the Q value depends only on the experience already in the Q table; if α is not 0, it depends both on the experience already in the Q table and on the Q value calculated in the later trial, and the closer α is to 1, the stronger this dependence.
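A single application of the update rule can be sketched in Python as follows. The nested-dictionary Q table, the node names and the numeric values are assumptions for illustration; a state with no available actions (the end point) is treated as contributing zero future value.

```python
from typing import Dict, Hashable

QTable = Dict[Hashable, Dict[Hashable, float]]


def q_update(q: QTable, s: Hashable, a: Hashable, r: float, s_next: Hashable,
             alpha: float, gamma: float) -> float:
    """Apply one Q-learning update in place and return the new Q(s, a)."""
    # Maximum over A' of Q(S', A'); an end state with no actions contributes 0.
    best_next = max(q[s_next].values(), default=0.0)
    q[s][a] += alpha * (r + gamma * best_next - q[s][a])
    return q[s][a]


# Hypothetical values: moving from A to B yields reward -2; the best option at B is worth 4.
q = {"A": {"B": 0.0}, "B": {"D": 4.0, "C": 1.0}, "C": {}, "D": {}}
new_value = q_update(q, "A", "B", r=-2.0, s_next="B", alpha=0.5, gamma=0.9)
print(new_value)
```

With α = 0.5 the entry moves halfway from its old value toward the target R + γ·max Q(S′,A′); with α = 0 it would not change at all, which matches the description of the learning rate above.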

Step 5: starting from the current state, the agent selects an action A from the Q table according to the ε-greedy algorithm, executes it, and updates the Q table.

Step 6: judging whether the end point of the multi-type intermodal transportation path is reached, if not, returning to the step 5; if the end point of the multimodal transportation path is reached, step 7 is executed.

Step 7: judging whether the set number of iterations has been reached; if not, performing step 8; if so, the optimization is finished, and the Q table obtained is the optimized emission reduction path selection strategy.

Step 8: judging whether the Q table has converged; if not, returning to step 3; if the Q table has converged, the optimal strategy for multi-type intermodal transportation emission reduction path selection is obtained.
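Steps 1 to 8 can be put together in a compact Python sketch. Everything specific here is an assumption made for illustration: the four-node network and its leg costs, the reward form (negative leg cost plus a constant bonus C' on reaching the end point), the hyperparameters, and the convergence test (no Q value changing by more than `tol` for `patience` consecutive episodes). ε follows the convention of the text, that is, the probability of taking the best-valued action.

```python
import random
from typing import Dict, List, Tuple

Action = Tuple[str, int]  # (next node, mode): 1 road, 2 rail, 3 water

# Hypothetical network: COST[node][(next node, mode)] is the total cost of that leg.
COST: Dict[str, Dict[Action, float]] = {
    "A": {("B", 1): 4.0, ("B", 2): 3.0, ("C", 1): 2.0},
    "B": {("D", 1): 5.0, ("D", 3): 2.0},
    "C": {("D", 2): 6.0},
    "D": {},
}
C_PRIME = 10.0  # assumed constant bonus granted when the end point is reached


def train(start: str, goal: str, alpha: float = 0.5, gamma: float = 0.9,
          epsilon: float = 0.8, max_episodes: int = 2000, tol: float = 1e-6,
          patience: int = 50, seed: int = 0) -> Dict[str, Dict[Action, float]]:
    rng = random.Random(seed)
    q = {s: {a: 0.0 for a in acts} for s, acts in COST.items()}  # step 1
    stable = 0
    for _ in range(max_episodes):                                 # step 7
        state, max_delta = start, 0.0                             # step 3
        while state != goal:                                      # step 6
            row = q[state]
            best = max(row, key=row.get)
            others = [a for a in row if a != best]
            # Best action with probability epsilon, otherwise another action.
            action = best if not others or rng.random() < epsilon else rng.choice(others)
            nxt = action[0]
            reward = -COST[state][action] + (C_PRIME if nxt == goal else 0.0)  # step 2
            target = reward + gamma * max(q[nxt].values(), default=0.0)
            delta = alpha * (target - row[action])                # step 4
            row[action] += delta
            max_delta = max(max_delta, abs(delta))
            state = nxt                                           # step 5
        stable = stable + 1 if max_delta < tol else 0
        if stable >= patience:                                    # step 8
            break
    return q


def greedy_path(q: Dict[str, Dict[Action, float]], start: str, goal: str) -> List[Action]:
    """Read the learned route off the Q table by always taking the best action."""
    path, state = [], start
    while state != goal:
        action = max(q[state], key=q[state].get)
        path.append(action)
        state = action[0]
    return path


q = train("A", "D")
print(greedy_path(q, "A", "D"))
```

On this toy network the cheapest route is rail from A to B followed by water from B to D (total cost 5), and the learned Q table should reproduce it. Requiring several consecutive stable episodes before stopping guards against ending training after an episode that happened to visit only well-learned entries.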

The length of a transportation path and the transportation time are related but not equivalent. The utility model of the invention includes both the transportation utility and the time utility, so it takes both path and time into account, and their weights can be adjusted so that either time or path is emphasized according to the actual situation. The determined strategy is therefore more targeted, and strategy selection is more efficient (the model of the invention determines a scheme more efficiently). On the selected path, the intermodal connections can be matched accurately, and the selected transfer points better suit their actual transfer conditions and capacities, which reduces waiting and other lost time overall and thus improves actual transportation efficiency (the scheme determined by the model yields higher actual transportation efficiency). The selected strategy also better suits the providers of transportation services and the needs of different users, so personalized service can be provided; this improves communication and matching among the consignor, the carrier and the actual transporter, and further improves overall efficiency.

The second embodiment is as follows:

the embodiment is a system for selecting a multi-type intermodal transportation emission reduction path of a container, which is used for implementing a method for selecting the multi-type intermodal transportation emission reduction path of the container and helps to complete the selection of the multi-type intermodal transportation emission reduction path of the container.

The third embodiment is as follows:

the embodiment is a device for selecting the multi-type intermodal transportation emission reduction path of the container, and the device is used for storing and/or operating a multi-type intermodal transportation emission reduction path selection system of the container. The device of the embodiment can be a common computer or the like, and can also be a specially developed device for storing and/or operating a multi-type container intermodal transportation emission reduction path selection system.

The invention selects reinforcement learning as the path selection method because reinforcement learning aims at the maximum future reward, that is, it seeks the globally optimal path. The multi-type intermodal transportation process of the container is optimized by combining transportation modes such as road, railway and water transportation, and a multi-type intermodal transportation emission reduction path scheme with the lowest utility as the optimization target is constructed. This makes full use of the existing facilities and equipment of each transportation mode, integrates resources in the transportation process, reduces the number of transshipments, supports sustainable development of transportation, reduces cost, increases efficiency, and improves the competitiveness of the logistics industry. Moreover, multi-type intermodal container transportation involves few transshipments, fixed start and end points, and only three transportation modes (road, railway and water), so the whole transportation process is simple and intuitive compared with urban road transportation. The space and time complexity of the multi-type intermodal emission reduction model are therefore small, a single path selection consumes little computation and time, and the goals of enterprises can be met effectively.

The core advantage of the reinforcement learning model established by the invention is that it actively acquires feedback from the environment to search for an ideal path, and the whole path optimization process is executed dynamically, which resembles artificial intelligence as people commonly understand it. In the era of big data, a Python crawler can acquire the latest transportation conditions, transportation costs and the like, so that the selected path keeps pace with the times.

The present invention is capable of other embodiments and its several details are capable of modifications in various obvious respects, all without departing from the spirit and scope of the present invention.

Claims

1. A multi-type intermodal transportation emission reduction path selection method for containers is characterized by comprising the following steps:

the constraints are as follows:

wherein m and n are transportation modes;

representing the basic transportation utility of different transportation modes; C_handle is the transfer utility of the cargo; C_safe is the safety utility of the cargo; C_time is the time utility of the cargo; C^E is the carbon emission utility of the cargo; β is the carbon tax utility function under the single tax rate collection mode;

2. The method for selecting the multi-type intermodal transportation emission reduction path for the container according to claim 1, wherein the process for realizing the selection of the multi-type intermodal transportation emission reduction path for the container based on the reinforcement learning algorithm comprises the following steps:

3. The method of claim 2, wherein the reward function R is as follows:

4. The method as claimed in claim 2 or 3, wherein the formula for updating the Q table according to the reward function and the state change generated by the agent's action is as follows:

Q(S,A) = Q(S,A) + α(R + γ·max_{A′} Q(S′,A′) − Q(S,A))

5. A container multi-intermodal emission reduction routing system for performing a container multi-intermodal emission reduction routing method as claimed in any one of claims 1 to 4.

6. A container multi-intermodal emission abatement path selection apparatus for storing and/or operating a container multi-intermodal emission abatement path selection system as claimed in any one of claims 1 to 4.