CN106650993B - Dynamic resource optimization method based on Markov decision process - Google Patents
- Publication number: CN106650993B
- Application number: CN201610887855.XA
- Authority: CN (China)
- Prior art keywords: development, enterprise, task, capability, capacity
- Legal status: Active (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/04—Manufacturing
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02P—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
- Y02P90/00—Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
- Y02P90/30—Computing systems specially adapted for manufacturing
Abstract
The invention belongs to the technical field of dynamic resource optimization and in particular relates to a dynamic resource optimization method based on a Markov decision process. Departing from traditional manufacturing resource selection methods, the method abstracts the problem of precisely regulating cloud manufacturing resources across multiple development tasks in a cloud manufacturing environment as a Markov decision process, and thereby models mathematically how the uncertainty of the development process affects resource selection. With the expected development cost as the objective function, the cross-entropy method is used for the solution: the combinatorial optimization problem is converted into an associated stochastic optimization problem, and the optimal selection probabilities of the cloud manufacturing resources are obtained. This achieves reasonable scheduling and efficient use of manufacturing resources in the collaborative development of complex products and effectively reduces product development risk and manufacturing cost.
Description
Technical Field
The invention belongs to the technical field of dynamic resource optimization, and particularly relates to a dynamic resource optimization method based on a Markov decision process.
Background
In today's global manufacturing environment, the development of complex products is often completed by scheduling enterprise resources from different regions, of different types and with different characteristics. Cloud manufacturing, supported by an information network, integrates social manufacturing resources and capabilities and connects geographically dispersed enterprise resources with complementary capabilities. It enables the sharing, integration and cooperative operation of dispersed manufacturing resources, so that the design, manufacture and assembly of complex products, together with sales and service over the whole life cycle, are completed cooperatively, responding better to market demand while maximizing benefit.
Manufacturing resources in a cloud manufacturing environment comprise all the physical elements involved in an enterprise's production activities over the whole product life cycle; they are numerous in variety, heterogeneous in form and geographically dispersed. Precisely regulating cloud manufacturing resources and manufacturing capabilities, and constructing the cloud manufacturing resource combination with the best overall service quality and the highest cluster cooperation capability, has become the key to carrying out cloud manufacturing smoothly.
The quality of the cloud manufacturing resource combination optimization model and of its solving mechanism directly affects product development quality and whether the development process can proceed safely and smoothly. The characteristics of cloud manufacturing mean that manufacturing resource selection during product development is full of uncertain factors. Most current research on the optimal configuration of manufacturing resources considers only time, cost, quality, resource evaluation and similar issues, and does not fully consider how the uncertainty of the product development process affects cloud manufacturing resource selection: product development may fail, which not only makes the development process risky but also strongly affects development cost and the development cycle.
Therefore, how to achieve dynamic optimal selection of cloud manufacturing resources under the uncertainty of the product development process remains a technical problem in urgent need of a solution.
A Markov decision process is an optimal decision process for a stochastic dynamic system, based on the theory of Markov processes. Its defining property is that, given the current state, the future evolution of the process is independent of its past evolution; the decision maker periodically or continuously observes a stochastic dynamic system with the Markov property and makes decisions sequentially. Markov decision processes are now widely applied in many fields of natural science and engineering, and have been put into practice particularly extensively in prediction.
Disclosure of Invention
(I) Technical problem to be solved
The technical problem to be solved by the invention is how to provide a dynamic resource optimization method based on a Markov decision process that treats the dynamic resource optimization configuration problem in a cloud manufacturing environment as a Markov decision process and uses the cross-entropy algorithm to solve it.
(II) Technical solution
To solve the above technical problem, the present invention provides a dynamic resource optimization method based on a Markov decision process. The method is implemented on a dynamic resource optimization system comprising a complex product task decomposition module, a development capability network construction module, a dynamic resource selection decision module and a cross-entropy solving module. The method comprises the following steps:
Step S1: According to the performance, structure and precision requirements of the complex product, the complex product task decomposition module decomposes the total task F into n development subtasks, i.e. $F = \{f_i \mid i = 1, 2, \dots, n\}$, where $f_i$ denotes the i-th subtask in the development process.
Step S2: According to each development subtask's requirements for cloud manufacturing resources, the development capability network construction module establishes a dynamic complex product development capability network composed of the development capability resources of cross-regional enterprises. The development capability network construction module comprises an enterprise development capability decomposer, a capability resource pool builder and a capability network builder.
Step S2 comprises the following sub-steps:
Step S201: The enterprise development capability decomposer virtualizes the manufacturing resources of cross-regional enterprises in the cloud manufacturing environment and expresses enterprise development capability uniformly as an enterprise development capability unit $c_{ij} = \{lov(c_{ij}), f_i, j\}$, where $lov(c_{ij})$ is the development capability level of enterprise j for completing subtask $f_i$; the level reflects the expected degree of completion of the development task.
Step S202: On the basis of step S201, for a given subtask $f_i$, step S201 is repeated once for each enterprise among the cross-regional enterprise manufacturing resources, and the capability resource pool builder then establishes a virtual enterprise development capability resource pool $CP(i) = \{c_{i1}, c_{i2}, \dots, c_{ij}\}$, $i = 1, \dots, n$.
Step S203: According to the sequential relationship between adjacent subtasks, the capability network builder establishes the sequential relationship between the enterprise development capability resource pools and the association between the enterprise development capability units in each two adjacent resource pools, thereby forming a dynamic complex product development capability network.
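The data structures of steps S201 to S203 can be sketched as follows. This is a minimal illustration, not the patented implementation: the names `CapabilityUnit`, `build_pools` and `build_network` and the level values are assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CapabilityUnit:
    """Enterprise development capability unit c_ij = {lov(c_ij), f_i, j}."""
    level: int       # lov(c_ij), the development capability level
    subtask: int     # index i of subtask f_i (1-based)
    enterprise: int  # index j of the enterprise (1-based)


def build_pools(levels):
    """Build resource pools CP(1..n) from a table levels[i][j] = lov(c_ij)."""
    return [
        [CapabilityUnit(level=lv, subtask=i + 1, enterprise=j + 1)
         for j, lv in enumerate(row)]
        for i, row in enumerate(levels)
    ]


def build_network(pools):
    """Associate every unit in CP(i) with every unit in CP(i+1) (step S203)."""
    edges = []
    for current_pool, next_pool in zip(pools, pools[1:]):
        for u in current_pool:
            for v in next_pool:
                edges.append((u, v))
    return edges


# Hypothetical levels for n = 3 subtasks with 2 candidate enterprises each.
pools = build_pools([[7, 9], [6, 8], [9, 5]])
edges = build_network(pools)
print(len(pools), len(edges))  # 3 pools, 2 * (2 * 2) = 8 associations
```

Storing the network as an explicit edge list keeps the sketch close to the text; for a fully connected layered network such as this one, the list of pools alone carries the same information.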
Step S3: Based on the complex product development capability network, the dynamic resource selection decision module obtains a dynamic resource allocation strategy by the Markov decision method. Step S3 comprises the following steps:
Step S301: In the complex product development capability network, during the development of each subtask, only one enterprise development capability unit $c_{ij}$ can be allocated at any time t (t = 0, 1, 2, …) to meet the corresponding development requirement.
Step S302: Taking the subtask development stage at time t as the task state at time t, the task state space of the total task F can be expressed as $S = \{S_t, t \ge 0\} = \{f_1, f_2, \dots, f_n\}$.
Step S303: If the subtask at time t is $f_i$, the task state is $S_t = f_i$. The enterprise development capability unit $c_{ij}$ allocated at time t completes the development subtask $f_i$ successfully with probability $\theta_{ij}$, in which case the process enters the enterprise development capability unit allocation stage of the next subtask $f_{i+1}$ at time t+1, and the task state at time t+1 is $S_{t+1} = f_{i+1}$.
With probability $1 - \theta_{ij}$ the development task $f_i$ is not completed, i.e. the development fails, and the task state at time t+1 is the same as at time t, i.e. $S_{t+1} = S_t = f_i$. The probability $\theta_{ij}$ is related to the development capability level $lov(c_{ij})$ by $\theta_{ij} = lov(c_{ij})/10$.
Step S304: For the enterprise development capability unit $c_{ij}$ allocated to subtask $f_i$ at time t, each development attempt incurs a development cost of $C_S(c_{ij})$.
Step S305: $S_t = f_n$ is set as the target state, i.e. the final state.
Step S306: From the task start time 0 to the task completion time t, the allocation of enterprise development capability units can be described by a history $H_t$ of the Markov process.
Step S307: From the history $H_t$ of step S306, the set $\pi$ of probabilities of allocating each enterprise development capability unit given the history $H_t$ is obtained; this set is the dynamic resource allocation strategy.
Step S308: Let $\gamma$ be the number of development attempts made with the allocated enterprise development capability units before the final state $S_t = f_n$ is first reached from the demand state $S_0 = f_1$; the sequence of units allocated from time 0 to time $\gamma$ is defined as the development capability sequence X.
Step S309: The dynamic resource scheduling optimization configuration problem in the cloud manufacturing environment can then be described as seeking an optimal selection strategy that minimizes the expected value of the development cost Z(X), where $E_\pi$ denotes the expectation with respect to the probability density $\pi$.
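The state dynamics of steps S301 to S309 can be illustrated with a small simulation. The sketch below assumes a strategy that depends only on the current subtask (a table `policy[i][j]`), and all numeric values are hypothetical; the function name `simulate_development` is introduced here for illustration.

```python
import random


def simulate_development(policy, theta, cost, rng):
    """Run one development from f_1 until f_n succeeds.

    policy[i][j]: probability of allocating unit j while in subtask i
    theta[i][j]:  success probability of unit j on subtask i (lov / 10)
    cost[i][j]:   cost of one development attempt with unit j on subtask i
    Returns the total cost Z(X) and the allocation sequence X.
    """
    n = len(policy)
    state = 0          # S_0 = f_1 (0-based index)
    total = 0.0
    sequence = []
    while state < n:
        units = range(len(policy[state]))
        j = rng.choices(units, weights=policy[state])[0]
        sequence.append((state, j))
        total += cost[state][j]         # every attempt is paid for
        if rng.random() < theta[state][j]:
            state += 1                  # success: move to the next subtask
        # on failure S_{t+1} = S_t, so the same subtask is attempted again
    return total, sequence


# Hypothetical data: 2 subtasks, 2 candidate units each.
levels = [[7, 9], [6, 8]]
theta = [[lv / 10 for lv in row] for row in levels]
cost = [[100.0, 150.0], [80.0, 120.0]]
uniform = [[0.5, 0.5], [0.5, 0.5]]

rng = random.Random(42)
runs = [simulate_development(uniform, theta, cost, rng)[0] for _ in range(2000)]
print("estimated expected cost:", sum(runs) / len(runs))
```

Averaging many simulated runs gives a Monte Carlo estimate of the expected development cost under the chosen strategy.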
Step S4: The cross-entropy solving module optimizes and outputs the dynamic resource allocation strategy. Step S4 comprises the following steps:
Step S401: The dynamic resource allocation strategy $\pi$ is initialized so that every enterprise development capability unit has the same probability of being allocated, i.e. after initialization the allocation of enterprise development capability units is a purely random process. The strategy $\pi$ is thereby characterized as an initial transition matrix P whose elements are all equal and whose rows each sum to 1.
Step S402: An enterprise development capability unit is selected at random as the starting point in the complex product development capability network corresponding to the total task F. Since there are n subtasks, a path $X_1, X_2, \dots, X_n$ is generated by n random state transitions based on the initial transition matrix P. Because the state transitions are random, N paths can be obtained in this way, and the development cost $Z(X_i)$ of each path $X_i$ is calculated.
Step S403: The development costs $Z(X_i)$ are sorted in ascending order, $Z_{(1)} \le Z_{(2)} \le \dots \le Z_{(N)}$, and the quantile value $\gamma$ is taken from the sorted costs.
Step S404: From the quantile value $\gamma$, a first probability transition matrix P' is calculated by the Lagrange multiplier method. Its element $p'_{ij}$ is
$$p'_{ij} = \frac{n_{ij}}{n_i}$$
where $p'_{ij}$ is the probability that capability unit j is allocated when subtask i is developed; $n_i$ is the number of times a capability unit is allocated while developing subtask i, counted over those of the N paths whose development cost is not higher than $\gamma$; and $n_{ij}$ is the number of times capability unit j is allocated while developing subtask i, counted over the same paths.
Step S405: The first probability transition matrix P' and the initial transition matrix P are combined by a smoothing technique to obtain a second probability transition matrix $P'' = \alpha P' + (1 - \alpha) P$, where $\alpha$ is the smoothing parameter.
Step S406: The initial transition matrix P is reassigned: the second probability transition matrix P'' is assigned to P.
Step S407: Steps S402 to S406 are repeated until, over a given number d of iterations with their successive initial transition matrices P, the quantile value $\gamma$ of the development cost $Z(X_i)$ from the demand state $S_0 = f_1$ to first reaching the final state $S_t = f_n$ is the same in every iteration.
Step S408: When the condition of step S407 is met, an optimal selection strategy is considered to have been found, and the current initial transition matrix P is output as the optimal selection strategy, which gives the optimal combination of dynamic resources for the complex product in the cloud manufacturing environment.
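Steps S401 to S408 follow the usual shape of a cross-entropy iteration: sample, select the low-cost paths, re-estimate by counting, smooth, and stop when the quantile stabilizes. The sketch below is one possible reading under stated assumptions, not the patent's reference implementation. In particular, the quantile index `int(rho * N) - 1`, the stopping test on d + 1 equal quantile values, the `max_iter` safeguard and all parameter defaults are assumptions introduced here.

```python
import random


def sample_path(P, theta, cost, rng):
    """Sample one path; return its cost and the (subtask, unit) allocations."""
    total, visits = 0.0, []
    for i, row in enumerate(P):
        done = False
        while not done:                       # retry subtask i until it succeeds
            j = rng.choices(range(len(row)), weights=row)[0]
            visits.append((i, j))
            total += cost[i][j]
            done = rng.random() < theta[i][j]
    return total, visits


def cross_entropy(theta, cost, N=500, rho=0.1, alpha=0.7, d=5,
                  max_iter=200, seed=0):
    rng = random.Random(seed)
    n, m = len(cost), len(cost[0])
    P = [[1.0 / m] * m for _ in range(n)]     # S401: equal probabilities
    gammas = []
    for _ in range(max_iter):
        paths = [sample_path(P, theta, cost, rng) for _ in range(N)]  # S402
        costs = sorted(z for z, _ in paths)                           # S403
        gamma = costs[max(0, int(rho * N) - 1)]   # assumed quantile rule
        counts = [[0] * m for _ in range(n)]
        for z, visits in paths:                   # S404: count elite paths only
            if z <= gamma:
                for i, j in visits:
                    counts[i][j] += 1
        for i in range(n):
            n_i = sum(counts[i])                  # > 0: every path visits each i
            for j in range(m):
                p_new = counts[i][j] / n_i
                P[i][j] = alpha * p_new + (1 - alpha) * P[i][j]  # S405-S406
        gammas.append(gamma)
        if len(gammas) > d and len(set(gammas[-(d + 1):])) == 1:     # S407
            break
    return P, gamma                               # S408


# Hypothetical 2 x 2 problem in which every unit always succeeds.
demo_theta = [[1.0, 1.0], [1.0, 1.0]]
demo_cost = [[5.0, 1.0], [1.0, 5.0]]
P_opt, gamma_opt = cross_entropy(demo_theta, demo_cost, N=200)
print(P_opt, gamma_opt)
```

Because the smoothed update never sets a probability exactly to zero for `alpha < 1`, the matrix approaches, but does not reach, a deterministic selection; the stopping rule on the quantile ends the loop once the low-cost paths dominate.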
(III) Advantageous effects
Compared with the prior art, the invention provides a dynamic resource optimization method based on a Markov decision process. Departing from traditional manufacturing resource selection methods, it abstracts the problem of precisely regulating cloud manufacturing resources across multiple development tasks in a cloud manufacturing environment as a Markov decision process, and thereby models mathematically how the uncertainty of the development process affects resource selection. With the expected development cost as the objective function, the cross-entropy method is used for the solution: the combinatorial optimization problem is converted into an associated stochastic optimization problem, and the optimal selection probabilities of the cloud manufacturing resources are obtained. This achieves reasonable scheduling and efficient use of manufacturing resources in the collaborative development of complex products and effectively reduces product development risk and manufacturing cost.
Drawings
FIG. 1 shows the complex product development capability network.
FIG. 2 is a schematic diagram of the dynamic resource selection decision based on the Markov decision process.
FIG. 3 is a schematic diagram of a comparison between development costs of an optimal selection strategy and a random selection strategy.
Detailed Description
In order to make the objects, contents, and advantages of the present invention clearer, the following detailed description of the embodiments of the present invention will be made in conjunction with the accompanying drawings and examples.
To solve the above technical problem, the present invention provides a dynamic resource optimization method based on a Markov decision process, as shown in FIGS. 1 to 3. The method is implemented on a dynamic resource optimization system comprising a complex product task decomposition module, a development capability network construction module, a dynamic resource selection decision module and a cross-entropy solving module. The method comprises the following steps:
Step S1: According to the performance, structure and precision requirements of the complex product, the complex product task decomposition module decomposes the total task F into n development subtasks, i.e. $F = \{f_i \mid i = 1, 2, \dots, n\}$, where $f_i$ denotes the i-th subtask in the development process.
Step S2: According to each development subtask's requirements for cloud manufacturing resources, the development capability network construction module establishes a dynamic complex product development capability network composed of the development capability resources of cross-regional enterprises. The development capability network construction module comprises an enterprise development capability decomposer, a capability resource pool builder and a capability network builder.
Step S2 comprises the following sub-steps:
Step S201: The enterprise development capability decomposer virtualizes the manufacturing resources of cross-regional enterprises in the cloud manufacturing environment and expresses enterprise development capability uniformly as an enterprise development capability unit $c_{ij} = \{lov(c_{ij}), f_i, j\}$, where $lov(c_{ij})$ is the development capability level of enterprise j for completing subtask $f_i$; the level reflects the expected degree of completion of the development task.
Step S202: On the basis of step S201, for a given subtask $f_i$, step S201 is repeated once for each enterprise among the cross-regional enterprise manufacturing resources, and the capability resource pool builder then establishes a virtual enterprise development capability resource pool $CP(i) = \{c_{i1}, c_{i2}, \dots, c_{ij}\}$, $i = 1, \dots, n$.
Step S203: According to the sequential relationship between adjacent subtasks, the capability network builder establishes the sequential relationship between the enterprise development capability resource pools and the association between the enterprise development capability units in each two adjacent resource pools (that is, each enterprise development capability unit in one resource pool is associated with every enterprise development capability unit in the adjacent resource pool), thereby forming a dynamic complex product development capability network.
Step S3: Based on the complex product development capability network, the dynamic resource selection decision module obtains a dynamic resource allocation strategy by the Markov decision method. Step S3 comprises the following steps:
Step S301: In the complex product development capability network, during the development of each subtask, only one enterprise development capability unit $c_{ij}$ can be allocated at any time t (t = 0, 1, 2, …) to meet the corresponding development requirement.
Step S302: Taking the subtask development stage at time t as the task state at time t, the task state space of the total task F can be expressed as $S = \{S_t, t \ge 0\} = \{f_1, f_2, \dots, f_n\}$.
Step S303: If the subtask at time t is $f_i$, the task state is $S_t = f_i$. The enterprise development capability unit $c_{ij}$ allocated at time t completes the development subtask $f_i$ successfully with probability $\theta_{ij}$, in which case the process enters the enterprise development capability unit allocation stage of the next subtask $f_{i+1}$ at time t+1, and the task state at time t+1 is $S_{t+1} = f_{i+1}$.
With probability $1 - \theta_{ij}$ the development task $f_i$ is not completed, i.e. the development fails, and the task state at time t+1 is the same as at time t, i.e. $S_{t+1} = S_t = f_i$. The probability $\theta_{ij}$ is related to the development capability level $lov(c_{ij})$ by $\theta_{ij} = lov(c_{ij})/10$.
Step S304: For the enterprise development capability unit $c_{ij}$ allocated to subtask $f_i$ at time t, each development attempt incurs a development cost of $C_S(c_{ij})$.
Step S305: $S_t = f_n$ is set as the target state, i.e. the final state.
Step S306: From the task start time 0 to the task completion time t, the allocation of enterprise development capability units can be described by a history $H_t$ of the Markov process.
Step S307: From the history $H_t$ of step S306, the set $\pi$ of probabilities of allocating each enterprise development capability unit given the history $H_t$ is obtained; this set is the dynamic resource allocation strategy.
Step S308: Let $\gamma$ be the number of development attempts made with the allocated enterprise development capability units before the final state $S_t = f_n$ is first reached from the demand state $S_0 = f_1$; the sequence of units allocated from time 0 to time $\gamma$ is defined as the development capability sequence X.
Step S309: The dynamic resource scheduling optimization configuration problem in the cloud manufacturing environment can then be described as seeking an optimal selection strategy that minimizes the expected value of the development cost Z(X), where $E_\pi$ denotes the expectation with respect to the probability density $\pi$.
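As a hedged aside, the expected cost of step S309 has a closed form if the strategy is assumed to be stationary, i.e. the allocation probability $p_{ij}$ depends only on the current subtask $f_i$. The symbol $V_i$ (expected cost of completing subtask $f_i$) is notation introduced here and does not come from the patent text.

```latex
% One attempt at subtask f_i: pay C_S(c_{ij}), then with probability
% 1 - \theta_{ij} remain in f_i and face the same expected cost again.
V_i = \sum_{j} p_{ij}\,\bigl[\, C_S(c_{ij}) + (1 - \theta_{ij})\, V_i \,\bigr]
\quad\Longrightarrow\quad
V_i = \frac{\sum_{j} p_{ij}\, C_S(c_{ij})}{\sum_{j} p_{ij}\, \theta_{ij}},
\qquad
E_\pi\bigl[ Z(X) \bigr] = \sum_{i=1}^{n} V_i .
```

For a deterministic choice of a single unit $c_{ij}$ per subtask this reduces to $V_i = C_S(c_{ij}) / \theta_{ij}$, so under this assumption a cheap unit with a low success probability can be more expensive in expectation than a costlier but more reliable one.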
Step S4: The cross-entropy solving module optimizes and outputs the dynamic resource allocation strategy. Step S4 comprises the following steps:
Step S401: The dynamic resource allocation strategy $\pi$ is initialized so that every enterprise development capability unit has the same probability of being allocated, i.e. after initialization the allocation of enterprise development capability units is a purely random process. The strategy $\pi$ is thereby characterized as an initial transition matrix P whose elements are all equal and whose rows each sum to 1.
Step S402: An enterprise development capability unit is selected at random as the starting point in the complex product development capability network corresponding to the total task F. Since there are n subtasks, a path $X_1, X_2, \dots, X_n$ is generated by n random state transitions based on the initial transition matrix P. Because the state transitions are random, N paths can be obtained in this way, and the development cost $Z(X_i)$ of each path $X_i$ is calculated.
Step S403: The development costs $Z(X_i)$ are sorted in ascending order, $Z_{(1)} \le Z_{(2)} \le \dots \le Z_{(N)}$, and the quantile value $\gamma$ is taken from the sorted costs.
Step S404: From the quantile value $\gamma$, a first probability transition matrix P' is calculated by the Lagrange multiplier method. Its element $p'_{ij}$ is
$$p'_{ij} = \frac{n_{ij}}{n_i}$$
where $p'_{ij}$ is the probability that capability unit j is allocated when subtask i is developed; $n_i$ is the number of times a capability unit is allocated while developing subtask i, counted over those of the N paths whose development cost is not higher than $\gamma$; and $n_{ij}$ is the number of times capability unit j is allocated while developing subtask i, counted over the same paths.
Step S405: The first probability transition matrix P' and the initial transition matrix P are combined by a smoothing technique to obtain a second probability transition matrix $P'' = \alpha P' + (1 - \alpha) P$, where $\alpha$ is the smoothing parameter.
Step S406: The initial transition matrix P is reassigned: the second probability transition matrix P'' is assigned to P.
Step S407: Steps S402 to S406 are repeated until, over a given number d of iterations with their successive initial transition matrices P, the quantile value $\gamma$ of the development cost $Z(X_i)$ from the demand state $S_0 = f_1$ to first reaching the final state $S_t = f_n$ is the same in every iteration.
Step S408: When the condition of step S407 is met, an optimal selection strategy is considered to have been found, and the current initial transition matrix P is output as the optimal selection strategy, which gives the optimal combination of dynamic resources for the complex product in the cloud manufacturing environment.
Examples
In this embodiment, the task F is decomposed according to the complex product task decomposition model to form the task set $F = \{f_i \mid i = 1, 2, \dots, 8\}$.
The cross-regional enterprise manufacturing resources in the cloud manufacturing environment are virtualized and servitized, and each enterprise development capability unit is expressed as $c_{ij} = \{lov(c_{ij}), f_i, j\}$, where j = 1, 2, …, 5.
The virtual enterprise development capability resource pools are established as follows:
$CP(1) = \{c_{11}, c_{12}, c_{13}, c_{14}, c_{15}\}$
$CP(2) = \{c_{21}, c_{22}, c_{23}, c_{24}, c_{25}\}$
$CP(3) = \{c_{31}, c_{32}, c_{33}, c_{34}, c_{35}\}$
$CP(4) = \{c_{41}, c_{42}, c_{43}, c_{44}, c_{45}\}$
$CP(5) = \{c_{51}, c_{52}, c_{53}, c_{54}, c_{55}\}$
$CP(6) = \{c_{61}, c_{62}, c_{63}, c_{64}, c_{65}\}$
$CP(7) = \{c_{71}, c_{72}, c_{73}, c_{74}, c_{75}\}$
$CP(8) = \{c_{81}, c_{82}, c_{83}, c_{84}, c_{85}\}$
For convenience of description, the development capability levels $lov(c_{ij})$ and the development costs of all enterprise development capability units $c_{ij}$ in the cloud manufacturing environment are expressed as sets. Table 1 gives the development capability unit level set $LOV = \{lov(c_{ij}) \mid i = 1, \dots, 8;\ j = 1, \dots, 5\}$ and Table 2 gives the development cost set $C_S = \{C_S(c_{ij}) \mid i = 1, \dots, 8;\ j = 1, \dots, 5\}$, where $lov(c_{ij})$ and $C_S(c_{ij})$ are preset values.
Table 1. Development capability unit level set LOV:
Table 2. Development cost set $C_S$:
A candidate development capability resource pool is allocated to each subtask to form the complex product development capability network, as shown in FIG. 1.
A schematic diagram of the dynamic resource selection decision based on the Markov decision process is shown in FIG. 2, where $\theta_{ij}$ denotes the probability that development capability unit $c_{ij}$ successfully completes the development task $f_i$, and $1 - \theta_{ij}$ denotes the probability that the development task $f_i$ is not completed, i.e. that product development fails at the current stage. The probability $\theta_{ij}$ is related to the development capability level $lov(c_{ij})$ by $\theta_{ij} = lov(c_{ij})/10$. Table 3 gives the development success probabilities $\theta_{ij}$ for tasks $f_1$ to $f_8$, i.e. $\theta = \{\theta_{ij} \mid i = 1, \dots, 8;\ j = 1, \dots, 5\}$.
Table 3. Set of development success probabilities $\theta_{ij}$:
In this embodiment, the cross-entropy algorithm is used to optimize the dynamic resource selection decision model.
An initial transition matrix P is generated such that every development capability unit is selected with equal probability, i.e. $p_{ij} = 1/5$ for each of the five units in a resource pool.
With N = 5000, ρ = 0.16 and d = 10, and with α as the smoothing parameter, the dynamic resource selection strategy based on the Markov decision process is optimized by the cross-entropy method. After 46 iterations the minimum development cost of 2132 is obtained, together with the corresponding optimal selection probabilities.
Therefore, in this example, the optimal dynamic resource selection strategy based on the Markov decision process for development subtasks $f_1$ to $f_8$ is $c_{14}, c_{22}, c_{33}, c_{44}, c_{53}, c_{64}, c_{71}, c_{85}$.
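One plausible way to read a selection strategy off a converged probability matrix is to take the most probable unit in each row. The sketch below assumes this argmax rule, which the text does not state explicitly, and uses a hypothetical three-row matrix rather than the embodiment's actual probabilities.

```python
def greedy_strategy(P):
    """For each subtask i, return the 1-based index j of the most probable unit."""
    return [max(range(len(row)), key=row.__getitem__) + 1 for row in P]


# Hypothetical, nearly degenerate probabilities: 3 subtasks, 5 units each.
P = [
    [0.01, 0.01, 0.01, 0.96, 0.01],
    [0.02, 0.95, 0.01, 0.01, 0.01],
    [0.00, 0.01, 0.98, 0.01, 0.00],
]
strategy = greedy_strategy(P)
print([f"c{i + 1}{j}" for i, j in enumerate(strategy)])  # ['c14', 'c22', 'c33']
```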
The optimized selection strategy and a random selection strategy were each simulated. The development costs of the two strategies over 10 runs are shown in FIG. 3: the average development cost over the 10 simulations is 3113 for the optimized selection strategy and 3610.9 for the random selection strategy. The simulation results show that the average development cost of the optimized strategy is lower than that of the random strategy. It cannot be guaranteed, however, that the optimized strategy costs less than the random strategy in every individual run, because whether a development capability unit completes its task in a given attempt is random and strongly affects the development cost; comparing average development costs reduces the influence of this randomness.
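For orientation, the relative saving implied by the two reported averages can be worked out directly. The calculation below uses only the two figures quoted above; the percentage itself is not stated in the source.

```latex
\frac{3610.9 - 3113}{3610.9} = \frac{497.9}{3610.9} \approx 0.138
```

That is, over these 10 runs the optimized strategy's average development cost is roughly 13.8% lower than that of the random strategy.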
The above is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make several modifications and variations without departing from the technical principle of the present invention, and such modifications and variations shall also be regarded as falling within the protection scope of the present invention.
Claims (1)
1. A dynamic resource optimization method based on a Markov decision process, characterized in that the method is implemented on a dynamic resource optimization system comprising a complex product task decomposition module, a development capability network construction module, a dynamic resource selection decision module and a cross-entropy solving module, and the method comprises the following steps:
step S1: the complex product task decomposition module decomposes the total task F into n development subtasks according to the performance, structure and precision requirements of the complex product, i.e., $F = \{f_i \mid i = 1, 2, \ldots, n\}$, where $f_i$ denotes the i-th subtask in the development process;
step S2: according to the cloud manufacturing resources required by each development subtask, the development capability network construction module establishes a dynamic complex product development capability network composed of the development capability resources of cross-region enterprises; the development capability network construction module comprises an enterprise development capability decomposer, a capability resource pool builder and a capability network builder;
the step S2 includes the following sub-steps:
step S201: the enterprise development capability decomposer virtualizes the manufacturing resources of cross-region enterprises in the cloud manufacturing environment and uniformly expresses enterprise development capability as enterprise development capability units $c_{ij} = \{lov(c_{ij}), f_i, j\}$, where $lov(c_{ij})$ is the development capability level of enterprise $j$ for completing subtask $f_i$, and the magnitude of the level reflects the expected degree of completion of the development task;
step S202: on the basis of step S201, for a given subtask $f_i$, step S201 is repeated once for each enterprise in the cross-region enterprise manufacturing resources, and the capability resource pool builder then establishes a virtual enterprise development capability resource pool $CP(i) = \{c_{i1}, c_{i2}, \ldots, c_{ij}\}$, $i = 1, \ldots, n$;
step S203: according to the sequential relationship between adjacent subtasks, the capability network builder establishes the sequential relationship between the enterprise development capability resource pools and the association between the enterprise development capability units in each pair of adjacent resource pools, thereby forming the dynamic complex product development capability network;
step S3: based on a complex product development capability network, a dynamic resource selection decision module obtains a dynamic resource allocation strategy according to a Markov decision method; the step S3 includes the steps of:
step S301: in the complex product development capability network, during the development of each subtask, only one enterprise development capability unit $c_{ij}$ can be allocated at any time $t$ ($t = 0, 1, 2, \ldots$) to meet the corresponding development requirement;
step S302: taking the subtask development stage in progress at time $t$ as the task state at time $t$, the task state space of the total task F can be expressed as $S = \{S_t, t \ge 0\} = \{f_1, f_2, \ldots, f_n\}$;
step S303: if the subtask at time $t$ is $f_i$, the task state is $S_t = f_i$; the enterprise development capability unit allocated at time $t$ completes subtask $f_i$ with probability $\theta_{ij}$, in which case the process enters, at time $t+1$, the capability unit allocation stage of the next subtask $f_{i+1}$, and the task state at time $t+1$ is $S_{t+1} = f_{i+1}$;
with probability $1-\theta_{ij}$, subtask $f_i$ is not completed, i.e., task development fails, and the task state at time $t+1$ is the same as at time $t$, i.e., $S_{t+1} = S_t = f_i$; the probability $\theta_{ij}$ is related to the development capability level $lov(c_{ij})$ by $\theta_{ij} = lov(c_{ij})/10$;
step S304: for a given subtask $f_i$, each development of the task by the enterprise development capability unit allocated at time $t$ incurs a corresponding development cost;
step S305: the state $S_t = f_n$ is set as the target state, i.e., the final state;
step S306: from the task start time 0 to the task completion time $t$, the allocation process of enterprise development capability units can be described by a history $H_t$ of the Markov process;
step S307: based on the history $H_t$ of step S306, the probability density of allocating an enterprise development capability unit conditional on $H_t$ is obtained; this is the dynamic resource allocation strategy $\pi$;
step S308: let $\gamma$ denote, conditional on the history $H_t$, the number of developments performed by the allocated enterprise development capability units when the process, starting from the demand state $S_0 = f_1$, first reaches the final state $S_t = f_n$; all enterprise development capability units allocated from time 0 until the number of developments reaches $\gamma$ are defined as the development capability sequence $X$;
step S309: the optimal configuration problem of dynamic resource scheduling in the cloud manufacturing environment can be described as finding a dynamic resource allocation strategy $\pi$ that minimizes the expected development cost $E_\pi[Z(X)]$, where $Z(X)$ is the development cost and $E_\pi$ denotes the expectation with respect to the dynamic resource allocation strategy $\pi$;
step S4: optimizing and outputting the dynamic resource allocation strategy by adopting a cross entropy solving module; the step S4 includes the steps of:
step S401: the dynamic resource allocation strategy $\pi$ is initialized so that every enterprise development capability unit has the same probability of being allocated a task, i.e., $\pi$ is represented by an initial transition matrix $P$ whose elements are all equal and whose rows each sum to 1;
step S402: an enterprise development capability unit is randomly selected as the starting point in the complex product development capability network corresponding to the total task F; since there are $n$ subtasks, a path $X_1, X_2, \ldots, X_n$ is generated by $n$ random transitions between different states based on the initial transition matrix $P$; because the state transition process is random, $N$ paths can be obtained, and the cost $Z(X_i)$ of each path $X_i$ is calculated;
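step S403: the development costs $Z(X_i)$ are sorted in ascending order, $Z_{(1)} \le Z_{(2)} \le \cdots \le Z_{(N)}$, and the quantile value of the sorted costs is obtained;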
step S404: based on the obtained quantile value, a first probability transition matrix $P'$ is calculated using the Lagrange multiplier method, where each element $p_{ij}'$ of $P'$ is the ratio of two counts taken over those of the $N$ paths whose development cost does not exceed the quantile value;
$p_{ij}'$ represents the probability that the allocated enterprise development capability unit is $j$ when subtask $i$ is developed;
the denominator is the number of times any enterprise development capability unit is allocated when developing subtask $i$ in those paths;
the numerator is the number of times the allocated enterprise development capability unit is $j$ when developing subtask $i$ in those paths;
step S405: the first probability transition matrix $P'$ and the initial transition matrix $P$ are combined using a smoothing technique to obtain a second probability transition matrix $P''$, with $P'' = \alpha P' + (1-\alpha) P$, where $\alpha$ is the smoothing parameter;
step S406: the initial transition matrix $P$ is reassigned, i.e., the second probability transition matrix $P''$ is assigned to $P$;
step S407: steps S402 to S406 are repeated until, for a given number of iterations $d$, the successively updated transition matrices $P$ all yield the same quantile value of the development cost $Z(X_i)$ incurred from the demand state $S_0 = f_1$ to first reaching the final state $S_t = f_n$;
step S408: when the condition of step S407 is met, an optimal selection strategy is considered to have been found, and the current transition matrix $P$ is output as the optimal selection strategy, thereby yielding the optimal combination of dynamic resources for the complex product in the cloud manufacturing environment.
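The capability units, resource pools and state transitions described in steps S201 to S303 could be modelled along the following lines. This is a sketch under assumptions: the class and function names are hypothetical, indices are 0-based rather than 1-based as in the claim, and a state equal to the number of subtasks is used here as a "finished" marker.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class CapabilityUnit:
    task: int        # index i of subtask f_i (0-based here)
    enterprise: int  # index j of the enterprise (0-based here)
    level: float     # lov(c_ij), assumed to lie in (0, 10]

    @property
    def theta(self) -> float:
        """Success probability theta_ij = lov(c_ij) / 10."""
        return self.level / 10.0


def build_pools(levels):
    """Turn levels[i][j] into one resource pool CP(i) per subtask."""
    return [[CapabilityUnit(i, j, lv) for j, lv in enumerate(row)]
            for i, row in enumerate(levels)]


def step(state: int, unit: CapabilityUnit, rng: random.Random) -> int:
    """One state transition: advance on success, stay in place on failure."""
    if unit.task != state:
        raise ValueError("unit does not belong to the current subtask's pool")
    return state + 1 if rng.random() < unit.theta else state


# Hypothetical capability levels for 2 subtasks and 2 enterprises.
pools = build_pools([[8, 5], [10, 3]])
```

Keeping the success probability as a derived property of the level avoids storing two values that could drift apart.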
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610887855.XA CN106650993B (en) | 2016-10-11 | 2016-10-11 | Dynamic resource optimization method based on Markov decision process |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106650993A (en) | 2017-05-10 |
CN106650993B (en) | 2020-07-03 |
Family ID=58855282
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610887855.XA Active CN106650993B (en) | 2016-10-11 | 2016-10-11 | Dynamic resource optimization method based on Markov decision process |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106650993B (en) |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106919755B (en) * | 2017-03-01 | 2020-08-11 | 清华大学 | Cloud manufacturing system uncertainty quantitative analysis method and device based on data |
CN109697273A (en) * | 2017-10-20 | 2019-04-30 | 顺丰科技有限公司 | For resource allocation methods, equipment, the readable storage medium storing program for executing of mark problem |
CN108063830B (en) * | 2018-01-26 | 2020-06-23 | 重庆邮电大学 | Network slice dynamic resource allocation method based on MDP |
EP3702942A4 (en) * | 2018-03-27 | 2021-08-04 | Nippon Steel Corporation | Analysis system, analysis method, and program |
CN116137630B (en) * | 2023-04-19 | 2023-08-18 | 井芯微电子技术(天津)有限公司 | Method and device for quantitatively processing network service demands |
CN117134364B (en) * | 2023-06-21 | 2024-05-28 | 国网湖北省电力有限公司营销服务中心(计量中心) | Feed processing enterprise load management method based on staged strategy gradient algorithm |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103036974A (en) * | 2012-12-13 | 2013-04-10 | 广东省电信规划设计院有限公司 | Cloud computing resource scheduling method and system based on hidden markov model |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9047423B2 (en) * | 2012-01-12 | 2015-06-02 | International Business Machines Corporation | Monte-Carlo planning using contextual information |
Non-Patent Citations (3)
Title |
---|
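Research on optimal allocation of manufacturing resources in a cloud manufacturing environment; Wang Shilong (王时龙) et al.; Computer Integrated Manufacturing Systems (《计算机集成制造系统》); 2012-07-31; Vol. 18, No. 7; 1396-1405 * |
Optimizing software testing with the cross-entropy method based on a Markov decision process; Zhang Deping (张德平) et al.; Journal of Software (《软件学报》); 2008-10-30; No. 10; 2770-2779 * |
Simulation and computation of software testing based on a Markov decision process; Qin Qiang (秦强) et al.; Journal on Numerical Methods and Computer Applications (《数值计算与计算机应用》); 2014-06-30; Vol. 35, No. 2; 92-102 * |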
Also Published As
Publication number | Publication date |
---|---|
CN106650993A (en) | 2017-05-10 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||