CN106650993B - Dynamic resource optimization method based on Markov decision process - Google Patents


Info

Publication number
CN106650993B
Authority
CN
China
Prior art keywords
development
enterprise
task
capability
capacity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610887855.XA
Other languages
Chinese (zh)
Other versions
CN106650993A (en
Inventor
杨建新
秦强
吉军
刘文军
杨一铭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Information Central Of China North Industries Group Corp
Original Assignee
Information Central Of China North Industries Group Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Information Central Of China North Industries Group Corp
Priority to CN201610887855.XA
Publication of CN106650993A
Application granted
Publication of CN106650993B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/04 Manufacturing
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Economics (AREA)
  • Strategic Management (AREA)
  • Theoretical Computer Science (AREA)
  • Tourism & Hospitality (AREA)
  • General Physics & Mathematics (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Game Theory and Decision Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Development Economics (AREA)
  • Manufacturing & Machinery (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention belongs to the technical field of dynamic resource optimization and relates in particular to a dynamic resource optimization method based on a Markov decision process. Departing from traditional manufacturing resource selection methods, the method abstracts the problem of precisely allocating cloud manufacturing resources to multiple development tasks in a cloud manufacturing environment as a Markov decision process, and thereby mathematically models the effect of development-process uncertainty on resource selection. Taking the expected development cost as the objective function, it applies the cross-entropy method, which converts the combinatorial optimization problem into an associated stochastic optimization problem, to obtain the optimal selection probabilities of cloud manufacturing resources. This achieves reasonable scheduling and efficient use of manufacturing resources in the collaborative development of complex products and reduces product development risk and manufacturing cost.

Description

Dynamic resource optimization method based on Markov decision process
Technical Field
The invention belongs to the technical field of dynamic resource optimization, and particularly relates to a dynamic resource optimization method based on a Markov decision process.
Background
In today's global manufacturing environment, the development of a complex product is often completed by scheduling enterprise resources that differ in region, type, and characteristics. Cloud manufacturing, supported by an information network, integrates social manufacturing resources and capabilities to connect geographically dispersed enterprises with complementary capabilities. It enables sharing, integration, and cooperative operation of dispersed manufacturing resources, so that the whole life cycle of a complex product (design, manufacture, assembly, sales, and service) is completed collaboratively, responding better to market demand while maximizing benefit.
Manufacturing resources in a cloud manufacturing environment comprise all physical elements involved in an enterprise's production activities over the whole product life cycle; they are diverse, heterogeneous, and geographically dispersed. Precisely regulating cloud manufacturing resources and capabilities, and constructing a cloud manufacturing resource combination with the best overall service quality and the highest cluster cooperation capability, is therefore key to carrying out cloud manufacturing smoothly.
The quality of the cloud manufacturing resource combination optimization model and of its solving mechanism directly affects product development quality and whether the development process can proceed safely and smoothly. The characteristics of cloud manufacturing mean that manufacturing resource selection during product development is full of uncertain factors. Most current research on the optimal configuration of manufacturing resources considers only time, cost, quality, resource evaluation, and similar issues, and does not fully consider how uncertainty in the product development process affects cloud manufacturing resource selection: product development may fail, which not only makes the development process risky but also strongly affects development cost and the development cycle.
Therefore, how to achieve dynamic optimal selection of cloud manufacturing resources under uncertainty in the product development process remains a technical problem in urgent need of a solution.
A Markov decision process is an optimal decision process for a stochastic dynamic system, based on Markov process theory. It has the Markov property: given the current state, the future evolution of the system is independent of its past evolution. The decision maker periodically or continuously observes the stochastic dynamic system and makes decisions sequentially. Markov decision processes are now widely applied in many fields of natural science and engineering, with extensive practical use in prediction in particular.
Disclosure of Invention
(I) Technical problem to be solved
The technical problem to be solved by the invention is how to provide a dynamic resource optimization method based on a Markov decision process that treats the dynamic resource optimization configuration problem in a cloud manufacturing environment as a Markov decision process and solves it with a cross-entropy algorithm.
(II) Technical scheme
In order to solve the above technical problem, the present invention provides a dynamic resource optimization method based on a Markov decision process. The method is implemented on a dynamic resource optimization system comprising a complex product task decomposition module, a development capability network construction module, a dynamic resource selection decision module, and a cross-entropy solving module, and proceeds as follows:
Step S1: according to the performance, structure, and precision requirements of the complex product, the total task F is decomposed by the complex product task decomposition module into n development subtasks, i.e. $F = \{f_i \mid i = 1, 2, \ldots, n\}$, where $f_i$ denotes the i-th subtask in the development process;
Step S2: according to each development subtask's requirements for cloud manufacturing resources, a dynamic complex product development capability network composed of the development capability resources of cross-region enterprises is established by the development capability network construction module; the development capability network construction module comprises an enterprise development capability decomposer, a capability resource pool builder, and a capability network builder;
Step S2 comprises the following sub-steps:
Step S201: the enterprise development capability decomposer virtualizes the cross-region enterprise manufacturing resources in the cloud manufacturing environment and expresses enterprise development capability uniformly as an enterprise development capability unit $c_{ij} = \{lov(c_{ij}), f_i, j\}$, where $lov(c_{ij})$ is the development capability level of enterprise j for completing subtask $f_i$; the level reflects the expected degree of completion of the development task;
Step S202: building on step S201, for a given subtask $f_i$, step S201 is repeated once per enterprise in the cross-region enterprise manufacturing resources, and the capability resource pool builder then establishes the virtual enterprise development capability resource pool $CP(i) = \{c_{i1}, c_{i2}, \ldots, c_{ij}\}$, $i = 1, \ldots, n$;
Step S203: according to the sequential relationship between adjacent subtasks, the capability network builder establishes the sequential relationship between the enterprise development capability resource pools and the association relationship between the enterprise development capability units of each two adjacent pools, thereby forming a dynamic complex product development capability network;
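Steps S201 to S203 can be pictured as a small data structure. The following is a minimal sketch only: the class and function names, the level values, and the choice to link every unit of one pool to every unit of the next pool are illustrative assumptions, not the patented modules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CapabilityUnit:
    """Enterprise development capability unit c_ij = {lov(c_ij), f_i, j}."""
    subtask: int     # index i of subtask f_i
    enterprise: int  # index j of the enterprise
    lov: int         # development capability level (assumed to lie in 1..10)


def build_pools(levels):
    """Build CP(i) for every subtask; levels[i-1][j-1] holds lov(c_ij)."""
    return {
        i: [CapabilityUnit(i, j, lov) for j, lov in enumerate(row, start=1)]
        for i, row in enumerate(levels, start=1)
    }


def build_network(pools):
    """Link each unit in CP(i) to each unit in CP(i+1) (assumed full linkage)."""
    edges = []
    for i in sorted(pools):
        if i + 1 in pools:
            edges.extend((a, b) for a in pools[i] for b in pools[i + 1])
    return edges


# Hypothetical levels for 3 subtasks and 2 enterprises
levels = [[7, 9], [8, 6], [5, 10]]
pools = build_pools(levels)
edges = build_network(pools)
```

Keeping the unit immutable (`frozen=True`) lets the same object serve as a node in the edge list and as a dictionary key if later steps need to attach probabilities or costs to it.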
Step S3: based on the complex product development capability network, the dynamic resource selection decision module obtains a dynamic resource allocation strategy by the Markov decision method; step S3 comprises the following steps:
Step S301: in the complex product development capability network, during the development of each subtask, only one enterprise development capability unit $c_{ij}$ can be allocated at any time t (t = 0, 1, 2, …) to meet the corresponding development requirement;
Step S302: taking the subtask development stage current at time t as the task state at time t, the task state space of the total task F can be expressed as $S = \{S_t, t \ge 0\} = \{f_1, f_2, \ldots, f_n\}$;
Step S303: if the subtask at time t is $f_i$, the task state is $S_t = f_i$, and the enterprise development capability unit allocated at time t is defined as
Figure BDA0001128896240000031
With probability $\theta_{ij}$ the unit successfully completes development subtask $f_i$; the process then enters the enterprise development capability unit allocation stage of the next subtask $f_{i+1}$ at time t+1, and the task state at time t+1 is $S_{t+1} = f_{i+1}$.
With probability $1 - \theta_{ij}$ development task $f_i$ is not completed, i.e. development fails, and the task state at time t+1 is the same as at time t: $S_{t+1} = S_t = f_i$. The probability $\theta_{ij}$ is related to the development capability level $lov(c_{ij})$ by $\theta_{ij} = lov(c_{ij})/10$;
Step S304: for subtask $f_i$, carrying out one development attempt with the enterprise development capability unit allocated at time t,
Figure BDA0001128896240000041
incurs a development cost of
Figure BDA0001128896240000042
Step S305: $S_t = f_n$ is set as the target state, i.e. the final state;
Step S306: from task start time 0 to task completion time t, the allocation of enterprise development capability units can be described by a history of the Markov process:
Figure BDA0001128896240000043
Step S307: from the history $H_t$ of step S306, the probability, conditional on $H_t$, of allocating the enterprise development capability unit
Figure BDA0001128896240000044
is obtained; the set of these probabilities,
Figure BDA0001128896240000045
is the dynamic resource allocation strategy;
Step S308: let γ be the number of development attempts made by the allocated enterprise development capability units by the time the process, starting from the demand state $S_0 = f_1$, first reaches the final state $S_t = f_n$; the development capability sequence from time 0 to time γ is defined as
Figure BDA0001128896240000046
Step S309: the dynamic resource scheduling optimization problem in the cloud manufacturing environment can then be described as seeking an optimal selection strategy that minimizes the expected development cost Z(X), where
Figure BDA0001128896240000047
and $E_\pi$ denotes expectation with respect to the probability density π;
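The state transitions of steps S303 to S306 can be sketched in a few lines. This is an illustration under stated assumptions rather than the patented module: the function names, the tuple layout of the history, and the sample numbers are hypothetical, every attempt is assumed to be charged whether or not it succeeds, and the process is assumed to stop once the last subtask has been completed.

```python
import random


def step(state, unit, lov, cost, rng):
    """One Markov transition: succeed with probability theta = lov / 10."""
    success = rng.random() < lov[state][unit] / 10
    next_state = state + 1 if success else state   # failure keeps S_{t+1} = S_t
    return next_state, cost[state][unit]


def run_history(lov, cost, policy, rng):
    """Record the history as a list of (state, allocated unit, cost) tuples."""
    history, state = [], 0
    while state < len(lov):                  # assumed stop: last subtask done
        unit = policy(state, history)        # the policy may inspect the history
        next_state, spent = step(state, unit, lov, cost, rng)
        history.append((state, unit, spent))
        state = next_state
    return history


# Hypothetical instance: 3 subtasks, 2 enterprises, levels in 1..10
lov = [[7, 9], [8, 6], [5, 10]]
cost = [[40, 60], [55, 30], [20, 80]]
always_first = lambda state, history: 0
history = run_history(lov, cost, always_first, random.Random(42))
total_cost = sum(spent for _, _, spent in history)
```

A failed attempt shows up in `history` as a repeated state index, which is what makes the total cost random even for a fixed choice of units.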
Step S4: the cross-entropy solving module optimizes and outputs the dynamic resource allocation strategy; step S4 comprises the following steps:
Step S401: the dynamic resource allocation strategy π is initialized so that every enterprise development capability unit has the same probability of being allocated; after initialization, the allocation of enterprise development capability units is thus a purely random process, and π is represented by an initial transition matrix P whose elements are all equal and whose rows each sum to 1;
Step S402: in the complex product development capability network of the total task F, an enterprise development capability unit is selected at random as the starting point; since there are n subtasks, a path $X_1, X_2, \ldots, X_n$ is generated by n random transitions between different states according to the initial transition matrix P; because the state transitions are random, N paths can be obtained, and the cost $Z(X_i)$ of each path $X_i$ is calculated;
Step S403: the development costs $Z(X_i)$ are sorted in ascending order, $Z(X_i)_{(1)} \le Z(X_i)_{(2)} \le \cdots \le Z(X_i)_{(N)}$, and the ρ-quantile value is
Figure BDA0001128896240000051
Step S404: from the quantile value obtained, the first probability transition matrix $P'$ is calculated by the Lagrange multiplier method; its element $p'_{ij}$ is
Figure BDA0001128896240000052
where $p'_{ij}$ is the probability that unit j is the development capability unit allocated when subtask i is developed;
Figure BDA0001128896240000053
is the number of times, over those of the N paths whose development cost is not higher than γ, that a development capability unit is allocated when subtask i is developed; and
Figure BDA0001128896240000054
is the number of times, over the same paths, that the development capability unit allocated when subtask i is developed is unit j;
Step S405: the first probability transition matrix $P'$ and the initial transition matrix P are combined by smoothing to obtain the second probability transition matrix $P'' = \alpha P' + (1 - \alpha) P$, where α is the smoothing parameter;
Step S406: the initial transition matrix P is reassigned by assigning the second probability transition matrix $P''$ to it;
Step S407: steps S402 to S406 are repeated until, for a given number of iterations d with different initial transition matrices P, the development cost $Z(X_i)$ on first reaching the final state $S_t = f_n$ from the demand state $S_0 = f_1$ yields in every case the quantile value
Figure BDA0001128896240000055
Step S408: when the condition of step S407 is met, an optimal selection strategy is deemed to have been found, and the current initial transition matrix P is output as the optimal selection strategy, giving the optimal combination of dynamic resources for the complex product in the cloud manufacturing environment.
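The loop of steps S401 to S408 can be sketched as follows. This is one possible reading, not the patented implementation: P is taken to be an n × m matrix whose entry (i, j) is the probability of allocating unit j to subtask i, paths are sampled with repeated attempts after failures, the elite update counts allocations on paths whose cost does not exceed the quantile, and the loop stops once the quantile has stayed unchanged for d consecutive iterations. All names, default parameters, and data are hypothetical.

```python
import math
import random


def sample_path(P, lov, cost, rng):
    """Sample one development history; return (total cost, allocation counts)."""
    n, m = len(P), len(P[0])
    counts = [[0] * m for _ in range(n)]
    total, i = 0.0, 0
    while i < n:
        j = rng.choices(range(m), weights=P[i])[0]
        counts[i][j] += 1
        total += cost[i][j]
        if rng.random() < lov[i][j] / 10:   # success with theta_ij = lov / 10
            i += 1
    return total, counts


def cross_entropy(lov, cost, N=500, rho=0.1, alpha=0.7, d=5, max_iter=200, seed=0):
    rng = random.Random(seed)
    n, m = len(lov), len(lov[0])
    P = [[1.0 / m] * m for _ in range(n)]            # S401: uniform start
    last_gamma, stable = None, 0
    for _ in range(max_iter):
        paths = [sample_path(P, lov, cost, rng) for _ in range(N)]   # S402
        costs = sorted(z for z, _ in paths)                          # S403
        gamma = costs[math.ceil(rho * N) - 1]        # sample rho-quantile
        elite = [c for z, c in paths if z <= gamma]
        for i in range(n):
            row_total = sum(sum(c[i]) for c in elite)
            for j in range(m):
                p_new = sum(c[i][j] for c in elite) / row_total      # S404
                P[i][j] = alpha * p_new + (1 - alpha) * P[i][j]      # S405-S406
        stable = stable + 1 if gamma == last_gamma else 0            # S407
        last_gamma = gamma
        if stable >= d:
            break
    return P, gamma                                  # S408


# Hypothetical 2-subtask, 2-enterprise instance
lov = [[10, 5], [5, 10]]
cost = [[1, 10], [10, 1]]
P, gamma = cross_entropy(lov, cost)
```

Because α is below 1, no entry of P ever reaches exactly zero, so every unit keeps a small chance of being sampled in later iterations.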
(III) advantageous effects
Compared with the prior art, the invention provides a dynamic resource optimization method based on a Markov decision process. Departing from traditional manufacturing resource selection methods, it abstracts the problem of precisely allocating cloud manufacturing resources to multiple development tasks in a cloud manufacturing environment as a Markov decision process, and thereby mathematically models the effect of development-process uncertainty on resource selection. Taking the expected development cost as the objective function, it applies the cross-entropy method, which converts the combinatorial optimization problem into an associated stochastic optimization problem, to obtain the optimal selection probabilities of cloud manufacturing resources. This achieves reasonable scheduling and efficient use of manufacturing resources in the collaborative development of complex products and reduces product development risk and manufacturing cost.
Drawings
FIG. 1 shows the complex product development capability network.
FIG. 2 is a schematic diagram of the dynamic resource selection decision based on the Markov decision process.
FIG. 3 is a schematic comparison of the development costs of the optimal selection strategy and a random selection strategy.
Detailed Description
To make the objects, content, and advantages of the present invention clearer, embodiments of the invention are described in detail below with reference to the accompanying drawings and examples.
To solve the above technical problem, the present invention provides a dynamic resource optimization method based on a Markov decision process, as shown in FIG. 1 to FIG. 3. The method is implemented on a dynamic resource optimization system comprising a complex product task decomposition module, a development capability network construction module, a dynamic resource selection decision module, and a cross-entropy solving module, and proceeds as follows:
Step S1: according to the performance, structure, and precision requirements of the complex product, the total task F is decomposed by the complex product task decomposition module into n development subtasks, i.e. $F = \{f_i \mid i = 1, 2, \ldots, n\}$, where $f_i$ denotes the i-th subtask in the development process;
Step S2: according to each development subtask's requirements for cloud manufacturing resources, a dynamic complex product development capability network composed of the development capability resources of cross-region enterprises is established by the development capability network construction module; the development capability network construction module comprises an enterprise development capability decomposer, a capability resource pool builder, and a capability network builder;
Step S2 comprises the following sub-steps:
Step S201: the enterprise development capability decomposer virtualizes the cross-region enterprise manufacturing resources in the cloud manufacturing environment and expresses enterprise development capability uniformly as an enterprise development capability unit $c_{ij} = \{lov(c_{ij}), f_i, j\}$, where $lov(c_{ij})$ is the development capability level of enterprise j for completing subtask $f_i$; the level reflects the expected degree of completion of the development task;
Step S202: building on step S201, for a given subtask $f_i$, step S201 is repeated once per enterprise in the cross-region enterprise manufacturing resources, and the capability resource pool builder then establishes the virtual enterprise development capability resource pool $CP(i) = \{c_{i1}, c_{i2}, \ldots, c_{ij}\}$, $i = 1, \ldots, n$;
Step S203: according to the sequential relationship between adjacent subtasks, the capability network builder establishes the sequential relationship between the enterprise development capability resource pools and the association relationship between the enterprise development capability units of each two adjacent pools (that is, each enterprise development capability unit in one pool is associated with every enterprise development capability unit in the adjacent pool), thereby forming a dynamic complex product development capability network;
Step S3: based on the complex product development capability network, the dynamic resource selection decision module obtains a dynamic resource allocation strategy by the Markov decision method; step S3 comprises the following steps:
Step S301: in the complex product development capability network, during the development of each subtask, only one enterprise development capability unit $c_{ij}$ can be allocated at any time t (t = 0, 1, 2, …) to meet the corresponding development requirement;
Step S302: taking the subtask development stage current at time t as the task state at time t, the task state space of the total task F can be expressed as $S = \{S_t, t \ge 0\} = \{f_1, f_2, \ldots, f_n\}$;
Step S303: if the subtask at time t is $f_i$, the task state is $S_t = f_i$, and the enterprise development capability unit allocated at time t is defined as
Figure BDA0001128896240000071
With probability $\theta_{ij}$ the unit successfully completes development subtask $f_i$; the process then enters the enterprise development capability unit allocation stage of the next subtask $f_{i+1}$ at time t+1, and the task state at time t+1 is $S_{t+1} = f_{i+1}$.
With probability $1 - \theta_{ij}$ development task $f_i$ is not completed, i.e. development fails, and the task state at time t+1 is the same as at time t: $S_{t+1} = S_t = f_i$. The probability $\theta_{ij}$ is related to the development capability level $lov(c_{ij})$ by $\theta_{ij} = lov(c_{ij})/10$;
Step S304: for subtask $f_i$, carrying out one development attempt with the enterprise development capability unit allocated at time t,
Figure BDA0001128896240000081
incurs a development cost of
Figure BDA0001128896240000082
Step S305: $S_t = f_n$ is set as the target state, i.e. the final state;
Step S306: from task start time 0 to task completion time t, the allocation of enterprise development capability units can be described by a history of the Markov process:
Figure BDA0001128896240000083
Step S307: from the history $H_t$ of step S306, the probability, conditional on $H_t$, of allocating the enterprise development capability unit
Figure BDA0001128896240000084
is obtained; the set of these probabilities,
Figure BDA0001128896240000085
is the dynamic resource allocation strategy;
Step S308: let γ be the number of development attempts made by the allocated enterprise development capability units by the time the process, starting from the demand state $S_0 = f_1$, first reaches the final state $S_t = f_n$; the development capability sequence from time 0 to time γ is defined as
Figure BDA0001128896240000086
Step S309: the dynamic resource scheduling optimization problem in the cloud manufacturing environment can then be described as seeking an optimal selection strategy that minimizes the expected development cost Z(X), where
Figure BDA0001128896240000087
and $E_\pi$ denotes expectation with respect to the probability density π;
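For intuition about the objective of step S309, the expected cost has a simple closed form in one special case. The derivation below is a worked illustration, not a formula from the patent. It assumes that subtask $f_i$ is always assigned the same unit $c_{i j_i}$, that every attempt is charged the per-attempt development cost $C_s(c_{i j_i})$ (the cost notation used in the embodiment) whether or not it succeeds, and that all n subtasks must be completed.

```latex
% K_i: number of attempts needed on subtask f_i (geometric, success prob. \theta_{i j_i})
\begin{align*}
E[K_i] &= \sum_{k=1}^{\infty} k\,(1-\theta_{i j_i})^{k-1}\,\theta_{i j_i}
        = \frac{1}{\theta_{i j_i}}, \\
Z      &= \sum_{i=1}^{n} C_s(c_{i j_i})\,E[K_i]
        = \sum_{i=1}^{n} \frac{C_s(c_{i j_i})}{\theta_{i j_i}}
        = \sum_{i=1}^{n} \frac{10\,C_s(c_{i j_i})}{lov(c_{i j_i})}.
\end{align*}
```

As a hypothetical example, a unit with level 8 and a per-attempt cost of 120 has $\theta = 0.8$, needs 1.25 attempts on average, and contributes an expected cost of 150 to Z. Under these assumptions the closed form can serve as a sanity check on a numerically optimized strategy.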
Step S4: the cross-entropy solving module optimizes and outputs the dynamic resource allocation strategy; step S4 comprises the following steps:
Step S401: the dynamic resource allocation strategy π is initialized so that every enterprise development capability unit has the same probability of being allocated; after initialization, the allocation of enterprise development capability units is thus a purely random process, and π is represented by an initial transition matrix P whose elements are all equal and whose rows each sum to 1;
Step S402: in the complex product development capability network of the total task F, an enterprise development capability unit is selected at random as the starting point; since there are n subtasks, a path $X_1, X_2, \ldots, X_n$ is generated by n random transitions between different states according to the initial transition matrix P; because the state transitions are random, N paths can be obtained, and the cost $Z(X_i)$ of each path $X_i$ is calculated;
Step S403: the development costs $Z(X_i)$ are sorted in ascending order, $Z(X_i)_{(1)} \le Z(X_i)_{(2)} \le \cdots \le Z(X_i)_{(N)}$, and the ρ-quantile value is
Figure BDA0001128896240000091
Step S404: from the quantile value obtained, the first probability transition matrix $P'$ is calculated by the Lagrange multiplier method; its element $p'_{ij}$ is
Figure BDA0001128896240000092
where $p'_{ij}$ is the probability that unit j is the development capability unit allocated when subtask i is developed;
Figure BDA0001128896240000093
is the number of times, over those of the N paths whose development cost is not higher than γ, that a development capability unit is allocated when subtask i is developed; and
Figure BDA0001128896240000094
is the number of times, over the same paths, that the development capability unit allocated when subtask i is developed is unit j;
Step S405: the first probability transition matrix $P'$ and the initial transition matrix P are combined by smoothing to obtain the second probability transition matrix $P'' = \alpha P' + (1 - \alpha) P$, where α is the smoothing parameter;
Step S406: the initial transition matrix P is reassigned by assigning the second probability transition matrix $P''$ to it;
Step S407: steps S402 to S406 are repeated until, for a given number of iterations d with different initial transition matrices P, the development cost $Z(X_i)$ on first reaching the final state $S_t = f_n$ from the demand state $S_0 = f_1$ yields in every case the quantile value
Figure BDA0001128896240000095
Step S408: when the condition of step S407 is met, an optimal selection strategy is deemed to have been found, and the current initial transition matrix P is output as the optimal selection strategy, giving the optimal combination of dynamic resources for the complex product in the cloud manufacturing environment.
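The formula images of steps S403 to S405 are not reproduced in this text. For orientation, the update they describe matches the usual cross-entropy update for selection probabilities, written below in this document's symbols. The count functions $n_{ij}$ and $n_i$, the path index k, and the hat on the quantile are notation introduced here for illustration, so the patent's own formulas may be written differently.

```latex
% X^{(k)}: the k-th sampled path, k = 1, ..., N
% n_{ij}(X^{(k)}): times unit j is allocated while subtask i is developed on path k
% n_i(X^{(k)}) = \sum_j n_{ij}(X^{(k)})
\begin{align*}
\hat{\gamma} &= Z_{(\lceil \rho N \rceil)}, \\
p'_{ij} &= \frac{\sum_{k=1}^{N} I\{Z(X^{(k)}) \le \hat{\gamma}\}\; n_{ij}(X^{(k)})}
                {\sum_{k=1}^{N} I\{Z(X^{(k)}) \le \hat{\gamma}\}\; n_{i}(X^{(k)})}, \\
P'' &= \alpha P' + (1 - \alpha)\,P .
\end{align*}
```

Each row of $P'$ sums to 1 by construction, and because $P''$ is a convex combination of two row-stochastic matrices, its rows also sum to 1 for any $0 \le \alpha \le 1$.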
Examples
In this embodiment, the task F is decomposed according to the complex product task decomposition model into the task set $F = \{f_i \mid i = 1, 2, \ldots, 8\}$.
The cross-region enterprise manufacturing resources in the cloud manufacturing environment are virtualized and made available as services, with each enterprise development capability unit expressed as $c_{ij} = \{lov(c_{ij}), f_i, j\}$, where $j = 1, 2, \ldots, 5$.
The virtual enterprise development capability resource pools are established:
$CP(1) = \{c_{11}, c_{12}, c_{13}, c_{14}, c_{15}\}$
$CP(2) = \{c_{21}, c_{22}, c_{23}, c_{24}, c_{25}\}$
$CP(3) = \{c_{31}, c_{32}, c_{33}, c_{34}, c_{35}\}$
$CP(4) = \{c_{41}, c_{42}, c_{43}, c_{44}, c_{45}\}$
$CP(5) = \{c_{51}, c_{52}, c_{53}, c_{54}, c_{55}\}$
$CP(6) = \{c_{61}, c_{62}, c_{63}, c_{64}, c_{65}\}$
$CP(7) = \{c_{71}, c_{72}, c_{73}, c_{74}, c_{75}\}$
$CP(8) = \{c_{81}, c_{82}, c_{83}, c_{84}, c_{85}\}$
For convenience of description, the development capability levels $lov(c_{ij})$ and development costs $C_s(c_{ij})$ of all enterprise development capability units $c_{ij}$ in the cloud manufacturing environment are expressed as sets: Table 1 gives the development capability unit level set $LOV = \{lov(c_{ij}) \mid i = 1, \ldots, 8;\ j = 1, \ldots, 5\}$, and Table 2 gives the development cost set $C_S = \{C_s(c_{ij}) \mid i = 1, \ldots, 8;\ j = 1, \ldots, 5\}$, where $lov(c_{ij})$ and $C_s(c_{ij})$ are preset values.
Table 1. Development capability unit level set LOV:
Figure BDA0001128896240000101
Table 2. Development cost set $C_S$:
Figure BDA0001128896240000102
A candidate development capability resource pool is allocated to each subtask, forming the complex product development capability network shown in FIG. 1.
FIG. 2 is a schematic diagram of the dynamic resource selection decision based on the Markov decision process, where $\theta_{ij}$ is the probability that development capability unit $c_{ij}$ successfully completes development task $f_i$, and $1 - \theta_{ij}$ is the probability that $f_i$ is not completed, i.e. that product development fails at the current stage. The probability $\theta_{ij}$ is related to the development capability level $lov(c_{ij})$ by $\theta_{ij} = lov(c_{ij})/10$. Table 3 gives the development success probabilities $\theta_{ij}$ for tasks $f_1$ to $f_8$, i.e. $\theta = \{\theta_{ij} \mid i = 1, \ldots, 8;\ j = 1, \ldots, 5\}$.
Table 3. Development task success probabilities $\theta_{ij}$:
Figure BDA0001128896240000111
in this embodiment, a cross entropy algorithm is used to optimize the dynamic resource selection decision model.
An initial transition matrix P is generated in which every development capability unit has an equal probability of being selected, i.e.
Figure BDA0001128896240000112
With N = 5000, ρ = 0.16, α = […], and d = 10, the dynamic resource selection strategy based on the Markov decision process is optimized by the cross-entropy method. After 46 iterations the minimum development cost of 2132 is obtained, with the optimal selection probability
Figure BDA0001128896240000113
given by
Figure BDA0001128896240000121
Therefore, in this example, the dynamic resource optimization selection strategy based on the Markov decision process for development subtasks $f_1$ to $f_8$ is $c_{14}, c_{22}, c_{33}, c_{44}, c_{53}, c_{64}, c_{71}, c_{85}$.
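The text does not say how the strategy is read off the optimal selection probability matrix. One plausible reading, sketched below, picks for each subtask the unit with the highest selection probability. The function name and the matrix values are hypothetical and are not the values of the embodiment.

```python
def extract_strategy(P):
    """Return, for each subtask (row), the 1-based index of the most probable unit."""
    return [max(range(len(row)), key=row.__getitem__) + 1 for row in P]


# Hypothetical converged matrix for 2 subtasks and 3 enterprises
P = [
    [0.05, 0.90, 0.05],
    [0.80, 0.10, 0.10],
]
labels = [f"c{i}{j}" for i, j in enumerate(extract_strategy(P), start=1)]
```

When the cross-entropy iterations have converged, each row is close to a one-hot vector, so the choice of the row maximum is unambiguous.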
Simulation runs were carried out for the optimized selection strategy and for a random selection strategy. FIG. 3 shows the development cost of the two strategies over 10 runs; across these 10 simulations the average development cost is 3113 for the optimized selection strategy and 3610.9 for the random selection strategy. The simulation results thus show that the optimized selection strategy has a lower average development cost than the random selection strategy. It should be noted, however, that the optimized strategy is not guaranteed to cost less than the random strategy in every individual run: whether a development capability unit completes its task on a given attempt is random, and this strongly affects the development cost. The average development cost is therefore compared in order to reduce the influence of this randomness.
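The comparison can be imitated with a small Monte Carlo sketch. Everything below is an illustrative assumption: the success probabilities and costs are made up (they are not the values of Tables 1 to 3), a fixed strategy stands in for the optimized one, and the cost of a unit is assumed to be paid on every attempt, successful or not.

```python
import random


def simulate_cost(theta, cost, choose, rng):
    """Total cost of one run; each subtask is retried until it succeeds."""
    total = 0.0
    for i in range(len(theta)):
        while True:
            j = choose(i, rng)
            total += cost[i][j]              # assumed: paid per attempt
            if rng.random() < theta[i][j]:
                break                        # subtask i completed
    return total


def average_cost(theta, cost, choose, runs, seed):
    rng = random.Random(seed)
    return sum(simulate_cost(theta, cost, choose, rng) for _ in range(runs)) / runs


# Hypothetical data: 3 subtasks x 2 candidate units.
theta = [[0.9, 0.5], [0.8, 0.4], [0.7, 0.3]]
cost = [[100, 100], [100, 100], [100, 100]]


def fixed_choice(i, rng):
    return 0                                 # always pick the first unit


def random_choice(i, rng):
    return rng.randrange(2)                  # pick a unit uniformly at random


avg_fixed = average_cost(theta, cost, fixed_choice, runs=5000, seed=1)
avg_random = average_cost(theta, cost, random_choice, runs=5000, seed=1)
```

Averaging over many runs is what makes the two strategies comparable; a single run of either strategy can be unusually cheap or expensive.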
The above is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make several modifications and variations without departing from the technical principle of the present invention, and these modifications and variations should also be regarded as falling within the protection scope of the present invention.

Claims (1)

1. A dynamic resource optimization method based on a Markov decision process, characterized in that it is implemented on a dynamic resource optimization system, wherein the dynamic resource optimization system comprises: a complex product task decomposition module, a development capability network construction module, a dynamic resource selection decision module, and a cross-entropy solving module; the method comprises the following steps:
step S1: according to the performance, structure, and precision requirements of the complex product, the complex product task decomposition module decomposes the total task F into n development subtasks, i.e., F = {f_i | i = 1, 2, …, n}, where f_i represents the i-th subtask in the development process;
step S2: according to the requirements of each development subtask on cloud manufacturing resources, a dynamic complex product development capability network consisting of the development capability resources of cross-region enterprises is established by the development capability network construction module; the development capability network construction module comprises: an enterprise development capability decomposer, a capability resource pool builder, and a capability network builder;
the step S2 includes the following sub-steps:
step S201: the cross-region enterprise manufacturing resources in the cloud manufacturing environment are virtualized by the enterprise development capability decomposer, and enterprise development capability is uniformly expressed as an enterprise development capability unit c_ij = {lov(c_ij), f_i, j}, where lov(c_ij) is the development capability level of enterprise j for completing subtask f_i, and the magnitude of the level reflects the expected degree of completion of the development task;
step S202: on the basis of step S201, for a given subtask f_i, step S201 is repeated according to the number of enterprises in the cross-region enterprise manufacturing resources, and the capability resource pool builder then establishes a virtual enterprise development capability resource pool CP(i) = {c_i1, c_i2, …, c_ij}, i = 1~n;
Step S203: according to the sequential relationship between the adjacent subtasks, the capability network builder establishes the sequential relationship between the enterprise development capability resource pools and the association relationship between each enterprise development capability unit in the two adjacent enterprise development capability resource pools, so as to form a dynamic complex product development capability network;
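The capability unit c_ij = {lov(c_ij), f_i, j} and the resource pool CP(i) of steps S201 and S202 could be represented as follows. This is a minimal sketch; the class, field, and function names are hypothetical, and the levels are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CapabilityUnit:
    task: int        # index i of subtask f_i
    enterprise: int  # index j of the enterprise
    lov: int         # development capability level lov(c_ij)


def build_pool(task: int, levels: list[int]) -> list[CapabilityUnit]:
    """CP(i): one capability unit per enterprise able to work on subtask i."""
    return [CapabilityUnit(task, j, lov) for j, lov in enumerate(levels, start=1)]


# Hypothetical pool for subtask f_1 with three candidate enterprises.
pool = build_pool(1, [6, 8, 7])
```

The network of step S203 would then link each pool to the pool of the next subtask; that linking is not shown here.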
step S3: based on a complex product development capability network, a dynamic resource selection decision module obtains a dynamic resource allocation strategy according to a Markov decision method; the step S3 includes the steps of:
step S301: in the complex product development capability network, it is specified that, during the development of each subtask, only one enterprise development capability unit c_ij can be allocated at any given time t (t = 0, 1, 2, …) to meet the corresponding development requirement;
step S302: taking the subtask development stage in progress at time t as the task state at time t, the task state space of the total task F can be expressed as S = {S_t, t ≥ 0} = {f_1, f_2, …, f_n};
Step S303: at a given time t, if the corresponding subtask is f_i, the task state is S_t = f_i, and the enterprise development capability unit allocated at time t is defined as
Figure FDA0002478847350000021
If this unit successfully completes development subtask f_i, which occurs with probability θ_ij, the process enters the enterprise development capability unit allocation stage of the next subtask f_(i+1) at the next time t+1, and the task state at time t+1 is S_(t+1) = f_(i+1);
with probability 1 − θ_ij development subtask f_i is not completed, i.e., the task development fails, and the task state at the next time t+1 is the same as at time t, i.e., S_(t+1) = S_t = f_i; the probability θ_ij is related to the development capability level lov(c_ij) by θ_ij = lov(c_ij)/10;
Step S304: for a given subtask f_i, carrying out the development task with the enterprise development capability unit allocated at time t,
Figure FDA0002478847350000022
incurs the development cost
Figure FDA0002478847350000023
step S305: setting St=fnIs the target state, i.e. the final state;
step S306: from the task start time 0 to the task completion time t, the enterprise development capability unit allocation process can be described by a history of the Markov process:
Figure FDA0002478847350000024
step S307: according to the history H_t described in step S306, obtain the probability density with which, under history H_t, the enterprise development capability unit
Figure FDA0002478847350000025
is allocated, namely
Figure FDA0002478847350000026
which is the dynamic resource allocation strategy π;
step S308: let γ be the number of enterprise development capability unit allocations (development attempts) made, under history H_t, by the time the process starting from the demand state S_0 = f_1 first reaches the final state S_t = f_n; all the enterprise development capability units allocated from time 0 until the number of development attempts reaches γ are defined as the development capability sequence
Figure FDA0002478847350000031
Step S309: the dynamic resource scheduling and optimal configuration problem in the cloud manufacturing environment can be described as seeking a dynamic resource allocation strategy π that minimizes the expected development cost Z(X), where
Figure FDA0002478847350000032
E_π denotes the expectation with respect to the dynamic resource allocation strategy π;
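Steps S303 to S309 can be sketched in Python as sampling one history and accumulating its cost. The sketch rests on labeled assumptions: the strategy is a fixed matrix P whose row i gives the selection probabilities for subtask i, the cost of a unit is charged on every attempt, and the process ends once the last subtask succeeds. The closed form in `expected_cost` follows from Wald's identity under those same assumptions and is not taken from the patent text. All names and numbers are hypothetical.

```python
import random


def run_history(P, theta, cost, rng, max_steps=10_000):
    """Sample one history; return (allocated (task, unit) sequence, total cost Z)."""
    state, sequence, total = 0, [], 0.0
    while state < len(P):
        if len(sequence) >= max_steps:
            raise RuntimeError("history did not terminate within max_steps")
        j = rng.choices(range(len(P[state])), weights=P[state])[0]
        sequence.append((state, j))
        total += cost[state][j]            # assumed: cost charged per attempt
        if rng.random() < theta[state][j]:
            state += 1                     # success: S_(t+1) = f_(i+1)
        # failure: the state is unchanged, S_(t+1) = S_t
    return sequence, total


def expected_cost(P, theta, cost):
    """Expected Z for a fixed row-wise strategy P (Wald's identity per subtask)."""
    total = 0.0
    for p_row, th_row, c_row in zip(P, theta, cost):
        per_attempt = sum(p * c for p, c in zip(p_row, c_row))
        success = sum(p * th for p, th in zip(p_row, th_row))
        total += per_attempt / success
    return total


# Deterministic toy case: the chosen units always succeed.
P_demo = [[1.0, 0.0], [0.0, 1.0]]
theta_demo = [[1.0, 0.5], [0.5, 1.0]]
cost_demo = [[10, 20], [30, 40]]
sequence_demo, z_demo = run_history(P_demo, theta_demo, cost_demo, random.Random(0))
```

With a non-degenerate P the sampled cost varies from run to run, and `expected_cost` gives the value that the sample mean approaches.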
step S4: optimizing and outputting the dynamic resource allocation strategy by adopting a cross entropy solving module; the step S4 includes the steps of:
step S401: initialize the dynamic resource allocation strategy π so that every enterprise development capability unit has the same probability of being allocated the task, i.e., represent π as an initial transition matrix P whose elements are all equal and whose rows each sum to 1;
step S402: in the complex product development capability network corresponding to the total task F, randomly select an enterprise development capability unit as the starting point; since the number of subtasks is n, a path X_1, X_2, …, X_n is generated by random state transitions through the n states based on the initial transition matrix P; because the state transitions are random, N paths can be obtained, and the cost Z(X_i) of each path X_i is calculated;
Step S403: sort the development costs Z(X_i) in ascending order:
Z(X_i)_(1) ≤ Z(X_i)_(2) ≤ … ≤ Z(X_i)_(N); the ρ·100% quantile value is then
Figure FDA0002478847350000033
Step S404: from the quantile value obtained, calculate the first probability transition matrix P' using the Lagrange multiplier method, where an element p_ij' of P' is expressed as:
Figure FDA0002478847350000034
where p_ij' represents the probability that capability development unit j is allocated when subtask i is developed;
Figure FDA0002478847350000041
represents, among the N paths, for those whose development cost does not exceed
Figure FDA0002478847350000042
the number of times a capability development unit is allocated when developing subtask i;
Figure FDA0002478847350000043
represents, among the N paths, for those whose development cost does not exceed
Figure FDA0002478847350000044
the number of times capability development unit j is allocated when developing subtask i;
step S405: the first probability transition matrix P' and the initial transition matrix P are combined using a smoothing technique to obtain a second probability transition matrix P'', namely P'' = α·P' + (1 − α)·P, where α is the smoothing parameter;
step S406: reassign the initial transition matrix P, i.e., assign the second probability transition matrix P'' to the initial transition matrix P;
step S407: repeat steps S402 to S406 until, over a given number of iterations d, the different initial transition matrices P all yield the same quantile value
Figure FDA0002478847350000045
of the development cost Z(X_i) incurred when the final state S_t = f_n is first reached from the demand state S_0 = f_1;
step S408: when the condition of step S407 is met, an optimal selection strategy is considered to have been found, and the current initial transition matrix P is output as the optimal selection strategy, thereby giving the optimal combination of dynamic resources for the complex product in the cloud manufacturing environment.
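Steps S401 to S408 can be sketched as a compact cross-entropy loop in Python. This is an illustration under stated assumptions, not the patented implementation: the quantile is taken as the ⌈ρN⌉-th smallest cost, the stopping rule is read as "the quantile value is identical over d consecutive iterations", cost is charged per attempt, and `max_iter` is an added safety cap. Function names, parameter defaults, and the toy data are hypothetical.

```python
import math
import random


def sample_path(P, theta, cost, rng, max_steps=10_000):
    """One run through all subtasks; returns (allocations, total cost)."""
    path, total, i = [], 0.0, 0
    while i < len(P):
        if len(path) >= max_steps:
            raise RuntimeError("path did not terminate")
        j = rng.choices(range(len(P[i])), weights=P[i])[0]
        path.append((i, j))
        total += cost[i][j]                 # assumed: cost charged per attempt
        if rng.random() < theta[i][j]:
            i += 1                          # success: move to the next subtask
    return path, total


def cross_entropy(theta, cost, N, rho, alpha, d, seed=0, max_iter=200):
    rng = random.Random(seed)
    n, m = len(theta), len(theta[0])
    P = [[1.0 / m] * m for _ in range(n)]                  # S401: uniform start
    recent, gamma = [], math.inf
    for _ in range(max_iter):
        samples = [sample_path(P, theta, cost, rng) for _ in range(N)]   # S402
        ordered = sorted(c for _, c in samples)                          # S403
        gamma = ordered[max(1, math.ceil(rho * N)) - 1]
        counts = [[0] * m for _ in range(n)]                             # S404
        for path, c in samples:
            if c <= gamma:                  # keep only the low-cost paths
                for i, j in path:
                    counts[i][j] += 1
        for i in range(n):
            row_total = sum(counts[i])
            for j in range(m):
                p_new = counts[i][j] / row_total if row_total else P[i][j]
                P[i][j] = alpha * p_new + (1 - alpha) * P[i][j]  # S405 and S406
        recent = (recent + [gamma])[-d:]                                 # S407
        if len(recent) == d and len(set(recent)) == 1:
            break
    return P, gamma                                                      # S408


# Hypothetical 2 x 2 problem in which one unit per subtask is clearly better.
theta_toy = [[0.9, 0.1], [0.1, 0.9]]
cost_toy = [[10, 100], [100, 10]]
P_opt, gamma_opt = cross_entropy(theta_toy, cost_toy, N=200, rho=0.1, alpha=0.7, d=3)
```

The smoothing step keeps every probability strictly positive, so a unit that happens to be absent from the low-cost paths in one iteration can still be sampled later.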
CN201610887855.XA 2016-10-11 2016-10-11 Dynamic resource optimization method based on Markov decision process Active CN106650993B (en)


Publications (2)

Publication Number Publication Date
CN106650993A 2017-05-10
CN106650993B 2020-07-03

Family

ID=58855282


Country Status (1)

Country Link
CN (1) CN106650993B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106919755B (en) * 2017-03-01 2020-08-11 清华大学 Cloud manufacturing system uncertainty quantitative analysis method and device based on data
CN109697273A (en) * 2017-10-20 2019-04-30 顺丰科技有限公司 For resource allocation methods, equipment, the readable storage medium storing program for executing of mark problem
CN108063830B (en) * 2018-01-26 2020-06-23 重庆邮电大学 Network slice dynamic resource allocation method based on MDP
EP3702942A4 (en) * 2018-03-27 2021-08-04 Nippon Steel Corporation Analysis system, analysis method, and program
CN116137630B (en) * 2023-04-19 2023-08-18 井芯微电子技术(天津)有限公司 Method and device for quantitatively processing network service demands
CN117134364B (en) * 2023-06-21 2024-05-28 国网湖北省电力有限公司营销服务中心(计量中心) Feed processing enterprise load management method based on staged strategy gradient algorithm

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103036974A (en) * 2012-12-13 2013-04-10 广东省电信规划设计院有限公司 Cloud computing resource scheduling method and system based on hidden markov model

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9047423B2 (en) * 2012-01-12 2015-06-02 International Business Machines Corporation Monte-Carlo planning using contextual information


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
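Research on optimal allocation of manufacturing resources in a cloud manufacturing environment; Wang Shilong et al.; Computer Integrated Manufacturing Systems; 2012-07-31; Vol. 18, No. 7; 1396-1405 *
Optimizing software testing with the cross-entropy method based on a Markov decision process; Zhang Deping et al.; Journal of Software; 2008-10-30; No. 10; 2770-2779 *
Simulation and computation of software testing based on a Markov decision process; Qin Qiang et al.; Journal on Numerical Methods and Computer Applications; 2014-06-30; Vol. 35, No. 2; 92-102 *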



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant