CN109902954A - Flexible job shop dynamic scheduling method based on industrial big data - Google Patents

Flexible job shop dynamic scheduling method based on industrial big data

Info

Publication number
CN109902954A
CN109902954A (application CN201910144370.5A)
Authority
CN
China
Prior art keywords
scheduling
data
rule
machine
workpiece
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910144370.5A
Other languages
Chinese (zh)
Other versions
CN109902954B (en)
Inventor
汤洪涛
费永辉
闫伟杰
陈程
梁佳炯
程晓雅
王丹南
李晋青
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University of Technology ZJUT
Original Assignee
Zhejiang University of Technology ZJUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Application filed by Zhejiang University of Technology (ZJUT)
Priority to CN201910144370.5A
Publication of CN109902954A
Application granted
Publication of CN109902954B
Legal status: Active
Anticipated expiration

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

A flexible job shop dynamic scheduling method based on industrial big data, comprising the following steps. Step 1: using the data collection tools Sqoop and Flume, scheduling data is obtained from databases or file systems and stored in the HDFS file system. Step 2: using the data warehouse tool Hive, the scheduling data is partitioned with the scheduling scheme as the unit. Step 3: using the Spark computing framework, the scheduling data is converted into training examples and stored in HBase, one scheduling scheme per unit. Step 4: the data is screened on multiple indices to obtain the set of scheduling data generated by schemes that performed well in execution. Step 5: the scheduling-related historical data is clustered on disturbance attributes. Step 6: random forest scheduling rules are mined with an improved random forest algorithm. Step 7: the mined scheduling rules are used to guide flexible job shop dynamic scheduling. The method is practically operable, computationally efficient, and can respond to shop floor disturbances in real time.

Description

Flexible job shop dynamic scheduling method based on industrial big data
Technical field
The present invention relates to a flexible job shop dynamic scheduling method based on industrial big data.
Background technique
Scheduling plays an important role in a manufacturing system, and scheduling quality affects the competitiveness of the manufacturing enterprise itself. Formulating a scientific and reasonable scheduling scheme for the shop floor can improve production efficiency, reduce processing cost and shorten the product life cycle, while guaranteeing on-time delivery with assured quality. Because of its flexible process routes and fast adaptability to market demand, the flexible job shop can satisfy multi-variety, small-batch production requirements very well, and it has therefore become a widely used production mode. Flexible job shop dynamic scheduling considers the disturbances of the actual production environment on top of static scheduling; it better matches the real production environment and is therefore more worthy of research.
As product demand keeps shifting toward personalization, manufacturing processes become more diverse and real scheduling problems become more complex; manufacturing enterprises have raised their requirements on job shop scheduling methods in terms of practical operability, computational efficiency and real-time response to shop floor disturbances. Priority dispatching rules are simple heuristic rules: they are computationally efficient, easy to operate in practice and usable for real-time scheduling, which makes them suitable for complex dynamic scheduling environments. However, the performance of a priority dispatching rule is affected by changes in the actual environment, and no single dispatching rule achieves good scheduling performance under all disturbance conditions. To meet the needs of actual job shop scheduling, one feasible idea is to mine scheduling knowledge about dispatching rules from scheduling-related historical data to guide real scheduling activities. Research on solving scheduling problems by data mining falls broadly into two classes: methods that combine existing priority dispatching rules, and methods that mine new dispatching rules from scheduling-related historical data.
On combining existing priority dispatching rules, WANG Shuang-Xi et al. (A hybrid knowledge discovery model using decision tree and neural network for selecting dispatching rules of a semiconductor final testing factory, 2005) proposed, for the semiconductor industry, a method combining decision trees and neural networks to mine a priority-dispatching-rule selection mechanism from scheduling-related historical data; the mechanism yields the most suitable priority dispatching rule for the current environment. SHIUE Y.R. et al. (Data-mining-based dynamic dispatching rule selection mechanism for shop floor control systems using a support vector machine approach, 2009) proposed a method that uses a support vector machine (SVM) to mine a priority-dispatching-rule selection mechanism from scheduling-related historical data and makes real-time operational decisions with it. Mouelhi et al. (Training a neural network to select dispatching rules in real time, 2009) proposed a dispatching rule selection method based on neural networks, which mines a real-time rule selection policy from scheduling-related historical data generated by simulation.
A priority dispatching rule makes scheduling decisions from only a small amount of information, which may lead to unsatisfactory scheduling results; extracting new dispatching rules from scheduling-related historical data is therefore another line of thought. LI X et al. (Discovering dispatching rules using data mining, 2005) proposed a method that uses decision trees to obtain completely new dispatching rules from scheduling-related historical data, and experiments confirmed that the extracted rules fit the original schedules well. SIGURDUR OLAFSSON et al. (Learning effective new single machine dispatching rules from optimal scheduling data, 2010) proposed a two-stage scheduling knowledge learning method: first learn how to obtain training examples suitable for mining from scheduling process data, then mine dispatching rules from those training examples. Wang Chenglong et al. (Research on mining methods for job shop scheduling rules, 2015) proposed a dispatching rule mining method combining Petri net modeling, a branch-and-bound algorithm and decision tree algorithms; the extracted dispatching rules can guide static job shop scheduling.
In conclusion being primarily directed to vehicle about the method for excavating scheduling rule from scheduling relevant historical data at present Between on scheduling problem, applied to the less of flexible job shop dynamic scheduling problem.In addition, the above method is used Scheduling relevant historical data be partial to gross data, a large amount of uses however as Intellisense equipment in shop layer, workshop Start to intelligent development, Job-Shop relevant historical data shows the industry such as scale is big, value is low, continuous sampling, higher-dimension The characteristics of big data.
Summary of the invention
To overcome the problems that existing flexible job shop dynamic scheduling methods are not practically operable, not computationally efficient, and unable to respond to shop floor disturbances in real time, the present invention provides a flexible job shop dynamic scheduling method that is practically operable, computationally efficient and able to respond to shop floor disturbances in real time.
The technical solution adopted by the present invention to solve the technical problems is:
A flexible job shop dynamic scheduling method based on industrial big data, the method comprising the following steps:
Step 1, data acquisition: use the data acquisition tools of the Hadoop ecosystem to collect scheduling-related historical data from existing information systems, and store it in the HDFS file system.
Step 2, data integration: using the data warehouse tool Hive, partition the scheduling data set Dh in the HDFS file system with SQL statements, taking the scheduling scheme as the unit, i.e. the scheduling-related historical data generated during the execution of one scheduling scheme is grouped together.
Step 3, data conversion: use Spark to convert the integrated data into the form of training examples, so that a data mining algorithm can mine dispatching rules from them.
Step 4, data screening: evaluate historical scheduling schemes on the three indices maximum makespan, total tardiness and total machine load, and keep the set of scheduling data generated during execution by the schemes that performed well. Specifically:
Step 4.1: on the maximum makespan index, use as the screening criterion the maximum makespan of the schedule generated by the SPT rule alone under the same conditions.
Step 4.2: on the total tardiness index, use as the screening criterion the total tardiness of the scheduling task completed under the same conditions by the EDD rule combined with the SPT rule.
Step 4.3: on the total machine load index, use as the screening criterion the total machine load of the scheduling task completed under the same conditions by the LMWT rule combined with the SPT rule. The scheduling data generated in execution by the schemes that satisfy all three indices simultaneously will be the input of the dispatching rule mining algorithm.
Step 5, clustering based on disturbance attributes: use the DBSCAN clustering method to cluster the screened scheduling-related historical data on its disturbance attributes, with the scheduling scheme as the unit (i.e. the data generated by one scheduling scheme is one object). Specifically:
Step 5.1: standardize the disturbance data recorded during scheme execution. If the values of a disturbance attribute over the schemes are X1, X2, X3, ..., Xn, they are transformed by formula (1):
Yi = (Xi - Xmean) / S    (1)
In formula (1), Xmean denotes the mean of the attribute, S its standard deviation, and Y1, Y2, Y3, ..., Yn are the standardized data.
Step 5.2: determine the DBSCAN parameters: the neighborhood radius Eps and the minimum number of objects MinPts that a core object must contain within that radius.
Step 5.3: pick a core object p at random and create a new cluster with p as its core object; then add the objects directly density-reachable from p to the cluster.
Step 5.4: repeat step 5.3; when no new point can be added to any cluster, the process ends.
Step 6, random forest dispatching rule mining: use an improved random forest algorithm to mine, from each cluster, random forest dispatching rule 1, which solves the problem of a workpiece selecting a machine, and random forest dispatching rule 2, which solves the problem of an idle machine selecting a workpiece to process. Specifically:
Step 6.1: for each cluster, draw training examples from the cluster with replacement to form k new training example sets for building decision trees.
Step 6.2: randomly select m characteristic attributes and compute the best split, training k decision trees.
Step 6.3: test the classification performance of the decision trees on the training examples in the cluster that were not selected.
Step 6.4: check whether similar decision trees exist; if so, keep the decision tree that performs well in testing, forming the random forest.
Step 6.5: compute the weights w and h of every decision tree according to the Bayes voting mechanism, obtaining random forest dispatching rule 1 and random forest dispatching rule 2.
Step 7, use of the dispatching rules: guide flexible job shop dynamic scheduling with the mined random forest dispatching rules. Specifically:
Step 7.1: according to the problem to be solved, a workpiece selecting a machine or an idle machine selecting a workpiece to process, find random forest dispatching rule 1 or random forest dispatching rule 2 corresponding to the cluster to which the current disturbance condition of the flexible job shop belongs.
Step 7.2: with the selected random forest dispatching rule, use pairwise comparison to select the most suitable machine or workpiece from the candidate machine set M or the candidate workpiece set J.
The technical concept of the invention is as follows. Job-shop-related historical data exhibits the characteristics of industrial big data, namely large scale, low value density, continuous sampling and high dimensionality, so the preprocessing of scheduling-related historical data is first completed with big data technology; Fig. 2 gives the data preprocessing model combined with big data technology. The flexible job shop dynamic scheduling problem addressed here is to solve the machine selection problem of workpieces and the workpiece selection problem of idle machines in a disturbed environment, so the acquired data set Dh is divided into three parts: d1 is the disturbance-related system information at the time the scheduling scheme was formulated; d2 is, when a certain operation of a workpiece selects a processing machine, the status information of every machine in the set of machines that can currently process this operation; d3 is, when an idle machine needs to select a workpiece to process from its waiting queue, the status information of every workpiece in the current queue. The data in the scheduling data set Dh is disordered and cannot be used directly for the subsequent data screening, clustering and dispatching rule mining; it must first be arranged through data integration and conversion. Dh implicitly contains a large amount of effective information reflecting the characteristics of the actual scheduling environment, as well as scheduling knowledge, but it is also accompanied by many useless or wrong rules or patterns. Therefore, the multi-index data screening system of Fig. 3 is used: historical scheduling schemes are evaluated from the three angles of maximum makespan, total tardiness and total machine load, and the data generated during the execution of the schemes that satisfy all three indices is retained.
The random forest algorithm is used as the mining algorithm for dispatching rules, so the finally obtained dispatching rule is a random forest, essentially a set of trained C4.5 decision trees. The scheduling performance of the rule depends on the classification performance of the decision trees, and the computational efficiency and complexity of the rule depend on the number of branches of the trees. DBSCAN clustering reasonably partitions the better scheduling data Db, distinguishing the data produced by scheduling decisions made under different disturbance conditions; dispatching rules for the different disturbance conditions are then obtained from each partition. This enhances the classification performance of the decision trees in the resulting random forest dispatching rules and reduces the number of branches, so that the rules have lower complexity, higher computational efficiency and better scheduling performance.
Learning a dispatching rule f from scheduling-related historical data by the random forest algorithm in fact yields an estimate of the true dispatching rule y, so there is some error between f and y. The error consists of three parts: noise, variance and bias. The noise is unavoidable, but the error of the algorithm can be reduced by reducing the variance or the bias, thereby improving the performance of the random forest algorithm. Reducing the correlation ρ between decision trees also reduces the variance; therefore, if the similarity between two decision trees is too large, the tree that performs well in testing is retained, which reduces ρ. The traditional random forest algorithm uses a majority voting mechanism in which every decision tree has the same weight regardless of its classification performance, so trees with poor classification performance influence the final result as much as trees with good performance. A Bayes voting mechanism is therefore used here: each decision tree is assigned a weight based on its classification performance in testing, and votes are counted according to these weights.
The beneficial effects of the invention are mainly as follows. Taking as the main framework the approach of mining dispatching rules from scheduling-related historical data with industrial big data characteristics to guide scheduling, it establishes a data preprocessing model combined with big data technology, improving the speed and accuracy of data preprocessing; it establishes clustering based on disturbance attributes, reducing the complexity of the dispatching rules and improving their computational efficiency and scheduling performance; and it establishes a dispatching rule mining model based on an improved random forest algorithm, improving the generalization ability and scheduling performance of the dispatching rules.
Detailed description of the invention
Fig. 1 is the overall architecture of the dispatching rule mining of the invention.
Fig. 2 is the scheduling data preprocessing model of the invention combined with big data technology.
Fig. 3 is the multi-index data screening system of the invention.
Fig. 4 is the flow chart of the improved random forest algorithm of the invention for mining dispatching rules.
Fig. 5 is the scheduling scheme obtained with the flexible job shop dynamic scheduling method of the invention based on industrial big data.
Specific embodiment
Referring to Figs. 1-5, a flexible job shop dynamic scheduling method based on industrial big data; the overall framework is shown in Fig. 1 and is divided into three parts: the first part, the scheduling data preprocessing model combined with big data technology (see Fig. 2), comprising data acquisition, data integration, data conversion and data screening; the second part, the clustering strategy based on disturbance attributes; the third part, the dispatching rule mining model based on the improved random forest algorithm. The technical steps are as follows. Step 1, data acquisition: using the data acquisition tools Sqoop and Flume of the Hadoop ecosystem, collect scheduling-related historical data from existing information systems such as MES, ERP and SCADA, and store it in the HDFS file system. The acquired data comprises three parts, Dh = {d1, d2, d3}: d1 is the disturbance-related system information at the time the scheduling scheme was formulated; d2 is, when a certain operation of a workpiece selects a processing machine, the status information of every machine in the set of machines that can currently process this operation; d3 is, when an idle machine needs to select a workpiece to process from its waiting queue, the status information of every workpiece in the current queue.
Step 2, data integration: using the data warehouse tool Hive, partition the scheduling data set Dh in the HDFS file system with SQL statements, taking the scheduling scheme as the unit, i.e. the scheduling-related historical data generated during the execution of one scheduling scheme is grouped together.
Step 3, data conversion: use Spark to convert the d2 and d3 parts of the integrated data into the form of training examples, so that a data mining algorithm can mine dispatching rules from them. Specifically:
Step 3.1: for the d2 part of the acquired scheduling data set Dh, treat the machine m1 actually selected in a historical scheduling scheme as the most suitable machine, and compare it one by one with the other machines {m2, m3, ...} in the alternative machine set of this operation to form training examples.
Step 3.2: for the d3 part of the acquired scheduling data set Dh, treat the workpiece j1 actually selected in a historical scheduling scheme as the most suitable workpiece, and compare it one by one with the other workpieces {j2, j3, ...} in the waiting workpiece set to form training examples.
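The pairwise construction of steps 3.1-3.2 can be sketched as follows. This is a minimal illustration, not the patented implementation: the machine attribute names (load, queue_len, speed) are hypothetical, and encoding each pair as a feature-difference vector with labels 1 ("the former is suitable") and 2 ("the latter is suitable") is an assumption consistent with the class labels used later in step 7.

```python
# Sketch of the pairwise training-example construction of steps 3.1-3.2.
# Field names (load, queue_len, speed) are illustrative, not from the patent.

def make_training_examples(chosen, alternatives, features):
    """Pair the actually chosen machine/workpiece with each alternative.

    Each example is (feature-difference vector, label); label 1 means
    "the former (the historically chosen one) is suitable", label 2 the latter.
    """
    examples = []
    for alt in alternatives:
        diff = [chosen[f] - alt[f] for f in features]
        examples.append((diff, 1))                 # chosen vs. alternative
        examples.append(([-d for d in diff], 2))   # symmetric example
    return examples

# d2-style records: machine status at the moment an operation selected m1
m1 = {"load": 3.0, "queue_len": 2, "speed": 1.2}   # actually selected
m2 = {"load": 5.0, "queue_len": 4, "speed": 1.0}
m3 = {"load": 2.0, "queue_len": 5, "speed": 0.8}

examples = make_training_examples(m1, [m2, m3], ["load", "queue_len", "speed"])
print(len(examples))  # → 4 (two examples per alternative machine)
```

The same helper would serve the d3 part, with workpiece status records in place of machine status records.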
Step 4, data screening: evaluate historical scheduling schemes on the three indices maximum makespan, total tardiness and total machine load, and keep the scheduling-related historical data set Db generated during execution by the schemes that performed well. Specifically:
Step 4.1: on the maximum makespan index, use as the screening criterion the maximum makespan of the schedule generated by the SPT rule alone under the same conditions. Using only the SPT rule means that a workpiece selects the fastest machine and an idle machine selects the workpiece with the shortest processing time. Schemes whose maximum makespan meets this criterion proceed to step 4.2; the rest are eliminated.
Step 4.2: on the total tardiness index, use as the screening criterion the total tardiness of the scheduling task completed under the same conditions by the EDD rule combined with the SPT rule. The SPT+EDD rule means that a workpiece selects the fastest machine and an idle machine selects the workpiece with the earliest due date. Schemes whose total tardiness meets this criterion proceed to step 4.3; the rest are eliminated.
Step 4.3: on the total machine load index, use as the screening criterion the total machine load of the scheduling task completed under the same conditions by the LMWT rule combined with the SPT rule. The LMWT+SPT rule means that a workpiece selects the machine with the longest idle time and an idle machine selects the workpiece with the shortest processing time. The scheduling data generated in execution by the schemes that satisfy all three indices simultaneously will be the input of the dispatching rule mining algorithm.
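The three-stage screen of steps 4.1-4.3 can be sketched as below. The baseline values stand in for the SPT, SPT+EDD and LMWT+SPT reference schedules computed under the same conditions; treating "no worse than the baseline" as the pass condition is an assumption, since the patent states the criteria but not the comparison direction explicitly.

```python
# Sketch of the three-index screen of steps 4.1-4.3. The baseline dict stands
# in for the SPT / SPT+EDD / LMWT+SPT reference values; "no worse than the
# baseline" as the pass condition is an assumption.

def passes_screen(scheme, baseline):
    """scheme/baseline: dicts with makespan, tardiness, machine_load."""
    if scheme["makespan"] > baseline["makespan"]:          # step 4.1: SPT
        return False
    if scheme["tardiness"] > baseline["tardiness"]:        # step 4.2: SPT+EDD
        return False
    if scheme["machine_load"] > baseline["machine_load"]:  # step 4.3: LMWT+SPT
        return False
    return True

baseline = {"makespan": 25.0, "tardiness": 8.0, "machine_load": 100.0}
schemes = [
    {"makespan": 21.8, "tardiness": 5.3, "machine_load": 96.4},  # kept
    {"makespan": 26.0, "tardiness": 4.0, "machine_load": 90.0},  # fails 4.1
]
Db = [s for s in schemes if passes_screen(s, baseline)]
print(len(Db))  # → 1
```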
Step 5, clustering based on disturbance attributes: use DBSCAN on Db with the scheduling scheme as the unit (i.e. the data generated by one scheduling scheme is one object), clustering on the system disturbance attributes (the d1 part) recorded in Db at the time the schemes were formulated. Specifically:
Step 5.1: standardize the d1 part of the data. If the values of a disturbance attribute over the schemes are X1, X2, X3, ..., Xn, they are transformed by formula (1):
Yi = (Xi - Xmean) / S    (1)
In formula (1), Xmean denotes the mean of the attribute, S its standard deviation, and Y1, Y2, Y3, ..., Yn are the standardized data.
Step 5.2: determine the DBSCAN parameters: the neighborhood radius Eps and the minimum number of objects MinPts that a core object must contain within that radius.
Step 5.3: pick at random a core object p that has not yet been processed (not assigned to a cluster or labeled as noise), i.e. an object whose neighborhood of radius Eps contains at least MinPts objects; create a new cluster C, and add all objects within the Eps-neighborhood of p to a candidate set N.
Step 5.4: pick at random an object q in the candidate set N that has not yet been processed. If q is a core object, add to N the objects in its Eps-neighborhood that have not been processed and are not yet in N. If q does not belong to any cluster, add q to C.
Step 5.5: repeat step 5.4 until N is empty.
Step 5.6: repeat steps 5.3-5.5; when no new object can be added to any cluster, the process ends.
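Steps 5.1-5.6 can be sketched in plain Python: z-score standardization per formula (1), then a minimal DBSCAN over one disturbance vector per scheme. The two disturbance attributes and the Eps/MinPts values are illustrative choices, not from the patent.

```python
import math

# Minimal sketch of steps 5.1-5.6: z-score standardization (formula (1))
# followed by plain DBSCAN over the d1 disturbance vectors, one vector per
# scheduling scheme. Attribute meanings and Eps/MinPts are illustrative.

def standardize(column):
    mean = sum(column) / len(column)
    std = math.sqrt(sum((x - mean) ** 2 for x in column) / len(column))
    return [(x - mean) / std for x in column] if std else [0.0] * len(column)

def dbscan(points, eps, min_pts):
    labels = [None] * len(points)          # None = unprocessed, -1 = noise
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neigh = [j for j in range(len(points)) if math.dist(points[i], points[j]) <= eps]
        if len(neigh) < min_pts:           # not a core object
            labels[i] = -1
            continue
        labels[i] = cluster                # step 5.3: new cluster from core p
        candidates = [j for j in neigh if j != i]
        while candidates:                  # steps 5.4-5.5: grow via candidate set N
            q = candidates.pop()
            if labels[q] == -1:
                labels[q] = cluster        # former noise becomes a border point
            if labels[q] is not None:
                continue
            labels[q] = cluster
            q_neigh = [j for j in range(len(points)) if math.dist(points[q], points[j]) <= eps]
            if len(q_neigh) >= min_pts:    # q is itself a core object
                candidates.extend(j for j in q_neigh if labels[j] is None)
        cluster += 1
    return labels

# Two disturbance attributes per scheme (e.g. failure count, urgent-order count)
raw = [[1, 10], [2, 11], [1, 12], [9, 40], [10, 41], [50, 5]]
std_cols = [standardize(list(c)) for c in zip(*raw)]
points = list(zip(*std_cols))
print(dbscan(points, eps=0.6, min_pts=2))  # → [0, 0, 0, 1, 1, -1]
```

Two dense groups of schemes form clusters 0 and 1; the outlier scheme is labeled noise (-1), which matches the termination condition of step 5.6.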
Step 6, random forest dispatching rule mining: use an improved random forest algorithm to mine, from each cluster, random forest dispatching rule 1, which solves the problem of a workpiece selecting a machine, and random forest dispatching rule 2, which solves the problem of an idle machine selecting a workpiece to process. Specifically:
Step 6.1: for each cluster, draw training examples with replacement from the d2 part of the cluster (to mine random forest dispatching rule 1) and from the d3 part (to mine random forest dispatching rule 2), forming k new training example sets P1 and P2 respectively, for building decision trees.
Step 6.2: from P1 and P2, randomly select m characteristic attributes of d2 and d3 respectively, compute the best split, and train k decision trees T1 and T2 respectively.
The decision tree building process is as follows:
Step 6.2.1: create a root node N.
Step 6.2.2: check whether the training example set still has remaining training examples; if not, return node N; otherwise continue.
Step 6.2.3: check whether the scheduling decisions of the remaining training examples all belong to one class C; if so, return node N labeled as class C; otherwise continue.
Step 6.2.4: check whether the attribute list is empty; if so, label the node with the majority class in the sample; otherwise continue.
Step 6.2.5: check whether an attribute in the attribute list is continuous; for a continuous attribute, obtain by dichotomy the split with the maximum information gain G(D, A). (The dichotomy divides all values of the attribute into two parts; there are N-1 such divisions, and each candidate split threshold is the average of two adjacent values. Information gain is computed by formulas (2), (3) and (4).)
G(D, A) = H(D) - H(D | A)    (2)
H(D) = -Σk (|Ck| / |D|) log2(|Ck| / |D|)    (3)
H(D | A) = Σi (|Di| / |D|) H(Di)    (4)
In formula (2), G(D, A) denotes the information gain of attribute A; in formula (3), H(D) is the class information entropy; in formula (4), H(D | A) denotes the conditional entropy. D denotes the training example data set and |D| the number of training examples in D; D has K classes Ck, k = 1, 2, ..., K, and |Ck| denotes the number of training examples in class Ck. Attribute A divides D into n subsets D1, D2, ..., Dn, with |Di| the number of training examples in Di. The set of training examples in Di belonging to class Ck is Dik, with |Dik| its number of training examples.
Step 6.2.6: select the attribute with the maximum information gain ratio to label node N; the information gain ratio is computed by formulas (5) and (6); return to step 6.2.2.
GR(D, A) = G(D, A) / H(A)    (5)
H(A) = -Σi (|Di| / |D|) log2(|Di| / |D|)    (6)
In formula (5), GR(D, A) denotes the information gain ratio; in formula (6), H(A) denotes the split information of attribute A. The other symbols are as above.
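Formulas (2)-(6) can be checked with a short sketch for a discrete attribute; the toy labels and attribute values are made up for illustration.

```python
import math

# Sketch of formulas (2)-(6): entropy, conditional entropy, information
# gain and gain ratio for a discrete split, as used in step 6.2.

def entropy(labels):                       # formula (3)
    n = len(labels)
    counts = {c: labels.count(c) for c in set(labels)}
    return -sum((v / n) * math.log2(v / n) for v in counts.values())

def gain_and_ratio(labels, attr_values):
    n = len(labels)
    subsets = {}
    for a, c in zip(attr_values, labels):  # attribute A partitions D into D1..Dn
        subsets.setdefault(a, []).append(c)
    cond = sum(len(s) / n * entropy(s) for s in subsets.values())        # (4)
    gain = entropy(labels) - cond                                        # (2)
    split = -sum((len(s) / n) * math.log2(len(s) / n) for s in subsets.values())  # (6)
    return gain, gain / split if split else 0.0                          # (5)

# Toy set: class 1 = "the former is suitable", class 2 = "the latter is suitable"
labels = [1, 1, 1, 2, 2, 2]
attr   = ["a", "a", "a", "b", "b", "b"]    # perfectly separating attribute
g, gr = gain_and_ratio(labels, attr)
print(round(g, 3), round(gr, 3))  # → 1.0 1.0
```

A perfectly separating attribute yields gain 1 bit and, with split information also 1 bit, a gain ratio of 1 — the maximum step 6.2.6 would select.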
Step 6.3: test the classification performance of the decision trees in T1 and T2 on the training examples in d2 and d3, respectively, that were not selected.
Step 6.4: compute the similarity S between decision trees in T1 or T2 by formula (7); if the similarity between two decision trees is greater than 60%, compare their test performance from step 6.3 and keep the decision tree that performs well, forming the random forest.
S = (Σn I(r1n.c, r2n.c)) / Nt    (7)
In formula (7), DT1 and DT2 denote the two decision trees under similarity calculation; k denotes the number of test cases on which DT1 and DT2 give identical classification results; r1n and r2n denote the characteristic attributes used by DT1 and DT2 when the n-th classification results are identical, and c denotes the classification result. When r1n = r2n, i.e. when DT1 and DT2 obtain the same classification result with the same characteristic attribute, I(r1n.c, r2n.c) = 1, otherwise 0; Nt is the number of test cases.
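The similarity check of step 6.4 can be sketched as follows; the record format (one attribute-class pair per test case per tree) is an assumption made for illustration.

```python
# Sketch of the tree-similarity check of step 6.4 (formula (7)): two trees
# agree on a test case only when they give the same class using the same
# characteristic attribute. The per-case record format is an assumption.

def similarity(results1, results2):
    """results*: list of (attribute_used, class) per test case."""
    nt = len(results1)
    same = sum(1 for (a1, c1), (a2, c2) in zip(results1, results2)
               if a1 == a2 and c1 == c2)   # I(r1n.c, r2n.c)
    return same / nt

t1 = [("load", 1), ("speed", 2), ("load", 1), ("queue", 2), ("load", 1)]
t2 = [("load", 1), ("speed", 2), ("load", 1), ("load", 2), ("speed", 1)]
s = similarity(t1, t2)
print(s)  # → 0.6
if s > 0.6:
    print("keep only the better-performing tree")
```

Here S is exactly at the 60% threshold, so both trees would be kept; a higher S would trigger the pruning of step 6.4.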
Step 6.5: using the Bayes voting mechanism, compute the weights w and h of every decision tree in T1 and T2 by formulas (8) and (9), obtaining random forest dispatching rule 1 and random forest dispatching rule 2.
w = v / (v + m)    (8)
h = m / (v + m)    (9)
In formulas (8) and (9), v denotes the number of test cases this decision tree classifies correctly, and m the number of test cases it classifies incorrectly.
Step 7, use of the dispatching rules: guide flexible job shop dynamic scheduling with the mined random forest dispatching rules. Specifically:
Step 7.1: according to the problem to be solved, a workpiece selecting a machine or an idle machine selecting a workpiece to process, find random forest dispatching rule 1 or random forest dispatching rule 2 corresponding to the cluster to which the current disturbance condition of the flexible job shop belongs.
Step 7.2: with the selected random forest dispatching rule, use pairwise comparison to select the most suitable machine or workpiece from the candidate machine set M or the candidate workpiece set J.
Step 7.2.1: for the machine selection problem of a workpiece, let m1 and m2 be two machines in M; with random forest dispatching rule 1 selected in step 7.1, compute the selection result of every decision tree in the rule. These results contain selection 1 and selection 2 (selection 1 means the former, m1, is suitable; selection 2 means the latter, m2, is suitable). For the workpiece selection problem of an idle machine, let j1 and j2 be two workpieces in J; with random forest dispatching rule 2 selected in step 7.1, compute the selection result of every decision tree in the rule. These results contain decision 1 and decision 2 (decision 1 means the former, j1, is suitable; decision 2 means the latter, j2, is suitable).
Step 7.2.2: obtain the weighted selection result WR of every decision tree through the Bayes voting mechanism by formula (10), and compute the average AWR of the weighted results. If AWR is less than 1.5, the former (m1 or j1) is suitable; if AWR is greater than 1.5, the latter (m2 or j2) is suitable.
WR = wC + hR    (10)
In formula (10), C denotes the classification result given by this decision tree and R the mean of the classification results given by all decision trees; the formulas for w and h are formulas (8) and (9).
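The weighted pairwise vote of step 7.2 can be sketched as below. It assumes w = v/(v+m) and h = m/(v+m), so that w + h = 1, WR stays between the class labels 1 and 2, and the 1.5 threshold of step 7.2.2 splits the range in half; the per-tree vote counts are made up for illustration.

```python
# Sketch of step 7.2 (formulas (8)-(10)), assuming w = v/(v+m) and
# h = m/(v+m); tree vote counts are illustrative.

def pairwise_vote(tree_stats):
    """tree_stats: list of (C, v, m) — each tree's class vote (1 or 2),
    its correct test-case count v and incorrect count m."""
    votes = [c for c, _, _ in tree_stats]
    R = sum(votes) / len(votes)            # mean of all trees' results
    wrs = []
    for c, v, m in tree_stats:
        w = v / (v + m)                    # assumed form of formula (8)
        h = m / (v + m)                    # assumed form of formula (9)
        wrs.append(w * c + h * R)          # formula (10)
    awr = sum(wrs) / len(wrs)
    return ("former" if awr < 1.5 else "latter"), awr

# Three trees comparing machine m1 (class 1) against machine m2 (class 2)
choice, awr = pairwise_vote([(1, 9, 1), (1, 8, 2), (2, 5, 5)])
print(choice)  # → former
```

An accurate tree leans on its own vote C, while an inaccurate one falls back toward the forest mean R, which is the intent of the Bayes voting mechanism described above.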
Example: In a scheduling task, 100 pieces each of workpieces JT1, JT2, ..., JT8 must be processed, i.e. 10 batches. Their due dates are 20.0, 22.0, 14.0, 21.0, 19.0, 22.0, 18.0, and 23.0 processing unit times, and the processing time of each operation of each workpiece on each machine is given in Table 1. A machine failure occurs at time 4; a material shortage for the second operation of JT1 is discovered after its first operation completes; and at time 10 the processing times of all workpieces increase by 10%.
Table 1: Processing time of each operation of each workpiece on each machine
Figure 5 shows the scheduling scheme obtained by the flexible job shop dynamic scheduling method based on industrial big data. The abscissa represents time and the ordinate represents machines; in the Gantt chart, the tens digit denotes the workpiece type and the units digit denotes the operation number. The final scheme has a makespan of 21.8 processing unit times, a total tardiness of 5.3 processing unit times, and a machine total load of 96.4 processing unit times.
The patented method solves the flexible job shop dynamic scheduling problem smoothly. The scheduling rules it mines are of strong practical value for guiding flexible job shop scheduling: they are highly operable, computationally efficient, require no explicit modeling of the scheduling problem, and can respond to shop floor disturbances in real time.

Claims (1)

1. A flexible job shop dynamic scheduling method based on industrial big data, comprising the following steps:
Step 1, data acquisition: using the data acquisition tools Sqoop and Flume from the Hadoop ecosystem, collect scheduling-related historical data from existing information systems such as MES, ERP, and SCADA, and store it in the HDFS file system; the collected data comprises three parts, Dh = {d1, d2, d3}: d1 is the disturbance-related system state information at the time the scheduling scheme was formulated; d2 is the state information of every machine in the set of machines currently able to process an operation, recorded when that operation of a workpiece selects a processing machine; d3 is the state information of every workpiece in the waiting queue, recorded when an idle machine needs to select the next workpiece to process;
Step 2, data integration: using the data warehouse tool Hive, partition the scheduling data set Dh in the HDFS file system with SQL statements, taking a scheduling scheme as the unit, i.e. the scheduling-related historical data generated during the execution of one scheduling scheme are grouped together;
Step 3, data conversion: using Spark, convert the d2 and d3 parts of the integrated data into training examples so that data mining algorithms can mine scheduling rules; this specifically includes:
Step 3.1: for the d2 part of the collected scheduling data set Dh, treat the machine m1 actually selected in a historical scheduling scheme as the most suitable machine, and compare it one by one with the other machines in the alternative machine set {m2, m3, ...} capable of processing this operation, forming training examples;
Step 3.2: for the d3 part of the collected scheduling data set Dh, treat the workpiece j1 actually selected in a historical scheduling scheme as the most suitable workpiece, and compare it one by one with the other workpieces in the waiting workpiece set {j2, j3, ...}, forming training examples;
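The pairwise construction of steps 3.1 and 3.2 can be sketched as follows. This is an illustrative sketch outside the claim text; the dictionary-based example representation is an assumption, not the patent's actual data schema.

```python
def build_training_examples(chosen, alternatives):
    """Pair the actually chosen machine/workpiece with every alternative,
    yielding labelled training examples (steps 3.1/3.2).
    Label 1 means the former candidate is suitable, label 2 the latter,
    matching the selection/decision convention used later in step 7."""
    examples = []
    for alt in alternatives:
        # the chosen object beats each alternative in both orderings
        examples.append({"former": chosen, "latter": alt, "label": 1})
        examples.append({"former": alt, "latter": chosen, "label": 2})
    return examples
```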
Step 4, data screening: evaluate historical scheduling schemes on three indices, makespan, total tardiness, and machine total load, and screen out the scheduling-related historical data set Db generated during the execution of well-performing scheduling schemes; this specifically includes:
Step 4.1: for the makespan index, take as the screening criterion the makespan of a scheduling scheme generated under identical conditions using only the SPT rule; using only the SPT rule means that each workpiece selects the machine that processes it fastest and each idle machine selects the workpiece with the shortest processing time; schemes whose makespan satisfies this criterion proceed to step 4.2, and the rest are eliminated;
Step 4.2: for the total tardiness index, take as the screening criterion the total tardiness of the same scheduling task completed under identical conditions using the EDD rule combined with the SPT rule; the SPT+EDD rule means that each workpiece selects the machine that processes it fastest and each idle machine selects the workpiece with the earliest due date; schemes whose total tardiness satisfies this criterion proceed to step 4.3, and the rest are eliminated;
Step 4.3: for the machine total load index, take as the screening criterion the machine total load of the scheduling task completed under identical conditions using the LMWT rule combined with the SPT rule; the LMWT+SPT rule means that each workpiece selects the machine with the longest idle time and each idle machine selects the workpiece with the shortest processing time; the scheduling data sets generated during the execution of schemes that satisfy all three indices become the input of the scheduling rule mining algorithm;
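The three-index screen of step 4 reduces to a simple filter once the rule-based baselines are known. The sketch below is illustrative only; it assumes the SPT, SPT+EDD, and LMWT+SPT baseline values have been pre-computed by simulating those dispatching rules under the same conditions, which the claim describes but does not spell out in code.

```python
def screen_schemes(schemes, spt_makespan, edd_spt_tardiness, lmwt_spt_load):
    """Step 4 sketch: keep only historical schemes that do at least as well as
    the rule-based baselines on all three indices (makespan, total tardiness,
    machine total load)."""
    return [s for s in schemes
            if s["makespan"] <= spt_makespan        # step 4.1: SPT baseline
            and s["tardiness"] <= edd_spt_tardiness  # step 4.2: SPT+EDD baseline
            and s["load"] <= lmwt_spt_load]          # step 4.3: LMWT+SPT baseline
```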
Step 5, clustering based on disturbance attributes: apply DBSCAN to Db with a scheduling scheme as the unit, i.e. the data generated by one scheduling scheme form one object, and cluster according to the system disturbance attributes at the time the scheme was formulated, i.e. the d1 part; this specifically includes:
Step 5.1: standardize the d1 partial data; if the data of a disturbance attribute across the schemes are X1, X2, X3, ..., Xn, they are transformed by formula (1):
Yi = (Xi − X̄) / S (1)
In formula (1), X̄ is the mean of the attribute, S is its standard deviation, and Y1, Y2, Y3, ..., Yn are the standardized data;
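The standardization of step 5.1 is a z-score transform; a minimal sketch (illustrative, outside the claim text):

```python
def standardize(xs):
    """Formula (1): z-score standardization of one disturbance attribute,
    Yi = (Xi - mean) / standard deviation."""
    n = len(xs)
    mean = sum(xs) / n
    # population standard deviation; the sample form (n - 1) could be used instead
    std = (sum((x - mean) ** 2 for x in xs) / n) ** 0.5
    return [(x - mean) / std for x in xs]
```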
Step 5.2: determine the parameters of the DBSCAN algorithm: the neighbourhood radius Eps, and the minimum number of objects MinPts that a core object's neighbourhood must contain;
Step 5.3: randomly find an unprocessed core object p (one not yet assigned to a cluster or labelled as noise); a core object is one whose neighbourhood of radius Eps contains at least MinPts objects; create a new cluster C and add all objects within the Eps neighbourhood of p to the candidate set N;
Step 5.4: randomly take a not-yet-processed object q from the candidate set N; if q is a core object, add to N every object within its Eps neighbourhood that is unprocessed and not already in N; if q has not been assigned to any cluster, add q to C;
Step 5.5: repeat step 5.4 until N is empty;
Step 5.6: repeat steps 5.3, 5.4, and 5.5; when no new object can be added to any cluster, the process ends;
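Steps 5.2 through 5.6 describe standard DBSCAN; a compact, self-contained sketch (illustrative only, with a Euclidean distance assumed) is:

```python
def dbscan(points, eps, min_pts):
    """Minimal DBSCAN per steps 5.2-5.6: grow a cluster from each unprocessed
    core object, expanding through core neighbours; objects reachable from no
    core object stay noise (label -1)."""
    dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    n = len(points)
    labels = [None] * n                      # None = unprocessed
    neighbours = lambda i: [j for j in range(n) if dist(points[i], points[j]) <= eps]
    cluster = 0
    for i in range(n):
        if labels[i] is not None:
            continue
        nbrs = neighbours(i)
        if len(nbrs) < min_pts:              # not a core object
            labels[i] = -1                   # tentatively noise
            continue
        labels[i] = cluster                  # step 5.3: start new cluster C
        candidates = [j for j in nbrs if j != i]
        while candidates:                    # steps 5.4-5.5: expand until N empty
            q = candidates.pop()
            if labels[q] == -1:
                labels[q] = cluster          # border point, reclaimed from noise
            if labels[q] is not None:
                continue
            labels[q] = cluster
            q_nbrs = neighbours(q)
            if len(q_nbrs) >= min_pts:       # q is core: push its neighbourhood
                candidates.extend(j for j in q_nbrs
                                  if labels[j] is None or labels[j] == -1)
        cluster += 1                         # step 5.6: continue with next core
    return labels
```

In the patent's setting, `points` would be the standardized d1 disturbance vectors, one per scheduling scheme.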
Step 6, random forest scheduling rule mining: use an improved random forest algorithm to mine from each cluster, respectively, random forest scheduling rule 1 for solving the workpiece-selects-machine problem and random forest scheduling rule 2 for solving the idle-machine-selects-workpiece problem; this specifically includes:
Step 6.1: for each cluster, draw training examples with replacement from the d2 part of the cluster (to mine random forest scheduling rule 1) and from the d3 part (to mine random forest scheduling rule 2), forming k new training example sets P1 and P2 respectively, used to construct decision trees;
Step 6.2: for P1 and P2, randomly select m characteristic attributes from d2 and d3 respectively, compute the best split, and train k decision trees T1 and T2;
The decision tree construction process is as follows:
Step 6.2.1: create a root node N;
Step 6.2.2: check whether the training example set still contains remaining training examples; if not, return node N; otherwise continue to the next step;
Step 6.2.3: check whether the scheduling decisions of the remaining training examples all belong to one class C; if so, return node N labelled with class C; otherwise continue to the next step;
Step 6.2.4: check whether the attribute list is empty; if it is, label the node with the majority class among the samples; otherwise continue to the next step;
Step 6.2.5: check whether an attribute in the attribute list is continuous; a continuous attribute is split by dichotomy at the point that maximizes the information gain G(D, A); the dichotomy divides the attribute's values into two parts, giving N−1 possible splits for N values, with each split threshold taken as the average of two adjacent values; the information gain is computed by formulas (2), (3), and (4):
G(D, A) = H(D) − H(D|A) (2)
H(D) = −Σk=1..K (|Ck|/|D|) log2(|Ck|/|D|) (3)
H(D|A) = Σi=1..n (|Di|/|D|) H(Di) (4)
In formula (2), G(D, A) is the information gain of attribute A; H(D) in formula (3) is the classification information entropy; H(D|A) in formula (4) is the conditional entropy. D denotes the training example data set and |D| the number of training examples in D; D has K classes Ck, k = 1, 2, and |Ck| is the number of training examples in class Ck; attribute A divides D into n subsets D1, D2, ..., Dn, with |Di| the number of training examples in Di; the set of training examples in Di belonging to class Ck is Dik, and |Dik| is its size;
Step 6.2.6: select the attribute with the maximum information gain ratio to label node N; the information gain ratio is computed by formulas (5) and (6); then return to step 6.2.2;
GR(D, A) = G(D, A) / H(A) (5)
H(A) = −Σi=1..n (|Di|/|D|) log2(|Di|/|D|) (6)
In formula (5), GR(D, A) is the information gain ratio and H(A) is the split information; the other symbols are as above;
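The entropy, information gain, and gain ratio of formulas (2) through (6) can be sketched for a discrete attribute as follows (illustrative only; continuous attributes would first be dichotomized as step 6.2.5 describes):

```python
from math import log2
from collections import Counter

def entropy(labels):
    """H(D), formula (3): classification information entropy."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def gain_ratio(values, labels):
    """Formulas (2)-(6): information gain G(D,A) = H(D) - H(D|A) and
    gain ratio GR(D,A) = G(D,A) / H(A), for a discrete attribute A given as
    the list `values`, aligned element-wise with the class `labels`."""
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    h_cond = sum(len(g) / n * entropy(g) for g in groups.values())  # H(D|A), formula (4)
    gain = entropy(labels) - h_cond                                 # G(D,A), formula (2)
    split_info = -sum((len(g) / n) * log2(len(g) / n)               # H(A), formula (6)
                      for g in groups.values())
    return gain / split_info if split_info > 0 else 0.0
```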
Step 6.3: use the training examples in d2 and d3 that were not selected during sampling to test the classification performance of the decision trees in T1 and T2;
Step 6.4: compute the similarity S between decision trees within T1 or T2 by formula (7); if the similarity between two decision trees exceeds 60%, compare their test performance from step 6.3 and retain the better-performing tree; the retained trees form the random forest;
S = Σn=1..K I(r1n.c, r2n.c) / Nt (7)
In formula (7), DT1 and DT2 are the two decision trees whose similarity is computed; K is the number of test cases on which DT1 and DT2 give identical classification results; r1n and r2n are the characteristic attributes used by DT1 and DT2 when their n-th classification results are identical, and c denotes the classification result; when r1n = r2n, i.e. DT1 and DT2 obtain the identical classification result using the identical characteristic attribute, I(r1n.c, r2n.c) = 1, otherwise it is 0; Nt is the number of test cases;
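Reading formula (7) as "fraction of test cases on which both trees agree in both result and attribute used" (an interpretation, since the original formula image is not reproduced here), the similarity check can be sketched as:

```python
def tree_similarity(results1, results2):
    """Formula (7) sketch: similarity between two decision trees over Nt test
    cases. Each result is a (features_used, classification) pair; a case counts
    toward similarity only when both trees give the identical classification
    using the identical characteristic attribute."""
    nt = len(results1)
    same = sum(1 for (f1, c1), (f2, c2) in zip(results1, results2)
               if c1 == c2 and f1 == f2)
    return same / nt
```

Per step 6.4, pairs scoring above 0.6 would be pruned down to the better-performing tree.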
Step 6.5: through the Bayesian voting mechanism, compute the weights w and h of every decision tree in T1 and T2 by formulas (8) and (9), obtaining random forest scheduling rule 1 and random forest scheduling rule 2;
In formulas (8) and (9), v is the number of test cases this decision tree classifies correctly, and m is the number of test cases it classifies incorrectly;
Step 7, scheduling rule application: guide flexible job shop dynamic scheduling with the mined random forest scheduling rules; this specifically includes:
Step 7.1: depending on whether the problem to be solved is machine selection for a workpiece or workpiece selection for an idle machine, retrieve the random forest scheduling rule 1 or random forest scheduling rule 2 that corresponds to the cluster to which the current disturbance state of the flexible job shop belongs;
Step 7.2: with the selected random forest scheduling rule, use pairwise comparison to pick the most suitable machine from the candidate machine set M or the most suitable workpiece from the candidate workpiece set J;
Step 7.2.1: for the machine selection problem, let m1 and m2 be two machines in M; using random forest scheduling rule 1 selected in step 7.1, compute the selection result of every decision tree in the forest; each result is either selection 1 (the former, m1, is suitable) or selection 2 (the latter, m2, is suitable); for the workpiece selection problem, let j1 and j2 be two workpieces in J; using random forest scheduling rule 2 selected in step 7.1, compute the selection result of every decision tree; each result is either decision 1 (the former, j1, is suitable) or decision 2 (the latter, j2, is suitable);
Step 7.2.2: through the Bayesian voting mechanism, obtain the weighted selection result WR of every decision tree by formula (10), and take the average AWR of the weighted results; if AWR is less than 1.5, the former (m1 or j1) is suitable; if AWR is greater than 1.5, the latter (m2 or j2) is suitable;
WR = wC + hR (10)
In formula (10), C is the classification result given by this decision tree and R is the mean of the classification results given by all decision trees; the weights w and h are computed by formulas (8) and (9).
CN201910144370.5A 2019-02-27 2019-02-27 Flexible job shop dynamic scheduling method based on industrial big data Active CN109902954B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910144370.5A CN109902954B (en) 2019-02-27 2019-02-27 Flexible job shop dynamic scheduling method based on industrial big data


Publications (2)

Publication Number Publication Date
CN109902954A true CN109902954A (en) 2019-06-18
CN109902954B CN109902954B (en) 2020-11-13

Family

ID=66945563

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910144370.5A Active CN109902954B (en) 2019-02-27 2019-02-27 Flexible job shop dynamic scheduling method based on industrial big data

Country Status (1)

Country Link
CN (1) CN109902954B (en)


Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104914835A (en) * 2015-05-22 2015-09-16 齐鲁工业大学 Flexible job-shop scheduling multi-objective method
CN106094757A (en) * 2016-07-15 2016-11-09 郑州航空工业管理学院 A kind of dynamic flexible solving job shop scheduling problem control method based on data-driven
CN106611232A (en) * 2016-02-04 2017-05-03 四川用联信息技术有限公司 Layered optimization algorithm for solving multi-technical-route workshop scheduling
CN107862411A (en) * 2017-11-09 2018-03-30 西南交通大学 A kind of extensive flexible job shop scheduling optimization method
CN108733003A (en) * 2017-04-20 2018-11-02 南京理工大学 Slewing parts process working hour prediction technique based on kmeans clustering algorithms and system


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
THOMAS BUSCHMANN ET AL: "Flexible and Robust Walking (Workshop on Dynamic Locomotion and Balancing of Humanoids: State of the Art and Challenges)", 2015 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION *
WANG Shuangxi et al.: "Flexible Job Shop Dynamic Scheduling under Different Rescheduling Periods", Wanfang Data *

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111047215B (en) * 2019-12-09 2023-06-23 中国兵器科学研究院 Method for determining classification of field replaceable units based on random forest
CN111047215A (en) * 2019-12-09 2020-04-21 中国兵器科学研究院 Random forest based field replaceable unit classification and classification determination method
CN111401427A (en) * 2020-03-12 2020-07-10 华中科技大学 Product cost evaluation method and system based on industrial big data
CN111401427B (en) * 2020-03-12 2022-11-08 华中科技大学 Product cost evaluation method and system based on industrial big data
WO2021189620A1 (en) * 2020-03-25 2021-09-30 重庆邮电大学 Digital workshop electric energy management research method based on context awareness
CN111766839A (en) * 2020-05-09 2020-10-13 同济大学 Computer implementation system for self-adaptive updating of intelligent workshop scheduling knowledge
CN111766839B (en) * 2020-05-09 2023-08-29 同济大学 Computer-implemented system for self-adaptive update of intelligent workshop scheduling knowledge
CN112712289A (en) * 2021-01-18 2021-04-27 上海交通大学 Adaptive method, system, and medium based on temporal information entropy
CN112712289B (en) * 2021-01-18 2022-11-22 上海交通大学 Adaptive method, system, and medium based on temporal information entropy
CN112904818A (en) * 2021-01-19 2021-06-04 东华大学 Prediction-reaction type scheduling method for complex structural member processing workshop
CN112904818B (en) * 2021-01-19 2022-07-15 东华大学 Prediction-reaction type scheduling method for complex structural member processing workshop
CN112883640A (en) * 2021-02-04 2021-06-01 西南交通大学 Digital twin station system, job scheduling method based on system and application
CN115357570A (en) * 2022-08-24 2022-11-18 安徽维德工业自动化有限公司 Workshop optimization scheduling management method based on random forest algorithm
CN116402173A (en) * 2022-09-06 2023-07-07 大连理工大学 Intelligent algorithm for distributing container areas and container positions of ship unloading container based on machine learning
CN117010671A (en) * 2023-10-07 2023-11-07 中国信息通信研究院 Distributed flexible workshop scheduling method and device based on block chain
CN117010671B (en) * 2023-10-07 2023-12-05 中国信息通信研究院 Distributed flexible workshop scheduling method and device based on block chain

Also Published As

Publication number Publication date
CN109902954B (en) 2020-11-13

Similar Documents

Publication Publication Date Title
CN109902954A (en) A kind of flexible job shop dynamic dispatching method based on industrial big data
KR102184182B1 (en) Project/Task Intelligent Goal Management Method and Platform based on Super Tree
Shen et al. Mathematical modeling and multi-objective evolutionary algorithms applied to dynamic flexible job shop scheduling problems
Guo et al. Optimisation of integrated process planning and scheduling using a particle swarm optimisation approach
US20210073695A1 (en) Production scheduling system and method
CN113792924A (en) Single-piece job shop scheduling method based on Deep reinforcement learning of Deep Q-network
CN103310285A (en) Performance prediction method applicable to dynamic scheduling for semiconductor production line
Li et al. Data-based scheduling framework and adaptive dispatching rule of complex manufacturing systems
He et al. Integrated scheduling of production and distribution operations in a global MTO supply chain
Pani et al. A data mining approach to forecast late arrivals in a transhipment container terminal
Zhao et al. An improved Q-learning based rescheduling method for flexible job-shops with machine failures
CN102402716A (en) Intelligent production decision support system
CN104572297A (en) Hadoop job scheduling method based on genetic algorithm
CN109039727A (en) Message queue monitoring method and device based on deep learning
CN106327053B (en) Construction method of weaving process recommendation model based on multi-mode set
Ajorlou et al. Optimization of a multiproduct conwip-based manufacturing system using artificial bee colony approach
CN111626497A (en) People flow prediction method, device, equipment and storage medium
Ishankhodjayev et al. Development of information support for decision-making in intelligent energy systems
CN107463151B (en) A kind of complex surface machining multidimensional knowledge cloud cooperating service method
Chan et al. The applications of flexible manufacturing technologies in business process reengineering
CN115689201A (en) Multi-criterion intelligent decision optimization method and system for enterprise resource supply and demand allocation
Shen et al. Blocking flow shop scheduling based on hybrid ant colony optimization
CN105260948B (en) A kind of water-supply systems daily planning scheduling decision method
Madureira et al. Using genetic algorithms for dynamic scheduling
Aksvonov et al. Development of a hybrid decision-making method based on a simulation-genetic algorithm in a web-oriented metallurgical enterprise information system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant