CN107977670A - Method, apparatus and system for classifying and grading emergency events based on decision tree and Bayesian algorithms - Google Patents

Method, apparatus and system for classifying and grading emergency events based on decision tree and Bayesian algorithms

Info

Publication number
CN107977670A
CN107977670A (application CN201710934709.2A)
Authority
CN
China
Prior art keywords
classification
attribute
event
algorithms
decision tree
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710934709.2A
Other languages
Chinese (zh)
Inventor
华婷婷
孙苑
王冉
陶卫峰
游庆根
龚少麟
林宇
童号
陶骏
徐斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CETC 28 Research Institute
Original Assignee
CETC 28 Research Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by CETC 28 Research Institute filed Critical CETC 28 Research Institute
Priority to CN201710934709.2A priority Critical patent/CN107977670A/en
Publication of CN107977670A publication Critical patent/CN107977670A/en
Pending legal-status Critical Current


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G06F18/24155 Bayesian classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/24323 Tree-organised classifiers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/254 Fusion techniques of classification results, e.g. of results related to same input data

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention relates to a method, apparatus and system for classifying and grading emergency events based on decision tree and Bayesian algorithms. The method includes: S1, performing feature partitioning on a library of pre-classified and pre-graded events to build a training sample set; S2, using the ID3, C4.5 and CART algorithms respectively on the training sample set to build three decision-tree classification and grading models; S3, building and training a Bayes classifier from the training sample set; S4, extracting key feature attributes from the event to be classified and graded; S5, classifying the event with the three decision-tree models according to its feature attributes to obtain three classification results; S6, using the Bayes classifier to compute, from the event's feature attributes, the probability of each of the three classification results from S5, and taking the result with the highest probability as the final classification result. The method improves on the classification accuracy of any single algorithm and effectively compensates for the difficulty decision-tree algorithms have in predicting continuous-valued fields.

Description

Method, apparatus and system for classifying and grading emergency events based on decision tree and Bayesian algorithms
Technical field
The present invention relates to the technical field of smart cities, and in particular to a method, apparatus and system that combine decision tree and Bayesian algorithms to classify and grade emergency events.
Background technology
In the field of public safety command and control, matching a response plan against historical plans is a key step in improving event-handling efficiency, and that matching depends on classifying and grading the event. At present, emergency events are usually classified in one of two ways, both at home and abroad. The first is purely manual judgment: based on historical events, the relevant core features are summarized by hand into an indicator system, and when a new event arrives its type and grade are judged manually against those indicators. The second is manual plus automatic judgment: the core features of emergency events are first summarized manually into an indicator system, and when a new event arrives a machine computes the type and grade of the event.
Existing methods for classifying emergency events all rely on traditional machine learning techniques, such as Bayesian networks, support vector machines (SVM) and fuzzy decision methods. However, these methods have a somewhat narrow range of application and relatively low accuracy, and cannot meet the current demand for classifying and grading emergency events.
Summary of the invention
In view of the defects in the prior art, the present invention provides a method, apparatus and system for classifying and grading emergency events based on decision tree and Bayesian algorithms, which effectively compensate for the shortcomings of decision-tree algorithms: difficulty in predicting continuous-valued fields, an increase in errors when there are too many classes, and poor performance on data whose features are strongly correlated.
One object of the present invention is to provide a method for classifying and grading emergency events based on decision tree and Bayesian algorithms, characterized by comprising:
S1, performing feature partitioning on a library of pre-classified and pre-graded events to build a training sample set;
S2, using the ID3, C4.5 and CART algorithms respectively on the training sample set to build three decision-tree classification and grading models;
S3, building and training a Bayes classifier from the training sample set;
S4, extracting key feature attributes from the event to be classified and graded;
S5, classifying the event with the three decision-tree models according to its feature attributes to obtain three classification results;
S6, using the Bayes classifier to compute, from the event's feature attributes, the probability of each of the three classification results from S5, and taking the result with the highest probability as the final classification result.
Preferably, the classification and grading of emergency events specifically includes:
dividing emergency events into four classes: natural disasters, accident disasters, public health events and social security events;
dividing emergency events into four grades, namely especially major, major, relatively major and general, according to factors including their nature, severity, controllability and scope of impact.
Preferably, building a decision-tree model with the ID3 algorithm in step S2 specifically includes:
calculating the information gain of each attribute of each event;
selecting the feature attribute with the maximum information gain as the final split point for branching.
Preferably, calculating the information gain of each attribute of the event to be classified specifically includes:
calculating the expected value of each attribute of each event;
calculating the expected information requirement of each attribute from the expected value;
calculating the information gain of each attribute from its expected information requirement.
Preferably, building a decision-tree classification and grading model with the C4.5 algorithm in step S2 specifically includes:
calculating the information gain of each attribute of each event;
calculating the information gain ratio of each attribute from the information gain;
selecting the feature attribute with the maximum information gain ratio as the split point for branching.
Preferably, building a decision-tree classification and grading model with the CART algorithm in step S2 specifically includes:
calculating the impurity of each attribute of each event;
calculating the GINI index of each branch from the attribute impurities;
choosing the feature attribute that minimizes the branch GINI index for branching, obtaining the CART decision-tree model.
Preferably, step S3 specifically includes:
building a Bayes classifier from the training sample set according to Bayes' theorem;
calculating, with the Bayes classifier, the conditional probability of each event feature attribute under each classification and grading result, so as to train the Bayes classifier.
Preferably, step S4 specifically includes:
extracting key feature attributes from the event to be classified using Chinese word segmentation.
Preferably, step S4 specifically includes:
extracting the key feature attributes of the event to be classified by applying word segmentation and keyword matching to the event, according to the feature-attribute partition of the events in the sample set.
In another aspect, the present invention provides an apparatus for classifying and grading emergency events based on decision tree and Bayesian algorithms, characterized by comprising:
a training sample set construction module, configured to perform feature partitioning on a library of pre-classified and pre-graded events and build a training sample set;
a decision-tree classification and grading model construction module, configured to build three decision-tree classification and grading models from the constructed training sample set using the ID3, C4.5 and CART algorithms respectively;
a classifier construction module, configured to build and train a Bayes classifier from the training sample set;
a feature extraction module, configured to extract key feature attributes from the event to be classified;
a classification module, configured to classify the event, according to its feature attributes, with the three decision-tree classification and grading models built by the model construction module, to obtain three classification results;
a classification result calculation module, configured to use the Bayes classifier to calculate, from the event's feature attributes, the probability of each of the three classification results of the classification module, and to take the result with the highest probability as the final classification result.
Preferably, the classification and grading of emergency events specifically includes:
dividing emergency events into four classes: natural disasters, accident disasters, public health events and social security events;
dividing emergency events into four grades, namely especially major, major, relatively major and general, according to factors including their nature, severity, controllability and scope of impact.
Preferably, the decision-tree classification and grading model construction module specifically includes:
an ID3 algorithm construction unit, configured to build a classification and grading model with the ID3 algorithm;
a C4.5 algorithm construction unit, configured to build a classification and grading model with the C4.5 algorithm;
a CART algorithm construction unit, configured to build a classification and grading model with the CART algorithm.
Preferably, the ID3 algorithm construction unit specifically includes:
an information gain calculation unit, configured to calculate the information gain of each attribute of each event;
a branch division unit, configured to select the feature attribute with the maximum information gain as the final split point for branching.
Preferably, the information gain calculation unit specifically includes:
an expected value calculation subunit, configured to calculate the expected value of each attribute of each event;
an expected information requirement subunit, configured to calculate the expected information requirement of each attribute from the expected value;
an information gain calculation subunit, configured to calculate the information gain of each attribute from its expected information requirement.
Preferably, the C4.5 algorithm construction unit specifically includes:
a second information gain calculation subunit, configured to calculate the information gain of each attribute of each event;
an information gain ratio calculation subunit, configured to calculate the information gain ratio of each attribute from the information gain;
a second branch division subunit, configured to select the feature attribute with the maximum information gain ratio as the split point for branching.
Preferably, the CART algorithm construction unit specifically includes:
an impurity calculation subunit, configured to calculate the impurity of each attribute of each event;
a GINI index calculation subunit, configured to calculate the GINI index of each branch from the attribute impurities;
a CART decision-tree model construction subunit, configured to choose the feature attribute that minimizes the branch GINI index for branching, obtaining the CART decision-tree model.
Preferably, the classifier construction module specifically includes:
a classifier construction unit, configured to build a Bayes classifier from the training sample set according to Bayes' theorem;
a classifier training unit, configured to calculate, with the Bayes classifier, the conditional probability of each event feature attribute under each classification and grading result, so as to train the Bayes classifier.
Preferably, the feature extraction module specifically extracts key feature attributes from the event to be classified using Chinese word segmentation.
Preferably, the feature extraction module specifically extracts the key feature attributes of the event to be classified by applying word segmentation and keyword matching to the event, according to the feature-attribute partition of the events in the sample set.
In a further aspect, the present invention provides a system for classifying and grading emergency events based on decision tree and Bayesian algorithms, characterized by comprising the above-described apparatus for classifying and grading emergency events based on decision tree and Bayesian algorithms.
The method, apparatus and system for classifying and grading emergency events based on decision tree and Bayesian algorithms of the present invention improve on the classification accuracy of any single algorithm and effectively compensate for the shortcomings of decision-tree algorithms: difficulty in predicting continuous-valued fields, an increase in errors when there are too many classes, and poor performance on data whose features are strongly correlated.
Brief description of the drawings
Fig. 1 shows a flow chart of the method for classifying and grading emergency events based on decision tree and Bayesian algorithms of the present invention;
Fig. 2 shows a structural block diagram of the apparatus for classifying and grading emergency events based on decision tree and Bayesian algorithms of the present invention.
Detailed description of the embodiments
In order to make the objectives, technical solutions and advantages of the present invention clearer, the present invention is further described below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are only intended to explain the present invention, not to limit it.
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings. The suffixes "module" and "unit" for elements are used herein merely for convenience of description; they may be used interchangeably and carry no distinguishing meaning or function.
Although all of the elements or units constituting an embodiment of the present invention are described as being coupled into, or operated as, a single element or unit, the present invention is not necessarily limited to such an embodiment. Depending on the embodiment, one or more of the elements may be selectively combined and operated as one or more elements within the scope of the purpose of the present invention.
In one embodiment of the present invention, as shown in Fig. 1, a method for classifying and grading emergency events based on decision tree and Bayesian algorithms is provided, including:
S1, performing feature partitioning on a library of pre-classified and pre-graded events to build a training sample set.
In this embodiment, suppose the training sample set S contains n events. Each emergency event is classified into one of 4 major classes (natural disasters, accident disasters, public health events and social security events) and 22 subclasses, and is graded into one of 4 grades according to factors such as its nature, severity, controllability and scope of impact, namely Grade I (especially major), Grade II (major), Grade III (relatively major) and Grade IV (general). The final classification and grading result of each event is denoted r, and each event is assumed to contain m feature attributes t, so that each event is described by its classification and grading result r and its m feature attributes t. The sample set S is as follows:
$S = \{t_{11}, t_{12}, \ldots, t_{1m}, r_1;\; \ldots;\; t_{i1}, t_{i2}, \ldots, t_{im}, r_i;\; \ldots;\; t_{n1}, t_{n2}, \ldots, t_{nm}, r_n\}$.
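Purely as an illustration (not part of the patent text), the sample set S described above can be held as a list of (feature attributes, grade) pairs; all attribute values, grade labels and field names below are hypothetical.

```python
# A minimal sketch of the training sample set S: each event is an m-tuple of
# feature-attribute values t_i1..t_im together with its classification/grading
# label r_i. All values and labels here are invented for illustration.
training_samples = [
    # ((feature attribute values ...),                     grade label r)
    (("natural disaster", "city-wide", "uncontrollable"), "I"),
    (("accident disaster", "district", "controllable"),   "III"),
    (("public health",     "province", "uncontrollable"), "II"),
    (("social security",   "street",   "controllable"),   "IV"),
]

def split_samples(samples):
    """Separate the feature-attribute tuples from the grade labels."""
    features = [attrs for attrs, _ in samples]
    labels = [grade for _, grade in samples]
    return features, labels

X, y = split_samples(training_samples)
```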
S2, using the ID3, C4.5 and CART algorithms respectively on the training sample set to build three decision-tree classification and grading models.
(1) Training the ID3 decision-tree classification and grading model. Before each non-leaf node of the decision tree is split, the information gain produced by each attribute is calculated, and the feature attribute with the maximum information gain is selected as the final split point for branching; after one non-leaf node has been split, the next node is split in the same way, finally yielding the ID3 decision-tree model. The ID3 algorithm obtains the information gain in three steps, the flow of which is shown in Fig. 2. First, the expected information of the classification and grading results of D is calculated as $\mathrm{info}(D) = -\sum_i p_i \log_2 p_i$, where $p_i$ is the probability that the i-th grade occurs in the whole sample set. Then, the expected information requirement of a feature attribute is calculated: suppose the sample set is partitioned according to feature attribute $t_i$; then the expected information requirement of $t_i$ is $\mathrm{info}_{t_i}(D) = \sum_{v} \frac{|D_v|}{|D|}\,\mathrm{info}(D_v)$, where $v$ ranges over the event feature attribute values $t_{i1}$ to $t_{im}$. Finally, the information gain obtained by partitioning on feature attribute $t_i$ is $\mathrm{gain}(t_i) = \mathrm{info}(D) - \mathrm{info}_{t_i}(D)$.
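As a concrete illustration of the three ID3 steps just described (expected information, expected information requirement, information gain), the following Python sketch implements the standard formulas over the hypothetical X, y sample set from the earlier sketch; it is a reading aid, not code from the patent.

```python
import math
from collections import Counter

def expected_info(labels):
    """info(D) = -sum_i p_i * log2(p_i): expected information of the grade labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def info_gain(features, labels, attr_index):
    """gain(t_i) = info(D) - sum_v |D_v|/|D| * info(D_v) for one feature attribute."""
    total = len(labels)
    partitions = {}
    for attrs, label in zip(features, labels):
        partitions.setdefault(attrs[attr_index], []).append(label)
    expected_requirement = sum(
        (len(part) / total) * expected_info(part) for part in partitions.values()
    )
    return expected_info(labels) - expected_requirement

def best_split_id3(features, labels):
    """ID3 split rule: choose the attribute with the maximum information gain."""
    return max(range(len(features[0])), key=lambda i: info_gain(features, labels, i))

print(best_split_id3(X, y))  # index of the attribute ID3 would split on first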
(2) Training the C4.5 decision-tree classification and grading model. When a non-leaf node of the decision tree is split, on the basis of the information gain calculated as in the ID3 algorithm, the information gain ratio of feature attribute $t_i$ is calculated as $\mathrm{gain\_ratio}(t_i) = \frac{\mathrm{gain}(t_i)}{\mathrm{split\_info}(t_i)}$, where $\mathrm{split\_info}(t_i)$ is the split information, $\mathrm{split\_info}(t_i) = -\sum_{v} \frac{|D_v|}{|D|} \log_2 \frac{|D_v|}{|D|}$, and $v$ ranges over the event feature attribute values $t_{i1}$ to $t_{im}$. According to the result of the gain-ratio formula, the feature attribute with the maximum information gain ratio is chosen as the node for each decision-tree branch split, finally yielding the C4.5 decision-tree model.
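Building on the previous sketch (which already imports math and Counter and defines info_gain), the C4.5 criterion divides the information gain by the split information; again this is an illustrative rendering of the standard formulas rather than the patent's own code.

```python
def split_info(features, attr_index):
    """split_info(t_i) = -sum_v |D_v|/|D| * log2(|D_v|/|D|)."""
    total = len(features)
    counts = Counter(attrs[attr_index] for attrs in features)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def gain_ratio(features, labels, attr_index):
    """C4.5 criterion: information gain divided by split information."""
    si = split_info(features, attr_index)
    return 0.0 if si == 0 else info_gain(features, labels, attr_index) / si

def best_split_c45(features, labels):
    """C4.5 split rule: choose the attribute with the maximum information gain ratio."""
    return max(range(len(features[0])), key=lambda i: gain_ratio(features, labels, i))
```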
(3) Training the CART decision-tree classification and grading model. When the decision tree is split, the impurity is calculated as $\mathrm{gini}(D) = 1 - \sum_{k} p_k^2$, where $p_k$ is the proportion of events in the branch that belong to grade $k$. If the sample set is partitioned into branches by feature attribute $t_i$, the GINI index of the branches is $\mathrm{gini}_{t_i}(D) = \sum_{v} \frac{|D_v|}{|D|}\,\mathrm{gini}(D_v)$. According to the GINI index formula, the feature attribute that minimizes the branch GINI index is chosen at each split for branching, yielding the CART decision-tree model.
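The CART step can be sketched in the same style, using the multiway partition by attribute value that the text describes (classical CART uses binary splits; the multiway form below simply mirrors the formulas above and is an assumption made for illustration). It continues the same sketch and reuses Counter from the earlier block.

```python
def gini(labels):
    """gini(D) = 1 - sum_k p_k^2: impurity of one branch."""
    total = len(labels)
    return 1.0 - sum((c / total) ** 2 for c in Counter(labels).values())

def gini_index(features, labels, attr_index):
    """GINI index of splitting on one attribute: sum_v |D_v|/|D| * gini(D_v)."""
    total = len(labels)
    partitions = {}
    for attrs, label in zip(features, labels):
        partitions.setdefault(attrs[attr_index], []).append(label)
    return sum((len(part) / total) * gini(part) for part in partitions.values())

def best_split_cart(features, labels):
    """CART split rule: choose the attribute with the minimum GINI index."""
    return min(range(len(features[0])), key=lambda i: gini_index(features, labels, i))
```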
S3, building and training a Bayes classifier from the training sample set.
According to Bayes' theorem, the conditional probability that an event to be classified, whose feature attribute values are $T = (t_1, t_2, \ldots, t_m)$, belongs to a given grade $r_i$ is $P(r_i \mid T) = \frac{P(r_i) \prod_{j=1}^{m} P(t_j \mid r_i)}{P(T)}$, where $P(t_j \mid r_i)$ is the probability that an event has feature attribute value $t_j$ given that it belongs to grade $r_i$. The Bayes classifier computes the values $P(t_j \mid r_i)$ and stores them in the classifier, so that probabilities can later be calculated for events to be classified.
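The training of the Bayes classifier, i.e. estimating the prior P(r_i) and the conditional probabilities P(t_j | r_i) from the sample set, can be sketched as follows; Laplace smoothing is an implementation choice added here for robustness and is not stated in the patent.

```python
from collections import Counter, defaultdict

def train_naive_bayes(features, labels):
    """Estimate P(r_i) and a function giving P(t_j | r_i) from the training set."""
    total = len(labels)
    priors = {grade: count / total for grade, count in Counter(labels).items()}
    value_counts = defaultdict(Counter)      # (attr_index, grade) -> value counts
    for attrs, grade in zip(features, labels):
        for j, value in enumerate(attrs):
            value_counts[(j, grade)][value] += 1

    def cond_prob(attr_index, value, grade):
        """Laplace-smoothed estimate of P(t_j = value | r_i = grade)."""
        counts = value_counts[(attr_index, grade)]
        n_values = len({attrs[attr_index] for attrs in features})
        return (counts[value] + 1) / (sum(counts.values()) + n_values)

    return priors, cond_prob

def grade_probability(priors, cond_prob, event_attrs, grade):
    """P(r_i | T) up to the constant factor 1/P(T): P(r_i) * prod_j P(t_j | r_i)."""
    p = priors.get(grade, 0.0)
    for j, value in enumerate(event_attrs):
        p *= cond_prob(j, value, grade)
    return p
```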
S4, extracting key feature attributes from the event to be classified. According to the feature-attribute partition of the events in the sample set, word segmentation and keyword matching are applied to the event description to extract the feature attribute values of the event to be classified.
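Step S4 can be sketched as Chinese word segmentation followed by keyword matching against the attribute partition of the sample set. The patent does not name a segmentation tool; the jieba package and the keyword table below are assumptions made purely for illustration.

```python
import jieba  # third-party Chinese word segmentation package, assumed here

# Hypothetical mapping from keywords in the event description to attribute values.
ATTRIBUTE_KEYWORDS = {
    "event_class": {"地震": "natural disaster", "爆炸": "accident disaster"},
    "scope":       {"全市": "city-wide", "街道": "street"},
}

def extract_feature_attributes(event_description):
    """Segment the event description and map matched keywords to attribute values."""
    tokens = set(jieba.cut(event_description))
    extracted = {}
    for attribute, keyword_map in ATTRIBUTE_KEYWORDS.items():
        for keyword, value in keyword_map.items():
            if keyword in tokens:
                extracted[attribute] = value
                break
    return extracted
```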
S5, classifying the event with the three decision-tree models according to its feature attributes to obtain three classification results.
In this embodiment, the event to be classified is classified with each of the three decision-tree classification and grading models (ID3, C4.5 and CART), giving three classification results. The specific classification steps for a decision-tree classification and grading model are: using the model, start from the event feature attribute at the root node, test the feature attribute extracted from the event to be classified, select the outgoing branch according to its value, and test each node in turn until a leaf node is reached; the value of that leaf node is the event's classification and grading result.
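The traversal just described (test the root's feature attribute, follow the branch matching the event's value, repeat until a leaf) can be sketched as below; the nested-tuple tree representation and the example tree are assumptions for illustration only.

```python
def classify_with_tree(node, event_attrs):
    """Walk a decision tree from the root down to a leaf.

    A leaf is a grade string such as "II"; an internal node is a pair
    (attr_index, {attribute value: child node, ...}).
    """
    while not isinstance(node, str):          # descend until a leaf is reached
        attr_index, branches = node
        node = branches.get(event_attrs[attr_index])
        if node is None:                      # value unseen during training
            return None
    return node                               # the leaf value is the grade

# Hypothetical two-level tree and event:
tree = (0, {"natural disaster": (1, {"city-wide": "I", "district": "II"}),
            "social security": "IV"})
print(classify_with_tree(tree, ("natural disaster", "city-wide", "uncontrollable")))  # I
```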
S6, using the Bayes classifier to compute, from the event's feature attributes, the probability of each of the three classification results from S5, and taking the result with the highest probability as the final classification result.
In the above embodiment, if the results obtained by the three decision-tree classification and grading models are identical, this step is skipped and the final classification and grading result is obtained directly. If the three models yield more than one distinct classification and grading result, each candidate result is checked with the Bayes classifier: the conditional probability of the feature attributes of the event to be classified under that classification result is calculated, and the result with the highest probability is taken as the final classification and grading result.
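Putting steps S5 and S6 together, a minimal sketch of the fusion logic described above (skip the Bayes check when the three trees agree, otherwise keep the candidate with the highest Bayes probability) could look like this, reusing the helper functions from the earlier sketches; it is illustrative only.

```python
def final_grade(trees, priors, cond_prob, event_attrs):
    """Classify with the three decision trees, then resolve disagreement with Bayes."""
    results = [classify_with_tree(tree, event_attrs) for tree in trees]
    candidates = {r for r in results if r is not None}
    if not candidates:                        # no tree produced a result
        return None
    if len(candidates) == 1:                  # the three models agree: done
        return candidates.pop()
    # Otherwise take the candidate grade with the highest Bayes probability.
    return max(candidates,
               key=lambda g: grade_probability(priors, cond_prob, event_attrs, g))
```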
In another embodiment of the present invention, as shown in Fig. 2, an apparatus for classifying and grading emergency events based on decision tree and Bayesian algorithms is provided, which specifically includes:
a training sample set construction module 10, configured to perform feature partitioning on a library of pre-classified and pre-graded events and build a training sample set;
a decision-tree classification and grading model construction module 20, configured to build three decision-tree classification and grading models from the constructed training sample set using the ID3, C4.5 and CART algorithms respectively;
a classifier construction module 30, configured to build and train a Bayes classifier from the training sample set;
a feature extraction module 40, configured to extract key feature attributes from the event to be classified;
a classification module 50, configured to classify the event, according to its feature attributes, with the three decision-tree classification and grading models built by the model construction module, to obtain three classification results;
a classification result calculation module 60, configured to use the Bayes classifier to calculate, from the event's feature attributes, the probability of each of the three classification results of the classification module, and to take the result with the highest probability as the final classification result.
In yet another embodiment of the present invention, a system for classifying and grading emergency events based on decision tree and Bayesian algorithms is provided, which includes the above-described apparatus for classifying and grading emergency events based on decision tree and Bayesian algorithms.
It should be appreciated that the functional units or capabilities described in this specification may be referred to or labelled as components, modules or systems, more particularly in order to emphasize their implementation independence. For example, a component, module or system may be implemented as a hardware circuit comprising custom very-large-scale integration (VLSI) circuits or gate arrays, off-the-shelf semiconductors such as logic chips or transistors, or other discrete components. A component or module may also be implemented in programmable hardware devices such as field programmable gate arrays, programmable array logic, programmable logic devices and the like. Components or modules may also be implemented in software for execution by various types of processors. An identified component or module of executable code may, for example, comprise one or more physical or logical blocks of computer instructions, which may be organized as an object, procedure or function. Nevertheless, the identified component or module need not be physically located together, but may comprise disparate instructions stored in different locations which, when joined together logically, constitute the component or module and achieve its stated purpose.
It should be appreciated by those skilled in the art that the effects achievable by the present invention are not limited to what has been specifically described above, and further advantages of the present invention will be understood more clearly from the detailed description above.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present invention without departing from its spirit or scope. Thus, provided such modifications and variations fall within the scope of the appended claims and their equivalents, the present invention is intended to cover them.

Claims (20)

  1. A method for classifying and grading emergency events based on decision tree and Bayesian algorithms, characterized by comprising:
    S1, performing feature partitioning on a library of pre-classified and pre-graded events to build a training sample set;
    S2, using the ID3, C4.5 and CART algorithms respectively on the training sample set to build three decision-tree classification and grading models;
    S3, building and training a Bayes classifier from the training sample set;
    S4, extracting key feature attributes from the event to be classified and graded;
    S5, classifying the event with the three decision-tree models according to its feature attributes to obtain three classification results;
    S6, using the Bayes classifier to compute, from the event's feature attributes, the probability of each of the three classification results from S5, and taking the result with the highest probability as the final classification result.
  2. The method according to claim 1, characterized in that the classification and grading of emergency events specifically includes:
    dividing emergency events into four classes: natural disasters, accident disasters, public health events and social security events;
    dividing emergency events into four grades, namely especially major, major, relatively major and general, according to factors including their nature, severity, controllability and scope of impact.
  3. The method according to claim 1, characterized in that building a decision-tree model with the ID3 algorithm in step S2 specifically includes:
    calculating the information gain of each attribute of each event;
    selecting the feature attribute with the maximum information gain as the final split point for branching.
  4. The method according to claim 3, characterized in that calculating the information gain of each attribute of the event to be classified specifically includes:
    calculating the expected value of each attribute of each event;
    calculating the expected information requirement of each attribute from the expected value;
    calculating the information gain of each attribute from its expected information requirement.
  5. The method according to claim 1, characterized in that building a decision-tree classification and grading model with the C4.5 algorithm in step S2 specifically includes:
    calculating the information gain of each attribute of each event;
    calculating the information gain ratio of each attribute from the information gain;
    selecting the feature attribute with the maximum information gain ratio as the split point for branching.
  6. The method according to claim 1, characterized in that building a decision-tree classification and grading model with the CART algorithm in step S2 specifically includes:
    calculating the impurity of each attribute of each event;
    calculating the GINI index of each branch from the attribute impurities;
    choosing the feature attribute that minimizes the branch GINI index for branching, obtaining the CART decision-tree model.
  7. The method according to claim 1, characterized in that step S3 specifically includes:
    building a Bayes classifier from the training sample set according to Bayes' theorem;
    calculating, with the Bayes classifier, the conditional probability of each event feature attribute under each classification and grading result, so as to train the Bayes classifier.
  8. The method according to claim 1, characterized in that step S4 specifically includes:
    extracting key feature attributes from the event to be classified using Chinese word segmentation.
  9. The method according to claim 1, characterized in that step S4 specifically includes:
    extracting the key feature attributes of the event to be classified by applying word segmentation and keyword matching to the event, according to the feature-attribute partition of the events in the sample set.
  10. An apparatus for classifying and grading emergency events based on decision tree and Bayesian algorithms, characterized by comprising:
    a training sample set construction module, configured to perform feature partitioning on a library of pre-classified and pre-graded events and build a training sample set;
    a decision-tree classification and grading model construction module, configured to build three decision-tree classification and grading models from the constructed training sample set using the ID3, C4.5 and CART algorithms respectively;
    a classifier construction module, configured to build and train a Bayes classifier from the training sample set;
    a feature extraction module, configured to extract key feature attributes from the event to be classified;
    a classification module, configured to classify the event, according to its feature attributes, with the three decision-tree classification and grading models built by the model construction module, to obtain three classification results;
    a classification result calculation module, configured to use the Bayes classifier to calculate, from the event's feature attributes, the probability of each of the three classification results of the classification module, and to take the result with the highest probability as the final classification result.
  11. The apparatus according to claim 10, characterized in that the classification and grading of emergency events specifically includes:
    dividing emergency events into four classes: natural disasters, accident disasters, public health events and social security events;
    dividing emergency events into four grades, namely especially major, major, relatively major and general, according to factors including their nature, severity, controllability and scope of impact.
  12. The apparatus according to claim 10, characterized in that the decision-tree classification and grading model construction module specifically includes:
    an ID3 algorithm construction unit, configured to build a classification and grading model with the ID3 algorithm;
    a C4.5 algorithm construction unit, configured to build a classification and grading model with the C4.5 algorithm;
    a CART algorithm construction unit, configured to build a classification and grading model with the CART algorithm.
  13. The apparatus according to claim 12, characterized in that the ID3 algorithm construction unit specifically includes:
    an information gain calculation unit, configured to calculate the information gain of each attribute of each event;
    a branch division unit, configured to select the feature attribute with the maximum information gain as the final split point for branching.
  14. The apparatus according to claim 13, characterized in that the information gain calculation unit specifically includes:
    an expected value calculation subunit, configured to calculate the expected value of each attribute of each event;
    an expected information requirement subunit, configured to calculate the expected information requirement of each attribute from the expected value;
    an information gain calculation subunit, configured to calculate the information gain of each attribute from its expected information requirement.
  15. The apparatus according to claim 12, characterized in that the C4.5 algorithm construction unit specifically includes:
    a second information gain calculation subunit, configured to calculate the information gain of each attribute of each event;
    an information gain ratio calculation subunit, configured to calculate the information gain ratio of each attribute from the information gain;
    a second branch division subunit, configured to select the feature attribute with the maximum information gain ratio as the split point for branching.
  16. The apparatus according to claim 12, characterized in that the CART algorithm construction unit specifically includes:
    an impurity calculation subunit, configured to calculate the impurity of each attribute of each event;
    a GINI index calculation subunit, configured to calculate the GINI index of each branch from the attribute impurities;
    a CART decision-tree model construction subunit, configured to choose the feature attribute that minimizes the branch GINI index for branching, obtaining the CART decision-tree model.
  17. The apparatus according to claim 10, characterized in that the classifier construction module specifically includes:
    a classifier construction unit, configured to build a Bayes classifier from the training sample set according to Bayes' theorem;
    a classifier training unit, configured to calculate, with the Bayes classifier, the conditional probability of each event feature attribute under each classification and grading result, so as to train the Bayes classifier.
  18. The apparatus according to claim 10, characterized in that the feature extraction module specifically extracts key feature attributes from the event to be classified using Chinese word segmentation.
  19. The apparatus according to claim 18, characterized in that the feature extraction module specifically extracts the key feature attributes of the event to be classified by applying word segmentation and keyword matching to the event, according to the feature-attribute partition of the events in the sample set.
  20. A system for classifying and grading emergency events based on decision tree and Bayesian algorithms, characterized by comprising the apparatus for classifying and grading emergency events based on decision tree and Bayesian algorithms according to any one of claims 10 to 19.
CN201710934709.2A 2017-10-09 2017-10-09 Method, apparatus and system for classifying and grading emergency events based on decision tree and Bayesian algorithms Pending CN107977670A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710934709.2A CN107977670A (en) 2017-10-09 2017-10-09 Method, apparatus and system for classifying and grading emergency events based on decision tree and Bayesian algorithms

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710934709.2A CN107977670A (en) 2017-10-09 2017-10-09 Accident classification stage division, the apparatus and system of decision tree and bayesian algorithm

Publications (1)

Publication Number Publication Date
CN107977670A true CN107977670A (en) 2018-05-01

Family

ID=62012376

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710934709.2A Pending CN107977670A (en) 2017-10-09 2017-10-09 Accident classification stage division, the apparatus and system of decision tree and bayesian algorithm

Country Status (1)

Country Link
CN (1) CN107977670A (en)



Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101414300A (en) * 2008-11-28 2009-04-22 电子科技大学 Method for sorting and processing internet public feelings information
CN106202561A (en) * 2016-07-29 2016-12-07 北京联创众升科技有限公司 Digitized contingency management case library construction methods based on the big data of text and device

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
吴坚 et al.: "Research on a classification method for online public opinion text information based on the random forest algorithm", Netinfo Security *
杨云 et al.: "Application of the ID3 decision tree algorithm to risk assessment of public health emergencies", Chinese Preventive Medicine Journal *
杨德礼 et al.: "Management Theory and Methods in the E-commerce Environment", 31 December 2004, Dalian University of Technology Press *
薛云霞: "Research on methods for identifying microblog user attributes", China Masters' Theses Full-text Database, Information Science and Technology Series *
韦鹏程 et al.: "Integration and Development of Big Data Analytics and Machine Learning", 31 May 2017, University of Electronic Science and Technology of China Press *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109635254A (en) * 2018-12-03 2019-04-16 重庆大学 Paper duplicate checking method based on naive Bayesian, decision tree and SVM mixed model
CN110942098A (en) * 2019-11-28 2020-03-31 江苏电力信息技术有限公司 Power supply service quality analysis method based on Bayesian pruning decision tree
CN112819069A (en) * 2021-01-29 2021-05-18 中国农业银行股份有限公司 Event grading method and device
CN113232674A (en) * 2021-05-28 2021-08-10 南京航空航天大学 Vehicle control method and device based on decision tree and Bayesian network
CN113722509A (en) * 2021-09-07 2021-11-30 中国人民解放军32801部队 Knowledge graph data fusion method based on entity attribute similarity
CN113722509B (en) * 2021-09-07 2022-03-01 中国人民解放军32801部队 Knowledge graph data fusion method based on entity attribute similarity
CN115829061A (en) * 2023-02-21 2023-03-21 中国电子科技集团公司第二十八研究所 Emergency accident disposal method based on historical case and empirical knowledge learning
CN115829061B (en) * 2023-02-21 2023-04-28 中国电子科技集团公司第二十八研究所 Emergency accident handling method based on historical case and experience knowledge learning

Similar Documents

Publication Publication Date Title
CN107977670A (en) Method, apparatus and system for classifying and grading emergency events based on decision tree and Bayesian algorithms
CN111079639B (en) Method, device, equipment and storage medium for constructing garbage image classification model
CN106815369B (en) A kind of file classification method based on Xgboost sorting algorithm
CN107944480A (en) A kind of enterprises ' industry sorting technique
CN107644057B (en) Absolute imbalance text classification method based on transfer learning
CN103927302B (en) A kind of file classification method and system
WO2019179403A1 (en) Fraud transaction detection method based on sequence width depth learning
CN102289522B (en) Method of intelligently classifying texts
CN108388651A (en) A kind of file classification method based on the kernel of graph and convolutional neural networks
CN109871885B (en) Plant identification method based on deep learning and plant taxonomy
CN113326377B (en) Name disambiguation method and system based on enterprise association relationship
CN108460421A (en) The sorting technique of unbalanced data
CN106919951A (en) A kind of Weakly supervised bilinearity deep learning method merged with vision based on click
CN107392241A (en) A kind of image object sorting technique that sampling XGBoost is arranged based on weighting
CN101876987A (en) Overlapped-between-clusters-oriented method for classifying two types of texts
CN108710894A (en) A kind of Active Learning mask method and device based on cluster representative point
CN102968419B (en) Disambiguation method for interactive Internet entity name
CN111754345A (en) Bit currency address classification method based on improved random forest
CN103886030B (en) Cost-sensitive decision-making tree based physical information fusion system data classification method
WO2023019698A1 (en) Hyperspectral image classification method based on rich context network
CN102750286A (en) Novel decision tree classifier method for processing missing data
CN113051914A (en) Enterprise hidden label extraction method and device based on multi-feature dynamic portrait
CN108920446A (en) A kind of processing method of Engineering document
CN111125396B (en) Image retrieval method of single-model multi-branch structure
Parvathi et al. Identifying relevant text from text document using deep learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20180501