CN107977670A - Emergency event classification and grading method, apparatus and system based on decision tree and Bayesian algorithms - Google Patents
Emergency event classification and grading method, apparatus and system based on decision tree and Bayesian algorithms
- Publication number
- CN107977670A CN107977670A CN201710934709.2A CN201710934709A CN107977670A CN 107977670 A CN107977670 A CN 107977670A CN 201710934709 A CN201710934709 A CN 201710934709A CN 107977670 A CN107977670 A CN 107977670A
- Authority
- CN
- China
- Prior art keywords
- classification
- attribute
- event
- algorithms
- decision tree
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2415—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
- G06F18/24155—Bayesian classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/243—Classification techniques relating to the number of classes
- G06F18/24323—Tree-organised classifiers
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/254—Fusion techniques of classification results, e.g. of results related to same input data
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Probability & Statistics with Applications (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The present invention relates to an emergency event classification and grading method, apparatus and system based on decision tree algorithms and a Bayesian algorithm. The method includes: S1, performing feature division on a pre-classified and pre-graded event library to build a training sample set; S2, using the ID3, C4.5 and CART algorithms respectively on the training sample set to build three decision-tree classification and grading models; S3, building and training a Bayesian classifier from the training sample set; S4, extracting key feature attributes from the event to be classified and graded; S5, classifying the event with the three decision-tree models according to its feature attributes, yielding three classification results; S6, using the Bayesian classifier to compute, from the event's feature attributes, the probability of each of the three classification results from S5, and taking the result with the highest probability as the final classification result. The method of the invention can improve the classification accuracy of a single algorithm and effectively compensates for the difficulty decision-tree algorithms have in predicting continuous fields.
Description
Technical field
The present invention relates to the technical field of smart cities, and in particular to an emergency event classification and grading method, apparatus and system combining decision tree algorithms with a Bayesian algorithm.
Background technology
In the field of public safety command and control, matching a response plan against historical plans is a key step in improving event-handling efficiency. This matching depends on the classification and grading of events. At present, emergency events are usually classified in one of two ways, both at home and abroad. The first is purely manual judgment: based on historical events, the relevant core features are summarized by hand to form an indicator system, and when a new emergency event arrives its type and grade are judged manually against those indicators. The second is manual plus automatic judgment: the core features of emergency events are first summarized manually to form an indicator system, and when a new emergency event arrives, a machine computes the event's type and grade.
Existing methods for classifying emergency events all rely on traditional machine-learning techniques such as Bayesian networks, support vector machines (SVM) and fuzzy decision methods. However, these methods have a somewhat narrow range of application and relatively low accuracy, and cannot meet the demands of present-day emergency event classification and grading.
The content of the invention
In view of the defects of the prior art, the present invention provides an emergency event classification and grading method, apparatus and system based on decision tree algorithms and a Bayesian algorithm, which effectively compensates for the shortcomings of decision-tree algorithms: difficulty in predicting continuous fields, increasing error when there are too many classes, and poor performance on data with strongly correlated features.
One object of the present invention is to provide an emergency event classification and grading method based on decision tree algorithms and a Bayesian algorithm, characterized by including:
S1, performing feature division on a pre-classified and pre-graded event library to build a training sample set;
S2, using the ID3, C4.5 and CART algorithms respectively on the training sample set to build three decision-tree classification and grading models;
S3, building and training a Bayesian classifier from the training sample set;
S4, extracting key feature attributes from the event to be classified and graded;
S5, classifying the event with the three decision-tree models according to its feature attributes, yielding three classification results;
S6, using the Bayesian classifier to compute, from the event's feature attributes, the probability of each of the three classification results from S5, and taking the result with the highest probability as the final classification result.
Wherein, the classification and grading of emergency events specifically includes:
dividing emergency events into four classes: natural disasters, accident disasters, public health events and social security events;
dividing emergency events into four grades according to their nature, severity, controllability and scope of influence: especially serious, serious, relatively serious and general.
Wherein, building a decision-tree model with the ID3 algorithm in step S2 specifically includes:
calculating the information gain of each attribute of each event;
selecting the feature attribute with the maximum information gain as the final split point for branch division.
Wherein, calculating the information gain of each attribute of the event to be classified specifically includes:
calculating the expected information of the classes for each event;
calculating the expected information requirement of each attribute from the expected information;
calculating the information gain of each attribute from the expected information requirement.
Wherein, building a decision-tree classification and grading model with the C4.5 algorithm in step S2 specifically includes:
calculating the information gain of each attribute of each event;
calculating the information gain ratio of each attribute from the information gain;
selecting the feature attribute with the maximum information gain ratio as the split point for branch division.
Wherein, building a decision-tree classification and grading model with the CART algorithm in step S2 specifically includes:
calculating the impurity of each attribute of each event;
calculating the GINI index of each branch from the impurity of each attribute;
choosing the feature attribute that minimizes the GINI index of the branches for branch division, yielding the CART decision-tree model.
Wherein, step S3 specifically includes:
building a Bayesian classifier according to Bayes' theorem, based on the training sample set;
using the Bayesian classifier to calculate the conditional probability of each event feature attribute under each classification and grading result, thereby training the Bayesian classifier.
Wherein, step S4 specifically includes:
extracting key feature attributes from the event to be classified using a Chinese word-segmentation algorithm.
In another embodiment, step S4 specifically includes:
according to the feature-attribute division of events in the sample set, applying word segmentation and keyword matching to the event to extract the key feature attributes of the event to be classified.
In another aspect, the present invention provides an emergency event classification and grading apparatus based on decision tree algorithms and a Bayesian algorithm, characterized by including:
a training-sample-set building module, for performing feature division on a pre-classified and pre-graded event library to build a training sample set;
a decision-tree-model building module, for building three decision-tree classification and grading models from the training sample set using the ID3, C4.5 and CART algorithms respectively;
a classifier building module, for building and training a Bayesian classifier from the training sample set;
a feature extraction module, for extracting key feature attributes from the event to be classified;
a classification module, for classifying the event according to its feature attributes with the three decision-tree models built by the decision-tree-model building module, yielding three classification results;
a classification-result computing module, for using the Bayesian classifier to compute, according to the event's feature attributes, the probability of each of the three classification results from the classification module, and taking the result with the highest probability as the final classification result.
Wherein, the classification and grading of emergency events specifically includes:
dividing emergency events into four classes: natural disasters, accident disasters, public health events and social security events;
dividing emergency events into four grades according to their nature, severity, controllability and scope of influence: especially serious, serious, relatively serious and general.
Wherein, the decision-tree-model building module specifically includes:
an ID3 building unit, for building a classification and grading model with the ID3 algorithm;
a C4.5 building unit, for building a classification and grading model with the C4.5 algorithm;
a CART building unit, for building a classification and grading model with the CART algorithm.
Wherein, the ID3 building unit specifically includes:
an information-gain computing unit, for calculating the information gain of each attribute of each event;
a branch division unit, for selecting the feature attribute with the maximum information gain as the final split point for branch division.
Wherein, the information-gain computing unit specifically includes:
an expected-information computing subunit, for calculating the expected information of each attribute of each event;
an expected-information-requirement subunit, for calculating the expected information requirement of each attribute from the expected information;
an information-gain computing subunit, for calculating the information gain of each attribute from the expected information requirement.
Wherein, the C4.5 building unit specifically includes:
a second information-gain computing subunit, for calculating the information gain of each attribute of each event;
an information-gain-ratio computing subunit, for calculating the information gain ratio of each attribute from the information gain;
a second branch-division subunit, for selecting the feature attribute with the maximum information gain ratio as the split point for branch division.
Wherein, the CART building unit specifically includes:
an impurity computing subunit, for calculating the impurity of each attribute of each event;
a GINI-index computing subunit, for calculating the GINI index of each branch from the impurity of each attribute;
a CART-model building subunit, for choosing the feature attribute that minimizes the GINI index of the branches for branch division, yielding the CART decision-tree model.
Wherein, the classifier building module specifically includes:
a classifier building unit, for building a Bayesian classifier according to Bayes' theorem, based on the training sample set;
a classifier training unit, for using the Bayesian classifier to calculate the conditional probability of each event feature attribute under each classification and grading result, thereby training the Bayesian classifier.
Wherein, the feature extraction module specifically extracts key feature attributes from the event to be classified using a Chinese word-segmentation algorithm.
Wherein, the feature extraction module specifically includes: according to the feature-attribute division of events in the sample set, applying word segmentation and keyword matching to the event to extract the key feature attributes of the event to be classified.
In yet another aspect, the present invention provides an emergency event classification and grading system based on decision tree algorithms and a Bayesian algorithm, characterized by including the above emergency event classification and grading apparatus based on decision tree algorithms and a Bayesian algorithm.
The emergency event classification and grading method, apparatus and system of the present invention can improve the classification accuracy of a single algorithm, and effectively compensate for the shortcomings of decision-tree algorithms: difficulty in predicting continuous fields, increasing error when there are too many classes, and poor performance on data with strongly correlated features.
Brief description of the drawings
Fig. 1 shows a flowchart of the emergency event classification and grading method based on decision tree algorithms and a Bayesian algorithm of the present invention;
Fig. 2 shows a structural block diagram of the emergency event classification and grading apparatus based on decision tree algorithms and a Bayesian algorithm of the present invention.
Embodiment
To make the purpose, technical solution and advantages of the present invention clearer, the present invention is further elaborated below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein only explain the present invention and are not intended to limit it.
Reference will now be made in detail to embodiments of the present invention, examples of which are shown in the drawings. The suffixes "module" and "unit" of elements are used herein for convenience of description and may be used interchangeably, without any distinguishing meaning or function.
Although the elements or units constituting an embodiment of the present invention may be described as coupled into, or operating as, a single element or unit, the invention is not necessarily limited to such an embodiment. Within the scope of the purpose of the invention, one or more elements may be selectively combined and operated as one or more elements.
In one embodiment of the present invention, as shown in Fig. 1, an emergency event classification and grading method based on decision tree algorithms and a Bayesian algorithm is provided, including:
S1, performing feature division on a pre-classified and pre-graded event library to build a training sample set.
In this embodiment, suppose the training sample set S contains n events. Each emergency event is classified into one of 4 major classes (natural disasters, accident disasters, public health events and social security events) and 22 subclasses, and is divided according to factors such as its nature, severity, controllability and scope of influence into 4 grades: grade I (especially serious), grade II (serious), grade III (relatively serious) and grade IV (general). Let the final classification and grading result of an event be r, and suppose each event contains m feature attributes t; then each event is described by its result r and its m feature attributes, and the sample set S is:
S = {t_{11}, t_{12}, ..., t_{1m}, r_1; ...; t_{i1}, t_{i2}, ..., t_{im}, r_i; ...; t_{n1}, t_{n2}, ..., t_{nm}, r_n}.
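The sample-set layout above can be sketched as plain Python data; the attribute values and grade labels below are illustrative placeholders, not taken from the patent:

```python
# Each event is m feature-attribute values t_{i1}..t_{im} plus its
# classification-and-grading result r_i. Values here are hypothetical.
training_rows = [
    ("natural_disaster", "large_scope", "hard_to_control"),
    ("accident_disaster", "small_scope", "controllable"),
    ("public_health", "large_scope", "controllable"),
]
training_labels = ["I", "III", "II"]  # grades r_1 .. r_n
```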
S2, using the ID3, C4.5 and CART algorithms respectively on the training sample set to build three decision-tree classification and grading models.
(1) Training the ID3 decision-tree classification and grading model. Before each non-leaf node of the decision tree is split, the information gain produced by each attribute is first computed, and the feature attribute with the maximum information gain is selected as the final split point for branch division; after one non-leaf node is split, the next node is split in the same way, finally yielding the ID3 decision-tree model. The ID3 algorithm obtains the information gain in 3 steps; the flow is as shown in Fig. 2. First, the expected information of the classification and grading results D is computed:
info(D) = -Σ_i p_i · log2(p_i),
where p_i is the probability that the i-th class/grade occurs in the whole sample set. Then, the expected information requirement of a feature attribute is computed. Suppose the events of the sample set are partitioned according to feature attribute t_i into v branches D_1, ..., D_v, where v ranges over the values of t_i; then the expectation for t_i is
info_{t_i}(D) = Σ_{j=1}^{v} (|D_j| / |D|) · info(D_j).
Finally, the information gain of a partition by feature attribute t_i is obtained by the formula
gain(t_i) = info(D) - info_{t_i}(D).
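The three-step information-gain computation described above can be sketched as follows; the row/label layout and names are assumptions for illustration, not the patent's own code:

```python
import math
from collections import Counter

def entropy(labels):
    """info(D) = -sum_i p_i * log2(p_i) over the class/grade frequencies."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def info_gain(rows, labels, attr_index):
    """gain(t_i) = info(D) - info_{t_i}(D) for the partition by one attribute."""
    n = len(labels)
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attr_index], []).append(label)
    # Expected information requirement: weighted entropy of each branch D_j.
    expected = sum(len(part) / n * entropy(part) for part in partitions.values())
    return entropy(labels) - expected
```

ID3 would call `info_gain` once per candidate attribute at each node and split on the maximizer.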
(2) Training the C4.5 decision-tree classification and grading model. When splitting a non-leaf node of the decision tree, on the basis of the ID3 information gain, the information gain ratio of feature attribute t_i is computed as
gain_ratio(t_i) = gain(t_i) / split_info(t_i),
where split_info(t_i) is the split information,
split_info(t_i) = -Σ_{j=1}^{v} (|D_j| / |D|) · log2(|D_j| / |D|),
with v ranging over the values of feature attribute t_i. According to the result of the gain-ratio formula, the feature attribute with the maximum information gain ratio is chosen each time as the node for decision-tree branch division, finally yielding the C4.5 decision-tree model.
(3) Training the CART decision-tree classification and grading model. When the decision tree is split, the impurity is computed as
Gini(D) = 1 - Σ_k p_k²,
where p_k is the proportion of events in a branch belonging to class/grade k. If the sample set is partitioned into branches by feature attribute t_i, the GINI index of the split is
Gini_{t_i}(D) = Σ_j (|D_j| / |D|) · Gini(D_j).
According to the GINI index formula, the feature attribute that minimizes the GINI index of the branches is chosen each time for branch division, yielding the CART decision-tree model.
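The impurity and weighted GINI index above can be sketched as follows; a multiway partition per attribute is assumed, matching the per-branch description:

```python
from collections import Counter

def gini(labels):
    """Gini(D) = 1 - sum_k p_k^2 over the class/grade proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def gini_index(rows, labels, attr_index):
    """Weighted Gini of the partition induced by one feature attribute."""
    n = len(labels)
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attr_index], []).append(label)
    return sum(len(p) / n * gini(p) for p in partitions.values())
```

CART would choose, at each node, the attribute whose `gini_index` is smallest.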
S3, building and training a Bayesian classifier from the training sample set.
According to Bayes' theorem, the conditional probability that an event to be classified, with feature-attribute values T = (t_1, t_2, ..., t_m), belongs to a certain class/grade r_i is
P(r_i | T) = P(r_i) · Π_{j=1}^{m} P(t_j | r_i) / P(T),
where P(t_j | r_i) is the probability that the event has feature attribute t_j given that it belongs to class/grade r_i. The Bayesian classifier computes the values P(t_j | r_i) and stores them in the classifier, so that probabilities can later be calculated for events to be classified.
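A minimal sketch of a classifier that stores P(r_i) and P(t_j | r_i) as described above. Add-one smoothing is an assumption of this sketch; the patent does not say how zero counts are handled:

```python
from collections import Counter, defaultdict

class NaiveBayes:
    """Stores class priors P(r_i) and conditionals P(t_j | r_i) for scoring."""

    def fit(self, rows, labels):
        self.n = len(labels)
        self.priors = Counter(labels)       # counts of each class/grade r_i
        self.cond = defaultdict(Counter)    # per-class (attr_index, value) counts
        self.values = defaultdict(set)      # distinct values seen per attribute
        for row, label in zip(rows, labels):
            for j, v in enumerate(row):
                self.cond[label][(j, v)] += 1
                self.values[j].add(v)
        return self

    def score(self, row, label):
        """Unnormalised P(r_i) * prod_j P(t_j | r_i); P(T) cancels when comparing."""
        p = self.priors[label] / self.n
        for j, v in enumerate(row):
            # Add-one (Laplace) smoothing over the attribute's value set.
            p *= (self.cond[label][(j, v)] + 1) / (self.priors[label] + len(self.values[j]))
        return p
```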
S4, extracting key feature attributes from the event to be classified. According to the feature-attribute division of events in the sample set, word segmentation and keyword matching are applied to the event description to extract the feature-attribute values of the event to be classified.
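The keyword-matching step could look like the sketch below. The `keyword_map` lexicon is a hypothetical structure standing in for the segmented-keyword dictionary implied by the patent, and plain substring matching on English text stands in for Chinese word segmentation:

```python
def extract_features(description, keyword_map):
    """keyword_map: {attribute: {value: [keywords]}} -- a hypothetical lexicon.
    Returns the attribute values whose keywords occur in the description."""
    features = {}
    for attribute, candidates in keyword_map.items():
        for value, keywords in candidates.items():
            if any(k in description for k in keywords):
                features[attribute] = value   # first matching value wins
                break
    return features
```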
S5, classifying the event with the three decision-tree models according to its feature attributes, yielding three classification results.
In this embodiment, the event to be classified is classified by each of the three decision-tree classification and grading models (ID3, C4.5 and CART), giving three classification results. The specific classification steps are: using a decision-tree model, starting from the event feature attribute at the root node, the feature attribute extracted from the event to be classified is tested and the output branch is selected according to its value; each node is tested in turn until a leaf node is reached, and the value of the leaf node is the event's classification and grading result.
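The root-to-leaf traversal just described can be sketched with a nested-dict tree encoding (an assumed representation, not the patent's):

```python
def classify(tree, event):
    """Walk one decision tree from root to leaf.
    tree: {attribute: {value: subtree_or_leaf}}; a leaf is a grade label."""
    node = tree
    while isinstance(node, dict):
        attribute, branches = next(iter(node.items()))
        node = branches[event[attribute]]   # follow the branch for this value
    return node
```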
S6, using the Bayesian classifier to compute, from the event's feature attributes, the probability of each of the three classification results from S5, and taking the result with the highest probability as the final classification result.
In the above embodiment, if the results obtained by the three decision-tree models are consistent, this step is skipped and the final classification and grading result is obtained directly. If the three models yield more than one distinct result, each candidate result is checked with the Bayesian classifier: the conditional probability of the feature attributes of the event under each candidate result is computed, and the result with the highest probability is taken as the final classification and grading result.
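The agree-or-arbitrate rule of step S6 can be sketched as follows, assuming a `bayes_score(event_row, label)` callable like the classifier of step S3:

```python
def final_result(candidates, event_row, bayes_score):
    """If the three tree results agree, return that result directly; otherwise
    score each distinct candidate with the Bayesian classifier and keep the best."""
    distinct = set(candidates)
    if len(distinct) == 1:
        return candidates[0]
    return max(distinct, key=lambda label: bayes_score(event_row, label))
```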
In another embodiment of the present invention, as shown in Fig. 2, an emergency event classification and grading apparatus based on decision tree algorithms and a Bayesian algorithm is provided, specifically including:
a training-sample-set building module 10, for performing feature division on a pre-classified and pre-graded event library to build a training sample set;
a decision-tree-model building module 20, for building three decision-tree classification and grading models from the training sample set using the ID3, C4.5 and CART algorithms respectively;
a classifier building module 30, for building and training a Bayesian classifier from the training sample set;
a feature extraction module 40, for extracting key feature attributes from the event to be classified;
a classification module 50, for classifying the event according to its feature attributes with the three decision-tree models built by the decision-tree-model building module, yielding three classification results;
a classification-result computing module 60, for using the Bayesian classifier to compute, according to the event's feature attributes, the probability of each of the three classification results from the classification module, and taking the result with the highest probability as the final classification result.
In yet another embodiment of the invention, an emergency event classification and grading system based on decision tree algorithms and a Bayesian algorithm is provided, including the above emergency event classification and grading apparatus based on decision tree algorithms and a Bayesian algorithm.
It should be appreciated that the functional units or capabilities described in this specification may be referred to or labelled as components, modules or systems, more particularly to emphasize their implementation independence. For example, a component, module or system may be implemented as a hardware circuit comprising custom very-large-scale integration (VLSI) circuits or gate arrays, off-the-shelf semiconductors such as logic chips or transistors, or other discrete components. A component or module may also be implemented in programmable hardware devices such as field-programmable gate arrays, programmable array logic or programmable logic devices. A component or module may also be implemented in software executed by various types of processors. An identified component or module of executable code may, for instance, comprise one or more physical or logical blocks of computer instructions, which may be organized as an object, procedure or function. Nevertheless, the identified components or modules need not be physically located together, but may comprise disparate instructions stored in different locations which, when joined logically together, constitute the component or module and achieve its stated purpose.
It will be appreciated by those skilled in the art that the effects achievable by the present invention are not limited to those particularly described above, and further advantages of the present invention will be more clearly understood from the detailed description. It will be apparent to those skilled in the art that various modifications and variations can be made to the present invention without departing from its spirit or scope. Thus, if such modifications and variations of the present invention fall within the scope of the appended claims and their equivalents, the present invention is intended to cover them.
Claims (20)
- A kind of 1. accident classification stage division based on decision Tree algorithms and bayesian algorithm, it is characterised in that including:S1, carry out feature division to pre-classification classifiable event storehouse, builds training sample set;S2, according to training sample set, be utilized respectively ID3 algorithms, C4.5 algorithms, CART algorithms, build three decision tree classifications point Level model;S3, according to training sample set, build simultaneously training Bayes classifier;S4, carry out key feature attributes extraction to classification event to be sorted;S5, classified according to affair character attribute using three decision-tree models, draws three classification results;S6, the probability according to affair character attribute using Bayes classifier to three classification results calculating category in S5, Take probability is highest to be used as final classification result.
- 2. according to the method described in claim 1, it is characterized in that, accident classification specifically includes:The accident is divided into natural calamity, four class of accident, occurred events of public safety and social security events;Four the accident graded properties, the order of severity, controllability and coverage factors be divided into it is especially great, great, Larger and general four grades.
- 3. according to the method described in claim 1, it is characterized in that, ID3 algorithms structure decision tree mould is utilized in the step S2 Type, specifically includes:Calculate the information gain of each attribute of each event;The characteristic attribute of information gain maximum is selected to carry out branch's division as final split point.
- 4. the according to the method described in claim 3, it is characterized in that, letter of each attribute for calculating classification event to be sorted Gain is ceased, is specifically included:Calculate the desired value of each attribute of each event;The expectation information requirement of each attribute is calculated according to the desired value;Calculate the information gain of each attribute respectively according to the expectation information requirement.
- 5. according to the method described in claim 1, it is characterized in that, C4.5 algorithms structure decision tree point is utilized in the step S2 Class hierarchy model, specifically includes:Calculate the information gain of each attribute of each event;According to described information gain, the information gain-ratio of each attribute is calculated;The characteristic attribute of information gain-ratio maximum is selected to carry out branch's division as split point.
- 6. according to the method described in claim 1, it is characterized in that, CART algorithms structure decision tree point is utilized in the step S2 Class hierarchy model, specifically includes:Calculate the impurity level of each attribute of each event;According to the impurity level of each attribute, the GINI indexes of each branch are calculated;The characteristic attribute for choosing the GINI indexes minimum of each branch carries out branch's division, obtains CART decision-tree models.
- 7. according to the method described in claim 1, it is characterized in that, the step S3 is specifically included:Based on training sample set, Bayes's classification clasfficiator is built according to Bayes' theorem;Conditional probability of each affair character attribute in each classification classification results is calculated using Bayes's classification clasfficiator, to the shellfish Ye Si classification clasfficiators are trained.
- 8. according to the method described in claim 1, it is characterized in that, the step S4 is specifically included:Key feature attributes extraction is carried out to classification event to be sorted using Chinese words segmentation.
- 9. according to the method described in claim 1, it is characterized in that, the step S4 is specifically included:Divided according to the characteristic attribute of event in sample set, to event using participle and keyword match, extract classification to be sorted The key feature attribute of event.
- A kind of 10. accident classification grading plant based on decision Tree algorithms and bayesian algorithm, it is characterised in that including:Training sample set builds module, carries out feature division to pre-classification classifiable event storehouse for training, builds training sample Collection;Decision tree classification hierarchy model builds module, for the training sample set according to structure, is utilized respectively ID3 algorithms, C4.5 Algorithm, CART algorithms, build three decision tree classification hierarchy models;Grader builds module, for according to training sample set, building and training Bayes classifier;Characteristic extracting module, for carrying out key feature attributes extraction to classification event to be sorted;Sort module, for three decision-makings according to affair character attribute using decision tree classification hierarchy model structure module construction Tree classification hierarchy model is classified, and draws three classification results;Classification results computing module, for utilizing three classification of the Bayes classifier to sort module according to affair character attribute As a result the probability of the category is calculated, acquisition probability is highest to be used as final classification result.
- 11. The apparatus according to claim 10, characterised in that the emergency event classification and grading specifically comprises: dividing emergency events into four classes: natural disasters, accidents, public health events and social security events; and grading each class, according to the factors of severity, controllability and scope of impact, into four levels: especially major, major, relatively major and general.
- 12. The apparatus according to claim 10, characterised in that the decision tree classification-and-grading model construction module specifically comprises: an ID3 algorithm construction unit, configured to construct a classification-and-grading model using the ID3 algorithm; a C4.5 algorithm construction unit, configured to construct a classification-and-grading model using the C4.5 algorithm; and a CART algorithm construction unit, configured to construct a classification-and-grading model using the CART algorithm.
- 13. The apparatus according to claim 12, characterised in that the ID3 algorithm construction unit specifically comprises: an information gain computation unit, configured to compute the information gain of each attribute of each event; and a branch division unit, configured to select the feature attribute with the largest information gain as the final split point for branch division.
- 14. The apparatus according to claim 13, characterised in that the information gain computation unit specifically comprises: an expectation computation subunit, configured to compute the expected value of each attribute of each event; an expected information requirement subunit, configured to compute the expected information requirement of each attribute from the expected value; and an information gain computation subunit, configured to compute the information gain of each attribute from the expected information requirement.
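Claims 13 and 14 describe the standard ID3 computation: the expected information (entropy) of the label set, the expected information requirement after splitting on an attribute, and their difference, the information gain. A minimal sketch; the toy rows and labels are assumptions for illustration:

```python
import math
from collections import Counter

def entropy(labels):
    """Expected information of a label distribution."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def information_gain(rows, attr_index, labels):
    """Entropy of the whole set minus the weighted entropy of the subsets
    produced by splitting on one attribute (the expected information
    requirement); ID3 splits on the attribute maximising this value."""
    total = len(rows)
    subsets = {}
    for row, label in zip(rows, labels):
        subsets.setdefault(row[attr_index], []).append(label)
    requirement = sum(len(s) / total * entropy(s) for s in subsets.values())
    return entropy(labels) - requirement

# Toy fragment: one attribute that separates the grades perfectly.
rows = [("controllable",), ("controllable",), ("uncontrollable",), ("uncontrollable",)]
labels = ["general", "general", "major", "major"]
print(information_gain(rows, 0, labels))  # -> 1.0
```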
- 15. The apparatus according to claim 12, characterised in that the C4.5 algorithm construction unit specifically comprises: a second information gain computation subunit, configured to compute the information gain of each attribute of each event; an information gain ratio computation subunit, configured to compute the information gain ratio of each attribute from the information gain; and a second branch division subunit, configured to select the feature attribute with the largest information gain ratio as the split point for branch division.
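Claim 15 follows the usual C4.5 refinement: the information gain is normalised by the split information (the entropy of the attribute's own value distribution) to obtain the gain ratio, which penalises many-valued attributes. A sketch under the same kind of toy-data assumption:

```python
import math
from collections import Counter

def entropy(values):
    """Expected information of a value distribution."""
    total = len(values)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(values).values())

def gain_ratio(rows, attr_index, labels):
    """Information gain divided by the split information of the attribute;
    C4.5 splits on the attribute maximising this ratio."""
    total = len(rows)
    subsets = {}
    for row, label in zip(rows, labels):
        subsets.setdefault(row[attr_index], []).append(label)
    gain = entropy(labels) - sum(len(s) / total * entropy(s)
                                 for s in subsets.values())
    split_info = entropy([row[attr_index] for row in rows])
    return gain / split_info if split_info else 0.0

rows = [("rural",), ("rural",), ("urban",), ("urban",)]
labels = ["general", "general", "major", "major"]
print(gain_ratio(rows, 0, labels))  # -> 1.0
```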
- 16. The apparatus according to claim 12, characterised in that the CART algorithm construction unit specifically comprises: an impurity computation subunit, configured to compute the impurity of each attribute of each event; a GINI index computation subunit, configured to compute the GINI index of each branch from the impurity of each attribute; and a CART decision tree model construction subunit, configured to select, for each branch, the feature attribute with the smallest GINI index for branch division, obtaining a CART decision tree model.
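Claim 16 matches the standard CART construction: compute the Gini impurity of the branches each candidate attribute would produce and split on the attribute with the smallest weighted GINI index. A sketch with assumed toy data:

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a label set (0.0 for a pure node)."""
    total = len(labels)
    return 1.0 - sum((c / total) ** 2 for c in Counter(labels).values())

def gini_index(rows, attr_index, labels):
    """Weighted Gini impurity of the branches produced by splitting on
    one attribute; CART splits on the attribute minimising this value."""
    total = len(rows)
    subsets = {}
    for row, label in zip(rows, labels):
        subsets.setdefault(row[attr_index], []).append(label)
    return sum(len(s) / total * gini(s) for s in subsets.values())

rows = [("local",), ("local",), ("regional",), ("regional",)]
labels = ["general", "general", "major", "major"]
print(gini_index(rows, 0, labels))  # -> 0.0, a perfectly separating attribute
```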
- 17. The apparatus according to claim 10, characterised in that the classifier construction module specifically comprises: a classifier construction unit, configured to construct a Bayesian classifier according to Bayes' theorem, based on the training sample set; and a classifier training unit, configured to compute, with the Bayesian classifier, the conditional probability of each event feature attribute under each classification-and-grading result, so as to train the Bayesian classifier.
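The construction and training units of claim 17 amount to estimating class priors and per-class conditional probabilities by counting over the training sample set. A minimal naive-Bayes sketch without smoothing; the sample tuples and class names are illustrative assumptions:

```python
from collections import Counter, defaultdict

def train_naive_bayes(samples, labels):
    """Estimate P(class) and P(attribute value | class) by counting
    (no Laplace smoothing, for brevity)."""
    class_counts = Counter(labels)
    total = len(labels)
    cond_counts = defaultdict(Counter)
    for attrs, label in zip(samples, labels):
        for i, value in enumerate(attrs):
            cond_counts[label][(i, value)] += 1
    priors = {c: n / total for c, n in class_counts.items()}
    conditionals = {c: {k: n / class_counts[c] for k, n in counts.items()}
                    for c, counts in cond_counts.items()}
    return priors, conditionals

def posterior_scores(attrs, priors, conditionals):
    """Unnormalised P(class | attrs) under the naive independence assumption."""
    scores = {}
    for c, prior in priors.items():
        p = prior
        for i, value in enumerate(attrs):
            p *= conditionals[c].get((i, value), 0.0)
        scores[c] = p
    return scores

samples = [("fire", "urban"), ("fire", "rural"), ("flood", "rural")]
labels = ["accident", "accident", "natural disaster"]
priors, conditionals = train_naive_bayes(samples, labels)
scores = posterior_scores(("fire", "urban"), priors, conditionals)
print(max(scores, key=scores.get))  # -> accident
```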
- 18. The apparatus according to claim 10, characterised in that the feature extraction module specifically extracts key feature attributes from the event to be classified and graded, using Chinese word segmentation.
- 19. The apparatus according to claim 18, characterised in that the feature extraction module is specifically configured to: according to the feature-attribute division of events in the sample set, apply word segmentation and keyword matching to the event, and extract the key feature attributes of the event to be classified and graded.
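The extraction step of claims 18 and 19 pairs word segmentation with keyword matching against the sample set's attribute vocabulary. The sketch below uses plain substring matching on English text for portability; a production system would first segment the description with a Chinese tokenizer such as jieba. The keyword tables are illustrative assumptions:

```python
def extract_key_features(description, attribute_keywords):
    """For each feature attribute, return the first keyword from the
    sample set's vocabulary that appears in the event description."""
    features = {}
    for attribute, keywords in attribute_keywords.items():
        for keyword in keywords:
            if keyword in description:
                features[attribute] = keyword
                break
    return features

attribute_keywords = {
    "event_type": ["explosion", "flood", "epidemic"],
    "location": ["factory", "river", "school"],
}
print(extract_key_features("An explosion occurred at a chemical factory",
                           attribute_keywords))
# -> {'event_type': 'explosion', 'location': 'factory'}
```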
- 20. a kind of accident classification hierarchy system based on decision Tree algorithms and bayesian algorithm, it is characterised in that including power Profit requires accident classification grading plant of the 10-19 any one of them based on decision Tree algorithms and bayesian algorithm.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710934709.2A CN107977670A (en) | 2017-10-09 | 2017-10-09 | Emergency event classification and grading method, apparatus and system based on decision tree and Bayesian algorithms |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107977670A true CN107977670A (en) | 2018-05-01 |
Family
ID=62012376
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710934709.2A Pending CN107977670A (en) | 2017-10-09 | 2017-10-09 | Emergency event classification and grading method, apparatus and system based on decision tree and Bayesian algorithms
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107977670A (en) |
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101414300A (en) * | 2008-11-28 | 2009-04-22 | 电子科技大学 | Method for sorting and processing internet public feelings information |
CN106202561A (en) * | 2016-07-29 | 2016-12-07 | 北京联创众升科技有限公司 | Digitized contingency management case library construction methods based on the big data of text and device |
Non-Patent Citations (5)
Title |
---|
吴坚 et al.: "Research on a classification method for online public opinion text information based on the random forest algorithm", 《信息网络安全》 (Netinfo Security) *
杨云 et al.: "Application of the ID3 decision tree algorithm to risk assessment of public health emergencies", 《中国预防医学杂志》 *
杨德礼 et al.: "Management Theory and Methods in the E-commerce Environment", 31 December 2004, Dalian University of Technology Press *
薛云霞: "Research on methods for identifying Weibo user attributes", 《中国优秀硕士学位论文全文数据库信息科技辑》 (China Master's Theses Full-text Database, Information Science and Technology) *
韦鹏程 et al.: "Integration and Development of Big Data Analytics and Machine Learning", 31 May 2017, University of Electronic Science and Technology of China Press *
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109635254A (en) * | 2018-12-03 | 2019-04-16 | 重庆大学 | Paper duplicate checking method based on naive Bayesian, decision tree and SVM mixed model |
CN110942098A (en) * | 2019-11-28 | 2020-03-31 | 江苏电力信息技术有限公司 | Power supply service quality analysis method based on Bayesian pruning decision tree |
CN112819069A (en) * | 2021-01-29 | 2021-05-18 | 中国农业银行股份有限公司 | Event grading method and device |
CN113232674A (en) * | 2021-05-28 | 2021-08-10 | 南京航空航天大学 | Vehicle control method and device based on decision tree and Bayesian network |
CN113722509A (en) * | 2021-09-07 | 2021-11-30 | 中国人民解放军32801部队 | Knowledge graph data fusion method based on entity attribute similarity |
CN113722509B (en) * | 2021-09-07 | 2022-03-01 | 中国人民解放军32801部队 | Knowledge graph data fusion method based on entity attribute similarity |
CN115829061A (en) * | 2023-02-21 | 2023-03-21 | 中国电子科技集团公司第二十八研究所 | Emergency accident disposal method based on historical case and empirical knowledge learning |
CN115829061B (en) * | 2023-02-21 | 2023-04-28 | 中国电子科技集团公司第二十八研究所 | Emergency accident handling method based on historical case and experience knowledge learning |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107977670A (en) | Emergency event classification and grading method, apparatus and system based on decision tree and Bayesian algorithms | |
CN111079639B (en) | Method, device, equipment and storage medium for constructing garbage image classification model | |
CN106815369B (en) | Text classification method based on the XGBoost classification algorithm | |
CN107944480A (en) | Enterprise industry classification method | |
CN107644057B (en) | Imbalanced text classification method based on transfer learning | |
CN103927302B (en) | Text classification method and system | |
WO2019179403A1 (en) | Fraud transaction detection method based on sequential wide-and-deep learning | |
CN102289522B (en) | Method of intelligently classifying texts | |
CN108388651A (en) | Text classification method based on graph kernels and convolutional neural networks | |
CN109871885B (en) | Plant identification method based on deep learning and plant taxonomy | |
CN113326377B (en) | Name disambiguation method and system based on enterprise association relationships | |
CN108460421A (en) | Classification method for imbalanced data | |
CN106919951A (en) | Weakly supervised bilinear deep learning method based on fusion of click and visual features | |
CN107392241A (en) | Image object classification method based on weighted-sampling XGBoost | |
CN101876987A (en) | Method for classifying two types of texts with inter-cluster overlap | |
CN108710894A (en) | Active learning annotation method and device based on cluster representative points | |
CN102968419B (en) | Disambiguation method for interactive Internet entity names | |
CN111754345A (en) | Bitcoin address classification method based on an improved random forest | |
CN103886030B (en) | Cost-sensitive decision tree based data classification method for cyber-physical fusion systems | |
WO2023019698A1 (en) | Hyperspectral image classification method based on rich context network | |
CN102750286A (en) | Decision tree classifier method for handling missing data | |
CN113051914A (en) | Enterprise hidden-label extraction method and device based on multi-feature dynamic profiling | |
CN108920446A (en) | Processing method for engineering documents | |
CN111125396B (en) | Image retrieval method with a single-model multi-branch structure | |
Parvathi et al. | Identifying relevant text from text document using deep learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20180501 ||