CN109685321A - Data-mining-based event risk early-warning method, electronic device and medium - Google Patents

Data-mining-based event risk early-warning method, electronic device and medium

Info

Publication number
CN109685321A
Authority
CN
China
Prior art keywords
data
event
feature
index
assailant
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811431329.8A
Other languages
Chinese (zh)
Inventor
王红
王彩雨
赵丽丽
王峰
俞凤萍
胡斌
闫晓燕
张伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Normal University
Original Assignee
Shandong Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong Normal University
Priority to CN201811431329.8A
Publication of CN109685321A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0635 Risk analysis of enterprise or organisation activities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G06F18/2135 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on approximation criteria, e.g. principal component analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/26 Government or public services
    • G06Q50/265 Personal security, identity or safety

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Human Resources & Organizations (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Tourism & Hospitality (AREA)
  • Economics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Biology (AREA)
  • Artificial Intelligence (AREA)
  • Strategic Management (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Business, Economics & Management (AREA)
  • Educational Administration (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Development Economics (AREA)
  • Marketing (AREA)
  • Computer Security & Cryptography (AREA)
  • Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • General Health & Medical Sciences (AREA)
  • Game Theory and Decision Science (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a data-mining-based event risk early-warning method, an electronic device and a medium. Record data of historical terrorist attacks and of an event under test are acquired and preprocessed. The preprocessed data are classified with a clustering algorithm into several classes of data. Principal component analysis (PCA) extracts several features from each class; the features extracted from all classes are integrated into a feature set; and PCA is applied again to extract several index features from the feature set. The weight of each index feature is calculated with an improved entropy weight method. For each event, the values of its index features are combined in a weighted calculation using the corresponding weights. The results are sorted in descending order, and the rank of the event under test among all events is output as the result. If the rank is less than a set threshold, an early-warning signal is issued.

Description

Data-mining-based event risk early-warning method, electronic device and medium
Technical field
The present disclosure relates to the field of data mining, and in particular to a data-mining-based event risk early-warning method, electronic device and medium.
Background art
The statements in this section merely provide background related to the present disclosure and do not necessarily constitute prior art.
A terrorist attack is an attack deliberately carried out by extremist individuals or organizations, directed at but not limited to civilians and civilian facilities, that violates international morality and justice. It is highly lethal and destructive: it directly causes heavy casualties and property loss, imposes great psychological pressure on people, causes a degree of social unrest, disrupts normal work and daily life, and thereby seriously hinders economic development.
Events are commonly graded by a subjective method: an authoritative organization or department selects a few main indicators and prescribes the grading standard. However, the harm of a terrorist attack depends not only on casualties and economic loss but also on factors such as its timing, its region and its target, so this grading approach rarely yields a unified standard. As terrorist attacks keep occurring, data mining on their characteristics allows terrorist events to be graded quantitatively and objectively. This is important work, because it gives the relevant departments an objective basis for targeted measures.
In summary, an accurate and fast risk early-warning method for terrorist attacks is lacking, and no effective solution yet exists.
Summary of the invention
To overcome the deficiencies of the prior art, the present disclosure provides a data-mining-based event risk early-warning method, electronic device and medium that give accurate and fast risk early warning for terrorist attacks based on an improved entropy weight model.
In a first aspect, the present disclosure provides a data-mining-based event risk early-warning method.
The data-mining-based event risk early-warning method comprises:
Data acquisition step: acquire the record data of historical terrorist attacks and of the event under test. Each event has a unique number. The record data comprise: region, attack type, property loss amount, total injuries, total deaths, number of attackers, number of attackers arrested, number of attacker deaths, event summary, hostage kidnapping outcome, or event resolution date;
Data preprocessing step: preprocess the record data of the historical terrorist attacks and of the event under test;
Data classification step: classify the preprocessed data with a clustering algorithm into several classes of data;
First feature extraction step: extract several features from each class of data using principal component analysis;
Feature integration step: integrate all features extracted from all classes of data to obtain a feature set;
Second feature extraction step: extract several index features from the feature set using principal component analysis;
Feature weight acquisition step: calculate the weight of each index feature using the improved entropy weight method;
Risk early-warning step: for each event, perform a weighted calculation on the values of its index features using the corresponding weights; sort the results in descending order; output the rank of the event under test among all events as the result; and issue an early-warning signal if the rank is less than a set threshold.
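The risk early-warning step above can be illustrated with a minimal Python sketch. The function names, the event identifiers, the weights and the threshold are all hypothetical values chosen for illustration; the patent does not specify them.

```python
def weighted_score(feature_values, weights):
    """Weighted sum of one event's index-feature values."""
    return sum(w * x for w, x in zip(weights, feature_values))


def rank_and_warn(scores, test_id, threshold):
    """Sort events by score in descending order and return the 1-based
    rank of the event under test, plus whether a warning is issued."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    rank = ordered.index(test_id) + 1
    return rank, rank < threshold


# Hypothetical weights and index-feature values (not from the patent).
weights = [0.5, 0.3, 0.2]
events = {
    "E1": [0.9, 0.8, 0.7],
    "E2": [0.1, 0.2, 0.1],
    "E3": [0.5, 0.4, 0.9],
    "T": [0.8, 0.9, 0.6],  # the event under test
}
scores = {eid: weighted_score(vals, weights) for eid, vals in events.items()}
rank, warn = rank_and_warn(scores, "T", threshold=3)
print(rank, warn)
```

A small rank means the event under test scores among the most severe historical events, which is why the warning fires when the rank falls below the threshold.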
In some possible implementations, the clustering algorithm is a hierarchical clustering algorithm.
In some possible implementations, the weight W_i of each index feature is calculated with the improved entropy weight method as follows:
Suppose k index features X_1, X_2, ..., X_k are given, where X_i = {x_i1, x_i2, ..., x_in} and x_ij is the sample data value of index feature i for the j-th sample;
Suppose the value of x_ij after standardization is Y_ij:
$$Y_{ij} = \frac{x_{ij} - \min(X_i)}{\max(X_i) - \min(X_i)}$$
where min(X_i) is the minimum and max(X_i) the maximum of the sample data values of X_i;
Next, the information entropy E_i of each index feature is computed, i = 1, 2, ..., k, given k index features with n sample data values each:
$$E_i = -\frac{1}{\ln n} \sum_{j=1}^{n} p_{ij} \ln p_{ij}$$
where
$$p_{ij} = \frac{Y_{ij}}{\sum_{j=1}^{n} Y_{ij}}$$
and, if p_ij = 0, the term is defined by the limit
$$\lim_{p_{ij} \to 0} p_{ij} \ln p_{ij} = 0$$
From the entropy formula, the information entropies of the k index features are E_1, E_2, ..., E_k; each index weight W_i is then determined:
In some possible implementations, the data preprocessing step comprises a data screening sub-step, a data filling sub-step, a data conversion sub-step and a data normalization sub-step;
The data screening sub-step removes the event summary, the hostage kidnapping outcome and the event resolution date;
The data filling sub-step fills the missing values in the records of the number of attackers, total deaths, number of attackers arrested, total injuries, number of attacker deaths and property loss amount; unknown data are filled with zeros;
The data conversion sub-step converts the region and the attack type of each terrorist attack from text data to numeric data;
The region text data are converted to numeric data as follows: for each region, sum the total deaths and the number of attackers of its events; sort the sums in descending order; then assign numeric labels to the regions in that order, with the labels decreasing successively.
The attack type text data are converted to numeric data as follows: for each attack type, sum the total deaths and the number of attackers of its events; sort the sums in descending order; then assign numeric labels to the attack types in that order, with the labels decreasing successively.
The data normalization sub-step applies min-max normalization to the data obtained from data screening, data filling and data conversion. From the normalized data, an N*1 matrix is built for each event, where N is the number of data items and each element is the normalized value of the corresponding record data item.
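The text-to-number conversion in the data conversion sub-step above can be sketched as follows. This is only an illustration under stated assumptions: the field names (`region`, `deaths`, `attackers`) and the sample records are hypothetical, and the labels simply run from the number of categories down to 1, whereas the embodiment described later uses a ten-point scale.

```python
from collections import defaultdict


def encode_by_severity(records, key):
    """Map each category (e.g. a region) to a numeric label.

    Categories are ordered by the sum of deaths and attackers over their
    events, in descending order; labels decrease along that order.
    """
    totals = defaultdict(int)
    for rec in records:
        totals[rec[key]] += rec["deaths"] + rec["attackers"]
    ordered = sorted(totals, key=totals.get, reverse=True)
    n = len(ordered)
    return {category: n - pos for pos, category in enumerate(ordered)}


# Hypothetical records, for illustration only.
records = [
    {"region": "A", "deaths": 10, "attackers": 2},
    {"region": "B", "deaths": 1, "attackers": 1},
    {"region": "A", "deaths": 3, "attackers": 1},
    {"region": "C", "deaths": 5, "attackers": 4},
]
codes = encode_by_severity(records, "region")
print(codes)
```

The same function can be applied to the attack type by passing a different `key`, provided the records carry that field.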
In a second aspect, the present disclosure provides an electronic device.
An electronic device comprises a memory, a processor, and computer instructions stored in the memory and run on the processor; when the computer instructions are run by the processor, the steps of any of the above methods are completed.
In a third aspect, the present disclosure provides a computer-readable storage medium.
A computer-readable storage medium has computer instructions stored thereon; when the computer instructions are run by a processor, the steps of any of the above methods are completed.
Compared with the prior art, the beneficial effects of the present disclosure are:
Conventional data preprocessing converts text content into a numeric data format, which improves the usability of the data and strengthens the accuracy of the model. Grouping the features by clustering reduces the difficulty and error of dimensionality reduction on high-dimensional data.
Feature optimization is achieved through feature fusion based on principal component analysis; compared with other prior-art methods, the implementation is simpler and more effective.
The improved entropy weight method yields the weight of each index for score statistics; compared with the traditional entropy weight method, it determines the weights with enhanced precision.
Brief description of the drawings
The accompanying drawings, which form a part of this application, provide a further understanding of the application. The illustrative embodiments of the application and their descriptions explain the application and do not unduly limit it.
Fig. 1 is an information flow diagram of one or more embodiments;
Fig. 2 shows the KMO and Bartlett test results for the terrorist attack data of one or more embodiments;
Fig. 3 is the communality table of one or more embodiments;
Fig. 4 is the total variance explained table of one or more embodiments;
Fig. 5 is the rotated component matrix of one or more embodiments;
Fig. 6 is the entropy weight score distribution of one or more embodiments.
Detailed description of the embodiments
It should be noted that the following detailed description is illustrative and is intended to provide further explanation of the application. Unless otherwise indicated, all technical and scientific terms used herein have the meanings commonly understood by a person of ordinary skill in the technical field of the application.
It should be noted that the terminology used herein is only for describing specific embodiments and is not intended to limit the exemplary embodiments of the application. As used herein, unless the context clearly indicates otherwise, the singular form is also intended to include the plural form. It should further be understood that the terms "comprising" and/or "including", when used in this specification, indicate the presence of features, steps, operations, devices, components and/or combinations thereof.
Embodiment 1:
As shown in Fig. 1, the data-mining-based event risk early-warning method comprises:
Data acquisition step: acquire the record data of historical terrorist attacks and of the event under test. Each event has a unique number. The record data comprise: region, attack type, property loss amount, total injuries, total deaths, number of attackers, number of attackers arrested, number of attacker deaths, event summary, hostage kidnapping outcome, or event resolution date;
Data preprocessing step: preprocess the record data of the historical terrorist attacks and of the event under test. The preprocessing comprises data screening, data conversion and data filling;
Data screening step: remove data irrelevant to the invention, such as the event summary, the hostage kidnapping outcome and the event resolution date;
Data conversion step: convert features such as the region and the attack type of each terrorist attack from text to numbers. The conversion uses a ten-point scale. First, the importance of each region is determined from the total deaths associated with features such as the region or the attack type;
Suppose region R has num events, the total deaths of the i-th event are nkill_i, i = 1, 2, ..., num, and N is the overall total deaths of all events in the sample; the final score S of the region is then given by Formula 5.
Data filling step: fill the missing values in the records of the number of attackers, total deaths, number of attackers arrested, total injuries, number of attacker deaths and property loss amount according to the proportion of missing values. Features whose missing rate is above 95% are removed directly; for features whose missing rate is below 95%, unknown data are filled with zeros;
Data normalization step: normalize the preprocessed data. Using the maximum and minimum values of the terrorist attack record data, the screened record data are normalized so that the preprocessed data are confined to a fixed range ([0, 1]), which eliminates the adverse effect of singular sample data.
Data classification step: classify the normalized data with a clustering algorithm into 4 classes of data, i.e., all features are divided into four groups. The first group comprises: total deaths, total injuries. The second group comprises: number of attackers, number of attackers arrested, number of attacker deaths. The third group comprises: property loss amount. The fourth group comprises: region, attack type.
First feature extraction step: extract N_i features from each class of data using principal component analysis;
Feature integration step: integrate all the N_i features extracted from all classes of data to obtain a feature set containing N features;
Second feature extraction step: extract the index features main_i, i = 1, 2, 3, from the feature set using principal component analysis;
Feature weight acquisition step: calculate the weight of each index feature using the improved entropy weight method;
Risk early-warning step: for each event, perform a weighted calculation on the values of its index features using the corresponding weights; sort the results in descending order; output the rank of the event under test among all events as the result; and issue an early-warning signal if the rank is less than a set threshold.
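The missing-value rule in the data filling step above (remove a feature whose missing rate exceeds 95%, otherwise fill with zeros) might be sketched as follows. The column names and rows are hypothetical, `None` is assumed to mark a missing value, and a feature with a missing rate of exactly 95% is kept here, a boundary case the text leaves open.

```python
def fill_missing(rows, max_missing_rate=0.95):
    """Drop columns whose missing rate exceeds the limit and zero-fill the rest.

    `rows` is a list of dicts sharing the same keys; None marks a missing value.
    """
    columns = list(rows[0])
    n = len(rows)
    kept = [
        col for col in columns
        if sum(row[col] is None for row in rows) / n <= max_missing_rate
    ]
    return [
        {col: 0 if row[col] is None else row[col] for col in kept}
        for row in rows
    ]


# Hypothetical rows, for illustration only.
rows = [
    {"deaths": 3, "injured": None, "ransom": None},
    {"deaths": None, "injured": 5, "ransom": None},
]
cleaned = fill_missing(rows)
print(cleaned)
```

Here `ransom` is missing in every row, so it is removed, while the single gaps in `deaths` and `injured` are filled with zeros.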
The purpose of this embodiment is to provide a data-mining-based event risk early-warning method whose steps comprise:
(1) Data processing of the acquired sample data:
Step 1: clean missing values. Compute the proportion of missing values of each field and determine the extent of the missing values. Different handling strategies are adopted according to the missing proportion and the importance of the field. Features with high importance and a low missing rate are filled.
Step 2: convert the data format. Some features are of text type, such as the region, and matter for solving the problem, so the text is converted to numbers. Problems in the imported data, such as misaligned columns and extra columns, are also corrected.
Step 3: remove data that are not needed. The event summary, the hostage kidnapping outcome, the event resolution date and the like are not needed, so they are deleted directly.
Step 4: normalize the cleaned data. Using the maximum and minimum values of the terrorist attack record data, the screened record data are normalized so that the preprocessed data are confined to a fixed range ([0, 1]), which eliminates the adverse effect of singular sample data.
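The normalization in Step 4 is ordinary min-max scaling. A minimal sketch, assuming each feature is held as a plain list of numbers; mapping a constant column to zeros is an assumption made here to avoid division by zero, not something the text specifies.

```python
def min_max_normalize(values):
    """Scale a list of numbers into [0, 1] using its minimum and maximum."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant column carries no spread, so map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


normalized = min_max_normalize([2, 4, 6])
print(normalized)
```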
(2) data classification:
Hierarchical clustering divides the preprocessed data features into different classes for feature extraction. Specifically, the invention uses the farthest-neighbor (complete-linkage) clustering method, with Pearson correlation as the interval measure. All features are divided into four groups. The first group comprises: total deaths, total injuries. The second group comprises: number of attackers, number of attackers arrested, number of attacker deaths. The third group comprises: property loss amount. The fourth group comprises: region, attack type.
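A minimal pure-Python sketch of farthest-neighbor (complete-linkage) clustering of features is given below. It assumes the distance between two features is 1 minus their Pearson correlation, which is one common way to turn the correlation measure into a distance; the patent does not state the exact conversion. The feature names and values are hypothetical, and no feature may be constant, since the correlation would then be undefined.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def complete_linkage(features, n_clusters):
    """Agglomerative clustering of features with farthest-neighbor linkage.

    Distance between features is assumed to be 1 - Pearson correlation.
    """
    names = list(features)
    dist = {
        (p, q): 1 - pearson(features[p], features[q])
        for p in names for q in names if p != q
    }
    clusters = [[name] for name in names]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Farthest neighbor: cluster distance is the largest pairwise one.
                d = max(dist[(p, q)] for p in clusters[i] for q in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical feature columns, for illustration only.
features = {
    "deaths": [1, 2, 3, 4, 5],
    "injured": [2, 4, 6, 8, 11],
    "attackers": [5, 1, 4, 2, 3],
    "arrested": [5, 2, 4, 1, 3],
}
groups = complete_linkage(features, n_clusters=2)
print(groups)
```

In this toy data the casualty features and the attacker features end up in separate groups, mirroring the kind of grouping described above.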
(3) First feature extraction:
Principal component analysis is applied to each group of data separately for the first feature extraction; each group yields N_i different features.
(4) Feature integration:
The N_i features obtained from the first extraction on each group of data are integrated to obtain the feature set.
(5) Second feature extraction:
The principal component feature analysis comprises a partial correlation test and factor analysis. The partial correlation test checks the partial correlation among the record data of the terrorist attacks. The factor analysis, based on that partial correlation, decorrelates the record data of the terrorist attacks and yields the principal component features main_1, main_2 and main_3.
The principal component features are obtained by factor analysis. First, a partial correlation test is performed on the 4 tested features. Specifically, the invention uses the KMO test and Bartlett's test of sphericity. The higher the correlation of the original data, the better suited the data are to factor analysis. A KMO value closer to 0 indicates weaker correlation among the original variables; a KMO value closer to 1 indicates stronger correlation. For Bartlett's test of sphericity, the significance is what matters: a significance below 0.05 rejects the hypothesis that the variables are uncorrelated, indicating structural validity among the variables and confirming that the original data are suitable for factor analysis. The results are shown in Fig. 2. KMO = 0.793 > 0.5 and the Bartlett significance is 0, below 0.05, which shows significant correlation among the feature variables, so factor analysis is appropriate. The communalities (Fig. 3) reflect the degree of information extraction ((extracted value/initial value)/100) and the information loss (1 - degree of extraction). Comparing the initial values with the extracted values shows the amount of information lost.
To further determine the number of principal component features, the invention performs factor analysis on the original 4 features Main_1, Main_2, Main_3 and Main_4 and obtains the total variance explained table shown in Fig. 4, which contains the initial eigenvalues and variance contribution rates of the 4 features and the eigenvalues and variance contribution rates of the 3 extracted principal components. By the criterion that the eigenvalue be greater than 1, 3 principal components can be extracted. The cumulative variance contribution rate of these 3 principal components reaches 92.911% > 85%, so the extracted main factors are satisfactory and can be used to train the model. The invention further obtains the rotated component matrix of the 4 features, shown in Fig. 5. It shows directly which original features are assigned to the same component and the loading of each original feature on each component.
Factor analysis is then performed on these 4 features, specifically with the dimensionality reduction module. For the target to be achieved, the low-dimensional subspace is required to have maximum separability for the samples, so the invention reduces the dimensionality of the 4 index features to remove multicollinearity among the features.
The main process comprises: standardizing all samples; computing the correlation matrix of the samples; performing eigenvalue decomposition on the correlation matrix; and taking the eigenvectors w_1, w_2, ..., w_d′ corresponding to the d′ largest eigenvalues. The parameter d′ can be obtained by cross validation, or a threshold τ can be set and the smallest d′ satisfying Formula (6) chosen:
$$\frac{\sum_{i=1}^{d'} \lambda_i}{\sum_{j=1}^{d} \lambda_j} \ge \tau \qquad (6)$$
where λ_i and λ_j are eigenvalues and i, j are summation indices, i = 1, 2, ..., d′, j = 1, 2, ..., d. The invention sets the threshold τ = 0.85.
Finally, 3 principal component features main_1, main_2 and main_3 are extracted.
Clearly, the low-dimensional space differs from the original high-dimensional space, because the eigenvectors corresponding to the d - d′ smallest eigenvalues are discarded; this is the result of dimensionality reduction. Discarding this information is nevertheless necessary. On the one hand, it increases the sampling density of the samples, which is the very purpose of dimensionality reduction. On the other hand, it denoises to some extent, because the eigenvectors corresponding to the smallest eigenvalues are often related to noise.
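Choosing the number of retained components d′ by a cumulative-eigenvalue threshold, as described above, can be sketched in a few lines. The eigenvalues below are made-up illustrative numbers, not the values behind Fig. 4, and the function name is hypothetical; only the threshold τ = 0.85 comes from the text.

```python
def choose_components(eigenvalues, tau=0.85):
    """Return the smallest d' whose largest d' eigenvalues account for
    at least a fraction tau of the sum of all eigenvalues."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    for d_prime, lam in enumerate(ordered, start=1):
        running += lam
        if running / total >= tau:
            return d_prime
    return len(ordered)


# Hypothetical eigenvalues of a 4-feature correlation matrix.
eigenvalues = [2.0, 1.0, 0.7, 0.3]
d_prime = choose_components(eigenvalues, tau=0.85)
print(d_prime)
```

With these numbers the first two components reach 75% of the total and the first three reach 92.5%, so three components are kept.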
(6) it improves entropy assessment and determines weight
Objective weights are determined from the variability of each index. In general, the smaller the information entropy E_i of an index, the greater the variation of its values, the more information it provides, the greater its role in the comprehensive evaluation, and the larger its weight. Conversely, the larger the information entropy E_i of an index, the smaller the variation of its values, the less information it provides, the smaller its role in the comprehensive evaluation, and the smaller its weight.
First, the value of each index is standardized: the original data are normalized and uniformly converted to the range 0 to 1. Suppose k indices X_1, X_2, ..., X_k are given, where X_i = {x_i1, x_i2, ..., x_in}, and the value of each index datum after standardization is Y_ij:
$$Y_{ij} = \frac{x_{ij} - \min(X_i)}{\max(X_i) - \min(X_i)}$$
Second, the information entropy of each index is computed. Suppose there are k index features, each with n sample data values. By the definition of information entropy in information theory, the information entropy E_i of a set of data is given by Formula 8:
$$E_i = -\frac{1}{\ln n} \sum_{j=1}^{n} p_{ij} \ln p_{ij} \qquad (8)$$
where
$$p_{ij} = \frac{Y_{ij}}{\sum_{j=1}^{n} Y_{ij}}$$
and, if p_ij = 0, the term is defined by the limit
$$\lim_{p_{ij} \to 0} p_{ij} \ln p_{ij} = 0$$
Then the index weights are determined. From the entropy formula, the information entropies of the indices are E_1, E_2, ..., E_k. The smaller the information entropy of an index, the more information it contains, and vice versa. In general, the smaller the information entropy, the larger the weight. To further strengthen the importance of an index, the precision of the weights can be enhanced. The improved entropy weight method is therefore given by Formula 9.
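For orientation, the following sketch computes weights with the conventional entropy weight method, W_i = (1 - E_i) / (k - ΣE). This is explicitly not the improved Formula 9 of the patent, which is not reproduced in this text; it only illustrates the entropy calculation and the principle that lower entropy yields a larger weight. The input columns are hypothetical, and the sketch assumes each column is already normalized to [0, 1], has at least two samples, and has a positive sum.

```python
from math import log


def entropy_weights(columns):
    """Conventional entropy weights for k index columns of n normalized values.

    Assumes every column has n >= 2 values in [0, 1] with a positive sum,
    and that not all entropies equal 1.
    """
    entropies = []
    for col in columns:
        total = sum(col)
        entropy = 0.0
        for y in col:
            p = y / total
            if p > 0:  # p * ln(p) is taken as 0 when p == 0
                entropy -= p * log(p)
        entropies.append(entropy / log(len(col)))
    k = len(entropies)
    denom = k - sum(entropies)
    return [(1 - e) / denom for e in entropies]


# Hypothetical normalized columns: the first is uniform (entropy 1),
# the second is concentrated in one sample (entropy 0).
columns = [[1.0, 1.0, 1.0, 1.0], [1.0, 0.0, 0.0, 0.0]]
w = entropy_weights(columns)
print(w)
```

The uniform column carries no discriminating information and receives a weight near 0, while the concentrated column receives a weight near 1.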
Finally, each event is scored. The three relevant features chosen are: region, attack type and property loss amount. Let Z_l be the final score of the l-th event, computed as the weighted sum of its index feature values:
$$Z_l = \sum_{i=1}^{k} W_i Y_{il}$$
The score distribution histogram is shown in Fig. 6. Three local minimum points, n1, n2 and n3, are found in the distribution histogram, so events can be divided into five levels. The grading ranges are shown in Table 1.
Table 1. Grading range index
Level      Score range
Level 1    0
Level 2    0 to n1
Level 3    n1 to n2
Level 4    n2 to n3
Level 5    above n3
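Assigning a level from Table 1 amounts to a lookup against the three cut points. In the sketch below the cut points are hypothetical stand-ins for n1, n2 and n3, and a score exactly equal to a cut point is assigned to the lower level, a boundary convention the table does not specify.

```python
from bisect import bisect_left


def grade(score, cuts):
    """Return the level (1 to 5) for a score, given sorted cut points [n1, n2, n3]."""
    if score == 0:
        return 1
    # bisect_left counts the cut points strictly below the score.
    return 2 + bisect_left(cuts, score)


# Hypothetical cut points standing in for n1, n2, n3.
cuts = [0.2, 0.5, 0.8]
levels = [grade(s, cuts) for s in (0, 0.1, 0.2, 0.3, 0.6, 0.9)]
print(levels)
```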
(7) Method validation
Verification with "high-score events" shows that the high-score instances essentially all fall within the top 10% of scores, which indicates that the model is essentially effective.
The above are merely preferred embodiments of the application and do not limit the application. For those skilled in the art, various modifications and changes to the application are possible. Any modification, equivalent replacement or improvement made within the spirit and principles of the application shall fall within the scope of protection of the application.

Claims (8)

1. A data-mining-based event risk early-warning method, characterized by comprising:
Data acquisition step: acquire the record data of historical terrorist attacks and of the event under test. Each event has a unique number. The record data comprise: region, attack type, property loss amount, total injuries, total deaths, number of attackers, number of attackers arrested, number of attacker deaths, event summary, hostage kidnapping outcome, or event resolution date;
Data preprocessing step: preprocess the record data of the historical terrorist attacks and of the event under test;
Data classification step: classify the preprocessed data with a clustering algorithm into several classes of data;
First feature extraction step: extract several features from each class of data using principal component analysis;
Feature integration step: integrate all features extracted from all classes of data to obtain a feature set;
Second feature extraction step: extract several index features from the feature set using principal component analysis;
Feature weight acquisition step: calculate the weight of each index feature using the improved entropy weight method;
Risk early-warning step: for each event, perform a weighted calculation on the values of its index features using the corresponding weights; sort the results in descending order; output the rank of the event under test among all events as the result; and issue an early-warning signal if the rank is less than a set threshold.
2. the method as described in claim 1, characterized in that the clustering algorithm is using system clustering algorithm.
3. The method according to claim 1, characterized in that the weight W_i of each index feature is calculated using the improved entropy weight method as follows:
Suppose k index features X_1, X_2, …, X_k are given, where X_i = {x_1, x_2, …, x_n}, and x_1, …, x_n are the sample data values of the index feature for the n different samples;
Let Y_ij denote the value of a sample data value of index feature X_i after standardization:
where min(X_i) denotes the minimum of the sample data values of X_i, and max(X_i) denotes the maximum of the sample data values of X_i;
Next, compute the information entropy E_j of each index feature, j = 1, 2, …, k, given that there are k index features and each index feature has n sample data values;
where, if p_ij = 0, then define
According to the information entropy formula, the information entropies of the k index features are computed as E_1, E_2, …, E_k; the weight W_i of each index feature is then determined:
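For orientation, the conventional entropy weight method can be sketched in Python as below. This is an assumption-laden illustration: it implements the textbook form (min-max standardization, proportions p, Shannon entropy scaled by ln n, weights proportional to 1 − E), which may differ from the improved variant claimed here. The function name `entropy_weights` and the handling of constant columns are hypothetical choices.

```python
import math
from typing import List


def entropy_weights(columns: List[List[float]]) -> List[float]:
    """Conventional entropy weight method (not the patent's improved variant).

    columns[j] holds the n sample data values of index feature j.
    Returns one weight per index feature; the weights sum to 1.
    """
    n = len(columns[0])
    if n < 2:
        raise ValueError("at least two samples are needed")

    entropies = []
    for col in columns:
        lo, hi = min(col), max(col)
        span = hi - lo
        # Min-max standardization: Y = (x - min) / (max - min).
        y = [(x - lo) / span if span else 0.0 for x in col]
        total = sum(y)
        if total == 0:
            # Assumption: a constant column carries no information,
            # so it is given the maximum entropy of 1.
            entropies.append(1.0)
            continue
        p = [v / total for v in y]
        # Terms with p = 0 are skipped, i.e. p * ln(p) is taken as 0.
        e = -sum(v * math.log(v) for v in p if v > 0) / math.log(n)
        entropies.append(e)

    k = len(columns)
    denom = sum(1 - e for e in entropies)
    if denom == 0:
        # Assumption: fall back to equal weights if no column is informative.
        return [1.0 / k] * k
    return [(1 - e) / denom for e in entropies]
```

In this form, an index feature whose values are concentrated in a few samples has low entropy and therefore receives a larger weight, while a feature that is constant across samples receives a weight of zero.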
4. The method according to claim 1, characterized in that the data preprocessing step comprises: a data screening sub-step, a data filling sub-step, a data conversion sub-step, and a data normalization sub-step;
The data screening sub-step removes the event summary, the hostage kidnapping outcome, and the event resolution date;
The data filling sub-step fills missing values in the records of the number of attackers, the number of attackers arrested, the total number of injured, the total number of deaths, the number of attacker deaths, and the property loss amount of a terrorist attack; unknown data are filled with zero;
The data conversion sub-step converts the region where the terrorist attack occurred and the attack type from text data into numerical data;
The data normalization sub-step applies a max-min normalization algorithm to the data that have undergone data screening, data filling, and data conversion; from the normalized data, an N×1 matrix is built for each event, where N is the number of data items and each element of the matrix is the normalized value of the corresponding item of record data.
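The filling and normalization sub-steps can be sketched in Python as follows. The sketch is illustrative only: the field names passed in `fields` and the function names `fill_missing` and `min_max_normalize` are hypothetical, screening is modeled simply by leaving the removed fields out of `fields`, and a column whose values are all equal is mapped to zero as an assumption.

```python
from typing import Dict, List, Optional


def fill_missing(record: Dict[str, Optional[float]], fields: List[str]) -> List[float]:
    """Pick the listed fields from a record, filling missing or None values with 0."""
    return [float(record.get(field) or 0.0) for field in fields]


def min_max_normalize(rows: List[List[float]]) -> List[List[float]]:
    """Min-max normalize each column of rows to the range [0, 1].

    Each returned row is the N-element vector (N x 1 matrix) of one event.
    """
    normalized_columns = []
    for col in zip(*rows):
        lo, hi = min(col), max(col)
        span = hi - lo
        # Assumption: a constant column is mapped to 0 to avoid dividing by zero.
        normalized_columns.append([(v - lo) / span if span else 0.0 for v in col])
    return [list(row) for row in zip(*normalized_columns)]
```

A typical use under these assumptions is to call `fill_missing` on every event record with the same `fields` list and then pass the resulting rows to `min_max_normalize`, so that every event ends up with a vector of the same length.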
5. The method according to claim 4, characterized in that
the step of converting the region text data into numerical data is: for each region, sum the total number of deaths and the number of attackers of the events in that region; sort the sums in descending order; after sorting, assign a numeric score to each region in turn following the descending order, with the scores decreasing successively.
6. The method according to claim 4, characterized in that
the step of converting the attack type text data into numerical data is: for each attack type, sum the total number of deaths and the number of attackers of the events of that type; sort the sums in descending order; after sorting, assign a numeric score to each attack type in turn following the descending order, with the scores decreasing successively.
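The text-to-number conversion described for regions and attack types can be sketched in Python as follows. This is a hedged illustration: the function name `rank_encode` and the field names `killed` and `attackers` are hypothetical, and because the text does not state the starting score, the sketch assumes the top category receives a score equal to the number of categories and each following category receives one less.

```python
from collections import defaultdict
from typing import Dict, List


def rank_encode(events: List[dict], key: str) -> Dict[str, int]:
    """Map each category of events[key] to a numeric score.

    For every category, the total deaths and attacker counts of its events
    are summed; categories are sorted by that sum in descending order and
    scored so that the largest sum gets the highest score.
    """
    totals = defaultdict(float)
    for event in events:
        totals[event[key]] += event["killed"] + event["attackers"]

    ordered = sorted(totals, key=lambda category: totals[category], reverse=True)
    top_score = len(ordered)  # assumption: scores run from len(ordered) down to 1
    return {category: top_score - i for i, category in enumerate(ordered)}
```

The same function covers both conversions: pass `key="region"` for the region field and, for example, `key="attack_type"` for the attack type field, provided the event records use those (assumed) field names.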
7. An electronic device, characterized by comprising: a memory, a processor, and computer instructions stored in the memory and run on the processor, wherein the computer instructions, when run by the processor, perform the steps of the method according to any one of claims 1-6.
8. A computer-readable storage medium, characterized in that computer instructions are stored thereon, and the computer instructions, when run by a processor, perform the steps of the method according to any one of claims 1-6.
CN201811431329.8A 2018-11-26 2018-11-26 Event risk method for early warning, electronic equipment and medium based on data mining Pending CN109685321A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811431329.8A CN109685321A (en) 2018-11-26 2018-11-26 Event risk method for early warning, electronic equipment and medium based on data mining

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811431329.8A CN109685321A (en) 2018-11-26 2018-11-26 Event risk method for early warning, electronic equipment and medium based on data mining

Publications (1)

Publication Number Publication Date
CN109685321A true CN109685321A (en) 2019-04-26

Family

ID=66185619

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811431329.8A Pending CN109685321A (en) 2018-11-26 2018-11-26 Event risk method for early warning, electronic equipment and medium based on data mining

Country Status (1)

Country Link
CN (1) CN109685321A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101673280A (en) * 2009-07-20 2010-03-17 浙江大学 Method for determining terror attack organization based on feature mining of terror attack event
CN105956982A (en) * 2016-05-04 2016-09-21 江苏大学 Method of predicting act of terror based on background change
CN106570767A (en) * 2016-10-26 2017-04-19 中国农业科学院农业质量标准与检测技术研究所 Monitoring data statistics analysis method and device in risk monitoring information system
CN106776884A (en) * 2016-11-30 2017-05-31 江苏大学 A kind of act of terrorism Forecasting Methodology that multi-categorizer is combined based on multi-tag
CN108776817A (en) * 2018-06-04 2018-11-09 孟玺 The type prediction method and system of the attack of terrorism

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110348510A (en) * 2019-07-08 2019-10-18 中国海洋石油集团有限公司 A kind of data preprocessing method based on deep water hydrocarbon drilling process conditions of the current stage
CN110348510B (en) * 2019-07-08 2021-08-03 中国海洋石油集团有限公司 Data preprocessing method based on staged characteristics of deepwater oil and gas drilling process
CN112465533A (en) * 2019-09-09 2021-03-09 ***通信集团河北有限公司 Intelligent product selection method and device and computing equipment
CN112907035A (en) * 2021-01-27 2021-06-04 厦门卫星定位应用股份有限公司 K-means-based transportation subject credit rating method and device
CN112907035B (en) * 2021-01-27 2022-08-05 厦门卫星定位应用股份有限公司 K-means-based transportation subject credit rating method and device
CN113537691A (en) * 2021-05-09 2021-10-22 武汉兴得科技有限公司 Big data public health event emergency command method and system
CN116596353A (en) * 2022-09-29 2023-08-15 中国人民解放军空军工程大学 Quantitative analysis method for terrorist attack event record data
CN116596353B (en) * 2022-09-29 2024-06-04 中国人民解放军空军工程大学 Quantitative analysis method for terrorist attack event record data

Similar Documents

Publication Publication Date Title
CN109685321A (en) Event risk method for early warning, electronic equipment and medium based on data mining
Sun et al. Predicting public procurement irregularity: An application of neural networks
CN109409677A (en) Enterprise Credit Risk Evaluation method, apparatus, equipment and storage medium
CN102955902B (en) Method and system for evaluating reliability of radar simulation equipment
CN104321794B (en) A kind of system and method that the following commercial viability of an entity is determined using multidimensional grading
CN109657011A (en) A kind of data digging method and system screening attack of terrorism criminal gang
CN109446812A (en) A kind of embedded system firmware safety analytical method and system
CN112132233A (en) Criminal personnel dangerous behavior prediction method and system based on effective influence factors
CN110309863A (en) Evaluation method that a kind of identity based on analytic hierarchy process (AHP) and grey correlation analysis is credible
CN102880631A (en) Chinese author identification method based on double-layer classification model, and device for realizing Chinese author identification method
AU2019101158A4 (en) A method of analyzing customer churn of credit cards by using logistics regression
Chen et al. Research on data mining combination model analysis and performance prediction based on students’ behavior characteristics
CN114358014A (en) Work order intelligent diagnosis method, device, equipment and medium based on natural language
CN109582743A (en) A kind of data digging method for the attack of terrorism
Ergu et al. Predicting personality with twitter data and machine learning models
CN116340815A (en) University abnormal behavior student identification method based on convolutional neural network
CN109214598A (en) Batch ranking method based on K-MEANS and ARIMA model prediction residential quarters collateral risk
Işık et al. Detection of fraudulent transactions using artificial neural networks and decision tree methods
Zhu et al. Research on data mining of college students’ physical health for physical education reform
CN114862531A (en) Enterprise financial risk early warning method and system based on deep learning
CN113920366A (en) Comprehensive weighted main data identification method based on machine learning
Zhao et al. An intelligent evaluation method to analyze the competitiveness of airlines
Cui et al. Using PCA and ANN to identify significant factors and modeling customer satisfaction for the complex service processes
CN110209953A (en) A kind of calculation method towards uncertain social computing problem
CN108629507A (en) A kind of enterprise credit management system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 2019-04-26