CN104636492B - Dynamic data grading method based on fuzzy integral feature fusion - Google Patents

Publication number
CN104636492B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201510095450.8A
Other languages
Chinese (zh)
Other versions
CN104636492A (en)
Inventor
赵雅倩
陈继承
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Electronic Information Industry Co Ltd
Original Assignee
Inspur Electronic Information Industry Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2015-03-04
Filing date
2015-03-04
Publication date
2017-12-05
Application filed by Inspur Electronic Information Industry Co Ltd
Priority to CN201510095450.8A
Publication of CN104636492A
Application granted
Publication of CN104636492B
Current legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/285 Clustering or classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/282 Hierarchical databases, e.g. IMS, LDAP data stores or Lotus Notes

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a dynamic data grading method based on fuzzy integral feature fusion, belonging to the technical field of computer storage. The method comprises the following steps: first, data features are extracted from the training set data according to the application and storage characteristics of the data, forming an initial data feature set; second, the data features are fused; third, the fused data features are reduced; fourth, a data grading model is generated; fifth, data are mapped to storage levels. The invention improves the accuracy of data grading. It takes full account of the correlations among data features, uses fuzzy integrals to perform feature fusion, and establishes a more reasonable data grading model that is suitable for storage-level decisions in various kinds of dynamic data management, improving the processing speed of data grading and the storage efficiency.

Description

A dynamic data grading method based on fuzzy integral feature fusion
Technical field
The present invention discloses a dynamic data grading method and belongs to the technical field of computer storage; specifically, it is a dynamic data grading method based on fuzzy integral feature fusion.
Background art
With the arrival of the era of big data and cloud storage, cloud data centers have developed rapidly, making high-performance, low-cost intelligent data management a research hotspot. Complex application environments give data temporal and spatial characteristics, make data access and processing complex, and diversify storage access requirements. Dynamic data of all kinds therefore need to be graded and processed in tiers so that application demands are mapped reasonably onto storage resources and the cost-effectiveness of storage devices improves. For example, a data grading model divides data into hot data and cold data: hot data are placed on storage devices with better performance to improve access performance, while cold data, which are rarely accessed, are placed on low-speed devices to reduce storage cost.
Dynamic data grading is essentially a classification problem and mostly uses supervised classification, that is, data are classified by a pre-trained grading model. The dynamic data grading model is therefore the key element of data grading and tiered storage. The grading rules of existing data grading models are mostly linear combinations of the individual features. However, the access features of dynamic data are correlated with one another rather than pairwise independent, and a simple linear relationship cannot describe these correlations accurately, which affects the storage and use of the data. To address this problem, the present invention proposes an intelligent dynamic data grading method based on fuzzy integral feature fusion. To improve the accuracy of data grading, it takes full account of the correlations among data features, uses fuzzy integrals to perform feature fusion, and establishes a more reasonable data grading model that is suitable for storage-level decisions in various kinds of dynamic data management, improving the processing speed of data grading and the storage efficiency.
Summary of the invention
The access features of dynamic data are correlated with one another rather than pairwise independent, and a simple linear relationship cannot describe these correlations accurately, which affects the storage and use of the data. To address this problem, the present invention provides a dynamic data grading method based on fuzzy integral feature fusion. It uses fuzzy integrals to perform feature fusion and establishes a more reasonable data grading model that is suitable for storage-level decisions in various kinds of dynamic data management, improving the processing speed of data grading and the storage efficiency.
The specific scheme proposed by the present invention is as follows:
A dynamic data grading method based on fuzzy integral feature fusion, with the following specific steps:
1. Data feature extraction: extract data features from the training set data according to the application and storage characteristics of the data, forming an initial data feature set;
2. Data feature fusion: compute the fuzzy measure of each combination of data features and fuse the features using a fuzzy integral, obtaining a feature fusion computation model and a new data feature vector;
3. Reduction of the fused data features: perform feature reduction on the new data feature vector and select the optimal feature subset;
4. Data grading model generation: perform classification training on the selected optimal feature subset to generate a data classification model; from the obtained feature fusion computation model and the optimal-feature-subset classification model, derive the optimal fused feature vector of the data to be graded;
5. Data storage level mapping: determine the data category from the data classification model and the optimal fused feature vector of the data to be graded, and establish the mapping between the data to be classified and the storage levels.
The data feature extraction is carried out manually or by machine. The original features are reduced in dimension by mapping or transformation into new features of lower dimension than the original features, forming the initial data feature set.
The specific process of the data feature fusion is as follows. Compute the fuzzy measure of the feature set to be fused: features that may be correlated are selected by manual specification to form the feature set to be fused, which reduces the computational dimension. Compute the fuzzy integral of each feature combination: based on the obtained fuzzy measure, the fuzzy integral is computed using the Choquet integral or the Sugeno integral. Determine the new data feature vector: feature combinations whose fuzzy integral value is too low are discarded according to a specified threshold, and the remaining feature combinations are combined with the reserved, mutually independent single features to form the new feature vector.
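The fusion procedure described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the feature names (`frequency`, `recency`, `size`), the fuzzy measure values, and the function names are hypothetical, and the rule of discarding a feature group when its mean Choquet value over the training samples falls below the threshold is one possible reading of "discarded according to a specified threshold".

```python
def choquet_integral(scores, measure):
    """Choquet integral of feature scores with respect to a fuzzy measure.

    scores:  dict mapping feature name -> score in [0, 1]
    measure: dict mapping frozenset of feature names -> value in [0, 1],
             monotone, with the full feature set mapped to 1.0
    """
    order = sorted(scores, key=scores.get)  # ascending by score
    total, previous = 0.0, 0.0
    for i, name in enumerate(order):
        upper = frozenset(order[i:])  # features scoring at least scores[name]
        total += (scores[name] - previous) * measure[upper]
        previous = scores[name]
    return total


def sugeno_integral(scores, measure):
    """Sugeno integral: max over i of min(i-th smallest score, measure of upper set)."""
    order = sorted(scores, key=scores.get)
    return max(
        min(scores[name], measure[frozenset(order[i:])])
        for i, name in enumerate(order)
    )


def fuse(samples, groups, measures, independent, threshold):
    """Build fused feature vectors for all samples.

    Assumption: a group is dropped when its mean Choquet value over the
    training samples is below `threshold`; the source only states that
    combinations with a too-low fuzzy integral value are discarded.
    """
    fused = {g: [choquet_integral({f: s[f] for f in members}, measures[g])
                 for s in samples]
             for g, members in groups.items()}
    kept = [g for g in groups if sum(fused[g]) / len(samples) >= threshold]
    names = kept + list(independent)
    vectors = [[fused[g][k] for g in kept] + [s[f] for f in independent]
               for k, s in enumerate(samples)]
    return names, vectors


# Hypothetical access features, scaled to [0, 1].
measure = {
    frozenset(): 0.0,
    frozenset({"frequency"}): 0.5,
    frozenset({"recency"}): 0.3,
    frozenset({"frequency", "recency"}): 1.0,
}
sample = {"frequency": 0.8, "recency": 0.4, "size": 0.1}
names, vectors = fuse([sample], {"access": ["frequency", "recency"]},
                      {"access": measure}, ["size"], threshold=0.2)
print(names, vectors)
```

In this sketch the correlated features `frequency` and `recency` are merged into one fused value, while `size` is carried over unchanged as a reserved independent feature.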
The method for reducing the fused data features may be principal component analysis (PCA), independent component analysis (ICA), linear discriminant analysis (LDA), or local feature analysis (LFA).
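Of the reduction methods listed, PCA is the simplest to illustrate. The sketch below finds the first principal component by power iteration in plain Python and projects each fused feature vector onto it. The function names and the toy data are hypothetical, and power iteration is chosen here only to keep the example dependency-free; a numerical library routine would normally be used instead.

```python
import math


def first_principal_component(rows, iterations=100):
    """Return (means, unit direction of largest variance) for the given rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix, d x d.
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    # Power iteration: repeated multiplication by cov converges to the
    # dominant eigenvector (assumes the start vector is not orthogonal to it).
    vec = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        nxt = [sum(cov[a][b] * vec[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(v * v for v in nxt))
        if norm == 0.0:  # cov maps vec to zero; stop iterating
            break
        vec = [v / norm for v in nxt]
    return means, vec


def project(row, means, direction):
    """Coordinate of one fused feature vector along the principal direction."""
    return sum((x - m) * w for x, m, w in zip(row, means, direction))


# Toy fused vectors whose second feature is exactly twice the first.
rows = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
means, direction = first_principal_component(rows)
reduced = [project(r, means, direction) for r in rows]
```

Because the two toy features are perfectly correlated, a single coordinate per sample preserves all of the variance, which is the situation feature reduction is meant to exploit.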
The beneficial effects of the present invention are as follows: the present invention proposes an intelligent dynamic data grading method based on fuzzy integral feature fusion. To improve the accuracy of data grading, it takes full account of the correlations among data features, uses fuzzy integrals to perform feature fusion, and establishes a more reasonable data grading model that is suitable for storage-level decisions in various kinds of dynamic data management, improving the processing speed of data grading and the storage efficiency.
Brief description of the drawings
Fig. 1 is a schematic flow chart of the present invention.
Detailed description
The present invention is further described below.
A fuzzy measure is a monotone, normalized set function. It replaces the additivity of a probability measure with the weaker condition of monotonicity and can be regarded as a generalization of the probability measure. A fuzzy integral is a nonlinear function defined on the basis of a fuzzy measure and is able to fuse multiple sources of information. The commonly used fuzzy integrals are the Choquet integral and the Sugeno integral.
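For reference, the standard textbook definitions of these notions can be written as follows. The notation (X for the finite feature set, μ for the fuzzy measure, f for the feature scores with values in [0, 1]) does not appear in the patent text and is introduced here for illustration only.

```latex
% Fuzzy measure on a finite feature set X = {x_1, ..., x_n}
\mu : 2^{X} \to [0,1], \qquad
\mu(\emptyset) = 0, \quad \mu(X) = 1, \quad
A \subseteq B \;\Rightarrow\; \mu(A) \le \mu(B)

% Sort the scores so that f(x_{(1)}) \le \dots \le f(x_{(n)}),
% set f(x_{(0)}) = 0 and A_{(i)} = \{x_{(i)}, \dots, x_{(n)}\}.

% Choquet integral
C_{\mu}(f) = \sum_{i=1}^{n} \bigl( f(x_{(i)}) - f(x_{(i-1)}) \bigr)\, \mu\bigl(A_{(i)}\bigr)

% Sugeno integral
S_{\mu}(f) = \max_{1 \le i \le n} \min\bigl( f(x_{(i)}),\; \mu(A_{(i)}) \bigr)
```

When μ happens to be additive, the Choquet integral reduces to an ordinary weighted sum of the scores, so the linear combination used by existing grading models is the special case in which no interaction among features is modeled.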
A dynamic data grading method based on fuzzy integral feature fusion, with the following specific steps:
1. Data feature extraction: extract data features from the training set data according to the application and storage characteristics of the data, forming an initial data feature set. Data access and storage features are mostly extracted manually by experts on the basis of relevant experience. To improve the usability and accuracy of the data features, several experts may extract them jointly. On the other hand, the emergence of deep networks has made automatic extraction of data features possible. This stage may therefore also use a deep network or another machine learning method to extract data features automatically, improving the degree of automation of data grading and even its accuracy. Because the extracted data features are analyzed and processed further in subsequent steps, the features extracted at this stage may be somewhat fine-grained.
2. Data feature fusion: compute the fuzzy measure of each combination of data features and fuse the features using a fuzzy integral, obtaining a feature fusion computation model and a new data feature vector. The specific fusion process is as follows. Compute the fuzzy measure of the feature set to be fused: features that may be correlated are selected by manual specification to form the feature set to be fused, which reduces the computational dimension. Compute the fuzzy integral of each feature combination: based on the obtained fuzzy measure, the fuzzy integral is computed using the Choquet integral or the Sugeno integral. Determine the new data feature vector: feature combinations whose fuzzy integral value is too low are discarded according to a specified threshold, and the remaining feature combinations are combined with the reserved, mutually independent single features to form the new feature vector.
3. Reduction of the fused data features: perform feature reduction on the new data feature vector and select the optimal feature subset. The feature reduction method may be principal component analysis (PCA), independent component analysis (ICA), linear discriminant analysis (LDA), or local feature analysis (LFA). Machine learning methods such as rough sets may also be used. Because the Choquet integral inherently has a coarsening property, the reduction applied to a new feature vector obtained by Choquet integral fusion is mainly a reduction of the independent features.
4. Data classification model generation: perform classification training on the selected optimal feature subset to generate a data classification model; from the obtained feature fusion computation model and the optimal-feature-subset classification model, derive the optimal fused feature vector of the data to be graded. Classification training is performed on the training set based on the optimal feature subset to generate the data grading model. The trained model may be any classification model, including supervised classification, such as a decision tree or a neural network, or unsupervised classification, such as clustering.
5. Data storage level mapping: determine the data category from the data classification model and the fused feature vector of the data to be graded, and establish the mapping between the data to be classified and the storage levels.
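Steps 4 and 5 can be sketched as follows. The description leaves the choice of classifier open, so a nearest-centroid classifier is used here purely as a stand-in; the grade labels `hot` and `cold` follow the example in the background section, while the storage level names, function names, and training vectors are hypothetical.

```python
import math


def train_centroids(vectors, grades):
    """Average the training vectors of each grade (nearest-centroid model)."""
    sums, counts = {}, {}
    for vec, grade in zip(vectors, grades):
        acc = sums.setdefault(grade, [0.0] * len(vec))
        for j, value in enumerate(vec):
            acc[j] += value
        counts[grade] = counts.get(grade, 0) + 1
    return {g: [s / counts[g] for s in acc] for g, acc in sums.items()}


def grade_of(vec, centroids):
    """Grade whose centroid is closest to the fused feature vector."""
    return min(centroids, key=lambda g: math.dist(vec, centroids[g]))


# Hypothetical mapping from data grade to storage level.
STORAGE_LEVEL = {"hot": "high-performance device", "cold": "low-speed device"}


def storage_level_of(vec, centroids):
    """Map a fused feature vector to a storage level via its predicted grade."""
    return STORAGE_LEVEL[grade_of(vec, centroids)]


# Toy fused feature vectors after reduction, with known grades.
train_vectors = [[0.9, 0.8], [0.8, 0.9], [0.1, 0.2], [0.2, 0.1]]
train_grades = ["hot", "hot", "cold", "cold"]
centroids = train_centroids(train_vectors, train_grades)
print(storage_level_of([0.7, 0.75], centroids))
```

Keeping the grade-to-level table separate from the classifier means the storage hierarchy can change without retraining the grading model.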

Claims (4)

  1. A dynamic data grading method based on fuzzy integral feature fusion, characterized by the following specific steps:
    1. Data feature extraction: extract data features from the training set data according to the application and storage characteristics of the data, forming an initial data feature set;
    2. Data feature fusion: compute the fuzzy measure of each combination of data features and fuse the features using a fuzzy integral, obtaining a feature fusion computation model and a new data feature vector;
    3. Reduction of the fused data features: perform feature reduction on the new data feature vector and select the optimal feature subset;
    4. Data grading model generation: perform classification training on the selected optimal feature subset to generate a data classification model; from the obtained feature fusion computation model and the optimal-feature-subset classification model, derive the optimal fused feature vector of the data to be graded;
    5. Data storage level mapping: determine the data category from the data classification model and the optimal fused feature vector of the data to be graded, and establish the mapping between the data to be classified and the storage levels.
  2. The dynamic data grading method based on fuzzy integral feature fusion according to claim 1, characterized in that the data feature extraction is carried out manually or by machine, and the original features are reduced in dimension by mapping or transformation into new features of lower dimension than the original features, forming the initial data feature set.
  3. The dynamic data grading method based on fuzzy integral feature fusion according to claim 1 or 2, characterized in that the specific process of the data feature fusion is: compute the fuzzy measure of the feature set to be fused: features that may be correlated are selected by manual specification to form the feature set to be fused, reducing the computational dimension; compute the fuzzy integral of each feature combination: based on the obtained fuzzy measure, the fuzzy integral is computed using the Choquet integral or the Sugeno integral; determine the new data feature vector: feature combinations whose fuzzy integral value is too low are discarded according to a specified threshold, and the remaining feature combinations are combined with the reserved, mutually independent single features to form the new feature vector.
  4. The dynamic data grading method based on fuzzy integral feature fusion according to claim 3, characterized in that the method for reducing the fused data features may be principal component analysis (PCA), independent component analysis (ICA), linear discriminant analysis (LDA), or local feature analysis (LFA).
CN201510095450.8A 2015-03-04 2015-03-04 Dynamic data grading method based on fuzzy integral feature fusion Active CN104636492B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510095450.8A CN104636492B (en) 2015-03-04 2015-03-04 Dynamic data grading method based on fuzzy integral feature fusion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510095450.8A CN104636492B (en) 2015-03-04 2015-03-04 Dynamic data grading method based on fuzzy integral feature fusion

Publications (2)

Publication Number Publication Date
CN104636492A CN104636492A (en) 2015-05-20
CN104636492B true CN104636492B (en) 2017-12-05

Family

ID=53215238

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510095450.8A Active CN104636492B (en) 2015-03-04 2015-03-04 Dynamic data grading method based on fuzzy integral feature fusion

Country Status (1)

Country Link
CN (1) CN104636492B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105303324B (en) * 2015-11-10 2019-11-19 中国建设银行股份有限公司 A kind of information system parameter management method and device
CN107426315B (en) * 2017-07-24 2020-07-31 南京邮电大学 Distributed cache system Memcached improvement method based on BP neural network
CN109214514A (en) * 2018-08-14 2019-01-15 浪潮通用软件有限公司 A kind of data analysing method based on Rough Set
CN111166294B (en) * 2020-01-29 2021-09-14 北京交通大学 Automatic sleep apnea detection method and device based on inter-heartbeat period
CN117909507B (en) * 2024-03-19 2024-05-17 金盾检测技术股份有限公司 AI-based data classification system

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101872424A (en) * 2010-07-01 2010-10-27 重庆大学 Facial expression recognizing method based on Gabor transform optimal channel blur fusion

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
《基于多分类器多模糊积分的信息融合方法》;段宝彬,孙梅兰;《 重庆科技学院学报(自然科学版)》;20080630;第10卷(第3期);第87-89页 *
《基于模糊积分的多分类器融合方法研究》;赵志伟;《中国优秀硕士学位论文全文数据库》;20091215;第三章 *

Also Published As

Publication number Publication date
CN104636492A (en) 2015-05-20

Similar Documents

Publication Publication Date Title
CN104636492B (en) Dynamic data grading method based on fuzzy integral feature fusion
CN106372648A (en) Multi-feature-fusion-convolutional-neural-network-based plankton image classification method
CN106503148B (en) A kind of table entity link method based on multiple knowledge base
CN106909643A (en) The social media big data motif discovery method of knowledge based collection of illustrative plates
CN106709754A (en) Power user grouping method based on text mining
CN104090882B (en) A kind of quick clustering method of advertisement order and system, server
CN103902988B (en) A kind of sketch shape matching method based on Modular products figure with Clique
CN110322453A (en) 3D point cloud semantic segmentation method based on position attention and auxiliary network
CN103186538A (en) Image classification method, image classification device, image retrieval method and image retrieval device
CN104036255A (en) Facial expression recognition method
CN105654196A (en) Adaptive load prediction selection method based on electric power big data
CN107562947A (en) A kind of Mobile Space-time perceives the lower dynamic method for establishing model of recommendation service immediately
CN103942606A (en) Residential electricity consumption customer segmentation method based on fruit fly intelligent optimization algorithm
CN105678590A (en) topN recommendation method for social network based on cloud model
CN109657063A (en) A kind of processing method and storage medium of magnanimity environment-protection artificial reported event data
CN112528639B (en) Object recognition method and device, storage medium and electronic equipment
CN104915354A (en) Multimedia file pushing method and device
CN101398846A (en) Image, semantic and concept detection method based on partial color space characteristic
CN105808665A (en) Novel hand-drawn sketch based image retrieval method
CN104216993A (en) Tag-co-occurred tag clustering method
CN109783805A (en) A kind of network community user recognition methods and device
CN104008177A (en) Method and system for rule base structure optimization and generation facing image semantic annotation
CN104765852B (en) Data digging method based on fuzzy algorithmic approach under big data background
CN104361135A (en) Image retrieval method
CN103440651A (en) Multi-label image annotation result fusion method based on rank minimization

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant