CN109614396A - A cleaning method for structuring and standardizing address data - Google Patents

A cleaning method for structuring and standardizing address data

Info

Publication number
CN109614396A
Authority
CN
China
Prior art keywords
address
cleaning
classification
standardization
data structure
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811543929.3A
Other languages
Chinese (zh)
Inventor
宋才华
郑爱武
蓝源娟
王永才
吴丽贤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Power Grid Co Ltd
Foshan Power Supply Bureau of Guangdong Power Grid Corp
Original Assignee
Guangdong Power Grid Co Ltd
Foshan Power Supply Bureau of Guangdong Power Grid Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Power Grid Co Ltd and Foshan Power Supply Bureau of Guangdong Power Grid Corp
Priority to CN201811543929.3A
Publication of CN109614396A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Energy or water supply
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Business, Economics & Management (AREA)
  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Economics (AREA)
  • Public Health (AREA)
  • Water Supply & Treatment (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present invention relates to a cleaning method for structuring and standardizing address data, comprising the following steps: S1: initialize the original address text; S2: parse the original address hierarchically; S3: match the hierarchically parsed address data against a base address dictionary; S4: judge whether the matching degree meets the requirement; hierarchically parsed address data whose matching degree meets the requirement is added to the base address dictionary as a cleaning result, and data that does not meet the requirement is returned to S2 for the next cleaning cycle; S5: comprehensively evaluate the cleaning results with a similarity and consistency evaluation algorithm to confirm their accuracy and validity. The invention can effectively improve the completeness and accuracy of customer electricity-use addresses. It plays an important role in improving the accuracy with which a user's fault-report address is located, speeding up emergency repair response, sending notification messages to users in areas affected by power outages, and understanding regional power load demand.

Description

A cleaning method for structuring and standardizing address data
Technical field
The present invention relates to the field of address data structuring and standardization, and more particularly to a cleaning method for structuring and standardizing address data.
Background art
With urban construction changing rapidly, many streets and communities are being re-planned and rebuilt, so a growing share of the customer electricity-use address data in the marketing systems of power supply enterprises no longer matches the real addresses. In addition, for historical reasons the existing customer electricity-use address data contains many errors, inconsistent names and incomplete entries, for example meter numbers recorded as addresses, or communities and buildings without standard names. Furthermore, because the stored customer electricity-use address data is unstructured, the rules used to fill in customer addresses differ between communities, and even between development phases or buildings of the same community. These problems seriously degrade the quality of customer service and emergency repair work, and also seriously affect the construction of the analysis and decision support systems that rely on address data.
Summary of the invention
To overcome the defects of the prior art described above, namely incomplete and inaccurate customer electricity-use addresses, low accuracy in locating a user's fault-report address, slow emergency repair response, and the inability to send notification messages to users in areas affected by power outages, the present invention provides a cleaning method for structuring and standardizing address data.
The method comprises the following steps:
S1: obtain the original electricity-use address data of the power supply enterprise's existing customers, and initialize it;
S2: hierarchically parse the initialized original electricity-use address data of existing customers;
S3: match the hierarchically parsed address data against the base address dictionary;
S4: judge whether the matching degree between the hierarchically parsed address data and the base address dictionary meets the requirement;
hierarchically parsed address data whose matching degree meets the requirement is added to the base address dictionary as a cleaning result;
data whose matching degree does not meet the requirement is returned to S2 and parsed again in the next cleaning cycle, until a cleaning cycle yields no further address data that meets the matching requirement;
S5: construct a similarity and consistency evaluation algorithm, and comprehensively evaluate the cleaning results.
The method is an iterative process. The results of each cleaning cycle are used to supplement and correct the base address dictionary, and the supplemented and corrected dictionary then takes part in the next round of processing, until the entire cleaning process is complete.
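The iterative cycle of steps S2 to S4 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: the callables `parse` and `match_degree`, the function name `clean_addresses` and the default threshold of 0.8 are hypothetical, because the text does not specify how an address is re-parsed or what matching degree is required.

```python
from __future__ import annotations
from typing import Callable, Iterable

def clean_addresses(
    raw: Iterable[str],
    dictionary: set[str],
    parse: Callable[[str, set[str]], str],          # hypothetical S2 parser
    match_degree: Callable[[str, set[str]], float],  # hypothetical S3 matcher
    threshold: float = 0.8,                          # hypothetical S4 requirement
) -> tuple[set[str], list[str]]:
    """Repeat S2-S4 until a cycle yields no address that meets the threshold."""
    pending = list(raw)
    while pending:
        accepted, rejected = [], []
        for address in pending:
            parsed = parse(address, dictionary)                # S2
            if match_degree(parsed, dictionary) >= threshold:  # S3 and S4
                accepted.append(parsed)
            else:
                rejected.append(address)
        if not accepted:
            break  # this cycle produced no new cleaning result
        # The results of this cycle supplement the dictionary for the next one.
        dictionary.update(accepted)
        pending = rejected
    return dictionary, pending
```

The dictionary is updated only at the end of each cycle, so an address rejected in one cycle can succeed in the next once entries it depends on have been added.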
The invention can structure and standardize all customer electricity-use addresses and unify the naming of administrative regions, sub-district offices and communities. That is, every customer electricity-use address is uniformly expressed in the form city + district + street + community + building + house number (an address without a community may use the form road + road). This effectively improves the completeness and accuracy of customer electricity-use addresses, and plays an important role in improving the accuracy with which a user's fault-report address is located, speeding up emergency repair response, sending notification messages to users in areas affected by power outages, and understanding regional power load demand.
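The unified form city + district + street + community + building + house number could be held in a small record type such as the one below. The class name, the field names and the choice to concatenate the parts without separators are assumptions made for illustration only. The road + road variant is not modeled here.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class StructuredAddress:
    """One address split into the six levels named in the text (names are hypothetical)."""
    city: str
    district: str
    street: str
    community: Optional[str]  # None when the address has no community
    building: Optional[str]
    house_number: str

    def standard_form(self) -> str:
        """Join the levels that are present, from the largest unit to the smallest."""
        parts = [self.city, self.district, self.street,
                 self.community, self.building, self.house_number]
        return "".join(p for p in parts if p)
```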
Preferably, the parsing method of step S2 is a text-feature-based word segmentation method obtained by improving a traditional word segmentation method.
Preferably, the text-feature-based word segmentation method is as follows: building on the "statistics-based word segmentation method", the algorithm is extended so that, in addition to document frequency (DF), four further methods are applied: information gain (IG), mutual information (MI), the χ² statistic (CHI) and expected cross entropy (CE).
Preferably, the similarity and consistency evaluation algorithm in step S5 is constructed by combining a clustering algorithm, the K-nearest-neighbor algorithm and the CART classification and regression tree algorithm.
Preferably, information gain predicts the category of an electricity-use address by counting the number of times a feature item does or does not occur in electricity-use addresses; the calculation formula of information gain is as follows:
where Pr(ci) denotes the probability that category ci occurs in the samples, and Pr(ci | t) denotes the probability of each category given that feature t occurs.
The information gain G(t) reflects how much feature t reduces the uncertainty of the classification, that is, the amount of information it contributes to the classification. In practice, the features are ranked by their information gain values, and a feature subset of suitable size is selected according to a set threshold.
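For reference, a commonly used form of information gain in text feature selection, consistent with the symbols defined above, is given below. It is offered as an assumption about the intended formula; m (the number of categories), Pr(t) (the probability that feature t occurs) and the barred t (feature t does not occur) are notation added here for the illustration.

```latex
G(t) = -\sum_{i=1}^{m} P_r(c_i)\,\log P_r(c_i)
       + P_r(t)\sum_{i=1}^{m} P_r(c_i \mid t)\,\log P_r(c_i \mid t)
       + P_r(\bar{t})\sum_{i=1}^{m} P_r(c_i \mid \bar{t})\,\log P_r(c_i \mid \bar{t})
```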
Preferably, the mutual information value is extracted by computing the correlation between feature t and category c; the calculation formula is:
where A is the number of times t and c occur together; B is the number of times t occurs and c does not; C is the number of times c occurs and t does not; N is the total number of electricity-use addresses. If t and c are uncorrelated, I(t, c) is 0. If there are m categories, each t has m values; taking their average yields the linear ordering needed for feature selection. The larger the average I, the more likely the feature is to be selected.
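A standard estimate of mutual information from these counts, which is 0 when t and c are independent as the text requires, is sketched below. It is an assumption about the intended formula rather than a reproduction of it.

```latex
I(t, c) = \log\frac{P_r(t \wedge c)}{P_r(t)\,P_r(c)}
        \approx \log\frac{A \times N}{(A + C)\,(A + B)}
```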
Preferably, the calculation formula of the χ² statistic can be expressed as:
where t denotes the feature item and c denotes the category.
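A widely used form of the χ² statistic for a feature item t and a category c is sketched below as an assumption about the intended formula. A, B, C and N are the co-occurrence counts used for mutual information; D, the number of addresses in which neither t nor c occurs, is not defined in the text and is introduced here so that A + B + C + D = N.

```latex
\chi^2(t, c) = \frac{N\,(AD - CB)^2}{(A + C)\,(B + D)\,(A + B)\,(C + D)}
```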
Preferably, the expected cross entropy is calculated as follows:
where Pr(ci | t) and P(ci) have the same meaning as in information gain. If a term is strongly correlated with an electricity-use address category, that is, Pr(ci | t) is large while the probability of the corresponding category is small, the term has a large influence on classification, its CE value is correspondingly large, and it is more likely to be selected as a feature item. The expected cross entropy reflects the distance between the probability distribution of text categories and the probability distribution of text categories given that a particular word occurs; the larger the expected cross entropy of term t, the greater its influence on the distribution of text categories.
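A common definition of expected cross entropy that matches this description (large when Pr(ci | t) is large and P(ci) is small) is sketched below. It is an assumption about the intended formula; Pr(t), the probability that term t occurs, is notation added here.

```latex
CE(t) = P_r(t)\sum_{i=1}^{m} P_r(c_i \mid t)\,\log\frac{P_r(c_i \mid t)}{P_r(c_i)}
```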
Compared with the prior art, the technical solution of the present invention has the following beneficial effects: it provides power supply enterprises with a cleaning method for structuring and standardizing address data that unifies the naming of administrative regions, neighbourhood committees and communities, and effectively improves the completeness and accuracy of customer electricity-use addresses. It plays an important role in improving the accuracy with which a user's fault-report address is located, speeding up emergency repair response, sending notification messages to users in areas affected by power outages, and understanding regional power load demand.
Description of the drawings
Fig. 1 is a flow chart of the cleaning method for structuring and standardizing address data according to the present embodiment.
Detailed description of the embodiments
The drawings are for illustrative purposes only and shall not be construed as limiting this patent;
To better illustrate the embodiment, some components in the drawings may be omitted, enlarged or reduced, and do not represent the dimensions of the actual product;
Those skilled in the art will understand that certain well-known structures and their descriptions may be omitted from the drawings.
The technical solution of the present invention is further described below with reference to the drawings and embodiments.
This embodiment provides a cleaning method for structuring and standardizing address data. As shown in Fig. 1, the method comprises the following steps:
S1: obtain the original electricity-use address data of the power supply enterprise's existing customers, and initialize it;
S2: building on the "statistics-based word segmentation method", the algorithm is extended: in addition to document frequency (DF), four further methods are applied, namely information gain (IG), mutual information (MI), the χ² statistic (CHI) and expected cross entropy (CE). The traditional word segmentation method is thereby improved into a text-feature-based word segmentation method, which is used to hierarchically parse the original electricity-use address data of existing customers. The individual methods are described below:
DF (document frequency): here this can be expressed as the electricity-use address frequency; DF denotes the number of electricity-use addresses that contain a given feature item t. This measure of feature importance rests on the assumption that feature items with a small DF have little influence on the classification result; the method therefore keeps feature items with a large DF and discards those with a small DF.
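DF-based selection can be sketched as below, assuming each address has already been split into tokens. The function names and the cut-off `min_df` are hypothetical; the text does not state a threshold.

```python
from collections import Counter

def document_frequency(addresses):
    """Count, for each feature item, the number of addresses that contain it."""
    df = Counter()
    for tokens in addresses:
        df.update(set(tokens))  # set(): a term counts once per address
    return df

def select_by_df(addresses, min_df=2):
    """Keep feature items whose DF reaches min_df (a hypothetical threshold)."""
    return {t for t, n in document_frequency(addresses).items() if n >= min_df}
```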
Information gain (IG): IG predicts the category of an electricity-use address by counting the number of times a feature item does or does not occur in electricity-use addresses; the calculation formula of IG is as follows:
where Pr(ci) denotes the probability that category ci occurs in the samples, Pr(ci | t) denotes the probability of each category given that feature t occurs, and m denotes the number of categories.
The information gain G(t) reflects how much feature t reduces the uncertainty of the classification, that is, the amount of information it contributes to the classification. In practice, the features are ranked by their information gain values, and a feature subset of suitable size is selected according to a set threshold.
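Information gain for a single feature item can be sketched as the entropy of the categories minus the expected entropy after splitting the addresses on whether the item occurs. This is a minimal illustration using the standard definition, assumed rather than taken from the text; the `(tokens, category)` sample layout, the function names and the base-2 logarithm are assumptions.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (base 2) of a list of category labels."""
    n = len(labels)
    return -sum((k / n) * math.log2(k / n) for k in Counter(labels).values())

def information_gain(samples, term):
    """samples: list of (tokens, category) pairs; term: the feature item t."""
    labels = [c for _, c in samples]
    with_t = [c for toks, c in samples if term in toks]
    without_t = [c for toks, c in samples if term not in toks]
    n = len(samples)
    # Expected entropy of the categories once we know whether t occurs.
    conditional = sum(len(part) / n * entropy(part)
                      for part in (with_t, without_t) if part)
    return entropy(labels) - conditional
```

Ranking all feature items by `information_gain` and keeping those above a chosen threshold gives the feature subset described above.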
Mutual information (MI): the mutual information value is extracted by computing the correlation between feature t and category c; the calculation formula is:
where A is the number of times t and c occur together; B is the number of times t occurs and c does not; C is the number of times c occurs and t does not; N is the total number of electricity-use addresses. If t and c are uncorrelated, I(t, c) is 0. If there are m categories, each t has m values; taking their average yields the linear ordering needed for feature selection. A feature with a large average I is more likely to be selected.
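The counts A, B, C and N are enough to estimate mutual information with the usual approximation I(t, c) ≈ log(A·N / ((A + C)(A + B))). The sketch below assumes that approximation, which is not spelled out in the text; the function name and the handling of A = 0 are illustrative choices.

```python
import math

def mutual_information(A, B, C, N):
    """Estimate I(t, c) from co-occurrence counts.

    A: t and c occur together; B: t occurs without c;
    C: c occurs without t; N: total number of addresses.
    """
    if A == 0:
        return float("-inf")  # t and c never co-occur
    return math.log(A * N / ((A + C) * (A + B)))
```

With N = 100, t in 20 addresses, c in 50 and 10 co-occurrences, the two are statistically independent and the estimate is 0, which matches the property stated above.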
χ² statistic (CHI): the CHI method is based on essentially the same idea as the MI method, and likewise extracts features by computing the degree of dependence between feature t and category c. If feature item t is negatively correlated with category c, an electricity-use address containing t is more likely not to belong to c, which is also very informative for judging whether an address does not belong to a category. To overcome this limitation, CHI computes the correlation between feature item t and category c with a formula that can be expressed as:
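The χ² statistic can be computed from the same co-occurrence counts as mutual information. The sketch below uses the widely used form N(AD − CB)² / ((A + C)(B + D)(A + B)(C + D)), assumed here rather than taken from the text; D, the number of addresses containing neither t nor c, is derived as N − A − B − C.

```python
def chi_square(A, B, C, N):
    """Chi-square statistic for feature item t and category c.

    A: t and c occur together; B: t occurs without c;
    C: c occurs without t; N: total number of addresses.
    """
    D = N - A - B - C  # neither t nor c occurs (derived, not defined in the text)
    denom = (A + C) * (B + D) * (A + B) * (C + D)
    if denom == 0:
        return 0.0  # t or c is constant across the data, so no dependence is measurable
    return N * (A * D - C * B) ** 2 / denom
```

Because the difference AD − CB is squared, a strong negative correlation scores as high as a strong positive one, which is the behavior the paragraph above asks for.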
Expected cross entropy (CE): the expected cross entropy is defined as follows:
where Pr(ci | t) and P(ci) have the same meaning as in information gain. If a term is strongly correlated with an electricity-use address category, that is, Pr(ci | t) is large while the probability of the corresponding category is small, the term has a large influence on classification, its CE value is correspondingly large, and it is likely to be selected as a feature item. The expected cross entropy reflects the distance between the probability distribution of text categories and the probability distribution of text categories given that a particular word occurs; the larger the expected cross entropy of term t, the greater its influence on the distribution of text categories.
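Expected cross entropy can be sketched with the common definition CE(t) = P(t) · Σ P(ci | t) · log(P(ci | t) / P(ci)), which is an assumption here because the formula is not reproduced in the text. The argument names are illustrative, and the probabilities are passed in directly rather than estimated from data.

```python
import math

def expected_cross_entropy(p_t, p_c, p_c_given_t):
    """CE(t) for one term.

    p_t: probability that term t occurs;
    p_c: list of category probabilities P(c_i);
    p_c_given_t: list of P(c_i | t), in the same category order.
    """
    return p_t * sum(
        q * math.log(q / p)
        for p, q in zip(p_c, p_c_given_t)
        if q > 0  # categories with P(c_i | t) = 0 contribute nothing
    )
```

When the category distribution given t equals the overall distribution, the result is 0; the further the two distributions diverge, the larger the value.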
S3: match the hierarchically parsed address data against the base address dictionary; hierarchically parsed address data whose matching degree meets the requirement is taken as a cleaning result and added to the base address dictionary;
S4: data whose matching degree does not meet the requirement is parsed again in the next cleaning cycle, until a cleaning cycle yields no further address data that meets the matching requirement;
S5: combine a clustering algorithm, the K-nearest-neighbor algorithm and the CART classification and regression tree algorithm to construct a similarity and consistency evaluation algorithm, and comprehensively evaluate the cleaning results; the specific methods are described as follows:
Clustering algorithm: in general, electricity-use addresses of the same class are highly similar to one another, whereas addresses of different classes are less similar. As an unsupervised machine learning method, clustering needs neither a training process nor manual labeling of text categories in advance, and therefore offers a degree of flexibility and a high capacity for automated processing.
An electricity-use address is composed of characters, words and numbers. The vector space model (VSM), the best-known model in information retrieval, can be used to represent an electricity-use address as a weighted feature vector D = D(T1, W1; T2, W2; …; Tn, Wn); the category of a sample to be classified is then determined by computing the similarity between electricity-use addresses. When electricity-use addresses are represented in the vector space model, their similarity can be expressed as the inner product of their feature vectors. Put simply, most electricity-use addresses can be regarded as consisting of several words; once each word has been converted to a weight, each weight can be regarded as one component of a vector, so an electricity-use address can be regarded as a vector in n-dimensional space, which is the origin of the vector space model. The weight of each word can be computed with the TF-IDF weighting technique.
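The vector space representation and similarity computation can be sketched as follows, assuming each address is a non-empty list of tokens. The TF-IDF variant used (term frequency times log(n / df)) and the normalization of the inner product to a cosine are choices made for this illustration; the text only names TF-IDF and the inner product.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Turn tokenized addresses into sparse TF-IDF vectors (dicts term -> weight)."""
    n = len(docs)
    df = Counter(t for doc in docs for t in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: (tf[t] / len(doc)) * math.log(n / df[t]) for t in tf})
    return vectors

def cosine(u, v):
    """Inner product of two sparse vectors, normalized by their lengths."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm = (math.sqrt(sum(w * w for w in u.values()))
            * math.sqrt(sum(w * w for w in v.values())))
    return dot / norm if norm else 0.0
```

A term that occurs in every address receives weight 0 under this weighting, so two addresses that share only such a term have similarity 0.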
CART classification and regression tree: a decision tree method that uses an estimation function based on the minimum-distance Gini index to determine how the decision tree generated from a data subset is expanded. In this method, the key is to examine the Gini impurity of the classification and regression tree for a given address sample set. Gini impurity expresses the likelihood that a randomly chosen address sample is assigned to the wrong subset (for example, a customer electricity-use address assigned to the wrong community); it is the probability that the sample is chosen multiplied by the probability that it is misclassified. When all samples in a node belong to one class, the Gini impurity is zero.
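Gini impurity as described here, the sum over classes of p(1 − p), which equals 1 − Σ p², can be sketched as follows. The function names and the weighted impurity of a two-way split are illustrative additions, not taken from the text.

```python
from collections import Counter

def gini_impurity(labels):
    """Probability that a randomly chosen sample is labeled wrongly
    when labels are assigned according to the node's class distribution."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((k / n) ** 2 for k in Counter(labels).values())

def split_impurity(left, right):
    """Size-weighted Gini impurity of a candidate two-way split."""
    n = len(left) + len(right)
    return (len(left) / n * gini_impurity(left)
            + len(right) / n * gini_impurity(right))
```

A node whose samples all belong to one community has impurity 0, and a node split evenly between two communities has impurity 0.5.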
K-nearest-neighbor algorithm: its core idea is that if most of the K nearest samples of a given sample in feature space belong to a certain category, the sample also belongs to that category and shares the characteristics of the samples in it. The key factor when using the K-nearest-neighbor algorithm to evaluate the consistency of an address sample set is its distance function. This method applies the Minkowski distance formula:
where xi and yi are the two variables being compared, and p is a variable parameter.
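The Minkowski distance, d(x, y) = (Σ |xi − yi|^p)^(1/p) in its standard form, and a majority-vote K-nearest-neighbor classifier built on it can be sketched as follows. The `(vector, label)` layout of the training data, the function names and the defaults k = 3 and p = 2 are assumptions made for this illustration.

```python
from collections import Counter

def minkowski(x, y, p=2.0):
    """Minkowski distance; p = 1 is Manhattan distance, p = 2 is Euclidean."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)

def knn_predict(train, query, k=3, p=2.0):
    """train: list of (vector, label) pairs. Returns the majority label
    among the k training samples closest to query."""
    nearest = sorted(train, key=lambda item: minkowski(item[0], query, p))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]
```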
The terms describing positional relationships in the drawings are for illustration only and shall not be construed as limiting this patent;
Obviously, the above embodiments of the present invention are merely examples given to illustrate the invention clearly, and do not limit its embodiments. Those of ordinary skill in the art can make other variations or changes in different forms on the basis of the above description. It is neither necessary nor possible to enumerate all embodiments here. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within the protection scope of the claims of the present invention.

Claims (8)

1. A cleaning method for structuring and standardizing address data, characterized in that the method comprises the following steps:
S1: obtain the original electricity-use address data of the power supply enterprise's existing customers, and initialize it;
S2: hierarchically parse the initialized original electricity-use address data of existing customers;
S3: match the hierarchically parsed address data against the base address dictionary;
S4: judge whether the matching degree between the hierarchically parsed address data and the base address dictionary meets the requirement;
hierarchically parsed address data whose matching degree meets the requirement is added to the base address dictionary as a cleaning result;
data whose matching degree does not meet the requirement is returned to S2 and parsed again in the next cleaning cycle, until a cleaning cycle yields no further address data that meets the matching requirement;
S5: evaluate similarity and consistency, and comprehensively evaluate the cleaning results.
2. The cleaning method for structuring and standardizing address data according to claim 1, characterized in that the parsing method of step S2 is a text-feature-based word segmentation method.
3. The cleaning method for structuring and standardizing address data according to claim 2, characterized in that the text-feature-based word segmentation method is as follows: building on the "statistics-based word segmentation method", the algorithm is extended so that, in addition to document frequency, four further methods are applied: information gain, mutual information, the χ² statistic and expected cross entropy.
4. The cleaning method for structuring and standardizing address data according to claim 1, characterized in that the similarity and consistency evaluation algorithm in step S5 is constructed by combining the K-nearest-neighbor algorithm, a clustering algorithm and the CART classification and regression tree algorithm.
5. The cleaning method for structuring and standardizing address data according to claim 3, characterized in that information gain predicts the category of an electricity-use address by counting the number of times a feature item does or does not occur in electricity-use addresses, and the calculation formula of information gain is as follows:
where Pr(ci) denotes the probability that category ci occurs in the samples, Pr(ci | t) denotes the probability of each category given that feature t occurs, and m denotes the number of categories.
6. The cleaning method for structuring and standardizing address data according to claim 3, characterized in that the mutual information value is extracted by computing the correlation between feature t and category c; the calculation formula is:
where A is the number of times t and c occur together; B is the number of times t occurs and c does not; C is the number of times c occurs and t does not; N is the total number of electricity-use addresses; if t and c are uncorrelated, I(t, c) is 0.
7. The cleaning method for structuring and standardizing address data according to claim 3, characterized in that the calculation formula of the χ² statistic can be expressed as:
where t denotes the feature item and c denotes the category; A is the number of times t and c occur together; B is the number of times t occurs and c does not; C is the number of times c occurs and t does not; N is the total number of electricity-use addresses.
8. The cleaning method for structuring and standardizing address data according to claim 3, characterized in that the expected cross entropy is calculated as follows:
where Pr(ci) denotes the probability that category ci occurs in the samples, Pr(ci | t) denotes the probability of each category given that feature t occurs, and m denotes the number of categories.
CN201811543929.3A 2018-12-17 2018-12-17 A cleaning method for structuring and standardizing address data Pending CN109614396A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811543929.3A CN109614396A (en) 2018-12-17 2018-12-17 A cleaning method for structuring and standardizing address data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811543929.3A CN109614396A (en) 2018-12-17 2018-12-17 A cleaning method for structuring and standardizing address data

Publications (1)

Publication Number Publication Date
CN109614396A true CN109614396A (en) 2019-04-12

Family

ID=66010485

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811543929.3A Pending CN109614396A (en) 2018-12-17 2018-12-17 A cleaning method for structuring and standardizing address data

Country Status (1)

Country Link
CN (1) CN109614396A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113448923A (en) * 2020-04-17 2021-09-28 北京新氧科技有限公司 File generation method and device and terminal
CN115168548A (en) * 2022-09-05 2022-10-11 吉奥时空信息技术股份有限公司 Recall-sorting based address matching method

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102169498A (en) * 2011-04-14 2011-08-31 中国测绘科学研究院 Address model constructing method and address matching method and system
CN106055650A (en) * 2016-05-31 2016-10-26 深圳市永兴元科技有限公司 Address standardization method and device
CN106709065A (en) * 2017-01-19 2017-05-24 国家电网公司 Standardization processing method and standardized processing device for address information
CN107329950A (en) * 2017-06-13 2017-11-07 武汉工程大学 It is a kind of based on the Chinese address segmenting method without dictionary
CN108228825A (en) * 2018-01-02 2018-06-29 北京市燃气集团有限责任公司 A kind of station address data cleaning method based on participle
CN109190997A (en) * 2018-09-18 2019-01-11 广东电网有限责任公司 Chinese address hierarchical analysis and standard processing method and system

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
宋才华等: "供电企业存量客户用电地址数据结构化与规范化的清理方法研究", 《微型电脑应用》 *
高文雅: "中文文本分类中的关键技术研究", 《中国优秀硕士论文全文数据库信息科技辑》 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113448923A (en) * 2020-04-17 2021-09-28 北京新氧科技有限公司 File generation method and device and terminal
CN113448923B (en) * 2020-04-17 2023-09-12 北京新氧科技有限公司 File generation method, device and terminal
CN115168548A (en) * 2022-09-05 2022-10-11 吉奥时空信息技术股份有限公司 Recall-sorting based address matching method
CN115168548B (en) * 2022-09-05 2022-11-22 吉奥时空信息技术股份有限公司 Recall-sorting based address matching method

Similar Documents

Publication Publication Date Title
CN104408095B (en) One kind is based on improved KNN file classification methods
CN104778186B (en) Merchandise items are mounted to the method and system of standardized product unit
CN108363810A (en) Text classification method and device
CN110532351A (en) Recommend word methods of exhibiting, device, equipment and computer readable storage medium
CN108388929A (en) Client segmentation method and device based on cost-sensitive and semisupervised classification
CN110990718A (en) Social network model building module of company image improving system
CN108763496A (en) A kind of sound state data fusion client segmentation algorithm based on grid and density
CN109614396A (en) A kind of method for cleaning of address data structure and standardization
Sarantitis et al. A network analysis of the United Kingdom’s consumer price index
CN109658156A (en) A kind of material price measuring method, device, terminal device and storage medium
KR101625124B1 (en) The Technology Valuation Model Using Quantitative Patent Analysis
CN113239266B (en) Personalized recommendation method and system based on local matrix decomposition
CN109190997A (en) Chinese address hierarchical analysis and standard processing method and system
CN104572623B (en) A kind of efficient data analysis and summary method of online LDA models
Lejeune et al. Optimization for simulation: LAD accelerator
Wang et al. Computer supported data-driven decisions for service personalization: a variable-scale clustering method
CN115630708A (en) Model updating method and device, electronic equipment, storage medium and product
CN115392351A (en) Risk user identification method and device, electronic equipment and storage medium
CN108629506A (en) Modeling method, device, computer equipment and the storage medium of air control model
Keskin et al. Cohort fertility heterogeneity during the fertility decline period in Turkey
CN113537461A (en) Network key node discovery method and system based on SIR value learning
Lee et al. A comparison of the predictive powers of tenure choices between property ownership and renting
SEWART et al. Graphical models in credit scoring
Schaidnagel et al. DNA: an online algorithm for credit card fraud detection for games merchants
CN111461199A (en) Security attribute selection method based on distributed junk mail classified data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190412