CN106897358A - Clustering algorithm based on constraints realizes that search engine keywords optimize - Google Patents

Info

Publication number
CN106897358A
Authority
CN
China
Prior art keywords
keyword
constraints
search engine
algorithm based
clustering algorithm
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710006185.0A
Other languages
Chinese (zh)
Inventor
金平艳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan Yonglian Information Technology Co Ltd
Original Assignee
Sichuan Yonglian Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sichuan Yonglian Information Technology Co Ltd
Priority to CN201710006185.0A
Publication of CN106897358A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A method for search engine keyword optimization using a constraint-based clustering algorithm. Core keywords are determined from the enterprise's business, and the data items associated with each searched keyword are collected, such as national monthly search volume, degree of competition, and estimated cost per click (CPC). The keyword set is then screened to reduce its size. Each keyword is represented as a five-dimensional vector by adding the number of homepage pages and the total number of search result pages, and this vector is then reduced from five dimensions to four. The keywords are clustered with the constraint-based clustering algorithm. According to the application, the method is more accurate than traditional clustering methods: locally, the clusters are partitioned by degree of correlation, so the results agree better with empirical values; globally, the share of each neighborhood is taken into account, which reduces the influence of isolated points on the clustering result. The method is also stated to simplify the keyword analysis workflow, process data well, have low running-time complexity and faster processing speed, quickly raise the ranking of a website's keywords, and bring a certain amount of traffic to the website, thereby achieving the desired website optimization goal.

Description

Search engine keyword optimization using a constraint-based clustering algorithm
Technical field
The present invention relates to the field of Semantic Web technology, and in particular to search engine keyword optimization using a constraint-based clustering algorithm.
Background
Search engines have become an important tool for internet users to obtain information. Search engine optimization (SEO) refers to applying a series of optimizations to a website using related techniques so as to raise the ranking of its keywords on search engines and ultimately achieve the website's marketing goals. In practice, SEO is a form of online marketing: an enterprise applies mainstream SEO strategies to optimize factors such as the keywords, content, and links of its web pages, so that the optimized site is crawled and indexed preferentially by the major search engines, ranks near the top of the index for its target pages, and attracts clicks, thereby improving the corporate image and promoting the website. Among current forms of online marketing, SEO is a preferred way to expand influence and build an enterprise's online image within a short time. SEO ultimately comes down to keyword optimization. Keywords are the words or phrases that users enter when searching for relevant pages, and they are also the terms that search engines use when building their indexes. Using suitable keywords helps obtain a higher ranking in search results, so keyword research aims to identify the most valuable keywords. There has been considerable theoretical research and technical application on keyword optimization in China and abroad, but no effective method has yet been proposed for simplifying the keyword analysis workflow, and there is no complete mechanism for managing keyword optimization strategy and progress. To meet this need, the present invention provides a method for search engine keyword optimization using a constraint-based clustering algorithm.
Summary of the invention
To address the technical problem of achieving search engine optimization through keyword optimization, the present invention provides a method for search engine keyword optimization using a constraint-based clustering algorithm.
To solve the above problem, the present invention is implemented by the following technical solution:
Step 1: Determine core keywords according to the enterprise's business and collect related keywords using a search engine. These keywords have corresponding data items in the search engine, such as national monthly search volume, degree of competition, and estimated cost per click (CPC).
Step 2: With reference to the enterprise's products and market analysis, screen the collected set of related keywords to reduce its size;
Step 3: For the screened keyword set, search each keyword in a search engine and record the number of homepage pages and the total number of search result pages. Each keyword is thus a five-dimensional vector, which is then reduced to four dimensions.
Step 4: Cluster the above keywords using the constraint-based clustering algorithm. The specific sub-steps are as follows:
Step 4.1: Initialize the clusters using the k-means algorithm based on ε-neighborhoods;
Step 4.2: Initialize the objective function of each ε-neighborhood, and select k initial cluster centers from the data object set D according to the following judgment condition;
Step 4.3: Reassign each keyword class i (i ∈ {1, 2, ..., m}), selecting the cluster center j′ according to the probability function p(i);
Step 4.4: Recompute each cluster center according to the result of the decision function Δ(I);
Step 4.5: If a cluster center has changed, return to step 4.2; otherwise the iteration ends and the clustering result is output.
Step 5: According to the enterprise's specific circumstances, weigh keyword efficiency optimization against value-rate optimization and select a suitable keyword optimization strategy to achieve the website optimization goal.
The present invention has the following beneficial effects:
1. The algorithm simplifies the keyword analysis workflow and thereby reduces the overall website optimization workload.
2. The algorithm has low running-time complexity and faster processing speed.
3. The algorithm has greater application value.
4. It helps a website quickly raise the ranking of its keywords within a short time.
5. It brings a certain amount of traffic and inquiries to the enterprise website, thereby achieving the desired website optimization goal.
6. Locally, the algorithm distinguishes the classes by degree of correlation, so the accuracy of the classification results agrees better with empirical values.
7. Globally, the algorithm takes into account the share of each neighborhood, which reduces the influence of isolated points on the clustering result.
8. The data processing results are better.
Brief description of the drawings
Fig. 1 is a structural flow chart of search engine keyword optimization using the constraint-based clustering algorithm.
Fig. 2 is a flow chart of the application of the constraint-based clustering algorithm in cluster analysis.
Detailed description
To solve the technical problem of achieving search engine optimization through keyword optimization, the present invention is described in detail below with reference to Fig. 1 and Fig. 2. The specific implementation steps are as follows:
Step 1: Determine core keywords according to the enterprise's business and collect related keywords using a search engine. These keywords have corresponding data items in the search engine, such as national monthly search volume, degree of competition, and estimated cost per click (CPC).
Step 2: With reference to the enterprise's products and market analysis, screen the collected set of related keywords to reduce its size;
Step 3: For the screened keyword set, search each keyword in a search engine and record the number of homepage pages and the total number of search result pages. Each keyword is thus a five-dimensional vector, which is then reduced to four dimensions. The specific calculation is as follows:
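Let the number of related keywords be m. The keywords then form the following m × 5 matrix:
\[
\begin{pmatrix}
N_1 & Ld_1 & CPC_1 & N_{1S} & N_{1Y} \\
N_2 & Ld_2 & CPC_2 & N_{2S} & N_{2Y} \\
\vdots & \vdots & \vdots & \vdots & \vdots \\
N_m & Ld_m & CPC_m & N_{mS} & N_{mY}
\end{pmatrix}
\]
Here N_i, Ld_i, CPC_i, N_iS, and N_iY are, in order, the national monthly search volume, the degree of competition, the estimated cost per click (CPC), the number of homepage pages, and the total number of search result pages for the i-th keyword.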
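As an illustration of the m × 5 matrix described above, the following minimal Python sketch stores one record per keyword and stacks the records into rows. The class name, field names, and sample values are hypothetical and are not taken from the application; only the order of the five columns follows the text.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class KeywordRecord:
    """One keyword with its five data items (field names are hypothetical)."""
    keyword: str
    monthly_searches: float  # N_i: national monthly search volume
    competition: float       # Ld_i: degree of competition
    cpc: float               # CPC_i: estimated cost per click
    homepage_pages: float    # N_iS: number of homepage pages
    total_pages: float       # N_iY: total number of search result pages

    def as_row(self) -> List[float]:
        # Column order follows the matrix definition in the text.
        return [self.monthly_searches, self.competition, self.cpc,
                self.homepage_pages, self.total_pages]


def build_matrix(records: List[KeywordRecord]) -> List[List[float]]:
    """Stack the m keyword records into an m x 5 matrix (list of rows)."""
    return [record.as_row() for record in records]


# Made-up sample values, for illustration only.
records = [
    KeywordRecord("running shoes", 12000, 0.8, 1.5, 30, 500000),
    KeywordRecord("trail running shoes", 3000, 0.6, 1.1, 12, 90000),
    KeywordRecord("cheap running shoes", 4500, 0.9, 0.7, 8, 150000),
]
matrix = build_matrix(records)
```

Keeping the keyword string next to its row makes it easy to map cluster labels back to keywords after clustering.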
The vector is then reduced to four dimensions, i.e.,
X_i (i ∈ {1, 2, ..., m}) is the search efficiency and Z_i (i ∈ {1, 2, ..., m}) is the value rate, given by the following formulas:
Step 4: Cluster the above keywords using the constraint-based clustering algorithm. The specific sub-steps are as follows:
Step 4.1: Initialize c clusters using the k-means algorithm based on ε-neighborhoods.
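The text does not spell out how the ε-neighborhood variant of k-means works, so the sketch below substitutes plain Lloyd-style k-means as a stand-in for the initialization in step 4.1. It is not the application's method: the random choice of starting centers, the squared Euclidean distance, and all function names are assumptions made for illustration.

```python
import random
from typing import List, Sequence, Tuple


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two vectors of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points: List[Sequence[float]], c: int,
           max_iterations: int = 100, seed: int = 0
           ) -> Tuple[List[List[float]], List[int]]:
    """Plain k-means: returns (centers, labels) for c clusters."""
    rng = random.Random(seed)
    # Assumption: start from c distinct data points chosen at random.
    centers = [list(p) for p in rng.sample(points, c)]
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each point goes to its nearest center.
        new_labels = [
            min(range(c), key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centers are too
        labels = new_labels
        # Update step: each center becomes the mean of its members.
        for j in range(c):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return centers, labels


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, labels = kmeans(points, c=2)
```

On the toy data the two well-separated groups end up in different clusters. In practice the keyword vectors would usually be rescaled first, since search volume and CPC differ by orders of magnitude.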
Step 4.2: Initialize the membership matrix J with values in the interval [0, 1] so that it satisfies all the membership constraints. The specific calculation is as follows:
In the above formula, the inner product is taken between the i-th keyword vector and the cluster center vector in the space, and μ_ij is the degree to which keyword i belongs to class j. It satisfies the following membership constraint:
The initialized membership matrix J has size m × c:
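The membership constraint itself is not reproduced in the text. The sketch below therefore assumes the constraint commonly used in fuzzy clustering, namely that every μ_ij lies in [0, 1] and each keyword's memberships sum to 1 across the c classes. The function name and the random initialization are hypothetical.

```python
import random
from typing import List, Optional


def init_membership(m: int, c: int, seed: Optional[int] = None) -> List[List[float]]:
    """Return an m x c matrix with entries in [0, 1] whose rows each sum to 1.

    Assumption: the membership constraint is sum_j mu_ij = 1 for every
    keyword i, as in standard fuzzy clustering.
    """
    rng = random.Random(seed)
    matrix = []
    for _ in range(m):
        raw = [rng.random() for _ in range(c)]
        total = sum(raw)
        if total == 0.0:
            # Extremely unlikely, but avoid dividing by zero.
            raw, total = [1.0] * c, float(c)
        matrix.append([value / total for value in raw])
    return matrix


membership = init_membership(m=4, c=3, seed=1)
```

Normalizing each row after sampling is the simplest way to satisfy the assumed constraint, whatever random values are drawn.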
Step 4.3: Initialize the objective function of each neighborhood, build the total objective function over the c classes, combine it with the membership constraints to build a system of m equations, and solve that system to obtain the clustering result. The specific calculation is as follows:
In the above formula, n_εj is the number of data objects in the ε-neighborhood of class j.
The total objective function over the c classes is
Combining the membership constraints, the system of m equations is built:
Here λ_i (i ∈ {1, 2, ..., m}) are the Lagrange multipliers of the m constraint equations. Differentiating with respect to all input parameters yields the necessary conditions on c_j and μ_ij for the total objective function to reach its maximum.
In the above formula, x_i is the vector corresponding to keyword i;
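The quantity n_εj in step 4.3 counts the data objects in the ε-neighborhood of class j. The following minimal sketch shows one way to compute it, assuming the neighborhood is the set of points within Euclidean distance ε of the cluster center. The distance measure and the function names are assumptions, since the text does not define them.

```python
import math
from typing import List, Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def neighborhood_counts(points: List[Sequence[float]],
                        centers: List[Sequence[float]],
                        eps: float) -> List[int]:
    """For each center j, count the points within distance eps (n_eps_j)."""
    return [
        sum(1 for p in points if euclidean(p, center) <= eps)
        for center in centers
    ]


points = [(0, 0), (0.5, 0), (3, 3), (3.2, 3.1), (10, 10)]
centers = [(0, 0), (3, 3)]
counts = neighborhood_counts(points, centers, eps=1.0)
```

In this toy data the point (10, 10) falls in no neighborhood, which illustrates how a neighborhood-based count can keep isolated points from influencing a cluster.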
Step 4.4: Recompute each cluster center using the result of the decision function Δ(I) below. The specific calculation is as follows:
Decision function Δ(I):
In the above formula, the two quantities compared are the new total objective function and the total objective function obtained in the previous iteration. θ is a sufficiently small number. The optimal classification has been found only if the above condition is satisfied; otherwise it has not yet been found.
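The formula for Δ(I) is not reproduced in the text. A common reading of such a stopping rule is that iteration ends once the total objective function changes by less than θ between consecutive iterations. The sketch below implements that reading as an assumption; the function names and the default value of θ are hypothetical.

```python
from typing import Optional, Sequence


def has_converged(new_objective: float, old_objective: float,
                  theta: float = 1e-6) -> bool:
    """Assumed rule: stop when |J_new - J_old| < theta."""
    return abs(new_objective - old_objective) < theta


def first_stable_iteration(objective_values: Sequence[float],
                           theta: float = 1e-6) -> Optional[int]:
    """Return the index of the first iteration whose change from the
    previous one is below theta, or None if that never happens."""
    for step in range(1, len(objective_values)):
        if has_converged(objective_values[step], objective_values[step - 1], theta):
            return step
    return None


# Made-up objective values from successive iterations.
history = [1.0, 2.0, 2.5, 2.5]
stable_at = first_stable_iteration(history)
```

A smaller θ demands a tighter match between iterations and therefore typically more iterations before the loop stops.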
The detailed flow of the constraint-based clustering algorithm is shown in Fig. 2.
Step 5: According to the enterprise's specific circumstances, weigh keyword efficiency optimization against value-rate optimization and select a suitable keyword optimization strategy to achieve the website optimization goal.
The pseudocode of the constraint-based clustering algorithm for search engine keyword optimization is as follows:
Input: the core keywords extracted for the website, initialized into c classes based on ε-neighborhoods.
Output: the c clusters that maximize the sum of the global total objective function.

Claims (2)

1. A method for search engine keyword optimization using a constraint-based clustering algorithm, relating to the field of Semantic Web technology, characterized by comprising the following steps:
Step 1: Determine core keywords according to the enterprise's business and collect related keywords using a search engine; these keywords have corresponding data items in the search engine, such as national monthly search volume, degree of competition, and estimated cost per click (CPC);
Step 2: With reference to the enterprise's products and market analysis, screen the collected set of related keywords to reduce its size;
Step 3: For the screened keyword set, search each keyword in a search engine and record the number of homepage pages and the total number of search result pages; each keyword is thus a five-dimensional vector, which is then reduced to four dimensions. The specific calculation is as follows:
Let the number of related keywords be m. The keywords then form the following m × 5 matrix:
N_i, Ld_i, CPC_i, N_iS, and N_iY are, in order, the national monthly search volume, the degree of competition, the estimated cost per click (CPC), the number of homepage pages, and the total number of search result pages for the i-th keyword.
The vector is then reduced to four dimensions, i.e.,
X_i is the search efficiency and Z_i is the value rate, given by the following formulas:
Step 4: Cluster the above keywords using the constraint-based clustering algorithm. The specific sub-steps are as follows:
Step 4.1: Initialize the clusters using the k-means algorithm based on ε-neighborhoods;
Step 4.2: Initialize the objective function of each ε-neighborhood, and select k initial cluster centers from the data object set D according to the following judgment condition;
Step 4.3: Reassign each keyword class i (i ∈ {1, 2, ..., m}), selecting the cluster center j′ according to the probability function p(i);
Step 4.4: Recompute each cluster center according to the result of the decision function Δ(I);
Step 5: According to the enterprise's specific circumstances, weigh keyword efficiency optimization against value-rate optimization and select a suitable keyword optimization strategy to achieve the website optimization goal.
2. The method for search engine keyword optimization using a constraint-based clustering algorithm according to claim 1, characterized in that the specific calculation in step 4 is as follows:
Step 4: Cluster the above keywords using the constraint-based clustering algorithm. The specific sub-steps are as follows:
Step 4.1: Initialize c clusters using the k-means algorithm based on ε-neighborhoods;
Step 4.2: Initialize the membership matrix J with values in the interval [0, 1] so that it satisfies all the membership constraints. The specific calculation is as follows:
In the above formula, the inner product is taken between the i-th keyword vector and the cluster center vector in the space, and μ_ij is the degree to which keyword i belongs to class j. It satisfies the following membership constraint:
The initialized membership matrix J has size m × c.
Step 4.3: Initialize the objective function of each neighborhood, build the total objective function over the c classes, combine it with the membership constraints to build a system of m equations, and solve that system to obtain the clustering result. The specific calculation is as follows:
In the above formula, n_εj is the number of data objects in the ε-neighborhood of class j.
The total objective function over the c classes is
Combining the membership constraints, the system of m equations is built:
Here λ_i (i ∈ {1, 2, ..., m}) are the Lagrange multipliers of the m constraint equations. Differentiating with respect to all input parameters yields the necessary conditions on c_j and μ_ij for the total objective function to reach its maximum.
In the above formula, x_i is the vector corresponding to keyword i;
Step 4.4: Recompute each cluster center using the result of the decision function Δ(I) below. The specific calculation is as follows:
Decision function Δ(I):
In the above formula, the two quantities compared are the new total objective function and the total objective function obtained in the previous iteration, and θ is a sufficiently small number. The optimal classification has been found only if the above condition is satisfied; otherwise it has not yet been found.
CN201710006185.0A 2017-01-04 2017-01-04 Clustering algorithm based on constraints realizes that search engine keywords optimize Pending CN106897358A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710006185.0A CN106897358A (en) 2017-01-04 2017-01-04 Clustering algorithm based on constraints realizes that search engine keywords optimize

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710006185.0A CN106897358A (en) 2017-01-04 2017-01-04 Clustering algorithm based on constraints realizes that search engine keywords optimize

Publications (1)

Publication Number Publication Date
CN106897358A true CN106897358A (en) 2017-06-27

Family

ID=59198347

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710006185.0A Pending CN106897358A (en) 2017-01-04 2017-01-04 Clustering algorithm based on constraints realizes that search engine keywords optimize

Country Status (1)

Country Link
CN (1) CN106897358A (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103218435A (en) * 2013-04-15 2013-07-24 上海嘉之道企业管理咨询有限公司 Method and system for clustering Chinese text data
CN103258000A (en) * 2013-03-29 2013-08-21 北界创想(北京)软件有限公司 Method and device for clustering high-frequency keywords in webpages

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
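Lin Yuanguo (林元国) et al.: "Application of the K-means algorithm in keyword optimization" (K-means算法在关键词优化中的应用), 《计算机***应用》 *
Deng Jianshuang (邓健爽) et al.: "An automatic keyword clustering method based on search engines" (基于搜索引擎的关键词自动聚类法), Computer Science (《计算机科学》) *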

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20170627