CN106599118A - Method for realizing search engine keyword optimization by improved density clustering algorithm - Google Patents
- Publication number
- CN106599118A
- Authority
- CN
- China
- Prior art keywords
- search engine
- key word
- cluster
- keyword
- keywords
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a method for realizing search engine keyword optimization with an improved density clustering algorithm. The method comprises: determining core keywords according to the enterprise's business and collecting the data items that a search engine reports for each keyword, such as domestic monthly search volume, degree of competition, and estimated cost per click (CPC); screening the keyword set; recording the number of home pages and the total number of search result pages for each keyword, so that each keyword is represented by a five-dimensional vector, which is then reduced to four dimensions; and finally clustering the keywords with the improved density clustering algorithm, in which the influence function of each cluster center is f(i,j). According to the disclosure, the algorithm is simpler and more effective, its runtime complexity is low, and its processing speed is higher; the classification result conforms better to empirical values and the data processing effect is better; the method can help a website improve its keyword rankings within a short time and bring a certain amount of traffic and inquiries to enterprise websites, thereby reaching the intended website optimization goal.
Description
Technical field
The present invention relates to the field of Semantic Web technology, and in particular to a method for realizing search engine keyword optimization with an improved density clustering algorithm.
Background
Search engines are the main tool people use to obtain Internet resources. With the emergence of well-known search engines such as Yahoo and Google, search engine optimization (SEO) technology has gradually developed as well. SEO includes black-hat and white-hat techniques. Black-hat techniques are malicious optimization techniques that violate the optimization rules of search engines; in keyword optimization they take the form of stuffing a page with keywords or placing unrelated keywords on it to raise its ranking in search engines. Search engines have now introduced techniques and rules to penalize websites that use black-hat techniques. White-hat techniques are the optimization techniques that search engines approve of.
Selecting keywords is one of the most important SEO tasks, yet it is seldom discussed or studied. Without the right keywords, SEO work yields half the result for twice the effort. When studying the relationship between keyword search-volume data and related questions, which keywords to select is the first key problem to solve. A review of the literature shows that keyword selection mostly relies on experience and subjective factors, and that no mature mechanism exists for managing keyword optimization strategy and progress. To make keyword selection more scientific and objective, the invention provides a method for realizing search engine keyword optimization with an improved density clustering algorithm.
Summary of the invention
To address the technical problem of realizing search engine optimization through keyword optimization, the invention provides a method for realizing search engine keyword optimization with an improved density clustering algorithm.
To solve the above problem, the invention is realized through the following technical solution:
Step 1: Determine the core keywords according to the enterprise's business and collect related keywords using a search engine. The search engine provides corresponding data items for these keywords, such as domestic monthly search volume, degree of competition, and estimated cost per click (CPC).
Step 2: Taking the enterprise's products and market analysis into account, screen and reduce the set of related keywords found above.
Step 3: For the screened keyword set, search the pages corresponding to each keyword with the search engine and record the number of home pages and the total number of search result pages. Each keyword is thus a five-dimensional vector, which is then reduced to four dimensions.
Step 4: Cluster the above keywords using an improved density clustering algorithm. The sub-steps are as follows:
Step 4.1: Initialize the clusters using a k-means algorithm based on ε-neighborhoods.
Step 4.2: Initialize the influence function $f(i,j)_{start}$ in each ε-neighborhood, and select $k$ initial cluster centers from the data object set $D$ according to a decision condition.
Step 4.3: Reassign each keyword $i$ ($i \in (1, 2, \ldots, m)$), selecting its cluster center $j'$ by the probability function $p(i)$.
Step 4.4: Recalculate the cluster centers according to the result of the decision function $\Delta(f)$.
Step 4.5: If the cluster centers have changed, go to Step 4.2; otherwise the iteration ends and the clustering result is output.
Step 5: According to the enterprise's specific circumstances, weigh keyword efficiency optimization against value-rate optimization and select a suitable keyword optimization strategy to reach the website optimization goal.
The beneficial effects of the invention are:
1. The algorithm simplifies the keyword analysis workflow and thereby reduces the overall website optimization workload.
2. The algorithm has low runtime complexity and faster processing speed.
3. The algorithm has greater practical value.
4. It can help a website quickly raise the ranking of its keywords in a short time.
5. It brings a certain amount of traffic and inquiries to enterprise websites, thereby reaching the intended website optimization goal.
6. The classification results of the algorithm conform better to empirical values.
7. The algorithm is simpler and more effective.
8. The data processing effect is better.
Description of the drawings
Fig. 1 is a structural flow chart of search engine keyword optimization realized with the improved density clustering algorithm.
Fig. 2 is a flow chart of the application of the improved density clustering algorithm in cluster analysis.
Detailed description
To solve the technical problem of realizing search engine optimization through keyword optimization, the invention is described in detail below with reference to Fig. 1 and Fig. 2. The specific implementation steps are as follows:
Step 1: Determine the core keywords according to the enterprise's business and collect related keywords using a search engine. The search engine provides corresponding data items for these keywords, such as domestic monthly search volume, degree of competition, and estimated cost per click (CPC).
Step 2: Taking the enterprise's products and market analysis into account, screen and reduce the set of related keywords found above.
Step 3: For the screened keyword set, search the pages corresponding to each keyword with the search engine and record the number of home pages and the total number of search result pages. Each keyword is thus a five-dimensional vector, which is then reduced to four dimensions. The specific calculation process is as follows:
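Let the number of related keywords be $m$. The data then form the following $m \times 5$ matrix:
$$\begin{pmatrix} N_1 & Ld_1 & CPC_1 & N_{1S} & N_{1Y} \\ \vdots & \vdots & \vdots & \vdots & \vdots \\ N_m & Ld_m & CPC_m & N_{mS} & N_{mY} \end{pmatrix}$$
$N_i$, $Ld_i$, $CPC_i$, $N_{iS}$, $N_{iY}$ are, in order, the domestic monthly search volume, degree of competition, estimated cost per click (CPC), number of home pages, and total number of search result pages of the $i$-th keyword.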
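As an illustration of the data layout described above, the following minimal sketch stacks per-keyword data items into an m × 5 matrix. The keyword values are invented for the example, and the helper names `build_feature_matrix` and `min_max_scale` are hypothetical. The min-max rescaling is an added assumption (the columns have very different ranges); the patent text does not describe it.

```python
# Column order follows the text: N_i, Ld_i, CPC_i, N_iS, N_iY.
COLUMNS = ("N", "Ld", "CPC", "N_S", "N_Y")

# Hypothetical rows, one per keyword; the numbers are made up for illustration.
rows = [
    (12000, 0.62, 3.10, 8, 2_400_000),
    (3500, 0.35, 1.25, 3, 310_000),
    (800, 0.10, 0.40, 1, 45_000),
]


def build_feature_matrix(rows):
    """Return an m x 5 list-of-lists of floats, one row per keyword."""
    matrix = [[float(value) for value in row] for row in rows]
    for row in matrix:
        if len(row) != len(COLUMNS):
            raise ValueError("each keyword needs exactly five data items")
    return matrix


def min_max_scale(matrix):
    """Rescale every column to [0, 1] (an assumption, not part of the patent).

    A constant column would have zero range, so its span is replaced by 1.0
    and the column maps to 0.0.
    """
    columns = list(zip(*matrix))
    lows = [min(column) for column in columns]
    spans = [(max(column) - min(column)) or 1.0 for column in columns]
    return [
        [(value - low) / span for value, low, span in zip(row, lows, spans)]
        for row in matrix
    ]


features = build_feature_matrix(rows)
scaled = min_max_scale(features)
```

Rescaling keeps a large-valued column such as the total number of search result pages from dominating any distance computed later between keyword vectors.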
The vector is then reduced to four dimensions, i.e.
$X_i$ ($i \in (1, 2, \ldots, m)$) is the search efficiency and $Z_i$ ($i \in (1, 2, \ldots, m)$) is the value rate, given by the following formula:
Step 4: Cluster the above keywords using an improved density clustering algorithm. The sub-steps are as follows:
Step 4.1: Initialize the clusters using a k-means algorithm based on ε-neighborhoods.
Step 4.2: Initialize the influence function $f(i,j)_{start}$ in each ε-neighborhood, and select $k$ initial cluster centers from the data object set $D$ according to the following decision condition. The specific calculation process is as follows:
In the formula above, $N_\varepsilon$ is the number of data objects in each ε-neighborhood, $d(i,j)$ is the distance from keyword $i$ to the cluster center of the corresponding ε-neighborhood, and $\sigma$ is the expected value of the cluster centers.
In the formula above, $x_{ih}$ is the vector corresponding to the $i$-th keyword in the ε-neighborhood, and $y_{jh}$ is the vector of the cluster-center data object in the ε-neighborhood.
The decision condition is:
$$f(i,j)_{start} > \gamma$$
$\gamma$ is a preset threshold; only data objects that satisfy the condition above are assigned to a cluster.
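The text here does not reproduce the formula for the influence function, so the sketch below substitutes a Gaussian kernel of the kind commonly used in density-based clustering, $\exp(-d(i,j)^2 / (2\sigma^2))$, with Euclidean distance. Both choices are assumptions, as are the function names and the normalization by the total number of points. The sketch only illustrates the shape of Step 4.2: score each candidate by the influence of its ε-neighborhood, keep candidates whose score exceeds γ, and take up to k of them.

```python
import math


def euclidean(x, y):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def gaussian_influence(x, center, sigma):
    """Assumed influence function: exp(-d^2 / (2 * sigma^2))."""
    d = euclidean(x, center)
    return math.exp(-(d * d) / (2.0 * sigma * sigma))


def initial_centers(points, eps, sigma, gamma, k):
    """Return indices of up to k initial centers whose score exceeds gamma."""
    scored = []
    for j, candidate in enumerate(points):
        # The eps-neighborhood of the candidate (it contains the candidate itself).
        neighbors = [p for p in points if euclidean(p, candidate) <= eps]
        # Summed influence of the neighborhood, normalized by the data set size.
        f_start = sum(gaussian_influence(p, candidate, sigma) for p in neighbors)
        f_start /= len(points)
        if f_start > gamma:  # decision condition f(i,j)_start > gamma
            scored.append((f_start, j))
    scored.sort(reverse=True)
    chosen = []
    for _, j in scored:
        # Skip candidates lying inside the neighborhood of a chosen center.
        if all(euclidean(points[j], points[c]) > eps for c in chosen):
            chosen.append(j)
        if len(chosen) == k:
            break
    return chosen


# Two tight groups and one isolated point (hypothetical data).
points = [(0, 0), (0.1, 0), (0, 0.1), (5, 5), (5.1, 5), (5, 5.1), (10, 0)]
chosen = initial_centers(points, eps=0.5, sigma=0.5, gamma=0.3, k=2)
```

With these made-up points the isolated point at (10, 0) scores 1/7 and falls below γ, so it is never proposed as a center.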
Step 4.3: Reassign each keyword $i$ ($i \in (1, 2, \ldots, m)$), selecting its cluster center $j'$ by the probability function $p(i)$. The specific calculation process is as follows:
The cluster center $j'$ for which $p(i)$ is largest is selected.
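The formula for $p(i)$ is not reproduced in this text. As a hedged illustration of the selection rule only, the sketch below assumes that the probability of assigning a keyword to a center is that center's Gaussian influence on the keyword, normalized over all centers. The kernel, the normalization, and the function names are assumptions for the example.

```python
import math


def influence(x, center, sigma):
    """Assumed Gaussian influence of a center on point x."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, center))
    return math.exp(-d2 / (2.0 * sigma ** 2))


def assignment_probabilities(x, centers, sigma):
    """Normalize the influences of all centers on x so that they sum to 1."""
    weights = [influence(x, c, sigma) for c in centers]
    total = sum(weights)
    if total == 0.0:
        # All influences underflowed to zero: fall back to a uniform choice.
        return [1.0 / len(centers)] * len(centers)
    return [w / total for w in weights]


def assign(points, centers, sigma):
    """Give each point the index j' of the center with the largest p."""
    labels = []
    for x in points:
        p = assignment_probabilities(x, centers, sigma)
        labels.append(max(range(len(centers)), key=p.__getitem__))
    return labels


# Hypothetical two-dimensional example.
centers = [(0.0, 0.0), (5.0, 5.0)]
points = [(0.2, 0.1), (4.8, 5.3), (1.0, 1.0)]
labels = assign(points, centers, sigma=1.0)
```

Because every center shares the same normalizing denominator, taking the maximum of this assumed $p$ is equivalent to picking the center with the largest raw influence.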
Step 4.4: Recalculate the cluster centers according to the result of the decision function $\Delta(f)$. The specific calculation process is as follows:
$$\Delta(f) = f(i,j)_{new} - f(i,j)_{old} > 0$$
If the condition above is satisfied, the cluster centers are recalculated.
Step 4.5: If the cluster centers have changed, go to Step 4.2; otherwise the iteration ends and the clustering result is output.
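The loop formed by Steps 4.3 to 4.5 can be sketched as follows. This is an illustration under stated assumptions rather than the patented procedure: the influence function is taken to be Gaussian, proposed centers are the means of their current members (a k-means-style update), and $f_{new}$ and $f_{old}$ are the summed influences of each point's assigned center. All function names are hypothetical.

```python
import math


def influence(x, center, sigma):
    """Assumed Gaussian influence of a center on point x."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, center))
    return math.exp(-d2 / (2.0 * sigma ** 2))


def assign(points, centers, sigma):
    """Label each point with the index of its most influential center."""
    return [
        max(range(len(centers)), key=lambda j: influence(x, centers[j], sigma))
        for x in points
    ]


def total_influence(points, centers, labels, sigma):
    """Global objective: sum of each point's influence from its own center."""
    return sum(influence(x, centers[l], sigma) for x, l in zip(points, labels))


def mean(members):
    """Component-wise mean of a non-empty list of vectors."""
    return tuple(sum(column) / len(members) for column in zip(*members))


def cluster(points, centers, sigma, max_iter=100):
    centers = [tuple(c) for c in centers]
    labels = assign(points, centers, sigma)
    for _ in range(max_iter):
        f_old = total_influence(points, centers, labels, sigma)
        proposed = []
        for j, center in enumerate(centers):
            members = [x for x, l in zip(points, labels) if l == j]
            # An empty cluster keeps its previous center.
            proposed.append(mean(members) if members else center)
        new_labels = assign(points, proposed, sigma)
        f_new = total_influence(points, proposed, new_labels, sigma)
        if f_new - f_old > 0:  # decision function: delta(f) > 0
            centers, labels = proposed, new_labels
        else:
            break  # centers unchanged, so the iteration ends
    return centers, labels


# Hypothetical data: two well-separated groups.
points = [(0, 0), (0.2, 0), (0, 0.2), (5, 5), (5.2, 5), (5, 5.2)]
centers, labels = cluster(points, [(0.0, 0.0), (5.0, 5.0)], sigma=1.0)
```

Accepting a proposal only when $\Delta(f) > 0$ means the summed influence strictly increases across accepted iterations, so the loop stops as soon as an update no longer improves it.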
Step 5: According to the enterprise's specific circumstances, weigh keyword efficiency optimization against value-rate optimization and select a suitable keyword optimization strategy to reach the website optimization goal.
The pseudocode process of the improved density clustering algorithm for search engine keyword optimization is as follows:
Input: the core keywords extracted for the website; clusters initialized based on ε-neighborhoods; the initialized influence function $f(i,j)_{start}$ of each ε-neighborhood.
Output: the $k$ clusters that maximize the sum of the global influence function $f(i,j)$.
Claims (2)
1. A method for realizing search engine keyword optimization with an improved density clustering algorithm, relating to the field of Semantic Web technology, characterized in that it comprises the following steps:
Step 1: Determine the core keywords according to the enterprise's business and collect related keywords using a search engine. The search engine provides corresponding data items for these keywords, such as domestic monthly search volume, degree of competition, and estimated cost per click (CPC).
Step 2: Taking the enterprise's products and market analysis into account, screen and reduce the set of related keywords found above.
Step 3: For the screened keyword set, search the pages corresponding to each keyword with the search engine and record the number of home pages and the total number of search result pages, so that each keyword is a five-dimensional vector, which is then reduced to four dimensions. The specific calculation process is as follows:
Let the number of related keywords be $m$; the data form the following $m \times 5$ matrix:
$N_i$, $Ld_i$, $CPC_i$, $N_{iS}$, $N_{iY}$ are, in order, the domestic monthly search volume, degree of competition, estimated cost per click (CPC), number of home pages, and total number of search result pages of the $i$-th keyword. The vector is then reduced to four dimensions, i.e.
$X_i$ ($i \in (1, 2, \ldots, m)$) is the search efficiency and $Z_i$ ($i \in (1, 2, \ldots, m)$) is the value rate, given by the following formula:
Step 4: Cluster the above keywords using an improved density clustering algorithm. The sub-steps are as follows:
Step 4.1: Initialize the clusters using a k-means algorithm based on ε-neighborhoods.
Step 4.2: Initialize the influence function $f(i,j)_{start}$ in each ε-neighborhood, and select $k$ initial cluster centers from the data object set $D$ according to the following decision condition.
Step 4.3: Reassign each keyword $i$ ($i \in (1, 2, \ldots, m)$), selecting its cluster center $j'$ by the probability function $p(i)$.
Step 4.4: Recalculate the cluster centers according to the result of the decision function $\Delta(f)$.
Step 4.5: If the cluster centers have changed, go to Step 4.2; otherwise the iteration ends and the clustering result is output.
Step 5: According to the enterprise's specific circumstances, weigh keyword efficiency optimization against value-rate optimization and select a suitable keyword optimization strategy to reach the website optimization goal.
2. The method for realizing search engine keyword optimization with an improved density clustering algorithm according to claim 1, characterized in that the specific calculation process in Step 4 is as follows:
Step 4: Cluster the above keywords using an improved density clustering algorithm. The sub-steps are as follows:
Step 4.1: Initialize the clusters using a k-means algorithm based on ε-neighborhoods.
Step 4.2: Initialize the influence function $f(i,j)_{start}$ in each ε-neighborhood, and select $k$ initial cluster centers from the data object set $D$ according to the following decision condition. The specific calculation process is as follows:
In the formula above, $N_\varepsilon$ is the number of data objects in each ε-neighborhood, $d(i,j)$ is the distance from keyword $i$ to the cluster center of the corresponding ε-neighborhood, and $\sigma$ is the expected value of the cluster centers.
In the formula above, $x_{ih}$ is the vector corresponding to the $i$-th keyword in the ε-neighborhood, and $y_{jh}$ is the vector of the cluster-center data object in the ε-neighborhood.
The decision condition is:
$$f(i,j)_{start} > \gamma$$
$\gamma$ is a preset threshold; only data objects that satisfy the condition above are assigned to a cluster.
Step 4.3: Reassign each keyword $i$ ($i \in (1, 2, \ldots, m)$), selecting its cluster center $j'$ by the probability function $p(i)$. The specific calculation process is as follows:
The cluster center $j'$ for which $p(i)$ is largest is selected.
Step 4.4: Recalculate the cluster centers according to the result of the decision function $\Delta(f)$. The specific calculation process is as follows:
$$\Delta(f) = f(i,j)_{new} - f(i,j)_{old} > 0$$
If the condition above is satisfied, the cluster centers are recalculated.
Step 4.5: If the cluster centers have changed, go to Step 4.2; otherwise the iteration ends and the clustering result is output.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611089215.0A CN106599118A (en) | 2016-11-30 | 2016-11-30 | Method for realizing search engine keyword optimization by improved density clustering algorithm |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106599118A true CN106599118A (en) | 2017-04-26 |
Family
ID=58594408
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201611089215.0A Pending CN106599118A (en) | 2016-11-30 | 2016-11-30 | Method for realizing search engine keyword optimization by improved density clustering algorithm |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106599118A (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103218435A (en) * | 2013-04-15 | 2013-07-24 | 上海嘉之道企业管理咨询有限公司 | Method and system for clustering Chinese text data |
CN103258000A (en) * | 2013-03-29 | 2013-08-21 | 北界创想(北京)软件有限公司 | Method and device for clustering high-frequency keywords in webpages |
Non-Patent Citations (2)
Title |
---|
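Lin Yuanguo (林元国) et al.: "Application of the K-means algorithm in keyword optimization", 《计算机***应用》 * |
Deng Jianshuang (邓健爽) et al.: "Automatic keyword clustering method based on search engines", 《计算机科学》 * |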
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108985064A (en) * | 2018-07-16 | 2018-12-11 | 中国人民解放军战略支援部队信息工程大学 | A kind of method and device identifying malice document |
CN108985064B (en) * | 2018-07-16 | 2023-10-20 | 中国人民解放军战略支援部队信息工程大学 | Method and device for identifying malicious document |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20170426 |