CN106776912A - Search engine keyword optimization based on a field dispersion algorithm - Google Patents

Publication number
CN106776912A
Authority
CN
China
Prior art keywords
keyword
field
search engine
degree
cluster
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201611085847.XA
Other languages
Chinese (zh)
Inventor
金平艳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan Yonglian Information Technology Co Ltd
Original Assignee
Sichuan Yonglian Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sichuan Yonglian Information Technology Co Ltd
Priority to CN201611085847.XA
Publication of CN106776912A
Current legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/955 Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Human Computer Interaction (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A method for search engine keyword optimization based on a field dispersion algorithm. Core keywords are determined from the enterprise's business, and related keywords are collected together with their search engine data items, such as national monthly search volume, competition degree, and estimated cost per click (CPC). The keyword set is screened to reduce its size. Each keyword is then represented as a five-dimensional vector by adding the homepage web page count and the total search page count, and that vector is in turn reduced to four dimensions. Finally, the keywords are clustered with the field dispersion algorithm according to the dispersion function of each field (ε-neighborhood). The algorithm is simpler and more effective, has lower running-time complexity and faster processing, and gives classification results that agree better with empirical values. It can help a website raise the ranking of its keywords quickly and bring traffic and inquiries to the enterprise website, thereby achieving the intended website optimization goal.

Description

Search engine keyword optimization based on a field dispersion algorithm
Technical field
The present invention relates to the field of Semantic Web technology, and in particular to a method for search engine keyword optimization based on a field dispersion algorithm.
Background
Search engines have become an important tool with which large numbers of Internet users obtain information. Search engine optimization (SEO) means applying a series of optimizations to a website using relevant techniques so as to raise the ranking of the corresponding keywords in search engines, and ultimately to achieve the website's marketing goals. At its core, SEO is keyword optimization. A keyword optimization strategy generally covers keyword selection, keyword distribution, and keyword density control. Keywords are the words or phrases that users enter when searching for relevant pages, and they are also the terms that search engines use when building their indexes. Using keywords well helps a site obtain a higher ranking in search engine results, and the aim of keyword research is to find the most valuable keywords. These are the basic concepts of search engine optimization. When studying the relationship between web search keyword volume data and related problems, the first key issue to resolve is which keywords to select. A review of the literature shows that keyword selection mostly relies on experience and subjective judgment, and that there is no well-developed mechanism for managing keyword optimization strategy and progress. To make keyword selection more scientific and objective, the present invention provides a method for search engine keyword optimization based on a field dispersion algorithm.
Summary of the invention
To address the technical problem of achieving search engine optimization through keyword optimization, the present invention provides a method for search engine keyword optimization based on a field dispersion algorithm.
To solve the above problem, the present invention is realized through the following technical solution:
Step 1: Determine the core keywords according to the enterprise's business and collect related keywords using a search engine. These keywords have corresponding data items in the search engine, such as national monthly search volume, competition degree, and estimated cost per click (CPC).
Step 2: Taking the enterprise's products and market analysis into account, screen and reduce the related keyword set collected above.
Step 3: For the screened keyword set, search each keyword in the search engine and record the homepage web page count and the total search page count. Each keyword is thus represented as a five-dimensional vector, which is then reduced to four dimensions.
Step 4: Cluster the above keywords using the field dispersion algorithm. The sub-steps are as follows:
Step 4.1: Initialize the clusters using a k-means algorithm based on the ε-neighborhood.
Step 4.2: Initialize the dispersion function $L(S^2)_{start}$ of each field (ε-neighborhood), and select k initial cluster centers from the data object set D according to a decision condition.
Step 4.3: Reassign each keyword i ($i \in \{1, 2, \dots, m\}$), selecting its cluster center j by the probability function p(i).
Step 4.4: Recalculate each cluster center according to the result of the decision function $\Delta(S^2)$.
Step 4.5: If the cluster centers change, go to step (2); otherwise the iteration ends and the clustering result is output.
Step 5: According to the enterprise's specific situation, weigh keyword efficiency optimization against value rate optimization, and select a suitable keyword optimization strategy to reach the website optimization goal.
The beneficial effects of the present invention are:
1. The algorithm simplifies the keyword analysis workflow and thereby reduces the overall website optimization workload.
2. The algorithm has low running-time complexity and processes data faster.
3. The algorithm has greater practical value.
4. It can help a website raise the ranking of its keywords quickly.
5. It brings traffic and inquiries to the enterprise website, thereby achieving the intended website optimization goal.
6. The accuracy of the algorithm's classification results agrees better with empirical values.
7. The algorithm is simpler and more effective.
Brief description of the drawings
Fig. 1 is a flowchart of search engine keyword optimization based on the field dispersion algorithm.
Fig. 2 is a flowchart of the application of the field dispersion algorithm in cluster analysis.
Detailed description
To solve the technical problem of achieving search engine optimization through keyword optimization, the present invention is described in detail below with reference to Fig. 1 and Fig. 2. The implementation steps are as follows:
Step 1: Determine the core keywords according to the enterprise's business and collect related keywords using a search engine. These keywords have corresponding data items in the search engine, such as national monthly search volume, competition degree, and estimated cost per click (CPC).
Step 2: Taking the enterprise's products and market analysis into account, screen and reduce the related keyword set collected above.
Step 3: For the screened keyword set, search each keyword in the search engine and record the homepage web page count and the total search page count. Each keyword is thus represented as a five-dimensional vector, which is then reduced to four dimensions. The calculation is as follows:
Let the number of related keywords be m, giving the following m × 5 matrix:
$$
\begin{bmatrix}
N_1 & Ld_1 & CPC_1 & N_{1s} & N_{1Y} \\
N_2 & Ld_2 & CPC_2 & N_{2s} & N_{2Y} \\
\vdots & \vdots & \vdots & \vdots & \vdots \\
N_m & Ld_m & CPC_m & N_{ms} & N_{mY}
\end{bmatrix}
$$
where $N_i$, $Ld_i$, $CPC_i$, $N_{is}$, and $N_{iY}$ are, in order, the national monthly search volume, competition degree, estimated cost per click (CPC), homepage web page count, and total search page count of the i-th keyword.
The vector is then reduced to four dimensions, i.e.
$X_i$ ($i \in \{1, 2, \dots, m\}$) is the search efficiency and $Z_i$ ($i \in \{1, 2, \dots, m\}$) is the value rate, given by the following formula:
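As an illustration of the m × 5 keyword matrix described in Step 3, the following minimal Python sketch stores the five recorded data items per keyword and stacks them into rows in the order $N_i$, $Ld_i$, $CPC_i$, $N_{is}$, $N_{iY}$. The class name, field names, and sample values are hypothetical placeholders, not data from the patent, and the sketch does not attempt the reduction to four dimensions because the formulas for $X_i$ and $Z_i$ are not reproduced in this text.

```python
from dataclasses import dataclass


@dataclass
class KeywordStats:
    """Five data items recorded for one keyword (names are illustrative)."""
    keyword: str
    monthly_searches: float  # N_i: national monthly search volume
    competition: float       # Ld_i: competition degree
    cpc: float               # CPC_i: estimated cost per click
    homepage_pages: float    # N_is: homepage web page count
    total_pages: float       # N_iY: total search page count


def to_matrix(rows):
    """Return the m x 5 matrix as a list of rows, one row per keyword."""
    return [
        [r.monthly_searches, r.competition, r.cpc, r.homepage_pages, r.total_pages]
        for r in rows
    ]


# Placeholder values, invented purely to show the layout.
sample = [
    KeywordStats("keyword a", 1000, 0.5, 1.2, 3, 50000),
    KeywordStats("keyword b", 400, 0.2, 0.8, 1, 12000),
]
matrix = to_matrix(sample)
```

Each row of `matrix` corresponds to one keyword, so later clustering steps can treat the rows as data object vectors.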
Step 4: Cluster the above keywords using the field dispersion algorithm. The sub-steps are as follows:
Step 4.1: Initialize the clusters using a k-means algorithm based on the ε-neighborhood.
Step 4.2: Initialize the dispersion function $L(S^2)_{start}$ of each field (ε-neighborhood), and select k initial cluster centers from the data object set D according to the decision condition below. The calculation is as follows:
In the formula, $N_\varepsilon$ is the number of data objects in the ε-neighborhood, $x_{ih}$ is the vector of a data object in the ε-neighborhood, and $y_{ih}$ is the vector of the corresponding cluster-center data object in the ε-neighborhood.
The decision condition is:
$$L(S^2)_{start} > \omega$$
where ω is a preset threshold. Only when this threshold is met is the accuracy of the k initialized clusters high.
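The initial-center selection in Step 4.2 can be sketched as follows. The formula for $L(S^2)_{start}$ is not reproduced in this text, so the sketch assumes one plausible form: the mean squared Euclidean distance between a candidate center and the other data objects inside its ε-neighborhood. The function names, the rule that skips candidates lying within ε of an already chosen center, and the sample data are all assumptions made for illustration.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def neighborhood_dispersion(center, data, eps):
    """Assumed form of L(S^2)_start: mean squared distance from `center`
    to the other data objects within radius eps of it."""
    neighbors = [x for x in data
                 if x is not center and sq_dist(x, center) <= eps ** 2]
    if not neighbors:
        return 0.0
    return sum(sq_dist(x, center) for x in neighbors) / len(neighbors)


def select_initial_centers(data, k, eps, omega):
    """Pick up to k centers whose neighborhood dispersion exceeds omega."""
    centers = []
    for candidate in data:
        if len(centers) == k:
            break
        # Assumption: do not pick two centers from the same eps-neighborhood.
        if any(sq_dist(candidate, c) <= eps ** 2 for c in centers):
            continue
        if neighborhood_dispersion(candidate, data, eps) > omega:
            centers.append(candidate)
    return centers


# Two well-separated groups of placeholder points.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
initial_centers = select_initial_centers(points, k=2, eps=1.5, omega=0.5)
```

With these placeholder points, one center is taken from each group, because every candidate near an accepted center is skipped.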
Step 4.3: Reassign each keyword i ($i \in \{1, 2, \dots, m\}$), selecting its cluster center j by the probability function p(i). The calculation is as follows:
$y_{jh}$ is the vector of the j-th cluster-center data object and α is a smoothing factor. The cluster center j for which the probability function p(i) takes the larger value is the one selected, giving the following formula:
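The probability-based reassignment in Step 4.3 can be illustrated with a short sketch. The exact form of p(i) is not reproduced in this text, so the code below assumes a softmax over negative squared distances, scaled by the smoothing factor α, and assigns the keyword to the center with the largest probability. This assumed form and the function names are illustrative only.

```python
import math


def assignment_probabilities(x, centers, alpha=1.0):
    """Assumed p(i): softmax of -alpha * squared distance to each center."""
    d2 = [sum((p - q) ** 2 for p, q in zip(x, c)) for c in centers]
    d_min = min(d2)  # subtract the minimum for numerical stability
    weights = [math.exp(-alpha * (d - d_min)) for d in d2]
    total = sum(weights)
    return [w / total for w in weights]


def assign(x, centers, alpha=1.0):
    """Index j of the center with the largest assumed probability."""
    probs = assignment_probabilities(x, centers, alpha)
    return max(range(len(centers)), key=probs.__getitem__)


centers = [(0.0, 0.0), (10.0, 10.0)]
probs = assignment_probabilities((0.0, 1.0), centers, alpha=0.1)
```

Under this assumption a smaller α flattens the probabilities across centers and a larger α sharpens them, but the index of the largest probability is always that of the nearest center.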
Step 4.4: Recalculate each cluster center according to the result of the decision function $\Delta(S^2)$. The calculation is as follows:
The decision function $\Delta(S^2)$ is:
$$\Delta(S^2) = L(S^2)_{new} - L(S^2)_{old} > 0$$
where $L(S^2)_{new}$ is the new field dispersion function and $L(S^2)_{old}$ is the field dispersion function obtained in the previous iteration. The cluster centers change only when this decision condition is satisfied.
Step 4.5: If the cluster centers change, go to step (2); otherwise the iteration ends and the clustering result is output.
The detailed flow of the field dispersion algorithm is shown in Fig. 2.
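The iteration in Steps 4.3 to 4.5 can be sketched as a loop that updates the cluster centers only while the decision function $\Delta(S^2) = L(S^2)_{new} - L(S^2)_{old}$ stays positive. Because the dispersion formula is not reproduced in this text, the sketch substitutes the between-cluster dispersion (cluster size times squared distance of the cluster mean from the global mean), which does not decrease under k-means style updates. It also uses nearest-center assignment in place of p(i), treats the first pass as always accepted, and adds a `max_iter` guard. All of these are assumptions, as are the function names.

```python
def sq_dist(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))


def mean(points):
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def assign_all(data, centers):
    """Step 4.3 stand-in: put each data object in its nearest center's cluster."""
    clusters = [[] for _ in centers]
    for x in data:
        j = min(range(len(centers)), key=lambda j: sq_dist(x, centers[j]))
        clusters[j].append(x)
    return clusters


def between_dispersion(data, clusters):
    """Assumed L(S^2): size-weighted squared distance of each cluster mean
    from the global mean."""
    g = mean(data)
    return sum(len(c) * sq_dist(mean(c), g) for c in clusters if c)


def cluster_keywords(data, centers, max_iter=100):
    l_old = float("-inf")  # the first pass is always accepted
    clusters = assign_all(data, centers)
    for _ in range(max_iter):
        l_new = between_dispersion(data, clusters)
        if l_new - l_old <= 0:  # decision function not satisfied: stop
            break
        # Step 4.4: centers change only when the decision function is positive.
        centers = [mean(c) if c else y for y, c in zip(centers, clusters)]
        l_old = l_new
        clusters = assign_all(data, centers)
    return centers, clusters


points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
final_centers, final_clusters = cluster_keywords(points, [(0.0, 0.0), (1.0, 0.0)])
```

Starting from two poorly placed centers in the same group, the loop separates the placeholder points into their two groups and then stops once the dispersion no longer increases.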
Step 5: According to the enterprise's specific situation, weigh keyword efficiency optimization against value rate optimization, and select a suitable keyword optimization strategy to reach the website optimization goal.
The pseudocode procedure for search engine keyword optimization based on the field dispersion algorithm is as follows:
Input: the core keywords extracted from the website; the clusters initialized on the ε-neighborhood; the initialized field dispersion function $L(S^2)_{start}$.
Output: the high-quality keywords obtained after the series of optimizations.

Claims (2)

1. A method for search engine keyword optimization based on a field dispersion algorithm, in the field of Semantic Web technology, characterized in that it comprises the following steps:
Step 1: Determine the core keywords according to the enterprise's business and collect related keywords using a search engine. These keywords have corresponding data items in the search engine, such as national monthly search volume, competition degree, and estimated cost per click (CPC).
Step 2: Taking the enterprise's products and market analysis into account, screen and reduce the related keyword set collected above.
Step 3: For the screened keyword set, search each keyword in the search engine and record the homepage web page count and the total search page count. Each keyword is thus represented as a five-dimensional vector, which is then reduced to four dimensions. The calculation is as follows:
Let the number of related keywords be m, giving the following m × 5 matrix:
$$
\begin{bmatrix}
N_1 & Ld_1 & CPC_1 & N_{1s} & N_{1Y} \\
N_2 & Ld_2 & CPC_2 & N_{2s} & N_{2Y} \\
\vdots & \vdots & \vdots & \vdots & \vdots \\
N_m & Ld_m & CPC_m & N_{ms} & N_{mY}
\end{bmatrix}
$$
where $N_i$, $Ld_i$, $CPC_i$, $N_{is}$, and $N_{iY}$ are, in order, the national monthly search volume, competition degree, estimated cost per click (CPC), homepage web page count, and total search page count of the i-th keyword.
The vector is then reduced to four dimensions, i.e.
$X_i$ is the search efficiency and $Z_i$ is the value rate, given by the following formula:
Step 4: Cluster the above keywords using the field dispersion algorithm. The sub-steps are as follows:
Step 4.1: Initialize the clusters using a k-means algorithm based on the ε-neighborhood.
Step 4.2: Initialize the dispersion function $L(S^2)_{start}$ of each field (ε-neighborhood), and select k initial cluster centers from the data object set D according to a decision condition.
Step 4.3: Reassign each keyword i ($i \in \{1, 2, \dots, m\}$), selecting its cluster center j by the probability function p(i).
Step 4.4: Recalculate each cluster center according to the result of the decision function $\Delta(S^2)$.
Step 4.5: If the cluster centers change, go to step (2); otherwise the iteration ends and the clustering result is output.
Step 5: According to the enterprise's specific situation, weigh keyword efficiency optimization against value rate optimization, and select a suitable keyword optimization strategy to reach the website optimization goal.
2. The method for search engine keyword optimization based on a field dispersion algorithm according to claim 1, characterized in that the calculation in Step 4 above is as follows:
Step 4: Cluster the above keywords using the field dispersion algorithm. The sub-steps are as follows:
Step 4.1: Initialize the clusters using a k-means algorithm based on the ε-neighborhood.
Step 4.2: Initialize the dispersion function $L(S^2)_{start}$ of each field (ε-neighborhood), and select k initial cluster centers from the data object set D according to the decision condition below. The calculation is as follows:
In the formula, $N_\varepsilon$ is the number of data objects in the ε-neighborhood, $x_{ih}$ is the vector of a data object in the ε-neighborhood, and $y_{ih}$ is the vector of the corresponding cluster-center data object in the ε-neighborhood.
The decision condition is:
$$L(S^2)_{start} > \omega$$
where ω is a preset threshold. Only when this threshold is met is the accuracy of the k initialized clusters high.
Step 4.3: Reassign each keyword i ($i \in \{1, 2, \dots, m\}$), selecting its cluster center j by the probability function p(i). The calculation is as follows:
$y_{jh}$ is the vector of the j-th cluster-center data object and α is a smoothing factor. The cluster center j for which the probability function p(i) takes the larger value is the one selected, giving the following formula:
Step 4.4: Recalculate each cluster center according to the result of the decision function $\Delta(S^2)$. The calculation is as follows:
The decision function $\Delta(S^2)$ is:
$$\Delta(S^2) = L(S^2)_{new} - L(S^2)_{old} > 0$$
where $L(S^2)_{new}$ is the new field dispersion function and $L(S^2)_{old}$ is the field dispersion function obtained in the previous iteration. The cluster centers change only when this decision condition is satisfied.
Step 4.5: If the cluster centers change, go to step (2); otherwise the iteration ends and the clustering result is output.
The detailed flow of the field dispersion algorithm is shown in Fig. 2.
CN201611085847.XA 2016-11-30 2016-11-30 Realize that search engine keywords optimize based on field dispersion algorithm Pending CN106776912A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611085847.XA CN106776912A (en) 2016-11-30 2016-11-30 Realize that search engine keywords optimize based on field dispersion algorithm

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611085847.XA CN106776912A (en) 2016-11-30 2016-11-30 Realize that search engine keywords optimize based on field dispersion algorithm

Publications (1)

Publication Number Publication Date
CN106776912A true CN106776912A (en) 2017-05-31

Family

ID=58913967

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611085847.XA Pending CN106776912A (en) 2016-11-30 2016-11-30 Realize that search engine keywords optimize based on field dispersion algorithm

Country Status (1)

Country Link
CN (1) CN106776912A (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103218435A (en) * 2013-04-15 2013-07-24 上海嘉之道企业管理咨询有限公司 Method and system for clustering Chinese text data
CN103258000A (en) * 2013-03-29 2013-08-21 北界创想(北京)软件有限公司 Method and device for clustering high-frequency keywords in webpages

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
林元国 等: "K-means算法在关键词优化中的应用", 《计算机***应用》 *
邓健爽 等: "基于搜索引擎的关键词自动聚类法", 《计算机科学》 *

Similar Documents

Publication Publication Date Title
Jiang et al. An improved K-nearest-neighbor algorithm for text categorization
CN110222160A (en) Intelligent semantic document recommendation method, device and computer readable storage medium
CN111291188B (en) Intelligent information extraction method and system
Shuang et al. A sentiment information Collector–Extractor architecture based neural network for sentiment analysis
CN106649616A (en) Clustering algorithm achieving search engine keyword optimization
CN106933954A (en) Search engine optimization technology is realized based on Decision Tree Algorithm
Gligorijevic et al. Deeply supervised model for click-through rate prediction in sponsored search
Xiao et al. Dinrec: Deep interest network based api recommendation approach for mashup creation
Ye et al. Using node identifiers and community prior for graph-based classification
CN106909626A (en) Improved Decision Tree Algorithm realizes search engine optimization technology
CN106933953A (en) A kind of fuzzy K mean cluster algorithm realizes search engine optimization technology
CN111061939B (en) Scientific research academic news keyword matching recommendation method based on deep learning
Li et al. Self-supervised learning-based weight adaptive hashing for fast cross-modal retrieval
Azzam et al. A question routing technique using deep neural network for communities of question answering
CN106874376A (en) A kind of method of verification search engine keyword optimisation technique
Lin et al. Deep-profiling: a deep neural network model for scholarly web user profiling
CN106897356A (en) Improved Fuzzy C mean algorithm realizes that search engine keywords optimize
Zhang et al. Short-text feature expansion and classification based on non-negative matrix factorization
CN106776912A (en) Realize that search engine keywords optimize based on field dispersion algorithm
Zeng et al. RACMF: robust attention convolutional matrix factorization for rating prediction
CN106599118A (en) Method for realizing search engine keyword optimization by improved density clustering algorithm
CN106802945A (en) Fuzzy c-Means Clustering Algorithm based on VSM realizes that search engine keywords optimize
CN106897376A (en) Fuzzy C-Mean Algorithm based on ant colony realizes that keyword optimizes
CN106649537A (en) Search engine keyword optimization technology based on improved swarm intelligence algorithm
CN106776915A (en) A kind of new clustering algorithm realizes that search engine keywords optimize

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication (application publication date: 2017-05-31)