CN106776912A - Search engine keyword optimization based on a field dispersion algorithm - Google Patents
- Publication number
- CN106776912A (application number CN201611085847.XA)
- Authority
- CN
- China
- Prior art keywords
- keyword
- field
- search engine
- degree
- cluster
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/332—Query formulation
- G06F16/3329—Natural language query formulation or dialogue systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/951—Indexing; Web crawling techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/955—Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Artificial Intelligence (AREA)
- Human Computer Interaction (AREA)
- Computational Linguistics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Search engine keyword optimization based on a field dispersion algorithm. Core keywords are determined from the enterprise's business, and related keywords are collected together with their data items, such as domestic monthly search volume, competition level, and estimated cost per click (CPC). The keyword set is then screened and reduced. Each keyword is represented as a five-dimensional vector by adding the homepage page count and the total search page count, and the five dimensions are then reduced to four. Finally, the keywords are clustered with the field dispersion algorithm according to the dispersion function of each field. The algorithm is simpler and more effective, has lower run-time complexity and faster processing, and produces classification results that agree more closely with empirical values. It can help a website raise the ranking of its keywords quickly and bring traffic and inquiries to the enterprise website, thereby achieving the desired website optimization goal.
Description
Technical field
The present invention relates to the field of Semantic Web technology, and in particular to search engine keyword optimization based on a field dispersion algorithm.
Background
Search engines have become an important tool for Internet users to obtain information. Search engine optimization (SEO) means applying a series of optimizations to a website using relevant techniques, so as to raise the ranking of the corresponding keywords on a search engine and ultimately achieve the goal of website marketing. SEO is, in the end, the optimization of keywords. A keyword optimization strategy generally covers keyword selection, keyword distribution, and density control. Keywords are the words or phrases users enter when searching for relevant pages, and they are also the words a search engine uses when building its index. Using keywords well helps a site obtain a higher ranking in search engine queries, and keyword research aims to find the most valuable keywords. These are the basic concepts of search engine optimization, and they help improve search engine ranking.
When studying the relationship between keyword search-volume data and related problems, the first key issue is which keywords to select. From a review of the literature, the authors found that keyword selection mostly relies on experience and subjective factors, and that there is not yet a sound mechanism for managing keyword optimization strategy and progress. To make keyword selection more scientific and objective, the present invention provides search engine keyword optimization based on a field dispersion algorithm.
Summary of the invention
To address the technical problem of achieving search engine optimization through keyword optimization, the present invention provides search engine keyword optimization based on a field dispersion algorithm.
To solve the above problem, the present invention is implemented through the following technical solution:
Step 1: Determine the core keywords from the enterprise's business and collect related keywords with a search engine. Each keyword has corresponding data items in the search engine, such as domestic monthly search volume, competition level, and estimated cost per click (CPC).
Step 2: Combine enterprise products and market analysis to screen and reduce the collected set of related keywords.
Step 3: For the screened keyword set, search for the pages corresponding to each keyword with the search engine and record the homepage page count and the total search page count. Each keyword thus becomes a five-dimensional vector, which is then reduced to four dimensions.
Step 4: Cluster the keywords with the field dispersion algorithm. The sub-steps are as follows:
Step 4.1: Initialize the clusters with the k-means algorithm based on the ε-neighborhood (ε field).
Step 4.2: Initialize the dispersion function $L(S^2)_{start}$ of each field, and select k initial cluster centers from the data object set D according to a decision condition.
Step 4.3: Reassign each keyword $i$ ($i \in \{1, 2, \dots, m\}$), selecting cluster center $j$ with the probability function $p(i)$.
Step 4.4: Recompute each cluster center according to the result of the decision function $\Delta(S^2)$.
Step 4.5: If the cluster centers change, return to Step 4.2; otherwise the iteration ends and the clustering result is output.
Step 5: According to the enterprise's specific situation, weigh keyword efficiency optimization against value rate optimization, and select a suitable keyword optimization strategy to reach the website optimization goal.
The advantages of the present invention are:
1. The algorithm simplifies the keyword analysis process and thereby reduces the overall website optimization workload.
2. The algorithm has low run-time complexity and processes data faster.
3. The algorithm has greater practical value.
4. It can help a website raise the ranking of its keywords quickly.
5. It brings traffic and inquiries to the enterprise website, thereby achieving the desired website optimization goal.
6. The classification results of the algorithm agree more closely with empirical values.
7. The algorithm is simpler and more effective.
Brief description of the drawings
Fig. 1 is a flow chart of search engine keyword optimization based on the field dispersion algorithm.
Fig. 2 is a flow chart of the application of the field dispersion algorithm in cluster analysis.
Detailed description
To solve the technical problem of achieving search engine optimization through keyword optimization, the present invention is described in detail below with reference to Fig. 1 and Fig. 2. The implementation steps are as follows:
Step 1: Determine the core keywords from the enterprise's business and collect related keywords with a search engine. Each keyword has corresponding data items in the search engine, such as domestic monthly search volume, competition level, and estimated cost per click (CPC).
Step 2: Combine enterprise products and market analysis to screen and reduce the collected set of related keywords.
Step 3: For the screened keyword set, search for the pages corresponding to each keyword with the search engine and record the homepage page count and the total search page count. Each keyword thus becomes a five-dimensional vector, which is then reduced to four dimensions. The calculation proceeds as follows:
Let the number of related keywords be m. This gives the following $m \times 5$ matrix:
$$
\begin{pmatrix}
N_1 & Ld_1 & CPC_1 & N_{1s} & N_{1Y} \\
N_2 & Ld_2 & CPC_2 & N_{2s} & N_{2Y} \\
\vdots & \vdots & \vdots & \vdots & \vdots \\
N_m & Ld_m & CPC_m & N_{ms} & N_{mY}
\end{pmatrix}
$$
Here $N_i$, $Ld_i$, $CPC_i$, $N_{is}$, and $N_{iY}$ are, in order, the domestic monthly search volume, competition level, estimated cost per click (CPC), homepage page count, and total search page count of the i-th keyword.
Each keyword is then reduced to four dimensions, in which $X_i$ ($i \in \{1, 2, \dots, m\}$) denotes the search efficiency and $Z_i$ ($i \in \{1, 2, \dots, m\}$) denotes the value rate.
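The formulas for $X_i$ and $Z_i$ are not present in this text, so the following Python sketch only illustrates the shape of the reduction from five recorded values to a four-dimensional vector. The ratios used for search efficiency and value rate, the choice of which four components to keep, the sample values, and the names `Keyword` and `reduce_to_four` are all assumptions made for illustration, not the patent's definitions.

```python
from dataclasses import dataclass


@dataclass
class Keyword:
    term: str
    monthly_searches: float  # N_i
    competition: float       # Ld_i
    cpc: float               # CPC_i
    homepage_pages: float    # N_is
    total_pages: float       # N_iY


def reduce_to_four(kw):
    """Map the five recorded values to a four-dimensional vector.

    ASSUMPTION: X_i = N_i / N_iY and Z_i = CPC_i / N_is are placeholder
    ratios; the source text does not give the actual formulas.
    """
    x = kw.monthly_searches / kw.total_pages if kw.total_pages else 0.0
    z = kw.cpc / kw.homepage_pages if kw.homepage_pages else 0.0
    return (kw.monthly_searches, kw.competition, x, z)


# Made-up sample values, used only to show the call pattern.
keywords = [
    Keyword("led lamp", 1000.0, 0.5, 2.0, 4.0, 200.0),
    Keyword("led bulb", 300.0, 0.2, 1.0, 2.0, 600.0),
]
vectors = [reduce_to_four(kw) for kw in keywords]
```

Keeping the raw record and the reduced vector separate makes it easy to swap in the real ratios once they are known without touching the clustering code.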
Step 4: Cluster the keywords with the field dispersion algorithm. The sub-steps are as follows:
Step 4.1: Initialize the clusters with the k-means algorithm based on the ε-neighborhood (ε field).
Step 4.2: Initialize the dispersion function $L(S^2)_{start}$ of each field, and select k initial cluster centers from the data object set D according to the decision condition below.
In the dispersion function, $N_\varepsilon$ is the number of data objects in the ε-neighborhood, $x_{ih}$ is the vector of a data object in the ε-neighborhood, and $y_{ih}$ is the vector of the corresponding cluster center in the ε-neighborhood.
The decision condition is:
$$L(S^2)_{start} > \omega$$
where $\omega$ is a preset threshold. The k initialized clusters are considered sufficiently accurate only when this condition is met.
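The selection of initial centers in Step 4.2 can be sketched as follows. Since the dispersion formula itself is not present in this text, the sketch assumes that $L(S^2)$ is the mean squared Euclidean distance between a candidate center and the $N_\varepsilon$ objects within distance ε of it. That definition, the rule of skipping candidates that lie within ε of an already chosen center, and the function names are assumptions for illustration.

```python
import math


def dispersion(points, center, eps):
    """Assumed L(S^2): mean squared distance from `center` to the
    points inside its eps-neighborhood (N_eps = number of such points)."""
    near = [p for p in points if math.dist(p, center) <= eps]
    if not near:
        return 0.0
    return sum(math.dist(p, center) ** 2 for p in near) / len(near)


def pick_initial_centers(points, k, eps, omega):
    """Scan the data set and keep up to k candidates whose dispersion
    exceeds the threshold omega, as the decision condition requires."""
    centers = []
    for candidate in points:
        if len(centers) == k:
            break
        # Assumed rule: do not pick two centers from the same neighborhood.
        if any(math.dist(candidate, c) <= eps for c in centers):
            continue
        if dispersion(points, candidate, eps) > omega:
            centers.append(candidate)
    return centers
```

With this reading, an isolated point has dispersion 0 and is never chosen as an initial center, which is one plausible purpose of the threshold.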
Step 4.3: Reassign each keyword $i$ ($i \in \{1, 2, \dots, m\}$), selecting cluster center $j$ with the probability function $p(i)$. The calculation proceeds as follows:
$y_{jh}$ is the data object vector of the j-th cluster center and $\alpha$ is a smoothing factor. A keyword is assigned to the cluster center $j$ for which the value of the probability function $p(i)$ is largest.
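The expression for $p(i)$ is not present in this text. As a minimal sketch, the code below assumes a normalized exponential of the negative squared distance to each center, scaled by the smoothing factor α, and assigns the keyword to the center with the largest probability. The functional form and the function names are assumptions, not the patent's formula.

```python
import math


def assignment_probabilities(x, centers, alpha=1.0):
    """Assumed p(i): softmax over -alpha * squared distance to each center."""
    sq = [math.dist(x, y) ** 2 for y in centers]
    # Shift by the smallest distance so the largest weight is exp(0) = 1,
    # which avoids a zero denominator when all centers are far away.
    base = min(sq)
    weights = [math.exp(-alpha * (d - base)) for d in sq]
    total = sum(weights)
    return [w / total for w in weights]


def assign(x, centers, alpha=1.0):
    """Return the index j of the center with the largest probability."""
    probs = assignment_probabilities(x, centers, alpha)
    return max(range(len(centers)), key=probs.__getitem__)
```

Under this assumed form, a larger α makes the assignment sharper, while a smaller α spreads probability more evenly across centers.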
Step 4.4: Recompute each cluster center according to the result of the decision function $\Delta(S^2)$. The calculation proceeds as follows:
Decision function $\Delta(S^2)$:
$$\Delta(S^2) = L(S^2)_{new} - L(S^2)_{old} > 0$$
Here $L(S^2)_{new}$ is the new field dispersion function and $L(S^2)_{old}$ is the field dispersion function obtained in the previous iteration. The cluster centers change only when this decision condition is satisfied.
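The decision in Step 4.4 can be sketched as a comparison of two dispersion values. The sketch reuses the assumed ε-neighborhood dispersion (mean squared distance to the objects within ε of a center, summed over all centers); only the test $\Delta(S^2) > 0$ comes from the text, and the function names are hypothetical.

```python
import math


def eps_dispersion(points, center, eps):
    """Assumed L(S^2) for one center: mean squared distance to the
    points that fall inside its eps-neighborhood."""
    near = [p for p in points if math.dist(p, center) <= eps]
    if not near:
        return 0.0
    return sum(math.dist(p, center) ** 2 for p in near) / len(near)


def decide_update(points, old_centers, new_centers, eps):
    """Accept the proposed centers only if Delta = L_new - L_old > 0."""
    l_old = sum(eps_dispersion(points, c, eps) for c in old_centers)
    l_new = sum(eps_dispersion(points, c, eps) for c in new_centers)
    delta = l_new - l_old
    if delta > 0:
        return new_centers, True   # cluster centers change
    return old_centers, False      # keep the previous centers
```

Because the neighborhood is recomputed for each candidate center, moving a center can bring more objects into range and so raise the dispersion, which is when the update is accepted in this sketch.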
Step 4.5: If the cluster centers change, return to Step 4.2; otherwise the iteration ends and the clustering result is output.
The detailed flow of the field dispersion algorithm is shown in Fig. 2.
Step 5: According to the enterprise's specific situation, weigh keyword efficiency optimization against value rate optimization, and select a suitable keyword optimization strategy to reach the website optimization goal.
Pseudocode for search engine keyword optimization based on the field dispersion algorithm:
Input: the core keywords extracted from the website, the clusters initialized on the basis of the ε-neighborhood (ε field), and the initial field dispersion function $L(S^2)_{start}$.
Output: high-quality keywords after a series of optimizations.
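Only the input and output of the pseudocode appear in this text. The loop of Steps 4.3 to 4.5 might be sketched as below, under stated assumptions: the dispersion is the ε-neighborhood mean squared distance summed over centers, the most probable center is taken to be the nearest one, proposed centers are cluster means, and a `max_iter` cap guards against long runs. None of these choices is confirmed by the source, and the function names are hypothetical.

```python
import math


def dispersion(points, centers, eps):
    """Assumed L(S^2): for each center, mean squared distance to the
    points within eps of it; summed over all centers."""
    total = 0.0
    for c in centers:
        near = [p for p in points if math.dist(p, c) <= eps]
        if near:
            total += sum(math.dist(p, c) ** 2 for p in near) / len(near)
    return total


def assign(point, centers):
    # Assumption: the probability decreases with distance, so the most
    # probable center is the nearest one.
    return min(range(len(centers)), key=lambda j: math.dist(point, centers[j]))


def mean_point(members):
    n = len(members)
    return tuple(sum(coord) / n for coord in zip(*members))


def cluster(points, centers, eps, max_iter=100):
    centers = list(centers)
    for _ in range(max_iter):
        labels = [assign(p, centers) for p in points]            # Step 4.3
        proposal = []
        for j, old in enumerate(centers):                        # Step 4.4
            members = [p for p, lab in zip(points, labels) if lab == j]
            proposal.append(mean_point(members) if members else old)
        delta = dispersion(points, proposal, eps) - dispersion(points, centers, eps)
        if delta <= 0:                                           # Step 4.5
            break            # centers unchanged: iteration ends
        centers = proposal   # centers changed: iterate again
    labels = [assign(p, centers) for p in points]
    return labels, centers
```

The returned labels index into the returned centers, so the clusters can be inspected keyword by keyword when choosing an optimization strategy in Step 5.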
Claims (2)
1. A method for search engine keyword optimization based on a field dispersion algorithm, relating to the field of Semantic Web technology, characterized in that it comprises the following steps:
Step 1: Determine the core keywords from the enterprise's business and collect related keywords with a search engine. Each keyword has corresponding data items in the search engine, such as domestic monthly search volume, competition level, and estimated cost per click (CPC).
Step 2: Combine enterprise products and market analysis to screen and reduce the collected set of related keywords.
Step 3: For the screened keyword set, search for the pages corresponding to each keyword with the search engine and record the homepage page count and the total search page count. Each keyword thus becomes a five-dimensional vector, which is then reduced to four dimensions. The calculation proceeds as follows:
Let the number of related keywords be m. This gives an $m \times 5$ matrix whose i-th row is $(N_i, Ld_i, CPC_i, N_{is}, N_{iY})$, where $N_i$, $Ld_i$, $CPC_i$, $N_{is}$, and $N_{iY}$ are, in order, the domestic monthly search volume, competition level, estimated cost per click (CPC), homepage page count, and total search page count of the i-th keyword.
Each keyword is then reduced to four dimensions, in which $X_i$ is the search efficiency and $Z_i$ is the value rate.
Step 4: Cluster the keywords with the field dispersion algorithm. The sub-steps are as follows:
Step 4.1: Initialize the clusters with the k-means algorithm based on the ε-neighborhood (ε field).
Step 4.2: Initialize the dispersion function $L(S^2)_{start}$ of each field, and select k initial cluster centers from the data object set D according to a decision condition.
Step 4.3: Reassign each keyword $i$ ($i \in \{1, 2, \dots, m\}$), selecting cluster center $j$ with the probability function $p(i)$.
Step 4.4: Recompute each cluster center according to the result of the decision function $\Delta(S^2)$.
Step 4.5: If the cluster centers change, return to Step 4.2; otherwise the iteration ends and the clustering result is output.
Step 5: According to the enterprise's specific situation, weigh keyword efficiency optimization against value rate optimization, and select a suitable keyword optimization strategy to reach the website optimization goal.
2. The method for search engine keyword optimization based on a field dispersion algorithm according to claim 1, characterized in that the specific calculation process of Step 4 is as follows:
Step 4: Cluster the keywords with the field dispersion algorithm. The sub-steps are as follows:
Step 4.1: Initialize the clusters with the k-means algorithm based on the ε-neighborhood (ε field).
Step 4.2: Initialize the dispersion function $L(S^2)_{start}$ of each field, and select k initial cluster centers from the data object set D according to the decision condition below.
In the dispersion function, $N_\varepsilon$ is the number of data objects in the ε-neighborhood, $x_{ih}$ is the vector of a data object in the ε-neighborhood, and $y_{ih}$ is the vector of the corresponding cluster center in the ε-neighborhood.
The decision condition is $L(S^2)_{start} > \omega$, where $\omega$ is a preset threshold. The k initialized clusters are considered sufficiently accurate only when this condition is met.
Step 4.3: Reassign each keyword $i$ ($i \in \{1, 2, \dots, m\}$), selecting cluster center $j$ with the probability function $p(i)$. The calculation proceeds as follows:
$y_{jh}$ is the data object vector of the j-th cluster center and $\alpha$ is a smoothing factor. A keyword is assigned to the cluster center $j$ for which the value of the probability function $p(i)$ is largest.
Step 4.4: Recompute each cluster center according to the result of the decision function $\Delta(S^2)$. The calculation proceeds as follows:
Decision function: $\Delta(S^2) = L(S^2)_{new} - L(S^2)_{old} > 0$
Here $L(S^2)_{new}$ is the new field dispersion function and $L(S^2)_{old}$ is the field dispersion function obtained in the previous iteration. The cluster centers change only when this decision condition is satisfied.
Step 4.5: If the cluster centers change, return to Step 4.2; otherwise the iteration ends and the clustering result is output.
The detailed flow of the field dispersion algorithm is shown in Fig. 2.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611085847.XA CN106776912A (en) | 2016-11-30 | 2016-11-30 | Realize that search engine keywords optimize based on field dispersion algorithm |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611085847.XA CN106776912A (en) | 2016-11-30 | 2016-11-30 | Realize that search engine keywords optimize based on field dispersion algorithm |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106776912A true CN106776912A (en) | 2017-05-31 |
Family
ID=58913967
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201611085847.XA Pending CN106776912A (en) | 2016-11-30 | 2016-11-30 | Realize that search engine keywords optimize based on field dispersion algorithm |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106776912A (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103218435A (en) * | 2013-04-15 | 2013-07-24 | 上海嘉之道企业管理咨询有限公司 | Method and system for clustering Chinese text data |
CN103258000A (en) * | 2013-03-29 | 2013-08-21 | 北界创想(北京)软件有限公司 | Method and device for clustering high-frequency keywords in webpages |
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103258000A (en) * | 2013-03-29 | 2013-08-21 | 北界创想(北京)软件有限公司 | Method and device for clustering high-frequency keywords in webpages |
CN103218435A (en) * | 2013-04-15 | 2013-07-24 | 上海嘉之道企业管理咨询有限公司 | Method and system for clustering Chinese text data |
Non-Patent Citations (2)
Title |
---|
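Lin Yuanguo (林元国) et al.: "Application of the K-means algorithm in keyword optimization", 《计算机***应用》 * |
Deng Jianshuang (邓健爽) et al.: "Automatic keyword clustering method based on search engines", 《计算机科学》 (Computer Science) * |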
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Jiang et al. | An improved K-nearest-neighbor algorithm for text categorization | |
CN110222160A (en) | Intelligent semantic document recommendation method, device and computer readable storage medium | |
CN111291188B (en) | Intelligent information extraction method and system | |
Shuang et al. | A sentiment information Collector–Extractor architecture based neural network for sentiment analysis | |
CN106649616A (en) | Clustering algorithm achieving search engine keyword optimization | |
CN106933954A (en) | Search engine optimization technology is realized based on Decision Tree Algorithm | |
Gligorijevic et al. | Deeply supervised model for click-through rate prediction in sponsored search | |
Xiao et al. | Dinrec: Deep interest network based api recommendation approach for mashup creation | |
Ye et al. | Using node identifiers and community prior for graph-based classification | |
CN106909626A (en) | Improved Decision Tree Algorithm realizes search engine optimization technology | |
CN106933953A (en) | A kind of fuzzy K mean cluster algorithm realizes search engine optimization technology | |
CN111061939B (en) | Scientific research academic news keyword matching recommendation method based on deep learning | |
Li et al. | Self-supervised learning-based weight adaptive hashing for fast cross-modal retrieval | |
Azzam et al. | A question routing technique using deep neural network for communities of question answering | |
CN106874376A (en) | A kind of method of verification search engine keyword optimisation technique | |
Lin et al. | Deep-profiling: a deep neural network model for scholarly web user profiling | |
CN106897356A (en) | Improved Fuzzy C mean algorithm realizes that search engine keywords optimize | |
Zhang et al. | Short-text feature expansion and classification based on non-negative matrix factorization | |
CN106776912A (en) | Realize that search engine keywords optimize based on field dispersion algorithm | |
Zeng et al. | RACMF: robust attention convolutional matrix factorization for rating prediction | |
CN106599118A (en) | Method for realizing search engine keyword optimization by improved density clustering algorithm | |
CN106802945A (en) | Fuzzy c-Means Clustering Algorithm based on VSM realizes that search engine keywords optimize | |
CN106897376A (en) | Fuzzy C-Mean Algorithm based on ant colony realizes that keyword optimizes | |
CN106649537A (en) | Search engine keyword optimization technology based on improved swarm intelligence algorithm | |
CN106776915A (en) | A kind of new clustering algorithm realizes that search engine keywords optimize |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PB01 | Publication | |
 | SE01 | Entry into force of request for substantive examination | |
 | WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20170531 |