CN106802945A - Search engine keyword optimization using a VSM-based fuzzy c-means clustering algorithm - Google Patents
- Publication number
- CN106802945A (application number CN201710012398.4A)
- Authority
- CN
- China
- Prior art keywords
- keyword
- fuzzy
- vsm
- search engine
- clustering algorithm
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/951—Indexing; Web crawling techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0241—Advertisements
- G06Q30/0277—Online advertisement
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- Business, Economics & Management (AREA)
- General Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Accounting & Taxation (AREA)
- Development Economics (AREA)
- Finance (AREA)
- Strategic Management (AREA)
- Entrepreneurship & Innovation (AREA)
- General Business, Economics & Management (AREA)
- Computational Linguistics (AREA)
- Marketing (AREA)
- Economics (AREA)
- Game Theory and Decision Science (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
A VSM-based fuzzy c-means clustering algorithm is used to optimize search engine keywords. Core keywords are determined from the enterprise's business, and related keywords are collected together with their data items, such as national monthly search volume, competition level, and estimated cost per click (CPC). The keyword set is screened to reduce its size. Each keyword is then represented as a five-dimensional vector by adding the number of homepage pages and the total number of result pages, and this vector is in turn reduced to four dimensions. The keywords are clustered with the VSM-based fuzzy c-means algorithm, and a keyword optimization strategy suited to the enterprise is selected according to its specific circumstances. The invention assigns the global share of each neighborhood and the path and weight coefficients of each cluster, so that the results agree better with empirical values. It reduces the influence of outliers, reduces the overall workload of website optimization, and avoids premature convergence of the clustering. It also has low running-time complexity and fast processing, and can quickly raise keyword rankings, thereby achieving the desired website optimization goal.
Description
Technical field
The present invention relates to the field of Semantic Web technology, and in particular to search engine keyword optimization using a VSM-based fuzzy c-means clustering algorithm.
Background
The rapid development of the Internet has driven an explosion of online information, and people have begun to exploit its commercial value. More and more enterprises publish their information online, hoping to be found through search engine advertising or other types of advertising and so to earn considerable returns at low cost. China's search engine industry is now fairly mature, but a complete theory of enterprise search engine strategy has not yet emerged, which is a major reason why enterprises currently have limited success with search engine optimization. Many enterprises know only simple optimization techniques and have no complete strategy framework to guide them, so when faced with optimization they do not know how to carry it out. As a result, some enterprises cheat in pursuit of short-term gains, exploiting search engine loopholes to obtain temporary rankings, which has seriously disrupted the normal development of the search engine optimization industry.
For a commercial website, ranking highly in the natural results of mainstream search engines for its core keywords is extremely valuable in today's commercial society. Keywords are therefore often called the cornerstone of the whole search application. There is a considerable amount of theoretical research and technical application on keyword optimization at home and abroad, but no effective method has yet been proposed to simplify the keyword analysis workflow, nor is there a sound mechanism for managing keyword optimization strategy and progress. To meet this need, the present invention provides search engine keyword optimization using a VSM-based fuzzy c-means clustering algorithm.
Summary of the invention
To address the technical problem of achieving search engine optimization through keyword optimization, the invention provides search engine keyword optimization using a VSM-based fuzzy c-means clustering algorithm.
To solve the above problem, the present invention is realized by the following technical solution:
Step 1: Determine the core keywords according to the enterprise's business and collect related keywords using a search engine. Each keyword has corresponding data items in the search engine, such as national monthly search volume, competition level, and estimated cost per click (CPC).
Step 2: Screen the collected set of related keywords, in light of the enterprise's products and a market analysis, to reduce its size.
Step 3: For the screened keyword set, search for each keyword in the search engine and record the number of homepage pages and the total number of result pages, so that each keyword is represented by a five-dimensional vector; then reduce this vector to four dimensions.
Step 4: Cluster the keywords with the VSM-based fuzzy c-means clustering algorithm, in the following sub-steps:
Step 4.1: Initialize c classes using a k-means algorithm based on ε-neighborhoods.
Step 4.2: Initialize the membership matrix J with random numbers in [0, 1] such that it satisfies the membership constraints.
Step 4.3: Initialize the objective function of each neighborhood, build the total objective function for the c classes, combine it with the membership constraints to form a system of m equations, and solve the system to obtain the clustering result.
Step 4.4: Recompute each cluster center according to the result of the decision function Δ(g).
Step 4.5: If the cluster centers have changed, return to Step 4.2 and recompute the membership matrix J; otherwise the iteration ends and the clustering result is output.
Step 5: According to the enterprise's specific circumstances, weigh keyword efficiency optimization against value-rate optimization and select a suitable keyword optimization strategy to achieve the website optimization goal.
The beneficial effects of the present invention are:
1. The algorithm simplifies the keyword analysis workflow and thereby reduces the overall workload of website optimization.
2. The algorithm has low running-time complexity and processes data faster.
3. The algorithm has greater value.
4. It can help a website raise the ranking of its keywords quickly, within a short time.
5. It brings a certain amount of traffic and inquiries to the enterprise website, thereby achieving the desired website optimization goal.
6. The algorithm accurately assigns weight coefficients to each neighborhood's global share and to the path sum within each local neighborhood, so the classification results agree better with empirical values.
7. It reduces the influence of outliers on the clustering result.
8. By incorporating the fuzzy c-means clustering algorithm, it avoids premature convergence of the clustering result.
Brief description of the drawings
Fig. 1 is a structural flow chart of search engine keyword optimization using the VSM-based fuzzy c-means clustering algorithm.
Fig. 2 is a flow chart of the application of the VSM-based fuzzy c-means clustering algorithm in cluster analysis.
Detailed description
To solve the technical problem of achieving search engine optimization through keyword optimization, the present invention is described in detail below with reference to Fig. 1 and Fig. 2. The specific implementation steps are as follows:
Step 1: Determine the core keywords according to the enterprise's business and collect related keywords using a search engine. Each keyword has corresponding data items in the search engine, such as national monthly search volume, competition level, and estimated cost per click (CPC).
Step 2: Screen the collected set of related keywords, in light of the enterprise's products and a market analysis, to reduce its size.
Step 3: For the screened keyword set, search for each keyword in the search engine and record the number of homepage pages and the total number of result pages, so that each keyword is represented by a five-dimensional vector; then reduce this vector to four dimensions. The calculation is as follows:
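Here the number of related keywords is m, which gives the following m × 5 matrix:
$$\begin{pmatrix} N_1 & Ld_1 & CPC_1 & N_{1S} & N_{1Y} \\ \vdots & \vdots & \vdots & \vdots & \vdots \\ N_m & Ld_m & CPC_m & N_{mS} & N_{mY} \end{pmatrix}$$
$N_i$, $Ld_i$, $CPC_i$, $N_{iS}$ and $N_{iY}$ are, in order, the national monthly search volume, the competition level, the estimated cost per click (CPC), the number of homepage pages, and the total number of result pages for the i-th keyword.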
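As a minimal sketch of how the m × 5 matrix might be assembled in practice, the snippet below stores one record per keyword and flattens the records into rows. The class name `KeywordRow`, its field names, and the sample numbers are hypothetical and chosen only for illustration; the source does not prescribe a data layout.

```python
from dataclasses import dataclass, astuple


@dataclass(frozen=True)
class KeywordRow:
    """One keyword's five data items (field names are illustrative)."""
    monthly_searches: float  # N_i: national monthly search volume
    competition: float       # Ld_i: competition level
    cpc: float               # CPC_i: estimated cost per click
    homepage_pages: float    # N_iS: number of homepage pages
    total_pages: float       # N_iY: total number of result pages


def build_matrix(rows):
    """Return the m x 5 matrix as a list of 5-element lists, one per keyword."""
    return [list(astuple(r)) for r in rows]


# Hypothetical sample data for two keywords.
rows = [
    KeywordRow(1000, 0.8, 1.5, 12, 340000),
    KeywordRow(250, 0.3, 0.6, 4, 52000),
]
matrix = build_matrix(rows)
```

Keeping the column order fixed in one place makes it harder to mix up columns when the later reduction to four dimensions is applied.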
This is then reduced to four dimensions, i.e.
$X_i$ ($i \in \{1, 2, \dots, m\}$) is the search efficiency and $Z_i$ ($i \in \{1, 2, \dots, m\}$) is the value rate, given by the following formulas:
Step 4: Cluster the keywords with the VSM-based fuzzy c-means clustering algorithm, in the following sub-steps:
Step 4.1: Initialize c classes using a k-means algorithm based on ε-neighborhoods.
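The source does not spell out how the ε-neighborhood-based k-means initialization works. One plausible reading, sketched below purely as an assumption, is to seed the c classes with the points whose ε-neighborhoods are densest while keeping the seeds more than ε apart. The function names, the Euclidean distance, and the tie-breaking order are all hypothetical.

```python
import math


def euclid(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def eps_density_centers(points, c, eps):
    """Pick up to c initial centers from the densest eps-neighborhoods.

    Points are visited in order of decreasing neighborhood size (ties keep
    input order); a point is skipped if it lies within eps of a chosen center.
    """
    density = [sum(1 for q in points if euclid(p, q) <= eps) for p in points]
    order = sorted(range(len(points)), key=lambda i: -density[i])
    centers = []
    for i in order:
        if all(euclid(points[i], ctr) > eps for ctr in centers):
            centers.append(list(points[i]))
        if len(centers) == c:
            break
    return centers


# Hypothetical 2-D example: two dense groups and one isolated point.
sample = [(0, 0), (0.1, 0), (0, 0.1), (5, 5), (5.1, 5), (9, 9)]
initial_centers = eps_density_centers(sample, c=2, eps=0.5)
```

Seeding from dense neighborhoods means an isolated point such as (9, 9) is chosen only when no denser candidate remains, which is in the spirit of the stated goal of reducing the influence of outliers.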
Step 4.2: Initialize the membership matrix J with random numbers in [0, 1] such that it satisfies the membership constraints. The calculation is as follows:
The data object set D is initially divided into C classes according to the ε-neighborhoods.
The membership matrix J is initialized as an m × C matrix:
$$J = \begin{pmatrix} w_{11} & \cdots & w_{1C} \\ \vdots & \ddots & \vdots \\ w_{m1} & \cdots & w_{mC} \end{pmatrix}$$
$w_{ij}$ is the degree to which keyword i belongs to class j, where $j \in \{1, 2, \dots, C\}$ and $i \in \{1, 2, \dots, m\}$.
The membership constraints are:
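A minimal sketch of this initialization follows, assuming the constraint is the usual fuzzy c-means condition that each keyword's memberships lie in [0, 1] and sum to 1 across the C classes. That condition is an assumption here, as are the function name `init_membership` and the `seed` parameter.

```python
import random


def init_membership(m, c, seed=None):
    """Return an m x c membership matrix with random rows that each sum to 1.

    Assumes the standard fuzzy c-means constraint (each row sums to 1).
    """
    rng = random.Random(seed)
    matrix = []
    for _ in range(m):
        raw = [rng.random() for _ in range(c)]
        total = sum(raw)
        if total == 0.0:
            # Degenerate draw: fall back to uniform membership.
            matrix.append([1.0 / c] * c)
        else:
            matrix.append([v / total for v in raw])
    return matrix


# Hypothetical sizes: 4 keywords, 3 classes.
J = init_membership(4, 3, seed=1)
```

Dividing each row by its sum keeps every entry in [0, 1] while enforcing the sum-to-one condition.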
Step 4.3: Initialize the objective function of each neighborhood, build the total objective function for the c classes, combine it with the membership constraints to form a system of m equations, and solve the system to obtain the clustering result. The calculation is as follows:
In the formula above, $n_{\varepsilon j}$ is the number of data objects in the ε-neighborhood of class j, and the formula also uses the coefficient of variation within each ε-neighborhood. α and β are the influence coefficients of the count $n_\varepsilon$ and of the coefficient of variation respectively, with α + β = 1; suitable values can be found by experimental iteration.
In the formula above, $x_{ih}$ is the space vector of the i-th keyword belonging to class j, $y_{jh}$ is the center vector of cluster j, and h refers to the corresponding vector elements.
Build the total objective function for the c classes:
Combine it with the membership constraints to build a system of m equations:
$\lambda_i$ ($i = 1, \dots, m$) are the Lagrange multipliers of the m constraints. Differentiating the above expression with respect to all input parameters yields the necessary conditions on $c_j$ and $w_{ij}$ for the total objective function to reach its maximum:
The vector in the formula above is the one corresponding to keyword i.
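For orientation, the textbook fuzzy c-means derivation has the same shape as the step above: an objective, one constraint per keyword, and a Lagrangian whose stationarity conditions give the update formulas. The sketch below shows that standard form only. It omits the neighborhood weighting by $n_{\varepsilon j}$, α and β, and the symbols $F$ (objective), $q > 1$ (fuzzifier), $x_i$ (vector of keyword i) and the sum-to-one constraint are assumptions rather than the source's own notation.

```latex
% Textbook fuzzy c-means (assumed form, not the source's weighted objective)
\begin{aligned}
F &= \sum_{i=1}^{m} \sum_{j=1}^{C} w_{ij}^{\,q}\, \lVert x_i - c_j \rVert^2,
\qquad \text{subject to } \sum_{j=1}^{C} w_{ij} = 1 \quad (i = 1, \dots, m), \\
L &= F + \sum_{i=1}^{m} \lambda_i \Bigl( 1 - \sum_{j=1}^{C} w_{ij} \Bigr), \\
\frac{\partial L}{\partial c_j} = 0 \;&\Rightarrow\;
c_j = \frac{\sum_{i=1}^{m} w_{ij}^{\,q}\, x_i}{\sum_{i=1}^{m} w_{ij}^{\,q}}, \\
\frac{\partial L}{\partial w_{ij}} = 0 \;&\Rightarrow\;
w_{ij} = \left[ \sum_{k=1}^{C}
\left( \frac{\lVert x_i - c_j \rVert}{\lVert x_i - c_k \rVert} \right)^{2/(q-1)} \right]^{-1}.
\end{aligned}
```

In this textbook form the stationarity conditions characterize a minimum of $F$, whereas the source describes its own total objective function as being maximized, so the two should not be treated as interchangeable.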
Step 4.4: Recompute each cluster center according to the result of the decision function Δ(g). The calculation is as follows:
The formula compares the new total objective function with the total objective function obtained in the previous iteration; θ is a sufficiently small number. Only when the condition above is satisfied has the optimal classification been found.
Step 4.5: If the cluster centers have changed, return to Step 4.2 and recompute the membership matrix J; otherwise the iteration ends and the clustering result is output.
The detailed flow of the VSM-based fuzzy c-means clustering algorithm is shown in Fig. 2.
Step 5: According to the enterprise's specific circumstances, weigh keyword efficiency optimization against value-rate optimization and select a suitable keyword optimization strategy to achieve the website optimization goal.
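The source leaves the selection rule to the enterprise. As a hedged illustration of a first step toward it, the snippet below assigns each keyword to the class in which its membership is highest and groups the keywords by class, so that each group can be reviewed for efficiency or value. The helper names and the sample memberships are hypothetical.

```python
def hard_labels(memberships):
    """Return, for each row, the index of the class with the largest membership."""
    return [max(range(len(row)), key=row.__getitem__) for row in memberships]


def group_keywords(keywords, memberships):
    """Group keywords by their highest-membership class."""
    groups = {}
    for keyword, label in zip(keywords, hard_labels(memberships)):
        groups.setdefault(label, []).append(keyword)
    return groups


# Hypothetical memberships for three keywords and two classes.
example_groups = group_keywords(
    ["a", "b", "c"],
    [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]],
)
```

Hard assignment discards the degree of membership, so keywords with nearly equal memberships may deserve a manual look before a strategy is chosen.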
The pseudo-code of search engine keyword optimization using the VSM-based fuzzy c-means clustering algorithm is as follows:
Input: the core keywords extracted from the website, with clusters initialized using ε-neighborhoods.
Output: the c clusters for which the total objective function of the c classes is maximal.
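The iteration in Steps 4.2 to 4.5 can be sketched as the loop below. It is a plain textbook fuzzy c-means in pure Python, not the patented variant: it starts from random memberships rather than ε-neighborhoods, minimizes the standard objective, and stops when the objective changes by less than θ between iterations. The fuzzifier `q`, the iteration cap, and all function names are assumptions.

```python
import math
import random


def dist2(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def update_centers(points, w, q):
    """Each center is the mean of the points weighted by w_ij ** q."""
    centers = []
    for j in range(len(w[0])):
        weights = [row[j] ** q for row in w]
        total = sum(weights)
        centers.append([
            sum(wt * p[h] for wt, p in zip(weights, points)) / total
            for h in range(len(points[0]))
        ])
    return centers


def update_membership(points, centers, q):
    """Standard update: w_ij proportional to (1 / d_ij^2) ** (1 / (q - 1))."""
    power = 1.0 / (q - 1.0)
    w = []
    for p in points:
        d = [dist2(p, ctr) for ctr in centers]
        if any(v == 0.0 for v in d):
            # Point sits on a center: share full membership among such centers.
            hits = [1.0 if v == 0.0 else 0.0 for v in d]
            w.append([v / sum(hits) for v in hits])
        else:
            inv = [(1.0 / v) ** power for v in d]
            total = sum(inv)
            w.append([v / total for v in inv])
    return w


def objective(points, centers, w, q):
    return sum(w[i][j] ** q * dist2(p, ctr)
               for i, p in enumerate(points)
               for j, ctr in enumerate(centers))


def fuzzy_c_means(points, c, q=2.0, theta=1e-6, max_iter=100, seed=0):
    """Return (centers, memberships) after iterating until the change < theta."""
    rng = random.Random(seed)
    w = []
    for _ in points:
        raw = [rng.random() + 1e-12 for _ in range(c)]
        w.append([v / sum(raw) for v in raw])
    previous = math.inf
    centers = update_centers(points, w, q)
    for _ in range(max_iter):
        centers = update_centers(points, w, q)
        w = update_membership(points, centers, q)
        current = objective(points, centers, w, q)
        if abs(previous - current) < theta:  # stopping test with threshold theta
            break
        previous = current
    return centers, w
```

For example, `fuzzy_c_means([(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)], c=2)` returns two centers and a 6 × 2 membership matrix whose rows each sum to 1.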
Claims (2)
1. A method for search engine keyword optimization using a VSM-based fuzzy c-means clustering algorithm, relating to the field of Semantic Web technology, characterized in that it comprises the following steps:
Step 1: Determine the core keywords according to the enterprise's business and collect related keywords using a search engine. Each keyword has corresponding data items in the search engine, such as national monthly search volume, competition level, and estimated cost per click (CPC).
Step 2: Screen the collected set of related keywords, in light of the enterprise's products and a market analysis, to reduce its size.
Step 3: For the screened keyword set, search for each keyword in the search engine and record the number of homepage pages and the total number of result pages, so that each keyword is represented by a five-dimensional vector; then reduce this vector to four dimensions. The calculation is as follows:
Here the number of related keywords is m, which gives the following m × 5 matrix:
$$\begin{pmatrix} N_1 & Ld_1 & CPC_1 & N_{1S} & N_{1Y} \\ \vdots & \vdots & \vdots & \vdots & \vdots \\ N_m & Ld_m & CPC_m & N_{mS} & N_{mY} \end{pmatrix}$$
$N_i$, $Ld_i$, $CPC_i$, $N_{iS}$ and $N_{iY}$ are, in order, the national monthly search volume, the competition level, the estimated cost per click (CPC), the number of homepage pages, and the total number of result pages for the i-th keyword.
This is then reduced to four dimensions, i.e.
$X_i$ is the search efficiency and $Z_i$ is the value rate, given by the following formulas:
Step 4: Cluster the keywords with the VSM-based fuzzy c-means clustering algorithm, in the following sub-steps:
Step 4.1: Initialize c classes using a k-means algorithm based on ε-neighborhoods.
Step 4.2: Initialize the membership matrix J with random numbers in [0, 1] such that it satisfies the membership constraints.
Step 4.3: Initialize the objective function of each neighborhood, build the total objective function for the c classes, combine it with the membership constraints to form a system of m equations, and solve the system to obtain the clustering result.
Step 4.4: Recompute each cluster center according to the result of the decision function Δ(g).
Step 4.5: If the cluster centers have changed, return to Step 4.2 and recompute the membership matrix J; otherwise the iteration ends and the clustering result is output.
Step 5: According to the enterprise's specific circumstances, weigh keyword efficiency optimization against value-rate optimization and select a suitable keyword optimization strategy to achieve the website optimization goal.
2. The method for search engine keyword optimization using a VSM-based fuzzy c-means clustering algorithm according to claim 1, characterized in that the calculation in Step 4 is as follows:
Step 4: Cluster the keywords with the VSM-based fuzzy c-means clustering algorithm, in the following sub-steps:
Step 4.1: Initialize c classes using a k-means algorithm based on ε-neighborhoods.
Step 4.2: Initialize the membership matrix J with random numbers in [0, 1] such that it satisfies the membership constraints. The calculation is as follows:
The data object set D is initially divided into C classes according to the ε-neighborhoods.
The membership matrix J is initialized as an m × C matrix:
$w_{ij}$ is the degree to which keyword i belongs to class j, where $j \in \{1, 2, \dots, C\}$ and $i \in \{1, 2, \dots, m\}$.
The membership constraints are:
Step 4.3: Initialize the objective function of each neighborhood, build the total objective function for the c classes, combine it with the membership constraints to form a system of m equations, and solve the system to obtain the clustering result. The calculation is as follows:
In the formula above, $n_{\varepsilon j}$ is the number of data objects in the ε-neighborhood of class j, and the formula also uses the coefficient of variation within each ε-neighborhood. α and β are the influence coefficients of the count $n_\varepsilon$ and of the coefficient of variation respectively, with α + β = 1; suitable values can be found by experimental iteration.
In the formula above, $x_{ih}$ is the space vector of the i-th keyword belonging to class j, $y_{jh}$ is the center vector of cluster j, and h refers to the corresponding vector elements.
Build the total objective function for the c classes:
Combine it with the membership constraints to build a system of m equations:
$\lambda_i$ ($i = 1, \dots, m$) are the Lagrange multipliers of the m constraints. Differentiating the above expression with respect to all input parameters yields the necessary conditions on $c_j$ and $w_{ij}$ for the total objective function to reach its maximum:
The vector in the formula above is the one corresponding to keyword i.
Step 4.4: Recompute each cluster center according to the result of the decision function Δ(g). The calculation is as follows:
The formula compares the new total objective function with the total objective function obtained in the previous iteration; θ is a sufficiently small number. Only when the condition above is satisfied has the optimal classification been found.
Step 4.5: If the cluster centers have changed, return to Step 4.2 and recompute the membership matrix J; otherwise the iteration ends and the clustering result is output.
The detailed flow of the VSM-based fuzzy c-means clustering algorithm is shown in Fig. 2.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710012398.4A CN106802945A (en) | 2017-01-09 | 2017-01-09 | Search engine keyword optimization using a VSM-based fuzzy c-means clustering algorithm |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710012398.4A CN106802945A (en) | 2017-01-09 | 2017-01-09 | Search engine keyword optimization using a VSM-based fuzzy c-means clustering algorithm |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106802945A (en) | 2017-06-06 |
Family
ID=58984631
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710012398.4A Pending CN106802945A (en) | Search engine keyword optimization using a VSM-based fuzzy c-means clustering algorithm | 2017-01-09 | 2017-01-09 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106802945A (en) |
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103258000A (en) * | 2013-03-29 | 2013-08-21 | 北界创想(北京)软件有限公司 | Method and device for clustering high-frequency keywords in webpages |
CN103218435A (en) * | 2013-04-15 | 2013-07-24 | 上海嘉之道企业管理咨询有限公司 | Method and system for clustering Chinese text data |
Non-Patent Citations (2)
Title |
---|
林元国 等: "K-means算法在关键词优化中的应用", 《计算机***应用》 * |
邓健爽 等: "基于搜索引擎的关键词自动聚类法", 《计算机科学》 * |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11397754B2 (en) | 2020-02-14 | 2022-07-26 | International Business Machines Corporation | Context-based keyword grouping |
CN116450634A (en) * | 2023-06-15 | 2023-07-18 | 中新宽维传媒科技有限公司 | Data source weight evaluation method and related device thereof |
CN116450634B (en) * | 2023-06-15 | 2023-09-29 | 中新宽维传媒科技有限公司 | Data source weight evaluation method and related device thereof |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20170606 |