CN113688926B - Website behavior classification method, system, storage medium and equipment - Google Patents

Publication number
CN113688926B
Authority
CN
China
Prior art keywords
data
membership
class
filtering
dimension
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111014054.XA
Other languages
Chinese (zh)
Other versions
CN113688926A (en
Inventor
周劲
秦庆雪
韩士元
王琳
杜韬
纪科
张坤
赵亚欧
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Jinan
Original Assignee
University of Jinan
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Jinan
Priority to CN202111014054.XA
Publication of CN113688926A
Application granted
Publication of CN113688926B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F18/24147 Distances to closest patterns, e.g. nearest neighbour classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/958 Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention belongs to the field of website behavior classification and provides a website behavior classification method, system, storage medium and equipment. The method comprises: obtaining a website behavior data set, wherein each attribute of a data item in the set is one dimension; screening the neighbors of each data item to determine the filtering window of that data item; randomly selecting a preset number of data items from the website behavior data set as class center data, and calculating the membership degree of every data item in the set to each class center; based on the filtering window of each data item, filtering the membership degrees using each dimension of the data in turn as the guide, and taking the weighted sum of the membership degrees filtered along the individual dimensions as the final filtered membership degree; updating the class center data of each class using the final filtered membership degrees, and then updating the attribute weight of each dimension of each class; and iterating until the termination condition checked after the class center update is satisfied, then outputting the website behavior classification result.

Description

Website behavior classification method, system, storage medium and equipment
Technical Field
The invention belongs to the field of website behavior classification, and particularly relates to a website behavior classification method, a system, a storage medium and equipment.
Background
The statements in this section merely provide background information related to the present disclosure and may not necessarily constitute prior art.
Guided filtering is an image filtering method that removes noise effectively while preserving the edge information of a guide image, and it is widely used in image segmentation, enhancement, defogging, and the like. The technique generally takes the image to be processed as the guide image and filters an input image using the information of the guide image, producing a filtered image that carries the gradient information of the guide image with the noise removed. Traditional clustering algorithms cannot make good use of the spatial information of an image, so their segmentation results are not accurate enough. To address this, many scholars have in recent years applied guided filtering to the clustering process and proposed a number of fuzzy clustering algorithms based on guided filtering. These methods take the image to be segmented as the guide image and filter the membership degrees obtained by fuzzy C-means, so that the membership degrees contain more gradient information and the accuracy of image segmentation improves.
Research on adding guided filtering to fuzzy clustering for image segmentation has therefore attracted increasing attention. However, current fuzzy clustering algorithms based on guided filtering are limited to the image segmentation problem: guided filtering is mainly used to process images and cannot be applied directly to website behavior analysis data. Website behavior analysis data also carries spatial information, and mining the latent information of the data is important for more accurate classification. Existing fuzzy clustering methods that use spatial information are either computationally difficult or prone to losing information during clustering.
Disclosure of Invention
In order to solve the technical problems in the background art, the invention provides a website behavior classification method, a system, a storage medium and equipment, which can accurately classify website behaviors.
In order to achieve the above purpose, the present invention adopts the following technical scheme:
a first aspect of the present invention provides a website behavior classification method, including:
acquiring a website behavior data set; wherein, one attribute of each data in the set is a dimension;
screening neighbors of each data to determine a filtering window of the corresponding data;
randomly selecting a preset number of pieces of data from the website behavior data set to be respectively used as class center data, and calculating membership degrees of all data in the website behavior data set to all class center data;
based on the filtering window of each data, each dimension of each data is used as a guide to filter membership degrees, and weighted summation of membership degrees after multidimensional filtering is used as the membership degree after final filtering;
updating the class center data of each class by utilizing the final filtered membership degree, and further updating the attribute weight of each dimension of each class;
iterating the above calculation, checking the termination condition after the step of updating the class center data, and finally outputting the website behavior classification result.
Further, each data in the set contains at least two attributes.
Further, the $k$-nearest-neighbor method is used to find the $k$ nearest data items for each data item in the website behavior data set; these $k$ data items are the neighbors of the corresponding data item; $k$ is a positive integer greater than or equal to 1.
Further, the process of finding the $k$ nearest data items for each data item in the website behavior data set is as follows:
calculating a distance matrix of the data using the Euclidean distance;
for each data item, finding its $k$ nearest neighbors, itself included.
Further, the process of determining the filter window of the corresponding data is:
screening the neighbors of each data point by subtraction or addition, so that each data point and its neighbors are mutual neighbors;
each data point takes the neighbors it retains, which are now symmetric, as its filtering window.
Further, the formula for filtering the membership degree with each dimension of each data item as the guide is $u'_{ci,m} = a_{jm} x_{im} + b_{jm}, \; \forall i \in w_j$, where $u'_{ci,m}$ represents the membership degree of the $i$-th data item in class $c$ after filtering guided by the $m$-th dimension, $u_{ci}$ represents the membership degree of the $i$-th data item in class $c$, $x_{im}$ represents the value of the $m$-th dimension of the $i$-th guide data item, $w_j$ denotes the window centered on the $j$-th guide data item, and $a_{jm}$ and $b_{jm}$ are the linear coefficients of the $m$-th dimension in window $w_j$.
Further, the termination condition of the step of updating the class center data is: the difference between the objective function values of two consecutive iterations is smaller than a set value, or the number of iterations exceeds a set threshold.
A second aspect of the present invention provides a website behavior classification system comprising:
the website behavior data acquisition module is used for acquiring a website behavior data set; wherein, one attribute of each data in the set is a dimension;
a filtering window determining module, which is used for screening the neighbor of each data to determine the filtering window of the corresponding data;
the class center data initializing module is used for randomly selecting a preset number of pieces of data from the website behavior data set to serve as class center data respectively, and calculating membership degrees of all data in the website behavior data set to all class center data;
the membership calculation module is used for filtering the membership degrees, based on the filtering window of each data item, with each dimension of each data item as the guide, and for taking the weighted sum of the membership degrees after multi-dimensional filtering as the final filtered membership degree;
the attribute weight updating module is used for updating the class center data of each class by utilizing the final filtered membership degree so as to update the attribute weight of each dimension;
and the classification result output module is used for iteratively calculating and judging the termination condition of the step of updating the class center data of each class, and finally outputting the website behavior classification result.
A third aspect of the present invention provides a computer-readable storage medium.
A computer readable storage medium having stored thereon a computer program which when executed by a processor performs the steps in a website behavior classification method as described above.
A fourth aspect of the invention provides a computer device.
A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps in the website behavior classification method as described above when the program is executed.
Compared with the prior art, the invention has the beneficial effects that:
the invention screens the neighbors of each data item in the website behavior data set to determine the filtering window of that data item, randomly selects a preset number of data items from the set as class center data, and calculates the membership degree of each data item to each class center. Based on the filtering window of each data item, it then filters the membership degrees with each dimension of the data as the guide and takes the weighted sum of the membership degrees after multidimensional filtering as the final filtered membership degree. Guided filtering thus allows the interests and preferences of users to be mined more accurately in website behavior analysis, which improves the accuracy of website behavior classification.
Additional aspects of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the invention.
FIG. 1 is a diagram of data selection in a filtering window according to an embodiment of the present invention;
FIG. 2 shows the process of filtering membership degrees by guided filtering according to an embodiment of the present invention;
FIG. 3 shows the detailed process of guided filtering applied to the membership degrees of the first class according to an embodiment of the present invention;
FIG. 4 is a flowchart of a website behavior classification method according to an embodiment of the invention.
Detailed Description
The invention will be further described with reference to the drawings and examples.
It should be noted that the following detailed description is illustrative and is intended to provide further explanation of the invention. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of exemplary embodiments according to the present invention. As used herein, the singular is also intended to include the plural unless the context clearly indicates otherwise, and furthermore, it is to be understood that the terms "comprises" and/or "comprising" when used in this specification are taken to specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof.
Example 1
As shown in fig. 4, the embodiment provides a website behavior classification method, which specifically includes the following steps:
S101: acquiring a website behavior data set, wherein each attribute of a data item in the set is one dimension.
Each data item in the set contains at least two attributes.
The website behavior data to be clustered, $X = \{x_1, x_2, \dots, x_N\}$ with $x_i \in \mathbb{R}^M$, is read in. Here $N$ is the number of samples of website behavior data to be clustered and $M$ is the number of attributes contained in each piece of website behavior data, referred to as the dimension hereinafter. The attributes include, but are not limited to, user ID, device type, gender, age, time of event, location, duration, specific time, specific operation, etc. It should be noted that, in this embodiment, all website behavior data sets are obtained by legal means.
S102: the neighbors of each data are filtered to determine the filter window for the corresponding data, as shown in fig. 1.
In a specific implementation, for example, the number of neighbors $k$ is preset, and the $k$-nearest-neighbor method is used to find the $k$ nearest data items for each data item in the website behavior data set. These $k$ data items are the neighbors of that data item, and the filtering window of each data item is determined by screening its neighbors. $k$ is a positive integer greater than or equal to 1.
The process of finding the $k$ nearest data items for each data item in the website behavior data set is as follows:
calculating the distance matrix of the data using the Euclidean distance $d(x_i, x_j) = \sqrt{\sum_{m=1}^{M} (x_{im} - x_{jm})^2}$;
for each data item, finding its $k$ nearest neighbors, itself included.
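The two steps above can be sketched in plain Python. This is a minimal illustration rather than the patented implementation: the function names, the toy records and the tie-breaking rule (a record always ranks first among its own neighbors) are assumptions introduced here.

```python
import math

def euclidean_distance_matrix(data):
    """Return the full pairwise Euclidean distance matrix for a list of rows."""
    n = len(data)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.sqrt(sum((a - b) ** 2 for a, b in zip(data[i], data[j])))
            dist[i][j] = dist[j][i] = d
    return dist

def k_nearest_neighbors(data, k):
    """For each row, return the indices of its k nearest rows, itself included."""
    dist = euclidean_distance_matrix(data)
    n = len(data)
    neighbors = []
    for i in range(n):
        # Sort by distance; on ties the row itself ranks first, then lower indices.
        order = sorted(range(n), key=lambda j: (dist[i][j], j != i, j))
        neighbors.append(order[:k])
    return neighbors

# Hypothetical toy data: 4 records with 2 numeric attributes each.
toy = [[0.0, 0.0], [1.0, 0.0], [0.0, 3.0], [10.0, 10.0]]
knn = k_nearest_neighbors(toy, k=2)
print(knn)  # [[0, 1], [1, 0], [2, 0], [3, 2]]
```

Note that record 0 is a neighbor of record 2 while record 2 is not a neighbor of record 0, which is exactly the asymmetry the window screening step has to resolve.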
Specifically, the method for determining the filter window includes the following steps:
the fact that data item $x_j$ is a neighbor of data item $x_i$ does not imply that $x_i$ is a neighbor of $x_j$. The neighbors of each data point are therefore screened by subtraction (or addition) so that each data point and its neighbors are mutual neighbors. If $x_j$ is a neighbor of $x_i$ but $x_i$ is not a neighbor of $x_j$, subtraction screening deletes $x_j$ from the neighbors of $x_i$, whereas addition screening adds $x_i$ to the neighbors of $x_j$;
each data point takes the neighbors it retains, which are now symmetric, as its filtering window.
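The subtraction and addition screening described above can be sketched as follows. The function name `symmetric_windows`, the `mode` argument and the toy neighbor lists are hypothetical; only the two screening rules come from the text.

```python
def symmetric_windows(neighbors, mode="subtract"):
    """Make a neighbor relation mutual.

    neighbors[i] lists the neighbor indices of item i (itself included).
    mode="subtract": drop j from i's neighbors when i is not a neighbor of j.
    mode="add":      add i to j's neighbors whenever j is a neighbor of i.
    """
    sets = [set(nb) for nb in neighbors]
    if mode == "subtract":
        windows = [{j for j in sets[i] if i in sets[j]} for i in range(len(sets))]
    elif mode == "add":
        windows = [set(s) for s in sets]
        for i, s in enumerate(sets):
            for j in s:
                windows[j].add(i)
    else:
        raise ValueError("mode must be 'subtract' or 'add'")
    return [sorted(w) for w in windows]

# Hypothetical neighbor lists for 4 items with k = 2 (each item includes itself).
knn = [[0, 1], [1, 0], [2, 0], [3, 2]]
win_sub = symmetric_windows(knn, "subtract")
win_add = symmetric_windows(knn, "add")
print(win_sub)  # [[0, 1], [0, 1], [2], [3]]
print(win_add)  # [[0, 1, 2], [0, 1], [0, 2, 3], [2, 3]]
```

Subtraction yields smaller windows (possibly containing only the item itself), while addition yields windows that can exceed $k$ items; which trade-off suits a given data set is left open here.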
S103: randomly selecting a preset number of pieces of data from the website behavior data set to be used as class center data respectively, and calculating membership degrees of all data in the website behavior data set to all class center data.
The number of clusters $C$ is preset and $C$ cluster centers are initialized randomly, that is, $C$ pieces of data are selected from the website behavior data to be clustered and used as class center data, where each piece of data has $M$ attributes. The iteration counter $t$ is set to 0, the maximum number of iterations $T$ is set to 150, the weight of each dimension is set to $1/M$, and the stop threshold $\eta$ of the fuzzy clustering algorithm is set to $10^{-6}$.
S104: based on the filtering window of each data, each dimension of each data is used as a guide to filter the membership degree, and weighted summation of the membership degrees after multi-dimensional filtering is used as the membership degree after final filtering, as shown in fig. 2.
Specifically, the membership degree $u_{ci}$ of the $i$-th data item to the $c$-th cluster center is calculated. The membership degrees are then filtered using each dimension of each data item in the website behavior data set as the guide, the membership degrees after multidimensional filtering are weighted and summed to obtain the final filtered membership degree, and the filtered membership degree is used in subsequent calculation.
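As an illustration of the membership calculation in this step, the sketch below assumes the attribute-weighted fuzzy C-means form, in which the distance of a data item to a center is a weighted sum of squared per-dimension differences. That form, the function name and the toy values are assumptions made here for illustration.

```python
def memberships(data, centers, weights, alpha=2.0):
    """Return u[c][i], the membership of item i in class c.

    Assumed form (attribute-weighted fuzzy C-means):
        D[c][i] = sum_m weights[c][m] * (data[i][m] - centers[c][m]) ** 2
        u[c][i] = 1 / sum_s (D[c][i] / D[s][i]) ** (1 / (alpha - 1))
    """
    n, n_classes = len(data), len(centers)
    u = [[0.0] * n for _ in range(n_classes)]
    for i, x in enumerate(data):
        d = [sum(w * (a - b) ** 2 for w, a, b in zip(weights[c], x, centers[c]))
             for c in range(n_classes)]
        zero = [c for c in range(n_classes) if d[c] == 0.0]
        if zero:
            # The item coincides with one or more centers: split membership among them.
            for c in zero:
                u[c][i] = 1.0 / len(zero)
            continue
        inv = [dc ** (-1.0 / (alpha - 1.0)) for dc in d]
        total = sum(inv)
        for c in range(n_classes):
            u[c][i] = inv[c] / total
    return u

# Hypothetical toy values: 3 items, 2 classes, 2 dimensions, uniform weights 1/M.
data = [[0.0, 0.0], [1.0, 0.0], [4.0, 0.0]]
centers = [[0.0, 0.0], [4.0, 0.0]]
weights = [[0.5, 0.5], [0.5, 0.5]]
u = memberships(data, centers, weights)
```

For the middle item the weighted squared distances are 0.5 and 4.5, so with fuzzy coefficient 2 its memberships are 0.9 and 0.1; the two items that coincide with a center get full membership in that class.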
Wherein, as shown in fig. 3, the filtering the membership degree by the guided filtering includes the following steps:
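(1) The obtained $C \times N$ membership matrix is divided into $C$ membership matrices of size $1 \times N$;
(2) Each dimension of the original data is used in turn as the guide data, and the membership degrees of each class are filtered according to the formula $u'_{ci,m} = a_{jm} x_{im} + b_{jm}, \; \forall i \in w_j$, where $u'_{ci,m}$ represents the membership degree of the $i$-th data item in class $c$ after filtering guided by the $m$-th dimension, $u_{ci}$ represents the membership degree of the $i$-th data item in class $c$, $x_{im}$ represents the value of the $m$-th dimension of the $i$-th guide data item, $w_j$ denotes the window centered on the $j$-th guide data item, $a_{jm}$ and $b_{jm}$ are the linear coefficients of the $m$-th dimension in window $w_j$, and $\varepsilon$ is a guided filtering parameter that prevents $a_{jm}$ from becoming too large, typically set to $10^{-4}$. The coefficients are obtained with the formulas $a_{jm} = \dfrac{\frac{1}{|w_j|}\sum_{i \in w_j} x_{im} u_{ci} - \mu_{jm}\bar{u}_{cj}}{\sigma_{jm}^2 + \varepsilon}$ and $b_{jm} = \bar{u}_{cj} - a_{jm}\mu_{jm}$, where $\mu_{jm}$ and $\sigma_{jm}^2$ are the mean and variance of the $m$-th dimension in window $w_j$, $|w_j|$ is the number of data items in window $w_j$, and $\bar{u}_{cj}$ is the mean of the input membership degree $u_{ci}$ in window $w_j$.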
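The membership calculation formula is $u_{ci} = \left[\sum_{s=1}^{C}\left(\dfrac{\sum_{m=1}^{M} w_{cm}(x_{im} - v_{cm})^2}{\sum_{m=1}^{M} w_{sm}(x_{im} - v_{sm})^2}\right)^{\frac{1}{\alpha - 1}}\right]^{-1}$, where $w_{cm}$ is the attribute weight of the $m$-th dimension of class $c$, $\alpha$ is the fuzzy coefficient, typically set to 2, $x_{im}$ is the value of the $m$-th dimension of the $i$-th data item, and $v_{cm}$ is the value of the $m$-th dimension of the $c$-th cluster center.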
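A minimal sketch of filtering one class's membership degrees with one dimension as the guide is given below. It follows the usual guided-filter recipe: fit linear coefficients in every window, then average the coefficients of all windows that cover an item. The averaging step, the function name and the toy values are assumptions; the windows are assumed to be symmetric and to contain their own center, as produced by the screening step.

```python
def guided_filter_membership(u_c, guide, windows, eps=1e-4):
    """Filter the memberships of one class, guided by one dimension.

    u_c[i]     membership of item i in the class (filter input)
    guide[i]   value of item i in the guiding dimension
    windows[j] indices of the items in the window centered on item j
               (assumed symmetric: i in windows[j] iff j in windows[i])
    """
    n = len(u_c)
    a, b = [0.0] * n, [0.0] * n
    for j, w in enumerate(windows):
        size = len(w)
        mu = sum(guide[i] for i in w) / size               # mean of guide in window
        var = sum((guide[i] - mu) ** 2 for i in w) / size  # variance of guide in window
        u_bar = sum(u_c[i] for i in w) / size              # mean of input in window
        cov = sum(guide[i] * u_c[i] for i in w) / size - mu * u_bar
        a[j] = cov / (var + eps)
        b[j] = u_bar - a[j] * mu
    filtered = []
    for i in range(n):
        # By symmetry, the windows covering item i are those centered on windows[i].
        covering = windows[i]
        a_mean = sum(a[j] for j in covering) / len(covering)
        b_mean = sum(b[j] for j in covering) / len(covering)
        filtered.append(a_mean * guide[i] + b_mean)
    return filtered

# Hypothetical toy values: 4 items, windows from subtraction screening.
windows = [[0, 1], [0, 1], [2], [3]]
guide = [1.0, 2.0, 5.0, 7.0]
flat = guided_filter_membership([0.3, 0.3, 0.3, 0.3], guide, windows)
smooth = guided_filter_membership([0.2, 0.4, 0.9, 0.1], [1.0] * 4, windows)
```

Two sanity checks follow from the formulas: a constant membership vector passes through unchanged, and a constant guide reduces the filter to window averaging (here items 0 and 1 both become 0.3).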
S105: and updating the class center data of each class by utilizing the final filtered membership, and further updating the attribute weight of each dimension of each class.
Using the filtered membership degrees obtained above, the cluster center $v_c$ of the $c$-th class is updated, and the resulting cluster center is used in subsequent calculation. Using the membership degrees and cluster centers obtained above, the attribute weight $w_{cm}$ of the $m$-th dimension of the $c$-th class is then updated.
The calculation formula of the clustering center is as follows
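For illustration only: assuming the standard fuzzy C-means center update applied to the final filtered membership degrees, the center of class $c$ in dimension $m$ could be written as below. The symbols $u'_{ci}$ (final filtered membership), $x_{im}$, $v_{cm}$, $N$ and the fuzzy coefficient $\alpha$ are notational assumptions, and the form itself is the textbook one rather than a formula confirmed by this text.

```latex
v_{cm} = \frac{\sum_{i=1}^{N} \left(u'_{ci}\right)^{\alpha} x_{im}}
              {\sum_{i=1}^{N} \left(u'_{ci}\right)^{\alpha}},
\qquad c = 1, \dots, C, \quad m = 1, \dots, M
```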
The results of the multidimensional filtering are combined by the weighted summation $u'_{ci} = \sum_{m=1}^{M} w_{cm} u'_{ci,m}$ to obtain the final filtered membership degree, where $w_{cm}$ is the weight of the $m$-th dimension of the $c$-th class. The dimensions can be weighted in two ways: one is mean weighting, in which each dimension has weight $1/M$; the other uses the weights determined by the EFWFCM weight update formula, which involves a regularization scalar.
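The weighted summation with mean weighting, followed by a center update, can be sketched as below. The center update uses the standard fuzzy C-means form, which is an assumption here, and the EFWFCM weight update is deliberately not sketched because its formula is not given in the text. All function names and toy values are hypothetical.

```python
def mean_weights(n_classes, n_dims):
    """Mean weighting: every dimension of every class gets weight 1/M."""
    return [[1.0 / n_dims] * n_dims for _ in range(n_classes)]

def combine_filtered(filtered, weights):
    """Weighted sum over dimensions.

    filtered[c][m][i]: membership of item i in class c after filtering
                       guided by dimension m
    weights[c][m]:     weight of dimension m for class c
    Returns u[c][i], the final filtered membership.
    """
    return [[sum(weights[c][m] * filtered[c][m][i] for m in range(len(weights[c])))
             for i in range(len(filtered[c][0]))]
            for c in range(len(filtered))]

def update_centers(data, u, alpha=2.0):
    """Assumed standard FCM update: v[c][m] = sum_i u^alpha * x / sum_i u^alpha."""
    n_dims = len(data[0])
    centers = []
    for u_c in u:
        powered = [ui ** alpha for ui in u_c]
        total = sum(powered)
        centers.append([sum(p * x[m] for p, x in zip(powered, data)) / total
                        for m in range(n_dims)])
    return centers

# Hypothetical toy values: 3 items, 2 classes, 2 dimensions.
data = [[0.0, 0.0], [2.0, 0.0], [10.0, 4.0]]
filtered = [
    [[1.0, 0.8, 0.0], [1.0, 0.6, 0.2]],  # class 0, guided by dimension 0 and 1
    [[0.0, 0.2, 1.0], [0.0, 0.4, 0.8]],  # class 1, guided by dimension 0 and 1
]
u_final = combine_filtered(filtered, mean_weights(2, 2))
centers = update_centers(data, u_final)
```

With mean weighting the final memberships of class 0 are approximately 1.0, 0.7 and 0.1, which gives a first-dimension center of about 0.72. Guided filtering does not by itself keep memberships inside [0, 1] or summing to one per item, so a practical implementation may need to clip or renormalize before this step.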
S106: iterating the calculation, checking the termination condition after the step of updating the class center data, and finally outputting the website behavior classification result.
The termination condition of the step of updating the class center data is: the difference between the objective function values of two consecutive iterations is smaller than a set value, or the number of iterations exceeds a set threshold.
The objective function value $J^{(t)}$ obtained in the $t$-th iteration is calculated.
The difference between the objective function value $J^{(t)}$ of the $t$-th iteration and the objective function value $J^{(t-1)}$ of the $(t-1)$-th iteration is calculated. If $|J^{(t)} - J^{(t-1)}| < \eta$ or $t > T$ is satisfied, the iteration ends and the clustering result is output; otherwise steps S103 to S106 are repeated until the termination condition is met, and the clustering result is then output.
Here $J^{(t)}$, the objective function value of the $t$-th iteration, is computed from the objective function formula.
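The stopping rule can be sketched independently of the objective function itself. In the sketch below, `step` is a hypothetical callback that performs one full iteration and returns the objective value; the defaults mirror the values given in this embodiment ($10^{-6}$ and 150).

```python
def should_stop(j_curr, j_prev, t, eta=1e-6, t_max=150):
    """Stop when the objective changes by less than eta or t exceeds t_max."""
    return abs(j_curr - j_prev) < eta or t > t_max

def run(step, eta=1e-6, t_max=150):
    """Call step(t) repeatedly until the termination condition holds.

    step(t) is assumed to perform iteration t (membership, filtering, center
    and weight updates) and return the objective function value.
    Returns (number of iterations performed, last objective value).
    """
    t, j_prev = 0, float("inf")
    while True:
        t += 1
        j_curr = step(t)
        if should_stop(j_curr, j_prev, t, eta, t_max):
            return t, j_curr
        j_prev = j_curr

# Hypothetical objectives: one that converges geometrically, one that never does.
converged_at, _ = run(lambda t: 1.0 + 0.5 ** t)
capped_at, _ = run(lambda t: float(t))
print(converged_at, capped_at)  # 20 151
```

The first run stops at iteration 20, the first point where consecutive objective values differ by less than $10^{-6}$; the second never converges and is cut off once the iteration count exceeds 150.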
Example two
This embodiment provides a website behavior classification system, which specifically comprises the following modules:
the website behavior data acquisition module is used for acquiring a website behavior data set; wherein, one attribute of each data in the set is a dimension;
a filtering window determining module, which is used for screening the neighbor of each data to determine the filtering window of the corresponding data;
the class center data initializing module is used for randomly selecting a preset number of pieces of data from the website behavior data set to serve as class center data respectively, and calculating membership degrees of all data in the website behavior data set to all class center data;
the membership calculation module is used for filtering the membership degrees, based on the filtering window of each data item, with each dimension of each data item as the guide, and for taking the weighted sum of the membership degrees after multi-dimensional filtering as the final filtered membership degree;
the attribute weight updating module is used for updating the class center data of each class by utilizing the final filtered membership degree so as to update the attribute weight of each dimension;
and the classification result output module is used for iteratively calculating and judging the termination condition of the step of updating the class center data of each class, and finally outputting the website behavior classification result.
Here, each module of the present embodiment corresponds to each step in the first embodiment, and the implementation process is the same, which is not described here.
Example III
The present embodiment provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps in the website behavior classification method as described in the above embodiment one.
It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of a hardware embodiment, a software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, magnetic disk storage, optical storage, and the like) having computer-usable program code embodied therein.
The storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), a random access memory (RAM), or the like.
Example IV
The present embodiment provides a computer device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the steps in the website behavior classification method described in embodiment one above when executing the program.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above description is only of the preferred embodiments of the present invention and is not intended to limit the present invention, but various modifications and variations can be made to the present invention by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (6)

1. A method for classifying website behaviors, comprising:
acquiring a website behavior data set; wherein each attribute of a data item in the set is one dimension; each data item in the set comprises at least two attributes; reading in the website behavior data to be clustered $X = \{x_1, x_2, \dots, x_N\}$, where $x_i \in \mathbb{R}^M$, $N$ is the number of samples of website behavior data to be clustered, and $M$ is the number of attributes contained in each piece of website behavior data, called the dimension; the attributes comprise user ID, device type, gender, age, and the time, place, duration, specific time and specific operation of the event;
screening the neighbors of each data item to determine the filtering window of the corresponding data item; the process of determining the filtering window of the corresponding data item is as follows: screening the neighbors of each data point by subtraction or addition, so that each data point and its neighbors are mutual neighbors; each data point takes the neighbors it retains, which are now symmetric, as its filtering window; the $k$-nearest-neighbor method is used to find the $k$ nearest data items for each data item in the website behavior data set, and these $k$ data items are the neighbors of the corresponding data item; $k$ is a positive integer greater than or equal to 1;
randomly selecting a preset number of data items from the website behavior data set as class center data, and calculating the membership degree of each data item in the website behavior data set to each class center; the number of clusters $C$ is preset and $C$ cluster centers are initialized randomly, that is, $C$ pieces of data are selected from the website behavior data to be clustered and used as class center data, where each piece of data has $M$ attributes; the iteration counter $t$ is set to 0, the maximum number of iterations is $T$, the weight of each dimension is set to $1/M$, and the stop threshold of the fuzzy clustering algorithm is set to $\eta$;
Based on the filtering window of each data, each dimension of each data is used as a guide to filter membership degrees, and weighted summation of membership degrees after multidimensional filtering is used as the membership degree after final filtering; wherein, the filtering the membership degree by the guiding filtering comprises the following steps:
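(1) The obtained $C \times N$ membership matrix is divided into $C$ membership matrices of size $1 \times N$;
(2) Each dimension of the original data is used in turn as the guide data, and the membership degrees of each class are filtered according to the formula $u'_{ci,m} = a_{jm} x_{im} + b_{jm}, \; \forall i \in w_j$, where $u'_{ci,m}$ represents the membership degree of the $i$-th data item in class $c$ after filtering guided by the $m$-th dimension, $u_{ci}$ represents the membership degree of the $i$-th data item in class $c$, $x_{im}$ represents the value of the $m$-th dimension of the $i$-th guide data item, and $w_j$ denotes the window centered on the $j$-th guide data item; the formulas $a_{jm} = \dfrac{\frac{1}{|w_j|}\sum_{i \in w_j} x_{im} u_{ci} - \mu_{jm}\bar{u}_{cj}}{\sigma_{jm}^2 + \varepsilon}$ and $b_{jm} = \bar{u}_{cj} - a_{jm}\mu_{jm}$ are used to obtain $a_{jm}$ and $b_{jm}$, where $a_{jm}$ and $b_{jm}$ are the linear coefficients of the $m$-th dimension in window $w_j$, $\varepsilon$ is a guided filtering parameter that prevents $a_{jm}$ from becoming too large, $\mu_{jm}$ and $\sigma_{jm}^2$ are the mean and variance of the $m$-th dimension in window $w_j$, $|w_j|$ is the number of data items in window $w_j$, and $\bar{u}_{cj}$ is the mean of the input membership degree $u_{ci}$ in window $w_j$;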
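wherein the membership calculation formula is $u_{ci} = \left[\sum_{s=1}^{C}\left(\dfrac{\sum_{m=1}^{M} w_{cm}(x_{im} - v_{cm})^2}{\sum_{m=1}^{M} w_{sm}(x_{im} - v_{sm})^2}\right)^{\frac{1}{\alpha - 1}}\right]^{-1}$, where $w_{cm}$ is the attribute weight of the $m$-th dimension of class $c$, $\alpha$ is the fuzzy coefficient, and $v_{cm}$ is the value of the $m$-th dimension of the $c$-th cluster center;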
updating the class center data of each class by utilizing the final filtered membership degree, and further updating the attribute weight of each dimension of each class;
using the filtered membership degrees obtained above, updating the cluster center $v_c$ of the $c$-th class, the resulting cluster center being used in subsequent calculation; using the membership degrees and cluster centers obtained above, updating the attribute weight $w_{cm}$ of the $m$-th dimension of the $c$-th class;
The calculation formula of the clustering center is as follows
The results of the multidimensional filtering are combined by the weighted summation $u'_{ci} = \sum_{m=1}^{M} w_{cm} u'_{ci,m}$ to obtain the final filtered membership degree; the weights are computed either by mean weighting or by the EFWFCM weight update formula; under mean weighting each dimension has weight $1/M$; the EFWFCM weight update formula involves a regularization scalar;
iterating the above calculation, checking the termination condition after the step of updating the class center data, and finally outputting the website behavior classification result.
2. The website behavior classification method of claim 1, wherein the process of finding the $k$ nearest data items for each data item in the website behavior data set is as follows:
calculating a distance matrix of the data using the Euclidean distance;
for each data item, finding its $k$ nearest neighbors, itself included.
3. The website behavior classification method of claim 1, wherein the termination condition of the step of updating the class center data is: the difference between the objective function values of two consecutive iterations is smaller than a set value, or the number of iterations exceeds a set threshold.
4. A web site behavior classification system, comprising:
the website behavior data acquisition module is used for acquiring a website behavior data set; wherein one attribute of each data in the set is one dimension; each data in the set comprises at least two attributes; the website behavior data to be clustered $X = \{x_1, x_2, \ldots, x_N\}$ are read in, wherein $x_i \in \mathbb{R}^{M}$, $N$ is the number of samples of the website behavior data to be clustered, and $M$ is the number of attributes contained in each piece of website behavior data, called the dimension; the attributes comprise user ID, equipment type, gender, age, time, place, duration, and the specific time and specific operation of the event;
a filtering window determining module, which is used for screening the neighbors of each data to determine the filtering window of the corresponding data; the process of determining the filtering window of the corresponding data is as follows: the neighbors of each data point are screened by removal or addition so that each data point and its neighbors are mutually neighbors; each data point takes the neighbors it retains under this symmetry as its filtering window; a $K$-nearest-neighbor method is used to find the $K$ nearest pieces of data for each data in the website behavior data set, these $K$ pieces of data being the neighbors of the corresponding data; $K$ is a positive integer greater than or equal to 1;
the class center data initializing module is used for randomly selecting a preset number of pieces of data from the website behavior data set as class center data, and calculating the membership degree of each piece of data in the website behavior data set to each class center data; a cluster number $C$ is preset and $C$ clustering centers are randomly initialized, that is, $C$ pieces of data are selected from the website behavior data to be clustered and used respectively as class center data, wherein each piece of data has $M$ attributes; the iteration counter is set to 0, a maximum number of iterations is set, the weight of each dimension is initialized, and a stop threshold of the fuzzy clustering algorithm is set;
The membership calculation module is used for filtering membership based on a filtering window of each data, and then using each dimension of each data as a guide to filter the membership respectively, and taking weighted summation of the membership after multi-dimensional filtering as the membership after final filtering; wherein, the filtering the membership degree by the guiding filtering comprises the following steps:
(1) The obtained $C \times N$ membership matrix is divided into $C$ membership matrices of size $1 \times N$;
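(2) Each dimension of the original data is used in turn as guiding data, and the membership degree of each class is filtered according to the formula $\hat{u}_{ci}^{(m)} = a_k^{(m)} x_{im} + b_k^{(m)},\ \forall i \in \omega_k$, wherein $\hat{u}_{ci}^{(m)}$ represents the membership degree of the $i$-th data to the $c$-th class after filtering guided by the $m$-th dimension, $u_{ci}$ represents the membership degree of the $i$-th data to the $c$-th class, $x_{im}$ represents the value of the $m$-th dimension of the $i$-th guiding data, and $\omega_k$ indicates the window centered on the $k$-th guiding data; the formulas $a_k^{(m)} = \dfrac{\frac{1}{|\omega_k|}\sum_{i \in \omega_k} x_{im} u_{ci} - \mu_k^{(m)} \bar{u}_{ck}}{\left(\sigma_k^{(m)}\right)^2 + \varepsilon}$ and $b_k^{(m)} = \bar{u}_{ck} - a_k^{(m)} \mu_k^{(m)}$ are used to obtain $a_k^{(m)}$ and $b_k^{(m)}$, wherein $a_k^{(m)}$ and $b_k^{(m)}$ represent the linear coefficients of the $m$-th dimension in window $\omega_k$, $\varepsilon$ is a guided filter parameter that prevents $a_k^{(m)}$ from becoming too large, $\mu_k^{(m)}$ and $\left(\sigma_k^{(m)}\right)^2$ indicate the mean and variance of the $m$-th dimension in window $\omega_k$, $|\omega_k|$ is the number of data in window $\omega_k$, and $\bar{u}_{ck}$ is the mean value of the input membership $u_{ci}$ in window $\omega_k$;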
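wherein the membership calculation formula is $u_{ci} = \left[\sum_{s=1}^{C}\left(\dfrac{\sum_{m=1}^{M} w_{cm}\,(x_{im}-v_{cm})^2}{\sum_{m=1}^{M} w_{sm}\,(x_{im}-v_{sm})^2}\right)^{\frac{1}{\alpha-1}}\right]^{-1}$, in which $x_{im}$ is the value of the $m$-th dimension of the $i$-th data, $w_{cm}$ is the attribute weight of the $m$-th dimension of the $c$-th class, $\alpha$ is the fuzzy coefficient, and $v_{cm}$ is the value of the $m$-th dimension of the $c$-th clustering center;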
the attribute weight updating module is used for updating the class center data of each class by utilizing the final filtered membership degree so as to update the attribute weight of each dimension;
combining the obtained filtered membership degrees to update the cluster center $v_c$ of the $c$-th class, the obtained cluster center being used for subsequent calculation; and combining the membership degrees and cluster centers obtained above to update the attribute weight $w_{cm}$ of the $m$-th dimension of the $c$-th class;
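the calculation formula of the clustering center is as follows: $v_{cm} = \dfrac{\sum_{i=1}^{N} \hat{u}_{ci}^{\alpha}\, x_{im}}{\sum_{i=1}^{N} \hat{u}_{ci}^{\alpha}}$, in which $\hat{u}_{ci}$ is the final filtered membership degree of the $i$-th data to the $c$-th class and $\alpha$ is the fuzzy coefficient;
the results after multidimensional filtering are weighted and summed according to $\hat{u}_{ci} = \sum_{m=1}^{M} w_{cm}\, \hat{u}_{ci}^{(m)}$ to obtain the final filtered membership degree, in which $\hat{u}_{ci}^{(m)}$ is the membership degree filtered under the guidance of the $m$-th dimension; the weights are calculated either by mean weighting or by the EFWFCM weight updating formula; in mean weighting, each dimension is weighted as $1/M$; the EFWFCM weight update formula is $w_{cm} = \dfrac{\exp\left(-\gamma^{-1}\sum_{i=1}^{N} \hat{u}_{ci}^{\alpha}\,(x_{im}-v_{cm})^2\right)}{\sum_{s=1}^{M}\exp\left(-\gamma^{-1}\sum_{i=1}^{N} \hat{u}_{ci}^{\alpha}\,(x_{is}-v_{cs})^2\right)}$, wherein $\gamma$ is a regularized scalar;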
and the classification result output module is used for iteratively calculating and judging the termination condition of the step of updating the class center data of each class, and finally outputting the website behavior classification result.
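The update steps performed by the membership calculation and attribute weight updating modules can be sketched as follows. The update rules used here are the commonly published entropy-weighted fuzzy c-means forms and are an assumption about what the claim intends; the function names, the defaults `alpha=2.0` and `gamma=1.0`, and the handling of zero distances are likewise hypothetical.

```python
import math


def update_membership(X, V, W, alpha=2.0):
    """Membership U[c][i] of data point i to class c from attribute-weighted distances."""
    C, N = len(V), len(X)
    # d[c][i]: sum over dimensions of weight * squared difference to the class centre.
    d = [[sum(w * (x - v) ** 2 for x, v, w in zip(X[i], V[c], W[c]))
          for i in range(N)] for c in range(C)]
    power = 1.0 / (alpha - 1.0)
    U = [[0.0] * N for _ in range(C)]
    for i in range(N):
        zero = [c for c in range(C) if d[c][i] == 0.0]
        if zero:  # point coincides with a centre: share membership among those
            for c in zero:
                U[c][i] = 1.0 / len(zero)
            continue
        inv = [d[c][i] ** (-power) for c in range(C)]
        total = sum(inv)
        for c in range(C):
            U[c][i] = inv[c] / total
    return U


def update_centers(X, U, alpha=2.0):
    """Class centres as membership-weighted means of the data."""
    M = len(X[0])
    V = []
    for u_c in U:
        den = sum(u ** alpha for u in u_c)
        V.append([sum((u ** alpha) * x[m] for u, x in zip(u_c, X)) / den
                  for m in range(M)])
    return V


def update_weights(X, U, V, gamma=1.0, alpha=2.0):
    """Entropy-regularised attribute weights; each class's weights sum to 1."""
    W = []
    for u_c, v_c in zip(U, V):
        D = [sum((u ** alpha) * (x[m] - v_c[m]) ** 2 for u, x in zip(u_c, X))
             for m in range(len(v_c))]
        d_min = min(D)  # subtracting the minimum avoids underflow in exp
        e = [math.exp(-(Dm - d_min) / gamma) for Dm in D]
        total = sum(e)
        W.append([em / total for em in e])
    return W
```

In this form a dimension along which a class is tightly concentrated receives a larger weight than a dimension along which the class is spread out, and a larger `gamma` pushes the weights toward uniform.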
5. A computer readable storage medium having stored thereon a computer program, which when executed by a processor implements the steps of the website behavior classification method of any of claims 1-3.
6. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps in the website behavior classification method of any one of claims 1-3 when the program is executed.
CN202111014054.XA 2021-08-31 2021-08-31 Website behavior classification method, system, storage medium and equipment Active CN113688926B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111014054.XA CN113688926B (en) 2021-08-31 2021-08-31 Website behavior classification method, system, storage medium and equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111014054.XA CN113688926B (en) 2021-08-31 2021-08-31 Website behavior classification method, system, storage medium and equipment

Publications (2)

Publication Number Publication Date
CN113688926A CN113688926A (en) 2021-11-23
CN113688926B true CN113688926B (en) 2024-03-08

Family

ID=78584470

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111014054.XA Active CN113688926B (en) 2021-08-31 2021-08-31 Website behavior classification method, system, storage medium and equipment

Country Status (1)

Country Link
CN (1) CN113688926B (en)

Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103605794A (en) * 2013-12-05 2014-02-26 国家计算机网络与信息安全管理中心 Website classifying method
CN106651838A (en) * 2016-11-15 2017-05-10 山东师范大学 Gel protein partitioning method based on fuzzy clustering
CN107643925A (en) * 2017-09-30 2018-01-30 广东欧珀移动通信有限公司 Background application method for cleaning, device, storage medium and electronic equipment
CN108197650A (en) * 2017-12-30 2018-06-22 南京理工大学 The high spectrum image extreme learning machine clustering method that local similarity is kept
CN109285175A (en) * 2018-08-15 2019-01-29 中国科学院苏州生物医学工程技术研究所 The fuzzy clustering image partition method filtered based on morphological reconstruction and degree of membership
CN109685820A (en) * 2018-11-29 2019-04-26 济南大学 Image partition method based on morphological reconstruction with the FCM cluster with guidance filtering
CN109726738A (en) * 2018-11-30 2019-05-07 济南大学 Data classification method based on transfer learning Yu attribute entropy weighted fuzzy clustering
CN109741330A (en) * 2018-12-21 2019-05-10 东华大学 A kind of medical image cutting method of mixed filtering strategy and fuzzy C-mean algorithm
CN110569915A (en) * 2019-09-12 2019-12-13 齐鲁工业大学 automobile data clustering method and system based on intuitive fuzzy C-means
CN110659930A (en) * 2019-08-27 2020-01-07 深圳大学 Consumption upgrading method and device based on user behaviors, storage medium and equipment
CN111062394A (en) * 2019-11-18 2020-04-24 济南大学 Fuzzy clustering color image segmentation method based on multi-channel weighting guide filtering
CN111932472A (en) * 2020-07-27 2020-11-13 江苏大学 Image edge-preserving filtering method based on soft clustering
CN113222924A (en) * 2021-04-30 2021-08-06 西安电子科技大学 Hyperspectral image anomaly detection system based on FPGA

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Morphological Reconstruction-Based Image-Guided Fuzzy Clustering with a Novel Impact Factor;Qingxue Qin等;Journal of Healthcare Engineering;20210914;第2021卷;1-13 *
基于引导滤波的模糊聚类算法研究及图像分割应用;徐广梅;CNKI优秀硕士学位论文全文库;20210115;全文 *
模糊评判法和模糊聚类分析在优化布点中的应用;潘玉奇;微电子学与计算机;20070831;第24卷(第8期);169-172 *

Also Published As

Publication number Publication date
CN113688926A (en) 2021-11-23

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant