CN113298115A - User grouping method, device, equipment and storage medium based on clustering - Google Patents

Info

Publication number
CN113298115A
Authority
CN
China
Prior art keywords
clustering
user attribute
user
attribute characteristics
center
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110418570.2A
Other languages
Chinese (zh)
Inventor
胡文阳
王汉超
傅正佳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Bigo Technology Pte Ltd
Original Assignee
Bigo Technology Pte Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Bigo Technology Pte Ltd
Priority to CN202110418570.2A
Publication of CN113298115A
Legal status: Pending

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/23 - Clustering techniques
    • G06F18/232 - Non-hierarchical techniques
    • G06F18/2321 - Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23211 - Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with adaptive number of clusters
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/40 - Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
    • G06F16/43 - Querying
    • G06F16/435 - Filtering based on additional data, e.g. user or group profiles
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06Q - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 - Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/01 - Social networking

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Health & Medical Sciences (AREA)
  • Economics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Multimedia (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiment of the invention discloses a user grouping method, a device, equipment and a storage medium based on clustering, wherein the method comprises the following steps: acquiring user attribute features and corresponding clustering centers, wherein each clustering center is obtained by off-line calculation, and the user attribute features are associated with users; determining a target clustering center corresponding to the user attribute characteristics according to the user attribute characteristics and the clustering centers, and determining a group corresponding to the target clustering center as a group of the user; and updating the target clustering center according to the user attribute characteristics so as to be used for subsequent user grouping. According to the scheme, the problem that the clustering result precision is poor when the characteristic distribution fluctuates greatly due to parameter fluctuation is solved, and accurate grouping of data is realized.

Description

User grouping method, device, equipment and storage medium based on clustering
Technical Field
The embodiment of the application relates to the field of computers, in particular to a user grouping method, device, equipment and storage medium based on clustering.
Background
With the continuous development of network transmission and audio-video technology in recent years, short video platforms have attracted more and more users. Because the user base is huge, grouping the users of a short video platform and then serving each group separately has become a common approach. Therefore, how to group video users more accurately, effectively and in real time is an important topic for optimizing the viewing experience of short video platform users.
In the prior art, users are generally grouped by clustering. Clustering refers to the process of dividing a collection of physical or abstract objects into classes composed of similar objects. A cluster generated by clustering is a set of data objects that are similar to the other objects in the same cluster and dissimilar to objects in other clusters. With existing clustering approaches, when the feature distribution fluctuates greatly, for example when fluctuating network conditions of a user group cause the feature distribution to change substantially, both the timeliness and the accuracy of the clustering drop significantly, and improvement is needed.
Disclosure of Invention
The embodiment of the invention provides a user grouping method, a user grouping device, user grouping equipment and a storage medium based on clustering, solves the problem of poor clustering result accuracy when characteristic distribution fluctuates greatly due to parameter fluctuation, and realizes accurate grouping of data.
In a first aspect, an embodiment of the present invention provides a user grouping method based on clustering, where the method includes:
acquiring user attribute features and corresponding clustering centers, wherein each clustering center is obtained by off-line calculation, and the user attribute features are associated with users;
determining a target clustering center corresponding to the user attribute characteristics according to the user attribute characteristics and the clustering centers, and determining a group corresponding to the target clustering center as a group of the user;
and updating the target clustering center according to the user attribute characteristics so as to be used for subsequent user grouping.
In a second aspect, an embodiment of the present invention further provides a device for grouping users based on clustering, where the device includes:
the data acquisition module is used for acquiring user attribute characteristics and corresponding clustering centers, wherein each clustering center is obtained by off-line calculation, and the user attribute characteristics are associated with users;
the data grouping module is used for determining a target clustering center corresponding to the user attribute characteristics according to the user attribute characteristics and the clustering centers and determining a group corresponding to the target clustering center as a group of the user;
and the cluster updating module is used for updating the target cluster center according to the user attribute characteristics and then using the updated target cluster center for subsequent user grouping.
In a third aspect, an embodiment of the present invention further provides a device for grouping users based on clustering, where the device includes:
one or more processors;
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the clustering-based user grouping method described in the embodiments of the invention.
In a fourth aspect, the present invention also provides a storage medium storing computer-executable instructions, which when executed by a computer processor, are configured to perform the clustering-based user grouping method according to the present invention.
In the embodiment of the invention, user attribute characteristics and corresponding clustering centers are obtained, wherein each clustering center is obtained by off-line calculation, the user attribute characteristics are associated with users, a target clustering center corresponding to the user is determined according to the user attribute characteristics and each clustering center, the group corresponding to the target clustering center is determined as a group of the users, and meanwhile, the determined target clustering center is updated according to the user attribute characteristics so as to apply the updated clustering center to subsequent user groups. The scheme solves the problem that the user equipment cannot be grouped well when the characteristic distribution fluctuates greatly due to parameter fluctuation, and realizes accurate grouping of data.
Drawings
FIG. 1 is a flowchart of a user grouping method based on clustering according to an embodiment of the present invention;
FIG. 2 is a flow chart of another clustering-based user grouping method according to an embodiment of the present invention;
FIG. 3 is a flow chart of another clustering-based user grouping method according to an embodiment of the present invention;
FIG. 4 is a flow chart of another clustering-based user grouping method according to an embodiment of the present invention;
FIG. 5 is a flow chart of another clustering-based user grouping method according to an embodiment of the present invention;
FIG. 6 is a flow chart of another clustering-based user grouping method according to an embodiment of the present invention;
FIG. 7 is a block diagram of a clustering-based user grouping apparatus according to an embodiment of the present invention;
FIG. 8 is a schematic structural diagram of an apparatus according to an embodiment of the present invention.
Detailed Description
The embodiments of the present invention will be described in further detail with reference to the drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of and not restrictive on the broad invention. It should be further noted that, for convenience of description, only some structures, not all structures, relating to the embodiments of the present invention are shown in the drawings.
Fig. 1 is a flowchart of a clustering-based user grouping method provided in an embodiment of the present invention, where the embodiment is applicable to reasonably grouping users for subsequently executing a corresponding service policy, and the method may be executed by a computing device such as a desktop, a notebook, a server, and the like, and specifically includes the following steps:
Step S101, obtaining user attribute characteristics and corresponding clustering centers, wherein each clustering center is obtained through off-line calculation, and the user attribute characteristics are associated with users.
In one embodiment, users are grouped in a clustering manner to implement different service policies for users under different groups. The user may be characterized by a user login identifier, a user equipment identifier, or a temporary user identifier.
Wherein the user attribute features are associated with the user to characterize different attributes of the user. Taking video playing as an example, the user attribute feature may be the current network bandwidth of the user, a device decoding parameter or a data download rate, etc. Illustratively, the server provides a video playing resource, and an application program is installed in the user terminal device, through which the user requests and acquires the video resource of the server to play locally in the device. In the video playing process, the current network bandwidth, equipment decoding parameters and data downloading rate of user equipment are used as user attribute characteristics to be sent to a server, and after the server receives the user attribute characteristics on line in real time, the server determines the grouping of users corresponding to the user attribute characteristics.
The clustering center is obtained by calculating offline data. For example, the plurality of user attribute features of the history records may be clustered to obtain a plurality of cluster centers. The number of cluster centers can be preset, and the set number can be determined according to the number of specific subsequent grouping execution strategies. The clustering calculation method can be GMM algorithm, DBSCAN algorithm or Kmeans algorithm. In another embodiment, the number of cluster centers is generated using automatic calculation.
In one embodiment, the clustering centers calculated from the corresponding historical data are stored on the server as cluster data, and the cluster data is loaded into server memory when user grouping is performed after user attribute features are obtained. Different types of clustering models can be set according to the dimensions of the user attribute features and the user service processing strategy. Taking video playing as an example, the user attribute features are network bandwidth, device decoding parameters, and data download rate, and the corresponding service policies may be different video bit rates, where bit rate refers to the data traffic used by a video file per unit time. Multiple bit rates are preset for the same video resource; if K different bit rates are set, the number of clustering centers is set to K, the user attribute features of the history records are clustered to obtain K clustering centers, and the data of this clustering model is associated with the user attribute features related to video playing. In another embodiment, the user attribute features may be user browsing records, user search records, and user click records, and the corresponding service policies may be recommendations of different types of advertising content. If M different advertisement types are set, each corresponding to one or more advertisements, M clustering centers are obtained by clustering the user attribute features of the history records, and each clustering center corresponds to one recommended advertisement type.
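As a rough illustration of the offline step with a preset number of centers, the following sketch clusters a small history of feature vectors into K centers with a plain Kmeans loop. The function names, the sample values, and the choice of Euclidean distance are assumptions made for this example and are not taken from the patent.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, k, iterations=50, seed=0):
    """Plain Kmeans over a list of feature tuples; returns k centers."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda i: euclidean(p, centers[i]))
            clusters[j].append(p)
        new_centers = []
        for j, members in enumerate(clusters):
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[j])  # keep the center of an empty cluster
        if new_centers == centers:
            break
        centers = new_centers
    return centers


# Hypothetical history: (bandwidth in Mbit/s, download rate in MB/s).
history = [(2.0, 0.2), (2.5, 0.3), (3.0, 0.25), (48.0, 5.5), (50.0, 6.0), (52.0, 6.5)]
# K = 2 here stands for two preset bit rates.
centers = sorted(kmeans(history, k=2))
```

With K preset bit rates, each resulting center would be associated with one bit rate; the resulting list is what a server could store and later load into memory for online grouping.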
In another embodiment, when the clustering centers are determined offline, the number of clustering centers is determined automatically from the offline user attribute characteristics of the history records, for example by using an intra-group mean error sum algorithm, a partitioning algorithm around a central point, or a hierarchical clustering algorithm.
Step S102, determining a target clustering center corresponding to the user attribute characteristics according to the user attribute characteristics and the clustering centers, and determining a group corresponding to the target clustering center as the group of the user.
The target clustering center is the clustering center closest to the current user attribute characteristics. For example, suppose there are currently 5 clustering centers, namely clustering center 1, clustering center 2, clustering center 3, clustering center 4 and clustering center 5, and the distances from the current user attribute characteristics to them are distance 1a, distance 2a, distance 3a, distance 4a and distance 5a in sequence. If the smallest of these distances is distance 4a, the corresponding clustering center 4 is determined as the target clustering center.
In one embodiment, taking the Kmeans algorithm as an example, the user attribute features are represented by a D-dimensional vector $x_n$, written as:
$x_n = (x_{n,1}, x_{n,2}, \ldots, x_{n,D})$
The distance from $x_n$ to each cluster center is calculated by the following formula:
Figure BDA0003026904340000051
where K is the number of cluster centers selected, $m_j$ is the D-dimensional feature vector $(m_{j,1}, m_{j,2}, \ldots, m_{j,D})$ of the j-th cluster center, N is the total number of user samples, and $x_n$ is the user attribute feature of the n-th user, i.e. the current user.
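The formula image for the distance is not reproduced in this text. As a minimal sketch, assuming the distance is the ordinary Euclidean distance used by Kmeans, the target clustering center can be found as follows. The function names and sample values are hypothetical.

```python
import math


def distances_to_centers(x, centers):
    """Assumed Euclidean distance from feature vector x to every center m_j."""
    return [math.sqrt(sum((xi - mi) ** 2 for xi, mi in zip(x, m))) for m in centers]


def nearest_center(x, centers):
    """Index j of the center with the smallest distance to x."""
    d = distances_to_centers(x, centers)
    return min(range(len(centers)), key=d.__getitem__)


# Hypothetical centers: (bandwidth in Mbit/s, download rate in MB/s).
centers = [(2.5, 0.25), (20.0, 2.0), (50.0, 6.0)]
x_n = (18.0, 1.5)
target = nearest_center(x_n, centers)  # index of the target clustering center
```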
In one embodiment, each cluster center corresponds to one group, and each group corresponds to a different processing strategy. After the target cluster center corresponding to the user attribute characteristics is obtained through the above calculation, the group corresponding to the target cluster center is determined as the user's group. Users are thus grouped online and in real time based on the acquired user attribute characteristics. Accordingly, after the user's group is determined, the processing strategy of that group is output, or the corresponding strategy is directly executed for the user. For example, assuming there are currently 5 groups, in turn group 1, group 2, group 3, group 4 and group 5, the strategies of the groups may in turn be different video playback modes, for example 4K, 2K, Blu-ray, ultra-definition, and high definition.
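The one-to-one relationship between clustering centers, groups and strategies can be pictured as a simple lookup. The table below reuses the five playback modes from the example; the dictionary and function names are illustrative assumptions only.

```python
# Hypothetical lookup table: group number -> playback strategy from the example.
STRATEGY_BY_GROUP = {
    1: "4K",
    2: "2K",
    3: "Blu-ray",
    4: "ultra-definition",
    5: "high definition",
}


def strategy_for(target_center_number):
    """Each clustering center maps to one group; the group maps to one strategy."""
    return STRATEGY_BY_GROUP[target_center_number]


# If clustering center 4 is the target clustering center, the user is in group 4.
strategy = strategy_for(4)
```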
Step S103, updating the target clustering center according to the user attribute characteristics.
After the user corresponding to the currently acquired user attribute characteristics has been grouped, the target clustering center is further updated, that is, the clustering center is adaptively adjusted according to the user attribute characteristics. In one embodiment, the adjusted clustering center replaces the clustering center previously stored by the server, and when another user attribute feature is subsequently obtained and user grouping is performed, the updated and stored clustering centers are read to determine the user grouping.
Taking the Kmeans algorithm as an example, the user attribute features are represented by the D-dimensional vector $x_n$ (as above), the nearest cluster center is selected as the target cluster center, and the target is determined by the following formula:
Figure BDA0003026904340000052
where K is the number of clusters, $m_j$ is the D-dimensional feature vector $(m_{j,1}, m_{j,2}, \ldots, m_{j,D})$ of the j-th cluster center, and $x_n$ is the user attribute feature of the n-th user, i.e. the current user.
The target clustering center is updated as follows: the characteristic value of the target clustering center is updated based on the user attribute characteristics and a learning rate parameter. The specific calculation formula is as follows:
Figure BDA0003026904340000061
where γ is the learning rate, and is preferably set to a small positive number, such as 0.05.
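The update formula image is not reproduced in this text. One common form of such an update, assumed here purely for illustration, moves each component of the target center a fraction γ of the way toward the new sample, i.e. m ← m + γ(x − m). The function name and sample values are hypothetical.

```python
def update_center(center, x, gamma=0.05):
    """Assumed online update: move the center a fraction gamma toward x."""
    return [m + gamma * (xi - m) for m, xi in zip(center, x)]


# Hypothetical target center and current user feature vector.
center = [20.0, 2.0]
x_n = [30.0, 3.0]
updated = update_center(center, x_n)  # each component moves 5% of the gap
```

With a small γ such as 0.05, a single outlying sample shifts the center only slightly, while a sustained change in the incoming features gradually pulls the center along.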
According to this scheme, the user grouping method assigns the user attribute characteristics acquired online in real time to a clustering center and, once the target clustering center is determined, takes the group corresponding to it as the current user's group. The grouping process has strong real-time performance, so users can be grouped online in real time even when their characteristic attributes fluctuate, and the corresponding service strategies can then be executed. Each clustering center is determined by off-line calculation, so the online and offline clustering processes are combined and the accuracy of the grouping results is ensured. Meanwhile, because the user attribute characteristics fluctuate greatly in some scenes, the clustering center is adjusted according to the user attribute characteristics. This dynamic adjustment allows the group corresponding to the user attribute characteristics to be determined dynamically and accurately, and solves the problem that common user grouping approaches cannot adjust dynamically.
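Steps S101 to S103 can be combined into one small online component. The sketch below is one possible arrangement under two assumptions that the text does not confirm: Euclidean distance for the nearest-center search, and an update of the form m ← m + γ(x − m). The class name and sample stream are hypothetical.

```python
import math


class OnlineGrouper:
    """Holds offline-computed centers and adapts them as samples arrive."""

    def __init__(self, centers, gamma=0.05):
        self.centers = [list(c) for c in centers]  # S101: centers from offline clustering
        self.gamma = gamma

    def group(self, x):
        # S102: choose the nearest center (assumed Euclidean distance).
        k = min(
            range(len(self.centers)),
            key=lambda j: math.sqrt(sum((a - b) ** 2 for a, b in zip(x, self.centers[j]))),
        )
        # S103: nudge the chosen center toward the sample (assumed update rule).
        self.centers[k] = [m + self.gamma * (xi - m) for m, xi in zip(self.centers[k], x)]
        return k


# Hypothetical one-dimensional centers (bandwidth in Mbit/s) and a drifting stream.
grouper = OnlineGrouper([(2.5,), (50.0,)])
groups = [grouper.group((bw,)) for bw in (40.0, 38.0, 36.0)]
```

After the three samples, the second center has drifted below 50 while the first is unchanged, which is the adaptive behavior the paragraph above describes.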
Fig. 2 is a flowchart of another user grouping method based on clustering according to an embodiment of the present invention, which provides a specific method for determining a plurality of clustering centers. As shown in fig. 2, the technical solution is as follows:
Step S201, obtaining a historical record of user attribute characteristics, and clustering the user attribute characteristics in the historical record to obtain a plurality of clustering centers.
In one embodiment, cluster calculation is performed on the user attribute features of the history to automatically derive a plurality of cluster centers. Specifically, the sum of squared errors (SSE) is used as the evaluation index, calculated as follows:
Figure BDA0003026904340000062
where $m_j$ is the D-dimensional feature vector $(m_{j,1}, m_{j,2}, \ldots, m_{j,D})$ of the j-th cluster center, N is the total number of user samples, and $x_n$ is the feature of the n-th user sample; $K_{\min}$ and $K_{\max}$ are the minimum and maximum numbers of cluster centers, respectively. For each K ($K_{\min} < K < K_{\max}$), calculate $\Delta SSE\_Ratio_K = (SSE_{K-1} - SSE_K) / (SSE_K - SSE_{K+1})$, and choose the K that maximizes $\Delta SSE\_Ratio_K$ as the optimal number of clusters.
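The selection rule can be sketched as follows, given an SSE value already computed for every candidate K. The SSE numbers are made up for illustration, and skipping a K whose denominator is not positive is an assumption added here to avoid division by zero; the text does not address that case.

```python
def best_k_by_sse_ratio(sse_by_k):
    """Pick K maximizing (SSE[K-1] - SSE[K]) / (SSE[K] - SSE[K+1]).

    sse_by_k maps consecutive integer K values (K_min..K_max) to their SSE.
    """
    ks = sorted(sse_by_k)
    best_k, best_ratio = None, float("-inf")
    for k in ks[1:-1]:  # only K strictly between K_min and K_max
        gain_before = sse_by_k[k - 1] - sse_by_k[k]
        gain_after = sse_by_k[k] - sse_by_k[k + 1]
        if gain_after <= 0:
            continue  # assumption: ignore K where SSE stops decreasing
        ratio = gain_before / gain_after
        if ratio > best_ratio:
            best_k, best_ratio = k, ratio
    return best_k


# Hypothetical SSE values for K = 1..6.
sse = {1: 1000.0, 2: 400.0, 3: 100.0, 4: 80.0, 5: 70.0, 6: 65.0}
optimal_k = best_k_by_sse_ratio(sse)
```

In this made-up series the ratio is 2 at K = 2, 15 at K = 3, and 2 at K = 4 and K = 5, so K = 3 marks the point where adding further centers stops paying off.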
Step S202, obtaining user attribute characteristics and corresponding clustering centers, wherein each clustering center is obtained through off-line calculation, and the user attribute characteristics are associated with users.
Step S203, determining a target clustering center corresponding to the user attribute characteristics according to the user attribute characteristics and the clustering centers, and determining a group corresponding to the target clustering center as the group of the user.
Step S204, updating the target clustering center according to the user attribute characteristics.
According to the scheme, the user attribute characteristics of the historical records are obtained, clustering processing is carried out on the user attribute characteristics of the historical records to automatically obtain a plurality of clustering centers, automatic and optimal determination of the grouping number is achieved, and the user grouping based on clustering is more accurate.
Fig. 3 is a flowchart of another user grouping method based on clustering according to an embodiment of the present invention, which further provides a method for updating a cluster center. As shown in fig. 3, the technical solution is as follows:
Step S301, obtaining a history record of the user attribute characteristics, and clustering the user attribute characteristics in the history record to obtain a plurality of clustering centers.
Step S302, user attribute features and corresponding clustering centers are obtained, the clustering centers are obtained through off-line calculation, and the user attribute features are associated with users.
Step S303, determining a target clustering center corresponding to the user attribute characteristics according to the user attribute characteristics and the clustering centers, and determining a group corresponding to the target clustering center as the group of the user.
Step S304, updating the target clustering center according to the user attribute characteristics.
Step S305, determining whether the clustering result update time is reached, and if so, executing step S306.
In one embodiment, after the target clustering center is updated according to the user attribute characteristics, it is further determined whether the clustering result update time is reached, where the clustering result update time indicates when to recalculate the optimal number of clustering centers and re-initialize the clustering centers accordingly. For example, the clustering result update time may be half a day, one day, three days, one week, or the like.
Step S306, clustering again based on the user attribute characteristics of the historical records and the newly added user attribute characteristics to obtain a plurality of updated clustering centers.
When the clustering result update time is reached, clustering is performed again based on the user attribute characteristics of the historical records and the newly added user attribute characteristics to obtain a plurality of updated clustering centers. This includes determining the number of clustering centers and initializing the clustering centers; for how the number of clustering centers is determined, refer to the explanation of step S201, which is not repeated here. Optionally, after the clustering centers are initialized, the cluster data is stored in the memory of the server so that it can be used when subsequently grouping users based on newly obtained user attribute features.
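The periodic check in steps S305 and S306 might be organized as below. This is a sketch only: the class and function names are hypothetical, timestamps are passed in explicitly so the logic is easy to follow, and `mean_center` is a stand-in for the real offline clustering routine.

```python
class ReclusterSchedule:
    """Tracks when the next full offline re-clustering is due."""

    def __init__(self, interval_seconds, started_at):
        self.interval_seconds = interval_seconds
        self.last_run = started_at

    def due(self, now):
        return now - self.last_run >= self.interval_seconds

    def mark_done(self, now):
        self.last_run = now


def maybe_recluster(schedule, now, history, new_features, recluster):
    """S305/S306: if the update time is reached, re-cluster history plus new data."""
    if not schedule.due(now):
        return None
    centers = recluster(history + new_features)
    history.extend(new_features)  # new samples become part of the history
    new_features.clear()
    schedule.mark_done(now)
    return centers


ONE_DAY = 24 * 3600
schedule = ReclusterSchedule(ONE_DAY, started_at=0)
history, fresh = [1.0, 3.0], [5.0]
mean_center = lambda data: [sum(data) / len(data)]  # stand-in for real clustering
early = maybe_recluster(schedule, ONE_DAY // 2, history, fresh, mean_center)
late = maybe_recluster(schedule, ONE_DAY, history, fresh, mean_center)
```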
According to the scheme, the clustering centers are determined again at regular intervals, the number of the clustering centers can be dynamically adjusted according to the recorded user attribute characteristics, the problem that the clustering result is poor in accuracy when the characteristic distribution fluctuates greatly due to parameter fluctuation is solved, and accurate grouping of data is achieved.
Fig. 4 is a flowchart of another user grouping method based on clustering according to an embodiment of the present invention, which specifically defines a determination manner of update time of a clustering result. As shown in fig. 4, the technical solution is as follows:
Step S401, obtaining a history record of user attribute characteristics, and performing clustering processing on the user attribute characteristics in the history record to obtain a plurality of clustering centers.
Step S402, obtaining user attribute characteristics and corresponding clustering centers, wherein each clustering center is obtained through off-line calculation, and the user attribute characteristics are associated with users.
Step S403, determining a target clustering center corresponding to the user attribute characteristics according to the user attribute characteristics and the clustering centers, and determining a group corresponding to the target clustering center as the group of the user.
Step S404, updating the target clustering center according to the user attribute characteristics.
Step S405, determining whether the clustering result update time is reached, and if so, executing step S406.
Step S406, clustering again based on the user attribute characteristics of the history records and the newly added user attribute characteristics to obtain a plurality of updated clustering centers.
Step S407, dynamically determining the cluster result updating time according to the updating result of the target cluster center.
In one embodiment, the clustering result update time is dynamically updated according to the update result of the target clustering center. Specifically, if the clustering center changes greatly in the update result of the target clustering center, the clustering result update time is shortened, for example from 1 day to half a day; otherwise, the clustering result update time is extended, for example from 1 day to 2 days. Specifically, after the nearest clustering center is taken as the target clustering center and the characteristic value $k^*$ of the target clustering center is determined, the following quantity is determined accordingly:
Figure BDA0003026904340000081
If the ratio of this quantity to $m_{k^*,d}$ stays within the interval [0.8, 1.2], the clustering result update time is extended accordingly; otherwise, it is shortened. It should be noted that the specific amounts by which the clustering result update time is shortened or extended are only examples and are not limiting.
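Because the formula image is not reproduced here, the exact quantity compared with $m_{k^*,d}$ is unknown. The sketch below assumes it is the updated value of the same center component, and treats the center as stable when every component ratio lies in [0.8, 1.2]. The doubling and halving mirror the 1 day to 2 days and 1 day to half a day examples; the function name and the skipping of zero-valued components are assumptions.

```python
def adjust_interval(interval_hours, old_center, new_center, low=0.8, high=1.2):
    """Lengthen the re-clustering interval when the center is stable, else shorten it.

    Assumption: stability means new/old stays within [low, high] in every dimension.
    """
    stable = all(
        low <= new / old <= high
        for old, new in zip(old_center, new_center)
        if old != 0  # assumption: skip components where the ratio is undefined
    )
    return interval_hours * 2 if stable else interval_hours / 2


# Hypothetical centers before and after online updates.
extended = adjust_interval(24, [20.0, 2.0], [20.5, 2.05])  # small drift
shortened = adjust_interval(24, [20.0, 2.0], [30.0, 2.0])  # ratio 1.5 in first dimension
```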
According to the scheme, the cluster result updating time is dynamically determined according to the updating result of the target cluster center, the cluster calculation efficiency can be effectively improved, and the overall operation power consumption is reduced on the premise that the user grouping result is accurate.
Fig. 5 is a flowchart of another clustering-based user grouping method according to an embodiment of the present invention, which further defines a process of obtaining user attribute features. As shown in fig. 5, the technical solution is as follows:
Step S501, receiving user attribute characteristics sent by a client, wherein the client sends the user attribute characteristics when detecting a video playing event trigger.
In one embodiment, the user attribute characteristics received by the server are sent by the client. The client sends the user attribute characteristics in real time when running the corresponding functions. Specifically, for example, when a user plays a video on a client device and a video playing event is detected, such as a click on the video play button, the current user attribute characteristics are determined, for example the current network bandwidth parameter, the device decoding parameter, and the data download rate. The network bandwidth parameter may be the current network bandwidth measured by a test while the client is running, the data download rate may be the download rate recorded by the client during previous video playback, and the device decoding parameter may be a recorded intrinsic performance parameter of the device, or the CPU occupancy rate, memory occupancy rate, and the like of the device determined in real time.
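On the client side, the features gathered at the play event could be packaged into a small message before being sent to the server. The field names, units, and JSON encoding below are assumptions for illustration; the text does not specify a message format.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class UserAttributeFeatures:
    """Hypothetical payload built when a video playing event is detected."""

    user_id: str
    bandwidth_mbps: float      # measured while the client is running
    download_rate_mbps: float  # recorded during earlier playback
    cpu_usage: float           # one possible device decoding parameter


def on_play_clicked(user_id, bandwidth_mbps, download_rate_mbps, cpu_usage):
    """Serialize the current features for sending to the server."""
    features = UserAttributeFeatures(user_id, bandwidth_mbps, download_rate_mbps, cpu_usage)
    return json.dumps(asdict(features))


payload = on_play_clicked("user-123", 100.0, 12.5, 0.35)
```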
Step S502, determining a target clustering center corresponding to the user attribute characteristics according to the user attribute characteristics and the clustering centers, and determining a group corresponding to the target clustering center as the group of the user.
Step S503, updating the target clustering center according to the user attribute characteristics.
Step S504, determining whether the clustering result update time is reached, and if so, executing step S505.
Step S505, clustering again based on the user attribute characteristics of the historical records and the newly added user attribute characteristics to obtain a plurality of updated clustering centers.
According to this scheme, the user attribute characteristics sent by the client are received online in real time, the corresponding user group is determined by a clustering approach that can be adjusted in real time, and the corresponding service strategy is then executed for the user. In particular, for user characteristic attributes involving network parameters, network conditions usually fluctuate greatly, and the timeliness and efficiency of traditional clustering algorithms are insufficient to group users efficiently. This scheme solves the problem that the clustering result precision is poor when the characteristic distribution fluctuates greatly due to parameter fluctuation, and realizes accurate grouping of data.
Fig. 6 is a flowchart of another clustering-based user grouping method according to an embodiment of the present invention, which further optimizes the user grouping process. As shown in Fig. 6, the technical solution is as follows:
Step S601, receiving user attribute features sent by a client, wherein the client sends the user attribute features when it detects that a video playing event is triggered.
Step S602, determining a fluctuation value of the user attribute feature, and if the fluctuation value is greater than a preset threshold value, determining a target clustering center corresponding to the user attribute feature according to the user attribute feature and each clustering center.
In one embodiment, the server performs a preliminary analysis of the user attribute features received in real time to determine their fluctuation value. If the fluctuation value is greater than a preset threshold, the target clustering center corresponding to the user attribute features is determined from the user attribute features and each clustering center. Otherwise, the user's previously recorded group is used to execute the subsequent processing strategy. Taking network bandwidth as the user attribute feature, for example: if the network bandwidth drops from 100M to 20M, a fivefold change, the fluctuation value is judged to be greater than the preset threshold (which may be, for example, 2), so the target clustering center is determined from the user attribute features and each clustering center, and the user's group is re-determined.
Step S603, determining the group corresponding to the target clustering center as the group of the user.
Step S604, updating the target clustering center according to the user attribute features.
Step S605, determining whether the clustering result update time has been reached, and if so, executing step S606.
Step S606, clustering again based on the user attribute features of the historical records and the newly added user attribute features to obtain a plurality of updated clustering centers.
In this scheme, the fluctuation value of the user attribute features is determined first. Only when the fluctuation value is greater than the preset threshold is the target clustering center determined from the user attribute features and the clustering centers; otherwise, the user's previous group is retrieved and used as the group corresponding to the user attribute features. This optimizes how the user group is determined and improves overall operating efficiency.
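The fluctuation check can be illustrated with a short sketch. The text gives only the bandwidth example (100M to 20M against a threshold of 2), so defining the fluctuation value as the ratio of the larger reading to the smaller one is an assumption, as are the names `fluctuation_value`, `needs_regrouping`, `resolve_group`, and the `last_seen` record layout.

```python
def fluctuation_value(previous, current):
    """Ratio of the larger to the smaller of two positive readings (always >= 1)."""
    if previous <= 0 or current <= 0:
        raise ValueError("readings must be positive")
    return max(previous, current) / min(previous, current)

def needs_regrouping(previous, current, threshold=2.0):
    """True when the fluctuation value exceeds the preset threshold."""
    return fluctuation_value(previous, current) > threshold

def resolve_group(user_id, bandwidth, last_seen, assign):
    """Reuse the recorded group unless bandwidth fluctuated past the threshold."""
    record = last_seen.get(user_id)
    if record is not None and not needs_regrouping(record["bandwidth"], bandwidth):
        return record["group"]  # small fluctuation: keep the previous group
    group = assign(bandwidth)   # large fluctuation or first visit: regroup
    last_seen[user_id] = {"bandwidth": bandwidth, "group": group}
    return group
```

In this sketch the recorded bandwidth is refreshed only when the user is regrouped, so a slow drift is still measured against the reading that produced the current group.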
Fig. 7 is a block diagram of a clustering-based user grouping apparatus according to an embodiment of the present invention. The apparatus is used to execute the clustering-based user grouping method provided by the embodiments of the present invention, and has the functional modules and beneficial effects corresponding to that method. As shown in Fig. 7, the apparatus specifically includes a data acquisition module 101, a data grouping module 102, and a cluster updating module 103, wherein:
the data acquisition module 101 is configured to acquire user attribute features and corresponding clustering centers, where each clustering center is obtained by offline calculation, and the user attribute features are associated with users;
the data grouping module 102 is configured to determine a target clustering center corresponding to the user attribute features according to the user attribute features and the clustering centers, and to determine the group corresponding to the target clustering center as the group of the user;
and the cluster updating module 103 is configured to update the target clustering center according to the user attribute features, so that the updated target clustering center is used for subsequent user grouping.
In this scheme, user attribute features acquired online in real time are matched against the clustering centers; once the target clustering center is determined, its group is taken as the user's current group. The grouping process is highly real-time, so users can be grouped online even when their attribute features fluctuate, and the corresponding service strategy can be executed. Each clustering center is determined by offline calculation, so the online and offline clustering processes are combined and the accuracy of the grouping result is ensured. In addition, because user attribute features fluctuate widely in some scenarios, the clustering center is adjusted according to the user attribute features. This dynamic adjustment lets the group corresponding to the user attribute features be determined dynamically and accurately, which addresses the inability of common user grouping approaches to adapt dynamically.
In one possible embodiment, the apparatus further comprises a cluster center determining module 104, configured to:
before the user attribute features and the corresponding clustering centers are acquired, acquire the historical records of the user attribute features, and cluster the user attribute features in the historical records to obtain a plurality of clustering centers.
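As an illustration of the offline step, the sketch below clusters a historical record with plain Lloyd's k-means using only the standard library. The passage says only that the historical features are clustered, so the choice of k-means, the function name `kmeans`, and the sample bandwidth history are assumptions made for the example.

```python
import math
import random

def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's k-means; returns k centers as tuples."""
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and len(points)")
    rng = random.Random(seed)
    centers = [tuple(p) for p in rng.sample(points, k)]
    for _ in range(iterations):
        # Assignment step: bucket each point under its nearest center.
        buckets = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centers[i]))
            buckets[nearest].append(p)
        # Update step: move each center to the mean of its bucket
        # (an empty bucket keeps its previous center).
        updated = [
            tuple(sum(axis) / len(bucket) for axis in zip(*bucket)) if bucket else centers[i]
            for i, bucket in enumerate(buckets)
        ]
        if updated == centers:
            break  # converged
        centers = updated
    return centers

# Hypothetical history of bandwidth readings, one feature per record.
history = [(0.0,), (1.0,), (2.0,), (100.0,), (101.0,), (102.0,)]
offline_centers = sorted(kmeans(history, k=2))
```

The resulting centers would then be handed to the online grouping step as the starting clustering centers.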
In a possible embodiment, the cluster update module 103 is further configured to:
after the group corresponding to the target clustering center is determined as the group of the user, determine whether the clustering result update time has been reached and, if so, re-cluster based on the user attribute features of the historical records and the newly added user attribute features to obtain a plurality of updated clustering centers.
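A minimal sketch of this update-time check follows. It assumes a fixed interval in seconds and a caller-supplied `cluster` function; the class name `ReclusterSchedule` and its fields are hypothetical and are not taken from the patent.

```python
class ReclusterSchedule:
    """Tracks when the next full re-clustering is due (times in seconds)."""

    def __init__(self, interval, now):
        self.interval = interval
        self.next_update = now + interval
        self.new_features = []  # features received since the last re-clustering

    def record(self, features):
        """Remember a newly received feature vector for the next re-clustering."""
        self.new_features.append(features)

    def maybe_recluster(self, now, history, cluster):
        """Re-cluster history plus new features if the update time is reached."""
        if now < self.next_update:
            return None  # not yet due: keep the current centers
        history.extend(self.new_features)
        self.new_features.clear()
        self.next_update = now + self.interval
        return cluster(history)
```

Returning `None` when the update time has not been reached lets the caller keep serving requests with the incrementally updated centers.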
In a possible embodiment, the cluster update module 103 is further configured to:
after the target clustering center is updated according to the user attribute features, dynamically determine the clustering result update time according to the update result of the target clustering center.
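The text does not state how the update time depends on the update result. One plausible reading, sketched here purely as an assumption, is to bring the next re-clustering forward as a center drifts away from its offline value. The function `next_update_time` and the parameters `drift_tolerance` and `min_interval` are hypothetical.

```python
import math

def next_update_time(now, base_interval, baseline_center, updated_center,
                     drift_tolerance, min_interval):
    """Shorten the wait before re-clustering as the center drifts from its offline value."""
    drift = math.dist(baseline_center, updated_center)
    # Fraction of the tolerance still unused, clamped to the range [0, 1].
    remaining = max(0.0, 1.0 - drift / drift_tolerance)
    interval = max(min_interval, base_interval * remaining)
    return now + interval
```

With no drift the full base interval applies; once the drift reaches the tolerance, only the minimum interval remains.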
In a possible embodiment, the cluster update module 103 is specifically configured to:
update the feature value of the target clustering center based on the user attribute features and a learning rate parameter.
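A common form of such an update moves the center a fraction of the way toward the new sample, c ← c + η(x − c), where η is the learning rate. The sketch below assumes that form; this passage does not give the exact rule, and the name `update_center` is hypothetical.

```python
def update_center(center, features, learning_rate):
    """Move the center a fraction `learning_rate` of the way toward `features`."""
    if not 0.0 <= learning_rate <= 1.0:
        raise ValueError("learning_rate must be in [0, 1]")
    return tuple(c + learning_rate * (x - c) for c, x in zip(center, features))

# Example: the center moves halfway toward the new sample.
new_center = update_center((20.0, 2.0), (30.0, 4.0), learning_rate=0.5)
```

A small learning rate keeps the center stable against single outliers, while a larger one tracks fluctuating features more quickly.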
In a possible embodiment, the data obtaining module 101 is specifically configured to:
receive the user attribute features sent by a client, wherein the client sends the user attribute features when it detects that a video playing event is triggered, and the user attribute features comprise one or more of a network bandwidth parameter, a device decoding parameter, and a data download rate.
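For illustration, the message a client might send on a video playing event could look like the sketch below. The field names, units, JSON encoding, and the `on_video_play` handler are all assumptions; the text names only the three kinds of feature.

```python
import json
from dataclasses import asdict, dataclass

@dataclass
class UserAttributeFeatures:
    user_id: str
    network_bandwidth_mbps: float  # measured while the client is running
    device_decoding_score: float   # recorded intrinsic device performance
    download_rate_mbps: float      # recorded during earlier playback

def on_video_play(user_id, measure_bandwidth, decoding_score, last_download_rate):
    """Build the JSON message a client might send when playback is triggered."""
    features = UserAttributeFeatures(
        user_id=user_id,
        network_bandwidth_mbps=measure_bandwidth(),  # fresh measurement per event
        device_decoding_score=decoding_score,
        download_rate_mbps=last_download_rate,
    )
    return json.dumps(asdict(features))

message = on_video_play("u1", lambda: 20.0, 0.8, 2.5)
```

Passing the bandwidth measurement as a callable reflects that it is taken at the moment the play event fires, whereas the other two values may come from stored records.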
In one possible embodiment, the data grouping module 102 is further configured to:
before determining the target clustering center corresponding to the user attribute features according to the user attribute features and each clustering center, determine a fluctuation value of the user attribute features and, only if the fluctuation value is greater than a preset threshold, determine the target clustering center corresponding to the user attribute features according to the user attribute features and each clustering center, wherein the fluctuation value comprises a network bandwidth parameter fluctuation value.
Fig. 8 is a schematic structural diagram of clustering-based user grouping equipment according to an embodiment of the present invention. As shown in Fig. 8, the equipment includes a processor 201, a memory 202, an input device 203, and an output device 204. The equipment may contain one or more processors 201; one processor 201 is taken as an example in Fig. 8. The processor 201, the memory 202, the input device 203, and the output device 204 may be connected by a bus or by other means; Fig. 8 takes connection by a bus as an example. The memory 202, as a computer-readable storage medium, may be used to store software programs, computer-executable programs, and modules, such as the program instructions/modules corresponding to the clustering-based user grouping method in the embodiments of the present invention. By running the software programs, instructions, and modules stored in the memory 202, the processor 201 executes the various functional applications and data processing of the equipment, that is, implements the clustering-based user grouping method described above. The input device 203 may be used to receive input numeric or character information and to generate key signal inputs related to user settings and function control of the equipment. The output device 204 may include a display device such as a display screen.
Embodiments of the present invention also provide a storage medium containing computer-executable instructions which, when executed by a computer processor, perform a clustering-based user grouping method, the method comprising:
acquiring user attribute features and corresponding clustering centers, wherein each clustering center is obtained by offline calculation, and the user attribute features are associated with users;
determining a target clustering center corresponding to the user attribute features according to the user attribute features and the clustering centers, and determining the group corresponding to the target clustering center as the group of the user;
updating the target clustering center according to the user attribute features.
It should be noted that, in the embodiment of the clustering-based user grouping apparatus, the included units and modules are only divided according to functional logic, but are not limited to the above division as long as the corresponding functions can be implemented; in addition, specific names of the functional units are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the embodiment of the invention.
It should be noted that the foregoing is only a preferred embodiment of the present invention and the technical principles applied. Those skilled in the art will appreciate that the embodiments of the present invention are not limited to the specific embodiments described herein, and that various obvious changes, adaptations, and substitutions are possible, without departing from the scope of the embodiments of the present invention. Therefore, although the embodiments of the present invention have been described in more detail through the above embodiments, the embodiments of the present invention are not limited to the above embodiments, and many other equivalent embodiments may be included without departing from the concept of the embodiments of the present invention, and the scope of the embodiments of the present invention is determined by the scope of the appended claims.

Claims (10)

1. A clustering-based user grouping method, characterized by comprising:
acquiring user attribute features and corresponding clustering centers, wherein each clustering center is obtained by offline calculation, and the user attribute features are associated with users;
determining a target clustering center corresponding to the user attribute features according to the user attribute features and the clustering centers, and determining the group corresponding to the target clustering center as the group of the user; and
updating the target clustering center according to the user attribute features, for use in subsequent user grouping.
2. The clustering-based user grouping method of claim 1, further comprising, before acquiring the user attribute features and the corresponding clustering centers:
acquiring a historical record of the user attribute features, and clustering the user attribute features in the historical record to obtain a plurality of clustering centers.
3. The clustering-based user grouping method according to claim 2, further comprising, after determining the group corresponding to the target cluster center as the group of the user:
determining whether the clustering result update time has been reached, and if so, re-clustering based on the user attribute features of the historical records and the newly added user attribute features to obtain a plurality of updated clustering centers.
4. The clustering-based user grouping method of claim 3, further comprising, after updating the target clustering center according to the user attribute features:
dynamically determining the clustering result update time according to the update result of the target clustering center.
5. The clustering-based user grouping method of claim 1, wherein updating the target clustering center according to the user attribute features comprises:
updating the feature value of the target clustering center based on the user attribute features and a learning rate parameter.
6. The clustering-based user grouping method according to any one of claims 1-5, wherein acquiring the user attribute features comprises:
receiving the user attribute features sent by a client, wherein the client sends the user attribute features when it detects that a video playing event is triggered, and the user attribute features comprise one or more of a network bandwidth parameter, a device decoding parameter, and a data download rate.
7. The clustering-based user grouping method according to claim 6, further comprising, before determining the target clustering center corresponding to the user attribute features according to the user attribute features and the clustering centers:
determining a fluctuation value of the user attribute features, and if the fluctuation value is greater than a preset threshold, determining the target clustering center corresponding to the user attribute features according to the user attribute features and each clustering center, wherein the fluctuation value comprises a network bandwidth parameter fluctuation value.
8. A clustering-based user grouping apparatus, comprising:
a data acquisition module, configured to acquire user attribute features and corresponding clustering centers, wherein each clustering center is obtained by offline calculation, and the user attribute features are associated with users;
a data grouping module, configured to determine a target clustering center corresponding to the user attribute features according to the user attribute features and the clustering centers, and to determine the group corresponding to the target clustering center as the group of the user; and
a cluster updating module, configured to update the target clustering center according to the user attribute features, so that the updated target clustering center is used for subsequent user grouping.
9. Clustering-based user grouping equipment, comprising: one or more processors; and storage means for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the clustering-based user grouping method of any one of claims 1-7.
10. A storage medium storing computer executable instructions for performing the cluster-based user grouping method of any one of claims 1-7 when executed by a computer processor.
CN202110418570.2A 2021-04-19 2021-04-19 User grouping method, device, equipment and storage medium based on clustering Pending CN113298115A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110418570.2A CN113298115A (en) 2021-04-19 2021-04-19 User grouping method, device, equipment and storage medium based on clustering

Publications (1)

Publication Number Publication Date
CN113298115A true CN113298115A (en) 2021-08-24

Family

ID=77319911

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110418570.2A Pending CN113298115A (en) 2021-04-19 2021-04-19 User grouping method, device, equipment and storage medium based on clustering

Country Status (1)

Country Link
CN (1) CN113298115A (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102082718A (en) * 2011-02-23 2011-06-01 中国人民解放军信息工程大学 Service-oriented method for clustering services in virtual network
WO2015001416A1 (en) * 2013-07-05 2015-01-08 Tata Consultancy Services Limited Multi-dimensional data clustering
CN106604267A (en) * 2017-02-21 2017-04-26 重庆邮电大学 Dynamic self-adapting wireless sensor network invasion detection intelligence algorithm
CN109819282A (en) * 2017-11-22 2019-05-28 腾讯科技(深圳)有限公司 A kind of video user classification recognition methods, device and medium
CN110245687A (en) * 2019-05-17 2019-09-17 腾讯科技(上海)有限公司 User classification method and device
WO2020233320A1 (en) * 2019-05-20 2020-11-26 深圳壹账通智能科技有限公司 Reminding task allocation method and apparatus, computer device, and storage medium
CN112069485A (en) * 2020-06-12 2020-12-11 完美世界(北京)软件科技发展有限公司 Safety processing method, device and equipment based on user behaviors
CN112364937A (en) * 2020-11-30 2021-02-12 腾讯科技(深圳)有限公司 User category determination method and device, recommended content determination method and electronic equipment

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
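KOMARASAMY G, DR et al.: "An optimized k-means clustering technique using bat algorithm", 《EUROPEAN JOURNAL OF SCIENTIFIC RESEARCH》, vol. 84, pages 263 - 273 *
许家钰: "Design and implementation of a WiFi user behavior analysis *** based on the k-means algorithm", 《China Master's Theses Full-text Database: Information Science and Technology》, no. 8, pages 1 - 81 *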

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115065600A (en) * 2022-06-13 2022-09-16 远景智能国际私人投资有限公司 Equipment grouping method, device, equipment and storage medium
CN115065600B (en) * 2022-06-13 2024-01-05 远景智能国际私人投资有限公司 Equipment grouping method, device, equipment and storage medium

Similar Documents

Publication Publication Date Title
US20220294821A1 (en) Risk control method, computer device, and readable storage medium
CN110012060B (en) Information pushing method and device of mobile terminal, storage medium and server
WO2019134274A1 (en) Interest exploration method, storage medium, electronic device and system
CN106095842B (en) Online course searching method and device
CN105022761A (en) Group search method and apparatus
WO2019062405A1 (en) Application program processing method and apparatus, storage medium, and electronic device
WO2021169294A1 (en) Application recognition model updating method and apparatus, and storage medium
CN108390775B (en) User experience quality evaluation method and system based on SPICE
CN111708942B (en) Multimedia resource pushing method, device, server and storage medium
CN111935025B (en) Control method, device, equipment and medium for TCP transmission performance
WO2019085754A1 (en) Application cleaning method and apparatus, and storage medium and electronic device
CN111523035A (en) Recommendation method, device, server and medium for APP browsing content
CN111343006B (en) CDN peak flow prediction method, device and storage medium
WO2019062404A1 (en) Application program processing method and apparatus, storage medium, and electronic device
CN113556368A (en) User identification method, device, server and storage medium
CN113298115A (en) User grouping method, device, equipment and storage medium based on clustering
CN111310072B (en) Keyword extraction method, keyword extraction device and computer-readable storage medium
CN111461188A (en) Target service control method, device, computing equipment and storage medium
CN111241225A (en) Resident area change judgment method, resident area change judgment device, resident area change judgment equipment and storage medium
CN106888237B (en) Data scheduling method and system
CN111598390B (en) Method, device, equipment and readable storage medium for evaluating high availability of server
CN110134575B (en) Method and device for calculating service capacity of server cluster
CN111143688B (en) Evaluation method and system based on mobile news client
US20060020639A1 (en) Engine for validating proposed changes to an electronic entity
WO2014117566A1 (en) Ranking method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination