CN115187344A - Big data-based user preference analysis and identification method - Google Patents

Big data-based user preference analysis and identification method

Info

Publication number
CN115187344A
Authority
CN
China
Prior art keywords
preference
user
browsing
cluster
decision
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202211106761.6A
Other languages
Chinese (zh)
Other versions
CN115187344B (en)
Inventor
刘梅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nantong Jiutuo Intelligent Equipment Co ltd
Original Assignee
Nantong Jiutuo Intelligent Equipment Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Application filed by Nantong Jiutuo Intelligent Equipment Co ltd
Priority to CN202211106761.6A
Publication of CN115187344A
Application granted
Publication of CN115187344B
Legal status: Active

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00: Commerce
    • G06Q30/06: Buying, selling or leasing transactions
    • G06Q30/0601: Electronic shopping [e-shopping]
    • G06Q30/0631: Item recommendations
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00: Commerce
    • G06Q30/06: Buying, selling or leasing transactions
    • G06Q30/0601: Electronic shopping [e-shopping]
    • G06Q30/0641: Shopping interfaces
    • G06Q30/0643: Graphical representation of items or shoppers
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Business, Economics & Management (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to the technical field of electric digital data processing, and in particular to a big data-based user preference analysis and identification method. The method comprises: acquiring the historical browsing records of a user, where a browsing record that is followed by a purchase is a purchasing behavior; obtaining a plurality of clusters and connecting the clusters whose indexes are related to obtain preference chains, where one complete device corresponds to one preference chain and the starting point of the preference chain is the cluster corresponding to the purchasing behavior of the complete device; obtaining a first, a second and a third decision force of each preference chain; obtaining the degree of difference between a cluster to be compared and its corresponding preference chain from the differences between their first, second and third decision forces; and analyzing the user's preference based on the degree of difference and determining a product recommendation strategy. The method can accurately locate the user's current preference and set a product recommendation strategy suited to that preference.

Description

Big data-based user preference analysis and identification method
Technical Field
The invention relates to the technical field of electric digital data processing, in particular to a user preference analysis and identification method based on big data.
Background
When a user browses the detail page of a commodity, the system stores the commodity information on that page as a browsing record of the user, also called a browsing log. Such real-time stream data is one kind of big data. It provides the website operator with a commercially valuable copy of the user's preferences: the system pushes related products and information according to this copy, and statistical analysis of long-term browsing data helps the operator focus on customer demand and make marketing plans.
Existing user preference analysis methods, when mining the historical browsing logs of retail products, pay more attention to the user's personal consumption level, and rely on real-time browsing data, such as the current user's click rate and page dwell time, to predict the product type and style the user needs. For products such as industrial equipment, however, the operating website offers few product types and its real-time browsing data is sparse, so real-time data has relatively low reference value. If the user's preference is located only from the user's current real-time browsing data, the result is not ideal, and a recommendation strategy changed according to that preference will not achieve the desired effect either.
Disclosure of Invention
In order to solve the above technical problems, an object of the present invention is to provide a method for analyzing and identifying user preferences based on big data, and the adopted technical solution is specifically as follows:
One embodiment of the invention provides a big data-based user preference analysis and identification method comprising the following steps: acquiring the historical browsing records of a user, where a browsing record that is followed by a purchase is a purchasing behavior; clustering with each purchasing behavior as a cluster center point and the related browsing records in the time period for completing that purchasing behavior as cluster points, to obtain a plurality of clusters; if the products purchased by the purchasing behaviors of two clusters are accessory products of one complete device, the indexes of the two clusters are related; connecting the clusters whose indexes are related to obtain preference chains, where one complete device corresponds to one preference chain and the starting point of each preference chain is the cluster corresponding to the purchasing behavior of the complete device;
the first decision force of a preference chain is the mean, over all clusters on the chain, of the ratio of the start time of the first browsing record to the end time of the last browsing record in the cluster; the second decision force of a preference chain is the mean, over all clusters on the chain, of the variance of the browsing record durations in the cluster; the third decision force of a preference chain is obtained from the number of browsing records of each product in each cluster on the chain and the total number of browsing records in that cluster;
clustering with the user's current browsing record as the cluster center point and the browsing records within a preset period as cluster points, to obtain a cluster to be compared; determining the preference chain corresponding to the cluster to be compared according to the complete equipment to which the product in the current browsing record belongs; calculating the first, second and third decision forces of the cluster to be compared, and obtaining the degree of difference between the cluster to be compared and its corresponding preference chain from the differences between their first, second and third decision forces; and analyzing the user's preference based on the degree of difference and determining a product recommendation strategy.
Preferably, each browsing record in the browsing history includes information on the browsed product and browsing time information.
Preferably, clustering with each purchasing behavior as a cluster center point and the related browsing records in the time period for completing that purchasing behavior as cluster points to obtain a plurality of clusters comprises: the time period for completing a purchasing behavior is, for the type of product purchased, the period from when the user starts browsing that type of product to the completion of the purchase of that type of product; a related browsing record is a browsing record, within the period for completing a purchasing behavior, whose browsed product is of the same type as the product purchased by that purchasing behavior.
Preferably, the third decision force is:
$$D_3 = \frac{1}{N}\sum_{i=1}^{N} e^{\sum_{k=1}^{M}\frac{n_{i,k}}{n_i}\ln\frac{n_{i,k}}{n_i}}$$
where $D_3$ represents the third decision force; $N$ represents the number of clusters in the preference chain; $e$ represents the natural constant; $M$ represents the maximum number of styles of products of the same category as the product purchased by the purchasing behavior; $n_{i,k}$ represents the number of browsing records, in the i-th cluster of the preference chain, of the k-th style of product of the same category as the product purchased by the purchasing behavior; $n_i$ represents the total number of browsing records in the i-th cluster; and $\ln$ denotes the logarithm to base e.
Preferably, clustering with the user's current browsing record as the cluster center point and the browsing records within a preset period as cluster points to obtain the cluster to be compared comprises: the user's current browsing record is the record of what the user is browsing at the current moment, or the user's browsing record closest to the current moment; the preset period is a period of time counted from the user's current browsing record.
Preferably, the degree of difference is:
[Formula image not reproduced in this text.]
The formula gives the degree of difference between the preference chain a and the cluster b to be compared in terms of the first decision forces of the preference chain a and the cluster b to be compared, the second decision forces of the preference chain a and the cluster b to be compared, the third decision forces of the preference chain a and the cluster b to be compared, and three weights, which are the weights of the first, second and third decision forces, respectively.
Preferably, analyzing the user's preference based on the degree of difference and determining the product recommendation policy comprises: setting a difference threshold, if the difference degree is smaller than the difference threshold, the current preference characteristics of the user are similar to the preference characteristics of the user in the corresponding preference chain, and the recommendation strategy is to recommend products purchased by each purchasing behavior in the preference chain similar to the current preference characteristics of the user for the user; if the difference degree is larger than or equal to the difference threshold value, the current preference characteristics of the user are not similar to the preference characteristics of the user in the corresponding preference chain, and the recommendation strategy is to recommend diversified products for the user.
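The threshold rule above can be illustrated with a minimal Python sketch. The exact form of the degree of difference is not recoverable from this text, so the weighted sum of absolute differences used in `difference_degree` is an assumption made only for illustration; the function names, the default weights and the returned dictionary layout are likewise hypothetical.

```python
def difference_degree(chain_forces, cluster_forces, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Weighted sum of absolute differences between the three decision forces.

    ASSUMPTION: the patent's formula image is missing; this form is a stand-in.
    chain_forces / cluster_forces: (first, second, third) decision forces.
    """
    return sum(w * abs(a - b) for w, a, b in zip(weights, chain_forces, cluster_forces))


def recommendation_strategy(diff, threshold, chain_products):
    """Apply the threshold rule described in the text."""
    if diff < threshold:
        # Current preference resembles the preference chain:
        # recommend the products purchased along that chain.
        return {"strategy": "similar", "products": list(chain_products)}
    # Otherwise recommend diversified products (selection left to the caller).
    return {"strategy": "diversified", "products": []}


d = difference_degree((0.8, 0.5, 0.6), (0.7, 0.5, 0.9), weights=(0.5, 0.25, 0.25))
print(recommendation_strategy(d, 0.2, ["press", "belt"]))
```

Since the second decision force is a variance of durations and is not confined to the range 0 to 1, the weights in a sketch like this would probably also have to absorb the difference in scale between the three forces.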
The embodiment of the invention at least has the following beneficial effects:
1. The historical browsing records of each complete device and its accessory products are analyzed. Clustering with each purchasing behavior as a cluster center point yields a plurality of clusters, and the cluster corresponding to the purchase of each complete device is connected with the clusters corresponding to the purchases of its accessory products to obtain the preference chain of that complete device. The preference chain of a complete device thus contains the historical browsing and purchasing information of both the device and its accessory products, so the information is relatively comprehensive, and the user's preference can be located relatively comprehensively and accurately when it is analyzed on the basis of the preference chain;
2. When the user's preference is analyzed from several of the user's current browsing records, the corresponding preference chain is found from the complete equipment to which the currently browsed product belongs as an accessory product. The first, second and third decision forces of that preference chain are calculated, and the first, second and third decision forces of the user's current browsing are obtained from the current browsing records. The differences between the decision forces of the preference chain and those of the current browsing records are then analyzed, and the preference behind the user's current browsing behavior is located from these differences. Because historical data is taken into account, the analysis leans toward the user's decision habits when purchasing products and can trace the user's current decision-making power in depth, which improves the accuracy of the preference location result.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions and advantages of the prior art, the drawings used in the embodiments or the description of the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.
Fig. 1 is a flowchart of a method for analyzing and identifying user preference based on big data according to an embodiment of the present invention.
Detailed Description
To further illustrate the technical means and effects adopted by the present invention to achieve its intended purpose, the specific implementation, structure, features and effects of the big data-based user preference analysis and identification method according to the present invention are described in detail below with reference to the accompanying drawings and preferred embodiments. In the following description, different references to "one embodiment" or "another embodiment" do not necessarily refer to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
The following describes a specific scheme of the big data-based user preference analysis and identification method in detail with reference to the accompanying drawings.
Example:
The main application scenario of the invention is as follows: for industrial equipment products and the websites that sell them, product categories are few and browsing volume is small, so real-time data may have low reference value. If the preference is located only from the customer's current clicking and browsing behavior, random clicks can easily cover a large share of the product categories, and the resulting preference analysis is not ideal.
Referring to Fig. 1, a flowchart of a big data-based user preference analysis and identification method according to an embodiment of the present invention is shown; the method includes the following steps:
Step S1: acquire the historical browsing records of a user, where a browsing record that is followed by a purchase is a purchasing behavior; cluster with each purchasing behavior as a cluster center point and the related browsing records in the time period for completing that purchasing behavior as cluster points, to obtain a plurality of clusters; if the products purchased by the purchasing behaviors of two clusters are accessory products of one complete device, the indexes of the two clusters are related; connect the clusters whose indexes are related to obtain preference chains, where one complete device corresponds to one preference chain and the starting point of the preference chain is the cluster corresponding to the purchasing behavior of the complete device.
First, because log files occupy a large amount of storage, traditional browsing logs cannot be retained in full; after data mining they are generally converted into real-time stream data for analysis and storage. Although the browsing record interface displays some SKU (stock keeping unit) information of the goods, there is no need to store all of it: generally the unique number of the SKU (such as 1234A or 1234B) is stored to represent the browsing record of the goods. User browsing records are temporary data that change frequently, and their volume is not large.
When mining the browsing records in a user's history, the browsing records that count as purchasing behaviors must be distinguished: a browsing record that is followed by a purchase is a purchasing behavior, and all other browsing records are browse-only records. For industrial equipment products, a complete equipment is an upstream product, while its parts and accessories, collectively called the accessory products of the complete equipment, are downstream products.
Furthermore, browsing volumes differ greatly between websites selling different types of goods. Retail websites such as "panning" and "gathering" have huge daily browsing volumes, and an individual user's purchases are fairly random (the user's previous purchase may simply have been snacks), so historical data cannot be strongly associated with the current preference and mostly serves to trace the user's purchasing power and consumption level; real-time browsing behavior is therefore emphasized when pushing product categories and styles. Operating websites for industrial manufacturing and equipment products have small browsing volumes, and most purchases are made by enterprises and factories rather than by individual users buying as they please. Their preferences are relatively fixed, so preference tracing leans more on historical data; the real-time data of such websites has low reference value, and current browsing behavior cannot be guaranteed to match the historical browsing logic completely.
Finally, the browsing records in the user's history are classified: clustering takes each purchasing behavior as a cluster center point and the related browsing records in the time period for completing that purchasing behavior as cluster points, yielding a plurality of clusters. Discrete, highly random browsing records may appear during clustering; they have low reference value and can be ignored. The time period for completing a purchasing behavior runs from when the user starts browsing that type of product to when the user stops browsing it after the purchase. Note that the purchase itself does not mark the end of this period: the browsing records surrounding a purchasing behavior include not only those before the purchase but also those in the period just after it, which may still belong to the user's purchase decision process. If the user keeps browsing related products after buying, the decision is not yet final; the user is still hesitating, and a return is possible. The decision period of each purchasing behavior therefore runs from the start of browsing to the end of browsing, at which point the decision, and hence the purchase, is considered complete. The time period for completing one purchasing behavior can thus be called a decision period.
A related browsing record is one in which the product browsed is of the same type as the product of the purchasing behavior, that is, the two have the same classification number. In this case the user is considered to have browsed and compared several similar products of that type before selecting one particular model to purchase.
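The purchase-centered clustering described above can be sketched in Python as follows. The text does not specify how the boundaries of a decision period are detected, so the `max_gap` window around each purchase is a hypothetical stand-in, and the `Record` fields and the function name are illustrative assumptions rather than part of the described method.

```python
from dataclasses import dataclass


@dataclass
class Record:
    sku: str                 # unique product number, e.g. "1234A"
    category: str            # classification number of the product type
    start: float             # browsing start time (seconds)
    end: float               # browsing end time (seconds)
    purchased: bool = False  # True if this browsing record led to a purchase


def cluster_by_purchase(records, max_gap=7 * 24 * 3600):
    """Build one cluster per purchasing behavior.

    ASSUMPTION: a record belongs to a purchase's decision period when it has
    the same category and starts within max_gap seconds of the purchase.
    Records matching no purchase are simply left out (treated as discrete).
    """
    ordered = sorted(records, key=lambda r: r.start)
    clusters = []
    for purchase in (r for r in ordered if r.purchased):
        members = [
            r for r in ordered
            if r.category == purchase.category
            and abs(r.start - purchase.start) <= max_gap
        ]
        clusters.append({"center": purchase, "records": members})
    return clusters


history = [
    Record("A1", "pump", 0, 60),
    Record("A2", "pump", 100, 200, purchased=True),
    Record("B1", "valve", 150, 160),
    Record("A3", "pump", 10_000_000, 10_000_005),
]
clusters = cluster_by_purchase(history, max_gap=1000)
print([r.sku for r in clusters[0]["records"]])
```

In this toy history the "valve" record and the much later "pump" record fall outside the single cluster, which mirrors how unrelated or discrete records are ignored.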
After the clusters are obtained, the product purchased by the purchasing behavior of each cluster is used as the index of that cluster, and related clusters are connected based on the correlation between indexes to obtain a preference chain. The starting point of a preference chain is the cluster corresponding to the purchasing behavior of the complete equipment; the products purchased by the other clusters on the chain are accessory products of that complete equipment, and the clusters are connected in order of purchase time. A preference chain can also be regarded as a set whose elements are the clusters corresponding to the purchasing behaviors of the complete equipment and its accessory products: the first element is the cluster for the purchase of the complete equipment, and the remaining elements are arranged in time order. Note that the indexes of two clusters are related when the products purchased by their purchasing behaviors are accessory products of one complete device.
The concept of a cluster index is used here: the browsing records clustered around one purchasing behavior form the analysis data set of a single consumption behavior. The cluster index is unique: the product of the purchasing behavior serves as the unique index entry of that data set, so historical data can be located by its index entry when it is retrieved and analyzed. The chain structure of a preference chain has the same property: a complete preference chain can be regarded as one cluster that is continuous in time, with the consumption data set centered on the purchase of the complete equipment serving as its unique index. When subsequent browsing records are analyzed, a new batch of data is therefore compared for similarity with the cluster points in the preference chain to judge whether it can be classified onto that chain.
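A minimal Python sketch of assembling preference chains from indexed clusters is given below. The `equipment_of` mapping from each product to the complete equipment it belongs to is a hypothetical input (the text does not say how this relation is stored), and the dictionary keys `product` and `purchase_time` are illustrative names.

```python
from collections import defaultdict


def build_preference_chains(clusters, equipment_of):
    """Group clusters into one preference chain per complete equipment.

    clusters: list of dicts with 'product' (the cluster index) and 'purchase_time'.
    equipment_of: hypothetical mapping product -> complete equipment;
                  a complete equipment maps to itself.
    """
    chains = defaultdict(list)
    for cluster in clusters:
        chains[equipment_of[cluster["product"]]].append(cluster)
    for equipment, chain in chains.items():
        # The complete-equipment cluster comes first, the rest follow in
        # order of purchase time.
        chain.sort(key=lambda c: (c["product"] != equipment, c["purchase_time"]))
    return dict(chains)


clusters = [
    {"product": "filter", "purchase_time": 30},
    {"product": "press", "purchase_time": 10},
    {"product": "belt", "purchase_time": 20},
    {"product": "lathe", "purchase_time": 5},
]
equipment_of = {"press": "press", "filter": "press", "belt": "press", "lathe": "lathe"}
chains = build_preference_chains(clusters, equipment_of)
print([c["product"] for c in chains["press"]])
```

Keying the chains by complete equipment reflects the idea that the purchase of the complete equipment acts as the unique index of the whole chain.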
Step S2: the first decision force of a preference chain is the mean, over all clusters on the chain, of the ratio of the start time of the first browsing record to the end time of the last browsing record in the cluster; the second decision force of a preference chain is the mean, over all clusters on the chain, of the variance of the browsing record durations in the cluster; the third decision force of the preference chain is obtained from the number of browsing records of each product in each cluster on the chain and the total number of browsing records in that cluster.
First, every cluster on a preference chain, that is, every purchasing behavior, has several browsing records related to it. Analyzing these related browsing records reveals information hidden in the historical data: the decision period and decision-making power of the user when purchasing a product, and the number of product types compared during the purchase. These serve as references for analyzing the user's current browsing records.
Further, the first decision force is obtained by analyzing the decision period of the user's product purchases:
$$D_1 = \frac{1}{N}\sum_{i=1}^{N}\frac{t_i^{s}}{t_i^{e}}$$
where $D_1$ represents the first decision force of a preference chain; $t_i^{s}$ represents the start time of the first browsing record in the i-th cluster of the preference chain; $t_i^{e}$ represents the end time of the last browsing record in the i-th cluster of the preference chain; and $N$ represents the number of clusters in the preference chain.
The ratio $t_i^{s}/t_i^{e}$ of the start time of the first browsing record to the end time of the last browsing record in the i-th cluster characterizes the decision period of that purchasing behavior. Its value lies between 0 and 1; the closer it is to 1, the shorter the decision period and the stronger the user's decision-making power when buying the product. $D_1$ averages this ratio over the historical records: the larger its value, the stronger the decision-making power shown in the user's historical purchases, that is, the stronger the first decision force.
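As an illustration, the first decision force (the mean over clusters of first start time divided by last end time) could be computed as in the Python sketch below. Representing each cluster as a list of `(start, end)` timestamp pairs is an assumption made for the example, as is the function name; timestamps are taken to be positive numbers on a common time axis.

```python
def first_decision_force(chain):
    """Mean over clusters of (start of first record) / (end of last record).

    chain: list of clusters; each cluster is a list of (start, end) timestamps
           (assumed positive, e.g. seconds since a common epoch).
    """
    ratios = []
    for cluster in chain:
        first_start = min(start for start, _ in cluster)
        last_end = max(end for _, end in cluster)
        ratios.append(first_start / last_end)
    return sum(ratios) / len(ratios)


# Two clusters: ratios 100/200 = 0.5 and 300/400 = 0.75, so the mean is 0.625.
chain = [[(100, 110), (150, 200)], [(300, 400)]]
print(first_decision_force(chain))
```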
Then, the variation in browsing duration while a product is being purchased is analyzed to determine whether the user browsed the related product pages continuously before purchasing or browsed them randomly and intermittently; this yields the second decision force. The decision-making power is clearly stronger when the purchase follows random, intermittent browsing. The second decision force is:
$$D_2 = \frac{1}{N}\sum_{i=1}^{N}\frac{1}{m_i}\sum_{j=1}^{m_i}\left(T_{i,j}-\bar{T}_i\right)^2$$
where $D_2$ represents the second decision force; $N$ represents the number of clusters in the preference chain; $m_i$ represents the number of browsing records in the i-th cluster; $T_{i,j}$ represents the duration of the j-th browsing record in the i-th cluster on the preference chain; and $\bar{T}_i$ represents the mean duration of all browsing records in the i-th cluster on the preference chain.
The formula sums the variances of the browsing record durations of the clusters on the preference chain and takes their mean. The smaller this mean, the more continuously the user browsed the related product pages when purchasing, and the weaker the decision-making power; the larger the mean, that is, the larger the second decision force, the more randomly and intermittently the user browsed the related product pages when purchasing, and the stronger the decision-making power.
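The second decision force, taken here as the mean of the per-cluster variances of browsing durations, can be sketched in Python as follows. Whether the source intends the population or the sample variance cannot be confirmed from this text; the sketch assumes the population variance (`statistics.pvariance`), and the input layout and function name are illustrative.

```python
from statistics import pvariance


def second_decision_force(chain_durations):
    """Mean over clusters of the variance of browsing record durations.

    chain_durations: list of clusters; each cluster is a non-empty list of
                     browsing durations (e.g. in seconds).
    ASSUMPTION: population variance (division by the number of records).
    """
    variances = [pvariance(durations) for durations in chain_durations]
    return sum(variances) / len(variances)


# Variances are 200/3 and 0, so the mean is 100/3.
print(second_decision_force([[10, 20, 30], [5, 5]]))
```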
Finally, the diversity of the products browsed by the user each time the user purchases the products is analyzed, namely the user purchases the products after comparing a plurality of products, or purchases the products after comparing a small amount of products, if the user purchases the products after comparing the small amount of products, the decision capability is strong, and a third decision power is obtained:
Figure 589430DEST_PATH_IMAGE036
wherein the content of the first and second substances,
Figure 828782DEST_PATH_IMAGE003
representing a third decision force;
Figure 330301DEST_PATH_IMAGE004
representing the number of clusters in the preference chain; e represents a natural constant;
Figure 655104DEST_PATH_IMAGE005
a maximum style quantity of a product representing the same category of products purchased by the purchasing act;
Figure 726702DEST_PATH_IMAGE006
the ith cluster in the preference chain represents the same category of products purchased by the purchasing behavior
Figure 758244DEST_PATH_IMAGE007
Browsing and recording the quantity of the products;
Figure 758561DEST_PATH_IMAGE008
representing the total number of browsing records in the ith cluster;
Figure 803615DEST_PATH_IMAGE009
a logarithmic function with base e is shown.
The term [formula image] computes the entropy of a cluster. A large entropy indicates that the user browsed many kinds of products before buying the product associated with that cluster, so the decision power is weak; a small entropy indicates that few kinds were browsed, so the decision power is strong.
The term [formula image] normalizes the entropy to the range 0 to 1 using an exponential function with base e. The closer the third decision power is to 1, the stronger the user's decision ability when purchasing products.
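One reading of the description above is sketched below in Python: per-cluster Shannon entropy over the browsed product styles, mapped into the range 0 to 1 with a negative exponential and averaged over the clusters of the chain. The exact formula is available only as an image, so the form `exp(-entropy)`, the function names, and the input layout are assumptions made for illustration.

```python
import math
from collections import Counter


def cluster_entropy(style_ids):
    """Shannon entropy (natural log) of the product styles browsed in one cluster.

    style_ids: one entry per browsing record, naming the style browsed.
    """
    total = len(style_ids)
    counts = Counter(style_ids)
    return -sum((n / total) * math.log(n / total) for n in counts.values())


def third_decision_power(chain):
    """Mean over clusters of exp(-entropy); assumed normalization.

    chain: one list of style identifiers per cluster of the preference chain.
    A value near 1 means few styles were compared before each purchase.
    """
    if not chain or any(not cluster for cluster in chain):
        raise ValueError("every cluster needs at least one browsing record")
    return sum(math.exp(-cluster_entropy(c)) for c in chain) / len(chain)
```

Under this sketch, a cluster in which only one style was browsed contributes 1, and a cluster split evenly between two styles contributes 0.5.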
In this way, analyzing the historical data on each preference chain yields the first, second and third decision powers of that chain, which characterize, to some extent, the user's purchasing habits when buying a complete device and, subsequently, its accessory products.
S3, clustering with the user's current browsing record as the cluster center point and the browsing records within a preset time period as cluster points to obtain the cluster to be compared; determining the preference chain corresponding to the cluster to be compared according to the complete device to which the product in the current browsing record belongs; calculating the first, second and third decision powers of the cluster to be compared, and obtaining the difference degree between the cluster to be compared and its corresponding preference chain from the differences in the first, second and third decision powers between them; and analyzing the user's preference based on the difference degree and determining a product recommendation strategy.
Step S2 analyzed the user's historical data at the time of past purchases. The user's current browsing must also be analyzed against that historical data to locate the user's preference at the present moment and thereby determine the current recommendation strategy.
First, the user's current browsing record is obtained. This may be the record of the page the user is browsing now, or the user's most recent browsing record, that is, the record closest to the current moment. Clustering is then performed with the current browsing record as the cluster center point and the browsing records within a preset time period as cluster points, yielding the cluster to be compared. The preset time period is a span of time measured from the user's current browsing record; its length is set by the implementer according to the actual situation, provided that the period contains enough browsing records. Once the cluster to be compared is obtained, if the product in the browsing record at its center point is an accessory product of a complete device, the preference chain of that complete device is the preference chain corresponding to the cluster to be compared.
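The selection of the cluster to be compared can be illustrated with a short Python sketch. The `BrowsingRecord` type, the function name, the `min_records` parameter, and the choice to look backwards in time from the current record are all assumptions for illustration; the patent leaves the window length and the meaning of "enough" records to the implementer.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class BrowsingRecord:
    """Hypothetical record: browsed product plus browsing start and end time."""
    product_id: str
    start: datetime
    end: datetime


def cluster_to_compare(records, window, min_records=1):
    """Return (center, members) for the cluster to be compared.

    The center is the record closest to the current moment (latest start).
    Members are all records that start within `window` before the center,
    including the center itself, sorted by start time.
    """
    if not records:
        raise ValueError("no browsing records available")
    center = max(records, key=lambda r: r.start)
    earliest = center.start - window
    members = [r for r in records if earliest <= r.start <= center.start]
    if len(members) < min_records:
        raise ValueError("preset time period contains too few browsing records")
    return center, sorted(members, key=lambda r: r.start)
```

Raising an error when the window holds too few records mirrors the requirement that the preset time period contain enough data; a caller could instead widen the window and retry.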
Next, the first, second and third decision powers of the cluster to be compared are calculated. Note that although the decision powers in step S2 are computed over multiple clusters in a preference chain, each of them is a mean over those clusters; the same method can therefore be applied to a single cluster to be compared.
Finally, the difference degree between the cluster to be compared and its corresponding preference chain is obtained from the differences in the first, second and third decision powers between them:
[formula image: difference degree]
where the formula uses the following quantities, in order of appearance:
the difference degree between preference chain a and cluster b to be compared;
the first decision powers of preference chain a and of cluster b to be compared, respectively;
the second decision powers of preference chain a and of cluster b to be compared, respectively;
the third decision powers of preference chain a and of cluster b to be compared, respectively;
three weights, one each for the first, second and third decision powers.
The difference degree takes values from 0 to 1.
The term [formula image] squares the first decision power of the historical preference chain and the first decision power of the cluster to be compared, and then takes their difference. Squaring is used to reduce the influence of uncertainty in the data (the standard deviation and norms rely on the same mathematical idea). The first decision power can be regarded as the decision period of each purchase, so this term is the difference in decision period. Together with the corresponding terms for the second and third decision powers, [formula image] and [formula image], the three differences represent how far the feature information of the user's current browsing departs from that of the historical browsing. They therefore indicate whether the user's current preference features are similar to the user's preference features in the corresponding preference chain.
[formula image] expands to [formula image]; that is, the square root of [formula image] is taken to undo the squaring.
The three weights of the first, second and third decision powers express the relative importance of the three quantities. The implementer may set them according to the specific situation; in this embodiment, all three are preferably 1.
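Because the difference-degree formula survives only as an image, the Python sketch below shows just one form that is consistent with the description: each pair of decision powers is squared, the absolute difference is taken, the square root undoes the squaring, and the weighted terms are averaged. The per-term root, the division by the sum of the weights, and the requirement that all decision powers already lie between 0 and 1 are assumptions chosen so that the result stays in the stated range of 0 to 1; they are not confirmed by the source.

```python
import math


def difference_degree(chain_powers, cluster_powers, weights=(1.0, 1.0, 1.0)):
    """Assumed form of the difference degree between a preference chain
    and a cluster to be compared.

    chain_powers, cluster_powers: (first, second, third) decision powers,
    assumed to be pre-scaled into [0, 1].
    weights: importance of each decision power; all 1 in the embodiment.
    """
    if not (len(chain_powers) == len(cluster_powers) == len(weights) == 3):
        raise ValueError("expected three decision powers and three weights")
    if any(not 0.0 <= p <= 1.0 for p in (*chain_powers, *cluster_powers)):
        raise ValueError("decision powers must be scaled into [0, 1]")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must have a positive sum")
    # Square, difference, then root: sqrt(|p^2 - q^2|) for each decision power.
    terms = [
        w * math.sqrt(abs(p * p - q * q))
        for w, p, q in zip(weights, chain_powers, cluster_powers)
    ]
    return sum(terms) / total_weight
```

Identical decision powers give 0, and the extreme case of all ones against all zeros gives 1, which matches the stated value range under these assumptions.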
A difference threshold is set; in this embodiment it is preferably 0.37. If the difference degree is less than the difference threshold, the user's current preference features are similar to the user's preference features in the corresponding preference chain. The recommendation strategy is then to recommend the products purchased by each purchasing behavior in that preference chain: the complete device to which the currently browsed accessory product belongs is identified, and the products purchased along that device's preference chain are recommended to the user. Because the user's current browsing behavior resembles the historical browsing behavior in the preference chain, this strategy makes it easier for the user to select a product to buy.
If the difference degree is greater than or equal to the difference threshold, the user's current preference features are not similar to the user's preference features in the corresponding preference chain, and the recommendation strategy is to recommend diversified products to the user.
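The threshold rule can be written as a small Python sketch. The enum and function names are hypothetical; only the comparison against the threshold and the preferred value 0.37 come from the description.

```python
from enum import Enum


class Strategy(Enum):
    """The two recommendation strategies named in the description."""
    CHAIN_PRODUCTS = "recommend products purchased along the matching preference chain"
    DIVERSIFIED = "recommend diversified products"


# Preferred threshold of this embodiment; implementers may choose another value.
DIFFERENCE_THRESHOLD = 0.37


def choose_strategy(difference, threshold=DIFFERENCE_THRESHOLD):
    """Pick the strategy from a difference degree in the range 0 to 1."""
    if difference < threshold:
        return Strategy.CHAIN_PRODUCTS
    # A difference equal to the threshold falls on the diversified side.
    return Strategy.DIVERSIFIED
```

Note that a difference degree exactly equal to the threshold selects the diversified strategy, matching the "greater than or equal to" wording.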
In addition, pushing products requires considering not only the user's decision power but also the user's purchasing power. Decision power is a deeper assessment of the user's purchasing habits and browsing psychology; purchasing power determines the price range of the pushed products, while decision power determines their uniqueness and richness. Purchasing power is analyzed with existing conventional preference modeling algorithms, combined with features of real-time browsing behavior such as click counts, page dwell time and mouse scroll-wheel speed. These algorithms are mature and are not described further here. Finally, the products on the push interface are arranged according to the user's purchasing power and decision power, to provide the user with the best shopping experience.
It should be noted that the order of the above embodiments of the present invention is for description only and does not indicate that any embodiment is better or worse than another. Specific embodiments of this specification have been described above. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims (7)

1. A big data-based user preference analysis and identification method is characterized by comprising the following steps: acquiring historical browsing records of a user, wherein the browsing records purchased after browsing in the historical browsing records are purchasing behaviors; clustering by taking each purchasing behavior as a clustering center point and taking related browsing records in a time period for completing each purchasing behavior as clustering points to obtain a plurality of clustering clusters; if the products purchased by the purchasing behaviors of the two clustering clusters are accessory products of a complete device, the indexes of the two clustering clusters are related; connecting the clustering clusters related to the index to obtain preference chains, wherein one complete device corresponds to one preference chain, and the starting point of the preference chain is the clustering cluster corresponding to the purchasing behavior of the complete device;
the average value of the ratio of the starting time of the first browsing record to the ending time of the last browsing record in all the clustering clusters on one preference chain is the first decision power of the preference chain; the mean value of the sum of the variances of the browsing records of all the clustering clusters on one preference chain is the second decision-making power of the preference chain; obtaining a third decision-making power of a preference chain based on the browsing record quantity of each product in each cluster on the preference chain and the total browsing record quantity in the cluster;
clustering by taking the current browsing record of the user as a cluster center point and the browsing records in a preset time period as cluster points to obtain a cluster to be compared; determining a preference chain corresponding to the cluster to be compared according to the complete device to which the product browsed in the current browsing record belongs; calculating the first, second and third decision powers of the cluster to be compared, and obtaining the difference degree between the cluster to be compared and the corresponding preference chain based on the differences in the first, second and third decision powers between the cluster to be compared and the corresponding preference chain; and analyzing the preference of the user based on the difference degree, and determining a product recommendation strategy.
2. The big data-based user preference analysis recognition method according to claim 1, wherein the browsing records comprise: one browsing record includes information of the browsed product and browsing time information.
3. The big data-based user preference analysis and identification method according to claim 1, wherein clustering is performed by taking each purchase behavior as a cluster center point and taking related browsing records in a period of time for completing each purchase behavior as cluster points to obtain a plurality of cluster clusters, and comprises: the time period for completing each purchasing behavior is a time period from the beginning of browsing the type of products to the end of purchasing the type of products when the type of products are purchased; the related browsing records indicate that the products browsed by each browsing record of the user in the period of completing one purchasing behavior are the same as the product purchased by the purchasing behavior.
4. The big data-based user preference analysis and identification method according to claim 1, wherein the third decision power is:
[formula image: third decision power]
wherein the formula uses the following quantities, in order of appearance:
the third decision power;
the number of clusters in the preference chain;
e, the natural constant;
the maximum number of styles of products in the same category as the product purchased by the purchasing behavior;
the number of browsing records, in the i-th cluster of the preference chain, for one indexed style of product in the same category as the purchased product;
the total number of browsing records in the i-th cluster;
the natural logarithm (base e).
5. The method as claimed in claim 1, wherein the clustering with the current browsing record of the user as the cluster center point and the browsing record in a preset time period as the cluster point to obtain the cluster to be compared comprises: the current browsing record of the user is a browsing record when the user browses at the current moment or a browsing record of the user closest to the current moment; the preset period represents a period of time from the user's current browsing history.
6. The big data-based user preference analysis and identification method according to claim 1, wherein the degree of difference is:
[formula image: difference degree]
wherein the formula uses the following quantities, in order of appearance:
the difference degree between preference chain a and cluster b to be compared;
the first decision powers of preference chain a and of cluster b to be compared, respectively;
the second decision powers of preference chain a and of cluster b to be compared, respectively;
the third decision powers of preference chain a and of cluster b to be compared, respectively;
three weights, one each for the first, second and third decision powers.
7. The big data based user preference analysis and identification method as claimed in claim 1, wherein said analyzing the user's preference based on said degree of difference and determining the product recommendation policy comprises: setting a difference threshold, if the difference degree is smaller than the difference threshold, the current preference characteristics of the user are similar to the preference characteristics of the user in the corresponding preference chain, and the recommendation strategy is to recommend products purchased by each purchasing behavior in the preference chain similar to the current preference characteristics of the user for the user; if the difference degree is larger than or equal to the difference threshold value, the current preference characteristics of the user are not similar to the preference characteristics of the user in the corresponding preference chain, and the recommendation strategy is to recommend diversified products for the user.
CN202211106761.6A 2022-09-13 2022-09-13 Big data-based user preference analysis and identification method Active CN115187344B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211106761.6A CN115187344B (en) 2022-09-13 2022-09-13 Big data-based user preference analysis and identification method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211106761.6A CN115187344B (en) 2022-09-13 2022-09-13 Big data-based user preference analysis and identification method

Publications (2)

Publication Number Publication Date
CN115187344A true CN115187344A (en) 2022-10-14
CN115187344B CN115187344B (en) 2022-12-09

Family

ID=83524332

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211106761.6A Active CN115187344B (en) 2022-09-13 2022-09-13 Big data-based user preference analysis and identification method

Country Status (1)

Country Link
CN (1) CN115187344B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115809355A (en) * 2023-02-07 2023-03-17 北京厚方科技有限公司 Data storage method for electronic commerce management system
CN115878903A (en) * 2023-02-21 2023-03-31 万链指数(青岛)信息科技有限公司 Intelligent information recommendation method based on big data
CN116720928A (en) * 2023-08-10 2023-09-08 量子数科科技有限公司 Artificial intelligence-based personalized accurate shopping guide method for electronic commerce
CN117710054A (en) * 2023-12-20 2024-03-15 塞奥斯(北京)网络科技有限公司 Intelligent display system for commodity in online mall

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2012089014A (en) * 2010-10-21 2012-05-10 Nippon Telegr & Teleph Corp <Ntt> Purchase action analysis device, purchase action analysis method, and purchase action analysis program
CN106530058A (en) * 2016-11-29 2017-03-22 广东聚联电子商务股份有限公司 Method for recommending commodities based on historical search and browse records
CN108304432A (en) * 2017-08-01 2018-07-20 腾讯科技(深圳)有限公司 Information push processing method, information push processing unit and storage medium
CN109840796A (en) * 2017-11-24 2019-06-04 财团法人工业技术研究院 Decision factor analytical equipment and decision factor analysis method
CN110489642A (en) * 2019-07-25 2019-11-22 山东大学 Method of Commodity Recommendation, system, equipment and the medium of Behavior-based control signature analysis
CN111091282A (en) * 2019-12-10 2020-05-01 焦点科技股份有限公司 Customer loyalty segmentation method based on user behavior data
CN112417302A (en) * 2020-12-08 2021-02-26 六晟信息科技(杭州)有限公司 Big data-based information content intelligent analysis recommendation processing system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
吴寒等: "基于聚类分析的网络用户画像研究", 《广东通信技术》 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115809355A (en) * 2023-02-07 2023-03-17 北京厚方科技有限公司 Data storage method for electronic commerce management system
CN115878903A (en) * 2023-02-21 2023-03-31 万链指数(青岛)信息科技有限公司 Intelligent information recommendation method based on big data
CN116720928A (en) * 2023-08-10 2023-09-08 量子数科科技有限公司 Artificial intelligence-based personalized accurate shopping guide method for electronic commerce
CN116720928B (en) * 2023-08-10 2023-10-27 量子数科科技有限公司 Artificial intelligence-based personalized accurate shopping guide method for electronic commerce
CN117710054A (en) * 2023-12-20 2024-03-15 塞奥斯(北京)网络科技有限公司 Intelligent display system for commodity in online mall

Also Published As

Publication number Publication date
CN115187344B (en) 2022-12-09

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant