CN115187344A - Big data-based user preference analysis and identification method - Google Patents
- Publication number
- CN115187344A (application CN202211106761.6A)
- Authority
- CN
- China
- Prior art keywords
- preference
- user
- browsing
- cluster
- decision
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/06—Buying, selling or leasing transactions
- G06Q30/0601—Electronic shopping [e-shopping]
- G06Q30/0631—Item recommendations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/06—Buying, selling or leasing transactions
- G06Q30/0601—Electronic shopping [e-shopping]
- G06Q30/0641—Shopping interfaces
- G06Q30/0643—Graphical representation of items or shoppers
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Business, Economics & Management (AREA)
- Accounting & Taxation (AREA)
- Finance (AREA)
- Development Economics (AREA)
- Economics (AREA)
- Marketing (AREA)
- Strategic Management (AREA)
- Physics & Mathematics (AREA)
- General Business, Economics & Management (AREA)
- General Physics & Mathematics (AREA)
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention relates to the technical field of electric digital data processing, and in particular to a big data-based user preference analysis and identification method. The method comprises: acquiring a user's historical browsing records, where a browsing record that was followed by a purchase is a purchasing behavior; obtaining a plurality of clusters and connecting the clusters whose indexes are related to obtain preference chains, where one complete equipment corresponds to one preference chain and the starting point of the preference chain is the cluster corresponding to the purchase of that complete equipment; obtaining a first, second and third decision force of each preference chain; obtaining the degree of difference between a cluster to be compared and its corresponding preference chain from the differences in the first, second and third decision forces between them; and analyzing the user's preference based on the degree of difference to determine a product recommendation strategy. The method can accurately locate the user's current preference and set a product recommendation strategy suited to the user accordingly.
Description
Technical Field
The invention relates to the technical field of electric digital data processing, in particular to a user preference analysis and identification method based on big data.
Background
When a user browses the detail page of a commodity, the system stores the commodity information on that page as a browsing record of the user, also called a browsing log. This real-time stream data is one kind of big data. It gives the website operator a commercially valuable profile of user preferences: the system pushes related products and information according to the profile, and statistical analysis of long-term browsing data helps the operator focus on customer demand and make marketing plans.
Existing user preference analysis methods that mine the historical browsing logs of retail products attend mainly to the user's personal consumption level, and rely on real-time browsing data, such as the current user's click rate and page dwell time, to predict the product type and style the user wants. When products such as industrial equipment are sold, however, the operating website carries few product types and real-time browsing data is sparse, so the real-time data may have low reference value. If the user's preference is located only from the current real-time browsing data, the result is poor, and a recommendation strategy changed according to that preference performs poorly as well.
Disclosure of Invention
In order to solve the above technical problems, an object of the present invention is to provide a method for analyzing and identifying user preferences based on big data, and the adopted technical solution is specifically as follows:
One embodiment of the invention provides a big data-based user preference analysis and identification method comprising the following steps: acquiring a user's historical browsing records, where a browsing record that was followed by a purchase is a purchasing behavior; clustering with each purchasing behavior as a cluster center and the related browsing records within the time period in which that purchasing behavior was completed as cluster points, to obtain a plurality of clusters; if the products purchased in the purchasing behaviors of two clusters are accessory products of the same complete equipment, the indexes of the two clusters are related; connecting the clusters whose indexes are related to obtain preference chains, where one complete equipment corresponds to one preference chain and the starting point of each preference chain is the cluster corresponding to the purchase of that complete equipment;
the first decision force of a preference chain is the mean, over all clusters on the chain, of the ratio of the start time of the first browsing record to the end time of the last browsing record; the second decision force of a preference chain is the mean, over all clusters on the chain, of the variance of the browsing record durations; the third decision force of a preference chain is obtained from the number of browsing records of each product in each cluster on the chain and the total number of browsing records in that cluster;
clustering with the user's current browsing record as the cluster center and the browsing records within a preset time period as cluster points, to obtain a cluster to be compared; determining the preference chain corresponding to the cluster to be compared from the complete equipment to which the currently browsed product belongs; calculating the first, second and third decision forces of the cluster to be compared, and obtaining the degree of difference between the cluster to be compared and its corresponding preference chain from the differences in those three decision forces; and analyzing the user's preference based on the degree of difference to determine a product recommendation strategy.
Preferably, each browsing record in the historical browsing records includes information on the browsed product and browsing time information.
Preferably, clustering with each purchasing behavior as a cluster center and the related browsing records within the time period in which that purchasing behavior was completed as cluster points comprises: the time period for completing a purchasing behavior runs from when the user starts browsing the type of product purchased to when the purchase of that type of product is finished; the related browsing records are the user's browsing records within that period whose browsed product is of the same type as the purchased product.
Preferably, the third decision force is:
$$D_3 = \frac{1}{n}\sum_{i=1}^{n} e^{\sum_{k=1}^{K}\frac{c_{i,k}}{m_i}\ln\frac{c_{i,k}}{m_i}}$$
where $D_3$ represents the third decision force; $n$ represents the number of clusters in the preference chain; $e$ represents the natural constant; $K$ represents the maximum number of styles among products of the same category as the purchased product; $c_{i,k}$ represents the number of browsing records, in the $i$-th cluster of the preference chain, of the $k$-th product of that category (a term with $c_{i,k}=0$ contributes zero); $m_i$ represents the total number of browsing records in the $i$-th cluster; and $\ln$ is the logarithm to base $e$.
Preferably, clustering with the user's current browsing record as the cluster center and the browsing records within a preset time period as cluster points to obtain the cluster to be compared comprises: the current browsing record is the record of the page the user is browsing at the current moment, or the user's browsing record closest to the current moment; the preset time period is a span of time measured from the user's current browsing record.
Preferably, the degree of difference is computed as follows. The formula image is not reproduced in this text; the form below follows the detailed description (each pair of decision forces is squared and differenced, the three differences are weighted, and the square root is taken), with absolute values assumed so that the root is defined:
$$R_{a,b}=\sqrt{w_1\left|D_{1,a}^{2}-D_{1,b}^{2}\right|+w_2\left|D_{2,a}^{2}-D_{2,b}^{2}\right|+w_3\left|D_{3,a}^{2}-D_{3,b}^{2}\right|}$$
where $R_{a,b}$ represents the degree of difference between preference chain $a$ and cluster to be compared $b$; $D_{1,a}$ and $D_{1,b}$ represent the first decision forces of preference chain $a$ and cluster $b$, respectively; $D_{2,a}$ and $D_{2,b}$ represent their second decision forces; $D_{3,a}$ and $D_{3,b}$ represent their third decision forces; and $w_1$, $w_2$, $w_3$ represent the weights of the first, second and third decision forces, respectively.
Preferably, analyzing the user's preference based on the degree of difference and determining the product recommendation strategy comprises: setting a difference threshold; if the degree of difference is smaller than the threshold, the user's current preference characteristics are similar to those in the corresponding preference chain, and the recommendation strategy is to recommend the products purchased in each purchasing behavior of that preference chain; if the degree of difference is greater than or equal to the threshold, the user's current preference characteristics are not similar to those in the corresponding preference chain, and the recommendation strategy is to recommend diversified products.
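The threshold rule above can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the function name `choose_strategy`, the argument names, and the way "diversified products" are supplied are all assumptions made for the example.

```python
def choose_strategy(difference, threshold, chain_products, diversified_products):
    """Pick the products to recommend from one degree-of-difference value.

    chain_products: products purchased along the matching preference chain.
    diversified_products: a hypothetical fallback list of varied products.
    """
    if difference < threshold:
        # Current behavior resembles the historical chain: reuse its products.
        return list(chain_products)
    # Current behavior deviates from history: recommend a diversified set.
    return list(diversified_products)
```

Note that a degree of difference exactly equal to the threshold falls on the "not similar" side, matching the "greater than or equal to" wording.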
The embodiment of the invention at least has the following beneficial effects:
1. The historical browsing records of each complete equipment and its accessory products are analyzed. Clustering with each purchasing behavior as a cluster center yields a plurality of clusters, and the cluster corresponding to the purchase of each complete equipment is connected with the clusters corresponding to purchases of its accessory products to obtain the preference chain of that complete equipment. The preference chain therefore contains the historical browsing and purchasing information of a complete equipment and its accessory products, so the information is comparatively comprehensive, and a preference analysis based on it can locate the user's preference comparatively comprehensively and accurately;
2. When the user's preference is analyzed from the user's current browsing records, the corresponding preference chain is found from the complete equipment to which the currently browsed product belongs as an accessory, and the first, second and third decision forces of that preference chain are calculated; the first, second and third decision forces of the current browsing are calculated from the current browsing records. The differences between the two sets of decision forces are analyzed, and the preference behind the current browsing behavior is located from those differences. Because historical data is combined and the analysis leans on the user's decision habits when purchasing products, the user's current decision force can be traced in depth and the accuracy of the preference locating result is improved.
Drawings
To illustrate the embodiments of the present invention, or the technical solutions and advantages of the prior art, more clearly, the drawings used in describing them are briefly introduced below. The drawings described below are only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of a method for analyzing and identifying user preference based on big data according to an embodiment of the present invention.
Detailed Description
To further explain the technical means adopted by the present invention to achieve its intended purpose, and their effects, the big data-based user preference analysis and identification method, together with its specific implementation, structure, features and effects, is described in detail below with reference to the accompanying drawings and preferred embodiments. In the following description, different references to "one embodiment" or "another embodiment" do not necessarily refer to the same embodiment. Furthermore, particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
The following describes a specific scheme of the big data-based user preference analysis and identification method in detail with reference to the accompanying drawings.
Example:
The main application scenario of the invention is as follows: for industrial equipment products and the websites that sell them, product categories are few and browsing volume is small, so real-time data may have low reference value. If preference is located only from a customer's current clicking and browsing behavior, random clicks can easily cover a large share of the product categories, and the preference analysis result is poor.
Referring to fig. 1, a flowchart of a big data-based user preference analysis and identification method according to an embodiment of the present invention is shown. The method includes the following steps:
Step S1: acquire the user's historical browsing records, where a browsing record that was followed by a purchase is a purchasing behavior; cluster with each purchasing behavior as a cluster center and the related browsing records within the time period in which that purchasing behavior was completed as cluster points, to obtain a plurality of clusters; if the products purchased in the purchasing behaviors of two clusters are accessory products of the same complete equipment, the indexes of the two clusters are related; connect the clusters whose indexes are related to obtain preference chains, where one complete equipment corresponds to one preference chain and the starting point of the preference chain is the cluster corresponding to the purchase of that complete equipment.
First, because log files occupy storage, traditional browsing logs cannot be retained in full; after data mining they are generally converted into real-time stream data for analysis and storage. Although the browsing record interface displays some SKU (stock keeping unit) information for the goods, there is no need to store much of it: the unique number of the SKU (such as 1234A or 1234B) is generally stored to represent the browsing record of the goods. User browsing records are temporary data that change frequently, and their volume is not large.
When mining the user's historical browsing records, the browsing records that count as purchasing behaviors need to be distinguished: a historical browsing record that was followed by a purchase is a purchasing behavior, and the rest are browse-only records. For industrial equipment products, a complete equipment is an upstream product, and its parts and accessories, collectively called the accessory products of the complete equipment, are downstream products.
Furthermore, browsing volumes differ greatly between websites selling different types of goods. Large consumer retail websites have huge daily browsing volumes, and an individual user's purchases are comparatively random (the previous purchase may have been snacks), so historical data cannot be strongly associated with the current preference and serves mostly to trace the user's purchasing power and consumption level; real-time browsing behavior is therefore emphasized when pushing product categories and styles. For websites selling industrial manufacturing products and equipment, browsing volume is small and most purchases are made by enterprises and factories, which cannot buy on impulse as individual users do. Their preferences are relatively fixed, so the analysis leans on tracing preference from historical data; the real-time data of such websites has low reference value, and current browsing behavior cannot be guaranteed to match the historical browsing logic completely.
Finally, the user's historical browsing records are classified. Clustering with each purchasing behavior as a cluster center and the related browsing records within the time period in which that purchasing behavior was completed as cluster points yields a plurality of clusters, which classifies the historical browsing records. Highly random, discrete browsing records may appear during clustering; they have low reference value and can be ignored. The time period for completing a purchasing behavior runs from when the user starts browsing products of the type to when the user stops browsing them after the purchase. Note that placing the order does not by itself mark the end of the purchase. The browsing records surrounding a purchasing behavior include not only those before the purchase: records in a period shortly after it may also belong to the user's purchase decision process. If the user keeps browsing related products after buying, the decision is not fully made; the user is still hesitating, and a return is possible. The decision period of each purchasing behavior therefore runs from the start of browsing to the end of browsing, at which point the decision, and with it the purchase, is considered complete. The time period for completing one purchasing behavior can be called one decision cycle.
The related browsing records are the user's browsing records whose browsed product is of the same product type as the purchasing behavior, that is, has the same classification number. The user is then considered to have compared various similar products of the same type and selected one of them to purchase.
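The clustering around purchasing behaviors can be sketched as follows. The description does not say how the boundaries of a decision period are detected, so this sketch assumes a simple rule: same-category records separated by more than a hypothetical `max_gap` belong to different periods. The `Record` fields, the function name and the gap rule are illustrative assumptions, not part of the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    start: float      # browsing start time, hours from an arbitrary origin
    duration: float   # time spent on the page, in hours
    product: str      # SKU-like product number
    category: str     # classification number of the product
    purchased: bool = False


def cluster_by_purchase(records, max_gap=72.0):
    """Group same-category records into decision periods (assumed gap rule)."""
    by_category = {}
    for r in sorted(records, key=lambda r: r.start):
        by_category.setdefault(r.category, []).append(r)

    clusters = []
    for recs in by_category.values():
        session = [recs[0]]
        sessions = [session]
        for prev, cur in zip(recs, recs[1:]):
            if cur.start - prev.start > max_gap:
                session = [cur]            # a long pause starts a new period
                sessions.append(session)
            else:
                session.append(cur)
        for s in sessions:
            centers = [r for r in s if r.purchased]
            if centers:                    # periods without a purchase are ignored
                clusters.append({"center": centers[0], "records": s})
    return clusters
```

Records browsed shortly after the purchase stay in the same cluster, in line with the point that the decision period ends when browsing stops rather than when the order is placed.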
After the clusters are obtained, the product purchased in each cluster's purchasing behavior is used as the index of that cluster, and related clusters are connected according to the correlation between their indexes to obtain a preference chain. The indexes of two clusters are related if the products purchased in their purchasing behaviors are accessory products of the same complete equipment. The starting point of the preference chain is the cluster corresponding to the purchase of the complete equipment; the products purchased in the other clusters on the chain are accessory products of that equipment, and the clusters are arranged and connected in order of purchase time. The preference chain can also be regarded as a set whose elements are the clusters corresponding to purchases of the accessory products of a complete equipment: the first element is the cluster corresponding to the purchase of the complete equipment itself, and the remaining elements are arranged in time order.
This uses the concept of a cluster index: the browsing records clustered around a purchasing behavior form the analysis data set of a single consumption behavior. A cluster index is unique, and the purchased product serves as the unique index entry of that data set, so that when historical data is retrieved and analyzed, finding the index entry locates the data. The chain structure of the preference chain has the same property: a complete preference chain can be regarded as one cluster that is continuous in time order, with the consumption data set centered on the purchase of the complete equipment as the unique index of the chain. When subsequent browsing records are analyzed, a batch of new data is compared for similarity with the clusters in the preference chain to judge whether it can be placed on that chain.
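As a sketch of how clusters might be linked into preference chains, the following assumes each cluster is reduced to its purchased product and purchase time, and that a lookup table `equipment_of` (a hypothetical name) maps every product to the complete equipment it belongs to, with a complete equipment mapping to itself. None of these structures are prescribed by the description.

```python
def build_preference_chains(clusters, equipment_of):
    """Return {equipment: [cluster, ...]} with the equipment purchase first.

    clusters: list of dicts with keys "product" and "time" (purchase time).
    equipment_of: dict mapping a product to its complete equipment.
    """
    grouped = {}
    for c in clusters:
        equipment = equipment_of.get(c["product"])
        if equipment is not None:
            grouped.setdefault(equipment, []).append(c)

    chains = {}
    for equipment, members in grouped.items():
        heads = [c for c in members if c["product"] == equipment]
        if not heads:
            continue  # no purchase of the complete equipment: no chain start
        head = min(heads, key=lambda c: c["time"])
        tail = sorted((c for c in members if c is not head),
                      key=lambda c: c["time"])
        chains[equipment] = [head] + tail  # accessories follow in time order
    return chains
```

The purchased product plays the role of the cluster's index here, and the equipment identifier plays the role of the chain's unique index.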
Step S2: the first decision force of a preference chain is the mean, over all clusters on the chain, of the ratio of the start time of the first browsing record to the end time of the last browsing record; the second decision force of a preference chain is the mean, over all clusters on the chain, of the variance of the browsing record durations; the third decision force of a preference chain is obtained from the number of browsing records of each product in each cluster on the chain and the total number of browsing records in that cluster.
First, each cluster on the preference chain, that is, each purchasing behavior, has several related browsing records. Analyzing these records reveals information hidden in the historical data: the user's decision cycle and decision force when purchasing a product, and the number of product types compared before purchasing. These serve as references for analyzing the user's current browsing records.
Further, a first decision force is obtained by analyzing the decision cycle of the user's purchases:
$$D_1 = \frac{1}{n}\sum_{i=1}^{n}\frac{t_i^{\text{start}}}{t_i^{\text{end}}}$$
where $D_1$ represents the first decision force of a preference chain; $t_i^{\text{start}}$ represents the start time of the first browsing record in the $i$-th cluster of the preference chain; $t_i^{\text{end}}$ represents the end time of the last browsing record in the $i$-th cluster; and $n$ represents the number of clusters in the preference chain. The ratio $t_i^{\text{start}}/t_i^{\text{end}}$ characterizes the decision cycle of that purchasing behavior. Its value lies between 0 and 1; the closer it is to 1, the shorter the decision cycle and the stronger the user's decision-making when buying the product. $D_1$ averages the decision cycles in the historical records; the larger its value, the stronger the user's decision-making in historical purchases, that is, the stronger the first decision force.
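A minimal sketch of the first decision force, computed as the mean start-to-end ratio described above. It assumes each cluster is given as a pair of positive timestamps measured from a common origin (for example Unix seconds); that representation and the function name are assumptions for the example.

```python
def first_decision_force(cluster_spans):
    """Mean of start/end ratios over the clusters of one preference chain.

    cluster_spans: list of (start_of_first_record, end_of_last_record)
    pairs, both positive and measured from the same time origin.
    """
    ratios = [start / end for start, end in cluster_spans]
    return sum(ratios) / len(ratios)
```

Because the ratio depends on the chosen time origin, values are only comparable between clusters whose timestamps share the same origin.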
Next, the variation in browsing duration during a purchase is analyzed to determine whether the user reached the purchase by browsing related product pages continuously or by browsing them randomly and intermittently, which gives the second decision force. A purchase reached through random, intermittent browsing reflects stronger decision-making. The second decision force is:
$$D_2 = \frac{1}{n}\sum_{i=1}^{n}\frac{1}{m_i}\sum_{j=1}^{m_i}\left(T_{i,j}-\bar{T}_i\right)^{2}$$
where $D_2$ represents the second decision force; $n$ represents the number of clusters in the preference chain; $m_i$ represents the number of browsing records in the $i$-th cluster; $T_{i,j}$ represents the duration of the $j$-th browsing record in the $i$-th cluster on the preference chain; and $\bar{T}_i$ represents the mean duration of all browsing records in the $i$-th cluster. $D_2$ sums the variances of the browsing durations of the clusters on the preference chain and takes the mean. The smaller the value, the more continuously the user browsed related product pages when purchasing, and the weaker the decision force; the larger the value, that is, the larger the second decision force, the more randomly and intermittently the user browsed related product pages when purchasing, and the stronger the decision force.
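The second decision force can be sketched with the standard library. The description says only "variance", so dividing by the number of records (population variance, `statistics.pvariance`) is an assumption of this example, as are the function name and the input layout.

```python
from statistics import pvariance


def second_decision_force(chain_durations):
    """Mean per-cluster variance of browsing durations for one chain.

    chain_durations: list of clusters, each a list of record durations.
    Population variance (divide by the record count) is assumed.
    """
    variances = [pvariance(durations) for durations in chain_durations]
    return sum(variances) / len(variances)
```

A cluster whose records all have the same duration contributes zero, which pulls the chain's value down toward the "continuous browsing" end.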
Finally, the diversity of products browsed before each purchase is analyzed, that is, whether the user bought after comparing many products or only a few. Buying after comparing only a few products indicates strong decision-making. This gives the third decision force:
$$D_3 = \frac{1}{n}\sum_{i=1}^{n} e^{\sum_{k=1}^{K}\frac{c_{i,k}}{m_i}\ln\frac{c_{i,k}}{m_i}}$$
where $D_3$ represents the third decision force; $n$ represents the number of clusters in the preference chain; $e$ represents the natural constant; $K$ represents the maximum number of styles among products of the same category as the purchased product; $c_{i,k}$ represents the number of browsing records, in the $i$-th cluster of the preference chain, of the $k$-th product of that category (a term with $c_{i,k}=0$ contributes zero); $m_i$ represents the total number of browsing records in the $i$-th cluster; and $\ln$ is the logarithm to base $e$. The quantity $-\sum_{k}\frac{c_{i,k}}{m_i}\ln\frac{c_{i,k}}{m_i}$ is an entropy. A large entropy for a cluster means that many product types were browsed before the corresponding purchase and the decision force is weak; a small entropy means that few product types were browsed and the decision force is strong. For normalization, the exponential function with base $e$ maps the negated entropy into the range 0 to 1. The closer the third decision force is to 1, the stronger the user's decision ability when purchasing the product.
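The entropy-based third decision force can be sketched as below. The sketch assumes the normalization is `exp(-entropy)`, which matches the statements that the result lies between 0 and 1 and that values near 1 mean stronger decision-making; the function name and the input layout (one product identifier per browsing record) are assumptions.

```python
import math
from collections import Counter


def third_decision_force(chain_products):
    """Mean of exp(-entropy) over the clusters of one preference chain.

    chain_products: list of clusters, each a list with the product
    identifier of every browsing record in that cluster.
    """
    values = []
    for products in chain_products:
        total = len(products)
        entropy = -sum((count / total) * math.log(count / total)
                       for count in Counter(products).values())
        values.append(math.exp(-entropy))  # assumed normalization into (0, 1]
    return sum(values) / len(values)
```

Products that were never browsed simply do not appear in the counter, which is equivalent to treating their terms as zero.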
The first, second and third decision forces of each preference chain are thus obtained by analyzing the historical data on that chain, and they represent, to a certain extent, the user's purchasing habits when buying a complete equipment and subsequently its accessory products.
Step S3: cluster with the user's current browsing record as the cluster center and the browsing records within a preset time period as cluster points, to obtain a cluster to be compared; determine the preference chain corresponding to the cluster to be compared from the complete equipment to which the currently browsed product belongs; calculate the first, second and third decision forces of the cluster to be compared, and obtain the degree of difference between the cluster to be compared and its corresponding preference chain from the differences in those three decision forces; and analyze the user's preference based on the degree of difference to determine a product recommendation strategy.
First, step S2 analyzed the historical data of the user's purchases; the user's current browsing must also be analyzed in combination with that historical data to locate the user's preference at the moment and so determine the current recommendation strategy.
The user's current browsing record is obtained. It may be the record of the page the user is browsing now, or the user's most recent browsing record, that is, the one closest to the current moment. Clustering with the current browsing record as the cluster center and the browsing records within a preset time period as cluster points yields the cluster to be compared. The preset time period is a span of time measured from the user's current browsing record; its length is set by the implementer according to the actual situation, but it must be long enough that the period contains a sufficient number of browsing records. Once the cluster to be compared is obtained, if the product browsed in its center record is an accessory product of a complete equipment, the preference chain of that complete equipment is the preference chain corresponding to the cluster to be compared.
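Building the cluster to be compared can be sketched as a time-window selection. The sketch assumes the preset period extends backwards from the most recent record, that records are `(start_time, product)` tuples, and that a hypothetical `equipment_of` table maps accessory products to their complete equipment; none of these details are fixed by the description.

```python
def cluster_to_compare(records, window, equipment_of):
    """Select recent records and look up the matching preference chain key.

    records: list of (start_time, product) tuples for one user.
    window: length of the preset period, in the same unit as start_time.
    equipment_of: dict mapping a product to its complete equipment.
    """
    center = max(records, key=lambda r: r[0])           # record closest to now
    members = [r for r in records if center[0] - r[0] <= window]
    chain_key = equipment_of.get(center[1])             # None if unknown product
    return center, members, chain_key
```

If the window is too short, `members` may hold only the center record, which is why the period has to be chosen so that enough browsing records fall inside it.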
Next, the first, second and third decision forces of the cluster to be compared are calculated. Note that the decision forces computed in step S2 are based on the multiple clusters in a preference chain but are means over those clusters; the first, second and third decision forces of a single cluster to be compared can therefore be obtained with the same calculation method used in step S2.
Finally, the degree of difference between the cluster to be compared and its corresponding preference chain is obtained from the differences in the first, second and third decision forces between them:
[formula not reproduced]
wherein the symbols of the formula denote, in order: the degree of difference between preference chain a and cluster b to be compared; the first decision forces of preference chain a and of cluster b to be compared; the second decision forces of preference chain a and of cluster b to be compared; the third decision forces of preference chain a and of cluster b to be compared; and the weights of the first, second and third decision forces, with a stated value range of 0 to 1.
The first decision force of the historical preference chain and the first decision force of the cluster to be compared are each squared and then differenced. Squaring is used to reduce the influence of uncertainty in the data (the standard deviation and norms rest on the same mathematical idea). The first decision force can be regarded as the user's decision period for each purchase of a product, so its difference is a difference in decision periods. Together with the corresponding differences for the second and third decision forces, the three terms express the degree of difference between the feature information of the products the user is currently browsing and that of the products browsed historically. This indicates whether the features of the currently browsed products match those of the historically browsed products, that is, whether the user's current preference features are similar to the user's preference features in the corresponding preference chain. A square root is then taken to restore the scale of the squared values. The weights of the first, second and third decision forces express the importance of the three quantities; the implementer can set them according to the specific situation, and in this embodiment they are preferably all 1.
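The formula itself is not legible in this text, so the following is only a sketch under an assumed form that follows the verbal description: each pair of decision forces is squared and differenced, the three terms are weighted and summed, and a square root is taken. Taking the absolute value of each difference (to keep the radicand non-negative) is an assumption, as are the function and parameter names.

```python
import math


def difference_degree(chain_forces, cluster_forces, weights=(1.0, 1.0, 1.0)):
    """Assumed form: sqrt(sum_k w_k * |a_k**2 - b_k**2|) over k = 1, 2, 3.

    chain_forces   -- (first, second, third) decision forces of preference chain a
    cluster_forces -- (first, second, third) decision forces of cluster b
    weights        -- weights of the three decision forces (all 1 in the embodiment)
    """
    if not (len(chain_forces) == len(cluster_forces) == len(weights) == 3):
        raise ValueError("expected three decision forces and three weights")
    total = 0.0
    for w, a, b in zip(weights, chain_forces, cluster_forces):
        # Square each decision force, then take the (absolute) difference.
        total += w * abs(a * a - b * b)
    # The square root restores the scale of the squared values.
    return math.sqrt(total)
```

Under this assumed form, identical decision forces give a degree of difference of 0, and the result grows as the chain and the cluster diverge.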
A difference threshold is set; in this embodiment its value is preferably 0.37. If the degree of difference is less than the difference threshold, the user's current preference features are similar to the user's preference features in the corresponding preference chain. The recommendation strategy is then to recommend the products purchased by each purchasing behavior in that preference chain: determine which complete device the currently browsed product is an accessory of, and recommend to the user the products purchased by each purchasing behavior in the preference chain corresponding to that complete device. Because the user's current browsing behavior resembles the historical browsing behavior in the preference chain, this strategy makes it easier for the user to select products to purchase.
If the degree of difference is greater than or equal to the difference threshold, the user's current preference features are not similar to the user's preference features in the corresponding preference chain, and the recommendation strategy is to recommend diversified products to the user.
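The threshold rule can be sketched as a small decision function. This is an illustration only: the enum and function names are hypothetical, and the default threshold of 0.37 is the preferred value given for this embodiment.

```python
from enum import Enum


class Strategy(Enum):
    CHAIN_PRODUCTS = "recommend products purchased along the matched preference chain"
    DIVERSIFIED = "recommend diversified products"


# Preferred value in this embodiment; an implementer may choose another.
DIFFERENCE_THRESHOLD = 0.37


def choose_strategy(difference, threshold=DIFFERENCE_THRESHOLD):
    """Pick the recommendation strategy from the degree of difference.

    Below the threshold the current preference is treated as similar to the
    preference chain; at or above it, diversified products are recommended.
    """
    if difference < threshold:
        return Strategy.CHAIN_PRODUCTS
    return Strategy.DIVERSIFIED
```

Note that a degree of difference exactly equal to the threshold falls on the diversified side, matching the "greater than or equal to" wording.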
In addition, when pushing products, both the user's decision force and the user's purchasing power need to be considered. The decision force is a deeper evaluation of the user's purchasing habits and product-browsing psychology. Purchasing power determines the price range of the pushed products, while decision force determines their uniqueness and richness. The user's purchasing power is analyzed with an existing conventional preference modeling algorithm, combined with features of the browsing behavior such as real-time click count, page dwell time and mouse scroll-wheel speed; such algorithms are mature and are not described further here. Finally, the products on the push interface are laid out according to the user's purchasing power and decision force, so as to provide the user with the best shopping experience.
It should be noted that the order of the above embodiments of the present invention is for description only and does not indicate that any embodiment is better or worse than another. Specific embodiments have been described above. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.
Claims (7)
1. A big data-based user preference analysis and identification method, characterized by comprising the following steps: acquiring historical browsing records of a user, wherein a browsing record in the historical browsing records after which the browsed product is purchased is a purchasing behavior; clustering with each purchasing behavior as a cluster center point and the related browsing records within the time period for completing that purchasing behavior as cluster points, to obtain a plurality of clusters; if the products purchased by the purchasing behaviors of two clusters are accessory products of one complete device, the two clusters are index-related; connecting the index-related clusters to obtain preference chains, wherein one complete device corresponds to one preference chain, and the starting point of the preference chain is the cluster corresponding to the purchasing behavior of the complete device;
the mean, over all clusters on a preference chain, of the ratio of the start time of the first browsing record to the end time of the last browsing record is the first decision force of the preference chain; the mean, over all clusters on a preference chain, of the sum of the variances of the browsing records is the second decision force of the preference chain; the third decision force of a preference chain is obtained based on the number of browsing records of each product in each cluster on the preference chain and the total number of browsing records in that cluster;
clustering with the user's current browsing record as the cluster center point and the browsing records within a preset time period as cluster points, to obtain a cluster to be compared; determining the preference chain corresponding to the cluster to be compared according to the complete device to which the product browsed in the current browsing record belongs; calculating the first, second and third decision forces of the cluster to be compared, and obtaining the degree of difference between the cluster to be compared and its corresponding preference chain based on the differences in the first, second and third decision forces between them; and analyzing the user's preference based on the degree of difference and determining a product recommendation strategy.
2. The big data-based user preference analysis and identification method according to claim 1, wherein each browsing record includes information on the browsed product and browsing time information.
3. The big data-based user preference analysis and identification method according to claim 1, wherein clustering with each purchasing behavior as a cluster center point and the related browsing records within the time period for completing that purchasing behavior as cluster points, to obtain a plurality of clusters, comprises: the time period for completing each purchasing behavior is the period, when a type of product is purchased, from the start of browsing that type of product to the completion of its purchase; the related browsing records are those browsing records of the user, within the time period for completing one purchasing behavior, whose browsed product is the same as the product purchased by that purchasing behavior.
4. The big data-based user preference analysis and identification method according to claim 1, wherein the third decision force is:
[formula not reproduced]
wherein the symbols of the formula denote, in order: the third decision force; the number of clusters in the preference chain; e, the natural constant; the maximum number of styles of products in the same category as the product purchased by the purchasing behavior; the number of browsing records, in the i-th cluster of the preference chain, of products in the same category as the product purchased by the purchasing behavior; the total number of browsing records in the i-th cluster; and the logarithmic function with base e.
5. The big data-based user preference analysis and identification method according to claim 1, wherein clustering with the user's current browsing record as the cluster center point and the browsing records within a preset time period as cluster points, to obtain the cluster to be compared, comprises: the current browsing record of the user is the browsing record of what the user is browsing at the current moment, or the user's browsing record closest to the current moment; the preset time period is a span of time measured from the user's current browsing record.
6. The big data-based user preference analysis and identification method according to claim 1, wherein the degree of difference is:
[formula not reproduced]
wherein the symbols of the formula denote, in order: the degree of difference between preference chain a and cluster b to be compared; the first decision forces of preference chain a and of cluster b to be compared; the second decision forces of preference chain a and of cluster b to be compared; the third decision forces of preference chain a and of cluster b to be compared; and the weights of the first, second and third decision forces.
7. The big data based user preference analysis and identification method as claimed in claim 1, wherein said analyzing the user's preference based on said degree of difference and determining the product recommendation policy comprises: setting a difference threshold, if the difference degree is smaller than the difference threshold, the current preference characteristics of the user are similar to the preference characteristics of the user in the corresponding preference chain, and the recommendation strategy is to recommend products purchased by each purchasing behavior in the preference chain similar to the current preference characteristics of the user for the user; if the difference degree is larger than or equal to the difference threshold value, the current preference characteristics of the user are not similar to the preference characteristics of the user in the corresponding preference chain, and the recommendation strategy is to recommend diversified products for the user.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211106761.6A CN115187344B (en) | 2022-09-13 | 2022-09-13 | Big data-based user preference analysis and identification method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115187344A (en) | 2022-10-14 |
CN115187344B (en) | 2022-12-09 |
Family
ID=83524332
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211106761.6A Active CN115187344B (en) | 2022-09-13 | 2022-09-13 | Big data-based user preference analysis and identification method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115187344B (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115809355A (en) * | 2023-02-07 | 2023-03-17 | 北京厚方科技有限公司 | Data storage method for electronic commerce management system |
CN115878903A (en) * | 2023-02-21 | 2023-03-31 | 万链指数(青岛)信息科技有限公司 | Intelligent information recommendation method based on big data |
CN116720928A (en) * | 2023-08-10 | 2023-09-08 | 量子数科科技有限公司 | Artificial intelligence-based personalized accurate shopping guide method for electronic commerce |
CN117710054A (en) * | 2023-12-20 | 2024-03-15 | 塞奥斯(北京)网络科技有限公司 | Intelligent display system for commodity in online mall |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2012089014A (en) * | 2010-10-21 | 2012-05-10 | Nippon Telegr & Teleph Corp <Ntt> | Purchase action analysis device, purchase action analysis method, and purchase action analysis program |
CN106530058A (en) * | 2016-11-29 | 2017-03-22 | 广东聚联电子商务股份有限公司 | Method for recommending commodities based on historical search and browse records |
CN108304432A (en) * | 2017-08-01 | 2018-07-20 | 腾讯科技(深圳)有限公司 | Information push processing method, information push processing unit and storage medium |
CN109840796A (en) * | 2017-11-24 | 2019-06-04 | 财团法人工业技术研究院 | Decision factor analytical equipment and decision factor analysis method |
CN110489642A (en) * | 2019-07-25 | 2019-11-22 | 山东大学 | Method of Commodity Recommendation, system, equipment and the medium of Behavior-based control signature analysis |
CN111091282A (en) * | 2019-12-10 | 2020-05-01 | 焦点科技股份有限公司 | Customer loyalty segmentation method based on user behavior data |
CN112417302A (en) * | 2020-12-08 | 2021-02-26 | 六晟信息科技(杭州)有限公司 | Big data-based information content intelligent analysis recommendation processing system |
Non-Patent Citations (1)
Title |
---|
吴寒等: "基于聚类分析的网络用户画像研究", 《广东通信技术》 * |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115809355A (en) * | 2023-02-07 | 2023-03-17 | 北京厚方科技有限公司 | Data storage method for electronic commerce management system |
CN115878903A (en) * | 2023-02-21 | 2023-03-31 | 万链指数(青岛)信息科技有限公司 | Intelligent information recommendation method based on big data |
CN116720928A (en) * | 2023-08-10 | 2023-09-08 | 量子数科科技有限公司 | Artificial intelligence-based personalized accurate shopping guide method for electronic commerce |
CN116720928B (en) * | 2023-08-10 | 2023-10-27 | 量子数科科技有限公司 | Artificial intelligence-based personalized accurate shopping guide method for electronic commerce |
CN117710054A (en) * | 2023-12-20 | 2024-03-15 | 塞奥斯(北京)网络科技有限公司 | Intelligent display system for commodity in online mall |
Also Published As
Publication number | Publication date |
---|---|
CN115187344B (en) | 2022-12-09 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||