WO2019034087A1 - User preference determination method, apparatus, device, and storage medium - Google Patents

Info

Publication number
WO2019034087A1
Authority
WO
WIPO (PCT)
Prior art keywords
brand image
user
brand
image word
word
Prior art date
Application number
PCT/CN2018/100688
Other languages
French (fr)
Chinese (zh)
Inventor
刘朋飞
Original Assignee
北京京东尚科信息技术有限公司
北京京东世纪贸易有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京京东尚科信息技术有限公司, 北京京东世纪贸易有限公司
Publication of WO2019034087A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G06Q30/0203 Market surveys; Market polls
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241 Advertisements
    • G06Q30/0251 Targeted advertisements
    • G06Q30/0269 Targeted advertisements based on user profile or attribute
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06Q30/0631 Item recommendations

Definitions

  • the present disclosure relates to the field of big data technologies, and in particular, to a user preference determining method, a user preference determining apparatus, an electronic device, and a computer readable storage medium.
  • the determination of user preferences is mainly achieved through questionnaire surveys.
  • on the one hand, this approach is inefficient; on the other hand, since users judge mainly based on subjective factors, it is difficult to accurately obtain the brand image that a user really prefers, and thus precise marketing to the user cannot be achieved.
  • a user preference determining method including:
  • the brand image word whose membership degree is greater than the first threshold is determined as the brand image of the user preference.
  • calculating, by the fuzzy clustering, the membership of the user for each brand image word based on the shopping behavior data of the user includes:
  • the user preference determining method further includes:
  • the product information in the product information database is matched with the brand image word corresponding to each brand name.
  • the user preference determining method further includes:
  • the product information in the frequent item sets whose support is greater than the second threshold is added to the brand image vocabulary.
  • generating a frequent item set for each item of product information and the brand image word includes:
  • counting the shopping behavior data of the user corresponding to each brand image word in the brand image vocabulary includes:
  • the normalized shopping behavior data of the user corresponding to each brand image word in the brand image vocabulary is counted.
  • a user preference determining apparatus including:
  • a statistical unit configured to count the shopping behavior data of the user corresponding to each brand image word in the brand image vocabulary, wherein the brand image vocabulary stores a brand image word corresponding to each brand name;
  • a membership degree calculation unit configured to calculate, by the fuzzy clustering, the membership degree of the user for each brand image word based on the shopping behavior data of the user;
  • a user preference determining unit configured to determine the brand image word whose membership degree is greater than the first threshold as the brand image preferred by the user.
  • calculating, by the fuzzy clustering, the membership of the user for each brand image word based on the shopping behavior data of the user includes:
  • an electronic device including:
  • a memory having computer readable instructions stored thereon, the computer readable instructions being executed by the processor to implement a user preference determining method according to any of the above.
  • a computer readable storage medium having stored thereon a computer program, the computer program being executed by a processor to implement the user preference determining method according to any one of the above.
  • the shopping behavior data of the user corresponding to each brand image word is counted; based on the user's shopping behavior data, the user's membership degree to each brand image word is calculated by fuzzy clustering, and the brand image words with a membership degree greater than the first threshold are determined as the brand images the user prefers.
  • FIG. 1 schematically shows a flowchart of a user preference determining method according to an exemplary embodiment of the present disclosure
  • FIG. 2 schematically illustrates an architectural diagram of a user preference determination system in accordance with an exemplary embodiment of the present disclosure
  • FIG. 3 schematically illustrates a flowchart of a user preference determination system in accordance with an exemplary embodiment of the present disclosure
  • FIG. 4 schematically shows a schematic diagram of a constructed FP-tree according to an exemplary embodiment of the present disclosure
  • FIG. 5 schematically shows a block diagram of a user preference determining apparatus according to an exemplary embodiment of the present disclosure
  • FIG. 6 schematically illustrates a block diagram of an electronic device in accordance with an exemplary embodiment of the present disclosure
  • FIG. 7 shows a schematic diagram of a computer readable storage medium in accordance with an exemplary embodiment of the present disclosure.
  • a user preference determination method is first provided.
  • the user preference determination method may include the following steps:
  • Step S110: counting the shopping behavior data of the user corresponding to each brand image word in the brand image vocabulary, wherein the brand image vocabulary stores the brand image words corresponding to each brand name;
  • Step S120: calculating, by fuzzy clustering, the membership degree of the user for each brand image word based on the shopping behavior data of the user;
  • Step S130: determining the brand image words whose membership degree is greater than the first threshold as the brand images preferred by the user.
  • the user's shopping behavior data corresponding to each brand image word is counted, so that the shopping behavior data is associated with the brand image words, which makes it easier to analyze the user's preferred brand image from the shopping behavior data.
  • based on the user's shopping behavior data, the user's membership degree to each brand image word is calculated by fuzzy clustering, and the brand image words with a membership degree greater than the first threshold are determined as the brand images the user prefers. Multiple brand images that the user prefers can thus be extracted automatically and efficiently from the shopping behavior data, giving a deeper understanding of the user's consumption psychology and preferences, facilitating precise marketing, and reducing labor costs.
  • in step S110, the shopping behavior data of the user corresponding to each brand image word in the brand image vocabulary is counted, wherein the brand image vocabulary stores the brand image words corresponding to each brand name.
  • the brand image words corresponding to each brand name may be pre-stored in the brand image vocabulary.
  • in the brand image word cold start module 210, the brand image positioning can be described by a brand expert or a business domain expert, who summarizes the abstract image concept of the brand in as few and as precise words as possible. The summarized brand image words are then entered into the brand image vocabulary. Taking computer brands as an example, the basic format of the brand image vocabulary can be as shown in Table 1 below:
  • each item of product information in the product information database stored in the back end of the shopping website may be matched with the brand image words corresponding to each brand name, so that the user and the brand are associated through the user's shopping behavior data in the product information database. This can be implemented by the product image word matching module 220 in FIG. 2; the specific implementation process is shown in FIG. 3.
  • the e-commerce or brand-specific vocabulary in the product information, such as category words, function words, and modifier words, is mainly used and is associated and matched with the brand image words in the brand image vocabulary.
  • the brand image words in the brand image vocabulary table are matched with the data in the product information table through the "brand" field common to both tables, in preparation for the subsequent processing in the frequent item set feedback module 230 and the fuzzy clustering module 250.
  • for example, the "Lenovo" field in the brand image vocabulary table of Table 1 and the "Lenovo" field in the product information table of Table 2 can be used to match the "Lenovo" brand image words collected in the brand image word cold start module 210 with the "Lenovo" product information in the product information table, so that the user can be associated with the "Lenovo" brand through the user's shopping behavior data in the product information table.
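The matching on the shared "brand" field can be sketched as a simple join. This is only an illustrative sketch: the contents of Tables 1 and 2 are not reproduced in the text, so every row, the field names `image_word`, `product_id`, and `category_word`, and the function `match_by_brand` are hypothetical.

```python
from collections import defaultdict

# Hypothetical rows; the contents of Tables 1 and 2 are not reproduced in the text.
brand_image_vocabulary = [
    {"brand": "Lenovo", "image_word": "reliable"},
    {"brand": "Lenovo", "image_word": "business"},
    {"brand": "BrandX", "image_word": "fashionable"},
]
product_info = [
    {"product_id": "p1", "brand": "Lenovo", "category_word": "laptop"},
    {"product_id": "p2", "brand": "BrandX", "category_word": "tablet"},
    {"product_id": "p3", "brand": "BrandY", "category_word": "phone"},
]


def match_by_brand(vocabulary, products):
    """Join the two tables on their shared "brand" field."""
    words_by_brand = defaultdict(list)
    for row in vocabulary:
        words_by_brand[row["brand"]].append(row["image_word"])
    matched = []
    for product in products:
        # Products whose brand has no brand image word yet are skipped.
        for word in words_by_brand.get(product["brand"], []):
            matched.append({**product, "image_word": word})
    return matched


matched_rows = match_by_brand(brand_image_vocabulary, product_info)
```

Each matched row carries both the product fields and one brand image word, so any order that references the product can later be attributed to that brand image word.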
  • the matched product information and brand image words may then be processed.
  • for example, the shopping behavior features of each user with respect to the brand image words, such as the number of purchases, the purchase amount, and the order amount, are counted at user granularity and, after normalization, input to the fuzzy clustering module for the fuzzy clustering calculation. If the user has shopping behavior on n brand image words (for instance, orders on those n brand image words), the shopping behavior features of the user on the n brand image words may be constructed, as shown in Table 3 below:
  • m users and n brand image words may form an m*n matrix.
  • the m*n matrix may be input to the fuzzy clustering module 250 for clustering.
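A minimal sketch of how the m*n matrix might be assembled from per-order records, assuming a single feature (purchase amount). Table 3 is not reproduced in the text, so the records and the helper `build_feature_matrix` are hypothetical.

```python
# Hypothetical order records: (user, brand image word, purchase amount).
orders = [
    ("u1", "reliable", 100.0),
    ("u1", "reliable", 50.0),
    ("u1", "fashionable", 20.0),
    ("u2", "fashionable", 300.0),
]


def build_feature_matrix(records):
    """Aggregate records into an m*n matrix (m users, n brand image words)."""
    users = sorted({user for user, _, _ in records})
    words = sorted({word for _, word, _ in records})
    user_index = {user: i for i, user in enumerate(users)}
    word_index = {word: j for j, word in enumerate(words)}
    matrix = [[0.0] * len(words) for _ in users]
    for user, word, amount in records:
        matrix[user_index[user]][word_index[word]] += amount
    return users, words, matrix


users, words, feature_matrix = build_feature_matrix(orders)
```

Row i holds the features of user i and column j corresponds to brand image word j; a user with no behavior on a word keeps the value 0.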
  • the shopping behavior feature indicators may be normalized to ensure that their dimensions are uniform.
  • the data normalization is a mean normalization: the data is normalized using the mean and standard deviation of the original data. The mean measures the central tendency of the data, and its calculation formula is:
  • x1 to xn are the original data that needs to be normalized, and n is the number of brand image words.
  • the standard deviation std measures the degree of dispersion of the data.
  • its calculation formula is:
  • Xold is the data that needs to be normalized
  • Xnew is the data after normalization
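The formulas themselves are not reproduced in the text. A minimal sketch of the usual mean/standard-deviation normalization consistent with the definitions above is mean = (x1 + ... + xn) / n and Xnew = (Xold - mean) / std. Using the population standard deviation (dividing by n rather than n - 1) is an assumption, as is the function name `standardize`.

```python
from statistics import mean, pstdev


def standardize(values):
    """Return (x - mean) / std for each x; std is the population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: divide by n, not n - 1
    if sigma == 0:
        # All values identical: there is no dispersion to scale by.
        return [0.0 for _ in values]
    return [(x - mu) / sigma for x in values]


# Hypothetical purchase amounts of one user over n = 4 brand image words.
normalized = standardize([2.0, 4.0, 4.0, 6.0])
```

After this transformation the values have zero mean and unit standard deviation, so features measured in different units (counts, amounts) become comparable.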
  • the user's shopping behavior features are not limited to the number of purchases, the purchase amount, and the order quantity.
  • the shopping behavior features may also include, for example, the number of additions to the shopping cart and the number of times items are added to favorites, which are also within the protection scope of the present disclosure.
  • in step S120, the user's membership degree to each brand image word is calculated by fuzzy clustering based on the shopping behavior data of the user.
  • the shopping behavior feature indicators of each user on the brand image words, produced by the user-brand image word feature processing module, may be taken as input, and the fuzzy membership of the user on each brand image word may be calculated by fuzzy clustering.
  • fuzzy clustering is a soft partitioning algorithm.
  • each sample to be clustered can belong to multiple classes at the same time, and the membership degrees of each sample over all classes sum to 1, so that by comparing the membership degrees of a sample across the classes, one can know how strongly the sample belongs to each class.
  • the user's degree of membership to each brand may be used to determine which brand image the user prefers or likes.
  • the brand image word can be used as the cluster center of each group, and the cluster center can be the brand image word that minimizes the value function of the non-similarity index.
  • fuzzy clustering assigns each given data point a membership degree between 0 and 1 to indicate the extent to which it belongs to each group, i.e., each brand image word.
  • the membership matrix U is allowed to have elements with values between 0 and 1.
  • the memberships of each data point across all groups always sum to 1, as shown in Equation 4 below:
  • the value function (or objective function) of fuzzy clustering on n shopping behavior features takes the following general form:
  • m is a weighting exponent.
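The equations referred to here are not reproduced in the text. For orientation, the standard fuzzy c-means forms that agree with the surrounding definitions (c clusters, n data points, membership u_ij of data point j in cluster i, Euclidean distance d_ij, weighting exponent m) are given below. The symbol v_i for the cluster centers is introduced here for illustration, and it is an assumption that the original equations take exactly this form.

```latex
% Normalization constraint (the role played by Equation 4):
\sum_{i=1}^{c} u_{ij} = 1, \qquad \forall\, j = 1, \dots, n

% Objective (value) function, with d_{ij} = \lVert x_j - v_i \rVert :
J(U, v_1, \dots, v_c) = \sum_{i=1}^{c} \sum_{j=1}^{n} u_{ij}^{m} \, d_{ij}^{2}
```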
  • the specific process of performing fuzzy clustering on n shopping behavior features may be divided into the following three sub-modules:
  • Sub-module 1: determine the initial parameters:
  • there may be two initial parameters: one is the number of fuzzy clusters, that is, the number c of brand image words, and the other is the parameter m that controls the softness of the algorithm.
  • c may be a positive integer not greater than 20, because too many clusters may hinder interpretation and specific business applications; alternatively, a grid search may be used to find the optimal number of clusters c, which is also within the protection scope of the present disclosure.
  • the softness parameter m should generally not be too large, otherwise the clustering effect may be affected.
  • the softness parameter m may take a value between 2 and 5, or a positive integer not greater than 10, for example 2, or another suitable value; this is not specifically limited herein.
  • Sub-module 2: construct a fuzzy matrix according to the given shopping behavior data samples and the corresponding sample feature vectors, where the cluster centers i may be initialized by random selection, and the optimal solution is then obtained by stepwise iteration:
  • xj is the sample data point
  • uij is the membership degree of the sample data point j to the cluster center i.
  • Sub-module 3: determine whether the objective function has converged (if so, stop iterating and output the result):
  • dij is the Euclidean distance of the sample data point j to the cluster center i.
  • the convergence condition may be that the objective function value computed in an iteration is less than a certain threshold, or that its change relative to the previous objective function value is less than a certain threshold. If the objective function meets the convergence condition, the algorithm stops, and the membership degree of the sample data point j to the cluster center i is obtained.
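The three sub-modules can be sketched with the standard fuzzy c-means iteration. This is a sketch under stated assumptions rather than the disclosed implementation: the update formulas are not reproduced in the text, so the textbook updates are used (membership u_ij = 1 / Σ_k (d_ij / d_kj)^(2/(m-1)), center = mean of the points weighted by u_ij^m), and the function name, default parameters, and sample data are hypothetical.

```python
import math
import random


def fuzzy_c_means(points, c, m=2.0, tol=1e-6, max_iter=200, seed=0):
    """Standard fuzzy c-means (requires m > 1).

    Returns (centers, u) where u[i][j] is the membership of point j in cluster i.
    """
    rng = random.Random(seed)
    n = len(points)
    dims = len(points[0])
    centers = [list(p) for p in rng.sample(points, c)]  # random initial centers
    previous_j = None
    u = []
    for _ in range(max_iter):
        # Euclidean distance d_ij from point j to center i (floored to avoid division by zero).
        d = [[max(math.dist(center, p), 1e-12) for p in points] for center in centers]
        # Membership update: u_ij = 1 / sum_k (d_ij / d_kj)^(2 / (m - 1)).
        u = [[1.0 / sum((d[i][j] / d[k][j]) ** (2.0 / (m - 1.0)) for k in range(c))
              for j in range(n)] for i in range(c)]
        # Objective J = sum_i sum_j u_ij^m * d_ij^2.
        j_value = sum(u[i][j] ** m * d[i][j] ** 2 for i in range(c) for j in range(n))
        # Stop when the objective changes by less than tol between iterations.
        if previous_j is not None and abs(previous_j - j_value) < tol:
            break
        previous_j = j_value
        # Center update: mean of the points weighted by u_ij^m.
        for i in range(c):
            weights = [u[i][j] ** m for j in range(n)]
            total = sum(weights)
            centers[i] = [sum(w * p[k] for w, p in zip(weights, points)) / total
                          for k in range(dims)]
    return centers, u


# Hypothetical normalized features of five users.
samples = [(0.0, 0.0), (0.2, 0.1), (5.0, 5.0), (5.3, 4.9), (4.8, 5.2)]
centers, memberships = fuzzy_c_means(samples, c=2)
```

By construction, the memberships of each data point sum to 1 across the clusters, which is the constraint described above. The result can depend on the random initialization, so running with several seeds and keeping the lowest objective value is a common precaution.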
  • in step S130, the brand image words whose membership degree is greater than the first threshold are determined as the brand images preferred by the user.
  • the first threshold may be a value determined according to the number of brand image words and the amount of the user's shopping behavior data, or may be determined according to actual processing results obtained with the user preference determining method of the present exemplary embodiment.
  • the present disclosure does not specifically limit this value. Referring to FIG. 2 and FIG. 3, the brand image words with a membership degree greater than the first threshold may be determined as the brand images preferred by the user in the user-brand image matching module 260, and the determined preferred brand images may be output.
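Step S130 reduces to a threshold filter over one user's membership degrees. A minimal sketch, in which the membership values, the threshold of 0.25, and the function name `preferred_brand_images` are all hypothetical:

```python
def preferred_brand_images(memberships, first_threshold):
    """Return the brand image words whose membership exceeds the threshold, highest first."""
    selected = [(word, degree) for word, degree in memberships.items()
                if degree > first_threshold]
    selected.sort(key=lambda pair: pair[1], reverse=True)
    return [word for word, _ in selected]


# Hypothetical membership degrees of one user; they sum to 1 as fuzzy clustering requires.
user_memberships = {"reliable": 0.55, "fashionable": 0.30, "premium": 0.10, "budget": 0.05}
preferred = preferred_brand_images(user_memberships, first_threshold=0.25)
```

Because memberships sum to 1 across the brand image words, a threshold above 0.5 can select at most one word per user, while a lower threshold allows several preferred brand images.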
  • brand image words may also be added to the brand image vocabulary according to the information of each brand's products on the shopping platform. Therefore, the user preference determining method may further include: generating, according to the matched product information and brand image words, frequent item sets of the product information items and the brand image words; and adding the product information in the frequent item sets whose support is greater than a second threshold to the brand image vocabulary.
  • the second threshold is a value set according to the number of product information items in the product information table, the number of brand image words in the brand image vocabulary, and the computing performance of the computer.
  • in the frequent item set feedback module, the co-occurrence in product sales of the brand image words selected from the brand image vocabulary table and the matched product information items in the product information table can be calculated.
  • the co-occurring product information items that satisfy the minimum support threshold for frequent items, that is, the second threshold, are added to the brand image vocabulary table, thereby automatically expanding the brand image vocabulary and subsequently increasing the coverage of products and users.
  • each generated frequent item set includes two parts: one part comes from the brand image words in the brand image vocabulary, and the other part comes from the category words, function words, modifier words, etc. in the product information table.
  • frequent item sets are calculated for these words together with the brand image words, thereby outputting the words that co-occur with the predetermined brand image words with high frequency.
  • the product information items that satisfy the minimum support threshold for frequent items, that is, the second threshold, are used as a supplement to the initial brand image words and, over repeated iterations, are gradually added to the brand image vocabulary, thereby realizing the autonomous expansion of the brand image vocabulary.
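The feedback idea can be illustrated with a brute-force count of two-item co-occurrences. Note that this is a deliberately simpler stand-in for frequent item set mining, not FP-growth itself, and that the transactions, the threshold of 0.5, and the function name `expand_vocabulary` are hypothetical.

```python
from collections import Counter


def expand_vocabulary(transactions, image_words, second_threshold):
    """Return product words whose co-occurrence support with a brand image word exceeds the threshold."""
    pair_counts = Counter()
    for items in transactions:
        present_image_words = items & image_words
        other_words = items - image_words
        for image_word in present_image_words:
            for word in other_words:
                pair_counts[(image_word, word)] += 1
    total = len(transactions)
    # Support of a pair = fraction of transactions that contain both words.
    return {word for (_, word), count in pair_counts.items()
            if count / total > second_threshold}


# Hypothetical transactions mixing brand image words and product information words.
transactions = [
    {"reliable", "laptop", "lightweight"},
    {"reliable", "laptop", "gaming"},
    {"reliable", "laptop"},
    {"fashionable", "tablet"},
]
new_words = expand_vocabulary(
    transactions, image_words={"reliable", "fashionable"}, second_threshold=0.5
)
```

Here only "laptop" co-occurs with a brand image word in more than half of the transactions, so it is the only candidate for addition to the vocabulary.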
  • the frequent item sets of the brand image words and the category words, function words, modifier words, and the like in the product information table may be generated by the FP-growth method. The specific implementation is the construction and projection of the FP-tree composed of the brand image words and the various types of words in the product information table: for each frequent item, its conditional projection database and projected FP-tree are constructed, and this process is repeated for each newly constructed FP-tree until the new FP-tree is empty or contains only one path. When the constructed FP-tree is empty, its prefix is a frequent pattern; when it contains only one path, the frequent patterns are obtained by enumerating all possible combinations of the items on the path and connecting each to the prefix of the tree.
  • the FP-tree is a special prefix tree composed of a frequent item header table and an item prefix tree.
  • the so-called prefix tree is a data structure that stores a candidate set.
  • the branch of the tree is identified by the item name, the node of the tree stores the suffix item, and the path represents the item set.
  • the FP-tree generation method is as follows:
  • the first step is to generate a transaction item set in the following format as shown in Table 4:
  • Table 4:
    Item set id | Item set | Frequent items
    001 | {f, a, c, d, g, i, m, p} | {f, c, a, m, p}
    002 | {a, b, c, f, l, m, o} | {f, c, a, b, m}
    003 | {b, f, h, j, o, w} | {f, b}
    004 | {b, c, k, s, p} | {c, b, p}
    005 | {a, f, c, e, l, p, m, n} | {f, c, a, m, p}
  • the letters represent the brand image words in the brand image vocabulary or the product information words in the product information table.
  • each item set may represent a set of items consisting of brand image words and the category words, function words, modifier words, and so on in the product information table.
  • the second step is to calculate the frequency of occurrence of the items that satisfy the minimum support, and to arrange the frequent items in descending order of frequency to generate the reordered frequent items.
  • the frequency of occurrences in the item set is shown in Table 5 below:
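The second step can be sketched on the item sets of Table 4. The minimum support count of 3 is an assumption inferred from the "Frequent items" column of Table 4 (items occurring in fewer than three item sets do not appear there); the function name is hypothetical.

```python
from collections import Counter

# Item sets from Table 4.
ITEM_SETS = {
    "001": ["f", "a", "c", "d", "g", "i", "m", "p"],
    "002": ["a", "b", "c", "f", "l", "m", "o"],
    "003": ["b", "f", "h", "j", "o", "w"],
    "004": ["b", "c", "k", "s", "p"],
    "005": ["a", "f", "c", "e", "l", "p", "m", "n"],
}
MIN_SUPPORT_COUNT = 3  # assumption inferred from the "Frequent items" column


def frequent_items_per_set(item_sets, min_count):
    """Count items, drop infrequent ones, and sort each set by descending frequency."""
    counts = Counter(item for items in item_sets.values() for item in items)
    frequent = {item: n for item, n in counts.items() if n >= min_count}
    # Ties are broken alphabetically here; any fixed order works if applied consistently.
    rank = sorted(frequent, key=lambda item: (-frequent[item], item))
    position = {item: i for i, item in enumerate(rank)}
    ordered = {set_id: sorted((x for x in items if x in frequent), key=position.get)
               for set_id, items in item_sets.items()}
    return frequent, ordered


item_counts, ordered_sets = frequent_items_per_set(ITEM_SETS, MIN_SUPPORT_COUNT)
```

The frequent items are f and c (4 occurrences each) and a, b, m, p (3 each). Items with equal counts may be ordered differently from Table 4 (this sketch puts c before f), which does not affect correctness as long as one order is used throughout.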
  • the third step is to scan the brand image words and the commodity information vocabulary in the database again to construct the FP-tree.
  • the final result is shown in FIG. 4.
  • each solid line path can represent a set of items.
  • the FP-tree is a highly compressed structure that stores all the information needed to mine frequent item sets. After the FP-tree is generated, the frequent item sets of the various brand image words can be obtained from it, thereby expanding the brand image vocabulary.
  • since the FP-tree algorithm needs to scan the transaction database only twice and does not generate a large number of candidate sets, data processing efficiency can be improved.
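The third step, building the tree from the reordered item sets, can be sketched as follows. The input is the "Frequent items" column of Table 4; the class and function names are hypothetical, and only tree construction is shown, not the recursive mining of conditional FP-trees.

```python
class FPNode:
    """One node of the item prefix tree: an item, its count, and its children."""

    def __init__(self, item=None, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(ordered_item_sets):
    """Insert each frequency-ordered item set as a path, sharing common prefixes."""
    root = FPNode()
    header_table = {}  # item -> list of nodes holding that item
    for items in ordered_item_sets:
        node = root
        for item in items:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, parent=node)
                node.children[item] = child
                header_table.setdefault(item, []).append(child)
            child.count += 1
            node = child
    return root, header_table


# "Frequent items" column of Table 4, already sorted by descending frequency.
ordered_item_sets = [
    ["f", "c", "a", "m", "p"],
    ["f", "c", "a", "b", "m"],
    ["f", "b"],
    ["c", "b", "p"],
    ["f", "c", "a", "m", "p"],
]
root, header_table = build_fp_tree(ordered_item_sets)
```

Item sets that share a prefix share nodes, so the five item sets collapse into two branches under the root (one starting with f, one with c). The header table links all nodes of the same item, which is what later mining steps would traverse.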
  • the user preference determining apparatus may include a statistical unit 510, a membership degree calculating unit 520, and a user preference determining unit 530, wherein:
  • the statistical unit 510 is configured to collect the shopping behavior data of the user corresponding to each brand image word in the brand image vocabulary, wherein the brand image vocabulary stores a brand image word corresponding to each brand name;
  • the membership degree calculation unit 520 is configured to calculate the membership degree of the user for each brand image word by using fuzzy clustering based on the shopping behavior data of the user;
  • the user preference determining unit 530 is configured to determine the brand image word whose membership degree is greater than the first threshold as the brand image of the user preference.
  • calculating, by the fuzzy clustering, the membership degree of the user for each brand image word based on the shopping behavior data of the user may include:
  • the user preference determining apparatus may further include: a matching unit, configured to match each item information in the item information database with a brand image word corresponding to each brand name.
  • the user preference determining apparatus may further include: a frequent item set generating unit, configured to generate a frequent item set regarding each item of product information and the brand image word; and an adding unit, configured to add the product information in the frequent item sets whose support is greater than the second threshold to the brand image vocabulary.
  • generating a frequent item set about each item information and the brand image word may include:
  • counting the shopping behavior data of the user corresponding to each brand image word in the brand image vocabulary may include:
  • the normalized shopping behavior data of the user corresponding to each brand image word in the brand image vocabulary is counted.
  • although several modules or units of the user preference determining apparatus are mentioned in the above detailed description, such division is not mandatory. Indeed, in accordance with embodiments of the present disclosure, the features and functions of two or more modules or units described above may be embodied in one module or unit. Conversely, the features and functions of one of the modules or units described above may be further divided into multiple modules or units.
  • an electronic device capable of implementing the above method is also provided.
  • aspects of the present disclosure can be implemented as a system, method, or program product. Accordingly, aspects of the present disclosure may be embodied in the form of a complete hardware embodiment, a complete software embodiment (including firmware, microcode, etc.), or a combination of hardware and software aspects, which may be collectively referred to herein as a "circuit," "module," or "system."
  • an electronic device 600 according to such an embodiment of the present disclosure is described below with reference to FIG. 6. The electronic device 600 shown in FIG. 6 is merely an example and should not impose any limitation on the function and scope of use of the embodiments of the present disclosure.
  • electronic device 600 is embodied in the form of a general purpose computing device.
  • the components of the electronic device 600 may include, but are not limited to, at least one processing unit 610, at least one storage unit 620, a bus 630 connecting the different system components (including the storage unit 620 and the processing unit 610), and a display unit 640.
  • the storage unit stores program code, which can be executed by the processing unit 610, such that the processing unit 610 performs the steps according to various exemplary embodiments of the present disclosure described in the "Exemplary Method" section of the present specification.
  • for example, the processing unit 610 may perform the steps shown in FIG. 1: step S110, counting the shopping behavior data of the user corresponding to each brand image word in the brand image vocabulary, wherein the brand image vocabulary stores the brand image words corresponding to each brand name; step S120, calculating, by fuzzy clustering, the membership degree of the user for each brand image word based on the shopping behavior data of the user; and step S130, determining the brand image words whose membership degree is greater than the first threshold as the brand images preferred by the user.
  • the storage unit 620 can include a readable medium in the form of a volatile storage unit, such as a random access storage unit (RAM) 6201 and/or a cache storage unit 6202, and can further include a read only storage unit (ROM) 6203.
  • the storage unit 620 can also include a program/utility 6204 having a set (at least one) of program modules 6205, such program modules 6205 including but not limited to: an operating system, one or more applications, other program modules, and program data; each or some combination of these examples may include an implementation of a network environment.
  • bus 630 may represent one or more of several types of bus structures, including a storage unit bus or storage unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus structures.
  • the electronic device 600 can also communicate with one or more external devices 670 (e.g., a keyboard, pointing device, Bluetooth device, etc.), with one or more devices that enable the user to interact with the electronic device 600, and/or with any device (e.g., router, modem, etc.) that enables the electronic device 600 to communicate with one or more other computing devices. This communication can take place via an input/output (I/O) interface 650. Also, the electronic device 600 can communicate with one or more networks (e.g., a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through the network adapter 660. As shown, the network adapter 660 communicates with the other modules of the electronic device 600 via the bus 630.
  • the exemplary embodiments described herein may be implemented by software, or may be implemented by software in combination with necessary hardware. Therefore, the technical solution according to an embodiment of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (which may be a CD-ROM, a USB flash drive, a mobile hard disk, etc.) or on a network, and which includes a number of instructions to cause a computing device (which may be a personal computer, server, terminal device, or network device, etc.) to perform a method in accordance with an embodiment of the present disclosure.
  • a computer readable storage medium having stored thereon a program product capable of implementing the above method of the present specification.
  • various aspects of the present disclosure may also be embodied in the form of a program product comprising program code which, when the program product runs on a terminal device, causes the terminal device to perform the steps according to various exemplary embodiments of the present disclosure described in the "Exemplary Method" section of the present specification.
  • a program product 700 for implementing the above method is described in accordance with an embodiment of the present disclosure; it may employ a portable compact disc read-only memory (CD-ROM), include program code, and run on a terminal device.
  • the program product of the present disclosure is not limited thereto, and in this document, the readable storage medium may be any tangible medium that contains or stores a program that can be used by or in connection with an instruction execution system, apparatus, or device.
  • the program product can employ any combination of one or more readable media.
  • the readable medium can be a readable signal medium or a readable storage medium.
  • the readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of readable storage media include: an electrical connection with one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
  • the computer readable signal medium may include a data signal that is propagated in the baseband or as part of a carrier, carrying readable program code. Such propagated data signals can take a variety of forms including, but not limited to, electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • the readable signal medium can also be any readable medium other than a readable storage medium that can transmit, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
  • Program code embodied on a readable medium can be transmitted using any suitable medium, including but not limited to wireless, wireline, optical cable, RF, etc., or any suitable combination of the foregoing.
  • Program code for performing the operations of the present disclosure may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • the program code can execute entirely on the user computing device, partly on the user device, as a stand-alone software package, partly on the user computing device and partly on a remote computing device, or entirely on the remote computing device or server.
  • the remote computing device can be connected to the user computing device via any kind of network, including a local area network (LAN) or wide area network (WAN), or can be connected to an external computing device (for example, through the Internet using an Internet service provider).
  • the exemplary embodiments described herein may be implemented by software, or may be implemented by software in combination with necessary hardware. Therefore, the technical solution according to an embodiment of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (which may be a CD-ROM, a USB flash drive, a mobile hard disk, etc.) or on a network.
  • a number of instructions are included to cause a computing device (which may be a personal computer, server, touch terminal, or network device, etc.) to perform a method in accordance with an embodiment of the present disclosure.

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Finance (AREA)
  • Accounting & Taxation (AREA)
  • Strategic Management (AREA)
  • Development Economics (AREA)
  • Theoretical Computer Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Economics (AREA)
  • Game Theory and Decision Science (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

A user preference determination method, an apparatus, an electronic device, and a storage medium. The method comprises: collecting statistics on a user's shopping behaviour data corresponding to each brand image word in a brand image lexicon, wherein the brand image lexicon stores the brand image words corresponding to each brand name (S110); calculating, by fuzzy clustering based on the user's shopping behaviour data, the user's membership degree for each brand image word (S120); and determining each brand image word whose membership degree is greater than a first threshold as a brand image preferred by the user (S130). The method, the apparatus, the electronic device, and the storage medium can efficiently mine the brand images a user prefers from the user's shopping behaviour data, which allows deeper insight into the user's consumption psychology and preferences, facilitates precision marketing, and reduces labour costs.

Description

User preference determination method, apparatus, device, and storage medium
Technical Field
The present disclosure relates to the field of big data technology, and in particular to a user preference determination method, a user preference determination apparatus, an electronic device, and a computer-readable storage medium.
Background
With the wide application of big data technology, precision marketing has become an important way for brand owners to run marketing campaigns in e-commerce practice, and how to carry out precision marketing according to users' degree of preference for brand images has become an important research direction.
At present, user preferences are determined mainly through questionnaire surveys. This approach is inefficient because it must be carried out manually. In addition, because users answer largely on the basis of subjective factors, it is difficult to identify accurately the brand image that a user actually prefers, so precision marketing to the user cannot be achieved.
Therefore, there is a need for a user preference determination method and a user preference determination apparatus capable of solving one or more of the above problems.
It should be noted that the information disclosed in the Background section above is intended only to aid understanding of the background of the present disclosure, and may therefore include information that does not constitute prior art known to a person of ordinary skill in the art.
Summary
An object of the present disclosure is to provide a user preference determination method, a user preference determination apparatus, an electronic device, and a computer-readable storage medium that overcome, at least to some extent, one or more problems caused by the limitations and defects of the related art.
According to one aspect of the present disclosure, a user preference determination method is provided, including:
collecting statistics on a user's shopping behavior data corresponding to each brand image word in a brand image lexicon, wherein the brand image lexicon stores the brand image words corresponding to each brand name;
calculating, by fuzzy clustering based on the user's shopping behavior data, the user's membership degree for each brand image word; and
determining the brand image words whose membership degree is greater than a first threshold as the brand images preferred by the user.
In some exemplary embodiments of the present disclosure, calculating the user's membership degree for each brand image word by fuzzy clustering based on the user's shopping behavior data includes:
calculating the distance between the user's shopping behavior data and each brand image word;
calculating the user's membership degree for each brand image word based on the distance.
In some exemplary embodiments of the present disclosure, the user preference determination method further includes:
matching each item of product information in a product information database with the brand image words corresponding to each brand name.
In some exemplary embodiments of the present disclosure, the user preference determination method further includes:
generating, based on the matched product information and the brand image words, frequent itemsets of the product information and the brand image words;
adding the product information in the frequent itemsets whose support is greater than a second threshold to the brand image lexicon.
In some exemplary embodiments of the present disclosure, generating the frequent itemsets of the product information and the brand image words includes:
generating the frequent itemsets of the product information and the brand image words through the FP-growth algorithm.
In some exemplary embodiments of the present disclosure, collecting statistics on the user's shopping behavior data corresponding to each brand image word in the brand image lexicon includes:
normalizing the user's shopping behavior data;
collecting statistics on the normalized shopping behavior data of the user corresponding to each brand image word in the brand image lexicon.
According to one aspect of the present disclosure, a user preference determination apparatus is provided, including:
a statistics unit, configured to collect statistics on a user's shopping behavior data corresponding to each brand image word in a brand image lexicon, wherein the brand image lexicon stores the brand image words corresponding to each brand name;
a membership degree calculation unit, configured to calculate, by fuzzy clustering based on the user's shopping behavior data, the user's membership degree for each brand image word; and
a user preference determination unit, configured to determine the brand image words whose membership degree is greater than a first threshold as the brand images preferred by the user.
In some exemplary embodiments of the present disclosure, calculating the user's membership degree for each brand image word by fuzzy clustering based on the user's shopping behavior data includes:
calculating the distance between the user's shopping behavior data and each brand image word;
calculating the user's membership degree for each brand image word based on the distance.
According to one aspect of the present disclosure, an electronic device is provided, including:
a processor; and
a memory storing computer-readable instructions that, when executed by the processor, implement the user preference determination method according to any one of the above.
According to one aspect of the present disclosure, a computer-readable storage medium is also provided, on which a computer program is stored, the computer program implementing the user preference determination method according to any one of the above when executed by a processor.
In the user preference determination method, user preference determination apparatus, electronic device, and computer-readable storage medium of an exemplary embodiment of the present disclosure, statistics are collected on the user's shopping behavior data corresponding to each brand image word, the user's membership degree for each brand image word is calculated by fuzzy clustering based on that data, and the brand image words whose membership degree is greater than the first threshold are determined as the brand images preferred by the user.
It should be understood that the above general description and the following detailed description are merely exemplary and explanatory and do not limit the present disclosure.
Brief Description of the Drawings
The above and other features and advantages of the present disclosure will become more apparent from the detailed description of its example embodiments with reference to the accompanying drawings.
FIG. 1 schematically shows a flowchart of a user preference determination method according to an exemplary embodiment of the present disclosure;
FIG. 2 schematically shows an architecture diagram of a user preference determination system according to an exemplary embodiment of the present disclosure;
FIG. 3 schematically shows a flowchart of a user preference determination system according to an exemplary embodiment of the present disclosure;
FIG. 4 schematically shows a constructed FP-tree according to an exemplary embodiment of the present disclosure;
FIG. 5 schematically shows a block diagram of a user preference determination apparatus according to an exemplary embodiment of the present disclosure;
FIG. 6 schematically shows a block diagram of an electronic device according to an exemplary embodiment of the present disclosure;
FIG. 7 shows a schematic diagram of a computer-readable storage medium according to an exemplary embodiment of the present disclosure.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. However, example embodiments can be implemented in many forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that the present disclosure will be thorough and complete and will fully convey the concepts of the example embodiments to those skilled in the art. The same reference numerals in the drawings denote the same or similar parts, so repeated descriptions of them are omitted.
Furthermore, the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, many specific details are provided to give a thorough understanding of the embodiments of the present disclosure. However, those skilled in the art will appreciate that the technical solutions of the present disclosure may be practiced without one or more of the specific details, or with other methods, components, materials, devices, steps, and so on. In other instances, well-known structures, methods, devices, implementations, materials, or operations are not shown or described in detail, to avoid obscuring aspects of the present disclosure.
The block diagrams shown in the drawings are merely functional entities and do not necessarily correspond to physically separate entities. That is, these functional entities may be implemented in software, or these functional entities or parts of them may be implemented in one or more software-hardened modules, or in different networks and/or processor devices and/or microcontroller devices.
This example embodiment first provides a user preference determination method. Referring to FIG. 1, the user preference determination method may include the following steps:
Step S110. Collect statistics on a user's shopping behavior data corresponding to each brand image word in a brand image lexicon, wherein the brand image lexicon stores the brand image words corresponding to each brand name;
Step S120. Calculate, by fuzzy clustering based on the user's shopping behavior data, the user's membership degree for each brand image word; and
Step S130. Determine the brand image words whose membership degree is greater than a first threshold as the brand images preferred by the user.
The user preference determination method of this example embodiment has two benefits. First, collecting statistics on the user's shopping behavior data corresponding to each brand image word associates that data with the brand image words, which makes it possible to analyze the brand images the user prefers from the shopping behavior data. Second, calculating the user's membership degree for each brand image word by fuzzy clustering, and determining the brand image words whose membership degree is greater than the first threshold as the user's preferred brand images, automatically and efficiently mines multiple preferred brand images from the shopping behavior data. This allows deeper insight into the user's consumption psychology and preferences, facilitates precision marketing, and reduces labor costs.
The user preference determination method of this example embodiment is described in further detail below.
In step S110, statistics are collected on the user's shopping behavior data corresponding to each brand image word in the brand image lexicon, wherein the brand image lexicon stores the brand image words corresponding to each brand name.
In this example embodiment, the brand image words corresponding to each brand name may be stored in the brand image lexicon in advance. Referring to FIG. 2, in the brand image word cold start module 210, brand experts or business domain experts can describe the positioning of a brand image: the experts summarize the abstract image concept of the brand in as few and as precise words as possible, and then enter the summarized brand image words into the brand image lexicon. Taking computer brands as an example, the basic format of the brand image lexicon table may be as shown in Table 1 below:
Table 1. Brand image lexicon table
Brand | Brand image words
Dell | Fashionable, business, young
Lenovo | Strength, moderation, cost-effectiveness
Asus | Performance, innovation
Further, in this example embodiment, each item of product information in the product information database stored in the back end of the shopping website can be matched with the brand image words corresponding to each brand name, so that users and brands can be associated through the users' shopping behavior data in the product information database. This can be implemented by the product image word matching module 220 in FIG. 2, and the specific implementation flow is shown in FIG. 3. This example embodiment mainly uses e-commerce or brand-specific vocabulary in the product information, such as category words, function words, and modifiers, and matches these words against the brand image words in the brand image lexicon. As shown in Table 2 below, the shared "brand" field of the brand image lexicon table and the product information table is used to match the brand image words in the brand image lexicon with the data in the product information table, in preparation for the subsequent processing in the frequent itemset feedback module 230 and the fuzzy clustering module 250.
Table 2. Product information table
Figure PCTCN2018100688-appb-000001
In this example implementation, through the "Lenovo" field in the brand image lexicon table of Table 1 and the "Lenovo" field in the product information table of Table 2, the brand image words for "Lenovo" collected in the brand image word cold start module 210 can be matched to the "Lenovo" product information in the product information table, so that users can be associated with the "Lenovo" brand through their shopping behavior data in the product information table.
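The brand-field match described above can be sketched as a simple lookup join. This is a minimal illustration, not the patent's implementation: the lexicon entries come from Table 1, while the product rows, the `sku` and `title` fields, and the function name are hypothetical, since the content of Table 2 is not available as text.

```python
# Lexicon entries follow Table 1; everything else here is hypothetical sample data.
BRAND_IMAGE_LEXICON = {
    "Dell": ["fashionable", "business", "young"],
    "Lenovo": ["strength", "moderation", "cost-effectiveness"],
    "Asus": ["performance", "innovation"],
}

# Hypothetical product rows standing in for the product information table.
PRODUCTS = [
    {"sku": "A1", "brand": "Lenovo", "title": "Lenovo laptop 14 inch"},
    {"sku": "B2", "brand": "Dell", "title": "Dell business notebook"},
    {"sku": "C3", "brand": "Acme", "title": "Acme tablet"},
]


def match_products_to_image_words(products, lexicon):
    """Attach brand image words to each product by joining on the shared brand field."""
    matched = []
    for product in products:
        words = lexicon.get(product["brand"])
        if words is None:
            continue  # brand has no entry in the lexicon yet, so nothing to match
        matched.append({**product, "image_words": list(words)})
    return matched


matched = match_products_to_image_words(PRODUCTS, BRAND_IMAGE_LEXICON)
```

Products whose brand is missing from the lexicon are skipped here; a real system might instead queue them for expert review.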
Further, in this example embodiment, referring to FIG. 2 and FIG. 3, the user-brand image word feature processing module 240 can process the matched product information and the brand image words. For example, it can compute, for each individual user, shopping behavior features on the brand image words, such as the number of purchases, the purchase amount, and the order quantity, which are normalized and then input to the fuzzy clustering module for fuzzy clustering. For example, if a user has shopping behavior on n brand image words, such as an order quantity on each of the n brand image words, shopping behavior features on the n brand image words corresponding to that user can be constructed, as shown in Table 3 below:
Table 3. Shopping behavior features on the n brand image words corresponding to a user
Figure PCTCN2018100688-appb-000002
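One way to assemble such per-user feature rows is to count orders per brand image word. The sketch below is illustrative only: the order records, the choice of order quantity as the single feature, and the function name are assumptions, not details taken from the source.

```python
from collections import defaultdict

# Lexicon entries follow Table 1.
LEXICON = {
    "Dell": ["fashionable", "business", "young"],
    "Lenovo": ["strength", "moderation", "cost-effectiveness"],
    "Asus": ["performance", "innovation"],
}

# Hypothetical order records as (user_id, brand) pairs.
ORDERS = [("u1", "Lenovo"), ("u1", "Lenovo"), ("u1", "Dell"), ("u2", "Asus")]


def build_feature_matrix(orders, lexicon):
    """Return (users, image_words, matrix); matrix[r][k] is user r's order count on word k."""
    image_words = sorted({word for words in lexicon.values() for word in words})
    column = {word: k for k, word in enumerate(image_words)}
    counts = defaultdict(lambda: [0] * len(image_words))
    for user, brand in orders:
        # Each order on a brand counts once toward every image word of that brand.
        for word in lexicon.get(brand, []):
            counts[user][column[word]] += 1
    users = sorted(counts)
    return users, image_words, [counts[user] for user in users]


users, image_words, matrix = build_feature_matrix(ORDERS, LEXICON)
```

With m users and n image words this yields an m-by-n matrix, one row per user, which is the shape the fuzzy clustering step expects.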
In this example embodiment, if there are m users in the product information database, the m users and the n brand image words form an m*n matrix. Referring to FIG. 2 and FIG. 3, this m*n matrix can be input to the fuzzy clustering module 250 for clustering. In addition, if the shopping behavior features on the n brand image words corresponding to a user are measured on inconsistent scales, the feature indicators can be normalized so that their dimensions are consistent. In this example embodiment, the normalization is mean standardization, which standardizes the data based on the mean and standard deviation of the raw data. The mean measures the central tendency of the data and is calculated as:
Figure PCTCN2018100688-appb-000003
where x1 to xn are the raw data to be normalized and n is the number of brand image words.
The standard deviation std measures the dispersion of the data and is calculated as:
Figure PCTCN2018100688-appb-000004
The standardization formula is:
Figure PCTCN2018100688-appb-000005
where Xold is the data to be normalized and Xnew is the data after normalization.
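A minimal sketch of this mean standardization follows, computing Xnew = (Xold - mean) / std. The use of the population standard deviation (dividing by n rather than n - 1) is an assumption, because the formula image does not show which form is intended; the sample values are made up.

```python
from statistics import fmean, pstdev


def standardize(values):
    """Return (x - mean) / std for each value.

    Assumption: std is the population standard deviation (divide by n).
    """
    mean = fmean(values)
    std = pstdev(values)
    if std == 0:
        # A constant feature carries no spread, so map it to zeros instead of dividing by 0.
        return [0.0 for _ in values]
    return [(x - mean) / std for x in values]


# Hypothetical purchase amounts of one user across three brand image words.
purchase_amounts = [100.0, 200.0, 300.0]
scaled = standardize(purchase_amounts)
```

After standardization the values have zero mean and unit standard deviation, so features such as purchase amount and order quantity become comparable.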
It should be noted that, in this example embodiment, the user's shopping behavior features are not limited to the number of purchases, the purchase amount, and the order quantity. For example, the shopping behavior features may also include the number of items added to the shopping cart and the number of items added to favorites, which are likewise within the scope of protection of the present disclosure.
Next, in step S120, the user's membership degree for each brand image word is calculated by fuzzy clustering based on the user's shopping behavior data.
In this example embodiment, the shopping behavior feature indicators of each user on the brand image words, as produced by the user-brand image word feature processing module, can be used as input to characterize the user's behavior on the brand image words, and fuzzy clustering then yields the user's fuzzy membership degree for each brand image word.
Fuzzy clustering differs from traditional hard clustering in that it is a soft partitioning algorithm. In fuzzy clustering, each sample to be clustered can belong to several classes at once, and a sample's membership degrees across all classes sum to 1. Comparing a sample's membership degrees in the classes therefore shows how strongly it belongs to, or how closely it resembles, each class. In this example embodiment, the user's membership degrees for the brands can be used to determine which brand images the user prefers or likes.
Specifically, in this example embodiment, fuzzy clustering can be implemented as follows: the n shopping behavior feature vectors xi (i=1,2,...,n) are divided into c fuzzy groups, where c may be the number of brand image words. The brand image words serve as the cluster centers of the groups, and a cluster center may be the brand image word that minimizes the cost function of the dissimilarity measure. Fuzzy clustering expresses the degree to which each given data point belongs to each group, that is, to each brand image word, by a membership degree between 0 and 1. In keeping with fuzzy partitioning, the membership matrix U is allowed to have elements with values between 0 and 1. In addition, under the normalization constraint, the membership degrees of a data point across all groups always sum to 1, as shown in Equation 4 below:
Figure PCTCN2018100688-appb-000006
The cost function (or objective function) for fuzzy clustering of the n shopping behavior features then takes the following general form:
Figure PCTCN2018100688-appb-000007
where uij takes values between 0 and 1; ci is the cluster center of fuzzy group i; dij=||ci-xj|| is the Euclidean distance between the i-th cluster center and the j-th data point; and m, which takes values from 1 to infinity, is a weighting exponent.
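Assuming the scheme follows the standard fuzzy c-means formulation, which is consistent with the symbols defined above (uij, ci, dij, and m), the normalization constraint and the objective function can be written as follows. This is a sketch under that assumption, not necessarily the exact notation of the figures.

```latex
% Normalization constraint: the memberships of each data point j sum to 1.
\sum_{i=1}^{c} u_{ij} = 1, \qquad \forall j = 1, \dots, n

% Objective function: membership-weighted sum of squared distances to the cluster centers.
J(U, c_1, \dots, c_c) = \sum_{i=1}^{c} \sum_{j=1}^{n} u_{ij}^{m} \, d_{ij}^{2},
\qquad d_{ij} = \lVert c_i - x_j \rVert
```

A larger m spreads each point's membership more evenly across the groups, which is why m is described as controlling the softness of the partition.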
Further, in this example implementation, referring to FIG. 3, the specific process of fuzzy clustering of the n shopping behavior features can be divided into the following three sub-modules:
Sub-module 1: Determine the initial parameters. In this example embodiment there may be two initial parameters: the number of fuzzy clusters, that is, the number of brand image words c, and the parameter m that controls the softness of the algorithm. In this example embodiment, c may be a positive integer no greater than 20, because too many clusters hinder interpretation and concrete business application; the optimal number of clusters c may also be found by grid search, which is likewise within the scope of protection of the present disclosure. The softness parameter m generally should not be too large, otherwise the clustering quality suffers. It may take a value between 2 and 5, or a positive integer no greater than 10 such as 2, or another suitable value; the present disclosure does not specifically limit it.
Sub-module 2: Construct the fuzzy matrix from the given shopping behavior data samples and their feature vectors. The cluster centers may be initialized by random selection, and the optimal solution is then approached iteratively as:
Figure PCTCN2018100688-appb-000008
where xj is a sample data point and uij is the membership degree of sample data point j in cluster center i.
Sub-module 3: Determine whether the objective function has converged (if so, stop iterating and output the result):
The objective function of this scheme is
Figure PCTCN2018100688-appb-000009
where dij is the Euclidean distance from sample data point j to cluster center i. The convergence condition may be that the objective function value of an iteration is less than a set threshold, or that its change relative to the previous objective function value is less than a set threshold. When the objective function meets the convergence condition, the algorithm stops and yields the membership degree of sample data point j in cluster center i.
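The three sub-modules can be sketched as a compact standard fuzzy c-means loop. This is an illustration under assumptions rather than the patent's implementation: the update rules used are the textbook ones (centers as membership-weighted means, memberships from distance ratios), and the function name, default parameters, and sample points are hypothetical.

```python
import math
import random


def fuzzy_c_means(points, c, m=2.0, tol=1e-6, max_iter=200, seed=0):
    """Standard fuzzy c-means (requires m > 1).

    Returns (centers, u), where u[i][j] is the membership of point j in cluster i.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    exponent = 2.0 / (m - 1.0)
    # Sub-module 2: initialize the cluster centers by random selection from the samples.
    centers = [list(p) for p in rng.sample(points, c)]
    previous = math.inf
    for _ in range(max_iter):
        # Euclidean distances d[i][j] between center i and point j.
        d = [[math.dist(centers[i], points[j]) for j in range(n)] for i in range(c)]
        # Membership update: u_ij = 1 / sum_k (d_ij / d_kj)^(2 / (m - 1)).
        u = [[0.0] * n for _ in range(c)]
        for j in range(n):
            zero = [i for i in range(c) if d[i][j] == 0.0]
            for i in range(c):
                if zero:
                    # The point sits exactly on one or more centers: share membership among them.
                    u[i][j] = 1.0 / len(zero) if i in zero else 0.0
                else:
                    u[i][j] = 1.0 / sum((d[i][j] / d[k][j]) ** exponent for k in range(c))
        # Sub-module 3: stop when the objective changes by less than tol.
        objective = sum(u[i][j] ** m * d[i][j] ** 2 for i in range(c) for j in range(n))
        if abs(previous - objective) < tol:
            break
        previous = objective
        # Center update: c_i = sum_j u_ij^m x_j / sum_j u_ij^m.
        for i in range(c):
            weights = [u[i][j] ** m for j in range(n)]
            total = sum(weights)
            centers[i] = [sum(w * p[k] for w, p in zip(weights, points)) / total
                          for k in range(dim)]
    return centers, u


# Hypothetical standardized feature vectors forming two well-separated groups.
points = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]]
centers, u = fuzzy_c_means(points, c=2)
```

Each column of `u` sums to 1, so `u[i][j]` can be read directly as the degree to which user j belongs to brand image word i.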
Next, in step S130, the brand image words whose membership degree is greater than the first threshold are determined as the brand images preferred by the user.
In this example embodiment, the first threshold may be a value determined from the number of brand image words and the volume of the user's shopping behavior data, or a value determined from actual results after applying the user preference determination method of this example embodiment; the present disclosure does not specifically limit it. Referring to FIG. 2 and FIG. 3, the user-brand image matching module 260 can determine the brand image words whose membership degree is greater than the first threshold as the brand images preferred by the user, and output the determined preferred brand images.
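The selection in step S130 amounts to a threshold filter over one user's membership degrees. In the sketch below, the membership values, the threshold of 0.3, and the function name are all hypothetical.

```python
def preferred_brand_images(memberships, first_threshold):
    """Return the image words whose membership exceeds the threshold, highest first."""
    chosen = [(word, degree) for word, degree in memberships.items()
              if degree > first_threshold]
    chosen.sort(key=lambda item: item[1], reverse=True)
    return [word for word, _ in chosen]


# Hypothetical membership degrees of one user; they sum to 1 as fuzzy clustering requires.
user_memberships = {
    "fashionable": 0.45,
    "business": 0.35,
    "performance": 0.15,
    "innovation": 0.05,
}
preferred = preferred_brand_images(user_memberships, first_threshold=0.3)
print(preferred)  # ['fashionable', 'business']
```

Because memberships sum to 1, a threshold above 0.5 can select at most one brand image per user, so a lower value is needed if several preferred images are wanted.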
Further, in this example embodiment, to enrich the content of the brand image lexicon, brand image words may be added to the lexicon according to the information on the branded items on the shopping platform. Accordingly, the user preference determination method may further include: generating, based on the matched item information and the brand image words, frequent itemsets of the item information and the brand image words; and adding the item information in the frequent itemsets whose support is greater than a second threshold to the brand image lexicon. In this example embodiment, the second threshold is a value set according to the number of item information entries in the item information table, the number of brand image words in the brand image lexicon, the computing performance of the computer, and the like.
Specifically, referring to FIG. 2 and FIG. 3, the frequent itemset feedback module may compute how frequently the brand image words selected from the brand image lexicon table and the matched item information entries in the item information table occur together in item sales. Co-occurring item information entries that meet the minimum support threshold for frequent items, i.e. the second threshold, are added to the brand image lexicon table, so that the brand image lexicon is expanded automatically and its subsequent coverage of items and users is extended.
In this example embodiment, a generated frequent itemset consists of two parts: one part comes from the brand image words in the brand image lexicon, and the other part comes from the category words, function words, modifiers, and the like in the item information table. Frequent itemsets are computed between each of the latter and the brand image words, and the words that co-occur frequently with the predetermined brand image words are output. Item information entries whose co-occurrence frequency meets the minimum support threshold for frequent items, i.e. the second threshold, serve as a supplement to the initial brand image words; over successive iterations they are gradually added to the brand image lexicon, so that the lexicon expands autonomously.
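One round of this lexicon expansion can be sketched as below. For brevity the sketch counts only pairwise co-occurrence (a brand image word together with one item information word) by brute force rather than running FP-growth; the function name and the sample words are hypothetical, and support is taken here as a raw transaction count.

```python
from collections import Counter


def expand_lexicon(transactions, lexicon, second_threshold):
    """Add item words that co-occur with a lexicon word often enough.

    transactions is an iterable of word collections, lexicon is a set of
    brand image words. A word is added when its co-occurrence count with
    some lexicon word is greater than second_threshold.
    """
    support = Counter()
    for items in transactions:
        items = set(items)
        for seed in items & lexicon:
            for word in items - lexicon:
                support[(seed, word)] += 1
    new_words = {word for (_, word), n in support.items() if n > second_threshold}
    return lexicon | new_words
```

Calling the function repeatedly, each time with the lexicon returned by the previous call, corresponds to the gradual iteration described above.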
Further, in this example embodiment, the frequent itemsets of the brand image words and the category words, function words, modifiers, and the like in the item information table may be generated by the FP-growth method. The procedure is as follows: the construction and projection of the FP-tree formed from the brand image words and the various words in the item information table are iterated repeatedly. For each frequent item, its conditional projection database and projected FP-tree are constructed. This process is repeated for each newly constructed FP-tree until the new FP-tree is empty or contains only a single path. When the constructed FP-tree is empty, its prefix is a frequent pattern; when it contains only a single path, the frequent patterns are obtained by enumerating all possible combinations of the items on that path and concatenating each with the prefix of the tree.
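The recursive project-and-grow idea can be illustrated with the simplified sketch below. It is not a full FP-growth implementation: it projects plain transaction lists instead of compressed FP-trees and omits the single-path shortcut, so it shows only the pattern-growth recursion over conditional databases. The function name and the lexicographic item ordering are assumptions made for the example.

```python
from collections import Counter


def mine_frequent_itemsets(transactions, min_support, prefix=()):
    """Return {itemset (sorted tuple): support count} by pattern growth."""
    counts = Counter(item for t in transactions for item in set(t))
    frequent = sorted(i for i, n in counts.items() if n >= min_support)
    result = {}
    for item in frequent:
        pattern = prefix + (item,)
        result[pattern] = counts[item]
        # Conditional (projected) database for `item`: the transactions that
        # contain it, restricted to items sorting after it so that every
        # itemset is generated exactly once.
        projected = [[i for i in t if i > item] for t in transactions if item in t]
        result.update(mine_frequent_itemsets(projected, min_support, pattern))
    return result
```

Each recursive call plays the role of mining one conditional database; the recursion ends when a projected database contains no frequent item.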
Referring to FIG. 4, an FP-tree is a special prefix tree consisting of a frequent item header table and an item prefix tree. A prefix tree is a data structure for storing candidate itemsets: the branches of the tree are labeled with item names, the nodes of the tree store suffix items, and a path represents an itemset. The FP-tree is generated as follows:
Step 1: generate the transaction itemsets, in the format shown in Table 4 below:
Table 4. Transaction itemsets and frequent items
Itemset ID    Itemset                Frequent items
001           {f,a,c,d,g,i,m,p}      {f,c,a,m,p}
002           {a,b,c,f,l,m,o}        {f,c,a,b,m}
003           {b,f,h,j,o,w}          {f,b}
004           {b,c,k,s,p}            {c,b,p}
005           {a,f,c,e,l,p,m,n}      {f,c,a,m,p}
In this example embodiment, for simplicity, letters stand for the brand image words in the brand image lexicon or the item information words in the item information table, and an itemset may be one formed from the brand image words and the category words, function words, modifiers, and the like in the item information table. With a minimum support of 3, the brand image words and item information words in the database are first scanned, the occurrence frequency of each individual item is computed, and the items whose frequency is not less than the minimum support are retained. Accordingly, the rightmost column of Table 4 retains only the items that occur at least 3 times.
Step 2: compute the occurrence frequency of the items that meet the minimum support, and sort the frequent items in descending order of frequency to produce the reordered frequent items. The frequency of each item in the itemsets is shown in Table 5 below:
Table 5. Frequency of each item in the itemsets
Item    Frequency
f       4
c       4
a       3
b       3
m       3
p       3
As shown in Table 5, among the frequent items obtained from Table 4 above, the letter f occurs 4 times, the letter c occurs 4 times, the letter a occurs 3 times, and so on. Sorting the letters in descending order of frequency yields the reordered frequent items shown in the rightmost column of Table 4.
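Steps 1 and 2 can be reproduced on the data of Table 4 with the sketch below. The function names are illustrative, and the tie-break order for items of equal frequency (f, c, a, b, m, p) is simply copied from Table 5, since the text does not state how ties are ordered.

```python
from collections import Counter

TRANSACTIONS = [
    ["f", "a", "c", "d", "g", "i", "m", "p"],
    ["a", "b", "c", "f", "l", "m", "o"],
    ["b", "f", "h", "j", "o", "w"],
    ["b", "c", "k", "s", "p"],
    ["a", "f", "c", "e", "l", "p", "m", "n"],
]
MIN_SUPPORT = 3
TIE_ORDER = "fcabmp"  # assumption: order of equal-frequency items, from Table 5


def frequent_counts(transactions, min_support):
    """First scan: count each item once per transaction, keep frequent ones."""
    counts = Counter(item for t in transactions for item in set(t))
    return {item: n for item, n in counts.items() if n >= min_support}


def reorder(transaction, counts):
    """Drop infrequent items and sort the rest by descending frequency."""
    kept = [item for item in transaction if item in counts]
    return sorted(kept, key=lambda item: (-counts[item], TIE_ORDER.index(item)))


counts = frequent_counts(TRANSACTIONS, MIN_SUPPORT)
for t in TRANSACTIONS:
    print(reorder(t, counts))
```

The printed lists correspond to the rightmost column of Table 4, and counts corresponds to Table 5.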
Step 3: scan the brand image words and item information words in the database again and build the FP-tree; the final result is shown in FIG. 4. In FIG. 4, each solid-line path may represent an itemset. The FP-tree is a highly compressed structure that stores all the information needed for mining frequent itemsets. Once the FP-tree has been generated, the frequent itemsets of each brand image word can be obtained from it, so that the brand image lexicon can be expanded. In addition, because the FP-tree algorithm needs only two scans of the transaction database and does not generate a large number of candidate sets, it can improve data processing efficiency.
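A minimal sketch of the tree construction in step 3 follows, using the reordered frequent items from the rightmost column of Table 4 as input. The class and function names are illustrative, and the header table is modeled as a plain dictionary of node lists rather than the linked node chain used in typical FP-growth implementations.

```python
class FPNode:
    """One node of the item prefix tree."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(ordered_transactions):
    """Insert each reordered transaction as a path, sharing common prefixes."""
    root = FPNode(None)
    header = {}  # frequent item header table: item -> nodes carrying that item
    for transaction in ordered_transactions:
        node = root
        for item in transaction:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, parent=node)
                node.children[item] = child
                header.setdefault(item, []).append(child)
            child.count += 1
            node = child
    return root, header


ORDERED = [
    ["f", "c", "a", "m", "p"],
    ["f", "c", "a", "b", "m"],
    ["f", "b"],
    ["c", "b", "p"],
    ["f", "c", "a", "m", "p"],
]
root, header = build_fp_tree(ORDERED)
```

Because transactions 001, 002, and 005 share the prefix f, c, a, they are stored on one branch with counts incremented along it, which is where the compression comes from.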
It should be noted that although the steps of the method of the present disclosure are depicted in the drawings in a particular order, this does not require or imply that the steps must be performed in that order, or that all of the steps shown must be performed, to achieve the desired result. Additionally or alternatively, some steps may be omitted, multiple steps may be combined into one step, and/or one step may be split into multiple steps.
In addition, this example embodiment also provides a user preference determination apparatus. Referring to FIG. 5, the user preference determination apparatus may include a statistics unit 510, a membership degree calculation unit 520, and a user preference determination unit 530, wherein:
the statistics unit 510 is configured to collect statistics on the shopping behavior data of a user corresponding to each brand image word in a brand image lexicon, wherein the brand image lexicon stores brand image words corresponding to each brand name;
the membership degree calculation unit 520 is configured to calculate, by fuzzy clustering and based on the shopping behavior data of the user, the membership degree of the user with respect to each brand image word; and
the user preference determination unit 530 is configured to determine the brand image words whose membership degree is greater than a first threshold as the brand image preferred by the user.
Further, in this example embodiment, calculating, by fuzzy clustering and based on the shopping behavior data of the user, the membership degree of the user with respect to each brand image word may include:
calculating the distance between the shopping behavior data of the user and each brand image word; and
calculating the membership degree of the user with respect to each brand image word based on the distance.
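The two sub-steps can be sketched as below, assuming the standard fuzzy c-means membership update and assuming that each brand image word is represented by a cluster center in the same feature space as the user's behavior vector. The fuzzifier m, the function names, and the example vectors are all assumptions for illustration and may differ from the formula used in the disclosure.

```python
import math


def euclidean(x, c):
    """Euclidean distance between a behavior vector and a cluster center."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, c)))


def memberships(x, centers, m=2.0):
    """Standard FCM update: u_i = 1 / sum_k (d_i / d_k)^(2 / (m - 1))."""
    d = [euclidean(x, c) for c in centers]
    # A point lying exactly on one or more centers belongs fully to them.
    if any(dist == 0.0 for dist in d):
        hits = [1.0 if dist == 0.0 else 0.0 for dist in d]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** power for d_k in d) for d_i in d]


# Hypothetical user vector and two brand image centers.
print(memberships([0.0, 0.0], [[1.0, 0.0], [2.0, 0.0]]))
```

The memberships sum to 1 across brand image words, and a closer center receives a larger share.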
In addition, in this example embodiment, the user preference determination apparatus may further include a matching unit configured to match each item of item information in the item information database against the brand image words corresponding to each brand name.
In addition, in this example embodiment, the user preference determination apparatus may further include: a frequent itemset generation unit configured to generate frequent itemsets of the item information and the brand image words; and an adding unit configured to add the item information in the frequent itemsets whose support is greater than a second threshold to the brand image lexicon.
Further, in this example embodiment, generating the frequent itemsets of the item information and the brand image words may include:
generating the frequent itemsets of the item information and the brand image words by an FP-growth operation.
Further, in this example embodiment, collecting statistics on the shopping behavior data of the user corresponding to each brand image word in the brand image lexicon may include:
normalizing the shopping behavior data of the user; and
collecting statistics on the normalized shopping behavior data of the user corresponding to each brand image word in the brand image lexicon.
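The normalization step could, for instance, be a min-max rescaling of each behavior measure to the range [0, 1]. The passage above does not say which normalization is used, so the choice of min-max scaling, the function name, and the sample counts below are assumptions.

```python
def min_max_normalize(values):
    """Rescale a list of behavior counts to [0, 1]; constant input maps to 0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical click counts of one user for three brand image words.
print(min_max_normalize([2, 4, 10]))
```

Normalizing before clustering keeps a behavior type with large raw counts (such as clicks) from dominating the Euclidean distances over one with small counts (such as purchases).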
Since the functional modules of the user preference determination apparatus 400 of the example embodiment of the present disclosure correspond to the steps of the example embodiment of the user preference determination method, they are not described again here.
It should be noted that although several modules or units of the user preference determination apparatus are mentioned in the detailed description above, this division is not mandatory. In fact, according to embodiments of the present disclosure, the features and functions of two or more modules or units described above may be embodied in a single module or unit. Conversely, the features and functions of one module or unit described above may be further divided among multiple modules or units.
In an exemplary embodiment of the present disclosure, an electronic device capable of implementing the above method is also provided.
Those skilled in the art will understand that aspects of the present disclosure may be implemented as a system, a method, or a program product. Accordingly, aspects of the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, microcode, etc.), or an embodiment combining hardware and software aspects, all of which may be referred to herein as a "circuit", "module", or "system".
An electronic device 600 according to this embodiment of the present disclosure is described below with reference to FIG. 6. The electronic device 600 shown in FIG. 6 is merely an example and should not impose any limitation on the functions or scope of use of the embodiments of the present disclosure.
As shown in FIG. 6, the electronic device 600 takes the form of a general-purpose computing device. The components of the electronic device 600 may include, but are not limited to: at least one processing unit 610, at least one storage unit 620, a bus 630 connecting the different system components (including the storage unit 620 and the processing unit 610), and a display unit 640.
The storage unit stores program code that can be executed by the processing unit 610, so that the processing unit 610 performs the steps according to the various exemplary embodiments of the present disclosure described in the "Exemplary Method" section of this specification. For example, the processing unit 610 may perform the steps shown in FIG. 1: step S110, collecting statistics on the shopping behavior data of a user corresponding to each brand image word in a brand image lexicon, wherein the brand image lexicon stores brand image words corresponding to each brand name; step S120, calculating, by fuzzy clustering and based on the shopping behavior data of the user, the membership degree of the user with respect to each brand image word; and step S130, determining the brand image words whose membership degree is greater than a first threshold as the brand image preferred by the user.
The storage unit 620 may include readable media in the form of volatile storage units, such as a random access memory (RAM) unit 6201 and/or a cache storage unit 6202, and may further include a read-only memory (ROM) unit 6203.
The storage unit 620 may also include a program/utility 6204 having a set of (at least one) program modules 6205. Such program modules 6205 include, but are not limited to, an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination of them, may include an implementation of a network environment.
The bus 630 may represent one or more of several types of bus structures, including a storage unit bus or storage unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus structures.
The electronic device 600 may also communicate with one or more external devices 670 (e.g., a keyboard, a pointing device, a Bluetooth device), with one or more devices that enable a user to interact with the electronic device 600, and/or with any device (e.g., a router, a modem) that enables the electronic device 600 to communicate with one or more other computing devices. Such communication may take place through an input/output (I/O) interface 650. The electronic device 600 may also communicate with one or more networks (e.g., a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through a network adapter 660. As shown, the network adapter 660 communicates with the other modules of the electronic device 600 via the bus 630. It should be understood that, although not shown in the figure, other hardware and/or software modules may be used in conjunction with the electronic device 600, including but not limited to microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems.
From the description of the above embodiments, those skilled in the art will readily understand that the example embodiments described here may be implemented in software, or in software combined with the necessary hardware. Accordingly, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored on a non-volatile storage medium (such as a CD-ROM, a USB flash drive, or a portable hard disk) or on a network, and which includes a number of instructions that cause a computing device (such as a personal computer, a server, a terminal apparatus, or a network device) to perform the method according to the embodiments of the present disclosure.
In an exemplary embodiment of the present disclosure, a computer-readable storage medium is also provided, on which a program product capable of implementing the above method of this specification is stored. In some possible embodiments, aspects of the present disclosure may also be implemented in the form of a program product that includes program code; when the program product runs on a terminal device, the program code causes the terminal device to perform the steps according to the various exemplary embodiments of the present disclosure described in the "Exemplary Method" section of this specification.
Referring to FIG. 7, a program product 700 for implementing the above method according to an embodiment of the present disclosure is described. It may take the form of a portable compact disc read-only memory (CD-ROM) that includes program code, and may run on a terminal device such as a personal computer. However, the program product of the present disclosure is not limited to this. In this document, a readable storage medium may be any tangible medium that contains or stores a program that can be used by, or in combination with, an instruction execution system, apparatus, or device.
The program product may use any combination of one or more readable media. A readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example but without limitation, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of these. More specific examples (a non-exhaustive list) of readable storage media include: an electrical connection with one or more wires, a portable disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of these.
A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, in which readable program code is carried. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of these. A readable signal medium may also be any readable medium other than a readable storage medium that can send, propagate, or transmit a program for use by, or in combination with, an instruction execution system, apparatus, or device.
The program code contained on a readable medium may be transmitted over any suitable medium, including but not limited to wireless, wired, optical cable, RF, or any suitable combination of these.
Program code for carrying out the operations of the present disclosure may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on a remote computing device or server. Where a remote computing device is involved, the remote computing device may be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (for example, through the Internet using an Internet service provider).
In addition, the above drawings are merely schematic illustrations of the processing included in the method according to the exemplary embodiments of the present disclosure and are not intended to be limiting. It is readily understood that the processing shown in the drawings does not indicate or restrict the chronological order of these processes. It is also readily understood that these processes may be performed, for example, synchronously or asynchronously in multiple modules.
From the description of the above embodiments, those skilled in the art will readily understand that the example embodiments described here may be implemented in software, or in software combined with the necessary hardware. Accordingly, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored on a non-volatile storage medium (such as a CD-ROM, a USB flash drive, or a portable hard disk) or on a network, and which includes a number of instructions that cause a computing device (such as a personal computer, a server, a touch terminal, or a network device) to perform the method according to the embodiments of the present disclosure.
Other embodiments of the present disclosure will readily occur to those skilled in the art upon consideration of the specification and practice of the invention disclosed here. This application is intended to cover any variations, uses, or adaptations of the present disclosure that follow its general principles and include common general knowledge or customary technical means in the art not disclosed in the present disclosure. The specification and embodiments are to be regarded as exemplary only, with the true scope and spirit of the present disclosure being indicated by the claims.
It should be understood that the present disclosure is not limited to the precise structures described above and shown in the drawings, and that various modifications and changes may be made without departing from its scope. The scope of the present disclosure is limited only by the appended claims.

Claims (10)

  1. A user preference determination method, comprising:
    collecting statistics on the shopping behavior data of a user corresponding to each brand image word in a brand image lexicon, wherein the brand image lexicon stores brand image words corresponding to each brand name;
    calculating, by fuzzy clustering and based on the shopping behavior data of the user, the membership degree of the user with respect to each brand image word; and
    determining the brand image words whose membership degree is greater than a first threshold as the brand image preferred by the user.
  2. The user preference determination method according to claim 1, wherein calculating, by fuzzy clustering and based on the shopping behavior data of the user, the membership degree of the user with respect to each brand image word comprises:
    calculating the distance between the shopping behavior data of the user and each brand image word; and
    calculating the membership degree of the user with respect to each brand image word based on the distance.
  3. The user preference determination method according to claim 1 or 2, wherein the user preference determination method further comprises:
    matching each item of item information in an item information database against the brand image words corresponding to each brand name.
  4. The user preference determination method according to claim 3, wherein the user preference determination method further comprises:
    generating, based on the matched item information and the brand image words, frequent itemsets of the item information and the brand image words; and
    adding the item information in the frequent itemsets whose support is greater than a second threshold to the brand image lexicon.
  5. The user preference determination method according to claim 3, wherein generating the frequent itemsets of the item information and the brand image words comprises:
    generating the frequent itemsets of the item information and the brand image words by an FP-growth operation.
  6. The user preference determination method according to claim 1, wherein collecting statistics on the shopping behavior data of the user corresponding to each brand image word in the brand image lexicon comprises:
    normalizing the shopping behavior data of the user; and
    collecting statistics on the normalized shopping behavior data of the user corresponding to each brand image word in the brand image lexicon.
  7. A user preference determination apparatus, comprising:
    a statistics unit configured to collect statistics on the shopping behavior data of a user corresponding to each brand image word in a brand image lexicon, wherein the brand image lexicon stores brand image words corresponding to each brand name;
    a membership degree calculation unit configured to calculate, by fuzzy clustering and based on the shopping behavior data of the user, the membership degree of the user with respect to each brand image word; and
    a user preference determination unit configured to determine the brand image words whose membership degree is greater than a first threshold as the brand image preferred by the user.
  8. The user preference determination apparatus according to claim 7, wherein calculating, by fuzzy clustering and based on the shopping behavior data of the user, the membership degree of the user with respect to each brand image word comprises:
    calculating the distance between the shopping behavior data of the user and each brand image word; and
    calculating the membership degree of the user with respect to each brand image word based on the distance.
  9. An electronic device, comprising:
    a processor; and
    a memory storing computer-readable instructions which, when executed by the processor, implement the user preference determination method according to any one of claims 1 to 6.
  10. A computer-readable storage medium storing a computer program which, when executed by a processor, implements the user preference determination method according to any one of claims 1 to 6.
PCT/CN2018/100688 2017-08-16 2018-08-15 User preference determination method, apparatus, device, and storage medium WO2019034087A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710699894.1A CN107507028B (en) 2017-08-16 2017-08-16 User preference determination method, device, equipment and storage medium
CN201710699894.1 2017-08-16

Publications (1)

Publication Number Publication Date
WO2019034087A1 (en) 2019-02-21

Family

ID=60690818

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/100688 WO2019034087A1 (en) 2017-08-16 2018-08-15 User preference determination method, apparatus, device, and storage medium

Country Status (2)

Country Link
CN (1) CN107507028B (en)
WO (1) WO2019034087A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112907311A (en) * 2019-12-04 2021-06-04 北京沃东天骏信息技术有限公司 Article identification method and device, computer storage medium and electronic equipment
CN113128211A (en) * 2020-01-14 2021-07-16 北京京东振世信息技术有限公司 Article classification method and device

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107507028B (en) * 2017-08-16 2021-11-30 北京京东尚科信息技术有限公司 User preference determination method, device, equipment and storage medium
CN108009897A (en) * 2017-12-25 2018-05-08 北京中关村科金技术有限公司 A kind of real-time recommendation method of commodity, system and readable storage medium storing program for executing
CN110110033A (en) * 2018-01-29 2019-08-09 清华大学 Information extracting method, device, computer equipment and storage medium
CN109359246A (en) * 2018-12-07 2019-02-19 上海宏原信息科技有限公司 A kind of brand cohesion calculation method based on forum user speech
CN109658195B (en) * 2018-12-24 2020-12-25 北京亿百分科技有限公司 Commodity display decision method
CN110413852A (en) * 2019-07-19 2019-11-05 深圳市元征科技股份有限公司 A kind of information-pushing method, device, equipment and medium
CN111401409B (en) * 2020-02-28 2023-04-18 创新奇智(青岛)科技有限公司 Commodity brand feature acquisition method, sales volume prediction method, device and electronic equipment
CN113553493A (en) * 2020-04-24 2021-10-26 哈尔滨工业大学 Service selection method based on demand service probability matrix
CN117972218A (en) * 2024-03-28 2024-05-03 山东怡然信息技术有限公司 User demand accurate matching method and system based on big data

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060242011A1 (en) * 2005-04-21 2006-10-26 International Business Machines Corporation Method and system for automatic, customer-specific purchasing preferences and patterns of complementary products
US8160918B1 (en) * 2005-01-14 2012-04-17 Comscore, Inc. Method and apparatus for determining brand preference
CN103745379A (en) * 2013-12-23 2014-04-23 苏州亚安智能科技有限公司 Method for realizing customer-demand-based orientationi electronic-commerce platform
CN106682968A (en) * 2017-01-10 2017-05-17 北京三快在线科技有限公司 Navigation menu generation method and device, and server
CN107507028A (en) * 2017-08-16 2017-12-22 北京京东尚科信息技术有限公司 User preference determines method, apparatus, equipment and storage medium

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
NZ572036A (en) * 2008-10-15 2010-03-26 Nikola Kirilov Kasabov Data analysis and predictive systems and related methodologies
CN102750647A (en) * 2012-06-29 2012-10-24 南京大学 Merchant recommendation method based on transaction network
CN103810251B (en) * 2014-01-21 2017-05-10 南京财经大学 Method and device for extracting text
CN104298778B (en) * 2014-11-04 2017-07-04 北京科技大学 A kind of Forecasting Methodology and system of the steel rolling product quality based on correlation rule tree
JP6334455B2 (en) * 2015-04-23 2018-05-30 日本電信電話株式会社 Clustering apparatus, method, and program
CN106294462B (en) * 2015-06-01 2019-09-17 Tcl集团股份有限公司 It is a kind of to obtain the method and system for recommending video
CN105488597B (en) * 2015-12-28 2020-01-07 中国民航信息网络股份有限公司 Passenger destination prediction method and system
CN105975608A (en) * 2016-05-17 2016-09-28 北京京东尚科信息技术有限公司 Data mining method and device
CN106844787B (en) * 2017-03-16 2020-06-16 四川大学 Recommendation method for searching target users and matching target products for automobile industry

Also Published As

Publication number Publication date
CN107507028B (en) 2021-11-30
CN107507028A (en) 2017-12-22

Similar Documents

Publication Publication Date Title
WO2019034087A1 (en) User preference determination method, apparatus, device, and storage medium
US20210027146A1 (en) Method and apparatus for determining interest of user for information item
Morgan-Lopez et al. Predicting age groups of Twitter users based on language and metadata features
Wang et al. A sentiment‐enhanced hybrid recommender system for movie recommendation: a big data analytics framework
WO2023097929A1 (en) Knowledge graph recommendation method and system based on improved kgat model
TWI612488B (en) Computer device and method for predicting market demand of commodities
Cheng et al. Unsupervised sentiment analysis with signed social networks
WO2022126971A1 (en) Density-based text clustering method and apparatus, device, and storage medium
EP2866421B1 (en) Method and apparatus for identifying a same user in multiple social networks
CN111259263B (en) Article recommendation method and device, computer equipment and storage medium
US20150052098A1 (en) Contextually propagating semantic knowledge over large datasets
Qiang et al. Short text clustering based on Pitman-Yor process mixture model
Wang et al. Attribute embedding: Learning hierarchical representations of product attributes from consumer reviews
CN111429161B (en) Feature extraction method, feature extraction device, storage medium and electronic equipment
US11037073B1 (en) Data analysis system using artificial intelligence
Tang et al. Propagation-based sentiment analysis for microblogging data
US20220367051A1 (en) Methods and systems for estimating causal effects from knowledge graphs
CN114548321A (en) Self-supervision public opinion comment viewpoint object classification method based on comparative learning
Zou et al. Collaborative community-specific microblog sentiment analysis via multi-task learning
WO2023129339A1 (en) Extracting and classifying entities from digital content items
CN115375361A (en) Method and device for selecting target population for online advertisement delivery and electronic equipment
Zhou et al. Empirical likelihood inferences for varying coefficient partially nonlinear models
WO2020252925A1 (en) Method and apparatus for searching user feature group for optimized user feature, electronic device, and computer nonvolatile readable storage medium
Zhang et al. Evaluation for machine tool components importance based on improved LeaderRank
CN115905648A (en) Gaussian mixture model-based user group and financial user group analysis method and device

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 18846688; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: PCT application non-entry in European phase (Ref document number: 18846688; Country of ref document: EP; Kind code of ref document: A1)