CN111737537B - POI recommendation method, device and medium based on graph database

Info

Publication number
CN111737537B
Authority
CN
China
Prior art keywords
poi
user
target
data
graph database
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010703591.4A
Other languages
Chinese (zh)
Other versions
CN111737537A (en)
Inventor
吴敏
陈鹏伟
叶小萌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Ouruozhi Technology Co ltd
Original Assignee
Hangzhou Ouruozhi Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Ouruozhi Technology Co ltd
Priority to CN202010703591.4A
Publication of CN111737537A
Application granted
Publication of CN111737537B
Priority to US17/325,245 (US11500934B2)
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G06F16/9024 Graphs; Linked lists
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9537 Spatial or temporal dependent retrieval, e.g. spatiotemporal queries

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a POI recommendation method based on a graph database, relates to the field of big data analysis, and aims to realize personalized POI recommendation. The method comprises the following steps: acquiring data, and importing the data into a pre-constructed graph database; receiving ID and positioning information of a target user; acquiring a feature vector of the target user, a friend intimacy coefficient of the target user and a target POI set from the graph database according to the ID and the positioning information of the target user; according to the target POI set, acquiring a feature matrix of a user set with related POI consumption records from the graph database; calculating the similarity between the feature vector of the target user and the feature matrix; determining a target POI according to the similarity result obtained by calculation; and returning the target POI. The invention also discloses an electronic device and a computer storage medium.

Description

POI recommendation method, device and medium based on graph database
Technical Field
The invention relates to the field of big data analysis, and in particular to a POI recommendation method, device and medium based on a graph database.
Background
POI is an abbreviation of Point of Interest. In the field of local lifestyle services, it generally refers to a place of interest to a consumer, such as a scenic spot, movie theater or restaurant. A location-based personalized POI recommendation service not only provides convenience for consumers but can also bring substantial commercial benefits to merchants.
Current local-life POI recommendation systems generally perform recommendation by offline computation in a data warehouse: a pre-computed recommendation result is obtained through minute-level or hour-level processing using, for example, Hive SQL or Spark jobs, and the result is then stored in a cache database such as Redis for real-time query. This approach has two main problems:
1. Since the geographic location from which a request originates cannot be known in advance, recommendations must be pre-computed for the vast majority of possible POIs. In practice, however, each consumer can reach only a small fraction of POIs in a given day or week. The approach therefore wastes a large amount of computing resources.
2. Personalized recommendation cannot be achieved. A personalized recommendation algorithm of the 'thousands of people, thousands of faces' kind (a different result for every user) needs multidimensional data such as real-time location, the consumer's social relationships, merchant POIs, historical consumption records and historical merchant reviews. Information such as social relationships, historical consumption records and preferences forms a large-scale topological relationship network, and the recommendation algorithm must traverse this network as a graph to obtain data. However, the storage format of a data warehouse is not suited to retrieving data that depends on graph traversal, so traditional recommendation algorithms cannot use high-value business data such as real-time location and the relationship network, and the recommendation results lack personalization.
Disclosure of Invention
In order to overcome the defects of the prior art, one of the purposes of the invention is to provide a POI recommendation method based on a graph database, so as to realize personalized recommendation and meet the requirements of different consumers.
One of the purposes of the invention is realized by adopting the following technical scheme:
a POI recommendation method based on a graph database comprises the following steps:
acquiring data, and importing the data into a pre-constructed graph database, wherein the data comprises POI data, user data and an incidence relation between the POI data and the user data;
receiving ID and positioning information of a target user;
acquiring a feature vector of the target user, a friend intimacy coefficient of the target user and a target POI set from the graph database according to the ID and the positioning information of the target user, wherein the feature vector of the target user is formed by personal data of the target user;
acquiring a feature matrix of a user set with related POI consumption records from the graph database according to the target POI set, wherein the related POI consumption records are consumption records associated with any POI in the target POI set, and the feature matrix is composed of personal data of each user in the user set;
calculating the similarity between the feature vector of the target user and the feature matrix according to the friend intimacy coefficient;
determining a target POI according to the similarity result obtained by calculation;
and returning the target POI.
Further, after receiving the ID and the positioning information of the target user, the method includes the following steps:
sending the ID and the positioning information of the target user to the graph database;
receiving a target POI returned by the graph database;
returning the target POI;
wherein, the graph database is used for solving the target POI and returning the target POI, and comprises the following steps:
acquiring a feature vector of the target user, a friend intimacy coefficient of the target user and a target POI set according to the ID and the positioning information of the target user, wherein the feature vector of the target user is formed by personal data of the target user;
acquiring a feature matrix of a user set with related POI consumption records according to the target POI set, wherein the related POI consumption records are consumption records associated with any POI in the target POI set, and the feature matrix is composed of personal data of each user in the user set;
calculating the similarity between the feature vector of the target user and the feature matrix according to the friend intimacy coefficient;
determining a target POI according to the similarity result obtained by calculation;
and returning the target POI.
Further, acquiring data and importing the data into a pre-constructed graph database, comprising:
acquiring data in a data warehouse, wherein the data comprises POI data, user data and an incidence relation between the POI data and the user data;
processing the data into graph elements of the graph database;
establishing a batch task, and converting the graph elements into a file format of the graph database in batch to obtain a graph element file;
and pulling the graph element file from the data warehouse to the graph database server corresponding to the graph database, and injecting the graph element file into the graph database.
Further, according to the ID and the positioning information of the target user, obtaining a feature vector of the target user, a friend affinity coefficient of the target user, and a target POI set, specifically including the following steps:
calling a user-defined function, and querying personal data of the target user and a friend intimacy coefficient of the target user according to the ID of the target user, wherein the personal data of the target user comprises user information and preference information;
composing the personal data of the target user into a feature vector of the target user;
and calling a user-defined function, inquiring POI (point of interest) in a preset distance range according to the positioning information, and forming the POI in the preset distance range into the target POI set.
Further, calling a user-defined function, and acquiring a feature matrix of the user set with the relevant POI consumption records according to the target POI set, wherein the method comprises the following steps:
calling a user-defined function, querying users with related POI consumption records according to the target POI set, and forming the users with the related POI consumption records into a user set, wherein the related POI is any POI in the target POI set;
sequentially acquiring personal data of each user in the user set, and sequentially forming a first feature vector by the personal data of each user, wherein the personal data of each user comprises personal information and preference information;
and forming a feature matrix by using the first feature vectors of the users in the user set to obtain the feature matrix of the user set, wherein each column of elements in the feature matrix is the first feature vector of one user in the user set.
Further, forming a user set by the users with the relevant POI consumption records, and processing the user set, including:
and carrying out duplicate removal processing on each user in the user set.
Further, according to the friend affinity coefficient, calculating the similarity between the feature vector of the target user and the feature matrix, including:
calculating the similarity between the feature vector of the target user and the feature matrix according to a similarity calculation function, wherein the similarity calculation formula comprises:
[Formula image not reproduced: similarity calculation formula]
wherein X represents the feature vector of the target user, Y represents the feature matrix, the coefficient denoted by [symbol image not reproduced] is the intimacy coefficient between X and Y, x_i is the i-th element of X, and y_i is the i-th column of Y.
Further, determining a target POI according to the calculated similarity result, including:
sorting the similarity results in a descending order;
selecting the users with the first n similarity ranks;
and querying the POI which is most consumed by the n front ranked users to obtain the target POI.
It is a second object of the present invention to provide an electronic device for carrying out the first object, comprising a processor, a storage medium, and a computer program stored in the storage medium which, when executed by the processor, implements the above graph database-based POI recommendation method.
It is a further object of the present invention to provide a computer-readable storage medium for carrying out the first object, having a computer program stored thereon which, when executed by a processor, implements the above graph database-based POI recommendation method.
Compared with the prior art, the invention has the beneficial effects that:
according to the invention, personal multidimensional data of the user is stored through the graph database, POI recommendation can be obtained through real-time calculation based on the incidence relation between the real-time geographic position and the personal data, the personalized POI recommendation requirement is met, the POI is not required to be calculated in advance, and the calculation resources can be saved.
Drawings
FIG. 1 is a flow chart of a graph-based POI recommendation method of the present invention;
FIG. 2 is a structure diagram of the graph database in embodiment 1;
FIG. 3 is a flowchart of a POI recommendation method based on a graph database according to embodiment 2;
FIG. 4 is a block diagram of the electronic device of embodiment 3.
Detailed Description
The present invention will now be described in more detail with reference to the accompanying drawings, in which the description of the invention is given by way of illustration and not of limitation. The various embodiments may be combined with each other to form other embodiments not shown in the following description.
Example 1
The embodiment provides a graph database-based POI recommendation method, and aims to solve the problem that personalized recommendation cannot be achieved by an existing POI recommendation method.
As shown in fig. 1, the POI recommendation method based on a graph database specifically includes the following steps:
acquiring data, and importing the data into a pre-constructed graph database, wherein the data comprises POI data, user data and an incidence relation between the POI data and the user data;
receiving ID and positioning information of a target user;
acquiring a feature vector of the target user, a friend intimacy coefficient of the target user and a target POI set from the graph database according to the ID and the positioning information of the target user, wherein the feature vector of the target user is formed by personal data of the target user;
acquiring a feature matrix of a user set with related POI consumption records from the graph database according to the target POI set, wherein the related POI consumption records are consumption records associated with any POI in the target POI set, and the feature matrix is composed of personal data of each user in the user set;
calculating the similarity between the feature vector of the target user and the feature matrix according to the friend intimacy coefficient;
determining a target POI according to the similarity result obtained by calculation;
and returning the target POI.
The POI recommendation method based on the graph database is applied to a business system, and the business system runs on a server. At present, most POI recommendations on the market are based on popularity scores and present the same recommended content to all consumers. This embodiment makes POI recommendations that match each consumer's location and individual interests and preferences, and can therefore meet the requirement of personalized POI recommendation.
Personalized POI recommendation requires historical data on a large number of relevant users and merchants. At present, the mass data accumulated over the years is generally stored in a data warehouse at PB scale, but the storage format of a data warehouse is not suited to constructing a relationship network over the data, and the warehouse cannot process mass data quickly, that is, it lacks large-scale concurrent execution capability. Reliable personalized POI recommendation therefore cannot be realized from the user and merchant data stored in an existing data warehouse, and the requirements of high concurrency and heavy computation cannot be met.
To implement personalized POI recommendation and meet the requirements of high concurrency and heavy computation, this embodiment stores massive user and merchant data in a graph database, constructs a relationship network over the data, and acquires data by graph traversal. This supports data processing under high concurrency and heavy computation, enables fast reading and computation of data, reduces latency and improves concurrent execution capacity.
This embodiment therefore needs to construct a graph database in advance for storing the data and constructing the relationship network over the data. Specifically:
In this embodiment, the open-source distributed graph database Nebula Graph is used to store the data and the relationships between data; in other embodiments, other graph databases may be used to store the data and the relationships between data.
One graph in a graph database consists of the following two graph elements:
(1) nodes and node attributes;
(2) relationships and attributes of relationships.
In the present embodiment, as shown in fig. 2, 4 types of nodes are designed in total:
User (consumer): its attributes are id (number), province_name (province or city), consumption_level (consumption level), gender;
POI (merchant): its attributes are id (number), latitude (latitude), longitude (longitude), name (business name);
category (commodity type): its attribute is id (number), code (type code);
cuisine (food type): its attributes are id (number), code (type code).
And 4 types of relationships:
friend_ship (User -> User): its attribute is score, which represents the trust or intimacy between consumers; the higher the trust or intimacy, the higher the recommendation weight;
consume (User -> POI): no attribute; indicates that there is a historical consumption record between the consumer and the merchant;
prefer_category (User -> category): its attribute is score, which represents the consumer's degree of preference for a certain type of commodity;
prefer_cuisine (User -> cuisine): its attribute is score, which represents the consumer's degree of preference for a certain type of food.
It should be noted that, in other application scenarios of the present invention, the types represented by the category and cuisine nodes may be types of scenic spots (natural scenery, theme park, historical culture, etc.), types of movies (science fiction, art-house, etc.), or types of regional cuisine (Cantonese cuisine, Hunan cuisine, etc.), and are not further limited herein. In addition, the attributes of the nodes and relationships are only illustrative: in other application scenarios of the present invention, the User node may carry more information (age, family status, etc.), and the consume relationship may carry attributes such as the amount spent on each occasion.
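As an illustration of the schema described above, the four node types and four relationship types can be written down as plain data. This is only a sketch in Python, not Nebula Graph schema-definition syntax; the class names `NodeType` and `EdgeType` and the helper `edges_from` are hypothetical and chosen for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeType:
    name: str
    attributes: tuple


@dataclass(frozen=True)
class EdgeType:
    name: str
    src: str
    dst: str
    attributes: tuple = ()


# Node types and attributes as listed in this embodiment.
NODE_TYPES = (
    NodeType("User", ("id", "province_name", "consumption_level", "gender")),
    NodeType("POI", ("id", "latitude", "longitude", "name")),
    NodeType("category", ("id", "code")),
    NodeType("cuisine", ("id", "code")),
)

# Relationship types; only consume carries no attribute.
EDGE_TYPES = (
    EdgeType("friend_ship", "User", "User", ("score",)),
    EdgeType("consume", "User", "POI"),
    EdgeType("prefer_category", "User", "category", ("score",)),
    EdgeType("prefer_cuisine", "User", "cuisine", ("score",)),
)


def edges_from(node_name):
    """Return the names of the edge types that start at the given node type."""
    return [edge.name for edge in EDGE_TYPES if edge.src == node_name]
```

In this layout every relationship starts at a User node, which is why the later queries all traverse outward from a user ID (or in reverse from a POI).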
After the graph database is constructed, data stored in the data warehouse, such as historical consumption data (records of consumption completed by users at POIs), user relationships and user preferences, needs to be imported into the graph database. Because the data warehouse stores data as tables or files, the data must first be processed into graph elements of the constructed graph database and then imported. Import time is critical: a large business system generally has billions of graph nodes and tens to hundreds of billions of graph relationships, so at 500 bytes per graph element the storage volume is about 50 TB, and writing the graph data directly into the graph database through the database query language would take on the order of tens of hours.
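The storage estimate above can be checked with simple arithmetic. The sketch below takes the 500 bytes per graph element and the 50 TB total from the text; the sustained write throughput of 500 MB/s used in the usage note is purely an assumed figure for illustration and does not come from the source.

```python
BYTES_PER_ELEMENT = 500        # from the text
TOTAL_BYTES = 50 * 10**12      # 50 TB, using decimal terabytes


def element_count(total_bytes, bytes_per_element):
    """Number of graph elements that fit in the given storage volume."""
    return total_bytes // bytes_per_element


def import_hours(total_bytes, bytes_per_second):
    """Hours needed to write total_bytes at a constant throughput."""
    return total_bytes / bytes_per_second / 3600
```

`element_count(TOTAL_BYTES, BYTES_PER_ELEMENT)` gives 10**11, that is, about one hundred billion graph elements. Under the assumed 500 MB/s, `import_hours(TOTAL_BYTES, 500 * 10**6)` is roughly 28 hours, which is in the "tens of hours" range mentioned in the text.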
Preferably, to shorten the time needed to write data into the graph database and improve import efficiency, this embodiment imports data by sst ingest. Specifically, acquiring data and importing the data into the pre-constructed graph database includes:
acquiring data in a data warehouse, wherein the data comprises POI data, user data and an incidence relation between the POI data and the user data;
processing the data into graph elements of the graph database;
establishing batch tasks as Spark jobs, and converting the graph elements in batches into sst, the file format of the graph database, to obtain graph element files in sst format;
and pulling the graph element files in sst format from the data warehouse to the corresponding graph database server, and injecting (ingest) the graph element files into the graph database.
It should be noted that the data may be full data or incremental data, and the import of full data and the import of incremental data can reuse the same import logic. Spark is a big data processing framework commonly used in the field, and large-scale parallel data processing and computing tasks can be completed as Spark jobs.
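The step of processing warehouse data into graph elements can be sketched as follows. This is a minimal, single-process illustration in Python, not the Spark job or the sst file writer of the embodiment; the function name `rows_to_graph_elements` and the row field names `user_id` and `poi_id` are assumptions made for this example.

```python
def rows_to_graph_elements(user_rows, poi_rows, consume_rows):
    """Turn table-style rows into (vertices, edges) lists.

    A vertex is (tag, id, properties); an edge is (type, src_id, dst_id, properties).
    """
    vertices = []
    edges = []
    for row in user_rows:
        props = {key: value for key, value in row.items() if key != "id"}
        vertices.append(("User", row["id"], props))
    for row in poi_rows:
        props = {key: value for key, value in row.items() if key != "id"}
        vertices.append(("POI", row["id"], props))
    for row in consume_rows:
        # The consume relationship carries no attribute in this embodiment.
        edges.append(("consume", row["user_id"], row["poi_id"], {}))
    return vertices, edges
```

Because the same function applies to any batch of rows, a full import and a T+1 incremental import could share it, which matches the reuse of import logic noted above.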
Compared with writing through the query language, sst ingest has the following advantages:
a. The CPU resources of the data warehouse can be fully utilized. The contents of the files can be serialized and computed in advance, and the files can be compressed as a whole, which gives a better compression ratio.
b. A data warehouse typically needs to run ETL on the data before it is written, for example to compute statistical scores, commodity preferences and so on. This process can be merged with the sst file generation process, reducing the number of tasks in the data warehouse.
c. Multiple copies of the data can be prepared in the data warehouse, or T+1 incremental data can be prepared at the same time, and the import of full data and the import of incremental data can reuse the same logic.
d. Data preparation in the data warehouse does not affect the online service of the graph database and can start as soon as the upstream ETL has completed. The data can then be imported during the off-peak period of the database service (usually the early morning), so the import time is more flexible.
e. Pulling the sst files to the server where the graph database is located does not affect the online service of the graph database, so the impact on the business is small.
After the business system has finished importing data into the graph database, it can receive the ID and positioning information sent by a target user through a client such as an app. It then calls a user-defined function (UDF) or a stored procedure to obtain, from the graph database and according to the ID and positioning information of the target user, the feature vector of the target user, the friend intimacy coefficient of the target user and the target POI set, which specifically comprises the following steps:
inquiring personal data of the target user and a friend intimacy coefficient of the target user according to the ID of the target user, wherein the personal data of the target user comprises user information and preference information;
composing the personal data of the target user into a feature vector of the target user;
and inquiring POIs in a preset distance range according to the positioning information, and forming the POIs in the preset distance range into the target POI set.
It should be noted that user-defined functions (UDFs) and stored procedures are user-customized database query statements that are compiled and optimized in advance and built into the database, which reduces the number of interactions between the database and the caller.
Specifically, the graph database queries, according to the ID, the attributes of the node corresponding to the ID, that is, the personal information A1 (province, consumption level and gender) of the target user, using the following query statement:
$A1 = FETCH PROP ON %d YIELD User.province_name,User.consumption_level,User.gender;
where %d is the id of the target user, a parameter passed in by the business system.
The preference information A2 and A3 of the target user is queried according to the ID, where A2 is the target user's preferred commodities and scores and A3 is the target user's preferred foods and scores, using the following query statements:
$A2 = GO FROM %d OVER prefer_category YIELD prefer_category._dst AS prefer_id, prefer_category.score;
$A3 = GO FROM %d OVER prefer_cuisine YIELD prefer_cuisine._dst AS prefer_id,prefer_cuisine.score AS score;
In this embodiment, the queries for A2 and A3 are optimized according to the storage characteristics of Nebula Graph:
in Nebula Graph, the ID of a node is also stored on the edge in the form of the _src and _dst attributes. Since the destination node and the edge are usually stored on two different servers, reading the ID of the destination node through prefer_category._dst avoids fetching data from a remote server.
Then, the above A1, A2 and A3 are used to form the feature vector X of the target user:
[Formula image not reproduced: feature vector X composed of A1, A2 and A3]
Of course, if more user information is designed into the graph elements when the graph database is constructed, the feature vector X has more elements.
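How the three query results might be flattened into one numeric vector can be sketched as below. The exact layout of X is given only in the formula image, so the ordering used here (personal information first, then one score slot per category and per cuisine) is an assumption, as are the numeric encodings of `province_name` and `gender` and the function name `build_feature_vector`.

```python
def build_feature_vector(personal, category_scores, cuisine_scores,
                         category_ids, cuisine_ids,
                         province_codes, gender_codes):
    """Flatten A1 (personal), A2 (category scores) and A3 (cuisine scores).

    category_ids and cuisine_ids fix the slot order so that every user's
    vector has the same length; a missing preference is encoded as 0.0.
    """
    vector = [
        float(province_codes[personal["province_name"]]),
        float(personal["consumption_level"]),
        float(gender_codes[personal["gender"]]),
    ]
    vector += [float(category_scores.get(cid, 0.0)) for cid in category_ids]
    vector += [float(cuisine_scores.get(cid, 0.0)) for cid in cuisine_ids]
    return vector
```

Fixing the slot order up front matters because the same layout has to be used later for every user in the feature matrix; otherwise the element-wise comparison would mix unrelated preferences.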
In addition, according to the ID of the target user, the friend intimacy coefficient [symbol image not reproduced] of the target user, that is, the intimacy between the target user and each friend, can be obtained from the graph database using the following query statement:
$a = GO FROM %d OVER friend_ship YIELD friend_ship.score;
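The %d placeholder in the statements above is filled in by the business system with the target user's ID. A minimal sketch of that substitution in Python follows; the template strings are copied from this document, while the dictionary `QUERY_TEMPLATES` and the function `render_queries` are hypothetical names, and no database client call is shown.

```python
# Query templates copied from the text; %d is the target user's ID.
QUERY_TEMPLATES = {
    "A1": "FETCH PROP ON %d YIELD User.province_name,User.consumption_level,User.gender",
    "A2": "GO FROM %d OVER prefer_category YIELD prefer_category._dst AS prefer_id, prefer_category.score",
    "A3": "GO FROM %d OVER prefer_cuisine YIELD prefer_cuisine._dst AS prefer_id,prefer_cuisine.score AS score",
    "a": "GO FROM %d OVER friend_ship YIELD friend_ship.score",
}


def render_queries(user_id):
    """Substitute an integer user ID into every template."""
    # Rejecting non-integers keeps arbitrary text out of the query string.
    if isinstance(user_id, bool) or not isinstance(user_id, int):
        raise TypeError("user_id must be an integer")
    return {name: template % user_id for name, template in QUERY_TEMPLATES.items()}
```

For example, `render_queries(42)["a"]` yields `GO FROM 42 OVER friend_ship YIELD friend_ship.score`.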
A target POI set P is acquired from the graph database according to the positioning information of the target user. In this embodiment the positioning information comprises longitude and latitude, and the specific query statement is as follows:
$P = GO FROM func_near(%latitude, %longitude, %distance) OVER poi_location YIELD poi_location._dst AS mid;
Here, %latitude and %longitude are the latitude and longitude, and %distance is the preset distance range. In this example the preset distance range is 5 km, so the set of POIs within 5 km can be obtained from the graph database as the target POI set P. Of course, in other embodiments the preset distance range may be customized according to the specific situation, and it is not specifically limited herein.
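The source does not describe how func_near selects POIs internally. As one possible reading, the sketch below filters POIs by great-circle (haversine) distance from the user's position; the haversine formula, the mean Earth radius constant and the function names are assumptions introduced for illustration, not the patent's implementation.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def pois_within(pois, latitude, longitude, distance_km=5.0):
    """Return the ids of POIs no farther than distance_km from the given position."""
    return [
        poi["id"]
        for poi in pois
        if haversine_km(latitude, longitude, poi["latitude"], poi["longitude"]) <= distance_km
    ]
```

A linear scan like this is only workable for small data sets; a production system would more plausibly rely on a spatial index so that only nearby POIs are examined.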
Preferably, the business system invokes a user-defined function (UDF) or a stored procedure, and obtains a feature matrix of the user set with relevant POI consumption records from the graph database according to the target POI set P, including the following steps:
querying users with related POI consumption records according to the target POI set, and forming the users with the related POI consumption records into a user set, wherein the related POI is any POI in the target POI set;
sequentially acquiring personal data of each user in the user set, and sequentially forming a first feature vector by the personal data of each user, wherein the personal data of each user comprises personal information and preference information;
and forming a feature matrix by using the first feature vectors of the users in the user set to obtain the feature matrix of the user set, wherein each column of elements in the feature matrix is the first feature vector of one user in the user set.
Preferably, the user who has the relevant POI consumption record forms a user set, and the processing of the user set includes:
and carrying out duplicate removal processing on each user in the user set.
Specifically, based on the target POI set P, the users who have consumption records at the POIs in P are obtained to form a user set Users. Since the data of the same user need not appear twice in the feature matrix Y constructed later, the users in Users are deduplicated in this embodiment, that is, entries with the same id are merged. Deduplication uses the deduplication function of the graph database Nebula Graph and is completed by running the following query statement:
$Users = GO FROM $P.mid OVER consume REVERSELY YIELD DISTINCT consume._src AS user_id;
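The effect of the reverse traversal with DISTINCT can be illustrated outside the database. The sketch below is a plain Python equivalent over an in-memory edge list; the function name `users_with_consumption` and the `(user_id, poi_id)` tuple layout are assumptions for this example.

```python
def users_with_consumption(consume_edges, target_pois):
    """Return each user who consumed at any target POI, once, in first-seen order.

    consume_edges is an iterable of (user_id, poi_id) pairs.
    """
    targets = set(target_pois)
    seen = set()
    users = []
    for user_id, poi_id in consume_edges:
        if poi_id in targets and user_id not in seen:
            seen.add(user_id)
            users.append(user_id)
    return users
```

A user who consumed at several POIs in P, or several times at the same POI, appears only once, so that user contributes a single column to the feature matrix.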
The personal data of each user in the user set Users is then acquired. This includes querying the personal information B1 of each user in Users, using the following query statement:
$B1 = FETCH PROP ON $Users.user_id YIELD User.province_name, User.consumption_level, User.gender;
The preference information of each user in Users is also queried, comprising each user's preferred commodities and scores B2 and each user's preferred foods and scores B3, using the following query statements:
$B2 = GO FROM $Users.user_id OVER prefer_category YIELD prefer_category._dst AS prefer_id, prefer_category.score AS category_score;
$B3 = GO FROM $Users.user_id OVER prefer_cuisine YIELD prefer_cuisine._dst AS prefer_id,prefer_cuisine.score AS score;
The personal data of each user in the user set Users forms a first feature vector, and the first feature vectors of the users in Users form a feature matrix Y:
[Formula image not reproduced: feature matrix Y]
wherein each column of Y is the first feature vector of one user in the user set Users.
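The column-per-user layout of Y can be made concrete with a short sketch. It uses nested Python lists rather than a numerical library, and the function names `build_feature_matrix` and `column` are hypothetical; the only property taken from the text is that each column holds one user's first feature vector.

```python
def build_feature_matrix(user_vectors):
    """Arrange equal-length user vectors as the columns of a row-major matrix."""
    if not user_vectors:
        return []
    length = len(user_vectors[0])
    if any(len(vector) != length for vector in user_vectors):
        raise ValueError("all feature vectors must have the same length")
    # Row i collects element i of every user's vector.
    return [[vector[i] for vector in user_vectors] for i in range(length)]


def column(matrix, j):
    """Return column j, that is, the feature vector of the j-th user."""
    return [row[j] for row in matrix]
```

With two users whose vectors are `[1, 2, 3]` and `[4, 5, 6]`, the matrix has three rows and two columns, and `column(matrix, 1)` recovers `[4, 5, 6]`.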
Preferably, calculating the similarity between the feature vector of the target user and the feature matrix according to the friend affinity coefficient includes:
calculating the similarity between the feature vector of the target user and the feature matrix according to a similarity calculation function, wherein the similarity calculation formula comprises:
[Formula image not reproduced: similarity calculation formula]
wherein X represents the feature vector of the target user, Y represents the feature matrix, the coefficient denoted by [symbol image not reproduced] is the intimacy coefficient between X and Y, x_i is the i-th element of X, and y_i is the i-th column of Y.
The similarity between the feature vector X of the target user and the feature matrix Y of the user set is calculated to obtain the proximity degree between other users who are consumed by the POI in the preset range and the target user, so that the preference POI of other users with the highest proximity degree is recommended to the target user, the recommended POI is in line with the preference of the target user, personalized recommendation is achieved, and the requirements of different users are met.
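Because the similarity formula survives in this text only as an image reference, the following Python sketch shows just one plausible reading and should not be taken as the patented formula. It assumes a cosine similarity between X and each column of Y, multiplied by the intimacy coefficient of the corresponding user; the function names and the default coefficient of 1.0 for users without a friend relationship are likewise assumptions.

```python
import math


def weighted_cosine(x, y, alpha=1.0):
    """Cosine similarity of x and y, scaled by an intimacy coefficient alpha (assumed form)."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    if norm == 0:
        return 0.0  # a zero vector has no direction; treat it as dissimilar
    return alpha * dot / norm


def similarity_scores(x, user_ids, Y, alphas, default_alpha=1.0):
    """Score every column of Y (one user per column) against the target vector x."""
    scores = {}
    for j, user_id in enumerate(user_ids):
        column = [row[j] for row in Y]
        scores[user_id] = weighted_cosine(x, column, alphas.get(user_id, default_alpha))
    return scores


# Two candidate users; user 7 is a friend with intimacy coefficient 2.0.
print(similarity_scores([1, 0], [7, 8], [[1, 0], [0, 1]], {7: 2.0}))
```

Under this reading, a friend with a higher intimacy coefficient is ranked above a non-friend with the same raw feature similarity.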
Preferably, the business system determines the target POI according to the calculated similarity result, including:
sorting the similarity results in descending order;
selecting the users ranked in the top n by similarity;
and querying the POI most consumed by the top-n users to obtain the target POI, and outputting the target POI.
First, the similarity results are sorted in descending order to find the other users with the most similar preferences (the highest proximity to the target user), these other users being users in the user set; the sorting is implemented with an | ORDER BY $-.Z statement. In this example, the top 5 users in the user set by similarity are selected as the users meeting the recommendation requirement, using | LIMIT 5, and the POI most consumed by these 5 users is output as the target POI, which is queried through the following query statement:
GO FROM $-.user_id OVER consume YIELD consume._dst AS poi_id
| GROUP BY $-.poi_id YIELD $-.poi_id AS poi_id, count($-.poi_id) AS count
| ORDER BY $-.count DESC, $-.poi_id;
Of course, in other embodiments the number of selected users may be set according to the specific situation; that is, n may be chosen freely and is not specifically limited herein.
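The ranking logic described above (sort by similarity, keep the top n users, then count the POIs they consumed) can be sketched outside the database in Python. The dictionaries `scores` and `consumption` are hypothetical in-memory stand-ins for the query results, and `recommend_pois` is an illustrative name.

```python
from collections import Counter


def recommend_pois(scores, consumption, n=5):
    """Rank POIs by how often the n most similar users consumed them.

    scores: dict user_id -> similarity to the target user.
    consumption: dict user_id -> list of poi_id, one entry per consumption record.
    Returns a list of (poi_id, count), most consumed first, ties ordered by poi_id.
    """
    # Equivalent in spirit to sorting by similarity in descending order, then LIMIT n.
    top_users = sorted(scores, key=lambda u: scores[u], reverse=True)[:n]
    counts = Counter()
    for user_id in top_users:
        counts.update(consumption.get(user_id, []))
    # Equivalent in spirit to GROUP BY poi_id, then ORDER BY count DESC, poi_id.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


scores = {1: 0.9, 2: 0.8, 3: 0.1}
consumption = {1: [10, 11], 2: [11, 12], 3: [13, 13, 13]}
print(recommend_pois(scores, consumption, n=2))  # [(11, 2), (10, 1), (12, 1)]
```

With n = 2, user 3 is excluded despite having the most consumption records, so POI 13 does not appear in the result.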
The business system then returns the target POI to the client, which completes the personalized POI recommendation.
According to this graph database-based POI recommendation method, the consumption data, social data and preference data of users are stored in and traversed through the graph database, and personalized POI recommendation is completed in combination with the real-time geographic position, thereby meeting the personalized requirements of users. In addition, the invention scales elastically, so server resources can be used reasonably without being wasted.
Example 2
This embodiment differs from embodiment 1 in that, to achieve high concurrency and low latency for POI recommendation, the processes of querying data from the graph database and determining the target POI are compiled and optimized as a UDF (user-defined function) or stored procedure and built into the graph database. Intermediate results of querying and calculation do not need to be returned to the business system; they are cached directly in the graph database. Only the resulting target POI is returned to the business system, which avoids a large amount of data interaction between the business system and the graph database and greatly improves access performance.
Therefore, as shown in fig. 3, the POI recommendation method based on a graph database in the embodiment includes the following steps:
the business system acquires data and imports the data into a pre-constructed graph database, wherein the data comprises POI data, user data and the association relationship between the POI data and the user data;
the business system receives the ID and the positioning information of a target user;
the business system sends the ID and the positioning information of the target user to the graph database;
the business system receives the target POI returned by the graph database;
the business system returns the target POI to the client;
wherein the graph database determines the target POI and returns it through the following steps:
acquiring a feature vector of the target user, a friend intimacy coefficient of the target user and a target POI set according to the ID and the positioning information of the target user, wherein the feature vector of the target user is formed by personal data of the target user;
acquiring a feature matrix of a user set with related POI consumption records according to the target POI set, wherein the related POI consumption records are consumption records associated with any POI in the target POI set, and the feature matrix is composed of personal data of each user in the user set;
calculating the similarity between the feature vector of the target user and the feature matrix according to the friend intimacy coefficient;
determining a target POI according to the similarity result obtained by calculation;
and returning the target POI.
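The division of labor in the steps above can be sketched in Python: the business system makes a single call carrying the ID and the positioning information, and everything else runs on the graph database side. Both classes and the method name `solve_target_poi` are hypothetical stand-ins for illustration only; they are not a Nebula Graph API.

```python
class InMemoryGraphDatabase:
    """Hypothetical stand-in for the graph database with a built-in procedure."""

    def __init__(self, stored_procedure):
        self._procedure = stored_procedure
        self.round_trips = 0  # number of requests received from the business system

    def solve_target_poi(self, user_id, latitude, longitude):
        # All querying and similarity computation happens here, on the database
        # side; intermediate results never leave this method.
        self.round_trips += 1
        return self._procedure(user_id, latitude, longitude)


class BusinessSystem:
    """Thin layer that forwards the request and relays the final result."""

    def __init__(self, graph_db):
        self._graph_db = graph_db

    def recommend(self, user_id, latitude, longitude):
        # One request out, one target POI result back.
        return self._graph_db.solve_target_poi(user_id, latitude, longitude)


db = InMemoryGraphDatabase(lambda uid, lat, lon: ["poi-for-%d" % uid])
system = BusinessSystem(db)
print(system.recommend(7, 30.0, 120.0), db.round_trips)
```

The point of the design is visible in `round_trips`: one recommendation costs one exchange with the database, regardless of how many intermediate queries the procedure runs internally.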
Specifically, according to the ID, the graph database queries the attributes of the node corresponding to the ID, that is, the personal information A1 (province, consumption level and gender) of the target user, using the following query statement:
$A1 = FETCH PROP ON %d YIELD User.province_name,User.consumption_level,User.gender;
wherein %d is the ID of the target user, a parameter passed in by the business system.
The preference information A2 and A3 of the target user is then queried according to the ID, wherein A2 is the commodities preferred by the target user and their scores, and A3 is the foods preferred by the target user and their scores, using the following query statements:
$A2 = GO FROM %d OVER prefer_category YIELD prefer_category._dst AS prefer_id, prefer_category.score;
$A3 = GO FROM %d OVER prefer_cuisine YIELD prefer_cuisine._dst AS prefer_id,prefer_cuisine.score AS score;
In this embodiment, the queries for A2 and A3 are optimized according to the storage characteristics of Nebula Graph:
in Nebula Graph, the IDs of an edge's end vertices are also stored on the edge itself as the _src and _dst attributes. Because the destination vertex and the edge are usually stored on two different servers, reading the destination ID from the edge attribute (prefer_category._dst in the statement above) avoids fetching data from a remote server.
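How the %d placeholder in the A1 to A3 statements above is filled with the target user's ID can be illustrated with a short Python sketch. The templates are copied from the statements above; the helper `render_queries` and the integer-only check are assumptions added for illustration.

```python
QUERY_TEMPLATES = {
    "A1": "FETCH PROP ON %d YIELD User.province_name,User.consumption_level,User.gender",
    "A2": "GO FROM %d OVER prefer_category YIELD prefer_category._dst AS prefer_id, prefer_category.score",
    "A3": "GO FROM %d OVER prefer_cuisine YIELD prefer_cuisine._dst AS prefer_id,prefer_cuisine.score AS score",
}


def render_queries(user_id):
    """Substitute the target user's ID into each query template."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(user_id, bool) or not isinstance(user_id, int):
        raise TypeError("user_id must be an integer vertex ID")
    return {
        name: "$%s = %s;" % (name, template % user_id)
        for name, template in QUERY_TEMPLATES.items()
    }


for statement in render_queries(42).values():
    print(statement)
```

Accepting only integers means arbitrary text from a client can never be spliced into a query statement.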
Then A1, A2 and A3 above are used to form the feature vector X of the target user:
[Equation image not reproduced: feature vector X]
Of course, if more user information is modeled when the graph elements of the graph database are constructed, the feature vector X has more elements.
In addition, according to the ID of the target user, the friend intimacy coefficient of the target user (its symbol is an image in the original, not reproduced here), which represents the relationship between the target user and each friend, can be obtained from the graph database as follows:
$a=GO FROM %d OVER friend_ship YIELD friend_ship.score;
The target POI set P is acquired from the graph database according to the positioning information of the target user, which in this embodiment comprises longitude and latitude. The specific query statement is as follows:
$POI = GO FROM func_near(%latitude, %longitude, %distance) OVER poi_location YIELD poi_location._dst AS mid;
Here, %latitude and %longitude are the latitude and longitude, and %distance is a preset distance range. In this embodiment the preset distance range is 5 km, so the set of POIs within 5 km can be obtained from the graph database as the target POI set P. Of course, in other embodiments the preset distance range may be customized according to the specific situation and is not specifically limited herein.
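The text does not describe how func_near is implemented. As a rough illustration of what selecting POIs within a preset distance range means, the following Python sketch filters an in-memory set of POIs by great-circle (haversine) distance. The haversine formula, the mean Earth radius of 6371 km, and the names `haversine_km` and `pois_within` are assumptions of this sketch, not details taken from the source.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed constant)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def pois_within(pois, latitude, longitude, distance_km=5.0):
    """Return the ids of POIs whose distance from the given point is within range.

    pois: dict poi_id -> (latitude, longitude).
    """
    return [
        poi_id
        for poi_id, (lat, lon) in pois.items()
        if haversine_km(latitude, longitude, lat, lon) <= distance_km
    ]


pois = {"a": (30.0, 120.0), "b": (30.03, 120.0), "c": (30.1, 120.0)}
print(pois_within(pois, 30.0, 120.0))  # ['a', 'b']
```

A linear scan like this is only practical for small data sets; a production system would normally rely on a spatial index inside the database.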
Preferably, the graph database obtains the feature matrix of the user set having related POI consumption records according to the target POI set P through the following steps:
querying users with related POI consumption records according to the target POI set, and forming the users with the related POI consumption records into a user set, wherein the related POI is any POI in the target POI set;
sequentially acquiring personal data of each user in the user set, and sequentially forming a first feature vector by the personal data of each user, wherein the personal data of each user comprises personal information and preference information;
and forming a feature matrix by using the first feature vectors of the users in the user set to obtain the feature matrix of the user set, wherein each column of elements in the feature matrix is the first feature vector of one user in the user set.
Preferably, after forming the users having related POI consumption records into a user set, the graph database further processes the user set, including:
deduplicating the users in the user set.
Specifically, based on the target POI set P, the users who have consumption records at POIs in P are obtained to form a user set Users. In this embodiment, because the data of the same user does not need to appear twice in the feature matrix Y constructed later, the users in Users are deduplicated, that is, entries with the same id are merged into one. Deduplication uses the deduplication function (DISTINCT) of the graph database Nebula Graph and is completed by executing the following statement:
$Users = GO FROM $P.mid OVER consume REVERSELY YIELD DISTINCT consume._src AS user_id;
The personal data of each user in Users is then acquired. This includes querying the personal information B1 of each user, using the following query statement:
$B1=FETCH PROP ON $Users.user_id YIELD User.province_name, User.consumption_level, User.gender;
The preference information of each user in Users is also queried. It comprises the commodities each user has consumed and their scores (B2), and the foods each user has consumed and their scores (B3), using the following query statements:
$B2=GO FROM $Users.user_id OVER prefer_category YIELD prefer_category._dst AS prefer_id, prefer_category.score AS category_score;
$B3=GO FROM $Users.user_id OVER prefer_cuisine YIELD prefer_cuisine._dst AS prefer_id,prefer_cuisine.score AS score;
The personal data of each user in Users forms a feature vector, and the feature vectors of all users in Users form the feature matrix Y:
[Equation image not reproduced: feature matrix Y]
wherein each column of Y is the feature vector of one user in Users.
Preferably, calculating the similarity between the feature vector of the target user and the feature matrix includes:
calculating the similarity between the feature vector of the target user and the feature matrix according to a similarity calculation function, wherein the similarity calculation formula comprises:
[Equation image not reproduced: similarity calculation formula]
wherein X represents the feature vector of the target user, Y represents the feature matrix, the coefficient symbol (an image in the original, not reproduced here) is the intimacy coefficient between X and Y, x_i is the i-th element of X, and y_i is the i-th column element of Y.
Calculating the similarity between the feature vector X of the target user and the feature matrix Y gives the degree of proximity between the target user and the other users who have consumed at POIs within the preset range. The POIs preferred by the users with the highest proximity are then recommended to the target user, so that the recommended POIs match the target user's preferences, recommendation is personalized, and the requirements of different users are met.
Preferably, the determining the target POI according to the calculated similarity result by the graph database includes:
sorting the similarity results in a descending order;
selecting the users with the first n similarity ranks;
and querying the POI most consumed by the top-n users to obtain the target POI.
First, the similarity results are sorted in descending order to find the other users with the most similar preferences (the highest proximity to the target user); the sorting is implemented with an | ORDER BY $-.Z statement. In this example, the top 5 users in the user set by similarity are selected as the users meeting the recommendation requirement, using | LIMIT 5, and the POI most consumed by these 5 users is output as the target POI, which is queried through the following query statement:
GO FROM $-.user_id OVER consume YIELD consume._dst AS poi_id
| GROUP BY $-.poi_id YIELD $-.poi_id AS poi_id, count($-.poi_id) AS count
| ORDER BY $-.count DESC, $-.poi_id;
Of course, in other embodiments the number of selected users may be set according to the specific situation; that is, n may be chosen freely and is not specifically limited herein.
In this embodiment, the processes of querying data and determining the target POI are completed inside the graph database, which reduces the number of times the business system accesses the database and improves access performance. In actual application, the latency at 1000 concurrent requests per second is below 30 milliseconds, which meets the low-latency, high-concurrency requirements of the POI recommendation method.
Example 3
Fig. 4 is a schematic structural diagram of an electronic device according to embodiment 3 of the present invention. As shown in Fig. 4, the electronic device includes a processor 410, a memory 420, an input device 430 and an output device 440. The number of processors 410 in the electronic device may be one or more; one processor 410 is taken as an example in Fig. 4. The processor 410, the memory 420, the input device 430 and the output device 440 in the electronic device may be connected by a bus or by other means; connection by a bus is taken as an example in Fig. 4.
The memory 420 serves as a computer-readable storage medium for storing software programs, computer-executable programs, and modules, such as program instructions/modules corresponding to the graph database-based POI recommendation method described in embodiments 1-2 of the present invention. The processor 410 executes various functional applications and data processing of the electronic device, that is, implements the graph database-based POI recommendation methods of embodiments 1 and 2, by running software programs, instructions, and modules stored in the memory 420.
The memory 420 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system and an application program required for at least one function, and the data storage area may store data created according to the use of the terminal, and the like. Further, the memory 420 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. In some examples, the memory 420 may further include memory located remotely from the processor 410, which may be connected to the electronic device over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The input device 430 may receive data or the like input through an input apparatus. The output device 440 may include a display device such as a display screen.
Example 4
Embodiment 4 of the present invention also provides a storage medium containing computer-executable instructions, which when executed by a computer processor, are configured to implement a POI recommendation method based on a graph database, the method including:
acquiring data, and importing the data into a pre-constructed graph database, wherein the data comprises POI data, user data and the association relationship between the POI data and the user data;
receiving ID and positioning information of a target user;
acquiring a feature vector of the target user, a friend intimacy coefficient of the target user and a target POI set from the graph database according to the ID and the positioning information of the target user, wherein the feature vector of the target user is formed by personal data of the target user;
acquiring a feature matrix of a user set with related POI consumption records from the graph database according to the target POI set, wherein the related POI consumption records are consumption records associated with any POI in the target POI set, and the feature matrix is composed of personal data of each user in the user set;
calculating the similarity between the feature vector of the target user and the feature matrix according to the friend intimacy coefficient;
determining a target POI according to the similarity result obtained by calculation;
and returning the target POI.
Of course, the storage medium provided by the embodiment of the present invention contains computer-executable instructions, and the computer-executable instructions are not limited to the method operations described above, and may also perform related operations in the method for POI recommendation based on a graph database provided by any embodiment of the present invention.
From the above description of the embodiments, it is obvious for those skilled in the art that the present invention can be implemented by software and necessary general hardware, and certainly, can also be implemented by hardware, but the former is a better embodiment in many cases. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which may be stored in a computer-readable storage medium, such as a floppy disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a FLASH Memory (FLASH), a hard disk or an optical disk of a computer, and includes instructions for enabling an electronic device (which may be a mobile phone, a personal computer, a server, or a network device) to execute the methods according to the embodiments of the present invention.
It should be noted that, in the embodiment of the POI recommendation method or apparatus based on a graph database, the included units and modules are only divided according to the functional logic, but are not limited to the above division, as long as the corresponding functions can be realized; in addition, specific names of the functional units are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present invention.
Various other modifications and changes may be made by those skilled in the art based on the above-described technical solutions and concepts, and all such modifications and changes should fall within the scope of the claims of the present invention.

Claims (10)

1. A POI recommendation method based on a graph database is characterized by comprising the following steps:
acquiring data, and importing the data into a pre-constructed graph database, wherein the data comprises POI data, user data and the association relationship between the POI data and the user data;
receiving ID and positioning information of a target user;
acquiring a feature vector of the target user, a friend intimacy coefficient of the target user and a target POI set from the graph database according to the ID and the positioning information of the target user, wherein the feature vector of the target user is composed of personal data of the target user, the personal data of the target user comprises user information and preference information, and the target POI set is a set composed of POIs in a preset distance range inquired according to the positioning information;
acquiring a feature matrix of a user set with related POI consumption records from the graph database according to the target POI set, wherein the related POI consumption records are consumption records associated with any POI in the target POI set, the feature matrix is composed of personal data of each user in the user set, and the personal data of each user comprises personal information and preference information;
calculating the similarity between the feature vector of the target user and the feature matrix according to the friend intimacy coefficient;
determining a target POI according to the similarity result obtained by calculation;
and returning the target POI.
2. The graph database-based POI recommendation method according to claim 1, comprising the following steps after receiving the ID and the positioning information of the target user:
sending the ID and the positioning information of the target user to the graph database;
receiving a target POI returned by the graph database;
returning the target POI;
wherein the graph database determines the target POI and returns it through the following steps:
acquiring a feature vector of the target user, a friend intimacy coefficient of the target user and a target POI set according to the ID and the positioning information of the target user, wherein the feature vector of the target user is composed of personal data of the target user;
acquiring a feature matrix of a user set with related POI consumption records according to the target POI set, wherein the related POI consumption records are consumption records associated with any POI in the target POI set, and the feature matrix is composed of personal data of each user in the user set;
calculating the similarity between the feature vector of the target user and the feature matrix according to the friend intimacy coefficient;
determining a target POI according to the similarity result obtained by calculation;
and returning the target POI.
3. The graph database-based POI recommendation method of claim 1, wherein obtaining data, importing the data into a pre-built graph database, comprises:
acquiring data in a data warehouse, wherein the data comprises POI data, user data and the association relationship between the POI data and the user data;
processing the data into graph elements of the graph database;
establishing a batch task, and converting the graph elements into a file format of the graph database in batch to obtain a graph element file;
and pulling the graph element file from the data warehouse to a database server corresponding to the graph database, and injecting the graph element file into the graph database.
4. The method according to claim 1 or 2, wherein the feature vector of the target user, the friend affinity coefficient of the target user, and the target POI set are obtained according to the ID and the positioning information of the target user, and the method specifically includes the following steps:
calling a user-defined function, and querying personal data of the target user and a friend intimacy coefficient of the target user according to the ID of the target user, wherein the personal data of the target user comprises user information and preference information;
composing the personal data of the target user into a feature vector of the target user;
and calling a user-defined function, inquiring POI (point of interest) in a preset distance range according to the positioning information, and forming the POI in the preset distance range into the target POI set.
5. The graph database-based POI recommendation method according to claim 4, wherein obtaining a feature matrix of a user set having related POI consumption records according to the target POI set comprises the steps of:
calling a user-defined function, querying users with related POI consumption records according to the target POI set, and forming the users with the related POI consumption records into a user set, wherein the related POI is any POI in the target POI set;
sequentially acquiring personal data of each user in the user set, and sequentially forming a first feature vector by the personal data of each user, wherein the personal data of each user comprises personal information and preference information;
and forming a feature matrix by using the first feature vectors of the users in the user set to obtain the feature matrix of the user set, wherein each column of elements in the feature matrix is the first feature vector of one user in the user set.
6. A graph database based POI recommendation method according to claim 5, wherein said users having related POI consumption records are organized into a set of users, and said set of users is processed, comprising:
and carrying out duplicate removal processing on each user in the user set.
7. The graph database-based POI recommendation method of claim 1 or 2, wherein calculating the similarity of the feature vector of the target user and the feature matrix according to the friend affinity coefficient comprises:
calculating the similarity between the feature vector of the target user and the feature matrix according to a similarity calculation function, wherein the similarity calculation formula comprises:
[Equation image not reproduced: similarity calculation formula]
wherein X represents the feature vector of the target user, Y represents the feature matrix, the coefficient symbol (an image in the original, not reproduced here) is the friend intimacy coefficient between X and Y, x_i is the i-th element of X, y_i is the i-th row element of Y, and n represents the total number of elements in X or the total number of rows of elements in Y.
8. The graph database-based POI recommendation method according to claim 1 or 2, wherein determining the target POI based on the calculated similarity result comprises:
sorting the similarity results in a descending order;
selecting the users with the first n similarity ranks;
and querying the POI which is most consumed by the n front ranked users to obtain the target POI.
9. An electronic device comprising a processor, a storage medium, and a computer program stored on the storage medium, wherein the computer program, when executed by the processor, implements the graph database-based POI recommendation method of any of claims 1 to 8.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out a method for graph database based POI recommendation according to any one of claims 1 to 8.
CN202010703591.4A 2020-06-30 2020-07-21 POI recommendation method, device and medium based on graph database Active CN111737537B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202010703591.4A CN111737537B (en) 2020-07-21 2020-07-21 POI recommendation method, device and medium based on graph database
US17/325,245 US11500934B2 (en) 2020-06-30 2021-05-20 POI recommendation method and device based on graph database, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010703591.4A CN111737537B (en) 2020-07-21 2020-07-21 POI recommendation method, device and medium based on graph database

Publications (2)

Publication Number Publication Date
CN111737537A CN111737537A (en) 2020-10-02
CN111737537B true CN111737537B (en) 2020-11-27

Family

ID=72655174

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010703591.4A Active CN111737537B (en) 2020-06-30 2020-07-21 POI recommendation method, device and medium based on graph database

Country Status (1)

Country Link
CN (1) CN111737537B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112445981A (en) * 2020-11-04 2021-03-05 西安电子科技大学 Social and consumption joint recommendation system, method, storage medium and computer equipment
CN115797020B (en) * 2023-02-06 2023-05-02 网思科技股份有限公司 Retail recommendation method, system and medium for data processing based on graph database

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104376010A (en) * 2013-08-14 2015-02-25 腾讯科技(深圳)有限公司 User recommendation method and user recommendation device
CN107679053A (en) * 2017-06-12 2018-02-09 平安科技(深圳)有限公司 Location recommendation method, device, computer equipment and storage medium
CN109446348A (en) * 2018-09-22 2019-03-08 北京微播视界科技有限公司 A kind of operating method, device, terminal and storage medium polymerizeing point of interest
CN109800361A (en) * 2019-02-11 2019-05-24 北京百度网讯科技有限公司 A kind of method for digging of interest point name, device, electronic equipment and storage medium
CN110555112A (en) * 2019-08-22 2019-12-10 桂林电子科技大学 interest point recommendation method based on user positive and negative preference learning

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160203422A1 (en) * 2015-01-14 2016-07-14 Nextop Italia Srl Semplificata Method and electronic travel route building system, based on an intermodal electronic platform

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Geographic-categorical diversification in POI recommendations;Rodrigo Carvalho等;《Proceedings of the 25th Brazillian Symposium on Multimedia and the Web》;20191031;第349-356页 *
Jaroslav Pokorný等.Conceptual and Database Modelling of Graph Databases.《Proceedings of the 20th International Database Engineering & Applications Symposium》.2016, *

Also Published As

Publication number Publication date
CN111737537A (en) 2020-10-02

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CB03 Change of inventor or designer information

Inventor after: Wu Min

Inventor after: Chen Pengwei

Inventor after: Ye Xiaomeng

Inventor before: Wu Min

Inventor before: Chen Pengwei

Inventor before: Ye Xiaomeng