WO2018219057A1 - Site selection method and device (选址方法及设备) - Google Patents


Info

Publication number
WO2018219057A1
Authority
WO
WIPO (PCT)
Prior art keywords
user
feature data
time period
time
location
Prior art date
Application number
PCT/CN2018/083627
Other languages
English (en)
French (fr)
Inventor
张海滨
蒋丰泽
张旭
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司 filed Critical 华为技术有限公司
Priority to EP18808958.5A priority Critical patent/EP3605365A4/en
Publication of WO2018219057A1 publication Critical patent/WO2018219057A1/zh

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/29Geographical information databases
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2458Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465Query processing support for facilitating data mining operations in structured databases

Definitions

  • The embodiments of the present application relate to the field of big data analysis technologies, and in particular, to a site selection method and device.
  • Site selection refers to the process of evaluating and deciding on an address prior to construction. For example, shops, hotels, hospitals, and schools all need to be sited before opening or construction. Many factors must be considered when choosing a location. Taking store site selection as an example, the factors include regional traffic, business districts, user shopping habits, the nature of the store itself, and surrounding land prices.
  • In the prior art, a method of performing store site selection based on users' search records is provided.
  • Users usually search a map application for the store they want to visit. A large number of users' search records for the store are obtained from the map application and analyzed to carry out the store site selection.
  • For example, the records of a large number of users searching for "xx coffee shop" in the map application are obtained, and the spatial distribution on the map of the users who searched for "xx coffee shop" is determined from those records.
  • Regions whose search count is greater than a preset threshold are selected as preliminary candidate regions; regions that already have an "xx coffee shop" nearby are then removed from the preliminary candidate regions, and the remaining candidate regions are obtained.
  • The above prior art approaches store site selection from the perspective of user demand, and its prerequisite is that a large number of users' search records for the planned store can be obtained from the map application. If search records are missing for some reason, for example because the planned store is an emerging type of store or a lesser-known store, the map application may contain only a few search records for it, or none at all. In this case, the prior art solution is difficult to implement effectively.
  • The embodiments of the present application provide a site selection method and device to solve the problem that the solution provided by the prior art is difficult to implement effectively when search records are missing.
  • In one aspect, a site selection method is provided, including: acquiring spatiotemporal feature data of a plurality of users, where the spatiotemporal feature data indicates the actual geographic location of each user at each moment; performing rasterization processing on the spatiotemporal feature data of each user to convert it into mapped spatiotemporal feature data; determining the frequent trajectory of each user according to that user's mapped spatiotemporal feature data; and determining an alternative address of the subject to be sited according to the frequent trajectories of the plurality of users.
  • The solution extracts users' frequent trajectories from their spatiotemporal feature data and uses those trajectories as the basis for site selection, whereas the prior art relies on users' search records.
  • The spatiotemporal feature data of a user can be conveniently collected from the user's daily activities, which ensures that the solution provided by the embodiments of the present application can be implemented effectively.
  • A user's frequent trajectory reflects the geographic locations where the user often appears. Using frequent trajectories for site selection accurately reflects the foot traffic of each geographic location and the identities of the surrounding users, and matches the real-world habit of users carrying out various daily activities at the places where they often appear, which helps to improve the accuracy of site selection.
  • In a possible design, performing rasterization processing on the spatiotemporal feature data of each user to convert it into mapped spatiotemporal feature data includes: for each user, dividing the user's spatiotemporal feature data by cycle to obtain the user's spatiotemporal feature data in n cycles, where n is an integer greater than 1;
  • for each of the n cycles, determining, according to the user's spatiotemporal feature data in the cycle, the actual geographic location of the user in each of the k time periods included in the cycle, where k is an integer greater than 1; and obtaining a representative geographic location corresponding to the actual geographic location of the user in each time period, where the actual geographic location is contained in the spatial region represented by the representative geographic location.
  • The mapped spatiotemporal feature data of each user includes the user's mapped spatiotemporal feature data in the n cycles, and the mapped spatiotemporal feature data in one cycle includes the representative geographic location in each time period included in that cycle.
  • In a possible design, determining the frequent trajectory of each user according to the user's mapped spatiotemporal feature data includes: for each user, obtaining the sequence corresponding to the user's mapped spatiotemporal feature data in each cycle, where each element of the sequence represents the representative geographic location of the user in one time period; obtaining, from the n sequences corresponding to the user in the n cycles, the number of occurrences of each subsequence in the n sequences; and selecting a subsequence that meets a first preset condition as a frequent sequence and determining the mapped spatiotemporal feature data corresponding to the frequent sequence as the frequent trajectory of the user.
  • The first preset condition includes one or both of the following: the number of occurrences is greater than a first threshold; the number of elements included in the subsequence is greater than a second threshold.
  • The result of the rasterization processing makes it possible to tell whether the user frequently appears around a certain geographic location in the same time period of different cycles, which helps to extract the user's frequent trajectory. If rasterization is not performed and the frequent trajectory is extracted directly from the raw spatiotemporal feature data, the frequent trajectory is hard to obtain, because raw time and location values rarely coincide exactly in practice.
  • In a possible design, determining an alternative address of the subject to be sited according to the frequent trajectories of the plurality of users includes: constructing a track tag library according to the frequent trajectories of the plurality of users, where the track tag library includes the track tag data of the plurality of users, and the track tag data of each user includes a user identifier and at least one frequent trajectory; and determining, based on the track tag library, an alternative address of the subject to be sited according to a second preset condition.
  • In a possible design, the method further includes: acquiring location information of the user during working hours according to the frequent trajectory of the user; acquiring Point of Information (POI) information corresponding to the location information; and determining the user's identity tag according to the POI information, where the user's track tag data also includes the user's identity tag.
  • The identities of users in the area surrounding a geographic location are also considered in site selection, so that the recommended alternative addresses are more refined and better targeted.
  • Inferring the user's identity tag from the user's frequent trajectory avoids an inaccurate tag caused by incorrect or outdated identity information in social data, which helps to improve the accuracy of the user's identity tag.
  • In another aspect, an embodiment of the present application provides a site selection device, including a processor and a memory, where the memory stores a computer-readable program, and the processor runs the program in the memory to carry out the site selection method described in the above aspects.
  • In another aspect, an embodiment of the present application provides a computer storage medium for storing computer software instructions used by a site selection device, including a program designed to perform the above aspects.
  • In another aspect, an embodiment of the present application provides a computer program product that performs the site selection method described in the above aspects when executed.
  • In the solution provided by the embodiments of the present application, users' frequent trajectories are extracted from their spatiotemporal feature data and used as the basis for site selection. The spatiotemporal feature data can be conveniently collected from users' daily activities, which ensures that the solution can be implemented effectively.
  • A user's frequent trajectory reflects the geographic locations where the user often appears. Using frequent trajectories for site selection accurately reflects the foot traffic of each geographic location and the identities of the surrounding users, and matches the real-world habit of users carrying out various daily activities at the places where they often appear, which helps to improve the accuracy of site selection.
  • FIG. 1 is a flowchart of a site selection method provided by an embodiment of the present application.
  • FIG. 2 is a schematic diagram of spatial rasterization provided by an embodiment of the present application.
  • FIG. 3 is a schematic diagram of frequent trajectories provided by an embodiment of the present application.
  • FIG. 4 is a schematic structural diagram of a site selection system provided by an embodiment of the present application.
  • FIG. 5 is a schematic diagram of interaction in a site selection system according to an embodiment of the present application.
  • FIG. 6A is a schematic block diagram of a site selection device provided by an embodiment of the present application.
  • FIG. 6B is a schematic structural diagram of a site selection device according to an embodiment of the present application.
  • FIG. 1 shows a flowchart of a site selection method provided by an embodiment of the present application.
  • the method can include the following steps.
  • Step 101 Acquire spatiotemporal feature data of multiple users.
  • Spatio-temporal feature data is used to indicate the actual geographic location of the user at various times.
  • the spatiotemporal feature data includes data in two dimensions: a time dimension and a spatial dimension.
  • the feature data of the spatial dimension can be represented by latitude and longitude coordinates.
  • the feature data of the spatial dimension may also be represented by a road, an intersection, an iconic location, etc., which is not limited in this embodiment of the present application. All in all, the spatio-temporal feature data can reflect when and where the user is.
  • the spatiotemporal feature data for multiple users is shown in Table-1 below:
  • The user's spatiotemporal feature data can be obtained by using a relevant positioning algorithm.
  • The positioning algorithm may be a base station positioning algorithm, a Global Positioning System (GPS) positioning algorithm, or the like.
  • The base station positioning algorithm includes a three-base-station positioning algorithm and a multi-base-station positioning algorithm.
  • The principle is that the terminal measures the downlink pilot signals of different base stations to obtain the Time of Arrival (TOA) or the Time Difference of Arrival (TDOA) of the downlink pilots of the different base stations. Based on the measurement results and the coordinates of the base stations, the position of the terminal can be calculated by using a triangulation-based estimation algorithm.
  • Step 102 Perform rasterization processing on the spatiotemporal feature data of each user, and convert the spatiotemporal feature data into mapped spatiotemporal feature data.
  • the rasterization process refers to rasterizing the user's spatiotemporal feature data, so that the user's spatiotemporal feature data is mapped into the corresponding time period and space region.
  • Rasterization processing includes: rasterization of time dimensions and rasterization of spatial dimensions.
  • the rasterization of the time dimension refers to rasterizing the feature data of the user time dimension, so that the feature data of the user time dimension is mapped into the corresponding time period.
  • the rasterization of the spatial dimension refers to rasterizing the feature data of the user space dimension, so that the feature data of the user space dimension is mapped into the corresponding spatial region.
  • the rasterization process of the time dimension is specifically: segmenting the time dimension along the time axis, and mapping the feature data of the user time dimension to the corresponding time segment.
  • For example, time is segmented along the time axis by day. With 00:00 as the origin of the time axis and 30 minutes as one time period (or time slot), each day is divided into 48 time periods, numbered from 0. Any moment between 08:00:00 and 08:30:00 is then discretized into the number 16.
  • the division of the time period can also be carried out in other ways according to requirements, such as 10 minutes as a time period or 1 hour as a time period. If the user appears in a plurality of different locations within a certain period of time, the location with the most occurrences may be taken as the actual geographic location of the user during the time period.
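  • The time-dimension rasterization described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `time_slot`, the constant `SLOT_MINUTES`, and the sample date are assumptions introduced here, and the slot numbering starts at 0 so that 08:00:00 falls in slot 16 as in the example.

```python
from datetime import datetime

SLOT_MINUTES = 30  # slot length from the example; 10 or 60 would work the same way


def time_slot(ts: datetime, slot_minutes: int = SLOT_MINUTES) -> int:
    """Map a timestamp to the index of its time slot within the day (origin 00:00)."""
    minutes_since_midnight = ts.hour * 60 + ts.minute
    return minutes_since_midnight // slot_minutes


# Hypothetical sample timestamp; only the time of day matters here.
print(time_slot(datetime(2017, 6, 1, 8, 15, 0)))
```

  With 30-minute slots, a day has 48 slots (0 to 47); changing `slot_minutes` to 60 would give 24 slots instead.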
  • The rasterization of the spatial dimension is specifically: dividing the two-dimensional position space (that is, the geographic space) into grid cells by latitude and longitude, and mapping the user's spatial feature data to the corresponding grid cell, where one grid cell represents one spatial region.
  • the actual geographic location of the user at any one time can be located in one of the grids.
  • In this embodiment, latitude and longitude (0, 0) is taken as the coordinate origin. A distance of 500 meters corresponds to a change of 0.0045 degrees in longitude, which is the longitude step size (step1); a distance of 500 meters likewise corresponds to a change of 0.0045 degrees in latitude, which is the latitude step size (step2).
  • The rasterization of the spatial dimension can be calculated using the following formulas:
  • $\mathrm{Longi} = \lfloor \mathrm{longitude} / \mathrm{step1} \rfloor, \qquad \mathrm{Lati} = \lfloor \mathrm{latitude} / \mathrm{step2} \rfloor$
  • Here $\lfloor x \rfloor$, that is Math.floor(x), represents the largest integer less than or equal to x; longitude and latitude represent the longitude and latitude coordinates before rasterization; Longi and Lati represent the longitude and latitude coordinates after rasterization; step1 represents the step size in longitude; and step2 represents the step size in latitude.
  • For every 500 meters moved to the right, the longitude increases by 0.0045 degrees; for every 500 meters moved upward, the latitude increases by 0.0045 degrees.
  • For a point A, the latitude and longitude coordinates after the processing are (1, 1).
  • For a point B, the processed latitude and longitude coordinates are also (1, 1).
  • Point A and point B therefore fall in the same grid cell, represented by (1, 1).
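  • The spatial rasterization (dividing each coordinate by its step size and rounding down) can be sketched as follows. The function name `rasterize` and the raw coordinates chosen for points A and B are assumptions for illustration; the text only states that both points end up in cell (1, 1).

```python
import math

STEP1 = 0.0045  # longitude step in degrees (about 500 m in this embodiment)
STEP2 = 0.0045  # latitude step in degrees (about 500 m in this embodiment)


def rasterize(longitude: float, latitude: float,
              step1: float = STEP1, step2: float = STEP2) -> tuple:
    """Return the (Longi, Lati) grid cell that contains the raw coordinate."""
    return math.floor(longitude / step1), math.floor(latitude / step2)


# Hypothetical raw coordinates for points A and B (not given in the text).
point_a = (0.0050, 0.0060)
point_b = (0.0080, 0.0085)
print(rasterize(*point_a), rasterize(*point_b))
```

  Both hypothetical points lie between one and two step sizes from the origin in each dimension, so they map to the same cell. Coordinates that sit exactly on a cell boundary may be affected by floating-point rounding, which a production implementation would need to handle explicitly.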
  • After rasterization in the time dimension and in the spatial dimension, the spatiotemporal feature data of multiple users shown in Table-1 becomes the data shown in Table-3 below:
  • Table 4 shows the format of the original spatiotemporal feature data and the format of the mapped spatiotemporal feature data obtained after the rasterization process.
  • step 102 includes the following sub-steps:
  • Step 102a For each user, the user's spatiotemporal feature data is divided by cycle to obtain the user's spatiotemporal feature data in n cycles, where n is an integer greater than 1.
  • The cycle is usually one day, which best matches the user's daily activity pattern.
  • The cycle can also be defined in other ways, such as half a day or one week.
  • Step 102b For each of the n cycles, determine, according to the user's spatiotemporal feature data in the cycle, the actual geographic location of the user in each of the k time periods included in the cycle, where k is an integer greater than 1.
  • For example, with one day as one cycle and 30 minutes as one time period, one cycle includes 48 time periods.
  • For each of the k time periods included in the cycle, the user's spatiotemporal feature data in that time period is acquired.
  • When the user's spatiotemporal feature data in the time period all indicates the same actual geographic location, that location is determined as the actual geographic location of the user during the time period.
  • When the user's spatiotemporal feature data in the time period indicates multiple different actual geographic locations, the actual geographic location that is indicated the most times is determined as the actual geographic location of the user during the time period.
  • For example, suppose a user has 5 records of spatiotemporal feature data in a certain time period. If the actual geographic locations indicated by all 5 records are the same, namely latitude and longitude coordinate A, then the actual geographic location of the user in the time period is coordinate A.
  • If the actual geographic location indicated by three of the records is latitude and longitude coordinate B, and the locations indicated by the other two records are coordinate A and coordinate C respectively, then the actual geographic location of the user in the time period is coordinate B.
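  • The rule for picking one actual geographic location per time period can be sketched as follows. The function name `dominant_location` is an assumption, and so is the tie-breaking behaviour (the text does not say what happens when two locations are indicated equally often; here the one seen first wins).

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def dominant_location(locations: Sequence[Hashable]) -> Optional[Hashable]:
    """Return the location indicated most often by the records of one time period.

    Returns None when the user has no record in the time period.
    Ties are resolved in favour of the location that appears first (an assumption).
    """
    if not locations:
        return None
    location, _count = Counter(locations).most_common(1)[0]
    return location


print(dominant_location(["A", "A", "A", "A", "A"]))  # all five records agree
print(dominant_location(["B", "A", "B", "C", "B"]))  # B is indicated three times
```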
  • Step 102c Obtain a representative geographic location corresponding to the actual geographic location of the user in each time period, where the spatial region represented by the representative geographic location contains the actual geographic location.
  • The longitude coordinate of the user's actual geographic location in the time period is divided by a first preset value and the quotient is rounded (for example, rounded down) to obtain the longitude coordinate of the user's representative geographic location in the time period; the latitude coordinate of the user's actual geographic location in the time period is divided by a second preset value and the quotient is rounded (for example, rounded down) to obtain the latitude coordinate of the user's representative geographic location in the time period.
  • the first preset value and the second preset value may be the same or different.
  • The first preset value and the second preset value may be set according to the desired grid granularity. For example, when the two-dimensional position space is divided into square grid cells with a side length of 500 meters, the first preset value and the second preset value are both 0.0045. In practical applications, both values need to be chosen appropriately.
  • If the first preset value and the second preset value are too large, the grid is too coarse, which reduces the accuracy of the subsequent site selection; if they are too small, the grid is too fine, which makes it harder to extract frequent trajectories.
  • Each user's mapped spatiotemporal feature data includes the user's mapped spatiotemporal feature data in the n cycles, and the mapped spatiotemporal feature data in one cycle includes the representative geographic location in each time period included in that cycle.
  • Step 103 Determine a frequent trajectory of each user according to the mapped spatiotemporal feature data of each user.
  • The user's mapped spatiotemporal feature data over multiple cycles is analyzed to determine the frequent trajectory of the user. A frequent trajectory indicates the geographic locations where the user frequently appears in the same time periods of multiple cycles (for example, where the number of occurrences is greater than a preset threshold).
  • step 103 includes the following sub-steps:
  • Step 103a For each user, obtain a sequence corresponding to the mapped spatiotemporal feature data of the user in each period, and each element in the sequence represents a representative geographical location of the user in a time period;
  • Each element in the sequence can be expressed as "time period: representative geographic location", where the time period is the time period after rasterization and the representative geographic location is the latitude and longitude coordinate after rasterization.
  • “16: (26745, 7015)” indicates that the user appears at the location (26745, 7015) during the time period of 16.
  • a sequence corresponding to the mapped spatiotemporal feature data of the user in the target time period of each cycle is obtained.
  • the target period includes a plurality of time periods, and the plurality of time periods may be continuous or non-contiguous.
  • the target time period refers to a non-sleeping period, such as 7 am to 10 pm every day.
  • For example, the sequence corresponding to the mapped spatiotemporal feature data of a certain user in the non-sleeping period of a certain day is as follows: 14:A → 15:B → 16:B → 17:B → 18:C → 19:C → 20:C → 21:C → 22:C → 23:C → 24:C → 25:D → 26:C → 27:C → 28:C → 29:C → 30:C → 31:C → 32:C → 33:C → 34:C → 35:C → 36:C → 37:B → 38:B → 39:A → 40:A → 41:E → 42:A → 43:A → 44:A.
  • Step 103b Acquire, according to the n sequences corresponding to the user in n cycles, the number of occurrences of each subsequence in n sequences;
  • a subsequence refers to a sequence formed by any one of the elements or a combination of a plurality of elements.
  • Step 103c Select a subsequence that meets the first preset condition as the frequent sequence, and determine the mapped spatiotemporal feature data corresponding to the frequent sequence as the frequent trajectory of the user.
  • The first preset condition includes one or both of the following: the number of occurrences is greater than a first threshold; the number of elements included in the subsequence is greater than a second threshold.
  • the first threshold and the second threshold are empirical values preset according to actual conditions.
  • In one example, a PrefixSpan (Prefix-Projected Pattern Growth) algorithm is used to mine the frequent sequences.
  • The above three sequences include the following subsequences: <a>, <b>, <c>, <a b>, <b c>, <a c>, <a b c>. The number of occurrences of <a> is 3, of <b> is 2, of <c> is 2, of <a b> is 2, of <b c> is 1, of <a c> is 2, and of <a b c> is 1.
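  • The occurrence counts in this example can be reproduced with a brute-force sketch. Two things here are assumptions: the three input sequences are not reproduced in the text, so the ones below were chosen to be consistent with the stated counts; and the code enumerates every subsequence directly instead of using PrefixSpan, which is only practical for very short sequences.

```python
from collections import Counter
from itertools import combinations


def count_subsequences(sequences):
    """Count, for every order-preserving subsequence, how many sequences contain it."""
    counts = Counter()
    for seq in sequences:
        seen = set()  # each subsequence is counted at most once per sequence
        for size in range(1, len(seq) + 1):
            for idx in combinations(range(len(seq)), size):
                seen.add(tuple(seq[i] for i in idx))
        counts.update(seen)
    return counts


# Hypothetical input sequences, one per cycle (e.g. one per day).
sequences = [["a", "b", "c"], ["a", "b"], ["a", "c"]]
for sub, n in sorted(count_subsequences(sequences).items()):
    print(sub, n)
```

  The number of subsequences grows exponentially with sequence length, which is why a pattern-growth algorithm such as PrefixSpan would be preferred for day-long sequences.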
  • the PrefixSpan algorithm requires an input parameter called support, which is between 0 and 1.
  • the product of support and the total number of sequences is equal to the minimum threshold for the number of occurrences of frequent sequences.
  • The sequence elements a, b, and c in the above example can be regarded as representative geographic locations of the user in a time period. For example, a may be "16: (26745, 7015)", b may be "20: (26746, 7008)", and c may be "22: (26746, 7008)".
  • The result then means that the user appeared at location (26745, 7015) in time period 16 on 3 days, at location (26746, 7008) in time period 20 on 2 days, at location (26746, 7008) in time period 22 on 2 days, at both location (26745, 7015) in time period 16 and location (26746, 7008) in time period 20 on 2 days, and so on.
  • a subsequence having a number of occurrences greater than 10 and having a number of elements greater than 15 in the subsequence is selected as the frequent sequence.
  • Setting a minimum number of elements for a frequent sequence prevents the selected frequent sequences from being too fragmented, so that they better reflect a relatively complete daily activity path of the user.
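  • Selecting frequent sequences under the first preset condition can be sketched as follows. The function name `select_frequent` and the toy counts are assumptions; the sketch applies both parts of the condition together (occurrences greater than the first threshold and element count greater than the second threshold), which is one of the combinations the text allows.

```python
def select_frequent(counts, first_threshold, second_threshold):
    """Keep subsequences that occur often enough and contain enough elements."""
    return {
        sub: n
        for sub, n in counts.items()
        if n > first_threshold and len(sub) > second_threshold
    }


# Toy occurrence counts, keyed by subsequence (tuple of elements).
counts = {
    ("a",): 3, ("b",): 2, ("c",): 2,
    ("a", "b"): 2, ("b", "c"): 1, ("a", "c"): 2,
    ("a", "b", "c"): 1,
}
print(select_frequent(counts, first_threshold=1, second_threshold=1))
```

  With realistic data the thresholds would be much larger, for example 10 occurrences and 15 elements as in the text; the small values here only suit the toy counts.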
  • FIG. 3 shows a simple schematic diagram of a frequent trajectory. The places the user passes on the first day are indicated by circles (four locations: A, B, C, and D), and the places the user passes on the second day are indicated by triangles (four locations: A, C, D, and E). The frequent trajectory of the user over these two days may include three locations: A, C, and D.
  • The result of the rasterization processing makes it possible to tell whether the user frequently appears around a certain geographic location in the same time period of different cycles, which helps to extract the frequent trajectory of the user.
  • As shown in Table-5, user 1 has different raw spatiotemporal feature data, but the mapped spatiotemporal feature data obtained after rasterization is the same. If rasterization is not performed, that is, if the frequent trajectory of user 1 is extracted directly from the spatiotemporal feature data on the left side of Table-5, the frequent trajectory is hard to obtain, because raw time and location values rarely coincide exactly in practice.
  • Step 104 Determine an alternative address of the subject to be sited according to the frequent trajectories of the plurality of users.
  • A user's frequent trajectory reflects the geographic locations where the user often appears.
  • The frequent trajectories of a large number of users reflect the foot traffic of each geographic location, and this foot traffic is a relatively constant daily flow rather than a sudden surge on a particular day.
  • The subject to be sited may be a building, a hotel, a shopping mall, a playground, a gymnasium, or the like, which is not limited in the embodiments of the present application.
  • The technical solution provided by the embodiments of the present application is particularly applicable to siting commercial, profit-making premises such as stores.
  • When determining the alternative address, the foot traffic of each geographic location, the identities of the users targeted by the subject to be sited, the nature of the subject to be sited, and the surrounding land prices may be considered together.
  • step 104 includes the following sub-steps:
  • Step 104a constructing a track tag library according to frequent trajectories of multiple users
  • the track tag library includes track tag data for multiple users.
  • Each user's track tag data includes: a user ID, at least one frequent track.
  • the track tag data of each user further includes: the number of occurrences corresponding to each frequent track. For example, with 1 day as a period, the number of occurrences corresponding to each frequent trajectory is the number of days with the frequent trajectory.
  • the track tag library is shown in Table -6 below:
  • each user's track tag data further includes: the user's identity tag.
  • The user's identity tag is obtained in the following manner: the location information of the user during working hours is acquired according to the user's frequent trajectory, the POI information corresponding to the location information is obtained, and the user's identity tag is determined according to the POI information.
  • Working hours refer to the working hours of the working day, such as from 9 am to 5 pm, Monday through Friday.
  • the location information of the user during the working time period is used to indicate the geographical location of the user during the working time period.
  • For example, if the geographic location of user 1 during working hours is latitude and longitude coordinate A, and the POI information corresponding to coordinate A is the xx office building, the identity tag of user 1 can be determined to be white-collar.
  • Similarly, the identity tag of user 2 may be determined to be a medical worker. In this way, the user's identity tag can be inferred from the user's frequent trajectory.
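  • Inferring an identity tag from a frequent trajectory can be sketched as follows. Everything concrete here is hypothetical: the POI lookup table, the POI-to-tag mapping (including the hospital entry), the grid cells, and the function name `identity_tag`. Working hours are taken as 9 am to 5 pm, which with 30-minute slots numbered from 0 corresponds to slots 18 to 33; weekday filtering is omitted for brevity.

```python
from collections import Counter

# Hypothetical lookup tables; a real system would query a POI database.
POI_BY_CELL = {(26745, 7015): "office building", (26746, 7008): "hospital"}
TAG_BY_POI = {"office building": "white-collar", "hospital": "medical worker"}

WORK_SLOTS = range(18, 34)  # 09:00 to 17:00 with 30-minute slots


def identity_tag(frequent_trajectory):
    """Infer a tag from the cell where the user most often is during working hours.

    frequent_trajectory: list of (time_slot, grid_cell) pairs.
    Returns None when there is no working-hours location or no known POI.
    """
    work_cells = [cell for slot, cell in frequent_trajectory if slot in WORK_SLOTS]
    if not work_cells:
        return None
    cell = Counter(work_cells).most_common(1)[0][0]
    return TAG_BY_POI.get(POI_BY_CELL.get(cell))


user1 = [(16, (26746, 7008)), (20, (26745, 7015)), (22, (26745, 7015))]
print(identity_tag(user1))
```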
  • the user's social information may also be obtained, and the user's identity tag is extracted from the user's social information.
  • Inferring the user's identity tag from the user's frequent trajectory avoids an inaccurate tag caused by incorrect or outdated identity information filled in by the user in social data, which helps to improve the accuracy of the user's identity tag.
  • Step 104b based on the track tag library, determining an alternate address of the subject to be selected according to the second preset condition.
  • The second preset condition may be set by considering factors such as foot traffic, the identities of the users targeted by the subject to be sited, the nature of the subject to be sited, and the surrounding land prices.
  • For example, the second preset condition includes at least one of the following: the foot traffic in a certain time period in the area surrounding the geographic location is greater than a preset threshold; the proportion of users in the surrounding area whose identity tag is white-collar is greater than a preset proportion; there is no large business district in the surrounding area; the land price in the surrounding area is less than a preset price; and so on.
  • the second preset condition can be flexibly set according to the actual location requirement, which is not limited by the embodiment of the present application.
  • the pedestrian traffic at a geographic location within a certain time period can be estimated from the frequent trajectories of multiple users.
  • for example, the traffic at geographic location A between 7 am and 9 am can be estimated as follows: from the frequent trajectories of all users, find the target frequent trajectories that pass through geographic location A between 7 am and 9 am, and take the number of target frequent trajectories as the traffic at geographic location A between 7 am and 9 am.
  • the embodiment of the present application provides a technical solution that takes a user's spatio-temporal feature data as a reference and performs site selection by extracting the user's frequent trajectories, which overcomes the problems of the prior art's reliance on users' search records for site selection.
  • the spatio-temporal feature data of the user can be conveniently collected from the user's daily activities, thereby ensuring that the solution provided by the embodiment of the present application can be implemented effectively.
  • the frequent trajectory of a user reflects the geographic locations where the user often appears; using it for site selection accurately reflects the pedestrian traffic at each geographic location and the identities of the surrounding users, and is consistent with the fact that users tend to carry out daily activities (such as shopping) at the places where they often appear, which helps improve the accuracy of site selection.
  • FIG. 4 is a schematic diagram showing the composition of a site selection system provided by an embodiment of the present application.
  • the site selection system can include a data pre-processing platform 10, a data mining and analysis platform 20, and an alternate location recommendation platform 30.
  • FIG. 5 is a schematic diagram of the interaction between the components of the system when the site selection system shown in FIG. 4 is used for site selection.
  • the data pre-processing platform 10 is configured to pre-process the collected basic metadata to obtain spatio-temporal feature data of multiple users.
  • the basic metadata refers to the recorded data that reflects when and where the user is.
  • the data pre-processing platform 10 includes a database 11 and an ETL (Extract-Transform-Load) unit 12.
  • the base metadata is stored in the database 11.
  • the base metadata is stored in the form of a table.
  • a plurality of basic metadata tables can be stored in the database 11.
  • the latitude and longitude coordinates of the base station are used to approximate the actual geographic location of the user.
  • the terminal used by the user (such as a mobile phone) sends a signal to the base station that provides the service, and the base station can record the time when the terminal signal is received and the device identifier of the terminal.
  • for example, the basic metadata table of a base station records at least one correspondence between a terminal's device identifier and a time, as illustrated in Table-7 below:
  • the data ETL unit 12 extracts the basic metadata from the database 11, aggregates and transforms the basic metadata to obtain the spatio-temporal feature data of the user, and then transmits the spatio-temporal feature data of the user to the data mining and analysis platform 20.
  • the data ETL unit 12 can treat each terminal as a user and assign a corresponding user ID to it. For each user, the data ETL unit 12 extracts the times and the corresponding base station latitude and longitude coordinates from the basic metadata table of each base station, and integrates them to obtain the spatio-temporal feature data of the user.
  • the user's spatiotemporal feature data can be as shown in Table-1 above.
  • the data mining and analysis platform 20 provides functional units such as feature transformation and data mining.
  • the data mining and analysis platform 20 includes a data mapping unit 21 and a trajectory determining unit 22.
  • the data mapping unit 21 is configured to perform rasterization processing on the spatiotemporal feature data of each user, and convert the spatiotemporal feature data into the mapped spatiotemporal feature data.
  • the trajectory determining unit 22 is configured to determine the frequent trajectories of each user according to that user's mapped spatio-temporal feature data. For the specific processes of feature data mapping and frequent trajectory mining, refer to the description of the embodiment of FIG. 1; details are not repeated in this embodiment.
  • after the data mining and analysis platform 20 obtains the frequent trajectories of multiple users, it provides them to the alternate location recommendation platform 30.
  • the alternate location recommendation platform 30 is configured to determine the candidate address of the subject to be sited according to the frequent trajectories of the multiple users, and then provide that candidate address to the requesting party (e.g., a merchant).
  • the alternate location recommendation platform 30 provides the requesting party with functions for setting and selecting restriction conditions (i.e., the second preset condition introduced above); by setting and selecting appropriate restriction conditions, the requesting party causes the alternate location recommendation platform 30 to filter out suitable candidate addresses. For the specific process of determining the candidate address of the subject to be sited, refer to the description of the embodiment of FIG. 1; details are not repeated in this embodiment.
  • each platform involved in the location selection system shown in FIG. 4 above may be implemented by one or more servers, or the functions of multiple platforms may be integrated into one server.
  • to implement the above functions, the site selection device includes hardware structures and/or software modules (or units) corresponding to the execution of the respective functions.
  • in combination with the units and algorithm steps of the examples described in the embodiments disclosed in this application, the embodiments of the present application can be implemented in hardware or in a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the specific application and design constraints of the solution. A person skilled in the art can use different methods to implement the described functions for each specific application, but such implementation should not be considered to be beyond the scope of the technical solutions of the embodiments of the present application.
  • the embodiment of the present application may divide the site selection device into functional units according to the foregoing method example.
  • each functional unit may be divided according to each function, or two or more functions may be integrated into one processing unit.
  • the above integrated unit can be implemented in the form of hardware or in the form of a software functional unit. It should be noted that the division of the unit in the embodiment of the present application is schematic, and is only a logical function division. In actual implementation, there may be another division manner.
  • FIG. 6A shows a possible structural diagram of the site selection device involved in the above embodiment.
  • the site selection device 600 includes a processing unit 602 and a communication unit 603.
  • the processing unit 602 is configured to control and manage the actions of the site selection device 600.
  • for example, the processing unit 602 is configured to support the site selection device 600 in performing steps 101 through 104 of FIG. 1, and/or other steps of the techniques described herein.
  • the communication unit 603 is used to support communication between the site selection device 600 and other devices.
  • the site selection device 600 can also include a storage unit 601 for storing program code and data of the site selection device 600.
  • the processing unit 602 can be a processor or a controller, for example a central processing unit (CPU), a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It can implement or execute the various illustrative logical blocks, modules and circuits described in connection with the present disclosure.
  • the processor may also be a combination that implements computing functions, for example a combination of one or more microprocessors, or a combination of a DSP and a microprocessor.
  • the communication unit 603 can be a communication interface, a transceiver, a transceiver circuit, etc., where "communication interface" is a collective term and can include one or more interfaces, such as an interface between the site selection device and other devices.
  • the storage unit 601 can be a memory.
  • the processing unit 602 is a processor
  • the communication unit 603 is a communication interface
  • the storage unit 601 is a memory
  • the site selection device involved in the embodiment of the present application may be the site selection device shown in FIG. 6B.
  • the site selection device 610 includes a processor 612, a communication interface 613, and a memory 611.
  • the site selection device 610 can also include a bus 614.
  • the communication interface 613, the processor 612, and the memory 611 may be connected to each other through the bus 614.
  • the bus 614 may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like.
  • the bus 614 can be divided into an address bus, a data bus, a control bus, and the like. For ease of representation, only one thick line is shown in FIG. 6B, but this does not mean that there is only one bus or one type of bus.
  • the steps of the method or algorithm described in connection with the disclosure of the embodiments of the present application may be implemented in a hardware manner, or may be implemented by a processor executing software instructions.
  • the software instructions may consist of corresponding software modules (or units), which may be stored in random access memory (RAM), flash memory, read-only memory (ROM), erasable programmable ROM (EPROM), electrically erasable programmable read-only memory (EEPROM), a register, a hard disk, a removable hard disk, a CD-ROM, or any other form of storage medium well known in the art.
  • An exemplary storage medium is coupled to the processor to enable the processor to read information from the storage medium and to write information to the storage medium.
  • the storage medium can also be an integral part of the processor.
  • the processor and the storage medium can be located in an ASIC. Additionally, the ASIC can be located in the site selection device.
  • the processor and the storage medium can also exist as discrete components in the site selection device.
  • the functions described in the embodiments of the present application may be implemented in hardware, software, firmware, or any combination thereof.
  • the embodiment of the present application also provides a computer program product which, when executed, implements the above functions.
  • the computer program described above may be stored in a computer readable medium or transmitted as one or more instructions or code on a computer readable medium.
  • computer readable media include computer storage media and communication media, where communication media include any medium that facilitates transfer of a computer program from one place to another.
  • a storage medium may be any available media that can be accessed by a general purpose or special purpose computer.

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Fuzzy Systems (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Mathematical Physics (AREA)
  • Remote Sensing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Navigation (AREA)

Abstract

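A site selection method and device, relating to the technical field of big data analysis. The method includes: obtaining spatio-temporal feature data of multiple users, where the spatio-temporal feature data indicates the actual geographic location of a user at each moment; rasterizing the spatio-temporal feature data of each user to convert it into mapped spatio-temporal feature data; determining the frequent trajectories of each user according to that user's mapped spatio-temporal feature data; and determining a candidate address for the subject to be sited according to the frequent trajectories of the multiple users. The embodiments of this application provide a solution that takes users' spatio-temporal feature data as a reference and performs site selection by extracting frequent trajectories. A user's frequent trajectory reflects the geographic locations where the user often appears; site selection on this basis reflects the pedestrian traffic at each geographic location and the identities of the surrounding users, and is consistent with the fact that users tend to carry out daily activities at the places where they often appear, which helps improve the accuracy of site selection.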

Description

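Site selection method and device
This application claims priority to Chinese Patent Application No. 201710405595.2, filed with the Chinese Patent Office on June 1, 2017 and entitled "Site selection method and device", which is incorporated herein by reference in its entirety.
Technical field
The embodiments of this application relate to the technical field of big data analysis, and in particular to a site selection method and device.
Background
Site selection is the process of evaluating and deciding on an address before construction. For example, buildings such as shops, hotels, hospitals, and schools all require site selection before they are opened or built. Site selection must take many factors into account. Taking shop site selection as an example, the factors include regional pedestrian traffic, business districts, users' shopping habits, the nature of the shop itself, and surrounding land prices.
The prior art provides a method for shop site selection based on users' search records. Users typically search a map application for the shops they want to visit, and shop site selection is performed by obtaining and analyzing a large number of such search records. Take site selection for a newly planned "xx coffee shop" as an example. First, the records of a large number of users who searched for "xx coffee shop" in the map application are obtained, and the spatial distribution on the map of the users who searched for "xx coffee shop" is determined from these records. Areas where the number of searchers exceeds a preset threshold are selected as preliminary candidate areas, and areas that already have an "xx coffee shop" nearby are removed from the preliminary candidate areas to obtain the remaining candidate areas. These remaining candidate areas can be regarded as areas with demand for an "xx coffee shop". Then, further taking into account factors such as the pedestrian traffic, surrounding land price, and business district of each remaining candidate area, a suitable remaining candidate area is finally selected as the candidate address for opening the "xx coffee shop".
The above prior art approaches shop site selection from the perspective of user demand, and its prerequisite is obtaining a large number of users' search records for the planned shop in the map application. If search records are missing for some reason, for example when the planned shop is of a newly emerging type or is little known, the map application may contain few or even no search records for it, in which case the prior-art solution is difficult to implement effectively.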
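Summary
The embodiments of this application provide a site selection method and device to solve the problem that the prior-art solution is difficult to implement effectively when search records are missing.
In one aspect, a site selection method is provided. The method includes: obtaining spatio-temporal feature data of multiple users, where the spatio-temporal feature data indicates the actual geographic location of a user at each moment; rasterizing the spatio-temporal feature data of each user to convert it into mapped spatio-temporal feature data; determining the frequent trajectories of each user according to that user's mapped spatio-temporal feature data; and determining a candidate address for the subject to be sited according to the frequent trajectories of the multiple users.
The embodiments of this application provide a technical solution that takes users' spatio-temporal feature data as a reference and performs site selection by extracting users' frequent trajectories, which overcomes the problems of the prior art's reliance on users' search records. A user's spatio-temporal feature data can be conveniently collected from the user's daily activities, which ensures that the solution can be implemented effectively. In addition, a user's frequent trajectory reflects the geographic locations where the user often appears; using frequent trajectories for site selection accurately reflects the pedestrian traffic at each geographic location and the identities of the surrounding users, and is consistent with the fact that users tend to carry out daily activities (such as shopping) at the places where they often appear, which helps improve the accuracy of site selection.
In one possible design, rasterizing the spatio-temporal feature data of each user to convert it into mapped spatio-temporal feature data includes: for each user, dividing the user's spatio-temporal feature data by cycle to obtain the user's spatio-temporal feature data in n cycles, where n is an integer greater than 1; for each of the n cycles, determining, according to the user's spatio-temporal feature data in that cycle, the actual geographic location of the user in each of the k time periods included in the cycle, where k is an integer greater than 1; and obtaining the representative geographic location corresponding to the actual geographic location of the user in each time period, where the spatial region represented by the representative geographic location contains the actual geographic location. The mapped spatio-temporal feature data of each user includes that user's mapped spatio-temporal feature data in the n cycles, and the mapped spatio-temporal feature data in one cycle includes the representative geographic location in each time period of that cycle.
In one possible design, determining the frequent trajectories of each user according to that user's mapped spatio-temporal feature data includes: for each user, obtaining the sequence corresponding to the user's mapped spatio-temporal feature data in each cycle, where each element of the sequence represents the representative geographic location of the user in one time period; obtaining, from the n sequences corresponding to the n cycles, the number of occurrences of each subsequence in the n sequences; and selecting the subsequences that meet a first preset condition as frequent sequences, and determining the mapped spatio-temporal feature data corresponding to the frequent sequences as the user's frequent trajectories. In one possible design, the first preset condition includes the first or both of the following: the number of occurrences is greater than a first threshold; the number of elements in the subsequence is greater than a second threshold.
In the solution provided by the embodiments of this application, the result of rasterization makes it possible to obtain a user's frequent trajectories, that is, to identify whether the user often appears around a certain geographic location during the same time period of different cycles. Without rasterization, if frequent trajectories were extracted directly from the spatio-temporal feature data, raw time and space values rarely coincide in practice, so frequent trajectories would be hard to obtain.
In one possible design, determining a candidate address for the subject to be sited according to the frequent trajectories of the multiple users includes: building a trajectory tag library from the frequent trajectories of the multiple users, where the trajectory tag library includes the trajectory tag data of the multiple users and each user's trajectory tag data includes a user identifier and at least one frequent trajectory; and, based on the trajectory tag library, determining the candidate address of the subject to be sited according to a second preset condition.
In one possible design, the method further includes: obtaining, from the user's frequent trajectories, the user's location information during the working period; obtaining the Point of Information (POI) information corresponding to that location information; and determining the user's identity tag according to the POI information, where the user's trajectory tag data further includes the user's identity tag.
In the solution provided by the embodiments of this application, site selection also takes into account the identities of users in the area surrounding a geographic location, which makes candidate address recommendation finer-grained and more targeted. In addition, inferring a user's identity tag from the user's frequent trajectories avoids an inaccurate identity tag caused by identity information in social information that was filled in incorrectly or not updated in time, which helps improve the accuracy of determining the user's identity tag.
In another aspect, the embodiments of this application provide a site selection device that includes a processor and a memory, where the memory stores a computer-readable program and the processor runs the program in the memory to carry out the site selection method described in the above aspect.
In another aspect, the embodiments of this application provide a computer storage medium for storing computer software instructions used by the site selection device, which contains a program designed to execute the above aspect.
In yet another aspect, the embodiments of this application provide a computer program product which, when executed, performs the site selection method described in the above aspect.
Compared with the prior art, the embodiments of this application provide a technical solution that takes users' spatio-temporal feature data as a reference and performs site selection by extracting users' frequent trajectories. A user's spatio-temporal feature data can be conveniently collected from the user's daily activities, which ensures that the solution can be implemented effectively. In addition, a user's frequent trajectory reflects the geographic locations where the user often appears; using frequent trajectories for site selection accurately reflects the pedestrian traffic at each geographic location and the identities of the surrounding users, and is consistent with the fact that users tend to carry out daily activities (such as shopping) at the places where they often appear, which helps improve the accuracy of site selection.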
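Brief description of the drawings
FIG. 1 is a flowchart of a site selection method provided by an embodiment of this application;
FIG. 2 is a schematic diagram of spatial rasterization provided by an embodiment of this application;
FIG. 3 is a schematic diagram of a frequent trajectory provided by an embodiment of this application;
FIG. 4 is a schematic diagram of the composition of a site selection system provided by an embodiment of this application;
FIG. 5 is a schematic diagram of the interactions in a site selection system provided by an embodiment of this application;
FIG. 6A is a schematic block diagram of a site selection device provided by an embodiment of this application;
FIG. 6B is a schematic structural diagram of a site selection device provided by an embodiment of this application.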
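Detailed description
To make the objectives, technical solutions, and advantages of this application clearer, the technical solutions in the embodiments of this application are described below with reference to the accompanying drawings.
Refer to FIG. 1, which shows a flowchart of a site selection method provided by an embodiment of this application. The method may include the following steps.
Step 101: obtain spatio-temporal feature data of multiple users.
The spatio-temporal feature data indicates the actual geographic location of a user at each moment. It includes data in two dimensions: time and space. The spatial feature data may be expressed as latitude and longitude coordinates. In other possible implementations, it may instead be expressed as roads, intersections, landmarks, and the like, which is not limited in the embodiments of this application. In short, it is sufficient that the spatio-temporal feature data reflects when the user was where.
By way of example, the spatio-temporal feature data of multiple users is shown in Table-1 below:
Figure PCTCN2018083627-appb-000001
Table-1
In addition, a user's spatio-temporal feature data may be obtained with a positioning algorithm, for example a base station positioning algorithm or a Global Positioning System (GPS) positioning algorithm. Base station positioning algorithms include three-base-station and multi-base-station algorithms. Their principle is that the terminal measures the downlink pilot signals of different base stations to obtain the time of arrival (TOA) or time difference of arrival (TDOA) of the downlink pilots; from these measurements and the coordinates of the base stations, the position of the terminal can be calculated, generally with a triangulation-based estimation algorithm.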
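Step 102: rasterize the spatio-temporal feature data of each user to convert it into mapped spatio-temporal feature data.
Rasterization means gridding a user's spatio-temporal feature data so that it is mapped to the corresponding time period and spatial region. It comprises rasterization in the time dimension and rasterization in the spatial dimension. Rasterization in the time dimension grids the user's time feature data so that it is mapped to the corresponding time period. Rasterization in the spatial dimension grids the user's spatial feature data so that it is mapped to the corresponding spatial region.
Rasterization in the time dimension is performed as follows: the time axis is divided into segments, and the user's time feature data is mapped to the corresponding time period.
For example, with one day as the unit, the day is segmented along the time axis. With 0:00 as the origin of the time axis and 30 minutes as one time period (also called a slot), each day is divided into 48 time periods. Any moment between 08:00:00 and 08:30:00 is discretized to the number 16. Time periods may also be divided in other ways as required, such as 10 minutes or 1 hour per time period. If a user appears at several different places within one time period, the place that appears most often may be taken as the user's actual geographic location in that time period.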
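The time-dimension mapping described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `time_slot` and the choice to treat each slot as a half-open interval (so that exactly 08:30:00 falls into slot 17) are assumptions.

```python
from datetime import datetime

SLOT_MINUTES = 30  # slot length from the example; 10 or 60 minutes are also possible


def time_slot(ts: datetime, slot_minutes: int = SLOT_MINUTES) -> int:
    """Map a timestamp to the index of its time period, counting from 00:00.

    Assumption: slots are half-open intervals [start, end), so 08:30:00
    belongs to the next slot.
    """
    minutes_since_midnight = ts.hour * 60 + ts.minute
    return minutes_since_midnight // slot_minutes


# 08:15 lies in the 08:00-08:30 time period, which has index 16.
slot = time_slot(datetime(2017, 6, 1, 8, 15, 0))
```

With a 30-minute slot the indices run from 0 to 47, matching the 48 time periods per day in the example.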
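After rasterization in the time dimension, the spatio-temporal feature data of the multiple users shown in Table-1 becomes the data shown in Table-2 below:
User ID   Time
ID1       16
ID1       20
ID1       22
ID1       30
ID2       16
ID2       18
ID2       20
ID2       22
ID3       17
...       ...
IDn       47
Table-2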
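Rasterization in the spatial dimension is performed as follows: the two-dimensional location space (that is, geographic space) is partitioned into grid cells by latitude and longitude, and the user's spatial feature data is mapped to the corresponding grid cell; one grid cell represents one spatial region.
For example, the two-dimensional location space is divided into grid cells X meters long and Y meters wide. With X=Y=500, the space is divided into square grid cells with a side length of 500 meters. The user's actual geographic location at any moment can be located in one of these cells. As shown in FIG. 2, in this embodiment latitude and longitude 0:0 is taken as the coordinate origin; a distance of 500 meters corresponds to a step (step1) of 0.0045 degrees of longitude, and likewise to a step (step2) of 0.0045 degrees of latitude.
Rasterization in the spatial dimension can be calculated with the following formulas:
(1) Longitude conversion: Longi=Math.floor(longitude/step1)
(2) Latitude conversion: Lati=Math.floor(latitude/step2)
Here, Math.floor(x) is the largest integer less than or equal to x, longitude is the longitude coordinate before rasterization, latitude is the latitude coordinate before rasterization, Longi is the longitude coordinate after rasterization, Lati is the latitude coordinate after rasterization, step1 is the longitude step, and step2 is the latitude step.
According to FIG. 2, every 500 meters to the right increases the longitude by 0.0045, and every 500 meters upward increases the latitude by 0.0045. For a point A in the grid whose original coordinates are (0.0045, 0.0045), the conversion formulas give Longi=Math.floor(0.0045/0.0045) and Lati=Math.floor(0.0045/0.0045), so the rasterized coordinates are (1,1). For another point B whose original coordinates are (0.0055, 0.0055), the formulas give Longi=Math.floor(0.0055/0.0045) and Lati=Math.floor(0.0055/0.0045), so the rasterized coordinates are also (1,1). Rasterization therefore places point A and point B in the same grid cell, represented by (1,1).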
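The two conversion formulas can be tried out with a short sketch. The function name `grid_cell` is hypothetical; the step values are the 0.0045-degree steps from the example, and plain floating-point division is assumed to be precise enough for illustration (values lying exactly on a cell boundary may need decimal arithmetic in practice).

```python
import math

STEP_LON = 0.0045  # step1: longitude step for roughly 500 m, as in the example
STEP_LAT = 0.0045  # step2: latitude step for roughly 500 m, as in the example


def grid_cell(longitude, latitude, step_lon=STEP_LON, step_lat=STEP_LAT):
    """Return the (Longi, Lati) grid cell that contains the given coordinates."""
    longi = math.floor(longitude / step_lon)
    lati = math.floor(latitude / step_lat)
    return (longi, lati)


cell_a = grid_cell(0.0045, 0.0045)  # point A from the text
cell_b = grid_cell(0.0055, 0.0055)  # point B from the text
same_cell = cell_a == cell_b
```

Both points land in cell (1, 1), which is what allows nearby raw positions to be treated as the same place in the later mining step.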
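After rasterization in both the time dimension and the spatial dimension, the spatio-temporal feature data of the multiple users shown in Table-1 becomes the data shown in Table-3 below:
User ID   Time   Space (longitude, latitude)
ID1       16     26745,7015
ID1       20     26746,7008
ID1       22     26746,7008
ID1       30     26746,7008
ID2       16     26747,7008
ID2       18     26734,7019
ID2       20     26733,7016
ID2       22     26733,7016
ID3       17     26745,7015
...       ...    ...
IDn       47     26743,7012
Table-3
Table-4 below shows the format of the original spatio-temporal feature data and the format of the mapped spatio-temporal feature data obtained after rasterization.
Figure PCTCN2018083627-appb-000002
Figure PCTCN2018083627-appb-000003
Table-4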
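In one example, step 102 includes the following sub-steps:
Step 102a: for each user, divide the user's spatio-temporal feature data by cycle to obtain the user's spatio-temporal feature data in n cycles, where n is an integer greater than 1;
Optionally, the cycle is usually 1 day, which better matches the pattern of users' daily activities. Cycles may also be divided in other ways as required, such as half a day or 1 week.
Step 102b: for each of the n cycles, determine, according to the user's spatio-temporal feature data in that cycle, the actual geographic location of the user in each of the k time periods included in the cycle, where k is an integer greater than 1;
For example, with 1 day as a cycle and 30 minutes as a time period, one cycle includes 48 time periods.
Optionally, for each cycle, the user's spatio-temporal feature data in each of the k time periods of the cycle is obtained. For each time period, when the user's spatio-temporal feature data in that time period indicates a single actual geographic location, that location is determined to be the user's actual geographic location in the time period; when the data indicates several different actual geographic locations, the location indicated by the largest number of records is determined to be the user's actual geographic location in the time period.
For example, if a user has 5 spatio-temporal feature data records in a time period and all 5 indicate the same actual geographic location, latitude and longitude coordinate A, then the user's actual geographic location in that time period is coordinate A. As another example, if a user has 5 records in a time period, of which 3 indicate latitude and longitude coordinate B and the other 2 indicate coordinate A and coordinate C respectively, then the user's actual geographic location in that time period is coordinate B.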
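The rule in step 102b amounts to a majority vote over the records of one time period. A minimal sketch follows, assuming a hypothetical function name `slot_location` and string placeholders for coordinates; the source does not say how ties are broken, so here the location seen first among the tied ones is returned.

```python
from collections import Counter


def slot_location(records):
    """Return the location indicated by the most records in one time period.

    `records` is a list of hashable locations, one per record.
    Returns None when the time period has no records (an assumption).
    """
    if not records:
        return None
    return Counter(records).most_common(1)[0][0]


all_same = slot_location(["A", "A", "A", "A", "A"])   # five records, all A
majority = slot_location(["B", "A", "B", "C", "B"])   # three B, one A, one C
```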
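Step 102c: obtain the representative geographic location corresponding to the actual geographic location of the user in each time period, where the spatial region represented by the representative geographic location contains the actual geographic location.
Optionally, for each time period, the longitude coordinate of the user's actual geographic location in that time period is divided by a first preset value and the quotient is rounded (for example, rounded down) to obtain the longitude coordinate of the user's representative geographic location in that time period; the latitude coordinate of the user's actual geographic location in that time period is divided by a second preset value and the quotient is rounded (for example, rounded down) to obtain the latitude coordinate of the representative geographic location. The first and second preset values may be the same or different, and can be set according to the granularity of the grid; for example, when the two-dimensional location space is divided into square grid cells with a side length of 500 meters, both preset values are 0.0045. In practice the two preset values must be chosen sensibly. If they are too large, the grid is too coarse, which reduces the precision of the subsequent site selection; if they are too small, the grid is too fine, which makes frequent trajectories hard to extract.
In this way, the mapped spatio-temporal feature data of each user includes that user's mapped spatio-temporal feature data in the n cycles, and the mapped spatio-temporal feature data in one cycle includes the representative geographic location in each time period of that cycle.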
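Step 103: determine the frequent trajectories of each user according to that user's mapped spatio-temporal feature data.
For each user, the user's mapped spatio-temporal feature data over multiple cycles is accumulated to determine the user's frequent trajectories. A frequent trajectory indicates the geographic locations where the user often appears (for example, more often than a preset threshold) during the same time periods of multiple cycles.
In one example, step 103 includes the following sub-steps:
Step 103a: for each user, obtain the sequence corresponding to the user's mapped spatio-temporal feature data in each cycle, where each element of the sequence represents the representative geographic location of the user in one time period;
For example, each element of the sequence can be written as "time period:representative geographic location", where the time period is the rasterized time period and the representative geographic location is the rasterized latitude and longitude coordinates. For example, "16:(26745,7015)" means that the user appears at location (26745,7015) in time period 16.
Optionally, for each user, the sequence corresponding to the user's mapped spatio-temporal feature data within a target period of each cycle is obtained. The target period includes multiple time periods, which may be consecutive or non-consecutive. Optionally, the target period is the non-sleeping period, for example 7 am to 10 pm every day. For example, with 1 day as a cycle and 30 minutes as a time period, the sequence corresponding to a user's mapped spatio-temporal feature data in the non-sleeping period of one day is: 14:A→15:B→16:B→17:B→18:C→19:C→20:C→21:C→22:C→23:C→24:C→25:D→26:C→27:C→28:C→29:C→30:C→31:C→32:C→33:C→34:C→35:C→36:C→37:B→38:B→39:A→40:A→41:E→42:A→43:A→44:A.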
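Building the per-cycle sequence of step 103a can be sketched as below. The function name `day_sequence` and the dictionary input are assumptions; the default bounds 14 and 44 are taken from the example sequence above, which runs from time period 14 to time period 44.

```python
def day_sequence(slot_to_cell, first_slot=14, last_slot=44):
    """Build one cycle's sequence of (time period, representative location) pairs.

    `slot_to_cell` maps a time-period index to the representative location in
    that time period; time periods without data are skipped (an assumption).
    """
    return [
        (slot, slot_to_cell[slot])
        for slot in range(first_slot, last_slot + 1)
        if slot in slot_to_cell
    ]


# Time periods 13 and 45 fall outside the target period and are dropped.
sequence = day_sequence({13: "A", 14: "A", 15: "B", 16: "B", 45: "A"})
```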
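Step 103b: from the n sequences corresponding to the user's n cycles, obtain the number of occurrences of each subsequence in the n sequences;
A subsequence is a sequence formed by any single element of a sequence or by a combination of several of its elements.
Step 103c: select the subsequences that meet the first preset condition as frequent sequences, and determine the mapped spatio-temporal feature data corresponding to the frequent sequences as the user's frequent trajectories.
The first preset condition includes the first or both of the following: the number of occurrences is greater than a first threshold; the number of elements in the subsequence is greater than a second threshold. The first and second thresholds are empirical values set in advance according to the actual situation.
Optionally, the PrefixSpan (Prefix-Projected Pattern Growth) algorithm is used to extract frequent sequences. Suppose there are the following 3 sequences: <a b c>, <a b>, and <a c>, which can be regarded as the 3 sequences of one user in 3 different cycles. These 3 sequences contain the following subsequences: <a>, <b>, <c>, <a b>, <b c>, <a c>, <a b c>, where <a> occurs 3 times, <b> occurs 2 times, <c> occurs 2 times, <a b> occurs 2 times, <b c> occurs 1 time, <a c> occurs 2 times, and <a b c> occurs 1 time.
The PrefixSpan algorithm requires one input parameter, called the support, whose value lies between 0 and 1. The product of the support and the total number of sequences equals the minimum number of occurrences for a frequent sequence. In the example above, the total number of sequences is 3; with a support of 0.5, subsequences that occur 3×0.5=1.5≈2 or more times are considered frequent sequences. That is, the frequent sequences are <a>, <b>, <c>, <a b>, and <a c>.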
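The counts in this example can be reproduced with a brute-force enumeration. Note that this sketch is not PrefixSpan itself: it simply enumerates every order-preserving subsequence, which is exponential in sequence length and only suitable for tiny inputs. The function name and the use of a ceiling to turn 1.5 into 2 are assumptions consistent with the text.

```python
import math
from collections import Counter
from itertools import combinations


def frequent_subsequences(sequences, support):
    """Count, for each subsequence, how many input sequences contain it,
    and keep those that reach the minimum count implied by `support`."""
    min_count = math.ceil(support * len(sequences))
    counts = Counter()
    for seq in sequences:
        found = set()
        for size in range(1, len(seq) + 1):
            # combinations() keeps element order, so each result is a subsequence
            found.update(combinations(seq, size))
        counts.update(found)  # each sequence contributes at most 1 per subsequence
    return {sub: n for sub, n in counts.items() if n >= min_count}


sequences = [("a", "b", "c"), ("a", "b"), ("a", "c")]
frequent = frequent_subsequences(sequences, support=0.5)
```

With a support of 0.5 the result contains `<a>`, `<b>`, `<c>`, `<a b>`, and `<a c>`, as in the text. A real deployment would use an actual PrefixSpan implementation, which avoids enumerating infrequent candidates.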
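When the PrefixSpan algorithm is applied in the embodiments of this application to mine frequent trajectories, the sequence elements a, b, and c in the example above can be regarded as the representative geographic locations of the user in individual time periods; for instance, a may be "16:(26745,7015)", b may be "20:(26746,7008)", and c may be "22:(26746,7008)". With a cycle of 1 day, the result shows that the user appeared at location (26745,7015) in time period 16 on 3 days, at location (26746,7008) in time period 20 on 2 days, at location (26746,7008) in time period 22 on 2 days, at location (26745,7015) in time period 16 and at location (26746,7008) in time period 20 on 2 days, and so on.
In one example, subsequences that occur more than 10 times and contain more than 15 elements are selected as frequent sequences. In the embodiments of this application, setting a minimum number of elements for a frequent sequence prevents the selected frequent sequences from being too fragmented, so that they better reflect a fairly complete daily activity path of the user.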
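The first preset condition ("the first item, or both items") can be expressed as a small predicate. The function name and the `require_length` switch are hypothetical; the default thresholds 10 and 15 are the values from the example above.

```python
def meets_first_condition(subsequence, occurrences,
                          first_threshold=10, second_threshold=15,
                          require_length=True):
    """Check the first preset condition for one candidate subsequence.

    The occurrence test always applies; the element-count test applies only
    when `require_length` is True (the "both items" variant).
    """
    if occurrences <= first_threshold:
        return False
    if require_length and len(subsequence) <= second_threshold:
        return False
    return True


long_path = [(slot, "C") for slot in range(18, 34)]   # 16 elements
short_path = [(16, "A"), (20, "B")]                   # 2 elements
checks = (
    meets_first_condition(long_path, occurrences=12),
    meets_first_condition(short_path, occurrences=12),
    meets_first_condition(short_path, occurrences=12, require_length=False),
)
```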
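Refer also to FIG. 3, which shows a simple schematic diagram of a frequent trajectory. The places a user passed on the first day are marked with circles (four places: A, B, C, and D), and the places the user passed on the second day are marked with triangles (four places: A, C, D, and E); the user's frequent trajectory over these two days may then include the three places A, C, and D.
In the embodiments of this application, the result of rasterization makes it possible to obtain a user's frequent trajectories, that is, to identify whether the user often appears around a certain geographic location during the same time period of different cycles. As shown in Table-5 below, user 1 has different spatio-temporal feature data, but the mapped spatio-temporal feature data obtained after rasterization is the same. Without rasterization, that is, if the frequent trajectories of user 1 were extracted directly from the spatio-temporal feature data on the left side of Table-5, raw time and space values rarely coincide in practice, so frequent trajectories would be hard to obtain.
Figure PCTCN2018083627-appb-000004
Table-5
In addition, the embodiments of this application use the PrefixSpan algorithm as an example for mining frequent trajectories; in other possible implementations, an algorithm for finding the longest common substring, or another algorithm, may be used instead.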
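Step 104: determine a candidate address for the subject to be sited according to the frequent trajectories of the multiple users.
A user's frequent trajectory reflects the geographic locations where the user often appears, and the frequent trajectories of a large number of users reflect the pedestrian traffic at each geographic location. This is the fairly stable day-to-day traffic rather than a sudden surge on a particular day.
The subject to be sited may be a building such as a shop, hotel, shopping mall, amusement park, or stadium, which is not limited in the embodiments of this application. Optionally, the technical solution provided by the embodiments is better suited to siting commercial, profit-oriented buildings such as shops.
When determining the candidate address, factors such as the pedestrian traffic at each geographic location, the identity of the users targeted by the subject to be sited, the nature of the subject itself, and surrounding land prices can be considered together.
In one example, step 104 includes the following sub-steps:
Step 104a: build a trajectory tag library from the frequent trajectories of the multiple users;
The trajectory tag library includes the trajectory tag data of multiple users. Each user's trajectory tag data includes a user identifier and at least one frequent trajectory. Optionally, it also includes the number of occurrences of each frequent trajectory. For example, with 1 day as a cycle, the number of occurrences of a frequent trajectory is the number of days on which that trajectory was followed.
By way of example, the trajectory tag library is shown in Table-6 below:
Figure PCTCN2018083627-appb-000005
Figure PCTCN2018083627-appb-000006
Table-6
Optionally, each user's trajectory tag data also includes the user's identity tag.
In one possible implementation, the user's identity tag is obtained as follows: obtain the user's location information during the working period from the user's frequent trajectories, obtain the POI information corresponding to that location information, and determine the user's identity tag according to the POI information. The working period is the working hours on working days, for example 9 am to 5 pm from Monday to Friday. The user's location information during the working period indicates the geographic location of the user during that period. In one example, suppose the geographic location of user 1 during the working period is latitude and longitude coordinate A, and the POI information corresponding to coordinate A is the xx office building; the identity tag of user 1 can then be determined to be white-collar. In another example, suppose the geographic location of user 2 during the working period is latitude and longitude coordinate B, and the POI information corresponding to coordinate B is the xx hospital; the identity tag of user 2 can then be determined to be medical worker. In this way, the user's identity tag can be inferred from the user's frequent trajectories.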
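The identity-tag inference can be sketched as a two-stage lookup. Everything named here is hypothetical: the `POI_BY_CELL` and `TAG_BY_POI` tables stand in for a real POI service and tagging rules, the working period is approximated by the 30-minute time periods 18 to 33 (9 am to 5 pm), and weekday filtering is left out for brevity.

```python
from collections import Counter

# Hypothetical lookup tables standing in for a POI service and tagging rules.
POI_BY_CELL = {(26746, 7008): "office building", (26733, 7016): "hospital"}
TAG_BY_POI = {"office building": "white-collar", "hospital": "medical worker"}

WORK_SLOTS = range(18, 34)  # 09:00-17:00 with 30-minute time periods


def identity_tag(frequent_trajectory, work_slots=WORK_SLOTS):
    """Infer an identity tag from one frequent trajectory.

    `frequent_trajectory` is a list of (time period, grid cell) pairs.
    Returns None when no working-period location or no matching POI is found.
    """
    cells = [cell for slot, cell in frequent_trajectory if slot in work_slots]
    if not cells:
        return None
    work_cell = Counter(cells).most_common(1)[0][0]
    return TAG_BY_POI.get(POI_BY_CELL.get(work_cell))


user1 = [(16, (26745, 7015)), (20, (26746, 7008)), (22, (26746, 7008))]
user2 = [(20, (26733, 7016)), (22, (26733, 7016))]
tags = (identity_tag(user1), identity_tag(user2))
```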
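In other possible implementations, the user's social information may be obtained and the user's identity tag extracted from it. By comparison, inferring the identity tag from the user's frequent trajectories avoids an inaccurate identity tag caused by identity information in social information that was filled in incorrectly or not updated in time, which helps improve the accuracy of determining the user's identity tag.
Step 104b: based on the trajectory tag library, determine the candidate address of the subject to be sited according to the second preset condition.
The second preset condition can be set by jointly considering factors such as pedestrian traffic, the identity of the users targeted by the subject to be sited, the nature of the subject itself, and surrounding land prices. For example, the second preset condition includes at least one of the following: the pedestrian traffic in the area surrounding the geographic location within a certain time period is greater than a preset threshold; the proportion of users in the surrounding area whose identity tag is white-collar is greater than a preset proportion; there is no large business district in the surrounding area; the land price in the surrounding area is lower than a preset price; and so on. In practice, the second preset condition can be set flexibly according to the actual site selection requirements, which is not limited in the embodiments of this application.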
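Filtering candidate locations against the second preset condition might look like the sketch below. The per-cell statistics, their field names, and the thresholds are all hypothetical, and this version applies all four example conditions at once, whereas the text only requires at least one of them to be configured.

```python
def candidate_cells(cell_stats, min_traffic, min_white_collar_share, max_land_price):
    """Return the grid cells whose surrounding-area statistics pass every check.

    `cell_stats` maps a grid cell to a dict with the hypothetical keys
    "traffic", "white_collar_share", "has_large_business_district", "land_price".
    """
    return [
        cell
        for cell, stats in cell_stats.items()
        if stats["traffic"] > min_traffic
        and stats["white_collar_share"] > min_white_collar_share
        and not stats["has_large_business_district"]
        and stats["land_price"] < max_land_price
    ]


stats = {
    (26746, 7008): {"traffic": 1200, "white_collar_share": 0.6,
                    "has_large_business_district": False, "land_price": 30000},
    (26733, 7016): {"traffic": 300, "white_collar_share": 0.7,
                    "has_large_business_district": False, "land_price": 20000},
    (26745, 7015): {"traffic": 2000, "white_collar_share": 0.8,
                    "has_large_business_district": True, "land_price": 25000},
}
candidates = candidate_cells(stats, min_traffic=1000,
                             min_white_collar_share=0.5, max_land_price=40000)
```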
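In addition, the pedestrian traffic at a geographic location within a certain time period can be estimated from the frequent trajectories of multiple users. For example, the traffic at geographic location A between 7 am and 9 am can be estimated as follows: from the frequent trajectories of all users, find the target frequent trajectories that pass through geographic location A between 7 am and 9 am, and take the number of target frequent trajectories as the traffic at geographic location A between 7 am and 9 am.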
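The traffic estimate reduces to counting trajectories in the tag library. In this sketch the library layout (a list of dicts with a hypothetical `trajectories` key) and the function name are assumptions; with 30-minute time periods, 7 am to 9 am corresponds to time periods 14 to 17.

```python
def estimate_traffic(tag_library, cell, slots):
    """Count the frequent trajectories that pass through `cell` in any of `slots`.

    `tag_library` is a list of per-user entries; each entry holds a list of
    frequent trajectories, and each trajectory is a list of
    (time period, location) pairs.
    """
    count = 0
    for entry in tag_library:
        for trajectory in entry["trajectories"]:
            if any(slot in slots and place == cell for slot, place in trajectory):
                count += 1
    return count


library = [
    {"user_id": "ID1", "trajectories": [[(14, "A"), (20, "B")], [(30, "A")]]},
    {"user_id": "ID2", "trajectories": [[(16, "A")]]},
    {"user_id": "ID3", "trajectories": [[(15, "C")]]},
]
morning_traffic = estimate_traffic(library, "A", range(14, 18))
```

Here two trajectories pass through location A in the morning window; the trajectory that visits A only in time period 30 is not counted.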
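The embodiments of this application provide a technical solution that takes users' spatio-temporal feature data as a reference and performs site selection by extracting users' frequent trajectories, which overcomes the problems of the prior art's reliance on users' search records. A user's spatio-temporal feature data can be conveniently collected from the user's daily activities, which ensures that the solution can be implemented effectively.
In addition, a user's frequent trajectory reflects the geographic locations where the user often appears; using frequent trajectories for site selection accurately reflects the pedestrian traffic at each geographic location and the identities of the surrounding users, and is consistent with the fact that users tend to carry out daily activities (such as shopping) at the places where they often appear, which helps improve the accuracy of site selection.
Refer to FIG. 4, which shows a schematic diagram of the composition of a site selection system provided by an embodiment of this application. The site selection system may include a data pre-processing platform 10, a data mining and analysis platform 20, and an alternate location recommendation platform 30. Refer also to FIG. 5, which shows the interactions between the components of the system when the site selection system shown in FIG. 4 is used for site selection.
The data pre-processing platform 10 is used to pre-process the collected basic metadata to obtain the spatio-temporal feature data of multiple users. The basic metadata is recorded data that reflects when the user was where.
Optionally, the data pre-processing platform 10 includes a database 11 and a data ETL (Extract-Transform-Load) unit 12.
The database 11 stores the basic metadata. Optionally, the basic metadata is stored in the form of tables, and the database 11 can store multiple basic metadata tables. In one example, the latitude and longitude coordinates of a base station are used to approximate the user's actual geographic location. The terminal used by the user (such as a mobile phone) sends signals to the base station serving it, and the base station can record the time at which the terminal's signal was received and the device identifier of the terminal. For example, the basic metadata table of a base station records at least one correspondence between a terminal's device identifier and a time, as illustrated in Table-7 below:
Figure PCTCN2018083627-appb-000007
Table-7
The data ETL unit 12 extracts the basic metadata from the database 11, aggregates and transforms it to obtain the users' spatio-temporal feature data, and then transmits the users' spatio-temporal feature data to the data mining and analysis platform 20.
Following the example above, the data ETL unit 12 can treat each terminal as a user and assign it a corresponding user ID. For each user, the data ETL unit 12 extracts the times and the corresponding base station latitude and longitude coordinates from the basic metadata table of each base station, and integrates them to obtain the user's spatio-temporal feature data. The user's spatio-temporal feature data can be as shown in Table-1 above.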
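The aggregation performed by the data ETL unit can be illustrated with in-memory tables. This is a sketch under stated assumptions: the input layout (base station coordinates mapped to rows of device identifier and time), the function name, and the use of the device identifier directly as the user ID are all hypothetical.

```python
from collections import defaultdict


def build_spatiotemporal_records(base_station_tables):
    """Merge per-base-station tables into per-user spatio-temporal records.

    `base_station_tables` maps a base station's (longitude, latitude) to a
    list of (device_id, timestamp) rows. Each device is treated as one user.
    Returns {device_id: [(timestamp, longitude, latitude), ...]} sorted by time.
    """
    records = defaultdict(list)
    for (lon, lat), rows in base_station_tables.items():
        for device_id, timestamp in rows:
            records[device_id].append((timestamp, lon, lat))
    for rows in records.values():
        rows.sort()  # ISO-style timestamp strings sort chronologically
    return dict(records)


tables = {
    (120.35, 31.56): [("dev1", "2017-06-01 08:10:00"), ("dev2", "2017-06-01 08:12:00")],
    (120.36, 31.53): [("dev1", "2017-06-01 07:50:00")],
}
user_records = build_spatiotemporal_records(tables)
```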
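The data mining and analysis platform 20 provides functional units for feature transformation, data mining, and the like.
Optionally, in the embodiments of this application, the data mining and analysis platform 20 includes a data mapping unit 21 and a trajectory determining unit 22. The data mapping unit 21 is used to rasterize the spatio-temporal feature data of each user to convert it into mapped spatio-temporal feature data. The trajectory determining unit 22 is used to determine the frequent trajectories of each user according to that user's mapped spatio-temporal feature data. For the specific processes of feature data mapping and frequent trajectory mining, refer to the description of the embodiment of FIG. 1; details are not repeated in this embodiment.
After the data mining and analysis platform 20 obtains the frequent trajectories of multiple users, it provides them to the alternate location recommendation platform 30.
The alternate location recommendation platform 30 is used to determine the candidate address of the subject to be sited according to the frequent trajectories of the multiple users, and then provide that candidate address to the requesting party (for example, a merchant). Optionally, the alternate location recommendation platform 30 provides the requesting party with functions for setting and selecting restriction conditions (that is, the second preset condition introduced above); by setting and selecting appropriate restriction conditions, the requesting party causes the alternate location recommendation platform 30 to filter out suitable candidate addresses. For the specific process of determining the candidate address of the subject to be sited, refer to the description of the embodiment of FIG. 1; details are not repeated in this embodiment.
It should be noted that each platform in the site selection system shown in FIG. 4 may be implemented by one or more servers, or the functions of multiple platforms may be integrated into one server.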
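The above method embodiments describe the technical solution provided by this application from the perspective of the site selection device. It can be understood that, to implement the above functions, the site selection device includes hardware structures and/or software modules (or units) corresponding to the execution of the respective functions. In combination with the units and algorithm steps of the examples described in the embodiments disclosed in this application, the embodiments can be implemented in hardware or in a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the specific application and design constraints of the technical solution. A person skilled in the art may use different methods to implement the described functions for each specific application, but such implementation should not be considered to be beyond the scope of the technical solutions of the embodiments of this application.
The embodiments of this application may divide the site selection device into functional units according to the above method examples. For example, each functional unit may correspond to one function, or two or more functions may be integrated into one processing unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit. It should be noted that the division of units in the embodiments of this application is schematic and is only a division by logical function; other divisions are possible in an actual implementation.
Where integrated units are used, FIG. 6A shows a possible schematic structure of the site selection device involved in the above embodiments. The site selection device 600 includes a processing unit 602 and a communication unit 603. The processing unit 602 is used to control and manage the actions of the site selection device 600. For example, the processing unit 602 is used to support the site selection device 600 in performing steps 101 to 104 in FIG. 1, and/or other steps of the techniques described herein. The communication unit 603 is used to support communication between the site selection device 600 and other devices. The site selection device 600 may also include a storage unit 601 for storing the program code and data of the site selection device 600.
The processing unit 602 may be a processor or a controller, for example a central processing unit (CPU), a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It can implement or execute the various illustrative logical blocks, modules, and circuits described in connection with the disclosure of this application. The processor may also be a combination that implements computing functions, for example a combination of one or more microprocessors, or a combination of a DSP and a microprocessor. The communication unit 603 may be a communication interface, a transceiver, a transceiver circuit, or the like, where "communication interface" is a collective term and may include one or more interfaces, such as an interface between the site selection device and other devices. The storage unit 601 may be a memory.
When the processing unit 602 is a processor, the communication unit 603 is a communication interface, and the storage unit 601 is a memory, the site selection device involved in the embodiments of this application may be the site selection device shown in FIG. 6B.
Referring to FIG. 6B, the site selection device 610 includes a processor 612, a communication interface 613, and a memory 611. Optionally, the site selection device 610 may also include a bus 614. The communication interface 613, the processor 612, and the memory 611 may be connected to each other through the bus 614; the bus 614 may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus 614 may be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, only one thick line is shown in FIG. 6B, but this does not mean that there is only one bus or one type of bus.
The steps of the methods or algorithms described in connection with the disclosure of the embodiments of this application may be implemented in hardware, or by a processor executing software instructions. The software instructions may consist of corresponding software modules (or units), which may be stored in random access memory (RAM), flash memory, read-only memory (ROM), erasable programmable ROM (EPROM), electrically erasable programmable read-only memory (EEPROM), a register, a hard disk, a removable hard disk, a CD-ROM, or any other form of storage medium well known in the art. An exemplary storage medium is coupled to the processor so that the processor can read information from, and write information to, the storage medium. The storage medium may also be an integral part of the processor. The processor and the storage medium may be located in an ASIC, and the ASIC may be located in the site selection device. The processor and the storage medium may also exist as discrete components in the site selection device.
Those skilled in the art should appreciate that, in one or more of the above examples, the functions described in the embodiments of this application may be implemented in hardware, software, firmware, or any combination thereof. The embodiments of this application also provide a computer program product which, when executed, implements the above functions. In addition, the computer program may be stored in a computer-readable medium or transmitted as one or more instructions or code on a computer-readable medium. Computer-readable media include computer storage media and communication media, where communication media include any medium that facilitates transfer of a computer program from one place to another. A storage medium may be any available medium that can be accessed by a general-purpose or special-purpose computer.
The specific implementations described above further explain the objectives, technical solutions, and beneficial effects of the embodiments of this application in detail. It should be understood that the above are merely specific implementations of the embodiments of this application and are not intended to limit their scope of protection; any modification, equivalent replacement, or improvement made on the basis of the technical solutions of the embodiments of this application shall fall within their scope of protection.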

Claims (16)

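  1. A site selection method, characterized in that the method comprises:
    obtaining spatio-temporal feature data of multiple users, the spatio-temporal feature data indicating the actual geographic location of a user at each moment;
    rasterizing the spatio-temporal feature data of each user to convert the spatio-temporal feature data into mapped spatio-temporal feature data;
    determining the frequent trajectories of each user according to the mapped spatio-temporal feature data of each user;
    determining a candidate address of a subject to be sited according to the frequent trajectories of the multiple users.
  2. The method according to claim 1, characterized in that rasterizing the spatio-temporal feature data of each user to convert the spatio-temporal feature data into mapped spatio-temporal feature data comprises:
    for each user, dividing the spatio-temporal feature data of the user by cycle to obtain the spatio-temporal feature data of the user in n cycles, n being an integer greater than 1;
    for each of the n cycles, determining, according to the spatio-temporal feature data of the user in the cycle, the actual geographic location of the user in each of the k time periods included in the cycle, k being an integer greater than 1;
    obtaining the representative geographic location corresponding to the actual geographic location of the user in each time period, the spatial region represented by the representative geographic location containing the actual geographic location;
    wherein the mapped spatio-temporal feature data of each user includes the mapped spatio-temporal feature data of each user in the n cycles, and the mapped spatio-temporal feature data in one cycle includes the representative geographic location in each time period included in the cycle.
  3. The method according to claim 2, characterized in that obtaining the representative geographic location corresponding to the actual geographic location of the user in each time period comprises:
    for each time period, dividing the longitude coordinate of the actual geographic location of the user in the time period by a first preset value and rounding the quotient to obtain the longitude coordinate of the representative geographic location of the user in the time period;
    dividing the latitude coordinate of the actual geographic location of the user in the time period by a second preset value and rounding the quotient to obtain the latitude coordinate of the representative geographic location of the user in the time period.
  4. The method according to claim 2, characterized in that determining, according to the spatio-temporal feature data of the user in the cycle, the actual geographic location of the user in each of the k time periods included in the cycle comprises:
    obtaining the spatio-temporal feature data of the user in each of the k time periods included in the cycle;
    for each time period, when the spatio-temporal feature data of the user in the time period indicates a single actual geographic location, determining that actual geographic location as the actual geographic location of the user in the time period;
    when the spatio-temporal feature data of the user in the time period indicates multiple different actual geographic locations, determining the actual geographic location indicated the largest number of times as the actual geographic location of the user in the time period.
  5. The method according to claim 2, characterized in that determining the frequent trajectories of each user according to the mapped spatio-temporal feature data of each user comprises:
    for each user, obtaining the sequence corresponding to the mapped spatio-temporal feature data of the user in each cycle, each element of the sequence representing the representative geographic location of the user in one time period;
    obtaining, from the n sequences corresponding to the n cycles of the user, the number of occurrences of each subsequence in the n sequences;
    selecting the subsequences that meet a first preset condition as frequent sequences, and determining the mapped spatio-temporal feature data corresponding to the frequent sequences as the frequent trajectories of the user, wherein the first preset condition includes the first or both of the following: the number of occurrences is greater than a first threshold; the number of elements contained in the subsequence is greater than a second threshold.
  6. The method according to any one of claims 1 to 5, characterized in that determining a candidate address of a subject to be sited according to the frequent trajectories of the multiple users comprises:
    building a trajectory tag library according to the frequent trajectories of the multiple users, wherein the trajectory tag library includes the trajectory tag data of the multiple users, and the trajectory tag data of each user includes a user identifier and at least one frequent trajectory;
    determining, based on the trajectory tag library, the candidate address of the subject to be sited according to a second preset condition.
  7. The method according to claim 6, characterized in that the method further comprises:
    obtaining, according to the frequent trajectories of the user, the location information of the user during the working period;
    obtaining the point of information (POI) information corresponding to the location information;
    determining the identity tag of the user according to the POI information, wherein the trajectory tag data of the user further includes the identity tag of the user.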
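  8. A site selection device, characterized in that the device comprises:
    a data obtaining unit, configured to obtain spatio-temporal feature data of multiple users, the spatio-temporal feature data indicating the actual geographic location of a user at each moment;
    a data mapping unit, configured to rasterize the spatio-temporal feature data of each user to convert the spatio-temporal feature data into mapped spatio-temporal feature data;
    a trajectory determining unit, configured to determine the frequent trajectories of each user according to the mapped spatio-temporal feature data of each user;
    an address determining unit, configured to determine a candidate address of a subject to be sited according to the frequent trajectories of the multiple users.
  9. The device according to claim 8, characterized in that the data mapping unit is configured to:
    for each user, divide the spatio-temporal feature data of the user by cycle to obtain the spatio-temporal feature data of the user in n cycles, n being an integer greater than 1;
    for each of the n cycles, determine, according to the spatio-temporal feature data of the user in the cycle, the actual geographic location of the user in each of the k time periods included in the cycle, k being an integer greater than 1;
    obtain the representative geographic location corresponding to the actual geographic location of the user in each time period, the spatial region represented by the representative geographic location containing the actual geographic location;
    wherein the mapped spatio-temporal feature data of each user includes the mapped spatio-temporal feature data of each user in the n cycles, and the mapped spatio-temporal feature data in one cycle includes the representative geographic location in each time period included in the cycle.
  10. The device according to claim 9, characterized in that the data mapping unit is configured to:
    for each time period, divide the longitude coordinate of the actual geographic location of the user in the time period by a first preset value and round the quotient to obtain the longitude coordinate of the representative geographic location of the user in the time period;
    divide the latitude coordinate of the actual geographic location of the user in the time period by a second preset value and round the quotient to obtain the latitude coordinate of the representative geographic location of the user in the time period.
  11. The device according to claim 9, characterized in that the data mapping unit is configured to:
    obtain the spatio-temporal feature data of the user in each of the k time periods included in the cycle;
    for each time period, when the spatio-temporal feature data of the user in the time period indicates a single actual geographic location, determine that actual geographic location as the actual geographic location of the user in the time period;
    when the spatio-temporal feature data of the user in the time period indicates multiple different actual geographic locations, determine the actual geographic location indicated the largest number of times as the actual geographic location of the user in the time period.
  12. The device according to claim 9, characterized in that the trajectory determining unit is configured to:
    for each user, obtain the sequence corresponding to the mapped spatio-temporal feature data of the user in each cycle, each element of the sequence representing the representative geographic location of the user in one time period;
    obtain, from the n sequences corresponding to the n cycles of the user, the number of occurrences of each subsequence in the n sequences;
    select the subsequences that meet a first preset condition as frequent sequences, and determine the mapped spatio-temporal feature data corresponding to the frequent sequences as the frequent trajectories of the user, wherein the first preset condition includes the first or both of the following: the number of occurrences is greater than a first threshold; the number of elements contained in the subsequence is greater than a second threshold.
  13. The device according to any one of claims 8 to 12, characterized in that the address determining unit is configured to:
    build a trajectory tag library according to the frequent trajectories of the multiple users, wherein the trajectory tag library includes the trajectory tag data of the multiple users, and the trajectory tag data of each user includes a user identifier and at least one frequent trajectory;
    determine, based on the trajectory tag library, the candidate address of the subject to be sited according to a second preset condition.
  14. The device according to claim 13, characterized in that the device further comprises:
    a location obtaining unit, configured to obtain, according to the frequent trajectories of the user, the location information of the user during the working period;
    an information obtaining unit, configured to obtain the point of information (POI) information corresponding to the location information;
    an identity determining unit, configured to determine the identity tag of the user according to the POI information, wherein the trajectory tag data of the user further includes the identity tag of the user.
  15. A computer storage medium, characterized in that the computer storage medium stores executable instructions, and the executable instructions are used to perform the method according to any one of claims 1 to 7.
  16. A site selection device, characterized in that the device comprises a processor and a memory, wherein
    the memory stores a computer-readable program;
    the processor runs the program in the memory to carry out the method according to any one of claims 1 to 7.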
PCT/CN2018/083627 2017-06-01 2018-04-19 选址方法及设备 WO2018219057A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
EP18808958.5A EP3605365A4 (en) 2017-06-01 2018-04-19 SITE SELECTION METHOD AND DEVICE

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710405595.2A CN108984561B (zh) 2017-06-01 2017-06-01 选址方法及设备
CN201710405595.2 2017-06-01

Publications (1)

Publication Number Publication Date
WO2018219057A1 true WO2018219057A1 (zh) 2018-12-06

Family

ID=64454477

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/083627 WO2018219057A1 (zh) 2017-06-01 2018-04-19 选址方法及设备

Country Status (3)

Country Link
EP (1) EP3605365A4 (zh)
CN (1) CN108984561B (zh)
WO (1) WO2018219057A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111738484A (zh) * 2020-04-28 2020-10-02 腾讯科技(深圳)有限公司 一种公交站点选址的方法、装置及计算机可读存储介质
CN117979225A (zh) * 2024-03-29 2024-05-03 北京大也智慧数据科技服务有限公司 健身步道的选址方法、装置、存储介质及电子设备

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
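CN111400417A (zh) * 2018-12-28 2020-07-10 航天信息股份有限公司 Site selection method, apparatus, medium, and device for self-service tax service halls
CN110019568B (zh) * 2019-04-12 2022-03-11 深圳市和讯华谷信息技术有限公司 Site selection method and apparatus based on spatial clustering, computer device, and storage medium
CN110728400A (zh) * 2019-09-30 2020-01-24 口碑(上海)信息技术有限公司 Method and apparatus for site selection recommendation
CN112883126A (zh) * 2019-11-29 2021-06-01 京东安联财产保险有限公司 Method for selecting a community center and related device
CN110909262B (zh) * 2019-11-29 2022-10-25 北京明略软件***有限公司 Method and apparatus for determining accompanying relationships of identity information
CN111062525B (zh) * 2019-12-06 2022-07-05 中国联合网络通信集团有限公司 Base station site selection method, apparatus, device, and storage medium for capacity-expansion cells
CN111312406B (zh) * 2020-03-15 2020-11-13 薪得付信息技术(山东)有限公司 Epidemic tag data processing method and ***
CN111582985A (zh) * 2020-05-09 2020-08-25 北京首汽智行科技有限公司 Method and apparatus for determining shared mobility service outlets based on user recommendations
CN114363824B (zh) * 2021-05-26 2023-08-08 科大国创云网科技有限公司 Commuting trajectory characterization method based on MR location and road GIS information, and ***
CN115880041B (zh) * 2023-02-13 2024-01-02 杭州超上科技有限公司 Housing management *** based on big data analysis
CN116321007B (zh) * 2023-03-13 2024-04-02 深圳市交投科技有限公司 Travel purpose prediction method, apparatus, device, and storage medium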

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
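CN103914563A (zh) * 2014-04-18 2014-07-09 中国科学院上海微***与信息技术研究所 Pattern mining method for spatio-temporal trajectories
CN104731795A (zh) * 2013-12-19 2015-06-24 日本电气株式会社 Device and method for mining individual activity patterns
CN104965913A (zh) * 2015-07-03 2015-10-07 重庆邮电大学 User classification method based on GPS geographic location data mining
CN105956951A (zh) * 2016-05-05 2016-09-21 杭州诚智天扬科技有限公司 Method for identifying popular tourist routes based on mobile signaling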

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020120501A1 (en) * 2000-07-19 2002-08-29 Bell Christopher Nathan Systems and processes for measuring, evaluating and reporting audience response to audio, video, and other content
CN103052022B (zh) * 2011-10-17 2015-08-19 ***通信集团公司 Method and *** for discovering user stable points based on mobile behavior
WO2014011998A1 (en) * 2012-07-12 2014-01-16 Massachusetts Institute Of Technology Text characterization of trajectories
CN103593349B (zh) * 2012-08-14 2016-12-21 中国科学院沈阳自动化研究所 Mobile location analysis method in a sensor network environment
US10387457B2 (en) * 2014-06-17 2019-08-20 Sap Se Grid-based analysis of geospatial trajectories
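CN104156082A (zh) * 2014-08-06 2014-11-19 北京行云时空科技有限公司 Control *** for user interfaces and applications oriented to spatio-temporal scenarios, and smart terminal
CN104270714B (zh) * 2014-09-11 2018-04-10 华为技术有限公司 Method and apparatus for determining a user's movement trajectory
CN104598557B (zh) * 2015-01-05 2018-06-05 华为技术有限公司 Method and apparatus for data rasterization and user behavior analysis
CN104796468A (zh) * 2015-04-14 2015-07-22 蔡宏铭 Method and *** for instant messaging among fellow travelers and sharing of travel-companion information
CN106649656B (zh) * 2016-12-13 2020-03-17 中国科学院软件研究所 Database-oriented big data storage method for spatio-temporal trajectories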

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
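CN104731795A (zh) * 2013-12-19 2015-06-24 日本电气株式会社 Device and method for mining individual activity patterns
CN103914563A (zh) * 2014-04-18 2014-07-09 中国科学院上海微***与信息技术研究所 Pattern mining method for spatio-temporal trajectories
CN104965913A (zh) * 2015-07-03 2015-10-07 重庆邮电大学 User classification method based on GPS geographic location data mining
CN105956951A (zh) * 2016-05-05 2016-09-21 杭州诚智天扬科技有限公司 Method for identifying popular tourist routes based on mobile signaling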

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3605365A4

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
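CN111738484A (zh) * 2020-04-28 2020-10-02 腾讯科技(深圳)有限公司 Method and apparatus for bus stop site selection, and computer-readable storage medium
CN111738484B (zh) * 2020-04-28 2024-05-14 腾讯科技(深圳)有限公司 Method and apparatus for bus stop site selection, and computer-readable storage medium
CN117979225A (zh) * 2024-03-29 2024-05-03 北京大也智慧数据科技服务有限公司 Site selection method and apparatus for fitness trails, storage medium, and electronic device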

Also Published As

Publication number Publication date
CN108984561B (zh) 2021-06-22
EP3605365A4 (en) 2020-04-01
CN108984561A (zh) 2018-12-11
EP3605365A1 (en) 2020-02-05

Similar Documents

Publication Publication Date Title
WO2018219057A1 (zh) Site selection method and device
US11223926B2 (en) Systems and methods for statistically associating mobile devices and non-mobile devices with geographic areas
US9880012B2 (en) Hybrid road network and grid based spatial-temporal indexing under missing road links
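KR102208892B1 (ko) Method and apparatus for identifying erroneous address information
CN108875007B (zh) Method and apparatus for determining a point of interest, storage medium, and electronic apparatus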
US11481666B2 (en) Method and apparatus for acquiring information
US20180268168A1 (en) Anonymization of geographic route trace data
US9706515B1 (en) Location data from mobile devices
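CN110020221B (zh) Job-housing distribution confirmation method, apparatus, server, and computer-readable storage medium
KR20200006584A (ko) Information recommendation method and apparatus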
EP3327575A1 (en) Information distribution apparatus and method
US9467815B2 (en) Systems and methods for generating a user location history
US20240144059A1 (en) Method and system for smart detection of business hot spots
US12009106B2 (en) Emergency demand prediction device, emergency demand prediction method, and program
CN109522923B (zh) Customer address aggregation method, apparatus, and computer-readable storage medium
US11562495B2 (en) Identifying spatial locations of images using location data from mobile devices
US9787557B2 (en) Determining semantic place names from location reports
US9167389B1 (en) Clustering location data to determine locations of interest
US20200042620A1 (en) Active Change Detection For Geospatial Entities Using Trend Analysis
Cesario et al. An approach for the discovery and validation of urban mobility patterns
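CN108427679A (zh) Method and device for processing crowd flow distribution
WO2019085475A1 (zh) Item recommendation method, electronic device, and computer-readable storage medium
WO2023213137A1 (zh) Industrial-Internet-based data lookup method, apparatus, device, and storage medium
CN103712628A (zh) Navigation path drawing method and terminal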
US11126928B2 (en) Stationary classifier for geographic route trace data

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application
    Ref document number: 18808958; Country of ref document: EP; Kind code of ref document: A1
ENP Entry into the national phase
    Ref document number: 2018808958; Country of ref document: EP; Effective date: 20191021
NENP Non-entry into the national phase
    Ref country code: DE