CN114501419B - Signaling data processing method, apparatus and storage medium - Google Patents

Publication number
CN114501419B
Authority
CN
China
Prior art keywords
user
point
time
resident
class
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111665065.4A
Other languages
Chinese (zh)
Other versions
CN114501419A (en
Inventor
周剑明
沈松伟
王健
王�琦
李斯哲
汪旭
蒋志朝
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China United Network Communications Group Co Ltd
Original Assignee
China United Network Communications Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China United Network Communications Group Co Ltd filed Critical China United Network Communications Group Co Ltd
Priority to CN202111665065.4A priority Critical patent/CN114501419B/en
Publication of CN114501419A publication Critical patent/CN114501419A/en
Application granted granted Critical
Publication of CN114501419B publication Critical patent/CN114501419B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W8/00 Network data management
    • H04W8/18 Processing of user or subscriber data, e.g. subscribed services, user preferences or user profiles; Transfer of user or subscriber data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00 Reducing energy consumption in communication networks
    • Y02D30/70 Reducing energy consumption in communication networks in wireless communication networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Navigation (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

The application provides a signaling data processing method, apparatus, and storage medium. The method comprises: acquiring signaling point data of each user in a region to be planned; performing space-time clustering on the signaling point data of each user using a first distance threshold and a first time threshold to obtain a plurality of pieces of first residence point information for each user; determining a first class of users and a second class of users from the users based on each user's number of first residence points; and performing space-time clustering on the signaling point data of the first class of users and of the second class of users, respectively, using a second distance threshold and a second time threshold to obtain a plurality of pieces of second residence point information for each user of the first class and of the second class. The first residence point information and the second residence point information are used for traffic planning and city planning of the region to be planned. The method addresses the problem that the residence point information determined for each user by prior-art clustering methods does not match the user's actual residence behavior.

Description

Signaling data processing method, apparatus and storage medium
Technical Field
The present disclosure relates to the field of data processing technologies, and in particular, to a signaling data processing method, apparatus, and storage medium.
Background
To achieve scientific and reasonable urban planning, such as the planning of traffic and of transit-oriented development (TOD) business centers, people-flow aggregation areas must first be found. Urban planning is then performed based on the aggregation core points of those areas and the parking time in them.
At present, massive amounts of signaling point data are collected and processed to obtain the aggregation core points of people-flow aggregation areas and the parking time of the people flow in those areas. A piece of signaling point data is the signaling data of one communication between a user terminal and a base station, and includes information such as the user code, the signaling time, and the base station position. The signaling point data is mainly processed as follows. The data processing device acquires from the big data platform the signaling point data of each user in a city over a period of time. The data processing device then applies a space-time clustering algorithm with a fixed spatial distance threshold and a fixed parking time threshold to the signaling point data of each user, obtaining a plurality of clusters for each user. The center point of each cluster is a parking point, and each parking point includes information such as its position and parking time. Finally, the data processing device divides the city into areas and aggregates the parking points that fall in each area during that period to obtain the aggregation core points of the people-flow aggregation areas and the total parking time of users (that is, the people flow) in those areas.
In the prior art, the parking point information obtained by clustering each user's parking points does not match the user's actual parking behavior. As a result, the people-flow aggregation area information derived from the parking points does not match the actual people flow, and the subsequent urban planning is unreasonable.
Disclosure of Invention
The application provides a signaling data processing method, apparatus, and storage medium to solve the problem that, in prior-art methods for clustering the parking points of each user, the obtained parking point information does not match the user's actual parking behavior.
In a first aspect, the present application provides a signaling data processing method, including:
acquiring signaling point data of each user in a region to be planned, wherein the signaling point data comprises a base station position and signaling time;
performing space-time clustering on the signaling point data of each user using a first distance threshold and a first time threshold to obtain a plurality of pieces of first residence point information for each user, wherein the first residence point information comprises the residence time of each first residence point and the base station position corresponding to the first residence point;
determining a first class of users and a second class of users from the users based on the number of first residence points of each user, wherein the first class of users are users whose number of first residence points is greater than an upper quantity threshold, and the second class of users are users whose number of first residence points is less than a lower quantity threshold;
performing space-time clustering on the signaling point data of the first class of users and of the second class of users, respectively, using a second distance threshold and a second time threshold to obtain a plurality of pieces of second residence point information for each user of the first class and of the second class, wherein the second residence point information comprises the residence time of each second residence point and the base station position corresponding to the second residence point;
wherein the first residence point information and the second residence point information are used for traffic planning and city planning of the region to be planned.
Optionally, the second distance threshold includes two thresholds, $\delta_{d21}$ and $\delta_{d22}$, for the first class of users and the second class of users, respectively.
The performing space-time clustering on the signaling point data of the first class of users and of the second class of users, respectively, using a second distance threshold and a second time threshold to obtain a plurality of pieces of second residence point information for each user of the first class and of the second class comprises:
performing space-time clustering on the signaling point data of the first class of users using $\delta_{d21}$ and the second time threshold to obtain a plurality of pieces of second residence point information for each user of the first class;
performing space-time clustering on the signaling point data of the second class of users using $\delta_{d22}$ and the second time threshold to obtain a plurality of pieces of second residence point information for each user of the second class.
Optionally, before the space-time clustering is performed on each signaling point data of each user by adopting the first distance threshold and the first time threshold, and a plurality of first residence point information of each user corresponding to each user is obtained, the method further includes:
judging whether the first residence point information can be obtained by adopting the first distance threshold and the first time threshold;
the method for performing space-time clustering on the signaling point data of each user by adopting a first distance threshold and a first time threshold to obtain a plurality of first residence point information of each user corresponding to each user comprises the following steps:
and under the condition that the first resident point information can be obtained, carrying out space-time clustering on each signaling point data of each user by adopting the first distance threshold and the first time threshold to obtain a plurality of pieces of first resident point information of each user corresponding to each user.
Optionally, the method further comprises:
and in the case that the first residence point information cannot be obtained, performing space-time clustering on the signaling point data of each user using the first distance threshold and a first speed threshold to obtain the first residence point information.
Optionally, before the time-space clustering is performed on the signaling point data of the first type of user and the signaling point data of the second type of user by adopting the second distance threshold and the second time threshold, and the plurality of second residence point information of the users of the first type of user and the second type of user are obtained, the method further includes:
Judging whether the second residence point information can be obtained by adopting the second distance threshold value and the second time threshold value;
the adopting a second distance threshold and a second time threshold to perform space-time clustering on the signaling point data of the first class user and the second class user respectively to obtain a plurality of second residence point information of the first class user and the second class user respectively, including:
and under the condition that the second resident information can be obtained, adopting the second distance threshold value and the second time threshold value to perform space-time clustering on the signaling point data of the first class user and the signaling point data of the second class user respectively, and obtaining a plurality of pieces of second resident information of the first class user and the second class user respectively.
Optionally, the method further comprises:
and in the case that the second residence point information cannot be obtained, performing space-time clustering on the signaling point data of the first class of users and of the second class of users, respectively, using the second distance threshold and a second speed threshold to obtain the second residence point information of each user of the first class and of the second class.
Optionally, the base station positions corresponding to the first residence point include base station positions of at least two base stations corresponding to the first residence point, and the base station positions corresponding to the second residence point include base station positions of at least two base stations corresponding to the second residence point;
The method further comprises the steps of:
based on the base station positions corresponding to the first residence points, calculating to obtain first residence point positions, wherein the first residence point positions are the positions of the midpoints of at least two base stations corresponding to the first residence points;
correspondingly, based on the base station positions corresponding to the second residence points, calculating to obtain second residence point positions, wherein the second residence point positions are the positions of the midpoints of at least two base stations corresponding to the second residence points;
sorting the first residence points or the second residence points of each user, calculating a distance value between each two adjacent first residence point positions based on the first residence point positions and, correspondingly, calculating a distance value between each two adjacent second residence point positions based on the second residence point positions;
performing spatial clustering on the first residence point information or the second residence point information of each user by merging the two adjacent residence points corresponding to a distance value $D_z$ when $D_z$ is less than or equal to a merge threshold $\delta_z$, to obtain respective third residence point information of each user, wherein the third residence point information comprises the residence time of a third residence point and the base station position corresponding to the third residence point;
wherein the third residence point information is used for traffic planning and city planning of the region to be planned.
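The optional merging step above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function names `midpoint` and `merge_adjacent` and the dictionary keys are hypothetical, positions are assumed to be planar coordinates in metres, residence points are assumed to be sorted by start time (the text does not name the sort key), and a merged point is assumed to span from the earliest start to the latest end of the points it absorbs.

```python
from math import hypot


def midpoint(stations):
    """Mean of the base station coordinates (the midpoint when there are two)."""
    n = len(stations)
    return (sum(x for x, _ in stations) / n, sum(y for _, y in stations) / n)


def merge_adjacent(residence_points, merge_threshold):
    """Merge adjacent residence points whose positions are at most
    merge_threshold apart (D_z <= delta_z).

    Each residence point is a dict with hypothetical keys:
    'start', 'end' (times) and 'stations' (list of (x, y) tuples).
    """
    # Assumption: "sorting" means ordering by start time.
    ordered = sorted(residence_points, key=lambda r: r["start"])
    merged = []
    for rp in ordered:
        # Copy so the caller's data is not modified.
        current = {"start": rp["start"], "end": rp["end"],
                   "stations": list(rp["stations"])}
        if merged:
            prev = merged[-1]
            px, py = midpoint(prev["stations"])
            cx, cy = midpoint(current["stations"])
            if hypot(cx - px, cy - py) <= merge_threshold:
                # Assumption: the merged point keeps all base stations and
                # spans the combined time interval.
                prev["end"] = max(prev["end"], current["end"])
                prev["stations"].extend(current["stations"])
                continue
        merged.append(current)
    return merged


rps = [
    {"start": 0, "end": 1000, "stations": [(0, 0), (100, 0)]},
    {"start": 1200, "end": 2000, "stations": [(150, 0), (250, 0)]},
    {"start": 3000, "end": 4000, "stations": [(5000, 0), (5100, 0)]},
]
third = merge_adjacent(rps, merge_threshold=200)
```

In this example the first two residence points lie 150 m apart and are merged, while the third is far away and stays separate. Comparing each new point against the already merged point, rather than against the original neighbour, is a design choice of the sketch and is not specified in the text.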
In a second aspect, the present application provides a signalling data processing device, the device comprising:
the processing module is used for acquiring signaling point data of each user in a region range to be planned from the big data platform; the signaling point data comprise a base station position and signaling time;
the first calculation module is used for acquiring signaling point data of each user from the processing module, and performing space-time clustering on the signaling point data of each user by adopting a first distance threshold and a first time threshold to acquire a plurality of pieces of first resident point information of each user corresponding to each user; the first residence point information comprises residence time of each first residence point and base station positions corresponding to the first residence points;
the user determining module is used for acquiring a plurality of pieces of first resident information of the users corresponding to the users from the first calculating module, and determining a first type of user and a second type of user from the users based on the first resident number of the users; the first type of users are users with the first resident point number being larger than the upper quantity threshold value, and the second type of users are users with the first resident point number being smaller than the lower quantity threshold value;
the second computing module is used for acquiring the first class users and the second class users from the user determining module and acquiring the signaling point data of the first class users and the second class users from the processing module; performing space-time clustering on the signaling point data of the first class user and the signaling point data of the second class user respectively by adopting a second distance threshold and a second time threshold to obtain a plurality of second resident point information of the first class user and the second class user respectively; the second residence point information comprises residence time of the second residence point and a base station position corresponding to the second residence point.
In a third aspect, the present application provides a data processing apparatus comprising:
a processor and a memory;
the memory stores executable instructions executable by the processor;
wherein the processor executes the executable instructions stored by the memory, causing the processor to perform the method as described above.
In a fourth aspect, the present application provides a storage medium having stored therein computer-executable instructions for performing the method as described above when executed by a processor.
In the signaling data processing method, apparatus, and storage medium provided by the application, space-time clustering is first performed on the signaling point data of each user using a first distance threshold and a first time threshold to obtain a plurality of pieces of first residence point information for each user. Then, based on each user's number of first residence points, the upper quantity threshold, and the lower quantity threshold, the first class of users and the second class of users, for whom space-time clustering with the first distance threshold and the first time threshold is unsuitable, are selected from the users. Space-time clustering is then performed on the signaling point data of the first class of users and of the second class of users, respectively, using an adjusted second distance threshold and second time threshold to obtain a plurality of pieces of second residence point information for each of those users, so that the residence point information of each user matches the user's actual residence behavior. This solves the prior-art problem that the parking point information determined for each user does not match the user's actual parking behavior.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and together with the description, serve to explain the principles of the application.
Fig. 1 is an application scenario diagram of a signaling data processing method provided in an embodiment of the present application;
Fig. 2 is a flowchart of a signaling data processing method provided in an embodiment of the present application;
Fig. 3 is a schematic diagram of the base station positions corresponding to the signaling point data of a user according to an embodiment of the present application;
Fig. 4 is a schematic diagram of the residence of a user on one day provided in an embodiment of the present application;
Fig. 5 is a block diagram of a signaling data processing apparatus according to an embodiment of the present application;
Fig. 6 is a block diagram of a data processing apparatus according to an embodiment of the present application.
Specific embodiments thereof have been shown by way of example in the drawings and will herein be described in more detail. These drawings and the written description are not intended to limit the scope of the inventive concepts in any way, but to illustrate the concepts of the present application to those skilled in the art by reference to specific embodiments.
Detailed Description
To make the objects, technical solutions, and advantages of the embodiments of the present application clearer, the technical solutions of the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are some, but not all, embodiments of the present application. All other embodiments obtained by one of ordinary skill in the art based on the embodiments herein without inventive effort fall within the scope of the present application.
Fig. 1 is an application scenario diagram of a signaling data processing method provided in an embodiment of the present application. As shown in Fig. 1, the operator's server 10 collects and stores signaling point data from a plurality of base stations, for example the signaling data generated each time a user terminal (e.g., a mobile phone) shown in Fig. 1 interacts (e.g., communicates) with a base station. The big data platform 11 obtains the signaling point data of a plurality of users from the server 10 connected to it and transmits that data to the data processing device 12 connected to it. The data processing device 12 processes the signaling point data to obtain the parking point information.
In the prior art, the data processing device 12 clusters the parking points of each user and obtains the parking point information of each user as follows.
Specifically, the data processing device 12 first preprocesses the signaling point data of each user and then sorts it by signaling time to obtain the sorted signaling point data of each user. The data processing device 12 then performs space-time clustering on the signaling point data of each user using a fixed spatial distance threshold $\delta_{d0}$ and a fixed parking time threshold $\delta_{t0}$ to obtain a plurality of clusters for each user, and derives from each cluster the corresponding parking point and parking point information. The center point of each cluster is a parking point, and the information it contains, such as its position and parking time, is the parking point information. The clustering process based on the spatial distance threshold and the parking time threshold is exemplified as follows:
Assume that the sorted signaling point data of user A is $A_i$, where $i = 1, 2, 3, \ldots, m, \ldots, n$, $0 < m \le n$, and $m$ and $n$ are natural numbers.
First, the 1st signaling point data $A_1$ and the 2nd signaling point data $A_2$ are selected for the first cluster C1, and the distance $d_{x2}$ from the cluster center of $A_1$ and $A_2$ (that is, the midpoint of the two base station positions corresponding to $A_1$ and $A_2$) to the base station position corresponding to $A_2$ is calculated. If $d_{x2}$ is less than the spatial distance threshold, that is, $d_{x2} < \delta_{d0}$, then $A_1$ and $A_2$ may belong to one stay and lie in the stay region corresponding to cluster C1.
Similarly, proceeding to the 3rd signaling point data $A_3$, the distance $d_{x3}$ from the cluster center of $A_1$, $A_2$, and $A_3$ (that is, the center of the base station positions corresponding to $A_1$, $A_2$, and $A_3$) to the base station position corresponding to $A_3$ is calculated. If $d_{x3} < \delta_{d0}$, then $A_1$, $A_2$, and $A_3$ may lie in the stay region corresponding to cluster C1.
Similarly, when the distance from the (m+1)-th point $A_{m+1}$ to the cluster center of the first m+1 points is greater than the spatial distance threshold $\delta_{d0}$, the loop stops. In this case, $A_1, A_2, A_3, \ldots, A_m$ may lie in the stay region corresponding to cluster C1.
Next, the data processing device 12 uses the parking time threshold $\delta_{t0}$ to determine whether $A_1, A_2, A_3, \ldots, A_m$ lie in the stay region corresponding to cluster C1. Specifically, the data processing device 12 calculates the difference $t_{m1}$ between the signaling times of $A_m$ and $A_1$. If $t_{m1} > \delta_{t0}$, it is determined that $A_1, A_2, A_3, \ldots, A_m$ constitute one stay and lie in the same stay region, namely the stay region corresponding to cluster C1. The cluster center of the signaling point data $A_1, A_2, A_3, \ldots, A_m$ in cluster C1 is the parking point C1C corresponding to cluster C1, and the position of the cluster center is the position of the parking point C1C. The signaling times of the first and last signaling point data in cluster C1 are, respectively, the start time and end time of the user's stay at the parking point C1C. That is, the parking time of the parking point C1C is $t_{m1}$.
The above is the specific algorithm for obtaining one cluster (or parking point) by space-time clustering with a fixed spatial distance threshold $\delta_{d0}$ and a fixed parking time threshold $\delta_{t0}$. The next cluster (or parking point) is calculated in the same way, starting from the (m+1)-th point $A_{m+1}$ and the (m+2)-th point $A_{m+2}$. This continues until all signaling point data $A_i$ of user A has been traversed, yielding every parking point and its parking point information.
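The prior-art procedure described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of any particular system: the names `SignalingPoint`, `StayPoint`, and `cluster_stay_points` are hypothetical, and base station positions are assumed to be planar coordinates in metres. The description does not say what happens to points whose candidate cluster fails the time test, so the sketch discards them and starts the next cluster at the point that broke the distance condition.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class SignalingPoint:
    t: float  # signaling time, in seconds
    x: float  # base station position (assumed planar coordinates, metres)
    y: float


@dataclass
class StayPoint:
    x: float      # cluster center
    y: float
    start: float  # signaling time of the first point in the cluster
    end: float    # signaling time of the last point in the cluster

    @property
    def duration(self):
        return self.end - self.start


def centroid(points):
    n = len(points)
    return sum(p.x for p in points) / n, sum(p.y for p in points) / n


def cluster_stay_points(points, dist_threshold, time_threshold):
    """Sequential space-time clustering with fixed thresholds."""
    points = sorted(points, key=lambda p: p.t)
    stays = []
    i = 0
    while i < len(points):
        j = i + 1
        # Grow the cluster while the newest point stays within dist_threshold
        # of the center of the cluster that includes it.
        while j < len(points):
            cx, cy = centroid(points[i:j + 1])
            if hypot(points[j].x - cx, points[j].y - cy) >= dist_threshold:
                break
            j += 1
        cluster = points[i:j]
        # Keep the cluster only if it spans more than time_threshold.
        if cluster[-1].t - cluster[0].t > time_threshold:
            cx, cy = centroid(cluster)
            stays.append(StayPoint(cx, cy, cluster[0].t, cluster[-1].t))
        i = j  # the next cluster starts at the point that broke the condition
    return stays


pts = [
    SignalingPoint(0, 0, 0),
    SignalingPoint(600, 100, 0),
    SignalingPoint(3600, 50, 50),
    SignalingPoint(4000, 5000, 5000),
    SignalingPoint(4100, 9000, 9000),
]
stays = cluster_stay_points(pts, dist_threshold=500, time_threshold=1800)
```

With these sample values the first three points form one stay lasting 3600 seconds, while the two distant points are passed through without producing a stay.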
The fixed spatial distance threshold $\delta_{d0}$ preset in the prior art is a fixed value determined from the base station spacing of a conventional urban base station deployment.
However, the base station layout is not uniform across a city and differs by region. In sparsely populated areas, base stations are laid out with large coverage per station and few stations, so the distance between base stations is large and often exceeds the spacing of a conventional deployment. In densely populated areas, the distance between base stations is kept small to ensure communication quality and is often smaller than the spacing of a conventional deployment.
When space-time clustering of the signaling point data uses a fixed spatial distance threshold and a fixed parking time threshold set from the conventional base station spacing, two distortions arise. In sparsely populated areas, because the spacing between base stations is large, every base station a user passes (including base stations the user merely passes without stopping) may be computed as a parking point, so too many parking points are found. In densely populated areas, because the spacing between base stations is small, even long stays in different places near several base stations may be computed as a single parking point, so few parking points, or even only one, are found. The prior-art method for clustering each user's parking points therefore yields parking point information that does not match the user's actual parking behavior. The people-flow aggregation area information derived from the parking points in turn does not match the actual people flow, and the subsequent urban planning is unreasonable. For example, if public transportation sites are set based on parking points, many sites are set in sparsely populated areas while few, or even only one, are set in densely populated areas, so the site layout is unreasonable and does not match the actual needs of users in those areas.
In this regard, the present application proposes a signaling data processing method to solve the problem that when the existing technology is used to cluster the parking points of each user, the obtained parking point information is not matched with the actual parking situation of the user. An example of a signaling data processing method is provided in the present application as follows.
Illustratively, as shown in fig. 1, the data processing apparatus 12 acquires signaling point data of each user within a range of an area to be planned from the big data platform 11, wherein the signaling point data includes a base station position and a signaling time. Next, the data processing device 12 performs space-time clustering on each signaling point data of each user by adopting a first distance threshold and a first time threshold, so as to obtain a plurality of first residence point information of each user corresponding to each user, where the first residence point information includes residence time of each first residence point and a base station position corresponding to the first residence point. The first distance threshold is a distance threshold determined based on a base station-to-base station spacing in a conventional base station deployment in the region.
Based on the number of first residence points of each user, the data processing device 12 determines a first class of users and a second class of users from the users, wherein the first class of users are users whose number of first residence points is greater than the upper quantity threshold, and the second class of users are users whose number of first residence points is less than the lower quantity threshold. The first class and second class of users determined by the data processing device 12 are users whose signaling point data is not suited to space-time clustering with the first distance threshold and the first time threshold, such as users in sparsely populated areas and users in densely populated areas.
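The classification rule can be expressed in a few lines of Python. The sketch below is illustrative only: the function name `classify_users` and the sample threshold values are hypothetical, and the input is assumed to be a mapping from a user code to that user's number of first residence points.

```python
def classify_users(residence_counts, upper, lower):
    """Split users by their number of first residence points.

    residence_counts: dict mapping user code -> number of first residence points.
    Returns (first_class, second_class) as sets of user codes.
    """
    # First class: strictly more residence points than the upper threshold.
    first_class = {u for u, n in residence_counts.items() if n > upper}
    # Second class: strictly fewer residence points than the lower threshold.
    second_class = {u for u, n in residence_counts.items() if n < lower}
    return first_class, second_class


counts = {"a": 12, "b": 4, "c": 1, "d": 8}
first_class, second_class = classify_users(counts, upper=8, lower=2)
```

Users whose count lies between the two thresholds (inclusive) fall into neither class and keep their first residence point information.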
Then, the data processing device 12 performs space-time clustering on the signaling point data of the first class of users and of the second class of users, respectively, using a second distance threshold and a second time threshold that are adjusted relative to the first distance threshold and the first time threshold, to obtain a plurality of pieces of second residence point information for each user of the first class and of the second class. Compared with the results obtained using the first distance threshold and the first time threshold, the second residence point information obtained using the second distance threshold and the second time threshold matches the actual residence behavior of the first class and second class of users. The second residence point information comprises the residence time of each second residence point and the base station position corresponding to the second residence point.
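The two-pass flow described so far can be sketched as follows. This is a hedged illustration rather than the claimed implementation: `two_pass_residence_points` is a hypothetical name, the clustering routine is passed in as a callable `cluster(points, distance_threshold, time_threshold)` whose internals are left open, and the separate distance thresholds `d21` and `d22` for the two classes follow the optional variant of the method. No concrete threshold values are given in the text, so the ones used by callers are assumptions.

```python
def two_pass_residence_points(points_by_user, cluster,
                              d1, t1, d21, d22, t2, upper, lower):
    """Run the first clustering pass for all users, then re-cluster the
    first-class and second-class users with adjusted thresholds.

    points_by_user: dict mapping user code -> list of signaling points.
    cluster: callable (points, distance_threshold, time_threshold) -> list
             of residence points (hypothetical interface).
    Returns (first_info, second_info); second_info only contains users
    that were re-clustered.
    """
    # Pass 1: same thresholds for every user.
    first_info = {u: cluster(pts, d1, t1) for u, pts in points_by_user.items()}

    second_info = {}
    for user, residence_points in first_info.items():
        if len(residence_points) > upper:
            # First class: too many residence points with the first thresholds.
            second_info[user] = cluster(points_by_user[user], d21, t2)
        elif len(residence_points) < lower:
            # Second class: too few residence points with the first thresholds.
            second_info[user] = cluster(points_by_user[user], d22, t2)
    return first_info, second_info
```

Passing the clustering routine as a parameter keeps the sketch independent of any one clustering algorithm. Whether a user's second residence point information replaces or supplements the first is not decided here; both results are returned.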
The first residence point information and the second residence point information are used to carry out traffic planning for the area to be planned, for use in city planning.
According to the signaling data processing method, the signaling point data of each user are first space-time clustered using the first distance threshold and the first time threshold, yielding a plurality of pieces of first residence point information for each user. Then, based on each user's number of first residence points, the upper quantity threshold, and the lower quantity threshold, the first-class users and second-class users, for whom space-time clustering with the first distance threshold and the first time threshold is unsuitable, are selected from among the users. The signaling point data of the first-class users and of the second-class users are then space-time clustered using the adjusted second distance threshold and second time threshold, yielding a plurality of pieces of second residence point information for each first-class user and each second-class user. This ensures that the residence point information of each user matches that user's actual residence situation, and in turn that the people-flow aggregation area information derived from the first and second residence point information matches the actual people flow, so that subsequent city planning is more scientific and reasonable.
The signaling data processing method provided in the present application is described in detail below with reference to the embodiments shown in fig. 1 and fig. 2. Fig. 2 is a flowchart of a signaling data processing method provided in an embodiment of the present application. The execution subject of the embodiment of the present application is the data processing apparatus 12 in the embodiment shown in fig. 1. As shown in fig. 2, the method includes:
S201, acquiring signaling point data of each user in a region to be planned.
Specifically, the data processing device 12 acquires signaling point data of each user within a range of an area to be planned from the big data platform 11; wherein the signaling point data includes a base station location and a signaling time.
Illustratively, the data processing apparatus 12 obtains, from the big data platform 11, the signaling point data of each user within the area to be planned over a period of time.
S202, performing space-time clustering on the signaling point data of each user using a first distance threshold and a first time threshold to obtain a plurality of pieces of first residence point information for each user.
Specifically, the data processing apparatus 12 performs space-time clustering on the signaling point data of each user using a first distance threshold and a first time threshold, and obtains a plurality of pieces of first residence point information for each user; the first residence point information includes the residence time at each first residence point and the base station position corresponding to that first residence point.
Illustratively, after obtaining signaling point data of each user within a range of an area to be planned according to step S201, the data processing apparatus 12 sorts the signaling point data of each user according to a signaling time sequence in the signaling point data, and then performs preprocessing on the sorted signaling point data to obtain signaling point data of each user after preprocessing.
Illustratively, the preprocessing may use at least one of the following preprocessing modes:
Preprocessing mode 1: signaling point data with missing values are deleted directly;
Preprocessing mode 2: for repeated interactions with the same base station within 10 seconds, that is, when a user with the same user number (for example, the mobile station international ISDN number) interacts with the same base station multiple times within 10 seconds, only the signaling point data of one of those interactions is retained, and the other signaling point data of that user's interactions with the same base station within the 10 seconds are deleted; optionally, the signaling point data of the first of those interactions may be the one retained;
Preprocessing mode 3: for erroneously recorded signaling point data, if signaling point data O appears within 60 seconds and the distance between the base station position corresponding to O and the base station position corresponding to the immediately preceding signaling point data exceeds an error distance threshold (for example, 5000 meters), the signaling point data O is deleted;
Preprocessing mode 4: long-distance drift signaling point data are deleted. Fig. 3 is a schematic diagram of the base station positions corresponding to the signaling point data of a certain user according to an embodiment of the present application. Long-distance drift signaling point data are identified as follows, with reference to fig. 3:
A drift distance threshold $\theta_d$, a drift speed threshold $\theta_v$, and a drift distance ratio threshold $\theta_r$ are preset. Except for the user's first and last signaling point data of the day, each signaling point data is grouped with the two signaling point data adjacent to it in signaling time, and a calculation is performed to determine whether it is long-distance drift signaling point data. Specifically, as shown in fig. 3, for three signaling point data O1, O2, O3 with adjacent signaling times (i.e., the signaling time of O1 is earlier than that of O2, and the signaling time of O2 is earlier than that of O3), the following are calculated: the distance $d_{O12}$ between the base station positions corresponding to O1 and O2, the distance $d_{O13}$ between the base station positions corresponding to O1 and O3, and the speed $v_{O12}$ of the user from the base station position corresponding to O1 to the base station position corresponding to O2. If $d_{O12} > \theta_d$, $v_{O12} > \theta_v$, and $d_{O12}/(d_{O13} + 10^{-k}) > \theta_r$, the O2 signaling point data is determined to be long-distance drift signaling point data, where $10^{-k}$ is a very small number greater than zero, e.g., $k \geq 3$.
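The drift test of preprocessing mode 4 can be sketched as follows. This is a minimal illustration, not the patented implementation: the tuple layout `(time_s, lon, lat)`, the use of the haversine formula for distance, and the function names are assumptions, and the source gives no numeric values for $\theta_d$, $\theta_v$, or $\theta_r$, so callers must supply them.

```python
import math


def haversine_m(p, q):
    """Great-circle distance in meters between two (lon, lat) points in degrees."""
    lon1, lat1, lon2, lat2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371000.0 * math.asin(math.sqrt(a))


def is_long_distance_drift(o1, o2, o3, theta_d, theta_v, theta_r, k=3):
    """Return True if o2 is a long-distance drift point.

    o1, o2, o3 are (time_s, lon, lat) tuples with increasing signaling time.
    """
    d12 = haversine_m(o1[1:], o2[1:])
    d13 = haversine_m(o1[1:], o3[1:])
    dt = o2[0] - o1[0]
    v12 = d12 / dt if dt > 0 else math.inf
    # 10**-k keeps the ratio defined when O1 and O3 share a base station.
    return d12 > theta_d and v12 > theta_v and d12 / (d13 + 10 ** -k) > theta_r


def drop_drift_points(points, theta_d, theta_v, theta_r):
    """Keep the first and last point; test each interior point against
    its two original neighbors (an assumed reading of the grouping rule)."""
    if len(points) < 3:
        return list(points)
    kept = [points[0]]
    for prev, cur, nxt in zip(points, points[1:], points[2:]):
        if not is_long_distance_drift(prev, cur, nxt, theta_d, theta_v, theta_r):
            kept.append(cur)
    kept.append(points[-1])
    return kept
```

For example, with hypothetical thresholds of 5000 meters, 50 m/s, and a ratio of 5, a point that jumps about 17 km away for one minute and then returns next to its starting base station would be removed.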
Illustratively, the data processing device 12 obtains the signaling point data of each user after preprocessing as shown in table 1:
Table 1. Preprocessed signaling point data of the user with user number 656445
[Table 1 is provided as an image: Figure BDA0003448125630000111]
Wherein each signaling point data of each user comprises: user number, base station location area code (Location Area Code, abbreviated LAC), base station number (Cell Identity, abbreviated CID), base station category, signaling time (time of day (delay time) as shown in table 1), base station location (longitude and latitude as shown in table 1).
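Given the record fields listed above, the time ordering and preprocessing modes 1 to 3 might be sketched as below. The class name, field names, the encoding of signaling time as seconds, and the choice to compare each record with the previously retained record are all assumptions made for illustration; records with missing values are dropped before sorting so that the sort key is always defined.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class SignalingPoint:
    user_id: str
    lac: Optional[int]        # base station location area code (LAC)
    cid: Optional[int]        # base station number (CID)
    category: Optional[str]   # base station category
    time_s: Optional[float]   # signaling time in seconds (assumed encoding)
    lon: Optional[float]      # base station longitude, degrees
    lat: Optional[float]      # base station latitude, degrees


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two points given in degrees."""
    lon1, lat1, lon2, lat2 = map(math.radians, (lon1, lat1, lon2, lat2))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371000.0 * math.asin(math.sqrt(a))


def preprocess(points, same_cell_window_s=10.0,
               error_window_s=60.0, error_distance_m=5000.0):
    """Apply modes 1-3 to the signaling points of a single user."""
    # Mode 1: delete records with any missing value.
    complete = [p for p in points
                if None not in (p.lac, p.cid, p.category, p.time_s, p.lon, p.lat)]
    complete.sort(key=lambda p: p.time_s)

    kept = []
    last_kept_time = {}  # (lac, cid) -> time of the last retained record
    for p in complete:
        cell = (p.lac, p.cid)
        # Mode 2: keep only the first interaction with a cell within the window.
        t_prev = last_kept_time.get(cell)
        if t_prev is not None and p.time_s - t_prev <= same_cell_window_s:
            continue
        # Mode 3: drop a record that is too far from the previous retained
        # record to be plausible within the error window.
        if kept:
            q = kept[-1]
            if (p.time_s - q.time_s <= error_window_s
                    and haversine_m(q.lon, q.lat, p.lon, p.lat) > error_distance_m):
                continue
        kept.append(p)
        last_kept_time[cell] = p.time_s
    return kept
```

The default window and distance values (10 seconds, 60 seconds, 5000 meters) are the example values given in the preprocessing modes; everything else is a parameter the caller may adjust.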
After obtaining the preprocessed signaling point data of each user, the data processing apparatus 12 performs space-time clustering on the preprocessed signaling point data of each user using the first distance threshold and the first time threshold, so as to obtain a plurality of pieces of first residence point information for each user. Specifically, the data processing apparatus 12 applies a space-time clustering algorithm based on the first distance threshold and the first time threshold to the preprocessed signaling point data of each user. The implementation principle of this algorithm is similar to that of the prior-art space-time clustering algorithm based on a spatial distance threshold and a residence time threshold, and is not described further in this embodiment.
Optionally, before performing space-time clustering on the signaling point data of each user using the first distance threshold and the first time threshold, the data processing apparatus 12 first determines whether first residence point information can be obtained using the first distance threshold and the first time threshold, and processes the signaling point data according to the result as follows:
if the data processing apparatus 12 determines that first residence point information can be obtained using the first distance threshold and the first time threshold, it performs space-time clustering on the signaling point data of each user using the first distance threshold and the first time threshold to obtain a plurality of pieces of first residence point information for each user;
if the data processing apparatus 12 determines that first residence point information cannot be obtained using the first distance threshold and the first time threshold, it performs space-time clustering on the signaling point data of each user using the first distance threshold and a first speed threshold to obtain the first residence point information.
The data processing apparatus 12 determines whether first residence point information can be obtained using the first distance threshold and the first time threshold as follows: the data processing apparatus 12 applies the space-time clustering algorithm based on the first distance threshold and the first time threshold to calculate each first residence point of each user. If a first residence point can be calculated by the algorithm, it is determined that first residence point information can be obtained using the first distance threshold and the first time threshold; otherwise, it is determined that first residence point information cannot be obtained using the first distance threshold and the first time threshold.
For example, assume that after the signaling point data of user B are sorted and preprocessed according to this step (i.e., step S202), the signaling point data are Pi, where i = 1, 2, 3, …, m, …, n; 0 < m ≤ n; and m and n are natural numbers. The data processing apparatus 12 obtains the first residence point information of user B as follows:
First, the data processing apparatus 12 uses the space-time clustering algorithm based on the first distance threshold $\delta_{d1}$ and the first time threshold $\delta_{t1}$ to calculate the 1st first residence point information of user B. If the 1st first residence point J1c of user B and its residence point information can be obtained, the data processing apparatus 12 determines that the 1st first residence point information of user B can be obtained using $\delta_{d1}$ and $\delta_{t1}$, and takes the first residence point J1c and its residence point information as the 1st first residence point information of user B.
Similarly, the data processing apparatus 12 uses the space-time clustering algorithm based on $\delta_{d1}$ and $\delta_{t1}$ to calculate the 2nd first residence point J2c of user B and its residence point information. The signaling points corresponding to J1c and J2c are shown in table 2:
Table 2. Some of the first residence points of user B and their corresponding signaling points
[Table 2 is provided as an image: Figure BDA0003448125630000131]
When the data processing apparatus 12 uses the space-time clustering algorithm based on the first distance threshold $\delta_{d1}$ and the first time threshold $\delta_{t1}$ to calculate the 3rd first residence point information of user B, it calculates the distance $d_{x11}$ from the cluster center of P8, P9, P10, P11 (i.e., the center of the base station positions corresponding to P8, P9, P10, P11) to the base station position corresponding to P11. Since $d_{x11} < \delta_{d1}$, P8, P9, P10, P11 may constitute one stay. The data processing apparatus 12 then calculates the distance $d_{x12}$ from the cluster center of P8, P9, P10, P11, P12 (i.e., the center of the base station positions corresponding to P8, P9, P10, P11, P12) to the base station position corresponding to P12; since $d_{x12} \geq \delta_{d1}$, the loop stops. The data processing apparatus 12 next uses the first time threshold $\delta_{t1}$ to determine whether P8, P9, P10, P11 constitute one stay whose stay region corresponds to cluster J3. Because the time difference between the signaling times of P11 and P8 satisfies $t_{11-8} \leq \delta_{t1}$, the data processing apparatus 12 cannot determine, using $\delta_{d1}$ and $\delta_{t1}$, that P8, P9, P10, P11 constitute one stay within the same cluster J3; that is, the residence point information of the first residence point J3c cannot be obtained for P8, P9, P10, P11 using $\delta_{d1}$ and $\delta_{t1}$.
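The distance-then-time check walked through for P8 to P12 can be sketched as follows. This is an illustrative reading of the example, not the source's own algorithm listing: the tuple layout `(time_s, lon, lat)`, the arithmetic mean of coordinates as the "cluster center", the haversine distance, and the function names are assumptions.

```python
import math


def haversine_m(p, q):
    """Great-circle distance in meters between two (lon, lat) points in degrees."""
    lon1, lat1, lon2, lat2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371000.0 * math.asin(math.sqrt(a))


def center(positions):
    """Arithmetic mean of (lon, lat) positions (assumed definition of the center)."""
    n = len(positions)
    return (sum(lon for lon, _ in positions) / n,
            sum(lat for _, lat in positions) / n)


def grow_candidate(points, start, delta_d):
    """Extend a candidate stay from points[start] while the newest point lies
    within delta_d of the running cluster center. Returns the exclusive end index."""
    end = start + 1
    while end < len(points):
        c = center([p[1:] for p in points[start:end + 1]])
        if haversine_m(c, points[end][1:]) >= delta_d:
            break  # corresponds to d_x12 >= delta_d1 in the example
        end += 1
    return end


def stay_by_time(points, start, end, delta_t):
    """A candidate is a stay under the time criterion if it holds at least two
    points and its first-to-last time difference exceeds delta_t."""
    return end - start >= 2 and points[end - 1][0] - points[start][0] > delta_t
```

When `stay_by_time` returns False for a candidate, as with $t_{11-8} \leq \delta_{t1}$ in the example, the candidate is the one handed to the speed-threshold check.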
Then, having determined that the residence point information of the first residence point J3c cannot be obtained using the first distance threshold $\delta_{d1}$ and the first time threshold $\delta_{t1}$, the data processing apparatus 12 performs space-time clustering on the signaling point data P8, P9, P10, P11 of user B using the first distance threshold $\delta_{d1}$ and the first speed threshold $\delta_{v1}$. Specifically, the data processing apparatus 12 applies a space-time clustering algorithm based on the first distance threshold and the first speed threshold to P8, P9, P10, P11. Exemplary steps S2021-S2022 of this algorithm are as follows:
S2021, calculating the distance $d_{x9}$ from the cluster center of P8 and P9 (i.e., the center of the base station positions corresponding to P8 and P9) to the base station position corresponding to P9; if $d_{x9} < \delta_{d1}$, proceeding to the next signaling point data P10. Calculating the distance $d_{x10}$ from the cluster center of P8, P9, P10 to the base station position corresponding to P10; if $d_{x10} < \delta_{d1}$, proceeding to the next signaling point data P11. Calculating the distance $d_{x11}$ from the cluster center of P8, P9, P10, P11 to the base station position corresponding to P11; since $d_{x11} < \delta_{d1}$, proceeding to the next signaling point data P12. Calculating the distance $d_{x12}$ from the cluster center of P8, P9, P10, P11, P12 to the base station position corresponding to P12; since $d_{x12} \geq \delta_{d1}$, the loop stops. It is thereby determined that P8, P9, P10, P11 may constitute one stay. Alternatively, where the data processing apparatus 12 has already determined, using the space-time clustering algorithm based on the first distance threshold and the first time threshold, that certain signaling point data (e.g., P8, P9, P10, P11) may constitute one stay, it may perform step S2022 directly on those signaling point data without performing step S2021.
S2022, calculating the speed $V_{2-3}$ from the previous residence point J2c, which precedes P8, P9, P10, P11, to the cluster center of P8, P9, P10, P11. If $V_{2-3} < \delta_{v1}$, it is determined that P8, P9, P10, P11 constitute one stay whose stay region corresponds to cluster J3, and the first residence point J3c and its residence point information are obtained. Specifically, the position of the cluster center of cluster J3 is the position of residence point J3c. The signaling times of the first and last signaling point data in cluster J3 are, respectively, the start time and end time of user B's stay at residence point J3c; that is, the residence time at J3c is the time difference $t_{11-8}$ between the signaling times of P11 and P8.
Conversely, if $V_{2-3} \geq \delta_{v1}$, it is determined that P8, P9, P10, P11 do not constitute one stay and are not in the same stay region, and P8, P9, P10, P11 are redundant signaling point data. Redundant signaling point data need not be used to determine residence point information.
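The speed check of step S2022 might look like the sketch below. The source does not state which time interval is used to compute $V_{2-3}$; this sketch assumes the interval from the end of the previous stay to the first signaling time of the candidate, and it gives no value for $\delta_{v1}$, so the threshold is a caller-supplied parameter. Function names, the `(time_s, lon, lat)` layout, and the returned dictionary keys are hypothetical.

```python
import math


def haversine_m(p, q):
    """Great-circle distance in meters between two (lon, lat) points in degrees."""
    lon1, lat1, lon2, lat2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371000.0 * math.asin(math.sqrt(a))


def stay_by_speed(prev_stay_pos, prev_stay_end_s, candidate, delta_v):
    """Return residence point information for `candidate` if the speed from the
    previous residence point to the candidate's cluster center is below delta_v;
    otherwise return None (the candidate is treated as redundant data).

    candidate: list of (time_s, lon, lat) in time order, at least two points.
    """
    n = len(candidate)
    c = (sum(p[1] for p in candidate) / n, sum(p[2] for p in candidate) / n)
    dt = candidate[0][0] - prev_stay_end_s  # assumed travel interval
    if dt <= 0:
        return None
    speed = haversine_m(prev_stay_pos, c) / dt
    if speed >= delta_v:
        return None
    return {
        "position": c,                                   # cluster center
        "start_s": candidate[0][0],                      # first signaling time
        "end_s": candidate[-1][0],                       # last signaling time
        "residence_time_s": candidate[-1][0] - candidate[0][0],
    }
```

A candidate about 850 meters from the previous residence point and reached after 1000 seconds moves at under 1 m/s, so it would be accepted under a hypothetical threshold of 1.5 m/s and rejected under 0.5 m/s.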
After the data processing apparatus 12 has performed space-time clustering on the signaling point data P8, P9, P10, P11 of user B using the first distance threshold $\delta_{d1}$ and the first speed threshold $\delta_{v1}$, whether or not the residence point information of the first residence point J3c was obtained, it continues from the subsequent signaling point data P12, P13, and so on, again determining whether first residence point information can be obtained using the first distance threshold and the first time threshold. If so, the data processing apparatus 12 obtains the first residence point information by space-time clustering with the first distance threshold and the first time threshold. If not, the signaling point data are similarly space-time clustered using the first distance threshold and the first speed threshold to obtain first residence point information; if this clustering also fails to yield first residence point information, the corresponding signaling point data are determined to be redundant signaling point data. This continues until all signaling point data Pi of user B have been traversed.
S203, determining first-class users and second-class users from among the users based on the number of first residence points of each user.
Specifically, the data processing apparatus 12 determines the first-class users and the second-class users from among the users based on the number of first residence points of each user; a first-class user is a user whose number of first residence points is greater than a preset upper quantity threshold, and a second-class user is a user whose number of first residence points is less than a preset lower quantity threshold.
After obtaining the first residence point information of each user according to step S202, the data processing apparatus 12 calculates the number of first residence points Q1 of each user, compares Q1 with the upper quantity threshold $Q_H$ and the lower quantity threshold $Q_L$, and determines whether each user is a first-class user or a second-class user as follows:
if Q1 > $Q_H$, the user corresponding to Q1 is determined to be a first-class user;
if Q1 < $Q_L$, the user corresponding to Q1 is determined to be a second-class user.
The first-class users represent users with an excessively high number of trips or users in sparsely populated areas, and the second-class users represent users with an excessively low number of trips or users in densely populated areas.
Illustratively, the upper quantity threshold is $Q_H$ = 6 and the lower quantity threshold is $Q_L$ = 2. When Q1 = 1, the user corresponding to Q1 stays only in the single residence point area corresponding to Q1. If Q1 = 1 < 2 is calculated for many users in a certain area, this indicates that the users in that area are not suited to cluster analysis by the space-time clustering algorithm with the first distance threshold and the first time threshold; the distance threshold and the time threshold need to be adjusted, and the space-time clustering algorithm with the adjusted thresholds is then used to determine residence point information that matches the actual stays of the users in that area.
Therefore, the adjusted distance threshold and time threshold are used to space-time cluster the signaling point data of the first class user and the second class user according to step S204.
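The classification rule of step S203 reduces to two comparisons, sketched below. The function names and class labels are hypothetical; the default thresholds are the illustrative values $Q_H$ = 6 and $Q_L$ = 2 from the example above.

```python
def classify_user(q1, q_high=6, q_low=2):
    """Classify a user by the number of first residence points q1."""
    if q1 > q_high:
        return "first_class"    # too many residence points
    if q1 < q_low:
        return "second_class"   # too few residence points
    return "regular"            # first-pass clustering is kept as is


def classify_users(first_residence_counts, q_high=6, q_low=2):
    """Map each user id to its class, given a dict of user id -> q1."""
    return {uid: classify_user(q1, q_high, q_low)
            for uid, q1 in first_residence_counts.items()}
```

Users labeled "regular" here are those that are neither first-class nor second-class, so only the other two groups are re-clustered in step S204.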
S204, adopting a second distance threshold and a second time threshold to perform space-time clustering on the signaling point data of the first class user and the signaling point data of the second class user respectively, and obtaining a plurality of second resident point information of the first class user and the second class user respectively.
Specifically, the data processing apparatus 12 performs space-time clustering on the signaling point data of the first-class users and of the second-class users using a second distance threshold and a second time threshold, so as to obtain a plurality of pieces of second residence point information for each first-class user and each second-class user; the second residence point information includes the residence time at each second residence point and the base station position corresponding to that second residence point.
The implementation principle of the space-time clustering algorithm used in this step (i.e., step S204), which is based on the second distance threshold and the second time threshold, is similar to that of the space-time clustering algorithm based on the first distance threshold and the first time threshold shown in step S202, so the specific clustering process is not repeated here.
Illustratively, the second distance threshold and the second time threshold are thresholds adjusted relative to the first distance threshold and the first time threshold so as to suit space-time clustering of the signaling point data of the first-class users and the second-class users. The second distance threshold comprises two thresholds, $\delta_{d21}$ for the first-class users and $\delta_{d22}$ for the second-class users. Specifically, the data processing apparatus 12 uses $\delta_{d21}$ and the second time threshold to perform space-time clustering on the signaling point data of the first-class users, obtaining a plurality of pieces of second residence point information for each first-class user; and it uses $\delta_{d22}$ and the second time threshold to perform space-time clustering on the signaling point data of the second-class users, obtaining a plurality of pieces of second residence point information for each second-class user.
For example, where the first-class users represent users in sparsely populated areas, obtaining residence point information that matches their actual residence situation requires a distance threshold higher than the first distance threshold (such as $\delta_{d21}$), used together with the second time threshold, for space-time clustering of the signaling point data of the first-class users. Correspondingly, where the second-class users represent users in densely populated areas, obtaining residence point information that matches their actual residence situation requires a distance threshold lower than the first distance threshold (such as $\delta_{d22}$), used together with the second time threshold, for space-time clustering of the signaling point data of the second-class users.
Illustratively, the second distance threshold and second time threshold, adjusted relative to the first distance threshold and first time threshold, are shown in table 3:
Table 3. Distance thresholds and time thresholds
First distance threshold: $\delta_{d1}$ = 800 meters
First time threshold: 1200 seconds
Second distance threshold: $\delta_{d21}$ = 2000 meters; $\delta_{d22}$ = 500 meters
Second time threshold: 1200 seconds
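The threshold values of table 3 can be held in a small lookup keyed by user class, as in the sketch below. The type name, class labels, and function name are hypothetical; only the numeric values come from table 3.

```python
from typing import NamedTuple


class Thresholds(NamedTuple):
    distance_m: float
    time_s: float


# First-pass thresholds applied to every user in step S202.
FIRST_PASS = Thresholds(distance_m=800.0, time_s=1200.0)

# Second-pass thresholds applied in step S204, per user class.
SECOND_PASS = {
    "first_class": Thresholds(distance_m=2000.0, time_s=1200.0),   # delta_d21
    "second_class": Thresholds(distance_m=500.0, time_s=1200.0),   # delta_d22
}


def thresholds_for(user_class):
    """Return the thresholds to cluster with for the given user class;
    users in neither class keep the first-pass thresholds."""
    return SECOND_PASS.get(user_class, FIRST_PASS)
```

Keeping the values in one place makes it straightforward to tune the distance thresholds for a different region without touching the clustering code.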
It should be understood that the data processing apparatus 12 uses the second distance threshold and the second time threshold to perform space-time clustering on the signaling point data of the first type of user and the second type of user, and specifically, the data processing apparatus 12 uses the second distance threshold and the second time threshold to perform space-time clustering on the pre-processed signaling point data of the first type of user and the second type of user.
Optionally, before performing space-time clustering on the signaling point data of the first-class users and of the second-class users using the second distance threshold and the second time threshold, the data processing apparatus 12 first determines whether second residence point information can be obtained using the second distance threshold and the second time threshold, and processes the signaling point data according to the result as follows:
if the data processing apparatus 12 determines that second residence point information can be obtained using the second distance threshold and the second time threshold, it performs space-time clustering on the signaling point data of the first-class users and of the second-class users using the second distance threshold and the second time threshold, obtaining a plurality of pieces of second residence point information for each first-class user and each second-class user;
specifically, if the data processing apparatus 12 determines that second residence point information of a first-class user can be obtained using $\delta_{d21}$ and the second time threshold, it performs space-time clustering on the signaling point data of the first-class users using $\delta_{d21}$ and the second time threshold, obtaining a plurality of pieces of second residence point information for each first-class user;
correspondingly, if the data processing apparatus 12 determines that second residence point information of a second-class user can be obtained using $\delta_{d22}$ and the second time threshold, it performs space-time clustering on the signaling point data of the second-class users using $\delta_{d22}$ and the second time threshold, obtaining a plurality of pieces of second residence point information for each second-class user;
the implementation principle of the space-time clustering algorithm using $\delta_{d21}$ and the second time threshold, or $\delta_{d22}$ and the second time threshold, is similar to that of the space-time clustering algorithm based on the first distance threshold and the first time threshold shown in step S202, and is not described again here.
If the data processing apparatus 12 determines that second residence point information cannot be obtained using the second distance threshold and the second time threshold, it performs space-time clustering on the signaling point data of the first-class users and of the second-class users using the second distance threshold and a second speed threshold, obtaining the second residence point information of each first-class user and each second-class user;
specifically, if the data processing apparatus 12 determines that second residence point information of a first-class user cannot be obtained using $\delta_{d21}$ and the second time threshold, it performs space-time clustering on the signaling point data of the first-class users using $\delta_{d21}$ and the second speed threshold, obtaining a plurality of pieces of second residence point information for each first-class user;
correspondingly, if the data processing apparatus 12 determines that second residence point information of a second-class user cannot be obtained using $\delta_{d22}$ and the second time threshold, it performs space-time clustering on the signaling point data of the second-class users using $\delta_{d22}$ and the second speed threshold, obtaining a plurality of pieces of second residence point information for each second-class user;
the implementation principle of the space-time clustering algorithm using $\delta_{d21}$ and the second speed threshold, or $\delta_{d22}$ and the second speed threshold, is similar to that of the space-time clustering algorithm based on the first distance threshold and the first speed threshold shown in step S202, and is not described again here.
The first residence point information and the second residence point information are used to carry out traffic planning for the area to be planned, for use in city planning. Illustratively, the data processing apparatus 12 may send the first residence point information and the second residence point information to a device or platform for city planning, so that they can be used for traffic planning of the area to be planned.
Illustratively, the data processing apparatus 12 may send, to the device or platform for city planning, the first residence point information of each user who is neither a first-class user nor a second-class user, together with the second residence point information of the first-class users and the second-class users, so that this first and second residence point information can be used for traffic planning of the area to be planned.
As can be seen from the above, each first residence point and each second residence point correspondingly comprise at least two signaling point data corresponding to each other, and each signaling point data corresponds to a base station position. Thus, the base station positions corresponding to the first dwell point include the base station positions of at least two base stations corresponding to the first dwell point, and the base station positions corresponding to the second dwell point include the base station positions of at least two base stations corresponding to the second dwell point.
Optionally, based on the base station positions corresponding to the first residence points, the first residence point positions can be calculated, where the first residence point positions are positions of midpoints of at least two base stations corresponding to the first residence points;
correspondingly, based on the base station positions corresponding to the second residence points, the second residence point positions can be calculated, and the second residence point positions are the positions of the midpoints of at least two base stations corresponding to the second residence points;
The first residence points or the second residence points of each user are sorted, and a distance value D_z between two adjacent first residence point positions is calculated based on the first residence point positions; correspondingly, a distance value D_z between two adjacent second residence point positions is calculated based on the second residence point positions;
Each piece of first residence point information or second residence point information of each user is spatially clustered by merging the two adjacent residence points corresponding to D_z whenever the distance value D_z is less than or equal to the merge threshold δ_z, to obtain the third residence point information of each user, wherein the third residence point information comprises the residence time of the third residence point and the base station position corresponding to the third residence point;
the third residence point information is used for traffic planning of the area to be planned in city planning.
It should be understood that the spatial clustering method is: if D_z ≤ δ_z, the two adjacent residence points corresponding to D_z are merged.
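As an illustration of the position and distance calculation above, the following sketch computes a residence point position from its base station positions and the distance between two such positions. It assumes that base station positions are (longitude, latitude) pairs in degrees, that the "midpoint" is the arithmetic mean of the coordinates, and that the great-circle (haversine) distance is an acceptable distance measure; none of these choices is specified in the text, and the function names are hypothetical.

```python
import math
from typing import Sequence, Tuple

LonLat = Tuple[float, float]  # (longitude, latitude) in degrees


def midpoint(stations: Sequence[LonLat]) -> LonLat:
    """Residence point position: mean of the base station coordinates (assumption)."""
    lons = [lon for lon, _ in stations]
    lats = [lat for _, lat in stations]
    return (sum(lons) / len(lons), sum(lats) / len(lats))


def haversine_m(a: LonLat, b: LonLat) -> float:
    """Great-circle distance in metres between two (lon, lat) positions."""
    lon1, lat1 = map(math.radians, a)
    lon2, lat2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371000.0 * math.asin(math.sqrt(h))  # mean Earth radius in metres
```

With these helpers, D_z for two adjacent residence points would be `haversine_m(midpoint(stations_a), midpoint(stations_b))`.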
Optionally, the spatial clustering of the first residence point information or the second residence point information of each user may be performed after all of the first residence point information or second residence point information of that user has been obtained.
Optionally, the spatial clustering may instead be performed while the first residence point information or the second residence point information of each user is being calculated, by spatially clustering each newly calculated residence point with the previous residence point. Illustratively, after the data processing device 12 calculates the 2nd first residence point J2c of user B, and before it calculates the 3rd first residence point J3c, the data processing device 12 calculates the distance value D_z between J2c and J1c of user B. If D_z ≤ δ_z, the two adjacent residence points corresponding to D_z are merged, that is, J2c and J1c are merged into one residence point. If D_z > δ_z, J2c and J1c are not merged and remain independent residence points. The data processing device 12 then calculates the 3rd first residence point J3c of user B and, after J3c is obtained and before the 4th first residence point J4c is calculated, determines in a similar way whether J3c can be merged with the previous residence point.
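The incremental merging of adjacent residence points can be sketched as below. This is a hedged illustration, not the patented implementation: the `Stay` class and `merge_adjacent` function are hypothetical names, positions are assumed to be planar coordinates in metres, residence points are assumed to be ordered by start time, and a merged residence point is assumed to keep the union of the base stations and the combined time span.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Stay:
    stations: List[Tuple[float, float]]  # base station positions, planar metres (assumption)
    start: float                         # residence start time, seconds
    end: float                           # residence end time, seconds

    @property
    def position(self) -> Tuple[float, float]:
        """Midpoint of the base stations belonging to this residence point."""
        xs = [x for x, _ in self.stations]
        ys = [y for _, y in self.stations]
        return (sum(xs) / len(xs), sum(ys) / len(ys))


def merge_adjacent(stays: List[Stay], merge_threshold: float) -> List[Stay]:
    """Merge each residence point into the previous one when D_z <= delta_z."""
    merged: List[Stay] = []
    for stay in sorted(stays, key=lambda s: s.start):
        if merged and math.dist(merged[-1].position, stay.position) <= merge_threshold:
            prev = merged[-1]
            # Combine base stations and extend the time span (assumed merge rule).
            merged[-1] = Stay(prev.stations + stay.stations,
                              prev.start, max(prev.end, stay.end))
        else:
            merged.append(stay)
    return merged
```

Because each new residence point is compared only with the most recent entry in `merged`, the same loop body can be run as soon as each residence point is calculated, which corresponds to the second optional variant described above.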
Optionally, the data processing device 12 performs space-time clustering on each signaling point data of each user by adopting a first distance threshold and a first time threshold, so as to obtain a plurality of first residence point information of each user corresponding to each user. Specifically, the data processing device 12 performs space-time clustering on each signaling point data of each user on each day by using a first distance threshold and a first time threshold, so as to obtain a plurality of first residence point information of each user corresponding to each day of each user. When determining the 1 st first residence point or the last first residence point of the current day, it may be directly determined that at least two signaling point data form a residence after determining that the at least two signaling point data may form a residence through the first distance threshold.
For example, for the signaling point data of user B shown in Table 2, when the 1st first residence point J1c of user B is calculated using the first distance threshold δ_d1, the distance d_x2 from the cluster center of P1 and P2 (i.e., the center of the base station positions corresponding to P1 and P2) to the base station position corresponding to P2 is calculated, giving d_x2 < δ_d1; then the distance d_x3 from the cluster center of P1, P2 and P3 (i.e., the center of the base station positions corresponding to P1, P2 and P3) to the base station position corresponding to P3 is calculated, giving d_x3 < δ_d1; then the distance d_x4 from the cluster center of P1, P2, P3 and P4 (i.e., the center of the base station positions corresponding to P1, P2, P3 and P4) to the base station position corresponding to P4 is calculated, giving d_x4 ≥ δ_d1. It is then directly determined that P1, P2 and P3 form one stay and are in one cluster J1.
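The distance test in the example above can be sketched as follows. The sketch covers only the distance-threshold part used for the first or last residence point of a day; the function name `first_stay`, the planar coordinates in metres and the sample values are assumptions for illustration and do not come from Table 2.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # base station position, planar metres (assumption)


def first_stay(points: Sequence[Point], distance_threshold: float) -> List[Point]:
    """Grow a cluster from the first point until a new point lies too far from the centre.

    For each candidate, the cluster centre is recomputed with the candidate
    included, and the candidate is kept only if its distance to that centre
    is below the threshold (d_x < delta_d1).
    """
    if not points:
        return []
    cluster = [points[0]]
    for candidate in points[1:]:
        trial = cluster + [candidate]
        cx = sum(x for x, _ in trial) / len(trial)
        cy = sum(y for _, y in trial) / len(trial)
        if math.dist((cx, cy), candidate) >= distance_threshold:
            break  # d_x >= delta_d1: the candidate does not belong to this stay
        cluster = trial
    return cluster


# Hypothetical positions for P1..P4; P4 is far from the first three.
p1, p2, p3, p4 = (0.0, 0.0), (100.0, 0.0), (50.0, 50.0), (2000.0, 0.0)
stay = first_stay([p1, p2, p3, p4], distance_threshold=500.0)
```

Since a residence point comprises at least two signaling point data, a caller would treat a returned cluster with fewer than two points as not forming a stay.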
Accordingly, the data processing device 12 uses the second distance threshold and the second time threshold to perform space-time clustering on the signaling point data of the first class user and the second class user respectively, so as to obtain a plurality of second residence point information of the first class user and the second class user respectively. Specifically, the data processing device 12 uses a second distance threshold and a second time threshold to perform space-time clustering on the signaling point data of each day of the first type of user and the second type of user, so as to obtain a plurality of second residence point information of the user corresponding to each day of the first type of user and the second type of user. When determining the 1 st second residence point or the last second residence point of the current day, it may be directly determined that the at least two signaling point data form a residence after determining that the at least two signaling point data may form a residence through the second distance threshold.
Alternatively, if the data processing device 12 uses a space-time clustering algorithm based on the first distance threshold and the first speed threshold or a space-time clustering algorithm based on the second distance threshold and the second speed threshold, it determines whether the redundant signaling point data satisfies the long-distance round trip feature, if so, it determines that the redundant signaling point data forms a dwell, and obtains the first or second dwell point information correspondingly based on the redundant signaling point data.
An example of determining whether redundant signaling point data satisfies the long-distance round-trip feature is as follows: as shown in step S202, after P8, P9, P10 and P11 are determined to be redundant signaling point data in step S2022, the distance D_d3 from the cluster center of P8, P9, P10 and P11 to J2c and the speed V_t3 are calculated. If D_d3 > δ_d3 and V_t3 < δ_t3, it is determined that P8, P9, P10 and P11 satisfy the long-distance round-trip feature, where δ_d3 is the third distance threshold and δ_t3 is the third speed threshold.
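The long-distance round-trip test can be sketched as below. How V_t3 is computed is defined in step S202 and is not reproduced here, so the sketch takes the speed as an argument; the function name, the planar coordinates in metres and the use of the previous residence point position for J2c are assumptions.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]  # planar metres (assumption)


def satisfies_long_round_trip(
    redundant_points: Sequence[Point],  # e.g. base station positions of P8..P11
    previous_stay_position: Point,      # position of the previous residence point, e.g. J2c
    speed: float,                       # V_t3, computed elsewhere (see step S202)
    distance_threshold_3: float,        # delta_d3
    speed_threshold_3: float,           # delta_t3
) -> bool:
    """Return True when D_d3 > delta_d3 and V_t3 < delta_t3."""
    cx = sum(x for x, _ in redundant_points) / len(redundant_points)
    cy = sum(y for _, y in redundant_points) / len(redundant_points)
    d_d3 = math.dist((cx, cy), previous_stay_position)
    return d_d3 > distance_threshold_3 and speed < speed_threshold_3
```

When the function returns True, the redundant signaling point data would be treated as forming a stay of its own.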
Further, the data processing device 12 performs grid division on the area to be planned to obtain a plurality of grids;
calculating, based on the first residence point information and the second residence point information, the ratio of the total number of the first residence points and the second residence points in the area corresponding to each grid in the residence time period to the total number of the first residence points and the second residence points in the area to be planned, and determining the area corresponding to a grid whose ratio is greater than or equal to the residence ratio threshold as a place of residence;
Correspondingly, the ratio of the total number of the first resident points and the second resident points in the area corresponding to each grid in the working time period to the total number of the first resident points and the second resident points in the area to be planned is calculated, and the area corresponding to the grid with the ratio larger than or equal to the working ratio threshold value is determined as the working place.
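The grid ratio calculation can be sketched as follows. The square grid, the planar coordinates in metres, the function names and the sample numbers are assumptions for illustration; the caller is assumed to have already selected the residence points that fall in the relevant time period (residence or working) and to supply the total count for the area to be planned.

```python
import math
from collections import Counter
from typing import Dict, Sequence, Set, Tuple

Cell = Tuple[int, int]


def grid_ratios(
    stay_positions_in_period: Sequence[Tuple[float, float]],
    total_stays_in_area: int,
    cell_size: float,
) -> Dict[Cell, float]:
    """Share of the area's residence points that falls in each grid cell."""
    counts = Counter(
        (math.floor(x / cell_size), math.floor(y / cell_size))
        for x, y in stay_positions_in_period
    )
    return {cell: n / total_stays_in_area for cell, n in counts.items()}


def cells_at_or_above(ratios: Dict[Cell, float], threshold: float) -> Set[Cell]:
    """Cells whose ratio is greater than or equal to the given ratio threshold."""
    return {cell for cell, ratio in ratios.items() if ratio >= threshold}


# Hypothetical night-time residence points in an area with 8 residence points in total.
night = [(10.0, 10.0), (20.0, 30.0), (90.0, 90.0), (150.0, 20.0)]
ratios = grid_ratios(night, total_stays_in_area=8, cell_size=100.0)
residential_cells = cells_at_or_above(ratios, threshold=0.3)
```

Running the same two functions on the residence points of the working time period, with the working ratio threshold, would give the cells identified as working places.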
Illustratively, the base station location in the present application may be the latitude and longitude of the base station.
Illustratively, fig. 4 shows the base station positions corresponding to the signaling point data of a certain user on a certain day, and the positions of the residence points, obtained with the signaling data processing method provided by the present application. Fig. 4 is a schematic view of the residence points of a certain user on a certain day according to an embodiment of the present application. Based on the position of each residence point or the base station position corresponding to each signaling point, the administrative region in which that position lies, as delimited by the boundary lines of the administrative regions, can be determined.
According to the signaling data processing method, the signaling point data of each user is first subjected to space-time clustering using a first distance threshold and a first time threshold, to obtain a plurality of first residence point information of each user. Then, based on the number of first residence points of each user, the number upper limit threshold and the number lower limit threshold, the first class of users and the second class of users, for whom space-time clustering with the first distance threshold and the first time threshold is not suitable, are selected from the users. The adjusted second distance threshold and second time threshold are then used to perform space-time clustering on the signaling point data of the first class of users and of the second class of users respectively, to obtain a plurality of second residence point information of each first-class user and each second-class user. This ensures that the residence point information obtained for each user matches the actual residence situation of that user.
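The selection of first-class and second-class users summarized above can be sketched as follows. The function name `classify_users`, the user identifiers and the threshold values are hypothetical; only the comparison rule (greater than the upper limit, less than the lower limit) is taken from the text.

```python
from typing import Dict, Set, Tuple


def classify_users(
    first_stay_counts: Dict[str, int],  # user id -> number of first residence points
    upper_threshold: int,               # number upper limit threshold
    lower_threshold: int,               # number lower limit threshold
) -> Tuple[Set[str], Set[str]]:
    """Return (first-class users, second-class users)."""
    first_class = {u for u, n in first_stay_counts.items() if n > upper_threshold}
    second_class = {u for u, n in first_stay_counts.items() if n < lower_threshold}
    return first_class, second_class


# Hypothetical counts: A has too many residence points, C too few, B is left as is.
first_class, second_class = classify_users({"A": 12, "B": 4, "C": 1}, 10, 2)
```

Users in neither set keep their first residence point information, while the two sets are re-clustered with δ_d21 and δ_d22 respectively.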
Meanwhile, the signaling data processing method provided by the application also provides various methods for determining the residence point information matched with different residence situations of users, such as a method for determining the residence point information of each user based on a space-time clustering algorithm of a distance threshold value and a speed threshold value, a method for determining the residence point information of each user based on long-distance round trip characteristics and the like, and further ensures the accuracy of obtaining the residence point information of the users in different areas and different residence situations.
In addition, by adopting the signaling data processing method provided by the application, the residence and the working place of the area to be planned can be determined, so that the targeted reasonable city planning can be conveniently carried out aiming at the areas with different functions.
The embodiment of the application also provides a signaling data processing device. Fig. 5 is a block diagram of a signaling data processing apparatus according to an embodiment of the present application. As shown in fig. 5, the apparatus includes:
a processing module 121, configured to obtain signaling point data of each user in a range of an area to be planned from the big data platform 11; the signaling point data comprises a base station position and signaling time;
the first calculation module 122 is configured to obtain signaling point data of each user from the processing module 121, and perform space-time clustering on each signaling point data of each user by adopting a first distance threshold and a first time threshold to obtain a plurality of first residence point information of each user corresponding to each user; the first residence point information comprises residence time of each first residence point and base station positions corresponding to the first residence points;
A user determining module 123, configured to obtain, from the first computing module 122, respective first residence information of users corresponding to each user, and determine, from each user, a first type of user and a second type of user based on the first residence number of each user; the first class of users are users with the first resident point number being larger than the upper limit threshold value of the number, and the second class of users are users with the first resident point number being smaller than the lower limit threshold value of the number;
a second calculation module 124, configured to obtain the first type of user and the second type of user from the user determination module 123, and obtain the signaling point data of the first type of user and the second type of user from the processing module 121; carrying out space-time clustering on the signaling point data of the first class user and the second class user respectively by adopting a second distance threshold and a second time threshold to obtain a plurality of second resident point information of the first class user and the second class user respectively; the second residence point information comprises residence time of the second residence point and a base station position corresponding to the second residence point.
The specific implementation principle and technical effect of the signaling data processing apparatus provided in this embodiment of the present application are similar to those of the embodiment shown in fig. 2, and the present embodiment is not repeated here.
The embodiment of the application also provides a data processing device. Fig. 6 is a block diagram of a data processing apparatus according to an embodiment of the present application. As shown in fig. 6, the apparatus includes a processor 61 and a memory 62, where the memory 62 stores instructions executable by the processor 61, so that the processor 61 can execute the technical solution of the above method embodiment; the implementation principle and technical effects are similar and are not repeated here. It should be understood that the processor 61 may be a central processing unit (CPU), or may be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), or the like. A general-purpose processor may be a microprocessor or any conventional processor. The steps of the method disclosed in connection with the present invention may be executed directly by a hardware processor, or by a combination of hardware and software modules in the processor. The memory 62 may comprise high-speed RAM, may further comprise non-volatile memory (NVM) such as at least one magnetic disk memory, and may also be a USB flash drive, a removable hard disk, a read-only memory, a magnetic disk, an optical disk, or the like.
The embodiment of the application also provides a storage medium, wherein computer-executable instructions are stored in the storage medium, and when the computer-executable instructions are executed by a processor, the signaling data processing method is implemented. The storage medium may be implemented by any type of volatile or nonvolatile memory device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, or a magnetic or optical disk. A storage medium may be any available medium that can be accessed by a general-purpose or special-purpose computer.
An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. Alternatively, the storage medium may be integral to the processor. The processor and the storage medium may reside in an application-specific integrated circuit (ASIC). The processor and the storage medium may also reside as discrete components in an electronic device or a master device.
The embodiments of the present application also provide a program product, such as a computer program, which when executed by a processor implements the signaling data processing method covered by the present application.
Those of ordinary skill in the art will appreciate that: all or part of the steps for implementing the method embodiments described above may be performed by hardware associated with program instructions. The foregoing program may be stored in a computer readable storage medium. The program, when executed, performs steps including the method embodiments described above; and the aforementioned storage medium includes: various media that can store program code, such as ROM, RAM, magnetic or optical disks.
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solution of the present invention, not to limit it. Although the invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments can still be modified, and some or all of their technical features can be replaced with equivalents; such modifications and substitutions do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (9)

1. A signaling data processing method, comprising:
acquiring signaling point data of each user in a region to be planned, wherein the signaling point data comprises a base station position and signaling time;
performing space-time clustering on each signaling point data of each user by adopting a first distance threshold and a first time threshold to obtain a plurality of first standing point information of each user corresponding to each user, wherein the first standing point information comprises the standing time of each first standing point and the base station position corresponding to the first standing point;
determining a first class of users and a second class of users from the users based on the first resident point number of each user, wherein the first class of users are users with the first resident point number larger than a quantity upper limit threshold value, and the second class of users are users with the first resident point number smaller than a quantity lower limit threshold value;
performing space-time clustering on the signaling point data of the first class of users and the signaling point data of the second class of users respectively by adopting a second distance threshold and a second time threshold to obtain a plurality of pieces of second standing point information of the users of the first class of users and the second class of users, wherein the second standing point information comprises the standing time of a second standing point and the position of a base station corresponding to the second standing point;
The first standing point information and the second standing point information are used for carrying out traffic planning on the area to be planned and for city planning;
the second distance threshold includes two thresholds, δ_d21 and δ_d22, for the first class of users and the second class of users respectively;
the performing space-time clustering on the signaling point data of the first class of users and the second class of users respectively by adopting a second distance threshold and a second time threshold, to obtain a plurality of second residence point information of the first class of users and the second class of users respectively, includes:
performing space-time clustering on the signaling point data of the first class of users by adopting δ_d21 and the second time threshold, to obtain a plurality of second residence point information of each user of the first class of users;
performing space-time clustering on the signaling point data of the second class of users by adopting δ_d22 and the second time threshold, to obtain a plurality of second residence point information of each user of the second class of users.
2. The method of claim 1, wherein before performing space-time clustering on each signaling point data of each user using the first distance threshold and the first time threshold to obtain a plurality of first residence information of each user corresponding to each user, the method further comprises:
Judging whether the first residence point information can be obtained by adopting the first distance threshold and the first time threshold;
the method for performing space-time clustering on the signaling point data of each user by adopting a first distance threshold and a first time threshold to obtain a plurality of first residence point information of each user corresponding to each user comprises the following steps:
and under the condition that the first resident point information can be obtained, carrying out space-time clustering on each signaling point data of each user by adopting the first distance threshold and the first time threshold to obtain a plurality of pieces of first resident point information of each user corresponding to each user.
3. The method according to claim 2, wherein the method further comprises:
and under the condition that the first resident information cannot be obtained, performing space-time clustering on each piece of signaling point data of each user by adopting the first distance threshold and the first speed threshold to obtain the first resident information.
4. The method of claim 1, wherein prior to said employing a second distance threshold and a second time threshold to space-time cluster each signaling point data of the first type of user and the second type of user, respectively, obtaining a plurality of second point information for each of the first type of user and the second type of user, the method further comprises:
Judging whether the second residence point information can be obtained by adopting the second distance threshold value and the second time threshold value;
the adopting a second distance threshold and a second time threshold to perform space-time clustering on the signaling point data of the first class user and the second class user respectively to obtain a plurality of second residence point information of the first class user and the second class user respectively, including:
and under the condition that the second resident information can be obtained, adopting the second distance threshold value and the second time threshold value to perform space-time clustering on the signaling point data of the first class user and the signaling point data of the second class user respectively, and obtaining a plurality of pieces of second resident information of the first class user and the second class user respectively.
5. The method according to claim 4, wherein the method further comprises:
and under the condition that the second resident point information can not be obtained, adopting the second distance threshold value and the second speed threshold value to perform space-time clustering on the signaling point data of the first class user and the signaling point data of the second class user respectively, and obtaining the second resident point information of the first class user and the second class user respectively.
6. The method of any of claims 1-5, wherein the base station locations corresponding to the first dwell point comprise base station locations of at least two base stations corresponding to the first dwell point, and the base station locations corresponding to the second dwell point comprise base station locations of at least two base stations corresponding to the second dwell point;
The method further comprises the steps of:
based on the base station positions corresponding to the first residence points, calculating to obtain first residence point positions, wherein the first residence point positions are the positions of the midpoints of at least two base stations corresponding to the first residence points;
correspondingly, based on the base station positions corresponding to the second residence points, calculating to obtain second residence point positions, wherein the second residence point positions are the positions of the midpoints of at least two base stations corresponding to the second residence points;
sequencing each first resident point or each second resident point of each user, calculating to obtain a distance value between two adjacent first resident point positions based on the first resident point positions, and correspondingly, calculating to obtain a distance value between two adjacent second resident point positions based on the second resident point positions;
performing spatial clustering on each piece of first residence point information or second residence point information of each user by merging the two adjacent residence points corresponding to the distance value D_z when D_z is less than or equal to the merge threshold δ_z, to obtain third residence point information of each user, wherein the third residence point information comprises the residence time of a third residence point and the base station position corresponding to the third residence point;
the third residence point information is used for traffic planning of the area to be planned in city planning.
7. A signaling data processing apparatus, the apparatus comprising:
the processing module is used for acquiring signaling point data of each user in a region range to be planned from the big data platform; the signaling point data comprise a base station position and signaling time;
the first calculation module is used for acquiring signaling point data of each user from the processing module, and performing space-time clustering on the signaling point data of each user by adopting a first distance threshold and a first time threshold to acquire a plurality of pieces of first resident point information of each user corresponding to each user; the first residence point information comprises residence time of each first residence point and base station positions corresponding to the first residence points;
the user determining module is used for acquiring a plurality of pieces of first resident information of the users corresponding to the users from the first calculating module, and determining a first type of user and a second type of user from the users based on the first resident number of the users; the first type of users are users with the first resident point number being larger than the upper quantity threshold value, and the second type of users are users with the first resident point number being smaller than the lower quantity threshold value;
the second computing module is used for acquiring the first class users and the second class users from the user determining module and acquiring the signaling point data of the first class users and the second class users from the processing module; performing space-time clustering on the signaling point data of the first class user and the signaling point data of the second class user respectively by adopting a second distance threshold and a second time threshold to obtain a plurality of second resident point information of the first class user and the second class user respectively; the second residence point information comprises residence time of a second residence point and a base station position corresponding to the second residence point;
the second distance threshold includes two thresholds, δ_d21 and δ_d22, for the first class of users and the second class of users respectively;
the second calculation module is specifically configured to perform space-time clustering on the signaling point data of the first class of users by adopting δ_d21 and the second time threshold, to obtain a plurality of second residence point information of each user of the first class of users; and to perform space-time clustering on the signaling point data of the second class of users by adopting δ_d22 and the second time threshold, to obtain a plurality of second residence point information of each user of the second class of users.
8. A data processing apparatus, comprising:
a processor and a memory;
the memory stores executable instructions executable by the processor;
wherein execution of the executable instructions stored by the memory by the processor causes the processor to perform the method of any one of claims 1-6.
9. A storage medium having stored therein computer-executable instructions which, when executed by a processor, are adapted to carry out the method of any one of claims 1-6.
CN202111665065.4A 2021-12-30 2021-12-30 Signaling data processing method, apparatus and storage medium Active CN114501419B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111665065.4A CN114501419B (en) 2021-12-30 2021-12-30 Signaling data processing method, apparatus and storage medium

Publications (2)

Publication Number Publication Date
CN114501419A CN114501419A (en) 2022-05-13
CN114501419B true CN114501419B (en) 2023-05-12

Family

ID=81507678

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111665065.4A Active CN114501419B (en) 2021-12-30 2021-12-30 Signaling data processing method, apparatus and storage medium

Country Status (1)

Country Link
CN (1) CN114501419B (en)

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
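Ning Su et al. Unsupervised clustering based real-time shot boundary detection for live broadcasting. 2019 IEEE 5th International Conference on Computer and Communications (ICCC). 2019, full text. *
Research on identification of key nodes and spatiotemporal characteristics of urban traffic networks based on mobile phone signaling; Liu Fuping (刘福平); China Excellent Master's Theses collection (《中国优秀硕士学位论文辑》); full text *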

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant