CN113382402A - Population characteristic analysis method based on universal base station and application thereof - Google Patents

Publication number
CN113382402A
Authority
CN
China
Prior art keywords
base station
user
universal base
population
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110600069.8A
Other languages
Chinese (zh)
Inventor
黄有为
张佩珩
张刘毅
尹一铭
黄茂雷
党梦丽
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhongke Suzhou Intelligent Computing Technology Research Institute
Original Assignee
Zhongke Suzhou Intelligent Computing Technology Research Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhongke Suzhou Intelligent Computing Technology Research Institute
Priority to CN202110600069.8A
Publication of CN113382402A
Legal status: Pending

Classifications

    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04W - WIRELESS COMMUNICATION NETWORKS
    • H04W8/00 - Network data management
    • H04W8/18 - Processing of user or subscriber data, e.g. subscribed services, user preferences or user profiles; Transfer of user or subscriber data
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/23 - Clustering techniques
    • G06F18/232 - Non-hierarchical techniques
    • G06F18/2321 - Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 - Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/24 - Classification techniques
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Software Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Databases & Information Systems (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a population characteristic analysis method based on universal base stations, in which population and places are classified and labeled by extracting the relationship characteristics between users and universal base stations. The method comprises: establishing a data server and a Web server and interconnecting the two servers with all user terminals; recording positioning information through a user terminal, scanning and connecting to surrounding universal base stations, and uploading the scanning results and the positioning information to the server; and, in the Web server, performing characteristic analysis on the stored data with the universal base station as the medium, then classifying and labeling population and places according to differences in their characteristics. The universal base stations include at least Bluetooth, Wi-Fi, and miniature 5G tower signal sources. The scheme makes full use of the small signal sources that are widespread in real scenes to restore indirect contact relationships and combines various clustering and classification algorithms for characteristic analysis, so that population and place labels of higher accuracy are obtained and the method is easier to popularize and apply.

Description

Population characteristic analysis method based on universal base station and application thereof
Technical Field
The invention relates to mobile internet and computer data processing applications, in particular to a population characteristic analysis method based on universal base stations, and belongs to the intersection of communications and computer processing.
Background
With the rapid development of computer digitization technology, people's living standards keep improving and work efficiency is rising quickly. The digital, systematic classification of people and things has contributed significantly to this. It is therefore important to keep optimizing feature classification and to provide more efficient feature classification methods.
At present, widely used classification methods label users according to their characteristics, that is, according to their behavior. Common examples: a search App pushes related content based on a user's daily searches; advertising can be targeted accurately once a user group has been classified by consumption level; and in city management, a GIS can divide a city according to the range and frequency of population activity. However, these methods either ignore the relationship characteristics of the physical world (they analyze only the characteristics of the virtual world) or lack the advantage of short-range data collection. In other words, they miss the real association between people and specific real-life affairs or activities. For example, when a person enters a large commercial complex, current methods capture only that person's overall preference for the complex, or the category and level of users the complex suits, which are macroscopic, conceptual classifications. They cannot resolve which specific merchants within the complex attract that person, so they lack accuracy in classifying population characteristics and places from the perspective of contact relationships, and they give little guidance for consumption recommendations. Meanwhile, social affairs such as censuses, infectious disease prevention and control, and comprehensive urban law enforcement place higher demands on the classification of population and place characteristics.
Disclosure of Invention
In view of the above-mentioned deficiencies of the prior art, the present invention aims to provide a population characteristic analysis method based on a universal base station and an application thereof, which optimizes and perfects the classification and labeling of population and location.
The technical solution by which the invention achieves the above purpose is a population characteristic analysis method based on universal base stations, in which population and places are classified and labeled by extracting the relationship characteristics between users and universal base stations, comprising the following steps: basic construction, in which a data server storing at least user information, place information, universal base station information, and classification labels, and a Web server for processing service logic, are established, user information and the raw data accumulated by scanning universal base stations are uploaded through user terminals for characteristic analysis, and the two servers and all user terminals are interconnected and exchange information; user positioning and universal base station scan uploading, in which the user terminal records the user's positioning information in real time, scans and connects to surrounding universal base stations, and uploads the scanning results and the positioning information of the user and the universal base stations to the data server; and data processing, in which a classification and labeling algorithm is set up in the Web server, characteristic analysis is performed on the data stored in the data server with the universal base station as the medium, and population and places are classified and labeled according to differences in their characteristics. The universal base stations include at least Bluetooth, Wi-Fi, and miniature 5G tower signal sources.
The population characteristic analysis scheme of the invention represents a clear advance. It makes full use of the small signal sources that are widespread in real scenes to restore indirect contact relationships, without requiring any additional base stations. Data are stored and accumulated on the basis of the contact relationships established by scanning miniature universal base stations, and characteristic analysis is carried out with existing mature clustering and classification algorithms, so that population and place labels of higher accuracy are obtained and the method is easier to popularize and apply.
Drawings
FIG. 1 shows the classification and labeling approaches that may be used in the population characteristic analysis method of the present invention.
FIG. 2 is a schematic diagram of the data storage model structure in the population characteristic analysis method of the present invention.
FIG. 3 is a schematic diagram of the states of the k-means clustering process used in the population characteristic analysis method of the present invention.
FIG. 4 is a flowchart of the overall classification in the population characteristic analysis method of the present invention.
Detailed Description
The following detailed description of the embodiments of the present invention is provided in conjunction with the accompanying drawings to make the technical solution of the present invention easier to understand and grasp, so as to define the protection scope of the present invention more clearly.
In the mobile internet era, signal base stations of all kinds cover almost every corner; wherever there are people with electronic devices, signal coverage is generally adequate. Each base station generates network signals, and these ubiquitous universal base stations can be fully exploited for intelligent population characteristic analysis. A "universal base station" (also rendered "pan-base station") is defined as a small signal source whose coverage lies within a radius of about 100 meters; in practice, the corresponding devices are mainly Wi-Fi signals, Bluetooth signals, small 5G towers, and other short-range signal base stations. Population characteristic analysis yields valuable features that can be classified, a process commonly called "tagging". Tags may apply to places: for example, a place that is crowded around noon may be tagged "high aggregation", "rest area", or "dining area". Tags may also apply to people: for example, a person who enters the coverage of the Wi-Fi signal on a city's Line 11 in the morning may be tagged "office worker", "college student", "city resident", and so on. Tags can also be stacked and re-classified. Tagged places and people are collectively called objects; one object may carry several tags, and one tag may belong to several objects.
To address the limitations of existing population characteristic analysis in data accumulation, the invention provides a population characteristic analysis method based on universal base stations and an application thereof. A handheld device that supports universal base stations scans and records the universal base stations around it, and the relationships between people and universal base stations are stored both for analysis and as newly accumulated data.
In outline, the method classifies and labels population and places by extracting the relationship characteristics between users and universal base stations, and comprises the following main steps: basic construction, in which a data server storing at least user information, place information, universal base station information, and classification labels, and a Web server for processing service logic, are established, user information and the raw data accumulated by scanning universal base stations are uploaded through user terminals for characteristic analysis, and the two servers and all user terminals are interconnected and exchange information; user positioning and universal base station scan uploading, in which the user terminal records the user's positioning information in real time, scans and connects to surrounding universal base stations, and uploads the scanning results and the positioning information of the user and the universal base stations to the data server; and data processing, in which a classification and labeling algorithm is set up in the Web server, characteristic analysis is performed on the data stored in the data server with the universal base station as the medium, and population and places are classified and labeled according to differences in their characteristics.
The classification of population and places is thus based on the association between people and digital signals. Unlike the data foundations traditionally used for population classification, the method provides a newly defined data storage structure model that prepares the data for analysis, classification, and labeling. It also reflects the advantage of short-range data acquisition: the data obtained are more specific to the details of a place, so classification and labeling of population or places are more accurate.
In the population characteristic analysis method, the user's mobile phone scans, connects to, and accesses surrounding universal base stations, including those of the communication network; the association information between the user and the universal base stations currently scanned or connected is uploaded over the network to the back-end data server and stored; and a pre-designed classification and labeling algorithm in the back-end Web server processes the data, analyzes its characteristics, and produces user-defined labeled classification results. The algorithm may be an existing mature classification algorithm or an optimized and adjusted version of one; the algorithm itself is not regarded as the technical gist of the present application.
In implementation, the user's mobile phone, as part of the basic construction, can connect to networks in various ways, but a mobile communication network is generally required so that the phone's positioning information and its association information with universal base stations can be uploaded to the back end. Because users are rarely separated from their phones, the phone's state, whether relatively stationary or frequently moving, maps directly onto the user's state. The signal sources that serve as universal base stations, by contrast, have a small accessible range and a relatively fixed position, while the phone's position may change continuously over time as it scans and accesses them. Positioning information can reveal overlapping locations, but it may carry time offsets and easily produces confusing contact records; contact records obtained by direct scanning and access between devices over Bluetooth or Wi-Fi are real-time only, so contact relationships separated by a time difference are lost. The scheme is optimized for exactly this: universal base stations, which have relatively fixed geographic positions and are widely deployed, serve as the medium of indirect contact. The contact relationships between different users and universal base stations are restored, the time-ordered associations between population and places are obtained, and these provide the basic data for the classification algorithm.
To make the implementation of the above scheme and its innovative core clearer, it is described in detail below with reference to more concrete preferred embodiments.
In the basic construction, a data storage structure comprising at least a user table (users), a universal base station table (bases), a place table (landworks), a relation and occurrence time table (relations-time), and a tag table (tags) is created in the data server, as shown in FIG. 2. The data server may be a cloud server with a relational or non-relational database system installed as needed. A Web server linked online with it receives, processes, temporarily stores, and returns data according to the service logic, and must expose a standardized, fixed data interface (API) for communicating with user terminals.
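The five tables named above could be laid out roughly as follows. This is a minimal sketch using SQLite, not the schema of FIG. 2: only the table names come from the text, every column name is an assumption made for illustration, and "relations-time" is written relations_time because a hyphen is not valid in an unquoted SQL identifier.

```python
import sqlite3

# Table names follow the text; all column names are illustrative assumptions.
SCHEMA = """
CREATE TABLE users (
    user_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL
);
CREATE TABLE bases (
    mac       TEXT PRIMARY KEY,   -- MAC address identifies a signal source
    name      TEXT,
    kind      TEXT,               -- e.g. 'wifi', 'bluetooth', '5g'
    latitude  REAL,
    longitude REAL
);
CREATE TABLE landworks (
    place_id INTEGER PRIMARY KEY,
    name     TEXT NOT NULL
);
CREATE TABLE relations_time (
    user_id INTEGER NOT NULL REFERENCES users(user_id),
    mac     TEXT    NOT NULL REFERENCES bases(mac),
    rssi    INTEGER,
    seen_at TEXT    NOT NULL      -- ISO 8601 timestamp of the contact
);
CREATE TABLE tags (
    object_type TEXT NOT NULL CHECK (object_type IN ('user', 'base', 'place')),
    object_id   TEXT NOT NULL,
    label       TEXT NOT NULL,
    PRIMARY KEY (object_type, object_id, label)
);
"""

def create_store(path=":memory:"):
    """Open a database and create the five tables."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn
```

The composite key on the tag table allows one object to carry several tags and one tag to belong to several objects, matching the many-to-many relationship described earlier.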
The basic construction also includes developing and installing on the user terminal an App, or a third-party plug-in that can be embedded in other Apps (hereinafter "the program"), which registers user information, scans universal base stations, and uploads the user's position. Through the user terminal, the program handles registration and login to complete device binding and obtain a unique user ID, uploads user information and updates it in the corresponding user table, and keeps the positioning and base station scanning functions running in the background. The user terminal is not limited to a smartphone; it may also be a smart watch or a wristband.
The user information comprises basic information, including at least user ID, name, gender, year and month of birth, profile photo, and ID card number, and optional extended information, including educational background, specialty, living habits, and hobbies. The extended information is flexible: it is stored in the user table created on the data server, and the associated entries can be added, updated, or deleted so that the user information in the database stays up to date. Once the program is permitted to scan the signals of surrounding universal base stations, the data obtained include at least the universal base station name, the signal strength (RSSI), the UUID, and the MAC address. On both Android and iOS, mini programs and, in particular, the current mainstream mobile SDKs already provide these scanning functions, which can be called directly when developing the program.
In user positioning and universal base station scan uploading, the recorded positioning information comprises GPS information obtained through a positioning SDK, including longitude, latitude, and detailed address, or, when the user denies access to GPS, geographic information obtained by coarse IP-based positioning. After the positioning data are acquired, the user terminal performs quick, simple structuring and then combines them with the universal base station scan. Note that the universal base station information and the user information are uploaded to the data server together with the positioning information, so that the fixed position of each universal base station can reasonably be inferred from the user's position.
Similarly, universal base station information is obtained through the program: scanning is triggered when the user enters the signal coverage of any universal base station. Whenever the user terminal scans or accesses a universal base station, it records that station's name, signal strength (RSSI), UUID, and MAC address, and it assembles all signal source information scanned or accessed at the same moment into structured data ready for uploading and storage.
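The structured data assembled for one scan might look like the sketch below. The text specifies only which fields are recorded (name, RSSI, UUID, MAC address, plus the user's position); the class names, field names, and JSON layout are assumptions made for illustration.

```python
import json
from dataclasses import dataclass, asdict, field

@dataclass
class Signal:
    """One scanned signal source, with the four fields named in the text."""
    name: str
    rssi: int
    uuid: str
    mac: str

@dataclass
class ScanUpload:
    """All signal sources seen at one moment, plus the user's position (hypothetical layout)."""
    user_id: int
    scanned_at: str          # ISO 8601 timestamp
    latitude: float
    longitude: float
    signals: list = field(default_factory=list)

    def to_json(self):
        # asdict() converts nested dataclasses, so the result is plain JSON.
        return json.dumps(asdict(self), sort_keys=True)
```

A terminal would build one ScanUpload per scan and send the JSON string to the Web server's API.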
Short-range signal sources of many kinds, including routers, hotspots, and fixed Bluetooth devices, are ubiquitous in public places. In coffee shops, train stations, and airports, for example, they usually take the form of Wi-Fi, but Bluetooth and other signal sources are also present. The places where these signal sources are located are places where people gather and where contact relationships often occur. A mobile phone can scan these signal sources to record the contact relationship between the user and universal base stations, from which contact relationships between users can be determined indirectly. This overcomes the loss of time-separated contact relationships that occurs when a social network is built through conventional Bluetooth scanning.
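The idea of restoring user-to-user contact through a shared universal base station can be sketched as follows. The function name, the record layout (user, MAC address, Unix time), and the time-window parameter are all assumptions; the text does not prescribe how the indirect relation is computed.

```python
from collections import defaultdict
from itertools import combinations

def indirect_contacts(records, max_gap_seconds):
    """Return pairs of users who contacted the same base station within a time gap.

    records: iterable of (user_id, mac, unix_time) contact rows.
    """
    by_mac = defaultdict(list)
    for user, mac, ts in records:
        by_mac[mac].append((ts, user))

    pairs = set()
    for visits in by_mac.values():
        for (t1, u1), (t2, u2) in combinations(visits, 2):
            # The two users never have to be present at the same instant.
            if u1 != u2 and abs(t1 - t2) <= max_gap_seconds:
                pairs.add((min(u1, u2), max(u1, u2)))
    return pairs
```

Because the base station is the medium, two users who visit the same place half an hour apart are still linked, which direct device-to-device Bluetooth scanning would miss.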
The data server confirms the uniqueness of each scanned universal base station: it distinguishes universal base stations by the MAC address of the signal source and stores them in the universal base station information table. The user's GPS position at the time of scanning is stored as the position of the base station. The one-to-many contact relationship between a user and the surrounding universal base stations is also written to the database, stored row by row as one-to-one records.
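The storage rule described above can be sketched with in-memory structures standing in for the database. The function and field names are hypothetical, and keeping the position from the first sighting is an assumption; the text only says the user's GPS position at scan time is taken as the base station's position.

```python
def store_scan(bases, relations, user_id, scanned_at, position, signals):
    """Record one scan: deduplicate stations by MAC, store one row per contact.

    bases:     dict mapping MAC address -> station info (stands in for the bases table)
    relations: list of (user_id, mac, scanned_at) rows (stands in for relations-time)
    signals:   list of dicts with at least 'name' and 'mac'
    """
    for sig in signals:
        mac = sig["mac"].lower()  # normalise case so one station is stored once
        # Assumption: the first user position seen is kept as the station position.
        bases.setdefault(mac, {"name": sig["name"], "position": position})
        # One-to-many contact is flattened into one row per (user, station).
        relations.append((user_id, mac, scanned_at))
```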
Data processing is the other important aspect of the population characteristic analysis method, in particular its combination with various existing clustering and classification algorithms. The refined steps, shown in FIG. 4, are as follows. First, the classification methods that can be used are determined. According to FIG. 1, labeling methods divide into manual labeling and automatic labeling. Automatic labeling means classification and labeling performed by a computer and covers both traditional pattern matching and more advanced machine learning classification. The classification described below mainly uses unsupervised and supervised learning methods from the field of machine learning.
Labeling is performed manually after k-means cluster analysis. A clustering algorithm automatically divides a collection of unlabeled structured data into several classes and is an unsupervised learning method; it ensures that data in the same class have similar characteristics. The value k is the number of clusters to be obtained: if k = n, then n class centroids are selected at random, the distances from every sample point to the n centroids are computed, and each sample is assigned to the class of the centroid nearest to it. The value of k must be fixed in advance and can be determined by estimation or by repeated trials. In this scheme, the initial value of k can be set from experience (the number of manually defined labels) or from the distribution of the data with a corresponding rule (for example, incrementing k by 1 when the population density around a universal base station exceeds a threshold).
The k-means cluster analysis process is as follows:
(1) Select the k preset labels as the initial cluster centers t_1, t_2, ..., t_k.
(2) For each sample point x_i in the data set to be classified, calculate the set of distances from x_i to the k centers, and assign the sample point to the center t_j with the smallest distance.
(3) Recalculate the position of the cluster center for each class t_j.
(4) Repeat steps (2) and (3), ending the loop when the cluster centers no longer change position or the preset maximum number of iterations is exceeded. This k-means clustering process is shown in FIG. 3.
(5) Manually correct the labels on the classification obtained from k-means clustering. New data x_{i+1} generated later can be classified by the existing k-means model and labeled automatically.
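Steps (1) to (4) above, and the labeling of new data in step (5), can be sketched in plain Python. This is a generic k-means with Euclidean distance, offered as an illustration rather than the patent's implementation; the function names and the choice to pass the initial centers in explicitly (so that preset labels can seed them) are assumptions.

```python
import math

def kmeans(points, centers, max_iter=100):
    """Cluster points (tuples of numbers) starting from the given initial centers."""
    centers = [tuple(c) for c in centers]
    assign = []
    for _ in range(max_iter):
        # Step (2): assign every sample to its nearest center.
        assign = [min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
                  for p in points]
        # Step (3): move each center to the mean of its members.
        new_centers = []
        for j, old in enumerate(centers):
            members = [p for p, a in zip(points, assign) if a == j]
            if members:
                new_centers.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(old)  # keep an empty cluster's center in place
        # Step (4): stop once no center moves.
        if new_centers == centers:
            break
        centers = new_centers
    return centers, assign

def predict(point, centers):
    """Step (5): label new data with the index of the nearest existing center."""
    return min(range(len(centers)), key=lambda j: math.dist(point, centers[j]))
```

For example, kmeans([(0, 0), (0, 1), (10, 10), (10, 11)], [(0, 0), (10, 10)]) separates the two pairs of points, and predict((9, 9), centers) then places a new point in the second cluster.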
Some of the sample data and the classification results obtained above are input into a neural network for model training. Any implementation framework may be used for the neural network. This turns the classification problem into supervised learning and improves the computer's ability to classify and label automatically.
The trained neural network model is saved so that the computer can classify new data automatically.
Classification performed by any of the above steps requires manual review at regular intervals to correct wrong labels. The label error rate is calculated by sampling inspection, and the k value in k-means and the neural network model are optimized continuously.
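The sampling inspection mentioned above could be computed as in this sketch. The function name, the fixed random seed, and the idea of passing the manual reviewer in as a callable are assumptions made for illustration.

```python
import random

def sampled_error_rate(predicted, review, sample_size, seed=0):
    """Estimate the label error rate from a random sample of labeled objects.

    predicted: dict mapping object id -> automatically assigned label
    review:    callable returning the manually verified label for an object id
    """
    ids = sorted(predicted)
    sample = random.Random(seed).sample(ids, min(sample_size, len(ids)))
    if not sample:
        return 0.0
    wrong = sum(1 for i in sample if predicted[i] != review(i))
    return wrong / len(sample)
```

Only the sampled objects are passed to the reviewer, so the manual workload is bounded by sample_size.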
6) Classify and label the population according to different characteristics. First, a label data dictionary is set manually for the user data, initializing the target labels for classification. For example, users can be tagged by gender; by age group (children, young, middle-aged, elderly); or by occupation (students, office workers, migrant workers, etc.). User information is read at random from the data server, and the relationships between users and base stations are taken as input. Unsupervised learning is first performed on the large volume of user data and relationship data using the k-means cluster analysis of step 5-2). The k value is set to the number of labels in the data dictionary used for this classification, and the classification result produced by k-means is manually reviewed, corrected, and labeled.
For example, suppose the data dictionary is "office worker", "free occupation", and "retirement". Users who repeatedly establish a person-to-base-station relationship in a certain Wi-Fi signal area during working hours on working days can be marked as "office worker" in that Wi-Fi area. If the working hours are not fixed and the base stations contacted are not fixed, the user can be marked as "free occupation". If the same base station is scanned over long periods on both working days and non-working days, activity is low, and the user's age in the database is over 60, the user can be marked as "retirement".
Neural network analysis is then carried out on the basis of k-means. The classification and manual labeling of the previous k-means step yield an input set and an output set, from which a neural network model is constructed. For example, a convolutional neural network may be used, whose convolutional layers extract features from the data effectively.
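The review heuristics in the "office worker" / "free occupation" / "retirement" example could be written down as explicit rules, for instance to pre-screen k-means clusters before manual review. The sketch below is one possible reading: the working hours (09:00 to 18:00, Monday to Friday), the 0.8 share threshold, and the function name are all assumptions, not values from the text.

```python
from collections import Counter
from datetime import datetime

def label_user(scans, age, share=0.8):
    """Suggest a label from a user's scans, given as (datetime, mac) pairs."""
    if not scans:
        return None
    top_mac, top_n = Counter(mac for _, mac in scans).most_common(1)[0]
    on_weekday = any(t.weekday() < 5 for t, mac in scans if mac == top_mac)
    on_weekend = any(t.weekday() >= 5 for t, mac in scans if mac == top_mac)
    # Same station on working and non-working days, low mobility, age over 60.
    if age > 60 and top_n / len(scans) >= share and on_weekday and on_weekend:
        return "retirement"
    # Scans made during assumed working hours on working days.
    work = [mac for t, mac in scans if t.weekday() < 5 and 9 <= t.hour < 18]
    if work and Counter(work).most_common(1)[0][1] / len(work) >= share:
        return "office worker"
    # Neither fixed hours nor a fixed station.
    return "free occupation"
```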
The data prepared in step 6-2) are input into the convolutional neural network model for training, which yields a model that can classify and label people. Data that have not yet been processed or classified are later fed into the trained model, and the convolutional neural network classifies them according to their characteristics and assigns labels to the people concerned. This step requires periodic manual review of the classification results, correction of misclassifications, and optimization of the neural network model.
Users can also be labeled automatically by pattern matching, that is, classified according to hints in device names. For example, suppose most users in a certain Wi-Fi signal area stay there for long periods over many months, and the data show that the name of the Wi-Fi signal is "XJTLU", the English abbreviation of Xi'an Jiaotong-Liverpool University. It can then be determined that most of these users are teachers, students, or staff of Xi'an Jiaotong-Liverpool University.
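The device-name pattern matching in the "XJTLU" example can be sketched as a keyword lookup over the Wi-Fi names a user spends most of their time near. The function name, the days-per-SSID input, and the 0.5 share threshold are assumptions for illustration.

```python
def label_from_ssids(days_by_ssid, dictionary, min_share=0.5):
    """Return a label if a dominant Wi-Fi name matches a dictionary keyword.

    days_by_ssid: dict mapping Wi-Fi name -> number of days the user stayed there
    dictionary:   dict mapping keyword -> label (a hypothetical data dictionary)
    """
    total = sum(days_by_ssid.values())
    if total == 0:
        return None
    # Check Wi-Fi names from most to least time spent.
    for ssid, days in sorted(days_by_ssid.items(), key=lambda kv: -kv[1]):
        if days / total < min_share:
            break  # remaining names account for too little of the user's time
        for keyword, label in dictionary.items():
            if keyword.lower() in ssid.lower():
                return label
    return None
```

With the dictionary {"XJTLU": "Xi'an Jiaotong-Liverpool University member"}, a user who spent 90 of 95 recorded days near the "XJTLU" signal would receive that label, while a user who passed through once would not.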
Regions can be classified and analyzed in several ways. First, a data dictionary of region labels is set manually, and regions are classified according to it; for example, by location the dictionary may be: downtown, urban, suburban, rural, etc. Unsupervised learning is performed by k-means cluster analysis on the information of the base stations and of the users related to them. The k value is set from the initial data dictionary, the k-means cluster analysis model is set up and trained, the base station data set is input, the classification of the base stations is obtained from the base-station-to-person relationship data, and the k-means classification result is then reviewed manually. For example, if a base station has more user contacts on weekends and fewer on working days, its location may be a "business district" or a "shopping mall".
The neural network model is then trained and used for classification on the basis of k-means. After classification and labeling by k-means cluster analysis, a neural network model is constructed, for example a convolutional neural network with at least one convolutional layer for effective feature extraction from the data. The input and output result sets of step 7-2) are used to train the convolutional neural network model. Data that have not yet been processed or classified are then fed into the model, and their labels or categories are derived from the trained model. As before, the results of the neural network must be checked manually and the model optimized periodically.
Secondly, places can be classified by the visiting patterns of the users associated with a base station, using the labels of already classified users. For example, if a location is often visited between 11:00 a.m. and 1:00 p.m. by users labeled "office worker" who stay only briefly, the location is likely to be a "dining area" or a "rest area". Pattern-matching classification can also be performed with a search engine or a data dictionary query to match label terms that may relate to the universal base station. For example, the WIFI name "CAS-IICT" may match the landmark of the Chinese Academy of Sciences computing institute in the dictionary. A search engine can also retrieve landmark names on a map from the longitude and latitude of the universal base station or other geographic information and then propose labels. This step presupposes that a data dictionary has been built from existing data, or that a crawler or an Internet search engine is used for retrieval. Finally, manual review and labeling can be done directly: an administrator modifies labels through a management platform or the database, and the results of all the above steps can be manually reviewed and corrected.
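The lunchtime example can be written as a simple rule. The sketch below is only an illustration: the function name `looks_like_dining_area`, the visit record layout, the 45-minute limit for a "short" stay, and the 60% share threshold are assumptions that do not appear in the patent.

```python
from datetime import datetime, timedelta


def looks_like_dining_area(visits, user_labels, min_share=0.6,
                           max_stay=timedelta(minutes=45)):
    """Return True if most visits are short lunchtime visits by office workers.

    visits:      list of (user_id, arrival, departure) datetimes for one place.
    user_labels: dict mapping user_id to the set of labels already assigned.
    """
    if not visits:
        return False
    matching = 0
    for user_id, arrival, departure in visits:
        is_office_worker = "office worker" in user_labels.get(user_id, set())
        at_lunchtime = 11 <= arrival.hour < 13  # arrival between 11:00 and 13:00
        short_stay = departure - arrival <= max_stay
        if is_office_worker and at_lunchtime and short_stay:
            matching += 1
    return matching / len(visits) >= min_share


user_labels = {"u1": {"office worker"}, "u2": {"office worker"}, "u3": {"student"}}
visits = [
    ("u1", datetime(2021, 5, 31, 11, 40), datetime(2021, 5, 31, 12, 10)),
    ("u2", datetime(2021, 5, 31, 12, 5), datetime(2021, 5, 31, 12, 35)),
    ("u3", datetime(2021, 5, 31, 15, 0), datetime(2021, 5, 31, 17, 0)),
]
print(looks_like_dining_area(visits, user_labels))
```

Two of the three visits match the rule, so the place is flagged as a candidate "dining area" or "rest area"; the label would still be subject to the manual review mentioned above.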
Therefore, this patent designs a feature classification method based on universal base stations that targets the association between population and digital signals. It covers two categories: people and places. After a user carrying an electronic device enters a WIFI/Bluetooth/5G area within a certain range, a relationship is established between the user and the universal base stations within coverage, and the relationship is stored on a cloud server. The stored relationship information is then subjected to feature analysis by various clustering and classification algorithms, and finally places and population are labeled. The more accurate classification results can be applied to recommendation systems, advertisement delivery, census, and city management.
In summary, the population characteristic analysis method based on universal base stations of the present invention, as detailed with the illustrated embodiments, has outstanding substantive features and represents significant progress. The scheme makes full use of the many small signal sources that already exist widely in real scenes to reconstruct indirect contact relationships, without requiring additional base stations. It extends the approach of establishing contact relationships through Bluetooth and Wifi scanning and optimizes the temporal ordering used when the relationship network is built, which effectively compensates for the loss of contact relationships caused by time differences in existing relationship graphs obtained directly from Bluetooth scanning, and improves the completeness and reliability of tracking.
In addition to the above embodiments, the present invention may have other embodiments, and any technical solutions formed by equivalent substitutions or equivalent transformations are within the scope of the present invention as claimed.

Claims (10)

1. A population characteristic analysis method based on universal base stations, characterized in that population and places are classified and labeled by extracting the relationship features between users and universal base stations, the method comprising the following steps:
basic construction: establishing a data server that stores at least user information, place information, universal base station information, and classification labels, and a Web server that processes business logic; user information is uploaded through user terminals, raw data is accumulated by scanning universal base stations for feature analysis, and the two servers and all user terminals are interconnected and exchange information;
user positioning and universal base station scanning and uploading: recording the positioning information of the user in real time through the user terminal, scanning and connecting to surrounding universal base stations through the user terminal, and uploading the resulting scan results and the positioning information of the user and the universal base stations to the data server;
data processing: setting algorithms for classification and labeling in the Web server, performing feature analysis on the data stored in the data server with the universal base stations as the medium, and classifying and labeling population and places according to feature differences, wherein the universal base stations include at least Bluetooth, Wifi, and miniature 5G tower signal sources.
2. The population characteristic analysis method based on universal base stations as claimed in claim 1, characterized in that: in the basic construction, a data storage structure comprising at least a user table, a universal base station table, a place table, a table of relationships and their occurrence times, and a label table is established in the data server; an App, or a third-party plug-in that can be embedded in other Apps, is developed and installed on the user terminal for registering user information, scanning universal base stations, and uploading user positioning; device binding is completed and a unique user ID is obtained through registration and login on the user terminal, user information is uploaded and updated in the corresponding user table, and the positioning function and the universal base station scanning function are kept running in the background, wherein the user terminal is at least a smart phone, a smart watch, or a bracelet.
3. The population characteristic analysis method based on universal base stations as claimed in claim 2, characterized in that: the user information comprises basic information including at least user ID, name, gender, year and month of birth, avatar, and ID card number, and optional extended information including educational background, specialties, living habits, and hobbies; the App or third-party plug-in provides functions for adding, updating, and deleting the corresponding user information items.
4. The population characteristic analysis method based on universal base stations as claimed in claim 1, characterized in that: in the user positioning and universal base station scanning and uploading, the recorded positioning information comprises GPS information acquired through a positioning SDK and containing longitude, latitude, and detailed address, or geographic information acquired through approximate IP-based positioning.
5. The population characteristic analysis method based on universal base stations as claimed in claim 1, characterized in that: in the user positioning and universal base station scanning and uploading, after a user terminal enters the signal coverage area of one or more universal base stations, the name, signal strength (RSSI), UUID, and MAC address of each corresponding universal base station are recorded by scanning or connecting to any of the universal base stations; the user terminal summarizes all signal source information scanned or connected to at the same time into structured data, and uploads the structured data together with the user information, the positioning information, and the current time to the data server as an associated data packet.
6. The population characteristic analysis method based on universal base stations as claimed in claim 5, characterized in that: the data server receives the associated data packets in real time and updates the positioning information as the position of each user changes; the Web server identifies each universal base station uniquely by the one or more MAC addresses in the associated data packets and estimates the actual geographic position of each universal base station in combination with the positioning information; and the one-to-many contact relationship between a user and the surrounding universal base stations is written into a database as one-to-one rows.
7. The population characteristic analysis method based on universal base stations as claimed in claim 1, characterized in that: in the data processing, unsupervised learning is first performed using a k-means clustering method, and the result is manually corrected to obtain an initial classification model; the initial classification model is input into a neural network, and supervised model training is performed in combination with sample data to obtain an automatic computer classification model; manual review and correction of wrong labels are carried out at regular intervals, and the k value of the k-means clustering method and the model architecture of the neural network are adjusted.
8. The population characteristic analysis method based on universal base stations as claimed in claim 7, characterized in that: population characteristics, place characteristics based on the universal base stations, and the relationship characteristics between population and places are analyzed using the k-means clustering method as the basis for classification and labeling.
9. The population characteristic analysis method based on universal base stations as claimed in claim 7, characterized in that: before the k-means clustering method is used, the data processing further comprises manually setting a data dictionary for the associated data, and classification is assisted by a search engine during classification and labeling.
10. An application of the population characteristic analysis method based on universal base stations, characterized in that: the classification results of population and places are used for recommendation systems, advertisement delivery, census, and city management.
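The associated data packet of claim 5 and the row-level storage and position estimate of claim 6 can be illustrated with a short sketch. Everything concrete here is an assumption made for illustration: the packet field names, the functions `packet_to_rows` and `estimate_station_positions`, and in particular the RSSI-weighted centroid, which is one plausible way to estimate a station's position and is not specified by the patent.

```python
from collections import defaultdict


def packet_to_rows(packet):
    """Expand one user-to-many-stations packet into one row per (user, station)."""
    return [
        {
            "user_id": packet["user_id"],
            "mac": station["mac"].lower(),  # the MAC address identifies the station
            "name": station["name"],
            "uuid": station.get("uuid"),
            "rssi": station["rssi"],
            "lat": packet["lat"],
            "lon": packet["lon"],
            "time": packet["time"],
        }
        for station in packet["stations"]
    ]


def estimate_station_positions(rows):
    """Estimate each station's position as the RSSI-weighted centroid of the
    user positions at which it was seen (an assumed estimator)."""
    sums = defaultdict(lambda: [0.0, 0.0, 0.0])  # mac -> [sum w*lat, sum w*lon, sum w]
    for row in rows:
        weight = 10 ** (row["rssi"] / 10)  # dBm to linear power: stronger counts more
        acc = sums[row["mac"]]
        acc[0] += weight * row["lat"]
        acc[1] += weight * row["lon"]
        acc[2] += weight
    return {mac: (lat / w, lon / w) for mac, (lat, lon, w) in sums.items()}


# Two hypothetical packets; both users see the same station at equal strength.
packets = [
    {"user_id": "u1", "lat": 31.0, "lon": 120.0, "time": "2021-05-31T12:00:00",
     "stations": [
         {"name": "CAS-IICT", "mac": "AA:BB:CC:00:00:01", "rssi": -60},
         {"name": "cafe", "mac": "AA:BB:CC:00:00:02", "rssi": -75},
     ]},
    {"user_id": "u2", "lat": 31.2, "lon": 120.2, "time": "2021-05-31T12:00:05",
     "stations": [
         {"name": "CAS-IICT", "mac": "aa:bb:cc:00:00:01", "rssi": -60},
     ]},
]
rows = [row for packet in packets for row in packet_to_rows(packet)]
positions = estimate_station_positions(rows)
print(len(rows), "rows;", positions)
```

Because both sightings of the first station have the same signal strength, its estimated position falls midway between the two users. Normalizing the MAC address to lower case is what lets the two packets be recognized as referring to the same station.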
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110600069.8A CN113382402A (en) 2021-05-31 2021-05-31 Population characteristic analysis method based on universal base station and application thereof

Publications (1)

Publication Number Publication Date
CN113382402A true CN113382402A (en) 2021-09-10

Family

ID=77574882

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110600069.8A Pending CN113382402A (en) 2021-05-31 2021-05-31 Population characteristic analysis method based on universal base station and application thereof

Country Status (1)

Country Link
CN (1) CN113382402A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109495856A (en) * 2018-12-18 2019-03-19 成都方未科技有限公司 A kind of mobile phone user's type mark method based on big data
KR101976189B1 (en) * 2018-06-07 2019-05-08 넥스엔정보기술(주) Method of providing analysis service of floating population
CN110782284A (en) * 2019-10-24 2020-02-11 腾讯科技(深圳)有限公司 Information pushing method and device and readable storage medium
CN111524609A (en) * 2020-04-22 2020-08-11 第四范式(北京)技术有限公司 Method and system for generating screening model and screening infectious disease high-risk infected people

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210910
