CN112132727B - Government service pushing method of situation big data based on city big data

Info

Publication number: CN112132727B
Application number: CN202011006238.7A
Authority: CN
Inventors: 陈恩红; 陈钢
Original assignee: Yangtze River Delta Information Intelligence Innovation Research Institute
Current assignee: Yangtze River Delta Information Intelligence Innovation Research Institute
Priority date: 2020-09-23
Filing date: 2020-09-23
Publication date: 2023-08-18
Anticipated expiration: 2040-09-23
Also published as: CN112132727A

Abstract

The invention discloses a government service pushing method of situation big data based on city big data, comprising the following steps: step a, constructing a natural person situation data model that comprehensively describes the social activity patterns, individual business characteristics, social interaction states and the like of a single natural person; step b, pushing government services based on situation clustering; wherein step a includes: step a1, constructing the basic attribute situation of a natural person; step a2, constructing the life event situation of a natural person; step a3, constructing the social relationship situation of a natural person; and step b includes: step b1, calculating situation vectors; step b2, clustering the situations; step b3, pushing government services. The method not only innovates the application mode of government population data, but also helps promote the transformation of government toward a service-oriented form, so that government departments can provide accurate, personalized, refined and mobile services to the public.

Description

Government service pushing method of situation big data based on city big data

Technical Field

The invention relates to a government affair service pushing method based on situation big data of city big data.

Background

As work such as government information disclosure and 'Internet+' government services advances, the contradiction between the rapid growth of information on government websites at every level and the personalized demands of users is increasingly prominent. Traditional e-commerce recommendation systems consider only the relation between users and items and ignore the user's context information. In the field of government services, however, context information greatly affects the accuracy of a recommendation algorithm.

Therefore, a new method is urgently needed to solve the above technical problems.

Disclosure of Invention

The invention aims to provide a government service pushing method of situation big data based on city big data, which not only innovates the application mode of government population data, but also helps promote the transformation of government toward a service-oriented form, so that government departments can provide accurate, personalized, refined and mobile services to the public.

In order to achieve the above purpose, the present invention provides a government service pushing method of situation big data based on city big data, comprising:

step a, constructing a natural person situation data model that comprehensively describes the social activity patterns, individual business characteristics, social interaction states and the like of a single natural person;

step b, pushing government services based on situation clustering; wherein,

step a comprises:

step a1, constructing a basic attribute situation of a natural person;

step a2, constructing the life event situation of a natural person;

step a3, constructing a social relationship situation of natural people;

step b comprises:

step b1, calculating a situation vector;

step b2, clustering the situation;

step b3, pushing government services.

Preferably, the basic attribute context in step a1 includes valid identity documents reflecting the birth and social affiliation of a natural person, an index reflecting the education level of the natural person, an index reflecting the employment situation of the natural person, and an index reflecting the knowledge and skill level of the natural person in a given professional activity.

Preferably, step a1 comprises:

firstly, extracting the relevant data catalogs of government departments such as the public security bureau, the human resources and social security bureau and the civil affairs bureau, and, after multi-table association, forming natural person basic information, passport information, driving license information, Taiwan travel permit information, Hong Kong and Macao travel permit information, social security card information, education level information, work unit information, practitioner qualification information, practice qualification information, professional technical post qualification information, vocational skill information and religious personnel information;

secondly, for data items of a natural person that are available from multiple sources, adopting the value with the highest weighted score of accuracy and update time; then connecting to a local MySQL database through the pymysql library and storing the normalized data in the local database;

finally, predicting the basic attributes that cannot be obtained through the preceding sub-steps by an attribute reasoning method: after acquiring the known attribute data that affect the unknown attribute, constructing an attribute graph according to a certain algorithm, and then reasoning based on prior probability; when constructing the attribute graph, calculating the influence values among attribute values and determining the order of influence among the attributes; wherein,

the steps required for unknown attribute reasoning include: calculating an influence value between attribute values of the attributes, calculating an influence value between the attributes, and calculating a condition value between the attributes.
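The multi-source selection rule in the second sub-step can be illustrated with a minimal sketch. The source does not give the weights or the way update time is scored, so the field names (`value`, `accuracy`, `updated`), the weights `w_accuracy` and `w_recency`, the recency formula and the reference date are all assumptions made for illustration only.

```python
from datetime import date

def pick_best_record(candidates, w_accuracy=0.6, w_recency=0.4,
                     today=date(2020, 9, 23)):
    """Return the candidate with the highest weighted score.

    Each candidate is a dict with hypothetical keys:
    'value', 'accuracy' (0..1) and 'updated' (datetime.date).
    """
    def recency(updated):
        # Assumed mapping of age onto (0, 1]: newer records score closer to 1.
        age_days = max((today - updated).days, 0)
        return 1.0 / (1.0 + age_days / 365.0)

    def score(candidate):
        return (w_accuracy * candidate["accuracy"]
                + w_recency * recency(candidate["updated"]))

    return max(candidates, key=score)

# Two departments report a different work unit for the same person.
candidates = [
    {"source": "public_security", "value": "Unit A",
     "accuracy": 0.9, "updated": date(2018, 1, 1)},
    {"source": "social_security", "value": "Unit B",
     "accuracy": 0.8, "updated": date(2020, 6, 1)},
]
best = pick_best_record(candidates)
```

With these illustrative weights the fresher record is preferred even though its accuracy score is slightly lower. Writing the chosen value to MySQL through pymysql is omitted here because it needs a live database.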

Preferably, the life event context in step a2 comprises the various business events or activities that a natural person participates in during the life cycle.

Preferably, step a2 comprises:

firstly, constructing a life event data model dictionary (key: event type, value: [business event 1, business event 2, ..., business event n]), wherein each business event is composed of a plurality of data items (D1, ..., DN);

secondly, extracting and normalizing the data related to the life events of natural persons held by the various government departments, connecting to a local MySQL database through the pymysql library, and storing the normalized data in the local database;

then, calculating text similarity for the data stored in the preceding sub-step by using word vectors: generating word vectors by training GloVe, word2vec and BERT models, calculating the similarity of the word vectors of business items, setting a specified threshold, and fusing similar business items from multiple departments to form the corresponding life item data set T;

and finally, organizing the data in the data set T by using a life event data model dictionary so as to realize the fusion and standardization of the natural life event multi-source data.
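The fusion sub-steps above can be sketched as follows. This is a minimal illustration under stated assumptions: the three-dimensional vectors are hand-written stand-ins for embeddings that would come from a trained GloVe, word2vec or BERT model, cosine similarity is assumed as the similarity measure, and the greedy grouping rule, the threshold of 0.9 and the function names are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def fuse_similar_items(item_vectors, threshold=0.9):
    """Greedy grouping: an item joins the first group whose first member
    is at least `threshold` similar to it, otherwise it starts a new group."""
    groups = []
    for name, vec in item_vectors.items():
        for group in groups:
            if cosine(vec, item_vectors[group[0]]) >= threshold:
                group.append(name)
                break
        else:
            groups.append([name])
    return groups

# Toy vectors standing in for trained word embeddings of business item names.
item_vectors = {
    "marriage registration": [0.90, 0.10, 0.00],
    "registration of marriage": [0.88, 0.12, 0.01],
    "tax payment": [0.00, 0.20, 0.95],
}
T = fuse_similar_items(item_vectors)

# Organize the fused data set T with a life event dictionary
# (key: event type, value: list of business events).
life_event_dict = {"marriage": T[0], "tax": T[1]}
```

The two differently worded marriage items from different departments end up in one group, while the tax item stays separate.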

Preferably, the social relationship context in step a3 includes person-person relationships, reflecting the various relationships established between natural persons during social activities; person-place relationships, reflecting the relationships a natural person establishes with the places he or she belongs to for reasons of living, working, learning and the like; and person-object relationships, reflecting the relationships established with the tangible or intangible assets a natural person possesses over the full life cycle.

Preferably, step a3 comprises:

firstly, defining a plurality of entity objects of natural people, places, motor vehicles, houses, lands and intellectual property rights; each natural person node can establish a plurality of relations with a plurality of nodes (natural persons, places, articles and the like); adding attributes for each entity object, wherein the natural person attributes are basic attributes constructed in the step a1, the place attributes comprise place names, place types and roles of persons in places, and the object attributes comprise object names, object types and possession times; wherein a single node may contain multiple attribute descriptions that characterize its physical characteristics;

secondly, defining a relation and an event, and adding the relation or the event between entity objects; wherein, the comprehensive relationship of relatives, neighbors or colleagues is added between individuals, the belonging relationship is added between individuals and organizations, and the possession relationship is added between individuals and certificates; each relationship comprises a start node and a stop node; the attributes of the relationship include relationship type, relationship establishment time and relationship release time;

then, after the definition of the entity, the attribute, the relation and the event is finished, extracting the existing various data through a data extraction tool, and finally constructing and forming a set of complete natural human social relation knowledge graph through entity alignment and attribute filling of the extracted knowledge;

and finally, establishing indexes on the commonly used fields of identity card number, property right number, land use right number and motor vehicle plate number, to optimize the query performance of the social relationship graph database.
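The entity, relation and index definitions described in step a3 can be sketched with a small in-memory structure. This is only an illustration: the source does not name a graph database or schema, so the class names `Node`, `Relation` and `SocialGraph`, their fields and the sample keys are hypothetical, and a Python dictionary keyed on the identifying number stands in for a database index.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Node:
    key: str    # indexed field, e.g. identity card number or plate number
    kind: str   # "person", "place", "vehicle", "house", ...
    attrs: dict = field(default_factory=dict)

@dataclass
class Relation:
    start: str                      # key of the start node
    end: str                        # key of the end node
    rel_type: str                   # e.g. "colleague", "belongs_to", "owns"
    established: str                # relationship establishment time
    released: Optional[str] = None  # relationship release time, if any

class SocialGraph:
    def __init__(self):
        self.nodes = {}      # key -> Node; plays the role of the index
        self.out_edges = {}  # start key -> list of Relation

    def add_node(self, node):
        self.nodes[node.key] = node
        self.out_edges.setdefault(node.key, [])

    def add_relation(self, rel):
        if rel.start not in self.nodes or rel.end not in self.nodes:
            raise KeyError("both endpoints must be defined before linking")
        self.out_edges[rel.start].append(rel)

    def neighbors(self, key, rel_type=None):
        """Keys of nodes reachable from `key`, optionally filtered by type."""
        return [r.end for r in self.out_edges.get(key, [])
                if rel_type is None or r.rel_type == rel_type]

g = SocialGraph()
g.add_node(Node("ID-001", "person", {"education": "bachelor"}))
g.add_node(Node("ID-002", "person"))
g.add_node(Node("PLATE-A123", "vehicle", {"object_type": "motor vehicle"}))
g.add_relation(Relation("ID-001", "ID-002", "colleague", "2019-03-01"))
g.add_relation(Relation("ID-001", "PLATE-A123", "owns", "2020-01-15"))
```

Looking a person up by identity card number is then a single dictionary access, which mirrors the purpose of indexing the common fields.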

Preferably, in step b1, the government service object (i.e. the natural person) has a basic attribute context ($R_1$), a life event context ($R_2$) and a social relationship context ($R_3$). The context vector $R_{mi} = \{R^1_{mi}, R^2_{mi}, R^3_{mi}, \ldots, R^l_{mi}\}$ represents the selection of service object $U_m$ for the context index $R_i$, and the vector $R_{nj} = \{R^1_{nj}, R^2_{nj}, R^3_{nj}, \ldots, R^l_{nj}\}$ represents the context index data $R_j$ of $U_n$. The distance between the two context vectors is calculated as follows:

After the context calculation, a context vector is obtained for each natural person; a three-dimensional user-item-context matrix model is then generated from the original user-item scoring matrix, the model containing three types of vectors: user vectors, government service item vectors and context vectors.

Preferably, in the step b2, modeling the natural human situation by adopting a clustering algorithm based on K-means; the clustering algorithm based on K-means needs to define the number of to-be-formed clustering sets, namely the value of K in advance, randomly selects K objects as the centroids of the clusters, distributes the samples to the clustering set closest to the centroids by calculating the similarity between each sample and the centroids for the rest situation vectors, and then updates the centroid value for the clustering set distributed with new samples. Distributing all samples to finish natural person clustering to obtain K similar user clusters; the context clustering execution step comprises the following steps:

1. setting the number K of classifications;

2. selecting K different objects from the dataset S as initial centroids $\{b_1, b_2, \ldots, b_k\}$;

3. Calculating a context distance d between the context vector i and the centroid b for any non-accessed context vector i in the dataset S;

4. classifying the context vector i into a cluster set C closest to the context vector i;

5. re-calculating the average value of all the objects in the K clustering sets as a new centroid value;

6. updating the centroid value b in the changed set C;

7. repeating steps 3 to 6 until all centroids $\{b_1, b_2, \ldots, b_k\}$ no longer change;

8. outputting the context cluster set $\{C_1, C_2, \ldots, C_k\}$ and the corresponding centroid values $\{b_1, b_2, \ldots, b_k\}$.
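The clustering steps above can be sketched in plain Python. Two assumptions are made: the context distance formula is not reproduced in this text, so Euclidean distance is used purely as a stand-in, and centroids are recomputed once per full pass over the dataset (the common batch form of K-means) rather than after each individual assignment. The function names, `max_iter` and `seed` are illustrative.

```python
import math
import random

def distance(u, v):
    # Assumption: Euclidean distance stands in for the context distance d.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def kmeans(S, k, max_iter=100, seed=0):
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(S, k)]  # step 2
    clusters = [[] for _ in range(k)]
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for vec in S:                                # steps 3 and 4
            idx = min(range(k), key=lambda j: distance(vec, centroids[j]))
            clusters[idx].append(vec)
        # Steps 5 and 6: recompute centroids; keep the old one if a cluster is empty.
        new_centroids = [mean(c) if c else centroids[j]
                         for j, c in enumerate(clusters)]
        if new_centroids == centroids:               # step 7
            break
        centroids = new_centroids
    return clusters, centroids                       # step 8

# Six toy context vectors forming two well-separated groups.
S = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
clusters, centroids = kmeans(S, k=2)
```

On this toy dataset the two groups are recovered whichever two points are drawn as initial centroids; on real data the result of K-means generally depends on the initial choice.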

Preferably, in step b3, K similar clusters are obtained by the K-means clustering algorithm. To reduce computational complexity and improve pushing accuracy, the sub-user-item matrix corresponding to each of the K clusters is looked up in the original user-item matrix, and a UserCF recommendation algorithm is then applied directly to each sub-matrix to perform Top-N pushing. A scoring threshold P is set: if a natural person's score for a government service item is higher than the scoring threshold P, the natural person may need that item; otherwise the natural person probably does not need it. The execution steps are as follows:

1. defining the push list length N, the scoring threshold P, the number of similar neighbors M, the context cluster set $\{C_1, C_2, \ldots, C_k\}$ and the original user-item matrix R;

2. selecting any unvisited set $C_i$ in the context cluster set $\{C_1, C_2, \ldots, C_k\}$ and searching the original user-item matrix R for the corresponding user-item subset $R_i$;

3. repeating step 2 until every cluster in $\{C_1, C_2, \ldots, C_k\}$ has its corresponding user-item subset, giving $\{R^1, R^2, \ldots, R^k\}$;

4. for each user-item subset $R_i$, applying a collaborative filtering recommendation algorithm and predicting the missing scores:

first, for user $U_j$ in cluster $C_i$, finding the corresponding government service item set $I_j$;

second, screening item set $I_j$ for the government service items whose scores (Ratings) are greater than or equal to P, to obtain the government service items the user may need;

then, calculating the similarity sim between users and selecting the M most similar neighbors $\{u_1, u_2, \ldots, u_M\}$;

next, for each government service item p that the similar neighbors are interested in but the current user $U_j$ has not scored, calculating the Pearson similarity between $U_j$ and p, and sorting the items in the set $RI_j$ in decreasing order of similarity;

finally, taking the top N items of the sorted list to generate a government service push list L of length N for the current user $U_j$;

5. repeat 4 until the natural people in each cluster get pushed.
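Step 4 can be sketched for a single cluster as follows. This is a minimal sketch of textbook user-based collaborative filtering, not necessarily the exact procedure of the source: Pearson similarity is computed between users over their co-rated items, and the score of an unrated item is predicted as the similarity-weighted average of the neighbors' scores. The nested-dictionary layout of the sub-matrix, the function names and the sample ratings are assumptions.

```python
import math

def pearson(ra, rb):
    """Pearson correlation of two users over the items both have rated."""
    common = set(ra) & set(rb)
    if len(common) < 2:
        return 0.0
    ma = sum(ra[i] for i in common) / len(common)
    mb = sum(rb[i] for i in common) / len(common)
    num = sum((ra[i] - ma) * (rb[i] - mb) for i in common)
    den = (math.sqrt(sum((ra[i] - ma) ** 2 for i in common))
           * math.sqrt(sum((rb[i] - mb) ** 2 for i in common)))
    return num / den if den else 0.0

def push_list(user, cluster_ratings, N=2, P=3.0, M=2):
    """Top-N push list for `user` inside one cluster's user-item sub-matrix."""
    own = cluster_ratings[user]
    sims = sorted(((pearson(own, r), u)
                   for u, r in cluster_ratings.items() if u != user),
                  reverse=True)
    # Keep the M most similar neighbors with positive similarity.
    neighbors = [(s, u) for s, u in sims[:M] if s > 0]
    scores, weights = {}, {}
    for s, u in neighbors:
        for item, rating in cluster_ratings[u].items():
            # Skip items the user already rated and items scored below P.
            if item in own or rating < P:
                continue
            scores[item] = scores.get(item, 0.0) + s * rating
            weights[item] = weights.get(item, 0.0) + s
    predicted = {i: scores[i] / weights[i] for i in scores}
    ranked = sorted(predicted, key=lambda i: (-predicted[i], i))
    return ranked[:N]

# Hypothetical sub-matrix for one cluster: user -> {service item: score}.
cluster = {
    "u1": {"a": 5, "b": 3, "c": 1},
    "u2": {"a": 5, "b": 3, "c": 1, "d": 5, "e": 2},
    "u3": {"a": 4, "b": 3, "c": 2, "d": 4, "f": 5},
    "u4": {"a": 1, "b": 3, "c": 5, "g": 5},
}
recommended = push_list("u1", cluster)
```

Because the computation runs on one cluster's sub-matrix at a time, the neighbor search only covers users with similar contexts, which is the complexity reduction the text describes.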

According to the above technical scheme, a situation big data model of the government service object (the natural person) is built, the government services that a natural person potentially needs are mined from that person's situation, and government departments are helped to target service objects accurately. This not only innovates the application mode of government population data but also helps promote the transformation of government toward a service-oriented form. Government population data are aggregated on the basis of the natural person situation big data model, which promotes the expansion of government population data applications from a primary, small-data mode that can only serve a single system to a deep, big-data mode that can serve comprehensive multi-system applications. The situation clustering and government service pushing method is built on natural person situation big data, enabling government departments to provide accurate, personalized, refined and mobile services to the public and providing technical support for promoting the transformation of government from a production-oriented form to a service-oriented form.

Additional features and advantages of the invention will be set forth in the detailed description which follows.

Drawings

The accompanying drawings are included to provide a further understanding of the invention, and are incorporated in and constitute a part of this specification, illustrate the invention and together with the description serve to explain, without limitation, the invention. In the drawings:

fig. 1 is a flowchart of a government service pushing method based on situation big data of city big data.

Detailed Description

The following describes specific embodiments of the present invention in detail with reference to the drawings. It should be understood that the detailed description and specific examples, while indicating and illustrating the invention, are not intended to limit the invention.

In the present invention, unless otherwise indicated, directional terms contained in the terms merely represent the orientation of the terms in a conventional use state or are commonly understood by those skilled in the art, and should not be construed as limitations on the terms.

Referring to fig. 1, the invention provides a government service pushing method of situation big data based on city big data, comprising the following steps:

step a, constructing a natural person situation data model, and carrying out omnibearing description and depiction aiming at the social activity rule, individual business characteristics, social interaction state and the like of a single natural person, wherein the omnibearing description and depiction comprises a basic attribute situation, a life event situation and a social relation situation;

step a comprises:

step a1, constructing a basic attribute situation of a natural person;

step a2, constructing the life event situation of a natural person;

step a3, constructing a social relationship situation of natural people;

step b comprises:

step b1, calculating a situation vector;

step b2, clustering the situation;

step b3, pushing government services.

The basic attribute context in step a1 comprises valid identity documents (such as the resident identity card, passport, travel permit and the like) reflecting the birth and social affiliation of the natural person, an index reflecting the education level of the natural person, an index reflecting the employment situation of the natural person (current work unit), and an index reflecting the knowledge and skill level of the natural person in a given professional activity (such as practitioner qualification, practice qualification, vocational skill and the like).

Specifically, step a1 includes:

The life event context in step a2 comprises the various business events or activities that a natural person participates in during the life cycle, such as educational experience, employment experience, unemployment registration, social security participation, marriage registration, tax payment and the like.

Specifically, step a2 includes:

The social relationship context in step a3 includes person-person relationships (such as family relationships, colleague relationships, etc.), reflecting the various relationships established between natural persons during social activities; person-place relationships (such as places of residence, places of employment, schools, etc.), reflecting the relationships a natural person establishes with the places he or she belongs to for reasons of living, working, learning and the like; and person-object relationships (such as houses, land, automobiles, intellectual property, etc.), reflecting the relationships established with the tangible or intangible assets a natural person possesses over the full life cycle.

Specifically, step a3 includes:

In step b1, the government service object (i.e. the natural person) has a basic attribute context ($R_1$), a life event context ($R_2$) and a social relationship context ($R_3$). The context vector $R_{mi} = \{R^1_{mi}, R^2_{mi}, R^3_{mi}, \ldots, R^l_{mi}\}$ represents the selection of service object $U_m$ for the context index $R_i$, and the vector $R_{nj} = \{R^1_{nj}, R^2_{nj}, R^3_{nj}, \ldots, R^l_{nj}\}$ represents the context index data $R_j$ of $U_n$. The distance between the two context vectors is calculated as follows:

In the step b2, modeling is carried out on the situation of the natural person by adopting a clustering algorithm based on K-means; the clustering algorithm based on K-means needs to define the number of to-be-formed clustering sets, namely the value of K in advance, randomly selects K objects as the centroids of the clusters, distributes the samples to the clustering set closest to the centroids by calculating the similarity between each sample and the centroids for the rest situation vectors, and then updates the centroid value for the clustering set distributed with new samples. Distributing all samples to finish natural person clustering to obtain K similar user clusters; the context clustering execution step comprises the following steps:

1. setting the number K of classifications;

6. updating the centroid value b in the changed set C;

In step b3, K similar clusters are obtained by the K-means clustering algorithm. To reduce computational complexity and improve pushing accuracy, the sub-user-item matrix corresponding to each of the K clusters is looked up in the original user-item matrix, and a UserCF recommendation algorithm is then applied directly to each sub-matrix to perform Top-N pushing. A scoring threshold P is set: if a natural person's score for a government service item is higher than the scoring threshold P, the natural person may need that item; otherwise the natural person probably does not need it. The execution steps are as follows:

5. repeat 4 until the natural people in each cluster get pushed.

Through the above technical scheme, a situation big data model of the government service object (the natural person) is built, the government services that a natural person potentially needs are mined from that person's situation, and government departments are helped to target service objects accurately. This not only innovates the application mode of government population data but also helps promote the transformation of government toward a service-oriented form. Government population data are aggregated on the basis of the natural person situation big data model, which promotes the expansion of government population data applications from a primary, small-data mode that can only serve a single system to a deep, big-data mode that can serve comprehensive multi-system applications. The situation clustering and government service pushing method is built on natural person situation big data, enabling government departments to provide accurate, personalized, refined and mobile services to the public and providing technical support for promoting the transformation of government from a production-oriented form to a service-oriented form.

The preferred embodiments of the present invention have been described in detail above with reference to the accompanying drawings, but the present invention is not limited to the specific details of the above embodiments, and various simple modifications can be made to the technical solution of the present invention within the scope of the technical concept of the present invention, and all the simple modifications belong to the protection scope of the present invention.

In addition, the specific features described in the above embodiments may be combined in any suitable manner, and in order to avoid unnecessary repetition, various possible combinations are not described further.

Moreover, any combination of the various embodiments of the invention can be made without departing from the spirit of the invention, which should also be considered as disclosed herein.

Claims

1. A government service pushing method of situation big data based on city big data, characterized by comprising the following steps:

step a, constructing a natural person situation data model, and describing the social activity rule, individual business characteristics and the social interaction state of a single natural person in an omnibearing way;

step a comprises:

step a1, constructing a basic attribute situation of a natural person;

step a2, constructing the life event situation of a natural person;

step a3, constructing a social relationship situation of natural people;

step b comprises:

step b1, calculating a situation vector;

step b2, clustering the situation;

step b3, pushing government services; wherein,

in step b1, the government service object has a basic attribute context ($R_1$), a life event context ($R_2$) and a social relationship context ($R_3$); the context vector $R_{mi} = \{R^1_{mi}, R^2_{mi}, R^3_{mi}, \ldots, R^l_{mi}\}$ represents the selection of service object $U_m$ for the context index $R_i$, and the vector $R_{nj} = \{R^1_{nj}, R^2_{nj}, R^3_{nj}, \ldots, R^l_{nj}\}$ represents the context index data $R_j$ of $U_n$; the distance between the two context vectors is calculated as follows:

obtaining natural people situation vectors after situation calculation, wherein each natural person corresponds to one situation vector, and then generating a user-project-situation three-dimensional matrix model according to an original user-project scoring matrix, wherein the matrix model comprises 3 types of vectors which are respectively a user vector, a government service project vector and a situation vector;

in the step b2, modeling is carried out on the situation of the natural person by adopting a clustering algorithm based on K-means; the clustering algorithm based on K-means needs to define the number of clustering sets to be formed, namely the value of K in advance, K objects are randomly selected to serve as the centroids of the clusters, the similarity between each sample and the centroids is calculated for the rest situation vectors, the samples are distributed to the clustering sets closest to the centroids, and then the centroids are updated for the clustering sets distributed with new samples; distributing all samples to finish natural person clustering to obtain K similar user clusters; the context clustering execution step comprises the following steps:

(1) Setting the number K of classifications;

(2) Selecting K different objects from the dataset S as initial centroids $\{b_1, b_2, \ldots, b_k\}$;

(3) Calculating a context distance d between the context vector i and the centroid b for any non-accessed context vector i in the dataset S;

(4) Classifying the context vector i into a cluster set C closest to the context vector i;

(5) Re-calculating the average value of all the objects in the K clustering sets as a new centroid value;

(6) Updating the centroid value b in the changed set C;

(7) Repeating steps 3 to 6 until all centroids $\{b_1, b_2, \ldots, b_k\}$ no longer change;

(8) Outputting the context cluster set $\{C_1, C_2, \ldots, C_k\}$ and the corresponding centroid values $\{b_1, b_2, \ldots, b_k\}$;

(1) Defining the push list length N, the scoring threshold P, the number of similar neighbors M, the context cluster set $\{C_1, C_2, \ldots, C_k\}$ and the original user-item matrix R;

(2) Selecting any unvisited set $C_i$ in the context cluster set $\{C_1, C_2, \ldots, C_k\}$ and searching the original user-item matrix R for the corresponding user-item subset $R_i$;

(3) Repeating (2) until every cluster in $\{C_1, C_2, \ldots, C_k\}$ has its corresponding user-item subset, giving $\{R^1, R^2, \ldots, R^k\}$;

(4) For each user-item subset $R_i$, applying a collaborative filtering recommendation algorithm and predicting the missing scores;

(5) Repeating (4) until the natural people in each cluster are pushed.

2. The government service pushing method of situation big data based on city big data according to claim 1, wherein the basic attribute context in step a1 comprises valid identity documents reflecting the birth and social affiliation of the natural person, an index reflecting the education level of the natural person, an index reflecting the employment situation of the natural person, and an index reflecting the knowledge and skill level of the natural person in a given professional activity.

3. The government service pushing method of contextual big data based on urban big data according to claim 2, wherein step a1 comprises:

firstly, extracting the relevant data catalogs of the government departments comprising the public security bureau, the human resources and social security bureau and the civil affairs bureau, and, after multi-table association, forming natural person basic information, passport information, driving license information, Taiwan travel permit information, Hong Kong and Macao travel permit information, social security card information, education level information, work unit information, practitioner qualification information, practice qualification information, professional technical post qualification information, vocational skill information and religious personnel information;

4. The government service pushing method of situation big data based on city big data according to claim 1, wherein the life event context in step a2 includes the various business events or activities that the natural person participates in during the life cycle.

5. The government service pushing method of contextual big data based on urban big data according to claim 4, wherein step a2 comprises:

6. The government service pushing method according to claim 1, wherein the social relationship context in step a3 includes person-person relationships reflecting the various relationships established between natural persons during social activities, person-place relationships reflecting the relationships a natural person establishes with the places he or she belongs to for reasons of living, working or learning, and person-object relationships reflecting the relationships established with the tangible or intangible assets a natural person possesses during the whole life cycle.

7. The government service pushing method of contextual big data based on urban big data according to claim 6, wherein step a3 comprises:

firstly, defining a plurality of entity objects of natural people, places, motor vehicles, houses, lands and intellectual property rights; each natural person node may establish a plurality of relationships with a plurality of nodes, wherein the plurality of nodes includes natural persons, places, and items; adding attributes for each entity object, wherein the natural person attributes are basic attributes constructed in the step a1, the place attributes comprise place names, place types and roles of persons in places, and the object attributes comprise object names, object types and possession times; wherein a single node may contain multiple attribute descriptions that characterize its physical characteristics;