CN115713249B - Government satisfaction evaluation system and method based on data security and privacy protection - Google Patents

Publication number
CN115713249B
Authority
CN
China
Prior art keywords
data
module
government
satisfaction
scoring
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211235305.1A
Other languages
Chinese (zh)
Other versions
CN115713249A (en)
Inventor
卢清华
李梦园
李方伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing Yitong College
Original Assignee
Chongqing Yitong College
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing Yitong College
Priority to CN202211235305.1A
Publication of CN115713249A
Application granted
Publication of CN115713249B

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00: Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30: Computing systems specially adapted for manufacturing

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a government satisfaction evaluation system and method based on data security and privacy protection, and belongs to the technical field of data scoring. The system comprises a database module, a data security module and a data scoring module. The database module comprises a data acquisition module and a data classification module; the data security module comprises an identity authentication module, a data desensitization module and a feedback early warning module; the data scoring module comprises a preprocessing module, a model training module and a scoring module. The data security module uses identity authentication, data desensitization and a feedback early warning mechanism to protect the sensitive information of surveyed members of the public, which strengthens data security and privacy protection. It also forms an accessible system that increases data transparency and improves the fairness and openness of government affairs evaluation. The data scoring module trains a model with the CatBoost machine learning algorithm, establishes a government satisfaction scoring model and completes the related data scoring.

Description

Government satisfaction evaluation system and method based on data security and privacy protection
Technical Field
The invention belongs to the technical field of data scoring, and particularly relates to a government satisfaction evaluation system and method based on data security and privacy protection.
Background
Traditional government performance evaluation is mainly top-down. This leads some government workers to answer only to their superiors rather than to the public and to pursue a few evaluated indexes one-sidedly, so that the statistical data are inflated. With the continuous advancement of digital government construction, letting data do more of the running has become a novel and efficient mode of government work. Meanwhile, governments at all levels are paying more attention to public opinion, and the public is increasingly involved in government affairs evaluation.
The existing government affairs evaluation systems only collect satisfaction data through a simple scan-the-code questionnaire or a web page questionnaire, and in data processing they simply take the average of the satisfaction scores as the final scoring result. The resulting problems are: (1) a complete government satisfaction scoring system is lacking, that is, an adaptable and widely applicable evaluation system that covers data collection, database establishment, data processing and scoring model establishment; (2) data security management is lacking: sensitive information such as the personal attributes of respondents is not processed, so there is a risk of privacy disclosure; (3) the scoring model is outdated: no scientific and efficient algorithm model has been formed, so it is difficult to mine the problems behind government affairs data and to offer guiding suggestions; (4) government satisfaction data are not made public, so the public cannot query them, and credibility is lacking.
Therefore, there is an urgent need to carry out government satisfaction evaluation with scientific evaluation standards, methods and procedures. Doing so is important for helping government personnel perform their duties, building harmonious relations between officials and the public, and establishing a good government image.
CN111222753A discloses an e-government performance assessment system comprising: an evaluation model module, a data acquisition module, a data processing module, an index weight learning and generating module, an evaluation result generating module and an evaluation report natural language generating module. In that system, an index weight training model is established and the weight of each basic evaluation index is learned from data, which improves the certainty, objectivity and rationality of the weights used in the evaluation. Through the weight setting, the fusion of the scores on the basic evaluation indexes gives the final evaluation result a firmer basis and a clearer direction; and by setting a plurality of basic evaluation indexes, the government affairs module can be evaluated from all directions, so that the needs of the public are understood precisely and the government affairs module is improved.
The e-government performance evaluation system of CN111222753A has no encryption protection step for the sensitive information of surveyed members of the public. The present patent provides a data security module that uses identity authentication, data desensitization and a feedback early warning mechanism to enhance data security and privacy protection.
The data processing of CN111222753A lacks a data access system. In the present patent, a data access port is arranged in the data security module to form an accessible system, so that the public can access the original government satisfaction survey data, which increases data transparency and improves the fairness and openness of government evaluation.
CN111222753A focuses on determining the weights of the evaluation indexes through an evidence calculation method and a differential purification algorithm, combining the weights to synthesize the overall evaluation information and obtain a final evaluation result, and generating a text evaluation report in natural language. The present patent focuses on processing government satisfaction evaluation data to obtain a scientific government satisfaction scoring result. It uses a machine learning algorithm, the CatBoost algorithm, to simplify index screening and reduce the influence of subjective human factors, and it obtains the contribution degree of each government satisfaction index from CatBoost's built-in feature importance function, from which the index weights are determined. This both speeds up data processing and supports the scientific soundness and fairness of the weight assignment.
Disclosure of Invention
The present invention is directed to solving the above problems of the prior art. A government satisfaction evaluation system and method based on data security and privacy protection are provided. The technical scheme of the invention is as follows:
a government satisfaction evaluation system based on data security and privacy protection, comprising: a database module, a data security module and a data scoring module, wherein,
the database module is used for forming a database according to the collection and classification results of various government satisfaction evaluation data and providing the database to the data security module;
the data security module is used for performing access control, privacy protection and feedback early warning on government satisfaction evaluation data, completing the security management work of the data and providing the processed data set to the data scoring module;
and the data scoring module is used for model training of the government satisfaction evaluation data, constructing a government satisfaction scoring model and outputting scoring results.
Further, the database module comprises a data acquisition module and a data classification module, wherein the data acquisition module is used for collecting various data of government satisfaction; the data classification module is used for dividing the collected government satisfaction data into month data, quarter data and year data according to time, and further dividing the collected government satisfaction data into sub-data sets including safe construction, legal construction and service evaluation according to government theme content.
Further, the data security module comprises an identity authentication module, a data desensitization module and a feedback early warning module, wherein the identity authentication module is used for authenticating the identity of a visitor and determining the inquiry authority of the visitor; the data desensitization module is used for carrying out desensitization processing on the data set containing sensitive information; the feedback early warning module records the user behavior to generate a log; when the number of times of identity authentication rejection of the same user exceeds a set threshold, the access port is locked in time and early warning is fed back to the terminal.
Further, the specific steps of the data security module for completing the access control, privacy protection and feedback early warning work are as follows:
(1) When a user applies to access government satisfaction data resources, the user's identity is authenticated first; access is refused if authentication fails, and the user may proceed if authentication passes;
(2) The data resource requested by the user is checked: when it does not contain sensitive data, the required data set is obtained according to the user's authority; when it contains sensitive data, sensitive information including the personal attributes of respondents is encrypted to obtain a desensitized data set;
(3) The user's access process is recorded and a corresponding log is generated;
(4) When the number of identity authentication rejections for the same user exceeds a set threshold, the access port is locked promptly and an early warning is fed back to the terminal.
Further, the data desensitizing module is used for desensitizing the data set containing sensitive information, and specifically comprises the following steps:
(1) The sensitive data set enters a data desensitizing module;
(2) Determining a desensitization scheme, and desensitizing sensitive data by using modes of truncation, encryption, hiding and replacement;
(3) Writing a desensitization rule, writing a desensitization rule table, wherein different desensitization rules correspond to different data encryption methods;
(4) According to the sensitive data category (name, identity card number, mobile phone number, address), each item is associated with its desensitization scheme through the primary key, and the sensitive data are desensitized according to the designated scheme;
(5) The desensitized data set is provided to a data scoring module.
Further, the data scoring module comprises a preprocessing module, a model training module and a scoring module, wherein the preprocessing module is used for preprocessing the accessed government satisfaction data set including data cleaning, unbalanced data processing and data set segmentation, and providing the processed data set to the model training module; the model training module is used for carrying out model training on the data set, completing model training by utilizing a machine learning algorithm-Catboost algorithm, obtaining government satisfaction index contribution degree by utilizing an importance () function in the Catboost algorithm, further determining index weight and providing the index weight to the scoring module; and the scoring module establishes a government satisfaction scoring model by using the index weight, finishes the scoring work of related data and finally obtains a government satisfaction scoring result.
Furthermore, the model training is completed by using a machine learning algorithm-Catboost algorithm, and the government satisfaction index contribution degree is obtained by using an importance () function in the Catboost algorithm, so that index weights are determined and provided for a scoring module, and the method specifically comprises the following steps:
A government satisfaction evaluation method based on data security and privacy protection, using the above system, comprises the following steps:
collecting and classifying various government satisfaction evaluation data by utilizing a database module to form a database and providing the database to a data security module;
the data security module is utilized to carry out access control, privacy protection and feedback early warning on government satisfaction evaluation data, the security management work of the data is completed, and the processed data set is provided for the data scoring module;
and performing model training of government satisfaction evaluation data by utilizing a data scoring module, constructing a government satisfaction scoring model, and outputting scoring results.
The invention has the advantages and beneficial effects as follows:
Aiming at the defects in the prior art, the invention establishes a complete government satisfaction evaluation system. The system comprises a database module, a data security module and a data scoring module. The database module comprises a data acquisition module and a data classification module, and is used for forming a database according to the collection and classification results of various government satisfaction evaluation data and providing the database to the data security module. The data security module comprises an identity authentication module, a data desensitization module and a feedback early warning module, and is used for access control, privacy protection and feedback early warning on government satisfaction evaluation data, completing the security management of the data and providing the processed data set to the data scoring module. The data scoring module comprises a preprocessing module, a model training module and a scoring module, and is used for model training on government satisfaction evaluation data, constructing a government satisfaction scoring model and outputting scoring results.
The invention has the following advantages: 1. a complete government satisfaction data link is formed, which increases adaptability and ease of wider adoption; 2. a data security module is added, which uses access control, encryption and a feedback early warning mechanism to protect the sensitive information of surveyed members of the public and strengthens data security and privacy protection; 3. an identity authentication module is added to form an accessible system, which increases data transparency and improves the fairness and openness of government affairs evaluation; 4. the government satisfaction scoring model is established with the CatBoost machine learning algorithm, which speeds up data processing and supports the scientific soundness and fairness of the satisfaction index weight assignment.
Drawings
FIG. 1 shows a government satisfaction evaluation system based on data security and privacy protection in accordance with a preferred embodiment of the present invention;
FIG. 2 is a schematic flow chart of the database module in the government satisfaction evaluation system based on data security and privacy protection provided by the application;
FIG. 3 is a schematic flow chart of the data security module in the government satisfaction evaluation system based on data security and privacy protection provided by the application;
FIG. 4 is a schematic flow chart of the data scoring module in the government satisfaction evaluation system based on data security and privacy protection provided by the application;
FIG. 5 is a flow chart of data scoring in an embodiment of the present application;
FIG. 6 shows the results of the desensitization process for a data set in an embodiment of the present application;
FIG. 7 is a comparison of the algorithm model evaluation results for the data set in an embodiment of the present application;
FIG. 8 shows the constructed government satisfaction scoring model.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and specifically described below with reference to the drawings in the embodiments of the present invention. The described embodiments are only a few embodiments of the present invention.
The technical scheme for solving the technical problems is as follows:
as shown in fig. 1-6, a government satisfaction evaluation system based on data security and privacy protection comprises a database module, a data security module and a data scoring module, and the implementation method comprises the following steps:
collecting and classifying various government satisfaction evaluation data by utilizing a database module to form a database and providing the database to a data security module;
the data security module is utilized to carry out access control, privacy protection and feedback early warning on government satisfaction evaluation data, the security management work of the data is completed, and the processed data set is provided for the data scoring module;
and performing model training of government satisfaction evaluation data by utilizing a data scoring module, constructing a government satisfaction scoring model, and outputting scoring results.
Preferably, the database module comprises a data acquisition module and a data classification module, wherein the data acquisition module is used for collecting various data of government satisfaction; the data classification module is used for dividing the collected government satisfaction data into month data, quarter data and year data according to time, and further dividing the collected government satisfaction data into sub-data sets such as safe construction, legal construction, service evaluation and the like according to government theme content.
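The classification step described above can be sketched as follows. This is a minimal illustration rather than the patent's implementation: the record layout, the `period_keys` and `classify` helpers and the sample values are all assumptions introduced here.

```python
from collections import defaultdict
from datetime import date

# Hypothetical records: (survey date, government topic, satisfaction score).
RECORDS = [
    (date(2021, 1, 15), "safe construction", 4),
    (date(2021, 2, 3), "service evaluation", 5),
    (date(2021, 5, 20), "legal construction", 3),
    (date(2021, 5, 28), "service evaluation", 2),
]


def period_keys(d):
    """Return the month, quarter and year keys for a survey date."""
    quarter = (d.month - 1) // 3 + 1
    return {
        "month": f"{d.year}-{d.month:02d}",
        "quarter": f"{d.year}-Q{quarter}",
        "year": str(d.year),
    }


def classify(records, granularity):
    """Group scores into sub-datasets keyed by (time period, topic)."""
    subsets = defaultdict(list)
    for d, topic, score in records:
        subsets[(period_keys(d)[granularity], topic)].append(score)
    return dict(subsets)


by_quarter = classify(RECORDS, "quarter")
by_year = classify(RECORDS, "year")
```

Keying each sub-dataset by both period and topic mirrors the two-level split in the text (first by time, then by government theme).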
Preferably, the data security module comprises an identity authentication module, a data desensitization module and a feedback early warning module, wherein the identity authentication module is used for authenticating the identity of a visitor and determining the inquiry authority of the visitor; the data desensitization module is used for carrying out desensitization treatment on the data set containing sensitive information, so that the data security is improved; and the feedback early warning module records the user behavior to generate a log. When the number of times of identity authentication rejection of the same user exceeds a set threshold, the access port is locked in time and early warning is fed back to the terminal, so that the access safety is improved.
Preferably, the data scoring module comprises a preprocessing module, a model training module and a scoring module, wherein the preprocessing module is used for cleaning the accessed government satisfaction data set, processing unbalanced data, dividing the data set and providing the processed data set to the model training module; the model training module is used for carrying out model training on the data set, completing model training by utilizing a machine learning algorithm-Catboost algorithm, obtaining government satisfaction index contribution degree by utilizing an importance () function in the Catboost algorithm, further determining index weight and providing the index weight to the scoring module; and the scoring module establishes a government satisfaction scoring model by using the index weight, finishes the scoring work of related data and finally obtains a government satisfaction scoring result.
Preferably, the specific steps for the data security module to complete the access control, privacy protection and feedback early warning work are as follows:
(1) When a user applies for accessing government satisfaction data resources, firstly, authenticating the identity of the user, and refusing access if the authentication is not passed; if the authentication is passed, the operation can be further performed;
(2) The data resource requested by the user is checked: when it does not contain sensitive data, the required data set is obtained according to the user's authority; when it contains sensitive data, sensitive information such as the personal attributes of respondents is encrypted to obtain a desensitized data set.
(3) The user's access process is recorded and a corresponding log is generated.
(4) When the number of identity authentication rejections for the same user exceeds a set threshold, the access port is locked promptly and an early warning is fed back to the terminal.
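The authentication, logging and lock-out steps above can be sketched in a few lines. The `AccessGate` class, its credential store and the threshold value are hypothetical names chosen for illustration; the patent does not specify how credentials are stored or checked.

```python
import hmac


class AccessGate:
    """Authenticate users, log every attempt and lock out repeated rejections."""

    def __init__(self, credentials, max_rejections=3):
        self.credentials = credentials        # hypothetical user -> secret store
        self.max_rejections = max_rejections  # the "set threshold" from the text
        self.rejections = {}                  # user -> number of rejections so far
        self.locked = set()                   # users whose access port is locked
        self.log = []                         # (user, outcome) access log
        self.warnings = []                    # early warnings fed back to the terminal

    def request(self, user, secret):
        """Return True if access is granted, False otherwise."""
        if user in self.locked:
            self.log.append((user, "locked"))
            return False
        expected = self.credentials.get(user)
        if expected is not None and hmac.compare_digest(expected, secret):
            self.log.append((user, "granted"))
            return True
        # Authentication failed: record it and lock once the threshold is exceeded.
        self.rejections[user] = self.rejections.get(user, 0) + 1
        self.log.append((user, "rejected"))
        if self.rejections[user] > self.max_rejections:
            self.locked.add(user)
            self.warnings.append(f"user {user!r} locked after repeated rejections")
        return False


gate = AccessGate({"R": "correct-secret"}, max_rejections=2)
attempts = [gate.request("R", "wrong") for _ in range(3)]
after_lock = gate.request("R", "correct-secret")
```

In this sketch the rejection counter is never reset, so a locked user stays locked even when the correct secret is supplied later; whether and how a lock is lifted is left open by the text.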
Preferably, when the sensitive data set is accessed, the data desensitizing module is automatically entered to complete dynamic data desensitization, and the specific steps are as follows:
(1) The sensitive data set enters a data desensitizing module;
(2) A desensitization scheme is determined: sensitive data are desensitized by truncation, encryption, hiding, replacement and similar means, for example by replacing the true value with special characters (such as *);
(3) Desensitization rules are written into a desensitization rule table, in which different desensitization rules correspond to different data encryption methods;
(4) According to the sensitive data category (sensitive information including name, identity card number, mobile phone number and address), each item is associated with its desensitization scheme through the primary key, and the sensitive data are desensitized according to the designated scheme;
(5) The desensitized data set is provided to a data scoring module.
Preferably, the government satisfaction data set enters a data scoring module to finish the related data scoring operation, and the specific steps are as follows:
(1) Data preprocessing: the preprocessing module completes data cleaning, that is, the data set is inspected and described; unbalanced data are then handled with oversampling or undersampling; finally, the data set is split into a training set and a test set and provided to the model training module;
(2) Model training: the training set enters the data scoring module and is used by the model training module to train a model with the CatBoost machine learning algorithm; the model is evaluated, the contribution degree of each government satisfaction index is obtained with the feature importance function of the CatBoost algorithm, and the index weights are determined and provided to the scoring module;
(3) Data scoring: the scoring module establishes a government satisfaction scoring model with the index weights and completes the related data scoring to obtain the final government satisfaction scoring result.
Preferably, the data scoring module completes data preprocessing, model training and establishment of the scoring model to obtain the scoring result. Detailed Python-based pseudo-code is as follows:
Part 1: data preprocessing
# Assumes data, dfdata, x, y, label_col and cate_cols have already been defined
import numpy as np
import catboost as cb
from fancyimpute import KNN
from imblearn.over_sampling import RandomOverSampler
from imblearn.under_sampling import RandomUnderSampler
from sklearn.model_selection import train_test_split
# Describe the data
print(data.describe())
# Fill null values with KNN
dataset = KNN(k=3).fit_transform(data)
# Handle unbalanced data by oversampling or undersampling
# Oversampling: the corresponding class in the Python library is RandomOverSampler
ROS = RandomOverSampler(random_state=0)
x_resampled, y_resampled = ROS.fit_resample(x, y)
# Undersampling: the corresponding class in the Python library is RandomUnderSampler
RUS = RandomUnderSampler(random_state=0)
x_resampled, y_resampled = RUS.fit_resample(x, y)
# Split the data set into a training set and a validation set
dftrain, dfvalid = train_test_split(dfdata, train_size=0.7, random_state=42)
Xtrain, Ytrain = dftrain.drop(label_col, axis=1), dftrain[label_col]
Xvalid, Yvalid = dfvalid.drop(label_col, axis=1), dfvalid[label_col]
Cate_cols_indexs = np.where(Xtrain.columns.isin(cate_cols))[0]
Data_train = cb.Pool(data=Xtrain, label=Ytrain, cat_features=cate_cols)
Data_valid = cb.Pool(data=Xvalid, label=Yvalid, cat_features=cate_cols)
Part 2: model training
from sklearn.metrics import f1_score
import plotly.express as px
from IPython.display import display
# Set parameters
iterations = 1000
early_stopping_rounds = 200
Params = {'learning_rate': 0.05,
          'loss_function': "Logloss",
          'eval_metric': "Accuracy",
          'depth': 6,
          'min_data_in_leaf': 20,
          'random_seed': 42,
          'logging_level': 'Silent',
          'use_best_model': True,
          'one_hot_max_size': 2,
          'boosting_type': "Ordered",
          'max_ctr_complexity': 4}
# Build the model
model = cb.CatBoostClassifier(
    iterations=iterations,
    early_stopping_rounds=early_stopping_rounds,
    train_dir='catboost_info/',
    **Params)
# Train directly
model.fit(
    Data_train,
    eval_set=Data_valid,
    plot=True
)
print("model.get_all_params():")
print(model.get_all_params())
# Evaluate the model
y_pred_train = model.predict(Xtrain)
y_pred_valid = model.predict(Xvalid)
train_score = f1_score(Ytrain, y_pred_train)
valid_score = f1_score(Yvalid, y_pred_valid)
print('train f1_score:{:.5}'.format(train_score))
print('valid f1_score:{:.5}\n'.format(valid_score))
# Determine feature importance
dfimportance = model.get_feature_importance(prettified=True)
dfimportance = dfimportance.sort_values(by="Importances").iloc[-20:]
fig_importance = px.bar(dfimportance,
                        x="Importances", y="Feature Id", title="CatBoost Feature Importance Ranking")
display(dfimportance)
display(fig_importance)
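Part 3: data scoring
# Establish the scoring model and determine the scoring result
# f is assumed to be a NumPy array; f[i] is the contribution rate of the i-th government index
F = f / f.sum()  # F[i] is the normalized weight of the i-th index
print(F)
# W_index is assumed to be a NumPy array holding the score W_i of each index
# Overall score: W = F1*W1 + F2*W2 + F3*W3 + ... + Fn*Wn
W = np.dot(F, W_index)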
Suppose that satisfaction evaluation data from residents of District J on government food safety work in 2021 are collected by the data acquisition module A1 in the database module of FIG. 2, and that the data set contains sensitive information such as the respondents' names, identity card numbers, mobile phone numbers and addresses. The data classification module A2 further divides the data into 3 sub-data sets according to time and government topic: 2021 data D1, food safety work data D2, and 2021 food safety work data D3.
As shown in FIG. 3, user R accesses the 2021 food safety work data D3 through the data security module B. After the user obtains permission through the identity authentication module B1, the data automatically enter the data desensitization module B2, which desensitizes the sensitive information as follows:
(1) According to the sensitive data category (name, identity card number, mobile phone number, address), each item is associated with its desensitization scheme through the primary key, and the sensitive data are desensitized according to the designated scheme;
The desensitization result for data set D3 is shown in FIG. 6. For names, the surname is retained and the given name is hidden. For identity card numbers, the first six digits and the last four digits are retained, so that the number can still be matched to regional information while the information is better protected. For mobile phone numbers, four digits are hidden starting from the fifth digit. Address information is truncated to the district level, which makes it convenient to check the compliance of the satisfaction survey in the corresponding district while preventing information leakage.
(2) The resulting desensitized data set D3' is provided to the data scoring module C.
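The masking rules described for FIG. 6 can be sketched as a small rule table keyed by sensitive data category. The function names, the record layout, the sample values and the use of "区" as the district marker are assumptions made for this illustration; the patent does not give the actual rule table.

```python
def mask_name(name):
    """Keep the surname (assumed to be the first character) and hide the rest."""
    return name[:1] + "*" * (len(name) - 1)


def mask_id_number(id_number):
    """Keep the first six and last four characters (assumes an 18-character number)."""
    return id_number[:6] + "*" * (len(id_number) - 10) + id_number[-4:]


def mask_phone(phone):
    """Hide four digits starting from the fifth digit."""
    return phone[:4] + "****" + phone[8:]


def mask_address(address, district_marker="区"):
    """Truncate the address right after the district marker (assumed convention)."""
    end = address.find(district_marker)
    return address[: end + 1] if end >= 0 else ""


# Hypothetical desensitization rule table: sensitive category -> masking rule.
RULES = {
    "name": mask_name,
    "id_number": mask_id_number,
    "phone": mask_phone,
    "address": mask_address,
}


def desensitize(record):
    """Apply the matching rule to each sensitive field; leave other fields untouched."""
    return {key: RULES[key](value) if key in RULES else value
            for key, value in record.items()}


sample = {
    "id": 1,  # primary key, kept so rows can still be associated
    "name": "张三丰",
    "id_number": "500117199001011234",
    "phone": "13812345678",
    "address": "重庆市合川区某街道1号",
    "score": 4,
}
masked = desensitize(sample)
```

Keeping the primary key and the non-sensitive score untouched lets the desensitized rows flow on to the scoring stage unchanged.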
To further determine the satisfaction score of the residents of District J for government food safety work, the data set D3' is passed to the data scoring module C to complete the final scoring, referring to FIG. 4, as follows:
(1) The government satisfaction data set D3' enters the preprocessing module C1 to complete basic data preprocessing, including data cleaning, data inspection, null value filling and data set splitting;
(2) Model training is completed through the model training module C2. The data set D3' is used to train a model with the CatBoost algorithm in Python. Assume that the government satisfaction data set is
$$D = \{(x_j, y_j)\}_{j=1}^{n},$$
where $x_j$ is an index vector of m government satisfaction features and $y_j$ is the label value corresponding to the government satisfaction indexes. The CatBoost algorithm uses the mean of the label values, i.e.
$$\mathbb{E}(y \mid x = x_k),$$
to estimate the expected label for each category feature value, and encodes it as a new numerical variable $\hat{x}_k$, i.e.
$$\hat{x}_k = \frac{\sum_{j=1}^{n} [x_j = x_k]\, y_j + \alpha P}{\sum_{j=1}^{n} [x_j = x_k] + \alpha},$$
where $[\cdot]$ denotes the indicator function: when $x_j = x_k$ is satisfied for the category variable index, the function returns 1, and otherwise it returns 0; P is the prior value, a hyperparameter; the parameter $\alpha$ ($\alpha > 0$) is the weight of the prior value; and $x_j$ and $y_j$ respectively denote the j-th category variable index and its corresponding label value.
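The encoding described above, in which each category value is replaced by the mean label of the samples sharing that value, smoothed by a prior P with weight α, can be sketched as follows. This is a simplified illustration that sums over the whole data set; CatBoost's own ordered target statistics, as described in its documentation, restrict the sums to samples that precede the current one in a random permutation. The function name and sample values are assumptions.

```python
def target_statistic(categories, labels, prior, alpha):
    """Encode each category as (sum of its labels + alpha * prior) / (count + alpha)."""
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    sums, counts = {}, {}
    for category, label in zip(categories, labels):
        sums[category] = sums.get(category, 0.0) + label
        counts[category] = counts.get(category, 0) + 1
    return [(sums[c] + alpha * prior) / (counts[c] + alpha) for c in categories]


# Hypothetical categorical index with binary satisfaction labels.
encoded = target_statistic(["a", "a", "b"], [1, 0, 1], prior=0.5, alpha=1.0)
```

The prior term keeps rare categories from being encoded with an extreme value that rests on only one or two samples.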
After the automatic encoding is finished, the CatBoost algorithm replaces the conventional gradient estimation method with its own ordered boosting method: for each government satisfaction sample D_k (D_k ∈ D3'), a separate model M_i is trained, and finally M_n is obtained, that is, an unbiased gradient estimate of the samples is found, from which the final model is trained.
(3) Model evaluation. The model trained by the CatBoost-based government satisfaction scoring module is evaluated with four metrics, and Fig. 7 shows the computed result for each metric. The four metrics are model training speed, precision, F1 value and AUC value. They are defined and calculated as follows:
Model training speed is the time that different algorithms need to train a model on the same computer hardware with a data set of the same size.
Precision is the proportion of samples whose government satisfaction score is truly correct among all samples that the model assigns to that class, calculated as
$$\mathrm{Precision} = \frac{TP}{TP + FP}$$
Recall is the proportion of samples whose government satisfaction score is truly correct that are identified as such by the model, calculated as
$$\mathrm{Recall} = \frac{TP}{TP + FN}$$
where $TP$, $FP$ and $FN$ denote the numbers of true positives, false positives and false negatives, respectively.
The F1 value is the weighted harmonic mean of precision and recall, with both weights assumed equal, i.e.
$$F1 = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$$
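The precision, recall and F1 definitions in step (3) can be checked with a short Python sketch. It assumes a binary labelling (for example 1 for "satisfied" and 0 otherwise), which the patent does not specify; the function name `precision_recall_f1` and the sample labels are hypothetical.

```python
from typing import Sequence, Tuple


def precision_recall_f1(
    y_true: Sequence[int],
    y_pred: Sequence[int],
    positive: int = 1,
) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) for the chosen positive class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    tp = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1  # correctly predicted positive
        elif pred == positive:
            fp += 1  # predicted positive, actually negative
        elif truth == positive:
            fn += 1  # missed positive

    # Return 0.0 instead of dividing by zero when a denominator is empty.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

With the hypothetical inputs `y_true = [1, 1, 1, 0, 0, 1]` and `y_pred = [1, 0, 1, 1, 0, 1]` there are 3 true positives, 1 false positive and 1 false negative, so precision, recall and F1 all equal 0.75.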
(4) Construct the government satisfaction scoring model. As shown in Fig. 8, the relative importance of the n feature variables in the data set D3' is obtained with the importance function of the CatBoost algorithm, the index contribution rates are determined from it, and the scoring model is then obtained by weighting the indices accordingly. The specific steps are as follows:
The index contribution rates $f_i$ of the $n$ indices are obtained from the CatBoost model, and the weight of each index is then computed as $F_i = f_i / \sum_{j=1}^{n} f_j$. The resident food safety satisfaction scoring model is rebuilt from these weights. Let the resident food safety satisfaction score be $W$ and the score of each index be $W_i$; the final government satisfaction scoring model is then
$$W = F_1 W_1 + F_2 W_2 + \ldots + F_i W_i + \ldots + F_n W_n$$
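The weighting in step (4) amounts to normalizing the contribution rates and taking a weighted sum. The sketch below shows this arithmetic in Python under stated assumptions: the index names, contribution rates and index scores are hypothetical illustration values, and in practice the contribution rates would be read from the trained CatBoost model's feature importance output rather than hard-coded.

```python
from typing import Dict


def normalize_contributions(contributions: Dict[str, float]) -> Dict[str, float]:
    """Turn contribution rates f_i into weights F_i = f_i / sum(f)."""
    total = sum(contributions.values())
    if total <= 0:
        raise ValueError("the contribution rates must sum to a positive value")
    return {name: value / total for name, value in contributions.items()}


def satisfaction_score(weights: Dict[str, float], index_scores: Dict[str, float]) -> float:
    """Compute W = F_1*W_1 + ... + F_n*W_n over the weighted indices."""
    missing = set(weights) - set(index_scores)
    if missing:
        raise KeyError(f"no score given for: {sorted(missing)}")
    return sum(weights[name] * index_scores[name] for name in weights)


# Hypothetical contribution rates and per-index scores, for illustration only.
example_contributions = {"inspection": 50.0, "transparency": 30.0, "complaint_handling": 20.0}
example_scores = {"inspection": 80.0, "transparency": 70.0, "complaint_handling": 90.0}
example_weights = normalize_contributions(example_contributions)
example_total = satisfaction_score(example_weights, example_scores)
```

With these values the weights are 0.5, 0.3 and 0.2, and the overall score is 0.5 × 80 + 0.3 × 70 + 0.2 × 90 = 79, up to floating-point rounding.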
The system, apparatus, module or unit set forth in the above embodiments may be implemented in particular by a computer chip or entity, or by a product having a certain function. One typical implementation is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
Computer-readable media include both permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of storage media for a computer include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. Computer-readable media, as defined herein, do not include transitory computer-readable media (transmission media), such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article or apparatus that comprises the element.
The above examples should be understood as illustrative only and not limiting the scope of the invention. Various changes and modifications to the present invention may be made by one skilled in the art after reading the teachings herein, and such equivalent changes and modifications are intended to fall within the scope of the invention as defined in the appended claims.

Claims (5)

1. A government satisfaction evaluation system based on data security and privacy protection, comprising: a database module, a data security module and a data scoring module, wherein,
the database module is used for forming a database according to the collection and classification results of various government satisfaction evaluation data and providing the database to the data security module;
the data security module is used for performing access control, privacy protection and feedback early warning on government satisfaction evaluation data, completing the security management work of the data and providing the processed data set to the data scoring module;
the data scoring module is used for model training of government satisfaction evaluation data, constructing a government satisfaction scoring model and outputting scoring results;
the database module comprises a data acquisition module and a data classification module, wherein the data acquisition module is used for collecting various data of government satisfaction; the data classification module is used for dividing the collected government satisfaction data into month data, quarter data and year data according to time, and further dividing the collected government satisfaction data into sub-data sets including safe construction, legal construction and service evaluation according to government theme content;
the data security module comprises an identity authentication module, a data desensitization module and a feedback early warning module, wherein the identity authentication module is used for authenticating the identity of a visitor and determining the inquiry authority of the visitor; the data desensitization module is used for carrying out desensitization processing on the data set containing sensitive information; the feedback early warning module records the user behavior to generate a log; when the number of times of identity authentication rejection of the same user exceeds a set threshold, locking an access port in time and feeding back early warning to a terminal;
the data scoring module comprises a preprocessing module, a model training module and a scoring module, wherein the preprocessing module is used for preprocessing the accessed government satisfaction data set, including data cleaning, unbalanced data processing and data set segmentation, and providing the processed data set to the model training module; the model training module is used for carrying out model training on the data set, completing model training by utilizing the CatBoost machine learning algorithm, obtaining the government satisfaction index contribution degree by utilizing an importance function in the CatBoost algorithm, further determining the index weight and providing the index weight to the scoring module; and the scoring module establishes a government satisfaction scoring model by using the index weight, completes the scoring of the related data and finally obtains a government satisfaction scoring result.
2. The government satisfaction evaluation system based on data security and privacy protection according to claim 1, wherein the specific steps of the data security module for completing access control, privacy protection and feedback early warning work are as follows:
(1) When a user applies for accessing government satisfaction data resources, firstly, authenticating the identity of the user, and refusing access if the authentication is not passed; if the authentication is passed, further operation is performed;
(2) Judging the data resource requested by the user: when the data resource does not contain sensitive data, the required data set is obtained according to the user's authority; when the requested data resource contains sensitive data, the sensitive information, including attributes of members of the public, is encrypted to obtain a desensitized data set;
(3) Recording the access process of the user and generating a corresponding log;
(4) When the number of times of identity authentication rejection of the same user exceeds a set threshold value, the access port is locked in time and feedback early warning is given to the terminal.
3. The government satisfaction evaluation system based on data security and privacy protection according to claim 1, wherein the data desensitizing module is used for desensitizing a data set containing sensitive information, and the specific steps are as follows:
(1) The sensitive data set enters a data desensitizing module;
(2) Determining a desensitization scheme, and desensitizing sensitive data by using modes of truncation, encryption, hiding and replacement;
(3) Writing a desensitization rule, writing a desensitization rule table, wherein different desensitization rules correspond to different data encryption methods;
(4) According to the sensitive data category, namely name, ID card number, mobile phone number or address, and the desensitization scheme, completing the desensitization of the sensitive data via primary-key association in accordance with the specified desensitization scheme;
(5) The desensitized data set is provided to a data scoring module.
4. The government satisfaction evaluation system based on data security and privacy protection according to claim 1, wherein the model training is completed by using the CatBoost machine learning algorithm, the government satisfaction index contribution is obtained by using an importance function in the CatBoost algorithm, and an index weight is thereby determined and provided to the scoring module, specifically comprising:
assume that the government satisfaction data set is $D = \{(X_k, Y_k)\}_{k=1,2,\ldots,n}$, wherein $X_k = (x_k^1, x_k^2, \ldots, x_k^m)$ is an index vector containing $m$ government satisfaction features, $Y_k = (y_1, y_2, \ldots, y_k)$, and $y_k \in \mathbb{R}$ is a label value corresponding to a government satisfaction index; the CatBoost algorithm uses the mean value of the same category of feature data, $\hat{x}_k^i$, i.e. $\hat{x}_k^i = E\left(y \mid x^i = x_k^i\right)$, to deduce the frequency of each categorical feature, and encodes it to form a new numerical variable
$$\hat{x}_k^i = \frac{\sum_{j=1}^{n} \left[x_j^i = x_k^i\right] y_j + \alpha p}{\sum_{j=1}^{n} \left[x_j^i = x_k^i\right] + \alpha}$$
wherein $[\cdot]$ denotes an indicator function that returns 1 when $x_j^i = x_k^i$ is satisfied and 0 otherwise; $p$ is a prior value and a hyperparameter; the parameter $\alpha$ is the weight of the prior value, where $\alpha > 0$; and $x_j^i$ and $y_j$ respectively denote the $j$-th categorical variable index and its corresponding label value;
after the automatic encoding is finished, the CatBoost algorithm replaces the gradient estimation method with ordered boosting, and for each government satisfaction sample $D_k$, wherein $D_k \in D$, a separate model $M_i$ is trained, finally obtaining $M_n$, i.e. an unbiased gradient estimate of the samples is found, whereby the final model is trained and obtained.
5. A method for evaluating government satisfaction based on data security and privacy protection based on the system of any one of claims 1-4, comprising the steps of:
collecting and classifying various government satisfaction evaluation data by utilizing a database module to form a database and providing the database to a data security module;
the data security module is utilized to carry out access control, privacy protection and feedback early warning on government satisfaction evaluation data, the security management work of the data is completed, and the processed data set is provided for the data scoring module;
and performing model training of government satisfaction evaluation data by utilizing a data scoring module, constructing a government satisfaction scoring model, and outputting scoring results.
CN202211235305.1A 2022-10-10 2022-10-10 Government satisfaction evaluation system and method based on data security and privacy protection Active CN115713249B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211235305.1A CN115713249B (en) 2022-10-10 2022-10-10 Government satisfaction evaluation system and method based on data security and privacy protection

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211235305.1A CN115713249B (en) 2022-10-10 2022-10-10 Government satisfaction evaluation system and method based on data security and privacy protection

Publications (2)

Publication Number Publication Date
CN115713249A CN115713249A (en) 2023-02-24
CN115713249B true CN115713249B (en) 2023-06-13

Family

ID=85230949

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211235305.1A Active CN115713249B (en) 2022-10-10 2022-10-10 Government satisfaction evaluation system and method based on data security and privacy protection

Country Status (1)

Country Link
CN (1) CN115713249B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108304725A (en) * 2018-02-09 2018-07-20 山东汇贸电子口岸有限公司 A kind of method and system to the desensitization of government data resource
CN109189826A (en) * 2018-08-14 2019-01-11 北京新广视通科技有限公司 A kind of government affairs service system based on big data
CN111222753A (en) * 2019-12-17 2020-06-02 合肥工业大学 E-government performance evaluation system and method
CN111582653A (en) * 2020-04-14 2020-08-25 五邑大学 Government affair service evaluation processing method, system, device and storage medium
CN111603161A (en) * 2020-05-28 2020-09-01 苏州小蓝医疗科技有限公司 Electroencephalogram classification method
CN113850483A (en) * 2021-09-10 2021-12-28 百维金科(上海)信息科技有限公司 Enterprise credit risk rating system
CN114219688A (en) * 2021-12-06 2022-03-22 安徽长泰科技有限公司 Government affair data supervision system for ensuring information safety
CN115130122A (en) * 2022-06-12 2022-09-30 四川云云旺软件技术有限公司 Big data security protection method and system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
基于熵权TOPSIS法的短时交通流预测模型性能综合评价;邵毅明;钟颖;吴文文;胡广雪;;重庆理工大学学报(自然科学)(第07期);213-219+262 *

Also Published As

Publication number Publication date
CN115713249A (en) 2023-02-24

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant