CN113436751A - Weekly ILI proportion trend prediction system and method - Google Patents

Weekly ILI proportion trend prediction system and method


Publication number
CN113436751A
Authority
CN
China
Prior art keywords
data
ili
week
weekly
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110725434.8A
Other languages
Chinese (zh)
Inventor
刘文丽
李向阳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Health Medical Big Data Co ltd
Original Assignee
Shandong Health Medical Big Data Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong Health Medical Big Data Co ltd filed Critical Shandong Health Medical Big Data Co ltd
Priority to CN202110725434.8A priority Critical patent/CN113436751A/en
Publication of CN113436751A publication Critical patent/CN113436751A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/80 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for detecting, monitoring or modelling epidemics or pandemics, e.g. flu
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Public Health (AREA)
  • Medical Informatics (AREA)
  • Data Mining & Analysis (AREA)
  • Biomedical Technology (AREA)
  • Databases & Information Systems (AREA)
  • Pathology (AREA)
  • Epidemiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a weekly ILI (influenza-like case) proportion trend prediction system and method, belonging to the technical field of disease control. The system comprises a data acquisition module, a model development module, a historical weekly ILI proportion monitoring module, a future-week ILI proportion prediction module and a weekly growth anomaly module. The data acquisition module collects Baidu index data, meteorological data, and outpatient and influenza-like case data; the model development module covers data preprocessing, feature engineering, model tuning and model generation; the historical weekly ILI proportion monitoring module monitors the historical weekly ILI proportion; and the weekly growth anomaly module issues early warnings using the historical weekly ILI proportion monitoring results and the future-week ILI proportion prediction results. By adding weather and Baidu index data as factors in the prediction, the system improves the soundness and accuracy of the prediction and has good value for wider application.

Description

Weekly ILI proportion trend prediction system and method
Technical Field
The invention relates to the technical field of disease control, and particularly provides a weekly ILI proportion trend prediction system and method.
Background
Influenza (flu for short) is an acute respiratory infectious disease caused by the influenza virus. It is highly infectious, has high morbidity, and readily causes outbreaks or pandemics. Traditional disease monitoring systems are based on hospital and laboratory tests, and there is often a lag between the symptom report or sample collection and the final diagnosis of the disease.
With the rise of artificial intelligence technology, various artificial intelligence models are gradually being applied to influenza prediction. Because influenza is a typical seasonal disease with strong periodicity, the industry often uses time series models to predict the future trend of the ILI (influenza-like case) proportion.
It has been found that sudden drops in temperature occur more often in the early stage of an influenza outbreak. In addition, with the development of the internet, online searches for terms such as cold medication and cold symptoms increase as the flu season arrives. However, predicting the future ILI proportion with a time series algorithm alone can only account for the influence of historical ILI proportions on future ILI proportions and cannot incorporate these other influencing factors into the prediction.
Disclosure of Invention
The technical task of the invention is, in view of the problems above, to provide a weekly ILI proportion trend prediction system that adds weather and network data as factors in the prediction and thereby improves the soundness and accuracy of the prediction.
A further technical task of the present invention is to provide a method for predicting the weekly ILI proportion trend.
To achieve this purpose, the invention provides the following technical scheme:
A weekly ILI proportion trend prediction system comprises a data acquisition module, a model development module, a historical weekly ILI proportion monitoring module, a future-week ILI proportion prediction module and a weekly growth anomaly module. The data acquisition module collects Baidu index data, meteorological data, and outpatient and influenza-like case data; the model development module covers data preprocessing, feature engineering, model tuning and model generation; the historical weekly ILI proportion monitoring module monitors the historical weekly ILI proportion; and the weekly growth anomaly module issues early warnings using the historical weekly ILI proportion monitoring results and the future-week ILI proportion prediction results. For the Baidu index, the daily Baidu search index of 81 influenza-related words over the last three years is first collected by web crawling; thereafter, the daily Baidu search index of the 81 words is collected periodically, once a week. For meteorological data, three years of data covering daytime weather, daytime temperature, daytime wind scale, night weather, night temperature and night wind scale are first collected by web crawling; thereafter, the meteorological data are collected periodically, once a week. The collected data are stored in a Kudu database.
The outpatient and influenza-like case data are obtained by first counting the daily outpatient volume of departments related to respiratory syndrome diseases and the daily outpatient volume of influenza-like cases over the last three years; thereafter, both daily volumes are collected periodically, once a week.
Preferably, data preprocessing preprocesses the outpatient and influenza-like case data, the meteorological data and the Baidu index data and writes the preprocessed data back to a storage table; feature engineering extracts the last three years of preprocessed outpatient and influenza-like case data, meteorological data and Baidu index data from the storage table and performs feature processing on each; model tuning tunes the multiple regression model XGBoost and the time series model ARIMA; based on the tuning results, the model is trained on the last three years of collected data after feature engineering, a model is generated, and the model is finally exposed as an interface. The input of the interface is pre_date, the date of the Monday of the week to be predicted; the output is pre_value, the model's prediction of the ILI proportion for that week.
The historical weekly ILI proportion monitoring module monitors the historical weekly ILI proportion as follows:
first, obtain from the outpatient and influenza-like case database the daily outpatient volume of departments related to respiratory syndrome diseases and the daily outpatient volume of influenza-like cases for the year preceding the week to be predicted;
second, calculate, week by week, the weekly outpatient volume of departments related to respiratory syndrome diseases and the weekly outpatient volume of influenza-like cases over that year;
third, calculate the ILI proportion for each week and display it as a line chart. The calculation formula is as follows:
[Formula image BDA0003137502180000021 not reproduced: weekly ILI proportion]
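A plausible reading of this formula, assuming the weekly ILI proportion is the ratio of the two weekly quantities computed in the preceding step (the percentage scaling is also an assumption, since the original formula survives only as an image reference), is:

```latex
\mathrm{ILI\ proportion}_{w}
  = \frac{\text{weekly outpatient volume of influenza-like cases in week } w}
         {\text{weekly outpatient volume of respiratory-syndrome-related departments in week } w}
    \times 100\%
```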
Preferably, the future-week ILI proportion prediction module predicts the ILI proportion of the coming week through the following process:
1) a scheduled program calls the model interface every Monday, passing in the current date;
2) the weather data, Baidu index data, and historical outpatient and influenza-like case data required for the prediction are fetched automatically from the database;
3) the fetched data are processed by feature engineering and fed into the model to generate a prediction result, which is returned in JSON form;
4) each time the model returns a result, the XGBoost prediction result, the final prediction result and the ILI proportion of the previous week are stored in the database for the next calculation.
Preferably, when issuing weekly growth anomaly warnings, the weekly growth anomaly module treats the week before the week to be predicted as the current week for the warning on the historical ILI proportion, and treats the week to be predicted as the current week for the warning on the future-week ILI proportion.
A weekly ILI proportion trend prediction method, implemented on the basis of the weekly ILI proportion trend prediction system, comprises the following steps:
S1, data preprocessing: clean the collected data and convert it into data suitable for feature engineering, the data comprising outpatient and influenza-like case data, meteorological data and Baidu index data;
S2, feature engineering: perform feature combination, screening and feature vectorization on the preprocessed data;
S3, prediction with a multiple regression model that incorporates the time factor.
Preferably, the processing of outpatient and influenza-like case data comprises processing of time data and processing of outpatient volume data, wherein the time data processing extracts, for each day, the week of the year it belongs to as the time feature, and the outpatient volume processing calculates the daily outpatient volume and the daily influenza-like case outpatient volume; the meteorological data comprise daytime weather, daytime temperature, daytime wind scale, night weather, night temperature and night wind scale; the Baidu index data represent the search popularity of influenza-related words on the internet.
Preferably, feature engineering for the outpatient and influenza-like case data, based on the preprocessing results, calculates the weekly outpatient volume, the weekly ILI, the weekly influenza-like case outpatient volume, the ILI of the last two weeks and the ILI of the last three weeks, indexed by week of the year; feature engineering for the meteorological data, based on the preprocessing results, comprises processing of numerical data and processing of categorical data; for the Baidu index data, based on the preprocessing results, the weekly average Baidu index of each keyword is calculated per week of the year, and each keyword is standardized along the weekly dimension using the Z-score.
Preferably, the multiple regression model prediction incorporating the time factor takes the historical meteorological data, Baidu index data, and outpatient and influenza-like case data processed by feature engineering as input factors and predicts the ILI proportion of the coming week; the prediction comprises a multiple regression network, a time series network and adaptive weight adjustment.
Compared with the prior art, the weekly ILI proportion trend prediction system and method have the following notable benefit: while retaining the time factor, they add weather and Baidu index data as factors in the prediction, which improves the soundness and accuracy of the model, and they have good value for wider application.
Drawings
FIG. 1 is a topological diagram of a weekly ILI proportion trend prediction system according to the present invention;
FIG. 2 is a flow chart of model development;
FIG. 3 is a flow chart of the multiple regression model prediction of the weekly ILI proportion trend prediction method of the present invention.
Detailed Description
The weekly ILI proportion trend prediction system and method of the present invention will be described in further detail with reference to the accompanying drawings and examples.
Examples
As shown in FIG. 1, the weekly ILI proportion trend prediction system of the present invention includes a data acquisition module, a model development module, a historical weekly ILI proportion monitoring module, a future-week ILI proportion prediction module, and a weekly growth anomaly module.
The data acquisition module collects Baidu index data, meteorological data, and outpatient and influenza-like case data. For the Baidu index, the daily Baidu search index of 81 influenza-related words (shown in Table 1) is first collected by web crawling; thereafter, the daily Baidu search index of the 81 words is collected periodically, once a week. For meteorological data, three years of data covering daytime weather, daytime temperature, daytime wind scale, night weather, night temperature and night wind scale are first collected by web crawling; thereafter, the meteorological data are collected periodically, once a week. The outpatient and influenza-like case data are obtained by first counting the daily outpatient volume of departments related to respiratory syndrome diseases and the daily outpatient volume of influenza-like cases over the last three years; thereafter, both daily volumes are collected periodically, once a week. The collected data are stored in a Kudu database; the table structures for the meteorological data, the Baidu index data, and the outpatient and influenza-like case data are shown in Tables 2, 3 and 4 respectively.
[Table 1 (image not reproduced): the 81 influenza-related search words]
TABLE 2
No.   Field name          Chinese name           Type     Remarks
1     RECORD_DATE         Date                   string
2     DAY_WEATHER         Daytime weather        string
3     DAY_TEMPERATURE     Daytime temperature    string
4     DAY_WIND_SCALE      Daytime wind scale     string
5     NIGHT_WEATHER       Night weather          string
6     NIGHT_TEMPERATURE   Night temperature      string
7     NIGHT_WIND_SCALE    Night wind scale       string
TABLE 3
[Table image not reproduced: table structure of the Baidu index data]
TABLE 4
No.   Field name      Chinese name
1     RECORD_DATE     Record date
2     MZ_NUM          Outpatient volume of departments related to respiratory syndrome diseases
3     INFLUENZA_NUM   Outpatient volume of influenza-like cases
As shown in FIG. 2, the model development module includes data preprocessing, feature engineering, model tuning, and model generation.
Data preprocessing is performed separately on the three types of data: outpatient and influenza-like case data, meteorological data, and Baidu index data. The preprocessed data are written back to a storage table. The last three years of preprocessed outpatient and influenza-like case data, meteorological data and Baidu index data are then extracted from the storage table, and feature processing is performed on each according to the feature engineering method of the invention, finally forming a sample set. Each sample in the sample set is the feature vector of one week, and each vector has 125 dimensions. The meaning of each dimension is shown in Table 5.
TABLE 5
[Table image not reproduced: meaning of each of the 125 feature dimensions]
Model tuning tunes the multiple regression model XGBoost and the time series model ARIMA.
For XGBoost, the parameters to be tuned are the maximum tree depth max_depth and the number of decision trees n_estimators. The two parameters are tuned by grid search. The specific steps are as follows:
First, set the initial parameter sets of max_depth and n_estimators to [5, 50, 100] and [10, 50, 100];
second, use five-fold cross-validation to find the parameter combination that minimizes the squared-error loss of the model, and take it as the optimal parameters;
third, reset the search sets, centring each on the optimal value of max_depth or n_estimators and extending one step of size 1 above and below it;
fourth, repeat the cycle with the new search sets until the optimal combination found in a cycle is the same as in the previous cycle.
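The refinement loop above can be sketched in plain Python. This is a minimal illustration rather than the patent's implementation: `score` is a hypothetical callable standing in for the five-fold cross-validated squared-error loss of an XGBoost model, the `max_rounds` cap is an added safeguard, and the toy loss exists only to make the example runnable.

```python
from itertools import product


def refine_grid_search(score, depth_grid, trees_grid, max_rounds=50):
    """Search (max_depth, n_estimators) until the best pair stops changing.

    `score` is a hypothetical callable returning the loss to minimise, e.g.
    the five-fold cross-validated squared error of an XGBoost regressor.
    """
    previous_best = None
    for _ in range(max_rounds):
        # Evaluate every combination in the current grids; keep the lowest loss.
        best = min(product(depth_grid, trees_grid), key=lambda p: score(*p))
        if best == previous_best:
            return best
        previous_best = best
        # Re-centre both grids on the best values with a step of 1 (kept >= 1).
        depth_grid = [v for v in (best[0] - 1, best[0], best[0] + 1) if v >= 1]
        trees_grid = [v for v in (best[1] - 1, best[1], best[1] + 1) if v >= 1]
    return previous_best


# Toy loss with its minimum at max_depth=7, n_estimators=48 (illustration only).
def toy_loss(max_depth, n_estimators):
    return (max_depth - 7) ** 2 + (n_estimators - 48) ** 2


best_params = refine_grid_search(toy_loss, [5, 50, 100], [10, 50, 100])
```

Because each cycle only moves one step away from the previous optimum, the search behaves like a local descent on the grid and can stop at a local minimum of the loss.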
For ARIMA, the parameters to be tuned are p, the number of autoregressive terms; q, the number of moving-average terms; and d, the differencing order. The tuning method is as follows.
First, determine d. Take the ILI proportion data of the last three years and run a stationarity test; if the test passes on the original data, d is set to 0. If the test fails, the data are differenced repeatedly until the stationarity test passes. The final number of differencing passes is d.
Second, determine p and q. Plot the autocorrelation and partial autocorrelation of the data that passed the stationarity test. q is the lag at which the autocorrelation plot cuts off; p is the lag at which the partial autocorrelation plot cuts off.
Based on the tuning results, the model is trained on the last three years of collected data after feature engineering. The model is finally exposed as an interface.
The input of the interface is pre_date, the date of the Monday of the week to be predicted; the output is pre_value, the model's prediction of the ILI proportion for that week.
The historical weekly ILI proportion monitoring module monitors the historical weekly ILI proportion as follows:
first, obtain from the outpatient and influenza-like case database the daily outpatient volume of departments related to respiratory syndrome diseases and the daily outpatient volume of influenza-like cases for the year preceding the week to be predicted;
second, calculate, week by week, the weekly outpatient volume of departments related to respiratory syndrome diseases and the weekly outpatient volume of influenza-like cases over that year;
third, calculate the ILI proportion for each week and display it as a line chart. The calculation formula is as follows.
[Formula image BDA0003137502180000121 not reproduced: weekly ILI proportion]
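The three monitoring steps can be sketched as follows. The sketch assumes that the weekly ILI proportion is the weekly influenza-like case volume divided by the weekly respiratory-department volume (the formula itself is not reproduced in this text) and that weeks follow the ISO calendar; the function name and the sample records are hypothetical.

```python
from collections import defaultdict
from datetime import date


def weekly_ili_proportion(daily_records):
    """Aggregate daily counts by ISO year-week and return the ILI proportion.

    `daily_records` is an iterable of (record_date, mz_num, influenza_num),
    mirroring the RECORD_DATE, MZ_NUM and INFLUENZA_NUM fields of Table 4.
    """
    totals = defaultdict(lambda: [0, 0])
    for record_date, mz_num, influenza_num in daily_records:
        iso = record_date.isocalendar()
        key = (iso[0], iso[1])  # (ISO year, ISO week number)
        totals[key][0] += mz_num
        totals[key][1] += influenza_num
    # Assumed definition: influenza-like case visits / department visits.
    return {
        week: (ili / mz if mz else None)
        for week, (mz, ili) in sorted(totals.items())
    }


records = [
    (date(2021, 1, 4), 200, 10),   # Monday of ISO week 1, 2021
    (date(2021, 1, 5), 300, 15),
    (date(2021, 1, 11), 400, 40),  # Monday of ISO week 2, 2021
]
weekly = weekly_ili_proportion(records)
```

The resulting dictionary, keyed by (year, week), can be passed directly to a plotting library to draw the line chart.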
The ILI proportion of the coming week is predicted as follows.
First, a scheduled program calls the model interface every Monday, passing in the current date;
second, the model automatically accesses the database to fetch the weather data, Baidu index data, and historical outpatient and influenza-like case data required for the prediction;
third, the model processes the fetched data through feature engineering and feeds them into the model to generate a prediction result, which is returned in JSON form;
fourth, each time the model returns a result, the XGBoost prediction result, the final prediction result and the ILI proportion of the previous week are stored in the database for the next calculation.
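The interface call in the steps above can be sketched as below. Only the names pre_date and pre_value come from the description; the helper functions, the JSON layout and the stand-in `predict` callable are assumptions for illustration, and the database access and scheduling are left out.

```python
import json
from datetime import date, timedelta


def monday_of(day):
    """Return the Monday of the week containing `day`."""
    return day - timedelta(days=day.weekday())


def call_model_interface(pre_date, predict):
    """Call a (hypothetical) `predict` function and wrap the result as JSON."""
    pre_value = predict(pre_date)
    return json.dumps({"pre_date": pre_date.isoformat(), "pre_value": pre_value})


# Stand-in model that always predicts 0.05; a real model would first load and
# feature-engineer the weather, Baidu index and outpatient data.
response = call_model_interface(monday_of(date(2021, 6, 30)), lambda d: 0.05)
```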
The weekly growth anomaly module issues early warnings using the historical weekly ILI proportion monitoring results and the future-week ILI proportion prediction results. For the warning on the historical ILI proportion, the week before the week to be predicted is treated as the current week in the weekly growth anomaly calculation; for the warning on the future-week ILI proportion, the week to be predicted is treated as the current week.
The weekly growth of influenza-like cases is the ratio of the number of new influenza-like cases in the current week to the number of new influenza-like cases in the previous week. A value greater than 1 indicates that the growth of influenza-like cases is trending upward; a value less than 1 indicates that it is trending downward; a value equal to 1 indicates normal growth. The weekly growth anomaly module works as follows:
1) Weekly growth x = number of new cases this week / number of new cases last week:
0 ≤ x < 0.9: growth is decreasing;
0.9 ≤ x < 1.1: growth is normal;
x ≥ 1.1: growth is rising.
Based on these weekly growth indexes: no warning is issued for an area whose growth is decreasing; when normal growth persists in a street for 3 weeks, the anomaly analysis reports that the number of cases in that place has increased for 3 consecutive weeks; when the growth of an area is rising, the anomaly analysis reports that the growth in that area is abnormal.
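The banding of the weekly growth value can be sketched as a small function. The thresholds 0.9 and 1.1 come from the description; which band a value of exactly 1.1 falls in, the handling of a zero count in the previous week, and the returned labels are assumptions made for this illustration.

```python
def classify_weekly_growth(new_this_week, new_last_week):
    """Classify x = new cases this week / new cases last week."""
    if new_last_week <= 0:
        return None  # ratio undefined; the description does not cover this case
    x = new_this_week / new_last_week
    if x < 0.9:
        return "decreasing"
    if x < 1.1:
        return "normal"
    return "rising"  # assumed to include x == 1.1
```

For example, 120 new cases after 100 in the previous week gives x = 1.2 and the label "rising", which would trigger the abnormal-growth report for that area.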
The weekly ILI proportion trend prediction method is implemented on the basis of the weekly ILI proportion trend prediction system and comprises the following steps:
S1, data preprocessing: the collected data are cleaned and converted into data suitable for feature engineering. The data requiring preprocessing in the invention fall into three types: outpatient and influenza-like case data, meteorological data, and Baidu index data.
Processing of the outpatient and influenza-like case data has two parts: time data processing and outpatient volume processing. In the time data processing, the week of the year that each day belongs to is extracted as the time feature. In the outpatient volume processing, the daily outpatient volume and the daily influenza-like case outpatient volume are calculated as the outpatient volume and the influenza-like case outpatient volume.
The meteorological data comprise daytime weather, daytime temperature, daytime wind scale, night weather, night temperature and night wind scale. In preprocessing, the collected meteorological data are cleaned. Cleaning includes removing special characters (for example, the unit ℃ in temperature values) and filling missing values (for example, a missing night temperature is filled using the average of the adjacent day and night temperatures before and after it).
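The two cleaning operations can be sketched as follows. This is a simplified illustration: it fills a gap with the mean of the nearest known values in the same series, which is one possible reading of the filling rule above, and the function names and sample values are hypothetical.

```python
def parse_temperature(raw):
    """Strip the unit from a raw string such as '12℃'; return None if empty."""
    if raw is None:
        return None
    text = raw.replace("℃", "").replace("°C", "").strip()
    return float(text) if text else None


def fill_missing(values):
    """Fill each None with the mean of its nearest known neighbours."""
    filled = list(values)
    for i, value in enumerate(values):
        if value is None:
            before = next((v for v in reversed(values[:i]) if v is not None), None)
            after = next((v for v in values[i + 1:] if v is not None), None)
            known = [v for v in (before, after) if v is not None]
            filled[i] = sum(known) / len(known) if known else None
    return filled


night_raw = ["3℃", "", "5℃", "4℃"]
night_clean = fill_missing([parse_temperature(r) for r in night_raw])
```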
The Baidu index data represent how popular searches for influenza-related words are on the internet in the early part of the flu season. The Baidu search index is based on the search volume of internet users on Baidu; taking keywords as the statistical objects, it is calculated as the weighted sum of the search frequency of each keyword in Baidu web search. It represents the search popularity of a keyword on a given day: in general, the higher the popularity, the higher the search index.
The invention uses the Baidu search index of influenza-related words as the Baidu data factor. Preprocessing of these data mainly consists of filling null values: all missing daily values are filled with 0.
S2, feature engineering: feature combination, screening and feature vectorization are performed on the preprocessed data.
Outpatient and influenza-like case feature engineering: further feature engineering is performed on the data based on the preprocessing results. Indexed by week of the year, the procedure calculates the weekly outpatient volume, the weekly ILI, the weekly influenza-like case outpatient volume, the ILI of the last two weeks (counting two weeks back from the current index week) and the ILI of the last three weeks (counting three weeks back from the current index week).
Meteorological feature engineering: based on the preprocessing results, feature engineering for the meteorological data comprises processing of numerical data and processing of categorical data. The numerical data are temperature and wind scale. They are processed as follows: first, indexed by week of the year, the weekly maximum, weekly minimum and weekly average are calculated for three dimensions, namely daytime, night and whole day. Then, also indexed by week of the year, the average temperature of the last three weeks (counting three weeks back from the current index week), the temperature difference over the last three weeks, and the maximum difference between the lowest temperatures of two adjacent days within the week are calculated. The categorical data are daytime weather and night weather. They are processed as follows: first, the weather types are summarized from the crawled data; then, indexed by week, the number of occurrences of each weather type within the week is counted separately for daytime and night.
Baidu search index feature engineering: feature engineering is performed on the Baidu search index data based on the preprocessing results, as follows: first, the weekly average Baidu index of each keyword is calculated per week of the year; each keyword is then standardized along the weekly dimension using the Z-score.
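The Z-score standardization of one keyword's weekly averages can be sketched as below. The description does not say whether the population or the sample standard deviation is used, so the population form is an assumption, as are the function name and the sample values.

```python
from statistics import mean, pstdev


def zscore_weekly(weekly_means):
    """Standardise one keyword's weekly mean index values with the Z-score."""
    mu = mean(weekly_means)
    sigma = pstdev(weekly_means)  # population standard deviation (assumed)
    if sigma == 0:
        return [0.0 for _ in weekly_means]  # constant series: nothing to scale
    return [(v - mu) / sigma for v in weekly_means]


# Weekly mean search index of one hypothetical keyword over four weeks.
scores = zscore_weekly([100.0, 200.0, 300.0, 400.0])
```

Standardizing each keyword separately puts high-volume and low-volume search terms on the same scale before they enter the feature vector.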
S3, prediction with a multiple regression model that incorporates the time factor.
The multiple regression model prediction takes the historical meteorological data, Baidu index data, and outpatient and influenza-like case data processed by feature engineering as input factors and predicts the ILI proportion of the coming week. The algorithm consists mainly of a multiple regression network, a time series network and adaptive weight adjustment; the process is shown in FIG. 3.
First, the data processed by feature engineering are input. These data are divided into parts X1 and X2: X2 is the historical weekly ILI proportion, and X1 is all the other features produced by feature engineering.
Second, X1 and X2 are input together into the multiple regression network XGBoost, which predicts the ILI proportion of the coming week and yields the XGBoost prediction result y_x; each y_x is stored in the XGBoost historical prediction result base.
Third, X2 is input into the time series network ARIMA, which predicts the ILI proportion of the coming week and yields the ARIMA prediction result y_A.
Fourth, the XGBoost historical prediction error variance Q is calculated from the XGBoost historical prediction results and the historical actual weekly ILI proportions.
Fifth, the final historical prediction error variance R is calculated from the algorithm's final historical prediction results and the historical actual weekly ILI proportions.
Sixth, based on y_A, y_x, R and Q, the final prediction result is calculated by the following formula.
[Formula image BDA0003137502180000151 not reproduced: final prediction computed from y_A, y_x, R and Q]
Here, P_{k-1} is the algorithm's final prediction result at the previous time step, and y is the algorithm's final prediction result at the current time step.
The above-described embodiments are merely preferred embodiments of the present invention, and general changes and substitutions by those skilled in the art within the technical scope of the present invention are included in the protection scope of the present invention.

Claims (8)

1. A weekly ILI proportion trend prediction system, characterized in that: the system comprises a data acquisition module, a model development module, a historical weekly ILI proportion monitoring module, a future-week ILI proportion prediction module and a weekly growth anomaly module, wherein the data acquisition module collects Baidu index data, meteorological data, and outpatient and influenza-like case data; the model development module covers data preprocessing, feature engineering, model tuning and model generation; the historical weekly ILI proportion monitoring module monitors the historical weekly ILI proportion; and the weekly growth anomaly module issues early warnings using the historical weekly ILI proportion monitoring results and the future-week ILI proportion prediction results.
2. The weekly ILI proportion trend prediction system of claim 1, wherein: data preprocessing preprocesses the outpatient and influenza-like case data, the meteorological data and the Baidu index data and writes the preprocessed data back to a storage table; feature engineering extracts the last three years of preprocessed outpatient and influenza-like case data, meteorological data and Baidu index data from the storage table and performs feature processing on each; model tuning tunes the multiple regression model XGBoost and the time series model ARIMA; and, based on the tuning results, the model is trained on the last three years of collected data after feature engineering to generate a model.
3. The weekly ILI proportion trend prediction system of claim 2, wherein: the prediction module of the ILI ratio of the future week predicts the ILI ratio of the future week and comprises the following processes:
1) writing a timing program, calling a model interface every Monday, and inputting the date of the day;
2) automatically capturing climate and hectometer index data and historical outpatient and flu sample case data required by prediction from a database;
3) inputting the captured data into a model after the data is processed by feature engineering, generating a prediction result, and returning the prediction result in a json form;
4) and when the model returns a result each time, respectively storing the Xgboost prediction result, the final prediction result and the ILI proportion value of the last week into a database for the next calculation.
4. The weekly ILI proportion trend prediction system of claim 3, wherein: and when the week amplification abnormity early warning is carried out, the week amplification abnormity module calculates the last week of the week to be predicted as the current week in the week amplification abnormity early warning, and calculates the week to be predicted as the current week in the week amplification abnormity early warning.
5. A weekly ILI proportion trend prediction method is characterized by comprising the following steps: the method is implemented based on the weekly ILI proportion trend prediction system of any of claims 1-4, comprising the following steps:
s1, preprocessing data: cleaning the acquired data, and converting the data into data meeting characteristic engineering, wherein the data comprises outpatient service and flu sample data, meteorological data and Baidu index data;
s2, characteristic engineering: performing feature combination, screening and feature vectorization on the preprocessed data;
and S3, and predicting by a multiple regression model fusing time factors.
6. The weekly ILI proportion trend prediction method of claim 5, wherein: the outpatient clinic and flu sample data processing comprises the processing of time data and the processing of outpatient clinic volume data, wherein the processing of the time data is to extract the weeks belonging to the year every day as the time characteristics of the part, and the processing of the outpatient clinic volume data is to calculate the daily outpatient clinic volume and the daily flu clinic volume; the meteorological data comprises day weather, day temperature, day wind level, night weather, night temperature and night wind level; the Baidu index data is the search heat of the influenza related words in the network.
7. The weekly ILI proportion trend prediction method of claim 6, wherein: calculating the weekly clinic amount, the weekly ILI, the weekly influenza sample clinic amount, the ILI amount of nearly two weeks and the ILI amount of nearly three weeks by taking the annual week as an index in clinic and influenza sample data characteristic engineering based on data preprocessing results; the meteorological data characteristic engineering based on the data preprocessing result comprises the processing of numerical data and the processing of classified data; the hundredth index data based on the data preprocessing result respectively calculates the weekly average hundredth index of each keyword in units of yearly weeks, and each keyword is standardized in the weekly dimension by using a Z _ score mode.
8. The weekly ILI proportion trend prediction method of claim 7, wherein: the time factor-fused multivariate regression model prediction takes historical meteorological data, Baidu index data, outpatient service and flu sample data processed by characteristic engineering as input factors, and predicts ILI ratio in one week in the future, including multivariate regression network, time series network and self-adaptive weight adjustment.
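
As an illustration only, the weekly aggregation and Z-score standardization described in the feature engineering claims might be sketched as follows. This is a minimal sketch under stated assumptions, not the applicant's implementation: the function names `year_week`, `weekly_ili_proportion`, and `weekly_zscore` are hypothetical, the week of the year is taken to be the ISO week, the weekly ILI proportion is assumed to be ILI visits divided by total outpatient visits, and the Z-score uses the population standard deviation.

```python
from collections import defaultdict
from datetime import date
from statistics import mean, pstdev


def year_week(d):
    """Return the ISO (year, week) pair, used here as the weekly index (assumption)."""
    iso = d.isocalendar()
    return (iso[0], iso[1])


def weekly_ili_proportion(daily_rows):
    """Aggregate daily counts into a weekly ILI proportion.

    daily_rows: iterable of (date, outpatient_visits, ili_visits).
    Assumes proportion = weekly ILI visits / weekly outpatient visits.
    """
    totals = defaultdict(lambda: [0, 0])
    for d, visits, ili in daily_rows:
        key = year_week(d)
        totals[key][0] += visits
        totals[key][1] += ili
    return {
        key: (ili / visits if visits else 0.0)
        for key, (visits, ili) in sorted(totals.items())
    }


def weekly_zscore(daily_index):
    """Average a search index per keyword and week, then Z-score across weeks.

    daily_index: iterable of (date, keyword, value).
    Returns {keyword: {(year, week): z_score}}.
    """
    buckets = defaultdict(list)
    for d, keyword, value in daily_index:
        buckets[(keyword, year_week(d))].append(value)

    # Weekly average index for each keyword.
    weekly = defaultdict(dict)
    for (keyword, week), values in buckets.items():
        weekly[keyword][week] = mean(values)

    # Standardize each keyword along the weekly dimension.
    result = {}
    for keyword, series in weekly.items():
        values = list(series.values())
        mu = mean(values)
        sigma = pstdev(values)  # population standard deviation (assumption)
        result[keyword] = {
            week: ((v - mu) / sigma if sigma else 0.0)
            for week, v in sorted(series.items())
        }
    return result
```

Guarding against a zero denominator and a zero standard deviation keeps the sketch well defined for weeks with no visits and for keywords whose weekly index is constant; a production pipeline would more likely flag such cases than silently return zero.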
CN202110725434.8A 2021-06-29 2021-06-29 Weekly ILI proportion trend prediction system and method Pending CN113436751A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110725434.8A CN113436751A (en) 2021-06-29 2021-06-29 Weekly ILI proportion trend prediction system and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110725434.8A CN113436751A (en) 2021-06-29 2021-06-29 Weekly ILI proportion trend prediction system and method

Publications (1)

Publication Number Publication Date
CN113436751A true CN113436751A (en) 2021-09-24

Family

ID=77757642

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110725434.8A Pending CN113436751A (en) 2021-06-29 2021-06-29 Weekly ILI proportion trend prediction system and method

Country Status (1)

Country Link
CN (1) CN113436751A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107688872A (en) * 2017-08-20 2018-02-13 平安科技(深圳)有限公司 Forecast model establishes device, method and computer-readable recording medium
CN108766585A (en) * 2018-05-31 2018-11-06 平安科技(深圳)有限公司 Generation method, device and the computer readable storage medium of influenza prediction model
CN109545386A (en) * 2018-11-02 2019-03-29 深圳先进技术研究院 A kind of influenza spatio-temporal prediction method and device based on deep learning
CN111415752A (en) * 2020-03-01 2020-07-14 集美大学 Hand-foot-and-mouth disease prediction method integrating meteorological factors and search indexes
CN111508598A (en) * 2020-05-06 2020-08-07 万达信息股份有限公司 Method for predicting outpatient quantity of respiratory system diseases

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210924
