CN106372240B - Data analysis method and device - Google Patents

Data analysis method and device

Info

Publication number
CN106372240B
Authority
CN
China
Prior art keywords
data
data analysis
request
model
analysis instruction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610827546.3A
Other languages
Chinese (zh)
Other versions
CN106372240A (en)
Inventor
曹玮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Sohu New Media Information Technology Co Ltd
Original Assignee
Beijing Sohu New Media Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Sohu New Media Information Technology Co Ltd
Priority to CN201610827546.3A
Publication of CN106372240A
Application granted
Publication of CN106372240B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2216/00 Indexing scheme relating to additional aspects of information retrieval not explicitly covered by G06F16/00 and subgroups
    • G06F2216/03 Data mining

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Fuzzy Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a data analysis method, which comprises the following steps: responding to the request operation of data analysis, and generating a corresponding data analysis instruction; acquiring initial data corresponding to the data analysis instruction; converting the initial data into model input data according to a conversion mode corresponding to the data analysis instruction; calling a mathematical model algorithm corresponding to the data analysis instruction to operate the model input data to obtain model output data; analyzing a data mining result corresponding to the data analysis instruction from the model output data according to an analysis mode corresponding to the data analysis instruction; feeding back the data mining result; wherein, the conversion mode and the analysis mode are preset. The user does not need to determine the input variable of the data model and interpret the output variable, namely, the user can obtain the data mining result without understanding the data model. In addition, the invention also discloses a data analysis device.

Description

Data analysis method and device
Technical Field
The present invention relates to the field of data processing technologies, and in particular to a method and an apparatus for data analysis.
Background
With the development of information technology, people can query increasingly rich data. When faced with a large amount of data, users therefore increasingly need to analyze it and mine the patterns it reflects. To meet this need, many data analysis software packages have emerged, which provide mathematical model algorithms for data mining. The inventor has found that, with existing data analysis software, the user must supply the input variables of the mathematical model, and the software feeds back the output variables of the mathematical model to the user. The user must therefore understand the mathematical model deeply enough to determine its input variables from a large amount of raw data and to interpret its output variables before a data mining result can be obtained. A user who does not understand the mathematical model can neither determine its input variables nor interpret its output variables, and so cannot obtain a data mining result using existing data analysis software.
Disclosure of Invention
The technical problem to be solved by the present invention is to provide a method and an apparatus for data analysis, so that a user can obtain a data mining result without understanding a data model.
In order to solve the above technical problem, the present invention provides a data analysis method, including:
responding to a request operation for data analysis, and generating a data analysis instruction corresponding to the request operation;
acquiring initial data corresponding to the data analysis instruction;
converting the initial data into model input data according to a conversion mode corresponding to the data analysis instruction, wherein the format of the model input data conforms to the format of an input variable of a mathematical model corresponding to the data analysis instruction;
calling a mathematical model algorithm corresponding to the data analysis instruction to operate on the model input data to obtain model output data;
analyzing a data mining result corresponding to the data analysis instruction from the model output data according to an analysis mode corresponding to the data analysis instruction;
feeding back the data mining result;
wherein the conversion mode and the analysis mode are preset.
Optionally, the conversion mode includes data splicing processing, data type conversion processing, data statistics calculation processing, and/or data format conversion processing.
Optionally, the calling a mathematical model algorithm corresponding to the data analysis instruction to operate on the model input data to obtain model output data includes:
calling the mathematical model algorithm corresponding to the data analysis instruction, and inputting the model input data into the mathematical model algorithm;
receiving data output by the mathematical model algorithm as the model output data.
Optionally, the mathematical model algorithm is specifically an algorithm provided by the R language.
Optionally, the generating a data analysis instruction corresponding to the request operation in response to the request operation for data analysis includes:
responding to the request operation for data analysis, and displaying selectable request categories, wherein the request categories represent the data mining results that can be requested;
in response to an operation of selecting a target request category among the selectable request categories, generating the data analysis instruction based on the target request category.
Optionally, the acquiring initial data corresponding to the data analysis instruction includes:
generating a data query request corresponding to the data analysis instruction and sending the data query request to a database;
receiving data corresponding to the data query request returned by the database as the initial data.
Optionally, the method further includes:
storing request information in correspondence with the data mining result, wherein the request information is used for identifying the request operation.
Optionally, the storing request information in correspondence with the data mining result includes: sending the request information and the data mining result to the database so that the database stores them in correspondence with each other.
Optionally, the feeding back the data mining result includes:
determining a graph type for displaying the data mining result according to the data analysis instruction;
generating and displaying a graph from the data mining result according to the graph type.
In addition, the present invention also provides a data analysis apparatus, comprising:
a generating unit, configured to respond to a request operation for data analysis and generate a data analysis instruction corresponding to the request operation;
an acquisition unit, configured to acquire initial data corresponding to the data analysis instruction;
a conversion unit, configured to convert the initial data into model input data according to a conversion mode corresponding to the data analysis instruction, wherein the format of the model input data conforms to the format of an input variable of a mathematical model corresponding to the data analysis instruction;
an operation unit, configured to call a mathematical model algorithm corresponding to the data analysis instruction to operate on the model input data to obtain model output data;
an analysis unit, configured to analyze a data mining result corresponding to the data analysis instruction from the model output data according to an analysis mode corresponding to the data analysis instruction;
a feedback unit, configured to feed back the data mining result;
wherein the conversion mode and the analysis mode are preset.
Optionally, the conversion mode includes data splicing processing, data type conversion processing, data statistics calculation processing, and/or data format conversion processing.
Optionally, the operation unit includes:
an input unit, configured to call the mathematical model algorithm corresponding to the data analysis instruction and input the model input data into the mathematical model algorithm;
a receiving unit, configured to receive the data output by the mathematical model algorithm as the model output data.
Optionally, the mathematical model algorithm is specifically an algorithm provided by the R language.
Optionally, the generating unit includes:
a display subunit, configured to respond to the request operation for data analysis and display selectable request categories, wherein the request categories represent the data mining results that can be requested;
a generation subunit, configured to generate, in response to an operation of selecting a target request category from the selectable request categories, the data analysis instruction based on the target request category.
Optionally, the acquisition unit includes:
a sending subunit, configured to generate a data query request corresponding to the data analysis instruction and send the data query request to a database;
a receiving subunit, configured to receive, as the initial data, data corresponding to the data query request returned by the database.
Optionally, the apparatus further comprises:
a storage unit, configured to store request information in correspondence with the data mining result, wherein the request information is used for identifying the request operation.
Optionally, the storage unit is specifically configured to send the request information and the data mining result to the database so that the database stores them in correspondence with each other.
Optionally, the feedback unit includes:
a determining subunit, configured to determine a graph type for displaying the data mining result according to the data analysis instruction;
a graph subunit, configured to generate and display a graph from the data mining result according to the graph type.
Compared with the prior art, the invention has the following advantages:
In the embodiment of the invention, a conversion mode and an analysis mode can be preset for each kind of request for a data mining result. After a user triggers a request operation for data analysis, the preset conversion mode and analysis mode can be determined according to the data analysis instruction corresponding to the request operation. The initial data corresponding to the data analysis instruction can then be converted, according to the determined conversion mode, into model input data that can serve as an input variable of the mathematical model, and the data mining result corresponding to the data analysis instruction can be analyzed, according to the determined analysis mode, from the model output data that is the output variable of the mathematical model, and fed back to the user. The user therefore obtains the data mining result without having to determine the input variables of the mathematical model from the raw data or interpret its output variables, that is, even without understanding the mathematical model.
Drawings
In order to illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show only some embodiments described in the present application, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a block diagram of an exemplary application scenario in an embodiment of the present invention;
FIG. 2 is a flow chart illustrating a method of data analysis according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of an apparatus for data analysis according to an embodiment of the present invention.
Detailed Description
In order that the technical solutions of the present application may be better understood, the technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings. Obviously, the described embodiments are only a part of the embodiments of the present application, not all of them. All other embodiments that a person skilled in the art can derive from the embodiments given herein without creative effort shall fall within the protection scope of the present application.
The inventor has found through research that existing data analysis software, for example SPSS, requires the user to understand the mathematical model deeply. When using SPSS, the user must convert a large amount of raw data into input variables that conform to the mathematical model; SPSS then computes the output variables of the mathematical model and feeds them back, and the user must read and interpret those output variables to obtain a data mining result. However, not every user with data mining needs understands the mathematical model deeply. Users who do not understand it find it difficult to convert raw data into conforming input variables and to interpret the output variables, and therefore often cannot obtain data mining results with existing data analysis software.
Based on this, in the embodiment of the present invention, a conversion mode and an analysis mode may be preset for each kind of request for a data mining result. After the user triggers a request operation for data analysis, the preset conversion mode and analysis mode may be determined according to the data analysis instruction corresponding to the request operation. The initial data corresponding to the data analysis instruction may then be converted, according to the determined conversion mode, into model input data that can serve as an input variable of the mathematical model, and the data mining result corresponding to the data analysis instruction may be analyzed, according to the determined analysis mode, from the model output data that is the output variable of the mathematical model, and fed back to the user. The user therefore obtains the data mining result without having to determine the input variables of the mathematical model from the raw data or interpret its output variables, that is, even without understanding the mathematical model.
For example, the embodiment of the present invention may be applied to the scenario shown in FIG. 1. In this scenario, the server 101 and the client 102 can exchange information. A user may initiate a request operation for data analysis to the server 101 through the client 102. In response to the request operation, the server 101 may generate a data analysis instruction corresponding to the request operation and acquire initial data corresponding to the data analysis instruction. According to the conversion mode corresponding to the data analysis instruction, the server 101 may convert the initial data into model input data, where the format of the model input data conforms to the format of an input variable of a mathematical model corresponding to the data analysis instruction. Then, the server 101 may call a mathematical model algorithm corresponding to the data analysis instruction to operate on the model input data, so as to obtain model output data. The server 101 may analyze a data mining result corresponding to the data analysis instruction from the model output data according to an analysis mode corresponding to the data analysis instruction and feed the data mining result back to the client 102, so that the client 102 displays the data mining result to the user. The conversion mode and the analysis mode are preset.
It is to be appreciated that, although the actions of the embodiment of the invention are described above as being performed by the server 101, they may also be performed partly by the client 102 and partly by the server 101, or entirely by the client 102. The embodiment of the present invention does not limit which entity performs the actions, as long as the actions disclosed in the embodiment are performed.
It is to be understood that the above scenario is only one scenario example provided by the embodiment of the present invention, and the embodiment of the present invention is not limited to this scenario.
The following describes in detail a specific implementation of the method and apparatus for data analysis according to the embodiments of the present invention by using embodiments, with reference to the accompanying drawings.
Exemplary method
Referring to fig. 2, a flow chart of a method for data analysis in an embodiment of the present invention is shown. In this embodiment, the method may include, for example, the steps of:
Step 201: respond to a request operation for data analysis, and generate a data analysis instruction corresponding to the request operation.
Step 202: acquire initial data corresponding to the data analysis instruction.
Step 203: convert the initial data into model input data according to a conversion mode corresponding to the data analysis instruction, wherein the format of the model input data conforms to the format of an input variable of a mathematical model corresponding to the data analysis instruction.
Step 204: call a mathematical model algorithm corresponding to the data analysis instruction to operate on the model input data to obtain model output data.
Step 205: analyze a data mining result corresponding to the data analysis instruction from the model output data according to an analysis mode corresponding to the data analysis instruction.
Step 206: feed back the data mining result.
The conversion mode and the analysis mode are preset.
In a specific implementation, the conversion modes and analysis modes corresponding to the various data analysis instructions can be preset in the server. When a user needs to do data mining, the user performs the request operation for data analysis on the client. The client sends the information related to the request operation to the server, so that the server can respond to the request operation and generate the corresponding data analysis instruction. The server then converts the initial data into model input data conforming to the input variable format of the mathematical model, using the conversion mode preset for the data analysis instruction, so that the model input data can be input to the mathematical model algorithm corresponding to the data analysis instruction for operation. For the model output data output by the mathematical model algorithm, the server analyzes the data mining result corresponding to the data analysis instruction from the model output data, using the analysis mode preset for the data analysis instruction. The server feeds back the data mining result, and the user thus obtains the required data mining result.
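Steps 201 to 206 can be sketched as a lookup of preset functions followed by a fixed pipeline. The sketch below is illustrative only: the names `AnalysisPreset`, `PRESETS` and `handle_request`, the instruction name, and the sample data are assumptions and do not come from the patent.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, Dict, List


@dataclass
class AnalysisPreset:
    """Hypothetical bundle of everything preset for one data analysis instruction."""
    fetch: Callable[[], List[str]]                     # step 202: acquire initial data
    convert: Callable[[List[str]], List[float]]        # step 203: preset conversion mode
    model: Callable[[List[float]], Dict[str, float]]   # step 204: model algorithm
    parse: Callable[[Dict[str, float]], str]           # step 205: preset analysis mode


# Presets are configured ahead of time, keyed by instruction name (assumed name).
PRESETS: Dict[str, AnalysisPreset] = {
    "average_daily_visits": AnalysisPreset(
        fetch=lambda: ["120", "135", "150"],            # stand-in for a database query
        convert=lambda rows: [float(r) for r in rows],
        model=lambda xs: {"mean": mean(xs), "n": float(len(xs))},
        parse=lambda out: f"Average daily visits: {out['mean']:.1f}",
    )
}


def handle_request(request_operation: str) -> str:
    """Run steps 201 to 206 for one request operation."""
    instruction = request_operation.strip().lower()     # step 201
    preset = PRESETS[instruction]
    initial_data = preset.fetch()                       # step 202
    model_input = preset.convert(initial_data)          # step 203
    model_output = preset.model(model_input)            # step 204
    result = preset.parse(model_output)                 # step 205
    return result                                       # step 206: fed back to the caller
```

The user only supplies the request; the conversion and parsing functions are chosen by the instruction, which is the point of presetting them.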
To make it easier for the user to trigger the data analysis process, the server may offer the user a plurality of request categories. The user triggers the request operation by selecting a target category from the categories offered, and the server generates the data analysis instruction of the target category in response. Specifically, in some implementations of this embodiment, step 201 may include, for example: responding to a request operation for data analysis, and displaying selectable request categories, wherein the request categories represent the data mining results that can be requested; and, in response to an operation of selecting a target request category among the selectable request categories, generating the data analysis instruction based on the target request category.
For example, the request category may be a query condition, and the server may set a menu containing the query condition to show selectable request categories, so that the user may select a certain query condition as a target query condition from the selectable query conditions by pulling down the menu, and the server generates the data analysis instruction based on the target query condition.
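The menu-based variant of step 201 might look like the following sketch. The category names and the `DataAnalysisInstruction` type are hypothetical; the patent does not specify how an instruction is represented.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical menu entries; the real categories are whatever the server presets.
REQUEST_CATEGORIES: List[str] = [
    "six-number summary",
    "linear regression",
    "7-day forecast",
]


@dataclass(frozen=True)
class DataAnalysisInstruction:
    """Assumed representation: the instruction simply records the chosen category."""
    category: str


def show_request_categories() -> List[str]:
    """Return the selectable request categories to display in the menu."""
    return list(REQUEST_CATEGORIES)


def generate_instruction(selected_index: int) -> DataAnalysisInstruction:
    """Generate the instruction for the menu entry the user selected."""
    options = show_request_categories()
    if not 0 <= selected_index < len(options):
        raise ValueError("selection is outside the displayed menu")
    return DataAnalysisInstruction(category=options[selected_index])
```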
It can be understood that the user need not enter the initial data manually, which simplifies operation, saves time and improves the efficiency of data analysis. To this end, the initial data required in the data analysis process may be stored in a database, and the server may query the database according to the data analysis instruction to acquire the corresponding initial data. Specifically, in some implementations of this embodiment, step 202 may include, for example: generating a data query request corresponding to the data analysis instruction and sending the data query request to a database; and receiving data corresponding to the data query request returned by the database as the initial data. Of course, the initial data may alternatively be entered manually by the user.
It is understood that, in the data analysis process, the acquired initial data is generally not yet in the form required by the mathematical model algorithm, that is, the initial data needs to be converted into the input data required by the mathematical model. In this embodiment, a variety of conversion modes may be used to convert the initial data into model input data.
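A minimal sketch of the database variant of step 202, using Python's built-in `sqlite3` module as a stand-in for whatever database the server uses. The table, the column names, the instruction name and the query text are all assumptions made for illustration.

```python
import sqlite3
from typing import Dict, List, Tuple

# Hypothetical mapping from instruction name to its preset data query request.
QUERY_FOR_INSTRUCTION: Dict[str, str] = {
    "daily_visits_summary": "SELECT day, visits FROM daily_visits ORDER BY day",
}


def make_demo_database() -> sqlite3.Connection:
    """Create an in-memory database with a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE daily_visits (day TEXT, visits INTEGER)")
    conn.executemany(
        "INSERT INTO daily_visits VALUES (?, ?)",
        [("2016-09-01", 120), ("2016-09-02", 135)],
    )
    return conn


def acquire_initial_data(
    conn: sqlite3.Connection, instruction: str
) -> List[Tuple[str, int]]:
    """Send the query that corresponds to the instruction and return the rows."""
    query = QUERY_FOR_INSTRUCTION[instruction]
    return conn.execute(query).fetchall()
```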
For example, in some embodiments, the conversion mode may include data splicing processing. Data splicing processing refers to splicing spatially adjacent data into complete target data.
As another example, in some embodiments, the conversion mode may include data type conversion processing. Data type conversion processing refers to converting the type of the initial data into the input data type required by the model.
As another example, in some embodiments, the conversion mode may include data statistics calculation processing. Data statistics calculation processing refers to performing statistical calculation on the initial data.
As yet another example, in some embodiments, the conversion mode may include data format conversion processing. Data format conversion processing refers to converting the format of the initial data into the input data format required by the model. Specifically, the server may convert the format of the initial data into a format required by the R language.
It should be noted that data splicing processing, data type conversion processing, data statistics calculation processing and data format conversion processing may be combined to form the conversion mode corresponding to the data analysis instruction. For example, the conversion mode corresponding to the data analysis instruction may be any one of these four processing modes, or a combination of any two, any three, or all four of them.
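One way to picture a combined conversion mode is as a chain of small functions, one per processing kind. This is only a sketch: the function names are invented, mean-centering is an arbitrary stand-in for "statistics calculation", and rendering an R vector literal `c(...)` is just one assumed example of a format required by the R language.

```python
from statistics import mean
from typing import Any, Callable, List

Step = Callable[[Any], Any]


def splice(chunks: List[List[str]]) -> List[str]:
    """Data splicing: join adjacent chunks into one complete sequence."""
    return [value for chunk in chunks for value in chunk]


def to_float(values: List[str]) -> List[float]:
    """Data type conversion: text to the numeric type the model expects."""
    return [float(v) for v in values]


def center(values: List[float]) -> List[float]:
    """Data statistics calculation (illustrative): subtract the mean."""
    m = mean(values)
    return [v - m for v in values]


def to_r_vector(values: List[float]) -> str:
    """Data format conversion: render the values as an R vector literal."""
    return "c(" + ", ".join(repr(v) for v in values) + ")"


def make_conversion_mode(*steps: Step) -> Step:
    """Combine any number of processing steps into one preset conversion mode."""
    def convert(data: Any) -> Any:
        for step in steps:
            data = step(data)
        return data
    return convert
```

For instance, `make_conversion_mode(splice, to_float, center, to_r_vector)` combines all four kinds, while `make_conversion_mode(to_float)` uses only one.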
It is understood that in step 204, the server operates by calling a mathematical model algorithm preset for the data analysis instruction. Specifically, in some implementations of this embodiment, step 204 may include: calling a mathematical model algorithm corresponding to the data analysis instruction, and inputting the model input data into the mathematical model algorithm; and receiving data output by the mathematical model algorithm as the model output data. Wherein the data output by the mathematical model algorithm can be returned to the server in the form of a data stream.
It will be appreciated that the mathematical model algorithm may be provided by data analysis software. In particular, in some implementations of the present embodiment, the mathematical model algorithm may be provided by, for example, the R language. R is free, open-source software belonging to the GNU system; it is a language and operating environment for statistical analysis and plotting, and an excellent tool for statistical computation and statistical graphics.
In this embodiment, the R language provides a variety of mathematical model algorithms that may be called, including, for example, a six-number data summary, a six-number data comparison, a linear regression model, time series analysis with the HoltWinters model, time series analysis with the ARIMA model, a classical hypothesis test, and the like.
For example, in some embodiments, the six-number data summary is called to perform the operation. Specifically, the maximum value, the minimum value, the median, the average value, the upper quartile and the lower quartile of a set of input data are calculated, so that the data are described in the simplest statistical manner by these six summary numbers. The median is the middle number after the data are arranged from small to large; if there is an even number of data points, the average of the two middle values is taken as the median. The upper quartile, also called the larger quartile, is the value at the 75% position after all data are arranged from small to large, that is, 25% of the data are larger than the upper quartile and 75% are smaller. The lower quartile, also called the smaller quartile, is the value at the 25% position after all data are arranged from small to large, that is, 75% of the data are larger than the lower quartile and 25% are smaller.
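The six summary numbers can be computed with Python's standard `statistics` module, as in the sketch below. The patent calls R for this; the Python version and the choice of the "inclusive" quartile method are assumptions made for illustration, and other quartile conventions give slightly different values on small samples.

```python
from statistics import mean, median, quantiles
from typing import Dict, Sequence


def six_number_summary(data: Sequence[float]) -> Dict[str, float]:
    """Return min, lower quartile, median, mean, upper quartile and max."""
    if len(data) < 2:
        raise ValueError("at least two data points are needed for quartiles")
    # quantiles() sorts the data itself; n=4 yields the three quartile cut points.
    lower_q, _, upper_q = quantiles(data, n=4, method="inclusive")
    return {
        "min": min(data),
        "lower_quartile": lower_q,
        "median": median(data),
        "mean": mean(data),
        "upper_quartile": upper_q,
        "max": max(data),
    }
```

For the values 1 through 9 this gives a lower quartile of 3, a median and mean of 5, and an upper quartile of 7.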
As another example, in some embodiments, the six-number data comparison is called to perform the operation. Specifically, the six summary numbers of two sets of input data are compared, the normalized variance of each of the two sets is calculated, and the degree of dispersion of the two sets is then compared by means of the variances. This operation involves two concepts, normalization and variance. Normalization is a way of simplifying calculation in which a dimensional expression is converted into a dimensionless one, yielding a pure scalar. Variance is the mean of the squared differences between each data point and the mean of the data.
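The dispersion comparison could be sketched as follows. The patent does not say which normalization is used, so min-max scaling to the range 0 to 1 is an assumption here, as are the function names; the variance is the population variance described above.

```python
from statistics import pvariance
from typing import List, Sequence


def min_max_normalize(data: Sequence[float]) -> List[float]:
    """Assumed normalization: rescale to the dimensionless range [0, 1]."""
    lo, hi = min(data), max(data)
    if hi == lo:
        return [0.0 for _ in data]   # constant data has no spread
    return [(v - lo) / (hi - lo) for v in data]


def normalized_variance(data: Sequence[float]) -> float:
    """Population variance (mean squared deviation) of the normalized data."""
    return pvariance(min_max_normalize(data))


def more_dispersed(a: Sequence[float], b: Sequence[float]) -> str:
    """Report which of two data sets is more dispersed after normalization."""
    va, vb = normalized_variance(a), normalized_variance(b)
    if abs(va - vb) < 1e-12:
        return "equal"
    return "first" if va > vb else "second"
```

Because the data are normalized first, two sets measured in different units, such as 1 to 5 and 10 to 50 in equal steps, come out equally dispersed.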
As another example, in some embodiments, linear regression is called to perform the operation. Specifically, a linear regression model is generated from two sets of input data through the R language, whether the two sets have a significant linear relationship is analyzed, and a linear regression equation, a goodness of fit and a model P value are obtained. Linear regression is a regression analysis that models the relationship between one or more independent variables and a dependent variable by least squares fitting of a linear regression equation. The goodness of fit refers to how well the regression line fits the data. The model P value is the probability of obtaining a result at least as extreme as the one obtained from the sample if the assumed slope and intercept values were true.
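For a single independent variable, the least squares fit and the goodness of fit can be written out directly, as in this sketch. It is a plain-Python illustration rather than the R call the patent relies on, and it deliberately omits the model P value, which requires the t distribution and is not available in the standard library. The function name `fit_line` is an assumption.

```python
from typing import Sequence, Tuple


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float, float]:
    """Ordinary least squares for y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0:
        raise ValueError("x values are all identical; slope is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Goodness of fit: share of the variation in y explained by the line.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1.0 if ss_tot == 0 else 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared
```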
As another example, in some embodiments, the time series HoltWinters model or the time series ARIMA model is called to perform the operation. Specifically, historical time data is analyzed to predict the values for the next 7 days. A time series is formed by arranging the values of a statistical index of a phenomenon at different times in chronological order. The mathematical method applied by the HoltWinters model is exponential smoothing with the influence of trend and season added; exponential smoothing is a special weighted moving average method. The ARIMA (autoregressive integrated moving average) model is built by converting a non-stationary time series into a stationary one and then regressing the dependent variable only on its own lagged values and on the current and lagged values of a random error term. In addition, the upper and lower bounds of the 80% and 95% confidence intervals can also be obtained; a confidence interval is an estimated interval for a population parameter constructed from sample statistics. It should be noted that the model operations may be combined according to the data analysis instruction.
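The flavor of the exponential smoothing forecast can be shown with a simplified sketch that keeps a level and a trend but leaves out the seasonal component and the confidence intervals. This is Holt's linear method, not the full HoltWinters model that the patent calls through R; the smoothing parameters `alpha` and `beta` and the initialization are assumptions.

```python
from typing import List, Sequence


def holt_forecast(
    series: Sequence[float],
    horizon: int = 7,
    alpha: float = 0.5,   # assumed smoothing weight for the level
    beta: float = 0.5,    # assumed smoothing weight for the trend
) -> List[float]:
    """Forecast `horizon` future values with level-plus-trend smoothing."""
    if len(series) < 2:
        raise ValueError("at least two observations are needed")
    level = series[0]
    trend = series[1] - series[0]
    for value in series[1:]:
        previous_level = level
        # New level: weighted average of the observation and the prior forecast.
        level = alpha * value + (1 - alpha) * (previous_level + trend)
        # New trend: weighted average of the latest level change and the old trend.
        trend = beta * (level - previous_level) + (1 - beta) * trend
    return [level + step * trend for step in range(1, horizon + 1)]
```

On a perfectly linear history such as 1, 2, ..., 10 the method continues the line, giving 11 through 17 for the next 7 days.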
It will be appreciated that the model output data includes a large amount of specialized data describing mathematical results. However, the user is often not concerned with the mathematical results, or may not understand them, and usually wants the data mining result that they reflect. Therefore, so that the user can directly obtain the required data mining result, in step 205 the data mining result corresponding to the data analysis instruction may be extracted from the model output data using an analysis manner preset for that data analysis instruction. The analysis manner may be, for example, interception processing, splitting processing, or the like.
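A preset analysis manner of this kind can be sketched as a lookup from the data analysis instruction to a small extraction function. The output layout below (a CSV text with point forecasts and interval bounds), the instruction name `forecast_7d`, and the function names are all hypothetical; the patent does not specify the format of the model output data.

```python
import csv
import io

# Hypothetical model output: point forecast plus 80% / 95% interval bounds.
MODEL_OUTPUT = """day,point,lo80,hi80,lo95,hi95
1,1200.5,1100.0,1301.0,1050.2,1350.8
2,1210.0,1095.5,1324.5,1040.0,1380.0
"""


def parse_point_forecast(model_output_text):
    """Split the output into rows and keep only the point forecast column."""
    rows = csv.DictReader(io.StringIO(model_output_text))
    return [float(row["point"]) for row in rows]


# One preset analysis manner per data analysis instruction (assumed names).
PARSERS = {"forecast_7d": parse_point_forecast}


def parse_result(instruction, model_output_text):
    """Extract the data mining result that the instruction asks for."""
    return PARSERS[instruction](model_output_text)


print(parse_result("forecast_7d", MODEL_OUTPUT))
```

The user receives only the forecast values; the interval bounds remain in the model output data and are not shown unless an instruction asks for them.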
It should be noted that, to meet user needs, in some implementations of this embodiment, after the data mining result corresponding to the data analysis instruction is obtained, the data mining result and the request information may be stored in association with each other, so that the user can later query his or her historical tasks. Specifically, after step 205, this embodiment may further include, for example: storing the request information and the data mining result in association with each other, where the request information identifies the request operation.
Specifically, in some implementations of this embodiment, the step of storing the request information in association with the data mining result may include, for example: sending the request information and the data mining result together to the database, so that the database stores them in association with each other.
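As a rough illustration of storing request information together with its data mining result, the sketch below uses an in-memory SQLite database. The table name `mining_history`, its columns, and the request identifier format are assumptions; the patent does not name a database product or schema.

```python
import json
import sqlite3


def save_result(conn, request_id, instruction, result):
    """Store the request information and its data mining result in one row."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS mining_history ("
        "request_id TEXT PRIMARY KEY, instruction TEXT NOT NULL, "
        "result_json TEXT NOT NULL)"
    )
    conn.execute(
        "INSERT INTO mining_history VALUES (?, ?, ?)",
        (request_id, instruction, json.dumps(result)),
    )
    conn.commit()


def load_result(conn, request_id):
    """Look up a historical task by its request identifier."""
    row = conn.execute(
        "SELECT result_json FROM mining_history WHERE request_id = ?",
        (request_id,),
    ).fetchone()
    return json.loads(row[0]) if row else None


conn = sqlite3.connect(":memory:")
save_result(conn, "req-001", "forecast_7d", [1200.5, 1210.0])
print(load_result(conn, "req-001"))
```

Keying the row on the request identifier is what lets a user retrieve an earlier task without rerunning the model.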
It can be understood that, when the server feeds back the data mining result to the client, the server may, to make the result easier for the user to understand, select a display mode for each data mining result according to the data analysis instruction, and thereby visualize different data mining results through different graphs. Specifically, in some implementations of this embodiment, step 206 may include, for example: determining, according to the data analysis instruction, a graph type for displaying the data mining result; and generating and displaying a graph from the data mining result according to the graph type.
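One simple way to determine the graph type from the data analysis instruction is a preset mapping, as in the sketch below. The instruction names, graph type names, and the shape of the chart specification are assumptions for illustration; actual rendering would be done by whatever charting component the client uses.

```python
# Assumed mapping from data analysis instruction to graph type.
CHART_TYPES = {
    "forecast_7d": "line",
    "linear_regression": "scatter",
}


def choose_chart_type(instruction, default="table"):
    """Pick the graph type preset for the instruction, or a fallback."""
    return CHART_TYPES.get(instruction, default)


def build_chart_spec(instruction, result):
    """Describe the graph to draw; rendering is left to the client."""
    return {
        "type": choose_chart_type(instruction),
        "x": list(range(1, len(result) + 1)),
        "y": list(result),
    }


print(build_chart_spec("forecast_7d", [1200.5, 1210.0, 1225.3]))
```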
The method of this embodiment can be applied to various scenarios. For example, in one exemplary application scenario, the data mining result that the user wants is the predicted play count for the next 7 days, obtained by analyzing the actual play count of category A videos on a video website over one month. The user sends a request operation for data analysis to the server through the client. In response, the server displays the selectable request categories, which represent the data mining results that can be requested, through a menu containing query conditions. The user selects "predict 7 days" and "analyze one month" from the selectable query conditions as the target query conditions, and the server generates the data analysis instruction based on them. The server then generates a query request for the actual play count over one month corresponding to the data analysis instruction and sends it to the database; after receiving the query request, the database returns the corresponding data, which the server receives as the initial data.
The initial data may be the hourly video play count over the month. Using the conversion manner corresponding to the data analysis instruction, the server converts the hourly play count into the daily play count required by the mathematical model provided by the R language. The server then invokes the time series ARIMA model algorithm corresponding to the data analysis instruction and inputs the model input data into it. The output data may include the analysis result for the one-month play count, the predicted play count for the next 7 days, and the upper and lower bounds of the 80% and 95% confidence intervals for that prediction. The output data is returned to the server as a data stream, and the server receives it as the model output data. Using the analysis manner preset for the data analysis instruction, the server splits and intercepts the model output data to extract the data mining result corresponding to the instruction, namely the predicted play count for the next 7 days, and feeds it back to the client. The server may also send the request information together with the data mining result to the database, so that the database stores them in association with each other. Finally, according to the data analysis instruction, the server determines that the data mining result is to be displayed as a trend curve, generates the trend curve from the data mining result, and displays it on the client.
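The hourly-to-daily conversion in this scenario is an instance of data statistics calculation processing. A minimal sketch follows, assuming the initial data arrives as `(timestamp, plays)` pairs with timestamps formatted as `YYYY-MM-DD HH:MM`; both the row shape and the function name `hourly_to_daily` are assumptions.

```python
from collections import defaultdict
from datetime import datetime


def hourly_to_daily(hourly_rows):
    """Sum hourly play counts into one total per calendar day.

    Returns the daily totals in chronological order, which is the shape a
    time series model expects as input.
    """
    totals = defaultdict(int)
    for timestamp, plays in hourly_rows:
        day = datetime.strptime(timestamp, "%Y-%m-%d %H:%M").date()
        totals[day] += plays
    return [totals[day] for day in sorted(totals)]


# Made-up hourly rows, for illustration only.
rows = [
    ("2016-09-01 00:00", 10),
    ("2016-09-01 01:00", 5),
    ("2016-09-02 00:00", 7),
]
print(hourly_to_daily(rows))
```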
Through the various implementations provided by this embodiment, a conversion manner and an analysis manner can be preset for each request for a data mining result. After a user triggers a request operation for data analysis, the preset conversion manner and analysis manner are determined according to the data analysis instruction corresponding to the request operation. The initial data corresponding to the data analysis instruction is then converted, according to the determined conversion manner, into model input data that can serve as the input variables of the mathematical model, and the data mining result corresponding to the data analysis instruction is extracted, according to the determined analysis manner, from the model output data produced as the output variables of the mathematical model and fed back to the user. The user therefore does not need to derive the input variables of the mathematical model from the raw data or interpret its output variables, and can obtain the data mining result even without understanding the mathematical model.
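The overall flow summarized above (preset conversion manner, model invocation, preset analysis manner, all selected by the data analysis instruction) can be sketched as a small registry. The class name `AnalysisSpec`, the instruction name `mean_daily_plays`, and the trivial averaging "model" are hypothetical stand-ins; in the patent the model is an algorithm provided by the R language.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class AnalysisSpec:
    """Preset steps bound to one data analysis instruction."""
    convert: Callable[[Any], Any]  # initial data -> model input data
    model: Callable[[Any], Any]    # model input data -> model output data
    parse: Callable[[Any], Any]    # model output data -> data mining result


def run_analysis(instruction, initial_data, registry):
    """Look up the presets for the instruction and run them in order."""
    spec = registry[instruction]
    model_input = spec.convert(initial_data)
    model_output = spec.model(model_input)
    return spec.parse(model_output)


# Stand-in registry with a trivial "model" that averages its input.
REGISTRY = {
    "mean_daily_plays": AnalysisSpec(
        convert=lambda rows: [float(v) for v in rows],
        model=lambda xs: {"mean": sum(xs) / len(xs), "n": len(xs)},
        parse=lambda out: out["mean"],
    )
}

print(run_analysis("mean_daily_plays", ["10", "20", "30"], REGISTRY))
```

Because the caller supplies only an instruction and raw data, it never has to know what input format the model needs or how to read its output, which is the benefit the paragraph above describes.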
Exemplary device
Referring to fig. 3, a schematic structural diagram of an apparatus for data analysis according to an embodiment of the present invention is shown. In this embodiment, the apparatus may specifically include:
a generating unit 301, configured to generate, in response to a request operation for data analysis, a data analysis instruction corresponding to the request operation;
an obtaining unit 302, configured to obtain initial data corresponding to the data analysis instruction;
a conversion unit 303, configured to convert the initial data into model input data according to a conversion manner corresponding to the data analysis instruction, where the format of the model input data conforms to the format of an input variable of a mathematical model corresponding to the data analysis instruction;
an operation unit 304, configured to invoke a mathematical model algorithm corresponding to the data analysis instruction to operate on the model input data, so as to obtain model output data;
an analyzing unit 305, configured to screen out a data mining result corresponding to the data analysis instruction from the model output data according to an analysis manner corresponding to the data analysis instruction;
a feedback unit 306, configured to feed back the data mining result;
wherein the conversion manner and the analysis manner are preset.
Optionally, in some implementations of this embodiment, the generating unit 301 may include, for example:
a display subunit, configured to display selectable request categories in response to the request operation for data analysis, where the request categories represent the data mining results that can be requested;
a generation subunit, configured to generate the data analysis instruction based on a target request category in response to an operation of selecting the target request category from the selectable request categories.
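The two subunits can be pictured as a pair of functions: one lists the selectable request categories, the other turns the selected category into a data analysis instruction. The category identifiers, labels, and the dictionary shape of the instruction below are assumptions made for illustration.

```python
# Assumed request categories; each names a data mining result a user can ask for.
REQUEST_CATEGORIES = {
    "predict_7_days": "Predict the next 7 days",
    "linear_relationship": "Test for a linear relationship",
}


def list_categories():
    """What a display subunit would show: (identifier, label) pairs."""
    return sorted(REQUEST_CATEGORIES.items())


def generate_instruction(category, conditions):
    """Build a data analysis instruction from the selected category."""
    if category not in REQUEST_CATEGORIES:
        raise ValueError(f"unknown request category: {category}")
    return {"category": category, "conditions": dict(conditions)}


print(list_categories())
print(generate_instruction("predict_7_days", {"window_days": 30}))
```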
Optionally, in some implementations of this embodiment, the obtaining unit 302 may include, for example:
a sending subunit, configured to generate a data query request corresponding to the data analysis instruction and send the data query request to a database;
a receiving subunit, configured to receive, as the initial data, the data returned by the database in response to the data query request.
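Generating a data query request from a data analysis instruction might look like the following sketch, which builds a parameterized SQL query against an in-memory SQLite table. The table `play_counts`, its columns, and the keys inside the instruction are hypothetical; the patent does not describe the database schema or query language.

```python
import sqlite3


def fetch_initial_data(conn, instruction):
    """Turn the instruction's conditions into a parameterized query and
    return the matching rows as the initial data."""
    cond = instruction["conditions"]
    cursor = conn.execute(
        "SELECT ts, plays FROM play_counts "
        "WHERE category = ? AND ts >= ? AND ts < ? ORDER BY ts",
        (cond["video_category"], cond["start"], cond["end"]),
    )
    return cursor.fetchall()


# Made-up table contents, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE play_counts (ts TEXT, category TEXT, plays INTEGER)")
conn.executemany(
    "INSERT INTO play_counts VALUES (?, ?, ?)",
    [
        ("2016-08-31 23:00", "A", 4),
        ("2016-09-01 00:00", "A", 10),
        ("2016-09-01 01:00", "B", 99),
        ("2016-09-01 02:00", "A", 5),
    ],
)
instruction = {
    "conditions": {"video_category": "A", "start": "2016-09-01", "end": "2016-10-01"}
}
rows = fetch_initial_data(conn, instruction)
print(rows)
```

Using query parameters rather than string concatenation keeps user-selected conditions from being interpreted as SQL.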
Optionally, in some implementations of this embodiment, the conversion manner includes data splicing processing, data type conversion processing, data statistics calculation processing, and/or data format conversion processing.
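The four kinds of conversion listed here can each be illustrated with a one-line helper. The helper names and the comma-separated target format are assumptions; the patent does not define the concrete operations or the model's input format.

```python
def splice(*parts):
    """Data splicing: join several partial result lists into one."""
    return [item for part in parts for item in part]


def to_numbers(values):
    """Data type conversion: strings from the database to floats."""
    return [float(v) for v in values]


def summarize(values, size):
    """Data statistics calculation: sum each consecutive group of `size`."""
    return [sum(values[i:i + size]) for i in range(0, len(values), size)]


def to_model_format(values):
    """Data format conversion: a comma-separated vector literal (assumed)."""
    return ",".join(f"{v:g}" for v in values)


raw = splice(["1", "2"], ["3", "4"])
numbers = to_numbers(raw)
grouped = summarize(numbers, 2)
print(to_model_format(grouped))
```

A conversion manner for a given data analysis instruction would typically chain only the steps that instruction needs.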
Optionally, in some implementations of this embodiment, the operation unit 304 may include, for example:
an input unit, configured to invoke the mathematical model algorithm corresponding to the data analysis instruction and input the model input data into the mathematical model algorithm;
a receiving unit, configured to receive the data output by the mathematical model algorithm as the model output data.
Optionally, in some implementations of this embodiment, the mathematical model algorithm may specifically be, for example, an algorithm provided by the R language.
Optionally, in still other implementations of this embodiment, the apparatus may further include:
a storage unit, configured to store request information and the data mining result in association with each other, where the request information identifies the request operation.
Optionally, the storage unit is specifically configured to send the request information and the data mining result together to the database, so that the database stores them in association with each other.
Optionally, in some implementations of this embodiment, the feedback unit 306 may include, for example:
a determining subunit, configured to determine, according to the data analysis instruction, the graph type for displaying the data mining result;
an image subunit, configured to generate and display a graph from the data mining result according to the graph type.
Through the various implementations provided by this embodiment, a conversion manner and an analysis manner can be preset for each request for a data mining result. After a user triggers a request operation for data analysis, the preset conversion manner and analysis manner are determined according to the data analysis instruction corresponding to the request operation. The initial data corresponding to the data analysis instruction is then converted, according to the determined conversion manner, into model input data that can serve as the input variables of the mathematical model, and the data mining result corresponding to the data analysis instruction is extracted, according to the determined analysis manner, from the model output data produced as the output variables of the mathematical model and fed back to the user. The user therefore does not need to derive the input variables of the mathematical model from the raw data or interpret its output variables, and can obtain the data mining result even without understanding the mathematical model.
It is noted that, herein, relational terms such as "first" and "second" may be used solely to distinguish one entity or action from another, without necessarily requiring or implying any actual relationship or order between such entities or actions. The terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
Since the device embodiments substantially correspond to the method embodiments, reference may be made to the relevant parts of the description of the method embodiments. The device embodiments described above are merely illustrative: the units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
The foregoing describes embodiments of the present application. It should be noted that those skilled in the art may make numerous modifications and adaptations without departing from the principles of the present application, and such modifications and adaptations are intended to fall within the scope of the present application.

Claims (8)

1. A method of data analysis, comprising:
in response to a request operation for data analysis, displaying selectable request categories, wherein the request categories are used to represent the data mining results to be requested;
in response to an operation of selecting a target request category from the selectable request categories, generating a data analysis instruction based on the target request category;
acquiring initial data corresponding to the data analysis instruction;
converting the initial data into model input data according to a conversion mode corresponding to the data analysis instruction, wherein the format of the model input data conforms to the format of an input variable of a mathematical model corresponding to the data analysis instruction;
calling a mathematical model algorithm corresponding to the data analysis instruction to operate on the model input data to obtain model output data;
analyzing a data mining result corresponding to the data analysis instruction from the model output data according to an analysis mode corresponding to the data analysis instruction;
feeding back the data mining result;
wherein the conversion mode and the analysis mode are preset;
the conversion mode comprises data splicing processing, data type conversion processing, data statistics calculation processing and/or data format conversion processing.
2. The method of claim 1, wherein invoking a mathematical model algorithm corresponding to the data analysis instruction to operate on the model input data to obtain model output data comprises:
calling a mathematical model algorithm corresponding to the data analysis instruction, and inputting the model input data into the mathematical model algorithm;
receiving data output by the mathematical model algorithm as the model output data.
3. The method according to claim 1, characterized in that the mathematical model algorithm is specifically an algorithm provided by the R language.
4. The method of claim 1, wherein the acquiring initial data corresponding to the data analysis instruction comprises:
generating a data query request corresponding to the data analysis instruction and sending the data query request to a database;
receiving data corresponding to the data query request returned by the database as the initial data.
5. The method of claim 1, further comprising:
correspondingly storing request information and the data mining result, wherein the request information is used for identifying the request operation.
6. The method according to claim 5, wherein the correspondingly storing the request information and the data mining result comprises: sending the request information and the data mining result correspondingly to the database, so that the database correspondingly stores the request information and the data mining result.
7. The method of claim 1, wherein the feeding back the data mining result comprises:
determining a graph type for displaying the data mining result according to the data analysis instruction;
generating and displaying a graph from the data mining result according to the graph type.
8. An apparatus for data analysis, comprising:
a generating unit, configured to display selectable request categories in response to a request operation for data analysis, wherein the request categories are used to represent the data mining results to be requested; and, in response to an operation of selecting a target request category from the selectable request categories, to generate a data analysis instruction based on the target request category;
a conversion unit, configured to convert the initial data into model input data according to a conversion mode corresponding to the data analysis instruction, wherein the format of the model input data conforms to the format of an input variable of a mathematical model corresponding to the data analysis instruction;
an arithmetic unit, configured to call a mathematical model algorithm corresponding to the data analysis instruction to operate on the model input data to obtain model output data;
an analysis unit, configured to screen out a data mining result corresponding to the data analysis instruction from the model output data according to an analysis mode corresponding to the data analysis instruction;
a feedback unit, configured to feed back the data mining result;
wherein the conversion mode and the analysis mode are preset;
the conversion mode comprises data splicing processing, data type conversion processing, data statistics calculation processing and/or data format conversion processing.
CN201610827546.3A 2016-09-14 2016-09-14 Data analysis method and device Active CN106372240B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610827546.3A CN106372240B (en) 2016-09-14 2016-09-14 Data analysis method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610827546.3A CN106372240B (en) 2016-09-14 2016-09-14 Data analysis method and device

Publications (2)

Publication Number Publication Date
CN106372240A CN106372240A (en) 2017-02-01
CN106372240B true CN106372240B (en) 2019-12-10

Family

ID=57897423

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610827546.3A Active CN106372240B (en) 2016-09-14 2016-09-14 Data analysis method and device

Country Status (1)

Country Link
CN (1) CN106372240B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1975720A (en) * 2006-12-27 2007-06-06 章毅 Data tapping system based on Wcb and control method thereof
CN101286210A (en) * 2007-04-11 2008-10-15 中国科学院地理科学与资源研究所 Populace space distribution numerical simulation system
CN101778400A (en) * 2010-01-08 2010-07-14 哈尔滨工业大学 Database-based telephone traffic analysis and prediction system and telephone traffic prediction method using same

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10521455B2 (en) * 2014-03-18 2019-12-31 Nanobi Data And Analytics Private Limited System and method for a neural metadata framework

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant