CN111797071A - Data processing method, data processing device, storage medium and electronic equipment - Google Patents

Info

Publication number
CN111797071A
Authority
CN
China
Prior art keywords
data
time periods
time
time period
terminal use
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910282435.2A
Other languages
Chinese (zh)
Inventor
何明
陈仲铭
李姬俊男
刘耀勇
陈岩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Oppo Mobile Telecommunications Corp Ltd
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority to CN201910282435.2A
Publication of CN111797071A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/211 Schema design and management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2477 Temporal data queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/18 Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Software Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Pure & Applied Mathematics (AREA)
  • Evolutionary Biology (AREA)
  • Algebra (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Operations Research (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Fuzzy Systems (AREA)
  • Computational Linguistics (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The embodiments of the present application disclose a data processing method, a data processing device, a storage medium and electronic equipment. Terminal use data of a plurality of time periods is acquired and counted, and a first relation table between the time periods and the data types of the terminal use data is constructed. The scene categories corresponding to the time periods are then acquired and, in combination with the first relation table, a second relation table among the time periods, the scene categories and the data types of the terminal use data is constructed. This scheme forms a three-way relation table among the data type, the time period and the scene category of the terminal use data, and thereby analyzes the time characteristics of the terminal use data. When features are extracted from data in a specific time period, the data features for that time period and scene can be extracted accurately according to the second relation table, which improves the accuracy of data feature extraction.

Description

Data processing method, data processing device, storage medium and electronic equipment
Technical Field
The present application relates to the field of data processing technologies, and in particular, to a data processing method and apparatus, a storage medium, and an electronic device.
Background
As a terminal manufacturer, it is often necessary to analyze behavior habits and states of a terminal user in a specific scene, and then data of the terminal user needs to be collected for analysis. For example, to analyze the regularity and trend of user data in a conference scene, it is necessary to acquire usage data of a user terminal in the conference scene and extract features from the usage data for analysis.
However, a general feature extraction method extracts and analyzes all types of terminal usage data for all types of scenes and for all time periods. In practice, a user's use of a terminal device shows clear temporal patterns: the types of data used differ between time periods, and the corresponding scene categories also differ between time periods. At present, analysis of the time characteristics of terminal usage data is largely lacking, which makes it difficult to accurately extract data features for a specific time period and a specific scene and leads to analysis conclusions of low accuracy.
Disclosure of Invention
The embodiment of the application provides a data processing method, a data processing device, a storage medium and electronic equipment, which can analyze the time characteristics of terminal use data and further improve the accuracy of data feature extraction.
In a first aspect, an embodiment of the present application provides a data processing method, including:
acquiring terminal use data of a plurality of time periods, wherein the terminal use data comprises a plurality of data types;
counting the terminal use data of the time periods, and constructing a first relation table between the time periods and the data types of the terminal use data according to the counting result;
obtaining scene categories corresponding to the multiple time periods;
and constructing a second relation table among the time periods, the scene categories and the data types of the terminal use data according to the scene categories corresponding to the time periods and the first relation table.
In a second aspect, an embodiment of the present application provides a data processing apparatus, including:
the data acquisition module is used for acquiring terminal use data of a plurality of time periods, wherein the terminal use data comprises a plurality of data types;
the first association analysis module is used for counting the terminal use data of the time periods and constructing a first relation table between the time periods and the data types of the terminal use data according to the counting result;
the category acquisition module is used for acquiring scene categories corresponding to the time periods;
and the second correlation analysis module is used for constructing a second relation table among the time periods, the scene categories and the data types of the terminal use data according to the scene categories corresponding to the time periods and the first relation table.
In a third aspect, a storage medium is provided in this application, and a computer program is stored thereon, and when the computer program runs on a computer, the computer is caused to execute the data processing method provided in any embodiment of this application.
In a fourth aspect, an embodiment of the present application provides an electronic device, including a processor and a memory, where the memory has a computer program, and the processor is configured to execute the data processing method provided in any embodiment of the present application by calling the computer program.
According to the technical scheme provided by the embodiment of the application, terminal use data of a plurality of time periods is acquired, wherein the terminal use data comprises a plurality of data types; the terminal use data of the time periods is counted, and a first relation table between the time periods and the data types of the terminal use data is constructed according to the statistical result. Then, the scene categories corresponding to the time periods are acquired, and a second relation table among the time periods, the scene categories and the data types of the terminal use data is constructed according to the first relation table. This scheme not only mines the time characteristics of the terminal use data and finds the association between the data types of the terminal use data and the time periods, but also, by combining the scene categories, forms a three-way relation table among the data type, the time period and the scene category of the terminal use data, thereby analyzing the time characteristics of the terminal use data. When features are extracted from data in a specific time period, the data features for that time period and scene can be extracted in a targeted manner according to the second relation table, which improves the accuracy of data feature extraction.
Drawings
In order to illustrate the technical solutions in the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic view of a panoramic sensing architecture of a data processing method according to an embodiment of the present application.
Fig. 2 is a schematic flowchart of a first data processing method according to an embodiment of the present disclosure.
Fig. 3 is a schematic flowchart of a second data processing method according to an embodiment of the present application.
Fig. 4 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present application.
Fig. 5 is a schematic structural diagram of a first electronic device according to an embodiment of the present application.
Fig. 6 is a schematic structural diagram of a second electronic device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application. It is to be understood that the embodiments described are only a few embodiments of the present application and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without inventive step, are within the scope of the present application.
Referring to fig. 1, fig. 1 is a schematic view of a panoramic sensing architecture of a data processing method according to an embodiment of the present application. The data processing method is applied to the electronic equipment. A panoramic perception framework is arranged in the electronic equipment. The panoramic sensing architecture is an integration of hardware and software for implementing the data processing method in an electronic device.
The panoramic perception architecture comprises an information perception layer, a data processing layer, a feature extraction layer, a scene modeling layer and an intelligent service layer.
The information perception layer is used for acquiring information of the electronic equipment or information in an external environment. The information-perceiving layer may include a plurality of sensors. For example, the information sensing layer includes a plurality of sensors such as a distance sensor, a magnetic field sensor, a light sensor, an acceleration sensor, a fingerprint sensor, a hall sensor, a position sensor, a gyroscope, an inertial sensor, an attitude sensor, a barometer, and a heart rate sensor.
The distance sensor may be used to detect a distance between the electronic device and an external object. The magnetic field sensor may be used to detect magnetic field information of the environment in which the electronic device is located. The light sensor may be used to detect light information of the environment in which the electronic device is located. The acceleration sensor may be used to detect acceleration data of the electronic device. The fingerprint sensor may be used to collect fingerprint information of a user. The Hall sensor is a magnetic field sensor manufactured according to the Hall effect, and can be used for realizing automatic control of the electronic device. The position sensor may be used to detect the geographic location where the electronic device is currently located. The gyroscope may be used to detect the angular velocity of the electronic device in various directions. The inertial sensor may be used to detect motion data of the electronic device. The attitude sensor may be used to sense attitude information of the electronic device. The barometer may be used to detect the barometric pressure of the environment in which the electronic device is located. The heart rate sensor may be used to detect heart rate information of the user.
And the data processing layer is used for processing the data acquired by the information perception layer. For example, the data processing layer may perform data cleaning, data integration, data transformation, data reduction, and the like on the data acquired by the information sensing layer.
The data cleaning refers to cleaning a large amount of data acquired by the information sensing layer to remove invalid data and repeated data. The data integration refers to integrating a plurality of single-dimensional data acquired by the information perception layer into a higher or more abstract dimension so as to comprehensively process the data of the plurality of single dimensions. The data transformation refers to performing data type conversion or format conversion on the data acquired by the information sensing layer so that the transformed data can meet the processing requirement. The data reduction means that the data volume is reduced to the maximum extent on the premise of keeping the original appearance of the data as much as possible.
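As an illustration of the data cleaning and data transformation steps described above, the following is a minimal sketch only. The record layout (`timestamp`, `data_type`, `value`) and the function names are hypothetical and do not come from the application.

```python
def clean(records):
    """Data cleaning: drop invalid records (missing value) and exact duplicates."""
    seen = set()
    cleaned = []
    for rec in records:
        if rec.get("value") is None:
            continue  # invalid data
        key = (rec["timestamp"], rec["data_type"], rec["value"])
        if key in seen:
            continue  # repeated data
        seen.add(key)
        cleaned.append(rec)
    return cleaned


def transform(records):
    """Data transformation: convert every value to float so later steps see one type."""
    return [{**rec, "value": float(rec["value"])} for rec in records]


# Hypothetical raw sensor readings.
raw = [
    {"timestamp": 0, "data_type": "light", "value": "120"},
    {"timestamp": 0, "data_type": "light", "value": "120"},   # duplicate
    {"timestamp": 1, "data_type": "light", "value": None},    # invalid
    {"timestamp": 2, "data_type": "temperature", "value": "21.5"},
]
processed = transform(clean(raw))
```

Data integration and data reduction would follow the same pattern, each as a function from a list of records to a new list of records.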
The characteristic extraction layer is used for extracting characteristics of the data processed by the data processing layer so as to extract the characteristics included in the data. The extracted features may reflect the state of the electronic device itself or the state of the user or the environmental state of the environment in which the electronic device is located, etc.
The feature extraction layer may extract features, or process the extracted features, by a method such as a filtering method, a wrapper method, or an integration method.
The filtering method filters the extracted features to remove redundant feature data. The wrapper method is used to screen the extracted features. The integration method combines a plurality of feature extraction methods to construct a more efficient and more accurate feature extraction method.
The scene modeling layer is used for building a model according to the features extracted by the feature extraction layer, and the obtained model can be used for representing the state of the electronic equipment, the state of a user, the environment state and the like. For example, the scenario modeling layer may construct a key value model, a pattern identification model, a graph model, an entity relation model, an object-oriented model, and the like according to the features extracted by the feature extraction layer.
The intelligent service layer is used for providing intelligent services for the user according to the model constructed by the scene modeling layer. For example, the intelligent service layer can provide basic application services for users, perform system intelligent optimization for electronic equipment, and provide personalized intelligent services for users.
In addition, the panoramic perception architecture can further comprise a plurality of algorithms, each of which can be used for analyzing and processing data, and the plurality of algorithms can form an algorithm library. For example, the algorithm library may include algorithms such as the Markov algorithm, the latent Dirichlet allocation algorithm, the Bayesian classification algorithm, the support vector machine, the K-means clustering algorithm, the K-nearest neighbor algorithm, the conditional random field, the residual network, the long short-term memory network, the convolutional neural network, and the recurrent neural network.
Based on the panoramic sensing architecture, the electronic equipment acquires terminal use data of a target user through the information sensing layer and/or in other ways. The data processing layer processes the terminal use data, for example, by performing data cleaning and data integration on the acquired terminal use data. Next, the feature extraction layer processes the terminal use data according to the feature extraction scheme provided in the embodiment of the present application: it obtains terminal use data of multiple time periods, where the terminal use data includes multiple data types, counts the terminal use data of the multiple time periods, and constructs a first relation table between the time periods and the data types of the terminal use data according to the statistical result. Then, the scene categories corresponding to the time periods are obtained, and a second relation table among the time periods, the scene categories and the data types of the terminal use data is constructed according to the first relation table. This scheme mines the time characteristics of the terminal use data and finds the association between the data types of the terminal use data and the time periods; by combining the scene categories, it further forms a three-way relation table among the data type, the time period and the scene category of the terminal use data. When features are extracted from data in a specific time period, the data features for that time period and scene can be extracted in a targeted manner according to the second relation table, which improves the accuracy of data feature extraction.
An execution main body of the data processing method may be the data processing apparatus provided in the embodiment of the present application, or an electronic device integrated with the data processing apparatus, where the data processing apparatus may be implemented in a hardware or software manner. The electronic device may be a smart phone, a tablet computer, a palm computer, a notebook computer, or a desktop computer.
Referring to fig. 2, fig. 2 is a first flowchart illustrating a data processing method according to an embodiment of the present disclosure. The specific flow of the data processing method provided by the embodiment of the application can be as follows:
Step 101, acquiring terminal use data of a plurality of time periods, wherein the terminal use data comprises a plurality of data types.
In the embodiment of the application, the terminal use data is generated according to the use condition of a user on the intelligent terminal, such as electronic equipment. For example, the terminal usage data mainly includes the following three major categories: environmental data, user behavior data, and terminal operational data. Wherein each large category of data comprises a plurality of data types, for example, the environmental data comprises data types of temperature, illumination, and the like; the user behavior data may include: the time, place, and frequency of opening the application; the terminal operation data may include: the operation state of the terminal, such as the on-off state of the mobile data network, the connection state of the wireless hotspot, the identity information of the connected wireless hotspot, the currently running application program, the previous foreground application program, the time for the current application program to stay in the background, the time for the current application program to be switched to the background last time, the plugging and unplugging state of the earphone jack, the charging state, the battery power information, the screen display time and other data types. The terminal usage data may further include data collected by sensors integrated in the terminal, such as a combination of one or more of a motion sensor, a light sensor, a temperature sensor, and a humidity sensor, among others.
In addition, in the embodiment of the application, the continuous time is divided into a plurality of time periods, and the terminal use data is collected according to the time periods. And storing the acquired terminal use data into a preset database, for example, constructing a terminal use database by adopting MySQL in advance, and storing the acquired terminal use data into the database.
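The storage step can be sketched as follows. The text names MySQL; this sketch instead uses Python's built-in `sqlite3` module as a stand-in so that it runs without a database server. The table name, column names and sample rows are hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL terminal use database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE terminal_usage ("
    " period INTEGER NOT NULL,"    # index of the time period
    " data_type TEXT NOT NULL,"    # e.g. 'gps', 'payment_app'
    " value TEXT)"                 # raw reading or event payload
)

# Hypothetical usage events collected per time period.
rows = [
    (8, "gps", "on"),
    (8, "payment_app", "launch"),
    (12, "payment_app", "launch"),
]
conn.executemany("INSERT INTO terminal_usage VALUES (?, ?, ?)", rows)
conn.commit()

# Count stored events per (time period, data type).
counts = conn.execute(
    "SELECT period, data_type, COUNT(*) FROM terminal_usage"
    " GROUP BY period, data_type ORDER BY period, data_type"
).fetchall()
```

Keeping the time period as a column makes the per-period counting of step 102 a single grouped query.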
There are various ways in which the time period may be divided, for example, in some embodiments, the time period may be divided manually, such as every hour as one time period. Assuming one day (24 hours) as one time interval, one day may be divided into 24 time periods. Alternatively, each half hour is used as a time period, so that the day is divided into 48 time periods.
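The fixed-length division can be sketched as a mapping from a timestamp to a 1-based time period index. The function name `period_index` and the example date are illustrative assumptions.

```python
from datetime import datetime


def period_index(ts: datetime, minutes_per_period: int = 60) -> int:
    """Return the 1-based index of the fixed-length time period containing ts.

    With 60-minute periods a day has 24 periods; with 30-minute periods, 48.
    """
    if minutes_per_period <= 0 or (24 * 60) % minutes_per_period != 0:
        raise ValueError("minutes_per_period must evenly divide one day")
    minutes_since_midnight = ts.hour * 60 + ts.minute
    return minutes_since_midnight // minutes_per_period + 1


hourly = period_index(datetime(2019, 4, 9, 23, 59))          # last hourly period
half_hourly = period_index(datetime(2019, 4, 9, 13, 45), 30)  # a half-hour period
```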
Alternatively, in other embodiments, the time segments may be partitioned based on information entropy. Specifically, before the step of acquiring terminal usage data for a plurality of time periods, the method further includes: the preset time interval is divided into a plurality of time periods.
The step of dividing a preset time interval into a plurality of time periods comprises the following steps:
calculating the total information entropy of a preset time interval according to terminal use data in the preset time interval;
dividing the time interval into two time periods according to each time division point in a preset value range, and calculating weighted average information entropy of the two divided time periods;
determining a time division point with the maximum difference value between the weighted average information entropy and the total information entropy;
dividing the time interval into a first time period and a second time period according to the time division point;
and taking the first time period and the second time period as new time intervals, and repeatedly executing the steps until the number of the segments of the time intervals is greater than a third preset threshold value.
The total information entropy of the time interval can be calculated by using any type of data in the terminal usage data, for example, the starting data of the target application is obtained from the terminal usage data; and calculating the total information entropy of the preset time interval according to the starting data. For example, the WeChat APP is used as a target application, and the starting data of the WeChat APP is the starting times of the WeChat APP.
Taking 10 minutes as a time unit, 24 hours contain 144 time units, numbered from 1 to 144. The number of times the WeChat APP is opened in each ten-minute unit is counted and denoted count_t, where t denotes the t-th ten-minute unit.
Based on the number of opening times of the WeChat APP, calculating the probability within the tth ten minutes:
Figure BDA0002022113160000071
the time interval is then divided into a plurality of time segments based on the information entropy of time:
a. calculating the total information entropy of the whole time interval:
Figure BDA0002022113160000072
where T is 144.
b. Suppose that the time interval is divided at the x-th point, where x takes values in the range [2, 143]. Taking division at the 10th point as an example, the time interval is divided into two parts, 1 to 10 and 11 to 144, and the weighted average information entropy of the two resulting time periods [1, 10] and [11, 144] is then calculated:
Figure BDA0002022113160000073
c. calculating the difference between the total information entropy and the weighted average information entropy: ΔE(10) = Ent([1, T]) - Ent([1, T]; 10).
d. Repeating steps b and c. Because x in step b ranges over [2, 143], each x yields a ΔE(x). The largest ΔE(x) is selected, and the corresponding x is taken as the division point of the time interval. Assuming that ΔE is largest when x is 20, the time interval is divided into two time periods, [1, 20] and [21, 144], with division point 20.
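The three formula images referenced in steps a to c are not reproduced in this text. One plausible reading, given here only as an assumption, uses the standard Shannon entropy and weights each sub-period by its length; the symbol p_t, the logarithm base, the renormalization of probabilities within each sub-period, and the length-proportional weights are all assumptions (weights proportional to the launch counts would be another reasonable reading).

```latex
p_t = \frac{count_t}{\sum_{k=1}^{T} count_k},
\qquad
\mathrm{Ent}([1, T]) = -\sum_{t=1}^{T} p_t \log p_t

\mathrm{Ent}([1, T];\, x) = \frac{x}{T}\,\mathrm{Ent}([1, x]) + \frac{T - x}{T}\,\mathrm{Ent}([x+1, T]),
\qquad
\Delta E(x) = \mathrm{Ent}([1, T]) - \mathrm{Ent}([1, T];\, x)
```

Here Ent([1, x]) and Ent([x+1, T]) are computed in the same way as Ent([1, T]), with the probabilities renormalized over the sub-period.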
After step d, two time periods are obtained: [1, x] and [x+1, 144]. Steps a to d are repeated for the two time periods until the number of time periods reaches the third preset threshold (for example, 24), at which point the iteration stops. For instance, if the third preset threshold is 20, the 24-hour time interval is divided into 20 time periods.
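Steps a to d and their repetition can be sketched as follows. This is not the application's exact procedure: the entropy formula (Shannon entropy with natural logarithm over renormalized counts), the length-proportional weights, and the rule of always dividing the segment with the largest ΔE next are assumptions, and the launch counts are synthetic.

```python
import math


def entropy(counts):
    """Shannon entropy (natural log) of counts renormalized to probabilities."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def best_split(counts, start, end):
    """Return (delta_e, x) for the best division of units start..end (1-based, inclusive).

    x ranges over [start + 1, end - 1], mirroring the range [2, 143] for [1, 144].
    Returns None when the segment is too short to divide.
    """
    n = end - start + 1
    if n < 3:
        return None
    total_entropy = entropy(counts[start - 1:end])
    best = None
    for x in range(start + 1, end):
        left = counts[start - 1:x]   # units start..x
        right = counts[x:end]        # units x+1..end
        weighted = len(left) / n * entropy(left) + len(right) / n * entropy(right)
        delta = total_entropy - weighted
        if best is None or delta > best[0]:
            best = (delta, x)
    return best


def segment(counts, max_segments):
    """Divide units 1..len(counts) until max_segments time periods exist."""
    segments = [(1, len(counts))]
    while len(segments) < max_segments:
        candidates = []
        for seg in segments:
            result = best_split(counts, *seg)
            if result is not None:
                candidates.append((result[0], result[1], seg))
        if not candidates:
            break  # nothing left that can be divided
        _, x, (start, end) = max(candidates)  # assumption: divide the best segment first
        segments.remove((start, end))
        segments += [(start, x), (x + 1, end)]
        segments.sort()
    return segments


# Synthetic launch counts for 144 ten-minute units (illustrative only).
launch_counts = [(t * 7) % 5 for t in range(144)]
periods = segment(launch_counts, 20)
```

The result is a sorted list of (first unit, last unit) pairs that together cover units 1 to 144 without gaps.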
Step 102, counting the terminal use data of the time periods, and constructing a first relation table between the time periods and the data types of the terminal use data according to the counting result.
After the terminal use data of each time period is acquired, the data types relevant to each time period are found from all the data types.
Referring to fig. 3, fig. 3 is a schematic flowchart of a second data processing method according to an embodiment of the present application. In some embodiments, step 102 of counting the terminal usage data of the plurality of time periods and constructing a first relation table between the time periods and the data types of the terminal usage data according to the statistical result includes:
step 1021, counting the use frequency corresponding to each data type in each time period according to the terminal use data;
step 1022, taking the data type with the usage frequency greater than a first preset threshold in a time period as the associated data type of the time period, and taking the usage frequency as the corresponding association degree;
and step 1023, constructing a first relation table between the time periods and the data types of the terminal use data according to the data types and the association degrees associated with each time period.
The data types used by the user may vary across time periods. For example, the usage frequency of payment APPs in the time periods corresponding to breakfast, lunch and dinner may be significantly higher than in other time periods. As another example, GPS may be turned on during the morning or evening hours when the user is commuting, and turned on rarely or not at all during other hours. The terminal usage data may therefore be counted based on the usage frequency of each data type: from all the terminal usage data, the usage frequency corresponding to each data type in each time period is counted. If the usage frequency of a certain data type in a time period is greater than a first preset threshold, the data type is judged to be associated with the time period; otherwise, it is judged to be unrelated to the time period. In this way, when the data of a time period is analyzed, only the terminal usage data of the associated data types needs to be collected and the unrelated data types can be ignored, which reduces the scale of the collected data and makes the collection targeted. The first preset threshold may be set in advance according to actual conditions.
The usage frequency can be used as the association degree of the data type. After the data types and association degrees associated with each time period are determined, the first relation table is constructed from them. The first relation table includes the data types associated with each time period and the association degree of each associated data type. The first relation table may be stored in table form or as data pairs, for example <t_i, d(t_i)>, where i ∈ (1, n), a complete time interval is divided into n consecutive time periods, t_i denotes the i-th of the n consecutive time periods, and d(t_i) denotes the data types associated with time period t_i and their association degrees.
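Steps 1021 to 1023 can be sketched as follows. The sketch assumes that the usage frequency is a raw event count and stores the table as a dictionary from time period t_i to d(t_i); the function name, event layout and sample data are hypothetical.

```python
from collections import Counter, defaultdict


def build_first_relation_table(events, threshold):
    """Build {period: {data_type: association_degree}} from (period, data_type) events.

    The usage frequency (here a raw count, which is an assumption) serves as the
    association degree; only data types whose frequency in a period is greater
    than the first preset threshold are kept for that period.
    """
    freq = defaultdict(Counter)
    for period, data_type in events:
        freq[period][data_type] += 1          # step 1021: count per period and type
    return {
        period: {dt: n for dt, n in counter.items() if n > threshold}  # step 1022
        for period, counter in freq.items()   # step 1023: assemble the table
    }


# Hypothetical usage events as (time period index, data type).
events = [
    (8, "gps"), (8, "gps"), (8, "gps"), (8, "payment_app"),
    (12, "payment_app"), (12, "payment_app"), (12, "payment_app"), (12, "gps"),
]
first_table = build_first_relation_table(events, threshold=2)
```

In this toy data, only GPS passes the threshold in period 8 and only the payment APP passes it in period 12, so each period keeps a single associated data type.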
Step 103, acquiring scene categories corresponding to the multiple time periods.
Next, the scene category of each time period is determined. The scheme of the application can be used for analyzing the historical terminal use data of the user, so the scene category corresponding to each time period can be determined from the scene category sequence corresponding to the time interval. A scene category is the category of the scene in which the terminal user is located; it needs to be defined in advance and is determined according to how the user uses the electronic equipment. For example, the scene categories may include office, running, dining, traveling, gaming, movie watching, payment, fitness, dinner gathering, and so forth. In the embodiment of the present application, the sequence of scene categories within the preset time interval is known.
Step 104, constructing a second relation table among the time periods, the scene categories and the data types of the terminal use data according to the scene categories corresponding to the time periods and the first relation table.
After the scene type corresponding to each time period is obtained, the relationship among the scene type, the time period and the data type of the terminal use data is learned according to the scene type and the first relationship table, the data type with strong association relationship with the time period and the scene type corresponding to the time period is found from the first relationship table, and a second relationship table is constructed.
For example, in some embodiments, the step 104 of constructing a second relationship table between the time periods, the scenario categories, and the data types of the terminal usage data according to the scenario categories corresponding to the plurality of time periods and the first relationship table includes:
performing regression analysis on the scene categories corresponding to the time periods, the associated data types and the association degrees based on a preset regression algorithm, and determining the data types associated with the time periods and the scene categories corresponding to the time periods;
and constructing a second relation table among the time periods, the scene categories and the data types of the terminal use data according to the time periods and the data types related to the scene categories corresponding to the time periods.
The regression algorithm may be a ridge regression algorithm, a multiple regression algorithm, a lasso (least absolute shrinkage and selection operator) regression algorithm, or the like. In the following, ridge regression is taken as an example for learning the relationship among the scene categories, the time periods, and the data types of the terminal usage data.
In some embodiments, based on a preset regression algorithm, performing regression analysis on the scenario categories corresponding to the time period and the associated data types and association degrees, and determining the data types associated with both the time period and the scenario categories corresponding to the time period includes:
establishing an optimization target equation of ridge regression by taking the association degree of the data type associated with the time period as an independent variable and taking the scene type corresponding to the time period as a dependent variable;
solving the optimization objective equation according to a gradient descent method to obtain a weight vector;
determining a weight of a data type associated with the time period according to the weight vector;
and taking the data type with the weight larger than a second preset threshold value as a data type associated with the time period and the scene category corresponding to the time period.
For example, the preset regression algorithm is a ridge regression algorithm. After the scene category of time period $t_i$ is determined, the scene category is used as the $y$ value of the ridge regression model, that is, the dependent variable, and $d(t_i)$ is used as the $x$ vector of the model, that is, the independent variable. The optimization objective equation of the ridge regression learning model is constructed as:
Figure BDA0002022113160000101
where the weight vector $W = (w_1, w_2, \dots, w_j, \dots, w_m)$ holds the weight of each data type, $m$ denotes the number of data types in time period $t_i$, and $w_j$ denotes the weight of the $j$-th of the $m$ data types, with $j \in \{1, \dots, m\}$.
Specifically, the equation may be solved with a gradient descent algorithm to obtain the weight vector $W$. The weight of each data type associated with time period $t_i$ can then be determined from the weight vector.
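The objective appears only as an image in the source text. For orientation, a textbook form of the ridge regression objective, its gradient, and the gradient descent update is sketched here with the weight vector $W$ defined above. The sample index $k$, the number of samples $N$, the regularization coefficient $\lambda$, and the learning rate $\eta$ are notational assumptions, and the expression in the original figure may differ in scaling or notation.

```latex
\min_{W} \; J(W) = \sum_{k=1}^{N} \left( y_k - W^{\top} x_k \right)^2 + \lambda \lVert W \rVert_2^2,
\qquad
\nabla_W J = -2 \sum_{k=1}^{N} \left( y_k - W^{\top} x_k \right) x_k + 2 \lambda W,
\qquad
W \leftarrow W - \eta \, \nabla_W J .
```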
For example, for time period $t_1$, if the weight of a data type is greater than a second preset threshold, the data type is judged to be strongly associated with both time period $t_1$ and the scene category corresponding to $t_1$, and the data type is retained; if the weight is less than or equal to the second preset threshold, the data type is discarded. In this way, the data types associated with both each time period and its corresponding scene category can be found, and a second relation table can be generated. The second relation table may be expressed as $\langle t_i, c_i, d'(t_i) \rangle$, where $c_i$ denotes the scene category of time period $t_i$ and $d'(t_i)$ denotes the data types strongly associated with both time period $t_i$ and scene category $c_i$.
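The weighting and thresholding steps can be illustrated with a small pure-Python sketch. It makes several assumptions that the text does not state: the scene category is encoded as 1 (the scene of interest, here a hypothetical "commuting") or 0, each row of `X` is one historical observation of the association degrees scaled to [0, 1], the objective is the sum of squared errors plus an L2 penalty, and all names and numeric values are invented for the example.

```python
def ridge_gradient_descent(X, y, lam=0.1, lr=0.01, steps=5000):
    """Minimise sum((w.x_k - y_k)^2) + lam * ||w||^2 by batch gradient descent."""
    m = len(X[0])
    w = [0.0] * m
    for _ in range(steps):
        # Residuals of the current weights on every sample.
        residuals = [sum(wj * xj for wj, xj in zip(w, row)) - yk
                     for row, yk in zip(X, y)]
        # Gradient of the squared-error term plus the L2 penalty.
        grad = [2.0 * sum(r * row[j] for r, row in zip(residuals, X)) + 2.0 * lam * w[j]
                for j in range(m)]
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w

# Hypothetical data types and association degrees (independent variables).
data_types = ["gps", "payment_app", "screen_on"]
X = [
    [0.90, 0.10, 0.5],
    [0.80, 0.00, 0.4],
    [0.10, 0.90, 0.5],
    [0.00, 0.80, 0.6],
    [0.85, 0.05, 0.5],
    [0.05, 0.70, 0.4],
]
# Assumed encoding of the scene category (dependent variable): 1 = commuting.
y = [1, 1, 0, 0, 1, 0]

weights = ridge_gradient_descent(X, y)
second_threshold = 0.5  # stands in for the second preset threshold
kept = [d for d, w in zip(data_types, weights) if w > second_threshold]
second_table_row = ("t1", "commuting", kept)  # <t_i, c_i, d'(t_i)>
```

With these invented numbers only the GPS data type receives a weight above the assumed threshold, so it is the only data type retained in the row for the time period.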
In another optional embodiment, after step 104 of constructing the second relation table among the time periods, the scene categories, and the data types of the terminal usage data according to the scene categories corresponding to the plurality of time periods and the first relation table, the method further includes:
determining a target time period and a scene category of the target time period;
determining data types associated with the target time period and the scene categories corresponding to the target time period according to the second relation table;
acquiring terminal use data belonging to the determined data type in the target time period;
and extracting data characteristics from the acquired terminal use data according to a preset characteristic extraction algorithm to serve as time domain characteristics corresponding to the target time period.
After the second relation table of triples among the time periods, the scene categories, and the data types of the terminal usage data is obtained, the data features of a given time period can be extracted according to it. For example, to obtain the data features of a target time period, the scene category of that time period is determined first; the data types associated with both the time period and the scene category are then looked up in the second relation table, the terminal usage data belonging to those data types is acquired, and time domain features are extracted with a time domain feature extraction method. Taking acceleration sensor data as an example, statistics are computed over the acceleration sensor data sequence within the target time period to obtain time domain features of the sequence such as the peak value, mean value, root mean square value, kurtosis index, and form factor.
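The time domain features named above can be computed as in the following sketch. The text does not define them, so commonly used conventions are assumed here: the peak is the largest absolute value, the kurtosis index is the fourth central moment divided by the squared variance, and the form factor is the root mean square value divided by the mean absolute value. The function name and the sample sequence are hypothetical.

```python
import math

def time_domain_features(samples):
    """Return basic time domain features of a non-empty numeric sequence."""
    n = len(samples)
    mean = sum(samples) / n
    abs_mean = sum(abs(x) for x in samples) / n
    rms = math.sqrt(sum(x * x for x in samples) / n)
    variance = sum((x - mean) ** 2 for x in samples) / n
    fourth_moment = sum((x - mean) ** 4 for x in samples) / n
    return {
        "peak": max(abs(x) for x in samples),
        "mean": mean,
        "rms": rms,
        # Undefined for a constant sequence, so NaN is returned in that case.
        "kurtosis": fourth_moment / variance ** 2 if variance > 0 else float("nan"),
        "form_factor": rms / abs_mean if abs_mean > 0 else float("nan"),
    }

# Hypothetical acceleration sensor sequence for the target time period.
accel = [1.0, -1.0, 1.0, -1.0]
features = time_domain_features(accel)
```

Other definitions of the kurtosis index exist (for example, based on raw rather than central moments), so the convention should be fixed before features from different sources are compared.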
In a specific implementation, the present application is not limited to the described order of execution of the steps; where no conflict arises, some steps may be performed in a different order or simultaneously.
As can be seen from the above, the data processing method provided in the embodiments of the present application acquires terminal usage data of a plurality of time periods, where the terminal usage data includes a plurality of data types, performs statistics on that data, and constructs a first relation table between the time periods and the data types of the terminal usage data according to the statistical result. The scene categories corresponding to the time periods are then acquired, and a second relation table among the time periods, the scene categories, and the data types of the terminal usage data is constructed according to the scene categories and the first relation table. This scheme not only mines the temporal characteristics of the terminal usage data and finds the association between its data types and the time periods, but also combines the scene categories to form a three-way relation table among data types, time periods, and scene categories, thereby realizing analysis of the temporal characteristics of the terminal usage data. When features are extracted from the data of a specific time period, the data features for that time period and scene can be extracted in a targeted manner according to the second relation table, which improves the accuracy of data feature extraction.
In one embodiment, a data processing apparatus is also provided. Referring to fig. 4, fig. 4 is a schematic structural diagram of a data processing apparatus 400 according to an embodiment of the present disclosure. The data processing apparatus 400 is applied to an electronic device, and the data processing apparatus 400 includes a data obtaining module 401, a first association analysis module 402, a category obtaining module 403, and a second association analysis module 404, as follows:
a data obtaining module 401, configured to obtain terminal usage data of multiple time periods, where the terminal usage data includes multiple data types.
In the embodiment of the application, the terminal use data is generated according to the use condition of a user on the intelligent terminal, such as electronic equipment. For example, the terminal usage data mainly includes the following three major categories: environmental data, user behavior data, and terminal operational data. Wherein each large category of data comprises a plurality of data types, for example, the environmental data comprises data types of temperature, illumination, and the like; the user behavior data may include: the time, place, and frequency of opening the application; the terminal operation data may include: the operation state of the terminal, such as the on-off state of the mobile data network, the connection state of the wireless hotspot, the identity information of the connected wireless hotspot, the currently running application program, the previous foreground application program, the time for the current application program to stay in the background, the time for the current application program to be switched to the background last time, the plugging and unplugging state of the earphone jack, the charging state, the battery power information, the screen display time and other data types. The terminal usage data may further include data collected by sensors integrated in the terminal, such as a combination of one or more of a motion sensor, a light sensor, a temperature sensor, and a humidity sensor, among others.
In addition, in this embodiment of the application, the data obtaining module 401 divides continuous time into a plurality of time periods and collects terminal usage data period by period. The collected terminal usage data is stored in a preset database; for example, a terminal usage database is built in advance with MySQL, and the collected terminal usage data is stored in it.
There are various ways in which the time period may be divided, for example, in some embodiments, the time period may be divided manually, such as every hour as one time period. Assuming one day (24 hours) as one time interval, one day may be divided into 24 time periods. Alternatively, each half hour is used as a time period, so that the day is divided into 48 time periods.
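For the fixed-length division, mapping a timestamp to its time period is simple integer arithmetic. The sketch below is only an illustration; the function name `period_index` and the 1-based numbering are assumptions.

```python
from datetime import datetime

def period_index(timestamp, minutes_per_period=60):
    """Return the 1-based index of the fixed-length period containing timestamp."""
    if 1440 % minutes_per_period != 0:
        raise ValueError("period length must divide 24 hours evenly")
    minutes_since_midnight = timestamp.hour * 60 + timestamp.minute
    return minutes_since_midnight // minutes_per_period + 1

ts = datetime(2019, 4, 9, 13, 45)
hourly = period_index(ts)           # 14th of 24 one-hour periods
half_hourly = period_index(ts, 30)  # 28th of 48 half-hour periods
```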
Alternatively, in other embodiments, the time segments may be partitioned based on information entropy. Specifically, the apparatus further includes a time period dividing module, where the time period dividing module is configured to: the preset time interval is divided into a plurality of time periods.
Specifically, the time period dividing module is further configured to: calculating the total information entropy of a preset time interval according to terminal use data in the preset time interval; dividing the time interval into two time periods according to each time division point in a preset value range, and calculating weighted average information entropy of the two divided time periods; determining a time division point with the maximum difference value between the weighted average information entropy and the total information entropy; dividing the time interval into a first time period and a second time period according to the time division point; and taking the first time period and the second time period as new time intervals, and repeatedly executing the steps until the number of the segments of the time intervals is greater than a third preset threshold value.
The total information entropy of the time interval can be calculated by using any type of data in the terminal usage data, for example, the starting data of the target application is obtained from the terminal usage data; and calculating the total information entropy of the preset time interval according to the starting data. For example, the WeChat APP is used as a target application, and the starting data of the WeChat APP is the starting times of the WeChat APP.
Taking 10 minutes as one time unit, 24 hours yields a total of 144 time units, numbered from 1 to 144. The number of times the WeChat APP is opened in each ten-minute unit is counted and denoted $count_t$, where $t$ indicates the $t$-th ten-minute unit.
Based on the number of times the WeChat APP is opened, the probability for the $t$-th ten-minute unit is calculated:
Figure BDA0002022113160000131
the time interval is then divided into a plurality of time segments based on the information entropy of time:
a. calculating the total information entropy of the whole time interval:
Figure BDA0002022113160000132
where T is 144.
b. Suppose the time interval is divided at the $x$-th point, where $x$ takes values in the range $[2, 143]$. Taking division at the 10th point as an example, the time interval is divided into two parts, 1 to 10 and 11 to 144, and the weighted average information entropy of the two resulting time periods $[1, 10]$ and $[11, 144]$ is then calculated:
Figure BDA0002022113160000133
c. Calculate the difference between the weighted average information entropy and the total information entropy: $\Delta E(10) = \mathrm{Ent}([1, T]) - \mathrm{Ent}([1, T]; 10)$.
d. Repeat steps b and c. Because $x$ ranges over $[2, 143]$ in step b, each $x$ yields a value $\Delta E(x)$. The largest $\Delta E(x)$ is selected, and the corresponding $x$ is taken as the time period division point. For example, if $\Delta E$ is largest at $x = 20$, the division point is 20 and the time interval is divided into the two time periods $[1, 20]$ and $[21, 144]$.
After step d, two time periods are obtained: $[1, x]$ and $[x+1, 144]$. Steps a to d are repeated for the two resulting time periods until the number of segments of the time interval reaches a third preset threshold, for example 24, at which point the iteration stops. If the third preset threshold is instead 20, the 24-hour time interval is divided into 20 time periods.
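Steps a to d can be sketched in Python as follows. Because the formulas survive only as images, the sketch rests on stated assumptions: the probability of a time unit is its open count divided by the total count of the segment being evaluated, the entropy is $-\sum_t p_t \log_2 p_t$, the weighted average weights each part by its share of time units, every interior boundary is tried as a division point, and at each round the segment and division point with the largest $\Delta E$ are chosen. All function names and the sample counts are hypothetical.

```python
import math

def entropy(counts):
    """Shannon entropy (bits) of the distribution given by non-negative counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def best_split(counts, start, end):
    """Best cut of units [start, end) (0-based, half-open).

    Returns (gain, cut), where cut is the first index of the second part,
    or None when the segment is too short to split.
    """
    length = end - start
    if length < 2:
        return None
    whole = entropy(counts[start:end])
    best = None
    for cut in range(start + 1, end):
        left, right = counts[start:cut], counts[cut:end]
        weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / length
        gain = whole - weighted  # plays the role of Delta E(x)
        if best is None or gain > best[0]:
            best = (gain, cut)
    return best

def split_interval(counts, target_segments):
    """Greedily split until target_segments segments exist (or no split is possible)."""
    segments = [(0, len(counts))]
    while len(segments) < target_segments:
        candidates = []
        for seg in segments:
            found = best_split(counts, *seg)
            if found is not None:
                candidates.append((found[0], found[1], seg))
        if not candidates:
            break
        _, cut, seg = max(candidates)
        segments.remove(seg)
        segments += [(seg[0], cut), (cut, seg[1])]
        segments.sort()
    # Convert to the 1-based inclusive [first, last] notation used in the text.
    return [(s + 1, e) for s, e in segments]

# Hypothetical open counts for 144 ten-minute units.
counts = [(7 * t) % 11 for t in range(144)]
periods = split_interval(counts, target_segments=24)
```

The stopping value passed as `target_segments` stands in for the third preset threshold; a different weighting (for example, by event counts rather than by time units) would change which division points are selected.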
A first association analysis module 402, configured to perform statistics on the terminal usage data in the multiple time periods, and construct a first relationship table between the time periods and data types of the terminal usage data according to a statistical result.
After the terminal usage data of each time period is acquired, the data types associated with each time period are found from all the data types through the first association analysis module 402.
In some embodiments, the first association analysis module 402 is further configured to: counting the use frequency corresponding to each data type in each time period according to the terminal use data; taking the data type with the use frequency larger than a first preset threshold value in a time period as a related data type of the time period, and taking the use frequency as the corresponding relevance degree; and constructing a first relation table between the time periods and the data types of the terminal use data according to the data types and the association degrees associated with each time period.
The data types a user uses may vary between time periods. For example, payment APPs may be used markedly more often in the time periods corresponding to breakfast, lunch, and dinner than in other time periods. As another example, GPS may be turned on during the morning or evening periods when the user is commuting, and turned on rarely or not at all in other periods. The terminal usage data can therefore be counted by the usage frequency of each data type. The first association analysis module 402 counts, from all the terminal usage data, the usage frequency of each data type in each time period. If the usage frequency of a data type in a time period is greater than a first preset threshold, the data type is judged to be associated with that time period; otherwise it is judged to be unrelated. When the data of that time period is later analyzed, only the terminal usage data of the associated data types needs to be collected and the unrelated data types can be ignored, which reduces the volume of collected data and makes it more targeted. The first preset threshold may be set in advance according to actual conditions.
The usage frequency serves as the association degree of the data type. After the data types associated with each time period and their association degrees are determined, the first association analysis module 402 constructs a first relation table from them. The first relation table contains, for each time period, the associated data types and the association degree of each. It may be stored in table form or as data pairs, for example $\langle t_i, d(t_i) \rangle$ with $i \in \{1, \dots, n\}$, where a complete time interval is divided into $n$ consecutive time periods, $t_i$ denotes the $i$-th of these time periods, and $d(t_i)$ denotes the data types associated with time period $t_i$ together with their association degrees.
A category obtaining module 403, configured to obtain the category of the scene corresponding to the multiple time periods.
Next, the scene category of each time period is determined. The scheme of the present application can be used to analyze a user's historical terminal usage data, so the scene category of each time period can be determined from the scene category sequence corresponding to the time interval. A scene category is the type of scene in which the terminal user is located; the categories need to be defined in advance and are determined according to how the user uses the electronic device. For example, the scene categories may include office work, running, dining, traveling, gaming, watching movies, payment, fitness, dinner gatherings, and so on. In the embodiments of the present application, the sequence of scene categories within the preset time interval is known.
A second association analysis module 404, configured to construct a second relationship table among the time periods, the scenario categories, and the data types of the terminal usage data according to the scenario categories corresponding to the multiple time periods and the first relationship table.
After obtaining the scenario category corresponding to each time period, the second association analysis module 404 learns the relationship among the scenario category, the time period, and the data type of the terminal usage data according to the scenario category and the first relationship table, finds out the data type having a strong association relationship with the time period and the scenario category corresponding to the time period from the first relationship table, and constructs a second relationship table.
For example, in some embodiments, the second association analysis module 404 is further configured to: performing regression analysis on the scene categories corresponding to the time periods, the associated data types and the association degrees based on a preset regression algorithm, and determining the data types associated with the time periods and the scene categories corresponding to the time periods; and constructing a second relation table among the time periods, the scene categories and the data types of the terminal use data according to the time periods and the data types related to the scene categories corresponding to the time periods.
The regression algorithm may be a ridge regression algorithm, a multiple regression algorithm, a lasso (least absolute shrinkage and selection operator) regression algorithm, or the like. In the following, ridge regression is taken as an example for learning the relationship among the scene categories, the time periods, and the data types of the terminal usage data.
Wherein, in some embodiments, the second association analysis module 404 is further configured to: establishing an optimization target equation of ridge regression by taking the association degree of the data type associated with the time period as an independent variable and taking the scene type corresponding to the time period as a dependent variable; solving the optimization objective equation according to a gradient descent method to obtain a weight vector; determining a weight of a data type associated with the time period according to the weight vector; and taking the data type with the weight larger than a second preset threshold value as a data type associated with the time period and the scene category corresponding to the time period.
For example, the preset regression algorithm is a ridge regression algorithm. After the scene category of time period $t_i$ is determined, the scene category is used as the $y$ value of the ridge regression model, that is, the dependent variable, and $d(t_i)$ is used as the $x$ vector of the model, that is, the independent variable. The optimization objective equation of the ridge regression learning model is constructed as:
Figure BDA0002022113160000151
where the weight vector $W = (w_1, w_2, \dots, w_j, \dots, w_m)$ holds the weight of each data type, $m$ denotes the number of data types in time period $t_i$, and $w_j$ denotes the weight of the $j$-th of the $m$ data types, with $j \in \{1, \dots, m\}$.
Specifically, the second association analysis module 404 may solve the equation with a gradient descent algorithm to obtain the weight vector $W$. The weight of each data type associated with time period $t_i$ can then be determined from the weight vector.
For example, for time period $t_1$, if the weight of a data type is greater than a second preset threshold, the data type is judged to be strongly associated with both time period $t_1$ and the scene category corresponding to $t_1$, and the data type is retained; if the weight is less than or equal to the second preset threshold, the data type is discarded. In this way, the data types associated with both each time period and its corresponding scene category can be found, and a second relation table can be generated. The second relation table may be expressed as $\langle t_i, c_i, d'(t_i) \rangle$, where $c_i$ denotes the scene category of time period $t_i$ and $d'(t_i)$ denotes the data types strongly associated with both time period $t_i$ and scene category $c_i$.
Specifically, in another optional embodiment, the apparatus further includes a feature extraction module, configured to determine a target time period and a scenario category of the target time period; determining data types associated with the target time period and the scene categories corresponding to the target time period according to the second relation table; acquiring terminal use data belonging to the determined data type in the target time period; and extracting data characteristics from the acquired terminal use data according to a preset characteristic extraction algorithm to serve as time domain characteristics corresponding to the target time period.
After the second relation table of triples among the time periods, the scene categories, and the data types of the terminal usage data is obtained, the feature extraction module can extract the data features of a given time period according to it. For example, to obtain the data features of a target time period, the scene category of that time period is determined first; the data types associated with both the time period and the scene category are then looked up in the second relation table, the terminal usage data belonging to those data types is acquired, and time domain features are extracted with a time domain feature extraction method. Taking acceleration sensor data as an example, statistics are computed over the acceleration sensor data sequence within the target time period to obtain time domain features of the sequence such as the peak value, mean value, root mean square value, kurtosis index, and form factor.
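The lookup that precedes feature extraction can be sketched as a simple filter over stored usage records. The dictionary layout of the second relation table, the record fields, and the function name `select_usage_data` are assumptions made for this illustration only.

```python
def select_usage_data(second_table, usage_records, target_period):
    """Return the scene category and the records worth extracting features from.

    second_table maps a period index to (scene_category, set_of_data_types),
    an assumed in-memory form of the <t_i, c_i, d'(t_i)> triples.
    """
    scene, data_types = second_table[target_period]
    selected = [rec for rec in usage_records
                if rec["period"] == target_period and rec["data_type"] in data_types]
    return scene, selected

# Hypothetical table entry and usage records.
second_table = {8: ("commuting", {"gps", "accelerometer"})}
usage_records = [
    {"period": 8, "data_type": "gps", "value": (22.54, 114.06)},
    {"period": 8, "data_type": "payment_app", "value": 1},
    {"period": 8, "data_type": "accelerometer", "value": 0.98},
    {"period": 9, "data_type": "gps", "value": (22.55, 114.07)},
]
scene, selected = select_usage_data(second_table, usage_records, target_period=8)
```

Only the records of the target time period whose data type appears in the table entry are passed on, so the feature extraction step never touches data types judged unrelated to the period and scene.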
In a specific implementation, the present application is not limited to the described order of execution of the steps; where no conflict arises, some steps may be performed in a different order or simultaneously.
As can be seen from the above, the data processing apparatus provided in the embodiments of the present application acquires terminal usage data of a plurality of time periods, where the terminal usage data includes a plurality of data types, performs statistics on that data, and constructs a first relation table between the time periods and the data types of the terminal usage data according to the statistical result. The scene categories corresponding to the time periods are then acquired, and a second relation table among the time periods, the scene categories, and the data types of the terminal usage data is constructed according to the scene categories and the first relation table. This scheme not only mines the temporal characteristics of the terminal usage data and finds the association between its data types and the time periods, but also combines the scene categories to form a three-way relation table among data types, time periods, and scene categories, thereby realizing analysis of the temporal characteristics of the terminal usage data. When features are extracted from the data of a specific time period, the data features for that time period and scene can be extracted in a targeted manner according to the second relation table, which improves the accuracy of data feature extraction.
The embodiment of the application also provides the electronic equipment. The electronic device can be a smart phone, a tablet computer and the like. As shown in fig. 5, fig. 5 is a schematic view of a first structure of an electronic device according to an embodiment of the present application. The electronic device 300 comprises a processor 301 and a memory 302. The processor 301 is electrically connected to the memory 302.
The processor 301 is a control center of the electronic device 300, connects various parts of the entire electronic device using various interfaces and lines, and performs various functions of the electronic device and processes data by running or calling a computer program stored in the memory 302 and calling data stored in the memory 302, thereby performing overall monitoring of the electronic device.
In this embodiment, the processor 301 in the electronic device 300 loads instructions corresponding to one or more processes of the computer program into the memory 302 according to the following steps, and the processor 301 runs the computer program stored in the memory 302, so as to implement various functions:
acquiring terminal use data of a plurality of time periods, wherein the terminal use data comprises a plurality of data types;
counting the terminal use data of the time periods, and constructing a first relation table between the time periods and the data types of the terminal use data according to the counting result;
obtaining scene types corresponding to the multiple time periods;
and constructing a second relation table among the time periods, the scene categories and the data types of the terminal use data according to the scene categories corresponding to the time periods and the first relation table.
In some embodiments, after the step of building a second table of relationships between time periods, context categories and data types of the terminal usage data, the method further comprises:
determining a target time period and a scene category of the target time period;
determining data types associated with the target time period and the scene categories corresponding to the target time period according to the second relation table;
acquiring terminal use data belonging to the determined data type in the target time period;
and extracting data characteristics from the acquired terminal use data according to a preset characteristic extraction algorithm to serve as time domain characteristics corresponding to the target time period.
In some embodiments, the step of counting the terminal usage data of the plurality of time periods and constructing a first relation table between the time periods and the data types of the terminal usage data according to the statistical result includes:
counting the use frequency corresponding to each data type in each time period according to the terminal use data;
taking the data type with the use frequency larger than a first preset threshold value in a time period as a related data type of the time period, and taking the use frequency as the corresponding relevance degree;
and constructing a first relation table between the time periods and the data types of the terminal use data according to the data types and the association degrees associated with each time period.
In some embodiments, the step of constructing a second relationship table between the time periods, the scenario categories, and the data types of the terminal usage data according to the scenario categories corresponding to the plurality of time periods and the first relationship table includes:
performing regression analysis on the scene categories corresponding to the time periods, the associated data types and the association degrees based on a preset regression algorithm, and determining the data types associated with the time periods and the scene categories corresponding to the time periods;
and constructing a second relation table among the time periods, the scene categories and the data types of the terminal use data according to the time periods and the data types related to the scene categories corresponding to the time periods.
In some embodiments, the predetermined regression algorithm is a ridge regression algorithm; performing regression analysis on the scenario categories corresponding to the time periods, the associated data types and the association degrees based on a preset regression algorithm, and determining the data types associated with the time periods and the scenario categories corresponding to the time periods, wherein the method comprises the following steps:
establishing an optimization target equation of ridge regression by taking the association degree of the data type associated with the time period as an independent variable and taking the scene type corresponding to the time period as a dependent variable;
solving the optimization objective equation according to a gradient descent method to obtain a weight vector;
determining a weight of a data type associated with the time period according to the weight vector;
and taking the data type with the weight larger than a second preset threshold value as a data type associated with the time period and the scene category corresponding to the time period.
In some embodiments, before the step of obtaining terminal usage data for a plurality of time periods, the method further comprises:
the preset time interval is divided into a plurality of time periods.
In some embodiments, the step of dividing the preset time interval into a plurality of time periods comprises:
calculating the total information entropy of a preset time interval according to terminal use data in the preset time interval;
dividing the time interval into two time periods according to each time division point in a preset value range, and calculating weighted average information entropy of the two divided time periods;
determining a time division point with the maximum difference value between the weighted average information entropy and the total information entropy;
dividing the time interval into a first time period and a second time period according to the time division point;
and taking the first time period and the second time period as new time intervals, and repeatedly executing the steps until the number of the segments of the time intervals is greater than a third preset threshold value.
In some embodiments, the step of calculating the total information entropy of the preset time interval according to the terminal usage data in the preset time interval includes:
acquiring launch data of a target application from the terminal use data;
and calculating the total information entropy of the preset time interval according to the launch data.
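One possible reading of these two steps is sketched below. The record layout (dictionaries with `type`, `app` and `hour` keys) and the choice to compute the entropy over the distribution of launch hours are assumptions made for illustration; the text does not specify either.

```python
import math
from collections import Counter


def launch_entropy(usage_records, target_app, start, end):
    """Shannon entropy (bits) of the hours at which target_app was launched
    within [start, end). The record fields used here are hypothetical."""
    hours = [r["hour"] for r in usage_records
             if r["type"] == "app_launch"
             and r["app"] == target_app
             and start <= r["hour"] < end]
    total = len(hours)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(hours).values())
```

Records of other data types, or launches of other applications, are filtered out before the entropy is computed.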
Memory 302 may be used to store computer programs and data. The memory 302 stores computer programs containing instructions executable in the processor. The computer program may constitute various functional modules. The processor 301 executes various functional applications and data processing by calling a computer program stored in the memory 302.
In some embodiments, referring to fig. 6, which is a second schematic structural diagram of the electronic device provided in the embodiments of the present application, the electronic device 300 further includes: a radio frequency circuit 303, a display screen 304, a control circuit 305, an input unit 306, an audio circuit 307, a sensor 308, and a power supply 309. The processor 301 is electrically connected to the radio frequency circuit 303, the display screen 304, the control circuit 305, the input unit 306, the audio circuit 307, the sensor 308, and the power supply 309, respectively.
The radio frequency circuit 303 is used for transceiving radio frequency signals to communicate with a network device or other electronic devices through wireless communication.
The display screen 304 may be used to display information entered by or provided to the user as well as various graphical user interfaces of the electronic device, which may be comprised of images, text, icons, video, and any combination thereof.
The control circuit 305 is electrically connected to the display screen 304, and is used for controlling the display screen 304 to display information.
The input unit 306 may be used to receive input numbers, character information, or user characteristic information (e.g., fingerprint), and to generate keyboard, mouse, joystick, optical, or trackball signal inputs related to user settings and function control. The input unit 306 may include a fingerprint recognition module.
The audio circuit 307 may provide an audio interface between the user and the electronic device through a speaker and a microphone. The audio circuit 307 includes a microphone, which is electrically connected to the processor 301 and is used for receiving voice information input by a user.
The sensor 308 is used to collect external environmental information. The sensor 308 may include one or more of an ambient light sensor, an acceleration sensor, a gyroscope, and the like.
The power supply 309 is used to power the various components of the electronic device 300. In some embodiments, the power supply 309 may be logically connected to the processor 301 through a power management system, so that charging, discharging, and power consumption are managed through the power management system.
Although not shown in fig. 6, the electronic device 300 may further include a camera, a bluetooth module, and the like, which are not described in detail herein.
As can be seen from the above, the electronic device may obtain terminal usage data of a plurality of time periods, where the terminal usage data includes a plurality of data types, perform statistics on the terminal usage data of the plurality of time periods, and construct a first relation table between the time periods and the data types of the terminal usage data according to the statistical result. The electronic device then obtains the scene categories corresponding to the time periods and, according to the scene categories and the first relation table, constructs a second relation table among the time periods, the scene categories and the data types of the terminal usage data. With this scheme, the time characteristics of the terminal usage data can be mined and the association between the data types of the terminal usage data and the time periods can be found. In addition, by combining the scene categories, a three-way relation table among the data types, the time periods and the scene categories can be formed, so that the time characteristics of the terminal usage data are analyzed. When feature extraction is performed on data in a specific time period, the data features for that time period and scene can be extracted in a targeted manner according to the second relation table, which improves the accuracy of data feature extraction.
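The two relation tables summarized above can be illustrated with a small sketch. It assumes that the statistics are usage frequencies compared against a preset threshold, that usage records are `(time_period, data_type)` pairs, and that the second table is keyed by `(time_period, scene_category)`; it also omits the regression step used in some embodiments. The function names and data layout are hypothetical.

```python
from collections import Counter, defaultdict


def build_first_table(records, min_frequency):
    """records: iterable of (time_period, data_type) pairs.
    Returns {time_period: {data_type: frequency}} keeping only data types
    whose frequency is larger than min_frequency."""
    counts = defaultdict(Counter)
    for period, data_type in records:
        counts[period][data_type] += 1
    return {period: {dt: n for dt, n in c.items() if n > min_frequency}
            for period, c in counts.items()}


def build_second_table(first_table, scene_by_period):
    """Combine the first table with the scene category of each time period.
    Returns {(time_period, scene_category): [data types]}."""
    return {(period, scene_by_period[period]): sorted(types)
            for period, types in first_table.items()
            if period in scene_by_period}


def data_types_for(second_table, period, scene):
    """Look up the data types to extract features from for a period and scene."""
    return second_table.get((period, scene), [])
```

A lookup such as `data_types_for(second_table, "morning", "commute")` then returns the data types from which features would be extracted for that time period and scene.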
An embodiment of the present application further provides a storage medium, where a computer program is stored in the storage medium, and when the computer program runs on a computer, the computer executes the data processing method according to any of the above embodiments.
It should be noted that all or part of the steps in the methods of the above embodiments may be implemented by a computer program instructing related hardware, and the computer program may be stored in a computer-readable storage medium, which may include, but is not limited to: a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, and the like.
The term "module" as used herein may be considered a software object executing on the computing system. The different components, modules, engines, and services described herein may be considered as implementation objects on the computing system. The apparatus and method described herein may be implemented in software, but may also be implemented in hardware, and are within the scope of the present application.
The terms "first", "second", and "third", etc. in this application are used to distinguish between different objects and not to describe a particular order. Furthermore, the terms "include" and "have," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or modules is not limited to only those steps or modules listed, but rather, some embodiments may include other steps or modules not listed or inherent to such process, method, article, or apparatus.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.
The data processing method, the data processing apparatus, the storage medium, and the electronic device provided in the embodiments of the present application are described in detail above. The principle and the implementation of the present application are explained herein by applying specific examples, and the above description of the embodiments is only used to help understand the method and the core idea of the present application; meanwhile, for those skilled in the art, according to the idea of the present application, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present application.

Claims (11)

1. A data processing method, comprising:
acquiring terminal use data of a plurality of time periods, wherein the terminal use data comprises a plurality of data types;
counting the terminal use data of the time periods, and constructing a first relation table between the time periods and the data types of the terminal use data according to the counting result;
obtaining scene categories corresponding to the plurality of time periods;
and constructing a second relation table among the time periods, the scene categories and the data types of the terminal use data according to the scene categories corresponding to the time periods and the first relation table.
2. The data processing method of claim 1, wherein after the step of building a second table of relationships between time periods, scene categories, and data types of the terminal usage data, the method further comprises:
determining a target time period and a scene category of the target time period;
determining data types associated with the target time period and the scene categories corresponding to the target time period according to the second relation table;
acquiring terminal use data belonging to the determined data type in the target time period;
and extracting data characteristics from the acquired terminal use data according to a preset characteristic extraction algorithm to serve as time domain characteristics corresponding to the target time period.
3. The data processing method of claim 1, wherein the step of counting the terminal usage data of the plurality of time periods and constructing a first relation table between the time periods and the data types of the terminal usage data according to the statistical result comprises:
counting the use frequency corresponding to each data type in each time period according to the terminal use data;
taking the data type with the use frequency larger than a first preset threshold value in a time period as a related data type of the time period, and taking the use frequency as the corresponding relevance degree;
and constructing a first relation table between the time periods and the data types of the terminal use data according to the data types and the association degrees associated with each time period.
4. The data processing method according to claim 3, wherein the step of constructing a second relation table between the time periods, the scene categories, and the data types of the terminal usage data based on the scene categories corresponding to the plurality of time periods and the first relation table comprises:
performing regression analysis on the scene categories corresponding to the time periods, the associated data types and the association degrees based on a preset regression algorithm, and determining the data types associated with the time periods and the scene categories corresponding to the time periods;
and constructing a second relation table among the time periods, the scene categories and the data types of the terminal use data according to the time periods and the data types related to the scene categories corresponding to the time periods.
5. The data processing method of claim 4, wherein the preset regression algorithm is a ridge regression algorithm; and the step of performing regression analysis on the scene categories corresponding to the time periods, the associated data types and the association degrees based on the preset regression algorithm, and determining the data types associated with the time periods and the scene categories corresponding to the time periods, comprises:
establishing an optimization objective equation of ridge regression by taking the association degree of the data type associated with the time period as an independent variable and taking the scene category corresponding to the time period as a dependent variable;
solving the optimization objective equation according to a gradient descent method to obtain a weight vector;
determining a weight of a data type associated with the time period according to the weight vector;
and taking the data type with the weight larger than a second preset threshold value as a data type associated with the time period and the scene category corresponding to the time period.
6. The data processing method of any of claims 1 to 5, wherein prior to the step of obtaining terminal usage data for a plurality of time periods, the method further comprises:
dividing a preset time interval into a plurality of time periods.
7. The data processing method of claim 6, wherein the step of dividing the preset time interval into a plurality of time periods comprises:
calculating the total information entropy of a preset time interval according to terminal use data in the preset time interval;
dividing the time interval into two time periods according to each time division point in a preset value range, and calculating weighted average information entropy of the two divided time periods;
determining the time division point for which the difference between the weighted average information entropy and the total information entropy is largest;
dividing the time interval into a first time period and a second time period according to the time division point;
and taking the first time period and the second time period as new time intervals, and repeating the above steps until the number of time periods obtained by division is greater than a third preset threshold value.
8. The data processing method of claim 7, wherein the step of calculating the total information entropy for a preset time interval according to the terminal usage data within the preset time interval comprises:
acquiring launch data of a target application from the terminal use data;
and calculating the total information entropy of the preset time interval according to the launch data.
9. A data processing apparatus, comprising:
the data acquisition module is used for acquiring terminal use data of a plurality of time periods, wherein the terminal use data comprises a plurality of data types;
the first association analysis module is used for counting the terminal use data of the time periods and constructing a first relation table between the time periods and the data types of the terminal use data according to the counting result;
the category acquisition module is used for acquiring scene categories corresponding to the time periods;
and the second association analysis module is used for constructing a second relation table among the time periods, the scene categories and the data types of the terminal use data according to the scene categories corresponding to the time periods and the first relation table.
10. A storage medium having stored thereon a computer program, characterized in that, when the computer program runs on a computer, it causes the computer to execute a data processing method according to any one of claims 1 to 8.
11. An electronic device comprising a processor and a memory, said memory storing a computer program, characterized in that said processor is adapted to execute the data processing method according to any of claims 1 to 8 by invoking said computer program.
CN201910282435.2A 2019-04-09 2019-04-09 Data processing method, data processing device, storage medium and electronic equipment Pending CN111797071A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910282435.2A CN111797071A (en) 2019-04-09 2019-04-09 Data processing method, data processing device, storage medium and electronic equipment

Publications (1)

Publication Number Publication Date
CN111797071A true CN111797071A (en) 2020-10-20

Family

ID=72805347

Country Status (1)

Country Link
CN (1) CN111797071A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103870550A (en) * 2014-03-03 2014-06-18 同济大学 User behavior pattern acquisition method based on Android system and system thereof
US20150347784A1 (en) * 2014-05-30 2015-12-03 Apple Inc. Managing user information - authorization masking
CN107169525A (en) * 2017-06-01 2017-09-15 腾云天宇科技(北京)有限公司 A kind of method, device and mobile terminal for determining mobile terminal application scenarios
CN107454258A (en) * 2017-07-31 2017-12-08 广东欧珀移动通信有限公司 The method to set up of mobile terminal and its contextual model, computer-readable recording medium
CN107783801A (en) * 2017-11-06 2018-03-09 广东欧珀移动通信有限公司 Application program forecast model is established, preloads method, apparatus, medium and terminal
CN107943583A (en) * 2017-11-14 2018-04-20 广东欧珀移动通信有限公司 Processing method, device, storage medium and the electronic equipment of application program
CN108763502A (en) * 2018-05-30 2018-11-06 腾讯科技(深圳)有限公司 Information recommendation method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination