CN111444170A - Automatic machine learning method and device based on predicted business scene - Google Patents

Automatic machine learning method and device based on predicted business scene

Info

Publication number
CN111444170A
Authority
CN
China
Prior art keywords
data
imported
machine learning
user
predicted
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811618614.0A
Other languages
Chinese (zh)
Other versions
CN111444170B (en)
Inventor
王敏
秦川
周振华
李瀚�
刘勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
4Paradigm Beijing Technology Co Ltd
Original Assignee
4Paradigm Beijing Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 4Paradigm Beijing Technology Co Ltd
Priority to CN201811618614.0A
Publication of CN111444170A
Application granted
Publication of CN111444170B
Legal status: Active
Anticipated expiration

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present disclosure provides a method and apparatus for automatic machine learning based on a predicted business scenario. The automatic machine learning method may include: extracting a data paradigm corresponding to a predicted business scenario; providing data import guidance based on the extracted data paradigm; receiving data items imported according to the data import guidance; and performing automatic model training according to the imported data items, wherein the data paradigm includes at least a data table category corresponding to the predicted business scenario. According to the method and apparatus, data import guidance can be provided to the user, making the automatic machine learning product or method easier to use and lowering the usage threshold.

Description

Automatic machine learning method and device based on predicted business scene
Technical Field
The present disclosure relates generally to machine learning techniques and, more particularly, to a method and apparatus for automatic machine learning based on predictive business scenarios.
Background
Existing machine learning-based modeling methods involve the following operations: obtaining historical data records, performing data processing and feature processing to obtain training samples, and performing model training by using the training samples according to a specific modeling algorithm.
To obtain a model for predicting specific information, a data scientist or professional modeler must determine, from modeling experience and an understanding of each business scenario, which data are suitable for building that model. Because it depends on modeling experience and business understanding, modeling has a high threshold, and people unfamiliar with the modeling method or the business scenario cannot easily complete a modeling task.
However, existing automatic machine learning techniques require data to be imported in a fixed format: the user must prepare data according to that format, and often must first convert existing data into the required format before it can be used to train a model. The operation is cumbersome, and sometimes the user cannot complete the data preparation unaided.
Disclosure of Invention
Exemplary embodiments of the present disclosure aim to overcome the disadvantages of the prior art automatic machine learning techniques in which data preparation is inconvenient.
According to an exemplary embodiment of the present disclosure, an automatic machine learning method based on a predicted business scenario is provided. The automatic machine learning method may include: extracting a data paradigm corresponding to a predicted business scenario; providing data import guidance based on the extracted data paradigm; receiving data items imported according to the data import guidance; and performing automatic model training according to the imported data items, wherein the data paradigm includes at least a data table category corresponding to the predicted business scenario.
Optionally, the data tables corresponding to the data table category include at least one of the following: at least two subject data tables; at least two subject data tables and at least one relationship data table describing the relationships between the at least two subject data tables; at least one subject data table and a business table corresponding to the at least one subject data table.
Optionally, the data import guidance includes at least one of: an interaction control for guiding a user to import the data table corresponding to each data table category; an interaction control for guiding the user to specify an item to be predicted in the imported data tables or to construct the item to be predicted from the imported data tables; an interaction control for guiding the user to specify the primary key of an imported data table; an interaction control for guiding the user to specify the time type of an imported data table; an interaction control for guiding the user to specify the field types of an imported data table; an interaction control for guiding the user to establish association relationships among the imported data tables; and a scenario configuration table for guiding the user to establish the business scenario related to the imported data tables.
Optionally, receiving the data items imported according to the data import guidance includes: receiving at least one of a data table, an item to be predicted, a primary key of a data table, a time type of a data table, a field type of a data table, an association relationship between data tables, and a scenario configuration table, each imported according to the data import guidance.
Optionally, performing automatic model training according to the imported data items includes: splicing the data tables into a spliced data table based on the imported data items and performing feature extraction to obtain a training sample table; and automatically performing machine learning using the training samples in the training sample table to obtain a machine learning model.
Optionally, the predicted business scenario relates to a marketing scenario, an anti-fraud scenario, and/or a recommendation scenario.
Optionally, the predicted business scenario relates to a marketing scenario, and the data table categories include at least: a user table, a product table, and a behavior table.
Optionally, the time type of a data table includes at least one of: a flow table, a static table, a zipper table, and a slice table.
According to another exemplary embodiment of the present disclosure, a system is provided comprising at least one computing device and at least one storage device storing instructions, wherein the instructions, when executed by the at least one computing device, cause the at least one computing device to perform the automatic machine learning method as described above.
According to another exemplary embodiment of the present disclosure, a computer-readable storage medium storing instructions that, when executed by at least one computing device, cause the at least one computing device to perform the automatic machine learning method as described above is provided.
According to another exemplary embodiment of the present disclosure, an automatic machine learning device based on a predicted business scenario is provided. The automatic machine learning device includes: a data paradigm extraction unit that extracts a data paradigm corresponding to a predicted business scenario; a guidance unit that provides data import guidance based on the extracted data paradigm; a data receiving unit that receives data items imported according to the data import guidance; and a model training unit that performs automatic model training according to the imported data items, wherein the data paradigm includes at least a data table category corresponding to the predicted business scenario.
Optionally, the data tables corresponding to the data table category include at least one of the following: at least two subject data tables; at least two subject data tables and at least one relationship data table describing the relationships between the at least two subject data tables; at least one subject data table and a business table corresponding to the at least one subject data table.
Optionally, the data import guidance includes at least one of: an interaction control for guiding a user to import the data table corresponding to each data table category; an interaction control for guiding the user to specify an item to be predicted in the imported data tables or to construct the item to be predicted from the imported data tables; an interaction control for guiding the user to specify the primary key of an imported data table; an interaction control for guiding the user to specify the time type of an imported data table; an interaction control for guiding the user to specify the field types of an imported data table; an interaction control for guiding the user to establish association relationships among the imported data tables; and a scenario configuration table for guiding the user to establish the business scenario related to the imported data tables.
Optionally, the data receiving unit receives at least one of a data table, an item to be predicted, a primary key of a data table, a time type of a data table, a field type of a data table, an association relationship between data tables, and a scenario configuration table, each imported according to the data import guidance.
Optionally, the model training unit splices the data tables into a spliced data table based on the imported data items and performs feature extraction to obtain a training sample table, and automatically performs machine learning using the training samples in the training sample table to obtain a machine learning model.
Optionally, the predicted business scenario relates to a marketing scenario, an anti-fraud scenario, and/or a recommendation scenario.
Optionally, the predicted business scenario relates to a marketing scenario, and the data table categories include at least: a user table, a product table, and a behavior table.
Optionally, the time type of a data table includes at least one of: a flow table, a static table, a zipper table, and a slice table.
In the present disclosure, data import guidance is provided to the user according to the predicted business scenario, making it easier for the user to use an automatic machine learning product (e.g., a software product) or method and lowering the usage threshold. More specifically, throughout the process from data preparation to completed modeling, modeling can be performed automatically on the data items that the user imports according to the data import guidance. Here, the data paradigm on which the data import guidance is based constrains at least the data table categories corresponding to the predicted business scenario, rather than directly constraining the data fields of the training data table. This reduces the time the user spends preparing data tables and makes the tables convenient to import.
Additional aspects and/or advantages of the present general inventive concept will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the general inventive concept.
Drawings
The above and other objects and features of the exemplary embodiments of the present disclosure will become more apparent from the following description taken in conjunction with the accompanying drawings which illustrate exemplary embodiments, wherein:
FIG. 1 illustrates a flow diagram of a method of automatic machine learning based on predictive business scenarios, according to an exemplary embodiment of the present disclosure;
FIG. 2 illustrates a block diagram of an automatic machine learning device based on predictive business scenarios, according to an exemplary embodiment of the present disclosure.
Detailed Description
Reference will now be made in detail to the embodiments of the present disclosure, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to the like elements throughout. The embodiments are described below in order to explain the present disclosure by referring to the figures.
In an exemplary embodiment of the present disclosure, data import guidance may be provided to a user based on a data paradigm corresponding to a predicted business scenario, and the user may import data items according to that guidance. Because the guidance constrains the data table categories, the imported data is given the necessary constraints while the user is spared an excessive table-splitting burden. According to an exemplary embodiment of the present invention, the data paradigm (data pattern) defines at least the data table categories corresponding to the predicted business scenario. These categories are designed to correspond to the actual business, so that the related automatic machine learning solution (i.e., a system such as a software product, or a method) can keep the data import stage concise and rely on subsequent processing to carry out the automatic machine learning process.
Fig. 1 illustrates a flow diagram of a method of automatic machine learning based on predictive business scenarios, according to an exemplary embodiment of the present disclosure.
As shown in fig. 1, the automatic machine learning method of the present exemplary embodiment may include steps S110 to S140.
In step S110, a data paradigm corresponding to the predicted business scenario is extracted, where the data paradigm includes at least a data table category corresponding to the predicted business scenario. The data paradigm corresponding to a business scenario can be extracted through reasonable abstraction of that scenario. As an example, the data table categories may each relate to different objects, focusing on different aspects such as subjects, behaviors, and relationships. The division into data table categories may reflect the typical data holdings in the business scenario and may also facilitate the eventual data splicing, but the exemplary embodiments of the present invention are not limited thereto.
Here, the business prediction using the machine learning model is a typical way to mine value from data, and is applicable to various business scenarios, which often involve a complex data preparation process, that is, data used for generating the machine learning sample may come from a plurality of different tables, and sometimes may need to go through a complex splicing process, for example, a new statistical field or even a new table is generated in the splicing process.
As an example, the predicted business scenario may relate to a marketing scenario, an anti-fraud scenario, and/or a recommendation scenario. Here, different business scenarios may correspond to the same or different data paradigms.
As an example, the data tables corresponding to the data table category include at least one of: at least two subject data tables; at least two subject data tables and at least one relationship data table describing the relationships between the at least two subject data tables; at least one subject data table and a business table corresponding to the at least one subject data table. That is, according to an exemplary embodiment of the present invention, the data tables may describe the subjects related to the predicted business, the related business, and/or the relationships between subjects. A data paradigm may include only subject tables, or subject tables and corresponding business tables, or subject tables and the relationships between different subject tables, or any combination of the above; the exemplary embodiments of the present invention are not limited in this regard.
For example, the data table categories in a marketing scenario may include: a user table and a product table; or a user table and a behavior table representing the purchasing behavior of users; or a product table and a behavior table representing products being purchased; or a user table, a product table, and a behavior table representing users purchasing products.
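One way to picture a data paradigm is as a small record that lists the table categories a scenario expects. The following is a minimal sketch under that assumption; the names `DataParadigm`, `PARADIGMS`, and `extract_paradigm` are hypothetical and do not come from the disclosure, and only the marketing categories named in the text are registered.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class DataParadigm:
    """Hypothetical container: the table categories a business scenario expects."""
    scenario: str
    table_categories: tuple[str, ...]


# Illustrative registry. The category names follow the marketing example in the text;
# other scenarios would need their own entries.
PARADIGMS = {
    "marketing": DataParadigm("marketing", ("user", "product", "behavior")),
}


def extract_paradigm(scenario: str) -> DataParadigm:
    """Sketch of step S110 as a lookup: return the paradigm registered for a scenario."""
    try:
        return PARADIGMS[scenario]
    except KeyError:
        raise ValueError(f"no data paradigm registered for scenario {scenario!r}") from None
```

A lookup table is only one possible realization; the disclosure says the paradigm is obtained by abstracting the business scenario and does not prescribe how it is stored.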
As an example, one data table category may correspond to a class of data tables, each class of data tables may include one or more data tables.
In step S120, data import guidance is provided based on the extracted data paradigm.
Specifically, on the basis of the data paradigm, corresponding data import guidance can be provided to help the user import the relevant data tables and their related items. By way of example, the data import guidance may guide the user to import a data table, specify or construct an item to be predicted, specify a primary key, specify the time type of a data table, specify a field type, specify an association relationship, specify a scenario configuration, and so on.
Accordingly, the data import guidance includes at least one of: an interaction control for guiding a user to import the data table corresponding to each data table category; an interaction control for guiding the user to specify an item to be predicted in the imported data tables or to construct the item to be predicted from the imported data tables; an interaction control for guiding the user to specify the primary key of an imported data table; an interaction control for guiding the user to specify the time type of an imported data table; an interaction control for guiding the user to specify the field types of an imported data table; an interaction control for guiding the user to establish association relationships among the imported data tables; and a scenario configuration table for guiding the user to establish the business scenario related to the imported data tables.
By way of example, the interaction controls may be presented as pop-up windows, buttons, text displays, list displays, question-and-answer prompts, radio buttons, check boxes, and the like. User input may be received through interaction controls such as buttons, question-and-answer prompts, radio buttons, and check boxes, and the information required for automatic model training, i.e., the data items, may be determined from that input.
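As a rough illustration of how guidance could be derived from the table categories in a paradigm, the sketch below turns a list of categories into an ordered list of prompts that a user interface might render as controls. The function name `build_import_guide` and the action labels are assumptions made for this example, not terms from the disclosure.

```python
def build_import_guide(table_categories):
    """Return an ordered list of (action, target) prompts for a guided import flow.

    Per-table prompts are generated for every category; prompts that span
    several tables use None as the target.
    """
    steps = [("import_table", c) for c in table_categories]
    steps += [("specify_primary_key", c) for c in table_categories]
    steps += [("specify_time_type", c) for c in table_categories]
    steps += [("specify_field_types", c) for c in table_categories]
    steps.append(("specify_associations", None))
    steps.append(("specify_prediction_target", None))
    return steps


guide = build_import_guide(["user", "product", "behavior"])
```

Each tuple could back one pop-up, button, or check box; the ordering here is an arbitrary choice for the sketch.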
The interaction control for guiding the user to import the data table corresponding to each data table category can explain to the user what data each category of table to be imported should contain. For example, it may explain to the user that the user table records information about the users who are the marketing targets, the product table records information about the products to be marketed, and the behavior table records information about users' purchases of products.
The interaction control for the item to be predicted can guide the user either to specify an item to be predicted in an imported data table or to construct one from the imported data tables. The item to be predicted indicates what information the model to be trained will predict, for example, whether a user will purchase a specified product within a predetermined future period. A specified item to be predicted may be one or more data fields of a data table. A constructed item to be predicted may be obtained by operating on two or more data fields of a data table. In the prior art, when the prediction target of a model needs to change, the data format of the training data to be imported must be determined again and the data prepared anew in that format, which is inconvenient for the user of the modeling method. According to the exemplary embodiments of the present invention, by contrast, the item to be predicted or the way it is constructed can be specified flexibly, which improves the user experience.
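Constructing an item to be predicted from two or more fields can be sketched as adding a computed column. In the example below, the field names `paid` and `refunded`, the helper `construct_target`, and the rule "label is 1 when the net payment is positive" are all invented for illustration.

```python
def construct_target(rows, fields, combine, name="label"):
    """Return copies of rows with a new target column computed from existing fields."""
    return [{**row, name: combine(*(row[f] for f in fields))} for row in rows]


# Hypothetical behavior rows; the field names are illustrative only.
behavior_rows = [
    {"user_id": 1, "product_id": 10, "paid": 30.0, "refunded": 0.0},
    {"user_id": 2, "product_id": 10, "paid": 30.0, "refunded": 30.0},
]

# Target built from two fields: 1 if the purchase was kept, 0 if fully refunded.
labeled = construct_target(behavior_rows, ["paid", "refunded"], lambda p, r: int(p - r > 0))
```

Changing the prediction target then means passing a different `combine` function rather than re-preparing the tables.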
The interaction control for guiding the user to specify the primary key of an imported data table can guide the user to indicate which field of the imported data table is the primary key. For example, when the data tables include a user table, the user ID field in the user table may be designated as the primary key through the interaction control; when the data tables include a product table, the product ID field in the product table may be designated as the primary key; and when a data table includes attribute fields for both products and behaviors, both the user ID field and the product ID field in that data table may be designated as primary keys.
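A system that accepts a user-designated primary key would plausibly check that the designation is usable. The following sketch assumes a table is a list of dictionaries and treats a key as valid when every row has non-null key values and no combination repeats; `is_valid_primary_key` is a hypothetical helper, and this check is not described in the disclosure.

```python
def is_valid_primary_key(rows, key_fields):
    """True if every row has non-null key values and no key tuple repeats."""
    seen = set()
    for row in rows:
        key = tuple(row.get(f) for f in key_fields)
        if None in key or key in seen:
            return False
        seen.add(key)
    return True


users = [{"user_id": 1}, {"user_id": 2}]
behaviors = [
    {"user_id": 1, "product_id": 10},
    {"user_id": 1, "product_id": 11},
]
```

Here `user_id` alone identifies rows in `users`, while `behaviors` needs the composite key of `user_id` and `product_id`.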
The interaction control for guiding the user to specify the time type of an imported data table can guide the user to specify that time type, which mainly indicates whether the data table as a whole reflects data that is independent of time, data at a particular moment, and/or data over multiple time periods. For example, the user may specify at least one of a slice table, a static table, a zipper table, and a flow table. Further, the user can import flow tables, static tables, zipper tables, and/or slice tables as required and specify at least one of the primary key, field type, association relationship, item to be predicted, and scenario configuration table, which facilitates data table splicing, feature extraction, and/or training on the samples during automatic model training.
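The four time types can be captured as an enumeration so that a user's selection is validated on input. This is a minimal sketch; the class `TimeType`, the string values, and `parse_time_type` are assumptions for the example, and the code deliberately does not encode what each type means beyond its name.

```python
from enum import Enum


class TimeType(Enum):
    """Time types named in the text; the string values are illustrative."""
    FLOW = "flow"
    STATIC = "static"
    ZIPPER = "zipper"
    SLICE = "slice"


def parse_time_type(value: str) -> TimeType:
    """Normalize user input and map it to a TimeType, raising ValueError if unknown."""
    return TimeType(value.strip().lower())
```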
An interaction control for directing a user to specify a field type of an imported data table may be used to direct a user to specify a field type of at least one data field of an imported data table. As an example, the field types may include at least: numerical type, category type, date type. For example, a data field named age is designated as a numeric type, a data field named gender is designated as a categorical type (categorical means a data type whose value category range is known, e.g., gender), and a data field named consumption date is designated as a date type.
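Once field types are specified, raw imported values can be converted accordingly. The sketch below assumes values arrive as strings and that dates use ISO format; the mapping `CASTERS` and the helper `cast_row` are hypothetical, and the field names follow the age, gender, and consumption date example in the text.

```python
from datetime import date

# Hypothetical mapping from the three field types in the text to converters.
CASTERS = {
    "numeric": float,
    "categorical": str,
    "date": date.fromisoformat,  # assumes ISO dates such as "2018-12-28"
}


def cast_row(row, field_types):
    """Convert each value using its declared field type; untyped fields pass through."""
    return {
        f: (CASTERS[field_types[f]](v) if f in field_types else v)
        for f, v in row.items()
    }


field_types = {"age": "numeric", "gender": "categorical", "consumption_date": "date"}
typed = cast_row({"age": "31", "gender": "F", "consumption_date": "2018-12-28"}, field_types)
```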
The interaction control for guiding the user to establish association relationships among the imported data tables can guide the user to specify those relationships. For example, when the data tables include a user table, a product table, and a behavior table representing users purchasing products, the user may specify through the interaction control that the user table with the user ID as its primary key, the product table with the product ID as its primary key, and the behavior table with the user ID and the product ID as its primary key are associated through the user ID and the product ID.
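An association declared between two tables can be sanity-checked by looking for key values on one side that have no counterpart on the other. The helper `dangling_keys` below is a hypothetical sketch of such a check, again assuming tables are lists of dictionaries; the disclosure does not describe this validation.

```python
def dangling_keys(child_rows, parent_rows, key):
    """Return the sorted key values in child_rows that have no match in parent_rows."""
    parent_keys = {row[key] for row in parent_rows}
    return sorted({row[key] for row in child_rows} - parent_keys)


user_table = [{"user_id": 1}, {"user_id": 2}]
behavior_table = [
    {"user_id": 1, "product_id": 10},
    {"user_id": 3, "product_id": 10},
]
missing = dangling_keys(behavior_table, user_table, "user_id")
```

A non-empty result suggests that the declared association would leave some behavior rows without user attributes when the tables are later spliced.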
The scenario configuration table, which guides the user to establish the business scenario related to the imported data tables, can be used to guide the user to specify which data fields describe the business scenario.
In the above manner, a user may be provided with a guide for inputting a data table and its related items or additional items, and it should be noted that exemplary embodiments of the present invention are not limited to the above items.
In step S130, a data item imported according to the data import guide is received.
Here, the data item refers to a specific data table corresponding to the data table category defined in the data paradigm, and its related item and/or additional item.
As an example, step S130 may include: receiving at least one of a data table, an item to be predicted, a primary key of a data table, a time type of a data table, a field type of a data table, an association relationship between data tables, and a scenario configuration table, each imported according to the data import guidance.
As an example, the predicted business scenario relates to a marketing scenario, and the imported data items include: a table of users to be marketed to, a table of products to be marketed, a behavior table, the primary key of the user table, the primary key of the product table, and the primary key of the behavior table, wherein the primary key of the behavior table comprises the primary key of the user table and the primary key of the product table.
In step S140, automatic model training is performed according to the imported data item.
According to an exemplary embodiment of the present invention, since the imported data items relate to the data table categories and some additional information related thereto, data table splicing, feature extraction, and/or model training may be automatically performed according to the imported data items.
As an example, step S140 may include: splicing the data tables into a spliced data table based on the imported data items and performing feature extraction to obtain a training sample table; and automatically performing machine learning using the training samples in the training sample table to obtain a machine learning model.
For example, in step S140, data table splicing, feature extraction, and/or learning of the machine learning model may be performed according to at least one of the imported data table, the item to be predicted, the primary key of the data table, the time type of the data table, the field type of the data table, the association relationship between the data tables, and the scenario configuration table.
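The splicing and feature extraction described for step S140 can be sketched with plain dictionaries. The example assumes the marketing tables, a left join of user and product attributes onto behavior rows via the primary keys, and a trivial feature extraction that just selects columns; `splice` and `to_samples` are hypothetical names, and model training itself is left out.

```python
def splice(behaviors, users, products):
    """Left-join user and product attributes onto each behavior row by primary key."""
    users_by_id = {u["user_id"]: u for u in users}
    products_by_id = {p["product_id"]: p for p in products}
    spliced = []
    for b in behaviors:
        row = dict(b)
        row.update(users_by_id.get(b["user_id"], {}))
        row.update(products_by_id.get(b["product_id"], {}))
        spliced.append(row)
    return spliced


def to_samples(spliced, feature_fields, target_field):
    """Turn spliced rows into (features, target) pairs; missing features become None."""
    return [([row.get(f) for f in feature_fields], row[target_field]) for row in spliced]


users = [{"user_id": 1, "age": 31}]
products = [{"product_id": 10, "price": 9.5}]
behaviors = [{"user_id": 1, "product_id": 10, "purchased": 1}]

samples = to_samples(splice(behaviors, users, products), ["age", "price"], "purchased")
```

The resulting `samples` list plays the role of the training sample table and could be handed to any learning algorithm.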
In existing automatic machine learning products, a model is produced automatically once data has been input, but the product takes no part in the data preparation process and has no influence on what data should be used. Instead, a data scientist or modeler finds suitable data for each scenario based on personal modeling experience and business understanding. This step remains a barrier for junior modelers and business personnel.
According to an exemplary embodiment of the present invention, the above problem is solved through a technical product framework that takes flexibility and extensibility into account. Optionally, once the abstraction of the data and the data relationships is complete, each step in the remaining process (table splicing, feature engineering, algorithm selection, and parameter tuning) may be carried out in multiple ways, and each method may be pluggable, modifiable, and decoupled from the others.
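The idea of pluggable, decoupled steps can be illustrated by treating each stage as an interchangeable callable. This is only a sketch of the general pattern; `run_pipeline` and the toy steps are hypothetical and stand in for real splicing, feature engineering, and training methods.

```python
from typing import Any, Callable, Sequence

Step = Callable[[Any], Any]


def run_pipeline(data: Any, steps: Sequence[Step]) -> Any:
    """Apply each step to the output of the previous one and return the final result."""
    for step in steps:
        data = step(data)
    return data


# Toy stand-ins: any step can be swapped for another with the same input and output shape.
def drop_empty(rows):
    return [r for r in rows if r]


def count_rows(rows):
    return len(rows)


result = run_pipeline([{"a": 1}, {}, {"a": 2}], [drop_empty, count_rows])
```

Because each step only depends on the shape of its input, one method can be replaced without touching the others.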
FIG. 2 illustrates a block diagram of an automatic machine learning device based on predictive business scenarios, according to an exemplary embodiment of the present disclosure.
As shown in FIG. 2, the automatic machine learning device 200 of the present exemplary embodiment includes: a data paradigm extraction unit 210 that extracts a data paradigm corresponding to the predicted business scenario; a guidance unit 220 that provides data import guidance based on the extracted data paradigm; a data receiving unit 230 that receives data items imported according to the data import guidance; and a model training unit 240 that performs automatic model training according to the imported data items, wherein the data paradigm includes at least a data table category corresponding to the predicted business scenario.
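The four units can be pictured as four collaborators wired together in sequence. The class below is a minimal sketch assuming each unit is supplied as a callable; `AutoMLDevice` and its parameter names are invented for the example and do not describe the actual implementation.

```python
class AutoMLDevice:
    """Hypothetical composition of the four units, each passed in as a callable."""

    def __init__(self, extract, guide, receive, train):
        self.extract = extract  # role of the data paradigm extraction unit (210)
        self.guide = guide      # role of the guidance unit (220)
        self.receive = receive  # role of the data receiving unit (230)
        self.train = train      # role of the model training unit (240)

    def run(self, scenario):
        """Chain the units: scenario -> paradigm -> guidance -> data items -> model."""
        paradigm = self.extract(scenario)
        guidance = self.guide(paradigm)
        items = self.receive(guidance)
        return self.train(items)


# Stub units, for illustration only.
device = AutoMLDevice(
    extract=lambda scenario: ["user"],
    guide=lambda paradigm: [("import_table", c) for c in paradigm],
    receive=lambda guidance: {"tables": [target for _, target in guidance]},
    train=lambda items: ("model", items["tables"]),
)
```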
As an example, the data tables corresponding to the data table category include at least one of: at least two subject data tables; at least two subject data tables and at least one relationship data table describing the relationships between the at least two subject data tables; at least one subject data table and a business table corresponding to the at least one subject data table.
As an example, the data import guidance includes at least one of: an interaction control for guiding a user to import the data table corresponding to each data table category; an interaction control for guiding the user to specify an item to be predicted in the imported data tables or to construct the item to be predicted from the imported data tables; an interaction control for guiding the user to specify the primary key of an imported data table; an interaction control for guiding the user to specify the time type of an imported data table; an interaction control for guiding the user to specify the field types of an imported data table; an interaction control for guiding the user to establish association relationships among the imported data tables; and a scenario configuration table for guiding the user to establish the business scenario related to the imported data tables.
As an example, the data receiving unit 230 receives at least one of a data table, an item to be predicted, a primary key of a data table, a time type of a data table, a field type of a data table, an association relationship between data tables, and a scenario configuration table, each imported according to the data import guidance.
As an example, the model training unit 240 splices the data tables into a data splicing table based on the imported data items, and performs feature extraction to obtain a training sample table; and automatically performing machine learning by using the training samples in the training sample table to obtain a machine learning model.
By way of example, the predicted business scenario relates to a marketing scenario, an anti-fraud scenario, and/or a recommendation scenario.
As an example, the predicted business scenario relates to a marketing scenario, and the data table categories include at least: a user table, a product table, and a behavior table.
By way of example, the time type of a data table includes at least one of: a flow table, a static table, a zipper table, and a slice table.
It should be understood that the specific implementation of the automatic machine learning device according to the exemplary embodiment of the present disclosure may be implemented with reference to the related specific implementation described in conjunction with fig. 1, and will not be described herein again.
The various elements of the automatic machine learning device shown in fig. 2 may each be configured as software, hardware, firmware, or any combination thereof that performs a particular function. For example, these units may correspond to dedicated integrated circuits, to pure software code, or to units or modules of software combined with hardware. Furthermore, one or more functions implemented by these units may also be performed collectively by components in a physical entity device (e.g., a processor, a client or a server, etc.).
The automatic machine learning method and apparatus according to exemplary embodiments of the present disclosure are described above with reference to fig. 1 and 2. It is to be understood that the above-described method may be implemented by a program recorded on a computer-readable medium. For example, according to an exemplary embodiment of the present disclosure, there may be provided a computer-readable storage medium storing instructions that, when executed by at least one computing device, cause the at least one computing device to perform: extracting a data paradigm corresponding to a predicted business scenario; providing data import guidance based on the extracted data paradigm; receiving data items imported according to the data import guidance; and performing automatic model training according to the imported data items, wherein the data paradigm at least comprises: a data table category corresponding to the predicted business scenario.
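The four steps performed by the stored instructions can be sketched as a pipeline whose stages are supplied as callables, mirroring the four units of the device. The class name `AutoMLPipeline` and its field names are hypothetical; the sketch shows only the order of the steps and the data handed from one to the next, not how any step is implemented.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class AutoMLPipeline:
    """Chains the four steps; each step is injected as a callable."""
    extract_paradigm: Callable[[str], Any]  # scenario -> data paradigm
    provide_guidance: Callable[[Any], Any]  # data paradigm -> import guidance
    receive_items: Callable[[Any], Any]     # import guidance -> imported data items
    train_model: Callable[[Any], Any]       # imported data items -> trained model

    def run(self, scenario: str) -> Any:
        paradigm = self.extract_paradigm(scenario)
        guidance = self.provide_guidance(paradigm)
        items = self.receive_items(guidance)
        return self.train_model(items)
```

Injecting the steps keeps the sketch neutral about whether each one runs as software, hardware, or a combination of the two.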
The computer program in the computer-readable storage medium described above may be run in an environment deployed on computer equipment such as a client, a host, a proxy device, or a server. For example, it may be run by at least one computing device in a stand-alone environment or a distributed cluster environment, where the computing device may be, as examples, a computer, a processor, a computing unit (or module), a client, a host, a proxy device, a server, or the like. It should be noted that the computer program may also be used to perform additional steps beyond those described above, or to perform more specific processing when performing those steps. The contents of these additional steps and further processing have already been described with reference to fig. 1 and are not repeated here.
It should be noted that the automatic machine learning method and apparatus according to the exemplary embodiments of the present disclosure may rely entirely on the execution of a computer program to implement the corresponding functions, i.e., the respective units correspond to the respective steps in the functional architecture of the computer program, so that the entire system is invoked through a dedicated software package (e.g., a lib library) to implement the corresponding functions.
On the other hand, the respective units of the automatic machine learning apparatus shown in fig. 2 may also be implemented by hardware, software, firmware, middleware, microcode, or any combination thereof. When implemented in software, firmware, middleware or microcode, the program code or code segments to perform the corresponding operations may be stored in a computer-readable medium such as a storage medium, so that a processor may perform the corresponding operations by reading and executing the corresponding program code or code segments.
For example, according to an exemplary embodiment of the present disclosure, a system may be provided comprising at least one computing device and at least one storage device storing instructions, wherein the instructions, when executed by the at least one computing device, cause the at least one computing device to perform the steps of: extracting a data paradigm corresponding to a predicted business scenario; providing data import guidance based on the extracted data paradigm; receiving data items imported according to the data import guidance; and performing automatic model training according to the imported data items, wherein the data paradigm at least comprises: a data table category corresponding to the predicted business scenario.
Here, the automatic machine learning apparatus may constitute a stand-alone computing environment or a distributed computing environment including at least one computing device and at least one storage device, where the computing device may be a general-purpose or special-purpose computer, a processor, etc., may be a unit that performs processing solely by software, or may be an entity that combines software and hardware, as examples. That is, the computing device may be implemented as a computer, processor, computing unit (or module), client, host, proxy device, server, or the like. Further, the storage devices may be physical storage devices or logically partitioned storage units that may be operatively coupled to the computing device or may communicate with each other, such as through I/O ports, network connections, and the like.
Further, for example, exemplary embodiments of the present disclosure may also be implemented as a computing device comprising a storage component and a processor, the storage component having stored therein a set of computer-executable instructions that, when executed by the processor, perform an automatic machine learning method based on a predicted business scenario.
In particular, the computing devices may be deployed in servers or clients, as well as on node devices in a distributed network environment. Further, the computing device may be a PC computer, tablet device, personal digital assistant, smart phone, web application, or other device capable of executing the set of instructions described above.
The computing device need not be a single computing device, but can be any device or collection of circuits capable of executing the instructions (or sets of instructions) described above, individually or in combination. The computing device may also be part of an integrated control system or system manager, or may be configured as a portable electronic device that interfaces locally or remotely (e.g., via wireless transmission).
In the computing device, the processor may include a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), a programmable logic device, a special purpose processor system, a microcontroller, or a microprocessor. By way of example, and not limitation, processors may also include analog processors, digital processors, microprocessors, multi-core processors, processor arrays, network processors, and the like.
Some of the operations described in the automatic machine learning method according to the exemplary embodiment of the present disclosure may be implemented by software, some of the operations may be implemented by hardware, and further, the operations may be implemented by a combination of hardware and software.
The processor may execute instructions or code stored in one of the memory components, which may also store data. Instructions and data may also be transmitted and received over a network via a network interface device, which may employ any known transmission protocol.
The memory component may be integral to the processor, e.g., having RAM or flash memory disposed within an integrated circuit microprocessor or the like. Further, the storage component may comprise a stand-alone device, such as an external disk drive, storage array, or any other storage device usable by a database system. The storage component and the processor may be operatively coupled or may communicate with each other, such as through an I/O port, a network connection, etc., so that the processor can read files stored in the storage component.
Further, the computing device may also include a video display (such as a liquid crystal display) and a user interaction interface (such as a keyboard, mouse, touch input device, etc.). All components of the computing device may be connected to each other via a bus and/or a network.
The operations involved in an automatic machine learning method according to exemplary embodiments of the present disclosure may be described as various interconnected or coupled functional blocks or functional diagrams. However, these functional blocks or functional diagrams may equally be integrated into a single logic device or operate according to non-exact boundaries.
While various exemplary embodiments of the present disclosure have been described above, it should be understood that the above description is exemplary only, and not exhaustive, and that the present disclosure is not limited to the disclosed exemplary embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the disclosure. Therefore, the protection scope of the present disclosure should be subject to the scope of the claims.

Claims (10)

1. An automatic machine learning method based on a predictive business scenario, comprising:
extracting a data paradigm corresponding to a predicted business scenario;
providing data import guidance based on the extracted data paradigm;
receiving data items imported according to the data import guidance; and
performing automatic model training according to the imported data items,
wherein the data paradigm comprises at least: a data table category corresponding to the predicted business scenario.
2. The automatic machine learning method of claim 1, wherein the data table corresponding to the data table category includes at least one of:
at least two subject data tables;
at least two subject data tables and at least one relationship data table regarding interrelationships between the at least two subject data tables;
at least one subject data table and a service table corresponding to the at least one subject data table.
3. The automatic machine learning method of claim 1, wherein the data import guidance comprises at least one of: an interaction control for guiding the user to import the data table corresponding to each data table category; an interaction control for guiding the user to specify an item to be predicted in the imported data tables, or to construct the item to be predicted based on the imported data tables; an interaction control for guiding the user to specify the primary key of each imported data table; an interaction control for guiding the user to specify the time type of each imported data table; an interaction control for guiding the user to specify the field types of the imported data tables; an interaction control for guiding the user to establish association relationships among the imported data tables; and a scene configuration table for guiding the user to establish the business scenario related to the imported data tables.
4. The automatic machine learning method of any of claims 1 to 3, wherein receiving data items imported according to the data import guidance comprises: receiving at least one of a data table, an item to be predicted, a primary key of the data table, a time type of the data table, a field type of the data table, an association relationship between the data tables, and a scene configuration table imported according to the data import guidance.
5. The automatic machine learning method of any of claims 1 to 3, wherein performing automatic model training from imported data items comprises:
splicing the data tables into a data splicing table based on the imported data items, and performing feature extraction to obtain a training sample table; and
automatically performing machine learning using the training samples in the training sample table to obtain a machine learning model.
6. The automatic machine learning method of any of claims 1 to 5, wherein the predicted business scenario relates to a marketing scenario, an anti-fraud scenario, and/or a recommendation scenario.
7. The automated machine learning method of claim 6, wherein the predicted business scenario relates to a marketing scenario, and the data table categories include at least: user table, product table, behavior table.
8. A system comprising at least one computing device and at least one storage device storing instructions that, when executed by the at least one computing device, cause the at least one computing device to perform the automatic machine learning method of any of claims 1 to 7.
9. A computer-readable storage medium storing instructions that, when executed by at least one computing device, cause the at least one computing device to perform the automatic machine learning method of any of claims 1 to 7.
10. An automatic machine learning device based on predictive business scenarios, comprising:
a data paradigm extraction unit that extracts a data paradigm corresponding to a predicted business scenario;
a guidance unit that provides data import guidance based on the extracted data paradigm;
a data receiving unit that receives data items imported according to the data import guidance; and
a model training unit that performs automatic model training according to the imported data items,
wherein the data paradigm comprises at least: a data table category corresponding to the predicted business scenario.
CN201811618614.0A 2018-12-28 2018-12-28 Automatic machine learning method and equipment based on predictive business scene Active CN111444170B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811618614.0A CN111444170B (en) 2018-12-28 2018-12-28 Automatic machine learning method and equipment based on predictive business scene

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811618614.0A CN111444170B (en) 2018-12-28 2018-12-28 Automatic machine learning method and equipment based on predictive business scene

Publications (2)

Publication Number Publication Date
CN111444170A (en) 2020-07-24
CN111444170B (en) 2023-10-03

Family

ID=71626546

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811618614.0A Active CN111444170B (en) 2018-12-28 2018-12-28 Automatic machine learning method and equipment based on predictive business scene

Country Status (1)

Country Link
CN (1) CN111444170B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112149838A (en) * 2020-09-03 2020-12-29 第四范式(北京)技术有限公司 Method, device, electronic equipment and storage medium for realizing automatic model building

Citations (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104317974A (en) * 2014-11-21 2015-01-28 武汉理工大学 Reconfigurable multi-source data importing method in ERP system
CN104376081A (en) * 2014-11-18 2015-02-25 国家电网公司 Data application processing system, handhold terminal and on-site checking data processing system
CN105718473A (en) * 2014-12-05 2016-06-29 成都复晓科技有限公司 Data modeling method
CN106202762A (en) * 2016-07-16 2016-12-07 北京工业大学 A kind of user's water yield data based on ArcGIS instrument are automatically imported modeling software method
CN106250987A (en) * 2016-07-22 2016-12-21 无锡华云数据技术服务有限公司 A kind of machine learning method, device and big data platform
CN106326248A (en) * 2015-06-23 2017-01-11 阿里巴巴集团控股有限公司 A storage method and device for data of databases
CN106777970A (en) * 2016-12-15 2017-05-31 北京锐软科技股份有限公司 The integrated system and method for a kind of medical information system data template
US20170220930A1 (en) * 2016-01-29 2017-08-03 Microsoft Technology Licensing, Llc Automatic problem assessment in machine learning system
CN107346330A (en) * 2017-06-20 2017-11-14 小草数语(北京)科技有限公司 Data comparison method and device
CN107506462A (en) * 2017-08-30 2017-12-22 中国建设银行股份有限公司 Data processing method, system, electronic equipment, the storage medium of Enterprise Data
US20180109776A1 (en) * 2016-10-14 2018-04-19 Marvel Digital Limited Method, system and medium for improving the quality of 2d-to-3d automatic image conversion using machine learning techniques
CN108008942A (en) * 2017-11-16 2018-05-08 第四范式(北京)技术有限公司 The method and system handled data record
CN108520019A (en) * 2018-03-22 2018-09-11 平安好房(上海)电子商务有限公司 Data managing method, device, equipment and computer readable storage medium
CN108710949A (en) * 2018-04-26 2018-10-26 第四范式(北京)技术有限公司 The method and system of template are modeled for creating machine learning
CN109002528A (en) * 2018-07-12 2018-12-14 北京猫眼文化传媒有限公司 A kind of method, apparatus and storage medium of data importing
CN109033277A (en) * 2018-07-10 2018-12-18 广州极天信息技术股份有限公司 Class brain system, method, equipment and storage medium based on machine learning

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
JIE ZHANG ET AL.: "regularize,expand and compress:multi-task based lifelong learning via nonexpansive automl", 《COMPUTER VISION AND PATTERN RECOGNITION》 *
张腾 等: "基于机器学习的交通数据分析与应用", vol. 2, no. 2 *
李娜: "报表模板库***开发", 《中国学位论文全文数据库》 *

Also Published As

Publication number Publication date
CN111444170B (en) 2023-10-03

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant