CN113760240B

CN113760240B - Method and device for generating data model

Info

Publication number: CN113760240B
Application number: CN202010910061.7A
Authority: CN
Inventors: 蒲海洋
Original assignee: Beijing Jingdong Century Trading Co Ltd; Beijing Wodong Tianjun Information Technology Co Ltd
Current assignee: Beijing Jingdong Century Trading Co Ltd; Beijing Wodong Tianjun Information Technology Co Ltd
Priority date: 2020-09-02
Filing date: 2020-09-02
Publication date: 2024-06-14
Anticipated expiration: 2040-09-02
Also published as: CN113760240A

Abstract

The invention discloses a method and a device for generating a data model, and relates to the technical field of computers. One embodiment of the method comprises the following steps: matching corresponding processing logic from an index configuration library according to index information and dimension information input by a user; wherein, the index configuration library stores a plurality of processing logics, and each processing logic defines a data source, an index field, a dimension field, filtering conditions of the index field and/or the dimension field, and an association relationship between an operator and the data source; optimizing and splicing the processing logic to generate an executable data development script; executing the executable data development script, thereby generating a data model. This embodiment can solve the technical problems of large script development workload and poor reusability.

Description

Method and device for generating data model

Technical Field

The present invention relates to the field of computer technologies, and in particular, to a method and apparatus for generating a data model.

Background

Currently, data models in data warehouses are mostly generated from HQL scripts written by business analysts or data developers.

In the process of implementing the present invention, the inventor finds that at least the following problems exist in the prior art:

1) The threshold for developing the HQL script is high, and the workload for developing the HQL script is large;

2) HQL scripts have poor readability and poor reusability; the code is voluminous and disorganized, is inconvenient to maintain, and can also cause data ambiguity.

Disclosure of Invention

In view of the above, the embodiments of the present invention provide a method and an apparatus for generating a data model, so as to solve the technical problems of large script development workload and poor reusability.

To achieve the above object, according to one aspect of an embodiment of the present invention, there is provided a method of generating a data model, including:

matching corresponding processing logic from an index configuration library according to index information and dimension information input by a user; wherein, the index configuration library stores a plurality of processing logics, and each processing logic defines a data source, an index field, a dimension field, filtering conditions of the index field and/or the dimension field, and an association relationship between an operator and the data source;

optimizing and splicing the processing logic to generate an executable data development script;

Executing the executable data development script, thereby generating a data model.

Optionally, before matching the corresponding processing logic from the index configuration library according to the index information and the dimension information input by the user, the method further comprises:

generating processing logic according to configuration information pre-configured by a user, and storing the processing logic into an index configuration library;

the configuration information comprises a data source, an index field, a dimension field, filtering conditions of the index field and/or the dimension field, and an association relationship between operators and the data source.

Optionally, optimizing and stitching the processing logic to generate an executable data development script, comprising:

Reading the processing logic to obtain the data source, the index field, the dimension field, the filtering conditions of the index field and/or the dimension field, and the association relationship between the operator and the data source defined in the processing logic;

Instantiating the data source, the index field, the dimension field, and the filtering conditions of the index field and/or the dimension field to generate an atomic table;

and optimizing and splicing the internal calculation logic of the atomic table according to the association relationship between the operator and the data source so as to generate an executable data development script.

Optionally, optimizing and splicing the internal calculation logic of the atomic table according to the association relationship between the operator and the data source to generate an executable data development script, including:

processing the atomic table according to the association relationship between the operator and the data source to generate a derived table;

and optimizing and splicing the internal computing logic of the derivative table to generate an executable data development script.

Optionally, the atomic table is an AtomicTable object, and the derivative table is a DerivedTable object.

Optionally, a field whose numerical value is meaningful is used as an index field, and a field whose numerical value is not meaningful is used as a dimension field.

Optionally, the operators include at least one of an association operator, an aggregation operator, a merge operator, a deduplication operator, a selection operator, and a filtering operator.

In addition, according to another aspect of an embodiment of the present invention, there is provided an apparatus for generating a data model, including:

The matching module is used for matching corresponding processing logic from the index configuration library according to the index information and the dimension information input by the user; wherein, the index configuration library stores a plurality of processing logics, and each processing logic defines a data source, an index field, a dimension field, filtering conditions of the index field and/or the dimension field, and an association relationship between an operator and the data source;

the script module is used for optimizing and splicing the processing logic to generate an executable data development script;

And the execution module is used for executing the executable data development script so as to generate a data model.

Optionally, the matching module is further configured to:

Before corresponding processing logic is matched from an index configuration library according to index information and dimension information input by a user, generating processing logic according to configuration information pre-configured by the user, and storing the processing logic into the index configuration library;

Optionally, the script module is further configured to:

According to another aspect of an embodiment of the present invention, there is also provided an electronic device including:

one or more processors;

storage means for storing one or more programs,

when the one or more programs are executed by the one or more processors, the one or more processors implement the method of any of the embodiments described above.

According to another aspect of an embodiment of the present invention, there is also provided a computer readable medium having stored thereon a computer program which, when executed by a processor, implements the method according to any of the embodiments described above.

One embodiment of the above invention has the following advantages or benefits: by matching the corresponding processing logic from the index configuration library according to the index information and the dimension information input by the user, and optimizing and splicing the processing logic to generate an executable data development script, the technical problems of large script development workload and poor reusability in the prior art are overcome. In the embodiment of the invention, the processing logic is stored in the index configuration library in advance, so the user only needs to input the index and the dimension for the executable data development script to be generated automatically, which significantly reduces the amount of code to be developed and lowers the development threshold. Because standardized processing logic is maintained in the index configuration library, the coding style can be unified, a unique data result is maintained, and data ambiguity is eliminated; code reusability is also improved, which facilitates subsequent maintenance and secondary development.

Further effects of the above-described non-conventional alternatives are described below in connection with the embodiments.

Drawings

The drawings are included to provide a better understanding of the invention and are not to be construed as unduly limiting the invention. Wherein:

FIG. 1 is a schematic diagram of the main flow of a method of generating a data model according to an embodiment of the present invention;

FIG. 2 is a schematic diagram of processing logic in an index configuration library according to an embodiment of the present invention;

FIG. 3 is a schematic diagram of an HQL statement in accordance with an embodiment of the invention;

FIG. 4 is a schematic diagram of result_table_sql according to an embodiment of the present invention;

FIG. 5 is a schematic diagram of a data model according to an embodiment of the invention;

FIG. 6 is a schematic diagram of the main flow of a method of generating a data model according to one referenceable embodiment of the invention;

FIG. 7 is a schematic diagram of the main flow of a method of generating a data model according to another referenceable embodiment of the invention;

FIG. 8 is a schematic diagram of the process in the custom development mode according to an embodiment of the invention;

FIG. 9 is a schematic diagram of the main modules of an apparatus for generating a data model according to an embodiment of the present invention;

FIG. 10 is an exemplary system architecture diagram in which embodiments of the present invention may be applied;

Fig. 11 is a schematic diagram of a computer system suitable for use in implementing an embodiment of the invention.

Detailed Description

Exemplary embodiments of the present invention will now be described with reference to the accompanying drawings, in which various details of the embodiments of the present invention are included to facilitate understanding, and are to be considered merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.

Fig. 1 is a schematic diagram of the main flow of a method of generating a data model according to an embodiment of the present invention. As an embodiment of the present invention, as shown in fig. 1, the method for generating a data model may include:

Step 101, matching corresponding processing logic from an index configuration library according to index information and dimension information input by a user.

In the embodiment of the invention, the user only needs to input the index information and the dimension information for the corresponding processing logic to be matched from the index configuration library. The index configuration library stores a plurality of processing logics, and each processing logic defines a data source, an index field, a dimension field, filtering conditions of the index field and/or the dimension field, and an association relationship between an operator and the data source. It should be noted that different processing logic may be maintained for different index fields and dimension fields, so that a unique processing logic can be matched from the index information and dimension information input by the user, thereby solving the problem of data ambiguity. Data ambiguity refers to the situation in which fields that appear in different places share the same name but are produced by different processing logic, so that the final data results differ and several different data results exist for what appears to be the same field.
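The matching in step 101 can be illustrated with a minimal sketch. The class names `ProcessingLogic` and `IndexConfigLibrary`, the field layout, and the sample values are assumptions made for illustration and are not taken from the embodiment. The sketch keys each processing logic by its set of index names and dimension names and rejects a second registration under the same key, which is one possible way to keep the match unique.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcessingLogic:
    """One entry of the index configuration library (hypothetical layout)."""
    data_source: str
    index_fields: tuple
    dimension_fields: tuple
    filters: tuple = ()
    operators: tuple = ()


class IndexConfigLibrary:
    """Maps (index names, dimension names) to exactly one processing logic."""

    def __init__(self):
        self._entries = {}

    @staticmethod
    def _key(indexes, dimensions):
        # Order-insensitive key: the same names in any order hit the same entry.
        return (frozenset(indexes), frozenset(dimensions))

    def register(self, logic):
        key = self._key(logic.index_fields, logic.dimension_fields)
        if key in self._entries:
            # A second logic under the same names would make results ambiguous.
            raise ValueError("processing logic already registered for these names")
        self._entries[key] = logic

    def match(self, indexes, dimensions):
        # Raises KeyError when nothing has been configured for these names.
        return self._entries[self._key(indexes, dimensions)]


library = IndexConfigLibrary()
library.register(ProcessingLogic(
    data_source="orders",               # hypothetical table name
    index_fields=("gmv",),
    dimension_fields=("merchant_id",),
    filters=("dt = '2020-09-02'",),
    operators=("group_by",),
))
matched = library.match(indexes=["gmv"], dimensions=["merchant_id"])
```

Because the key is built only from the names the user supplies, the user does not need to know the data source, filters, or operators behind the matched entry.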

Optionally, a field whose numerical value is meaningful is used as an index field, and a field whose numerical value is not meaningful is used as a dimension field. For example, consider a transaction index:

Dimensions: merchant ID, user ID;

Indexes: transaction amount (GMV), transaction order volume.

Merchant ID 1980 vs. 2000: the numerical difference carries no meaning, so the field is a dimension.

GMV 1000 vs. 10000: the numerical difference is meaningful, so the field is an index.
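The distinction can be made concrete with a small sketch: a dimension is something rows are grouped by, while an index is something that is aggregated. The sample rows and the helper name `aggregate` are made up for illustration.

```python
from collections import defaultdict

# Hypothetical sample rows; the values are invented for illustration only.
rows = [
    {"merchant_id": 1980, "user_id": 1, "gmv": 1000},
    {"merchant_id": 1980, "user_id": 2, "gmv": 10000},
    {"merchant_id": 2000, "user_id": 1, "gmv": 500},
]


def aggregate(rows, dimension, index):
    """Group rows by a dimension field and sum an index field."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[dimension]] += row[index]
    return dict(totals)


# Summing GMV per merchant is meaningful; summing merchant IDs would not be.
gmv_by_merchant = aggregate(rows, dimension="merchant_id", index="gmv")
```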

Optionally, the operators include at least one of an association operator (join), an aggregation operator (group_by), a merge operator (union), a deduplication operator (distinct), a selection operator (select), and a filtering operator (filter_sql). Optionally, the association relationship may include a left association, a right association, and/or an inner association.
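One way to picture these operators is as an enumeration whose members map to the SQL keywords they eventually produce. The mapping below is an assumption for illustration; the embodiment does not state how operators are represented internally, and the helper `join_clause` is hypothetical.

```python
from enum import Enum


class Operator(Enum):
    """Operator kinds named in the text, paired with an assumed SQL keyword."""
    JOIN = "JOIN"
    GROUP_BY = "GROUP BY"
    UNION = "UNION"
    DISTINCT = "DISTINCT"
    SELECT = "SELECT"
    FILTER = "WHERE"


class JoinType(Enum):
    """Left, right and inner association."""
    LEFT = "LEFT JOIN"
    RIGHT = "RIGHT JOIN"
    INNER = "INNER JOIN"


def join_clause(left, right, key, how=JoinType.INNER):
    """Render the association between two data sources on a shared key."""
    return f"{left} {how.value} {right} ON {left}.{key} = {right}.{key}"


clause = join_clause("orders", "merchants", "merchant_id", JoinType.LEFT)
```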

Prior to step 101, the method may further include: generating processing logic according to configuration information pre-configured by a user, and storing the processing logic into an index configuration library; the configuration information comprises a data source, an index field, a dimension field, filtering conditions of the index field and/or the dimension field, and an association relationship between operators and the data source. In order to match the corresponding processing logic from the index configuration library, the data source, the index field, the dimension field, the filtering conditions of the index field and/or the dimension field, the association relationship between the operator and the data source, and the like need to be configured in advance before step 101, so as to generate the processing logic. Optionally, the conversion operation of the derived table may also be configured, and accordingly, the processing logic also defines the conversion operation of the derived table.

Taking the transaction index as an example, processing logic shown in fig. 2 is generated according to configuration information pre-configured by a user, such as a data source, an index field, a dimension field, filtering conditions of the index field and/or the dimension field, an association relationship between an operator and the data source, and the like, and the processing logic is stored in an index configuration library.
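As a rough illustration of this pre-configuration step, the sketch below validates a configuration dictionary and stores it as JSON text. The key names, the use of JSON, and the use of a plain dictionary as the index configuration library are all assumptions; FIG. 2 is not reproduced here, so the actual layout of the processing logic may differ.

```python
import json

# Keys assumed to be mandatory, following the configuration items listed in the text.
REQUIRED_KEYS = {
    "data_source", "index_fields", "dimension_fields",
    "filters", "operators", "associations",
}


def build_processing_logic(config):
    """Validate pre-configured information and return it as processing logic."""
    missing = REQUIRED_KEYS - config.keys()
    if missing:
        raise ValueError(f"missing configuration keys: {sorted(missing)}")
    return {key: config[key] for key in sorted(REQUIRED_KEYS)}


def store(library, name, logic):
    """Persist the logic as JSON text under a name (the library is a dict here)."""
    library[name] = json.dumps(logic, sort_keys=True)


index_config_library = {}
trade_config = {
    "data_source": "orders",                      # hypothetical table name
    "index_fields": ["gmv", "order_cnt"],
    "dimension_fields": ["merchant_id", "user_id"],
    "filters": ["dt = '2020-09-02'"],
    "operators": ["group_by"],
    "associations": [],
}
store(index_config_library, "trade", build_processing_logic(trade_config))
```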

Step 102, optimizing and splicing the processing logic to generate an executable data development script.

After processing logic required by a user is matched from the index configuration library, the processing logic is optimized and spliced, so that an executable data development script is automatically generated. The user can generate the executable data development script without inputting the association relation between the data sources and the conversion operation of the derivative table, so that the code development amount is obviously reduced.

Optionally, step 102 may include: reading the processing logic to obtain the data source, the index field, the dimension field, the filtering conditions of the index field and/or the dimension field, and the association relationship between the operator and the data source defined in the processing logic; instantiating the data source, the index field, the dimension field, and the filtering conditions of the index field and/or the dimension field to generate an atomic table; and optimizing and splicing the internal calculation logic of the atomic table according to the association relationship between the operator and the data source so as to generate an executable data development script.

In this step, the processing logic matched from the index configuration library is first read to acquire information such as the data source, the index field, the dimension field, the filtering conditions of the index field and/or the dimension field, and the association relationship between the operator and the data source defined in the processing logic; an atomic table is then instantiated according to this information; finally, the internal calculation logic of the atomic table is optimized and spliced, so that an executable data development script is generated automatically. Optionally, the executable data development script may be an HQL statement (HQL, the Hive Query Language, is an SQL-like structured query language used with Hive). As shown in fig. 3, a complete HQL statement can be generated by steps 101 and 102, where result_table_sql is a variable, as shown in fig. 4.
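A minimal sketch of instantiating an atomic table and rendering it into a statement follows. The `AtomicTable` class here is a hypothetical stand-in for the object of the same name; its fields, the choice of `SUM` as the aggregation, and the way `result_table_sql` is spliced into an `INSERT OVERWRITE TABLE` statement are assumptions, since FIG. 3 and FIG. 4 are not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class AtomicTable:
    """Hypothetical stand-in for the AtomicTable object named in the text."""
    data_source: str
    dimension_fields: list
    index_fields: list
    filters: list

    def to_sql(self):
        """Render a SELECT that groups by the dimensions and sums the indexes."""
        dims = ", ".join(self.dimension_fields)
        sums = ", ".join(f"SUM({f}) AS {f}" for f in self.index_fields)
        sql = f"SELECT {dims}, {sums} FROM {self.data_source}"
        if self.filters:
            sql += " WHERE " + " AND ".join(self.filters)
        return sql + f" GROUP BY {dims}"


atomic = AtomicTable("orders", ["merchant_id"], ["gmv"], ["dt = '2020-09-02'"])
result_table_sql = atomic.to_sql()
# Assumed outer statement; the target table name is invented for illustration.
script = f"INSERT OVERWRITE TABLE result_table {result_table_sql}"
```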

Optionally, optimizing and splicing the internal calculation logic of the atomic table according to the association relationship between the operator and the data source to generate an executable data development script includes: processing the atomic table according to the association relationship between the operator and the data source to generate a derived table; and optimizing and splicing the internal calculation logic of the derived table to generate an executable data development script. It should be noted that if the processing logic further defines a conversion operation of the derived table, the conversion operation is first performed on the derived table as defined, and the internal calculation logic of the derived table is then optimized and spliced.

Optionally, the atomic table is an AtomicTable object, which is an executable abstract data structure; the derivative table is a DerivedTable object. A plurality of atomic tables are combined through different association relationships to generate a derivative table (namely an intermediate table); the derivative table is also an executable abstract data structure and is composed of a plurality of AtomicTable objects.
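The composition of atomic tables into a derivative table might look like the sketch below. Both classes are simplified, hypothetical versions of the AtomicTable and DerivedTable objects; the single shared join key and the `SELECT *` projection are simplifications chosen to keep the example short.

```python
from dataclasses import dataclass, field


@dataclass
class AtomicTable:
    alias: str
    sql: str  # the SELECT statement that produces this atomic table


@dataclass
class DerivedTable:
    """Intermediate table built by joining several atomic tables on one key."""
    key: str
    parts: list = field(default_factory=list)  # (join keyword, AtomicTable)

    def add(self, table, how="INNER JOIN"):
        # The join keyword of the first table is unused; it only anchors FROM.
        self.parts.append((how, table))
        return self

    def to_sql(self):
        first = self.parts[0][1]
        sql = f"SELECT * FROM ({first.sql}) {first.alias}"
        for how, table in self.parts[1:]:
            sql += (f" {how} ({table.sql}) {table.alias}"
                    f" ON {first.alias}.{self.key} = {table.alias}.{self.key}")
        return sql


gmv = AtomicTable("a", "SELECT merchant_id, SUM(gmv) AS gmv FROM orders GROUP BY merchant_id")
cnt = AtomicTable("b", "SELECT merchant_id, COUNT(*) AS order_cnt FROM orders GROUP BY merchant_id")
derived_sql = DerivedTable("merchant_id").add(gmv).add(cnt, "LEFT JOIN").to_sql()
```

A fuller version would project explicit columns rather than `SELECT *`, since both atomic tables carry the join key.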

Step 103, executing the executable data development script, thereby generating a data model.

Executing the executable data development script, a data model may be generated, as shown in FIG. 5.
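Execution can be illustrated without a Hive cluster by running a generated statement against an in-memory SQLite database. SQLite is a substitute chosen only to keep the example runnable; the embodiment targets HQL on Hive. The table names, sample rows, and the generated script are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (merchant_id INTEGER, gmv INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1980, 1000), (1980, 10000), (2000, 500)],  # invented sample data
)

# Hypothetical generated script; CREATE TABLE ... AS SELECT is used here because
# both SQLite and Hive accept this form.
script = (
    "CREATE TABLE merchant_gmv AS "
    "SELECT merchant_id, SUM(gmv) AS gmv FROM orders GROUP BY merchant_id"
)
conn.execute(script)

# The resulting table plays the role of the generated data model.
model_rows = conn.execute(
    "SELECT merchant_id, gmv FROM merchant_gmv ORDER BY merchant_id"
).fetchall()
```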

According to the various embodiments described above, it can be seen that matching the corresponding processing logic from the index configuration library according to the index information and the dimension information input by the user, and optimizing and splicing the processing logic to generate an executable data development script, solves the technical problems of large script development workload and poor reusability in the prior art. In the embodiment of the invention, the processing logic is stored in the index configuration library in advance, so the user only needs to input the index and the dimension for the executable data development script to be generated automatically, which significantly reduces the amount of code to be developed and lowers the development threshold. Because standardized processing logic is maintained in the index configuration library, the coding style can be unified, a unique data result is maintained, and data ambiguity is eliminated; code reusability is also improved, which facilitates subsequent maintenance and secondary development.

In order to facilitate the development of a data model by a user, two development modes are provided in the embodiment of the invention: a standard development mode [standard] and a custom development mode [dev]. In the standard development mode, the user only needs to input index information and dimension information for the executable data development script to be generated automatically; in the custom development mode, the user needs to input information such as the data source, the index field, the dimension field, the filtering conditions of the index field and/or the dimension field, and the association relationship between the operator and the data source, so that a derivative table is generated, and the internal calculation logic of the derivative table is then optimized and spliced to generate the executable data development script.
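The difference between the two modes can be sketched as a single dispatch function: in standard mode the processing logic is looked up, in dev mode it is taken directly from the user. The function name, the request layout, and the dictionary-based library are assumptions for illustration.

```python
def resolve_processing_logic(mode, request, library):
    """Return the processing logic for a request in either development mode."""
    if mode == "standard":
        # Only index and dimension names are supplied; the rest comes from the library.
        key = (frozenset(request["indexes"]), frozenset(request["dimensions"]))
        if key not in library:
            raise LookupError("no processing logic matched; dev mode may be needed")
        return library[key]
    if mode == "dev":
        # The user supplies the full configuration directly.
        return request["config"]
    raise ValueError(f"unknown mode: {mode!r}")


library = {
    (frozenset({"gmv"}), frozenset({"merchant_id"})): {
        "data_source": "orders", "filters": ["dt = '2020-09-02'"],
    },
}
standard_logic = resolve_processing_logic(
    "standard", {"indexes": ["gmv"], "dimensions": ["merchant_id"]}, library)
dev_logic = resolve_processing_logic(
    "dev", {"config": {"data_source": "refunds", "filters": []}}, library)
```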

FIG. 6 is a schematic diagram of the main flow of a method of generating a data model according to one referenceable embodiment of the invention. As yet another embodiment of the present invention, as shown in fig. 6, taking a standard development mode as an example, the method for generating a data model may include:

Step 601, matching corresponding processing logic from an index configuration library according to index information and dimension information input by a user.

In the standard development mode, a user can match corresponding processing logic from the index configuration library only by inputting index information and dimension information. The index configuration library stores a plurality of processing logics, and each processing logic defines a data source, an index field, a dimension field, filtering conditions of the index field and/or the dimension field, and an association relation between operators and the data source.

In order to match processing logic from the index configuration library, the data source, the index field, the dimension field, the filtering conditions of the index field and/or the dimension field, the association relationship between the operator and the data source, and the like need to be configured in advance before step 601, so that the processing logic is generated.

Step 602, reading the processing logic to obtain the data source, the index field, the dimension field, the filtering conditions of the index field and/or the dimension field, and the association relationship between the operator and the data source, which are defined in the processing logic.

Step 603, instantiating the data source, the index field, the dimension field, and the filtering conditions of the index field and/or the dimension field to generate an atomic table (such as an AtomicTable object).

Step 604, processing the atomic table according to the association relationship between the operator and the data source to generate a derivative table (e.g., a DerivedTable object).

Step 605, optimizing and stitching the internal computing logic of the derivative table to generate an executable data development script.

Step 606, executing the executable data development script, thereby generating a data model.

In addition, the specific implementation of this referenceable embodiment has been described in detail in the method for generating a data model above, and is therefore not repeated here.

FIG. 7 is a schematic diagram of the main flow of a method of generating a data model according to another referenceable embodiment of the invention. As another embodiment of the present invention, as shown in fig. 7, taking the custom development mode as an example, the method for generating a data model may include:

Step 701, instantiating an atomic table according to the data source, the index field, the dimension field, and the filtering conditions of the index field and/or the dimension field input by the user.

Step 702, processing the atomic table according to the association relationship between the operator and the data source input by the user to generate a derivative table.

Step 703, optimizing and stitching the internal computing logic of the derivative table to generate an executable data development script.

Step 704, executing the executable data development script, thereby generating a data model.

In the custom development mode, because no corresponding processing logic can be matched from the index configuration library, the user is required to set all configuration information, such as the data source, the index field, the dimension field, the filtering conditions of the index field and/or the dimension field, the operators, the association relationships between the data sources, and the conversion operations of the derivative table. As shown in fig. 8, the development framework sequentially generates the atomic table and the derivative table according to the configuration information, and then performs optimization and splicing to generate the executable data development script. Therefore, the development amount in the standard development mode is less than that in the custom development mode.
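The custom development flow (steps 701 to 704) can be sketched end to end as follows. Everything here is an assumption made for illustration: the configuration layout, the helper names `atomic_sql` and `derived_sql`, the sample data, and the use of an in-memory SQLite database in place of Hive so that the example can actually run.

```python
import sqlite3


def atomic_sql(cfg):
    """Step 701: render one atomic table from user-supplied configuration."""
    dims = ", ".join(cfg["dimension_fields"])
    sql = (f"SELECT {dims}, {cfg['index_expr']} AS {cfg['index_field']} "
           f"FROM {cfg['data_source']}")
    if cfg.get("filters"):
        sql += " WHERE " + " AND ".join(cfg["filters"])
    return sql + f" GROUP BY {dims}"


def derived_sql(left, right, key, how):
    """Step 702: join two atomic tables into a derivative (intermediate) table."""
    return (f"SELECT a.{key}, a.{left['index_field']}, b.{right['index_field']} "
            f"FROM ({atomic_sql(left)}) a {how} ({atomic_sql(right)}) b "
            f"ON a.{key} = b.{key}")


# Full configuration supplied by the user in dev mode (hypothetical layout).
user_config = {
    "atomic_tables": [
        {"data_source": "orders", "dimension_fields": ["merchant_id"],
         "index_field": "gmv", "index_expr": "SUM(amount)", "filters": ["amount > 0"]},
        {"data_source": "orders", "dimension_fields": ["merchant_id"],
         "index_field": "order_cnt", "index_expr": "COUNT(*)", "filters": []},
    ],
    "association": {"key": "merchant_id", "how": "LEFT JOIN"},
}
left, right = user_config["atomic_tables"]

# Step 703: splice the derivative table into an executable script.
script = "CREATE TABLE merchant_trade AS " + derived_sql(
    left, right, **user_config["association"])

# Step 704: execute the script (SQLite stands in for Hive here).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (merchant_id INTEGER, amount INTEGER)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [(1980, 1000), (1980, 0), (2000, 500)])
conn.execute(script)
rows = conn.execute("SELECT * FROM merchant_trade ORDER BY merchant_id").fetchall()
```

Compared with the standard mode, every value in `user_config` has to be written by the user, which is where the additional development effort comes from.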

FIG. 9 is a schematic diagram of main modules of an apparatus for generating a data model according to an embodiment of the present invention, and as shown in FIG. 9, the apparatus 900 for generating a data model includes a matching module 901, a script module 902, and an execution module 903; the matching module 901 is configured to match corresponding processing logic from the index configuration library according to the index information and the dimension information input by the user; wherein, the index configuration library stores a plurality of processing logics, and each processing logic defines a data source, an index field, a dimension field, filtering conditions of the index field and/or the dimension field, and an association relationship between an operator and the data source; script module 902 is configured to optimize and splice the processing logic to generate an executable data development script; the execution module 903 is configured to execute the executable data development script, thereby generating a data model.

Optionally, the matching module 901 is further configured to:

Optionally, the script module 902 is further configured to:

The specific implementation of the apparatus for generating a data model according to the present invention is described in detail in the method for generating a data model described above, and thus the description thereof will not be repeated here.
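The division of the apparatus into a matching module, a script module, and an execution module could be sketched as three injected callables. The class name `DataModelGenerator`, the stub implementations, and the data layout are assumptions for illustration; real modules would query the index configuration library and submit the script to an execution engine.

```python
class DataModelGenerator:
    """Wires three callables that play the roles of modules 901, 902 and 903."""

    def __init__(self, matching_module, script_module, execution_module):
        self.matching_module = matching_module
        self.script_module = script_module
        self.execution_module = execution_module

    def generate(self, indexes, dimensions):
        logic = self.matching_module(indexes, dimensions)   # matching module 901
        script = self.script_module(logic)                  # script module 902
        return self.execution_module(script)                # execution module 903


def match_stub(indexes, dimensions):
    # Stand-in for a lookup in the index configuration library.
    return {"source": "orders", "indexes": indexes, "dimensions": dimensions}


def build_script(logic):
    dims = ", ".join(logic["dimensions"])
    sums = ", ".join(f"SUM({name}) AS {name}" for name in logic["indexes"])
    return f"SELECT {dims}, {sums} FROM {logic['source']} GROUP BY {dims}"


executed = []


def execute_stub(script):
    # Stand-in for submitting the script to an execution engine.
    executed.append(script)
    return script


generator = DataModelGenerator(match_stub, build_script, execute_stub)
result = generator.generate(["gmv"], ["merchant_id"])
```

Injecting the modules as callables keeps each one replaceable and easy to test in isolation.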

Fig. 10 illustrates an exemplary system architecture 1000 to which a method of generating a data model or an apparatus of generating a data model of an embodiment of the present invention may be applied.

As shown in fig. 10, a system architecture 1000 may include terminal devices 1001, 1002, 1003, a network 1004, and a server 1005. The network 1004 serves as a medium for providing a communication link between the terminal apparatuses 1001, 1002, 1003 and the server 1005. The network 1004 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.

A user can interact with a server 1005 via a network 1004 using terminal apparatuses 1001, 1002, 1003 to receive or transmit messages or the like. Various communication client applications such as shopping class applications, web browser applications, search class applications, instant messaging tools, mailbox clients, social platform software, etc. (by way of example only) may be installed on the terminal devices 1001, 1002, 1003.

The terminal devices 1001, 1002, 1003 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smartphones, tablets, laptop and desktop computers, and the like.

The server 1005 may be a server providing various services, such as a background management server (merely an example) providing support for shopping-type websites browsed by the user using the terminal apparatuses 1001, 1002, 1003. The background management server may analyze and process received data such as an article information query request, and feed back the processing result (e.g., target push information or article information, merely an example) to the terminal device.

It should be noted that, the method for generating a data model according to the embodiment of the present invention is generally performed by the server 1005, and accordingly, the device for generating a data model is generally provided in the server 1005. The method for generating a data model provided by the embodiment of the present invention may also be performed by the terminal devices 1001, 1002, 1003, and accordingly, the apparatus for generating a data model may be provided in the terminal devices 1001, 1002, 1003.

It should be understood that the number of terminal devices, networks and servers in fig. 10 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.

Referring now to FIG. 11, there is illustrated a schematic diagram of a computer system 1100 suitable for use in implementing the terminal device of an embodiment of the present invention. The terminal device shown in fig. 11 is only an example, and should not impose any limitation on the functions and the scope of use of the embodiment of the present invention.

As shown in fig. 11, the computer system 1100 includes a Central Processing Unit (CPU) 1101, which can execute various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 1102 or a program loaded from a storage section 1108 into a Random Access Memory (RAM) 1103. In the RAM1103, various programs and data required for the operation of the system 1100 are also stored. The CPU 1101, ROM 1102, and RAM1103 are connected to each other by a bus 1104. An input/output (I/O) interface 1105 is also connected to bus 1104.

The following components are connected to the I/O interface 1105: an input section 1106 including a keyboard, a mouse, and the like; an output portion 1107 including a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and the like, a speaker, and the like; a storage section 1108 including a hard disk or the like; and a communication section 1109 including a network interface card such as a LAN card, a modem, and the like. The communication section 1109 performs communication processing via a network such as the internet. The drive 1110 is also connected to the I/O interface 1105 as needed. Removable media 1111, such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like, is installed as needed in drive 1110, so that a computer program read therefrom is installed as needed in storage section 1108.

In particular, according to embodiments of the present disclosure, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method shown in the flow chart. In such an embodiment, the computer program can be downloaded and installed from a network via the communication portion 1109, and/or installed from the removable media 1111. The above-described functions defined in the system of the present invention are performed when the computer program is executed by a Central Processing Unit (CPU) 1101.

The computer readable medium shown in the present invention may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present invention, however, the computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with the computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.

The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer programs according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

The modules involved in the embodiments of the present invention may be implemented in software or in hardware. The described modules may also be provided in a processor, which may be described, for example, as: a processor including a matching module, a script module, and an execution module. In some cases, the names of these modules do not limit the modules themselves.

As another aspect, the present invention also provides a computer-readable medium, which may be contained in the apparatus described in the above embodiments, or may exist separately without being assembled into the apparatus. The computer readable medium carries one or more programs which, when executed by a device, cause the device to implement the following method: matching corresponding processing logic from an index configuration library according to index information and dimension information input by a user; wherein the index configuration library stores a plurality of processing logics, and each processing logic defines a data source, an index field, a dimension field, filtering conditions of the index field and/or the dimension field, and an association relationship between an operator and the data source; optimizing and splicing the processing logic to generate an executable data development script; and executing the executable data development script, thereby generating a data model.
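The three steps of the method (matching, optimizing and splicing, executing) can be illustrated with a minimal sketch. It is not the implementation of the invention: the `ProcessingLogic` fields, the `INDEX_CONFIG_LIBRARY` contents, the function names, and the `orders` table are hypothetical, and Python's built-in `sqlite3` module stands in for the data warehouse engine that would run the generated script. The only "optimization" shown is an assumed one, namely merging processing logics that share a data source, dimension field, and filtering condition into a single query.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcessingLogic:
    """One entry of the index configuration library (illustrative fields)."""
    data_source: str            # table the index is computed from
    index_field: str            # column the operator is applied to
    dimension_field: str        # column used for grouping
    operator: str               # aggregation operator, e.g. SUM or COUNT
    filter_condition: str = ""  # optional filter on index/dimension fields


# Hypothetical library, keyed by (index name, dimension name).
INDEX_CONFIG_LIBRARY = {
    ("order_amount", "province"): ProcessingLogic(
        "orders", "amount", "province", "SUM", "status = 'paid'"),
    ("order_count", "province"): ProcessingLogic(
        "orders", "order_id", "province", "COUNT", "status = 'paid'"),
}


def match_logic(index_names, dimension_name):
    """Step 1: match processing logic for the user's indexes and dimension."""
    return [(name, INDEX_CONFIG_LIBRARY[(name, dimension_name)])
            for name in index_names]


def splice_script(matched, model_name):
    """Step 2: merge compatible logics and splice them into one script."""
    groups = {}
    for name, logic in matched:
        key = (logic.data_source, logic.dimension_field, logic.filter_condition)
        groups.setdefault(key, []).append(
            f"{logic.operator}({logic.index_field}) AS {name}")
    if len(groups) != 1:
        # Joining several sources is out of scope for this sketch.
        raise NotImplementedError("sketch handles a single source/filter group")
    (source, dimension, condition), columns = next(iter(groups.items()))
    where = f" WHERE {condition}" if condition else ""
    return (f"CREATE TABLE {model_name} AS "
            f"SELECT {dimension}, {', '.join(columns)} "
            f"FROM {source}{where} GROUP BY {dimension}")


def generate_model(conn, index_names, dimension_name, model_name):
    """Step 3: execute the spliced script, materializing the data model."""
    script = splice_script(match_logic(index_names, dimension_name), model_name)
    conn.execute(script)
    return script
```

In this sketch the user supplies only index names and a dimension name; the two matched logics collapse into one `SELECT` because they share the same source, grouping column, and filter, and the resulting table plays the role of the generated data model.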

According to the technical scheme of the embodiments of the invention, the corresponding processing logic is matched from the index configuration library according to the index information and the dimension information input by the user, and the processing logic is then optimized and spliced to generate the executable data development script. This technical means overcomes the technical problems of large script development workload and poor reusability in the prior art. Because the processing logic is stored in the index configuration library in advance, the user only needs to input the index and the dimension for the executable data development script to be generated automatically, which significantly reduces the amount of code to be developed and lowers the development threshold. Because standardized processing logic is maintained in the index configuration library, the coding style can be unified, a single consistent data result is kept, and ambiguity in the data is eliminated; code reusability is also improved, which facilitates subsequent maintenance and secondary development.

The above embodiments do not limit the scope of the present invention. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives can occur depending upon design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present invention should be included in the scope of the present invention.

Claims

1. A method of generating a data model, comprising:

executing the executable data development script, thereby generating a data model;

wherein optimizing and splicing the processing logic to generate an executable data development script comprises:

2. The method of claim 1, further comprising, prior to matching corresponding processing logic from the index configuration library based on the index information and the dimension information entered by the user:

3. The method of claim 1, wherein optimizing and splicing the internal computing logic of the atom table according to the association relationship between the operator and the data source to generate an executable data development script comprises:

4. The method of claim 3, wherein the atom table is an Atomic Table object and the derived table is a Derived Table object.

5. The method of claim 1, wherein the operators comprise at least one of an association operator, an aggregation operator, a merge operator, a deduplication operator, a selection operator, and a filtering operator.

6. An apparatus for generating a data model, comprising:

the execution module is used for executing the executable data development script so as to generate a data model;

the script module is further configured to:

7. An electronic device, comprising:

one or more processors;

storage means for storing one or more programs,

wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the method of any one of claims 1-5.

8. A computer readable medium on which a computer program is stored, characterized in that the program, when executed by a processor, implements the method of any one of claims 1-5.