CN113760240B - Method and device for generating data model - Google Patents

Method and device for generating data model

Info

Publication number
CN113760240B
Authority
CN
China
Prior art keywords
index
field
processing logic
dimension
data source
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010910061.7A
Other languages
Chinese (zh)
Other versions
CN113760240A (en
Inventor
蒲海洋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong Century Trading Co Ltd
Beijing Wodong Tianjun Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Wodong Tianjun Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jingdong Century Trading Co Ltd, Beijing Wodong Tianjun Information Technology Co Ltd
Priority to CN202010910061.7A
Publication of CN113760240A
Application granted
Publication of CN113760240B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/30 Creation or generation of source code
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/211 Schema design and management
    • G06F16/212 Schema design and management with details for data modelling support
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/283 Multi-dimensional databases or data warehouses, e.g. MOLAP or ROLAP

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method and a device for generating a data model, and relates to the technical field of computers. One embodiment of the method comprises the following steps: matching corresponding processing logic from an index configuration library according to index information and dimension information input by a user, wherein the index configuration library stores a plurality of processing logics, and each processing logic defines a data source, an index field, a dimension field, filtering conditions of the index field and/or the dimension field, and an association relationship between an operator and the data source; optimizing and splicing the processing logic to generate an executable data development script; and executing the executable data development script, thereby generating a data model. This embodiment can solve the technical problems of large script development workload and poor reusability.

Description

Method and device for generating data model
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a method and apparatus for generating a data model.
Background
Currently, most data models in data warehouses are generated by business analysts or data developers who develop HQL scripts.
In the process of implementing the present invention, the inventor finds that at least the following problems exist in the prior art:
1) The threshold for developing HQL scripts is high, and the development workload is large;
2) HQL scripts have poor readability and poor reusability, their code is voluminous and disorganized and inconvenient to maintain, and they can also cause data ambiguity.
Disclosure of Invention
In view of the above, the embodiments of the present invention provide a method and an apparatus for generating a data model, so as to solve the technical problems of large script development workload and poor reusability.
To achieve the above object, according to one aspect of an embodiment of the present invention, there is provided a method of generating a data model, including:
matching corresponding processing logic from an index configuration library according to index information and dimension information input by a user; wherein, the index configuration library stores a plurality of processing logics, and each processing logic defines a data source, an index field, a dimension field, filtering conditions of the index field and/or the dimension field, and an association relationship between an operator and the data source;
optimizing and splicing the processing logic to generate an executable data development script;
Executing the executable data development script, thereby generating a data model.
Optionally, before matching the corresponding processing logic from the index configuration library according to the index information and the dimension information input by the user, the method further comprises:
generating processing logic according to configuration information pre-configured by a user, and storing the processing logic into an index configuration library;
the configuration information comprises a data source, an index field, a dimension field, filtering conditions of the index field and/or the dimension field, and an association relationship between operators and the data source.
Optionally, optimizing and stitching the processing logic to generate an executable data development script, comprising:
Reading the processing logic to obtain the data source, the index field, the dimension field, the filtering conditions of the index field and/or the dimension field, and the association relationship between the operator and the data source defined in the processing logic;
Instantiating the data source, the index field, the dimension field, and the filtering conditions of the index field and/or the dimension field to generate an atomic table;
and optimizing and splicing the internal calculation logic of the atomic table according to the association relationship between the operator and the data source so as to generate an executable data development script.
Optionally, optimizing and splicing the internal calculation logic of the atomic table according to the association relationship between the operator and the data source to generate an executable data development script includes:
processing the atomic table according to the association relationship between the operator and the data source to generate a derived table;
and optimizing and splicing the internal calculation logic of the derived table to generate an executable data development script.
Optionally, the atomic table is an AtomicTable object, and the derived table is a DerivedTable object.
Optionally, a field whose numerical value is meaningful is used as an index field, and a field whose numerical value is not meaningful is used as a dimension field.
Optionally, the operators include at least one of an association operator, an aggregation operator, a merge operator, a deduplication operator, a selection operator, and a filtering operator.
In addition, according to another aspect of an embodiment of the present invention, there is provided an apparatus for generating a data model, including:
The matching module is used for matching corresponding processing logic from the index configuration library according to the index information and the dimension information input by the user; wherein, the index configuration library stores a plurality of processing logics, and each processing logic defines a data source, an index field, a dimension field, filtering conditions of the index field and/or the dimension field, and an association relationship between an operator and the data source;
the script module is used for optimizing and splicing the processing logic to generate an executable data development script;
And the execution module is used for executing the executable data development script so as to generate a data model.
Optionally, the matching module is further configured to:
Before corresponding processing logic is matched from an index configuration library according to index information and dimension information input by a user, generating processing logic according to configuration information pre-configured by the user, and storing the processing logic into the index configuration library;
the configuration information comprises a data source, an index field, a dimension field, filtering conditions of the index field and/or the dimension field, and an association relationship between operators and the data source.
Optionally, the script module is further configured to:
Reading the processing logic to obtain the data source, the index field, the dimension field, the filtering conditions of the index field and/or the dimension field, and the association relationship between the operator and the data source defined in the processing logic;
Instantiating the data source, the index field, the dimension field, and the filtering conditions of the index field and/or the dimension field to generate an atomic table;
and optimizing and splicing the internal calculation logic of the atomic table according to the association relationship between the operator and the data source so as to generate an executable data development script.
Optionally, the script module is further configured to:
processing the atomic table according to the association relationship between the operator and the data source to generate a derived table;
and optimizing and splicing the internal calculation logic of the derived table to generate an executable data development script.
Optionally, the atomic table is an AtomicTable object, and the derived table is a DerivedTable object.
Optionally, a field whose numerical value is meaningful is used as an index field, and a field whose numerical value is not meaningful is used as a dimension field.
Optionally, the operators include at least one of an association operator, an aggregation operator, a merge operator, a deduplication operator, a selection operator, and a filtering operator.
According to another aspect of an embodiment of the present invention, there is also provided an electronic device including:
one or more processors;
storage means for storing one or more programs,
The one or more processors implement the method of any of the embodiments described above when the one or more programs are executed by the one or more processors.
According to another aspect of an embodiment of the present invention, there is also provided a computer readable medium having stored thereon a computer program which, when executed by a processor, implements the method according to any of the embodiments described above.
One embodiment of the above invention has the following advantages or benefits: because the technical means of matching the corresponding processing logic from the index configuration library according to the index information and the dimension information input by the user, and optimizing and splicing the processing logic to generate an executable data development script, is adopted, the technical problems of large script development workload and poor reusability in the prior art are overcome. According to the embodiment of the invention, the processing logic is stored in the index configuration library in advance, so that the user can automatically generate the executable data development script only by inputting the index and the dimension; the code development amount is significantly reduced, and the development threshold is lowered. Because standardized processing logic is maintained in the index configuration library, the coding style can be unified, a unique data result is maintained, and data ambiguity is eliminated; code reusability can also be improved, which facilitates subsequent maintenance and secondary development.
Further effects of the above optional implementations are described below in connection with the specific embodiments.
Drawings
The drawings are included to provide a better understanding of the invention and are not to be construed as unduly limiting the invention. Wherein:
FIG. 1 is a schematic diagram of the main flow of a method of generating a data model according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of processing logic in an index configuration library according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of an HQL statement in accordance with an embodiment of the invention;
FIG. 4 is a schematic diagram of result_table_sql according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of a data model according to an embodiment of the invention;
FIG. 6 is a schematic diagram of the main flow of a method of generating a data model according to one referenceable embodiment of the invention;
FIG. 7 is a schematic diagram of the main flow of a method of generating a data model according to another referenceable embodiment of the invention;
FIG. 8 is a process in custom development mode according to an embodiment of the invention;
FIG. 9 is a schematic diagram of the main modules of an apparatus for generating a data model according to an embodiment of the present invention;
FIG. 10 is an exemplary system architecture diagram in which embodiments of the present invention may be applied;
Fig. 11 is a schematic diagram of a computer system suitable for use in implementing an embodiment of the invention.
Detailed Description
Exemplary embodiments of the present invention will now be described with reference to the accompanying drawings, in which various details of the embodiments of the present invention are included to facilitate understanding, and are to be considered merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
Fig. 1 is a schematic diagram of the main flow of a method of generating a data model according to an embodiment of the present invention. As an embodiment of the present invention, as shown in fig. 1, the method for generating a data model may include:
Step 101, matching corresponding processing logic from an index configuration library according to index information and dimension information input by a user.
In the embodiment of the invention, the user can match the corresponding processing logic from the index configuration library only by inputting the index information and the dimension information. The index configuration library stores a plurality of processing logics, and each processing logic defines a data source, an index field, a dimension field, filtering conditions of the index field and/or the dimension field, and an association relationship between operators and the data source. It should be noted that different processing logic may be maintained for different index fields and dimension fields, so that the index information and dimension information input by a user match a unique processing logic, thereby solving the problem of data ambiguity. Data ambiguity refers to the situation in which fields appearing in different places share the same name but are computed with different processing logic, so that multiple different data results exist for what appears to be the same field.
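The matching in step 101 can be pictured as a keyed lookup. The following is a minimal sketch under stated assumptions, not the patented implementation: the class names ProcessingLogic and IndexConfigLibrary, the table name orders, and the choice of keying entries by the pair (index, set of dimensions) are all hypothetical. Rejecting a second registration for the same key is one simple way to keep the match unique.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ProcessingLogic:
    """Hypothetical record for one entry of the index configuration library."""
    data_source: str
    index_field: str
    dimension_fields: tuple
    filter_condition: str = ""


class IndexConfigLibrary:
    """Toy in-memory library keyed by (index, set of dimensions)."""

    def __init__(self):
        self._entries = {}

    def register(self, logic: ProcessingLogic) -> None:
        key = (logic.index_field, frozenset(logic.dimension_fields))
        if key in self._entries:
            # One key maps to exactly one logic, so a match is never ambiguous.
            raise ValueError(f"processing logic already defined for {key}")
        self._entries[key] = logic

    def match(self, index: str, dimensions) -> Optional[ProcessingLogic]:
        """Return the unique logic for the requested index and dimensions."""
        return self._entries.get((index, frozenset(dimensions)))


library = IndexConfigLibrary()
library.register(ProcessingLogic("orders", "gmv", ("merchant_id",), "status = 'paid'"))
matched = library.match("gmv", ["merchant_id"])
print(matched.data_source if matched else None)
```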
Optionally, a field whose numerical value is meaningful is used as an index field, and a field whose numerical value is not meaningful is used as a dimension field. For example, consider a transaction index:
Dimensions: merchant ID, user ID;
Indexes: transaction amount (GMV), transaction order volume.
Merchant ID 1980 vs. 2000: the numerical difference carries no meaning, so merchant ID is a dimension.
GMV 1000 vs. 10000: the numerical difference is meaningful, so GMV is an index.
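One way to see the distinction is that an index can be aggregated while a dimension only serves as a grouping key. The sketch below uses the merchant IDs and GMV values from the example; the third row (GMV 500) and the field names are made up for illustration.

```python
from collections import defaultdict

# Illustrative rows; the third row is invented so that aggregation has an effect.
rows = [
    {"merchant_id": 1980, "gmv": 1000},
    {"merchant_id": 2000, "gmv": 10000},
    {"merchant_id": 1980, "gmv": 500},
]

# merchant_id is a dimension: it is only used as a grouping key.
# gmv is an index: summing its values is meaningful.
gmv_by_merchant = defaultdict(int)
for row in rows:
    gmv_by_merchant[row["merchant_id"]] += row["gmv"]

totals = dict(gmv_by_merchant)
print(totals)
```

Summing merchant IDs, by contrast, would produce a number with no business meaning.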
Optionally, the operators include at least one of an association operator (join), an aggregation operator (group_by), a merge operator (union), a deduplication operator (distinct), a selection operator (select), and a filtering operator (filter_sql). Optionally, the association relationship may include a left association, a right association, and/or an inner association.
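The operator and association vocabulary can be captured as two small enumerations. This is a sketch only: the enum names, the helper join_clause, and the mapping of left, right, and inner association to SQL join keywords are assumptions made for illustration.

```python
from enum import Enum


class Operator(Enum):
    """Operator names as listed in the text."""
    JOIN = "join"
    GROUP_BY = "group_by"
    UNION = "union"
    DISTINCT = "distinct"
    SELECT = "select"
    FILTER = "filter_sql"


class JoinType(Enum):
    """Assumed SQL rendering of the three association relationships."""
    LEFT = "LEFT JOIN"
    RIGHT = "RIGHT JOIN"
    INNER = "INNER JOIN"


def join_clause(left: str, right: str, key: str, how: JoinType) -> str:
    """Render an association between two data sources on a shared key."""
    return f"{left} {how.value} {right} ON {left}.{key} = {right}.{key}"


clause = join_clause("orders", "merchants", "merchant_id", JoinType.LEFT)
print(clause)
```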
Prior to step 101, it may further include: generating processing logic according to configuration information pre-configured by a user, and storing the processing logic into an index configuration library; the configuration information comprises a data source, an index field, a dimension field, filtering conditions of the index field and/or the dimension field, and an association relationship between operators and the data source. In order to match the corresponding processing logic from the index configuration library, before step 101, the filtering conditions of the data source, the index field, the dimension field, the index field and/or the dimension field, the association relationship between the operator and the data source, and the like need to be configured in advance, so as to generate the processing logic. Optionally, the conversion operation of the derived table may also be configured, and accordingly, the processing logic also defines the conversion operation of the derived table.
Taking the transaction index as an example, processing logic shown in fig. 2 is generated according to configuration information pre-configured by a user, such as a data source, an index field, a dimension field, filtering conditions of the index field and/or the dimension field, an association relationship between an operator and the data source, and the like, and the processing logic is stored in an index configuration library.
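Since fig. 2 is not reproduced here, the following sketch only illustrates the idea of turning pre-configured information into a stored processing logic. The key names, the JSON serialization, and the dictionary used as a stand-in for the index configuration library are all hypothetical.

```python
import json

# Keys mirror the configuration items listed in the text; the key names
# themselves are hypothetical.
REQUIRED_KEYS = {
    "data_sources", "index_fields", "dimension_fields",
    "filters", "operators", "associations",
}


def build_processing_logic(config: dict) -> str:
    """Validate a user configuration and serialize it as one processing logic."""
    missing = REQUIRED_KEYS - set(config)
    if missing:
        raise ValueError(f"missing configuration keys: {sorted(missing)}")
    return json.dumps(config, sort_keys=True)


index_config_library = {}  # stand-in for the index configuration library

config = {
    "data_sources": ["orders", "merchants"],
    "index_fields": ["gmv"],
    "dimension_fields": ["merchant_id"],
    "filters": {"orders": "status = 'paid'"},
    "operators": ["join", "group_by"],
    "associations": [{"left": "orders", "right": "merchants", "type": "left"}],
}
index_config_library[("gmv", "merchant_id")] = build_processing_logic(config)
stored = json.loads(index_config_library[("gmv", "merchant_id")])
print(stored["operators"])
```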
Step 102, optimizing and splicing the processing logic to generate an executable data development script.
After processing logic required by a user is matched from the index configuration library, the processing logic is optimized and spliced, so that an executable data development script is automatically generated. The user can generate the executable data development script without inputting the association relation between the data sources and the conversion operation of the derivative table, so that the code development amount is obviously reduced.
Optionally, step 102 may include: reading the processing logic to obtain the data source, the index field, the dimension field, the filtering conditions of the index field and/or the dimension field, and the association relationship between the operator and the data source defined in the processing logic; instantiating the data source, the index field, the dimension field, and the filtering conditions of the index field and/or the dimension field to generate an atomic table; and optimizing and splicing the internal calculation logic of the atomic table according to the association relationship between the operator and the data source so as to generate an executable data development script.
In this step, the processing logic matched from the index configuration library is first read to obtain the information defined in it, such as the data source, the index field, the dimension field, the filtering conditions of the index field and/or the dimension field, and the association relationship between the operator and the data source; an atomic table is then instantiated from this information; finally, the internal calculation logic of the atomic table is optimized and spliced, so that an executable data development script is generated automatically. Optionally, the executable data development script may be an HQL statement (HQL being the SQL-like query language of Hive). As shown in fig. 3, a complete HQL statement can be generated by steps 101 and 102, where result_table_sql is a variable, as shown in fig. 4.
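The contents of fig. 3 and fig. 4 are not available in this text, so the following is only a guess at the general shape: an outer statement with a placeholder that is filled by the generated result_table_sql. The template wording, the target table name dm_gmv_by_merchant, and the sample query are assumptions; only the variable name result_table_sql comes from the text.

```python
from string import Template

# Hypothetical outer statement; INSERT OVERWRITE TABLE ... SELECT is standard HiveQL.
HQL_TEMPLATE = Template("INSERT OVERWRITE TABLE ${target_table}\n${result_table_sql}")


def render_hql(target_table: str, result_table_sql: str) -> str:
    """Splice the generated query into the outer HQL statement."""
    return HQL_TEMPLATE.substitute(
        target_table=target_table,
        result_table_sql=result_table_sql,
    )


result_table_sql = (
    "SELECT merchant_id, SUM(amount) AS gmv FROM orders GROUP BY merchant_id"
)
hql = render_hql("dm_gmv_by_merchant", result_table_sql)
print(hql)
```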
Optionally, optimizing and splicing the internal calculation logic of the atomic table according to the association relationship between the operator and the data source to generate an executable data development script includes: processing the atomic table according to the association relationship between the operator and the data source to generate a derived table; and optimizing and splicing the internal calculation logic of the derived table to generate an executable data development script. It should be noted that if the processing logic further defines a conversion operation of the derived table, the derived table is first converted according to the defined conversion operation, and the internal calculation logic of the derived table is then optimized and spliced.
Optionally, the atomic table is an AtomicTable object, which is an executable abstract data structure; the derived table is a DerivedTable object. A plurality of atomic tables are combined through different association relationships to generate a derived table (namely, an intermediate table); the derived table is also an executable abstract data structure and is composed of a plurality of AtomicTable objects.
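The text names AtomicTable and DerivedTable but does not give their attributes or methods. The sketch below is therefore an assumption about how such objects might look: each one can render itself to SQL, and a derived table joins two atomic tables. The field names, the to_sql method, and the two-table restriction are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AtomicTable:
    """One data source with its dimension fields, index fields, and filter."""
    source: str
    dimension_fields: List[str]
    index_fields: List[str]
    filter_condition: str = ""

    def to_sql(self) -> str:
        columns = ", ".join(self.dimension_fields + self.index_fields)
        sql = f"SELECT {columns} FROM {self.source}"
        if self.filter_condition:
            sql += f" WHERE {self.filter_condition}"
        return sql


@dataclass
class DerivedTable:
    """Intermediate table built by associating two atomic tables."""
    left: AtomicTable
    right: AtomicTable
    join_key: str
    join_type: str = "LEFT JOIN"

    def to_sql(self) -> str:
        return (
            f"SELECT * FROM ({self.left.to_sql()}) l "
            f"{self.join_type} ({self.right.to_sql()}) r "
            f"ON l.{self.join_key} = r.{self.join_key}"
        )


orders = AtomicTable("orders", ["merchant_id"], ["amount"], "status = 'paid'")
merchants = AtomicTable("merchants", ["merchant_id", "city"], [])
derived = DerivedTable(orders, merchants, "merchant_id")
print(derived.to_sql())
```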
Step 103, executing the executable data development script, thereby generating a data model.
Executing the executable data development script, a data model may be generated, as shown in FIG. 5.
According to the various embodiments described above, it can be seen that the technical means of generating the executable data development script by matching the corresponding processing logic from the index configuration library according to the index information and the dimension information input by the user and optimizing and splicing the processing logic solves the technical problems of large script development workload and poor reusability in the prior art. According to the embodiment of the invention, the processing logic is stored in the index configuration library in advance, so that the user can automatically generate the executable data development script only by inputting the index and the dimension, the code development amount is obviously reduced, and the development threshold is lowered. Because standardized processing logic is maintained in the index configuration library, the coding style can be unified, the unique data result is kept, and the ambiguity of the data is eliminated; code reusability can also be improved, and subsequent maintenance and secondary development are facilitated.
In order to facilitate the development of a data model by a user, two development modes are provided in the embodiment of the invention: standard development mode [ standard ] and custom development mode [ dev ]. In the standard development mode, a user can automatically generate an executable data development script only by inputting index information and dimension information; in the custom development mode, a user needs to input information such as a data source, an index field, a dimension field, filtering conditions of the index field and/or the dimension field, an association relation between an operator and the data source and the like, so as to generate a derivative table, and then internal calculation logic of the derivative table is optimized and spliced, so that an executable data development script is generated.
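The difference between the two modes comes down to where the configuration originates. A minimal sketch, assuming a hypothetical request dictionary and a library keyed by index and sorted dimensions; only the mode labels standard and dev come from the text.

```python
from enum import Enum


class Mode(Enum):
    STANDARD = "standard"
    DEV = "dev"


def resolve_config(mode: Mode, request: dict, library: dict) -> dict:
    """Return the configuration that the script generator should consume."""
    if mode is Mode.STANDARD:
        # Standard mode: only index and dimensions are supplied by the user.
        key = (request["index"], tuple(sorted(request["dimensions"])))
        return library[key]  # raises KeyError when nothing matches
    # Dev mode: the user supplies the complete configuration.
    return request["config"]


library = {("gmv", ("merchant_id",)): {"data_source": "orders"}}
standard = resolve_config(
    Mode.STANDARD, {"index": "gmv", "dimensions": ["merchant_id"]}, library
)
custom = resolve_config(Mode.DEV, {"config": {"data_source": "refunds"}}, library)
print(standard, custom)
```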
FIG. 6 is a schematic diagram of the main flow of a method of generating a data model according to one referenceable embodiment of the invention. As yet another embodiment of the present invention, as shown in fig. 6, taking a standard development mode as an example, the method for generating a data model may include:
Step 601, matching corresponding processing logic from an index configuration library according to index information and dimension information input by a user.
In the standard development mode, a user can match corresponding processing logic from the index configuration library only by inputting index information and dimension information. The index configuration library stores a plurality of processing logics, and each processing logic defines a data source, an index field, a dimension field, filtering conditions of the index field and/or the dimension field, and an association relation between operators and the data source.
In order to match processing logic from the index configuration library, the data source, the index field, the dimension field, the filtering conditions of the index field and/or the dimension field, the association relationship between the operator and the data source, and the like need to be configured in advance before step 601, so that the processing logic is generated.
Step 602, reading the processing logic to obtain the data source, the index field, the dimension field, the filtering conditions of the index field and/or the dimension field, and the association relationship between the operator and the data source defined in the processing logic.
Step 603, instantiating the data source, the index field, the dimension field, and the filtering conditions of the index field and/or the dimension field to generate an atomic table (such as an AtomicTable object).
Step 604, processing the atomic table according to the association relationship between the operator and the data source to generate a derived table (such as a DerivedTable object).
Step 605, optimizing and stitching the internal computing logic of the derivative table to generate an executable data development script.
Step 606, executing the executable data development script, thereby generating a data model.
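Steps 601 to 606 can be traced end to end in a toy setting. The sketch below makes several assumptions: SQLite (via Python's standard sqlite3 module) stands in for Hive, the library is a plain dictionary, the only operator is an aggregation, and no optimization is performed because the text does not specify what the optimization consists of. All table and field names are hypothetical.

```python
import sqlite3

# Hypothetical library: (index, dimension) -> processing logic.
LIBRARY = {
    ("gmv", "merchant_id"): {
        "data_source": "orders",
        "index_field": "amount",
        "dimension_field": "merchant_id",
        "filter": "status = 'paid'",
    }
}


def generate_script(index: str, dimension: str, target: str) -> str:
    logic = LIBRARY[(index, dimension)]  # steps 601-602: match and read
    dim, idx = logic["dimension_field"], logic["index_field"]
    # Step 603: the atomic table is the filtered projection of one source.
    atomic_sql = f"SELECT {dim}, {idx} FROM {logic['data_source']} WHERE {logic['filter']}"
    # Step 604: the derived table applies the aggregation operator.
    derived_sql = (
        f"SELECT {dim}, SUM({idx}) AS {index} "
        f"FROM ({atomic_sql}) AS atomic GROUP BY {dim}"
    )
    # Step 605: splice into an executable statement.
    return f"CREATE TABLE {target} AS {derived_sql}"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (merchant_id INTEGER, amount INTEGER, status TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1980, 1000, "paid"), (2000, 10000, "paid"), (1980, 300, "cancelled")],
)
conn.execute(generate_script("gmv", "merchant_id", "gmv_by_merchant"))  # step 606
result = conn.execute(
    "SELECT merchant_id, gmv FROM gmv_by_merchant ORDER BY merchant_id"
).fetchall()
print(result)
```

The user-facing input is just the pair ("gmv", "merchant_id"); everything else comes from the stored processing logic.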
In addition, the implementation of the method for generating a data model according to one embodiment of the present invention is described in detail in the above method for generating a data model, and thus the description thereof will not be repeated here.
FIG. 7 is a schematic diagram of the main flow of a method of generating a data model according to another referenceable embodiment of the invention. As another embodiment of the present invention, as shown in fig. 7, taking the custom development mode as an example, the method for generating a data model may include:
Step 701, instantiating an atomic table according to the data source, the index field, the dimension field, and the filtering conditions of the index field and/or the dimension field input by a user.
Step 702, processing the atomic table according to the association relationship between the operator and the data source input by the user to generate a derived table.
Step 703, optimizing and stitching the internal computing logic of the derivative table to generate an executable data development script.
Step 704, executing the executable data development script, thereby generating a data model.
In the custom development mode, because the corresponding processing logic cannot be matched from the index configuration library, the user is required to set all configuration information, such as the data source, the index field, the dimension field, the filtering conditions of the index field and/or the dimension field, the operators, the association relationships between the data sources, the conversion operations of the derived table, and the like. As shown in fig. 8, the development framework sequentially generates the atomic table and the derived table according to the set configuration information and then performs optimization and splicing, so that the executable data development script is generated. Therefore, the development amount in the standard development mode is less than that in the custom development mode.
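In the custom development mode the same generation steps run, but every piece of configuration is supplied by the user rather than looked up. A minimal sketch, assuming a hypothetical configuration layout with exactly two data sources; the key names and the helper functions atomic_sql and custom_script are not from the text.

```python
def atomic_sql(source: dict) -> str:
    """Step 701: one atomic table per user-supplied data source."""
    sql = f"SELECT {', '.join(source['fields'])} FROM {source['table']}"
    if source.get("filter"):
        sql += f" WHERE {source['filter']}"
    return sql


def custom_script(config: dict) -> str:
    """Steps 702-703: associate the atomic tables and splice the script."""
    left, right = (atomic_sql(s) for s in config["sources"])
    key = config["join_key"]
    return (
        f"SELECT {config['select']} FROM ({left}) l "
        f"{config['join_type']} ({right}) r ON l.{key} = r.{key} "
        f"GROUP BY {config['group_by']}"
    )


# Everything below is typed in by the user in this mode.
user_config = {
    "sources": [
        {"table": "orders", "fields": ["merchant_id", "amount"], "filter": "status = 'paid'"},
        {"table": "merchants", "fields": ["merchant_id", "city"], "filter": ""},
    ],
    "join_type": "LEFT JOIN",
    "join_key": "merchant_id",
    "select": "r.city, SUM(l.amount) AS gmv",
    "group_by": "r.city",
}
script = custom_script(user_config)
print(script)
```

Compared with the standard mode, the amount of information the user must provide is visibly larger, which matches the comparison of development amounts made above.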
In addition, the implementation of the method for generating a data model according to one embodiment of the present invention is described in detail in the above method for generating a data model, and thus the description thereof will not be repeated here.
FIG. 9 is a schematic diagram of main modules of an apparatus for generating a data model according to an embodiment of the present invention, and as shown in FIG. 9, the apparatus 900 for generating a data model includes a matching module 901, a script module 902, and an execution module 903; the matching module 901 is configured to match corresponding processing logic from the index configuration library according to the index information and the dimension information input by the user; wherein, the index configuration library stores a plurality of processing logics, and each processing logic defines a data source, an index field, a dimension field, filtering conditions of the index field and/or the dimension field, and an association relationship between an operator and the data source; script module 902 is configured to optimize and splice the processing logic to generate an executable data development script; the execution module 903 is configured to execute the executable data development script, thereby generating a data model.
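The three-module structure of the apparatus maps naturally onto three small collaborating classes. This is only an illustrative decomposition: the class and method names are hypothetical, the script module performs no optimization, and the execution module simply forwards the script to a caller-supplied function instead of a real engine.

```python
from typing import Callable, Dict, Tuple


class MatchingModule:
    """Looks up processing logic by (index, dimension)."""

    def __init__(self, library: Dict[Tuple[str, str], dict]):
        self.library = library

    def match(self, index: str, dimension: str) -> dict:
        return self.library[(index, dimension)]


class ScriptModule:
    """Splices processing logic into a script (no optimization attempted)."""

    def build(self, logic: dict) -> str:
        return (
            f"SELECT {logic['dimension']}, SUM({logic['index']}) "
            f"FROM {logic['source']} GROUP BY {logic['dimension']}"
        )


class ExecutionModule:
    """Hands the script to whatever engine the caller provides."""

    def __init__(self, run: Callable[[str], None]):
        self.run = run

    def execute(self, script: str) -> None:
        self.run(script)


class DataModelGenerator:
    def __init__(self, matching: MatchingModule, script: ScriptModule,
                 execution: ExecutionModule):
        self.matching = matching
        self.script = script
        self.execution = execution

    def generate(self, index: str, dimension: str) -> str:
        logic = self.matching.match(index, dimension)
        sql = self.script.build(logic)
        self.execution.execute(sql)
        return sql


executed = []  # records scripts instead of running them on a real engine
generator = DataModelGenerator(
    MatchingModule({("gmv", "merchant_id"): {
        "source": "orders", "index": "amount", "dimension": "merchant_id"}}),
    ScriptModule(),
    ExecutionModule(executed.append),
)
script = generator.generate("gmv", "merchant_id")
print(script)
```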
Optionally, the matching module 901 is further configured to:
Before corresponding processing logic is matched from an index configuration library according to index information and dimension information input by a user, generating processing logic according to configuration information pre-configured by the user, and storing the processing logic into the index configuration library;
the configuration information comprises a data source, an index field, a dimension field, filtering conditions of the index field and/or the dimension field, and an association relationship between operators and the data source.
Optionally, the script module 902 is further configured to:
Reading the processing logic to obtain the filtering conditions of the data source, the index field, the dimension field, the index field and/or the dimension field and the association relationship between the operator and the data source defined in the processing logic;
Instantiating the filtering conditions of the data source, the index field, the dimension field, the index field and/or the dimension field to generate a factor table;
and optimizing and splicing the internal calculation logic of the atom table according to the association relation between the operator and the data source so as to generate an executable data development script.
Optionally, the script module 902 is further configured to:
processing the atom table according to the association relation between the operator and the data source to generate a table derived table;
and optimizing and splicing the internal computing logic of the derivative table to generate an executable data development script.
Optionally, the atom table is an AtomicTable object, and the derived table is a DerivedTable object.
Optionally, a field whose numerical value is meaningful is used as an index field, and a field whose numerical value is not meaningful is used as a dimension field.
Optionally, the operators include at least one of an association operator, an aggregation operator, a merge operator, a deduplication operator, a selection operator, and a filtering operator.
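For illustration only, the six operator kinds listed above could be represented as an enumeration. The mapping of each operator to a SQL keyword is an assumption of this sketch and is not stated in the embodiments.

```python
from enum import Enum


class Operator(Enum):
    """Operator kinds named above; the SQL keywords are an assumed mapping."""
    ASSOCIATION = "JOIN"
    AGGREGATION = "GROUP BY"
    MERGE = "UNION ALL"
    DEDUPLICATION = "DISTINCT"
    SELECTION = "SELECT"
    FILTERING = "WHERE"


def keywords_for(operator_names):
    """Translate configured operator names into the assumed SQL keywords."""
    return [Operator[name.upper()].value for name in operator_names]


keywords = keywords_for(["association", "aggregation", "filtering"])
```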
As can be seen from the embodiments described above, the corresponding processing logic is matched from the index configuration library according to the index information and the dimension information input by the user, and the processing logic is then optimized and spliced to generate an executable data development script; this solves the technical problems of heavy script development workload and poor reusability in the prior art. Because the processing logic is stored in the index configuration library in advance, the executable data development script can be generated automatically once the user inputs the indexes and dimensions, which significantly reduces the amount of code to be developed and lowers the development threshold. Because standardized processing logic is maintained in the index configuration library, the coding style can be unified and a single consistent data result is maintained, eliminating ambiguity in the data; code reusability is also improved, which facilitates subsequent maintenance and secondary development.
The specific implementation of the apparatus for generating a data model according to the present invention is described in detail in the method for generating a data model described above, and thus the description thereof will not be repeated here.
Fig. 10 illustrates an exemplary system architecture 1000 to which a method of generating a data model or an apparatus of generating a data model of an embodiment of the present invention may be applied.
As shown in FIG. 10, a system architecture 1000 may include terminal devices 1001, 1002, 1003, a network 1004, and a server 1005. The network 1004 serves as a medium for providing communication links between the terminal devices 1001, 1002, 1003 and the server 1005. The network 1004 may include various connection types, such as wired or wireless communication links, fiber optic cables, and the like.
A user can use the terminal devices 1001, 1002, 1003 to interact with the server 1005 via the network 1004 to receive or transmit messages and the like. Various communication client applications, such as shopping applications, web browser applications, search applications, instant messaging tools, mailbox clients, and social platform software (by way of example only), may be installed on the terminal devices 1001, 1002, 1003.
The terminal devices 1001, 1002, 1003 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smartphones, tablets, laptop and desktop computers, and the like.
The server 1005 may be a server providing various services, such as a background management server (merely an example) providing support for shopping websites browsed by the user using the terminal devices 1001, 1002, 1003. The background management server may analyze and process received data such as an article information query request, and feed back the processing result (e.g., target push information or article information, merely as an example) to the terminal device.
It should be noted that, the method for generating a data model according to the embodiment of the present invention is generally performed by the server 1005, and accordingly, the device for generating a data model is generally provided in the server 1005. The method for generating a data model provided by the embodiment of the present invention may also be performed by the terminal devices 1001, 1002, 1003, and accordingly, the apparatus for generating a data model may be provided in the terminal devices 1001, 1002, 1003.
It should be understood that the number of terminal devices, networks and servers in fig. 10 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
Referring now to FIG. 11, there is illustrated a schematic diagram of a computer system 1100 suitable for use in implementing the terminal device of an embodiment of the present invention. The terminal device shown in fig. 11 is only an example, and should not impose any limitation on the functions and the scope of use of the embodiment of the present invention.
As shown in FIG. 11, the computer system 1100 includes a Central Processing Unit (CPU) 1101, which can execute various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 1102 or a program loaded from a storage section 1108 into a Random Access Memory (RAM) 1103. In the RAM 1103, various programs and data required for the operation of the system 1100 are also stored. The CPU 1101, ROM 1102, and RAM 1103 are connected to each other by a bus 1104. An input/output (I/O) interface 1105 is also connected to the bus 1104.
The following components are connected to the I/O interface 1105: an input section 1106 including a keyboard, a mouse, and the like; an output section 1107 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), a speaker, and the like; a storage section 1108 including a hard disk or the like; and a communication section 1109 including a network interface card such as a LAN card, a modem, and the like. The communication section 1109 performs communication processing via a network such as the Internet. A drive 1110 is also connected to the I/O interface 1105 as needed. A removable medium 1111, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 1110 as needed, so that a computer program read therefrom is installed into the storage section 1108 as needed.
In particular, according to embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for performing the method shown in the flowchart. In such an embodiment, the computer program can be downloaded and installed from a network via the communication section 1109, and/or installed from the removable medium 1111. When the computer program is executed by the Central Processing Unit (CPU) 1101, the above-described functions defined in the system of the present invention are performed.
The computer readable medium shown in the present invention may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present invention, however, the computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with the computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer programs according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules involved in the embodiments of the present invention may be implemented in software or in hardware. The described modules may also be provided in a processor; for example, a processor may be described as including a matching module, a script module, and an execution module. In some cases, the names of these modules do not limit the modules themselves.
As another aspect, the present invention also provides a computer-readable medium, which may be contained in the apparatus described in the above embodiments, or may exist separately without being assembled into the apparatus. The computer-readable medium carries one or more programs which, when executed by the apparatus, cause the apparatus to implement the following method: matching corresponding processing logic from an index configuration library according to index information and dimension information input by a user, wherein the index configuration library stores a plurality of pieces of processing logic, and each piece of processing logic defines a data source, an index field, a dimension field, filtering conditions of the index field and/or the dimension field, and an association relationship between an operator and the data source; optimizing and splicing the processing logic to generate an executable data development script; and executing the executable data development script, thereby generating a data model.
According to the technical solution of the embodiments of the present invention, the corresponding processing logic is matched from the index configuration library according to the index information and the dimension information input by the user, and the processing logic is optimized and spliced to generate an executable data development script, which overcomes the technical problems of heavy script development workload and poor reusability in the prior art. Because the processing logic is stored in the index configuration library in advance, the executable data development script can be generated automatically once the user inputs the indexes and dimensions, which significantly reduces the amount of code to be developed and lowers the development threshold. Because standardized processing logic is maintained in the index configuration library, the coding style can be unified and a single consistent data result is maintained, eliminating ambiguity in the data; code reusability is also improved, which facilitates subsequent maintenance and secondary development.
The above embodiments do not limit the scope of the present invention. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives can occur depending upon design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present invention should be included in the scope of the present invention.

Claims (8)

1. A method of generating a data model, comprising:
matching corresponding processing logic from an index configuration library according to index information and dimension information input by a user; wherein, the index configuration library stores a plurality of processing logics, and each processing logic defines a data source, an index field, a dimension field, filtering conditions of the index field and/or the dimension field, and an association relationship between an operator and the data source;
optimizing and splicing the processing logic to generate an executable data development script;
executing the executable data development script, thereby generating a data model;
wherein optimizing and splicing the processing logic to generate an executable data development script comprises:
reading the processing logic to obtain the data source, the index field, the dimension field, the filtering conditions of the index field and/or the dimension field, and the association relationship between the operator and the data source that are defined in the processing logic;
instantiating the data source, the index field, the dimension field, and the filtering conditions of the index field and/or the dimension field to generate an atom table;
and optimizing and splicing the internal calculation logic of the atom table according to the association relationship between the operator and the data source, so as to generate an executable data development script.
2. The method of claim 1, further comprising, prior to matching corresponding processing logic from the index configuration library based on the index information and the dimension information entered by the user:
generating processing logic according to configuration information pre-configured by a user, and storing the processing logic into an index configuration library;
the configuration information comprises a data source, an index field, a dimension field, filtering conditions of the index field and/or the dimension field, and an association relationship between operators and the data source.
3. The method of claim 1, wherein optimizing and splicing the internal calculation logic of the atom table according to the association relationship between the operator and the data source to generate an executable data development script comprises:
processing the atom table according to the association relationship between the operator and the data source to generate a derived table;
and optimizing and splicing the internal calculation logic of the derived table to generate an executable data development script.
4. The method of claim 3, wherein the atom table is an AtomicTable object and the derived table is a DerivedTable object.
5. The method of claim 1, wherein the operators comprise at least one of an association operator, an aggregation operator, a merge operator, a deduplication operator, a selection operator, and a filtering operator.
6. An apparatus for generating a data model, comprising:
The matching module is used for matching corresponding processing logic from the index configuration library according to the index information and the dimension information input by the user; wherein, the index configuration library stores a plurality of processing logics, and each processing logic defines a data source, an index field, a dimension field, filtering conditions of the index field and/or the dimension field, and an association relationship between an operator and the data source;
the script module is used for optimizing and splicing the processing logic to generate an executable data development script;
the execution module is used for executing the executable data development script so as to generate a data model;
the script module is further configured to:
read the processing logic to obtain the data source, the index field, the dimension field, the filtering conditions of the index field and/or the dimension field, and the association relationship between the operator and the data source that are defined in the processing logic;
instantiate the data source, the index field, the dimension field, and the filtering conditions of the index field and/or the dimension field to generate an atom table;
and optimize and splice the internal calculation logic of the atom table according to the association relationship between the operator and the data source, so as to generate an executable data development script.
7. An electronic device, comprising:
one or more processors;
storage means for storing one or more programs,
wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the method of any of claims 1-5.
8. A computer readable medium, on which a computer program is stored, characterized in that the program, when being executed by a processor, implements the method according to any of claims 1-5.
CN202010910061.7A 2020-09-02 2020-09-02 Method and device for generating data model Active CN113760240B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010910061.7A CN113760240B (en) 2020-09-02 2020-09-02 Method and device for generating data model

Publications (2)

Publication Number Publication Date
CN113760240A CN113760240A (en) 2021-12-07
CN113760240B true CN113760240B (en) 2024-06-14

Family

ID=78785777

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010910061.7A Active CN113760240B (en) 2020-09-02 2020-09-02 Method and device for generating data model

Country Status (1)

Country Link
CN (1) CN113760240B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116881244B (en) * 2023-06-05 2024-03-26 易智瑞信息技术有限公司 Real-time processing method and device for space data based on column storage database

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101996250A (en) * 2010-11-15 2011-03-30 中国科学院计算技术研究所 Hadoop-based mass stream data storage and query method and system

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103020158B (en) * 2012-11-26 2016-09-07 中兴通讯股份有限公司 A kind of report form creation, device and system
CN104850623B (en) * 2015-05-19 2018-08-07 杭州迅涵科技有限公司 Multi-dimensional data analysis model dynamic expansion method and system
CN109002289A (en) * 2017-06-07 2018-12-14 北京京东尚科信息技术有限公司 A kind of method and apparatus constructing data model
CN107766132B (en) * 2017-06-25 2019-03-15 平安科技(深圳)有限公司 Multi-task scheduling method, application server and computer readable storage medium
CN110427434B (en) * 2019-06-28 2022-06-07 苏宁云计算有限公司 Multidimensional data query method and device
CN110674117A (en) * 2019-09-26 2020-01-10 京东数字科技控股有限公司 Data modeling method and device, computer readable medium and electronic equipment
CN111159204B (en) * 2020-01-02 2020-08-11 北京东方金信科技有限公司 Method and system for generating label in configuration mode

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
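Analysis of KPI Indicators for Life Insurance Business Management Based on Data Mining Methods; 黄慧超 (Huang Huichao); 张晓龙 (Zhang Xiaolong); 软件导刊 (Software Guide); 20080731 (Issue 07); full text *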

Also Published As

Publication number Publication date
CN113760240A (en) 2021-12-07

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant