CN111143370B - Method, apparatus and computer-readable storage medium for analyzing relationships between a plurality of data tables - Google Patents

Publication number
CN111143370B
Authority
CN
China
Prior art keywords
data
relationship
field
instruction
receiving
Prior art date
Legal status
Active
Application number
CN201911379115.5A
Other languages
Chinese (zh)
Other versions
CN111143370A (en)
Inventor
王燕忠
Current Assignee
Beijing Qiqi Technology Co ltd
Original Assignee
Beijing Qiqi Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Beijing Qiqi Technology Co ltd
Priority to CN201911379115.5A
Publication of CN111143370A
Application granted
Publication of CN111143370B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2282 Tablespace storage structures; Management thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method, an apparatus and a computer-readable storage medium for analyzing relationships among a plurality of data tables, wherein one or more relationship graphs are formed among the data tables. The method comprises the following steps: visualizing at least one of the relationship graphs; receiving a first instruction from a user to perform an operation on the relationship graph; in response to receiving the first instruction, visualizing at least one form associated with the at least one relationship graph, the form comprising a plurality of fields; receiving a second instruction from the user to modify a field; and in response to receiving the second instruction, modifying and saving the field accordingly. By performing analysis or retrieval operations on the table relationship graph, the analysis method provided by the invention can increase the accuracy of data identification.

Description

Method, apparatus and computer-readable storage medium for analyzing relationships between a plurality of data tables
Technical Field
The present invention relates generally to the field of databases. More particularly, the present invention relates to a method, apparatus and computer-readable storage medium for analyzing relationships between a plurality of data tables.
Background
With the development of the digital era, data needs to be uploaded to a designated information system for unified analysis and processing. In this process, however, the issued planning tables are often incomplete and the data uploaded from each locality is often inaccurate, so the uploaded data cannot be analyzed centrally. A data acquisition tool can be developed to address the inaccurate uploaded data and the incomplete acquisition-standard planning tables. For the collected data, however, the relationships between the uploaded data tables are usually determined manually, which makes relationship identification inefficient and incurs considerable labor cost. How to determine the relationships between data tables efficiently while also improving the accuracy of relationship identification has therefore become a technical challenge.
Disclosure of Invention
To at least partially solve the technical problems mentioned in the background, aspects of the present invention provide a method, apparatus, and computer-readable storage medium for analyzing relationships between a plurality of data tables.
In one aspect, the present invention provides a method for analyzing relationships between a plurality of data tables having one or more relationship graphs formed therebetween, the method comprising: visualizing at least one of the relationship graphs; receiving a first instruction from a user about executing an operation on the relationship graph; in response to receiving the first instruction, visualizing at least one form associated with the at least one relationship graph, the form comprising a plurality of fields; receiving a second instruction from the user regarding modifying the field; and in response to receiving the second instruction, modifying and saving the field accordingly.
In one embodiment, the relationship graph comprises a plurality of data boxes corresponding to the data tables, and the data boxes are visually connected by a first set of relationship lines.
In another embodiment, the method further comprises: receiving a third instruction from the user to analyze the first set of relationship lines; and in response to receiving the third instruction, visualizing a field relationship analysis form related to any one of the first set of relationship lines.
In yet another embodiment, the method further comprises: in response to receiving the first instruction, retrieving an associated data table, data table field, or data table field value in the relationship graph.
In yet another embodiment, the retrieving step further comprises: visualizing a field relationship graph that comprises a plurality of data boxes corresponding to the related data tables in the relationship graph, the data boxes being visually connected by a second set of relationship lines; receiving a fourth instruction from the user to analyze the second set of relationship lines; and in response to receiving the fourth instruction, visualizing a field relationship analysis form related to any one of the second set of relationship lines.
In yet another embodiment, the method further comprises: receiving a fifth instruction from the user to modify the field relationship graph; and in response to receiving the fifth instruction, visualizing an information form that includes a plurality of fields for modification.
In yet another embodiment, the method further comprises: generating a new data analysis template from the modified form for use in analyzing subsequent data tables.
In yet another embodiment, the relationship graph of the data tables is an entity-relationship (ER) diagram.
In another aspect, the present invention provides an apparatus for analyzing relationships between a plurality of data tables, comprising: at least one processor; and at least one memory storing computer program instructions that, when executed by the at least one processor, cause the apparatus to perform the above-described method.
In yet another aspect, the present invention provides a computer-readable storage medium comprising program instructions for analyzing relationships between a plurality of data tables, which, when executed by a processor, perform the method described above and its various embodiments.
According to the disclosed technical solution for table relationship identification, the time needed to identify table relationships can be reduced to the greatest extent possible and the identification accuracy improved. Furthermore, based on the data relationship graph established through table relationship identification, the solution can provide various analysis and retrieval schemes for correcting automatically identified table relationships, thereby further improving the accuracy of relationship identification.
Drawings
The above and other objects, features and advantages of exemplary embodiments of the present invention will become readily apparent from the following detailed description read in conjunction with the accompanying drawings. In the accompanying drawings, several embodiments of the invention are shown by way of example and not limitation, and like or corresponding reference numerals indicate like or corresponding parts, wherein:
FIG. 1 is a functional block diagram illustrating a data service system according to an embodiment of the present invention;
FIG. 2 is a schematic diagram illustrating an analysis tool interface according to an embodiment of the present invention;
FIG. 3 is a schematic diagram illustrating a data table analysis interface according to an embodiment of the present invention;
FIG. 4 is a schematic diagram illustrating a data table field analysis interface according to an embodiment of the invention;
FIG. 5 is a diagram illustrating a data table field relationship analysis form according to an embodiment of the invention;
FIG. 6 is a schematic diagram illustrating a data table retrieval interface according to an embodiment of the present invention;
FIG. 7 is a schematic diagram illustrating a data table field retrieval interface according to an embodiment of the present invention;
FIG. 8 is a diagram illustrating a table information form according to an embodiment of the present invention;
FIG. 9a is a diagram illustrating a column information form according to an embodiment of the invention;
FIG. 9b is a diagram illustrating the relationship of related fields in a column information form according to an embodiment of the invention;
FIG. 9c is a diagram illustrating relevant field values in a column information form according to an embodiment of the invention;
FIG. 10 is a schematic diagram illustrating a data table field value retrieval interface according to an embodiment of the present invention;
FIG. 11 is a flow chart illustrating a method of analyzing data table relationships according to an embodiment of the invention;
FIG. 12 is a flow chart illustrating an analysis operation on data table relationships in accordance with an embodiment of the present invention; and
FIG. 13 is a flow chart illustrating a method of operation of a retrieval of data table relationships in accordance with an embodiment of the present invention.
Detailed Description
The technical solution of the present invention provides, as a whole, a method, apparatus and computer-readable storage medium for determining relationships between data tables in a database. Unlike existing data identification approaches, the analysis method provided by the invention can perform analysis or retrieval operations on the table relationship graph, which can increase the accuracy of data identification. In some aspects, the user can edit or modify a data table through a form built into the system, which makes data processing convenient.
The technical solution of the present invention and various embodiments thereof will be described in detail below with reference to the accompanying drawings.
Fig. 1 is a functional block diagram illustrating a data service system 100 according to an embodiment of the present invention. As shown in fig. 1, the data service system 100 of the present invention can be divided, according to function and role, into a data layer 110 and an application layer 120, wherein the data layer can be used to identify and save data. In one or more embodiments, the application layer may in turn be divided into three functional blocks: task management 122, analysis tool 124, and system management 126. The functional blocks are described in detail below.
The task management function block 122 operates throughout the data analysis process. Its specific operations may include, but are not limited to, task operations such as creating, viewing, deleting, importing, exporting and sharing tasks. The task content may comprise data connection, extraction configuration, analysis configuration, template identification, code table identification, log table identification, table field identification, automatic analysis relationship identification, data label identification, data processing configuration, task starting, task logs and other identification work related to establishing table relationships. The results of a completed task can be presented by adding marks to the data or by establishing table relationships.
The main function of the analysis tool function block 124 is to re-analyze the results (table relationships) produced by automatic execution, including: filtering empty tables, filtering empty fields, data table analysis, table field analysis, table relationship analysis, table field retrieval, table field value retrieval, and the like. In this way, the accuracy of the automatic analysis can be verified, and data table relationships and field value annotations can be analyzed in greater depth.
The main functions of the system management function block 126 relate to user login and user management, where user management includes message reminders during task execution, operation log viewing, login password modification, user login switching, help document viewing, and the like. In addition, system management is used to update and maintain the data semantic library and the industry semantic library. In some embodiments, functions such as database setup and system setup may also be performed by the system management function block.
In one or more embodiments, the database of the present invention may use SQL Server, which is queried with SQL (Structured Query Language). Using SQL, it is possible to query a database, retrieve data, insert new records, update data, delete records, create a new database, create new tables, stored procedures and views in a database, and set permissions on tables, stored procedures and views. In other embodiments, the database of the present invention may use Redis (Remote Dictionary Server). Redis is a data structure server that supports data persistence: it can save in-memory data to disk and load it again on restart. Based on the above description, those skilled in the art will appreciate that the database of the present invention may use various database management systems, whether currently available or developed in the future, as long as the database management system provides safe and reliable storage for the structured data.
In some embodiments, when data needs to be read from a database and analyzed, the system may automatically perform operations such as data collection, extraction, cleaning, verification and warehousing through a collection template. Extraction is performed because the obtained data may have multiple structures and types, and the extraction process helps convert this complex data into a single type, or a type that is convenient to process, so that it can be analyzed and processed quickly. As for the cleaning operation, not all data is valuable: some data is of no interest, and other data is entirely erroneous or irrelevant. Such useless data can therefore be removed by the cleaning operation so that the valid data is retained.
In general, for data that already has a matching acquisition template, the data acquisition procedure may be performed directly through that template. For data that has no matching template, however, table relationship identification may be performed on the basis of the primary and foreign keys of the data tables in the database. Data tables in a relational database each include a primary key, which is a field or set of fields that uniquely identifies a row in a table. For example, when the data table is a personal table holding personal information, a record in the table includes fields such as identification number, name and age. Since only the identification number uniquely identifies the individual, while other fields such as name and age may repeat, the identification number is the primary key.
Further, the primary key may also be a field group. For example, suppose another table contains records with fields for name, age and gender. Name or age taken individually may match multiple records, and only the combination of name and age uniquely identifies a record, so the field group of name and age forms the primary key. Because the primary key uniquely identifies a row, it helps avoid errors when data is updated or deleted, and it can be used to associate the table with other tables, serving as the unique identifier in the primary table.
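The uniqueness property that distinguishes a primary key can be sketched in a few lines of Python. This is only an illustration of the idea described above, not the system's implementation; the function name `is_unique_key` and the sample rows are hypothetical.

```python
from typing import Dict, List, Sequence


def is_unique_key(rows: List[Dict[str, object]], key_fields: Sequence[str]) -> bool:
    """Return True if the given field (or field group) never repeats across rows."""
    seen = set()
    for row in rows:
        key = tuple(row[name] for name in key_fields)
        if key in seen:
            return False
        seen.add(key)
    return True


# Hypothetical sample rows; names and values are illustrative only.
people = [
    {"id_number": "110101", "name": "Li Wei", "age": 30},
    {"id_number": "110102", "name": "Li Wei", "age": 41},
    {"id_number": "110103", "name": "Zhang Min", "age": 30},
]

print(is_unique_key(people, ["id_number"]))    # True: a single-field key
print(is_unique_key(people, ["name"]))         # False: names repeat
print(is_unique_key(people, ["name", "age"]))  # True: a field group as key
```

In this sample neither name nor age is unique on its own, while the identification number and the (name, age) field group both identify each row uniquely.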
In addition, a data table may include one or more foreign keys, which are used to create an association with another table. For example, a record in the student table includes the student number, name, sex, class, and the like. The name field is not the primary key of the personal table, but the name in the personal table and the name in the student table may correspond to each other. If the name is the primary key of the student table, the name can then serve as the foreign key that links the personal table to the student table.
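As a rough illustration of how declared foreign keys can be read back from a database, the following sketch uses Python's built-in `sqlite3` module in place of the SQL Server database mentioned above. It also uses a different, hypothetical schema in which the student table references the personal table by identification number; the table names, column names and the helper `declared_foreign_keys` are assumptions made for the example.

```python
import sqlite3

# In-memory SQLite database standing in for the real database (an assumption).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE person (
        id_number TEXT PRIMARY KEY,
        name      TEXT,
        age       INTEGER
    );
    CREATE TABLE student (
        student_no TEXT PRIMARY KEY,
        id_number  TEXT REFERENCES person(id_number),
        class_name TEXT
    );
    """
)


def declared_foreign_keys(connection, table):
    """List (child_table, child_field, parent_table, parent_field) tuples."""
    # PRAGMA foreign_key_list returns: id, seq, table, from, to, ...
    rows = connection.execute(f"PRAGMA foreign_key_list({table})").fetchall()
    return [(table, row[3], row[2], row[4]) for row in rows]


links = declared_foreign_keys(conn, "student")
print(links)  # [('student', 'id_number', 'person', 'id_number')]
```

Each returned tuple corresponds to one candidate relationship between a child table and a parent table.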
After table relationships have been identified based on the primary and foreign keys of the data tables, the system outputs an entity-relationship (ER) diagram from the automatic identification result, which helps the user check the analysis result. In some cases, however, the ER diagram produced by automatic identification reveals only the direct relationships between data tables that the primary/foreign key comparison can find. Data tables whose fields are neither identical nor similar, but which are in fact associated, can be further corrected with the analysis tools of the present invention. A detailed operational description is given below through the embodiments of figs. 2-10.
Fig. 2 is a schematic diagram illustrating an analysis tool interface 200 according to an embodiment of the present invention. The interface of the data service system of the present invention has one or more display areas available for operation; for ease of description, only a portion of the interface, or a simplified version of it, is shown. The following description covers only the main operation display areas relevant to the disclosed technical solution.
As shown in fig. 2, the interface 200 is divided into a first display area 210 and a second display area 220. The first display area presents the function stage currently executed by the system, displayed in the form of function blocks. According to the execution flow of the system, the function stages can be classified into functions such as "task management", "data query", "analysis tool", and "system management". The second display area presents the result of executing the function block selected in the first display area. The table relationship analysis disclosed herein is performed at the analysis tool stage of the system. The second display area 220 displays a table relationship ER diagram created by a previously performed table relationship identification operation.
To establish a table relationship ER diagram among a plurality of data tables, the invention uses the primary keys included in the data tables of the relational database. A primary key is a field or group of fields that uniquely identifies a row in a table. In some application scenarios, the primary key is itself a group of fields.
Because the primary key uniquely identifies a row, it ensures that no error occurs when data is updated or deleted, and it can be used to associate the table with other tables, serving as the unique identifier in the primary table. In some implementations, a data table may also include one or more foreign keys, where a foreign key of a parent table may be the primary key of a child table, and an association with another child table may be made via the foreign key.
The invention identifies table relationships on the basis of the primary and foreign keys of the data tables in the database. One of the data tables may be designated as the parent table and the other data tables as child tables, and the primary key of the parent table is compared, one by one, with the foreign keys of each child table to check for an association. When the comparison shows that a child table and the parent table have no primary/foreign key relationship, the similarity between their table fields can be compared instead. When the similarity is above a predetermined threshold (e.g. 80%), a new relationship may be determined to exist.
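The two-step comparison described above (key comparison first, field similarity as a fallback) might be sketched as follows. The text does not specify how similarity is computed, so the Jaccard overlap of field names used here is an assumption, as are the dictionary layout, the function names and the sample tables.

```python
from typing import Dict, Optional, Set

SIMILARITY_THRESHOLD = 0.8  # the "e.g. 80%" threshold mentioned in the text


def field_similarity(fields_a: Set[str], fields_b: Set[str]) -> float:
    """Jaccard overlap of two field-name sets (an assumed similarity metric)."""
    union = fields_a | fields_b
    if not union:
        return 0.0
    return len(fields_a & fields_b) / len(union)


def identify_relationship(parent: Dict, child: Dict) -> Optional[str]:
    """Return 'key', 'similarity' or None for one parent/child table pair."""
    # Step 1: compare the parent's primary key with the child's foreign keys.
    if parent["primary_key"] in child["foreign_keys"]:
        return "key"
    # Step 2: no key relationship, so fall back to comparing table fields.
    score = field_similarity(set(parent["fields"]), set(child["fields"]))
    if score > SIMILARITY_THRESHOLD:
        return "similarity"
    return None


# Hypothetical tables for illustration.
parent = {"primary_key": "plan_id", "foreign_keys": [],
          "fields": ["plan_id", "year", "amount", "dept", "status"]}
child_with_key = {"primary_key": "voucher_id", "foreign_keys": ["plan_id"],
                  "fields": ["voucher_id", "plan_id"]}
child_similar = {"primary_key": "remark_id", "foreign_keys": [],
                 "fields": ["plan_id", "year", "amount", "dept", "status", "remark_id"]}
child_unrelated = {"primary_key": "x", "foreign_keys": [], "fields": ["x", "y"]}

print(identify_relationship(parent, child_with_key))   # key
print(identify_relationship(parent, child_similar))    # similarity (5/6 overlap)
print(identify_relationship(parent, child_unrelated))  # None
```

Other similarity measures, for example ones that compare field values or comments rather than names, would fit the same structure.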
After the relationships between the data tables have been determined by comparing primary and foreign keys and the similarity between table fields, the table relationship ER diagram of the invention can be constructed from the associations among the data tables and table fields. As can be seen in the second display area, the ER diagram includes a plurality of data boxes 221-228, each corresponding to one data table. The primary/foreign key relationships between the data tables can be clearly identified through the tree structure (comprising views, relationships and the like) into which the ER diagram expands.
In one embodiment, each data table includes a primary key, a foreign key and a field name. For example, data table 1 includes primary key a1, foreign key b1 and field name 1; data table 2 includes primary key b1, foreign key b2 and field name 2. Similarly, data tables 3, 4, 5, 6, 7, 8, etc. each include a primary key, a foreign key and a field name. Based on the primary/foreign key comparison results across the data tables, data table relationships may be established. For example, foreign key b1 of data table 1 is the same as primary key b1 of data table 2, which indicates that a relationship exists between data table 1 and data table 2.
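The example of data tables 1 and 2 can be turned into a small sketch that derives the relationship lines of an ER diagram from matching keys. The keys of tables 1 and 2 follow the text; table 3 and the function name `build_relationship_edges` are hypothetical additions for illustration.

```python
from typing import Dict, List, Tuple

tables = {
    "table1": {"primary_key": "a1", "foreign_key": "b1"},
    "table2": {"primary_key": "b1", "foreign_key": "b2"},
    "table3": {"primary_key": "b2", "foreign_key": "c1"},  # hypothetical
}


def build_relationship_edges(
    tables: Dict[str, Dict[str, str]]
) -> List[Tuple[str, str, str]]:
    """Return (table, related_table, shared_key) for every key match."""
    # Map each primary key to the table that owns it.
    owner_of_key = {spec["primary_key"]: name for name, spec in tables.items()}
    edges = []
    for name, spec in tables.items():
        related = owner_of_key.get(spec["foreign_key"])
        if related is not None and related != name:
            edges.append((name, related, spec["foreign_key"]))
    return edges


for edge in build_relationship_edges(tables):
    print(edge)
# ('table1', 'table2', 'b1')
# ('table2', 'table3', 'b2')
```

Each edge corresponds to one relationship line between two data boxes; foreign key c1 of table 3 matches no primary key, so it produces no line.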
The above describes the ER diagram obtained when the system automatically performs table relationship identification. In some cases, the primary/foreign key comparison reveals only the direct relationships between data tables, and data tables whose fields are different or dissimilar but which are in fact related can be further identified with the analysis tool of the invention. In one or more embodiments, the user can then confirm or modify the data tables based on the ER diagram. For example, the confirmation or correction may be performed by selecting the "analyze" 231 or "search" 232 function block in the second display area. In some application scenarios, selecting the "analyze" function block presents the interface shown in FIG. 3, while selecting the "search" function block presents an interface similar to those shown in figs. 6, 7, and 10.
To further illustrate how the invention is applied to an ER diagram of data table relationships established by automatic identification, the analysis operations of figs. 3-5 and the retrieval operations of figs. 6-10 are described below; through this further analysis, the established data table relationships become more comprehensive and accurate.
FIG. 3 is a schematic diagram illustrating a data table analysis interface 300 according to another embodiment of the present invention, which may be the interface presented after the "analyze" function block of FIG. 2 is selected. Note that although FIG. 3 uses data table names different from those in fig. 2, the data tables shown in fig. 3 may correspond to the data tables in fig. 2, except that the data tables of fig. 3 show a "field" and a "category code" instead of the primary key shown in the data tables of fig. 2.
As shown in fig. 3, the interface 300 is divided into a first display area 310 and a second display area 320. After the "analyze" function block in the second display area of fig. 2 is selected, the table relationship ER diagram is displayed in the second display area, wherein the data tables a-f corresponding to the data boxes 321-326 include the category codes a12, a22 … f22, f32 …, etc. in addition to the fields a1, a21 … f21, f31 …, etc. In one embodiment, the association between the data tables can be established by comparing the category codes in the data tables.
In addition to checking the relationships between the data tables through the tree structure of the ER diagram, the table settings function of the invention allows operations such as resizing the graph, locating a single table and filtering empty tables during data table analysis. In one scenario, when analyzing one or more data tables, a user may view the information and field structure of each data table (e.g., table name and table fields), and, by selecting a field in a table, may view all values of that field with duplicate entries excluded.
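Two of the operations just mentioned, filtering empty tables and listing the values of a field without duplicates, are simple enough to sketch directly. The in-memory representation of tables as lists of row dictionaries, the function names and the sample data are assumptions for the example.

```python
from typing import Dict, List

Row = Dict[str, object]


def filter_empty_tables(tables: Dict[str, List[Row]]) -> Dict[str, List[Row]]:
    """Keep only the tables that contain at least one row."""
    return {name: rows for name, rows in tables.items() if rows}


def distinct_field_values(rows: List[Row], field: str) -> List[object]:
    """Return the values of one field in first-seen order, without duplicates."""
    return list(dict.fromkeys(row[field] for row in rows if field in row))


# Hypothetical sample data.
sample_tables = {
    "plan": [{"dept": "A"}, {"dept": "B"}, {"dept": "A"}],
    "log": [],
}

non_empty = filter_empty_tables(sample_tables)
print(list(non_empty))                                # ['plan']
print(distinct_field_values(non_empty["plan"], "dept"))  # ['A', 'B']
```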
In one embodiment, the ER diagram is configured so that when two tables are related, a relationship line 327 of one color (e.g., green) is drawn between them. If, however, a relationship line produced by automatic analysis appears in another color (e.g., yellow), a person can further confirm whether the data table relationship holds. The data tables are then compared one-to-one or one-to-many according to the relationships found, since some table relationships correspond to a combination of several table relationships. For example, when identifying personal names, different people with the same name are often found; in that case an association between two data tables cannot be established from the name alone, and additional verification against other data such as the identification number and date of birth is needed. By identifying such one-to-many combinations of table relationships, data tables without a direct relationship can be found and a table relationship established between them. After reconfirming the relationship between the data tables, the user can manually change the relationship line from one color (e.g., yellow) to another color (e.g., green).
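The color convention for relationship lines can be modeled as a small state change: a line starts as either established (green) or awaiting confirmation (yellow), and manual confirmation turns yellow into green. The class and function names below are hypothetical, and the sketch covers only the state handling, not the drawing of the diagram.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class LineColor(Enum):
    GREEN = "green"    # relationship regarded as established
    YELLOW = "yellow"  # relationship awaiting manual confirmation


@dataclass
class RelationshipLine:
    source: str
    target: str
    color: LineColor

    def confirm(self) -> None:
        """Mark the relationship as established after manual review."""
        self.color = LineColor.GREEN


def lines_needing_review(lines: List[RelationshipLine]) -> List[RelationshipLine]:
    """Return the lines a user still has to confirm."""
    return [line for line in lines if line.color is LineColor.YELLOW]


# Hypothetical lines between data tables a, b and c.
lines = [
    RelationshipLine("table_a", "table_b", LineColor.GREEN),
    RelationshipLine("table_a", "table_c", LineColor.YELLOW),
]

pending = lines_needing_review(lines)
print(len(pending))        # 1
pending[0].confirm()       # the user confirms the relationship
print(len(lines_needing_review(lines)))  # 0
```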
Further, after relationship identification has been confirmed, if a relationship should exist between data tables but was not set automatically by the system, the user can, after reconfirming, manually drag between the two data tables to establish the table relationship. In one scenario, when two tables have the same field the system establishes the table field relationship automatically, and when they have no field in common the user can manually edit the fields to establish the table relationship. In addition, when the user selects a data table in the ER diagram, that data table, its related data tables and the relationship lines between them are highlighted, and the relationship can be further modified or deleted by clicking the relationship line.
In some application scenarios, when a field appears in multiple tables, the user may choose to perform comparative analysis on the table fields, such as analyzing the Chinese comments and business process relationships of similar fields. For example, when an invoice and a tax return share a common merchandise record, the invoice and the tax return can be considered associated. In addition, the user can click the relationship line between two data tables to view the relationship and perform comparative analysis (for the specific field relationship analysis, see the example of fig. 5).
For an ER diagram of data table relationships established by the system's automatic identification, the invention can not only compare the listed fields and category codes one by one, but also reconfirm or redefine the relationships between fields and data tables according to the relationship between a specific field and the data tables. Details of this aspect are described below in conjunction with fig. 4.
FIG. 4 is a schematic diagram illustrating a data table field analysis interface 400 according to the present disclosure. As shown in fig. 4, the interface 400 is divided into a first display area 410 and a second display area 420, and table field analysis lets the user select the field relationship to consult through the "view by field" item area of the analysis tool. The "view by field" item area may list all table fields by initial letter and includes a field query function. When a field is selected in the "view by field" item area, all data tables corresponding to the selected field are displayed in the second display area. In the illustrated example, "index number" is used as the search target. As can be seen in the second display area, "index number" is associated with the payment service-index amount, the payment service-index receipt, the payment sending-payment voucher, the payment sending-plan receipt table, the payment service-plan amount table, and the like.
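The "view by field" lookup amounts to an inverted index from field names to the data tables that contain them, grouped by initial letter for display. The sketch below assumes tables are given as lists of field names; the English table and field names stand in for those shown in fig. 4 and are hypothetical, as are the function names.

```python
from collections import defaultdict
from typing import Dict, List


def build_field_index(tables: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Map each field name to the tables in which it appears."""
    index = defaultdict(list)
    for table_name, fields in tables.items():
        for field_name in fields:
            index[field_name].append(table_name)
    return dict(index)


def group_by_initial(index: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Group field names by their first letter, as in a 'view by field' list."""
    groups = defaultdict(list)
    for field_name in sorted(index):
        groups[field_name[0].upper()].append(field_name)
    return dict(groups)


# Hypothetical tables and fields.
table_fields = {
    "index_amount": ["index_no", "amount"],
    "index_receipt": ["index_no", "receipt_no"],
    "payment_voucher": ["index_no", "voucher_no"],
}

field_index = build_field_index(table_fields)
print(field_index["index_no"])
# ['index_amount', 'index_receipt', 'payment_voucher']
print(group_by_initial(field_index)["I"])  # ['index_no']
```

Selecting a field in the list then corresponds to a single dictionary lookup that returns every table to display.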
Furthermore, by clicking the field of the item area of 'view by field', Chinese comments of the current field in all data tables can be viewed, and meanwhile, the data table relationship and the data table value established by the current field are viewed according to the data tables, so that whether the field rule is consistent with the data table analysis rule can be determined. In some scenarios, when a field annotation has a change, it may also be modified here.
How the table relationship is compared and modified for the fields of two data tables is described below with reference to FIG. 5 in conjunction with FIG. 3.
FIG. 5 is a diagram illustrating a data table field relationship analysis form 500 according to the present invention. After two data tables to be viewed are selected in the ER diagram of FIG. 3 and the relationship line between them is clicked, the system automatically displays the field relationship analysis form 500. The system determines a parent table and a child table according to the relationship importance of the data tables in the ER diagram (for example, how strongly a table is related to other tables): the table with higher importance is the parent table 511 (in the illustrated example, the "PLAN payment amount table"), and the table with lower importance is the child table 512 (in the illustrated example, the "bud department index detail table"). The field relationship analysis form has a first display area 510 and a second display area 520. The first display area 510 displays a simple relationship table between the parent table and the child table. In one embodiment this relationship table is established when the system identifies the data table relationships; in another embodiment it is set by the user in the ER diagram. In the illustrated example, "bud _ ID index limit" is set as a primary key and "SET _ YEAR of business" as a foreign key. As can be seen from the relationship table in the first display area, the two tables have the same "bud _ ID" field, so the comparison shows that the parent table and the child table in this example are associated.
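The parent/child decision described for FIG. 5 can be sketched as below. The patent does not define how "relationship importance" is computed; this sketch assumes it is the number of relationship lines touching a table, with ties broken by table name. Both the measure and the names (`relation_degree`, `parent_and_child`, the sample tables) are assumptions made for illustration.

```python
from collections import Counter


def relation_degree(relation_lines):
    """Count how many relationship lines touch each table."""
    degree = Counter()
    for table_a, table_b in relation_lines:
        degree[table_a] += 1
        degree[table_b] += 1
    return degree


def parent_and_child(table_a, table_b, relation_lines):
    """Return (parent, child): the table with more relationship lines is the parent.

    Ties fall back to alphabetical order, which is an arbitrary assumption.
    """
    degree = relation_degree(relation_lines)
    ranked = sorted([table_a, table_b], key=lambda t: (-degree[t], t))
    return ranked[0], ranked[1]


# Hypothetical relationship lines, for illustration only.
lines = [
    ("plan_amount", "index_detail"),
    ("plan_amount", "payment_voucher"),
    ("plan_amount", "plan_receipt"),
]
parent, child = parent_and_child("index_detail", "plan_amount", lines)
print(parent, child)
```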
Further, in the second display area 520, the "relationship name" is generated by the system from the field name and the table name, and the remaining items can be written or edited by the user to set the relationship condition. For example, "field relationship" can be used to establish the relationship between the parent table and the child table, while "relationship Chinese comment" and "relationship remark" can be used to further describe the relationship definition and to serve as a reference when relationships are established later. Finally, the second display area of the field relationship analysis form also includes two function options, "add relationship" 521 and "delete relationship" 522. When the relationship setting is complete, clicking "add relationship" stores the setting and returns to the ER diagram interface of FIG. 3. Conversely, to cancel the relationship, the user may click "delete relationship"; the change is likewise stored and the interface returns to the ER diagram interface of FIG. 3.
Besides the table relationship analysis mode, the invention also discloses a retrieval mode, shown in FIGS. 6-10, which searches the data tables more broadly so that the data tables and/or the relationships among them can be established more completely.
FIG. 6 is a schematic diagram illustrating a data table retrieval interface 600 according to the present disclosure. As shown in FIG. 6, the interface 600 is divided into a first display area 610, which displays the function stage currently being executed by the system, and a second display area 620, which includes a first block 621, a second block 622, and third blocks 623_1 and 623_2. The first block has a "search" field for entering the field to be searched (hereinafter collectively referred to as the "input value"). The second block includes items such as "data table", "data table field", and "data value", which represent the type of data to be searched. In this embodiment, if "data table" is selected as the type of data to be searched, the third block 623_1 lists the relevant fields searched through SQL syntax, and the fields set in the third block 623_1 can be used as conditions for matching data tables. Based on the input value written in the search field, the system can then quickly locate where the input value appears in one or more data tables. The matching results are displayed in the third block 623_2 in the form of a table, sorted in descending order of matching degree and number of matches, so that results with a high matching degree are ranked at the top and results with a low matching degree at the bottom.
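The ranking described above can be sketched as follows. The patent does not define "matching degree" or "number of matches"; this sketch assumes the degree is the fraction of a table or field name covered by the input value (1.0 for an exact match) and the count is the number of names in a table that contain the input value. The functions `match_degree` and `rank_tables` and the sample data are hypothetical.

```python
def match_degree(query, name):
    """Fraction of `name` covered by `query`; 0.0 when `query` does not occur in it."""
    q, n = query.lower(), name.lower()
    if not q or q not in n:
        return 0.0
    return len(q) / len(n)


def rank_tables(query, tables):
    """Return (table, best degree, match count) tuples, best matches first."""
    results = []
    for table_name, fields in tables.items():
        scores = [match_degree(query, candidate) for candidate in [table_name, *fields]]
        hits = [score for score in scores if score > 0]
        if hits:
            results.append((table_name, max(hits), len(hits)))
    # Descending by degree, then by count; table name keeps the order deterministic.
    results.sort(key=lambda r: (-r[1], -r[2], r[0]))
    return results


# Hypothetical tables, for illustration only.
tables = {
    "pay_voucher": ["file_code", "pay_amount"],
    "plan": ["plan_id", "pay"],
    "staff": ["name"],
}
ranked = rank_tables("pay", tables)
for row in ranked:
    print(row)
```

Here "plan" ranks first because one of its fields matches the input value exactly, even though "pay_voucher" contains more partial matches.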
In some application scenarios, when the matching results show that fields in a data table are associated with other data tables, the association can be indicated by underlining the fields or by displaying them in a different color. When several data tables are associated, the data table with the highest degree of association can be opened as the main table, and the main table is connected to the other associated data tables.
Further, in the third block 623_2, a specific item in the matching results can be analyzed further. For example, selecting any column in the table brings up the corresponding data table, in which the user can set whether the Chinese comment of the table name and the Chinese comments of the field names are displayed alongside the names, or whether only the Chinese comments are displayed. When viewing data in the data table, the user may also edit the table's Chinese comment and the fields' Chinese comments; for the specific functions, refer to the example of FIG. 9.
FIG. 7 is a schematic diagram illustrating a data table field retrieval interface 700 according to the present disclosure. As shown in FIG. 7, and similar to FIG. 6, the interface 700 is divided into a first display area 710 and a second display area 720. The first display area presents the function stage currently being executed by the system, and the second display area includes a first block 721, a second block 722, and a third block 723. FIG. 7 differs from the embodiment of FIG. 6 in that "data table field" is selected as the type of data to be searched, so that, based on the input value written in the search field, the system quickly locates the fields of one or more data tables in which the input value appears. The matching results are then displayed in the third block in the form of a field relationship graph, according to the search conditions. In one embodiment, the field relationship graph may include a plurality of data frames, each of which corresponds to one data table.
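A field relationship graph of the kind described for FIG. 7 can be sketched as a set of data frames (tables with a matching field) plus relationship lines between tables that share a matched field. The representation, the substring matching rule, and the names `field_relation_graph` and the sample tables are assumptions for illustration, not details taken from the patent.

```python
from itertools import combinations


def field_relation_graph(query, tables):
    """Return (data_frames, relation_lines) for fields whose name contains `query`."""
    q = query.lower()
    matched = {}  # field name -> tables that contain it
    for table in sorted(tables):
        for field in tables[table]:
            if q in field.lower():
                matched.setdefault(field, []).append(table)
    data_frames = sorted({t for owners in matched.values() for t in owners})
    relation_lines = sorted(
        (a, b, field)
        for field, owners in matched.items()
        for a, b in combinations(owners, 2)
    )
    return data_frames, relation_lines


# Hypothetical tables, for illustration only.
tables = {
    "index_receipt": ["file_code", "receipt_no"],
    "payment_receipt": ["file_code", "amount"],
    "payment_voucher": ["file_code", "payee"],
    "staff": ["name"],
}
frames, lines = field_relation_graph("file_code", tables)
print(frames)
print(lines)
```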
Further, after two data tables to be viewed are selected from the associated data tables displayed in the third block and the relationship line 724 between them is clicked, the system automatically displays the field relationship analysis form (for how the form is modified, see the description of FIG. 5). In addition, when viewing the data in a data table, the user can edit the table's Chinese comment and the fields' Chinese comments; the specific operation is described with reference to FIG. 8.
FIG. 8 is a diagram illustrating a table information form 800 according to an embodiment of the present invention. As shown in FIG. 8, the form is displayed when any one of the data frames in the field relationship graph of FIG. 7 is selected, and it includes a first display area 810 and a second display area 820, where the first display area includes setting blocks for table information 811 and column information 812. FIG. 8 shows the table information displayed when the table information setting block is selected. Here the user can modify the "table name", the "Chinese translation of table", and the "table notes"; the "table notes" item can automatically display translation or note information about the data table read from the database, so that the user can consult it before deciding whether to modify it. Further, when the column information setting block in FIG. 8 is selected, a column information form for setting column information appears, as shown in FIG. 9a.
FIG. 9a is a diagram illustrating a column information form 900 according to an embodiment of the invention. As shown in FIG. 9a, the form lists, for the data table selected in FIG. 7, fields such as "field name", "Chinese translation of field", and "field type". The column information form also includes a "primary key or not" field, whose setting can serve as a reference when the system identifies or compares table relationships. Further, the "operation" column 910 for each field row may include two function symbols 911 and 912, where the function symbol 911 represents a field relationship analysis option and the function symbol 912 represents a field value analysis option.
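One way the "primary key or not" setting could feed relationship identification is sketched below: when two tables share a field name and the field is flagged as a primary key in exactly one of them, that table is treated as the parent. This rule is an assumption made for illustration; the patent only states that the setting serves as a reference. The `ColumnInfo` record and `infer_direction` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class ColumnInfo:
    """One row of a column information form (field names here are illustrative)."""
    field_name: str
    chinese_comment: str
    field_type: str
    is_primary_key: bool = False


def infer_direction(table_a, cols_a, table_b, cols_b):
    """Return (parent, child, field) for shared fields that are a primary key in one table only."""
    cols_b_by_name = {col.field_name: col for col in cols_b}
    result = []
    for col_a in cols_a:
        col_b = cols_b_by_name.get(col_a.field_name)
        if col_b is None or col_a.is_primary_key == col_b.is_primary_key:
            continue  # field not shared, or the flags give no direction
        if col_a.is_primary_key:
            result.append((table_a, table_b, col_a.field_name))
        else:
            result.append((table_b, table_a, col_a.field_name))
    return result


# Hypothetical column information, for illustration only.
plan_cols = [ColumnInfo("plan_id", "plan identifier", "INTEGER", True),
             ColumnInfo("amount", "plan amount", "REAL")]
detail_cols = [ColumnInfo("detail_id", "detail identifier", "INTEGER", True),
               ColumnInfo("plan_id", "plan identifier", "INTEGER")]
directions = infer_direction("index_detail", detail_cols, "plan_amount", plan_cols)
print(directions)
```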
FIG. 9b is a diagram illustrating the relationship of related fields in a column information form according to an embodiment of the present invention. When the function symbol 911 in the "operation" column of FIG. 9a is selected, the field corresponding to the function symbol 911 and an analysis chart of the data tables related to that field are displayed, as in FIG. 9b. For example, when the "bug _ bit payment delivery-payment receipt" table 920 uses the field "FILE _ CODE index number" 921 as the relationship name, relationships may be generated with the two data tables "bug _ index receipt" 922 and "PAY _ bit payment delivery-payment voucher" 923, which also have the common field "FILE _ CODE index number" 921.
Further, FIG. 9c is a diagram illustrating related field values in the column information form according to an embodiment of the present invention. When the function symbol 912 in the "operation" column of FIG. 9a is selected, the data table and the field value analysis result corresponding to the selected function symbol 912 are displayed, as shown in FIG. 9c, for example the field values related to the "PLAN amount table of payment data" 930.
FIG. 10 is a schematic diagram illustrating a data table field value retrieval interface 1000 in accordance with the present disclosure. As shown in FIG. 10, the interface 1000 is divided into a first display area 1010 and a second display area 1020. The first display area displays the function stage currently being executed by the system, and the second display area includes a first block 1021, a second block 1022, and a third block 1023. The first block has a "search" field for entering the condition to be searched (hereinafter collectively referred to as the "input value"), and the second block includes the items "data table", "data table field", and "data value", which represent the type of data to be searched.
In this example, a case number such as "Sichuan financial Row 2019001" is entered in the "search" field as the search condition, and "data value" is selected as the type of data to be searched. Based on the input value written in the search field, the system then quickly locates the field values in one or more data tables in which the input value appears. The matching results are displayed in the third block in the form of a field value relationship graph, which may be composed of one or more data tables and one or more associated fields. After the field value relationship graph is established, the user can view the data in a data table and edit the table's Chinese comment and the fields' Chinese comments.
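A data value search of this kind can be sketched with Python's standard `sqlite3` module by scanning every column of every table for the input value. The patent does not say which database or query strategy is used, so SQLite, the exact-match comparison, the function name `find_value`, and the sample tables and values are all assumptions for illustration.

```python
import sqlite3


def find_value(conn, value):
    """Return (table, column, row count) for every column that holds `value` exactly."""
    hits = []
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    for table in tables:
        # PRAGMA table_info yields one row per column; index 1 is the column name.
        columns = [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]
        for column in columns:
            count = conn.execute(
                f'SELECT COUNT(*) FROM "{table}" WHERE "{column}" = ?', (value,)
            ).fetchone()[0]
            if count:
                hits.append((table, column, count))
    return hits


# Hypothetical in-memory database, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE pay_voucher (voucher_id INTEGER, file_code TEXT);
CREATE TABLE plan_receipt (receipt_id INTEGER, file_code TEXT, note TEXT);
INSERT INTO pay_voucher VALUES (1, 'DOC-2019001'), (2, 'DOC-2019002');
INSERT INTO plan_receipt VALUES (10, 'DOC-2019001', 'see DOC-2019001');
""")
hits = find_value(conn, "DOC-2019001")
print(hits)
```

Tables that appear in `hits` with the same value are candidates for a field value relationship graph. The table and column names are interpolated into the SQL text only because they come from the database's own catalog; the searched value is always passed as a bound parameter.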
FIGS. 11-13 below illustrate the analysis methods of the present invention and are described in conjunction with the embodiments of FIGS. 2-10.
FIG. 11 is a flow diagram illustrating a method 1100 for analyzing data table relationships according to an embodiment of the invention. The database of the invention comprises a plurality of data tables, and one or more relationship graphs are formed according to the primary-foreign key relationships among the data tables and stored in the database. As shown in FIG. 11, at step 1101 the method 1100 visualizes at least one relationship graph, that is, it displays the relationship graph from the database on the analysis tool interface as shown in FIG. 2. The relationship graph includes a plurality of data frames, each of which corresponds to one data table. Through the tree structure (comprising views, relations, and the like) expanded from the relationship graph, the primary-foreign key relationships between the data tables can be clearly identified, and visual connections between the data frames are established through a first set of relationship lines according to the relationships between the data tables. In some cases, only the direct relationships between data tables can be learned from the primary-foreign key comparison. Data tables whose fields are neither identical nor similar, but which are in fact associated, may be further identified by the analysis tool of the present invention.
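Forming a relationship graph from primary-foreign key relationships can be sketched by reading the declared foreign keys from the database catalog. The example below uses SQLite through Python's standard `sqlite3` module purely as a stand-in; the patent does not name a database, and `foreign_key_lines` and the sample schema are hypothetical.

```python
import sqlite3


def foreign_key_lines(conn):
    """Return (parent table, parent column, child table, child column) for each declared foreign key."""
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    lines = []
    for child in tables:
        # PRAGMA foreign_key_list rows: (id, seq, table, from, to, on_update, on_delete, match).
        for fk in conn.execute(f'PRAGMA foreign_key_list("{child}")'):
            lines.append((fk[2], fk[4], child, fk[3]))
    return sorted(lines)


# Hypothetical schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE plan_amount (plan_id INTEGER PRIMARY KEY, amount REAL);
CREATE TABLE index_detail (
    detail_id INTEGER PRIMARY KEY,
    plan_id INTEGER REFERENCES plan_amount(plan_id)
);
CREATE TABLE staff (staff_id INTEGER PRIMARY KEY);
""")
lines = foreign_key_lines(conn)
print(lines)
```

Each tuple corresponds to one relationship line between two data frames. A table such as `staff`, which declares no keys pointing anywhere, yields no line, which is the case the text says requires further analysis.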
In one or more embodiments, the user may further confirm or modify the data tables according to the relationship graph by selecting the "analyze" or "search" functional block in the second display area of FIG. 2. At step 1102, the method 1100 receives from the user a first instruction to perform an operation on the relationship graph, which may be an analysis operation or a retrieval operation.
In some application scenarios, when the "analyze" functional block is selected, the interface is presented as shown in FIG. 3. The data table corresponding to each data frame of the relationship graph in FIG. 3 includes fields and category codes. By comparing the category codes in the data tables, associations among the data tables can be established, and once a relationship between two data tables is established, a relationship line is formed between them.
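The patent does not specify how category codes are compared. One plausible reading, sketched below, is to measure how much the sets of category codes of two tables overlap and to draw a relationship line when the overlap reaches a threshold. The overlap measure, the default threshold of 0.5, and the names `code_overlap`, `related_by_codes`, and the sample codes are all assumptions for illustration.

```python
def code_overlap(codes_a, codes_b):
    """Share of the smaller code set that also appears in the other set (0.0 to 1.0)."""
    a, b = set(codes_a), set(codes_b)
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


def related_by_codes(table_codes, threshold=0.5):
    """Return (table_a, table_b, score) for table pairs whose code overlap meets the threshold."""
    names = sorted(table_codes)
    lines = []
    for i, name_a in enumerate(names):
        for name_b in names[i + 1:]:
            score = code_overlap(table_codes[name_a], table_codes[name_b])
            if score >= threshold:
                lines.append((name_a, name_b, score))
    return lines


# Hypothetical category codes per table, for illustration only.
table_codes = {
    "index_amount": ["A01", "A02", "B01"],
    "plan_amount": ["A01", "A02"],
    "staff": ["Z9"],
}
code_lines = related_by_codes(table_codes)
print(code_lines)
```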
Alternatively, when the "search" functional block is selected, the interface is presented as shown in FIG. 7. The first block of the interface has a "search" field for entering the field to be searched. After the "data table field" item in the second block is selected and the method 1100 performs the search, a relationship graph corresponding to the search result is displayed. This relationship graph includes a plurality of data frames corresponding to the data tables, and the data frames are visually connected by relationship lines.
Next, at step 1103, in response to receiving the first instruction, at least one form associated with the at least one relationship graph is visualized, the form including a plurality of fields. For example, when the user selects the data tables to be operated on in the relationship graph, the data table relationship may be modified or deleted by clicking the relationship line: after two data tables to be viewed are selected in the relationship graph and the relationship line between them is clicked, the system automatically displays the field relationship analysis form (as shown in FIG. 5). At step 1104, a second instruction to modify a field is received from the user. In the field relationship analysis form, the "relationship name" is generated by the system from the field name and the table name, and the user can write or edit the relationship condition.
Further, at step 1105, in response to receiving the second instruction, the field is modified and saved accordingly. The field relationship analysis form also includes two function options, "add relationship" and "delete relationship". When the relationship setting is complete, clicking "add relationship" stores the setting and returns to the relationship graph interface. Conversely, to cancel the relationship, the user can click "delete relationship"; the change is likewise stored and the interface returns to the relationship graph interface. Finally, a new data analysis template is generated from the modified form for use in the subsequent analysis of data tables.
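The "add relationship", "delete relationship", and template generation steps can be sketched as a small in-memory store that is serialized once editing is done. The patent does not describe the template format; JSON is used here only as an assumption, and the `Relation` record, the `RelationStore` class, and the sample relationship are hypothetical.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Relation:
    """One entry of a field relationship analysis form (field names are illustrative)."""
    name: str
    parent_table: str
    child_table: str
    field_relation: str
    comment: str = ""
    remark: str = ""


class RelationStore:
    def __init__(self):
        self._relations = {}

    def add_relation(self, relation):
        """Store or overwrite a relationship under its name."""
        self._relations[relation.name] = relation

    def delete_relation(self, name):
        """Remove a relationship; return True if it existed."""
        return self._relations.pop(name, None) is not None

    def to_template(self):
        """Serialize the current relationships as a JSON analysis template."""
        ordered = sorted(self._relations.values(), key=lambda r: r.name)
        return json.dumps([asdict(r) for r in ordered], ensure_ascii=False, indent=2)


store = RelationStore()
store.add_relation(Relation("plan_amount.plan_id", "plan_amount", "index_detail",
                            "plan_amount.plan_id = index_detail.plan_id"))
store.add_relation(Relation("tmp", "a", "b", "a.x = b.x"))
deleted = store.delete_relation("tmp")
template = store.to_template()
print(template)
```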
FIG. 12 is a flow diagram illustrating a method 1200 for the analysis operation on data table relationships according to an embodiment of the invention. The database of the invention comprises a plurality of data tables, and one or more relationship graphs are formed according to the primary-foreign key relationships among the data tables and stored in the database. As shown in FIG. 12, at step 1201 the method 1200 visualizes at least one relationship graph, that is, it displays the relationship graph from the database on the analysis tool interface as shown in FIG. 2. The relationship graph includes a plurality of data frames, each of which corresponds to one data table. Through the tree structure (comprising views, relations, and the like) expanded from the relationship graph, the primary-foreign key relationships between the data tables can be clearly identified, and visual connections between the data frames are established through a first set of relationship lines according to the relationships between the data tables. In some cases, only the direct relationships between data tables can be learned from the primary-foreign key comparison. Data tables whose fields are neither identical nor similar, but which are in fact associated, may be further identified by the analysis tool of the present invention.
In one or more embodiments, the user may further confirm or modify the data tables according to the relationship graph by selecting the "analyze" or "search" functional block in the second display area. At step 1202, the method 1200 receives from the user a first instruction to perform an operation on the relationship graph, which may be an analysis operation or a retrieval operation. In some application scenarios, when the "analyze" functional block is selected, the interface is presented as shown in FIG. 3. The data table corresponding to each data frame of the relationship graph in FIG. 3 includes fields and category codes. In one embodiment, associations between the data tables can be established by comparing the category codes in the data tables, and when two data tables are related, a relationship line exists between them.
In one scenario, when analyzing one or more data tables, the user may view the information and field structure of each data table (e.g., table name and table fields). When two tables have identical fields, the system may automatically establish the table field relationship; when they have no identical fields, the user may manually edit the fields to establish the table relationship. At step 1203, a third instruction to analyze the first set of relationship lines is received from the user: when the user selects the data tables to be operated on in the relationship graph, the data table relationship may be modified or deleted by clicking the relationship line. Next, at step 1204, in response to receiving the third instruction, a field relationship analysis form related to any relationship line in the first set of relationship lines is visualized: after two data tables to be viewed are selected in the relationship graph of FIG. 3 and the relationship line between them is clicked, the system automatically displays the field relationship analysis form.
At step 1205, a second instruction to modify a field is received from the user. In the field relationship analysis form, the "relationship name" is generated by the system from the field name and the table name, and the user can write or edit the relationship condition. Further, at step 1206, in response to receiving the second instruction, the field is modified and saved accordingly. The field relationship analysis form further includes two function options, "add relationship" and "delete relationship". When the relationship setting is complete, clicking "add relationship" stores the setting and returns to the relationship graph interface of FIG. 3. Conversely, to cancel the relationship, the user can click "delete relationship"; the change is likewise stored and the interface returns to the relationship graph interface of FIG. 3. Finally, at step 1207, a new data analysis template is generated from the modified form for use in the subsequent analysis of data tables.
FIG. 13 is a flow diagram illustrating a method 1300 for the retrieval operation on data table relationships according to another embodiment of the invention. The database of the invention comprises a plurality of data tables, and one or more relationship graphs are formed according to the primary-foreign key relationships among the data tables and stored in the database. As shown in FIG. 13, at step 1301 the method 1300 visualizes at least one relationship graph so that the relationship graph in the database is displayed on the analysis tool interface as in FIG. 2. In one or more embodiments, the user may further confirm or modify the data tables according to the relationship graph by selecting the "analyze" 231 or "search" 232 functional block in the second display area. At step 1302, the method 1300 receives from the user a first instruction to perform an operation on the relationship graph, so as to perform an analysis operation or a retrieval operation. In some application scenarios, when the "search" functional block is selected, the interface is presented as shown in FIG. 6.
At step 1303, in response to receiving the first instruction, the method 1300 retrieves an associated data table, data table field, or data table field value in the relationship graph. For example, the interface of FIG. 6 has a "search" field for entering the field to be searched (hereinafter collectively referred to as the "input value"), and the "data table field" item is selected as the type of data to be searched. At step 1304, the method 1300 visualizes a field relationship graph including a plurality of data frames corresponding to the associated data tables in the relationship graph, with visual connections between the data frames established through a second set of relationship lines. When the user selects "data table field" as the type of data to be searched, as in FIG. 7, the method 1300 can quickly locate the fields of one or more data tables in which the input value written in the search field appears, and display them in the interface in the form of a field relationship graph according to the matching results and the search conditions.
Further, at step 1305, the method 1300 receives from the user a fourth instruction to analyze the second set of relationship lines, for example when the user selects two data tables to be viewed in the field relationship graph and clicks the relationship line between them. At step 1306, in response to receiving the fourth instruction, the method 1300 visualizes a field relationship analysis form related to any relationship line in the second set of relationship lines, that is, the field relationship analysis form is displayed automatically.
At step 1307, in response to receiving the second instruction, the method 1300 modifies and saves the field accordingly. In some scenarios, the field relationship analysis form further includes two function options, "add relationship" and "delete relationship"; after the relationship has been modified, clicking "add relationship" stores the modification and returns to the interface of FIG. 7.
In one embodiment, at step 1308, the method 1300 receives from the user a fifth instruction to modify the field relationship graph, for example when a data frame of FIG. 7 is selected, whereupon the information form shown in FIG. 8 is displayed. At step 1309, in response to receiving the fifth instruction, the method 1300 visualizes the information form, which includes a plurality of fields for modification; the user can modify the "table name", "Chinese table annotation", and "table remark" in the information setting form. Finally, at step 1310, the method 1300 generates a new data analysis template from the modified form for use in the subsequent analysis of data tables.
In summary, the technical solution disclosed in the present invention for analyzing the relationships among multiple data tables allows the relationships to be confirmed after the system has identified them automatically, allows the data table name, the Chinese comment, and the table remark to be modified flexibly through form settings, and generates a new data analysis template from the modified form. This can reduce the error rate and is highly advantageous for subsequent data integration.
Further, it will be appreciated by those skilled in the art that the present invention also discloses an apparatus for analyzing relationships between a plurality of data tables, comprising: at least one processor; at least one memory storing computer program instructions that, when executed by at least one processor, cause the apparatus to perform the methods and embodiments thereof described in connection with the figures. Additionally, the present invention also discloses a computer readable storage medium comprising program instructions for analyzing relationships between a plurality of data tables, which when executed by a processor, perform the method and its various embodiments described in connection with the figures.
It should be appreciated that aspects of the invention may be performed by any module, unit, component, server, computer, terminal, or device executing instructions, and that such module, unit, component, server, computer, terminal, or device may include or otherwise access a computer-readable medium, such as a storage medium, computer storage medium, or data storage device (removable and/or non-removable) such as, for example, a magnetic disk, optical disk, or magnetic tape. Computer storage media may include volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information, such as computer readable instructions, data structures, program modules or other data.
Examples of computer storage media include RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, Digital Versatile Disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by an application, a module, or both. Any such computer storage media may be part of, or accessible or connectable to, a device. Any applications or modules described herein may be implemented using computer-readable/executable instructions that may be stored or otherwise maintained by such computer-readable media.
It is also to be understood that the terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in the specification and claims of this application, the singular form of "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should be further understood that the term "and/or" as used in the specification and claims of this specification refers to any and all possible combinations of one or more of the associated listed items and includes such combinations.
The principles of the present invention have been explained above by means of a number of embodiments, and this explanation is intended only to help in understanding the method of the present invention and its core idea. The invention is not limited to the embodiments described above, and embodiments of the invention are applicable to various fields of application.

Claims (9)

1. A method for analyzing relationships between a plurality of data tables, one or more relationship graphs being formed between the plurality of data tables, wherein the relationship graphs include a plurality of data frames corresponding to the data tables, each data frame includes a field and a category code, and according to types of the category codes in each data table, associations between the plurality of data tables can be established, and each data frame establishes a visual connection through a first set of relationship lines, the method comprising:
visualizing at least one of the relationship graphs;
receiving a first instruction from a user about executing an operation on the relational graph, wherein the first instruction refers to modifying or deleting data table relations by clicking relation lines among a plurality of data tables;
in response to receiving the first instruction, visualizing at least one form associated with the at least one relationship graph, the form comprising a plurality of fields;
receiving a second instruction from the user regarding modifying the field; and
in response to receiving the second instruction, modifying and saving the field accordingly.
2. The method of claim 1, further comprising:
receiving a third instruction from the user to analyze with respect to the first set of relationship lines; and
in response to receiving the third instruction, visualizing a field relationship analysis form related to any of the first set of relationship lines.
3. The method of claim 1, further comprising:
in response to receiving the first instruction, retrieving an associated data table, data table field, or data table field value in the relationship graph.
4. The method of claim 3, wherein the retrieving step further comprises:
visualizing a field relationship graph, wherein the field relationship graph comprises a plurality of data frames corresponding to the associated data tables in the relationship graph, and visual connections between the data frames are established through a second set of relationship lines;
receiving a fourth instruction from the user to analyze the second set of relationship lines; and
in response to receiving the fourth instruction, visualizing a field relationship analysis form related to any of the second set of relationship lines.
5. The method of claim 4, further comprising:
receiving a fifth instruction from the user about modifying the field relationship diagram; and
in response to receiving the fifth instruction, visualizing an informational form that includes a plurality of fields for modification.
6. The method of any of claims 1-5, further comprising:
generating a new data analysis template from the modified form for use in the subsequent analysis of data tables.
7. The method of claim 6, wherein the relationship graph of the data tables is an entity-relationship (ER) diagram.
8. An apparatus for analyzing relationships between a plurality of data tables, comprising:
at least one processor;
at least one memory storing computer program instructions that, when executed by at least one processor, cause the apparatus to perform the method of any of claims 1-7.
9. A computer readable storage medium comprising program instructions for analyzing relationships between a plurality of data tables, which when executed by a processor, perform the method of any one of claims 1-7.
CN201911379115.5A 2019-12-27 2019-12-27 Method, apparatus and computer-readable storage medium for analyzing relationships between a plurality of data tables Active CN111143370B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911379115.5A CN111143370B (en) 2019-12-27 2019-12-27 Method, apparatus and computer-readable storage medium for analyzing relationships between a plurality of data tables

Publications (2)

Publication Number Publication Date
CN111143370A (en) 2020-05-12
CN111143370B (en) 2021-03-26

Family

ID=70521074

Country Status (1)

Country Link
CN (1) CN111143370B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant