CN115455149B - Database construction method based on coding query mode and application thereof - Google Patents

Info

Publication number
CN115455149B
Authority
CN
China
Prior art keywords
database
dimension
keywords
data
query
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211141746.5A
Other languages
Chinese (zh)
Other versions
CN115455149A (en)
Inventor
郁强
潘蓝翔
赵军辉
黄红叶
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CCI China Co Ltd
Original Assignee
CCI China Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by CCI China Co Ltd
Priority to CN202211141746.5A
Publication of CN115455149A
Application granted
Publication of CN115455149B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/31 Indexing; Data structures therefor; Storage structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/12 Use of codes for handling textual entities
    • G06F40/126 Character encoding
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application provides a database construction method based on a coding query mode and an application thereof. The method comprises the following steps: S00, traversing the original data and parsing it to obtain the dimension keywords of each data item in the original data; S01, encoding each dimension keyword so that the codes and the dimension keywords are in one-to-one correspondence; S02, constructing a mapping grid relation of the database according to the mapping relations among the dimension keywords; S03, correcting the mapping grid relation of the database and determining the data set storage mode of the database according to the correspondence between the codes and the dimension keywords; S04, constructing a query key based on the structure of the data set storage mode of the database; S05, storing the original data in the database in the determined data set storage mode. The method allows the database to be constructed and expanded quickly and improves query efficiency.

Description

Database construction method based on coding query mode and application thereof
Technical Field
The application relates to the technical field of databases, in particular to a database construction method based on a coding query mode and application thereof.
Background
With the continuous development of modern society, networks are used ever more widely, and the volume of data transmitted over them keeps growing. To process and use such data, it is typically kept in a database, where network data may be retained for analyzing user behavior, for measuring and optimizing the network, and for auditing. In the fields of urban brain and urban management in particular, the data volume is huge, a single city often involves data of multiple dimensions, and those data are often interrelated. With so much data stored in a database, retrieval and queries waste considerable computing power, run slowly, and lengthen the user's waiting time.
In the conventional technology, however, a database is usually searched by entering keywords and then extracting or sorting data based on them, and when the keywords span multiple dimensions the time required grows further. Take querying a city-level wage income database as an example: the search keywords cover three dimensions, namely men, Zhejiang people, and income. During the search, all data in the database must be traversed one by one, the data containing the three dimensions are extracted to form a preliminary query database, and that preliminary database is analyzed further to obtain the query result. The query therefore takes a long time and consumes a great deal of computing power.
As can be seen, because the volume of network data is large and its types may be highly varied, a large-capacity, easily expandable database is required for storage. At the same time, for the many applications that query databases, fast queries against large-capacity databases are also very important. Therefore, a database construction method based on a coding query mode and an application thereof are needed to solve the above problems.
Disclosure of Invention
The embodiment of the application provides a database construction method based on a coding query mode and application thereof, aiming at the problems of high calculation resource consumption, large data volume and the like in the prior art.
The core technology of the invention mainly comprises three directions: 1, establishing a database; 2, expanding the capacity of a database; 3, querying a database.
In a first aspect, the present application provides a method for constructing a database based on a coding query mode, where the method includes the following steps:
S00, traversing the original data and parsing it to obtain the dimension keywords of each data item in the original data;
S01, encoding each dimension keyword so that the codes and the dimension keywords are in one-to-one correspondence;
S02, constructing a mapping grid relation of the database according to the mapping relations among the dimension keywords;
S03, correcting the mapping grid relation of the database and determining the data set storage mode of the database according to the correspondence between the codes and the dimension keywords;
S04, constructing a query key based on the structure of the data set storage mode of the database;
wherein the query key consists of the codes corresponding to the keywords in each dimension;
S05, storing the original data in the database in the determined data set storage mode.
Further, the method also comprises steps for expanding the database, specifically:
S10, receiving and parsing the expansion data to be stored in the database to obtain the expansion dimension keywords of the expansion data;
S11, comparing the expansion dimension keywords with the original dimension keywords in the database to obtain the difference between the two;
S12, judging the difference between the two;
S13, if the difference is greater than a threshold, encoding the expansion dimension keywords to construct new codes, updating the data set storage mode of the database according to the correspondence between the new codes and the expansion dimension keywords, and updating the query key at the same time, so that the expansion data are stored in the database in the updated data set storage mode to complete the expansion; if the difference is less than or equal to the threshold, merging the expansion dimension keywords into the original dimension keywords and storing the expansion data in the database in the original data set storage mode to complete the expansion.
Further, the method also comprises steps for querying the database, specifically:
S20, receiving and parsing an input query key to obtain the dimension keywords contained in the query key;
S21, determining the corresponding query code based on the obtained dimension keywords;
wherein the query code is a code used for the query;
S22, extracting the data related to the query code from the database according to the query code.
Further, in step S00, each dimension keyword is output for the user to preview and review, and modifications, deletions, or additions by the user are accepted.
Further, the method also comprises step S06, extracting dimension keywords based on machine learning.
Further, in step S01, each dimension keyword is encoded in numerical form.
Further, in step S02, dimension keywords that can be generalized are raised to an upper level to obtain upper-level dimension keywords, so that a mapping relationship is formed between the lower-level and the upper-level dimension keywords.
In a second aspect, the present application provides a database query and expansion device based on a coding query mode, including:
the building module is used for traversing the original data and analyzing the original data to obtain the dimension keywords of each data in the original data; encoding each dimension keyword so that the encoding is associated with the dimension keywords in a one-to-one correspondence; constructing a mapping grid relation of a database according to the mapping relation among the dimension keywords; correcting the mapping grid relation of the database and determining the data set storage mode of the database according to the corresponding relation between the codes and the dimension keywords; constructing a query key based on the structure of a data set storage mode of a database; storing the original data in a database in a determined data set storage mode;
the capacity expansion module is used for receiving and analyzing capacity expansion data to be stored in the database so as to obtain capacity expansion dimension keywords of the capacity expansion data; comparing the capacity expansion dimension keywords with original dimension keywords in a database to obtain differences between the capacity expansion dimension keywords and the original dimension keywords; judging the difference between the two; if the difference is greater than the threshold value, encoding the expansion dimension keywords to construct a new code, updating a data set storage mode of the database according to the corresponding relation between the new code and the expansion dimension keywords, and updating the query key at the same time to enable the expansion data to be stored in the database in the updated data set storage mode to finish expansion; if the difference is smaller than or equal to the threshold value, integrating the capacity expansion dimension keywords into the original dimension keywords, and storing the capacity expansion data into a database by using the original data set to finish capacity expansion;
the query module is used for receiving and analyzing the input query key to acquire the dimension keywords contained in the query key; confirming a corresponding query code based on the acquired dimension keywords;
and the output module is used for outputting the queried data.
In a third aspect, the present application provides an electronic device including a memory and a processor, the memory storing a computer program and the processor being configured to run the computer program to perform the above-described database construction method based on a coding query mode.
In a fourth aspect, the present application provides a readable storage medium storing a computer program, the computer program comprising program code for controlling a process to execute a procedure, the procedure comprising the above-described database construction method based on a coding query mode.
The main contributions and innovations of the invention are as follows:
1. Compared with the prior art, during database construction the application parses the original data and builds the data set storage mode and the query key on the basis of codes, so that data are entered with a definite association relationship. This facilitates ordered storage and replaces the stacked storage of the conventional technology. Storage becomes more systematic, and the ordering is especially helpful for building the database quickly when the data volume is huge;
2. Compared with the prior art, during database expansion the application takes into account that the keywords of the data may change as the database stays in use, so the original database does not necessarily cover the new dimension keywords. During expansion, the expansion dimension keywords of new data are therefore identified and encoded, the new codes are added to the data set storage mode, and the data storage mode is updated;
3. Compared with the prior art, during data query the dimension information of the query keywords is obtained from the keywords, the corresponding codes are obtained from the dimension information, and data are extracted quickly from the database through the codes. The association relationships between different data can be extracted at the same time, which improves the response efficiency of data queries.
The details of one or more embodiments of the application are set forth in the accompanying drawings and the description below to provide a more thorough understanding of the other features, objects, and advantages of the application.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiments of the application and together with the description serve to explain the application and do not constitute an undue limitation to the application. In the drawings:
FIG. 1 is a flow chart of a database construction method based on a coding query mode according to an embodiment of the present application;
FIG. 2 is a flow chart of database expansion;
FIG. 3 is a flow chart of a database query;
FIG. 4 is a schematic diagram of a hardware structure of an electronic device according to an embodiment of the present application.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with one or more embodiments of the present specification. Rather, they are merely examples of apparatus and methods consistent with aspects of one or more embodiments of the present description as detailed in the accompanying claims.
It should be noted that: in other embodiments, the steps of the corresponding method are not necessarily performed in the order shown and described in this specification. In some other embodiments, the method may include more or fewer steps than described in this specification. Furthermore, individual steps described in this specification, in other embodiments, may be described as being split into multiple steps; while various steps described in this specification may be combined into a single step in other embodiments.
Because the network data has a large data volume and may vary more, a large-capacity easily-expandable database is required for storage. At the same time, for various applications requiring database queries, rapid queries for large-capacity databases are also very important.
On this basis, the present application builds the database on keywords to solve the problems existing in the prior art.
Example 1
Specifically, an embodiment of the application provides a database construction method based on a coding query mode. Referring to FIG. 1, the method includes:
S00, traversing the original data and parsing it to obtain the dimension keywords of each data item in the original data;
In this embodiment, the original data are known data. While the original data are parsed, the content contained in the query field is split according to the language type to obtain multiple dimension keywords, where the language type includes the word segmentation order and the degree of word association within the same data item.
Preferably, after the dimension keywords are generated, a preview function can be provided: all dimension keywords are shown for the user to review, and the user can revise them based on the review, for example by deleting dimension keywords or adding missing ones.
Preferably, construction of the original database can be combined with deep learning to extract the dimension keywords automatically, with extraction becoming more accurate as the data volume grows.
S01, encoding each dimension keyword so that the codes and the dimension keywords are in one-to-one correspondence;
Preferably, each dimension keyword can be encoded numerically, which further simplifies construction of the codes and facilitates establishing the data set storage mode and the query key.
An example correspondence between English letters and numbers is: A (1), B (2), C (3), D (4), E (5), F (6), G (7), H (8), I (9), J (210), K (211), L (212), M (213), N (214), O (215), P (216), Q (217), R (218), S (219), T (220), U (221), V (222), W (223), X (224), Y (225), Z (226).
S02, constructing a mapping grid relation of a database according to the mapping relation among the dimension keywords;
In this embodiment, taking "income of man and Zhejiang people" as an example, the original database may also contain "income of woman and Zhejiang people". This gives four dimension keywords (man, woman, Zhejiang people, income), which stand in parallel mapping relationships. "Man" and "woman" can also be generalized to an upper-level dimension keyword, here "gender". Gender, Zhejiang people, and income are then in cross mapping relationships, while man and woman are each in a subordinate mapping relationship to gender.
S03, correcting the mapping grid relation of the database and determining the data set storage mode of the database according to the corresponding relation between the codes and the dimension keywords;
In this embodiment, the dimension keywords in the mapping grid relation are replaced by their codes, so that the data set storage mode and the query key can be constructed; both take the form of a gridded database.
S04, constructing a query key based on the structure of the data set storage mode of the database;
wherein the query key consists of the codes corresponding to the keywords in each dimension;
S05, storing the original data in the database in the determined data set storage mode.
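Steps S02 to S05 can be illustrated with a minimal sketch. It is not the patented implementation: the function names, the `UPPER` table, the record layout, and the codes for "woman" and "gender" are hypothetical, and only the codes 001, 002, and 003 follow the example given later in this embodiment. Parallel relations are left out for brevity.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical inputs: one code per keyword and an upper-level keyword table.
CODES = {"man": "001", "Zhejiang": "002", "income": "003",
         "woman": "004", "gender": "005"}
UPPER = {"man": "gender", "woman": "gender"}


def build_mapping_grid(keywords, codes=CODES, upper=UPPER):
    """S02/S03: relate the keywords, then express every relation in codes."""
    # Subordinate relation: lower-level keyword -> upper-level keyword.
    relations = [("subordinate", kw, upper[kw]) for kw in keywords if kw in upper]
    # Top-level dimensions, de-duplicated while keeping first-seen order.
    dims = list(dict.fromkeys(upper.get(kw, kw) for kw in keywords))
    # Cross relation: every pair of distinct top-level dimensions.
    relations += [("cross", a, b) for a, b in combinations(dims, 2)]
    return [(kind, codes[a], codes[b]) for kind, a, b in relations]


def store(records, codes=CODES):
    """S04/S05: file each record under the query key built from its codes."""
    database = defaultdict(list)
    for keywords, payload in records:
        query_key = "".join(codes[kw] for kw in keywords)
        database[query_key].append(payload)
    return dict(database)


grid = build_mapping_grid(["man", "woman", "Zhejiang", "income"])
database = store([
    (["man", "Zhejiang", "income"], {"id": 1}),    # made-up sample records
    (["woman", "Zhejiang", "income"], {"id": 2}),
])
```

Keying the store by the concatenated codes means a later lookup is a single dictionary access rather than a scan over every record, which is the effect the embodiment is after.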
In this embodiment, an encrypted number library is constructed, for example on the basis of the English alphabet, with each English letter corresponding to an encrypted number. To determine the code of a dimension keyword, the initial pinyin letters of the keyword's first and last characters are extracted, and the code is formed from the encrypted numbers of those two letters. If the keyword is "male", the number corresponding to n is 214 and the number corresponding to x is 224, so the code of the keyword "male" is 214224. After the code of a keyword is determined, it is stored to build a code mapping relation. When the code of the next keyword is determined, the new code is compared against the code mapping relation to check for duplicates. If the code is a duplicate, the second pinyin letter of the keyword's first character is taken and the keyword is encoded again, and this cycle repeats until the new code no longer duplicates any entry in the code mapping relation, at which point the encoding of the keyword is complete.
It can be understood that this arrangement makes the keywords convenient to encode, encrypts the keywords, and makes the codes convenient to decipher.
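The encoding rule above can be sketched as follows. This is an illustration under stated assumptions rather than the method as claimed: the function name `encode_keyword` is hypothetical, the caller is assumed to supply the pinyin syllables in plain ASCII, "male" is assumed to be the two-character keyword 男性 (nan, xing), and on a collision the sketch swaps the first character's letter for the next letter of its pinyin, which is one possible reading of the text.

```python
import string

# Letter table from the description: A-I map to 1-9, J-Z map to 210-226.
LETTER_CODE = {
    ch: (i if i <= 9 else 200 + i)
    for i, ch in enumerate(string.ascii_lowercase, start=1)
}


def encode_keyword(syllables, used):
    """Return an unused code for a keyword given as a list of pinyin syllables.

    `used` is the set of codes already assigned (the code mapping relation);
    it is updated in place when a new code is accepted.
    """
    first, last = syllables[0].lower(), syllables[-1].lower()
    # Try the 1st, 2nd, ... pinyin letter of the first character in turn.
    for ch in first:
        code = f"{LETTER_CODE[ch]}{LETTER_CODE[last[0]]}"
        if code not in used:
            used.add(code)
            return code
    raise ValueError("no unused code available for this keyword")


used_codes = set()
male_code = encode_keyword(["nan", "xing"], used_codes)    # n -> 214, x -> 224
other_code = encode_keyword(["nian", "xu"], used_codes)    # hypothetical keyword, same initials
```

Here `male_code` is `"214224"`, matching the example in the text, and the second keyword falls back to its second pinyin letter because its first attempt collides.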
As shown in fig. 2, the capacity expansion step includes:
S10, receiving and parsing the expansion data to be stored in the database to obtain the expansion dimension keywords of the expansion data;
In this embodiment, as the database stays in use, the keywords of the data may change, and the original database does not necessarily cover future dimension keywords. Taking "man's income of Zhejiang" again as an example, a large amount of data such as "man's income of Zhejiang" and "man's financial income of Zhejiang" may appear during subsequent data input.
S11, comparing the expansion dimension keywords with the original dimension keywords in the database to obtain the difference between the two;
S12, judging the difference between the two;
Preferably, a similarity algorithm, or another machine learning model, may be used here to calculate the similarity or the magnitude of the difference.
S13, if the difference is greater than a threshold, encoding the expansion dimension keywords to construct new codes, updating the data set storage mode of the database according to the correspondence between the new codes and the expansion dimension keywords, and updating the query key at the same time, so that the expansion data are stored in the database in the updated data set storage mode to complete the expansion; if the difference is less than or equal to the threshold, merging the expansion dimension keywords into the original dimension keywords and storing the expansion data in the database in the original data set storage mode to complete the expansion.
In this embodiment, it can be understood that when the expansion dimension keywords are compared with the dimension keywords already in the database, the comparison result can be determined from the number of differing keywords and the number of overlapping keywords. For example, when the overlap is very small, the new expansion data are poorly suited to the existing database, and establishing a new database should be recommended. If the overlap ratio is greater than a first threshold (for example, 30%) and the number of differences is greater than 2, the difference between the two is considered large; if the overlap ratio is greater than the first threshold and the number of differences is less than or equal to 2, the difference is considered small.
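The decision rule just described can be written as a short sketch. The text does not say how the overlap ratio is computed, so this sketch assumes it is the share of expansion keywords already present in the database, and it treats any ratio at or below the first threshold as a "very small" overlap. The function name and return labels are hypothetical.

```python
def classify_expansion(new_keywords, original_keywords,
                       overlap_threshold=0.30, max_differences=2):
    """Classify expansion data as 'new_database', 'large', or 'small'."""
    new, original = set(new_keywords), set(original_keywords)
    # Assumed definition: share of expansion keywords the database already has.
    overlap_ratio = len(new & original) / len(new) if new else 0.0
    # Number of expansion keywords the database does not have yet.
    differences = len(new - original)

    if overlap_ratio <= overlap_threshold:
        return "new_database"   # too little in common: recommend a new database
    if differences > max_differences:
        return "large"          # S13: encode new keywords, update storage mode
    return "small"              # S13: merge into the original keywords


original = ["man", "woman", "Zhejiang", "income"]
small = classify_expansion(["man", "Zhejiang", "income", "financial"], original)
large = classify_expansion(["man", "income", "a", "b", "c", "d"], original)
fresh = classify_expansion(["man", "a", "b", "c"], original)
```

With these inputs, `small` is `"small"` (overlap 75%, one difference), `large` is `"large"` (overlap about 33%, four differences), and `fresh` is `"new_database"` (overlap 25%).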
As shown in fig. 3, the querying step (applicable to the unexpanded database and the expanded database) includes:
S20, receiving and parsing an input query key to obtain the dimension keywords contained in the query key;
S21, determining the corresponding query code based on the obtained dimension keywords;
wherein the query code is a code used for the query;
S22, extracting the data related to the query code from the database according to the query code.
In this embodiment, taking "man's income of Zhejiang" as an example, during construction of the database "man" is assigned code 001, "Zhejiang" code 002, and "income" code 003. During a query, the data in the database can be extracted directly with 001002003, and the query key in the database can be extracted at the same time. The extracted data can be displayed as a mapping grid, which allows rapid data queries and facilitates accurate analysis.
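A minimal sketch of steps S20 to S22 follows, using the codes 001, 002, and 003 from the example. The function name, the tokenization by regular expression, and the sample record are assumptions made for illustration.

```python
import re

# Codes from the example above; the stored record is made-up sample data.
CODES = {"man": "001", "Zhejiang": "002", "income": "003"}
DATABASE = {"001002003": [{"id": 1}]}


def query(text, codes=CODES, database=DATABASE):
    """Return (query_code, matching_records) for a free-text query."""
    # S20: split the input into words and keep the known dimension keywords.
    # Whole-word matching avoids treating "woman" as a hit for "man".
    tokens = set(re.findall(r"[A-Za-z]+", text))
    found = [kw for kw in codes if kw in tokens]
    # S21: concatenate the codes in the database's own keyword order.
    query_code = "".join(codes[kw] for kw in found)
    # S22: one dictionary lookup replaces a scan over all records.
    return query_code, database.get(query_code, [])


code, rows = query("man's income of Zhejiang")
```

Here `code` is `"001002003"` regardless of the word order in the input, because the codes are joined in the order in which the keywords were registered.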
Preferably, during encoding, the data can be encoded with the escape(string) and encodeURI(URIstring) functions, and the query key can be parsed with encodeURIComponent(URIstring) so that data are extracted quickly.
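The functions named above are JavaScript built-ins (escape is deprecated there). As a rough analogue only, Python's `urllib.parse.quote` with `safe=""` percent-encodes a string in a similar way to encodeURIComponent, although the two do not escape exactly the same set of characters. The helper name and the sample string below are hypothetical.

```python
from urllib.parse import quote, unquote


def encode_component(text):
    """Percent-encode text for use inside a query key or URI component."""
    # safe="" also escapes "/", which quote() leaves untouched by default.
    return quote(text, safe="")


original = "gender=male&region=浙江"
encoded = encode_component(original)   # reserved and non-ASCII characters become %XX
decoded = unquote(encoded)             # round-trips back to the original text
```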
Preferably, the present application is also suitable for processing city-scale big data and can store users' financial information and traffic data. Stored financial information may include gender, age, nature of the enterprise, and so on; stored traffic information may include flow rate, number of vehicles, and vehicle speed. Such data can serve, for example, the passenger side (people who need a ride), the driver side, and government traffic control departments.
Taking the passenger-side application as an example, the database can record passengers' destinations, origins, and time periods as keywords. Destination and origin may each contain several lower-level keywords, for example shopping places, transportation hubs, and public service places (such as schools), and a database relating destination, origin, and time period can be built on these keywords;
Taking the driver-side application as an example, a driver can query the database to quickly learn where customers are arriving and departing in different time periods. On that basis, the driver's waiting position can be matched to the places where customers start their trips, which improves both the efficiency with which passengers get a ride and the driver's passenger-carrying efficiency.
The same database can also serve government institutions, which can obtain the position information of passengers and drivers; here drivers may include taxi drivers and private car drivers. Because data are stored and queried quickly and efficiently, the database can have a larger capacity and wider data coverage, and because statistics are kept by code, the structure of the data is clearer. Government departments can then carry out city-level planning on that basis, such as road construction and maintenance planning and the zoning of schools and commercial areas.
Example two
Based on the same conception, the application also provides a database query and expansion device based on a coding query mode, comprising:
the building module is used for traversing the original data and analyzing the original data to obtain the dimension keywords of each data in the original data; encoding each dimension keyword so that the encoding is associated with the dimension keywords in a one-to-one correspondence; constructing a mapping grid relation of a database according to the mapping relation among the dimension keywords; correcting the mapping grid relation of the database and determining the data set storage mode of the database according to the corresponding relation between the codes and the dimension keywords; constructing a query key based on the structure of a data set storage mode of a database; storing the original data in a database in a determined data set storage mode;
the capacity expansion module is used for receiving and analyzing capacity expansion data to be stored in the database so as to obtain capacity expansion dimension keywords of the capacity expansion data; comparing the capacity expansion dimension keywords with original dimension keywords in a database to obtain differences between the capacity expansion dimension keywords and the original dimension keywords; judging the difference between the two; if the difference is greater than the threshold value, encoding the expansion dimension keywords to construct a new code, updating a data set storage mode of the database according to the corresponding relation between the new code and the expansion dimension keywords, and updating the query key at the same time to enable the expansion data to be stored in the database in the updated data set storage mode to finish expansion; if the difference is smaller than or equal to the threshold value, integrating the capacity expansion dimension keywords into the original dimension keywords, and storing the capacity expansion data into a database by using the original data set to finish capacity expansion;
the query module is used for receiving and analyzing the input query key to acquire the dimension keywords contained in the query key; confirming a corresponding query code based on the acquired dimension keywords;
and the output module is used for outputting the queried data.
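As a rough illustration of how the building module and the capacity expansion module described above could fit together, the following Python sketch uses in-memory dictionaries in place of a real database. The class and function names, the integer codes, the difference metric (the share of incoming keywords that have no code yet), the integration rule for small differences, and the default threshold of 0.5 are all assumptions made for illustration; the application does not prescribe any of them.

```python
from dataclasses import dataclass, field


@dataclass
class CodedDatabase:
    """Toy in-memory stand-in for the database (hypothetical, for illustration)."""
    codes: dict = field(default_factory=dict)  # dimension keyword -> integer code
    rows: dict = field(default_factory=dict)   # query key (tuple of codes) -> records

    def encode(self, keyword):
        # One-to-one association: each keyword gets the next unused integer, once.
        if keyword not in self.codes:
            self.codes[keyword] = len(self.codes) + 1
        return self.codes[keyword]

    def query_key(self, keywords):
        # The query key consists of the codes of the given dimension keywords.
        return tuple(sorted(self.codes[k] for k in keywords))

    def store(self, record, keywords):
        self.rows.setdefault(self.query_key(keywords), []).append(record)


def build(records):
    """Building module: `records` is a list of (data, dimension_keywords) pairs."""
    db = CodedDatabase()
    for data, keywords in records:
        for kw in keywords:
            db.encode(kw)
        db.store(data, keywords)
    return db


def expand(db, data, keywords, threshold=0.5):
    """Capacity expansion module with an assumed difference metric."""
    if not keywords:
        raise ValueError("expansion data needs at least one dimension keyword")
    unseen = [k for k in keywords if k not in db.codes]
    difference = len(unseen) / len(keywords)  # assumption: share of unseen keywords
    if difference > threshold:
        # Large difference: construct new codes and store under the updated key.
        for kw in unseen:
            db.encode(kw)
        db.store(data, keywords)
        return "new codes"
    # Small difference: assumed integration rule, keep only already-coded keywords.
    known = [k for k in keywords if k in db.codes]
    db.store(data, known)
    return "integrated"


db = build([({"id": 1}, ["city", "traffic"]), ({"id": 2}, ["city", "weather"])])
first = expand(db, {"id": 3}, ["city", "traffic", "camera"])  # 1 of 3 keywords unseen
second = expand(db, {"id": 4}, ["hospital", "bed"])           # all keywords unseen
```

In this sketch the first expansion falls at or below the threshold and reuses the existing key, while the second exceeds it and causes two new codes to be allocated. A real implementation would presumably use a semantic comparison of keywords rather than exact string membership.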
Example III
This embodiment also provides an electronic device, referring to fig. 4, comprising a memory 404 and a processor 402, the memory 404 having stored therein a computer program, the processor 402 being arranged to run the computer program to perform the steps of any of the method embodiments described above.
In particular, the processor 402 may include a central processing unit (CPU) or an application-specific integrated circuit (ASIC), or may be configured as one or more integrated circuits that implement embodiments of the present application.
The memory 404 may include, among other things, mass storage for data or instructions. By way of example, and not limitation, the memory 404 may comprise a hard disk drive (HDD), a floppy disk drive, a solid-state drive (SSD), flash memory, an optical disk, a magneto-optical disk, magnetic tape, or a Universal Serial Bus (USB) drive, or a combination of two or more of these. The memory 404 may include removable or non-removable (or fixed) media, where appropriate. The memory 404 may be internal or external to the data processing apparatus, where appropriate. In a particular embodiment, the memory 404 is a non-volatile memory. In particular embodiments, the memory 404 includes read-only memory (ROM) and random access memory (RAM). Where appropriate, the ROM may be a mask-programmed ROM, a programmable ROM (PROM), an erasable PROM (EPROM), an electrically erasable PROM (EEPROM), an electrically alterable ROM (EAROM), or flash memory, or a combination of two or more of these. Where appropriate, the RAM may be static random access memory (SRAM) or dynamic random access memory (DRAM), and the DRAM may be fast page mode DRAM (FPM DRAM), extended data output DRAM (EDO DRAM), synchronous DRAM (SDRAM), or the like.
Memory 404 may be used to store or cache various data files that need to be processed and/or used for communication, as well as possible computer program instructions for execution by processor 402.
The processor 402 reads and executes the computer program instructions stored in the memory 404 to implement any of the database construction methods based on the coded query pattern in the above embodiments.
Optionally, the electronic apparatus may further include a transmission device 406 and an input/output device 408, where the transmission device 406 is connected to the processor 402 and the input/output device 408 is connected to the processor 402.
The transmission device 406 may be used to receive or transmit data via a network. Specific examples of the network may include a wired or wireless network provided by a communication provider of the electronic device. In one example, the transmission device 406 includes a network interface controller (NIC) that can connect to other network devices through a base station to communicate with the internet. In another example, the transmission device 406 may be a radio frequency (RF) module configured to communicate with the internet wirelessly.
The input/output device 408 is used to input or output information. In this embodiment, the input information may be data to be stored or a query key, and the output information may be the queried data.
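The path from an input query key to the output data can be sketched as follows, under stated assumptions: the input is taken to be a whitespace-separated string, dimension keywords are matched by exact lowercase string comparison, and a record is treated as related to the query when its stored key contains every query code. The function names and the sample `codes` and `rows` tables are hypothetical and serve only to make the example self-contained.

```python
# Hypothetical sample state: dimension keyword -> code, and stored key -> records.
codes = {"city": 1, "traffic": 2, "weather": 3}
rows = {
    (1, 2): ["record about city traffic"],
    (1, 3): ["record about city weather"],
}


def parse_query(text, known_keywords):
    """Analyze the input and keep the tokens that are known dimension keywords."""
    return [tok for tok in text.lower().split() if tok in known_keywords]


def query(text, codes, rows):
    """Confirm the query codes, then extract the records related to them."""
    keywords = parse_query(text, codes)
    query_codes = {codes[k] for k in keywords}
    if not query_codes:
        return []  # nothing recognizable in the input
    # Assumed matching rule: the stored key must contain every query code.
    return [
        record
        for key, records in rows.items()
        if query_codes <= set(key)
        for record in records
    ]


traffic_hits = query("City traffic", codes, rows)
city_hits = query("city", codes, rows)
no_hits = query("unknown words", codes, rows)
```

Because matching is done on integer codes rather than on the original text, a narrower query simply adds codes to the required set, which is why the single keyword "city" returns both sample records while "City traffic" returns only one.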
Example IV
The present embodiment also provides a readable storage medium having stored therein a computer program, the computer program including program code for controlling a process to execute a process that includes the database construction method based on the coded query pattern according to the first embodiment.
It should be noted that, for specific examples in this embodiment, reference may be made to the examples described in the foregoing embodiments and alternative implementations, which are not repeated here.
In general, the various embodiments may be implemented in hardware or special purpose circuits, software, logic or any combination thereof. Some aspects of the invention may be implemented in hardware, while other aspects may be implemented in firmware or software which may be executed by a controller, microprocessor or other computing device, although the invention is not limited thereto. While various aspects of the invention may be illustrated and described as block diagrams, flow charts, or using some other pictorial representation, it is well understood that these blocks, apparatus, systems, techniques or methods described herein may be implemented in, as non-limiting examples, hardware, software, firmware, special purpose circuits or logic, general purpose hardware or controller or other computing devices, or some combination thereof.
Embodiments of the invention may be implemented by computer software executable by a data processor of a mobile device, such as in a processor entity, or by hardware, or by a combination of software and hardware. Computer software or programs (also referred to as program products) including software routines, applets, and/or macros can be stored in any apparatus-readable data storage medium and they include program instructions for performing particular tasks. The computer program product may include one or more computer-executable components configured to perform embodiments when the program is run. The one or more computer-executable components may be at least one software code or a portion thereof. In addition, in this regard, it should be noted that any blocks of the logic flows as illustrated may represent program steps, or interconnected logic circuits, blocks and functions, or a combination of program steps and logic circuits, blocks and functions. The software may be stored on a physical medium such as a memory chip or memory block implemented within a processor, a magnetic medium such as a hard disk or floppy disk, and an optical medium such as, for example, a DVD and its data variants, a CD, etc. The physical medium is a non-transitory medium.
It should be understood by those skilled in the art that the technical features of the above embodiments may be combined in any manner. For brevity, not all possible combinations of these technical features are described; however, as long as a combination of technical features involves no contradiction, it should be considered to be within the scope of this description.
The foregoing examples represent only several embodiments of the present application. Although they are described specifically and in detail, they should not be construed as limiting the scope of the present application in any way. It should be noted that those skilled in the art could make various modifications and improvements without departing from the spirit of the present application, and these would fall within the scope of the present application. Accordingly, the scope of protection of the present application shall be subject to the appended claims.

Claims (9)

1. The database construction method based on the coding query mode is characterized by comprising the following steps:
S00, traversing the original data and analyzing it to obtain the dimension keywords of each piece of data in the original data;
S01, encoding each dimension keyword so that the codes are associated with the dimension keywords in a one-to-one correspondence;
S02, constructing a mapping grid relation of a database according to the mapping relation among the dimension keywords;
S03, correcting the mapping grid relation of the database by using the corresponding relation between the codes and the dimension keywords and determining the data set storage mode of the database;
S04, constructing a query key based on the structure of the data set storage mode of the database;
wherein the query key consists of the codes corresponding to each dimension keyword;
S05, storing the original data in the database in the determined data set storage mode;
the method further comprises steps for expanding the database, specifically comprising the following steps:
S10, receiving and analyzing the capacity expansion data to be stored in the database to obtain capacity expansion dimension keywords of the capacity expansion data;
S11, comparing the capacity expansion dimension keywords with the original dimension keywords in the database to obtain the difference between the capacity expansion dimension keywords and the original dimension keywords;
S12, judging the difference between the two;
S13, if the difference is greater than a threshold value, encoding the capacity expansion dimension keywords to construct a new code, updating the data set storage mode of the database according to the corresponding relation between the new code and the capacity expansion dimension keywords, and updating the query key at the same time, so that the capacity expansion data is stored in the database in the updated data set storage mode to finish capacity expansion; if the difference is smaller than or equal to the threshold value, integrating the capacity expansion dimension keywords into the original dimension keywords, and storing the capacity expansion data into the database according to the original data set to finish capacity expansion.
2. The method for constructing a database based on a coded query pattern according to claim 1, further comprising a step for querying the database, specifically comprising the steps of:
S20, receiving and analyzing an input query key to acquire dimension keywords contained in the query key;
S21, confirming a corresponding query code based on the acquired dimension keywords;
wherein the query code is a code for a query;
S22, extracting data related to the query codes in the database according to the query codes.
3. The method for constructing a database based on a coded query pattern according to claim 1, wherein in step S00, each dimension keyword is output as a preview for the user to review, and modifications, deletions or additions made by the user are accepted.
4. The method for constructing a database based on a coded query pattern according to claim 1, further comprising step S06 of extracting the dimension keywords based on machine learning.
5. The method for constructing a database based on a coded query pattern according to claim 4, wherein in step S01, each of the dimension keywords is encoded in numeric form.
6. The method for constructing a database based on a coded query pattern according to claim 5, wherein in step S02, dimension keywords that can be generalized to an upper level are generalized to obtain upper-level dimension keywords, so that a mapping relationship is formed between the lower-level dimension keywords and the upper-level dimension keywords.
7. A database query capacity expansion device based on the coding query mode, characterized by comprising:
the building module is used for traversing the original data and analyzing the original data to obtain the dimension keywords of each data in the original data; encoding each dimension keyword so that the encoding is associated with the dimension keywords in a one-to-one correspondence; constructing a mapping grid relation of a database according to the mapping relation among the dimension keywords; correcting the mapping grid relation of the database and determining the data set storage mode of the database according to the corresponding relation between the codes and the dimension keywords; constructing a query key based on the structure of a data set storage mode of a database; storing the original data in a database in a determined data set storage mode;
the capacity expansion module is used for receiving and analyzing capacity expansion data to be stored in the database so as to obtain capacity expansion dimension keywords of the capacity expansion data; comparing the capacity expansion dimension keywords with original dimension keywords in a database to obtain differences between the capacity expansion dimension keywords and the original dimension keywords; judging the difference between the two; if the difference is greater than the threshold value, encoding the expansion dimension keywords to construct a new code, updating a data set storage mode of the database according to the corresponding relation between the new code and the expansion dimension keywords, and updating the query key at the same time to enable the expansion data to be stored in the database in the updated data set storage mode to finish expansion; if the difference is smaller than or equal to the threshold value, integrating the capacity expansion dimension keywords into the original dimension keywords, and storing the capacity expansion data into a database by using the original data set to finish capacity expansion;
the query module is used for receiving and analyzing the input query key to acquire the dimension keywords contained in the query key; confirming a corresponding query code based on the acquired dimension keywords;
and the output module is used for outputting the queried data.
8. An electronic device comprising a memory and a processor, wherein the memory has stored therein a computer program, the processor being arranged to run the computer program to perform the coded query pattern based database construction method of any of claims 1 to 6.
9. A readable storage medium, characterized in that the readable storage medium has stored therein a computer program comprising program code for controlling a process to execute a process comprising the method of constructing a database based on the coded query pattern according to any one of claims 1 to 6.
CN202211141746.5A 2022-09-20 2022-09-20 Database construction method based on coding query mode and application thereof Active CN115455149B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211141746.5A CN115455149B (en) 2022-09-20 2022-09-20 Database construction method based on coding query mode and application thereof

Publications (2)

Publication Number Publication Date
CN115455149A CN115455149A (en) 2022-12-09
CN115455149B true CN115455149B (en) 2023-05-30

Family

ID=84303922

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211141746.5A Active CN115455149B (en) 2022-09-20 2022-09-20 Database construction method based on coding query mode and application thereof

Country Status (1)

Country Link
CN (1) CN115455149B (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7595485B1 (en) * 2007-02-07 2009-09-29 Thermo Finnigan Llc Data analysis to provide a revised data set for use in peptide sequencing determination
CN105069109A (en) * 2015-08-07 2015-11-18 北京思特奇信息技术股份有限公司 Dilatation method and system of distributed database
CN106844534A (en) * 2016-12-30 2017-06-13 西安电子科技大学 Towards the GeoHash coding methods by geographical spatial data one-dimensional of NoSQL databases
CN108121712A (en) * 2016-11-28 2018-06-05 北京国双科技有限公司 A kind of keyword storage method and device
CN110489494A (en) * 2019-08-27 2019-11-22 中国工商银行股份有限公司 Distributed data base on-line rapid estimation method and device
CN110597961A (en) * 2019-09-18 2019-12-20 腾讯科技(深圳)有限公司 Text category labeling method and device, electronic equipment and storage medium
CN110929511A (en) * 2018-09-04 2020-03-27 清华大学 Intelligent matching method for personalized traditional Chinese medicine diagnosis and treatment information and traditional Chinese medicine information based on semantic similarity
CN111552783A (en) * 2020-04-30 2020-08-18 深圳前海微众银行股份有限公司 Content analysis query method, device, equipment and computer storage medium
CN111639153A (en) * 2020-04-24 2020-09-08 平安国际智慧城市科技股份有限公司 Query method and device based on legal knowledge graph, electronic equipment and medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AUPQ131399A0 (en) * 1999-06-30 1999-07-22 Silverbrook Research Pty Ltd A method and apparatus (NPAGE02)
CN101673291A (en) * 2009-10-28 2010-03-17 北京东方灵盾科技有限公司 Method, system and device of data retrieval of patent documents
CN104217032B (en) * 2014-09-28 2017-10-03 北京国双科技有限公司 The processing method and processing device of database dimension
CN110837520A (en) * 2019-10-10 2020-02-25 苏宁云计算有限公司 Data processing method, platform and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant