CN108846103B - Data query method and device - Google Patents


Info

Publication number
CN108846103B
Authority
CN
China
Prior art keywords
query
parameter
subject
character string
name
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810633278.0A
Other languages
Chinese (zh)
Other versions
CN108846103A (en
Inventor
付浩伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Tiangong Matrix Information Technology Co ltd
Original Assignee
Beijing Tiangong Matrix Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Tiangong Matrix Information Technology Co ltd
Priority to CN201810633278.0A
Publication of CN108846103A
Application granted
Publication of CN108846103B
Active
Anticipated expiration

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiment of the invention provides a data query method and device. The method comprises the following steps: receiving a query character string sent by a terminal; performing main body analysis by using a main body library according to the query character string to obtain a corresponding main body name; performing parameter analysis on the query character string by using a corresponding parameter analysis model according to the subject name to obtain corresponding parameter information; and generating a query instruction by the subject name and the parameter information, and querying according to the query instruction. The device is used for executing the method. According to the embodiment of the invention, the main body name in the query character string is obtained through the main body library, the parameter information is obtained by using the corresponding parameter analysis model according to the main body name, and finally the query is carried out according to the query instruction formed by the main body name and the parameter information to obtain the query result.

Description

Data query method and device
Technical Field
The invention relates to the technical field of data processing, in particular to a data query method and device.
Background
Due to the development of automation technology, more and more automation devices or other intelligent devices are used for industrial automation production, and therefore, in order to provide users with the capability of inquiring the relevant parameter information of a certain product, some companies provide online inquiry services.
In the prior art, one query company stores more than 700 million product specification records in order to provide an accurate and comprehensive search service to users. In the database, each product specification includes searchable information such as brand, category, product series, product name, parameters and material number. For example, iC65N-C16A/3P + VEA 30mA is a single SKU with the following information: product name: iC65N-C16A/3P + VEA 30mA; product category: miniature circuit breaker; brand: Schneider Electric; product series: iC65 series miniature circuit breaker; manufacturer material number: 1001; characteristic parameters: breaking capacity type [N type]; number of poles [3 poles]; tripping characteristic [C type]; rated current [16A]; and so on. Key characteristic parameters are reflected in the product name as code characters, but a considerable number of parameters (up to 300 items) are not reflected in the product name.
The search system needs to search and return related product names, material numbers and other peripheral information according to the character string containing the (part of) information input by the user. Common problems that exist in user queries include:
The user's description of particular content may be irregular. For example, 16 amps may be written as 16A, and the brand name Schneider may be written in several variant forms.
The order of the information items in the user's string is not fixed. For example: Schneider 3P16A, or iC65N3P16A Schneider Electric.
The information items in the user's string are incomplete. For example, the string "Schneider 3 pole 16A" contains only the brand and a fraction of the product name: "Schneider" is the brand name, while "3 pole 16A" is best understood as characters from the product name, and possibly as values of product parameters.
Since the query strings input by the user are not all standard, the query result may not be the one desired by the user, and thus the query accuracy is low.
Disclosure of Invention
In view of the above, an object of the embodiments of the present invention is to provide a data query method and apparatus, so as to solve the above technical problems.
In a first aspect, an embodiment of the present invention provides a data query method, including:
receiving a query character string sent by a terminal;
performing main body analysis by using a main body library according to the query character string to obtain a corresponding main body name;
performing parameter analysis on the query character string by using a corresponding parameter analysis model according to the subject name to obtain corresponding parameter information;
and generating a query instruction by the subject name and the parameter information, and querying according to the query instruction.
Further, the method further comprises:
and performing preprocessing operation on the query character string, wherein the preprocessing operation comprises delimiter replacement, subject name pre-identification and parameter information pre-identification.
Further, the method further comprises:
the method comprises the steps of obtaining standard subject names corresponding to all product specifications and all suspected subject names corresponding to all the standard subject names in advance;
and forming the subject library by the standard subject name and the corresponding set of the suspected subject names.
Further, the method further comprises:
the method comprises the steps of obtaining parameter naming rules corresponding to all product specifications in advance, and constructing a corresponding parameter dictionary according to the parameter naming rules corresponding to each product specification, wherein the parameter dictionary comprises the following components: parameter codes, log values of word frequencies and code attribute numbers;
and constructing a corresponding parameter analysis model according to the parameter dictionary.
Further, the method further comprises:
and sequencing query results obtained by querying according to a preset rule, wherein the preset rule comprises any one or combination of similarity, query frequency, click feedback rate and editing distance.
Further, the performing a main body analysis by using a main body library according to the query character string to obtain a main body name corresponding to the query character string includes:
and matching the query character string with the subject name in the subject library by using a regular expression or an Aho-Corasick automata algorithm to obtain the subject name corresponding to the query character string.
Further, the performing parameter analysis on the query string by using a corresponding parameter analysis model according to the main body to obtain corresponding parameter information includes:
carrying out body removing operation on the query character string to obtain a non-body query character string;
and inputting the non-subject query character string into the parameter analysis model, wherein the parameter analysis model uses a dynamic programming algorithm to take the parameter group with the maximum parameter probability as the parameter information.
Further, the querying according to the query instruction includes:
and querying the query instruction by using an Elastic Search engine to obtain a query result.
In a second aspect, an embodiment of the present invention provides a data query apparatus, including:
the receiving module is used for receiving the query character string sent by the terminal;
the main body analysis module is used for carrying out main body analysis by using a main body library according to the query character string to obtain a corresponding main body name;
the parameter analysis module is used for carrying out parameter analysis on the query character string by using a corresponding parameter analysis model according to the subject name to obtain corresponding parameter information;
and the query module is used for generating a query instruction by the subject name and the parameter information and querying according to the query instruction.
Further, the apparatus further comprises:
and the preprocessing module is used for preprocessing the query character string, wherein the preprocessing operation comprises delimiter replacement, subject name pre-identification and parameter information pre-identification.
In a third aspect, an embodiment of the present invention provides an electronic device, including: a processor, a memory, and a bus, wherein,
the processor and the memory are communicated with each other through the bus;
the memory stores program instructions executable by the processor, the processor being capable of performing the method steps of the first aspect when invoked by the program instructions.
In a fourth aspect, an embodiment of the present invention provides a non-transitory computer-readable storage medium, including:
the non-transitory computer readable storage medium stores computer instructions that cause the computer to perform the method steps of the first aspect.
According to the embodiment of the invention, the main body name in the query character string is obtained through the main body library, the parameter information is obtained by using the corresponding parameter analysis model according to the main body name, and finally the query is carried out according to the query instruction formed by the main body name and the parameter information to obtain the query result.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by the practice of the embodiments of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present invention and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained according to the drawings without inventive efforts.
Fig. 1 is a schematic flow chart of a data query method according to an embodiment of the present invention;
Fig. 2 is a schematic flow chart of another data query method according to an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of a data query device according to an embodiment of the present invention;
Fig. 4 is a block diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. The components of embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present invention without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures. Meanwhile, in the description of the present invention, the terms "first", "second", and the like are used only for distinguishing the description, and are not to be construed as indicating or implying relative importance.
Fig. 1 is a schematic flow chart of a data query method according to an embodiment of the present invention, as shown in fig. 1, the method includes:
Step 101: receiving the query character string sent by the terminal.
In a specific implementation process, a user inputs a query character string corresponding to a required product in a search box of a terminal, and a query device receives the query character string sent by the terminal, wherein the query character string comprises a main name and/or parameter information of the required product. It should be noted that when a user enters a query string, its content may not be a canonical query statement specified by the device.
Step 102: performing subject analysis on the query character string by using a subject library to obtain a corresponding subject name.
In a specific implementation, the device obtains a pre-built subject library, which is the collection of subject names for all product specifications. If the total number of product specifications stored in the query device is small, the library can be stored as a text file (one subject name per line); if it is large, the subject names can be stored in a double-array trie. The query string is compared with each subject name in the subject library to obtain the subject names it contains. Note that a query string may contain one or more subject names, and that the subject names are predefined; for example, the brand, category, product series and product name in a product specification are used as subject names.
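The trie-based lookup described above can be sketched as follows. This is a minimal illustration that uses a plain nested-dict trie rather than the double-array trie named in the text; the subject names and the function names `build_trie` and `find_subjects` are hypothetical.

```python
def build_trie(names):
    """Build a nested-dict trie; the key None marks the end of a subject name."""
    root = {}
    for name in names:
        node = root
        for ch in name.lower():
            node = node.setdefault(ch, {})
        node[None] = name  # store the original spelling at the terminal node
    return root


def find_subjects(trie, query):
    """Scan the query left to right, keeping the longest subject name at each position."""
    text = query.lower()
    found, i = [], 0
    while i < len(text):
        node, j, last = trie, i, None
        while j < len(text) and text[j] in node:
            node = node[text[j]]
            j += 1
            if None in node:
                last = (node[None], j)
        if last:
            found.append(last[0])
            i = last[1]  # continue after the matched name
        else:
            i += 1
    return found


# Hypothetical subject names for illustration only.
trie = build_trie(["Schneider", "Schneider Electric", "iC65"])
print(find_subjects(trie, "Schneider Electric iC65N3P16A"))
```

A nested dict is easy to read but uses more memory than a double-array trie, which is presumably why the text recommends the latter for large libraries.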
Step 103: performing parameter analysis on the query character string by using a corresponding parameter analysis model according to the subject name to obtain corresponding parameter information.
In a specific implementation, because the parameter naming rules of different products from different manufacturers differ, each product of each manufacturer has its own parameter analysis model, and the appropriate model can be selected by the subject name. The subject name is removed from the query character string entered by the user, and the remainder is input into the parameter analysis model, which outputs the corresponding parameter information. Note that the parameter analysis model is constructed in advance.
Step 104: generating a query instruction from the subject name and the parameter information, and querying according to the query instruction.
In a specific implementation, after the subject name and parameter information corresponding to the query character string are obtained, a query instruction is generated from them as follows. If the subject names show that the query string refers to two different series, each with its own parameter information, the query instruction is formed from the union of the two subject names with their respective parameter information. If the subject name and parameter information show that the query concerns the same series but different parameters, the query instruction is formed from the intersection of the parameter information. If they show the same series and the same parameter but different parameter values, the query instruction is formed from the union of those values. The generated query instruction is then sent to the database for querying. Note that the database stores the product specification information in advance. Take the Tiangong Matrix database as an example: its product data consists of individual product specification records, that is, the product specification is the basic data unit. A product specification includes information such as the full name, short name, brand, type, price, a number of parameters and accessories. Based on this, the following general product record format can be designed.
Field name    Data type  Meaning               Remarks
id            long       Product ID
vendor        string     Manufacturer
cat           string     Product category
series        string     Series name
shortSeries   string     Series abbreviation
name          string     Full product name
price         float      Price
P1Value       string     Attribute 1 value
P1Code        string     Attribute 1 code
...
PnValue       string     Attribute n value
PnCode        string     Attribute n code
accessory     long       Accessory ID
Because different products have different attributes, an abstract naming mode such as P1.. Pn is adopted for naming the attributes so as to construct a uniform index structure, and specific attribute names can be stored in another data table. The format of this table may be defined as follows:
Field name    Data type  Description
SpecID        Int        Specification ID
PSeq          Int        Parameter number (starting from 1)
PDef          String     Parameter definition (name)
The business data is sorted out according to the product specification, and a normalization program is built to convert the raw business data into normalized data. The normalization program generally processes the raw data with regular expressions and stores the result in the database.
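As a rough illustration of the two tables above, the sketch below flattens a product's parameters into the abstract P1..Pn fields and keeps the real attribute names in a separate definition list. The field names follow the tables; the function name `to_index_record`, the specification ID and the sample values are assumptions made for this example.

```python
def to_index_record(product_id, vendor, cat, series, name, params):
    """Flatten (value, code) pairs into P1Value/P1Code ... PnValue/PnCode fields."""
    record = {"id": product_id, "vendor": vendor, "cat": cat,
              "series": series, "name": name}
    for seq, (value, code) in enumerate(params, start=1):
        record[f"P{seq}Value"] = value
        record[f"P{seq}Code"] = code
    return record


# Rows of the attribute-name table: (SpecID, PSeq, PDef). Hypothetical values.
PARAM_DEFS = [
    (1, 1, "breaking capacity type"),
    (1, 2, "number of poles"),
    (1, 3, "rated current"),
]

record = to_index_record(
    1001, "Schneider Electric", "miniature circuit breaker", "iC65",
    "iC65N-C16A/3P",
    [("N type", "N"), ("3 poles", "3P"), ("16A", "16A")],
)
print(record["P2Value"], record["P2Code"])
```

Keeping the attribute names out of the index record is what lets products with different attribute sets share one index structure.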
According to the embodiment of the invention, the main body name in the query character string is obtained through the main body library, the parameter information is obtained by using the corresponding parameter analysis model according to the main body name, and finally the query is carried out according to the query instruction formed by the main body name and the parameter information to obtain the query result.
On the basis of the above embodiment, the method further includes:
and performing preprocessing operation on the query character string, wherein the preprocessing operation comprises delimiter replacement, subject name pre-identification and parameter information pre-identification.
In a specific implementation process, after receiving a query string sent by a user, a query device needs to perform preprocessing on the query string in order to perform preliminary normalization on the query string, reduce interference, and improve accuracy of performing main body analysis and parameter analysis. The specific content of the pretreatment comprises the following steps:
Separators such as "-" and "/" are uniformly replaced with spaces to facilitate subsequent analysis.
Special multi-word subjects are identified; for example, "XX YY" is treated as one subject and merged into "XX-YY". This requires a list of subject words to be built in advance, after which an Aho-Corasick (AC) automaton performs the recognition.
Specific parameters are identified: certain specific parameters are recognized in advance and converted.
The embodiment of the invention carries out preliminary standardization on the query character string by preprocessing the query character string, reduces interference and improves the accuracy of main body analysis and parameter analysis.
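The three preprocessing steps can be sketched as below. This is a minimal sketch, not the patented implementation: it uses regular-expression substitution in place of the Aho-Corasick automaton mentioned above, and the multi-word subject list and the ampere rewrite rule are hypothetical examples.

```python
import re

# Hypothetical multi-word subjects and their merged forms.
MULTIWORD_SUBJECTS = {"schneider electric": "Schneider-Electric"}
# Hypothetical parameter rewrite: "16 amps" / "16 ampere" -> "16A".
PARAM_REWRITES = [(re.compile(r"(\d+)\s*(?:amps?|amperes?)\b", re.I), r"\1A")]


def preprocess(query):
    # Step 1: replace separators with spaces and collapse repeated whitespace.
    text = re.sub(r"[-/]", " ", query)
    text = re.sub(r"\s+", " ", text).strip()
    # Step 2: merge known multi-word subjects into a single token.
    for phrase, merged in MULTIWORD_SUBJECTS.items():
        text = re.sub(re.escape(phrase), lambda m, s=merged: s, text, flags=re.I)
    # Step 3: pre-identify and convert specific parameters.
    for pattern, repl in PARAM_REWRITES:
        text = pattern.sub(repl, text)
    return text


print(preprocess("Schneider Electric iC65N-C16A/3P 16 amps"))
```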
On the basis of the above embodiment, the method further includes:
the method comprises the steps of obtaining standard subject names corresponding to all product specifications and all suspected subject names corresponding to all the standard subject names in advance;
and forming the subject library by the standard subject name and the corresponding set of the suspected subject names.
In a specific implementation, the subject names corresponding to all product specifications are calibrated in advance to obtain standard subject names, and the suspected subject names that a user might enter for each standard subject name are then collected from experience. For example, if the standard subject name is Schneider, a user may enter a variant spelling or transliteration of it when querying, and each such variant is treated as a suspected subject name. One standard subject name together with its suspected subject names forms one record, and the set of such records forms the subject library.
According to the embodiment of the invention, the subject name in the query character string is identified by constructing the subject library, so that the accurate analysis of different products of different manufacturers is realized, and the accuracy of data query is improved.
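One simple way to hold such a library is a mapping from each standard subject name to its suspected names, inverted into an alias index for lookup. The sketch below assumes this layout; the variant spellings shown are invented for illustration and do not come from the source.

```python
# Standard subject name -> suspected names a user might type (hypothetical).
SUBJECT_LIBRARY = {
    "Schneider": ["Schneider Electric", "Schnieder", "Shneider"],
}


def build_alias_index(library):
    """Map every standard and suspected name (lower-cased) to its standard name."""
    index = {}
    for standard, suspected in library.items():
        index[standard.lower()] = standard
        for alias in suspected:
            index[alias.lower()] = standard
    return index


def canonical(index, token):
    """Return the standard subject name for a token, or None if it is unknown."""
    return index.get(token.lower())


alias_index = build_alias_index(SUBJECT_LIBRARY)
print(canonical(alias_index, "schnieder"))
```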
On the basis of the above embodiment, the method further includes:
the method comprises the steps of obtaining parameter naming rules corresponding to all product specifications in advance, and constructing a corresponding parameter dictionary according to the parameter naming rules corresponding to each product specification, wherein the parameter dictionary comprises the following components: parameter codes, log values of word frequencies and code attribute numbers;
and constructing a corresponding parameter analysis model according to the parameter dictionary.
In a specific implementation, because the parameter naming rules of different products from different manufacturers differ, a parameter dictionary must be constructed separately for each product according to the parameter naming rule of its product specification. The parameter dictionary includes the parameter code, the log value of the word frequency, and the code attribute number; it may also include other parameters, which this embodiment does not specifically limit. If synonymous expressions such as "6A" and "6 ampere" appear in user input, those forms should also be added to the training data. Because different product specifications are used with different frequencies, the input product specification records should be weighted (repeated) by their frequency of use so that the dictionary reflects the true usage distribution.
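A dictionary of this shape could be built as sketched below. The sketch assumes that "log value of the word frequency" means the log of the usage-weighted relative frequency of each code; the record format, the function name and the sample counts are hypothetical.

```python
import math
from collections import Counter


def build_param_dictionary(records):
    """records: iterable of (usage_count, [(code, attribute_number), ...]).

    Returns {code: (log relative frequency, attribute number)}.
    """
    counts, attr = Counter(), {}
    for usage, codes in records:
        for code, attr_no in codes:
            counts[code] += usage  # weight each record by its frequency of use
            attr[code] = attr_no
    total = sum(counts.values())
    return {code: (math.log(n / total), attr[code]) for code, n in counts.items()}


# Two hypothetical specification records, used 3 times and once respectively.
records = [
    (3, [("N", 1), ("3P", 2), ("16A", 4)]),
    (1, [("N", 1), ("2P", 2), ("16A", 4)]),
]
param_dict = build_param_dictionary(records)
print(param_dict["3P"])
```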
The mathematical form of the parametric analysis model is:
$$\hat{W} = \arg\max_{W} P(W \mid O) = \arg\max_{W} \frac{P(O \mid W)\,P(W)}{P(O)}$$
Here O is the query string entered by the user and W is the parameter sequence the user intends to express (that is, the parameter information to be obtained). Because the extraction is based on the input and does not alter it, P(O | W) can be taken as 1 and ignored. It therefore suffices to find the W that maximizes P(W): a dynamic programming algorithm finds this optimal parameter sequence, and the corresponding parameter information is then looked up in the parameter dictionary. Because the number of samples is small, a unigram language model is used, in which P(W) = P(w_1) * P(w_2) * ... * P(w_n). The parameter analysis algorithm therefore only needs to find the segmentation of the user's query string (non-subject part) that maximizes this probability.
According to the embodiment of the invention, the parameter analysis model corresponding to each product specification is constructed, and the parameter information in the query character string can be accurately obtained through the parameter analysis model, so that the query result can be accurately obtained.
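The maximum-probability segmentation under a unigram model can be sketched with a standard dynamic program over string positions. The function name `segment`, the `max_len` limit and the probabilities below are assumptions chosen for illustration; in practice the log-probabilities would come from the parameter dictionary.

```python
import math


def segment(text, log_prob, max_len=6):
    """Split text into dictionary codes so that the sum of log-probabilities is maximal."""
    n = len(text)
    best = [-math.inf] * (n + 1)  # best[i]: best score for text[:i]
    back = [0] * (n + 1)          # back[i]: start index of the last piece in text[:i]
    best[0] = 0.0
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            piece = text[start:end]
            if piece in log_prob and best[start] + log_prob[piece] > best[end]:
                best[end] = best[start] + log_prob[piece]
                back[end] = start
    if best[n] == -math.inf:
        return None  # no segmentation is covered by the dictionary
    pieces, end = [], n
    while end > 0:
        pieces.append(text[back[end]:end])
        end = back[end]
    return pieces[::-1]


# Hypothetical unigram probabilities.
LOG_PROB = {code: math.log(p) for code, p in {
    "N": 0.2, "3P": 0.2, "16A": 0.2, "3": 0.05, "P": 0.05, "1": 0.05, "6A": 0.05,
}.items()}

print(segment("N3P16A", LOG_PROB))
```

Working in log space turns the product P(w_1) * ... * P(w_n) into a sum, which avoids numeric underflow on long strings.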
On the basis of the above embodiment, the method further includes:
and sequencing query results obtained by querying according to a preset rule, wherein the preset rule comprises any one or combination of similarity, query frequency, click feedback rate and editing distance.
In a specific implementation process, after the query result is obtained through the query instruction, there may be more than one query result, and therefore, the order of the query result display needs to be determined, when the query result is sorted, any one or a combination of similarity, query frequency, click feedback rate and edit distance may be considered, and of course, a sorting model may also be established based on a machine learning manner, and the sorting of the query result is output through the sorting model.
According to the embodiment of the invention, the query results are ranked, so that the query results which the user wants to obtain can be ranked at the top, and the user can browse conveniently.
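As one possible combination of the listed criteria, the sketch below ranks results by a weighted sum of edit-distance closeness and click feedback rate. The weights, the scoring formula and the sample data are assumptions; the source only states that any one or a combination of the criteria may be used.

```python
def edit_distance(a, b):
    """Levenshtein distance computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]


def rank(query, results, w_edit=0.7, w_click=0.3):
    """results: list of (product_name, click_feedback_rate); highest score first."""
    def score(item):
        name, click_rate = item
        longest = max(len(query), len(name), 1)
        closeness = 1 - edit_distance(query.lower(), name.lower()) / longest
        return w_edit * closeness + w_click * click_rate
    return sorted(results, key=score, reverse=True)


ranked = rank("iC65N 3P 16A", [("iC65H 2P 32A", 0.5), ("iC65N 3P 16A", 0.5)])
print(ranked[0][0])
```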
On the basis of the above embodiment, the obtaining a subject name corresponding to the query string by performing subject analysis using a subject library according to the query string includes:
and matching the query character string with the subject name in the subject library by using a regular expression or an Aho-Corasick automata algorithm to obtain the subject name corresponding to the query character string.
In a specific implementation, the function of the subject analysis is to find key subject names from the query string. The implementation method comprises two methods:
First, when the number of subject names is small (<1000), a regular expression of the form "A|B|C|D..." can be used for the matching search, where A, B, C and so on are subject names.
Second, when the number of subject names is large, the Aho-Corasick automaton algorithm can be used for an efficient matching search with linear time complexity. The Aho-Corasick automaton algorithm is prior art, and its core concept is not described here. In addition, the matching of subject names may also be implemented with other algorithms, which the embodiment of the present invention does not specifically limit.
According to the embodiment of the invention, the main body library is used for carrying out main body analysis on the query character string, so that the accurate analysis of different products of different manufacturers is realized, and the accuracy of data query is improved.
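The regular-expression variant for small libraries can be sketched as follows. Sorting the alternatives by length is a choice made for this sketch so that the longer name wins when one subject name is a prefix of another; the names are hypothetical.

```python
import re


def compile_subject_pattern(names):
    """Build an "A|B|C" alternation, longest names first, with special characters escaped."""
    ordered = sorted(names, key=len, reverse=True)
    return re.compile("|".join(re.escape(n) for n in ordered), re.IGNORECASE)


pattern = compile_subject_pattern(["Schneider", "iC65", "Schneider Electric"])
matches = [m.group(0) for m in pattern.finditer("schneider electric ic65N3P16A")]
print(matches)
```

Python's `re` tries alternatives from left to right, so without the length sort "Schneider" would match before "Schneider Electric" is ever considered.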
On the basis of the above embodiment, the performing, according to the main body, parameter analysis on the query string by using a corresponding parameter analysis model to obtain corresponding parameter information includes:
carrying out body removing operation on the query character string to obtain a non-body query character string;
and inputting the non-subject query character string into the parameter analysis model, wherein the parameter analysis model uses a dynamic programming algorithm to take the parameter group with the maximum parameter probability as the parameter information.
In a specific implementation, after subject analysis of the query character string yields the corresponding subject name, the subject name is removed from the query character string to obtain a non-subject query character string. The non-subject query character string is input into the parameter analysis model for parameter analysis, and a dynamic programming algorithm outputs the parameter group with the maximum parameter probability as the parameter information. For example, for the query string IC65N3P16A, the subject-removal operation yields the non-subject query string N3P16A. After N3P16A is input to the parameter analysis model, the following parameter matrix is obtained from the parameter dictionary in the model:
    0       1        2         3          4           5
0   P(N)    P(N3)    P(N3P)    P(N3P1)    P(N3P16)    P(N3P16A)
1           P(3)     P(3P)     P(3P1)     P(3P16)     P(3P16A)
2                    P(P)      P(P1)      P(P16)      P(P16A)
3                              P(1)       P(16)       P(16A)
4                                         P(6)        P(6A)
5                                                     P(A)
The dynamic programming algorithm then finds the parameter group with the maximum parameter probability: N, 3P, 16A.
According to the embodiment of the invention, the parameter analysis is carried out on the non-main body query character string through the parameter analysis model, and the parameter information in the query character string can be accurately obtained through the parameter analysis model, so that the query result can be accurately obtained.
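The upper-triangular parameter matrix in the example can be built by looking up every substring in the parameter dictionary. In the sketch below, entry [i][j] holds the score of the substring from position i to position j inclusive, or None when that substring is not a known code; the log-probability values are hypothetical.

```python
def parameter_matrix(text, log_prob):
    """matrix[i][j] = log-probability of text[i..j], or None if unknown or j < i."""
    n = len(text)
    return [
        [log_prob.get(text[i:j + 1]) if j >= i else None for j in range(n)]
        for i in range(n)
    ]


# Hypothetical dictionary scores for three parameter codes.
matrix = parameter_matrix("N3P16A", {"N": -1.0, "3P": -1.5, "16A": -2.0})
for row in matrix:
    print(row)
```

The best parameter group corresponds to a path through this matrix that starts at row 0, ends at column 5, and chains cells so that each cell's row index follows the previous cell's column index.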
On the basis of each of the above embodiments, the performing query according to the query instruction includes:
and querying the query instruction by using an Elastic Search engine to obtain a query result.
In a specific implementation, the query device currently uses Elasticsearch (ES) as the back-end search engine. ES is an open-source distributed search engine based on Lucene. It offers rich functionality and search syntax, flexible ranking configuration, a reliable distributed architecture and high performance, and it is well supported by the open-source community.
The index update program reads the normalized product specification record and writes it to the ES index via ES HTTP API. The search engine provides the underlying search service through the HTTP API of the ES.
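A query instruction could be expressed as an Elasticsearch Query DSL body along the following lines. This sketch only builds the JSON body and does not send it; it assumes the fields from the product record table are indexed as exact-match (keyword) fields, and the function name `build_query` is hypothetical. Clauses under `must` are all required (intersection across different parameters), while a `terms` clause matches any of its listed values (union across values of the same parameter).

```python
import json


def build_query(subjects, params):
    """subjects: {field: subject name}; params: {attribute field: [accepted values]}."""
    must = [{"term": {field: name}} for field, name in subjects.items()]
    for field, values in params.items():
        must.append({"terms": {field: values}})
    return {"query": {"bool": {"must": must}}}


# Same series, two different parameters, two accepted values for the second one.
body = build_query({"series": "iC65"}, {"P2Code": ["3P"], "P4Code": ["16A", "20A"]})
print(json.dumps(body, indent=2))
```

The resulting body would be posted to the search endpoint of the index through the ES HTTP API mentioned above.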
Fig. 2 is a schematic flow chart of another data query method provided in the embodiment of the present invention, as shown in fig. 2, including:
the inquiry device acquires product service data of all product specifications in advance and carries out standardized processing on the product service data;
storing the product service data subjected to normalized processing into an ES index on one hand, updating the ES index, training an offline model as training data on the other hand, and generating an analysis model after training, wherein the analysis model comprises a subject library and a parameter optimization model;
when a user inputs a query character string, the query character string is analyzed through the analysis model to obtain a corresponding main name and parameter information, a query instruction formed by the main name and the parameter information is queried from an ES index through a search engine, and a query result is ranked and displayed through the ranking model.
According to the embodiment of the invention, the main body name in the query character string is obtained through the main body library, the parameter information is obtained by using the corresponding parameter analysis model according to the main body name, and finally the query is carried out according to the query instruction formed by the main body name and the parameter information to obtain the query result.
Fig. 3 is a schematic structural diagram of a data query apparatus according to an embodiment of the present invention, as shown in fig. 3, the apparatus includes: a receiving module 301, a subject analysis module 302, a parameter analysis module 303, and a query module 304, wherein,
the receiving module 301 is configured to receive a query character string sent by a terminal; the subject analysis module 302 is configured to perform subject analysis on the query character string by using a subject library to obtain the corresponding subject name; the parameter analysis module 303 is configured to perform parameter analysis on the query character string by using the parameter analysis model corresponding to the subject name to obtain the corresponding parameter information; the query module 304 is configured to generate a query instruction from the subject name and the parameter information, and to perform a query according to the query instruction.
On the basis of the above embodiment, the apparatus further includes:
and the preprocessing module is used for preprocessing the query character string, wherein the preprocessing operation comprises delimiter replacement, subject name pre-identification and parameter information pre-identification.
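The delimiter replacement part of the preprocessing operation can be sketched as follows. The set of delimiter characters is an assumption chosen for illustration (the embodiment does not list them), and the pre-identification of subject names and parameter information is not shown.

```python
import re

# Hypothetical delimiter set: ASCII and full-width punctuation plus whitespace.
_DELIMITERS = re.compile(r"[,，;；/、|\s]+")

def replace_delimiters(query: str) -> str:
    """Collapse every run of delimiter characters into a single space."""
    return _DELIMITERS.sub(" ", query).strip()

normalized = replace_delimiters("RC0603，10k;  1%/0.1W")
```

With this delimiter set, the mixed-punctuation input above is normalized to space-separated tokens, which makes the later subject and parameter analysis independent of how the user separated the fields.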
On the basis of the above embodiment, the apparatus further includes:
the subject library construction module is used for acquiring standard subject names corresponding to all product specifications and all suspected subject names corresponding to each standard subject name in advance;
and forming the subject library by the standard subject name and the corresponding set of the suspected subject names.
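One possible in-memory form of such a subject library is a lookup table from every standard or suspected subject name to its standard subject name. The sketch below assumes this representation; the example names and the case-insensitive lookup are illustrative choices, not details given in the embodiment.

```python
def build_subject_library(standard_to_suspected):
    """Map every standard and suspected subject name to its standard subject name."""
    library = {}
    for standard, suspected_names in standard_to_suspected.items():
        # The standard name resolves to itself.
        library[standard.lower()] = standard
        # Each suspected (alias) name resolves to the standard name.
        for alias in suspected_names:
            library[alias.lower()] = standard
    return library

# Hypothetical standard names with their suspected spellings.
library = build_subject_library({
    "RC0603": ["RC-0603", "RC 0603"],
    "GRM188": ["GRM-188"],
})
```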
On the basis of the above embodiment, the apparatus further includes:
the parameter analysis model building module is used for obtaining parameter naming rules corresponding to all product specifications in advance and building a corresponding parameter dictionary according to the parameter naming rules corresponding to each product specification, wherein the parameter dictionary comprises: parameter codes, log values of word frequencies and code attribute numbers;
and constructing a corresponding parameter analysis model according to the parameter dictionary.
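A parameter dictionary with the three listed fields might be assembled as in the following sketch. It assumes that the "log value of the word frequency" is the natural logarithm of a code's relative frequency among observed codes, and that the attribute number is an integer identifying which attribute a code belongs to; both are interpretations made for this example.

```python
import math
from collections import Counter

def build_parameter_dictionary(observations):
    """Build {parameter_code: (log_frequency, attribute_number)}.

    `observations` is a list of (parameter_code, attribute_number) pairs,
    one pair per occurrence of a code in the product specification data.
    """
    counts = Counter(code for code, _ in observations)
    total = sum(counts.values())
    attribute_of = {code: attr for code, attr in observations}
    return {
        code: (math.log(count / total), attribute_of[code])
        for code, count in counts.items()
    }

# Hypothetical occurrences: attribute 1 = resistance, 2 = tolerance, 3 = package.
param_dict = build_parameter_dictionary(
    [("10K", 1), ("10K", 1), ("1%", 2), ("0603", 3)]
)
```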
On the basis of the above embodiment, the apparatus further includes:
and the ranking module is used for ranking the query results obtained by the query according to a preset rule, wherein the preset rule comprises any one or a combination of similarity, query frequency, click feedback rate and edit distance.
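Of the listed ranking criteria, edit distance is the simplest to show in isolation. The sketch below computes the Levenshtein distance and orders candidate results by it; treating each result as a plain string and ranking on edit distance alone are simplifying assumptions.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling one-row table."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute or match
            ))
        previous = current
    return previous[-1]

def rank_by_edit_distance(query, results):
    """Order candidate result strings by ascending edit distance to the query."""
    return sorted(results, key=lambda r: edit_distance(query, r))
```

A combined rule could use a tuple or weighted sum as the sort key, for example pairing edit distance with a click feedback rate, but the weighting is not specified in the embodiment.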
On the basis of the above embodiment, the main body analysis module is specifically configured to:
and matching the query character string with the subject name in the subject library by using a regular expression or an Aho-Corasick automata algorithm to obtain the subject name corresponding to the query character string.
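The Aho-Corasick option can be sketched as below: a trie over the subject names is augmented with failure links so that all names occurring in the query are found in one pass. The class and function names are illustrative, and choosing the longest hit as the subject name is an assumption of this example rather than a rule stated in the embodiment.

```python
from collections import deque

class AhoCorasick:
    def __init__(self, patterns):
        self.goto = [{}]   # goto[state][char] -> next state
        self.fail = [0]    # failure link of each state
        self.out = [[]]    # patterns recognized at each state
        for pattern in patterns:
            self._insert(pattern)
        self._build_failure_links()

    def _insert(self, pattern):
        state = 0
        for ch in pattern:
            if ch not in self.goto[state]:
                self.goto.append({})
                self.fail.append(0)
                self.out.append([])
                self.goto[state][ch] = len(self.goto) - 1
            state = self.goto[state][ch]
        self.out[state].append(pattern)

    def _build_failure_links(self):
        # Breadth-first traversal; depth-1 states keep the root as failure link.
        queue = deque(self.goto[0].values())
        while queue:
            state = queue.popleft()
            for ch, nxt in self.goto[state].items():
                queue.append(nxt)
                f = self.fail[state]
                while f and ch not in self.goto[f]:
                    f = self.fail[f]
                self.fail[nxt] = self.goto[f].get(ch, 0)
                # Inherit patterns that end at the failure target (proper suffixes).
                self.out[nxt] = self.out[nxt] + self.out[self.fail[nxt]]

    def find_all(self, text):
        """Return (start_index, pattern) for every occurrence in text."""
        state, hits = 0, []
        for i, ch in enumerate(text):
            while state and ch not in self.goto[state]:
                state = self.fail[state]
            state = self.goto[state].get(ch, 0)
            for pattern in self.out[state]:
                hits.append((i - len(pattern) + 1, pattern))
        return hits

def match_subject(query, subject_names):
    """Return the longest subject name found in the query, or None (assumed policy)."""
    hits = AhoCorasick(subject_names).find_all(query)
    return max(hits, key=lambda h: len(h[1]))[1] if hits else None
```

For a small subject library, a regular expression alternation over the escaped names would serve the same purpose; the automaton becomes worthwhile when the library holds many names, because matching time does not grow with the number of patterns.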
On the basis of the above embodiment, the parameter analysis module is specifically configured to:
performing a subject removal operation on the query character string to obtain a non-subject query character string;
and inputting the non-subject query character string into the parameter analysis model, wherein the parameter analysis model uses a dynamic programming algorithm to select the parameter group with the largest sum of parameter probabilities as the parameter information.
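One plausible reading of this dynamic programming step is a maximum-probability segmentation: among all ways of splitting the remaining string into parameter codes from the dictionary, the split with the highest total log-probability is chosen. The embodiment does not give the recurrence, so the sketch below, including the penalty for characters not covered by the dictionary and the example probabilities, is an assumption.

```python
import math

def best_parameter_group(text, log_prob, unknown_penalty=-20.0):
    """Return the dictionary codes of the highest-scoring segmentation of text.

    best[i] holds the highest total log-probability of any segmentation of text[:i].
    """
    n = len(text)
    best = [0.0] + [-math.inf] * n
    back = [0] * (n + 1)
    max_len = max(map(len, log_prob), default=1)
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            piece = text[start:end]
            if piece in log_prob:
                score = best[start] + log_prob[piece]
            elif end - start == 1:
                # Skip a single unknown character at a fixed (assumed) cost.
                score = best[start] + unknown_penalty
            else:
                continue
            if score > best[end]:
                best[end], back[end] = score, start
    # Follow the back-pointers and keep only pieces that are dictionary codes.
    pieces, end = [], n
    while end > 0:
        start = back[end]
        if text[start:end] in log_prob:
            pieces.append(text[start:end])
        end = start
    return pieces[::-1]

# Hypothetical log-probabilities for a resistor parameter dictionary.
log_prob_table = {
    "10K": math.log(0.3), "10": math.log(0.1), "K": math.log(0.05),
    "1%": math.log(0.2), "0603": math.log(0.35),
}
```

With this table, the string `10K1%` is split into `10K` and `1%` rather than `10`, `K`, and `1%`, because the two-piece split has the higher total log-probability.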
On the basis of the foregoing embodiments, the query module is specifically configured to:
and executing the query instruction with the Elasticsearch engine to obtain a query result.
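A query instruction for ES could take the form of a `bool` query body, as in the following sketch. The field names `subject` and `params.<name>` are hypothetical, and the mapping of the combination rule from the claims (different parameters intersect, different values of the same parameter unite) onto `filter` and `terms` clauses is one possible encoding, not necessarily the one used in the embodiment.

```python
def build_query(subject_name, parameters):
    """Build an ES query body from a subject name and {parameter_name: [values]}."""
    filters = [{"term": {"subject": subject_name}}]
    for name, values in parameters.items():
        # Values of one parameter are alternatives (union);
        # separate filter clauses must all match (intersection).
        filters.append({"terms": {f"params.{name}": list(values)}})
    return {"query": {"bool": {"filter": filters}}}

# Hypothetical parsed query: one series, two resistance values, one tolerance.
query_body = build_query("RC0603", {"resistance": ["10K", "22K"], "tolerance": ["1%"]})
```

The resulting dictionary would be serialized as JSON and sent to the `_search` endpoint of the index.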
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working process of the apparatus described above may refer to the corresponding process in the foregoing method, and will not be described in too much detail herein.
In summary, in the embodiments of the present invention, the subject name in the query string is obtained through the subject library, then the parameter information is obtained according to the subject name by using the corresponding parameter analysis model, and finally the query is performed according to the query instruction composed of the subject name and the parameter information to obtain the query result.
Referring to fig. 4, fig. 4 is a block diagram of an electronic device according to an embodiment of the present invention. The electronic device may include a querying device 401, a memory 402, a memory controller 403, a processor 404, a peripheral interface 405, an input/output unit 406, an audio unit 407, and a display unit 408.
The memory 402, the memory controller 403, the processor 404, the peripheral interface 405, the input/output unit 406, the audio unit 407, and the display unit 408 are electrically connected to each other directly or indirectly, so as to implement data transmission or interaction. For example, the components may be electrically connected to each other via one or more communication buses or signal lines. The querying device 401 includes at least one software function module which can be stored in the memory 402 in the form of software or firmware (firmware) or solidified in an Operating System (OS) of the querying device 401. The processor 404 is configured to execute an executable module stored in the memory 402, such as a software function module or a computer program included in the querying device 401.
The memory 402 may be, but is not limited to, a Random Access Memory (RAM), a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), and the like. The memory 402 stores a program, which the processor 404 executes after receiving an execution instruction. The method performed by the server, as defined by the flows disclosed in any of the foregoing embodiments of the present invention, may be applied to, or implemented by, the processor 404.
The processor 404 may be an integrated circuit chip having signal processing capabilities. The processor 404 may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, or discrete hardware components. The processor may implement or perform the various methods, steps and logic blocks disclosed in the embodiments of the present invention. A general-purpose processor may be a microprocessor, or the processor 404 may be any conventional processor or the like.
The peripheral interface 405 couples various input/output devices to the processor 404 and to the memory 402. In some embodiments, the peripheral interface 405, the processor 404, and the memory controller 403 may be implemented in a single chip. In other examples, they may each be implemented as separate chips.
The input/output unit 406 allows a user to provide input data so that the user can interact with the server (or the local terminal). The input/output unit 406 may be, but is not limited to, a mouse, a keyboard, and the like.
Audio unit 407 provides an audio interface to the user, which may include one or more microphones, one or more speakers, and audio circuitry.
The display unit 408 provides an interactive interface (e.g., a user interface) between the electronic device and a user, or displays image data for the user's reference. In this embodiment, the display unit 408 may be a liquid crystal display or a touch display. A touch display may be a capacitive or resistive touch screen that supports single-point and multi-point touch operations. Supporting single-point and multi-point touch operations means that the touch display can sense touch operations at one or more locations on the display at the same time and send the sensed touch operations to the processor 404 for calculation and processing.
It will be appreciated that the configuration shown in fig. 4 is merely illustrative and that the electronic device may include more or fewer components than shown in fig. 4 or may have a different configuration than shown in fig. 4. The components shown in fig. 4 may be implemented in hardware, software, or a combination thereof.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method can be implemented in other ways. The apparatus embodiments described above are merely illustrative, and for example, the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, the functional modules in the embodiments of the present invention may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and other media capable of storing program code.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention. It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
The above description is only for the specific embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present invention, and all the changes or substitutions should be covered within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.

Claims (10)

1. A method for querying data, comprising:
receiving a query character string sent by a terminal;
performing subject analysis on the query character string by using a subject library to obtain a corresponding subject name; the subject library refers to a set of subject names of all product specifications;
performing parameter analysis on the query character string by using a corresponding parameter analysis model according to the subject name to obtain corresponding parameter information;
generating a query instruction by the subject name and the parameter information, and querying according to the query instruction;
the generating a query instruction by the subject name and the parameter information, and querying according to the query instruction includes:
if it is determined from the subject names that the query character string contains subject names of two different series, each subject name having its own parameter information, merging the two subject names and their respective parameter information to form a query instruction; if it is determined from the subject name and the parameter information that the query targets the same series but different parameters, taking the intersection of the parameter information to form a query instruction; if it is determined from the subject name and the parameter information that the query targets the same series and the same parameter but different parameter values, taking the union to form a query instruction.
2. The method of claim 1, further comprising:
and performing preprocessing operation on the query character string, wherein the preprocessing operation comprises delimiter replacement, subject name pre-identification and parameter information pre-identification.
3. The method of claim 1, further comprising:
the method comprises the steps of obtaining standard subject names corresponding to all product specifications and all suspected subject names corresponding to all the standard subject names in advance;
and forming the subject library by the standard subject name and the corresponding set of the suspected subject names.
4. The method of claim 1, further comprising:
the method comprises the steps of obtaining parameter naming rules corresponding to all product specifications in advance, and constructing a corresponding parameter dictionary according to the parameter naming rules corresponding to each product specification, wherein the parameter dictionary comprises the following components: parameter codes, log values of word frequencies and code attribute numbers;
and constructing a corresponding parameter analysis model according to the parameter dictionary.
5. The method of claim 1, further comprising:
and ranking the query results obtained by the query according to a preset rule, wherein the preset rule comprises any one or a combination of similarity, query frequency, click feedback rate and edit distance.
6. The method according to claim 1, wherein the performing a subject analysis by using a subject library according to the query string to obtain a subject name corresponding to the query string comprises:
and matching the query character string with the subject name in the subject library by using a regular expression or an Aho-Corasick automata algorithm to obtain the subject name corresponding to the query character string.
7. The method of claim 1, wherein the performing parameter analysis on the query string according to the subject name by using a corresponding parameter analysis model to obtain corresponding parameter information comprises:
performing a subject removal operation on the query character string to obtain a non-subject query character string;
and inputting the non-subject query character string into the parameter analysis model, wherein the parameter analysis model uses a dynamic programming algorithm to select the parameter group with the largest sum of parameter probabilities as the parameter information.
8. The method according to any one of claims 1-7, wherein said performing a query according to the query instruction comprises:
and executing the query instruction with the Elasticsearch engine to obtain a query result.
9. A data query apparatus, comprising:
the receiving module is used for receiving the query character string sent by the terminal;
the subject analysis module is used for performing subject analysis on the query character string by using a subject library to obtain a corresponding subject name; the subject library refers to a set of subject names of all product specifications;
the parameter analysis module is used for carrying out parameter analysis on the query character string by using a corresponding parameter analysis model according to the subject name to obtain corresponding parameter information;
the query module is used for generating a query instruction by the subject name and the parameter information and querying according to the query instruction;
the query module is specifically configured to:
if it is determined from the subject names that the query character string contains subject names of two different series, each subject name having its own parameter information, merging the two subject names and their respective parameter information to form a query instruction; if it is determined from the subject name and the parameter information that the query targets the same series but different parameters, taking the intersection of the parameter information to form a query instruction; if it is determined from the subject name and the parameter information that the query targets the same series and the same parameter but different parameter values, taking the union to form a query instruction.
10. The apparatus of claim 9, further comprising:
and the preprocessing module is used for preprocessing the query character string, wherein the preprocessing operation comprises delimiter replacement, subject name pre-identification and parameter information pre-identification.
CN201810633278.0A 2018-06-19 2018-06-19 Data query method and device Active CN108846103B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810633278.0A CN108846103B (en) 2018-06-19 2018-06-19 Data query method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810633278.0A CN108846103B (en) 2018-06-19 2018-06-19 Data query method and device

Publications (2)

Publication Number Publication Date
CN108846103A CN108846103A (en) 2018-11-20
CN108846103B true CN108846103B (en) 2021-01-15

Family

ID=64203036

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810633278.0A Active CN108846103B (en) 2018-06-19 2018-06-19 Data query method and device

Country Status (1)

Country Link
CN (1) CN108846103B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110287203B (en) * 2019-05-24 2021-03-26 北京百度网讯科技有限公司 Updating method and updating device for vending machine and vending machine

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101131706A (en) * 2007-09-28 2008-02-27 北京金山软件有限公司 Query amending method and system thereof
CN101916263A (en) * 2010-07-27 2010-12-15 武汉大学 Fuzzy keyword query method and system based on weighing edit distance
CN102880614A (en) * 2011-07-15 2013-01-16 阿里巴巴集团控股有限公司 Data searching method and equipment
CN107977422A (en) * 2017-11-27 2018-05-01 中国电子科技集团公司第二十八研究所 A kind of Method of Fuzzy Matching for equipping model name

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7630982B2 (en) * 2007-02-24 2009-12-08 Trend Micro Incorporated Fast identification of complex strings in a data stream

Also Published As

Publication number Publication date
CN108846103A (en) 2018-11-20

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant