CN115809311A - Data processing method and device of knowledge graph and computer equipment - Google Patents


Info

Publication number
CN115809311A
Authority
CN
China
Prior art keywords
data
real
time information
information
knowledge graph
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211654507.XA
Other languages
Chinese (zh)
Inventor
张宝利
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qichacha Technology Co ltd
Original Assignee
Qichacha Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qichacha Technology Co ltd filed Critical Qichacha Technology Co ltd
Priority to CN202211654507.XA priority Critical patent/CN115809311A/en
Publication of CN115809311A publication Critical patent/CN115809311A/en
Pending legal-status Critical Current

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The disclosure relates to the technical field of knowledge graphs, and particularly discloses a data processing method and device of a knowledge graph and computer equipment, wherein the method comprises the following steps: acquiring historical information of a data producer, and establishing a data file mapping table according to the historical information; importing the data file mapping table into a knowledge graph to form base data of the knowledge graph; acquiring real-time information of the data producer, and sending the updated real-time information to a message middleware under the condition that the content of the real-time information is determined to be updated according to the historical information; updating base data of the knowledge-graph with the message middleware. According to the method, after it is determined from the historical information that the real-time information contains updated real-time information, the updated real-time information is imported into the knowledge graph through the message middleware, so that the knowledge graph can be updated in time, the stability of data written into the knowledge graph is improved, and the stability of the service provided by the knowledge graph is in turn ensured.

Description

Data processing method and device of knowledge graph and computer equipment
Technical Field
The present disclosure relates to the field of knowledge graph technologies, and in particular, to a method and an apparatus for processing data of a knowledge graph, and a computer device.
Background
With the advent of the internet and the big data era, the interconnection of everything has become possible, and the data generated by this interconnection has grown explosively; such data serves as effective raw material for analyzing relationships. Knowledge graphs are therefore widely used in scenarios such as data mining and data analysis.
In the related art, the data in a knowledge graph is often N+1 data, that is, the full data for a given day is imported at night when traffic is low, and the knowledge graph service is suspended during the import, which greatly limits both the real-time performance of the knowledge graph data and the stability of the service.
Disclosure of Invention
In view of the above technical problems, it is necessary to provide a data processing method and apparatus for a knowledge graph, a computer device, a storage medium and a computer program product.
In a first aspect, the present disclosure provides a method for data processing of a knowledge-graph. The method comprises the following steps:
acquiring historical information of a data producer, and establishing a data file mapping table according to the historical information;
importing the data file mapping table into a knowledge graph to form base data of the knowledge graph;
acquiring real-time information of the data producer, and sending the updated real-time information to a message middleware under the condition that the content of the real-time information is determined to be updated according to the historical information;
updating base data of the knowledge-graph with the message middleware.
In one embodiment, the obtaining of the real-time information of the data producer and the sending of the updated real-time information to the message middleware in the case that the content of the real-time information is determined to be updated according to the history information includes:
judging whether the information abstract of the real-time information is consistent with the information abstract of the historical information in the data file mapping table or not;
and in response to the fact that the information abstract of the real-time information is inconsistent with the information abstract of the historical information in the data file mapping table, determining the real-time information with inconsistent information abstract as the updated real-time information.
In one embodiment, the method further comprises:
and in response to the inconsistency between the information abstract of the real-time information and the information abstract of the historical information in the data file mapping table, updating the data file mapping table according to the updated real-time information.
In one embodiment, the message middleware comprises a plurality of partitions, and the obtaining the real-time information of the data producer and sending the updated real-time information to the message middleware when the content of the real-time information is determined to be updated according to the historical information further comprises:
and sending the updated real-time information to a partition corresponding to the message middleware according to the key field of the updated real-time information.
In one embodiment, the updating the base data of the knowledge-graph with the message middleware comprises:
determining an updating write-in flow threshold value of the knowledge graph according to the data reading flow of the knowledge graph;
and updating the base data of the knowledge graph according to the updated write flow threshold.
In one embodiment, the importing the data file mapping table into a knowledge graph to form base data of the knowledge graph comprises:
and importing the whole data file mapping table into a database of the knowledge graph through a distributed computing engine.
In a second aspect, the present disclosure also provides a data processing apparatus for a knowledge-graph. The device comprises:
the historical data module is used for acquiring historical information of a data producer and establishing a data file mapping table according to the historical information;
the mapping table importing module is used for importing the data file mapping table into a knowledge graph to form base data of the knowledge graph;
the real-time data module is used for acquiring the real-time information of the data producer and sending the updated real-time information to the message middleware under the condition that the content of the real-time information is determined to be updated according to the historical information;
and the knowledge graph updating module is used for updating the base data of the knowledge graph by utilizing the message middleware.
In one embodiment, the real-time data module comprises:
the information abstract unit is used for judging whether the information abstract of the real-time information is consistent with the information abstract of the historical information in the data file mapping table or not;
and the updating determining unit is used for responding to the inconsistency between the information abstract of the real-time information and the information abstract of the historical information in the data file mapping table, and determining the real-time information with the inconsistent information abstract as the updated real-time information.
In one embodiment, the apparatus further comprises:
and the mapping table updating module is used for responding to the inconsistency between the information abstract of the real-time information and the information abstract of the historical information in the data file mapping table and updating the data file mapping table according to the updated real-time information.
In one embodiment, the message middleware includes a plurality of partitions, and the update sending unit is further configured to send the updated real-time information to the partition corresponding to the message middleware according to the key field of the updated real-time information.
In one embodiment, the knowledge-graph update module comprises:
the updating write-in flow threshold unit is used for determining an updating write-in flow threshold of the knowledge graph according to the data reading flow of the knowledge graph;
and the updating and writing unit is used for updating the base data of the knowledge graph according to the updating and writing flow threshold value.
In one embodiment, the mapping table importing module includes:
and the calculation engine unit is used for importing the whole data file mapping table into the database of the knowledge graph through a distributed calculation engine.
In a third aspect, the present disclosure also provides a computer device. The computer device comprises a memory storing a computer program and a processor implementing the steps of the data processing method of the knowledge-graph when executing the computer program.
In a fourth aspect, the present disclosure also provides a computer-readable storage medium. The computer-readable storage medium stores a computer program which, when executed by a processor, carries out the steps of the above-mentioned data processing method of a knowledge-graph.
In a fifth aspect, the present disclosure also provides a computer program product. The computer program product comprises a computer program which, when executed by a processor, performs the steps of the above-described data processing method of a knowledge-graph.
The data processing method, the data processing device, the computer equipment, the storage medium and the computer program product of the knowledge graph at least have the following beneficial effects:
In the method, data is imported into the knowledge graph through the data file mapping table, which reduces the probability of data disorder caused by writing data directly and allows the data to be analyzed later on the basis of the data file mapping table. In addition, after it is determined from the historical information that the real-time information contains updated real-time information, the updated real-time information is imported into the knowledge graph through the message middleware, so that the knowledge graph can be updated in time, the stability of data written into the knowledge graph is improved, and the stability of the service provided by the knowledge graph is in turn ensured. Meanwhile, writing the updated real-time information into the message middleware allows non-essential business logic to run asynchronously, which speeds up responses. When write concurrency is high, the message middleware acts as a buffer and feeds information into the knowledge graph gradually, which avoids database connection exceptions. The message middleware also decouples data production from data consumption, so that writing data into the message middleware and reading data from it do not interfere with each other. Finally, passing the real-time information through the message middleware helps persist it, which facilitates subsequent troubleshooting and analysis.
Drawings
In order to more clearly illustrate the embodiments of the present disclosure or technical solutions in the conventional technologies, the drawings used in the description of the embodiments or conventional technologies will be briefly introduced below, it is obvious that the drawings in the following description are only some embodiments of the present disclosure, and other drawings can be obtained by those skilled in the art without creative efforts.
FIG. 1 is a diagram of an application environment of a data processing method of a knowledge graph in one embodiment;
FIG. 2 is a schematic flow chart diagram illustrating a data processing method for a knowledge-graph in one embodiment;
FIG. 3 is a schematic flow chart diagram of a data processing method for a knowledge-graph in another embodiment;
FIG. 4 is a data flow diagram of a data processing method of a knowledge-graph in one embodiment;
FIG. 5 is a schematic flow chart diagram of a data processing method of a knowledge-graph in another embodiment;
FIG. 6 is a block diagram of a data processing apparatus for a knowledge-graph in one embodiment;
FIG. 7 is a block diagram of a data processing apparatus of a knowledge-graph in another embodiment;
FIG. 8 is a block diagram of a data processing apparatus of a knowledge-graph in another embodiment;
FIG. 9 is a block diagram showing an internal configuration of a computer device according to an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more clearly understood, the present application is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs. The terminology used herein in the description of the disclosure is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure.
It should be noted that the terms "first," "second," and the like in the description and claims of the present disclosure and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the disclosure described herein are capable of operation in sequences other than those illustrated or otherwise described herein. The implementations described in the exemplary embodiments below do not represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims. The terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, the presence of additional identical or equivalent elements in processes, methods, articles, or apparatus that include the recited elements is not excluded. For example, if the terms first, second, etc. are used to denote names, they do not denote any particular order.
As used herein, the singular forms "a", "an" and "the" may include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises/comprising," "includes" or "including," or "having," and the like, specify the presence of stated features, integers, steps, operations, components, parts, or combinations thereof, but do not preclude the presence or addition of one or more other features, integers, steps, operations, components, parts, or combinations thereof. Also, in this specification, the term "and/or" includes any and all combinations of the associated listed items.
The data processing method of the knowledge graph provided by the embodiments of the present disclosure can be applied to the application environment shown in fig. 1, in which the terminal 102 communicates with the server 104 via a network. The server 104 includes a knowledge graph database and provides a knowledge graph service to the terminal 102. The server 104 may obtain data generated by public data sources and a business system, which may be a Hadoop distributed file system (i.e., HDFS). The server 104 may obtain the data generated by the public data sources and the business system in real time, select the update data in which the real-time data has changed, and import the update data into the knowledge graph database in time. The knowledge graph database may be integrated on the server 104, or may be placed on the cloud or on another network server. The terminal 102 may be, but is not limited to, a personal computer, notebook computer, smart phone, tablet computer, internet of things device or portable wearable device. The internet of things device may be a smart speaker, smart television, smart air conditioner, smart in-vehicle device, and the like. The portable wearable device may be a smart watch, a smart bracelet, a head-mounted device, and the like. The server 104 may be implemented as a stand-alone server or as a server cluster composed of multiple servers.
In some embodiments of the present disclosure, as shown in fig. 2, a data processing method of a knowledge graph is provided. Taking the application of the method 200 to the server in fig. 1 as an example, the method includes the following steps:
Step 210, acquiring historical information of a data producer, and establishing a data file mapping table according to the historical information.
The data producer may include a public data source and a service system, and the public data source may refer to a public website, a public database, and other data sources open to the public. A business system may refer to a system for performing business links required for a particular task, enabling interaction between a service provider and a user.
Illustratively, the server may obtain historical information generated by the public data source and the business system through a data interface and store the historical information in the data file mapping table. The data file mapping table can be built on Hive, a Hadoop-based data warehouse tool used for data extraction, transformation and loading; it provides a mechanism for storing, querying and analyzing large-scale data stored in Hadoop. Hive can map a structured data file to a Hive table (i.e., a data file mapping table) and provides SQL (Structured Query Language) query capability.
Step 220, importing the data file mapping table into a knowledge graph to form base data of the knowledge graph.
The knowledge graph can be constructed based on the requirements of the business system and is used for providing knowledge graph service for the business system. The base data can form a knowledge graph through information extraction, knowledge fusion, knowledge processing and the like.
Illustratively, the established data file mapping table is imported into the knowledge graph database according to the schema of the knowledge graph. The schema of the knowledge graph explicitly defines the entities, attributes and relations in the knowledge graph, as well as the applicable scope of the knowledge graph. Optionally, the knowledge graph database may be a Nebula database, which is a distributed and scalable graph database.
Optionally, before the data file mapping table is imported into the knowledge graph, data cleaning may be performed by means of HQL (Hive SQL) so that the data in the data file mapping table satisfies the data format required by the knowledge graph. The data cleaning methods may include, but are not limited to, missing value filling, value replacement, data type conversion, data sorting and duplicate value handling; the quality of the cleaned data directly affects the result of the final data analysis.
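The cleaning operations listed above can be sketched in plain Python. This is only an illustration of the kinds of steps involved, not the HQL statements the disclosure refers to; the function name `clean_rows` and the field names `id`, `name` and `capital` are hypothetical.

```python
from typing import Any, Dict, Iterable, List


def clean_rows(rows: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Apply a few of the cleaning steps named above to raw rows."""
    seen_ids = set()
    cleaned: List[Dict[str, Any]] = []
    for row in rows:
        row_id = row.get("id")
        # Duplicate value handling: keep only the first row seen for each id.
        if row_id is None or row_id in seen_ids:
            continue
        seen_ids.add(row_id)
        # Missing value filling: fall back to a placeholder name.
        name = (row.get("name") or "unknown").strip()
        # Data type conversion: store the capital as a float, 0 if missing.
        capital = float(row.get("capital") or 0)
        cleaned.append({"id": str(row_id), "name": name, "capital": capital})
    # Data sorting: order the output by id.
    cleaned.sort(key=lambda r: r["id"])
    return cleaned
```

In a Hive-based deployment the same logic would more likely be expressed as a query over the mapping table; the Python form is used here only to keep the example self-contained.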
Step 230, acquiring the real-time information of the data producer, and sending the updated real-time information to the message middleware under the condition that the content of the real-time information is determined to be updated according to the historical information.
For example, the server may monitor the real-time information generated by the data producer and compare it with the historical information to determine whether updated real-time information exists. The server may then send the updated real-time information so determined to the message middleware.
Optionally, the server may monitor the real-time information generated by the data producer in several ways, such as through Redis or through manual triggering. Redis (Remote Dictionary Server) is a high-performance key-value database that supports a publish/subscribe mechanism, so the server can subscribe to a channel and receive the complete records of the messages published by the data producer. The message middleware may be Kafka, a high-throughput distributed publish-subscribe messaging system.
Optionally, when sending the updated real-time information to the message middleware, the server may send either the changed data fields of the real-time information or the whole set of data objects of the real-time information.
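The choice between sending only the changed fields and sending the whole data object can be illustrated with a short Python sketch. The message layout (`key` and `payload`) and the function names are assumptions made for this example; the disclosure does not specify a message format.

```python
from typing import Any, Dict

# Sentinel used to tell "field absent" apart from "field is None".
_MISSING = object()


def changed_fields(old: Dict[str, Any], new: Dict[str, Any]) -> Dict[str, Any]:
    """Return only the fields of `new` that are added or differ from `old`."""
    return {k: v for k, v in new.items() if old.get(k, _MISSING) != v}


def build_message(key: str, old: Dict[str, Any], new: Dict[str, Any],
                  full_object: bool = False) -> Dict[str, Any]:
    """Build a message carrying either the whole object or just the delta."""
    payload = dict(new) if full_object else changed_fields(old, new)
    return {"key": key, "payload": payload}
```

Note that this delta only covers added or modified fields; a field removed from the new object would need separate handling, which is one reason a sender might prefer the whole data object.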
Step 240, updating base data of the knowledge-graph with the message middleware.
After the server sends the updated real-time information to the message middleware, the base data of the knowledge graph can be further updated according to the updated real-time information in the message middleware, so that the real-time performance of the knowledge graph is guaranteed.
According to the data processing method of the knowledge graph described above, data is imported into the knowledge graph through the data file mapping table, which reduces the probability of data disorder caused by writing data directly and allows the data to be analyzed later on the basis of the data file mapping table. In addition, after it is determined from the historical information that the real-time information contains updated real-time information, the updated real-time information is imported into the knowledge graph through the message middleware, so that the knowledge graph can be updated in time, the stability of data written into the knowledge graph is improved, and the stability of the service provided by the knowledge graph is in turn ensured. Meanwhile, writing the updated real-time information into the message middleware allows non-essential business logic to run asynchronously, which speeds up responses. When write concurrency is high, the message middleware acts as a buffer and feeds information into the knowledge graph gradually, which avoids database connection exceptions. The message middleware also decouples data production from data consumption, so that writing data into the message middleware and reading data from it do not interfere with each other. Finally, passing the real-time information through the message middleware persists it, which facilitates subsequent troubleshooting and analysis.
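Steps 210 to 240 can be traced end to end with a small in-memory sketch. Everything here is a stand-in chosen for illustration: plain dictionaries play the roles of the data file mapping table and the knowledge graph database, and Python's `queue.Queue` plays the role of the message middleware. The function name `run_pipeline` is hypothetical.

```python
import queue
from typing import Dict, List, Tuple


def run_pipeline(history: Dict[str, str],
                 realtime: Dict[str, str]) -> Tuple[Dict[str, str], List[str]]:
    """Trace steps 210-240 with in-memory stand-ins."""
    # Step 210: build the mapping table from the historical information.
    mapping_table = dict(history)
    # Step 220: full import of the mapping table as the graph's base data.
    graph = dict(mapping_table)
    # Step 230: send only changed real-time records to the middleware.
    middleware = queue.Queue()
    sent_keys: List[str] = []
    for key, value in realtime.items():
        if mapping_table.get(key) != value:
            middleware.put((key, value))
            sent_keys.append(key)
    # Step 240: consume from the middleware and update the base data.
    while not middleware.empty():
        key, value = middleware.get()
        graph[key] = value
    return graph, sent_keys
```

Because the producer side (step 230) and the consumer side (step 240) only share the queue, either side could be slowed down or restarted without changing the other, which is the decoupling effect described above.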
In some embodiments of the present disclosure, as shown in fig. 3, step 230 comprises:
Step 232, judging whether the information abstract of the real-time information is consistent with the information abstract of the historical information in the data file mapping table.
For example, the server may compute the information abstract (i.e., the message digest) of the real-time information and of the historical information and determine whether the two are consistent. Optionally, the message digest may be generated using MD5 (the MD5 Message-Digest Algorithm), a widely used cryptographic hash function that produces a 128-bit (16-byte) hash value (i.e., the message digest). If the hash values (i.e., message digests) differ, the data itself has changed.
Step 234, in response to the fact that the information digest of the real-time information is inconsistent with the information digest of the history information in the data file mapping table, determining the real-time information with inconsistent information digest as the updated real-time information.
For example, in combination with the data flow diagram of the data processing method of the knowledge graph provided in this embodiment shown in fig. 4, when the server determines that the information digest of the real-time information is inconsistent with the information digest of the history information in the data file mapping table, the server may determine that the real-time information with inconsistent information digest is updated real-time information, and trigger an action of sending the updated real-time information to the message middleware.
In this embodiment, by judging whether the information abstract of the real-time information is consistent with that of the historical information, and treating the real-time information whose information abstract is inconsistent as the updated real-time information, the updated real-time information can be determined more conveniently and efficiently.
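The digest comparison of steps 232 and 234 can be sketched with Python's standard `hashlib` module. The serialization choice (JSON with sorted keys) is an assumption of this example, made so that field order does not affect the digest; the disclosure does not say how records are serialized before hashing.

```python
import hashlib
import json
from typing import Any, Dict


def message_digest(record: Dict[str, Any]) -> str:
    """Return the MD5 digest of a record as a 32-character hex string."""
    # Sorted keys give the same bytes regardless of field order.
    payload = json.dumps(record, sort_keys=True, ensure_ascii=False)
    return hashlib.md5(payload.encode("utf-8")).hexdigest()


def is_updated(realtime: Dict[str, Any], stored_digest: str) -> bool:
    """True when the real-time record's digest differs from the stored one."""
    return message_digest(realtime) != stored_digest
```

Comparing fixed-length digests avoids a field-by-field comparison of every record; MD5 is used here for change detection only, not for any security purpose.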
In some embodiments of the present disclosure, the method further comprises:
and in response to the inconsistency between the information abstract of the real-time information and the information abstract of the historical information in the data file mapping table, updating the data file mapping table according to the updated real-time information.
Illustratively, when the server determines that the information digest of the real-time information is inconsistent with the information digest of the historical information in the data file mapping table, the server further triggers updating of the historical information in the data file mapping table according to the real-time information, so as to maintain the real-time performance of the data file mapping table. It should be noted that, in the subsequent step of repeatedly determining whether the information summary of the real-time information is consistent with the information summary of the history information in the data file mapping table, the data file mapping table is a real-time updated data file mapping table.
Optionally, when the server sends the updated real-time information to the message middleware, the updated real-time information may also be written into the data file mapping table at the same time. The data file mapping table can be read and queried through the HDFS system so that the real-time information can be analyzed.
In this embodiment, the data file mapping table is updated in time as the real-time information changes. The real-time information is stored in the data file mapping table, where it can be read and queried through HDFS. This makes it easier to correct data against the whole set of data objects when the real-time information is imported into the knowledge graph, reduces the probability of data disorder caused by real-time writes, and supports subsequent data tracking and analysis.
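Keeping the stored digests current matters because each later comparison must be made against the latest known state, not the original history. A minimal sketch follows, assuming an in-memory dictionary in place of the data file mapping table; the class name `DigestTable` and its methods are hypothetical.

```python
import hashlib
import json
from typing import Any, Dict


class DigestTable:
    """In-memory stand-in for the digest column of the mapping table."""

    def __init__(self) -> None:
        self._digests: Dict[str, str] = {}

    @staticmethod
    def _digest(record: Dict[str, Any]) -> str:
        payload = json.dumps(record, sort_keys=True).encode("utf-8")
        return hashlib.md5(payload).hexdigest()

    def load_history(self, key: str, record: Dict[str, Any]) -> None:
        """Record the digest of a historical record."""
        self._digests[key] = self._digest(record)

    def observe(self, key: str, record: Dict[str, Any]) -> bool:
        """Return True if the record changed, and store its new digest."""
        new_digest = self._digest(record)
        if self._digests.get(key) == new_digest:
            return False
        # Update the table so repeated deliveries are not flagged again.
        self._digests[key] = new_digest
        return True
```

With this update in place, a record that arrives twice with the same content is reported as changed only the first time.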
In some embodiments of the present disclosure, step 230 further comprises:
and sending the updated real-time information to a partition corresponding to the message middleware according to the key field of the updated real-time information.
For example, the message middleware may include a plurality of partitions, and the server may send the updated real-time information to the corresponding partition of the message middleware according to its key field and a preset key field rule, so that updated real-time information for the same piece of data enters the same partition. Optionally, the key field may be selected according to the characteristics and timing of the data.
In this embodiment, the updated real-time information is sent to the corresponding partition of the message middleware according to the key field, so that updated real-time information for the same piece of data enters the same partition, which strengthens data management and improves the efficiency of writing to the knowledge graph.
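Key-based partition selection can be illustrated with a stable hash of the key field taken modulo the number of partitions. The use of CRC32 is an assumption of this sketch, chosen because it is deterministic across processes; real middleware clients generally ship their own partitioner, and the disclosure does not specify the preset key field rule.

```python
import zlib
from typing import Dict, Iterable, List, Tuple


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key field to a partition index using a stable hash."""
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def route(messages: Iterable[Tuple[str, str]],
          num_partitions: int) -> Dict[int, List[Tuple[str, str]]]:
    """Group (key, payload) messages by partition, preserving arrival order."""
    partitions: Dict[int, List[Tuple[str, str]]] = {
        i: [] for i in range(num_partitions)
    }
    for key, payload in messages:
        partitions[partition_for(key, num_partitions)].append((key, payload))
    return partitions
```

Since all updates for one key land in one partition in arrival order, a consumer reading that partition applies them in the order they were produced.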
In some embodiments of the present disclosure, as shown in fig. 5, step 240 comprises:
Step 242, determining an update write traffic threshold of the knowledge graph according to the data read traffic of the knowledge graph.
Illustratively, after the updated real-time information is written into the message middleware, the server acquires the real-time data read traffic of the knowledge graph and determines the update write traffic threshold of the knowledge graph from it. The update write traffic threshold is used to ensure that the data read traffic of the knowledge graph is not affected, which in turn ensures the stability of the service provided by the knowledge graph.
Step 244, updating the base data of the knowledge-graph according to the updated write flow threshold.
Illustratively, the update write traffic threshold may be adjusted according to the real-time data read traffic of the knowledge graph. The server writes the updated real-time information held in the message middleware into the knowledge graph database at a rate that does not exceed the update write traffic threshold determined in real time.
In this embodiment, the update write traffic threshold of the knowledge graph is determined from the data read traffic of the knowledge graph, and the updated real-time information in the message middleware is written into the knowledge graph database without exceeding that threshold. This ensures the stability of the service provided by the knowledge graph while the updated real-time information is written into the knowledge graph in time, improving the real-time performance of the knowledge graph.
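One way to read steps 242 and 244 is as a write budget that shrinks as read traffic grows. The sketch below assumes a simple model in which the database has a fixed total capacity and writes may use whatever reads leave free; this formula, the parameter names and the one-second batching interval are all assumptions of the example, since the disclosure does not state how the threshold is computed.

```python
from collections import deque
from typing import Any, Deque, List


def update_write_threshold(read_qps: float, total_capacity_qps: float,
                           min_write_qps: float = 0.0) -> float:
    """Assumed model: writes may use the capacity that reads leave free."""
    headroom = total_capacity_qps - read_qps
    return max(min_write_qps, headroom)


def take_batch(pending: Deque[Any], write_threshold: float) -> List[Any]:
    """Remove at most `write_threshold` messages for one one-second interval."""
    budget = int(write_threshold)
    batch: List[Any] = []
    while pending and len(batch) < budget:
        batch.append(pending.popleft())
    return batch
```

A consumer loop would recompute the threshold each interval from the current read traffic and then write only the batch returned by `take_batch`, leaving the remaining messages buffered in the middleware.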
In some embodiments of the present disclosure, step 220 comprises:
and importing the whole data file mapping table into a database of the knowledge graph through a distributed computing engine.
Illustratively, the server may be imported through the distributed computing engine during the process of importing the data filemap table into the knowledge-graph. Optionally, the distributed computing engine may use Spark, which is an open source cluster computing environment similar to Hadoop, and which enables a memory distributed data set that optimizes the iterative workload in addition to providing interactive queries.
In the embodiment, the full amount of the data file mapping table is imported into the database of the knowledge graph through the Spark distributed computing engine, so that the full amount of the data file mapping table with large data volume can be imported, and the importing efficiency is improved.
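The idea of a partitioned full import can be illustrated without a cluster. In the sketch below, a thread pool stands in for the distributed computing engine and a dictionary stands in for the knowledge graph database; it is not Spark code and does not use any graph database client. The function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Dict, List, Tuple

Row = Tuple[str, Dict[str, Any]]


def split(rows: List[Row], num_partitions: int) -> List[List[Row]]:
    """Distribute rows round-robin across partitions."""
    return [rows[i::num_partitions] for i in range(num_partitions)]


def import_partition(part: List[Row]) -> Dict[str, Dict[str, Any]]:
    """Hypothetical writer: convert one partition of rows into vertices."""
    return {key: props for key, props in part}


def full_import(rows: List[Row],
                num_partitions: int = 4) -> Dict[str, Dict[str, Any]]:
    """Import every row, processing the partitions in parallel."""
    graph: Dict[str, Dict[str, Any]] = {}
    with ThreadPoolExecutor(max_workers=num_partitions) as pool:
        for partial in pool.map(import_partition, split(rows, num_partitions)):
            graph.update(partial)
    return graph
```

Splitting the table into independent partitions is what lets a distributed engine spread a large import across many workers instead of writing rows one by one from a single process.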
It should be understood that, although the steps in the flowcharts of the embodiments described above are displayed sequentially as indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated herein, the order of these steps is not strictly limited, and they may be performed in other orders. Moreover, at least some of the steps in the flowcharts may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times; these sub-steps or stages are not necessarily performed in sequence either, and may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
Based on the same inventive concept, the embodiments of the present disclosure further provide a data processing apparatus for implementing the above-mentioned data processing method of the knowledge graph. The apparatus solves the problem in a manner similar to that described for the method, so for specific limitations in the one or more embodiments of the data processing apparatus of the knowledge graph provided below, reference may be made to the limitations on the data processing method of the knowledge graph above; details are not repeated here.
In some embodiments of the present disclosure, a data processing apparatus for a knowledge-graph is provided, as shown in fig. 6. The apparatus 700 comprises:
a history data module 710, configured to obtain history information of a data producer, and establish a data file mapping table according to the history information;
a mapping table importing module 720, configured to import the data file mapping table into a knowledge graph to form base data of the knowledge graph;
a real-time data module 730, configured to obtain real-time information of the data producer, and to send the updated real-time information to the message middleware when it is determined, according to the historical information, that the content of the real-time information has been updated;
a knowledge graph update module 740 configured to update base data of the knowledge graph using the message middleware.
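How the four modules fit together can be sketched as a small in-process pipeline. This is a simplified illustration only: the class names are hypothetical, dictionaries stand in for the data file mapping table and the knowledge graph database, `queue.Queue` stands in for the message middleware, and a plain equality check stands in for change detection.

```python
import queue
from typing import Dict, List

Record = Dict[str, str]


class HistoryDataModule:
    """Builds the data file mapping table from historical records (cf. module 710)."""
    def build_mapping_table(self, history: List[Record]) -> Dict[str, Record]:
        return {rec["key"]: rec for rec in history}


class MappingTableImportModule:
    """Loads the mapping table into the graph store as base data (cf. module 720)."""
    def import_table(self, table: Dict[str, Record], graph: Dict[str, Record]) -> None:
        graph.update(table)


class RealTimeDataModule:
    """Forwards only changed records to the message middleware (cf. module 730)."""
    def publish_updates(self, realtime: List[Record], table: Dict[str, Record],
                        middleware: "queue.Queue[Record]") -> None:
        for rec in realtime:
            if table.get(rec["key"]) != rec:  # plain comparison stands in for change detection
                middleware.put(rec)


class GraphUpdateModule:
    """Applies queued updates to the graph store (cf. module 740)."""
    def apply(self, middleware: "queue.Queue[Record]", graph: Dict[str, Record]) -> int:
        applied = 0
        while not middleware.empty():
            rec = middleware.get()
            graph[rec["key"]] = rec
            applied += 1
        return applied


history = [{"key": "a", "name": "Alpha"}, {"key": "b", "name": "Beta"}]
realtime = [{"key": "a", "name": "Alpha"}, {"key": "b", "name": "Beta Ltd"}]
graph: Dict[str, Record] = {}
bus: "queue.Queue[Record]" = queue.Queue()

table = HistoryDataModule().build_mapping_table(history)
MappingTableImportModule().import_table(table, graph)
RealTimeDataModule().publish_updates(realtime, table, bus)
applied = GraphUpdateModule().apply(bus, graph)
```

Only the record for key `b` differs from the historical data, so only that record passes through the queue and reaches the graph store.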
In some embodiments of the present disclosure, as shown in fig. 7, the real-time data module 730 includes:
an information summarization unit 732, configured to determine whether the information digest of the real-time information is consistent with the information digest of the historical information in the data file mapping table;
an update determining unit 734, configured to, in response to the information digest of the real-time information being inconsistent with the information digest of the historical information in the data file mapping table, determine the real-time information whose information digest is inconsistent to be the updated real-time information.
In some embodiments of the present disclosure, the apparatus further comprises:
a mapping table updating module, configured to update the data file mapping table according to the updated real-time information in response to the information digest of the real-time information being inconsistent with the information digest of the historical information in the data file mapping table.
In some embodiments of the present disclosure, the message middleware includes a plurality of partitions, and the update sending unit 738 is further configured to send the updated real-time information to the corresponding partition of the message middleware according to the key field of the updated real-time information.
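The digest comparison and the subsequent mapping table update can be sketched as follows. This is a minimal illustration under assumptions: SHA-256 over a key-sorted JSON serialization is used as the information digest (the digest algorithm is an assumption made here), the mapping table is reduced to a dictionary from key to digest, and records whose key is not yet in the table are also treated as updated. The function names are hypothetical.

```python
import hashlib
import json
from typing import Dict, List


def info_digest(record: dict) -> str:
    """Digest of a record's content; key order is normalised so equal content hashes equally."""
    payload = json.dumps(record, sort_keys=True, ensure_ascii=False).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def detect_updates(realtime: List[dict], digest_table: Dict[str, str]) -> List[dict]:
    """Return records whose digest differs from the stored one, refreshing the table as we go."""
    updated = []
    for record in realtime:
        key = record["key"]
        digest = info_digest(record)
        if digest_table.get(key) != digest:
            updated.append(record)
            digest_table[key] = digest  # keep the mapping table in step with the new content
    return updated


history = [{"key": "a", "status": "active"}, {"key": "b", "status": "active"}]
digest_table = {rec["key"]: info_digest(rec) for rec in history}

realtime = [{"status": "active", "key": "a"}, {"key": "b", "status": "revoked"}]
changed = detect_updates(realtime, digest_table)
unchanged_on_replay = detect_updates(realtime, digest_table)
```

Because the table is refreshed when a change is found, replaying the same real-time records a second time yields no further updates, which avoids sending the same change to the message middleware twice.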
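Routing by key field can be illustrated with a stable hash. The sketch below is an assumption-laden stand-in, not the behaviour of any particular message middleware: CRC32 is chosen here only because it is deterministic across processes (Python's built-in `hash` is salted per process for strings), and plain lists stand in for partitions. The names `partition_for` and `send` are hypothetical.

```python
import zlib
from typing import List


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key field to a partition index with a stable hash (CRC32 here)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def send(record: dict, partitions: List[List[dict]]) -> int:
    """Append the record to the partition chosen by its key field; return the index."""
    index = partition_for(record["key"], len(partitions))
    partitions[index].append(record)
    return index


partitions: List[List[dict]] = [[] for _ in range(3)]
first = send({"key": "company-42", "version": 1}, partitions)
second = send({"key": "company-42", "version": 2}, partitions)
```

One common motivation for this design is ordering: records with the same key always land in the same partition, so successive updates to one entity are consumed in the order they were sent.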
In some embodiments of the present disclosure, as shown in fig. 8, the knowledge-graph update module 740 comprises:
an update write traffic threshold unit 742, configured to determine an update write traffic threshold of the knowledge graph according to data read traffic of the knowledge graph;
an update writing unit 744, configured to update the base data of the knowledge graph according to the update write traffic threshold.
In one embodiment, the mapping table importing module includes:
a computing engine unit, configured to import the full data file mapping table into the database of the knowledge graph through a distributed computing engine.
The modules in the data processing apparatus of the knowledge graph can be implemented wholly or partially by software, hardware, or a combination thereof. The modules may be embedded in, or independent of, a processor in the computer device in hardware form, or may be stored in a memory of the computer device in software form, so that the processor can call them and execute the operations corresponding to the modules. It should be noted that the division of the modules in the embodiments of the present disclosure is illustrative and is only one logical division of functions; other divisions are possible in actual implementation.
In another embodiment provided by the present disclosure, a computer device is provided, and the computer device may be a server, and its internal structure diagram may be as shown in the figure. The computer device includes a processor, a memory, an input/output (I/O) interface, and a communication interface. The processor, the memory and the input/output interface are connected through a system bus, and the communication interface is connected to the system bus through the input/output interface. The processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The database of the computer device is used to store the base data of the knowledge graph. The input/output interface of the computer device is used for exchanging information between the processor and an external device. The communication interface of the computer device is used for connecting and communicating with an external terminal through a network. The computer program, when executed by the processor, implements a data processing method of a knowledge graph.
Those skilled in the art will appreciate that the architecture shown in fig. 9 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computing devices to which the disclosed aspects apply, as particular computing devices may include more or fewer components than those shown, or may combine certain components, or have a different arrangement of components.
In another embodiment provided by the present disclosure, a computer-readable storage medium is provided, on which a computer program is stored, which computer program, when being executed by a processor, implements the steps in the above-mentioned method embodiments.
In another embodiment provided by the present disclosure, a computer program product is provided, which comprises a computer program that, when being executed by a processor, implements the steps in the above-mentioned method embodiments.
It should be noted that, the user information (including but not limited to user equipment information, user personal information, etc.) and data (including but not limited to data for analysis, stored data, displayed data, etc.) referred to in the present application are information and data authorized by the user or sufficiently authorized by each party, and the collection, use and processing of the related data need to comply with the relevant laws and regulations and standards of the relevant country and region.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the method embodiments described above. Any reference to memory, databases, or other media used in the embodiments provided herein can include at least one of non-volatile and volatile memory. The non-volatile memory may include Read-Only Memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, Resistive Random Access Memory (ReRAM), Magnetic Random Access Memory (MRAM), Ferroelectric Random Access Memory (FRAM), Phase Change Memory (PCM), graphene memory, and the like. Volatile memory can include Random Access Memory (RAM), external cache memory, and the like. By way of illustration and not limitation, RAM can take many forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM). The databases involved in the embodiments provided herein may include at least one of relational and non-relational databases. The non-relational database may include, but is not limited to, a blockchain-based distributed database. The processors referred to in the embodiments provided herein may be, without limitation, general-purpose processors, central processing units, graphics processors, digital signal processors, programmable logic devices, data processing logic devices based on quantum computing, and the like.
In the description herein, references to "some embodiments," "other embodiments," "desired embodiments," or the like, mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, schematic depictions of the above terms do not necessarily refer to the same embodiment or example.
It is understood that the embodiments of the method described above are described in a progressive manner, and the same/similar parts of the embodiments are referred to each other, and each embodiment focuses on differences from the other embodiments. Reference may be made to the description of other method embodiments for relevant points.
The technical features of the embodiments described above may be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features of the embodiments described above are not described, but should be considered as being within the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above-mentioned embodiments express only several implementations of the present disclosure, and their description is relatively specific and detailed, but they should not be construed as limiting the scope of the claims. It should be noted that those skilled in the art can make various changes and modifications without departing from the concept of the present disclosure, and these changes and modifications all fall within the scope of the present disclosure. Therefore, the protection scope of the present disclosure should be subject to the appended claims.

Claims (10)

1. A method of data processing of a knowledge graph, the method comprising:
acquiring historical information of a data producer, and establishing a data file mapping table according to the historical information;
importing the data file mapping table into a knowledge graph to form base data of the knowledge graph;
acquiring real-time information of the data producer, and sending the updated real-time information to a message middleware under the condition that the content of the real-time information is determined to be updated according to the historical information;
updating base data of the knowledge-graph with the message middleware.
2. The method of claim 1, wherein the obtaining of the real-time information of the data producer and the sending of the updated real-time information to the message middleware in case of determining that the content of the real-time information is updated according to the history information comprises:
judging whether the information digest of the real-time information is consistent with the information digest of the historical information in the data file mapping table;
and in response to the information digest of the real-time information being inconsistent with the information digest of the historical information in the data file mapping table, determining the real-time information whose information digest is inconsistent to be the updated real-time information.
3. The method of claim 2, further comprising:
and in response to the information digest of the real-time information being inconsistent with the information digest of the historical information in the data file mapping table, updating the data file mapping table according to the updated real-time information.
4. The method of claim 2, wherein the message middleware comprises a plurality of partitions, wherein the obtaining of the real-time information of the data producer and the sending of the updated real-time information to the message middleware in the case that the content of the real-time information is determined to be updated according to the history information further comprises:
and sending the updated real-time information to the corresponding partition of the message middleware according to the key field of the updated real-time information.
5. The method of claim 1, wherein updating base data of the knowledge-graph using the message middleware comprises:
determining an update write traffic threshold of the knowledge graph according to the data read traffic of the knowledge graph;
and updating the base data of the knowledge graph according to the update write traffic threshold.
6. The method of claim 1, wherein importing the data file mapping table into a knowledge graph to form base data of the knowledge graph comprises:
and importing the whole data file mapping table into a database of the knowledge graph through a distributed computing engine.
7. A data processing apparatus for a knowledge graph, the apparatus comprising:
the historical data module is used for acquiring historical information of a data producer and establishing a data file mapping table according to the historical information;
the mapping table importing module is used for importing the data file mapping table into a knowledge graph to form base data of the knowledge graph;
the real-time data module is used for acquiring the real-time information of the data producer and sending the updated real-time information to the message middleware under the condition that the content of the real-time information is determined to be updated according to the historical information;
and the knowledge graph updating module is used for updating the base data of the knowledge graph by utilizing the message middleware.
8. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the steps of the method of any of claims 1 to 6.
9. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 6.
10. A computer program product comprising a computer program, characterized in that the computer program realizes the steps of the method of any one of claims 1 to 6 when executed by a processor.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211654507.XA CN115809311A (en) 2022-12-22 2022-12-22 Data processing method and device of knowledge graph and computer equipment

Publications (1)

Publication Number Publication Date
CN115809311A true CN115809311A (en) 2023-03-17

Family

ID=85486761

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211654507.XA Pending CN115809311A (en) 2022-12-22 2022-12-22 Data processing method and device of knowledge graph and computer equipment

Country Status (1)

Country Link
CN (1) CN115809311A (en)

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021073254A1 (en) * 2019-10-18 2021-04-22 平安科技(深圳)有限公司 Knowledge graph-based entity linking method and apparatus, device, and storage medium
WO2022222716A1 (en) * 2021-04-21 2022-10-27 华东理工大学 Construction method and apparatus for chemical industry knowledge graph, and intelligent question and answer method and apparatus
CN113626616A (en) * 2021-08-25 2021-11-09 中国电子科技集团公司第三十六研究所 Aircraft safety early warning method, device and system
CN114153986A (en) * 2021-11-29 2022-03-08 北京达佳互联信息技术有限公司 Knowledge graph construction method and device, electronic equipment and storage medium
CN114238654A (en) * 2021-12-15 2022-03-25 科大讯飞股份有限公司 Knowledge graph construction method and device and computer readable storage medium
CN114328981A (en) * 2022-03-14 2022-04-12 中国电子科技集团公司第二十八研究所 Knowledge graph establishing and data obtaining method and device based on mode mapping
CN114385833A (en) * 2022-03-23 2022-04-22 支付宝(杭州)信息技术有限公司 Method and device for updating knowledge graph
CN115455935A (en) * 2022-09-14 2022-12-09 华东师范大学 Intelligent text information processing system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
宋伟;张游杰;: "基于环境信息融合的知识图谱构建方法", 计算机***应用, no. 06, 15 June 2020 (2020-06-15) *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116932779A (en) * 2023-08-14 2023-10-24 企查查科技股份有限公司 Knowledge graph data processing method and device
CN116932779B (en) * 2023-08-14 2024-03-12 企查查科技股份有限公司 Knowledge graph data processing method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Country or region after: China

Address after: No. 8 Huizhi Street, Suzhou Industrial Park, Suzhou Area, China (Jiangsu) Pilot Free Trade Zone, Suzhou City, Jiangsu Province, 215000

Applicant after: Qichacha Technology Co.,Ltd.

Address before: Room 503, 5 / F, C1 building, 88 Dongchang Road, Suzhou Industrial Park, 215000, Jiangsu Province

Applicant before: Qicha Technology Co.,Ltd.

Country or region before: China