CN117891851A - Knowledge base analysis method and system based on artificial intelligence - Google Patents

Knowledge base analysis method and system based on artificial intelligence

Publication number
CN117891851A
Authority
CN
China
Prior art keywords
search
information
data
similar
analysis
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202410302742.3A
Other languages
Chinese (zh)
Other versions
CN117891851B (en)
Inventor
张发恩
郭江亮
徐安琪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qingdao Chuangxin Qizhi Technology Group Co ltd
Original Assignee
Qingdao Chuangxin Qizhi Technology Group Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qingdao Chuangxin Qizhi Technology Group Co ltd filed Critical Qingdao Chuangxin Qizhi Technology Group Co ltd
Priority to CN202410302742.3A priority Critical patent/CN117891851B/en
Publication of CN117891851A publication Critical patent/CN117891851A/en
Application granted granted Critical
Publication of CN117891851B publication Critical patent/CN117891851B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a knowledge base analysis method and system based on artificial intelligence, and relates to the technical field of knowledge base analysis. The method comprises: acquiring historical search data of a knowledge base and classifying the search input information to form search input category data; determining search output result data of the search input category data according to the knowledge base historical search data; performing a merging analysis of the search input category data and the corresponding search output result data to form similar search data; and performing a merging analysis of different types of search input category data according to the different similar search data to form knowledge base search guide data. By classifying and dividing the data of the knowledge base according to search requirements, the method forms a data division system that fully matches search habits, which greatly improves the efficiency of acquiring the required data from the knowledge base.

Description

Knowledge base analysis method and system based on artificial intelligence
Technical Field
The invention relates to the technical field of knowledge base analysis, in particular to a knowledge base analysis method and system based on artificial intelligence.
Background
With the development of society and the progress of science, every industry has become more specialized and subdivided. This specialization brings deeper and broader bodies of knowledge. At the same time, operational needs in the big data era have led to the accumulation of knowledge systems that help improve working efficiency, forming knowledge bases with huge data volumes. Such knowledge bases can basically meet the needs of operation, production, and daily life.
In general, so that the relevant personnel can use a knowledge base simply and efficiently and quickly obtain the data they need, the knowledge data in the knowledge base is classified and sorted, and an efficient search mode is established to improve their operating efficiency. However, essentially all search modes established for knowledge bases are formed in reverse, from the data types in the knowledge base: they are built by classifying data according to data type rather than by considering the requirements of the personnel who search. As a result, a person who uses a search engine to find and extract required data must first understand the knowledge base's large data division system. This is difficult for an individual user, leads directly to inaccurate search results, and makes it hard to obtain the required knowledge data efficiently.
Therefore, designing an artificial-intelligence-based knowledge base analysis method and system that classifies the data of the knowledge base according to search requirements, so as to form a data classification system that fully matches search habits and greatly improve the efficiency of acquiring the required data from the knowledge base, is a problem that urgently needs to be solved.
Disclosure of Invention
The invention aims to provide an artificial-intelligence-based knowledge base analysis method. The method acquires historical search input data for a knowledge base, divides the historical search requirements by type, and extracts feature information so as to obtain the real search requirement information. The search requirement information is then matched to the search results that were finally obtained, and, on the basis of the search input information, the identity and correlation of the search output results are used to divide the knowledge base data in a reasonable, requirement-oriented way, thereby establishing reasonable search guide data. On the one hand, because the search mode is built from search requirements, it fully matches the characteristics of those requirements: accurate search results can be obtained conveniently by using more precise search input information, and quick historical search references can be provided for search operations. These references are highly relevant and targeted and can basically meet most search requirements under big data. On the other hand, because the search mode is built in the forward direction, from search requirements, accurate search results are obtained more efficiently, the resources consumed by searching are reduced to a certain extent, and the operating pressure and burden on operators are greatly reduced.
The invention also aims to provide an artificial-intelligence-based knowledge base analysis system which, by providing a knowledge base data analysis function that fully serves search efficiency, effectively ensures the integrity and stability of knowledge base analysis and provides a stable foundation for it.
In a first aspect, the present invention provides an artificial intelligence based knowledge base analysis method, including obtaining knowledge base historical search data, and performing category division of search input information to form search input category data; determining search output result data of the search input category data according to the knowledge base historical search data; carrying out merging analysis on the search input category data and the corresponding search output result data to form similar search data; and carrying out merging analysis on different types of search input type data according to different similar search data to form knowledge base search guide data.
In the method, historical search input data for the knowledge base is acquired, the historical search requirements are divided by type, and feature information is extracted so as to obtain the real search requirement information. The search requirement information is then matched to the search results that were finally obtained, and, on the basis of the search input information, the identity and correlation of the search output results are used to divide the knowledge base data in a reasonable, requirement-oriented way, thereby establishing reasonable search guide data. On the one hand, because the search mode is built from search requirements, it fully matches the characteristics of those requirements: accurate search results can be obtained conveniently by using more precise search input information, and quick historical search references can be provided for search operations. These references are highly relevant and targeted and can basically meet most search requirements under big data. On the other hand, because the search mode is built in the forward direction, from search requirements, accurate search results are obtained more efficiently, the resources consumed by searching are reduced to a certain extent, and the operating pressure and burden on operators are greatly reduced.
As one possible implementation manner, obtaining historical search data of a knowledge base, and performing category division of search input information to form search input category data, including: extracting all search input information according to the historical search data of the knowledge base; dividing the search input information according to information types to form different search input type information sets; extracting characteristic information of each search input type information in different search input type information sets to form search input unit characteristic information corresponding to each search input type information; and collecting all the search input unit characteristic information under the search input type information set to form the search input type characteristic information set.
In the invention, forming the search input category data involves two aspects: dividing the search input information obtained from the knowledge base historical search data by data type, and extracting feature information from all of the divided search input information within each data type. The division by data type mainly serves to distinguish the data types so that feature information can be extracted in a targeted manner for each type of search data. Feature extraction differs completely between types such as image data, voice data, and text data, each of which has its own characteristic features, for example the color features of image data and the volume features of voice data. The feature information of each type of data can be extracted with reference to the search output result information that was finally obtained, so that the information the search actually required can be determined accurately from the final search result. Different operators may use different search inputs when searching for the same target, which makes the volume of input data large and much of it ineffective; they may also use the same search input when searching for different targets, which makes the search requirement ambiguous. Feature extraction guided by the search result therefore effectively reduces the volume of search requirement data and avoids inaccurate search input information. The feature information extracted also depends on the data type: image data, for example, consists mostly of graphic information such as shape and color, so the feature information of image input is mainly of those kinds, whereas the feature information of text input is mainly semantic.
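The two-step formation of search input category data described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the patented implementation: the record layout `(input_type, raw_input)`, the function names, and the toy token-based text extractor are all hypothetical, and the sketch does not model the guidance of feature extraction by the final search results.

```python
from collections import defaultdict


def extract_features(input_type, raw_input):
    """Toy per-type feature extractor (hypothetical, for illustration only)."""
    if input_type == "text":
        # Stand-in for semantic features: the sorted set of lowercase tokens.
        return tuple(sorted(set(raw_input.lower().split())))
    # Image or voice inputs would need their own extractors (color, shape,
    # volume, ...); here the raw value is passed through unchanged.
    return (raw_input,)


def build_category_feature_sets(history):
    """Group search inputs by type, then collect one feature tuple per input.

    `history` is an iterable of (input_type, raw_input) pairs (assumed layout).
    Returns a dict mapping each input type to its set of unit feature tuples.
    """
    category_sets = defaultdict(set)
    for input_type, raw_input in history:
        category_sets[input_type].add(extract_features(input_type, raw_input))
    return dict(category_sets)
```

In this sketch, two differently worded text queries that share the same tokens collapse to a single unit feature, which mirrors the stated goal of reducing redundant search input data.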
As one possible implementation, determining search output result data of the search input category data according to the knowledge base historical search data includes: extracting all search output result information corresponding to each search input unit feature information according to the knowledge base historical search data to form a search output result information set corresponding to the search input unit feature information; for each search output result information in the search output result information set, determining the number of historical outputs $N_i$ in which the search output result information appears, wherein $i$ indexes the different search output result information in the search output result information set; and determining the historical output frequency $f_i$ corresponding to the search output result information according to the numbers of historical outputs $N_i$ of the different search output result information in the search output result information set.
In the application, the feature extraction of the search input information is carried out with reference to the search output result information. After the feature information of the search input is obtained, the search input information therefore needs to be matched to the search result information to form combinations of real requirement and result, which provide the data basis for the subsequent forward, requirement-driven type division analysis of the knowledge base data. It should be noted that this analysis establishes the search system from historical input requirements. The established search system can further use artificial intelligence techniques to recognize and divide data types intelligently, and can continuously and automatically collect search operation data to update its type division data, thereby keeping the search system current and accurate. It will be appreciated that a search result is not always a single, lowest-level piece of result information. For example, when an operator searches for general knowledge, such as the types of welding or the analysis methods for discrete samples, all specific results under that heading are returned, and different operators may select different specific results as their actual search result. For such searches, the application therefore counts the frequency of the different search results to determine which specific results are commonly used under that search requirement and to provide reasonable guiding results for subsequent searches. Because operators' needs tend to cluster, guidance on specific results based on frequency statistics is very useful.
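The frequency statistics described above can be illustrated with a short Python sketch. The text does not preserve the exact frequency formula, so this sketch assumes the simplest reading: the historical output frequency of a result is its count divided by the total count of all results recorded for the same search input unit feature. The function name and the input layout are hypothetical.

```python
from collections import Counter


def historical_output_frequencies(selected_results):
    """Return {result: frequency} for one search input unit feature.

    `selected_results` lists the result chosen in each historical search
    (assumed layout). Frequency is assumed to be count / total count.
    """
    counts = Counter(selected_results)
    total = sum(counts.values())
    return {result: count / total for result, count in counts.items()}
```

For a general query such as the types of welding, the resulting dictionary shows which specific results operators picked most often, which is the information the guidance step relies on.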
As one possible implementation manner, performing a merging analysis of the search input category data and the corresponding search output result data to form similar search data, including: in the same search input type characteristic information set, carrying out the following combined analysis: randomly selecting two pieces of search input unit feature information, and carrying out matching analysis based on the output result items to form similar item matching analysis results; if the matching analysis result of the similar items shows no matching, determining that the selected two search input unit feature information are not combined; if the matching analysis result of the similar items shows that the two search input unit feature information are matched, carrying out similar similarity analysis based on output frequency on the selected two search input unit feature information to form a similar similarity analysis result; when the similar similarity analysis results are displayed as dissimilar, determining that the selected two search input unit feature information are not combined; when the similar similarity analysis results are similar, forming a search input similar characteristic information group by using the selected two search input unit characteristic information sets, forming a search output similar characteristic result information group by using the corresponding search output result information sets, and determining the average historical output frequency of each search output result information; and finishing pairwise merging analysis of all the search input unit feature information in the search input type feature information set, collecting all the search input similar feature information sets and corresponding search output similar feature result information sets, and forming similar search data by the search input type feature information set which is not merged and the corresponding search output result information set.
In the application, the same type of search input data contains different search input information that corresponds to the same search requirement target. To avoid redundant input information, further optimize the search input, and improve its accuracy, the different search input information that targets the same search requirement needs to be merged. Different search input information could be merged in various ways; the application merges it by examining two aspects. The first aspect is the degree to which the search results match. When the search requirement target is a single specific search result, a single result comparison is enough to determine whether two inputs carry feature information for the same requirement. When the search requirement is not a single specific result, however, it corresponds to several specific search results, all of which are correct and genuinely wanted by the search, and the choice among them depends on other considerations of the operator, such as project requirements or design requirements. The merging analysis for this kind of search requirement therefore needs to consider how many result entries match.
The second aspect applies especially when the search requirement is not a single specific result. Because operators choose among the specific results according to their other needs, the commonly chosen options under big data need to be determined in order to guide an operator's selection, so the usage frequencies of the specific results must be compared. If two input requirements can be merged, then after their specific results have been matched, the matched entries should also have similar usage frequencies; this similarity judgment determines accurately whether the two input requirements have the same, mergeable requirement feature information. In addition, all requirement input feature information within the same type of requirement data needs to be compared in pairs, which ensures that several pieces of input requirement feature information are merged effectively when they are the same.
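The pairwise, two-stage merging flow described above can be sketched as follows. This is an illustrative outline only: `pairwise_merge`, `is_match`, and `is_similar` are hypothetical names, the two checks are passed in as callables so that any entry-matching and frequency-similarity rule can be plugged in, and features are assumed to be hashable.

```python
from itertools import combinations


def pairwise_merge(features, is_match, is_similar):
    """Compare every pair of unit features within one category.

    A pair is merged only if it passes the entry-matching check and then
    the frequency-similarity check, in that order. Returns the merged pairs
    and the features that took part in no merge.
    """
    merged_pairs = []
    for a, b in combinations(features, 2):
        if not is_match(a, b):
            continue  # stage 1 failed: result entries do not match
        if not is_similar(a, b):
            continue  # stage 2 failed: usage frequencies differ too much
        merged_pairs.append((a, b))
    merged = {feature for pair in merged_pairs for feature in pair}
    unmerged = [feature for feature in features if feature not in merged]
    return merged_pairs, unmerged
```

Keeping the unmerged features in the return value reflects the description, in which the unmerged feature information and its results remain part of the similar search data.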
As one possible implementation, arbitrarily selecting two pieces of search input unit feature information and performing matching analysis based on the output result entries to form a similar item matching analysis result includes: determining all search output result information corresponding to each of the two selected pieces of search input unit feature information, and setting a similar matching rate threshold $\eta_0$ for the following entry matching analysis: if the matching rate $\eta$ of the search output result information corresponding to the two pieces of search input unit feature information exceeds the similar matching rate threshold $\eta_0$, outputting similar item matching result information; if the matching rate $\eta$ of the search output result information corresponding to the two pieces of search input unit feature information does not exceed the similar matching rate threshold $\eta_0$, outputting similar item non-matching result information; wherein the matching rate $\eta$ is determined from $n$ and $m$, where $n$ denotes the number of similar matching entries and $m$ denotes the number of search output result information that remains unmatched after the similar matching entries are removed from all search output result information corresponding to the two pieces of search input unit feature information.
In the invention, the search input feature information undergoes a merging analysis based on entry matching. First, the number of matching entries is determined among the output results corresponding to the selected input requirement feature information, that is, the number of specific search output results the two have in common. Even when two pieces of requirement input feature information are the same and can be merged, their corresponding output results do not necessarily match one-to-one on every specific search result. On the one hand, the search input feature information is derived from the raw search input information and ignores some of its other characteristics, which can cause the search to return additional or different output results. On the other hand, even for identical search input feature information, factors such as the order and spacing of the input can affect the output and produce some additional specific results. The similar matching rate threshold can be set according to actual conditions, or determined by big data analysis of the feature information data.
It should be noted that searching is a process of comparing information and progressively narrowing the range to reach a target, and this process may give the operator other useful or auxiliary information about the target result. For the output results corresponding to feature information that is merged after matching, the output results can therefore be divided into a primary part and a secondary part: the primary part consists of the search output results that could be matched, and the secondary part consists of the remaining search output results that could not be matched. In this way, the complete search result data for the input feature information is available when the data is later used for guidance, which makes the search input guidance of the knowledge base more comprehensive and reasonable.
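The entry-matching check and the split into primary and secondary results can be sketched in Python. The exact matching rate formula is not recoverable from the text, so this sketch assumes one plausible reading, labeled as an assumption in the code: matched entries divided by matched plus unmatched entries. The function name and return layout are hypothetical.

```python
def match_entries(results_a, results_b, threshold):
    """Entry-matching check for two unit features' output results.

    ASSUMPTION: matching rate = n / (n + m), where n is the number of
    shared results and m the number of results left unmatched. The source
    formula is not preserved, so this is only one plausible definition.

    Returns (is_match, rate, primary, secondary), where `primary` holds the
    matched results and `secondary` the remaining unmatched results.
    """
    set_a, set_b = set(results_a), set(results_b)
    primary = set_a & set_b
    secondary = (set_a | set_b) - primary
    n, m = len(primary), len(secondary)
    rate = n / (n + m) if (n + m) else 0.0
    return rate > threshold, rate, primary, secondary
```

Returning the secondary part alongside the primary part keeps the auxiliary results available for later guidance, as the description suggests.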
As one possible implementation, if the similar item matching analysis result shows a match, performing similar similarity analysis based on output frequency on the two selected pieces of search input unit feature information to form a similar similarity analysis result includes: for the two selected pieces of search input unit feature information, determining the relative historical output frequency difference $\Delta_k$ between the historical output frequencies of each pair of mutually matched search output result information; determining the historical average output frequency difference $\bar{\Delta}$ from all relative historical output frequency differences $\Delta_k$; setting a similar frequency difference threshold $\Delta_0$, and making the following judgment according to the historical average output frequency difference $\bar{\Delta}$: if $\bar{\Delta} \le \Delta_0$, outputting similar result information; if $\bar{\Delta} > \Delta_0$, outputting dissimilar result information.
In the invention, the similarity analysis of same-type search input feature information rests on the consideration that the usage frequencies of the matched search output results should not differ too much, so judging similarity by the size of the usage frequency difference is both necessary and reasonable. Because the frequency gap of a single matched entry may be distorted by data deviations, it is more reasonable to examine the average usage frequency difference over all matched entries. The similar frequency difference threshold can be set according to actual conditions, or determined by big data analysis of the feature information in the knowledge base.
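The frequency-based similarity judgment can be sketched as follows. The text does not preserve how the relative frequency difference of a matched entry is computed, so the sketch assumes the absolute difference divided by the larger of the two frequencies; the function name and the dictionary layout are likewise hypothetical.

```python
def frequency_similarity(freq_a, freq_b, threshold):
    """Judge similarity from the average frequency gap of matched results.

    `freq_a` and `freq_b` map each result to its historical output
    frequency for one unit feature (assumed layout).
    ASSUMPTION: relative difference = |fa - fb| / max(fa, fb).

    Returns (is_similar, average_gap); (False, None) if nothing matches.
    """
    matched = freq_a.keys() & freq_b.keys()
    if not matched:
        return False, None
    gaps = []
    for result in matched:
        larger = max(freq_a[result], freq_b[result])
        gap = abs(freq_a[result] - freq_b[result]) / larger if larger else 0.0
        gaps.append(gap)
    average_gap = sum(gaps) / len(gaps)
    # Similar when the average gap does not exceed the threshold.
    return average_gap <= threshold, average_gap
```

Averaging over all matched entries, rather than testing each entry separately, follows the reasoning above that a single entry's gap can be distorted by data deviations.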
As one possible implementation, performing a merging analysis of different types of search input category data according to the different similar search data to form knowledge base search guide data includes: arbitrarily selecting two similar search data of different types, arbitrarily selecting one piece of feature information from each, and performing the merging analysis as follows: performing matching analysis based on the output result entries on the two selected pieces of feature information to form a non-similar item matching analysis result; if the non-similar item matching analysis result shows no match, determining that the two selected pieces of feature information are not merged; if the non-similar item matching analysis result shows a match, performing non-similar similarity analysis based on output frequency on the two selected pieces of feature information to form a non-similar similarity analysis result; when the non-similar similarity analysis result shows dissimilarity, determining that the two selected pieces of feature information are not merged; when the non-similar similarity analysis result shows similarity, forming a search input guide feature information group from the two selected pieces of feature information, and forming a search output guide feature result information group from the corresponding search output result information sets; and completing the pairwise merging analysis of all feature information in the two similar search data, and collecting all search input guide feature information groups and the corresponding search output guide feature result information groups, the unmerged search input category feature information sets and the corresponding search output result information sets, and all unmerged search input similar feature information sets and the corresponding search output similar feature result information sets to form the knowledge base search guide data.
In the invention, once the merging analysis of same-type requirement input feature information is complete, the workload of searching and analyzing same-type data in the knowledge base is reduced to a certain extent, and a data basis is provided for establishing requirement terms or representative guide data for same-type searches, which makes the search and analysis of requirement input more efficient and reasonable. In addition, the diversity of data types in the knowledge base increases its data dimensions and gives operators a wider range of search modes, improving both operating efficiency and search accuracy. As a result, some input requirement information with the same feature information exists across different types of search input data, and it also needs a merging analysis to reduce redundancy in the data analysis. This makes it possible to recognize quickly and efficiently when operators search for the same target using different search requirement input information, and to provide accurate and reasonable output result data. As in the same-type case, whether non-similar requirement input feature information is merged is determined by examining the matching degree of the matching entries and the frequency similarity of the matching entries of the corresponding output result information.
As one possible implementation, arbitrarily selecting two pieces of feature information and performing matching analysis based on the output result entries to form a non-homogeneous item matching analysis result includes: determining all search output result information corresponding to each of the two selected pieces of feature information, and setting a non-homogeneous matching rate threshold $\eta'_0$ for the following entry matching analysis: if the matching rate $\eta'$ of the search output result information corresponding to the two pieces of feature information exceeds the non-homogeneous matching rate threshold $\eta'_0$, outputting non-homogeneous item matching result information; if the matching rate $\eta'$ of the search output result information corresponding to the two pieces of feature information does not exceed the non-homogeneous matching rate threshold $\eta'_0$, outputting non-homogeneous item non-matching result information; wherein the matching rate $\eta'$ is determined from $n'$ and $m'$, where $n'$ denotes the number of non-homogeneous matching entries and $m'$ denotes the number of search output result information that remains unmatched after the non-homogeneous matching entries are removed from all search output result information corresponding to the two pieces of feature information.
In the invention, the non-homogeneous matching rate threshold is the criterion for deciding whether non-homogeneous requirement input feature information matches on its output results. It can be set according to actual analysis conditions, or determined by big data analysis of the feature information data in the database. Similarly, searching is a process of comparing information and progressively narrowing the range to reach a target, and this process may give the operator other useful or auxiliary information about the target result. For the output results corresponding to feature information that is merged after matching, the output results can therefore be divided into a primary part and a secondary part: the primary part consists of the search output results that could be matched, and the secondary part consists of the remaining search output results that could not be matched. In this way, the complete search result data for the input feature information is available during subsequent data guidance, which makes the search input guidance of the knowledge base more comprehensive and reasonable.
As one possible implementation, if the non-homogeneous item matching analysis result shows a match, performing non-homogeneous similarity analysis based on output frequency on the two selected pieces of feature information to form a non-homogeneous similarity analysis result includes: for the two selected pieces of feature information, determining the non-homogeneous relative historical output frequency difference between the historical output frequencies of each pair of mutually matched search output result information; determining the non-homogeneous historical average output frequency difference from all the non-homogeneous relative historical output frequency differences; setting a non-homogeneous frequency difference threshold, and making the following judgment according to the non-homogeneous historical average output frequency difference: if the non-homogeneous historical average output frequency difference does not exceed the non-homogeneous frequency difference threshold, outputting non-homogeneous similar result information; and if the non-homogeneous historical average output frequency difference exceeds the non-homogeneous frequency difference threshold, outputting non-homogeneous dissimilar result information.
In the present invention, the similarity analysis based on usage frequency is likewise decided by the average usage frequency difference of the matching entries. The non-homogeneous frequency difference threshold can be set according to actual analysis conditions, or determined by big data analysis of the knowledge base data.
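The cross-type merging analysis can be sketched end to end in Python. This is a hedged illustration rather than the patented procedure: every name is hypothetical, each category is assumed to be a dictionary mapping a feature to its `{result: frequency}` table, the matching rate is assumed to be shared results over all results, and the relative gap is assumed to be the absolute difference over the larger frequency (frequencies are assumed positive).

```python
from itertools import product


def matching_rate(freq_a, freq_b):
    """ASSUMED definition: shared results / all results of both features."""
    shared = freq_a.keys() & freq_b.keys()
    total = freq_a.keys() | freq_b.keys()
    return len(shared) / len(total) if total else 0.0


def mean_relative_gap(freq_a, freq_b):
    """ASSUMED definition: mean of |fa - fb| / max(fa, fb) over shared results."""
    shared = freq_a.keys() & freq_b.keys()
    gaps = [abs(freq_a[r] - freq_b[r]) / max(freq_a[r], freq_b[r]) for r in shared]
    return sum(gaps) / len(gaps) if gaps else float("inf")


def cross_category_merge(cat_a, cat_b, rate_threshold, gap_threshold):
    """Pair every feature of one category with every feature of the other.

    A pair becomes a guide group only if its matching rate exceeds
    `rate_threshold` and its mean frequency gap does not exceed
    `gap_threshold`. Each group is ((feature_a, feature_b), sorted results).
    """
    guide_groups = []
    for (fa, out_a), (fb, out_b) in product(cat_a.items(), cat_b.items()):
        if matching_rate(out_a, out_b) <= rate_threshold:
            continue
        if mean_relative_gap(out_a, out_b) > gap_threshold:
            continue
        guide_groups.append(((fa, fb), sorted(out_a.keys() | out_b.keys())))
    return guide_groups
```

In practice the two thresholds here would be the non-homogeneous thresholds, which the description says may differ from the same-type thresholds and may be tuned from the knowledge base data.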
In a second aspect, the present invention provides an artificial intelligence based knowledge base analysis system, the artificial intelligence based knowledge base analysis system being configured to obtain knowledge base historical search data, perform category classification of search input information, and form search input category data; determining search output result data of the search input category data according to the knowledge base historical search data; carrying out merging analysis on the search input category data and the corresponding search output result data to form similar search data; and carrying out merging analysis on different types of search input type data according to different similar search data to form knowledge base search guide data.
In the invention, by providing a knowledge base data analysis function designed around search efficiency, the system effectively ensures the integrity and stability of knowledge base analysis and provides a stable material basis for it.
The knowledge base analysis method and system based on artificial intelligence provided by the invention have the beneficial effects that:
By acquiring historical search input data for the knowledge base, the method classifies the historical search requirements by type and extracts feature information, thereby obtaining the real search requirement information. The search requirement information is then associated with the search results that were ultimately obtained, and, on the basis of the search input information and the identity and correlation of the search output results, the knowledge base data is divided in a reasonable, search-requirement-oriented way to establish reasonable search guide data. On the one hand, because the search mode is built on search requirements, it fits the characteristics of those requirements: accurate search results can be obtained conveniently with more precise search input information, and quick historical search references can be provided for search operations. These references are highly relevant and targeted and can basically meet most search requirements under big data. On the other hand, because the search mode is built in the forward direction from search requirements, accurate search results are obtained more efficiently, the resources consumed by searching are reduced to a certain extent, and the operating pressure and burden on operators are greatly reduced.
By providing a knowledge base data analysis function designed around search efficiency, the system effectively ensures the integrity and stability of knowledge base analysis and provides a stable material basis for it.
Drawings
In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the embodiments are briefly described below. It should be understood that the following drawings illustrate only some embodiments of the present invention and should not be regarded as limiting its scope; a person skilled in the art can obtain other related drawings from these drawings without inventive effort.
FIG. 1 is a step diagram of an artificial intelligence based knowledge base analysis method according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be described below with reference to the accompanying drawings in the embodiments of the present invention.
With social development and scientific progress, industries have become increasingly specialized and subdivided. This specialization brings deeper and broader bodies of knowledge, and at the same time knowledge systems that help improve working efficiency accumulate from operational needs under big data, forming knowledge bases with huge data volumes. Such knowledge bases can basically meet the needs of operation, production, and daily life.
In general, so that relevant personnel can use the knowledge base efficiently and simply and quickly obtain the data they need, the knowledge data in the knowledge base is reasonably classified and organized, and an efficient search mode is established to improve their operating efficiency. However, essentially all search modes established for knowledge bases are formed in reverse, from the data types in the knowledge base: they are not built from the requirements of the relevant personnel but follow the classification of the data. As a result, when relevant personnel use a search engine to find and extract required data, they must first understand the huge data division system of the knowledge base. This is undoubtedly difficult for individual personnel, leads directly to inaccurate search results, and makes it hard to obtain the required knowledge data efficiently.
Referring to FIG. 1, an embodiment of the present invention provides an artificial intelligence based knowledge base analysis method. By acquiring historical search input data for the knowledge base, the method classifies the historical search requirements by type and extracts feature information, thereby obtaining the real search requirement information. The search requirement information is then associated with the search results that were ultimately obtained, and, on the basis of the search input information and the identity and correlation of the search output results, the knowledge base data is divided in a reasonable, search-requirement-oriented way to establish reasonable search guide data. On the one hand, because the search mode is built on search requirements, it fits the characteristics of those requirements: accurate search results can be obtained conveniently with more precise search input information, and quick historical search references can be provided for search operations. These references are highly relevant and targeted and can basically meet most search requirements under big data. On the other hand, because the search mode is built in the forward direction from search requirements, accurate search results are obtained more efficiently, the resources consumed by searching are reduced to a certain extent, and the operating pressure and burden on operators are greatly reduced.
The knowledge base analysis method based on artificial intelligence specifically comprises the following steps:
S1: Acquiring historical search data of the knowledge base, and classifying the search input information by category to form search input category data.
Acquiring historical search data of the knowledge base and classifying the search input information by category to form search input category data comprises: extracting all search input information according to the historical search data of the knowledge base; dividing the search input information according to information type to form different search input type information sets; extracting feature information from each search input type information in the different search input type information sets to form the search input unit feature information corresponding to each search input type information; and collecting all the search input unit feature information under a search input type information set to form the search input type feature information set.
The formation of the search input category data involves two aspects: dividing the search input information obtained from the knowledge base historical search data by data type, and extracting feature information, with reference to the search results, from all the divided search input information within each data type. Dividing by data type distinguishes the different data types so that feature information can be extracted in a targeted manner. Feature extraction differs completely between types such as image data, voice data, and text data, each of which has feature information unique to it, such as the color features of image data or the volume features of voice data. The feature information of each data type can be extracted with reference to the search output result information that was finally obtained, so that the information the search truly required is determined accurately from the final search result. Different operators may use different search inputs when searching for the same target, which produces a large volume of redundant input data, and may use the same search input when searching for different targets, which makes the search requirement ambiguous. Extracting features according to the search result therefore reduces the data volume of search requirements and avoids inaccurate search input information. Feature information is likewise extracted per data type: most image data carries graphic information such as shape and color, so the feature information of image input is mainly of these kinds, whereas the feature information of text input is mainly semantic.
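As a non-limiting illustration of step S1, the grouping by data type and the per-type feature extraction can be sketched as follows. The record layout (`type`, `input`), the function names, and the token-set stand-in for semantic features are all assumptions made for this sketch and are not specified by the application.

```python
from collections import defaultdict


def extract_feature(input_type: str, raw_input: str) -> str:
    """Hypothetical feature extractor with one branch per data type."""
    if input_type == "text":
        # Stand-in for semantic features: a lowercased, order-insensitive token set.
        return " ".join(sorted(set(raw_input.lower().split())))
    # Image, voice, and other types would need their own extractors
    # (shape, color, volume, ...); here the raw value is passed through.
    return raw_input


def build_category_feature_sets(history: list[dict]) -> dict[str, set[str]]:
    """Group search inputs by type and collect the unit features of each type."""
    feature_sets: dict[str, set[str]] = defaultdict(set)
    for record in history:
        feature = extract_feature(record["type"], record["input"])
        feature_sets[record["type"]].add(feature)
    return dict(feature_sets)


history = [
    {"type": "text", "input": "Welding types"},
    {"type": "text", "input": "types welding"},
    {"type": "image", "input": "red-circle-shape"},
]
category_sets = build_category_feature_sets(history)
```

In this sketch the two differently worded text inputs collapse into one unit feature, which mirrors the stated aim of reducing redundant input data.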
S2: Determining search output result data of the search input category data according to the knowledge base historical search data.
Determining search output result data of the search input category data according to the knowledge base historical search data comprises: extracting, according to the knowledge base historical search data, all search output result information corresponding to each search input unit feature information to form a search output result information set corresponding to that search input unit feature information; for the i-th search output result information in the search output result information set, determining its output history count, that is, the number of output histories in which it appears, where i indexes the different search output result information in the set; and determining the historical output frequency corresponding to each search output result information according to the output history counts of the different search output result information in the search output result information set.
Since feature extraction of the search input information is performed with reference to the search output result information, once the feature information of the search input is obtained, the search input information must be associated with the search result information to form pairs of real requirement and result, which provide the data basis for the subsequent forward, requirement-based type division of the knowledge base data. It should be noted that the analysis approach of the application establishes the search system from historical input requirements. The established search system can further use artificial intelligence technology to recognize and divide data types intelligently, and can continuously and automatically acquire search operation data to update its type division data, which keeps the search system current and accurate. It will be appreciated that a search result need not always be a single lowest-level result. For example, when an operator searches for general knowledge, such as types of welding or analysis methods for discrete samples, all specific results under that type are returned, and different operators may select different specific results as their actual search result. For such searches the application therefore counts the frequency of the different search result information to determine which specific results are commonly used under the search requirement and to provide reasonable guiding results for subsequent searches. Because operators' requirements tend to cluster, guidance on specific results based on frequency statistics is very useful.
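As a non-limiting illustration of the frequency statistics in step S2, the sketch below counts how often each output result was selected for a given input feature and converts the counts into frequencies. Taking the frequency as the count divided by the total count for that feature is an assumption of this sketch; the exact formula is not reproduced in the text above. The function name and the `(feature, result)` record layout are likewise hypothetical.

```python
from collections import Counter, defaultdict


def historical_output_frequencies(records):
    """Map each input feature to {output result: share of its selections}."""
    counts = defaultdict(Counter)
    for feature, result in records:
        counts[feature][result] += 1
    return {
        feature: {result: n / sum(counter.values()) for result, n in counter.items()}
        for feature, counter in counts.items()
    }


# Four historical searches with the same input feature and two distinct results.
records = [
    ("types welding", "MIG"),
    ("types welding", "TIG"),
    ("types welding", "MIG"),
    ("types welding", "MIG"),
]
frequencies = historical_output_frequencies(records)
```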
S3: Carrying out merging analysis on the search input category data and the corresponding search output result data to form similar search data.
Carrying out merging analysis on the search input category data and the corresponding search output result data to form similar search data, wherein the merging analysis comprises the following steps: in the same search input type characteristic information set, carrying out the following combined analysis: randomly selecting two pieces of search input unit feature information, and carrying out matching analysis based on the output result items to form similar item matching analysis results; if the matching analysis result of the similar items shows no matching, determining that the selected two search input unit feature information are not combined; if the matching analysis result of the similar items shows that the two search input unit feature information are matched, carrying out similar similarity analysis based on output frequency on the selected two search input unit feature information to form a similar similarity analysis result; when the similar similarity analysis results are displayed as dissimilar, determining that the selected two search input unit feature information are not combined; when the similar similarity analysis results are similar, forming a search input similar characteristic information group by using the selected two search input unit characteristic information sets, forming a search output similar characteristic result information group by using the corresponding search output result information sets, and determining the average historical output frequency of each search output result information; and finishing pairwise merging analysis of all the search input unit feature information in the search input type feature information set, collecting all the search input similar feature information sets and corresponding search output similar feature result information sets, and forming similar search data by the search input type feature information set which is not merged and the corresponding search output result information set.
Within the same type of search input data there are different search input information that correspond to the same search requirement target. To avoid redundant input information, further optimize the search input, and improve its accuracy, different search input information aimed at the same search requirement target needs to be merged. Different search input information can be merged in various ways; the application merges it by examining two aspects. The first is the degree of matching of the search results. When the search requirement target is one specific search result, a single comparison of results is enough to determine whether two inputs are feature information for the same requirement. When the search requirement is not one specific result, it corresponds to several specific search results, all of which are correct and genuinely required by the search, and the choice among them depends on other considerations of the operator, such as project requirements or design requirements. Therefore, the merging analysis of this kind of search requirement needs to consider how the result entries match.
The second is the frequency of use, especially when the search requirement is not one specific result. Because the specific results returned by the search are chosen according to the operator's other needs, the options commonly chosen under big data must be determined to guide the operator's selection, so comparing the frequency of use of the specific results is necessary. If two input requirements can be merged, then after their specific results are matched, the matching entries should also be similar in frequency of use; this similarity judgment accurately determines whether the two input requirements share the same, mergeable requirement feature information. In addition, all requirement input feature information within the same type of requirement data must be compared in pairs, which ensures that multiple input requirement feature information can be merged effectively when they are the same.
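As a non-limiting illustration of the pairwise merging analysis within one type, the sketch below compares every pair of unit features and records which pairs are merged and which features remain unmerged. The `can_merge` callable is a hypothetical stand-in for the two checks described above (entry matching followed by frequency similarity); the toy predicate in the usage lines simply compares output sets and is not the criterion of the application.

```python
from itertools import combinations
from typing import Callable


def pairwise_merge(features: list[str], can_merge: Callable[[str, str], bool]):
    """Return (merged pairs, features that were merged with nothing)."""
    groups = []
    merged = set()
    for a, b in combinations(features, 2):
        if can_merge(a, b):
            groups.append((a, b))
            merged.update((a, b))
    unmerged = [f for f in features if f not in merged]
    return groups, unmerged


# Toy data: output result sets recorded for three text-type unit features.
outputs = {
    "types welding": {"MIG", "TIG"},
    "welding kinds": {"MIG", "TIG"},
    "gear ratio": {"ratio table"},
}
groups, unmerged = pairwise_merge(list(outputs), lambda a, b: outputs[a] == outputs[b])
```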
Arbitrarily selecting two pieces of search input unit feature information and performing matching analysis based on the output result entries to form a similar item matching analysis result comprises: determining all search output result information corresponding to each of the two selected search input unit feature information, and setting a similar matching rate threshold for the following entry matching analysis: if the matching rate of the search output result information corresponding to the two search input unit feature information exceeds the similar matching rate threshold, outputting similar item matching result information; if the matching rate of the search output result information corresponding to the two search input unit feature information does not exceed the similar matching rate threshold, outputting similar item unmatched result information; where the matching rate is determined from the number of homogeneous matching entries and the number of unmatched search output result information that remain after the homogeneous matching entries are removed from all search output result information corresponding to the two search input unit feature information.
In the entry-matching-based merging analysis of search input feature information, the number of matching entries in the output results corresponding to the selected input requirement feature information, that is, the number of identical specific search output results, is determined first. Even when requirement input feature information is the same and can be merged, the corresponding output result data do not necessarily match one to one on the specific search results. On the one hand, the search input feature information is derived from the raw search input information and omits some other features, which can cause the search to return additional or different output results. On the other hand, even for the same search input feature information, the order, spacing, and similar aspects of the input features when the raw search input is entered can influence the output results and produce some additional specific output results. Therefore, to avoid excluding input requirement feature information of this kind from merging, whether to merge is determined by the percentage of matching entries, which further improves the rationality and comprehensiveness of feature information merging. The similar matching rate threshold can be determined according to actual conditions, or by big data analysis of the feature information data.
It should be noted that, since a search is itself a process of comparing information and progressively constraining the range to obtain a target, other useful or auxiliary information related to the target result may be provided to the operator along the way. Therefore, for the output results corresponding to the feature information merged after matching, the output results may be divided into a primary part and a secondary part, where the primary part is the search output results that can be matched and the secondary part is the remaining search output results that cannot be matched. In this way, the complete search result data of the input feature information is provided during subsequent data guidance, making the search input guidance of the knowledge base more comprehensive and reasonable.
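As a non-limiting illustration of the entry matching analysis and the primary/secondary split, the sketch below treats the matching rate as the share of matching entries among all distinct output results of the two features. This reading follows the description of "the percentage occupied by matching items" and is an assumption of the sketch, as are all function names.

```python
def matching_rate(results_a, results_b) -> float:
    """Share of matching entries among all distinct results of both features."""
    a, b = set(results_a), set(results_b)
    matched = len(a & b)    # entries returned for both features
    unmatched = len(a ^ b)  # entries returned for only one of them
    total = matched + unmatched
    return matched / total if total else 0.0


def items_match(results_a, results_b, threshold: float) -> bool:
    """A match is reported only when the rate exceeds the threshold."""
    return matching_rate(results_a, results_b) > threshold


def split_primary_secondary(results_a, results_b):
    """Primary part: matched results. Secondary part: the unmatched remainder."""
    a, b = set(results_a), set(results_b)
    return sorted(a & b), sorted(a ^ b)


rate = matching_rate({"MIG", "TIG", "arc"}, {"MIG", "TIG", "laser"})
primary, secondary = split_primary_secondary({"MIG", "TIG", "arc"}, {"MIG", "TIG", "laser"})
```

Keeping the secondary part alongside the primary part reflects the point that unmatched results may still be useful auxiliary information for the operator.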
If the similar item matching analysis result shows a match, performing output-frequency-based similar similarity analysis on the two selected search input unit feature information to form a similar similarity analysis result comprises: for the two selected search input unit feature information, determining the relative historical output frequency difference between the historical output frequencies of each pair of mutually matched search output result information; determining the historical average output frequency difference from all the relative historical output frequency differences; setting a similar frequency difference threshold, and making the following judgment according to the historical average output frequency difference: if the historical average output frequency difference does not exceed the similar frequency difference threshold, outputting same-type similar result information; if the historical average output frequency difference exceeds the similar frequency difference threshold, outputting same-type dissimilar result information.
In the similarity analysis of same-type search input feature information, the frequencies of use of the matching search output results should not differ too much, so judging similarity by the size of the difference in frequency of use is necessary and reasonable. At the same time, examining the frequency gap of a single matching entry can be skewed by data deviations, so examining the average difference in frequency of use across all matching entries is more reasonable. The similar frequency difference threshold can be determined according to actual conditions, or by big data analysis based on the knowledge base data of the feature information.
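As a non-limiting illustration of the frequency-based similarity judgment, the sketch below averages the frequency gaps of the matching entries and compares the average with a threshold. Using the absolute difference as the "relative historical output frequency difference" is an assumption of this sketch, and the function names are hypothetical.

```python
def mean_frequency_gap(freq_a: dict, freq_b: dict):
    """Average absolute frequency gap over the results both features share."""
    shared = freq_a.keys() & freq_b.keys()
    if not shared:
        return None  # nothing matched, so there is no gap to average
    return sum(abs(freq_a[k] - freq_b[k]) for k in shared) / len(shared)


def is_similar(freq_a: dict, freq_b: dict, threshold: float) -> bool:
    """Similar when the average gap does not exceed the threshold."""
    gap = mean_frequency_gap(freq_a, freq_b)
    return gap is not None and gap <= threshold


freq_a = {"MIG": 0.5, "TIG": 0.25, "arc": 0.25}
freq_b = {"MIG": 0.75, "TIG": 0.25}
gap = mean_frequency_gap(freq_a, freq_b)
```

Averaging over all matching entries, rather than testing each entry separately, follows the reasoning that a single entry can be skewed by data deviations.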
S4: Carrying out merging analysis on different types of search input category data according to different similar search data to form knowledge base search guide data.
Carrying out merging analysis of different types of search input category data according to different similar search data to form knowledge base search guide data comprises: arbitrarily selecting two similar search data of different types, arbitrarily selecting one feature information from each, and performing merging analysis as follows: performing matching analysis based on the output result entries on the two selected feature information to form a non-homogeneous item matching analysis result; if the non-homogeneous item matching analysis result shows no match, determining that the two selected feature information are not merged; if the non-homogeneous item matching analysis result shows a match, performing output-frequency-based non-homogeneous similarity analysis on the two selected feature information to form a non-homogeneous similarity analysis result; when the non-homogeneous similarity analysis result shows dissimilar, determining that the two selected feature information are not merged; when the non-homogeneous similarity analysis result shows similar, forming a search input guide feature information group from the two selected feature information and a search output guide feature result information group from the corresponding search output result information sets; and completing the pairwise merging analysis of all feature information in the two similar search data, and collecting all search input guide feature information groups and corresponding search output guide feature result information groups, the unmerged search input type feature information sets and corresponding search output result information sets, and all unmerged search input similar feature information groups and corresponding search output similar feature result information groups to form the knowledge base search guide data.
After the merging analysis of same-type requirement input feature information is completed, the workload of searching and analyzing same-type data in the knowledge base is reduced to a certain extent, and a data basis is provided for establishing common search requirement terms or representative guide data for each type, which makes the search and analysis of requirement inputs more efficient and reasonable. In addition, the diversity of data types in the knowledge base increases its data dimensions and gives operators a wider range of search modes, improving operating efficiency and search accuracy. As a result, some input requirement information with the same feature information exists across different types of search input data, and this information also needs merging analysis to reduce redundancy in data analysis. This makes it possible to recognize the requirement quickly and efficiently, and to provide accurate and reasonable output result data, when operators have the same search target but use different search requirement input information. Similarly, the merging of non-homogeneous requirement input feature information is determined by examining the degree of matching and the similarity of the matching entries of the corresponding output result information.
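As a non-limiting illustration of step S4, the sketch below pairs each feature of one data type with each feature of another data type and keeps the pairs that pass a merge check. The `can_merge` callable is a hypothetical stand-in for the non-homogeneous entry matching and frequency similarity checks; the overlap predicate and the `text:` and `image:` labels in the usage lines are toy assumptions.

```python
from itertools import product
from typing import Callable


def cross_type_merge(features_a: list[str], features_b: list[str],
                     can_merge: Callable[[str, str], bool]):
    """Return (guide groups across the two types, features left unmerged)."""
    guide_groups = [(a, b) for a, b in product(features_a, features_b) if can_merge(a, b)]
    merged = {f for pair in guide_groups for f in pair}
    unmerged = [f for f in [*features_a, *features_b] if f not in merged]
    return guide_groups, unmerged


# Toy data: output result sets recorded for one text feature and two image features.
outputs = {
    "text:types welding": {"MIG", "TIG"},
    "image:weld seam": {"MIG", "TIG"},
    "image:gear": {"ratio table"},
}
text_features = ["text:types welding"]
image_features = ["image:weld seam", "image:gear"]
guide_groups, unmerged = cross_type_merge(
    text_features, image_features, lambda a, b: bool(outputs[a] & outputs[b])
)
```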
Performing matching analysis based on the output result entries on the two arbitrarily selected feature information to form a non-homogeneous item matching analysis result comprises: determining all search output result information corresponding to each of the two selected feature information, and setting a non-homogeneous matching rate threshold for the following entry matching analysis: if the matching rate of the search output result information corresponding to the two feature information exceeds the non-homogeneous matching rate threshold, outputting non-homogeneous item matching result information; if the matching rate of the search output result information corresponding to the two feature information does not exceed the non-homogeneous matching rate threshold, outputting non-homogeneous item non-matching result information; where the matching rate is determined from the number of non-homogeneous matching entries and the number of unmatched search output result information that remain after the non-homogeneous matching entries are removed from all search output result information corresponding to the two feature information.
The non-homogeneous matching rate threshold is the criterion for judging whether non-homogeneous demand input feature information matches on its output results. It can be determined from the actual analysis situation, or by big data analysis of the feature information data in the database. Similarly, since a search is itself a process of comparing information and progressively constraining the range to obtain a target, other useful or auxiliary information related to the target result may be provided to the operator along the way. Therefore, for the output results corresponding to the feature information merged after matching, the output results may be divided into a primary part and a secondary part, where the primary part is the search output results that can be matched and the secondary part is the remaining search output results that cannot be matched. In this way, the complete search result data of the input feature information is provided during subsequent data guidance, making the search input guidance of the knowledge base more comprehensive and reasonable.
If the non-homogeneous item matching analysis result shows a match, performing output-frequency-based non-homogeneous similarity analysis on the two selected feature information to form a non-homogeneous similarity analysis result comprises: for the two selected feature information, determining the non-homogeneous relative historical output frequency difference between the historical output frequencies of each pair of mutually matched search output result information; determining the non-homogeneous historical average output frequency difference from all the non-homogeneous relative historical output frequency differences; setting a non-homogeneous frequency difference threshold, and making the following judgment according to the non-homogeneous historical average output frequency difference: if the non-homogeneous historical average output frequency difference does not exceed the non-homogeneous frequency difference threshold, outputting non-homogeneous similar result information; and if the non-homogeneous historical average output frequency difference exceeds the non-homogeneous frequency difference threshold, outputting non-homogeneous dissimilar result information.
The similarity analysis based on frequency of use is likewise determined by the average difference in frequency of use of the matching entries. The non-homogeneous frequency difference threshold can be determined according to the actual analysis situation, or by big data analysis based on the data of the knowledge base.
The invention also provides a knowledge base analysis system based on artificial intelligence, which adopts the knowledge base analysis method based on artificial intelligence provided by the invention, and is configured to acquire knowledge base historical search data, perform category division of search input information and form search input category data; determining search output result data of the search input category data according to the knowledge base historical search data; carrying out merging analysis on the search input category data and the corresponding search output result data to form similar search data; and carrying out merging analysis on different types of search input type data according to different similar search data to form knowledge base search guide data.
By providing a knowledge base data analysis function designed around search efficiency, the system effectively ensures the integrity and stability of knowledge base analysis and provides a stable material basis for it.
In summary, the method for analyzing the knowledge base based on the artificial intelligence provided by the embodiment of the invention has the beneficial effects that:
The method acquires historical search input data for the knowledge base, divides the historical search requirements into types, and extracts feature information, thereby extracting the real search requirement information. The search requirement information is then mapped to the search results that were finally obtained, and, on the basis of the search input information and taking into account the identity and correlation of the search output results, the knowledge base data is divided in a reasonable, search-requirement-oriented way, so that reasonable search guide data is established. On the one hand, the search mode is built on search requirements and fully conforms to their characteristics: when searching, more accurate search input information can be used to obtain accurate search results conveniently, and search operations are given quick historical search references that are highly referential and targeted and can basically meet most search requirements under big data. On the other hand, because the search mode is built forward from the search requirements, accurate search results are obtained more efficiently, the resources consumed by searching are reduced to a certain extent, and the operating pressure and burden on operators are also greatly reduced.
In the embodiment of the application, the indication can comprise direct indication and indirect indication, and can also comprise explicit indication and implicit indication. In the specific implementation process, the manner of indicating the information to be indicated is various, for example, but not limited to, the information to be indicated may be directly indicated, such as the information to be indicated itself or an index of the information to be indicated. The information to be indicated can also be indicated indirectly by indicating other information, wherein the other information and the information to be indicated have an association relation. It is also possible to indicate only a part of the information to be indicated, while other parts of the information to be indicated are known or agreed in advance. For example, the indication of the specific information may also be achieved by means of a pre-agreed (e.g., protocol-specified) arrangement sequence of the respective information, thereby reducing the indication overhead to some extent. And meanwhile, the universal part of each information can be identified and indicated uniformly, so that the indication cost caused by independently indicating the same information is reduced.
The specific indication means may be any of various existing indication means, such as, but not limited to, the above indication means, various combinations thereof, and the like. Specific details of various indications may be referred to the prior art and are not described herein. As can be seen from the above, for example, when multiple pieces of information of the same type need to be indicated, different manners of indication of different pieces of information may occur. In a specific implementation process, a required indication mode can be selected according to specific needs, and the selected indication mode is not limited in the embodiment of the present application, so that the indication mode according to the embodiment of the present application is understood to cover various methods that can enable a party to be indicated to learn information to be indicated.
It should be understood that the information to be indicated may be sent together as a whole, or may be divided into a plurality of pieces of sub-information and sent separately, and the sending periods and/or sending timings of these pieces of sub-information may be the same or different. The embodiments of the present application do not limit the specific sending method. The sending period and/or sending timing of the sub-information may be predefined, for example according to a protocol, or may be configured by the transmitting end device by sending configuration information to the receiving end device.
The "pre-defining" or "pre-configuring" may be implemented by pre-storing, in the device, corresponding codes, tables, or other means that can be used to indicate the relevant information; the embodiments of the present application do not limit the specific implementation. Here, "storing" may refer to storing in one or more memories. The one or more memories may be provided separately, or may be integrated in an encoder or decoder, a processor, or a communication device. Alternatively, some of the one or more memories may be provided separately while the others are integrated in a decoder, processor, or communication device. The memory may be any form of storage medium, and the embodiments of the present application are not limited in this regard.
The "protocol" referred to in the embodiments of the present application may refer to a protocol family in the communication field, a standard protocol similar to a frame structure of the protocol family, or a related protocol applied to a future communication system, which is not specifically limited in the embodiments of the present application.
In the embodiments of the present application, expressions such as "when ...", "in the case of ...", and "if" all mean that the device performs the corresponding processing under some objective condition; they do not limit the timing, do not require the device to perform a judging action during implementation, and do not imply any other limitation.
In the description of the embodiments of the present application, unless otherwise indicated, "/" means that the objects before and after it are in an "or" relationship; for example, A/B may represent A or B. The term "and/or" merely describes an association relationship between associated objects and indicates that three relationships may exist; for example, A and/or B may indicate: A alone, both A and B, or B alone, where A and B may each be singular or plural. Also, unless otherwise indicated, "plurality" means two or more. "At least one of" or a similar expression means any combination of the listed items, including any combination of single items or plural items. For example, at least one of a, b, or c may represent: a, b, c, a-b, a-c, b-c, or a-b-c, where a, b, and c may each be single or plural. In addition, to describe the technical solutions of the embodiments clearly, the words "first", "second", and the like are used to distinguish between identical or similar items having substantially the same function and effect. Those skilled in the art will appreciate that the words "first", "second", and the like do not limit quantity or order of execution, and do not necessarily indicate that the items differ. Meanwhile, in the embodiments of the present application, words such as "exemplary" or "for example" are used to mean serving as an example, illustration, or explanation. Any embodiment or design described as "exemplary" or "for example" should not be construed as preferred or advantageous over other embodiments or designs. Rather, such words are intended to present related concepts in a concrete fashion that is easy to understand.
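The enumeration given for "at least one of a, b, or c" is simply the set of non-empty combinations of the listed items. As an illustration only (the function name is hypothetical), it can be reproduced as follows:

```python
from itertools import combinations


def at_least_one_of(items):
    """List every non-empty combination of the given items, smallest first."""
    return [
        "-".join(combo)
        for size in range(1, len(items) + 1)
        for combo in combinations(items, size)
    ]


print(at_least_one_of(["a", "b", "c"]))
# prints ['a', 'b', 'c', 'a-b', 'a-c', 'b-c', 'a-b-c']
```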
It should be appreciated that the processor in the embodiments of the present application may be a central processing unit (CPU), or may be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
It should also be appreciated that the memory in the embodiments of the present application may be either volatile memory or nonvolatile memory, or may include both volatile and nonvolatile memory. The nonvolatile memory may be a read-only memory (ROM), a programmable ROM (PROM), an erasable PROM (EPROM), an electrically erasable PROM (EEPROM), or a flash memory. The volatile memory may be a random access memory (RAM), which serves as an external cache. By way of example and not limitation, many forms of RAM are available, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), and direct Rambus RAM (DR RAM).
The above embodiments may be implemented in whole or in part by software, hardware (e.g., circuitry), firmware, or any combination thereof. When implemented in software, the above embodiments may be implemented in whole or in part in the form of a computer program product. The computer program product comprises one or more computer instructions or computer programs. When the computer instructions or computer programs are loaded or executed on a computer, the processes or functions described in the embodiments of the present application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired manner or a wireless manner (e.g., infrared, radio, or microwave). The computer-readable storage medium may be any usable medium that can be accessed by a computer, or a data storage device, such as a server or data center, that integrates one or more usable media. The usable medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium. The semiconductor medium may be a solid state disk.
It should be understood that the term "and/or" herein merely describes an association relationship between associated objects and means that three relationships may exist; for example, A and/or B may mean: A alone, both A and B, or B alone, where A and B may each be singular or plural. In addition, the character "/" herein generally indicates that the associated objects are in an "or" relationship, but may also indicate an "and/or" relationship, as can be understood from the context.
In the present application, "at least one" means one or more, and "a plurality" means two or more. "at least one of" or the like means any combination of these items, including any combination of single item(s) or plural items(s). For example, at least one (one) of a, b, or c may represent: a, b, c, a-b, a-c, b-c, or a-b-c, wherein a, b, c may be single or plural.
It should be understood that, in various embodiments of the present application, the sequence numbers of the foregoing processes do not mean the order of execution, and the order of execution of the processes should be determined by the functions and internal logic thereof, and should not constitute any limitation on the implementation process of the embodiments of the present application.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, and are not repeated herein.
In the several embodiments provided by the present application, it should be understood that the disclosed systems, devices, and methods may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of the units is merely a logical function division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit.
If the functions are implemented in the form of software functional units and sold or used as a stand-alone product, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application in essence, or the part contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and comprises several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the methods according to the embodiments of the present application. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, or any other medium capable of storing program code.
The foregoing is merely a specific implementation of the present application, and the protection scope of the present application is not limited thereto. Any variation or substitution that a person skilled in the art can readily conceive of within the technical scope disclosed herein shall fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (10)

1. A method for knowledge base analysis based on artificial intelligence, comprising:
acquiring historical search data of a knowledge base, and classifying the search input information to form search input category data;
determining search output result data of the search input category data according to the knowledge base historical search data;
carrying out merging analysis on the search input category data and the corresponding search output result data to form similar search data;
and carrying out merging analysis on different types of search input category data according to different similar search data to form knowledge base search guide data.
2. The method for analyzing knowledge base based on artificial intelligence according to claim 1, wherein the obtaining knowledge base history search data, performing category classification of search input information, forming search input category data, comprises:
extracting all search input information according to the knowledge base history search data;
dividing the search input information according to information types to form different search input type information sets;
extracting characteristic information of each piece of search input type information in different search input type information sets to form search input unit characteristic information corresponding to each piece of search input type information;
and collecting all the search input unit characteristic information under the search input type information set to form a search input type characteristic information set.
3. The artificial intelligence based knowledge base analysis method of claim 2, wherein said determining search output result data of said search input category data based on said knowledge base historical search data comprises:
extracting all search output result information corresponding to each search input unit feature information according to the knowledge base historical search data to form a search output result information set corresponding to the search input unit feature information;
determining, for each search output result information in the search output result information set, the number of historical outputs in which that search output result information appears, where i indexes the different search output result information in the search output result information set;
and determining the historical output frequency corresponding to each search output result information according to the numbers of historical outputs of the different search output result information in the search output result information set.
4. The method of claim 3, wherein the performing a merging analysis of the same type of search input category data with the corresponding search output result data to form a same type of search data comprises:
and in the same searching input type characteristic information set, carrying out the following combined analysis:
arbitrarily selecting two pieces of the search input unit feature information, and carrying out matching analysis based on the output result items to form a similar item matching analysis result;
if the similar item matching analysis result shows no match, determining that the two selected search input unit feature information are not merged;
if the similar item matching analysis result shows a match, performing similar similarity analysis based on output frequency on the two selected search input unit feature information to form a similar similarity analysis result;
when the similar similarity analysis result shows dissimilarity, determining that the two selected search input unit feature information are not merged;
when the similar similarity analysis result shows similarity, forming a search input similar feature information group from the two selected search input unit feature information, forming a search output similar feature information group from the corresponding search output result information sets, and determining the average historical output frequency of each search output result information;
and completing pairwise merging analysis of all the search input unit feature information in the search input type feature information set, collecting all the search input similar feature information sets and the corresponding search output similar feature result information sets, and forming similar search data by the search input type feature information set and the corresponding search output result information set which are not merged.
5. The method for analyzing a knowledge base based on artificial intelligence according to claim 4, wherein the arbitrarily selecting two pieces of the search input unit feature information, performing a matching analysis based on the output result items, and forming a similar item matching analysis result, includes:
determining all the search output result information corresponding to each of the two selected search input unit feature information, and setting a similar matching rate threshold to perform the following item matching analysis:
if the matching rate of the search output result information corresponding to the two search input unit feature information exceeds the similar matching rate threshold, outputting similar item matching result information;
if the matching rate of the search output result information corresponding to the two search input unit feature information does not exceed the similar matching rate threshold, outputting similar item unmatched result information;
wherein the matching rate is determined from the number of similar matching items and the number of unmatched items, the unmatched items being the search output result information that remains, among all the search output result information corresponding to the two search input unit feature information, after the similar matching items are excluded.
6. The method according to claim 5, wherein if the similar item matching analysis result shows a match, performing similar similarity analysis based on output frequency on the selected two search input unit feature information to form a similar similarity analysis result, including:
determining, for the two selected search input unit feature information, the relative historical output frequency difference between the historical output frequencies of every pair of mutually matched search output result information;
determining a historical average output frequency difference from all the relative historical output frequency differences;
setting a similar frequency difference threshold, and carrying out the following analysis and judgment according to the historical average output frequency difference:
if the historical average output frequency difference does not exceed the similar frequency difference threshold, outputting similar result information of the same kind;
if the historical average output frequency difference exceeds the similar frequency difference threshold, outputting dissimilar result information of the same kind.
7. The method of claim 6, wherein the performing a merging analysis of different types of search input type data according to different similar search data to form knowledge base search guide data comprises:
two similar search data of different types are arbitrarily selected, and feature information is arbitrarily selected from the two similar search data respectively, so that the combination analysis of the following modes is performed:
two pieces of characteristic information are selected at will, matching analysis based on the output result items is carried out, and non-similar item matching analysis results are formed;
if the non-homogeneous item matching analysis result shows no match, determining that the two selected pieces of feature information are not merged;
if the non-homogeneous item matching analysis result shows a match, performing non-homogeneous similarity analysis based on output frequency on the two selected pieces of feature information to form a non-homogeneous similarity analysis result;
when the non-similar similarity analysis results show that the two pieces of characteristic information are dissimilar, determining that the two pieces of characteristic information are not combined;
when the non-similar similarity analysis results are similar, forming a search input guide feature information group by the selected two feature information sets, and forming a search output guide feature information group by the corresponding search output result information sets;
And completing pairwise merging analysis of all feature information in the two same-type search data, and integrating all the search input guide feature information groups and the corresponding search output guide feature result information groups, the search input type feature information sets and the corresponding search output result information sets which are not merged, and all the search input same-type feature information sets and the corresponding search output same-type feature result information sets which are not merged to form the knowledge base search guide data.
8. The method for analyzing a knowledge base based on artificial intelligence according to claim 7, wherein the arbitrarily selecting two pieces of characteristic information, performing a matching analysis based on the output result items, and forming a non-homogeneous item matching analysis result, includes:
determining all the search output result information corresponding to each of the two selected pieces of feature information, setting a non-homogeneous matching rate threshold, and carrying out the following item matching analysis:
if the matching rate of the search output result information corresponding to the two pieces of feature information exceeds the non-homogeneous matching rate threshold, outputting non-homogeneous item matching result information;
if the matching rate of the search output result information corresponding to the two pieces of feature information does not exceed the non-homogeneous matching rate threshold, outputting non-homogeneous item unmatched result information;
wherein the matching rate is determined from the number of non-homogeneous matching items and the number of unmatched items, the unmatched items being the search output result information that remains, among all the search output result information corresponding to the two pieces of feature information, after the non-homogeneous matching items are excluded.
9. The method according to claim 8, wherein if the non-homogeneous item matching analysis result shows a match, performing non-homogeneous similarity analysis based on output frequency on the two selected feature information to form a non-homogeneous similarity analysis result, including:
determining, for the two selected pieces of feature information, the non-homogeneous relative historical output frequency difference between the historical output frequencies of every pair of mutually matched search output result information;
determining a non-homogeneous historical average output frequency difference from all the non-homogeneous relative historical output frequency differences;
setting a non-homogeneous frequency difference threshold, and carrying out the following analysis and judgment according to the non-homogeneous historical average output frequency difference:
if the non-homogeneous historical average output frequency difference does not exceed the non-homogeneous frequency difference threshold, outputting non-homogeneous similar result information;
and if the non-homogeneous historical average output frequency difference exceeds the non-homogeneous frequency difference threshold, outputting non-homogeneous dissimilar result information.
10. An artificial intelligence based knowledge base analysis system adopting the artificial intelligence based knowledge base analysis method of any one of claims 1-9, characterized in that the artificial intelligence based knowledge base analysis system is configured to obtain knowledge base historical search data, perform category division of search input information, and form search input category data; determining search output result data of the search input category data according to the knowledge base historical search data; carrying out merging analysis on the search input category data and the corresponding search output result data to form similar search data; and carrying out merging analysis on different types of search input type data according to different similar search data to form knowledge base search guide data.
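The item matching analysis recited in claims 5 and 8 can be illustrated with the following Python sketch. It is not the claimed implementation: the matching rate formula is not reproduced in this text, so the sketch assumes it to be the number of matched items divided by the sum of matched and unmatched items, and all names and sample data are hypothetical.

```python
from itertools import combinations


def matching_rate(results_a, results_b):
    """Assumed matching rate: matched / (matched + unmatched)."""
    matched = results_a & results_b
    # Unmatched items: everything left over once the matched items are excluded.
    unmatched = (results_a | results_b) - matched
    total = len(matched) + len(unmatched)
    return len(matched) / total if total else 0.0


def matching_pairs(feature_results, rate_threshold):
    """Return every pair of features whose matching rate exceeds the threshold."""
    pairs = []
    for (fa, ra), (fb, rb) in combinations(feature_results.items(), 2):
        if matching_rate(ra, rb) > rate_threshold:
            pairs.append((fa, fb))
    return pairs


# Hypothetical sets of search output results per input feature.
feature_results = {
    "reset password": {"doc1", "doc2", "doc3"},
    "forgot password": {"doc1", "doc2"},
    "install driver": {"doc9"},
}
print(matching_pairs(feature_results, rate_threshold=0.5))
```

Here the first two features share two of three distinct results (rate 2/3), so only that pair passes the threshold; such a pair would then proceed to the frequency-based similarity analysis before any merging is decided.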
CN202410302742.3A 2024-03-18 2024-03-18 Knowledge base analysis method and system based on artificial intelligence Active CN117891851B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410302742.3A CN117891851B (en) 2024-03-18 2024-03-18 Knowledge base analysis method and system based on artificial intelligence

Publications (2)

Publication Number Publication Date
CN117891851A (en) 2024-04-16
CN117891851B (en) 2024-06-11

Family

ID=90652094

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410302742.3A Active CN117891851B (en) 2024-03-18 2024-03-18 Knowledge base analysis method and system based on artificial intelligence

Country Status (1)

Country Link
CN (1) CN117891851B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100161663A1 (en) * 2008-12-19 2010-06-24 International Business Machines Corporation Searching For A Business Name In A Database
US20110060734A1 (en) * 2009-04-29 2011-03-10 Alibaba Group Holding Limited Method and Apparatus of Knowledge Base Building
CN105786871A (en) * 2014-12-23 2016-07-20 北京奇虎科技有限公司 Question-answer search result display method and device based on search terms
CN111680125A (en) * 2020-06-05 2020-09-18 深圳市华云中盛科技股份有限公司 Litigation case analysis method, litigation case analysis device, computer device, and storage medium
CN113032673A (en) * 2021-03-24 2021-06-25 北京百度网讯科技有限公司 Resource acquisition method and device, computer equipment and storage medium
CN114491148A (en) * 2022-04-14 2022-05-13 武汉中科通达高新技术股份有限公司 Target person searching method and device, computer equipment and storage medium
CN117197591A (en) * 2023-11-06 2023-12-08 青岛创新奇智科技集团股份有限公司 Data classification method based on machine learning

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
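Hannah Bast et al.: "Semantic Search on Text and Knowledge Bases", Foundations and Trends in Information Retrieval, vol. 10, 22 July 2016 (2016-07-22) *
Li Aijun; Wang Haibin; Zheng Xiaobo: "Research on an intelligent electric power search engine based on an inference control strategy", Journal of Xihua University (Natural Science Edition), no. 06, 15 November 2008 (2008-11-15) *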

Similar Documents

Publication Publication Date Title
WO2020007028A1 (en) Medical consultation data recommendation method, device, computer apparatus, and storage medium
JP2021531591A (en) Association recommendation method, equipment, computer equipment and storage media
WO2017097231A1 (en) Topic processing method and device
CN111190792B (en) Log storage method and device, electronic equipment and readable storage medium
US20100131485A1 (en) Method and system for automatic construction of information organization structure for related information browsing
CN112035599B (en) Query method and device based on vertical search, computer equipment and storage medium
CN110704743A (en) Semantic search method and device based on knowledge graph
KR101577376B1 (en) System and method for determining infringement of copyright based on the text reference point
EP3407209A1 (en) Apparatus and method for extracting and storing events from a plurality of heterogeneous sources
CN110309504B (en) Text processing method, device, equipment and storage medium based on word segmentation
CN112560444A (en) Text processing method and device, computer equipment and storage medium
Berko et al. A method to solve uncertainty problem for big data sources
CN110569419A (en) Question-answering system optimization method and device, computer equipment and storage medium
CN114817243A (en) Method, device and equipment for establishing database joint index and storage medium
Ge et al. REQUEST: A scalable framework for interactive construction of exploratory queries
CN115576999A (en) Task data processing method, device and equipment based on cloud platform and storage medium
CN113254624B (en) Intelligent question-answering processing method, device, equipment and medium based on artificial intelligence
CN117891851B (en) Knowledge base analysis method and system based on artificial intelligence
US10866944B2 (en) Reconciled data storage system
CN106407332B (en) Search method and device based on artificial intelligence
CN111723179A (en) Feedback model information retrieval method, system and medium based on concept map
US20180336242A1 (en) Apparatus and method for generating a multiple-event pattern query
CN112965998A (en) Compound database establishing and searching method and system
CN107239517B (en) Multi-condition searching method and device based on Hbase database
CN110941765A (en) Search intention identification method, information search method and device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant