WO2021017633A1 - Technical open digital asset retrieval method - Google Patents

Technical open digital asset retrieval method

Info

Publication number
WO2021017633A1
Authority
WO
WIPO (PCT)
Prior art keywords
technical solution
subset
target
technical
patent classification
Prior art date
Application number
PCT/CN2020/094207
Other languages
French (fr)
Chinese (zh)
Inventor
白杰
Original Assignee
南京瑞祥信息技术有限公司
Priority date
Filing date
Publication date
Application filed by 南京瑞祥信息技术有限公司
Publication of WO2021017633A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0282 Rating or review of business operators or products
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06Q30/0623 Item investigation

Definitions

  • This application relates to the field of Internet data processing, in particular to a retrieval method of technical open digital assets.
  • Tradable digital assets, that is, certified digital assets, are stored on the digital asset registration platform of the transaction center. The platform manages and uses the digital asset information before or after a transaction, for example by sharing information with other data platforms and by searching or verifying the digital assets stored on the platform upon request.
  • Digital assets usually have intellectual property at their core. They mainly include technical, design, and expression digital assets: technical digital assets such as patent digital assets, design digital assets such as copyright or appearance (design) digital assets, and expression digital assets such as trademark digital assets. A technical digital asset, such as a patent digital asset, may consist of multiple patents that stand in competitive and complementary relationships, together with their accompanying or dependent parent assets.
  • Figure 1 is an application scenario diagram of the digital asset registration platform. In the figure, the user accesses the digital asset registration platform 2 over the Internet through software such as clients 11 and 12 or an app installed on the terminal 1.
  • the certified digital asset registered on the digital asset registration platform 2 is in the form of a data package, refer to Figure 2 ( Figure 2 is an example of the structure and content of the certified digital asset data package).
  • the data package includes two parts, a digital asset description item 21 and a digital asset entity 22.
  • the digital asset description item 21 is stored on the digital asset registration platform 2, and the digital asset entity 22 is stored in a centralized or decentralized form on a local server or a third-party server 3. If the digital asset registration platform 2 is a public chain node or a dedicated sub-chain node of the blockchain network 4, the digital asset entity 22 will also be stored in the dedicated sub-chain of the blockchain.
  • the digital asset description item 21 includes a digital asset registration item 211 and a digital asset technical description 212.
  • the digital asset registration item 211 may include multiple (at least one) items, for example, including authentication code information and the bibliographic data of hundreds of competitive patents or complementary patents.
  • the digital asset technical description 212 is a comprehensive description of the technology, legal status, market information, and so on of the assets listed in the registration item 211. In form, the description includes an abstract and a detailed description.
  • technical digital assets, such as patent digital assets, usually include multiple patents or patent portfolios with completely different properties.
  • for example, a patent asset data package for an engine may cover structure, materials, control, software, and even chemistry. Its open requirements are therefore also diverse, which makes content-based retrieval of a whole patent asset data package with open requirements more difficult.
  • the purpose of this application is to provide a method that takes the entire patent digital asset data package as the retrieval object and can effectively retrieve the target patent digital asset data package in an automated manner.
  • the first technical open digital asset retrieval method provided by this application includes:
  • calculating the similarity index between the target technical solution set A and each technical solution subset B;
  • sorting and outputting the data packets in the digital asset data packet set according to the similarity index.
  • the second technical open digital asset retrieval method provided by this application includes:
  • calculating, according to the patent classification number set A and each patent classification number set B, the similarity index between the target technical solution set A and each technical solution subset B;
  • sorting and outputting the data packets in the digital asset data packet set according to the similarity index.
  • a technical solution requirement description file is set in each digital asset data package. Then, based on the technical description file provided with the search request, the relative similarity between pairs of technical solutions is calculated for each data package in the set to be tested, either from the technical solutions themselves or from their technical classifications. From these values the relative similarity between the two technical solution sets is obtained, so that technical system requirements can be searched online as a whole.
  • the technical solution of this application has two main features. The first is that relative, vague, or even inexact similarity indexes between individual technical solutions are used to query, and to express quantitatively, a data package (a technical system composed of technical solutions from different fields) within a collection of such packages; this overcomes the limitation of traditional querying, which can only use the keywords of a technical solution as clues. The second is that the retrieval serves the needs of the retrieved data package, that is, the needs of others, in contrast to traditional retrieval, which serves only the searcher's own needs.
  • Figure 1 is an application scenario diagram of the digital asset registration platform
  • Figure 2 is an example diagram of the structure and content of a certified digital asset data package
  • Figure 3 is a flowchart of the first technical open digital asset retrieval method given in an embodiment of the application.
  • FIG. 4 is an example diagram of calculating the similarity index between the target technical solution set A and the technical solution subset B used in the process described in FIG. 3;
  • FIG. 5 is an example diagram of calculating the similarity between the target technical solution set A and the technical solution subset B used in the example of FIG. 4;
  • FIG. 6 is a flowchart of the second method of calculating the similarity index between the target technical solution set A and the technical solution subset B used in the process of FIG. 3;
  • Fig. 7 is a flowchart of a second method for searching technical digital assets according to an embodiment of the present application.
  • FIG. 8 is a flowchart of the first method of calculating the similarity index between the target technical solution set A and the technical solution subset B used in the process described in FIG. 7.
  • a technical system is an organic collection of multiple technical solutions at different levels and with different content. These technical solutions may belong to different fields or disciplines, and may or may not be related to one another.
  • for example, the technical system of an engine involves mechanical, material, circuit control, and software control solutions which, viewed as individual technical solutions, may have no direct relationship with each other.
  • conversely, a single technical solution may be used in different technical systems.
  • a single technical solution may therefore not reflect a technical system at all, and the overall technical system should not be judged through specific individual technical solutions; the whole cannot be replaced by a part. Because the technical solutions required by a data package are diverse and uncertain, traditional keyword-based retrieval and retrieval based on technical fields are ineffective for this purpose.
  • Fig. 3 is a flowchart of the first technical open digital asset retrieval method given in an embodiment of the present application.
  • the first step (step 31) is to set a technical solution requirement description file in the digital asset data package.
  • This file records the technical solutions required for the technological evolution of the digital asset data package.
  • These solutions can be competitive, complementary, or related to the upstream and downstream of the industrial chain.
  • the problem solved by the solution can be a technical problem or a legal problem, and so on.
  • the implementation of this step is usually set at the first formation stage of the digital asset data package, and its content can be modified and supplemented during the transaction process.
  • the next step is the retrieval method realized by the calculation program.
  • a retrieval request sent by a legitimate client is obtained.
  • the retrieval request includes a technical description file of the target digital asset.
  • This file contains one or more technical solutions.
  • These are usually descriptions of technical solutions that the requestor can provide, and together they constitute the target technical solution set A.
  • the solution description can take any form that clearly describes the digital asset data package, such as a Word document or a PDF document.
  • the search request may also contain other restrictive conditions to narrow the search scope.
  • In step 33, a collection of digital asset data packets to be detected is obtained on the system platform or in the blockchain network according to the retrieval conditions.
  • For each digital asset data package, the technical solution subset B corresponding to all of its technical points is obtained from its technical description part or document.
  • In step 34, the similarity index between the target technical solution set A and each technical solution subset B is calculated.
  • The similarity index characterizes the overall similarity between each digital asset data packet in the set to be detected and the technical solutions given in the target digital asset technical description file of the retrieval request.
  • In step 35, the data packets in the digital asset data packet set to be detected are reordered and output according to the similarity index, so as to realize the retrieval of technical digital assets.
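As an illustration only, steps 33 to 35 can be sketched as a scoring-and-sorting routine. The names `rank_packages` and `shared_word_count` are hypothetical, the similarity function is a placeholder supplied by the caller (it is not the index defined in this application), and sorting from the highest index downward is an assumption about what "reordered" means.

```python
from typing import Callable, Dict, List, Tuple


def rank_packages(
    target_set_a: List[str],
    packages: Dict[str, List[str]],
    similarity_index: Callable[[List[str], List[str]], float],
) -> List[Tuple[str, float]]:
    """Score each candidate package's subset B against set A and sort the results."""
    scored = [
        (package_id, similarity_index(target_set_a, subset_b))
        for package_id, subset_b in packages.items()
    ]
    # Assumption: a larger index means a better match, so sort in descending order.
    return sorted(scored, key=lambda item: item[1], reverse=True)


def shared_word_count(set_a: List[str], subset_b: List[str]) -> float:
    """Placeholder index (an assumption): number of distinct words shared by A and B."""
    words_a = {word.lower() for text in set_a for word in text.split()}
    words_b = {word.lower() for text in subset_b for word in text.split()}
    return float(len(words_a & words_b))


ranking = rank_packages(
    ["engine cooling control"],
    {
        "pkg-1": ["battery charging circuit"],
        "pkg-2": ["engine control software", "cooling pump"],
    },
    shared_word_count,
)
print(ranking)
```

Any of the similarity calculations described later could be passed in place of `shared_word_count` without changing the ranking routine.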
  • the calculation in step 34 of the similarity index between the target technical solution set A and a technical solution subset B may adopt the following sub-steps.
  • FIG. 4 is an example diagram of calculating the similarity index between the target technical solution set A and the technical solution subset B used in step 34.
  • the technical solutions in the technical solution subsets corresponding to the digital asset data packages 421, 422, and 423 are determined.
  • the technical solution subset of the data package 421 includes technical solutions 211 and 212; that of the data package 422 includes technical solutions 221, 222, 223, and 224; and that of the data package 423 includes technical solutions 231, 232, and 233.
  • X11, X12, and X13 are the basis for reordering the data packets in step 35.
  • the similarity of technical solutions in (1) above can be calculated from keywords or from semantics, for each pair formed by one technical solution in the target technical solution set A and one technical solution in a technical solution subset B.
  • for the keyword-based method, refer to FIG. 5, which also shows an example of using these similarities to calculate the maximum similarity of a solution and the similarity index between sets of technical solutions.
  • the derivative words include synonyms, hypernyms, hyponyms, etc. of the keywords; H1, H2, and H3 are the keyword sets formed after repeated keywords are removed.
  • the technical solution subset of the data package 521 includes technical solutions 211 and 212; that of the data package 522 includes technical solutions 221, 222, 223, and 224; and that of the data package 523 includes technical solutions 231, 232, and 233.
  • the number of occurrences of the keywords of each target keyword set H1, H2, and H3 in each data packet 521, 522, and 523 is counted, that is, the number of times the keywords in H1, H2, and H3 appear in each technical solution of the technical solution subsets of the data packets 521, 522, and 523. Each count is taken as the similarity value.
  • the keywords in the set H1 appear 10 times in technical solution 211 of the data packet 521 (similarity value 10) and 15 times in technical solution 212 (similarity value 15);
  • the keywords in the set H2 appear 5 times in technical solution 211 of the data packet 521 (similarity value 5) and 15 times in technical solution 212 (similarity value 15); in the data packet 522 they appear 5 times in technical solution 221, 10 times in technical solution 222, 20 times in technical solution 223, and 10 times in technical solution 224 (similarity values 5, 10, 20, and 10); in the data packet 523 they appear 5 times each in technical solutions 231, 232, and 233 (similarity value 5 each).
  • the keywords in the set H3 appear 10 times in technical solution 211 of the data packet 521 (similarity value 10) and 20 times in technical solution 212 (similarity value 20); in the data packet 522 they appear 25 times in technical solution 221, 15 times in technical solution 222, 5 times in technical solution 223, and 5 times in technical solution 224 (similarity values 25, 15, 5, and 5); in the data packet 523 they appear 10 times in technical solution 231, 5 times in technical solution 232, and 5 times in technical solution 233 (similarity values 10, 5, and 5).
  • the maximum similarity value between technical solution 11 in the target technical solution 51 and the digital asset data package 521 in the digital asset data package set 52 to be detected is 25, that is, the value of A11 in FIG. 5 is 25.
  • in the same way, the maximum similarity between technical solution 12 in the target technical solution 51 and the digital asset data package 521 in the digital asset data package set 52 to be detected can be obtained. Its value is 20, that is, the value of A12 in FIG. 5 is 20.
  • the maximum similarity A13 between technical solution 13 in the target technical solution 51 and the digital asset data packet 521 in the digital asset data packet set 52 to be detected is 30.
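The quoted values of A11, A12, and A13 coincide with the sums of the per-solution keyword counts listed above for data packet 521 (10 + 15, 5 + 15, and 10 + 20). The following minimal sketch reproduces them under that assumption; the aggregation rule and the names `counts_521` and `package_similarity` are illustrative and are not taken from FIG. 5.

```python
from typing import Dict

# Keyword occurrence counts for data packet 521, copied from the text above.
# Outer key: target keyword set; inner key: technical solution in the packet.
counts_521: Dict[str, Dict[str, int]] = {
    "H1": {"211": 10, "212": 15},
    "H2": {"211": 5, "212": 15},
    "H3": {"211": 10, "212": 20},
}


def package_similarity(counts: Dict[str, int]) -> int:
    """Assumed aggregation: add up the counts over every solution in the packet."""
    return sum(counts.values())


a11 = package_similarity(counts_521["H1"])
a12 = package_similarity(counts_521["H2"])
a13 = package_similarity(counts_521["H3"])
print(a11, a12, a13)
```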
  • alternatively, a semantic-based calculation method is used to calculate the semantic similarity between technical solutions.
  • the semantic similarity function is LAN(X1, X2), where X1 is the description document of the first technical solution and X2 is the description document of the second technical solution. The semantic similarity between technical solution 11 and technical solution 211 is therefore LAN(Technical Solution 11, Technical Solution 211). The similarity index between digital asset data packets can then be obtained from the semantic similarity in the same way as above, which will not be repeated here.
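The text does not say how LAN(X1, X2) is computed. Purely to show the shape of such a function, the sketch below substitutes a bag-of-words cosine similarity. This is a lexical stand-in chosen for brevity, not a semantic model and not the function used in this application, and the name `lan` simply mirrors the notation above.

```python
import math
import re
from collections import Counter


def lan(x1: str, x2: str) -> float:
    """Stand-in for LAN(X1, X2): cosine similarity of word-count vectors (an assumption)."""
    v1 = Counter(re.findall(r"[a-z0-9]+", x1.lower()))
    v2 = Counter(re.findall(r"[a-z0-9]+", x2.lower()))
    dot = sum(v1[term] * v2[term] for term in v1.keys() & v2.keys())
    norm1 = math.sqrt(sum(count * count for count in v1.values()))
    norm2 = math.sqrt(sum(count * count for count in v2.values()))
    if norm1 == 0.0 or norm2 == 0.0:
        return 0.0  # an empty description is treated as dissimilar to everything
    return dot / (norm1 * norm2)


print(lan("engine cooling valve", "cooling valve for an engine"))
```

The result lies between 0 and 1, so it could be used wherever a pairwise similarity value between two technical solutions is needed.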
  • FIG. 6 is a flowchart of the second method of calculating the similarity index between the target technical solution set A and the technical solution subset B used in the process described in FIG. 3.
  • In step 61, a technology classification rule that includes four levels and has progressive features is determined or selected.
  • This technology classification rule can be designed in advance. When searching for a technical system in a specific field, such as the chemical or semiconductor field, a purpose-designed classification rule helps the accuracy of retrieval and judgment. In most cases, however, one of the commonly used general technology classification rules can be chosen without much difference in effect; the most common are the International Patent Classification rules and the European or US patent classification rules. The progressive feature refers to the four abstraction levels mentioned above, which the International Patent Classification rules also have. If you design the rule yourself, the four abstraction levels of the technical classification rule can be defined as in the following example, where a smaller level value means a higher abstraction level:
  • for a technical point code BAFA01A105, B represents the technical direction information of the technical point, AF the technical field information, A01 the professional direction information, and A105 the professional field information.
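A short sketch of how such a code could be split into its four levels is given below. The fixed widths 1, 2, 3, and 4 are inferred from the single example BAFA01A105 and are an assumption, as are the names `TechPointCode` and `parse_code`.

```python
from typing import NamedTuple


class TechPointCode(NamedTuple):
    technical_direction: str      # e.g. "B"
    technical_field: str          # e.g. "AF"
    professional_direction: str   # e.g. "A01"
    professional_field: str       # e.g. "A105"


# Assumed segment widths, inferred from the example code BAFA01A105.
LEVEL_WIDTHS = (1, 2, 3, 4)


def parse_code(code: str) -> TechPointCode:
    """Split a technical point code into its four level segments."""
    if len(code) != sum(LEVEL_WIDTHS):
        raise ValueError(f"expected {sum(LEVEL_WIDTHS)} characters, got {len(code)}")
    segments = []
    position = 0
    for width in LEVEL_WIDTHS:
        segments.append(code[position:position + width])
        position += width
    return TechPointCode(*segments)


print(parse_code("BAFA01A105"))
```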
  • Step 62: Select technical points from each of the two technical systems.
  • the selection of technical points follows the principles of comprehensiveness, generalization, and focus.
  • comprehensiveness means that the selected technical points should cover every branch of the technical system structure, avoiding omissions as far as possible; generalization means that the selected technical points and their descriptions should be multi-level, so that the set of technical points reflects the overall characteristics of the system; focus means selecting, as far as possible, the key or innovative technical solutions that characterize the system, to maximize its recognizability.
  • each entry in the technical point set is the technical description file of a technical point, including text, pictures, and other information (it may, for example, follow the style of a patent application document); each entry in the classification number set is the technical classification code corresponding to a technical point file.
  • the classification number sets A and B will be the operation objects.
  • Step 63: From the classification number set A, select 80% of its classification numbers in any manner, such as randomly or sequentially, as the operation objects (when there are few numbers, 100% is usually selected; the number of selected numbers is explained in detail later), obtaining a new classification number set A. Similarly, from the classification number set B, select 100% of its classification numbers as operation objects, obtaining a new classification number set B.
  • for each number in the new classification number set A, obtain the code of each level indicated by the number and remove duplicates, obtaining the level code sets X11, X12, X13, and X14 of all numbers and the corresponding counts Y11, Y12, Y13, and Y14. Likewise, for each number in the new classification number set B, obtain the code of each level indicated by the number and remove duplicates, obtaining the level code sets X21, X22, X23, and X24 of all numbers and the corresponding counts Y21, Y22, Y23, and Y24.
  • Step 64: From the code sets X11, X12, X13, X14 and X21, X22, X23, X24, calculate the number E1 of codes shared by X11 and X21, the number E2 of codes shared by X12 and X22, the number E3 of codes shared by X13 and X23, and the number E4 of codes shared by X14 and X24.
  • n the number of coding levels of the technical classification rules
  • the technical classification number of each of the aforementioned technical points may include one or more classification numbers.
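The level code sets and overlap counts of steps 63 and 64 can be sketched with ordinary set operations. The sketch assumes the four-level code layout of the BAFA01A105 example, represents each level by its cumulative prefix so that equal segments under different parents are not confused, and takes index 1 to be the most abstract level; all three choices are assumptions. The sampling of step 63 is left out, and the sketch stops at the overlap counts rather than attempting the final index formula. The sample codes are invented for illustration.

```python
from typing import Iterable, List, Set

# Assumed segment widths of a four-level code such as BAFA01A105.
LEVEL_WIDTHS = (1, 2, 3, 4)


def level_code_sets(codes: Iterable[str]) -> List[Set[str]]:
    """Return one de-duplicated code set per level (X?1 .. X?4), using cumulative prefixes."""
    sets: List[Set[str]] = [set() for _ in LEVEL_WIDTHS]
    for code in codes:
        end = 0
        for level, width in enumerate(LEVEL_WIDTHS):
            end += width
            sets[level].add(code[:end])
    return sets


def overlap_counts(codes_a: Iterable[str], codes_b: Iterable[str]) -> List[int]:
    """Return [E1, E2, E3, E4]: the number of codes shared at each level."""
    sets_a = level_code_sets(codes_a)
    sets_b = level_code_sets(codes_b)
    return [len(xa & xb) for xa, xb in zip(sets_a, sets_b)]


# Invented sample data, for illustration only.
set_a = ["BAFA01A105", "BAFA01A106", "BAGB02A001"]
set_b = ["BAFA01A200", "BAFA02A300", "CAAA01A105"]

y_a = [len(s) for s in level_code_sets(set_a)]  # Y11..Y14
y_b = [len(s) for s in level_code_sets(set_b)]  # Y21..Y24
e = overlap_counts(set_a, set_b)                # E1..E4
print(y_a, y_b, e)
```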
  • Fig. 7 is a flowchart of a second method for searching technical digital assets according to an embodiment of the present application.
  • the first step (step 71) is to set a technical solution requirement description file in the digital asset data package.
  • This file records the technical solutions required by the technological evolution of the digital asset data package. These solutions can be competitive, complementary, or related to the upstream and downstream of the industrial chain.
  • the problem solved by the solution can be a technical problem or a legal problem, and so on.
  • the next step is the retrieval method realized by the calculation program.
  • a retrieval request sent by a legitimate client is obtained.
  • the retrieval request includes a target digital asset technical description file.
  • This file includes one or more technical solutions corresponding to all the technical points of the target digital asset. These technical solutions constitute the target technical solution set A.
  • a collection of digital asset data packets to be detected can be obtained on the system platform or in the blockchain network according to the retrieval conditions.
  • the technical solution subset B corresponding to all the technical points of the digital asset data package can be obtained through its technical description part or document.
  • In step 74, the patent classification number set A corresponding to all the technical solutions in the target technical solution set A and the patent classification number set B corresponding to all the technical solutions in each technical solution subset B are determined. Since a technical solution may have multiple patent classification numbers, set A and set B should follow a single inclusion standard: either only the main classification number of each technical solution is included, or all classification numbers of each technical solution are included. The former improves calculation efficiency; the latter improves calculation accuracy when the processor has sufficient computing resources.
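The two inclusion standards can be illustrated as follows. The sketch assumes that each technical solution carries an ordered list of classification numbers whose first entry is the main classification; that ordering convention and the name `classification_set` are hypothetical.

```python
from typing import Dict, List, Set


def classification_set(solutions: Dict[str, List[str]], main_only: bool) -> Set[str]:
    """Build a de-duplicated classification number set under one inclusion standard.

    Assumption: the first number in each list is the solution's main classification.
    """
    result: Set[str] = set()
    for numbers in solutions.values():
        if not numbers:
            continue  # a solution without any classification contributes nothing
        result.update(numbers[:1] if main_only else numbers)
    return result


target_solutions = {
    "solution-1": ["E21D15/00", "E21D19/00"],
    "solution-2": ["B65G11/00"],
}

main_numbers = classification_set(target_solutions, main_only=True)
all_numbers = classification_set(target_solutions, main_only=False)
print(sorted(main_numbers), sorted(all_numbers))
```

Whichever standard is chosen should be applied to set A and to every set B alike, so that the two sides remain comparable.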
  • In step 75, according to the patent classification number set A and each patent classification number set B, the similarity index between the target technical solution set A and each technical solution subset B is calculated.
  • The similarity index characterizes the overall similarity between each digital asset data packet in the set to be detected and the technical solutions given in the target digital asset technical description file of the retrieval request.
  • In step 76, the data packets in the digital asset data packet set to be detected are reordered and output according to the similarity index, so as to realize the retrieval of technical digital assets.
  • the method of judging the similarity index of two technical systems adopted in the embodiment shown in FIG. 7 utilizes patent classification rules. For example, through the international patent classification numbers recorded in the patent application information of two technical systems, the overlapping information of the technical fields pointed out by them can be obtained, so that the degree of similarity between the two technical systems can be judged as a whole.
  • any technology classification rule, not only a patent classification, can be used to obtain the technical classifications of the key or main technical points of the two technical systems.
  • a patent classification is only one form of technology classification. As long as the key or main technical points of the two technical systems are classified according to the same technical classification rule, the methods provided in this application can be applied.
  • the US or European patent classification numbers can be used to determine the degree of conflict between any two technical systems according to the method provided in this application.
  • the following uses the International Patent Classification (IPC) as the technical classification rule for the key technical points of a technical system to illustrate the specific implementation of other embodiments of the present application.
  • the IPC adopts a classification method that combines function and application; the classification principle is based mainly on function, supplemented by application.
  • the technical content is divided into five levels, department (section), big category (class), small category (subclass), big group (main group), and small group (subgroup), which form a complete classification system. A complete IPC classification number is therefore composed of the symbols representing the department, big category, small category, big group, and small group.
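As an illustration, a complete IPC classification number written in the compact form used in the tables of this document (for example E21D15/00) can be split into its five levels with a regular expression. The pattern below is an assumption that covers this common written form only, and the function name `split_ipc` and the dictionary keys are hypothetical.

```python
import re
from typing import Dict

# Assumed written form: one section letter A-H, two class digits, one subclass letter,
# a main group number, a slash, and a subgroup number (e.g. "E21D15/00" or "E21D 15/00").
IPC_PATTERN = re.compile(r"^([A-H])(\d{2})([A-Z])\s*(\d{1,4})/(\d{2,6})$")


def split_ipc(number: str) -> Dict[str, str]:
    """Return the cumulative code of each of the five IPC levels."""
    match = IPC_PATTERN.match(number.strip().upper())
    if match is None:
        raise ValueError(f"not a complete IPC classification number: {number!r}")
    section, cls, subclass, main_group, subgroup = match.groups()
    return {
        "department": section,
        "big_category": section + cls,
        "small_category": section + cls + subclass,
        "big_group": f"{section}{cls}{subclass}{main_group}",
        "small_group": f"{section}{cls}{subclass}{main_group}/{subgroup}",
    }


print(split_ipc("E21D15/00"))
```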
  • in one embodiment, all five parts of information are used to determine the degree of similarity or conflict between two technical systems or two sets of technical systems.
  • alternatively, four of the five pieces of information, namely the information of major categories, subcategories, major groups, and groups, are used to determine the degree of similarity or conflict between two technical systems or two sets of technical systems.
  • in the same way, three of the five pieces of information, that is, the information of small categories, large groups, and small groups, can also be used to determine the degree of similarity or conflict between two technical systems or two sets of technical systems.
  • alternatively, two of the five pieces of information, that is, the information of the large group and the small group, are used to judge the degree of similarity or conflict between two technical systems or two sets of technical systems.
  • or only one of the five pieces of information, that is, the group information, is used to determine the degree of conflict between the two technical systems or between the technical systems corresponding to the two sets.
  • the department has the widest conceptual range, and the purpose of using its information is to avoid omissions; the group has the narrowest conceptual range, and the purpose of using its information is to make the result more accurate. Other combinations of patent classification information are therefore also possible, for example using only the information of department, sub-category, large group, and group to judge the similarity or conflict between two technical systems or two sets of technical systems.
  • the following uses three of the five pieces of information, that is, the information of small categories, large groups, and groups, to determine the degree of similarity or conflict between two technical systems and so further explain the technical solution of the present application.
  • the method described in this embodiment can be implemented in the form of software.
  • FIG. 8 is a flowchart of the first method of calculating the similarity index between the target technical solution set A and the technical solution subset B used in step 74 of the process.
  • the characteristic of the process shown in Fig. 8 is that the patent applications of the two technical systems or technical solutions are used as the technical points, and the International Patent Classification is used as the technical classification rule.
  • the technical relevance or similarity between the two technical systems is analyzed on the basis of the IPC classification numbers of the patent applications in set A and set B.
  • In step 81, the IPC numbers in all patent application information of the patent classification number set A and the patent classification number set B are obtained to form two IPC number sets, which correspond to the sets A and B respectively.
  • In step 82, the small class codes, large group codes, and group codes indicated by all the international patent classification numbers in the first number set are obtained, and the repeated codes in each set are removed. This yields the small class code set B3 (the first column of Table 2, namely the IPC subclasses of set A), whose number of codes b3 is 19 (the last row of that column); the large group code set B2 (the first column of Table 3, namely the IPC large groups of set A), whose number of codes b2 is 19 (the last row of that column); and the group code set B1 (the first column of Table 4, namely the IPC groups of set A), whose number of codes b1 is 13 (the last row of that column).
  • likewise, the small class codes, large group codes, and group codes indicated by all the international patent classification numbers in the second number set are obtained, and the repeated codes in each set are removed. This yields the small class code set D3 (the second column of Table 2, namely the IPC subclasses of set B), whose number of codes d3 is 10 (the last row of that column); the large group code set D2 (the second column of Table 3, namely the IPC large groups of set B), whose number of codes d2 is 10 (the last row of that column); and the group code set D1 (the second column of Table 4, namely the IPC groups of set B), whose number of codes d1 is 5 (the last row of that column).
  • Table 2 Comparison table of IPC subclass information of set A and set B
  • Table 3 Comparison table of IPC main groups of set A and set B

    IPC main group of set A | IPC main group of set B | Coincident IPC main group
    A41D13/00 | E21C35/00 | E21D15/00
    A61F17/00 | E21C41/00 |
    A61J9/00 | E21D15/00 |
    A62D1/00 | B23P19/00 |
    B61K7/00 | B25B27/00 |
    B61L11/00 | E02F9/00 |
    B61L23/00 | E21C33/00 |
    B65G11/00 | E21D20/00 |
    B65G21/00 | E21D23/00 |
    B65G65/00 | E21F13/00 |
    B66B15/00 | |
    B66C1/00 | |
    B66D1/00 | |
    C01B33/00 | |
    C02F1/00 | |
    C09K3/00 | |
    E21D15/00 | |
    E21D19/00 | |
    19 items in total | 10 items in total | Repeat 1 item

  • Table 4 Comparison table of IPC groups of set A and set B

    IPC group of set A | IPC group of set B | Coincident IPC group
    13 items in total | 5 items in total | Repeat 0 items
  • In step 82, 100% of the patent classification numbers of set A and set B are selected as the objects of analysis. In other embodiments, only a part of them may be selected. The result of the method then carries a certain error, but the overall judgment is not affected. This also makes the method more practical: any technical system can be judged even if some of its patent classification numbers are wrong. In addition, allowing the selection range to be set gives a better balance between effect and efficiency and makes the method more flexible.
  • In step 83, from the subclass code sets B3 and D3, the main group code sets B2 and D2, and the group code sets B1 and D1 of the two technical systems obtained in step 82, the number of coinciding subclass codes E3 is calculated as 5 (the last row of the third column of Table 1, the coincident IPC subclass column), the number of coinciding main group codes E2 as 1 (the last row of the third column of Table 2, the coincident IPC main group column), and the number of coinciding group codes E1 as 0 (the last row of the third column of Table 3, the coincident IPC group column).
  • In steps 85 and 86, the patent technology correlation index F of either technical system relative to the other is calculated:
  • F_A = C3*A3 + C2*A2 + C1*A1
  • F_B = C3*B3 + C2*B2 + C1*B1
  • C3, C2 and C1 are empirical constants. They represent the correlation coefficients between the IPC subclass, main group and group classifications, respectively, and the conflict between the two systems; their empirical values are 1, 2 and 3, respectively.
  • In step 16, the patent conflict probability G of either technical system relative to the other is calculated according to the correlation index F, where:
  • G_A is the similarity index of the target set A with respect to the subset B;
  • G_B is the similarity index of the subset B with respect to the target set A.
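The code-set construction of step 82 and the overlap counts of step 83 can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: the regular expression is an assumed pattern for IPC symbols written like `E21D15/00`, the function names `split_ipc`, `code_sets` and `overlap_counts` are hypothetical, and the two input lists are made up for the example rather than taken from the tables.

```python
import re

# Assumed shape of an IPC symbol such as "E21D15/00": section letter,
# two-digit class, subclass letter, main group number, "/", subgroup number.
IPC_PATTERN = re.compile(r"^([A-H]\d{2}[A-Z])\s*(\d{1,4})/(\d{2,6})$")


def split_ipc(symbol: str) -> tuple[str, str, str]:
    """Return the (subclass, main group, group) codes of one IPC symbol."""
    match = IPC_PATTERN.match(symbol.strip().upper())
    if match is None:
        raise ValueError(f"not a recognisable IPC symbol: {symbol!r}")
    subclass, main_number, sub_number = match.groups()
    main_group = f"{subclass}{main_number}"
    return subclass, main_group, f"{main_group}/{sub_number}"


def code_sets(symbols):
    """Step 82: build the three code sets; using sets removes duplicates."""
    subclasses, main_groups, groups = set(), set(), set()
    for symbol in symbols:
        subclass, main_group, group = split_ipc(symbol)
        subclasses.add(subclass)
        main_groups.add(main_group)
        groups.add(group)
    return subclasses, main_groups, groups


def overlap_counts(symbols_a, symbols_b):
    """Step 83: number of coinciding codes per level, as (E3, E2, E1)."""
    sub_a, main_a, group_a = code_sets(symbols_a)
    sub_b, main_b, group_b = code_sets(symbols_b)
    return len(sub_a & sub_b), len(main_a & main_b), len(group_a & group_b)


# Illustrative input only; these are not the patent's data.
set_a = ["E21D15/44", "E21D19/02", "B65G21/10", "B65G21/10"]
set_b = ["E21D15/50", "E21D23/04", "E21C35/18"]
print(overlap_counts(set_a, set_b))
```

For these made-up lists the two systems share one subclass (E21D), one main group (E21D15) and no full group code. How the counts feed into A3, A2 and A1 of the index F is not sketched here, because that step is not described in this passage.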

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Physics & Mathematics (AREA)
  • Development Economics (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Economics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Game Theory and Decision Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present application discloses a technical open digital asset retrieval method. The method comprises configuring a technical solution requirement description file in a digital asset data package in advance, the file comprising technical solutions required to make the data package complete. A specific retrieval process comprises: acquiring a retrieval request comprising a technical description file, obtaining a target technical solution set A, and then acquiring technical solution subsets B corresponding to technical solution requirement description files of individual digital asset data packages in a digital asset data package set undergoing detection; calculating, according to the set A and each of the technical solution subsets B, a similarity indicator of the set A with respect to the subset B; and sorting and outputting, according to the similarity indicator, the data packages in the digital asset data package set, thereby enabling easy acquisition of a digital asset data package closest to a requirement. The present application achieves requirement-based retrieval in a technical system set for a technical system, and the technical system may comprise any complex techniques in different aspects or fields.

Description

Retrieval method for technical open digital assets
This application claims priority to Chinese patent application No. 201910684229.4, filed with the Chinese Patent Office on July 26, 2019 and entitled "Retrieval method for technical open digital assets", the entire content of which is incorporated herein by reference.
Technical field
This application relates to the field of Internet data processing, and in particular to a retrieval method for technical open digital assets.
Background
In a digital asset financial trading center, the digital assets that can be traded, that is, certified digital assets, are stored on the center's digital asset registration platform. The platform manages and uses this pre-transaction and post-transaction digital asset information, for example by sharing information with other data platforms, or by searching or verifying the digital assets stored on the platform upon request.
Digital assets are usually assets in digital form whose core is an intellectual achievement such as intellectual property. They mainly include technical, design and expression digital assets: technical digital assets such as patent digital assets, design digital assets such as copyright or industrial design digital assets, expression digital assets such as trademark digital assets, and so on. A technical digital asset, such as a patent digital asset, may consist of multiple patents that stand in competitive or complementary relationships, together with the parent assets that accompany them or on which they depend. Figure 1 is an application scenario diagram of the digital asset registration platform. In the figure, the user accesses the digital asset registration platform 2 over the Internet through software installed on terminal 1, such as clients 11 and 12 or an app.
Generally, a certified digital asset registered on the digital asset registration platform 2 takes the form of a data package; see Figure 2, which shows an example of the structure and content of a certified digital asset data package. The data package has two parts, the digital asset bibliographic item 21 and the digital asset entity 22. The bibliographic item 21 is stored on the digital asset registration platform 2, while the entity 22 is stored, in centralized or distributed form, on a local server or a third-party server 3. If the platform 2 is a public-chain node or a dedicated sub-chain node of the blockchain network 4, the entity 22 is also stored in the dedicated sub-chain of the blockchain. The bibliographic item 21 in turn includes the digital asset registration item 211 and the digital asset technical description 212. Taking a patent digital asset as an example, the registration item 211 may include one or more entries, for example authentication code information and the bibliographic data of several hundred competitive or complementary patents. The technical description 212 is a comprehensive description of the technical, legal and market information of those entries; in form, it includes an abstract and a detailed description.
In practice, users often need to use client software 11 and 12 to browse and search the digital asset information stored on the digital asset trading platform 2. For technical digital assets, the purpose of a search is to find the target patent asset package.
Digital assets are usually open: the aim is to absorb other valuable digital assets centered on intellectual property so that they join and form a new data package. A newly added digital asset is either competitive with or complementary to the digital assets already in the package, so that it improves, from a technical perspective, the technical system the package represents and makes it more competitive and better able to create value. Retrieving specific digital asset data packages that have open requirements has therefore become a problem. Traditional retrieval methods, such as keyword-based retrieval, are essentially similarity retrieval: within the set of data packages to be searched, they find the package most similar to the keyword conditions supplied by the searcher. However, the basis of similarity retrieval reflects only the searcher's needs and cannot reflect the open requirements of the data packages under examination, so it cannot effectively retrieve specific digital asset data packages that have open requirements.
In addition, the data package of a technical digital asset, such as a patent digital asset, usually includes multiple patents or patent portfolios of completely different natures. A patent asset data package concerning an engine, for example, may include patents on structure, materials, control, software and even chemistry. Its open requirements are therefore also diverse, which makes content-based retrieval of the package as a whole even harder for patent asset data packages with open requirements.
Summary of the invention
In view of the above technical problems, the purpose of this application is to provide a method that takes a patent digital asset data package as a whole as the retrieval object and can effectively retrieve target patent digital asset data packages in an automated manner.
The first retrieval method for technical open digital assets provided by this application includes:
setting a technical solution requirement description file in a digital asset data package, the file including at least one technical solution required by the digital asset data package;
obtaining a retrieval request and the corresponding target digital asset technical description file, the technical description file including at least one technical solution, to obtain a target technical solution set A;
obtaining a set of digital asset data packages to be examined, and determining the technical solution subset B corresponding to the technical solution requirement description file in each digital asset data package in the set;
calculating, from the target technical solution set A and each technical solution subset B, the similarity index between the target technical solution set A and each technical solution subset B;
sorting the data packages in the digital asset data package set according to the similarity index and outputting them.
The second retrieval method for technical open digital assets provided by this application includes:
setting a technical solution requirement description file in a digital asset data package, the file including at least one technical solution required by the digital asset data package;
obtaining a retrieval request and the corresponding target digital asset technical description file, the technical description file including at least one technical solution, to obtain a target technical solution set A;
obtaining a set of digital asset data packages to be examined, and determining the technical solution subset B corresponding to the technical solution requirement description file in each digital asset data package in the set;
determining the patent classification number set A of all technical solutions corresponding to the target technical solution set A, and the patent classification number set B of all technical solutions corresponding to each technical solution subset B;
calculating, from the patent classification number set A and each patent classification number set B, the similarity index between the target technical solution set A and each technical solution subset B;
sorting the data packages in the digital asset data package set according to the similarity index and outputting them.
This application sets a technical solution requirement description file in each digital asset data package. Then, based on the technical description file supplied with a retrieval request, it can calculate, within the set of digital asset data packages to be examined, the relative similarity between two technical solutions from their technical classification and from the technical direction, field and other information that the classification expresses, and from that obtain the relative similarity between two sets of technical solutions, thereby achieving online retrieval driven by the requirements of a technical system as a whole. The technical solution of this application has two main features. First, it uses relative, fuzzy or imprecise similarity indicators between individual technical solutions to query, and quantitatively express, data packages or technical systems within a set of technical systems, that is, a set of data packages each composed of technical solutions from different fields; this overcomes the limitation of traditional query thinking, which can only follow the keywords of a technical solution. Second, it achieves retrieval that satisfies the requirements of the retrieved data package, that is, the requirements of others, which contrasts sharply with traditional retrieval that satisfies only the searcher's own needs.
Description of the drawings
Figure 1 is an application scenario diagram of the digital asset registration platform;
Figure 2 is an example diagram of the structure and content of a certified digital asset data package;
Figure 3 is a flowchart of the first retrieval method for technical open digital assets given in an embodiment of this application;
Figure 4 is an example diagram of the first way of calculating the similarity index between the target technical solution set A and the technical solution subset B used in the process of Figure 3;
Figure 5 is an example diagram of calculating the similarity between the target technical solution set A and the technical solution subset B used in the example of Figure 4;
Figure 6 is a flowchart of the second way of calculating the similarity index between the target technical solution set A and the technical solution subset B used in the process of Figure 3;
Figure 7 is a flowchart of the second retrieval method for technical digital assets given in an embodiment of this application;
Figure 8 is a flowchart of the first way of calculating the similarity index between the target technical solution set A and the technical solution subset B used in the process of Figure 7.
Detailed description
From both an economic and a technical point of view, any technical system needs to keep evolving actively in order to stay competitive and able to create value. The technical system represented by a digital asset data package that is the object of a transaction or investment therefore also needs continuous improvement. But how can the provider of a new technical solution quickly find, among a large number of digital asset data packages, the package that is closest to the solution offered or most likely to need it?
A technical system is an organic collection of technical solutions at different levels and with different content. These solutions may belong to different fields or disciplines, and may be related or entirely unrelated. The technical solutions of an engine system, for example, involve mechanical, material, circuit control and software control solutions which, viewed as technical solutions, may have no direct relationship with one another. In addition, one technical solution may be usable in different technical systems, and a technical solution viewed on its own may not reflect a technical system at all. We therefore do not judge the nature of a technical system as a whole from a specific individual technical solution; it is common knowledge that a part cannot stand in for the whole. Because the technical solutions required by a data package are diverse and uncertain, both traditional keyword-based retrieval and retrieval based on technical field become ineffective.
Figure 3 is a flowchart of the first retrieval method for technical open digital assets given in an embodiment of this application.
According to Figure 3, the first step (step 31) sets a technical solution requirement description file in the digital asset data package. The file records the technical solutions that the package needs for its technical evolution. These solutions may be competitive, complementary, or related to the upstream and downstream of the industrial chain, and the problems they solve may be technical problems, legal problems, and so on. This step is usually performed when the digital asset data package is first formed; the content of the file can be revised and supplemented later, that is, during trading.
The subsequent steps are the retrieval method itself, implemented by a computer program.
In step 32, a retrieval request sent by a legitimate client is obtained. The request includes a target digital asset technical description file containing one or more technical solutions. These are usually descriptions of technical solutions that the requester is able to provide, and together they form the target technical solution set A. A solution description may take any form that helps describe the digital asset data package clearly, such as a Word document or a PDF document. The retrieval request may also contain other restrictive conditions to narrow the search.
In step 33, the set of digital asset data packages to be examined can be obtained, on the system platform or in the blockchain network, according to the retrieval conditions. For each digital asset data package in the set, the technical solution subset B corresponding to all technical points of the package can be obtained from its technical description section or document.
In step 34, the similarity index between the target technical solution set A and each technical solution subset B is calculated. The similarity index characterizes how similar, as a whole, each digital asset data package in the set to be examined is to the technical solutions given in the target digital asset technical description file of the retrieval request. Finally, in step 35, the data packages in the set are re-sorted according to the similarity index and output, which completes the retrieval of technical digital assets.
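Steps 34 and 35 amount to scoring every candidate package against set A and sorting by the score. The sketch below illustrates only that control flow. `AssetPackage`, `rank_packages` and `shared_word_index` are hypothetical names, and the word-overlap index is a placeholder assumption standing in for whichever similarity index the method actually uses.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class AssetPackage:
    """Hypothetical stand-in for one digital asset data package."""
    name: str
    solutions: list[str]  # technical solution subset B of this package


def rank_packages(
    target_solutions: Sequence[str],
    packages: Sequence[AssetPackage],
    similarity_index: Callable[[Sequence[str], Sequence[str]], float],
) -> list[tuple[str, float]]:
    """Step 34: score each package against set A. Step 35: sort and output."""
    scored = [
        (pkg.name, similarity_index(target_solutions, pkg.solutions))
        for pkg in packages
    ]
    # Highest similarity index first.
    return sorted(scored, key=lambda item: item[1], reverse=True)


def shared_word_index(set_a, subset_b):
    """Placeholder index (an assumption): words shared by each solution pair."""
    return float(sum(
        len(set(a.lower().split()) & set(b.lower().split()))
        for a in set_a
        for b in subset_b
    ))


target = ["hydraulic support control valve"]
candidates = [
    AssetPackage("pkg-1", ["conveyor belt tension sensor"]),
    AssetPackage("pkg-2", ["hydraulic support", "control valve seal"]),
]
ranking = rank_packages(target, candidates, shared_word_index)
print(ranking)
```

Passing the index as a function keeps the ranking step independent of how similarity is computed, so a keyword-based, semantic or classification-based index could be swapped in without changing the sort.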
The calculation in step 34 of the similarity index between the target technical solution set A and a technical solution subset B may use the following sub-steps; see Figure 4, which is an example diagram of the first way of calculating this similarity index used in step 34.
According to Figure 4, each technical solution 11, 12, 13 in the target technical solution set 41 is determined first. Then each digital asset data package 421, 422, 423 in the set 42 of digital asset data packages to be examined is determined one by one, and the technical solutions in the technical solution subset of each package are determined from the technical description section or document of that package. Specifically, the subset of package 421 contains technical solutions 211 and 212; the subset of package 422 contains technical solutions 221, 222, 223 and 224; and the subset of package 423 contains technical solutions 231, 232 and 233. Next, the similarity between each technical solution 11, 12, 13 in the target set 41 and every technical solution in the subsets of packages 421, 422 and 423 is calculated.
Specifically, the following calculations are performed:
(I) Solution similarity. The calculations can be carried out in various orders, for example as follows.
1. Calculate the similarities a11-211 and a11-212 between technical solution 11 and technical solutions 211 and 212 of package 421; the similarities a11-221, a11-222, a11-223 and a11-224 between technical solution 11 and technical solutions 221, 222, 223 and 224; and the similarities a11-231, a11-232 and a11-233 between technical solution 11 and technical solutions 231, 232 and 233; see data set 43 in Figure 4.
2. Calculate the similarities a12-211 and a12-212 between technical solution 12 and technical solutions 211 and 212 of package 421; the similarities a12-221, a12-222, a12-223 and a12-224 between technical solution 12 and technical solutions 221, 222, 223 and 224; and the similarities a12-231, a12-232 and a12-233 between technical solution 12 and technical solutions 231, 232 and 233; see data set 44 in Figure 4.
3. Calculate the similarities a13-211 and a13-212 between technical solution 13 and technical solutions 211 and 212 of package 421; the similarities a13-221, a13-222, a13-223 and a13-224 between technical solution 13 and technical solutions 221, 222, 223 and 224; and the similarities a13-231, a13-232 and a13-233 between technical solution 13 and technical solutions 231, 232 and 233; see data set 45 in Figure 4.
(II) Calculation of the maximum solution similarity and of the similarity index between technical solution sets.
1. Calculate the maximum solution similarity between each technical solution of the target technical solution set 41 and the technical solution subset of package 421, and calculate the similarity index between the target technical solution set 41 and the technical solution subset of package 421.
(1) Calculate the maximum solution similarities A11, A12 and A13 between technical solutions 11, 12, 13 and the technical solution subset of package 421, where:
A11=a11-211+a11-212; A12=a12-211+a12-212; A13=a13-211+a13-212;
(2) Calculate the similarity index X11 between the target technical solution set 41 and the technical solution subset of package 421, where:
X11=A11+A12+A13.
2. Calculate the maximum solution similarity between each technical solution of the target technical solution set 41 and the technical solution subset of package 422, and calculate the similarity index between the target technical solution set 41 and the technical solution subset of package 422.
(1) Calculate the maximum solution similarities B11, B12 and B13 between technical solutions 11, 12, 13 and the technical solution subset of package 422, where:
B11=a11-221+a11-222+a11-223+a11-224; B12=a12-221+a12-222+a12-223+a12-224; B13=a13-221+a13-222+a13-223+a13-224;
(2) Calculate the similarity index X12 between the target technical solution set 41 and the technical solution subset of package 422, where:
X12=B11+B12+B13.
3. Calculate the maximum solution similarity between each technical solution of the target technical solution set 41 and the technical solution subset of package 423, and calculate the similarity index between the target technical solution set 41 and the technical solution subset of package 423.
(1) Calculate the maximum solution similarities C11, C12 and C13 between technical solutions 11, 12, 13 and the technical solution subset of package 423, where:
C11=a11-231+a11-232+a11-233; C12=a12-231+a12-232+a12-233; C13=a13-231+a13-232+a13-233;
(2) Calculate the similarity index X13 between the target technical solution set 41 and the technical solution subset of package 423, where:
X13=C11+C12+C13.
It can be seen that X11, X12 and X13 are the basis for re-sorting the data packages in step 35.
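The aggregation written out above can be sketched in a few lines. The sketch follows the formulas as stated, in which each per-solution value (A11, A12, A13) is the sum of that target solution's pairwise similarities and the index X11 is the sum of those values. The function name `similarity_index` and the numeric similarities are assumptions made for illustration; they do not come from Figure 4.

```python
def similarity_index(pairwise):
    """Aggregate pairwise similarities for one package.

    pairwise maps each target solution id to a dict of
    {package solution id: similarity}. Returns the per-solution sums
    (A11, A12, A13 in the text) and their total (X11 in the text).
    """
    per_solution = {
        target_id: sum(row.values()) for target_id, row in pairwise.items()
    }
    return per_solution, sum(per_solution.values())


# Made-up similarities between target solutions 11, 12, 13 and the two
# solutions 211, 212 of package 421.
a_421 = {
    11: {211: 0.5, 212: 0.25},
    12: {211: 0.0, 212: 1.0},
    13: {211: 0.25, 212: 0.0},
}
per_solution, x11 = similarity_index(a_421)
print(per_solution, x11)
```

With these illustrative inputs the per-solution sums are 0.75, 1.0 and 0.25, and the index is 2.0. Running the same function on the pairwise values for packages 422 and 423 would give X12 and X13.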
The similarity between technical solutions in (1) above may be computed with a keyword-based method or with a semantics-based method. Either method yields the similarity between each technical solution in the target technical solution set A and each technical solution in the technical solution subset B. For the keyword-based method, refer to FIG. 5, which also gives an example of using these similarities to compute the per-solution maximum similarity and the similarity index between technical solution sets.
First, identify each technical solution 11, 12, and 13 in the target technical solution set 51, and for each of them extract all of its keywords and the corresponding derived words to form the target keyword sets H1, H2, and H3. Derived words include synonyms, near-synonyms, hypernyms, and hyponyms of the keywords. Each of H1, H2, and H3 is the keyword set that remains after duplicate keywords are removed. Then identify, one by one, each digital asset data package 521, 522, and 523 in the set 52 of digital asset data packages to be examined, and use the technical description section or document in each package to determine the technical solutions in the corresponding technical solution subset. Specifically, the subset of package 521 contains technical solutions 211 and 212; the subset of package 522 contains technical solutions 221, 222, 223, and 224; and the subset of package 523 contains technical solutions 231, 232, and 233.
Next, count how often the keywords of each target keyword set H1, H2, and H3 occur in each data package 521, 522, and 523; that is, the number of times the keywords in H1, H2, and H3 appear in each technical solution of the technical solution subsets of packages 521, 522, and 523.
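The counting step above can be sketched as follows. This is a minimal illustration, not the method of the application: the function name `count_keyword_hits`, the case-insensitive matching, and plain substring counting are all assumptions made for the example.

```python
def count_keyword_hits(keywords, text):
    """Return the total number of non-overlapping occurrences of all keywords in text.

    Assumptions for illustration: matching is case-insensitive substring
    matching, and duplicate keywords are counted once (the keyword set is
    de-duplicated, as the text requires for H1, H2, H3).
    """
    lowered = text.lower()
    return sum(lowered.count(keyword.lower()) for keyword in set(keywords))


# Hypothetical keyword set and solution text, for demonstration only.
hits = count_keyword_hits(["pump", "valve", "pump"],
                          "Pump drives the valve; the pump stops.")
```

A real implementation would likely tokenize the text and handle word boundaries, which this sketch does not attempt.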
As shown in FIG. 5, the keywords in set H1 appear 10 times in technical solution 211 of data package 521 and 15 times in technical solution 212; 20, 15, 30, and 5 times in technical solutions 221, 222, 223, and 224 of data package 522; and 0, 5, and 2 times in technical solutions 231, 232, and 233 of data package 523. Each occurrence count is taken directly as the similarity value, so, for example, the similarity value for technical solution 211 is 10.
The keywords in set H2 appear 5 times in technical solution 211 of data package 521 and 15 times in technical solution 212; 5, 10, 20, and 10 times in technical solutions 221, 222, 223, and 224 of data package 522; and 5, 5, and 5 times in technical solutions 231, 232, and 233 of data package 523. Each count is again the similarity value.
The keywords in set H3 appear 10 times in technical solution 211 of data package 521 and 20 times in technical solution 212; 25, 15, 5, and 5 times in technical solutions 221, 222, 223, and 224 of data package 522; and 10, 5, and 5 times in technical solutions 231, 232, and 233 of data package 523. Each count is again the similarity value.
Adding the occurrence counts 10 and 15 of the keywords of H1 in technical solutions 211 and 212 of the subset of data package 521 gives the per-solution maximum similarity between technical solution 11 of the target technical solution set 51 and digital asset data package 521 of the set 52 to be examined. In this example the value is 25; that is, A11 = 25 in FIG. 5.
In the same way, the per-solution maximum similarity between technical solution 12 of the target technical solution set 51 and data package 521 is 20 in this example; that is, A12 = 20 in FIG. 5. The per-solution maximum similarity A13 between technical solution 13 and data package 521 is 30.
Further, the similarity index between the target technical solution set 51 and data package 521 is X11 = A11 + A12 + A13 = 25 + 20 + 30 = 75. The similarity index between the target technical solution set 51 and data package 522 is X12 = B11 + B12 + B13 = 70 + 45 + 50 = 165. The similarity index between the target technical solution set 51 and data package 523 is X13 = C11 + C12 + C13 = 7 + 15 + 20 = 42.
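The arithmetic of the FIG. 5 example can be reproduced with a short script. The occurrence counts are taken from the description; the nested-dictionary layout and the function names `package_similarity` and `similarity_index` are assumptions chosen for this sketch.

```python
# Occurrence counts from the FIG. 5 example:
# counts[target_solution][data_package][solution_in_package]
counts = {
    "11": {"521": {"211": 10, "212": 15},
           "522": {"221": 20, "222": 15, "223": 30, "224": 5},
           "523": {"231": 0, "232": 5, "233": 2}},
    "12": {"521": {"211": 5, "212": 15},
           "522": {"221": 5, "222": 10, "223": 20, "224": 10},
           "523": {"231": 5, "232": 5, "233": 5}},
    "13": {"521": {"211": 10, "212": 20},
           "522": {"221": 25, "222": 15, "223": 5, "224": 5},
           "523": {"231": 10, "232": 5, "233": 5}},
}


def package_similarity(counts, package):
    """Per target solution, sum the counts over the package's solutions (A11, A12, A13)."""
    return {target: sum(per_package[package].values())
            for target, per_package in counts.items()}


def similarity_index(counts, package):
    """Sum the per-solution values to get the index for one package (X11, X12, X13)."""
    return sum(package_similarity(counts, package).values())


indexes = {package: similarity_index(counts, package)
           for package in ("521", "522", "523")}
```

Under these counts, package 522 has the highest index and package 523 the lowest, which matches the values stated in the text.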
In other embodiments of the present application, a semantics-based method is used to compute the semantic similarity between technical solutions. Suppose the semantic similarity function is LAN(X1, X2), where X1 is the description document of the first technical solution and X2 is the description document of the second. The semantic similarity between technical solution 11 and technical solution 211 is then LAN(technical solution 11, technical solution 211). The similarity index between digital asset data packages can be obtained from these semantic similarities in the same way, which is not repeated here.
FIG. 6 is a flowchart of the second way, used in the process of FIG. 3, of computing the similarity index between the target technical solution set A and the technical solution subset B.
The process of FIG. 6 gives a general scheme. Its principle is that, to describe a technical system as a whole, the key technical solutions of the system are expressed through summary descriptions at four levels of abstraction. More or fewer levels may be used, but not fewer than two; too many levels reduce the efficiency of the method while improving the accuracy of the judgment only marginally. By counting and comparing the number of expressions at each level for the key technical solutions of two technical systems, the degree of similarity or competition between the two systems can be judged quickly. Refer to FIG. 6.
In step 61, determine or select a technical classification rule that has four hierarchical levels. The rule may be designed in advance. When it is used to search technical systems in a particular field, such as chemistry or semiconductors, a purpose-designed classification rule improves the accuracy of retrieval and judgment. In most cases, however, one of the commonly used general-purpose classification rules can be chosen, with little difference in practical effect; the most common are the International Patent Classification and the European or US patent classification schemes. The hierarchical property refers to the four levels of abstraction described above, which the International Patent Classification and similar schemes clearly have. A self-designed rule can follow the table below, which gives the meaning of a rule with four abstraction levels; the smaller the level number, the higher the level of abstraction:
Table 1: Design of the technical classification rule

| Level | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| Name | Technical direction | Technical field | Specialty direction | Specialty field |
| Code symbols | A-G | A-Z | A-Z plus digits 0-9 | A-Z plus digits 0-9 |
| Length | 1 character | 2 characters | 3 characters | 4 characters |
For example, in the code BAFA01A105 of a technical point, B carries the technical direction information, AF the technical field information, A01 the specialty direction information, and A105 the specialty field information.
Because the design of technical classification rules and the definition of their contents belong to the public domain, they are not described further here.
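Splitting such a code into its four levels can be sketched as below. The level widths (1, 2, 3, 4 characters) come from Table 1; the function name `split_code` and the choice to reject codes of any other length are assumptions made for this illustration.

```python
# Character widths of the four levels, as given in Table 1.
LEVEL_WIDTHS = (1, 2, 3, 4)


def split_code(code):
    """Split a classification code into (direction, field, specialty direction, specialty field)."""
    if len(code) != sum(LEVEL_WIDTHS):
        raise ValueError(f"expected {sum(LEVEL_WIDTHS)} characters, got {len(code)}")
    parts = []
    position = 0
    for width in LEVEL_WIDTHS:
        parts.append(code[position:position + width])
        position += width
    return tuple(parts)


levels = split_code("BAFA01A105")
```

For the example code in the text, this yields the four parts B, AF, A01, and A105.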
In step 62, select technical points from each of the two technical systems. The selection follows three principles: comprehensiveness, generality, and focus. Comprehensiveness means that the selected technical points should cover every branch of the structure of the technical system, so that omissions are avoided as far as possible. Generality means that the selected technical points and their descriptions should span multiple levels, so that the set of technical points reflects the overall character of the system. Focus means selecting, as far as possible, the distinctive key technical solutions or innovative technical solutions of the system, so as to maximize how recognizable the system is. The technical point set A extracted from the first technical system and the technical point set B extracted from the second technical system are then classified, point by point, using the technical classification rule described above, which yields the corresponding classification number sets A and B. Each item in a technical point set is the technical description file of that technical point, containing text, pictures, or other information, and may for example take the form of a patent application document; each item in a classification number set is the technical classification code corresponding to one technical point file.
The following steps operate on the classification number sets A and B.
In step 63, from classification number set A, select a share of the numbers as operands according to how many numbers the set contains, in any manner such as randomly or sequentially. Here 80% of the numbers are selected (when the set is small, 100% is usually selected; the choice of quantity is discussed in detail later). This yields a new classification number set A. Likewise, from classification number set B, 100% of the numbers are selected as operands, which yields a new classification number set B.
For each number in the new classification number set A, obtain the code at each level indicated by that number and remove duplicates. This gives, over all numbers, the per-level code sets X11, X12, X13, and X14 and their sizes Y11, Y12, Y13, and Y14. Likewise, for each number in the new classification number set B, obtain the code at each level and remove duplicates, which gives the per-level code sets X21, X22, X23, and X24 and their sizes Y21, Y22, Y23, and Y24. The removal of duplicates works as follows. Suppose the first-level codes of all numbers in the new classification number set A, that is, the codes representing the technical direction, are:
X11 = {B, A, C, C, B, D, E, F, D, B}, in which B is repeated twice, C once, and D once. After the duplicates are removed, X11 = {B, A, C, D, E, F}, and the corresponding code count is Y11 = 6.
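The duplicate removal described above can be sketched in a few lines. Preserving the first-seen order is a choice made here for readability; the text only requires that duplicates be dropped, and the name `remove_duplicates` is hypothetical.

```python
def remove_duplicates(codes):
    """Drop repeated codes while keeping the order of first appearance."""
    return list(dict.fromkeys(codes))


# First-level codes of the new classification number set A, from the text.
x11 = remove_duplicates(["B", "A", "C", "C", "B", "D", "E", "F", "D", "B"])
y11 = len(x11)  # number of distinct first-level codes
```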
In step 64, from the code sets X11, X12, X13, and X14 and X21, X22, X23, and X24, compute the number E1 of codes common to X11 and X21, the number E2 of codes common to X12 and X22, the number E3 of codes common to X13 and X23, and the number E4 of codes common to X14 and X24.
For example, if X11 = {B, A, C, D, E, F} and X21 = {B, A, G}, the number of codes common to X11 and X21 is E1 = 2.
In step 65, compute for each level the relative overlap ratios $A_i$ and $B_i$ of classification number sets A and B. For set A, $A_i = (E_i / Y_{1i}) \times 100\%$; for set B, $B_i = (E_i / Y_{2i}) \times 100\%$.
In steps 66 and 67, compute from the relative overlap ratios $A_i$ and $B_i$ the technical correlation index $F_A$ of classification number set A and the technical correlation index $F_B$ of classification number set B, where $F_A = \sum_i C_i A_i$ and $F_B = \sum_i C_i B_i$, and $C_i$ are empirical constants.
From the correlation indexes $F_A$ and $F_B$, compute the similarity probabilities $G_A$ and $G_B$ of classification number sets A and B, where $G_A = F_A / \sum_i C_i$ and $G_B = F_B / \sum_i C_i$.
$G_A$ is taken as the similarity index of the target technical solution set A relative to the technical solution subset B, and $G_B$ as the similarity index of the technical solution subset B relative to the target technical solution set A.
In the expressions above, $i = 1, \dots, n$, where $n$ is the number of code levels of the technical classification rule; in this example, $n = 4$.
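Steps 64 to 67 can be sketched together as below. This is an illustration under stated assumptions, not the application's implementation: ratios are kept as fractions in [0, 1] rather than percentages, the weights stand in for the empirical constants Ci (the text gives no values), and the second-level codes in the example are invented. Only the first-level sets X11 and X21 come from the text.

```python
def similarity_probabilities(levels_a, levels_b, weights):
    """Return (G_A, G_B) for two systems described by per-level code lists.

    levels_a[i] and levels_b[i] hold the codes of level i+1 for sets A and B;
    weights[i] plays the role of the empirical constant C_i.
    Every level must contain at least one code for both systems.
    """
    f_a = f_b = 0.0
    for codes_a, codes_b, c in zip(levels_a, levels_b, weights):
        x1, x2 = set(codes_a), set(codes_b)   # duplicates removed (step 63)
        e = len(x1 & x2)                      # overlapping codes E_i (step 64)
        f_a += c * e / len(x1)                # C_i * A_i (steps 65-66)
        f_b += c * e / len(x2)                # C_i * B_i
    total = sum(weights)
    return f_a / total, f_b / total           # G_A, G_B (step 67)


# Level 1 uses X11 and X21 from the text; level 2 and the weights are hypothetical.
levels_a = [["B", "A", "C", "D", "E", "F"], ["BA", "AC"]]
levels_b = [["B", "A", "G"], ["BA"]]
g_a, g_b = similarity_probabilities(levels_a, levels_b, weights=[1.0, 3.0])
```

Because each ratio is divided by the size of the respective system's own code set, G_A and G_B generally differ, which is why the text reports one index per direction.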
The technical classification of each technical point described above may comprise one or more classification numbers.
FIG. 7 is a flowchart of the second method for retrieving technical digital assets according to an embodiment of the present application.
According to FIG. 7, in the first step (step 71), a technical solution requirement description file is placed in the digital asset data package. This file records the technical solutions that the data package needs for its technical evolution. These solutions may be competing, complementary, or related to upstream and downstream parts of the industry chain, and the problems they solve may be technical problems, legal problems, and so on.
The subsequent steps constitute the retrieval method, which is implemented by a computer program.
In step 72, a retrieval request sent by a legitimate client is received. The request includes the technical description file of the target digital asset, which contains one or more technical solutions corresponding to all of the technical points of the target digital asset. These technical solutions form the target technical solution set A.
In step 73, the set of digital asset data packages to be examined is obtained, according to the search conditions, from the system platform or from the blockchain network. For each data package in the set, the technical solution subset B corresponding to all of its technical points is obtained from its technical description section or document.
In step 74, determine the patent classification number set A corresponding to all technical solutions in the target technical solution set A, and the patent classification number set B corresponding to all technical solutions in each technical solution subset B. Because one technical solution may have several patent classification numbers, sets A and B should follow the same inclusion rule: either only the main classification number of each technical solution is included, or all of its classification numbers are included. The former improves computational efficiency; the latter improves accuracy when the processor has sufficient computing resources.
In step 75, compute the similarity index between the target technical solution set A and each technical solution subset B from the patent classification number set A and each patent classification number set B. The similarity index characterizes how similar, overall, each data package in the set to be examined is to the technical solutions given in the target digital asset technical description file of the retrieval request. Finally, in step 76, the data packages in the set to be examined are reordered according to the similarity index and output, which completes the retrieval of technical digital assets.
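The reordering in step 76 can be sketched as a sort by similarity index. The text only says the packages are reordered; sorting in descending order (most similar first) is an assumption, as is the function name `rank_packages`. The index values reuse the numbers from the FIG. 5 example purely for illustration.

```python
def rank_packages(similarity_index):
    """Return package identifiers sorted from highest to lowest similarity index."""
    return sorted(similarity_index, key=similarity_index.get, reverse=True)


# Index values borrowed from the FIG. 5 example (X11, X12, X13).
ranking = rank_packages({"521": 75, "522": 165, "523": 42})
```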
The method used in the embodiment of FIG. 7 to judge the similarity index of two technical systems relies on patent classification rules. For example, the International Patent Classification numbers recorded in the patent application information of two technical systems reveal where their technical fields overlap, from which the overall degree of similarity of the two systems can be judged. In other embodiments, any technical classification rule may be used to classify the key or main technical points of the two technical systems; the method is not limited to patent classification. In other words, patent classification is only one form of technical classification, and the method of the present application applies whenever the key or main technical points of both systems are classified under the same technical classification rule. For example, for patents that the two technical systems have filed in the United States or in Europe, the US or European patent classification numbers can be used with the method of the present application to judge the degree of conflict between any two technical systems. The following uses the International Patent Classification (IPC) as the technical classification rule for the key technical points of a technical system to describe how other embodiments of the present application are implemented.
The International Patent Classification, or IPC, combines function-oriented and application-oriented classification, with function as the primary principle and application as the secondary one. It is hierarchical: technical content is assigned to five levels, namely section, class, subclass, main group, and subgroup, which together form a complete classification system. A complete IPC classification number is therefore a combination of the symbols for the section, class, subclass, main group, and subgroup.
In one embodiment, the information of all five levels is used to judge the degree of similarity or conflict between two technical systems, or between two sets of technical systems. In another embodiment, four of the five levels are used, namely class, subclass, main group, and subgroup. Likewise, three of the five levels may be used, namely subclass, main group, and subgroup; or two of them, namely main group and subgroup; or only one, namely the subgroup.
Of the five levels, the section covers the broadest concept, and using it serves to avoid missing information; the subgroup covers the narrowest concept, and using it serves to make the information more precise. Further embodiments can therefore use other combinations of patent classification information, for example only the section, subclass, main group, and subgroup. The technical solution of the present application is described further below through a fourth embodiment, which uses three of the five levels, namely subclass, main group, and subgroup, to judge the degree of similarity or conflict between two technical systems. The method of this embodiment can be implemented in software.
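The five-level structure of an IPC symbol can be made concrete with a small parser. This is a sketch under assumptions: the regular expression covers only the common `A01B 1/00` shape (optional space included), the function name `parse_ipc` is hypothetical, and the main group is written with the `/00` suffix, following the notation used in the tables of this description.

```python
import re

# Section letter, two-digit class, subclass letter, main group digits, subgroup digits.
IPC_PATTERN = re.compile(r"([A-H])(\d{2})([A-Z])(\d{1,4})/(\d{2,6})")


def parse_ipc(symbol):
    """Split an IPC symbol such as 'E21D 15/54' into its five hierarchical levels."""
    match = IPC_PATTERN.fullmatch(symbol.replace(" ", ""))
    if match is None:
        raise ValueError(f"not a recognised IPC symbol: {symbol!r}")
    section, class_digits, subclass_letter, main_group, subgroup = match.groups()
    subclass = f"{section}{class_digits}{subclass_letter}"
    return {
        "section": section,
        "class": f"{section}{class_digits}",
        "subclass": subclass,
        "main_group": f"{subclass}{main_group}/00",
        "subgroup": f"{subclass}{main_group}/{subgroup}",
    }


parsed = parse_ipc("E21D 15/54")
```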
Specifically, the step of computing the similarity index between the target technical solution set A and the technical solution subset B, used in step 74 of the process of FIG. 7, may consist of the sub-steps below. Refer to FIG. 8, which is a flowchart of the first way, used in step 74 of that process, of computing the similarity index between the target technical solution set A and the technical solution subset B.
The process of FIG. 8 is characterized by using the patent applications of two technical systems, or of two technical solution sets, as the technical points, and the International Patent Classification numbers of those applications as the technical classification rule. Specifically, the technical correlation or similarity between the two technical systems is analyzed on the basis of the subclass, main group, and subgroup symbols of the IPC classifications of the patent applications of set A and set B.
First, in step 81, the IPC numbers in all patent application information of patent classification number set A and patent classification number set B are obtained, forming two IPC number sets that correspond to sets A and B respectively.
In step 82, obtain the subclass codes, main group codes, and subgroup codes indicated by all IPC numbers in the first number set and remove the duplicates within each group of codes. This gives the subclass code set B3 (first column of Table 2, the IPC subclasses of set A), whose size b3 is 19 (last row of that column); the main group code set B2 (first column of Table 3, the IPC main groups of set A), whose size b2 is 19 (last row of that column); and the subgroup code set B1 (first column of Table 4, the IPC subgroups of set A), whose size b1 is 13 (last row of that column).
Then obtain the subclass codes, main group codes, and subgroup codes indicated by all IPC numbers in the second number set and remove the duplicates within each group of codes. This gives the subclass code set D3 (second column of Table 2, the IPC subclasses of set B), whose size d3 is 10 (last row of that column); the main group code set D2 (second column of Table 3, the IPC main groups of set B), whose size d2 is 10 (last row of that column); and the subgroup code set D1 (second column of Table 4, the IPC subgroups of set B), whose size d1 is 5 (last row of that column).
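Deriving the three de-duplicated code sets of step 82 from a list of IPC numbers can be sketched as follows. The function name `level_sets` is hypothetical, the input list is invented for the example, and two conventions are assumptions: main groups are written with a `/00` suffix, and a symbol that is itself a main group (ending in `/00`) is not counted as a subgroup.

```python
def level_sets(ipc_numbers):
    """Return (subclasses, main_groups, subgroups) as de-duplicated sets."""
    subclasses, main_groups, subgroups = set(), set(), set()
    for number in ipc_numbers:
        number = number.replace(" ", "")
        subclass = number[:4]                      # e.g. 'E21D'
        group_digits = number[4:].split("/")[0]    # e.g. '15'
        subclasses.add(subclass)
        main_groups.add(f"{subclass}{group_digits}/00")
        if not number.endswith("/00"):             # assumption: '/00' is a main group only
            subgroups.add(number)
    return subclasses, main_groups, subgroups


# Hypothetical IPC numbers of one technical system.
subclasses, main_groups, subgroups = level_sets(
    ["E21D 15/54", "E21D15/00", "E21C35/22", "E21C35/04"]
)
```

Using sets makes the duplicate removal implicit; the sizes b3, b2, b1 (or d3, d2, d1) are then simply the lengths of the three sets.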
Table 2: Comparison of the IPC subclasses of set A and set B

| IPC subclasses of set A | IPC subclasses of set B | Overlapping IPC subclasses |
|---|---|---|
| A41D | E21C | B65G |
| A62D | E21D | C02F |
| B01D | B23P | E21C |
| B01F | B25B | E21D |
| B03B | E02F | E21F |
| B61G | B65G | |
| B61K | E21F | |
| B61L | G06Q | |
| B65G | C02F | |
| B66D | E01H | |
| C01B | | |
| C01F | | |
| C02F | | |
| C09K | | |
| C25C | | |
| E01B | | |
| E21C | | |
| E21D | | |
| E21F | | |
| Total: 19 | Total: 10 | Overlap: 5 |
Table 3: Comparison of the IPC main groups of set A and set B

| IPC main groups of set A | IPC main groups of set B | Overlapping IPC main groups |
|---|---|---|
| A41D13/00 | E21C35/00 | E21D15/00 |
| A61F17/00 | E21C41/00 | |
| A61J9/00 | E21D15/00 | |
| A62D1/00 | B23P19/00 | |
| B61K7/00 | B25B27/00 | |
| B61L11/00 | E02F9/00 | |
| B61L23/00 | E21C33/00 | |
| B65G11/00 | E21D20/00 | |
| B65G21/00 | E21D23/00 | |
| B65G65/00 | E21F13/00 | |
| B66B15/00 | | |
| B66C1/00 | | |
| B66D1/00 | | |
| C01B33/00 | | |
| C02F1/00 | | |
| C09K3/00 | | |
| C25C3/00 | | |
| E21D15/00 | | |
| E21D19/00 | | |
| Total: 19 | Total: 10 | Overlap: 1 |
Table 4: Comparison of the IPC subgroups of set A and set B

| IPC subgroups of set A | IPC subgroups of set B | Overlapping IPC subgroups |
|---|---|---|
| A47K3/22 | E21C35/22 | |
| B60M1/20 | E21C41/16 | |
| B61C11/02 | B25B27/14 | |
| B61G3/24 | E21C35/04 | |
| B61K7/16 | E21D15/54 | |
| B61K7/18 | | |
| B65G11/02 | | |
| B65G21/20 | | |
| B65G65/10 | | |
| B66B15/02 | | |
| B66D1/36 | | |
| C01B33/113 | | |
| C09K3/22 | | |
| Total: 13 | Total: 5 | Overlap: 0 |
需要说明,在步骤82中,分别选择了集合A和集合B的100%专利分类号分析对象,在其它的实施例中,也可只选择其中的一部分。这样做的结果是方法的执行结果有一定的误差,但是不影响整体判断,同时也增强了方法的实用性,任何技术***在专利分类号确定有误差的情况下也可以判断。另外,设定一个选择范围,可以在效果和效率之间取得更好的平衡,以及方法的使用灵活性。It should be noted that in step 82, the 100% patent classification number analysis objects of the set A and the set B are respectively selected. In other embodiments, only a part of them may be selected. The result of this is that the execution result of the method has a certain error, but it does not affect the overall judgment. At the same time, it also enhances the practicability of the method. Any technical system can be judged even if there is an error in the patent classification number. In addition, setting a range of options can achieve a better balance between effect and efficiency, as well as the flexibility of the method.
In step 83, from the subclass code sets B3 and D3, the main-group code sets B2 and D2, and the subgroup code sets B1 and D1 of the two technical systems obtained in step 82, the numbers of coincident codes are calculated: the number of coincident subclass codes E3 is 5 (third column of Table 1, i.e. the last row of the coincident IPC subclass column), the number of coincident main-group codes E2 is 1 (third column of Table 2, i.e. the last row of the coincident IPC main-group column), and the number of coincident subgroup codes E1 is 0 (third column of Table 3, i.e. the last row of the coincident IPC subgroup column).
In step 84, from the subclass code counts b3 = 19 and d3 = 10, the main-group code counts b2 = 19 and d2 = 10, and the subgroup code counts b1 = 13 and d1 = 5 of the two technical systems, together with the numbers of coincident codes E3 = 5 (subclass), E2 = 1 (main group) and E1 = 0 (subgroup), the subclass, main-group and subgroup code coincidence degrees of each technical system are calculated, where:
For the first technical system, A3 = (E3/b3) × 100% = (5/19) × 100% ≈ 26%, A2 = (E2/b2) × 100% = (1/19) × 100% ≈ 5%, A1 = (E1/b1) × 100% = (0/13) × 100% = 0;
For the second technical system, B3 = (E3/d3) × 100% = (5/10) × 100% = 50%, B2 = (E2/d2) × 100% = (1/10) × 100% = 10%, B1 = (E1/d1) × 100% = (0/5) × 100% = 0.
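The coincidence degrees of step 84 can be reproduced with a few lines of Python. This is a minimal sketch under the assumption that each degree is the coincident count divided by the system's own code count, expressed as a percentage; the function name `coincidence_degree` is hypothetical.

```python
def coincidence_degree(coincident: int, total: int) -> float:
    """Return the share of `total` codes that coincide, as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * coincident / total


# Counts from the worked example, per IPC level: (E_i, b_i, d_i).
levels = {
    "subclass": (5, 19, 10),
    "main group": (1, 19, 10),
    "subgroup": (0, 13, 5),
}

for name, (e, b, d) in levels.items():
    first = coincidence_degree(e, b)   # degree for the first technical system
    second = coincidence_degree(e, d)  # degree for the second technical system
    print(f"{name}: first = {first:.0f}%, second = {second:.0f}%")
```

Because the two systems have different code counts, the same number of coincident codes yields a different degree for each system, which is why the measure is not symmetric.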
In steps 85 and 86, the patent technology correlation index F of each technical system relative to the other is calculated from the coincidence degrees, where:
F_A = C3*A3 + C2*A2 + C1*A1 and F_B = C3*B3 + C2*B2 + C1*B1, with C3, C2 and C1 being empirical constants. In this example, C3, C2 and C1 are the correlation coefficients relating the IPC subclass, main-group and subgroup classifications, respectively, to a conflict between the two systems; their empirical values are 1, 2 and 3.
For the first technical system, F_A = C3*A3 + C2*A2 + C1*A1 = 1*26% + 2*5% + 3*0 = 36%.
For the second technical system, F_B = C3*B3 + C2*B2 + C1*B1 = 1*50% + 2*10% + 3*0 = 70%.
In step 16, the patent conflict probability G of each technical system relative to the other is calculated from the correlation index F, where G_A = F_A/(C3+C2+C1) = 36%/(1+2+3) = 6% and G_B = F_B/(C3+C2+C1) = 70%/(1+2+3) ≈ 11.7%.
Here, G_A is the similarity index of the target technical solution set A with respect to the technical solution subset B, and G_B is the similarity index of the technical solution subset B with respect to the target technical solution set A.
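The weighted sum F and its normalised form G can be written generically for any number of classification levels. The sketch below is illustrative only: the function names are hypothetical, the weights 1, 2, 3 are the empirical constants of this example, and the coincidence degrees are the rounded percentages used in the text.

```python
def correlation_index(degrees, weights):
    """Weighted sum F = sum(C_i * degree_i) over the classification levels."""
    if len(degrees) != len(weights):
        raise ValueError("degrees and weights must have the same length")
    return sum(c * a for c, a in zip(weights, degrees))


def similarity_index(degrees, weights):
    """Normalised index G = F / sum(C_i)."""
    return correlation_index(degrees, weights) / sum(weights)


weights = [1, 2, 3]        # C3, C2, C1 (subclass, main group, subgroup)
degrees_a = [26, 5, 0]     # A3, A2, A1 in percent
degrees_b = [50, 10, 0]    # B3, B2, B1 in percent

for label, degrees in (("A", degrees_a), ("B", degrees_b)):
    f = correlation_index(degrees, weights)
    g = similarity_index(degrees, weights)
    print(f"F_{label} = {f}%, G_{label} = {g:.1f}%")
```

Dividing by the sum of the weights keeps G between 0% and 100% whenever every coincidence degree lies in that range, so G values obtained with different weight choices remain comparable.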

Claims (10)

  1. A retrieval method for technical open digital assets, characterized by:
    setting a technical solution requirement description file in a digital asset data package, the file including at least one technical solution required by the digital asset data package;
    obtaining a retrieval request and the corresponding technical description file of a target digital asset, the technical description file including at least one technical solution, to obtain a target technical solution set A;
    obtaining a set of digital asset data packages to be examined, and determining the technical solution subset B corresponding to the technical solution requirement description file in each digital asset data package in the set;
    calculating, from the target technical solution set A and each technical solution subset B, a similarity index between the target technical solution set A and each technical solution subset B;
    sorting the data packages in the set of digital asset data packages according to the similarity index and outputting them.
  2. The retrieval method according to claim 1, characterized in that the similarity index between the target technical solution set A and the technical solution subset B is calculated according to the following steps:
    calculating the solution similarity between each technical solution in the target technical solution set A and each technical solution in the technical solution subset B, and summing the solution similarities to obtain a maximum solution similarity;
    adding up the maximum solution similarities to obtain the similarity index between the target technical solution set A and the technical solution subset B.
  3. The retrieval method according to claim 2, characterized in that the solution similarity between each technical solution in the target technical solution set A and each technical solution in the technical solution subset B is calculated according to the following steps:
    obtaining a target keyword set generated from all keywords of each technical solution in the target technical solution set A and their corresponding derived words, the target keyword set being the keyword set that remains after duplicate keywords are removed;
    counting the number of times each keyword in the target keyword set appears in each technical solution of the technical solution subset B;
    taking the sum of all of said counts as the similarity between the target technical solution A and the technical solution subset B.
  4. The retrieval method according to claim 1, characterized in that calculating the similarity index between the target technical solution set A and the technical solution subset B comprises the following steps:
    determining or selecting a technical classification rule that has at least two hierarchical levels;
    taking the target technical solution set A and the technical solution subset B as corresponding technical point sets A and B, and classifying the technical points in the technical point sets A and B using the technical classification rule to obtain corresponding classification number sets A and B;
    in the classification number set A, selecting M% of the classification numbers and obtaining the code at each level indicated by each number, to obtain, for each level, the set X1i of codes of all the selected classification numbers and the corresponding quantity Y1i; and, in the classification number set B, selecting N% of the classification numbers and obtaining the code at each level indicated by each number, to obtain, for each level, the set X2i of codes of all the selected classification numbers and the corresponding quantity Y2i; wherein the information in the sets is the information remaining after duplicates are removed;
    calculating, from the code sets X1i and X2i, the number Ei of coincident codes at each level of X1i and X2i;
    calculating, from Y1i, Y2i and Ei, the relative coincidence degrees Ai and Bi of the codes at each level of the classification number sets A and B; wherein, for the classification number set A, Ai = (Ei/Y1i) × 100%; for the classification number set B, Bi = (Ei/Y2i) × 100%;
    calculating, from the relative coincidence degrees Ai and Bi, the technical correlation index F_A of the classification number set A and the technical correlation index F_B of the classification number set B; wherein
    F_A = ΣCi*Ai; F_B = ΣCi*Bi; where Ci is an empirical constant;
    calculating, from the correlation indexes F_A and F_B, the similarity probabilities G_A and G_B of the classification number sets A and B; wherein
    G_A = F_A/(ΣCi); G_B = F_B/(ΣCi);
    where G_A is the similarity index of the target technical solution set A with respect to the technical solution subset B, and G_B is the similarity index of the technical solution subset B with respect to the target technical solution set A;
    in the above formulas, i = 1 to n, where n is the number of code levels of the technical classification rule.
  5. The retrieval method according to claim 4, characterized in that the technical classification number of each technical point includes one or more classification numbers.
  6. A retrieval method for technical open digital assets, characterized by:
    setting a technical solution requirement description file in a digital asset data package, the file including at least one technical solution required by the digital asset data package;
    obtaining a retrieval request and the corresponding technical description file of a target digital asset, the technical description file including at least one technical solution, to obtain a target technical solution set A;
    obtaining a set of digital asset data packages to be examined, and determining the technical solution subset B corresponding to the technical solution requirement description file in each digital asset data package in the set;
    determining the patent classification number set A of all technical solutions corresponding to the target technical solution set A and the patent classification number set B of all technical solutions corresponding to each technical solution subset B;
    calculating, from the patent classification number set A and each patent classification number set B, a similarity index between the target technical solution set A and each technical solution subset B;
    sorting the data packages in the set of digital asset data packages according to the similarity index and outputting them.
  7. The retrieval method according to claim 6, characterized in that the similarity index between the target technical solution set A and the technical solution subset B is calculated according to the following steps:
    obtaining, for M% of the international patent classification numbers in the patent classification number set A, the section code set B5 and its quantity b5, the class code set B4 and its quantity b4, the subclass code set B3 and its quantity b3, the main-group code set B2 and its quantity b2, and the subgroup code set B1 and its quantity b1; and obtaining, for N% of the international patent classification numbers in the patent classification number set B, the section code set D5 and its quantity d5, the class code set D4 and its quantity d4, the subclass code set D3 and its quantity d3, the main-group code set D2 and its quantity d2, and the subgroup code set D1 and its quantity d1; wherein 100 ≥ M > 0 and 100 ≥ N > 0, and the information in the sets is the information remaining after duplicates are removed;
    calculating, from the section code sets B5 and D5, the class code sets B4 and D4, the subclass code sets B3 and D3, the main-group code sets B2 and D2, and the subgroup code sets B1 and D1 of the patent classification number sets A and B, the number E5 of coincident section codes, the number E4 of coincident class codes, the number E3 of coincident subclass codes, the number E2 of coincident main-group codes and the number E1 of coincident subgroup codes of the patent classification number sets A and B;
    calculating, from the section code quantities b5 and d5, the class code quantities b4 and d4, the subclass code quantities b3 and d3, the main-group code quantities b2 and d2, and the subgroup code quantities b1 and d1 of the patent classification number sets A and B, together with the numbers of coincident codes E5, E4, E3, E2 and E1, the section code coincidence degrees A5 and B5, the class code coincidence degrees A4 and B4, the subclass code coincidence degrees A3 and B3, the main-group code coincidence degrees A2 and B2, and the subgroup code coincidence degrees A1 and B1 of the two patent classification number sets A and B; wherein
    for the patent classification number set A, A5 = (E5/b5) × 100%, A4 = (E4/b4) × 100%, A3 = (E3/b3) × 100%, A2 = (E2/b2) × 100%, A1 = (E1/b1) × 100%;
    for the patent classification number set B, B5 = (E5/d5) × 100%, B4 = (E4/d4) × 100%, B3 = (E3/d3) × 100%, B2 = (E2/d2) × 100%, B1 = (E1/d1) × 100%;
    calculating, from the coincidence degrees A5, B5, A4, B4, A3, B3, A2, B2, A1 and B1, the patent technology correlation index F_A or F_B of the target technical solution set A and the technical solution subset B; wherein
    for the target technical solution set A, F_A = C5*A5 + C4*A4 + C3*A3 + C2*A2 + C1*A1;
    for the technical solution subset B, F_B = C5*B5 + C4*B4 + C3*B3 + C2*B2 + C1*B1;
    where C5, C4, C3, C2 and C1 are empirical constants;
    calculating, from the correlation index F, the mutual similarity probability G of the target technical solution set A and the technical solution subset B; wherein
    G_A = F_A/(C5+C4+C3+C2+C1); G_B = F_B/(C5+C4+C3+C2+C1);
    where G_A is the similarity index of the target technical solution set A with respect to the technical solution subset B, and G_B is the similarity index of the technical solution subset B with respect to the target technical solution set A.
  8. The retrieval method according to claim 6, characterized in that the similarity index between the target technical solution set A and the technical solution subset B is calculated according to the following steps:
    obtaining, for M% of the international patent classification numbers in the patent classification number set A, the class code set B4 and its quantity b4, the subclass code set B3 and its quantity b3, the main-group code set B2 and its quantity b2, and the subgroup code set B1 and its quantity b1; and obtaining, for N% of the international patent classification numbers in the patent classification number set B, the class code set D4 and its quantity d4, the subclass code set D3 and its quantity d3, the main-group code set D2 and its quantity d2, and the subgroup code set D1 and its quantity d1; wherein 100 ≥ M > 0 and 100 ≥ N > 0, and the information in the sets is the information remaining after duplicates are removed;
    calculating, from the class code sets B4 and D4, the subclass code sets B3 and D3, the main-group code sets B2 and D2, and the subgroup code sets B1 and D1 of the patent classification number sets A and B, the number E4 of coincident class codes, the number E3 of coincident subclass codes, the number E2 of coincident main-group codes and the number E1 of coincident subgroup codes of the patent classification number sets A and B;
    calculating, from the class code quantities b4 and d4, the subclass code quantities b3 and d3, the main-group code quantities b2 and d2, and the subgroup code quantities b1 and d1 of the patent classification number sets A and B, together with the numbers of coincident codes E4, E3, E2 and E1, the class code coincidence degrees A4 and B4, the subclass code coincidence degrees A3 and B3, the main-group code coincidence degrees A2 and B2, and the subgroup code coincidence degrees A1 and B1 of the two patent classification number sets A and B; wherein
    for the patent classification number set A, A4 = (E4/b4) × 100%, A3 = (E3/b3) × 100%, A2 = (E2/b2) × 100%, A1 = (E1/b1) × 100%;
    for the patent classification number set B, B4 = (E4/d4) × 100%, B3 = (E3/d3) × 100%, B2 = (E2/d2) × 100%, B1 = (E1/d1) × 100%;
    calculating, from the coincidence degrees A4, B4, A3, B3, A2, B2, A1 and B1, the patent technology correlation index F_A or F_B of the target technical solution set A and the technical solution subset B; wherein
    for the target technical solution set A, F_A = C4*A4 + C3*A3 + C2*A2 + C1*A1;
    for the technical solution subset B, F_B = C4*B4 + C3*B3 + C2*B2 + C1*B1;
    where C4, C3, C2 and C1 are empirical constants;
    calculating, from the correlation index F, the mutual similarity probability G of the target technical solution set A and the technical solution subset B; wherein
    G_A = F_A/(C4+C3+C2+C1); G_B = F_B/(C4+C3+C2+C1);
    where G_A is the similarity index of the target technical solution set A with respect to the technical solution subset B, and G_B is the similarity index of the technical solution subset B with respect to the target technical solution set A.
  9. The retrieval method according to claim 6, characterized in that calculating the similarity index between the target technical solution set A and the technical solution subset B comprises the following steps:
    obtaining, for M% of the international patent classification numbers in the patent classification number set A, the subclass code set B3 and its quantity b3, the main-group code set B2 and its quantity b2, and the subgroup code set B1 and its quantity b1; and obtaining, for N% of the international patent classification numbers in the patent classification number set B, the subclass code set D3 and its quantity d3, the main-group code set D2 and its quantity d2, and the subgroup code set D1 and its quantity d1; wherein 100 ≥ M > 0 and 100 ≥ N > 0, and the information in the sets is the information remaining after duplicates are removed;
    calculating, from the subclass code sets B3 and D3, the main-group code sets B2 and D2, and the subgroup code sets B1 and D1 of the patent classification number sets A and B, the number E3 of coincident subclass codes, the number E2 of coincident main-group codes and the number E1 of coincident subgroup codes of the patent classification number sets A and B;
    calculating, from the subclass code quantities b3 and d3, the main-group code quantities b2 and d2, and the subgroup code quantities b1 and d1 of the patent classification number sets A and B, together with the numbers of coincident codes E3, E2 and E1, the subclass code coincidence degrees A3 and B3, the main-group code coincidence degrees A2 and B2, and the subgroup code coincidence degrees A1 and B1 of the two patent classification number sets A and B; wherein
    for the patent classification number set A, A3 = (E3/b3) × 100%, A2 = (E2/b2) × 100%, A1 = (E1/b1) × 100%;
    for the patent classification number set B, B3 = (E3/d3) × 100%, B2 = (E2/d2) × 100%, B1 = (E1/d1) × 100%;
    calculating, from the coincidence degrees A3, B3, A2, B2, A1 and B1, the patent technology correlation index F_A or F_B of the target technical solution set A and the technical solution subset B; wherein
    for the target technical solution set A, F_A = C3*A3 + C2*A2 + C1*A1;
    for the technical solution subset B, F_B = C3*B3 + C2*B2 + C1*B1;
    where C3, C2 and C1 are empirical constants;
    calculating, from the correlation index F, the mutual similarity probability G of the target technical solution set A and the technical solution subset B; wherein
    G_A = F_A/(C3+C2+C1); G_B = F_B/(C3+C2+C1);
    where G_A is the similarity index of the target technical solution set A with respect to the technical solution subset B, and G_B is the similarity index of the technical solution subset B with respect to the target technical solution set A.
  10. The retrieval method according to claim 6, characterized in that the similarity index between the target technical solution set A and the technical solution subset B is calculated according to the following steps:
    obtaining, for M% of the international patent classification numbers in the patent classification number set A, the main-group code set B2 and its quantity b2, and the subgroup code set B1 and its quantity b1; and obtaining, for N% of the international patent classification numbers in the patent classification number set B, the main-group code set D2 and its quantity d2, and the subgroup code set D1 and its quantity d1; wherein 100 ≥ M > 0 and 100 ≥ N > 0, and the information in the sets is the information remaining after duplicates are removed;
    calculating, from the main-group code sets B2 and D2 and the subgroup code sets B1 and D1 of the patent classification number sets A and B, the number E2 of coincident main-group codes and the number E1 of coincident subgroup codes of the patent classification number sets A and B;
    calculating, from the main-group code quantities b2 and d2 and the subgroup code quantities b1 and d1 of the patent classification number sets A and B, together with the numbers of coincident codes E2 and E1, the main-group code coincidence degrees A2 and B2 and the subgroup code coincidence degrees A1 and B1 of the two patent classification number sets A and B; wherein
    for the patent classification number set A, A2 = (E2/b2) × 100%, A1 = (E1/b1) × 100%;
    for the patent classification number set B, B2 = (E2/d2) × 100%, B1 = (E1/d1) × 100%;
    calculating, from the coincidence degrees A2, B2, A1 and B1, the patent technology correlation index F_A or F_B of the target technical solution set A and the technical solution subset B; wherein
    for the target technical solution set A, F_A = C2*A2 + C1*A1;
    for the technical solution subset B, F_B = C2*B2 + C1*B1;
    where C2 and C1 are empirical constants;
    calculating, from the correlation index F, the mutual similarity probability G of the target technical solution set A and the technical solution subset B; wherein
    G_A = F_A/(C2+C1); G_B = F_B/(C2+C1);
    where G_A is the similarity index of the target technical solution set A with respect to the technical solution subset B, and G_B is the similarity index of the technical solution subset B with respect to the target technical solution set A.
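
The keyword-counting similarity of claims 2 and 3 and the sorting step of claim 1 can be sketched as follows. This is a simplified illustration, not the claimed implementation: the function names, the package names and the sample texts are hypothetical, keywords are assumed to be single words matched case-insensitively, derived words are assumed to be already included in the keyword list, and descending order is an assumption because the claims do not state a sort direction.

```python
import re


def keyword_similarity(target_keywords, subset_texts):
    """Sum the occurrences of each de-duplicated keyword in every text."""
    unique = {k.lower() for k in target_keywords}  # remove duplicate keywords
    total = 0
    for text in subset_texts:
        words = re.findall(r"\w+", text.lower())
        total += sum(words.count(k) for k in unique)
    return total


def rank_packages(target_keywords, packages):
    """Return package names ordered by similarity, highest first (assumed)."""
    scores = {
        name: keyword_similarity(target_keywords, texts)
        for name, texts in packages.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical target keywords and data packages, for illustration only.
target = ["conveyor", "belt", "Conveyor"]
packages = {
    "pkg-2": ["a hydraulic support for mine roofs"],
    "pkg-1": ["a belt conveyor with a tension roller", "conveyor drive unit"],
}

print(rank_packages(target, packages))  # prints: ['pkg-1', 'pkg-2']
```

Raw occurrence counts favour longer texts, so a practical variant might normalise by text length; the claims themselves only specify the plain sum.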
PCT/CN2020/094207 2019-07-26 2020-06-03 Technical open digital asset retrieval method WO2021017633A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910684229.4A CN112307055A (en) 2019-07-26 2019-07-26 Retrieval method of technical open type digital assets
CN201910684229.4 2019-07-26

Publications (1)

Publication Number Publication Date
WO2021017633A1 true WO2021017633A1 (en) 2021-02-04

Family

ID=74230041

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/094207 WO2021017633A1 (en) 2019-07-26 2020-06-03 Technical open digital asset retrieval method

Country Status (3)

Country Link
CN (1) CN112307055A (en)
FR (1) FR3099599A1 (en)
WO (1) WO2021017633A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130086084A1 (en) * 2011-10-03 2013-04-04 Steven W. Lundberg Patent mapping
CN103455609A (en) * 2013-09-05 2013-12-18 江苏大学 New kernel function Luke kernel-based patent document similarity detection method
CN107247780A (en) * 2017-06-12 2017-10-13 北京理工大学 A kind of patent document method for measuring similarity of knowledge based body
CN107528876A (en) * 2016-08-09 2017-12-29 天津转知汇网络技术有限公司 The instant distributing interaction method and system of patent information
CN108595409A (en) * 2018-03-16 2018-09-28 上海大学 A kind of requirement documents based on neural network and service document matches method

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110246379A1 (en) * 2010-04-02 2011-10-06 Cpa Global Patent Research Limited Intellectual property scoring platform
US10891701B2 (en) * 2011-04-15 2021-01-12 Rowan TELS Corp. Method and system for evaluating intellectual property
CN105320772B (en) * 2015-11-02 2019-03-26 武汉大学 A kind of association paper querying method of patent duplicate checking
CN109325099A (en) * 2018-09-18 2019-02-12 江苏润桐数据服务有限公司 A kind of method and apparatus of automatically retrieval

Also Published As

Publication number Publication date
FR3099599A1 (en) 2021-02-05
CN112307055A (en) 2021-02-02

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20847626

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20847626

Country of ref document: EP

Kind code of ref document: A1