CN112579841B - Multi-mode database establishment method, retrieval method and system - Google Patents

Info

Publication number
CN112579841B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011542924.6A
Other languages
Chinese (zh)
Other versions
CN112579841A (en
Inventor
贾红
李植民
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen University
Original Assignee
Shenzhen University
Application filed by Shenzhen University
Priority to CN202011542924.6A
Publication of CN112579841A
Application granted
Publication of CN112579841B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G06F16/90335 Query processing
    • G06F16/906 Clustering; Classification

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a multi-mode database establishment method, a retrieval method and a system. The establishment method comprises: acquiring mixed mode data; extracting the data features corresponding to each mode data from the mixed mode data, and constructing a mixed mode feature set; clustering the feature data in the mixed mode feature set over a preset tag category number range to obtain, for each preset tag category number, the sub-tags corresponding to each mode data and the total tags corresponding to the mixed mode data; calculating, for each preset tag category number, the score value of a preset clustering evaluation index for the total tags in order to determine the target tag category number; and establishing a multi-mode database from the sub-tags of each mode data and the total tags of the mixed mode data under the target tag category number. The resulting database carries both a global total tag and a local sub-tag for each mode, which improves the accuracy of detailed data retrieval and gives the method good generality.

Description

Multi-mode database establishment method, retrieval method and system
Technical Field
The invention relates to the technical field of computers, in particular to a multi-mode database establishment method, a multi-mode database retrieval method and a multi-mode database retrieval system.
Background
Multimodal data refers to data collected in a variety of different devices or scenarios. Data sets in the real world tend to be multi-modal, for example: a story may be described by a text narration, also in images or audio; a document may be represented in a number of different languages, may also be represented by user ratings, and so on. The establishment of the multi-modal database aims at obtaining important characteristics and representative retrieval labels of the multi-modal data by analyzing and processing the multi-modal data, and based on the important characteristics and the representative retrieval labels, the establishment of the database which is convenient for subsequent data retrieval is carried out.
A multi-modal database can make full use of the complementarity among modes and eliminate redundancy between them, and it reflects the data more faithfully than a traditional single-mode database, so there is an urgent need to establish multi-modal databases. However, in existing multi-modal databases the data layer generally holds a single data type (for example, all data are image data, or all are audio data), and when labels are established for the multi-modal data, the number of sub-label categories in each mode is set equal to the number of total-label categories. This setting ignores the characteristic information of the data in the different modes and reduces the accuracy of detailed data retrieval in the multi-modal database.
Disclosure of Invention
In view of the above, the embodiments of the present invention provide a method for establishing a multi-modal database, a method for retrieving the multi-modal database, and a system thereof, so as to solve the problem in the prior art that the accuracy of detail retrieval of the multi-modal database is low because the method for establishing the multi-modal database ignores the data characteristic information of the data of different modalities.
The embodiment of the invention provides a method for establishing a multi-mode database, which comprises the following steps:
acquiring mixed mode data, wherein the mixed mode data comprises multi-mode data with different data types;
extracting data features corresponding to each mode data from the mixed mode data, and constructing a mixed mode feature set;
clustering the feature data in the mixed mode feature set according to the preset tag class number range to obtain sub tags corresponding to the mode data under different preset tag class numbers and total tags corresponding to the mixed mode data;
respectively calculating scoring values of preset clustering evaluation indexes corresponding to the total tags under different preset tag category numbers;
and determining the number of target label categories based on the grading value, and establishing a multi-mode database according to sub-labels corresponding to the modal data under the number of target label categories and the total labels corresponding to the mixed modal data.
Optionally, the extracting data features corresponding to each mode data from the mixed mode data, and constructing a mixed mode feature set includes:
classifying the mixed mode data based on data types to obtain each single mode data;
acquiring preset feature extraction parameters of each single-mode data, and carrying out feature extraction on each single-mode data according to the preset feature extraction parameters to obtain feature data corresponding to each single-mode data;
and constructing the mixed mode feature set based on feature data corresponding to each single mode data.
Optionally, clustering the feature data in the mixed mode feature set according to a preset tag class number range to obtain sub tags corresponding to each mode data under different preset tag class numbers and total tags corresponding to the mixed mode data, including:
acquiring the number of the current preset label categories;
clustering the characteristic data corresponding to each single-mode data to obtain the current sub-label corresponding to each single-mode data;
clustering the current sub-labels corresponding to the modal data based on the current preset label class number to obtain the current total label of the mixed modal data.
Optionally, clustering the feature data in the mixed mode feature set according to a preset tag class number range to obtain sub tags corresponding to each mode data under different preset tag class numbers and total tags corresponding to the mixed mode data, and further including:
calculating the adjusted Rand index (ARI) value of each current sub-label based on the current total label;
and performing a dimension-reduction update on the feature data corresponding to each current sub-label whose ARI value does not meet the preset adjusted Rand index value, and clustering the updated feature data again to update the current sub-label corresponding to each single-mode data.
Optionally, the determining the number of target tag categories based on the scoring value, and establishing a multi-mode database according to sub-tags corresponding to each mode data and total tags corresponding to the mixed mode data under the number of target tag categories includes:
sorting scoring values corresponding to the number of the preset label categories;
determining the preset label category number with the highest score value as the target label category number;
and respectively adding a total label and a sub-label corresponding to each mode data, and establishing a multi-mode database of the mixed mode data.
The embodiment of the invention also provides a multi-mode database retrieval method, which comprises the following steps:
acquiring a search tag set, wherein the search tag set comprises a plurality of search tags;
and searching in the multi-modal database based on the search tag set to obtain a search result corresponding to the search tag set, wherein the multi-modal database is a multi-modal database established by adopting the multi-modal database establishing method according to another embodiment of the invention.
The embodiment of the invention also provides a system for establishing the multi-mode database, which comprises:
the first acquisition module is used for acquiring mixed mode data, wherein the mixed mode data comprise multi-mode data with different data types;
the first processing module is used for extracting data features corresponding to each mode data from the mixed mode data and constructing a mixed mode feature set;
the second processing module is used for clustering the characteristic data in the mixed mode characteristic set according to the preset label category number range to obtain sub-labels corresponding to the mode data under different preset label category numbers and total labels corresponding to the mixed mode data;
the third processing module is used for respectively calculating the scoring values of the preset clustering evaluation indexes corresponding to the total labels under different preset label category numbers;
and the fourth processing module is used for determining the number of target label categories based on the grading value, and establishing a multi-mode database according to the sub-labels corresponding to the modal data under the number of target label categories and the total labels corresponding to the mixed modal data.
The embodiment of the invention also provides a multi-mode database retrieval system, which comprises:
the second acquisition module is used for acquiring a search tag set, wherein the search tag set comprises a plurality of search tags;
and a fifth processing module, configured to perform a search in the multi-modal database based on the search tag set, to obtain a search result corresponding to the search tag set, where the multi-modal database is a multi-modal database built by using the multi-modal database building system according to another embodiment of the present invention.
The embodiment of the invention also provides electronic equipment, which comprises: the system comprises a memory and a processor, wherein the memory and the processor are in communication connection, the memory stores computer instructions, and the processor executes the computer instructions so as to execute the multi-mode database establishment method provided by the embodiment of the invention or execute the multi-mode database retrieval method provided by the embodiment of the invention.
The embodiment of the invention also provides a computer readable storage medium which stores computer instructions for causing the computer to execute the multi-modal database establishing method provided by the embodiment of the invention or execute the multi-modal database searching method provided by the embodiment of the invention.
The technical scheme of the invention has the following advantages:
The embodiment of the invention provides a method and a system for establishing a multi-mode database. Mixed mode data is acquired, the mixed mode data comprising multi-mode data with different data types; data features corresponding to each mode data are extracted from the mixed mode data, and a mixed mode feature set is constructed; the feature data in the mixed mode feature set is clustered over the preset tag category number range to obtain the sub-tags corresponding to each mode data and the total tags corresponding to the mixed mode data under different preset tag category numbers; the score values of the preset clustering evaluation index corresponding to the total tags are calculated for the different preset tag category numbers; and the target tag category number is determined based on the score values, and a multi-mode database is established from the sub-tags of each mode data and the total tags of the mixed mode data under the target tag category number. Because the sub-tags and total tags are obtained by feature clustering over the whole range of preset tag category numbers and then selected by the score value of the clustering evaluation index, the database carries both a global total tag and a local sub-tag for each mode, so data can be given two-level tags. This improves the accuracy of detailed data retrieval, and the method has good generality because it can establish databases comprising several different data types.
The embodiment of the invention provides a multi-modal database retrieval method and a system. A retrieval tag set is obtained and, based on the retrieval tag set, retrieval is performed in a multi-modal database established by the multi-modal database establishment method provided by the other embodiment of the invention, so as to obtain a retrieval result corresponding to the retrieval tag set. Searching with the retrieval tag set in a multi-modal database that carries both a global total tag and a local sub-tag for each mode facilitates detailed retrieval of the data stored in the database, improves the accuracy of the retrieval results, and broadens the application range of the multi-modal database.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are needed in the description of the embodiments or the prior art will be briefly described, and it is obvious that the drawings in the description below are some embodiments of the present invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flowchart of a method for establishing a multi-modal database in an embodiment of the invention;
FIG. 2 is a schematic diagram of a label generation process in an embodiment of the invention;
FIG. 3 is a diagram of a multimodal database built in accordance with an embodiment of the invention;
FIG. 4 is a flowchart of a multi-modal database retrieval method in an embodiment of the invention;
FIG. 5 is a schematic diagram of a multi-modal database creation system according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of a multi-modal database retrieval system according to an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to fall within the scope of the invention.
The technical features of the different embodiments of the invention described below may be combined with one another as long as they do not conflict with one another.
A multi-modal database can make full use of the complementarity among modes and eliminate redundancy between them, and it reflects the data more faithfully than a traditional single-mode database, so there is an urgent need to establish multi-modal databases. However, in existing multi-modal databases the data layer generally holds a single data type (for example, all data are image data, or all are audio data), and when labels are established for the multi-modal data, the number of sub-label categories in each mode is set equal to the number of total-label categories. This setting ignores the characteristic information of the data in the different modes and reduces the accuracy of detailed data retrieval in the multi-modal database.
Based on the above problems, the embodiment of the present invention provides a method for establishing a multi-modal database for multi-modal data of different data types. As shown in fig. 1, the method mainly includes the following steps:
Step S101: acquiring mixed mode data.
In the embodiment of the present invention, the mixed mode data is described by taking multi-mode data composed of images, texts, user evaluations and audio as an example; the invention is not limited thereto.
Step S102: extracting data features corresponding to each mode data from the mixed mode data, and constructing a mixed mode feature set.
Data of different modes carries different types of information, and the data features are typical features that characterize data of the corresponding type. The specific feature types to extract can be set according to actual needs, and the invention is not limited in this respect. In the embodiment of the invention, the data features extracted from image data include color distribution, texture, edges, histograms of oriented gradients and the like; the data features extracted from text data are word frequency features; the data features extracted from user evaluation data are keyword frequencies; and the data features extracted from audio data are spectral features and the like.
Step S103: clustering the feature data in the mixed mode feature set according to the preset tag class number range to obtain sub tags corresponding to the mode data under different preset tag class numbers and total tags corresponding to the mixed mode data.
The preset tag category number range is an approximate range for the number of total-tag categories, set according to the requirements for establishing the multi-mode database. For example, a preset tag category number range of [5, 10] indicates that all data in the multi-modal database is to be divided into between 5 and 10 total-tag categories.
Step S104: respectively calculating the score values of the preset clustering evaluation index corresponding to the total labels under different preset label category numbers.
The preset clustering evaluation index is used to evaluate the quality of the classification results of the multi-mode data under different preset label category numbers. In the embodiment of the invention, the Calinski-Harabasz index (abbreviated as CH coefficient) is selected as the preset clustering evaluation index; in practical application, other clustering evaluation indexes, such as the silhouette coefficient, can also be selected, and the invention is not limited in this respect.
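As a minimal sketch of the score calculated in step S104, the CH coefficient can be computed as the ratio of between-cluster dispersion to within-cluster dispersion, each normalised by its degrees of freedom. The function name `calinski_harabasz` and the use of plain coordinate tuples are assumptions made for illustration; the text does not state in which feature space the total labels are scored.

```python
from collections import defaultdict


def calinski_harabasz(points, labels):
    """CH coefficient: (between / (k - 1)) / (within / (n - k)).

    `points` is a list of equal-length numeric tuples and `labels` gives one
    cluster label per point. Both are illustrative assumptions.
    """
    n = len(points)
    dim = len(points[0])
    groups = defaultdict(list)
    for point, label in zip(points, labels):
        groups[label].append(point)
    k = len(groups)
    if k < 2 or k >= n:
        raise ValueError("need 2 <= number of clusters < number of samples")

    def mean(rows):
        return [sum(r[d] for r in rows) / len(rows) for d in range(dim)]

    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    overall = mean(points)
    between = 0.0
    within = 0.0
    for rows in groups.values():
        centre = mean(rows)
        # Dispersion of the cluster centre around the global centre.
        between += len(rows) * sqdist(centre, overall)
        # Dispersion of the members around their own centre.
        within += sum(sqdist(r, centre) for r in rows)
    if within == 0.0:
        return float("inf")
    return (between / (k - 1)) / (within / (n - k))
```

A higher value indicates compact, well-separated clusters, so the score can be compared across the candidate label category numbers.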
Step S105: determining the target label category number based on the score values, and establishing a multi-mode database according to the sub-labels corresponding to each mode data and the total labels corresponding to the mixed mode data under the target label category number.
Specifically, after the sub-labels and the total labels are set for the data of the different modes according to the label results under the target label category number, the mixed mode data is stored with the sub-labels as serial numbers, and a multi-modal database corresponding to the mixed mode data is obtained.
Through steps S101 to S105, the multi-mode database establishment method provided by the embodiment of the invention extracts data features from the mode data of each data type, constructs the sub-label corresponding to each mode data and the total label of the mixed mode data by feature clustering over the range of preset label category numbers, and then determines the final sub-labels and total labels by calculating the score value of the preset clustering evaluation index under each preset label category number before establishing the multi-mode database. The database therefore carries both a global total label and a local sub-label for each mode, so that data can be given two-level labels. This improves the accuracy of detailed data retrieval, and the method has good generality because it can establish databases containing several different data types.
Specifically, in an embodiment, the step S102 specifically includes the following steps:
Step S201: classifying the mixed mode data based on the data types to obtain each single mode data.
Specifically, the mixed mode data is classified according to data type to obtain each single mode data, where each single mode data represents one data type. For example, the mixed mode data is classified into the data types image, text, audio and user evaluation.
Step S202: acquiring preset feature extraction parameters of each single-mode data, and carrying out feature extraction on each single-mode data according to the preset feature extraction parameters to obtain feature data corresponding to each single-mode data.
Step S203: constructing a mixed mode feature set based on the feature data corresponding to each single mode data.
The mixed mode feature set is constructed by representing each single-mode data by its feature data. For example, if the single-mode data is image data, features such as color distribution, texture, edges and histograms of oriented gradients are extracted from it and used to represent the image data. Feature extraction reduces the data volume of the single-mode data, which facilitates the subsequent cluster analysis used to obtain the sub-labels of each mode and improves the calculation speed.
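Steps S201 to S203 can be sketched as follows, under simplifying assumptions: records are dictionaries with hypothetical `type` and `content` keys, text is represented by word frequencies over a given vocabulary, and an image is represented by a coarse histogram of its pixel values (a stand-in for the color distribution feature). Audio and user evaluation data are omitted for brevity, and all function names are illustrative rather than taken from the patent.

```python
from collections import Counter, defaultdict


def split_by_type(records):
    """Step S201: group mixed mode records by their data type."""
    by_type = defaultdict(list)
    for rec in records:
        by_type[rec["type"]].append(rec["content"])
    return dict(by_type)


def text_features(text, vocabulary):
    """Word frequency of each vocabulary term, normalised by text length."""
    counts = Counter(text.lower().split())
    total = sum(counts.values()) or 1
    return [counts[word] / total for word in vocabulary]


def image_features(pixels, bins=4, max_value=256):
    """Normalised histogram of pixel intensities in `bins` equal ranges."""
    hist = [0] * bins
    for value in pixels:
        hist[value * bins // max_value] += 1
    total = len(pixels) or 1
    return [h / total for h in hist]


def build_feature_set(records, vocabulary, bins=4):
    """Steps S202 and S203: extract features per mode and collect them."""
    single_mode = split_by_type(records)
    # The "preset feature extraction parameters" are the vocabulary and bins.
    extractors = {
        "text": lambda content: text_features(content, vocabulary),
        "image": lambda content: image_features(content, bins),
    }
    return {
        mode: [extractors[mode](item) for item in items]
        for mode, items in single_mode.items()
        if mode in extractors
    }
```

The returned dictionary maps each mode to a list of feature vectors, which is one possible concrete form of the mixed mode feature set.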
Specifically, in one embodiment, the step S103 specifically includes the following steps:
Step S301: obtaining the current preset label category number.
The current preset label category number k belongs to the preset label category number range, i.e. k ∈ [Min_k, Max_k].
Step S302: clustering the feature data corresponding to each single-mode data to obtain the current sub-label corresponding to each single-mode data.
Specifically, a single-view clustering algorithm is applied to each single-mode data to obtain the sub-labels L_1 to L_4 of the respective modes. In practical application, the clustering algorithm may be selected according to the data type of the single-mode data, which is not limited in the present invention. In the embodiment of the invention, a distance-based clustering algorithm is used for images and audio, and a clustering algorithm based on cosine similarity is used for text and user evaluation. The number of sub-label categories in each mode is chosen as a random value within the preset label category number range [Min_k, Max_k].
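The single-view clustering of step S302 might look like the sketch below. It is not the algorithm of the embodiment, which is left open in the text: a small k-means-style loop with a pluggable distance is assumed, with Euclidean distance standing in for the distance-based case and cosine distance for the cosine-similarity case. The helper names, the farthest-point initialisation and the use of a plain mean as the cluster centre under cosine distance are all simplifying assumptions.

```python
import math
import random


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def pick_sub_label_count(min_k, max_k, rng):
    """Random sub-label category number within [min_k, max_k]."""
    return rng.randint(min_k, max_k)


def cluster(points, k, dist, max_iter=50):
    """Assign each point to the nearest of k centres under `dist`."""
    # Farthest-point initialisation keeps the sketch deterministic.
    centres = [list(points[0])]
    while len(centres) < k:
        far = max(points, key=lambda p: min(dist(p, c) for c in centres))
        centres.append(list(far))
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda j: dist(p, centres[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centres[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels
```

For example, `cluster(image_vectors, k_img, euclidean)` and `cluster(text_vectors, k_txt, cosine_distance)` would produce the sub-labels of an image mode and a text mode, where `k_img` and `k_txt` are drawn with `pick_sub_label_count`.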
Step S303: clustering the current sub-labels corresponding to the mode data based on the current preset label category number to obtain the current total label of the mixed mode data.
Specifically, with the sub-labels L_1 to L_4 of the respective modes as input, a multi-view clustering algorithm is used to obtain the overall total label L, where the number of categories of the total label L is the current preset label category number k. In the embodiment of the invention, the total label L is obtained by a multi-view clustering algorithm based on a co-occurrence matrix; in practical application, the multi-view clustering algorithm can be selected from the prior art according to actual needs, and the invention is not limited in this respect.
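One possible reading of a co-occurrence-matrix approach is sketched below: the matrix entry for samples i and j is the fraction of modes in which the two samples share a sub-label, and the total label is obtained by greedily merging the most similar groups (average linkage) until k groups remain. The merging strategy and the function names are assumptions for illustration; the text does not specify how the co-occurrence matrix is turned into the total label.

```python
def co_occurrence(sub_labels):
    """Fraction of modes in which each pair of samples shares a sub-label.

    `sub_labels` is a list with one label list per mode, aligned by sample.
    """
    n = len(sub_labels[0])
    m = len(sub_labels)
    return [
        [sum(labs[i] == labs[j] for labs in sub_labels) / m for j in range(n)]
        for i in range(n)
    ]


def total_labels(sub_labels, k):
    """Merge samples by average co-occurrence until k groups remain."""
    co = co_occurrence(sub_labels)
    n = len(co)
    clusters = [[i] for i in range(n)]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                pairs = len(clusters[a]) * len(clusters[b])
                sim = sum(
                    co[i][j] for i in clusters[a] for j in clusters[b]
                ) / pairs
                if best is None or sim > best[0]:
                    best = (sim, a, b)
        _, a, b = best
        # b > a, so popping b does not shift the position of cluster a.
        clusters[a].extend(clusters.pop(b))
    labels = [0] * n
    for index, members in enumerate(clusters):
        for i in members:
            labels[i] = index
    return labels
```

The pairwise search is quadratic in the number of groups at every merge, so this form is only suitable for small illustrative inputs.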
Step S304: respectively calculating the adjusted Rand index (ARI) value of each current sub-label based on the current total label.
Step S305: performing a dimension-reduction update on the feature data corresponding to each current sub-label whose ARI value does not meet the preset adjusted Rand index value, and clustering the updated feature data again to update the current sub-label corresponding to each single-mode data.
Specifically, each of the sub-labels L_1 to L_4 undergoes collaborative learning based on the current total label L. In the specific implementation, the single-mode data whose adjusted Rand index is lower than the average value is subjected to supervised linear dimension reduction based on the total label L to obtain new single-mode data, and cluster analysis is performed again to obtain new sub-labels. This is repeated until a stopping criterion is met, namely that the class relations among the feature data samples no longer change; the sub-labels and total label of that iteration are then taken as the result for the current k. The above steps are then repeated until the total label category number k has traversed [Min_k, Max_k]. Fig. 2 is a schematic diagram of the label generation process.
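The selection in steps S304 and S305 can be illustrated with a short sketch that computes the adjusted Rand index of each sub-label against the total label and returns the modes scoring below the average. The function names and the dictionary layout of `sub_labels` are assumptions, and the supervised dimension reduction itself is not shown.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index of two labelings of the same samples."""
    n = len(labels_a)
    cells = Counter(zip(labels_a, labels_b))
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / comb(n, 2)
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both labelings are one cluster
    return (sum_cells - expected) / (max_index - expected)


def modes_to_refine(total_label, sub_labels):
    """Return the modes whose ARI against the total label is below average."""
    scores = {
        mode: adjusted_rand_index(total_label, labels)
        for mode, labels in sub_labels.items()
    }
    average = sum(scores.values()) / len(scores)
    below = [mode for mode, score in scores.items() if score < average]
    return below, scores
```

In a full loop, the feature data of every mode returned in `below` would be reduced in dimension and re-clustered, and the total label recomputed, until the assignments stop changing.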
Specifically, in one embodiment, the step S105 specifically includes the following steps:
Step S501: sorting the score values corresponding to the preset label category numbers.
Step S502: determining the preset label category number with the highest score value as the target label category number.
Specifically, the CH coefficient is calculated for the total labels of the iteration result corresponding to each preset label category number, and a line graph is drawn with the CH coefficient as the ordinate and k ∈ [Min_k, Max_k] as the abscissa. Using the elbow method, the k value at the inflection point of the graph is selected, and the sub-labels and total label obtained for that k are the final result.
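The text describes two ways of reading the score curve: taking the k with the highest score (step S502) and taking the k at the inflection point (elbow method). The sketch below shows both, assuming `scores` maps each candidate k to its CH coefficient. Locating the elbow by the largest absolute second difference is one common heuristic chosen here for illustration, not a rule stated in the text.

```python
def best_k_by_max(scores):
    """Step S502 as written: the k with the highest score."""
    return max(scores, key=scores.get)


def best_k_by_elbow(scores):
    """The k where the score curve bends most (largest second difference)."""
    ks = sorted(scores)
    if len(ks) < 3:
        return best_k_by_max(scores)

    def bend(i):
        return abs(scores[ks[i - 1]] - 2 * scores[ks[i]] + scores[ks[i + 1]])

    i = max(range(1, len(ks) - 1), key=bend)
    return ks[i]
```

With the made-up scores `{5: 10.0, 6: 30.0, 7: 32.0, 8: 33.0}`, the maximum rule picks k = 8 while the elbow rule picks k = 6, where the curve flattens. The two rules can therefore disagree, and which one applies is a design choice.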
Step S503: adding to each piece of data its total label and the sub-label corresponding to each mode data, and establishing a multi-mode database of the mixed mode data.
Specifically, the multi-mode database is established by storing the mixed mode data together with the total label of each piece of data and the sub-label of each mode of that piece of data; fig. 3 is a schematic diagram of the multi-mode database established according to the embodiment of the present invention. Setting a range of preset label category numbers and using collaborative learning between the sub-labels and the total label improves the accuracy with which data in the multi-modal database is assigned sub-labels and total labels, and thus improves the accuracy of the search results when label-based searches are run against the established database.
In practical application, assume that the established multi-modal database comprises tens of thousands of pieces of data, each with a corresponding image mode and text mode. Assume that the total label of the database points to 6 different people, i.e. the category number is 6. For the image data as a single mode, the sub-labels can point to 7 types of expression, i.e. the category number is 7; for the text data as a single mode, the sub-labels can be 8 types of subject. In this database, a piece of data may therefore have the total label 'Li Ming', the image-mode sub-label 'happy', and the text-mode sub-label 'play'. The labels in the multi-modal database established by the embodiment of the invention thus include both a global total label and a local sub-label for each mode, so that data carries two-level labels, which facilitates two-level retrieval. The number of sub-label categories of the single-mode data is not fixed, which better matches practical situations. In addition, the multi-modal database established by the embodiment of the invention has good generality: it can be used for data of various types, such as text, images and video.
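The record layout implied by this example can be sketched with a small data class. The class name `Record`, its fields and the in-memory list used as the database are hypothetical; the text does not prescribe a storage format.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """One piece of data with a global total label and per-mode sub-labels."""
    record_id: int
    total_label: str
    sub_labels: dict
    payload: dict = field(default_factory=dict)


def build_database(payloads, total_labels, sub_labels):
    """Attach the total label and each mode's sub-label to every record.

    `sub_labels` maps a mode name to a list of labels aligned with `payloads`.
    """
    database = []
    for i, payload in enumerate(payloads):
        per_mode = {mode: labels[i] for mode, labels in sub_labels.items()}
        database.append(Record(i, total_labels[i], per_mode, payload))
    return database
```

For the example above, a record would carry the total label 'Li Ming' together with the sub-labels `{"image": "happy", "text": "play"}`.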
The embodiment of the invention also provides a multi-mode database searching method, as shown in fig. 4, which comprises the following steps:
Step S1: acquiring a search tag set, wherein the search tag set comprises a plurality of search tags.
Specifically, the search tag set is composed of one primary tag (corresponding to the total tag of the data in the multi-mode database) and a plurality of secondary tags (corresponding to the sub-tags in the multi-mode database).
Step S2: based on the search tag set, searching is carried out in the multi-modal database to obtain a search result corresponding to the search tag set, wherein the multi-modal database is a multi-modal database established by the multi-modal database establishing method provided by the other embodiment of the invention.
Specifically, in the multi-mode database established by the other embodiment of the present invention, each piece of data carries a total tag and sub-tags, so a two-level search using the search tag set retrieves from the multi-mode database the data matching the primary tag and each secondary tag, which improves the user experience.
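A two-level search of this kind might be sketched as follows, assuming each record is a dictionary with hypothetical `total_label` and `sub_labels` keys and that the secondary tags are given per mode. The function name `search` and the exact-match semantics are illustrative assumptions.

```python
def search(database, primary, secondary=None):
    """Return records matching the primary tag and every secondary tag.

    `secondary` maps a mode name to the sub-label required in that mode.
    """
    secondary = secondary or {}
    hits = []
    for record in database:
        # First level: filter on the global total label.
        if record["total_label"] != primary:
            continue
        # Second level: every requested mode must carry the given sub-label.
        if all(
            record["sub_labels"].get(mode) == label
            for mode, label in secondary.items()
        ):
            hits.append(record)
    return hits
```

Calling `search(db, "Li Ming", {"image": "happy"})` would return only the records of that person whose image-mode sub-label is 'happy'.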
By executing the above steps, the multi-mode database retrieval method provided by the embodiment of the invention searches with the search tag set in a multi-mode database that carries both a global total label and a local sub-label for each mode. This facilitates detailed retrieval of the data stored in the database, improves the accuracy of the search results, and broadens the application range of the multi-mode database.
The embodiment of the invention also provides a system for establishing the multi-mode database, as shown in fig. 5, which comprises:
The first obtaining module 101 is configured to obtain mixed mode data, where the mixed mode data includes multi-mode data with different data types. For details, see the description of step S101 in the above method embodiment.
The first processing module 102 is configured to extract data features corresponding to each mode data from the mixed mode data, and construct a mixed mode feature set. For details, see the description of step S102 in the above method embodiment.
The second processing module 103 is configured to cluster the feature data in the mixed mode feature set according to a preset label category number range, so as to obtain sub-labels corresponding to each mode data under different preset label category numbers and total labels corresponding to the mixed mode data. For details, see the description of step S103 in the above method embodiment.
The third processing module 104 is configured to calculate score values of preset clustering evaluation indexes corresponding to the total labels under different preset label category numbers respectively. For details, see the description of step S104 in the above method embodiment.
The fourth processing module 105 is configured to determine the target label category number based on the score values, and establish a multi-modal database according to the sub-labels corresponding to each mode data and the total labels corresponding to the mixed mode data under the target label category number. For details, see the description of step S105 in the above method embodiment.
Through the cooperation of the above components, the multi-modal database establishing system provided by the embodiment of the invention extracts data features from the mode data of each data type and, for each preset label category number within the preset range, clusters the features to construct the sub-labels corresponding to each mode data and the total labels of the mixed mode data. It then calculates the score value of the preset clustering evaluation index under each preset label category number and uses the score values to determine the final sub-labels and total labels. The database thus carries both a global total label and a local sub-label under each modality, so that the data can be given second-level labels, the accuracy of detailed retrieval is improved, and a database comprising various different data types can be established.
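The loop formed by modules 103 to 105 (cluster under each preset label category number, score each result, keep the best-scoring number) can be sketched as below. This is a simplified illustration under stated assumptions, not the claimed implementation: the features are reduced to a single list of 1-D values rather than per-modality feature sets, the gap-based clusterer `cluster_1d` is a toy stand-in for the clustering step, and the silhouette coefficient is used as the evaluation index only because this passage does not name one. All function names are hypothetical.

```python
from statistics import mean


def cluster_1d(values, k):
    """Toy stand-in clusterer: cut the sorted values at the k-1 largest gaps."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    gap_positions = sorted(
        range(len(order) - 1),
        key=lambda p: values[order[p + 1]] - values[order[p]],
        reverse=True,
    )
    cuts = set(gap_positions[: k - 1])
    labels = [0] * len(values)
    current = 0
    for pos, idx in enumerate(order):
        labels[idx] = current
        if pos in cuts:          # a cut after this position starts a new cluster
            current += 1
    return labels


def silhouette(values, labels):
    """Mean silhouette coefficient (needs >= 2 clusters); singletons score 0."""
    clusters = {}
    for v, lab in zip(values, labels):
        clusters.setdefault(lab, []).append(v)
    scores = []
    for v, lab in zip(values, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)
            continue
        a = sum(abs(v - u) for u in own) / (len(own) - 1)   # mean intra-cluster distance
        b = min(mean(abs(v - u) for u in other)             # nearest other cluster
                for other_lab, other in clusters.items() if other_lab != lab)
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return mean(scores)


def select_target_category_count(values, k_range):
    """Cluster for every candidate k, score each result, return the best one."""
    results = {}
    for k in k_range:
        labels = cluster_1d(values, k)
        results[k] = (silhouette(values, labels), labels)
    best_k = max(results, key=lambda k: results[k][0])
    return best_k, results[best_k][1], {k: s for k, (s, _) in results.items()}


values = [1.0, 1.1, 1.2, 5.0, 5.1, 9.0, 9.2]   # hypothetical feature values
best_k, labels, scores = select_target_category_count(values, range(2, 5))
print(best_k, labels)
```

On this toy input the three visibly separated groups give the highest score, so the candidate number 3 is selected; with real multi-modal features, the clustering and scoring functions would be replaced by whatever the implementation actually uses.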
The embodiment of the invention also provides a multi-modal database retrieval system, as shown in fig. 6, which comprises:
The second obtaining module 1 is configured to obtain a search tag set, where the search tag set includes a plurality of search tags. For details, see the description of step S1 in the above method embodiment.
The fifth processing module 2 is configured to search the multi-modal database based on the search tag set to obtain a search result corresponding to the search tag set, where the multi-modal database is a multi-modal database built by the multi-modal database establishing system provided by another embodiment of the present invention. For details, see the description of step S2 in the above method embodiment.
Through the cooperation of the above components, the multi-modal database retrieval system provided by the embodiment of the invention searches with the search tag set in a multi-modal database that carries both a global total label and a local sub-label under each modality. This facilitates detailed retrieval of the data stored in the database, improves the accuracy of the search results, and broadens the application range of the multi-modal database.
There is also provided in accordance with an embodiment of the present invention an electronic device, as shown in fig. 7, which may include a processor 901 and a memory 902, wherein the processor 901 and the memory 902 may be connected via a bus or otherwise, as exemplified by the bus connection in fig. 7.
The processor 901 may be a central processing unit (Central Processing Unit, CPU). The processor 901 may also be other general purpose processors, digital signal processors (Digital Signal Processor, DSP), application specific integrated circuits (Application Specific Integrated Circuit, ASIC), field programmable gate arrays (Field-Programmable Gate Array, FPGA) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, or a combination thereof.
The memory 902, as a non-transitory computer readable storage medium, may be used to store non-transitory software programs, non-transitory computer executable programs, and modules, such as the program instructions/modules corresponding to the methods in the method embodiments of the present invention. By running the non-transitory software programs, instructions, and modules stored in the memory 902, the processor 901 performs its various functional applications and data processing, i.e., implements the methods in the above-described method embodiments.
The memory 902 may include a storage program area and a storage data area, wherein the storage program area may store an operating system, at least one application program required for a function; the storage data area may store data created by the processor 901, and the like. In addition, the memory 902 may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, memory 902 optionally includes memory remotely located relative to processor 901, which may be connected to processor 901 via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
One or more modules are stored in the memory 902 that, when executed by the processor 901, perform the methods of the method embodiments described above.
The specific details of the electronic device may be correspondingly understood by referring to the corresponding related descriptions and effects in the above method embodiments, which are not repeated herein.
It will be appreciated by those skilled in the art that all or part of the processes of the above-described method embodiments may be implemented by a computer program instructing related hardware; the program may be stored in a computer readable storage medium and, when executed, may include the processes of the above-described method embodiments. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), a flash memory, a hard disk drive (HDD), or a solid-state drive (SSD); the storage medium may also comprise a combination of memories of the kinds described above.
Although embodiments of the present invention have been described in connection with the accompanying drawings, various modifications and variations may be made by those skilled in the art without departing from the spirit and scope of the invention, and such modifications and variations are within the scope of the invention as defined by the appended claims.

Claims (9)

1. A method for building a multimodal database, comprising:
acquiring mixed mode data, wherein the mixed mode data comprises multi-mode data with different data types;
extracting data features corresponding to each mode data from the mixed mode data, and constructing a mixed mode feature set;
clustering the feature data in the mixed mode feature set according to the preset tag class number range to obtain sub tags corresponding to the mode data under different preset tag class numbers and total tags corresponding to the mixed mode data;
respectively calculating scoring values of preset clustering evaluation indexes corresponding to the total tags under different preset tag category numbers;
determining the number of target label categories based on the scoring values, and establishing a multi-mode database according to sub-labels corresponding to the modal data under the number of target label categories and total labels corresponding to the mixed modal data;
the step of extracting data features corresponding to each mode data from the mixed mode data to construct a mixed mode feature set comprises the following steps:
classifying the mixed mode data based on data types to obtain each single mode data;
acquiring preset feature extraction parameters of each single-mode data, and carrying out feature extraction on each single-mode data according to the preset feature extraction parameters to obtain feature data corresponding to each single-mode data;
and constructing the mixed mode feature set based on feature data corresponding to each single mode data.
2. The method of claim 1, wherein clustering the feature data in the mixed mode feature set according to the preset tag class number range to obtain sub-tags corresponding to each mode data under different preset tag class numbers and total tags corresponding to the mixed mode data, comprises:
acquiring the number of the current preset label categories;
clustering the characteristic data corresponding to each single-mode data to obtain the current sub-label corresponding to each single-mode data;
clustering the current sub-labels corresponding to the modal data based on the current preset label class number to obtain the current total label of the mixed modal data.
3. The method of claim 2, wherein clustering the feature data in the mixed mode feature set according to the preset tag class number range to obtain sub-tags corresponding to each mode data and total tags corresponding to the mixed mode data under different preset tag class numbers, further comprises:
calculating the adjusted Rand index value of each current sub-label based on the current total label;
and performing dimension-reduction updating on the feature data corresponding to each current sub-label that does not meet the preset adjusted Rand index value, and clustering the updated feature data again to update the current sub-label corresponding to each single-mode data.
4. The method according to claim 1, wherein determining the number of target tag categories based on the scoring value, and establishing a multi-modal database according to sub-tags corresponding to each modal data under the number of target tag categories and a total tag corresponding to the mixed modal data, comprises:
sorting scoring values corresponding to the number of the preset label categories;
determining the preset label category number with the maximum scoring value as the target label category number;
and respectively adding a total label and a sub-label corresponding to each mode data, and establishing a multi-mode database of the mixed mode data.
5. A method for multi-modal database retrieval, comprising:
acquiring a search tag set, wherein the search tag set comprises a plurality of search tags;
based on the search tag set, searching is carried out in the multi-modal database to obtain a search result corresponding to the search tag set, wherein the multi-modal database is a multi-modal database established by adopting the multi-modal database establishing method according to any one of claims 1-4.
6. A multi-modal database creation system, comprising:
the first acquisition module is used for acquiring mixed mode data, wherein the mixed mode data comprise multi-mode data with different data types;
the first processing module is used for extracting data features corresponding to each mode data from the mixed mode data and constructing a mixed mode feature set; the step of extracting data features corresponding to each mode data from the mixed mode data to construct a mixed mode feature set comprises the following steps: classifying the mixed mode data based on data types to obtain each single mode data; acquiring preset feature extraction parameters of each single-mode data, and carrying out feature extraction on each single-mode data according to the preset feature extraction parameters to obtain feature data corresponding to each single-mode data; constructing the mixed mode feature set based on feature data corresponding to each single mode data;
the second processing module is used for clustering the characteristic data in the mixed mode characteristic set according to the preset label category number range to obtain sub-labels corresponding to the mode data under different preset label category numbers and total labels corresponding to the mixed mode data;
the third processing module is used for respectively calculating the scoring values of the preset clustering evaluation indexes corresponding to the total labels under different preset label category numbers;
and the fourth processing module is used for determining the number of target label categories based on the scoring values, and establishing a multi-mode database according to the sub-labels corresponding to the modal data under the number of target label categories and the total labels corresponding to the mixed modal data.
7. A multimodal database retrieval system, comprising:
the second acquisition module is used for acquiring a search tag set, wherein the search tag set comprises a plurality of search tags;
and a fifth processing module, configured to perform searching in the multi-modal database based on the search tag set, to obtain a search result corresponding to the search tag set, where the multi-modal database is a multi-modal database built by using the multi-modal database building system according to claim 6.
8. An electronic device, comprising:
a memory and a processor in communication with each other, the memory having stored therein computer instructions which, upon execution, perform the method of any one of claims 1-4 or the method of claim 5.
9. A computer readable storage medium having stored thereon computer instructions for causing the computer to perform the method of any one of claims 1-4 or to perform the method of claim 5.
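
By way of illustration only, the coefficient check in claim 3 can be sketched with the standard adjusted Rand index (ARI), which measures the agreement between two labelings of the same items, corrected for chance. The function names, the sample labelings and the threshold value 0.5 are assumptions introduced for this sketch; the claims do not fix a threshold or an implementation.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two labelings of the same items."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must have the same length")
    total_pairs = comb(len(labels_a), 2)
    if total_pairs == 0:
        return 1.0
    # Pairs of items placed together in both labelings, in A only, in B only.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = together_a * together_b / total_pairs
    maximum = (together_a + together_b) / 2
    if maximum == expected:      # degenerate case, e.g. a single cluster in both
        return 1.0
    return (together_both - expected) / (maximum - expected)


def modalities_to_refine(total_labels, sub_labels_by_modality, threshold):
    """Return the modalities whose sub-labels fall below the ARI threshold."""
    return [
        modality
        for modality, sub_labels in sub_labels_by_modality.items()
        if adjusted_rand_index(total_labels, sub_labels) < threshold
    ]


# Hypothetical labelings for four pieces of data.
total_labels = [0, 0, 1, 1]
sub_labels_by_modality = {
    "image": [0, 0, 1, 1],   # agrees with the total labels
    "text": [0, 1, 0, 1],    # disagrees with the total labels
}
flagged = modalities_to_refine(total_labels, sub_labels_by_modality, threshold=0.5)
print(flagged)
```

In the flow of claim 3, the feature data of each flagged modality would then be reduced in dimension and clustered again to update that modality's sub-labels; that refinement step is not shown here.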
CN202011542924.6A 2020-12-23 2020-12-23 Multi-mode database establishment method, retrieval method and system Active CN112579841B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011542924.6A CN112579841B (en) 2020-12-23 2020-12-23 Multi-mode database establishment method, retrieval method and system

Publications (2)

Publication Number Publication Date
CN112579841A CN112579841A (en) 2021-03-30
CN112579841B true CN112579841B (en) 2024-01-05

Family

ID=75139237

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011542924.6A Active CN112579841B (en) 2020-12-23 2020-12-23 Multi-mode database establishment method, retrieval method and system

Country Status (1)

Country Link
CN (1) CN112579841B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106202281A (en) * 2016-06-28 2016-12-07 广东工业大学 A kind of multi-modal data represents learning method and system
CN110597878A (en) * 2019-09-16 2019-12-20 广东工业大学 Cross-modal retrieval method, device, equipment and medium for multi-modal data
CN110990596A (en) * 2019-12-04 2020-04-10 山东师范大学 Multi-mode hash retrieval method and system based on self-adaptive quantization
CN112001438A (en) * 2020-08-19 2020-11-27 四川大学 Multi-mode data clustering method for automatically selecting clustering number
CN112015923A (en) * 2020-09-04 2020-12-01 平安科技(深圳)有限公司 Multi-mode data retrieval method, system, terminal and storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170038769A1 (en) * 2015-07-27 2017-02-09 Shohei Hidaka Method and system of dimensional clustering
US11204966B2 (en) * 2019-02-01 2021-12-21 EMC IP Holding Company LLC Contextual image-assisted search

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
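Research Progress in Multimodal Emotion Recognition (多模态情感识别研究进展); He Jun (何俊), Liu Yue (刘跃), He Zhongwen (何忠文); Application Research of Computers (《计算机应用研究》); Vol. 35 (No. 11); main text *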

Also Published As

Publication number Publication date
CN112579841A (en) 2021-03-30

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant