CN110889340A - Visual question-answering model based on iterative attention mechanism - Google Patents

Visual question-answering model based on iterative attention mechanism Download PDF

Info

Publication number
CN110889340A
CN110889340A
Authority
CN
China
Prior art keywords
attention
question
iterative
follows
attention mechanism
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911099046.2A
Other languages
Chinese (zh)
Inventor
颜丙旭
刘杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin Engineering University
Original Assignee
Harbin Engineering University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin Engineering University
Priority to CN201911099046.2A
Publication of CN110889340A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 Document-oriented image-based pattern recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/217 Validation; Performance evaluation; Active pattern learning techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a visual question-answering model based on an iterative attention mechanism, comprising three steps: step S1, constructing a dual attention mechanism; step S2, iterating the internal structure of the model; step S3, predicting answers. The model uses VGGNet to extract image features and a bidirectional LSTM to encode the question. Taking the image feature vector and the question feature vector as input, an attention mechanism is first applied to each of the two vectors; after calculation, two attention feature vectors are obtained and then fused to produce new image and question feature vectors. This attention-and-fusion step is performed iteratively to reduce the granularity of the attended region, yielding the final image and question feature vectors, from which the answer distribution is predicted. The beneficial effects of the invention are: attention is placed on the question, the attended region is precise, and the predicted answer is accurate.

Description

Visual question-answering model based on iterative attention mechanism
Technical Field
The invention relates to the technical field of computer vision, in particular to a visual question-answering model based on an iterative attention mechanism.
Background
A key issue in Visual Question Answering (VQA) is how to extract and fuse the visual and linguistic features of the input image and question. The general framework of existing methods extracts visual and linguistic features separately from the image and question in an initial step, then fuses them in a later step for computation and prediction. In early studies, researchers employed simple fusion methods such as concatenation, summation, or multiplication of the visual and linguistic features, the result of which was fed into fully-connected layers to predict the answer.
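For concreteness, the sketch below illustrates this early fusion style. It is hypothetical PyTorch code, not taken from the patent; the 4096-dimensional global image feature, vocabulary size, hidden width, and answer count are all placeholder assumptions.

```python
import torch
import torch.nn as nn

class SimpleFusionVQA(nn.Module):
    """Early-style VQA baseline: concatenate a global CNN image feature
    with an LSTM question encoding, then classify with fully-connected layers."""
    def __init__(self, img_dim=4096, vocab=10000, emb=300, hid=512, n_answers=1000):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.lstm = nn.LSTM(emb, hid, batch_first=True)
        self.classifier = nn.Sequential(
            nn.Linear(img_dim + hid, hid), nn.ReLU(),
            nn.Linear(hid, n_answers))

    def forward(self, img_feat, question_ids):
        # img_feat: (B, img_dim) global image feature; question_ids: (B, N)
        _, (h, _) = self.lstm(self.embed(question_ids))
        q = h[-1]  # (B, hid): last hidden state summarizes the question
        return self.classifier(torch.cat([img_feat, q], dim=1))  # answer scores
```

Note that the whole image is collapsed into one vector here, which is exactly why such models struggle with answers tied to fine-grained regions.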
To date, VQA models in the literature all focus attention on the visual side, not on the question. Consider the questions "How many cats are in this image?" and "How many cats can you see in this image?": they have the same meaning, and the answer to both is essentially determined by the phrase "how many cats". This shows that a model attending to "how many cats" is more robust than a model attending to words irrelevant to the answer.
Furthermore, most recently proposed visual question-answering models are based on neural networks. One common method is to extract a global image feature vector with a convolutional neural network (CNN), encode the corresponding question into a feature vector with a long short-term memory network (LSTM), then process the two and predict the answer. Although these methods have achieved good results, such models often fail to give accurate answers when the answer depends on some fine-grained region of the image.
The above disadvantages can be summarized in three points:
① the focus of existing attention models is on the vision, not on the question;
② when the attention mechanism is applied, the attended region is not precise, especially for fine-grained regions;
③ as a consequence of ① and ②, the predicted answers are inaccurate.
Therefore, the prior art needs a visual question-answering model based on an iterative attention mechanism that places its attention on the question, attends to precise regions, and predicts accurate answers.
Disclosure of Invention
The present invention is directed to a visual question-answering model based on an iterative attention mechanism, so as to solve the above-mentioned problems in the background art.
In order to achieve the purpose, the invention provides the following technical scheme:
a visual question-answering model based on an iterative attention mechanism comprises the following steps:
step S1: constructing a dual attention mechanism;
step S2: iterating the internal structure of the model, i.e., the method of fusing the image and question features at each step;
step S3: predicting answers.
As a further scheme of the invention: the step S1 includes:
First, image features are extracted with VGGNet to obtain $V_l \in \mathbb{R}^{d \times T}$ ($T$ image regions), and, with the iterative model to be used later in mind, the question is encoded with a Bi-LSTM to obtain $Q_l \in \mathbb{R}^{d \times N}$ ($N$ question words); two attention maps are then created from $Q_l$ and $V_l$, calculated as follows:

$$A_{Q_l} = \operatorname{softmax}_{\text{row}}\!\left(Q_l^{\top} W_l V_l\right), \qquad A_{V_l} = \operatorname{softmax}_{\text{row}}\!\left(V_l^{\top} W_l^{\top} Q_l\right)$$

where $W_l \in \mathbb{R}^{d \times d}$ is a learned weight matrix. Each row of $A_{Q_l}$ and $A_{V_l}$ above contains a single attention distribution: row $n$ of $A_{Q_l}$ attends over the $T$ image regions for the $n$-th question word, and row $t$ of $A_{V_l}$ attends over the $N$ question words for the $t$-th image region.

In practice, the $d$-dimensional feature vectors $q_{ln}$ and $v_{lt}$ are projected into several low-dimensional spaces; let $h$ be the number of low-dimensional spaces and $d_h\,(\equiv d/h)$ the feature-vector dimension in each, with $P_{Q_l}^{(i)}, P_{V_l}^{(i)} \in \mathbb{R}^{d_h \times d}$ representing the linear projections. The projected feature matrices of the $i$-th space are:

$$Q_l^{(i)} = P_{Q_l}^{(i)} Q_l, \qquad V_l^{(i)} = P_{V_l}^{(i)} V_l$$

An attention map is created in each space by normalizing by column and by row with the softmax function, as follows:

$$A_{Q_l}^{(i)} = \operatorname{softmax}_{\text{row}}\!\left((Q_l^{(i)})^{\top} V_l^{(i)}\right), \qquad A_{V_l}^{(i)} = \operatorname{softmax}_{\text{row}}\!\left((V_l^{(i)})^{\top} Q_l^{(i)}\right)$$

Because multiplicative (dot-product) attention is used, the average fusion of the features over the $h$ spaces is equivalent to averaging the attention maps, so:

$$A_{Q_l} = \frac{1}{h} \sum_{i=1}^{h} A_{Q_l}^{(i)}, \qquad A_{V_l} = \frac{1}{h} \sum_{i=1}^{h} A_{V_l}^{(i)}$$

Product attention is then used to obtain the attended representation $\hat{Q}_l$ of the image conditioned on each question word and the attended representation $\hat{V}_l$ of the question conditioned on each image region; the formulas are as follows:

$$\hat{Q}_l = V_l A_{Q_l}^{\top}, \qquad \hat{V}_l = Q_l A_{V_l}^{\top}$$

The size of $\hat{V}_l$ above is the same as that of $V_l$, i.e., $d \times T$, and the size of $\hat{Q}_l$ is the same as that of $Q_l$, i.e., $d \times N$.
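The following PyTorch sketch illustrates the dual attention computation as reconstructed above; it is an assumed implementation, not the patent's reference code. Packing the $h$ projections into one linear layer, the $1/\sqrt{d_h}$ scaling, and the defaults $d = 512$, $h = 8$ are choices borrowed from standard dot-product attention.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DualAttention(nn.Module):
    """Step S1: dual attention between question features Q (B, N, d) and
    image features V (B, T, d). Features are projected into h low-dim
    spaces of size d_h = d/h, an attention map is built in each space by
    row-wise softmax, the h maps are averaged, and product attention
    yields the attended representations Q_hat and V_hat."""
    def __init__(self, d=512, h=8):
        super().__init__()
        assert d % h == 0
        self.h, self.d_h = h, d // h
        # The h projection matrices P^(i) are packed into one linear layer each.
        self.proj_q = nn.Linear(d, d, bias=False)
        self.proj_v = nn.Linear(d, d, bias=False)

    def forward(self, Q, V):
        B, N, _ = Q.shape
        T = V.shape[1]
        # (B, h, N, d_h) and (B, h, T, d_h): features in the h low-dim spaces.
        Qp = self.proj_q(Q).view(B, N, self.h, self.d_h).transpose(1, 2)
        Vp = self.proj_v(V).view(B, T, self.h, self.d_h).transpose(1, 2)
        # Dot-product affinities per space; the 1/sqrt(d_h) scale is assumed.
        scores = Qp @ Vp.transpose(-1, -2) / self.d_h ** 0.5   # (B, h, N, T)
        # One attention map per space, then average the maps over the h spaces.
        A_Q = F.softmax(scores, dim=-1).mean(dim=1)                    # (B, N, T)
        A_V = F.softmax(scores.transpose(-1, -2), dim=-1).mean(dim=1)  # (B, T, N)
        Q_hat = A_Q @ V   # (B, N, d): image summary attended per question word
        V_hat = A_V @ Q   # (B, T, d): question summary attended per image region
        return Q_hat, V_hat
```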
As a further scheme of the invention: the step S2 includes:
After the feature representations $\hat{Q}_l$ and $\hat{V}_l$ are computed, the $n$-th column of the matrix $\hat{Q}_l$ stores a representation of the entire image related to the $n$-th question word, i.e., the attention feature vector for the $n$-th word. The $n$-th column vector $\hat{q}_{ln}$ is then concatenated with the $n$-th question word vector $q_{ln}$, forming a $2d$-dimensional vector $[q_{ln}; \hat{q}_{ln}]$, which is projected back into $d$-dimensional space through a single-layer network with a ReLU activation function and a residual connection; the formula is as follows:

$$q_{(l+1)n} = \operatorname{ReLU}\!\left(W_{Q_l} [q_{ln}; \hat{q}_{ln}] + b_{Q_l}\right) + q_{ln}$$

where $W_{Q_l} \in \mathbb{R}^{d \times 2d}$ and $b_{Q_l} \in \mathbb{R}^{d}$ are the learned weights and bias term; when all $N$ words ($n = 1, \ldots, N$) have taken part in the calculation, the result is $Q_{l+1}$.

Similarly, the representation $v_{lt}$ of the $t$-th image area is concatenated with the representation $\hat{v}_{lt}$ of the entire question related to the $t$-th image area and projected into $d$-dimensional space; the formula is as follows:

$$v_{(l+1)t} = \operatorname{ReLU}\!\left(W_{V_l} [v_{lt}; \hat{v}_{lt}] + b_{V_l}\right) + v_{lt}$$

where $W_{V_l} \in \mathbb{R}^{d \times 2d}$ and $b_{V_l} \in \mathbb{R}^{d}$ are the learned weights and bias term; when all $T$ areas ($t = 1, \ldots, T$) have taken part in the calculation, the result is $V_{l+1}$.
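A minimal sketch of this fusion step, under the same assumptions as the attention sketch above (batch-first tensors, $d = 512$): each feature is concatenated with its attended counterpart, projected back to $d$ dimensions by a single linear layer with ReLU, and added residually.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FusionLayer(nn.Module):
    """Step S2: concatenate each feature vector with its attended
    counterpart, project the 2d-dimensional result back to d dimensions
    through a single layer with ReLU, and add a residual connection."""
    def __init__(self, d=512):
        super().__init__()
        self.fuse_q = nn.Linear(2 * d, d)   # W_{Q_l}, b_{Q_l}
        self.fuse_v = nn.Linear(2 * d, d)   # W_{V_l}, b_{V_l}

    def forward(self, Q, V, Q_hat, V_hat):
        # Q, Q_hat: (B, N, d); V, V_hat: (B, T, d)
        Q_next = F.relu(self.fuse_q(torch.cat([Q, Q_hat], dim=-1))) + Q
        V_next = F.relu(self.fuse_v(torch.cat([V, V_hat], dim=-1))) + V
        return Q_next, V_next   # Q_{l+1}, V_{l+1}
```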
As a further scheme of the invention: the step S3 includes:
The invention uses the last outputs $Q_L$ and $V_L$ of the iterative model to predict the answer distribution. Since they contain representations of $N$ question words and $T$ image regions respectively, a self-attention mechanism is first applied to each of them to obtain aggregated representations of the entire question and image. For $Q_L$, the operation is as follows:

the scores $s_{Q_L,1}, \ldots, s_{Q_L,N}$ of $q_{L1}, \ldots, q_{LN}$ are calculated by applying a two-layer MLP;

the scores are normalized with softmax to obtain the weights $a_{Q_1}, \ldots, a_{Q_N}$;

the weights are used to calculate the aggregated representation $s_Q = \sum_{n=1}^{N} a_{Q_n} q_{Ln}$.

The same method is used on $V_L$ to obtain the weights $a_{V_1}, \ldots, a_{V_T}$ and the aggregated representation $s_V = \sum_{t=1}^{T} a_{V_t} v_{Lt}$.

The scores of the predefined answers are then calculated with an MLP, a widely used method in recent studies; the formula is as follows:

$$p(a \mid Q, V) = \operatorname{softmax}\!\left(\operatorname{MLP}([s_Q; s_V])\right)$$
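The answer-prediction step can be sketched as follows, again as an assumed implementation rather than the patent's own code; the hidden widths of the two-layer MLPs and the number of candidate answers are placeholders.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AnswerPredictor(nn.Module):
    """Step S3: two-layer MLPs score each word/region, softmax turns the
    scores into self-attention weights, the weighted sums aggregate Q_L
    and V_L, and an MLP over the concatenated summaries scores answers."""
    def __init__(self, d=512, n_answers=1000):
        super().__init__()
        self.score_q = nn.Sequential(nn.Linear(d, d), nn.ReLU(), nn.Linear(d, 1))
        self.score_v = nn.Sequential(nn.Linear(d, d), nn.ReLU(), nn.Linear(d, 1))
        self.mlp = nn.Sequential(nn.Linear(2 * d, d), nn.ReLU(), nn.Linear(d, n_answers))

    def forward(self, Q_L, V_L):
        # Q_L: (B, N, d), V_L: (B, T, d)
        a_q = F.softmax(self.score_q(Q_L), dim=1)   # (B, N, 1): weights over words
        a_v = F.softmax(self.score_v(V_L), dim=1)   # (B, T, 1): weights over regions
        s_q = (a_q * Q_L).sum(dim=1)                # (B, d): question summary
        s_v = (a_v * V_L).sum(dim=1)                # (B, d): image summary
        return self.mlp(torch.cat([s_q, s_v], dim=-1))  # answer logits
```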
compared with the prior art, the invention has the beneficial effects that:
Existing visual question-answering models neither apply an attention mechanism to the question words to eliminate the interference of irrelevant words, nor attend to precise regions when an attention mechanism is used. To address this, the invention constructs a dual attention mechanism and an iterative model that apply attention to the question and reduce the granularity of the attended region. The specific idea is to generate an attention feature vector over the image regions for each question word, and an attention feature vector over the question words for each image region; the model then computes the attention feature vectors, concatenates the multimodal representations, and transforms them through a single-layer network with ReLU and a residual connection. This computation is packaged into an iterative attention model that considers the interaction between all image regions and all question words; stacked iteratively, it forms a hierarchical structure that realizes multi-step interaction between the image and the question, reduces the granularity of the attended region, locates the attended regions and words more accurately, and then predicts the answer. Experiments show that the model improves the accuracy of the predicted answers.
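Putting the pieces together, the sketch below continues the hypothetical classes DualAttention, FusionLayer, and AnswerPredictor from the earlier sketches (they are assumed to be in scope; L = 3 iterations is an assumed setting) and shows how the attention and fusion layers are stacked so that the image and question interact over multiple steps before the answer distribution is predicted.

```python
import torch.nn as nn
import torch.nn.functional as F

class IterativeVQA(nn.Module):
    """Stacks L dual-attention + fusion layers so image and question
    interact over multiple steps, then predicts the answer distribution.
    Feature extraction (VGGNet regions, Bi-LSTM words) is taken as given."""
    def __init__(self, d=512, h=8, L=3, n_answers=1000):
        super().__init__()
        self.attn = nn.ModuleList([DualAttention(d, h) for _ in range(L)])
        self.fuse = nn.ModuleList([FusionLayer(d) for _ in range(L)])
        self.head = AnswerPredictor(d, n_answers)

    def forward(self, Q, V):
        # Q: (B, N, d) Bi-LSTM question features; V: (B, T, d) VGGNet region features
        for attn, fuse in zip(self.attn, self.fuse):
            Q_hat, V_hat = attn(Q, V)       # step S1: dual attention
            Q, V = fuse(Q, V, Q_hat, V_hat)  # step S2: fusion with residual
        return F.softmax(self.head(Q, V), dim=-1)  # step S3: answer distribution
```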
Drawings
FIG. 1 is a step diagram of a visual question-answering model based on an iterative attention mechanism according to the present invention.
FIG. 2 is a flowchart illustrating the effect of the visual question-answering model based on the iterative attention mechanism according to the present invention.
FIG. 3 is a schematic diagram of step S1 of the visual question-answering model based on the iterative attention mechanism according to the present invention.
FIG. 4 is a schematic diagram of step S2 of the visual question-answering model based on the iterative attention mechanism according to the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments; all other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1 to 4, in an embodiment of the present invention, a visual question-answering model based on an iterative attention mechanism includes the following steps:
step S1: constructing a dual attention mechanism;
step S2: iterating the internal structure of the model, i.e., the method of fusing the image and question features at each step;
step S3: predicting answers.
The step S1 includes:
First, image features are extracted with VGGNet to obtain $V_l \in \mathbb{R}^{d \times T}$ ($T$ image regions), and, with the iterative model to be used later in mind, the question is encoded with a Bi-LSTM to obtain $Q_l \in \mathbb{R}^{d \times N}$ ($N$ question words); two attention maps are then created from $Q_l$ and $V_l$, calculated as follows:

$$A_{Q_l} = \operatorname{softmax}_{\text{row}}\!\left(Q_l^{\top} W_l V_l\right), \qquad A_{V_l} = \operatorname{softmax}_{\text{row}}\!\left(V_l^{\top} W_l^{\top} Q_l\right)$$

where $W_l \in \mathbb{R}^{d \times d}$ is a learned weight matrix. Each row of $A_{Q_l}$ and $A_{V_l}$ above contains a single attention distribution: row $n$ of $A_{Q_l}$ attends over the $T$ image regions for the $n$-th question word, and row $t$ of $A_{V_l}$ attends over the $N$ question words for the $t$-th image region.

In practice, the $d$-dimensional feature vectors $q_{ln}$ and $v_{lt}$ are projected into several low-dimensional spaces; let $h$ be the number of low-dimensional spaces and $d_h\,(\equiv d/h)$ the feature-vector dimension in each, with $P_{Q_l}^{(i)}, P_{V_l}^{(i)} \in \mathbb{R}^{d_h \times d}$ representing the linear projections. The projected feature matrices of the $i$-th space are:

$$Q_l^{(i)} = P_{Q_l}^{(i)} Q_l, \qquad V_l^{(i)} = P_{V_l}^{(i)} V_l$$

An attention map is created in each space by normalizing by column and by row with the softmax function, as follows:

$$A_{Q_l}^{(i)} = \operatorname{softmax}_{\text{row}}\!\left((Q_l^{(i)})^{\top} V_l^{(i)}\right), \qquad A_{V_l}^{(i)} = \operatorname{softmax}_{\text{row}}\!\left((V_l^{(i)})^{\top} Q_l^{(i)}\right)$$

Because multiplicative (dot-product) attention is used, the average fusion of the features over the $h$ spaces is equivalent to averaging the attention maps, so:

$$A_{Q_l} = \frac{1}{h} \sum_{i=1}^{h} A_{Q_l}^{(i)}, \qquad A_{V_l} = \frac{1}{h} \sum_{i=1}^{h} A_{V_l}^{(i)}$$

Product attention is then used to obtain the attended representation $\hat{Q}_l$ of the image conditioned on each question word and the attended representation $\hat{V}_l$ of the question conditioned on each image region; the formulas are as follows:

$$\hat{Q}_l = V_l A_{Q_l}^{\top}, \qquad \hat{V}_l = Q_l A_{V_l}^{\top}$$

The size of $\hat{V}_l$ above is the same as that of $V_l$, i.e., $d \times T$, and the size of $\hat{Q}_l$ is the same as that of $Q_l$, i.e., $d \times N$.
The step S2 includes:
After the feature representations $\hat{Q}_l$ and $\hat{V}_l$ are computed, the $n$-th column of the matrix $\hat{Q}_l$ stores a representation of the entire image related to the $n$-th question word, i.e., the attention feature vector for the $n$-th word. The $n$-th column vector $\hat{q}_{ln}$ is then concatenated with the $n$-th question word vector $q_{ln}$, forming a $2d$-dimensional vector $[q_{ln}; \hat{q}_{ln}]$, which is projected back into $d$-dimensional space through a single-layer network with a ReLU activation function and a residual connection; the formula is as follows:

$$q_{(l+1)n} = \operatorname{ReLU}\!\left(W_{Q_l} [q_{ln}; \hat{q}_{ln}] + b_{Q_l}\right) + q_{ln}$$

where $W_{Q_l} \in \mathbb{R}^{d \times 2d}$ and $b_{Q_l} \in \mathbb{R}^{d}$ are the learned weights and bias term; when all $N$ words ($n = 1, \ldots, N$) have taken part in the calculation, the result is $Q_{l+1}$.

Similarly, the representation $v_{lt}$ of the $t$-th image area is concatenated with the representation $\hat{v}_{lt}$ of the entire question related to the $t$-th image area and projected into $d$-dimensional space; the formula is as follows:

$$v_{(l+1)t} = \operatorname{ReLU}\!\left(W_{V_l} [v_{lt}; \hat{v}_{lt}] + b_{V_l}\right) + v_{lt}$$

where $W_{V_l} \in \mathbb{R}^{d \times 2d}$ and $b_{V_l} \in \mathbb{R}^{d}$ are the learned weights and bias term; when all $T$ areas ($t = 1, \ldots, T$) have taken part in the calculation, the result is $V_{l+1}$.
The step S3 includes:
The invention uses the last outputs $Q_L$ and $V_L$ of the iterative model to predict the answer distribution. Since they contain representations of $N$ question words and $T$ image regions respectively, a self-attention mechanism is first applied to each of them to obtain aggregated representations of the entire question and image. For $Q_L$, the operation is as follows:

the scores $s_{Q_L,1}, \ldots, s_{Q_L,N}$ of $q_{L1}, \ldots, q_{LN}$ are calculated by applying a two-layer MLP;

the scores are normalized with softmax to obtain the weights $a_{Q_1}, \ldots, a_{Q_N}$;

the weights are used to calculate the aggregated representation $s_Q = \sum_{n=1}^{N} a_{Q_n} q_{Ln}$.

The same method is used on $V_L$ to obtain the weights $a_{V_1}, \ldots, a_{V_T}$ and the aggregated representation $s_V = \sum_{t=1}^{T} a_{V_t} v_{Lt}$.

The scores of the predefined answers are then calculated with an MLP, a widely used method in recent studies; the formula is as follows:

$$p(a \mid Q, V) = \operatorname{softmax}\!\left(\operatorname{MLP}([s_Q; s_V])\right)$$
in the implementation of the invention, the comparison of the effect of the model of the invention and other models is tested on a COCO-QA data set, and experiments prove that the model of the invention is superior to other models, and the test effect is as follows:
(Table of COCO-QA accuracy results, present only as an image in the original; the numeric values are not recoverable from the source text.)
therefore, the method and the system can help the visually impaired people to understand the visual information, and can apply the visual question-answering to the image retrieval system in the future to help the user to retrieve the required images.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof; the present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein; any reference sign in a claim should not be construed as limiting the claim concerned.
Furthermore, it should be understood that although this description is organized by embodiments, not every embodiment contains only a single independent technical solution; the description is written this way merely for clarity, and those skilled in the art should take the description as a whole; the embodiments may be combined as appropriate to form other embodiments understood by those skilled in the art.

Claims (4)

1. A visual question-answering model based on an iterative attention mechanism is characterized in that: the method comprises the following steps:
step S1: constructing a dual attention mechanism;
step S2: iterating the internal structure of the model, i.e., the method of fusing the image and question features at each step;
step S3: predicting answers.
2. The visual question-answering model based on the iterative attention mechanism as claimed in claim 1, wherein: the step S1 includes:
First, image features are extracted with VGGNet to obtain $V_l \in \mathbb{R}^{d \times T}$ ($T$ image regions), and, with the iterative model to be used later in mind, the question is encoded with a Bi-LSTM to obtain $Q_l \in \mathbb{R}^{d \times N}$ ($N$ question words); two attention maps are then created from $Q_l$ and $V_l$, calculated as follows:

$$A_{Q_l} = \operatorname{softmax}_{\text{row}}\!\left(Q_l^{\top} W_l V_l\right), \qquad A_{V_l} = \operatorname{softmax}_{\text{row}}\!\left(V_l^{\top} W_l^{\top} Q_l\right)$$

where $W_l \in \mathbb{R}^{d \times d}$ is a learned weight matrix. Each row of $A_{Q_l}$ and $A_{V_l}$ above contains a single attention distribution: row $n$ of $A_{Q_l}$ attends over the $T$ image regions for the $n$-th question word, and row $t$ of $A_{V_l}$ attends over the $N$ question words for the $t$-th image region.

In practice, the $d$-dimensional feature vectors $q_{ln}$ and $v_{lt}$ are projected into several low-dimensional spaces; let $h$ be the number of low-dimensional spaces and $d_h\,(\equiv d/h)$ the feature-vector dimension in each, with $P_{Q_l}^{(i)}, P_{V_l}^{(i)} \in \mathbb{R}^{d_h \times d}$ representing the linear projections. The projected feature matrices of the $i$-th space are:

$$Q_l^{(i)} = P_{Q_l}^{(i)} Q_l, \qquad V_l^{(i)} = P_{V_l}^{(i)} V_l$$

An attention map is created in each space by normalizing by column and by row with the softmax function, as follows:

$$A_{Q_l}^{(i)} = \operatorname{softmax}_{\text{row}}\!\left((Q_l^{(i)})^{\top} V_l^{(i)}\right), \qquad A_{V_l}^{(i)} = \operatorname{softmax}_{\text{row}}\!\left((V_l^{(i)})^{\top} Q_l^{(i)}\right)$$

Because multiplicative (dot-product) attention is used, the average fusion of the features over the $h$ spaces is equivalent to averaging the attention maps, so:

$$A_{Q_l} = \frac{1}{h} \sum_{i=1}^{h} A_{Q_l}^{(i)}, \qquad A_{V_l} = \frac{1}{h} \sum_{i=1}^{h} A_{V_l}^{(i)}$$

Product attention is then used to obtain the attended representation $\hat{Q}_l$ of the image conditioned on each question word and the attended representation $\hat{V}_l$ of the question conditioned on each image region; the formulas are as follows:

$$\hat{Q}_l = V_l A_{Q_l}^{\top}, \qquad \hat{V}_l = Q_l A_{V_l}^{\top}$$

The size of $\hat{V}_l$ above is the same as that of $V_l$, i.e., $d \times T$, and the size of $\hat{Q}_l$ is the same as that of $Q_l$, i.e., $d \times N$.
3. The visual question-answering model based on the iterative attention mechanism as claimed in claim 1, wherein: the step S2 includes:
After the feature representations $\hat{Q}_l$ and $\hat{V}_l$ are computed, the $n$-th column of the matrix $\hat{Q}_l$ stores a representation of the entire image related to the $n$-th question word, i.e., the attention feature vector for the $n$-th word. The $n$-th column vector $\hat{q}_{ln}$ is then concatenated with the $n$-th question word vector $q_{ln}$, forming a $2d$-dimensional vector $[q_{ln}; \hat{q}_{ln}]$, which is projected back into $d$-dimensional space through a single-layer network with a ReLU activation function and a residual connection; the formula is as follows:

$$q_{(l+1)n} = \operatorname{ReLU}\!\left(W_{Q_l} [q_{ln}; \hat{q}_{ln}] + b_{Q_l}\right) + q_{ln}$$

where $W_{Q_l} \in \mathbb{R}^{d \times 2d}$ and $b_{Q_l} \in \mathbb{R}^{d}$ are the learned weights and bias term; when all $N$ words ($n = 1, \ldots, N$) have taken part in the calculation, the result is $Q_{l+1}$.

Similarly, the representation $v_{lt}$ of the $t$-th image area is concatenated with the representation $\hat{v}_{lt}$ of the entire question related to the $t$-th image area and projected into $d$-dimensional space; the formula is as follows:

$$v_{(l+1)t} = \operatorname{ReLU}\!\left(W_{V_l} [v_{lt}; \hat{v}_{lt}] + b_{V_l}\right) + v_{lt}$$

where $W_{V_l} \in \mathbb{R}^{d \times 2d}$ and $b_{V_l} \in \mathbb{R}^{d}$ are the learned weights and bias term; when all $T$ areas ($t = 1, \ldots, T$) have taken part in the calculation, the result is $V_{l+1}$.
4. The visual question-answering model based on the iterative attention mechanism as claimed in claim 1, wherein: the step S3 includes:
The invention uses the last outputs $Q_L$ and $V_L$ of the iterative model to predict the answer distribution. Since they contain representations of $N$ question words and $T$ image regions respectively, a self-attention mechanism is first applied to each of them to obtain aggregated representations of the entire question and image. For $Q_L$, the operation is as follows:

the scores $s_{Q_L,1}, \ldots, s_{Q_L,N}$ of $q_{L1}, \ldots, q_{LN}$ are calculated by applying a two-layer MLP;

the scores are normalized with softmax to obtain the weights $a_{Q_1}, \ldots, a_{Q_N}$;

the weights are used to calculate the aggregated representation $s_Q = \sum_{n=1}^{N} a_{Q_n} q_{Ln}$.

The same method is used on $V_L$ to obtain the weights $a_{V_1}, \ldots, a_{V_T}$ and the aggregated representation $s_V = \sum_{t=1}^{T} a_{V_t} v_{Lt}$.

The scores of the predefined answers are then calculated with an MLP, a widely used method in recent studies; the formula is as follows:

$$p(a \mid Q, V) = \operatorname{softmax}\!\left(\operatorname{MLP}([s_Q; s_V])\right)$$
CN201911099046.2A 2019-11-12 2019-11-12 Visual question-answering model based on iterative attention mechanism Pending CN110889340A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911099046.2A CN110889340A (en) 2019-11-12 2019-11-12 Visual question-answering model based on iterative attention mechanism

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911099046.2A CN110889340A (en) 2019-11-12 2019-11-12 Visual question-answering model based on iterative attention mechanism

Publications (1)

Publication Number Publication Date
CN110889340A true CN110889340A (en) 2020-03-17

Family

ID=69747275

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911099046.2A Pending CN110889340A (en) 2019-11-12 2019-11-12 Visual question-answering model based on iterative attention mechanism

Country Status (1)

Country Link
CN (1) CN110889340A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111680484A (en) * 2020-05-29 2020-09-18 北京理工大学 Answer model generation method and system for visual general knowledge reasoning question and answer
CN111680484B (en) * 2020-05-29 2023-04-07 北京理工大学 Answer model generation method and system for visual general knowledge reasoning question and answer
CN111858849A (en) * 2020-06-10 2020-10-30 南京邮电大学 VQA method based on intensive attention module
CN112036276A (en) * 2020-08-19 2020-12-04 北京航空航天大学 Artificial intelligent video question-answering method


Legal Events

Date Code Title Description
PB01 Publication
WD01 Invention patent application deemed withdrawn after publication (application publication date: 20200317)