CN112115290B - VR panorama scheme matching method based on image intelligent retrieval - Google Patents

VR panorama scheme matching method based on image intelligent retrieval

Info

Publication number
CN112115290B
CN112115290B (application CN202010809509.6A)
Authority
CN
China
Prior art keywords
inter
function
layer
scheme
panorama
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010809509.6A
Other languages
Chinese (zh)
Other versions
CN112115290A (en)
Inventor
万倩倩
周兵
王庆利
苏亮亮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Weilijia Intelligent Technology Co ltd
Nanjing Zhishan Intelligent Science And Technology Research Institute Co ltd
Original Assignee
Nanjing Weilijia Intelligent Technology Co ltd
Nanjing Zhishan Intelligent Science And Technology Research Institute Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Weilijia Intelligent Technology Co ltd and Nanjing Zhishan Intelligent Science And Technology Research Institute Co ltd
Priority to CN202010809509.6A
Publication of CN112115290A
Application granted
Publication of CN112115290B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/55 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/5866 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using information manually generated, e.g. tags, keywords, comments, manually generated location and time information
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G06F18/2135 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on approximation criteria, e.g. principal component analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Library & Information Science (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses a VR panorama scheme matching method based on intelligent image retrieval, which comprises the following steps: constructing a VR panorama scheme database in which VR panorama scheme links, inter-function labels (labels naming the function room shown in a picture, such as kitchen or living room) and inter-function picture feature vectors are stored and mutually associated; extracting the inter-function label and the inter-function picture feature vector of an input image to be retrieved by using a convolutional neural network model; and performing label retrieval on the VR panorama scheme database using the inter-function label and the inter-function picture feature vector to find matched VR panorama schemes. With this method, similar schemes can be matched quickly and accurately in a massive VR panorama scheme library from a single indoor effect picture.

Description

VR panorama scheme matching method based on image intelligent retrieval
Technical Field
The application relates to a VR panorama scheme matching method, in particular to a VR panorama scheme matching method based on image intelligent retrieval.
Background
Combining virtual reality technology with interior design allows a design to be presented to the user in an entirely new way. The computer simulates the indoor space and environment, showing graceful color schemes, soft and warm lighting, pleasing spatial layouts and enduring artistic forms, and so composes a gorgeous VR panorama scheme. The injection of artificial intelligence has brought new momentum to interior design, making it more convenient, more intelligent and more efficient. In recent years deep learning has been particularly notable in this field; indoor three-dimensional scene recognition and indoor model retrieval are both advancing toward more intelligent goals.
With the spread of "Internet Plus" business models, the traditional face-to-face mode in which designers recommend designs to users is gradually being replaced by online modes. More and more interior design companies therefore upload their VR panorama schemes to websites or applets for users to browse. A scheme is displayed mainly through panoramic roaming, with still pictures as a supplement; the user can roam through an indoor panorama scheme without leaving home, as if physically present.
Every new business model meets some resistance as it emerges. Research on the interior design field shows that, to satisfy user demand, many companies have designed large numbers of VR panorama schemes for users to choose from. For the user, picking a favorite scheme out of this enormous pool is as difficult and tedious as fishing a needle out of the sea. Interior designers therefore divide panorama schemes into Chinese style, European style, American style and so on, and users search by text.
Because users lack professional knowledge of interior design, perceive color, space and artistic composition only vaguely, and are limited by what text can express, they cannot quickly and accurately pick a favorite VR panorama scheme from a large set by text alone. If a user has only a single favorite indoor effect picture, no match against that picture can be made in a massive panorama scheme library. Moreover, rendered indoor effect pictures are currently labeled by function manually, which wastes considerable manpower, is inefficient and lacks intelligence.
These problems seriously affect the user's choice of VR panorama scheme and the efficiency of effect-picture classification. The traditional ways of selecting panorama schemes and classifying effect pictures cannot meet the development needs of the interior design field or the demand for a more intelligent, more efficient user experience. Many interior design companies and professionals urgently need a VR panorama matching system driven by effect pictures. It is therefore necessary to design a VR panorama scheme matching method based on intelligent image retrieval that can extract information from a picture to be retrieved for fast, intelligent search in a panorama scheme library, while automatically classifying input pictures by function.
Disclosure of Invention
The application aims to: provide a VR panorama scheme matching method based on intelligent image retrieval that can extract information from a picture to be retrieved for fast, intelligent search in a panorama scheme library, while automatically classifying input pictures by function.
The technical scheme is as follows: the application discloses a VR panorama scheme matching method based on image intelligent retrieval, which comprises the following steps:
step 1, constructing a VR panorama scheme database, wherein VR panorama scheme links, inter-function labels and inter-function picture feature vectors are stored in the VR panorama scheme database, and the VR panorama scheme links, the inter-function labels and the inter-function picture feature vectors are mutually associated;
step 2, extracting inter-function labels and inter-function picture feature vectors of the input image to be retrieved by using a convolutional neural network model;
and 3, carrying out tag retrieval on the VR panorama scheme database by utilizing the inter-function tags and the inter-function picture feature vectors to find a matched VR panorama scheme.
Further, in step 1, the specific steps of constructing the VR panorama scheme database are as follows:
step 1.1, a scheme table and a plurality of inter-function tables are arranged in a VR panorama scheme database, the scheme table is used for storing each VR panorama scheme link and corresponding scheme id, each inter-function table is used for respectively storing each inter-function picture feature vector and corresponding scheme id, inter-function labels correspond to the inter-function tables, and the VR panorama scheme links, the inter-function labels and the inter-function picture feature vectors are mutually related through the scheme id;
step 1.2, obtaining scheme data from an existing VR panorama scheme, wherein the scheme data comprises VR panorama scheme links, scheme ids, inter-function labels and inter-function picture feature vectors;
and 1.3, correspondingly storing the obtained scheme data in a scheme table and a function-to-function table, thereby establishing a VR panorama scheme database.
Further, in step 2, the specific steps of extracting the inter-function labels and the inter-function picture feature vectors by using the convolutional neural network model are as follows:
step 2.1, constructing a convolutional neural network which sequentially comprises a CONV1 layer, a Max1 layer, a CONV2 layer, a Max2 layer, a CONV3 layer, a Max3 layer, a Flatten layer, an FC1 layer, an FC2 layer, an FC3 layer and a Softmax classifier;
step 2.2, training the constructed convolutional neural network with sample data, using Batch training with a batch-size of 64; during training, data are fed into the network model, the output of the network is computed layer by layer, and the parameters are then updated by gradient descent so that the network model approaches the optimal solution;
Step 2.3, optimizing the convolutional neural network model by using a Dropout and ELU activation function;
and 2.4, extracting inter-function labels and inter-function picture feature vectors of the input image to be retrieved by utilizing the optimized convolutional neural network.
Further, in step 2.1, the input of the convolutional neural network is a 224×224×3 digital matrix; the CONV1, CONV2 and CONV3 layers are convolutional layers; the Max1, Max2 and Max3 layers are max-pooling layers; the FC1, FC2 and FC3 layers are fully connected layers; and the Flatten layer is an intermediate layer that unrolls the convolutional output into the fully connected layers.
In step 2.3, a Dropout layer is added between the FC1 layer and the FC2 layer and between the FC2 layer and the FC3 layer, and the Dropout parameter is set to 0.3.
Further, in step 2.3, when the convolutional neural network model is optimized by using the ELU activation function, the ELU activation function is applied to the CONV1 layer, the CONV2 layer and the CONV3 layer, the ELU activation function being:
f(x) = x, if x > 0;  f(x) = α(e^x − 1), if x ≤ 0
where α is an adjustable parameter that controls when the negative part of the ELU activation function reaches saturation, x represents the function input and f(x) represents the activation output.
Further, in step 2.1, the Softmax classifier is an improved Softmax classifier, whose calculation formula is:
S_i = e^(V_i − max(V)) / Σ_{j=1..length} e^(V_j − max(V))
where V represents the input vector, i represents a position index into the input vector, length represents the dimension of the input vector, and S represents the corresponding probability result.
Further, in step 2.4, when the optimized convolutional neural network is used to extract the inter-function labels and the inter-function picture feature vectors of the input image to be retrieved, the improved Softmax classifier outputs the inter-function labels, and the output data of the FC1 layer is intercepted as the inter-function picture feature vectors.
Further, in step 3, the specific steps of performing tag search in the VR panorama scheme database by using inter-function tags and inter-function picture feature vectors are as follows:
step 3.1, judging whether the input image to be searched is an effective inter-function image through the inter-function label, if so, entering step 3.2, if not, prompting to input the image to be searched again, and returning to step 2;
step 3.2, searching in the VR panorama scheme database by utilizing the inter-function label of the image to be searched, searching for the corresponding inter-function label of the image to be searched in the VR panorama scheme database, and finding out the corresponding inter-function table in the VR panorama scheme database according to the inter-function label obtained by searching;
step 3.3, obtaining the similarity between the feature vector of the image to be retrieved and each inter-function picture feature vector stored in the inter-function table, the similarity being calculated as:
d = √( Σ_i (h_i − q_i)² ),  s = 1 / (1 + d)
where d is the Euclidean distance, s is the similarity, h and q are the two input vectors, and i is the index into the input vectors;
step 3.4, sorting the similarity results, selecting the top N inter-function picture feature vectors with the highest similarity, and storing the scheme ids corresponding to those feature vectors;
and 3.5, finding a corresponding VR panorama scheme link in the scheme table according to the saved scheme id, and finding each matched VR panorama scheme according to the VR panorama scheme link.
Further, in step 3.3, before performing similarity calculation, the feature vector of the image to be searched and the feature vector of the inter-function picture in the inter-function table are subjected to dimension reduction processing by using a PCA algorithm, so that the feature vector of the image to be searched and the feature vector of the inter-function picture in the inter-function table are reduced to 256 dimensions.
Compared with the prior art, the application has the beneficial effects that: a convolutional neural network model suited to effect pictures is designed and trained; it can extract both inter-function labels and picture feature vectors, and when used for classification its image classification accuracy exceeds 96%; and by inputting a single image to be retrieved, panorama schemes similar to the information contained in that picture are output, so matching efficiency and accuracy are both high.
Drawings
FIG. 1 is a flow chart of the method of the present application;
FIG. 2 is a flowchart of tag screening according to the present application;
FIG. 3 is a schematic diagram of a convolutional neural network model of the present application;
fig. 4 is an inter-function picture and a label thereof according to the present application.
Detailed Description
The technical scheme of the present application will be described in detail with reference to the accompanying drawings, but the scope of the present application is not limited to the embodiments.
Example 1:
As shown in FIG. 1, the VR panorama scheme matching method based on intelligent image retrieval disclosed by the application automatically extracts picture information through a convolutional neural network and, by combining text retrieval with content retrieval, quickly matches schemes in the database; it comprises the following steps:
step 1, constructing a VR panorama scheme database, wherein VR panorama scheme links, inter-function labels and inter-function picture feature vectors are stored in the VR panorama scheme database, and the VR panorama scheme links, the inter-function labels and the inter-function picture feature vectors are mutually associated, as shown in FIG. 4, which is a schematic diagram of the inter-function picture and the labels thereof;
step 2, extracting inter-function labels and inter-function picture feature vectors of the input image to be retrieved by using a convolutional neural network model;
and 3, carrying out tag retrieval on the VR panorama scheme database by utilizing the inter-function tags and the inter-function picture feature vectors to find a matched VR panorama scheme.
A convolutional neural network model suited to effect pictures is designed and trained; it can extract both inter-function labels and picture feature vectors, and when used for classification its image classification accuracy exceeds 96%; by inputting a single image to be retrieved, panorama schemes similar to the information contained in that picture are output, so matching efficiency and accuracy are both high.
Further, converting the panoramic scheme into an image matching problem by using a database, and imaging the complex VR panoramic scheme data, wherein in step 1, the specific steps of constructing the VR panoramic scheme database are as follows:
step 1.1, a scheme table and a plurality of inter-function tables are arranged in a VR panorama scheme database, the scheme table is used for storing each VR panorama scheme link and corresponding scheme id, each inter-function table is used for respectively storing each inter-function picture feature vector and corresponding scheme id, inter-function labels correspond to the inter-function tables, and the VR panorama scheme links, the inter-function labels and the inter-function picture feature vectors are mutually related through the scheme id;
step 1.2, obtaining scheme data from existing VR panorama schemes, the scheme data comprising VR panorama scheme links, scheme ids, inter-function labels and inter-function picture feature vectors; when the VR panorama scheme database is established, the inter-function labels and inter-function picture feature vectors of the training samples are extracted with a convolutional neural network model identical to the one used in step 2;
and 1.3, correspondingly storing the obtained scheme data in a scheme table and a function-to-function table, thereby establishing a VR panorama scheme database.
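For illustration only, the following is a minimal sketch of this schema using Python's sqlite3. The database engine, table names and column names are assumptions; the patent specifies only a scheme table, one inter-function table per function room, and their association through the scheme id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Scheme table: one row per VR panorama scheme (scheme id + scheme link).
conn.execute("""CREATE TABLE scheme (
                    scheme_id INTEGER PRIMARY KEY,
                    panorama_link TEXT NOT NULL)""")
# One inter-function table per function-room label (labels per step 3.1);
# each row holds a serialized picture feature vector and its scheme id.
for room in ("kitchen", "living_room", "bedroom", "dining_room", "toilet"):
    conn.execute(f"""CREATE TABLE {room} (
                         feature_vector BLOB NOT NULL,
                         scheme_id INTEGER REFERENCES scheme(scheme_id))""")
conn.commit()
```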
Further, in step 2, the specific steps of extracting the inter-function labels and the inter-function picture feature vectors by using the convolutional neural network model are as follows:
step 2.1, constructing a convolutional neural network which sequentially comprises a CONV1 layer, a Max1 layer, a CONV2 layer, a Max2 layer, a CONV3 layer, a Max3 layer, a Flatten layer, an FC1 layer, an FC2 layer, an FC3 layer and a Softmax classifier, as shown in FIG. 3; the whole network is trained with small convolution kernels, none larger than 5×5; Table 1 details the kernel size, stride, input/output matrix sizes and parameter count of each layer.
Table 1. Effect-picture classification network structure

Layer | Input size | Output size | Description | Parameters
CONV1 | 224×224×3 | 110×110×32 | Convolution: 32 kernels, 5×5, s=2 | 2432
Max1 | 110×110×32 | 55×55×32 | Kernel: 2×2, s=1 | 0
CONV2 | 55×55×32 | 28×28×64 | Convolution: 64 kernels, 3×3, s=2 | 18496
Max2 | 28×28×64 | 14×14×64 | Kernel: 2×2, s=1 | 0
CONV3 | 14×14×64 | 7×7×128 | Convolution: 128 kernels, 3×3, s=2 | 73856
Max3 | 7×7×128 | 3×3×128 | Kernel: 2×2, s=1 | 0
Flatten | 3×3×128 | 1152 | Flatten | 0
FC1 | 1152 | 4096 | Fully connected | 4722688
FC2 | 4096 | 2048 | Fully connected | 8390656
FC3 | 2048 | 6 | Softmax classification | 12294
As shown in FIG. 3, the model makes full use of the feature-extraction capability of the convolutional and pooling layers and the integration capability of the fully connected layers, autonomously learning image features for the final improved Softmax classifier. A pooling operation follows the convolution computation at each layer, reducing the number of network parameters and improving training speed. The network input is a 224×224×3 digital matrix; in the figure, each arrow represents a layer transition and each rectangle represents the data matrix at the current layer of the network. After ten transformations, the data reaches the improved Softmax classifier, which produces the final classification result.
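For concreteness, the following Keras sketch reproduces the layer sizes listed in Table 1. The framework choice, the padding modes and the ReLU activations of the base model are assumptions inferred from the listed input/output sizes and the later description (note that the pooling outputs in Table 1 imply a pooling stride of 2):

```python
from tensorflow.keras import layers, models

def build_effect_picture_classifier(num_classes=6):
    """Sketch of the Table 1 network; layer names follow the patent.
    ReLU on the conv layers is an assumption for the base model; step 2.3
    later replaces it with ELU."""
    return models.Sequential([
        layers.Conv2D(32, 5, strides=2, activation="relu", padding="valid",
                      input_shape=(224, 224, 3), name="CONV1"),  # -> 110x110x32
        layers.MaxPooling2D(2, strides=2, name="Max1"),          # -> 55x55x32
        layers.Conv2D(64, 3, strides=2, activation="relu",
                      padding="same", name="CONV2"),             # -> 28x28x64
        layers.MaxPooling2D(2, strides=2, name="Max2"),          # -> 14x14x64
        layers.Conv2D(128, 3, strides=2, activation="relu",
                      padding="same", name="CONV3"),             # -> 7x7x128
        layers.MaxPooling2D(2, strides=2, name="Max3"),          # -> 3x3x128
        layers.Flatten(name="Flatten"),                          # -> 1152
        layers.Dense(4096, activation="relu", name="FC1"),
        layers.Dense(2048, activation="relu", name="FC2"),
        layers.Dense(num_classes, activation="softmax", name="FC3"),
    ])
```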
step 2.2, training the constructed convolutional neural network with sample data, using Batch training with a batch-size of 64; during training, data are fed into the network model, the output of the network is computed layer by layer, and the parameters are then updated by gradient descent so that the network model approaches the optimal solution. The Batch mode makes full use of computer resources. Using the trained network model on 1000 test samples, the optimal training accuracy and test accuracy under different batch-size settings were recorded; the results are shown in Table 2.
Table 2. Accuracy at different batch-sizes

batch-size | 16 | 32 | 64 | 128 | 256
Optimal training accuracy (%) | 99.8 | 99.7 | 99.6 | 99.4 | 99.7
Test accuracy (%) | 90.1 | 91.0 | 91.5 | 90.7 | 89.9
Table 2 shows that although batch-size affects the training speed and the number of iterations and time needed to reach a specified accuracy, it does not cause a significant drop in training accuracy. After sufficient training time, the optimal training accuracy in the experiments exceeds 99%, and the test accuracy on the 1000 indoor effect pictures exceeds 89%. Weighing these statistics against the hardware of the experimental computer and the time required for training, and noting that the test accuracy peaks at 91.5% when the batch-size is 64, a batch-size of 64 is used to train the final network model.
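A matching training sketch under the same assumptions. The optimizer settings, epoch count and placeholder data are illustrative; the patent specifies only layer-by-layer forward computation, gradient-descent parameter updates and a batch-size of 64:

```python
import numpy as np

# Placeholder sample data standing in for the patent's effect-picture set.
x_train = np.random.rand(640, 224, 224, 3).astype("float32")
y_train = np.random.randint(0, 6, size=640)  # integer function-room labels

model = build_effect_picture_classifier(num_classes=6)
model.compile(optimizer="sgd",  # gradient descent, per step 2.2
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, batch_size=64, epochs=30, validation_split=0.1)
```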
Step 2.3, optimizing the convolutional neural network model by using a Dropout and ELU activation function;
and 2.4, extracting inter-function labels and inter-function picture feature vectors of the input image to be retrieved by utilizing the optimized convolutional neural network.
Further, in step 2.1, the input of the convolutional neural network is a 224×224×3 digital matrix; the CONV1, CONV2 and CONV3 layers are convolutional layers; the Max1, Max2 and Max3 layers are max-pooling layers; the FC1, FC2 and FC3 layers are fully connected layers; and the Flatten layer is an intermediate layer that unrolls the convolutional output into the fully connected layers.
Further, in step 2.3, the convolutional neural network model is optimized by using Dropout: a Dropout layer is added between the FC1 layer and the FC2 layer and between the FC2 layer and the FC3 layer, with the Dropout parameter set to 0.3. Adding the two Dropout layers optimizes the network model and improves test accuracy. With a batch-size of 64, the training and test accuracies for Dropout parameters of 0.3, 0.5 and 0.7 are shown in Table 3.
Table 3. Accuracy at different Dropout parameters

Dropout parameter | 0.3 | 0.5 | 0.7
Optimal training accuracy (%) | 99.1 | 98.7 | 97.7
Test accuracy (%) | 96.3 | 95.7 | 93.6
Introducing Dropout improves test accuracy; the experiments show that test accuracy is highest with a Dropout parameter of 0.3, which suits this convolutional neural network model and the effect-picture data.
Further, in step 2.3, when the convolutional neural network model is optimized by using the ELU activation function, the ELU activation function is applied to the CONV1 layer, the CONV2 layer and the CONV3 layer, the ELU activation function being:
f(x) = x, if x > 0;  f(x) = α(e^x − 1), if x ≤ 0
where α is an adjustable parameter that controls when the negative part of the ELU activation function reaches saturation, x represents the function input and f(x) represents the activation output. All fully connected layers still use the ReLU activation function. With the network model and hyperparameters otherwise unchanged (batch-size 64, Dropout parameter 0.3), the accuracies are shown in Table 4. The experiments show that the ELU activation function improves the classification accuracy of the model by 0.6%, raising the test accuracy of the network to 96.9%.
Table 4. Accuracy with different activation functions
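As a worked illustration of the ELU formula above, a minimal NumPy version (α defaults to 1.0 here as an assumption; the patent treats it as adjustable):

```python
import numpy as np

def elu(x, alpha=1.0):
    """ELU: identity for x > 0, alpha*(exp(x) - 1) for x <= 0 (step 2.3)."""
    x = np.asarray(x, dtype=float)
    neg = alpha * (np.exp(np.minimum(x, 0.0)) - 1.0)  # exp only sees x <= 0
    return np.where(x > 0, x, neg)
```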
Furthermore, because the original Softmax uses exponential operations, large inputs make the exponentials grow extremely fast, and division between such oversized numbers easily overflows; once overflow occurs, classification fails or gives wrong results. The original Softmax classifier judges the class of the current input by the probability formula:
S_i = e^(V_i) / Σ_{j=1..length} e^(V_j)
in order to improve the reliability of the convolutional neural network model, in step 2.1, the Softmax classifier adopts an improved Softmax classifier, and the calculation formula of the improved Softmax classifier is as follows:
where V represents an input vector, i represents a position index corresponding to the input vector, length represents a dimension of the input vector, and S represents a corresponding probability result. The improved Softmax classifier is kept consistent with the classification of the original classifier, but the improved Softmax classifier does not have all the super-large values or all the super-small values, and effectively solves the overflow problem of Softmax.
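A minimal NumPy sketch of this overflow-safe form; subtracting max(V) before exponentiation is the standard stabilization consistent with the description above:

```python
import numpy as np

def improved_softmax(v):
    """Numerically stable Softmax: shifting by max(V) leaves the
    probabilities unchanged but keeps every exponent <= 0, so exp()
    cannot overflow (step 2.1)."""
    shifted = v - np.max(v)
    exps = np.exp(shifted)
    return exps / np.sum(exps)
```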
The resulting convolutional neural network model structure is shown in table 5.
Table 5. Final effect-picture classification network structure

Layer | Input size | Output size | Description | Parameters
CONV1 | 224×224×3 | 110×110×32 | Convolution: 32 kernels, 5×5, s=2, ELU activation | 2432
Max1 | 110×110×32 | 55×55×32 | Kernel: 2×2, s=1 | 0
CONV2 | 55×55×32 | 28×28×64 | Convolution: 64 kernels, 3×3, s=2, ELU activation | 18496
Max2 | 28×28×64 | 14×14×64 | Kernel: 2×2, s=1 | 0
CONV3 | 14×14×64 | 7×7×128 | Convolution: 128 kernels, 3×3, s=2, ELU activation | 73856
Max3 | 7×7×128 | 3×3×128 | Kernel: 2×2, s=1 | 0
Flatten | 3×3×128 | 1152 | Flatten | 0
FC1 | 1152 | 4096 | Fully connected, ReLU activation | 4722688
Dropout1 | 4096 | 4096 | Dropout = 0.3 | 0
FC2 | 4096 | 2048 | Fully connected, ReLU activation | 8390656
Dropout2 | 2048 | 2048 | Dropout = 0.3 | 0
FC3 | 2048 | 6 | Improved Softmax classifier | 12294
The convolutional neural network model improves the classification accuracy of the indoor effect graph to 96.9%, and has higher intelligence and accuracy.
Further, in step 2.4, when the optimized convolutional neural network is used to extract the inter-function labels and the inter-function picture feature vectors of the input image to be searched, the improved Softmax classifier outputs the inter-function labels, and the output data of the FC1 layer is intercepted as the inter-function picture feature vectors, i.e. the 4096-dimensional output of the FC1 layer is used as the feature description of the image to be searched.
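Under the Keras sketch above, intercepting the FC1 output amounts to a truncated sub-model; the layer name "FC1" and the placeholder input below are assumptions carried over from the earlier sketch:

```python
import numpy as np
from tensorflow.keras.models import Model

model = build_effect_picture_classifier(num_classes=6)  # trained weights assumed loaded
feature_extractor = Model(inputs=model.input,
                          outputs=model.get_layer("FC1").output)

query_images = np.random.rand(1, 224, 224, 3)       # placeholder input image
probs = model.predict(query_images)                 # improved-Softmax probabilities
labels = probs.argmax(axis=-1)                      # inter-function label indices
features = feature_extractor.predict(query_images)  # 4096-dim FC1 feature vectors
```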
Further, the feature vector and the label of the image are automatically extracted by using the convolutional neural network, and in the step 3, the specific steps of carrying out label retrieval in the VR panorama scheme database by using the feature vector of the inter-function label and the inter-function picture are as follows:
step 3.1, judging through the inter-function label whether the input image to be retrieved is a valid function-room image, i.e. whether the acquired inter-function label falls within the range kitchen, living room, bedroom, dining room or toilet; if so, entering step 3.2; if not (for example, the cat image in FIG. 2), prompting the user to input a new image to be retrieved and returning to step 2;
step 3.2, retrieving in the VR panorama scheme database with the inter-function label of the image to be retrieved, and locating the corresponding inter-function table according to the matched label; for example, when the inter-function label is kitchen, the kitchen table is selected, and when it is living room, the living room table is selected;
step 3.3, obtaining the similarity between the feature vector of the image to be retrieved and each inter-function picture feature vector stored in the inter-function table, the similarity being calculated as:
d = √( Σ_i (h_i − q_i)² ),  s = 1 / (1 + d)
where d is the Euclidean distance, s is the similarity, h and q are the two input vectors, and i is the index into the input vectors;
step 3.4, sorting the similarity results and selecting the top N inter-function picture feature vectors with the highest similarity, where N may be 50; saving 50 results avoids repeated calculation when the user changes the number of screening results on the fly and increases the interactivity of the whole system; the scheme ids corresponding to these feature vectors are saved. Because each VR panorama scheme contains several inter-function pictures, the maximum of their similarities is taken as the similarity between that VR panorama scheme and the input image to be retrieved, which increases the comprehensiveness of the system and improves matching accuracy;
and 3.5, finding a corresponding VR panorama scheme link in the scheme table according to the saved scheme id, and finding each matched VR panorama scheme according to the VR panorama scheme link.
Further, in step 3.3, before the similarity calculation, the feature vector of the image to be retrieved and the inter-function picture feature vectors in the inter-function table are reduced to 256 dimensions with the PCA algorithm. Because each extracted feature vector contains 4096 numbers, computing Euclidean distances directly is expensive and seriously slows the algorithm; reducing the 4096-dimensional vectors to 256 dimensions follows the PCA idea of keeping uncorrelated components and discarding highly correlated ones, effectively lowering CPU usage at run time, speeding up the algorithm and improving the user experience.
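Putting steps 3.3 to 3.5 together, a retrieval sketch follows. The placeholder arrays stand in for one inter-function table, and the conversion s = 1/(1+d) is an assumption, since the patent states only that the similarity s is derived from the Euclidean distance d:

```python
import numpy as np
from sklearn.decomposition import PCA

# Placeholders standing in for one inter-function table (M stored pictures).
db_features = np.random.rand(1000, 4096)   # stored FC1 feature vectors
db_scheme_ids = np.arange(1000)            # scheme id for each stored vector
query_vec = np.random.rand(1, 4096)        # FC1 vector of the image to be retrieved

pca = PCA(n_components=256).fit(db_features)
db_reduced = pca.transform(db_features)    # reduce stored vectors to 256 dims
query = pca.transform(query_vec)[0]        # reduce the query vector likewise

d = np.linalg.norm(db_reduced - query, axis=1)  # Euclidean distance (step 3.3)
s = 1.0 / (1.0 + d)                             # similarity: the 1/(1+d) form is an assumption
top_n = np.argsort(-s)[:50]                     # keep the top N = 50 matches (step 3.4)
matched_ids = db_scheme_ids[top_n]              # scheme ids -> scheme links (step 3.5)
```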
As described above, although the present application has been shown and described with reference to certain preferred embodiments, it is not to be construed as limiting the application itself. Various changes in form and details may be made therein without departing from the spirit and scope of the application as defined by the appended claims.

Claims (10)

1. The VR panorama scheme matching method based on the intelligent image retrieval is characterized by comprising the following steps:
step 1, constructing a VR panorama scheme database, wherein VR panorama scheme links, inter-function labels and inter-function picture feature vectors are stored in the VR panorama scheme database, and the VR panorama scheme links, the inter-function labels and the inter-function picture feature vectors are mutually associated;
step 2, extracting inter-function labels and inter-function picture feature vectors of the input image to be retrieved by using a convolutional neural network model;
and 3, carrying out tag retrieval on the VR panorama scheme database by utilizing the inter-function tags and the inter-function picture feature vectors to find a matched VR panorama scheme.
2. The VR panorama scheme matching method based on image intelligent retrieval according to claim 1, wherein in step 1, the specific steps of constructing a VR panorama scheme database are as follows:
step 1.1, a scheme table and a plurality of inter-function tables are arranged in a VR panorama scheme database, the scheme table is used for storing each VR panorama scheme link and corresponding scheme id, each inter-function table is used for respectively storing each inter-function picture feature vector and corresponding scheme id, inter-function labels correspond to the inter-function tables, and the VR panorama scheme links, the inter-function labels and the inter-function picture feature vectors are mutually related through the scheme id;
step 1.2, obtaining scheme data from an existing VR panorama scheme, wherein the scheme data comprises VR panorama scheme links, scheme ids, inter-function labels and inter-function picture feature vectors;
and 1.3, correspondingly storing the obtained scheme data in a scheme table and a function-to-function table, thereby establishing a VR panorama scheme database.
3. The VR panorama scheme matching method based on image intelligent retrieval according to claim 1, wherein in step 2, the specific steps of extracting the inter-function labels and the inter-function picture feature vectors by using the convolutional neural network model are as follows:
step 2.1, constructing a convolutional neural network which sequentially comprises a CONV1 layer, a Max1 layer, a CONV2 layer, a Max2 layer, a CONV3 layer, a Max3 layer, a Flatten layer, an FC1 layer, an FC2 layer, an FC3 layer and a Softmax classifier;
step 2.2, training the constructed convolutional neural network by using sample data, and training the convolutional neural network by using a Batch training mode with a Batch-size of 64; during training, data are transmitted into a network model, the output value of the network is calculated layer by layer, and finally, the gradient descent algorithm is utilized to update parameters so that the network model approaches to an optimal solution;
step 2.3, optimizing the convolutional neural network model by using a Dropout and ELU activation function;
and 2.4, extracting inter-function labels and inter-function picture feature vectors of the input image to be retrieved by utilizing the optimized convolutional neural network.
4. The VR panorama scheme matching method based on image intelligent retrieval according to claim 3, wherein in step 2.1, the input of the convolutional neural network is a 224×224×3 digital matrix; the CONV1, CONV2 and CONV3 layers are convolutional layers; the Max1, Max2 and Max3 layers are max-pooling layers; the FC1, FC2 and FC3 layers are fully connected layers; and the Flatten layer is an intermediate layer that unrolls the convolutional output into the fully connected layers.
5. The VR panorama scheme matching method based on image intelligent retrieval according to claim 3, wherein in step 2.3, a Dropout layer is added between the FC1 layer and the FC2 layer and between the FC2 layer and the FC3 layer, and the Dropout parameter is set to 0.3.
6. The VR panorama scheme matching method based on image intelligent retrieval according to claim 3, wherein in step 2.3, when the convolutional neural network model is optimized by using the ELU activation function, the ELU activation function is applied to the CONV1 layer, the CONV2 layer and the CONV3 layer, the ELU activation function being:
f(x) = x, if x > 0;  f(x) = α(e^x − 1), if x ≤ 0
where α is an adjustable parameter that controls when the negative part of the ELU activation function reaches saturation, x represents the function input and f(x) represents the activation output.
7. The VR panorama scheme matching method based on image intelligent retrieval of claim 3, wherein in step 2.1, the Softmax classifier is an improved Softmax classifier, whose calculation formula is:
S_i = e^(V_i − max(V)) / Σ_{j=1..length} e^(V_j − max(V))
where V represents the input vector, i represents a position index into the input vector, length represents the dimension of the input vector, and S represents the corresponding probability result.
8. The VR panorama scheme matching method based on image intelligent retrieval according to claim 7, wherein in step 2.4, when the inter-function label and the inter-function picture feature vector of the input image to be retrieved are extracted by using the optimized convolutional neural network, the inter-function label is output by the improved Softmax classifier, and the output data of the FC1 layer is intercepted as the inter-function picture feature vector.
9. The VR panorama scheme matching method based on image intelligent retrieval according to claim 2, wherein in step 3, the specific steps of performing label retrieval in the VR panorama scheme database by using inter-function labels and inter-function picture feature vectors are as follows:
step 3.1, judging whether the input image to be searched is an effective inter-function image through the inter-function label, if so, entering step 3.2, if not, prompting to input the image to be searched again, and returning to step 2;
step 3.2, searching in the VR panorama scheme database by utilizing the inter-function label of the image to be searched, searching for the corresponding inter-function label of the image to be searched in the VR panorama scheme database, and finding out the corresponding inter-function table in the VR panorama scheme database according to the inter-function label obtained by searching;
step 3.3, obtaining the similarity between the feature vector of the image to be retrieved and each inter-function picture feature vector stored in the inter-function table, the similarity being calculated as:
d = √( Σ_i (h_i − q_i)² ),  s = 1 / (1 + d)
where d is the Euclidean distance, s is the similarity, h and q are the two input vectors, and i is the index into the input vectors;
step 3.4, sorting the similarity results, selecting the top N inter-function picture feature vectors with the highest similarity, and storing the scheme ids corresponding to those feature vectors;
and 3.5, finding a corresponding VR panorama scheme link in the scheme table according to the saved scheme id, and finding each matched VR panorama scheme according to the VR panorama scheme link.
10. The VR panorama scheme matching method based on intelligent image retrieval according to claim 9, wherein in step 3.3, before similarity calculation, feature vectors of the input image to be retrieved and feature vectors of the inter-function pictures in the inter-function table are subjected to dimension reduction processing by using a PCA algorithm, and feature vectors of the image to be retrieved and feature vectors of the inter-function pictures in the inter-function table are reduced to 256 dimensions.
CN202010809509.6A 2020-08-12 2020-08-12 VR panorama scheme matching method based on image intelligent retrieval Active CN112115290B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010809509.6A CN112115290B (en) 2020-08-12 2020-08-12 VR panorama scheme matching method based on image intelligent retrieval

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010809509.6A CN112115290B (en) 2020-08-12 2020-08-12 VR panorama scheme matching method based on image intelligent retrieval

Publications (2)

Publication Number Publication Date
CN112115290A CN112115290A (en) 2020-12-22
CN112115290B true CN112115290B (en) 2023-11-10

Family

ID=73804135

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010809509.6A Active CN112115290B (en) 2020-08-12 2020-08-12 VR panorama scheme matching method based on image intelligent retrieval

Country Status (1)

Country Link
CN (1) CN112115290B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107256246A (en) * 2017-06-06 2017-10-17 西安工程大学 PRINTED FABRIC image search method based on convolutional neural networks
WO2020145981A1 (en) * 2019-01-10 2020-07-16 Hewlett-Packard Development Company, L.P. Automated diagnoses of issues at printing devices based on visual data

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109635141B (en) * 2019-01-29 2021-04-27 京东方科技集团股份有限公司 Method, electronic device, and computer-readable storage medium for retrieving an image

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107256246A (en) * 2017-06-06 2017-10-17 西安工程大学 PRINTED FABRIC image search method based on convolutional neural networks
WO2020145981A1 (en) * 2019-01-10 2020-07-16 Hewlett-Packard Development Company, L.P. Automated diagnoses of issues at printing devices based on visual data

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
基于卷积神经网络的鸟类视频图像检索研究 [Research on bird video image retrieval based on convolutional neural networks]; 张惠凡 (Zhang Huifan); 罗泽 (Luo Ze); 科研信息化技术与应用 [E-Science Technology & Application], Issue 05; full text *

Also Published As

Publication number Publication date
CN112115290A (en) 2020-12-22

Similar Documents

Publication Publication Date Title
CN110866140B (en) Image feature extraction model training method, image searching method and computer equipment
CN111427995B (en) Semantic matching method, device and storage medium based on internal countermeasure mechanism
CN108875076B (en) Rapid trademark image retrieval method based on Attention mechanism and convolutional neural network
CN104317834B (en) A kind of across media sort methods based on deep neural network
CN101739428B (en) Method for establishing index for multimedia
CN109525892B (en) Video key scene extraction method and device
JP2010504593A (en) Extracting dominant colors from an image using a classification technique
CN110110800B (en) Automatic image annotation method, device, equipment and computer readable storage medium
KR20120053211A (en) Method and apparatus for multimedia search and method for pattern recognition
CN109961095B (en) Image labeling system and method based on unsupervised deep learning
CN110175249A (en) A kind of search method and system of similar pictures
CN106776896A (en) A kind of quick figure fused images search method
CN109063112A (en) A kind of fast image retrieval method based on multi-task learning deep semantic Hash, model and model building method
CN109711465A (en) Image method for generating captions based on MLL and ASCA-FR
CN105956631A (en) On-line progressive image classification method facing electronic image base
Wang et al. Aspect-ratio-preserving multi-patch image aesthetics score prediction
Shi et al. A benchmark and baseline for language-driven image editing
CN115457332A (en) Image multi-label classification method based on graph convolution neural network and class activation mapping
CN116310466A (en) Small sample image classification method based on local irrelevant area screening graph neural network
US20230072445A1 (en) Self-supervised video representation learning by exploring spatiotemporal continuity
CN108304588B (en) Image retrieval method and system based on k neighbor and fuzzy pattern recognition
CN112115290B (en) VR panorama scheme matching method based on image intelligent retrieval
CN110765305A (en) Medium information pushing system and visual feature-based image-text retrieval method thereof
CN113192108B (en) Man-in-loop training method and related device for vision tracking model
CN112101559B (en) Case crime name deducing method based on machine learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant