WO2020177378A1 - Text information feature extraction method and device, computer apparatus, and storage medium - Google Patents

Text information feature extraction method and device, computer apparatus, and storage medium Download PDF

Info

Publication number
WO2020177378A1
Authority
WO
WIPO (PCT)
Prior art keywords
text information
length
network
input
meta
Prior art date
Application number
PCT/CN2019/117424
Other languages
French (fr)
Chinese (zh)
Inventor
赵峰
王健宗
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司 filed Critical 平安科技(深圳)有限公司
Publication of WO2020177378A1 publication Critical patent/WO2020177378A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • This application relates to the field of information technology, and in particular to a method, device, computer equipment and storage medium for feature extraction of text information.
  • Convolutional neural networks have recently become a basic module of natural language processing. Despite their success, most existing convolutional neural networks use the same static filter, learned once, for all input sentences.
  • The biggest disadvantage of a static filter is that it is not text-dependent; it treats all types of text equally. For example, when a person reads a popular science article and a piece of current affairs news, the reading approach and the focus of reading are usually different: for current affairs news, the main information to extract is time, location, people, and events, whereas for popular science articles more weight should be given to concepts, logic, and causality.
  • A static filter can only treat all contextual information with the same weight, so it limits the accuracy of text recognition.
  • the embodiments of the present application provide a method, device, computer equipment, and storage medium for feature extraction of text information to solve the problems that the existing text recognition technology cannot adapt to the context and the accuracy of text recognition is poor.
  • a feature extraction method of text information includes:
  • the meta-network refers to a network used to generate a set of unique filters corresponding to the input text information
  • the length-adjusted text information is passed into the meta-network as input, and a set of unique filters corresponding to the text information is generated through the meta-network.
  • the unique filter refers to a filter related to the context of the length-adjusted text information
  • the length-adjusted text information is passed into the unique filter as input, and the feature vector matrix corresponding to the text information is extracted through the unique filter; each element in the feature vector matrix represents a feature of the text information.
  • the adjusting the length of the text information to be recognized to the input length of the meta-network includes:
  • the inputting the text information with the adjusted length as input to the meta network, and generating a set of unique filters corresponding to the text information through the meta network includes:
  • the vector matrix includes a plurality of word embedding vectors, and the length of each word embedding vector is equal;
  • the inputting the length-adjusted text information into the unique filter as input, and extracting the feature vector matrix corresponding to the text information through the unique filter includes:
  • the vector matrix includes a plurality of word embedding vectors, and the length of each word embedding vector is equal;
  • the method further includes:
  • a feature extraction device for text information includes:
  • a training module for setting up and training a meta-network where the meta-network refers to a network for generating a set of unique filters corresponding to the input text information
  • Information acquisition module for acquiring text information to be recognized
  • a length adjustment module configured to adjust the length of the text information to be recognized to the input length of the meta network
  • the filter generation module is configured to pass the length-adjusted text information as input into the meta-network and generate, through the meta-network, a set of unique filters corresponding to the text information, where the unique filter refers to a filter related to the context of the length-adjusted text information;
  • the feature extraction module is configured to pass the length-adjusted text information as input into the unique filter and extract, through the unique filter, the feature vector matrix corresponding to the text information, where each element in the feature vector matrix represents a feature of the text information.
  • the length adjustment module includes:
  • a length acquiring unit configured to acquire the input length of the meta-network, and determine whether the length of the text information to be recognized reaches the input length
  • the length adjustment unit is configured to, if the length of the text information to be recognized does not reach the input length, fill preset characters to the end of the text information to be recognized, so as to adjust the length of the text information to be recognized to the input length.
  • the filter generating module includes:
  • the first vectorization unit is configured to perform vectorization processing on the length-adjusted text information to obtain a vector matrix.
  • the vector matrix includes several word embedding vectors, and each word embedding vector has the same length;
  • the first convolution unit is configured to perform a convolution operation on the vector matrix through the meta-network to obtain a hidden layer vector of a specified length
  • the transposed convolution unit is configured to perform a transposed convolution operation on the hidden layer vector to obtain a group of unique filters corresponding to the length-adjusted text information.
  • a computer device includes a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, and the processor implements the following steps when executing the computer-readable instructions:
  • the meta-network refers to a network used to generate a set of unique filters corresponding to the input text information
  • the length-adjusted text information is passed into the meta-network as input, and a set of unique filters corresponding to the text information is generated through the meta-network.
  • the unique filter refers to a filter related to the context of the length-adjusted text information
  • the length-adjusted text information is passed into the unique filter as input, and the feature vector matrix corresponding to the text information is extracted through the unique filter; each element in the feature vector matrix represents a feature of the text information.
  • One or more non-volatile readable storage media storing computer readable instructions.
  • when the computer-readable instructions are executed by one or more processors, the one or more processors perform the following steps:
  • the meta-network refers to a network used to generate a set of unique filters corresponding to the input text information
  • the length-adjusted text information is passed into the meta-network as input, and a set of unique filters corresponding to the text information is generated through the meta-network.
  • the unique filter refers to a filter related to the context of the length-adjusted text information
  • the length-adjusted text information is passed into the unique filter as input, and the feature vector matrix corresponding to the text information is extracted through the unique filter; each element in the feature vector matrix represents a feature of the text information.
  • Fig. 1 is a flowchart of a method for extracting features of text information in an embodiment of the present application
  • Fig. 2 is a flowchart of step S103 in the method for extracting features of text information in an embodiment of the present application
  • Fig. 3 is a flowchart of step S104 in the method for extracting features of text information in an embodiment of the present application
  • Fig. 4 is a flowchart of step S105 in the method for extracting features of text information in an embodiment of the present application
  • Fig. 5 is a flowchart of a method for feature extraction of text information in an embodiment of the present application
  • Fig. 6 is a functional block diagram of a feature extraction device for text information in an embodiment of the present application.
  • Fig. 7 is a schematic diagram of a computer device in an embodiment of the present application.
  • the feature extraction method of text information provided by the embodiment of the application is applied to a server.
  • the server can be implemented by an independent server or a server cluster composed of multiple servers.
  • a method for feature extraction of text information is provided, which includes the following steps:
  • In step S101, a meta-network is set up and trained.
  • the meta-network refers to a network for generating a set of unique filters corresponding to the input text information.
  • an embodiment of the present application proposes a meta-network to learn context.
  • the meta-network refers to a network used to generate a set of unique filters corresponding to the input text information; it can generate a weight matrix related to the context of the input text information according to that input, and is thus a network about a network.
  • the filters generated by the meta-network are customized for different types of text information, and are suitable for the different types of text information, thus changing the current situation of using the same filter to treat all types of text equally in the previous convolutional neural networks. Since the filter is related to the context of the text information, the extracted features can be more accurate.
  • the meta-network may be any differentiable deep network.
  • the meta-network is obtained in advance through training on a large number of training text sets.
  • a pre-trained meta-network is obtained, and the parameters of the meta-network are fine-tuned to obtain a meta-network suitable for this text information.
  • In step S102, the text information to be recognized is obtained.
  • the text information is a sentence composed according to specified linguistic cohesion and semantic coherence rules, including but not limited to verbal text information and literary text information.
  • the server can obtain the text information to be recognized according to actual needs or application scenarios. For example, the server obtains the text information to be recognized from a preset database, and a large amount of text information is collected in advance in the preset database. Alternatively, the server obtains the voice information input by the user through the microphone of the client, and then converts the voice information into text to obtain the text information to be recognized. Or, the server obtains the image information through the camera function of the client, and then performs OCR text recognition on the image information to obtain the text information to be recognized. It is understandable that the server can also obtain the text information to be recognized in a variety of ways, which will not be repeated here.
  • To increase the value of the meta-network output and improve the accuracy of text feature extraction, the text information to be recognized preferably has the same or a similar distribution as the training text and its context; that is, the text information to be recognized is the same as the training text or is part of the training text, and the two are the same or similar in style, type, and semantics.
  • In step S103, the length of the text information to be recognized is adjusted to the input length of the meta-network.
  • the length of the text information refers to the length of the character string of the text information.
  • the embodiment of the present application adjusts the length of the text information to be recognized to the input length of the meta-network before inputting the text information to be recognized into the meta-network.
  • the input length is the length of the string of the input parameters of the meta-network set in advance.
  • the distribution of the string lengths of all training texts can be counted in advance, and then a maximum length is selected as the input length of the meta-network for unification.
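  • As a purely illustrative sketch (the function name and the use of plain character counts are assumptions, not details from the application), the input length could be chosen from the training texts as follows:

```python
# Minimal sketch: pick the meta-network input length from the training texts.
# Assumption: "length" means the character-string length described above.
def choose_input_length(training_texts):
    lengths = [len(text) for text in training_texts]   # string-length distribution
    return max(lengths)                                 # the maximum length is used for unification

# choose_input_length(["the weather is good", "concepts get more weight"]) -> 24
```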
  • The step S103 of adjusting the length of the text information to be recognized to the input length of the meta-network includes:
  • In step S1031, the input length of the meta-network is acquired, and it is determined whether the length of the text information to be recognized reaches the input length.
  • the embodiment of the present application obtains the character string length of the text information to be recognized by calculating the number of characters of the text information to be recognized. Then, the length of the character string is compared with the input length to determine whether the length of the text information to be recognized reaches the input length.
  • If so, that is, if the length of the text information to be recognized reaches the input length, the process jumps to step S104, in which a set of unique filters corresponding to the text information to be recognized is generated through the meta-network.
  • In step S1032, if not, preset characters are filled to the end of the text information to be recognized, so as to adjust the length of the text information to be recognized to the input length; the process then jumps to step S104.
  • this embodiment of the present application uses preset characters to complement the text information to be recognized, so as to adjust the text information to be recognized to the input length.
  • the preset character is a special character representing blank space for meta-networks and convolutional neural networks, such as NUL.
  • The following is an example of adjusting the length of the text information to the input length of the meta-network in step S103, with a minimal padding sketch given afterwards. Assume that the distribution of the string lengths of all training texts has been counted in advance and a maximum length, for example 7, is selected as the input length. If the text information to be recognized is "the weather is really good today", its length obtained through step S1031 is 6, which does not reach the input length of 7. Then, in step S1032, the preset character NUL is used to pad the text information to be recognized, and the length-adjusted text information "the weather is really good today NUL" is obtained.
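  • For illustration only, a minimal padding sketch along the lines of steps S1031 and S1032 might look as follows (each word is treated as one unit, as in the translated example above; the names are assumptions, not part of the application):

```python
PAD = "NUL"  # the preset character representing blank space

def pad_to_input_length(words, input_length, pad=PAD):
    """Step S1031: compare the current length with the input length.
    Step S1032: if it falls short, append the preset character to the end."""
    if len(words) >= input_length:
        return words                                    # already reaches the input length; go to S104
    return words + [pad] * (input_length - len(words))  # fill preset characters to the end

# pad_to_input_length(["the", "weather", "is", "really", "good", "today"], 7)
# -> ["the", "weather", "is", "really", "good", "today", "NUL"]
```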
  • In step S104, the length-adjusted text information is passed into the meta-network as input, and a set of unique filters corresponding to the text information is generated through the meta-network.
  • The unique filter refers to a filter related to the context of the length-adjusted text information.
  • the meta-network can learn convolutional neural network filters that are adjusted according to the input parameter.
  • the input parameter is the text information to be recognized after the length is adjusted
  • the filter is a set of unique filters corresponding to the text information to be recognized.
  • the unique filter is related to the context of the text information to be recognized, so that using the unique filter can refine different text information to be recognized and extract different features.
  • In order to solve the problem of variable input length, in this embodiment of the present application the meta-network generates unique filters of a predefined size.
  • The step S104 of passing the length-adjusted text information as input into the meta-network and generating a set of unique filters corresponding to the text information through the meta-network includes:
  • In step S1041, the length-adjusted text information is vectorized to obtain a vector matrix.
  • the vector matrix includes several word embedding vectors, and each word embedding vector has the same length.
  • the embodiment of the present application performs vectorization processing on the length-adjusted text information to obtain a vector matrix corresponding to the length-adjusted text information as the vector matrix corresponding to the text information to be recognized.
  • the vector matrix includes several word embedding vectors.
  • the word embedding vector refers to the word vector of each word after the length-adjusted text information is segmented, that is, each word in the length-adjusted text information is mapped to a column vector in a vector matrix.
  • the length of the word embedding vector is pre-designated, that is, for text information to be recognized of different lengths, the length of the corresponding word embedding vector is the same.
  • the preset character is a special character representing blank space for the meta-network and convolutional neural network.
  • the embodiment of the present application converts the length-adjusted text information into a vector matrix, which facilitates the recognition and learning of subsequent convolutional neural networks, that is, it facilitates the subsequent execution of convolution operations and transposed convolution operations.
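  • A minimal vectorization sketch is shown below; the embedding dimension, the vocabulary size, and the use of a PyTorch nn.Embedding lookup table are assumptions for illustration, not details taken from the application.

```python
import torch
import torch.nn as nn

d, vocab_size = 50, 10000                 # embedding dimension and vocabulary size (assumed)
embedding = nn.Embedding(vocab_size, d)   # word-embedding lookup table

def vectorize(token_ids):
    """Map each word of the length-adjusted text (given as token ids) to a
    d-dimensional word embedding vector and stack the vectors into a matrix
    with one column per word."""
    ids = torch.tensor(token_ids)         # shape (T,)
    return embedding(ids).t()             # vector matrix of shape (d, T)

# X = vectorize([12, 7, 3, 981, 40, 55, 0])  # 7 padded words -> matrix of shape (50, 7)
```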
  • In step S1042, a convolution operation is performed on the vector matrix through the meta-network to obtain a hidden layer vector of a specified length.
  • Specifically, the convolution operation is performed on the vector matrix through a preset convolution layer in the meta-network; that is, the vector matrix is convolved by the convolution-layer filter, and the dot product between the filter and the vector matrix is calculated to extract higher-level features and obtain a hidden layer vector of a specified length.
  • the parameters that make up the convolutional layer filter can be optimized by the loss function.
  • In step S1043, a transposed convolution operation is performed on the hidden layer vector to obtain a set of unique filters corresponding to the length-adjusted text information.
  • Transposed convolution is also referred to as deconvolution.
  • a transposed convolutional layer is superimposed on the hidden layer described in step S1042.
  • The hidden layer vector is passed through the transposed convolution layer to perform a transposed convolution operation, generating a set of convolution kernels; this set of convolution kernels is used as the set of unique filters corresponding to the length-adjusted text information, that is, the set of unique filters corresponding to the text information to be recognized.
  • the parameters that make up the transposed convolutional layer can be optimized by the loss function.
  • the unique filter is a filter related to the context of the text information to be recognized, customized for the text information to be recognized, and suitable for the text information to be recognized.
  • The hidden layer vector of a specified length is obtained from the vector matrix, so that the hidden layer vector is independent of the length of the text information to be recognized; this ensures that the filters generated through the meta-network have the same dimensions and size for every piece of text information to be recognized, that is, the size of the filters generated through the meta-network remains consistent.
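  • The following is a minimal PyTorch sketch of one way steps S1041 to S1043 could be realised: a convolution layer maps the vector matrix to a hidden layer vector, and a transposed convolution layer maps that hidden vector to K filters of the predefined size (d, h). The layer sizes, the ReLU activation, and the global pooling used to pin the hidden vector to a fixed length are illustrative assumptions, not details taken from the application.

```python
import torch
import torch.nn as nn

class MetaNetwork(nn.Module):
    """Generates a set of K context-dependent (unique) filters of predefined
    size (d, h) from the vector matrix of a length-adjusted text."""
    def __init__(self, d=50, c=64, K=100, h=5, w=3):
        super().__init__()
        self.K, self.d, self.h = K, d, h
        self.conv = nn.Conv1d(d, c, kernel_size=w)                  # step S1042: convolution layer
        self.pool = nn.AdaptiveMaxPool1d(1)                         # pins the hidden-vector length (assumption)
        self.deconv = nn.ConvTranspose1d(c, K * d, kernel_size=h)   # step S1043: transposed convolution layer

    def forward(self, X):                                 # X: (batch, d, T), the vector matrix
        hidden = torch.relu(self.conv(X))                 # (batch, c, T - w + 1)
        hidden = self.pool(hidden)                        # (batch, c, 1): independent of T
        filters = self.deconv(hidden)                     # (batch, K * d, h)
        return filters.view(-1, self.K, self.d, self.h)   # one set of K unique filters per text

# X = torch.randn(1, 50, 7)         # vector matrix of one length-adjusted text (d=50, T=7)
# filters = MetaNetwork()(X)        # filters.shape == (1, 100, 50, 5), whatever T is
```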
  • The parameters in the convolutional layer and the transposed convolutional layer are all jointly differentiable; therefore, when training the meta-network, the parameters of the convolutional layer and the transposed convolutional layer can be optimized and updated through the gradient back-propagation algorithm.
  • The idea of the back-propagation algorithm (BP algorithm) is to calculate the error from the outputs of the convolutional layer and the transposed convolutional layer and to propagate the error backward step by step; it mainly consists of excitation propagation and weight updating.
  • the steps are repeated iteratively until the feature vector matrix of the training text reaches the predetermined error expectation value.
  • the backpropagation algorithm can further optimize the parameters of the meta-network and improve the accuracy of the meta-network to generate unique filters.
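  • Because every operation above is differentiable, the meta-network can be trained end to end. A hypothetical training step building on the MetaNetwork sketch above is given below; the cross-entropy loss, the Adam optimizer, and the linear classifier on top of the pooled features are assumptions chosen only to show where the error is propagated backward through the convolution and transposed-convolution parameters.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

meta_net = MetaNetwork()                   # the sketch from step S104 above
classifier = nn.Linear(100, 4)             # K = 100 pooled features -> N = 4 categories (assumed)
params = list(meta_net.parameters()) + list(classifier.parameters())
optimizer = torch.optim.Adam(params, lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

def train_step(X, label):
    """One excitation-propagation / weight-update cycle of back-propagation."""
    filters = meta_net(X)[0]                        # (K, d, h) unique filters for this text
    feature_map = F.conv1d(X, filters)              # (1, K, T - h + 1)
    features = feature_map.max(dim=2).values        # maximum of each row -> (1, K)
    loss = loss_fn(classifier(features), label)     # error of the current prediction
    optimizer.zero_grad()
    loss.backward()                                 # propagate the error backward step by step
    optimizer.step()                                # update conv and transposed-conv parameters
    return loss.item()

# train_step(torch.randn(1, 50, 7), torch.tensor([2]))
```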
  • In step S105, the length-adjusted text information is passed into the unique filter as input, and the feature vector matrix corresponding to the text information is extracted through the unique filter; each element in the feature vector matrix represents a feature of the text information.
  • After obtaining the unique filter corresponding to the length-adjusted text information through the meta-network, the unique filter is used to identify the length-adjusted text information. Specifically, the length-adjusted text information is passed into the unique filter as input, the output of the unique filter is then obtained, and this output is used as the feature vector matrix corresponding to the text information to be recognized.
  • the feature vector matrix includes feature information of the text information to be recognized, that is, semantic information.
  • The step S105 of passing the length-adjusted text information as input into the unique filter and extracting the feature vector matrix corresponding to the text information through the unique filter includes:
  • In step S1051, the length-adjusted text information is vectorized to obtain a vector matrix.
  • the vector matrix includes several word embedding vectors, each of which has the same length.
  • Specifically, the length-adjusted text information is vectorized: each word in the text information is mapped to a column vector to obtain the word embedding vector of that word, and the word embedding vectors are combined to obtain the vector matrix corresponding to the length-adjusted text information.
  • the length of the word embedding vector is pre-designated, that is, for text information to be recognized of different lengths, the length of the corresponding word embedding vector is the same.
  • the embodiment of the present application converts the text information to be recognized into a vector matrix, which facilitates the subsequent execution of convolution operations.
  • For example, suppose the length of the length-adjusted text information is T and the words it contains are x_1, x_2, ..., x_T. Vectorization then yields a vector matrix X ∈ R^(d×T), where each column of X is the d-dimensional word embedding vector corresponding to one word in the length-adjusted text information.
  • In step S1052, a convolution operation is performed on the vector matrix through the unique filter to extract a feature map corresponding to the text information.
  • Specifically, the vector matrix is passed as input into the unique filter to perform the convolution operation; that is, the vector matrix is convolved with the unique filter, and the dot product between the filter and the vector matrix is calculated to extract higher-level features and obtain a feature map corresponding to the text information.
  • Denoting the set of K unique filters by W and the filter width by h, each column of the feature map is computed as c_i = f(W ⊗ X_(i:i+h-1) + b), i = 1, 2, ..., T-h+1, where ⊗ represents a convolution operator, b represents a bias vector with dimension K, and f represents a nonlinear activation function, such as ReLU.
  • In step S1053, a pooling operation is performed on the feature map, the maximum value of each row in the feature map is extracted as a main feature, and the feature vector matrix corresponding to the text information is obtained.
  • Specifically, the feature map is passed into the max pooling layer as input; the max pooling layer extracts the maximum value of each row in the feature map as a main feature, and all the main features are combined into a K-dimensional vector, which is used as the feature vector matrix corresponding to the length-adjusted text information, that is, the feature vector matrix corresponding to the text information to be recognized.
  • In this way, unimportant features are discarded through the max pooling layer and only the most prominent features are retained; the feature map is reduced in size, which simplifies the computational complexity on the one hand and improves recognition accuracy on the other.
  • Each element in the feature vector matrix represents the feature of the text information, that is, semantic information.
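  • A minimal sketch of step S105 (steps S1051 to S1053) is given below; the ReLU non-linearity follows the formula above, and the tensor shapes are assumptions consistent with the earlier sketches.

```python
import torch
import torch.nn.functional as F

def extract_feature_vector(X, unique_filters, b=None):
    """Convolve the vector matrix X (shape (d, T)) of the length-adjusted text with
    its K unique filters (shape (K, d, h)), apply the non-linearity f (ReLU here),
    then keep the maximum of each row of the feature map as the main features."""
    feature_map = F.conv1d(X.unsqueeze(0), unique_filters, bias=b)   # (1, K, T - h + 1)
    feature_map = torch.relu(feature_map)                            # c_i = f(W ⊗ X_(i:i+h-1) + b)
    return feature_map.max(dim=2).values.squeeze(0)                  # (K,) feature vector matrix

# features = extract_feature_vector(torch.randn(50, 7), torch.randn(100, 50, 5))
# features.shape == torch.Size([100])
```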
  • The unique filters generated by the meta-network are related to the context of the text information to be recognized, and the unique filters corresponding to different pieces of text information to be recognized are not the same, that is, the weight matrices in the convolutional neural network are not the same. Obtaining the feature vector matrix of the text information to be recognized through the unique filter therefore greatly improves the accuracy of feature extraction.
  • In summary, the embodiment of the application sets up and trains a meta-network, which refers to a network used to generate a set of unique filters corresponding to the input text information. When recognizing text information, the meta-network is obtained according to the text information to be recognized, and the length of the text information to be recognized is adjusted to the input length of the meta-network; the length-adjusted text information is then passed into the meta-network as input, and a set of unique filters corresponding to the text information is generated through the meta-network, the unique filter referring to a filter related to the context of the length-adjusted text information; the length-adjusted text information is passed into the unique filter as input, and the feature vector matrix corresponding to the text information is extracted through the unique filter, each element in the feature vector matrix representing a feature of the text information.
  • In this way, a set of filters corresponding to the acquired text information is used to recognize the text information, which solves the problems that existing text recognition technology cannot adapt to the context and has poor text recognition accuracy, and greatly improves the accuracy of text information feature extraction.
  • the feature vector matrix corresponds to the text information to be recognized, and includes several elements, and each element represents a feature extracted from the text information, that is, semantic information. Compared with the text information to be recognized, the dimension of the feature vector matrix is greatly reduced.
  • the classification of the text information to be recognized can be further realized based on the feature vector matrix.
  • the method may further include:
  • In step S106, the feature vector matrix is input to the fully connected layer, and the output of the fully connected layer is then input to the preset Softmax classifier.
  • In step S107, the category corresponding to the text information is obtained according to the output of the Softmax classifier.
  • The embodiment of the present application refines the feature vector matrix through a fully connected layer, converting it into a vector of a specified dimension so that the subsequent Softmax classifier can perform the classification operation.
  • Specifically, the fully connected layer sets K*N weight coefficients and N offset values in advance according to the number N of classification categories, where K is the dimension of the output of the previous layer, that is, the dimension of the feature vector matrix. The feature vector matrix is then multiplied by the weight matrix of the fully connected layer, the offset values are added, and the results are combined into a one-dimensional vector, thereby obtaining the output of the fully connected layer.
  • the output of the fully connected layer is input into the preset softmax classifier.
  • the Softmax classifier is used to deal with multi-classification problems, and its output needs to be processed numerically through the Softmax function.
  • The Softmax function is defined as S_n = e^(V_n) / (e^(V_1) + e^(V_2) + ... + e^(V_N)), where V_n represents an element in the one-dimensional vector output by the fully connected layer, n represents the category index, n = 1, 2, 3, ..., N, and S_n represents the ratio of the exponential of the current element V_n to the sum of the exponentials of all elements.
  • The Softmax classifier converts the output values of the multi-category fully connected layer into relative probabilities, whose elements represent the relative probabilities of the different categories and are easy to understand and compare. Based on the output of the Softmax classifier, the category corresponding to the element with the highest probability is the most likely, so the text information to be recognized can be clearly predicted to belong to that category.
  • the category may be an intention category, such as consent, rejection, waiting, etc., or a webpage category, an emotion category, a user comment category, etc., which are not limited here.
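  • For steps S106 and S107, a hypothetical classification head might look like the sketch below; the number of categories N, the feature dimension K, and the use of PyTorch's nn.Linear and torch.softmax are illustrative assumptions.

```python
import torch
import torch.nn as nn

K, N = 100, 4                 # feature dimension and number of categories (assumed)
fc = nn.Linear(K, N)          # fully connected layer: K*N weight coefficients + N offset values

def classify(feature_vector):
    """Step S106: pass the feature vector through the fully connected layer and the
    Softmax classifier. Step S107: return the category with the highest probability."""
    V = fc(feature_vector.unsqueeze(0))    # (1, N): one value V_n per category
    S = torch.softmax(V, dim=1)            # S_n = e^(V_n) / (e^(V_1) + ... + e^(V_N))
    return int(S.argmax(dim=1))            # index of the most probable category (e.g. consent, rejection, ...)

# classify(torch.randn(100)) -> an integer category index in [0, N)
```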
  • the unique filter generated through the meta-network is related to the context of the text information to be recognized, and the unique filters corresponding to different text information to be recognized are different, that is, the weight matrix is different.
  • Obtaining the feature vector matrix of the text information to be recognized through the unique filter greatly improves the accuracy of feature extraction; classifying based on the feature vector matrix further improves the accuracy of classification.
  • a feature extraction device for text information is provided, and the feature extraction device for text information corresponds one-to-one to the feature extraction method for text information in the foregoing embodiment.
  • the feature extraction device for text information includes a training module 61, an information acquisition module 62, a length adjustment module 63, a filter generation module 64, and a feature extraction module 65.
  • the detailed description of each functional module is as follows:
  • the training module 61 is used to set up and train a meta-network, where the meta-network refers to a network used to generate a set of unique filters corresponding to the input text information;
  • the information acquisition module 62 is used to acquire the text information to be recognized
  • the length adjustment module 63 is configured to adjust the length of the text information to be recognized to the input length of the meta network
  • the filter generating module 64 is configured to pass the length-adjusted text information as input into the meta-network and generate, through the meta-network, a set of unique filters corresponding to the text information, where the unique filter refers to a filter related to the context of the length-adjusted text information;
  • the feature extraction module 65 is configured to pass the length-adjusted text information as input into the unique filter and extract, through the unique filter, the feature vector matrix corresponding to the text information, where each element in the feature vector matrix represents a feature of the text information.
  • the length adjustment module 63 includes:
  • a length acquiring unit configured to acquire the input length of the meta-network, and determine whether the length of the text information to be recognized reaches the input length
  • the length adjustment unit is configured to, if the length of the text information to be recognized does not reach the input length, fill preset characters to the end of the text information to be recognized, so as to adjust the length of the text information to be recognized to the input length.
  • the filter generating module 64 includes:
  • the first vectorization unit is configured to perform vectorization processing on the length-adjusted text information to obtain a vector matrix.
  • the vector matrix includes a plurality of word embedding vectors, and each word embedding vector has the same length;
  • the first convolution unit is configured to perform a convolution operation on the vector matrix through the meta-network to obtain a hidden layer vector of a specified length
  • the transposed convolution unit is configured to perform a transposed convolution operation on the hidden layer vector to obtain a group of unique filters corresponding to the length-adjusted text information.
  • the feature extraction module 65 includes:
  • the second vectorization unit is configured to perform vectorization processing on the length-adjusted text information to obtain a vector matrix.
  • the vector matrix includes a plurality of word embedding vectors, and each word embedding vector has the same length;
  • a second convolution unit configured to perform a convolution operation on the vector matrix through the unique filter to extract a feature map corresponding to the text information
  • the pooling unit is configured to perform a pooling operation on the feature map, extract the maximum value of each row in the feature map as a main feature, and obtain a feature vector matrix corresponding to the text information.
  • the device further includes:
  • the classification module is used to pass the feature vector matrix as input into the fully connected layer, then pass the output of the fully connected layer as input into the preset Softmax classifier, and obtain the category corresponding to the text information according to the output of the Softmax classifier.
  • Each module in the apparatus for extracting features of text information described above can be implemented in whole or in part by software, hardware, and a combination thereof.
  • the foregoing modules may be embedded in the form of hardware or independent of the processor in the computer device, or may be stored in the memory of the computer device in the form of software, so that the processor can call and execute the operations corresponding to the foregoing modules.
  • a computer device is provided.
  • the computer device may be a server, and its internal structure diagram may be as shown in FIG. 7.
  • the computer equipment includes a processor, a memory, a network interface and a database connected through a system bus.
  • the processor of the computer device is used to provide calculation and control capabilities.
  • the memory of the computer device includes a non-volatile storage medium and an internal memory.
  • the non-volatile storage medium stores an operating system, computer readable instructions, and a database.
  • the internal memory provides an environment for the operation of the operating system and computer-readable instructions in the non-volatile storage medium.
  • the network interface of the computer device is used to communicate with an external terminal through a network connection.
  • the computer readable instruction is executed by the processor to realize a feature extraction method of text information.
  • a computer device including a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, where the processor implements the following steps when executing the computer-readable instructions:
  • the meta-network refers to a network used to generate a set of unique filters corresponding to the input text information
  • the length-adjusted text information is passed into the meta-network as input, and a set of unique filters corresponding to the text information is generated through the meta-network.
  • the unique filter refers to a filter related to the context of the length-adjusted text information
  • the length-adjusted text information is passed into the unique filter as input, and the feature vector matrix corresponding to the text information is extracted through the unique filter; each element in the feature vector matrix represents a feature of the text information.
  • one or more non-volatile readable storage media storing computer readable instructions are provided.
  • when the computer-readable instructions are executed by one or more processors, the one or more processors perform the following steps:
  • the meta-network refers to a network used to generate a set of unique filters corresponding to the input text information
  • the length-adjusted text information is passed into the meta-network as input, and a set of unique filters corresponding to the text information is generated through the meta-network.
  • the unique filter refers to a filter related to the context of the length-adjusted text information
  • the length-adjusted text information is passed into the unique filter as input, and the feature vector matrix corresponding to the text information is extracted through the unique filter; each element in the feature vector matrix represents a feature of the text information.
  • Non-volatile memory may include read only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
  • Volatile memory may include random access memory (RAM) or external cache memory.
  • RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), memory bus dynamic RAM (RDRAM), etc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present application discloses a text information feature extraction method and device, a computer apparatus, and a storage medium. The method comprises: setting up and training a meta-network, wherein the meta-network is used to generate a group of unique filters corresponding to inputted text information; adjusting the length of text information to be recognized to the input length of the meta-network; passing the length-adjusted text information into the meta-network, and generating a group of unique filters corresponding to the text information by means of the meta-network, wherein a unique filter refers to a filter related to the context of the length-adjusted text information; and passing the length-adjusted text information into the unique filter, and extracting a feature vector matrix corresponding to the text information by means of the unique filter. The present application resolves the issue in which existing text recognition techniques cannot adapt to the context, and thus text recognition accuracy is poor.

Description

Feature extraction method, device, computer equipment and storage medium for text information
This application is based on the Chinese invention patent application filed on March 6, 2019 with the application number 201910168231.6, titled "Feature extraction method, device, computer equipment and storage medium of text information", and claims its priority.
Technical field
This application relates to the field of information technology, and in particular to a method, device, computer equipment and storage medium for feature extraction of text information.
Background
Convolutional neural networks have recently become a basic module of natural language processing. Despite their success, most existing convolutional neural networks use the same static filter, learned once, for all input sentences. The biggest disadvantage of a static filter is that it is not text-dependent; it treats all types of text equally. For example, when a person reads a popular science article and a piece of current affairs news, the reading approach and the focus of reading are usually different: for current affairs news, the main information to extract is time, location, people, and events, whereas for popular science articles more weight should be given to concepts, logic, and causality. A static filter can only treat all contextual information with the same weight, and is therefore limited in the accuracy of text recognition.
It can be seen that finding a method that can adapt to the context and improve the accuracy of text recognition has become an urgent technical problem in this field.
Summary of the invention
The embodiments of the present application provide a method, device, computer equipment, and storage medium for feature extraction of text information, to solve the problems that the existing text recognition technology cannot adapt to the context and the accuracy of text recognition is poor.
A feature extraction method for text information includes:
setting up and training a meta-network, where the meta-network refers to a network used to generate a set of unique filters corresponding to the input text information;
obtaining the text information to be recognized;
adjusting the length of the text information to be recognized to the input length of the meta-network;
passing the length-adjusted text information into the meta-network as input, and generating, through the meta-network, a set of unique filters corresponding to the text information, where the unique filter refers to a filter related to the context of the length-adjusted text information;
passing the length-adjusted text information into the unique filter as input, and extracting, through the unique filter, the feature vector matrix corresponding to the text information, where each element in the feature vector matrix represents a feature of the text information.
Optionally, the adjusting the length of the text information to be recognized to the input length of the meta-network includes:
acquiring the input length of the meta-network, and determining whether the length of the text information to be recognized reaches the input length;
if not, filling preset characters to the end of the text information to be recognized, so as to adjust the length of the text information to be recognized to the input length.
Optionally, the passing the length-adjusted text information into the meta-network as input and generating, through the meta-network, a set of unique filters corresponding to the text information includes:
performing vectorization processing on the length-adjusted text information to obtain a vector matrix, where the vector matrix includes a plurality of word embedding vectors and each word embedding vector has the same length;
performing a convolution operation on the vector matrix through the meta-network to obtain a hidden layer vector of a specified length;
performing a transposed convolution operation on the hidden layer vector to obtain a set of unique filters corresponding to the length-adjusted text information.
Optionally, the passing the length-adjusted text information into the unique filter as input and extracting, through the unique filter, the feature vector matrix corresponding to the text information includes:
performing vectorization processing on the length-adjusted text information to obtain a vector matrix, where the vector matrix includes a plurality of word embedding vectors and each word embedding vector has the same length;
performing a convolution operation on the vector matrix through the unique filter to extract a feature map corresponding to the text information;
performing a pooling operation on the feature map, extracting the maximum value of each row in the feature map as a main feature, and obtaining the feature vector matrix corresponding to the text information.
Optionally, after the feature vector matrix corresponding to the text information is generated through the unique filter, the method further includes:
passing the feature vector matrix as input into the fully connected layer, and then passing the output of the fully connected layer as input into the preset Softmax classifier;
acquiring the category corresponding to the text information according to the output of the Softmax classifier.
A feature extraction device for text information includes:
a training module, configured to set up and train a meta-network, where the meta-network refers to a network used to generate a set of unique filters corresponding to the input text information;
an information acquisition module, configured to acquire the text information to be recognized;
a length adjustment module, configured to adjust the length of the text information to be recognized to the input length of the meta-network;
a filter generation module, configured to pass the length-adjusted text information as input into the meta-network and generate, through the meta-network, a set of unique filters corresponding to the text information, where the unique filter refers to a filter related to the context of the length-adjusted text information;
a feature extraction module, configured to pass the length-adjusted text information as input into the unique filter and extract, through the unique filter, the feature vector matrix corresponding to the text information, where each element in the feature vector matrix represents a feature of the text information.
Optionally, the length adjustment module includes:
a length acquiring unit, configured to acquire the input length of the meta-network and determine whether the length of the text information to be recognized reaches the input length;
a length adjustment unit, configured to, if the length of the text information to be recognized does not reach the input length, fill preset characters to the end of the text information to be recognized, so as to adjust the length of the text information to be recognized to the input length.
Optionally, the filter generation module includes:
a first vectorization unit, configured to perform vectorization processing on the length-adjusted text information to obtain a vector matrix, where the vector matrix includes several word embedding vectors and each word embedding vector has the same length;
a first convolution unit, configured to perform a convolution operation on the vector matrix through the meta-network to obtain a hidden layer vector of a specified length;
a transposed convolution unit, configured to perform a transposed convolution operation on the hidden layer vector to obtain a set of unique filters corresponding to the length-adjusted text information.
A computer device includes a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, and the processor implements the following steps when executing the computer-readable instructions:
setting up and training a meta-network, where the meta-network refers to a network used to generate a set of unique filters corresponding to the input text information;
obtaining the text information to be recognized;
adjusting the length of the text information to be recognized to the input length of the meta-network;
passing the length-adjusted text information into the meta-network as input, and generating, through the meta-network, a set of unique filters corresponding to the text information, where the unique filter refers to a filter related to the context of the length-adjusted text information;
passing the length-adjusted text information into the unique filter as input, and extracting, through the unique filter, the feature vector matrix corresponding to the text information, where each element in the feature vector matrix represents a feature of the text information.
One or more non-volatile readable storage media storing computer-readable instructions are provided, where, when the computer-readable instructions are executed by one or more processors, the one or more processors perform the following steps:
setting up and training a meta-network, where the meta-network refers to a network used to generate a set of unique filters corresponding to the input text information;
obtaining the text information to be recognized;
adjusting the length of the text information to be recognized to the input length of the meta-network;
passing the length-adjusted text information into the meta-network as input, and generating, through the meta-network, a set of unique filters corresponding to the text information, where the unique filter refers to a filter related to the context of the length-adjusted text information;
passing the length-adjusted text information into the unique filter as input, and extracting, through the unique filter, the feature vector matrix corresponding to the text information, where each element in the feature vector matrix represents a feature of the text information.
The details of one or more embodiments of the present application are presented in the following drawings and description, and other features and advantages of the present application will become apparent from the description, drawings and claims.
Description of the drawings
In order to explain the technical solutions of the embodiments of the present application more clearly, the following briefly introduces the drawings that need to be used in the description of the embodiments. Obviously, the drawings in the following description are only some embodiments of the present application; for those of ordinary skill in the art, other drawings can be obtained based on these drawings without creative labor.
Fig. 1 is a flowchart of a method for extracting features of text information in an embodiment of the present application;
Fig. 2 is a flowchart of step S103 in the method for extracting features of text information in an embodiment of the present application;
Fig. 3 is a flowchart of step S104 in the method for extracting features of text information in an embodiment of the present application;
Fig. 4 is a flowchart of step S105 in the method for extracting features of text information in an embodiment of the present application;
Fig. 5 is a flowchart of a method for feature extraction of text information in an embodiment of the present application;
Fig. 6 is a functional block diagram of a feature extraction device for text information in an embodiment of the present application;
Fig. 7 is a schematic diagram of a computer device in an embodiment of the present application.
具体实施方式detailed description
下面将结合本申请实施例中的附图,对本申请实施例中的技术方案进行清楚、完整地描述,显然,所描述的实施例是本申请一部分实施例,而不是全部的实施例。基于本申请中的实施例,本领域普通技术人员在没有作出创造性劳动前提下所获得的所有其他实施例,都属于本申请保护的范围。The technical solutions in the embodiments of the present application will be described clearly and completely in conjunction with the accompanying drawings in the embodiments of the present application. Obviously, the described embodiments are part of the embodiments of the present application, rather than all of them. Based on the embodiments in this application, all other embodiments obtained by those of ordinary skill in the art without creative work fall within the protection scope of this application.
本申请实施例提供的文本信息的特征提取方法应用于服务器。所述服务器可以用独立的服务器或者是多个服务器组成的服务器集群来实现。在一实施例中,如图1所示,提 供一种文本信息的特征提取方法,包括如下步骤:The feature extraction method of text information provided by the embodiment of the application is applied to a server. The server can be implemented by an independent server or a server cluster composed of multiple servers. In one embodiment, as shown in Figure 1, a method for feature extraction of text information is provided, which includes the following steps:
在步骤S101中,设置并训练元网络,所述元网络是指用于生成与所输入的文本信息对应的一组唯一过滤器的网络。In step S101, a meta-network is set up and trained. The meta-network refers to a network for generating a set of unique filters corresponding to the input text information.
在这里，为了解决大部分现存的卷积神经网络采用对所有待识别的文本信息应用习得的相同的静态过滤器的问题，本申请实施例提出了一种使用元网络来学习上下文语境的卷积神经网络，并应用于文本处理。其中，所述元网络是指用于生成与所输入的文本信息对应的一组唯一过滤器的网络，能够根据所输入的文本信息生成与所输入的文本信息的上下文语境相关的权重矩阵，是一个关于网络的网络。元网络所生成的过滤器是针对不同类型的文本信息定制的，适用于该不同类型的文本信息，从而改变了以往的卷积神经网络中使用同一过滤器同等对待所有类型的文本的现状。由于所述过滤器与文本信息的上下文语境相关，使得提取到的特征可以更准确。Here, in order to solve the problem that most existing convolutional neural networks apply the same learned static filter to all text information to be recognized, an embodiment of the present application proposes a convolutional neural network that uses a meta-network to learn the context, and applies it to text processing. The meta-network refers to a network used to generate a set of unique filters corresponding to the input text information; it can generate a weight matrix related to the context of the input text information according to that input, and is thus a network about networks. The filters generated by the meta-network are customized for, and suited to, different types of text information, which changes the previous situation in which a convolutional neural network uses the same filter to treat all types of text equally. Since the filters are related to the context of the text information, the extracted features can be more accurate.
在本申请实施例中,所述元网络可以是任意可微分的深度网络。所述元网络预先通过在大量的训练文本集上训练得到。在进行文本信息的特征提取时,获取预先训练得到的元网络,并对所述元网络的参数进行微调,得到适用于本次文本信息的元网络。In the embodiment of the present application, the meta-network may be any differentiable deep network. The meta-network is obtained in advance through training on a large number of training text sets. When performing feature extraction of text information, a pre-trained meta-network is obtained, and the parameters of the meta-network are fine-tuned to obtain a meta-network suitable for this text information.
在步骤S102中,获取待识别的文本信息。In step S102, the text information to be recognized is obtained.
在本申请实施例中,所述文本信息是根据指定的语言衔接和语义连贯规则组成的一个语句,包括但不限于话术文本信息、文学文本信息。服务器可以根据实际需要或者应用场景的需要获取待识别的文本信息。例如,服务器从预设数据库中获取待识别的文本信息,所述预设数据库中预先收集了大量的文本信息。或者,服务器通过客户端的麦克风获取用户输入的语音信息,然后将所述语音信息转换为文字,得到待识别的文本信息。又或者,服务器通过客户端的拍照功能获取图像信息,然后对所述图像信息进行OCR文本识别,得到待识别的文本信息。可以理解的是,服务器还可以通过多种方式获取到待识别的文本信息,此处不再过多赘述。In the embodiment of the present application, the text information is a sentence composed according to specified linguistic cohesion and semantic coherence rules, including but not limited to verbal text information and literary text information. The server can obtain the text information to be recognized according to actual needs or application scenarios. For example, the server obtains the text information to be recognized from a preset database, and a large amount of text information is collected in advance in the preset database. Alternatively, the server obtains the voice information input by the user through the microphone of the client, and then converts the voice information into text to obtain the text information to be recognized. Or, the server obtains the image information through the camera function of the client, and then performs OCR text recognition on the image information to obtain the text information to be recognized. It is understandable that the server can also obtain the text information to be recognized in a variety of ways, which will not be repeated here.
为了提高元网络输出结果的价值，提高文本特征提取的准确率，所述待识别的文本信息优选为与所述元网络的训练文本及上下文分布是相同或相似的，即所述待识别的文本信息与所述训练文本相同或者为训练文本中的一部分，两者在风格、类型、语义等层面上是相同或相似的。In order to increase the value of the meta-network output and improve the accuracy of text feature extraction, the text information to be recognized is preferably the same as or similar to the training text and context distribution of the meta-network; that is, the text information to be recognized is the same as the training text or is part of the training text, and the two are the same or similar in terms of style, type, and semantics.
在步骤S103中,将所述待识别的文本信息的长度调整为所述元网络的输入长度。In step S103, the length of the text information to be recognized is adjusted to the input length of the meta-network.
在这里，所述文本信息的长度是指文本信息的字符串长度。为了使得元网络根据不同的输入文本生成的唯一过滤器的大小是统一的，本申请实施例在将所述待识别的文本信息输入元网络之前，调整所述待识别的文本信息的长度为元网络的输入长度。所述输入长度是预先设置的元网络的输入参数的字符串长度。可选地，可以在训练元网络时，预先统计所有训练文本的字符串长度的分布，然后选取一个最大长度作为元网络的输入长度，以进行统一。如图2所示，所述步骤S103将所述待识别的文本信息的长度调整为所述元网络的输入长度包括：Here, the length of the text information refers to the string length of the text information. In order to make the sizes of the unique filters generated by the meta-network for different input texts uniform, the embodiment of the present application adjusts the length of the text information to be recognized to the input length of the meta-network before inputting it into the meta-network. The input length is the preset string length of the input parameter of the meta-network. Optionally, when the meta-network is trained, the distribution of the string lengths of all training texts can be counted in advance, and then a maximum length is selected as the input length of the meta-network for unification. As shown in FIG. 2, the step S103 of adjusting the length of the text information to be recognized to the input length of the meta-network includes:
在步骤S1031中,获取所述元网络的输入长度,判断所述待识别的文本信息的长度是否达到所述输入长度。In step S1031, the input length of the meta network is acquired, and it is determined whether the length of the text information to be recognized reaches the input length.
在这里,本申请实施例通过计算所述待识别的文本信息的字符个数,得到所述待识别的文本信息的字符串长度。然后将所述字符串长度与所述输入长度进行比较,以判断所述待识别的文本信息的长度是否达到所述输入长度。Here, the embodiment of the present application obtains the character string length of the text information to be recognized by calculating the number of characters of the text information to be recognized. Then, the length of the character string is compared with the input length to determine whether the length of the text information to be recognized reaches the input length.
若是时,即所述待识别的文本信息的长度达到所述输入长度,则跳转至步骤S104,通过所述元网络生成所述待识别的文本信息对应的一组唯一过滤器。If so, that is, the length of the text information to be recognized reaches the input length, then jump to step S104, and generate a set of unique filters corresponding to the text information to be recognized through the meta-network.
在步骤S1032中,若否时,将预设字符填充至所述待识别的文本信息末尾,以将所述待识别的文本信息的长度调整为所述输入长度。然后跳转至步骤S104。In step S1032, if not, fill preset characters to the end of the text information to be recognized, so as to adjust the length of the text information to be recognized to the input length. Then jump to step S104.
在这里，对于字符串长度未达到输入长度的待识别的文本信息，本申请实施例采用预设字符补齐所述待识别的文本信息，以将所述待识别的文本信息调整到输入长度。所述预设字符对于元网络、卷积神经网络来说是一个代表空白的特殊字符，比如NUL。Here, for the text information to be recognized whose string length does not reach the input length, this embodiment of the present application uses a preset character to pad the text information to be recognized, so as to adjust it to the input length. The preset character is a special character representing a blank for the meta-network and the convolutional neural network, such as NUL.
为了便于理解，以下对上述步骤S103调整文本信息的长度为所述元网络的输入长度进行举例说明。假设预先统计所有训练文本的字符串长度的分布，然后选取一个最大长度作为所述输入长度，比如7。若待识别的文本信息为“今天天气真好”，通过步骤S1031得到其字符串长度为6，未达到输入长度7。则在步骤S1032中采用预设字符NUL来对所述待识别的文本信息“今天天气真好”进行补齐，得到长度调整后的文本信息“今天天气真好NUL”。For ease of understanding, the following is an example of adjusting the length of the text information to the input length of the meta-network in step S103. It is assumed that the distribution of the string lengths of all training texts is counted in advance, and a maximum length, such as 7, is selected as the input length. If the text information to be recognized is "今天天气真好" ("the weather is really good today"), its string length obtained through step S1031 is 6, which does not reach the input length of 7. Then, in step S1032, the preset character NUL is used to pad the text information to be recognized, and the length-adjusted text information "今天天气真好NUL" is obtained.
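The following sketch is illustrative only and not part of the claimed embodiments; it shows one way the padding of steps S1031 and S1032 could be written in Python, where the input length of 7 and the marker spelling "<NUL>" are assumptions taken from the example above.

# Hypothetical sketch of steps S1031-S1032: pad a token sequence to the meta-network input length.
NUL = "<NUL>"        # assumed spelling of the blank marker; the embodiment only calls it NUL
INPUT_LENGTH = 7     # assumed input length, taken from the example above

def adjust_length(tokens, input_length=INPUT_LENGTH, pad_token=NUL):
    # S1031: compare the string length with the input length.
    if len(tokens) >= input_length:
        return list(tokens)      # length already reached; the embodiment proceeds directly to S104
    # S1032: append the blank marker until the input length is reached.
    return list(tokens) + [pad_token] * (input_length - len(tokens))

print(adjust_length("今天天气真好"))   # ['今', '天', '天', '气', '真', '好', '<NUL>']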
在步骤S104中，将长度调整后的所述文本信息作为输入传入所述元网络，通过所述元网络生成所述文本信息对应的一组唯一过滤器，所述唯一过滤器是指与长度调整后的所述文本信息的上下文相关的过滤器。In step S104, the length-adjusted text information is passed into the meta-network as input, and a set of unique filters corresponding to the text information is generated through the meta-network. The unique filters refer to filters related to the context of the length-adjusted text information.
如前所述，所述元网络能够根据输入参数学习得到经过调整的卷积神经网络过滤器。在本申请实施例中，所述输入参数为长度调整后的待识别的文本信息，所述过滤器为所述待识别的文本信息对应的一组唯一过滤器。所述唯一过滤器与待识别的文本信息的上下文相关，从而使用所述唯一过滤器能够对不同的待识别的文本信息精炼及提取到不同的特征。As mentioned above, the meta-network can learn adjusted convolutional neural network filters according to the input parameter. In the embodiment of the present application, the input parameter is the length-adjusted text information to be recognized, and the filters are a set of unique filters corresponding to that text information. The unique filters are related to the context of the text information to be recognized, so that different features can be refined and extracted from different text information to be recognized by using them.
可选地,为了解决输入长度可变的问题,在本申请实施例中,所述元网络生成预定义大小的唯一过滤器。如图3所示,所述步骤S104将长度调整后的所述文本信息作为输入传入所述元网络,通过所述元网络生成所述文本信息对应的一组唯一过滤器包括:Optionally, in order to solve the problem of variable input length, in this embodiment of the present application, the meta-network generates a unique filter of a predefined size. As shown in FIG. 3, in step S104, inputting the text information with the adjusted length as input into the meta network, and generating a set of unique filters corresponding to the text information through the meta network includes:
在步骤S1041中,对长度调整后的所述文本信息进行向量化处理,得到向量矩阵,所述向量矩阵中包括若干个词嵌入向量,每一个词嵌入向量的长度相等。In step S1041, the text information after the length adjustment is vectorized to obtain a vector matrix. The vector matrix includes several word embedding vectors, and each word embedding vector has the same length.
在这里,本申请实施例对长度调整后的所述文本信息进行向量化处理,得到长度调整后的文本信息对应的向量矩阵,作为所述待识别的文本信息对应的向量矩阵。所述向量矩阵中包括若干个词嵌入向量。所述词嵌入向量是指将长度调整后的所述文本信息进行分词后每个单词的词向量,即长度调整后的文本信息中的每一个词映射为向量矩阵中的一个列向量。在本申请实施例中,所述词嵌入向量的长度是预先指定的,即对于不同长度的待识别的文本信息,其对应的词嵌入向量的长度都是相同的。尽管长度调整时采用了预设字符来填充所述待识别的文本信息,但是所述预设字符对于元网络、卷积神经网络来说是一个代表空白的特殊字符。本申请实施例将长度调整后的所述文本信息转化为向量矩阵,有利于方便了后续卷积神经网络的识别和学习,即方便了后续执行卷积运算和转置卷积运算。Here, the embodiment of the present application performs vectorization processing on the length-adjusted text information to obtain a vector matrix corresponding to the length-adjusted text information as the vector matrix corresponding to the text information to be recognized. The vector matrix includes several word embedding vectors. The word embedding vector refers to the word vector of each word after the length-adjusted text information is segmented, that is, each word in the length-adjusted text information is mapped to a column vector in a vector matrix. In the embodiment of the present application, the length of the word embedding vector is pre-designated, that is, for text information to be recognized of different lengths, the length of the corresponding word embedding vector is the same. Although a preset character is used to fill the text information to be recognized during the length adjustment, the preset character is a special character representing blank space for the meta-network and convolutional neural network. The embodiment of the present application converts the length-adjusted text information into a vector matrix, which facilitates the recognition and learning of subsequent convolutional neural networks, that is, it facilitates the subsequent execution of convolution operations and transposed convolution operations.
在步骤S1042中,通过所述元网络对所述向量矩阵执行卷积运算,得到指定长度的隐藏层向量。In step S1042, a convolution operation is performed on the vector matrix through the meta-network to obtain a hidden layer vector of a specified length.
在得到长度调整后的所述文本信息对应的向量矩阵之后，通过元网络中预设的卷积层对所述向量矩阵执行卷积操作，即通过卷积层滤波器对所述向量矩阵进行卷积运算，计算滤波器和所述向量矩阵之间的点积，以提取更高层次的特征，得到指定长度的隐藏层向量。在这里，组成卷积层滤波器的参数可以通过损失函数进行优化。After the vector matrix corresponding to the length-adjusted text information is obtained, a convolution operation is performed on the vector matrix through a convolutional layer preset in the meta-network, that is, the convolutional layer filters are convolved with the vector matrix, computing the dot product between the filters and the vector matrix to extract higher-level features and obtain a hidden layer vector of a specified length. Here, the parameters that make up the convolutional layer filters can be optimized through a loss function.
在步骤S1043中,对所述隐藏层向量执行转置卷积运算,得到所述长度调整后的文本信息对应的一组唯一过滤器。In step S1043, a transposed convolution operation is performed on the hidden layer vector to obtain a set of unique filters corresponding to the length-adjusted text information.
在这里，转置卷积(transpose convolution)运算，又称为解卷积(deconvolution)或者反卷积，类似于卷积的逆运算。本申请实施例在步骤S1042所述的隐藏层之上叠加了一个转置卷积层。在得到隐藏层向量之后，将所述隐藏层向量通过所述转置卷积层，进行转置卷积运算，生成一组卷积核，以所述卷积核作为长度调整后的所述文本信息对应的一组唯一过滤器，即所述待识别的文本信息对应的一组唯一过滤器。在这里，组成转置卷积层的参数可以通过损失函数进行优化。可以理解的是，所述唯一过滤器是与所述待识别的文本信息的上下文相关的过滤器，为所述待识别的文本信息定制的，适用于所述待识别的文本信息。Here, the transposed convolution operation, also called deconvolution, is similar to the inverse of convolution. In this embodiment of the application, a transposed convolutional layer is stacked on top of the hidden layer described in step S1042. After the hidden layer vector is obtained, it is passed through the transposed convolutional layer to perform a transposed convolution operation and generate a set of convolution kernels, and these kernels are used as the set of unique filters corresponding to the length-adjusted text information, that is, to the text information to be recognized. Here, the parameters that make up the transposed convolutional layer can be optimized through a loss function. It is understandable that the unique filters are filters related to the context of the text information to be recognized, customized for and suited to that text information.
在本申请实施例中，由于待识别的文本信息通过步骤S103调整到指定的输入长度，然后封装到一个相同长度的向量矩阵中，隐藏层向量根据所述向量矩阵得到，从而使得所述隐藏层向量与所述待识别的文本信息长度无关，保证了通过元网络生成的过滤器对每一个待识别的文本信息具有相同的维度和大小，即通过元网络生成的过滤器的大小保持一致。In the embodiment of the present application, since the text information to be recognized is adjusted to the specified input length through step S103 and then encapsulated into a vector matrix of the same length, and the hidden layer vector is obtained from this vector matrix, the hidden layer vector is independent of the original length of the text information to be recognized. This ensures that the filters generated through the meta-network have the same dimensions and size for every piece of text information to be recognized, that is, the sizes of the filters generated through the meta-network remain consistent.
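Purely as an illustrative sketch (not the claimed implementation), the pipeline of steps S1041 to S1043 could be organized as below; the library (PyTorch) and all layer sizes d, hidden, K and h are assumptions, and the exact layer shapes in the embodiment may differ.

import torch
import torch.nn as nn

class MetaNetwork(nn.Module):
    # Hypothetical sketch of the meta-network: embedding -> convolution -> transposed convolution.
    def __init__(self, vocab_size, d=128, hidden=64, K=100, h=5):
        super().__init__()
        self.K, self.h, self.d = K, h, d
        self.embed = nn.Embedding(vocab_size, d)                         # S1041: equal-length word embedding vectors
        self.conv = nn.Conv1d(d, hidden, kernel_size=3, padding=1)       # S1042: convolution over the vector matrix
        self.pool = nn.AdaptiveMaxPool1d(1)                              # collapse to a fixed-length hidden layer vector
        self.deconv = nn.ConvTranspose1d(hidden, K * d, kernel_size=h)   # S1043: transposed convolution

    def forward(self, token_ids):
        # token_ids: (batch, T), already padded to the meta-network input length T
        x = self.embed(token_ids).transpose(1, 2)        # vector matrix X, shape (batch, d, T)
        z = self.pool(torch.relu(self.conv(x)))          # hidden layer vector, shape (batch, hidden, 1)
        w = self.deconv(z)                                # shape (batch, K*d, h)
        return w.view(-1, self.K, self.d, self.h).permute(0, 1, 3, 2)   # unique filters, shape (batch, K, h, d)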
可选地，在本申请实施例中，上述卷积层和转置卷积层中的参数均为联合可微分的，因此在训练元网络时可以将卷积层的参数和转置卷积层的参数一起通过梯度的反向传播算法进行优化、更新。在这里，反向传播算法（即BP算法）的思想是将通过卷积层和转置卷积层的输出进行误差计算，并将误差反向逐级传下去，主要由激励传播和权重更新两个环节反复循环迭代，直至训练文本的特征向量矩阵达到预定的误差期望值。通过反向传播算法可以进一步优化元网络的参数，提高元网络生成唯一过滤器的准确度。Optionally, in the embodiment of the present application, the parameters of the convolutional layer and the transposed convolutional layer are jointly differentiable, so when training the meta-network, the parameters of both layers can be optimized and updated together through the gradient back-propagation algorithm. Here, the idea of the back-propagation algorithm (the BP algorithm) is to compute the error from the outputs of the convolutional layer and the transposed convolutional layer and propagate the error backward level by level; it mainly consists of the two stages of excitation propagation and weight update, which are iterated repeatedly until the feature vector matrix of the training text reaches a predetermined error expectation. The back-propagation algorithm can further optimize the parameters of the meta-network and improve the accuracy with which the meta-network generates the unique filters.
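Continuing the sketch above, the joint optimization of the convolutional and transposed-convolutional parameters by back-propagation might look as follows; the optimizer, learning rate, dummy data and placeholder loss are assumptions for illustration only, not the loss actually used by the embodiment.

import torch
meta_net = MetaNetwork(vocab_size=30000)                        # the sketch class defined above
optimizer = torch.optim.Adam(meta_net.parameters(), lr=1e-3)    # conv and deconv parameters updated jointly
token_ids = torch.randint(0, 30000, (8, 7))                     # dummy batch: 8 texts padded to length 7
filters = meta_net(token_ids)
loss = filters.pow(2).mean()                                    # placeholder loss; a real loss comes from the downstream task
optimizer.zero_grad()
loss.backward()                                                  # error propagated backward through both layers
optimizer.step()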
在步骤S105中，将长度调整后的所述文本信息作为输入传入所述唯一过滤器，通过所述唯一过滤器提取所述文本信息对应的特征向量矩阵，所述特征向量矩阵中的各个元素表示所述文本信息的特征。In step S105, the length-adjusted text information is passed into the unique filters as input, and the feature vector matrix corresponding to the text information is extracted through the unique filters; each element in the feature vector matrix represents a feature of the text information.
在通过元网络得到长度调整后的所述文本信息对应的唯一过滤器之后，使用所述唯一过滤器对长度调整后的所述文本信息进行识别。具体为：将长度调整后的所述文本信息作为输入传入所述唯一过滤器，然后获取经过所述唯一过滤器后的输出，以所述输出作为所述待识别的文本信息对应的特征向量矩阵。所述特征向量矩阵中包含所述待识别的文本信息的特征信息，即语义信息。After the unique filters corresponding to the length-adjusted text information are obtained through the meta-network, they are used to recognize the length-adjusted text information. Specifically, the length-adjusted text information is passed into the unique filters as input, the output of the unique filters is obtained, and this output is used as the feature vector matrix corresponding to the text information to be recognized. The feature vector matrix contains the feature information, that is, the semantic information, of the text information to be recognized.
可选地，如图4所示，所述步骤S105将长度调整后的所述文本信息作为输入传入所述唯一过滤器，通过所述唯一过滤器提取所述文本信息对应的特征向量矩阵包括：Optionally, as shown in FIG. 4, the step S105 of passing the length-adjusted text information into the unique filters as input and extracting the feature vector matrix corresponding to the text information through the unique filters includes:
在步骤S1051中,对长度调整后的所述文本信息进行向量化处理,得到向量矩阵,所述向量矩阵中包括若干个词嵌入向量,每一个词嵌入向量的长度相等。In step S1051, the text information after the length adjustment is vectorized to obtain a vector matrix. The vector matrix includes several word embedding vectors, each of which has the same length.
在这里，在将长度调整后的文本信息传入所述唯一过滤器之前，可以对长度调整后的所述文本信息进行向量化处理，将文本信息中的每一个词映射为向量矩阵中的一个列向量，得到每一个词的词嵌入向量，组合所述词嵌入向量得到长度调整后的所述文本信息对应的向量矩阵。其中，所述词嵌入向量的长度是预先指定的，即对于不同长度的待识别的文本信息，其对应的词嵌入向量的长度都是相同的。本申请实施例将所述待识别的文本信息转化为向量矩阵，有利于方便后续执行卷积运算。Here, before the length-adjusted text information is passed into the unique filters, it can be vectorized: each word in the text information is mapped to a column vector in a vector matrix to obtain the word embedding vector of that word, and the word embedding vectors are combined to obtain the vector matrix corresponding to the length-adjusted text information. The length of the word embedding vectors is pre-specified, that is, for text information of different lengths to be recognized, the lengths of the corresponding word embedding vectors are the same. The embodiment of the present application converts the text information to be recognized into a vector matrix, which facilitates the subsequent convolution operation.
示例性地，假设长度调整后的文本信息的长度为T，组成的单词为x_1, x_2, ..., x_T。对长度调整后的所述文本信息进行向量化处理后，得到一个向量矩阵X∈R^(d×T)，向量矩阵X中的每一列表示长度调整后的所述文本信息中的一个单词对应的d维的词嵌入向量。Illustratively, suppose that the length of the length-adjusted text information is T and that it consists of the words x_1, x_2, ..., x_T. After the length-adjusted text information is vectorized, a vector matrix X ∈ R^(d×T) is obtained, where each column of X represents the d-dimensional word embedding vector corresponding to one word of the length-adjusted text information.
在步骤S1052中,通过所述唯一过滤器对所述向量矩阵执行卷积运算,提取所述文本信息对应的特征图。In step S1052, a convolution operation is performed on the vector matrix through the unique filter to extract a feature map corresponding to the text information.
在得到长度调整后的所述文本信息对应的向量矩阵之后，将所述向量矩阵作为输入传入所述唯一过滤器执行卷积操作，即通过所述唯一过滤器对所述向量矩阵进行卷积运算，计算过滤器和所述向量矩阵之间的点积，以提取更高层次的特征，得到所述文本信息对应的特征图。After the vector matrix corresponding to the length-adjusted text information is obtained, the vector matrix is passed as input into the unique filters to perform a convolution operation, that is, the unique filters are convolved with the vector matrix, computing the dot product between the filters and the vector matrix to extract higher-level features and obtain the feature map corresponding to the text information.
示例性地，假设所述唯一过滤器的权重为W∈R^(K×h×d)，将所述唯一过滤器和所述向量矩阵中的每个大小为h的窗口进行卷积运算，得到所述向量矩阵的特征图P。其中，所述特征图P中的每一个元素p_i由窗口大小为h的文本片段生成：p_i = f(W × x_{i:i+h-1} + b)。Illustratively, suppose that the weights of the unique filters are W ∈ R^(K×h×d). The unique filters are convolved with every window of size h in the vector matrix to obtain the feature map P of the vector matrix, where each element p_i in the feature map P is generated from a text segment with a window size of h: p_i = f(W × x_{i:i+h-1} + b).
在上式中,i=1,2,…,T-h+1,在这里,×表示卷积运算符,b表示维度为K的偏置向量,f表示非线性激活函数,比如ReLU。In the above formula, i=1, 2,..., T-h+1, where × represents a convolution operator, b represents a bias vector with dimension K, and f represents a nonlinear activation function, such as ReLU.
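As an illustration of step S1052 under assumed example sizes (d, T, K and h below are placeholders, and ReLU stands in for the activation f), the formula p_i = f(W × x_{i:i+h-1} + b) corresponds to a one-dimensional convolution:

import torch
import torch.nn.functional as F

d, T, K, h = 128, 7, 100, 5
X = torch.randn(d, T)             # vector matrix X in R^(d x T); a stand-in for real word embeddings
W = torch.randn(K, h, d)          # unique filters W in R^(K x h x d) produced by the meta-network
b = torch.zeros(K)                # bias vector of dimension K

# conv1d expects weights of shape (K, d, h), so swap the last two dimensions of W.
P = F.relu(F.conv1d(X.unsqueeze(0), W.transpose(1, 2), bias=b)).squeeze(0)
print(P.shape)                    # feature map P, shape (K, T-h+1)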
在步骤S1053中,对所述特征图执行池化操作,提取特征图中每一行的最大值作为主要特征,得到所述文本信息对应的特征向量矩阵。In step S1053, a pooling operation is performed on the feature map, and the maximum value of each row in the feature map is extracted as the main feature, and the feature vector matrix corresponding to the text information is obtained.
在本申请实施例中，所述特征图随后作为输入传入最大池化层中，通过所述最大池化层对所述特征图中的每一行提取最大值，得到主要特征，组合所有主要特征得到一个K维向量，以所述K维向量作为长度调整后的所述文本信息对应的特征向量矩阵，即所述待识别的文本信息对应的特征向量矩阵。本申请实施例通过最大池化层弃置不重要的特征，仅保留了最突出的特征，一方面可以使特征图变小，简化计算复杂度，一方面可以提高识别的准确度。In the embodiment of the present application, the feature map is then passed into a max-pooling layer as input; the max-pooling layer extracts the maximum value of each row of the feature map as a main feature, and all the main features are combined into a K-dimensional vector, which is used as the feature vector matrix corresponding to the length-adjusted text information, that is, to the text information to be recognized. By discarding unimportant features through the max-pooling layer and keeping only the most prominent ones, the embodiment makes the feature map smaller and the computation simpler on the one hand, and improves the recognition accuracy on the other.
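The row-wise max pooling of step S1053 then reduces the feature map to a K-dimensional vector; continuing the assumed tensors from the previous sketch:

# Step S1053: take the maximum of each of the K rows of the feature map P (shape (K, T-h+1)).
features, _ = P.max(dim=1)        # K-dimensional vector of the most prominent features
print(features.shape)             # torch.Size([100]): the main features of the text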
所述特征向量矩阵中的各个元素表示所述文本信息的特征，即语义信息。在本申请实施例中，经过元网络生成的唯一过滤器与所述待识别的文本信息的上下文相关，不同的待识别的文本信息对应的唯一过滤器不相同，即在卷积神经网络中的权重矩阵不相同。通过所述唯一过滤器获取所述待识别的文本信息的特征向量矩阵，大大地提高了特征提取的准确率。Each element in the feature vector matrix represents a feature, that is, semantic information, of the text information. In the embodiment of the present application, the unique filters generated by the meta-network are related to the context of the text information to be recognized, and different text information to be recognized corresponds to different unique filters, that is, to different weight matrices in the convolutional neural network. Obtaining the feature vector matrix of the text information to be recognized through the unique filters greatly improves the accuracy of feature extraction.
综上所示，本申请实施例通过设置并训练元网络，所述元网络是指用于生成与所输入的文本信息对应的一组唯一过滤器的网络；在对文本信息进行识别时，根据待识别的文本信息获取元网络，并将所述待识别的文本信息的长度调整为所述元网络的输入长度；然后将长度调整后的所述文本信息作为输入传入所述元网络，通过所述元网络生成所述文本信息对应的一组唯一过滤器，所述唯一过滤器是指与长度调整后的所述文本信息的上下文相关的过滤器；将长度调整后的所述文本信息作为输入传入所述唯一过滤器，通过所述唯一过滤器提取所述文本信息对应的特征向量矩阵，所述特征向量矩阵中的各个元素表示所述文本信息的特征；从而实现了针对待识别的文本信息习得对应的一组过滤器用于识别所述文本信息，解决了现有文本识别技术无法适应上下文语境、文本识别准确率欠佳的问题，极大地提高了文本信息的特征提取的准确率。In summary, the embodiment of the present application sets up and trains a meta-network, which refers to a network used to generate a set of unique filters corresponding to the input text information. When recognizing text information, the meta-network is obtained for the text information to be recognized, and the length of that text information is adjusted to the input length of the meta-network; the length-adjusted text information is then passed into the meta-network as input, and a set of unique filters corresponding to the text information is generated through the meta-network, where the unique filters refer to filters related to the context of the length-adjusted text information; the length-adjusted text information is passed into the unique filters as input, and the feature vector matrix corresponding to the text information is extracted through the unique filters, where each element in the feature vector matrix represents a feature of the text information. In this way, a set of filters tailored to the text information to be recognized is learned and used to recognize that text information, which solves the problems that the existing text recognition technology cannot adapt to the context and has poor recognition accuracy, and greatly improves the accuracy of feature extraction from text information.
在本申请实施例中,所述特征向量矩阵与待识别的文本信息对应,包括若干个元素,每一个元素表示从所述文本信息中提取出来的特征,即语义信息。相比于待识别的文本信息,所述特征向量矩阵的维度大大缩小。基于所述特征向量矩阵可以进一步实现对所述待识别文本信息的分类。如图5所示,所述方法还可以包括:In the embodiment of the present application, the feature vector matrix corresponds to the text information to be recognized, and includes several elements, and each element represents a feature extracted from the text information, that is, semantic information. Compared with the text information to be recognized, the dimension of the feature vector matrix is greatly reduced. The classification of the text information to be recognized can be further realized based on the feature vector matrix. As shown in Figure 5, the method may further include:
在步骤S106中,将所述特征向量矩阵作为输入传入全连接层,然后将全连接层的输出作为输入传入预设的Softmax分类器。In step S106, the feature vector matrix is input to the fully connected layer, and then the output of the fully connected layer is input to the preset Softmax classifier.
在步骤S107中,根据所述Softmax分类器的输出获取所述文本信息对应的类别。In step S107, the category corresponding to the text information is obtained according to the output of the Softmax classifier.
在这里，本申请实施例通过全连接层对所述特征向量矩阵进行提纯，以将特征向量矩阵转化为指定维度的向量，方便后续softmax分类器执行分类操作。所述全连接层预先根据分类类别的数量N，设置K*N个权重系数和N个偏置值，K为全连接层前一层的最后一维的维度，即输出的特征向量矩阵的维度。然后将所述特征向量矩阵与全连接层的权重矩阵相乘后加上一个偏置值，所得的和组合成一维向量，从而得到全连接层的输出。Here, the embodiment of the present application refines the feature vector matrix through a fully connected layer to convert it into a vector of a specified dimension, which facilitates the subsequent classification operation of the softmax classifier. According to the number N of classification categories, the fully connected layer is pre-configured with K*N weight coefficients and N bias values, where K is the dimension of the last dimension of the layer preceding the fully connected layer, that is, the dimension of the output feature vector matrix. The feature vector matrix is then multiplied by the weight matrix of the fully connected layer and a bias value is added, and the resulting sums are combined into a one-dimensional vector, which is the output of the fully connected layer.
将所述全连接层的输出作为输入传入预设的softmax分类器。在这里,所述Softmax分类器用于处理多分类问题,其输出需要经过Softmax函数进行数值处理。关于Softmax函数的定义如下:The output of the fully connected layer is input into the preset softmax classifier. Here, the Softmax classifier is used to deal with multi-classification problems, and its output needs to be processed numerically through the Softmax function. The definition of Softmax function is as follows:
S_n = e^(V_n) / Σ_{j=1}^{N} e^(V_j)
在上述中，V_n表示全连接层输出的一维向量中的元素，n表示类别索引，n=1,2,3,…,N，总的类别个数为N。S_n表示当前元素V_n的指数与所有元素指数和的比值。将所有的S_n组合成一维向量，得到softmax分类器的输出。通过上式可知，softmax分类器将多分类的全连接层输出数值转化为相对概率，其元素表征了不同类别之间的相对概率，便于理解和比较。基于所述softmax分类器的输出，概率最大的元素对应的类别可能性最大，可以清晰地预测所述待识别的文本信息为概率最大的元素对应的类别。In the above, V_n represents an element of the one-dimensional vector output by the fully connected layer, n represents the category index, n = 1, 2, 3, ..., N, and the total number of categories is N. S_n represents the ratio of the exponential of the current element V_n to the sum of the exponentials of all elements. All S_n are combined into a one-dimensional vector to obtain the output of the softmax classifier. As the formula shows, the softmax classifier converts the output values of the multi-class fully connected layer into relative probabilities; its elements characterize the relative probabilities of the different categories, which makes them easy to understand and compare. Based on the output of the softmax classifier, the category corresponding to the element with the highest probability is the most likely, so the text information to be recognized can be clearly predicted to belong to that category.
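Steps S106 and S107 can be sketched as follows; the sizes K and N are assumptions, and torch.nn.Linear plays the role of the fully connected layer with K*N weights and N biases.

import torch
import torch.nn.functional as F

K, N = 100, 3                                   # assumed feature dimension and number of categories
fc = torch.nn.Linear(K, N)                      # fully connected layer: K*N weights and N biases
features = torch.randn(K)                       # K-dimensional feature vector from the max-pooling step
logits = fc(features)                           # one-dimensional vector V with N elements
probs = F.softmax(logits, dim=0)                # S_n = exp(V_n) / sum_j exp(V_j)
predicted_category = int(torch.argmax(probs))   # S107: the category with the highest relative probability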
可选地,在本申请实施例中,所述类别可以是意图类别,比如同意、拒绝、等待等,也可以是网页类别、情感类别、用户评论类别等等,此处不作限制。Optionally, in the embodiment of the present application, the category may be an intention category, such as consent, rejection, waiting, etc., or a webpage category, an emotion category, a user comment category, etc., which are not limited here.
在本申请实施例中,经过元网络生成的唯一过滤器与所述待识别的文本信息的上下文相关,不同的待识别的文本信息对应的唯一过滤器不相同,即权重矩阵不相同。通过所述唯一过滤器获取所述待识别的文本信息的特征向量矩阵,大大地提高了特征提取的准确率;基于所述特征向量矩阵进行分类,进一步提高了分类的准确率。In the embodiment of the present application, the unique filter generated through the meta-network is related to the context of the text information to be recognized, and the unique filters corresponding to different text information to be recognized are different, that is, the weight matrix is different. Obtaining the feature vector matrix of the text information to be recognized through the unique filter greatly improves the accuracy of feature extraction; classifying based on the feature vector matrix further improves the accuracy of classification.
应理解,上述实施例中各步骤的序号的大小并不意味着执行顺序的先后,各过程的执行顺序应以其功能和内在逻辑确定,而不应对本申请实施例的实施过程构成任何限定。It should be understood that the size of the sequence number of each step in the foregoing embodiment does not mean the order of execution. The execution sequence of each process should be determined by its function and internal logic, and should not constitute any limitation to the implementation process of the embodiment of the present application.
在一实施例中，提供一种文本信息的特征提取装置，该文本信息的特征提取装置与上述实施例中文本信息的特征提取方法一一对应。如图6所示，该文本信息的特征提取装置包括训练模块61、信息获取模块62、长度调整模块63、过滤器生成模块64、特征提取模块65。各功能模块详细说明如下：In one embodiment, a feature extraction device for text information is provided, and this feature extraction device corresponds one-to-one to the feature extraction method for text information in the foregoing embodiment. As shown in FIG. 6, the feature extraction device for text information includes a training module 61, an information acquisition module 62, a length adjustment module 63, a filter generation module 64, and a feature extraction module 65. Each functional module is described in detail as follows:
训练模块61,用于设置并训练元网络,所述元网络是指用于生成与所输入的文本信息对应的一组唯一过滤器的网络;The training module 61 is used to set up and train a meta-network, where the meta-network refers to a network used to generate a set of unique filters corresponding to the input text information;
信息获取模块62,用于获取待识别的文本信息;The information acquisition module 62 is used to acquire the text information to be recognized;
长度调整模块63,用于将所述待识别的文本信息的长度调整为所述元网络的输入长度;The length adjustment module 63 is configured to adjust the length of the text information to be recognized to the input length of the meta network;
过滤器生成模块64，用于将长度调整后的所述文本信息作为输入传入所述元网络，通过所述元网络生成所述文本信息对应的一组唯一过滤器，所述唯一过滤器是指与长度调整后的所述文本信息的上下文相关的过滤器；The filter generation module 64 is configured to pass the length-adjusted text information into the meta-network as input and generate, through the meta-network, a set of unique filters corresponding to the text information, where the unique filters refer to filters related to the context of the length-adjusted text information;
特征提取模块65，用于将长度调整后的所述文本信息作为输入传入所述唯一过滤器，通过所述唯一过滤器提取所述文本信息对应的特征向量矩阵，所述特征向量矩阵中的各个元素表示所述文本信息的特征。The feature extraction module 65 is configured to pass the length-adjusted text information into the unique filters as input and extract, through the unique filters, the feature vector matrix corresponding to the text information, where each element in the feature vector matrix represents a feature of the text information.
可选地,所述长度调整模块63包括:Optionally, the length adjustment module 63 includes:
长度获取单元,用于获取所述元网络的输入长度,判断所述待识别的文本信息的长度是否达到所述输入长度;A length acquiring unit, configured to acquire the input length of the meta-network, and determine whether the length of the text information to be recognized reaches the input length;
长度调整单元，用于若所述待识别的文本信息的长度未达到所述输入长度时，将预设字符填充至所述待识别的文本信息末尾，以将所述待识别的文本信息的长度调整为所述输入长度。The length adjustment unit is configured to, if the length of the text information to be recognized does not reach the input length, pad preset characters to the end of the text information to be recognized, so as to adjust the length of the text information to be recognized to the input length.
可选地,所述过滤器生成模块64包括:Optionally, the filter generating module 64 includes:
第一向量化单元，用于对长度调整后的所述文本信息进行向量化处理，得到向量矩阵，所述向量矩阵中包括若干个词嵌入向量，每一个词嵌入向量的长度相等；The first vectorization unit is configured to perform vectorization processing on the length-adjusted text information to obtain a vector matrix. The vector matrix includes a plurality of word embedding vectors, and each word embedding vector has the same length;
第一卷积单元,用于通过所述元网络对所述向量矩阵执行卷积运算,得到指定长度的隐藏层向量;The first convolution unit is configured to perform a convolution operation on the vector matrix through the meta-network to obtain a hidden layer vector of a specified length;
转置卷积单元,用于对所述隐藏层向量执行转置卷积运算,得到所述长度调整后的文本信息对应的一组唯一过滤器。The transposed convolution unit is configured to perform a transposed convolution operation on the hidden layer vector to obtain a group of unique filters corresponding to the length-adjusted text information.
可选地,所述特征提取模块65包括:Optionally, the feature extraction module 65 includes:
第二向量化单元,用于对长度调整后的所述文本信息进行向量化处理,得到向量矩阵,所述向量矩阵中包括若干个词嵌入向量,每一个词嵌入向量的长度相等;The second vectorization unit is configured to perform vectorization processing on the length-adjusted text information to obtain a vector matrix. The vector matrix includes a plurality of word embedding vectors, and each word embedding vector has the same length;
第二卷积单元,用于通过所述唯一过滤器对所述向量矩阵执行卷积运算,提取所述文本信息对应的特征图;A second convolution unit, configured to perform a convolution operation on the vector matrix through the unique filter to extract a feature map corresponding to the text information;
池化单元,用于对所述特征图执行池化操作,提取特征图中每一行的最大值作为主要特征,得到所述文本信息对应的特征向量矩阵。The pooling unit is configured to perform a pooling operation on the feature map, extract the maximum value of each row in the feature map as a main feature, and obtain a feature vector matrix corresponding to the text information.
可选地,在通过所述唯一过滤器生成所述文本信息对应的特征向量矩阵后,还包括:Optionally, after generating the feature vector matrix corresponding to the text information through the unique filter, the method further includes:
分类模块，用于将所述特征向量矩阵作为输入传入全连接层，然后将全连接层的输出作为输入传入预设的Softmax分类器；根据所述Softmax分类器的输出获取所述文本信息对应的类别。The classification module is configured to pass the feature vector matrix into the fully connected layer as input, then pass the output of the fully connected layer into the preset Softmax classifier as input, and obtain the category corresponding to the text information according to the output of the Softmax classifier.
关于文本信息的特征提取装置的具体限定可以参见上文中对于文本信息的特征提取方法的限定,在此不再赘述。上述文本信息的特征提取装置中的各个模块可全部或部分通过软件、硬件及其组合来实现。上述各模块可以硬件形式内嵌于或独立于计算机设备中的处理器中,也可以以软件形式存储于计算机设备中的存储器中,以便于处理器调用执行以上各个模块对应的操作。For the specific limitation of the feature extraction device of text information, please refer to the above limitation of the feature extraction method of text information, which will not be repeated here. Each module in the apparatus for extracting features of text information described above can be implemented in whole or in part by software, hardware, and a combination thereof. The foregoing modules may be embedded in the form of hardware or independent of the processor in the computer device, or may be stored in the memory of the computer device in the form of software, so that the processor can call and execute the operations corresponding to the foregoing modules.
在一个实施例中，提供了一种计算机设备，该计算机设备可以是服务器，其内部结构图可以如图7所示。该计算机设备包括通过系统总线连接的处理器、存储器、网络接口和数据库。其中，该计算机设备的处理器用于提供计算和控制能力。该计算机设备的存储器包括非易失性存储介质、内存储器。该非易失性存储介质存储有操作系统、计算机可读指令和数据库。该内存储器为非易失性存储介质中的操作系统和计算机可读指令的运行提供环境。该计算机设备的网络接口用于与外部的终端通过网络连接通信。该计算机可读指令被处理器执行时以实现一种文本信息的特征提取方法。In one embodiment, a computer device is provided. The computer device may be a server, and its internal structure diagram may be as shown in FIG. 7. The computer device includes a processor, a memory, a network interface and a database connected through a system bus. The processor of the computer device is used to provide calculation and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer-readable instructions, and a database. The internal memory provides an environment for the operation of the operating system and computer-readable instructions in the non-volatile storage medium. The network interface of the computer device is used to communicate with an external terminal through a network connection. The computer-readable instructions are executed by the processor to implement a feature extraction method of text information.
在一个实施例中,提供了一种计算机设备,包括存储器、处理器及存储在存储器上并可在处理器上运行的计算机可读指令,处理器执行计算机可读指令时实现以下步骤:In one embodiment, a computer device is provided, including a memory, a processor, and computer-readable instructions stored in the memory and capable of running on the processor, and the processor implements the following steps when the processor executes the computer-readable instructions:
设置并训练元网络,所述元网络是指用于生成与所输入的文本信息对应的一组唯一过滤器的网络;Setting up and training a meta-network, the meta-network refers to a network used to generate a set of unique filters corresponding to the input text information;
获取待识别的文本信息;Obtain the text information to be recognized;
将所述待识别的文本信息的长度调整为所述元网络的输入长度;Adjusting the length of the text information to be recognized to the input length of the meta-network;
将长度调整后的所述文本信息作为输入传入所述元网络，通过所述元网络生成所述文本信息对应的一组唯一过滤器，所述唯一过滤器是指与长度调整后的所述文本信息的上下文相关的过滤器；The length-adjusted text information is passed into the meta-network as input, and a set of unique filters corresponding to the text information is generated through the meta-network, where the unique filters refer to filters related to the context of the length-adjusted text information;
将长度调整后的所述文本信息作为输入传入所述唯一过滤器，通过所述唯一过滤器提取所述文本信息对应的特征向量矩阵，所述特征向量矩阵中的各个元素表示所述文本信息的特征。The length-adjusted text information is passed into the unique filters as input, and the feature vector matrix corresponding to the text information is extracted through the unique filters; each element in the feature vector matrix represents a feature of the text information.
在一个实施例中，提供了一个或多个存储有计算机可读指令的非易失性可读存储介质，所述计算机可读指令被一个或多个处理器执行时，使得所述一个或多个处理器执行如下步骤：In one embodiment, one or more non-volatile readable storage media storing computer-readable instructions are provided. When the computer-readable instructions are executed by one or more processors, the one or more processors are caused to perform the following steps:
设置并训练元网络，所述元网络是指用于生成与所输入的文本信息对应的一组唯一过滤器的网络；Setting up and training a meta-network, the meta-network refers to a network used to generate a set of unique filters corresponding to the input text information;
获取待识别的文本信息;Obtain the text information to be recognized;
将所述待识别的文本信息的长度调整为所述元网络的输入长度;Adjusting the length of the text information to be recognized to the input length of the meta-network;
将长度调整后的所述文本信息作为输入传入所述元网络，通过所述元网络生成所述文本信息对应的一组唯一过滤器，所述唯一过滤器是指与长度调整后的所述文本信息的上下文相关的过滤器；The length-adjusted text information is passed into the meta-network as input, and a set of unique filters corresponding to the text information is generated through the meta-network, where the unique filters refer to filters related to the context of the length-adjusted text information;
将长度调整后的所述文本信息作为输入传入所述唯一过滤器，通过所述唯一过滤器提取所述文本信息对应的特征向量矩阵，所述特征向量矩阵中的各个元素表示所述文本信息的特征。The length-adjusted text information is passed into the unique filters as input, and the feature vector matrix corresponding to the text information is extracted through the unique filters; each element in the feature vector matrix represents a feature of the text information.
本领域普通技术人员可以理解实现上述实施例方法中的全部或部分流程,是可以通过计算机可读指令来指令相关的硬件来完成,所述的计算机可读指令可存储于一非易失性计算机可读取存储介质中,该计算机可读指令在执行时,可包括如上述各方法的实施例的流程。其中,本申请所提供的各实施例中所使用的对存储器、存储、数据库或其它介质的任何引用,均可包括非易失性和/或易失性存储器。非易失性存储器可包括只读存储器(ROM)、可编程ROM(PROM)、电可编程ROM(EPROM)、电可擦除可编程ROM(EEPROM)或闪存。易失性存储器可包括随机存取存储器(RAM)或者外部高速缓冲存储器。作为说明而非局限,RAM以多种形式可得,诸如静态RAM(SRAM)、动态RAM(DRAM)、同步DRAM(SDRAM)、双数据率SDRAM(DDRSDRAM)、增强型SDRAM(ESDRAM)、同步链路(Synchlink)DRAM(SLDRAM)、存储器总线(Rambus)直接RAM(RDRAM)、直接存储器总线动态RAM(DRDRAM)、以及存储器总线动态RAM(RDRAM)等。A person of ordinary skill in the art can understand that all or part of the processes in the above-mentioned embodiment methods can be implemented by instructing relevant hardware through computer-readable instructions, which can be stored in a non-volatile computer. In a readable storage medium, when the computer-readable instructions are executed, they may include the processes of the above-mentioned method embodiments. Wherein, any reference to memory, storage, database or other media used in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. As an illustration and not a limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDRSDRAM), enhanced SDRAM (ESDRAM), synchronous chain Channel (Synchlink) DRAM (SLDRAM), memory bus (Rambus) direct RAM (RDRAM), direct memory bus dynamic RAM (DRDRAM), and memory bus dynamic RAM (RDRAM), etc.
所属领域的技术人员可以清楚地了解到,为了描述的方便和简洁,仅以上述各功能单元、模块的划分进行举例说明,实际应用中,可以根据需要而将上述功能分配由不同的功能单元、模块完成,即将所述装置的内部结构划分成不同的功能单元或模块,以完成以上描述的全部或者部分功能。Those skilled in the art can clearly understand that for the convenience and conciseness of description, only the division of the above-mentioned functional units and modules is used as an example. In practical applications, the above-mentioned functions can be allocated to different functional units and modules as required. Module completion means dividing the internal structure of the device into different functional units or modules to complete all or part of the functions described above.
以上所述实施例仅用以说明本申请的技术方案，而非对其限制；尽管参照前述实施例对本申请进行了详细的说明，本领域的普通技术人员应当理解：其依然可以对前述各实施例所记载的技术方案进行修改，或者对其中部分技术特征进行等同替换；而这些修改或者替换，并不使相应技术方案的本质脱离本申请各实施例技术方案的精神和范围，均应包含在本申请的保护范围之内。The above embodiments are only used to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they can still modify the technical solutions described in the foregoing embodiments, or equivalently replace some of the technical features therein; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application, and shall all be included within the protection scope of the present application.

Claims (20)

  1. 一种文本信息的特征提取方法,其特征在于,包括:A method for feature extraction of text information, which is characterized in that it includes:
    设置并训练元网络,所述元网络是指用于生成与所输入的文本信息对应的一组唯一过滤器的网络;Setting up and training a meta-network, the meta-network refers to a network used to generate a set of unique filters corresponding to the input text information;
    获取待识别的文本信息;Obtain the text information to be recognized;
    将所述待识别的文本信息的长度调整为所述元网络的输入长度;Adjusting the length of the text information to be recognized to the input length of the meta-network;
    将长度调整后的所述文本信息作为输入传入所述元网络，通过所述元网络生成所述文本信息对应的一组唯一过滤器，所述唯一过滤器是指与长度调整后的所述文本信息的上下文相关的过滤器；The length-adjusted text information is passed into the meta-network as input, and a set of unique filters corresponding to the text information is generated through the meta-network, where the unique filters refer to filters related to the context of the length-adjusted text information;
    将长度调整后的所述文本信息作为输入传入所述唯一过滤器，通过所述唯一过滤器提取所述文本信息对应的特征向量矩阵，所述特征向量矩阵中的各个元素表示所述文本信息的特征。The length-adjusted text information is passed into the unique filters as input, and the feature vector matrix corresponding to the text information is extracted through the unique filters; each element in the feature vector matrix represents a feature of the text information.
  2. 如权利要求1所述的文本信息的特征提取方法,其特征在于,所述将所述待识别的文本信息的长度调整为所述元网络的输入长度包括:The method for extracting features of text information according to claim 1, wherein the adjusting the length of the text information to be recognized to the input length of the meta-network comprises:
    获取所述元网络的输入长度,判断所述待识别的文本信息的长度是否达到所述输入长度;Acquiring the input length of the meta network, and determining whether the length of the text information to be recognized reaches the input length;
    若否时,将预设字符填充至所述待识别的文本信息末尾,以将所述待识别的文本信息的长度调整为所述输入长度。If not, fill preset characters to the end of the text information to be recognized, so as to adjust the length of the text information to be recognized to the input length.
  3. 如权利要求1或2所述的文本信息的特征提取方法，其特征在于，所述将长度调整后的所述文本信息作为输入传入所述元网络，通过所述元网络生成所述文本信息对应的一组唯一过滤器包括：The method for feature extraction of text information according to claim 1 or 2, wherein the passing of the length-adjusted text information into the meta-network as input and the generating, through the meta-network, of the set of unique filters corresponding to the text information comprises:
    对长度调整后的所述文本信息进行向量化处理,得到向量矩阵,所述向量矩阵中包括若干个词嵌入向量,每一个词嵌入向量的长度相等;Performing vectorization processing on the length-adjusted text information to obtain a vector matrix, the vector matrix includes a plurality of word embedding vectors, and the length of each word embedding vector is equal;
    通过所述元网络对所述向量矩阵执行卷积运算,得到指定长度的隐藏层向量;Performing a convolution operation on the vector matrix through the meta network to obtain a hidden layer vector of a specified length;
    对所述隐藏层向量执行转置卷积运算,得到长度调整后的所述文本信息对应的一组唯一过滤器。Performing a transposed convolution operation on the hidden layer vector to obtain a set of unique filters corresponding to the text information after the length adjustment.
  4. 如权利要求1或2所述的文本信息的特征提取方法，其特征在于，所述将长度调整后的所述文本信息作为输入传入所述唯一过滤器，通过所述唯一过滤器提取所述文本信息对应的特征向量矩阵包括：The method for feature extraction of text information according to claim 1 or 2, wherein the passing of the length-adjusted text information into the unique filters as input and the extracting, through the unique filters, of the feature vector matrix corresponding to the text information comprises:
    对长度调整后的所述文本信息进行向量化处理，得到向量矩阵，所述向量矩阵中包括若干个词嵌入向量，每一个词嵌入向量的长度相等；Performing vectorization processing on the length-adjusted text information to obtain a vector matrix, the vector matrix includes a plurality of word embedding vectors, each of which has the same length;
    通过所述唯一过滤器对所述向量矩阵执行卷积运算,提取所述文本信息对应的特征图;Performing a convolution operation on the vector matrix through the unique filter to extract a feature map corresponding to the text information;
    对所述特征图执行池化操作,提取特征图中每一行的最大值作为主要特征,得到所述文本信息对应的特征向量矩阵。Perform a pooling operation on the feature map, extract the maximum value of each row in the feature map as a main feature, and obtain a feature vector matrix corresponding to the text information.
  5. 如权利要求1或2所述的文本信息的特征提取方法,其特征在于,在通过所述唯一过滤器提取所述文本信息对应的特征向量矩阵后,还包括:The method for feature extraction of text information according to claim 1 or 2, characterized in that, after the feature vector matrix corresponding to the text information is extracted by the unique filter, the method further comprises:
    将所述特征向量矩阵作为输入传入全连接层,然后将全连接层的输出作为输入传入预设的Softmax分类器;Pass the feature vector matrix as input to the fully connected layer, and then use the output of the fully connected layer as input to the preset Softmax classifier;
    根据所述Softmax分类器的输出获取所述文本信息对应的类别。Acquire the category corresponding to the text information according to the output of the Softmax classifier.
  6. 一种文本信息的特征提取装置,其特征在于,包括:A feature extraction device for text information, characterized in that it comprises:
    训练模块,用于设置并训练元网络,所述元网络是指用于生成与所输入的文本信息对应的一组唯一过滤器的网络;A training module for setting up and training a meta-network, where the meta-network refers to a network for generating a set of unique filters corresponding to the input text information;
    信息获取模块,用于获取待识别的文本信息;Information acquisition module for acquiring text information to be recognized;
    长度调整模块,用于将所述待识别的文本信息的长度调整为所述元网络的输入长度;A length adjustment module, configured to adjust the length of the text information to be recognized to the input length of the meta network;
    过滤器生成模块，用于将长度调整后的所述文本信息作为输入传入所述元网络，通过所述元网络生成所述文本信息对应的一组唯一过滤器，所述唯一过滤器是指与长度调整后的所述文本信息的上下文相关的过滤器；The filter generation module is configured to pass the length-adjusted text information into the meta-network as input and generate, through the meta-network, a set of unique filters corresponding to the text information, where the unique filters refer to filters related to the context of the length-adjusted text information;
    特征提取模块，用于将长度调整后的所述文本信息作为输入传入所述唯一过滤器，通过所述唯一过滤器提取所述文本信息对应的特征向量矩阵，所述特征向量矩阵中的各个元素表示所述文本信息的特征。The feature extraction module is configured to pass the length-adjusted text information into the unique filters as input and extract, through the unique filters, the feature vector matrix corresponding to the text information, where each element in the feature vector matrix represents a feature of the text information.
  7. 如权利要求6所述的文本信息的特征提取装置,其特征在于,所述长度调整模块包括:7. The text information feature extraction device according to claim 6, wherein the length adjustment module comprises:
    长度获取单元,用于获取所述元网络的输入长度,判断所述待识别的文本信息的长度是否达到所述输入长度;A length acquiring unit, configured to acquire the input length of the meta-network, and determine whether the length of the text information to be recognized reaches the input length;
    长度调整单元，用于若所述待识别的文本信息的长度未达到所述输入长度时，将预设字符填充至所述待识别的文本信息末尾，以将所述待识别的文本信息的长度调整为所述输入长度。The length adjustment unit is configured to, if the length of the text information to be recognized does not reach the input length, pad preset characters to the end of the text information to be recognized, so as to adjust the length of the text information to be recognized to the input length.
  8. 如权利要求6或7所述的文本信息的特征提取装置,其特征在于,所述过滤器生成模块包括:8. The text information feature extraction device according to claim 6 or 7, wherein the filter generation module comprises:
    第一向量化单元，用于对长度调整后的所述文本信息进行向量化处理，得到向量矩阵，所述向量矩阵中包括若干个词嵌入向量，每一个词嵌入向量的长度相等；The first vectorization unit is configured to perform vectorization processing on the length-adjusted text information to obtain a vector matrix. The vector matrix includes a plurality of word embedding vectors, and each word embedding vector has the same length;
    第一卷积单元,用于通过所述元网络对所述向量矩阵执行卷积运算,得到指定长度的隐藏层向量;The first convolution unit is configured to perform a convolution operation on the vector matrix through the meta-network to obtain a hidden layer vector of a specified length;
    转置卷积单元,用于对所述隐藏层向量执行转置卷积运算,得到所述长度调整后的文本信息对应的一组唯一过滤器。The transposed convolution unit is configured to perform a transposed convolution operation on the hidden layer vector to obtain a group of unique filters corresponding to the length-adjusted text information.
  9. 如权利要求6或7所述的文本信息的特征提取装置,其特征在于,所述特征提取模块包括:The feature extraction device for text information according to claim 6 or 7, wherein the feature extraction module comprises:
    第二向量化单元,用于对长度调整后的所述文本信息进行向量化处理,得到向量矩阵,所述向量矩阵中包括若干个词嵌入向量,每一个词嵌入向量的长度相等;The second vectorization unit is configured to perform vectorization processing on the length-adjusted text information to obtain a vector matrix. The vector matrix includes a plurality of word embedding vectors, and each word embedding vector has the same length;
    第二卷积单元,用于通过所述唯一过滤器对所述向量矩阵执行卷积运算,提取所述文本信息对应的特征图;A second convolution unit, configured to perform a convolution operation on the vector matrix through the unique filter to extract a feature map corresponding to the text information;
    池化单元,用于对所述特征图执行池化操作,提取特征图中每一行的最大值作为主要特征,得到所述文本信息对应的特征向量矩阵。The pooling unit is configured to perform a pooling operation on the feature map, extract the maximum value of each row in the feature map as a main feature, and obtain a feature vector matrix corresponding to the text information.
  10. 如权利要求6或7所述的文本信息的特征提取装置,其特征在于,所述特征提取装置还包括:The feature extraction device for text information according to claim 6 or 7, wherein the feature extraction device further comprises:
    分类模块，用于将所述特征向量矩阵作为输入传入全连接层，然后将全连接层的输出作为输入传入预设的Softmax分类器；根据所述Softmax分类器的输出获取所述文本信息对应的类别。The classification module is configured to pass the feature vector matrix into the fully connected layer as input, then pass the output of the fully connected layer into the preset Softmax classifier as input, and obtain the category corresponding to the text information according to the output of the Softmax classifier.
  11. 一种计算机设备，包括存储器、处理器以及存储在所述存储器中并可在所述处理器上运行的计算机可读指令，其特征在于，所述处理器执行所述计算机可读指令时实现如下步骤：A computer device, comprising a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, wherein the processor implements the following steps when executing the computer-readable instructions:
    设置并训练元网络,所述元网络是指用于生成与所输入的文本信息对应的一组唯一过滤器的网络;Setting up and training a meta-network, the meta-network refers to a network used to generate a set of unique filters corresponding to the input text information;
    获取待识别的文本信息;Obtain the text information to be recognized;
    将所述待识别的文本信息的长度调整为所述元网络的输入长度;Adjusting the length of the text information to be recognized to the input length of the meta-network;
    将长度调整后的所述文本信息作为输入传入所述元网络，通过所述元网络生成所述文本信息对应的一组唯一过滤器，所述唯一过滤器是指与长度调整后的所述文本信息的上下文相关的过滤器；The length-adjusted text information is passed into the meta-network as input, and a set of unique filters corresponding to the text information is generated through the meta-network, where the unique filters refer to filters related to the context of the length-adjusted text information;
    将长度调整后的所述文本信息作为输入传入所述唯一过滤器，通过所述唯一过滤器提取所述文本信息对应的特征向量矩阵，所述特征向量矩阵中的各个元素表示所述文本信息的特征。The length-adjusted text information is passed into the unique filters as input, and the feature vector matrix corresponding to the text information is extracted through the unique filters; each element in the feature vector matrix represents a feature of the text information.
  12. 如权利要求11所述的计算机设备,其特征在于,所述将所述待识别的文本信息的长度调整为所述元网络的输入长度包括:The computer device according to claim 11, wherein the adjusting the length of the text information to be recognized to the input length of the meta network comprises:
    获取所述元网络的输入长度,判断所述待识别的文本信息的长度是否达到所述输入长度;Acquiring the input length of the meta network, and determining whether the length of the text information to be recognized reaches the input length;
    若否时,将预设字符填充至所述待识别的文本信息末尾,以将所述待识别的文本信息的长度调整为所述输入长度。If not, fill preset characters to the end of the text information to be recognized, so as to adjust the length of the text information to be recognized to the input length.
  13. 如权利要求11或12所述的计算机设备，其特征在于，所述将长度调整后的所述文本信息作为输入传入所述元网络，通过所述元网络生成所述文本信息对应的一组唯一过滤器包括：The computer device according to claim 11 or 12, wherein the passing of the length-adjusted text information into the meta-network as input and the generating, through the meta-network, of the set of unique filters corresponding to the text information comprises:
    对长度调整后的所述文本信息进行向量化处理,得到向量矩阵,所述向量矩阵中包括若干个词嵌入向量,每一个词嵌入向量的长度相等;Performing vectorization processing on the length-adjusted text information to obtain a vector matrix, the vector matrix includes a plurality of word embedding vectors, and the length of each word embedding vector is equal;
    通过所述元网络对所述向量矩阵执行卷积运算,得到指定长度的隐藏层向量;Performing a convolution operation on the vector matrix through the meta network to obtain a hidden layer vector of a specified length;
    对所述隐藏层向量执行转置卷积运算,得到长度调整后的所述文本信息对应的一组唯一过滤器。Performing a transposed convolution operation on the hidden layer vector to obtain a set of unique filters corresponding to the text information after the length adjustment.
  14. 如权利要求11或12所述的计算机设备，其特征在于，所述将长度调整后的所述文本信息作为输入传入所述唯一过滤器，通过所述唯一过滤器提取所述文本信息对应的特征向量矩阵包括：The computer device according to claim 11 or 12, wherein the passing of the length-adjusted text information into the unique filters as input and the extracting, through the unique filters, of the feature vector matrix corresponding to the text information comprises:
    对长度调整后的所述文本信息进行向量化处理,得到向量矩阵,所述向量矩阵中包括若干个词嵌入向量,每一个词嵌入向量的长度相等;Performing vectorization processing on the length-adjusted text information to obtain a vector matrix, the vector matrix includes a plurality of word embedding vectors, and the length of each word embedding vector is equal;
    通过所述唯一过滤器对所述向量矩阵执行卷积运算,提取所述文本信息对应的特征图;Performing a convolution operation on the vector matrix through the unique filter to extract a feature map corresponding to the text information;
    对所述特征图执行池化操作,提取特征图中每一行的最大值作为主要特征,得到所述文本信息对应的特征向量矩阵。Perform a pooling operation on the feature map, extract the maximum value of each row in the feature map as a main feature, and obtain a feature vector matrix corresponding to the text information.
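To illustrate claim 14, the sketch below applies each sentence's own filter bank to its embedding matrix and then max-pools each row of the resulting feature map. The explicit loop over the batch (because every sentence has different filters) and the ReLU nonlinearity are implementation assumptions; a grouped convolution would be an equivalent, faster formulation.

```python
import torch
import torch.nn.functional as F

def extract_features(vector_matrix, filters):
    """vector_matrix: (batch, input_length, embed_dim)
    filters: (batch, num_filters, embed_dim, kernel_size), one bank per sentence.
    Returns a (batch, num_filters) matrix of row-wise maxima of the feature maps."""
    outputs = []
    for sentence, bank in zip(vector_matrix, filters):
        x = sentence.transpose(0, 1).unsqueeze(0)      # (1, embed_dim, input_length)
        feature_map = F.relu(F.conv1d(x, bank))        # (1, num_filters, L - k + 1)
        outputs.append(feature_map.max(dim=2).values)  # row-wise max -> (1, num_filters)
    return torch.cat(outputs, dim=0)                   # (batch, num_filters)

# Hypothetical wiring with the sketches above:
# meta = MetaNetwork()
# filters = meta(vector_matrix)
# features = extract_features(vector_matrix, filters)
```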
  15. The computer device according to claim 11 or 12, wherein the processor, when executing the computer-readable instructions, further implements the following steps:
    passing the feature vector matrix into a fully connected layer as input, and then passing the output of the fully connected layer into a preset Softmax classifier as input;
    obtaining the category corresponding to the text information according to the output of the Softmax classifier.
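A minimal sketch of the classification head in claim 15 follows: a fully connected layer over the pooled feature vector, followed by softmax. The feature and class counts are illustrative; in practice one would usually train the linear layer with `nn.CrossEntropyLoss` and apply softmax only at inference, but that training detail is an assumption, not something the claim recites.

```python
import torch
import torch.nn as nn

num_filters, num_classes = 4, 3              # illustrative sizes
classifier = nn.Linear(num_filters, num_classes)

features = torch.randn(2, num_filters)       # stand-in for the extracted feature vectors
probs = torch.softmax(classifier(features), dim=1)
predicted_category = probs.argmax(dim=1)     # category index per input text
print(probs, predicted_category)
```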
  16. One or more non-volatile readable storage media storing computer-readable instructions, wherein the computer-readable instructions, when executed by one or more processors, cause the one or more processors to execute the following steps:
    setting up and training a meta-network, the meta-network being a network used to generate a set of unique filters corresponding to input text information;
    obtaining text information to be recognized;
    adjusting the length of the text information to be recognized to the input length of the meta-network;
    passing the length-adjusted text information into the meta-network as input, and generating, through the meta-network, a set of unique filters corresponding to the text information, the unique filters being filters related to the context of the length-adjusted text information;
    passing the length-adjusted text information into the unique filters as input, and extracting, through the unique filters, a feature vector matrix corresponding to the text information, each element in the feature vector matrix representing a feature of the text information.
  17. The non-volatile readable storage medium according to claim 16, wherein adjusting the length of the text information to be recognized to the input length of the meta-network comprises:
    obtaining the input length of the meta-network, and determining whether the length of the text information to be recognized reaches the input length;
    if not, padding preset characters to the end of the text information to be recognized, so as to adjust the length of the text information to be recognized to the input length.
  18. The non-volatile readable storage medium according to claim 16 or 17, wherein passing the length-adjusted text information into the meta-network as input and generating, through the meta-network, a set of unique filters corresponding to the text information comprises:
    performing vectorization processing on the length-adjusted text information to obtain a vector matrix, the vector matrix comprising a number of word embedding vectors, each word embedding vector being of equal length;
    performing a convolution operation on the vector matrix through the meta-network to obtain a hidden layer vector of a specified length;
    performing a transposed convolution operation on the hidden layer vector to obtain a set of unique filters corresponding to the length-adjusted text information.
  19. The non-volatile readable storage medium according to claim 16 or 17, wherein passing the length-adjusted text information into the unique filters as input and extracting, through the unique filters, a feature vector matrix corresponding to the text information comprises:
    performing vectorization processing on the length-adjusted text information to obtain a vector matrix, the vector matrix comprising a number of word embedding vectors, each word embedding vector being of equal length;
    performing a convolution operation on the vector matrix through the unique filters to extract a feature map corresponding to the text information;
    performing a pooling operation on the feature map, and extracting the maximum value of each row of the feature map as a main feature, to obtain the feature vector matrix corresponding to the text information.
  20. The non-volatile readable storage medium according to claim 16 or 17, wherein the computer-readable instructions, when executed by one or more processors, further cause the one or more processors to execute the following steps:
    passing the feature vector matrix into a fully connected layer as input, and then passing the output of the fully connected layer into a preset Softmax classifier as input;
    obtaining the category corresponding to the text information according to the output of the Softmax classifier.
PCT/CN2019/117424 2019-03-06 2019-11-12 Text information feature extraction method and device, computer apparatus, and storage medium WO2020177378A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910168231.6A CN110020431B (en) 2019-03-06 2019-03-06 Feature extraction method and device of text information, computer equipment and storage medium
CN201910168231.6 2019-03-06

Publications (1)

Publication Number Publication Date
WO2020177378A1 true WO2020177378A1 (en) 2020-09-10

Family

ID=67189329

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/117424 WO2020177378A1 (en) 2019-03-06 2019-11-12 Text information feature extraction method and device, computer apparatus, and storage medium

Country Status (2)

Country Link
CN (1) CN110020431B (en)
WO (1) WO2020177378A1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110020431B (en) * 2019-03-06 2023-07-18 平安科技(深圳)有限公司 Feature extraction method and device of text information, computer equipment and storage medium
CN110889290B (en) * 2019-11-13 2021-11-16 北京邮电大学 Text encoding method and apparatus, text encoding validity checking method and apparatus
CN116401381B (en) * 2023-06-07 2023-08-04 神州医疗科技股份有限公司 Method and device for accelerating extraction of medical relations

Citations (4)

Publication number Priority date Publication date Assignee Title
CN107797985A (en) * 2017-09-27 2018-03-13 百度在线网络技术(北京)有限公司 Establish synonymous discriminating model and differentiate the method, apparatus of synonymous text
CN108763319A (en) * 2018-04-28 2018-11-06 中国科学院自动化研究所 Merge the social robot detection method and system of user behavior and text message
US20180373947A1 (en) * 2017-06-22 2018-12-27 StradVision, Inc. Method for learning text recognition, method for recognizing text using the same, and apparatus for learning text recognition, apparatus for recognizing text using the same
CN110020431A (en) * 2019-03-06 2019-07-16 平安科技(深圳)有限公司 Feature extracting method, device, computer equipment and the storage medium of text information

Family Cites Families (12)

Publication number Priority date Publication date Assignee Title
CN102073704B (en) * 2010-12-24 2013-09-25 华为终端有限公司 Text classification processing method, system and equipment
CN102541958A (en) * 2010-12-30 2012-07-04 百度在线网络技术(北京)有限公司 Method, device and computer equipment for identifying short text category information
CN105404899A (en) * 2015-12-02 2016-03-16 华东师范大学 Image classification method based on multi-directional context information and sparse coding model
US9659248B1 (en) * 2016-01-19 2017-05-23 International Business Machines Corporation Machine learning and training a computer-implemented neural network to retrieve semantically equivalent questions using hybrid in-memory representations
KR20180077846A (en) * 2016-12-29 2018-07-09 주식회사 엔씨소프트 Apparatus and method for detecting debatable document
CN107066553B (en) * 2017-03-24 2021-01-01 北京工业大学 Short text classification method based on convolutional neural network and random forest
CN107169035B (en) * 2017-04-19 2019-10-18 华南理工大学 A kind of file classification method mixing shot and long term memory network and convolutional neural networks
US10417266B2 (en) * 2017-05-09 2019-09-17 Apple Inc. Context-aware ranking of intelligent response suggestions
CN107301246A (en) * 2017-07-14 2017-10-27 河北工业大学 Chinese Text Categorization based on ultra-deep convolutional neural networks structural model
CN107766324B (en) * 2017-09-25 2020-09-01 浙江大学 Text consistency analysis method based on deep neural network
CN108536678B (en) * 2018-04-12 2023-04-07 腾讯科技(深圳)有限公司 Text key information extraction method and device, computer equipment and storage medium
CN109299262B (en) * 2018-10-09 2022-04-15 中山大学 Text inclusion relation recognition method fusing multi-granularity information

Patent Citations (4)

Publication number Priority date Publication date Assignee Title
US20180373947A1 (en) * 2017-06-22 2018-12-27 StradVision, Inc. Method for learning text recognition, method for recognizing text using the same, and apparatus for learning text recognition, apparatus for recognizing text using the same
CN107797985A (en) * 2017-09-27 2018-03-13 百度在线网络技术(北京)有限公司 Establish synonymous discriminating model and differentiate the method, apparatus of synonymous text
CN108763319A (en) * 2018-04-28 2018-11-06 中国科学院自动化研究所 Merge the social robot detection method and system of user behavior and text message
CN110020431A (en) * 2019-03-06 2019-07-16 平安科技(深圳)有限公司 Feature extracting method, device, computer equipment and the storage medium of text information

Non-Patent Citations (1)

Title
WU, Qiong et al.: "Sentiment Classification with Multiscale Convolutional Recurrent Neural Network" [多尺度卷积循环神经网络的情感分类技术], Journal of Huaqiao University (Natural Science) [华侨大学学报 (自然科学版)], vol. 38, no. 6, 20 November 2017 (2017-11-20), XP55732002 *

Also Published As

Publication number Publication date
CN110020431B (en) 2023-07-18
CN110020431A (en) 2019-07-16

Similar Documents

Publication Publication Date Title
CN111368996B (en) Retraining projection network capable of transmitting natural language representation
WO2021027533A1 (en) Text semantic recognition method and apparatus, computer device, and storage medium
CN110347835B (en) Text clustering method, electronic device and storage medium
US11816442B2 (en) Multi-turn dialogue response generation with autoregressive transformer models
WO2020177230A1 (en) Medical data classification method and apparatus based on machine learning, and computer device and storage medium
WO2020177378A1 (en) Text information feature extraction method and device, computer apparatus, and storage medium
US20230186912A1 (en) Speech recognition method, apparatus and device, and storage medium
CN113094578B (en) Deep learning-based content recommendation method, device, equipment and storage medium
WO2020147395A1 (en) Emotion-based text classification method and device, and computer apparatus
CN110569500A (en) Text semantic recognition method and device, computer equipment and storage medium
CN111428485B (en) Judicial document paragraph classifying method, device, computer equipment and storage medium
EP3620982B1 (en) Sample processing method and device
CN113297366B (en) Emotion recognition model training method, device, equipment and medium for multi-round dialogue
JP2022158735A (en) Learning device, learning method, learning program, retrieval device, retrieval method, and retrieval program
WO2022141868A1 (en) Method and apparatus for extracting speech features, terminal, and storage medium
WO2020211720A1 (en) Data processing method and pronoun resolution neural network training method
WO2022227214A1 (en) Classification model training method and apparatus, and terminal device and storage medium
WO2021127982A1 (en) Speech emotion recognition method, smart device, and computer-readable storage medium
CN111611383A (en) User intention recognition method and device, computer equipment and storage medium
CN110807309A (en) Method and device for identifying content type of PDF document and electronic equipment
CN113011532A (en) Classification model training method and device, computing equipment and storage medium
CN114360520A (en) Training method, device and equipment of voice classification model and storage medium
CN111898363A (en) Method and device for compressing long and difficult sentences of text, computer equipment and storage medium
CN115017900B (en) Conversation emotion recognition method based on multi-mode multi-prejudice
CN111680132A (en) Noise filtering and automatic classifying method for internet text information

Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19917824

Country of ref document: EP

Kind code of ref document: A1