CN111339292A - Training method, system, equipment and storage medium of text classification network - Google Patents

Training method, system, equipment and storage medium of text classification network

Info

Publication number
CN111339292A
CN111339292A (application CN201811555318.0A)
Authority
CN
China
Prior art keywords
training
texts
text
category
training set
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811555318.0A
Other languages
Chinese (zh)
Inventor
王颖帅
李晓霞
苗诗雨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jingdong Century Trading Co Ltd, Beijing Jingdong Shangke Information Technology Co Ltd filed Critical Beijing Jingdong Century Trading Co Ltd
Priority to CN201811555318.0A priority Critical patent/CN111339292A/en
Publication of CN111339292A publication Critical patent/CN111339292A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/08 Learning methods


Abstract

The invention provides a method, a system, a device, and a storage medium for training a text classification network. The method comprises the following steps: receiving training texts and their category labels as a training set; counting the number of texts in each category of the training set; adjusting the number of texts in each category so that the ratio of text counts across categories meets a first preset proportion requirement; and training a convolutional neural network for text classification on the adjusted training set to obtain a trained text classification network. By balancing the number of texts in the training set and constructing a more accurate convolutional neural network as the text classification network, the invention provides a more accurate text classification network that classifies user input more accurately, locates user needs precisely, and improves the user experience.

Description

Training method, system, equipment and storage medium of text classification network
Technical Field
The present invention relates to the field of data processing technologies, and in particular, to a method, a system, a device, and a storage medium for training a text classification network.
Background
Deep learning is a branch of machine learning whose basic idea is to learn multiple layers of representation as a model of the data. In recent years, with the development of big data and artificial intelligence, deep learning has been widely applied to image recognition, natural language processing, recommendation, and so on. In the e-commerce shopping domain, for example, more and more users like to use intelligent assistants and enjoy high-tech products anytime and anywhere. An intelligent assistant is a software application or platform that, based on artificial intelligence techniques, meets user needs by understanding natural language in speech or text.
In the prior art, the intelligent assistant trains its text-understanding classifier as follows: inputs are classified by regular-expression matching against phrasing templates, where product staff must author the phrasings for each category in advance and keep updating the templates to follow the live service. With this approach, many sentence patterns must be conceived manually; since human foresight is limited, the templates cannot cover all the sentence patterns seen online, and the predictions are rigid.
Disclosure of Invention
In view of the problems in the prior art, the invention aims to provide a method, a system, a device, and a storage medium for training a text classification network, so as to provide a more accurate text classification network that classifies user input more accurately, locates user needs precisely, and improves the user experience.
The embodiment of the invention provides a training method of a text classification network, which comprises the following steps:
receiving a training text and a category label of the training text as a training set;
counting the number of texts of each category in a training set;
adjusting the number of texts of each category in the training set, so that the number ratio of the texts of each category in the training set meets a first preset proportion requirement;
and training the convolutional neural network for text classification by adopting the adjusted training set to obtain a trained text classification network.
Optionally, the method further comprises the steps of:
dividing the texts in the training set into texts with different length intervals according to the sentence length;
and adjusting the number of texts in each length interval to enable the number ratio of the texts in each length interval in the training set to meet the requirement of a second preset proportion.
Optionally, the adjusting the number of texts in each category in the training set includes the following steps:
for a category whose number of texts is too small to meet the first preset proportion requirement, supplementing the texts of that category;
and for a category whose number of texts is too large to meet the first preset proportion requirement, screening a specified number of texts from that category, such that the number of texts retained after screening meets the first preset proportion requirement.
Optionally, before counting the number of texts in each category in the training set, the method further includes the following steps:
dividing texts in a training set into a plurality of batches;
and respectively counting the number of texts of each category in each batch, and adjusting the number of texts of each category in the batch, so that the ratio of the number of texts of each category in the batch meets a first preset proportion requirement.
Optionally, after receiving the training text and the category label of the training text, the method further includes the following steps:
matching the training texts against a preset filtering pattern (regular expression);
and filtering out the training texts that match.
Optionally, after receiving the training text and the category label of the training text, the method further includes the following steps:
and screening the training texts, and removing repeated texts in the training texts.
Optionally, the convolutional neural network includes a first convolutional layer, a first pooling layer, a second convolutional layer, a second pooling layer, a third convolutional layer, a third pooling layer, a fully-connected layer, and a classification layer, which are connected in sequence.
Optionally, the first, second, and third convolution layers adopt 1 × 1, 3 × 3, and 5 × 5 convolution kernels, respectively, and the pooling regions of the first, second, and third pooling layers are each a 2 × 2 matrix.
Optionally, the convolutional neural network is implemented with local receptive fields, and each layer shares its local connection parameters with the previous layer.
Optionally, when the convolutional neural network for text classification is trained, the loss of the obtained optimal text classification model is less than 0.05, the accuracy is 0.91, and the F1 value is 0.92.
Optionally, the text classification network includes a Highway network layer, and the Highway network layer performs feature fusion on the chinese character level features, the segmentation level features, and the word vector level features in the training set.
Optionally, the method further comprises the following steps:
acquiring evaluation data of a text classification network;
searching the evaluation data for misclassified data, and re-labeling the misclassified data with the correct category;
adding the misclassified data, together with its corrected labels, to the training set;
and re-training the text classification network by adopting the updated training set.
The embodiment of the invention also provides a training system of a text classification network, applied to the above training method of a text classification network, the training system comprising:
the text acquisition module is used for receiving the training text and the class label of the training text as a training set;
the training set balancing module is used for counting the number of texts of each category in the training set and adjusting the number of the texts of each category in the training set, so that the number ratio of the texts of each category in the training set meets a first preset proportion requirement;
and the classification network training module is used for training the convolutional neural network for text classification by adopting the adjusted training set to obtain a trained text classification network.
The embodiment of the present invention further provides a training device for a text classification network, including:
a processor;
a memory having stored therein executable instructions of the processor;
wherein the processor is configured to perform the steps of the method of training a text classification network via execution of the executable instructions.
The embodiment of the present invention further provides a computer-readable storage medium, which is used for storing a program, and when the program is executed, the steps of the training method for the text classification network are implemented.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
The training method, the system, the equipment and the storage medium of the text classification network provided by the invention have the following advantages:
according to the method, the text quantity in the training set is balanced, the text classification network is guaranteed to have good generalization capability for each category, and a more accurate convolutional neural network is constructed to serve as the text classification network, so that the more accurate text classification network is provided, the user input can be classified more accurately, the user requirements are accurately positioned, the predicted user sentence pattern coverage is larger, and the user experience is improved; further, the method and the device evaluate the text classification network, establish badcase feedback according to classification error data in evaluation data, and update the training set according to the feedback data, thereby realizing the continuous improvement of the text classification network.
Drawings
Other features, objects and advantages of the present invention will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, with reference to the accompanying drawings.
FIG. 1 is a flow chart of a method of training a text classification network according to an embodiment of the invention;
FIG. 2 is a schematic diagram of a convolutional neural network according to an embodiment of the present invention;
FIG. 3 is a block diagram of a training system for a text classification network according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a training device of a text classification network according to an embodiment of the present invention;
fig. 5 is a schematic structural diagram of a computer-readable storage medium according to an embodiment of the present invention.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments may, however, be embodied in many different forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art. The described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
Furthermore, the drawings are merely schematic illustrations of the present disclosure and are not necessarily drawn to scale. The same reference numerals in the drawings denote the same or similar parts, and thus their repetitive description will be omitted. Some of the block diagrams shown in the figures are functional entities and do not necessarily correspond to physically or logically separate entities. These functional entities may be implemented in the form of software, or in one or more hardware units or integrated circuits, or in different networks and/or processor means and/or microcontroller means.
As shown in fig. 1, an embodiment of the present invention provides a method for training a text classification network, where the method includes the following steps:
s100: receiving a training text and a category label of the training text as a training set;
s200: counting the number of texts of each category in a training set;
s300: adjusting the number of texts of each category in the training set, so that the number ratio of the texts of each category in the training set meets a first preset proportion requirement;
s400: and training the convolutional neural network for text classification by adopting the adjusted training set to obtain a trained text classification network.
Thus, by balancing the number of texts in the training set, the invention ensures that the text classification network generalizes well to every category, and constructs a more accurate convolutional neural network as the text classification network, thereby classifying user input more accurately, locating user needs precisely, covering a larger range of user sentence patterns, and improving the user experience.
In this embodiment, the scenes of the text classification include seven, which are:
ACT_COMMODITY, representing a specific commodity query service scenario;
ACT_ORDER, representing an order query service scenario;
ACT_DISCOUNT, representing a fuzzy discount query service scenario;
ACT_SPECIFY_DISCOUNT, representing a specific discount query service scenario;
ACT_AFTER_SALES, representing an after-sales service scenario;
ACT_SHORTCUT, representing a whole-site shortcut service scenario;
ACT_UNKNOWN, representing an unknown scenario.
In practical applications, the scenarios may change, or some may be added or deleted, and the type and number of categories handled by the text classification network are adjusted accordingly.
In step S100, training texts and their category labels are received. Training texts may be obtained by filtering users' first-sentence inputs or by extracting them from the intelligent assistant log, followed by manual labeling. The goal of manual labeling is to attach the correct shopping-scenario intent, as the label of the deep-learning training set, to the valuable utterances users type into the intelligent assistant. In the intelligent assistant project, the first sentence of a user's conversation with the chatbot is what most needs to be recognized, so the program filters out the first sentence of each of the user's sessions. The intelligent assistant's logs land in a big-data Hive table (Hive is a data warehouse tool built on Hadoop; it maps structured data files to database tables, provides a simple SQL query capability, and converts SQL statements into MapReduce jobs). The fields include 'business scene', 'channel number', 'current scene', 'device id', 'input text', 'user pin', 'user position', 'time', 'session id', 'context information', and so on. The data in this log table can be added to the training set as a supplement to the template phrasings.
In step S300, adjusting the number of texts in each category in the training set, including the following steps:
for a category whose number of texts is too small to meet the first preset proportion requirement, supplementing the texts of that category, for example by manually adding texts that belong to the category or by duplicating a portion of the category's existing texts;
for a category whose number of texts is too large to meet the first preset proportion requirement, screening a specified number of texts from that category. Meeting the first preset proportion requirement here means that, after the unselected texts of the category are removed, the number of remaining texts in the category and the numbers of texts in the other categories satisfy the first preset proportion requirement.
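The supplement/screen adjustment described above can be sketched as follows (a minimal Python illustration; the function name, the strategy of duplicating texts for under-represented categories, and the fixed seed are assumptions for the sketch, not taken from the patent):

```python
import random
from collections import defaultdict

def balance_by_category(samples, target_counts, seed=0):
    """Adjust the number of texts per category toward target_counts.

    samples: list of (text, label) pairs.
    target_counts: label -> desired number of texts for that category.
    Categories with too few texts are padded by duplicating existing texts
    (sampling with replacement); categories with too many are down-sampled
    to the target without replacement.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for text, label in samples:
        by_label[label].append(text)

    balanced = []
    for label, target in target_counts.items():
        texts = by_label.get(label, [])
        if len(texts) >= target:
            chosen = rng.sample(texts, target)   # screen out the surplus
        else:                                    # supplement by duplication
            chosen = texts + [rng.choice(texts) for _ in range(target - len(texts))]
        balanced.extend((t, label) for t in chosen)
    rng.shuffle(balanced)
    return balanced
```

The target counts would be derived from the first preset proportion requirement, e.g. the online distribution of Table 1.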
Here, the first preset proportion requirement may be set according to the online distribution, so that the text proportions in the training set roughly match the distribution seen online, as shown in Table 1 below; in practical applications, the ratio of text counts across categories may be set as needed.
TABLE 1
Category                 Number of texts
ACT_COMMODITY            2436
ACT_ORDER                1015
ACT_DISCOUNT             779
ACT_SPECIFY_DISCOUNT     597
ACT_AFTER_SALES          690
ACT_SHORTCUT             501
In this embodiment, the training method for the text classification network further includes the following steps:
dividing the texts in the training set into texts with different length intervals according to the sentence length;
and adjusting the number of texts in each length interval to enable the number ratio of the texts in each length interval in the training set to meet the requirement of a second preset proportion.
Here, the second preset proportion requirement may be set according to the distribution of sentence lengths actually entered by users online. The length intervals may be as simple as two classes: long sentences whose word count exceeds a length threshold and short sentences whose word count is at or below it. Alternatively, several length intervals may be defined, with each text assigned to an interval according to its word count.
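The simple two-interval division by sentence length can be sketched like this (an illustrative Python fragment; the threshold value and the use of character count as a stand-in for word count are assumptions):

```python
def bucket_by_length(texts, length_threshold=10):
    """Divide texts into a 'short' bucket (length <= threshold) and a
    'long' bucket (length > threshold), mirroring the two-class division
    described above; more intervals could be added the same way."""
    buckets = {"short": [], "long": []}
    for text in texts:
        key = "long" if len(text) > length_threshold else "short"
        buckets[key].append(text)
    return buckets
```

The per-bucket counts would then be adjusted, as with categories, until they meet the second preset proportion requirement.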
In this embodiment, convolutional neural network training processes data in batches: samples enter model training batch by batch, each group of data being one batch. Before counting the number of texts in each category of the training set, the method further includes the following steps:
dividing the texts in the training set into a plurality of batches;
counting, for each batch, the number of texts in each category, and adjusting it so that the ratio of text counts across categories within the batch meets the first preset proportion requirement.
Therefore, a sampling module is designed for the samples of each batch. The sampling module supports several sampling modes: for categories with few samples, repeated sampling with or without replacement may be chosen, or a weighted sampling mode may be used.
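A per-batch sampling module of this kind might look like the following sketch (names and the quota interface are illustrative; weighted sampling could be added by passing weights to `random.choices`):

```python
import random

def sample_batch(by_label, batch_quota, seed=0):
    """Draw one balanced batch from per-category pools.

    by_label: label -> list of texts; batch_quota: label -> count wanted
    in this batch. Rare categories are sampled with replacement (repeated
    sampling) so each batch still meets the preset per-category counts.
    """
    rng = random.Random(seed)
    batch = []
    for label, n in batch_quota.items():
        pool = by_label[label]
        if n > len(pool):                  # too few: sample with replacement
            picked = rng.choices(pool, k=n)
        else:                              # enough: sample without replacement
            picked = rng.sample(pool, n)
        batch.extend((t, label) for t in picked)
    rng.shuffle(batch)
    return batch
```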
In this embodiment, after receiving the training text and the category label of the training text, the method further includes the following steps:
matching the training texts against preset filtering patterns (regular expressions);
and filtering out the training texts that match, thereby removing junk user inputs that carry no information, which keeps the training set effective and improves the model's generalization ability.
The filtering patterns of this embodiment are set in advance. To ensure that the training corpus covers the regular-expression templates as evenly as possible, the distribution of user input texts under each template is counted and screened. For example, if a template captures sentences that do not help classify the user's intent, regular matching is used to filter such sentences out, reducing invalid training texts.
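The regular-expression filtering step might look like the following sketch (the concrete patterns are invented for illustration; the patent does not disclose the actual preset filtering statements):

```python
import re

# Hypothetical filter patterns: bare greetings and punctuation-only inputs.
# The real preset filtering statements are not given in the patent.
FILTER_PATTERNS = [re.compile(p) for p in (r"^(hi|hello|test)$", r"^[\W_]+$")]

def filter_training_texts(texts):
    """Drop texts matching any preset regular expression, i.e. inputs
    that carry no information useful for intent classification."""
    return [t for t in texts
            if not any(p.search(t) for p in FILTER_PATTERNS)]
```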
In this embodiment, after receiving the training text and the category label of the training text, the method further includes the following steps:
and screening the training texts, and removing repeated texts in the training texts.
Specifically, the corpus of the training set is first deduplicated to filter out samples repeated because of promotions or hot items; then a shuffle operation may be applied to all categories so that different categories are interleaved throughout the training set, which improves the model's learning capacity and generalization.
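The deduplicate-then-shuffle preprocessing reduces to a few lines (minimal Python; the fixed seed exists only to make the sketch reproducible):

```python
import random

def dedupe_and_shuffle(samples, seed=0):
    """Remove exact duplicate (text, label) pairs, e.g. repeats caused by
    promotions or hot items, then shuffle so categories are interleaved."""
    seen = set()
    unique = []
    for pair in samples:
        if pair not in seen:
            seen.add(pair)
            unique.append(pair)
    random.Random(seed).shuffle(unique)
    return unique
```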
As shown in fig. 2, the structure of a convolutional neural network (CNN) according to an embodiment of the present invention is illustrated. A CNN is a feed-forward neural network composed mainly of convolutional layers and pooling layers, and is commonly used for classification problems. In this embodiment, the convolutional neural network includes a first convolutional layer, a first pooling layer, a second convolutional layer, a second pooling layer, a third convolutional layer, a third pooling layer, a fully connected layer, and a classification layer, connected in sequence.
In this embodiment, the first, second, and third convolutional layers employ 1 × 1, 3 × 3, and 5 × 5 convolution kernels, respectively, and the pooling regions of the first, second, and third pooling layers are each a 2 × 2 matrix.
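To see how a feature map shrinks through the three conv+pool stages, here is a small shape-tracing sketch. It assumes 'same' padding for the convolutions (an assumption, since the patent leaves padding unspecified), so only the 2 × 2 pooling halves the map:

```python
def conv_out(size, kernel, stride=1, pad=0):
    """Output size of a convolution along one dimension."""
    return (size + 2 * pad - kernel) // stride + 1

def pool_out(size, pool=2):
    """Output size after non-overlapping pooling along one dimension."""
    return size // pool

def trace_shapes(h, w):
    """Trace an h x w feature map through the three stages described
    above: 1x1, 3x3, 5x5 kernels, each followed by 2x2 pooling."""
    shapes = []
    for k in (1, 3, 5):
        pad = k // 2                      # 'same' padding (assumed)
        h, w = conv_out(h, k, pad=pad), conv_out(w, k, pad=pad)
        h, w = pool_out(h), pool_out(w)
        shapes.append((h, w))
    return shapes
```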
Convolution is a mathematical operation whose purpose is to simplify complex data, filter out redundant noise, and extract key features; by adjusting the values of the convolution kernel, effects such as edge detection, sharpening, and blurring can be achieved. The operation proceeds as follows:
starting from the upper-left corner of the original feature map, select a region the same size as the convolution kernel;
multiply the selected region by the convolution kernel element-wise and sum the products; the resulting value becomes one element of the new feature map;
move the window by the specified stride horizontally and vertically across the original feature map;
to keep the newly generated feature map from shrinking, the original feature map may be padded.
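The steps above amount to the following naive implementation (pure Python, no padding, for illustration only):

```python
def convolve2d(feature, kernel, stride=1):
    """Slide the kernel over the feature map exactly as described:
    select a region, multiply element-wise, sum, then move by the
    stride. No padding, so the output map shrinks."""
    kh, kw = len(kernel), len(kernel[0])
    fh, fw = len(feature), len(feature[0])
    out = []
    for i in range(0, fh - kh + 1, stride):
        row = []
        for j in range(0, fw - kw + 1, stride):
            s = sum(feature[i + a][j + b] * kernel[a][b]
                    for a in range(kh) for b in range(kw))
            row.append(s)
        out.append(row)
    return out
```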
The convolutional neural network of the invention adopts local receptive fields. The input to a neural network is a vector that is transformed through a series of hidden layers; each hidden layer consists of several neurons, each connected to all neurons of the previous layer, while neurons within one hidden layer are independent of each other and share no connections. To reduce the number of hidden-layer parameters and speed up training, the invention uses local receptive fields: a node in one layer need not connect to all nodes of the previous layer, only to some of them.
Further, the convolutional neural network of the invention adopts parameter sharing, a notion inspired by biological vision. Local connectivity already greatly reduces the number of parameters in the network, but the count remains large and training still takes a long time; sharing parameters reduces it further.
The invention selects three convolution kernels (1 × 1, 3 × 3, and 5 × 5) to extract features, and then stacks multiple convolutional layers. The output of each layer passes through a nonlinear activation function: the first convolutional layer detects Chinese-character-level features of the user's input text, the second detects probabilistic features of words and phrases, and the later layers act as a classifier over these higher-level features. In the intelligent assistant, what a user says is composed of Chinese characters, words, phrases, and sentences; the corpus is composed of sentences, and the CNN classifier identifies the user's semantic intent from that corpus.
For the pooling layer, the invention offers two settings, average pooling and maximum pooling:
(1) average pooling
Assume the input is a 4 × 4 matrix, the pooling region is a 2 × 2 matrix, and the pooled size is 2 × 2. In back-propagation, the residuals of the 4 pooled nodes come from the next layer, and each pooled node corresponds to 4 pre-pooling nodes. Since the sum of residuals in each layer must remain unchanged during back-propagation, the residual of each pre-pooling neuron is the pooled residual averaged over the 4 nodes;
(2) maximum pooling
The basic flow is the same as for average pooling; only the pooling formula differs. Under the same assumption (a 4 × 4 input, a 2 × 2 pooling region, a 2 × 2 pooled size), the forward pass must record which element of each 2 × 2 region held the maximum. During backward residual propagation, the residual is propagated only to the neuron at the recorded maximum position; and before back-propagating, if a nonlinear activation was computed, the derivative of the activation function must be factored in.
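The record-the-argmax forward pass and the residual-routing backward pass described above can be sketched as follows (illustrative Python; even input dimensions are assumed, and no activation derivative is included):

```python
def max_pool_2x2(feature):
    """Forward max pooling over 2x2 regions, recording which position
    held the maximum of each region."""
    h, w = len(feature), len(feature[0])
    pooled, argmax = [], []
    for i in range(0, h, 2):
        prow, arow = [], []
        for j in range(0, w, 2):
            cells = [(feature[i + a][j + b], (i + a, j + b))
                     for a in range(2) for b in range(2)]
            val, pos = max(cells)
            prow.append(val)
            arow.append(pos)
        pooled.append(prow)
        argmax.append(arow)
    return pooled, argmax

def max_pool_backward(residual, argmax, h, w):
    """Backward pass: each pooled residual flows only to the neuron
    at the recorded maximum position; all other positions get zero."""
    grad = [[0.0] * w for _ in range(h)]
    for i, row in enumerate(residual):
        for j, r in enumerate(row):
            a, b = argmax[i][j]
            grad[a][b] = r
    return grad
```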
In addition, a feed-forward neural network contains a large number of neurons organized into layers: an input layer, one or more hidden layers, and an output layer. Each neuron connects to all neurons of the previous layer, and the connections are not identical, since they carry different weights; the link weights carry the information of the whole network. A feed-forward network with a sufficient number of hidden neurons can fit, to a given accuracy, the following kinds of functions: any continuous function requires one hidden layer; any function, even a discontinuous one, requires two hidden layers. Determining how many hidden layers and neurons a nonlinear function needs to reach a specified accuracy relies on experience and heuristic methods.
In convolutional neural networks, the goal of the back-propagation algorithm is to minimize the error between the network's actual output and the correct output. Because the network is feed-forward, activations always flow from the input units to the output units; after reaching the output units, the network's output is compared with the correct output, the gradient of the cost function is propagated backward, and the weights are updated along the way. The procedure is recursive and applies to any number of hidden layers.
The offline evaluation metrics of the classifier are the loss value, the accuracy, and the F1 score; the saved optimal model has a loss below 0.05, an accuracy of 0.91, and an F1 score of 0.92.
Further, the text classification network includes a Highway network layer that fuses the Chinese-character-level features, word-segmentation-level features, and word-vector-level features of the training set, which improves the capture of user intent across feature dimensions.
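A Highway layer combines a transformed feature with the carried-through input via a learned gate, y = T(x) * H(x) + (1 - T(x)) * x, where T is a sigmoid transform gate. A per-dimension sketch (the gating weights and the fused feature vector are assumed to be supplied externally, e.g. from the fused character/word/word-vector features):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def highway(x, h, t_logit):
    """One highway unit per dimension: blend the transformed feature h
    with the untransformed input x, weighted by the sigmoid gate T."""
    return [sigmoid(t) * hv + (1.0 - sigmoid(t)) * xv
            for xv, hv, t in zip(x, h, t_logit)]
```

With a zero gate logit the output is the even mixture of the input and the transformed feature; large positive logits pass the transform through, large negative logits carry the input unchanged.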
In this embodiment, the test team periodically tests the intelligent assistant's online data, checking whether the model's predicted answers agree with the answers humans consider correct and whether the model meets the threshold for going online; the human evaluation result serves as an important metric. Going online requires joint debugging and testing among the algorithm side, the server side, the client side, and other parties, to ensure that every interface performs well and its logic is correct.
After the model evaluation, the training method of the text classification network further comprises the following steps:
acquiring evaluation data of a text classification network;
searching the evaluation data for misclassified data (badcases) and re-labeling it with the correct category;
adding the misclassified data, together with its corrected labels, to the training set, where the added data may be the actual misclassified samples or training texts artificially expanded from the patterns observed in them;
and re-training the text classification network by adopting the updated training set.
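The badcase feedback loop above reduces to collecting mispredicted evaluation samples under their corrected labels and appending them to the training set before retraining (a schematic Python sketch; the tuple layout of the evaluation records is an assumption):

```python
def update_training_set(train_set, eval_results):
    """Badcase feedback: keep evaluation samples whose predicted label
    differs from the human (true) label, re-attach the correct label,
    and append them to the training set for retraining.

    eval_results: list of (text, true_label, predicted_label).
    """
    badcases = [(text, true_label)
                for text, true_label, predicted in eval_results
                if predicted != true_label]
    return train_set + badcases, badcases
```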
Specifically, after testers feed badcases back to the algorithm side and the patterns behind them are found, the analysis determines whether each badcase should be solved through data processing or through optimizing the model network. Each update iteration must fix the current batch of badcases without producing new ones; to verify that every iteration is an improvement, a regression test set is designed and continuously expanded to cover nearly all sentence patterns seen online.
For example, the present invention may handle short texts through badcase-simulation data. The initial text classification network predicted short texts poorly, so a module is designed to generate a training corpus of short texts. For example, when a user enters an ultra-short text such as "mop one" or "Qingdao beer" into the intelligent assistant, the model is theoretically expected to recognize the "commodity inquiry" service scene, but the initial CNN classifier recognized only an "unknown" service scene. Short-text simulation improves recognition in such cases: short texts are added to the training set with the "commodity inquiry" category label, overcoming the defect that short texts could not be classified correctly.
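A minimal sketch of such a short-text augmentation module, assuming an illustrative category name, word list, and length cutoff:

```python
def expand_short_texts(product_words, label="commodity_inquiry", max_len=4):
    """Turn ultra-short product mentions into labelled training
    examples so the classifier stops mapping them to 'unknown'.
    Only texts at or below max_len characters count as 'short'."""
    return [(w, label) for w in product_words if len(w) <= max_len]
```

The generated pairs are then merged into the training set exactly like any other labelled text, so the next training run sees short inputs paired with the goods-inquiry category.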
As services expand, new user phrasings are continually added. So that the model covers in time the cases that regular-expression matching would otherwise handle, the training set is continuously expanded and refined, which gives the text classification network broader coverage.
In this embodiment, once the evaluation result of the text classification network reaches the online threshold, the intelligent assistant may go online. First, after the text classification network and the semantic understanding model in the intelligent assistant are trained, a pre-release and visual test checks for logic errors. The intelligent assistant then runs an A/B test of different model versions online, splitting traffic among them and comparing model performance after normalizing the traffic to the same proportion. As online user input keeps growing, during evaluation testers manually label the service-scene categories, re-annotate product words and brand words, and collect new ones, and the updated data is added to the lexicon and the training set. This continuously improves the coverage of the text classification network, keeps the model optimized, solves new badcases, further improves the accuracy of text recognition and classification, and improves the user experience.
As shown in fig. 3, an embodiment of the present invention further provides a training system for a text classification network, which is applied to the training method for the text classification network, and the training system for the text classification network includes:
the text acquisition module M100 is used for receiving a training text and a category label of the training text as a training set;
the training set balancing module M200 is used for counting the number of texts of each category in the training set and adjusting the number of the texts of each category in the training set, so that the number ratio of the texts of each category in the training set meets a first preset proportion requirement;
and the classification network training module M300 is used for training the convolutional neural network for text classification by adopting the adjusted training set to obtain a trained text classification network.
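The balancing step performed by the training set balancing module M200 might be sketched as follows, with equal per-category counts standing in for the "first preset proportion"; the function and parameter names are assumptions:

```python
import random
from collections import defaultdict

def balance_by_category(train_set, target_per_class, seed=0):
    """Undersample abundant categories and oversample scarce ones
    (by repetition) so each category contributes target_per_class
    texts. train_set is a list of (text, label) pairs."""
    rng = random.Random(seed)
    buckets = defaultdict(list)
    for text, label in train_set:
        buckets[label].append((text, label))
    balanced = []
    for label, items in buckets.items():
        if len(items) >= target_per_class:
            balanced.extend(rng.sample(items, target_per_class))  # undersample
        else:
            extra = rng.choices(items, k=target_per_class - len(items))
            balanced.extend(items + extra)                        # oversample
    return balanced
```

Repetition-based oversampling is the simplest choice; the manually expanded texts described earlier (new examples written to match a scarce category's patterns) can serve the same purpose with more diversity.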
The embodiment of the invention also provides a training device for the text classification network, comprising a processor and a memory in which executable instructions of the processor are stored, wherein the processor is configured to perform the steps of the training method of the text classification network by executing the executable instructions.
As will be appreciated by one skilled in the art, aspects of the present invention may be embodied as a system, method, or program product. Thus, various aspects of the invention may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, microcode, etc.), or an embodiment combining hardware and software aspects, which may be collectively referred to herein as a "circuit," "unit," or "platform."
An electronic device 600 according to this embodiment of the invention is described below with reference to fig. 4. The electronic device 600 shown in fig. 4 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present invention.
As shown in fig. 4, the electronic device 600 is embodied in the form of a general purpose computing device. The components of the electronic device 600 may include, but are not limited to: at least one processing unit 610, at least one memory unit 620, a bus 630 connecting the different platform components (including the memory unit 620 and the processing unit 610), a display unit 640, etc.
The storage unit stores program code executable by the processing unit 610, so that the processing unit 610 performs the steps of the training method of the text classification network described above in this specification according to various exemplary embodiments of the present invention. For example, the processing unit 610 may perform the steps shown in fig. 1.
Therefore, when the processor of the training device of the text classification network executes the program code in the storage unit, the steps of the training method of the text classification network are performed: the training set is balanced and a more accurate text classification network is trained, so that user input is classified more accurately and the user experience is improved.
The storage unit 620 may include readable media in the form of volatile memory units, such as a random access memory unit (RAM)6201 and/or a cache memory unit 6202, and may further include a read-only memory unit (ROM) 6203.
The memory unit 620 may also include a program/utility 6204 having a set (at least one) of program units 6205, such program units 6205 including, but not limited to: an operating system, one or more application programs, other program elements, and program data, each of which, or some combination thereof, may comprise an implementation of a network environment.
Bus 630 may be one or more of several types of bus structures, including a memory unit bus or memory unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus architectures.
The electronic device 600 may also communicate with one or more external devices 700 (e.g., keyboard, pointing device, bluetooth device, etc.), with one or more devices that enable a user to interact with the electronic device 600, and/or with any devices (e.g., router, modem, etc.) that enable the electronic device 600 to communicate with one or more other computing devices. Such communication may occur via an input/output (I/O) interface 650. Also, the electronic device 600 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network such as the Internet) via the network adapter 660. The network adapter 660 may communicate with other elements of the electronic device 600 via the bus 630. It should be appreciated that although not shown in the figures, other hardware and/or software elements may be used in conjunction with the electronic device 600, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage platforms, to name a few.
The embodiment of the present invention further provides a computer-readable storage medium storing a program which, when executed, implements the steps of the training method of the text classification network. In some possible embodiments, aspects of the present invention may also be implemented in the form of a program product comprising program code; when the program product runs on a terminal device, the program code causes the terminal device to perform the steps of the training method of the text classification network described above in this specification according to various exemplary embodiments of the present invention.
Referring to fig. 5, a program product 800 for implementing the above method according to an embodiment of the present invention is described. The program product may employ a portable compact disc read-only memory (CD-ROM), include program code, and run on a terminal device such as a personal computer. However, the program product of the present invention is not limited thereto; in this document, a readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
Therefore, when the program code in the computer storage medium of this embodiment is executed, the steps of the training method of the text classification network are performed, so that a balanced training set is obtained and a more accurate text classification network is trained.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
A readable signal medium may include a propagated data signal with readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electromagnetic, optical, or any suitable combination thereof. A readable signal medium may also be any readable medium, other than a readable storage medium, that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object-oriented programming language such as Java, C++, or the like, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (e.g., through the internet using an internet service provider).
The embodiments in this specification are described in a progressive manner; the embodiments refer to one another for identical or similar parts, and each embodiment focuses on its differences from the others. In particular, since the system, device, and computer storage medium embodiments are substantially similar to the method embodiments, they are described relatively briefly; for relevant details, refer to the description of the method embodiments.
The training method, the system, the equipment and the storage medium of the text classification network provided by the invention have the following advantages:
According to the present invention, the number of texts in the training set is balanced so that the text classification network generalizes well to every category, and a more accurate convolutional neural network is constructed as the text classification network. The resulting network classifies user input more accurately, precisely locates user needs, covers more of the predicted user sentence patterns, and improves the user experience. Further, the present invention evaluates the text classification network, establishes badcase feedback from the misclassified data in the evaluation data, and updates the training set according to the feedback data, thereby continuously improving the text classification network.
The foregoing is a detailed description of the invention in connection with specific preferred embodiments, and the invention is not intended to be limited to these specific details. Those skilled in the art to which the invention pertains may make several simple deductions or substitutions without departing from the spirit of the invention, all of which shall be considered as falling within the protection scope of the invention.

Claims (15)

1. A training method of a text classification network is characterized by comprising the following steps:
receiving a training text and a category label of the training text as a training set;
counting the number of texts of each category in a training set;
adjusting the number of texts of each category in the training set, so that the number ratio of the texts of each category in the training set meets a first preset proportion requirement;
and training the convolutional neural network for text classification by adopting the adjusted training set to obtain a trained text classification network.
2. The method of claim 1, further comprising the steps of:
dividing the texts in the training set into texts with different length intervals according to the sentence length;
and adjusting the number of texts in each length interval to enable the number ratio of the texts in each length interval in the training set to meet the requirement of a second preset proportion.
3. The method for training the text classification network according to claim 1, wherein the adjusting the number of texts in each category in the training set comprises the following steps:
for the category of which the number of texts is too small to meet the requirement of the first preset proportion, further supplementing the number of the texts in the category;
and screening a specified number of texts from the texts of the category for the category with excessive text number which cannot meet the first preset proportion requirement, wherein the number of the texts of the category obtained by screening meets the first preset proportion requirement.
4. The method of claim 1, wherein before counting the number of texts in each category in the training set, the method further comprises the following steps:
dividing texts in a training set into a plurality of batches;
and respectively counting the number of texts of each category in each batch, and adjusting the number of texts of each category in the batch, so that the ratio of the number of texts of each category in the batch meets a first preset proportion requirement.
5. The method of claim 1, wherein after receiving the training text and the class label of the training text, the method further comprises the following steps:
performing regular matching on the text for training by adopting a preset filtering statement;
and filtering out the training texts which are regularly matched.
6. The method of claim 1, wherein after receiving the training text and the class label of the training text, the method further comprises the following steps:
and screening the training texts, and removing repeated texts in the training texts.
7. The method of claim 1, wherein the convolutional neural network comprises a first convolutional layer, a first pooling layer, a second convolutional layer, a second pooling layer, a third convolutional layer, a third pooling layer, a fully-connected layer, and a classification layer, which are connected in sequence.
8. The method of claim 7, wherein the first convolutional layer, the second convolutional layer, and the third convolutional layer employ a 1 x 1 convolutional kernel, a 3 x 3 convolutional kernel, and a 5 x 5 convolutional kernel, respectively; and the pooling areas of the first, second and third pooling layers respectively adopt a 2 x 2 matrix.
9. The method of claim 1, wherein the convolutional neural network is implemented based on local perceptual fields, and each layer shares local connection parameters with a previous layer.
10. The method of claim 1, wherein the loss of the optimal text classification model obtained when training the convolutional neural network for text classification is less than 0.05, the accuracy is 0.91, and the value of F1 is 0.92.
11. The method of claim 1, wherein the text classification network comprises a Highway network layer, and the Highway network layer performs feature fusion on the Chinese character level features, the segmentation level features and the word vector level features in the training set.
12. The method of claim 1, further comprising the steps of:
acquiring evaluation data of a text classification network;
searching classification error data in the evaluation data, and adding a correctly classified label to the classification error data again;
adding the classified error data and the correct label into a training set;
and re-training the text classification network by adopting the updated training set.
13. A training system for a text classification network, which is applied to the training method for a text classification network according to any one of claims 1 to 12, the training system for a text classification network comprising:
the text acquisition module is used for receiving the training text and the class label of the training text as a training set;
the training set balancing module is used for counting the number of texts of each category in the training set and adjusting the number of the texts of each category in the training set, so that the number ratio of the texts of each category in the training set meets a first preset proportion requirement;
and the classification network training module is used for training the convolutional neural network for text classification by adopting the adjusted training set to obtain a trained text classification network.
14. An apparatus for training a text classification network, comprising:
a processor;
a memory having stored therein executable instructions of the processor;
wherein the processor is configured to perform the steps of the method of training of a text classification network of any one of claims 1 to 12 via execution of the executable instructions.
15. A computer-readable storage medium storing a program, wherein the program is configured to implement the steps of the method of training a text classification network according to any one of claims 1 to 12 when executed.
CN201811555318.0A 2018-12-18 2018-12-18 Training method, system, equipment and storage medium of text classification network Pending CN111339292A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811555318.0A CN111339292A (en) 2018-12-18 2018-12-18 Training method, system, equipment and storage medium of text classification network


Publications (1)

Publication Number Publication Date
CN111339292A true CN111339292A (en) 2020-06-26

Family

ID=71184998

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811555318.0A Pending CN111339292A (en) 2018-12-18 2018-12-18 Training method, system, equipment and storage medium of text classification network

Country Status (1)

Country Link
CN (1) CN111339292A (en)


Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102298646A (en) * 2011-09-21 2011-12-28 苏州大学 Method and device for classifying subjective text and objective text
CN107301246A (en) * 2017-07-14 2017-10-27 河北工业大学 Chinese Text Categorization based on ultra-deep convolutional neural networks structural model


Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111859987A (en) * 2020-07-28 2020-10-30 网易(杭州)网络有限公司 Text processing method, and training method and device of target task model
CN111859987B (en) * 2020-07-28 2024-05-17 网易(杭州)网络有限公司 Text processing method, training method and device for target task model
CN112417111A (en) * 2020-11-04 2021-02-26 厦门快商通科技股份有限公司 Text classification method, question answering system and dialogue robot
CN112417111B (en) * 2020-11-04 2022-08-23 厦门快商通科技股份有限公司 Text classification method, question answering system and dialogue robot
CN112489794A (en) * 2020-12-18 2021-03-12 推想医疗科技股份有限公司 Model training method and device, electronic terminal and storage medium
CN112765348A (en) * 2021-01-08 2021-05-07 重庆创通联智物联网有限公司 Short text classification model training method and device
CN113656575A (en) * 2021-07-13 2021-11-16 北京搜狗科技发展有限公司 Training data generation method and device, electronic equipment and readable medium
CN113656575B (en) * 2021-07-13 2024-02-02 北京搜狗科技发展有限公司 Training data generation method and device, electronic equipment and readable medium
WO2023173555A1 (en) * 2022-03-15 2023-09-21 平安科技(深圳)有限公司 Model training method and apparatus, text classification method and apparatus, device, and medium
CN114724132A (en) * 2022-04-11 2022-07-08 深圳市星桐科技有限公司 Text recognition model training method, recognition method, device, medium and equipment


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination