CN115186057A - Method and device for obtaining text classification model - Google Patents

Publication number
CN115186057A
Authority
CN
China
Prior art keywords
text
target
category
artificial intelligence
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210794143.9A
Other languages
Chinese (zh)
Inventor
方科
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Bank of China Ltd
Original Assignee
Bank of China Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Bank of China Ltd filed Critical Bank of China Ltd
Priority to CN202210794143.9A priority Critical patent/CN115186057A/en
Publication of CN115186057A publication Critical patent/CN115186057A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 - Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 - Querying
    • G06F16/3331 - Query processing
    • G06F16/334 - Query execution
    • G06F16/3344 - Query execution using natural language analysis
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 - Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 - Clustering; Classification
    • G06F16/353 - Clustering; Classification into predefined classes

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses a method and a device for obtaining a text classification model, which can be applied in the field of artificial intelligence or the field of finance. The method comprises the following steps: selecting, from a keyword library, the target keywords contained in a target text, wherein the keyword library contains a plurality of keywords and each keyword corresponds to one text category; determining, according to the target keywords, a target category identification corresponding to the target text, the target category identification indicating the text category of the target text; and training an artificial intelligence model through the target text containing the target category identification to obtain a text classification model, which is used for classifying text. The target category identification of the target text can thus be determined through keywords, without relying on manual labeling of the sample data. On the one hand, this reduces the cost of training the model and speeds up training. On the other hand, a large amount of sample data for training can be determined through the keywords, which helps ensure the quality of the trained text classification model.

Description

Method and device for obtaining text classification model
Technical Field
The application relates to the field of artificial intelligence, in particular to a method and a device for obtaining a text classification model.
Background
Classifying text is a common means of organizing textual information. In many scenarios the volume of text is large and its timeliness requirements are high, so manual analysis is almost impossible. Examples include customer service conversation text, product reviews published by customers, and the massive amount of financial information generated at every moment. Such text therefore needs to be classified and labeled automatically by a computer. However, in a specific task scenario, a common and important challenge is that the business has already determined the specific categories into which the text is to be divided, but the collected samples themselves carry no category identification. In that case, the categories of the collected samples have to be identified manually in order to obtain training samples for the artificial intelligence model. Manually labeling a large amount of text, however, is both costly and time-consuming.
Disclosure of Invention
In order to solve the technical problem, the application provides a method and a device for obtaining a text classification model, which are used for obtaining a trained text classification model relatively quickly at a relatively low cost.
In order to achieve the above purpose, the technical solutions provided in the embodiments of the present application are as follows:
the embodiment of the application provides a method for obtaining a text classification model, which comprises the following steps:
selecting target keywords contained in a target text from a keyword library, wherein the keyword library contains a plurality of keywords, and each keyword corresponds to one text category;
determining a target category identification corresponding to the target text according to the target keyword; the target category identification is used for indicating a text category of the target text;
and training an artificial intelligence model through a target text containing the target category identification to obtain a text classification model, wherein the text classification model is used for classifying the text.
As a possible implementation manner, the training an artificial intelligence model through a target text containing the target class identifier to obtain a text classification model includes:
dividing the target texts containing the target category identifications into a training set and a test set;
training the artificial intelligence model through the training set to obtain a trained artificial intelligence model;
and testing the artificial intelligence model through the test set, and determining the artificial intelligence model passing the test as the text classification model.
As a possible implementation, the testing the artificial intelligence model through the text of the test set, and determining the artificial intelligence model passing the test as a text classification model includes:
inputting a first text in the test set into the trained artificial intelligence model to obtain a first category identifier corresponding to the first text;
judging whether a target category identification corresponding to the first text is consistent with the first category identification;
when the target category identification corresponding to the first text is consistent with the first category identification, transferring the first text from the test set to the training set;
when the test set does not contain text, determining the artificial intelligence model as a text classification model.
As a possible implementation manner, the inputting a first text in the test set into the trained artificial intelligence model, and obtaining a first category identifier corresponding to the first text includes:
inputting a first text in the test set into the trained artificial intelligence model to obtain a first class identifier corresponding to the first text and a confidence coefficient corresponding to the first class identifier;
the determining whether the target category identifier corresponding to the first text is consistent with the first category identifier includes:
and when the confidence corresponding to the first category identification is larger than a preset threshold value, judging whether the target category identification corresponding to the first text is consistent with the first category identification.
As a possible implementation, the method further includes:
inputting a second text in the test set into the trained artificial intelligence model to obtain a second category identification corresponding to the second text;
judging whether the target category identification corresponding to the second text is consistent with the second category identification;
and when the target category identification corresponding to the second text is inconsistent with the second category identification, training the artificial intelligence model through a training set containing the first text.
As a possible implementation manner, the inputting a second text in the test set into the trained artificial intelligence model to obtain a second category identification corresponding to the second text includes:
inputting a second text in the test set into the trained artificial intelligence model to obtain a second category identification corresponding to the second text and a confidence coefficient corresponding to the second category identification;
the determining whether the target category identifier corresponding to the second text is consistent with the second category identifier includes:
and when the confidence corresponding to the second category identification is larger than a preset threshold, judging whether the target category identification corresponding to the second text is consistent with the second category identification.
As a possible implementation manner, the training an artificial intelligence model through a target text containing the target class identifier to obtain a text classification model includes:
and training an artificial intelligence model by adopting a supervised learning algorithm through a target text containing the target category identification to obtain a text classification model.
The embodiment of the present application further provides an obtaining apparatus of a text classification model, including:
a selection module, configured to select target keywords contained in a target text from a keyword library, wherein the keyword library contains a plurality of keywords, and each keyword corresponds to one text category;
a determining module, configured to determine a target category identification corresponding to the target text according to the target keyword; the target category identification is used for indicating a text category of the target text;
and a training module, configured to train an artificial intelligence model through a target text containing the target category identification to obtain a text classification model, wherein the text classification model is used for classifying the text.
As a possible implementation, the training module comprises:
a classification unit, configured to divide the target texts containing the target category identification into a training set and a test set;
a training set training unit, configured to train the artificial intelligence model through the training set to obtain a trained artificial intelligence model;
and a test unit, configured to test the artificial intelligence model through the test set and determine the artificial intelligence model passing the test as the text classification model.
As a possible implementation, the test unit is specifically configured to:
inputting a first text in the test set into the trained artificial intelligence model to obtain a first class identifier corresponding to the first text;
judging whether the target category identification corresponding to the first text is consistent with the first category identification;
when the target category identification corresponding to the first text is consistent with the first category identification, transferring the first text from the test set to the training set;
when the test set does not contain text, determining the artificial intelligence model as a text classification model.
According to the technical scheme, the method has the following beneficial effects:
the embodiment of the application provides a method for obtaining a text classification model, which comprises the following steps: selecting target keywords contained in a target text from a keyword library, wherein the keyword library contains a plurality of keywords, and each keyword corresponds to one text category; determining a target category identification corresponding to the target text according to the target keyword; the target category identification is used for indicating a text category of the target text; training the artificial intelligence model through a target text containing a target category identifier to obtain a text classification model, wherein the text classification model is used for classifying the text.
Therefore, in the method for obtaining the text classification model provided by the embodiment of the application, the target category identification corresponding to the target text is determined by matching the target text against the keywords in the keyword library, and the artificial intelligence model is trained through the target text containing the target category identification to obtain the text classification model. The method can thus determine the target category identification of the target text through the keywords, without manually labeling the sample data. On the one hand, this reduces the cost of training the model and speeds up training. On the other hand, a large amount of sample data for training can be determined through the keywords, which helps ensure the quality of the trained text classification model.
Drawings
In order to illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of a method for obtaining a text classification model according to an embodiment of the present disclosure;
Fig. 2 is a flowchart of another method for obtaining a text classification model according to an embodiment of the present disclosure;
Fig. 3 is a schematic diagram of an apparatus for obtaining a text classification model according to an embodiment of the present application.
Detailed Description
In order to help better understand the scheme provided in the embodiment of the present application, before the method provided in the embodiment of the present application is introduced, a scenario of an application of the scheme in the embodiment of the present application is introduced.
Classifying text is a common means of organizing textual information. In many scenarios the volume of text is large and its timeliness requirements are high, so manual analysis is almost impossible. Examples include customer service conversation text, product reviews published by customers, and the massive amount of financial information generated at every moment. Such text therefore needs to be classified and labeled automatically by a computer. However, in a specific task scenario, a common and important challenge is that the business has already determined the specific categories into which the text is to be divided, but the collected samples themselves carry no category identification. In that case, the categories of the collected samples have to be identified manually in order to obtain training samples for the artificial intelligence model. Manually labeling a large amount of text, however, is both costly and time-consuming.
In order to solve the foregoing technical problem, an embodiment of the present application provides a method for obtaining a text classification model, including: selecting target keywords contained in a target text from a keyword library, wherein the keyword library contains a plurality of keywords, and each keyword corresponds to one text category; determining a target category identification corresponding to the target text according to the target keyword; the target category identification is used for indicating a text category of the target text; training the artificial intelligence model through a target text containing a target category identification to obtain a text classification model, wherein the text classification model is used for classifying the text.
Therefore, in the method for obtaining the text classification model provided by the embodiment of the application, the target category identification corresponding to the target text is determined by matching the target text against the keywords in the keyword library, and the artificial intelligence model is trained through the target text containing the target category identification to obtain the text classification model. The method can thus determine the target category identification of the target text through the keywords, without manually labeling the sample data. On the one hand, this reduces the cost of training the model and speeds up training. On the other hand, a large amount of sample data for training can be determined through the keywords, which helps ensure the quality of the trained text classification model.
In order to make the aforementioned objects, features and advantages of the present application more comprehensible, embodiments accompanying the drawings are described in detail below.
Referring to fig. 1, this figure is a flowchart of a method for obtaining a text classification model according to an embodiment of the present application.
As shown in fig. 1, the method for obtaining a text classification model provided in the embodiment of the present application includes:
S101: selecting a target keyword contained in the target text from a keyword library, wherein the keyword library contains a plurality of keywords, and each keyword corresponds to one text category.
S102: determining a target category identification corresponding to the target text according to the target keyword; the target category identification is used for indicating a text category of the target text.
S103: training the artificial intelligence model through a target text containing a target category identifier to obtain a text classification model, wherein the text classification model is used for classifying the text.
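Steps S101 and S102 can be illustrated with a minimal Python sketch. The keyword library contents, the function names, and the majority-vote rule for texts that match keywords from several categories are all assumptions made for illustration; the application does not specify how such conflicts are resolved.

```python
from collections import Counter
from typing import Dict, List, Optional

# Hypothetical keyword library: each keyword maps to exactly one text category.
KEYWORD_LIBRARY: Dict[str, str] = {
    "refund": "after_sales",
    "return": "after_sales",
    "interest rate": "loans",
    "mortgage": "loans",
}


def select_target_keywords(text: str, library: Dict[str, str]) -> List[str]:
    """S101: return the library keywords that occur in the text."""
    lowered = text.lower()
    return [keyword for keyword in library if keyword in lowered]


def determine_category(keywords: List[str], library: Dict[str, str]) -> Optional[str]:
    """S102: derive one category label from the matched keywords."""
    if not keywords:
        return None  # no keyword matched, so the text stays unlabeled
    votes = Counter(library[keyword] for keyword in keywords)
    # Assumption: the category with the most matched keywords wins;
    # ties go to the category that was matched first.
    return votes.most_common(1)[0][0]
```

Texts for which `determine_category` returns `None` would simply be left out of the labeled sample set in this sketch.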
It should be noted that the target text in the embodiment of the present application may be a single text or a plurality of texts, which is not limited herein. A supervised learning algorithm, such as a Support Vector Machine (SVM), can be used to train the artificial intelligence model to obtain the text classification model. In practical applications, the name of each text category may be obtained first, and a group of near-synonyms for each category name may then be obtained from a near-synonym module, for example one based on word2vec (word to vector). The category name and its near-synonym group are then stored in the keyword library. A category name and its near-synonyms correspond to the same category, and therefore share the same category identification.
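The construction of the keyword library described above might look like the following sketch. The synonym groups are passed in as a plain dictionary, a hypothetical stand-in for whatever a word2vec-based module would return; the function name and the rule that the first category to claim a keyword keeps it are assumptions.

```python
from typing import Dict, List


def build_keyword_library(synonyms_by_category: Dict[str, List[str]]) -> Dict[str, str]:
    """Map each category name and each of its near-synonyms to that category."""
    library: Dict[str, str] = {}
    for category, synonyms in synonyms_by_category.items():
        for keyword in [category, *synonyms]:
            # Each keyword must correspond to one category only.
            # Assumption: if two categories share a keyword, the first one keeps it.
            library.setdefault(keyword.lower(), category)
    return library
```

For example, `build_keyword_library({"loan": ["mortgage", "credit"]})` yields a library in which "loan", "mortgage" and "credit" all carry the category identification "loan".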
As a possible implementation manner, in the present application, training an artificial intelligence model through a target text including a target category identifier to obtain a text classification model may include: dividing target texts containing target category identifications into a training set and a test set; training the artificial intelligence model through a training set to obtain a trained artificial intelligence model; and testing the artificial intelligence model through the test set, and determining the artificial intelligence model passing the test as a text classification model. As an example, 70% of the target text may be used as a training set and 30% as a test set.
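The 70%/30% division mentioned as an example can be sketched as follows. Shuffling before the split and the fixed random seed are assumptions added for reproducibility; the application only states the proportions.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_train_test(
    samples: Sequence[T], train_fraction: float = 0.7, seed: int = 0
) -> Tuple[List[T], List[T]]:
    """Shuffle the labeled samples and split them into a training and a test set."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # assumption: random split with a fixed seed
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]
```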
Specifically, a first text in the test set can be input into the trained artificial intelligence model to obtain a first category identification corresponding to the first text, and it is then judged whether the target category identification corresponding to the first text is consistent with the first category identification. When the two are consistent, the first text is transferred from the test set to the training set. When they are not consistent, the first text remains in the test set.
Likewise, a second text in the test set can be input into the trained artificial intelligence model to obtain a second category identification corresponding to the second text, and it is judged whether the target category identification corresponding to the second text is consistent with the second category identification. When the two are not consistent, the second text remains in the test set. If the test set contains a third text, the third text can be processed in the same way, until every text in the test set has been processed. It should be noted that once the texts in the test set have been redistributed in this way, the test set contains fewer texts and the training set contains more. At this point, the artificial intelligence model can be trained again through the training set containing the first text, that is, the enlarged training set. The retrained artificial intelligence model is then tested again through the test set, and the process repeats until all texts in the test set have been transferred to the training set. When the test set no longer contains any text, that is, the number of texts in the test set has dropped to 0, the artificial intelligence model is determined as the text classification model.
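The test-and-transfer loop described above can be sketched in Python. The toy token-count classifier below is only a stand-in for the supervised learner (the application mentions an SVM as one option), and the `max_rounds` cap and the early exit when no text can be transferred are assumptions, since the application does not say what happens if some texts never match.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

Sample = Tuple[str, str]  # (text, keyword-derived target category identification)


def train(samples: List[Sample]) -> Dict[str, Counter]:
    """Toy stand-in for the supervised learner: token counts per category."""
    model: Dict[str, Counter] = defaultdict(Counter)
    for text, label in samples:
        model[label].update(text.lower().split())
    return dict(model)


def predict(model: Dict[str, Counter], text: str) -> str:
    """Return the category whose training tokens overlap the text the most."""
    tokens = text.lower().split()
    return max(model, key=lambda label: sum(model[label][t] for t in tokens))


def refine(train_set: List[Sample], test_set: List[Sample], max_rounds: int = 10):
    """Move consistently predicted test texts into the training set and retrain."""
    train_set, test_set = list(train_set), list(test_set)
    model = train(train_set)
    for _ in range(max_rounds):  # assumption: cap the rounds so the loop always ends
        if not test_set:
            break  # test set is empty: the model is accepted
        agreed = [s for s in test_set if predict(model, s[0]) == s[1]]
        if not agreed:
            break  # assumption: stop when no text can be transferred
        test_set = [s for s in test_set if s not in agreed]
        train_set.extend(agreed)
        model = train(train_set)  # retrain on the enlarged training set
    return model, train_set, test_set
```

Under these assumptions, a non-empty test set returned by `refine` signals that the stopping condition of the application (an empty test set) was not reached.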
It should be noted that, because the cost of labeling by keywords is low, the embodiment of the application can obtain a large amount of target texts to train the artificial intelligence model. Because a large amount of target texts are used for training, the accuracy of the text classification model obtained by training can be superior to that of the original method for labeling through keywords.
In order to improve the training efficiency of the artificial intelligence model, the confidence of the model's prediction can be used to decide whether the category identifications need to be compared at all. Specifically, in the present application, inputting a first text in the test set into the trained artificial intelligence model to obtain a first category identification corresponding to the first text includes: inputting the first text in the test set into the trained artificial intelligence model to obtain the first category identification corresponding to the first text and a confidence corresponding to the first category identification. Then, when the confidence corresponding to the first category identification is larger than a preset threshold, it is judged whether the target category identification corresponding to the first text is consistent with the first category identification. When the confidence corresponding to the first category identification is smaller than the preset threshold, the comparison may be skipped, and the first text remains in the test set.
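The confidence gate can be expressed as a small decision function. The threshold value of 0.8 and the function name are hypothetical, and treating a confidence exactly equal to the threshold as "not larger" is an assumption, since the application only covers the larger and smaller cases.

```python
def should_transfer(
    predicted: str, confidence: float, target: str, threshold: float = 0.8
) -> bool:
    """Decide whether a test text moves to the training set."""
    if confidence <= threshold:
        # Low-confidence prediction: skip the comparison, the text stays in the test set.
        return False
    # High-confidence prediction: transfer only if it matches the keyword-derived label.
    return predicted == target
```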
Correspondingly, inputting a second text in the test set into the trained artificial intelligence model to obtain a second category identification corresponding to the second text includes: inputting the second text in the test set into the trained artificial intelligence model to obtain the second category identification corresponding to the second text and a confidence corresponding to the second category identification. Judging whether the target category identification corresponding to the second text is consistent with the second category identification includes: when the confidence corresponding to the second category identification is larger than the preset threshold, judging whether the target category identification corresponding to the second text is consistent with the second category identification.
Referring to fig. 2, this figure is a flowchart of another method for obtaining a text classification model according to an embodiment of the present application.
In summary, as shown in fig. 2, the method for obtaining a text classification model provided in the embodiment of the present application includes the following. First, the near-synonym groups of the category names are obtained, and the keywords are matched against each text to obtain a text label (classification label). The labeled texts are then split into a test set and a labeled training set, and the texts in the training set are used for training a text classification model. The trained model then predicts labels for the texts in the test set. When some predictions are consistent with the test-set labels and others are not, the texts whose predictions are consistent are added to the training set and deleted from the test set. The text classification model is then trained again with the new training set, until the predictions are completely consistent with the labels in the test set, at which point the final classification model is obtained.
In summary, in the method for obtaining the text classification model provided by the application, the target category identification corresponding to the target text is determined by matching the target text against the keywords in the keyword library, and the artificial intelligence model is trained through the target text containing the target category identification to obtain the text classification model. The method can thus determine the target category identification of the target text through the keywords, without manually labeling the sample data. On the one hand, this reduces the cost of training the model and speeds up training. On the other hand, a large amount of sample data for training can be determined through the keywords, which helps ensure the quality of the trained text classification model.
According to the method for obtaining the text classification model provided by the embodiment, the embodiment of the application provides a device for obtaining the text classification model.
Referring to fig. 3, this figure is a schematic diagram of an apparatus for obtaining a text classification model according to an embodiment of the present application.
As shown in fig. 3, the apparatus for obtaining a text classification model provided in the embodiment of the present application includes:
a selection module 100, configured to select a target keyword included in a target text from a keyword library, where the keyword library includes a plurality of keywords, and each keyword corresponds to a text category;
a determining module 200, configured to determine a target category identifier corresponding to the target text according to the target keyword; the target category identification is used for indicating a text category of the target text;
the training module 300 is configured to train the artificial intelligence model through a target text including a target category identifier to obtain a text classification model, where the text classification model is used to classify the text.
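One way to picture how the three modules fit together is the sketch below. The class and function names, the majority-vote rule, the injected `fit` callable, and the choice to drop texts that match no keyword are all assumptions; the application describes the modules only functionally.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

Sample = Tuple[str, str]  # (text, target category identification)


@dataclass
class SelectionModule:
    library: Dict[str, str]

    def select(self, text: str) -> List[str]:
        """Return the library keywords contained in the text."""
        lowered = text.lower()
        return [keyword for keyword in self.library if keyword in lowered]


@dataclass
class DeterminingModule:
    library: Dict[str, str]

    def determine(self, keywords: List[str]) -> Optional[str]:
        """Assumption: the category with the most matched keywords wins."""
        if not keywords:
            return None
        votes = Counter(self.library[keyword] for keyword in keywords)
        return votes.most_common(1)[0][0]


@dataclass
class TrainingModule:
    fit: Callable[[List[Sample]], object]  # hypothetical hook for any supervised learner

    def train(self, labelled: List[Sample]) -> object:
        return self.fit(labelled)


def obtain_model(texts, selection, determining, training):
    """Label texts via keywords, then hand the labeled samples to the trainer."""
    labelled = []
    for text in texts:
        label = determining.determine(selection.select(text))
        if label is not None:  # assumption: texts without a keyword match are dropped
            labelled.append((text, label))
    return training.train(labelled)
```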
As a possible implementation, the training module comprises: a classification unit, configured to divide the target texts containing the target category identification into a training set and a test set; a training set training unit, configured to train the artificial intelligence model through the training set to obtain a trained artificial intelligence model; and a test unit, configured to test the artificial intelligence model through the test set and determine the artificial intelligence model passing the test as the text classification model.
As a possible implementation, the test unit is specifically configured to: inputting a first text in the test set into the trained artificial intelligence model to obtain a first class identifier corresponding to the first text; judging whether the target category identification corresponding to the first text is consistent with the first category identification; when the target category identification corresponding to the first text is consistent with the first category identification, transferring the first text from the test set to the training set; when the test set does not contain text, the artificial intelligence model is determined as a text classification model.
In summary, the device for obtaining the text classification model determines the target category identification corresponding to the target text by matching the target text against the keywords in the keyword library, and trains the artificial intelligence model through the target text containing the target category identification to obtain the text classification model. The device can thus determine the target category identification of the target text through the keywords, without manually labeling the sample data. On the one hand, this reduces the cost of training the model and speeds up training. On the other hand, a large amount of sample data for training can be determined through the keywords, which helps ensure the quality of the trained text classification model.
From the above description of the embodiments, it is clear to those skilled in the art that all or part of the steps in the method of the above embodiments may be implemented by software plus a necessary general hardware platform. Based on such understanding, the technical solutions of the present application or portions contributing to the prior art may be embodied in the form of a software product, which may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, an optical disk, etc., and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network communication device such as a media gateway, etc.) to execute the method described in the embodiments or some portions of the embodiments of the present application.
It should be noted that, in the present specification, the embodiments are described in a progressive manner: each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be referred to one another. Because the method disclosed in the embodiments corresponds to the system disclosed in the embodiments, its description is kept brief; for related details, refer to the description of the system.
It should also be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional like elements in a process, method, article, or apparatus that comprises the element.
The foregoing description of the disclosed embodiments enables those skilled in the art to make or use the application. Various modifications to these embodiments will be apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
It should be noted that the method and the device for obtaining the text classification model provided by the invention can be used in the fields of artificial intelligence, blockchain, distributed computing, cloud computing, big data, Internet of Things, mobile Internet, network security, chips, virtual reality, augmented reality, holography, quantum computing, quantum communication, quantum measurement, digital twins, and finance. The above list is only an example and does not limit the application field of the method and device for obtaining the text classification model provided by the present invention.

Claims (10)

1. A method for obtaining a text classification model, characterized by comprising:
selecting, from a keyword library, target keywords contained in a target text, wherein the keyword library contains a plurality of keywords, and each keyword corresponds to one text category;
determining, according to the target keywords, a target category identifier corresponding to the target text, wherein the target category identifier indicates the text category of the target text;
and training an artificial intelligence model with the target text containing the target category identifier to obtain a text classification model, wherein the text classification model is used for classifying text.
2. The method of claim 1, wherein the training an artificial intelligence model with the target text containing the target category identifier to obtain a text classification model comprises:
dividing the target text containing the target category identifier into a training set and a test set;
training the artificial intelligence model with the training set to obtain a trained artificial intelligence model;
and testing the trained artificial intelligence model with the test set, and determining the artificial intelligence model that passes the test as the text classification model.
3. The method of claim 2, wherein the testing the trained artificial intelligence model with the test set, and determining the artificial intelligence model that passes the test as the text classification model comprises:
inputting a first text in the test set into the trained artificial intelligence model to obtain a first category identifier corresponding to the first text;
judging whether the target category identifier corresponding to the first text is consistent with the first category identifier;
when the target category identifier corresponding to the first text is consistent with the first category identifier, transferring the first text from the test set to the training set;
and when the test set no longer contains any text, determining the artificial intelligence model as the text classification model.
4. The method of claim 3, wherein the inputting a first text in the test set into the trained artificial intelligence model to obtain a first category identifier corresponding to the first text comprises:
inputting the first text in the test set into the trained artificial intelligence model to obtain the first category identifier corresponding to the first text and a confidence corresponding to the first category identifier;
and the judging whether the target category identifier corresponding to the first text is consistent with the first category identifier comprises:
when the confidence corresponding to the first category identifier is greater than a preset threshold, judging whether the target category identifier corresponding to the first text is consistent with the first category identifier.
5. The method of claim 3, further comprising:
inputting a second text in the test set into the trained artificial intelligence model to obtain a second category identifier corresponding to the second text;
judging whether the target category identifier corresponding to the second text is consistent with the second category identifier;
and when the target category identifier corresponding to the second text is inconsistent with the second category identifier, training the artificial intelligence model with the training set containing the first text.
6. The method of claim 5, wherein the inputting a second text in the test set into the trained artificial intelligence model to obtain a second category identifier corresponding to the second text comprises:
inputting the second text in the test set into the trained artificial intelligence model to obtain the second category identifier corresponding to the second text and a confidence corresponding to the second category identifier;
and the judging whether the target category identifier corresponding to the second text is consistent with the second category identifier comprises:
when the confidence corresponding to the second category identifier is greater than a preset threshold, judging whether the target category identifier corresponding to the second text is consistent with the second category identifier.
7. The method of claim 1, wherein the training an artificial intelligence model with the target text containing the target category identifier to obtain a text classification model comprises:
training the artificial intelligence model with the target text containing the target category identifier by using a supervised learning algorithm to obtain the text classification model.
8. An apparatus for obtaining a text classification model, comprising:
a selection module, used for selecting, from a keyword library, target keywords contained in a target text, wherein the keyword library contains a plurality of keywords, and each keyword corresponds to one text category;
a determining module, used for determining, according to the target keywords, a target category identifier corresponding to the target text, wherein the target category identifier indicates the text category of the target text;
and a training module, used for training an artificial intelligence model with the target text containing the target category identifier to obtain a text classification model, wherein the text classification model is used for classifying text.
9. The apparatus of claim 8, wherein the training module comprises:
a classification unit, used for dividing the target text containing the target category identifier into a training set and a test set;
a training unit, used for training the artificial intelligence model with the training set to obtain a trained artificial intelligence model;
and a test unit, used for testing the trained artificial intelligence model with the test set and determining the artificial intelligence model that passes the test as the text classification model.
10. The apparatus of claim 9, wherein the test unit is specifically configured for:
inputting a first text in the test set into the trained artificial intelligence model to obtain a first category identifier corresponding to the first text;
judging whether the target category identifier corresponding to the first text is consistent with the first category identifier;
when the target category identifier corresponding to the first text is consistent with the first category identifier, transferring the first text from the test set to the training set;
and when the test set no longer contains any text, determining the artificial intelligence model as the text classification model.
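The test procedure recited in the claims (predict each test text, compare the prediction with the keyword-derived target category identifier when the confidence exceeds a preset threshold, transfer consistent texts to the training set, and retrain when a text is inconsistent) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: `WordCountModel` is a toy stand-in for the artificial intelligence model, and `threshold`, `max_rounds`, and the stop-when-no-progress rule are hypothetical additions that keep the loop from running forever, since the claims do not say what happens to texts that never become consistent.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

Sample = Tuple[str, str]  # (text, target category identifier)


class WordCountModel:
    """Toy stand-in for the artificial intelligence model (an assumption for illustration)."""

    def __init__(self) -> None:
        self.counts: Dict[str, Counter] = defaultdict(Counter)

    def fit(self, samples: List[Sample]) -> None:
        """Count, per category, how often each word appears in the training texts."""
        self.counts = defaultdict(Counter)
        for text, label in samples:
            self.counts[label].update(text.lower().split())

    def predict(self, text: str) -> Tuple[str, float]:
        """Return (category identifier, confidence in [0, 1])."""
        words = text.lower().split()
        scores = {label: sum(c[w] for w in words) for label, c in self.counts.items()}
        total = sum(scores.values())
        if total == 0:
            return "", 0.0
        best = max(scores, key=scores.get)
        return best, scores[best] / total


def train_until_test_set_empty(
    train: List[Sample],
    test: List[Sample],
    threshold: float = 0.6,  # hypothetical preset confidence threshold
    max_rounds: int = 10,    # hypothetical safety cap on retraining rounds
) -> Tuple[WordCountModel, List[Sample]]:
    """Return the model and any test texts that could not be transferred."""
    train, test = list(train), list(test)
    model = WordCountModel()
    model.fit(train)
    for _ in range(max_rounds):
        if not test:
            break  # empty test set: the model is taken as the text classification model
        remaining: List[Sample] = []
        for text, target in test:
            predicted, confidence = model.predict(text)
            if confidence > threshold and predicted == target:
                train.append((text, target))  # consistent: transfer to the training set
            else:
                remaining.append((text, target))
        if len(remaining) == len(test):
            break  # no text was transferred, so retraining would change nothing
        test = remaining
        if test:
            model.fit(train)  # inconsistent texts remain: retrain on the enlarged training set
    return model, test


train_set = [
    ("apply for a loan", "credit"),
    ("mortgage loan rate", "credit"),
    ("transfer money abroad", "payments"),
    ("wire transfer fee", "payments"),
]
test_set = [("loan rate today", "credit"), ("transfer fee", "payments")]
model, leftover = train_until_test_set_empty(train_set, test_set)
```

In this usage example both test texts are predicted consistently, so `leftover` ends up empty and `model` would be taken as the text classification model. Returning the leftover texts rather than discarding them is a design choice of the sketch; it lets a caller inspect samples whose keyword-derived labels the model keeps disagreeing with.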
CN202210794143.9A 2022-07-07 2022-07-07 Method and device for obtaining text classification model Pending CN115186057A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210794143.9A CN115186057A (en) 2022-07-07 2022-07-07 Method and device for obtaining text classification model

Publications (1)

Publication Number Publication Date
CN115186057A true CN115186057A (en) 2022-10-14

Family

ID=83517787

Country Status (1)

Country Link
CN (1) CN115186057A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117235237A (en) * 2023-11-10 2023-12-15 腾讯科技(深圳)有限公司 Text generation method and related device
CN117235237B (en) * 2023-11-10 2024-03-12 腾讯科技(深圳)有限公司 Text generation method and related device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination