CN114330370A - Natural language processing system and method based on artificial intelligence - Google Patents

Natural language processing system and method based on artificial intelligence Download PDF

Info

Publication number
CN114330370A
CN114330370A CN202210260510.7A CN202210260510A CN114330370A CN 114330370 A CN114330370 A CN 114330370A CN 202210260510 A CN202210260510 A CN 202210260510A CN 114330370 A CN114330370 A CN 114330370A
Authority
CN
China
Prior art keywords
natural language
layer
language processing
sample
input
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210260510.7A
Other languages
Chinese (zh)
Other versions
CN114330370B (en
Inventor
李晋
刘宇鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Li Jin
Original Assignee
Tianjin Sirui Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin Sirui Information Technology Co ltd filed Critical Tianjin Sirui Information Technology Co ltd
Priority to CN202210260510.7A priority Critical patent/CN114330370B/en
Publication of CN114330370A publication Critical patent/CN114330370A/en
Application granted granted Critical
Publication of CN114330370B publication Critical patent/CN114330370B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Landscapes

  • Machine Translation (AREA)

Abstract

The invention provides a natural language processing system and a natural language processing method based on artificial intelligence, which are used for obtaining a natural language information original data set, carrying out anomaly analysis on the original data set and generating an anomaly value set of the original data set; removing sample data in the abnormal value set from the original data set, inputting the information data set from which the sample data in the abnormal value set is removed into a semantic matching model for identification, and determining a semantic matching result; and sorting the predicted values of the matching results in consideration of the estimated loss values according to the sizes, wherein the sequence obtained after sorting is the natural language processing result. When the natural language processing time is increased along with the use duration, each layer is continuously optimized, so that the intelligent accuracy of the natural language processing is gradually improved.

Description

Natural language processing system and method based on artificial intelligence
Technical Field
The invention relates to the technical field of natural language processing, in particular to a natural language processing system and a natural language processing method based on artificial intelligence.
Background
With the advent of the big data age, the internet has a problem of text information explosion. Natural Language Processing (NLP): is an important direction in the fields of computer science and artificial intelligence. It studies various theories and methods that enable efficient communication between humans and computers using natural language. Natural language processing is a science integrating linguistics, computer science and mathematics. Therefore, the research in this field will involve natural language, i.e. the language that people use everyday, so it is closely related to the research of linguistics. NLP techniques typically include text processing, semantic understanding, machine translation, robotic question and answer, and knowledge mapping.
People generally retrieve required information through a search engine, text matching is a core problem in natural language understanding, and the method has specific application in the fields of searching, advertising, recommendation, intelligent customer service systems and the like in the real world. Many tasks in natural language understanding, such as paraphrase recognition, repetitive question recognition, natural language reasoning, machine-read understanding, etc. studied herein, can be formalized as text matching questions.
For the study of text matching, the traditional method mainly focuses on manually defining features. With the rise of deep learning, many researchers use deep representation learning to perform text matching research, and deep self-coding language models are widely applied to natural language understanding tasks recently, and the strong language representation capability of the deep self-coding language models can improve the performance of the natural language understanding tasks.
However, the existing pre-training and fine-tuning method of the self-coding language model is not specific to a specific text matching task, only searches based on keyword matching, only considers information in a grammar level, only returns a webpage related to a result, does not pay attention to semantic matching, and leads to difficulty in obtaining accurate text information by a user under the condition that the user is difficult to express the self requirement by using keywords.
For example, patent document CN105608201A discloses a text matching method supporting a multi-keyword expression, including: a grammar conversion stage, which converts the multi-keyword expression into a plurality of groups of keywords; a keyword matching stage, namely, taking a plurality of groups of keywords output in the grammar conversion stage as input, and completing by adopting a keyword matching algorithm to obtain keywords appearing in the text; and a matching degree determining stage, wherein the text with the keywords output in the keyword matching stage is used as input, and the matching degree of the keywords in the keyword matching stage and the multiple groups of keywords obtained in the grammar conversion stage is determined. But the technical scheme has complex matching logic expression and needs strong processing system support.
For example, patent document CN113283235B provides a method and system for predicting a user tag, including: acquiring a user text set and a preset keyword library; obtaining each approximate word in a user text through the keywords, obtaining the keywords corresponding to the approximate words m before ranking according to the magnitude of the degree of association, determining n-dimensional vectors matched with the corresponding keywords, and determining a feature matrix through the m n-dimensional vectors; inputting the characteristic matrix into a neural network for training to obtain a prediction model; and predicting the text of the user to be processed through the prediction model to obtain a predicted user label. However, in the technical scheme, the search is performed only on the basis of keyword matching, and semantic matching is not concerned, so that a user is difficult to acquire accurate text information.
Disclosure of Invention
In order to solve the technical problem, the invention provides a natural language processing method based on artificial intelligence, which comprises the following steps:
s1, acquiring a natural language information original data set, and performing anomaly analysis on the original data set to generate an anomaly value set of the original data set;
s2, removing sample data in the abnormal value set from the original data set, inputting the information data set from which the sample data in the abnormal value set is removed into a semantic matching model for recognition, and determining a semantic matching result;
and S3, after multiplying the predicted value of the matching result by the estimated loss value, sequencing according to the product size, wherein the sequence obtained after sequencing is the natural language processing result.
Further, step S1 specifically includes: randomly selecting m samples from an original data set to form a network topology structure, wherein n NODEs form a NODE set NODE = { NODE =1,node2,……,nodenH, the node path length set is Lnode={ L1,…, Li,…,LnAnd the standard deviation of the path length of the network topology structure is:
Figure 100002_DEST_PATH_IMAGE002
wherein n is the total number of nodes,
Figure 100002_DEST_PATH_IMAGE004
is the average value of the node path length;
normalizing the path length standard difference set to obtain normalized index
Figure 100002_DEST_PATH_IMAGE006
Expressed as:
Figure DEST_PATH_IMAGE008
wherein the path length standard deviation of the network topology is set as
Figure 100002_DEST_PATH_IMAGE010
={
Figure 768490DEST_PATH_IMAGE010
1,…,
Figure 216789DEST_PATH_IMAGE010
i,…,
Figure 652318DEST_PATH_IMAGE010
nAt a maximum of
Figure 578686DEST_PATH_IMAGE010
maxMinimum value of
Figure 534004DEST_PATH_IMAGE010
min
The outliers in the sample points were calculated as:
Figure 100002_DEST_PATH_IMAGE012
wherein the path length set of m sample points is Hd={h1,…,hi, …,hm};
Calculating an abnormal value of each sample point, and combining the abnormal values into an abnormal value set N;
randomly selecting m samples for multiple times, calculating an abnormal value set, and forming an abnormal value set N capable of covering the original data setGeneral assembly
Further, in step S2, the semantic matching model includes an input layer, an intermediate layer and an output layer; the input layer calculates the weight of each input vector by using a positive and inverse frequency algorithm; the middle layer adopts a multi-layer bidirectional feature extraction model; the output layer calculates an output vector using a failure estimation model.
Further, the forward and inverse frequency algorithm specifically includes:
calculating the inverse frequency idf (E) of the input vector E:
IDF(E)=log(P/nE);
wherein, P is the total number of the training vector set; n isEThe times of the input vector E appearing in the training vector set;
computing input vector weights K (E, D)i):
Figure 100002_DEST_PATH_IMAGE014
Wherein, TF (E, D)i) Training vector set I for input vector EiFrequency of middle;
Figure 100002_DEST_PATH_IMAGE016
is a normalization factor.
Further, the multi-layer bidirectional feature extraction model has three sublayers, namely a bidirectional Transformer coding layer, an interaction layer and a normalization layer.
Further, in each of the bidirectional Transformer coding layers, a matrix K composed of an input matrix X and each input vector weight calculated by a forward inverse frequency algorithm is used as an input, and an output matrix Z of the bidirectional Transformer coding layer is calculated:
Figure 100002_DEST_PATH_IMAGE018
where d is the dimension of the input matrix X, Q represents the vector sequence of the sets of input vectors E1, …, En, and B is the number of encodings.
Further, in the interaction layer, let the output matrix in the left and right directions be expressed as Z1And Z2Then the interaction matrix of the two output matrices is calculated as follows:
R1=Z1*Z2 T
R2=Z2*Z1 T
wherein R is1Is Z1The interaction matrix of (2); r2Is Z2The interaction matrix of (2);
calculating the final output matrix R after passing through each side coding layermulWhere H is the number of coding layers, RiAn output matrix representing the i-th coding layer, C (R)i) The function represents that all H coding layers are spliced together;
Rmul=C(Ri) ,i=1,…,H;
calculating layer normalization, output matrix after layer normalization
Figure 100002_DEST_PATH_IMAGE020
Expressed as:
Figure 100002_DEST_PATH_IMAGE022
wherein the LN function represents a layer normalization function.
Further, output matrixes on the left side and the right side after layer normalization are respectively represented as v1 and v2, and v1 and v2 are subjected to matching operation:
Figure 100002_DEST_PATH_IMAGE024
wherein y' represents the predicted value of the matching result of the two texts, v1 represents that v2 multiplies the corresponding elements of v1 and v2 one by one, and the function F represents that the spliced vectors of 4 vectors are input into a classifier for processing and the predicted value of the matching result is output.
Further, the calculating of the estimated failure value of the predicted value of the matching result by using the failure estimation model specifically includes:
predicting a loss value Lp of a sample of an output layer based on an ith left training sample and a corresponding right training sample in the obtained left training samples:
Figure 100002_DEST_PATH_IMAGE026
wherein,
Figure 100002_DEST_PATH_IMAGE028
may refer to the similarity between the head and tail samples in the ith left training sample,
Figure 100002_DEST_PATH_IMAGE030
the similarity between a head sample and a reference sample included in a jth right training sample in a right training sample corresponding to an ith left training sample is shown, I is the number of the right training samples, I is an integer less than or equal to the total number of the left training samples, and j is an integer less than or equal to I;
degree of similarity
Figure 75842DEST_PATH_IMAGE028
Expressed in cosine similarity:
Figure 100002_DEST_PATH_IMAGE032
wherein ei1 refers to the vector representation of the head sample in the ith left training sample, ei2 refers to the vector representation of the tail sample in the ith left training sample,
Figure DEST_PATH_IMAGE034
are empirical parameters.
The invention also provides a natural language processing system based on artificial intelligence, which is used for realizing the natural language processing method.
The invention has the technical effects and advantages that:
1. the invention carries out natural language processing by deep learning based on artificial intelligence, improves the comprehension degree of a program by checking and using a mode in data, tunes the input weight to improve the prediction accuracy, and continuously optimizes each level when the natural language processing is increased along with the use duration, so that the intelligent accuracy of the natural language processing is gradually improved.
2. The invention uses the semantic matching model as the language processing module of the natural language processing system to help the system to quickly process the language, provides a great deal of language characteristic information for the artificial intelligent deep neural network, reduces the operation amount of the deep neural network, is beneficial to the quick operation of natural language processing and improves the processing efficiency.
3. The invention identifies the natural language information after removing the data abnormal points and determines the semantic matching result, thereby reducing the data processing operation amount of the information acquisition hardware equipment, simplifying the hardware structure and being suitable for mass popularization and generalization.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed to be used in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without inventive exercise.
FIG. 1 is a flow chart of a natural language processing method based on artificial intelligence according to the present invention.
Fig. 2 is a schematic diagram of the network topology of the present invention.
FIG. 3 is a schematic structural diagram of the semantic matching model of the present invention.
Fig. 4 is a schematic architecture diagram of the intermediate layer and the output layer of the present invention.
Fig. 5 is a data diagram of data processing using the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Referring to fig. 1, a flow chart of the artificial intelligence-based natural language processing method according to the present invention includes the following steps:
and S1, acquiring the natural language information original data set, and performing anomaly analysis on the original data set to generate an anomaly value set of the original data set.
According to the characteristics of abnormal data, the abnormal values of the data can be divided into abnormal large values, abnormal small values, zero values, negative values and missing values. The causes of the zero value and the negative value are complex, the zero value and the negative value need to be screened out for manual identification, and the zero value and the negative value in the identification data need to be judged by combining the actual situation of the data when the zero value and the negative value are abnormal values; the abnormal large value and the abnormal small value are values different from the normal rule of the data, and are not simple data beyond a certain threshold value, because for the data in the normal range, if the data are inconsistent with the rule of the data at the adjacent moment, the data are also judged to be abnormal values; the missing value is caused by object abnormality, and if only simple deletion or zero-setting processing is performed on the missing value, the accuracy of data at a time close to the missing value is affected, so that the abnormal value needs to be corrected. In the present embodiment, analysis is performed only for the case of an abnormal value distinguished from the normal regularity of data.
Specifically, the original data set is a data set with M data, and M samples are randomly selected from the original data set to form a network topology, as shown in fig. 2, the network topology in the figure has n NODEs, and the n NODEs form a NODE set NODE = { NODE = { (NODE)1,node2,……,nodenH, the node path length set is Lnode={ L1,…, Li,…,LnCalculating the standard deviation of the path length of the network topology structure by using the following formula:
Figure DEST_PATH_IMAGE035
wherein n is the total number of nodes,
Figure DEST_PATH_IMAGE004A
is the average of the node path lengths.
If the path length standard deviation set of the network topology is as
Figure 936350DEST_PATH_IMAGE010
={
Figure 289578DEST_PATH_IMAGE010
1,…,
Figure 224036DEST_PATH_IMAGE010
i,…,
Figure 889504DEST_PATH_IMAGE010
nAt a maximum of
Figure 909412DEST_PATH_IMAGE010
maxMinimum value of
Figure 302216DEST_PATH_IMAGE010
minNormalizing the path length standard difference set to obtain normalized index
Figure DEST_PATH_IMAGE006A
Expressed as:
Figure DEST_PATH_IMAGE036
in the network topology, m piecesThe path length set of the local point is Hd={h1,…,hi, …,hmAnd f, calculating an abnormal value S in the sample point according to the formula:
Figure DEST_PATH_IMAGE037
weighting and calculating the abnormal value of each sample point by using an abnormal value calculation formula, and combining the abnormal values into an abnormal value set N;
repeating the steps, randomly selecting M samples for multiple times, calculating an abnormal value set, and finally forming an abnormal value set N capable of covering all M dataGeneral assembly
And S2, removing the sample data in the abnormal value set from the original data set, inputting the information data set from which the sample data in the abnormal value set is removed into a semantic matching model for recognition, and determining a semantic matching result.
In the embodiment of the application, the fact that the actual semantic information may have errors is considered, and the semantic interference caused by the errors of the actual semantic information can be avoided by using the input of the semantic vector, so that the information input to the semantic matching model for recognition is processed by semantic vector word segmentation, and the method does not depend on any pre-training technology and word segmentation technology, and cannot cause errors caused by inaccuracy of the pre-training technology or the word segmentation technology.
As shown in fig. 3, the semantic matching model is a schematic structural diagram, and includes an input layer, an intermediate layer, and an output layer.
Wherein, in the input layer, E1, … and En are used for representing input vectors of the semantic matching model; the middle layer adopts a multi-layer bidirectional Transformer characteristic extraction model; in the output layer, T1, …, Tn represent the output vector of the semantic matching model. The semantic matching model is used for obtaining word vectors, so that the application of a subsequent text classifier is facilitated.
For the input layer, in order to strengthen the influence of the input vectors of the semantic matching model, the weight of each input vector is calculated by using a positive inverse frequency algorithm.
The forward and inverse frequency algorithm is a weighted statistical algorithm for information retrieval and text mining to evaluate the importance of a semantic information to a data set or a corpus.
Let the input vector be E, the forward and inverse frequency algorithm be the forward frequency × the inverse frequency, TF be the forward frequency, and IDF be the inverse frequency.
The inverse frequency idf (E) of the input vector E is calculated as follows:
IDF(E)=log(P/nE);
in the formula: p is the total number of the training vector set; n isEThe number of times the input vector E appears in the training vector set. Input vector weights K (E, D) calculated using a positive inverse frequency algorithmi) The calculation formula of (a) is as follows:
Figure DEST_PATH_IMAGE038
in the formula: TF (E, D)i) Training vector set I for input vector EiFrequency of middle;
Figure DEST_PATH_IMAGE039
is a normalization factor.
Fig. 4 is a schematic diagram of the architecture of the middle layer and the output layer.
The intermediate layer adopts a multi-layer bidirectional Transformer feature extraction model, the multi-layer bidirectional Transformer feature extraction model has three sub-layers, namely a bidirectional Transformer coding layer, an interaction layer and a normalization layer, and the structure of the multi-layer bidirectional Transformer feature extraction model is shown in fig. 4.
The multi-group input vectors E1, … and En form an input matrix X and are input into a multi-layer bidirectional Transformer characteristic extraction model, and the calculation process realized in the multi-layer bidirectional Transformer characteristic extraction model is as follows:
in each of the bidirectional Transformer encoding layers, a matrix K composed of an input matrix X and each input vector weight calculated by a forward-inverse frequency algorithm is used as an input, and an output matrix Z of the bidirectional Transformer encoding layer is calculated.
Figure DEST_PATH_IMAGE040
Where d is the dimension of the input matrix X, Q represents the vector sequence of the sets of input vectors E1, …, En, and B is the number of encoding times, i.e., the number of layers of the bidirectional transform encoding layer.
In the interaction layer, since the transform coding layer is bidirectional, it is assumed here that the left and right directional output matrices are represented as Z1 and Z2, and the interaction matrix of the two output matrices is calculated as follows:
R1=Z1*Z2 T
R2=Z2*Z1 T
wherein R is1Is Z1The interaction matrix of (2); r2Is Z2The interaction matrix of (2).
Calculating the final output matrix R after passing through each side coding layermulWhere H is the number of coding layers, RiAn output matrix representing the i-th coding layer, C (R)i) The function represents the concatenation of all H encoded layers, RmulIs the same as the input matrix X.
Rmul=C(Ri) ,i=1,…,H;
In the normalization layer, the LN function is used to calculate the output matrix after layer normalization
Figure DEST_PATH_IMAGE041
Expressed as:
Figure DEST_PATH_IMAGE042
in the output layer, assuming that output matrixes on the left side and the right side after layer normalization are respectively represented as v1 and v2, v1 and v2 are input into the matching layer to perform matching operation, and the matching result of the two texts is calculated as follows:
Figure DEST_PATH_IMAGE043
wherein y' represents a predicted value of a matching result of two texts, and v1 v2 represents that corresponding elements of v1 and v2 are in phase-by-phaseMultiplication emphasizes the identity between two texts, while | V1-V2 | emphasizes the difference between two texts, the function F represents the concatenation vector V =that would be 4 vectors
Figure DEST_PATH_IMAGE045
The input is input to a classifier to process and output a predicted value of a matching result.
In the process of detecting specific data in a big data embedded network, the splicing vector V = of 4 vectors obtained by the formula
Figure DEST_PATH_IMAGE046
For the basis, the method is fused with fractional Fourier transformation to perform data matching processing, performs data classification space guidance, constructs a K-L data classifier, and realizes big data embedded data classification by using the classifier, and the specific steps are as follows:
in the classification process of big data embedded data, the splicing vector of 4 vectors obtained by the above formula is used as a basis, and the following defined fractional order Fourier transform expression is utilized:
Figure DEST_PATH_IMAGE048
in the formula,
Figure DEST_PATH_IMAGE050
a transform kernel representing a fractional order Fourier transform,
Figure DEST_PATH_IMAGE052
the representative data features match the rotation angles,
Figure DEST_PATH_IMAGE054
a token representing the form of the transform operator,
Figure DEST_PATH_IMAGE056
a set of attributes representing a clustering feature of the data.
In the classification process of the big data embedded data, the data matching based on the fractional order Fourier transformation is realized by utilizing the rotational additivity in the fractional order Fourier domain.
Figure DEST_PATH_IMAGE058
Wherein P represents the order of the fractional order Fourier domain of the specific data which is positive real number, q represents the order of the fractional order Fourier domain of the specific data which is negative real number,
Figure DEST_PATH_IMAGE060
and representing a large data embedded data classification fractional order Fourier domain.
The classifier obtained by the above formula is used for obtaining the energy distribution of the specific data among different frequencies in the big data embedded network, thereby realizing the detection of the specific data in the big data network.
And calculating an estimated failure value of a predicted value y' of the matching result by using the failure estimation model, taking output matrixes on the left side and the right side after layer normalization as a left training sample and a right training sample respectively, and inputting the left training sample and the right training sample into the failure estimation model of an output layer to obtain the estimated failure value.
The estimated failure value is used for indicating the efficiency of the output layer for predicting the sample, and is used for measuring the performance of the output layer for predicting the sample. The smaller the estimated failure value is, the better the sample prediction performance of the output layer is, namely the higher the prediction accuracy is.
In the embodiment of the present application, the estimated failure value Lp of the output layer obtained based on the ith left training sample and the corresponding right training sample in the obtained left training samples:
Figure DEST_PATH_IMAGE061
wherein,
Figure DEST_PATH_IMAGE062
may refer to the similarity between the head and tail samples in the ith left training sample,
Figure DEST_PATH_IMAGE030A
the similarity between a head sample and a reference sample included in a jth right training sample in the right training samples corresponding to the ith left training sample is referred to, I is the number of the right training samples, I is an integer less than or equal to the total number of the left training samples, and j may be an integer less than or equal to I.
In a preferred embodiment, the similarity is expressed in terms of cosine similarity, as described above
Figure 956095DEST_PATH_IMAGE028
Namely, it can satisfy:
Figure DEST_PATH_IMAGE031
where ei1 may refer to the vector representation of the leading sample in the ith left training sample, ei2 refers to the vector representation of the trailing sample in the ith left training sample,
Figure DEST_PATH_IMAGE063
is an empirical parameter, typically 1.5. The similarity between the head sample and the reference sample included in the jth right-direction training sample can be calculated by referring to the above formula
Figure DEST_PATH_IMAGE065
And will not be described herein.
And S3, after multiplying the predicted value of the matching result by the estimated loss value, sequencing according to the product size, wherein the sequence obtained after sequencing is the natural language processing result. Fig. 5 is a data diagram showing the data processing by the above steps.
The invention also provides a natural language processing system based on artificial intelligence, which is used for realizing the natural language processing method.
The system comprises an acquisition module, a data processing module and a data processing module, wherein the acquisition module is used for acquiring an original data set of natural language information; in a preferred embodiment, the raw data set of natural language information further comprises at least two data sets to be trained.
The processor is used for removing the sample data in the abnormal value set from the original data set, inputting the information data set from which the sample data in the abnormal value set is removed into the semantic matching model for identification, and determining a semantic matching result; and (4) after multiplying the predicted value of the matching result by the estimated loss value, sequencing according to the product size, wherein the sequence obtained after sequencing is the natural language processing result. The processor provided in this embodiment may be deployed in a computer device, and may generate a large difference due to different configurations or performances, and may include one or more Central Processing Units (CPUs) and memories, and one or more storage media (e.g., one or more mass storage devices) for storing applications or data. The memory and storage medium may be, among other things, transient or persistent storage. The program stored on the storage medium may include one or more modules, each of which may include a series of instruction operations for the server. Further, the processor may be configured to communicate with the storage medium and execute a series of instruction operations in the storage medium on the processor.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
Embodiments of the present application also provide a computer-readable storage medium, in which a computer program is stored, and when the computer program runs on a computer, the computer is caused to execute the method described in the foregoing embodiments.
Embodiments of the present application also provide a computer program product including a program, which, when run on a computer, causes the computer to perform the methods described in the foregoing embodiments.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by hardware instructions of a computer program, which can be stored in a non-volatile computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database or other medium used in the embodiments provided herein can include at least one of non-volatile and volatile memory. Non-volatile memory may include Read Only Memory (ROM), magnetic tape, floppy disk, flash memory, optical storage, or the like. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM can take many forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM), among others.
The above-mentioned embodiments only express several embodiments of the present application, and the description thereof is more specific and detailed, but not construed as limiting the scope of the invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the concept of the present application, which falls within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims (10)

1. A natural language processing method based on artificial intelligence is characterized by comprising the following steps:
s1, acquiring a natural language information original data set, and performing anomaly analysis on the original data set to generate an anomaly value set of the original data set;
s2, removing sample data in the abnormal value set from the original data set, inputting the information data set from which the sample data in the abnormal value set is removed into a semantic matching model for recognition, and determining a semantic matching result;
and S3, after multiplying the predicted value of the matching result by the estimated loss value, sequencing according to the product size, wherein the sequence obtained after sequencing is the natural language processing result.
2. The natural language processing method based on artificial intelligence of claim 1, wherein the step S1 specifically includes: randomly selecting m samples from an original data set to form a network topology structure, wherein n nodes form the network topology structureNODE set NODE = { NODE1,node2,……,nodenH, the node path length set is Lnode={ L1,…,Li,…,LnAnd the standard deviation of the path length of the network topology structure is:
Figure DEST_PATH_IMAGE002
wherein n is the total number of nodes,
Figure DEST_PATH_IMAGE004
is the average value of the node path length;
normalizing the path length standard difference set to obtain normalized index
Figure DEST_PATH_IMAGE006
Expressed as:
Figure DEST_PATH_IMAGE007
wherein the path length standard deviation of the network topology is set as
Figure DEST_PATH_IMAGE009
={
Figure 488949DEST_PATH_IMAGE009
1,…,
Figure 733986DEST_PATH_IMAGE009
i,…,
Figure 310460DEST_PATH_IMAGE009
nAt a maximum of
Figure 33566DEST_PATH_IMAGE009
maxMinimum value of
Figure 316780DEST_PATH_IMAGE009
min
The outliers in the sample points were calculated as:
Figure DEST_PATH_IMAGE010
wherein the path length set of m sample points is Hd={h1,…,hi, …,hm};
Calculating an abnormal value of each sample point, and combining the abnormal values into an abnormal value set N;
randomly selecting m samples for multiple times, calculating an abnormal value set, and forming an abnormal value set N capable of covering the original data setGeneral assembly
3. The artificial intelligence based natural language processing method according to claim 1, wherein in step S2, the semantic matching model includes an input layer, an intermediate layer and an output layer; the input layer calculates the weight of each input vector by using a positive and inverse frequency algorithm; the middle layer adopts a multi-layer bidirectional feature extraction model; the output layer calculates an output vector using a failure estimation model.
4. The artificial intelligence based natural language processing method according to claim 3, wherein the forward and inverse frequency algorithm specifically comprises:
calculating the inverse frequency idf (E) of the input vector E:
IDF(E)=log(P/nE);
wherein, P is the total number of the training vector set; n isEThe times of the input vector E appearing in the training vector set;
computing input vector weights K (E, D)i):
Figure DEST_PATH_IMAGE012
Wherein, TF (E, D)i) Training vector set I for input vector EiFrequency of middle;
Figure DEST_PATH_IMAGE014
is a normalization factor.
5. The artificial intelligence based natural language processing method of claim 3, wherein the multi-layered bidirectional feature extraction model has three sub-layers, which are a bidirectional Transformer coding layer, an interaction layer and a normalization layer.
6. The artificial intelligence based natural language processing method according to claim 5, wherein in each of the bidirectional fransformer coding layers, a matrix K formed by an input matrix X and a weight of each input vector calculated by a forward inverse frequency algorithm is used as an input, and an output matrix Z of the bidirectional fransformer coding layer is calculated:
Figure DEST_PATH_IMAGE016
where d is the dimension of the input matrix X, Q represents the vector sequence of the sets of input vectors E1, …, En, and B is the number of encodings.
7. The artificial intelligence based natural language processing method of claim 5,
in the interaction layer, the output matrix in left and right directions is represented as Z1And Z2Then the interaction matrix of the two output matrices is calculated as follows:
R1=Z1*Z2 T
R2=Z2*Z1 T
wherein R is1Is Z1The interaction matrix of (2); r2Is Z2The interaction matrix of (2);
computing a pass throughFinal output matrix R after each side coding layermulWhere H is the number of coding layers, RiAn output matrix representing the i-th coding layer, C (R)i) The function represents that all H coding layers are spliced together;
Rmul=C(Ri) ,i=1,…,H;
calculating layer normalization, output matrix after layer normalization
Figure DEST_PATH_IMAGE018
Expressed as:
Figure DEST_PATH_IMAGE020
wherein the LN function represents a layer normalization function.
8. The artificial intelligence based natural language processing method according to claim 7, wherein the output matrix representations of the left and right sides after the layer normalization are v1 and v2, respectively, and v1 and v2 are subjected to matching operations:
Figure DEST_PATH_IMAGE022
wherein y' represents the predicted value of the matching result of the two texts, v1 represents that v2 multiplies the corresponding elements of v1 and v2 one by one, and the function F represents that the spliced vectors of 4 vectors are input into a classifier for processing and the predicted value of the matching result is output.
9. The artificial intelligence based natural language processing method according to claim 8, wherein the calculating an estimated failure value of the predicted value of the matching result using the failure estimation model specifically includes:
predicting a loss value Lp of a sample of an output layer based on an ith left training sample and a corresponding right training sample in the obtained left training samples:
Figure DEST_PATH_IMAGE024
wherein,
Figure DEST_PATH_IMAGE026
may refer to the similarity between the head and tail samples in the ith left training sample,
Figure DEST_PATH_IMAGE028
the similarity between a head sample and a reference sample included in a jth right training sample in a right training sample corresponding to an ith left training sample is shown, I is the number of the right training samples, I is an integer less than or equal to the total number of the left training samples, and j is an integer less than or equal to I;
degree of similarity
Figure 686712DEST_PATH_IMAGE026
Expressed in cosine similarity:
Figure DEST_PATH_IMAGE030
wherein ei1 refers to the vector representation of the head sample in the ith left training sample, ei2 refers to the vector representation of the tail sample in the ith left training sample,
Figure DEST_PATH_IMAGE032
are empirical parameters.
10. A natural language processing system based on artificial intelligence, wherein the natural language processing system is used for implementing the natural language processing method according to any one of claims 1 to 9.
CN202210260510.7A 2022-03-17 2022-03-17 Natural language processing system and method based on artificial intelligence Active CN114330370B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210260510.7A CN114330370B (en) 2022-03-17 2022-03-17 Natural language processing system and method based on artificial intelligence

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210260510.7A CN114330370B (en) 2022-03-17 2022-03-17 Natural language processing system and method based on artificial intelligence

Publications (2)

Publication Number Publication Date
CN114330370A true CN114330370A (en) 2022-04-12
CN114330370B CN114330370B (en) 2022-05-20

Family

ID=81033553

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210260510.7A Active CN114330370B (en) 2022-03-17 2022-03-17 Natural language processing system and method based on artificial intelligence

Country Status (1)

Country Link
CN (1) CN114330370B (en)

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108776694A (en) * 2018-06-05 2018-11-09 哈尔滨工业大学 A kind of time series abnormal point detecting method and device
CN109657947A (en) * 2018-12-06 2019-04-19 西安交通大学 A kind of method for detecting abnormality towards enterprises ' industry classification
US20190318407A1 (en) * 2015-07-17 2019-10-17 Devanathan GIRIDHARI Method for product search using the user-weighted, attribute-based, sort-ordering and system thereof
CN110956224A (en) * 2019-08-01 2020-04-03 平安科技(深圳)有限公司 Evaluation model generation method, evaluation data processing method, evaluation model generation device, evaluation data processing equipment and medium
CN111666502A (en) * 2020-07-08 2020-09-15 腾讯科技(深圳)有限公司 Abnormal user identification method and device based on deep learning and storage medium
CN111753527A (en) * 2020-06-29 2020-10-09 平安科技(深圳)有限公司 Data analysis method and device based on natural language processing and computer equipment
CN111860850A (en) * 2019-04-28 2020-10-30 第四范式(北京)技术有限公司 Model training method, information processing method and device and electronic equipment
CN111882431A (en) * 2020-08-04 2020-11-03 武汉众邦银行股份有限公司 Intelligent message pushing method based on NLP deep learning
CN113011911A (en) * 2021-01-21 2021-06-22 腾讯科技(深圳)有限公司 Data prediction method, device, medium and electronic equipment based on artificial intelligence
CN113158076A (en) * 2021-04-05 2021-07-23 北京工业大学 Social robot detection method based on variational self-coding and K-nearest neighbor combination
CN113436698A (en) * 2021-08-27 2021-09-24 之江实验室 Automatic medical term standardization system and method integrating self-supervision and active learning

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190318407A1 (en) * 2015-07-17 2019-10-17 Devanathan GIRIDHARI Method for product search using the user-weighted, attribute-based, sort-ordering and system thereof
CN108776694A (en) * 2018-06-05 2018-11-09 哈尔滨工业大学 A kind of time series abnormal point detecting method and device
CN109657947A (en) * 2018-12-06 2019-04-19 西安交通大学 A kind of method for detecting abnormality towards enterprises ' industry classification
CN111860850A (en) * 2019-04-28 2020-10-30 第四范式(北京)技术有限公司 Model training method, information processing method and device and electronic equipment
CN110956224A (en) * 2019-08-01 2020-04-03 平安科技(深圳)有限公司 Evaluation model generation method, evaluation data processing method, evaluation model generation device, evaluation data processing equipment and medium
CN111753527A (en) * 2020-06-29 2020-10-09 平安科技(深圳)有限公司 Data analysis method and device based on natural language processing and computer equipment
CN111666502A (en) * 2020-07-08 2020-09-15 腾讯科技(深圳)有限公司 Abnormal user identification method and device based on deep learning and storage medium
CN111882431A (en) * 2020-08-04 2020-11-03 武汉众邦银行股份有限公司 Intelligent message pushing method based on NLP deep learning
CN113011911A (en) * 2021-01-21 2021-06-22 腾讯科技(深圳)有限公司 Data prediction method, device, medium and electronic equipment based on artificial intelligence
CN113158076A (en) * 2021-04-05 2021-07-23 北京工业大学 Social robot detection method based on variational self-coding and K-nearest neighbor combination
CN113436698A (en) * 2021-08-27 2021-09-24 之江实验室 Automatic medical term standardization system and method integrating self-supervision and active learning

Also Published As

Publication number Publication date
CN114330370B (en) 2022-05-20

Similar Documents

Publication Publication Date Title
CN111401077B (en) Language model processing method and device and computer equipment
CN116194912A (en) Method and system for aspect-level emotion classification using graph diffusion transducers
CN109408743B (en) Text link embedding method
CN109344399B (en) Text similarity calculation method based on stacked bidirectional lstm neural network
Tran et al. Ensemble application of ELM and GPU for real-time multimodal sentiment analysis
CN110990555B (en) End-to-end retrieval type dialogue method and system and computer equipment
CN111797196A (en) Service discovery method combining attention mechanism LSTM and neural topic model
Grzegorczyk Vector representations of text data in deep learning
CN112307048B (en) Semantic matching model training method, matching method, device, equipment and storage medium
CN112988970A (en) Text matching algorithm serving intelligent question-answering system
CN115497465A (en) Voice interaction method and device, electronic equipment and storage medium
CN116304748A (en) Text similarity calculation method, system, equipment and medium
Somogyi The Application of Artificial Intelligence
CN116975271A (en) Text relevance determining method, device, computer equipment and storage medium
Lin et al. Lifelong Text-Audio Sentiment Analysis learning
CN113516094A (en) System and method for matching document with review experts
Menon et al. Improving ranking in document based search systems
CN111666375A (en) Matching method of text similarity, electronic equipment and computer readable medium
CN114330370B (en) Natural language processing system and method based on artificial intelligence
CN116680407A (en) Knowledge graph construction method and device
CN114595324A (en) Method, device, terminal and non-transitory storage medium for power grid service data domain division
Bouallégue et al. Learning deep wavelet networks for recognition system of Arabic words
CN114003773A (en) Dialogue tracking method based on self-construction multi-scene
CN113590755A (en) Word weight generation method and device, electronic equipment and storage medium
CN113157892A (en) User intention processing method and device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20221101

Address after: 1709, F13, Block A, Yard 93, Jianguo Road, Chaoyang District, Beijing 100022

Patentee after: Li Jin

Address before: 300000 No. 201-10, unit 2, building 2, No. 39, Gaoxin Sixth Road, Binhai science and Technology Park, high tech Zone, Binhai New Area, Tianjin

Patentee before: Tianjin Sirui Information Technology Co.,Ltd.