CN114330370A - Natural language processing system and method based on artificial intelligence - Google Patents
- Publication number
- CN114330370A (application number CN202210260510.7A)
- Authority
- CN
- China
- Prior art keywords
- natural language
- layer
- language processing
- sample
- input
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Landscapes
- Machine Translation (AREA)
Abstract
The invention provides an artificial-intelligence-based natural language processing system and method, which obtain a raw data set of natural language information, perform anomaly analysis on the raw data set, and generate an abnormal value set for it; the sample data in the abnormal value set are removed from the raw data set, the remaining information data set is input into a semantic matching model for recognition, and a semantic matching result is determined; the predicted values of the matching results, taking the estimated loss values into account, are sorted by size, and the sorted sequence is the natural language processing result. As usage time grows, each layer is continuously optimized, so that the accuracy of the natural language processing gradually improves.
Description
Technical Field
The invention relates to the technical field of natural language processing, in particular to a natural language processing system and a natural language processing method based on artificial intelligence.
Background
With the advent of the big data age, the internet faces an explosion of text information. Natural language processing (NLP) is an important direction in the fields of computer science and artificial intelligence. It studies theories and methods that enable effective communication between humans and computers in natural language. NLP is a science integrating linguistics, computer science and mathematics; research in this field therefore involves natural language, the language people use every day, and is closely related to linguistics. NLP techniques typically include text processing, semantic understanding, machine translation, question answering, and knowledge graphs.
People generally retrieve the information they need through search engines. Text matching is a core problem in natural language understanding, with concrete applications in search, advertising, recommendation, intelligent customer service, and other real-world fields. Many natural language understanding tasks, such as paraphrase identification, duplicate question detection, natural language inference, and machine reading comprehension, can be formalized as text matching problems.
For the study of text matching, traditional methods mainly focus on manually defined features. With the rise of deep learning, many researchers use deep representation learning for text matching, and deep self-encoding language models have recently been widely applied to natural language understanding tasks; their strong language representation capability can improve the performance of those tasks.
However, the existing pre-training and fine-tuning method for self-encoding language models is not tailored to a specific text matching task. It searches only on keyword matching, considers information only at the grammatical level, and returns only web pages related to the result, without attending to semantic matching. When users find it difficult to express their needs as keywords, it is therefore hard for them to obtain accurate text information.
For example, patent document CN105608201A discloses a text matching method supporting multi-keyword expressions, including: a grammar conversion stage, which converts a multi-keyword expression into multiple groups of keywords; a keyword matching stage, which takes the groups of keywords output by the grammar conversion stage as input and applies a keyword matching algorithm to obtain the keywords appearing in the text; and a matching degree determination stage, which takes the text containing the matched keywords as input and determines the degree of match between those keywords and the groups of keywords from the grammar conversion stage. However, this scheme has complex matching logic expressions and requires strong processing system support.
For example, patent document CN113283235B provides a method and system for predicting user tags, including: acquiring a user text set and a preset keyword library; obtaining each approximate word in a user text through the keywords, taking the keywords corresponding to the top-m approximate words ranked by degree of association, determining the n-dimensional vectors matched with those keywords, and building a feature matrix from the m n-dimensional vectors; inputting the feature matrix into a neural network for training to obtain a prediction model; and predicting the text of the user to be processed with the prediction model to obtain a predicted user tag. However, this scheme also searches only on keyword matching and ignores semantic matching, so users find it difficult to acquire accurate text information.
Disclosure of Invention
In order to solve the technical problem, the invention provides a natural language processing method based on artificial intelligence, which comprises the following steps:
s1, acquiring a natural language information original data set, and performing anomaly analysis on the original data set to generate an anomaly value set of the original data set;
s2, removing sample data in the abnormal value set from the original data set, inputting the information data set from which the sample data in the abnormal value set is removed into a semantic matching model for recognition, and determining a semantic matching result;
and S3, after multiplying the predicted value of the matching result by the estimated loss value, sequencing according to the product size, wherein the sequence obtained after sequencing is the natural language processing result.
Further, step S1 specifically includes: randomly selecting m samples from the original data set to form a network topology, in which n NODEs form the NODE set NODE = {node_1, node_2, …, node_n} and the node path length set is L_node = {L_1, …, L_i, …, L_n}; the path length standard deviation of the network topology is

σ = sqrt((1/n) · Σ_{i=1..n} (L_i − L̄)²), where L̄ is the mean node path length;

the set of path length standard deviations of the network topology is σ_s = {σ_1, …, σ_i, …, σ_n}, with maximum σ_max and minimum σ_min;

the abnormal value S of each sample point is calculated from the normalized standard deviation index, where the path length set of the m sample points is H_d = {h_1, …, h_i, …, h_m};

the abnormal value of each sample point is calculated, and the abnormal values are combined into an abnormal value set N;

m samples are randomly selected multiple times and their abnormal value sets are calculated, forming an abnormal value set N_total that covers the original data set.
Further, in step S2, the semantic matching model includes an input layer, an intermediate layer and an output layer; the input layer calculates the weight of each input vector using a forward-inverse frequency algorithm; the intermediate layer adopts a multi-layer bidirectional feature extraction model; and the output layer calculates its output using a failure estimation model.
Further, the forward-inverse frequency algorithm specifically includes:

calculating the inverse frequency IDF(E) of the input vector E:

IDF(E) = log(P / n_E);

where P is the total number of vectors in the training vector set and n_E is the number of times the input vector E appears in the training vector set;

calculating the input vector weight K(E, D_i):

K(E, D_i) = TF(E, D_i) · IDF(E) / Z;

where TF(E, D_i) is the frequency of the input vector E in training vector set D_i, and Z is a normalization factor.
Further, the multi-layer bidirectional feature extraction model has three sublayers, namely a bidirectional Transformer coding layer, an interaction layer and a normalization layer.
Further, in each bidirectional Transformer coding layer, the input matrix X and the matrix K composed of the input vector weights calculated by the forward-inverse frequency algorithm are used as input, and the output matrix Z of the bidirectional Transformer coding layer is calculated as

Z = softmax(Q · K^T / √d) · X;

where d is the dimension of the input matrix X, Q represents the vector sequence of the input vector groups E1, …, En, and B is the number of encodings.
Further, in the interaction layer, let the output matrices in the left and right directions be Z_1 and Z_2; the interaction matrices of the two output matrices are calculated as follows:

R_1 = Z_1 · Z_2^T;

R_2 = Z_2 · Z_1^T;

where R_1 is the interaction matrix of Z_1 and R_2 is the interaction matrix of Z_2;

the final output matrix R_mul after passing through each side's coding layers is calculated, where H is the number of coding layers, R_i is the output matrix of the i-th coding layer, and the function C(R_i) splices all H coding layers together:

R_mul = C(R_i), i = 1, …, H;

the layer-normalized output is LN(R_mul), where the LN function denotes a layer normalization function.
Further, the layer-normalized output matrices on the left and right sides are denoted v1 and v2, respectively, and v1 and v2 are subjected to the matching operation:

y' = F([v1; v2; v1 ⊙ v2; |v1 − v2|]);

where y' represents the predicted value of the matching result of the two texts, v1 ⊙ v2 multiplies the corresponding elements of v1 and v2 one by one, and the function F inputs the concatenation of the 4 vectors into a classifier for processing and outputs the predicted value of the matching result.
Further, calculating the estimated loss value of the predicted matching value using the failure estimation model specifically includes:

predicting the loss value Lp of a sample of the output layer based on the i-th left training sample and the corresponding right training samples among the obtained left training samples;

where s_i refers to the similarity between the head and tail samples in the i-th left training sample, s'_ij refers to the similarity between the head sample and the reference sample included in the j-th right training sample among the right training samples corresponding to the i-th left training sample, I is the number of right training samples, i is an integer no greater than the total number of left training samples, and j is an integer no greater than I;

where e_i1 refers to the vector representation of the head sample in the i-th left training sample, e_i2 refers to the vector representation of the tail sample in the i-th left training sample, and γ is an empirical parameter.
The invention also provides a natural language processing system based on artificial intelligence, which is used for realizing the natural language processing method.
The invention has the technical effects and advantages that:
1. The invention performs natural language processing by artificial-intelligence-based deep learning, improves the program's comprehension by examining and using patterns in the data, tunes the input weights to improve prediction accuracy, and continuously optimizes each layer as usage time grows, so that the accuracy of natural language processing gradually improves.
2. The invention uses the semantic matching model as the language processing module of the natural language processing system, helping the system process language quickly; it provides rich linguistic feature information to the artificial-intelligence deep neural network, reduces the network's computation load, facilitates fast natural language processing, and improves processing efficiency.
3. The invention recognizes natural language information after removing abnormal data points and determines the semantic matching result, reducing the data processing load on information acquisition hardware, simplifying the hardware structure, and making the system suitable for large-scale popularization.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed to be used in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without inventive exercise.
FIG. 1 is a flow chart of a natural language processing method based on artificial intelligence according to the present invention.
Fig. 2 is a schematic diagram of the network topology of the present invention.
FIG. 3 is a schematic structural diagram of the semantic matching model of the present invention.
Fig. 4 is a schematic architecture diagram of the intermediate layer and the output layer of the present invention.
Fig. 5 is a data diagram of data processing using the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Referring to fig. 1, a flow chart of the artificial intelligence-based natural language processing method according to the present invention includes the following steps:
and S1, acquiring the natural language information original data set, and performing anomaly analysis on the original data set to generate an anomaly value set of the original data set.
According to the characteristics of abnormal data, abnormal values can be divided into abnormally large values, abnormally small values, zero values, negative values and missing values. The causes of zero and negative values are complex; they need to be screened out for manual identification, and whether a zero or negative value is actually abnormal must be judged against the actual situation of the data. Abnormally large and small values are values that deviate from the normal regularity of the data, not simply values beyond some threshold: even data within the normal range is judged abnormal if it is inconsistent with the pattern of data at adjacent moments. Missing values are caused by object abnormality; simply deleting or zeroing them would affect the accuracy of data at nearby moments, so such abnormal values need to be corrected. In the present embodiment, only abnormal values that deviate from the normal regularity of the data are analyzed.
Specifically, the original data set is a data set with M data items, and m samples are randomly selected from it to form a network topology, as shown in fig. 2. The network topology in the figure has n NODEs, which form the NODE set NODE = {node_1, node_2, …, node_n}, and the node path length set is L_node = {L_1, …, L_i, …, L_n}. The path length standard deviation of the network topology is calculated as

σ = sqrt((1/n) · Σ_{i=1..n} (L_i − L̄)²), where L̄ is the mean node path length.

If the set of path length standard deviations of the network topology is σ_s = {σ_1, …, σ_i, …, σ_n}, with maximum σ_max and minimum σ_min, the set is normalized to obtain the normalized index

σ̂_i = (σ_i − σ_min) / (σ_max − σ_min).

In the network topology, the path length set of the m sample points is H_d = {h_1, …, h_i, …, h_m}, and the abnormal value S of each sample point is calculated from the normalized index and the sample path lengths.
Using the abnormal value calculation formula, the abnormal value of each sample point is computed with weighting, and the abnormal values are combined into an abnormal value set N; repeating the above steps, m samples are randomly selected multiple times and their abnormal value sets calculated, finally forming an abnormal value set N_total that covers all M data items.
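The S1 scoring described above can be sketched as follows. This is a minimal sketch under stated assumptions: each sample's node path lengths are given as a list, the abnormal score is taken to be the min-max-normalized path length standard deviation, and the flagging threshold of 0.8 is a hypothetical choice, not taken from the patent.

```python
import statistics

def normalized_deviation_scores(path_length_sets):
    # one path-length standard deviation per sample's topology
    sigmas = [statistics.pstdev(lengths) for lengths in path_length_sets]
    s_min, s_max = min(sigmas), max(sigmas)
    span = (s_max - s_min) or 1.0  # guard when all deviations coincide
    # min-max normalization into [0, 1]
    return [(s - s_min) / span for s in sigmas]

def outlier_set(path_length_sets, threshold=0.8):
    # indices of samples whose normalized deviation exceeds the threshold
    scores = normalized_deviation_scores(path_length_sets)
    return {i for i, score in enumerate(scores) if score > threshold}
```

Repeating this over several random draws of m samples and taking the union of the resulting sets would correspond to building N_total.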
And S2, removing the sample data in the abnormal value set from the original data set, inputting the information data set from which the sample data in the abnormal value set is removed into a semantic matching model for recognition, and determining a semantic matching result.
In the embodiment of the application, considering that actual semantic information may contain errors, using semantic vector input avoids the semantic interference such errors would cause. The information input to the semantic matching model for recognition is therefore processed by semantic vector word segmentation; the method does not depend on any pre-training or word segmentation technique, and so introduces no errors caused by inaccuracy of those techniques.
Fig. 3 is a schematic structural diagram of the semantic matching model, which comprises an input layer, an intermediate layer and an output layer. In the input layer, E1, …, En denote the input vectors of the semantic matching model; the intermediate layer adopts a multi-layer bidirectional Transformer feature extraction model; in the output layer, T1, …, Tn denote the output vectors of the semantic matching model. The semantic matching model is used to obtain word vectors, which facilitates the application of a subsequent text classifier.
For the input layer, in order to strengthen the influence of the input vectors of the semantic matching model, the weight of each input vector is calculated using the forward-inverse frequency algorithm.

The forward-inverse frequency algorithm is a weighted statistical algorithm used in information retrieval and text mining to evaluate the importance of a piece of semantic information to a data set or corpus.

Let the input vector be E; the forward-inverse frequency weight is the forward frequency × the inverse frequency, where TF denotes the forward frequency and IDF the inverse frequency.
The inverse frequency IDF(E) of the input vector E is calculated as follows:

IDF(E) = log(P / n_E);

where P is the total number of vectors in the training vector set and n_E is the number of times the input vector E appears in the training vector set. The input vector weight K(E, D_i) calculated using the forward-inverse frequency algorithm is:

K(E, D_i) = TF(E, D_i) · IDF(E) / Z;

where TF(E, D_i) is the frequency of the input vector E in training vector set D_i, and Z is a normalization factor.
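A minimal sketch of this forward-inverse frequency weighting, under stated assumptions: the training vector set is modeled as a list of token lists, n_E is counted as the number of entries containing the term, TF is normalized by entry length, and no separate normalization factor is applied.

```python
import math

def inverse_frequency(term, documents):
    # n_E: number of entries in the training set containing the term (assumption)
    n_e = sum(1 for doc in documents if term in doc)
    # IDF(E) = log(P / n_E), with a zero fallback for unseen terms
    return math.log(len(documents) / n_e) if n_e else 0.0

def weight(term, doc, documents):
    # forward frequency TF(E, D_i), normalized by entry length (assumption)
    tf = doc.count(term) / len(doc)
    return tf * inverse_frequency(term, documents)
```

In practice a smoothing constant is often added inside the logarithm to avoid zero IDF for terms present in every entry; the patent does not specify one.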
Fig. 4 is a schematic diagram of the architecture of the middle layer and the output layer.
The intermediate layer adopts a multi-layer bidirectional Transformer feature extraction model, the multi-layer bidirectional Transformer feature extraction model has three sub-layers, namely a bidirectional Transformer coding layer, an interaction layer and a normalization layer, and the structure of the multi-layer bidirectional Transformer feature extraction model is shown in fig. 4.
The groups of input vectors E1, …, En form an input matrix X, which is input into the multi-layer bidirectional Transformer feature extraction model. The calculation realized in the model is as follows: in each bidirectional Transformer coding layer, the input matrix X and the matrix K composed of the input vector weights calculated by the forward-inverse frequency algorithm are used as input, and the output matrix Z of the coding layer is calculated as

Z = softmax(Q · K^T / √d) · X;

where d is the dimension of the input matrix X, Q represents the vector sequence of the input vector groups E1, …, En, and B is the number of encoding passes, i.e., the number of bidirectional Transformer coding layers.
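The coding-layer computation can be sketched as standard scaled dot-product attention. Treating the weight matrix K as the attention keys and the input matrix X as the values is an assumption about how the patent combines its inputs, and the function name is hypothetical.

```python
import numpy as np

def coding_layer_output(Q, K, X):
    d = X.shape[-1]  # dimension d of the input matrix X
    scores = Q @ K.T / np.sqrt(d)
    # numerically stable row-wise softmax
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ X  # output matrix Z
```

Applying this B times in sequence would correspond to the B encoding passes of the bidirectional coding layers.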
In the interaction layer, since the Transformer coding layer is bidirectional, the output matrices in the left and right directions are denoted Z_1 and Z_2, and the interaction matrices of the two output matrices are calculated as follows:

R_1 = Z_1 · Z_2^T;

R_2 = Z_2 · Z_1^T;

where R_1 is the interaction matrix of Z_1 and R_2 is the interaction matrix of Z_2.

The final output matrix R_mul after passing through each side's coding layers is

R_mul = C(R_i), i = 1, …, H;

where H is the number of coding layers, R_i is the output matrix of the i-th coding layer, the function C(R_i) splices all H coding layers together, and the dimension of R_mul matches that of the input matrix X.

In the normalization layer, the LN function computes the layer-normalized output matrix, expressed as LN(R_mul).
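The interaction and normalization sub-layers described above can be sketched as follows; the epsilon guard and the choice of normalizing over the last axis are implementation assumptions.

```python
import numpy as np

def interaction(Z1, Z2):
    R1 = Z1 @ Z2.T  # interaction matrix of Z1
    R2 = Z2 @ Z1.T  # interaction matrix of Z2
    return R1, R2

def layer_norm(R, eps=1e-6):
    # LN: normalize each row to zero mean and unit deviation
    mean = R.mean(axis=-1, keepdims=True)
    std = R.std(axis=-1, keepdims=True)
    return (R - mean) / (std + eps)

def normalized_output(coding_layer_outputs):
    # C(R_i), i = 1..H: splice the H coding-layer outputs together
    R_mul = np.concatenate(coding_layer_outputs, axis=-1)
    return layer_norm(R_mul)
```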
in the output layer, assuming that output matrixes on the left side and the right side after layer normalization are respectively represented as v1 and v2, v1 and v2 are input into the matching layer to perform matching operation, and the matching result of the two texts is calculated as follows:
wherein y' represents a predicted value of a matching result of two texts, and v1 v2 represents that corresponding elements of v1 and v2 are in phase-by-phaseMultiplication emphasizes the identity between two texts, while | V1-V2 | emphasizes the difference between two texts, the function F represents the concatenation vector V =that would be 4 vectorsThe input is input to a classifier to process and output a predicted value of a matching result.
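The matching operation can be sketched as below; using a logistic classifier as the function F is a stand-in assumption, since the patent does not specify the classifier.

```python
import numpy as np

def match_features(v1, v2):
    # concatenation vector V = [v1; v2; v1*v2; |v1 - v2|]
    return np.concatenate([v1, v2, v1 * v2, np.abs(v1 - v2)])

def predict_match(v1, v2, w, b=0.0):
    # hypothetical classifier F: logistic regression over the 4-way features
    z = match_features(v1, v2) @ w + b
    return 1.0 / (1.0 + np.exp(-z))  # predicted matching value y'
```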
In the process of detecting specific data in a big-data embedded network, the concatenation vector V of the 4 vectors obtained by the above formula is taken as the basis and fused with the fractional Fourier transform for data matching processing; data classification space guidance is performed, a K-L data classifier is constructed, and big-data embedded data classification is realized with this classifier. The specific steps are as follows:

In the classification process of big-data embedded data, with the concatenation vector of the 4 vectors obtained by the above formula as the basis, a fractional Fourier transform is defined, in which K_α(t, u) denotes the transform kernel of the fractional Fourier transform, α denotes the data-feature matching rotation angle, F^p denotes the form of the transform operator, and U denotes the attribute set of the data clustering features.

In the classification process of the big-data embedded data, data matching based on the fractional Fourier transform is realized by using the rotational additivity in the fractional Fourier domain, where p denotes the order of the fractional Fourier domain of the specific data when it is a positive real number, q denotes the order when it is a negative real number, and F^{p+q} denotes the fractional Fourier domain for big-data embedded data classification.

The classifier obtained from the above formulas yields the energy distribution of the specific data across different frequencies in the big-data embedded network, thereby realizing the detection of specific data in the big-data network.
An estimated loss value for the predicted matching value y' is calculated using the failure estimation model: the layer-normalized output matrices on the left and right sides are taken as the left and right training samples, respectively, and input into the failure estimation model of the output layer to obtain the estimated loss value.

The estimated loss value indicates the efficiency of the output layer in predicting samples and measures its prediction performance: the smaller the estimated loss value, the better the output layer's sample prediction performance, i.e., the higher the prediction accuracy.
In the embodiment of the present application, the estimated loss value Lp of the output layer is obtained based on the i-th left training sample and the corresponding right training samples among the obtained left training samples;

where s_i refers to the similarity between the head and tail samples in the i-th left training sample, s'_ij refers to the similarity between the head sample and the reference sample included in the j-th right training sample among the right training samples corresponding to the i-th left training sample, I is the number of right training samples, i is an integer no greater than the total number of left training samples, and j may be an integer no greater than I.

In a preferred embodiment, the similarity is expressed in terms of cosine similarity, so that s_i is computed from cos(e_i1, e_i2), where e_i1 refers to the vector representation of the head sample in the i-th left training sample, e_i2 refers to the vector representation of the tail sample in the i-th left training sample, and γ is an empirical parameter, typically 1.5. The similarity s'_ij between the head sample and the reference sample included in the j-th right training sample can be calculated in the same way and is not repeated here.
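A sketch of the cosine-similarity-based loss estimation; the margin-style way of combining the left and right similarities with gamma = 1.5 is an assumption, since the exact loss formula is not reproduced here.

```python
import numpy as np

def cosine_similarity(e1, e2):
    return float(e1 @ e2 / (np.linalg.norm(e1) * np.linalg.norm(e2)))

def estimated_loss(left_pairs, right_pairs, gamma=1.5):
    loss = 0.0
    for (l_head, l_tail), rights in zip(left_pairs, right_pairs):
        s_left = cosine_similarity(l_head, l_tail)  # head/tail similarity s_i
        for (r_head, r_ref) in rights:
            # margin-style penalty (assumption): right-sample similarity
            # should stay at least gamma below the left-sample similarity
            loss += max(0.0, gamma - s_left + cosine_similarity(r_head, r_ref))
    return loss
```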
And S3, after multiplying the predicted value of the matching result by the estimated loss value, sequencing according to the product size, wherein the sequence obtained after sequencing is the natural language processing result. Fig. 5 is a data diagram showing the data processing by the above steps.
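Taken together, steps S1 to S3 can be sketched as the following pipeline; all three callables are hypothetical stand-ins for the models described above.

```python
def process_natural_language(raw_samples, detect_outliers, match_semantics, estimate_loss):
    outliers = detect_outliers(raw_samples)                  # S1: anomaly analysis
    cleaned = [s for s in raw_samples if s not in outliers]  # S2: remove outliers
    matches = [(s, match_semantics(s)) for s in cleaned]     # S2: semantic matching
    # S3: rank by predicted match value multiplied by estimated loss value
    ranked = sorted(matches, key=lambda p: p[1] * estimate_loss(p[0]), reverse=True)
    return [s for s, _ in ranked]
```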
The invention also provides a natural language processing system based on artificial intelligence, which is used for realizing the natural language processing method.
The system comprises an acquisition module and a processor, wherein the acquisition module is used for acquiring the original data set of natural language information; in a preferred embodiment, the raw data set of natural language information further comprises at least two data sets to be trained.
The processor is used to remove the sample data in the abnormal value set from the original data set, input the information data set from which those samples have been removed into the semantic matching model for identification, and determine a semantic matching result; the predicted values of the matching results are multiplied by the estimated loss values and sorted by product size, and the sorted sequence is the natural language processing result. The processor provided in this embodiment may be deployed in a computer device whose configuration and performance can vary widely; it may include one or more central processing units (CPUs), memory, and one or more storage media (e.g., one or more mass storage devices) storing applications or data. The memory and storage medium may be transient or persistent storage. The program stored on the storage medium may include one or more modules, each of which may comprise a series of instruction operations for the server. Further, the processor may be configured to communicate with the storage medium and execute the series of instruction operations from the storage medium.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
Embodiments of the present application also provide a computer-readable storage medium, in which a computer program is stored, and when the computer program runs on a computer, the computer is caused to execute the method described in the foregoing embodiments.
Embodiments of the present application also provide a computer program product including a program, which, when run on a computer, causes the computer to perform the methods described in the foregoing embodiments.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing the relevant hardware; the program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the method embodiments described above. Any reference to memory, storage, a database, or other medium used in the embodiments provided herein can include at least one of non-volatile and volatile memory. Non-volatile memory may include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical storage, and the like. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM can take many forms, such as static random access memory (SRAM) or dynamic random access memory (DRAM).
The above-mentioned embodiments express only several implementations of the present application; their description is specific and detailed, but should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, and these fall within its scope of protection. Therefore, the protection scope of this patent shall be subject to the appended claims.
Claims (10)
1. A natural language processing method based on artificial intelligence is characterized by comprising the following steps:
s1, acquiring a natural language information original data set, and performing anomaly analysis on the original data set to generate an anomaly value set of the original data set;
s2, removing sample data in the abnormal value set from the original data set, inputting the information data set from which the sample data in the abnormal value set is removed into a semantic matching model for recognition, and determining a semantic matching result;
and S3, multiplying the predicted value of the matching result by the estimated loss value and sorting by product size, wherein the sequence obtained after sorting is the natural language processing result.
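The ranking in step S3 can be sketched as follows; a minimal illustration in Python, assuming plain lists of scores (the names `rank_matches`, `predictions`, and `losses` are illustrative, not from the patent):

```python
def rank_matches(predictions, losses):
    """Step S3 sketch: multiply each predicted match value by its estimated
    loss value, then return candidate indices sorted by product, descending."""
    products = [p * l for p, l in zip(predictions, losses)]
    return sorted(range(len(products)), key=lambda i: products[i], reverse=True)


# Example: three candidates; the second has the largest product 0.5 * 0.8 = 0.4.
order = rank_matches([0.9, 0.5, 0.7], [0.1, 0.8, 0.2])
```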
2. The natural language processing method based on artificial intelligence of claim 1, wherein the step S1 specifically includes: randomly selecting m samples from the original data set to form a network topology structure, wherein the n nodes of the network topology form the node set NODE = {node_1, node_2, …, node_n}, the node path length set is L_node = {L_1, …, L_i, …, L_n}, and the standard deviation of the path length of the network topology structure is:
wherein the set of path length standard deviations of the network topology is σ = {σ_1, …, σ_i, …, σ_n}, with maximum value σ_max and minimum value σ_min;
the abnormal value of each sample point is calculated as:
wherein the path length set of the m sample points is H_d = {h_1, …, h_i, …, h_m};
Calculating an abnormal value of each sample point, and combining the abnormal values into an abnormal value set N;
repeating the random selection of m samples multiple times, calculating an abnormal value set each time, and forming a total abnormal value set N_total that covers the original data set.
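The sampling-based outlier step of claim 2 resembles path-length-based anomaly scoring. A minimal sketch, assuming the abnormal value measures how far a sample's path length deviates from the mean in units of the standard deviation; the patent's exact formulas are not reproduced in the text, so `outlier_scores` and `outlier_set` are hypothetical stand-ins:

```python
import random
import statistics


def outlier_scores(path_lengths):
    """Score each sample by |h_i - mean| / std of the path lengths
    (a z-score style stand-in for the claim's unspecified formula)."""
    mu = statistics.mean(path_lengths)
    sigma = statistics.pstdev(path_lengths)
    if sigma == 0:
        return [0.0] * len(path_lengths)
    return [abs(h - mu) / sigma for h in path_lengths]


def outlier_set(data, m, rounds, threshold, path_length):
    """Repeat the random-sampling step: draw m samples per round, score
    their path lengths, and collect every sample above the threshold."""
    outliers = set()
    for _ in range(rounds):
        sample = random.sample(data, m)
        lengths = [path_length(x) for x in sample]
        for x, score in zip(sample, outlier_scores(lengths)):
            if score > threshold:
                outliers.add(x)
    return outliers
```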
3. The artificial intelligence based natural language processing method according to claim 1, wherein in step S2, the semantic matching model includes an input layer, an intermediate layer and an output layer; the input layer calculates the weight of each input vector by using a forward and inverse frequency algorithm; the intermediate layer adopts a multi-layer bidirectional feature extraction model; the output layer calculates an output vector using a failure estimation model.
4. The artificial intelligence based natural language processing method according to claim 3, wherein the forward and inverse frequency algorithm specifically comprises:
calculating the inverse frequency IDF(E) of the input vector E:
IDF(E) = log(P / n_E);
wherein P is the total number of vectors in the training vector set, and n_E is the number of times the input vector E appears in the training vector set;
calculating the input vector weight K(E, D_i):
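The inverse-frequency part of claim 4 is fully specified and can be checked directly; the weight K(E, D_i) is not, since its formula is not reproduced in the text, so only IDF(E) is shown here:

```python
import math


def inverse_frequency(total_vectors, occurrences):
    """IDF(E) = log(P / n_E), per claim 4: P is the size of the training
    vector set, n_E the number of times input vector E appears in it."""
    return math.log(total_vectors / occurrences)


# A vector appearing in 10 of 100 training vectors gets IDF = log(10).
idf = inverse_frequency(100, 10)
```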
5. The artificial intelligence based natural language processing method of claim 3, wherein the multi-layered bidirectional feature extraction model has three sub-layers, which are a bidirectional Transformer coding layer, an interaction layer and a normalization layer.
6. The artificial intelligence based natural language processing method according to claim 5, wherein in each bidirectional Transformer coding layer, the input matrix X and the matrix K formed from the weight of each input vector calculated by the forward and inverse frequency algorithm are taken as input, and the output matrix Z of the bidirectional Transformer coding layer is calculated:
where d is the dimension of the input matrix X, Q represents the vector sequence of the input vectors E_1, …, E_n, and B is the number of encodings.
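The claim's formula for Z is not reproduced in the text. For comparison only, standard scaled dot-product attention, which likewise combines a query matrix Q, a key matrix K, and the dimension d, can be sketched as follows (an assumption, not the patent's exact computation):

```python
import numpy as np


def scaled_dot_product_attention(Q, K, V):
    """Standard scaled dot-product attention: softmax(Q K^T / sqrt(d)) V.
    Shown as a reference point for the claim's unspecified Z computation."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)
    # Numerically stable softmax over the last axis.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V


Z = scaled_dot_product_attention(np.eye(2), np.eye(2), np.eye(2))
```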
7. The artificial intelligence based natural language processing method of claim 5,
in the interaction layer, the output matrices in the left and right directions are denoted Z_1 and Z_2, and the interaction matrices of the two output matrices are calculated as follows:
R_1 = Z_1 · Z_2^T;
R_2 = Z_2 · Z_1^T;
wherein R_1 is the interaction matrix of Z_1 and R_2 is the interaction matrix of Z_2;
the final output matrix R_mul after passing through each coding layer is calculated, where H is the number of coding layers, R_i represents the output matrix of the i-th coding layer, and the function C(R_i) denotes splicing all H coding layers together:
R_mul = C(R_i), i = 1, …, H;
wherein the LN function represents a layer normalization function.
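The interaction and concatenation steps of claim 7 follow directly from the stated formulas; `layer_norm` below is a standard layer normalization standing in for the unspecified LN function:

```python
import numpy as np


def interact(Z1, Z2):
    """Interaction matrices of claim 7: R1 = Z1 · Z2^T and R2 = Z2 · Z1^T."""
    return Z1 @ Z2.T, Z2 @ Z1.T


def layer_norm(R, eps=1e-6):
    """Standard layer normalization over the last axis (assumed form of LN)."""
    mu = R.mean(axis=-1, keepdims=True)
    sigma = R.std(axis=-1, keepdims=True)
    return (R - mu) / (sigma + eps)


def concat_layers(outputs):
    """C(R_i): splice the outputs of all H coding layers together."""
    return np.concatenate(outputs, axis=-1)
```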
8. The artificial intelligence based natural language processing method according to claim 7, wherein the output matrices of the left and right sides after layer normalization are denoted v1 and v2, respectively, and v1 and v2 are subjected to a matching operation:
wherein y' represents the predicted value of the matching result of the two texts, v1 ⊙ v2 denotes multiplying the corresponding elements of v1 and v2 one by one, and the function F denotes inputting the concatenation of the 4 vectors into a classifier for processing and outputting the predicted value of the matching result.
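The exact four vectors concatenated before the classifier F are not spelled out in the text; a common choice in matching models, assumed here, is [v1, v2, v1 ⊙ v2, |v1 − v2|]:

```python
import numpy as np


def match_features(v1, v2):
    """Build a 4-vector concatenation for the classifier F of claim 8.
    The specific four components are an assumption, not from the patent:
    [v1, v2, element-wise product, element-wise absolute difference]."""
    return np.concatenate([v1, v2, v1 * v2, np.abs(v1 - v2)])


features = match_features(np.array([1.0, 2.0]), np.array([3.0, 4.0]))
```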
9. The artificial intelligence based natural language processing method according to claim 8, wherein the calculating an estimated failure value of the predicted value of the matching result using the failure estimation model specifically includes:
predicting the loss value Lp of a sample of the output layer based on the i-th left training sample among the obtained left training samples and its corresponding right training samples:
wherein the first similarity term may refer to the similarity between the head and tail samples in the i-th left training sample, and the second denotes the similarity between the head sample and the reference sample included in the j-th right training sample among the right training samples corresponding to the i-th left training sample; I is the number of right training samples, i is an integer less than or equal to the total number of left training samples, and j is an integer less than or equal to I;
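The loss formula image is not reproduced in the text; a softmax-style ranking loss, contrasting the positive head–tail similarity against the I right-sample similarities, is one plausible reading and is assumed here:

```python
import math


def sample_loss(pos_similarity, neg_similarities):
    """Hypothetical reading of claim 9's Lp: a softmax ranking loss over
    one positive similarity and a list of contrast (right-sample)
    similarities. With no contrast samples the loss is zero."""
    denom = math.exp(pos_similarity) + sum(math.exp(s) for s in neg_similarities)
    return -math.log(math.exp(pos_similarity) / denom)
```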
10. A natural language processing system based on artificial intelligence, wherein the natural language processing system is used for implementing the natural language processing method according to any one of claims 1 to 9.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210260510.7A CN114330370B (en) | 2022-03-17 | 2022-03-17 | Natural language processing system and method based on artificial intelligence |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114330370A true CN114330370A (en) | 2022-04-12 |
CN114330370B CN114330370B (en) | 2022-05-20 |
Family
ID=81033553
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210260510.7A Active CN114330370B (en) | 2022-03-17 | 2022-03-17 | Natural language processing system and method based on artificial intelligence |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114330370B (en) |
Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108776694A (en) * | 2018-06-05 | 2018-11-09 | 哈尔滨工业大学 | A kind of time series abnormal point detecting method and device |
CN109657947A (en) * | 2018-12-06 | 2019-04-19 | 西安交通大学 | A kind of method for detecting abnormality towards enterprises ' industry classification |
US20190318407A1 (en) * | 2015-07-17 | 2019-10-17 | Devanathan GIRIDHARI | Method for product search using the user-weighted, attribute-based, sort-ordering and system thereof |
CN110956224A (en) * | 2019-08-01 | 2020-04-03 | 平安科技(深圳)有限公司 | Evaluation model generation method, evaluation data processing method, evaluation model generation device, evaluation data processing equipment and medium |
CN111666502A (en) * | 2020-07-08 | 2020-09-15 | 腾讯科技(深圳)有限公司 | Abnormal user identification method and device based on deep learning and storage medium |
CN111753527A (en) * | 2020-06-29 | 2020-10-09 | 平安科技(深圳)有限公司 | Data analysis method and device based on natural language processing and computer equipment |
CN111860850A (en) * | 2019-04-28 | 2020-10-30 | 第四范式(北京)技术有限公司 | Model training method, information processing method and device and electronic equipment |
CN111882431A (en) * | 2020-08-04 | 2020-11-03 | 武汉众邦银行股份有限公司 | Intelligent message pushing method based on NLP deep learning |
CN113011911A (en) * | 2021-01-21 | 2021-06-22 | 腾讯科技(深圳)有限公司 | Data prediction method, device, medium and electronic equipment based on artificial intelligence |
CN113158076A (en) * | 2021-04-05 | 2021-07-23 | 北京工业大学 | Social robot detection method based on variational self-coding and K-nearest neighbor combination |
CN113436698A (en) * | 2021-08-27 | 2021-09-24 | 之江实验室 | Automatic medical term standardization system and method integrating self-supervision and active learning |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111401077B (en) | Language model processing method and device and computer equipment | |
CN116194912A (en) | Method and system for aspect-level emotion classification using graph diffusion transducers | |
CN109408743B (en) | Text link embedding method | |
CN109344399B (en) | Text similarity calculation method based on stacked bidirectional lstm neural network | |
Tran et al. | Ensemble application of ELM and GPU for real-time multimodal sentiment analysis | |
CN110990555B (en) | End-to-end retrieval type dialogue method and system and computer equipment | |
CN111797196A (en) | Service discovery method combining attention mechanism LSTM and neural topic model | |
Grzegorczyk | Vector representations of text data in deep learning | |
CN112307048B (en) | Semantic matching model training method, matching method, device, equipment and storage medium | |
CN112988970A (en) | Text matching algorithm serving intelligent question-answering system | |
CN115497465A (en) | Voice interaction method and device, electronic equipment and storage medium | |
CN116304748A (en) | Text similarity calculation method, system, equipment and medium | |
Somogyi | The Application of Artificial Intelligence | |
CN116975271A (en) | Text relevance determining method, device, computer equipment and storage medium | |
Lin et al. | Lifelong Text-Audio Sentiment Analysis learning | |
CN113516094A (en) | System and method for matching document with review experts | |
Menon et al. | Improving ranking in document based search systems | |
CN111666375A (en) | Matching method of text similarity, electronic equipment and computer readable medium | |
CN114330370B (en) | Natural language processing system and method based on artificial intelligence | |
CN116680407A (en) | Knowledge graph construction method and device | |
CN114595324A (en) | Method, device, terminal and non-transitory storage medium for power grid service data domain division | |
Bouallégue et al. | Learning deep wavelet networks for recognition system of Arabic words | |
CN114003773A (en) | Dialogue tracking method based on self-construction multi-scene | |
CN113590755A (en) | Word weight generation method and device, electronic equipment and storage medium | |
CN113157892A (en) | User intention processing method and device, computer equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
TR01 | Transfer of patent right | ||
Effective date of registration: 20221101 Address after: 1709, F13, Block A, Yard 93, Jianguo Road, Chaoyang District, Beijing 100022 Patentee after: Li Jin Address before: 300000 No. 201-10, unit 2, building 2, No. 39, Gaoxin Sixth Road, Binhai science and Technology Park, high tech Zone, Binhai New Area, Tianjin Patentee before: Tianjin Sirui Information Technology Co.,Ltd. |