CN110321534B - Text editing method, device, equipment and readable storage medium

Info

Publication number: CN110321534B
Application number: CN201810262255.3A
Authority: CN
Inventors: 占吉清; 陈志刚; 胡国平; 胡郁
Original assignee: iFlytek Co Ltd
Current assignee: iFlytek Co Ltd
Priority date: 2018-03-28
Filing date: 2018-03-28
Publication date: 2023-11-24
Anticipated expiration: 2038-03-28
Also published as: CN110321534A

Abstract

The application discloses a text editing method, apparatus, device, and readable storage medium. The method acquires original text data to be edited and a user editing command, determines the editing operation corresponding to the user editing command, determines a target command word from the user editing command according to the semantic relevance between the original text data and each word contained in the user editing command, and finally edits the target command word in the original text data according to the editing operation. With this scheme, the user only needs to input an editing command for the original text data to be edited automatically, which greatly reduces manual operation and improves editing efficiency. In addition, because the semantic relevance between the original text data and each word contained in the user editing command is taken into account when determining the target command word, the target command word is determined far more accurately, and the whole text editing process follows the user's intent more closely.

Description

Text editing method, device, equipment and readable storage medium

Technical Field

The present application relates to the field of natural language processing technology, and more particularly, to a text editing method, apparatus, device, and readable storage medium.

Background

Text editing refers to typesetting and editing of characters on an original text. The conventional text editing method realizes text editing by means of a keyboard and a mouse. This approach undoubtedly takes up both hands of the user and the editing process is time consuming and labor intensive.

With the advent of the artificial intelligence era, more and more people wish to implement editing of text data by an automatic method to improve editing efficiency and reduce manual operations.

Disclosure of Invention

In view of the above, the present application provides a text editing method, apparatus, device, and readable storage medium, so as to reduce manual operations in the editing process and improve editing efficiency.

In order to achieve the above object, the following solutions have been proposed:

a text editing method, comprising:

acquiring original text data to be edited and a user editing command;

determining an editing operation corresponding to the user editing command;

determining a target command word from the user editing command according to the semantic relevance between the original text data and each word contained in the user editing command;

and editing the target command word in the original text data according to the editing operation.

Preferably, the determining, according to the semantic relevance between the original text data and each word included in the user editing command, a target command word from the user editing command includes:

determining the correlation characteristics of the corresponding words according to the semantic correlation of each word in the user editing command and the original text data;

determining the target command word from the words contained in the user editing command by using a pre-trained command word determination model and the relevance features of the words in the user editing command;

wherein the training samples used to pre-train the command word determination model comprise: word vectors of the training words contained in user editing commands corresponding to training text data, and relevance features of the training words determined according to the semantic relevance between the training words and the training text data; and the sample labels comprise: a labeling result indicating whether each training word is a target command word.

Preferably, the determining the relevance feature of the corresponding word according to the semantic relevance of each word in the user editing command and the original text data includes:

performing word segmentation and word vectorization on the user editing command and the original text data to obtain word vectors of the words respectively contained;

respectively determining the i-element entries of the segmented words contained in the user editing command and in the original text data, wherein i takes values in [1, N] and N is a set constant;

and determining the correlation characteristics of the segmented words contained in the user editing command according to the matching condition of the i-element vocabulary entries of the segmented words contained in the user editing command and the i-element vocabulary entries of each segmented word in the original text data.

Preferably, the determining the relevance feature of the word segment included in the user editing command according to the matching condition between the i-element entry of the word segment included in the user editing command and the i-element entry of each word segment in the original text data includes:

calculating the matching score of the i-element vocabulary entry of the segmentation word contained in the user editing command and the i-element vocabulary entry of each segmentation word in the original text data;

determining the coverage of the i-element entry of the word segmentation included in the user editing command in the original text data according to the size relation between the matching score and the set matching score threshold;

and determining the relevance characteristics of the word segmentation contained in the user editing command according to the matching score and/or the coverage.

Preferably, the determining a target command word from the words contained in the user editing command by using a pre-trained command word determination model and the relevance features of the words in the user editing command includes:

inputting, through a first input layer of the command word determination model, the word vectors and relevance features of the words in the user editing command;

transforming, through a first hidden layer set of the command word determination model, the relevance features of the words to obtain transformed relevance features of the words;

and determining and outputting, through an output layer of the command word determination model, the target command word in the user editing command according to the transformed relevance features of the words.

Preferably, the method further comprises:

inputting, through a second input layer of the command word determination model, the word vectors of the words in the original text data;

transforming, through a second hidden layer set of the command word determination model, the word vectors of the words in the original text data to obtain transformed word vectors of the words;

multiplying, through an encoding layer of the command word determination model, the preset attention weight of the kth word in the user editing command for each word in the original text data by the transformed word vector of the corresponding word, to obtain the attention weight vector of the kth word for each word in the original text data, wherein k takes values in [1, m] and m is the total number of words contained in the user editing command;

and adding, through the encoding layer of the command word determination model, the attention weight vectors of the kth word for the words in the original text data to the transformed relevance feature of the kth word output by the first hidden layer set, the sum being used as the input of the output layer, so that the output layer determines and outputs the target command word in the user editing command according to the sum.
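As a rough illustration of the encoding-layer arithmetic described above, the following Python sketch computes the input to the output layer for a single command word k. It is only a sketch under stated assumptions: the function name `encode_command_word` is hypothetical, the weighted word vectors are assumed to be summed over the words of the original text before the addition, and the transformed word vectors are assumed to have the same dimension as the transformed relevance feature. None of these details is fixed by the text above.

```python
from typing import List

Vector = List[float]


def encode_command_word(
    attention_weights: List[float],  # preset weights of command word k over the n source words
    source_vectors: List[Vector],    # transformed word vectors of the n source words
    relevance_feature: Vector,       # transformed relevance feature of command word k
) -> Vector:
    """Return the encoding-layer output for command word k (illustrative only)."""
    if len(attention_weights) != len(source_vectors):
        raise ValueError("one attention weight is needed per source word")
    dim = len(relevance_feature)
    # Assumption: the attention-weighted vectors are summed over all source words.
    context = [0.0] * dim
    for weight, vec in zip(attention_weights, source_vectors):
        if len(vec) != dim:
            raise ValueError("vector dimensions must match")
        for d in range(dim):
            context[d] += weight * vec[d]
    # The sum is added to the transformed relevance feature of word k.
    return [c + r for c, r in zip(context, relevance_feature)]
```

Calling the function once for each k in [1, m] would yield one input vector per command word for the output layer.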

Preferably, the process of acquiring the user editing command includes:

acquiring a user editing command in a voice form and converting the user editing command into a user editing command in a text form;

or,

a user edit command in text form is obtained.

A text editing apparatus comprising:

the user editing command acquisition unit is used for acquiring the original text data to be edited and the user editing command;

an editing operation determining unit, configured to determine an editing operation corresponding to the user editing command;

a target command word determining unit, configured to determine a target command word from the user editing command according to the semantic relevance between the original text data and each word contained in the user editing command;

and the editing processing unit is used for editing the target command word in the original text data according to the editing operation.

Preferably, the target command word determining unit includes:

the correlation characteristic determining unit is used for determining the correlation characteristic of the corresponding word according to the semantic correlation of each word in the user editing command and the original text data;

a model prediction unit, configured to determine the target command word from the words contained in the user editing command by using a pre-trained command word determination model and the relevance features of the words in the user editing command.

Preferably, the correlation characteristic determining unit includes:

the word vector determining unit is used for performing word segmentation and word vectorization on the user editing command and the original text data to obtain word vectors of the segmented words respectively contained;

an i-element entry determining unit, configured to respectively determine the i-element entries of the segmented words contained in the user editing command and in the original text data, where i takes values in [1, N] and N is a set constant;

And the matching score processing unit is used for determining the relevance characteristics of the segmented words contained in the user editing command according to the matching condition of the i-element vocabulary entries of the segmented words contained in the user editing command and the i-element vocabulary entries of each segmented word in the original text data.

Preferably, the matching score processing unit includes:

the matching score calculating unit is used for calculating the matching score between the i-element entry of a segmented word contained in the user editing command and the i-element entry of each segmented word in the original text data;

a coverage determining unit, configured to determine, according to a size relationship between the matching score and a set matching score threshold, a coverage of an i-element entry of a word segment included in the user editing command in the original text data;

and the comprehensive processing unit is used for determining the correlation characteristics of the segmentation included in the user editing command according to the matching score and/or the coverage.

Preferably, the model prediction unit includes:

the first input unit is used for inputting, through a first input layer of the command word determination model, the word vectors and relevance features of the words in the user editing command;

the first feature transformation unit is used for transforming, through a first hidden layer set of the command word determination model, the relevance features of the words to obtain transformed relevance features of the words;

and the output unit is used for determining and outputting, through an output layer of the command word determination model, the target command word in the user editing command according to the transformed relevance features of the words.

Preferably, the model prediction unit further comprises:

a second input unit for inputting, through a second input layer of the command word determination model, the word vectors of the words in the original text data;

the second feature transformation unit is used for transforming, through a second hidden layer set of the command word determination model, the word vectors of the words in the original text data to obtain transformed word vectors of the words;

the encoding unit is used for multiplying, through an encoding layer of the command word determination model, the preset attention weight of the kth word in the user editing command for each word in the original text data by the transformed word vector of the corresponding word, to obtain the attention weight vector of the kth word for each word in the original text data, where k takes values in [1, m] and m is the total number of words contained in the user editing command; and for adding the attention weight vectors of the kth word for the words in the original text data to the transformed relevance feature of the kth word output by the first hidden layer set, the sum being used as the input of the output layer, so that the output layer determines and outputs the target command word in the user editing command according to the sum.

Preferably, the user editing command acquiring unit includes:

the first user editing command acquisition subunit is used for acquiring user editing commands in a voice form and converting the user editing commands into user editing commands in a text form;

or,

and the second user editing command acquisition subunit is used for acquiring the user editing command in a text form.

A text editing device, comprising: a memory and a processor;

the memory is used for storing a program;

the processor is used for executing the program to implement each step of the text editing method disclosed above.

A readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the text editing method as disclosed above.

From the above technical solution, it can be seen that, in the text editing method provided by the embodiment of the present application, original text data to be edited and a user editing command are obtained, an editing operation corresponding to the editing command is determined, according to the original text data and semantic relevance of each word included in the user editing command, a target command word is determined from the user editing command, and finally, according to the editing operation, the target command word in the original text data is edited. According to the scheme, the user can automatically edit the original text data only by inputting the editing command, so that manual operation is greatly reduced, and the editing efficiency is improved.

In addition, because the application takes into account the semantic relevance between the original text data and each word contained in the user editing command when determining the target command word, the target command word is determined far more accurately, and the whole text editing process follows the user's intent more closely.

Drawings

In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings that are required to be used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are only embodiments of the present application, and that other drawings can be obtained according to the provided drawings without inventive effort for a person skilled in the art.

FIG. 1 is a flow chart of a text editing method disclosed in an embodiment of the application;

FIG. 2 is a flowchart of a method for determining a target command word according to an embodiment of the present application;

FIG. 3 is a flowchart of a method for determining relevance features of words according to an embodiment of the present application;

FIG. 4 is a flowchart of a method for determining word segmentation correlation characteristics according to a matching score according to an embodiment of the present application;

FIG. 5 is a schematic diagram of a command word determining model according to an embodiment of the present application;

FIG. 6 is a flowchart of a method for determining a target command word based on a command word determination model according to an embodiment of the present application;

FIG. 7 is a schematic diagram of another command word determination model according to an embodiment of the present application;

FIG. 8 is a flowchart of another method for determining a target command word based on a command word determination model according to an embodiment of the present application;

FIG. 9 is a schematic diagram of a text editing apparatus according to an embodiment of the present application;

FIG. 10 is a block diagram of a hardware structure of a server according to an embodiment of the present application.

Detailed Description

The following description of the embodiments of the present application will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.

To solve the problems of extensive manual operation and low editing efficiency in the traditional keyboard-and-mouse text editing mode, the inventors set out to provide a new text editing scheme based on artificial intelligence technology. The aim is that the user only needs to input an editing command, and the machine then edits the original text data by itself, thereby greatly reducing manual operation and improving editing efficiency.

Further, the user can input the edit command in the form of voice, which can further free the user's hands.

The inventor researches and discovers that in the process of realizing text editing by means of artificial intelligence, a key link is to accurately identify a target command word, and the target command word is an object to be edited and operated by a user.

One solution first conceived by the inventors is:

A large number of user editing commands are collected, and the target command words in the commands are manually labeled. A neural network model is trained on the user editing commands labeled with target command words. The trained neural network model can then be used to predict the target command words in user editing commands.

However, in practical application the inventors found that this approach predicts the target command word with low accuracy. The reason is that original texts to be edited vary widely, and the editing commands users issue differ greatly from one original text to another, so the training data of the neural network model cannot cover all cases, and the prediction accuracy of the trained model falls short of a satisfactory level.

On this basis, the inventors continued their research and arrived at the following text editing scheme, which addresses the problem that predicting the target command word with a neural network model alone is limited by the size of the training sample set and therefore has low prediction accuracy. Next, the text editing method of the present application is described with reference to FIG. 1. As shown in FIG. 1, the method includes:

step S100, acquiring original text data to be edited and a user editing command;

the original text data to be edited refers to text data which a user needs to edit. The original text data to be edited acquired in this step may be in a voice form, a text form, a photo form, or the like. That is, the user may input the original text data by voice, typing, photographing, etc. For the original text data in different forms, the application can adopt corresponding processing modes to convert the original text data into the text form. For example, the user inputs the original text data by voice, which in this embodiment can be recognized as text by a voice transcription technique. When the user inputs original text data in the form of a photograph by photographing, the original text data can be recognized from the photograph by an image recognition technique in this embodiment.

A user editing command is a command the user issues to modify the original text data. For example, suppose the original text data is "The weather in blue whale is nice today; I plan to go there for a few days." and "blue whale" needs to be changed to "Nanjing". The user then gives the editing command: "modify blue whale to Nanjing".

As above, the user editing command obtained in this step may be in a voice form, a text form, other multimedia forms, or the like. Such as the user speaking the edit command in voice, or entering the edit command directly in text form, or handwriting the edit command by other touch screens, etc. In this embodiment, aiming at user editing commands in different forms, the user editing commands are converted into user editing commands in text form according to corresponding processing modes.

Step S110, determining editing operation corresponding to the user editing command;

specifically, the types of various editing operations of the original text data by the user are numerous, and the present application can collect various editing operations in advance. Further, after the user editing command is acquired, the editing operation corresponding to the user editing command can be determined based on the collected editing operation.

Optionally, the user editing command may be searched against the editing operations collected in advance, and any collected editing operation found in the command is taken as the corresponding editing operation.

Alternatively, the present application may use training user editing commands including various editing operations as training data in advance, train a neural network model, and identify editing operations corresponding to the user editing commands using the trained neural network model.
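The first option above, searching the command against editing operations collected in advance, might be sketched in Python as follows. This is a minimal sketch, not the method of the application: the table `COLLECTED_OPERATIONS`, its operation names and trigger phrases, and the function `find_editing_operation` are all hypothetical, and plain substring matching is assumed.

```python
from typing import Optional

# Hypothetical table of pre-collected editing operations: each operation
# name maps to trigger phrases that may appear in a user editing command.
COLLECTED_OPERATIONS = {
    "replace": ["modify", "change"],
    "underline": ["underline"],
    "delete": ["delete", "remove"],
}


def find_editing_operation(command: str) -> Optional[str]:
    """Return the first collected operation whose trigger phrase occurs in the command."""
    lowered = command.lower()
    for operation, triggers in COLLECTED_OPERATIONS.items():
        if any(trigger in lowered for trigger in triggers):
            return operation
    return None  # no collected operation appears in the command
```

For the command "modify blue whale to Nanjing", this lookup would select the hypothetical "replace" operation. A trained classifier, as in the second option, would replace the table lookup.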

Step S120, determining target command words from the user editing commands according to the semantic relativity of the original text data and words contained in the user editing commands;

Specifically, the user editing command is directed at the original text data to be edited, so the target command word contained in the user editing command is semantically related to the original text data. In this embodiment, the target command word is not predicted from the user editing command in isolation from the original text data. Instead, the semantic relevance between the original text data and each word contained in the user editing command is taken into account when obtaining the target command word, so the target command word obtained is more accurate.

And step S130, editing the target command word in the original text data according to the editing operation.

Specifically, the editing operation corresponding to the user editing command and the target command word at which the editing operation is directed have both been determined, so the target command word in the original text data can be edited according to the editing operation. For example, if the editing operation is "underline" and the target command word is "weather", then "weather" in the original text data "The weather in blue whale is nice today; I plan to go there for a few days." is underlined, and the result after editing is: "The _weather_ in blue whale is nice today; I plan to go there for a few days." (underscores mark the underlined word). If the user then issues the editing command "modify blue whale to Nanjing", the editing operation is "modify … to …", the target command words are "blue whale" before modification and "Nanjing" after modification, and the result after editing is: "The _weather_ in Nanjing is nice today; I plan to go there for a few days."
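The two edits in the example above can be reproduced with a small Python sketch. It assumes that the editing operation and target command words have already been determined by the earlier steps. The function `apply_edit` and the operation names "replace" and "underline" are hypothetical, and underscores stand in for underlining because plain text cannot carry it.

```python
def apply_edit(text: str, operation: str, target: str, replacement: str = "") -> str:
    """Apply one editing operation to every occurrence of the target command word."""
    if target not in text:
        return text  # nothing to edit
    if operation == "replace":
        return text.replace(target, replacement)
    if operation == "underline":
        # Underscores are an assumed plain-text stand-in for underlining.
        return text.replace(target, f"_{target}_")
    raise ValueError(f"unsupported operation: {operation}")


original = "The weather in blue whale is nice today; I plan to go there for a few days."
underlined = apply_edit(original, "underline", "weather")
final = apply_edit(underlined, "replace", "blue whale", "Nanjing")
```

Here `final` holds the text after both edits, with "weather" marked and "blue whale" replaced by "Nanjing".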

According to the text editing method provided by the embodiment of the application, the original text data to be edited and the user editing command are obtained, the editing operation corresponding to the editing command is determined, the target command word is determined from the user editing command according to the semantic relevance between the original text data and each word contained in the user editing command, and finally the target command word in the original text data is edited according to the editing operation. According to the scheme, the user can automatically edit the original text data only by inputting the editing command, so that manual operation is greatly reduced, and the editing efficiency is improved.

The following embodiment describes in detail the process in step S120 above of determining the target command word from the user editing command according to the semantic relevance between the original text data and each word contained in the user editing command.

Referring to fig. 2, fig. 2 is a flowchart of a method for determining a target command word according to an embodiment of the present application. As shown in fig. 2, the method includes:

Step S200, determining the relevance feature of each word in the user editing command according to the semantic relevance between that word and the original text data;

The relevance feature of a word describes the semantic relationship between the user editing command and the original text data.

Step S210, determining the target command word from the words contained in the user editing command by using a pre-trained command word determination model and the relevance features of the words in the user editing command.

The command word determining model is obtained through pre-training.

When the command word determining model is pre-trained, the training samples thereof may include:

word vectors of the training words contained in the user editing commands corresponding to training text data, and relevance features of the training words determined according to the semantic relevance between the training words and the training text data.

The sample tag may include: whether the training word is a labeling result of the target command word or not.

Specifically, the application may collect in advance training text data and the user editing commands that users issue for that training text data. The training words contained in the user editing commands are vectorized to obtain word vectors, and the relevance features of the training words are determined according to the semantic relevance between the training words and the training text data. Meanwhile, each training word is manually labeled as being or not being a target command word. Finally, the command word determination model is trained with the word vectors of the training words, the relevance features of the training words, and the manual labeling results.

The command word determination model in this embodiment may be a deep neural network model. Trained on the word vectors of the training words, their relevance features, and the manual labeling results, it can accurately predict the target command word in a user editing command.
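One possible way to organize the training samples and labels described above is sketched below in Python. The class `TrainingSample`, the function `build_samples`, and the choice to key vectors and features by word string are assumptions made for illustration. The text above specifies only what a sample contains, not how it is stored.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class TrainingSample:
    word: str                       # training word from a user editing command
    word_vector: List[float]        # word vector of the training word
    relevance_feature: List[float]  # relevance of the word to the training text data
    is_target: bool                 # manual label: is this a target command word?


def build_samples(
    command_words: List[str],
    word_vectors: Dict[str, List[float]],
    relevance_features: Dict[str, List[float]],
    labeled_targets: Set[str],
) -> List[TrainingSample]:
    """Pair each command word with its vector, relevance feature, and label."""
    return [
        TrainingSample(
            word=w,
            word_vector=word_vectors[w],
            relevance_feature=relevance_features[w],
            is_target=w in labeled_targets,
        )
        for w in command_words
    ]
```

Keying by word string is a simplification: a word that occurs twice in a command would share one feature entry, so a real pipeline would more likely index by position.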

Next, the process of determining the relevance features of the words in step S200 is described. Referring to FIG. 3, the process may include:

Step S300, performing word segmentation and word vectorization on the user editing command and the original text data to obtain the word vectors of the segmented words that each contains;

specifically, the word segmentation tool may be used to segment the user editing command and the original text data, respectively, so as to obtain the word segments contained in the user editing command and the word segments contained in the original text data.

For example, if the original text data is "I want to order a train ticket from Nanjing to Beijing today", the word segmentation result is "I want to order a train ticket from Nanjing to Beijing today", in which the different segmented words are separated by spaces. If the user editing command is "change one order to buy two sheets today", the word segmentation result is "change one order to buy two sheets today".

After the word segmentation, further carrying out word vectorization on each word segment to obtain word vectors corresponding to the word segments.

Alternatively, if the original text data is longer, a part of the original text data may be selected for word segmentation. For example, one or more segments of the original text data may be selected for word segmentation, or fixed length of the original text data may be selected for word segmentation, where the fixed length may be set according to the application requirements.

Step S310, respectively determining i-element entries of the word segmentation contained in the user editing command and the original text data;

Here i takes values in [1, N]. N is a preset constant whose value is determined by the application requirements. Different values of N yield different numbers of i-element entries. If N is 1, the i-element entries consist only of 1-element entries. If N is 2, the i-element entries comprise 1-element entries and 2-element entries.

The i-element entry of a segmented word contained in a text is defined as follows: the phrase formed by the segmented word and the i-1 consecutive segmented words that follow it in the text is called the i-element entry of that segmented word.

Still taking the original text data and the user editing command as an example:

The 1-element entries of the segmented words contained in the original text data are, respectively: I want to order a train ticket from Nanjing to Beijing today.

The 2-element entries of the segmented words contained in the original text data are, respectively: I want to order a train ticket from Nanjing to Beijing today.

The 1-element entries of each word segment contained in the user editing command are respectively as follows: the order is changed to buy two sheets today.

The 2-element entries of each word segment contained in the user editing command are respectively as follows: the order is changed to buy two sheets one by one today.

The i-element entries of the different segmentations are separated by spaces.
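The definition of an i-element entry given above can be expressed as a short Python sketch. The function name `i_element_entries` is hypothetical, and the handling of segmented words near the end of the text, where fewer than i-1 words follow, is an assumption: the entry is simply shorter there, so every segmented word still has exactly one entry for each i.

```python
from typing import Dict, List


def i_element_entries(tokens: List[str], n: int) -> Dict[int, List[List[str]]]:
    """For each i in [1, n], list the i-element entry of every segmented word."""
    entries: Dict[int, List[List[str]]] = {}
    for i in range(1, n + 1):
        # Entry of the word at position p: the word plus the next i-1 words.
        # Assumption: near the end of the text the entry is truncated.
        entries[i] = [tokens[p:p + i] for p in range(len(tokens))]
    return entries


# Hypothetical segmentation result, used only to show the shape of the output.
example = i_element_entries(["order", "a", "ticket"], 2)
```

With N = 2, `example[1]` holds the 1-element entries and `example[2]` the 2-element entries, one per segmented word.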

Step S320, determining the relevance features of the segmented words contained in the user editing command according to how the i-element entries of those segmented words match the i-element entries of each segmented word in the original text data.

Specifically, for the i-element entry of each segmented word contained in the user editing command, the match between that entry and the i-element entry of each segmented word in the original text data is determined, and the relevance feature of the segmented word contained in the user editing command is determined from that match. In other words, the relevance feature of a segmented word is expressed through the match.

In an alternative embodiment, the process of determining the word segmentation correlation feature according to the matching score in step S320 may be implemented with reference to fig. 4, and as shown in fig. 4, the process may include:

step S400, calculating the matching score of the i-element vocabulary entry of the word segmentation contained in the user editing command and the i-element vocabulary entry of each word segmentation in the original text data;

In the specific calculation, the Euclidean distance or cosine distance between the word vectors of the two i-element entries can be used as the matching score.

For the i-element entry of each segmented word in the user editing command, its matching scores against the i-element entries of every segmented word in the original text data form the matching score vector of that entry, whose dimension equals the total number of segmented words in the original text data.
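As a rough illustration of step S400, the sketch below builds a matching score vector in Python. It uses cosine similarity so that a higher score means a closer match, which is the direction the later threshold comparison relies on; that choice, the function names, and the toy vectors are assumptions, and how a multi-word entry is mapped to a single vector is left open here just as it is in the text.

```python
import math
from typing import List, Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def matching_score_vector(
    command_entry_vec: Sequence[float],
    original_entry_vecs: Sequence[Sequence[float]],
) -> List[float]:
    """One score per i-element entry in the original text data."""
    return [cosine_similarity(command_entry_vec, v) for v in original_entry_vecs]


# Toy vectors, for illustration only.
command_vec = [1.0, 0.0]
original_vecs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
scores = matching_score_vector(command_vec, original_vecs)
print([round(s, 3) for s in scores])  # [1.0, 0.0, 0.707]
```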

Step S410, determining the coverage of the i-element entry of the word segmentation included in the user editing command in the original text data according to the size relation between the matching score and the set matching score threshold;

specifically, in this embodiment, a matching score threshold may be preset, and when the matching score is greater than the set matching score threshold, the i-element entry of the segmented word may be considered to appear in the original text data, that is, the coverage is 1. When the matching score is not greater than the set matching score threshold, it can be considered that the i-element entry of the segmented word does not appear in the original text data, and the coverage is 0.

Step S420, determining the relevance features of the word segments included in the user editing command according to the matching score and/or the coverage.

In this embodiment, the relevance feature of the word segment included in the user editing command may be determined by referring to the matching score alone, for example, the matching score vector of the i-element term of the word segment included in the user editing command described above is used as the relevance feature of the word segment.

In addition, in this embodiment the relevance feature of a segmented word contained in the user editing command may also be determined with reference to the coverage alone; for example, the coverage in the original text data of the i-element entry of the segmented word is taken as the relevance feature of that segmented word.

Of course, in this embodiment the relevance feature of a segmented word contained in the user editing command may also be determined with reference to both the matching score and the coverage.
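Steps S410 and S420 can be sketched as follows. The sketch assumes that an entry counts as covered when any of its matching scores exceeds the threshold, and that combining the two signals means appending the coverage to the score vector; both readings, the function names, and the threshold value 0.5 are assumptions made for illustration.

```python
from typing import List


def coverage(scores: List[float], threshold: float) -> int:
    """1 if any matching score exceeds the threshold, otherwise 0."""
    return 1 if any(s > threshold for s in scores) else 0


def relevance_feature(
    scores: List[float],
    threshold: float,
    use_scores: bool = True,
    use_coverage: bool = True,
) -> List[float]:
    """Build a relevance feature from the matching scores and/or the coverage."""
    feature: List[float] = []
    if use_scores:
        feature.extend(scores)
    if use_coverage:
        feature.append(float(coverage(scores, threshold)))
    return feature


# Illustrative score vector; the threshold is a hypothetical setting.
scores = [0.11, 0.21, 0.9, 0.11]
print(coverage(scores, threshold=0.5))           # 1
print(relevance_feature(scores, threshold=0.5))  # [0.11, 0.21, 0.9, 0.11, 1.0]
```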

In order to facilitate the understanding of the present application, the following examples are provided.

Still assume that the original text data is "i want to order a train ticket from south to beijing today". The user edit command is "change one order to buy two orders today".

Assuming that N is equal to 2, the 1-element and 2-element entries of the segmented words in the original text data and in the user editing command are as listed above and are not repeated here.

Taking the segmented word "order" in the user editing command as an example, the matching score vector of its 2-element entry "order one" against the 2-element entries of each segmented word in the original text data is (0.11, 0.21, 0.9, …, 0.11). Because "order one" appears in the original text data, its matching score against the 2-element entry "order one" of the word "order" in the original text data is high (0.9 in the vector above), whereas its matching scores against the 2-element entries of the other segmented words in the original text data are low.

Finally, calculating a matching score vector of the i-element vocabulary entry of the segmentation in the user editing command and the i-element vocabulary entry of each segmentation in the original text data, and coverage in the original text data, wherein the result is shown in the following table:

TABLE 1

Further, the procedure of determining the target command word by using the command word determining model in the above step S210 will be described.

The command word determination model is briefly described first.

In an optional structure, the command word determination model in this embodiment may include a first input layer, a first hidden layer set, and an output layer. An optional command word determination model is illustrated in fig. 5; it includes a first input layer V1-Vm, a first hidden layer set, and an output layer T1-Tm, where m is the total number of words contained in the user editing command.

Based on the command word determination model of the present embodiment, a process of determining a target command word will be described with reference to fig. 6, and may include:

step S500, inputting word vectors and correlation features of words in the user editing command through a first input layer;

specifically, the foregoing embodiment has described determining the word vector and the relevance feature of each word in the user editing command, and in this embodiment, the determined word vector and the relevance feature may be input into the command word determining model through the first input layer.

Step S510, transforming the correlation characteristic of each word through the first hidden layer set to obtain the transformed correlation characteristic of each word;

Specifically, the first hidden layer set can transform the relevance features of each word input by the first input layer to obtain transformed relevance features of each word.

And step S520, determining and outputting target command words in the user editing command according to the transformed relevance characteristics of the words through an output layer.

Specifically, the relevance features are transformed through the first hidden layer set, and the output layer of the command word determining model can determine and output target command words in the user editing command according to the transformed relevance features of the words. Target command words can be marked in the output result, for example, target command words are marked by 1, and non-target command words are marked by 0.

For the user editing command "change one order to buy two sheets today" in the foregoing example, the corresponding output result may be "change one order/1 sheet/1 today/1 to two sheets/1/0 buy/1". Accordingly, the target command words include "order", "one piece", "today", "buy" and "two pieces".
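Steps S500 to S520 can be pictured with a small forward-pass sketch. The text does not specify the type of hidden layers or how the word vector and relevance feature are combined, so the concatenated input, the single tanh layer, the sigmoid output, and the hand-picked (untrained) weights below are all assumptions chosen only to show the 1/0 labelling.

```python
import math
from typing import List, Sequence


def label_words(
    word_vectors: Sequence[Sequence[float]],
    relevance_features: Sequence[Sequence[float]],
    w_hidden: Sequence[Sequence[float]],
    w_out: Sequence[float],
    b_out: float,
    threshold: float = 0.5,
) -> List[int]:
    """Label each command word 1 (target command word) or 0 (non-target)."""
    labels = []
    for vec, feat in zip(word_vectors, relevance_features):
        # First input layer: word vector plus relevance feature (assumed concatenation).
        x = list(vec) + list(feat)
        # First hidden layer set: a single tanh layer stands in for it here.
        hidden = [math.tanh(sum(w * v for w, v in zip(row, x))) for row in w_hidden]
        # Output layer: sigmoid probability, thresholded to a 1/0 mark.
        logit = sum(w * h for w, h in zip(w_out, hidden)) + b_out
        prob = 1.0 / (1.0 + math.exp(-logit))
        labels.append(1 if prob > threshold else 0)
    return labels


# Toy inputs and hand-picked weights, for illustration only (not trained).
word_vectors = [[0.2], [0.7]]
relevance_features = [[1.0], [0.0]]
w_hidden = [[0.0, 2.0]]
w_out = [4.0]
labels = label_words(word_vectors, relevance_features, w_hidden, w_out, b_out=-2.0)
print(labels)  # [1, 0]
```

In practice the weights would come from training on labelled samples rather than being set by hand.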

In another embodiment of the present application, another command word determination model is introduced. On the basis of the command word determination model illustrated in fig. 5, the command word determination model in this embodiment further includes another branch, as shown in fig. 7, which is used to determine the attention of each word in the user editing command to each word in the original text data. The branch comprises a second input layer S1-Sq, a second hidden layer set, and an encoding layer. Compared with the command word determination model disclosed in the previous embodiment, the model in this embodiment requires an additional training input because of the added second input layer: the word vectors of the segmented words contained in the training text data.

Based on the command word determination model of the present embodiment, a process of determining a target command word will be described with reference to fig. 8, and may include:

step S600, inputting word vectors and correlation features of words in the user editing command through a first input layer;

step S610, transforming the correlation characteristics of each word through the first hidden layer set to obtain transformed correlation characteristics;

the steps S600 to S610 in this embodiment correspond to the steps S500 to S510 in the foregoing embodiment one by one, and the detailed description is referred to above, which is not repeated here.

Step S620, inputting word vectors of words in the original text data through a second input layer;

specifically, the second input layer inputs the word vectors $s_1, \dots, s_q$ of the words in the original text data, where q represents the total number of words in the original text data.

Step S630, transforming the word vectors of the words in the original text data through the second hidden layer set of the command word determination model to obtain the transformed word vector of each word;

wherein the transformed word vectors obtained by transforming the word vectors through the second hidden layer set can be denoted $h_1, \dots, h_q$.

It will be appreciated that there is no strict order limitation between steps S620-S630 and steps S600-S610, and the two branches may be executed synchronously or in any order. Fig. 8 illustrates only one alternative implementation.

Step S640, multiplying, through the encoding layer, the preset attention weight of the k-th word in the user editing command for each word in the original text data by the transformed word vector of the corresponding word, to obtain the attention weight vector of the k-th word for the words in the original text data;

where k takes values in [1, m], and m is the total number of words contained in the user editing command.

Let $p_{kj}$ denote the preset attention weight of the k-th word in the user editing command for the j-th word in the original text data, and let $h_j$ denote the transformed word vector obtained by transforming the word vector of the j-th word in the original text data. The preset attention weights $p_{kj}$ are multiplied by the transformed word vectors $h_j$, and the result $c_k$ is expressed as:

$$c_k = \sum_{j=1}^{q} p_{kj} h_j$$

where $p_{kj}$ can be obtained in advance by training on a large amount of collected data, and is treated as a known constant in this embodiment.
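The multiplication in step S640 can be sketched in Python as below. Because the result is indexed only by k, the sketch reads it as a weighted sum over the q words of the original text data; that reading, the function name `attention_vector`, and the toy numbers are assumptions.

```python
from typing import List, Sequence


def attention_vector(
    p_k: Sequence[float],
    h: Sequence[Sequence[float]],
) -> List[float]:
    """Weighted sum of the transformed word vectors h_j with weights p_kj."""
    if len(p_k) != len(h):
        raise ValueError("need one attention weight per original-text word")
    dim = len(h[0])
    c_k = [0.0] * dim
    for weight, h_j in zip(p_k, h):
        for d in range(dim):
            c_k[d] += weight * h_j[d]
    return c_k


# Toy values: q = 2 original-text words, 2-dimensional transformed vectors.
p_k = [0.25, 0.75]            # preset attention weights of the k-th command word
h = [[1.0, 0.0], [0.0, 2.0]]  # transformed word vectors h_1, h_2
c_k = attention_vector(p_k, h)
print(c_k)  # [0.25, 1.5]
```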

Step S650, adding the attention degree weight vector of the kth word to each word in the original text data and the transformed correlation feature of the kth word output by the first hidden layer set through an encoding layer to obtain an addition result;

Step S660, determining and outputting, through the output layer of the command word determination model, the target command words in the user editing command according to the addition result.

Compared with the previous embodiment, the command word determination model in this embodiment further includes another branch, which is used to determine the attention of each word in the user editing command to each word in the original text data. The determined attention is added to the transformed relevance feature of each word in the user editing command output by the first hidden layer set, and the target command words are finally determined according to the addition result. Because the attention of the user editing command to the words in the original text data is additionally taken into account, the prediction accuracy of the target command words is improved.
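Steps S650 and S660 can be sketched as an element-wise addition followed by an output decision. The sketch assumes that the attention weight vector and the transformed relevance feature have the same dimension, and it uses a sigmoid output with hand-picked weights; the function name and all numeric values are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def add_and_decide(
    c_k: Sequence[float],
    transformed_feature: Sequence[float],
    w_out: Sequence[float],
    b_out: float = 0.0,
    threshold: float = 0.5,
) -> Tuple[List[float], int]:
    """Add the two branch outputs (S650), then mark the word 1 or 0 (S660)."""
    if len(c_k) != len(transformed_feature):
        raise ValueError("both branch outputs must have the same dimension")
    added = [a + b for a, b in zip(c_k, transformed_feature)]
    logit = sum(w * x for w, x in zip(w_out, added)) + b_out
    prob = 1.0 / (1.0 + math.exp(-logit))
    return added, (1 if prob > threshold else 0)


# Toy values, for illustration only; the weights are not trained.
added, label = add_and_decide(
    c_k=[0.25, 1.5],
    transformed_feature=[0.5, -0.5],
    w_out=[1.0, 1.0],
    b_out=-1.0,
)
print(added, label)  # [0.75, 1.0] 1
```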

The text editing apparatus provided in the embodiments of the present application is described below. The text editing apparatus described below and the text editing method described above may be cross-referenced.

Referring to fig. 9, fig. 9 is a schematic structural diagram of a text editing apparatus according to an embodiment of the present application. As shown in fig. 9, the apparatus may include:

a user editing command acquiring unit 11 for acquiring original text data to be edited and a user editing command;

an editing operation determining unit 12 for determining an editing operation corresponding to the user editing command;

a target command word determining unit 13, configured to determine a target command word from the user editing command according to the original text data and semantic relativity of each word included in the user editing command;

An editing processing unit 14, configured to edit the target command word in the original text data according to the editing operation.
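The way the four units cooperate can be pictured as a simple pipeline. In the sketch below each unit is a plain callable, and the stub implementations (keyword lookup for the operation, a stop-word filter for the target words, a string replacement for the edit) are hypothetical placeholders rather than the model-based logic described earlier.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class TextEditingApparatus:
    acquire: Callable[[], Tuple[str, str]]                   # unit 11
    determine_operation: Callable[[str], str]                # unit 12
    determine_target_words: Callable[[str, str], List[str]]  # unit 13
    apply_edit: Callable[[str, str, List[str]], str]         # unit 14

    def run(self) -> str:
        original, command = self.acquire()
        operation = self.determine_operation(command)
        targets = self.determine_target_words(original, command)
        return self.apply_edit(original, operation, targets)


def stub_apply_edit(original: str, operation: str, targets: List[str]) -> str:
    # Hypothetical rule: "replace" swaps the first target word for the second.
    if operation == "replace" and len(targets) == 2:
        return original.replace(targets[0], targets[1])
    return original


apparatus = TextEditingApparatus(
    acquire=lambda: ("book one ticket", "change one to two"),
    determine_operation=lambda cmd: "replace" if "change" in cmd else "none",
    determine_target_words=lambda orig, cmd: [
        w for w in cmd.split() if w not in {"change", "to"}
    ],
    apply_edit=stub_apply_edit,
)
result = apparatus.run()
print(result)  # book two ticket
```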

Alternatively, the target command word determining unit may include:

Alternatively, the correlation characteristic determining unit may include:

Optionally, the matching score processing unit may include:

Alternatively, the model prediction unit may include:

Optionally, the model prediction unit may further include:

Alternatively, the user editing command acquiring unit may include:

or,

The text editing device provided by the embodiment of the application can be applied to servers such as PC terminals, cloud platforms, servers, server clusters and the like. Alternatively, fig. 10 shows a block diagram of a hardware structure of a server, and referring to fig. 10, the hardware structure of the server may include: at least one processor 1, at least one communication interface 2, at least one memory 3 and at least one communication bus 4;

in the embodiment of the application, the number of the processor 1, the communication interface 2, the memory 3 and the communication bus 4 is at least one, and the processor 1, the communication interface 2 and the memory 3 complete the communication with each other through the communication bus 4;

the processor 1 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement embodiments of the present application, etc.;

the memory 3 may comprise a high-speed RAM, and may further comprise a non-volatile memory, such as at least one magnetic disk memory;

wherein the memory stores a program, the processor is operable to invoke the program stored in the memory, the program operable to:

acquiring original text data to be edited and a user editing command;

determining editing operation corresponding to the user editing command;

Optionally, for the refined and extended functions of the program, refer to the description above.

The embodiment of the present application also provides a storage medium storing a program adapted to be executed by a processor, the program being configured to:

acquiring original text data to be edited and a user editing command;

determining editing operation corresponding to the user editing command;

Finally, it is further noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.

In this specification, the embodiments are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and identical or similar parts of the embodiments may be cross-referenced.

The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims

1. A text editing method, comprising:

acquiring original text data to be edited and a user editing command;

determining editing operation corresponding to the user editing command;

editing the target command word in the original text data according to the editing operation;

the determining the target command word from the user editing command according to the semantic relativity of the original text data and each word contained in the user editing command comprises the following steps:

determining target command words from the words contained in the user editing command by using a pre-trained command word determination model and the relevance features of the words in the user editing command.

2. The method of claim 1, wherein the training samples used in pre-training the command word determination model comprise: word vectors of training words contained in user editing commands corresponding to training text data, and relevance features of the training words determined according to the semantic relevance between the training words and the training text data; and the sample labels comprise: a labeling result indicating whether each training word is a target command word.

3. The method of claim 2, wherein the determining the relevance feature of each word in the user-edited command based on the semantic relevance of the corresponding word to the original text data comprises:

4. A method according to claim 3, wherein said determining the relevance characteristics of the tokens contained in the user-edited command based on the matching of the i-gram entries of the tokens contained in the user-edited command with the i-gram entries of each of the tokens in the original text data comprises:

5. The method of claim 2, wherein said determining target command words from the words contained in the user editing command by using a pre-trained command word determination model and the relevance features of the words in the user editing command comprises:

6. The method as recited in claim 5, further comprising:

7. The method of claim 1, wherein the process of obtaining the user edit command comprises:

or,

a user edit command in text form is obtained.

8. A text editing apparatus, comprising:

The editing processing unit is used for editing the target command words in the original text data according to the editing operation;

the target command word determining unit includes:

a model prediction unit, configured to determine target command words from the words contained in the user editing command by using a pre-trained command word determination model and the relevance features of the words in the user editing command.

9. The apparatus of claim 8, wherein the training samples used in pre-training the command word determination model comprise: word vectors of training words contained in user editing commands corresponding to training text data, and relevance features of the training words determined according to the semantic relevance between the training words and the training text data; and the sample labels comprise: a labeling result indicating whether each training word is a target command word.

10. The apparatus according to claim 9, wherein the correlation characteristic determination unit includes:

11. The apparatus of claim 10, wherein the match score processing unit comprises:

12. The apparatus of claim 9, wherein the model prediction unit comprises:

13. The apparatus of claim 12, wherein the model prediction unit further comprises:

14. The apparatus according to claim 8, wherein the user editing command acquiring unit includes:

15. A text editing apparatus, characterized by comprising: a memory and a processor;

the memory is used for storing programs;

the processor is configured to execute the program to implement the respective steps of the text editing method as claimed in any one of claims 1 to 7.

16. A readable storage medium having stored thereon a computer program, which, when executed by a processor, implements the steps of the text editing method according to any of claims 1-7.