CN110705253A - Burma language dependency syntax analysis method and device based on transfer learning - Google Patents
Burma language dependency syntax analysis method and device based on transfer learning
- Publication number
- CN110705253A (application number CN201910808117.5A)
- Authority
- CN
- China
- Prior art keywords
- burma
- english
- dependency
- word
- syntax analysis
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
- G06F16/355—Class or cluster creation or modification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
Abstract
The invention relates to a Burmese dependency syntax analysis method and device based on transfer learning, and belongs to the technical field of natural language processing. The method comprises the following steps. Burmese data preprocessing: English and Burmese words are represented as bilingual word vectors in the same semantic space. English dependency syntax analysis corpus migration: the dependency arc, position, and part-of-speech information of English is migrated to Burmese, and a Burmese dependency syntax analysis model is trained. Prediction: an input Burmese sentence is vectorized and passed to the trained Burmese dependency syntax analysis model, which predicts its dependency structure. The device is built as functional modules that correspond to these steps. The invention performs dependency syntax analysis on Burmese sentences, addresses the poor performance caused by the lack of Burmese dependency syntax analysis data, and has important theoretical and practical application value.
Description
Technical Field
The invention relates to a Burmese dependency syntax analysis method and device based on transfer learning, and belongs to the technical field of natural language processing.
Background
Using transfer learning to address the shortage of corpora for low-resource languages is a current research focus in natural language processing. English has a large amount of accurately annotated dependency syntax analysis corpora, whereas Burmese is a low-resource language with scarce annotated data: only a small annotated Burmese data set can be obtained by collecting and manually annotating text, and such a small training set inevitably degrades Burmese dependency syntax analysis. When no Burmese dependency syntax analysis corpus is available, migrating the accurate English dependency syntax analysis corpus to Burmese can still yield good results.
Disclosure of Invention
The invention provides a Burmese dependency syntax analysis method and device based on transfer learning, which address the scarcity of annotated Burmese dependency syntax analysis data, the small scale of the training data, and the resulting poor performance of models trained on that data.
The technical scheme of the invention is as follows. The Burmese dependency syntax analysis method based on transfer learning comprises the following specific steps:
Step1, Burmese data preprocessing: representing English and Burmese words as bilingual word vectors in the same semantic space;
Step2, English dependency syntax analysis corpus migration: migrating the dependency arc, position, and part-of-speech information of English to Burmese, and training a Burmese dependency syntax analysis model on the migrated corpus;
Step3, vectorizing the input Burmese sentence and performing Burmese dependency syntax analysis prediction with the trained Burmese dependency syntax analysis model.
As a preferred embodiment of the present invention, the Step1 specifically comprises the following steps:
Step1.1, obtaining 20106 word-segmented English-Burmese parallel sentence pairs from the Asian Language Treebank (ALT) website (http://www2.nict.go.jp/astrec-att/member/mutiyama/ALT/);
Step1.2, training Burmese monolingual word vectors with a convolutional neural network (CNN) that fuses Burmese syllable feature information and syllable position feature information;
Step1.3, training bilingual word vectors on the English-Burmese bilingual corpus, and then combining the bilingual word vectors with the monolingual word vectors in a set proportion so that the English and Burmese word vectors are mapped into the same semantic space.
As a preferable scheme of the invention, the step Step1.2 comprises the following specific steps:
Step1.2.1, initializing the vectors of Burmese words at the input layer of the convolutional neural network: each input Burmese word is split into its syllables, and each syllable is assigned a randomly initialized vector, so that a word is represented through its syllables. Let $d$ be the dimension of the syllable vectors, $C$ the set of Burmese syllables, and $Q \in \mathbb{R}^{d \times |C|}$ the matrix of syllable vectors. Suppose the Burmese word $k \in V$ consists of the syllable sequence $[c_1, c_2, \ldots, c_l]$, where $l$ is the length of $k$. The syllable-level representation of $k$ is then given by the matrix $C^k \in \mathbb{R}^{d \times l}$, whose $j$-th column is the syllable vector of $c_j$, that is, the $c_j$-th column of $Q$;
Step1.2.2, extracting Burmese syllable features in the convolutional layer of the convolutional neural network: a convolution is applied between $C^k$ and a filter $H \in \mathbb{R}^{d \times w}$ of width $w$, after which a bias is added and a nonlinearity is applied to obtain the feature map $f^k \in \mathbb{R}^{l-w+1}$. Specifically, the $i$-th element of $f^k$ is given by:
$$f^k[i] = \tanh\left(\langle C^k[*,\, i:i+w-1],\, H \rangle + b\right)$$
where $C^k[*,\, i:i+w-1]$ denotes columns $i$ through $i+w-1$ of $C^k$. Finally, max pooling extracts the highest value in the feature map:
$$y^k = \max_i f^k[i]$$
Step1.2.3, further extracting features in the convolutional neural network with a gate-structure network, which captures the relationships among the syllables of a Burmese word together with the syllable position features. The specific formula is:
$$z = g(Wy + b) + y$$
where $g$ is the nonlinear activation function tanh, $y$ is the output of the previous network layer, $W$ is the weight matrix, $b$ is a parameter randomly generated in the training process, and $z$ represents the extracted correlations between syllables, that is, the syllable features;
because of the linguistic characteristics of Burmese, position features correspond to syllable features, so once the syllable features of Burmese have been extracted, the corresponding position features are obtained as well.
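For illustration, Steps 1.2.1 to 1.2.3 can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions, not the implementation of the invention: the function names, the placeholder syllables, the dimensions, and the initialization range are all hypothetical, and a single filter is used so that the pooled output has one entry.

```python
import math
import random


def syllable_matrix(syllables, table, d, rng):
    """Build C^k as a list of l columns; each column is a d-dimensional syllable vector."""
    columns = []
    for syl in syllables:
        if syl not in table:
            # Random initialization of unseen syllables (Step1.2.1); range is an assumption.
            table[syl] = [rng.uniform(-0.1, 0.1) for _ in range(d)]
        columns.append(table[syl])
    return columns


def feature_map(columns, H, b):
    """f[i] = tanh(<C[*, i:i+w-1], H> + b), giving l - w + 1 values (Step1.2.2)."""
    w = len(H)  # the filter H is stored as w columns of length d
    out = []
    for i in range(len(columns) - w + 1):
        inner = sum(
            columns[i + j][r] * H[j][r]
            for j in range(w)
            for r in range(len(H[j]))
        )
        out.append(math.tanh(inner + b))
    return out


def gate(y, W, b):
    """z = tanh(W y + b) + y (Step1.2.3); W must be square so the sum is defined."""
    return [
        math.tanh(sum(W[r][c] * y[c] for c in range(len(y))) + b[r]) + y[r]
        for r in range(len(y))
    ]


rng = random.Random(0)
d, w = 4, 2
table = {}
word = ["syl_a", "syl_b", "syl_c"]  # placeholder syllables, l = 3
C_k = syllable_matrix(word, table, d, rng)
H = [[rng.uniform(-1.0, 1.0) for _ in range(d)] for _ in range(w)]
f_k = feature_map(C_k, H, b=0.0)
y = [max(f_k)]  # max pooling over the feature map
z = gate(y, W=[[0.5]], b=[0.0])
```

With l = 3 syllables and w = 2 the feature map has l - w + 1 = 2 entries. A word with fewer syllables than the filter width yields an empty feature map, so a real system would need padding or a minimum word length, which the text does not specify.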
As a preferable scheme of the invention, the step Step1.3 comprises the following specific steps:
Step1.3.1, monolingual word vector representation for English and Burmese: the prediction model takes the current word $w$ as input and predicts its context. The embedding of the current word $w$ is denoted $v_w$, the embedding of a context $c$ is denoted $v'_c$, and the probability of context $c$ given word $w$ is expressed as a softmax function of the form:
$$p(c \mid w; \theta) = \frac{\exp(v'_c \cdot v_w)}{\sum_{c' \in V} \exp(v'_{c'} \cdot v_w)}$$
where $V$ is the vocabulary, $c'$ ranges over the words in $V$, and the parameters $\theta$ consist of the word embedding matrix and the context embedding matrix. The prediction model is trained by maximizing the log-likelihood over the training data set $D$, the set of word and context pairs $(w, c)$:
$$J(\theta) = \sum_{(w,c) \in D} \log p(c \mid w; \theta)$$
where $J(\theta)$ is the objective from which the monolingual word vectors are obtained;
Step1.3.2, after the monolingual word vectors of English and Burmese have been obtained, training bilingual word vectors on the English-Burmese bilingual corpus: let the set of the two languages be $L$; a joint objective $J$ is optimized that maps the English and Burmese word vectors into the same semantic space, and the coefficients α and β adjust the proportion between the monolingual and the bilingual word vectors. In the joint objective, $J$ covers both the bilingual and the monolingual word vectors; its terms correspond to the bilingual word vectors and the English monolingual word vectors, each trained on its own data set, and α and β are the proportionality coefficients of the monolingual word vectors and the English-Burmese bilingual word vectors.
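The softmax model of Step1.3.1, in which the current word predicts its context, can be illustrated with a small Python sketch. The toy embeddings and the function names below are hypothetical, and the code only evaluates the probability and the log-likelihood; it does not perform training.

```python
import math


def softmax_prob(w, c, word_emb, ctx_emb):
    """Probability of context c given word w: exp(v'_c . v_w) normalized over the vocabulary."""
    v_w = word_emb[w]
    scores = {
        cand: sum(a * b for a, b in zip(vec, v_w))
        for cand, vec in ctx_emb.items()
    }
    top = max(scores.values())  # subtract the maximum for numerical stability
    norm = sum(math.exp(s - top) for s in scores.values())
    return math.exp(scores[c] - top) / norm


def log_likelihood(pairs, word_emb, ctx_emb):
    """Sum of log-probabilities over the set D of (word, context) pairs."""
    return sum(math.log(softmax_prob(w, c, word_emb, ctx_emb)) for w, c in pairs)


# Toy two-word vocabulary with hand-picked 2-dimensional embeddings (assumptions).
word_emb = {"a": [1.0, 0.0], "b": [0.0, 1.0]}
ctx_emb = {"a": [1.0, 0.0], "b": [0.0, 1.0]}
D = [("a", "a"), ("b", "b")]

p_aa = softmax_prob("a", "a", word_emb, ctx_emb)
p_ab = softmax_prob("a", "b", word_emb, ctx_emb)
J = log_likelihood(D, word_emb, ctx_emb)
```

Training would adjust both embedding tables to increase `J`; the normalization over the whole vocabulary is what makes the exact softmax expensive for large vocabularies.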
As a preferred embodiment of the present invention, the Step2 specifically comprises the following steps:
Step2.1, constructing part of the Burmese dependency syntax analysis corpus by word mapping:
using the existing English dependency syntax analysis corpus, an English-Burmese dictionary is generated from the obtained English-Burmese parallel sentence pairs, and the Burmese dependency syntax analysis corpus is constructed by word mapping;
after the English words are mapped, the Burmese dependency syntax analysis corpus contains the position, the part-of-speech information, and the dependency arc information of each Burmese word;
Step2.2, migration of English dependency arcs: using the parallel, aligned English-Burmese bilingual corpus, a weight matrix $W_{arc}$ relates the dependency relations of English and Burmese. $E^{EN}_{arc}$ and $E^{MY}_{arc}$ denote the dependency-relation vectors of English and Burmese respectively, and the English and Burmese dependency arc vectors are concatenated for the $i$-th and $j$-th Burmese words:
$$E^{MY}_{arc} = W_{arc} \cdot E^{EN}_{arc}$$
Step2.3, migration of English part-of-speech information: a relation matrix $W_{pos}$ is established between the parts of speech of Burmese words and the corresponding English parts of speech. $E^{EN}_{pos}$ and $E^{MY}_{pos}$ denote the part-of-speech vectors of English and Burmese respectively. The relation matrix $W_{pos}$ migrates the English part-of-speech information to the Burmese part-of-speech information so that the Burmese parts of speech carry more information, and the English and Burmese part-of-speech vectors are concatenated for the $i$-th and $j$-th Burmese words:
$$E^{MY}_{pos} = W_{pos} \cdot E^{EN}_{pos}$$
Step2.4, migration of English position information: the position information is added to the word vectors, and a relation matrix $W_{loc}$ is established from the mapping between the dictionaries. $E^{EN}_{loc}$ and $E^{MY}_{loc}$ denote the position vectors of English and Burmese respectively. The relation matrix $W_{loc}$ migrates the position information of English and Burmese words into one representation space, which reduces the difference between Burmese and English and allows English dependency syntax analysis knowledge to be learned while the Burmese dependency syntax analysis model is trained; the English and Burmese position vectors are concatenated for the $i$-th and $j$-th Burmese words:
$$E^{MY}_{loc} = W_{loc} \cdot E^{EN}_{loc}$$
Step2.5, training the Burmese dependency syntax analysis model on the migrated Burmese dependency syntax analysis corpus with the Stanford Parser tool.
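The three migration formulas of Steps 2.2 to 2.4 share one form: an English vector is multiplied by a relation matrix to obtain the Burmese vector, and the two vectors are then concatenated. The Python sketch below shows only that arithmetic. The matrix values, vector sizes, and function names are hypothetical, and the text does not state how the relation matrices are learned, so they are simply given here.

```python
def migrate(W, e_en):
    """Apply a relation matrix to an English vector: E_MY = W . E_EN."""
    if any(len(row) != len(e_en) for row in W):
        raise ValueError("each row of W must match the length of the English vector")
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, e_en)) for row in W]


def splice(e_en, e_my):
    """Concatenate the English and Burmese vectors into one feature vector."""
    return list(e_en) + list(e_my)


# Hypothetical 3x2 relation matrix and 2-dimensional English arc vector.
W_arc = [
    [1.0, 0.0],
    [0.0, 2.0],
    [1.0, 1.0],
]
E_EN_arc = [3.0, 4.0]

E_MY_arc = migrate(W_arc, E_EN_arc)
spliced = splice(E_EN_arc, E_MY_arc)
```

The same two functions apply unchanged to the part-of-speech and position vectors with their own relation matrices.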
A Burmese dependency syntax analysis device based on transfer learning comprises the following modules:
the English-Burmese bilingual word vector representation module, used for preprocessing Burmese data: representing English and Burmese words as bilingual word vectors in the same semantic space;
the English dependency syntax analysis migration module, used for migrating the English dependency syntax analysis corpus: migrating the dependency arc, position, and part-of-speech information of English to Burmese, and training a Burmese dependency syntax analysis model on the migrated corpus;
and the Burmese dependency syntax analysis prediction module, used for vectorizing the input Burmese sentence and performing Burmese dependency syntax analysis prediction with the trained Burmese dependency syntax analysis model.
The invention has the beneficial effects that:
1. the method fuses Burmese syllable features with syllable position feature information to strengthen the representational capacity of the Burmese monolingual word vectors, which improves Burmese dependency syntax analysis;
2. the method migrates the part-of-speech, position, and dependency arc information of English to Burmese, which mitigates the poor Burmese dependency syntax analysis performance caused by insufficient dependency syntax analysis corpora.
Drawings
FIG. 1 is a diagram of the Burmese-English word-mapping model of the invention;
FIG. 2 illustrates the migration process of the invention;
FIG. 3 is a diagram of the information migrated in the transfer-learning-based Burmese dependency syntax analysis of the invention;
FIG. 4 is the overall flow chart of the invention;
FIG. 5 is a diagram of the Burmese dependency syntax analysis device based on transfer learning of the invention.
Detailed Description
Example 1: as shown in FIGS. 1-5, the Burmese dependency syntax analysis method based on transfer learning comprises the following specific steps:
Step1, Burmese data preprocessing: representing English and Burmese words as bilingual word vectors in the same semantic space;
Step2, English dependency syntax analysis corpus migration: migrating the dependency arc, position, and part-of-speech information of English to Burmese, and training a Burmese dependency syntax analysis model on the migrated corpus;
Step3, vectorizing the input Burmese sentence and performing Burmese dependency syntax analysis prediction with the trained Burmese dependency syntax analysis model.
As a preferred embodiment of the present invention, the Step1 specifically comprises the following steps:
Step1.1, obtaining 20106 word-segmented English-Burmese parallel sentence pairs from the Asian Language Treebank (ALT) website (http://www2.nict.go.jp/astrec-att/member/mutiyama/ALT/); the format of the obtained English-Burmese parallel sentences is shown in Table 1:
Table 1. Format of the obtained English-Burmese parallel sentences
Step1.2, training Burmese monolingual word vectors with a convolutional neural network (CNN) that fuses Burmese syllable feature information and syllable position feature information;
Step1.3, training bilingual word vectors on the English-Burmese bilingual corpus, and then combining the bilingual word vectors with the monolingual word vectors in a set proportion so that the English and Burmese word vectors are mapped into the same semantic space.
As a preferable scheme of the invention, the step Step1.2 comprises the following specific steps:
Step1.2.1, initializing the vectors of Burmese words at the input layer of the convolutional neural network: each input Burmese word is split into its syllables, and each syllable is assigned a randomly initialized vector, so that a word is represented through its syllables. Let $d$ be the dimension of the syllable vectors, $C$ the set of Burmese syllables, and $Q \in \mathbb{R}^{d \times |C|}$ the matrix of syllable vectors. Suppose the Burmese word $k \in V$ consists of the syllable sequence $[c_1, c_2, \ldots, c_l]$, where $l$ is the length of $k$. The syllable-level representation of $k$ is then given by the matrix $C^k \in \mathbb{R}^{d \times l}$, whose $j$-th column is the syllable vector of $c_j$, that is, the $c_j$-th column of $Q$;
Step1.2.2, extracting Burmese syllable features in the convolutional layer of the convolutional neural network: a convolution is applied between $C^k$ and a filter $H \in \mathbb{R}^{d \times w}$ of width $w$, after which a bias is added and a nonlinearity is applied to obtain the feature map $f^k \in \mathbb{R}^{l-w+1}$. Specifically, the $i$-th element of $f^k$ is given by:
$$f^k[i] = \tanh\left(\langle C^k[*,\, i:i+w-1],\, H \rangle + b\right)$$
where $C^k[*,\, i:i+w-1]$ denotes columns $i$ through $i+w-1$ of $C^k$. Finally, max pooling extracts the highest value in the feature map:
$$y^k = \max_i f^k[i]$$
as the feature corresponding to filter $H$ (when applied to the Burmese word $k$). The role of max pooling is to capture the most important feature for a given filter, namely the one with the highest value. A filter essentially selects an n-gram of syllables in a Burmese word, where the size of the n-gram corresponds to the filter width, and the convolutional neural network extracts the features of Burmese words in this way. Multiple filters of different widths are used to obtain the feature vector of $k$: with a total of $n$ filters $H_1, \ldots, H_n$, the vector $y^k = [y^k_1, \ldots, y^k_n]$ is the input representation of word $k$;
Step1.2.3, further extracting features in the convolutional neural network with a gate-structure network, which captures the relationships among the syllables of a Burmese word together with the syllable position features. The specific formula is:
$$z = g(Wy + b) + y$$
where $g$ is the nonlinear activation function tanh, $y$ is the output of the previous network layer, $W$ is the weight matrix, $b$ is a parameter randomly generated in the training process, and $z$ represents the extracted correlations between syllables, that is, the syllable features;
because of the linguistic characteristics of Burmese, position features correspond to syllable features, so once the syllable features of Burmese have been extracted, the corresponding position features are obtained as well.
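As a worked example of the dimensions in Steps 1.2.2 and 1.2.3, consider a word of l = 4 syllables and one filter of width w = 2. These values are chosen only for illustration, and the symbol y^k for the pooled feature is assumed notation.

```latex
% Illustrative values (assumptions): l = 4 syllables, filter width w = 2
\begin{align*}
C^k &\in \mathbb{R}^{d \times 4}, \qquad H \in \mathbb{R}^{d \times 2}, \qquad
f^k \in \mathbb{R}^{4-2+1} = \mathbb{R}^{3} \\
f^k[i] &= \tanh\left(\langle C^k[*,\, i:i+1],\, H \rangle + b\right), \qquad i = 1, 2, 3 \\
y^k &= \max\left(f^k[1],\, f^k[2],\, f^k[3]\right) \\
% With n filters the pooled values form a vector, so the gate needs a square weight matrix
y &\in \mathbb{R}^{n}, \qquad W \in \mathbb{R}^{n \times n}, \qquad b \in \mathbb{R}^{n}, \qquad
z = g(Wy + b) + y \in \mathbb{R}^{n}
\end{align*}
```

The last line follows from the formula itself: the sum g(Wy + b) + y is defined only when Wy + b has the same dimension as y.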
As a preferable scheme of the invention, the step Step1.3 comprises the following specific steps:
Step1.3.1, monolingual word vector representation for English and Burmese: the prediction model takes the current word $w$ as input and predicts its context. The embedding of the current word $w$ is denoted $v_w$, the embedding of a context $c$ is denoted $v'_c$, and the probability of context $c$ given word $w$ is expressed as a softmax function of the form:
$$p(c \mid w; \theta) = \frac{\exp(v'_c \cdot v_w)}{\sum_{c' \in V} \exp(v'_{c'} \cdot v_w)}$$
where $V$ is the vocabulary, $c'$ ranges over the words in $V$, and the parameters $\theta$ consist of the word embedding matrix and the context embedding matrix. The prediction model is trained by maximizing the log-likelihood over the training data set $D$, the set of word and context pairs $(w, c)$:
$$J(\theta) = \sum_{(w,c) \in D} \log p(c \mid w; \theta)$$
where $J(\theta)$ is the objective from which the monolingual word vectors are obtained;
Step1.3.2, after the monolingual word vectors of English and Burmese have been obtained, training bilingual word vectors on the English-Burmese bilingual corpus: let the set of the two languages be $L$; a joint objective $J$ is optimized that maps the English and Burmese word vectors into the same semantic space, and the coefficients α and β adjust the proportion between the monolingual and the bilingual word vectors. In the joint objective, $J$ covers both the bilingual and the monolingual word vectors; its terms correspond to the bilingual word vectors and the English monolingual word vectors, each trained on its own data set, and α and β are the proportionality coefficients of the monolingual word vectors and the English-Burmese bilingual word vectors.
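One way to read the weighting by α and β in Step1.3.2 is as a weighted sum of a monolingual term and a bilingual term. The form below is an assumption made for illustration: the names J_mono and J_bi are placeholders that do not come from the source, and the actual formula of the invention may differ.

```latex
% Hypothetical notation: J_mono and J_bi are placeholder names, not taken from the source
\begin{align*}
J &= \alpha \, J_{\mathrm{mono}} + \beta \, J_{\mathrm{bi}} \\
% At the ratio \alpha : \beta = 1 : 1 the two terms are weighted equally
J &= J_{\mathrm{mono}} + J_{\mathrm{bi}} \qquad (\alpha = \beta = 1)
\end{align*}
```

Under this reading, raising α relative to β favors the structure of each language's own embedding space, while raising β favors cross-lingual alignment.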
As a preferred embodiment of the present invention, the Step2 specifically comprises the following steps:
Step2.1, constructing part of the Burmese dependency syntax analysis corpus by word mapping:
using the existing English dependency syntax analysis corpus, an English-Burmese dictionary is generated from the obtained English-Burmese parallel sentence pairs, and the Burmese dependency syntax analysis corpus is constructed by word mapping;
after the English words are mapped, the Burmese dependency syntax analysis corpus contains the position, the part-of-speech information, and the dependency arc information of each Burmese word, as illustrated by the example in FIG. 1; the format of the corpus is shown in Table 2;
Table 2. Corpus format required for dependency syntax analysis
Step2.2, as shown in FIGS. 2 and 3, migration of English dependency arcs: using the parallel, aligned English-Burmese bilingual corpus, a weight matrix $W_{arc}$ relates the dependency relations of English and Burmese. $E^{EN}_{arc}$ and $E^{MY}_{arc}$ denote the dependency-relation vectors of English and Burmese respectively, and the English and Burmese dependency arc vectors are concatenated for the $i$-th and $j$-th Burmese words:
$$E^{MY}_{arc} = W_{arc} \cdot E^{EN}_{arc}$$
Step2.3, migration of English part-of-speech information: a relation matrix $W_{pos}$ is established between the parts of speech of Burmese words and the corresponding English parts of speech. $E^{EN}_{pos}$ and $E^{MY}_{pos}$ denote the part-of-speech vectors of English and Burmese respectively. The relation matrix $W_{pos}$ migrates the English part-of-speech information to the Burmese part-of-speech information so that the Burmese parts of speech carry more information, and the English and Burmese part-of-speech vectors are concatenated for the $i$-th and $j$-th Burmese words:
$$E^{MY}_{pos} = W_{pos} \cdot E^{EN}_{pos}$$
Step2.4, migration of English position information: the position information is added to the word vectors, and a relation matrix $W_{loc}$ is established from the mapping between the dictionaries. $E^{EN}_{loc}$ and $E^{MY}_{loc}$ denote the position vectors of English and Burmese respectively. The relation matrix $W_{loc}$ migrates the position information of English and Burmese words into one representation space, which reduces the difference between Burmese and English and allows English dependency syntax analysis knowledge to be learned while the Burmese dependency syntax analysis model is trained; the English and Burmese position vectors are concatenated for the $i$-th and $j$-th Burmese words:
$$E^{MY}_{loc} = W_{loc} \cdot E^{EN}_{loc}$$
Step2.5, training the Burmese dependency syntax analysis model on the migrated Burmese dependency syntax analysis corpus with the Stanford Parser tool.
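The word-mapping construction of Step2.1 can be sketched in Python as follows: each token of an English dependency parse keeps its position, part of speech, and dependency arc, while its word form is replaced through a dictionary lookup. The `Token` fields, the tag names, and the placeholder dictionary entries are hypothetical. The handling of words missing from the dictionary is also an assumption, since the text does not describe it; here the form is set to `None`.

```python
from typing import Dict, List, NamedTuple, Optional


class Token(NamedTuple):
    index: int            # 1-based position in the sentence
    form: Optional[str]   # word form; None when no dictionary entry exists
    pos: str              # part-of-speech tag
    head: int             # index of the head word, 0 for the root
    deprel: str           # dependency relation label


def map_sentence(tokens: List[Token], dictionary: Dict[str, str]) -> List[Token]:
    """Replace each English form by its dictionary entry, keeping position, POS, and arc."""
    return [t._replace(form=dictionary.get(t.form)) for t in tokens]


# Toy English parse and placeholder dictionary entries (not real Burmese words).
english = [
    Token(1, "dogs", "NOUN", 2, "nsubj"),
    Token(2, "bark", "VERB", 0, "root"),
    Token(3, "loudly", "ADV", 2, "advmod"),
]
dictionary = {"dogs": "MY_dogs", "bark": "MY_bark"}

mapped = map_sentence(english, dictionary)
```

This sketch keeps the English word order unchanged, so the position and head indices carry over directly; a mapping that reorders words would need to remap those indices as well.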
Specifically, to verify the performance of the method, the experimental data are taken from the Burmese data set of the Asian low-resource language treebank, which contains 20011 word-segmented Burmese sentences; after duplicates are removed, segmentation yields 75498 Burmese words.
The evaluation metrics for dependency syntax analysis are the dependency arc accuracy (UAS, unlabeled attachment score) and the dependency label accuracy (LAS, labeled attachment score). UAS is the ratio of the number of words with the correct dependency arc to the total number of words in the sentence, and LAS is the ratio of the number of words with both the correct dependency arc and the correct dependency relation to the total number of words in the sentence:
$$\mathrm{UAS} = \frac{\text{number of words with the correct dependency arc}}{\text{total number of words}}$$
$$\mathrm{LAS} = \frac{\text{number of words with the correct dependency arc and the correct dependency relation}}{\text{total number of words}}$$
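The two metrics can be computed as in the following Python sketch. The function name and the toy gold and predicted annotations are hypothetical; each word is represented as a (head index, relation label) pair.

```python
def attachment_scores(gold, pred):
    """Return (UAS, LAS) for one sentence given lists of (head, label) pairs."""
    if len(gold) != len(pred) or not gold:
        raise ValueError("gold and pred must be non-empty and of equal length")
    # UAS counts words whose predicted head matches the gold head.
    head_ok = sum(1 for (g_head, _), (p_head, _) in zip(gold, pred) if g_head == p_head)
    # LAS additionally requires the relation label to match.
    both_ok = sum(1 for g, p in zip(gold, pred) if g == p)
    total = len(gold)
    return head_ok / total, both_ok / total


# Toy four-word sentence: word 3 has a wrong label, word 4 has a wrong head.
gold = [(2, "nsubj"), (0, "root"), (2, "obj"), (3, "amod")]
pred = [(2, "nsubj"), (0, "root"), (2, "nmod"), (2, "amod")]

uas, las = attachment_scores(gold, pred)
```

In this example three of the four heads are correct and two of the four words have both head and label correct, so `uas` is 0.75 and `las` is 0.5. LAS can never exceed UAS because every word counted for LAS is also counted for UAS.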
Deep-learning-based methods can effectively improve model performance, and different neural network models affect the experimental results differently. The neural network models currently in common use for dependency syntax analysis include LSTM and Bi-LSTM networks. The specific experimental results are shown in Table 3.
Table 3. Effect of different neural network models on the experimental results
The experimental results show that the LSTM-based and Bi-LSTM-based deep learning methods do not achieve good results on Burmese dependency syntax analysis: deep learning methods require a large volume of data, whereas the Burmese dependency syntax analysis data are low in quality and small in volume.
Table 4 compares the performance of the transfer learning model under bilingual word vectors generated with different ratios of α to β in the English-Burmese bilingual word vector training. The specific experimental results are shown in Table 4.
Table 4. Effect of different ratios of α to β on the experimental results
The experimental results show that mixing the Burmese and English word vectors with the bilingual word vectors in different proportions gives different results, and the best result is obtained at a ratio of 1:1.
Table 5 compares Burmese dependency syntax analysis models trained with transfer-learning-based shared network parameters against models trained with syllable feature information and word position feature information, respectively.
Table 5. Performance of Burmese dependency syntax analysis models fusing different semantic information
The experimental results show that fusing different semantic information affects the results. Fusing Burmese syllable information works better than the method based on shared network parameters: because the syllable is the smallest unit of Burmese, incorporating syllable features into Burmese dependency syntax analysis captures the characteristics of Burmese more accurately. The Burmese dependency syntax analysis method that uses the dependency arc, position, and part-of-speech information in the training set performs best, which indicates that migrating the dependency arc, position, and part-of-speech information of English represents Burmese better.
According to the concept of the invention, the invention further provides a Burmese dependency syntax analysis device based on transfer learning. As shown in FIG. 5, the device integrates the following modules:
the English-Burmese bilingual word vector representation module, used for preprocessing Burmese data: representing English and Burmese words as bilingual word vectors in the same semantic space;
the English dependency syntax analysis migration module, used for migrating the English dependency syntax analysis corpus: migrating the dependency arc, position, and part-of-speech information of English to Burmese, and training a Burmese dependency syntax analysis model on the migrated corpus;
and the Burmese dependency syntax analysis prediction module, used for vectorizing the input Burmese sentence and performing Burmese dependency syntax analysis prediction with the trained Burmese dependency syntax analysis model.
While the present invention has been described in detail with reference to the embodiments shown in the drawings, the present invention is not limited to the embodiments, and various changes can be made without departing from the spirit of the present invention within the knowledge of those skilled in the art.
Claims (6)
1. A Burmese dependency syntax analysis method based on transfer learning, characterized by comprising the following specific steps:
Step1, Burmese data preprocessing: representing English and Burmese words as bilingual word vectors in the same semantic space;
Step2, English dependency syntax analysis corpus migration: migrating the dependency arc, position, and part-of-speech information of English to Burmese, and training a Burmese dependency syntax analysis model on the migrated corpus;
Step3, vectorizing the input Burmese sentence and performing Burmese dependency syntax analysis prediction with the trained Burmese dependency syntax analysis model.
2. The Burmese dependency syntax analysis method based on transfer learning of claim 1, characterized in that: the specific steps of Step1 are as follows:
Step1.1, acquiring 20,106 word-segmented English-Burmese parallel sentence pairs from the Asian Language Treebank website;
Step1.2, using a convolutional neural network (CNN) to fuse the syllable feature information and syllable position feature information of Burmese, and training monolingual Burmese word vectors;
Step1.3, training bilingual word vectors on the English-Burmese bilingual corpus, and then combining the bilingual word vectors with the monolingual word vectors in a set proportion so that the English and Burmese word vectors are mapped into the same semantic space.
3. The Burmese dependency syntax analysis method based on transfer learning of claim 1, characterized in that: the specific steps of Step1.2 are as follows:
Step1.2.1, initializing the vectors of Burmese words at the input layer of the convolutional neural network: syllable vectors are randomly initialized so that Burmese words are represented by their syllables; each input Burmese word is split into the Burmese syllables of which it is composed. Let $d$ denote the dimension of the Burmese syllable vectors and $C$ the set of Burmese syllables; since every syllable has a randomly initialized vector, the syllable vectors form a matrix $Q \in \mathbb{R}^{d \times |C|}$, and the initial vector of a Burmese word is the combination of its syllable vectors. Suppose that a Burmese word $k \in V$ is composed of a sequence of syllables $[c_1, c_2, c_3, c_4, \ldots, c_l]$, where $l$ is the length of the word $k$; then the syllable-level representation of $k$ is given by the matrix $C_k \in \mathbb{R}^{d \times l}$, whose $j$-th column is the syllable vector of $c_j$, i.e. the $c_j$-th column of $Q$;
Step1.2.2, extracting Burmese syllable features in the convolutional layer of the convolutional neural network: a convolution is applied between $C_k$ and a filter $H \in \mathbb{R}^{d \times w}$ of width $w$, after which a bias is added and a non-linearity is applied to obtain the feature map $f_k \in \mathbb{R}^{l-w+1}$. Specifically, the $i$-th element of $f_k$ is given by:
$f_k[i] = \tanh(\langle C_k[*, i:i+w-1], H \rangle + b)$
where $C_k[*, i:i+w-1]$ denotes columns $i$ to $i+w-1$ of $C_k$. Finally, max pooling is used to extract the highest value of $f_k$ as the feature: $y_k = \max_i f_k[i]$;
Step1.2.3, further extracting features in the convolutional neural network with a gate-structured network: to further extract the relations among syllables and the syllable position features within a Burmese word, the following formula is used:
z=g(Wy+b)+y
in the formula, g is the nonlinear activation function tanh, y is the output of the previous network layer, W is a weight matrix, and z represents the extracted correlation between syllables, namely the syllable features; b is a parameter that is randomly generated during training;
wherein, owing to the linguistic characteristics of Burmese, the position features correspond to the syllable features, so once the syllable features of Burmese have been extracted, the corresponding position features are obtained as well.
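The syllable-level feature extraction of Step1.2 can be illustrated with a minimal sketch in plain Python. It is not the applicant's implementation: the toy sizes, the uniform random initialization, the use of several filters, and all helper names (`syllable_matrix`, `feature_map`, `max_pool`, `gate`) are assumptions made only for illustration. The sketch follows the three sub-steps in order: column lookup in Q, convolution with tanh followed by max pooling, and the gate z = g(Wy + b) + y.

```python
import math
import random

random.seed(0)

def rand_matrix(rows, cols):
    """Randomly initialized matrix (the initialization range is an assumption)."""
    return [[random.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

def syllable_matrix(syllable_ids, Q):
    """Select the columns of Q (d x |C|) for a word's syllables: C_k is d x l."""
    return [[row[j] for j in syllable_ids] for row in Q]

def feature_map(C_k, H, b):
    """f_k[i] = tanh(<C_k[*, i:i+w-1], H> + b); the result has l - w + 1 entries."""
    d, l, w = len(C_k), len(C_k[0]), len(H[0])
    out = []
    for i in range(l - w + 1):
        inner = sum(C_k[r][i + c] * H[r][c] for r in range(d) for c in range(w))
        out.append(math.tanh(inner + b))
    return out

def max_pool(f_k):
    """Keep the highest value of the feature map."""
    return max(f_k)

def gate(y, W, b):
    """z = g(W y + b) + y with g = tanh, applied to the pooled feature vector y."""
    n = len(y)
    return [math.tanh(sum(W[r][c] * y[c] for c in range(n)) + b[r]) + y[r]
            for r in range(n)]

# Toy sizes, all hypothetical.
d, n_syllables, w, n_filters = 4, 10, 2, 3
Q = rand_matrix(d, n_syllables)
filters = [rand_matrix(d, w) for _ in range(n_filters)]
biases = [random.uniform(-1.0, 1.0) for _ in range(n_filters)]
W_gate = rand_matrix(n_filters, n_filters)
b_gate = [random.uniform(-1.0, 1.0) for _ in range(n_filters)]

word = [3, 7, 1, 4]  # a word given as l = 4 syllable indices into Q
C_k = syllable_matrix(word, Q)
f_maps = [feature_map(C_k, H, b) for H, b in zip(filters, biases)]
y = [max_pool(f) for f in f_maps]   # one pooled value per filter
z = gate(y, W_gate, b_gate)         # gated syllable features of the word
```

With a word of l = 4 syllables and a filter width of w = 2, each feature map has l - w + 1 = 3 entries, and the final vector z has one component per filter.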
4. The Burmese dependency syntax analysis method based on transfer learning of claim 1, characterized in that: the specific steps of Step1.3 are as follows:
Step1.3.1, monolingual word vector representation: the prediction model takes the current word w as input and predicts its context; the embedding of the current word w is denoted $v_w$ and the embedding of the context c is denoted $v'_c$; the probability of the context c given the word w is expressed as a softmax function of the form:
$p(c \mid w; \theta) = \dfrac{\exp(v'_c \cdot v_w)}{\sum_{c' \in V} \exp(v'_{c'} \cdot v_w)}$
where V denotes the vocabulary and the parameters θ are contained in the word embedding matrix and the context embedding matrix; the prediction model is trained to maximize the log-likelihood over the training data set D, where D is the set of word and context pairs (w, c):
$J(\theta) = \sum_{(w,c) \in D} \log p(c \mid w; \theta)$
where J(θ) denotes the objective that yields the Burmese monolingual word vectors, and c' ranges over the words in the vocabulary V;
Step1.3.2, after the monolingual word vectors of both English and Burmese have been obtained, bilingual word vectors are trained on the English-Burmese bilingual corpus: letting L denote the set of the two languages, a joint objective J is defined that maps the English and Burmese word vectors into the same semantic space, and the proportion of the monolingual and bilingual word vectors is adjusted through α and β;
wherein J denotes the joint objective that yields the bilingual and monolingual word vectors; its terms are the bilingual word vector objective and the English monolingual word vector objective; α and β are the proportionality coefficients of the monolingual word vectors and the English-Burmese bilingual word vectors respectively; and the two data sets are the data set for the bilingual word vectors and the data set for the English monolingual word vectors.
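The quantities in Step1.3 can be made concrete with a small sketch in plain Python. It assumes the standard skip-gram softmax for the monolingual term. It further assumes that the joint objective is an α-weighted monolingual term plus a β-weighted bilingual term; that form is an assumption, since the joint formula itself is not reproduced in the text. The toy embeddings, the placeholder tokens prefixed with `my_` (standing in for Burmese words), and the function names are hypothetical.

```python
import math

def softmax_prob(w, c, word_emb, ctx_emb):
    """p(c | w) = exp(v'_c . v_w) / sum over c' in V of exp(v'_c' . v_w)."""
    v_w = word_emb[w]
    scores = {cp: sum(a * b for a, b in zip(v, v_w)) for cp, v in ctx_emb.items()}
    m = max(scores.values())  # subtract the maximum for numerical stability
    denom = sum(math.exp(s - m) for s in scores.values())
    return math.exp(scores[c] - m) / denom

def log_likelihood(pairs, word_emb, ctx_emb):
    """Sum of log p(c | w) over the (w, c) pairs of a data set D."""
    return sum(math.log(softmax_prob(w, c, word_emb, ctx_emb)) for w, c in pairs)

def joint_objective(j_mono, j_bi, alpha, beta):
    """Assumed form: alpha * monolingual term + beta * bilingual term."""
    return alpha * j_mono + beta * j_bi

# Toy embeddings over a shared vocabulary; "my_*" are placeholder Burmese tokens.
word_emb = {"dog": [0.5, 0.1], "runs": [0.2, 0.4],
            "my_dog": [0.4, 0.2], "my_runs": [0.1, 0.5]}
ctx_emb = {"dog": [0.3, 0.2], "runs": [0.6, 0.1],
           "my_dog": [0.2, 0.3], "my_runs": [0.5, 0.2]}

mono_pairs = [("dog", "runs"), ("my_dog", "my_runs")]  # same-language (w, c) pairs
bi_pairs = [("dog", "my_runs"), ("my_dog", "runs")]    # cross-language (w, c) pairs

J = joint_objective(log_likelihood(mono_pairs, word_emb, ctx_emb),
                    log_likelihood(bi_pairs, word_emb, ctx_emb),
                    alpha=0.5, beta=0.5)
```

Because each term is a sum of log-probabilities, J is negative; training would adjust the embeddings to increase it, and α and β control how strongly the monolingual and bilingual signals shape the shared space.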
5. The Burmese dependency syntax analysis method based on transfer learning of claim 1, characterized in that: the specific steps of Step2 are as follows:
Step2.1, constructing a partial Burmese dependency syntax analysis corpus based on a word mapping method;
using the existing English dependency syntax analysis corpus, an English-Burmese dictionary is generated from the acquired English-Burmese parallel sentence pairs, and the Burmese dependency syntax analysis corpus is constructed by word mapping;
the Burmese dependency syntax analysis corpus obtained by mapping the English words contains the positions, part-of-speech information and dependency arc information of the Burmese words;
Step2.2, migration of English dependency arcs: using the English-Burmese parallel aligned corpus, a weight matrix $W_{arc}$ associates the dependency relations of English and Burmese; $E_{EN}^{arc}$ and $E_{MY}^{arc}$ are the vectors representing the dependency relations in English and Burmese respectively, and the dependency arc vectors of English and Burmese are concatenated, where i and j denote the i-th and j-th Burmese words;
$E_{MY}^{arc} = W_{arc} \cdot E_{EN}^{arc}$
Step2.3, migration of English part-of-speech information: a relation matrix $W_{pos}$ is established between the corresponding parts of speech of Burmese and English; $E_{EN}^{pos}$ and $E_{MY}^{pos}$ denote the part-of-speech vectors of English and Burmese respectively; the relation matrix $W_{pos}$ is used to migrate the part-of-speech information of English to that of Burmese, so that the Burmese parts of speech carry more information, and the part-of-speech vectors of English and Burmese are concatenated, where i and j denote the i-th and j-th Burmese words;
$E_{MY}^{pos} = W_{pos} \cdot E_{EN}^{pos}$
Step2.4, migration of English position information: position information is added to the word vectors, and a relation matrix $W_{loc}$ is established according to the mapping relation between the dictionaries; $E_{EN}^{loc}$ and $E_{MY}^{loc}$ denote the position vectors of English and Burmese respectively; the relation matrix $W_{loc}$ is used to migrate the position information of English and Burmese words into one representation space, which reduces the difference between Burmese and English, so that English dependency syntax analysis knowledge can be learned while training the Burmese dependency syntax analysis model; the position vectors of English and Burmese are concatenated, where i and j denote the i-th and j-th Burmese words;
$E_{MY}^{loc} = W_{loc} \cdot E_{EN}^{loc}$
Step2.5, training the Burmese dependency syntax analysis model on the migrated Burmese dependency syntax analysis corpus with the Stanford Parser tool.
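The corpus migration of Step2 can be sketched as follows in plain Python. This is only an illustration under simplifying assumptions: the dictionary is one-to-one, sentences with unmapped words are skipped, the English word order is kept (word-order differences between English and Burmese are not handled), and the relation matrices are given rather than learned. The tokens prefixed with `my_` are placeholders for Burmese words, and the function names `project_sentence` and `migrate` are hypothetical.

```python
def project_sentence(en_tokens, en_to_my):
    """Map English tokens (form, pos, head) to Burmese, keeping position, POS and arc."""
    projected = []
    for position, (form, pos, head) in enumerate(en_tokens, start=1):
        my_form = en_to_my.get(form.lower())
        if my_form is None:
            return None  # assumption: drop sentences that contain unmapped words
        projected.append({"id": position, "form": my_form, "pos": pos, "head": head})
    return projected

def migrate(W, e_en):
    """E_MY = W . E_EN: apply a relation matrix to an English feature vector."""
    return [sum(w_rc * x for w_rc, x in zip(row, e_en)) for row in W]

# Toy English dependency tree: head 0 marks the root, other heads are 1-based positions.
en_tokens = [("She", "PRON", 2), ("reads", "VERB", 0), ("books", "NOUN", 2)]
en_to_my = {"she": "my_she", "reads": "my_reads", "books": "my_books"}
projected = project_sentence(en_tokens, en_to_my)

# Toy relation matrix and English arc vector (values are arbitrary).
W_arc = [[0.5, 0.0],
         [0.0, 2.0]]
e_en_arc = [2.0, 1.0]
e_my_arc = migrate(W_arc, e_en_arc)
```

The same `migrate` call applies unchanged to the part-of-speech and position vectors with their own relation matrices; how those matrices are estimated is not specified here and would have to follow the training procedure of the method.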
6. A Burmese dependency syntax analysis device based on transfer learning, characterized in that the device comprises the following modules:
the English-Burmese bilingual word vector characterization module is used for preprocessing the Burmese data: English and Burmese words are represented as bilingual word vectors in the same semantic space;
the English dependency syntax analysis migration module is used for migrating the English dependency syntax analysis corpus: the dependency arc, position and part-of-speech information of English is migrated to Burmese, and a Burmese dependency syntax analysis model is trained;
and the Burmese dependency syntax analysis and prediction module is used for vectorizing the input Burmese sentence and then performing Burmese dependency syntax analysis prediction with the trained Burmese dependency syntax analysis model.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910808117.5A CN110705253A (en) | 2019-08-29 | 2019-08-29 | Burma language dependency syntax analysis method and device based on transfer learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110705253A (en) | 2020-01-17 |
Family
ID=69194219
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910808117.5A Pending CN110705253A (en) | 2019-08-29 | 2019-08-29 | Burma language dependency syntax analysis method and device based on transfer learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110705253A (en) |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110008467A (en) * | 2019-03-04 | 2019-07-12 | 昆明理工大学 | A kind of interdependent syntactic analysis method of Burmese based on transfer learning |
Non-Patent Citations (2)
Title |
---|
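JIANG GUO et al.: "A Representation Learning Framework for Multi-Source Transfer Parsing", AAAI-16 * |
LIN SONGKAI: "Research on the Application of a Neural-Network-Based Chinese-Burmese Bilingual Sentence-Level Embedding Semantic Representation Method", China Master's Theses Full-text Database, Information Science and Technology * |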
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113779962A (en) * | 2020-06-10 | 2021-12-10 | 阿里巴巴集团控股有限公司 | Data processing method, device, equipment and storage medium |
CN113779962B (en) * | 2020-06-10 | 2024-02-02 | 阿里巴巴集团控股有限公司 | Data processing method, device, equipment and storage medium |
WO2021147404A1 (en) * | 2020-07-30 | 2021-07-29 | 平安科技(深圳)有限公司 | Dependency relationship classification method and related device |
CN112084769A (en) * | 2020-09-14 | 2020-12-15 | 深圳前海微众银行股份有限公司 | Dependency syntax model optimization method, device, equipment and readable storage medium |
CN112287688A (en) * | 2020-09-17 | 2021-01-29 | 昆明理工大学 | English-Burmese bilingual parallel sentence pair extraction method and device integrating pre-training language model and structural features |
CN113116363A (en) * | 2021-04-15 | 2021-07-16 | 西北工业大学 | Method for judging hand fatigue degree based on surface electromyographic signals |
CN113449520A (en) * | 2021-07-22 | 2021-09-28 | 中国工商银行股份有限公司 | Word sense disambiguation method and device |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110705253A (en) | Burma language dependency syntax analysis method and device based on transfer learning | |
CN108363790B (en) | Method, device, equipment and storage medium for evaluating comments | |
CN112801010B (en) | Visual rich document information extraction method for actual OCR scene | |
CN108614875B (en) | Chinese emotion tendency classification method based on global average pooling convolutional neural network | |
CN109344236B (en) | Problem similarity calculation method based on multiple characteristics | |
CN111177326B (en) | Key information extraction method and device based on fine labeling text and storage medium | |
CN112214610B (en) | Entity relationship joint extraction method based on span and knowledge enhancement | |
CN108959242B (en) | Target entity identification method and device based on part-of-speech characteristics of Chinese characters | |
CN111125331A (en) | Semantic recognition method and device, electronic equipment and computer-readable storage medium | |
CN110309511B (en) | Shared representation-based multitask language analysis system and method | |
CN112149421A (en) | Software programming field entity identification method based on BERT embedding | |
CN111966812B (en) | Automatic question answering method based on dynamic word vector and storage medium | |
CN105068997B (en) | The construction method and device of parallel corpora | |
CN110688862A (en) | Mongolian-Chinese inter-translation method based on transfer learning | |
TW201403354A (en) | System and method using data reduction approach and nonlinear algorithm to construct Chinese readability model | |
CN113268576B (en) | Deep learning-based department semantic information extraction method and device | |
CN112417823B (en) | Chinese text word order adjustment and word completion method and system | |
CN111985612A (en) | Encoder network model design method for improving video text description accuracy | |
CN112926345A (en) | Multi-feature fusion neural machine translation error detection method based on data enhancement training | |
CN114818717A (en) | Chinese named entity recognition method and system fusing vocabulary and syntax information | |
CN110222338A (en) | A kind of mechanism name entity recognition method | |
CN113627150A (en) | Method and device for extracting parallel sentence pairs for transfer learning based on language similarity | |
KR20230009564A (en) | Learning data correction method and apparatus thereof using ensemble score | |
CN115859164A (en) | Method and system for identifying and classifying building entities based on prompt | |
CN113160917A (en) | Electronic medical record entity relation extraction method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20200117 |