CN106649294A - Method for training a classification model, and method and device for recognizing English subordinate clauses based on the classification model
- Publication number
- CN106649294A (application CN201611250331.6A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/40—Processing or translation of natural language
- G06F40/42—Data-driven translation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/211—Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
Abstract
Embodiments of the present invention provide a method for training a classification model, and a method and device for recognizing English subordinate clauses based on the classification model. The training method includes: setting English sentences that contain English subordinate clauses as training samples; converting the training samples into feature text sequences; and using the feature text sequences to train a classification model for recognizing English subordinate clauses. The types of subordinate clauses contained in an English sentence can thus be recognized automatically, which enriches the information available about the sentence and reduces the need for users to look up other materials and compare them against the sentence manually. This saves time, improves efficiency, and lowers the probability of error when the user has limited knowledge.
Description
Technical field
The present invention relates to the technical field of computer processing, and in particular to a method for training a classification model for English subordinate clauses, a method for recognizing English subordinate clauses based on a classification model, a corresponding device for training a classification model for English subordinate clauses, and a device for recognizing English subordinate clauses based on a classification model.
Background
With the development of globalization, English, as one of the international languages, has become one of the basic subjects that people study.
When people come across English sentences they do not understand while reading English articles or watching English films, most of them turn to a translation application.
Current translation applications typically translate an English sentence and return its meaning. However, people who are learning the language, especially students, have other needs. They currently have to look up other materials and compare them against the sentence manually, which takes considerable time, is inefficient, and is error-prone when the learner has limited knowledge.
Summary of the invention
In view of the above problems, the present invention is proposed to provide a method for training a classification model for English subordinate clauses, a method for recognizing English subordinate clauses based on a classification model, a corresponding device for training a classification model for English subordinate clauses, and a device for recognizing English subordinate clauses based on a classification model, which overcome the above problems or at least partially solve them.
According to one aspect of the present invention, a method for training a classification model for English subordinate clauses is provided, including:
setting English sentences that contain English subordinate clauses as training samples;
converting the training samples into feature text sequences;
using the feature text sequences to train a classification model for recognizing English subordinate clauses.
Optionally, the step of converting the training samples into feature text sequences includes:
recognizing the constituent structure of the training samples;
forming the feature text sequences from the constituent structure.
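The conversion step can be pictured with a minimal Python sketch. It assumes that a constituent-structure analysis of the sentence is already available as (word, label) pairs. The label names and the function `to_feature_sequence` are hypothetical illustrations; this text does not specify the actual label set or sequence format.

```python
from typing import List, Tuple


def to_feature_sequence(tagged: List[Tuple[str, str]]) -> str:
    """Join the constituent labels, in word order, into one feature text sequence."""
    return " ".join(label for _word, label in tagged)


# Hypothetical analysis of "Tell him which class you are in."
# (labels are illustrative, not taken from the patent)
tagged_sample = [
    ("Tell", "PRED"), ("him", "OBJ"), ("which", "CONN"), ("class", "NOUN"),
    ("you", "SUBJ"), ("are", "PRED"), ("in", "PREP"),
]
print(to_feature_sequence(tagged_sample))
```

Keeping the labels in the original word order matters here, because the later classification step relies on the order of the words in the sentence.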
Optionally, the step of using the feature text sequences to train a classification model for recognizing English subordinate clauses includes:
inputting the feature text sequences into a convolutional neural network;
training, in the convolutional neural network and based on the order of the words in the training samples, the classification model for recognizing English subordinate clauses using the feature text sequences.
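To illustrate why a convolutional network is sensitive to word order, the following sketch computes the forward pass of a single one-dimensional convolution filter with ReLU and max-over-time pooling, in plain Python. It is not the patented network: the one-hot embeddings, the label names, and the hand-set kernel are assumptions made for illustration, and the learning of weights during training is omitted.

```python
from typing import List

Vector = List[float]


def conv1d_max(sequence: List[Vector], kernel: List[Vector]) -> float:
    """Slide one filter over consecutive token vectors, apply ReLU, and max-pool."""
    width = len(kernel)
    best = 0.0  # ReLU output is never below zero
    for start in range(len(sequence) - width + 1):
        activation = sum(
            sequence[start + i][d] * kernel[i][d]
            for i in range(width)
            for d in range(len(kernel[i]))
        )
        best = max(best, activation)
    return best


# Hypothetical one-hot embeddings for three feature labels.
EMBED = {"CONN": [1.0, 0.0, 0.0], "SUBJ": [0.0, 1.0, 0.0], "PRED": [0.0, 0.0, 1.0]}
# A filter that responds most strongly to "CONN" immediately followed by "SUBJ".
kernel = [EMBED["CONN"], EMBED["SUBJ"]]

in_order = [EMBED[t] for t in ["CONN", "SUBJ", "PRED"]]
swapped = [EMBED[t] for t in ["SUBJ", "CONN", "PRED"]]
print(conv1d_max(in_order, kernel), conv1d_max(swapped, kernel))
```

The same labels in a different order produce a weaker response, which is the property that lets a trained network distinguish clause patterns by the order of their elements.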
According to another aspect of the present invention, a method for recognizing English subordinate clauses based on a classification model is provided, including:
determining an English sentence to be recognized;
converting the English sentence into a feature text sequence;
inputting the feature text sequence into a preset classification model to recognize the type of subordinate clause contained in the English sentence.
Optionally, the step of converting the English sentence into a feature text sequence includes:
recognizing the constituent structure of the English sentence;
forming the feature text sequence from the constituent structure.
Optionally, the step of inputting the feature text sequence into a preset classification model to recognize the type of subordinate clause contained in the English sentence includes:
inputting the feature text sequence into a classification model trained by a convolutional neural network;
recognizing, in the classification model and based on the order of the words in the English sentence, the type of subordinate clause contained in the English sentence using the feature text sequence.
According to another aspect of the present invention, a device for training a classification model for English subordinate clauses is provided, including:
a training sample setting module, adapted to set English sentences that contain English subordinate clauses as training samples;
a training sample conversion module, adapted to convert the training samples into feature text sequences;
a classification model training module, adapted to use the feature text sequences to train a classification model for recognizing English subordinate clauses.
Optionally, the training sample conversion module includes:
a sample structure recognition submodule, adapted to recognize the constituent structure of the training samples;
a sample feature forming submodule, adapted to form the feature text sequences from the constituent structure.
Optionally, the classification model training module includes:
a convolutional neural network input submodule, adapted to input the feature text sequences into a convolutional neural network;
a convolutional neural network training submodule, adapted to train, in the convolutional neural network and based on the order of the words in the training samples, the classification model for recognizing English subordinate clauses using the feature text sequences.
According to another aspect of the present invention, a device for recognizing English subordinate clauses based on a classification model is provided, including:
an English sentence determining module, adapted to determine an English sentence to be recognized;
an English sentence conversion module, adapted to convert the English sentence into a feature text sequence;
a clause type recognition module, adapted to input the feature text sequence into a preset classification model to recognize the type of subordinate clause contained in the English sentence.
Optionally, the English sentence conversion module includes:
a sentence structure recognition submodule, adapted to recognize the constituent structure of the English sentence;
a sentence feature forming submodule, adapted to form the feature text sequence from the constituent structure.
Optionally, the clause type recognition module includes:
a classification model input submodule, adapted to input the feature text sequence into a classification model trained by a convolutional neural network;
a classification model recognition submodule, adapted to recognize, in the classification model and based on the order of the words in the English sentence, the type of subordinate clause contained in the English sentence using the feature text sequence.
In embodiments of the present invention, English sentences that contain English subordinate clauses are set as training samples and converted into feature text sequences, and the feature text sequences are used to train a classification model for recognizing English subordinate clauses. The types of subordinate clauses contained in an English sentence can thus be recognized automatically, which enriches the information available about the sentence and reduces the need for users to look up other materials and compare them against the sentence manually. This not only saves time and improves efficiency, but also lowers the probability of error when the user has limited knowledge.
In embodiments of the present invention, an English sentence is converted into a feature text sequence and input into a preset classification model to recognize the type of subordinate clause it contains. The types of subordinate clauses contained in the sentence are thus recognized automatically, which enriches the information available about the sentence and reduces the need for users to look up other materials and compare them against the sentence manually. This not only saves time and improves efficiency, but also lowers the probability of error when the user has limited knowledge.
The above is only an overview of the technical solution of the present invention. So that the technical means of the present invention can be understood more clearly and implemented in accordance with the specification, and so that the above and other objects, features, and advantages of the present invention become more apparent, specific embodiments of the present invention are set out below.
Brief description of the drawings
Various other advantages and benefits will become clear to those of ordinary skill in the art from reading the following detailed description of the preferred embodiments. The drawings serve only to illustrate the preferred embodiments and are not to be regarded as limiting the present invention. Throughout the drawings, the same reference numerals denote the same parts. In the drawings:
Fig. 1 shows a flow chart of the steps of a method for recognizing English information according to an embodiment of the present invention;
Figs. 2A-2E show example diagrams of an operation for recognizing an English sentence according to an embodiment of the present invention;
Fig. 3 shows a flow chart of the steps of another method for recognizing English information according to an embodiment of the present invention;
Fig. 4 shows a flow chart of the steps of a method for training a classification model for English subordinate clauses according to an embodiment of the present invention;
Fig. 5 shows an example diagram of the recognition of a constituent structure according to an embodiment of the present invention;
Fig. 6 shows a flow chart of the steps of a method for recognizing English subordinate clauses based on a classification model according to an embodiment of the present invention;
Fig. 7 shows a structural block diagram of a device for recognizing English information according to an embodiment of the present invention;
Fig. 8 shows a structural block diagram of another device for recognizing English information according to an embodiment of the present invention;
Fig. 9 shows a structural block diagram of a device for training a classification model for English subordinate clauses according to an embodiment of the present invention; and
Fig. 10 shows a structural block diagram of a device for recognizing English subordinate clauses based on a classification model according to an embodiment of the present invention.
Detailed description of the embodiments
Exemplary embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the present disclosure, it should be understood that the disclosure may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the disclosure can be understood more thoroughly and its scope conveyed completely to those skilled in the art.
Referring to Fig. 1, a flow chart of the steps of a method for recognizing English information according to an embodiment of the present invention is shown. The method may specifically include the following steps:
Step 101: select target image data.
In a specific implementation, embodiments of the present invention can be applied in a mobile terminal, for example a mobile phone, a PDA (Personal Digital Assistant), a laptop computer, or a palmtop computer. The embodiments of the present invention place no limitation on this.
These mobile terminals can run operating systems such as Windows, Android, iOS, and Windows Phone. An English recognition application can be installed in these operating systems to recognize English information. The English recognition application can be a system application of the operating system or a third-party application.
In embodiments of the present invention, the English recognition application can, according to the user's operation instructions, select target image data in which English information is recorded, to await recognition.
In a specific implementation, the English recognition application can select the target image data in the following ways:
1. Shooting.
In this mode, the mobile terminal is equipped with a camera. As shown in Fig. 2A, after starting the English recognition application, the user taps the "take a photo to recognize sentences" control in the application's interface. A menu bar pops up as shown in Fig. 2B, and the user can tap the "take photo" control.
In response to the "take photo" control, the English recognition application can call the camera to capture preview image data.
Taking the Android system as an example, the English recognition application first declares, in its manifest file, its use of the camera and of related features (such as autofocus).
In the main activity of the English recognition application, an Intent (for example MediaStore.ACTION_IMAGE_CAPTURE) is used to notify the operating system's built-in camera application, and the camera Intent is launched through the startActivityForResult() method. After the user takes a photo, the camera application returns the preview image data to the main activity. A method for receiving the preview image data (for example onActivityResult()) is added to the main activity to operate on the returned data.
Because the English information may occupy only a small part of the image, a preview box can be loaded over the preview image data to reduce interference from other objects and improve recognition accuracy. An example is the rectangle with white dots at its four corners shown in Fig. 2C. The user can adjust the shape, position, and size of the preview box so that the English information lies inside it and other objects are excluded.
Of course, the user can also directly select the whole frame of preview image data as the target image data. The embodiments of the present invention place no limitation on this.
If the user taps the "√" control shown in Fig. 2C, the preview image data inside the preview box can be extracted as the target image data.
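Extracting the image data inside the preview box amounts to cropping a rectangle out of the preview frame. The sketch below assumes the frame is a list of pixel rows and that the box is given by its top-left corner, width, and height. The function name `crop_to_preview_box` and this box representation are illustrative assumptions, not part of the described application.

```python
from typing import List

Image = List[List[int]]  # rows of grayscale pixel values


def crop_to_preview_box(image: Image, left: int, top: int,
                        width: int, height: int) -> Image:
    """Return the pixels inside the preview box, clamped to the image bounds."""
    rows = len(image)
    cols = len(image[0]) if rows else 0
    x0, y0 = max(left, 0), max(top, 0)
    x1, y1 = min(left + width, cols), min(top + height, rows)
    return [row[x0:x1] for row in image[y0:y1]]


frame = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
print(crop_to_preview_box(frame, left=1, top=1, width=2, height=2))
```

Clamping to the frame bounds keeps the crop valid even when the user drags the box partly outside the preview.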
2. Local upload.
In this mode, as shown in Fig. 2A, after starting the English recognition application, the user taps the "take a photo to recognize sentences" control in the application's interface. A menu bar pops up as shown in Fig. 2B, and the user can tap the "select from phone album" control to select local image data.
According to the user's selection, the English recognition application can import locally stored image data as the target image data.
It should be noted that the image data stored locally on the mobile terminal can be image data obtained from an earlier photograph, image data obtained from a screenshot, or image data obtained in some other way. The embodiments of the present invention place no limitation on this.
Of course, the above ways of selecting target image data are only examples. When implementing the embodiments of the present invention, other ways of selecting target image data can be set according to the actual situation, and those skilled in the art can also adopt other ways according to actual needs. The embodiments of the present invention place no limitation on this.
Step 102: recognize English information from the target image data and split out one or more English sentences.
English information can be recognized from the target image data by OCR (Optical Character Recognition).
In this approach, the target image data can first be preprocessed, including binarization, noise removal, and skew correction, to improve recognition accuracy.
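As an illustration of the binarization part of this preprocessing, the sketch below thresholds a grayscale image at its global mean. The text does not say which thresholding method is used, so the global-mean rule and the function name `binarize` are assumptions chosen for simplicity. Noise removal and skew correction are not shown.

```python
from typing import List, Optional

Image = List[List[int]]  # rows of grayscale pixel values, 0 (dark) to 255 (light)


def binarize(image: Image, threshold: Optional[float] = None) -> Image:
    """Map each pixel to 1 (dark, likely ink) or 0 (light, likely background)."""
    pixels = [p for row in image for p in row]
    if not pixels:
        return []
    if threshold is None:
        threshold = sum(pixels) / len(pixels)  # global mean as a simple threshold
    return [[1 if p < threshold else 0 for p in row] for row in image]


gray = [[10, 200],
        [220, 30]]
print(binarize(gray))
```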
Character features can then be extracted from the preprocessed target image data. These generally fall into the following two kinds:
1. Statistical features. An example is the ratio of black to white pixels within the character area: when the character is divided into several regions, the black/white pixel ratios of the individual regions together form a numerical vector in feature space.
2. Structural features. For example, after the character image is thinned, the number and positions of the stroke endpoints and intersection points of the character are obtained, or stroke segments are used as features.
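The statistical feature in item 1 can be sketched as follows. The sketch assumes a binarized glyph (1 for black, 0 for white) split into an n-by-n grid of zones. It uses the fraction of black pixels per zone rather than a literal black/white ratio, which would be undefined for zones with no white pixels. The grid layout and the function name `zone_features` are illustrative assumptions.

```python
from typing import List

Glyph = List[List[int]]  # binary image: 1 = black pixel, 0 = white pixel


def zone_features(glyph: Glyph, zones_per_side: int = 2) -> List[float]:
    """Split the glyph into zones and return each zone's black-pixel fraction."""
    rows, cols = len(glyph), len(glyph[0])
    n = zones_per_side
    features = []
    for zr in range(n):
        r0, r1 = zr * rows // n, (zr + 1) * rows // n
        for zc in range(n):
            c0, c1 = zc * cols // n, (zc + 1) * cols // n
            cells = [glyph[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            features.append(sum(cells) / len(cells) if cells else 0.0)
    return features


glyph = [[1, 1, 0, 0],
         [1, 1, 0, 0],
         [0, 0, 0, 1],
         [0, 0, 1, 1]]
print(zone_features(glyph))
```

The resulting list is the numerical vector that the matching step compares against stored letters.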
The extracted features are compared with the features of all the English letters to be recognized that are stored in a database. Using methods such as Euclidean-space matching, relaxation matching, or dynamic programming (DP) matching, the English letter corresponding to the features is identified.
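Of the matching methods named above, Euclidean-space matching is the simplest to sketch: the letter whose stored feature vector lies closest to the extracted vector is chosen. The template values and the function name `nearest_letter` below are made up for illustration. Relaxation and dynamic programming matching are not shown.

```python
import math
from typing import Dict, List


def nearest_letter(features: List[float], templates: Dict[str, List[float]]) -> str:
    """Return the stored letter whose feature vector is closest in Euclidean distance."""
    return min(templates, key=lambda letter: math.dist(features, templates[letter]))


# Hypothetical stored feature vectors for two letters.
templates = {
    "I": [0.0, 1.0, 0.0, 1.0],
    "L": [1.0, 0.0, 1.0, 1.0],
}
print(nearest_letter([0.9, 0.1, 0.8, 1.0], templates))
```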
Afterwards, using the matched English letters and their possible similar candidates, the most plausible letter can be selected according to the letters recognized before and after it, so as to correct the result.
In embodiments of the present invention, the target image data may contain one or more English sentences. Each sentence can then be recognized and split out based on markers such as full stops.
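Splitting on full stops can be sketched with a regular expression. The version below also treats "!" and "?" as sentence ends, which is an assumption beyond the full stop mentioned in the text, and it does not handle abbreviations such as "Mr." that contain a period.

```python
import re
from typing import List


def split_sentences(text: str) -> List[str]:
    """Split text after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


print(split_sentences("He runs quickly. I saw a film yesterday."))
```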
In practical applications, to reduce resource consumption on the mobile terminal, the recognition of the English information and the splitting of the English sentences can be performed by a server.
In this mode, the English recognition application can send the target image data to the server. The server recognizes the English information from the target image data by optical character recognition, splits one or more English sentences out of the English information, and returns them to the English recognition application.
The English recognition application receives from the server the English information recognized from the target image data by optical character recognition, together with the one or more English sentences split out of that information.
As shown in Fig. 2D, because the server needs some time to recognize the English information and split the English sentences, a message such as "Recognizing..." is displayed in the interface of the English recognition application to prompt the user to wait.
Of course, the recognition of the English information and the splitting of the English sentences can also be performed by the English recognition application itself. The embodiments of the present invention place no limitation on this.
Step 103: split the English sentence into interactive elements in which each word can be tapped, and recognize the sentence-pattern factors of the English sentence.
In embodiments of the present invention, the words that make up an English sentence can be split apart and then used to generate tappable interactive elements, for example as JSON (JavaScript Object Notation, a lightweight data-interchange format) data.
Each word can be generated as an independent interactive element; that is, the interactive element records the word, for example as text, and thereby represents it. The interactive elements are laid out in the same arrangement as the words, so that together they form the complete English sentence.
By tapping or similar actions, the user can select one or more interactive elements and thus one or more words, in order to perform operations such as translation on the selected words.
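The generation of per-word interactive elements as JSON data can be sketched as follows. The field names `index`, `word`, and `clickable` are hypothetical; the text states only that each word becomes an independent, tappable element and that JSON is one possible format.

```python
import json
import re


def to_interactive_elements(sentence: str) -> str:
    """Split a sentence into words and describe each word as a JSON element."""
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", sentence)
    elements = [
        {"index": i, "word": word, "clickable": True}  # hypothetical field names
        for i, word in enumerate(words)
    ]
    return json.dumps(elements)


payload = to_interactive_elements(
    "The question whether it is right or wrong depends on the result"
)
print(payload)
```

Keeping the word index in each element lets the interface lay the elements out in sentence order and map a tap back to the word it represents.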
For example, as shown in Fig. 2E, for the English sentence "The question whether it is right or wrong depends on the result", a tappable interactive element can be generated for each of "The", "question", "whether", "it", "is", "right", "or", "wrong", "depends", "on", "the", and "result".
In addition, the sentence-pattern factors of the English sentence, that is, the linguistic attributes of the sentence, can be recognized so that the user can look them up easily.
In embodiments of the present invention, the sentence-pattern factors can include one or more of the following:
1. Sentence structure
The structure of an English sentence can include one or more of the following:
1.1. Subject-verb structure, in which the predicate is an intransitive verb. For example: He runs quickly.
1.2. Subject-linking verb-predicative structure, in which the predicate is a linking verb. For example: He is older than he looks.
1.3. Subject-verb-object structure, in which the predicate is a transitive verb and therefore takes an object. For example: I saw a film yesterday.
1.4. Subject-verb-double object structure, in which the predicate is a transitive verb with two objects. For example: He gave me a book / a book to me.
1.5. Subject-verb-object-complement structure, in which the predicate is a transitive verb whose object takes a complement. For example: They made the girl angry.
2. Subordinate clause type
A subordinate clause is defined relative to a main clause: in a complex sentence, the subordinate clause is subordinate to a main clause and cannot stand alone as a sentence, yet it has its own subject and predicate and is introduced by a connective such as that, who, or when.
English has three main kinds of subordinate clause, namely noun clauses (including subject clauses, object clauses, predicative clauses, and appositive clauses), adjective clauses (that is, attributive clauses), and adverbial clauses (covering time, condition, result, purpose, reason, concession, place, manner, and so on).
Specifically:
2.1. Subject clause: a clause that serves as the subject in a complex sentence.
For example: That he finished writing the composition in such a short time surprised us all.
2.2. Object clause: a clause that serves as the object in a complex sentence.
For example: Tell him which class you are in.
2.3. Predicative clause: a clause that serves as the predicative in a complex sentence.
For example: China is no longer what she used to be.
2.4. Appositive clause: a clause that serves as an appositive in a complex sentence.
For example: I heard the news that our team had won.
2.5. Attributive clause: a clause that serves as an attribute in a complex sentence.
For example: The dog that/which was lost has been found.
2.6. Adverbial clause: a clause that serves as an adverbial in a complex sentence.
For example: I will not go to her party if she doesn't invite me.
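The clause taxonomy described above can be written down as a small lookup table, which is one plausible way to organize the output labels of a classifier. The dictionary layout and the function name `category_of` are illustrative assumptions; the category and member names follow the text.

```python
from typing import Dict, List, Optional

CLAUSE_TAXONOMY: Dict[str, List[str]] = {
    "noun clause": ["subject clause", "object clause",
                    "predicative clause", "appositive clause"],
    "adjective clause": ["attributive clause"],
    "adverbial clause": ["time", "condition", "result", "purpose",
                         "reason", "concession", "place", "manner"],
}


def category_of(clause_type: str) -> Optional[str]:
    """Return the main category that a clause type belongs to, if any."""
    for category, members in CLAUSE_TAXONOMY.items():
        if clause_type == category or clause_type in members:
            return category
    return None


print(category_of("object clause"))
```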
In one embodiment of the present invention, the subordinate clause type can be recognized as follows:
Sub-step S1031: determine an English sentence to be recognized;
Sub-step S1032: convert the English sentence into a feature text sequence;
Sub-step S1033: input the feature text sequence into a preset classification model to recognize the type of subordinate clause contained in the English sentence.
In embodiments of the present invention, because sub-steps S1031, S1032, and S1033 are applied in essentially the same way as step 501, step 502, and step 503, they are described only briefly here. For related details, refer to the description of step 501, step 502, and step 503; they are not described in detail again here.
3. Sentence tense
The tense of an English sentence can include one or more of the following:
3.1. Simple present: expresses habitual events, regular actions, or general truths.
For example: She doesn't often write to her family, only once a month.
3.2. Simple past: describes an action that occurred or a state that existed at some time in the past, and can also express a habitual action that recurred during a past period.
For example: He got his driving license last month.
3.3. Simple future: describes an action that will occur or a situation that will exist in the future.
For example: He will arrive here this evening.
3.4. Present progressive: describes an action taking place at the moment of speaking or writing, or an action that is in progress throughout the current period.
For example: They are having a football match.
3.5. Past progressive: expresses an action that was occurring or in progress at a certain point in the past.
For example: At this moment yesterday, I was packing for camp.
3.6. Past perfect: expresses an action that had already occurred or been completed at or before a certain time or action in the past.
For example: When I woke up, it had stopped raining.
4. Part of speech
English words are classified by their function in a sentence into parts of speech, which can include one or more of the following:
4.1. Noun (n.), for example, student.
4.2. Pronoun (pron.), for example, you.
4.3. Adjective (adj.), for example, happy.
4.4. Adverb (adv.), for example, quickly.
4.5. Verb (v.), for example, cut.
4.6. Numeral (num.), for example, three.
4.7. Article (art.), for example, a.
4.8. Preposition (prep.), for example, at.
4.9. Conjunction (conj.), for example, and.
4.10. Interjection (interj.), for example, oh.
It should be noted that an English word may have several parts of speech. In the embodiments of the present invention, the part of speech refers to the part of speech of the English word within the English sentence to be recognized, and contextual information can help identify it.
Of course, the above sentence-pattern factors are only examples. When implementing the embodiments of the present invention, other sentence-pattern factors can be set according to the actual situation, and those skilled in the art can also adopt other sentence-pattern factors according to actual needs. The embodiments of the present invention place no limitation on this.
Because the amount of sentence-pattern factor data may be large, the factors can be recognized and displayed in batches, or recognized all at once and displayed in batches. The embodiments of the present invention place no limitation on this.
For example, in the interface shown in Fig. 2E, if the user taps the "clause analysis" control, the sentence structure and subordinate clause type can be displayed; if the user taps the "tense analysis" control, the sentence tense can be displayed; and if the user taps the "part-of-speech analysis" control, the parts of speech can be displayed.
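The batch display described in this example can be sketched as a mapping from each control to the factors it reveals. The key names and the sample analysis values below are hypothetical; only the three control names and what each one displays come from the text.

```python
from typing import Dict, List

CONTROL_TO_FACTORS: Dict[str, List[str]] = {
    "clause analysis": ["sentence_structure", "clause_type"],
    "tense analysis": ["tense"],
    "part-of-speech analysis": ["parts_of_speech"],
}


def factors_to_show(control: str, analysis: Dict[str, object]) -> Dict[str, object]:
    """Pick out the recognized factors that the tapped control should display."""
    wanted = CONTROL_TO_FACTORS.get(control, [])
    return {name: analysis[name] for name in wanted if name in analysis}


# Illustrative analysis of "I heard the news that our team had won."
analysis = {
    "sentence_structure": "subject-verb-object",
    "clause_type": "appositive clause",
    "tense": "simple past",
    "parts_of_speech": {"heard": "v.", "news": "n."},
}
print(factors_to_show("clause analysis", analysis))
```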
In practical applications, to reduce resource consumption on the mobile terminal, the splitting of the English words and the recognition of the sentence-pattern factors can be performed by a server.
In this mode, the English recognition application can send the English sentence to the server. The server splits each word out of the English sentence, recognizes from the English sentence one or more of the sentence structure, the subordinate clause type, the sentence tense, and the part of speech of each word in the sentence, and returns the results to the English recognition application.
The English recognition application receives from the server each word split out of the English sentence, together with one or more of the sentence structure, the subordinate clause type, the sentence tense, and the part of speech of each word in the sentence, as recognized from the English sentence.
The English recognition application then generates a tappable interactive element for each word in its interface.
Certainly, the fractionation of English word, the identification of the clause factor can also be performed by English identification application, and the present invention is implemented
Example is not any limitation as to this.
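As a minimal sketch of the word-splitting step, the following Python splits an English sentence into one record per word, each of which could back a clickable interactive element. The `WordElement` class, its fields, and the regular expression are hypothetical illustrations and are not taken from the embodiment.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class WordElement:
    """Hypothetical stand-in for one clickable interactive element."""
    index: int              # position of the word within the sentence
    text: str               # the word itself
    selected: bool = False  # would be toggled when the user clicks the word


def split_into_elements(sentence: str) -> List[WordElement]:
    # Keep runs of letters, allowing an internal apostrophe or hyphen
    # (e.g. "doesn't"); punctuation and whitespace are discarded.
    words = re.findall(r"[A-Za-z]+(?:['-][A-Za-z]+)*", sentence)
    return [WordElement(i, w) for i, w in enumerate(words)]


elements = split_into_elements(
    "The question whether it is right or wrong depends on the result")
```

Each element keeps its index so that a later selection can be mapped back to the position of the word in the sentence.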
In the embodiments of the present invention, English information is recognized from the selected target image data and split into one or more English sentences; each English sentence is split into clickable interactive elements, one per word, and the clause factors of the English sentence are recognized. On the one hand, the user can select one or more desired words through the interactive elements for subsequent operations such as translation. On the other hand, automatically recognizing the clause factors of an English sentence enriches the information available about the sentence and reduces the need for the user to manually consult other materials and compare them against the sentence; this not only saves time and improves efficiency, but also reduces the probability of error when the user has limited knowledge.
Referring to Fig. 3, a flow chart of the steps of another method for recognizing English information according to an embodiment of the present invention is shown, which may specifically include the following steps:
Step 301, select target image data.
Step 302, recognize English information from the target image data, and split out one or more English sentences.
Step 303, split the English sentence into clickable interactive elements, one per word, and recognize the clause factors of the English sentence.
Step 304, select one or more target English sentences from the one or more English sentences.
Step 305, translate the one or more target English sentences to obtain target-language information.
In the embodiments of the present invention, the user can select a target English sentence from the recognized English sentences for translation and obtain the desired target-language information, such as a Chinese, Korean, or Portuguese translation.
For example, as shown in Fig. 2E, the English sentence "The question whether it is right or wrong depends on the result" can be translated into Chinese with the meaning "Whether the question is right or wrong depends on the result".
It should be noted that the translation may be of a single sentence or of multiple English sentences.
In practical applications, to save resources on the mobile terminal, the translation of the target English sentences may be performed by a server.
In this approach, the English recognition application sends the one or more target English sentences to the server; the server translates them to obtain the target-language information and returns it to the English recognition application.
The English recognition application receives, from the server, the target-language information obtained by translating the one or more target English sentences.
Of course, the translation of the target English sentences may also be performed by the English recognition application itself; the embodiments of the present invention place no limitation on this.
Step 306, select a target word from the words of the English sentence based on the interactive elements.
Step 307, translate the target word to obtain target-language information.
In the embodiments of the present invention, the user can select a target word from an English sentence for translation and obtain the desired target-language information, such as a Chinese, Korean, or Portuguese translation.
For example, as shown in Fig. 2E, for the English sentence "The question whether it is right or wrong depends on the result", the user can click to select "question", "depends", and "on" as target words, and then click the "translate" control to translate them.
In practical applications, to save resources on the mobile terminal, the translation of the target word may be performed by a server.
In this approach, the English recognition application sends the target word to the server; the server translates the target word to obtain the target-language information and returns it to the English recognition application.
The English recognition application receives, from the server, the target-language information obtained by translating the target word.
Of course, the translation of the target word may also be performed by the English recognition application itself; the embodiments of the present invention place no limitation on this.
Referring to Fig. 4, a flow chart of the steps of a method for training a classification model for English subordinate clauses according to an embodiment of the present invention is shown, which may specifically include the following steps:
Step 401, set English sentences containing English subordinate clauses as training samples.
In the embodiments of the present invention, English subordinate clauses can be collected as training samples for the classification model.
A subordinate clause is defined relative to a main clause: in a complex sentence, the subordinate clause is subordinate to a main clause and cannot stand alone as a sentence, yet it has its own subject and predicate and is introduced by a connective such as that, who, or when.
In English there are three main kinds of subordinate clauses: noun clauses (including subject clauses, object clauses, predicative clauses, and appositive clauses), adjective clauses (i.e., attributive clauses), and adverbial clauses (of time, condition, result, purpose, reason, concession, place, manner, and so on).
Specifically:
Subject clause: a clause that serves as the subject of a complex sentence is called a subject clause.
For example: That he finished writing the composition in such a short time surprised us all. (It surprised us all that he finished writing the composition in such a short time.)
Object clause: a clause that serves as the object of a complex sentence is called an object clause.
For example: Tell him which class you are in. (Tell him the class you are in.)
Predicative clause: a clause that serves as the predicative of a complex sentence is called a predicative clause.
For example: China is no longer what she used to be. (Today's China is no longer the China of the past.)
Appositive clause: a clause that serves as an appositive in a complex sentence is called an appositive clause.
For example: I heard the news that our team had won. (I heard the news of our team's victory.)
Attributive clause: a clause that serves as an attribute in a complex sentence is called an attributive clause.
For example: The dog that/which was lost has been found. (The lost dog has been found.)
Adverbial clause: a clause that serves as an adverbial in a complex sentence is called an adverbial clause.
For example: I will not go to her party if she doesn't invite me. (If she does not invite me, I will not go to her party.)
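The six clause types and example sentences above could be organized as labelled training samples. The following Python sketch shows one possible layout; the names `ClauseType`, `TRAINING_SAMPLES`, and `LABEL_IDS` are hypothetical, and the attributive example uses the "that" variant of the sentence.

```python
from enum import Enum


class ClauseType(Enum):
    SUBJECT = "subject clause"
    OBJECT = "object clause"
    PREDICATIVE = "predicative clause"
    APPOSITIVE = "appositive clause"
    ATTRIBUTIVE = "attributive clause"
    ADVERBIAL = "adverbial clause"


# (sentence, label) pairs taken from the examples in the description.
TRAINING_SAMPLES = [
    ("That he finished writing the composition in such a short time "
     "surprised us all.", ClauseType.SUBJECT),
    ("Tell him which class you are in.", ClauseType.OBJECT),
    ("China is no longer what she used to be.", ClauseType.PREDICATIVE),
    ("I heard the news that our team had won.", ClauseType.APPOSITIVE),
    ("The dog that was lost has been found.", ClauseType.ATTRIBUTIVE),
    ("I will not go to her party if she doesn't invite me.",
     ClauseType.ADVERBIAL),
]

# Integer class ids, as a classifier would typically expect.
LABEL_IDS = {clause_type: i for i, clause_type in enumerate(ClauseType)}
```

A real training set would need many sentences per type; the six pairs here only illustrate the pairing of sentence and label.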
Step 402, convert the training samples into feature text sequences.
In a specific implementation, the features of a training sample (i.e., an English subordinate clause) can be recognized and substituted for the training sample to form a feature text sequence.
In one embodiment of the present invention, step 402 may include the following sub-steps:
Sub-step S4021, recognize the constituent structure of the training sample;
Sub-step S4022, form the feature text sequence from the constituent structure.
In the embodiments of the present invention, the Stanford Parser can be configured in advance; the Stanford Parser is a lexicalized probabilistic context-free grammar (PCFG) parser that also makes use of dependency analysis.
With the Stanford Parser, dependency parsing can be performed on a training sample (i.e., an English subordinate clause) to output the dependency relations of the English sentence.
The Stanford Parser is used for natural language processing and mainly provides the following functions:
1) recognizing and tagging the part of speech of each word in a sentence;
2) creating the grammatical relations (Stanford Dependencies) between pairs of words in a sentence;
3) obtaining the syntactic structure of a sentence.
Furthermore, the Stanford Parser can provide the syntactic parse tree of a sentence, as well as the part of speech and constituent of each word.
For English subordinate clauses, the English words themselves carry little meaning, while the constituent structure of the English sentence is a strong feature; the embodiments of the present invention can therefore extract the strong features and discard the useless ones.
In one example, as shown in Fig. 5, dependency parsing of the English sentence "The boy who is presenting the powerpoint is the most handsome man." with the Stanford Parser converts it into the feature text sequence "ROOT S NP DT NN SBAR WHNP WP S VP VBZ VP VBG NP DT JJ VP VBZ NP DT RBS JJ NN .", where ROOT denotes the sentence of the text to be processed, NP denotes a noun phrase, DT (determiner) denotes a determiner, NN denotes a common noun, and so on.
Besides the Stanford Parser, other means may also be used to recognize the constituent structure of the training sample; the embodiments of the present invention place no limitation on this.
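One way to obtain such a feature text sequence is to read the constituent and part-of-speech labels off a bracketed parse tree in pre-order and drop the words. The Python sketch below does this with the standard library only. The bracketed tree is written by hand so that it mirrors the label sequence quoted above; it is an assumption for illustration, not actual parser output, and the function name `constituent_labels` is hypothetical.

```python
import re
from typing import List


def constituent_labels(bracketed_tree: str) -> List[str]:
    """Return the node labels of a bracketed parse tree in pre-order.

    In bracketed notation every node is written "(LABEL ...)", so a label
    is exactly the token that follows an opening parenthesis; the words
    themselves never follow "(" and are therefore skipped.
    """
    tokens = re.findall(r"\(|\)|[^\s()]+", bracketed_tree)
    labels = []
    for previous, token in zip(tokens, tokens[1:]):
        if previous == "(" and token not in ("(", ")"):
            labels.append(token)
    return labels


# Hand-written tree (assumption), shaped to match the sequence in the text.
TREE = (
    "(ROOT (S (NP (DT The) (NN boy) (SBAR (WHNP (WP who)) (S (VP (VBZ is) "
    "(VP (VBG presenting) (NP (DT the) (JJ powerpoint))))))) "
    "(VP (VBZ is) (NP (DT the) (RBS most) (JJ handsome) (NN man))) (. .)))"
)

feature_text_sequence = " ".join(constituent_labels(TREE))
```

Because the words are discarded, two sentences with the same structure map to the same sequence, which matches the idea that structure rather than vocabulary is the strong feature.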
Step 403, train a classification model for recognizing English subordinate clauses using the feature text sequences.
In practical applications, the feature text sequences can be used for training by a machine learning method, so as to obtain a classification model for recognizing English subordinate clauses.
In one embodiment of the present invention, step 403 may include the following sub-steps:
Sub-step S4031, input the feature text sequences into a convolutional neural network;
Sub-step S4032, in the convolutional neural network, train the classification model for recognizing English subordinate clauses using the feature text sequences, based on the order of the words in the training samples.
A convolutional neural network (CNN) is a feedforward neural network that can extract topological structure from a two-dimensional image; the network structure is optimized, and the unknown parameters of the network are solved, using the back-propagation algorithm.
For natural language processing (NLP), the input to the convolutional neural network is no longer pixels but the feature text sequence represented in a form such as a matrix; this matrix is the equivalent of an "image".
When classifying, the convolutional neural network can take into account the order of the words in the English sentence, and thereby learn the sentence structure of English subordinate clauses.
In a specific implementation, the structure of the convolutional neural network includes a convolutional layer, a down-sampling layer, and a fully connected layer. Each layer has multiple feature maps; each feature map extracts one kind of feature from the input through one convolution filter, and each feature map has multiple neurons.
Convolutional layer: the reason for using a convolutional layer is an important property of the convolution operation, namely that convolution can enhance the features of the original signal and reduce noise.
Down-sampling layer: the reason for down-sampling is that, by the principle of local correlation in images, sub-sampling an image reduces the amount of computation while preserving the rotation invariance of the image.
The main purpose of sampling is to blur the exact position of a feature, because once a feature has been found, its exact position is unimportant; only its position relative to other features is needed. Take the digit "8": once an "o" has been found in the upper part, its exact position in the image need not be known; knowing that there is another "o" below it is enough to tell that the character is an "8", since whether the "8" lies to the left or to the right of the picture does not affect recognition. This strategy of blurring exact positions makes it possible to recognize deformed and distorted pictures.
Fully connected layer: a full connection with softmax is used, and the resulting activation values are the picture features extracted by the convolutional neural network.
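The three layer types can be illustrated with a few lines of plain Python. This is a toy sketch under stated assumptions: one hand-picked filter, a ReLU activation, max pooling as the down-sampling operation, and a bare softmax over two scores; none of these specifics, nor the function names, come from the embodiment.

```python
import math
from typing import List


def conv1d(matrix: List[List[float]],
           kernel: List[List[float]]) -> List[float]:
    """Slide the kernel down the rows of the matrix ("valid" convolution)."""
    window = len(kernel)
    feature_map = []
    for start in range(len(matrix) - window + 1):
        total = 0.0
        for i in range(window):
            for j, value in enumerate(matrix[start + i]):
                total += value * kernel[i][j]
        feature_map.append(max(0.0, total))  # ReLU (an assumption)
    return feature_map


def max_pool(feature_map: List[float], size: int) -> List[float]:
    """Down-sample by keeping the largest value in each block of `size`."""
    return [max(feature_map[i:i + size])
            for i in range(0, len(feature_map), size)]


def softmax(scores: List[float]) -> List[float]:
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Rows are sequence positions, columns are two hypothetical labels.
matrix = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
kernel = [[1.0, 0.0], [0.0, 1.0]]  # fires on "label 0 followed by label 1"
feature_map = conv1d(matrix, kernel)
pooled = max_pool(feature_map, size=2)
probabilities = softmax([pooled[0], 0.0])
```

After pooling, the pattern is reported as present without recording exactly where it occurred, which is the position-blurring effect described above.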
After the convolutional neural network has been constructed, it is solved (trained). Training mainly consists of four steps, divided into two stages:
First stage, forward propagation:
1) take a sample from the sample set and input it into the convolutional neural network;
2) compute the corresponding actual output; in this stage, information is transformed step by step as it passes from the input layer to the output layer.
Second stage, backward propagation:
1) compute the difference between the actual output and the corresponding ideal output;
2) adjust the weight matrices by minimizing the error.
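As an illustration only, the two backward-propagation steps can be written as follows, assuming a squared-error measure and a gradient-descent update with learning rate $\eta$. The description says only that the error is minimized, so both choices are assumptions. Here $y_k$ is the actual output, $t_k$ the corresponding ideal output, and $w_{ij}$ an element of a weight matrix.

```latex
E = \frac{1}{2}\sum_{k}\left(y_k - t_k\right)^2,
\qquad
w_{ij} \leftarrow w_{ij} - \eta\,\frac{\partial E}{\partial w_{ij}}
```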
Furthermore, the training process of the network is as follows:
(1) select the training group: randomly select N samples from the sample set as the training group;
(2) set each weight and threshold to a small random value close to 0, and initialize the precision control parameter and the learning rate;
(3) take an input pattern from the training group, apply it to the network, and give its target output vector;
(4) compute the output vector of the intermediate layer and the actual output vector of the network;
(5) compare the elements of the output vector with the elements of the target vector and compute the output error; the errors of the hidden units of the intermediate layer also need to be computed;
(6) compute in turn the adjustment of each weight and the adjustment of each threshold;
(7) adjust the weights and the thresholds;
(8) after M iterations, judge whether the indicator meets the precision requirement; if not, return to (3) and continue iterating; if so, proceed to the next step;
(9) training ends, and the weights and thresholds are saved in a file. At this point the weights can be considered stable and the classifier formed. For further training, the weights and thresholds are loaded directly from the file, without initialization.
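The control flow of steps (1) to (9) can be sketched in plain Python. To stay short, the sketch trains a single sigmoid unit with a squared-error update instead of a convolutional network, so there is no intermediate layer; the JSON file format, the `max_rounds` safety cap, and all function and parameter names are assumptions made for illustration.

```python
import json
import math
import os
import random
import tempfile


def predict(weights, threshold, x):
    z = sum(w * xi for w, xi in zip(weights, x)) - threshold
    return 1.0 / (1.0 + math.exp(-z))


def train(samples, n, m, precision, learning_rate, path,
          max_rounds=1000, seed=0):
    rng = random.Random(seed)
    group = rng.sample(samples, n)                   # (1) random training group
    if os.path.exists(path):                         # retraining: reuse saved values
        with open(path) as f:
            state = json.load(f)
        weights, threshold = state["weights"], state["threshold"]
    else:                                            # (2) small random values near 0
        weights = [rng.uniform(-0.01, 0.01) for _ in group[0][0]]
        threshold = rng.uniform(-0.01, 0.01)
    error = float("inf")
    for _ in range(max_rounds):
        for _ in range(m):
            x, target = rng.choice(group)            # (3) input and target output
            output = predict(weights, threshold, x)  # (4) actual output
            # (5) error signal for a squared error through the sigmoid
            delta = (output - target) * output * (1.0 - output)
            # (6)-(7) compute and apply the adjustments
            weights = [w - learning_rate * delta * xi
                       for w, xi in zip(weights, x)]
            threshold += learning_rate * delta
        # (8) after M iterations, check the precision requirement
        error = sum((predict(weights, threshold, x) - t) ** 2
                    for x, t in group) / n
        if precision >= error:
            break
    with open(path, "w") as f:                       # (9) save weights and threshold
        json.dump({"weights": weights, "threshold": threshold}, f)
    return weights, threshold, error


samples = [([0.0, 1.0], 1.0), ([1.0, 0.0], 0.0),
           ([0.0, 0.9], 1.0), ([0.9, 0.0], 0.0)]
path = os.path.join(tempfile.mkdtemp(), "model.json")
weights, threshold, error = train(samples, n=4, m=50, precision=0.05,
                                  learning_rate=1.0, path=path)
```

Calling `train` again with the same `path` skips the random initialization and continues from the saved weights and threshold, mirroring the retraining note in step (9).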
Besides convolutional neural networks, other machine learning methods, such as SVM (Support Vector Machine) or AdaBoost, may also be used to train the classification model for recognizing English subordinate clauses; the embodiments of the present invention place no limitation on this.
In the embodiments of the present invention, English sentences containing English subordinate clauses are set as training samples and converted into feature text sequences, and a classification model for recognizing English subordinate clauses is trained using these feature text sequences, so that the types of subordinate clauses contained in an English sentence can be recognized automatically. This enriches the information available about the English sentence and reduces the need for the user to manually consult other materials and compare them against the sentence; it not only saves time and improves efficiency, but also reduces the probability of error when the user has limited knowledge.
Referring to Fig. 6, a flow chart of the steps of a method for recognizing English subordinate clauses based on a classification model according to an embodiment of the present invention is shown, which may specifically include the following steps:
Step 601, determine the English sentence to be recognized.
In a specific implementation, in the interface shown in Fig. 2E, if the user clicks the "clause analysis" control for a given English sentence, that English sentence can be taken as the English sentence to be recognized, so as to recognize its sentence structure and subordinate clause type.
In this case, if the recognition of clause factors (including the subordinate clause type) is performed by a server, the server can receive the English sentence uploaded by the English recognition application as the English sentence to be recognized.
Of course, if the recognition of clause factors (including the subordinate clause type) is performed by the English recognition application, the English recognition application can directly extract the English sentence as the English sentence to be recognized.
In addition to the above, the English sentence to be recognized may also be determined in other ways, for example by the user entering it directly; the embodiments of the present invention place no limitation on this.
Step 602, convert the English sentence into a feature text sequence.
In a specific implementation, the features of the English sentence can be recognized and substituted for the English sentence to form a feature text sequence.
In one embodiment of the present invention, step 602 may include the following sub-steps:
Sub-step S6021, recognize the constituent structure of the English sentence;
Sub-step S6022, form the feature text sequence from the constituent structure.
In the embodiments of the present invention, the Stanford Parser can be configured in advance; the Stanford Parser is a lexicalized probabilistic context-free grammar (PCFG) parser that also makes use of dependency analysis.
With the Stanford Parser, dependency parsing can be performed on the training sample (i.e., the English subordinate clause) to output the dependency relations of the English sentence.
The Stanford Parser is used for natural language processing and mainly provides the following functions:
1) recognizing and tagging the part of speech of each word in a sentence;
2) creating the grammatical relations (Stanford Dependencies) between pairs of words in a sentence;
3) obtaining the syntactic structure of a sentence.
Furthermore, the Stanford Parser can provide the syntactic parse tree of a sentence, as well as the part of speech and constituent of each word.
For English subordinate clauses, the English words themselves carry little meaning, while the constituent structure of the English sentence is a strong feature; the embodiments of the present invention can therefore extract the strong features and discard the useless ones.
In one example, as shown in Fig. 5, dependency parsing of the English sentence "The boy who is presenting the powerpoint is the most handsome man." with the Stanford Parser converts it into the feature text sequence "ROOT S NP DT NN SBAR WHNP WP S VP VBZ VP VBG NP DT JJ VP VBZ NP DT RBS JJ NN .", where ROOT denotes the sentence of the text to be processed, NP denotes a noun phrase, DT (determiner) denotes a determiner, NN denotes a common noun, and so on.
Besides the Stanford Parser, other means may also be used to recognize the constituent structure of the English sentence; the embodiments of the present invention place no limitation on this.
Step 603, input the feature text sequence into a preset classification model to recognize the subordinate clause type contained in the English sentence.
With the embodiments of the present invention, the feature text sequences converted from training samples can be used for training by a machine learning method, so as to obtain the classification model for recognizing English subordinate clauses.
In one embodiment of the present invention, the classification model can be trained as follows:
Sub-step S6031, set English sentences containing English subordinate clauses as training samples;
Sub-step S6032, convert the training samples into feature text sequences;
Sub-step S6033, train the classification model for recognizing English subordinate clauses using the feature text sequences.
In the embodiments of the present invention, since sub-steps S6031, S6032, and S6033 are substantially similar to steps 401, 402, and 403, they are described only briefly here; for relevant details, refer to the descriptions of steps 401, 402, and 403, which are not repeated here.
In a specific implementation, the feature text sequence can be input into the classification model to recognize the subordinate clause type contained in the English sentence.
In one embodiment of the present invention, step 603 may include the following sub-steps:
Sub-step S6034, input the feature text sequence into the classification model trained by a convolutional neural network;
Sub-step S6035, in the classification model, recognize the subordinate clause type contained in the English sentence using the feature text sequence, based on the order of the words in the English sentence.
In the embodiments of the present invention, the classification model is trained on the basis of a convolutional neural network.
When classifying, the convolutional neural network can take into account the order of the words in the English sentence and thereby learn the sentence structure of English subordinate clauses, so as to recognize the subordinate clause type contained in the English sentence.
In the embodiments of the present invention, the English sentence is converted into a feature text sequence and input into a preset classification model to recognize the subordinate clause type contained in the English sentence. The types of subordinate clauses contained in an English sentence are thus recognized automatically, which enriches the information available about the English sentence and reduces the need for the user to manually consult other materials and compare them against the sentence; this not only saves time and improves efficiency, but also reduces the probability of error when the user has limited knowledge.
For simplicity of description, the method embodiments are each expressed as a series of combined actions, but those skilled in the art should understand that the embodiments of the present invention are not limited by the order of actions described, because according to the embodiments of the present invention some steps may be performed in other orders or simultaneously. In addition, those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and the actions involved are not necessarily required by the embodiments of the present invention.
Referring to Fig. 7, a structural block diagram of a device for recognizing English information according to an embodiment of the present invention is shown, which may specifically include the following modules:
A target image data selection module 701, adapted to select target image data;
A sentence splitting module 702, adapted to recognize English information from the target image data and split out one or more English sentences;
A sentence attribute recognition module 703, adapted to split the English sentence into clickable interactive elements, one per word, and to recognize the clause factors of the English sentence.
In one embodiment of the present invention, the target image data selection module 701 includes:
A preview image data acquisition submodule, adapted to invoke a camera to capture preview image data;
A preview frame loading submodule, adapted to load a preview frame on the preview image data;
A preview image data extraction submodule, adapted to extract the preview image data within the preview frame as the target image data;
And/or,
An image data import submodule, adapted to import locally stored image data as the target image data.
In one embodiment of the present invention, the sentence splitting module 702 includes:
A target image data sending submodule, adapted to send the target image data to a server;
A split information receiving submodule, adapted to receive, from the server, the English information recognized from the target image data by optical character recognition, and the one or more English sentences split out from the English information.
In one embodiment of the present invention, the sentence attribute recognition module 703 includes:
An English sentence sending submodule, adapted to send the English sentence to a server;
A sentence attribute receiving submodule, adapted to receive, from the server, each word split out from the English sentence, together with one or more of the sentence structure, the subordinate clause type, the sentence tense, and the part of speech of each word in the English sentence, as recognized from the English sentence;
A clickable interactive element is generated for each word.
Referring to Fig. 8, a structural block diagram of another device for recognizing English information according to an embodiment of the present invention is shown, which may specifically include the following modules:
A target image data selection module 801, adapted to select target image data;
A sentence splitting module 802, adapted to recognize English information from the target image data and split out one or more English sentences;
A sentence attribute recognition module 803, adapted to split the English sentence into clickable interactive elements, one per word, and to recognize the clause factors of the English sentence;
A target English sentence selection module 804, adapted to select one or more target English sentences from the one or more English sentences;
A target English sentence translation module 805, adapted to translate the one or more target English sentences to obtain target-language information;
A target word selection module 806, adapted to select a target word from the words of the English sentence based on the interactive elements;
A target word translation module 807, adapted to translate the target word to obtain target-language information.
In one embodiment of the present invention, the target English sentence translation module 805 includes:
A target English sentence sending submodule, adapted to send the one or more target English sentences to a server;
A target English sentence translation information receiving submodule, adapted to receive, from the server, the target-language information obtained by translating the one or more target English sentences.
In one embodiment of the present invention, the target word translation module 807 includes:
A target word sending submodule, adapted to send the target word to a server;
A target word translation information receiving submodule, adapted to receive, from the server, the target-language information obtained by translating the target word.
Referring to Fig. 9, a structural block diagram of a device for training a classification model for English subordinate clauses according to an embodiment of the present invention is shown, which may specifically include the following modules:
A training sample setting module 901, adapted to set English sentences containing English subordinate clauses as training samples;
A training sample conversion module 902, adapted to convert the training samples into feature text sequences;
A classification model training module 903, adapted to train a classification model for recognizing English subordinate clauses using the feature text sequences.
In one embodiment of the present invention, the training sample conversion module 902 includes:
A sample structure recognition submodule, adapted to recognize the constituent structure of the training sample;
A sample feature forming submodule, adapted to form the feature text sequence from the constituent structure.
In one embodiment of the present invention, the classification model training module 903 includes:
A convolutional neural network input submodule, adapted to input the feature text sequences into a convolutional neural network;
A convolutional neural network training submodule, adapted to train, in the convolutional neural network, the classification model for recognizing English subordinate clauses using the feature text sequences, based on the order of the words in the training samples.
Referring to Fig. 10, a structural block diagram of a device for recognizing English subordinate clauses based on a classification model according to an embodiment of the present invention is shown, which may specifically include the following modules:
An English sentence determination module 1001, adapted to determine the English sentence to be recognized;
An English sentence conversion module 1002, adapted to convert the English sentence into a feature text sequence;
A subordinate clause type recognition module 1003, adapted to input the feature text sequence into a preset classification model to recognize the subordinate clause type contained in the English sentence.
In one embodiment of the present invention, the English sentence conversion module 1002 includes:
A sentence structure recognition submodule, adapted to recognize the constituent structure of the English sentence;
A sentence feature forming submodule, adapted to form the feature text sequence from the constituent structure.
In one embodiment of the present invention, the subordinate clause type recognition module 1003 includes:
A classification model input submodule, adapted to input the feature text sequence into the classification model trained by a convolutional neural network;
A classification model recognition submodule, adapted to recognize, in the classification model, the subordinate clause type contained in the English sentence using the feature text sequence, based on the order of the words in the English sentence.
Since the device embodiments are substantially similar to the method embodiments, they are described only briefly; for relevant details, refer to the corresponding parts of the description of the method embodiments.
The algorithms and displays provided herein are not inherently related to any particular computer, virtual system, or other device. Various general-purpose systems may also be used with the teachings herein. From the description above, the structure required to construct such systems is apparent. Moreover, the present invention is not directed to any particular programming language. It should be understood that the content of the invention described herein may be implemented in a variety of programming languages, and that the description of a specific language above is given to disclose the best mode of the invention.
Numerous specific details are set forth in the specification provided herein. It will be understood, however, that embodiments of the present invention may be practiced without these specific details. In some instances, well-known methods, structures, and techniques have not been shown in detail so as not to obscure the understanding of this specification.
Similarly, it should be understood that, in order to streamline the disclosure and aid understanding of one or more of the various inventive aspects, the features of the present invention are sometimes grouped together in a single embodiment, figure, or description thereof in the above description of exemplary embodiments of the invention. However, this method of disclosure should not be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in fewer than all features of a single embodiment disclosed above. Therefore, the claims following the detailed description are hereby expressly incorporated into the detailed description, with each claim standing on its own as a separate embodiment of the present invention.
Those skilled in the art will appreciate that the modules in the devices of an embodiment can be adaptively changed and arranged in one or more devices different from that embodiment. The modules or units or components of the embodiments may be combined into one module or unit or component, and may furthermore be divided into multiple submodules or subunits or subcomponents. Except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract, and drawings), and all processes or units of any method or device so disclosed, may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract, and drawings) may be replaced by an alternative feature serving the same, an equivalent, or a similar purpose.
Furthermore, those skilled in the art will appreciate that although some embodiments described herein include certain features that are included in other embodiments but not other features, combinations of features of different embodiments are meant to be within the scope of the invention and to form different embodiments. For example, in the following claims, any of the claimed embodiments may be used in any combination.
The various component embodiments of the present invention may be implemented in hardware, in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will understand that a microprocessor or a digital signal processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components of the device for training a classification model for English subordinate clauses and of the device for recognizing English subordinate clauses based on a classification model according to embodiments of the present invention.
The present invention may also be implemented as a device or apparatus program (for example, a computer program and a computer program product) for performing part or all of the method described herein. Such a program implementing the present invention may be stored on a computer-readable medium, or may take the form of one or more signals. Such signals may be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.
It should be noted that the above embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In a unit claim enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, and third does not indicate any order. These words may be interpreted as names.
Claims (10)
1. A method for training a classification model for English subordinate clauses, comprising:
setting an English sentence containing an English subordinate clause as a training sample;
converting the training sample into a feature text sequence;
training, using the feature text sequence, a classification model for recognizing English subordinate clauses.
2. The method of claim 1, characterized in that the step of converting the training sample into a feature text sequence comprises:
recognizing the constituent structure of the training sample;
forming the feature text sequence using the constituent structure.
3. The method of claim 1 or 2, characterized in that the step of training, using the feature text sequence, a classification model for recognizing English subordinate clauses comprises:
inputting the feature text sequence into a convolutional neural network;
training, in the convolutional neural network and based on the order of the words in the training sample, the classification model for recognizing English subordinate clauses using the feature text sequence.
4. A method for recognizing English subordinate clauses based on a classification model, comprising:
determining an English sentence to be recognized;
converting the English sentence into a feature text sequence;
inputting the feature text sequence into a preset classification model to recognize the type of subordinate clause that the English sentence contains.
5. The method of claim 4, characterized in that the step of converting the English sentence into a feature text sequence comprises:
recognizing the constituent structure of the English sentence;
forming the feature text sequence using the constituent structure.
6. The method of claim 4 or 5, characterized in that the step of inputting the feature text sequence into a preset classification model to recognize the type of subordinate clause that the English sentence contains comprises:
inputting the feature text sequence into a classification model trained by a convolutional neural network;
recognizing, in the classification model and based on the order of the words in the English sentence, the type of subordinate clause that the English sentence contains using the feature text sequence.
7. A device for training a classification model for English subordinate clauses, comprising:
a training sample setting module, adapted to set an English sentence containing an English subordinate clause as a training sample;
a training sample conversion module, adapted to convert the training sample into a feature text sequence;
a classification model training module, adapted to train, using the feature text sequence, a classification model for recognizing English subordinate clauses.
8. The device of claim 7, characterized in that the training sample conversion module comprises:
a sample structure recognition submodule, adapted to recognize the constituent structure of the training sample;
a sample feature forming submodule, adapted to form the feature text sequence using the constituent structure.
9. The device of claim 7 or 8, characterized in that the classification model training module comprises:
a convolutional neural network input submodule, adapted to input the feature text sequence into a convolutional neural network;
a convolutional neural network training submodule, adapted to train, in the convolutional neural network and based on the order of the words in the training sample, the classification model for recognizing English subordinate clauses using the feature text sequence.
10. A device for recognizing English subordinate clauses based on a classification model, comprising:
an English sentence determining module, adapted to determine an English sentence to be recognized;
an English sentence conversion module, adapted to convert the English sentence into a feature text sequence;
a subordinate clause type recognition module, adapted to input the feature text sequence into a preset classification model to recognize the type of subordinate clause that the English sentence contains.
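The flow recited in the method claims (convert a sentence into a feature text sequence, train a model on it with word order preserved, then recognize the clause type of a new sentence) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the `TAGS` lexicon stands in for a real constituent-structure analyzer, the labels `relative` and `adverbial` are example clause types, and a small perceptron over order-sensitive windows is swapped in for the convolutional neural network so that the sketch needs only the standard library. It is not the patented implementation.

```python
from collections import defaultdict

# Hypothetical lexicon standing in for a constituent-structure analyzer.
# Tags and entries are illustrative assumptions, not from the patent.
TAGS = {
    "who": "REL", "which": "REL", "that": "REL",
    "because": "SUB", "although": "SUB", "when": "SUB",
    "is": "V", "was": "V", "won": "V", "left": "V",
    "stayed": "V", "rained": "V",
    "the": "DET", "a": "DET",
}

def to_feature_sequence(sentence):
    """Map each word to a coarse structural tag, keeping word order."""
    words = sentence.lower().replace(",", " ").replace(".", " ").split()
    return [TAGS.get(word, "N") for word in words]

def window_features(seq, width=2):
    """Collect adjacent-tag windows, similar to what a 1-D convolution slides over."""
    padded = ["<s>"] + seq + ["</s>"]
    return {" ".join(padded[i:i + width]) for i in range(len(padded) - width + 1)}

def score(weights, label, feats):
    return sum(weights[label].get(f, 0.0) for f in feats)

def predict(weights, feats):
    return max(sorted(weights), key=lambda label: score(weights, label, feats))

def train(samples, epochs=5):
    """Multiclass perceptron, used here as a simple stand-in for the CNN."""
    labels = sorted({gold for _, gold in samples})
    weights = {label: defaultdict(float) for label in labels}
    for _ in range(epochs):
        for sentence, gold in samples:
            feats = window_features(to_feature_sequence(sentence))
            guess = predict(weights, feats)
            if guess != gold:
                # Move weight toward the correct label, away from the wrong one.
                for f in feats:
                    weights[gold][f] += 1.0
                    weights[guess][f] -= 1.0
    return weights

def recognize(weights, sentence):
    """Convert the sentence to a feature sequence and classify it."""
    return predict(weights, window_features(to_feature_sequence(sentence)))

# Toy training samples; sentences and labels are made up for the sketch.
SAMPLES = [
    ("The man who won left.", "relative"),
    ("She stayed because it rained.", "adverbial"),
]

model = train(SAMPLES)
print(recognize(model, "A girl who left was happy."))
```

The windows over adjacent tags are what make the sketch sensitive to word order, which is the property the claims attribute to the convolutional neural network; a real system would replace both the lexicon and the perceptron.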
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611250331.6A CN106649294B (en) | 2016-12-29 | 2016-12-29 | Classification model training and clause recognition method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106649294A (en) | 2017-05-10 |
CN106649294B (en) | 2020-11-06 |
Family
ID=58836645
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611250331.6A, Active, CN106649294B (en) | 2016-12-29 | 2016-12-29 | Classification model training and clause recognition method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106649294B (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6295529B1 (en) * | 1998-12-24 | 2001-09-25 | Microsoft Corporation | Method and apparatus for indentifying clauses having predetermined characteristics indicative of usefulness in determining relationships between different texts |
CN101339617A (en) * | 2007-07-06 | 2009-01-07 | 上海思必得通讯技术有限公司 | Mobile phones photographing and translation device |
US20090076796A1 (en) * | 2007-09-18 | 2009-03-19 | Ariadne Genomics, Inc. | Natural language processing method |
US20110213610A1 (en) * | 2010-03-01 | 2011-09-01 | Lei Chen | Processor Implemented Systems and Methods for Measuring Syntactic Complexity on Spontaneous Non-Native Speech Data by Using Structural Event Detection |
US20140136188A1 (en) * | 2012-11-02 | 2014-05-15 | Fido Labs Inc. | Natural language processing system and method |
US9135633B2 (en) * | 2009-05-18 | 2015-09-15 | Strategyn Holdings, Llc | Needs-based mapping and processing engine |
Non-Patent Citations (1)
Title |
---|
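王倩 等 (Wang Qian et al.): "基于谓词及句义类型块的汉语句义类型识别" (Chinese sentence-meaning type recognition based on predicates and sentence-meaning type chunks), 《中文信息学报》 (Journal of Chinese Information Processing) * |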
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109086272A (en) * | 2018-08-01 | 2018-12-25 | 浙江蓝鸽科技有限公司 | Sentence pattern recognition methods and its system |
CN109086272B (en) * | 2018-08-01 | 2023-02-17 | 浙江蓝鸽科技有限公司 | Sentence pattern recognition method and system |
CN109799977A (en) * | 2019-01-25 | 2019-05-24 | 西安电子科技大学 | The method and system of instruction repertorie exploitation scheduling data |
WO2021207939A1 (en) * | 2020-04-14 | 2021-10-21 | 深圳市欢太数字科技有限公司 | Sentence pattern mining method and apparatus, electronic device, and storage medium |
CN112559552A (en) * | 2020-12-03 | 2021-03-26 | 北京百度网讯科技有限公司 | Data pair generation method and device, electronic equipment and storage medium |
CN112559552B (en) * | 2020-12-03 | 2023-07-25 | 北京百度网讯科技有限公司 | Data pair generation method and device, electronic equipment and storage medium |
US11748340B2 (en) | 2020-12-03 | 2023-09-05 | Beijing Baidu Netcom Science And Technology Co., Ltd. | Data pair generating method, apparatus, electronic device and storage medium |
CN114627482A (en) * | 2022-05-16 | 2022-06-14 | 四川升拓检测技术股份有限公司 | Method and system for realizing table digital processing based on image processing and character recognition |
Also Published As
Publication number | Publication date |
---|---|
CN106649294B (en) | 2020-11-06 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Jurgens et al. | Incorporating dialectal variability for socially equitable language identification | |
US11409945B2 (en) | Natural language processing using context-specific word vectors | |
CN110852087B (en) | Chinese error correction method and device, storage medium and electronic device | |
CN106649294A (en) | Training of classification models and method and device for recognizing subordinate clauses of classification models | |
US11250842B2 (en) | Multi-dimensional parsing method and system for natural language processing | |
Xu et al. | Improve Chinese word embeddings by exploiting internal structure | |
CN107818085B (en) | Answer selection method and system for reading understanding of reading robot | |
CN112036162B (en) | Text error correction adaptation method and device, electronic equipment and storage medium | |
CN111310440B (en) | Text error correction method, device and system | |
CN107204184A (en) | Audio recognition method and system | |
CN107193807A (en) | Language conversion processing method, device and terminal based on artificial intelligence | |
CN107798123B (en) | Knowledge base and establishing, modifying and intelligent question and answer methods, devices and equipment thereof | |
CN104008126A (en) | Method and device for segmentation on basis of webpage content classification | |
CN107679032A (en) | Voice changes error correction method and device | |
US20170060826A1 (en) | Automatic Sentence And Clause Level Topic Extraction And Text Summarization | |
CN107844473B (en) | Word sense disambiguation method based on context similarity calculation | |
Rozovskaya et al. | Building a state-of-the-art grammatical error correction system | |
WO2019229768A1 (en) | A bot engine for automatic dynamic intent computation | |
CN110263154A (en) | A kind of network public-opinion emotion situation quantization method, system and storage medium | |
CN114757176A (en) | Method for obtaining target intention recognition model and intention recognition method | |
CN112035506A (en) | Semantic recognition method and equipment | |
CN116628328A (en) | Web API recommendation method and device based on functional semantics and structural interaction | |
US20180203855A1 (en) | System for creating interactive media and method of operating the same | |
CN106855854A (en) | A kind of recognition methods of english information and device | |
CN114970666B (en) | Spoken language processing method and device, electronic equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |