Summary of the Invention
The embodiments of the present application provide a user language assessment method and system, which solve the problem in the prior art that a user learning a language with a language repeater or a point reader cannot effectively grasp his or her own learning progress, so that the user's language learning efficiency is low.
A user language assessment method provided by the embodiments of the present application includes:
Identifying whether a current text is a target text;
When the current text is identified as a target text, receiving a follow-read operation from the user;
Determining the first follow-read audio to be played that corresponds to the follow-read operation;
Playing the first follow-read audio;
Collecting second follow-read audio that the user repeats after the first follow-read audio;
Assessing the user's second follow-read audio.
Preferably, a current text image is obtained; image features of the current text image are extracted; a pre-stored image feature database is searched for the extracted image features; if they are found, the current text is identified as a target text; if not, the current text is identified as a non-target text.
Preferably, the image features of the current text image are extracted by a convolutional neural network algorithm, by a recurrent neural network algorithm, or by the scale-invariant feature transform (SIFT) algorithm.
Preferably, an image of the text page to be read in the target text is obtained; image features of that page image are extracted; the page number of the text page to be read is determined in a page number feature database according to those image features; the sentence markup text corresponding to the page number is determined in a sentence markup text database according to the page number; the first follow-read audio is obtained according to the page number and its sentence markup text; and the obtained first follow-read audio is determined as the first follow-read audio to be played that corresponds to the follow-read operation.
Preferably, each word audio in the user's second follow-read audio is extracted in turn; for each word audio, in the order in which the word audios were extracted, each phoneme audio in that word audio is extracted in turn; for each phoneme audio, in the order in which the phoneme audios were extracted, the standard phoneme audio corresponding to that phoneme audio is determined, and the phoneme audio is compared with the standard phoneme audio to determine a first score value of the phoneme audio; for any word audio, the sum of the first score values of all phoneme audios it contains is taken as the second score value of the word audio; the sum of the second score values of all word audios contained in the user's second follow-read audio is taken as the assessment value of the second follow-read audio; and the user's second follow-read audio is assessed according to that assessment value.
Preferably, according to the extraction order of the word audio containing the phoneme audio, the word corresponding to the word audio is extracted from the sentence markup text; according to the extraction order of the phoneme audio, the phonetic symbol corresponding to the phoneme audio is extracted from that word; and according to the extracted phonetic symbol, the standard phoneme audio corresponding to the phonetic symbol is determined in a standard phoneme audio database.
Preferably, it is judged whether the assessment value of the user's second follow-read audio is greater than a preset threshold; if so, the second follow-read audio is qualified and the user is prompted; if not, the second follow-read audio is unqualified and the second follow-read audio is played repeatedly.
A user language assessment system provided by the embodiments of the present application includes:
A central processing unit, configured to identify whether a current text is a target text;
An image feedback device, configured to receive a follow-read operation from the user when the central processing unit identifies the current text as a target text;
The central processing unit, further configured to determine the first follow-read audio to be played that corresponds to the follow-read operation;
A loudspeaker, configured to play the first follow-read audio;
A microphone, configured to collect second follow-read audio that the user repeats after the first follow-read audio;
A cloud server, configured to assess the user's second follow-read audio.
Preferably, the system further includes:
A camera, configured to scan the current text image;
The central processing unit is specifically configured to obtain the current text image, extract image features of the current text image, and search a pre-stored image feature database for the extracted image features; if they are found, the current text is identified as a target text; if not, the current text is identified as a non-target text.
Preferably, the central processing unit is further configured to extract the image features of the current text image by a convolutional neural network algorithm, by a recurrent neural network algorithm, or by the scale-invariant feature transform (SIFT) algorithm.
Preferably, the system further includes:
A camera, configured to scan an image of the text page to be read in the target text;
The central processing unit is specifically configured to obtain the image of the text page to be read in the target text; extract image features of that page image; determine, in a page number feature database, the page number of the text page to be read according to those image features; determine, in a sentence markup text database, the sentence markup text corresponding to the page number; obtain the first follow-read audio according to the page number and its sentence markup text; and determine the obtained first follow-read audio as the first follow-read audio to be played that corresponds to the follow-read operation.
Preferably, the cloud server is specifically configured to extract each word audio in the user's second follow-read audio in turn; for each word audio, in the order in which the word audios were extracted, extract each phoneme audio in that word audio in turn; for each phoneme audio, in the order in which the phoneme audios were extracted, determine the standard phoneme audio corresponding to that phoneme audio and compare the phoneme audio with the standard phoneme audio to determine a first score value of the phoneme audio; for any word audio, take the sum of the first score values of all phoneme audios it contains as the second score value of the word audio; take the sum of the second score values of all word audios contained in the user's second follow-read audio as the assessment value of the second follow-read audio; and assess the user's second follow-read audio according to that assessment value.
Preferably, the cloud server is further configured to extract, according to the extraction order of the word audio containing the phoneme audio, the word corresponding to the word audio from the sentence markup text; extract, according to the extraction order of the phoneme audio, the phonetic symbol corresponding to the phoneme audio from that word; and determine, according to the extracted phonetic symbol, the standard phoneme audio corresponding to the phonetic symbol in a standard phoneme audio database.
Preferably, the cloud server is specifically configured to judge whether the assessment value of the user's second follow-read audio is greater than a preset threshold; if so, the second follow-read audio is qualified, and the user is prompted through the loudspeaker and the image feedback device; if not, the second follow-read audio is unqualified, and the second follow-read audio is played repeatedly through the loudspeaker.
The embodiments of the present application provide a user language assessment method and system. The method includes: identifying whether a current text is a target text; when the current text is identified as a target text, receiving a follow-read operation from the user; determining the first follow-read audio to be played that corresponds to the follow-read operation; playing the first follow-read audio; collecting second follow-read audio that the user repeats after the first follow-read audio; and assessing the user's second follow-read audio. With this method, the user's follow-reading can be assessed while the user is following along, so that the user can effectively grasp his or her own language learning progress, which improves the efficiency with which the user learns the language.
Detailed Description
To make the purpose, technical solutions, and advantages of the present application clearer, the technical solutions of the present application are described clearly and completely below with reference to specific embodiments of the present application and the corresponding drawings. Obviously, the described embodiments are only some of the embodiments of the present application, not all of them. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present application without creative effort shall fall within the protection scope of the present application.
Fig. 1 shows the user language assessment process provided by the embodiments of the present application, which specifically includes the following steps:
S101: Identify whether the current text is a target text.
In practice, people often use language learning devices to improve their learning efficiency.
Further, when a user learns a language with the system of the present application shown in Fig. 2, the system first identifies whether the current text is a target text.
It should be noted that, in the present application, a text may be a complete book or a text consisting of only a few separate pages. In addition, the user needs to place the text to be learned in a specified area so that the system can identify it; the text that the user currently needs to learn and has placed in the specified area is defined as the current text.
It should also be noted that, in the present application, the system can only identify, and help the user learn, texts that the system supports; texts that the system does not support cannot be identified or learned. Which texts are supported can be set according to actual needs. In the present application, a text that the system can identify and help the user learn is defined as a target text.
Further, the present application gives an implementation for identifying whether the current text is a target text, as follows: a current text image is obtained; image features of the current text image are extracted; a pre-stored image feature database is searched for the extracted image features; if they are found, the current text is identified as a target text; if not, the current text is identified as a non-target text.
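The lookup described above can be sketched as follows. This is only an illustrative sketch: the text says the database is searched for the extracted features but does not say how a match is decided, so the cosine-similarity test, the `min_similarity` cutoff, and the names `is_target_text` and `feature_db` are all assumptions.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def is_target_text(feature, feature_db, min_similarity=0.95):
    """Return True if any stored feature is close enough to `feature`.

    Assumption: "the feature exists in the database" is approximated by a
    similarity cutoff; the source does not specify the matching rule.
    """
    return any(
        cosine_similarity(feature, stored) >= min_similarity
        for stored in feature_db.values()
    )

# Hypothetical pre-stored features of two supported texts.
feature_db = {"book_a": [0.9, 0.1, 0.3], "book_b": [0.2, 0.8, 0.5]}

supported = is_target_text([0.9, 0.1, 0.3], feature_db)    # matches book_a
unsupported = is_target_text([0.0, 0.0, 1.0], feature_db)  # matches nothing
```

An exact-match lookup would be simpler, but features extracted from camera images rarely repeat bit for bit, which is why a tolerance is assumed here.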
It should be noted that, in the present application, the current text image refers to the cover of the text, and it can be obtained by scanning with a camera. Of course, in practice, the current text image that the system scans and identifies may be set in advance to another page of the text; the specific setting can be determined according to actual needs. In addition, when extracting the image features of the current text image, a convolutional neural network algorithm can be used. Specifically, the current text image is input into the convolutional neural network as three RGB channels; the network applies convolution to the image and then pooling to the convolved result, and repeats this convolution-then-pooling sequence several times until local features are extracted. Finally, the extracted local features are passed through multiple fully connected layers to compute a global feature, and this global feature is the image feature of the current text image.
It should also be noted that, in the present application, the image features of the current text image can be extracted not only by a convolutional neural network algorithm but also by a recurrent neural network algorithm or by the scale-invariant feature transform (SIFT) algorithm; these two algorithms are not described in detail here.
S102: When the current text is identified as a target text, receive a follow-read operation from the user.
In the present application, after the current text is identified as a target text, the user can be prompted that the current text supports follow-read mode, and the user can input or click a follow-read operation in the system shown in Fig. 2 according to his or her learning needs. For example, if the user wants to learn the language content on page 3 of the target text (that is, the current text), the user can turn the target text to page 3 and input or click a follow-read operation in the system shown in Fig. 2. After receiving the follow-read operation, the system performs step S103. If the current text is identified as a non-target text, the user is prompted that the current text does not support follow-read mode.
It should be noted that follow-read mode means that the system plays the speech for the current text and the user repeats it.
S103: Determine the first follow-read audio to be played that corresponds to the follow-read operation.
Further, after receiving the user's follow-read operation, the system shown in Fig. 2 needs to determine what the follow-read audio is for the text page that the user has currently turned to. Therefore, in the present application, the first follow-read audio to be played that corresponds to the follow-read operation needs to be determined.
It should be noted that, to better distinguish the follow-read audio played by the system from the follow-read audio that the user repeats after it, the present application defines the follow-read audio played by the system as the first follow-read audio and the follow-read audio that the user repeats after the system's audio as the second follow-read audio.
Further, the present application gives a specific implementation for determining the first follow-read audio to be played that corresponds to the follow-read operation, as follows:
An image of the text page to be read in the target text is obtained; image features of that page image are extracted; the page number of the text page to be read is determined in the page number feature database according to those image features; the sentence markup text corresponding to the page number is determined in the sentence markup text database according to the page number; the first follow-read audio is obtained according to the page number and its sentence markup text; and the obtained first follow-read audio is determined as the first follow-read audio to be played that corresponds to the follow-read operation.
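The chain of lookups above (page image features, then page number, then markup text and audio) might be sketched as below. The nearest-feature rule and the dictionary-based databases are assumptions made for illustration; the names `find_page_number`, `page_feature_db`, `markup_db`, and `audio_db` are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def find_page_number(page_feature, page_feature_db):
    """Return the page whose stored feature is nearest to `page_feature`.

    Assumption: the page number feature database maps each page number
    to one stored feature vector.
    """
    return min(
        page_feature_db,
        key=lambda page: squared_distance(page_feature, page_feature_db[page]),
    )

def first_follow_read_audio(page_feature, page_feature_db, markup_db, audio_db):
    """Resolve page number, markup text, and audio for the scanned page."""
    page = find_page_number(page_feature, page_feature_db)
    return page, markup_db[page], audio_db[page]

# Hypothetical databases for a three-page target text.
page_feature_db = {1: [0.1, 0.9], 2: [0.8, 0.2], 3: [0.5, 0.5]}
markup_db = {1: "page1.markup", 2: "page2.markup", 3: "page3.markup"}
audio_db = {1: "page1.wav", 2: "page2.wav", 3: "page3.wav"}

result = first_follow_read_audio([0.52, 0.47], page_feature_db, markup_db, audio_db)
```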
It should be noted that, in the present application, the text page to be read is a page in the target text, namely the page that the user has currently turned to and wishes to learn. The image features of the image of the text page to be read are extracted in the same way as the image features of the current text image described above. The page number feature database stores the page numbers of the text pages contained in the supported texts. The sentence markup text database stores sentence markup texts. Follow-reading usually uses a one-sentence mode, that is, one sentence is played and the user repeats that sentence; therefore, the sentence markup text records the time at which each sentence occurs in a given audio. The sentence markup text can be defined in various formats according to the actual situation; the present application gives one such format, shown in Table 1:
Time of occurrence | Sentence
00:00:00.000-->00:00:10.000 | Good morning
00:00:11.000-->00:00:15.000 | Nice to meet you
00:00:16.000-->00:00:25.000 | I'm very happy to meet you
Table 1
In Table 1, the time of occurrence entry gives the time span, and the sentence entry gives the content of the audio in that span.
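A sentence markup text in the Table 1 format could be parsed roughly as follows. The sketch assumes the markup is stored as alternating lines (a time range, then its sentence); that storage layout, and the names `MarkedSentence` and `parse_markup`, are assumptions rather than anything the application specifies.

```python
import re
from dataclasses import dataclass

# Matches e.g. "00:00:11.000-->00:00:15.000".
TIME_RANGE = re.compile(
    r"(\d{2}):(\d{2}):(\d{2})\.(\d{3})-->(\d{2}):(\d{2}):(\d{2})\.(\d{3})"
)

@dataclass
class MarkedSentence:
    start: float  # seconds from the beginning of the audio
    end: float
    text: str

def to_seconds(hours, minutes, seconds, millis):
    return int(hours) * 3600 + int(minutes) * 60 + int(seconds) + int(millis) / 1000

def parse_markup(markup):
    """Parse alternating 'time range' / 'sentence' lines into entries."""
    lines = [line.strip() for line in markup.splitlines() if line.strip()]
    entries = []
    for time_line, text_line in zip(lines[0::2], lines[1::2]):
        match = TIME_RANGE.fullmatch(time_line)
        if match is None:
            raise ValueError(f"bad time range: {time_line!r}")
        groups = match.groups()
        entries.append(
            MarkedSentence(to_seconds(*groups[:4]), to_seconds(*groups[4:]), text_line)
        )
    return entries

TABLE_1 = """\
00:00:00.000-->00:00:10.000
Good morning
00:00:11.000-->00:00:15.000
Nice to meet you
00:00:16.000-->00:00:25.000
I'm very happy to meet you
"""

entries = parse_markup(TABLE_1)
```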
It should also be noted that the present application may also use a multi-sentence follow-read mode, that is, several sentences are played at once and the user repeats them all at once. How many sentences to use can be determined according to the situation and is not specifically limited in the present application. When the multi-sentence mode is used, the sentence markup text likewise records the overall time at which each group of sentences occurs in a given audio, for example as shown in Table 2:
Time of occurrence | Sentence
00:00:00.000-->00:00:15.000 | Good morning, nice to meet you
00:00:16.000-->00:00:25.000 | I'm very happy to meet you, thank you
Table 2
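One way to derive multi-sentence entries from single-sentence ones is to merge consecutive entries so that each group spans from the start of its first sentence to the end of its last. This is a sketch under that assumption; the application does not say how the grouped markup is produced, and `group_sentences` is a hypothetical helper.

```python
def group_sentences(entries, group_size):
    """Merge consecutive (start, end, text) entries into groups.

    Each group runs from the start of its first sentence to the end of
    its last sentence; the last group may be shorter than `group_size`.
    """
    groups = []
    for i in range(0, len(entries), group_size):
        chunk = entries[i:i + group_size]
        start = chunk[0][0]
        end = chunk[-1][1]
        text = ", ".join(sentence for _, _, sentence in chunk)
        groups.append((start, end, text))
    return groups

# Hypothetical single-sentence entries, times in seconds.
single = [
    (0.0, 10.0, "Good morning"),
    (11.0, 15.0, "nice to meet you"),
    (16.0, 25.0, "I'm very happy to meet you"),
]
grouped = group_sentences(single, group_size=2)
```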
In addition, the first follow-read audio is obtained according to the page number and its sentence markup text. Specifically, a playback controller can use the page number to find the first follow-read audio corresponding to the sentences on that page, and then use the time of occurrence of each sentence in the sentence markup text to determine the first follow-read audio corresponding to the sentence that currently needs to be played.
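As a rough illustration of how a playback controller might turn a sentence's time of occurrence into the audio to play, the sketch below slices a list of samples by start and end time. The in-memory sample list, the `sample_rate` parameter, and the function name are assumptions; the application does not describe the controller's internals.

```python
def clip_for_sentence(samples, sample_rate, start_s, end_s):
    """Return the samples between start_s and end_s (in seconds)."""
    first = int(round(start_s * sample_rate))
    last = int(round(end_s * sample_rate))
    return samples[first:last]

# Hypothetical 10-second recording at 10 samples per second.
samples = list(range(100))
clip = clip_for_sentence(samples, sample_rate=10, start_s=2.0, end_s=3.5)
```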
S104: Play the first follow-read audio.
S105: Collect the second follow-read audio that the user repeats after the first follow-read audio.
S106: Assess the user's second follow-read audio.
Further, so that the user can clearly grasp his or her own language learning progress, in the present application the user's second follow-read audio needs to be assessed after it has been collected.
Further, the present application gives a specific implementation for assessing the user's second follow-read audio, as follows:
Each word audio in the user's second follow-read audio is extracted in turn. For each word audio, in the order in which the word audios were extracted, each phoneme audio in that word audio is extracted in turn. For each phoneme audio, in the order in which the phoneme audios were extracted, the standard phoneme audio corresponding to that phoneme audio is determined, and the phoneme audio is compared with the standard phoneme audio to determine the first score value of the phoneme audio. For any word audio, the sum of the first score values of all phoneme audios it contains is taken as the second score value of the word audio. The sum of the second score values of all word audios contained in the user's second follow-read audio is taken as the assessment value of the second follow-read audio, and the user's second follow-read audio is assessed according to that assessment value.
For example, assume that the first follow-read audio obtained according to the page number and its sentence markup text is "good luck", so the second follow-read audio that the user repeats after the first follow-read audio is the audio of "good luck". The word audios extracted in turn from the user's second follow-read audio are the audio of "good" and the audio of "luck". For the audio of "good", the phoneme audios extracted in turn are the audio of g, the audio of u, and the audio of d; for the audio of "luck", they are the audio of l, the audio of ʌ, and the audio of k. For the audio of g, the corresponding standard audio of g is determined, and the two are compared to determine the first score value of the audio of g; the audios of u, d, l, ʌ, and k are processed in the same way, each being compared with its corresponding standard audio to determine its first score value. For the audio of "good", the sum of the first score values of the audios of g, u, and d is taken as its second score value; for the audio of "luck", the sum of the first score values of the audios of l, ʌ, and k is taken as its second score value. Finally, the sum of the second score values of the audio of "good" and the audio of "luck" is taken as the assessment value of the user's second follow-read audio.
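The two-level summation in this example can be sketched as follows. The application does not say how a phoneme audio is compared with its standard, so `score_phoneme` below uses a made-up similarity (1 divided by 1 plus the mean absolute difference of two feature vectors) purely as a placeholder; the feature vectors and names are likewise hypothetical.

```python
def score_phoneme(user_features, standard_features):
    """Placeholder first score value in (0, 1]; 1.0 means identical.

    Assumption: each phoneme audio is already reduced to a feature vector.
    """
    diffs = [abs(u - s) for u, s in zip(user_features, standard_features)]
    return 1.0 / (1.0 + sum(diffs) / len(diffs))

def assess(utterance, standard_db):
    """Sum phoneme scores per word, then sum the word scores.

    `utterance` is a list of (word, [(phonetic_symbol, features), ...]).
    Returns the per-word second score values and the assessment value.
    """
    word_scores = []
    for word, phonemes in utterance:
        second_score = sum(
            score_phoneme(features, standard_db[symbol])
            for symbol, features in phonemes
        )
        word_scores.append((word, second_score))
    assessment_value = sum(score for _, score in word_scores)
    return word_scores, assessment_value

# Hypothetical standard features; the user's "ʌ" deviates from the standard.
standard_db = {
    "g": [1.0, 2.0], "u": [0.5, 0.5], "d": [2.0, 1.0],
    "l": [1.0, 1.0], "ʌ": [1.0, 1.0], "k": [3.0, 3.0],
}
utterance = [
    ("good", [("g", [1.0, 2.0]), ("u", [0.5, 0.5]), ("d", [2.0, 1.0])]),
    ("luck", [("l", [1.0, 1.0]), ("ʌ", [0.0, 0.0]), ("k", [3.0, 3.0])]),
]
word_scores, assessment_value = assess(utterance, standard_db)
```

With these values "good" scores 3.0 (three perfect phonemes) and "luck" scores 2.5, giving an assessment value of 5.5.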
In addition, it should be noted that the present application gives a specific implementation for determining the standard phoneme audio corresponding to a phoneme audio, as follows:
According to the extraction order of the word audio containing the phoneme audio, the word corresponding to the word audio is extracted from the sentence markup text; according to the extraction order of the phoneme audio, the phonetic symbol corresponding to the phoneme audio is extracted from that word; and according to the extracted phonetic symbol, the standard phoneme audio corresponding to the phonetic symbol is determined in the standard phoneme audio database.
For example, continuing the previous example, for the audio of g: first, the sentence "good luck" corresponding to the current first follow-read audio is determined in the sentence markup text. According to the extraction order of the word audio containing the audio of g (that is, the audio of "good", which was the first word audio extracted from the follow-read audio), the word corresponding to that word audio is extracted from the sentence markup text; that is, the first word, "good", is extracted from the sentence "good luck". According to the extraction order of the audio of g (that is, it was the first phoneme audio extracted from the audio of "good"), the phonetic symbol corresponding to the audio of g is extracted from the word "good"; that is, the first phonetic symbol, g, is extracted. Finally, according to the phonetic symbol g, the corresponding standard phoneme audio is determined in the standard phoneme audio database.
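The order-based lookup in this example can be sketched as two index operations followed by a database lookup. The pronunciation dictionary `phonetic_dict`, the file-name values in `standard_db`, and the function name are assumptions for illustration; the application only states that the word and phonetic symbol are selected by extraction order.

```python
def standard_phoneme_audio(sentence, word_index, phoneme_index,
                           phonetic_dict, standard_db):
    """Find the standard audio for the n-th phoneme of the m-th word.

    Indexes are zero-based and mirror the order in which the word audio
    and phoneme audio were extracted from the user's recording.
    """
    word = sentence.split()[word_index]
    symbol = phonetic_dict[word.lower()][phoneme_index]
    return symbol, standard_db[symbol]

# Hypothetical pronunciation dictionary and standard phoneme audio database.
phonetic_dict = {"good": ["g", "u", "d"], "luck": ["l", "ʌ", "k"]}
standard_db = {
    symbol: f"standard_{symbol}.wav"
    for symbols in phonetic_dict.values()
    for symbol in symbols
}

first = standard_phoneme_audio("good luck", 0, 0, phonetic_dict, standard_db)
middle = standard_phoneme_audio("good luck", 1, 1, phonetic_dict, standard_db)
```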
Further, in the present application, the user's second follow-read audio can be assessed according to its assessment value as follows:
It is judged whether the assessment value of the user's second follow-read audio is greater than a preset threshold;
If so, the user's second follow-read audio is qualified, and the user is prompted;
If not, the user's second follow-read audio is unqualified, the second follow-read audio is played repeatedly, and the user's follow-read audio is assessed again until it is qualified.
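The threshold check and repeat-until-qualified loop might look like the sketch below. The callbacks `record_and_score` and `replay` stand in for the microphone, cloud assessment, and loudspeaker and are hypothetical; the `max_attempts` cap is also an assumption added so the loop always terminates, whereas the text above repeats until the audio is qualified.

```python
def follow_read_until_qualified(record_and_score, replay, threshold, max_attempts=5):
    """Repeat the assess/replay cycle until the score exceeds the threshold.

    Returns (qualified, attempts_used).
    """
    for attempt in range(1, max_attempts + 1):
        score = record_and_score()
        if score > threshold:
            return True, attempt  # qualified: the caller prompts the user
        replay()                  # unqualified: play the audio again
    return False, max_attempts

# Hypothetical run: the user qualifies on the third attempt.
scores = iter([3.0, 4.0, 6.0])
replays = []
outcome = follow_read_until_qualified(
    record_and_score=lambda: next(scores),
    replay=lambda: replays.append("replayed"),
    threshold=5.0,
)
```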
With the above method, the user's follow-reading can be assessed while the user is following along, so that the user can effectively grasp his or her own language learning progress, which improves the efficiency with which the user learns the language.
The above is the user language assessment method provided by the embodiments of the present application. Based on the same idea, the embodiments of the present application also provide a user language assessment system. As shown in Fig. 2, the system includes:
A central processing unit 201, configured to identify whether a current text is a target text;
An image feedback device 202, configured to receive a follow-read operation from the user when the central processing unit 201 identifies the current text as a target text;
The central processing unit 201, further configured to determine the first follow-read audio to be played that corresponds to the follow-read operation;
A loudspeaker 203, configured to play the first follow-read audio;
A microphone 204, configured to collect second follow-read audio that the user repeats after the first follow-read audio;
A cloud server 205, configured to assess the user's second follow-read audio.
The system further includes:
A camera 206, configured to scan the current text image;
The central processing unit 201 is specifically configured to obtain the current text image, extract image features of the current text image, and search a pre-stored image feature database for the extracted image features; if they are found, the current text is identified as a target text; if not, the current text is identified as a non-target text.
The central processing unit 201 is further configured to extract the image features of the current text image by a convolutional neural network algorithm, by a recurrent neural network algorithm, or by the scale-invariant feature transform (SIFT) algorithm.
The system further includes:
A camera 206, configured to scan an image of the text page to be read in the target text;
The central processing unit 201 is specifically configured to obtain the image of the text page to be read in the target text; extract image features of that page image; determine, in the page number feature database, the page number of the text page to be read according to those image features; determine, in the sentence markup text database, the sentence markup text corresponding to the page number; obtain the first follow-read audio according to the page number and its sentence markup text; and determine the obtained first follow-read audio as the first follow-read audio to be played that corresponds to the follow-read operation.
The cloud server 205 is specifically configured to extract each word audio in the user's second follow-read audio in turn; for each word audio, in the order in which the word audios were extracted, extract each phoneme audio in that word audio in turn; for each phoneme audio, in the order in which the phoneme audios were extracted, determine the standard phoneme audio corresponding to that phoneme audio and compare the phoneme audio with the standard phoneme audio to determine the first score value of the phoneme audio; for any word audio, take the sum of the first score values of all phoneme audios it contains as the second score value of the word audio; take the sum of the second score values of all word audios contained in the user's second follow-read audio as the assessment value of the second follow-read audio; and assess the user's second follow-read audio according to that assessment value.
The cloud server 205 is further configured to extract, according to the extraction order of the word audio containing the phoneme audio, the word corresponding to the word audio from the sentence markup text; extract, according to the extraction order of the phoneme audio, the phonetic symbol corresponding to the phoneme audio from that word; and determine, according to the extracted phonetic symbol, the standard phoneme audio corresponding to the phonetic symbol in the standard phoneme audio database.
The cloud server 205 is specifically configured to judge whether the assessment value of the user's second follow-read audio is greater than a preset threshold; if so, the second follow-read audio is qualified, and the user is prompted through the loudspeaker 203 and the image feedback device 202; if not, the second follow-read audio is unqualified, and the second follow-read audio is played repeatedly through the loudspeaker 203.
In a typical configuration, a computing device includes one or more processors (CPUs), an input/output interface, a network interface, and memory.
The memory may include computer-readable media in the form of volatile memory, random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, which can implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
It should also be noted that the terms "comprise", "include", or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, commodity, or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, commodity, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, commodity, or device that includes the element.
Those skilled in the art will understand that the embodiments of the present application may be provided as a method, a system, or a computer program product. Therefore, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, and optical storage) that contain computer-usable program code.
The above are merely embodiments of the present application and are not intended to limit the present application. For those skilled in the art, the present application may have various modifications and variations. Any modification, equivalent replacement, improvement, and the like made within the spirit and principle of the present application shall fall within the scope of the claims of the present application.