WO2021232857A1 - Scanning and shadow reading processing method and related apparatus - Google Patents

Scanning and shadow reading processing method and related apparatus

Info

Publication number
WO2021232857A1
WO2021232857A1 PCT/CN2021/074984 CN2021074984W WO2021232857A1 WO 2021232857 A1 WO2021232857 A1 WO 2021232857A1 CN 2021074984 W CN2021074984 W CN 2021074984W WO 2021232857 A1 WO2021232857 A1 WO 2021232857A1
Authority
WO
WIPO (PCT)
Prior art keywords
follow
voice
speech
display interface
user
Application number
PCT/CN2021/074984
Other languages
French (fr)
Chinese (zh)
Inventor
宋英双
谢硕
周正
谢卓
宋金昌
Original Assignee
北京搜狗科技发展有限公司
Application filed by 北京搜狗科技发展有限公司 filed Critical 北京搜狗科技发展有限公司
Publication of WO2021232857A1 publication Critical patent/WO2021232857A1/en

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048 Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0487 Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16 Sound input; Sound output
    • G06F3/167 Audio in a user interface, e.g. using voice commands for navigating, audio feedback
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/142 Image acquisition using hand-held instruments; Constructional details of the instruments

Definitions

  • This application relates to the technical field of dictionary pens, and in particular to a scanning and reading processing method and related devices.
  • The dictionary pen provides a dictionary lookup function: without manually entering text, the user scans the text directly, the pen recognizes it through optical character recognition (OCR) technology, and the dictionary is queried to obtain the query result of the text. Compared with paper dictionaries, electronic dictionaries, dictionary applications and the like, the dictionary pen is simpler, faster and more convenient for dictionary lookup.
  • The current dictionary pen only supports dictionary lookup. After the dictionary pen scans text and queries the dictionary to obtain the query result of the text, the user can learn the pronunciation and meaning of the text from the query result displayed on the display interface of the dictionary pen. However, when the user reads the pronunciation of the text aloud while learning it, the dictionary pen can neither obtain the user's follow-up voice nor determine whether the follow-up voice contains inaccurate pronunciation, so the user cannot know whether the pronunciation of the follow-up voice is accurate.
  • This application provides a scanning and follow-up reading processing method and related devices, so that the user can know whether the pronunciation of the follow-up voice is accurate; that is, the dictionary pen can acquire and evaluate the user's follow-up voice, thereby better helping the user learn the query result of the text.
  • an embodiment of the present application provides a method for scanning and reading processing, which is applied to a dictionary pen, and the method includes:
  • in response to a scanning operation of the dictionary pen on text, a query result of the text is obtained and displayed on the display interface of the dictionary pen; in response to the user's follow-up voice input operation on the query result display interface, the follow-up voice of the user is obtained; and based on the follow-up voice and the model reading voice corresponding to the follow-up voice, a follow-up evaluation result of the follow-up voice is obtained and displayed on the display interface of the dictionary pen.
  • the obtaining the follow-up voice of the user in response to the user's follow-up voice input operation on the query result display interface includes:
  • in response to the user's follow-up voice input operation on the query result display interface, the input voice of the user is obtained; and the input voice is processed using voice noise reduction technology and voice activity detection technology to obtain the follow-up voice.
  • the obtaining a follow-up evaluation result of the follow-up voice based on the follow-up voice and a model reading voice corresponding to the follow-up voice and display it on the display interface of the dictionary pen includes:
  • when the dictionary pen is in a networked network environment, the follow-up voice is sent to a pronunciation evaluation server; the follow-up voice is matched with the model reading voice through the pronunciation evaluation server to obtain a matching degree between the follow-up voice and the model reading voice; and based on that matching degree, the follow-up evaluation result of the follow-up voice is obtained and displayed on the display interface of the dictionary pen.
  • the obtaining a follow-up evaluation result of the follow-up voice based on the follow-up voice and a model reading voice corresponding to the follow-up voice and display it on the display interface of the dictionary pen includes:
  • when the dictionary pen is in an offline network environment, the follow-up voice is matched with the model reading voice through a pronunciation evaluation offline toolkit to obtain a matching degree between the follow-up voice and the model reading voice; and based on that matching degree, the follow-up evaluation result of the follow-up voice is obtained and displayed on the display interface of the dictionary pen.
  • the follow-up evaluation result includes at least a follow-up evaluation score.
  • the follow-up evaluation result further includes a follow-up correction suggestion.
  • when the follow-up evaluation result includes a follow-up correction suggestion, the method further includes: recommending oral practice content based on the follow-up correction suggestion and displaying it on the display interface of the dictionary pen.
  • when the follow-up evaluation result includes a follow-up correction suggestion, the method further includes: in response to the user's follow-up voice input operation on the follow-up evaluation result display interface, obtaining a new follow-up voice of the user; and based on the new follow-up voice and the model reading voice, obtaining a follow-up evaluation result of the new follow-up voice and displaying it on the display interface of the dictionary pen.
  • an embodiment of the present application provides a scanning and reading processing device, which is applied to a dictionary pen, and the device includes:
  • the first obtaining and displaying unit is configured to obtain the query result of the text in response to the scanning operation of the text by the dictionary pen and display it on the display interface of the dictionary pen;
  • the follow-up voice obtaining unit is used to obtain the follow-up voice of the user in response to the user's follow-up voice input operation on the query result display interface;
  • the second obtaining and displaying unit is configured to obtain the follow-up evaluation result of the follow-up voice based on the follow-up voice and the model-reading voice corresponding to the follow-up voice and display it on the display interface of the dictionary pen.
  • the follow-up speech obtaining unit includes:
  • the input voice obtaining subunit is used to obtain the input voice of the user in response to the user's follow-up voice input operation on the query result display interface;
  • the follow-up voice obtaining subunit is used to process the input voice by using the voice noise reduction technology and the voice activity detection technology to obtain the follow-up voice.
  • the second obtaining and displaying unit includes:
  • the sending subunit is used to send the follow-up speech to the pronunciation evaluation server when the dictionary pen is in a networked network environment
  • the first matching obtaining subunit is configured to match the follow-up speech and the model pronunciation through the pronunciation evaluation server to obtain the degree of matching between the follow-up speech and the model pronunciation;
  • the first obtaining and displaying subunit is configured to obtain the follow-up evaluation result of the follow-up voice based on the degree of matching between the follow-up voice and the model pronunciation and display it on the display interface of the dictionary pen.
  • the second obtaining and displaying unit includes:
  • the second matching obtaining subunit is used to match the follow-up voice with the model reading voice through the pronunciation evaluation offline toolkit when the dictionary pen is in an offline network environment, so as to obtain the matching degree between the follow-up voice and the model reading voice;
  • the second obtaining and displaying subunit is configured to obtain the follow-up evaluation result of the follow-up voice based on the degree of matching between the follow-up voice and the model-read voice and display it on the display interface of the dictionary pen.
  • the follow-up evaluation result includes at least a follow-up evaluation score.
  • the follow-up evaluation result further includes a follow-up correction suggestion.
  • the device when the follow-up evaluation result includes a follow-up correction suggestion, the device further includes:
  • the recommendation obtaining unit is configured to recommend oral practice content based on the follow-up correction suggestion, and display it on the display interface of the dictionary pen.
  • the follow-up voice obtaining unit is further configured to obtain the new follow-up voice of the user in response to the user's follow-up voice input operation on the follow-up evaluation result display interface;
  • the second obtaining and displaying unit is further configured to obtain a follow-up evaluation result of the new follow-up voice based on the new follow-up voice and the model voice and display it on the display interface of the dictionary pen.
  • an embodiment of the present application provides a dictionary pen for scanning and reading processing.
  • the dictionary pen includes a memory and one or more programs, wherein one or more programs are stored in the memory, and
  • the one or more programs configured to be executed by one or more processors include instructions for performing the following operations:
  • in response to a scanning operation of the dictionary pen on text, obtaining a query result of the text and displaying it on the display interface of the dictionary pen; in response to the user's follow-up voice input operation on the query result display interface, obtaining the follow-up voice of the user; and based on the follow-up voice and the model reading voice corresponding to the follow-up voice, obtaining a follow-up evaluation result of the follow-up voice and displaying it on the display interface of the dictionary pen.
  • an embodiment of the present application provides a machine-readable medium with instructions stored thereon which, when executed by one or more processors, cause the device to perform the scanning and follow-up reading processing method described in any one of the implementations of the above-mentioned first aspect.
  • In the embodiment of the present application, the user uses the dictionary pen to scan text, and the dictionary pen obtains the query result of the text and displays it on its display interface; the user then clicks on the query result display interface and inputs voice, and the dictionary pen obtains the user's follow-up voice; based on the follow-up voice and its corresponding model reading voice, the dictionary pen obtains the follow-up evaluation result of the follow-up voice and displays it on its display interface.
  • It can be seen that the dictionary pen can capture the user's follow-up voice while the user is reading along, and judge whether the follow-up voice contains inaccurate pronunciation by matching it with the model reading voice, so as to obtain and display the follow-up evaluation result. The user can therefore know whether the pronunciation of the follow-up voice is accurate; that is, the dictionary pen can acquire and evaluate the user's follow-up voice, better helping the user learn the query result of the text.
  • FIG. 1 is a schematic diagram of a system framework involved in an application scenario in an embodiment of this application;
  • FIG. 2 is a schematic flowchart of a method for scanning and reading processing according to an embodiment of the application
  • FIG. 3 is a schematic diagram of a process in which a dictionary pen obtains a user's follow-up voice according to an embodiment of the application;
  • FIG. 4 is a schematic diagram of a display interface of a follow-up evaluation result of a dictionary pen provided by an embodiment of the application;
  • FIG. 5 is a schematic diagram of a display interface for follow-up evaluation results of another dictionary pen provided by an embodiment of the application;
  • FIG. 6 is a schematic diagram of a display interface for oral practice content of a dictionary pen provided by an embodiment of the application.
  • FIG. 7 is a schematic diagram of a process in which a dictionary pen obtains a new follow-up voice, and a follow-up evaluation result of the new follow-up voice is obtained and displayed on the display interface of the dictionary pen according to an embodiment of the application;
  • FIG. 8 is a schematic structural diagram of an apparatus for scanning and reading processing according to an embodiment of the application.
  • Fig. 9 is a schematic structural diagram of a dictionary pen for scanning and reading processing provided by an embodiment of the application.
  • FIG. 10 is a schematic structural diagram of a server provided by an embodiment of this application.
  • the dictionary pen supports the function of querying the dictionary, that is, the user uses the dictionary pen to scan the text, and the dictionary pen searches the dictionary to obtain the query result of the text and displays it on its display interface.
  • the user can browse the query result to clarify the pronunciation and meaning of the text .
  • However, for a user learning text, the dictionary lookup function of the dictionary pen alone is far from enough.
  • For example, when the user reads the text aloud after hearing its pronunciation, the dictionary pen cannot obtain the user's follow-up voice, nor can it determine whether the follow-up voice contains inaccurate pronunciation, so the user cannot know whether the pronunciation of the follow-up voice is accurate.
  • In the embodiment of the present application, the user uses the dictionary pen to scan text, and the dictionary pen obtains the query result of the text and displays it on its display interface; the user clicks on the query result display interface and then inputs voice, and the dictionary pen obtains the user's follow-up voice; based on the follow-up voice and its corresponding model reading voice, the dictionary pen obtains the follow-up evaluation result of the follow-up voice and displays it on its display interface.
  • In this way, the dictionary pen can capture the user's follow-up voice while the user is reading along and judge whether it contains inaccurate pronunciation by matching it with the model reading voice, so as to obtain and display the follow-up evaluation result; the user can then know whether the pronunciation of the follow-up voice is accurate. That is, the dictionary pen can acquire and evaluate the user's follow-up voice, better helping the user learn the query result of the text.
  • One scenario of the embodiment of the present application is shown in FIG. 1: the user uses the dictionary pen 100 to scan text, and the dictionary pen 100 obtains the query result of the text and displays it on its display interface for the user to browse; the user clicks on the query result display interface of the dictionary pen 100 and then inputs voice, and the dictionary pen 100 obtains the user's follow-up voice; based on the follow-up voice and its corresponding model reading voice, the dictionary pen 100 obtains the follow-up evaluation result of the follow-up voice and displays it on its display interface, so that the user can browse the follow-up evaluation result.
  • FIG. 2 shows a schematic flowchart of a scanning and reading processing method in an embodiment of the present application.
  • the method may include the following steps, for example:
  • Step 201 In response to the scanning operation of the dictionary pen on the text, a query result of the text is obtained and displayed on the display interface of the dictionary pen.
  • When the user uses the dictionary pen to scan text such as a word, phrase or sentence, in response to the scanning operation the dictionary pen recognizes the text through OCR technology and queries the dictionary to obtain the query result corresponding to the text; so that the user can intuitively see the query result through the dictionary pen, the dictionary pen displays the query result of the text on its display interface.
  • For example, the text scanned by the user with the dictionary pen through OCR technology is the English word "take"; the dictionary pen queries the dictionary to obtain the query result of the English word "take", namely its pronunciation and meaning, and displays them on its display interface. By browsing the query result display interface of the dictionary pen, the user can learn the pronunciation and meaning of the English word "take".
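  • As a rough illustration of the step 201 flow (scan, recognize the text with OCR, query the dictionary, display the result), the following Python sketch uses a toy in-memory dictionary and a placeholder OCR function; none of these names come from the application, and a real dictionary pen would use its own OCR engine and dictionary backend.

```python
# Illustrative sketch of step 201: scan -> OCR -> dictionary lookup -> display.
# TOY_DICTIONARY, recognize_text and the print-based "display" are stand-ins only.

TOY_DICTIONARY = {
    "take": {"pronunciation": "/teɪk/", "meaning": "to carry or move something"},
}

def recognize_text(scanned_image: bytes) -> str:
    """Placeholder for OCR over the scanned strip; a real pen runs an OCR model here."""
    return "take"  # assume the user scanned the English word "take"

def handle_scan(scanned_image: bytes) -> dict:
    """Obtain the query result of the scanned text and 'display' it."""
    text = recognize_text(scanned_image)
    result = TOY_DICTIONARY.get(text, {"pronunciation": "?", "meaning": "not found"})
    print(f"[display] {text}: {result['pronunciation']} - {result['meaning']}")
    return result

handle_scan(b"...")  # simulated scan of the word "take"
```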
  • Step 202 In response to the user's follow-up voice input operation on the query result display interface, obtain the follow-up voice of the user.
  • After step 201 implements the dictionary lookup function, that function alone is far from enough for a user who is learning the text. For example, when the user reads the text aloud after hearing its pronunciation, the current dictionary pen cannot interact with the user: it can neither obtain the user's follow-up voice nor determine whether the follow-up voice contains inaccurate pronunciation, so the user cannot know whether the pronunciation is accurate. Therefore, in the embodiment of the present application, on the basis of step 201, a control that allows the user to perform a follow-up voice input operation, such as a recording button, is provided on the query result display interface of the dictionary pen. When the user touches the control and reads along, the dictionary pen obtains the user's follow-up voice, so that it can judge whether the follow-up voice contains inaccurate pronunciation.
  • In a specific implementation, the voice picked up by the dictionary pen is taken as the user's input voice. The input voice contains interference such as environmental noise and white noise, as well as silent periods caused by a certain delay at the beginning and end of the follow-up input. To obtain a more accurate follow-up voice, the dictionary pen processes the input voice with a voice noise reduction technique, for example 3m voice noise reduction, to remove the interference noise, and with a voice activity detection technique to identify and remove the silent periods; the processed input voice is then used as the user's follow-up voice. Therefore, in an optional implementation manner of the embodiment of the present application, step 202 may include, for example, the following steps:
  • Step A In response to the user's follow-up voice input operation on the query result display interface, obtain the user's input voice
  • Step B Use voice noise reduction technology and voice activity detection technology to process the input voice to obtain the follow-up voice.
  • FIG. 3 is a schematic diagram of the process in which the dictionary pen obtains the user's follow-up voice.
  • A recording button is preset on the query result display interface of the dictionary pen; when the user touches the recording button and reads along, the dictionary pen obtains the user's input voice and processes it with 3m voice noise reduction technology and voice activity detection technology to obtain the user's follow-up voice.
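  • A minimal sketch of the step 202 preprocessing is given below, assuming a crude energy-threshold noise gate stands in for the voice noise reduction and a simple frame-energy test stands in for voice activity detection; the frame size and thresholds are arbitrary illustrative values, and the application's actual noise reduction (e.g. the 3m technique mentioned above) and VAD algorithms are not specified here.

```python
import numpy as np

def denoise(signal: np.ndarray, noise_floor: float = 0.01) -> np.ndarray:
    """Very crude noise suppression: zero out samples below an assumed noise floor."""
    cleaned = signal.copy()
    cleaned[np.abs(cleaned) < noise_floor] = 0.0
    return cleaned

def trim_silence(signal: np.ndarray, frame: int = 160, energy_thresh: float = 1e-4) -> np.ndarray:
    """Toy voice activity detection: drop leading/trailing frames whose energy is below a threshold."""
    frames = [signal[i:i + frame] for i in range(0, len(signal), frame)]
    energies = [float(np.mean(f ** 2)) for f in frames]
    active = [i for i, e in enumerate(energies) if e > energy_thresh]
    if not active:
        return signal[:0]  # nothing but silence
    start, end = active[0] * frame, (active[-1] + 1) * frame
    return signal[start:end]

def preprocess_input_voice(input_voice: np.ndarray) -> np.ndarray:
    """Step 202: turn the raw input voice into the follow-up voice."""
    return trim_silence(denoise(input_voice))

# Example: 0.2 s of silence, 0.5 s of a 440 Hz "voice", 0.2 s of silence at 16 kHz.
sr = 16000
voice = 0.3 * np.sin(2 * np.pi * 440 * np.arange(int(0.5 * sr)) / sr)
raw = np.concatenate([np.zeros(int(0.2 * sr)), voice, np.zeros(int(0.2 * sr))])
follow_up = preprocess_input_voice(raw)
print(len(raw), len(follow_up))  # the trimmed follow-up voice is shorter than the raw input
```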
  • Step 203 Obtain a follow-up evaluation result of the follow-up voice based on the follow-up voice and the model-reading voice corresponding to the follow-up voice and display it on the display interface of the dictionary pen.
  • After step 202 enables the dictionary pen to obtain the user's follow-up voice, the dictionary pen needs to determine whether the follow-up voice contains inaccurate pronunciation. The reference for this judgment is the model reading voice corresponding to the follow-up voice; the model reading voice may be either a pre-stored recording of a professional's standard pronunciation of the text, or a voice synthesized based on a human voice and the standard pronunciation of the text.
  • the dictionary pen needs to display the follow-up evaluation result of the follow-up speech on its display interface.
  • the dictionary pen may be in a networked network environment or in an offline network environment.
  • the dictionary pen can realize the evaluation of the user's follow-up speech regardless of whether it is in the networked network environment or the offline network environment.
  • the following is a detailed description of the specific implementation of step 203 when the dictionary pen is in a networked network environment and the dictionary pen is in an offline network environment:
  • When the dictionary pen is in a networked network environment, it can connect over the network to a pronunciation evaluation server for evaluating the follow-up voice, so that the evaluation of the user's follow-up voice is completed by the pronunciation evaluation server.
  • Specifically, the dictionary pen first sends the obtained follow-up voice to the pronunciation evaluation server over the network; the pronunciation evaluation server matches the received follow-up voice with its corresponding model reading voice to obtain the matching degree between the follow-up voice and the model reading voice, and the follow-up evaluation result is then obtained based on that matching degree and displayed on the display interface of the dictionary pen.
  • the step 203 may include, for example, the following steps:
  • Step C When the dictionary pen is in a networked network environment, the follow-up speech is sent to the pronunciation evaluation server.
  • Step D Match the follow-up speech and the model pronunciation through the pronunciation evaluation server to obtain the degree of matching between the follow-up speech and the model pronunciation.
  • In a specific implementation, a preset algorithm can be used to calculate the matching degree between the follow-up voice and the model reading voice. For example, the follow-up voice and the model reading voice can each be converted into a voice feature vector, and the distance between the two vectors can be calculated with a distance formula to obtain the matching degree between the follow-up voice and the model reading voice. Alternatively, a pre-trained neural network model can be used: the follow-up voice and the model reading voice are input into the trained neural network model, and the model outputs the matching degree between them.
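  • One concrete, assumption-laden way to realize the "convert to vectors and measure the distance" option above is sketched below: each voice is collapsed into a fixed-length energy-envelope vector and cosine similarity is used as the matching degree. Real pronunciation evaluation engines use far richer acoustic features and models; this only illustrates the shape of the computation.

```python
import numpy as np

def envelope_features(signal: np.ndarray, n_bins: int = 32) -> np.ndarray:
    """Collapse a waveform into a fixed-length energy envelope (a stand-in for real acoustic features)."""
    chunks = np.array_split(np.abs(signal), n_bins)
    feats = np.array([float(np.mean(c)) for c in chunks])
    norm = np.linalg.norm(feats)
    return feats / norm if norm > 0 else feats

def matching_degree(follow_up: np.ndarray, model_reading: np.ndarray) -> float:
    """Matching degree in [0, 1] between the follow-up voice and the model reading voice."""
    a, b = envelope_features(follow_up), envelope_features(model_reading)
    return float(np.clip(np.dot(a, b), 0.0, 1.0))  # cosine similarity of unit vectors

# Example with synthetic audio: the follow-up is a slightly noisy copy of the model reading.
sr = 16000
model = np.sin(2 * np.pi * 440 * np.arange(sr) / sr)
follow = model + 0.05 * np.random.default_rng(0).standard_normal(sr)
print(round(matching_degree(follow, model), 3))  # close to 1.0 for a close imitation
```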
  • Step E Obtain a follow-up evaluation result of the follow-up voice based on the degree of matching between the follow-up voice and the model-read voice and display it on the display interface of the dictionary pen.
  • When the dictionary pen is in an offline network environment, it cannot connect over the network to the pronunciation evaluation server for evaluating the follow-up voice; however, the dictionary pen is pre-loaded with a pronunciation evaluation offline toolkit for evaluating the follow-up voice.
  • the dictionary pen can complete the user's follow-up voice evaluation through the pronunciation evaluation offline toolkit.
  • Specifically, the dictionary pen first uses the pronunciation evaluation offline toolkit to match the obtained follow-up voice with its corresponding model reading voice to obtain the matching degree between them; then, based on that matching degree, the follow-up evaluation result of the follow-up voice is obtained and displayed on the display interface of the dictionary pen. Therefore, in an optional implementation manner of the embodiment of the present application, step 203 may include, for example, the following steps:
  • Step F When the dictionary pen is in an offline network environment, match the follow-up speech and the model pronunciation through the pronunciation evaluation offline toolkit to obtain the degree of matching between the follow-up speech and the model pronunciation.
  • For the specific implementation of step F, reference may be made to the specific implementation of step D above.
  • Step G Obtain a follow-up evaluation result of the follow-up voice based on the degree of matching between the follow-up voice and the model-read voice and display it on the display interface of the dictionary pen.
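  • Steps C to E (networked) and steps F to G (offline) differ only in where the matching is performed, which can be summarized with the dispatch sketch below; ServerEvaluator and OfflineToolkit are invented placeholders, since the application does not name a concrete server API or offline toolkit interface.

```python
# Sketch of the step 203 dispatch between the networked and offline evaluation paths.
# ServerEvaluator and OfflineToolkit are hypothetical stand-ins, not real APIs.

class ServerEvaluator:
    """Placeholder for the pronunciation evaluation server reached over the network."""
    def match(self, follow_up, model_reading) -> float:
        return 0.92  # the server would return the actual matching degree

class OfflineToolkit:
    """Placeholder for the pronunciation evaluation offline toolkit bundled on the pen."""
    def match(self, follow_up, model_reading) -> float:
        return 0.90  # the toolkit would compute the matching degree locally

def evaluate_follow_up(follow_up, model_reading, is_networked: bool) -> float:
    """Return the matching degree, using the server when online and the toolkit otherwise."""
    evaluator = ServerEvaluator() if is_networked else OfflineToolkit()
    return evaluator.match(follow_up, model_reading)

print(evaluate_follow_up(b"follow", b"model", is_networked=False))  # offline path -> 0.90
```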
  • Evaluating the follow-up voice essentially means matching the follow-up voice with its corresponding model reading voice. Based on the matching degree between the follow-up voice and the model reading voice, at a minimum the matching degree can be converted into a follow-up evaluation score, and the follow-up evaluation score can be displayed on the display interface of the dictionary pen as the follow-up evaluation result.
  • the follow-up evaluation result includes at least a follow-up evaluation score.
  • For example, based on the follow-up voice and its corresponding model reading voice, the dictionary pen obtains the follow-up evaluation score as the follow-up evaluation result and displays it on its display interface.
  • FIG. 4 is a schematic diagram of the follow-up evaluation result display interface of a dictionary pen, in which the follow-up evaluation score of the follow-up voice, for example "100 points", is displayed.
  • In practice, a preset follow-up evaluation score can be configured as the threshold for judging the follow-up voice. If the follow-up evaluation score of the follow-up voice is lower than the preset follow-up evaluation score, it means that the pronunciation of the user's follow-up voice is not accurate enough and needs to be corrected. In that case, besides the follow-up evaluation score, the dictionary pen also obtains a corresponding follow-up correction suggestion, and the score and the suggestion together serve as the follow-up evaluation result; the follow-up evaluation score and the follow-up correction suggestion are displayed on the display interface of the dictionary pen in a preset order, or displayed simultaneously, so that the user can learn from the follow-up correction suggestion and correct the follow-up voice.
  • the follow-up evaluation result further includes a follow-up correction suggestion.
  • Conversely, if the follow-up evaluation score is higher than the preset follow-up evaluation score, it means that the pronunciation of the user's follow-up voice is accurate and the follow-up voice does not need to be corrected.
  • For example, the dictionary pen obtains the follow-up evaluation result of the follow-up voice based on the follow-up voice and its corresponding model reading voice. If the obtained follow-up evaluation score is lower than the preset follow-up evaluation score, a corresponding follow-up correction suggestion also needs to be obtained as part of the follow-up evaluation result. FIG. 5 is a schematic diagram of the follow-up evaluation result display interface of another dictionary pen: the interface displays not only the follow-up evaluation score of the follow-up voice, for example "88 points", but also the follow-up correction suggestion, for example highlighting the pronunciation of the inaccurately pronounced characters in the follow-up voice and suggesting that the user correct their pronunciation.
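  • Converting the matching degree into a follow-up evaluation score and deciding whether a follow-up correction suggestion is needed can be sketched as follows; the linear 0-100 mapping, the preset score of 90, and the per-word matching degrees used to pick out inaccurately pronounced words are all illustrative assumptions rather than values taken from the application.

```python
PRESET_SCORE = 90  # assumed preset follow-up evaluation score (threshold), not from the application

def to_score(matching_degree: float) -> int:
    """Map a matching degree in [0, 1] to a follow-up evaluation score in [0, 100]."""
    return int(round(100 * max(0.0, min(1.0, matching_degree))))

def build_result(matching_degree: float, per_word_degrees: dict) -> dict:
    """Build the follow-up evaluation result: always a score, plus a correction suggestion if low."""
    score = to_score(matching_degree)
    result = {"score": score}
    if score < PRESET_SCORE:
        # Highlight the words whose own matching degree is low as the correction suggestion.
        weak_words = [w for w, d in per_word_degrees.items() if to_score(d) < PRESET_SCORE]
        result["correction_suggestion"] = f"practice the pronunciation of: {', '.join(weak_words)}"
    return result

print(build_result(1.00, {"take": 1.00}))              # e.g. {'score': 100}
print(build_result(0.88, {"take": 0.80, "it": 0.95}))  # score 88 plus a correction suggestion
```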
  • When the follow-up evaluation result includes a follow-up correction suggestion, it means that the pronunciation of the user's follow-up voice is inaccurate and needs to be corrected. In that case, some oral practice content can also be recommended based on the follow-up correction suggestion to assist in correcting the user's follow-up voice; the oral practice content likewise needs to be displayed on the display interface of the dictionary pen, so that the user can study it there and correct the follow-up voice accordingly.
  • the oral practice content may be in the form of text, image, audio and/or video.
  • Therefore, in an optional implementation manner of the embodiment of the present application, the method may further include, for example, step H: recommending oral practice content based on the follow-up correction suggestion and displaying it on the display interface of the dictionary pen.
  • FIG. 6 is a schematic diagram of the oral practice content display interface of a dictionary pen. When the follow-up evaluation score and the follow-up correction suggestion are displayed on the follow-up evaluation result display interface, the dictionary pen recommends oral practice content based on the follow-up correction suggestion and displays that content on its display interface.
  • For example, the oral practice content is oral practice text corresponding to the pronunciation highlighted in the follow-up correction suggestion.
  • In addition, when the follow-up evaluation result includes a follow-up correction suggestion, after the user has studied the suggestion displayed on the display interface of the dictionary pen and corrected the pronunciation, the user can input a new follow-up voice on the follow-up evaluation result display interface. The dictionary pen then obtains the user's new follow-up voice and evaluates it, so that the follow-up evaluation result of the new follow-up voice is obtained and displayed on the display interface of the dictionary pen. Therefore, in an optional implementation manner of the embodiment of the present application, when the follow-up evaluation result includes a follow-up correction suggestion, after step 203 the method may further include, for example, the following steps:
  • Step I In response to the user's follow-up voice input operation on the follow-up evaluation result display interface, obtain the user's new follow-up voice;
  • Step J Obtain a follow-up evaluation result of the new follow-up voice based on the new follow-up voice and the model pronunciation and display it on the display interface of the dictionary pen.
  • FIG. 7 is a schematic diagram of the process in which the dictionary pen obtains a new follow-up voice, obtains the follow-up evaluation result of the new follow-up voice, and displays it on the display interface of the dictionary pen.
  • A recording button is preset on the follow-up evaluation result display interface of the dictionary pen; when the user touches the button and inputs a new follow-up voice, the dictionary pen obtains the user's new follow-up voice. Based on the new follow-up voice and the model reading voice, the dictionary pen obtains the follow-up evaluation result of the new follow-up voice and displays it on the follow-up evaluation result display interface, for example a follow-up evaluation score of "100 points".
  • In summary, in the method provided by the embodiment of the present application, the user uses the dictionary pen to scan text, and the dictionary pen obtains the query result of the text and displays it on its display interface; the user inputs a follow-up voice on the query result display interface, and the dictionary pen obtains the user's follow-up voice; based on the follow-up voice and its corresponding model reading voice, the dictionary pen obtains the follow-up evaluation result of the follow-up voice and displays it on its display interface.
  • In this way, the dictionary pen can capture the user's follow-up voice while the user is reading along, judge whether it contains inaccurate pronunciation by matching it with the model reading voice, and obtain and display the follow-up evaluation result, so that the user can know whether the pronunciation of the follow-up voice is accurate. That is, the dictionary pen can acquire and evaluate the user's follow-up voice, better helping the user learn the query result of the text.
  • the device may specifically include, for example:
  • the first obtaining and displaying unit 801 is configured to obtain the query result of the text in response to the scanning operation of the text by the dictionary pen and display it on the display interface of the dictionary pen;
  • the follow-up voice obtaining unit 802 is configured to obtain the follow-up voice of the user in response to the user's follow-up voice input operation on the query result display interface;
  • the second obtaining and displaying unit 803 is configured to obtain a follow-up evaluation result of the follow-up voice based on the follow-up voice and the model-reading voice corresponding to the follow-up voice and display it on the display interface of the dictionary pen.
  • the follow-up speech obtaining unit 802 includes:
  • the input voice obtaining subunit is used to obtain the input voice of the user in response to the user's follow-up voice input operation on the query result display interface;
  • the follow-up voice obtaining subunit is used to process the input voice by using the voice noise reduction technology and the voice activity detection technology to obtain the follow-up voice.
  • the second obtaining and displaying unit 803 includes:
  • the sending subunit is used to send the follow-up speech to the pronunciation evaluation server when the dictionary pen is in a networked network environment
  • the first matching obtaining subunit is configured to match the follow-up speech and the model pronunciation through the pronunciation evaluation server to obtain the degree of matching between the follow-up speech and the model pronunciation;
  • the first obtaining and displaying subunit is configured to obtain the follow-up evaluation result of the follow-up voice based on the degree of matching between the follow-up voice and the model pronunciation and display it on the display interface of the dictionary pen.
  • the second obtaining and displaying unit 803 includes:
  • the second matching obtaining subunit is used to match the follow-up voice with the model reading voice through the pronunciation evaluation offline toolkit when the dictionary pen is in an offline network environment, so as to obtain the matching degree between the follow-up voice and the model reading voice;
  • the second obtaining and displaying subunit is configured to obtain the follow-up evaluation result of the follow-up voice based on the degree of matching between the follow-up voice and the model pronunciation and display it on the display interface of the dictionary pen.
  • the follow-up evaluation result includes at least a follow-up evaluation score.
  • the follow-up evaluation result further includes a follow-up correction suggestion.
  • the device when the follow-up evaluation result includes a follow-up correction suggestion, the device further includes:
  • the recommendation obtaining unit is configured to recommend oral practice content based on the follow-up correction suggestion, and display it on the display interface of the dictionary pen.
  • the follow-up voice obtaining unit 802 is further configured to obtain a new follow-up voice of the user in response to the user's follow-up voice input operation on the follow-up evaluation result display interface;
  • the second obtaining and displaying unit 803 is further configured to obtain a follow-up evaluation result of the new follow-up voice based on the new follow-up voice and the model voice and display it on the display interface of the dictionary pen.
  • With the apparatus provided by the embodiment of the present application, the user uses the dictionary pen to scan text, and the dictionary pen obtains the query result of the text and displays it on its display interface; the user clicks on the query result display interface and inputs voice, and the dictionary pen obtains the user's follow-up voice; based on the follow-up voice and its corresponding model reading voice, the dictionary pen obtains the follow-up evaluation result of the follow-up voice and displays it on its display interface.
  • In this way, the dictionary pen can capture the user's follow-up voice while the user is reading along and judge whether it contains inaccurate pronunciation by matching it with the model reading voice, so as to obtain and display the follow-up evaluation result; the user can then know whether the pronunciation of the follow-up voice is accurate. That is, the dictionary pen can acquire and evaluate the user's follow-up voice, better helping the user learn the query result of the text.
  • Fig. 9 is a block diagram showing a dictionary pen 900 for scanning and reading processing according to an exemplary embodiment.
  • the device 900 may include one or more of the following components: a processing component 902, a memory 904, a power supply component 906, a multimedia component 908, an audio component 910, an input/output (I/O) interface 912, a sensor component 914, and a communication component 916.
  • the processing component 902 generally controls the overall operations of the device 900, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations.
  • the processing component 902 may include one or more processors 920 to execute instructions to complete all or part of the steps of the foregoing method.
  • the processing component 902 may include one or more modules to facilitate the interaction between the processing component 902 and other components.
  • the processing component 902 may include a multimedia module to facilitate the interaction between the multimedia component 908 and the processing component 902.
  • the memory 904 is configured to store various types of data to support the operation of the device 900. Examples of such data include instructions for any application or method operating on the device 900, contact data, phone book data, messages, pictures, videos, etc.
  • the memory 904 can be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable and Programmable Read Only Memory (EPROM), Programmable Read Only Memory (PROM), Read Only Memory (ROM), Magnetic Memory, Flash Memory, Magnetic Disk or Optical Disk.
  • the power supply component 906 provides power to various components of the device 900.
  • the power supply component 906 may include a power management system, one or more power supplies, and other components associated with the generation, management, and distribution of power for the device 900.
  • the multimedia component 908 includes a screen that provides an output interface between the device 900 and the user.
  • the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user.
  • the touch panel includes one or more touch sensors to sense touch, sliding, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation.
  • the multimedia component 908 includes a front camera and/or a rear camera. When the device 900 is in an operation mode, such as a shooting mode or a video mode, the front camera and/or the rear camera can receive external multimedia data. Each front camera and rear camera can be a fixed optical lens system or have focal length and optical zoom capabilities.
  • the audio component 910 is configured to output and/or input audio signals.
  • the audio component 910 includes a microphone (MIC), and when the device 900 is in an operation mode, such as a call mode, a recording mode, and a voice recognition mode, the microphone is configured to receive external audio signals.
  • the received audio signal may be further stored in the memory 904 or transmitted via the communication component 916.
  • the audio component 910 further includes a speaker for outputting audio signals.
  • the I/O interface 912 provides an interface between the processing component 902 and a peripheral interface module.
  • the above-mentioned peripheral interface module may be a keyboard, a click wheel, a button, and the like. These buttons may include, but are not limited to: home button, volume button, start button, and lock button.
  • the sensor component 914 includes one or more sensors for providing the device 900 with various aspects of state evaluation.
  • the sensor component 914 can detect the on/off status of the device 900 and the relative positioning of components, for example the display and keypad of the device 900.
  • the sensor component 914 can also detect a position change of the device 900 or of a component of the device 900, the presence or absence of contact between the user and the device 900, the orientation or acceleration/deceleration of the device 900, and temperature changes of the device 900.
  • the sensor component 914 may include a proximity sensor configured to detect the presence of nearby objects when there is no physical contact.
  • the sensor component 914 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications.
  • the sensor component 914 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor or a temperature sensor.
  • the communication component 916 is configured to facilitate wired or wireless communication between the device 900 and other devices.
  • the device 900 can access a wireless network based on a communication standard, such as WiFi, 2G, or 3G, or a combination thereof.
  • the communication component 916 receives a broadcast signal or broadcast related information from an external broadcast management system via a broadcast channel.
  • the communication component 916 further includes a near field communication (NFC) module to facilitate short-range communication.
  • the NFC module can be implemented based on radio frequency identification (RFID) technology, infrared data association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology and other technologies.
  • the apparatus 900 may be implemented by one or more application specific integrated circuits (ASIC), digital signal processors (DSP), digital signal processing devices (DSPD), programmable logic devices (PLD), field programmable gate arrays (FPGA), controllers, microcontrollers, microprocessors, or other electronic components to perform the above methods.
  • non-transitory computer-readable storage medium including instructions, such as the memory 904 including instructions, and the foregoing instructions may be executed by the processor 920 of the device 900 to complete the foregoing method.
  • the non-transitory computer-readable storage medium may be ROM, random access memory (RAM), CD-ROM, magnetic tape, floppy disk, optical data storage device, etc.
  • Also provided is a non-transitory computer-readable storage medium; when the instructions in the storage medium are executed by a processor of a mobile terminal, the mobile terminal is enabled to execute a scanning and follow-up reading processing method, the method including:
  • in response to a scanning operation of the dictionary pen on text, obtaining a query result of the text and displaying it on the display interface of the dictionary pen; in response to the user's follow-up voice input operation on the query result display interface, obtaining the follow-up voice of the user; and based on the follow-up voice and the model reading voice corresponding to the follow-up voice, obtaining a follow-up evaluation result of the follow-up voice and displaying it on the display interface of the dictionary pen.
  • FIG. 10 is a schematic diagram of the structure of a server in an embodiment of the present application.
  • the server 1000 may vary greatly depending on configuration or performance, and may include one or more central processing units (CPU) 1022 (for example, one or more processors), a memory 1032, and one or more storage media 1030 (for example, one or more mass storage devices) for storing an application program 1042 or data 1044.
  • the memory 1032 and the storage medium 1030 may be short-term storage or permanent storage.
  • the program stored in the storage medium 1030 may include one or more modules (not shown in the figure), and each module may include a series of command operations on the server.
  • the central processing unit 1022 may be configured to communicate with the storage medium 1030, and execute a series of instruction operations in the storage medium 1030 on the server 1000.
  • the server 1000 may also include one or more power supplies 1026, one or more wired or wireless network interfaces 1050, one or more input/output interfaces 1058, one or more keyboards 1056, and/or one or more operating systems 1041, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, and so on.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Acoustics & Sound (AREA)
  • Electrically Operated Instructional Devices (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

A scanning and shadow reading processing method and apparatus, a dictionary pen, and a machine-readable medium. The method comprises: in response to a scanning operation of a dictionary pen on text, obtaining a query result of the text and displaying same on a display interface of the dictionary pen (201); in response to a shadow reading speech input operation of a user on the display interface of the query result, obtaining shadow reading speech of the user (202); and obtaining a shadow reading evaluation result of the shadow reading speech on the basis of the shadow reading speech and standard reading speech corresponding to the shadow reading speech, and displaying the shadow reading evaluation result on the display interface of the dictionary pen (203). In this way, by means of a dictionary pen, shadow reading speech of a user can be acquired and evaluated, thereby better assisting the user in learning a query result of text.

Description

Method and related device for scanning and follow-up reading processing
This application claims priority to the Chinese patent application filed with the Chinese Patent Office on May 20, 2020 with application number 202010430268.4 and entitled "Method and related device for scanning and follow-up reading processing", the entire content of which is incorporated herein by reference.
Technical field
This application relates to the technical field of dictionary pens, and in particular to a scanning and follow-up reading processing method and related devices.
Background
The dictionary pen provides a dictionary lookup function: without manually entering text, the user scans the text directly, the pen recognizes it through optical character recognition (OCR) technology, and the dictionary is queried to obtain the query result of the text. Compared with paper dictionaries, electronic dictionaries, dictionary applications and the like, the dictionary pen is simpler, faster and more convenient for dictionary lookup.
The current dictionary pen only supports dictionary lookup. After the dictionary pen scans text and queries the dictionary to obtain the query result of the text, the user can learn the pronunciation and meaning of the text from the query result displayed on the display interface of the dictionary pen. However, when the user reads the pronunciation of the text aloud while learning it, the dictionary pen can neither obtain the user's follow-up voice nor determine whether the follow-up voice contains inaccurate pronunciation, so the user cannot know whether the pronunciation of the follow-up voice is accurate.
Summary of the invention
In view of this, this application provides a scanning and follow-up reading processing method and related devices, so that the user can know whether the pronunciation of the follow-up voice is accurate; that is, the dictionary pen can acquire and evaluate the user's follow-up voice, thereby better helping the user learn the query result of the text.
In a first aspect, an embodiment of the present application provides a scanning and follow-up reading processing method, applied to a dictionary pen, the method including:
in response to a scanning operation of the dictionary pen on text, obtaining a query result of the text and displaying it on a display interface of the dictionary pen;
in response to a follow-up voice input operation of a user on a query result display interface, obtaining a follow-up voice of the user;
based on the follow-up voice and a model reading voice corresponding to the follow-up voice, obtaining a follow-up evaluation result of the follow-up voice and displaying it on the display interface of the dictionary pen.
Optionally, the obtaining the follow-up voice of the user in response to the user's follow-up voice input operation on the query result display interface includes:
in response to the user's follow-up voice input operation on the query result display interface, obtaining an input voice of the user;
processing the input voice using voice noise reduction technology and voice activity detection technology to obtain the follow-up voice.
Optionally, the obtaining a follow-up evaluation result of the follow-up voice based on the follow-up voice and the model reading voice corresponding to the follow-up voice and displaying it on the display interface of the dictionary pen includes:
when the dictionary pen is in a networked network environment, sending the follow-up voice to a pronunciation evaluation server;
matching the follow-up voice with the model reading voice through the pronunciation evaluation server to obtain a matching degree between the follow-up voice and the model reading voice;
based on the matching degree between the follow-up voice and the model reading voice, obtaining the follow-up evaluation result of the follow-up voice and displaying it on the display interface of the dictionary pen.
Optionally, the obtaining a follow-up evaluation result of the follow-up voice based on the follow-up voice and the model reading voice corresponding to the follow-up voice and displaying it on the display interface of the dictionary pen includes:
when the dictionary pen is in an offline network environment, matching the follow-up voice with the model reading voice through a pronunciation evaluation offline toolkit to obtain a matching degree between the follow-up voice and the model reading voice;
based on the matching degree between the follow-up voice and the model reading voice, obtaining the follow-up evaluation result of the follow-up voice and displaying it on the display interface of the dictionary pen.
Optionally, the follow-up evaluation result includes at least a follow-up evaluation score.
Optionally, if the follow-up evaluation score is lower than a preset follow-up evaluation score, the follow-up evaluation result further includes a follow-up correction suggestion.
Optionally, when the follow-up evaluation result includes a follow-up correction suggestion, the method further includes:
recommending oral practice content based on the follow-up correction suggestion, and displaying it on the display interface of the dictionary pen.
Optionally, when the follow-up evaluation result includes a follow-up correction suggestion, the method further includes:
in response to a follow-up voice input operation of the user on a follow-up evaluation result display interface, obtaining a new follow-up voice of the user;
based on the new follow-up voice and the model reading voice, obtaining a follow-up evaluation result of the new follow-up voice and displaying it on the display interface of the dictionary pen.
第二方面,本申请实施例提供了一种扫描跟读处理的装置,应用于词典笔,所述装置包括:In the second aspect, an embodiment of the present application provides a scanning and reading processing device, which is applied to a dictionary pen, and the device includes:
第一获得显示单元,用于响应于词典笔对文本的扫描操作,获得所述文本的查询结果并显示在所述词典笔的显示界面上;The first obtaining and displaying unit is configured to obtain the query result of the text in response to the scanning operation of the text by the dictionary pen and display it on the display interface of the dictionary pen;
跟读语音获得单元,用于响应于用户在查询结果显示界面上的跟读语音输入操作,获得所述用户的跟读语音;The follow-up voice obtaining unit is used to obtain the follow-up voice of the user in response to the user's follow-up voice input operation on the query result display interface;
第二获得显示单元,用于基于所述跟读语音和所述跟读语音对应的范读语音,获得所述跟读语音的跟读评测结果并显示在所述词典笔的显示界面上。The second obtaining and displaying unit is configured to obtain the follow-up evaluation result of the follow-up voice based on the follow-up voice and the model-reading voice corresponding to the follow-up voice and display it on the display interface of the dictionary pen.
可选的,所述跟读语音获得单元包括:Optionally, the follow-up speech obtaining unit includes:
输入语音获得子单元,用于响应于用户在查询结果显示界面上的跟读语音输入操作,获得所述用户的输入语音;The input voice obtaining subunit is used to obtain the input voice of the user in response to the user's follow-up voice input operation on the query result display interface;
跟读语音获得子单元,用于利用语音降噪技术和语音活性检测技术处理所述输入语音,获得所述跟读语音。The follow-up voice obtaining subunit is used to process the input voice by using the voice noise reduction technology and the voice activity detection technology to obtain the follow-up voice.
可选的,所述第二获得显示单元包括:Optionally, the second obtaining and displaying unit includes:
发送子单元,用于当所述词典笔处于联网网络环境下,将所述跟读语音发送至所述读音评测服务器;The sending subunit is used to send the follow-up speech to the pronunciation evaluation server when the dictionary pen is in a networked network environment;
第一匹配获得子单元,用于通过所述读音评测服务器匹配所述跟读语音与所述范读语音,获得所述跟读语音与所述范读语音的匹配度;The first matching obtaining subunit is configured to match the follow-up speech and the model pronunciation through the pronunciation evaluation server to obtain the degree of matching between the follow-up speech and the model pronunciation;
第一获得显示子单元,用于基于所述跟读语音与所述范读语音的匹配度,获得所述跟读语音的跟读评测结果并显示在所述词典笔的显示界面上。The first obtaining and displaying subunit is configured to obtain the follow-up evaluation result of the follow-up voice based on the degree of matching between the follow-up voice and the model pronunciation and display it on the display interface of the dictionary pen.
可选的,所述第二获得显示单元包括:Optionally, the second obtaining and displaying unit includes:
第二匹配获得子单元，用于当所述词典笔处于离线网络环境下，通过读音评测离线工具包匹配所述跟读语音与所述范读语音，获得所述跟读语音与所述范读语音的匹配度；The second matching obtaining subunit is configured to, when the dictionary pen is in an offline network environment, match the follow-up speech with the model-reading speech through the pronunciation evaluation offline toolkit, to obtain the degree of matching between the follow-up speech and the model-reading speech;
第二获得显示子单元,用于基于所述跟读语音与所述范读语音的匹配度,获得所述跟读语音的跟读评测结果并显示在所述词典笔的显示界面上。The second obtaining and displaying subunit is configured to obtain the follow-up evaluation result of the follow-up voice based on the degree of matching between the follow-up voice and the model-read voice and display it on the display interface of the dictionary pen.
可选的,所述跟读评测结果至少包括跟读评测分数。Optionally, the follow-up evaluation result includes at least a follow-up evaluation score.
可选的,若所述跟读评测分数低于预设跟读评测分数,所述跟读评测结果还包括跟读纠音建议。Optionally, if the follow-up evaluation score is lower than the preset follow-up evaluation score, the follow-up evaluation result further includes a follow-up correction suggestion.
可选的,当所述跟读评测结果包括跟读纠音建议时,所述装置还包括:Optionally, when the follow-up evaluation result includes a follow-up correction suggestion, the device further includes:
推荐获得单元，用于基于所述跟读纠音建议推荐口语练习内容，并显示在所述词典笔的显示界面上。The recommendation obtaining unit is configured to recommend oral practice content based on the follow-up pronunciation correction suggestion, and display it on the display interface of the dictionary pen.
可选的,当所述跟读评测结果包括跟读纠音建议时,Optionally, when the follow-up evaluation result includes follow-up correction suggestions,
所述跟读语音获得单元,还用于响应于用户在跟读评测结果显示界面上的跟读语音输入操作,获得所述用户的新跟读语音;The follow-up voice obtaining unit is further configured to obtain the new follow-up voice of the user in response to the user's follow-up voice input operation on the follow-up evaluation result display interface;
所述第二获得显示单元,还用于基于所述新跟读语音和所述范读语音,获得所述新跟读语音的跟读评测结果并显示在所述词典笔的显示界面上。The second obtaining and displaying unit is further configured to obtain a follow-up evaluation result of the new follow-up voice based on the new follow-up voice and the model voice and display it on the display interface of the dictionary pen.
第三方面，本申请实施例提供了一种用于扫描跟读处理的词典笔，所述词典笔包括有存储器，以及一个或者一个以上的程序，其中一个或者一个以上程序存储于存储器中，且经配置以由一个或者一个以上处理器执行所述一个或者一个以上程序包含用于进行以下操作的指令：In a third aspect, an embodiment of the present application provides a dictionary pen for scanning and follow-up reading processing. The dictionary pen includes a memory and one or more programs, wherein the one or more programs are stored in the memory and are configured to be executed by one or more processors, and the one or more programs include instructions for performing the following operations:
响应于词典笔对文本的扫描操作,获得所述文本的查询结果并显示在所述词典笔的显示界面上;In response to the scanning operation of the dictionary pen on the text, obtaining the query result of the text and displaying it on the display interface of the dictionary pen;
响应于用户在查询结果显示界面上的跟读语音输入操作,获得所述用户的跟读语音;In response to the user's follow-up voice input operation on the query result display interface, obtain the follow-up voice of the user;
基于所述跟读语音和所述跟读语音对应的范读语音,获得所述跟读语音的跟读评测结果并显示在所述词典笔的显示界面上。Based on the follow-up voice and the model-reading voice corresponding to the follow-up voice, a follow-up evaluation result of the follow-up voice is obtained and displayed on the display interface of the dictionary pen.
第四方面，本申请实施例提供了一种机器可读介质，其上存储有指令，当由一个或多个处理器执行时，使得装置执行如上述第一方面中任一项所述的扫描跟读处理的方法。In a fourth aspect, an embodiment of the present application provides a machine-readable medium having instructions stored thereon which, when executed by one or more processors, cause an apparatus to perform the scanning and follow-up reading processing method described in any one of the above first aspects.
与现有技术相比,本申请至少具有以下优点:Compared with the prior art, this application has at least the following advantages:
采用本申请实施例的技术方案，用户利用词典笔扫描文本，词典笔获得文本的查询结果并显示在其显示界面上；用户在查询结果显示界面上点击跟读后输入语音，词典笔获得用户的跟读语音；依据跟读语音和其对应的范读语音，词典笔获得跟读语音的跟读评测结果并显示在其显示界面上。由此可见，当词典笔扫描文本显示其查询结果后，用户进行跟读时词典笔能够获取用户的跟读语音，并通过匹配跟读语音与范读语音判断跟读语音中是否存在发音不准的问题，以得到跟读评测结果并显示，使得用户能够明确其跟读语音的发音是否准确；即，词典笔能够实现用户的跟读语音的获取和评测，更好地帮助用户学习文本的查询结果。With the technical solutions of the embodiments of this application, the user scans text with the dictionary pen, and the dictionary pen obtains the query result of the text and displays it on its display interface; the user taps the follow-up reading control on the query result display interface and then inputs speech, and the dictionary pen obtains the user's follow-up speech; based on the follow-up speech and its corresponding model-reading speech, the dictionary pen obtains the follow-up evaluation result of the follow-up speech and displays it on its display interface. It can be seen that, after the dictionary pen scans the text and displays its query result, the dictionary pen can capture the user's follow-up speech while the user reads aloud, and judge whether the follow-up speech contains inaccurate pronunciation by matching the follow-up speech with the model-reading speech, so as to obtain and display the follow-up evaluation result, which lets the user know whether the pronunciation of the follow-up speech is accurate; that is, the dictionary pen can acquire and evaluate the user's follow-up speech, better helping the user learn the query result of the text.
附图说明Description of the drawings
为了更清楚地说明本申请实施例的技术方案,下面将对本申请实施例描述中所需要使用的附图作简单地介绍,显而易见地,下面描述中的附图仅仅是本申请中记载的一些实施例,对于本领域普通技术人员来讲,在不付出创造性劳动的前提下,还可以根据这些附图获得其它的附图。In order to explain the technical solutions of the embodiments of the present application more clearly, the following will briefly introduce the accompanying drawings used in the description of the embodiments of the present application. Obviously, the accompanying drawings in the following description are only some implementations recorded in the present application. For example, for those of ordinary skill in the art, without creative work, other drawings can be obtained from these drawings.
图1为本申请实施例中一种应用场景所涉及的***框架示意图;FIG. 1 is a schematic diagram of a system framework involved in an application scenario in an embodiment of this application;
图2为本申请实施例提供的一种扫描跟读处理的方法的流程示意图;2 is a schematic flowchart of a method for scanning and reading processing according to an embodiment of the application;
图3为本申请实施例提供的一种词典笔获得用户的跟读语音的过程示意图;FIG. 3 is a schematic diagram of a process in which a dictionary pen obtains a user's follow-up voice according to an embodiment of the application;
图4为本申请实施例提供的一种词典笔的跟读评测结果显示界面示意图;FIG. 4 is a schematic diagram of a display interface of a follow-up evaluation result of a dictionary pen provided by an embodiment of the application;
图5为本申请实施例提供的另一种词典笔的跟读评测结果显示界面示意图;FIG. 5 is a schematic diagram of a display interface for follow-up evaluation results of another dictionary pen provided by an embodiment of the application;
图6为本申请实施例提供的一种词典笔的口语练习内容显示界面示意图;FIG. 6 is a schematic diagram of a display interface for oral practice content of a dictionary pen provided by an embodiment of the application;
图7为本申请实施例提供的一种词典笔获得新跟读语音、获得新跟读语音的跟读评测结果并显示在词典笔的显示界面上的过程示意图;FIG. 7 is a schematic diagram of a process in which a dictionary pen obtains a new follow-up voice, and a follow-up evaluation result of the new follow-up voice is obtained and displayed on the display interface of the dictionary pen according to an embodiment of the application;
图8为本申请实施例提供的一种扫描跟读处理的装置的结构示意图;FIG. 8 is a schematic structural diagram of an apparatus for scanning and reading processing according to an embodiment of the application;
图9为本申请实施例提供的一种用于扫描跟读处理的词典笔的结构示意图；Fig. 9 is a schematic structural diagram of a dictionary pen for scanning and reading processing provided by an embodiment of the application;
图10为本申请实施例提供的一种服务器的结构示意图。FIG. 10 is a schematic structural diagram of a server provided by an embodiment of this application.
具体实施方式Detailed ways
为了使本技术领域的人员更好地理解本申请方案，下面将结合本申请实施例中的附图，对本申请实施例中的技术方案进行清楚、完整地描述，显然，所描述的实施例仅是本申请一部分实施例，而不是全部的实施例。基于本申请中的实施例，本领域普通技术人员在没有做出创造性劳动前提下所获得的所有其他实施例，都属于本申请保护的范围。In order to enable those skilled in the art to better understand the solutions of this application, the technical solutions in the embodiments of this application will be described clearly and completely below with reference to the accompanying drawings in the embodiments of this application. Obviously, the described embodiments are only a part of the embodiments of this application, rather than all of them. Based on the embodiments in this application, all other embodiments obtained by those of ordinary skill in the art without creative work shall fall within the protection scope of this application.
现阶段,词典笔支持查询词典的功能,即,用户利用词典笔扫描文本,词典笔查询词典获得该文本的查询结果显示在其显示界面上,用户浏览查询结果即可明确该文本的发音和含义。但是,发明人经过研究发现,词典笔查询词典的功能对于用户学习文本而言远远不够,当用户在学习文本过程中对文本的发音进行跟读时,词典笔无法获取用户的跟读语音,也无法判断用户的跟读语音中是否存在发音不准的问题,从而导致用户无法明确其跟读语音的发音是否准确。At this stage, the dictionary pen supports the function of querying the dictionary, that is, the user uses the dictionary pen to scan the text, and the dictionary pen searches the dictionary to obtain the query result of the text and displays it on its display interface. The user can browse the query result to clarify the pronunciation and meaning of the text . However, the inventor found through research that the function of the dictionary pen to query the dictionary is far from enough for the user to learn text. When the user follows the pronunciation of the text in the process of learning the text, the dictionary pen cannot obtain the user’s follow-up voice. It is also impossible to determine whether the user's follow-up speech has an inaccurate pronunciation, so that the user cannot know whether the pronunciation of the follow-up speech is accurate.
为了解决这一问题,在本申请实施例中,在用户利用词典笔扫描文本,词典笔获得文本的查询结果并显示在其显示界面上之后;用户在查询结果显示界面上点击跟读后输入语音,词典笔获得用户的跟读语音;依据跟读语音和其对应的范读语音,词典笔获得跟读语音的跟读评测结果并显示在其显示界面上。可见,当词典笔扫描文本显示其查询结果后,用户进行跟读时词典笔能够获取用户的跟读语音,并通过匹配跟读语音与范读语音判断跟读语音中是否存在发音不准的问题,以得到跟读评测结果并显示,使得用户能够明确其跟读语音的发音是否准确;即,词典笔能够实现用户的跟读语音的获取和评测,更好地帮助用户学习文本的查询结果。In order to solve this problem, in the embodiment of this application, after the user scans the text with the dictionary pen, the dictionary pen obtains the query result of the text and displays it on its display interface; the user clicks on the query result display interface and then enters the voice , The dictionary pen obtains the user's follow-up speech; according to the follow-up speech and its corresponding model pronunciation, the dictionary pen obtains the follow-up evaluation result of the follow-up speech and displays it on its display interface. It can be seen that when the dictionary pen scans the text and displays the query results, the dictionary pen can obtain the user's follow-up voice when the user is following it, and judge whether there is an inaccurate pronunciation in the follow-up voice by matching the follow-up voice with the model voice , In order to obtain and display the follow-up evaluation results, so that the user can know whether the pronunciation of the follow-up speech is accurate; that is, the dictionary pen can realize the user's follow-up speech acquisition and evaluation, and better help users learn the text query results.
举例来说，本申请实施例的场景之一，可以是应用到如图1所示的场景中，该场景包括词典笔100，用户利用词典笔100扫描文本，词典笔100获得文本的查询结果并显示在其显示界面上，以便用户浏览文本的查询结果；用户在词典笔100查询结果显示界面上点击跟读后输入语音，词典笔100获得用户的跟读语音；基于跟读语音和跟读语音对应的范读语音，词典笔100获得跟读语音的跟读评测结果并显示在其显示界面上，以便用户浏览跟读语音的跟读评测结果。For example, one of the scenarios of the embodiments of this application may be the scenario shown in FIG. 1, which includes a dictionary pen 100. The user uses the dictionary pen 100 to scan text, and the dictionary pen 100 obtains the query result of the text and displays it on its display interface so that the user can browse the query result of the text; the user taps the follow-up reading control on the query result display interface of the dictionary pen 100 and then inputs speech, and the dictionary pen 100 obtains the user's follow-up speech; based on the follow-up speech and the model-reading speech corresponding to the follow-up speech, the dictionary pen 100 obtains the follow-up evaluation result of the follow-up speech and displays it on its display interface so that the user can browse the follow-up evaluation result of the follow-up speech.
还需要说明的是,上述场景仅是本申请实施例提供的一个场景示例,本申请实施例并不限于此场景。It should also be noted that the above scenario is only an example of a scenario provided in the embodiment of the present application, and the embodiment of the present application is not limited to this scenario.
下面结合附图,通过实施例来详细说明本申请实施例中扫描跟读处理的方法及相关装置的具体实现方式。Hereinafter, in conjunction with the accompanying drawings, the specific implementation of the scanning follow-up processing method and related devices in the embodiments of the present application will be described in detail through embodiments.
示例性方法Exemplary method
参见图2,示出了本申请实施例中一种扫描跟读处理的方法的流程示意图。在本实施例中,应用于词典笔,所述方法例如可以包括以下步骤:Refer to FIG. 2, which shows a schematic flowchart of a scanning and reading processing method in an embodiment of the present application. In this embodiment, applied to a dictionary pen, the method may include the following steps, for example:
步骤201:响应于词典笔对文本的扫描操作,获得所述文本的查询结果并显示在所述词典笔的显示界面上。Step 201: In response to the scanning operation of the dictionary pen on the text, a query result of the text is obtained and displayed on the display interface of the dictionary pen.
首先，在词典笔具有查询词典的功能的基础上，用户利用词典笔扫描文本时，比如词语、短语或语句等文本，响应于词典笔通过OCR技术对文本的扫描操作，词典笔可以查询词典获得文本对应的查询结果，为了用户能够通过词典笔即可直观获取明确文本的查询结果，词典笔需要将文本的查询结果显示在其显示界面上。First, on the basis that the dictionary pen has a dictionary query function, when the user uses the dictionary pen to scan text such as a word, phrase, or sentence, in response to the dictionary pen's scanning operation on the text through OCR technology, the dictionary pen can query the dictionary to obtain the query result corresponding to the text; in order for the user to intuitively obtain the query result of the text directly through the dictionary pen, the dictionary pen needs to display the query result of the text on its display interface.
作为一种示例，用户利用词典笔通过OCR技术扫描的文本为英语单词“take”，词典笔查询词典获得英语单词“take”的查询结果为英语单词“take”的发音和含义，词典笔将英语单词“take”的发音和含义显示在其显示界面上；用户浏览词典笔的查询结果显示界面，即可明确英语单词“take”的发音和含义。As an example, the text scanned by the user with the dictionary pen through OCR technology is the English word "take"; the dictionary pen queries the dictionary and obtains the pronunciation and meaning of the English word "take" as the query result, and displays the pronunciation and meaning of the English word "take" on its display interface; by browsing the query result display interface of the dictionary pen, the user can learn the pronunciation and meaning of the English word "take".
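For illustration only, the following Python sketch shows the scan-query-display flow of step 201 in schematic form; the dictionary contents, the text normalization, and the display function are assumptions made for this sketch and are not part of the disclosed implementation.

```python
import unicodedata

# A hypothetical, greatly simplified local dictionary; the real pen queries a full dictionary database.
LOCAL_DICTIONARY = {
    "take": {"pronunciation": "/teɪk/", "meaning": "to get hold of; to carry or bring"},
}

def query_dictionary(scanned_text: str) -> dict:
    """Look up the text recognized by OCR and return its pronunciation and meaning."""
    key = unicodedata.normalize("NFKC", scanned_text).strip().lower()
    return LOCAL_DICTIONARY.get(key, {"pronunciation": "?", "meaning": "not found"})

def show_on_display(result: dict) -> None:
    """Stand-in for rendering the query result on the pen's display interface."""
    print(result["pronunciation"], "-", result["meaning"])

# After the pen's OCR engine recognizes the scanned strip as "take":
show_on_display(query_dictionary("take"))
```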
步骤202:响应于用户在查询结果显示界面上的跟读语音输入操作,获得所述用户的跟读语音。Step 202: In response to the user's follow-up voice input operation on the query result display interface, obtain the follow-up voice of the user.
当用户在学习文本过程中,在执行步骤201实现词典笔查询词典的功能后,词典笔仅仅支持查询词典功能对于用户学习文本而言远远不够,比如,用户对文本的发音进行跟读时,目前的词典笔无法与用户交互,则无法获取用户的跟读语音,也无法判断用户的跟读语音中是否存在发音不准的问题, 从而导致用户无法明确其跟读语音的发音是否准确。因此,在本申请实施例中,为了解决上述问题,在步骤201的基础上,在词典笔的查询结果显示界面上,设置允许用户进行跟读语音输入操作的控件,比如录音按键等等,当用户触控该控件输入跟读语音时,词典笔能够获得用户的跟读语音,以便能够判断用户的跟读语音中是否存在发音不准的问题。When the user is learning the text, after executing step 201 to implement the dictionary pen query function, the dictionary pen only supports the dictionary query function, which is far from enough for the user to learn the text. For example, when the user follows the pronunciation of the text, The current dictionary pen cannot interact with the user, and cannot obtain the user's follow-up speech, nor can it determine whether the user's follow-up speech has inaccurate pronunciation, so that the user cannot know whether the pronunciation of the follow-up speech is accurate. Therefore, in the embodiment of the present application, in order to solve the above-mentioned problem, on the basis of step 201, on the query result display interface of the dictionary pen, a control that allows the user to perform the voice input operation of the follow-up reading, such as the recording button, etc., is set. When the user touches the control to input the follow-up voice, the dictionary pen can obtain the follow-up voice of the user, so as to be able to judge whether there is an inaccurate pronunciation in the user's follow-up voice.
具体地,当用户在词典笔的查询结果显示界面上输入跟读语音时,词典笔将拾取到的语音作为用户的输入语音,其中包括一些环境噪声、白噪声等干扰噪声,以及跟读语音输入开始和输入结束时一定时延产生的静音期;为了获得更为准确的用户的跟读语音,需要利用语音降噪技术处理用户的输入语音,以消除输入语音中的干扰噪声,比如3米语音降噪技术等等,并利用语音活性检测技术处理用户的输入语音,以识别并消除输入语音中静音期,从而获得处理后的输入语音作为用户的跟读语音。因此,在本申请实施例一种可选的实施方式中,所述步骤202例如可以包括以下步骤:Specifically, when the user inputs the follow-up speech on the query result display interface of the dictionary pen, the dictionary pen uses the picked-up voice as the user's input voice, including some interference noises such as environmental noise and white noise, and follow-up speech input The silent period caused by a certain delay at the beginning and the end of the input; in order to obtain a more accurate user's follow-up speech, it is necessary to use voice noise reduction technology to process the user's input speech to eliminate the interference noise in the input speech, such as 3m speech Noise reduction technology, etc., and use voice activity detection technology to process the user's input voice to identify and eliminate the silent period in the input voice, so as to obtain the processed input voice as the user's follow-up voice. Therefore, in an optional implementation manner of the embodiment of the present application, the step 202 may include, for example, the following steps:
步骤A:响应于用户在查询结果显示界面上的跟读语音输入操作,获得所述用户的输入语音;Step A: In response to the user's follow-up voice input operation on the query result display interface, obtain the user's input voice;
步骤B:利用语音降噪技术和语音活性检测技术处理所述输入语音,获得所述跟读语音。Step B: Use voice noise reduction technology and voice activity detection technology to process the input voice to obtain the follow-up voice.
作为一种示例,如图3所示的一种词典笔获得用户的跟读语音的过程示意图。其中,词典笔的查询结果显示界面上预先设置录音按键,当用户触控该控件输入跟读语音时,词典笔获得用户的输入语音,利用3米语音降噪技术和语音活性检测技术处理该输入语音获得用户的跟读语音。As an example, a dictionary pen shown in FIG. 3 is a schematic diagram of a process in which a user's follow-up voice is obtained by a dictionary pen. Among them, a recording button is preset on the query result display interface of the dictionary pen. When the user touches the control to input the follow-up voice, the dictionary pen obtains the user's input voice, and uses 3m voice noise reduction technology and voice activity detection technology to process the input The voice obtains the user's follow-up voice.
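As a rough illustration of steps A-B, the Python sketch below trims the leading and trailing silent periods with an energy-based voice activity decision and applies a crude noise attenuation; the 16 kHz sampling rate, frame length, and thresholds are assumptions, and the 3-metre noise reduction technology mentioned above is device-specific and is not reproduced here.

```python
import numpy as np

FRAME = 320          # 20 ms frames at an assumed 16 kHz sampling rate
ENERGY_FLOOR = 3.0   # a frame counts as speech if its energy exceeds 3x the estimated noise energy

def frames(signal: np.ndarray) -> np.ndarray:
    n = len(signal) // FRAME
    return signal[: n * FRAME].reshape(n, FRAME)

def preprocess(input_speech: np.ndarray) -> np.ndarray:
    f = frames(input_speech)
    energy = (f ** 2).mean(axis=1)
    noise = np.percentile(energy, 10) + 1e-10      # assume the quietest frames are background noise
    voiced = energy > ENERGY_FLOOR * noise         # energy-based VAD decision per frame
    if not voiced.any():
        return np.array([])
    start = np.argmax(voiced)                      # first voiced frame
    end = len(voiced) - np.argmax(voiced[::-1])    # one past the last voiced frame
    kept = f[start:end]
    # crude noise attenuation: scale each kept frame by how much it exceeds the noise floor
    gain = np.clip(1.0 - noise / ((kept ** 2).mean(axis=1) + 1e-10), 0.0, 1.0)
    return (kept * gain[:, None]).reshape(-1)

# follow_up_speech = preprocess(raw_microphone_samples)
```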
步骤203:基于所述跟读语音和所述跟读语音对应的范读语音,获得所述跟读语音的跟读评测结果并显示在所述词典笔的显示界面上。Step 203: Obtain a follow-up evaluation result of the follow-up voice based on the follow-up voice and the model-reading voice corresponding to the follow-up voice and display it on the display interface of the dictionary pen.
在本申请实施例中，在执行步骤202实现词典笔获取用户的跟读语音的功能后，词典笔需要判断用户的跟读语音中是否存在发音不准的问题，而其判断标准是跟读语音对应的范读语音，该范读语音既可以是预先存储的专业人员范读文本得到的语音，也可以是文本的标准发音语音，还可以是基于人声和文本的标准发音合成得到的语音。在此基础上，对比跟读语音与其对应的范读语音，获得跟读语音的跟读评测结果，以表示用户的跟读语音中是否存在发音不准的问题；为了用户通过词典笔即可直观获取明确其跟读语音的发音是否准确，词典笔需要将跟读语音的跟读评测结果显示在其显示界面上。In the embodiments of this application, after step 202 is performed so that the dictionary pen obtains the user's follow-up speech, the dictionary pen needs to judge whether the user's follow-up speech contains inaccurate pronunciation, and the criterion for this judgment is the model-reading speech corresponding to the follow-up speech; the model-reading speech may be a pre-stored recording of a professional reading the text aloud, a standard pronunciation of the text, or speech synthesized from a human voice and the standard pronunciation of the text. On this basis, the follow-up speech is compared with its corresponding model-reading speech to obtain the follow-up evaluation result of the follow-up speech, which indicates whether the user's follow-up speech contains inaccurate pronunciation; in order for the user to intuitively learn from the dictionary pen whether the pronunciation of the follow-up speech is accurate, the dictionary pen needs to display the follow-up evaluation result of the follow-up speech on its display interface.
其中,词典笔可能处于联网网络环境下,也可能处于离线网络环境下,在本申请实施例中,词典笔不论处于联网网络环境下还是离线网络环境下,均能够实现用户的跟读语音的评测。下面分别对词典笔处于联网网络环境下以及词典笔处于离线网络环境下详细说明步骤203的具体实现方式:Among them, the dictionary pen may be in a networked network environment or in an offline network environment. In the embodiment of the application, the dictionary pen can realize the evaluation of the user's follow-up speech regardless of whether it is in the networked network environment or the offline network environment. . The following is a detailed description of the specific implementation of step 203 when the dictionary pen is in a networked network environment and the dictionary pen is in an offline network environment:
第一，当词典笔处于联网网络环境下，词典笔可以通过网络连接用于评测跟读语音的读音评测服务器，以通过该读音评测服务器完成用户的跟读语音的评测。具体地，词典笔首先需要将获得的跟读语音通过网络发送至读音评测服务器，读音评测服务器将接收到的跟读语音与其对应的范读语音进行匹配，以得到跟读语音与范读语音的匹配度，其中，匹配度越高表示跟读语音的发音越准确，匹配度越低表示跟读语音的发音越不准确；则通过跟读语音与范读语音的匹配度，可以获得跟读语音的跟读评测结果并显示在词典笔的显示界面上。因此，在本申请实施例一种可选的实施方式中，所述步骤203例如可以包括以下步骤：First, when the dictionary pen is in a networked environment, the dictionary pen can connect over the network to the pronunciation evaluation server used for evaluating follow-up speech, so that the evaluation of the user's follow-up speech is completed by the pronunciation evaluation server. Specifically, the dictionary pen first needs to send the obtained follow-up speech to the pronunciation evaluation server over the network; the pronunciation evaluation server matches the received follow-up speech with its corresponding model-reading speech to obtain the degree of matching between the follow-up speech and the model-reading speech, where a higher matching degree indicates that the pronunciation of the follow-up speech is more accurate and a lower matching degree indicates that it is less accurate; then, based on the matching degree between the follow-up speech and the model-reading speech, the follow-up evaluation result of the follow-up speech can be obtained and displayed on the display interface of the dictionary pen. Therefore, in an optional implementation of the embodiments of this application, step 203 may include, for example, the following steps:
步骤C:当所述词典笔处于联网网络环境下,将所述跟读语音发送至所述读音评测服务器。Step C: When the dictionary pen is in a networked network environment, the follow-up speech is sent to the pronunciation evaluation server.
步骤D:通过所述读音评测服务器匹配所述跟读语音与所述范读语音,获得所述跟读语音与所述范读语音的匹配度。Step D: Match the follow-up speech and the model pronunciation through the pronunciation evaluation server to obtain the degree of matching between the follow-up speech and the model pronunciation.
其中，在匹配跟读语音与范读语音时，既可以利用预设算法计算获得跟读语音与范读语音的匹配度，例如，将跟读语音和范读语音分别转换成跟读语音向量和范读语音向量，通过距离公式计算跟读语音向量与范读语音向量的距离，以获得跟读语音与范读语音的匹配度；也可以利用预先训练的用于获得跟读语音与范读语音的匹配度的神经网络模型，获得跟读语音与范读语音的匹配度，例如，将跟读语音和范读语音输入训练好的神经网络模型，该训练好的神经网络模型输出跟读语音与范读语音的匹配度。When matching the follow-up speech with the model-reading speech, a preset algorithm may be used to calculate the degree of matching between the two; for example, the follow-up speech and the model-reading speech are respectively converted into a follow-up speech vector and a model-reading speech vector, and the distance between the two vectors is calculated with a distance formula to obtain the degree of matching between the follow-up speech and the model-reading speech. Alternatively, a neural network model pre-trained for obtaining the matching degree between follow-up speech and model-reading speech may be used; for example, the follow-up speech and the model-reading speech are input into the trained neural network model, and the trained neural network model outputs the degree of matching between the follow-up speech and the model-reading speech.
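The distance-formula variant described above can be sketched as follows; the fixed-length speech vector used here (an averaged log-magnitude spectrum in 32 bands) and the mapping from distance to matching degree are illustrative assumptions rather than the features actually used by the pronunciation evaluation server or toolkit.

```python
import numpy as np

def speech_vector(signal: np.ndarray, bands: int = 32, frame: int = 512) -> np.ndarray:
    """Convert an utterance into a fixed-length vector: average log-magnitude spectrum in `bands` bands."""
    signal = np.pad(signal, (0, max(0, frame - len(signal))))   # guard against very short input
    n = len(signal) // frame
    spec = np.abs(np.fft.rfft(signal[: n * frame].reshape(n, frame), axis=1))
    log_spec = np.log1p(spec).mean(axis=0)                      # average spectrum over frames
    return np.array([band.mean() for band in np.array_split(log_spec, bands)])

def matching_degree(follow_up: np.ndarray, model_reading: np.ndarray) -> float:
    """Distance-formula variant: a smaller Euclidean distance means a higher matching degree."""
    distance = np.linalg.norm(speech_vector(follow_up) - speech_vector(model_reading))
    return float(1.0 / (1.0 + distance))
```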
步骤E:基于所述跟读语音与所述范读语音的匹配度,获得所述跟读语音的跟读评测结果并显示在所述词典笔的显示界面上。Step E: Obtain a follow-up evaluation result of the follow-up voice based on the degree of matching between the follow-up voice and the model-read voice and display it on the display interface of the dictionary pen.
第二，当词典笔处于离线网络环境下，虽然词典笔无法通过网络连接用于评测跟读语音的读音评测服务器，但是词典笔中预先装载有用于评测跟读语音的读音评测离线工具包，词典笔通过该读音评测离线工具包即可完成用户的跟读语音的评测。参见上述步骤D-步骤E的说明，词典笔首先通过读音评测离线工具包将获得的跟读语音与其对应的范读语音进行匹配，以得到跟读语音与范读语音的匹配度；然后，通过跟读语音与范读语音的匹配度，即可获得跟读语音的跟读评测结果并显示在词典笔的显示界面上。因此，在本申请实施例一种可选的实施方式中，所述步骤203例如可以包括以下步骤：Second, when the dictionary pen is in an offline network environment, although the dictionary pen cannot connect over the network to the pronunciation evaluation server used for evaluating follow-up speech, the dictionary pen is pre-loaded with a pronunciation evaluation offline toolkit for evaluating follow-up speech, and the dictionary pen can complete the evaluation of the user's follow-up speech through this offline toolkit. Referring to the description of steps D-E above, the dictionary pen first matches the obtained follow-up speech with its corresponding model-reading speech through the pronunciation evaluation offline toolkit to obtain the degree of matching between the follow-up speech and the model-reading speech; then, based on this matching degree, the follow-up evaluation result of the follow-up speech can be obtained and displayed on the display interface of the dictionary pen. Therefore, in an optional implementation of the embodiments of this application, step 203 may include, for example, the following steps:
步骤F:当所述词典笔处于离线网络环境下,通过读音评测离线工具包匹配所述跟读语音与所述范读语音,获得所述跟读语音与所述范读语音的匹配度。Step F: When the dictionary pen is in an offline network environment, match the follow-up speech and the model pronunciation through the pronunciation evaluation offline toolkit to obtain the degree of matching between the follow-up speech and the model pronunciation.
其中,步骤F的具体实施方式可参见上述步骤D的具体实施方式。For the specific implementation of step F, refer to the specific implementation of step D above.
步骤G:基于所述跟读语音与所述范读语音的匹配度,获得所述跟读语音的跟读评测结果并显示在所述词典笔的显示界面上。Step G: Obtain a follow-up evaluation result of the follow-up voice based on the degree of matching between the follow-up voice and the model-read voice and display it on the display interface of the dictionary pen.
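A minimal sketch of how the online path of steps C-E and the offline path of steps F-G could be dispatched is given below; the server and offline-toolkit objects and their match interfaces are hypothetical placeholders, not APIs defined by this application.

```python
def evaluate_follow_up(follow_up, model_reading, is_online: bool, server, offline_kit) -> dict:
    if is_online:
        # networked path (steps C-E): upload the follow-up speech and let the evaluation server match it
        degree = server.match(follow_up, model_reading)
    else:
        # offline path (steps F-G): the pre-installed offline toolkit performs the matching locally
        degree = offline_kit.match(follow_up, model_reading)
    # in both paths the matching degree is then turned into a result shown on the pen's display
    return {"matching_degree": degree, "score": round(degree * 100)}
```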
在本申请实施例中，跟读语音的跟读评测结果实质上是匹配跟读语音与其对应的范读语音，基于跟读语音与范读语音的匹配度获得的，则至少可以将跟读语音与范读语音的匹配度转换为跟读评测分数，将跟读评测分数作为跟读评测结果显示在词典笔的显示界面上。其中，跟读语音与范读语音的匹配度越高时跟读评测分数越高，表示跟读语音的发音越准确；跟读语音与范读语音的匹配度越低时跟读评测分数越低，表示跟读语音的发音越不准确，从而使得跟读评测结果包括跟读评测分数能够表示用户的跟读语音中是否存在发音不准的问题。因此，在本申请实施例一种可选的实施方式中，所述跟读评测结果至少包括跟读评测分数。In the embodiments of this application, the follow-up evaluation result of the follow-up speech is essentially obtained by matching the follow-up speech with its corresponding model-reading speech and is based on the degree of matching between the two; therefore, at least the matching degree between the follow-up speech and the model-reading speech can be converted into a follow-up evaluation score, and the follow-up evaluation score can be displayed on the display interface of the dictionary pen as the follow-up evaluation result. A higher matching degree between the follow-up speech and the model-reading speech yields a higher follow-up evaluation score, indicating that the pronunciation of the follow-up speech is more accurate; a lower matching degree yields a lower score, indicating that the pronunciation is less accurate, so that a follow-up evaluation result including the follow-up evaluation score can indicate whether the user's follow-up speech contains inaccurate pronunciation. Therefore, in an optional implementation of the embodiments of this application, the follow-up evaluation result includes at least a follow-up evaluation score.
作为一种示例，词典笔依据跟读语音和其对应的范读语音，获得跟读语音的跟读评测结果作为跟读评测分数；在词典笔的显示界面上显示跟读语音的跟读评测结果时，如图4所示的一种词典笔的跟读评测结果显示界面示意图；其中，该跟读评测结果显示界面上显示跟读语音的跟读评测分数，比如跟读评测分数为“100分”。As an example, based on the follow-up speech and its corresponding model-reading speech, the dictionary pen obtains a follow-up evaluation score as the follow-up evaluation result of the follow-up speech; when the follow-up evaluation result of the follow-up speech is displayed on the display interface of the dictionary pen, FIG. 4 shows a schematic diagram of a follow-up evaluation result display interface of a dictionary pen, on which the follow-up evaluation score of the follow-up speech is displayed, for example a follow-up evaluation score of "100 points".
在本申请实施例中，在跟读评测结果至少包括跟读评测分数的基础上，跟读评测分数越低表示跟读语音的发音越不准确，则可以预先设置一个表示跟读语音的发音准确的跟读评测分数作为预设跟读评测分数，当跟读评测分数低于该预设跟读评测分数时，表示用户的跟读语音的发音不准确，该跟读语音需要被纠正，则词典笔还需要获得该跟读语音对应的跟读纠音建议，与跟读评测分数共同作为跟读评测结果，并在词典笔的显示界面上按照预设顺序分别显示跟读评测分数和跟读纠音建议，或者在词典笔的显示界面上同时显示跟读评测分数和跟读纠音建议，以便用户可以学习词典笔的显示界面上所显示的跟读纠音建议，从而纠正其跟读语音。因此，在本申请实施例一种可选的实施方式中，若所述跟读评测分数低于预设跟读评测分数，所述跟读评测结果还包括跟读纠音建议。当然，当跟读评测分数高于预设跟读评测分数时，表示用户的跟读语音的发音准确，该跟读语音不需要被纠正。In the embodiments of this application, on the basis that the follow-up evaluation result includes at least a follow-up evaluation score, a lower follow-up evaluation score indicates that the pronunciation of the follow-up speech is less accurate; therefore, a follow-up evaluation score representing accurate pronunciation of the follow-up speech can be set in advance as the preset follow-up evaluation score. When the follow-up evaluation score is lower than this preset follow-up evaluation score, it indicates that the pronunciation of the user's follow-up speech is inaccurate and the follow-up speech needs to be corrected; the dictionary pen then also needs to obtain a follow-up pronunciation correction suggestion corresponding to the follow-up speech, which together with the follow-up evaluation score serves as the follow-up evaluation result, and the follow-up evaluation score and the correction suggestion are displayed on the display interface of the dictionary pen in a preset order, or displayed simultaneously, so that the user can learn from the correction suggestion displayed on the display interface of the dictionary pen and thereby correct the follow-up speech. Therefore, in an optional implementation of the embodiments of this application, if the follow-up evaluation score is lower than the preset follow-up evaluation score, the follow-up evaluation result further includes a follow-up pronunciation correction suggestion. Of course, when the follow-up evaluation score is higher than the preset follow-up evaluation score, it indicates that the pronunciation of the user's follow-up speech is accurate and the follow-up speech does not need to be corrected.
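The score-and-threshold logic described above can be sketched as follows; the 90-point preset score, the 0-100 score mapping, and the per-character accuracy values are assumptions used only to illustrate when a correction suggestion is attached to the result.

```python
PRESET_SCORE = 90   # assumed "preset follow-up evaluation score"

def follow_up_result(matching_degree: float, per_char_accuracy: dict) -> dict:
    score = round(matching_degree * 100)            # matching degree mapped to an evaluation score
    result = {"score": score}
    if score < PRESET_SCORE:
        # add a correction suggestion: highlight the characters judged to be mispronounced
        result["correction"] = [ch for ch, acc in per_char_accuracy.items() if acc < 0.6]
    return result

print(follow_up_result(0.82, {"t": 0.9, "ei": 0.4, "k": 0.8}))
# -> {'score': 82, 'correction': ['ei']}
```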
作为一种示例，词典笔依据跟读语音和其对应的范读语音，获得跟读语音的跟读评测结果过程中，若获得的跟读语音的跟读评测分数低于预设跟读评测分数，还需要获得对应的跟读纠音建议作为跟读评测结果；在词典笔的显示界面上显示跟读语音的跟读评测结果时，如图5所示的另一种词典笔的跟读评测结果显示界面示意图；其中，该跟读评测结果显示界面上，不仅显示跟读语音的跟读评测分数，比如跟读评测分数为“88分”；还需要显示跟读语音的跟读纠音建议，比如跟读纠音建议为标亮跟读语音中发音不准字符对应的发音，建议用户纠正发音不准字符对应的发音。As an example, in the process of obtaining the follow-up evaluation result of the follow-up speech based on the follow-up speech and its corresponding model-reading speech, if the obtained follow-up evaluation score of the follow-up speech is lower than the preset follow-up evaluation score, the dictionary pen also needs to obtain the corresponding follow-up pronunciation correction suggestion as part of the follow-up evaluation result; when the follow-up evaluation result of the follow-up speech is displayed on the display interface of the dictionary pen, FIG. 5 shows a schematic diagram of another follow-up evaluation result display interface of a dictionary pen, on which not only the follow-up evaluation score of the follow-up speech is displayed, for example "88 points", but also the follow-up pronunciation correction suggestion, for example highlighting the pronunciation of the inaccurately pronounced characters in the follow-up speech and suggesting that the user correct the pronunciation of those characters.
在本申请实施例中，当跟读评测结果包括跟读纠音建议时，表示用户的跟读语音的发音不准确，在用户的跟读语音需要被纠正的基础上，还可以依据跟读纠音建议向用户推荐一些辅助纠正用户的跟读语音的口语练习内容，该口语练习内容同样需要显示在词典笔的显示界面上，以便用户学习词典笔的显示界面上所显示的口语练习内容，辅助纠正用户的跟读语音。其中，口语练习内容可以是文本形式、图像形式、音频形式和/或视频形式。因此，在本申请实施例一种可选的实施方式中，当所述跟读评测结果包括跟读纠音建议时，在步骤203之后所述方法例如还可以包括步骤H：基于所述跟读纠音建议推荐口语练习内容，并显示在所述词典笔的显示界面上。In the embodiments of this application, when the follow-up evaluation result includes a follow-up pronunciation correction suggestion, it indicates that the pronunciation of the user's follow-up speech is inaccurate; on the basis that the user's follow-up speech needs to be corrected, some oral practice content that helps correct the user's follow-up speech can also be recommended to the user based on the correction suggestion. This oral practice content likewise needs to be displayed on the display interface of the dictionary pen, so that the user can learn the oral practice content displayed there and thereby correct the follow-up speech. The oral practice content may be in the form of text, images, audio and/or video. Therefore, in an optional implementation of the embodiments of this application, when the follow-up evaluation result includes a follow-up pronunciation correction suggestion, after step 203 the method may further include, for example, step H: recommending oral practice content based on the follow-up pronunciation correction suggestion, and displaying it on the display interface of the dictionary pen.
作为一种示例，如图6所示的一种词典笔的口语练习内容显示界面示意图；当跟读评测结果显示界面上显示跟读评测分数和跟读纠音建议，词典笔依据跟读纠音建议推荐口语练习内容，将口语练习内容显示在词典笔的显示界面上，比如口语练习内容为跟读纠音建议中标亮的发音对应的口语练习文本。As an example, FIG. 6 shows a schematic diagram of an oral practice content display interface of a dictionary pen; when the follow-up evaluation score and the follow-up pronunciation correction suggestion are displayed on the follow-up evaluation result display interface, the dictionary pen recommends oral practice content based on the correction suggestion and displays the oral practice content on its display interface, for example oral practice text corresponding to the pronunciation highlighted in the correction suggestion.
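For illustration, the following sketch selects practice content from a correction suggestion; the phoneme-to-exercise table is invented for this example, since the application does not specify how practice content is chosen.

```python
PRACTICE_LIBRARY = {
    "ei": ["take", "late", "play"],
    "th": ["think", "three", "mouth"],
}

def recommend_practice(correction: list, limit: int = 3) -> list:
    """Collect practice items for each sound flagged in the correction suggestion."""
    items = []
    for sound in correction:
        items.extend(PRACTICE_LIBRARY.get(sound, []))
    return items[:limit]    # displayed on the pen's oral-practice interface (cf. FIG. 6)

print(recommend_practice(["ei"]))   # -> ['take', 'late', 'play']
```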
在本申请实施例中，当跟读评测结果包括跟读纠音建议时，在用户学习词典笔的显示界面上所显示的用于纠正其跟读语音的跟读纠音建议后，用户还可以再次在词典笔的跟读评测结果显示界面上输入新跟读语音，以便词典笔获得用户的新跟读语音进行用户的新跟读语音的评测，从而获得新跟读语音的跟读评测结果并显示在词典笔的显示界面上。因此，在本申请实施例一种可选的实施方式中，当所述跟读评测结果包括跟读纠音建议时，在步骤203之后所述方法例如还可以包括以下步骤：In the embodiments of this application, when the follow-up evaluation result includes a follow-up pronunciation correction suggestion, after the user has studied the correction suggestion displayed on the display interface of the dictionary pen for correcting the follow-up speech, the user may again input a new follow-up speech on the follow-up evaluation result display interface of the dictionary pen, so that the dictionary pen obtains the user's new follow-up speech, evaluates it, and thereby obtains the follow-up evaluation result of the new follow-up speech and displays it on the display interface of the dictionary pen. Therefore, in an optional implementation of the embodiments of this application, when the follow-up evaluation result includes a follow-up pronunciation correction suggestion, after step 203 the method may further include, for example, the following steps:
步骤I:响应于用户在跟读评测结果显示界面上的跟读语音输入操作,获得所述用户的新跟读语音;Step I: In response to the user's follow-up voice input operation on the follow-up evaluation result display interface, obtain the user's new follow-up voice;
步骤J:基于所述新跟读语音和所述范读语音,获得所述新跟读语音的跟读评测结果并显示在所述词典笔的显示界面上。Step J: Obtain a follow-up evaluation result of the new follow-up voice based on the new follow-up voice and the model pronunciation and display it on the display interface of the dictionary pen.
作为一种示例，如图7所示的一种词典笔获得新跟读语音、获得新跟读语音的跟读评测结果并显示在词典笔的显示界面上的过程示意图。其中，在图5的基础上，该词典笔的跟读评测结果显示界面上预先设置有录音按键，当用户触控该控件输入新跟读语音时，词典笔获得用户的新跟读语音；词典笔依据新跟读语音和范读语音，获得新跟读语音的跟读评测结果并在词典笔的显示界面上显示，该跟读评测结果显示界面上显示新跟读语音的跟读评测分数，比如跟读评测分数为“100分”。As an example, FIG. 7 is a schematic diagram of a process in which a dictionary pen obtains a new follow-up speech, obtains the follow-up evaluation result of the new follow-up speech, and displays it on the display interface of the dictionary pen. On the basis of FIG. 5, a recording button is preset on the follow-up evaluation result display interface of the dictionary pen; when the user touches this control to input a new follow-up speech, the dictionary pen obtains the user's new follow-up speech; based on the new follow-up speech and the model-reading speech, the dictionary pen obtains the follow-up evaluation result of the new follow-up speech and displays it on its display interface, where the follow-up evaluation score of the new follow-up speech is shown, for example a follow-up evaluation score of "100 points".
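The re-reading loop of steps I-J can be sketched as follows; the recording and evaluation callables are placeholders for the flows already described, and the preset score is the same assumed threshold as above.

```python
def practice_until_pass(model_reading, record_follow_up, evaluate, preset_score: int = 90) -> dict:
    """Repeat: record a new follow-up speech, evaluate it, display it, until the preset score is reached."""
    while True:
        new_follow_up = record_follow_up()                  # user taps the record control again (cf. FIG. 7)
        result = evaluate(new_follow_up, model_reading)     # same evaluation as for the first attempt
        print(result)                                       # stand-in for refreshing the result interface
        if result["score"] >= preset_score:
            return result
```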
通过本实施例提供的各种实施方式，用户利用词典笔扫描文本，词典笔获得文本的查询结果并显示在其显示界面上；用户在查询结果显示界面上输入跟读语音，词典笔获得用户的跟读语音；依据跟读语音和其对应的范读语音，词典笔获得跟读语音的跟读评测结果并显示在其显示界面上。由此可见，当词典笔扫描文本显示其查询结果后，用户进行跟读时词典笔能够获取用户的跟读语音，并通过匹配跟读语音与范读语音判断跟读语音中是否存在发音不准的问题，以得到跟读评测结果并显示，使得用户能够明确其跟读语音的发音是否准确；即，词典笔能够实现用户的跟读语音的获取和评测，更好地帮助用户学习文本的查询结果。Through the various implementations provided in this embodiment, the user scans text with the dictionary pen, and the dictionary pen obtains the query result of the text and displays it on its display interface; the user inputs follow-up speech on the query result display interface, and the dictionary pen obtains the user's follow-up speech; based on the follow-up speech and its corresponding model-reading speech, the dictionary pen obtains the follow-up evaluation result of the follow-up speech and displays it on its display interface. It can be seen that, after the dictionary pen scans the text and displays its query result, the dictionary pen can capture the user's follow-up speech while the user reads aloud, and judge whether the follow-up speech contains inaccurate pronunciation by matching the follow-up speech with the model-reading speech, so as to obtain and display the follow-up evaluation result, which lets the user know whether the pronunciation of the follow-up speech is accurate; that is, the dictionary pen can acquire and evaluate the user's follow-up speech, better helping the user learn the query result of the text.
示例性装置Exemplary device
参见图8,示出了本申请实施例中一种扫描跟读处理的装置的结构示意图。在本实施例中,应用于词典笔,所述装置例如具体可以包括:Referring to FIG. 8, there is shown a schematic structural diagram of a scanning and reading processing apparatus in an embodiment of the present application. In this embodiment, applied to a dictionary pen, the device may specifically include, for example:
第一获得显示单元801,用于响应于词典笔对文本的扫描操作,获得所述文本的查询结果并显示在所述词典笔的显示界面上;The first obtaining and displaying unit 801 is configured to obtain the query result of the text in response to the scanning operation of the text by the dictionary pen and display it on the display interface of the dictionary pen;
跟读语音获得单元802,用于响应于用户在查询结果显示界面上的跟读语音输入操作,获得所述用户的跟读语音;The follow-up voice obtaining unit 802 is configured to obtain the follow-up voice of the user in response to the user's follow-up voice input operation on the query result display interface;
第二获得显示单元803,用于基于所述跟读语音和所述跟读语音对应的范读语音,获得所述跟读语音的跟读评测结果并显示在所述词典笔的显示界面上。The second obtaining and displaying unit 803 is configured to obtain a follow-up evaluation result of the follow-up voice based on the follow-up voice and the model-reading voice corresponding to the follow-up voice and display it on the display interface of the dictionary pen.
在本申请实施例一种可选的实施方式中,所述跟读语音获得单元802包括:In an optional implementation manner of the embodiment of the present application, the follow-up speech obtaining unit 802 includes:
输入语音获得子单元,用于响应于用户在查询结果显示界面上的跟读语音输入操作,获得所述用户的输入语音;The input voice obtaining subunit is used to obtain the input voice of the user in response to the user's follow-up voice input operation on the query result display interface;
跟读语音获得子单元,用于利用语音降噪技术和语音活性检测技术处理所述输入语音,获得所述跟读语音。The follow-up voice obtaining subunit is used to process the input voice by using the voice noise reduction technology and the voice activity detection technology to obtain the follow-up voice.
在本申请实施例一种可选的实施方式中,所述第二获得显示单元803包括:In an optional implementation manner of the embodiment of the present application, the second obtaining and displaying unit 803 includes:
发送子单元,用于当所述词典笔处于联网网络环境下,将所述跟读语音发送至所述读音评测服务器;The sending subunit is used to send the follow-up speech to the pronunciation evaluation server when the dictionary pen is in a networked network environment;
第一匹配获得子单元,用于通过所述读音评测服务器匹配所述跟读语音与所述范读语音,获得所述跟读语音与所述范读语音的匹配度;The first matching obtaining subunit is configured to match the follow-up speech and the model pronunciation through the pronunciation evaluation server to obtain the degree of matching between the follow-up speech and the model pronunciation;
第一获得显示子单元,用于基于所述跟读语音与所述范读语音的匹配度,获得所述跟读语音的跟读评测结果并显示在所述词典笔的显示界面上。The first obtaining and displaying subunit is configured to obtain the follow-up evaluation result of the follow-up voice based on the degree of matching between the follow-up voice and the model pronunciation and display it on the display interface of the dictionary pen.
在本申请实施例一种可选的实施方式中,所述第二获得显示单元803包括:In an optional implementation manner of the embodiment of the present application, the second obtaining and displaying unit 803 includes:
第二匹配获得子单元，用于当所述词典笔处于离线网络环境下，通过读音评测离线工具包匹配所述跟读语音与所述范读语音，获得所述跟读语音与所述范读语音的匹配度；The second matching obtaining subunit is configured to, when the dictionary pen is in an offline network environment, match the follow-up speech with the model-reading speech through the pronunciation evaluation offline toolkit, to obtain the degree of matching between the follow-up speech and the model-reading speech;
第二获得显示子单元,用于基于所述跟读语音与所述范读语音的匹配度,获得所述跟读语音的跟读评测结果并显示在所述词典笔的显示界面上。The second obtaining and displaying subunit is configured to obtain the follow-up evaluation result of the follow-up voice based on the degree of matching between the follow-up voice and the model pronunciation and display it on the display interface of the dictionary pen.
在本申请实施例一种可选的实施方式中,所述跟读评测结果至少包括跟读评测分数。In an optional implementation manner of the embodiment of the present application, the follow-up evaluation result includes at least a follow-up evaluation score.
在本申请实施例一种可选的实施方式中,若所述跟读评测分数低于预设跟读评测分数,所述跟读评测结果还包括跟读纠音建议。In an optional implementation manner of the embodiment of the present application, if the follow-up evaluation score is lower than the preset follow-up evaluation score, the follow-up evaluation result further includes a follow-up correction suggestion.
在本申请实施例一种可选的实施方式中,当所述跟读评测结果包括跟读纠音建议时,所述装置还包括:In an optional implementation manner of the embodiment of the present application, when the follow-up evaluation result includes a follow-up correction suggestion, the device further includes:
推荐获得单元，用于基于所述跟读纠音建议推荐口语练习内容，并显示在所述词典笔的显示界面上。The recommendation obtaining unit is configured to recommend oral practice content based on the follow-up pronunciation correction suggestion, and display it on the display interface of the dictionary pen.
在本申请实施例一种可选的实施方式中,当所述跟读评测结果包括跟读纠音建议时,In an optional implementation manner of the embodiment of the present application, when the follow-up evaluation result includes a follow-up correction suggestion,
所述跟读语音获得单元802,还用于响应于用户在跟读评测结果显示界面上的跟读语音输入操作,获得所述用户的新跟读语音;The follow-up voice obtaining unit 802 is further configured to obtain a new follow-up voice of the user in response to the user's follow-up voice input operation on the follow-up evaluation result display interface;
所述第二获得显示单元803,还用于基于所述新跟读语音和所述范读语音,获得所述新跟读语音的跟读评测结果并显示在所述词典笔的显示界面上。The second obtaining and displaying unit 803 is further configured to obtain a follow-up evaluation result of the new follow-up voice based on the new follow-up voice and the model voice and display it on the display interface of the dictionary pen.
通过本实施例提供的各种实施方式，用户利用词典笔扫描文本，词典笔获得文本的查询结果并显示在其显示界面上；用户在查询结果显示界面上点击跟读后输入语音，词典笔获得用户的跟读语音；依据跟读语音和其对应的范读语音，词典笔获得跟读语音的跟读评测结果并显示在其显示界面上。由此可见，当词典笔扫描文本显示其查询结果后，用户进行跟读时词典笔能够获取用户的跟读语音，并通过匹配跟读语音与范读语音判断跟读语音中是否存在发音不准的问题，以得到跟读评测结果并显示，使得用户能够明确其跟读语音的发音是否准确；即，词典笔能够实现用户的跟读语音的获取和评测，更好地帮助用户学习文本的查询结果。Through the various implementations provided in this embodiment, the user scans text with the dictionary pen, and the dictionary pen obtains the query result of the text and displays it on its display interface; the user taps the follow-up reading control on the query result display interface and then inputs speech, and the dictionary pen obtains the user's follow-up speech; based on the follow-up speech and its corresponding model-reading speech, the dictionary pen obtains the follow-up evaluation result of the follow-up speech and displays it on its display interface. It can be seen that, after the dictionary pen scans the text and displays its query result, the dictionary pen can capture the user's follow-up speech while the user reads aloud, and judge whether the follow-up speech contains inaccurate pronunciation by matching the follow-up speech with the model-reading speech, so as to obtain and display the follow-up evaluation result, which lets the user know whether the pronunciation of the follow-up speech is accurate; that is, the dictionary pen can acquire and evaluate the user's follow-up speech, better helping the user learn the query result of the text.
图9是根据一示例性实施例示出的一种用于扫描跟读处理的词典笔900 的框图。Fig. 9 is a block diagram showing a dictionary pen 900 for scanning and reading processing according to an exemplary embodiment.
参照图9，装置900可以包括以下一个或多个组件：处理组件902，存储器904，电源组件906，多媒体组件908，音频组件910，输入/输出（I/O）的接口912，传感器组件914，以及通信组件916。Referring to FIG. 9, the device 900 may include one or more of the following components: a processing component 902, a memory 904, a power supply component 906, a multimedia component 908, an audio component 910, an input/output (I/O) interface 912, a sensor component 914, and a communication component 916.
处理组件902通常控制装置900的整体操作,诸如与显示,电话呼叫,数据通信,相机操作和记录操作相关联的操作。处理组件902可以包括一个或多个处理器920来执行指令,以完成上述的方法的全部或部分步骤。此外,处理组件902可以包括一个或多个模块,便于处理组件902和其他组件之间的交互。例如,处理部件902可以包括多媒体模块,以方便多媒体组件908和处理组件902之间的交互。The processing component 902 generally controls the overall operations of the device 900, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing component 902 may include one or more processors 920 to execute instructions to complete all or part of the steps of the foregoing method. In addition, the processing component 902 may include one or more modules to facilitate the interaction between the processing component 902 and other components. For example, the processing component 902 may include a multimedia module to facilitate the interaction between the multimedia component 908 and the processing component 902.
存储器904被配置为存储各种类型的数据以支持在设备900的操作。这些数据的示例包括用于在装置900上操作的任何应用程序或方法的指令,联系人数据,电话簿数据,消息,图片,视频等。存储器904可以由任何类型的易失性或非易失性存储设备或者它们的组合实现,如静态随机存取存储器(SRAM),电可擦除可编程只读存储器(EEPROM),可擦除可编程只读存储器(EPROM),可编程只读存储器(PROM),只读存储器(ROM),磁存储器,快闪存储器,磁盘或光盘。The memory 904 is configured to store various types of data to support the operation of the device 900. Examples of such data include instructions for any application or method operating on the device 900, contact data, phone book data, messages, pictures, videos, etc. The memory 904 can be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable and Programmable Read Only Memory (EPROM), Programmable Read Only Memory (PROM), Read Only Memory (ROM), Magnetic Memory, Flash Memory, Magnetic Disk or Optical Disk.
电源组件906为装置900的各种组件提供电力。电源组件906可以包括电源管理系统，一个或多个电源，及其他与为装置900生成、管理和分配电力相关联的组件。The power supply component 906 provides power to various components of the device 900. The power supply component 906 may include a power management system, one or more power supplies, and other components associated with the generation, management, and distribution of power for the device 900.
多媒体组件908包括在所述装置900和用户之间的提供一个输出接口的屏幕。在一些实施例中，屏幕可以包括液晶显示器（LCD）和触摸面板（TP）。如果屏幕包括触摸面板，屏幕可以被实现为触摸屏，以接收来自用户的输入信号。触摸面板包括一个或多个触摸传感器以感测触摸、滑动和触摸面板上的手势。所述触摸传感器可以不仅感测触摸或滑动动作的边界，而且还检测与所述触摸或滑动操作相互关联的持续时间和压力。在一些实施例中，多媒体组件908包括一个前置摄像头和/或后置摄像头。当设备900处于操作模式，如拍摄模式或视频模式时，前置摄像头和/或后置摄像头可以接收外部的多媒体数据。每个前置摄像头和后置摄像头可以是一个固定的光学透镜系统或具有焦距和光学变焦能力。The multimedia component 908 includes a screen that provides an output interface between the device 900 and the user. In some embodiments, the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touch, sliding, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 908 includes a front camera and/or a rear camera. When the device 900 is in an operation mode, such as a shooting mode or a video mode, the front camera and/or the rear camera can receive external multimedia data. Each front camera and rear camera can be a fixed optical lens system or have focal length and optical zoom capabilities.
音频组件910被配置为输出和/或输入音频信号。例如,音频组件910包括一个麦克风(MIC),当装置900处于操作模式,如呼叫模式、记录模式和语音识别模式时,麦克风被配置为接收外部音频信号。所接收的音频信号可以被进一步存储在存储器904或经由通信组件916发送。在一些实施例中,音频组件910还包括一个扬声器,用于输出音频信号。The audio component 910 is configured to output and/or input audio signals. For example, the audio component 910 includes a microphone (MIC), and when the device 900 is in an operation mode, such as a call mode, a recording mode, and a voice recognition mode, the microphone is configured to receive external audio signals. The received audio signal may be further stored in the memory 904 or transmitted via the communication component 916. In some embodiments, the audio component 910 further includes a speaker for outputting audio signals.
I/O接口912为处理组件902和***接口模块之间提供接口,上述***接口模块可以是键盘,点击轮,按钮等。这些按钮可包括但不限于:主页按钮、音量按钮、启动按钮和锁定按钮。The I/O interface 912 provides an interface between the processing component 902 and a peripheral interface module. The above-mentioned peripheral interface module may be a keyboard, a click wheel, a button, and the like. These buttons may include, but are not limited to: home button, volume button, start button, and lock button.
传感器组件914包括一个或多个传感器,用于为装置900提供各个方面的状态评估。例如,传感器组件914可以检测到设备900的打开/关闭状态,组件的相对定位,例如所述组件为装置900的显示器和小键盘,传感器组件914还可以检测装置900或装置900一个组件的位置改变,用户与装置900接触的存在或不存在,装置900方位或加速/减速和装置900的温度变化。传感器组件914可以包括接近传感器,被配置用来在没有任何的物理接触时检测附近物体的存在。传感器组件914还可以包括光传感器,如CMOS或CCD图像传感器,用于在成像应用中使用。在一些实施例中,该传感器组件914还可以包括加速度传感器,陀螺仪传感器,磁传感器,压力传感器或温度传感器。The sensor component 914 includes one or more sensors for providing the device 900 with various aspects of state evaluation. For example, the sensor component 914 can detect the on/off status of the device 900 and the relative positioning of components. For example, the component is the display and the keypad of the device 900. The sensor component 914 can also detect the position change of the device 900 or a component of the device 900. , The presence or absence of contact between the user and the device 900, the orientation or acceleration/deceleration of the device 900, and the temperature change of the device 900. The sensor component 914 may include a proximity sensor configured to detect the presence of nearby objects when there is no physical contact. The sensor component 914 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 914 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor or a temperature sensor.
通信组件916被配置为便于装置900和其他设备之间有线或无线方式的通信。装置900可以接入基于通信标准的无线网络，如WiFi，2G或3G，或它们的组合。在一个示例性实施例中，通信部件916经由广播信道接收来自外部广播管理系统的广播信号或广播相关信息。在一个示例性实施例中，所述通信部件916还包括近场通信（NFC）模块，以促进短程通信。例如，NFC模块可基于射频识别（RFID）技术，红外数据协会（IrDA）技术，超宽带（UWB）技术，蓝牙（BT）技术和其他技术来实现。The communication component 916 is configured to facilitate wired or wireless communication between the device 900 and other devices. The device 900 can access a wireless network based on a communication standard, such as WiFi, 2G, or 3G, or a combination thereof. In an exemplary embodiment, the communication component 916 receives a broadcast signal or broadcast related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 916 further includes a near field communication (NFC) module to facilitate short-range communication. For example, the NFC module can be implemented based on radio frequency identification (RFID) technology, infrared data association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology and other technologies.
在示例性实施例中，装置900可以被一个或多个应用专用集成电路（ASIC）、数字信号处理器（DSP）、数字信号处理设备（DSPD）、可编程逻辑器件（PLD）、现场可编程门阵列（FPGA）、控制器、微控制器、微处理器或其他电子组件实现，用于执行上述方法。In an exemplary embodiment, the apparatus 900 may be implemented by one or more application specific integrated circuits (ASIC), digital signal processors (DSP), digital signal processing devices (DSPD), programmable logic devices (PLD), field programmable gate arrays (FPGA), controllers, microcontrollers, microprocessors, or other electronic components, for performing the above methods.
在示例性实施例中,还提供了一种包括指令的非临时性计算机可读存储介质,例如包括指令的存储器904,上述指令可由装置900的处理器920执行以完成上述方法。例如,所述非临时性计算机可读存储介质可以是ROM、随机存取存储器(RAM)、CD-ROM、磁带、软盘和光数据存储设备等。In an exemplary embodiment, there is also provided a non-transitory computer-readable storage medium including instructions, such as the memory 904 including instructions, and the foregoing instructions may be executed by the processor 920 of the device 900 to complete the foregoing method. For example, the non-transitory computer-readable storage medium may be ROM, random access memory (RAM), CD-ROM, magnetic tape, floppy disk, optical data storage device, etc.
一种非临时性计算机可读存储介质,当所述存储介质中的指令由移动终端的处理器执行时,使得能够执行一种扫描跟读处理的方法,所述方法包括:A non-transitory computer-readable storage medium, when the instructions in the storage medium are executed by the processor of the mobile terminal, so that a scan-and-read processing method can be executed, the method includes:
响应于词典笔对文本的扫描操作,获得所述文本的查询结果并显示在所述词典笔的显示界面上;In response to the scanning operation of the dictionary pen on the text, obtaining the query result of the text and displaying it on the display interface of the dictionary pen;
响应于用户在查询结果显示界面上的跟读语音输入操作,获得所述用户的跟读语音;In response to the user's follow-up voice input operation on the query result display interface, obtain the follow-up voice of the user;
基于所述跟读语音和所述跟读语音对应的范读语音,获得所述跟读语音的跟读评测结果并显示在所述词典笔的显示界面上。Based on the follow-up voice and the model-reading voice corresponding to the follow-up voice, a follow-up evaluation result of the follow-up voice is obtained and displayed on the display interface of the dictionary pen.
图10是本申请实施例中服务器的结构示意图。该服务器1000可因配置或性能不同而产生比较大的差异，可以包括一个或一个以上中央处理器（central processing units，CPU）1022（例如，一个或一个以上处理器）和存储器1032，一个或一个以上存储应用程序1042或数据1044的存储介质1030（例如一个或一个以上海量存储设备）。其中，存储器1032和存储介质1030可以是短暂存储或持久存储。存储在存储介质1030的程序可以包括一个或一个以上模块（图示没标出），每个模块可以包括对服务器中的一系列指令操作。更进一步地，中央处理器1022可以设置为与存储介质1030通信，在服务器1000上执行存储介质1030中的一系列指令操作。FIG. 10 is a schematic structural diagram of a server in an embodiment of this application. The server 1000 may vary considerably due to different configurations or performance, and may include one or more central processing units (CPU) 1022 (for example, one or more processors), a memory 1032, and one or more storage media 1030 (for example, one or more mass storage devices) storing application programs 1042 or data 1044. The memory 1032 and the storage medium 1030 may be transient storage or persistent storage. The programs stored in the storage medium 1030 may include one or more modules (not shown in the figure), and each module may include a series of instruction operations on the server. Furthermore, the central processing unit 1022 may be configured to communicate with the storage medium 1030 and execute, on the server 1000, the series of instruction operations in the storage medium 1030.
服务器1000还可以包括一个或一个以上电源1026,一个或一个以上有线或无线网络接口1050,一个或一个以上输入输出接口1058,一个或一个以上键盘1056,和/或,一个或一个以上操作***1041,例如Windows ServerTM,Mac OS XTM,UnixTM,LinuxTM,FreeBSDTM等等。The server 1000 may also include one or more power supplies 1026, one or more wired or wireless network interfaces 1050, one or more input and output interfaces 1058, one or more keyboards 1056, and/or, one or more operating systems 1041 , Such as Windows ServerTM, Mac OS XTM, UnixTM, LinuxTM, FreeBSDTM and so on.
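As one possible illustration of the online evaluation path, in which the dictionary pen sends the follow-reading speech to a pronunciation evaluation server and receives a matching degree in return, a minimal server endpoint could look like the sketch below. The endpoint, the JSON payload fields, and the length-ratio matching stub are assumptions made for illustration; they are not the actual server software, protocol, or evaluation algorithm of this application.

    # Illustrative sketch only; endpoint, payload, and matching stub are assumptions.
    import json
    from http.server import BaseHTTPRequestHandler, HTTPServer

    def matching_degree(follow, model):
        # Placeholder matching degree in [0, 100]; a real pronunciation evaluation
        # server would compare acoustic or phoneme-level features.
        if not follow or not model:
            return 0
        return round(100 * min(len(follow), len(model)) / max(len(follow), len(model)))

    class EvaluationHandler(BaseHTTPRequestHandler):
        def do_POST(self):
            # Expects a JSON body such as {"follow_speech": [...], "model_speech": [...]}.
            length = int(self.headers.get("Content-Length", 0))
            body = json.loads(self.rfile.read(length) or b"{}")
            score = matching_degree(body.get("follow_speech", []),
                                    body.get("model_speech", []))
            reply = json.dumps({"score": score}).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(reply)))
            self.end_headers()
            self.wfile.write(reply)

    if __name__ == "__main__":
        # A networked dictionary pen could POST the follow-reading speech here.
        HTTPServer(("0.0.0.0", 8000), EvaluationHandler).serve_forever()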
The embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and for identical or similar parts among the embodiments, reference may be made to one another. Since the apparatus disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief; for relevant details, reference may be made to the description of the method.
A person skilled in the art may further appreciate that the units and algorithm steps of the examples described in the embodiments disclosed herein can be implemented by electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of each example have been described above generally in terms of their functions. Whether these functions are performed by hardware or software depends on the specific application and the design constraints of the technical solution. A person skilled in the art may use different methods to implement the described functions for each specific application, but such implementation should not be considered to be beyond the scope of this application.
It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. The terms "comprise", "include", and any other variants thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or further includes elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the statement "including a ..." does not exclude the existence of other identical elements in the process, method, article, or device that includes the element.
The above descriptions are merely preferred embodiments of this application and are not intended to limit this application in any form. Although this application has been disclosed above by way of preferred embodiments, they are not intended to limit this application. Any person skilled in the art may, without departing from the scope of the technical solutions of this application, use the methods and technical content disclosed above to make many possible changes and modifications to the technical solutions of this application, or modify them into equivalent embodiments of equivalent changes. Therefore, any simple amendments, equivalent changes, and modifications made to the above embodiments according to the technical essence of this application without departing from the content of the technical solutions of this application still fall within the protection scope of the technical solutions of this application.

Claims (25)

  1. A scanning and follow-reading processing method, applied to a dictionary pen, the method comprising:
    in response to a scanning operation of the dictionary pen on text, obtaining a query result of the text and displaying the query result on a display interface of the dictionary pen;
    in response to a follow-reading voice input operation of a user on a query-result display interface, obtaining a follow-reading speech of the user; and
    based on the follow-reading speech and a model-reading speech corresponding to the follow-reading speech, obtaining a follow-reading evaluation result of the follow-reading speech and displaying the follow-reading evaluation result on the display interface of the dictionary pen.
  2. The method according to claim 1, wherein the obtaining the follow-reading speech of the user in response to the follow-reading voice input operation of the user on the query-result display interface comprises:
    in response to the follow-reading voice input operation of the user on the query-result display interface, obtaining an input speech of the user; and
    processing the input speech by using a speech noise reduction technique and a voice activity detection technique to obtain the follow-reading speech.
  3. The method according to claim 1, wherein the obtaining the follow-reading evaluation result of the follow-reading speech based on the follow-reading speech and the model-reading speech corresponding to the follow-reading speech and displaying the follow-reading evaluation result on the display interface of the dictionary pen comprises:
    when the dictionary pen is in a networked environment, sending the follow-reading speech to a pronunciation evaluation server;
    matching, by the pronunciation evaluation server, the follow-reading speech with the model-reading speech to obtain a matching degree between the follow-reading speech and the model-reading speech; and
    based on the matching degree between the follow-reading speech and the model-reading speech, obtaining the follow-reading evaluation result of the follow-reading speech and displaying the follow-reading evaluation result on the display interface of the dictionary pen.
  4. The method according to claim 1, wherein the obtaining the follow-reading evaluation result of the follow-reading speech based on the follow-reading speech and the model-reading speech corresponding to the follow-reading speech and displaying the follow-reading evaluation result on the display interface of the dictionary pen comprises:
    when the dictionary pen is in an offline network environment, matching the follow-reading speech with the model-reading speech by using a pronunciation evaluation offline toolkit to obtain a matching degree between the follow-reading speech and the model-reading speech; and
    based on the matching degree between the follow-reading speech and the model-reading speech, obtaining the follow-reading evaluation result of the follow-reading speech and displaying the follow-reading evaluation result on the display interface of the dictionary pen.
  5. The method according to any one of claims 1 to 4, wherein the follow-reading evaluation result comprises at least a follow-reading evaluation score.
  6. The method according to claim 5, wherein if the follow-reading evaluation score is lower than a preset follow-reading evaluation score, the follow-reading evaluation result further comprises a follow-reading pronunciation correction suggestion.
  7. The method according to claim 6, wherein when the follow-reading evaluation result comprises the follow-reading pronunciation correction suggestion, the method further comprises:
    recommending spoken-language practice content based on the follow-reading pronunciation correction suggestion, and displaying the spoken-language practice content on the display interface of the dictionary pen.
  8. The method according to claim 6, wherein when the follow-reading evaluation result comprises the follow-reading pronunciation correction suggestion, the method further comprises:
    in response to a follow-reading voice input operation of the user on a follow-reading evaluation-result display interface, obtaining a new follow-reading speech of the user; and
    based on the new follow-reading speech and the model-reading speech, obtaining a follow-reading evaluation result of the new follow-reading speech and displaying the follow-reading evaluation result on the display interface of the dictionary pen.
  9. A scanning and follow-reading processing apparatus, applied to a dictionary pen, the apparatus comprising:
    a first obtaining and displaying unit, configured to, in response to a scanning operation of the dictionary pen on text, obtain a query result of the text and display the query result on a display interface of the dictionary pen;
    a follow-reading speech obtaining unit, configured to, in response to a follow-reading voice input operation of a user on a query-result display interface, obtain a follow-reading speech of the user; and
    a second obtaining and displaying unit, configured to, based on the follow-reading speech and a model-reading speech corresponding to the follow-reading speech, obtain a follow-reading evaluation result of the follow-reading speech and display the follow-reading evaluation result on the display interface of the dictionary pen.
  10. The apparatus according to claim 9, wherein the follow-reading speech obtaining unit comprises:
    an input speech obtaining subunit, configured to, in response to the follow-reading voice input operation of the user on the query-result display interface, obtain an input speech of the user; and
    a follow-reading speech obtaining subunit, configured to process the input speech by using a speech noise reduction technique and a voice activity detection technique to obtain the follow-reading speech.
  11. The apparatus according to claim 9, wherein the second obtaining and displaying unit comprises:
    a sending subunit, configured to send the follow-reading speech to a pronunciation evaluation server when the dictionary pen is in a networked environment;
    a first matching obtaining subunit, configured to match, by the pronunciation evaluation server, the follow-reading speech with the model-reading speech to obtain a matching degree between the follow-reading speech and the model-reading speech; and
    a first obtaining and displaying subunit, configured to, based on the matching degree between the follow-reading speech and the model-reading speech, obtain the follow-reading evaluation result of the follow-reading speech and display the follow-reading evaluation result on the display interface of the dictionary pen.
  12. The apparatus according to claim 9, wherein the second obtaining and displaying unit comprises:
    a second matching obtaining subunit, configured to, when the dictionary pen is in an offline network environment, match the follow-reading speech with the model-reading speech by using a pronunciation evaluation offline toolkit to obtain a matching degree between the follow-reading speech and the model-reading speech; and
    a second obtaining and displaying subunit, configured to, based on the matching degree between the follow-reading speech and the model-reading speech, obtain the follow-reading evaluation result of the follow-reading speech and display the follow-reading evaluation result on the display interface of the dictionary pen.
  13. The apparatus according to any one of claims 9 to 12, wherein the follow-reading evaluation result comprises at least a follow-reading evaluation score.
  14. The apparatus according to claim 13, wherein if the follow-reading evaluation score is lower than a preset follow-reading evaluation score, the follow-reading evaluation result further comprises a follow-reading pronunciation correction suggestion.
  15. The apparatus according to claim 14, wherein when the follow-reading evaluation result comprises the follow-reading pronunciation correction suggestion, the apparatus further comprises:
    a recommendation obtaining unit, configured to recommend spoken-language practice content based on the follow-reading pronunciation correction suggestion and display the spoken-language practice content on the display interface of the dictionary pen.
  16. The apparatus according to claim 14, wherein when the follow-reading evaluation result comprises the follow-reading pronunciation correction suggestion,
    the follow-reading speech obtaining unit is further configured to, in response to a follow-reading voice input operation of the user on a follow-reading evaluation-result display interface, obtain a new follow-reading speech of the user; and
    the second obtaining and displaying unit is further configured to, based on the new follow-reading speech and the model-reading speech, obtain a follow-reading evaluation result of the new follow-reading speech and display the follow-reading evaluation result on the display interface of the dictionary pen.
  17. A dictionary pen for scanning and follow-reading processing, comprising a memory and one or more programs, wherein the one or more programs are stored in the memory and are configured to be executed by one or more processors, and the one or more programs comprise instructions for performing the following operations:
    in response to a scanning operation of the dictionary pen on text, obtaining a query result of the text and displaying the query result on a display interface of the dictionary pen;
    in response to a follow-reading voice input operation of a user on a query-result display interface, obtaining a follow-reading speech of the user; and
    based on the follow-reading speech and a model-reading speech corresponding to the follow-reading speech, obtaining a follow-reading evaluation result of the follow-reading speech and displaying the follow-reading evaluation result on the display interface of the dictionary pen.
  18. The apparatus according to claim 17, wherein the obtaining the follow-reading speech of the user in response to the follow-reading voice input operation of the user on the query-result display interface comprises:
    in response to the follow-reading voice input operation of the user on the query-result display interface, obtaining an input speech of the user; and
    processing the input speech by using a speech noise reduction technique and a voice activity detection technique to obtain the follow-reading speech.
  19. The apparatus according to claim 17, wherein the obtaining the follow-reading evaluation result of the follow-reading speech based on the follow-reading speech and the model-reading speech corresponding to the follow-reading speech and displaying the follow-reading evaluation result on the display interface of the dictionary pen comprises:
    when the dictionary pen is in a networked environment, sending the follow-reading speech to a pronunciation evaluation server;
    matching, by the pronunciation evaluation server, the follow-reading speech with the model-reading speech to obtain a matching degree between the follow-reading speech and the model-reading speech; and
    based on the matching degree between the follow-reading speech and the model-reading speech, obtaining the follow-reading evaluation result of the follow-reading speech and displaying the follow-reading evaluation result on the display interface of the dictionary pen.
  20. The apparatus according to claim 17, wherein the obtaining the follow-reading evaluation result of the follow-reading speech based on the follow-reading speech and the model-reading speech corresponding to the follow-reading speech and displaying the follow-reading evaluation result on the display interface of the dictionary pen comprises:
    when the dictionary pen is in an offline network environment, matching the follow-reading speech with the model-reading speech by using a pronunciation evaluation offline toolkit to obtain a matching degree between the follow-reading speech and the model-reading speech; and
    based on the matching degree between the follow-reading speech and the model-reading speech, obtaining the follow-reading evaluation result of the follow-reading speech and displaying the follow-reading evaluation result on the display interface of the dictionary pen.
  21. The apparatus according to any one of claims 17 to 20, wherein the follow-reading evaluation result comprises at least a follow-reading evaluation score.
  22. The apparatus according to claim 21, wherein if the follow-reading evaluation score is lower than a preset follow-reading evaluation score, the follow-reading evaluation result further comprises a follow-reading pronunciation correction suggestion.
  23. The apparatus according to claim 22, wherein when the follow-reading evaluation result comprises the follow-reading pronunciation correction suggestion, the one or more programs executed by the one or more processors further comprise instructions for performing the following operation:
    recommending spoken-language practice content based on the follow-reading pronunciation correction suggestion, and displaying the spoken-language practice content on the display interface of the dictionary pen.
  24. The apparatus according to claim 22, wherein when the follow-reading evaluation result comprises the follow-reading pronunciation correction suggestion, the one or more programs executed by the one or more processors further comprise instructions for performing the following operations:
    in response to a follow-reading voice input operation of the user on a follow-reading evaluation-result display interface, obtaining a new follow-reading speech of the user; and
    based on the new follow-reading speech and the model-reading speech, obtaining a follow-reading evaluation result of the new follow-reading speech and displaying the follow-reading evaluation result on the display interface of the dictionary pen.
  25. A machine-readable medium having instructions stored thereon, wherein the instructions, when executed by one or more processors, cause an apparatus to perform the scanning and follow-reading processing method according to any one of claims 1 to 8.
PCT/CN2021/074984 2020-05-20 2021-02-03 Scanning and shadow reading processing method and related apparatus WO2021232857A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010430268.4 2020-05-20
CN202010430268.4A CN111613244A (en) 2020-05-20 2020-05-20 Scanning and reading-following processing method and related device

Publications (1)

Publication Number Publication Date
WO2021232857A1 true WO2021232857A1 (en) 2021-11-25

Family

ID=72204968

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/074984 WO2021232857A1 (en) 2020-05-20 2021-02-03 Scanning and shadow reading processing method and related apparatus

Country Status (2)

Country Link
CN (1) CN111613244A (en)
WO (1) WO2021232857A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111613244A (en) * 2020-05-20 2020-09-01 北京搜狗科技发展有限公司 Scanning and reading-following processing method and related device
CN112989073A (en) * 2021-03-11 2021-06-18 读书郎教育科技有限公司 Method for scanning textbook and inquiring and matching textbook

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH04311989A (en) * 1991-04-11 1992-11-04 Seiko Epson Corp Voice utterance learning unit
CN201213041Y (en) * 2008-06-30 2009-03-25 东莞市步步高教育电子产品有限公司 Optical click-to-read machine
CN101727900A (en) * 2009-11-24 2010-06-09 北京中星微电子有限公司 Method and equipment for detecting user pronunciation
CN203217570U (en) * 2013-04-09 2013-09-25 李毅鹏 Translation machine
CN206147973U (en) * 2016-08-12 2017-05-03 西安外事学院 French word pen of giving financial aid to students
CN208384843U (en) * 2018-01-19 2019-01-15 烟台工程职业技术学院 A kind of Portable English learning-aid device
CN110085261A (en) * 2019-05-16 2019-08-02 上海流利说信息技术有限公司 A kind of pronunciation correction method, apparatus, equipment and computer readable storage medium
CN111613244A (en) * 2020-05-20 2020-09-01 北京搜狗科技发展有限公司 Scanning and reading-following processing method and related device

Family Cites Families (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20010093505A (en) * 2000-03-29 2001-10-29 원경주 A control methode for conversation study system
CN101123042A (en) * 2007-09-21 2008-02-13 无敌科技(西安)有限公司 English learning system and its method combining with pronunciation skills and video/audio image
CN101958060A (en) * 2009-08-28 2011-01-26 陈美含 English spelling instant technical tool
CN204965791U (en) * 2015-07-09 2016-01-13 张宇 Electronic dictionary pen
CN107578653A (en) * 2016-07-04 2018-01-12 武汉理工大学 A kind of pen for correcting irregular Chinese speech pronunciation
CN106649513B (en) * 2016-10-14 2020-03-31 盐城工学院 Audio data clustering method based on spectral clustering
JP6466391B2 (en) * 2016-12-15 2019-02-06 株式会社ファニー Language learning device
CN107944441B (en) * 2017-12-15 2020-09-08 郑州科技学院 Scanning type translation pen
CN108257615A (en) * 2018-01-15 2018-07-06 北京物灵智能科技有限公司 A kind of user language appraisal procedure and system
CN108899032A (en) * 2018-06-06 2018-11-27 平安科技(深圳)有限公司 Method for recognizing sound-groove, device, computer equipment and storage medium
CN110853674A (en) * 2018-07-24 2020-02-28 中兴通讯股份有限公司 Text collation method, apparatus, and computer-readable storage medium
CN109036464B (en) * 2018-09-17 2022-02-22 腾讯科技(深圳)有限公司 Pronunciation error detection method, apparatus, device and storage medium
CN109545244A (en) * 2019-01-29 2019-03-29 北京猎户星空科技有限公司 Speech evaluating method, device, electronic equipment and storage medium
TWM592580U (en) * 2019-03-22 2020-03-21 黃筠惠 Image type pronunciation systems
CN111079791A (en) * 2019-11-18 2020-04-28 京东数字科技控股有限公司 Face recognition method, face recognition device and computer-readable storage medium

Also Published As

Publication number Publication date
CN111613244A (en) 2020-09-01

Similar Documents

Publication Publication Date Title
JP6321296B2 (en) Text input method, apparatus, program, and recording medium
EP3171279A1 (en) Method and device for input processing
TW201725580A (en) Speech input method and terminal device
WO2018098865A1 (en) Message reading method and apparatus
CN106791921A (en) The processing method and processing device of net cast
WO2021232857A1 (en) Scanning and shadow reading processing method and related apparatus
WO2021031308A1 (en) Audio processing method and device, and storage medium
WO2021120690A1 (en) Speech recognition method and apparatus, and medium
JP7116088B2 (en) Speech information processing method, device, program and recording medium
US11335348B2 (en) Input method, device, apparatus, and storage medium
CN105139848B (en) Data transfer device and device
CN108962220A (en) Multimedia file plays the text display method and device under scene
WO2021208531A1 (en) Speech processing method and apparatus, and electronic device
CN105279499A (en) Age recognition method and device
US10764418B2 (en) Method, device and medium for application switching
CN112331194B (en) Input method and device and electronic equipment
CN107943317A (en) Input method and device
CN113936697A (en) Voice processing method and device for voice processing
CN105913841B (en) Voice recognition method, device and terminal
WO2017035985A1 (en) String storing method and device
CN114051157A (en) Input method and device
CN113591495A (en) Speech translation method, device and storage medium
CN108241438B (en) Input method, input device and input device
CN111913590A (en) Input method, device and equipment
CN111831132A (en) Information recommendation method and device and electronic equipment

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21807675

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 15.03.2023)

122 Ep: pct application non-entry in european phase

Ref document number: 21807675

Country of ref document: EP

Kind code of ref document: A1