CN111354339A - Method, device and equipment for constructing vocabulary phoneme table and storage medium - Google Patents

Info

Publication number
CN111354339A
Authority
CN
China
Prior art keywords
vocabulary
phonetic
phonetic symbols
symbols
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010150627.0A
Other languages
Chinese (zh)
Other versions
CN111354339B (en)
Inventor
赵伟伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
WeBank Co Ltd
Original Assignee
WeBank Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by WeBank Co Ltd filed Critical WeBank Co Ltd
Priority to CN202010150627.0A priority Critical patent/CN111354339B/en
Publication of CN111354339A publication Critical patent/CN111354339A/en
Application granted granted Critical
Publication of CN111354339B publication Critical patent/CN111354339B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/02Feature extraction for speech recognition; Selection of recognition unit
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/06Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/26Speech to text systems
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/02Feature extraction for speech recognition; Selection of recognition unit
    • G10L2015/025Phonemes, fenemes or fenones being the recognition units

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Machine Translation (AREA)
  • Document Processing Apparatus (AREA)

Abstract

The invention discloses a method, a device, equipment and a storage medium for constructing a vocabulary phoneme table, wherein the method comprises the following steps: selecting a plurality of vocabulary phonetic symbol conversion tools, and labeling the vocabulary to be labeled with phonetic symbols by each of the vocabulary phonetic symbol conversion tools to obtain a plurality of phonetic symbols of the vocabulary to be labeled; based on a voting strategy, selecting a winning phonetic symbol from the plurality of phonetic symbols as a target phonetic symbol; and converting the target phonetic symbol into phonemes to generate a vocabulary phoneme table. Because the vocabulary to be labeled is labeled by a plurality of vocabulary phonetic symbol conversion tools and the target phonetic symbol is determined by a voting strategy, both the quality of the vocabulary phoneme table and the efficiency of constructing it are improved.

Description

Method, device and equipment for constructing vocabulary phoneme table and storage medium
Technical Field
The invention relates to the technical field of voice recognition, in particular to a method, a device, equipment and a storage medium for constructing a vocabulary phoneme table.
Background
With the development of computer technology, more and more technologies (big data, distributed computing, blockchain, artificial intelligence, etc.) are being applied in the financial field, and the traditional financial industry is gradually shifting to financial technology (Fintech). However, the security and real-time requirements of the financial industry also place higher demands on these technologies.
The vocabulary phoneme table (lexicon) is a key part of building a hybrid speech recognition system. Generally, to improve recognition, a speech recognition system needs to convert words into finer-grained phonemes. Current approaches construct the vocabulary phoneme table either by manual labeling or from an open-source lexicon. Manual labeling is time-consuming and labor-intensive, while an open-source lexicon cannot guarantee quality and in most cases lacks the specialized vocabulary of professional fields.
Disclosure of Invention
The invention provides a method, a device, equipment and a storage medium for constructing a vocabulary phoneme table, aiming at improving the quality of the vocabulary phoneme table and improving the construction efficiency of the vocabulary phoneme table.
In order to achieve the above object, the present invention provides a method for constructing a vocabulary phoneme table, the method comprising:
selecting a plurality of vocabulary phonetic symbol conversion tools, and respectively labeling phonetic symbols for the vocabulary to be labeled by the vocabulary phonetic symbol conversion tools to obtain a plurality of phonetic symbols of the vocabulary to be labeled;
based on a voting strategy, selecting a winning phonetic symbol from the phonetic symbols as a target phonetic symbol;
and converting the target phonetic symbols into phonemes to generate a vocabulary phoneme table.
Preferably, the voting strategy determines the winning phonetic symbol according to the number of votes each phonetic symbol receives;
the step of selecting a winning phonetic symbol from the plurality of phonetic symbols as a target phonetic symbol based on the voting strategy comprises:
assigning an original vote to each phonetic symbol of the plurality of phonetic symbols;
if identical phonetic symbols exist among the plurality of phonetic symbols, merging the original votes of the identical phonetic symbols, marking each phonetic symbol whose votes have been merged as a candidate phonetic symbol, and counting the number of votes of each candidate phonetic symbol;
and sorting the candidate phonetic symbols by number of votes, determining a winning phonetic symbol according to the sorting result, and taking the winning phonetic symbol as the target phonetic symbol.
Preferably, after the step of assigning an original vote to each phonetic symbol of the plurality of phonetic symbols, the method further includes:
if no identical phonetic symbols exist among the plurality of phonetic symbols, determining that no winning phonetic symbol exists among the plurality of phonetic symbols;
marking the corresponding vocabulary to be labeled as ambiguous vocabulary, and transferring the ambiguous vocabulary into an ambiguous vocabulary pool;
and labeling the ambiguous vocabulary in the ambiguous vocabulary pool with phonetic symbols by a plurality of backup vocabulary phonetic symbol conversion tools until a winning phonetic symbol of the ambiguous vocabulary is obtained.
Preferably, before the step of selecting a winning phonetic symbol from the plurality of phonetic symbols as the target phonetic symbol based on the voting strategy, the method further includes:
determining whether the vocabulary to be labeled is a polyphonic word according to the number of phonetic symbols labeled by each vocabulary phonetic symbol conversion tool:
if the number of phonetic symbols labeled by one or more vocabulary phonetic symbol conversion tools is greater than 1, determining that the vocabulary to be labeled is a polyphonic word;
if the vocabulary to be labeled is a polyphonic word, splitting it into a plurality of sub-vocabularies to be labeled;
storing each of the sub-vocabularies to be labeled in association with its corresponding phonetic symbols, and, after the plurality of phonetic symbols of the sub-vocabularies to be labeled are obtained, executing the step of: selecting a winning phonetic symbol from the plurality of phonetic symbols as a target phonetic symbol based on a voting strategy.
Preferably, after the step of selecting a plurality of vocabulary phonetic symbol conversion tools and labeling the vocabulary to be labeled with phonetic symbols by each of the vocabulary phonetic symbol conversion tools to obtain a plurality of phonetic symbols of the vocabulary to be labeled, the method further comprises:
normalizing the plurality of phonetic symbols to obtain a plurality of phonetic symbols in a consistent format, so that the winning phonetic symbol is selected from the plurality of phonetic symbols in the consistent format.
Preferably, the step of converting the target phonetic symbols into phonemes and generating a vocabulary phoneme table comprises:
and converting the target phonetic symbols into phonemes based on a phoneme format, and generating a vocabulary phoneme table according to the phonemes of the vocabulary to be labeled.
Preferably, after the step of converting the target phonetic symbols into phonemes and generating the vocabulary phoneme table, the method further comprises:
receiving a vocabulary phoneme table update request, and acquiring a target update vocabulary and a target update operation from the vocabulary phoneme table update request;
and performing the target update operation on the phonemes of the target update vocabulary.
Further, in order to achieve the above object, the present invention provides a vocabulary phoneme table constructing apparatus including:
the selection module is used for selecting a plurality of vocabulary phonetic symbol conversion tools, and the plurality of vocabulary phonetic symbol conversion tools are used for marking phonetic symbols for the vocabulary to be marked respectively to obtain a plurality of phonetic symbols of the vocabulary to be marked;
the voting module is used for selecting a winning phonetic symbol from the phonetic symbols as a target phonetic symbol based on a voting strategy;
and the conversion module is used for converting the target phonetic symbols into phonemes to generate a vocabulary phoneme table.
Further, in order to achieve the above object, the present invention also provides a vocabulary phoneme table constructing apparatus including a processor, a memory, and a vocabulary phoneme table constructing program stored in the memory, the vocabulary phoneme table constructing program being executed by the processor to implement the steps of the vocabulary phoneme table constructing method as described above.
Further, to achieve the above object, the present invention also provides a computer storage medium having stored thereon a vocabulary phoneme table constructing program, which when executed by a processor, implements the steps of the vocabulary phoneme table constructing method as described above.
Compared with the prior art, the invention provides a method, a device, equipment and a storage medium for constructing a vocabulary phoneme table, wherein the method comprises the following steps: selecting a plurality of vocabulary phonetic symbol conversion tools, and labeling the vocabulary to be labeled with phonetic symbols by each of the vocabulary phonetic symbol conversion tools to obtain a plurality of phonetic symbols of the vocabulary to be labeled; based on a voting strategy, selecting a winning phonetic symbol from the plurality of phonetic symbols as a target phonetic symbol; and converting the target phonetic symbol into phonemes to generate a vocabulary phoneme table. Because the vocabulary to be labeled is labeled by a plurality of vocabulary phonetic symbol conversion tools and the target phonetic symbol is determined by a voting strategy, the effort of manual labeling is avoided, as is the inaccuracy that arises when manual labeling or an open-source lexicon lacks certain vocabulary such as rarely used words; the quality of the vocabulary phoneme table and the efficiency of constructing it are thereby improved.
Drawings
FIG. 1 is a diagram of a hardware architecture of a vocabulary phoneme table construction device according to embodiments of the present invention;
FIG. 2 is a flowchart illustrating a first embodiment of the vocabulary phoneme table constructing method of the present invention;
FIG. 3 is a flowchart illustrating a second embodiment of the vocabulary phoneme table construction method of the present invention;
FIG. 4 is a flowchart illustrating a third exemplary embodiment of a vocabulary phoneme table constructing method according to the present invention;
FIG. 5 is a functional block diagram of the vocabulary phoneme table constructing device according to the first embodiment of the present invention.
The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
The vocabulary phoneme table construction device to which the embodiments of the invention mainly relate is a device capable of network connection, and may be a server, a cloud platform or the like.
Referring to fig. 1, fig. 1 is a hardware configuration diagram of a vocabulary phoneme table constructing device according to embodiments of the present invention. In the embodiment of the present invention, the vocabulary phoneme table constructing device may include a processor 1001 (e.g., a central processing unit, CPU), a communication bus 1002, an input port 1003, an output port 1004, and a memory 1005. The communication bus 1002 provides connection and communication among these components; the input port 1003 is used for data input; the output port 1004 is used for data output. The memory 1005 may be a high-speed RAM memory or a non-volatile memory such as a magnetic disk memory, and may optionally be a storage device independent of the processor 1001. Those skilled in the art will appreciate that the hardware configuration depicted in FIG. 1 does not limit the present invention, which may include more or fewer components than those shown, combine some components, or arrange the components differently.
With continued reference to FIG. 1, the memory 1005 of FIG. 1, which is one type of readable storage medium, may include an operating system, a network communication module, an application program module, and a vocabulary phoneme table construction program. In fig. 1, the network communication module is mainly used for connecting to a server and performing data communication with the server; and the processor 1001 may call the vocabulary phoneme table construction program stored in the memory 1005 and perform the vocabulary phoneme table construction method provided by the embodiment of the present invention.
The embodiment of the invention provides a method for constructing a vocabulary phoneme table.
Referring to fig. 2, fig. 2 is a flow chart of the first embodiment of the vocabulary phoneme table constructing method of the present invention.
In this embodiment, the vocabulary phoneme table construction method is applied to a vocabulary phoneme table construction device, and the method includes:
Step S101, selecting a plurality of vocabulary phonetic symbol conversion tools, and labeling the vocabulary to be labeled with phonetic symbols by each of the vocabulary phonetic symbol conversion tools to obtain a plurality of phonetic symbols of the vocabulary to be labeled;
The vocabulary phoneme table is a key part of building a hybrid speech recognition system. Generally, to ensure recognition quality, a speech recognition system needs to convert words into finer-grained phonemes. There are millions of words, but a well-designed phoneme set often contains only a few hundred phonemes. By modeling these finer-grained phonemes instead of whole words, the speech recognition system can significantly reduce the search space and improve recognition. In this embodiment, the vocabulary to be labeled is prepared in advance and may include Chinese, English, Japanese, French and other vocabulary. It includes common vocabulary, uncommon vocabulary, professional vocabulary and the like.
Further, a plurality of vocabulary phonetic symbol conversion tools are selected, the number of tools being at least 3. The vocabulary phonetic symbol conversion tools include, but are not limited to, tools that convert Chinese vocabulary into phonetic symbols and tools that convert English vocabulary into American-pronunciation phonetic symbols.
A vocabulary phonetic symbol conversion tool converts a word into its corresponding phonetic symbol. For example, the Chinese word "专利" ("patent") may be converted to the phonetic symbol [zhuan1 li4]; the English word "patent" may be converted to the American phonetic symbol [ˈpætnt].
The phonetic symbols labeled by different vocabulary phonetic symbol conversion tools may not be in the same form. Therefore, after the plurality of phonetic symbols of the vocabulary to be labeled are obtained, they are normalized into a consistent format. Specifically, the normalized format is defined first, and the phonetic symbols are then converted to that format. For example, for Chinese vocabulary, the format may be set to "pinyin + tone", with the tone represented by the digits 1 (first tone), 2 (second tone), 3 (third tone), 4 (fourth tone) and 5 (neutral tone). If a vocabulary phonetic symbol conversion tool represents tones with tone marks such as "ˉ", "ˊ", "ˇ" or "ˋ", those marks are converted into the corresponding digits. For example, for the syllable "ma", if a tool outputs the phonetic symbol [má], it is normalized to [ma2].
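The normalization described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names `normalize_syllable` and `normalize` are hypothetical, and the sketch assumes that tools emit either tone-marked pinyin or pinyin that already ends in a tone digit, with syllables separated by spaces.

```python
import unicodedata

# Combining diacritics that serve as pinyin tone marks, mapped to tone digits.
# (Assumed input convention; a real tool may use other notations.)
TONE_MARKS = {"\u0304": 1, "\u0301": 2, "\u030c": 3, "\u0300": 4}


def normalize_syllable(syllable: str) -> str:
    """Convert one pinyin syllable to the 'pinyin + tone digit' format."""
    syllable = syllable.strip().lower()
    if syllable and syllable[-1] in "12345":
        return syllable  # already in the numeric format
    tone = 5  # neutral tone when no tone mark is present
    kept = []
    # NFD splits an accented vowel into base letter + combining marks.
    for ch in unicodedata.normalize("NFD", syllable):
        if ch in TONE_MARKS:
            tone = TONE_MARKS[ch]
        else:
            kept.append(ch)  # keeps letters and the diaeresis of "ü"
    base = unicodedata.normalize("NFC", "".join(kept))
    return f"{base}{tone}"


def normalize(symbol: str) -> str:
    """Normalize a space-separated phonetic symbol syllable by syllable."""
    return " ".join(normalize_syllable(s) for s in symbol.split())


print(normalize("má"))          # ma2
print(normalize("zhuan1 li4"))  # zhuan1 li4
```

Normalizing before voting matters because two tools that agree on the pronunciation but differ in notation would otherwise be counted as disagreeing.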
It can be understood that after the vocabulary to be labeled has been labeled with phonetic symbols by the plurality of vocabulary phonetic symbol conversion tools, each word has a plurality of corresponding phonetic symbols, and the number of phonetic symbols corresponds to the number of tools. For example, if the vocabulary to be labeled is labeled by 5 vocabulary phonetic symbol conversion tools, a word with a single pronunciation has 5 corresponding phonetic symbols, whereas a polyphonic word has more than 5 corresponding phonetic symbols.
Step S102, selecting a winning phonetic symbol from the phonetic symbols as a target phonetic symbol based on a voting strategy;
It can be understood that no vocabulary phonetic symbol conversion tool is one hundred percent accurate, so the phonetic symbols that different tools produce for the same vocabulary to be labeled may differ. In this embodiment, a voting strategy is adopted to select the correct phonetic symbol from the plurality of phonetic symbols of the vocabulary to be labeled.
Specifically, the step S102: based on the voting strategy, the step of selecting a winning phonetic symbol from the plurality of phonetic symbols as a target phonetic symbol comprises the following steps:
Step S102a: assigning an original vote to each phonetic symbol of the plurality of phonetic symbols;
All of the phonetic symbols are voting objects, and each receives an original vote. In this embodiment, the original votes are virtual votes, and every phonetic symbol is assigned the same number of original votes. To simplify counting, this embodiment sets that number to one.
Step S102b: if identical phonetic symbols exist among the plurality of phonetic symbols, merging the original votes of the identical phonetic symbols, marking each phonetic symbol whose votes have been merged as a candidate phonetic symbol, and counting the number of votes of each candidate phonetic symbol;
Generally, although no vocabulary phonetic symbol conversion tool is one hundred percent accurate, different tools will often produce the same phonetic symbol for the same vocabulary. The phonetic symbols labeled by the tools are compared, identical phonetic symbols are identified, and their original votes are merged. In this embodiment, the votes may be merged using a script, a statistics tool, or the like.
Further, each phonetic symbol whose votes have been merged is marked as a candidate phonetic symbol, and the number of votes of each candidate phonetic symbol is counted and stored for the subsequent selection of the winning phonetic symbol.
Step S102c: sorting the candidate phonetic symbols by number of votes, determining a winning phonetic symbol according to the sorting result, and taking the winning phonetic symbol as the target phonetic symbol.
After the votes are merged, the number of votes of each candidate phonetic symbol is calculated, and the candidate phonetic symbols are sorted by number of votes according to a sorting rule. If the sorting rule is descending, that is, from the most votes to the fewest, the candidate phonetic symbol ranked first is determined to be the winning phonetic symbol; if the sorting rule is ascending, that is, from the fewest votes to the most, the candidate phonetic symbol ranked last is determined to be the winning phonetic symbol. The winning phonetic symbol is taken as the target phonetic symbol.
For example, for the word "中" ("middle"), suppose 5 phonetic symbols are obtained: [zhong1], [zhong1], [zong1], [zhong1], [zong1]. Each of the 5 phonetic symbols is assigned one original vote; after merging, [zhong1] has 3 votes and [zong1] has 2 votes, and [zhong1] and [zong1] are the candidate phonetic symbols. If the sorting rule is descending, the candidates are sorted by number of votes as [zhong1] > [zong1]; [zhong1] ranks first and is therefore determined to be the winning phonetic symbol.
If the phonetic symbols of the vocabulary to be labeled include one or more phonetic symbols whose votes can be merged and one or more whose votes cannot, each merged phonetic symbol necessarily has more votes (at least two) than any unmerged phonetic symbol (exactly one), so the unmerged phonetic symbols can be ignored.
In addition, a minimum number of votes for the winning phonetic symbol can be set according to the number of vocabulary phonetic symbol conversion tools. For example, if there are 7 vocabulary phonetic symbol conversion tools, the minimum number of votes for the winning phonetic symbol may be set to 5. This further improves the accuracy of the vocabulary phoneme table.
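Steps S102a to S102c, together with the optional minimum vote count, can be sketched as below. The function name `select_winner` and the `min_votes` parameter are hypothetical. Treating a tie for first place as "no winner" is an assumption of this sketch; the text does not say how ties are handled.

```python
from collections import Counter
from typing import Optional, Sequence


def select_winner(symbols: Sequence[str], min_votes: int = 2) -> Optional[str]:
    """Pick the winning phonetic symbol by vote count, or None if there is none.

    Each symbol starts with one original vote; Counter merges the votes of
    identical symbols. min_votes=2 means at least two tools must agree.
    """
    ranked = Counter(symbols).most_common()  # sorted from most to fewest votes
    if not ranked:
        return None
    winner, votes = ranked[0]
    if votes < min_votes:
        return None  # no symbol reached the required number of votes
    if len(ranked) > 1 and ranked[1][1] == votes:
        return None  # assumption: a tie for first place yields no winner
    return winner


labels = ["zhong1", "zhong1", "zong1", "zhong1", "zong1"]
print(select_winner(labels))  # zhong1
```

With 7 tools and the stricter threshold from the example above, the call would be `select_winner(labels, min_votes=5)`.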
Further, after the step of assigning an original vote to each phonetic symbol of the plurality of phonetic symbols, the method further comprises:
Step S102a1: if no identical phonetic symbols exist among the plurality of phonetic symbols, determining that no winning phonetic symbol exists among the plurality of phonetic symbols;
If no identical phonetic symbols exist, every vocabulary phonetic symbol conversion tool has labeled a different phonetic symbol, and it is difficult to determine which tool is correct; it is therefore determined that no winning phonetic symbol exists.
Step S102a2: marking the corresponding vocabulary to be labeled as ambiguous vocabulary, and transferring the ambiguous vocabulary into an ambiguous vocabulary pool;
If no winning phonetic symbol exists, the accuracy of the labeled phonetic symbol is difficult to guarantee. The vocabulary to be labeled that has no winning phonetic symbol is therefore marked as ambiguous vocabulary and transferred to an ambiguous vocabulary pool.
Step S102a3: labeling the ambiguous vocabulary in the ambiguous vocabulary pool with phonetic symbols by a plurality of backup vocabulary phonetic symbol conversion tools until a winning phonetic symbol of the ambiguous vocabulary is obtained.
A plurality of backup vocabulary phonetic symbol conversion tools are selected to label the ambiguous vocabulary in the ambiguous vocabulary pool with phonetic symbols again. At least one of the backup tools differs from the tools used previously. Alternatively, the vocabulary in the ambiguous vocabulary pool may be labeled manually. Practical tests show that the ambiguous vocabulary produced by the technical solution of this embodiment accounts for less than 1% of the vocabulary to be labeled, so labeling it manually does not require much labor.
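The ambiguous-vocabulary handling of steps S102a1 to S102a3 might look like the following sketch. Everything here is illustrative: the `Tool` interface (a callable from word to normalized phonetic symbol), the helper `majority`, and the function `label_with_fallback` are hypothetical names, and words that remain unresolved after all backup rounds are simply returned for manual labeling.

```python
from collections import Counter
from typing import Callable, Dict, List, Optional, Sequence, Tuple

# Hypothetical tool interface: word -> normalized phonetic symbol.
Tool = Callable[[str], str]


def majority(symbols: Sequence[str]) -> Optional[str]:
    """Return the top-voted symbol, or None if no two tools agree or the top is tied."""
    ranked = Counter(symbols).most_common(2)
    if not ranked or ranked[0][1] < 2:
        return None
    if len(ranked) == 2 and ranked[1][1] == ranked[0][1]:
        return None
    return ranked[0][0]


def label_with_fallback(
    words: Sequence[str],
    primary_tools: Sequence[Tool],
    backup_rounds: Sequence[Sequence[Tool]],
) -> Tuple[Dict[str, str], List[str]]:
    """Vote with the primary tools, then retry the ambiguous pool with backups."""
    resolved: Dict[str, str] = {}
    pool: List[str] = []  # the ambiguous vocabulary pool
    for word in words:
        winner = majority([tool(word) for tool in primary_tools])
        if winner is None:
            pool.append(word)
        else:
            resolved[word] = winner
    for tools in backup_rounds:
        still_ambiguous: List[str] = []
        for word in pool:
            winner = majority([tool(word) for tool in tools])
            if winner is None:
                still_ambiguous.append(word)
            else:
                resolved[word] = winner
        pool = still_ambiguous
    return resolved, pool  # pool now holds the words left for manual labeling
```

Because the text reports that ambiguous vocabulary is under 1% of the input, the returned pool is expected to be small enough for manual review.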
It will be appreciated that in other embodiments a different voting strategy may be used to obtain the winning phonetic symbol. For example, the phonetic symbols are grouped so that identical phonetic symbols fall into the same group, yielding several groups of phonetic symbols; the number of phonetic symbols in each group is then counted as that group's number of votes, the groups are sorted by number of votes, and the winning phonetic symbol is determined according to the sorting result.
Step S103, converting the target phonetic symbols into phonemes to generate a vocabulary phoneme table.
Specifically, the target phonetic symbols are converted into phonemes based on a phoneme format, and the phonemes of the vocabulary to be labeled are saved as a vocabulary phoneme table. The phoneme format is determined in advance, and the target phonetic symbols are converted into phonemes according to it. The phoneme format includes the phoneme connection mode, the phoneme order and the like; the phoneme connection mode may be dot-separated, space-separated and so on. For example, for the word "普通话" ("Mandarin Chinese") to be labeled, the corresponding target phonetic symbol is [pu1 tong1 hua4], and the corresponding phonemes are represented as [p u1 t ong1 h ua4].
After the phonemes of all words in the vocabulary to be labeled are obtained, the phonemes are stored and the vocabulary phoneme table is generated.
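A space-separated phoneme format like the one in the example can be sketched by splitting each pinyin syllable into an initial and a tone-bearing final. This is only one possible phoneme format and is an assumption of the sketch: the initial inventory (including "y" and "w" as initials), the names `to_phonemes` and `build_lexicon`, and the "word phoneme phoneme ..." line layout are all hypothetical.

```python
from typing import Dict, List

# Pinyin initials, two-letter ones first so that "zh" is not read as "z".
INITIALS = ("zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l",
            "g", "k", "h", "j", "q", "x", "r", "z", "c", "s", "y", "w")


def syllable_to_phonemes(syllable: str) -> List[str]:
    """Split one numeric-tone syllable into [initial, final+tone]."""
    for initial in INITIALS:
        rest = syllable[len(initial):]
        if syllable.startswith(initial) and rest and rest[0].isalpha():
            return [initial, rest]
    return [syllable]  # no initial, e.g. "an1"


def to_phonemes(symbol: str) -> List[str]:
    """Convert a space-separated target phonetic symbol into phonemes."""
    phonemes: List[str] = []
    for syllable in symbol.split():
        phonemes.extend(syllable_to_phonemes(syllable))
    return phonemes


def build_lexicon(targets: Dict[str, str], sep: str = " ") -> List[str]:
    """Render one 'word phoneme phoneme ...' line per word."""
    return [sep.join([word] + to_phonemes(symbol))
            for word, symbol in targets.items()]


print(to_phonemes("tong1 hua4"))  # ['t', 'ong1', 'h', 'ua4']
```

Changing `sep` switches the connection mode, for example `sep="."` for a dot-separated table.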
In this embodiment, by the above scheme, a plurality of vocabulary phonetic symbol conversion tools are selected, and the vocabulary to be labeled is labeled with phonetic symbols by each of the vocabulary phonetic symbol conversion tools to obtain a plurality of phonetic symbols of the vocabulary to be labeled; based on a voting strategy, a winning phonetic symbol is selected from the plurality of phonetic symbols as a target phonetic symbol; and the target phonetic symbol is converted into phonemes to generate a vocabulary phoneme table. Because the vocabulary to be labeled is labeled by a plurality of vocabulary phonetic symbol conversion tools and the target phonetic symbol is determined by a voting strategy, both the quality of the vocabulary phoneme table and the efficiency of constructing it are improved.
As shown in fig. 3, a second embodiment of the present invention provides a method for constructing a vocabulary phoneme table, based on the first embodiment shown in fig. 2, before the step of selecting a winning phonetic symbol from the plurality of phonetic symbols as a target phonetic symbol based on a voting strategy, the method further includes:
Step S1011: determining whether the vocabulary to be labeled is a polyphonic word according to the phonetic symbols labeled by each vocabulary phonetic symbol conversion tool;
A polyphonic word is a word with two or more phonetic symbols, that is, a word with a single written form but different pronunciations. Specifically, whether the vocabulary to be labeled is a polyphonic word is determined according to the number of phonetic symbols labeled by each vocabulary phonetic symbol conversion tool.
Step S1012: if the number of phonetic symbols labeled by one or more vocabulary phonetic symbol conversion tools is greater than 1, determining that the vocabulary to be labeled is a polyphonic word;
In this embodiment, whether the vocabulary to be labeled is a polyphonic word is determined according to the number of phonetic symbols. If one or more vocabulary phonetic symbol conversion tools label more than 1 phonetic symbol, the vocabulary to be labeled is determined to be a polyphonic word. For example, the word "朝阳" ("facing the sun") has two phonetic symbols, [zhao1 yang2] and [chao2 yang2]. If the word is labeled by a plurality of vocabulary phonetic symbol conversion tools, then even allowing for tool errors, at least one tool can be expected to label both phonetic symbols.
Step S1013: if the vocabulary to be labeled is a polyphonic word, splitting the vocabulary to be labeled into a plurality of sub-vocabularies to be labeled.
In this embodiment, the number of sub-vocabularies to be labeled is the same as the number of phonetic symbols. For example, "朝阳" may be split into two sub-vocabularies.
Further, when splitting into sub-vocabularies, the vocabulary to be labeled itself is mapped to the most common phonetic symbol, and each other phonetic symbol is mapped to an entry of the form "vocabulary + suffix". The suffix may be a letter, a number, etc. In this embodiment, the most common phonetic symbol is the one used most frequently across the sentences containing the vocabulary in the available sentence libraries. For example, the polyphonic word "万" ("ten thousand") has the phonetic symbols [wan4] and [mo4], of which [wan4] is the most common; the word is therefore mapped to [万 wan4] and [万_2 mo4].
Step S1014: storing each of the plurality of sub-vocabularies to be labeled in association with its corresponding phonetic symbol, and, after the plurality of phonetic symbols of the sub-vocabularies to be labeled are obtained, executing the step of selecting a winning phonetic symbol from the plurality of phonetic symbols as a target phonetic symbol based on a voting strategy.
The plurality of sub-vocabularies to be labeled are stored in association with their corresponding phonetic symbols, and the mapping result of each sub-vocabulary to be labeled is recorded. In this way, the plurality of phonetic symbols of the sub-vocabularies to be labeled are obtained.
After the plurality of phonetic symbols of the sub-vocabularies to be labeled are obtained, step S102 is executed: selecting a winning phonetic symbol from the plurality of phonetic symbols as a target phonetic symbol based on a voting strategy.
In this embodiment, with the above scheme, a plurality of vocabulary phonetic symbol conversion tools are selected, and each of them labels phonetic symbols for the vocabulary to be labeled, so that a plurality of phonetic symbols of the vocabulary to be labeled are obtained; a winning phonetic symbol is selected from the plurality of phonetic symbols as a target phonetic symbol based on a voting strategy; and the target phonetic symbol is converted into phonemes to generate a vocabulary phoneme table. Because the phonetic symbols are labeled by a plurality of vocabulary phonetic symbol conversion tools and the target phonetic symbol is determined by voting, both the quality of the vocabulary phoneme table and the efficiency of constructing it are improved.
As shown in fig. 4, a third embodiment of the present invention proposes a method for constructing a vocabulary phoneme table based on the first embodiment and the second embodiment shown in fig. 2 and fig. 3, wherein, after the step of converting the target phonetic symbols into phonemes and generating the vocabulary phoneme table, the method further includes:
Step S104, receiving a vocabulary phoneme table updating request, and acquiring a target updating vocabulary and a target updating operation from the vocabulary phoneme table updating request;
After the construction of the vocabulary phoneme table is completed, entries in the table may need to be modified, added, or deleted in order to obtain a more complete and accurate vocabulary phoneme table.
Specifically, a vocabulary phoneme table updating request is received, wherein the vocabulary phoneme table updating request specifies an updating operation, and the available updating operations are modification, deletion, and addition. The vocabulary to be updated and the corresponding target updating operation are acquired from the vocabulary phoneme table updating request, the target updating operation being one of modification, addition, and deletion.
Step S105, based on the target updating operation, performing a corresponding updating operation on the phonemes of the target vocabulary.
The target vocabulary to be updated and the target updating operation corresponding to the target vocabulary are acquired from the vocabulary phoneme table updating request. If the target updating operation is modification, the modified phonemes are further acquired, and the phonemes of the target vocabulary in the vocabulary phoneme table are replaced with the modified phonemes; if the target updating operation is addition, the target vocabulary to be added and the phonemes of the target vocabulary are acquired, and the target vocabulary and its phonemes are then stored in the vocabulary phoneme table.
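The update handling in steps S104 and S105 can be sketched as a small dispatcher. The request layout (keys `operation`, `word`, `phonemes`), the operation names, and the phoneme values are hypothetical. The deletion branch is also an assumption: the text names deletion as an operation but does not spell out its handling.

```python
def apply_update(table, request):
    """Apply one update request to a vocabulary phoneme table in place.

    table: dict mapping a word to its list of phonemes.
    request: hypothetical dict with keys "operation" and "word", plus
        "phonemes" for the "modify" and "add" operations.
    """
    operation = request["operation"]
    word = request["word"]
    if operation == "modify":
        if word not in table:
            raise KeyError(f"cannot modify unknown word: {word}")
        # Replace the stored phonemes with the modified ones.
        table[word] = list(request["phonemes"])
    elif operation == "add":
        # Store the new word together with its phonemes.
        table[word] = list(request["phonemes"])
    elif operation == "delete":
        # Assumed behavior: drop the entry if it exists.
        table.pop(word, None)
    else:
        raise ValueError(f"unsupported operation: {operation}")
    return table


table = {"你好": ["n", "i3", "h", "ao3"]}
apply_update(table, {"operation": "add", "word": "万", "phonemes": ["w", "an4"]})
apply_update(table, {"operation": "delete", "word": "你好"})
print(table)
```

Rejecting a modification of an unknown word is a design choice made here so that typos in a request do not silently create new entries.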
In this embodiment, the added or modified phoneme may be a manually labeled phoneme, or may be a phoneme obtained by the scheme of the first embodiment of the present invention.
In this embodiment, with the above scheme, a plurality of vocabulary phonetic symbol conversion tools are selected, and each of them labels phonetic symbols for the vocabulary to be labeled, so that a plurality of phonetic symbols of the vocabulary to be labeled are obtained; a winning phonetic symbol is selected from the plurality of phonetic symbols as a target phonetic symbol based on a voting strategy; the target phonetic symbol is converted into phonemes to generate a vocabulary phoneme table; a vocabulary phoneme table updating request is received, and a target updating vocabulary and a target updating operation are acquired from it; and the corresponding updating operation is performed on the phonemes of the target vocabulary based on the target updating operation. Because the phonetic symbols are labeled by a plurality of vocabulary phonetic symbol conversion tools and the target phonetic symbol is determined by voting, both the quality of the vocabulary phoneme table and the efficiency of constructing it are improved. In addition, the vocabulary phoneme table can be updated quickly.
In addition, the embodiment also provides a device for constructing the vocabulary phoneme table. Referring to fig. 5, fig. 5 is a functional block diagram of a vocabulary phoneme table constructing device according to a first embodiment of the present invention.
In this embodiment, the vocabulary phoneme table constructing device is a virtual device stored in the memory 1005 of the vocabulary phoneme table constructing apparatus shown in fig. 1, and implements all functions of the vocabulary phoneme table constructing program: selecting a plurality of vocabulary phonetic symbol conversion tools and labeling phonetic symbols for the vocabulary to be labeled by each of them, to obtain a plurality of phonetic symbols of the vocabulary to be labeled; selecting, based on a voting strategy, a winning phonetic symbol from the plurality of phonetic symbols as a target phonetic symbol; and converting the target phonetic symbol into phonemes and generating a vocabulary phoneme table.
Specifically, the vocabulary phoneme table constructing device includes:
a selection module 10, configured to select a plurality of vocabulary phonetic symbol conversion tools, and label phonetic symbols for the vocabulary to be labeled by the plurality of vocabulary phonetic symbol conversion tools, respectively, to obtain a plurality of phonetic symbols of the vocabulary to be labeled;
a voting module 20, configured to select a winning phonetic symbol from the plurality of phonetic symbols as a target phonetic symbol based on a voting strategy;
a conversion module 30, configured to convert the target phonetic symbol into a phoneme, and generate a vocabulary phoneme table.
Further, the voting module comprises:
an assigning unit, configured to assign one original vote to each phonetic symbol of the plurality of phonetic symbols;
a vote merging unit, configured to, if identical phonetic symbols exist among the plurality of phonetic symbols, merge the original votes of the identical phonetic symbols, mark each phonetic symbol with merged votes as a candidate phonetic symbol, and count the number of votes of each candidate phonetic symbol;
a determining unit, configured to sort the candidate phonetic symbols by number of votes and determine the winning phonetic symbol according to the sorting result.
Further, the assigning unit further comprises:
a judging subunit, configured to judge that no winning phonetic symbol exists among the plurality of phonetic symbols if no identical phonetic symbols exist among them;
a marking subunit, configured to mark the corresponding vocabulary to be labeled as an ambiguous vocabulary and transfer the ambiguous vocabulary to an ambiguous vocabulary pool;
an obtaining subunit, configured to label phonetic symbols for the ambiguous vocabulary in the ambiguous vocabulary pool by a plurality of standby vocabulary phonetic symbol conversion tools until the winning phonetic symbol of the ambiguous vocabulary is obtained.
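The behavior of the voting module and its ambiguous-vocabulary fallback can be sketched as below. This is an illustration under stated assumptions, not the patented implementation: each tool is modeled as a hypothetical callable that returns one phonetic symbol per word, and the names `vote` and `label_word` are invented for the example. The text does not say how ties between candidates are resolved; this sketch treats a tie as "no winner".

```python
from collections import Counter


def vote(phonetic_symbols):
    """Return the winning phonetic symbol, or None if there is no winner."""
    # Counter gives each symbol one vote and merges votes of identical symbols.
    ranked = Counter(phonetic_symbols).most_common()
    # No two tools agree: there is no candidate and therefore no winner.
    if not ranked or ranked[0][1] < 2:
        return None
    # Assumption: a tie for first place is also treated as having no winner.
    if len(ranked) > 1 and ranked[1][1] == ranked[0][1]:
        return None
    return ranked[0][0]


def label_word(word, primary_tools, standby_tools, ambiguous_pool):
    """Label a word by voting, falling back to standby tools if needed."""
    winner = vote([tool(word) for tool in primary_tools])
    if winner is None:
        # Mark the word as ambiguous and retry with the standby tools.
        ambiguous_pool.append(word)
        winner = vote([tool(word) for tool in standby_tools])
        if winner is not None:
            ambiguous_pool.remove(word)
    return winner


primary = [lambda w: "zhao1 yang2", lambda w: "chao2 yang2", lambda w: "chao2 yang2"]
print(label_word("朝阳", primary, [], []))
```

A word that still has no winner after the standby tools stays in the pool, where it could be retried with further tools or reviewed manually.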
Further, the voting module further comprises:
a judging unit, configured to judge whether the vocabulary to be labeled is a polyphone according to the number of phonetic symbols labeled by each vocabulary phonetic symbol conversion tool;
the judging unit being further configured to determine that the vocabulary to be labeled is a polyphone if the number of phonetic symbols labeled by one or more vocabulary phonetic symbol conversion tools is more than 1;
a splitting unit, configured to split the polyphone into a plurality of sub-vocabularies to be labeled if the vocabulary to be labeled is a polyphone;
a storage unit, configured to store each of the plurality of sub-vocabularies to be labeled in association with its corresponding phonetic symbol, and, after the plurality of phonetic symbols of the sub-vocabularies to be labeled are obtained, to execute the step of selecting a winning phonetic symbol from the plurality of phonetic symbols as a target phonetic symbol based on a voting strategy.
Further, the selection module is further configured to:
normalize the plurality of phonetic symbols to obtain a plurality of phonetic symbols with a consistent format, so that the winning phonetic symbol is selected from the plurality of phonetic symbols with a consistent format.
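Normalization matters because two tools may return the same reading in different surface forms, and voting compares symbols for equality. The sketch below is only one possible normalization. The specific rules (dropping surrounding square brackets, lowercasing, collapsing whitespace) are assumptions chosen for illustration, and the function name `normalize` is hypothetical.

```python
def normalize(symbol):
    """Bring one phonetic symbol into a single canonical text form."""
    # Drop surrounding whitespace and square brackets, then lowercase.
    text = symbol.strip().strip("[]").lower()
    # Collapse any run of whitespace into a single space.
    return " ".join(text.split())


raw = ["[ zhao1 yang2]", "ZHAO1  YANG2", "zhao1 yang2"]
print({normalize(s) for s in raw})
```

After normalization the three raw strings above compare equal, so their votes can be merged.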
Further, the conversion module is further configured to:
convert the target phonetic symbols into phonemes based on a phoneme format, and generate a vocabulary phoneme table according to the phonemes of the vocabulary to be labeled.
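As an illustration of converting target phonetic symbols into phonemes, the sketch below splits each toned pinyin syllable into an initial and a toned final, a convention used in some Mandarin pronunciation lexicons. The phoneme format is not specified in this passage, so this convention, the `INITIALS` inventory, and both function names are assumptions.

```python
# Hypothetical inventory of pinyin initials. Two-letter initials come first
# so that "zh" is matched before "z". Treating "y" and "w" as initials is a
# lexicon convention assumed here, not something taken from the text.
INITIALS = ("zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l",
            "g", "k", "h", "j", "q", "x", "r", "z", "c", "s", "y", "w")


def syllable_to_phonemes(syllable):
    """Split one toned pinyin syllable into [initial, final] or [final]."""
    for initial in INITIALS:
        if syllable.startswith(initial) and len(syllable) > len(initial):
            return [initial, syllable[len(initial):]]
    # Syllables without an initial (for example "an1") are a single phoneme.
    return [syllable]


def build_phoneme_table(target_symbols):
    """Map each word to the phoneme sequence of its target phonetic symbol."""
    table = {}
    for word, symbol in target_symbols.items():
        phonemes = []
        for syllable in symbol.split():
            phonemes.extend(syllable_to_phonemes(syllable))
        table[word] = phonemes
    return table


print(build_phoneme_table({"朝阳": "chao2 yang2"}))
```

Keeping the tone digit on the final is one of several possible choices; a lexicon with separate tone symbols would need a different split.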
Further, the conversion module further comprises:
an acquisition unit, configured to receive a vocabulary phoneme table updating request and acquire a target updating vocabulary and a target updating operation from the vocabulary phoneme table updating request;
an updating unit, configured to perform a corresponding updating operation on the phonemes of the target vocabulary based on the target updating operation.
In addition, an embodiment of the present invention further provides a computer storage medium, where the computer storage medium stores a vocabulary phoneme table constructing program, and the vocabulary phoneme table constructing program is executed by a processor to implement the steps of the vocabulary phoneme table constructing method described above, which are not described herein again.
Compared with the prior art, the invention provides a vocabulary phoneme table construction method, device, equipment and storage medium, wherein the method comprises the following steps: selecting a plurality of vocabulary phonetic symbol conversion tools, and labeling phonetic symbols for the vocabulary to be labeled by each of the vocabulary phonetic symbol conversion tools, to obtain a plurality of phonetic symbols of the vocabulary to be labeled; selecting, based on a voting strategy, a winning phonetic symbol from the plurality of phonetic symbols as a target phonetic symbol; and converting the target phonetic symbol into phonemes to generate a vocabulary phoneme table. Because the phonetic symbols are labeled by a plurality of vocabulary phonetic symbol conversion tools and the target phonetic symbol is determined by voting, both the quality of the vocabulary phoneme table and the efficiency of constructing it are improved.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article, or system that comprises the element.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
Through the above description of the embodiments, those skilled in the art will clearly understand that the methods of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, and can also be implemented by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium as described above (e.g., ROM/RAM, magnetic disk, optical disk) and includes several instructions for causing a terminal device to execute the methods according to the embodiments of the present invention.
The above description is only for the preferred embodiment of the present invention and is not intended to limit the scope of the present invention, and all equivalent structures or flow transformations made by the present specification and drawings, or applied directly or indirectly to other related arts, are included in the scope of the present invention.

Claims (10)

1. A method of constructing a vocabulary phoneme table, the method comprising:
selecting a plurality of vocabulary phonetic symbol conversion tools, and respectively labeling phonetic symbols for the vocabulary to be labeled by the vocabulary phonetic symbol conversion tools to obtain a plurality of phonetic symbols of the vocabulary to be labeled;
based on a voting strategy, selecting a winning phonetic symbol from the phonetic symbols as a target phonetic symbol;
and converting the target phonetic symbols into phonemes to generate a vocabulary phoneme table.
2. The method of claim 1, wherein the voting strategy is used to determine a winning phonetic symbol based on the number of votes for the phonetic symbol;
the step of selecting a winning phonetic symbol from the plurality of phonetic symbols as a target phonetic symbol based on the voting strategy comprises:
assigning one original vote to each phonetic symbol of the plurality of phonetic symbols;
if identical phonetic symbols exist among the plurality of phonetic symbols, merging the original votes of the identical phonetic symbols, marking each phonetic symbol with merged votes as a candidate phonetic symbol, and counting the number of votes of each candidate phonetic symbol;
and sorting the candidate phonetic symbols by number of votes, determining a winning phonetic symbol according to the sorting result, and taking the winning phonetic symbol as the target phonetic symbol.
3. The method of claim 2, wherein, after said step of assigning one original vote to each phonetic symbol of the plurality of phonetic symbols, the method further comprises:
if no identical phonetic symbols exist among the plurality of phonetic symbols, judging that no winning phonetic symbol exists among the plurality of phonetic symbols;
marking the corresponding vocabulary to be labeled as an ambiguous vocabulary, and transferring the ambiguous vocabulary into an ambiguous vocabulary pool;
and labeling phonetic symbols for the ambiguous vocabulary in the ambiguous vocabulary pool by a plurality of standby vocabulary phonetic symbol conversion tools until the winning phonetic symbol of the ambiguous vocabulary is obtained.
4. The method according to claim 1, wherein said step of selecting a winning phonetic symbol from said plurality of phonetic symbols as a target phonetic symbol based on a voting strategy further comprises:
judging whether the vocabulary to be labeled is a polyphone according to the number of phonetic symbols labeled by each vocabulary phonetic symbol conversion tool;
if the number of phonetic symbols labeled by one or more vocabulary phonetic symbol conversion tools is more than 1, determining that the vocabulary to be labeled is a polyphone;
if the vocabulary to be labeled is a polyphone, splitting the polyphone into a plurality of sub-vocabularies to be labeled;
storing each of the plurality of sub-vocabularies to be labeled in association with its corresponding phonetic symbol, and, after the plurality of phonetic symbols of the sub-vocabularies to be labeled are obtained, executing the step of selecting a winning phonetic symbol from the plurality of phonetic symbols as a target phonetic symbol based on a voting strategy.
5. The method according to claim 1, wherein, after the step of selecting a plurality of vocabulary phonetic symbol conversion tools and labeling phonetic symbols for the vocabulary to be labeled by the plurality of vocabulary phonetic symbol conversion tools respectively to obtain a plurality of phonetic symbols of the vocabulary to be labeled, the method further comprises:
normalizing the plurality of phonetic symbols to obtain a plurality of phonetic symbols with a consistent format, so that the winning phonetic symbol is selected from the plurality of phonetic symbols with a consistent format.
6. The method of claim 1, wherein the step of converting the target phonetic symbols into phonemes to generate a vocabulary phoneme table comprises:
converting the target phonetic symbols into phonemes based on a phoneme format, and generating a vocabulary phoneme table according to the phonemes of the vocabulary to be labeled.
7. The method of claim 1, wherein, after the step of converting the target phonetic symbols into phonemes to generate a vocabulary phoneme table, the method further comprises:
receiving a vocabulary phoneme table updating request, and acquiring a target updating vocabulary and a target updating operation from the vocabulary phoneme table updating request;
and performing a corresponding updating operation on the phonemes of the target vocabulary based on the target updating operation.
8. A vocabulary phoneme table construction apparatus, the vocabulary phoneme table construction apparatus comprising:
a selection module, configured to select a plurality of vocabulary phonetic symbol conversion tools and to label phonetic symbols for the vocabulary to be labeled by the plurality of vocabulary phonetic symbol conversion tools respectively, to obtain a plurality of phonetic symbols of the vocabulary to be labeled;
a voting module, configured to select a winning phonetic symbol from the plurality of phonetic symbols as a target phonetic symbol based on a voting strategy;
and a conversion module, configured to convert the target phonetic symbols into phonemes to generate a vocabulary phoneme table.
9. A vocabulary phoneme table construction device, comprising a processor, a memory and a vocabulary phoneme table construction program stored in the memory, wherein the vocabulary phoneme table construction program, when executed by the processor, implements the steps of the vocabulary phoneme table construction method as claimed in any one of claims 1 to 7.
10. A computer storage medium having stored thereon a vocabulary phoneme table construction program which, when executed by a processor, implements the steps of the vocabulary phoneme table construction method of any of claims 1-7.
CN202010150627.0A 2020-03-05 2020-03-05 Vocabulary phoneme list construction method, device, equipment and storage medium Active CN111354339B (en)

Publications (2)

Publication Number Publication Date
CN111354339A (en) 2020-06-30
CN111354339B (en) 2023-11-03

Family

ID=71194340

Country Status (1)

Country Link
CN (1) CN111354339B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant