CN110069762A - A kind of document polyphone sort method, device, medium and electronic equipment - Google Patents


Info

Publication number
CN110069762A
Authority
CN
China
Prior art keywords
feature information
information
document
multitone
polyphone
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910204459.6A
Other languages
Chinese (zh)
Inventor
徐煜
王镇佳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin ByteDance Technology Co Ltd
Original Assignee
Tianjin ByteDance Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin ByteDance Technology Co Ltd
Priority to CN201910204459.6A
Publication of CN110069762A
Legal status: Pending


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/10: Text processing
    • G06F40/166: Editing, e.g. inserting or deleting
    • G06F40/177: Editing, e.g. inserting or deleting of tables; using ruled lines
    • G06F40/18: Editing, e.g. inserting or deleting of tables; using ruled lines of spreadsheets

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The present disclosure provides a method, apparatus, storage medium, and electronic device for sorting polyphonic characters (polyphones) in a document. The method comprises: selecting the region to sort and clicking the sort button; determining whether this is the first sort; if so, loading a polyphone lexicon and sorting after processing the polyphone lexicon; and returning the sort result. By compressing the database in advance and loading the polyphone database dynamically as the document requires, the disclosure ensures that document start-up time is not delayed by loading the database. In addition, by using a pinyin-based method for handling polyphones, Chinese terms containing polyphones can be quickly sorted in ascending or descending order. This solves the technical problem that polyphones in documents cannot be sorted, and meets users' need to sort polyphones without delaying document start-up.

Description

A kind of document polyphone sort method, device, medium and electronic equipment
Technical field
The present disclosure relates to the field of computer technology, and in particular to a method, apparatus, medium, and electronic device for sorting polyphones in a document.
Background
Document products currently on the market mainly include ***, Office, and the like, and most of them provide sorting. However, they essentially sort by character set (the representation of Chinese characters in computer storage, usually Unicode). Broader applications need sorting that accounts for polyphones, yet spreadsheets such as ***, Office, and the like do not provide a polyphone sorting function.
At present, the sorting function in spreadsheets does not support polyphone-aware sorting of Chinese. There are several reasons. One is that polyphones are complex, and it is hard to find a suitable dictionary that is compatible with all browsers. Another is that polyphone dictionaries are very large, so supporting polyphone sorting degrades document-processing performance. If sorting is done on the front end, a very large dictionary file must be introduced. If sorting is done on the back end, the conversion requires interface calls, and for a data-heavy application such as a spreadsheet the user experience suffers greatly.
Therefore, sorting polyphones in documents is a technical problem in urgent need of a solution.
Summary of the invention
The purpose of the present disclosure is to provide a method, apparatus, medium, and electronic device for sorting polyphones in a document, which can solve at least one of the technical problems mentioned above. The details are as follows:
According to specific embodiments of the present disclosure, in a first aspect, the disclosure provides a method for processing document content information, comprising:
Extracting the document content information;
Calling the polyphone lexicon loaded in the document, wherein the polyphone lexicon includes first feature information and second feature information of polyphone phrases;
Matching the content information against the first feature information and, if the match succeeds, processing the second feature information and returning the processed second feature information;
Sorting the content information according to the processed second feature information.
Optionally, processing the second feature information and returning the processed second feature information comprises:
Removing the mark information from the second feature information;
Returning the second feature information with the mark information removed.
Optionally, sorting the content information according to the processed second feature information comprises:
Sorting according to the second feature information of the first character in the content information, the second feature information being the second feature information with the mark information removed;
If the second feature information of the first character is identical, sorting according to the second feature information of the next character;
And so on, until the content information is fully sorted.
Optionally, after extracting the document content information, the method comprises:
Executing an instruction to call the polyphone lexicon;
Loading the polyphone lexicon when it is determined that the instruction to call the polyphone lexicon is being executed for the first time.
Optionally, loading the polyphone lexicon comprises:
Compressing the polyphone lexicon and loading the compressed polyphone lexicon.
According to specific embodiments of the present disclosure, in a second aspect, the disclosure provides an apparatus for processing document content information, comprising:
An extraction unit, for extracting the document content information;
A calling unit, for calling the polyphone lexicon loaded in the document, wherein the polyphone lexicon includes first feature information and second feature information of polyphone phrases;
A matching unit, for matching the content information against the first feature information and, if the match succeeds, processing the second feature information and returning the processed second feature information;
A sorting unit, for sorting the content information according to the processed second feature information.
Optionally, the matching unit is further configured to:
Remove the mark information from the second feature information;
Return the second feature information with the mark information removed.
Optionally, the sorting unit is further configured to:
Sort according to the second feature information of the first character in the content information, the second feature information being the second feature information with the mark information removed;
If the second feature information of the first character is identical, sort according to the second feature information of the next character;
And so on, until the content information is fully sorted.
According to specific embodiments of the present disclosure, in a third aspect, the disclosure provides a computer-readable storage medium on which a computer program is stored, the program implementing any of the methods described above when executed by a processor.
According to specific embodiments of the present disclosure, in a fourth aspect, the disclosure provides an electronic device, comprising: one or more processors; and a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement any of the methods described above.
Compared with the prior art, the present disclosure has the following technical effects:
According to the technical solution of the present disclosure, the database is compressed in advance and the polyphone database is loaded dynamically as the document requires, so that document start-up time is not delayed by loading the database. In addition, by using a pinyin-based method for handling polyphones, Chinese terms containing polyphones can be quickly sorted in ascending or descending order. This solves the technical problem that polyphones in documents cannot be sorted, and meets users' need to sort polyphones without delaying document start-up.
Brief description of the drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the specification, serve to explain its principles. The drawings described below evidently show only some embodiments of the disclosure; a person of ordinary skill in the art can derive other drawings from them without creative effort. In the drawings:
Fig. 1 shows a flowchart of a document polyphone sorting method according to an embodiment of the present disclosure;
Fig. 2 shows a flowchart of a document polyphone sorting method according to another embodiment of the present disclosure;
Fig. 3 shows a schematic diagram of the editing state of the document polyphone sorting method according to an embodiment of the present disclosure;
Fig. 4 shows a schematic diagram of the loading state of the document polyphone sorting method according to an embodiment of the present disclosure;
Fig. 5 shows a schematic diagram of the result-return state of the document polyphone sorting method according to an embodiment of the present disclosure;
Fig. 6 shows a schematic structural diagram of a document polyphone sorting apparatus according to an embodiment of the present disclosure;
Fig. 7 shows a schematic structural diagram of a document polyphone sorting apparatus according to another embodiment of the present disclosure;
Fig. 8 shows a schematic diagram of the connection structure of an electronic device according to an embodiment of the present disclosure.
Detailed description of embodiments
To make the purposes, technical solutions, and advantages of the present disclosure clearer, the disclosure is described in further detail below with reference to the drawings. The described embodiments are evidently only some, not all, of the embodiments of the disclosure. All other embodiments obtained by a person of ordinary skill in the art based on these embodiments without creative effort fall within the scope of protection of the disclosure.
The terms used in the embodiments of the present disclosure are for the purpose of describing particular embodiments only and are not intended to limit the disclosure. The singular forms "a", "said", and "the" used in the embodiments and the appended claims are intended to include the plural forms as well, unless the context clearly indicates otherwise; "a plurality of" generally means at least two.
It should be understood that the term "and/or" used herein merely describes an association between associated objects and indicates that three relationships are possible. For example, "A and/or B" can mean: A alone, both A and B, or B alone. In addition, the character "/" herein generally indicates an "or" relationship between the objects before and after it.
It should be understood that although the terms first, second, third, and so on may be used in the embodiments of the present disclosure to describe ..., these ... should not be limited by those terms. The terms are only used to distinguish ... from one another. For example, without departing from the scope of the embodiments of the disclosure, a first ... may also be called a second ..., and similarly a second ... may also be called a first ....
Depending on the context, the word "if" as used herein can be interpreted as "when", "upon", "in response to determining", or "in response to detecting". Similarly, depending on the context, the phrase "if it is determined" or "if (a stated condition or event) is detected" can be interpreted as "when it is determined", "in response to determining", "when (the stated condition or event) is detected", or "in response to detecting (the stated condition or event)".
It should also be noted that the terms "comprise", "include", and any variants thereof are intended to cover non-exclusive inclusion, so that a product or apparatus comprising a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to the product or apparatus. Without further limitation, an element preceded by "comprising a ..." does not exclude the presence of additional identical elements in the product or apparatus that comprises the element.
Optional embodiments of the present disclosure are described in detail below with reference to the drawings.
Embodiment 1
With reference to Fig. 1, according to a specific embodiment of the present disclosure, the disclosure provides a document polyphone sorting method that applies in particular to sorting polyphones in online documents, for example in spreadsheets within a document. It can also be applied to content that mixes polyphones and non-polyphones, as shown in Fig. 3 or Fig. 5. The main method steps are as follows:
Step S102: extract the document content information.
As shown in Fig. 3, the user selects the region to sort and clicks the sort button to start the sorting task.
Specifically, the user opens the document being edited. The document is typically one from Office, Apple, or any other office software; for convenience, Excel is used as the example below.
The user needs to sort the Chinese content of some region in Excel, in particular content involving polyphonic characters. For example, the region to sort contains "Chongqing", "weight", "growing up", "length", and so on. Suppose the content to sort lies in one column (the content is not necessarily in a single column and may span a region; a single column is used here for ease of understanding). The user can select a cell in the column header and click the ascending or descending sort button to invoke the sort function. After the back end receives the sort trigger command, it calls the corresponding Chinese polyphone sorting lexicon to carry out the sort.
Step S104: call the polyphone lexicon loaded in the document, wherein the polyphone lexicon includes first feature information and second feature information of polyphone phrases.
It should be understood that the polyphone lexicon is loaded into the document being edited from local storage, a server, or the cloud, and is called when the sort instruction is executed.
The first feature information may be the text itself, such as the polyphones "weight" and "length", or a storage code matched to each polyphone, where each storage code uniquely matches one polyphone. The second feature information may be the pinyin corresponding to the polyphone. The pinyin includes a symbol representing the tone, such as a five-character code, and the symbol may take any form from which the specific Chinese character can be spelled out. As one example, the polyphone lexicon stores polyphone phrases such as "Chongqing" together with the toned pinyin "chongqing".
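The lexicon described above can be pictured as a mapping from first feature information to second feature information. The following is a minimal sketch, not the patent's actual data format: the name `POLYPHONE_LEXICON`, the helper `lookup`, the Chinese phrases assumed to lie behind the translated examples, and the use of trailing digits as tone marks are all illustrative assumptions.

```python
# Hypothetical lexicon layout (an assumption, not the patent's format):
# key   = first feature information, here the phrase text itself
# value = second feature information, here pinyin with a tone digit per syllable
POLYPHONE_LEXICON = {
    "重庆": "chong2 qing4",   # "Chongqing"
    "重量": "zhong4 liang4",  # "weight"
    "长大": "zhang3 da4",     # "growing up"
    "长度": "chang2 du4",     # "length"
}


def lookup(phrase):
    """Return the toned pinyin for a polyphone phrase, or None if it is not in the lexicon."""
    return POLYPHONE_LEXICON.get(phrase)


print(lookup("重庆"))
print(lookup("你好"))
```

A plain dictionary keeps the lookup at constant time per phrase, which matters when a whole spreadsheet column is matched against the lexicon.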
Step S106: match the content information against the first feature information; if the match succeeds, process the second feature information and return the processed second feature information.
This step further includes: removing the mark information from the second feature information, and returning the second feature information with the mark information removed.
As one example, with reference to Fig. 3 or Fig. 5, after the phrases to sort are selected and the sort button is clicked, the phrases are matched in order against the polyphone lexicon already loaded into the document. Matching is performed first against the first feature information, for example the text itself. If a polyphone phrase is found, as for "Chongqing", the phrase is a polyphone phrase. Otherwise, as for the word "once", no corresponding polyphone phrase is found and the phrase is a non-polyphone phrase. Once a phrase is determined to be a polyphone phrase, its second feature information is processed: for example, the pinyin "chongqing" of "Chongqing" is processed and the result, such as the pinyin with tones removed, is returned. The mark information here may be the pinyin tone or the like.
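The match-then-strip behaviour of step S106 can be sketched as follows. This is an illustration under stated assumptions: the lexicon contents, the tone-digit notation, and the function names `strip_tones` and `match_phrase` are hypothetical and do not come from the patent.

```python
import re

# Hypothetical lexicon: phrase text -> pinyin with tone digits (assumed notation).
LEXICON = {
    "重庆": "chong2 qing4",
    "重量": "zhong4 liang4",
}


def strip_tones(toned):
    """Remove the mark information (tone digits 1-5) from each syllable."""
    return [re.sub(r"[1-5]", "", syllable) for syllable in toned.split()]


def match_phrase(phrase, lexicon=LEXICON):
    """Match a phrase against the first feature information.

    Returns the tone-stripped syllables on a successful match,
    or None when the phrase is not a known polyphone phrase.
    """
    toned = lexicon.get(phrase)
    if toned is None:
        return None
    return strip_tones(toned)


print(match_phrase("重庆"))
print(match_phrase("曾经"))
```

Returning `None` for an unmatched phrase leaves the caller free to decide how non-polyphone phrases are ordered, which the text does not specify.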
Step S108: sort the content information according to the processed second feature information.
Optionally, this includes the following sub-steps:
First, sort according to the second feature information of the first character in the content information, the second feature information being that with the mark information removed.
For example, sorting is performed on the tone-stripped pinyin of all phrases: the tone-stripped pinyin of the first characters of "Chongqing" and "weight" is arranged in front-to-back or back-to-front order.
Second, if the second feature information of the first character is identical, sort according to the second feature information of the next character.
For example, when sorting "Chongqing" and "pet", the pinyin of the first characters is identical, so the second characters are compared; if the second characters are also identical, the third characters are compared.
Third, continue in this way until the content information is fully sorted, then return the sort result.
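The three sub-steps above amount to comparing phrases character by character on their tone-stripped pinyin. One way to sketch this in Python is to use a tuple of syllables as the sort key, because tuples compare element by element. The table `TONELESS`, the function `sort_phrases`, and the Chinese phrases assumed behind the translated examples are illustrative assumptions.

```python
# Hypothetical tone-stripped pinyin, one syllable per character.
TONELESS = {
    "重庆": ("chong", "qing"),   # "Chongqing"
    "宠物": ("chong", "wu"),     # "pet": same first syllable as "Chongqing"
    "重量": ("zhong", "liang"),  # "weight"
}


def sort_phrases(phrases, descending=False):
    """Sort by the first character's pinyin, then the next character's, and so on.

    Tuple comparison moves to the second syllable only when the first is equal.
    """
    return sorted(phrases, key=lambda p: TONELESS[p], reverse=descending)


print(sort_phrases(["重量", "宠物", "重庆"]))
print(sort_phrases(["重量", "宠物", "重庆"], descending=True))
```

Here "重庆" and "宠物" tie on "chong", so their order is decided by "qing" versus "wu", matching the second sub-step.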
According to the technical solution of the embodiments of the present disclosure, the database is compressed in advance and the polyphone database is loaded dynamically as the document requires, so that document start-up time is not delayed by loading the database. In addition, by using a pinyin-based method for handling polyphones, Chinese terms containing polyphones can be quickly sorted in ascending or descending order. This solves the technical problem that polyphones in documents cannot be sorted, and meets users' need to sort polyphones without delaying document start-up.
Embodiment 2
With reference to Fig. 2, according to a specific embodiment of the present disclosure, the disclosure provides a document polyphone sorting method that applies in particular to sorting polyphones in online documents, for example in spreadsheets within a document. It can also be applied to content that mixes polyphones and non-polyphones, as shown in Fig. 3 or Fig. 5. The main method steps are as follows:
Step S202: select the region to sort and click the sort button, as shown in Fig. 3.
The user opens the document being edited. The document is typically one from Office, Apple, or any other office software; for convenience, Excel is used as the example below.
The user needs to sort the Chinese content of some region in Excel, in particular content involving polyphonic characters. For example, the region to sort contains "Chongqing", "weight", "growing up", "length", and so on. Suppose the content to sort lies in one column (the content is not necessarily in a single column and may span a region; a single column is used here for ease of understanding). The user can select a cell in the column header and click the ascending or descending sort button to invoke the sort function. After the back end receives the sort trigger command, it calls the corresponding Chinese polyphone sorting lexicon to carry out the sort.
Step S204: determine whether this is the first sort.
Determine whether a sort instruction is being received for the first time, and thereby whether the polyphone lexicon has already been loaded in the document. If this is the first sort, the polyphone lexicon needs to be loaded. If it is not, the already loaded polyphone lexicon is called directly without reloading.
Note that in this embodiment the polyphone lexicon is loaded only when polyphones need to be sorted while the document is open for editing, not when the document is opened. The polyphone lexicon holds a fairly large amount of data, typically several megabytes. If it were loaded at the outset and no polyphone sorting task arose during editing, the load would be wasted, slowing down document opening and hurting the user experience.
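The first-sort check in step S204 is a form of lazy loading. A minimal sketch follows, assuming a hypothetical `LazyLexicon` wrapper and a caller-supplied `loader` function; neither name comes from the patent, and `load_count` exists only to make the behaviour observable.

```python
class LazyLexicon:
    """Defer loading the polyphone lexicon until the first sort request."""

    def __init__(self, loader):
        self._loader = loader   # hypothetical function that fetches the lexicon
        self._data = None       # nothing is loaded when the document opens
        self.load_count = 0     # bookkeeping for illustration only

    def get(self):
        if self._data is None:          # first sort: load now
            self._data = self._loader()
            self.load_count += 1
        return self._data               # later sorts: reuse the loaded lexicon


lexicon = LazyLexicon(lambda: {"重庆": "chong2 qing4"})
print(lexicon.load_count)   # no load has happened at document open
lexicon.get()               # first sort triggers the load
lexicon.get()               # second sort reuses it
print(lexicon.load_count)
```

Because the wrapper lives only as long as the document session, discarding it when the document closes gives the reload-on-reopen behaviour the embodiment describes.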
Step S206: if so, load the polyphone lexicon and sort after processing the polyphone lexicon.
For example, loading the polyphone lexicon includes compressing the polyphone lexicon and loading the compressed lexicon. In one specific embodiment, the polyphone lexicon is 1 MB before loading and is compressed with the gzip library; with a good compression ratio, the 1 MB lexicon compresses to about 300 KB. This greatly reduces database load time and improves the user experience. Loading usually completes within 0.1 to 1 second. During loading, the user can be notified with a pop-up showing a message such as "Loading polyphone lexicon", as shown in Fig. 4.
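The compress-ahead, decompress-on-load idea can be sketched with Python's standard `gzip` and `json` modules. The JSON serialization, the function names `pack_lexicon` and `load_lexicon`, and the synthetic placeholder entries are assumptions for illustration; the patent only states that the gzip library is used, and the sizes printed here say nothing about the ratio a real lexicon would reach.

```python
import gzip
import json


def pack_lexicon(lexicon):
    """Serialize the lexicon to JSON and gzip it ahead of time."""
    raw = json.dumps(lexicon, ensure_ascii=False).encode("utf-8")
    return gzip.compress(raw)


def load_lexicon(blob):
    """Decompress and parse a packed lexicon when the first sort needs it."""
    return json.loads(gzip.decompress(blob).decode("utf-8"))


# Synthetic placeholder entries so the example has something to compress.
lexicon = {f"词{i}": f"ci2 {i}" for i in range(5000)}
blob = pack_lexicon(lexicon)
raw_size = len(json.dumps(lexicon, ensure_ascii=False).encode("utf-8"))
print(raw_size, len(blob))
```

The smaller blob is what travels to the document at load time, so the transfer cost is paid on the compressed size rather than the raw size.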
Optionally, sorting after processing the polyphone lexicon includes the following specific steps:
Step S2062: convert the words in the polyphone lexicon into pinyin.
A word consists of multiple characters, and each character is converted. For example, "Chongqing" is converted into "chongqing" and "weight" into "zhongliang".
Step S2064: remove the tones from the words in the polyphone lexicon.
Step S2066: sort the words after their tones have been removed.
Specifically, in step S2066, sorting the tone-stripped words includes: sorting by the pinyin of the first character of each phrase; if the pinyin of the first character is identical, sorting by the pinyin of the second character; and so on, until the relative order of the two words is determined. When sorting by pinyin, the letters of the pinyin are compared one by one from front to back, and a word with no letter at the corresponding position comes first.
For example, "Chongqing, weight" in descending order is "weight, Chongqing"; "Chongqing, weight, China" in descending order is "weight, China, Chongqing"; "Chongqing, weight, here" in descending order is "weight, here, Chongqing"; and so on.
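The letter-by-letter rule in step S2066 can be written out as an explicit comparator. This is a sketch under assumptions: phrases are represented as tuples of tone-stripped syllables, the pinyin "zhe li" for the translated example "here" is an assumed reading, and the function names are hypothetical.

```python
from functools import cmp_to_key


def compare_syllables(a, b):
    """Compare two pinyin syllables letter by letter, front to back."""
    for x, y in zip(a, b):
        if x != y:
            return -1 if x < y else 1
    # All shared positions are equal: the syllable missing a letter comes first.
    return len(a) - len(b)


def compare_phrases(p, q):
    """Compare by the first character's pinyin, then the second, and so on."""
    for a, b in zip(p, q):
        result = compare_syllables(a, b)
        if result != 0:
            return result
    return len(p) - len(q)


def sort_by_pinyin(entries, descending=False):
    return sorted(entries, key=cmp_to_key(compare_phrases), reverse=descending)


entries = [("chong", "qing"), ("zhong", "liang"), ("zhe", "li")]
print(sort_by_pinyin(entries, descending=True))
```

With these inputs the descending order puts "zhong liang" first, then "zhe li", then "chong qing", which agrees with the "weight, here, Chongqing" example in the text.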
Optionally, if this is not the first sort, the words in the sort region are sorted directly without loading the polyphone lexicon.
In this case the document is already open and the polyphone database has already been loaded, so the database does not need to be reloaded for a polyphone sort and is called directly. When the document is closed, however, the polyphone database is released along with it; if polyphone sorting is needed afterwards, the database must be loaded again.
Step S208: return the sort result.
From the user's perspective, after the user clicks the ascending or descending sort button, the back end runs the operation and returns the sort result after only a short time, as shown in Fig. 5.
According to the technical solution of the embodiments of the present disclosure, the database is compressed in advance and the polyphone database is loaded dynamically as the document requires, so that document start-up time is not delayed by loading the database. In addition, by using a pinyin-based method for handling polyphones, Chinese terms containing polyphones can be quickly sorted in ascending or descending order. This solves the technical problem that polyphones in documents cannot be sorted, and meets users' need to sort polyphones without delaying document start-up.
Embodiment 3
As shown in Fig. 6, according to a specific embodiment of the present disclosure, the disclosure provides a document polyphone sorting apparatus comprising an extraction unit 602, a calling unit 604, a matching unit 606, and a sorting unit 608, each described below. The apparatus shown in Fig. 6 can perform the method of the embodiment shown in Fig. 1; for parts not described in detail in this embodiment, refer to the description of the embodiment shown in Fig. 1. The execution process and technical effects of this technical solution are described in the embodiment shown in Fig. 1 and are not repeated here.
Extraction unit 602, for extracting the document content information;
Calling unit 604, for calling the polyphone lexicon loaded in the document, wherein the polyphone lexicon includes first feature information and second feature information of polyphone phrases;
Matching unit 606, for matching the content information against the first feature information and, if the match succeeds, processing the second feature information and returning the processed second feature information;
Sorting unit 608, for sorting the content information according to the processed second feature information.
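The four units can be pictured as methods on a single object. The sketch below is one possible arrangement, not the apparatus as claimed: the class name `PolyphoneSorter`, the tone-digit notation, and in particular the fallback that sorts unmatched phrases by their raw text are assumptions, since the text does not say how non-polyphone phrases are ordered.

```python
import re


class PolyphoneSorter:
    """Illustrative mapping of units 602-608 onto methods."""

    def __init__(self, lexicon):
        self.lexicon = lexicon  # hypothetical: phrase -> pinyin with tone digits

    def extract(self, cells):
        """Extraction unit 602: pull non-empty text out of the selected cells."""
        return [c.strip() for c in cells if c.strip()]

    def call_lexicon(self):
        """Calling unit 604: return the lexicon already loaded in the document."""
        return self.lexicon

    def match(self, phrase):
        """Matching unit 606: look up the phrase and strip the tone marks."""
        toned = self.call_lexicon().get(phrase)
        if toned is None:
            return None
        return tuple(re.sub(r"[1-5]", "", s) for s in toned.split())

    def sort(self, cells, descending=False):
        """Sorting unit 608: order phrases by tone-stripped pinyin.

        Unmatched phrases fall back to their raw text (an assumption).
        """
        phrases = self.extract(cells)
        return sorted(phrases, key=lambda p: self.match(p) or (p,), reverse=descending)


sorter = PolyphoneSorter({"重庆": "chong2 qing4", "重量": "zhong4 liang4"})
print(sorter.sort([" 重量 ", "重庆", ""]))
```

Keeping each unit as its own method mirrors the one-to-one correspondence between the apparatus units and the method steps S102 to S108.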
In the embodiments of the present disclosure, the database is compressed in advance and the polyphone database is loaded dynamically as the document requires, so that document start-up time is not delayed by loading the database. In addition, by using a pinyin-based method for handling polyphones, Chinese terms containing polyphones can be quickly sorted in ascending or descending order. This solves the technical problem that polyphones in documents cannot be sorted, and meets users' need to sort polyphones without delaying document start-up.
Embodiment 4
As shown in Fig. 7, according to a specific embodiment of the present disclosure, the disclosure provides a document polyphone sorting apparatus comprising a selection unit 702, a determination unit 704, a sorting unit 706, and a return unit 708, each described below. The apparatus shown in Fig. 7 can perform the method of the embodiment shown in Fig. 2; for parts not described in detail in this embodiment, refer to the description of the embodiment shown in Fig. 2. The execution process and technical effects of this technical solution are described in the embodiment shown in Fig. 2 and are not repeated here.
Selection unit 702, for selecting the region to sort and clicking the sort button;
Determination unit 704, for determining whether this is the first sort;
Sorting unit 706, for loading the polyphone lexicon if so, and sorting after processing the polyphone lexicon.
Optionally, the sorting unit 706 is further configured to:
First, convert the words in the polyphone lexicon into pinyin. A word consists of multiple characters, and each character is converted; for example, "Chongqing" is converted into "chongqing" and "weight" into "zhongliang".
Second, remove the tones from the words in the polyphone lexicon.
Third, sort the words after their tones have been removed.
Return unit 708, for returning the sort result.
In the embodiments of the present disclosure, the database is compressed in advance and the polyphone database is loaded dynamically as the document requires, so that document start-up time is not delayed by loading the database. In addition, by using a pinyin-based method for handling polyphones, Chinese terms containing polyphones can be quickly sorted in ascending or descending order. This solves the technical problem that polyphones in documents cannot be sorted, and meets users' need to sort polyphones without delaying document start-up.
Embodiment 5
The present embodiment provides a kind of electronic equipment, which is used for the processing method of polyphone sequence, the electronic equipment, It include: at least one processor;And the memory being connect at least one described processor communication;Wherein,
The memory is stored with the instruction that can be executed by one processor, and described instruction is by described at least one It manages device to execute, so that at least one described processor is able to carry out method and step described in embodiment as above.
Specific processing mode can be found in embodiment 1 and embodiment 2.
Below with reference to Fig. 8, it illustrates the structural representations for the electronic equipment 800 for being suitable for being used to realize the embodiment of the present disclosure Figure.Terminal device in the embodiment of the present disclosure can include but is not limited to such as mobile phone, laptop, digital broadcasting and connect Receive device, PDA (personal digital assistant), PAD (tablet computer), PMP (portable media player), car-mounted terminal (such as vehicle Carry navigation terminal) etc. mobile terminal and such as number TV, desktop computer etc. fixed terminal.Electricity shown in Fig. 8 Sub- equipment is only an example, should not function to the embodiment of the present disclosure and use scope bring any restrictions.
As shown in Fig. 8, the electronic device 800 may include a processing device (for example, a central processing unit or a graphics processor) 801, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 802 or a program loaded from a storage device 808 into a random access memory (RAM) 803. The RAM 803 also stores various programs and data required for the operation of the electronic device 800. The processing device 801, the ROM 802, and the RAM 803 are connected to one another through a bus 808. An input/output (I/O) interface 805 is also connected to the bus 808.
In general, the following devices can be connected to the I/O interface 805: an input device 806 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, or gyroscope; an output device 807 including, for example, a liquid crystal display (LCD), speaker, or vibrator; a storage device 808 including, for example, a magnetic tape or hard disk; and a communication device 809. The communication device 809 may allow the electronic device 800 to communicate wirelessly or by wire with other devices to exchange data. Although Fig. 8 shows an electronic device 800 with various devices, it should be understood that not all of the devices shown need to be implemented or present. More or fewer devices may alternatively be implemented or present.
For the specific processing, refer to Embodiment 1 and Embodiment 2.
In particular, according to the embodiments of the present disclosure, the process described above with reference to the flowchart can be implemented as a computer software program. For example, the embodiments of the present disclosure include a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for executing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication device 809, installed from the storage device 808, or installed from the ROM 802. When the computer program is executed by the processing device 801, the functions defined above in the method of the embodiments of the present disclosure are performed.
The flowcharts and block diagrams in the drawings illustrate the possible architecture, functions, and operations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and/or flowcharts, and combinations of such blocks, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
Embodiment 6
The embodiments of the present disclosure provide a non-volatile computer storage medium storing computer-executable instructions that can execute the method in any of the method embodiments above.
It should be noted that the computer-readable medium of the present disclosure may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples of computer-readable storage media include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present disclosure, a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in connection with an instruction execution system, apparatus, or device. A computer-readable signal medium, in the present disclosure, may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device. Program code contained on a computer-readable medium may be transmitted over any suitable medium, including but not limited to: electric wire, optical cable, RF (radio frequency), or any suitable combination of the above.
The computer-readable medium may be included in the electronic device described above, or it may exist separately without being assembled into the electronic device.
Computer program code for performing the operations of the present disclosure may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The units described in the embodiments of the present disclosure may be implemented in software or in hardware. In some cases, the name of a unit does not constitute a limitation on the unit itself.

Claims (10)

1. A document content information processing method, characterized by comprising:
extracting content information of the document;
calling a polyphone lexicon loaded in the document, wherein the polyphone lexicon includes first feature information and second feature information of polyphone phrases;
matching the content information against the first feature information and, if the match succeeds, processing the second feature information and returning the processed second feature information;
sorting the content information according to the processed second feature information.
2. The method according to claim 1, characterized in that processing the second feature information and returning the processed second feature information comprises:
removing the mark information from the second feature information;
returning the second feature information with the mark information removed.
3. The method according to claim 2, characterized in that sorting the content information according to the processed second feature information comprises:
sorting according to the second feature information of the first character in the content information, the second feature information being the second feature information with the mark information removed;
if the second feature information of the first character is identical, sorting according to the second feature information of the next character;
and so on until the content information is fully sorted.
4. The method according to claim 1, characterized in that, after extracting the content information of the document, the method comprises:
executing an instruction to call the polyphone lexicon;
loading the polyphone lexicon when it is determined that the instruction to call the polyphone lexicon is being executed for the first time.
5. The method according to claim 4, characterized in that loading the polyphone lexicon comprises:
compressing the polyphone lexicon and loading the compressed polyphone lexicon.
6. A document content information processing device, characterized by comprising:
an extraction unit, configured to extract content information of the document;
a calling unit, configured to call a polyphone lexicon loaded in the document, wherein the polyphone lexicon includes first feature information and second feature information of polyphone phrases;
a matching unit, configured to match the content information against the first feature information and, if the match succeeds, to process the second feature information and return the processed second feature information;
a sorting unit, configured to sort the content information according to the processed second feature information.
7. The device according to claim 6, characterized in that the matching unit is further configured to:
remove the mark information from the second feature information;
return the second feature information with the mark information removed.
8. The device according to claim 7, characterized in that the sorting unit is further configured to:
sort according to the second feature information of the first character in the content information, the second feature information being the second feature information with the mark information removed;
if the second feature information of the first character is identical, sort according to the second feature information of the next character;
and so on until the content information is fully sorted.
9. A computer-readable storage medium on which a computer program is stored, characterized in that the program, when executed by a processor, implements the method according to any one of claims 1 to 5.
10. An electronic device, characterized by comprising:
one or more processors;
a storage device, configured to store one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method according to any one of claims 1 to 5.
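As an illustration of the steps recited in claims 1 to 3 (match against the lexicon, strip the mark information, sort character by character), here is a minimal sketch. It rests on several assumptions not confirmed by the claims: that the first feature information is the phrase text, that the second feature information is its pinyin, that the mark information is the tone mark, and that unmatched items sort after matched ones. The lexicon entries and all function names are hypothetical.

```python
import unicodedata

# Combining marks for the four pinyin tones (macron, acute, caron, grave).
# The diaeresis of "ü" is deliberately not listed, so it is preserved.
TONE_MARKS = {"\u0304", "\u0301", "\u030c", "\u0300"}

# Hypothetical lexicon: phrase (assumed first feature information) ->
# tone-marked pinyin per character (assumed second feature information).
LEXICON = {
    "重庆": ["chóng", "qìng"],
    "重复": ["chóng", "fù"],
    "重量": ["zhòng", "liàng"],
    "银行": ["yín", "háng"],
    "长城": ["cháng", "chéng"],
}


def strip_tone_marks(syllable: str) -> str:
    """Remove tone marks from one pinyin syllable (assumed 'mark information')."""
    decomposed = unicodedata.normalize("NFD", syllable)
    kept = "".join(ch for ch in decomposed if ch not in TONE_MARKS)
    return unicodedata.normalize("NFC", kept)


def sort_key(text: str):
    """Build a key that compares the first character's pinyin, then the next."""
    syllables = LEXICON.get(text)          # match against the lexicon
    if syllables is None:
        # Assumption of this sketch: unmatched items go after matched ones.
        return (1, [text])
    return (0, [strip_tone_marks(s) for s in syllables])


def sort_contents(contents, descending: bool = False):
    """Sort extracted content items in ascending or descending pinyin order."""
    return sorted(contents, key=sort_key, reverse=descending)
```

With the sample lexicon, `sort_contents(["重量", "银行", "重庆", "长城", "重复"])` places 长城 (chang) first, then 重复 and 重庆 (both chong, separated by the second character: fu before qing), then 银行 (yin), and finally 重量 (zhong), so the two readings of 重 land in different positions.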
CN201910204459.6A 2019-03-18 2019-03-18 A kind of document polyphone sort method, device, medium and electronic equipment Pending CN110069762A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910204459.6A CN110069762A (en) 2019-03-18 2019-03-18 A kind of document polyphone sort method, device, medium and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910204459.6A CN110069762A (en) 2019-03-18 2019-03-18 A kind of document polyphone sort method, device, medium and electronic equipment

Publications (1)

Publication Number Publication Date
CN110069762A true CN110069762A (en) 2019-07-30

Family

ID=67365310

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910204459.6A Pending CN110069762A (en) 2019-03-18 2019-03-18 A kind of document polyphone sort method, device, medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN110069762A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117633143A (en) * 2023-11-29 2024-03-01 雅昌文化(集团)有限公司 Chinese vocabulary entry multi-condition compound ordering method

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5893092A (en) * 1994-12-06 1999-04-06 University Of Central Florida Relevancy ranking using statistical ranking, semantics, relevancy feedback and small pieces of text
JP2002288215A (en) * 2001-03-26 2002-10-04 Ricoh Co Ltd Document search device, document search method, program, and recording medium
CN102223430A (en) * 2011-06-13 2011-10-19 深圳桑菲消费通信有限公司 Method for ranking and searching polyphones of contacts in mobile phone
CN107193789A (en) * 2017-05-22 2017-09-22 上海携程金融信息服务有限公司 Chinese converted Chinese phonetic transcription and system containing polyphone
CN108133048A (en) * 2018-01-16 2018-06-08 广东欧珀移动通信有限公司 file ordering method, device and mobile terminal

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190730