CN107958039A - A kind of term error correction method, device and server - Google Patents

A search term error correction method, device and server

Info

Publication number
CN107958039A
Authority
CN
China
Prior art keywords
term
target word
word
alternative
character string
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201711167241.5A
Other languages
Chinese (zh)
Inventor
伍志鹏
施文祥
杨天行
王志华
朱莹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201711167241.5A
Publication of CN107958039A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/60 Information retrieval; Database structures therefor; File system structures therefor of audio data
    • G06F16/63 Querying
    • G06F16/635 Filtering based on additional data, e.g. user or group profiles
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/60 Information retrieval; Database structures therefor; File system structures therefor of audio data
    • G06F16/68 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/683 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/685 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using automatically derived transcript of audio data, e.g. lyrics

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Library & Information Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Machine Translation (AREA)

Abstract

The present invention proposes a search term error correction method, device and mobile terminal. The search term error correction method includes: calculating, according to a similarity calculation method, the similarity between the search term and the candidate target words in a lexicon, and taking the target words whose similarity exceeds a preset threshold as candidate words; and sorting the candidate words by search popularity to determine the target word corresponding to the search term. This technical solution can achieve the following: when using a web search platform, a user can directly send a reminder request for a preset event displayed on the platform, and when the reminder trigger condition is met, the platform sends the user a reminder message about the preset event, so that the services the platform can provide to users are more diverse and more convenient.

Description

A search term error correction method, device and server
Technical field
The present invention relates to the field of computer technology, and in particular to a search term error correction method, device and server.
Background
In daily life, users often happen to hear a good song without knowing its title. They can only catch a fragment of the song while listening, turn that fragment into a lyrics fragment, and then search using the lyrics fragment. However, many words in such a lyrics fragment sound the same as the original lyrics but are written with different characters, so the lyrics retrieved with the erroneous search term are often not those of the song the user wants.
The traditional method of searching by lyrics fragment computes lyric similarity with a fuzzy string matching algorithm and returns the lyrics with the highest similarity as the target lyrics. This works when the fragment contains few wrong characters, but the retrieval success rate is often low when it contains many.
Summary of the invention
Embodiments of the present invention provide a search term error correction method, device and server, to solve or alleviate one or more technical problems in the prior art.
In a first aspect, an embodiment of the present invention provides a search term error correction method, including:
calculating, according to a similarity calculation method, the similarity between the search term and the candidate target words in a lexicon, and taking the target words whose similarity exceeds a preset threshold as candidate words;
sorting the candidate words by search popularity to determine the target word corresponding to the search term.
With reference to the first aspect, in a first implementation of the first aspect, calculating the similarity between the search term and the candidate target words in the lexicon according to the similarity calculation method includes:
converting the search term and the candidate target words to pinyin, to obtain the pinyin string of the search term and the pinyin strings of the target words;
comparing the pinyin string of the search term with the pinyin strings of the candidate target words, to determine a preset number of target words close to the search term;
calculating the edit distance between the search term and each of the preset number of target words, thereby determining the similarity between the search term and those target words.
With reference to the first implementation of the first aspect, comparing the pinyin string of the search term with the pinyin strings of the candidate target words to determine the preset number of target words close to the search term includes:
comparing the pinyin strings of the candidate target words in the lexicon using a full-text indexing tool, selecting the pinyin strings of the target words that are similar to the pinyin string of the search term, and determining the corresponding target words.
With reference to the first aspect, in a second implementation of the first aspect, sorting the candidate words by popularity includes:
obtaining the search popularity of the candidate words, the popularity coming from search websites;
sorting the candidate words according to their search popularity.
In a second aspect, an embodiment of the present invention provides a search term error correction device, including:
a first calculation module, configured to calculate, according to a similarity calculation method, the similarity between the search term and the candidate target words in a lexicon, and to take the target words whose similarity exceeds a preset threshold as candidate words;
a first sorting module, configured to sort the candidate words by search popularity to determine the target word corresponding to the search term.
With reference to the second aspect, in a first implementation of the second aspect, the first calculation module includes:
a pinyin conversion module, configured to convert the search term and the candidate target words to pinyin, to obtain the pinyin string of the search term and the pinyin strings of the target words;
a comparison module, configured to compare the pinyin string of the search term with the pinyin strings of the candidate target words, to determine a preset number of target words close to the search term;
a second calculation module, configured to calculate the edit distance between the search term and each of the preset number of target words, thereby determining the similarity between the search term and those target words.
With reference to the first implementation of the second aspect, the comparison module includes:
a screening module, configured to compare the pinyin strings of the candidate target words in the lexicon using a full-text indexing tool, select the pinyin strings of the target words that are similar to the pinyin string of the search term, and determine the corresponding target words.
With reference to the second aspect, in a second implementation of the second aspect, the first sorting module includes:
an acquisition module, configured to obtain the search popularity of the candidate words, the popularity coming from search websites;
a second sorting module, configured to sort the candidate words according to their search popularity.
In a third aspect, the present invention provides a server, the server including:
one or more processors;
a storage device, configured to store one or more programs;
a communication interface, configured to enable the processors and the storage device to communicate with external devices;
when the one or more programs are executed by the one or more processors, the one or more processors implement the method described above.
In a fourth aspect, the present invention provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the method described above.
One of the above technical solutions has the following advantage or beneficial effect: the similarity between the search term and the candidate target words in the lexicon is calculated according to a similarity calculation method, candidate words are determined according to the similarity, and the target word corresponding to the search term is then determined according to the search popularity of the candidate words; when the search term contains many characters that sound the same as the intended ones but are written differently, this technical solution can improve the retrieval success rate.
The above summary is for the purpose of illustration only and is not intended to be limiting in any way. In addition to the illustrative aspects, implementations and features described above, further aspects, implementations and features of the present invention will be readily apparent by reference to the drawings and the following detailed description.
Brief description of the drawings
In the drawings, unless otherwise specified, the same reference numerals denote the same or similar parts or elements throughout the several drawings. The drawings are not necessarily drawn to scale. It should be understood that these drawings depict only some embodiments disclosed according to the present invention and should not be regarded as limiting its scope.
Fig. 1 is a flow chart of the search term error correction method of Embodiment one of the present invention;
Fig. 2 is a flow chart of the search term error correction method of Embodiment two of the present invention;
Fig. 3 is a schematic diagram of the search term error correction device of Embodiment three of the present invention;
Fig. 4 is a schematic diagram of the server of Embodiment four of the present invention.
Detailed description
In the following, only some exemplary embodiments are briefly described. As those skilled in the art will recognize, the described embodiments may be modified in various ways without departing from the spirit or scope of the present invention. Accordingly, the drawings and description are to be regarded as illustrative in nature rather than restrictive.
Embodiment one
This embodiment of the present invention relates to a search term error correction method. Fig. 1 is a flow chart of the search term error correction method of this embodiment. As shown in Fig. 1, the method includes the following steps:
S101: calculate, according to a similarity calculation method, the similarity between the search term and the candidate target words in a lexicon, and take the target words whose similarity exceeds a preset threshold as candidate words.
Specifically, the edit distance (also called the Levenshtein distance) between two strings is the minimum number of edit operations needed to turn one string into the other. The permitted edit operations are replacing one character with another, inserting a character, and deleting a character. For example, turning the string abe into the string a takes two operations, turning abe into ab takes one operation, and turning abe into abc takes one operation.
In general, the smaller the edit distance, the greater the similarity of the two strings. To calculate the similarity, first take the maximum length maxlen of the two strings, then compute 1 - (number of operations / maxlen). For example, turning abe into abc takes one operation and the maximum length of the two strings is 3, so the similarity is 1 - (1/3) ≈ 0.667.
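The edit distance and the similarity formula above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names `edit_distance` and `similarity` are chosen here for the example, and treating two empty strings as fully similar is an assumption the text does not address.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the minimum number of single-character
    substitutions, insertions and deletions that turn a into b."""
    # prev[j] is the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,           # delete ca
                            curr[j - 1] + 1,       # insert cb
                            prev[j - 1] + cost))   # substitute or match
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Similarity as described in the text: 1 - (operations / maxlen)."""
    maxlen = max(len(a), len(b))
    if maxlen == 0:
        return 1.0  # assumption: two empty strings count as identical
    return 1.0 - edit_distance(a, b) / maxlen
```

With these definitions, `edit_distance("abe", "a")` is 2 and `similarity("abe", "abc")` is 1 - 1/3, matching the worked examples in the text.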
In this embodiment, based on the similarity between the search term and the target words in the lexicon, the target words whose similarity exceeds a preset threshold, for example 0.75, are taken as candidate words. As a variation of this embodiment, the target words may instead be sorted by similarity and the top 20 taken as candidate words.
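Both selection rules described above (threshold and top N) can be sketched in a few lines. The function name `select_candidates` and the shape of its input, a dictionary from target word to an already computed similarity, are assumptions made for this illustration.

```python
def select_candidates(scores, threshold=0.75, top_n=None):
    """Pick candidate words from a dict mapping target word -> similarity.

    If top_n is given, return the top_n most similar target words (the
    variation described in the text); otherwise return every target word
    whose similarity is strictly greater than the threshold.
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    if top_n is not None:
        return [word for word, _ in ranked[:top_n]]
    return [word for word, score in ranked if score > threshold]
```

For example, `select_candidates({"a": 0.9, "b": 0.75, "c": 0.8})` keeps `"a"` and `"c"` but not `"b"`, because the text requires the similarity to exceed the threshold.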
This embodiment can be applied to lyrics error correction. For example, when a user knows only a passage of lyrics and wants to identify the song, the lyrics can be used as the search term and compared with the lyrics in the song library to determine their similarity, and candidate songs are obtained according to that similarity.
S102: sort the candidate words by search popularity to determine the target word corresponding to the search term.
Take a search term that is a passage of lyrics as an example. Major search websites rank popular songs by search popularity, and some music websites also publish music charts that rank popular songs, so the popularity of a piece of music can be determined from its ranking on these search websites or charts. The candidate songs determined when searching for the passage of lyrics are then sorted by popularity, and the most popular candidate song can be taken as the target song corresponding to the searched lyrics.
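Step S102 amounts to a sort on a popularity score. The sketch below assumes the popularity figures have already been collected into a dictionary; the name `rank_by_popularity` and the choice to give unlisted candidates a popularity of 0 are illustrative assumptions, not details from the patent.

```python
def rank_by_popularity(candidates, popularity):
    """Sort candidate words from most to least popular.

    popularity maps a word to a numeric score (for example a search count
    or an inverted chart position). Candidates without a score are
    treated as having popularity 0, which is an assumption of this sketch.
    """
    return sorted(candidates, key=lambda word: popularity.get(word, 0), reverse=True)
```

The first element of the returned list, if any, is the target corresponding to the search term.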
One of the above technical solutions has the following advantage or beneficial effect: the similarity between the search term and the candidate target words in the lexicon is calculated according to a similarity calculation method, candidate words are determined according to the similarity, and the target word corresponding to the search term is then determined according to the search popularity of the candidate words; when the search term contains many characters that sound the same as the intended ones but are written differently, this technical solution can improve the retrieval success rate.
Embodiment two
This embodiment of the present invention relates to a search term error correction method. Fig. 2 is a flow chart of the search term error correction method of this embodiment. As shown in Fig. 2, the method includes the following steps:
S201: convert the search term and the candidate target words to pinyin, to obtain the pinyin string of the search term and the pinyin strings of the target words.
When applying the edit distance algorithm, symbols in the strings may be taken into account, for example by filtering symbols out first or by recognizing them. In a specific implementation, Chinese characters may also be converted to another form, such as four-corner codes, to calculate the edit distance between the search term and a target word. In this embodiment, English songs generally do not exhibit the same-sound-different-character problem, whereas Chinese characters may. To handle this, all Chinese characters are converted to pinyin, and the edit distance algorithm is then applied to calculate the similarity between the pinyin string of the search term and the pinyin string of a target word, which serves as the similarity between the search term and the target word.
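The effect of the pinyin conversion in S201 can be illustrated with a toy example. The six-character table `TOY_PINYIN` below is a hypothetical stand-in for a complete character-to-pinyin dictionary, and both the choice to drop punctuation and the choice to join syllables with spaces are assumptions; the patent does not specify the output format.

```python
# Hypothetical toy table; a real system would need a complete pinyin dictionary.
TOY_PINYIN = {
    "他": "ta", "她": "ta",
    "在": "zai", "再": "zai",
    "见": "jian", "建": "jian",
}


def to_pinyin(text: str) -> str:
    """Convert each character to toneless pinyin and join with spaces.

    Punctuation and whitespace are filtered out first. Characters missing
    from the table (for example Latin letters) are kept unchanged.
    """
    syllables = [TOY_PINYIN.get(ch, ch) for ch in text if ch.isalnum()]
    return " ".join(syllables)
```

Here `to_pinyin("再见")` and `to_pinyin("在建")` both give `"zai jian"`, so a search term written with the wrong homophone has edit distance 0 to the intended word once both are in pinyin.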
In one application scenario, a user overhears a fragment of a song and wants to obtain the complete lyrics or find out which song it is, so the user searches with the lyrics corresponding to the fragment heard. Because the user heard only the pronunciation of the lyrics, the Chinese characters entered as the search term reflect the user's own judgment of what was sung. In a specific implementation, the lyrics fragment is converted entirely to pinyin, and the target lyrics in the song library are also converted to pinyin for comparison. In other embodiments of the present invention, the target lyrics in the song library may of course be converted to pinyin files in advance, which speeds up the comparison with the target lyrics when the user searches.
S202: compare the pinyin string of the search term with the pinyin strings of the candidate target words, to determine a preset number of target words that are close to the search term or whose similarity exceeds a preset threshold.
Step S202 includes: A. comparing the pinyin strings of the candidate target words in the lexicon using a full-text indexing tool, selecting the pinyin strings of the target words that are similar to the pinyin string of the search term, and determining the corresponding target words.
Because the number of songs is very large, it is impractical to compare the searched lyrics with the target lyrics of every song one by one. This embodiment therefore first uses a full-text indexing tool, such as Lucene, to perform a preliminary search and select the relevant lyrics. Since the relevant lyrics may still be numerous, only a preset number of target lyrics is selected, for example the target lyrics of 20 songs, and further screening is performed among them.
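The idea of a cheap preliminary search before the edit distance step can be sketched with a small inverted index. This is not Lucene and does not reproduce its scoring; it is a simplified stand-in that counts shared pinyin syllables, and the names `build_index` and `prefilter` are invented for the example.

```python
from collections import Counter, defaultdict


def build_index(pinyin_by_id):
    """Map each pinyin syllable to the set of lyric ids that contain it."""
    index = defaultdict(set)
    for doc_id, pinyin in pinyin_by_id.items():
        for syllable in pinyin.split():
            index[syllable].add(doc_id)
    return index


def prefilter(index, query_pinyin, limit=20):
    """Return up to `limit` ids sharing the most distinct syllables with the query."""
    hits = Counter()
    for syllable in set(query_pinyin.split()):
        for doc_id in index.get(syllable, ()):
            hits[doc_id] += 1
    return [doc_id for doc_id, _ in hits.most_common(limit)]
```

Only the ids returned by `prefilter` would then go through the more expensive edit distance comparison of S203.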
S203: calculate the edit distance between the search term and each of the preset number of target words, thereby determining the similarity between the search term and those target words, and take the target words whose similarity exceeds a preset threshold as candidate words.
The method of calculating the edit distance between the search term and the preset number of target words, and of determining the similarity between the search term and the target words, is as described in Embodiment one and is not repeated here.
S204: obtain the search popularity of the candidate words; the popularity comes from search websites.
The method of obtaining the search popularity of the candidate words is as described in Embodiment one and is not repeated here. For lyrics searched by a user, the websites from which the popularity can be obtained may include common websites such as the Baidu music chart, the Tencent music chart, the Baidu search website and trending-search websites.
S205: sort the candidate words according to their search popularity.
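Steps S201 to S205 can be strung together as in the sketch below. It is a simplified illustration under several assumptions: the pinyin converter is passed in as the hypothetical parameter `to_pinyin`, the full-text prefilter of S202 is omitted so every lexicon entry is compared, and the popularity data is assumed to be available as a dictionary.

```python
def edit_distance(a, b):
    """Levenshtein distance computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1,
                            prev[j - 1] + (ca != cb)))
        prev = curr
    return prev[-1]


def correct_term(query, lexicon, popularity, to_pinyin, threshold=0.75):
    """Return candidate target words for `query`, most popular first."""
    query_pinyin = to_pinyin(query)                       # S201
    candidates = []
    for word in lexicon:                                  # S202 prefilter omitted
        word_pinyin = to_pinyin(word)
        maxlen = max(len(query_pinyin), len(word_pinyin)) or 1
        sim = 1 - edit_distance(query_pinyin, word_pinyin) / maxlen
        if sim > threshold:                               # S203
            candidates.append(word)
    # S204 and S205: look up popularity and sort in descending order.
    return sorted(candidates, key=lambda w: popularity.get(w, 0), reverse=True)
```

The first element of the result plays the role of the target word; an empty list means no lexicon entry exceeded the threshold.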
One of the above technical solutions has the following advantage or beneficial effect: the similarity between the search term and the candidate target words in the lexicon is calculated according to a similarity calculation method, candidate words are determined according to the similarity, and the target word corresponding to the search term is then determined according to the search popularity of the candidate words; when the search term contains many characters that sound the same as the intended ones but are written differently, this technical solution can improve the retrieval success rate.
Embodiment three
Embodiment three of the present invention is a search term error correction device. Fig. 3 is a schematic diagram of the search term error correction device of this embodiment. As shown in Fig. 3, the search term error correction device of Embodiment three includes:
a first calculation module 31, configured to calculate, according to a similarity calculation method, the similarity between the search term and the candidate target words in a lexicon, and to take the target words whose similarity exceeds a preset threshold as candidate words;
a first sorting module 32, configured to sort the candidate words by search popularity to determine the target word corresponding to the search term.
The search term error correction device of this embodiment can improve the retrieval success rate for the search term. This technical effect is the same as that of Embodiment one and is not repeated here.
Further, the first calculation module 31 of Embodiment three includes:
a pinyin conversion module 311, configured to convert the search term and the candidate target words to pinyin, to obtain the pinyin string of the search term and the pinyin strings of the target words;
a comparison module 312, configured to compare the pinyin string of the search term with the pinyin strings of the candidate target words, to determine a preset number of target words close to the search term;
a second calculation module 313, configured to calculate the edit distance between the search term and each of the preset number of target words, thereby determining the similarity between the search term and those target words.
The comparison module 312 includes:
a screening module (not shown), configured to compare the pinyin strings of the candidate target words in the lexicon using a full-text indexing tool, select the pinyin strings of the target words that are similar to the pinyin string of the search term, and determine the corresponding target words.
Further, the first sorting module 32 includes:
an acquisition module 321, configured to obtain the search popularity of the candidate words, the popularity coming from search websites;
a second sorting module 322, configured to sort the candidate words according to their search popularity.
The search term error correction device of this embodiment can simply and conveniently determine the similarity between the search term and the target words and the popularity of the candidate words. This technical effect is the same as that of Embodiment two and is not repeated here.
Embodiment four
Embodiment four of the present invention provides a server. As shown in Fig. 4, the server includes a memory 41 and a processor 42, the memory 41 storing a computer program that can run on the processor 42. When executing the computer program, the processor 42 implements the method of the above embodiments. There may be one or more memories 41 and processors 42.
The server further includes:
a communication interface 43, for communication between the memory 41 and the processor 42.
The memory 41 may include high-speed RAM and may also include non-volatile memory, for example at least one magnetic disk storage device.
If the memory 41, the processor 42 and the communication interface 43 are implemented independently, they may be connected to one another by a bus and communicate with one another over it. The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, only one thick line is shown in Fig. 4, but this does not mean that there is only one bus or one type of bus.
Optionally, in a specific implementation, if the memory 41, the processor 42 and the communication interface 43 are integrated on a single chip, they may communicate with one another through internal interfaces.
Embodiment five
Computer-readable recording medium, it is stored with computer program, and such as Fig. 1-2 is realized when which is executed by processor Any method in illustrated embodiment.
In the description of this specification, reference term " one embodiment ", " some embodiments ", " example ", " specifically show The description of example " or " some examples " etc. means specific features, structure, material or the spy for combining the embodiment or example description Point is contained at least one embodiment of the present invention or example.Moreover, particular features, structures, materials, or characteristics described It may be combined in any suitable manner in any one or more of the embodiments or examples.In addition, without conflicting with each other, this The technical staff in field can be by the different embodiments or example described in this specification and different embodiments or exemplary spy Sign is combined and combines.
In addition, term " first ", " second " are only used for description purpose, and it is not intended that instruction or hint relative importance Or the implicit quantity for indicating indicated technical characteristic.Thus, " first " is defined, the feature of " second " can be expressed or hidden Include at least one this feature containing ground.In the description of the present invention, " multiple " are meant that two or more, unless otherwise It is clearly specific to limit.
Any process or method described otherwise above description in flow chart or herein is construed as, and represents to include Module, fragment or the portion of the code of the executable instruction of one or more the step of being used for realization specific logical function or process Point, and the scope of the preferred embodiment of the present invention includes other realization, wherein can not press shown or discuss suitable Sequence, including according to involved function by it is basic at the same time in the way of or in the opposite order, carry out perform function, this should be of the invention Embodiment person of ordinary skill in the field understood.
Expression or logic and/or step described otherwise above herein in flow charts, for example, being considered use In the order list for the executable instruction for realizing logic function, may be embodied in any computer-readable medium, for Instruction execution system, device or equipment (such as computer based system including the system of processor or other can be held from instruction The system of row system, device or equipment instruction fetch and execute instruction) use, or combine these instruction execution systems, device or set It is standby and use.For the purpose of this specification, " computer-readable medium " can any can be included, store, communicate, propagate or pass Defeated program is for instruction execution system, device or equipment or the dress used with reference to these instruction execution systems, device or equipment Put.
Computer-readable medium described in the embodiment of the present invention can be that computer-readable signal media or computer can Read storage medium either the two any combination.The more specifically example of computer-readable recording medium is at least (non-poor Property list to the greatest extent) including following:Electrical connection section (electronic device) with one or more wiring, portable computer diskette box (magnetic Device), random access memory (RAM), read-only storage (ROM), erasable edit read-only storage (EPROM or flash Memory), fiber device, and portable read-only storage (CDROM).In addition, computer-readable recording medium even can be with It is the paper or other suitable media that can print described program on it, because can be for example by being carried out to paper or other media Optical scanner, is then handled described electronically to obtain into edlin, interpretation or if necessary with other suitable methods Program, is then stored in computer storage.
In embodiments of the present invention, computer-readable signal media can be included in a base band or as a carrier wave part The data-signal of propagation, wherein carrying computer-readable program code.The data-signal of this propagation can use a variety of Form, includes but not limited to electromagnetic signal, optical signal or above-mentioned any appropriate combination.Computer-readable signal media is also Can be any computer-readable medium beyond computer-readable recording medium, which can send, pass Either transmission is broadcast for instruction execution system, input method or device use or program in connection.Computer can The program code for reading to include on medium can be transmitted with any appropriate medium, be included but not limited to:Wirelessly, electric wire, optical cable, penetrate Frequently (Radio Frequency, RF) etc., or above-mentioned any appropriate combination.
It should be appreciated that each several part of the present invention can be realized with hardware, software, firmware or combinations thereof.Above-mentioned In embodiment, software that multiple steps or method can be performed in memory and by suitable instruction execution system with storage Or firmware is realized.If, and in another embodiment, can be with well known in the art for example, realized with hardware Any one of row technology or their combination are realized:With the logic gates for realizing logic function to data-signal Discrete logic, have suitable combinational logic gate circuit application-specific integrated circuit, programmable gate array (PGA), scene Programmable gate array (FPGA) etc..
Those skilled in the art are appreciated that to realize all or part of step that above-described embodiment method carries Suddenly it is that relevant hardware can be instructed to complete by program, the program can be stored in a kind of computer-readable storage medium In matter, the program upon execution, including one or a combination set of the step of embodiment of the method.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing module, each unit may exist physically on its own, or two or more units may be integrated into one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. If the integrated module is implemented as a software functional module and is sold or used as an independent product, it may also be stored in a computer-readable storage medium. The storage medium may be a read-only memory, a magnetic disk, an optical disc, or the like.
The above is merely a specific embodiment of the present invention, but the protection scope of the present invention is not limited to it. Any variation or replacement that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (10)

  1. A search term error correction method, characterized in that the method comprises:
    calculating, according to a similarity calculation method, the similarity between a search term and each alternative target word in a dictionary, and taking the target words whose similarity is greater than a preset threshold as alternative words;
    sorting the alternative words by search popularity, so as to determine from them the target word corresponding to the search term.
  2. The method according to claim 1, characterized in that calculating, according to the similarity calculation method, the similarity between the search term and the alternative target words in the dictionary comprises:
    converting the search term and the alternative target words to pinyin respectively, to obtain the pinyin string of the search term and the pinyin strings of the target words;
    comparing the pinyin string of the search term with the pinyin strings of the alternative target words, to determine a preset number of target words that are close to the search term or whose similarity is greater than a preset threshold;
    calculating the edit distance between the search term and each of the preset number of target words, thereby determining the similarity between the search term and the preset number of target words.
  3. The method according to claim 2, characterized in that comparing the pinyin string of the search term with the pinyin strings of the alternative target words, to determine the preset number of target words close to the search term, comprises:
    comparing against the pinyin strings of the alternative target words in the dictionary using a full-text index tool, selecting the pinyin strings of target words that are similar to the pinyin string of the search term, and determining the corresponding target words.
  4. The method according to claim 1, characterized in that sorting the alternative words by search popularity comprises:
    obtaining the search popularity of the alternative words, the popularity coming from a search website;
    sorting the alternative words according to their search popularity.
  5. A search term error correction device, characterized in that it comprises:
    a first computing module, configured to calculate, according to a similarity calculation method, the similarity between a search term and each alternative target word in a dictionary, and to take the target words whose similarity is greater than a preset threshold as alternative words;
    a first sorting module, configured to sort the alternative words by search popularity, so as to determine the target word corresponding to the search term.
  6. The device according to claim 5, characterized in that the first computing module comprises:
    a pinyin conversion module, configured to convert the search term and the alternative target words to pinyin respectively, to obtain the pinyin string of the search term and the pinyin strings of the target words;
    a comparison module, configured to compare the pinyin string of the search term with the pinyin strings of the alternative target words, to determine a preset number of target words close to the search term;
    a second computing module, configured to calculate the edit distance between the search term and each of the preset number of target words, thereby determining the similarity between the search term and the preset number of target words.
  7. The device according to claim 6, characterized in that the comparison module comprises:
    a screening module, configured to compare against the pinyin strings of the alternative target words in the dictionary using a full-text index tool, select the pinyin strings of target words that are similar to the pinyin string of the search term, and determine the corresponding target words.
  8. The device according to claim 5, characterized in that the first sorting module comprises:
    an acquisition module, configured to obtain the search popularity of the alternative words, the popularity coming from a search website;
    a second sorting module, configured to sort the alternative words according to their search popularity.
  9. A server, characterized in that the server comprises:
    one or more processors;
    a storage device, configured to store one or more programs;
    a communication interface, configured to enable the processors and the storage device to communicate with external equipment;
    wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the method according to any one of claims 1-4.
  10. A computer-readable storage medium storing a computer program, characterized in that the program, when executed by a processor, implements the method according to any one of claims 1-4.
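
The flow recited in claims 1-4 can be illustrated with a minimal Python sketch. It is not the applicant's implementation. Everything named here is an assumption made for illustration: the tiny `PINYIN` table stands in for a full pinyin converter, the bigram overlap stands in for the full-text index tool of claim 3, the edit distance is computed on pinyin strings, and the `threshold` and `shortlist` defaults are arbitrary.

```python
from typing import Dict, Optional, Set

# Hypothetical character-to-pinyin table for this example only; a real system
# would rely on a complete pinyin conversion dictionary or library.
PINYIN: Dict[str, str] = {
    "周": "zhou", "杰": "jie", "伦": "lun", "轮": "lun",
    "洁": "jie", "邓": "deng", "紫": "zi", "棋": "qi",
}


def to_pinyin(word: str) -> str:
    """Convert a word to a pinyin string; unmapped characters pass through."""
    return "".join(PINYIN.get(ch, ch) for ch in word)


def bigrams(text: str) -> Set[str]:
    """Return the set of adjacent character pairs in the text."""
    return {text[i:i + 2] for i in range(len(text) - 1)}


def bigram_overlap(a: str, b: str) -> float:
    """Jaccard overlap of character bigrams, used as a cheap coarse filter."""
    x, y = bigrams(a), bigrams(b)
    if not x or not y:
        return 0.0
    return len(x & y) / len(x | y)


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling dynamic-programming row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Map edit distance to a score in [0, 1]; 1.0 means identical strings."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


def correct(term: str, popularity: Dict[str, int],
            threshold: float = 0.6, shortlist: int = 10) -> Optional[str]:
    """Return the most popular dictionary word that sounds like the term.

    `popularity` maps each target word in the dictionary to its search
    popularity (assumed to be supplied by some external source).
    """
    term_py = to_pinyin(term)
    # Coarse step: keep a preset number of target words with close pinyin.
    coarse = sorted(
        popularity,
        key=lambda w: bigram_overlap(term_py, to_pinyin(w)),
        reverse=True,
    )[:shortlist]
    # Fine step: keep words whose edit-distance similarity exceeds the threshold.
    alternatives = [
        w for w in coarse if similarity(term_py, to_pinyin(w)) > threshold
    ]
    if not alternatives:
        return None
    # Rank the alternative words by search popularity and take the top one.
    ranked = sorted(alternatives, key=lambda w: popularity[w], reverse=True)
    return ranked[0]
```

With the sample dictionary `{"周杰伦": 9800, "周洁": 120, "邓紫棋": 8700}`, calling `correct("周杰轮", ...)` shortlists the two words with similar pinyin and returns "周杰伦", the more popular of them. Running the coarse filter first keeps the more expensive edit-distance calculation limited to a small shortlist.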
CN201711167241.5A 2017-11-21 2017-11-21 A kind of term error correction method, device and server Pending CN107958039A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711167241.5A CN107958039A (en) 2017-11-21 2017-11-21 A kind of term error correction method, device and server

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711167241.5A CN107958039A (en) 2017-11-21 2017-11-21 A kind of term error correction method, device and server

Publications (1)

Publication Number Publication Date
CN107958039A (en) 2018-04-24

Family

ID=61965176

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711167241.5A Pending CN107958039A (en) 2017-11-21 2017-11-21 A kind of term error correction method, device and server

Country Status (1)

Country Link
CN (1) CN107958039A (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109062903A (en) * 2018-08-22 2018-12-21 北京百度网讯科技有限公司 Method and apparatus for correcting wrong word
CN109189907A (en) * 2018-08-22 2019-01-11 山东浪潮通软信息科技有限公司 A kind of search method and device based on semantic matches
CN109933691A (en) * 2019-02-11 2019-06-25 北京百度网讯科技有限公司 Method, apparatus, equipment and storage medium for content retrieval
CN110532411A (en) * 2019-07-22 2019-12-03 平安科技(深圳)有限公司 Picture searching temperature acquisition methods, device, computer equipment and storage medium
CN110633356A (en) * 2019-09-04 2019-12-31 广州市巴图鲁信息科技有限公司 Word similarity calculation method and device and storage medium
CN112131461A (en) * 2020-09-09 2020-12-25 重庆易宠科技有限公司 Commodity searching method, system, terminal and computer readable storage medium
CN113553398A (en) * 2021-07-15 2021-10-26 杭州网易云音乐科技有限公司 Search word correcting method and device, electronic equipment and computer storage medium
CN115577712A (en) * 2022-12-06 2023-01-06 共道网络科技有限公司 Text error correction method and device

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103514236A (en) * 2012-06-30 2014-01-15 重庆新媒农信科技有限公司 Retrieval condition error correction prompt processing method based on Pinyin in retrieval application
CN103885949A (en) * 2012-12-19 2014-06-25 中国科学院声学研究所 Song searching system and method based on lyrics
US20140214808A1 (en) * 2013-01-30 2014-07-31 Casio Computer Co., Ltd. Search device, search method and recording medium
US20160042427A1 (en) * 2011-04-06 2016-02-11 Google Inc. Mining For Product Classification Structures For Internet-Based Product Searching
CN106598939A (en) * 2016-10-21 2017-04-26 北京三快在线科技有限公司 Method and device for text error correction, server and storage medium
CN106599299A (en) * 2016-12-28 2017-04-26 北京奇虎科技有限公司 Determining method and device of website key words
CN106708893A (en) * 2015-11-17 2017-05-24 华为技术有限公司 Error correction method and device for search query term
CN107169010A (en) * 2017-03-31 2017-09-15 北京奇艺世纪科技有限公司 A kind of determination method and device of recommendation search keyword

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20180424