CN110490712A - Commodity category search method, system and storage medium


Info

Publication number
CN110490712A
Authority
CN
China
Prior art keywords
lemma
keyword
hot word
target keyword
classification
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910774650.4A
Other languages
Chinese (zh)
Inventor
韩冲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang China Light Textile City Network Co Ltd
Original Assignee
Zhejiang China Light Textile City Network Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang China Light Textile City Network Co Ltd
Priority to CN201910774650.4A
Publication of CN110490712A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06Q30/0603 Catalogue ordering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06Q30/0641 Shopping interfaces

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • Databases & Information Systems (AREA)
  • Development Economics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

An embodiment of the invention discloses a commodity category search method, system and storage medium. The method comprises: obtaining an input target keyword to be searched and determining its type; if the target keyword is determined to be a hot word, matching it against the hot word lemmas pre-stored in a dictionary and, if the match succeeds, searching the search server with the predefined lemma category information bound to the matched lemma and displaying the search results; if the target keyword is determined to be a non-hot word, inputting it into a pre-trained classification model to obtain a predicted classification result, inputting that result into the search server for searching, and obtaining and displaying the corresponding category. Embodiments of the invention match and segment the target keyword against a pre-built dictionary tree whose lemmas carry predefined lemma categories, so that the user can view the commodity category to which the target keyword belongs as soon as matching completes.

Description

Commodity category search method, system and storage medium
Technical field
Embodiments of the present invention relate to the field of Internet technology, and in particular to a commodity category search method, system and storage medium.
Background
At present, shopping on e-commerce websites falls mainly into three modes: category browsing, advertising operation and search. Here, a category is a classification of commodities. Categories exist in a front end and a back end: the front end is used for UI (User Interface) display, the back end is used for commodity management, and the mapping between the two is described by rules. Current mainstream category systems are represented as trees: each parent category has multiple subcategories and each subcategory has exactly one parent category, so the scope a category covers narrows from top to bottom.
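The tree-shaped category system described above can be sketched as follows. This is only an illustration: the class name `Category`, its fields and the sample category names are assumptions, not part of the disclosure.

```python
from dataclasses import dataclass, field
from typing import Optional


# eq=False keeps identity comparison; field-wise comparison would chase
# the parent/children links, which refer back to each other.
@dataclass(eq=False)
class Category:
    """One node of a category tree (hypothetical structure)."""
    name: str
    parent: Optional["Category"] = None
    children: list["Category"] = field(default_factory=list)

    def add_child(self, name: str) -> "Category":
        # Each child is created with exactly one parent.
        child = Category(name, parent=self)
        self.children.append(child)
        return child

    def path(self) -> list[str]:
        # Walk up the single-parent chain, then reverse to root-to-node order.
        node, names = self, []
        while node is not None:
            names.append(node.name)
            node = node.parent
        return names[::-1]


# Hypothetical sample data.
root = Category("Textiles")
fabric = root.add_child("Fabric")
oxford = fabric.add_child("Oxford cloth")
yarn = root.add_child("Yarn")
```

Because every node has a single parent, `oxford.path()` gives one unambiguous route from the root, which is what makes the scope narrow from top to bottom.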
Category browsing is implemented by website operations staff: first-level categories are first combined, and the combinations are then displayed in sequence according to user attention. A user who wants to buy a commodity under some category enters that category and filters commodities there. This mode requires the user to be familiar with the category system in order to find the desired commodity. Advertising operation means promoting individual items or shop owners through advertisements; the user clicks an advertisement and enters the shop to buy. In search mode, the user enters a keyword according to purchase intent and receives a recommended category list and item list; this mode does not require the user to understand the mainstream purchasing model.
In this mainstream search mode, intelligent category navigation technology emerged to reduce users' search time and number of clicks while shopping on e-commerce websites.
In the early days, e-commerce websites navigated by category commodity count: after the user entered a keyword, the recommended categories were determined by the number of related commodities under each category and displayed in order. Under this text-matching-based mode, as the numbers of commodities and categories grew sharply, the number of categories returned for a given keyword increased considerably. Text matching cannot reflect the relevance between the query word and a category, so the user cannot tell which categories to enter for finer filtering.
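A minimal sketch of the early count-based navigation, assuming plain substring matching on product titles. The catalogue, the category names and the function `count_navigation` are hypothetical.

```python
from collections import Counter

# Hypothetical catalogue of (product title, category) pairs.
PRODUCTS = [
    ("regenerated cotton yarn", "Yarn"),
    ("combed cotton yarn", "Yarn"),
    ("cotton oxford cloth", "Fabric"),
    ("polyester oxford cloth", "Fabric"),
    ("cotton sewing thread", "Accessories"),
]


def count_navigation(keyword: str, products=PRODUCTS) -> list[tuple[str, int]]:
    # Plain text matching: a product counts if its title contains the keyword.
    counts = Counter(cat for title, cat in products if keyword in title)
    # Categories are listed in descending order of matching product count.
    return counts.most_common()
```

With the keyword "cotton", every category that merely contains a matching title is returned, which illustrates why the list grows with the catalogue and says little about relevance.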
Summary of the invention
To this end, embodiments of the present invention provide a commodity category search method, system and storage medium, to solve the problem in the prior art that common e-commerce platforms do not display different commodity search categories and attributes for different keywords and do not reflect the relevance between the query word and the category, so that users need considerable time to find the actual products.
To achieve the above object, an embodiment of the present invention provides a commodity category search method, characterized by comprising:
obtaining an input target keyword to be searched, and determining the type of the target keyword;
if the target keyword is determined to be a hot word, matching the target keyword against the hot word lemmas pre-stored in a dictionary; if the match succeeds, displaying the predefined lemma category information bound to the lemma;
if the target keyword is determined to be a non-hot word, inputting the target keyword into a pre-trained classification model to obtain a predicted classification result, inputting the predicted classification result into a search server for searching, and obtaining and displaying the corresponding category.
Further, matching the target keyword against the hot word lemmas pre-stored in the dictionary specifically comprises:
performing matching retrieval between the target keyword and the hot word lemmas pre-stored in the dictionary using a word segmentation method based on string matching; if the match succeeds, displaying the predefined lemma category information bound to the lemma.
Further, training the classification model comprises the following steps:
performing data preprocessing on the keyword samples to be trained;
inputting the preprocessed keyword samples as the input signal, with the lemmas associated with those keyword samples as the output, into the classification model to be trained, to obtain the model parameters of the classification model; wherein the model parameters include the number of training epochs, the learning rate and hierarchical softmax.
Further, training the classification model also comprises applying a gradient descent algorithm to the number of training epochs, the learning rate and hierarchical softmax to obtain optimal parameters and optimize the model.
Further, the method also comprises: matching the target keyword against the lemmas pre-stored in the hot word bank using a Lucene-based Solr or ElasticSearch search server; matching the target keyword using the IK segmenter; and using the FastText text classification model as the classification model.
Another aspect of the present invention provides a commodity category search system, characterized by comprising an acquisition module, a matching module and a display module; wherein the acquisition module is configured to obtain an input target keyword to be searched and determine the type of the target keyword; if the target keyword is determined to be a hot word, the matching module is configured to match the target keyword against the hot word lemmas pre-stored in a dictionary; if the match succeeds, the display module is configured to display the predefined lemma category information bound to the lemma;
if the target keyword is determined to be a non-hot word, the matching module is configured to input the target keyword into a pre-trained classification model to obtain a predicted classification result, and the display module is configured to input the predicted classification result into a search server for searching and to obtain and display the corresponding category.
Further, the dictionary includes hot word lemmas built from a hot word bank; the matching module includes a hot word matching module; and the matching module matching the target keyword against the hot word lemmas pre-stored in the dictionary specifically comprises:
the hot word matching module performing matching retrieval between the target keyword and the hot word lemmas pre-stored in the dictionary using a word segmentation method based on string matching.
Further, the system also comprises a model training module configured to perform data preprocessing on the keyword samples to be trained;
the preprocessed keyword samples are input as the input signal, with the lemmas associated with those keyword samples as the output, into the classification model to be trained, to obtain the model parameters of the classification model; wherein the model parameters include the number of training epochs, the learning rate and hierarchical softmax.
Further, the system also comprises a model optimization module that applies a gradient descent algorithm to the number of training epochs, the learning rate and hierarchical softmax to obtain optimal parameters and optimize the model.
A third aspect of the present invention provides a computer-readable storage medium, characterized in that it stores the method described above.
Embodiments of the present invention have the following advantages:
Using a word segmentation method based on string matching, embodiments of the invention match and segment the target keyword against a pre-built dictionary tree whose lemmas carry predefined lemma categories, so that the user can view the commodity category to which the target keyword belongs as soon as matching completes. In addition, other additional information is defined together with the lemma categories, so that matching retrieves more comprehensive category information along with detailed additional information about each category. Users gain a fuller and more intuitive understanding of the commodities they are interested in, which makes it easier to find, more accurately, the commodity category they want and the details related to it.
Further, when no corresponding lemma can be retrieved from the hot word bank, embodiments of the invention input the target keyword into a pre-trained commodity category classification model, obtain the category and associated additional information corresponding to the target keyword, and display them to the user. Embodiments of the invention are implemented with open-source software and are simple to implement, easy to maintain, highly extensible and reliable in effect; they can realize category preference for a large-scale e-commerce platform and greatly improve search accuracy.
Brief description of the drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed to describe the embodiments or the prior art are briefly introduced below. The drawings described below are merely exemplary; those of ordinary skill in the art can derive other implementation drawings from them without creative effort.
The structures, proportions, sizes and the like depicted in this specification serve only to accompany the disclosed content for the understanding and reading of those skilled in the art; they do not limit the conditions under which the invention can be implemented and therefore have no technically substantive meaning. Any modification of structure, change of proportion or adjustment of size that does not affect the effects the invention can produce or the objects it can achieve shall still fall within the scope covered by the disclosed technical content.
Fig. 1 is a schematic flow block diagram of a commodity category search method provided by Embodiment 1 of the present invention;
Fig. 2 is an IK segmentation effect diagram of a commodity category search system provided by Embodiment 1 of the present invention;
Fig. 3 is a schematic block diagram of a commodity category search system provided by Embodiment 3 of the present invention.
Detailed description of the embodiments
The following specific embodiments illustrate implementations of the present invention; those skilled in the art can readily understand other advantages and effects of the invention from the content disclosed in this specification. The described embodiments are evidently some, not all, of the embodiments of the invention. All other embodiments obtained by those of ordinary skill in the art based on these embodiments without creative effort shall fall within the protection scope of the present invention.
Referring to Fig. 1, a schematic flow block diagram of a commodity category search method provided by Embodiment 1 of the present invention, the method comprises:
obtaining an input target keyword to be searched;
matching the target keyword against the lemmas pre-stored in a dictionary;
if the match succeeds, displaying the predefined lemma category information bound to the lemma.
The target keyword is the name of the product the user wants to search for, or descriptive text associated with a commodity, as entered by the user.
The dictionary includes hot word lemmas built from a hot word bank; it also includes a classification model trained on professional lemmas. The hot word bank provided by embodiments of the invention is mainly a lexicon specially screened by operations staff and updated irregularly in combination with search logs; the hot words in the hot word bank exist in the form of lemmas.
The keyword may be a brand name, a product name, a user's descriptive wording about a commodity, and so on.
The hot word bank can be obtained from multiple channels such as user search logs and the product keywords users publish. It can also be obtained from the specialized vocabulary of related fields; for example, the Q&A channel of a global professional website can yield many professional terms. Each lemma in the hot word bank is assigned a preset attribute value, which is a character string; additional attribute information corresponding to the lemma is also stored for each lemma. This additional attribute information can be the category associated with the lemma, the product name, the publication status, current purchase quantity information and so on. Note that an associated category can be understood as a category that shares at least one character with the lemma; it can also be, for example, a category of product material information in the field corresponding to the keyword.
The hot word lemmas are added to the dictionary as follows: the data contained in the hot word bank is imported into the dictionary in advance as dictionary entries, which are the lemmas referred to in embodiments of the invention. The category associated with each lemma and the additional attribute information described above are added to the lemma, generating a dictionary tree. When the target keyword successfully matches a lemma in the dictionary tree, the category and additional attribute information attached in advance are displayed at the same time.
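The import step above can be sketched roughly as follows, with a flat map standing in for the dictionary tree. The record layout, the class `LemmaEntry` and the field names are assumptions made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class LemmaEntry:
    """Dictionary entry with its bound category and extra attributes (hypothetical)."""
    text: str
    category: str
    attributes: dict[str, str] = field(default_factory=dict)


def build_dictionary(hot_word_rows) -> dict[str, LemmaEntry]:
    # Import every hot word bank row as one dictionary entry keyed by its text.
    return {
        row["text"]: LemmaEntry(row["text"], row["category"], row.get("attributes", {}))
        for row in hot_word_rows
    }


def lookup(dictionary: dict[str, LemmaEntry], keyword: str):
    # On a successful match the bound category and attributes come back together;
    # None signals that the keyword is not in the dictionary.
    return dictionary.get(keyword)


# Hypothetical hot word bank rows.
HOT_WORD_ROWS = [
    {"text": "regenerated cotton", "category": "Yarn",
     "attributes": {"status": "published"}},
    {"text": "oxford cloth", "category": "Fabric"},
]
DICTIONARY = build_dictionary(HOT_WORD_ROWS)
```

Binding the category and attributes at import time means a single lookup is enough to return everything that should be displayed for a matched lemma.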
When the target keyword is matched against the hot word lemmas in the dictionary, the steps comprise:
performing matching retrieval between the target keyword and the hot word lemmas pre-stored in the dictionary using a word segmentation method based on string matching; if the match succeeds, searching the search server with the predefined lemma category information and additional attribute information bound to the lemma, and displaying the search results.
In embodiments of the invention, the word segmentation method based on string matching is also called mechanical word segmentation. According to a certain strategy, it matches the Chinese character string to be analyzed against the entries in a "sufficiently large" machine dictionary; if a string is found in the dictionary, the match succeeds (a word is identified). For example, in embodiments of the invention, the attribute string of the target keyword is compared with the attribute string of a lemma; if the lemma's attribute string occurs within the target keyword's attribute string, the target keyword and the lemma match successfully.
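One common strategy for dictionary-based mechanical segmentation is forward maximum matching. The sketch below assumes that strategy, which the text does not specify, and uses an unspaced ASCII string to stand in for Chinese text; the function name and sample words are hypothetical.

```python
def forward_max_match(text: str, dictionary: set[str]) -> list[str]:
    """Segment text by always taking the longest dictionary word at the cursor."""
    max_len = max((len(word) for word in dictionary), default=1)
    tokens, i = [], 0
    while i < len(text):
        # Try the longest candidate first and shrink until the dictionary hits.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in dictionary:
                # A single character is emitted as-is when nothing longer matches.
                tokens.append(piece)
                i += size
                break
    return tokens


# Hypothetical dictionaries: the second one adds a longer professional term.
BASIC = {"regenerated", "cotton", "yarn"}
PROFESSIONAL = BASIC | {"regeneratedcotton"}
```

With `BASIC` the sample string "regeneratedcottonyarn" is cut into three words, while `PROFESSIONAL` lets the longer entry win and yields two.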
The IK segmentation preferably used in embodiments of the invention is based on mechanical word segmentation. The detailed process, from building the dictionary tree to lemma matching, is as follows:
Referring to Fig. 2, the IK segmentation effect diagram provided by Embodiment 1 of the present invention: taking the textile industry as an example, when the target keyword "regenerated cotton high-quality yarn" is entered in the search bar, the segmented lemmas found in the hot word bank are "regenerated", "cotton", "high-quality", "quality" and "yarn"; if the professional dictionary contains the lemma "regenerated cotton", the keyword is segmented directly into "regenerated cotton". Every lemma, including the target keyword, is assigned an attribute value in advance, such as the character string shown in the row-bytes column; the start and end columns show the extent of that string. The custom categoryString attribute value marked in the figure is used to judge whether the attribute value of a lemma is associated with that of the target keyword, which determines whether the keyword is a hot word. If the lemma's attribute value occurs within the target keyword's attribute value, a hot word lemma is present in the target keyword and the match succeeds, completing the segmentation of one lemma. The associated category and additional attribute information added to that lemma in advance are then displayed and saved, so that the next search for the target keyword can rebuild the search conditions from them, that is, search using the category in the IK lemma attributes.
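Rebuilding the search conditions from the matched categories might look like the sketch below, which adds a Solr-style `fq` filter to the query. The field name `category` and the function `build_search_params` are assumptions; the source does not give the actual query layout.

```python
from urllib.parse import parse_qs, urlencode


def build_search_params(keyword: str, matched_categories: list[str]) -> str:
    params = [("q", keyword)]
    if matched_categories:
        # De-duplicate while keeping first-seen order.
        unique = list(dict.fromkeys(matched_categories))
        clause = " OR ".join(f'"{name}"' for name in unique)
        # "category" is a hypothetical index field name.
        params.append(("fq", f"category:({clause})"))
    # urlencode escapes spaces, quotes and parentheses for use in a URL.
    return urlencode(params)


# Decode the query string again to inspect what would be sent.
decoded = parse_qs(build_search_params("regenerated cotton yarn", ["Yarn", "Yarn", "Cotton"]))
```

Keeping the original keyword in `q` and narrowing with a filter is one design choice among several; the categories could equally be used as a ranking boost.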
In terms of development, the lemma matching against the dictionary described above divides into two major processes: rewriting IK's dictionary and rewriting IK's segmentation adapter. Keyword + associated category + additional attribute information form a new lemma, which is loaded as a branch of the dictionary tree. Matching lemmas are loaded from the dictionary while the custom associated category and additional attribute information are obtained from a map, so that when the target keyword matches a lemma, the category information and additional attribute information of the target keyword are obtained at the same time. Specifically, the process comprises the following steps:
Step 1: construct the new lemma. That is, when the lemma is stored, a custom attribute value, the category associated with the lemma and the additional attribute information are added to it. From a development perspective, a custom class such as KeywordScores can be attached to the lemma class; it holds the attribute ID, the associated category, the additional attribute information and so on.
Step 2: load the newly constructed lemmas into the dictionary to update it. From a development perspective, the IK segmenter provides three kinds of vocabulary: (1) the main vocabulary main2012.dic; (2) the quantifier vocabulary quantifier.dic; (3) the stop words stopword.dic. The dictionary management class loads each of these dictionaries into an in-memory structure. The dictionary code itself is in org.wltea.analyzer.dic.DictSegment, which implements a core data structure of the IK segmenter, the trie, a structurally simple tree. The fillSegment method of the DictSegment class is therefore overloaded so that, with the lemma's bytes as the key, the score object is stored as the value in a HashMap.
Step 3: load matching lemmas from the dictionary and attach the custom associated category and additional attribute information. (1) Continuing from the previous step, a matchAndHaveScore method is added to the DictSegment class; it obtains the custom score information by the lemma's bytes and adds it to the Hit object, for example with the code KeywordScores scores=ds.getRootScores().get(new String(charArray)); Note that the lemma must be fully loaded: a lemma that has not been loaded cannot be looked up. The main purpose of this method is to return the matching Hit object, now also carrying the custom associated category and additional attribute information. (2) The lemma attributes also need to be loaded; loading custom attributes onto the lemma is done in the getNextLexeme method of the segmenter context class AnalyzeContext. (3) Add them in the segmentation adapter, which supports the Lucene version. This part is relatively simple: the custom elements of the lemma are added to the lemma attributes, so the word's custom attributes, such as category and attribute, are added here, as shown in the IK segmentation effect diagram.
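The idea behind the three steps, a trie whose matches also return custom information kept in a side map, can be sketched in Python as below. This is not IK's Java code: `LemmaTrie`, `fill`, `match` and the `scores` map are hypothetical names that loosely mirror fillSegment, matchAndHaveScore and the score HashMap.

```python
class TrieNode:
    def __init__(self):
        self.children = {}    # next character -> TrieNode
        self.is_word = False  # True when a lemma ends at this node


class LemmaTrie:
    def __init__(self):
        self.root = TrieNode()
        self.scores = {}      # lemma text -> custom info (category, attributes)

    def fill(self, lemma, info):
        # Insert the lemma character by character, then record its custom info.
        node = self.root
        for ch in lemma:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True
        self.scores[lemma] = info

    def match(self, chars):
        """Return (matched, info); info is None for prefixes and misses."""
        node = self.root
        for ch in chars:
            node = node.children.get(ch)
            if node is None:
                return False, None
        if not node.is_word:
            return False, None
        # The custom info is looked up by the matched text, as in the text above.
        return True, self.scores.get("".join(chars))


# Hypothetical sample lemma and custom info.
trie = LemmaTrie()
trie.fill("cotton", {"category": "Fabric", "status": "published"})
```

A lemma that was never passed to `fill` has no entry in `scores`, which is the situation the text warns about when it says the lemma must be fully loaded.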
An optional embodiment of the present invention further provides that the dictionary also includes a classification model trained on professional lemmas. When the target keyword cannot be matched to any hot word lemma, it is input into the pre-trained classification model to obtain a predicted classification result; the predicted classification result is input into the search server for searching, and the corresponding category is obtained and displayed.
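The two-branch flow, hot word lookup first and classifier as fallback, can be sketched as below. The function `search_category`, the result layout and the stub classifier are assumptions; a real system would call the trained model where `toy_classifier` stands.

```python
def search_category(keyword, hot_words, classify):
    """Route a keyword: hot word lookup first, classifier as fallback."""
    # Treat the keyword as a hot word when any hot word lemma occurs in it.
    hits = [word for word in hot_words if word in keyword]
    if hits:
        # Prefer the longest matching lemma and return its bound category.
        best = max(hits, key=len)
        return {"source": "hot_word", "lemma": best, "category": hot_words[best]}
    # No hot word matched: ask the pre-trained classifier for a category.
    return {"source": "classifier", "lemma": None, "category": classify(keyword)}


# Hypothetical hot word bank mapping each lemma to its bound category.
HOT_WORDS = {"cotton": "Fabric", "regenerated cotton": "Yarn"}


def toy_classifier(keyword):
    # Stand-in for the trained model; not a real classifier.
    return "Yarn" if "yarn" in keyword else "Other"
```

Passing the classifier in as a function keeps the routing logic independent of whichever model implementation is used.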
An optional embodiment of the present invention further provides that training the classification model comprises the following steps:
performing data preprocessing on the keyword samples to be trained;
inputting the preprocessed keyword samples as the input signal, with the lemmas associated with those keyword samples as the output, into the classification model to be trained, to obtain the model parameters of the classification model; wherein the model parameters include the number of training epochs, the learning rate and hierarchical softmax.
Preferably, in an optional embodiment of the present invention the classification model is the FastText text classification model.
Model training and verification are done with FastText. Following the steps in the official documentation, training parameters are tuned to improve model precision. The data set comes from the professional dictionary, whose vocabulary is larger than that of the hot word bank; 1 to 10 GB of data is all preprocessed. Briefly, the words in the dictionary can come from the platform's own data, such as the keywords and categories of products published by merchants, from curation by operations staff, from external crawling, and so on; this is not described in detail here. For the specific professional vocabulary, the training parameters are set for best results: for example, the number of training epochs is set to 10 (-epoch 10), the learning rate to 1.0 and word n-grams to 2. In short, the basic training steps are as follows:
(1) data preprocessing: strip whitespace while retaining professional symbols, as in "17*21 Oxford cloth 150D";
(2) adjust the number of training epochs (parameter -epoch, standard range [5, 50]);
(3) adjust the learning rate (parameter -lr, standard range [0.1, 1]);
(4) tune word n-grams (parameter -wordNgrams, standard range [1, 5]);
(5) switch to hierarchical softmax (parameter -loss hs) to speed up training.
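The steps above can be sketched as a small preparation script for the fastText command-line tool. The helper names and file names are hypothetical, and treating "strip whitespace" as collapsing runs of whitespace is an assumption; the parameter values are the ones quoted in the text.

```python
import re


def preprocess(keyword: str) -> str:
    # Collapse runs of whitespace and trim the ends; symbols such as "*" are
    # kept because they carry meaning in specifications like "17*21" or "150D".
    return re.sub(r"\s+", " ", keyword).strip()


def to_training_line(keyword: str, category: str) -> str:
    # fastText's supervised format: "__label__<label> <text>", one sample per line.
    label = category.replace(" ", "_")
    return f"__label__{label} {preprocess(keyword)}"


def training_command(train_file: str, model_prefix: str) -> list[str]:
    # 10 epochs, learning rate 1.0, word bigrams, and hierarchical softmax
    # to speed up training, as listed in the steps above.
    return [
        "fasttext", "supervised",
        "-input", train_file, "-output", model_prefix,
        "-epoch", "10", "-lr", "1.0", "-wordNgrams", "2", "-loss", "hs",
    ]


# Hypothetical sample and file names.
sample = to_training_line("  17*21   Oxford cloth 150D ", "Oxford cloth")
command = training_command("train.txt", "category_model")
```

The command list can be handed to `subprocess.run` on a machine where the fastText binary is installed; spaces in the label are replaced because fastText splits labels on whitespace.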
Model verification is simple and focuses on precision and recall. Precision is the fraction of samples predicted positive that are truly positive; recall is the fraction of positive samples that are predicted correctly. It is suggested that a prediction be treated as a valid category only when precision exceeds 60%. FastText has many implementations; for a Java version, see mayabot/fastText4j. FastText can also train word vectors, enabling functions such as synonym mining and word derivation.
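As a worked example of the two metrics, the sketch below computes per-category precision and recall from lists of true and predicted labels. The function name and the sample labels are hypothetical; the 60% threshold is the one suggested in the text.

```python
def precision_recall(y_true, y_pred, label):
    # True positives: samples of this label that were also predicted as it.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
    predicted = sum(1 for p in y_pred if p == label)  # TP + FP
    actual = sum(1 for t in y_true if t == label)     # TP + FN
    precision = tp / predicted if predicted else 0.0
    recall = tp / actual if actual else 0.0
    return precision, recall


# Hypothetical validation labels.
y_true = ["Yarn", "Yarn", "Fabric", "Fabric", "Yarn"]
y_pred = ["Yarn", "Fabric", "Fabric", "Fabric", "Yarn"]

precision, recall = precision_recall(y_true, y_pred, "Yarn")
is_valid_category = precision > 0.6  # threshold suggested in the text
```

Here both "Yarn" predictions are correct, so precision is high, while one of the three true "Yarn" samples is missed, which lowers recall.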
Embodiments of the invention perform matching retrieval between the target keyword and the lemmas pre-stored in the hot word bank, segment the target keyword according to the lemmas retrieved, and bind the resulting lemmas to the predefined lemma additional information to generate the preset target display information corresponding to each lemma. The user thus retrieves more comprehensive category information together with detailed additional information about each category, gains a fuller and more intuitive understanding of the commodities of interest, and can more easily and accurately find the desired commodity category and the details related to it.
Further, when no corresponding lemma can be retrieved from the hot word bank, embodiments of the invention input the target keyword into a pre-trained commodity category classification model, obtain the category and associated additional information corresponding to the target keyword, and display them to the user. Embodiments of the invention are implemented with open-source software and are simple to implement, easy to maintain, highly extensible and reliable in effect; they can realize category preference for a large-scale e-commerce platform and greatly improve search accuracy.
Another aspect of the present invention provides a commodity category search system, characterized by comprising an acquisition module, a matching module and a display module; wherein the acquisition module is configured to obtain an input target keyword to be searched and determine the type of the target keyword; if the target keyword is determined to be a hot word, the matching module is configured to match the target keyword against the hot word lemmas pre-stored in a dictionary; if the match succeeds, the display module is configured to search the search server with the predefined lemma category information bound to the lemma and display the search results;
if the target keyword is determined to be a non-hot word, the matching module is configured to input the target keyword into the pre-trained classification model, and the display module is configured to search the search server with the predefined lemma category information bound to the classification result and display the search results.
Further, the dictionary includes hot word lemmas built from a hot word bank, and the matching module includes a hot word matching module. The matching of the target keyword against the hot word lemmas pre-stored in the dictionary by the matching module specifically includes:
The hot word matching module performs matching retrieval between the target keyword and the hot word lemmas pre-stored in the dictionary using a segmentation method based on string matching.
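As an illustration of a segmentation method based on string matching, the sketch below uses forward maximum matching, one common dictionary-driven technique. The patent does not state which string-matching algorithm is used, so this choice, the function name `forward_max_match` and the sample lexicon are assumptions.

```python
from typing import List, Set


def forward_max_match(text: str, lexicon: Set[str]) -> List[str]:
    """Segment text by greedy longest-prefix matching against the lexicon.

    Characters that start no lexicon entry are emitted as single-character tokens.
    """
    if not lexicon:
        return list(text)
    max_len = max(len(word) for word in lexicon)
    tokens: List[str] = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking until a lexicon entry matches.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in lexicon:
                tokens.append(candidate)
                i += size
                break
    return tokens


# Hypothetical hot word lemmas for a textile marketplace.
SAMPLE_LEXICON = {"纯棉", "牛仔", "牛仔布", "面料"}
```

With this lexicon, `forward_max_match("纯棉牛仔布面料", SAMPLE_LEXICON)` prefers the longer entry "牛仔布" over "牛仔". Each resulting lemma can then be used as a key into the bound category information.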
Further, the dictionary further includes a classification model module obtained by training on domain-specific lemmas. When the target keyword cannot be matched with the hot word lemmas, the classification model module inputs the target keyword into the pre-trained classification model, obtains the predefined lemma category information bound to the classification result, and displays it.
Further, the system includes a model training module configured to preprocess the keyword samples to be trained;
The preprocessed keyword samples to be trained are used as the input, the lemmas associated with those keyword samples are used as the output, and both are fed to the classification model to be trained to obtain the model parameters of the classification model, wherein the model parameters include the number of training iterations, the learning rate and the hierarchical soft threshold.
Further, the system includes a model optimization module, which applies a gradient descent algorithm to the number of training iterations, the learning rate and the hierarchical soft threshold to obtain optimal parameters and thereby optimize the model.
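To make the roles of the number of training iterations and the learning rate concrete, the following is a minimal sketch of a bag-of-words softmax classifier trained by stochastic gradient descent. It is a simplified stand-in, not the classification model of the embodiments: the hierarchical parameter named above is not modelled, and the function names and the toy samples are assumptions.

```python
import math
from typing import Dict, List, Sequence, Tuple

Weights = Dict[str, Dict[str, float]]  # weights[label][token]


def predict_proba(weights: Weights, tokens: List[str]) -> Dict[str, float]:
    """Softmax over per-label scores, where a score is the sum of token weights."""
    scores = {label: sum(w.get(t, 0.0) for t in tokens) for label, w in weights.items()}
    top = max(scores.values())  # subtract the maximum for numerical stability
    exps = {label: math.exp(s - top) for label, s in scores.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}


def train_softmax(samples: Sequence[Tuple[List[str], str]],
                  epochs: int = 200,
                  learning_rate: float = 0.5) -> Weights:
    """Train with stochastic gradient descent on the cross-entropy loss."""
    labels = sorted({label for _, label in samples})
    weights: Weights = {label: {} for label in labels}
    for _ in range(epochs):
        for tokens, gold in samples:
            probs = predict_proba(weights, tokens)
            for label in labels:
                # Gradient w.r.t. the label score: p(label) - 1[label == gold].
                grad = probs[label] - (1.0 if label == gold else 0.0)
                for token in tokens:
                    current = weights[label].get(token, 0.0)
                    weights[label][token] = current - learning_rate * grad
    return weights


def predict(weights: Weights, tokens: List[str]) -> str:
    """Return the most probable label for the given tokens."""
    probs = predict_proba(weights, tokens)
    return max(probs, key=probs.get)


# Hypothetical preprocessed keyword samples: (tokens, category).
TOY_SAMPLES = [
    (["cotton", "shirt"], "apparel"),
    (["denim", "jeans"], "apparel"),
    (["cotton", "fabric"], "fabric"),
    (["denim", "fabric"], "fabric"),
]
```

Here `epochs` plays the part of the number of training iterations and `learning_rate` scales each gradient step. In practice both would be tuned against held-out data rather than fixed as in this sketch.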
In the commodity category search system provided by the embodiments of the present invention, the acquisition module obtains the input target keyword to be searched and determines its type, and the matching module applies a different matching method depending on whether the target keyword is a hot word or a non-hot word. If it is a hot word, the matching module matches the target keyword against the hot word lemmas pre-stored in the dictionary; if the match succeeds, the display module displays the predefined lemma category information bound to the matched lemma. If the target keyword is determined to be a non-hot word, the matching module inputs the target keyword into the pre-trained classification model, and the display module displays the predefined lemma category information bound to the classification result. Because a different matching algorithm is applied to each type of target keyword and the pre-bound commodity category is displayed, commodity categories are classified more accurately and at a finer granularity. In addition, other additional information is defined together with each lemma category, so that the matching retrieval returns more comprehensive category information together with the detailed additional information of that category. The user thus gains a fuller and more intuitive understanding of the commodities of interest and can more easily and accurately find the desired commodity category and the details related to it.
Further, the embodiments of the present invention are simple to implement, easy to maintain, highly extensible and reliable in effect; using open-source software, they can realize category preference for a large e-commerce platform and greatly improve search accuracy.
A third aspect of the present invention provides a storage medium on which the method described above is stored.
Although the present invention has been described in detail above by way of general description and specific embodiments, modifications or improvements may be made on the basis of the invention, as will be apparent to those skilled in the art. Accordingly, such modifications or improvements made without departing from the spirit of the present invention fall within the scope of protection claimed.

Claims (10)

1. A commodity category search method, characterized by comprising:
obtaining an input target keyword to be searched, and determining the type of the target keyword;
if the target keyword is determined to be a hot word, matching the target keyword against hot word lemmas pre-stored in a dictionary; and if the match succeeds, searching a search server using the predefined lemma category information bound to the matched lemma, and displaying the search results;
if the target keyword is determined to be a non-hot word, inputting the target keyword into a pre-trained classification model to obtain a predicted classification result, inputting the predicted classification result into the search server for searching, and obtaining and displaying the corresponding category.
2. The method according to claim 1, characterized in that matching the target keyword against the hot word lemmas pre-stored in the dictionary specifically comprises:
performing matching retrieval between the target keyword and the hot word lemmas pre-stored in the dictionary using a segmentation method based on string matching, and, if the match succeeds, displaying the predefined lemma category information bound to the matched lemma.
3. The method according to claim 1, characterized in that the training of the classification model comprises the following steps:
preprocessing the keyword samples to be trained;
using the preprocessed keyword samples to be trained as the input and the lemmas associated with those keyword samples as the output, feeding both to the classification model to be trained, and obtaining the model parameters of the classification model, wherein the model parameters include the number of training iterations, the learning rate and the hierarchical soft threshold.
4. The method according to claim 3, characterized in that training the classification model further comprises applying a gradient descent algorithm to the number of training iterations, the learning rate and the hierarchical soft threshold to obtain optimal parameters and thereby optimize the model.
5. The method according to claim 4, characterized by further comprising: performing the search using a Lucene-based Solr or ElasticSearch search server; matching the target keyword using the IK segmenter; and using a FastText text classification model as the classification model.
6. A commodity category search system, characterized by comprising an acquisition module, a matching module and a display module, wherein the acquisition module is configured to obtain an input target keyword to be searched and to determine the type of the target keyword; if the target keyword is determined to be a hot word, the matching module is configured to match the target keyword against hot word lemmas pre-stored in a dictionary; and if the match succeeds, the display module is configured to search a search server using the predefined lemma category information bound to the matched lemma and to display the search results;
if the target keyword is determined to be a non-hot word, the matching module is configured to input the target keyword into a pre-trained classification model to obtain a predicted classification result, and the display module is configured to input the predicted classification result into the search server for searching, and to obtain and display the corresponding category.
7. The system according to claim 6, characterized in that the dictionary includes hot word lemmas built from a hot word bank; the matching module includes a hot word matching module; and the matching of the target keyword against the hot word lemmas pre-stored in the dictionary by the matching module specifically comprises:
the hot word matching module performing matching retrieval between the target keyword and the hot word lemmas pre-stored in the dictionary using a segmentation method based on string matching.
8. The system according to claim 7, characterized by further comprising a model training module preceding the classification model module, configured to preprocess the keyword samples to be trained;
the preprocessed keyword samples to be trained are used as the input, the lemmas associated with those keyword samples are used as the output, and both are fed to the classification model to be trained to obtain the model parameters of the classification model, wherein the model parameters include the number of training iterations, the learning rate and the hierarchical soft threshold.
9. The system according to claim 8, characterized by further comprising a model optimization module, which applies a gradient descent algorithm to the number of training iterations, the learning rate and the hierarchical soft threshold to obtain optimal parameters and thereby optimize the model.
10. A computer-readable storage medium, characterized in that it stores the method according to any one of claims 1 to 5.
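The search step recited in claims 1 and 5 (querying a search server with the category obtained from the lemma binding or from the classification model) might look like the following sketch, which builds an Elasticsearch Query DSL request body. The index field names `title` and `category` and the function name are hypothetical; the claims do not specify the index schema or the query form.

```python
import json
from typing import Any, Dict


def build_category_query(keyword: str, category: str, size: int = 10) -> Dict[str, Any]:
    """Build a request body: full-text match on the keyword, filtered by category.

    "title" and "category" are hypothetical index fields.
    """
    return {
        "size": size,
        "query": {
            "bool": {
                # Scored full-text clause on the user's keyword.
                "must": [{"match": {"title": keyword}}],
                # Non-scoring exact filter on the resolved category.
                "filter": [{"term": {"category": category}}],
            }
        },
    }


if __name__ == "__main__":
    body = build_category_query("纯棉面料", "Fabrics/Cotton")
    print(json.dumps(body, ensure_ascii=False, indent=2))
```

Placing the category in a `filter` clause restricts results to that category without affecting relevance scoring, which is one reasonable design choice; boosting the category instead of filtering would be an alternative when the predicted category is uncertain.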
CN201910774650.4A 2019-08-21 2019-08-21 A kind of commodity class heading search method, system and storage medium Pending CN110490712A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910774650.4A CN110490712A (en) 2019-08-21 2019-08-21 A kind of commodity class heading search method, system and storage medium

Publications (1)

Publication Number Publication Date
CN110490712A true CN110490712A (en) 2019-11-22

Family

ID=68552618

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910774650.4A Pending CN110490712A (en) 2019-08-21 2019-08-21 A kind of commodity class heading search method, system and storage medium

Country Status (1)

Country Link
CN (1) CN110490712A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104424342A (en) * 2013-09-11 2015-03-18 携程计算机技术(上海)有限公司 Method for keyword matching, and device, server and system of method
CN106484698A (en) * 2015-08-25 2017-03-08 北京奇虎科技有限公司 A kind of method for pushing of search keyword and device
CN108304533A (en) * 2018-01-29 2018-07-20 上海名轩软件科技有限公司 Keyword recommendation method and equipment
US20180276728A1 (en) * 2007-11-14 2018-09-27 Panjiva, Inc. Transaction facilitating marketplace platform
CN109635198A (en) * 2018-12-17 2019-04-16 杭州柚子街信息科技有限公司 The method, apparatus of presentation user's search result, medium and electronic equipment on merchandise display platform

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
钟文波: "搜索引擎中关键词分类方法评估及推荐应用", 《中国优秀硕士学位论文全文数据库》 *

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111159552A (en) * 2019-12-30 2020-05-15 北京每日优鲜电子商务有限公司 Commodity searching method, commodity searching device, server and storage medium
CN111638834A (en) * 2020-04-27 2020-09-08 维沃移动通信有限公司 Content searching method and electronic equipment
CN111931040A (en) * 2020-06-30 2020-11-13 深圳市世强元件网络有限公司 Recommendation method for service entry of service entity in network platform
CN111931040B (en) * 2020-06-30 2024-01-12 深圳市世强元件网络有限公司 Recommendation method for service entry of service entity in network platform
CN112328872A (en) * 2020-10-27 2021-02-05 北京字节跳动网络技术有限公司 Information display method, information search method and device
CN112445895A (en) * 2020-11-16 2021-03-05 深圳市世强元件网络有限公司 Method and system for identifying user search scene
CN112445895B (en) * 2020-11-16 2024-04-19 深圳市世强元件网络有限公司 Method and system for identifying user search scene
CN112287042A (en) * 2020-11-22 2021-01-29 长沙修恒信息科技有限公司 Material name processing system in ERP system
CN113743973A (en) * 2020-11-30 2021-12-03 北京沃东天骏信息技术有限公司 Method and device for analyzing market hotspot trend
CN112687403A (en) * 2021-01-08 2021-04-20 拉扎斯网络科技(上海)有限公司 Medicine dictionary generation and medicine search method and device
CN112687403B (en) * 2021-01-08 2022-12-02 拉扎斯网络科技(上海)有限公司 Medicine dictionary generation and medicine search method and device
CN112767081A (en) * 2021-01-19 2021-05-07 广州新丝路信息科技有限公司 Cross-border bonded bin commodity classification method and device
CN113483518A (en) * 2021-05-19 2021-10-08 海信视像科技股份有限公司 Refrigerator and interface display method
CN113222455A (en) * 2021-05-28 2021-08-06 西安热工研究院有限公司 Generator set parameter name matching method based on modular decomposition and matching
CN115708085A (en) * 2021-08-09 2023-02-21 腾讯科技(深圳)有限公司 Business processing method, neural network model training method, device, equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20191122