CN110490712A - Commodity category search method, system and storage medium - Google Patents
- Publication number
- CN110490712A (application CN201910774650.4A)
- Authority
- CN
- China
- Prior art keywords
- lemma
- keyword
- hot word
- target keyword
- classification
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/06—Buying, selling or leasing transactions
- G06Q30/0601—Electronic shopping [e-shopping]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/06—Buying, selling or leasing transactions
- G06Q30/0601—Electronic shopping [e-shopping]
- G06Q30/0603—Catalogue ordering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/06—Buying, selling or leasing transactions
- G06Q30/0601—Electronic shopping [e-shopping]
- G06Q30/0641—Shopping interfaces
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Business, Economics & Management (AREA)
- General Physics & Mathematics (AREA)
- Accounting & Taxation (AREA)
- Finance (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Marketing (AREA)
- General Business, Economics & Management (AREA)
- Strategic Management (AREA)
- Economics (AREA)
- Databases & Information Systems (AREA)
- Development Economics (AREA)
- General Engineering & Computer Science (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
An embodiment of the invention discloses a commodity category search method, system and storage medium. The method comprises: obtaining the input target keyword to be searched and determining its type; if the target keyword is a hot word, matching it against the hot-word lemmas pre-stored in a dictionary and, if the match succeeds, searching the search server with the predefined lemma category information bound to the matched lemma and displaying the search results; if the target keyword is not a hot word, inputting it into a pre-trained classification model to obtain a predicted classification result, submitting that result to the search server, and obtaining and displaying the corresponding category. By matching and segmenting the target keyword against a pre-built dictionary tree whose lemmas carry predefined categories, the embodiment lets the user see the commodity category to which the target keyword belongs as soon as the match completes.
Description
Technical field
Embodiments of the present invention relate to the field of Internet technology, and in particular to a commodity category search method, system and storage medium.
Background art
At present, shopping on e-commerce websites falls into three main modes: category browsing, advertising operation and search. A category is a classification of commodities and exists at both the front end and the back end: front-end categories are displayed in the UI (User Interface), back-end categories are used for commodity management, and the mapping between the two is described by rules. Mainstream category systems are currently represented as trees in which each parent category has multiple subcategories and each subcategory has exactly one parent category, so the scope of a category narrows from top to bottom.
Category browsing is implemented by the website's operators, who first combine the first-level categories and then display these combinations in order of user attention. A user who wants to buy a commodity under a certain category enters that category and filters the commodities there. This mode requires the user to be familiar with the category system in order to find the desired commodity. Advertising operation means promoting individual items or shops through advertisements; the user clicks an advertisement and enters the shop to buy. In search mode, the user enters a keyword according to purchase intention and receives a recommended category list and item list; this mode does not require the user to understand the mainstream purchasing model.
Under this mainstream search mode, intelligent category navigation emerged to reduce the search time and the number of clicks a user needs while shopping on an e-commerce website.
In the early days, e-commerce websites navigated by category commodity count: after the user enters a keyword, the recommended categories are determined by the number of related commodities under each category and are displayed in that order. Under this text-matching navigation mode, as the number of commodities and categories grows sharply, a keyword query returns far more categories; text matching cannot reflect the relevance between the query and a category, and the user cannot tell which categories to enter for finer filtering.
Summary of the invention
To this end, embodiments of the present invention provide a commodity category search method, system and storage medium, to solve the problem in the prior art that common e-commerce platforms do not show different commodity search categories and attributes for different keywords and do not reflect the relevance between the query and the category, so that users must spend considerable time finding the actual product.
To achieve the above object, an embodiment of the present invention provides a commodity category search method, comprising:
obtaining the input target keyword to be searched, and determining the type of the target keyword;
if the target keyword is determined to be a hot word, matching the target keyword against the hot-word lemmas pre-stored in a dictionary and, if the match succeeds, displaying the predefined lemma category information bound to the matched lemma;
if the target keyword is determined not to be a hot word, inputting the target keyword into a pre-trained classification model to obtain a predicted classification result, submitting the predicted classification result to a search server for searching, and obtaining and displaying the corresponding category.
Further, matching the target keyword against the hot-word lemmas pre-stored in the dictionary specifically includes:
performing matching retrieval between the target keyword and the hot-word lemmas pre-stored in the dictionary using a word segmentation method based on string matching and, if the match succeeds, displaying the predefined lemma category information bound to the matched lemma.
Further, training the classification model includes the following steps:
preprocessing the keyword samples to be trained;
feeding the preprocessed keyword samples into the classification model to be trained as input, with the lemmas associated with those samples as output, to obtain the model parameters of the classification model, where the model parameters include the number of training epochs, the learning rate and hierarchical softmax.
Further, training the classification model also includes applying a gradient descent algorithm to the number of training epochs, the learning rate and hierarchical softmax to obtain optimal parameters and optimize the model.
Further, the method also includes: matching the target keyword against the lemmas pre-stored in the hot word bank using a Lucene-based Solr or ElasticSearch search server; matching the target keyword using the IK segmenter; and using the FastText text classification model as the classification model.
Another aspect of the present invention provides a commodity category search system, comprising an acquisition module, a matching module and a display module. The acquisition module obtains the input target keyword to be searched and determines its type. If the target keyword is determined to be a hot word, the matching module matches it against the hot-word lemmas pre-stored in a dictionary and, if the match succeeds, the display module displays the predefined lemma category information bound to the matched lemma.
If the target keyword is determined not to be a hot word, the matching module inputs it into a pre-trained classification model to obtain a predicted classification result, and the display module submits the predicted classification result to a search server for searching and obtains and displays the corresponding category.
Further, the dictionary includes hot-word lemmas built from a hot word bank, and the matching module includes a hot-word matching module. The matching module matching the target keyword against the hot-word lemmas pre-stored in the dictionary specifically includes:
the hot-word matching module performing matching retrieval between the target keyword and the hot-word lemmas pre-stored in the dictionary using a word segmentation method based on string matching.
Further, the system includes a model training module configured to preprocess the keyword samples to be trained, and to feed the preprocessed keyword samples into the classification model to be trained as input, with the lemmas associated with those samples as output, to obtain the model parameters of the classification model, where the model parameters include the number of training epochs, the learning rate and hierarchical softmax.
Further, the system includes a model optimization module that applies a gradient descent algorithm to the number of training epochs, the learning rate and hierarchical softmax to obtain optimal parameters and optimize the model.
A third aspect of the present invention provides a computer-readable storage medium storing the method described above.
The embodiments of the present invention have the following advantages:
Using a word segmentation method based on string matching, the embodiment matches and segments the target keyword against a pre-built dictionary tree whose lemmas carry predefined categories, so that the user can see the commodity category to which the target keyword belongs as soon as the match completes. In addition, other additional information is defined together with the lemma categories, so that matching retrieves not only fuller category information but also the detailed additional information of those categories. The user gains a more complete and direct understanding of the commodities of interest and can find the desired commodity category, and the details related to it, more easily and more accurately.
Further, when no corresponding lemma can be retrieved from the hot word bank, the embodiment inputs the target keyword into a pre-trained commodity category classification model, obtains the category and associated additional information for the target keyword, and shows them to the user. The embodiment is built on open-source software, is simple to implement, easy to maintain, highly scalable and reliable in effect, can realize category tendency on a large e-commerce platform, and greatly improves search accuracy.
Brief description of the drawings
To explain the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are merely exemplary; a person of ordinary skill in the art can derive other implementation drawings from them without creative effort.
The structures, proportions and sizes depicted in this specification only accompany the disclosed content for the understanding of those skilled in the art and are not intended to limit the conditions under which the invention can be implemented, so they carry no essential technical meaning. Any structural modification, change of proportion or adjustment of size that does not affect the effects the invention can produce or the purposes it can achieve still falls within the scope covered by the disclosed technical content.
Fig. 1 is a schematic flow block diagram of a commodity category search method provided by Embodiment 1 of the present invention;
Fig. 2 is an IK segmentation effect diagram of a commodity category search system provided by Embodiment 1 of the present invention;
Fig. 3 is a schematic block diagram of a commodity category search system provided by Embodiment 3 of the present invention.
Detailed description of the embodiments
The embodiments of the present invention are illustrated below through specific examples, and those skilled in the art can readily understand other advantages and effects of the invention from the content disclosed in this specification. The described embodiments are evidently some, not all, of the embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the protection scope of the present invention.
Referring to Fig. 1, a schematic flow block diagram of the commodity category search method provided by Embodiment 1 of the present invention, the method includes:
obtaining the input target keyword to be searched;
matching the target keyword against the lemmas pre-stored in the dictionary;
if the match succeeds, displaying the predefined lemma category information bound to the matched lemma.
The target keyword is the commodity name the user wants to search for, or descriptive text associated with the commodity, as entered by the user.
The dictionary includes hot-word lemmas built from a hot word bank, as well as a classification model trained on professional lemmas. The hot word bank provided in this embodiment is mainly a lexicon specially screened by operations staff and updated at irregular intervals with reference to search logs; hot words are stored in the hot word bank as lemmas.
The keyword may be a brand name, a product name itself, or the words a user uses to describe a commodity.
The hot word bank can be built from multiple channels such as user search logs and the product keywords users publish, or from the specialized vocabulary of related fields; for example, the question-and-answer channels of a global professional website yield many professional terms. Each lemma in the hot word bank is preset with an attribute value, which is a character string, and each lemma also stores additional attribute information, such as the category associated with the lemma, the product name, the publication status and the current purchase quantity. Note that an associated category can be understood as a category that shares at least one word with the lemma; it can also be, for example, a product material category in the field corresponding to the keyword.
The hot-word lemmas are added to the dictionary as follows. The data in the hot word bank is imported into the dictionary in advance as dictionary entries, which are the lemmas referred to in this embodiment. The category associated with each lemma and the additional attribute information described above are attached to the lemma, and a dictionary tree is generated. When the target keyword matches a lemma in the dictionary tree, the category and additional attribute information attached in advance are displayed along with it.
When the target keyword is matched against the hot-word lemmas in the dictionary, the step includes:
performing matching retrieval between the target keyword and the hot-word lemmas pre-stored in the dictionary using a word segmentation method based on string matching and, if the match succeeds, searching the search server with the predefined lemma category information and additional attribute information bound to the matched lemma and displaying the search results.
In this embodiment, the word segmentation method based on string matching is also called mechanical segmentation. It matches the Chinese character string to be analyzed, according to a certain strategy, against the lemmas in a sufficiently large machine dictionary; if a string is found in the dictionary, the match succeeds and a word is recognized. For example, in this embodiment the attribute string of the target keyword is compared with the attribute string of a lemma, and if the lemma's attribute string occurs within the target keyword's attribute string, the target keyword and the lemma match.
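As an illustration of string-matching segmentation, the following Python sketch uses forward maximum matching, one classic strategy for mechanical segmentation. It is not the IK segmenter's algorithm, and the function name, the sample dictionary and the sample strings are assumptions chosen for the example rather than data from the embodiment.

```python
def forward_max_match(text, dictionary):
    """Greedy segmentation: at each position take the longest dictionary entry."""
    max_len = max((len(word) for word in dictionary), default=1)
    tokens = []
    i = 0
    while i < len(text):
        match = None
        # Try the longest candidate first, then shrink one character at a time.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if candidate in dictionary:
                match = candidate
                break
        if match is None:
            match = text[i]  # unknown character: emit it as a one-character token
        tokens.append(match)
        i += len(match)
    return tokens


# Hypothetical lemmas: "再生" (regenerated), "棉" (cotton), "纱线" (yarn).
basic = {"再生", "棉", "纱线"}
# The same dictionary extended with the professional lemma "再生棉" (regenerated cotton).
professional = basic | {"再生棉"}

print(forward_max_match("再生棉纱线", basic))
print(forward_max_match("再生棉纱线", professional))
```

With only the basic lemmas the string is cut into three pieces, while adding the longer professional lemma makes the segmenter prefer it, which mirrors the behaviour the embodiment describes for a professional dictionary entry.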
The IK segmentation preferably used in this embodiment is based on mechanical segmentation. The process from building the dictionary tree to matching lemmas is described in detail below.
Refer to Fig. 2, the IK segmentation effect diagram provided by Embodiment 1. Taking the textile industry as an example, when the target keyword "regeneration cotton high-quality yarn" is entered in the search bar, the segmented lemmas found in the hot word bank are "regeneration", "cotton", "high-quality", "quality" and "yarn"; if the professional dictionary contains the lemma "regeneration cotton", the text is segmented directly into "regeneration cotton". Every lemma, and the target keyword itself, is assigned an attribute value in advance, such as the character string shown in the row-bytes field, while the start and end columns show the length of that string. The custom categoryString attribute marked in the figure is used to judge whether the attribute value of a lemma is associated with the attribute value of the target keyword, and thus whether the keyword is a hot word. If the lemma's attribute value occurs in the target keyword's attribute value, a hot-word lemma is present in the target keyword: the match succeeds and one lemma is cut. The category and additional attribute information attached to the lemma in advance are then displayed and saved, so that the next search for the target keyword can recombine the search conditions, that is, search using the category carried by the IK lemma attribute.
In terms of implementation, matching lemmas against the dictionary involves two main tasks: rewriting the IK dictionary and rewriting the IK segmentation adapter. A keyword, its associated category and its additional attribute information form a new lemma, which is loaded as a branch of the dictionary tree. Matching lemmas are then loaded from the dictionary, and the custom category and additional attribute information are obtained from a map, so that when the target keyword matches a lemma, the category information and additional attribute information of the target keyword are obtained at the same time. Specifically, the steps are as follows.
Step one: construct the new lemma. When a lemma is stored, a custom attribute value, the category associated with the lemma and the additional attribute information are added to it. In implementation terms, a custom class such as KeywordScores can be attached to the lemma class; it holds the attribute ID, the associated category, the additional attribute information and so on.
Step two: load the constructed lemmas into the dictionary to update it. In implementation terms, the IK segmenter provides three kinds of word lists: (1) the main dictionary main2012.dic, (2) the quantifier dictionary quantifier.dic, and (3) the stop-word dictionary stopword.dic. The dictionary management class loads each of these dictionaries into an in-memory structure. The dictionary code itself is in org.wltea.analyzer.dic.DictSegment, which implements a core data structure of the IK segmenter, the trie, a fairly simple tree structure. The fillSegment method of the DictSegment class is therefore overloaded so that the lemma's bytes serve as the key and the score object as the value, stored in a hashmap.
Step three: load matching lemmas from the dictionary and add the custom category and additional attribute information. (1) Continuing from the previous step, a matchAndHaveScore method is added to the DictSegment class; it obtains the custom score information by the lemma's bytes and adds it to the Hit object, for example with KeywordScores scores = ds.getRootScores().get(new String(charArray));. Note that the lemma must be fully loaded; an incompletely loaded lemma cannot be retrieved. The main job of this method is to return the matching Hit object, now with the custom category and additional attribute information loaded as well. (2) The lemma attributes must also be loaded: the getNextLexeme method of the segmenter context class AnalyzeContext loads the custom attributes into the lemma. (3) The attributes are added to the segmentation adapter, with support for the Lucene version in use. This part is relatively simple: the lemma's custom elements are added to the lemma attributes, which is where the custom attributes of a word, such as category and attributes, are attached, as shown in the IK segmentation effect diagram.
In an optional embodiment, the dictionary further includes a classification model trained on professional lemmas. When the target keyword cannot be matched against the hot-word lemmas, it is input into the pre-trained classification model to obtain a predicted classification result, which is submitted to the search server for searching, and the corresponding category is obtained and displayed.
In an optional embodiment, training the classification model includes the following steps:
preprocessing the keyword samples to be trained;
feeding the preprocessed keyword samples into the classification model to be trained as input, with the lemmas associated with those samples as output, to obtain the model parameters of the classification model, where the model parameters include the number of training epochs, the learning rate and hierarchical softmax.
Preferably, in an optional embodiment the classification model is the FastText text classification model.
Model training and validation are done with FastText. During training, parameters are tuned following the steps in the official documentation to improve model precision. The data set comes from the professional dictionary, whose vocabulary is larger than that of the hot word bank; all 1 to 10 GB of data is preprocessed. Briefly, the words in the dictionary come mainly from the platform's own data, such as the keywords and categories of products published by merchants, or are crawled externally and curated by operations staff; the details are omitted here. For the specific professional vocabulary, the training parameters are set to their best values, for example 10 training epochs, a learning rate of 1.0 and word n-grams of 2. In short, the basic training steps are as follows:
(1) preprocess the data: remove spaces and retain professional symbols, as in 17*21 Oxford cloth 150D;
(2) adjust the number of training epochs (parameter -epoch, standard range [5, 50]);
(3) adjust the learning rate (parameter -lr, standard range [0.1, 1]);
(4) tune the word n-grams (parameter -wordNgrams, standard range [1, 5]);
(5) switch to hierarchical softmax (parameter -loss hs) to speed up training.
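The training steps above can be sketched in Python as a data-preparation helper plus the fastText command line that uses the listed flags. The helper names and file names are hypothetical, and the preprocessing is an assumption: the text only says to remove spaces and keep professional symbols, so this sketch collapses runs of whitespace and leaves symbols such as * untouched. The `__label__` prefix is fastText's default label marker for supervised training data.

```python
import re
from typing import List


def preprocess(text: str) -> str:
    """Collapse whitespace runs and trim; symbols such as '*' are kept as-is."""
    return re.sub(r"\s+", " ", text).strip()


def to_fasttext_line(keyword: str, category: str) -> str:
    """Format one sample as '__label__<category> <text>' for supervised training."""
    label = "__label__" + re.sub(r"\s+", "_", category.strip())
    return f"{label} {preprocess(keyword)}"


def train_command(train_file: str, model_prefix: str, epoch: int = 10,
                  lr: float = 1.0, word_ngrams: int = 2) -> List[str]:
    """Build the fastText CLI call with the parameters named in the steps above."""
    return [
        "fasttext", "supervised",
        "-input", train_file,
        "-output", model_prefix,
        "-epoch", str(epoch),
        "-lr", str(lr),
        "-wordNgrams", str(word_ngrams),
        "-loss", "hs",  # hierarchical softmax, used to speed up training
    ]


print(to_fasttext_line("17*21   Oxford cloth 150D", "Oxford cloth"))
print(" ".join(train_command("train.txt", "category_model")))
```

The command is only built, not executed, so the sketch runs without fastText installed; passing the list to subprocess.run would start training where the fasttext binary is available.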
Model validation is straightforward; the metrics to watch are precision and recall. Precision is the proportion of samples predicted as positive that are truly positive, and recall is the proportion of truly positive samples that are predicted correctly. A precision above 60% is suggested as the threshold for treating a predicted category as valid. FastText has many implementations; for a Java version, see mayabot/fastText4j. FastText can also train word vectors, which enables functions such as synonym mining and word derivation.
The embodiment performs matching retrieval between the target keyword and the lemmas pre-stored in the hot word bank, segments the target keyword according to the lemmas retrieved, and binds the resulting lemmas to their predefined additional information to generate the preset display information for each lemma. In this way the user retrieves not only fuller category information but also the detailed additional information of those categories, gains a more complete and direct understanding of the commodities of interest, and can find the desired commodity category, and the details related to it, more easily and more accurately.
Further, when no corresponding lemma can be retrieved from the hot word bank, the embodiment inputs the target keyword into a pre-trained commodity category classification model, obtains the category and associated additional information for the target keyword, and shows them to the user. The embodiment is built on open-source software, is simple to implement, easy to maintain, highly scalable and reliable in effect, can realize category tendency on a large e-commerce platform, and greatly improves search accuracy.
Another aspect of the present invention provides a commodity category search system, comprising an acquisition module, a matching module and a display module. The acquisition module obtains the input target keyword to be searched and determines its type. If the target keyword is determined to be a hot word, the matching module matches it against the hot-word lemmas pre-stored in the dictionary and, if the match succeeds, the display module searches the search server with the predefined lemma category information bound to the matched lemma and displays the search results.
If the target keyword is determined not to be a hot word, the matching module inputs it into the pre-trained classification model, and the display module searches the search server with the predefined lemma category information bound to the classification result and displays the search results.
Further, the dictionary includes hot-word lemmas built from a hot word bank, and the matching module includes a hot-word matching module. The matching module matching the target keyword against the hot-word lemmas pre-stored in the dictionary specifically includes:
the hot-word matching module performing matching retrieval between the target keyword and the hot-word lemmas pre-stored in the dictionary using a word segmentation method based on string matching.
Further, the dictionary also includes a classification model module trained on professional lemmas. When the target keyword cannot be matched against the hot-word lemmas, the classification model module inputs the target keyword into the pre-trained classification model, and the predefined lemma category information bound to the classification result is obtained and displayed.
Further, the system also includes a model training module configured to perform data preprocessing on the keyword samples to be trained.
The preprocessed keyword samples are taken as the input signal and the lemmas associated with those keyword samples as the output; both are input into the classification model to be trained, and the model parameters of the classification model are obtained. The model parameters include the number of training iterations, the learning rate, and the layering soft-threshold.
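The preprocessing step could look something like the sketch below, which cleans keyword samples and pairs each with its associated lemma as a label. It assumes a fastText-style supervised training format (one `__label__<label> text` line per sample); the cleaning rules and the function names `preprocess` and `to_training_lines` are illustrative assumptions, not taken from the source.

```python
import re
from typing import Iterable, List, Tuple


def preprocess(keyword: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    cleaned = re.sub(r"[^\w\s]", " ", keyword.lower())
    return " ".join(cleaned.split())


def to_training_lines(samples: Iterable[Tuple[str, str]]) -> List[str]:
    """Turn (keyword, lemma) pairs into '__label__<lemma> <keyword>' lines."""
    lines: List[str] = []
    for keyword, lemma in samples:
        text = preprocess(keyword)
        if text:  # skip samples that are empty after cleaning
            label = lemma.strip().replace(" ", "_")
            lines.append(f"__label__{label} {text}")
    return lines
```

The resulting lines can be written to a text file and passed to a supervised text classifier that accepts this format.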
Further, the system also includes a model optimization module, which applies a gradient descent algorithm to the number of training iterations, the learning rate, and the layering soft-threshold to obtain optimal parameters and thereby optimize the model.
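The source does not detail how gradient descent is applied, so the following is only a generic illustration of how the number of training iterations and the learning rate enter a gradient descent loop. It fits a one-parameter toy model rather than the text classifier; the function name `fit_slope` and the default values are assumptions.

```python
from typing import Sequence


def fit_slope(
    xs: Sequence[float],
    ys: Sequence[float],
    learning_rate: float = 0.05,
    epochs: int = 200,
) -> float:
    """Fit y = w * x by gradient descent on the mean squared error."""
    w = 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradient of (1/n) * sum((w*x - y)^2) with respect to w.
        grad = (2.0 / n) * sum(x * (w * x - y) for x, y in zip(xs, ys))
        # Step against the gradient, scaled by the learning rate.
        w -= learning_rate * grad
    return w
```

Too few iterations or too small a learning rate leaves `w` short of the optimum, while too large a learning rate makes the updates diverge, which is why both values are worth tuning.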
In the commodity category search system provided by the embodiments of the present invention, the acquisition module obtains the input target keyword to be searched and determines its type, and the matching module applies a different matching method depending on whether the target keyword is a hot word or a non-hot word. For a hot word, the matching module matches the target keyword against the hot word lemmas pre-stored in the dictionary; if the match succeeds, the display module displays the predefined lemma category information bound to the lemma. For a non-hot word, the matching module inputs the target keyword into the pre-trained classification model, and the display module displays the predefined lemma category information bound to the classification result. Because a different matching algorithm is applied to each type of target keyword and the pre-bound commodity category is displayed, commodities are classified more accurately and at a finer granularity. In addition, other additional information is defined together with the lemma category, so that matching retrieval returns not only more comprehensive category information but also detailed additional information about that category. Users can thus gain a fuller and more intuitive understanding of the commodities they are interested in, and can find the commodity category they want and the details related to it more easily and accurately.
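A dictionary entry that binds a lemma to both its category and additional information might be modelled as in the sketch below. The class `LemmaEntry`, its field names, and the `describe` helper are hypothetical; the source does not specify what the additional information contains.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class LemmaEntry:
    """A dictionary entry: a lemma, its bound category, and extra details."""
    lemma: str
    category: str
    extra: Dict[str, str] = field(default_factory=dict)


def describe(keyword: str, dictionary: Dict[str, LemmaEntry]) -> Optional[str]:
    """Return the category plus any additional information, or None if unmatched."""
    entry = dictionary.get(keyword)
    if entry is None:
        return None
    details = ", ".join(f"{key}: {value}" for key, value in sorted(entry.extra.items()))
    return f"{entry.category} ({details})" if details else entry.category
```

Keeping the additional information in the same record as the category means a single successful match yields everything that needs to be displayed.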
Further, the embodiments of the present invention are simple to implement, easy to maintain, highly scalable and reliable in effect; using open-source software, they can determine the category tendency of searches on a large-scale e-commerce platform and greatly improve search accuracy.
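As an illustration of the search step with open-source software, the sketch below builds a request body in Elasticsearch's query DSL (claim 5 names Lucene-based Solr or ElasticSearch as the search server) that restricts a keyword match to one category. The field names `title` and `category_id`, the function name, and the page size are assumptions; the source does not describe the index schema.

```python
from typing import Any, Dict


def build_category_query(keyword: str, category_id: str, size: int = 20) -> Dict[str, Any]:
    """Build a search body that matches the keyword within a single category."""
    return {
        "size": size,
        "query": {
            "bool": {
                # Full-text match on the (hypothetical) product title field.
                "must": [{"match": {"title": keyword}}],
                # Exact, non-scoring restriction to the resolved category.
                "filter": [{"term": {"category_id": category_id}}],
            }
        },
    }
```

Placing the category in a `filter` clause keeps it from affecting relevance scores, so results are ranked by the keyword match alone.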
A third aspect of the present invention provides a storage medium on which the method described above is stored.
Although the present invention has been described in detail above by way of general explanation and specific embodiments, it will be apparent to those skilled in the art that modifications or improvements can be made on the basis of the present invention. Therefore, such modifications or improvements made without departing from the spirit of the present invention fall within the scope of protection claimed by the present invention.
Claims (10)
1. A commodity category search method, characterized by comprising:
obtaining an input target keyword to be searched, and determining the type of the target keyword;
if the target keyword is determined to be a hot word, matching the target keyword against hot word lemmas pre-stored in a dictionary; if the match succeeds, searching a search server using the predefined lemma category information bound to the lemma, and displaying the search results;
if the target keyword is determined to be a non-hot word, inputting the target keyword into a pre-trained classification model to obtain a predicted classification result, inputting the predicted classification result into the search server for searching, and obtaining and displaying the corresponding category.
2. The method according to claim 1, characterized in that matching the target keyword against the hot word lemmas pre-stored in the dictionary specifically comprises:
performing matching retrieval between the target keyword and the hot word lemmas pre-stored in the dictionary using a segmentation method based on string matching, and, if the match succeeds, displaying the predefined lemma category information bound to the lemma.
3. The method according to claim 1, characterized in that training the classification model comprises the following steps:
performing data preprocessing on keyword samples to be trained;
taking the preprocessed keyword samples as the input signal and the lemmas associated with the keyword samples as the output, inputting them into the classification model to be trained, and obtaining model parameters of the classification model; wherein the model parameters include the number of training iterations, the learning rate, and the layering soft-threshold.
4. The method according to claim 3, characterized in that training the classification model further comprises applying a gradient descent algorithm to the number of training iterations, the learning rate, and the layering soft-threshold to obtain optimal parameters, and optimizing the model.
5. The method according to claim 4, characterized by further comprising: searching with a Lucene-based Solr or ElasticSearch search server; matching the target keyword using the IK segmenter; and using a FastText text classification model as the classification model.
6. A commodity category search system, characterized by comprising an acquisition module, a matching module and a display module; wherein the acquisition module is configured to obtain an input target keyword to be searched and to determine the type of the target keyword; if the target keyword is determined to be a hot word, the matching module is configured to match the target keyword against hot word lemmas pre-stored in a dictionary; if the match succeeds, the display module is configured to search a search server using the predefined lemma category information bound to the lemma and to display the search results;
if the target keyword is determined to be a non-hot word, the matching module is configured to input the target keyword into a pre-trained classification model to obtain a predicted classification result, and the display module is configured to input the predicted classification result into the search server for searching, and to obtain and display the corresponding category.
7. The system according to claim 6, characterized in that the dictionary includes hot word lemmas built from a hot word bank; the matching module includes a hot word matching module; and the matching module matching the target keyword against the hot word lemmas pre-stored in the dictionary specifically comprises:
the hot word matching module performing matching retrieval between the target keyword and the hot word lemmas pre-stored in the dictionary using a segmentation method based on string matching.
8. The system according to claim 7, characterized in that a model training module precedes the classification model module and is configured to perform data preprocessing on keyword samples to be trained;
the preprocessed keyword samples are taken as the input signal and the lemmas associated with the keyword samples as the output, and are input into the classification model to be trained to obtain model parameters of the classification model; wherein the model parameters include the number of training iterations, the learning rate, and the layering soft-threshold.
9. The system according to claim 8, characterized by further comprising a model optimization module, which applies a gradient descent algorithm to the number of training iterations, the learning rate, and the layering soft-threshold to obtain optimal parameters and optimize the model.
10. A computer-readable storage medium, characterized in that the method according to any one of claims 1-5 is stored thereon.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910774650.4A CN110490712A (en) | 2019-08-21 | 2019-08-21 | A kind of commodity class heading search method, system and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910774650.4A CN110490712A (en) | 2019-08-21 | 2019-08-21 | A kind of commodity class heading search method, system and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110490712A (en) | 2019-11-22 |
Family
ID=68552618
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910774650.4A Pending CN110490712A (en) | 2019-08-21 | 2019-08-21 | A kind of commodity class heading search method, system and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110490712A (en) |
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111159552A (en) * | 2019-12-30 | 2020-05-15 | 北京每日优鲜电子商务有限公司 | Commodity searching method, commodity searching device, server and storage medium |
CN111638834A (en) * | 2020-04-27 | 2020-09-08 | 维沃移动通信有限公司 | Content searching method and electronic equipment |
CN111931040A (en) * | 2020-06-30 | 2020-11-13 | 深圳市世强元件网络有限公司 | Recommendation method for service entry of service entity in network platform |
CN112287042A (en) * | 2020-11-22 | 2021-01-29 | 长沙修恒信息科技有限公司 | Material name processing system in ERP system |
CN112328872A (en) * | 2020-10-27 | 2021-02-05 | 北京字节跳动网络技术有限公司 | Information display method, information search method and device |
CN112445895A (en) * | 2020-11-16 | 2021-03-05 | 深圳市世强元件网络有限公司 | Method and system for identifying user search scene |
CN112687403A (en) * | 2021-01-08 | 2021-04-20 | 拉扎斯网络科技(上海)有限公司 | Medicine dictionary generation and medicine search method and device |
CN112767081A (en) * | 2021-01-19 | 2021-05-07 | 广州新丝路信息科技有限公司 | Cross-border bonded bin commodity classification method and device |
CN113222455A (en) * | 2021-05-28 | 2021-08-06 | 西安热工研究院有限公司 | Generator set parameter name matching method based on modular decomposition and matching |
CN113483518A (en) * | 2021-05-19 | 2021-10-08 | 海信视像科技股份有限公司 | Refrigerator and interface display method |
CN113743973A (en) * | 2020-11-30 | 2021-12-03 | 北京沃东天骏信息技术有限公司 | Method and device for analyzing market hotspot trend |
CN115708085A (en) * | 2021-08-09 | 2023-02-21 | 腾讯科技(深圳)有限公司 | Business processing method, neural network model training method, device, equipment and medium |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104424342A (en) * | 2013-09-11 | 2015-03-18 | 携程计算机技术(上海)有限公司 | Method for keyword matching, and device, server and system of method |
CN106484698A (en) * | 2015-08-25 | 2017-03-08 | 北京奇虎科技有限公司 | A kind of method for pushing of search keyword and device |
CN108304533A (en) * | 2018-01-29 | 2018-07-20 | 上海名轩软件科技有限公司 | Keyword recommendation method and equipment |
US20180276728A1 (en) * | 2007-11-14 | 2018-09-27 | Panjiva, Inc. | Transaction facilitating marketplace platform |
CN109635198A (en) * | 2018-12-17 | 2019-04-16 | 杭州柚子街信息科技有限公司 | The method, apparatus of presentation user's search result, medium and electronic equipment on merchandise display platform |
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20180276728A1 (en) * | 2007-11-14 | 2018-09-27 | Panjiva, Inc. | Transaction facilitating marketplace platform |
CN104424342A (en) * | 2013-09-11 | 2015-03-18 | 携程计算机技术(上海)有限公司 | Method for keyword matching, and device, server and system of method |
CN106484698A (en) * | 2015-08-25 | 2017-03-08 | 北京奇虎科技有限公司 | A kind of method for pushing of search keyword and device |
CN108304533A (en) * | 2018-01-29 | 2018-07-20 | 上海名轩软件科技有限公司 | Keyword recommendation method and equipment |
CN109635198A (en) * | 2018-12-17 | 2019-04-16 | 杭州柚子街信息科技有限公司 | The method, apparatus of presentation user's search result, medium and electronic equipment on merchandise display platform |
Non-Patent Citations (1)
Title |
---|
钟文波: "搜索引擎中关键词分类方法评估及推荐应用", 《中国优秀硕士学位论文全文数据库》 * |
Cited By (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111159552A (en) * | 2019-12-30 | 2020-05-15 | 北京每日优鲜电子商务有限公司 | Commodity searching method, commodity searching device, server and storage medium |
CN111638834A (en) * | 2020-04-27 | 2020-09-08 | 维沃移动通信有限公司 | Content searching method and electronic equipment |
CN111931040A (en) * | 2020-06-30 | 2020-11-13 | 深圳市世强元件网络有限公司 | Recommendation method for service entry of service entity in network platform |
CN111931040B (en) * | 2020-06-30 | 2024-01-12 | 深圳市世强元件网络有限公司 | Recommendation method for service entry of service entity in network platform |
CN112328872A (en) * | 2020-10-27 | 2021-02-05 | 北京字节跳动网络技术有限公司 | Information display method, information search method and device |
CN112445895A (en) * | 2020-11-16 | 2021-03-05 | 深圳市世强元件网络有限公司 | Method and system for identifying user search scene |
CN112445895B (en) * | 2020-11-16 | 2024-04-19 | 深圳市世强元件网络有限公司 | Method and system for identifying user search scene |
CN112287042A (en) * | 2020-11-22 | 2021-01-29 | 长沙修恒信息科技有限公司 | Material name processing system in ERP system |
CN113743973A (en) * | 2020-11-30 | 2021-12-03 | 北京沃东天骏信息技术有限公司 | Method and device for analyzing market hotspot trend |
CN112687403A (en) * | 2021-01-08 | 2021-04-20 | 拉扎斯网络科技(上海)有限公司 | Medicine dictionary generation and medicine search method and device |
CN112687403B (en) * | 2021-01-08 | 2022-12-02 | 拉扎斯网络科技(上海)有限公司 | Medicine dictionary generation and medicine search method and device |
CN112767081A (en) * | 2021-01-19 | 2021-05-07 | 广州新丝路信息科技有限公司 | Cross-border bonded bin commodity classification method and device |
CN113483518A (en) * | 2021-05-19 | 2021-10-08 | 海信视像科技股份有限公司 | Refrigerator and interface display method |
CN113222455A (en) * | 2021-05-28 | 2021-08-06 | 西安热工研究院有限公司 | Generator set parameter name matching method based on modular decomposition and matching |
CN115708085A (en) * | 2021-08-09 | 2023-02-21 | 腾讯科技(深圳)有限公司 | Business processing method, neural network model training method, device, equipment and medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110490712A (en) | A kind of commodity class heading search method, system and storage medium | |
US8566177B2 (en) | User supplied and refined tags | |
US9690846B2 (en) | Intelligent navigation of a category system | |
KR101375940B1 (en) | Systems and methods for providing advanced search result page content | |
US9305100B2 (en) | Object oriented data and metadata based search | |
US8010523B2 (en) | Dynamic search box for web browser | |
CA2897886C (en) | Methods and apparatus for identifying concepts corresponding to input information | |
JP6022056B2 (en) | Generate search results | |
US8700621B1 (en) | Generating query suggestions from user generated content | |
US10585927B1 (en) | Determining a set of steps responsive to a how-to query | |
US20070174270A1 (en) | Knowledge management system, program product and method | |
US8239399B2 (en) | Providing tools for navigational search query results | |
US7702609B2 (en) | Adapting to inexact user input | |
US10984056B2 (en) | Systems and methods for evaluating search query terms for improving search results | |
CN102375885A (en) | Method and device for providing search suggestions corresponding to query sequence | |
JP2003518664A (en) | Method and system for constructing a personalized result set | |
KR20120089859A (en) | Systems and methods for providing advanced search result page content | |
US8156073B1 (en) | Item attribute generation using query and item data | |
US20230153366A1 (en) | System and method for improved searching across multiple databases | |
US20150154294A1 (en) | Suggested domain names positioning based on term frequency or term co-occurrence | |
US20150347423A1 (en) | Methods for completing a user search | |
Yamamoto et al. | Rerank-by-example: Efficient browsing of web search results | |
WO2023034802A1 (en) | Data management suggestions from knowledge graph actions | |
WO2019056727A1 (en) | Display method and apparatus for organization name search formula, device and storage medium | |
US8195458B2 (en) | Open class noun classification |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20191122 |