CN112183069B - Keyword construction method and system based on historical keyword delivery data - Google Patents

Keyword construction method and system based on historical keyword delivery data

Info

Publication number
CN112183069B
Authority
CN
China
Prior art keywords
keywords
keyword
candidate
construction method
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011079017.2A
Other languages
Chinese (zh)
Other versions
CN112183069A (en)
Inventor
陈嘉真
徐凯波
张琛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Minglue Artificial Intelligence Group Co Ltd
Original Assignee
Shanghai Minglue Artificial Intelligence Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Minglue Artificial Intelligence Group Co Ltd
Priority to CN202011079017.2A
Publication of CN112183069A
Application granted
Publication of CN112183069B
Legal status: Active (Current)
Anticipated expiration

Classifications

    • G (PHYSICS) > G06 (COMPUTING; CALCULATING OR COUNTING) > G06F (ELECTRIC DIGITAL DATA PROCESSING)
    • G06F40/216: Parsing using statistical methods (under G06F40/00 Handling natural language data > G06F40/20 Natural language analysis > G06F40/205 Parsing)
    • G06F16/3344: Query execution using natural language analysis (under G06F16/00 Information retrieval > G06F16/30 Unstructured textual data > G06F16/33 Querying > G06F16/3331 Query processing > G06F16/334 Query execution)
    • G06F16/35: Clustering; Classification (under G06F16/30 Unstructured textual data)
    • G06F40/289: Phrasal analysis, e.g. finite state techniques or chunking (under G06F40/20 Natural language analysis > G06F40/279 Recognition of textual entities)

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a keyword construction method and system based on historical keyword delivery data. The keyword construction method comprises the steps of: acquiring given data; constructing a plurality of candidate keywords from the given data according to a preset rule; performing feature processing on the candidate keywords through a word2vec model to obtain the feature of each candidate keyword; evaluating and ranking the candidate keywords according to the feature of each candidate keyword and the given data; and outputting recommended keywords from the ranked candidate keywords. The method learns the expected display amount and click volume of a keyword's structure from the keywords' historical delivery data and uses the model to guide how keywords are composed, ensuring that newly constructed keywords are reasonable and deliver well.

Description

Keyword construction method and system based on historical keyword delivery data
Technical Field
The invention relates to the technical field of data processing, in particular to a keyword construction method and system based on historical keyword delivery data.
Background
With the rapid development of network technology and information products, various network platforms attract more and more users, and the Internet has become one of the most important information media. In shopping-platform campaigns, e-commerce merchants can attract large numbers of customers by purchasing keywords. Keyword construction extracts words or characters from a known corpus and combines them into new vocabulary, thereby forming concise and accurate summaries of text information.
The current common keyword construction method mainly constructs keywords at random according to manually designed patterns: first, high-popularity vocabulary is divided into long-tail words and core words (at present this is mostly done by manual screening, or by labeling a small sample and then classifying with a classification model); then keywords are constructed according to patterns such as "brand word + category word + core word", "brand word + category word + long-tail word + core word", and so on.
The above method can barely guarantee rationality, but because the patterns are designed manually, only a small number of reasonable keywords can be covered; moreover, the delivery performance of the constructed keywords cannot be judged.
Disclosure of Invention
To address the technical problem that constructed keywords lack good delivery performance, the invention provides a keyword construction method and system based on historical keyword delivery data.
In a first aspect, an embodiment of the present application provides a keyword construction method based on historical keyword delivery data, including:
S1, acquiring given data;
S2, constructing a plurality of candidate keywords from the given data according to a preset rule;
S3, performing feature processing on the plurality of candidate keywords through a word2vec model to obtain the feature of each candidate keyword;
S4, evaluating and ranking the plurality of candidate keywords according to the feature of each candidate keyword and the given data;
S5, outputting recommended keywords from the ranked candidate keywords.
In the above keyword construction method based on historical keyword delivery data, the given data comprise scene environment variables, candidate hot roots, evaluation indexes and the number of recommended keywords.
In the above keyword construction method based on historical keyword delivery data, the step S2 includes:
Step S21: randomly constructing a plurality of keywords from the candidate hot roots;
Step S22: screening a plurality of candidate keywords from the plurality of keywords according to a preset rule.
In the above keyword construction method based on historical keyword delivery data, the step S3 further includes: performing word segmentation on the candidate keywords in advance using jieba.
In the above keyword construction method based on historical keyword delivery data, the step S3 further includes: pre-training the segmented candidate keywords with the word2vec model to obtain word vectors, and taking the average of a candidate keyword's word vectors as its feature.
In the above keyword construction method based on historical keyword delivery data, the step S4 includes:
Step S41: obtaining predicted index performance through a prediction model according to the features of the candidate keywords, the scene environment variables and the evaluation indexes;
Step S42: ranking the plurality of candidate keywords through the evaluation model according to the index performance.
In the above keyword construction method based on historical keyword delivery data, the step S5 further includes outputting recommended keywords according to the number of recommended keywords.
In a second aspect, an embodiment of the present application provides a keyword construction system based on historical keyword delivery data, including:
an input unit for inputting given data;
a keyword construction unit for randomly constructing a plurality of keywords according to the given data;
a primary screening unit for screening a plurality of candidate keywords from the plurality of keywords according to a preset rule;
a keyword feature acquisition unit for performing feature processing on the plurality of candidate keywords through a word2vec model to obtain the feature of each candidate keyword;
a secondary screening unit for evaluating and ranking the plurality of candidate keywords through a prediction model and an evaluation model according to the feature of each candidate keyword, the scene environment variables and the evaluation indexes;
and an output unit for outputting recommended keywords from the ranked candidate keywords.
In a third aspect, an embodiment of the present application provides an electronic device, including a memory, a processor, and a computer program stored in the memory and capable of running on the processor, where the processor implements the keyword construction method according to the first aspect when executing the computer program.
In a fourth aspect, an embodiment of the present application provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the keyword construction method as described in the first aspect above.
Compared with the prior art, the invention has the advantages and positive effects that:
1. A large number of keywords are first constructed at random, and a reasonable subset is then screened out by manual rules, which avoids the data-volume limitation that would arise in modeling if historically delivered keywords made up only a small proportion of the randomly constructed keywords.
2. The expected display amount or click volume of a keyword's structure is learned from the keywords' historical data, so the prediction model can evaluate and rank the large number of relatively reasonable constructed keywords and screen the best-performing ones as the final result, ensuring that newly constructed keywords have good delivery performance.
3. Each recommended keyword comes with a corresponding prediction of its display amount or return on investment (Return On Investment, abbreviated ROI), which improves the interpretability of the model and, in turn, the user experience.
4. Both the rationality-based keyword construction and the model-based keyword evaluation can be modularized, and each part can use different models. For example, the models that evaluate keywords can be traditional statistical methods or other commonly used machine-learning models such as neural networks and tree models; rationality can be evaluated by training a model that takes historically delivered keywords as references, scoring the constructed keywords, performing a primary screening, and then feeding the surviving keywords into the prediction model and the evaluation model for secondary screening.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description serve to explain the application and do not constitute a limitation on the application. In the drawings:
FIG. 1 is a flow chart of a keyword construction method based on historical keyword delivery data according to an embodiment of the present application;
FIG. 2 is a block diagram of a keyword construction system based on historical keyword delivery data according to an embodiment of the present application;
FIG. 3 is a block diagram of a computer device according to an embodiment of the present application.
Wherein, the reference numerals are as follows:
81. A processor; 82. a memory; 83. a communication interface; 80. a bus.
Detailed Description
The present application will be described and illustrated with reference to the accompanying drawings and examples in order to make the objects, technical solutions and advantages of the present application more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application. All other embodiments, which can be made by a person of ordinary skill in the art based on the embodiments provided by the present application without making any inventive effort, are intended to fall within the scope of the present application.
It is apparent that the drawings in the following description are only some examples or embodiments of the present application, and those of ordinary skill in the art can apply the present application to other similar situations according to these drawings without inventive effort. Moreover, while such a development effort might be complex and lengthy, it would nevertheless be a routine undertaking of design, fabrication, or manufacture for those of ordinary skill in the art having the benefit of this disclosure.
Reference in the specification to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the application. The appearances of such phrases in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is to be expressly and implicitly understood by those of ordinary skill in the art that the described embodiments of the application can be combined with other embodiments without conflict.
Unless defined otherwise, technical or scientific terms used herein should be given the ordinary meaning as understood by one of ordinary skill in the art to which this application belongs. The terms "a," "an," "the," and similar referents in the context of the application are not to be construed as limiting the quantity, but rather as singular or plural. The terms "comprising," "including," "having," and any variations thereof, are intended to cover a non-exclusive inclusion; for example, a process, method, system, article, or apparatus that comprises a list of steps or modules (elements) is not limited to only those steps or elements but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus. The terms "connected," "coupled," and the like in connection with the present application are not limited to physical or mechanical connections, but may include electrical connections, whether direct or indirect. The term "plurality" as used herein means two or more. "and/or" describes an association relationship of an association object, meaning that there may be three relationships, e.g., "a and/or B" may mean: a exists alone, A and B exist together, and B exists alone. The character "/" generally indicates that the context-dependent object is an "or" relationship. The terms "first," "second," "third," and the like, as used herein, are merely distinguishing between similar objects and not representing a particular ordering of objects.
The present invention will be described in detail below with reference to the embodiments shown in the drawings, but it should be understood that the present invention is not limited to these embodiments, and functional, methodological, or structural equivalents and substitutions made by those skilled in the art according to these embodiments fall within the scope of protection of the present invention.
Before the various embodiments of the invention are explained in detail, the core inventive concept of the invention is summarized and then described in detail through the following examples.
The technical scheme performs keyword construction based on historical keyword delivery data: a large number of reasonable keywords are constructed from candidate roots according to a preset rule, and the keywords to be retained are then screened through an index prediction model. This solves the problem that building new words from patterns alone cannot fully capture the rationality between the roots inside a keyword, and therefore cannot guarantee the keyword's expected performance.
Embodiment one:
Referring to FIG. 1, this example discloses a specific embodiment of a keyword construction method (hereinafter referred to as the "method") based on historical keyword delivery data.
Specifically, the method disclosed in this embodiment mainly includes the following steps:
S1, acquiring given data.
Specifically, the given data include scene environment variables, candidate hot roots, evaluation indexes and the number of recommended keywords.
The scene environment variables include brand words, category words, time points, campaign type names and the like; the evaluation indexes include ROI, display amount, click-through rate and the like.
ROI stands for return on investment (Return On Investment), the value that an investment should return, i.e., the economic return an enterprise receives from an investment activity. Return on investment reflects the overall profitability of investment centers: it removes the non-comparable differences in profit caused by different investment amounts, makes investment centers comparable with one another, and helps judge the operating performance of each investment center; in addition, it can serve as a basis for choosing investment opportunities, which helps optimize resource allocation.
When a user searches, if a keyword in the account that matches the user's search requirement is triggered, the creative corresponding to that keyword appears on the search-result page; this is called one associated display of the keyword and the creative. The number of displays (impressions) obtained over a period of time is referred to as the "display amount". The display amount reflects the quality of the keywords and of the creative. For a website, the display amount is the number of times the website is shown when users search for related keywords; the total number of displays over a period of time is collectively called the display amount, and a website's display amount reflects the quality of its keywords and of its optimization.
The display amount helps gauge how many users are reached by the promotion and is a measure of quantity. The display-amount data provided in statistical reports show which keywords and creatives obtain more display opportunities and bring repeated exposure every day, from which the number of potential customers covered by the promotion can be estimated.
If users are interested in the promotion and want to learn more about the product or service, they may click the promotion when it is displayed and visit the website. The number of clicks obtained over a period of time is referred to as the "click volume"; in short, the click volume is the number of times the promotion is clicked. The click-through rate is the percentage of displays that result in a click, computed as click volume / display amount = click-through rate; a website's click-through rate reflects its title and description and whether its creative is attractive to customers.
Data such as spend, average price, clicks, displays, click-through rate and cost per thousand impressions can be seen in the background of every advertising platform and are the basis for comprehensively evaluating promotion effect and carrying out in-depth promotion optimization.
However, a wide promotion reach does not mean the enterprise is reaching its clear target customers; only large-scale promotion aimed at clear target customers yields reasonable displays, and blindly increasing the display amount only raises promotion costs. Promoting to the target population means choosing reasonable regions for delivery, selecting keywords precisely, and setting keyword matching modes reasonably. Through careful operation, potential target customers can more accurately find the enterprise's website through various related search terms, and a final transaction can be reached.
S2, constructing a plurality of candidate keywords from the given data according to a preset rule.
The step S2 specifically includes the following:
Step S21: randomly constructing a plurality of keywords from the candidate hot roots;
Step S22: screening a plurality of candidate keywords from the plurality of keywords according to a preset rule.
Specifically, a plurality of candidate keywords each containing at most 5 roots are constructed according to a preset rule. The preset rule may be, for example, "brand word/category word + 1 to 4 candidate hot roots" or "brand word + category word + 1 to 3 candidate hot roots", and the candidate hot roots within a keyword must not repeat.
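A minimal Python sketch of this random construction step, assuming hypothetical brand words, category words and candidate hot roots, might look as follows:

```python
import random

# Hypothetical inputs for illustration; in practice these come from the given data.
brand_words = ["BrandA"]
category_words = ["running shoes"]
hot_roots = ["lightweight", "breathable", "cushioned", "discount", "new-season"]

def build_candidates(n_keywords, max_roots=4):
    """Randomly build keywords of the form
    'brand word/category word + 1 to max_roots non-repeating hot roots'."""
    candidates = set()
    while len(candidates) < n_keywords:
        head = random.choice(brand_words + category_words)
        # random.sample never repeats a root within one keyword
        roots = random.sample(hot_roots, random.randint(1, max_roots))
        if 1 + len(roots) <= 5:  # guard: total length stays within 5 components
            candidates.add(" ".join([head] + roots))
    return sorted(candidates)

print(build_candidates(10))
```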
The method for generating candidate keywords in the invention is not limited to pattern-based construction. The purpose of pattern-based construction is to ensure the rationality of the keywords, and rationality can also be modeled separately: for example, words that appear in the historical keyword library can be taken as positive samples and randomly generated words as negative samples for classification modeling, using a common machine-learning classification model. The model is then used to screen the randomly constructed keywords, and the higher-ranked keywords are kept and fed into the subsequent index prediction model.
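A minimal sketch of such a rationality classifier, assuming a hypothetical historical keyword library and hypothetical randomly constructed keywords, and using character n-grams with logistic regression as one possible common machine-learning classification model:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Hypothetical samples: positive = keywords from the historical library,
# negative = randomly constructed keywords.
positive = ["BrandA lightweight running shoes", "BrandA running shoes discount"]
negative = ["BrandA new-season new-season", "running shoes discount breathable cushioned new-season"]

X_text = positive + negative
y = [1] * len(positive) + [0] * len(negative)

# Character n-grams as a simple stand-in for keyword features.
vec = CountVectorizer(analyzer="char_wb", ngram_range=(2, 3))
X = vec.fit_transform(X_text)

clf = LogisticRegression().fit(X, y)

# Screen randomly constructed keywords: keep the highest-scoring ones.
new_keywords = ["BrandA breathable running shoes", "discount discount BrandA"]
scores = clf.predict_proba(vec.transform(new_keywords))[:, 1]
ranked = sorted(zip(new_keywords, scores), key=lambda t: -t[1])
print(ranked)
```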
S3, performing feature processing on the plurality of candidate keywords through a word2vec model to obtain the feature of each candidate keyword.
The step S3 further includes: performing word segmentation on the candidate keywords in advance using jieba.
Specifically, "jieba" is a Python Chinese word-segmentation component that supports word segmentation, part-of-speech tagging, keyword extraction and other functions, as well as custom dictionaries. It supports three segmentation modes: precise mode, which tries to cut sentences as accurately as possible and is suitable for text analysis; full mode, which scans all the words in a sentence that can form words and is very fast but cannot resolve ambiguity; and search-engine mode, which further splits long words on top of precise mode to improve recall and is suitable for search engines.
The step S3 further includes: pre-training the segmented candidate keywords with the word2vec model to obtain word vectors, and taking the average of a candidate keyword's word vectors as its feature.
Specifically, for a computer to process natural language, the natural language must first be modeled. Natural-language modeling has moved from rule-based methods to statistics-based methods, and the resulting models are called statistical language models. Modeling natural language raises problems such as the curse of dimensionality, word similarity, model generalization and model performance, and solving these problems has been the driving force behind the development of statistical language models. Against this background, Google open-sourced the Word2vec training tool in 2013. The Word2Vec model converts words into vector representations; it evolved from neural probabilistic language models and is a typical distributed encoding scheme that improves on those models and raises computational efficiency. There are two main implementations: the continuous bag-of-words model (CBOW) and the skip-gram model, which Word2vec relies on to build neural word embeddings. Given a corpus, word2vec can quickly and effectively express a word in vector form through an optimized training procedure, providing a new tool for applied research in natural language processing.
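A minimal sketch of this step using the gensim implementation of word2vec, assuming a hypothetical corpus of segmented candidate keywords, with the keyword feature taken as the average of its word vectors:

```python
import numpy as np
from gensim.models import Word2Vec

# Hypothetical corpus of segmented candidate keywords (lists of roots/words).
segmented_keywords = [
    ["品牌A", "轻便", "跑鞋"],
    ["品牌A", "透气", "跑鞋", "折扣"],
    ["品牌A", "新款", "跑鞋"],
]

# Pre-train word vectors on the segmented keywords (sg=0 selects CBOW, sg=1 skip-gram).
w2v = Word2Vec(sentences=segmented_keywords, vector_size=50, window=2, min_count=1, sg=0)

def keyword_feature(tokens, model):
    """Average of the word vectors of a keyword's tokens, used as its feature."""
    vecs = [model.wv[t] for t in tokens if t in model.wv]
    return np.mean(vecs, axis=0) if vecs else np.zeros(model.wv.vector_size)

print(keyword_feature(["品牌A", "轻便", "跑鞋"], w2v))
```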
S4, evaluating and ranking the plurality of candidate keywords according to the feature of each candidate keyword and the given data.
The step S4 specifically includes the following:
Step S41: obtaining predicted index performance through a prediction model according to the features of the candidate keywords, the scene environment variables and the evaluation indexes;
Step S42: ranking the plurality of candidate keywords through the evaluation model according to the index performance.
Specifically, the index performance is mainly ROI, display amount, click volume, or a combination thereof. The prediction model that evaluates keyword performance may be a tree model, which is common in machine learning, but other models can also be used, such as a neural network model (the keyword feature may still be the average of the root-word vectors, or the keyword's feature vector may be generated by weighting the roots against the environment variables with an Attention mechanism), other statistical models, and so on.
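A minimal sketch of such an index prediction model, here a gradient-boosted tree regressor trained on synthetic placeholder data standing in for historically delivered keywords (averaged keyword vectors concatenated with encoded scene environment variables, with the display amount as the target index):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)

# Synthetic placeholder training data: each row stands for a historically
# delivered keyword, i.e. its averaged word vector (50 dims) concatenated
# with encoded scene environment variables (4 dims); the target stands for
# the observed display amount of that keyword.
X_train = rng.normal(size=(200, 54))
y_train = rng.poisson(lam=100.0, size=200)

# A tree model, as suggested above, predicting index performance.
model = GradientBoostingRegressor(n_estimators=100, max_depth=3)
model.fit(X_train, y_train)

# Predicted index performance for new candidate keywords (also placeholders).
X_candidates = rng.normal(size=(10, 54))
predicted_index = model.predict(X_candidates)
print(predicted_index)
```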
S5, outputting recommended keywords from the ranked candidate keywords.
The step S5 further includes outputting recommended keywords according to the number of recommended keywords.
Specifically, after the keywords are ranked by their predicted index performance, the top "number of recommended keywords" best-performing keywords are output.
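A minimal sketch of this ranking and top-N output step, with hypothetical candidate names and predicted index values:

```python
# Hypothetical predicted index performance for each candidate keyword.
candidates = ["keyword_A", "keyword_B", "keyword_C", "keyword_D"]
predicted_index = [120.5, 340.2, 87.1, 210.9]

n_recommended = 2  # the "number of recommended keywords" from the given data

# Sort by predicted performance (descending) and take the top-N.
ranked = sorted(zip(candidates, predicted_index), key=lambda t: -t[1])
recommended = [kw for kw, _ in ranked[:n_recommended]]
print(recommended)  # ['keyword_B', 'keyword_D']
```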
Embodiment two:
In combination with the method for constructing keywords based on the historical keyword delivery data disclosed in the first embodiment, the present embodiment discloses a specific implementation example of a keyword construction system (hereinafter referred to as "system") based on the historical keyword delivery data.
Referring to fig. 2, the system includes:
an input unit for inputting given data;
a keyword construction unit for randomly constructing a plurality of keywords according to the given data;
a primary screening unit for screening a plurality of candidate keywords from the plurality of keywords according to a preset rule;
a keyword feature acquisition unit for performing feature processing on the plurality of candidate keywords through a word2vec model to obtain the feature of each candidate keyword;
a secondary screening unit for evaluating and ranking the plurality of candidate keywords through a prediction model and an evaluation model according to the feature of each candidate keyword, the scene environment variables and the evaluation indexes;
and an output unit for outputting recommended keywords from the ranked candidate keywords.
Specifically, the input unit receives given data including the scene environment variables, candidate hot roots, evaluation indexes and the number of recommended keywords.
Specifically, the keyword construction unit randomly constructs a large number of keywords from the candidate hot roots.
Specifically, the primary screening unit screens a plurality of candidate keywords from the constructed keywords according to a preset rule; the preset rule may be, for example, "brand word/category word + 1 to 4 candidate hot roots" or "brand word + category word + 1 to 3 candidate hot roots".
Specifically, the keyword feature acquisition unit first segments the candidate keywords with jieba, then pre-trains the segmented candidate keywords with the word2vec model to obtain word vectors, and takes the average of a candidate keyword's word vectors as its feature.
Specifically, the secondary screening unit first obtains predicted index performance through the prediction model according to the features of the candidate keywords, the scene environment variables and the evaluation indexes, and then ranks the candidate keywords through the evaluation model according to the index performance.
Specifically, the output unit outputs recommended keywords according to the number of recommended keywords, that is, after the keywords are ranked by their predicted index performance, the top "number of recommended keywords" best-performing keywords are output.
For the parts of the keyword construction system based on historical keyword delivery data disclosed in this embodiment that are the same as the keyword construction method based on historical keyword delivery data disclosed in the first embodiment, refer to the description in the first embodiment; they are not repeated here.
Embodiment III:
In connection with FIG. 3, this embodiment discloses a specific implementation of a computer device. The computer device may include a processor 81 and a memory 82 storing computer program instructions.
In particular, the processor 81 may include a Central Processing Unit (CPU), or an Application Specific Integrated Circuit (ASIC), or may be configured as one or more integrated circuits that implement embodiments of the present application.
Memory 82 may include, among other things, mass storage for data or instructions. By way of example, and not limitation, memory 82 may comprise a Hard Disk Drive (HDD), a floppy disk drive, a Solid State Drive (SSD), flash memory, an optical disk, a magneto-optical disk, magnetic tape, or a Universal Serial Bus (USB) drive, or a combination of two or more of these. The memory 82 may include removable or non-removable (or fixed) media, where appropriate. The memory 82 may be internal or external to the data processing apparatus, where appropriate. In a particular embodiment, the memory 82 is a Non-Volatile memory. In particular embodiments, memory 82 includes Read-Only Memory (ROM) and Random Access Memory (RAM). Where appropriate, the ROM may be a mask-programmed ROM, a Programmable ROM (PROM), an Erasable Programmable ROM (EPROM), an Electrically Erasable Programmable ROM (EEPROM), an Electrically Alterable ROM (EAROM), or FLASH memory, or a combination of two or more of these. The RAM may be a Static Random-Access Memory (SRAM) or a Dynamic Random-Access Memory (DRAM), where the DRAM may be a Fast Page Mode DRAM (FPMDRAM), an Extended Data Out DRAM (EDODRAM), a Synchronous DRAM (SDRAM), or the like, where appropriate.
Memory 82 may be used to store or cache various data files that need to be processed and/or communicated, as well as possible computer program instructions for execution by processor 81.
The processor 81 implements any of the keyword construction methods of the above embodiments by reading and executing the computer program instructions stored in the memory 82.
In some of these embodiments, the computer device may also include a communication interface 83 and a bus 80. As shown in fig. 3, the processor 81, the memory 82, and the communication interface 83 are connected to each other through the bus 80 and perform communication with each other.
The communication interface 83 is used to enable communication between modules, apparatuses, units and/or devices in embodiments of the application. The communication interface 83 may also enable data communication with other components such as external devices, image/data acquisition devices, databases, external storage, and image/data processing workstations.
Bus 80 includes hardware, software, or both, coupling components of the computer device to each other. Bus 80 includes, but is not limited to, at least one of: a data bus, an address bus, a control bus, an expansion bus, a local bus. By way of example, and not limitation, bus 80 may include an Accelerated Graphics Port (AGP) or other graphics bus, an Extended Industry Standard Architecture (EISA) bus, a Front Side Bus (FSB), a HyperTransport (HT) interconnect, an Industry Standard Architecture (ISA) bus, an InfiniBand interconnect, a Low Pin Count (LPC) bus, a memory bus, a Micro Channel Architecture (MCA) bus, a Peripheral Component Interconnect (PCI) bus, a PCI-Express (PCI-X) bus, a Serial Advanced Technology Attachment (SATA) bus, a Video Electronics Standards Association Local Bus (VLB), or another suitable bus, or a combination of two or more of these. Bus 80 may include one or more buses, where appropriate. Although embodiments of the application have been described and illustrated with respect to a particular bus, the application contemplates any suitable bus or interconnect.
In addition, in combination with the keyword construction method in the above embodiment, the embodiment of the present application may be implemented by providing a computer readable storage medium. The computer readable storage medium has stored thereon computer program instructions; the computer program instructions, when executed by a processor, implement any of the keyword construction methods of the above embodiments.
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as a combination of technical features contains no contradiction, it should be considered within the scope of this description.
In summary, the method learns the expected display amount or click volume of a keyword's structure from the keywords' historical data, so that the prediction model can evaluate and rank a large number of relatively reasonable constructed keywords and screen the best-performing ones as the final result, ensuring that newly constructed keywords have good delivery performance. Both the rationality-based keyword construction and the model-based keyword evaluation can be modularized, and each part can use different models: for example, the models that evaluate keywords can be traditional statistical methods or other commonly used machine-learning models such as neural networks and tree models, while rationality can be evaluated by training a model on historically delivered keywords as references, scoring and primarily screening the constructed keywords, and then feeding them into the prediction model and the evaluation model for secondary screening.
The above examples illustrate only a few embodiments of the application, which are described in detail and are not to be construed as limiting the scope of the application. It should be noted that it will be apparent to those skilled in the art that several variations and modifications can be made without departing from the spirit of the application, which are all within the scope of the application. Accordingly, the scope of protection of the present application is to be determined by the appended claims.

Claims (9)

1. A keyword construction method based on historical keyword delivery data, characterized by comprising the following steps:
S1, acquiring given data, the given data comprising scene environment variables, candidate hot roots, evaluation indexes and a number of recommended keywords;
S2, constructing a plurality of candidate keywords from the given data according to a preset rule, wherein the construction of the candidate keywords takes words appearing in a historical keyword library as positive samples and randomly generated words as negative samples for classification modeling, then uses a machine-learning classification model to screen the randomly constructed keywords, and keeps the higher-ranked keywords to be input into an index prediction model in a subsequent step for prediction;
S3, performing feature processing on the plurality of candidate keywords through a word2vec model to obtain the feature of each candidate keyword;
S4, evaluating and ranking the plurality of candidate keywords according to the feature of each candidate keyword and the given data;
Step S41: obtaining predicted index performance through a prediction model according to the features of the candidate keywords, the scene environment variables and the evaluation indexes;
S5, outputting recommended keywords from the ranked candidate keywords.
2. The keyword construction method according to claim 1, wherein the step S2 includes:
Step S21: randomly constructing a plurality of keywords from the candidate hot roots;
Step S22: screening a plurality of candidate keywords from the plurality of keywords according to a preset rule.
3. The keyword construction method according to claim 1, wherein the step S3 further comprises: performing word segmentation on the candidate keywords in advance using jieba.
4. The keyword construction method according to claim 3, wherein the step S3 further comprises: pre-training the segmented candidate keywords with the word2vec model to obtain word vectors, and taking the average of a candidate keyword's word vectors as its feature.
5. The keyword construction method according to claim 1, wherein the step S4 further comprises:
Step S42: ranking the plurality of candidate keywords through an evaluation model according to the index performance.
6. The keyword construction method according to claim 1, wherein the step S5 further comprises outputting recommended keywords according to the number of recommended keywords.
7. A keyword construction system based on historical keyword delivery data, characterized by comprising:
an input unit for inputting given data, the given data comprising scene environment variables, candidate hot roots, evaluation indexes and a number of recommended keywords;
a keyword construction unit for randomly constructing a plurality of keywords according to the given data, wherein the construction of the candidate keywords takes words appearing in a historical keyword library as positive samples and randomly generated words as negative samples for classification modeling, then uses a machine-learning classification model to screen the randomly constructed keywords, and keeps the higher-ranked keywords to be input into an index prediction model in a subsequent step for prediction;
a primary screening unit for screening a plurality of candidate keywords from the plurality of keywords according to a preset rule;
a keyword feature acquisition unit for performing feature processing on the plurality of candidate keywords through a word2vec model to obtain the feature of each candidate keyword;
a secondary screening unit for evaluating and ranking the plurality of candidate keywords through a prediction model and an evaluation model according to the feature of each candidate keyword, the scene environment variables and the evaluation indexes, wherein predicted index performance is obtained through the prediction model according to the features of the candidate keywords, the scene environment variables and the evaluation indexes;
and an output unit for outputting recommended keywords from the ranked candidate keywords.
8. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the keyword construction method of any one of claims 1 to 6 when executing the computer program.
9. A computer-readable storage medium having stored thereon a computer program, which when executed by a processor implements the keyword construction method as claimed in any one of claims 1 to 6.
CN202011079017.2A 2020-10-10 2020-10-10 Keyword construction method and system based on historical keyword put-in data Active CN112183069B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011079017.2A CN112183069B (en) 2020-10-10 2020-10-10 Keyword construction method and system based on historical keyword put-in data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011079017.2A CN112183069B (en) 2020-10-10 2020-10-10 Keyword construction method and system based on historical keyword put-in data

Publications (2)

Publication Number Publication Date
CN112183069A CN112183069A (en) 2021-01-05
CN112183069B true CN112183069B (en) 2024-06-28

Family

ID=73947586

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011079017.2A Active CN112183069B (en) 2020-10-10 2020-10-10 Keyword construction method and system based on historical keyword put-in data

Country Status (1)

Country Link
CN (1) CN112183069B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113158136A (en) * 2021-04-23 2021-07-23 北京明略软件***有限公司 Keyword recommendation effect evaluation method and system, electronic device and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104778161A (en) * 2015-04-30 2015-07-15 车智互联(北京)科技有限公司 Keyword extracting method based on Word2Vec and Query log
CN108255881A (en) * 2016-12-29 2018-07-06 北京国双科技有限公司 It is a kind of to generate the method and device for launching keyword
CN111368171A (en) * 2020-02-27 2020-07-03 腾讯科技(深圳)有限公司 Keyword recommendation method, related device and storage medium
CN111581495A (en) * 2020-04-08 2020-08-25 西窗科技(苏州)有限公司 Keyword generation recommendation method and device based on search engine advertisement data

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2013185300A1 (en) * 2012-06-12 2013-12-19 Google Inc. Obtaining alternative keywords
CN108229991B (en) * 2016-12-15 2022-04-29 北京奇虎科技有限公司 Method and device for displaying aggregation promotion information, browser and terminal equipment
CN110991180A (en) * 2019-11-28 2020-04-10 同济人工智能研究院(苏州)有限公司 Command identification method based on keywords and Word2Vec

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104778161A (en) * 2015-04-30 2015-07-15 车智互联(北京)科技有限公司 Keyword extracting method based on Word2Vec and Query log
CN108255881A (en) * 2016-12-29 2018-07-06 北京国双科技有限公司 It is a kind of to generate the method and device for launching keyword
CN111368171A (en) * 2020-02-27 2020-07-03 腾讯科技(深圳)有限公司 Keyword recommendation method, related device and storage medium
CN111581495A (en) * 2020-04-08 2020-08-25 西窗科技(苏州)有限公司 Keyword generation recommendation method and device based on search engine advertisement data

Also Published As

Publication number Publication date
CN112183069A (en) 2021-01-05

Similar Documents

Publication Publication Date Title
CN106649818B (en) Application search intention identification method and device, application search method and server
CN108363790B (en) Method, device, equipment and storage medium for evaluating comments
WO2020108608A1 (en) Search result processing method, device, terminal, electronic device, and storage medium
CN108304439B (en) Semantic model optimization method and device, intelligent device and storage medium
CN111414479B (en) Label extraction method based on short text clustering technology
CN106709040B (en) Application search method and server
CN112163424B (en) Data labeling method, device, equipment and medium
CN112508609B (en) Crowd expansion prediction method, device, equipment and storage medium
CN110674312B (en) Method, device and medium for constructing knowledge graph and electronic equipment
CN112347778A (en) Keyword extraction method and device, terminal equipment and storage medium
CN105095210A (en) Method and apparatus for screening promotional keywords
CN113590796B (en) Training method and device for ranking model and electronic equipment
CN111444304A (en) Search ranking method and device
CN112199602B (en) Post recommendation method, recommendation platform and server
CN110910175B (en) Image generation method for travel ticket product
CN115495555A (en) Document retrieval method and system based on deep learning
CN112926308B (en) Method, device, equipment, storage medium and program product for matching text
CN111680506A (en) External key mapping method and device of database table, electronic equipment and storage medium
CN115048505A (en) Corpus screening method and device, electronic equipment and computer readable medium
CN113806510B (en) Legal provision retrieval method, terminal equipment and computer storage medium
CN115905489A (en) Method for providing bid and bid information search service
CN116737922A (en) Tourist online comment fine granularity emotion analysis method and system
CN112183069B (en) Keyword construction method and system based on historical keyword put-in data
CN112215629A (en) Multi-target advertisement generation system and method based on construction countermeasure sample
CN108428234B (en) Interactive segmentation performance optimization method based on image segmentation result evaluation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant