CN112434527A - Keyword determination method and device, electronic equipment and storage medium - Google Patents

Info

Publication number
CN112434527A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011415378.XA
Other languages
Chinese (zh)
Other versions
CN112434527B (en)
Inventor
陈嘉真
徐凯波
张琛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Minglue Artificial Intelligence Group Co Ltd
Original Assignee
Shanghai Minglue Artificial Intelligence Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Minglue Artificial Intelligence Group Co Ltd
Priority to CN202011415378.XA
Publication of CN112434527A
Application granted
Publication of CN112434527B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Machine Translation (AREA)

Abstract

The application provides a keyword determination method, a keyword determination device, electronic equipment and a storage medium. The keyword determination method comprises the following steps: acquiring a first word vector corresponding to a target word segmentation; inputting the first word vector into a pre-trained encoder, and acquiring a second word vector of the target word segmentation output by the encoder, wherein the second word vector is lower in dimensionality than the first word vector; inputting the second word vector into a pre-trained delivery effect prediction model, and acquiring a first target delivery value corresponding to the target word segmentation output by the delivery effect prediction model; and determining the target word segmentation whose first target delivery value meets a preset condition as a keyword, and sending the keyword to a corresponding user terminal. In this way, the keywords used for marking the products to be sold are determined based on the predicted delivery effect, which improves the flexibility and accuracy of keyword determination.

Description

Keyword determination method and device, electronic equipment and storage medium
Technical Field
The present application relates to the field of computer information technologies, and in particular, to a keyword determination method and apparatus, an electronic device, and a storage medium.
Background
In practice, when carrying out sales activities on the internet, a keyword is usually used to mark the product for sale in order to improve product recognition and the purchase rate; for example, a certain brand of laundry detergent is marked as "a choice of dad".
At this stage, the keywords used to mark the products for sale are generally determined as follows: the delivery effect of each keyword used in online or offline sales activities is recorded; when a new sales activity is launched, historical sales activities similar to the new one are searched for, and the keywords of those historical sales activities are used as the keywords of the new sales activity.
However, with this method, only keywords that have already been delivered in historical sales activities can be selected; the delivery effect of a new keyword that has never been used in a historical sales activity cannot be predicted.
Disclosure of Invention
In view of this, an embodiment of the present application aims to provide a keyword determination method, an apparatus, an electronic device, and a storage medium, where an encoder and a delivery effect prediction model are used to predict a delivery effect of each target participle, and a keyword used for marking a product to be sold is determined based on the predicted delivery effect, so that flexibility and accuracy of determining the keyword are improved.
In a first aspect, an embodiment of the present application provides a method for determining a keyword, where the method includes:
acquiring a first word vector corresponding to a target word segmentation;
inputting the first word vector into a pre-trained encoder, and acquiring a second word vector of the target word segmentation output by the encoder; wherein the second word vector is lower in dimensionality than the first word vector;
inputting the second word vector into a pre-trained delivery effect prediction model, and acquiring a first target delivery value corresponding to the target word segmentation output by the delivery effect prediction model;
and determining the target word segmentation with the first target delivery value meeting the preset condition as a keyword, and sending the keyword to a corresponding user terminal.
In one possible embodiment, the encoder is trained by:
obtaining a plurality of first sample participles and a third word vector corresponding to each first sample participle in the plurality of first sample participles;
for each first sample participle, inputting a third word vector corresponding to the first sample participle into an initial encoder, and acquiring a fourth word vector of the first sample participle output by the initial encoder;
inputting the fourth word vector of the first sample word segmentation into an initial decoder matched with the initial encoder, and acquiring a fifth word vector of the first sample word segmentation output by the initial decoder;
and determining a first loss value corresponding to the first sample participle according to the third word vector and the fifth word vector corresponding to the first sample participle; if the first loss value corresponding to any first sample participle is larger than a first preset threshold, continuing to train the initial encoder and the initial decoder; and if the first loss value corresponding to each first sample participle is smaller than or equal to the first preset threshold, determining the current initial encoder as the pre-trained encoder.
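The stopping rule of this embodiment can be illustrated with a minimal sketch. It assumes that the first loss value is the mean squared error between the third (input) word vector and the fifth (reconstructed) word vector; the application does not name the loss function, and the function names below are hypothetical.

```python
from typing import Sequence

Vector = Sequence[float]


def reconstruction_loss(original: Vector, restored: Vector) -> float:
    """Mean squared error between an input vector and its reconstruction.

    MSE is an assumption; the source only says the loss is determined from
    the third and fifth word vectors.
    """
    if len(original) != len(restored):
        raise ValueError("vectors must have the same dimensionality")
    return sum((a - b) ** 2 for a, b in zip(original, restored)) / len(original)


def should_continue_training(
    third_vectors: Sequence[Vector],
    fifth_vectors: Sequence[Vector],
    first_threshold: float,
) -> bool:
    """Return True if any sample's loss exceeds the first preset threshold."""
    losses = [
        reconstruction_loss(original, restored)
        for original, restored in zip(third_vectors, fifth_vectors)
    ]
    return any(loss > first_threshold for loss in losses)
```

Training of the initial encoder and the initial decoder would continue for as long as `should_continue_training` returns True; once it returns False, the current encoder would be kept as the pre-trained encoder.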
In a possible embodiment, the delivery effect prediction model is trained by:
obtaining a plurality of second sample participles, and a sixth word vector and a first delivery value corresponding to each second sample participle in the plurality of second sample participles;
for each second sample participle, inputting the sixth word vector corresponding to the second sample participle into the pre-trained encoder, and acquiring a seventh word vector of the second sample participle output by the encoder;
inputting the seventh word vector of the second sample participle into an initial delivery effect prediction model, and acquiring a second delivery value of the second sample participle output by the initial delivery effect prediction model;
and determining a second loss value corresponding to the second sample participle according to the first delivery value and the second delivery value corresponding to the second sample participle; if the second loss value corresponding to any second sample participle is larger than a second preset threshold, continuing to train the initial delivery effect prediction model; and if the second loss value corresponding to each second sample participle is smaller than or equal to the second preset threshold, determining the current initial delivery effect prediction model as the pre-trained delivery effect prediction model.
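The training procedure above can be sketched as follows. This is only an illustration under stated assumptions: a linear model stands in for the prediction model, the second loss value is taken to be the squared error between the recorded (first) and predicted (second) delivery values, and `max_epochs` is a safety cap that does not appear in the application. The function name and parameters are hypothetical.

```python
from typing import List, Sequence, Tuple


def train_delivery_model(
    seventh_vectors: Sequence[Sequence[float]],
    first_delivery_values: Sequence[float],
    second_threshold: float,
    learning_rate: float = 0.1,
    max_epochs: int = 5000,
) -> Tuple[List[float], float, List[float]]:
    """Fit a linear stand-in for the delivery effect prediction model.

    Training continues while any per-sample squared error exceeds
    second_threshold (or until max_epochs is reached).
    Returns (weights, bias, per-sample losses from the last evaluation).
    """
    n = len(seventh_vectors)
    dim = len(seventh_vectors[0])
    weights = [0.0] * dim
    bias = 0.0
    losses: List[float] = []

    for _ in range(max_epochs):
        # Second delivery value: the model's prediction for each sample.
        preds = [
            sum(w * x for w, x in zip(weights, vec)) + bias
            for vec in seventh_vectors
        ]
        errors = [p - y for p, y in zip(preds, first_delivery_values)]
        losses = [e * e for e in errors]

        # Stop once every sample's loss is within the threshold.
        if all(loss <= second_threshold for loss in losses):
            break

        # Full-batch gradient step on the mean squared error.
        for j in range(dim):
            grad_j = 2.0 / n * sum(e * vec[j] for e, vec in zip(errors, seventh_vectors))
            weights[j] -= learning_rate * grad_j
        bias -= learning_rate * 2.0 / n * sum(errors)

    return weights, bias, losses
```

A real implementation would replace the linear model with the neural network described in the embodiments; the stopping condition is the part this sketch is meant to show.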
In a possible implementation manner, the obtaining a first word vector corresponding to a target word segmentation includes:
performing word segmentation processing on the target word segmentation to obtain a plurality of word roots of the target word segmentation;
searching a root vector corresponding to each root of the target participle according to the corresponding relation between the participle and the vector;
and determining a first word vector corresponding to the target word segmentation according to the root vector corresponding to each root of the target word segmentation.
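As a minimal sketch of the lookup-and-combine step, the example below assumes that the roots have already been produced by a segmenter, that roots without a known vector are skipped, and that the root vectors are combined by element-wise averaging. The application does not specify the combination rule, so all three choices and the function name are assumptions.

```python
from typing import Dict, List, Sequence


def first_word_vector(
    roots: Sequence[str],
    root_vectors: Dict[str, Sequence[float]],
) -> List[float]:
    """Combine the root vectors of one target participle into a single vector.

    Element-wise averaging is an assumption; the source only states that the
    first word vector is determined from the root vectors.
    """
    found = [root_vectors[root] for root in roots if root in root_vectors]
    if not found:
        raise KeyError("no root of the target participle has a known vector")
    dim = len(found[0])
    return [sum(vec[i] for vec in found) / len(found) for i in range(dim)]
```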
In one possible embodiment, the method further comprises:
obtaining an environment feature vector of a launching environment where the target word segmentation is located;
determining an eighth word vector of the target word segmentation according to the second word vector and the environment characteristic vector respectively corresponding to the target word segmentation;
and inputting the eighth word vector into the delivery effect prediction model, and acquiring a second target delivery value corresponding to the target word segmentation output by the delivery effect prediction model.
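One simple way to form the eighth word vector is to concatenate the second word vector with the environment feature vector. The sketch below assumes concatenation; the application only says the eighth word vector is determined from the two vectors, and the function name is hypothetical.

```python
from typing import List, Sequence


def eighth_word_vector(
    second_word_vector: Sequence[float],
    environment_features: Sequence[float],
) -> List[float]:
    """Join the encoded word vector with the environment feature vector.

    Concatenation is an assumption; the source does not specify how the two
    vectors are combined.
    """
    return list(second_word_vector) + list(environment_features)
```

With this choice, the input dimensionality of the delivery effect prediction model would equal the sum of the two vector lengths.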
In a possible implementation, the encoder corresponds to a multi-layer neural network, and the inputting the first word vector into a pre-trained encoder and obtaining a second word vector of the target word segmentation output by the encoder includes:
determining a mean vector and a variance vector corresponding to the target participle according to a transformation matrix and a nonlinear function corresponding to each layer of neural network in the multilayer neural network and a first word vector of the target participle;
and constructing a Gaussian distribution function corresponding to the target word segmentation based on the mean vector and the variance vector corresponding to the target word segmentation, and determining a second word vector of the target word segmentation.
In one possible embodiment, the method further comprises:
and selecting keywords from the target participles according to a first target delivery value corresponding to each target participle in the target participles.
In a second aspect, an embodiment of the present application provides an apparatus for determining a keyword, where the apparatus includes:
the first acquisition module is used for acquiring a first word vector corresponding to the target word segmentation;
the second obtaining module is used for inputting the first word vector into a pre-trained encoder and obtaining a second word vector of the target word segmentation output by the encoder; wherein the second word vector is lower in dimensionality than the first word vector;
a third obtaining module, configured to input the second word vector into a pre-trained delivery effect prediction model, and obtain a first target delivery value corresponding to the target word segmentation output by the delivery effect prediction model;
and the sending module is used for determining the target word segmentation of which the first target delivery value meets the preset condition as a keyword and sending the keyword to a corresponding user terminal.
In a third aspect, an embodiment of the present application provides an electronic device, including: a processor, a memory and a bus, wherein the memory stores machine-readable instructions executable by the processor, the processor and the memory communicate with each other through the bus when the electronic device runs, and the processor executes the machine-readable instructions to execute the steps of the keyword determination method according to any one of the first aspect.
In a fourth aspect, the present application provides a computer-readable storage medium, on which a computer program is stored, where the computer program is executed by a processor to perform the steps of the keyword determination method according to any one of the first aspect.
The embodiments of the application provide a keyword determination method, a keyword determination device, electronic equipment and a storage medium. The keyword determination method comprises the following steps: acquiring a first word vector corresponding to a target word segmentation; inputting the first word vector into a pre-trained encoder, and acquiring a second word vector of the target word segmentation output by the encoder, wherein the second word vector is lower in dimensionality than the first word vector; inputting the second word vector into a pre-trained delivery effect prediction model, and acquiring a first target delivery value corresponding to the target word segmentation output by the delivery effect prediction model; and determining the target word segmentation whose first target delivery value meets a preset condition as a keyword, and sending the keyword to a corresponding user terminal. In this way, the keywords used for marking the products to be sold are determined based on the predicted delivery effect, which improves the flexibility and accuracy of keyword determination.
In order to make the aforementioned objects, features and advantages of the present application more comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are required to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained from the drawings without inventive effort.
FIG. 1 is a flowchart of a keyword determination method provided in an embodiment of the present application;
FIG. 2 is a flowchart of another keyword determination method provided in an embodiment of the present application;
FIG. 3 is a flowchart of another keyword determination method provided in an embodiment of the present application;
FIG. 4 is a flowchart of another keyword determination method provided in an embodiment of the present application;
FIG. 5 is a flowchart of another keyword determination method provided in an embodiment of the present application;
FIG. 6 is a flowchart of another keyword determination method provided in an embodiment of the present application;
FIG. 7 is a schematic structural diagram of a keyword determination apparatus provided in an embodiment of the present application;
FIG. 8 is a schematic structural diagram of an electronic device provided in an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all the embodiments. The components of the embodiments of the present application, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present application, presented in the accompanying drawings, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present application without making any creative effort, shall fall within the protection scope of the present application.
At this stage, the keywords used to mark the products for sale are generally determined as follows: the delivery effect of each keyword used in online or offline sales activities is recorded; when a new sales activity is launched, historical sales activities similar to the new one are searched for, and the keywords of those historical sales activities are used as the keywords of the new sales activity.
However, with this method, only keywords that have already been delivered in historical sales activities can be selected; the delivery effect of a new keyword that has never been used in a historical sales activity cannot be predicted.
Based on the above problem, the embodiments of the present application provide a keyword determination method, a keyword determination device, electronic equipment and a storage medium. The keyword determination method includes: acquiring a first word vector corresponding to a target word segmentation; inputting the first word vector into a pre-trained encoder, and acquiring a second word vector of the target word segmentation output by the encoder, wherein the second word vector is lower in dimensionality than the first word vector; inputting the second word vector into a pre-trained delivery effect prediction model, and acquiring a first target delivery value corresponding to the target word segmentation output by the delivery effect prediction model; and determining the target word segmentation whose first target delivery value meets a preset condition as a keyword, and sending the keyword to a corresponding user terminal. In this way, the keywords used for marking the products to be sold are determined based on the predicted delivery effect, which improves the flexibility and accuracy of keyword determination.
The drawbacks described above were identified by the inventors through practice and careful study. Therefore, both the discovery of these problems and the solutions proposed below should be regarded as contributions made by the inventors in the course of the present application.
The technical solutions in the present application will be described clearly and completely with reference to the drawings in the present application, and it should be understood that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. The components of the present application, as generally described and illustrated in the figures herein, may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present application, presented in the accompanying drawings, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present application without making any creative effort, shall fall within the protection scope of the present application.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
To facilitate understanding of the present embodiment, a method for determining a keyword disclosed in the embodiments of the present application will be described in detail first.
Referring to fig. 1, fig. 1 is a flowchart of a method for determining a keyword according to an embodiment of the present application, where the method includes the following steps:
S101, obtaining a first word vector corresponding to the target word segmentation.
In practice, when a sales activity is carried out on the internet, a keyword is usually used to mark a product to be sold so as to improve the attention and purchase rate of the product; for example, a certain brand of laundry detergent is marked as "good dad". Here, a target participle is a candidate keyword. The keyword finally used to mark the product to be sold is selected from a plurality of target participles by predicting the sales indexes, such as click rate and display rate, of each target participle in the sales activity to be held. A delivery value is used to represent the quality of the sales indexes: a high delivery value indicates good sales indexes, and a low delivery value indicates poor sales indexes.
The target participle is an unstructured representation, and in order to perform correlation processing on the target participle, the unstructured target participle needs to be converted into a structured first word vector, and the first word vector is used for representing semantic features of the target participle.
S102, inputting the first word vector into a pre-trained encoder, and acquiring a second word vector of the target word segmentation output by the encoder; wherein the second word vector has a lower dimension than the first word vector.
In the step, the encoder is used for deeply mining semantic features of a first word vector corresponding to the target participle, and meanwhile, the dimensionality of the first word vector is reduced, so that the processing precision and the processing speed of the target participle in the subsequent processing process are improved.
The pre-trained encoder is obtained by training an initial encoder with first sample participles associated with the target participles. The first word vector corresponding to a target participle is input into the encoder, and the encoder outputs the second word vector corresponding to that target participle. The dimensionality of the second word vector is lower than that of the first word vector and is preset in the encoder, so when different target participles are input into the same encoder, the second word vectors output for them all have the same dimensionality.
Specifically, referring to FIG. 2, FIG. 2 is a flowchart of another method for determining a keyword according to an embodiment of the present application, where the encoder corresponds to a multilayer neural network, and in step S102, inputting the first word vector into a pre-trained encoder and acquiring the second word vector of the target word segmentation output by the encoder includes:
S1021, determining a mean vector and a variance vector corresponding to the target word segmentation according to the transformation matrix and the nonlinear function corresponding to each layer of neural network in the multilayer neural network and the first word vector of the target word segmentation.
The encoder corresponds to a multilayer neural network, and each layer corresponds to a transformation matrix and a nonlinear function. The number of layers is set according to the actual requirements of the user. Here, the processing of the encoder is explained by taking a two-layer neural network as an example, where the first layer corresponds to a first transformation matrix and a first nonlinear function, and the second layer corresponds to a second transformation matrix and a second nonlinear function. Inputting the first word vector of a target participle into the encoder means inputting it into the first layer. The product of the first word vector and the first transformation matrix is calculated to obtain a first intermediate word vector; the number of columns of the first transformation matrix is less than or equal to its number of rows, so that the number of columns of the first intermediate word vector is less than or equal to that of the first word vector. The first intermediate word vector is then processed with the first nonlinear function, which further mines its semantic features, to obtain a second intermediate word vector.
The second intermediate word vector is then input into the second layer. The product of the second intermediate word vector and the second transformation matrix is calculated to obtain a third intermediate word vector; the number of columns of the second transformation matrix is less than or equal to its number of rows, so that the number of columns of the third intermediate word vector is less than or equal to that of the second intermediate word vector. The third intermediate word vector is then processed with the second nonlinear function, which further mines its semantic features, to obtain a fourth intermediate word vector.
It should be noted that the number of columns of the second transformation matrix is an even number, that is, the number of elements of the fourth intermediate word vector is even. For a fourth intermediate word vector containing N elements, the vector composed of the 1st to (N/2)-th elements is used as the mean vector of the target participle, and the vector composed of the (N/2+1)-th to N-th elements is used as the variance vector of the target participle.
S1022, constructing a Gaussian distribution function corresponding to the target word segmentation based on the mean vector and the variance vector corresponding to the target word segmentation, and determining a second word vector of the target word segmentation.
A Gaussian distribution function corresponding to the target participle is constructed based on the mean vector and the variance vector corresponding to the target participle and an initial Gaussian distribution function prestored in the encoder; the Gaussian distribution function is a probability distribution function. After the Gaussian distribution function corresponding to the target participle is determined, the second word vector corresponding to the target participle is obtained by sampling from it, and the encoder outputs this second word vector.
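Steps S1021 and S1022 can be sketched for the two-layer case as follows. The sketch makes several assumptions that are not stated in the application: tanh is used as the first nonlinear function, softplus as the second (so that the variance half is strictly positive), bias terms are omitted, and each element of the second word vector is sampled independently. The names `vec_mat`, `softplus` and `encode` are hypothetical.

```python
import math
import random
from typing import List, Sequence, Tuple

Matrix = Sequence[Sequence[float]]


def vec_mat(vec: Sequence[float], mat: Matrix) -> List[float]:
    """Row vector times matrix: output length equals the matrix column count."""
    if len(vec) != len(mat):
        raise ValueError("vector length must equal the matrix row count")
    return [sum(v * row[j] for v, row in zip(vec, mat)) for j in range(len(mat[0]))]


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x)); always positive."""
    return math.log1p(math.exp(-abs(x))) + max(x, 0.0)


def encode(
    first_word_vector: Sequence[float],
    w1: Matrix,
    w2: Matrix,
    rng: random.Random,
) -> Tuple[List[float], List[float], List[float]]:
    """Return (mean vector, variance vector, sampled second word vector)."""
    # Layer 1: transformation matrix followed by the first nonlinear function.
    second_intermediate = [math.tanh(x) for x in vec_mat(first_word_vector, w1)]
    # Layer 2: transformation matrix followed by the second nonlinear function.
    fourth_intermediate = [softplus(x) for x in vec_mat(second_intermediate, w2)]

    n = len(fourth_intermediate)
    if n % 2 != 0:
        raise ValueError("the second matrix must have an even column count")
    mean = fourth_intermediate[: n // 2]       # elements 1 .. N/2
    variance = fourth_intermediate[n // 2:]    # elements N/2+1 .. N

    # Draw one sample from N(mean, variance), element by element.
    sample = [rng.gauss(m, math.sqrt(v)) for m, v in zip(mean, variance)]
    return mean, variance, sample
```

Because softplus is applied to the whole fourth intermediate word vector, the mean half is also positive in this sketch; a different second nonlinear function would change that without affecting the split-and-sample structure.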
S103, inputting the second word vector into a pre-trained delivery effect prediction model, and obtaining a first target delivery value corresponding to the target word segmentation output by the delivery effect prediction model.
In this step, the first target delivery value is used to represent the sales indexes, such as click rate and display amount, of the target participle in the sales activity to be held (the delivery environment). The pre-trained delivery effect prediction model is obtained by training an initial delivery effect prediction model with second sample participles associated with the target participle. The second word vector corresponding to the target participle is input into the delivery effect prediction model, and the delivery value output by the model is determined as the first target delivery value corresponding to the target participle.
Optionally, the delivery effect prediction model is a deep neural network (DNN) model, i.e., a regression model.
S104, determining the target word segmentation whose first target delivery value meets the preset condition as a keyword, and sending the keyword to a corresponding user terminal.
In this step, the preset condition is that the first target delivery value is greater than a preset threshold, after the first target delivery value of the target participle is determined, the first target delivery value is compared with the preset threshold, and if the first target delivery value is greater than the preset threshold, the first target delivery value is determined to meet the preset condition, and the target participle is determined to be a keyword.
As another optional implementation, the method for determining the keyword further includes: and selecting keywords from the target participles according to a first target delivery value corresponding to each target participle in the target participles.
After the first target delivery value of each target participle is determined, the target participles are sorted by their first target delivery values, and the top-ranked target participles are determined as keywords. For example, the target participle ranked first is determined as the keyword; or the target participles ranked in the first three positions are determined as keywords, so that several keywords are offered for the corresponding user to choose from.
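The two selection strategies described above, comparison against a preset threshold and ranking, can be sketched as follows. The function names and the dictionary representation of the predicted delivery values are assumptions made for illustration.

```python
from typing import Dict, List


def select_by_threshold(delivery_values: Dict[str, float], threshold: float) -> List[str]:
    """Keep target participles whose first target delivery value exceeds the threshold."""
    return [word for word, value in delivery_values.items() if value > threshold]


def select_top_k(delivery_values: Dict[str, float], k: int) -> List[str]:
    """Rank target participles by delivery value (highest first) and keep the first k."""
    ranked = sorted(delivery_values.items(), key=lambda item: item[1], reverse=True)
    return [word for word, _ in ranked[:k]]
```

Calling `select_top_k` with `k=1` corresponds to choosing the single best candidate, and `k=3` corresponds to offering the first three candidates to the user.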
After determining the keywords, the keywords are sent to a corresponding user terminal, for example, a user terminal of a merchant selling products, or a user terminal of a merchant producing the products to be sold, or a user terminal of a third-party platform providing a selling platform for the products to be sold.
According to the keyword determining method provided by the embodiment of the application, the encoder and the putting effect prediction model are adopted to predict the putting effect of each target participle, the keywords used for marking the products to be sold are determined based on the predicted putting effect, and the flexibility and the accuracy of determining the keywords are improved.
Further, referring to FIG. 3, FIG. 3 is a flowchart of another keyword determination method provided in the embodiment of the present application, and the encoder is trained in the following manner:
S301, obtaining a plurality of first sample participles and a third word vector corresponding to each first sample participle in the plurality of first sample participles.
In this step, considering that the target participles are used for marking products to be sold in sales activities, the first sample participles are taken from historical hot word list data recommended by Ali. The number of first sample participles far exceeds the number of keywords that have been delivered, and training the encoder with a large number of first sample participles can improve the accuracy of the encoder.
After a plurality of first sample participles are obtained from the historical hot word list data recommended by Ali, each unstructured first sample participle is converted into a structured third word vector, so that the third word vector corresponding to each of the plurality of first sample participles is determined.
S302, aiming at each first sample participle, inputting a third word vector corresponding to the first sample participle into an initial encoder, and obtaining a fourth word vector of the first sample participle output by the initial encoder.
In this step, the initial encoder is an untrained encoder or an encoder whose training is not yet complete. The third word vector of each of the plurality of first sample participles is input into the initial encoder one by one, and the vector output by the initial encoder is determined as the fourth word vector of the first sample participle.
S303, inputting the fourth word vector of the first sample word segmentation into an initial decoder matched with the initial encoder, and obtaining a fifth word vector of the first sample word segmentation output by the initial decoder.
In the step, the encoder corresponds to a decoder matched with the encoder, the encoder is used for performing dimension reduction processing and feature mining on the vector, the decoder is used for restoring the vector obtained by encoding of the encoder to an original state, and when the encoder is trained, the encoder and the decoder are combined to realize co-training.
An initial decoder corresponding to the initial encoder is acquired, where the initial decoder is an untrained decoder or a decoder whose training is not yet complete. The initial encoder and the initial decoder are trained together: the fourth word vector of the first sample participle output by the initial encoder is input into the initial decoder, and the vector output by the initial decoder is determined as the fifth word vector of the first sample participle, that is, the vector obtained by restoring the vector encoded by the encoder.
S304, according to the third word vector and the fifth word vector respectively corresponding to the first sample participle, determining a first loss value corresponding to the first sample participle, if the first loss value corresponding to any one first sample participle is larger than a first preset threshold value, continuing to train the initial encoder and the initial decoder, and if the first loss value corresponding to each first sample participle is smaller than or equal to the first preset threshold value, determining the current initial encoder as the pre-trained encoder.
In this step, once training of the initial encoder and the initial decoder is complete, the vector input to the initial encoder should be consistent with the vector output by the initial decoder. Therefore, the first loss value of the model for the current first sample participle is determined from the third word vector input to the initial encoder and the fifth word vector output by the initial decoder. If the first loss value corresponding to any first sample participle is greater than the first preset threshold, the third word vector and the fifth word vector of that first sample participle are inconsistent or differ greatly, the initial encoder and/or the initial decoder are not yet accurate, and training of the two models must continue. If the first loss value corresponding to every first sample participle is less than or equal to the first preset threshold, the third word vector and the fifth word vector of each first sample participle are consistent or differ only slightly, the initial encoder and the initial decoder are accurate, and the current initial encoder is determined as the pre-trained encoder.
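The stopping criterion of S304 can be sketched as below. The application does not specify the loss function, so mean squared error is used here purely as an assumption; the function names `reconstruction_loss` and `training_finished` and the sample vectors are hypothetical.

```python
def reconstruction_loss(third_vector, fifth_vector):
    """Mean squared error between the encoder input and the decoder output."""
    if len(third_vector) != len(fifth_vector):
        raise ValueError("input and reconstruction must have the same dimension")
    squared = [(a - b) ** 2 for a, b in zip(third_vector, fifth_vector)]
    return sum(squared) / len(squared)


def training_finished(third_vectors, fifth_vectors, first_threshold):
    """True only if every first sample participle has loss <= first_threshold."""
    losses = [reconstruction_loss(x, y) for x, y in zip(third_vectors, fifth_vectors)]
    return all(loss <= first_threshold for loss in losses)


# Illustrative third word vectors and two candidate sets of decoder outputs.
inputs = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
good_outputs = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
poor_outputs = [[1.0, 2.0, 3.0], [4.0, 5.0, 9.0]]
```

With a threshold of 0.01, `good_outputs` would end training, whereas a single poorly reconstructed sample in `poor_outputs` is enough to require further training of both the encoder and the decoder.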
Further, referring to fig. 4, fig. 4 is a flowchart of another keyword determination method provided in the embodiment of the present application, and the delivery effect prediction model is trained in the following manner:
S401, a plurality of second sample participles are obtained, and a sixth word vector and a first putting value corresponding to each second sample participle in the plurality of second sample participles are obtained.
In this step, considering that the target participle is used for marking a product to be sold in a sales activity, the second sample participle is a released keyword, the first release value of the second sample participle is a release value corresponding to a real sales index of the second sample participle in the held sales activity, and the unstructured second sample participle is converted into a structured sixth word vector, that is, a sixth word vector corresponding to each second sample participle in the plurality of second sample participles is determined.
S402, aiming at each second sample participle, inputting a sixth word vector corresponding to the second sample participle into the pre-trained encoder, and obtaining a seventh word vector of the second sample participle output by the encoder.
In the step, after the encoder training is finished, the putting effect prediction model is trained, a pre-trained encoder is used for carrying out dimensionality reduction on the sixth word vector of each second sample word segmentation, and the features of the sixth word vector are deeply mined to obtain the seventh word vector of each second sample word segmentation.
And S403, inputting the seventh word vector of the second sample participle into an initial putting effect prediction model, and acquiring a second putting value of the second sample participle output by the initial putting effect prediction model.
In this step, the initial putting effect prediction model is an untrained putting effect prediction model or a putting effect prediction model whose training is not yet complete. The seventh word vector of each second sample participle is input into the initial putting effect prediction model one by one, and the putting value output by the initial putting effect prediction model is determined as the second putting value of the second sample participle.
S404, according to a first putting value and a second putting value corresponding to the second sample participle, determining a second loss value corresponding to the second sample participle, if the second loss value corresponding to any second sample participle is larger than a second preset threshold value, continuing to train the initial putting effect prediction model, and if the second loss value corresponding to each second sample participle is smaller than or equal to the second preset threshold value, determining the current initial putting effect prediction model as the pre-trained putting effect prediction model.
In this step, the first delivery value is the real delivery value of the second sample participle in a sales activity that has already been held, and the second delivery value is the delivery value predicted by the initial putting effect prediction model. The second loss value incurred when the initial putting effect prediction model predicts the second delivery value of each second sample participle is determined from the difference between the first delivery value and the second delivery value. If the second loss value corresponding to any second sample participle is greater than the second preset threshold, the first delivery value and the second delivery value of that second sample participle are inconsistent or differ greatly, the initial putting effect prediction model is not yet accurate, and training of the model must continue. If the second loss value corresponding to every second sample participle is less than or equal to the second preset threshold, the first delivery value and the second delivery value of each second sample participle are consistent or differ only slightly, the initial putting effect prediction model is accurate, and the current initial putting effect prediction model is determined as the pre-trained putting effect prediction model.
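The two-stage arrangement of S401 to S404 (a frozen, pre-trained encoder followed by a separately fitted prediction model) can be illustrated with a deliberately small sketch. Several simplifications are assumptions and not part of the application: the encoder is replaced by a stand-in that averages the vector to a single number, the prediction model is a one-variable linear model fitted by closed-form least squares rather than by iterative training, the loss is squared error, and all names and data are hypothetical.

```python
def frozen_encoder(sixth_vector):
    """Stand-in for the pre-trained encoder: averages the vector to one number."""
    return sum(sixth_vector) / len(sixth_vector)


def fit_linear_predictor(features, values):
    """Closed-form least squares for value ~ slope * feature + intercept."""
    n = len(features)
    mean_x = sum(features) / n
    mean_y = sum(values) / n
    var_x = sum((x - mean_x) ** 2 for x in features)
    if var_x == 0:
        raise ValueError("features must not all be equal")
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(features, values))
    slope = cov_xy / var_x
    return slope, mean_y - slope * mean_x


# Illustrative second sample participles: sixth word vectors and real values.
sixth_vectors = [[1.0, 1.0], [1.0, 3.0], [2.0, 4.0]]
first_values = [2.0, 4.0, 6.0]

# The encoder is not updated here; only the predictor is fitted.
seventh = [frozen_encoder(v) for v in sixth_vectors]
slope, intercept = fit_linear_predictor(seventh, first_values)
second_values = [slope * z + intercept for z in seventh]

# Per-sample second loss values and the acceptance check of S404.
second_losses = [(a - b) ** 2 for a, b in zip(first_values, second_values)]
second_threshold = 1e-6
model_accepted = all(loss <= second_threshold for loss in second_losses)
```

The point of the sketch is the data flow: the sixth word vectors pass through the already-trained encoder unchanged, and only the predictor is judged against the second preset threshold.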
Further, referring to fig. 5, fig. 5 is a flowchart of another method for determining a keyword according to an embodiment of the present application, where the obtaining a first word vector corresponding to a target word segmentation includes:
S501, performing word segmentation processing on the target word segmentation to obtain a plurality of word roots of the target word segmentation.
In this step, considering that the target participle is a word that has not been put in historically, so that a vector representation of the target participle cannot be obtained directly, the target participle is first segmented to obtain a plurality of roots of the target participle. For example, the roots of the target participle "milk bath lotion" are "milk" and "bath lotion".
S502, according to the corresponding relation between the participles and the vectors, root vectors corresponding to each root of the target participles are searched.
In the step, after a plurality of roots of each target participle are obtained, the root vector of each root of the target participle is determined according to the corresponding relation between the preset participle and the vector.
Optionally, a language model such as word2vec, or a GCN (graph convolutional network) model, is used to obtain the root vector of each root; specifically, the root is input into the corresponding model to obtain the root vector of the root output by the model.
S503, determining a first word vector corresponding to the target participle according to the root vector corresponding to each root of the target participle.
In the step, root vectors of each root of the target participle are spliced to obtain a first word vector of the target participle, for example, root vectors corresponding to root "milk" and "bath lotion" in "milk bath lotion" are (1, 2, 3) and (4, 5, 6), respectively, and a first word vector of "milk bath lotion" is (1, 2, 3, 4, 5, 6).
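The lookup-and-splice procedure of S502 and S503 can be sketched as follows, reusing the "milk bath lotion" example from the text. The lookup table `ROOT_VECTORS` and the function name `first_word_vector` are hypothetical, and the word segmentation of S501 is assumed to have been performed already.

```python
# Preset correspondence between roots and vectors (values from the example).
ROOT_VECTORS = {"milk": (1, 2, 3), "bath lotion": (4, 5, 6)}


def first_word_vector(roots, root_vectors=ROOT_VECTORS):
    """Splice the root vectors of a target participle in root order."""
    vector = []
    for root in roots:
        if root not in root_vectors:
            raise KeyError(f"no vector registered for root {root!r}")
        vector.extend(root_vectors[root])
    return tuple(vector)


milk_bath_lotion = first_word_vector(["milk", "bath lotion"])
```

Because the vectors are concatenated, the dimension of the first word vector grows with the number of roots, which is one reason a later dimension-reduction step by the encoder is useful.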
Correspondingly, a third word vector corresponding to each first sample participle is obtained as follows: performing word segmentation processing on the first sample word segmentation to obtain a plurality of word roots of the first sample word segmentation; searching a root vector corresponding to each root of the first sample participle according to the corresponding relation between the participle and the vector; and determining a third word vector corresponding to the first sample participle according to the root vector corresponding to each root of the first sample participle.
Obtaining a sixth word vector corresponding to each second sample participle by the following method: performing word segmentation processing on the second sample word segmentation to obtain a plurality of word roots of the second sample word segmentation; searching a root vector corresponding to each root of the second sample participle according to the corresponding relation between the participle and the vector; and determining a sixth word vector corresponding to the second sample participle according to the root vector corresponding to each root of the second sample participle.
Further, referring to fig. 6, fig. 6 is a flowchart of another method for determining a keyword according to an embodiment of the present application, where the method further includes:
S601, obtaining an environment feature vector of a launching environment where the target word segmentation is located.
In this step, the release environment is a sales activity to be held, and the environmental characteristics of the release environment include the activity type of the sales activity (for example, promotion, daily life, cost-effective accumulation, or twenty-first class) as well as the brand and category involved in the sales activity. The environment feature vector of the launching environment where the target participle is located is determined according to the corresponding relation between each environmental characteristic and a vector.
S602, determining an eighth word vector of the target word segmentation according to the second word vector and the environment characteristic vector respectively corresponding to the target word segmentation.
In the step, the second word vector of the target word segmentation is spliced with the environment characteristic vector of the launching environment where the target word segmentation is located, so that an eighth word vector of the target word segmentation is obtained.
Here, the environmental characteristics of the delivery environment in which the target participle is located are introduced into the eighth word vector of the target participle, that is, the word vector of the target participle is determined based on the semantic characteristics of the target participle and the environmental characteristics of the delivery environment in which the target participle is located, so that the characteristics covered by the eighth word vector of the target participle are further enriched.
S603, inputting the eighth word vector into the putting effect prediction model, and obtaining a second target putting value corresponding to the target word segmentation output by the putting effect prediction model.
Inputting an eighth word vector corresponding to the target word segmentation into the putting effect prediction model, determining a putting value output by the putting effect prediction model as a second target putting value corresponding to the target word segmentation, determining a keyword in the target word segmentation based on the second target putting value, and sending the keyword to a corresponding user terminal.
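The construction of the eighth word vector in S601 and S602 can be sketched as below. The application only states that each environmental feature corresponds to a vector; one-hot encoding is used here as an assumption, and the vocabularies, brand and category names, and function names are all hypothetical.

```python
def one_hot(value, vocabulary):
    """Encode value as a one-hot list over vocabulary (assumed encoding)."""
    if value not in vocabulary:
        raise ValueError(f"unknown feature value: {value!r}")
    return [1.0 if item == value else 0.0 for item in vocabulary]


def eighth_word_vector(second_vector, environment, vocabularies):
    """Splice the second word vector with the environment feature vector."""
    environment_vector = []
    for feature_name, vocabulary in vocabularies.items():
        environment_vector.extend(one_hot(environment[feature_name], vocabulary))
    return list(second_vector) + environment_vector


# Hypothetical vocabularies for activity type, brand, and category.
vocabularies = {
    "activity_type": ["promotion", "daily life"],
    "brand": ["brand_a", "brand_b"],
    "category": ["bath", "skincare"],
}
environment = {"activity_type": "promotion", "brand": "brand_b", "category": "bath"}
eighth = eighth_word_vector([0.2, -0.5], environment, vocabularies)
```

The resulting vector carries the semantic features from the encoder in its leading components and the release-environment features in the remaining ones, and would be the input to the putting effect prediction model in S603.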
Based on the same inventive concept, the embodiment of the present application further provides a device for determining a keyword corresponding to the method for determining a keyword, and since the principle of solving the problem of the device in the embodiment of the present application is similar to the method for determining a keyword in the embodiment of the present application, the implementation of the device may refer to the implementation of the method, and repeated details are not described again.
Referring to fig. 7, fig. 7 is a schematic structural diagram of an apparatus for determining a keyword according to an embodiment of the present application, where the apparatus includes:
a first obtaining module 701, configured to obtain a first word vector corresponding to a target word segmentation;
a second obtaining module 702, configured to input the first word vector into a pre-trained encoder, and obtain a second word vector of the target word segmentation output by the encoder; wherein the second word vector is lower in dimensionality than the first word vector;
a third obtaining module 703, configured to input the second word vector into a pre-trained delivery effect prediction model, and obtain a first target delivery value corresponding to the target word segmentation output by the delivery effect prediction model;
a sending module 704, configured to determine the target word segmentation of which the first target delivery value meets the preset condition as a keyword, and send the keyword to a corresponding user terminal.
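The cooperation of modules 701 to 704 can be sketched as a simple pipeline. This is an illustration under stated assumptions: the function `determine_keywords` and its callable parameters are hypothetical, the stub vectorizer, encoder, and predictor below are toy stand-ins rather than trained models, and the sending of keywords to a user terminal is omitted.

```python
def determine_keywords(target_participles, get_first_vector, encoder,
                       predictor, meets_condition):
    """Run each target participle through the four stages and keep the hits."""
    keywords = []
    for participle in target_participles:
        first_vector = get_first_vector(participle)   # first obtaining module 701
        second_vector = encoder(first_vector)         # second obtaining module 702
        delivery_value = predictor(second_vector)     # third obtaining module 703
        if meets_condition(delivery_value):           # sending module 704 (selection part)
            keywords.append(participle)
    return keywords


# Toy stand-ins so the sketch runs end to end; none of these are real models.
def stub_vectorizer(word):
    return [float(len(word)), 1.0, 0.0, 0.0]

def stub_encoder(vector):
    return vector[:2]  # lower dimensionality than the input

def stub_predictor(vector):
    return sum(vector)

selected = determine_keywords(
    ["milk bath lotion", "milk"],
    stub_vectorizer, stub_encoder, stub_predictor,
    meets_condition=lambda value: value >= 10.0,
)
```

Passing the stages in as callables keeps the pipeline independent of how the encoder and the prediction model are actually implemented.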
In a possible embodiment, the apparatus further comprises:
the fourth obtaining module is used for obtaining a plurality of first sample participles and a third word vector corresponding to each first sample participle in the plurality of first sample participles;
a fifth obtaining module, configured to, for each first sample participle, input a third word vector corresponding to the first sample participle into an initial encoder, and obtain a fourth word vector of the first sample participle output by the initial encoder;
a sixth obtaining module, configured to input the fourth word vector of the first sample word segmentation into an initial decoder matched with the initial encoder, and obtain a fifth word vector of the first sample word segmentation output by the initial decoder;
a first determining module, configured to determine a first loss value corresponding to the first sample word segmentation according to a third word vector and a fifth word vector corresponding to the first sample word segmentation, continue to train the initial encoder and the initial decoder if there is a first loss value corresponding to any one of the first sample word segmentations that is greater than a first preset threshold, and determine the current initial encoder as the pre-trained encoder if the first loss value corresponding to each of the first sample word segmentations is less than or equal to the first preset threshold.
In a possible embodiment, the apparatus further comprises:
a seventh obtaining module, configured to obtain a plurality of second sample participles, and a sixth word vector and a first putting value corresponding to each of the plurality of second sample participles;
an eighth obtaining module, configured to, for each second sample participle, input a sixth word vector corresponding to the second sample participle into the pre-trained encoder, and obtain a seventh word vector of the second sample participle output by the encoder;
a ninth obtaining module, configured to input a seventh word vector of the second sample word segmentation into an initial putting effect prediction model, and obtain a second putting value of the second sample word segmentation output by the initial putting effect prediction model;
and the second determining module is used for determining a second loss value corresponding to the second sample word segmentation according to a first delivery value and a second delivery value corresponding to the second sample word segmentation, if the second loss value corresponding to any second sample word segmentation is larger than a second preset threshold, continuing to train the initial delivery effect prediction model, and if the second loss value corresponding to each second sample word segmentation is smaller than or equal to the second preset threshold, determining the current initial delivery effect prediction model as the pre-trained delivery effect prediction model.
In a possible implementation manner, the first obtaining module 701, when obtaining a first word vector corresponding to a target word segmentation, includes:
performing word segmentation processing on the target word segmentation to obtain a plurality of word roots of the target word segmentation;
searching a root vector corresponding to each root of the target participle according to the corresponding relation between the participle and the vector;
and determining a first word vector corresponding to the target word segmentation according to the root vector corresponding to each root of the target word segmentation.
In a possible embodiment, the apparatus further comprises:
a tenth obtaining module, configured to obtain an environment feature vector of a delivery environment in which the target word segmentation is located;
a third determining module, configured to determine, according to the second word vector and the environment feature vector respectively corresponding to the target word segmentation, an eighth word vector of the target word segmentation;
an eleventh obtaining module, configured to input the eighth word vector into the delivery effect prediction model, and obtain a second target delivery value corresponding to the target word segmentation output by the delivery effect prediction model.
In a possible implementation manner, the encoder corresponds to a multi-layer neural network, and the second obtaining module 702, when inputting the first word vector into a pre-trained encoder and obtaining a second word vector of the target word segmentation output by the encoder, includes:
determining a mean vector and a variance vector corresponding to the target participle according to a transformation matrix and a nonlinear function corresponding to each layer of neural network in the multilayer neural network and a first word vector of the target participle;
and constructing a Gaussian distribution function corresponding to the target word segmentation based on the mean vector and the variance vector corresponding to the target word segmentation, and determining a second word vector of the target word segmentation.
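The encoder described here resembles the encoder of a variational autoencoder, and a forward pass can be sketched as follows. Everything beyond "transformation matrix plus nonlinear function per layer, then a mean vector and a variance vector" is an assumption: the use of `tanh`, the exponential that keeps variances positive, the choice between returning the mean and drawing a sample from the Gaussian, and all weights and names are hypothetical.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def encode(first_word_vector, hidden_layers, mean_matrix, log_var_matrix, rng=None):
    """Return (mean, variance, second_word_vector) for one target participle."""
    hidden = list(first_word_vector)
    for matrix in hidden_layers:
        # Each layer applies its transformation matrix, then a nonlinearity.
        hidden = [math.tanh(v) for v in matvec(matrix, hidden)]
    mean = matvec(mean_matrix, hidden)
    # Exponentiating keeps every variance strictly positive (assumed design).
    variance = [math.exp(v) for v in matvec(log_var_matrix, hidden)]
    if rng is None:
        return mean, variance, list(mean)  # deterministic: use the mean
    # Otherwise draw one sample from N(mean, variance) per dimension.
    sample = [m + math.sqrt(s) * rng.gauss(0.0, 1.0) for m, s in zip(mean, variance)]
    return mean, variance, sample


# Illustrative weights: 4-dimensional input, 3 hidden units, 2-dimensional output.
hidden_layers = [[[0.5, 0.0, 0.0, 0.0], [0.0, 0.5, 0.0, 0.0], [0.0, 0.0, 0.5, 0.5]]]
mean_matrix = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]
log_var_matrix = [[0.1, 0.0, 0.0], [0.0, 0.1, 0.1]]

mean, variance, second = encode([1.0, 2.0, 3.0, 4.0], hidden_layers,
                                mean_matrix, log_var_matrix)
_, _, sampled = encode([1.0, 2.0, 3.0, 4.0], hidden_layers,
                       mean_matrix, log_var_matrix, rng=random.Random(0))
```

In this sketch the second word vector has two dimensions against four for the first word vector, matching the requirement that the second word vector be lower in dimensionality than the first.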
In a possible embodiment, the apparatus further comprises:
the selecting module is used for selecting keywords from the target participles according to a first target putting value corresponding to each target participle in the target participles.
The keyword determining device provided by the embodiment of the application adopts the encoder and the putting effect prediction model to predict the putting effect of each target participle, determines the keywords for marking the products to be sold based on the predicted putting effect, and improves the flexibility and accuracy of determining the keywords.
Referring to fig. 8, fig. 8 is a schematic structural diagram of an electronic device according to an embodiment of the present application, where the electronic device 800 includes: a processor 801, a memory 802 and a bus 803, wherein the memory 802 stores machine-readable instructions executable by the processor 801, and when the electronic device is operated, the processor 801 communicates with the memory 802 via the bus 803, and the processor 801 executes the machine-readable instructions to perform the steps of the method for determining keywords as described above.
Specifically, the memory 802 and the processor 801 can be general-purpose memories and processors, which are not limited in particular, and the determining method of the keywords can be performed when the processor 801 runs a computer program stored in the memory 802.
Corresponding to the method for determining the keyword, an embodiment of the present application further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and the computer program is executed by a processor to perform the steps of the method for determining the keyword.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the system and the apparatus described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again. In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. The above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is merely a logical division, and there may be other divisions in actual implementation, and for example, a plurality of modules or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of devices or modules through some communication interfaces, and may be in an electrical, mechanical or other form.
The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical modules, may be located in one place, or may be distributed on a plurality of network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.
In addition, functional modules in the embodiments of the present application may be integrated into one processing module, or each of the modules may exist alone physically, or two or more modules are integrated into one module.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a non-volatile computer-readable storage medium executable by a processor. Based on such understanding, the technical solution of the present application or portions thereof that substantially contribute to the prior art may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
Finally, it should be noted that: the above-mentioned embodiments are only specific embodiments of the present application, and are used for illustrating the technical solutions of the present application, but not limiting the same, and the scope of the present application is not limited thereto, and although the present application is described in detail with reference to the foregoing embodiments, those skilled in the art should understand that: any person skilled in the art can modify or easily conceive the technical solutions described in the foregoing embodiments or equivalent substitutes for some technical features within the technical scope disclosed in the present application; such modifications, changes or substitutions do not depart from the spirit and scope of the exemplary embodiments of the present application, and are intended to be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (10)

1. A method for determining a keyword, the method comprising:
acquiring a first word vector corresponding to a target word segmentation;
inputting the first word vector into a pre-trained encoder, and acquiring a second word vector of the target word segmentation output by the encoder; wherein the second word vector is lower in dimensionality than the first word vector;
inputting the second word vector into a pre-trained putting effect prediction model, and acquiring a first target putting value corresponding to the target word segmentation output by the putting effect prediction model;
and determining the target word segmentation with the first target delivery value meeting the preset condition as a keyword, and sending the keyword to a corresponding user terminal.
2. The method of claim 1, wherein the encoder is trained by:
obtaining a plurality of first sample participles and a third word vector corresponding to each first sample participle in the plurality of first sample participles;
for each first sample participle, inputting a third word vector corresponding to the first sample participle into an initial encoder, and acquiring a fourth word vector of the first sample participle output by the initial encoder;
inputting the fourth word vector of the first sample word segmentation into an initial decoder matched with the initial encoder, and acquiring a fifth word vector of the first sample word segmentation output by the initial decoder;
and determining a first loss value corresponding to the first sample participle according to a third word vector and a fifth word vector corresponding to the first sample participle respectively, if the first loss value corresponding to any one first sample participle is larger than a first preset threshold value, continuing to train the initial encoder and the initial decoder, and if the first loss value corresponding to each first sample participle is smaller than or equal to the first preset threshold value, determining the current initial encoder as the pre-trained encoder.
3. The method for determining keywords according to claim 1, wherein the putting effect prediction model is trained by:
obtaining a plurality of second sample participles, and a sixth word vector and a first putting value corresponding to each second sample participle in the plurality of second sample participles;
for each second sample word segmentation, inputting a sixth word vector corresponding to the second sample word segmentation into the pre-trained encoder, and acquiring a seventh word vector of the second sample word segmentation output by the encoder;
inputting a seventh word vector of the second sample word segmentation into an initial putting effect prediction model, and acquiring a second putting value of the second sample word segmentation output by the initial putting effect prediction model;
and determining a second loss value corresponding to the second sample participle according to a first putting value and a second putting value corresponding to the second sample participle, if the second loss value corresponding to any second sample participle is larger than a second preset threshold, continuing to train the initial putting effect prediction model, and if the second loss value corresponding to each second sample participle is smaller than or equal to the second preset threshold, determining the current initial putting effect prediction model as the pre-trained putting effect prediction model.
4. The method for determining the keyword according to claim 1, wherein the obtaining the first word vector corresponding to the target word segmentation comprises:
performing word segmentation processing on the target word segmentation to obtain a plurality of word roots of the target word segmentation;
searching a root vector corresponding to each root of the target participle according to the corresponding relation between the participle and the vector;
and determining a first word vector corresponding to the target word segmentation according to the root vector corresponding to each root of the target word segmentation.
5. The method for determining keywords according to claim 1, further comprising:
obtaining an environment feature vector of a launching environment where the target word segmentation is located;
determining an eighth word vector of the target word segmentation according to the second word vector and the environment characteristic vector respectively corresponding to the target word segmentation;
and inputting the eighth word vector into the putting effect prediction model, and acquiring a second target putting value corresponding to the target word segmentation output by the putting effect prediction model.
6. The method for determining the keyword according to claim 1, wherein the encoder corresponds to a multi-layer neural network, and the inputting the first word vector into a pre-trained encoder to obtain a second word vector of the target word segmentation output by the encoder includes:
determining a mean vector and a variance vector corresponding to the target participle according to a transformation matrix and a nonlinear function corresponding to each layer of neural network in the multilayer neural network and a first word vector of the target participle;
and constructing a Gaussian distribution function corresponding to the target word segmentation based on the mean vector and the variance vector corresponding to the target word segmentation, and determining a second word vector of the target word segmentation.
7. The method for determining keywords according to claim 1, further comprising:
and selecting keywords from the target participles according to a first target delivery value corresponding to each target participle in the target participles.
8. An apparatus for determining a keyword, the apparatus comprising:
a first obtaining module, configured to obtain a first word vector corresponding to a target word segmentation;
a second obtaining module, configured to input the first word vector into a pre-trained encoder and obtain a second word vector of the target word segmentation output by the encoder; wherein the second word vector is lower in dimensionality than the first word vector;
a third obtaining module, configured to input the second word vector into a pre-trained delivery effect prediction model, and obtain a first target delivery value corresponding to the target word segmentation output by the delivery effect prediction model;
and a sending module, configured to determine the target word segmentation whose first target delivery value meets a preset condition as a keyword, and send the keyword to a corresponding user terminal.
9. An electronic device, comprising: a processor, a memory and a bus, the memory storing machine-readable instructions executable by the processor, the processor and the memory communicating via the bus when the electronic device is operating, the processor executing the machine-readable instructions to perform the steps of the method for determining a keyword according to any one of claims 1 to 7.
10. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, which computer program, when being executed by a processor, performs the steps of the keyword determination method according to any one of claims 1 to 7.
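
The encoding, prediction and selection steps recited in the claims above can be illustrated with a minimal Python sketch. It is not the patented implementation. The layer sizes, the tanh nonlinearity, the exponential used to keep the variance positive, the sampling of the second word vector from the Gaussian, the logistic stand-in for the delivery effect prediction model, the concatenation used to form the eighth word vector, and the threshold used as the preset condition are all assumptions made for illustration, and every function name is hypothetical.

```python
import math
import random


def random_layer(n_out, n_in, rng):
    """Hypothetical stand-in for a trained layer: (transformation matrix, bias)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(x, layer, activation):
    """Apply one layer: activation(W x + b)."""
    weights, bias = layer
    return [activation(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, bias)]


def encode(first_vec, hidden_layers, mean_layer, logvar_layer, rng):
    """Map a first word vector to a lower-dimensional second word vector.

    Returns (mean, variance, second_vec). Sampling from N(mean, diag(variance))
    is an assumption; the claims only say the second word vector is determined
    from the Gaussian distribution function.
    """
    h = first_vec
    for layer in hidden_layers:
        h = dense(h, layer, math.tanh)            # assumed nonlinearity
    mean = dense(h, mean_layer, lambda v: v)      # linear head for the mean
    variance = dense(h, logvar_layer, math.exp)   # exp keeps the variance positive
    second_vec = [m + math.sqrt(v) * rng.gauss(0.0, 1.0)
                  for m, v in zip(mean, variance)]
    return mean, variance, second_vec


def with_environment(second_vec, env_vec):
    """Form the eighth word vector; plain concatenation is an assumption."""
    return list(second_vec) + list(env_vec)


def predict_delivery_value(vec, weights, bias=0.0):
    """Logistic stand-in for the delivery effect prediction model."""
    z = sum(w * x for w, x in zip(weights, vec)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def select_keywords(delivery_values, threshold):
    """Keep the segmentations whose delivery value meets the assumed preset condition."""
    return [word for word, value in delivery_values.items() if value >= threshold]


if __name__ == "__main__":
    rng = random.Random(0)
    hidden = [random_layer(6, 8, rng)]      # 8-dim first vector -> 6-dim hidden
    mean_layer = random_layer(3, 6, rng)    # 3-dim second vector
    logvar_layer = random_layer(3, 6, rng)
    score_weights = [rng.uniform(-1.0, 1.0) for _ in range(3)]

    # Hypothetical first word vectors for two target word segmentations.
    first_vectors = {
        "word_a": [rng.uniform(-1.0, 1.0) for _ in range(8)],
        "word_b": [rng.uniform(-1.0, 1.0) for _ in range(8)],
    }
    values = {}
    for word, vec in first_vectors.items():
        _, _, second = encode(vec, hidden, mean_layer, logvar_layer, rng)
        values[word] = predict_delivery_value(second, score_weights)
    print(select_keywords(values, threshold=0.5))
```

In this sketch the second word vector has 3 dimensions against 8 for the first word vector, which mirrors the requirement that the encoder output be lower in dimensionality than its input. For the variant with an environment feature vector, the output of `with_environment` would be passed to a prediction function whose weight vector matches the concatenated length.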
CN202011415378.XA 2020-12-03 2020-12-03 Keyword determination method and device, electronic equipment and storage medium Active CN112434527B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011415378.XA CN112434527B (en) 2020-12-03 2020-12-03 Keyword determination method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011415378.XA CN112434527B (en) 2020-12-03 2020-12-03 Keyword determination method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN112434527A (en) 2021-03-02
CN112434527B (en) 2024-06-18

Family

ID=74692012

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011415378.XA Active CN112434527B (en) 2020-12-03 2020-12-03 Keyword determination method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112434527B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113159921A (en) * 2021-04-23 2021-07-23 上海晓途网络科技有限公司 Overdue prediction method and device, electronic equipment and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109034378A (en) * 2018-09-04 2018-12-18 腾讯科技(深圳)有限公司 Network representation generation method, device, storage medium and the equipment of neural network
CN111160017A (en) * 2019-12-12 2020-05-15 北京文思海辉金信软件有限公司 Keyword extraction method, phonetics scoring method and phonetics recommendation method
CN111738791A (en) * 2020-01-20 2020-10-02 北京沃东天骏信息技术有限公司 Text processing method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN112434527B (en) 2024-06-18

Similar Documents

Publication Publication Date Title
CN106599226B (en) Content recommendation method and content recommendation system
CN110427560B (en) Model training method applied to recommendation system and related device
CN110717098B (en) Meta-path-based context-aware user modeling method and sequence recommendation method
CN109376222B (en) Question-answer matching degree calculation method, question-answer automatic matching method and device
CN110008973B (en) Model training method, method and device for determining target user based on model
CN109189921B (en) Comment evaluation model training method and device
CN113256367B (en) Commodity recommendation method, system, equipment and medium for user behavior history data
CN113191838B (en) Shopping recommendation method and system based on heterogeneous graph neural network
Saleh Machine Learning Fundamentals: Use Python and scikit-learn to get up and running with the hottest developments in machine learning
US11574351B2 (en) System and method for quality assessment of product description
CN109189922B (en) Comment evaluation model training method and device
CN108133390A (en) For predicting the method and apparatus of user behavior and computing device
CN114783421A (en) Intelligent recommendation method and device, equipment and medium
CN111160000A (en) Composition automatic scoring method, device terminal equipment and storage medium
CN116764236A (en) Game prop recommending method, game prop recommending device, computer equipment and storage medium
CN113886697A (en) Clustering algorithm based activity recommendation method, device, equipment and storage medium
CN112434527A (en) Keyword determination method and device, electronic equipment and storage medium
CN113032676A (en) Recommendation method and system based on micro-feedback
CN117522519A (en) Product recommendation method, device, apparatus, storage medium and program product
CN115641179A (en) Information pushing method and device and electronic equipment
CN110555719A (en) commodity click rate prediction method based on deep learning
CN112417866A (en) Method and device for determining word segmentation recommendation value, electronic equipment and storage medium
CN115345669A (en) Method and device for generating file, storage medium and computer equipment
CN112632275B (en) Crowd clustering data processing method, device and equipment based on personal text information
CN115456708A (en) Recommendation model training method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant