CN113536785A - Text recommendation method, intelligent terminal and computer readable storage medium - Google Patents

Text recommendation method, intelligent terminal and computer readable storage medium

Publication number: CN113536785A (application CN202110660904.7A); granted as CN113536785B
Authority: CN (China); original language: Chinese (zh)
Inventors: 刁永祥, 吴飞, 张浩宇, 王玉杰, 方四安, 徐承, 柳林
Applicant/Assignee: Hefei Ustc Iflytek Co ltd
Legal status: Active (granted)

Classifications

    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F16/335 Filtering based on additional data, e.g. user or group profiles
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G06F16/958 Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
    • G06F40/151 Transformation (use of codes for handling textual entities)
    • G06F40/216 Parsing using statistical methods
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates
    • G06F40/30 Semantic analysis
Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Databases & Information Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Probability & Statistics with Applications (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses a text recommendation method, an intelligent terminal and a computer-readable storage medium, wherein the text recommendation method includes: acquiring a first word segmentation sequence corresponding to a text to be selected, and a historical text sequence; performing feature extraction processing on the first word segmentation sequence and on the second word segmentation sequence corresponding to each historical text in the historical text sequence, to obtain a to-be-selected text feature vector and historical feature vectors, where the to-be-selected text feature vector includes the position information of each word in the text to be selected, and the historical feature vectors include the position information of each word in each second word segmentation sequence; determining a prediction probability value for the text to be selected based on the to-be-selected text feature vector and a user interest feature representation obtained by interest feature extraction on the historical feature vectors; and recommending text to the user according to the prediction probability value. With this scheme, the key factor of position information is fully considered when text features are extracted, and text recommendation can be performed in line with the user's reading habits and points of interest.

Description

Text recommendation method, intelligent terminal and computer readable storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to a text recommendation method, an intelligent terminal, and a computer-readable storage medium.
Background
In recent years, with the development of information technology and the increasing popularization of the internet, the amount of information people receive keeps growing, producing a degree of information overload. In fields such as commodities and news, recommendation systems are a means of addressing information overload: text recommendation helps users and readers find text content of interest in massive amounts of text information, shortening the user's search time and improving the reading experience of readers and users.
Take the recommendation of news information as an example. Although news recommendation has developed rapidly in recent years, many problems remain to be solved. For instance, accurately modeling users' interest in news is still a challenge: user interests are usually diverse and evolve dynamically over time, and mining and modeling them requires a large amount of user feedback behavior. However, news platforms often lack explicit user feedback, and even implicit feedback is quite sparse for many users.
At present, news recommendation systems, classified by method, generally comprise content-based recommendation, news recommendation adopting collaborative filtering, and hybrid models combining the advantages of both. A content-based recommendation algorithm mainly recommends news similar to the user's preferences, based on the user's news-reading interests; a recommendation algorithm based on collaborative filtering recommends news to the user based on the interests and hobbies of similar crowds; and a hybrid algorithm recommends news to the user by jointly considering the user's preferences and the interest points of similar crowds.
However, in the above approaches, position information is usually missing from the extracted news content, so accurate feature information of the news cannot be obtained and the user's interest points cannot be accurately extracted, making it impossible to screen out representative texts.
Disclosure of Invention
The technical problem mainly solved by the application is to provide a text recommendation method, an intelligent terminal and a computer-readable storage medium, which introduce word-position coding within the document to assist the semantic representation of each individual text to be selected, so that the user's interest points are comprehensively considered and representative texts are screened out.
In order to solve the above problem, a first aspect of the present application provides a text recommendation method, where the text recommendation method includes: acquiring a first word segmentation sequence corresponding to a text to be selected, and a historical text sequence; performing feature extraction processing on the first word segmentation sequence to obtain a feature vector of the text to be selected, the feature vector of the text to be selected including the position information of each word in the text to be selected; performing feature extraction processing on the second word segmentation sequence corresponding to each historical text in the historical text sequence to obtain historical feature vectors, the historical feature vectors including the position information of each word in each second word segmentation sequence; performing interest feature extraction processing on the historical feature vectors to obtain a user interest feature representation corresponding to the historical text sequence; determining a prediction probability value corresponding to the text to be selected based on the to-be-selected text feature vector and the user interest feature representation; and recommending text to the user according to the prediction probability value.
The method for extracting the feature of the first word segmentation sequence to obtain the feature vector of the text to be selected comprises the following steps: converting the first word segmentation sequence to obtain a text representation to be selected and a position coding tensor; processing the representation of the text to be selected and the position coding tensor corresponding to the representation of the text to be selected to obtain the semantic representation to be selected corresponding to the text to be selected; and obtaining a feature vector of the text to be selected corresponding to the text to be selected based on the semantic representation to be selected.
The method for converting the first word segmentation sequence to obtain the representation of the text to be selected and the position coding tensor comprises the following steps: performing vector conversion on the text to be selected to obtain the representation of the text to be selected; and obtaining a position coding tensor based on the candidate text representation.
The obtaining of the position coding tensor based on the candidate text representation comprises the following steps: and sequentially carrying out position coding operation on each element in the to-be-selected text representation by utilizing the dimension of the to-be-selected text representation and the position of each element in the to-be-selected text representation to obtain a position coding tensor.
The method for obtaining the feature vector of the text to be selected based on the semantic representation to be selected comprises the following steps: acquiring the relevant importance degree value between every two elements in the semantic representation to be selected, and the importance degree value of each element in the text representation to be selected; obtaining the attention vector representation corresponding to the semantic representation to be selected based on the relevant importance degree values; and obtaining the feature vector of the text to be selected based on the attention vector representation and the importance degree values.
The method for obtaining the relevant importance degree value between every two elements in the semantic representation to be selected comprises the following steps: processing the semantic representation to be selected through the trained first network model to obtain the relevant importance degree value between every two elements in the semantic representation to be selected; wherein the relevant importance degree value of a first element relative to a second element in the semantic representation to be selected is the product of the transposed vector of the first element, a first weight coefficient matrix obtained after the first network model is trained, and the second element.
Obtaining attention vector representation corresponding to the semantic representation to be selected based on the relevant importance degree value, wherein the attention vector representation comprises the following steps: carrying out normalization operation on the relevant importance degree values to obtain relevant importance degree representation between every two elements in the semantic representation to be selected; and carrying out a first feature extraction operation on the related importance degree representation through a first network model to obtain an attention vector representation.
The method for obtaining the feature vector of the text to be selected based on the attention vector representation and the importance degree value comprises the following steps of: performing second feature extraction operation on the attention vector representation and the importance degree value through the first network model to obtain an attention mechanism weight coefficient; and carrying out first aggregation operation on the attention vector representation and the attention mechanism weight coefficient to obtain a text feature vector to be selected.
Wherein a plurality of historical texts form the historical text sequence in order of reading time, and performing interest feature extraction on the historical feature vectors to obtain the user interest feature representation corresponding to the historical text sequence comprises the following steps: performing an interest feature extraction operation on the historical feature vectors to obtain a weight coefficient value corresponding to each historical text; and obtaining the user interest feature representation based on the historical feature vectors and each weight coefficient value.
Wherein, the extracting operation of the interest feature is performed on the historical feature vector to obtain the weight coefficient value corresponding to each historical text, and the extracting operation comprises the following steps: performing interest feature extraction operation on the reading time and the historical feature vector of each historical text through the trained second network model to obtain a hidden vector output value corresponding to each historical text; and performing importance degree feature extraction operation on each hidden vector output value through a second network model to obtain a weight coefficient value corresponding to each historical text.
The step of obtaining the user interest feature representation based on the historical feature vector and each weight coefficient value comprises the following steps: and performing a second aggregation operation on the historical feature vector and each weight coefficient value to obtain a user interest feature representation.
Wherein there are a plurality of texts to be selected, and recommending texts to the user according to the prediction probability values comprises the following steps: recommending to the user the texts to be selected whose corresponding prediction probability values exceed a set threshold; or ranking the texts to be selected according to the prediction probability value corresponding to each text to be selected, and recommending a set number of the top-ranked texts to be selected to the user.
The text recommendation method is realized through a trained text recommendation network model, and the method for training the text recommendation network model comprises the following steps: inputting a read text sequence sample and an unread text sequence sample into a first preset network model, and performing text recommendation on the read and unread text sequence samples through the first preset network model, to obtain a first prediction probability value and a second prediction probability value respectively; obtaining, based on the first and second prediction probability values, a posterior probability corresponding to the read text sequence sample and its negative likelihood function; and continuously optimizing the set parameters in the first preset network model through the negative likelihood function, thereby establishing the text recommendation network model.
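By way of illustration only, such a training objective might be realized as in the following sketch, which assumes each read (positive) sample is scored against K unread (negative) samples and takes the posterior probability of the read sample as a softmax over the K+1 predicted scores; all names are hypothetical:

    import numpy as np

    def negative_log_likelihood(pos_scores, neg_scores):
        """NCE-style loss: posterior probability of each read text is a softmax
        over its score and the scores of K unread texts; loss is its negative log."""
        # pos_scores: (N,) first prediction values; neg_scores: (N, K) second values
        logits = np.concatenate([pos_scores[:, None], neg_scores], axis=1)
        logits -= logits.max(axis=1, keepdims=True)      # numerical stability
        log_post = logits[:, 0] - np.log(np.exp(logits).sum(axis=1))
        return -log_post.mean()                          # minimized during training

    # e.g. loss = negative_log_likelihood(np.array([2.1]),
    #                                     np.array([[0.3, -0.5, 1.0, 0.1]]))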
In order to solve the above problem, a second aspect of the present application provides an intelligent terminal, wherein the intelligent terminal includes a memory and a processor coupled to each other, the memory stores program data, and the processor is configured to execute the program data to implement the text recommendation method according to any one of the above items.
In order to solve the above problem, a third aspect of the present application provides a computer-readable storage medium, wherein the computer-readable storage medium stores program data that can be executed to implement the text recommendation method as described in any one of the above.
The invention has the beneficial effects that: different from the prior art, the text recommendation method of the application, after acquiring the first word segmentation sequence corresponding to the text to be selected and the historical text sequence, performs feature extraction on the first word segmentation sequence to obtain the to-be-selected text feature vector, and performs feature extraction on the second word segmentation sequence corresponding to each historical text in the historical text sequence to obtain the historical feature vectors. The to-be-selected text feature vector includes the position information of each word in the text to be selected, and the historical feature vectors include the position information of each word in each second word segmentation sequence, so that when text content features are extracted, the key factor of position information is fully considered, effectively avoiding the problem that losing position information from the text sequence may bias or even invert the meaning of the corresponding sentence. Interest feature extraction is then performed on the historical feature vectors to obtain the user interest feature representation corresponding to the historical text sequence, the prediction probability value corresponding to the text to be selected is determined based on the to-be-selected text feature vector and the user interest feature representation, and text is recommended to the user according to the prediction probability value. The user's reading preferences and points of interest can thus be effectively determined from the obtained prediction probability values, and representative texts can be selected and recommended to the user, ensuring recommendation accuracy.
Drawings
FIG. 1 is a schematic flow chart diagram illustrating an embodiment of a method for text recommendation of the present application;
FIG. 2 is a schematic flow chart of one embodiment of S12 of FIG. 1;
FIG. 3 is a schematic flow chart of one embodiment of S121 in FIG. 2;
FIG. 4 is a schematic flow chart of one embodiment of S123 of FIG. 2;
FIG. 5 is a schematic flow chart of one embodiment of S1232 of FIG. 4;
FIG. 6 is a schematic flow chart of one embodiment of S1233 of FIG. 4;
FIG. 7 is a schematic flow chart of one embodiment of S14 of FIG. 1;
FIG. 8 is a schematic flow chart of one embodiment of S141 of FIG. 7;
FIG. 9 is a schematic flowchart of a specific application scenario of feature extraction performed on a first word segmentation sequence and a historical text sequence corresponding to a text to be selected according to the present application;
FIG. 10 is a schematic flow chart illustrating a specific application scenario for obtaining a user interest feature representation corresponding to a historical text sequence according to the present application;
FIG. 11 is a diagram illustrating an embodiment of a method for training a text recommendation network model according to the present application;
FIG. 12 is a schematic structural diagram of an embodiment of an intelligent terminal according to the present application;
FIG. 13 is a schematic structural diagram of an embodiment of a computer-readable storage medium of the present application.
Detailed Description
The following describes in detail the embodiments of the present application with reference to the drawings attached hereto.
In the following description, for purposes of explanation and not limitation, specific details are set forth such as particular system structures, interfaces, techniques, etc. in order to provide a thorough understanding of the present application.
The terms "system" and "network" are often used interchangeably herein. The term "and/or" herein is merely an association describing an associated object, meaning that three relationships may exist, e.g., a and/or B, may mean: a exists alone, A and B exist simultaneously, and B exists alone. In addition, the character "/" herein generally indicates that the former and latter related objects are in an "or" relationship. Further, the term "plurality" herein means two or more than two.
Referring to fig. 1, fig. 1 is a schematic flowchart illustrating an embodiment of a text recommendation method according to the present application.
Specifically, the text recommendation method in this embodiment may include the following steps:
s11: and acquiring a first word segmentation sequence and a historical text sequence corresponding to the text to be selected.
The text to be recommended according to the present application may be any text of a single type, such as news, patent documents, professional papers, movie reviews, topic reviews, and the like. It can be understood that different users usually have different reading preferences and interests even for texts of the same type, which are reflected in the browsing records accumulated as the user reads. Feature extraction can therefore be performed on a user's browsing records through a deep learning network model or a specific algorithm, so that personalized recommendation can be made for the user, helping the user find text content of interest in massive text information, shortening the user's search time, and improving the reading experience of readers and users.
Specifically, a history text which is read by a user and a text to be selected which is recommended to the user for reading are obtained, and after word segmentation processing is respectively carried out on the history text and the text to be selected, a history text sequence and a first word segmentation sequence corresponding to the text to be selected are obtained.
Optionally, there may be one or more texts to be selected, each text to be selected corresponding to one first word segmentation sequence; the historical text sequence may likewise be the second word segmentation sequence corresponding to a single historical text, or an aggregate sequence formed by the plurality of second word segmentation sequences corresponding to a plurality of historical texts, ordered by historical reading time.
It can be understood that the first word segmentation sequence and the second word segmentation sequence are text sequences with a certain sequence and composed of a plurality of word vectors, so that corresponding text sequences can be obtained after word segmentation processing is performed on original text contents.
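For illustration, obtaining the two kinds of word segmentation sequences might look like the following sketch, assuming a Chinese word segmenter such as jieba is available (any comparable tokenizer would do; the variable names are hypothetical):

    import jieba  # a common Chinese word-segmentation library, assumed here

    candidate_text = "..."          # text to be selected, content elided
    history_texts = ["...", "..."]  # historical texts, ordered by reading time

    # first word segmentation sequence for the text to be selected
    first_seq = jieba.lcut(candidate_text)
    # one second word segmentation sequence per historical text
    history_seqs = [jieba.lcut(t) for t in history_texts]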
S12: and performing feature extraction processing on the first word segmentation sequence to obtain a feature vector of the text to be selected.
After the first word segmentation sequence is obtained, it is further converted, for example with a pre-trained language model, a Global Vectors (GloVe) algorithm, or any other reasonable vector conversion algorithm, so as to perform feature extraction on the first word segmentation sequence and obtain the corresponding feature vector of the text to be selected.
Further, position coding is performed on each element in the to-be-selected text representation based on the position information of each element in the to-be-selected text representation, for example, a sine and cosine position coding algorithm, a relative position coding algorithm, or any other reasonable position coding algorithm is adopted to perform position coding on the to-be-selected text representation, so as to obtain a position coding tensor corresponding to the to-be-selected text sequence.
The feature vector of the text to be selected comprises position information of each word in the text to be selected. For example, in the feature extraction processing of the first word segmentation sequence, the method further includes performing position coding on the first word segmentation sequence based on position information of each element in the vector representation corresponding to the first word segmentation sequence, so as to obtain a position coding tensor corresponding to the first word segmentation sequence, and performing extraction to obtain a feature vector of the text to be selected, so that the feature vector of the text to be selected includes the position information of each word in the text to be selected.
S13: and performing feature extraction processing on the second word segmentation sequence corresponding to each historical text in the historical text sequence to obtain a historical feature vector.
Understandably, the way of extracting the features of the second word segmentation sequences corresponding to each historical text after acquiring the historical text sequences can be the same as the way of extracting the features of the first word segmentation sequences, so that the second word segmentation sequences corresponding to each historical text are sequentially subjected to feature extraction processing, and after full connection, historical feature vectors corresponding to the historical text sequences are obtained.
Therefore, the historical feature vectors obtained in the above manner also include the position information of each word in each second word segmentation sequence, so that the user's reading preferences and points of interest can be obtained more accurately from the features extracted from the historical text sequence, and the more suitable unread texts among the plurality of texts to be selected can be recommended to the user.
It can be understood that, for a text sequence, the position and arrangement order of each word in a sentence, i.e. the position of each word in the text sequence, are important to understanding the text content: they are not only components of the sentence's grammatical structure but also important factors in how people understand the meaning of the sentence. If a word occupies a different position or order in the sentence, the connotation of the whole sentence may shift or even be reversed. Therefore, when extracting feature information from text content, word-order information recording the position of each word must be introduced.
Therefore, the to-be-selected text feature vector and the historical feature vectors obtained in turn by the above method contain the corresponding position information, effectively avoiding the inaccurate semantic representation that results when the position information of the words in the text sequence is not considered and word-order information is lost.
S14: and performing interest feature extraction processing on the historical feature vector to obtain a user interest feature representation corresponding to the historical text sequence.
Further, interest feature extraction processing is performed on the historical feature vectors, for example by applying a recurrence or convolution processing to the historical feature vectors through a suitable recurrent network model, in combination with the corresponding historical reading times, to obtain the user interest feature representation corresponding to the historical text sequence.
S15: and determining a prediction probability value corresponding to the text to be selected based on the feature vector of the text to be selected and the user interest feature representation.
Specifically, vector multiplication is carried out on the user interest feature representation and the text feature vector to be selected, and the prediction probability value corresponding to the text sequence to be selected is determined based on the product of the user interest feature representation and the text feature vector to be selected; or, in another embodiment, the product of the two is further normalized to determine the result of the normalization process as the predicted probability value.
Understandably, the prediction probability value is a probability value corresponding to the interest degree of the user to the text to be selected, and the larger the prediction probability value is, the more the user is interested in the corresponding text to be selected.
S16: and recommending texts to the user according to the predicted probability value.
Further, once the prediction probability values corresponding to the texts to be selected are obtained, the corresponding user's reading preferences can be derived from them, and the texts to be selected with larger prediction probability values, for example those exceeding a set threshold, can be pushed to the user; or the texts to be selected can be ranked by prediction probability value, and those at the top of the ranking pushed to the user for reading.
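A minimal sketch of S15-S16, under the assumption that the prediction probability is the sigmoid-normalized dot product of the user interest feature representation and each to-be-selected text feature vector (the normalization choice and all names are assumptions):

    import numpy as np

    def recommend(user_vec, cand_vecs, threshold=0.5, top_k=None):
        """Score candidates against the user interest representation (S15),
        then filter by a set threshold or return the top-k ranked ones (S16)."""
        scores = cand_vecs @ user_vec              # vector multiplication
        probs = 1.0 / (1.0 + np.exp(-scores))      # assumed normalization
        if top_k is not None:
            order = np.argsort(-probs)[:top_k]     # highest probabilities first
            return order, probs[order]
        keep = np.where(probs > threshold)[0]
        return keep, probs[keep]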
According to the above scheme, position information is introduced when obtaining the feature vectors of the text sequences, so that this key factor is fully considered when text content features are extracted. This effectively avoids the problem that losing position information from a text sequence may bias or even invert the meaning of the corresponding sentence, allows more effective user interest feature representations and to-be-selected text feature vectors to be extracted, and gives the corresponding prediction probability values higher confidence, making recommendation more accurate: the user's reading preferences and points of interest can be obtained from the prediction probability values, and representative texts can be screened out and recommended to the user.
Referring to fig. 2, fig. 2 is a schematic flowchart illustrating an embodiment of S12 in fig. 1. In one embodiment, the text recommendation method of the present application further includes some more specific steps in addition to the above-mentioned steps S11-S16. Specifically, the step S12 may further include the following steps:
s121: and performing conversion processing on the first word segmentation sequence to obtain a text representation to be selected and a position coding tensor.
After the first word segmentation sequence is obtained, it is further converted, for example with a pre-trained language model, a Global Vectors (GloVe) algorithm, or any other reasonable vector conversion algorithm, into the corresponding to-be-selected text representation. It will be understood that each element of the to-be-selected text representation is the vector representation of the word at the corresponding position in the first word segmentation sequence.
Further, position coding is performed on each element in the to-be-selected text representation based on the position information of each element in the to-be-selected text representation, for example, a sine and cosine position coding algorithm, a relative position coding algorithm, or any other reasonable position coding algorithm is adopted to perform position coding on the to-be-selected text representation, so as to obtain a position coding tensor corresponding to the to-be-selected text sequence.
S122: and processing the representation of the text to be selected and the position coding tensor corresponding to the representation of the text to be selected to obtain the semantic representation to be selected corresponding to the text to be selected.
Further, each element in the text representation to be selected is added to an element at a corresponding position in the position coding tensor, that is, the text representation to be selected and the position coding tensor are added to obtain a semantic representation to be selected corresponding to the text sequence to be selected.
It can be understood that, for a text sequence, the position of each word in a sentence, i.e. the arrangement order of the words in the sentence, is important to understanding the text content: the positions are not only components of the sentence's grammatical structure but also important factors in how people understand the meaning of the sentence. If a word occupies a different position or order in the sentence, the connotation of the whole sentence may shift or even be reversed. Therefore, when extracting feature information from text content, word-order information recording the position of each word must be introduced.
Therefore, when semantic representation is performed on the first word segmentation sequence corresponding to the text to be selected, corresponding position coding information is introduced, i.e. the to-be-selected text representation is added to the position coding tensor to obtain the to-be-selected semantic representation corresponding to the to-be-selected text sequence. This effectively avoids the inaccurate semantic representation that results when the position information of the words in the text sequence is not considered.
S123: and obtaining a feature vector of the text to be selected corresponding to the text to be selected based on the semantic representation to be selected.
Further, extracting features of the semantic representation to be selected, for example, inputting the semantic representation to be selected into a trained multi-head attention mechanism network or any other reasonable trained deep learning network model to extract features of the semantic representation to be selected; or a set weight matrix is introduced, and corresponding function operation is carried out on the semantic representation to be selected so as to convert the semantic representation to be selected into a text feature vector to be selected corresponding to the text sequence to be selected.
Referring to fig. 3, fig. 3 is a schematic flowchart illustrating an embodiment of S121 in fig. 2. In an embodiment, the step S121 may further include:
s1211: and carrying out vector conversion on the first word segmentation sequence to obtain the representation of the text to be selected.
For ease of understanding, suppose the text recommendation to be performed is specifically news recommendation, and one of the texts to be selected participating in recommendation is a news item D. Note that news D may specifically be the first word segmentation sequence acquired in S11, or one of the second word segmentation sequences corresponding to the historical text sequence; the feature extraction performed on the first and second word segmentation sequences is the same.
Here, it is assumed that news D is composed of M words (M a positive integer), i.e. each element of news D is the identification code corresponding to a word in it:

D = [w_1, w_2, w_3, ..., w_M]

Each word in news D is obtained by performing word segmentation on the corresponding news text, such as one or more of the title, abstract and body of the news. The method for other types of text to be recommended is similar and is not repeated here.

After news D is obtained, a pre-trained language model is further adopted to convert news D into the corresponding candidate news representation L. Each element of the candidate news representation L is the vector representation of the word at the corresponding position in news D:

L = [x_1, x_2, x_3, ..., x_M]
s1212: and obtaining the position coding tensor based on the candidate text representation.
Further, position coding is performed on the candidate news representation L based on the position information of each of its elements, adopting a sin-cos (sine-cosine) position coding scheme, to obtain the position coding tensor PE corresponding to news D.
In an embodiment, the step S1212 may further include: and sequentially carrying out position coding operation on each element in the to-be-selected text representation by utilizing the dimension of the to-be-selected text representation and the position of each element in the to-be-selected text representation to obtain a position coding tensor.
Further, according to the dimension of the to-be-selected news representation L and the position of each element in the to-be-selected news representation L, position coding operation is sequentially performed on each element in the to-be-selected news representation L through the pre-training language model, and a position coding tensor PE is obtained.
The position code of the candidate news representation L may specifically be calculated as:

PE(pos, 2n) = sin(pos / 10000^(2n/d_model))
PE(pos, 2n+1) = cos(pos / 10000^(2n/d_model))

where 2n and 2n+1 index the even and odd dimensions of the code, pos denotes the position of each element in the candidate news representation L, and d_model is the vector dimension in the pre-trained language model, i.e. the dimension with which each element of the candidate news representation L is modeled; specifically, d_model may be 300 or 250, etc.

Correspondingly, the candidate semantic representation E_1 carrying the position vectors is calculated based on the candidate news representation L and the position coding tensor PE, namely:

E_1 = [e_1, e_2, e_3, ..., e_M]

where e_j = x_j + PE_j (j = 1, 2, 3, ..., M).
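For illustration, a minimal numpy sketch of this sin-cos position coding and of the addition e_j = x_j + PE_j might look as follows (the function and variable names are illustrative, not from the patent):

    import numpy as np

    def position_encoding(M, d_model=300):
        """Sinusoidal position codes: sin on even dimensions, cos on odd ones."""
        pos = np.arange(M)[:, None]           # positions 0..M-1
        dim = np.arange(d_model)[None, :]     # dimension indices
        angle = pos / np.power(10000.0, (2 * (dim // 2)) / d_model)
        return np.where(dim % 2 == 0, np.sin(angle), np.cos(angle))  # (M, d_model)

    L = np.random.randn(8, 300)               # candidate news representation, M = 8
    E1 = L + position_encoding(*L.shape)      # candidate semantic representation E_1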
Referring to fig. 4, fig. 4 is a schematic flowchart illustrating an embodiment of S123 in fig. 2. In an embodiment, the S123 may specifically include:
s1231: and acquiring the related importance degree value between every two elements in the semantic representation to be selected and the importance degree value of each element in the text representation to be selected.
Understandably, the inherent relevance between words in a news item is helpful for its semantic expression; in other words, some keywords in the same news item may have important inherent relevance to multiple words. Based on this, a multi-head attention mechanism is needed to extract the inherent relevance between every two words in the news.
Specifically, the relevant importance degree value between every two elements of the candidate semantic representation E_1 is acquired; for example, a multi-head attention mechanism is used to extract, for each element of E_1, the relevant importance degree values of that element with respect to the other elements, so as to obtain the corresponding relevant importance degree values.
S1232: and obtaining attention vector representation corresponding to the semantic representation to be selected based on the relevant importance degree.
Further, based on the relevant importance degree values obtained for the candidate semantic representation E_1, for example by using a multi-head attention mechanism, a softmax (logistic regression) normalization operation is performed on each relevant importance value to obtain the attention vector representation corresponding to the semantic representation to be selected.
S1233: and obtaining the feature vector of the text to be selected based on the attention vector representation and the importance degree.
Still further, based on the importance degree value of each element in the attention vector representation and the to-be-selected text representation, for example, the importance degree values of each element in the attention vector representation and the to-be-selected text representation are vector-connected through a multi-head attention mechanism to obtain a corresponding to-be-selected text feature vector.
In an embodiment, the S1231 may further include: and processing the semantic representation to be selected through the trained first network model to obtain a relevant importance degree value between every two elements in the semantic representation to be selected.
The first network model may be a multi-head attention mechanism network model, and the relevant importance degree value of a first element of the candidate semantic representation E_1 relative to a second element is specifically the product of the transposed vector of the first element, a first weight coefficient matrix obtained after the first network model is trained, and the second element.

Understandably, the first and second elements may each be any element of the candidate semantic representation E_1, and the first element may also be the same as the second element.
Specifically, assume that the i-th word and the j-th word in news D are vector-coded, i.e. their semantic representations are 1 x 300 vectors e_i and e_j, and that two 300 x 300 weight coefficient matrices obtainable through training and learning are predefined; that is, the initial multi-head attention mechanism network model corresponding to the first network model is trained to obtain a first weight coefficient matrix Q_k^w and a second weight coefficient matrix V_k^w of the first network model. Here w is only a marker symbol: it does not participate in the calculation and merely distinguishes different text vector representations. Passing through the k-th attention mechanism network of the first network model (k a positive integer, up to the number of attention mechanism networks contained in the multi-head attention mechanism network model), a set matrix multiplication operation is performed on e_i and e_j to obtain the related importance degree R (a scalar) between the i-th word and the j-th word:

R_{i,j}^k = e_i^T · Q_k^w · e_j

where e_i^T is the transpose of the word vector corresponding to the i-th word, i.e. the transposed vector of the i-th semantic representation in the candidate semantic representation E_1; e_j is the word vector corresponding to the j-th word, i.e. the j-th semantic representation in E_1; and Q_k^w denotes the predefined learnable weight coefficient of the k-th self-attention mechanism network.
Referring to fig. 5, fig. 5 is a schematic flowchart illustrating an embodiment of S1232 in fig. 4. In an embodiment, the S1232 may further include:
s12321: and carrying out normalization operation on the related importance degree value to obtain a related importance degree representation between every two elements in the semantic representation to be selected.
Considering that the text sequence is composed of M words, in order to compare the correlation between the ith word and the M words and further measure which words have stronger correlation with the ith word, softmax normalization operation needs to be carried out on the correlation importance degree value between the ith word and the M words so as to calculate the semantic expression E to be selected1The degree of importance of the correlation between the ith word and the jth word
Figure BDA0003115228180000134
(normalized weight coefficient):
Figure BDA0003115228180000135
by the above process, the correlation coefficient between the ith word and the M words in the news D, i.e. the semantic representation E to be selected can be calculated in sequence1The relative importance between every two elements in (1) represents:
Figure BDA0003115228180000136
s12322: and carrying out a first feature extraction operation on the related importance degree representation through a first network model to obtain an attention vector representation.
Further, a first feature extraction operation is carried out on the related importance degree representation through a first network model, so that a vector representation of the ith word after information interaction with news D sequence global information can be obtained, namely the attention vector representation of the ith word learned through a kth attention mechanism network
Figure BDA0003115228180000141
(Self-Attention):
Figure BDA0003115228180000142
Wherein the content of the first and second substances,
Figure BDA0003115228180000143
the method is used for enhancing the capability of the network model for extracting features so as to improve the generalization capability of the network model.
Further, the h (hyper-parameter, which is generally a reasonable value such as 16, 12, or 10 according to the experimental result) attention mechanism networks are vector-connected by vector connection to obtain the attention vector representation about the ith word
Figure BDA0003115228180000144
Figure BDA0003115228180000145
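As an illustrative sketch only, one such attention head and the h-head concatenation could be written as follows in numpy, with Q and V standing for a head's first and second weight coefficient matrices (this notation is adopted here, not taken from the original figures):

    import numpy as np

    def attention_head(E1, Q, V):
        """One self-attention head: bilinear relevance R[i, j] = e_i^T Q e_j,
        row-wise softmax, then a value-projected weighted sum of the elements."""
        R = E1 @ Q @ E1.T                            # (M, M) relevance scores
        R -= R.max(axis=1, keepdims=True)            # numerical stability
        alpha = np.exp(R) / np.exp(R).sum(axis=1, keepdims=True)
        return (alpha @ E1) @ V.T                    # h_i = V * sum_j alpha_ij e_j

    def multi_head(E1, heads):
        """Concatenate the outputs of the h attention heads per word."""
        return np.concatenate([attention_head(E1, Q, V) for Q, V in heads], axis=1)

    d = 300
    heads = [(np.random.randn(d, d), np.random.randn(d, d)) for _ in range(4)]
    H = multi_head(np.random.randn(8, d), heads)     # shape (8, 4 * d)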
Referring to fig. 6, fig. 6 is a schematic flowchart illustrating an embodiment of S1233 of fig. 4. In an embodiment, the S1233 may further include:
s12331: and performing second feature extraction operation on the attention vector representation and the importance degree value through the first network model to obtain an attention mechanism weight coefficient corresponding to the attention vector representation.
Understandably, the importance of each word will often vary in the same news. Therefore, it is also necessary to adopt Addi in the first network modelthe method comprises the following steps of (1) evaluating the importance degree of each word of a news text by using a positive-Attention mechanism network (additive Attention mechanism network), namely performing second feature extraction operation on the Attention vector representation and the importance degree value to obtain the Attention mechanism weight coefficient of the ith word
Figure BDA0003115228180000146
Figure BDA0003115228180000147
Figure BDA0003115228180000148
Wherein the content of the first and second substances,
Figure BDA0003115228180000149
Ow、owthe first weight coefficient matrix, the second weight coefficient matrix and the first bias matrix are respectively predefined first network models which can be obtained through training and learning, and are obtained based on the importance degree value of each element in the text representation to be selected, and the w just represents a marker and does not participate in the calculation process.
S12332: and carrying out first aggregation operation on the attention vector representation and the attention mechanism weight coefficient to obtain a text feature vector to be selected.
Further, performing a first aggregation operation on the attention vector representation and the attention mechanism weight coefficient to extract features of a two-layer attention network, and performing final candidate text feature vector D of news D1Expressed in the following form:
Figure BDA00031152281800001410
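A minimal sketch of this additive attention and the first aggregation, under the notation above (the parameter shapes and names are assumptions):

    import numpy as np

    def additive_attention_pool(H, O, o, q):
        """Score each word vector with q^T tanh(O h + o), softmax-normalize,
        and return the weighted sum as the candidate text feature vector D_1."""
        a = np.tanh(H @ O.T + o) @ q          # a_i = q^T tanh(O h_i + o)
        alpha = np.exp(a - a.max()) / np.exp(a - a.max()).sum()
        return alpha @ H                      # D_1 = sum_i alpha_i h_i

    M, dh, da = 8, 1200, 200                  # e.g. dh = h * d after concatenation
    H = np.random.randn(M, dh)                # multi-head output, one row per word
    D1 = additive_attention_pool(H, np.random.randn(da, dh),
                                 np.random.randn(da), np.random.randn(da))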
referring to fig. 7, fig. 7 is a schematic flowchart illustrating an embodiment of S14 in fig. 1. In an embodiment, the S14 may specifically include:
s141: and performing interest feature extraction operation on the historical feature vectors to obtain a weight coefficient value corresponding to each historical text.
In this embodiment, the history text sequence includes a plurality of history texts, and the plurality of history texts form the history text sequence in order of reading time. By the same way of extracting the features of the news D, the feature extraction can be sequentially performed on the second word segmentation sequence corresponding to each historical text in the historical text sequence to obtain the corresponding historical feature vector. Please refer to the corresponding process of feature extraction of the first word segmentation sequence corresponding to the text to be selected, which is not described herein again.
Generally, a user's points of interest change over time and can be divided into long-term and short-term concerns. Taking news as an example: after converting the news texts in the user's reading history into vectors, a self-attention mechanism network is used to extract news content features and analyze the degree of correlation between the words within a single news article, and an additive-attention mechanism network is used to screen representative words to represent the semantic information of the news. On top of the semantic representation of each news item, a self-attention plus additive-attention mechanism is further adopted to screen important news items from the plurality of news items to represent user interest. Analysis shows that such news information essentially reflects long-term interest features: news in a field the user has followed for a long time is necessarily the majority, so news screened from the historical records is necessarily related to that field, while news in fields the user has followed only recently is difficult to screen out. Therefore, after each news semantic feature vector is obtained, news reflecting long-term and short-term concerns needs to be screened out separately.
Specifically, an interest feature extraction operation is further performed on the historical feature vectors; for example, the importance degree of each element of the historical feature vectors is extracted in turn through any reasonable recurrent neural network, such as a Gated Recurrent Unit (GRU) network or a long short-term memory network, so as to obtain the weight coefficient value corresponding to each historical text in turn.
S142: and obtaining user interest feature representation based on the historical feature vector and each weight coefficient value.
Further, the historical feature vector and each corresponding weight coefficient value are subjected to aggregation operation processing, so that the interest feature expression of the user can be obtained.
In a specific embodiment, for convenience of description, the above historical text sequence may specifically be expressed as:

C = {c_1, c_2, ..., c_S}

where S denotes the total number of historical texts included in the historical text sequence; taking the historical texts to be read news items, S is the size of the news collection. The elements of the historical text sequence C are ordered by reading time, e.g. the reading time of c_1 is earlier than that of c_2, the reading time of c_2 is earlier than that of c_3, and so on. In the same manner as the feature extraction for news D above, feature extraction may be performed on each historical text in the historical text sequence in turn, and the results combined into the historical feature vector d:

d = {d_1, d_2, ..., d_S}
further, please refer to fig. 8, in which fig. 8 is a schematic flowchart illustrating an embodiment of S141 in fig. 7. In an embodiment, the S141 may specifically include:
s1411: and performing interest feature extraction operation on the reading time and the historical feature vector of each historical text through the trained second network model to obtain a hidden vector output value corresponding to each historical text.
The second network model may be the same neural network model as the first network model, or may be different neural network models, which is not limited herein. Element d in historical feature vector dtThe corresponding formula of the operation is as follows:
Figure BDA0003115228180000161
Figure BDA0003115228180000162
Figure BDA0003115228180000163
Figure BDA0003115228180000164
wherein, Wr、WzAnd W is a fifth weight coefficient matrix, a seventh weight coefficient matrix and a fifth weight coefficient matrix, wherein the fifth weight coefficient matrix, the seventh weight coefficient matrix and the seventh weight coefficient matrix are respectively obtained after the second network model is trained, and t corresponds to the reading time of a specific text in the historical text.
It is understood that each news corresponds to a moment when the user clicks and reads, and if there are S news, it is understood that there are S time points in total. Assuming that the current time is time t, i.e. the user concerns a certain news at time t, time t-1 corresponds to the time point of the last reading of the news, and the short-term user interest can be modeled as:
Figure BDA0003115228180000171
Figure BDA0003115228180000172
Figure BDA0003115228180000173
Figure BDA0003115228180000174
wherein r istRepresentational information reset gate for resetting the state of the last moment
Figure BDA0003115228180000175
Reset, ztAnd an information update door for information update of the current state. dtA news characteristic representation representing the current time outputs a news d at the current time after passing through the GRU networktHidden vector state of
Figure BDA0003115228180000176
As can be readily seen, ztThe larger the change in the current state is. Therefore, it is necessary to refer to the current state value as an output result of the current state. The short-term interest characteristic is a hidden vector output value corresponding to the current time t
Figure BDA0003115228180000177
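A minimal numpy sketch of one GRU step matching the reset-gate and update-gate equations above, applied to the historical feature vectors in reading-time order (the weight shapes are assumptions; bias terms are omitted for brevity):

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def gru_step(h_prev, d_t, Wr, Wz, W):
        """One GRU step: reset gate r_t, update gate z_t, candidate state,
        then the new hidden vector output value h_t."""
        x = np.concatenate([h_prev, d_t])
        r = sigmoid(Wr @ x)                                  # reset gate r_t
        z = sigmoid(Wz @ x)                                  # update gate z_t
        h_tilde = np.tanh(W @ np.concatenate([r * h_prev, d_t]))
        return (1 - z) * h_prev + z * h_tilde                # h_t

    dim = 300
    Wr, Wz, W = (np.random.randn(dim, 2 * dim) for _ in range(3))
    h = np.zeros(dim)
    hidden_states = []
    for d_t in np.random.randn(5, dim):       # 5 historical texts, by reading time
        h = gru_step(h, d_t, Wr, Wz, W)
        hidden_states.append(h)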
It can be understood that the extraction of the hidden vector output values fully considers the influence of reading-time order: the historical text sequence is ordered by reading time, so each element of the historical feature vector is processed in turn by the recurrence. The hidden vector output value therefore sufficiently reflects the influence of reading time on the processing result and can be understood as the user's short-term interest feature representation.
Specifically, because the relevance and importance of the historical texts corresponding to each hidden vector output value are different, a weight coefficient value corresponding to each historical text needs to be obtained based on each hidden vector output value. For example, each hidden vector output value is processed through a trained network model to obtain a weight coefficient value corresponding to each historical text.
It can be understood that the weight coefficient value measures how strongly the content of each historical text in the user's long-term reading record influences the user's reading interest.
S1412: performing an importance degree feature extraction operation on each hidden vector output value through the second network model to obtain the weight coefficient value corresponding to each historical text;
For the hidden vector output value h_i corresponding to moment i, the importance degree feature extraction operation is given by the following formulas:
a_i^n = (o^n)^T · tanh(O^n · h_i^n)

α_i^n = exp(a_i^n) / Σ_{j=1}^{S} exp(a_j^n)
wherein O^n and o^n are, respectively, a trained projection matrix and query vector of the second network model, and the superscript n is only a marker that does not participate in the calculation process.
Understandably, in a news sequence each news item has its intrinsic relevance, and their importance varies. An Additive-Attention network mechanism is therefore adopted in the second network model to evaluate the importance degree of each news item and thereby represent the user's long-term interest semantic features; the weight coefficient value of the i-th news item is the α_i^n computed above.
The importance weight coefficients of the S news items are obtained by performing the above calculation in sequence: α^n = {α_1^n, α_2^n, ..., α_S^n}.
Still further, in an embodiment, S142 may further include: performing a second aggregation operation on the historical feature vectors and each weight coefficient value to obtain the user interest feature representation;
wherein the second aggregation operation corresponds to the following formula:

u = Σ_{i=1}^{S} α_i^n · d_i
wherein S is the total number of historical texts included in the historical text sequence.
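Continuing the NumPy sketch above, the weight computation and the second aggregation operation might look as follows; the parameter names O_n and o_n mirror the symbols in the formulas and are assumptions, not the patent's literal parameterization:

```python
def user_interest_representation(d_seq, hidden_states, O_n, o_n):
    """Additive attention over the hidden vector output values, then a
    weighted sum of the historical feature vectors (second aggregation)."""
    scores = np.array([o_n @ np.tanh(O_n @ h) for h in hidden_states])
    scores -= scores.max()                          # softmax stability
    alpha = np.exp(scores) / np.exp(scores).sum()   # weight coefficient values
    u = sum(a * d for a, d in zip(alpha, d_seq))    # user interest feature
    return u, alpha
```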
Understandably, a reader who is enthusiastic about certain fields can be expected to remain interested in news events in those fields over the long term; at the same time, hot events may make the reader temporarily interested in news on a particular topic, so the reader's long-term and short-term interests should be reasonably distinguished and treated separately. Furthermore, reading habits are usually unidirectional in time: simply put, a reader does not typically go back to week-old news after reading news from two days ago, and this should be taken into account when modeling the reader's reading interest.
In this embodiment, on the basis of the feature representation of each news item, a GRU is used to further analyze the reader's reading history and to establish the relationship between long-term and short-term historical news information, representing the transition of the reader's attention over a period of time. This more effectively simulates user habits in real user scenarios and yields a user interest feature representation that better embodies the user's preferences.
Further, in an embodiment, step S16 may specifically include: recommending to the user those texts to be selected whose corresponding prediction probability values exceed a set threshold.
Understandably, the prediction probability value reflects the user's degree of interest in the corresponding text to be selected; by setting a specific threshold, the texts of higher interest to the user can be effectively screened out and recommended to the user.
Optionally, the set threshold may be any reasonable value such as 0.7, 0.8, or 0.6, and may be set according to the actual requirement of the user, which is not limited in this application.
Further, in an embodiment, step S16 may specifically include: sorting the texts to be selected according to the prediction probability value corresponding to each text, and recommending the set number of top-ranked texts to the user.
Similarly, sorting the texts to be selected by their prediction probability values amounts to sorting them by the user's degree of interest, so the texts of higher interest are ranked first and recommended to the user in order, matching the user's reading preference. Setting a specific number of recommendations both satisfies the user's reading interest and gives the user effective control over reading time and efficiency.
Optionally, the set number may be any reasonable number such as 5, 6, or 8, and may be specifically set according to the actual requirement of the user, which is not limited in this application.
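Both recommendation strategies reduce to a few lines; the sketch below is illustrative only, with the threshold 0.7 and the count 5 taken from the example values mentioned above:

```python
def recommend(candidates, probs, threshold=0.7, top_k=5):
    """Strategy 1: keep candidates whose predicted probability exceeds
    the set threshold. Strategy 2: rank all candidates by probability
    and keep the top set number."""
    above_threshold = [c for c, p in zip(candidates, probs) if p > threshold]
    ranked = sorted(zip(candidates, probs), key=lambda cp: cp[1], reverse=True)
    top_ranked = [c for c, _ in ranked[:top_k]]
    return above_threshold, top_ranked
```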
In a specific embodiment, please refer to fig. 9 and fig. 10 in combination, where fig. 9 is a schematic flowchart of a specific application scenario for performing feature extraction on a candidate text sequence and a historical text sequence in the present application, and fig. 10 is a schematic flowchart of a specific application scenario for acquiring a user interest feature representation corresponding to a historical text sequence in the present application. In this embodiment, a corresponding text recommendation method is specifically implemented through a trained text recommendation network model, where the text recommendation is a News recommendation, and the text recommendation network model specifically includes a News Encoder and a User Encoder, which are used as examples for explanation.
Therefore, after the word segmentation sequences participating in recommendation are obtained, comprising a first word segmentation sequence corresponding to the text to be selected and a second word segmentation sequence corresponding to each historical text in the historical text sequence, feature extraction can be performed on each of them in turn through the News Encoder.
For example, when one piece of news D obtained consists of M words, the news D may be represented as:
D=[w1,w2,w3,...,wM]。
The News Encoder first applies Word Embedding + Position Embedding (vector conversion and position coding) to news D to obtain the corresponding semantic representation E_1:

E_1 = [e_1, e_2, e_3, ..., e_M].
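The patent only states that the position coding uses each element's position and the dimension of the representation; a common concrete choice consistent with that description is the sinusoidal form, sketched here as an assumption:

```python
def positional_encoding(M, dim):
    """Sinusoidal position coding over M positions with the given
    representation dimension (dim assumed even; an assumed concrete form)."""
    pe = np.zeros((M, dim))
    pos = np.arange(M)[:, None]             # position of each element
    i = np.arange(0, dim, 2)[None, :]       # even feature indices
    angle = pos / np.power(10000.0, i / dim)
    pe[:, 0::2] = np.sin(angle)
    pe[:, 1::2] = np.cos(angle)
    return pe

# E1 = word_embeddings + positional_encoding(M, dim)
```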
Further, the attention-mechanism network model integrated in the News Encoder extracts the related importance degree between every two elements of the semantic representation E_1, obtaining in sequence the attention vector representation of each element and the related importance degree representation between the i-th word and each of the M words in news D. Finally, feature extraction is performed on the related importance degree representation to obtain the corresponding text feature vector.
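Claim 6 states that the related importance degree value of one element relative to another is the product of the first element's transpose, a first weight coefficient matrix, and the second element; under that reading, the pairwise-relevance step could be sketched as below (the later aggregation into the final text feature vector is omitted):

```python
def related_importance(E1, W1):
    """Pairwise relevance scores s[i, j] = e_i^T W1 e_j over the semantic
    representation E1 (shape [M, dim]), softmax-normalized per row, then
    aggregated into attention vector representations."""
    scores = E1 @ W1 @ E1.T                                     # [M, M]
    scores -= scores.max(axis=1, keepdims=True)                 # stability
    alpha = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)
    return alpha @ E1                                           # [M, dim]
```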
It is understood that the User Encoder operates on the feature representations produced by the News Encoder for each read historical text in the historical text sequence C = {c_1, c_2, ..., c_S}, ordered by reading time, yielding the feature representation of the S news items d = {d_1, d_2, ..., d_S}. After the feature representation d is obtained, short-term feature extraction is performed on it in sequence to obtain the semantic feature information, i.e., the hidden vector output values {h_1, h_2, ..., h_S}.
And then, extracting long-term interest features to obtain an importance weight coefficient of S news:
Figure BDA0003115228180000205
The user interest feature vector obtained after fully connecting the semantic feature information with the importance weight coefficients is:

u = Σ_{i=1}^{S} α_i^n · h_i
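Tying the sketches above together, the User Encoder pipeline would read roughly as follows; `news_encoder` and the parameter names are hypothetical stand-ins from the earlier sketches, not the patent's implementation:

```python
# d_seq    = [news_encoder(c) for c in history]        # per-text features
# hidden   = encode_history(d_seq, W_r, W_z, W, H)     # short-term (GRU)
# u, alpha = user_interest_representation(d_seq, hidden, O_n, o_n)
```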
the specific operation method is the same as the operation method in the embodiment corresponding to the text recommendation method, and is not described herein again.
Referring to fig. 11, fig. 11 is a schematic diagram illustrating an embodiment of a method for training a text recommendation network model according to the present application. In an application scenario, the text recommendation method is specifically implemented by a trained text recommendation network model, and in this embodiment, the method for training the text recommendation network model may specifically include:
S21: inputting read text sequence samples and unread text sequence samples into a first preset network model, and performing text recommendation on the read and unread samples through the first preset network model to give a first prediction probability value and a second prediction probability value, respectively.
Specifically, the read text sequence samples are recorded as a positive example set D, and U unread text sequence samples are randomly sampled as negative samples. These are arranged in publication-time order and input into the first preset network model in sequence; the model performs text recommendation on the read and unread samples and gives the first and second prediction probability values, respectively.
S22: and obtaining the posterior probability corresponding to the read text sequence sample and a negative likelihood function of the posterior probability based on the first prediction probability value and the second prediction probability value.
Further, corresponding function processing is performed on the first and second prediction probability values through the first preset network model to obtain the posterior probability corresponding to the read text sequence samples and the negative likelihood function of that posterior probability.
Specifically, assume the model prediction score of each positive example, i.e., the first prediction probability value, is y^+, and that the model prediction scores corresponding to the U negative samples, i.e., the second prediction probability values, are y_1^-, y_2^-, ..., y_U^-. The posterior probability of the i-th positive example is then obtained by softmax normalization:

p_i = exp(y_i^+) / (exp(y_i^+) + Σ_{j=1}^{U} exp(y_j^-))
the target function of the training process is the negative likelihood function of the posterior probability:
loss = −Σ_{i∈D} log(p_i).
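A compact NumPy sketch of this objective, assuming each positive example is paired with its own U negative scores:

```python
def negative_log_likelihood(pos_scores, neg_scores):
    """pos_scores: shape [N], one score y+ per positive example.
    neg_scores: shape [N, U], the U negative scores paired with each
    positive. Returns the loss summed over the positive set D."""
    pos = np.asarray(pos_scores, dtype=float)
    neg = np.asarray(neg_scores, dtype=float)
    p = np.exp(pos) / (np.exp(pos) + np.exp(neg).sum(axis=1))  # softmax posterior
    return -np.log(p).sum()
```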
S23: continuously optimizing the set quantization parameters in the first preset network model through the negative likelihood function, thereby establishing the text recommendation network model.
Still further, the set quantization parameters in the first preset network model, for example the weight matrices and bias matrices mentioned in the above embodiments, are continuously adjusted through the negative likelihood function; when the negative likelihood function reaches a stable state, the optimization stops, and the text recommendation network model capable of performing text recommendation is established.
Referring to fig. 12, fig. 12 is a schematic structural diagram of an embodiment of an intelligent terminal according to the present application. The intelligent terminal 31 includes a memory 311 and a processor 312 coupled to each other, the memory 311 stores program data, and the processor 312 is configured to execute the program data stored in the memory 311 to implement any one of the text recommendation methods described above.
Optionally, the intelligent terminal 31 is one of any reasonable intelligent electronic devices such as a microcomputer, a server, a mobile phone, a tablet computer, and a smart watch, which is not limited in this application.
In particular, the processor 312 is configured to control itself and the memory 311 to implement the steps of any of the above-described embodiments of the text recommendation method. Processor 312 may also be referred to as a CPU (Central Processing Unit). Processor 312 may be an integrated circuit chip having signal processing capabilities. The Processor 312 may also be a general purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable logic device, discrete Gate or transistor logic, discrete hardware components. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like. Additionally, processor 312 may be implemented collectively by an integrated circuit chip.
Referring to fig. 13, fig. 13 is a schematic structural diagram of an embodiment of a computer-readable storage medium according to the present application. Wherein the computer readable storage medium 41 stores program data 411 that can be executed by a processor, the program data 411 being executable to implement any of the above-described text recommendation methods.
In the several embodiments provided in the present application, it should be understood that the disclosed method and apparatus may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, a division of a module or a unit is merely one type of logical division, and an actual implementation may have another division, for example, a unit or a component may be combined or integrated with another system, or some features may be omitted, or not implemented. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of devices or units through some interfaces, and may be in an electrical, mechanical or other form.
Units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on network elements. Some or all of the units can be selected according to actual needs to achieve the purpose of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application may be substantially implemented or contributed to by the prior art, or all or part of the technical solution may be embodied in a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor (processor) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.

Claims (15)

1. A text recommendation method, characterized in that the text recommendation method comprises:
acquiring a first word segmentation sequence and a historical text sequence corresponding to a text to be selected;
performing feature extraction processing on the first word segmentation sequence to obtain a feature vector of the text to be selected; the feature vector of the text to be selected comprises position information of each word in the text to be selected;
performing feature extraction processing on a second word segmentation sequence corresponding to each historical text in the historical text sequences to obtain historical feature vectors; the historical feature vector comprises position information of each participle in each second participle sequence;
performing interest feature extraction processing on the historical feature vector to obtain a user interest feature representation corresponding to the historical text sequence;
determining a prediction probability value corresponding to the text to be selected based on the text feature vector to be selected and the user interest feature representation;
and recommending texts for the users according to the predicted probability values.
2. The text recommendation method according to claim 1, wherein the step of performing feature extraction processing on the first word segmentation sequence to obtain a feature vector of the text to be selected comprises:
converting the first word segmentation sequence to obtain a text representation to be selected and a position coding tensor;
processing the representation of the text to be selected and the position coding tensor corresponding to the representation of the text to be selected to obtain a semantic representation to be selected corresponding to the text to be selected;
and obtaining the feature vector of the text to be selected corresponding to the text to be selected based on the semantic representation to be selected.
3. The method of claim 2, wherein the converting the first word segmentation sequence to obtain the candidate text representation and the position coding tensor comprises:
performing vector conversion on the first word segmentation sequence to obtain the representation of the text to be selected;
and obtaining the position coding tensor based on the candidate text representation.
4. The text recommendation method according to claim 3, wherein said deriving the position coding tensor based on the candidate text representation comprises:
and sequentially carrying out position coding operation on each element in the to-be-selected text representation by using the dimension of the to-be-selected text representation and the position of each element in the to-be-selected text representation so as to obtain the position coding tensor.
5. The text recommendation method according to claim 2, wherein obtaining the feature vector of the text to be selected corresponding to the text to be selected based on the semantic representation to be selected comprises:
acquiring a relevant importance degree value between every two elements in the semantic representation to be selected and an importance degree value of each element in the text representation to be selected;
obtaining attention vector representation corresponding to the semantic representation to be selected based on the relevant importance degree value;
and obtaining the feature vector of the text to be selected based on the attention vector representation and the importance degree value.
6. The text recommendation method according to claim 5, wherein the step of obtaining the relative importance degree value between each two elements in the semantic representation to be selected comprises:
processing the semantic representation to be selected through the trained first network model to obtain the relevant importance degree value between every two elements in the semantic representation to be selected;
wherein, the correlation importance degree value of a first element in the semantic representation to be selected relative to a second element thereof is the product of the transposed vector of the first element and a first weight coefficient matrix and the second element obtained after the first network model is trained.
7. The text recommendation method according to claim 6, wherein the deriving the attention vector representation corresponding to the semantic representation to be selected based on the related importance value comprises:
carrying out normalization operation on the relevant importance degree value to obtain relevant importance degree representation between every two elements in the semantic representation to be selected;
and carrying out first feature extraction operation on the related importance degree representation through the first network model to obtain the attention vector representation.
8. The text recommendation method of claim 7, wherein the step of deriving the candidate text feature vector based on the attention vector representation and the importance value comprises:
performing a second feature extraction operation on the attention vector representation and the importance degree value through the first network model to obtain an attention mechanism weight coefficient;
and performing a first aggregation operation on the attention vector representation and the attention mechanism weight coefficient to obtain the text feature vector to be selected.
9. The text recommendation method according to claim 1, wherein a plurality of the historical texts form the historical text sequence in reading time sequence, and the step of performing interest feature extraction processing on the historical feature vector to obtain a user interest feature representation corresponding to the historical text sequence comprises:
performing interest feature extraction operation on the historical feature vectors to obtain a weight coefficient value corresponding to each historical text;
and obtaining the user interest feature representation based on the historical feature vector and each weight coefficient value.
10. The method of claim 9, wherein the performing an interest feature extraction operation on the historical feature vectors to obtain a weight coefficient value corresponding to each historical text comprises:
performing interest feature extraction operation on the reading time of each historical text and the historical feature vector through a trained second network model to obtain the hidden vector output value corresponding to each historical text;
and performing importance degree feature extraction operation on each hidden vector output value through the second network model to obtain the weight coefficient value corresponding to each historical text.
11. The method of claim 9, wherein the step of deriving the user interest feature representation based on the historical feature vectors and each of the weight coefficient values comprises:
and carrying out second aggregation operation on the historical feature vector and each weight coefficient value to obtain the user interest feature representation.
12. The text recommendation method according to claim 1, wherein the number of the texts to be selected is plural, and the step of recommending the text to the user according to the predicted probability value comprises:
recommending the texts to be selected, corresponding to the prediction probability values of which exceed a set threshold value, in the plurality of texts to be selected obtained by calculation to a user;
or sequencing the texts to be selected according to the prediction probability value corresponding to each text to be selected, and recommending the set number of texts to be selected with the top sequencing order to the user.
13. The text recommendation method according to claim 1, wherein the text recommendation method is implemented by a trained text recommendation network model, and the method for training the text recommendation network model comprises:
inputting a read text sequence sample and an unread text sequence sample into a first preset network model, and respectively carrying out text recommendation on the read text sequence sample and the unread text sequence sample through the first preset network model so as to respectively give a first prediction probability value and a second prediction probability value;
obtaining a posterior probability corresponding to the read text sequence sample and a negative likelihood function of the posterior probability based on the first prediction probability value and the second prediction probability value;
and continuously optimizing the set quantitative parameters in the first preset network model through the negative likelihood function, so as to establish the text recommendation network model.
14. An intelligent terminal, characterized in that the intelligent terminal comprises a memory and a processor coupled to each other;
the memory stores program data;
the processor is configured to execute the program data to implement the text recommendation method of any of claims 1-13.
15. A computer-readable storage medium, characterized in that the computer-readable storage medium stores program data executable to implement the text recommendation method according to any one of claims 1-13.
CN202110660904.7A 2021-06-15 2021-06-15 Text recommendation method, intelligent terminal and computer readable storage medium Active CN113536785B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110660904.7A CN113536785B (en) 2021-06-15 2021-06-15 Text recommendation method, intelligent terminal and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110660904.7A CN113536785B (en) 2021-06-15 2021-06-15 Text recommendation method, intelligent terminal and computer readable storage medium

Publications (2)

Publication Number Publication Date
CN113536785A true CN113536785A (en) 2021-10-22
CN113536785B CN113536785B (en) 2022-08-12

Family

ID=78096073

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110660904.7A Active CN113536785B (en) 2021-06-15 2021-06-15 Text recommendation method, intelligent terminal and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN113536785B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114780844A (en) * 2022-04-21 2022-07-22 杭州樱熊网络科技有限公司 Novel recommendation method and system based on user habits

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102831234A (en) * 2012-08-31 2012-12-19 北京邮电大学 Personalized news recommendation device and method based on news content and theme feature
US20150363688A1 (en) * 2014-06-13 2015-12-17 Microsoft Corporation Modeling interestingness with deep neural networks
CN106446230A (en) * 2016-10-08 2017-02-22 国云科技股份有限公司 Method for optimizing word classification in machine learning text
CN107577737A (en) * 2017-08-25 2018-01-12 北京百度网讯科技有限公司 Method and apparatus for pushed information
CN108763493A (en) * 2018-05-30 2018-11-06 深圳市思迪信息技术股份有限公司 A kind of recommendation method based on deep learning
CN110688476A (en) * 2019-09-23 2020-01-14 腾讯科技(北京)有限公司 Text recommendation method and device based on artificial intelligence
US20200242304A1 (en) * 2017-11-29 2020-07-30 Tencent Technology (Shenzhen) Company Limited Text recommendation method and apparatus, and electronic device
CN111507328A (en) * 2020-04-13 2020-08-07 北京爱咔咔信息技术有限公司 Text recognition and model training method, system, equipment and readable storage medium

Also Published As

Publication number Publication date
CN113536785B (en) 2022-08-12

Similar Documents

Publication Publication Date Title
CN111444428B (en) Information recommendation method and device based on artificial intelligence, electronic equipment and storage medium
CN112163165B (en) Information recommendation method, device, equipment and computer readable storage medium
CN111897913B (en) Semantic tree enhancement based cross-modal retrieval method for searching video from complex text
CN110069709B (en) Intention recognition method, device, computer readable medium and electronic equipment
CN110717098B (en) Meta-path-based context-aware user modeling method and sequence recommendation method
CN109271493B (en) Language text processing method and device and storage medium
CN106462626B (en) Interest-degree is modeled using deep neural network
CN111444326B (en) Text data processing method, device, equipment and storage medium
Gorakala et al. Building a recommendation system with R
CN110717099B (en) Method and terminal for recommending film
CN107992531A (en) News personalization intelligent recommendation method and system based on deep learning
CN111324769A (en) Training method of video information processing model, video information processing method and device
CN108984555B (en) User state mining and information recommendation method, device and equipment
CN108846097B (en) User interest tag representation method, article recommendation device and equipment
CN113590970B (en) Personalized digital book recommendation system and method based on reader preference, computer and storage medium
CN115048586B (en) Multi-feature-fused news recommendation method and system
CN109598586A (en) A kind of recommended method based on attention model
CN112667782A (en) Text classification method, device, equipment and storage medium
CN111651594B (en) Case item classification method and medium based on key value memory network
Ciaburro et al. Python Machine Learning Cookbook: Over 100 recipes to progress from smart data analytics to deep learning using real-world datasets
Ayyadevara Neural Networks with Keras Cookbook: Over 70 recipes leveraging deep learning techniques across image, text, audio, and game bots
CN115618024A (en) Multimedia recommendation method and device and electronic equipment
Zhu Network Course Recommendation System Based on Double‐Layer Attention Mechanism
CN113536785B (en) Text recommendation method, intelligent terminal and computer readable storage medium
CN113535949A (en) Multi-mode combined event detection method based on pictures and sentences

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant