CN118051615A - User interest portrait generating method, device, equipment, storage medium and product - Google Patents

User interest portrait generating method, device, equipment, storage medium and product

Info

Publication number: CN118051615A
Authority: CN (China)
Prior art keywords: interest, user, room, term, short
Legal status: Pending
Application number: CN202410175392.9A
Other languages: Chinese (zh)
Inventor: 岳文应
Assignee: Bigo Technology Pte Ltd (original and current)
Application filed by Bigo Technology Pte Ltd; priority to CN202410175392.9A

Classification

  • Information Retrieval, Db Structures And Fs Structures Therefor

Abstract

An embodiment of the present application provides a user interest portrait generation method, apparatus, device, storage medium, and product. The method includes: acquiring room text information corresponding to the voice room most recently accessed by a user, where the statistical time range of the room text information is determined based on the trigger time point of an active behavior event of the user; determining a short-term interest portrait of the user based on the room text information and a trained first text classification model; acquiring the user's recorded historical short-term portraits and the room interest portrait of a followed voice room, and determining a long-term interest portrait of the user from the short-term interest portrait, the historical short-term portraits, and the room interest portrait; and combining the short-term interest portrait and the long-term interest portrait to obtain the user interest portrait. The method reasonably constructs the user's short-term interest portrait, accurately grasps the user's long-term interest characteristics, and yields a comprehensive user interest portrait, which facilitates recommending content that matches the user's interests and customizing a personalized experience.

Description

User interest portrait generating method, device, equipment, storage medium and product
Technical Field
The embodiments of the present application relate to the field of computer technology, and in particular to a user interest portrait generation method, apparatus, device, storage medium, and product.
Background
With the popularity of social media and real-time communication tools, voice rooms are rapidly emerging as a new mode of social interaction. A voice room provides a communication platform for hosts and users to interact: by listening to the host or directly joining discussions and knowledge sharing on many topics, users can obtain the information they need in time and build closer social connections. To enhance the user experience, a voice room platform needs a deeper understanding of users' interests so that it can recommend relevant content and customize a personalized interactive experience. The data generated while a voice room is active is large and complex, the topics users encounter are diverse, and discussions vary widely in depth and breadth, so interests in a voice room differ from person to person. Only by understanding and adapting to this variety of interests and contexts can a user's preferences be captured accurately and pushed content be guaranteed to meet the user's expectations.
In the related art, user portraits are generated with rule-based or traditional machine learning methods. These methods lack personalized analysis of a user's interests, cannot capture real-time changes in those interests, and cannot grasp them comprehensively and accurately, so improvement is needed.
Disclosure of Invention
The embodiments of the present application provide a user interest portrait generation method, apparatus, device, storage medium, and product. They address the inability of the related art to capture real-time changes in user interests and to grasp those interests comprehensively and accurately. By capturing real-time changes, a short-term interest portrait of the user is reasonably constructed; by accurately grasping the user's long-term interest characteristics, a long-term interest portrait is effectively constructed. Together these yield a comprehensive and accurate user interest portrait, which facilitates recommending content that matches the user's interests and customizing a personalized interactive experience.
In a first aspect, an embodiment of the present application provides a method for generating a user interest portrait, where the method includes:
acquiring room text information corresponding to the voice room most recently accessed by a user, where the statistical time range of the room text information is determined based on the trigger time point of an active behavior event of the user;
determining a short-term interest portrait of the user based on the room text information and a trained first text classification model;
acquiring the user's recorded historical short-term portraits and the room interest portrait of a followed voice room, and determining a long-term interest portrait of the user from the short-term interest portrait, the historical short-term portraits, and the room interest portrait;
combining the short-term interest portrait and the long-term interest portrait to obtain the user interest portrait.
In a second aspect, an embodiment of the present application further provides a user interest portrait generating device, where the device includes:
an acquisition module, configured to acquire room text information corresponding to the voice room most recently accessed by a user, where the statistical time range of the room text information is determined based on the trigger time point of an active behavior event of the user;
a short-term portrait generation module, configured to determine a short-term interest portrait of the user based on the room text information and a trained first text classification model;
a long-term portrait generation module, configured to acquire the user's recorded historical short-term portraits and the room interest portrait of a followed voice room, and to determine a long-term interest portrait of the user from the short-term interest portrait, the historical short-term portraits, and the room interest portrait;
a user portrait generation module, configured to combine the short-term interest portrait and the long-term interest portrait to obtain the user interest portrait.
In a third aspect, an embodiment of the present application further provides a user interest portrait generating device, including:
one or more processors;
a storage device configured to store one or more programs,
where, when the one or more programs are executed by the one or more processors, the one or more processors implement the user interest portrait generation method according to the embodiments of the present application.
In a fourth aspect, embodiments of the present application further provide a non-volatile storage medium storing computer-executable instructions that, when executed by a computer processor, perform the user interest portrait generation method according to an embodiment of the present application.
In a fifth aspect, embodiments of the present application further provide a computer program product. The computer program product includes a computer program stored in a computer-readable storage medium; at least one processor of a device reads and executes the computer program from the computer-readable storage medium, causing the device to perform the user interest portrait generation method according to an embodiment of the present application.
According to the embodiments of the present application, room text information corresponding to the voice room most recently accessed by a user is acquired, with the statistical time range of the room text information determined based on the trigger time point of an active behavior event of the user; a short-term interest portrait of the user is determined based on the room text information and a trained first text classification model; the user's recorded historical short-term portraits and the room interest portrait of a followed voice room are acquired, and a long-term interest portrait of the user is determined from the short-term interest portrait, the historical short-term portraits, and the room interest portrait; and the short-term interest portrait and the long-term interest portrait are combined to obtain the user interest portrait. In this scheme, acquiring the room text of the most recently accessed voice room and anchoring it to the trigger time of the user's active behavior determines the reference information for the user's interests effectively and reasonably. Using diverse room text together with text classification models of the corresponding types captures real-time changes in the user's interests and reasonably constructs the short-term interest portrait. Fusing the recorded short-term portraits with the room interest portrait of the voice room the user follows accurately grasps the user's long-term interest characteristics and effectively constructs the long-term interest portrait. The result is a comprehensive and accurate user interest portrait, which facilitates recommending content that matches the user's interests and customizing a personalized interactive experience.
Drawings
FIG. 1 is a flowchart of a user interest portrait generation method according to an embodiment of the present application;
FIG. 2 is a flowchart of a user interest portrait generation method including a short-term interest portrait generation process according to an embodiment of the present application;
FIG. 3 is a flowchart of a user interest portrait generation method including a weight value calculation process according to an embodiment of the present application;
FIG. 4 is a flowchart of another user interest portrait generation method including a short-term interest portrait generation process according to an embodiment of the present application;
FIG. 5 is a flowchart of a user interest portrait generation method including a long-term interest portrait generation process according to an embodiment of the present application;
FIG. 6 is a flowchart of a user interest portrait generation method including a room interest portrait generation process according to an embodiment of the present application;
FIG. 7 is a block diagram of a user interest portrait generation apparatus according to an embodiment of the present application;
FIG. 8 is a schematic structural diagram of a user interest portrait generation device according to an embodiment of the present application.
Detailed Description
Embodiments of the present application will be described in further detail below with reference to the drawings and examples. It should be understood that the particular embodiments described herein are illustrative only and are not limiting of embodiments of the application. It should be further noted that, for convenience of description, only some, but not all of the structures related to the embodiments of the present application are shown in the drawings.
The terms "first", "second", and the like in the description and claims are used to distinguish between similar elements and do not necessarily describe a particular sequence or chronological order. It should be understood that the data so used may be interchanged where appropriate, so that the embodiments of the present application can be implemented in sequences other than those illustrated or described herein. Objects identified by "first", "second", etc. are generally of one type, and the number of objects is not limited; for example, the first object may be one or more. Furthermore, in the description and claims, "and/or" denotes at least one of the connected objects, and the character "/" generally indicates an "or" relationship between the associated objects.
The user interest portrait generation method provided by the embodiments of the present application can construct a user interest portrait by combining information such as the multi-source text of the voice rooms a user accesses and the text of the user's speech during access, thereby effectively optimizing personalized recommendation services. Related application scenarios include live voice, social voice, voice tutorials, and the like. The application scenarios listed above are merely exemplary; in practice, the method may be used for real-time voice communication in other scenarios, which is not limited by the embodiments of the present application.
When rule-based or traditional machine learning methods are used to generate user portraits in the related art, several shortcomings appear in practice for interest understanding and recommendation in real-time voice scenarios, such as insufficient accuracy, timeliness, and personalization. First, when processing voice data, the related art has difficulty efficiently recognizing, analyzing, and understanding a user's real-time voice input, and cannot fully exploit the rich information users generate in real-time voice-room communication, so it cannot serve users accurately. Second, related platforms are limited in expressing diverse and complex user interests: because voice-room topics range widely, changes in a user's interest across topics are hard to capture accurately, so recommended content is not personalized enough. In addition, the data sources used in the related art are single or one-sided and cannot describe a user's interests deeply and comprehensively. To better meet the service requirements of real-time voice scenarios, the present application therefore provides a user interest portrait generation method, apparatus, device, storage medium, and product that build a comprehensive and accurate user interest portrait. Such a portrait supplies a more precise reference signal to a recommendation system so that it can accurately capture a user's points of interest and provide customized, personalized room and content recommendations; it improves the user's experience of discovering interesting content; and it injects more intelligence into the platform's content recommendation algorithm, intelligently recommending users with shared interests, promoting the formation of social relations, helping users expand their social circles more purposefully, and enhancing the depth and breadth of their social experience. Moreover, the user interest portrait can help operations and product teams understand user preferences and trends more deeply, enabling more refined operation strategies and improving platform retention and user engagement.
Each step of the user interest portrait generation method provided by the embodiments of the present application may be executed by a computer device, i.e., any electronic device with data computing, processing, and storage capabilities, such as a mobile phone, a PC (Personal Computer), a tablet computer, or another terminal device, or a server or similar device.
FIG. 1 is a flowchart of a user interest portrait generation method according to an embodiment of the present application. As shown in FIG. 1, the method includes the following steps:
Step S101: acquire room text information corresponding to the voice room most recently accessed by a user, where the statistical time range of the room text information is determined based on the trigger time point of an active behavior event of the user.
The room text information may be text content that reflects the characteristics of the voice room's content, such as its discussion topics and focus, and may be derived from the voice room's real-time voice stream, public-screen stream, or public information. It may include speech recognition text, public-screen text, announcement text, and title text. The statistical time range determines which period of room text information to acquire, with the trigger time point of the user's active behavior event as a reference. For example, the active behavior event may be the user sending a gift: the trigger time point of the gift may be taken as the reference time, the host's voice stream data within 10 minutes before and after the reference time may be intercepted, and speech recognition text may be extracted from it. As another example, the active behavior event may be the user chatting on the public screen: the trigger time point of the public-screen interaction may be taken as the reference time, and the room's voice stream and public-screen stream within 5 minutes before and after the reference time may be intercepted, with speech recognition text and public-screen text extracted from them. Optionally, the voice stream or public-screen stream data may be passed through a preset sensitive-word model or emotion recognition model to remove negatively toned data, ensuring the data effectively reflects the user's interest tendencies. As a further example, if the user performs no voice or public-screen interaction after entering the voice room but stays for a duration within a preset range, the user can be considered likely interested in the room, and the room's voice stream and public-screen stream during the stay may be extracted along with their speech recognition text and public-screen text. The preset range removes stays that are too short or too long: a stay of 2-3 hours may indicate an abnormal idle (hung-up) client, while a stay of only 1-2 minutes may indicate the user quickly browsed, screened the content, and was not interested (see the sketch below). Of course, the above settings for acquiring room text information are merely exemplary; developers may adapt the choice of active behavior events and statistical time ranges to the voice room's online schedule and the observed interest-prediction effect in an actual scenario, which is not limited here.
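As a minimal Python sketch of how a statistical time range might be anchored to an active behavior event, following the examples above; the function name, event keys, window sizes, and dwell bounds are illustrative assumptions, not taken from the patent:

```python
from datetime import datetime, timedelta

# Hypothetical window sizes per active-behavior event: gift events look at
# speech 10 minutes either side of the trigger; public-screen chat looks
# 5 minutes either side.
WINDOWS = {
    "gift": timedelta(minutes=10),
    "public_screen": timedelta(minutes=5),
}

def statistical_range(event: str, trigger: datetime,
                      enter: datetime | None = None,
                      leave: datetime | None = None):
    """Return the (start, end) range over which room text is collected,
    or None if the event yields no usable range."""
    if event in WINDOWS:
        return trigger - WINDOWS[event], trigger + WINDOWS[event]
    if event == "dwell" and enter is not None and leave is not None:
        stay = leave - enter
        # Drop stays that are too short (quick browse-and-leave) or too
        # long (likely an idle, hung-up client), per the preset range above.
        if timedelta(minutes=2) <= stay <= timedelta(hours=2):
            return enter, leave
    return None
```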
Step S102: determine a short-term interest portrait of the user based on the room text information and a trained first text classification model.
The first text classification model generates interest tags from the provided text content and may be a common neural network such as a CNN or an RNN. It can be trained on historically recorded room text data from voice rooms and the corresponding interest tag data, so that it acquires a good interest-tagging capability and can accurately tag the acquired room text information. Optionally, the room text information may include several different types of text content, and a trained first text classification model may be provided for each type, so that the model for each type adapts to the textual characteristics of that type and tags interests accurately; a sketch of this per-type tagging follows. For example, where the room text information is speech recognition text, a first text classification model for tagging speech recognition text may be trained on historical speech recognition data of voice rooms and the corresponding interest tag data. The short-term interest portrait may be characteristic information describing the real-time interest preferences of a user accessing the voice room. Optionally, it may be obtained by combining all interest tags produced after the room text information is input into the trained first text classification model. Optionally, the room text information may be input into the trained first text classification model to obtain a number of interest tags, a preset number of those tags screened from high frequency to low, and that subset combined in frequency order to obtain the short-term interest portrait. Alternatively, the different types of text content may each be input into the corresponding trained first text classification model to obtain interest tags per type; frequency or weight values may then be computed for the tags, a preset number of tags with the highest frequency or weight selected within each type, and the selected tags ranked and combined by frequency or weight to obtain the short-term interest portrait. Optionally, the frequency or weight value of each selected interest tag may further be adjusted in different proportions according to how directly each type of tag relates to the user's interests, with the adjusted tags ranked and combined by frequency or weight to obtain the short-term interest portrait. Of course, the above ways of generating the short-term interest portrait are merely exemplary, and developers may adapt them to the interest requirements of an actual application scenario, which is not limited here.
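A minimal sketch of the per-type tagging described above, assuming one trained classifier per text type exposing a hypothetical predict(sentence) method; the type names and interface are assumptions, not from the patent:

```python
from collections import Counter

def short_term_tag_counts(room_texts: dict[str, list[str]],
                          classifiers: dict) -> dict[str, Counter]:
    """Tag every sentence of each text type with that type's trained
    classifier and count tag frequencies per type. `classifiers` maps a
    text type (e.g. "asr", "public_screen", "announcement", "title") to a
    model exposing predict(sentence) -> interest tag."""
    counts: dict[str, Counter] = {}
    for text_type, sentences in room_texts.items():
        model = classifiers[text_type]
        counts[text_type] = Counter(model.predict(s) for s in sentences)
    return counts
```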
Step S103: acquire the user's recorded historical short-term portraits and the room interest portrait of a followed voice room, and determine a long-term interest portrait of the user from the short-term interest portrait, the historical short-term portraits, and the room interest portrait.
The short-term interest portrait corresponding to each of the user's visits to a voice room may be recorded, yielding several historical short-term portraits. When the recorded historical short-term portraits are acquired, they may be restricted to a preset time range; for example, with a preset range of half a year, the historical short-term portraits other than the one generated for the most recent visit can be obtained within that half year. The number acquired may be chosen adaptively by developers according to how much user interests drift over time in different application scenarios, which is not limited here; for example, if the long-term interest portrait needs to reference the short-term portraits of the user's last 30 visits within half a year, the reference information may consist of 29 historical portraits plus the most recently generated one. Because a user may frequently access a voice room of interest, the room interest portrait of a followed voice room may also be used in determining the long-term interest portrait; the room interest portrait may be characteristic information describing the content interest preferences of that voice room. The long-term interest portrait may be characteristic information describing the user's enduring interest preferences and may be obtained by fusing the short-term interest portrait, the historical short-term portraits, and the room interest portrait. Optionally, it may be obtained by extracting, from all tags of the short-term interest portrait, the historical short-term portraits, and the room interest portrait, a preset number of tags ranked from high to low by frequency or weight value and combining them. Optionally, tags may be selected separately per portrait type: for example, a preset number of first partial tags may be extracted from all tags of the short-term interest portrait and the historical short-term portraits by descending frequency or weight, a preset number of second partial tags extracted likewise from the room interest portrait, and the two sets combined in descending order of frequency or weight to obtain the long-term interest portrait. Optionally, the frequency or weight value of each selected tag may further be adjusted in different proportions according to how directly each portrait type relates to the user's long-term interests, with the adjusted tags ranked and combined to obtain the long-term interest portrait. Of course, the above ways of generating the long-term interest portrait are merely exemplary, and developers may adapt them to the interest requirements of an actual application scenario, which is not limited here.
Step S104: combine the short-term interest portrait and the long-term interest portrait to obtain the user interest portrait.
The user interest portrait may comprise both the short-term interest portrait and the long-term interest portrait, depicting changes in the user's interests comprehensively from different statistical dimensions: the short-term interest portrait reflects the user's recent interest tendencies, while the long-term interest portrait reflects the user's enduring interest characteristics. Diversified push strategies can therefore be formulated for the different portrait types, improving push quality. A sketch of the combined structure follows.
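One possible shape for the combined portrait, as a minimal sketch assuming each portrait is a ranked list of (tag, weight) pairs; the class and field names are illustrative, not from the patent:

```python
from dataclasses import dataclass

@dataclass
class UserInterestPortrait:
    # Ranked (tag, weight) pairs, highest weight first.
    short_term: list[tuple[str, float]]
    long_term: list[tuple[str, float]]

# Hypothetical example values.
portrait = UserInterestPortrait(
    short_term=[("music", 0.42), ("travel", 0.17)],
    long_term=[("music", 0.35), ("games", 0.21)],
)
```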
In summary, room text information corresponding to the voice room most recently accessed by the user is acquired, with its statistical time range determined from the trigger time point of the user's active behavior event; a short-term interest portrait is determined from the room text information and a trained first text classification model; the user's recorded historical short-term portraits and the room interest portrait of a followed voice room are acquired, and a long-term interest portrait is determined from the short-term interest portrait, the historical short-term portraits, and the room interest portrait; and the short-term and long-term interest portraits are combined to obtain the user interest portrait. Anchoring the room text of the most recently accessed voice room to the trigger time of the user's active behavior determines the reference information for the user's interests effectively and reasonably; diverse room text combined with classification models of the corresponding types captures real-time changes in the user's interests and reasonably constructs the short-term interest portrait; and fusing the recorded short-term portraits with the room interest portrait of the followed voice room accurately grasps the user's long-term interest characteristics and effectively constructs the long-term interest portrait. The result is a comprehensive and accurate user interest portrait that facilitates recommending content matching the user's interests and customizing a personalized interactive experience.
FIG. 2 is a flowchart of a user interest portrait generation method including a short-term interest portrait generation process according to an embodiment of the present application. Here the room text information includes different types of first text content. As shown in FIG. 2, the method includes the following steps:
Step S201: acquire room text information corresponding to the voice room most recently accessed by a user, where the statistical time range of the room text information is determined based on the trigger time point of an active behavior event of the user.
Step S202: input the different types of first text content into the trained first text classification models of the corresponding types, respectively, to obtain a plurality of first interest tags of different types.
The room text information may include several types of first text content corresponding to different information dimensions of the voice room, for example speech recognition text from the room's voice stream data, public-screen text from the room's public-screen stream data, and announcement text and title text from the room's content profile. Interest depiction may reference several types of first text content per user behavior: for example, gift-giving interest may be depicted from the host's speech recognition text; public-screen interaction interest may be depicted from the room's speech recognition text and public-screen text; and room-entry interest may likewise be depicted from the room's speech recognition text and public-screen text. Each type of first text content may be interest-tagged by the trained first text classification model of that type; for example, speech recognition text may be input into its corresponding model to obtain an interest tag for every sentence. Each type of first text content thus yields a first interest tag per text sentence or word, finally producing a plurality of first interest tags of different types.
Step S203: count the frequency of each first interest tag among all first interest tags of the same type, and calculate a weight value for each first interest tag within each type from the frequency statistics and the acquired interest tag information of all voice rooms.
The interest depiction for each user behavior may involve several types of first text content, and each type yields a first interest tag per text sentence or word, so each type may contain many repeated first interest tags. Counting the frequency of each first interest tag among all tags of the same type deduplicates them while preserving the distribution of tag counts. The weight value reflects the hit probability of each first interest tag: the higher the weight, the more likely the tag matches the user's interest orientation, and the lower the weight, the less likely. Optionally, the frequency of each first interest tag may be computed from the statistics, an adjustment coefficient set per type according to the volume of that type of text content, and the tag's frequency multiplied by the coefficient for its type to obtain its weight. Optionally, the frequency of each first interest tag may be computed from the statistics together with its inverse frequency relative to the interest tag information of all voice rooms, and the two multiplied to obtain the weight. Of course, these weight calculations are merely exemplary, and developers may trade computation cost against the precision of the interest depiction, which is not limited here.
Step S204: adjust the weight value of each first interest tag by a first preset proportion, and rank and integrate the plurality of weight-adjusted first interest tags to obtain the short-term interest portrait of the user.
The first preset proportion may be set according to the importance of the interest depiction for each user behavior, where the importance is determined by how strongly that behavior correlates with interest orientation. For example, with user behaviors of gift-giving, public-screen interaction, and room entry, the importance ranking may be gift-giving interest > public-screen interaction interest > room-entry interest, and the first preset proportion may be set to 3:2:1 accordingly: the weight values of first interest tags from gift-giving interest are multiplied by 3/6, those from public-screen interaction interest by 2/6, and those from room-entry interest by 1/6. The specific behaviors and proportion values above are merely exemplary, and developers may adapt them to the observed interest-prediction effect, which is not limited here. Ranking and integration comb the plurality of first interest tags into the user's short-term interest portrait: optionally, the weight-adjusted tags may simply be sorted by weight and combined; optionally, a preset number of tags may be selected per behavior from high weight to low, with the selections ranked and combined by weight, as in the sketch below. Of course, these generation methods are merely exemplary, and developers may adjust the integration according to the interest granularity required by the actual application scenario, which is not limited here.
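A minimal sketch of the proportional reweighting and ranked merge, using the 3:2:1 gift/public-screen/entry example above; all identifiers are illustrative. The same pattern applies to the other preset proportions described later (4:3:2:1, 4:1, and 5:3:1):

```python
def apply_proportions(tag_weights: dict[str, dict[str, float]],
                      proportions: dict[str, int]) -> list[tuple[str, float]]:
    """Scale each behavior's tag weights by its share of the preset ratio
    and merge everything into one list ranked by adjusted weight."""
    total = sum(proportions.values())
    merged: dict[str, float] = {}
    for behavior, tags in tag_weights.items():
        share = proportions[behavior] / total
        for tag, weight in tags.items():
            merged[tag] = merged.get(tag, 0.0) + weight * share
    return sorted(merged.items(), key=lambda kv: kv[1], reverse=True)

# The patent's 3:2:1 example: gift > public-screen interaction > room entry.
short_term_portrait = apply_proportions(
    {"gift": {"music": 0.6, "games": 0.2},
     "screen": {"music": 0.3, "travel": 0.4},
     "enter": {"games": 0.5}},
    {"gift": 3, "screen": 2, "enter": 1},
)
```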
Step S205: acquire the user's recorded historical short-term portraits and the room interest portrait of a followed voice room, and determine the long-term interest portrait of the user based on the short-term interest portrait, the historical short-term portraits, and the room interest portrait.
Step S206: combine the short-term interest portrait and the long-term interest portrait to obtain the user interest portrait.
Generating first interest tags from the different types of first text content makes full use of the voice room's multi-source information, improving the completeness and comprehensiveness of the tags so that they reflect the room's characteristics more fully. Calculating a weight value per tag effectively distinguishes how much each tag says about the user's interest orientation and provides a reasonable basis for combing the tags into the short-term interest portrait. Adjusting each first interest tag's weight proportionally fits the weights to the importance of the interest depiction for each user behavior, so the short-term interest portrait can be integrated accurately.
FIG. 3 is a flowchart of a user interest portrait generation method including a weight value calculation process according to an embodiment of the present application. As shown in FIG. 3, the method includes the following steps:
Step S301: acquire room text information corresponding to the voice room most recently accessed by a user, where the statistical time range of the room text information is determined based on the trigger time point of an active behavior event of the user.
Step S302: input the different types of first text content into the trained first text classification models of the corresponding types, respectively, to obtain a plurality of first interest tags of different types.
Step S303: count the frequency of each first interest tag among all first interest tags of the same type.
Step S304: calculate the frequency of each first interest tag from its frequency count and the total number of first interest tags of that type.
The frequency of a first interest tag may be calculated by dividing its count by the total number of first interest tags of the same type. It reflects the importance of the tag among all same-type tags, with importance rising as frequency rises.
Step S305: from the acquired interest tag information of all voice rooms, count the number of voice rooms in which each first interest tag appears, and calculate the inverse frequency of each first interest tag from that number and the total number of voice rooms.
If a first interest tag appears frequently in other voice rooms, it applies to many other users and does not discriminate well, so the less often it appears elsewhere the better; the inverse frequency should therefore fall as the tag's presence in other voice rooms rises. For example, the inverse frequency of each first interest tag may be calculated as:

IDF(tag) = log( N / (n_tag + 1) )

where N is the total number of voice rooms and n_tag is the number of voice rooms in which the tag appears. The +1 avoids a zero denominator. Of course, other decreasing functions, i.e., functions whose value falls as the argument rises, may also be used to calculate the inverse frequency, which is not limited here.
Step S306: multiply the frequency of each first interest tag by its inverse frequency to obtain the corresponding weight value.
The frequency grows as the tag appears more often among all same-type first interest tags, while the inverse frequency shrinks as the tag appears in more other voice rooms. Multiplying the two therefore surfaces the first interest tags that are highly distinctive for this user, meeting the need for personalization. A sketch of this calculation follows.
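A minimal sketch of this TF-IDF-style weight calculation, assuming the platform-wide interest tag information is available as one tag set per voice room; that data layout is an assumption, not specified by the patent:

```python
import math
from collections import Counter

def tag_weights(tag_counts: Counter,
                room_tag_sets: list[set[str]]) -> dict[str, float]:
    """Weight = frequency among all same-type tags, multiplied by
    log(total rooms / (rooms containing the tag + 1))."""
    total_tags = sum(tag_counts.values())
    total_rooms = len(room_tag_sets)
    weights: dict[str, float] = {}
    for tag, count in tag_counts.items():
        tf = count / total_tags
        rooms_with_tag = sum(1 for tags in room_tag_sets if tag in tags)
        idf = math.log(total_rooms / (rooms_with_tag + 1))
        weights[tag] = tf * idf
    return weights
```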
Step S307: adjust the weight value of each first interest tag by the first preset proportion, and rank and integrate the weight-adjusted first interest tags to obtain the short-term interest portrait of the user.
Step S308: acquire the user's recorded historical short-term portraits and the room interest portrait of a followed voice room, and determine the long-term interest portrait of the user from the short-term interest portrait, the historical short-term portraits, and the room interest portrait.
Step S309: combine the short-term interest portrait and the long-term interest portrait to obtain the user interest portrait.
Deriving each weight value from the frequency and inverse frequency of the first interest tag effectively yields highly distinctive, user-specific interest tags, giving the user a personalized interest depiction and raising the degree of personalization of the user interest portrait.
FIG. 4 is a flowchart of another user interest portrait generation method including a short-term interest portrait generation process according to an embodiment of the present application. As shown in FIG. 4, the method includes the following steps:
Step S401: acquire room text information corresponding to the voice room most recently accessed by a user, where the statistical time range of the room text information is determined based on the trigger time point of an active behavior event of the user.
Step S402: input the different types of first text content into the trained first text classification models of the corresponding types, respectively, to obtain a plurality of first interest tags of different types.
Step S403: count the frequency of each first interest tag among all first interest tags of the same type, and calculate a weight value for each first interest tag within each type from the frequency statistics and the acquired interest tag information of all voice rooms.
Step S404: acquire the user speech text corresponding to the voice room most recently accessed by the user, and input the user speech text into a trained second text classification model to obtain a plurality of second interest tags.
The user behaviors may further include the user's own speech. The user speech text may be obtained by recognizing the user's speech while in the voice room; inputting it into the second text classification model yields a second interest tag for each sentence, and the resulting second interest tags may be used to characterize the user's spoken interests.
Step S405: count the frequency of each second interest tag among all second interest tags, and calculate a weight value for each second interest tag from the frequency statistics and the acquired interest tag information of all users.
The second text classification model generates a second interest tag for each text sentence or word in the user speech text, so the text may contain many repeated second interest tags; counting the frequency of each tag among all second interest tags deduplicates them while preserving the distribution of tag counts. The weight value reflects the hit probability of each second interest tag: the higher the weight, the more likely the tag matches the user's interest orientation, and the lower the weight, the less likely. Optionally, the frequency of each second interest tag may be computed from the statistics and used directly as its weight. Optionally, the frequency may be combined with the tag's inverse frequency relative to the interest tag information of all users, the two being multiplied to obtain the weight. Of course, these weight calculations are merely exemplary, and developers may trade computation cost against the precision of the interest depiction, which is not limited here.
Step S406: adjust the weight values of the first interest tags and second interest tags by a second preset proportion, and rank and integrate the weight-adjusted first and second interest tags to obtain the short-term interest portrait of the user.
The second preset proportion may be set according to the importance of the interest depiction for each user behavior, determined by how strongly that behavior correlates with interest orientation. For example, with user behaviors of gift-giving, public-screen interaction, room entry, and user speech, the importance ranking may be gift-giving interest > speech interest > public-screen interaction interest > room-entry interest, and the second preset proportion may be set to 4:3:2:1 accordingly: the weight values of first interest tags from gift-giving interest are multiplied by 4/10, those of second interest tags from speech interest by 3/10, those of first interest tags from public-screen interaction interest by 2/10, and those of first interest tags from room-entry interest by 1/10 (the reweighting sketch after step S204 applies unchanged with these four behaviors). The specific behaviors and proportion values above are merely exemplary, and developers may adapt them to the observed interest-prediction effect, which is not limited here.
Step S407: acquire the user's recorded historical short-term portraits and the room interest portrait of a followed voice room, and determine the long-term interest portrait of the user from the short-term interest portrait, the historical short-term portraits, and the room interest portrait.
Step S408: combine the short-term interest portrait and the long-term interest portrait to obtain the user interest portrait.
Adding an interest depiction of the user's own speech exploits the voice room's multi-source information fully and comprehensively, improving the completeness of the interest tags so that they also reflect the user's interest characteristics during the stay. Adjusting the weights of the first and second interest tags proportionally fits them to the importance of the interest depiction for each user behavior, so the short-term interest portrait can be integrated accurately.
FIG. 5 is a flowchart of a user interest portrait generation method including a long-term interest portrait generation process according to an embodiment of the present application. As shown in FIG. 5, the method includes the following steps:
Step S501: acquire room text information corresponding to the voice room most recently accessed by a user, where the statistical time range of the room text information is determined based on the trigger time point of an active behavior event of the user.
Step S502: input the different types of first text content into the trained first text classification models of the corresponding types, respectively, to obtain a plurality of first interest tags of different types.
Step S503: count the frequency of each first interest tag among all first interest tags of the same type, and calculate a weight value for each first interest tag within each type from the frequency statistics and the acquired interest tag information of all voice rooms.
Step S504: adjust the weight value of each first interest tag by the first preset proportion, and rank and integrate the weight-adjusted first interest tags to obtain the short-term interest portrait of the user.
Step S505: acquire the user's recorded historical short-term portraits and the room interest portrait of a followed voice room.
Step S506: extract a first preset number of interest tags from the short-term interest portrait and the historical short-term portraits in descending order of weight value.
Extracting the first preset number of tags in descending weight order selects the most important tags in the short-term portraits and discards redundant low-importance tags, reasonably trimming the number of interest tags taken from the short-term portraits.
Step S507: extract a second preset number of interest tags from the room interest portrait in descending order of weight value.
This selects the most important tags in the room interest portrait and discards redundant low-importance tags, reasonably trimming the number of interest tags taken from the room interest portrait.
Step S508: adjust the weight values of the extracted interest tags by a third preset proportion, and rank and integrate the weight-adjusted tags to obtain the long-term interest portrait of the user.
The third preset proportion may be set according to the importance of the portraits of the two dimensions. The short-term portraits derive from the user's in-room behavior, whereas the room interest portrait relates only to a voice room the user has chosen to follow and may never actually enter, so the short-term portraits within the statistical time range sit closer to the user's interest orientation than the room interest portrait. For example, the importance ranking may be short-term portraits > room interest portrait, and the third preset proportion may be set to 4:1 accordingly: the weight values of tags from the short-term portraits are multiplied by 4/5 and those from the room interest portrait by 1/5, as in the sketch below. The specific proportion value is merely exemplary, and developers may adapt it to the observed interest-prediction effect, which is not limited here.
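A minimal sketch of steps S506 to S508 under the same assumptions (portraits as ranked (tag, weight) lists, the 4:1 example ratio, and illustrative preset counts n_short and n_room):

```python
def long_term_portrait(short_portraits: list[list[tuple[str, float]]],
                       room_portrait: list[tuple[str, float]],
                       n_short: int = 20,
                       n_room: int = 10) -> list[tuple[str, float]]:
    """Take the top-N tags of the short-term side (current + historical
    portraits) and of the room portrait, rescale by the 4:1 example ratio,
    and merge into one ranked list."""
    def top(portrait: list[tuple[str, float]], n: int):
        return sorted(portrait, key=lambda kv: kv[1], reverse=True)[:n]

    pool: dict[str, float] = {}
    for portrait in short_portraits:
        for tag, weight in top(portrait, n_short):
            pool[tag] = pool.get(tag, 0.0) + weight * 4 / 5
    for tag, weight in top(room_portrait, n_room):
        pool[tag] = pool.get(tag, 0.0) + weight * 1 / 5
    return sorted(pool.items(), key=lambda kv: kv[1], reverse=True)
```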
Step S509: combine the short-term interest portrait and the long-term interest portrait to obtain the user interest portrait.
Combining the short-term interest portraits with the room interest portrait of the voice room the user follows not only grasps the user's current interests effectively but also accounts for the interest characteristics the user attends to over the long term, depicting the user's interest orientation comprehensively and helping improve the accuracy of personalized recommendation and user satisfaction.
FIG. 6 is a flowchart of a user interest portrait generation method including a room interest portrait generation process according to an embodiment of the present application. As shown in FIG. 6, the method includes the following steps:
Step S601: acquire room text information corresponding to the voice room most recently accessed by a user, where the statistical time range of the room text information is determined based on the trigger time point of an active behavior event of the user.
Step S602: determine the short-term interest portrait of the user based on the room text information and the trained first text classification model.
Step S603: acquire the room text information of the voice room the user follows for its most recent opening, where that room text information includes second text content of different types.
This room text information may include several types of second text content corresponding to different information dimensions of the voice room, such as the speech recognition text, public-screen text, announcement text, and title text from the room's most recent opening.
Step S604: input the different types of second text content into the trained first text classification models of the corresponding types to obtain a plurality of third interest tags of different types.
Each type of second text content may be interest-tagged by the trained first text classification model of that type; for example, speech recognition text may be input into its corresponding model to obtain an interest tag for every sentence. Each type of second text content thus yields a third interest tag per text sentence or word, finally producing a plurality of third interest tags of different types.
Step S605: count the frequency of each third interest tag among all third interest tags of the same type, and calculate a weight value for each third interest tag within each type from the frequency statistics and the acquired interest tag information of all voice rooms.
Each type of second text content yields a third interest tag per text sentence or word, so each type may contain many repeated third interest tags; counting the frequency of each tag among all same-type third interest tags deduplicates them while preserving the distribution of tag counts. The weight value reflects the hit probability of each third interest tag: the higher the weight, the more likely the tag matches the voice room's topics of interest, and the lower the weight, the less likely.
Step S606, adjusting the weight value of each third interest tag according to a fourth preset proportion, and sorting and integrating the adjusted third interest tags to obtain the room interest portrait of the voice room.
The fourth preset proportion may be set according to the importance of each type of second text content, where importance may be judged by how strongly that text type correlates with the interest orientation and access habits of users in the voice room. For example, if the room text information includes speech recognition text, public screen text, title text, and announcement text, the importance ranking may be: title and announcement text > speech recognition text > public screen text, with a corresponding fourth preset proportion of 5:3:1. Accordingly, the weight values of third interest tags from the title and announcement text may be multiplied by 5/9, those from the speech recognition text by 3/9, and those from the public screen text by 1/9. These particular text types and proportion values are only exemplary; developers may adapt both according to the actual interest prediction effect, and the application is not limited in this respect.
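A minimal sketch of this proportion-based reweighting and merge, using the exemplary 5:3:1 proportion above; the text-type names and the function itself are illustrative:

```python
from typing import Dict, List, Tuple

def build_room_portrait(weights_by_type: Dict[str, Dict[str, float]],
                        proportions: Dict[str, float]) -> List[Tuple[str, float]]:
    """Scale each text type's tag weights by its share of the preset
    proportion, then merge and sort to form the room interest portrait."""
    total = sum(proportions.values())            # e.g. 5 + 3 + 1 = 9
    merged: Dict[str, float] = {}
    for text_type, weights in weights_by_type.items():
        scale = proportions[text_type] / total   # e.g. 5/9, 3/9 or 1/9
        for tag, w in weights.items():
            merged[tag] = merged.get(tag, 0.0) + w * scale
    # Sort tags by adjusted weight, highest first
    return sorted(merged.items(), key=lambda kv: kv[1], reverse=True)

# Example with the 5:3:1 proportion from the description (tag maps elided):
# portrait = build_room_portrait(
#     {"title_announcement": {...}, "speech": {...}, "public_screen": {...}},
#     {"title_announcement": 5, "speech": 3, "public_screen": 1})
```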
Step S607, acquiring the recorded historical short-term portrait of the user and the room interest portrait of the followed voice room, and determining the long-term interest portrait of the user according to the short-term interest portrait, the historical short-term portrait, and the room interest portrait.
Step S608, combining the short-term interest portrait and the long-term interest portrait to obtain the user interest portrait.
By generating the room's third interest tags from the room text information produced the last time the room was open, interest topics in the voice room are monitored in near real time; the user's interest orientation is captured indirectly through the interest characteristics of the rooms the user follows, which improves the flexibility of the user's interest depiction and provides an effective interest tag reference for the user's long-term interest portrait.
FIG. 7 is a block diagram of a user interest portrait generating device according to an embodiment of the present application. The device is configured to execute the user interest portrait generating method of the above embodiments and has functional modules and beneficial effects corresponding to that method. As shown in FIG. 7, the apparatus includes:
The acquisition module 101 is configured to acquire room text information corresponding to the voice room most recently accessed by the user, where the statistical time range of the room text information is determined based on the trigger time point of the user's active behavior event;
A short-term portrait generation module 102 configured to determine a short-term interest portrait of the user based on the room text information and the trained first text classification model;
A long-term portrait generation module 103 configured to acquire the recorded historical short-term portrait of the user and the room interest portrait of the followed voice room, and to determine a long-term interest portrait of the user based on the short-term interest portrait, the historical short-term portrait, and the room interest portrait;
The user portrait generation module 104 is configured to combine the short-term interest portrait with the long-term interest portrait to obtain the user interest portrait.
Through this scheme, the room text information corresponding to the voice room most recently accessed by the user is acquired, with its statistical time range determined based on the trigger time point of the user's active behavior event; the user's short-term interest portrait is determined based on the room text information and the trained first text classification model; the recorded historical short-term portrait of the user and the room interest portrait of the followed voice room are acquired, and the user's long-term interest portrait is determined from the short-term interest portrait, the historical short-term portrait, and the room interest portrait; finally, the short-term and long-term interest portraits are combined into the user interest portrait. Anchoring the statistical time range to the trigger time point of the user's active behavior event selects the reference information for the user's interests effectively and reasonably; applying text classification models of corresponding types to the diversified room text information captures real-time changes in the user's interests, so the short-term interest portrait is constructed soundly; and fusing the recorded short-term portraits with the room interest portraits of the voice rooms the user follows captures the user's long-term interest characteristics accurately, so the long-term interest portrait is constructed effectively. The result is a comprehensive and accurate user interest portrait, which facilitates recommending related content matching the user's interests and customizing a personalized interactive experience.
In one possible embodiment, the short-term portrait generation module 102 is further configured to:
respectively input the first text contents of different types into the trained first text classification models of the corresponding types to obtain a plurality of first interest tags of different types;
count the frequency of each first interest tag among all first interest tags of the same type, and calculate the weight value of each first interest tag within each type according to the frequency statistics and the obtained interest tag information of all voice rooms;
and adjust the weight value of each first interest tag according to a first preset proportion, and sort and integrate the adjusted first interest tags to obtain the short-term interest portrait of the user.
In one possible embodiment, the short-term portrait generation module 102 is further configured to:
calculate the frequency corresponding to each first interest tag according to the frequency statistics and the total number of first interest tags of each type;
count, from the obtained interest tag information of all voice rooms, the number of voice rooms in which each first interest tag appears, and calculate the reverse frequency corresponding to each first interest tag according to this number and the total number of voice rooms;
and multiply the frequency corresponding to each first interest tag by the reverse frequency to obtain the corresponding weight value.
In one possible embodiment, the device further comprises a second short-term portrait generation module configured to:
acquire the user voice text corresponding to the voice room most recently accessed by the user, and input the user voice text into the trained second text classification model to obtain a plurality of second interest tags;
count the frequency of each second interest tag among all second interest tags, and calculate the weight value of each second interest tag according to the frequency statistics and the obtained interest tag information corresponding to all users;
and adjust the weight values of the first interest tags and the second interest tags according to a second preset proportion, and sort and integrate the adjusted first and second interest tags to obtain the short-term interest portrait of the user.
In one possible embodiment, the long-term portrait generation module is further configured to:
extract a first preset number of interest tags from the short-term interest portrait and the historical short-term portrait in descending order of weight value;
extract a second preset number of interest tags from the room interest portrait in descending order of weight value;
and adjust the weight values of the extracted interest tags according to a third preset proportion, and sort and integrate the adjusted interest tags to obtain the long-term interest portrait of the user.
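A sketch of this long-term construction under stated assumptions: the top-k extraction and proportion-based rescaling follow the steps above, while k1, k2, and the 2:1:1 proportion are placeholders, not values from this application:

```python
from typing import Dict, List, Tuple

def top_k(portrait: Dict[str, float], k: int) -> List[Tuple[str, float]]:
    """Take the k highest-weighted interest tags from a portrait."""
    return sorted(portrait.items(), key=lambda kv: kv[1], reverse=True)[:k]

def build_long_term_portrait(short_term: Dict[str, float],
                             historical: Dict[str, float],
                             room: Dict[str, float],
                             k1: int = 10, k2: int = 5,
                             proportion: Tuple[float, float, float] = (2.0, 1.0, 1.0)
                             ) -> List[Tuple[str, float]]:
    """Extract top-k1 tags from the short-term and historical short-term
    portraits and top-k2 tags from the room interest portrait, rescale by
    a third preset proportion, then merge and sort."""
    total = sum(proportion)
    sources = [(top_k(short_term, k1), proportion[0] / total),
               (top_k(historical, k1), proportion[1] / total),
               (top_k(room, k2), proportion[2] / total)]
    merged: Dict[str, float] = {}
    for tags, scale in sources:
        for tag, w in tags:
            merged[tag] = merged.get(tag, 0.0) + w * scale
    return sorted(merged.items(), key=lambda kv: kv[1], reverse=True)
```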
In one possible embodiment, the device further comprises a room portrait generation module configured to:
acquire room text information generated the last time a voice room followed by the user was open, where the room text information includes second text contents of different types;
input the second text contents of different types into the trained first text classification models of the corresponding types to obtain a plurality of third interest tags of different types;
count the frequency of each third interest tag among all third interest tags of the same type, and calculate the weight value of each third interest tag within each type according to the frequency statistics and the obtained interest tag information of all voice rooms;
and adjust the weight value of each third interest tag according to a fourth preset proportion, and sort and integrate the adjusted third interest tags to obtain the room interest portrait of the voice room.
FIG. 8 is a schematic structural diagram of a user interest portrait generating device according to an embodiment of the present application. As shown in FIG. 8, the device includes a processor 201, a memory 202, an input device 203, and an output device 204; the number of processors 201 in the device may be one or more, with one processor 201 taken as an example in FIG. 8. The processor 201, memory 202, input device 203, and output device 204 in the device may be connected by a bus or other means; a bus connection is taken as an example in FIG. 8. The memory 202 is a computer-readable storage medium and may be configured to store software programs, computer-executable programs, and modules, such as the program instructions/modules corresponding to the user interest portrait generation method in the embodiments of the present application. The processor 201 executes the various functional applications and data processing of the device by running the software programs, instructions, and modules stored in the memory 202, thereby implementing the user interest portrait generation method described above. The input device 203 may be configured to receive input numeric or character information and to generate key signal inputs related to user settings and function control of the device. The output device 204 may include a display device such as a display screen.
The embodiment of the present application also provides a non-volatile storage medium containing computer-executable instructions which, when executed by a computer processor, are configured to perform the user interest portrait generation method described in the above embodiments, including: acquiring room text information corresponding to the voice room most recently accessed by the user, wherein the statistical time range of the room text information is determined based on the trigger time point of the user's active behavior event; determining a short-term interest portrait of the user based on the room text information and the trained first text classification model; acquiring the recorded historical short-term portrait of the user and the room interest portrait of the followed voice room, and determining a long-term interest portrait of the user according to the short-term interest portrait, the historical short-term portrait, and the room interest portrait; and combining the short-term interest portrait and the long-term interest portrait to obtain the user interest portrait.
It should be noted that, in the embodiment of the user interest portrait generating device, the units and modules included are divided only according to functional logic and are not limited to the above division, as long as the corresponding functions can be implemented; in addition, the specific names of the functional units are only for ease of distinguishing them from one another and are not intended to limit the protection scope of the embodiments of the present application.
In some possible embodiments, aspects of the method provided by the present application may also be implemented in the form of a program product, which includes program code configured to cause a computer device to perform the steps of the methods according to the various exemplary embodiments of the present application described in this specification when the program product runs on the computer device; for example, the computer device may perform the user interest portrait generating method described in the embodiments of the present application. The program product may be implemented using any combination of one or more readable media.

Claims (10)

1. A method for generating a user interest portrait, comprising:
acquiring room text information corresponding to the voice room most recently accessed by a user, wherein the statistical time range of the room text information is determined based on the trigger time point of the user's active behavior event;
determining a short-term interest portrait of the user based on the room text information and the trained first text classification model;
acquiring a recorded historical short-term portrait of the user and a room interest portrait of a followed voice room, and determining a long-term interest portrait of the user according to the short-term interest portrait, the historical short-term portrait and the room interest portrait;
combining the short-term interest portrait and the long-term interest portrait to obtain a user interest portrait.
2. The user interest portrait generating method according to claim 1, wherein the room text information includes different types of first text contents, and the determining the user's short-term interest portrait based on the room text information and the trained first text classification model comprises:
respectively inputting the first text contents of different types into trained first text classification models of the corresponding types to obtain a plurality of first interest tags of different types;
performing frequency statistics on each first interest tag among all first interest tags of the same type, and calculating a weight value of each first interest tag within each type according to the frequency statistics result and the obtained interest tag information of all voice rooms;
and adjusting the weight value of each first interest tag according to a first preset proportion, and sorting and integrating the adjusted first interest tags to obtain a short-term interest portrait of the user.
3. The user interest portrait generating method according to claim 2, wherein the calculating a weight value of each first interest tag within each type according to the frequency statistics result and the obtained interest tag information of all voice rooms comprises:
calculating the frequency corresponding to each first interest tag according to the frequency statistics result and the total number of first interest tags of each type;
counting, from the obtained interest tag information of all voice rooms, the number of voice rooms in which each first interest tag appears, and calculating the reverse frequency corresponding to each first interest tag according to the number and the total number of voice rooms;
multiplying the frequency corresponding to each first interest tag by the reverse frequency to obtain the corresponding weight value.
4. The user interest portrait generating method of claim 2, further comprising:
acquiring a user voice text corresponding to the voice room most recently accessed by the user, and inputting the user voice text into a trained second text classification model to obtain a plurality of second interest tags;
counting the frequency of each second interest tag among all second interest tags, and calculating the weight value of each second interest tag according to the frequency statistics result and the obtained interest tag information corresponding to all users;
and adjusting the weight values of each first interest tag and each second interest tag according to a second preset proportion, and sorting and integrating the adjusted first and second interest tags to obtain a short-term interest portrait of the user.
5. The user interest portrait generating method of claim 2, wherein the determining a long-term interest portrait of the user based on the short-term interest portrait, the historical short-term portrait and the room interest portrait comprises:
extracting a first preset number of interest tags from the short-term interest portrait and the historical short-term portrait in descending order of weight value;
extracting a second preset number of interest tags from the room interest portrait in descending order of weight value;
and adjusting the weight values of the extracted interest tags according to a third preset proportion, and sorting and integrating the adjusted interest tags to obtain a long-term interest portrait of the user.
6. The user interest portrait generating method according to any one of claims 1 to 5, further comprising, prior to the acquiring the recorded historical short-term portrait of the user and the room interest portrait of the followed voice room:
acquiring room text information generated the last time a voice room followed by the user was open, wherein the room text information includes second text contents of different types;
inputting the second text contents of different types into the trained first text classification models of the corresponding types to obtain a plurality of third interest tags of different types;
performing frequency statistics on each third interest tag among all third interest tags of the same type, and calculating a weight value of each third interest tag within each type according to the frequency statistics result and the obtained interest tag information of all voice rooms;
and adjusting the weight value of each third interest tag according to a fourth preset proportion, and sorting and integrating the adjusted third interest tags to obtain the room interest portrait of the voice room.
7. A user interest portrait generating apparatus, comprising:
an acquisition module configured to acquire room text information corresponding to the voice room most recently accessed by a user, where the statistical time range of the room text information is determined based on the trigger time point of the user's active behavior event;
a short-term portrait generation module configured to determine a short-term interest portrait of the user based on the room text information and a trained first text classification model;
a long-term portrait generation module configured to acquire a recorded historical short-term portrait of the user and a room interest portrait of a followed voice room, and to determine a long-term interest portrait of the user based on the short-term interest portrait, the historical short-term portrait and the room interest portrait;
a user portrait generation module configured to combine the short-term interest portrait and the long-term interest portrait to obtain a user interest portrait.
8. A user interest portrait generating device, the device comprising: one or more processors; and a storage device configured to store one or more programs that, when executed by the one or more processors, cause the one or more processors to implement the user interest portrait generating method of any one of claims 1-6.
9. A non-transitory storage medium storing computer-executable instructions which, when executed by a computer processor, are configured to perform the user interest portrait generating method of any one of claims 1-6.
10. A computer program product comprising a computer program which, when executed by a processor, implements the user interest portrait generating method according to any one of claims 1-6.