CN112802454B - Method and device for recommending awakening words, terminal equipment and storage medium - Google Patents

Method and device for recommending awakening words, terminal equipment and storage medium

Info

Publication number
CN112802454B
Authority
CN
China
Prior art keywords
score
alternative
keyword
behavior
recommendation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011633865.3A
Other languages
Chinese (zh)
Other versions
CN112802454A (en
Inventor
曹金磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Volkswagen Mobvoi Beijing Information Technology Co Ltd
Original Assignee
Volkswagen Mobvoi Beijing Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Volkswagen Mobvoi Beijing Information Technology Co Ltd filed Critical Volkswagen Mobvoi Beijing Information Technology Co Ltd
Priority to CN202011633865.3A priority Critical patent/CN112802454B/en
Publication of CN112802454A publication Critical patent/CN112802454A/en
Application granted granted Critical
Publication of CN112802454B publication Critical patent/CN112802454B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/02: Feature extraction for speech recognition; Selection of recognition unit
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33: Querying
    • G06F16/335: Filtering based on additional data, e.g. user or group profiles
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/20: Natural language analysis
    • G06F40/205: Parsing
    • G06F40/216: Parsing using statistical methods
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/20: Natural language analysis
    • G06F40/279: Recognition of textual entities
    • G06F40/289: Phrasal analysis, e.g. finite state techniques or chunking
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/26: Speech to text systems

Abstract

The embodiment of the invention discloses a recommendation method, a recommendation device, terminal equipment and a storage medium of a wake-up word, wherein the method comprises the following steps: acquiring network data of a target user within a preset time period; acquiring alternative keywords in each content display page, and acquiring a text importance score of each alternative keyword according to the word frequency and the inverse text frequency index; determining a behavior type score of each alternative keyword in at least one content presentation page according to a user behavior type corresponding to the at least one content presentation page; determining a recommendation score of each alternative keyword according to the text importance score and the behavior type score of each alternative keyword; and determining a recommended awakening word matched with the target user according to the recommendation score of each alternative keyword, and displaying the recommended awakening word to the target user. According to the technical scheme disclosed in the embodiment of the invention, personalized push aiming at different users is realized according to the actual concern points and interest points of the users, and the human-computer interaction experience of the users is improved.

Description

Method and device for recommending awakening words, terminal equipment and storage medium
Technical Field
The embodiment of the invention relates to the field of voice interaction, in particular to a method and a device for recommending a wake-up word, terminal equipment and a storage medium.
Background
With the continuous progress of science and technology, the speech recognition technology is rapidly developed, and the technology support is provided for the speech interaction between the user and the intelligent terminal equipment.
The man-machine interaction between a user and intelligent terminal equipment is very similar to the communication between people, and comprises a plurality of links such as awakening, responding, inputting, understanding and feeding back, wherein the awakening is the first contact point of the user and the terminal equipment interaction each time, the experience of the awakening link is crucial in the whole voice interaction process, and the first impression of the user on a product is directly influenced by the quality of the experience.
In the prior art, a certain word group is usually used as a wakeup word, for example, "question and answer start" is used as a wakeup word, or a plurality of fixed word group combinations are recommended to a user, for example, "question and answer start", "answer please" and "ask me", and the like, and the user selects the certain wakeup word, but such a recommendation method cannot recommend the matched wakeup word according to the actual needs of the user, cannot realize personalized push for different users, and is poor in human-computer interaction experience of the user.
Disclosure of Invention
The embodiment of the invention provides a recommendation method, device and equipment of awakening words and a storage medium, so as to recommend personalized voice awakening words to a user.
In a first aspect, an embodiment of the present invention provides a method for recommending a wakeup word, including:
acquiring network data of a target user within a preset time period; the network data comprises at least one content display page and a user behavior type corresponding to the at least one content display page;
acquiring alternative keywords in each content display page, and acquiring a text importance score of each alternative keyword according to a word frequency and an inverse text frequency index;
determining a behavior type score of each alternative keyword in the at least one content presentation page according to a user behavior type corresponding to the at least one content presentation page;
determining a recommendation score of each alternative keyword according to the text importance score and the behavior type score of each alternative keyword;
and determining a recommended awakening word matched with the target user according to the recommendation score of each alternative keyword, and displaying the recommended awakening word to the target user.
In a second aspect, an embodiment of the present invention provides a device for recommending a wakeup word, including:
the network data acquisition module is used for acquiring network data of a target user within a preset time period; the network data comprises at least one content display page and a user behavior type corresponding to the at least one content display page;
the text importance score acquisition module is used for acquiring the alternative keywords in each content display page and acquiring the text importance score of each alternative keyword according to the word frequency and the inverse text frequency index;
a behavior type score obtaining module, configured to determine a behavior type score of each candidate keyword in the at least one content presentation page according to a user behavior type corresponding to the at least one content presentation page;
a recommendation score obtaining module, configured to determine a recommendation score of each candidate keyword according to the text importance score and the behavior type score of each candidate keyword;
and the awakening word display module is used for determining the recommended awakening words matched with the target user according to the recommendation scores of the alternative keywords and displaying the recommended awakening words to the target user.
In a third aspect, an embodiment of the present invention provides a terminal device, including:
one or more processors;
a storage device for storing one or more programs,
when the one or more programs are executed by the one or more processors, the one or more processors implement the method for recommending a wake-up word according to any embodiment of the present invention.
In a fourth aspect, an embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements a method for recommending a wake word according to any embodiment of the present invention.
According to the technical solution disclosed in the embodiments of the present invention, alternative keywords are obtained from each content display page based on the user's network data over a past period of time, and a text importance score is calculated for each of them. A recommendation score for each alternative keyword is then obtained by combining the text importance score with the corresponding behavior type score, and the recommended wake-up word displayed to the user is determined from the recommendation scores. Personalized recommendation for different users is thereby achieved according to each user's actual points of concern and interest, and the user's human-computer interaction experience is improved.
Drawings
Fig. 1 is a flowchart of a method for recommending a wakeup word according to an embodiment of the present invention;
Fig. 2 is a block diagram of a device for recommending a wake-up word according to a second embodiment of the present invention;
Fig. 3 is a block diagram of a terminal device according to a third embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not to be construed as limiting the invention. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures.
Example one
Fig. 1 is a flowchart of a method for recommending a wakeup word according to an embodiment of the present invention, where this embodiment is applicable to recommend a voice wakeup word related to a user according to network data of the user, and the method may be executed by a device for recommending a wakeup word according to an embodiment of the present invention, where the device may be implemented by software and/or hardware and integrated in a terminal device or a server, and the method specifically includes the following steps:
S110, acquiring network data of a target user within a preset time period; the network data comprises at least one content display page and a user behavior type corresponding to the at least one content display page.
Network data includes page information obtained by the user through the terminal device, for example, a display page of nearby road conditions that the user searched for through a vehicle-mounted terminal device. It also includes page information obtained through a mobile phone application (APP) or PC (Personal Computer) software program associated with the terminal device, and page information obtained through a mobile phone browser or PC browser, for example, page information obtained in web page form through a browser on a mobile phone or PC while logged in with the registered account of the terminal device. The identity of the user can be distinguished by the registered account of the terminal device. In the embodiment of the present invention, the terminal device is an electronic device with a voice interaction function, and the type of the terminal device is not particularly limited.
Different user behaviors, such as browsing behavior, searching behavior, commenting behavior, like behavior, collecting behavior, shopping cart adding behavior and purchasing behavior, each produce corresponding content display pages. For example, after the user performs a purchasing behavior, the corresponding order page and payment page are obtained; after the user performs a searching behavior, the corresponding search result page is obtained. Therefore, for each obtained content display page, the user behavior type corresponding to it, that is, the user behavior type that triggered the page, can be obtained; for example, content display page A is obtained through a browsing behavior of the user, and content display page B is obtained through a searching behavior of the user. In particular, the browsing behavior includes an audio playing behavior. The audio playing behavior covers audio that the user obtains by accessing the internet and local audio played through the vehicle-mounted terminal device; such audio has a content display page that shows information such as the type of the audio (for example, music, crosstalk or storytelling), the song title, the dialogue and/or the performer. Key information of the audio content may be extracted as alternative keywords. For example, when the type of the audio is music, the name of the music, the noun that appears most frequently in the music, the name of the singer, or the name of the favorite of the singer may be extracted; when the type of the audio is storytelling, the title of the story, the name of its protagonist, the name of its author, or the name of its most popular character may be extracted.
The network data within the preset time period reflects what the user has paid attention to over the past period; it is strongly related to the user and likely to arouse the user's interest. The specific duration can be set as needed; for example, with a preset time period of 5 days, the network data of the target user over the past 5 days is acquired. In particular, the preset time period can be tied to the user's activity level. For a highly active user, the preset time period can be set to a small value, because rich network data reflecting the user's points of concern and interest can be collected in a short time. For a less active user, the preset time period needs to be set to a larger value, because a longer time is needed to collect enough network data to reflect the user's points of concern and interest accurately. The activity level of the user can be determined from the user's average daily network access time.
S120, obtaining the alternative keywords in each content display page, and obtaining the text importance score of each alternative keyword according to the word frequency and the inverse text frequency index.
The word frequency and the inverse text frequency index reflect the importance of a word across all content presentation pages. Term Frequency (TF) represents how often a word appears in a given content presentation page; the larger the value, the more important the word is within that page. For example, if the word "puppet cat" appears 20 times in a content presentation page that has 100 words in total, the TF value is 20/100 = 0.2. The Inverse Document Frequency (IDF) index measures the general importance of a word: the fewer content presentation pages contain the word, the larger the IDF, indicating that the word better distinguishes the pages in which it appears. The IDF value is obtained by dividing the total number of content presentation pages by the number of content presentation pages containing the word and then taking the base-10 logarithm of the quotient. For example, if the total number of content presentation pages is 100 and 10 of them contain "puppet cat", the IDF value is calculated as
$\mathrm{IDF} = \log_{10}(100 / 10) = 1$
Finally, TF is multiplied by IDF to obtain the TF-IDF value, that is, the text importance score of the word; continuing the example above, the text importance score of the word "puppet cat" is 0.2 × 1 = 0.2.
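The TF-IDF calculation described above can be sketched in Python as follows. This is only a minimal illustration under stated assumptions: each page is assumed to be already segmented into a list of words, and the function names (`term_frequency`, `inverse_document_frequency`, `text_importance_score`) are hypothetical and do not come from the patent.

```python
import math
from collections import Counter


def term_frequency(word, page_words):
    """TF: occurrences of `word` divided by the total number of words on the page."""
    if not page_words:
        return 0.0
    return Counter(page_words)[word] / len(page_words)


def inverse_document_frequency(word, pages):
    """IDF: base-10 logarithm of (total pages / pages containing `word`)."""
    containing = sum(1 for page in pages if word in page)
    if containing == 0:
        # Assumption: a word that appears on no page gets zero weight.
        return 0.0
    return math.log10(len(pages) / containing)


def text_importance_score(word, page_words, pages):
    """Text importance score = TF x IDF."""
    return term_frequency(word, page_words) * inverse_document_frequency(word, pages)


# Reproduce the worked example: 20 occurrences in a 100-word page,
# and 10 of 100 pages contain the word.
page = ["puppet cat"] * 20 + ["other"] * 80
pages = [page] + [["puppet cat"]] * 9 + [["other"]] * 90
score = text_importance_score("puppet cat", page, pages)
```

With these inputs the TF is 0.2 and the IDF is 1, so `score` matches the 0.2 obtained in the text.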
A content display page usually contains a large amount of vocabulary, and not every word in a page needs to be treated as an alternative keyword. Instead, the words that occur most often in each page can be screened out as alternative keywords, which reduces the computing load on the terminal device or server. Specifically, obtaining the alternative keywords in each content display page includes: acquiring, in each content display page, the alternative keywords whose number of occurrences is greater than or equal to a preset minimum number of occurrences; or, in each content display page, sorting the words in descending order of number of occurrences and acquiring a first preset number of alternative keywords according to that order. That is, the words whose number of occurrences in a content display page reaches a certain threshold (for example, 5 times) can be used as alternative keywords; alternatively, a required number of alternative keywords per content display page, namely the first preset number (for example, 3), can be set, so that the 3 most frequent words in each content display page are used as alternative keywords. Either way, the number of alternative keywords is reduced, and with it the computing load on the terminal device or server.
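The two screening strategies can be illustrated with a short Python sketch. The function names and the sample word list are hypothetical; the default values 5 and 3 are the example values given in the text, and the page is again assumed to be a pre-segmented list of words.

```python
from collections import Counter


def candidates_by_min_count(page_words, min_count=5):
    """Keep the words that occur at least `min_count` times on the page."""
    counts = Counter(page_words)
    return [word for word, count in counts.items() if count >= min_count]


def candidates_by_top_n(page_words, top_n=3):
    """Keep the `top_n` most frequent words on the page, most frequent first."""
    return [word for word, _ in Counter(page_words).most_common(top_n)]


# Hypothetical page content for illustration only.
words = ["cat"] * 6 + ["dog"] * 5 + ["fish"] * 2 + ["bird"]
frequent = candidates_by_min_count(words)  # words seen at least 5 times
top_three = candidates_by_top_n(words)     # the 3 most frequent words
```

The threshold variant yields a variable number of keywords per page, while the top-N variant bounds the per-page work; which is preferable depends on how uneven the page lengths are.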
S130, determining the behavior type score of each candidate keyword in the at least one content display page according to the user behavior type corresponding to the at least one content display page.
Different user behaviors reflect different degrees of user attention. For example, a browsing behavior reflects only general attention, whereas a purchasing behavior clearly indicates a point of interest that the user cares about strongly. Different behavior type scores are therefore assigned to different user behavior types. For example, for the browsing behavior, searching behavior, commenting behavior, like behavior, collecting behavior, shopping cart adding behavior and purchasing behavior in the above technical solution, the behavior type scores increase in that order and are set to 0.4, 0.5, 0.6, 0.7, 0.8, 0.9 and 1, respectively.
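The example score assignment can be written as a simple lookup table. In the sketch below, the English behavior identifiers (`"browse"`, `"search"` and so on) and the function name are hypothetical labels chosen for illustration; only the seven score values come from the text.

```python
# Example behavior type scores from the text, keyed by hypothetical identifiers.
BEHAVIOR_TYPE_SCORES = {
    "browse": 0.4,
    "search": 0.5,
    "comment": 0.6,
    "like": 0.7,
    "collect": 0.8,
    "add_to_cart": 0.9,
    "purchase": 1.0,
}


def behavior_type_score(behavior_type):
    """Return the score for the behavior that triggered a content display page."""
    try:
        return BEHAVIOR_TYPE_SCORES[behavior_type]
    except KeyError:
        raise ValueError(f"unknown user behavior type: {behavior_type!r}")
```

The text does not say how to combine scores when the same keyword appears on pages triggered by different behaviors, so this sketch scores one keyword and page pairing at a time.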
S140, determining the recommendation score of each candidate keyword according to the text importance score and the behavior type score of each candidate keyword.
The text importance score and the behavior type score of an alternative keyword are multiplied, and the product is the recommendation score of that keyword. The higher the recommendation score, the more important the word is in the text, the more closely it is associated with the user, and the better it matches the user's points of concern and interest.
Optionally, in this embodiment of the present invention, after determining, according to a user behavior type corresponding to the at least one content presentation page, a behavior type score of each candidate keyword in the at least one content presentation page, the method further includes: determining the behavior frequency score of each alternative keyword according to the user behavior frequency in unit time corresponding to each alternative keyword; determining a recommendation score of each candidate keyword according to the text importance score and the behavior type score of each candidate keyword, wherein the determining of the recommendation score of each candidate keyword comprises the following steps: and determining the recommendation score of each alternative keyword according to the text importance score, the behavior type score and the behavior frequency score of each alternative keyword.
The unit time is related to the preset time period in the above technical solution: if the preset time period is the past several days (that is, "day" is the time unit), the unit time can be set to one day; if the preset time period is the past several months (that is, "month" is the time unit), the unit time can be set to one month. The user behavior frequency per unit time corresponding to an alternative keyword reflects how strongly the keyword influences the user: the higher the frequency, the greater the influence. For example, if within one day alternative keyword A appears 2 times in the content presentation pages obtained through the user's network behavior and alternative keyword B appears 50 times, keyword B clearly influences the user more than keyword A. A behavior frequency score can be assigned according to the band into which the user behavior frequency per unit time falls. For example, when the frequency is 0 to 10 times, the behavior frequency score is 1; when it is 11 to 50 times, the score is 1.5; when it is 51 to 100 times, the score is 1.8; and when it is more than 100 times, the score is 2. The text importance score, the behavior type score and the behavior frequency score of the alternative keyword are then multiplied, and the product is the recommendation score of the alternative keyword.
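A minimal Python sketch of the frequency bands and the product rule follows. The function names are hypothetical, and the sketch assumes that the boundary value 50 belongs to the 11 to 50 band; the band scores themselves are the example values from the text.

```python
def behavior_frequency_score(times_per_unit):
    """Map the user behavior frequency per unit time to the example band scores."""
    if times_per_unit <= 10:
        return 1.0
    if times_per_unit <= 50:  # assumption: 50 falls in the 11 to 50 band
        return 1.5
    if times_per_unit <= 100:
        return 1.8
    return 2.0


def recommendation_score(text_importance, behavior_type, behavior_frequency=1.0):
    """Recommendation score as the product of the individual scores.

    Leaving `behavior_frequency` at 1.0 gives the two-factor variant
    (text importance score x behavior type score).
    """
    return text_importance * behavior_type * behavior_frequency


# Keyword B from the example appears 50 times in one day.
score_b = recommendation_score(0.2, 1.0, behavior_frequency_score(50))
```

Because every factor is multiplicative, a keyword with a zero text importance score can never be recommended, however often it appears.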
Optionally, in this embodiment of the present invention, after determining, according to a user behavior type corresponding to the at least one content presentation page, a behavior type score of each candidate keyword in the at least one content presentation page, the method further includes: obtaining interest attenuation scores of each alternative keyword; wherein the interest decay score is related to the interval time between the acquisition time of the candidate keyword and the current time; determining a recommendation score of each candidate keyword according to the text importance score and the behavior type score of each candidate keyword, wherein the determining of the recommendation score of each candidate keyword comprises the following steps: determining a recommendation score of each alternative keyword according to the text importance score, the behavior type score and the interest attenuation score of each alternative keyword; or determining the recommendation score of each alternative keyword according to the text importance score, the behavior type score, the behavior frequency score and the interest decay score of each alternative keyword.
A user's interest in something usually declines over time. For example, suppose the user browsed a web page five days ago from which the alternative keyword "puppet cat" was obtained, and browsed another web page one day ago from which the alternative keyword "Persian cat" was obtained. The content browsed one day ago clearly predicts the user's current point of interest better than the content browsed five days ago. The interest decay score of each alternative keyword can therefore be determined from the interval time between the acquisition time of the keyword and the current time. For example, if the preset time period is 5 days, the interval time can be 5 days, 4 days, 3 days, 2 days or 1 day, and the corresponding interest decay scores can be set to 0.6, 0.7, 0.8, 0.9 and 1, respectively; that is, the interest decay score decreases as the interval time increases.
Optionally, in this embodiment of the present invention, the obtaining an interest decay score of each candidate keyword includes: obtaining interest attenuation score of each candidate keyword based on the following formula
$n_i = k_i \times \exp(-m_i \times t_i)$ (Formula 1-1)
where $m_i$ is the attenuation coefficient, $t_i$ is the interval time between the acquisition time of the alternative keyword and the current time, $k_i$ is the initial interest score of the alternative keyword, $n_i$ is the interest decay score of the alternative keyword, exp is the exponential function with base e, and i is the number of the user behavior type corresponding to the alternative keyword.
Taking the above technical solution as an example, the browsing behavior, searching behavior, commenting behavior, like behavior, collecting behavior, shopping cart adding behavior and purchasing behavior are numbered 1 to 7, respectively; that is, i takes the values 1 to 7. Different user behavior types have different initial interest scores: the higher the behavior type score, the higher the corresponding initial interest score, and the higher the interest decay score at the same interval time. For example, a browsing behavior reflects only general attention, so the initial interest score of an alternative keyword under this behavior type is low, and after decay over time the interest decay score is also low. A purchasing behavior clearly indicates a point of interest that the user cares about strongly, so the initial interest score of an alternative keyword under this behavior type is high, and the interest decay score may still remain high even after a period of decay. Therefore, for each user behavior type, an initial interest score and the interest decay score at an interval time equal to the preset time period are set in advance; the attenuation coefficient matching each user behavior type is then calculated, which yields the interest decay score formula for each user behavior type.
For example, the initial interest score $k_1$ for the browsing behavior (corresponding to an i value of 1) is set to 100, the preset time period is 5 days, and the interest decay score after 5 days is expected to be 1; that is, when $t_1$ is 5, $n_1$ is 1. The corresponding attenuation coefficient can therefore be calculated as $m_1 = \ln(100)/5 \approx 0.921$, and Formula 1-1 becomes $n_1 = 100 \times \exp(-0.921 \times t_1)$. The interest decay score $n_1$ of each alternative keyword can then be obtained from the interval time $t_1$ between its acquisition time and the current time.
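The calibration of the attenuation coefficient and the evaluation of Formula 1-1 can be sketched in Python as follows. The function names are hypothetical; the inputs 100, 1 and 5 are the example values for the browsing behavior given in the text.

```python
import math


def attenuation_coefficient(initial_score, target_score, period):
    """Solve target = initial * exp(-m * period) for the attenuation coefficient m."""
    return math.log(initial_score / target_score) / period


def interest_decay_score(initial_score, coefficient, interval):
    """Formula 1-1: n = k * exp(-m * t)."""
    return initial_score * math.exp(-coefficient * interval)


# Browsing behavior example: k_1 = 100, and n_1 should equal 1 when t_1 = 5 days.
m1 = attenuation_coefficient(100, 1, 5)
score_after_one_day = interest_decay_score(100, m1, 1)
score_after_five_days = interest_decay_score(100, m1, 5)
```

Here `m1` evaluates to ln(100)/5, which rounds to the 0.921 quoted in the text, and the score after five days returns to the target value of 1.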
S150, according to the recommendation score of each alternative keyword, determining a recommended awakening word matched with the target user, and displaying the recommended awakening word to the target user.
When a user wakes up the terminal device or modifies the wake-up word of the terminal device, the one or more alternative keywords with the highest recommendation scores are pushed to the user as recommended wake-up words by TTS (Text To Speech) broadcast and/or on-screen display.
Optionally, in this embodiment of the present invention, determining a recommended wake-up word matched with the target user according to the recommendation score of each alternative keyword and displaying the recommended wake-up word to the target user includes: classifying the alternative keywords, and acquiring the alternative keyword with the highest recommendation score in each classification category as an alternative recommended word; and acquiring, from the alternative recommended words, a second preset number of wake-up recommended words with the highest recommendation scores, and displaying them to the target user. Classifying the alternative keywords includes clustering the obtained alternative keywords, for example, clustering them with a word-to-vector (word2vec) model to form different classification categories; it also includes presetting a plurality of classification categories and, after the alternative keywords are obtained, assigning them to those categories according to word sense. By classifying the alternative keywords and acquiring wake-up recommended words from each classification category, wake-up words that are strongly relevant to the user are recommended from several different angles, which widens the range covered by the recommended words and avoids the poor diversity of recommending wake-up words from a single category.
For example, suppose most of the network data obtained by the user over the past 5 days is related to cats, and according to the recommendation scores the highest-scoring alternative keywords are, in order, "puppet cat", "Persian cat", "Bengal cat", "skateboard" and "lipstick". If the number of wake-up recommended words (that is, the second preset number) is 3, selecting purely by score would give "puppet cat", "Persian cat" and "Bengal cat" as the recommended wake-up words. However, these three alternative keywords all belong to the same category (namely "cat"); even if only one of them is recommended (that is, only "puppet cat", which has the highest recommendation score), the user can think of the other two by association and can still set a wake-up word that suits them based on the recommendation. Therefore, selecting across different classification categories, the three recommended wake-up words finally determined are "puppet cat" (category "cat"), "skateboard" (category "sports equipment") and "lipstick" (category "cosmetics"), which realizes multi-angle recommendation of wake-up words.
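The per-category selection in this example can be sketched in Python as follows. The function name, the numeric scores and the hand-written category mapping are made-up illustration values; the text only fixes the ranking order and the categories, and in practice the mapping would come from clustering or from preset categories.

```python
def pick_wake_words(scores, categories, count=3):
    """Keep the best-scoring keyword per category, then return the top `count`."""
    best_per_category = {}
    for word, score in scores.items():
        category = categories[word]
        current = best_per_category.get(category)
        if current is None or score > current[1]:
            best_per_category[category] = (word, score)
    ranked = sorted(best_per_category.values(), key=lambda item: item[1], reverse=True)
    return [word for word, _ in ranked[:count]]


# Hypothetical recommendation scores, ordered as in the example above.
scores = {
    "puppet cat": 0.9,
    "Persian cat": 0.8,
    "Bengal cat": 0.7,
    "skateboard": 0.6,
    "lipstick": 0.5,
}
categories = {
    "puppet cat": "cat",
    "Persian cat": "cat",
    "Bengal cat": "cat",
    "skateboard": "sports equipment",
    "lipstick": "cosmetics",
}
wake_words = pick_wake_words(scores, categories, count=3)
```

With these inputs the selection keeps one keyword from each of the three categories instead of three cat breeds, matching the outcome described in the text.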
According to the technical solution disclosed in this embodiment of the invention, the candidate keywords in each content display page are obtained from the user's network data over a past period of time and their text importance scores are calculated; the recommendation score of each candidate keyword is then obtained in combination with the behavior type score corresponding to that keyword, and the recommended wake-up words displayed to the user are determined according to the recommendation scores. This provides personalized pushing for different users based on each user's actual points of interest and attention, and improves the user's human-computer interaction experience.
Example two
Fig. 2 is a structural block diagram of a wake-up word recommendation device according to a second embodiment of the present invention. The device specifically includes: a network data acquisition module 210, a text importance score acquisition module 220, a behavior type score acquisition module 230, a recommendation score acquisition module 240, and a wake-up word display module 250.
a network data acquiring module 210, configured to acquire network data of a target user within a preset time period; the network data comprises at least one content display page and a user behavior type corresponding to the at least one content display page;
a text importance score obtaining module 220, configured to obtain alternative keywords in each content presentation page, and obtain a text importance score of each alternative keyword according to a word frequency and an inverse text frequency index;
a behavior type score obtaining module 230, configured to determine a behavior type score of each candidate keyword in the at least one content presentation page according to a user behavior type corresponding to the at least one content presentation page;
a recommendation score obtaining module 240, configured to determine a recommendation score of each candidate keyword according to the text importance score and the behavior type score of each candidate keyword;
and the awakening word display module 250 is configured to determine, according to the recommendation score of each candidate keyword, a recommended awakening word matched with the target user, and display the recommended awakening word to the target user.
According to the technical solution disclosed in this embodiment of the invention, the candidate keywords in each content display page are obtained from the user's network data over a past period of time and their text importance scores are calculated; the recommendation score of each candidate keyword is then obtained in combination with the behavior type score corresponding to that keyword, and the recommended wake-up words displayed to the user are determined according to the recommendation scores. This provides personalized pushing for different users based on each user's actual points of interest and attention, and improves the user's human-computer interaction experience.
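As a rough illustration of what modules 220 and 230 compute, the sketch below derives a text importance score from word frequency and inverse text frequency (TF-IDF) and looks up a behavior type score per page. The specific TF-IDF variant (`tf * log(N / df)`), the function name `text_importance`, and the values in `BEHAVIOR_TYPE_SCORE` are assumptions chosen for illustration; the embodiment may define them differently.

```python
import math
from collections import Counter
from typing import Dict, List

# Hypothetical behavior type scores; the concrete values are assumed here.
BEHAVIOR_TYPE_SCORE = {"browse": 1.0, "search": 2.0, "purchase": 5.0}


def text_importance(pages: List[List[str]]) -> List[Dict[str, float]]:
    """Return a TF-IDF score per word for each tokenized content display page."""
    n_pages = len(pages)
    # Document frequency: in how many pages each word appears at least once.
    doc_freq = Counter(word for page in pages for word in set(page))
    result = []
    for page in pages:
        counts = Counter(page)
        total = len(page)
        result.append({
            word: (count / total) * math.log(n_pages / doc_freq[word])
            for word, count in counts.items()
        })
    return result


# Two tokenized pages and the user behavior type that triggered each one.
pages = [["cat", "cat", "toy"], ["cat", "skateboard"]]
behaviors = ["browse", "search"]

tfidf = text_importance(pages)
behavior_scores = [BEHAVIOR_TYPE_SCORE[b] for b in behaviors]
print(tfidf)
print(behavior_scores)
```

Note that under this unsmoothed variant a word appearing in every page ("cat" here) receives a text importance score of zero; a smoothed inverse frequency would avoid that, which is one reason the exact formula is left as an assumption.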
Optionally, on the basis of the above technical solution, the user behavior type includes a browsing behavior, a searching behavior, a commenting behavior, a liking behavior, a collecting (favoriting) behavior, an add-to-shopping-cart behavior, and/or a purchasing behavior.
Optionally, on the basis of the above technical solution, the text importance score obtaining module 220 is specifically configured to obtain, in each content display page, the candidate keywords whose number of occurrences is greater than or equal to a preset minimum number of occurrences; or, in each content display page, to sort the words in descending order of number of occurrences and obtain a first preset number of candidate keywords according to that order.
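The two alternative ways of picking candidate keywords from a page can be sketched as below. The function names `by_min_count` and `by_top_n` and the sample page are hypothetical, and the page is assumed to be already tokenized into words.

```python
from collections import Counter
from typing import List


def by_min_count(words: List[str], min_count: int) -> List[str]:
    """Keep words whose number of occurrences reaches the preset minimum."""
    counts = Counter(words)
    return [word for word, count in counts.items() if count >= min_count]


def by_top_n(words: List[str], first_preset_number: int) -> List[str]:
    """Sort words by descending occurrence count and keep the first N."""
    return [word for word, _ in Counter(words).most_common(first_preset_number)]


# A hypothetical tokenized content display page.
page = ["cat", "toy", "cat", "cat", "toy", "food"]
threshold_candidates = by_min_count(page, 2)
top_candidates = by_top_n(page, 2)
print(threshold_candidates, top_candidates)
```

The threshold variant yields a variable number of candidates per page, while the top-N variant bounds the number of candidates regardless of page length.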
Optionally, on the basis of the above technical solution, the apparatus for recommending a wakeup word further includes:
and the behavior frequency score acquisition module is used for determining the behavior frequency score of each alternative keyword according to the user behavior frequency in unit time corresponding to each alternative keyword.
Optionally, on the basis of the foregoing technical solution, the recommendation score obtaining module 240 is specifically configured to determine the recommendation score of each candidate keyword according to the text importance score, the behavior type score, and the behavior frequency score of each candidate keyword.
Optionally, on the basis of the above technical solution, the apparatus for recommending a wakeup word further includes:
an interest attenuation score obtaining module, configured to obtain an interest attenuation score of each candidate keyword; wherein the interest decay score is related to the interval time between the acquisition time of the candidate keyword and the current time.
Optionally, on the basis of the foregoing technical solution, the recommendation score obtaining module 240 is specifically configured to determine the recommendation score of each candidate keyword according to the text importance score, the behavior type score, and the interest decay score of each candidate keyword; or determining the recommendation score of each alternative keyword according to the text importance score, the behavior type score, the behavior frequency score and the interest attenuation score of each alternative keyword.
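The optional combinations described above can be illustrated with a small sketch. The passage does not specify here how the scores are combined, so the multiplication used below is purely an assumption, as are the names `KeywordScores` and `recommendation_score`; a weighted sum would be an equally plausible choice.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class KeywordScores:
    text_importance: float
    behavior_type: float
    # Optional factors; None means the factor is not used in this variant.
    behavior_frequency: Optional[float] = None
    interest_decay: Optional[float] = None


def recommendation_score(s: KeywordScores) -> float:
    """Combine the available scores by multiplication (an assumed rule)."""
    score = s.text_importance * s.behavior_type
    if s.behavior_frequency is not None:
        score *= s.behavior_frequency
    if s.interest_decay is not None:
        score *= s.interest_decay
    return score


base = recommendation_score(KeywordScores(0.5, 2.0))
full = recommendation_score(KeywordScores(0.5, 2.0, behavior_frequency=3.0, interest_decay=0.5))
print(base, full)
```

Treating the optional factors as absent rather than zero keeps the two-factor variant and the four-factor variant in one function.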
Optionally, on the basis of the above technical solution, the interest decay score obtaining module is specifically configured to obtain the interest decay score of each candidate keyword based on the following formula:
$$n_i = k_i \times \exp(-m_i \times t_i)$$
where $m_i$ is the attenuation coefficient, $t_i$ is the interval between the acquisition time of the candidate keyword and the current time, $k_i$ is the initial interest score of the candidate keyword, $n_i$ is the interest decay score of the candidate keyword, $\exp$ is the exponential function with base $e$, and $i$ is the index of the user behavior type corresponding to the candidate keyword.
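A direct transcription of the decay formula into code may help make its behavior concrete. The function name `interest_decay_score`, the choice of days as the time unit, and the 7-day half-life used to pick the attenuation coefficient are assumptions for illustration only.

```python
import math


def interest_decay_score(k_i: float, m_i: float, t_i: float) -> float:
    """Interest decay score: initial score k_i decayed exponentially over interval t_i."""
    return k_i * math.exp(-m_i * t_i)


# Hypothetical attenuation coefficient giving a half-life of 7 days.
m_half_life_7 = math.log(2) / 7

fresh = interest_decay_score(1.0, m_half_life_7, 0.0)     # keyword acquired just now
week_old = interest_decay_score(1.0, m_half_life_7, 7.0)  # keyword acquired 7 days ago
print(fresh, week_old)
```

Under these assumed values a keyword acquired one half-life ago retains half of its initial interest score, and a larger attenuation coefficient makes older keywords fade faster.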
Optionally, on the basis of the above technical solution, the awakening word display module 250 specifically includes:
the classification execution unit is used for classifying the alternative keywords and respectively acquiring the alternative keywords with the highest recommendation scores under the classification categories as alternative recommended words;
and the awakening word display unit is used for acquiring the awakening recommended words with the highest recommendation scores and the second preset number in the alternative recommended words and displaying the awakening recommended words to the target user.
Optionally, on the basis of the above technical solution, the browsing behavior includes an audio playing behavior.
The device can execute the recommendation method of the awakening words provided by any embodiment of the invention, and has corresponding functional modules and beneficial effects of the execution method. For technical details that are not described in detail in this embodiment, reference may be made to the method provided in any embodiment of the present invention.
Example three
Fig. 3 is a schematic structural diagram of a terminal device according to a third embodiment of the present invention. Fig. 3 illustrates a block diagram of an exemplary terminal device 12 suitable for use in implementing embodiments of the present invention. The terminal device 12 shown in fig. 3 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiment of the present invention.
As shown in fig. 3, terminal device 12 is in the form of a general purpose computing device. The components of terminal device 12 may include, but are not limited to: one or more processors or processing units 16, a memory 28, and a bus 18 that couples various system components including the memory 28 and the processing unit 16.
Bus 18 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, such architectures include, but are not limited to, the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the Enhanced ISA (EISA) bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus.
Terminal device 12 typically includes a variety of computer system readable media. Such media may be any available media that is accessible by terminal device 12 and includes both volatile and nonvolatile media, removable and non-removable media.
The memory 28 may include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM) 30 and/or cache memory 32. Terminal device 12 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, storage system 34 may be used to read from and write to non-removable, nonvolatile magnetic media (not shown in FIG. 3, and commonly referred to as a "hard drive"). Although not shown in FIG. 3, a magnetic disk drive for reading from and writing to a removable, nonvolatile magnetic disk (e.g., a "floppy disk") and an optical disk drive for reading from or writing to a removable, nonvolatile optical disk (e.g., a CD-ROM, DVD-ROM, or other optical media) may be provided. In these cases, each drive may be connected to bus 18 by one or more data media interfaces. Memory 28 may include at least one program product having a set (e.g., at least one) of program modules that are configured to carry out the functions of embodiments of the invention.
A program/utility 40 having a set (at least one) of program modules 42 may be stored, for example, in memory 28. Such program modules 42 include, but are not limited to, an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination thereof, may include an implementation of a networking environment. Program modules 42 generally carry out the functions and/or methodologies of the described embodiments of the invention.
Terminal device 12 may also communicate with one or more external devices 14 (e.g., keyboard, pointing device, display 24, etc.), with one or more devices that enable a user to interact with terminal device 12, and/or with any devices (e.g., network card, modem, etc.) that enable terminal device 12 to communicate with one or more other computing devices. Such communication may be through an input/output (I/O) interface 22. Also, terminal device 12 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network, such as the Internet) via network adapter 20. As shown, the network adapter 20 communicates with the other modules of the terminal device 12 over the bus 18. It should be understood that although not shown in the figures, other hardware and/or software modules may be used in conjunction with terminal device 12, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.
The processing unit 16 executes various functional applications and data processing by executing programs stored in the memory 28, for example, implementing a recommendation method of a wake word provided by any embodiment of the present invention. Namely: acquiring network data of a target user within a preset time period; the network data comprises at least one content display page and a user behavior type corresponding to the at least one content display page; acquiring alternative keywords in each content display page, and acquiring a text importance score of each alternative keyword according to a word frequency and an inverse text frequency index; determining a behavior type score of each alternative keyword in the at least one content presentation page according to a user behavior type corresponding to the at least one content presentation page; determining a recommendation score of each alternative keyword according to the text importance score and the behavior type score of each alternative keyword; and determining a recommended awakening word matched with the target user according to the recommendation score of each alternative keyword, and displaying the recommended awakening word to the target user.
Example four
The fourth embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements a method for recommending a wake-up word according to any embodiment of the present invention; the method comprises the following steps:
acquiring network data of a target user within a preset time period; the network data comprises at least one content display page and a user behavior type corresponding to the at least one content display page;
acquiring alternative keywords in each content display page, and acquiring a text importance score of each alternative keyword according to a word frequency and an inverse text frequency index;
determining a behavior type score of each alternative keyword in the at least one content presentation page according to a user behavior type corresponding to the at least one content presentation page;
determining a recommendation score of each alternative keyword according to the text importance score and the behavior type score of each alternative keyword;
and determining a recommended awakening word matched with the target user according to the recommendation score of each alternative keyword, and displaying the recommended awakening word to the target user.
Computer storage media for embodiments of the invention may employ any combination of one or more computer-readable media. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electromagnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object-oriented programming language such as Java, Smalltalk, C++ or the like, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
It should be noted that the foregoing describes only exemplary embodiments of the invention and the technical principles employed. Those skilled in the art will appreciate that the present invention is not limited to the particular embodiments described herein, and that various obvious changes, rearrangements, and substitutions can be made by those skilled in the art without departing from the scope of protection of the invention. Therefore, although the present invention has been described in some detail through the above embodiments, it is not limited to them and may include other equivalent embodiments without departing from the spirit of the invention; the scope of the present invention is determined by the scope of the appended claims.

Claims (15)

1. A method for recommending a wake-up word, comprising:
acquiring network data of a target user within a preset time period; the network data comprises at least one content display page and a user behavior type corresponding to the at least one content display page; the user behavior type is used for triggering the content display page;
acquiring alternative keywords in each content display page, and acquiring a text importance score of each alternative keyword according to a word frequency and an inverse text frequency index;
determining a behavior type score of each alternative keyword in the at least one content presentation page according to a user behavior type corresponding to the at least one content presentation page;
determining a recommendation score of each alternative keyword according to the text importance score and the behavior type score of each alternative keyword;
and determining a recommended awakening word matched with the target user according to the recommendation score of each alternative keyword, and displaying the recommended awakening word to the target user.
2. The method of claim 1, wherein the user behavior type includes a browsing behavior, a searching behavior, a commenting behavior, a liking behavior, a collecting (favoriting) behavior, an add-to-shopping-cart behavior, and/or a purchasing behavior.
3. The method according to claim 1, wherein the obtaining of the alternative keywords in each of the content presentation pages comprises:
acquiring, in each content display page, the alternative keywords whose number of occurrences is greater than or equal to a preset minimum number of occurrences;
or, in each content display page, sorting the words in descending order of number of occurrences and acquiring a first preset number of alternative keywords according to that order.
4. The method of claim 1, further comprising, after determining a behavior type score for each of the candidate keywords in the at least one content presentation page according to a user behavior type corresponding to the at least one content presentation page:
determining the behavior frequency score of each alternative keyword according to the user behavior frequency in unit time corresponding to each alternative keyword;
determining a recommendation score of each candidate keyword according to the text importance score and the behavior type score of each candidate keyword, wherein the determining of the recommendation score of each candidate keyword comprises the following steps:
and determining the recommendation score of each alternative keyword according to the text importance score, the behavior type score and the behavior frequency score of each alternative keyword.
5. The method according to claim 1 or 4, after determining the behavior type score of each candidate keyword in the at least one content presentation page according to the user behavior type corresponding to the at least one content presentation page, further comprising:
obtaining interest attenuation scores of each alternative keyword; wherein the interest decay score is related to the interval time between the acquisition time of the candidate keyword and the current time;
determining a recommendation score of each candidate keyword according to the text importance score and the behavior type score of each candidate keyword, wherein the determining of the recommendation score of each candidate keyword comprises the following steps:
determining a recommendation score of each alternative keyword according to the text importance score, the behavior type score and the interest decay score of each alternative keyword;
or determining the recommendation score of each alternative keyword according to the text importance score, the behavior type score, the behavior frequency score and the interest attenuation score of each alternative keyword.
6. The method of claim 5, wherein obtaining an interest decay score for each of the candidate keywords comprises:
obtaining the interest decay score of each alternative keyword based on the following formula:
$$n_i = k_i \times \exp(-m_i \times t_i)$$
where $m_i$ is the attenuation coefficient, $t_i$ is the interval between the acquisition time of the alternative keyword and the current time, $k_i$ is the initial interest score of the alternative keyword, $n_i$ is the interest decay score of the alternative keyword, $\exp$ is the exponential function with base $e$, and $i$ is the index of the user behavior type corresponding to the alternative keyword.
7. The method according to claim 1, wherein the determining, according to the recommendation score of each candidate keyword, a recommended wake-up word matching the target user and presenting the recommended wake-up word to the target user comprises:
classifying the alternative keywords, and respectively acquiring the alternative keywords with the highest recommendation scores under the classification categories as alternative recommended words;
and acquiring awakening recommended words with highest recommendation scores and a second preset number from the alternative recommended words, and displaying the awakening recommended words to the target user.
8. The method of claim 2, wherein the browsing behavior comprises an audio playback behavior.
9. An apparatus for recommending a wake-up word, comprising:
the network data acquisition module is used for acquiring network data of a target user within a preset time period; the network data comprises at least one content display page and a user behavior type corresponding to the at least one content display page; the user behavior type is used for triggering the content display page;
the text importance score acquisition module is used for acquiring the alternative keywords in each content display page and acquiring the text importance score of each alternative keyword according to the word frequency and the inverse text frequency index;
a behavior type score obtaining module, configured to determine a behavior type score of each candidate keyword in the at least one content presentation page according to a user behavior type corresponding to the at least one content presentation page;
the recommendation score acquisition module is used for determining the recommendation score of each alternative keyword according to the text importance score and the behavior type score of each alternative keyword;
and the awakening word display module is used for determining the recommended awakening words matched with the target user according to the recommendation scores of the alternative keywords and displaying the recommended awakening words to the target user.
10. The apparatus of claim 9, wherein the apparatus for recommending the wake-up word further comprises:
a behavior frequency score acquisition module, configured to determine a behavior frequency score of each alternative keyword according to the user behavior frequency within a unit time corresponding to each alternative keyword;
the recommendation score obtaining module is specifically configured to determine a recommendation score of each candidate keyword according to the text importance score, the behavior type score, and the behavior frequency score of each candidate keyword.
11. The apparatus according to claim 9 or 10, wherein the apparatus for recommending the wake-up word further comprises:
an interest attenuation score obtaining module, configured to obtain an interest attenuation score of each candidate keyword; wherein the interest decay score is related to the interval time between the acquisition time of the alternative keyword and the current time;
the recommendation score acquisition module is specifically used for determining a recommendation score of each alternative keyword according to the text importance score, the behavior type score and the interest attenuation score of each alternative keyword; or determining the recommendation score of each alternative keyword according to the text importance score, the behavior type score, the behavior frequency score and the interest attenuation score of each alternative keyword.
12. The apparatus according to claim 11, wherein the interest decay score obtaining module is specifically configured to obtain the interest decay score of each candidate keyword based on the following formula:
$$n_i = k_i \times \exp(-m_i \times t_i)$$
where $m_i$ is the attenuation coefficient, $t_i$ is the interval between the acquisition time of the alternative keyword and the current time, $k_i$ is the initial interest score of the alternative keyword, $n_i$ is the interest decay score of the alternative keyword, $\exp$ is the exponential function with base $e$, and $i$ is the index of the user behavior type corresponding to the alternative keyword.
13. The apparatus according to claim 9, wherein the wake-up word presentation module specifically includes:
the classification execution unit is used for classifying the alternative keywords and respectively acquiring the alternative keywords with the highest recommendation scores under the classification categories as alternative recommended words;
and the awakening word display unit is used for acquiring the awakening recommended words with the highest recommendation scores and the second preset number in the alternative recommended words and displaying the awakening recommended words to the target user.
14. A terminal device, characterized in that the terminal device comprises:
one or more processors;
a storage device configured to store one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of recommending a wake-up word as recited in any one of claims 1-8.
15. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out a method of recommending a wake-up word as claimed in any one of claims 1 to 8.
CN202011633865.3A 2020-12-31 2020-12-31 Method and device for recommending awakening words, terminal equipment and storage medium Active CN112802454B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011633865.3A CN112802454B (en) 2020-12-31 2020-12-31 Method and device for recommending awakening words, terminal equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011633865.3A CN112802454B (en) 2020-12-31 2020-12-31 Method and device for recommending awakening words, terminal equipment and storage medium

Publications (2)

Publication Number Publication Date
CN112802454A CN112802454A (en) 2021-05-14
CN112802454B true CN112802454B (en) 2023-02-21

Family

ID=75808578

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011633865.3A Active CN112802454B (en) 2020-12-31 2020-12-31 Method and device for recommending awakening words, terminal equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112802454B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113343084A (en) * 2021-05-25 2021-09-03 北京字节跳动网络技术有限公司 Method and device for pushing key field of text sending, storage medium and computer equipment

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108664513A (en) * 2017-03-31 2018-10-16 北京京东尚科信息技术有限公司 Method, apparatus and equipment for pushing keyword
CN109615487A (en) * 2019-01-04 2019-04-12 平安科技(深圳)有限公司 Products Show method, apparatus, equipment and storage medium based on user behavior
CN111414498A (en) * 2020-04-29 2020-07-14 北京字节跳动网络技术有限公司 Multimedia information recommendation method and device and electronic equipment
CN111723260A (en) * 2019-03-19 2020-09-29 百度在线网络技术(北京)有限公司 Method and device for acquiring recommended content, electronic equipment and readable storage medium
CN111949887A (en) * 2020-08-31 2020-11-17 华东理工大学 Item recommendation method and device and computer-readable storage medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5395461B2 (en) * 2009-02-27 2014-01-22 株式会社東芝 Information recommendation device, information recommendation method, and information recommendation program

Also Published As

Publication number Publication date
CN112802454A (en) 2021-05-14

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant