CN111353087A - Hot word statistical method and device, storage medium and electronic terminal - Google Patents

Publication number
CN111353087A
Authority
CN
China
Prior art keywords
current
keyword
linked list
hot word
count
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811567342.6A
Other languages
Chinese (zh)
Inventor
胡晓
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Application filed by Beijing Jingdong Century Trading Co Ltd, Beijing Jingdong Shangke Information Technology Co Ltd filed Critical Beijing Jingdong Century Trading Co Ltd
Priority to CN201811567342.6A priority Critical patent/CN111353087A/en
Publication of CN111353087A publication Critical patent/CN111353087A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/02Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201Market modelling; Market analysis; Collecting market data

Landscapes

  • Business, Economics & Management (AREA)
  • Strategic Management (AREA)
  • Engineering & Computer Science (AREA)
  • Accounting & Taxation (AREA)
  • Development Economics (AREA)
  • Finance (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Game Theory and Decision Science (AREA)
  • Data Mining & Analysis (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present disclosure relates to the field of data processing technologies, and in particular, to a hotword statistical method, a hotword statistical device, a storage medium, and an electronic terminal. The method comprises the following steps: extracting session data in a current preset period; preprocessing the session data to obtain corresponding effective texts, and counting the effective texts to obtain corresponding counts; updating the current hot word linked list which is arranged in order according to the effective text and the corresponding count; wherein the current hotword linked list includes a current keyword object and a corresponding current count. According to the method and the device, the hot words can be sequenced in a linked list mode, the updating process of the keyword object statistics can be simplified, and the quick acquisition of the hot word ranking is realized.

Description

Hot word statistical method and device, storage medium and electronic terminal
Technical Field
The present disclosure relates to the field of data processing technologies, and in particular, to a hotword statistical method, a hotword statistical device, a storage medium, and an electronic terminal.
Background
With the rapid development of internet technology, acquiring information and shopping online have become indispensable parts of life. Counting hot words can accurately reflect the topics, affairs, or commodities that people generally care about in a given period. For a shopping website, a merchant can also use hot words to identify the items or questions that users are currently most interested in or most concerned about.
However, existing hot word statistics and analysis methods still have certain defects. For example, they generally need to refer to a large amount of historical website visit data and obtain the current key hot words by counting and analyzing that historical data. Such methods cannot count hot words in real time and cannot meet the timeliness requirements of websites and users. In particular, on certain promotion days, the visit volume and consultation volume of shopping websites rise sharply, so the amount of access data becomes huge, and hot-spot or sudden-burst problems cannot be found and handled in time through statistics and analysis. In addition, counting and analyzing massive historical data in a short time to obtain hot words easily places excessive data pressure on the server.
It is to be noted that the information disclosed in the above background section is only for enhancement of understanding of the background of the present disclosure, and thus may include information that does not constitute prior art known to those of ordinary skill in the art.
Disclosure of Invention
An object of the present disclosure is to provide a hotword counting method, a hotword counting apparatus, a storage medium, and an electronic terminal, which overcome, at least to some extent, the inability to acquire hotwords in real time due to limitations and disadvantages of the related art.
Additional features and advantages of the disclosure will be set forth in the detailed description which follows, or in part will be obvious from the description, or may be learned by practice of the disclosure.
According to a first aspect of the present disclosure, there is provided a hotword statistics method, including:
extracting session data in a current preset period;
preprocessing the session data to obtain corresponding effective texts, and counting the effective texts to obtain corresponding counts;
updating the current hot word linked list which is arranged in order according to the effective text and the corresponding count; wherein the current hotword linked list includes a current keyword object and a corresponding current count.
In an exemplary embodiment of the present disclosure, the preprocessing the session data to obtain the valid text includes:
and carrying out generalization processing and word segmentation processing on the session data to obtain a corresponding effective text.
In an exemplary embodiment of the present disclosure, before performing generalization processing and word segmentation processing on the session data, the method further includes:
and filtering the session data by utilizing a preset keyword blacklist.
In an exemplary embodiment of the present disclosure, the valid text includes at least one keyword, and the updating the ordered current hotword list according to the valid text and the corresponding count includes:
judging whether the current hot word linked list comprises the keyword or not;
when the current hot word linked list comprises the keywords, updating the count of the current keyword object in the current hot word linked list according to the count of the keywords;
and updating the sorting position of the current keyword object in the hot word linked list according to the updated count of the current keyword object.
In an exemplary embodiment of the disclosure, after updating the sorting position of the current keyword object in the hot word list according to the updated count of the current keyword object, the method further includes:
and counting the change rate corresponding to the keyword object according to the current count before updating and the updated count after updating the current keyword object.
In an exemplary embodiment of the present disclosure, the hot word linked list includes a preset number of keyword objects, and the updating the current hot word linked list arranged in order according to the valid text and the corresponding count further includes:
if the current hot word linked list does not comprise the keyword, reading the number of the keyword objects in the current hot word linked list;
if the number of the keyword objects in the current hot word linked list is less than the preset number, adding the keyword to the current hot word linked list, and comparing its count with the current count corresponding to each current keyword object in sequence to determine the sorting position of the keyword in the hot word linked list; or
if the current hot word linked list comprises the preset number of current keyword objects, comparing the count corresponding to the keyword with the counts corresponding to the current keyword objects sequentially in reverse order, so as to determine the sorting position of the keyword in the current hot word linked list.
In an exemplary embodiment of the present disclosure, the current hotword linked list comprises a plurality of categories of current hotword linked lists;
after the counting is performed on the valid texts to obtain corresponding counts, the method further includes:
and classifying the effective texts according to a preset rule so as to update the current hot word linked list of the corresponding category according to a classification result.
In an exemplary embodiment of the present disclosure, the method further comprises:
and filtering the keyword objects in the updated hot word linked list according to a preset rule.
In an exemplary embodiment of the present disclosure, the method further comprises:
performing semantic recognition on each keyword object in the hot word linked list to merge keywords with the same semantics and corresponding counts;
and sorting according to the count corresponding to the combined keyword object.
According to a second aspect of the present disclosure, there is provided a hotword statistic apparatus including:
the session data extraction module is used for extracting session data in a current preset period;
the effective text counting module is used for preprocessing the session data to obtain corresponding effective texts and counting the effective texts to obtain corresponding counts;
the hot word counting module is used for updating the current hot word linked list which is arranged in order according to the effective text and the corresponding count; wherein the current hotword linked list includes a current keyword object and a corresponding current count.
According to a third aspect of the present disclosure, there is provided a storage medium having stored thereon a computer program which, when executed by a processor, implements the hotword statistics method described above.
According to a fourth aspect of the present disclosure, there is provided an electronic terminal comprising:
a processor; and
a memory for storing executable instructions of the processor;
wherein the processor is configured to perform the above-described hotword statistics method via execution of the executable instructions.
In the hotword statistical method provided by the embodiment of the disclosure, the corresponding effective text is obtained by processing the session data in the preset period, so that the current hotword linked list can be updated according to the count of the keywords in the effective text, and further, the hotword can be updated in a short period in real time. In addition, the hot words are sorted in a linked list mode, so that the updating process of the keyword object statistics can be simplified, and the quick acquisition of the hot word ranking is realized.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure. It is to be understood that the drawings in the following description are merely exemplary of the disclosure, and that other drawings may be derived from those drawings by one of ordinary skill in the art without the exercise of inventive faculty.
FIG. 1 is a schematic diagram that schematically illustrates a hotword statistics method in an exemplary embodiment of the present disclosure;
FIG. 2 is a schematic diagram schematically illustrating a method of preprocessing session data in an exemplary embodiment of the present disclosure;
FIG. 3 is a diagram schematically illustrating a current ranking result of a hotspot problem in an exemplary embodiment of the present disclosure;
FIG. 4 is a diagram schematically illustrating a result of ranking a hot word in an exemplary embodiment of the present disclosure;
FIG. 5 is a diagram schematically illustrating the ranking result of a hot commodity in an exemplary embodiment of the present disclosure;
FIG. 6 is a diagram that schematically illustrates a ranking result of a hot store in an exemplary embodiment of the present disclosure;
FIG. 7 is a diagram that schematically illustrates a linked list structure, in an exemplary embodiment of the present disclosure;
FIG. 8 is a schematic diagram illustrating a hotword statistics apparatus in an exemplary embodiment of the present disclosure;
FIG. 9 schematically illustrates another schematic diagram of a hotword statistics apparatus in an exemplary embodiment of the present disclosure;
FIG. 10 schematically illustrates still another schematic diagram of a hotword statistics apparatus in an exemplary embodiment of the present disclosure.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments may, however, be embodied in many different forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art. The described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
Furthermore, the drawings are merely schematic illustrations of the present disclosure and are not necessarily drawn to scale. The same reference numerals in the drawings denote the same or similar parts, and thus their repetitive description will be omitted. Some of the block diagrams shown in the figures are functional entities and do not necessarily correspond to physically or logically separate entities. These functional entities may be implemented in the form of software, or in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.
The exemplary embodiment first provides a hot word statistical method, which can be applied to fast statistics and analysis of hot words on shopping, news, and similar websites. Referring to fig. 1, the above-mentioned hotword statistical method may include the following steps:
S1, extracting the session data in the current preset period;
S2, preprocessing the session data to obtain corresponding effective texts, and counting the effective texts to obtain corresponding counts;
S3, updating the current hot word linked list which is arranged in order according to the effective text and the corresponding count; wherein the current hotword linked list includes a current keyword object and a corresponding current count.
In the hotword statistical method provided by the example embodiment, on one hand, the corresponding effective text is obtained by processing the session data in the preset period, so that the current hotword linked list can be updated according to the count of the keywords in the effective text, and further, the hotword can be updated in real time in a shorter period. On the other hand, the hot words are sorted in the form of the linked list, so that the updating process of the keyword object statistics can be simplified, and the quick acquisition of the hot word ranking is realized.
Hereinafter, each step of the hotword statistical method in the present exemplary embodiment will be described in more detail with reference to the drawings and examples.
In step S1, session data in the current preset period is extracted.
In this exemplary embodiment, the preset period may be configured according to actual requirements, for example, the period is set to be a time duration of one hour, half hour, two hours, or fifteen minutes, so that the corresponding hotword may be obtained according to the session data generated in the relatively short period.
In addition, the session data may be a chat record or a session record between the client and the staff, for example; or a session record between the client and the intelligent customer service; or a text-form conversation record obtained by performing voice recognition on a voice record between the client and the staff. For example, the text corresponding to the session data may include information such as a session ID, a message ID, a shop ID, and a product ID.
Of course, in other exemplary embodiments of the present disclosure, the above-mentioned session data may also be messages left about a commodity, a topic, or the like, or chat records between users in a forum or discussion area. The present disclosure does not specifically limit the form of the session record.
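Step S1 can be illustrated with a minimal sketch. It assumes that every session record carries a `timestamp` field and that the "current preset period" is the fixed-length window containing the present moment; neither detail is specified in the disclosure, and the names `period_start` and `sessions_in_current_period` are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Dict, List

EPOCH = datetime(1970, 1, 1)


def period_start(now: datetime, period: timedelta) -> datetime:
    """Floor `now` to the start of the fixed-length window that contains it."""
    # timedelta // timedelta yields the number of whole periods since EPOCH.
    return EPOCH + ((now - EPOCH) // period) * period


def sessions_in_current_period(
    sessions: List[Dict], now: datetime, period: timedelta
) -> List[Dict]:
    """Keep only sessions whose (assumed) timestamp falls in the current window."""
    start = period_start(now, period)
    return [s for s in sessions if start <= s["timestamp"] <= now]


if __name__ == "__main__":
    now = datetime(2018, 12, 20, 10, 40)
    records = [
        {"session_id": "0001", "timestamp": datetime(2018, 12, 20, 10, 29)},
        {"session_id": "0002", "timestamp": datetime(2018, 12, 20, 10, 35)},
    ]
    current = sessions_in_current_period(records, now, timedelta(minutes=15))
    print([s["session_id"] for s in current])
```

With a fifteen-minute period, only sessions recorded since the last quarter-hour boundary are passed on to preprocessing.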
Step S2, pre-process the session data to obtain corresponding valid texts, and count the valid texts to obtain corresponding counts.
In this example embodiment, after the original session data is obtained, it may be preprocessed. Specifically, as shown in fig. 2, the method may include:
step S21, filtering the session data by using a preset keyword blacklist;
step S22, generalizing and word-dividing the session data to obtain corresponding effective texts;
in step S23, the valid texts are counted to obtain corresponding counts.
In a preferred embodiment of the present disclosure, after the session data is obtained, a preset keyword blacklist may be used to screen and filter each piece of session data. For example, a session data set may be generated from the text-format session data within the preset period; each session record in the set is then checked to determine whether it contains a word in the preset keyword blacklist.
When a session record is found to contain a word in the preset keyword blacklist, that session record may be deleted and is not processed further.
The keyword blacklist may be customized, for example by configuring it to include sensitive words, or by deriving special fields from statistics on historical session data. For example, the keyword blacklist may be configured to include: tax, industry, corruption, or violence. The current session data set is then identified, screened, and filtered using the keyword blacklist to determine whether each piece of session data contains a sensitive word. For example, a piece of session data contains the following:
session ID: 0000180909001
Smart customer service (message ID: 001): Hi, if you encounter a problem while shopping, you can click a shortcut button at the bottom or enter the problem directly, and the intelligent customer service will accompany you
User (message ID: 002): The anti-corruption struggle is a long-term task
Smart customer service (message ID: 003): This question is profound; it seems I need to supplement my knowledge
When this session data is identified, it is confirmed to contain the blacklist keyword "corruption", so the session with session ID 0000180909001 is deleted.
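The blacklist screening of step S21 can be sketched as follows. This is a minimal illustration that assumes each session is a dictionary with a `messages` list and that a plain substring match is sufficient; the disclosure does not specify the matching rule, and `filter_sessions` is a hypothetical name.

```python
from typing import Dict, Iterable, List


def filter_sessions(sessions: Iterable[Dict], blacklist: Iterable[str]) -> List[Dict]:
    """Drop every session whose messages contain any blacklisted word."""
    words = [w.lower() for w in blacklist]
    kept = []
    for session in sessions:
        text = " ".join(session["messages"]).lower()
        if any(word in text for word in words):
            continue  # the whole session record is discarded
        kept.append(session)
    return kept


if __name__ == "__main__":
    sessions = [
        {
            "session_id": "0000180909001",
            "messages": ["The anti-corruption struggle is a long-term task"],
        },
        {
            "session_id": "0000180909002",  # hypothetical second session
            "messages": ["I want to cancel my order"],
        },
    ]
    kept = filter_sessions(sessions, ["corruption", "violence"])
    print([s["session_id"] for s in kept])
```

A substring test keeps the sketch short; a production filter would more likely match on segmented words to avoid false positives.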
After the session data is screened using the keyword blacklist, generalization processing can be performed on the screened session data. Generalization refers to a data processing process that abstracts data from a lower conceptual level to a higher conceptual level. In this embodiment, specifically, a lower-level concept may be replaced with a higher-level concept. For example, a detailed product number is replaced with "WAREID" or the like.
For example, a piece of session data includes "consultation order number: 76340140502, product ID: 4158812", "how large is the Xiaomi L43 screen", "logistics of order number: 76340140502", or "commodity number: 7154350"; the corresponding generalization results are, respectively, "consultation order number: ORDERID, product ID: WAREID", "how large is the Xiaomi NUMBER screen", "logistics of order number: ORDERID", and "commodity number: WAREID".
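The generalization step can be approximated with regular-expression substitution. The sketch below is illustrative only: the two rules and the placeholder spellings `ORDERID` and `WAREID` are assumptions modeled on the example above, not rules given in the disclosure.

```python
import re

# Each rule keeps the label (group 1) and replaces the concrete number
# with a higher-level placeholder. The rules themselves are assumptions.
RULES = [
    (re.compile(r"(order number:\s*)\d+"), r"\1ORDERID"),
    (re.compile(r"((?:product ID|commodity number):\s*)\d+"), r"\1WAREID"),
]


def generalize(text: str) -> str:
    """Replace concrete order and product numbers with generic placeholders."""
    for pattern, replacement in RULES:
        text = pattern.sub(replacement, text)
    return text


if __name__ == "__main__":
    print(generalize("consultation order number: 76340140502, product ID: 4158812"))
    print(generalize("commodity number: 7154350"))
```

After generalization, messages that differ only in their order or product numbers collapse to the same text, which is what makes the later counts meaningful.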
After the generalization result of the session data is obtained, word segmentation processing can be performed on it to obtain the effective text of the session data. Specifically, word segmentation refers to a data processing process that recombines a continuous character sequence into a word sequence according to certain criteria. The generalization result can be segmented using a word segmentation tool, such as the jieba word segmentation tool. For example, the valid text may include store IDs, commodity IDs, and the words in the word segmentation result. For example, the valid data corresponding to the session data in the above example may include: logistics, screen, order, item ID: 4158812, and so on. After the valid data is obtained, each keyword in the current period can be counted.
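The counting in step S23 can be sketched with a plain frequency counter. The example uses whitespace splitting as a stand-in tokenizer so that it runs without third-party packages; for Chinese text, a segmenter such as jieba (for instance its `lcut` function) could be passed in as `tokenize`. The stopword set and the function name `count_keywords` are assumptions.

```python
from collections import Counter
from typing import Callable, Iterable, List


def count_keywords(
    texts: Iterable[str],
    tokenize: Callable[[str], List[str]] = str.split,
    stopwords: frozenset = frozenset(),
) -> Counter:
    """Segment each generalized text and count every resulting keyword."""
    counts: Counter = Counter()
    for text in texts:
        counts.update(tok for tok in tokenize(text) if tok not in stopwords)
    return counts


if __name__ == "__main__":
    texts = ["logistics of order ORDERID", "cancel order"]
    counts = count_keywords(texts, stopwords=frozenset({"of"}))
    print(counts.most_common(3))
```

The resulting counter holds the per-period counts that are later merged into the hot word linked list.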
Of course, in other exemplary embodiments of the present disclosure, after the session data is obtained, the generalization processing and the word segmentation processing may also be directly performed on the session data.
Step S3, updating the current hot word linked list arranged in order according to the effective text and the corresponding count; wherein the current hotword linked list includes a current keyword object and a corresponding current count.
In this example embodiment, as shown in fig. 7, a linked list structure may be used to store the hot words, and the length of the linked list may be configured according to actual requirements. For example, if the top-k hot spot data currently needs to be displayed, a linked list capable of storing k objects can be established in advance, where k is a positive integer. For example, if the top 5 or top 10 hot words need to be presented, the linked list can be set to hold 5 or 10 objects. Each object may contain a keyword object and a corresponding count. The objects in the linked list may be sorted by count, either from small to large or from large to small. The effective text may include a plurality of keywords. In addition, referring to the sorting results shown in fig. 3 to 6, the keywords included in the effective text may be store IDs, product names, product SKUs, or specific hot words.
For each keyword in the effective text, the current hot word linked list can be read first to judge whether it contains that keyword. Specifically, the step S3 may include:
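One possible shape for the fixed-capacity linked list is sketched below. It assumes a doubly linked list kept in descending order of count; the class and attribute names (`Node`, `HotWordList`, `capacity`) are hypothetical, and the disclosure leaves the concrete layout open.

```python
from typing import List, Optional, Tuple


class Node:
    """One list entry: a keyword object and its current count."""

    def __init__(self, keyword: str, count: int) -> None:
        self.keyword = keyword
        self.count = count
        self.prev: Optional["Node"] = None
        self.next: Optional["Node"] = None


class HotWordList:
    """Doubly linked list that holds at most `capacity` (top-k) entries."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.head: Optional[Node] = None
        self.tail: Optional[Node] = None
        self.size = 0

    def append(self, keyword: str, count: int) -> Node:
        """Append at the tail; the caller must keep the list ordered."""
        if self.size >= self.capacity:
            raise ValueError("hot word list is full")
        node = Node(keyword, count)
        node.prev = self.tail
        if self.tail is None:
            self.head = node
        else:
            self.tail.next = node
        self.tail = node
        self.size += 1
        return node

    def ranking(self) -> List[Tuple[str, int]]:
        """Walk from head to tail and return (keyword, count) pairs."""
        result, node = [], self.head
        while node is not None:
            result.append((node.keyword, node.count))
            node = node.next
        return result


if __name__ == "__main__":
    top5 = HotWordList(capacity=5)
    top5.append("logistics query", 300)
    top5.append("cancel order", 200)
    print(top5.ranking())
```

Keeping a `tail` pointer makes the last-ranked entry available in constant time, which matters when a full list has to decide whether a new keyword qualifies.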
step S31, judging whether the current hot word linked list comprises the key words;
step S32, when judging that the current hot word linked list includes the keyword, updating the count of the current keyword object in the current hot word linked list according to the count of the keyword;
and step S33, updating the sorting position of the current keyword object in the hot word linked list according to the updated count of the current keyword object.
When the current hot word linked list is read and found to contain the keyword, the count of the keyword in the effective text may be added to its current count in the current hot word linked list to obtain the updated count of the keyword. The position of the keyword in the hot word linked list is then determined according to the updated count.
For example, the valid text includes the keyword "cancel order" with a corresponding count of 1068. The current hot word linked list holds the top 10 hot words and contains the keyword object "cancel order", whose current count is 1205 and whose rank is fifth. Since the current hot word linked list contains the keyword "cancel order", the count in the valid text is added to the current count of the keyword object, giving an updated count of 1205 + 1068 = 2273. The updated count is then compared with the current count of the keyword object originally ranked fourth. If it is less than or equal to that count, "cancel order" remains fifth, as in the sorting result shown in fig. 3. If it is greater, "cancel order" exchanges positions with the object originally ranked fourth and is then compared with the object originally ranked third, and so on, until its count is less than or equal to the count of the preceding keyword object.
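Steps S31 to S33 can be sketched as an in-place update followed by repeated swaps toward the head of the list. The sketch assumes a doubly linked list in descending order and swaps node payloads rather than relinking nodes, which is an implementation choice of this illustration. Apart from "cancel order" with its counts of 1205 and 1068, the keywords and counts are invented for the example.

```python
from typing import List, Optional, Tuple


class Node:
    def __init__(self, keyword: str, count: int) -> None:
        self.keyword, self.count = keyword, count
        self.prev: Optional["Node"] = None
        self.next: Optional["Node"] = None


def build(pairs: List[Tuple[str, int]]) -> Optional[Node]:
    """Link (keyword, count) pairs given in descending order; return the head."""
    head = prev = None
    for keyword, count in pairs:
        node = Node(keyword, count)
        if prev is None:
            head = node
        else:
            prev.next, node.prev = node, prev
        prev = node
    return head


def add_count(head: Optional[Node], keyword: str, delta: int) -> bool:
    """S31-S33: find the keyword, add `delta`, then move it up as needed."""
    node = head
    while node is not None and node.keyword != keyword:
        node = node.next
    if node is None:
        return False  # keyword absent: handled by the insertion branch
    node.count += delta
    # Swap with the predecessor while the updated count is strictly greater.
    while node.prev is not None and node.count > node.prev.count:
        before = node.prev
        node.keyword, before.keyword = before.keyword, node.keyword
        node.count, before.count = before.count, node.count
        node = before
    return True


def ranking(head: Optional[Node]) -> List[Tuple[str, int]]:
    out = []
    while head is not None:
        out.append((head.keyword, head.count))
        head = head.next
    return out


if __name__ == "__main__":
    head = build([
        ("logistics query", 5000), ("apply for after sale", 3000),
        ("invoice", 2500), ("price protection", 2000), ("cancel order", 1205),
    ])
    add_count(head, "cancel order", 1068)  # 1205 + 1068 = 2273
    print(ranking(head))
```

Because an update can only raise a count, the entry only ever moves toward the head, so a single pass of adjacent comparisons restores the order.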
Based on the above, in the present exemplary embodiment, the method described above may further include:
step S34, counting the change rate corresponding to the keyword object according to the current count before updating and the updated count after updating the current keyword object.
By comparing the count of a keyword object before and after the update, the degree of user attention to the keyword object, as well as hot-spot and sudden-burst problems, can be judged clearly and simply. For example, referring to fig. 3, with a preset period of one hour, if the statistics show that the change rate of the keyword object "apply for after sale" has increased by 19.9% after its count is updated, this indicates a higher number of after-sale problems in the current period, and the merchant may analyze the specific content of the after-sale applications accordingly. In addition, the count and the change rate of each keyword in the current period can be displayed in the updated sorting result.
In addition, each after-sale application can be traced back to the specific merchant or commodity concerned, so that the order corresponding to the after-sale application, and the commodity or merchant corresponding to that order, can be identified.
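The change rate of step S34 is not given as a formula in the disclosure. A minimal sketch, assuming it is the relative change between the count before and after the update, could look like this (`change_rate` is a hypothetical name):

```python
from typing import Optional


def change_rate(before: int, after: int) -> Optional[float]:
    """Relative change of a keyword count; None when there is no baseline."""
    if before == 0:
        return None  # a newly added keyword has no previous count to compare
    return (after - before) / before


if __name__ == "__main__":
    rate = change_rate(1205, 2273)
    print(f"{rate:.1%}")
```

Other definitions are equally plausible, for example comparing the count of the current period with the count of the previous period, so the formula should be read as one option rather than the method of the disclosure.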
In other exemplary embodiments of the present disclosure, the method described above may further include:
step S31, judging whether the current hot word linked list comprises the key words;
step S35, if the current hot word linked list does not include the key words, the number of key word objects in the current hot word linked list is read;
step S36, if the number of the keyword objects in the current hot word linked list is less than the preset number, adding the keyword to the current hot word linked list, and comparing its count with the current count corresponding to each current keyword object in sequence to determine the sorting position of the keyword in the hot word linked list; or
step S37, if the current hot word linked list includes the preset number of current keyword objects, comparing the count corresponding to the keyword with the counts corresponding to the current keyword objects sequentially in reverse order, so as to determine the sorting position of the keyword in the current hot word linked list.
When the current hot word linked list does not contain the keyword, the number of keyword objects in the current hot word linked list may be read first. For example, the valid text includes the keyword "cancel order" with a corresponding count of 1068, and the preset length of the hot word linked list is set to hold the top 10 hot words. If the number of keyword objects in the current hot word linked list is less than ten, for example zero, the keyword "cancel order" is added to the hot word linked list directly. If the number of current keyword objects is five, the keyword "cancel order" is added at the last position of the hot word linked list and its count is compared with the count of the fifth keyword object; if it is less than that count, "cancel order" is ranked sixth. If it is greater than the count of the fifth keyword object, the two exchange positions, and the comparison continues with the fourth keyword object, and so on, until the count of "cancel order" is less than or equal to the count of the preceding keyword object.
In addition, if the number of keyword objects in the current hot word linked list equals ten, the count of the keyword "cancel order" may be compared with the count of the tenth keyword object in the current hot word linked list; if it is smaller, the keyword is not added to the hot word linked list. If it is larger, the current tenth keyword object is deleted from the hot word linked list, the keyword "cancel order" is added and ranked tenth, and it is then compared with the preceding keyword objects in turn until its count is less than or equal to the count of the preceding keyword object, which determines the final rank of "cancel order" in the hot word linked list.
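The two insertion branches (steps S35 to S37) can be sketched together. The sketch assumes a doubly linked list in descending order with a fixed capacity of at least one, and it treats a tie with the last-ranked entry as "not added", which the disclosure does not settle. All names and the sample keywords are hypothetical.

```python
from typing import List, Optional, Tuple


class Node:
    def __init__(self, keyword: str, count: int) -> None:
        self.keyword, self.count = keyword, count
        self.prev: Optional["Node"] = None
        self.next: Optional["Node"] = None


class HotWordList:
    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.head: Optional[Node] = None
        self.tail: Optional[Node] = None
        self.size = 0

    def insert_new(self, keyword: str, count: int) -> bool:
        """Insert a keyword that is not yet in the list; True if it was kept."""
        if self.size < self.capacity:
            # S36: room left, so append at the last position.
            node = Node(keyword, count)
            node.prev = self.tail
            if self.tail is None:
                self.head = node
            else:
                self.tail.next = node
            self.tail = node
            self.size += 1
        elif self.tail is not None and count > self.tail.count:
            # S37: list full, the newcomer replaces the last-ranked entry.
            node = self.tail
            node.keyword, node.count = keyword, count
        else:
            return False  # not larger than the last-ranked entry
        # Compare in reverse order and swap until the predecessor is >= count.
        while node.prev is not None and node.count > node.prev.count:
            before = node.prev
            node.keyword, before.keyword = before.keyword, node.keyword
            node.count, before.count = before.count, node.count
            node = before
        return True

    def ranking(self) -> List[Tuple[str, int]]:
        out, node = [], self.head
        while node is not None:
            out.append((node.keyword, node.count))
            node = node.next
        return out


if __name__ == "__main__":
    top3 = HotWordList(capacity=3)
    for word, n in [("invoice", 10), ("logistics query", 30), ("return", 20)]:
        top3.insert_new(word, n)
    top3.insert_new("cancel order", 25)  # evicts "invoice", lands second
    print(top3.ranking())
```

Overwriting the tail node's payload is equivalent to deleting the last entry and adding the new keyword in last place, but avoids relinking.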
Based on the above, in other exemplary embodiments of the present disclosure, after the valid text is obtained, the keywords in the valid text may also be classified. Specifically, the method may include:
and classifying the effective texts according to a preset rule so as to update the current hot word linked list of the corresponding category according to a classification result.
For example, the effective text may include stores, commodities, word segments, and the like. At this time, the keywords can be classified according to the attributes of the keywords, and a corresponding keyword set is generated. For example, as shown in fig. 3 to 6, the category may be classified into a store category, a commodity category, a hot question category, a hot vocabulary category, or the like. Therefore, different hot word linked lists can be updated by using different types of keyword sets. For example, the store hotword list is updated with information in the store keyword set.
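The classification step can be sketched as routing each counted keyword into a per-category bucket before the corresponding linked list is updated. The "preset rule" is not specified in the disclosure; the prefix test below (`shop ID:` and `product ID:`) and the category names are assumptions made for illustration.

```python
from collections import Counter, defaultdict
from typing import DefaultDict, Mapping


def classify(keyword: str) -> str:
    """Assign a category by keyword prefix (an assumed, illustrative rule)."""
    if keyword.startswith("shop ID:"):
        return "store"
    if keyword.startswith("product ID:"):
        return "commodity"
    return "hot_word"


def split_by_category(counts: Mapping[str, int]) -> DefaultDict[str, Counter]:
    """Group keyword counts so each category updates its own hot word list."""
    buckets: DefaultDict[str, Counter] = defaultdict(Counter)
    for keyword, count in counts.items():
        buckets[classify(keyword)][keyword] += count
    return buckets


if __name__ == "__main__":
    period_counts = {"shop ID: 1001": 40, "product ID: 4158812": 25, "logistics": 90}
    for category, bucket in split_by_category(period_counts).items():
        print(category, dict(bucket))
```

Each bucket can then be fed to the update routine of the hot word linked list for its category.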
In addition, in an exemplary embodiment of the present disclosure, for a newly generated or updated hot word linked list, the keyword objects in the hot word linked list may be filtered according to a custom rule, so as to remove keyword objects that do not meet requirements or are meaningless for the statistics. For example, in the sorting result shown in fig. 3, the first-ranked question, "article ID: 1035839", may be culled from the hot word linked list.
In addition, in an exemplary embodiment of the present disclosure, after obtaining the hotword list, the method may further include:
step S41, carrying out semantic recognition on each keyword object in the hot word linked list so as to merge keywords with the same semantic and corresponding counts;
and step S42, sorting according to the count corresponding to the combined keyword object.
For example, in the sorting result shown in fig. 3, the entries "logistics query", "estimated delivery time", "delivery time", and "the delivery information of the order that the customer wants to query" can all be identified as "logistics information query". These keyword objects can therefore be merged, their counts summed, and the hot word linked list updated according to the merged counts. This effectively reduces repeated keyword objects and improves the accuracy of the sorting result of the hot word linked list.
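Steps S41 and S42 can be sketched as follows, assuming the outcome of semantic recognition is available as a plain mapping from each keyword to a canonical label. That mapping, the name `merge_by_semantics`, and any counts used with it are illustrative assumptions; how the semantic recognition itself is performed is not covered by this sketch.

```python
def merge_by_semantics(hot_list, canonical):
    """Merge keywords with the same meaning, then sort by merged count.

    hot_list: list of (keyword, count) pairs.
    canonical: dict mapping a keyword to its canonical label; keywords
    that are not in the mapping keep their own text as the label.
    """
    merged = {}
    for keyword, count in hot_list:
        label = canonical.get(keyword, keyword)
        merged[label] = merged.get(label, 0) + count  # S41: merge counts
    # S42: sort by the merged counts, largest first.
    return sorted(merged.items(), key=lambda item: item[1], reverse=True)
```

Mapping "logistics query", "estimated delivery time", and "delivery time" to "logistics information query" would collapse three entries into one whose count is their sum.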
The updated hot word linked list can be stored in a Redis database. When the data is requested, it is read from the Redis database and provided to the interactive interface for display.
According to the hot word statistical method provided by the present disclosure, on the one hand, accurate valid texts can be obtained by filtering, generalizing, and segmenting the different types of session data, which makes it easier to count hot words accurately. On the other hand, because the hot words are stored in a linked list structure, a shorter control cycle can be set. When the hot word linked list is updated, the count of a keyword is compared in turn with the keyword objects in the linked list, following the principle of min-heap sorting, so the sorting position of each keyword can be determined simply and quickly, and the keyword objects in the hot word linked list remain ordered after every update. This effectively overcomes the drawback of counting hot words from long-term offline historical data and improves the timeliness of the sorting results.
It is to be noted that the above-mentioned figures are only schematic illustrations of the processes involved in the method according to an exemplary embodiment of the invention, and are not intended to be limiting. It will be readily understood that the processes shown in the above figures are not intended to indicate or limit the chronological order of the processes. In addition, it is also readily understood that these processes may be performed synchronously or asynchronously, e.g., in multiple modules.
Further, referring to fig. 8, in the present exemplary embodiment, there is provided a hotword statistic device 10, including: a session data extraction module 101, a valid text statistics module 102, and a hotword statistics module 103. Wherein:
the session data extraction module 101 may be configured to extract session data in a current preset period.
The valid text statistics module 102 may be configured to pre-process the session data to obtain corresponding valid texts, and perform statistics on the valid texts to obtain corresponding counts.
The hotword counting module 103 may be configured to update the current hotword linked list arranged in order according to the valid text and the corresponding count; wherein the current hotword linked list includes a current keyword object and a corresponding current count.
The specific details of each module in the hotword statistical device are described in detail in the corresponding hotword statistical method, and therefore are not described herein again.
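The division into three modules can be pictured with the sketch below. It is a simplified illustration only: the class and method names are hypothetical, lowercasing and whitespace splitting stand in for the generalization and word segmentation steps, sessions are assumed to be (timestamp, text) pairs, and the list is rebuilt by a full re-sort rather than by the incremental comparison described in the method.

```python
from collections import Counter


class HotWordStatistics:
    def __init__(self, capacity=10):
        self.capacity = capacity
        self.hot_list = []  # (keyword, count) pairs, largest count first

    def extract(self, sessions, start, end):
        """Session data extraction (module 101): keep texts in [start, end)."""
        return [text for timestamp, text in sessions if start <= timestamp < end]

    def count_valid_text(self, messages):
        """Valid text statistics (module 102), with placeholder preprocessing."""
        counts = Counter()
        for text in messages:
            counts.update(text.lower().split())
        return counts

    def update(self, counts):
        """Hot word statistics (module 103): merge counts and keep the top entries."""
        totals = dict(self.hot_list)
        for keyword, count in counts.items():
            totals[keyword] = totals.get(keyword, 0) + count
        ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
        self.hot_list = ranked[: self.capacity]
        return self.hot_list
```

Calling `extract`, `count_valid_text`, and `update` once per preset period keeps `hot_list` ordered after every period.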
It should be noted that although several modules or units of the device for performing actions are mentioned in the above detailed description, this division is not mandatory. Indeed, according to embodiments of the present disclosure, the features and functions of two or more modules or units described above may be embodied in a single module or unit. Conversely, the features and functions of one module or unit described above may be further divided so as to be embodied by a plurality of modules or units.
In an exemplary embodiment of the present disclosure, an electronic device capable of implementing the above method is also provided.
As will be appreciated by one skilled in the art, aspects of the present invention may be embodied as a system, method, or program product. Thus, various aspects of the invention may be embodied in the form of: an entirely hardware embodiment, an entirely software embodiment (including firmware, microcode, etc.), or an embodiment combining hardware and software aspects, all of which may generally be referred to herein as a "circuit," "module," or "system."
An electronic device 600 according to this embodiment of the invention is described below with reference to fig. 9. The electronic device 600 shown in fig. 9 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present invention.
As shown in fig. 9, the electronic device 600 is embodied in the form of a general purpose computing device. The components of the electronic device 600 may include, but are not limited to: at least one processing unit 610, at least one storage unit 620, and a bus 630 that couples the various system components, including the storage unit 620 and the processing unit 610.
The storage unit stores program code that can be executed by the processing unit 610, so that the processing unit 610 performs the steps according to various exemplary embodiments of the present invention described in the "exemplary methods" section above. For example, the processing unit 610 may perform the steps shown in fig. 1: step S1, extracting session data in a current preset period; step S2, preprocessing the session data to obtain corresponding valid texts, and counting the valid texts to obtain corresponding counts; and step S3, updating the ordered current hot word linked list according to the valid texts and the corresponding counts, wherein the current hot word linked list includes a current keyword object and a corresponding current count.
The storage unit 620 may include readable media in the form of volatile memory units, such as a random access memory unit (RAM) 6201 and/or a cache memory unit 6202, and may further include a read-only memory unit (ROM) 6203.
The storage unit 620 may also include a program/utility 6204 having a set (at least one) of program modules 6205, such program modules 6205 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each of which, or some combination thereof, may comprise an implementation of a network environment.
Bus 630 may be one or more of several types of bus structures, including a memory unit bus or memory unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus architectures.
The electronic device 600 may also communicate with one or more external devices 700 (e.g., keyboard, pointing device, bluetooth device, etc.), with one or more devices that enable a user to interact with the electronic device 600, and/or with any devices (e.g., router, modem, etc.) that enable the electronic device 600 to communicate with one or more other computing devices. Such communication may occur via an input/output (I/O) interface 650. Also, the electronic device 600 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network such as the Internet) via the network adapter 660. As shown, the network adapter 660 communicates with the other modules of the electronic device 600 over the bus 630. It should be appreciated that although not shown in the figures, other hardware and/or software modules may be used in conjunction with the electronic device 600, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.
Through the above description of the embodiments, those skilled in the art will readily understand that the exemplary embodiments described herein may be implemented by software, or by software in combination with necessary hardware. Therefore, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (such as a CD-ROM, a USB disk, or a removable hard disk) or on a network, and which includes several instructions that enable a computing device (such as a personal computer, a server, a terminal device, or a network device) to execute the method according to the embodiments of the present disclosure.
In an exemplary embodiment of the present disclosure, there is also provided a computer-readable storage medium having stored thereon a program product capable of implementing the above-described method of the present specification. In some possible embodiments, aspects of the invention may also be implemented in the form of a program product comprising program code means for causing a terminal device to carry out the steps according to various exemplary embodiments of the invention described in the above section "exemplary methods" of the present description, when said program product is run on the terminal device.
Referring to fig. 10, a program product 800 for implementing the above method according to an embodiment of the present invention is described, which may employ a portable compact disc read only memory (CD-ROM) and include program code, and may be run on a terminal device, such as a personal computer. However, the program product of the present invention is not limited in this regard and, in the present document, a readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
A computer readable signal medium may include a propagated data signal with readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., through the internet using an internet service provider).
Furthermore, the above-described figures are merely schematic illustrations of processes involved in methods according to exemplary embodiments of the invention, and are not intended to be limiting. It will be readily understood that the processes shown in the above figures are not intended to indicate or limit the chronological order of the processes. In addition, it is also readily understood that these processes may be performed synchronously or asynchronously, e.g., in multiple modules.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is to be limited only by the terms of the appended claims.

Claims (12)

1. A hotword statistical method, comprising:
extracting session data in a current preset period;
preprocessing the session data to obtain corresponding effective texts, and counting the effective texts to obtain corresponding counts;
updating the current hot word linked list which is arranged in order according to the effective text and the corresponding count; wherein the current hotword linked list includes a current keyword object and a corresponding current count.
2. The method of claim 1, wherein preprocessing the session data to obtain valid text comprises:
and carrying out generalization processing and word segmentation processing on the session data to obtain a corresponding effective text.
3. The method of claim 2, wherein before the generalization and word segmentation, the method further comprises:
and filtering the session data by utilizing a preset keyword blacklist.
4. The method of claim 1, wherein the valid text comprises at least one keyword, and wherein updating the ordered current hot word linked list according to the valid text and the corresponding count comprises:
judging whether the current hot word linked list comprises the keyword or not;
when the current hot word linked list comprises the keywords, updating the count of the current keyword object in the current hot word linked list according to the count of the keywords;
and updating the sorting position of the current keyword object in the hot word linked list according to the updated count of the current keyword object.
5. The method of claim 4, wherein after updating the current keyword object's sorting position in the hotword list according to the updated current keyword object's count, the method further comprises:
and calculating the change rate corresponding to the keyword object according to the count of the current keyword object before the update and its count after the update.
6. The method of claim 4, wherein the hotword list includes a predetermined number of keyword objects, and wherein updating the ordered current hotword list according to the valid text and the corresponding count further comprises:
if the current hot word linked list does not comprise the keyword, reading the number of the keyword objects in the current hot word linked list;
if the number of the keyword objects in the current hot word linked list is less than the preset number, adding the keyword to the current hot word linked list, and comparing the keyword with the current count corresponding to each current keyword object in sequence to determine the sorting position of the keyword in the hot word linked list; or
if the current hot word linked list comprises the preset number of current keyword objects, comparing the count corresponding to the keyword with the counts corresponding to the current keyword objects sequentially in reverse order, so as to determine the sorting position of the keyword in the current hot word linked list.
7. The method of claim 1, wherein the current hotword list comprises a plurality of categories of current hotword lists;
after the counting is performed on the valid texts to obtain corresponding counts, the method further includes:
and classifying the effective texts according to a preset rule so as to update the current hot word linked list of the corresponding category according to a classification result.
8. The method of claim 7, further comprising:
and filtering the keyword objects in the updated hot word linked list according to a preset rule.
9. The method of claim 1, further comprising:
performing semantic recognition on each keyword object in the hot word linked list to merge keywords with the same semantics and corresponding counts;
and sorting according to the count corresponding to the combined keyword object.
10. A hotword statistic device, comprising:
the session data extraction module is used for extracting session data in a current preset period;
the effective text counting module is used for preprocessing the session data to obtain corresponding effective texts and counting the effective texts to obtain corresponding counts;
the hot word counting module is used for updating the current hot word linked list which is arranged in order according to the effective text and the corresponding count; wherein the current hotword linked list includes a current keyword object and a corresponding current count.
11. A storage medium having stored thereon a computer program which, when executed by a processor, implements a hotword statistics method according to any one of claims 1 to 9.
12. An electronic terminal, comprising:
a processor; and
a memory for storing executable instructions of the processor;
wherein the processor is configured to perform the hotword statistics method of any one of claims 1-9 via execution of the executable instructions.
CN201811567342.6A 2018-12-20 2018-12-20 Hot word statistical method and device, storage medium and electronic terminal Pending CN111353087A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811567342.6A CN111353087A (en) 2018-12-20 2018-12-20 Hot word statistical method and device, storage medium and electronic terminal

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811567342.6A CN111353087A (en) 2018-12-20 2018-12-20 Hot word statistical method and device, storage medium and electronic terminal

Publications (1)

Publication Number Publication Date
CN111353087A true CN111353087A (en) 2020-06-30

Family

ID=71192183

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811567342.6A Pending CN111353087A (en) 2018-12-20 2018-12-20 Hot word statistical method and device, storage medium and electronic terminal

Country Status (1)

Country Link
CN (1) CN111353087A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113051893A (en) * 2021-04-30 2021-06-29 中国银行股份有限公司 Hot word statistical method, system, electronic equipment and storage medium

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101727494A (en) * 2009-12-29 2010-06-09 华中师范大学 Network hot word generating system in specific area

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101727494A (en) * 2009-12-29 2010-06-09 华中师范大学 Network hot word generating system in specific area

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113051893A (en) * 2021-04-30 2021-06-29 中国银行股份有限公司 Hot word statistical method, system, electronic equipment and storage medium
CN113051893B (en) * 2021-04-30 2024-01-26 中国银行股份有限公司 Hotword statistics method, system, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
CN106971321B (en) Marketing information pushing method, marketing information pushing device, marketing information pushing equipment and storage medium
US10504120B2 (en) Determining a temporary transaction limit
CN111984779B (en) Dialogue text analysis method, device, equipment and readable medium
WO2018040068A1 (en) Knowledge graph-based semantic analysis system and method
CN107292365B (en) Method, device and equipment for binding commodity label and computer readable storage medium
CN106991175B (en) Customer information mining method, device, equipment and storage medium
CN111427974A (en) Data quality evaluation management method and device
CN112579893A (en) Information pushing method, information display method, information pushing device, information display device and information display equipment
CN112925664A (en) Target user determination method and device, electronic equipment and storage medium
CN111582314A (en) Target user determination method and device and electronic equipment
CN114092056A (en) Project management method, device, electronic equipment, storage medium and product
CN111444424A (en) Information recommendation method and information recommendation system
CN112749238A (en) Search ranking method and device, electronic equipment and computer-readable storage medium
CN113077312A (en) Hotel recommendation method, system, equipment and storage medium
CN114637842A (en) Enterprise industry classification method and device, storage medium and electronic equipment
CN111353087A (en) Hot word statistical method and device, storage medium and electronic terminal
CN116883181A (en) Financial service pushing method based on user portrait, storage medium and server
CN112330059A (en) Method, apparatus, electronic device, and medium for generating prediction score
CN115759100A (en) Data processing method, device, equipment and medium
US10353929B2 (en) System and method for computing critical data of an entity using cognitive analysis of emergent data
CN112328899B (en) Information processing method, information processing apparatus, storage medium, and electronic device
CN110737749B (en) Entrepreneurship plan evaluation method, entrepreneurship plan evaluation device, computer equipment and storage medium
CN114550157A (en) Bullet screen gathering identification method and device
CN113326461A (en) Cross-platform content distribution method, device, equipment and storage medium
CN113239273A (en) Method, device, equipment and storage medium for generating text

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination