CN110245343B - Bullet screen analysis method and device

Info

Publication number
CN110245343B
CN110245343B (application CN201810186987.9A)
Authority
CN
China
Prior art keywords
word
bullet screen
candidate
time segment
determining
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810186987.9A
Other languages
Chinese (zh)
Other versions
CN110245343A (en)
Inventor
Li Ming (李明)
Shen Yi (沈一)
Mao Yue (茅越)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Youku Culture Technology Beijing Co ltd
Original Assignee
Alibaba China Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba China Co Ltd
Priority to CN201810186987.9A
Publication of CN110245343A
Application granted
Publication of CN110245343B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/216 Parsing using statistical methods
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/30 Semantic analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Probability & Statistics with Applications (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Machine Translation (AREA)

Abstract

The disclosure relates to a bullet screen analysis method and device. The method comprises the following steps: performing word segmentation on each bullet screen posted for a specified object to obtain a word segmentation result for each bullet screen; determining the candidate words corresponding to the specified object according to the word frequencies of the words in those word segmentation results; for a first candidate word corresponding to the specified object, determining a parameter value of the first candidate word in each time segment according to the word frequency of the first candidate word in each time segment of the specified object and the word frequencies of all candidate words in each time segment; and determining the first candidate word as a keyword of a first time segment when its parameter value in that time segment satisfies a condition. The method and device can accurately determine the keywords of the specified object in each time segment, helping business personnel understand what large numbers of users focus on during each time segment of the specified object.

Description

Bullet screen analysis method and device
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a bullet screen analysis method and apparatus.
Background
With the continuing spread of social networks and the mobile internet, the cost of publishing information keeps falling, and more and more users are willing to share their opinions and comments on people, events, and products online, for example by posting comments on objects such as videos and audio in the form of bullet screens (danmaku). Bullet screens reflect viewers' opinions of and emotional tendencies toward objects such as videos and audio, and are therefore of significant value for analyzing those objects. How to analyze bullet screens has thus become a pressing problem.
Disclosure of Invention
In view of this, the present disclosure provides a bullet screen analysis method and apparatus.
According to an aspect of the present disclosure, there is provided a bullet screen analysis method, including:
performing word segmentation processing on each bullet screen aiming at a specified object to obtain word segmentation results of each bullet screen;
determining each candidate word corresponding to the specified object according to the word frequencies of the words in the word segmentation results of the bullet screens;
for a first candidate word corresponding to the specified object, determining a parameter value of the first candidate word in each time segment according to the word frequency of the first candidate word in each time segment of the specified object and the word frequency of all candidate words in each time segment, wherein the first candidate word is one candidate word in each candidate word;
determining the first candidate word as a keyword of a first time segment when a parameter value of the first candidate word in the first time segment meets a condition, wherein the first time segment is one of the time segments.
In a possible implementation manner, for a first candidate word corresponding to the specified object, determining a parameter value of the first candidate word in each time segment according to the word frequency of the first candidate word in each time segment of the specified object and the word frequencies of all candidate words in each time segment, includes:
determining a first ratio of the word frequency of all candidate words in the first time segment to the total word frequency of all candidate words in each time segment;
determining the product of the total word frequency of the first candidate word in each time segment and the first ratio as a first word frequency;
determining a second ratio of the word frequency of the first candidate word in the first time segment to the total word frequency of all candidate words in each time segment;
determining a third ratio of the first word frequency to the total word frequency of all candidate words in each time segment;
and determining a parameter value of the first candidate word in the first time segment according to the second ratio and the third ratio.
In a possible implementation manner, determining, according to the second ratio and the third ratio, a parameter value of the first candidate word in the first time segment includes:
and determining the parameter value of the first candidate word in the first time segment according to the second ratio, the third ratio, the number of time segments and the total number of the candidate words.
In a possible implementation manner, performing a word segmentation process on each bullet screen for a specified object to obtain a word segmentation result of each bullet screen includes:
extracting new words from each bullet screen aiming at the specified object;
and performing word segmentation processing on each bullet screen according to the new words to obtain word segmentation results of each bullet screen.
In one possible implementation, extracting new words from each bullet screen for the specified object includes:
carrying out adjacent character cutting on each bullet screen aiming at the specified object to obtain a cutting result;
and extracting new words from each bullet screen aiming at the specified object according to the solidification degree and the freedom degree of the cutting result.
According to another aspect of the present disclosure, there is provided a bullet screen analyzing apparatus including:
the word segmentation module is used for carrying out word segmentation processing on each bullet screen aiming at the specified object to obtain word segmentation results of each bullet screen;
the first determining module is used for determining each candidate word corresponding to the specified object according to the word frequency of the word in the word segmentation result of each bullet screen;
a second determining module, configured to determine, for a first candidate word corresponding to the specified object, a parameter value of the first candidate word in each time segment according to word frequencies of the first candidate word in each time segment of the specified object and word frequencies of all candidate words in each time segment, where the first candidate word is one candidate word in the candidate words;
a third determining module, configured to determine the first candidate word as a keyword of a first time segment when a parameter value of the first candidate word in the first time segment meets a condition, where the first time segment is one of the time segments.
In one possible implementation manner, the second determining module includes:
the first determining submodule is used for determining a first ratio of the word frequency of all candidate words in the first time segment to the total word frequency of all candidate words in each time segment;
a second determining submodule, configured to determine a product of a total word frequency of the first candidate word in each time segment and the first ratio as a first word frequency;
a third determining submodule, configured to determine a second ratio of the word frequency of the first candidate word in the first time segment to a total word frequency of all candidate words in each time segment;
a fourth determining submodule, configured to determine a third ratio of the first word frequency to the total word frequency of all candidate words in each time segment;
and the fifth determining submodule is used for determining the parameter value of the first candidate word in the first time segment according to the second ratio and the third ratio.
In one possible implementation, the fifth determining submodule is configured to:
and determining the parameter value of the first candidate word in the first time segment according to the second ratio, the third ratio, the number of time segments and the total number of the candidate words.
In one possible implementation, the word segmentation module includes:
the extraction submodule is used for extracting new words from each bullet screen aiming at the specified object;
and the word segmentation sub-module is used for carrying out word segmentation processing on each bullet screen according to the new words to obtain word segmentation results of each bullet screen.
In one possible implementation, the extracting sub-module includes:
the cutting unit is used for cutting adjacent characters of each bullet screen aiming at the specified object to obtain a cutting result;
and the extraction unit is used for extracting new words from each bullet screen aiming at the specified object according to the solidification degree and the freedom degree of the cutting result.
According to another aspect of the present disclosure, there is provided a bullet screen analyzing apparatus including: a processor; a memory for storing processor-executable instructions; wherein the processor is configured to perform the above method.
According to another aspect of the present disclosure, there is provided a non-transitory computer-readable storage medium having computer program instructions stored thereon, wherein the computer program instructions, when executed by a processor, implement the above-described method.
The bullet screen analysis method and device of the above aspects of the disclosure perform word segmentation on each bullet screen posted for a specified object to obtain a word segmentation result for each bullet screen, determine the candidate words corresponding to the specified object according to the word frequencies of the words in those results, determine, for a first candidate word corresponding to the specified object, a parameter value of the first candidate word in each time segment according to the word frequency of the first candidate word in each time segment of the specified object and the word frequencies of all candidate words in each time segment, and determine the first candidate word as a keyword of a first time segment when its parameter value in that time segment satisfies a condition. The keywords of the specified object in each time segment can thereby be determined accurately, helping business personnel understand what large numbers of users focus on during each time segment of the specified object.
Other features and aspects of the present disclosure will become apparent from the following detailed description of exemplary embodiments, which proceeds with reference to the accompanying drawings.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate exemplary embodiments, features, and aspects of the disclosure and, together with the description, serve to explain the principles of the disclosure.
Fig. 1 shows a flow diagram of a bullet screen analysis method according to an embodiment of the present disclosure.
Fig. 2 is a schematic diagram illustrating the bullet screen volume for a specified object in a bullet screen analysis method according to an embodiment of the present disclosure.
Fig. 3 shows an exemplary flowchart of step S13 of the bullet screen analyzing method according to an embodiment of the present disclosure.
Fig. 4 shows an exemplary flowchart of step S11 of the bullet screen analyzing method according to an embodiment of the present disclosure.
Fig. 5 shows an exemplary flowchart of step S111 of the bullet screen analysis method according to an embodiment of the present disclosure.
Fig. 6 shows a block diagram of a bullet screen analysis device according to an embodiment of the present disclosure.
Fig. 7 shows an exemplary block diagram of a bullet screen analysis device according to an embodiment of the present disclosure.
Fig. 8 is a block diagram illustrating an apparatus 800 for bullet screen analysis in accordance with an exemplary embodiment.
Fig. 9 is a block diagram illustrating an apparatus 1900 for bullet screen analysis according to an exemplary embodiment.
Detailed Description
Various exemplary embodiments, features and aspects of the present disclosure will be described in detail below with reference to the accompanying drawings. In the drawings, like reference numbers indicate functionally identical or similar elements. While the various aspects of the embodiments are presented in drawings, the drawings are not necessarily drawn to scale unless specifically indicated.
The word "exemplary" is used exclusively herein to mean "serving as an example, embodiment, or illustration." Any embodiment described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments.
Furthermore, in the following detailed description, numerous specific details are set forth in order to provide a better understanding of the present disclosure. It will be understood by those skilled in the art that the present disclosure may be practiced without some of these specific details. In some instances, methods, means, elements and circuits that are well known to those skilled in the art have not been described in detail so as not to obscure the present disclosure.
Fig. 1 shows a flowchart of a bullet screen analysis method according to an embodiment of the present disclosure. As shown in fig. 1, the method includes steps S11 through S14.
In step S11, word segmentation is performed on each bullet screen posted for the specified object, and a word segmentation result is obtained for each bullet screen.
The specified object may be any object on which bullet screens can be posted. For example, the specified object may be a video or an audio item.
In this embodiment, a word segmentation technology in the related art may be adopted to perform word segmentation processing on each bullet screen for the specified object, so as to obtain a word segmentation result of each bullet screen.
In step S12, each candidate word corresponding to the specified object is determined according to the word frequencies of the words in the word segmentation results of the bullet screens.
In one possible implementation manner, the N words with the highest word frequency in all time segments of the specified object may be determined as candidate words corresponding to the specified object. For example, N may be equal to 50.
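A minimal sketch of this selection step follows, assuming the segmented bullet screens from step S11 are available as lists of tokens; the function and variable names are illustrative, not taken from the patent.

```python
from collections import Counter

def candidate_words(segmented_danmaku, n=50):
    """Return the N most frequent words over all bullet screens as the
    candidate words for the specified object (N = 50 in the example above).

    `segmented_danmaku` is assumed to be a list of token lists, one per
    bullet screen, i.e. the word segmentation results of step S11.
    """
    freq = Counter(token for tokens in segmented_danmaku for token in tokens)
    return [word for word, _ in freq.most_common(n)]
```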
In step S13, for a first candidate word corresponding to the specified object, a parameter value of the first candidate word in each time segment is determined according to the word frequency of the first candidate word in each time segment of the specified object and the word frequencies of all candidate words in each time segment, where the first candidate word is one of the candidate words.
In a possible implementation manner, the parameter value of the first candidate word in the first time segment may be positively correlated with the word frequency of the first candidate word in the first time segment. In this implementation manner, the parameter value may be determined in various ways, as long as it is determined according to the word frequency of the first candidate word in each time segment of the specified object and the word frequencies of all candidate words in each time segment, and remains positively correlated with the word frequency of the first candidate word in the first time segment.
In step S14, in a case that a parameter value of the first candidate word in a first time segment satisfies a condition, the first candidate word is determined as a keyword of the first time segment, where the first time segment is one of the time segments.
In one possible implementation manner, the parameter value of the first candidate word in the first time segment may be determined to satisfy the condition if it is greater than a third threshold. For example, the third threshold may be equal to 0.0001.
In this embodiment, word segmentation is performed on each bullet screen posted for a specified object to obtain a word segmentation result for each bullet screen, the candidate words corresponding to the specified object are determined according to the word frequencies of the words in those results, a parameter value of a first candidate word in each time segment is determined according to the word frequency of the first candidate word in each time segment of the specified object and the word frequencies of all candidate words in each time segment, and the first candidate word is determined as a keyword of a first time segment when its parameter value in that time segment satisfies a condition. The keywords of the specified object in each time segment can thereby be determined accurately, helping business personnel understand what most users focus on during each time segment of the specified object.
Fig. 2 is a schematic diagram illustrating the bullet screen volume for a specified object in a bullet screen analysis method according to an embodiment of the present disclosure. In Fig. 2, the horizontal axis is the time axis in minutes, and the vertical axis is the bullet screen volume. With the present embodiment, the keywords of the specified object of Fig. 2 in each time segment can be determined. For example, keywords may be determined for the time segment that contains the peak of the bullet screen volume.
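Per-segment keyword extraction presupposes that every bullet screen has first been assigned to a time segment. A minimal bucketing sketch follows, assuming each bullet screen carries the playback timestamp (in seconds) at which it was posted; the one-minute default matches the minute-scale axis of Fig. 2, and all names are illustrative.

```python
def bucket_by_segment(danmaku, duration_minutes, segment_minutes=1):
    """Group (timestamp_seconds, text) bullet screens into fixed-length
    time segments along the playback axis."""
    num_segments = -(-duration_minutes // segment_minutes)  # ceiling division
    segments = [[] for _ in range(num_segments)]
    for ts, text in danmaku:
        # Clamp so a bullet screen posted at the final instant still lands
        # in the last segment rather than out of range.
        idx = min(int(ts // (segment_minutes * 60)), num_segments - 1)
        segments[idx].append(text)
    return segments
```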
Fig. 3 shows an exemplary flowchart of step S13 of the bullet screen analysis method according to an embodiment of the present disclosure. As shown in fig. 3, step S13 may include steps S131 to S135.
In step S131, a first ratio of the word frequency of all candidate words in the first time segment to the total word frequency of all candidate words in each time segment is determined.
For example, if the duration of the specified object is 4 minutes, the specified object may be divided into 4 time segments, each lasting 1 minute. Suppose the candidate words corresponding to the specified object are "hair", "accent", and "military training", and the word frequency of each candidate word in each time segment is as shown in Table 1.
TABLE 1

                    Hair    Accent    Military training
1st time segment       1         2                    9
2nd time segment       4         0                    3
3rd time segment       6         1                    3
4th time segment       0        10                    2
For example, the 1st candidate word is "hair", the 2nd candidate word is "accent", and the 3rd candidate word is "military training". Let N_{i,j} denote the word frequency of the ith candidate word in the jth time segment: N_{1,1}=1, N_{1,2}=4, N_{1,3}=6, N_{1,4}=0; N_{2,1}=2, N_{2,2}=0, N_{2,3}=1, N_{2,4}=10; N_{3,1}=9, N_{3,2}=3, N_{3,3}=3, N_{3,4}=2. Let N_{a,j} denote the word frequency of all candidate words in the jth time segment: N_{a,1}=12, N_{a,2}=7, N_{a,3}=10, N_{a,4}=12. Let N_{a,a} denote the total word frequency of all candidate words in each time segment: N_{a,a}=41. Then B_{a,j}, the first ratio of the word frequency of all candidate words in the jth time segment to the total word frequency of all candidate words in each time segment, is

B_{a,j} = N_{a,j} / N_{a,a}
in step S132, a product of the total word frequency of the first candidate word in each time segment and the first ratio is determined as a first word frequency.
For example, let N_{i,a} denote the total word frequency of the ith candidate word in each time segment: N_{1,a}=11, N_{2,a}=13, N_{3,a}=17. Let N'_{i,j} denote the first word frequency of the ith candidate word in the jth time segment:

N'_{i,j} = N_{i,a} × B_{a,j}

The first word frequency of each candidate word of Table 1 in each time segment is shown in Table 2 (values recomputed from Table 1 by the formula above; the original table image is not reproduced).

TABLE 2

                     Hair    Accent    Military training
1st time segment    ≈3.22     ≈3.80               ≈4.98
2nd time segment    ≈1.88     ≈2.22               ≈2.90
3rd time segment    ≈2.68     ≈3.17               ≈4.15
4th time segment    ≈3.22     ≈3.80               ≈4.98
In step S133, a second ratio of the word frequency of the first candidate word in the first time segment to the total word frequency of all candidate words in each time segment is determined.
For example, let B_{i,j} denote the second ratio of the word frequency of the ith candidate word in the jth time segment to the total word frequency of all candidate words in each time segment:

B_{i,j} = N_{i,j} / N_{a,a}

The second ratio of each candidate word of Table 1 in each time segment is shown in Table 3 (values recomputed from Table 1; the original table image is not reproduced).

TABLE 3

                     Hair     Accent    Military training
1st time segment    ≈0.024    ≈0.049              ≈0.220
2nd time segment    ≈0.098     0.000              ≈0.073
3rd time segment    ≈0.146    ≈0.024              ≈0.073
4th time segment     0.000    ≈0.244              ≈0.049
In step S134, a third ratio of the first word frequency to the total word frequency of all candidate words in each time segment is determined.
For example, let B'_{i,j} denote the third ratio of the first word frequency of the ith candidate word in the jth time segment to the total word frequency of all candidate words in each time segment:

B'_{i,j} = N'_{i,j} / N_{a,a}

The third ratio of each candidate word of Table 1 in each time segment is shown in Table 4 (values recomputed from Table 2; the original table image is not reproduced).

TABLE 4

                      Hair     Accent    Military training
1st time segment    ≈0.0785    ≈0.0928             ≈0.1214
2nd time segment    ≈0.0458    ≈0.0541             ≈0.0708
3rd time segment    ≈0.0654    ≈0.0773             ≈0.1011
4th time segment    ≈0.0785    ≈0.0928             ≈0.1214
In step S135, the parameter value of the first candidate word in the first time segment is determined according to the second ratio and the third ratio.
In one possible implementation manner, determining the parameter value of the first candidate word in the first time segment according to the second ratio and the third ratio may include: determining the parameter value according to the second ratio, the third ratio, the number of time segments, and the total number of candidate words.
As an example of this implementation, let R_{i,j} denote the parameter value of the ith candidate word in the jth time segment. [The formula image for R_{i,j} is not reproduced here; per the surrounding description, it combines the second ratio B_{i,j}, the third ratio B'_{i,j}, and a quantity n.] Here n may be equal to the product of the number of time segments and the total number of candidate words; in the example shown in Table 1, n is 12.
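To make steps S131 to S135 concrete, the sketch below recomputes Tables 2 to 4 from the Table 1 counts. Because the patent's exact formula for R_{i,j} is only described indirectly (see the placeholder above), the last step substitutes a chi-square-style comparison of the observed ratio B_{i,j} against the expected ratio B'_{i,j}; that substitution, the threshold handling, and all names are illustrative assumptions, not the patent's definition.

```python
# Word frequencies from Table 1: rows are candidate words, columns are the
# four one-minute time segments.
FREQ = {
    "hair":              [1, 4, 6, 0],
    "accent":            [2, 0, 1, 10],
    "military training": [9, 3, 3, 2],
}

n_aj = [sum(col) for col in zip(*FREQ.values())]  # N_{a,j}: per-segment totals -> [12, 7, 10, 12]
n_aa = sum(n_aj)                                  # N_{a,a}: grand total -> 41
b_aj = [x / n_aa for x in n_aj]                   # first ratio B_{a,j} (step S131)

second, third = {}, {}
for word, row in FREQ.items():
    n_ia = sum(row)                               # N_{i,a}: total frequency of word i
    first_freq = [n_ia * b for b in b_aj]         # N'_{i,j} = N_{i,a} * B_{a,j} (step S132, Table 2)
    second[word] = [x / n_aa for x in row]        # second ratio B_{i,j} (step S133, Table 3)
    third[word] = [f / n_aa for f in first_freq]  # third ratio B'_{i,j} (step S134, Table 4)

# Step S135 -- ASSUMPTION: the patent's R_{i,j} formula image is lost, so a
# chi-square-style observed-vs-expected term stands in for it here. The
# expected ratios are strictly positive for this data, so no zero division.
r = {word: [(b - bp) ** 2 / bp for b, bp in zip(second[word], third[word])]
     for word in FREQ}

threshold = 0.0001  # the example third threshold from the description
for word, values in r.items():
    for j, value in enumerate(values, start=1):
        if value > threshold:
            print(f"'{word}' is a keyword candidate for time segment {j} (R = {value:.4f})")
```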
Although the specific implementation of step S13 is described above with the example shown in fig. 3, those skilled in the art will appreciate that the present disclosure should not be limited thereto. Those skilled in the art can flexibly set the specific implementation manner of step S13 according to the actual application scenario requirements and/or personal preferences. For example, the parameter values of the two candidate words with the highest word frequency in the first time segment may be determined to be 1, the parameter values of the other candidate words in the first time segment may be determined to be 0, and the candidate word with the parameter value of 1 may be determined to be the keyword in the first time segment.
Fig. 4 shows an exemplary flowchart of step S11 of the bullet screen analysis method according to an embodiment of the present disclosure. As shown in fig. 4, step S11 may include step S111 and step S112.
In step S111, a new word is extracted from each bullet screen for the specified object.
In the present embodiment, a new word extraction technique in the related art may be adopted to extract new words from the bullet screens posted for the specified object.
In step S112, word segmentation processing is performed on each bullet screen according to the new words, so as to obtain word segmentation results of each bullet screen.
In a possible implementation manner, the extracted new words may be added to the word segmentation dictionary, and word segmentation may then be performed on each bullet screen posted for the specified object.
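For illustration only, assuming the open-source jieba segmenter stands in for the "word segmentation technology in the related art" (the patent does not name a segmenter), registering the new words before segmenting could look like this:

```python
import jieba  # an off-the-shelf Chinese segmenter, used here only as a stand-in

def segment_with_new_words(danmaku_texts, new_words):
    """Add the extracted new words to the segmenter's dictionary, then
    segment each bullet screen into a list of tokens."""
    for word in new_words:
        jieba.add_word(word)  # keeps the new word intact during segmentation
    return [jieba.lcut(text) for text in danmaku_texts]
```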
Fig. 5 shows an exemplary flowchart of step S111 of the bullet screen analysis method according to an embodiment of the present disclosure. As shown in fig. 5, step S111 may include step S1111 and step S1112.
In step S1111, adjacent character cutting is performed on each bullet screen for the designated object, and a cutting result is obtained.
In a possible implementation manner, the OffsetAttribute class of Lucene's TokenStream may be used to perform adjacent character cutting on each bullet screen posted for the specified object, so as to obtain a cutting result.
In step S1112, new words are extracted from each bullet screen for the specified object based on the degree of solidification and the degree of freedom of the cutting result.
In this embodiment, the solidification degrees and freedom degrees of the words in the cutting result may be calculated according to methods for calculating solidification degree and freedom degree in the related art.
In one possible implementation, a word A may be determined to be a new word if its solidification degree in the cutting result is greater than a first threshold and its freedom degree is greater than a second threshold.
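A compact sketch of steps S1111 and S1112 follows. It restricts itself to two-character candidates, reads "solidification degree" as the cohesion ratio p(ab) / (p(a) * p(b)), and reads "freedom degree" as the smaller of the left- and right-neighbor entropies; those readings and the threshold values are common interpretations offered as assumptions, not the patent's exact definitions.

```python
import math
from collections import Counter, defaultdict

def extract_new_words(texts, min_solidity=50.0, min_freedom=1.0):
    """Extract two-character new-word candidates from bullet screens.

    Solidification here = p(pair) / (p(first char) * p(second char));
    freedom here = min(left-neighbor entropy, right-neighbor entropy).
    The thresholds play the roles of the first and second thresholds above.
    """
    chars, pairs = Counter(), Counter()
    left, right = defaultdict(Counter), defaultdict(Counter)
    for text in texts:
        chars.update(text)
        for i in range(len(text) - 1):
            pair = text[i:i + 2]                   # adjacent character cutting (step S1111)
            pairs[pair] += 1
            if i > 0:
                left[pair][text[i - 1]] += 1       # character to the left of the pair
            if i + 2 < len(text):
                right[pair][text[i + 2]] += 1      # character to the right of the pair
    total_chars, total_pairs = sum(chars.values()), sum(pairs.values())

    def entropy(counter):
        total = sum(counter.values())
        return (-sum(c / total * math.log(c / total) for c in counter.values())
                if total else 0.0)

    new_words = []
    for pair, freq in pairs.items():
        p_pair = freq / total_pairs
        p_a, p_b = chars[pair[0]] / total_chars, chars[pair[1]] / total_chars
        solidity = p_pair / (p_a * p_b)                           # solidification degree
        freedom = min(entropy(left[pair]), entropy(right[pair]))  # freedom degree
        if solidity > min_solidity and freedom > min_freedom:     # step S1112
            new_words.append(pair)
    return new_words
```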
Fig. 6 shows a block diagram of a bullet screen analysis device according to an embodiment of the present disclosure. As shown in fig. 6, the apparatus includes: the word segmentation module 61, configured to perform word segmentation on each bullet screen posted for the specified object to obtain word segmentation results of each bullet screen; the first determining module 62, configured to determine, according to the word frequencies of the words in the word segmentation results of the bullet screens, each candidate word corresponding to the specified object; the second determining module 63, configured to determine, for a first candidate word corresponding to the specified object, a parameter value of the first candidate word in each time segment according to the word frequency of the first candidate word in each time segment of the specified object and the word frequencies of all candidate words in each time segment, where the first candidate word is one of the candidate words; and the third determining module 64, configured to determine the first candidate word as the keyword of the first time segment when a parameter value of the first candidate word in the first time segment meets a condition, where the first time segment is one of the time segments.
Fig. 7 shows an exemplary block diagram of a bullet screen analysis device according to an embodiment of the present disclosure. As shown in fig. 7:
in one possible implementation, the second determining module 63 includes: the first determining submodule 631, configured to determine a first ratio of the word frequency of all candidate words in the first time segment to the total word frequency of all candidate words in each time segment; the second determining submodule 632, configured to determine the product of the total word frequency of the first candidate word in each time segment and the first ratio as a first word frequency; the third determining submodule 633, configured to determine a second ratio of the word frequency of the first candidate word in the first time segment to the total word frequency of all candidate words in each time segment; the fourth determining submodule 634, configured to determine a third ratio of the first word frequency to the total word frequency of all candidate words in each time segment; and the fifth determining submodule 635, configured to determine, according to the second ratio and the third ratio, the parameter value of the first candidate word in the first time segment.
In one possible implementation, the fifth determining submodule 635 is configured to: and determining the parameter value of the first candidate word in the first time segment according to the second ratio, the third ratio, the number of the time segments and the total number of the candidate words.
In one possible implementation, the word segmentation module 61 includes: an extraction submodule 611, configured to extract new words from each bullet screen for the specified object; and the word segmentation sub-module 612 is configured to perform word segmentation processing on each bullet screen according to the new words to obtain word segmentation results of each bullet screen.
In one possible implementation, the extraction sub-module 611 includes: the cutting unit is used for cutting adjacent characters of each bullet screen aiming at the specified object to obtain a cutting result; and the extracting unit is used for extracting new words from each bullet screen aiming at the specified object according to the solidification degree and the freedom degree of the cutting result.
In this embodiment, word segmentation is performed on each bullet screen posted for a specified object to obtain a word segmentation result for each bullet screen, the candidate words corresponding to the specified object are determined according to the word frequencies of the words in those results, a parameter value of a first candidate word in each time segment is determined according to the word frequency of the first candidate word in each time segment of the specified object and the word frequencies of all candidate words in each time segment, and the first candidate word is determined as a keyword of a first time segment when its parameter value in that time segment satisfies a condition. The keywords of the specified object in each time segment can thereby be determined accurately, helping business personnel understand what most users focus on during each time segment of the specified object.
Fig. 8 is a block diagram illustrating an apparatus 800 for bullet screen analysis in accordance with an exemplary embodiment. For example, the apparatus 800 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, an exercise device, a personal digital assistant, and the like.
Referring to fig. 8, the apparatus 800 may include one or more of the following components: processing component 802, memory 804, power component 806, multimedia component 808, audio component 810, input/output (I/O) interface 812, sensor component 814, and communication component 816.
The processing component 802 generally controls overall operation of the device 800, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing components 802 may include one or more processors 820 to execute instructions to perform all or a portion of the steps of the methods described above. Further, the processing component 802 can include one or more modules that facilitate interaction between the processing component 802 and other components. For example, the processing component 802 can include a multimedia module to facilitate interaction between the multimedia component 808 and the processing component 802.
The memory 804 is configured to store various types of data to support operations at the apparatus 800. Examples of such data include instructions for any application or method operating on device 800, contact data, phonebook data, messages, pictures, videos, and so forth. The memory 804 may be implemented by any type or combination of volatile or non-volatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.
Power components 806 provide power to the various components of device 800. The power components 806 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the device 800.
The multimedia component 808 includes a screen that provides an output interface between the device 800 and the user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive an input signal from a user. The touch panel includes one or more touch sensors to sense touch, slide, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 808 includes a front facing camera and/or a rear facing camera. The front camera and/or the rear camera may receive external multimedia data when the device 800 is in an operating mode, such as a shooting mode or a video mode. Each front camera and rear camera may be a fixed optical lens system or have a focal length and optical zoom capability.
The audio component 810 is configured to output and/or input audio signals. For example, the audio component 810 includes a Microphone (MIC) configured to receive external audio signals when the apparatus 800 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signals may further be stored in the memory 804 or transmitted via the communication component 816. In some embodiments, audio component 810 also includes a speaker for outputting audio signals.
The I/O interface 812 provides an interface between the processing component 802 and peripheral interface modules, which may be keyboards, click wheels, buttons, etc. These buttons may include, but are not limited to: a home button, a volume button, a start button, and a lock button.
The sensor assembly 814 includes one or more sensors for providing various aspects of state assessment for the device 800. For example, the sensor assembly 814 may detect the open/closed status of the device 800, the relative positioning of components, such as a display and keypad of the device 800, the sensor assembly 814 may also detect a change in the position of the device 800 or a component of the device 800, the presence or absence of user contact with the device 800, the orientation or acceleration/deceleration of the device 800, and a change in the temperature of the device 800. Sensor assembly 814 may include a proximity sensor configured to detect the presence of a nearby object without any physical contact. The sensor assembly 814 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor assembly 814 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 816 is configured to facilitate communication between the apparatus 800 and other devices in a wired or wireless manner. The device 800 may access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof. In an exemplary embodiment, the communication component 816 receives a broadcast signal or broadcast related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 816 further includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, infrared data association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the apparatus 800 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, micro-controllers, microprocessors or other electronic components for performing the above-described methods.
In an exemplary embodiment, a non-transitory computer-readable storage medium, such as the memory 804, is also provided that includes computer program instructions executable by the processor 820 of the device 800 to perform the above-described methods.
Fig. 9 is a block diagram illustrating an apparatus 1900 for bullet screen analysis according to an exemplary embodiment. For example, the apparatus 1900 may be provided as a server. Referring to fig. 9, the apparatus 1900 includes a processing component 1922 further including one or more processors and memory resources, represented by memory 1932, for storing instructions, e.g., applications, executable by the processing component 1922. The application programs stored in memory 1932 may include one or more modules that each correspond to a set of instructions. Further, the processing component 1922 is configured to execute instructions to perform the above-described method.
The device 1900 may also include a power component 1926 configured to perform power management of the device 1900, a wired or wireless network interface 1950 configured to connect the device 1900 to a network, and an input/output (I/O) interface 1958. The device 1900 may operate based on an operating system stored in memory 1932, such as Windows Server, Mac OS X, Unix, Linux, FreeBSD, or the like.
In an exemplary embodiment, a non-transitory computer readable storage medium, such as a memory 1932, is also provided that includes computer program instructions executable by the processing component 1922 of the apparatus 1900 to perform the methods described above.
The present disclosure may be systems, methods, and/or computer program products. The computer program product may include a computer-readable storage medium having computer-readable program instructions embodied thereon for causing a processor to implement various aspects of the present disclosure.
The computer readable storage medium may be a tangible device that can hold and store the instructions for use by the instruction execution device. The computer readable storage medium may be, for example, but not limited to, an electronic memory device, a magnetic memory device, an optical memory device, an electromagnetic memory device, a semiconductor memory device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a Static Random Access Memory (SRAM), a portable compact disc read-only memory (CD-ROM), a Digital Versatile Disc (DVD), a memory stick, a floppy disk, a mechanical encoding device, such as punch cards or in-groove raised structures having instructions stored thereon, and any suitable combination of the foregoing. Computer-readable storage media as used herein is not to be interpreted as a transitory signal per se, such as a radio wave or other freely propagating electromagnetic wave, an electromagnetic wave propagating through a waveguide or other transmission medium (e.g., optical pulses through a fiber optic cable), or an electrical signal transmitted through an electrical wire.
The computer-readable program instructions described herein may be downloaded from a computer-readable storage medium to a respective computing/processing device, or to an external computer or external storage device via a network, such as the internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. The network adapter card or network interface in each computing/processing device receives the computer-readable program instructions from the network and forwards the computer-readable program instructions for storage in a computer-readable storage medium in the respective computing/processing device.
The computer program instructions for carrying out operations of the present disclosure may be assembler instructions, instruction-set-architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or source or object code written in any combination of one or more programming languages, including an object-oriented programming language such as Smalltalk or C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the remote-computer case, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, electronic circuitry such as a programmable logic circuit, a field-programmable gate array (FPGA), or a programmable logic array (PLA) may execute the computer-readable program instructions by utilizing their state information to personalize the electronic circuitry, thereby implementing aspects of the present disclosure.
Various aspects of the present disclosure are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the disclosure. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer readable program instructions.
These computer-readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer-readable program instructions may also be stored in a computer-readable storage medium that can direct a computer, programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer-readable medium storing the instructions comprises an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer, other programmable apparatus or other devices implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Having described embodiments of the present disclosure, the foregoing description is intended to be exemplary, not exhaustive, and not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terms used herein were chosen in order to best explain the principles of the embodiments, the practical application, or technical improvements to the techniques in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims (10)

1. A bullet screen analysis method is characterized by comprising the following steps:
performing word segmentation processing on each bullet screen aiming at a specified object to obtain word segmentation results of each bullet screen;
determining each candidate word corresponding to the specified object according to the word frequencies of the words in the word segmentation results of the bullet screens;
determining a first ratio of the word frequency of all candidate words in a first time segment to the total word frequency of all candidate words in each time segment;
determining the product of the total word frequency of the first candidate word in each time segment and the first ratio as a first word frequency;
determining a second ratio of the word frequency of the first candidate word in the first time segment to the total word frequency of all candidate words in each time segment;
determining a third ratio of the first word frequency to the total word frequency of all candidate words in each time segment;
determining a parameter value of the first candidate word in the first time segment according to the second ratio and the third ratio;
and, under the condition that the parameter value of the first candidate word in a first time segment meets the condition, determining the first candidate word as a keyword of the first time segment, and determining the points of interest of users in each time segment of the specified object according to the keywords, wherein the first time segment is one of the time segments.
2. The method of claim 1, wherein determining the parameter value of the first candidate word in the first time segment according to the second ratio and the third ratio comprises:
and determining the parameter value of the first candidate word in the first time segment according to the second ratio, the third ratio, the number of time segments and the total number of the candidate words.
3. The method of claim 1, wherein performing a word segmentation process on each bullet screen for a specified object to obtain a word segmentation result of each bullet screen comprises:
extracting new words from each bullet screen aiming at the specified object;
and performing word segmentation processing on each bullet screen according to the new words to obtain word segmentation results of each bullet screen.
4. The method of claim 3, wherein extracting new words from each bullet screen for a given object comprises:
carrying out adjacent character cutting on each bullet screen aiming at the specified object to obtain a cutting result;
and extracting new words from each bullet screen aiming at the specified object according to the solidification degree and the freedom degree of the cutting result.
5. A bullet screen analysis device, comprising:
the word segmentation module is used for performing word segmentation processing on each bullet screen aiming at the specified object to obtain a word segmentation result of each bullet screen;
the first determining module is used for determining each candidate word corresponding to the specified object according to the word frequency of the word in the word segmentation result of each bullet screen;
the first determining submodule, configured to determine a first ratio of the word frequency of all candidate words in a first time segment to the total word frequency of all candidate words in each time segment;
a second determining submodule, configured to determine a product of a total word frequency of the first candidate word in each time segment and the first ratio as a first word frequency;
a third determining submodule, configured to determine a second ratio of the word frequency of the first candidate word in the first time segment to a total word frequency of all candidate words in each time segment;
a fourth determining sub-module, configured to determine a third ratio of the first word frequency to a total word frequency of all candidate words in each time segment;
a fifth determining submodule, configured to determine, according to the second ratio and the third ratio, a parameter value of the first candidate word in the first time segment;
the third determining module is configured to determine the first candidate word as a keyword of a first time segment when a parameter value of the first candidate word in the first time segment meets a condition, and determine a point of interest of a user in each time segment of a specified object according to the keyword, where the first time segment is one of the time segments.
6. The apparatus of claim 5, wherein the fifth determination submodule is configured to:
and determining the parameter value of the first candidate word in the first time segment according to the second ratio, the third ratio, the number of time segments and the total number of the candidate words.
7. The apparatus of claim 5, wherein the word segmentation module comprises:
the extraction submodule is used for extracting new words from each bullet screen aiming at the specified object;
and the word segmentation sub-module is used for carrying out word segmentation processing on each bullet screen according to the new words to obtain word segmentation results of each bullet screen.
8. The apparatus of claim 7, wherein the extraction sub-module comprises:
the cutting unit is used for cutting adjacent characters of each bullet screen aiming at the specified object to obtain a cutting result;
and the extracting unit is used for extracting new words from each bullet screen aiming at the specified object according to the solidification degree and the freedom degree of the cutting result.
9. A bullet screen analysis device, comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to perform the method of any one of claims 1 to 4.
10. A non-transitory computer readable storage medium having computer program instructions stored thereon, wherein the computer program instructions, when executed by a processor, implement the method of any of claims 1 to 4.
CN201810186987.9A 2018-03-07 2018-03-07 Bullet screen analysis method and device Active CN110245343B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810186987.9A CN110245343B (en) 2018-03-07 2018-03-07 Bullet screen analysis method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810186987.9A CN110245343B (en) 2018-03-07 2018-03-07 Bullet screen analysis method and device

Publications (2)

Publication Number Publication Date
CN110245343A CN110245343A (en) 2019-09-17
CN110245343B true CN110245343B (en) 2022-09-09

Family

ID=67882461

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810186987.9A Active CN110245343B (en) 2018-03-07 2018-03-07 Bullet screen analysis method and device

Country Status (1)

Country Link
CN (1) CN110245343B (en)

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8463648B1 (en) * 2012-05-04 2013-06-11 Pearl.com LLC Method and apparatus for automated topic extraction used for the creation and promotion of new categories in a consultation system
CN105447004B (en) * 2014-08-08 2019-12-03 北京小度互娱科技有限公司 The excavation of word, relevant inquiring method and device are recommended in inquiry
US9501568B2 (en) * 2015-01-02 2016-11-22 Gracenote, Inc. Audio matching based on harmonogram
CN105260362B (en) * 2015-10-30 2019-02-12 小米科技有限责任公司 New words extraction method and apparatus
CN105893478B (en) * 2016-03-29 2019-10-29 广州华多网络科技有限公司 A kind of tag extraction method and apparatus
CN106028176B (en) * 2016-05-31 2018-12-14 北京奇艺世纪科技有限公司 The method and device at the time point of Hot Contents in a kind of determining Streaming Media
CN106210770B (en) * 2016-07-11 2019-05-14 北京小米移动软件有限公司 A kind of method and apparatus showing barrage information
CN107105318B (en) * 2017-03-21 2021-01-29 华为技术有限公司 Video hotspot segment extraction method, user equipment and server
CN107491541B (en) * 2017-08-24 2021-03-02 北京丁牛科技有限公司 Text classification method and device
CN107608964B (en) * 2017-09-13 2021-01-12 上海六界信息技术有限公司 Live broadcast content screening method, device, equipment and storage medium based on barrage

Also Published As

Publication number Publication date
CN110245343A (en) 2019-09-17


Legal Events

PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right
  Effective date of registration: 20200430
  Address after: 310052 room 508, floor 5, building 4, No. 699, Wangshang Road, Changhe street, Binjiang District, Hangzhou City, Zhejiang Province
  Applicant after: Alibaba (China) Co.,Ltd.
  Address before: 100080 Beijing Haidian District city Haidian street A Sinosteel International Plaza No. 8 block 5 layer A, C
  Applicant before: Youku network technology (Beijing) Co.,Ltd.
CB02 Change of applicant information
  Address after: Room 554, 5 / F, building 3, 969 Wenyi West Road, Wuchang Street, Yuhang District, Hangzhou City, Zhejiang Province
  Applicant after: Alibaba (China) Co.,Ltd.
  Address before: 310052 room 508, 5th floor, building 4, No. 699 Wangshang Road, Changhe street, Binjiang District, Hangzhou City, Zhejiang Province
  Applicant before: Alibaba (China) Co.,Ltd.
GR01 Patent grant
TR01 Transfer of patent right
  Effective date of registration: 20240622
  Address after: 101400 Room 201, 9 Fengxiang East Street, Yangsong Town, Huairou District, Beijing
  Patentee after: Youku Culture Technology (Beijing) Co.,Ltd.
  Country or region after: China
  Address before: Room 554, 5 / F, building 3, 969 Wenyi West Road, Wuchang Street, Yuhang District, Hangzhou City, Zhejiang Province
  Patentee before: Alibaba (China) Co.,Ltd.
  Country or region before: China