WO2019214046A1 - Method, device, computer device, and storage medium for asset trend analysis - Google Patents

Info

Publication number
WO2019214046A1
WO2019214046A1 PCT/CN2018/094887 CN2018094887W WO2019214046A1 WO 2019214046 A1 WO2019214046 A1 WO 2019214046A1 CN 2018094887 W CN2018094887 W CN 2018094887W WO 2019214046 A1 WO2019214046 A1 WO 2019214046A1
Authority
WO
WIPO (PCT)
Prior art keywords
event
asset
keyword
media account
account
Prior art date
Application number
PCT/CN2018/094887
Other languages
French (fr)
Chinese (zh)
Inventor
王健宗
黄章成
吴天博
肖京
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2019214046A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/06 Asset management; Financial planning or analysis

Definitions

  • The present application relates to the field of computer technology, and in particular to a method, an apparatus, a computer device and a storage medium for analyzing an asset trend.
  • The value of assets (such as stocks, funds and futures) rises or falls in response to current events, so asset values change in real time.
  • Asset holders pay attention to their own assets, but their energy is limited and hot events occur suddenly, which makes it difficult for them to follow hot events in real time and analyze the impact of those events on their own assets. If an asset declines rapidly in a short period because of a hot event, the asset holder can suffer significant losses.
  • The main purpose of the present application is to provide a method, a device, a computer device and a storage medium for asset trend analysis that automatically generate asset trends in real time.
  • the present application proposes a method for analyzing asset trends, including:
  • the target event keyword and the emotional vocabulary are input into a preset asset trend analysis model to obtain a trend chart of the asset to be analyzed, and the asset trend analysis model is based on an LSTM model.
  • the application also provides an apparatus for analyzing the trend of assets, including:
  • An acquisition module configured to acquire an asset keyword of an asset to be analyzed
  • An event keyword module configured to acquire a hot event, extract keywords in the hot event, and obtain an event keyword
  • a target hotspot event module configured to determine, in the event keyword, a target event keyword that matches the asset keyword, and obtain a target hotspot event corresponding to the target event keyword;
  • An emotional vocabulary module configured to obtain a comment text of the target hotspot event, and extract an emotional vocabulary in the comment text
  • the application further provides a computer device comprising a memory and a processor, the memory storing computer readable instructions, the processor executing the computer readable instructions to implement the steps of any of the methods described above.
  • the present application also provides a computer non-transitory readable storage medium having stored thereon computer readable instructions that, when executed by a processor, implement the steps of any of the methods described above.
  • The method, device, computer device and storage medium for asset trend analysis of the present application automatically acquire hot events related to an asset and generate a trend prediction chart for the asset according to the keyword description of the hot event and the comments on the hot event.
  • When the chart predicts that the asset will fall below a certain value, a message is sent to the user to reduce the risk of loss to the user's assets.
  • FIG. 1 is a schematic flow chart of a method for analyzing an asset trend according to an embodiment of the present application
  • FIG. 2 is a schematic diagram of a specific process for obtaining a hot spot event in the method for analyzing an asset trend according to an embodiment of the present application
  • FIG. 3 is a schematic flowchart of a method for obtaining a hot spot event in the method for analyzing an asset trend according to another embodiment of the present application;
  • FIG. 4 is a schematic flowchart of a method for obtaining a comment text of a hot event in the method for analyzing an asset trend according to an embodiment of the present application
  • FIG. 5 is a schematic diagram of a specific process for extracting emotional vocabulary in a review text in the method for analyzing an asset trend according to an embodiment of the present application
  • FIG. 6 is a schematic block diagram showing the structure of an apparatus for analyzing an asset trend according to an embodiment of the present application
  • FIG. 7 is a schematic block diagram showing the structure of an event keyword module of an apparatus for analyzing an asset trend according to an embodiment of the present application
  • FIG. 8 is a schematic structural block diagram of an event keyword module of an apparatus for analyzing an asset trend according to another embodiment of the present application.
  • FIG. 9 is a schematic block diagram showing the structure of an emotional vocabulary module of an apparatus for analyzing an asset trend according to an embodiment of the present application.
  • FIG. 10 is a schematic block diagram showing the structure of an emotional vocabulary module of an apparatus for analyzing an asset trend according to an embodiment of the present application
  • FIG. 11 is a schematic block diagram showing the structure of a computer device according to an embodiment of the present application.
  • an embodiment of the present application provides a method for analyzing an asset trend, including the following steps:
  • S2 acquiring a hot event, extracting a keyword in the hot event, and obtaining an event keyword
  • the asset keyword of the asset is a data packet assembled by the background staff in advance.
  • Assets include stocks, funds, futures (gold, crude oil).
  • the key words of the asset include the industry to which the asset belongs.
  • the word corresponding to the industry in which the asset is located is one of the asset keywords.
  • Ping An Group belongs to the financial industry and the technology industry.
  • finance and technology are the key words of the assets of Ping An Group;
  • Ping An Group also has a wholly-owned subsidiary for artificial intelligence R&D, so artificial intelligence and AI are also asset keywords of Ping An Group.
  • the asset keyword corresponding to the asset also includes the name of the top management of the company to which the asset belongs.
  • asset keywords are labels that describe assets.
  • the background staff sorts out the assets owned by the user and the corresponding asset keywords and collects them in a database.
  • the system accesses the database, receives the name of the asset input by the user, and extracts the asset keyword corresponding to the asset name.
  • A hotspot event is an occurrence that is described by a related text message or report.
  • In the text message, some words describe the key aspects of the event; these words are the event keywords.
  • Keywords may be extracted from the title, or a trained model may be used to identify the words that appear most frequently in the text message as event keywords.
  • the event keyword is generally the subject of the matter and the nature of the matter. For example, there is a hot event about Jia Yueting's FF91 electric car in the United States for alpine testing.
  • the event keywords output after the training model is calculated may include: Jia Yueting, LeTV, electric car, luxury, hope, etc.
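The frequency-based option described above can be sketched as follows. This is a minimal illustration that counts words with a plain counter rather than a trained model; the function name `extract_event_keywords`, the stop-word list and the sample text are assumptions made for the example, not part of the application.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "in", "of", "for", "and", "is", "to", "on"}

def extract_event_keywords(text, top_n=3):
    """Return the top_n most frequent non-stop-words in the text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    counts = Counter(word for word in words if word not in STOPWORDS)
    return [word for word, _ in counts.most_common(top_n)]

sample = ("Electric car test: the electric car passed the alpine test. "
          "Car sales may rise.")
keywords = extract_event_keywords(sample)
```

Title-based extraction could reuse the same function by passing only the headline as `text`.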
  • the system matches the event keyword with the asset keyword in the data packet.
  • The asset keywords are arranged in a preset logical order in the data packet, and the event keywords are arranged in the same preset logical order, so that determining whether an event keyword corresponds to an asset keyword in the data packet is faster.
  • For example, there are 10 event keywords, arranged in pinyin alphabetical order, and 1000 asset keywords in the data packet, also arranged in pinyin alphabetical order.
  • For each event keyword, the system uses the pinyin order to locate the approximate position at which the keyword would rank among the asset keywords in the data packet, and then matches the event keyword only against the asset keywords near that position. This greatly reduces the number of comparisons and saves judgment time.
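One way to realize the ordered lookup described above is a binary search over the sorted asset keywords. The sketch below is an assumption about how the "approximate position" could be found: it uses Python's default string ordering instead of pinyin order, and the function name `match_keywords` and the sample keywords are hypothetical.

```python
from bisect import bisect_left

def match_keywords(event_keywords, asset_keywords):
    """Return the event keywords that also appear among the asset keywords."""
    sorted_assets = sorted(asset_keywords)  # stands in for the preset order
    matches = []
    for keyword in sorted(event_keywords):
        # Locate the position where the keyword would rank, then compare once.
        position = bisect_left(sorted_assets, keyword)
        if position < len(sorted_assets) and sorted_assets[position] == keyword:
            matches.append(keyword)
    return matches

asset_keywords = ["finance", "technology", "artificial intelligence", "AI"]
event_keywords = ["electric car", "finance", "luxury", "AI"]
target_keywords = match_keywords(event_keywords, asset_keywords)
```

With 1000 sorted asset keywords, each lookup needs about ten comparisons instead of up to 1000.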
  • An event keyword that matches successfully is determined as a target event keyword, and the hot event corresponding to the target event keyword is then looked up; this is the target hot event.
  • The comment text of the target hot event consists of the comments that other users of the media platform leave under the media account that posted the event, and also includes the comments left under media accounts that forwarded the event.
  • Using crawler technology, the system directly reads the comments below the media account that posted the target hotspot event, as well as the comments below the media accounts that forwarded it.
  • the comment text also includes an emoticon pack.
  • An emoticon is a way to use pictures to express feelings.
  • the expression pack is generally provided with text.
  • Emotional vocabulary includes text and expressions.
  • Emotional vocabulary is a vocabulary that is prepared by the background staff in advance, and is generally used to comment on stocks.
  • Examples include network vocabulary used to comment on stocks: enhancing stocks, pleasing stocks; stock terminology: full warehouse, Jiancang, Kamakura, grab hat, More killing; and common words in the financial field: William index, moving average, tradable shares, treasury bills. Emotional vocabulary also includes emoji for expressing emotions, such as smiling faces, crying faces and happy faces; further, it includes emoji specific to the stock field, such as [buy], [replica] and [loss of money].
  • LSTM Long Short-Term Memory
  • In an LSTM, each neuron is a "memory cell" with an "input gate", a "forget gate" and an "output gate"; together the three are known as the "three gates".
  • The "forget gate" controls the convergence of the gradients during training while maintaining long-term memory.
  • The hotspot events in this application are also related to time series. Therefore, processing the keywords of hot events with an LSTM-based model performs well and can accurately produce the corresponding asset charts.
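To make the three gates concrete, the following is a minimal sketch of one time step of a standard LSTM cell, reduced to scalar inputs and states for readability. It is not the trained asset trend analysis model of the application; the function `lstm_step` and the parameter names (`wi`, `ui`, `bi` and so on) are assumptions chosen for the example.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, p):
    """One step of a scalar LSTM cell; p maps hypothetical parameter names to floats."""
    i = sigmoid(p["wi"] * x + p["ui"] * h_prev + p["bi"])    # input gate
    f = sigmoid(p["wf"] * x + p["uf"] * h_prev + p["bf"])    # forget gate
    o = sigmoid(p["wo"] * x + p["uo"] * h_prev + p["bo"])    # output gate
    g = math.tanh(p["wg"] * x + p["ug"] * h_prev + p["bg"])  # candidate memory
    c = f * c_prev + i * g   # forget part of the old memory, admit new content
    h = o * math.tanh(c)     # expose part of the memory as the output
    return h, c

# All-zero parameters: every gate is 0.5 and the candidate memory is 0.
zero_params = {name: 0.0 for name in
               ["wi", "ui", "bi", "wf", "uf", "bf",
                "wo", "uo", "bo", "wg", "ug", "bg"]}
h, c = lstm_step(x=1.0, h_prev=0.0, c_prev=2.0, p=zero_params)
```

The line `c = f * c_prev + i * g` is where the forget gate decides how much of the earlier memory survives into the next step.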
  • When training the asset trend analysis model, the staff first input the asset keywords of an asset and the emotional vocabulary that affects the asset's depreciation into the LSTM model for training, and repeat this for multiple assets.
  • Likewise, the keywords of an asset and the emotional vocabulary that affects the asset's appreciation are input into the LSTM model for training, and this is repeated for multiple assets.
  • After training, the weight coefficient of each emotional vocabulary item corresponding to each asset keyword is obtained; that is, an asset analysis model based on the LSTM model is obtained.
  • Given emotional vocabulary as input, the model can output its influence on the appreciation or depreciation of the asset. Combined with step S3 above, the impact of the target hot spot event on the asset can be calculated.
  • the step of acquiring a hot spot event includes:
  • The preset media account refers to a media account that is influential or authoritative in finance, including personal microblog accounts opened by financial experts and media accounts of official financial information channels, such as the official Weibo of the Securities and Futures Commission.
  • the method for obtaining the growth rate is: acquiring the attention quantity x at the current time, extracting the attention quantity y of the preset time before the current time, and calculating the growth rate of x relative to y.
  • the period for obtaining the growth rate may be 5 minutes, 10 minutes, or the like, or may be 10 seconds, 20 seconds, or the like.
  • the growth threshold is a threshold for determining whether the account has a hotspot event.
  • The growth rate obtained in step S21 may also be negative, meaning that the attention quantity of the media account has decreased.
  • The growth threshold therefore includes one or two numbers. Specifically, the growth threshold may be -20% and 10%; that is, a growth rate lower than -20% or higher than 10% exceeds the growth threshold. When the growth rate exceeds the growth threshold, it is determined that the messages published by the media account within the preset time period contain a hot event.
  • The content of the messages published by the media account on that day is obtained, specifically the text content. If a message published by the media account contains a picture, the text in the picture is recognized by scanning. The messages published by the media account within the preset time period may be determined as hot events directly, or they may be further filtered; for example, a message whose number of comments exceeds a preset comment threshold is determined as a hot event.
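The growth-rate check above can be sketched as follows, taking the growth rate of x relative to y as (x - y) / y and using the example thresholds of -20% and 10%. The function names and the sample follower counts are assumptions made for illustration.

```python
def growth_rate(current, previous):
    """Growth rate of the current attention quantity x relative to the earlier quantity y."""
    if previous <= 0:
        raise ValueError("previous attention quantity must be positive")
    return (current - previous) / previous

def exceeds_growth_threshold(current, previous, lower=-0.20, upper=0.10):
    """True when the growth rate is below the lower bound or above the upper bound."""
    rate = growth_rate(current, previous)
    return rate < lower or rate > upper

# Hypothetical attention quantities sampled one preset period apart.
surge = exceeds_growth_threshold(current=1150, previous=1000)   # +15%
steady = exceeds_growth_threshold(current=1050, previous=1000)  # +5%
drop = exceeds_growth_threshold(current=700, previous=1000)     # -30%
```

A single-number threshold can be expressed by setting `lower` to negative infinity (`float("-inf")`).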
  • the step of acquiring the hotspot event includes:
  • After a media account publishes a message, members of the public comment on the message.
  • Each person can post multiple comments.
  • The system obtains the number of comments on each published message, not the total number of comments on the media account. For example, if a media account publishes two messages in one day, the first with 500 comments and the second with 800 comments, the system obtains the comment counts of the two messages, 500 and 800 respectively.
  • The number of comments is then compared with the comment threshold.
  • The comment threshold is a number used to judge whether a message published by a media account is a hot event. For example, the comment threshold is 600.
  • The first message has 500 comments, which does not exceed the comment threshold; the second message has 800 comments, which exceeds the comment threshold, so the second message is determined to be a hot event.
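The per-message comparison in this example can be sketched as a simple filter. The message dictionaries and the function name `find_hot_events` are assumptions made for illustration; the counts and the threshold of 600 come from the example above.

```python
def find_hot_events(messages, comment_threshold=600):
    """Keep the messages whose own comment count exceeds the threshold."""
    return [m for m in messages if m["comments"] > comment_threshold]

# Two messages published by one media account on the same day.
messages = [
    {"id": 1, "comments": 500},
    {"id": 2, "comments": 800},
]
hot_events = find_hot_events(messages)
```

The counts are compared per message, so an account with many lightly commented messages does not produce a hot event.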
  • the method may further include:
  • A media account with a financial mark is a media account related to finance, such as an account that has published financial articles of a certain length or number, or an officially certified financial media account.
  • The system accesses the microblog backend, accesses all media accounts with a financial mark in the microblog, and obtains the information of these financial media accounts.
  • The information of a media account includes the number of friends, the number of fans, the level, the microblog data and information on historical microblogs; each item is quantized.
  • The number of friends, the number of fans and the level are already quantified data; the microblog data can be quantified as the number of microblogs published in the past year, or as the number of microblogs in the past year whose comments reached 500. A preset formula is then applied to the media account, in which:
  • s is the account score
  • a is the friend score
  • b is the fan score
  • c is the rank number
  • d is the score of the microblog comment over 500.
  • the friend score is calculated as shown in Table 1:
  • Table 1 Friend Number and Friend Score Mapping Table
  • the account score is calculated by using a preset formula, and the score reflects the influence of a media account.
  • the account score is then compared to a score threshold.
  • The score threshold is a preset value used to define whether the influence of a media account is large enough to serve as a reference.
  • The system then sets each media account whose account score exceeds the score threshold as a preset media account.
  • For example, with the preset formula in step S202 above, the score threshold is 60.
  • the step of acquiring the comment text of the target hotspot event includes:
  • The network community refers to an online community, such as a stock group, a post bar, a knowing body and the like, in which a large number of users discuss a topic. The system accesses the online community associated with the hot event, for example the post bar for the asset name.
  • For the online community of a group of users who follow the stock, the system extracts the posts published in the post bar and the comments made on those posts.
  • The posts and the comments on the posts are the chat messages mentioned above.
  • Alternatively, the system registers an exchange account and joins a live chat group. When a hot event occurs, the chat record in the live chat group is extracted.
  • the step of extracting the emotional vocabulary in the comment text includes:
  • The emotional vocabulary database contains all the words that the background staff have compiled in advance for expressing emotion; the compiled words are stored together on the server. After the comment text of the hot event is obtained, the emotional vocabulary database is called. Further, the emotional vocabulary database includes a commendatory vocabulary library and a derogatory vocabulary library, and the staff place each emotional word in the corresponding library.
  • In step S44, after the emotional vocabulary database is called, the content of the comment text is scanned, each word is read, semantic analysis is performed, and the words in the comment text are matched against the words in the emotional vocabulary database.
  • In step S45, all the matched words are extracted and defined as the emotional vocabulary of the target hot event, that is, the emotions of multiple users toward the hot event, which summarizes the emotional trend of the event. Specifically, the matched words from the commendatory library and the matched words from the derogatory library are aggregated separately, and the proportion of commendatory to derogatory vocabulary is calculated. From the emotional vocabulary and this ratio, it can be judged whether the hot event is a favorable event or an unfavorable event. This step speeds up the calculation of asset charts in the subsequent asset trend analysis model.
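Steps S44 and S45 can be sketched with two small word sets standing in for the commendatory and derogatory libraries. The sketch uses plain word matching rather than semantic analysis, and the word lists, function names and sample comments are all assumptions made for illustration.

```python
import re

# Hypothetical stand-ins for the two vocabulary libraries.
COMMENDATORY = {"buy", "rally", "happy"}
DEROGATORY = {"loss", "crash", "sell"}

def extract_emotional_vocabulary(comments):
    """Return the matched commendatory and derogatory words, in order of appearance."""
    positive, negative = [], []
    for comment in comments:
        for word in re.findall(r"[a-z]+", comment.lower()):
            if word in COMMENDATORY:
                positive.append(word)
            elif word in DEROGATORY:
                negative.append(word)
    return positive, negative

def commendatory_share(positive, negative):
    """Share of commendatory words among all matched words, or None if none matched."""
    total = len(positive) + len(negative)
    return len(positive) / total if total else None

comments = ["Time to buy, rally ahead", "Huge loss, I will sell", "Crash incoming"]
positive, negative = extract_emotional_vocabulary(comments)
share = commendatory_share(positive, negative)
```

A share below one half suggests that the comments lean toward an unfavorable reading of the event.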
  • step of obtaining the trend chart of the asset to be analyzed includes:
  • The system analyzes the trend of the asset with the asset trend analysis model according to the hot event and the related comment text. The asset may rise sharply or fall sharply; either way, the change is likely to have a large impact on the user's assets. It is therefore necessary to proactively remind the user and send a warning signal. To remind the user more directly and effectively, the server sends information to the user's mobile terminal or calls the user's mobile phone.
  • The trend of the asset exceeding the preset value includes the trend falling below the minimum threshold or rising above the maximum threshold.
  • When the trend of the asset falls below the minimum threshold, a voice message is sent to the user's preset mobile phone to remind the user; when the trend of the asset exceeds the maximum threshold, a short message is sent to the user's preset mobile phone to remind the user.
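The reminder logic can be sketched as a small decision function. It assumes that the voice reminder applies when the predicted trend falls below the minimum threshold and the short message applies when it rises above the maximum threshold; the function name `choose_alert`, the return values and the sample numbers are hypothetical, and actually placing a call or sending a message is outside the sketch.

```python
def choose_alert(predicted_values, min_threshold, max_threshold):
    """Return "voice", "sms" or None for a predicted trend (a list of values)."""
    if min(predicted_values) < min_threshold:
        return "voice"  # sharp fall: the more intrusive reminder
    if max(predicted_values) > max_threshold:
        return "sms"    # sharp rise: a short message
    return None         # the trend stays within the preset range

falling = choose_alert([10.0, 8.0, 4.0], min_threshold=5.0, max_threshold=20.0)
rising = choose_alert([10.0, 25.0], min_threshold=5.0, max_threshold=20.0)
stable = choose_alert([10.0, 12.0], min_threshold=5.0, max_threshold=20.0)
```

Checking the minimum first means that a forecast crossing both thresholds produces the voice reminder.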
  • The method for analyzing the asset trend of the present application automatically acquires hot events related to the asset and generates a trend forecast chart for the asset according to the keyword description of the hot event and the comments on the hot event.
  • When the chart predicts that the asset will fall below a certain value, a message is sent to the user to reduce the risk of loss to the user's assets.
  • the present application further provides an asset trend analysis device, including:
  • the event keyword module 2 is configured to acquire a hot event, extract keywords in the hot event, and obtain an event keyword;
  • the target hotspot event module 3 is configured to determine, in the event keyword, a target event keyword that matches the asset keyword, and acquire a target hotspot event corresponding to the target event keyword;
  • An emotional vocabulary module 4 configured to obtain a comment text of the target hotspot event, and extract an emotional vocabulary in the comment text
  • the obtaining module 5 is configured to input the target event keyword and the emotional vocabulary into a preset asset trend analysis model to obtain a trend chart of the asset to be analyzed, and the asset trend analysis model is based on an LSTM model.
  • the asset keyword of the asset is a data packet assembled by the background staff in advance.
  • Assets include stocks, funds, futures (gold, crude oil).
  • the key words of the asset include the industry to which the asset belongs.
  • the word corresponding to the industry in which the asset is located is one of the asset keywords.
  • Ping An Group belongs to the financial industry and the technology industry.
  • finance and technology are the key words of the assets of Ping An Group;
  • Ping An Group also has a wholly-owned subsidiary for artificial intelligence R&D, so artificial intelligence and AI are also asset keywords of Ping An Group.
  • the asset keyword corresponding to the asset also includes the name of the top management of the company to which the asset belongs.
  • asset keywords are labels that describe assets.
  • the background staff sorts out the assets owned by the user and the corresponding asset keywords and collects them in a database.
  • the obtaining module 1 accesses the database, receives the asset name input by the user, and extracts the asset keyword corresponding to the asset name.
  • A hot event is an occurrence that is described by a related text message or report. In the text message, some words describe the key aspects of the event; these words are the event keywords.
  • The event keyword module 2 may extract keywords from the title, or may use a trained model to identify the words that appear most frequently in the text message as event keywords.
  • the event keyword is generally the subject of the matter and the nature of the matter. For example, there is a hot event about Jia Yueting's FF91 electric car in the United States for alpine testing.
  • the event keywords output by the event keyword module 2 through the trained model may include: Jia Yueting, LeTV, electric car, luxury, hope, and the like.
  • the target hot event module 3 matches the event keyword with the asset keyword in the data packet.
  • The asset keywords are arranged in a preset logical order in the data packet, and the event keywords are arranged in the same preset logical order, so that determining whether an event keyword corresponds to an asset keyword in the data packet is faster.
  • For example, there are 10 event keywords, arranged in pinyin alphabetical order, and 1000 asset keywords in the data packet, also arranged in pinyin alphabetical order.
  • For each event keyword, the target hotspot event module 3 uses the pinyin order to locate the approximate position at which the keyword would rank among the asset keywords in the data packet, and then matches the event keyword only against the asset keywords near that position. This greatly reduces the number of comparisons and saves judgment time.
  • An event keyword that matches successfully is determined as a target event keyword, and the target hot event module 3 then looks up the hot event corresponding to the target event keyword; this is the target hot event.
  • the things described in the target hotspot event are related to the user's assets and are likely to cause changes in the user's assets.
  • The comment text of the target hot event consists of the comments that other users of the media platform leave under the media account that posted the event, and also includes the comments left under media accounts that forwarded the event.
  • Using crawler technology, the emotional vocabulary module 4 directly reads the comments below the media account that posted the target hotspot event, as well as the comments below the media accounts that forwarded it.
  • the comment text also includes an emoticon pack.
  • An emoticon is a way to use pictures to express feelings.
  • the expression package is generally provided with text. When the comment text is an expression package, the emotional vocabulary module 4 recognizes the text in the expression package and converts it into text. Emotional vocabulary includes text and expressions.
  • Emotional vocabulary is a vocabulary that is prepared by the background staff in advance, and is generally used to comment on stocks.
  • Examples include network vocabulary used to comment on stocks: enhancing stocks, pleasing stocks; stock terminology: full warehouse, Jiancang, Kamakura, grab hat, More killing; and common words in the financial field: William index, moving average, tradable shares, treasury bills. Emotional vocabulary also includes emoji for expressing emotions, such as smiling faces, crying faces and happy faces; further, it includes emoji specific to the stock field, such as [buy], [replica] and [loss of money].
  • The obtaining module 5 inputs the target event keyword and the emotional vocabulary into the preset asset trend analysis model, which predicts the appreciation or depreciation trend of the asset to be analyzed according to its trained logic; the result is calculated by the obtaining module 5.
  • LSTM Long Short-Term Memory
  • It is a time recurrent neural network suitable for processing and predicting important events with relatively long intervals and delays in time series.
  • In an LSTM, each neuron is a "memory cell" with an "input gate", a "forget gate" and an "output gate"; together the three are known as the "three gates".
  • The "forget gate" controls the convergence of the gradients during training while maintaining long-term memory.
  • The hotspot events in this application are also related to time series. Therefore, processing the keywords of hot events with an LSTM-based model performs well and can accurately produce the corresponding asset charts.
  • When training the asset trend analysis model, the staff first input the asset keywords of an asset and the emotional vocabulary that affects the asset's depreciation into the LSTM model for training, and repeat this for multiple assets.
  • Likewise, the keywords of an asset and the emotional vocabulary that affects the asset's appreciation are input into the LSTM model for training, and this is repeated for multiple assets.
  • After training, the weight coefficient of each emotional vocabulary item corresponding to each asset keyword is obtained; that is, an asset analysis model based on the LSTM model is obtained.
  • Given emotional vocabulary as input, the model can output its influence on the appreciation or depreciation of the asset.
  • The impact of the target hotspot event on the asset can then be calculated.
  • the event keyword module 2 includes:
  • a growth rate unit 21 configured to obtain a growth rate of a quantity of attention of the preset media account
  • the determining unit 22 is configured to: when the growth rate exceeds the growth threshold, determine that the message published by the media account within a preset time period includes a hot event;
  • the first hotspot event unit 23 is configured to determine the hotspot event according to a message that is sent by the media account in the preset time period.
  • the preset media account refers to some media accounts that are influential or authoritative in terms of finance. Including some personal microblog accounts opened by financial experts, or media accounts related to financial related official information channels, such as the official Weibo of the Securities and Futures Commission.
  • the growth rate unit 21 acquires the growth rate of the attention quantity of the preset media account.
  • the method for obtaining the growth rate by the growth rate unit 21 is: acquiring the attention quantity x of the current time, extracting the attention quantity y of the preset time before the current time, and calculating the growth rate of x relative to y.
  • the period in which the growth rate unit 21 obtains the growth rate may be 5 minutes, 10 minutes, or the like, or may be 10 seconds, 20 seconds, or the like.
  • the growth threshold is a threshold for determining whether the account has a hotspot event.
  • The value calculated by the growth rate unit 21 may also be negative, meaning that the attention quantity of the media account has decreased. Therefore, the growth threshold in the determining unit 22 includes one or two numbers. Specifically, the growth threshold may be -20% and 10%; that is, a growth rate lower than -20% or higher than 10% exceeds the growth threshold.
  • the determining unit 22 determines that the message published by the media account within the preset time period contains a hotspot event.
  • The first hot event unit 23 acquires the content of the messages published by the media account on that day.
  • Specifically, the first hot event unit 23 obtains the text content. If a message published by the media account contains a picture, the first hot event unit 23 recognizes the text in the picture by scanning. The messages published by the media account within the preset time period may be determined as hot events directly, or they may be further filtered; for example, a message whose number of comments exceeds a preset comment threshold is determined as a hot event.
  • the event keyword module 2 includes:
  • a comment number unit 24 configured to obtain a number of comments of a message posted by the preset media account
  • the second hotspot event unit 25 is configured to determine that the target message is a hotspot event when the number of comments of the target message in the message published by the preset media account exceeds a comment threshold.
  • the comment number unit 24 obtains the number of comments, that is, the number of comments on each published message rather than the number of comments on the media account as a whole. For example, if a media account publishes two messages on a given day, the first with 500 comments and the second with 800, the comment number unit 24 obtains the comment counts for the two messages, 500 and 800 respectively.
  • the second hotspot event unit 25 compares the number of comments with a comment threshold.
  • the comment threshold is a number that defines whether a published message is a hot event; it is the critical comment count applied to the messages published by each media account.
  • the comment threshold is 600.
  • the first message has 500 comments and does not exceed the comment threshold; the second message has 800 comments, which exceeds the comment threshold, so the second hotspot event unit 25 determines that the second message is a hot event.
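The comparison performed by the second hotspot event unit 25 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `hot_messages`, the message identifiers, and the dictionary layout are hypothetical, and only the threshold of 600 and the counts 500 and 800 come from the example above.

```python
from typing import Dict, List

COMMENT_THRESHOLD = 600  # threshold value taken from the example in the text


def hot_messages(comment_counts: Dict[str, int],
                 threshold: int = COMMENT_THRESHOLD) -> List[str]:
    """Return the ids of messages whose comment count exceeds the threshold."""
    return [msg_id for msg_id, count in comment_counts.items() if count > threshold]


# Hypothetical message ids; the counts mirror the two-message example.
counts = {"message_1": 500, "message_2": 800}
print(hot_messages(counts))  # ['message_2']
```

A strict comparison is used here because the text speaks of counts that "exceed" the threshold; a message with exactly 600 comments would not qualify under this reading.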
  • the device for asset analysis described above further includes:
  • An account information module for obtaining information of a financial media account with a financial mark
  • An account score module configured to input information of the financial media account to a preset formula to obtain an account score, where the account score is used to quantify the influence of the financial media account;
  • a preset media module configured to set a financial media account whose account score exceeds a score threshold as the preset media account.
  • the media account with the financial mark refers to a media account related to finance, such as publishing a certain length or a certain number of financial articles, or some financial officially certified media accounts.
  • the account information module accesses the microblog backend, accesses all financial media accounts carrying a financial mark on the microblog platform, and obtains the information published by those accounts.
  • the information of a media account includes the number of friends, the number of fans, the level, and microblog data; information related to historical microblogs is also information of the media account. The account score module quantifies each item of information.
  • the number of friends, the number of fans, and the level are already quantified data; the microblog data may be the number of microblogs published in the past year, or the number of microblogs in the past year whose comments reached 500; the default formula is for the media.
  • the account score module sets a specific formula as follows:
  • s is the account score
  • a is the friend score
  • b is the fan score
  • c is the rank number
  • d is the score for microblogs with more than 500 comments.
  • Table 2 Friend Number and Friend Score Mapping Table
  • the account score module calculates the account score by using a preset formula, and the score reflects the influence of a media account.
  • the preset media module compares the account score to a score threshold.
  • the score threshold is a preset value used to define whether the influence of a media account is large enough for the account to be used as a reference.
  • the preset media module sets each media account whose account score exceeds the score threshold as a preset media account.
  • under the preset formula of the account score module described above, the score threshold is 60.
  • the emotional vocabulary module 4 includes:
  • the access unit 41 is configured to access a network community associated with the target hotspot event
  • the extracting unit 42 is configured to extract chat information of the current moment of the network community, and use the chat information as the comment text.
  • the network community refers to an online community, such as a stock discussion group, a post bar (forum), or a knowledge-sharing community, in which a large user base discusses a topic.
  • the access unit 41 accesses the network community associated with the hotspot event, which may be, for example, the post bar named after the asset.
  • the extracting unit 42 extracts the posts published in the post bar and the comments made on those posts. The posts and the comments on them are the chat information mentioned above.
  • the extracting unit 42 registers a chat account and joins a live chat group. When a hot event occurs, the extracting unit 42 extracts the chat record of the live chat group.
  • the emotional vocabulary module 4 includes:
  • the calling unit 43 is configured to invoke the emotional vocabulary database
  • the matching unit 44 is configured to match the vocabulary in the comment text with the vocabulary in the sentiment vocabulary database
  • the determining unit 45 is configured to determine that the vocabulary in the comment text that matches the vocabulary in the emotional vocabulary database is an emotional vocabulary.
  • the emotional vocabulary database consists of words, compiled in advance by back-office staff, that people use to express emotion; the compiled words are gathered together and stored on the server.
  • the calling unit 43 calls the emotional vocabulary database.
  • the emotional vocabulary database includes a commendatory vocabulary library and a derogatory vocabulary library, and the staff place each emotional word into the appropriate library.
  • the calling unit 43 scans the content of the comment text and reads each word of the comment content, and the matching unit 44 performs semantic analysis to match the words in the comment text with the words in the emotional vocabulary database.
  • All words that match entries in the emotional vocabulary database are extracted, and the determining unit 45 determines that the matched words are the emotional vocabulary of the target hotspot event, that is, the emotions of many users toward the hotspot event, which together summarize the emotional trend of the event. Specifically, the matched words belonging to the commendatory library are aggregated, the matched words belonging to the derogatory library are aggregated, and the ratio of commendatory words to derogatory words is calculated. From this emotional vocabulary and the ratio of commendatory to derogatory words, it can be judged whether the hot event is a favorable event or an unfavorable event. This step allows the asset trend analysis model to compute the asset trend chart more quickly in the subsequent analysis.
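The aggregation step described above, counting matched words by polarity and comparing the two totals, can be sketched as follows. This is a minimal illustration under assumptions, not the application's implementation: the two small word sets, the function names, and the "neutral" outcome for a tie are all hypothetical, and the comment text is assumed to be already split into words.

```python
from collections import Counter
from typing import Iterable, Set, Tuple

# Hypothetical miniature libraries; the real ones are compiled by staff and stored on the server.
POSITIVE_WORDS: Set[str] = {"bullish", "rally", "profit"}
NEGATIVE_WORDS: Set[str] = {"crash", "loss", "panic"}


def polarity_counts(words: Iterable[str]) -> Tuple[int, int]:
    """Count how many comment words match each polarity library."""
    counts: Counter = Counter()
    for word in words:
        if word in POSITIVE_WORDS:
            counts["pos"] += 1
        elif word in NEGATIVE_WORDS:
            counts["neg"] += 1
    return counts["pos"], counts["neg"]


def classify_event(words: Iterable[str]) -> str:
    """Label an event from the balance of positive and negative matches.

    The "neutral" label for a tie is an assumption of this sketch.
    """
    pos, neg = polarity_counts(words)
    if pos == neg:
        return "neutral"
    return "favorable" if pos > neg else "unfavorable"


comment_words = "rally profit crash".split()
print(polarity_counts(comment_words))  # (2, 1)
print(classify_event(comment_words))   # favorable
```

Comparing raw counts avoids a division by zero when no derogatory word is matched; a ratio could be reported instead once that case is handled.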
  • the device for analyzing the asset trend of the present application automatically acquires a hot event related to the asset, and generates a trend forecast map for the asset according to the keyword description in the hot event and the comment on the hot event.
  • when the chart predicts that the asset will fall below a certain value, a message is sent to the user, reducing the risk of loss to the user's assets.
  • the computer device may be a server, and its internal structure may be as shown in FIG. 11.
  • the computer device includes a processor, a memory, a network interface, and a database connected by a system bus. The processor of the computer device provides computing and control capabilities.
  • the memory of the computer device includes a non-volatile storage medium and an internal memory.
  • the non-volatile storage medium stores an operating system, computer readable instructions, and a database.
  • the internal memory provides an environment for running the operating system and the computer readable instructions stored in the non-volatile storage medium.
  • the database of the computer device is used to store data such as an LSTM model, an emotional vocabulary database, and the like.
  • the network interface of the computer device is used to communicate with an external terminal via a network connection.
  • the computer readable instructions when executed, perform the flow of an embodiment of the methods described above.
  • FIG. 11 is only a block diagram of a part of the structure related to the solution of the present application, and does not constitute a limitation of the computer device to which the solution of the present application is applied.
  • An embodiment of the present application also provides a computer non-volatile readable storage medium having stored thereon computer readable instructions that, when executed, perform the processes of the embodiments of the methods described above.
  • the above description is only the preferred embodiment of the present application and is not intended to limit the scope of the patent; any equivalent structure or equivalent process transformation made using the specification and drawings of the present application, or applied directly or indirectly in other related technical fields, is likewise included in the scope of patent protection of the present application.

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Finance (AREA)
  • Accounting & Taxation (AREA)
  • Development Economics (AREA)
  • Operations Research (AREA)
  • Game Theory and Decision Science (AREA)
  • Human Resources & Organizations (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • Technology Law (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A method, a device, a computer device, and a storage medium for asset trend analysis. The method comprises: obtaining, according to keywords of an asset to be analyzed, comment text corresponding to a current event corresponding to the keywords, inputting sentiment words in the comment text into a sentiment analysis model, and obtaining an asset trend forecast chart. An asset trend forecast chart is automatically generated according to a current event related to an asset and comments on the current event.

Description

Method, device, computer device, and storage medium for asset trend analysis
This application claims priority to the Chinese patent application filed with the Chinese Patent Office on May 8, 2018, with application No. 2018104338049 and entitled "Method, device, computer device, and storage medium for asset trend analysis", the entire contents of which are incorporated herein by reference.
Technical field
The present application relates to the field of computer technology, and in particular to a method, a device, a computer device, and a storage medium for asset trend analysis.
Background
Assets (such as stocks, funds, and futures) gain or lose value because of current hot events, so assets are real-time in nature: their value changes in real time. Asset holders pay attention to their own assets, but their energy is limited and hot events occur suddenly, so it is difficult for holders to follow hot events in real time and then analyze the impact of those events on their own assets. If a hot event causes an asset to fall rapidly within a short time, the holder's property suffers a large loss.
There is currently no method for automatically analyzing assets based on hot events.
Technical problem
The main purpose of the present application is to provide a method, a device, a computer device, and a storage medium for asset trend analysis that automatically generate asset trends in real time.
Technical solution
To achieve the above purpose, the present application proposes a method for asset trend analysis, comprising:
obtaining the asset keywords of an asset to be analyzed;
obtaining hot events and extracting the keywords in the hot events to obtain event keywords;
determining, among the event keywords, a target event keyword that matches the asset keywords, and obtaining the target hot event corresponding to the target event keyword;
obtaining the comment text of the target hot event and extracting the emotional vocabulary in the comment text;
inputting the target event keyword and the emotional vocabulary into a preset asset trend analysis model to obtain a trend chart of the asset to be analyzed, wherein the asset trend analysis model is an LSTM-based model.
The present application also provides a device for asset trend analysis, comprising:
an acquisition module, configured to obtain the asset keywords of an asset to be analyzed;
an event keyword module, configured to obtain hot events and extract the keywords in the hot events to obtain event keywords;
a target hot event module, configured to determine, among the event keywords, a target event keyword that matches the asset keywords, and to obtain the target hot event corresponding to the target event keyword;
an emotional vocabulary module, configured to obtain the comment text of the target hot event and extract the emotional vocabulary in the comment text;
an obtaining module, configured to input the target event keyword and the emotional vocabulary into a preset asset trend analysis model to obtain a trend chart of the asset to be analyzed, wherein the asset trend analysis model is an LSTM-based model.
The present application also provides a computer device comprising a memory and a processor, wherein the memory stores computer readable instructions and the processor, when executing the computer readable instructions, implements the steps of any of the methods described above.
The present application also provides a computer non-volatile readable storage medium storing computer readable instructions that, when executed by a processor, implement the steps of any of the methods described above.
Beneficial effects
With the method, device, computer device, and storage medium for asset trend analysis of the present application, hot events related to an asset are acquired automatically, and a trend forecast chart for the asset is generated from the keyword descriptions in the hot events and the comments on them. When the chart predicts that the asset will fall below a certain value, a message is sent to the user, reducing the risk of loss to the user's assets.
Description of the drawings
FIG. 1 is a schematic flowchart of a method for asset trend analysis according to an embodiment of the present application;
FIG. 2 is a schematic flowchart of obtaining hot events in the above method for asset trend analysis according to an embodiment of the present application;
FIG. 3 is a schematic flowchart of obtaining hot events in the above method for asset trend analysis according to another embodiment of the present application;
FIG. 4 is a schematic flowchart of obtaining the comment text of a hot event in the above method for asset trend analysis according to an embodiment of the present application;
FIG. 5 is a schematic flowchart of extracting the emotional vocabulary from the comment text in the above method for asset trend analysis according to an embodiment of the present application;
FIG. 6 is a schematic structural block diagram of a device for asset trend analysis according to an embodiment of the present application;
FIG. 7 is a schematic structural block diagram of the event keyword module of the device for asset trend analysis according to an embodiment of the present application;
FIG. 8 is a schematic structural block diagram of the event keyword module of the device for asset trend analysis according to another embodiment of the present application;
FIG. 9 is a schematic structural block diagram of the emotional vocabulary module of the device for asset trend analysis according to an embodiment of the present application;
FIG. 10 is a schematic structural block diagram of the emotional vocabulary module of the device for asset trend analysis according to an embodiment of the present application;
FIG. 11 is a schematic structural block diagram of a computer device according to an embodiment of the present application.
Best mode for carrying out the invention
Referring to FIG. 1, an embodiment of the present application provides a method for asset trend analysis, comprising the steps of:
S1. obtaining the asset keywords of an asset to be analyzed;
S2. obtaining hot events and extracting the keywords in the hot events to obtain event keywords;
S3. determining, among the event keywords, a target event keyword that matches the asset keywords, and obtaining the target hot event corresponding to the target event keyword;
S4. obtaining the comment text of the target hot event and extracting the emotional vocabulary in the comment text;
S5. inputting the target event keyword and the emotional vocabulary into a preset asset trend analysis model to obtain a trend chart of the asset to be analyzed, wherein the asset trend analysis model is an LSTM-based model.
As described in step S1, the asset keywords of an asset come from a data package of asset keywords compiled in advance by back-office staff. Assets include stocks, funds, and futures (gold, crude oil). Asset keywords include the industry to which the asset belongs: the word for that industry is one of the asset keywords. For example, Ping An Group belongs to the financial industry and the technology industry, so "finance" and "technology" are both asset keywords of Ping An Group; Ping An Group also has a wholly-owned subsidiary that develops artificial intelligence, so "artificial intelligence" and "AI" are asset keywords of Ping An Group as well. Asset keywords also include the names of senior executives of the company to which the asset belongs; for example, the head of Alibaba is Jack Ma (Ma Yun), so the asset keywords of Alibaba Group include "Jack Ma". Put another way, asset keywords are labels that describe the asset. Back-office staff compile the assets held by users together with the corresponding asset keywords and gather them in a database. The system accesses the database, receives the asset name entered by the user, and then extracts the asset keywords corresponding to that asset name.
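The lookup in step S1 can be sketched as a mapping from asset name to keywords. This is a minimal illustration, not the application's implementation: the dictionary `ASSET_KEYWORDS` and the function `get_asset_keywords` are hypothetical names, an in-memory dictionary stands in for the database, and the entries merely echo the examples in the text.

```python
from typing import Dict, List

# Hypothetical stand-in for the staff-compiled keyword database.
ASSET_KEYWORDS: Dict[str, List[str]] = {
    "Ping An Group": ["finance", "technology", "artificial intelligence", "AI"],
    "Alibaba Group": ["Jack Ma"],
}


def get_asset_keywords(asset_name: str) -> List[str]:
    """Return the keywords stored for an asset, or an empty list if it is unknown."""
    return list(ASSET_KEYWORDS.get(asset_name, []))


print(get_asset_keywords("Alibaba Group"))  # ['Jack Ma']
```

Returning a copy keeps callers from modifying the stored keyword list by accident.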
As described in step S2, a hot event is an occurrence described by related text messages or reports. Within such a text message, certain words give the key description of the event; these words are the event keywords. Keywords may be extracted from the title, or a trained model may be used to judge that the words occurring most often in the text message are the event keywords. Event keywords generally relate to the subject of the event and its nature. For example, one hot event concerns the extreme-cold testing in the United States of Jia Yueting's FF91 electric car. The event keywords output by the trained model might include: Jia Yueting, LeTV, electric car, luxury, hope, and so on.
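The frequency-based option in step S2 can be approximated by counting words. The sketch below is a plain word count, not the trained model the text refers to; the stop-word list, the function name `extract_event_keywords`, and the `top_n` parameter are assumptions, and it tokenizes space-delimited text only (Chinese text would first need a word segmenter).

```python
import re
from collections import Counter
from typing import List

# Hypothetical stop-word list; a real system would use a fuller one.
STOP_WORDS = {"the", "a", "an", "in", "of", "is", "for", "and", "to", "its"}


def extract_event_keywords(text: str, top_n: int = 5) -> List[str]:
    """Return the top_n most frequent non-stop-word tokens in the text."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    counts = Counter(token for token in tokens if token not in STOP_WORDS)
    return [word for word, _ in counts.most_common(top_n)]


report = "The FF91 electric car is in the United States for cold testing of the electric car"
print(extract_event_keywords(report, top_n=3))
```

Raw frequency tends to favor generic words, which is one reason the text leaves room for a trained model or for taking keywords from the title instead.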
As described in step S3, after extracting the event keywords from the hot events, the system matches them against the asset keywords in the data package. In a specific embodiment, the asset keywords are arranged in the data package in a preset logical order and the event keywords are arranged in the same order, which speeds up determining whether an event keyword corresponds to an asset keyword in the preset data package. For example, there are 10 event keywords arranged in pinyin alphabetical order, and the data package contains 1000 asset keywords also arranged in pinyin alphabetical order. To determine whether each of the 10 event keywords corresponds to an asset keyword, the system uses the pinyin order to locate the approximate position of each event keyword within the ordered asset keywords and then compares the event keyword only with the asset keywords around that position. This greatly reduces the number of comparisons and saves time. Each successfully matched event keyword is determined to be a target event keyword, and the hot event it came from is then traced back; that event is the target hot event.
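One way to realize "locate the approximate position, then compare" on a sorted list is binary search. The sketch below uses Python's standard `bisect` module as a stand-in for the ordered lookup; binary search is a technique swapped in for illustration, Python's default string ordering replaces pinyin ordering, and the function name `match_keywords` is hypothetical.

```python
from bisect import bisect_left
from typing import List, Sequence


def match_keywords(event_keywords: Sequence[str],
                   sorted_asset_keywords: Sequence[str]) -> List[str]:
    """Return the event keywords that also appear in the sorted asset keyword list."""
    matched = []
    for keyword in event_keywords:
        # Binary search finds where the keyword would sit in the sorted list.
        index = bisect_left(sorted_asset_keywords, keyword)
        if index < len(sorted_asset_keywords) and sorted_asset_keywords[index] == keyword:
            matched.append(keyword)
    return matched


asset_keywords = sorted(["finance", "technology", "AI", "Jack Ma"])
print(match_keywords(["Jack Ma", "luxury", "finance"], asset_keywords))
# ['Jack Ma', 'finance']
```

With 1000 asset keywords, each lookup needs about 10 comparisons instead of up to 1000 for a linear scan, which is the saving the paragraph describes.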
As described in step S4, what the target hot event describes has some relevance to the user's asset and may well cause the asset's value to change. The comment text of the target hot event consists of the comments that other users of the media platform make on the event as published by a media user, including comments made by other media users who forward the event. The system directly reads the comments under the media account that published the target hot event, and also uses crawler technology to access the comments under media accounts that forwarded it. Comment text also includes image emoticons (stickers), which express feelings through pictures. Such emoticons generally carry text; when a comment is an image emoticon, the text in it is recognized and converted into text. Emotional vocabulary includes both text and emoji. The emotional vocabulary is compiled in advance by back-office staff and consists of words generally used to comment on stocks. Examples are internet slang for commenting on stocks ("demon stock", "penny stock"); stock-trading jargon (full position, opening a position, cutting a position, scalping, "longs killing longs"); and common financial terms (Williams indicator, moving average, tradable shares, treasury bills). Emotional vocabulary also includes emoji used to express feelings, such as a smiling face, a crying face, or happy; further, these emoji include ones specific to the stock field, such as [buy], [review], and [big loss].
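Extraction of emotional vocabulary from a single comment, covering both words and bracketed emoji such as [buy], might look like the sketch below. It is an illustration under assumptions: the tiny lexicon, the regular expression for bracketed emoji, and the function name `extract_emotion_terms` are hypothetical, and simple substring matching stands in for whatever matching the application uses.

```python
import re
from typing import List, Set

# Hypothetical miniature lexicon mixing jargon and bracketed emoji.
EMOTION_LEXICON: Set[str] = {"full position", "stop-loss", "[buy]", "[big loss]"}

# Bracketed emoji are assumed to look like [name] with no nested brackets.
BRACKET_EMOJI = re.compile(r"\[[^\[\]]+\]")


def extract_emotion_terms(comment: str) -> List[str]:
    """Return lexicon emoji first (in order of appearance), then lexicon words found."""
    text = comment.lower()
    emojis = [e for e in BRACKET_EMOJI.findall(text) if e in EMOTION_LEXICON]
    remainder = BRACKET_EMOJI.sub(" ", text)  # strip emoji before matching words
    words = [term for term in sorted(EMOTION_LEXICON)
             if not term.startswith("[") and term in remainder]
    return emojis + words


print(extract_emotion_terms("Went full position today [Buy] [smile]"))
# ['[buy]', 'full position']
```

Emoji that are not in the lexicon, such as [smile] here, are ignored, in line with the idea that only staff-compiled entries count as emotional vocabulary.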
As described in step S5, the target event keyword and the emotional vocabulary are input into the preset asset trend analysis model, which, according to its trained logic, predicts the appreciation or depreciation trend of the asset to be analyzed and computes the trend chart of the user's asset. LSTM (Long Short-Term Memory) is a recurrent neural network over time, suited to processing and predicting important events separated by relatively long intervals and delays in a time series. In an LSTM, each neuron is a "memory cell" containing an "input gate", a "forget gate", and an "output gate", together also called the "triple gate". One key to the LSTM model is the forget gate, which controls the convergence of gradients during training while preserving long-term memory. Because the LSTM model is suited to important events with relatively long intervals and delays in a time series, and the hot events in the present application are likewise tied to a time series, an LSTM model that processes the keywords of hot events works well and can accurately derive the corresponding asset trend chart.
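To make the three gates concrete, the sketch below implements one time step of a textbook LSTM cell with scalar weights in pure Python. The application gives no equations, so this is the standard formulation rather than the model it trains; the parameter names (`wf`, `uf`, `bf`, and so on) and the function `lstm_step` are assumptions of this sketch.

```python
import math
from typing import Dict, Tuple


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x: float, h_prev: float, c_prev: float,
              p: Dict[str, float]) -> Tuple[float, float]:
    """One step of a scalar LSTM cell; returns (hidden state, cell state)."""
    f = sigmoid(p["wf"] * x + p["uf"] * h_prev + p["bf"])    # forget gate
    i = sigmoid(p["wi"] * x + p["ui"] * h_prev + p["bi"])    # input gate
    o = sigmoid(p["wo"] * x + p["uo"] * h_prev + p["bo"])    # output gate
    g = math.tanh(p["wg"] * x + p["ug"] * h_prev + p["bg"])  # candidate memory
    c = f * c_prev + i * g   # keep part of the old memory, add part of the new
    h = o * math.tanh(c)     # expose a gated view of the memory
    return h, c


# With all-zero parameters every gate is 0.5 and the candidate is 0,
# so the cell state is simply halved at each step.
zero_params = {name: 0.0 for name in
               ("wf", "uf", "bf", "wi", "ui", "bi", "wo", "uo", "bo", "wg", "ug", "bg")}
h, c = lstm_step(x=1.0, h_prev=0.0, c_prev=2.0, p=zero_params)
print(c)  # 1.0
```

The line `c = f * c_prev + i * g` is where the forget gate acts: a value of `f` near 1 carries the old memory forward almost unchanged, which is how the cell retains information across long intervals.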
When training the asset trend analysis model, staff first input the asset keywords of one asset, together with the emotional vocabulary that drives that asset's value down, into the LSTM model for training, and repeat this for multiple assets. They then input the asset keywords of one asset, together with the emotional vocabulary that drives that asset's value up, into the LSTM model for training, and again repeat this for multiple assets. After this training, the influence weight coefficient of each emotional word on each asset keyword is obtained; that is, an asset analysis model based on the LSTM model is obtained. When a user then inputs asset keywords and emotional vocabulary into this model, it outputs the effect of the emotional vocabulary on the asset's appreciation or depreciation. Combined with step S3, the impact of the target hot event on the asset can be calculated.
Referring to FIG. 2, in this embodiment, the step of obtaining hot events comprises:
S21. obtaining the growth rate of the number of followers of a preset media account;
S22. when the growth rate exceeds a growth threshold, determining that the messages published by the media account within a preset time period contain a hot event;
S23. determining the hot event according to the messages published by the media account within the preset time period.
In this embodiment, as described in step S21, preset media accounts are media accounts that are influential or authoritative in finance. They include personal microblog accounts opened by financial experts and media accounts of official finance-related information channels, such as the official Weibo of the Securities and Futures Commission. When an event develops into a hot event, the number of people following the event grows exponentially; correspondingly, the number of people following the preset media account also grows noticeably. The growth rate of the number of followers of the preset media account is therefore obtained. Specifically: obtain the number of followers x at the current time, extract the number of followers y at a preset time before the current time, and calculate the growth rate of x relative to y. In this embodiment, the change in the number of followers is calculated every minute: if the number of followers one minute ago is t1 and the number at the current time is t0, the growth rate is a = (t0 - t1)/t1. In other embodiments, the period for obtaining the growth rate may be 5 minutes or 10 minutes, or 10 seconds or 20 seconds, and so on.
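The growth rate a = (t0 - t1)/t1 from step S21 translates directly into code. In this small sketch the function name `growth_rate` and the guard against a non-positive earlier count are assumptions; the formula itself is the one given in the text.

```python
def growth_rate(current: int, previous: int) -> float:
    """Growth rate of the follower count: (t0 - t1) / t1.

    `current` is t0 (count now), `previous` is t1 (count one period earlier).
    """
    if previous <= 0:
        # The text does not cover this case; rejecting it is a choice of this sketch.
        raise ValueError("previous follower count must be positive")
    return (current - previous) / previous


print(growth_rate(1100, 1000))  # 0.1
print(growth_rate(800, 1000))   # -0.2
```

A negative result means the account lost followers over the period.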
As described in step S22, the growth threshold is the critical value for determining whether the account has published a hot event. The growth rate obtained in step S21 may also be negative, meaning that the number of followers of the media account has decreased. The growth threshold therefore includes one or two numbers. Specifically, the growth threshold may be -20% and 10%: a growth rate lower than -20% or higher than 10% exceeds the growth threshold. When the growth rate exceeds the growth threshold, it is determined that the messages published by the media account within the preset time period contain a hot event.
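The two-sided check in step S22 can be written as a single predicate. In this sketch the bounds of -20% and 10% are the example values from the text, expressed as fractions; the function name `exceeds_growth_threshold` and the use of strict comparisons at the boundaries are assumptions.

```python
LOWER_BOUND = -0.20  # -20%, example value from the text
UPPER_BOUND = 0.10   # 10%, example value from the text


def exceeds_growth_threshold(rate: float,
                             lower: float = LOWER_BOUND,
                             upper: float = UPPER_BOUND) -> bool:
    """True when the growth rate falls below the lower bound or rises above the upper bound."""
    return rate < lower or rate > upper


for rate in (0.15, -0.25, 0.05):
    print(rate, exceeds_growth_threshold(rate))
# 0.15 True
# -0.25 True
# 0.05 False
```

Both a sharp rise and a sharp fall in followers count as exceeding the threshold, so either one marks the account's recent messages as containing a hot event.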
As described in step S23 above, once it is determined that the messages published by the media account contain a hot event, the content of the messages published by the media account that day is obtained, specifically the text content. If a published message contains a picture, the text in the picture is recognized by scanning. The messages published by the media account within the preset time period may then be determined to be hot events directly, or they may be further filtered so that only some are treated as hot events; for example, the messages published within the preset time period whose comment count exceeds a preset comment threshold are determined to be hot events.
Referring to FIG. 3, in another embodiment, the step of acquiring the hot event includes:
S24. Obtain the number of comments on each message published by a preset media account;
S25. When the number of comments on a target message among the messages published by the preset media account exceeds a comment threshold, determine that the target message is a hot event.
As described in step S24 above, after a media account publishes a message, members of the public comment on it, and each person may post multiple comments. The system obtains the number of comments. What is obtained is the number of comments on each published message, not the number of comments on the media account as a whole. For example, if a media account publishes two messages on a given day, the first with 500 comments and the second with 800, the system obtains the comment counts for the two messages, 500 and 800 respectively.
As described in step S25 above, each comment count is compared against the comment threshold. The comment threshold is a number: the critical value used to decide, from its comments, whether a message published by a media account is a hot event. For example, suppose the comment threshold is 600. Of the two messages mentioned in step S24, the first has 500 comments, which does not exceed the comment threshold; the second has 800 comments, which exceeds it, so the second message is determined to be a hot event.
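Steps S24 and S25 amount to a per-message filter, which the following sketch reproduces with the 500/800 example and a threshold of 600. The `Message` class and the function name `select_hot_events` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Message:
    text: str
    comment_count: int  # comments on this message, not on the account


def select_hot_events(messages: List[Message], comment_threshold: int = 600) -> List[Message]:
    """Keep the messages whose comment count exceeds the threshold."""
    return [m for m in messages if m.comment_count > comment_threshold]


posted_today = [Message("first message", 500), Message("second message", 800)]
hot = select_hot_events(posted_today)
print([m.text for m in hot])  # ['second message']
```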
Further, before step S21 or step S24, the method may also include:
S201. Obtain the information of financial media accounts that carry a finance tag;
S202. Input the information of each financial media account into a preset formula to obtain an account score, where the account score quantifies the influence of the financial media account;
S203. Set the financial media accounts whose account score exceeds a score threshold as the preset media accounts.
As described in step S201 above, a media account with a finance tag is a media account related to finance, for example one that has published financial articles of a certain length or in a certain number, or one that is officially certified for finance. Specifically, the system accesses the Weibo backend, visits all Weibo accounts carrying the finance tag, and obtains the information published by these financial media accounts.
As described in step S202 above, the information of a media account includes the number of friends, the number of fans, the level, the published-microblog data, and information related to historical microblogs; each item is quantified. The number of friends, the number of fans, and the level are already quantitative data; the published-microblog data may be the number of microblogs published in the past year, or the number of microblogs published in the past year that received at least 500 comments. The preset formula is a formula for evaluating a media account and reflects its influence. The more friends, the more fans, the higher the level, the more microblogs published, and the more comments received, the greater the influence and the higher the resulting account score. For example, a specific formula is set as follows:
s = c * (a + b) + d
In the above formula, s is the account score, a is the friend score, b is the fan score, c is the level number, and d is the score for microblogs with more than 500 comments. The friend score is calculated as shown in Table 1:
Number of friends    Score
0-10                 1
11-20                2
21-50                3
51-100               5
101-1000             10
More than 1000       30
Table 1: Mapping between number of friends and friend score
The fan score, the level score, and the score for microblogs with more than 500 comments may likewise be assigned in tiers, in the same manner as Table 1.
As described in step S203 above, after the information of a media account is obtained, the account score is calculated by the preset formula; this score reflects the influence of the media account. The account score is then compared with the score threshold, the critical value that determines whether a media account is influential enough to serve as a reference, that is, as a preset media account. The system sets the media accounts whose account score exceeds the score threshold as preset media accounts. In a specific embodiment, with the preset formula of step S202, the score threshold is 60.
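Steps S202 and S203 can be sketched as below, using the formula s = c * (a + b) + d, the tiers of Table 1, and the threshold of 60. The application gives a tier table only for the friend score, so reusing the same tiers for the fan score, the sample account values, and all function names are assumptions made purely for illustration.

```python
# Upper bound of each tier (inclusive) and its score, from Table 1.
FRIEND_TIERS = [(10, 1), (20, 2), (50, 3), (100, 5), (1000, 10)]
TOP_TIER_SCORE = 30  # more than 1000


def tier_score(count: int) -> int:
    """Map a raw count to its tiered score according to Table 1."""
    for upper_bound, score in FRIEND_TIERS:
        if count <= upper_bound:
            return score
    return TOP_TIER_SCORE


def account_score(friend_score: int, fan_score: int, level: int, comment_score: int) -> int:
    """s = c * (a + b) + d."""
    return level * (friend_score + fan_score) + comment_score


def is_preset_media_account(score: int, score_threshold: int = 60) -> bool:
    return score > score_threshold


# Hypothetical account: 120 friends, 5000 fans, level 2, comment score 5.
a = tier_score(120)    # 10
b = tier_score(5000)   # 30 (assumes the fan tiers mirror Table 1)
s = account_score(a, b, level=2, comment_score=5)
print(s, is_preset_media_account(s))  # 85 True
```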
Referring to FIG. 4, further, the step of obtaining the comment text of the target hot event includes:
S41. Access a network community associated with the target hot event;
S42. Extract the chat information of the network community at the current time, and use the chat information as the comment text.
In this embodiment, as described in step S41 above, a network community is an online community such as a stock chat group, Tieba, or Zhihu, where a large user base discusses a given topic. Accessing the network community associated with the hot event may mean accessing the Tieba forum named after the asset.
As described in step S42 above, after the Tieba forum named after the asset is accessed, that is, the online community of users who follow the stock, the posts published in the forum and the comments on those posts are extracted. These posts and comments are the chat information mentioned above. In another specific embodiment, the system registers a chat account and joins a real-time chat group; when a hot event occurs, the chat records of that group are extracted.
Referring to FIG. 5, further, the step of extracting the emotional vocabulary from the comment text includes:
S43. Call an emotional vocabulary database;
S44. Match the words in the comment text against the words in the emotional vocabulary database;
S45. Determine that the words in the comment text that match words in the emotional vocabulary database are the emotional vocabulary.
In this embodiment, as described in step S43 above, the emotional vocabulary database consists of all the words expressing emotional inclination that back-office staff have compiled in advance; the compiled words are gathered together and stored on the server. After the comment text of the hot event is obtained, the emotional vocabulary database is called. Further, the emotional vocabulary database includes a positive (commendatory) vocabulary library and a negative (derogatory) vocabulary library; staff place each compiled emotional word into the appropriate library.
As described in step S44 above, after the emotional vocabulary database is called, the comment text is scanned, each character of the comment content is read, and semantic analysis is performed to find the words in the comment text that match words in the emotional vocabulary database.
As described in step S45 above, all words that match the emotional vocabulary database are extracted and defined as the emotional vocabulary of the target hot event, that is, the sentiment of the many users toward the hot event; from them the sentiment trend of the hot event is summarized. Specifically, all words matching the positive vocabulary library are gathered together, all words matching the negative vocabulary library are gathered together, and the ratio of positive words to negative words is calculated. From the emotional vocabulary and this ratio it can be judged whether the hot event is favorable (bullish) or unfavorable (bearish). This step allows the subsequent asset trend analysis model to compute the trend chart of the asset more quickly.
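Steps S43 to S45 can be approximated by a lexicon lookup over already-segmented comment tokens. In the sketch below the two small English word sets stand in for the positive and negative vocabulary libraries, word segmentation is assumed to have happened upstream, and the simple majority rule for deciding favorable versus unfavorable is an assumption, since the application does not state how the ratio is turned into a judgment.

```python
from typing import Iterable, List, Tuple

# Hypothetical stand-ins for the positive and negative vocabulary libraries.
POSITIVE_WORDS = frozenset({"bullish", "rally", "profit"})
NEGATIVE_WORDS = frozenset({"bearish", "crash", "loss"})


def match_sentiment(tokens: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Return the tokens found in the positive and negative libraries."""
    tokens = list(tokens)
    positive = [t for t in tokens if t in POSITIVE_WORDS]
    negative = [t for t in tokens if t in NEGATIVE_WORDS]
    return positive, negative


def classify_event(positive: List[str], negative: List[str]) -> str:
    """Assumed rule: whichever polarity has more matched words wins."""
    if len(positive) > len(negative):
        return "favorable"
    if len(negative) > len(positive):
        return "unfavorable"
    return "neutral"


comment_tokens = ["rally", "crash", "profit", "today"]
pos, neg = match_sentiment(comment_tokens)
print(pos, neg, classify_event(pos, neg))
# ['rally', 'profit'] ['crash'] favorable
```

Comparing counts rather than dividing them avoids a division by zero when no negative word is matched.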
Further, after the step of obtaining the trend chart of the asset to be analyzed, the method includes:
S6. When the trend of the asset to be analyzed is lower than a preset value, send a message to the user terminal.
In this embodiment, as described in step S6 above, the system analyzes the trend of the asset through the asset trend analysis model on the basis of the hot event and the related comment text. The asset may surge or may plunge; either is likely to affect the user's assets significantly, so the user should be proactively alerted with a warning. To alert the user more directly and effectively, a message is sent to the user's mobile terminal, or the server places a call to the user's mobile phone.
The trend of the asset going beyond the preset value covers two cases: falling below the minimum threshold or rising above the maximum threshold. When the trend falls below the minimum threshold, a call is placed to the user's preset mobile phone and a voice message alerts the user; when the trend rises above the maximum threshold, a text message is sent to the user's preset mobile phone to alert the user.
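The choice of alert channel can be sketched as a small decision function. The application does not define how the trend is measured, so treating it as a predicted fractional change, the sample thresholds, and the names `Alert` and `choose_alert` are all assumptions; the function only selects a channel and does not place calls or send messages.

```python
from enum import Enum


class Alert(Enum):
    NONE = "none"
    VOICE_CALL = "voice_call"  # trend below the minimum threshold
    SMS = "sms"                # trend above the maximum threshold


def choose_alert(predicted_change: float, minimum: float, maximum: float) -> Alert:
    """Pick the alert channel for a predicted trend value."""
    if predicted_change < minimum:
        return Alert.VOICE_CALL
    if predicted_change > maximum:
        return Alert.SMS
    return Alert.NONE


# Hypothetical thresholds of -5% and +5%.
print(choose_alert(-0.08, minimum=-0.05, maximum=0.05))  # Alert.VOICE_CALL
print(choose_alert(0.09, minimum=-0.05, maximum=0.05))   # Alert.SMS
```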
In summary, the asset trend analysis method of the present application automatically acquires hot events related to an asset and then generates a trend forecast chart for the asset from the keywords describing the hot event and the comments on it. When the forecast shows the asset falling below a certain value, a message is sent to the user, reducing the risk of loss to the user's assets.
Referring to FIG. 6, the present application further provides an asset trend analysis device, including:
an acquisition module 1, configured to obtain the asset keywords of the asset to be analyzed;
an event keyword module 2, configured to acquire hot events and extract the keywords in the hot events to obtain event keywords;
a target hot event module 3, configured to determine, among the event keywords, the target event keywords that match the asset keywords, and to obtain the target hot event corresponding to the target event keywords;
an emotional vocabulary module 4, configured to obtain the comment text of the target hot event and extract the emotional vocabulary from the comment text;
an obtaining module 5, configured to input the target event keywords and the emotional vocabulary into a preset asset trend analysis model to obtain the trend chart of the asset to be analyzed, where the asset trend analysis model is an LSTM-based model.
In this embodiment, the asset keywords of an asset form a data packet of keywords compiled in advance by back-office staff. Assets include stocks, funds, and futures (gold, crude oil). Asset keywords include the industry to which the asset belongs. For example, Ping An Group belongs to the financial and technology industries, so "finance" and "technology" are both asset keywords of Ping An Group; Ping An Group has a wholly owned subsidiary that develops artificial intelligence, so "artificial intelligence" and "AI" are also its asset keywords. Asset keywords also include the names of senior executives of the company to which the asset belongs; for example, the head of Alibaba is Ma Yun (Jack Ma), so the asset keywords of Alibaba Group include his name. Put another way, asset keywords are labels that describe the asset. Back-office staff compile the assets owned by users together with the corresponding asset keywords and store them in a database. The acquisition module 1 accesses the database, receives the asset name entered by the user, and extracts the asset keywords corresponding to that name.
A hot event is an occurrence described by related text messages or reports; the words in such a text message that describe the event in a key way are the event keywords. The event keyword module 2 may extract keywords from the title, or may use a trained model to take the words that occur most frequently in the text message as event keywords. Event keywords generally concern the subject and the nature of the event. For example, suppose a hot event concerns Jia Yueting's FF91 electric car undergoing cold-weather testing in the United States. The event keywords output by the event keyword module 2 after computation by the trained model might include: Jia Yueting, LeTV, electric car, luxury, hope, and so on.
After the event keyword module 2 extracts the event keywords of the hot event, the target hot event module 3 matches the event keywords against the asset keywords in the data packet. In a specific embodiment, the asset keywords in the data packet are arranged in a preset logical order and the event keywords are arranged in the same order, which speeds up the check of whether an event keyword corresponds to an asset keyword in the data packet. For example, suppose there are 10 event keywords sorted alphabetically by pinyin and 1000 asset keywords in the data packet, also sorted alphabetically by pinyin. To decide whether each of the 10 event keywords corresponds to an asset keyword, the target hot event module 3 uses the pinyin order of the event keyword to find its approximate position among the sorted asset keywords, and then compares the event keyword only with the asset keywords near that position. This greatly reduces the number of comparisons and saves time. The successfully matched event keywords are determined to be the target event keywords, and the target hot event module 3 then traces back to the hot event corresponding to the target event keywords, that is, the target hot event. What the target hot event describes is related to the user's assets and is likely to cause them to change.
The comment text of the target hot event consists of the comments made by other users of the media platform on the event as posted by the media user, and also the comments made on the event under the accounts of other users who reposted it. The emotional vocabulary module 4 reads the comments under the media account that published the target hot event directly, and uses crawler technology to reach the comments under the media accounts that reposted it. The comment text also includes memes (image macros), which express feelings through pictures and usually carry a caption; when a comment is a meme, the emotional vocabulary module 4 recognizes the text in it and converts it into text. The emotional vocabulary includes both text and emoticons. It is compiled in advance by back-office staff and consists of words commonly used to comment on stocks, for example: internet slang for stocks, such as "demon stock" (妖股) and "penny stock" (仙股); trading jargon, such as full position (满仓), opening a position (建仓), cutting a position (斩仓), scalping (抢帽子), and longs killing longs (多杀多); and common financial terms, such as the Williams indicator, moving average, tradable shares, and treasury bills. The emotional vocabulary also includes emoticons that express feelings, such as a smiling face, a crying face, or a happy face; further, the emoticons include ones specific to the stock field, such as [buy], [review], and [big loss].
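The sorted-order matching performed by the target hot event module 3 resembles a binary search over the sorted asset keywords. The sketch below uses Python's standard `bisect` module; it sorts by ordinary string order rather than by pinyin, uses romanized sample keywords, and the function name `match_keywords` is hypothetical, all of which are simplifying assumptions.

```python
from bisect import bisect_left
from typing import List


def match_keywords(event_keywords: List[str], sorted_asset_keywords: List[str]) -> List[str]:
    """Return the event keywords that also appear among the asset keywords.

    sorted_asset_keywords must already be sorted; each lookup then takes
    O(log n) comparisons instead of scanning the whole list.
    """
    matches = []
    for keyword in event_keywords:
        # Locate the position where the keyword would sit in sorted order.
        i = bisect_left(sorted_asset_keywords, keyword)
        if i < len(sorted_asset_keywords) and sorted_asset_keywords[i] == keyword:
            matches.append(keyword)
    return matches


asset_keywords = sorted(["finance", "technology", "artificial intelligence", "AI"])
event_keywords = ["electric car", "finance", "AI"]
print(match_keywords(event_keywords, asset_keywords))  # ['finance', 'AI']
```

With 1000 asset keywords, each lookup needs roughly ten comparisons, which is consistent with the reduction in comparisons the text describes.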
The obtaining module 5 inputs the target event keywords and the emotional vocabulary into the preset asset trend analysis model; the model, following its trained logic, predicts the appreciation or depreciation trend of the asset to be analyzed, and the obtaining module 5 computes the trend chart of the user's asset. LSTM (Long Short-Term Memory) is a recurrent neural network suited to processing and predicting important events separated by relatively long intervals and delays in a time series. In an LSTM, each neuron is a "memory cell" containing an input gate, a forget gate, and an output gate, together called the "triple gate". One key to the LSTM model is the forget gate, which controls the convergence of gradients during training while preserving long-term memory. Because the LSTM model is suited to important events with long intervals and delays in a time series, and the hot events in the present application are likewise time-series related, an LSTM model that processes the keywords of hot events works well and can analyze the corresponding asset trend chart accurately.
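For reference, the three gates mentioned above are usually written as follows in the standard LSTM formulation. This is the common textbook form, not a formula taken from the application; the symbols (input x_t, hidden state h_t, cell state c_t, weight matrices W and U, biases b, sigmoid σ, element-wise product ⊙) are the conventional ones.

```latex
\begin{aligned}
f_t &= \sigma\left(W_f x_t + U_f h_{t-1} + b_f\right) && \text{(forget gate)} \\
i_t &= \sigma\left(W_i x_t + U_i h_{t-1} + b_i\right) && \text{(input gate)} \\
o_t &= \sigma\left(W_o x_t + U_o h_{t-1} + b_o\right) && \text{(output gate)} \\
\tilde{c}_t &= \tanh\left(W_c x_t + U_c h_{t-1} + b_c\right) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
```

The forget gate f_t scales the previous cell state c_{t-1}, which is how the cell retains or discards long-term memory.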
When training the asset trend analysis model, staff first input into the LSTM model the asset keywords of one asset together with the emotional vocabulary that drives that asset's value down; this is repeated for multiple assets. Then the asset keywords of one asset together with the emotional vocabulary that drives that asset's value up are input into the LSTM model for training, and this is likewise repeated for multiple assets. After this training, an influence weight coefficient of each emotional word with respect to each asset keyword is obtained, giving an LSTM-based asset analysis model. When the user then inputs asset keywords and emotional vocabulary into this asset analysis model, it outputs the effect of the emotional vocabulary on the appreciation or depreciation of the asset. Combined with the target hot event module 3 described above, the impact of the target hot event on the asset can be calculated.
Referring to FIG. 7, further, the event keyword module 2 includes:
a growth rate unit 21, configured to obtain the growth rate of the follower count of a preset media account;
a determining unit 22, configured to determine, when the growth rate exceeds a growth threshold, that the messages published by the media account within a preset time period contain a hot event;
a first hot event unit 23, configured to determine the hot event from the messages published by the media account within the preset time period.
In this embodiment, a preset media account is a media account that is influential or authoritative in finance: for example, a personal microblog account opened by a financial expert, or the account of an official finance-related information channel, such as the official Weibo account of the Securities and Futures Commission. When an event develops into a hot event, the number of people following it typically grows exponentially; correspondingly, the number of followers of the preset media account also rises noticeably. The growth rate unit 21 therefore obtains the growth rate of the follower count of the preset media account. Specifically, the growth rate unit 21 obtains the follower count x at the current time and the follower count y at a preset time before the current time, and calculates the growth rate of x relative to y. In this embodiment, the growth rate unit 21 calculates the change in follower count once per minute: if the follower count one minute ago is t1 and the follower count at the current time is t0, the growth rate is a = (t0 - t1)/t1. In other embodiments, the period may be 5 minutes, 10 minutes, and so on, or 10 seconds, 20 seconds, and so on.
The growth threshold is the critical value used to determine whether the account has published a hot event. The value calculated by the growth rate unit 21 may also be negative, meaning that the number of followers of the media account has decreased. The growth threshold in the determining unit 22 therefore comprises one or two numbers. For example, the growth threshold may be -20% and 10%: a growth rate below -20% or above 10% counts as exceeding the growth threshold. When the growth rate exceeds the growth threshold, the determining unit 22 determines that the messages published by the media account within the preset time period contain a hot event.
Once the determining unit 22 has made this determination, the first hot event unit 23 obtains the content of the messages published by the media account that day, specifically the text content. If a published message contains a picture, the first hot event unit 23 recognizes the text in the picture by scanning. The messages published by the media account within the preset time period may then be determined to be hot events directly, or they may be further filtered so that only some are treated as hot events; for example, the messages published within the preset time period whose comment count exceeds a preset comment threshold are determined to be hot events.
Referring to FIG. 8, in another embodiment, the event keyword module 2 further includes:
a comment count unit 24, configured to obtain the number of comments on each message published by a preset media account;
a second hot event unit 25, configured to determine that a target message is a hot event when the number of comments on the target message among the messages published by the preset media account exceeds a comment threshold.
In this embodiment, after a media account publishes a message, members of the public comment on it, and each person may post multiple comments. The comment count unit 24 obtains the number of comments. What is obtained is the number of comments on each published message, not the number of comments on the media account as a whole. For example, if a media account publishes two messages on a given day, the first with 500 comments and the second with 800, the comment count unit 24 obtains the comment counts for the two messages, 500 and 800 respectively. The second hot event unit 25 compares each comment count against the comment threshold. The comment threshold is a number: the critical value used to decide, from its comments, whether a message published by a media account is a hot event. For example, suppose the comment threshold is 600. Of the two messages handled by the comment count unit 24, the first has 500 comments, which does not exceed the comment threshold; the second has 800 comments, which exceeds it, so the second hot event unit 25 determines that the second message is a hot event.
Further, the asset trend analysis device also includes:
an account information module, configured to obtain the information of financial media accounts that carry a finance tag;
an account score module, configured to input the information of each financial media account into a preset formula to obtain an account score, where the account score quantifies the influence of the financial media account;
a preset media module, configured to set the financial media accounts whose account score exceeds a score threshold as the preset media accounts.
In this embodiment, a media account with a finance tag is a media account related to finance, for example one that has published financial articles of a certain length or in a certain number, or one that is officially certified for finance. Specifically, the account information module accesses the Weibo backend, visits all Weibo accounts carrying the finance tag, and obtains the information published by these financial media accounts. The information of a media account includes the number of friends, the number of fans, the level, the published-microblog data, and information related to historical microblogs; the account score module quantifies each item. The number of friends, the number of fans, and the level are already quantitative data; the published-microblog data may be the number of microblogs published in the past year, or the number of microblogs published in the past year that received at least 500 comments. The preset formula is a formula for evaluating a media account and reflects its influence. The more friends, the more fans, the higher the level, the more microblogs published, and the more comments received, the greater the influence and the higher the resulting account score. For example, the account score module sets a specific formula as follows:
s = c * (a + b) + d
In the above formula, s is the account score, a is the friend score, b is the fan score, c is the level, and d is the score for posts with more than 500 comments. The friend score is calculated as shown in Table 2:
Number of friends    Score
0-10                 1
11-20                2
21-50                3
51-100               5
101-1000             10
More than 1000       30
Table 2: Mapping between number of friends and friend score
The fan score, the level score, and the score for posts with more than 500 comments can each be assigned in tiers, in the same way as the friend score in the table above. After the account information module obtains the information of a media account, the account score module computes the account score using the preset formula; this score reflects the influence of the media account. The preset media module then compares the account score with the score threshold. The score threshold defines whether a media account is influential enough to serve as a preset media account used for reference. The preset media module sets every media account whose account score exceeds the score threshold as a preset media account. In a specific embodiment, with the preset formula of the account score module given above, the score threshold is 60.
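The scoring steps above can be illustrated with a minimal Python sketch. It encodes the friend-score tiers of Table 2, evaluates s = c * (a + b) + d, and applies the threshold of 60 from the specific embodiment. The function and constant names are hypothetical, and the fan score, level, and comment score in the usage lines are made-up inputs, since the text does not give their tier tables.

```python
from bisect import bisect_left

# Upper bounds of the friend-count tiers in Table 2, with the score of each tier.
FRIEND_TIER_UPPER_BOUNDS = [10, 20, 50, 100, 1000]
FRIEND_TIER_SCORES = [1, 2, 3, 5, 10, 30]  # last entry: more than 1000 friends

SCORE_THRESHOLD = 60  # threshold named in the specific embodiment


def friend_score(friend_count: int) -> int:
    """Map a friend count to its tiered score according to Table 2."""
    if friend_count < 0:
        raise ValueError("friend_count must be non-negative")
    # bisect_left finds the first tier whose upper bound is >= friend_count.
    tier = bisect_left(FRIEND_TIER_UPPER_BOUNDS, friend_count)
    return FRIEND_TIER_SCORES[tier]


def account_score(a: int, b: int, c: int, d: int) -> int:
    """Evaluate s = c * (a + b) + d (a: friend score, b: fan score,
    c: level, d: score for posts with more than 500 comments)."""
    return c * (a + b) + d


def is_preset_media_account(score: int, threshold: int = SCORE_THRESHOLD) -> bool:
    """An account qualifies only if its score exceeds the threshold."""
    return score > threshold


# Usage with assumed inputs: 350 friends, fan score 5, level 4, comment score 3.
a = friend_score(350)            # 101-1000 tier
s = account_score(a, b=5, c=4, d=3)
print(s, is_preset_media_account(s))
```

Because the level c multiplies the sum of the friend and fan scores, a high-level account with modest friend and fan scores can still clear the threshold, which matches the statement that a higher level means greater influence.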
Referring to FIG. 9, further, the emotional vocabulary module 4 includes:
An access unit 41, configured to access a network community associated with the target hot event;
An extracting unit 42, configured to extract the chat information of the network community at the current moment and use the chat information as the comment text.
In this embodiment, a network community is an online community such as a stock chat group, Tieba, or Zhihu, where a large group of users discusses a given topic. The access unit 41 accesses the network community associated with the hot event, which may be the Tieba forum named after the asset. After the access unit 41 accesses the Tieba forum named after the asset, that is, the online community of users who follow that stock, the extracting unit 42 extracts the posts published in that forum and the comments made on those posts. The posts and the comments on them constitute the chat information mentioned above. In another specific embodiment, the extracting unit 42 registers a chat account and joins a real-time chat group; when a hot event occurs, the extracting unit 42 extracts the chat records of that real-time chat group.
Referring to FIG. 10, further, the emotional vocabulary module 4 includes:
A calling unit 43, configured to call the emotional vocabulary database;
A matching unit 44, configured to match the words in the comment text against the words in the emotional vocabulary database;
A determining unit 45, configured to determine that the words in the comment text that match words in the emotional vocabulary database are emotional vocabulary.
In this embodiment, the emotional vocabulary database is a collection, compiled in advance by back-office staff, of all words that carry an emotional tendency and express a person's feelings; the compiled words are gathered together and stored on the server. After the comment text of the hot event has been obtained, the calling unit 43 calls the emotional vocabulary database. Further, the emotional vocabulary database includes a commendatory (positive) word library and a derogatory (negative) word library; after compiling the emotional words, the staff place them in the corresponding library. After the calling unit 43 calls the emotional vocabulary database, the content of the comment text is scanned and each word of the comments is read; the matching unit 44 performs semantic analysis and finds the words in the comment text that match words in the emotional vocabulary database. All matching words are extracted, and the determining unit 45 determines that these matched words are the emotional vocabulary of the target hot event, that is, the sentiment of the many users toward the hot event. The sentiment trend of the hot event is then summarized. Specifically, all words that match the commendatory word library are aggregated, all words that match the derogatory word library are aggregated, and the ratio of commendatory words to derogatory words is calculated. From these emotional words and the ratio of commendatory to derogatory words, it can be judged whether the hot event is a favorable (bullish) event or an unfavorable (bearish) event. This step allows the subsequent asset trend analysis model to compute the asset's trend chart more quickly.
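The matching and ratio steps can be sketched in Python as follows. This is a minimal illustration, not the implementation described in the application: the two word libraries are tiny hypothetical stand-ins for the staff-compiled database, the comment text is assumed to be already split into words, and the rule that a ratio above 1 means a favorable event is an assumption, since the text does not state a cutoff.

```python
# Hypothetical miniature word libraries standing in for the emotional vocabulary database.
COMMENDATORY_WORDS = {"bullish", "surge", "profit"}
DEROGATORY_WORDS = {"bearish", "plunge", "loss"}


def extract_emotional_words(tokens, commendatory=COMMENDATORY_WORDS,
                            derogatory=DEROGATORY_WORDS):
    """Return the comment words that match each word library, in order."""
    positive = [t for t in tokens if t in commendatory]
    negative = [t for t in tokens if t in derogatory]
    return positive, negative


def commendatory_ratio(positive, negative):
    """Ratio of commendatory to derogatory matches.

    Returns None when nothing matched, and infinity when only
    commendatory words matched.
    """
    if not positive and not negative:
        return None
    if not negative:
        return float("inf")
    return len(positive) / len(negative)


def classify_event(ratio):
    """Assumed decision rule: ratio > 1 favorable, ratio < 1 unfavorable."""
    if ratio is None or ratio == 1:
        return "neutral"
    return "favorable" if ratio > 1 else "unfavorable"


# Usage: whitespace splitting is used here only to keep the example short.
tokens = "the surge looks bullish despite one loss".split()
pos, neg = extract_emotional_words(tokens)
ratio = commendatory_ratio(pos, neg)
print(pos, neg, ratio, classify_event(ratio))
```

Chinese comment text would need a word segmentation step before the lookup; that step is outside this sketch.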
In summary, the asset trend analysis device of the present application automatically obtains hot events related to an asset and then generates a trend forecast chart for the asset according to the keywords describing the hot event and the comments on the hot event. When the trend chart predicts that the asset will fall below a certain value, a message is sent to the user, reducing the risk of loss to the user's assets.
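The notification condition in the summary can be sketched as a simple check over the predicted values. The function name, the message wording, and the idea of representing the trend chart as a list of numbers are assumptions made for illustration; the application does not specify how the value is chosen or how the message is delivered.

```python
def low_value_alert(predicted_values, floor):
    """Return an alert message for the first predicted value below `floor`,
    or None if the predicted trend never drops below it."""
    for step, value in enumerate(predicted_values):
        if value < floor:
            return f"Predicted value {value} at step {step} is below the floor {floor}"
    return None


# Usage with made-up predictions: the third point falls below 9.5.
print(low_value_alert([10.2, 9.8, 9.1], floor=9.5))
```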
Referring to FIG. 11, an embodiment of the present application further provides a computer device. The computer device may be a server, and its internal structure may be as shown in FIG. 11. The computer device includes a processor, a memory, a network interface, and a database connected through a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer-readable instructions, and a database. The internal memory provides an environment for running the operating system and the computer-readable instructions stored in the non-volatile storage medium. The database of the computer device stores data such as the LSTM model and the emotional vocabulary database. The network interface of the computer device communicates with external terminals over a network connection. When executed, the computer-readable instructions perform the processes of the method embodiments described above. Those skilled in the art will understand that the structure shown in FIG. 11 is only a block diagram of the part of the structure relevant to the solution of the present application and does not limit the computer device to which the solution is applied.
An embodiment of the present application further provides a non-volatile computer-readable storage medium storing computer-readable instructions that, when executed, perform the processes of the method embodiments described above. The above are only preferred embodiments of the present application and do not limit its patent scope. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of the present application, or any direct or indirect application in other related technical fields, is likewise included within the scope of patent protection of the present application.

Claims (20)

  1. A method for asset trend analysis, characterized by comprising:
    obtaining an asset keyword of an asset to be analyzed;
    obtaining hot events and extracting keywords from the hot events to obtain event keywords;
    determining, among the event keywords, a target event keyword that matches the asset keyword, and obtaining a target hot event corresponding to the target event keyword;
    obtaining comment text of the target hot event, and extracting emotional vocabulary from the comment text;
    inputting the target event keyword and the emotional vocabulary into a preset asset trend analysis model to obtain a trend chart of the asset to be analyzed, the asset trend analysis model being an LSTM-based model.
  2. The method for asset trend analysis according to claim 1, wherein the step of obtaining hot events comprises:
    obtaining a growth rate of the follow count of a preset media account;
    when the growth rate exceeds a growth threshold, determining that the messages published by the media account within a preset time period contain a hot event;
    determining the hot event according to the messages published by the media account within the preset time period.
  3. The method for asset trend analysis according to claim 1, wherein the step of obtaining hot events comprises:
    obtaining the number of comments on messages published by a preset media account;
    when the number of comments on a target message among the messages published by the preset media account exceeds a comment threshold, determining that the target message is a hot event.
  4. The method for asset trend analysis according to claim 2, wherein the method further comprises:
    obtaining information of financial media accounts that carry a financial mark;
    inputting the information of the financial media account into a preset formula to obtain an account score, the account score being used to quantify the influence of the financial media account;
    setting the financial media account whose account score exceeds a score threshold as the preset media account.
  5. The method for asset trend analysis according to claim 1, wherein the step of obtaining the comment text of the target hot event comprises:
    accessing a network community associated with the target hot event;
    extracting the chat information of the network community at the current moment, and using the chat information as the comment text.
  6. The method for asset trend analysis according to claim 1, wherein the step of extracting the emotional vocabulary from the comment text comprises:
    calling an emotional vocabulary database;
    matching the words in the comment text against the words in the emotional vocabulary database;
    determining that the words in the comment text that match words in the emotional vocabulary database are emotional vocabulary.
  7. A device for asset trend analysis, characterized by comprising:
    an acquisition module, configured to obtain an asset keyword of an asset to be analyzed;
    an event keyword module, configured to obtain hot events and extract keywords from the hot events to obtain event keywords;
    a target hot event module, configured to determine, among the event keywords, a target event keyword that matches the asset keyword, and obtain a target hot event corresponding to the target event keyword;
    an emotional vocabulary module, configured to obtain comment text of the target hot event and extract emotional vocabulary from the comment text;
    an obtaining module, configured to input the target event keyword and the emotional vocabulary into a preset asset trend analysis model to obtain a trend chart of the asset to be analyzed, the asset trend analysis model being an LSTM-based model.
  8. The device for asset trend analysis according to claim 7, wherein the event keyword module comprises:
    a growth rate unit, configured to obtain a growth rate of the follow count of a preset media account;
    a judging unit, configured to determine, when the growth rate exceeds a growth threshold, that the messages published by the media account within a preset time period contain a hot event;
    a first hot event unit, configured to determine the hot event according to the messages published by the media account within the preset time period.
  9. The device for asset trend analysis according to claim 7, wherein the event keyword module comprises:
    a comment count unit, configured to obtain the number of comments on messages published by a preset media account;
    a second hot event unit, configured to determine that a target message is a hot event when the number of comments on the target message among the messages published by the preset media account exceeds a comment threshold.
  10. The device for asset trend analysis according to claim 8, wherein the device for asset analysis further comprises:
    an account information module, configured to obtain information of financial media accounts that carry a financial mark;
    an account score module, configured to input the information of the financial media account into a preset formula to obtain an account score, the account score being used to quantify the influence of the financial media account;
    a preset media module, configured to set the financial media account whose account score exceeds a score threshold as the preset media account.
  11. The device for asset trend analysis according to claim 7, wherein the emotional vocabulary module comprises:
    an access unit, configured to access a network community associated with the target hot event;
    an extracting unit, configured to extract the chat information of the network community at the current moment and use the chat information as the comment text.
  12. The device for asset trend analysis according to claim 7, wherein the emotional vocabulary module comprises:
    a calling unit, configured to call an emotional vocabulary database;
    a matching unit, configured to match the words in the comment text against the words in the emotional vocabulary database;
    a determining unit, configured to determine that the words in the comment text that match words in the emotional vocabulary database are emotional vocabulary.
  13. A computer device, comprising a memory and a processor, the memory storing computer-readable instructions, characterized in that the processor, when executing the computer-readable instructions, implements a method for asset trend analysis, the method comprising:
    obtaining an asset keyword of an asset to be analyzed;
    obtaining hot events and extracting keywords from the hot events to obtain event keywords;
    determining, among the event keywords, a target event keyword that matches the asset keyword, and obtaining a target hot event corresponding to the target event keyword;
    obtaining comment text of the target hot event, and extracting emotional vocabulary from the comment text;
    inputting the target event keyword and the emotional vocabulary into a preset asset trend analysis model to obtain a trend chart of the asset to be analyzed, the asset trend analysis model being an LSTM-based model.
  14. The computer device according to claim 13, wherein the step of obtaining hot events comprises:
    obtaining a growth rate of the follow count of a preset media account;
    when the growth rate exceeds a growth threshold, determining that the messages published by the media account within a preset time period contain a hot event;
    determining the hot event according to the messages published by the media account within the preset time period.
  15. The method for asset trend analysis according to claim 13, wherein the step of obtaining hot events comprises:
    obtaining the number of comments on messages published by a preset media account;
    when the number of comments on a target message among the messages published by the preset media account exceeds a comment threshold, determining that the target message is a hot event.
  16. The method for asset trend analysis according to claim 14, wherein the method further comprises:
    obtaining information of financial media accounts that carry a financial mark;
    inputting the information of the financial media account into a preset formula to obtain an account score, the account score being used to quantify the influence of the financial media account;
    setting the financial media account whose account score exceeds a score threshold as the preset media account.
  17. A non-volatile computer-readable storage medium storing computer-readable instructions, characterized in that, when the processor executes the computer-readable instructions, a method for asset trend analysis is implemented, the method comprising: obtaining an asset keyword of an asset to be analyzed;
    obtaining hot events and extracting keywords from the hot events to obtain event keywords;
    determining, among the event keywords, a target event keyword that matches the asset keyword, and obtaining a target hot event corresponding to the target event keyword;
    obtaining comment text of the target hot event, and extracting emotional vocabulary from the comment text;
    inputting the target event keyword and the emotional vocabulary into a preset asset trend analysis model to obtain a trend chart of the asset to be analyzed, the asset trend analysis model being an LSTM-based model.
  18. The non-volatile computer-readable storage medium according to claim 17, wherein the step of obtaining hot events comprises:
    obtaining a growth rate of the follow count of a preset media account;
    when the growth rate exceeds a growth threshold, determining that the messages published by the media account within a preset time period contain a hot event;
    determining the hot event according to the messages published by the media account within the preset time period.
  19. The non-volatile computer-readable storage medium according to claim 17, wherein the step of obtaining hot events comprises:
    obtaining the number of comments on messages published by a preset media account;
    when the number of comments on a target message among the messages published by the preset media account exceeds a comment threshold, determining that the target message is a hot event.
  20. The non-volatile computer-readable storage medium according to claim 18, wherein the method further comprises:
    obtaining information of financial media accounts that carry a financial mark;
    inputting the information of the financial media account into a preset formula to obtain an account score, the account score being used to quantify the influence of the financial media account;
    setting the financial media account whose account score exceeds a score threshold as the preset media account.
PCT/CN2018/094887 2018-05-08 2018-07-06 Method, device, computer device, and storage medium for asset trend analysis WO2019214046A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810433804.9A CN108733782A (en) 2018-05-08 2018-05-08 Method, apparatus, computer equipment and the storage medium of assets trend analysis
CN201810433804.9 2018-05-08

Publications (1)

Publication Number Publication Date
WO2019214046A1 true WO2019214046A1 (en) 2019-11-14

Family

ID=63938090

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/094887 WO2019214046A1 (en) 2018-05-08 2018-07-06 Method, device, computer device, and storage medium for asset trend analysis

Country Status (2)

Country Link
CN (1) CN108733782A (en)
WO (1) WO2019214046A1 (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111488514B (en) * 2019-01-25 2024-03-01 北京京东尚科信息技术有限公司 Method, device, equipment and storage medium for mining violently rising words
CN111177517B (en) * 2019-12-16 2023-04-07 北京明略软件系统有限公司 Method and device for determining severity of risk event
CN112348279B (en) * 2020-11-18 2024-04-05 武汉大学 Information propagation trend prediction method, device, electronic equipment and storage medium
CN113393330B (en) * 2021-07-11 2022-12-23 深圳市鼎驰科技发展有限公司 Financial wind control management system based on block chain

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103778215A (en) * 2014-01-17 2014-05-07 北京理工大学 Stock market forecasting method based on sentiment analysis and hidden Markov fusion model
CN106384166A (en) * 2016-09-12 2017-02-08 中山大学 Deep learning stock market prediction method combined with financial news
CN107797983A (en) * 2017-04-07 2018-03-13 平安科技(深圳)有限公司 Microblog data processing method, device, computer equipment and storage medium
CN107767273A (en) * 2017-09-05 2018-03-06 平安科技(深圳)有限公司 Asset Allocation method, electronic installation and medium based on social data

Also Published As

Publication number Publication date
CN108733782A (en) 2018-11-02

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18918042

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18918042

Country of ref document: EP

Kind code of ref document: A1