CN110321407A - Election result prediction method, device, and computer storage medium - Google Patents

Election result prediction method, device, and computer storage medium

Info

Publication number
CN110321407A
Authority
CN
China
Prior art keywords
data
index
election
cutting
correlation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910448874.6A
Other languages
Chinese (zh)
Inventor
杜蕾
姚茂建
僬梦姝
王晓斌
黄三伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hunan Ant Software Ltd By Share Ltd
Original Assignee
Hunan Ant Software Ltd By Share Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33: Querying
    • G06F16/3331: Query processing
    • G06F16/334: Query execution
    • G06F16/3344: Query execution using natural language analysis
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35: Clustering; Classification
    • G06F16/353: Clustering; Classification into predefined classes
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/30: Semantic analysis
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00: Administration; Management
    • G06Q10/04: Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00: Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/01: Social networking

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Human Resources & Organizations (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Economics (AREA)
  • Strategic Management (AREA)
  • Data Mining & Analysis (AREA)
  • Computing Systems (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Databases & Information Systems (AREA)
  • Tourism & Hospitality (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Biomedical Technology (AREA)
  • Operations Research (AREA)
  • Primary Health Care (AREA)
  • Game Theory and Decision Science (AREA)
  • Development Economics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Quality & Reliability (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses an election result prediction method. The method comprises: collecting election-related online speech data and preprocessing it to generate processed data; selecting indicators based on indicator correlation and historical-data modeling to generate election-related indicators; generating time-dimension and region-dimension segmentation points based on the processed data, and segmenting the processed data to generate segmented data; calculating the indicators based on the election-related indicators, using semantic analysis and a deep learning algorithm, to generate indicator calculation results; and predicting the election result based on the indicator calculation results and the segmented data. The invention also discloses an election result prediction device and a computer storage medium.

Description

Election result prediction method, device, and computer storage medium
Technical Field
The present invention relates to the field of big data analysis, and in particular to an election result prediction method, device, and computer storage medium.
Background
As social media has evolved, web pages and microblogs have become the platforms on which most groups obtain and exchange information. Unlike traditional media, social media information is massive, open, real-time, and interactive. Large volumes of user speech converge into public opinion and influence its direction. The influence of online public opinion is increasingly important, and media coverage of election-related information makes the election process more transparent, so it is feasible to mine election-related information from massive online data and to predict election results from it.
Traditional election result prediction methods mostly rely on statistics of existing data such as comment counts, repost counts, and like counts. These statistics cannot represent the orientation of netizens' speech, so the predicted election result often deviates considerably from the actual result.
Summary of the Invention
In view of this, the main purpose of the present invention is to provide an election result prediction method, device, and computer storage medium with higher accuracy.
To achieve the above purpose, the technical solution of the present invention is realized as follows:
The present invention provides an election result prediction method, the method comprising: collecting election-related online speech data and preprocessing it to generate processed data; selecting indicators based on indicator correlation and historical-data modeling to generate election-related indicators; generating time-dimension and region-dimension segmentation points based on the processed data, and segmenting the processed data to generate segmented data; calculating the indicators based on the election-related indicators, using semantic analysis and a deep learning algorithm, to generate indicator calculation results; and predicting the election result based on the indicator calculation results and the segmented data.
In the above solution, selecting indicators based on indicator correlation and historical-data modeling to generate election-related indicators comprises: screening highly correlated indicators based on the correlation between indicators; calculating the contribution of each indicator based on historical data, as the weight basis; and generating the election-related indicators based on the highly correlated indicators and the weight basis.
In the above solution, calculating the indicators based on the election-related indicators, using semantic analysis and a deep learning algorithm, to generate indicator calculation results comprises: obtaining the sentiment orientation of the segmented data using a deep learning algorithm, based on the election-related indicators; calculating the topic sentiment of the segmented data using semantic analysis, based on the election-related indicators; and extracting the political orientation of the segmented data based on the sentiment orientation and the topic sentiment, to generate the indicator calculation results.
In the above solution, obtaining the sentiment orientation of the segmented data using a deep learning algorithm, based on the election-related indicators, comprises: based on the segmented data and the prediction indicator system, building a text sentiment classification model with a convolutional neural network algorithm from deep learning, and classifying the sentiment of the text.
In the above solution, calculating the topic sentiment of the segmented data using semantic analysis, based on the election-related indicators, comprises: based on the segmented data and the election-related indicators, extracting topics from the segmented data using semantic analysis to obtain hot topics; and calculating the topic sentiment of the hot topics based on the text sentiment classification model.
In the above solution, generating time-dimension and region-dimension segmentation points based on the processed data and segmenting the processed data to generate segmented data comprises: detecting peak changes of the processed data on the election timeline; generating time segmentation points for the processed data according to the peak changes; and segmenting the processed data according to the time segmentation points to generate time-segmented data.
In the above solution, generating time-dimension and region-dimension segmentation points based on the processed data and segmenting the processed data to generate segmented data comprises: determining the source region of the processed data; generating region segmentation points for the processed data according to the source region; and segmenting the processed data according to the region segmentation points to generate region-segmented data.
In the above solution, after segmenting the processed data according to the region segmentation points to generate region-segmented data, the method further comprises: setting weights for the region-segmented data.
To achieve the above purpose, the present invention also provides an election result prediction device, the device comprising a processor and a memory connected to the processor through a communication bus; wherein
the memory is configured to store an election result prediction program; and
the processor is configured to execute the election result prediction program to implement the election result prediction steps of any of the above solutions.
To achieve the above purpose, the present invention also provides a computer storage medium storing one or more programs, the one or more programs being executable by one or more processors so that the one or more processors perform the election result prediction steps of any of the above solutions.
In the election result prediction method provided by the present invention, election-related online speech data is collected and preprocessed to generate processed data; indicators are selected based on indicator correlation and historical-data modeling to generate election-related indicators; time-dimension and region-dimension segmentation points are generated based on the processed data, and the processed data is segmented to generate segmented data; the indicators are calculated based on the election-related indicators, using semantic analysis and a deep learning algorithm, to generate indicator calculation results; and the election result is predicted based on the indicator calculation results and the segmented data. In this way, collecting and preprocessing the election-related online speech data improves the correlation between the processed data and the election event, and combining indicator selection, time and region segmentation, and indicator calculation improves the accuracy of election result prediction.
Brief Description of the Drawings
Fig. 1 is a flowchart of the election result prediction method in an optional embodiment of the present invention;
Fig. 2 is a flowchart of the election result prediction method in an optional embodiment of the present invention;
Fig. 3 is a flowchart of the election result prediction method in an optional embodiment of the present invention;
Fig. 4 is a flowchart of the election result prediction method in an optional embodiment of the present invention;
Fig. 5 is a flowchart of the election result prediction method in an optional embodiment of the present invention;
Fig. 6 is a structural diagram of the election result prediction device in an optional embodiment of the present invention;
Fig. 7 is a structural diagram of the election result prediction system in an optional embodiment of the present invention;
Fig. 8 is a structural diagram of the election result prediction system in an optional embodiment of the present invention;
Fig. 9 is a structural diagram of the election result prediction system in an optional embodiment of the present invention;
Fig. 10 is a structural diagram of the election result prediction system in an optional embodiment of the present invention.
Detailed Description
The present invention is described in further detail below with reference to the accompanying drawings and specific embodiments.
Fig. 1 is a flowchart of the election result prediction method in an embodiment of the present invention. Referring to Fig. 1, an embodiment of the present invention provides an election result prediction method, the method comprising:
Step S101: collect election-related online speech data and preprocess it to generate processed data. Here, the election-related data is denoised. The election-related data may contain noise such as junk data, for example advertisements and celebrity content. Specific words are obtained by Bayesian calculation and removed from the election-related data, thereby removing the junk data. In an optional embodiment, the information of the election candidates is obtained, mainly from the official election website or opinion polls, to determine the list of candidates, the offices contested, and the organizations that the candidates represent or belong to; data related to the candidates and their affiliated organizations is then obtained. In an optional embodiment, to handle namesakes (other people who share a candidate's name), data is screened using combinations of the candidate's name and specific keywords, which removes election-irrelevant data, effectively resolves the namesake problem, and avoids interference from unrelated persons. In an optional embodiment, non-text content is removed so that only text content is retained.
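The preprocessing in step S101 can be sketched as follows. This is a minimal illustration, not the patented implementation: the spam vocabulary `SPAM_WORDS`, the function names, and the URL-stripping rule are all assumptions, and the Bayesian step that would derive the spam words is replaced by a fixed set.

```python
import re

# Hypothetical spam vocabulary. The source derives such words with a
# Bayesian calculation; a fixed set stands in for that step here.
SPAM_WORDS = {"discount", "giveaway", "concert"}


def strip_non_text(post):
    """Drop URLs and collapse whitespace so only text content remains."""
    no_urls = re.sub(r"https?://\S+", " ", post)
    return re.sub(r"\s+", " ", no_urls).strip()


def is_relevant(text, candidate, keywords):
    """Keep a post only if it names the candidate together with an
    election keyword, which filters out namesakes."""
    lowered = text.lower()
    return candidate.lower() in lowered and any(
        k.lower() in lowered for k in keywords
    )


def preprocess(posts, candidate, keywords, spam_words=SPAM_WORDS):
    """Return the posts that survive denoising and relevance screening."""
    processed = []
    for raw in posts:
        text = strip_non_text(raw)
        words = set(re.findall(r"\w+", text.lower()))
        if words & spam_words:
            continue  # junk data such as advertisements
        if is_relevant(text, candidate, keywords):
            processed.append(text)
    return processed
```

Requiring both the name and a keyword is what separates a candidate from an athlete or entertainer with the same name; the keyword list would need tuning per election.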
Step S102: select indicators based on indicator correlation and historical-data modeling to generate election-related indicators. Here, refer to Fig. 2:
Step S201: screen highly correlated indicators based on the correlation between indicators. Here, election-related data is collected from the internet. Specifically, data related to the election and the candidates is collected from web pages, portal websites, and internet social platforms (such as Weibo, Twitter, and Facebook). Prediction indicators are extracted from three sources: netizen public opinion, media reports, and the candidates' official publicity strength. Netizen public opinion is analyzed mainly from the perspective of the general public, and its sources are mainly social platforms such as Weibo, Facebook, and Twitter. Media reports come mainly from long-form text about the election and the candidates published by internet media such as news sites and portals, and may be refined into indicators such as media volume, media support rate, topic support rate, and media volume growth rate. The candidates' official publicity strength comes from the activity and attention of the candidates, or their affiliated organizations, on social media during the election, and mainly includes the number of social media posts, the number of comments, the number of followers, and the follower growth rate.
Step S202: calculate the contribution of each indicator based on historical data, as the weight basis. Here, the contribution of each indicator is calculated from its influence on election results as recorded in the historical data of previous elections, and the contribution is used as the weight basis.
Step S203: generate the election-related indicators based on the highly correlated indicators and the weight basis. Here, the election-related indicators are generated from the weight basis and the prediction indicators extracted from netizen public opinion, media reports, and the candidates' official publicity strength.
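Steps S201 to S203 can be sketched as follows. The source does not specify the correlation measure or how contribution is computed, so this sketch makes two loudly stated assumptions: screening uses the absolute Pearson correlation between each indicator and the historical vote share, and an indicator's contribution is its share of the total correlation among the retained indicators. The names `select_indicators` and `threshold` are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no correlation signal
    return cov / sqrt(var_x * var_y)


def select_indicators(history, outcome, threshold=0.5):
    """Screen indicators and derive weights from past elections.

    history maps an indicator name to its values over past elections;
    outcome holds the vote share in those same elections (assumed data).
    Returns {indicator: weight} with weights summing to 1.
    """
    scores = {
        name: abs(pearson(values, outcome))
        for name, values in history.items()
    }
    kept = {name: s for name, s in scores.items() if s >= threshold}
    total = sum(kept.values())
    if total == 0:
        return {}
    return {name: s / total for name, s in kept.items()}
```

A regression or tree-based importance measure could replace the correlation-based contribution without changing the overall flow.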
Step S103: generate time-dimension and region-dimension segmentation points based on the processed data, and segment the processed data to generate segmented data.
Based on the processed data, segmentation points are generated and the data is segmented. In an optional embodiment, refer to Fig. 3:
Step S301: detect peak changes of the processed data on the election timeline. Here, all peaks of the processed data on the election timeline are detected, that is, the time points at which the data volume of each competing camp changes or fluctuates noticeably.
Step S302: generate the time segmentation points of the processed data according to the peak changes. Here, a time point at which the share of each camp fluctuates noticeably in the processed data can serve as a time segmentation point; the choice of segmentation point takes into account both the characteristics of the election cycle and the variation of the camps.
Step S303: segment the processed data according to the time segmentation points to generate time-segmented data. Here, the election event is divided along the time dimension into three periods: a calm period, a peak period, and a pre-election period. Combining the data variation with an analysis of the data charts for the period approaching the vote, the election timeline is cut dynamically at the time segmentation point into three stages: the calm period, from two months before the vote to the time segmentation point; the peak period, from the time segmentation point to 10 days before the vote; and the pre-election period, from 10 days before the vote until the vote.
Step S304: set the weights of the time-segmented data. Here, weights are set for the time-segmented data, that is, separate weights are set for the calm period, the peak period, and the pre-election period.
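The time segmentation in steps S301 to S304 can be sketched as follows. The peak rule is an assumption: the source only says that noticeable changes in data volume mark the segmentation point, so this sketch uses the first day whose volume is at least `jump_ratio` times the previous day's. The names and the weight values in `PHASE_WEIGHTS` are hypothetical.

```python
# Hypothetical per-phase weights for step S304; the source gives no values.
PHASE_WEIGHTS = {"calm": 0.2, "peak": 0.3, "pre_election": 0.5}


def find_time_cut(daily_volume, jump_ratio=2.0):
    """Return the index of the first day whose volume is at least
    jump_ratio times the previous day's volume, or None if no such day."""
    for i in range(1, len(daily_volume)):
        prev = daily_volume[i - 1]
        if prev > 0 and daily_volume[i] / prev >= jump_ratio:
            return i
    return None


def segment_timeline(daily_volume, final_days=10, jump_ratio=2.0):
    """Split day indices into calm, peak and pre-election phases.

    daily_volume[0] is the first day of the observation window (about two
    months before the vote) and the last entry is the day before the vote.
    """
    n = len(daily_volume)
    final_start = max(n - final_days, 0)
    cut = find_time_cut(daily_volume[:final_start], jump_ratio)
    if cut is None:
        cut = final_start  # no clear surge, so the peak phase is empty
    return {
        "calm": list(range(0, cut)),
        "peak": list(range(cut, final_start)),
        "pre_election": list(range(final_start, n)),
    }
```

The last ten days are fixed by the source, so only the boundary between the calm and peak phases is data-driven in this sketch.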
In another optional embodiment, refer to Fig. 4:
Step S401: determine the source region of the processed data. Here, the processed data is divided by data source into local and non-local processed data.
Step S402: generate the region segmentation points of the processed data according to the source region. Here, region segmentation points are generated according to the differing source regions of the processed data. Optionally, when the source region is divided into local and non-local, a local/non-local region segmentation point is generated for the processed data.
Step S403: segment the processed data according to the region segmentation points to generate region-segmented data. Here, according to the local/non-local region segmentation point, the processed data is split into local processed data and non-local processed data.
Step S404: set the weights of the region-segmented data. Here, according to the importance of the regional data, the weight of the local processed data is set to w1 and the weight of the non-local processed data is set to w2. Setting these weights reflects the regional characteristics of the data and the degree to which processed data from different regions influences a candidate.
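The region split and weighting in steps S401 to S404 can be sketched as follows. The record layout (a `region` field on each post), the default values of `w1` and `w2`, and the normalized weighted average are assumptions made for illustration; the source only states that local and non-local data receive weights w1 and w2.

```python
def split_by_region(posts, local_region):
    """Split posts into local and non-local groups by their source region."""
    local = [p for p in posts if p["region"] == local_region]
    non_local = [p for p in posts if p["region"] != local_region]
    return local, non_local


def weighted_regional_score(local_score, non_local_score, w1=0.7, w2=0.3):
    """Combine two regional scores with weights w1 (local) and w2 (non-local).

    Dividing by w1 + w2 keeps the result on the same scale as the inputs
    even when the weights do not sum to 1.
    """
    return (w1 * local_score + w2 * non_local_score) / (w1 + w2)
```

For example, with a local score of 0.6 and a non-local score of 0.2, the default weights give a combined score of about 0.48, so local opinion dominates.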
Step S104: calculate the indicators based on the election-related indicators, using semantic analysis and a deep learning algorithm, to generate indicator calculation results. Here, refer to Fig. 5:
Step S501: obtain the sentiment orientation of the segmented data using a deep learning algorithm, based on the election-related indicators. Here, based on the segmented data and the prediction indicator system, a text sentiment classification model is built with a convolutional neural network algorithm from deep learning, and the sentiment of the text is classified. A neural-network dependency parsing algorithm is used to obtain the topics that netizens discuss most; a convolutional neural network classifies the text as positive or negative to obtain the sentiment orientation of netizens' speech.
Step S502: calculate the topic sentiment of the segmented data using semantic analysis, based on the election-related indicators. Here, based on the segmented data and the election-related indicators, topics are extracted from the segmented data using semantic analysis to obtain hot topics; the topic sentiment of the hot topics is then calculated based on the text sentiment classification model.
Here, semantic analysis is used to extract the sentiment orientation of text, and it consists of topic extraction and positive/negative sentiment classification. Topic extraction uses a neural-network dependency parser: the text is segmented into words, the part of speech of each word is obtained, the dependency relations between words are generated, and the subject, predicate, and object of each sentence are extracted and recombined to generate a topic. Specifically, the neural-network dependency parser is a simple three-layer network. Its input is the word vectors after segmentation, together with parts of speech and arc labels; the input passes through a nonlinear function and is output through a softmax layer, yielding the dependency relation of each word and the word it depends on. The input data can be expressed as $[x^w, x^t, x^l]$, representing words, parts of speech, and arc labels respectively. The nonlinear function of the hidden layer can be expressed as $h = (W x^w + T x^t + L x^l + b)^3$, where $W$, $T$, and $L$ are the weights for words, parts of speech, and arc labels respectively, and $b$ is the bias. The softmax output layer is expressed as $\mathrm{softmax}(W_{out} h)$, where $W_{out}$ is the weight of the output layer.
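The forward pass of the three-layer parser network described above can be sketched as follows, using plain Python lists instead of a tensor library. All weights and vector sizes in any call are hypothetical; a real parser would learn them from a treebank, and the output classes would correspond to parsing decisions that the source does not enumerate.

```python
from math import exp


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def parser_forward(x_w, x_t, x_l, W, T, L, b, W_out):
    """Compute softmax(W_out h) with h = (W x_w + T x_t + L x_l + b)^3.

    x_w, x_t, x_l are the word, part-of-speech and arc-label input vectors;
    W, T, L are their hidden-layer weights and b is the hidden bias.
    """
    pre_activation = [
        w + t + l + bias
        for w, t, l, bias in zip(
            matvec(W, x_w), matvec(T, x_t), matvec(L, x_l), b
        )
    ]
    hidden = [p ** 3 for p in pre_activation]  # element-wise cube
    return softmax(matvec(W_out, hidden))
```

The cube activation is what the formula in the text specifies; it lets the hidden layer model products of up to three input features directly.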
Here, the neural-network dependency parser yields only the dependency relations between words, so the constituent structure must still be extracted and recombined. Using parts of speech and dependency relations, the core word of each sentence is found; starting from the core word, the nearest subject-predicate relation and verb-object relation are extracted to form a subject-predicate-object structure, which gives the topic of the sentence. If the sentence has no subject-predicate-object structure, the preposition-object and verb-centered phrase parts are extracted and combined with the core word to obtain a topic in the same way. If neither a subject-predicate relation nor a verb-object relation exists, topic extraction is abandoned for that sentence. For positive/negative sentiment classification, a batch of data with sentiment labels is first prepared as the training set: a batch of data is selected manually and labeled with sentiment. The prepared training set is fed into the convolutional neural network to train a model; the model then classifies all other texts as positive or negative, giving the polarity of each record, from which netizens' sentiment orientation toward each candidate is calculated.
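The topic recombination rules above can be sketched as follows, taking an already parsed sentence as input. The token layout and the relation labels `root`, `subj`, `obj`, and `pobj` are hypothetical stand-ins for whatever label set the parser emits, and attaching the preposition object directly to the core word is a simplification.

```python
def nearest(tokens, head, rel):
    """Return the dependent of head with relation rel that is closest
    to it in the sentence, or None if there is none."""
    candidates = [
        t for t in tokens if t["head"] == head["id"] and t["rel"] == rel
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda t: abs(t["id"] - head["id"]))


def extract_topic(tokens):
    """Build a topic string from the core word and its nearest arguments.

    Returns None when the sentence has neither a subject nor an object,
    which mirrors the rule of abandoning such sentences.
    """
    root = next((t for t in tokens if t["rel"] == "root"), None)
    if root is None:
        return None
    subj = nearest(tokens, root, "subj")
    obj = nearest(tokens, root, "obj")
    if subj is None and obj is None:
        return None
    parts = [root] + [t for t in (subj, obj) if t is not None]
    if subj is None or obj is None:
        # Incomplete subject-predicate-object structure: fall back to a
        # preposition object attached to the core word, if one exists.
        pobj = nearest(tokens, root, "pobj")
        if pobj is not None:
            parts.append(pobj)
    parts.sort(key=lambda t: t["id"])  # restore sentence order
    return " ".join(t["word"] for t in parts)
```

Sorting by token position keeps the topic readable in the original word order regardless of which arguments were found.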
Step S503: extract the political orientation of the segmented data based on the sentiment orientation and the topic sentiment, to generate the indicator calculation results. Here, the political orientation of netizens is calculated from the sentiment orientation and the topic sentiment, and the indicator calculation results are generated from the political orientation.
Step S105: predict the election result based on the indicator calculation results and the segmented data. Here, the support rate of each candidate is obtained from the indicator calculation results and the segmented data through a series of statistical calculations, and the candidate with the highest support rate is the predicted winner in the current constituency.
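The final prediction in step S105 can be sketched as follows. The source says only that "a series of statistical calculations" yields the support rates, so the weighted sum over segments and the normalization used here are assumptions, as are the function names and the shape of the inputs.

```python
def support_rates(scores):
    """Normalize per-candidate scores into support rates summing to 1."""
    total = sum(scores.values())
    if total <= 0:
        return {}
    return {candidate: s / total for candidate, s in scores.items()}


def predict_winner(segment_scores, segment_weights):
    """Combine per-segment candidate scores and pick the top candidate.

    segment_scores maps a segment name (for example a time phase or a
    region) to {candidate: favourable score}; segment_weights maps the
    same segment names to their weights. Both are assumed inputs.
    """
    combined = {}
    for segment, candidate_scores in segment_scores.items():
        weight = segment_weights.get(segment, 0.0)
        for candidate, score in candidate_scores.items():
            combined[candidate] = combined.get(candidate, 0.0) + weight * score
    rates = support_rates(combined)
    winner = max(rates, key=rates.get) if rates else None
    return winner, rates
```

Giving later segments larger weights lets opinion close to the vote count for more, which matches the idea of weighting the calm, peak, and pre-election periods separately.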
In the election result prediction method provided by the present invention, election-related online speech data is collected and preprocessed to generate processed data; indicators are selected based on indicator correlation and historical-data modeling to generate election-related indicators; time-dimension and region-dimension segmentation points are generated based on the processed data, and the processed data is segmented to generate segmented data; the indicators are calculated based on the election-related indicators, using semantic analysis and a deep learning algorithm, to generate indicator calculation results; and the election result is predicted based on the indicator calculation results and the segmented data. In this way, collecting and preprocessing the election-related online speech data improves the correlation between the processed data and the election event, and combining indicator selection, time and region segmentation, and indicator calculation improves the accuracy of election result prediction.
To achieve the above object, the present invention also provides a kind of election results prediction meanss, referring to Fig. 6, described device Including processor 501 and the memory 503 being connect by communication bus 502 with the processor 501;Wherein, the storage Device 503, for storing election results Prediction program;The processor 501, for executing the election results Prediction program, with Realize election results prediction steps described in any of the above-described scheme: acquisition election network of relation speech data simultaneously carry out pretreatment behaviour Make, generates reduced data;Selecting index is carried out based on the modeling of index related and historical data, generates election index of correlation; Based on the reduced data, time dimension cut-off and region dimension cut-off are generated, reduced data described in cutting generates Cutting data;Based on the election index of correlation, index calculating is carried out in conjunction with semantic analysis technology and deep learning algorithm, it is raw At index calculated result;Based on the index calculated result and the cutting data, election results are predicted.
Here, the processor 501, for executing the election results Prediction program, to realize that following election results are pre- It surveys step: based on the correlation between index, screening highly relevant index;The contribution degree of each index is calculated based on historical data, is made For weighted basis;Based on the highly relevant index and the weighted basis, election index of correlation is generated.
Here, the processor 501, for executing the election results Prediction program, to realize that following election results are pre- It surveys step: being based on the election index of correlation, the Sentiment orientation of the cutting data is obtained using deep learning algorithm;It is based on The election index of correlation and the election index of correlation calculate the topic feelings of the cutting data using semantic analysis technology Sense;Based on the Sentiment orientation and the topic emotion, the political orientation of the cutting data is extracted, index is generated and calculates knot Fruit.
Here, the processor 501, for executing the election results Prediction program, to realize that following election results are pre- Survey step: based on cutting data and the prediction index system, using the convolutional neural networks algorithm in deep learning Text emotion disaggregated model is constructed, emotional semantic classification is carried out to text.
Here, the processor 501, for executing the election results Prediction program, to realize that following election results are pre- It surveys step: based on cutting data and the election index of correlation, cutting data being carried out using semantic analysis technology Topic extracts, and obtains hot topic;Based on the text emotion disaggregated model, the topic emotion of hot topic is calculated.
Here, the processor 501, for executing the election results Prediction program, to realize that following election results are pre- It surveys step: being based on the reduced data, detect wave crest variation of the reduced data on election timeline;According to described Wave crest variation, generates the time cut-off of the reduced data;According to the time cut-off, reduced data described in cutting Generate time cutting data.
Here, the processor 501 is configured to execute the election result prediction program to implement the following election result prediction step: setting a weight for the time-segmented data.
Here, the processor 501 is configured to execute the election result prediction program to implement the following election result prediction steps: based on the processed data, determining the source region of the processed data; generating region segmentation points for the processed data according to the source region; and segmenting the processed data according to the region segmentation points to generate region-segmented data.
Here, the processor 501 is configured to execute the election result prediction program to implement the following election result prediction step: setting a weight for the region-segmented data.
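The region segmentation and weighting steps can be illustrated as below. The sketch assumes each record carries a `region` field and that data is split only into "local" and "others" relative to the constituency being predicted. The weight values 0.7 and 0.3 and all names are hypothetical.

```python
from collections import defaultdict

def split_by_region(records, home_region):
    """Group records into 'local' and 'others' by their source region."""
    segments = defaultdict(list)
    for record in records:
        key = "local" if record["region"] == home_region else "others"
        segments[key].append(record)
    return segments

def weighted_region_volume(segments, w_local=0.7, w_others=0.3):
    """Combine local and non-local data volumes with region weights."""
    return w_local * len(segments["local"]) + w_others * len(segments["others"])

# Hypothetical records with a source region attached during preprocessing.
records = [
    {"region": "A", "text": "post 1"},
    {"region": "A", "text": "post 2"},
    {"region": "A", "text": "post 3"},
    {"region": "B", "text": "post 4"},
    {"region": "B", "text": "post 5"},
]
segments = split_by_region(records, home_region="A")
volume = weighted_region_volume(segments)
print(len(segments["local"]), len(segments["others"]), volume)
```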
Optionally, the processor 501 may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, or discrete hardware components. Here, the program executed by the processor 501 may be stored in the memory 503, which is connected to the processor 501 through the communication bus 502. The memory 503 may be volatile memory or non-volatile memory, and may also include both volatile and non-volatile memory. The non-volatile memory may be read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), ferromagnetic random access memory (FRAM), flash memory, magnetic surface storage, an optical disc, or compact disc read-only memory (CD-ROM); the magnetic surface storage may be magnetic disk storage or magnetic tape storage. The volatile memory may be random access memory (RAM), which is used as an external cache. By way of example and not limitation, many forms of RAM are available, such as static random access memory (SRAM), synchronous static random access memory (SSRAM), dynamic random access memory (DRAM), synchronous dynamic random access memory (SDRAM), double data rate synchronous dynamic random access memory (DDRSDRAM), enhanced synchronous dynamic random access memory (ESDRAM), SyncLink dynamic random access memory (SLDRAM), and direct Rambus random access memory (DRRAM). The memory 503 described in the embodiments of the present invention is intended to include, but is not limited to, these and any other suitable types of memory.
The memory 503 in the embodiments of the present invention is used to store various types of data to support the operation of the processor 501. Examples of such data include: any computer program to be run by the processor 501, such as an operating system and application programs; contact data; phone book data; messages; pictures; videos; and so on. The operating system includes various system programs, such as a framework layer, a core library layer and a driver layer, which implement various basic services and handle hardware-based tasks.
To achieve the above object, the present invention further provides a computer storage medium. The computer storage medium stores one or more programs, and the one or more programs can be executed by one or more processors 501 to cause the one or more processors 501 to perform the election result prediction steps of any of the above solutions: collecting election-related online speech data and performing a preprocessing operation to generate processed data; performing indicator selection based on indicator correlation and historical data modeling to generate election-related indicators; generating time-dimension segmentation points and region-dimension segmentation points based on the processed data, and segmenting the processed data to generate segmented data; performing indicator calculation based on the election-related indicators, in combination with semantic analysis technology and a deep learning algorithm, to generate an indicator calculation result; and predicting the election result based on the indicator calculation result and the segmented data.
Optionally, the one or more programs can be executed by the one or more processors 501 to cause the one or more processors 501 to perform the following election result prediction steps: screening highly correlated indicators based on the correlation between indicators; calculating the contribution of each indicator based on historical data, as a weight basis; and generating election-related indicators based on the highly correlated indicators and the weight basis.
Optionally, the one or more programs can be executed by the one or more processors 501 to cause the one or more processors 501 to perform the following election result prediction steps: obtaining the sentiment orientation of the segmented data using a deep learning algorithm, based on the election-related indicators; calculating the topic sentiment of the segmented data using semantic analysis technology, based on the election-related indicators; and extracting the political orientation of the segmented data based on the sentiment orientation and the topic sentiment, to generate an indicator calculation result.
Optionally, the one or more programs can be executed by the one or more processors 501 to cause the one or more processors 501 to perform the following election result prediction steps: based on the segmented data and the prediction indicator system, constructing a text sentiment classification model using a convolutional neural network algorithm from deep learning, and performing sentiment classification on text.
Optionally, the one or more programs can be executed by the one or more processors 501 to cause the one or more processors 501 to perform the following election result prediction steps: based on the segmented data and the election-related indicators, performing topic extraction on the segmented data using semantic analysis technology to obtain hot topics; and calculating the topic sentiment of the hot topics based on the text sentiment classification model.
Optionally, the one or more programs can be executed by the one or more processors 501 to cause the one or more processors 501 to perform the following election result prediction steps: based on the processed data, detecting peak changes of the processed data on the election timeline; generating time segmentation points for the processed data according to the peak changes; and segmenting the processed data according to the time segmentation points to generate time-segmented data.
Optionally, the one or more programs can be executed by the one or more processors 501 to cause the one or more processors 501 to perform the following election result prediction step: setting a weight for the time-segmented data.
Optionally, the one or more programs can be executed by the one or more processors 501 to cause the one or more processors 501 to perform the following election result prediction steps: based on the processed data, determining the source region of the processed data; generating region segmentation points for the processed data according to the source region; and segmenting the processed data according to the region segmentation points to generate region-segmented data.
Optionally, the one or more programs can be executed by the one or more processors 501 to cause the one or more processors 501 to perform the following election result prediction step: setting a weight for the region-segmented data.
Optionally, the computer storage medium may be volatile memory, such as random access memory; or non-volatile memory, such as read-only memory, flash memory, a hard disk or a solid-state drive; it may also be a device that includes one or any combination of the above memories 503, such as a mobile phone, a computer, a tablet device or a personal digital assistant.
The present invention provides an optional specific embodiment to further describe the technical solution of the present invention. As shown in Fig. 7, the election prediction system 701 is divided into four modules: a data acquisition and preprocessing module 702, an indicator selection module 703, an indicator calculation module 704, and a prediction module 705.
Here, the data acquisition and preprocessing module 702 performs data acquisition and data denoising. Data acquisition collects data related to the election and the election candidates from web pages, portal websites and internet social platforms (such as Weibo, Twitter and Facebook). Data denoising removes noise from the election-related data, which may contain junk data such as advertisements and celebrity content: specific words are obtained by Bayesian calculation and removed from the election-related data, thereby removing the junk data. In an optional embodiment, candidate information is obtained, mainly from the official election website or from opinion polls, to determine the list of candidates, the positions being contested, and the organizations the candidates represent or belong to; data related to the candidates and their organizations is then obtained. In an optional embodiment, to handle persons who share a candidate's name, data is screened using the candidate's name in combination with specific keywords so as to remove election-irrelevant data, which effectively resolves the name ambiguity and avoids interference from unrelated persons. In an optional embodiment, non-text content is removed so that the text content is retained.
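The name-plus-keyword screening and the removal of non-text content described above might look like the following sketch. It assumes English-language posts, a fixed keyword list, and a fixed noise-word list. In the source the noise words come from a Bayesian calculation, which is not reproduced here, and all names and sample posts are hypothetical.

```python
import re

def strip_non_text(raw):
    """Remove HTML tags and URLs, then collapse whitespace."""
    text = re.sub(r"<[^>]+>", " ", raw)
    text = re.sub(r"https?://\S+", " ", text)
    return re.sub(r"\s+", " ", text).strip()

def is_relevant(text, candidate, keywords, noise_words):
    """Keep text that names the candidate together with at least one
    election keyword and contains none of the noise words."""
    lowered = text.lower()
    if candidate.lower() not in lowered:
        return False
    if any(word in lowered for word in noise_words):
        return False
    return any(keyword in lowered for keyword in keywords)

KEYWORDS = ["mayor", "election", "vote"]   # assumed election keywords
NOISE_WORDS = ["discount", "sale"]         # assumed advertising words

posts = [
    "<p>Jane Doe wins mayor debate http://x.example/1</p>",
    "Jane Doe releases new pop album",      # namesake, no election keyword
    "Vote Jane Doe! Discount sale today",   # advertisement
]
cleaned = [strip_non_text(post) for post in posts]
cleaned = [t for t in cleaned if is_relevant(t, "Jane Doe", KEYWORDS, NOISE_WORDS)]
print(cleaned)
```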
Here, in the indicator selection module 703, prediction indicators are constructed mainly from three aspects: netizen public opinion, media reports, and official campaign strength. Netizen public opinion is analyzed mainly from the perspective of the general public; the data comes mostly from social platforms such as Weibo, Facebook and Twitter, and political orientation is extracted from the public's own online speech. Media reports are mainly long-form text about the election and the candidates from internet media such as news sites and portal websites; they are expanded into refined indicators such as media volume, media support rate, topic support rate and media volume growth rate, which measure the guiding influence of media coverage on how voters vote. Official campaign strength is the activity and attention of a candidate, or of the candidate's party, on social media during the election period; it mainly includes the number of social media posts, the number of comments, the number of followers and the follower growth rate during the election.
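As a rough illustration of the refined media indicators named above, the sketch below computes media volume, media support rate and media volume growth rate for one candidate. The source does not define these quantities, so the definitions used here (article count, share of positive articles, relative change against a previous period) are assumptions, as are the names and sample data.

```python
def growth_rate(previous, current):
    """Relative change from the previous period (0.0 if no baseline)."""
    return (current - previous) / previous if previous else 0.0

def media_indicators(articles, candidate, previous_volume):
    """Compute assumed media indicators for one candidate.

    articles: list of dicts with 'candidate' and 'sentiment' ('pos'/'neg').
    """
    own = [a for a in articles if a["candidate"] == candidate]
    volume = len(own)
    positive = sum(1 for a in own if a["sentiment"] == "pos")
    return {
        "media_volume": volume,
        "media_support_rate": positive / volume if volume else 0.0,
        "media_volume_growth_rate": growth_rate(previous_volume, volume),
    }

# Hypothetical labelled articles for the current period.
articles = [
    {"candidate": "A", "sentiment": "pos"},
    {"candidate": "A", "sentiment": "pos"},
    {"candidate": "A", "sentiment": "neg"},
    {"candidate": "A", "sentiment": "pos"},
    {"candidate": "B", "sentiment": "neg"},
]
indicators = media_indicators(articles, "A", previous_volume=2)
print(indicators)
```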
Here, the indicator calculation module 704 includes a semantic analysis module 706, a general indicator calculation module 707 and a region-time segmentation module 708. The semantic analysis module 706 includes topic extraction 709 and positive/negative sentiment 710, and calculates sentiment orientation statistics. The region-time segmentation module 708 takes into account the influence of conditions such as time and region on the final election result. Considering changes in data volume and proximity to the voting date, the region-time segmentation module divides the election into three periods: the initial period, the peak period, and the pre-election period. According to the importance of regional data, the weight for local data volume is set to $w_1$ and the weight for non-local data volume is set to $w_2$.
Here, region is a factor that cannot be ignored in election prediction. In particular, the importance of local media reports and of local residents' speech about the candidates of their own region is self-evident, and media opinion directly or indirectly influences voters. Meanwhile, the change in data volume over a given period reflects, to some extent, the trend of attention paid to a candidate, and this change has a significant effect on the voting behavior of local voters and even on the final election result. Sentiment and semantic analysis algorithms are applied to the report data about each candidate to obtain the candidate's positive and negative sentiment orientation statistics. Time is divided into three stages (the initial period, the peak period and the pre-election period), and statistical prediction is carried out over the four dimensions of media, microblog, activity and sentiment. The support rate of each candidate is calculated as follows: [formula not reproduced]
where $w_1$ is the weight of local media report attention and social network attention and comments for candidate $i$; $w_2$ is the weight of non-local media report attention and social network attention and comments; $\|d\|$ is the number of dimensions; $C_{local(i,t,d)}$ is the local data volume of candidate $i$ in period $t$ under dimension $d$; $C_{others(i,t,d)}$ is the data volume from non-local regions; a further term (symbol not reproduced) is the data volume of all candidates contesting the election in the region; and $j \in \{1, 2, \ldots, n\}$ indicates that $n$ candidates are standing in the region. The data volume is divided by time into three stages, the initial period, the peak period and the pre-election period, $t \in \{t_1, t_2, t_3\}$. The statistics also cover four dimensions, social media, microblog, candidate activity and semantic sentiment, $d \in \{d_1, d_2, d_3, d_4\}$. The online support rate of a candidate for a single period and a single dimension is calculated as: [formula not reproduced]
Finally, the support rate of candidate $i$ in a given region is calculated as: [formula not reproduced]
Election prediction combines traditional statistics with an algorithm in which a neural network extracts netizens' sentiment toward topics related to the candidates. After a series of statistical calculations, each candidate has a netizen support count in each of the four dimensions of media, microblog, activity and sentiment. The ratio of a candidate's support count in a dimension to the total support count in that dimension is taken as the candidate's netizen support rate in that dimension. The average support rate of each candidate is then calculated as the mean of the support rates over the four dimensions. Finally, for each region, the support rates of the candidates in that region are compared, and the candidate with the highest support rate becomes the predicted winner for the region.
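The support-rate formulas themselves are not legible in this text, so the following is only a sketch of one form that is consistent with the symbol definitions and the prose above. It assumes that the per-period, per-dimension support rate of a candidate is that candidate's weighted data volume (local weighted by `w1`, non-local by `w2`) divided by the weighted volume of all candidates in the region, and that the overall support rate is the mean over all periods and dimensions. The weights, names and data are hypothetical.

```python
def weighted_volume(counts, candidate, t, d, w1, w2):
    """w1 * local volume + w2 * non-local volume for one cell."""
    local, others = counts[candidate][t][d]
    return w1 * local + w2 * others

def dimension_share(counts, candidate, t, d, w1, w2):
    """Candidate's share of the weighted volume in period t, dimension d."""
    total = sum(weighted_volume(counts, c, t, d, w1, w2) for c in counts)
    return weighted_volume(counts, candidate, t, d, w1, w2) / total if total else 0.0

def support_rate(counts, candidate, periods, dims, w1=0.7, w2=0.3):
    """Mean share over all periods and dimensions (assumed aggregation)."""
    shares = [dimension_share(counts, candidate, t, d, w1, w2)
              for t in periods for d in dims]
    return sum(shares) / len(shares)

def predict_winner(counts, periods, dims, w1=0.7, w2=0.3):
    """Return the candidate with the highest support rate and all rates."""
    rates = {c: support_rate(counts, c, periods, dims, w1, w2) for c in counts}
    return max(rates, key=rates.get), rates

# Hypothetical data: counts[candidate][period][dimension] = (local, others).
counts = {
    "A": {"t1": {"media": (30, 10), "microblog": (50, 20)},
          "t2": {"media": (40, 10), "microblog": (60, 20)}},
    "B": {"t1": {"media": (20, 10), "microblog": (30, 20)},
          "t2": {"media": (20, 30), "microblog": (40, 20)}},
}
winner, rates = predict_winner(counts, ["t1", "t2"], ["media", "microblog"])
print(winner, rates)
```

Because each cell's shares sum to 1 across candidates, the resulting support rates also sum to 1, which makes them directly comparable within a region.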
Referring to Fig. 8, which illustrates an optional embodiment of the election result prediction system: data preprocessing 801, sentence splitting 802, syntactic dependency algorithm 803, topic extraction 804, positive/negative sentiment model classification 805 and topic sentiment orientation calculation 806 are performed in sequence to calculate the topic sentiment orientation; data preprocessing 801, positive/negative sentiment model classification 805 and text sentiment orientation calculation 807 are performed in sequence to calculate the text sentiment orientation.
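The flow of the topic sentiment pipeline in Fig. 8 can be sketched in simplified form. Two substitutions are made here and should not be read as the described method: topic extraction uses plain keyword matching instead of syntactic dependency parsing, and a tiny word lexicon stands in for the trained positive/negative sentiment model. All names, word lists and sample texts are hypothetical.

```python
import re
from collections import defaultdict

POSITIVE = {"good", "great", "support"}   # assumed lexicon
NEGATIVE = {"bad", "poor", "oppose"}      # assumed lexicon

def split_sentences(text):
    """Split text into sentences on ., ! and ?."""
    return [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]

def classify(sentence):
    """Stand-in for the sentiment model: returns +1, -1 or 0."""
    words = set(re.findall(r"[a-z]+", sentence.lower()))
    score = len(words & POSITIVE) - len(words & NEGATIVE)
    return (score > 0) - (score < 0)

def topic_sentiment(texts, topics):
    """Sum sentence-level sentiment labels for each topic keyword."""
    totals = defaultdict(int)
    for text in texts:
        for sentence in split_sentences(text):
            label = classify(sentence)
            for topic in topics:
                if topic in sentence.lower():
                    totals[topic] += label
    return dict(totals)

texts = [
    "The tax plan is great. The housing policy is bad!",
    "I support the tax reform.",
]
result = topic_sentiment(texts, ["tax", "housing"])
print(result)
```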
Referring to Fig. 9, which illustrates an optional embodiment of the election result prediction system: data preprocessing 801, sentiment label acquisition 902, CNN (Convolutional Neural Network) model training 903 and model 904 are performed in sequence to build the model.
Referring to Fig. 10, which illustrates an optional embodiment of the election result prediction system: data preprocessing 801, media-related data statistics 1001, microblog-related data statistics 1002 and candidate activity calculation 1003 are performed in sequence, where the media-related data statistics 1001 include media indicators 1004, the microblog-related data statistics 1002 include microblog indicators 1005, and the candidate activity calculation 1003 includes activity indicators 1006. Finally, the candidates' election support rates are calculated over the three dimensions of social media, microblog and candidate activity to reflect the voting result.
The foregoing is only a preferred embodiment of the present invention and is not intended to limit the scope of the present invention.

Claims (10)

1. An election result prediction method, characterized in that the method comprises:
collecting election-related online speech data and performing a preprocessing operation to generate processed data;
performing indicator selection based on indicator correlation and historical data modeling to generate election-related indicators;
generating time-dimension segmentation points and region-dimension segmentation points based on the processed data, and segmenting the processed data to generate segmented data;
performing indicator calculation based on the election-related indicators, in combination with semantic analysis technology and a deep learning algorithm, to generate an indicator calculation result; and
predicting the election result based on the indicator calculation result and the segmented data.
2. The election result prediction method according to claim 1, characterized in that performing indicator selection based on indicator correlation and historical data modeling to generate election-related indicators comprises:
screening highly correlated indicators based on the correlation between indicators;
calculating the contribution of each indicator based on historical data, as a weight basis; and
generating election-related indicators based on the highly correlated indicators and the weight basis.
3. The election result prediction method according to claim 1, characterized in that performing indicator calculation based on the election-related indicators, in combination with semantic analysis technology and a deep learning algorithm, to generate an indicator calculation result comprises:
obtaining the sentiment orientation of the segmented data using a deep learning algorithm, based on the election-related indicators;
calculating the topic sentiment of the segmented data using semantic analysis technology, based on the election-related indicators; and
extracting the political orientation of the segmented data based on the sentiment orientation and the topic sentiment, to generate the indicator calculation result.
4. The election result prediction method according to claim 3, characterized in that obtaining the sentiment orientation of the segmented data using a deep learning algorithm, based on the election-related indicators, comprises:
based on the segmented data and the prediction indicator system, constructing a text sentiment classification model using a convolutional neural network algorithm from deep learning, and performing sentiment classification on text.
5. The election result prediction method according to claim 4, characterized in that calculating the topic sentiment of the segmented data using semantic analysis technology, based on the election-related indicators, comprises:
based on the segmented data and the election-related indicators, performing topic extraction on the segmented data using semantic analysis technology to obtain hot topics; and
calculating the topic sentiment of the hot topics based on the text sentiment classification model.
6. The election result prediction method according to claim 1, characterized in that generating time-dimension segmentation points and region-dimension segmentation points based on the processed data, and segmenting the processed data to generate segmented data, comprises:
based on the processed data, detecting peak changes of the processed data on the election timeline;
generating time segmentation points for the processed data according to the peak changes; and
segmenting the processed data according to the time segmentation points to generate time-segmented data.
7. The election result prediction method according to claim 6, characterized in that, after segmenting the processed data according to the time segmentation points to generate time-segmented data, the method further comprises:
setting a weight for the time-segmented data.
8. The election result prediction method according to any one of claims 1 to 7, characterized in that generating time-dimension segmentation points and region-dimension segmentation points based on the processed data, and segmenting the processed data to generate segmented data, comprises:
based on the processed data, determining the source region of the processed data;
generating region segmentation points for the processed data according to the source region; and
segmenting the processed data according to the region segmentation points to generate region-segmented data.
9. An election result prediction device, characterized in that the device comprises a processor and a memory connected to the processor through a communication bus; wherein
the memory is configured to store an election result prediction program; and
the processor is configured to execute the election result prediction program to implement the election result prediction steps according to any one of claims 1 to 8.
10. A computer storage medium, characterized in that the computer storage medium stores one or more programs, and the one or more programs can be executed by one or more processors to cause the one or more processors to perform the election result prediction steps according to any one of claims 1 to 8.
CN201910448874.6A 2019-05-28 2019-05-28 A kind of election results prediction technique, device and computer storage medium Pending CN110321407A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910448874.6A CN110321407A (en) 2019-05-28 2019-05-28 A kind of election results prediction technique, device and computer storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910448874.6A CN110321407A (en) 2019-05-28 2019-05-28 A kind of election results prediction technique, device and computer storage medium

Publications (1)

Publication Number Publication Date
CN110321407A true CN110321407A (en) 2019-10-11

Family

ID=68119314

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910448874.6A Pending CN110321407A (en) 2019-05-28 2019-05-28 A kind of election results prediction technique, device and computer storage medium

Country Status (1)

Country Link
CN (1) CN110321407A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112348257A (en) * 2020-11-09 2021-02-09 中国石油大学(华东) Election prediction method driven by multi-source data fusion and time sequence analysis

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20191011