CN110555568A - Road traffic running state real-time perception method based on social network information - Google Patents

Road traffic running state real-time perception method based on social network information

Info

Publication number
CN110555568A
Authority
CN
China
Prior art keywords
data
traffic
information
social network
layer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910861533.1A
Other languages
Chinese (zh)
Other versions
CN110555568B (en)
Inventor
陈坚
纪柯柯
庹永恒
袁晓骏
刘亦欣
袁伯龙
林东源
郭雪荣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing Jiaotong University
Original Assignee
Chongqing Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing Jiaotong University
Priority to CN201910861533.1A
Publication of CN110555568A
Application granted
Publication of CN110555568B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/26 Visual data mining; Browsing structured data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/29 Geographical information databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/088 Non-supervised learning, e.g. competitive learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/01 Social networking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/26 Government or public services

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Tourism & Hospitality (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Economics (AREA)
  • Human Resources & Organizations (AREA)
  • Strategic Management (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • General Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Development Economics (AREA)
  • Primary Health Care (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Game Theory and Decision Science (AREA)
  • Remote Sensing (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Quality & Reliability (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Educational Administration (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Operations Research (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention belongs to the technical field of real-time perception of road traffic running states, and discloses a real-time perception method of road traffic running states based on social network information. The method uses web crawler technology to automatically acquire, classify and extract multi-angle effective traffic information in a social network platform, screens, analyzes and predicts the information, establishes an end-to-end self-learning model, and marks the information visually on a map. Future data is learned and predicted from past data based on deep learning, and the predicted future data is fed back to the social network so that users can share traffic information in real time. The invention is not limited in spatial distribution, requires no ground sensing equipment to be laid or maintained, and has an obvious economic advantage. It can effectively capture sudden traffic incidents, location-specific traffic incidents, temporary traffic control, newly added traffic restrictions and traffic environment information, provide quick response signals for the relevant management departments, and provide a basis for residents' travel decisions.

Description

Road traffic running state real-time perception method based on social network information
Technical Field
The invention belongs to the technical field of real-time perception of road traffic running states, and particularly relates to a real-time perception method of road traffic running states based on social network information.
Background
At present, the closest prior art includes sensing technologies such as sensors, WSN, RFID, location sensing, yard sensing, and road and bridge sensing, together with the storage and processing of massive structured and unstructured vehicle-related data such as GPS, position, coil, video and road network data. These technologies are all mature, so the technical basis for storing and analyzing data already exists. For transmission, both short-range transmission and 3G and 4G long-range transmission are available, so the basis for transmitting large volumes of data quickly also exists.
With the continuous development of computer, internet and information technology, more and more data is generated and spread on the internet. Making full use of these data resources and mining useful information from them has become a common need of practitioners and researchers in many industries, and various industries have begun to consider how to systematically convert data with potential, innovative value and controllability into usable and valuable information.
As a convenient and efficient information carrier, the internet has become an important channel for government agencies, enterprises and the public to publish and share real-time road traffic information. This traffic information is rich in type and highly timely, can complement information acquired by other traffic information acquisition technologies, and plays an important role in government planning decisions and public travel services.
The acquisition of traditional traffic information depends entirely on hardware terminals such as sensors deployed over a large area, which inevitably exposes the limitations of terminal-based data collection, for example incomplete equipment coverage and high expansion and maintenance costs. The large amount of multi-source traffic data on social network platforms can therefore serve as an effective supplementary source for traditional traffic information collection. In particular, unexpected traffic events such as road events, construction control and traffic accidents spread rapidly on social networks and inform travelers promptly and effectively, so that people learn the location and background of an event at relatively low cost. The availability of the microblog application programming interface (API) helps to extract such information, and makes microblogs a more powerful, attractive and inexpensive tool than other social platforms.
Secondly, when a user sees the road condition states (red, yellow and green) presented in navigation software, the only information obtained is how smoothly the road is flowing: red represents congestion, yellow represents slow traffic, and green represents free flow. The underlying causes of these states and the possible duration of an accident are difficult to learn from the information presented.
At present, urban traffic condition release depends on inductive detection devices such as coils and RFID, and real-time performance and situational context are its weakest links. Research on the timely release of, and guidance around, sudden accidents and temporary control on urban roads in a social network environment therefore has important practical significance, and provides a scientific basis for government departments to implement urban traffic management and decision making.
In the prior art, research on social networks mostly focuses on their propagation rules and stays at the level of mathematical models. As big data has penetrated various industries, social networks have gradually begun to merge with other industries, and public opinion analysis and emergency detection are the fields in which social networks are used most often. In the transportation industry, many scholars have discussed the application of social networks, but this application currently stays at the conceptual level. Some research institutions have started developing related systems, but major shortcomings remain. First, because the amount of information published on a social network is limited, it cannot comprehensively cover the whole urban road network. Second, most information on social networks carries the user's emotional coloring, which biases the description of traffic conditions, so it cannot be used as a detection source for urban congestion. Finally, with the further development of social network platforms, short video apps have rapidly gained large numbers of followers, so data collected from a single social network platform is necessarily incomplete.
In summary, the problems of the prior art are as follows:
Data acquired by prior-art traffic detection equipment does not contain multi-angle information and is limited in spatial distribution; ground sensing equipment must be laid and maintained, at high economic cost. Existing guidance information cannot give travelers the possible duration of an event, so it provides no clear guidance for travel and does not help traffic management departments respond to sudden traffic incidents in time. Prior-art real-time perception methods cannot share real-time information about traffic events, cannot match it to different road traffic running states, and cannot mark it visually on a map, so changes in road traffic are reflected slowly and poorly. Traditional traffic detection equipment relies on hardware terminals such as sensors deployed over a large area, which inevitably exposes the limitations of terminal-based data collection. The prior art has difficulty capturing sudden traffic events, location-specific traffic events, temporary traffic control, newly added traffic restrictions and traffic environment information. The prior art also lacks clear information on accident handling time, so the information it provides gives travelers no clear guidance.
Disclosure of Invention
Aiming at the problems in the prior art, the invention provides a road traffic running state real-time perception method based on social network information.
The invention is realized in such a way that a road traffic running state real-time perception method based on social network information comprises the following steps:
Step one: automatically collecting, classifying and extracting multi-angle effective traffic information in a social network platform by using web crawler technology, screening out the traffic information in abnormal operation states for analysis, predicting the possible duration of an event by using a comprehensive algorithm, establishing an end-to-end self-learning model, and marking the results visually on a map to display information on road traffic changes.
Step two: learning and predicting future data from past data by means of a deep-learning-based algorithm, and feeding the predicted future data back to the social network so that users can share traffic information in real time. The direct effects of the above technical scheme are as follows. The method avoids dependence on hardware terminals such as fixed sensors, and reduces the limitations on acquisition range, update period, cost and coverage. Because information spreads quickly on social networks, every traveler can be informed promptly and effectively, and people can learn the location and background of an event. The perception technology is based on a social network big data training set; an accident duration prediction platform is established, and the accident occurrence time and occurrence probability are predicted.
Further, in step one, automatically collecting, classifying and extracting the effective traffic information in the social network platform and screening out the traffic information in abnormal operation states for analysis comprises: obtaining the relevant traffic information on the network by using whole-network information crawling, text classification and information identification technologies; introducing a credit rating system that establishes credibility levels for user information; and screening accordingly. The result of the end-to-end self-learning accident duration prediction algorithm is presented to the user in visual form on the social network platform.
The abnormal operation state information includes a general congestion level, a regulation level, and a traffic accident level.
Further, the social network platform in step one uses Hadoop as the big data platform, HDFS to store data, and Hive to analyze and preprocess data. A crawler written in Python crawls the data, Text-CNN classifies the text, Spark schedules the distributed computation, and a Web front end displays the results.
Further, in step one, automatically collecting, classifying and extracting the effective traffic information in the social network platform comprises:
Mining the keywords of effective information in the traffic information with the TF-IDF algorithm. The term frequency (TF) is multiplied by the inverse document frequency (IDF) to obtain the TF-IDF value; the larger the TF-IDF value, the more important the word. The words are sorted by TF-IDF value in descending order, and the top-ranked words are the keywords of effective information in the traffic information.
Further, the TF-IDF algorithm specifically includes:
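i) Calculating the TF value:
The number of times a word occurs in a document is divided by the total number of word occurrences in that document to obtain the term frequency TF of the word:
$$tf_{i,j} = \frac{n_{i,j}}{\sum_k n_{k,j}}$$
where $n_{i,j}$ is the number of occurrences of word $i$ in document $j$.
ii) Calculating the IDF value:
The total number of documents in the corpus is divided by the number of documents containing the word plus 1, and the logarithm of the quotient is taken:
$$idf_i = \log\frac{|D|}{1 + |\{j : t_i \in d_j\}|}$$
where $|D|$ is the total number of documents in the corpus and $|\{j : t_i \in d_j\}|$ is the number of documents containing word $t_i$.
iii) Calculating the TF-IDF value:
$$tfidf_{i,j} = tf_{i,j} \times idf_i$$
This gives the final result.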
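The three TF-IDF steps above can be illustrated with a minimal Python sketch. It assumes the posts have already been tokenized into word lists; the function names `tf_idf` and `top_keywords` and the sample posts are hypothetical and are not taken from the patent.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {word: tf-idf value} dict per tokenized document."""
    n_docs = len(documents)
    # Number of documents that contain each word at least once.
    doc_freq = Counter(word for doc in documents for word in set(doc))
    scores = []
    for doc in documents:
        counts = Counter(doc)
        total = len(doc)
        scores.append({
            # TF = count / total words; IDF = log(N / (docs containing word + 1)).
            word: (count / total) * math.log(n_docs / (doc_freq[word] + 1))
            for word, count in counts.items()
        })
    return scores


def top_keywords(score, k=3):
    """Sort words by TF-IDF value in descending order and keep the first k."""
    ranked = sorted(score.items(), key=lambda item: item[1], reverse=True)
    return [word for word, _ in ranked[:k]]


# Hypothetical tokenized posts, for illustration only.
posts = [
    ["accident", "on", "ring", "road", "accident"],
    ["road", "construction", "closed"],
    ["heavy", "rain", "road", "slow"],
    ["concert", "tonight"],
]
scores = tf_idf(posts)
keywords = top_keywords(scores[0], k=1)
```

Note that with the "+1" in the denominator, a word that appears in every document receives a slightly negative IDF, so it ranks below all other words.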
Further, in step one, predicting the possible duration of the event by using a comprehensive algorithm and establishing an end-to-end self-learning model specifically comprises:
1) Constructing a training set: traditional traffic accident duration prediction parameters are compared and analyzed, and an accident duration prediction model based on the social network is established. A prediction training set is constructed first; in addition to the basic features, key information described in the crawled data text is extracted as prediction features for the accident duration.
The accident duration is obtained, and the accident occurrence time is approximately matched with the microblog report time to obtain the labels of the training set, so that the accident occurrence time is predicted directly from microblog information. From the original traffic accident data, 5 attribute parameters related to the severity of the traffic accident are extracted: the number of vehicles involved, whether lanes are blocked, casualties, whether buses are involved, and whether trucks are involved.
2) Model construction: different types of models are incorporated into one overall model system by means of a combined model.
3) Displaying the accident duration prediction.
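As a rough illustration of steps 1) and 2), the sketch below represents the five attribute parameters as a feature vector, derives a duration label from two report times, and combines two models by weighted averaging. Everything beyond the five attributes is an assumption made for illustration: the class and function names, the use of a clearance report time, the two toy models and their coefficients, and weighted averaging as the combination rule are not specified by the patent.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class AccidentSample:
    """The five severity attributes named in the text."""
    vehicles_involved: int
    lanes_blocked: bool
    casualties: int
    bus_involved: bool
    truck_involved: bool

    def as_vector(self):
        return [float(self.vehicles_involved), float(self.lanes_blocked),
                float(self.casualties), float(self.bus_involved),
                float(self.truck_involved)]


def duration_label(reported_at, cleared_at):
    """Label in minutes, using the report time as a proxy for the occurrence time."""
    return (cleared_at - reported_at).total_seconds() / 60.0


def linear_model(x):
    # Hypothetical coefficients, not fitted to any data.
    return 20.0 + 10.0 * x[0] + 15.0 * x[1] + 12.0 * x[2] + 5.0 * x[3] + 8.0 * x[4]


def rule_model(x):
    # Hypothetical rule: blocked lanes imply a longer clearance time.
    return 90.0 if x[1] else 30.0


def combined_prediction(models, weights, features):
    """Weighted average of the individual model outputs (assumed combination rule)."""
    return sum(w * m(features) for m, w in zip(models, weights)) / sum(weights)


sample = AccidentSample(3, True, 1, False, True)
label = duration_label(datetime(2019, 9, 11, 8, 0), datetime(2019, 9, 11, 9, 30))
prediction = combined_prediction([linear_model, rule_model], [0.5, 0.5],
                                 sample.as_vector())
```

In practice the member models would be trained on the labeled set rather than hand-written, and the weights could themselves be tuned on held-out data.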
Further, in step two, the deep-learning-based algorithm learns to predict future data from past data without adding any manual labels.
Another object of the present invention is to provide a system for real-time perception of road traffic running state based on social network information, the system comprising:
A data layer, connected with the Hadoop big data platform and the data preprocessing layer, which holds all the data supporting the operation of the whole system.
A data preprocessing layer, connected with the data layer, the Hadoop big data platform and the analysis processing layer, which refines rough and unrelated data and supplies the result.
An analysis processing layer, connected with the data preprocessing layer, the Hadoop big data platform and the visualization layer, which analyzes and processes data and provides an interface for the application terminal.
A visualization layer, connected with the analysis processing layer, which provides an interface for the application terminal.
A Hadoop big data platform, connected with the data layer, the data preprocessing layer and the analysis processing layer, which provides parallel data processing support for the whole platform.
Furthermore, the data layer holds data shared by various map navigation platform providers and data with statistical value, which the data preprocessing layer uses and analyzes.
The data preprocessing layer also cleans the data acquired by the data fusion layer: it fills missing data, eliminates abnormal data, smooths noisy data and corrects inconsistent data, thereby removing noise and handling null values, missing values and inconsistencies. The data is then standardized so that all indicators are of the same order of magnitude. A conventional relational database is combined with a Hive database for distributed storage and management of big data.
The analysis processing layer uses machine learning to make predictions or decisions on the data, with models constructed from sample input.
The visualization layer displays abnormal traffic conditions through the visualization module.
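The cleaning and standardization performed by the data preprocessing layer can be sketched as follows. This is only an illustrative example: the patent does not say how missing values are filled or which standardization is used, so mean imputation and z-score standardization are assumptions here, and the function names are hypothetical.

```python
from statistics import mean, pstdev


def fill_missing(values):
    """Replace missing entries (None) with the mean of the known entries."""
    known = [v for v in values if v is not None]
    fill = mean(known)
    return [fill if v is None else v for v in values]


def standardize(values):
    """Z-score standardization, so that indicators share a common scale."""
    center = mean(values)
    spread = pstdev(values)
    if spread == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - center) / spread for v in values]


# Hypothetical indicator column with one missing value.
raw = [10.0, None, 30.0]
cleaned = fill_missing(raw)
scaled = standardize(cleaned)
```

Min-max scaling would be an equally plausible reading of "the same order of magnitude"; the choice depends on the downstream model.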
The invention further aims to provide a terminal that implements the real-time perception method of road traffic running state based on social network information.
The invention has the advantages and positive effects that:
The system automatically collects, extracts and classifies effective traffic information in the social network. Compared with data collected by traditional traffic detection equipment, the data used by the system contains information from more angles, is not limited in spatial distribution, and requires no ground sensing equipment to be laid or maintained, which gives an obvious economic advantage and strong support for traffic data collection. The system makes full use of the real-time information sharing, cause analysis and severity descriptions that people post about traffic events on the social network, matches them to different road traffic running states, and marks them visually on a map, thereby showing the underlying causes of road traffic changes. For the first time, descriptive information about traffic accidents is extracted from the social network and matched with traditional traffic accident duration prediction parameters to establish an end-to-end self-learning model, in which the neural network learns to predict future data from past data through unsupervised learning. The system is of practical significance for road sections affected by unexpected events such as accidents, temporary construction and traffic control: it informs travelers in advance of abnormal changes in the road network and guides their travel. The system makes full use of traffic information shared in real time on the social network to help travelers make decisions; when no other, more accurate traffic information is available, the traffic information in the social network becomes an important basis for decisions and helps travelers make better choices.
As a real-time perception system for urban roads, the system studies the urban road traffic running state based on the social network. It uses the massive traffic data in the social network and a machine learning algorithm to acquire the road traffic running state more accurately and promptly and to predict its possible duration, and uses the efficient transmission of the network to release this information to navigation systems at the first opportunity. As a supplement to the current ITS detection configuration, it reduces the impact of road emergencies on the normal operation of urban traffic and reduces the inconvenience and loss to urban residents. The system is advanced and scientific, mainly in the following aspects.
The idea is innovative: the invention develops a social-network-based real-time perception system for the urban road traffic running state. The system automatically collects, classifies and extracts effective traffic information in the social network platform, screens out social information in abnormal operation states, analyzes the causes of the abnormality, predicts the possible duration of events by using a comprehensive algorithm, and marks the results visually on a map, so that the impact of abnormal road traffic running states on urban traffic is reduced, along with the inconvenience and loss to urban residents. In the age of multi-source data fusion, the system effectively complements traditional real-time traffic information acquisition technologies such as fixed sensors and floating cars, and promptly captures sudden traffic events, location-specific traffic events, temporary traffic control, newly added traffic restrictions and traffic environment information.
The system integrates existing social network platforms, develops a whole-network information crawling technology to comprehensively acquire the relevant traffic information on the network, and introduces a credit rating system that establishes credibility levels for user information, on which a series of screening steps is based. The invention uses the traffic information to judge the abnormal operation state of the urban road network, namely the general congestion level, the control level and the traffic accident level, and develops an end-to-end self-learning accident duration prediction algorithm whose results are presented to the user in visual form in the system to guide reasonable travel choices. The system combines web crawler, text classification and information identification technologies, is based on data from advanced social network platforms, and is innovative in both theory and method.
The comprehensiveness of the technology: the system uses Hadoop as the big data platform, HDFS to store data, Hive for data analysis and preprocessing, a crawler written in Python for data crawling, Text-CNN for text classification, Spark for distributed computation scheduling, and a Web front end for display.
Text classification is based on a convolutional neural network (CNN), which makes use of the information contained in the order of words. The CNN model takes the original text as input and does not need many hand-crafted features, which gives it an advantage over traditional methods. An end-to-end learning approach is established: neural networks are trained by either simulation or reinforcement learning and generate actions from traditional underlying input data, and no human labels are required at any stage of the process. In unsupervised learning, the neural network learns to predict future data from past data without any manual labels. The method can greatly improve the accuracy and efficiency of accident duration prediction and reduce the manual workload involved in prediction.
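To make the CNN-based text classification concrete, here is a minimal forward pass of a Text-CNN-style classifier in plain Python: kernels spanning several consecutive word vectors are slid over the sentence matrix, each response is max-pooled over time, and a dense layer with softmax produces class probabilities. The weights are random and untrained, the dimensions and kernel widths are arbitrary assumptions, and the three class names simply reuse the abnormal-state levels named in the text; a real system would use a trained model in a deep learning framework.

```python
import math
import random


def conv_relu(sentence, kernel, bias):
    """Slide one kernel over consecutive word vectors and apply ReLU."""
    width = len(kernel)  # number of words the kernel spans
    features = []
    for start in range(len(sentence) - width + 1):
        window = sentence[start:start + width]
        total = bias
        for kernel_row, word_vector in zip(kernel, window):
            total += sum(w * x for w, x in zip(kernel_row, word_vector))
        features.append(max(0.0, total))
    return features


def softmax(logits):
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    norm = sum(exps)
    return [e / norm for e in exps]


def text_cnn_forward(sentence, kernels, biases, dense_weights, dense_bias):
    """Sentence must contain at least as many words as the widest kernel."""
    # Max-over-time pooling keeps the strongest response of each kernel.
    pooled = [max(conv_relu(sentence, k, b)) for k, b in zip(kernels, biases)]
    logits = [sum(w * p for w, p in zip(row, pooled)) + b
              for row, b in zip(dense_weights, dense_bias)]
    return softmax(logits)


rng = random.Random(0)
EMBED_DIM, N_WORDS = 4, 6  # arbitrary sizes for the sketch
CLASSES = ["general congestion", "control", "traffic accident"]


def rand_matrix(rows, cols):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


sentence = rand_matrix(N_WORDS, EMBED_DIM)  # stand-in for word embeddings
kernels = [rand_matrix(width, EMBED_DIM) for width in (2, 3, 4)]
biases = [0.0, 0.0, 0.0]
dense_weights = rand_matrix(len(CLASSES), len(kernels))
dense_bias = [0.0] * len(CLASSES)
probs = text_cnn_forward(sentence, kernels, biases, dense_weights, dense_bias)
predicted = CLASSES[probs.index(max(probs))]
```

Using several kernel widths lets the model respond to phrases of different lengths, which is how the word order information mentioned above enters the classifier.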
The advantages of the invention further include:
The method comprises the steps of automatically acquiring, classifying and extracting multi-angle effective traffic information in a social network platform by utilizing a web crawler technology, screening out the social information in an abnormal operation state for analysis, predicting the possible duration time and occurrence probability of an event by utilizing a comprehensive algorithm, establishing an end-to-end self-learning model, and carrying out visual marking on a map; displaying information of road traffic change; the future data are learned and predicted from the past data based on deep learning, and the predicted future data are fed back to the social network for users to share the traffic information in real time. The data adopted by the invention contains information of more angles, the spatial distribution is not limited, ground sensing equipment does not need to be laid and maintained, the economic advantage is obvious, and powerful support is provided for the acquisition of traffic data. The system can effectively capture sudden traffic events, specific-place traffic events, temporary traffic control, newly added traffic restrictions and traffic environment information, provide quick response signals for relevant management departments, and provide a travel decision basis for residents.
Drawings
Fig. 1 is a structural diagram of a real-time perception system for road traffic operation status based on social network information according to an embodiment of the present invention.
In the figure: 1. data layer; 2. data preprocessing layer; 3. analysis processing layer; 4. visualization layer; 5. Hadoop big data platform.
Fig. 2 is a schematic diagram of a road traffic running state real-time perception system based on social network information according to an embodiment of the present invention.
Fig. 3 is a flowchart of data collection in a social networking platform according to an embodiment of the present invention.
Fig. 4 is a diagram of data mining processing provided by an embodiment of the present invention.
Fig. 5 is a system block diagram provided by an embodiment of the invention.
Fig. 6 is a diagram of data layers provided by an embodiment of the invention.
Fig. 7 is a diagram of a data preprocessing layer provided by an embodiment of the invention.
Fig. 8 is a diagram of a data analysis layer provided by an embodiment of the invention.
Fig. 9 is a diagram of the interface and visualization layers provided by an embodiment of the invention.
FIG. 10 is a Hadoop architecture diagram provided by an embodiment of the present invention.
Fig. 11 is a diagram of a microblog crawling process provided by the embodiment of the invention.
Fig. 12 is a flowchart for calculating text similarity according to an embodiment of the present invention.
FIG. 13 is a diagram of two forms of Word2Vec provided by an embodiment of the present invention.
FIG. 14 is a sentence matrix diagram provided by an embodiment of the present invention.
Fig. 15 is a diagram of convolution operations provided by an embodiment of the present invention.
FIG. 16 is a diagram of a pooling operation provided by an embodiment of the present invention.
Fig. 17 is a diagram of a classification process provided by an embodiment of the invention.
Fig. 18 is a flow chart of accident time prediction provided by an embodiment of the present invention.
FIG. 19 is a diagram showing an algorithm model provided by an embodiment of the present invention.
Fig. 20 is a data processing flow diagram provided by an embodiment of the present invention.
Fig. 21 is a platform data visualization display diagram provided in the embodiment of the present invention.
Fig. 22 is a platform data visualization display diagram provided by the embodiment of the invention.
Detailed Description
In order to further understand the contents, features and effects of the present invention, the following embodiments are illustrated and described in detail with reference to the accompanying drawings.
Data acquired by traffic detection equipment in the prior art does not contain information of more angles, the spatial distribution is limited, ground sensing equipment needs to be laid and maintained, and the economic cost is high.
The road traffic running state real-time perception methods in the prior art cannot share real-time information about traffic events, and cannot match different road traffic running states and mark them visually on a map, so changes in road traffic are reflected slowly and poorly. Their use in supporting travelers' decisions through social networks is limited.
To solve the above problems, the following describes the technical solution of the present invention in detail with reference to the accompanying drawings.
As shown in fig. 1, the system for sensing a road traffic running state in real time based on social network information provided by the embodiment of the present invention includes: the system comprises a data layer 1, a data preprocessing layer 2, an analysis processing layer 3, a visualization layer 4 and a Hadoop big data platform 5.
The data layer 1 is connected with the Hadoop big data platform 5 and the data preprocessing layer 2, and holds all data supporting the operation of the whole system.
The data preprocessing layer 2 is connected with the data layer 1, the Hadoop big data platform 5 and the analysis processing layer 3, and performs basic refinement of coarse, unrelated large-scale data for the algorithms to call.
The analysis processing layer 3 is connected with the data preprocessing layer 2, the Hadoop big data platform 5 and the visualization layer 4, and analyzes and processes data and provides an interface for the application terminal.
The visualization layer 4 is connected with the analysis processing layer 3 and provides an interface for the application terminal.
The Hadoop big data platform 5 is connected with the data layer 1, the data preprocessing layer 2 and the analysis processing layer 3, and provides parallel data processing support for the whole platform.
Data layer 1 provided by the invention: the data layer is used as the bottom layer of the whole framework, the main part of the data layer comprises data shared by various map navigation platform providers, scientific research data with statistical research value and the like, and the data can be used as basic data for upper layers to use and analyze.
The data preprocessing layer 2 provided by the invention: the data preprocessing layer first cleans the data acquired by the data fusion layer by filling in missing values, eliminating abnormal data, smoothing noisy data and correcting inconsistent data. Then, to eliminate the influence of differing dimensions among the indexes, the data are standardized so that the indexes become comparable; after standardization, all indexes are on the same order of magnitude and are suitable for comprehensive comparison and evaluation. Meanwhile, common relational databases (MySQL, Oracle) are combined with a Hive database for distributed storage and management of big data. Hive is a data warehouse infrastructure built on Hadoop. It provides a set of tools for data extraction, transformation and loading (ETL), and a mechanism for storing, querying and analyzing large-scale data stored in Hadoop. Hive defines a simple SQL-like query language called HQL that allows users familiar with SQL to query data. The language also allows developers familiar with MapReduce to write custom mappers and reducers for complex analysis that the built-in mappers and reducers cannot handle.
Analysis processing layer 3 provided by the present invention: machine learning is a sub-field of computer science that enables computers to learn without being explicitly programmed. Having evolved from pattern recognition and computational learning theory in artificial intelligence research, machine learning explores the study and construction of algorithms that can learn from data and make predictions on it. Rather than following strictly static program instructions, such algorithms build a model from sample inputs and make data-driven predictions or decisions.
The visualization layer 4 provided by the invention: the API provides an interface through which the application terminal directly calls the algorithms, and other applications can access all the algorithms and data of the system through this API. The visualization module displays pathological traffic conditions on the road.
The Hadoop big data platform 5 provided by the invention: Hadoop is a distributed system infrastructure developed by the Apache Foundation. Users can develop distributed programs without knowing the underlying distributed details, making full use of the cluster for high-speed computation and storage. Hadoop was designed for high reliability, high scalability, high fault tolerance and high efficiency; these advantages made it popular with companies as soon as it appeared and attracted wide attention from the research community. To date, Hadoop has been widely used in the internet field.
In the embodiment of the present invention, the data layer 1 first holds all data supporting the operation of the whole system. The data preprocessing layer 2 performs basic refinement of coarse, unrelated large-scale data for the algorithms to call. The analysis processing layer 3 analyzes and processes the data and provides an interface for the application terminal. The visualization layer 4 then provides an interface for the application terminal. Finally, the Hadoop big data platform 5 provides parallel data processing support for the whole platform.
A schematic diagram of a road traffic running state real-time perception system based on social network information provided by the embodiment of the invention is shown in fig. 2.
The invention is further described below with reference to specific examples and analyses.
Examples
1. The current situation is as follows: although the real-time traffic information acquisition technology based on the fixed sensor or the floating vehicle can effectively sense the traffic state of the road, the technology is difficult to effectively capture sudden traffic events, specific-place traffic events, temporary traffic control, newly added traffic restrictions and traffic environment information, and certain application limitations still exist in the aspects of acquisition range, update period, cost, coverage and the like. Currently, social network media represented by micro blogs have become an important channel for public information sharing. Under the background of geographic information generalization, real-time traffic information contained in social network media is collected by utilizing a text mining technology, so that the defects of the existing traffic information collection means are overcome, and the method has important practical significance.
1.1 social network: a social network builds on traditional social network services, with internet technology as its support and real social relationships as its basis. It makes information exchange between people more convenient and provides high-quality network services anytime and anywhere. Taking the microblog as an example, its like, repost and comment functions quickly amplify an individual's voice into the social space and turn personal behavior into social behavior.
Social network services can be divided into 21 types according to their functional attributes, including instant messaging (WeChat, QQ), online video (Douyin), quick information sharing (microblog) and the like. On a social network, people can interact with other users, obtain the latest information, voice opinions, and take part in the propagation of information by forwarding messages, commenting on each other's posts and so on. These platforms have large user bases: as shown in Table 1, every mainstream social network platform has more than one hundred million users and continues to grow, which provides a large amount of valuable data for the real-time road state analysis of the invention.
Table 1. Number of users and usage rate of selected social network platforms, 2017.12-2018.06
1.2 traffic accident time prediction: traffic accidents often cause enormous social costs, and many traffic experts have worked on accident prediction analysis in order to discover the factors behind the time of occurrence and the likely duration. Judging from current research at home and abroad, accident duration prediction falls mainly into two categories: (1) statistical methods; (2) artificial intelligence methods. Statistical methods focus mainly on the probability distribution of traffic accident duration. Golob et al. analyzed the duration of truck accidents in California and found that, when classified by collision type, accident durations follow a log-normal distribution. Ozbay and Kachroo concluded that the durations of events of the same type and severity follow a normal distribution. Regression models are also often used to predict accident duration, but their accuracy is low. Nam and Mannering therefore built a hazard-based duration model on two years of data from Washington to predict accident duration, which explains well the relationship between how long an accident has lasted and the probability that it ends. Alkaabi and Chung used hazard-based duration models and analyzed the factors that affect accident duration.
Artificial intelligence methods are based mainly on the study and application of intelligent algorithms. Heet et al., Smith, and Ozbay and Kachroo were the first to use decision trees to predict the duration of a traffic accident. Alkaabi et al. pointed out that decision tree methods do not require knowledge of the probability distribution of event durations. Ozbay and Noyan described the shortcomings of decision trees, which sometimes become unstable and are insensitive to the randomness of the data. As theoretical research has deepened, Bayesian networks (BN), artificial neural networks (ANN), genetic algorithms (GA), support vector machines and other methods have also been applied to accident duration prediction.
1.3 application of social network: with the rapid development of social networking sites, more and more people are willing to post own opinions on social media, so that social media data becomes an important data source and better reflects various activities in the real world. Combining social media data with different domain knowledge may study and mine different information, such as studying activity patterns in different areas, detecting land types of cities. The method can be used for detecting the occurrence of disaster events and knowing the occurrence condition of the events by combining with knowledge related to disaster emergencies.
The social network is an important data source for social traffic research. Qiao et al. pointed out that microblog messages can effectively supplement traffic detection sensors such as loop coils and video, and can be used to locate traffic congestion in time. Zeng et al. pointed out that social media information can provide traffic warning signals and road condition forecasts. Endarnoto et al. developed a Twitter traffic information acquisition system and designed an Android phone application for displaying traffic information. Gu et al. developed a social media-based traffic incident detection system and applied it in two cities. Zheng Zhihao et al. used a support vector machine algorithm to collect traffic information from microblogs and marked the collected information on a map. Xiong et al. proposed an intelligent transportation system framework based on a cyber-physical-social system and described the operation mechanism of the social transportation system in detail.
1.4 The arrival of the Web 2.0 era means that people have entered a period in which virtual networks and real society promote each other. More and more data are generated and spread on the Internet, and social network platforms record the details of people's daily lives, holding huge volumes of data that include a large amount of traffic data. Traffic accidents and congestion often cause huge losses to a whole city. Data from social networks can effectively supplement traditional traffic detection sources and, because it propagates in a timely, fast and effective way, can be released to different groups in society.
Many urban emergencies occur at uncertain places and times. Traditional traffic detection equipment can hardly detect and identify them automatically, and manual investigation is both labor-intensive and slow to respond. Predicting traffic accident time accurately and promptly, and releasing it to travelers at the first opportunity, helps them choose better travel modes and routes. At present, accident time prediction relies mainly on statistical methods and intelligent algorithms. Statistical methods assume that accident time follows a normal distribution, and their accuracy is difficult to guarantee. Intelligent algorithms such as neural networks, support vector machines and genetic algorithms predict time more accurately, but current research applies them only to detection data from traditional traffic equipment.
2. Social network information: the social network information referred to in the present invention is specifically online social network information, which arose with the appearance of online social networking services (SNS). Online social services can be roughly divided into four categories: instant messaging applications (QQ, WeChat, WhatsApp, Skype and the like), online social applications (QQ Zone, Renren, Facebook, Google and the like), microblog applications (Sina Weibo, Tencent Weibo, Twitter and the like), and shared space applications (forums, blogs, video sharing, review sharing and the like).
2.1) social networking information: social networking information can be divided into six broad categories from the content scope: academic information, educational information, government information, traffic information, cultural information, and harmful and illegal information.
From the degree of availability of social networking information resources and the security level classification of social networking information resources, it can be divided into three categories: information resources that are fully disclosed: such information resources are available to each user, such as content posted by various types of social networking sites and information available through free registration, among others. Information resources that are semi-public: such information resources can be obtained conditionally, for example, after some social platforms are registered, information resources which are valuable or meet personal needs can be obtained only by paying a certain fee, and the like. Information resources (confidential information resources) that are not disclosed to the outside: this kind of information resources are only provided for limited advanced users with certain usage rights, such as confidential information and information exchanged via network inside various military agencies and multinational companies. The social network information belongs to completely public information resources, and any individual can collect and analyze the information.
2.2) traffic information in social network information: the invention classifies the traffic information in the social network into two categories according to the property of the traffic information and the timeliness of the traffic information.
(1) First, the nature of traffic information: from the perspective of the nature of traffic information, the traffic information in the social network can be divided into descriptive information and predictive information.
First, descriptive information: objective descriptions of traffic network conditions in the social network, including traffic accidents, congestion, queue lengths, weather conditions and the like, together with individuals' attitudes and behavior toward those conditions.
Second, predictive information: forecasts of future road network conditions that some official public accounts make from previously collected road network conditions and publish through a social network platform to guide people's travel.
(2) Timeliness of traffic information: from the perspective of timeliness of traffic information, the traffic information in the social network may also be divided into static traffic information and dynamic traffic information.
i) Static traffic information in the social network can be defined as information that authorities publish or plan on a social network platform and that changes relatively infrequently; most of it is pre-trip information. It includes: First, road construction and maintenance information: road construction generally reduces the number of available lanes and affects traffic flow, so travelers who know about construction in advance can decide whether to travel and which route to take. Second, road toll information: congestion charging information, toll-gate charging information, etc. Third, public transportation information: bus stop boards give the service hours, departure frequency and stops of buses, and the same information is available through mobile phone apps. Fourth, basic restrictions on vehicles: road restrictions include height and weight limits, speed limits, hazardous material restrictions, etc.
ii) Dynamic traffic information in the social network is defined as the most current information available from the social network platform at a specific point in time. Unlike static traffic information, dynamic information can be released before a trip and can also be acquired en route. Real-time traffic information helps travelers make real-time travel choices according to actual road network conditions. The dynamic traffic information travelers need includes: First, traffic accident information, including the cause of the accident, its progress, and congestion at the accident site. Second, information on alternative travel routes and modes, especially when a road is closed or a bus line is affected. Third, weather information that may affect travel, such as fog and sandstorms. Fourth, the corresponding road prediction information and the like provided by publishing terminals.
The system divides the traffic information in the traffic network into four types: traffic control, construction control, road congestion and traffic accidents. These four types are defined as the pathological road state, so that a deep learning algorithm can be trained on the data to identify the required text data.
iii) Extracting traffic information from the social network: social network platforms such as Twitter and microblogs have become popular communication tools among Internet users. Millions of messages appear on them every day, in which authors write about their lives, share their opinions and experiences, and discuss current issues. These platforms set a low threshold for content, impose no limits of time or place, and support real-time interaction, which makes communication convenient and lets people obtain news and knowledge from around the world faster. The system uses data from the social network platform as its data source: a crawler written in Python crawls, in real time, traffic data on urban road congestion, traffic control, accidents and the like for analysis and processing, and the cleaned data serve as the subsequent experimental data. The specific flow is shown in Fig. 3.
Massive unstructured data in a social network platform cannot be processed by traditional methods, so data mining is adopted: a process of automatically discovering useful information that combines traditional data analysis methods with sophisticated algorithms for processing massive data. Data mining is an indispensable part of knowledge discovery in databases (KDD), the overall process of converting raw data into useful information, as shown in Fig. 4.
Finally, the data are stored both as JSON (unstructured storage) and as CSV (structured storage), so that visualization software can later be used to analyze them. Records are stored as key-value pairs to facilitate subsequent reading.
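The two storage forms can be sketched as follows. This is a minimal illustration using only the Python standard library; the record layout and field names (`post_id`, `reposts` and so on) are hypothetical and are not specified by the source.

```python
import csv
import io
import json

# Hypothetical record layout; the field names are illustrative assumptions.
FIELDS = ["post_id", "user_id", "text", "created_at", "reposts", "comments", "likes"]


def to_json_lines(records):
    """Serialize each record as one JSON object (key-value pairs) per line."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)


def to_csv(records):
    """Serialize records as CSV with a fixed header so columns stay aligned."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    for r in records:
        # Missing keys become empty cells instead of raising an error.
        writer.writerow({k: r.get(k, "") for k in FIELDS})
    return buf.getvalue()


records = [
    {"post_id": "1", "user_id": "u1",
     "text": "Accident on the ring road, two lanes closed",
     "created_at": "2019-09-11 08:30", "reposts": 3, "comments": 1, "likes": 5},
]
json_text = to_json_lines(records)
csv_text = to_csv(records)
```

Writing one JSON object per line keeps each post independently parseable, while the CSV form loads directly into spreadsheet or visualization tools.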
3. The real-time analysis system for urban road traffic operation conditions comprises:
3.1) overall architecture: the method mainly comprises the steps of data preprocessing, data marking, model building and urban road state analysis. Training is carried out through social network platform data, and finally displaying is carried out on a map, so that residents are induced to carry out travel behavior selection, urban congestion conditions are improved, and road traffic efficiency is improved.
To realize the above functions, the scheme designs the system into a 5-layer model according to the functions, as shown in fig. 5:
The first layer is a data layer, which contains all the data that supports the operation of the entire system.
The second layer is a data preprocessing layer, which performs basic refinement of coarse, unrelated large-scale data for the algorithms to call.
The third layer is an analysis processing layer for analyzing and processing data and providing an interface for the application terminal. The fourth layer is a visualization layer and provides an interface for the application terminal.
And the fifth layer is a Hadoop big data platform and provides data processing parallelization support for the whole platform.
3.1.1) data layer: the data fusion layer is the bottom layer of the whole framework. It mainly comprises data shared by various map navigation platform providers, scientific research data with statistical research value and the like, which serve as basic data for the upper layers to use and analyze, as shown in Fig. 6.
3.1.2) data preprocessing layer: the data preprocessing layer first cleans the data acquired by the data fusion layer by filling in missing values, eliminating abnormal data, smoothing noisy data and correcting inconsistent data. Then, to eliminate the influence of differing dimensions among the indexes, the data are standardized so that the indexes become comparable; after standardization, all indexes are on the same order of magnitude and are suitable for comprehensive comparison and evaluation. Meanwhile, common relational databases (MySQL, Oracle) are combined with a Hive database for distributed storage and management of big data; the specific framework is shown in Fig. 7.
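The cleaning and standardization steps can be illustrated with a small sketch. The source does not say which imputation or standardization method is used, so mean imputation, min-max scaling and z-score scaling are assumptions chosen here as common options; the function names are hypothetical.

```python
from statistics import mean, pstdev


def fill_missing(values):
    """Replace None entries with the mean of the observed values (assumed strategy)."""
    observed = [v for v in values if v is not None]
    m = mean(observed)
    return [m if v is None else v for v in values]


def min_max(values):
    """Rescale values to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def z_score(values):
    """Center on the mean and scale by the population standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Example: repost counts with one missing entry.
filled = fill_missing([3, None, 9, 6])
scaled = min_max(filled)
standardized = z_score(filled)
```

After either scaling step, indexes measured in different units (for example repost counts and comment counts) lie on a comparable scale.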
Hive is a data warehouse infrastructure built on Hadoop. It provides a set of tools for data extraction, transformation and loading (ETL), and a mechanism for storing, querying and analyzing large-scale data stored in Hadoop. Hive defines a simple SQL-like query language called HQL that allows users familiar with SQL to query data. The language also allows developers familiar with MapReduce to write custom mappers and reducers for complex analysis that the built-in mappers and reducers cannot handle.
3.1.3) data analysis layer: machine learning is a sub-field of computer science that enables computers to learn without being explicitly programmed. Having evolved from pattern recognition and computational learning theory in artificial intelligence research, machine learning explores the study and construction of algorithms that can learn from data and make predictions on it. Rather than following strictly static program instructions, such algorithms build a model from sample inputs and make data-driven predictions or decisions, as shown in Fig. 8.
3.1.4) interface and visualization layer: the API provides an interface through which the application terminal directly calls the algorithms, and other applications can access all the algorithms and data of the system through this API. The visualization module displays pathological traffic conditions, as shown in Fig. 9.
3.1.5) Hadoop big data platform: Hadoop is a distributed system infrastructure developed by the Apache Foundation. Users can develop distributed programs without knowing the underlying distributed details, making full use of the cluster for high-speed computation and storage. Hadoop was designed for high reliability, high scalability, high fault tolerance and high efficiency; these advantages made it popular with companies as soon as it appeared and attracted wide attention from the research community. Hadoop has been widely used in the internet field, and its architecture diagram is shown in Fig. 10.
3.2) text preprocessing: text features, also called NLP features, require natural language processing knowledge and tools to extract. This part is time-consuming, mainly because the data volume is large and noisy, which makes text preprocessing challenging. Owing to limits of time and computing resources, features are extracted in two aspects, text similarity and sentiment tendency, as detailed below.
3.2.1) text training set construction: the method crawls data through the Weibo open interface using Python, capturing microblog IDs, microblog texts, user IDs, user names, repost counts, comment counts, like counts, times, repost details, comment details and like details. The specific flow is shown in Fig. 11.
3.2.2) text similarity feature: the text similarity uses the Jieba word segmentation and the TF-IDF algorithm, and finally the cosine similarity is used for calculating the similarity between two microblogs. The main algorithm flow is shown in fig. 12.
3.2.2.1) Jieba word segmentation: Jieba combines a word segmentation method based on string matching with a statistics-based method. The main algorithm flow comprises the following three steps:
First, a local dictionary is loaded. It contains about 300,000 entries, each with its occurrence count and part of speech, and is used to build a trie.
Second, for a sentence to be segmented, rules are used to extract runs of consecutive Chinese and English characters, which are cut into a list of phrases. For each phrase a directed acyclic graph (DAG) is built and dynamic programming is used to find the maximum-probability path, giving the most probable segmentation based on word frequency.
Third, characters not found in the dictionary are segmented with an HMM model, which recognizes new (out-of-dictionary) words using the Viterbi algorithm.
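The second step (DAG construction plus dynamic programming over word frequencies) can be sketched in a few lines. This is not Jieba's actual code: the toy dictionary and its frequencies are invented for illustration, a plain dict stands in for the trie, and the HMM step for unknown words is omitted (unknown characters simply fall back to single-character words).

```python
import math

# Toy dictionary with made-up frequencies (assumption for illustration only).
FREQ = {
    "道路": 50, "交通": 80, "事故": 60, "交通事故": 40,
    "道": 5, "路": 5, "交": 3, "通": 3, "事": 3, "故": 2,
}


def build_dag(sentence, freq):
    """For each start index, list the end indices of dictionary words starting there."""
    n = len(sentence)
    dag = {}
    for i in range(n):
        ends = [j for j in range(i, n) if sentence[i:j + 1] in freq]
        dag[i] = ends or [i]  # unknown character: treat it as a one-character word
    return dag


def segment(sentence, freq=FREQ):
    """Return the maximum-probability segmentation under a unigram model."""
    n = len(sentence)
    log_total = math.log(sum(freq.values()))
    dag = build_dag(sentence, freq)
    # route[i] = (best log-probability of sentence[i:], end index of its first word)
    route = {n: (0.0, 0)}
    for i in range(n - 1, -1, -1):
        route[i] = max(
            (math.log(freq.get(sentence[i:j + 1], 1)) - log_total + route[j + 1][0], j)
            for j in dag[i]
        )
    words, i = [], 0
    while i < n:
        j = route[i][1]
        words.append(sentence[i:j + 1])
        i = j + 1
    return words


result = segment("道路交通事故")  # "road traffic accident"
```

With these toy frequencies the path with fewer, longer words wins, because every extra word multiplies in another probability smaller than one.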
3.2.2.2) TF-IDF algorithm: TF-IDF (term frequency-inverse document frequency) is a technique for mining keywords in articles. It is relatively simple and efficient, and is often used for initial text data cleaning.
TF-IDF consists of "term frequency" (abbreviated TF) and "inverse document frequency" (abbreviated IDF). Multiplying these two parts together yields the TF-IDF value. The larger the TF-IDF of a word in an article, the more important the word is to that article, so the invention calculates the TF-IDF values of all words in the article and sorts the words from large to small by value; the words at the top are the keywords of the article.
The algorithm comprises the following steps:
i) For words inside the document, the TF value is first calculated:
The term frequency (TF) of a word in an article is the number of times the word appears in the article divided by the total number of words in the article, as given by formula (1), where $n_{i,j}$ is the number of occurrences of word $t_i$ in document $d_j$:
$\mathrm{tf}_{i,j} = \dfrac{n_{i,j}}{\sum_k n_{k,j}}$ (1)
ii) calculating the IDF value:
The IDF is obtained by dividing the total number of documents in the corpus by the number of documents containing the word plus 1, and then taking the logarithm, as given by formula (2), where $|D|$ is the total number of documents:
$\mathrm{idf}_i = \log \dfrac{|D|}{1 + |\{ j : t_i \in d_j \}|}$ (2)
iii) calculating the TFIDF value:
$\mathrm{tfidf}_{i,j} = \mathrm{tf}_{i,j} \times \mathrm{idf}_i$ (3)
The final result is obtained.
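The three steps can be sketched in a few lines of Python. This is a minimal sketch that assumes the documents have already been segmented into word lists; the function names and the toy corpus are illustrative. Note that with the "+1" in the denominator, a word that appears in every document receives a slightly negative IDF.

```python
import math
from collections import Counter


def tf(doc):
    """Step i: count of each word divided by the document's word total."""
    counts = Counter(doc)
    return {w: c / len(doc) for w, c in counts.items()}


def idf(corpus):
    """Step ii: log(total documents / (documents containing the word + 1))."""
    n_docs = len(corpus)
    doc_freq = Counter(w for doc in corpus for w in set(doc))
    return {w: math.log(n_docs / (df + 1)) for w, df in doc_freq.items()}


def tfidf(doc, idf_values):
    """Step iii: product of TF and IDF for each word of one document."""
    return {w: f * idf_values[w] for w, f in tf(doc).items()}


def keywords(doc, idf_values, k=3):
    """Words of the document sorted by TF-IDF, largest first."""
    scores = tfidf(doc, idf_values)
    return sorted(scores, key=scores.get, reverse=True)[:k]


corpus = [["accident", "on", "bridge", "accident"],
          ["traffic", "on", "bridge"],
          ["rain", "on", "road"],
          ["rain", "today"]]
print(keywords(corpus[0], idf(corpus), k=1))
```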
iv) Cosine similarity: cosine similarity is commonly used as the measure of text similarity because it compares two vectors by the angle between them rather than by their distance:
$\cos\theta = \dfrac{A \cdot B}{\lVert A \rVert_2 \, \lVert B \rVert_2}$
The numerator is the dot product of vector A and vector B, and the denominator is the product of their L2 norms, that is, the square root of the sum of squares of each vector's components. Cosine similarity takes values in [-1, 1], and a larger value indicates greater similarity.
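A small sketch of the calculation, assuming each microblog is represented as a sparse vector stored in a dict that maps words to weights (for example TF-IDF values). Returning 0.0 for an all-zero vector is a convention chosen for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for empty or all-zero vectors
    return dot / (norm_a * norm_b)


post_1 = {"accident": 0.35, "bridge": 0.07}
post_2 = {"accident": 0.20, "tunnel": 0.30}
print(round(cosine_similarity(post_1, post_2), 3))
```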
3.2.3) Word2Vec: word2vec is a group of related models used to generate word vectors. These models are shallow, two-layer neural networks trained to reconstruct the linguistic contexts of words. Given an input word, the network guesses the words in adjacent positions; under the bag-of-words assumption in word2vec, the order of those words is unimportant.
After training is complete, the word2vec model can map each word to a vector, which can be used to represent word-to-word relationships. The vector is taken from the hidden layer of the neural network.
Word2vec encodes each word as a vector, similar to an autoencoder, but unlike a restricted Boltzmann machine, which is trained by reconstructing its input, word2vec trains each word against the other words adjacent to it in the input corpus.
It does this in one of two ways: using the context to predict the target word (the continuous bag-of-words method, CBOW), or using a word to predict its context (the skip-gram method), as shown in fig. 13. The present invention uses the latter because it produces more accurate results on large datasets.
When the feature vector assigned to a word cannot accurately predict that word's context, the components of the vector are adjusted. The context of each word in the corpus acts as the teacher that sends back error signals to adjust the feature vector. Words judged similar by their contexts have their vectors pushed closer together by adjusting the numbers in the vectors.
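To make the skip-gram setup concrete, the sketch below generates the (center word, context word) training pairs that such a model learns from. It covers only pair generation, not the network or its training; the function name and the window size are assumptions for the example.

```python
def skipgram_pairs(tokens, window=2):
    """Return (center, context) pairs for words within `window` positions."""
    pairs = []
    for i, center in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        pairs.extend((center, tokens[j]) for j in range(lo, hi) if j != i)
    return pairs


sentence = ["road", "closed", "after", "accident"]
for center, context in skipgram_pairs(sentence, window=1):
    print(center, "->", context)
```

Each pair asks the model to predict the context word from the center word; a poor prediction produces the error signal that adjusts the center word's vector.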
3.3) Algorithm model construction: the system classifies traffic information with the TextCNN algorithm, which consists of the following four parts.
3.3.1) Embedding layer: the length L of the input sequence is specified by analyzing the lengths of the corpus samples. Sample sequences shorter than L are padded (with a self-defined padding character), and sequences longer than L are truncated. The input layer finally receives the distributed representations, namely the word vectors, of all words in the text sequence.
L is specified as the length of the longest sentence in the corpus, so truncation is not required. Each word is then converted into its word vector to obtain a sentence matrix, as shown in fig. 14.
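The padding rule and the construction of the sentence matrix can be sketched as follows. The pad token name, the tiny two-dimensional vectors and the choice to map padding and unknown words to zero vectors are assumptions for illustration.

```python
PAD = "<pad>"  # self-defined padding token (hypothetical name)


def pad_or_truncate(tokens, length, pad=PAD):
    """Cut sequences longer than `length`; right-pad shorter ones."""
    return tokens[:length] + [pad] * max(0, length - len(tokens))


def sentence_matrix(tokens, vectors, dim):
    """One word vector per row; pad and unknown tokens map to zeros."""
    zero = [0.0] * dim
    return [vectors.get(tok, zero) for tok in tokens]


corpus = [["road", "closed", "today"], ["accident"]]
L = max(len(s) for s in corpus)  # longest sentence, so nothing is truncated
vectors = {"road": [0.1, 0.3], "closed": [0.7, 0.2],
           "today": [0.0, 0.5], "accident": [0.9, 0.9]}
for sent in corpus:
    print(sentence_matrix(pad_or_truncate(sent, L), vectors, dim=2))
```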
3.3.2) Convolutional layer: in the NLP field, a convolution kernel generally slides in one dimension only; that is, the kernel is as wide as the word-vector dimension and slides only along the sequence of words.
Several convolution kernels of different sizes are typically used in the Text-CNN model. The height of a kernel, i.e. its window size, can be understood as N in the N-gram model, namely the length of local word order that is used. The window size is a hyper-parameter that must be tuned for the task and generally takes a value between 2 and 8, as shown in fig. 15.
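A pure-Python sketch of one such kernel sliding over a sentence matrix. The ReLU activation and the toy numbers are assumptions; a real model would use a deep learning framework and learn the kernel weights.

```python
def conv1d_text(matrix, kernel, bias=0.0):
    """Slide a (window x dim) kernel down the sentence, one output per position."""
    window = len(kernel)
    outputs = []
    for start in range(len(matrix) - window + 1):
        total = bias
        for k_row, word_vec in zip(kernel, matrix[start:start + window]):
            total += sum(k * x for k, x in zip(k_row, word_vec))
        outputs.append(max(0.0, total))  # ReLU activation, a common choice
    return outputs


sentence = [[1, 0], [0, 1], [1, 1], [0, 0]]  # 4 words, vector dimension 2
kernel = [[1, 0], [0, 1]]                    # window size 2, same width as the vectors
print(conv1d_text(sentence, kernel))
```

A sentence of n words and a window of h give n - h + 1 outputs, so kernels with different window sizes produce feature maps of different lengths.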
3.3.3) Pooling layer: max pooling is used in the pooling layer of the Text-CNN model. It reduces the parameters of the model and ensures that a fixed-length input to the fully connected layer is obtained from the variable-length output of the convolutional layer. In addition to max pooling, the paper also discusses top-K max pooling, i.e. selecting the K largest values output by each convolutional layer as the output of the pooling layer, as shown in fig. 16.
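Both pooling variants can be sketched briefly. Keeping the top-K values in their original order is an assumption made for this example; the function names are illustrative.

```python
def max_pool(feature_map):
    """Largest activation of one feature map."""
    return max(feature_map)


def top_k_pool(feature_map, k):
    """Keep the k largest activations, in their original order (assumption)."""
    ranked = sorted(range(len(feature_map)),
                    key=lambda i: feature_map[i], reverse=True)[:k]
    return [feature_map[i] for i in sorted(ranked)]


def pooled_features(feature_maps):
    """One value per kernel, so the output length equals the kernel count."""
    return [max_pool(fm) for fm in feature_maps]


# Two feature maps of different lengths (different window sizes).
print(pooled_features([[2.0, 1.0, 1.0], [0.5, 3.0]]))
print(top_k_pool([1.0, 5.0, 3.0, 4.0], k=2))
```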
3.3.4) Fully connected layer: the fully connected layer serves as the classifier. The original Text-CNN model uses a fully connected network with only one hidden layer, which is equivalent to feeding the features extracted by the convolution and pooling layers into an LR (logistic regression) classifier. The text is finally divided into four categories: traffic control, construction control, road congestion information and traffic accident information, as shown in fig. 17.
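The classification step can be illustrated with a linear layer followed by softmax over the four categories. The weights below are hand-picked placeholders, not trained values, and the English label strings are assumptions.

```python
import math

CATEGORIES = ["traffic control", "construction control",
              "road congestion", "traffic accident"]


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(features, weights, biases):
    """Linear layer followed by softmax; returns (label, probabilities)."""
    scores = [sum(w * x for w, x in zip(row, features)) + b
              for row, b in zip(weights, biases)]
    probs = softmax(scores)
    return CATEGORIES[probs.index(max(probs))], probs


features = [2.0, 3.0]                           # pooled features, one per kernel
weights = [[1, 0], [0, 0], [0, 0], [0, 1]]      # placeholder weights
label, probs = classify(features, weights, [0.0] * 4)
print(label)
```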
3.4) Accident duration prediction: the invention builds on end-to-end learning, training neural networks by imitation or reinforcement learning so that actions are generated from traditional basic input data without human labels at any stage of the process. In unsupervised learning, the neural network learns to predict future data from past data without any manual labels. The steps are as follows:
3.4.1) Training set construction: traditional traffic accident duration prediction parameters are compared and analyzed, and an accident duration prediction model based on the social network is established. First, a prediction training set is constructed: in addition to basic features such as time, coordinates and weather information, key information described in the crawled text is extracted as prediction features for accident duration.
The accident duration is then obtained from the traffic department. After the accident occurrence time is approximately matched with the microblog report time, the label of the training set, namely the accident duration, is obtained, so that the accident occurrence time can be predicted directly from microblog information, as shown in fig. 18. The severity of an accident is closely related to how it is handled, including whether a tow truck is used and whether police or an ambulance is dispatched, and thereby directly affects the duration of the traffic accident. From the raw traffic accident data, 5 attribute parameters related to accident severity are extracted: the number of vehicles, whether a lane is blocked, the number of casualties, whether a bus is involved and whether a truck is involved.
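The approximate time matching that produces the training labels might look like the sketch below. The 30-minute tolerance, the tuple layouts and the function name are assumptions; a practical version would also match on location, which is omitted here.

```python
from datetime import datetime, timedelta


def match_labels(posts, accidents, tolerance=timedelta(minutes=30)):
    """Pair each post with the nearest accident record within `tolerance`.

    posts:     list of (report_time, text)
    accidents: list of (occurrence_time, duration_minutes) from official records
    """
    labelled = []
    for post_time, text in posts:
        best = min(accidents, key=lambda a: abs(a[0] - post_time), default=None)
        if best is not None and abs(best[0] - post_time) <= tolerance:
            labelled.append((text, best[1]))  # (text, duration label)
    return labelled


posts = [(datetime(2019, 9, 12, 8, 10), "two cars crashed on the bridge"),
         (datetime(2019, 9, 12, 15, 0), "smooth traffic this afternoon")]
accidents = [(datetime(2019, 9, 12, 8, 0), 45)]
print(match_labels(posts, accidents))
```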
3.4.2) Model construction: traffic congestion is divided into recurrent congestion and occasional (non-recurrent) congestion. Traffic accidents are a significant cause of occasional congestion. For recurrent congestion, the road traffic center can provide useful information to drivers by analyzing historical data, and daily travelers can anticipate it. For occasional congestion, drivers cannot obtain effective information about traffic flow on the road, so accurately predicting the duration of a traffic accident is very important for informing drivers on the road in time.
In predicting traffic accident duration, a comparison of different prediction models shows that their accuracy differs greatly from road to road. Any single model is limited in its applicable conditions, construction mechanism and prediction accuracy, and in the face of dynamically changing conditions no single model gives satisfactory results. The system therefore brings different models into one overall model system through a combined model, so as to exploit their strengths and offset their weaknesses. The algorithm model is shown in fig. 19.
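The text does not specify how the models are combined. One common option, shown here purely as an assumption, is a weighted average in which each model's weight is inversely proportional to its mean absolute error on validation data. The two stand-in models and the sample values are invented for the example.

```python
def mean_abs_error(model, samples):
    """Average absolute difference between predictions and true durations."""
    return sum(abs(model(x) - y) for x, y in samples) / len(samples)


def combine(models, validation, eps=1e-9):
    """Build a predictor that weights models by inverse validation error."""
    inverse = [1.0 / (mean_abs_error(m, validation) + eps) for m in models]
    weights = [v / sum(inverse) for v in inverse]

    def predict(x):
        return sum(w * m(x) for w, m in zip(weights, models))

    return predict, weights


def by_vehicle_count(x):   # hypothetical stand-in model
    return 20 + 10 * x["vehicles"]


def by_lane_blockage(x):   # hypothetical stand-in model
    return 60 if x["lane_blocked"] else 30


validation = [({"vehicles": 2, "lane_blocked": True}, 45),
              ({"vehicles": 1, "lane_blocked": False}, 30)]
predict, weights = combine([by_vehicle_count, by_lane_blockage], validation)
print([round(w, 2) for w in weights])
```

The model with the smaller validation error receives the larger weight, which is one simple way of letting the combined model lean on whichever component works better for the data at hand.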
3.4.3) Accident duration prediction display.
3.5) Platform system display: the platform provides interfaces for MySQL, Java SE, Hadoop, Spark and the like, and is aimed at data, algorithm, visualization and system developers.
3.5.1) Data acquisition and display: the system uses crawler technology to crawl data from the major social network platforms, processes and classifies the data, saves the texts that meet the conditions as the data required by the system, and resolves the locations mentioned in the texts so that they can be displayed on a map, thereby realizing a real-time analysis system for urban road impedance.
3.5.2) Algorithm development display: algorithm developers can implement various algorithms, improve existing algorithms and design new ones. Developers can concentrate on algorithm design and implementation to build better algorithms, without having to consider the structure or display of the data.
3.5.3) system operation state monitoring display.
3.5.4) Visualization of system data: the system provides a B/S (browser/server) data visualization mode that can be embedded in any web page.
The data analysis platform aims to achieve low coupling among the data module, the analysis module and the visualization module, and to provide diversified, standardized module interfaces. The system modularizes the data analysis process: the user only needs to configure each module, and the algorithms and visualization modes integrated in the system can be freely combined.
3.6) the visual display of the platform data comprises the following steps:
3.6.1) Visualization data flow: data visualization requests data according to user behavior and feeds the data back to the user after processing at several layers, as shown in fig. 20.
3.6.1.1) Data obtained from microblogs with Python is processed, cleaned and stored in a database; the results of predicting future road conditions from this data are also written to the database.
3.6.1.2) The user views the map and clicks on the page; the browser listens for events and sends asynchronous requests to the server through Ajax.
3.6.1.3) The server receives the request, executes the business logic, and requests the data from the database.
3.6.1.4) the server side obtains the data from the database, processes the data and returns the processed data to the browser.
3.6.1.5) the browser draws the page after receiving the response, and presents the data.
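The server-side part of this flow (steps 3.6.1.3 and 3.6.1.4) can be sketched with the standard library alone. The table layout, column names, sample rows and the use of an in-memory SQLite database are assumptions for illustration; the platform described here uses MySQL and a Web front end.

```python
import json
import sqlite3


def init_db():
    """Create an in-memory database holding two sample events."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (road TEXT, category TEXT, "
                 "lng REAL, lat REAL, predicted_minutes INTEGER)")
    conn.executemany("INSERT INTO events VALUES (?, ?, ?, ?, ?)", [
        ("Road A", "accident", 106.55, 29.56, 40),
        ("Road B", "congestion", 106.50, 29.53, None),
    ])
    return conn


def handle_request(conn, category):
    """Query stored events and return a JSON string for the browser."""
    rows = conn.execute(
        "SELECT road, category, lng, lat, predicted_minutes FROM events "
        "WHERE category = ? ORDER BY road", (category,)).fetchall()
    keys = ("road", "category", "lng", "lat", "predicted_minutes")
    return json.dumps([dict(zip(keys, row)) for row in rows])


print(handle_request(init_db(), "accident"))
```

The browser would parse this JSON in its Ajax callback and draw one marker per event at the given longitude and latitude.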
3.6.2) Function display: through the above flow, the sorted traffic information is matched with different abnormal road states and divided into three types: ordinary road congestion, traffic-control congestion and traffic-accident congestion. The Gaode map API (application program interface) is used to link the data to geographic positions, and information on the affected roads is published to the Gaode map platform, thereby realizing visual display of the data. The display effect is shown in fig. 21.
The whole page is a map of Chongqing, marked with traffic routes, modes of transport, and building locations and names. The condition of a road section is indicated by a red icon placed on the traffic route on the map. The icons have the following meanings:
The icon with the character for "blocked" in a red circle indicates that the road section is congested.
The icon of a yellow figure holding a shovel indicates that the road section is under construction or repair.
The icon of two vehicles colliding indicates that a traffic accident has occurred on the road section.
Clicking to the right of a red icon with the mouse pops up a list of microblog posts about the road section, whose content describes the most recent traffic conditions there. Clicking any item in the list jumps to that microblog to view the details, as shown in fig. 22.
The present invention is further described below in conjunction with its technical innovations.
As a real-time urban road perception system, the system studies the urban road traffic running state based on the social network. It uses the massive traffic data in social networks and machine learning algorithms to acquire the road traffic running state more accurately and promptly and to predict how long that state is likely to last. Exploiting the high transmission efficiency of the network, it publishes the results to navigation systems at the first opportunity as a supplement to the current ITS detection configuration, thereby reducing the influence of road emergencies on normal urban traffic operation and reducing inconvenience and loss to urban residents. The advancement and scientific merit of the system are mainly embodied in the following aspects.
The invention develops a social network-based real-time perception system for the urban road traffic running state. The system automatically collects, classifies and extracts effective traffic information from social network platforms, screens out posts that indicate abnormal operation states, analyzes the causes of the abnormality, predicts the likely duration of events with a comprehensive algorithm, and marks the results visually on a map, thereby reducing the influence of abnormal road traffic states on urban traffic and the inconvenience and loss to urban residents. In the age of multi-source data fusion, the system effectively complements traditional real-time traffic information acquisition technologies such as fixed sensors and floating cars, and promptly captures sudden traffic events, traffic events at specific places, temporary traffic control, newly added traffic restrictions and traffic environment information.
The system integrates existing social network platforms, develops a whole-network information crawling technology to acquire the relevant traffic information on the network comprehensively, introduces a credit rating system, and establishes a credibility register for user information, which is then used for a series of screening steps. The invention uses the traffic information to judge the abnormal operation state of the urban road network, namely the ordinary congestion level, the control level and the traffic accident level, develops an end-to-end self-learning accident duration prediction algorithm, presents the results to users in visual form in the system, and guides users toward reasonable travel choices. The system combines web crawler, text classification and information identification technologies, is based on up-to-date social network platform data, and is innovative in both theory and method.
Comprehensiveness of the technology: the system uses Hadoop as the big data platform, HDFS to store data, Hive for data analysis and preprocessing, a crawler written in Python for data crawling, Text-CNN for text classification, Spark for distributed computation scheduling, and finally a Web front end for display.
Text classification is based on a convolutional neural network (CNN), which makes use of the information contained in word order. The CNN model takes the original text as input, does not need many hand-crafted features, and has clear advantages over traditional methods. An end-to-end learning approach is established: neural networks are trained by imitation or reinforcement learning, actions are generated from traditional basic input data, and human labels are not required at any stage of the process. In unsupervised learning, the neural network learns to predict future data from past data without any manual labels. The method can greatly improve the accuracy and efficiency of accident duration prediction and reduce the manual workload involved in prediction.
The system advances include:
The system automatically collects, extracts and classifies effective traffic information in the social network. Compared with data collected by traditional traffic detection equipment, the data used by the system covers more perspectives and is not limited in spatial distribution, and no ground sensing equipment needs to be installed or maintained. This gives an obvious economic advantage and provides strong support for traffic data collection.
The system makes full use of people's real-time information sharing, cause analysis and severity descriptions of traffic events on the social network, matches them to different road traffic running states, and marks them visually on a map, thereby revealing the underlying causes of changes in road traffic.
Descriptions of traffic accidents in the social network are extracted for the first time and matched with traditional traffic accident duration prediction parameters, and an end-to-end self-learning model is then established, so that the neural network learns to predict future data from past data by unsupervised learning.
The system is of practical value for road sections affected by incidents such as accidents, temporary construction and traffic control: it informs travelers of abnormal road network conditions in advance and guides their travel.
The system makes full use of the traffic information shared in real time on the social network to help travelers make decisions. When no other, more accurate traffic information is available, traffic information from the social network becomes an important basis for travelers' decisions and helps them make better choices.
The invention is further described below in connection with market applications.
Market environment: the arrival of the Web 2.0 era means that people have entered an age in which virtual networks and real society promote each other. With the rapid development of social networking sites, more and more people are willing to publish their opinions on social media, so social media data has become an important data source that reflects many activities in the real world. Acquiring traffic information through social network platforms has a favorable market environment. On the one hand, the cost of acquiring information from social network platforms is low, and the information serves well as a supplement to traditional traffic detection equipment. On the other hand, social network platforms spread information efficiently and are widely used, which simplifies information delivery and reduces cost.
Traditional detection equipment, such as loop coils, video detection and floating cars, judges road conditions from vehicle speed, density and flow. It captures only the visible traffic phenomenon; police must be dispatched to determine the cause of an accident, which costs considerable time and money. In addition, some sudden traffic events, such as temporary traffic control and temporary construction, are hard to learn about because they are not released on any information platform, yet they greatly affect travelers' trips. A system platform is therefore urgently needed to help travelers obtain real-time traffic conditions and plan their routes reasonably.
Expressways are a convenient channel for travel between cities in China, and accidents are more likely on them because vehicles travel at higher speeds. If video monitoring equipment were installed every 5 kilometers, then with a total expressway mileage in China of about 140,000 kilometers, 28,000 units would be required in total. Each unit costs 500 yuan, the total cost is about 1 million yuan, and to ensure long-term use the annual maintenance cost is more than 1 million yuan.
The social network-based real-time perception system for abnormal road states has the advantages of real-time performance, originality and comprehensiveness. By acquiring abnormal road states in real time from traffic data on social network platforms, it can effectively guide emergency rescue, guide residents toward reasonable travel choices and reduce indirect losses. In recent years the concept of "social traffic" has been proposed, and information on social networks has drawn close attention from academia, industry and relevant government departments at home and abroad. The invention provides broad ideas and a scientific basis for applying social networks in the traffic field, and therefore has wide prospects for adoption.
The above description is only for the preferred embodiment of the present invention, and is not intended to limit the present invention in any way, and all simple modifications, equivalent changes and modifications made to the above embodiment according to the technical spirit of the present invention are within the scope of the technical solution of the present invention.

Claims (10)

1. A road traffic running state real-time perception method based on social network information is characterized by comprising the following steps:
Step one: automatically collecting, classifying and extracting multi-angle effective traffic information from a social network platform using web crawler technology, and screening out the traffic information indicating an abnormal operation state; predicting the likely duration of the screened abnormal operation state using a comprehensive algorithm, establishing an end-to-end self-learning model, visually marking the result on a map, and displaying changes in road traffic;
Step two: learning to predict future data from past data based on a deep learning algorithm, and feeding the predicted future data back to the social network for real-time sharing by users.
2. The method for sensing the traffic running state of the road based on the social network information in real time as claimed in claim 1, wherein in the first step, the relevant traffic information on the network is obtained using whole-network information crawling, text classification and information identification technologies, a credit rating system is introduced, and a credibility register for user information is established for screening; the results of an end-to-end self-learning accident duration prediction algorithm are presented to users in visual form in the social network platform;
The abnormal operation state information includes a general congestion level, a regulation level, and a traffic accident level.
3. The method for perceiving the traffic running state of the road in real time based on the social network information as claimed in claim 1, wherein in the first step, the social network platform uses Hadoop as a big data platform, uses HDFS to store data, and uses Hive to analyze and preprocess the data; a crawler written in Python is used for data crawling, texts are classified using Text-CNN, distributed computing is scheduled using Spark, and results are finally displayed using a Web front end.
4. The method for sensing the traffic running state of the road based on the social network information in real time as claimed in claim 1, wherein in the first step of automatically collecting, classifying and extracting the effective traffic information in the social network platform, keywords of the effective traffic information are mined through a TF-IDF algorithm: the term frequency and the inverse document frequency contained in TF-IDF are multiplied to obtain the TF-IDF value, a larger TF-IDF value indicating higher importance, and the TF-IDF values are sorted from large to small to obtain the top-ranked keywords of the effective traffic information.
5. The method for real-time perception of the traffic operation state based on the social network information as claimed in claim 4, wherein the TF-IDF algorithm specifically includes:
i) Calculating the TF value:
Dividing the number of occurrences of a word by the total number of words to obtain the term frequency TF of the word;
ii) Calculating the IDF value:
Dividing the total number of documents in the corpus by the number of documents containing the word plus 1, and then taking the logarithm;
iii) Calculating the TF-IDF value:
$\mathrm{tfidf}_{i,j} = \mathrm{tf}_{i,j} \times \mathrm{idf}_i$
The final result is obtained.
6. The method for perceiving the traffic running state of the road based on the social network information in real time as claimed in claim 1, wherein in the first step, the time for which the event is likely to last is predicted by using a comprehensive algorithm, and an end-to-end self-learning model is established, and the method specifically comprises the following steps:
1) constructing a training set: comparing and analyzing traditional traffic accident duration prediction parameters, establishing an accident duration prediction model based on a social network, firstly constructing a prediction training set, and extracting key information described by a data text as the prediction characteristic of the accident duration by using the crawled data in addition to extracting the basic characteristic;
Obtaining the duration of the accident, carrying out approximate matching on the accident occurrence time and the social network reporting time to obtain a label of a training set, and directly predicting the accident occurrence time through social network information; the prediction training set of the accident duration adopts the matching of social network information and the actual processing time of the traffic police, and a corresponding time training set is constructed;
Extracting 5 attribute parameters of the number of vehicles, whether lanes are blocked, the number of casualties, whether buses are involved and whether trucks are involved from the original data of the traffic accident;
2) Building the model: incorporating different types of models into one overall model system through a combined model;
3) Displaying the accident duration prediction.
7. The method for real-time perception of the traffic operation state based on the social network information as claimed in claim 1, wherein the second step learns the data for predicting the future from the past data based on a deep learning algorithm without adding any artificial tags.
8. A social network information-based real-time perception system for the road traffic running state, which implements the social network information-based real-time perception method for the road traffic running state, characterized by comprising:
The data layer, which is connected with the Hadoop big data platform and the data preprocessing layer and is used for holding all data supporting the operation of the whole system;
The data preprocessing layer, which is connected with the data layer, the Hadoop big data platform and the analysis processing layer and is used for refining coarse, unrelated data and supplying it to the other layers;
The analysis processing layer, which is connected with the data preprocessing layer, the Hadoop big data platform and the visualization layer and is used for analyzing and processing data and providing an interface for the application terminal;
The visualization layer, which is connected with the analysis processing layer and is used for providing an interface for the application terminal;
The Hadoop big data platform, which is connected with the data layer, the data preprocessing layer and the analysis processing layer and is used for providing parallel data processing support for the whole platform.
9. The system for real-time perception of road traffic operating conditions based on social network information as claimed in claim 8, wherein the data layer is populated with data shared by various map navigation platform providers and data having statistical value for use and analysis by a data pre-processing layer;
The data preprocessing layer is also used for cleaning the data acquired by the data fusion layer: removing noise, handling null values, missing values and inconsistent data by filling in missing data, eliminating abnormal data, smoothing noisy data and correcting inconsistent data; data standardization is then carried out, after which all indexes of the original data are of the same order of magnitude; meanwhile, a common relational database and a Hive database are combined for distributed storage and management of big data;
The analysis processing layer uses machine learning to make predictions or decisions on the data, constructing a model from sample inputs;
The visualization layer displays abnormal traffic conditions through the visualization module.
10. A social network information-based road traffic running state real-time perception terminal for implementing the social network information-based road traffic running state real-time perception method according to claim 1.
CN201910861533.1A 2019-09-12 2019-09-12 Road traffic running state real-time perception method based on social network information Active CN110555568B (en)

Publications (2)

Publication Number Publication Date
CN110555568A true CN110555568A (en) 2019-12-10
CN110555568B CN110555568B (en) 2022-12-02


Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
CHEN JIAN et al.: "SEM-logit integration model of travel mode choice behaviors", 《JOURNAL OF SOUTH CHINA UNIVERSITY OF TECHNOLOGY (NATURAL SCIENCE EDITION)》 *
THEODORE GEORGIOU et al.: "Mining complaints for traffic-jam estimation: A social sensor application", 《MINING COMPLAINTS FOR TRAFFIC-JAM ESTIMATION: A SOCIAL SENSOR APPLICATION》 *
YUANYUAN CHEN et al.: "Detecting Traffic Information From Social Media Texts With Deep Learning Approaches", 《IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS》 *
杨戌初: "社交网络中突发事件的态势感知算法研究与实现", 《中国优秀博硕士学位论文全文数据库(硕士)信息科技辑》 *
陈坚 et al.: "社交网络交通信息对出行方式选择行为影响模型", 《交通运输***工程与信息》 *
陈宏飞: "基于微博的西安市交通拥堵状况时空分布研究", 《陕西师范大学学报(自然科学版)》 *

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112990473A (en) * 2019-12-12 2021-06-18 杭州海康威视数字技术股份有限公司 Model training method, device and system
CN112990473B (en) * 2019-12-12 2024-02-02 杭州海康威视数字技术股份有限公司 Model training method, device and system
CN111599170B (en) * 2020-04-13 2021-12-17 浙江工业大学 Traffic running state classification method based on time sequence traffic network diagram
CN111599170A (en) * 2020-04-13 2020-08-28 浙江工业大学 Traffic running state classification method based on time sequence traffic network diagram
CN111552772A (en) * 2020-04-22 2020-08-18 中国计量大学 Real-time traffic road condition text data and traffic volume combined visual analysis method
CN111696347A (en) * 2020-06-02 2020-09-22 安徽宇呈数据技术有限公司 Method and device for automatically analyzing traffic incident information
CN111738500A (en) * 2020-06-11 2020-10-02 大连海事大学 Navigation time prediction method and device based on deep learning
CN111738500B (en) * 2020-06-11 2024-01-12 大连海事大学 Navigation time prediction method and device based on deep learning
CN114971664A (en) * 2021-02-26 2022-08-30 富联精密电子(天津)有限公司 Advertisement putting method and related equipment
CN113032365A (en) * 2021-03-22 2021-06-25 航天科工智慧产业发展有限公司 Group dynamic management method
TWI786902B (en) * 2021-10-26 2022-12-11 中華電信股份有限公司 Device, method and computer program product for exploration of potential event hotspot
TWI810921B (en) * 2022-04-29 2023-08-01 江俊昇 A method and system for improving planning of road sections causing traffic accidents
CN116842018A (en) * 2023-07-06 2023-10-03 江西桔贝科技有限公司 Big data screening method and system
CN116842018B (en) * 2023-07-06 2024-02-23 上海比滋特信息技术有限公司 Big data screening method and system

Also Published As

Publication number Publication date
CN110555568B (en) 2022-12-02

Similar Documents

Publication Publication Date Title
CN110555568B (en) Road traffic running state real-time perception method based on social network information
Zhu et al. KST-GCN: A knowledge-driven spatial-temporal graph convolutional network for traffic forecasting
Li et al. Spatial data mining
Camburn et al. Machine learning-based design concept evaluation
Huang et al. An efficient passenger-hunting recommendation framework with multitask deep learning
CN108710625B (en) Automatic thematic knowledge mining system and method
CN104318340B (en) Information visualization methods and intelligent visible analysis system based on text resume information
Alomari et al. Analysis of tweets in Arabic language for detection of road traffic conditions
CN102546771A (en) Cloud mining network public opinion monitoring system based on characteristic model
CN110442728A (en) Sentiment dictionary construction method based on word2vec automobile product field
Alomari et al. Sentiment analysis of Arabic tweets for road traffic congestion and event detection
Chen et al. CEM: A convolutional embedding model for predicting next locations
Suma et al. Automatic detection and validation of smart city events using hpc and apache spark platforms
Zhang Application of data mining technology in digital library.
Wang et al. Measuring urban vibrancy of residential communities using big crowdsourced geotagged data
Peng et al. A forecast model of tourism demand driven by social network data
Xu et al. Detecting spatiotemporal traffic events using geosocial media data
CN111353085A (en) Cloud mining network public opinion analysis method based on feature model
Chuanxia et al. Machine learning and IoTs for forecasting prediction of smart road traffic flow
Zou et al. Deep learning for cross-domain data fusion in urban computing: Taxonomy, advances, and outlook
Ma et al. An overview of Hadoop applications in transportation big data
Patil Machine Learning for Traffic Management in Large-Scale Urban Networks: A Review
He et al. Perceiving commerial activeness over satellite images
Ji et al. Reliable event detection via multiple edge computing on streaming traffic social data
Sampath et al. Traffic Prediction in Indian Cities from Twitter Data Using Deep Learning and Word Embedding Models

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant