Volume 39 Issue 2
Apr.  2021
LIU Zhao, HE Shanglu, LIU Yingshun. A Method to Identify Traffic Incidents Based on Social Network Data[J]. Journal of Transport Information and Safety, 2021, 39(2): 53-60. doi: 10.3963/j.jssn.1674-4861.2021.02.007

A Method to Identify Traffic Incidents Based on Social Network Data

doi: 10.3963/j.jssn.1674-4861.2021.02.007
  • Received Date: 2020-06-18
  • This paper studies a machine-learning text classification method that identifies traffic incidents by mining social network data. The original texts are collected with the "Beautiful Soup" web-scraping library, filtered by keywords and location. These texts are preprocessed by regular-expression matching, duplicate removal, and "0-1" marking. Based on the features of the preprocessed texts, the paper proposes a method that selects feature words by feature weight. The feature weight is calculated by normalizing, weighting, and combining the word frequency and the ratio of texts containing that word. In this way, the feature weight of each unique word in the training set of traffic incident texts is obtained and used both as the criterion for selecting feature words and as the input to the classifiers. An experiment compares different classifiers and different feature-word selection methods. The results show that the proposed selection method combined with the XGBoost classifier performs best, with a precision of 0.6796, a recall of 0.6481, an F1 value of 0.6635, and an AUC of 0.7594.
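The feature-weight idea described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact formula, so the details here are assumptions made for illustration: word frequency is normalized by the maximum frequency, the two terms are combined as a weighted sum, and the names `feature_weights`, `select_feature_words`, and the mixing parameter `alpha` are hypothetical rather than taken from the paper.

```python
from collections import Counter


def feature_weights(docs, alpha=0.5):
    """Weight each unique word in a set of tokenized incident texts.

    docs  : list of token lists, one per training text
    alpha : assumed mixing weight between the two terms (not from the paper)
    """
    if not docs or not any(docs):
        raise ValueError("docs must contain at least one non-empty text")

    term_freq = Counter()  # total occurrences of each word
    doc_freq = Counter()   # number of texts containing each word
    for tokens in docs:
        term_freq.update(tokens)
        doc_freq.update(set(tokens))

    n_docs = len(docs)
    max_tf = max(term_freq.values())

    weights = {}
    for word, tf in term_freq.items():
        norm_tf = tf / max_tf                # assumed max-normalization
        doc_ratio = doc_freq[word] / n_docs  # share of texts containing the word
        weights[word] = alpha * norm_tf + (1 - alpha) * doc_ratio
    return weights


def select_feature_words(weights, top_k):
    """Return the top_k words by weight (ties broken alphabetically)."""
    ranked = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:top_k]]


if __name__ == "__main__":
    sample = [["crash", "road"], ["crash", "jam"], ["road", "crash", "crash"]]
    w = feature_weights(sample)
    print(select_feature_words(w, top_k=2))
```

The selected words and their weights would then serve as classifier inputs; how the paper maps them to feature vectors is not stated in the abstract, so that step is left out of the sketch.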

     

