Research and Practice for the Text Preprocessing Technology of Road Traffic Accident Information
School of Safety Science and Engineering, Henan Polytechnic University, Jiaozuo 454000, China; Department of Nuclear System Safety, Nagaoka University of Technology, Nagaoka 940-2188, Japan; Anyang Institute of Technology, Anyang 455000, China; Artificial Intelligence Research Center, National Institute of Advanced Industrial Science and Technology (AIST), Tokyo 135-0064, Japan
Text preprocessing is a key and time-consuming step in text mining. For lack of suitable tools, much meaningful information in the narrative records of traffic accidents remains undiscovered. This paper aims to develop a text mining tool that shortens information processing time, raises the extraction ratio, and improves identification accuracy. Following the Chinese Specifications for Road Traffic Management Information Collection, the paper builds a systematic semantic set for preprocessing the descriptive texts in traffic accident information. The set comprises 12 items and 185 standard words; the standard words and their codes are extracted from the fundamental traffic accident information defined in the standards of National Public Security. The paper extracts 8156 severe traffic accidents collected between 2004 and 2014 from the Accident Query System operated by the State Administration of Work Safety, and applies the semantic set to analyze their descriptive texts. The practice demonstrates that analyzing descriptive texts with the standard semantic set extracts more environment-related information and captures more accident characteristics, which helps in understanding the mechanism of accident occurrence. The study also provides a key prerequisite for developing an automatic identification system for traffic accident information.
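The idea of mapping free-text descriptions onto standard words and codes can be illustrated with a minimal sketch. It is not the tool described in the paper. The item names, standard words, codes, and surface variants below are all hypothetical placeholders, not entries from the actual 12-item, 185-word set, and English text is used only for illustration. Chinese accident records would also need a word segmentation step, which is omitted here.

```python
import re

# Hypothetical semantic set: item -> {standard word: (code, surface variants)}.
# None of these entries or codes come from the paper or the national standards.
SEMANTIC_SET = {
    "weather": {
        "rain": ("W02", ["rain", "rainy", "raining", "drizzle"]),
        "fog": ("W04", ["fog", "foggy", "mist"]),
    },
    "road_surface": {
        "wet": ("S02", ["wet", "slippery"]),
        "icy": ("S03", ["icy", "ice-covered", "frozen"]),
    },
}


def build_patterns(semantic_set):
    """Compile one whole-word, case-insensitive regex per standard word."""
    patterns = []
    for item, words in semantic_set.items():
        for standard, (code, variants) in words.items():
            # Longest variants first so the alternation prefers the full phrase.
            ordered = sorted(variants, key=len, reverse=True)
            alternation = "|".join(re.escape(v) for v in ordered)
            regex = re.compile(r"\b(?:" + alternation + r")\b", re.IGNORECASE)
            patterns.append((item, standard, code, regex))
    return patterns


def extract(text, patterns):
    """Return {item: [(standard word, code), ...]} for every match in text."""
    found = {}
    for item, standard, code, regex in patterns:
        if regex.search(text):
            found.setdefault(item, []).append((standard, code))
    return found


PATTERNS = build_patterns(SEMANTIC_SET)
record = "The truck skidded on a slippery road in heavy fog."
result = extract(record, PATTERNS)
```

For the sample record, `result` groups the matched standard words by item, so a narrative that mentions a slippery road and fog yields one coded entry under `road_surface` and one under `weather`. Whole-word matching is a deliberate choice in this sketch: it keeps a variant such as "rain" from matching inside "train".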
ZHANG Kun, MEI Shidong, JING Guoxun, et al. Research and Practice for the Text Preprocessing Technology of Road Traffic Accident Information[J]. Safety and Environmental Engineering, 2017, 24(4): 112-116.